So, over the past couple of weeks, since I’ve had practically nothing to do (besides blog to all you wonderful people!), I’ve been trying to make myself slightly more useful and learn a bit of programming. My language of choice right now is Ruby! I’ve done a couple of projects (I might release the other ones when I’m satisfied with them), but tonight I created my first “real” from-scratch Ruby script.
One of the sites I’m involved with has a wiki-type section, and each page has an image. This image is not necessarily stored on the webserver; instead, each page has an image box where a user can put in a link to an image hosted on imageshack/tinypic/et al., and this link is stored with each wiki page in MySQL as its own column.
This same website does provide a Coppermine gallery installation, and users are encouraged to upload these pictures there. Unfortunately, one of the staff members (through sheer accident) deleted one of the major albums that held these wiki pics, which left a bunch of wiki pages pointing at images that no longer exist.
Fortunately, Ruby makes it stupidly easy to work with HTTP and MySQL. I whipped up this script, and after squashing a few bugs (damn you, “==” and “=”), we got the list of all the wiki pages that no longer had a working image. Here’s the script in its full glory:
```ruby
#!/usr/bin/env ruby

%w[rubygems net/http activerecord].each { |r| require r }
Net::HTTP.version_1_2

# This is the http path to the page that would display the image
# I only do this so that the text files generated have clickable
# links if your view supports that. If you just want ID #'s,
# leave this blank! -> http_path = ""
http_path = "http://domain.tld/path/index.php?id="

ActiveRecord::Base.establish_connection(
  :adapter => "mysql",
  :database => "dbname",
  :username => "user",
  :password => "password",
  :host => "localhost"
)

class Group < ActiveRecord::Base
  set_table_name 'table_that_has_images'
end

# Quick note: these files are in "append" mode, so if you plan on
# running this script more than once, delete or rename the files
# between runs.
# Doesn't have an image in the first place
no_img = File.new("no_img.txt", "a")
# Dead img
dead = File.new("dead_img.txt", "a")
# Timed out
time = File.new("timed_out.txt", "a")

num = Group.last.ID
puts "There are #{num} groups total"

for i in (1..num)
  g = Group.find(i) rescue "none"
  if g == "none"
    puts "No Group ID #{i}"
    next
  else
    unless g.WikiImage =~ /^(http|https):/i
      # Doesn't have an image in the first place, might want
      # to report this, also.
      no_img.puts("#{http_path}#{i}")
      puts "ID: #{i} -> No image in the first place"
    else
      # Rescues in case it times out... I think we can assume if
      # that's the case, it's dead.
      res = Net::HTTP.get_response(URI.parse(g.WikiImage)) rescue "timeout!"
      # res.value raises an error for any response that isn't a
      # 2xx success. So, this will also flag redirects (3xx),
      # permanent or otherwise, as dead. :X
      test = res.value rescue "dead"
      if test == "dead"
        if res == "timeout!"
          puts "ID: #{i} -> Timed Out"
          time.puts("#{http_path}#{i}")
        else
          dead.puts("#{http_path}#{i}")
          puts "ID: #{i} -> Dead Img"
        end
      else
        puts "ID: #{i} -> Image fine"
      end
    end
  end
end
time.close()
dead.close()
no_img.close()
```
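The stacked rescue modifiers in the script are a bit hard to follow, so here is a rough sketch of the same no-image / dead / timed-out / fine decision written as one small method with explicit rescue clauses. It is only an illustration, not a drop-in replacement: the name image_status, the symbols it returns, and the optional block (which stands in for the real HTTP request so the logic can be tried offline) are all made up for this example.

```ruby
require 'net/http'
require 'uri'
require 'timeout'

# Hypothetical helper (not part of the original script).
# Returns :no_image, :ok, :dead or :timed_out for one image URL.
# If a block is given, it is used instead of a real HTTP request.
def image_status(url)
  # Same idea as the script's regex check: no http(s) link, no image.
  return :no_image unless url.to_s =~ /\Ahttps?:/i

  response = block_given? ? yield(url) : Net::HTTP.get_response(URI.parse(url))

  # Anything that is not a 2xx response (404s, redirects, ...) counts as dead.
  response.is_a?(Net::HTTPSuccess) ? :ok : :dead
rescue Timeout::Error
  # Net::OpenTimeout and Net::ReadTimeout are both Timeout::Error subclasses.
  :timed_out
rescue StandardError
  # Bad URIs, DNS failures, refused connections and the like.
  :dead
end

# Offline demo: canned responses stand in for real requests.
ok      = Net::HTTPOK.new("1.1", "200", "OK")
missing = Net::HTTPNotFound.new("1.1", "404", "Not Found")

fine  = image_status("http://example.com/a.png") { ok }
gone  = image_status("http://example.com/b.png") { missing }
slow  = image_status("http://example.com/c.png") { raise Timeout::Error }
blank = image_status("")
puts [fine, gone, slow, blank].inspect
```

Rescuing Timeout::Error separately from everything else keeps a slow host from being lumped in with a genuinely broken link, which is the distinction the three text files are trying to capture.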
Feel free to use the script for your own purposes. Pretty much all the main variables you need to change are at the top (site path, MySQL connection params, and table name). I couldn’t figure out how to look up a column using a name stored in a variable, so if your image column isn’t named “WikiImage” (and I doubt it is), you’ll have to change lines 43 and 51, the two places that call g.WikiImage, to reflect your column name!
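For what it’s worth, Ruby can call a method whose name is held in a variable, which is one possible way around the hard-coded column name. The sketch below is an assumption-laden illustration rather than a tested change to the script: IMAGE_COLUMN is a made-up setting, and a plain Struct stands in for the ActiveRecord model so the example runs without a database.

```ruby
# Stand-in for the ActiveRecord model, so this runs without a database.
Group = Struct.new(:ID, :WikiImage)

# Hypothetical setting: the name of the column that holds the image URL.
IMAGE_COLUMN = "WikiImage"

g = Group.new(1, "http://example.com/pic.png")

# public_send calls the method whose name is stored in the variable,
# so here it does the same thing as g.WikiImage.
url = g.public_send(IMAGE_COLUMN)

# Struct also allows hash-style lookup by name.
same_url = g[IMAGE_COLUMN]

puts url
puts same_url
```

ActiveRecord models also support the g[IMAGE_COLUMN] form for reading an attribute, so with something like this the column name could sit at the top of the script next to the other settings.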
Enjoy! :)