For the morbidly interested, here is how I gathered the data. I created a database program using 4th Dimension (a major database in the Macintosh world, but perhaps less known to Windows users). I enter a movie title, and the database does a lookup at IMDB (the Internet Movie Database, a highly useful site). The IMDB page indicates whether a DVD exists, and could also be a source of all sorts of useful information such as the director. Currently I scrape the original film's aspect ratio and that of the DVD from the site. Next, I make a SOAP request to Amazon Web Services to search for the DVD (one of the reasons I did this was to play with the new SOAP web services tools in 4D). From this I can find out whether the DVD exists (or did at one point, as they tend to go out of print), along with other information such as the release date. The unique Amazon identifier (ASIN) is usually the same at their Canadian store, but the Canadian operation doesn't have Web Services yet, so I have to programmatically retrieve the web page for the item and scan the HTML to get the Canadian price.
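To give a flavour of the "scan the HTML" step, here is a minimal sketch in Python (my actual code is in 4D, so this is an illustration rather than what runs in the database). It assumes the price appears in the page text with a "CDN$" prefix followed by dollars and cents; both that prefix and the sample markup below are assumptions for the example, and the function name `extract_cdn_price` is made up.

```python
import re
from decimal import Decimal
from typing import Optional

# Assumption: the price is written like "CDN$ 24.99" somewhere in the page.
PRICE_PATTERN = re.compile(r"CDN\$\s*(\d+\.\d{2})")


def extract_cdn_price(html: str) -> Optional[Decimal]:
    """Return the first Canadian price found in the HTML, or None if there is none."""
    match = PRICE_PATTERN.search(html)
    if match is None:
        return None
    # Decimal avoids floating-point rounding surprises with money.
    return Decimal(match.group(1))


# Hypothetical fragment of an item page, invented for this example.
sample_page = '<td><b class="price">CDN$ 24.99</b></td>'
price = extract_cdn_price(sample_page)
```

Returning `None` when nothing matches matters in practice: a missing price usually means the page layout has changed, and it is better to notice that than to store a zero.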

Then the database does an automated search at each of the DVD rental companies via their web site search pages. This is a pain because it has to parse the HTML returned, and each site has its own little quirks. Articles ("A", "An" and "The") in titles are sometimes moved to the end of the title to ensure alphabetical order ("Seventh Seal, The"), but some of the search engines don't like it when search submissions are in this format. Pulling the titles and inventory numbers out of the HTML can also be fiddly. DVDflix is an especial peeve; their inventory numbers are frequently very long ("549526761075847950") and don't fit into a 32-bit integer, as all the other companies' IDs do. This makes for special-case DVDflix code throughout the database, drat them. Sometimes the movie title I get from IMDB is slightly different from what a company has used, and the IMDB title doesn't find the movie on some sites ("Return of the Secaucus 7" vs "Seven", or "About Last Night..." without the ellipsis, etc).
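A few of these quirks are easy to show in code. The following is a rough Python sketch (not my 4D code) of moving a trailing article back to the front of a title, building a punctuation-free key for comparing titles, and checking whether an inventory number fits in a signed 32-bit integer. All of the function names are invented for the example.

```python
import re

ARTICLES = ("a", "an", "the")
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def to_natural_order(title: str) -> str:
    """Turn 'Seventh Seal, The' into 'The Seventh Seal'; leave other titles alone."""
    head, sep, tail = title.rpartition(", ")
    if sep and tail.lower() in ARTICLES:
        return f"{tail} {head}"
    return title


def to_sort_order(title: str) -> str:
    """Turn 'The Seventh Seal' into 'Seventh Seal, The'; leave other titles alone."""
    first, sep, rest = title.partition(" ")
    if sep and first.lower() in ARTICLES:
        return f"{rest}, {first}"
    return title


def comparison_key(title: str) -> str:
    """Natural order, lowercased, punctuation stripped, whitespace collapsed."""
    text = to_natural_order(title).lower()
    text = re.sub(r"[^\w\s]", "", text)
    return " ".join(text.split())


def fits_int32(inventory_id: str) -> bool:
    """True if the ID, read as a decimal number, fits in a signed 32-bit integer."""
    return INT32_MIN <= int(inventory_id) <= INT32_MAX
```

With this, `comparison_key("About Last Night...")` and `comparison_key("About Last Night")` come out the same, and `fits_int32("549526761075847950")` is false. A mismatch like "7" vs "Seven" is not handled here and would need its own mapping.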

I have done a little more work with the ZIP site for my own use - I can programmatically add and delete titles from my queue, or change the priority (ASAP/Standard/Park) to put things in my desired order.

As I point out elsewhere, Hollyweb's site is filled with older titles that they no longer have available for rent, but with no indication of this to the public. I've managed in the case of some of the titles in my comparison list to confirm existence or non-existence with them, but I'm hoping they can be convinced to tidy things up (no such luck). The American company GreenCine also shows some titles they don't currently have, but these are clearly indicated as Out of Print, or with a "Suggest" button instead of "Rent" when it is something they just don't have yet, which makes it easy for me to tell if they actually have a specific DVD.

Finally, the database does a search of the Roger Ebert reviews, and retrieves the link if there is one. Unfortunately, reviews prior to 1985 are not online. I have one of his older book collections, so I've filled in the rating for some of these movies where possible.

I've created a couple of routines to generate web pages from the results - the inventory comparison tables I've posted. For myself, I've also been keeping track of the Canada Post mailing times.

My choice of using the IMDB database number as my unique identifier had one unfortunate side effect. IMDB has one entry for an entire TV series, such as the Sopranos. However, there may be multiple DVD products, such as "Season One", "Season Two", etc. And all the DVD rental companies split up multiple disc sets into individual discs for rental, which adds another layer. I have to do a little work to deal with this.
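One way to picture the extra layers is as a small hierarchy: one IMDB entry, several DVD products under it, and several rentable discs under each product. The Python sketch below is only an illustration of that shape, not my actual 4D table layout; the class names, field names and the placeholder identifiers are all made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Disc:
    disc_number: int
    # Rental company name -> that company's inventory ID, kept as a string
    # so that very long IDs are not a problem.
    rental_ids: Dict[str, str] = field(default_factory=dict)


@dataclass
class DvdProduct:
    asin: str   # Amazon identifier for this product (placeholder values below)
    label: str  # e.g. "Season One"
    discs: List[Disc] = field(default_factory=list)


@dataclass
class Title:
    imdb_id: str  # one IMDB entry covers the whole series
    name: str
    products: List[DvdProduct] = field(default_factory=list)

    def all_discs(self) -> List[Tuple[str, int]]:
        """Flatten to (product label, disc number) pairs, one per rentable disc."""
        return [(p.label, d.disc_number) for p in self.products for d in p.discs]


# Placeholder identifiers and disc counts, invented for the example.
series = Title(
    imdb_id="IMDB-PLACEHOLDER",
    name="The Sopranos",
    products=[
        DvdProduct("ASIN-PLACEHOLDER-1", "Season One", [Disc(1), Disc(2)]),
        DvdProduct("ASIN-PLACEHOLDER-2", "Season Two", [Disc(1), Disc(2)]),
    ],
)
rentable = series.all_discs()
```

The point of the flattening step is that the rental companies deal in individual discs, so a comparison table needs one row per disc even though IMDB only knows about the series as a whole.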