Wednesday, December 13, 2006

Counting Netflix: in which Google thinks I'm a hacker

Inspired by this item on Hacking NetFlix, "Netflix Has 70,000 Titles, Blockbuster 60,000," I decided to see whether I could count the number of titles displayed on Netflix's Web site using an advanced Google search.

This is the search I attempted: allinurl: MovieDisplay -rss 1..10000000

The word MovieDisplay appears in the URL for every movie. I excluded RSS feeds, because they result in duplication, and I restricted the search to Netflix's own site, because Netflix movies are linked on hundreds of thousands of other sites. I restricted results to pages containing a number from one through ten million, because I thought that would help find only those pages with a movie id. I changed my search preferences to include all languages, and removed filtering. Surprisingly, that doubled the results!
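To make the pieces of that search easier to see, here is a minimal Python sketch that assembles the query string from its parts. The function name `build_query` and its parameters are hypothetical, chosen for illustration only; the sketch just joins strings and does not contact Google. The optional `site` parameter is left unset because the site restriction does not appear in the query text shown above.

```python
def build_query(url_term, excluded=(), number_range=None, site=None):
    """Assemble a Google-style query string from its parts (illustrative only)."""
    parts = []
    if site is not None:
        # Optional site restriction; the position in the query is an assumption.
        parts.append(f"site:{site}")
    # allinurl: asks for pages whose URL contains the term.
    parts.append(f"allinurl: {url_term}")
    # A leading minus sign excludes a word, e.g. -rss.
    parts.extend(f"-{word}" for word in excluded)
    if number_range is not None:
        # Two dots between numbers express a numeric range, e.g. 1..10000000.
        low, high = number_range
        parts.append(f"{low}..{high}")
    return " ".join(parts)


full_query = build_query("MovieDisplay", excluded=["rss"], number_range=(1, 10_000_000))
relaxed_query = build_query("MovieDisplay", excluded=["rss"])
print(full_query)
print(relaxed_query)
```

Leaving out `number_range` gives the same query without the numeric restriction, which makes it easy to compare result counts with and without it.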

Google will let me see only one page of the more than 85 thousand results of this search. If I try to go further, I get an error message which says I'm acting like spyware.

I decided to eliminate the 1..10000000, and lost ten thousand results, but now Google no longer thinks I'm a virus. A quick scan of the results shows they are all specific movie titles on the Netflix site. By this count, there are 75,400 titles on the site. See if you can duplicate my results and let me know if you get a different number.

Update: I've repeated the search, and now I can't get more than 75,200 results.


  1. Very strange.. and why would spyware search on Google?

    I get the same results, but 75,200 movies.

  2. Maybe not spyware itself, but folks who write it, or bots, or maybe just hackers, and Google is trying to be polite.

  3. Removing the 1 thru 10mil returns the same number of results without Google freaking out on you.

    allinurl: MovieDisplay -rss

  4. Oops... it would help if I finished reading the article first.

    On another note, I'm not so sure the results are valid. By adding a movie id into the mix, I can see that each movie may have multiple entries in Google's search results.

    allinurl:MovieDisplay -rss 70036929

  5. This search returns no results:

    allinurl:MovieDisplay -rss -trkid -mqso 70036929

    This search returns 28,900 results:

    allinurl:MovieDisplay -rss -trkid -mqso

  6. I'm left to wonder about the completeness of Google's search results a la Netflix. The following search returns no results:
    allinurl:MovieDisplay 70057684

    Yet there is definitely a disc with id 70057684 (24: Season 5: Disc 4). Since Netflix says its counts include individual discs in TV series box sets, how many more aren't cataloged by the GOOG?
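The concern raised in the comments, that one movie may appear under several URLs, could be checked by collapsing result URLs to their movie ids before counting. The following is a rough Python sketch under loud assumptions: the sample URLs use a placeholder host, and the query-string layout, including the parameter name `movieid`, is assumed for illustration rather than taken from Netflix. The function name `unique_movie_ids` is likewise hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def unique_movie_ids(urls):
    """Collect the distinct movie ids found in a list of result URLs."""
    ids = set()
    for url in urls:
        # Parse the query string into a dict of parameter name -> list of values.
        query = parse_qs(urlparse(url).query)
        # "movieid" is an assumed parameter name, not a confirmed one.
        for movie_id in query.get("movieid", []):
            ids.add(movie_id)
    return ids


# Hypothetical sample data: three result URLs, two of them for the same movie.
sample_urls = [
    "http://example.com/MovieDisplay?movieid=70036929",
    "http://example.com/MovieDisplay?movieid=70036929&trkid=123",
    "http://example.com/MovieDisplay?movieid=70057684&mqso=456",
]

ids = unique_movie_ids(sample_urls)
print(len(sample_urls), "results,", len(ids), "distinct ids")
```

Counting distinct ids rather than raw results would keep tracking variants of the same page from inflating the total, though it cannot recover titles that Google never indexed.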

