Netflix Prize: Forum

Forum for discussion about the Netflix Prize and dataset.


Announcement

Congratulations to team "BellKor's Pragmatic Chaos" for being awarded the $1M Grand Prize on September 21, 2009. This Forum is now read-only.

#1 2010-03-09 07:57:32

Charlie Eppes
Member
Registered: 2006-10-12
Posts: 10

Loading IMDB into SQL and merging with Netflix IDs

Hey folks,

Now that the competition is over, can anyone offer pointers on importing IMDB into MySQL? Even better would be advice on matching IMDB entries to Netflix IDs (either the ones from the original Netflix data set or the API IDs that appeared in the post-competition data). I figure there are more than a few data quality issues with IMDB, and that someone else has already worked through them.
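To make the question concrete, here is a rough sketch of the kind of matching I have in mind, not a tested pipeline. It assumes the Netflix titles come as `id,year,title` lines (as in movie_titles.txt) and that the IMDB data has already been parsed into (key, year, title) tuples; `imdb_key`, `normalize_title` and `match_titles` are names made up for illustration. SQLite stands in for MySQL so the snippet runs on its own, and the join is a plain exact match on normalized title plus year.

```python
import re
import sqlite3
import unicodedata

LEADING_ARTICLE = re.compile(r"^(the|a|an)\s+")
TRAILING_ARTICLE = re.compile(r",\s*(the|a|an)$")


def normalize_title(title):
    """Lowercase, strip accents, articles and punctuation from a title."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.lower().strip()
    text = TRAILING_ARTICLE.sub("", text)  # "Matrix, The" -> "matrix"
    text = LEADING_ARTICLE.sub("", text)   # "The Matrix"  -> "matrix"
    text = re.sub(r"[^a-z0-9]+", " ", text)
    return text.strip()


def parse_netflix_line(line):
    """Parse one 'id,year,title' line; the title itself may contain commas."""
    movie_id, year, title = line.rstrip("\n").split(",", 2)
    # A non-numeric year (e.g. NULL) becomes None and will never join.
    return int(movie_id), (int(year) if year.isdigit() else None), title


def match_titles(netflix_rows, imdb_rows):
    """Join the two sets on (normalized title, year).

    netflix_rows: iterable of (netflix_id, year, title)
    imdb_rows:    iterable of (imdb_key, year, title); imdb_key is hypothetical
    Returns a list of (netflix_id, imdb_key) pairs.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE netflix (id INTEGER, year INTEGER, norm TEXT)")
    conn.execute("CREATE TABLE imdb (imdb_key TEXT, year INTEGER, norm TEXT)")
    conn.executemany(
        "INSERT INTO netflix VALUES (?, ?, ?)",
        [(i, y, normalize_title(t)) for i, y, t in netflix_rows],
    )
    conn.executemany(
        "INSERT INTO imdb VALUES (?, ?, ?)",
        [(k, y, normalize_title(t)) for k, y, t in imdb_rows],
    )
    rows = conn.execute(
        "SELECT n.id, i.imdb_key FROM netflix n "
        "JOIN imdb i ON n.norm = i.norm AND n.year = i.year "
        "ORDER BY n.id, i.imdb_key"
    ).fetchall()
    conn.close()
    return rows
```

An exact join like this will presumably miss titles whose year differs between the two sources, as well as alternate or translated titles, which is exactly the sort of data quality issue I'm hoping someone has notes on.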

By the way, I realize that Ilya Grigorik wrote about this in these posts:
http://www.netflixprize.com//community/ … php?id=502
http://www.igvita.com/2007/01/27/correl … -datasets/

I appreciate those posts, though I'm interested in any additional pointers or notes that can be offered.

Many thanks!


#2 2010-03-09 08:04:23

Charlie Eppes
Member
Registered: 2006-10-12
Posts: 10

Re: Loading IMDB into SQL and merging with Netflix IDs

Or has anyone ditched IMDB and just used the API indices that Netflix provided? The Netflix database accessible through the API seems sweet!

http://developer.netflix.com/docs/REST_API_Conventions
http://developer.netflix.com/docs/REST_API_Reference

I need to think this over: does anyone know what IMDB offers that Netflix doesn't?  Other than sheer, unadulterated scale, encompassing movies and TV shows from across the planet.  Hmm, I guess scale is argument enough for attempting to work with IMDB.  Who can't be mesmerized by scale?  ;-)

Thanks!

P.S. At some point my interest will extend beyond structured data sets and I'll ask about Wikipedia, but for now I'll hold off on that can of worms.

Last edited by Charlie Eppes (2010-03-09 08:07:29)
