Forum for discussion about the Netflix Prize and dataset.
Hi,
Has anyone else noticed that movie_titles.txt contains some NULL values in place of years?
This is all of them:
4388,NULL,Ancient Civilizations: Rome and Pompeii
4794,NULL,Ancient Civilizations: Land of the Pharaohs
7241,NULL,Ancient Civilizations: Athens and Greece
10782,NULL,Roti Kapada Aur Makaan
15918,NULL,Hote Hote Pyaar Ho Gaya
16678,NULL,Jimmy Hollywood
17667,NULL,Eros Dance Dhamaka
What are others doing about them: leaving them as NULL, inserting the correct years, or inserting a generic year (e.g. 2006)?
Note that the Netflix web site also does not list the years of these titles.
Regards.
Offline
Find the release dates on Amazon or approximate. Or don't use the release years or movie titles at all.
Offline
Actually, you're mistaken.
Those movies were actually released to DVD in the year 0.
Jesus was a huge fan of "Jimmy Hollywood".
Offline
I put them as 1800 just for the integrity of all my data in my database. But anything with a year of 1800 is just disregarded (the year itself, not the actual movie).
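For anyone who wants to try the same trick, here is a rough sketch of the placeholder approach. The names SENTINEL_YEAR and year_for_model are made up for illustration, and the second row in the sample is a hypothetical movie, not one from the dataset.

```python
# Placeholder year as described in the post: it means "year unknown".
SENTINEL_YEAR = 1800


def year_for_model(stored_year):
    """Return the stored year, or None when it is only the placeholder."""
    return None if stored_year == SENTINEL_YEAR else stored_year


# 16678 is one of the NULL-year titles; 99999 is a hypothetical row.
stored = {16678: 1800, 99999: 2001}

# The movie stays in the data; only its year is treated as unknown.
usable = {movie_id: year_for_model(y) for movie_id, y in stored.items()}
```

The column stays a plain integer everywhere in the database, and the check for the placeholder happens in one place just before the year is used.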
Offline
benjismith wrote:
Actually, you're mistaken.
Those movies were actually released to DVD in the year 0.
Jesus was a huge fan of "Jimmy Hollywood".
NULL and 0 are different things.
Offline
They don't have a NULL value BTW.
They have the text string "NULL"
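Since the file holds the literal text "NULL", a loader has to translate it into a real missing value itself. A minimal sketch, assuming each line is id,year,title as in the lines quoted above; parse_title_line is a made-up helper name, and the split is limited to two commas on the assumption that a title may itself contain commas.

```python
from typing import Optional, Tuple


def parse_title_line(line: str) -> Tuple[int, Optional[int], str]:
    """Split one movie_titles.txt line into (id, year or None, title)."""
    # maxsplit=2 keeps any commas inside the title intact.
    movie_id, year, title = line.rstrip("\n").split(",", 2)
    # The file stores the four characters N-U-L-L, not a real null.
    parsed_year = None if year == "NULL" else int(year)
    return int(movie_id), parsed_year, title


sample = [
    "4388,NULL,Ancient Civilizations: Rome and Pompeii",
    "16678,NULL,Jimmy Hollywood",
]
rows = [parse_title_line(line) for line in sample]
```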
Offline
okstumbler wrote:
benjismith wrote:
Actually, you're mistaken.
Those movies were actually released to DVD in the year 0.
Jesus was a huge fan of "Jimmy Hollywood".
NULL and 0 are different things.
Perhaps. Depends on your environment.
Certainly in SQL, null means "beats the s*** outta me". Or, perhaps, "I've put invalid data into the record, and I just don't care". Or, in many cases it means "I really should have created a new table, and used third normal form to link optional records from table B to the records in table A. But I didn't feel like it."
But, in many areas of computer science, null and zero are semantically identical. For example, trying to dereference a "null pointer" means that your pointer is set to zero (which is a privileged memory location owned by the OS) and you tried to access it from non-kernel-level code.
Offline
NULL and the number 0 have nothing in common conceptually. The fact that someone decides to represent NULL with the number 0, because 0 has no other useful meaning in that context (as in 'NULL pointer'), does not imply that 0 and NULL are in any way related conceptually.
And I think we can all agree that in this context (Netflix data file) NULL means "beats the s*** outta me".
To be honest I didn't quite feel like arguing over NULL. I simply wanted to point out that they are not equivalent in every case, and thus are not equivalent at all.
From your last post it sounds like you have a good understanding of NULL, and I wasn't trying to correct you. But you should know that equating NULL and 0 is a common mistake made by inexperienced computer scientists and programmers. My intention was to break this habit in a friendly way.
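A small way to see the difference in SQL, sketched with Python's built-in sqlite3 module. The movies table and the average_year helper are made up for illustration; other database engines should behave the same way for these two queries, but I have only written this against SQLite.

```python
import sqlite3


def average_year(values):
    """Load the values into a throwaway table and return AVG(year)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE movies (year INTEGER)")
    conn.executemany(
        "INSERT INTO movies (year) VALUES (?)", [(v,) for v in values]
    )
    (avg,) = conn.execute("SELECT AVG(year) FROM movies").fetchone()
    conn.close()
    return avg


# Python's None is stored as SQL NULL, and AVG skips NULL rows.
avg_with_null = average_year([2000, None])
# A 0 is an ordinary number and drags the average down.
avg_with_zero = average_year([2000, 0])

# Comparing NULL with 0 yields NULL (unknown), which comes back as None.
conn = sqlite3.connect(":memory:")
(null_equals_zero,) = conn.execute("SELECT NULL = 0").fetchone()
conn.close()
```

With a NULL the average is 2000.0; with a 0 it is 1000.0. So storing 0 for an unknown year would quietly skew anything computed from the year column.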
Offline
okstumbler wrote:
I wasn't trying to correct you. But you should know that equating NULL and 0 is a common mistake made by inexperienced computer scientists and programmers. My intention was to break this habit in a friendly way.
Gotcha. I appreciate it.
I think this is a nice environment for new coders to get their feet wet with statistical machine learning algorithms, especially since there are lots of experienced developers hanging out here, willing to lend a helping hand.
My only intention with the original null/zero comment was to make a hilarious joke about movies being released in the year Zero, Anno Domini.
Offline
In case anybody cares, here are the DVD release dates I found for the NULL date movies, from various sources (Amazon and some Indian movie site):
4388, 2002
4794, 2002
7241, 2002
10782, 2005
15918, 2005
16678, 2004
17667, 1999
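If you want to patch these in at load time, something like the following should do. It is only a sketch: FOUND_YEARS copies the list above as-is (DVD release years from the sources mentioned, not independently verified), fill_year is a made-up helper, and a missing year is assumed to be represented as None.

```python
# DVD release years from the list above, keyed by movie id.
FOUND_YEARS = {
    4388: 2002,
    4794: 2002,
    7241: 2002,
    10782: 2005,
    15918: 2005,
    16678: 2004,
    17667: 1999,
}


def fill_year(movie_id, year):
    """Keep a known year; otherwise look the movie up in FOUND_YEARS."""
    if year is not None:
        return year
    # Returns None if the movie is not one of the seven listed titles.
    return FOUND_YEARS.get(movie_id)


patched = fill_year(16678, None)
```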
Offline
benjismith wrote:
Certainly in SQL, null means "beats the s*** outta me". Or, perhaps, "I've put invalid data into the record, and I just don't care". Or, in many cases it means "I really should have created a new table, and used third normal form to link optional records from table B to the records in table A. But I didn't feel like it."
Nope! It means "I've fallen and I can't get up!"
Mwuhahahhahaha
Offline
willakawill wrote:
They don't have a NULL value BTW.
They have the text string "NULL"
But in the Netflix database, I'll bet they have a NULL value. It's just the conversion into text files that causes the translation to what you see here.
Offline