Forum for discussion about the Netflix Prize and dataset.
Not living in the USA, I don't have a Netflix account, so I hope someone can help with a simple question:
How is the aggregate rating information for a movie displayed to a user? In other words, if the average rating for a movie is (say) 3.7 stars, does the web site show it as 3.7 stars, or is it rounded to 4 stars?
These questions are more for the Netflix guys:
If the site uses integer ratings for aggregate rating info, how are they summarized? Arithmetic average plus rounding or truncating? Mode? Median?
Is the number of ratings available for each movie and user in the training set highly correlated with the 'real' number of ratings in the full dataset? In other words, can a movie (or user) with a relatively high number of ratings in the full dataset have a relatively low number of ratings in the training set, and vice versa?
Last edited by yarkun (2006-10-03 08:33:10)
yarkun wrote:
How is the aggregate rating information for a movie displayed to a user? In other words, if the average rating for a movie is (say) 3.7 stars, does the web site show it as 3.7 stars, or is it rounded to 4 stars?
It shows you the fractional amount. For example, picking a random DVD off of the suggestions on my homepage (Grey's Anatomy: Season 1), it has the following information in the details:
"Average of people who rate like you: 4.7 stars
Average of 154,393 ratings: 4.6 stars"
The ratings in the training set are a random sample from our database as of December 31, 2005. It is possible, but very unlikely, for a movie to have many ratings in our database but relatively few in the training set.
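To give a rough feel for why "possible, but very unlikely" is plausible, here is a minimal sketch. It assumes, purely for illustration, that each rating is kept independently with the same probability, so a movie's training-set count follows a binomial distribution. The actual sampling procedure is not described in this thread, and the names `relative_spread` and `keep_prob` are hypothetical.

```python
import math

def relative_spread(full_count, keep_prob):
    """Expected training-set count and its relative standard deviation.

    Assumption (not confirmed by Netflix): each of the `full_count` ratings
    is kept independently with probability `keep_prob`, so the kept count
    is Binomial(full_count, keep_prob).
    """
    expected = full_count * keep_prob
    std = math.sqrt(full_count * keep_prob * (1 - keep_prob))
    return expected, std / expected

# A movie with 10,000 ratings and a hypothetical 50% keep probability:
expected, rel = relative_spread(10_000, 0.5)
print(expected, rel)  # 5000.0 expected ratings, relative spread of about 1%
```

Under this assumption the relative spread shrinks as the number of ratings grows, so heavily rated movies would keep nearly the same rank by count in the sample, while sparsely rated movies could vary more.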
As to how we show ratings on the site, this isn't really a question about the prize, but I'll answer it anyway.
Users interact with a graphical "widget" that acts as both an input and an output device. The widget shows a series of 5 stars, which may be colored. For output it shows either the user's rating for the movie (yellow stars) or, if they have not rated it, our prediction of how much they will enjoy the movie (red stars). Users can click on a star to rate the movie, turning the widget from showing red stars to showing yellow stars. User ratings are integral values from 1 to 5 stars. (There is an additional out-of-band rating of "Not Interested" that plays no part in the prize.)
As Erasmus noted, we also show summary statistics for individual movies: the total number of ratings and the mean value of those ratings.
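For readers comparing the options raised in the original question, the following minimal sketch computes the count and mean (the two statistics described above) alongside the rounded, truncated, median, and mode alternatives for a list of integer star ratings. It is an illustration only, not the site's code; the function name `summarize` and the sample ratings are made up.

```python
import math
from statistics import mean, median, mode

def summarize(ratings):
    """Return several candidate summaries of integer 1-5 star ratings."""
    avg = mean(ratings)
    return {
        "count": len(ratings),
        "mean": round(avg, 1),                  # fractional display, e.g. 3.7
        "mean_rounded": math.floor(avg + 0.5),  # round half up to whole stars
        "mean_truncated": math.floor(avg),      # drop the fractional part
        "median": median(ratings),
        "mode": mode(ratings),
    }

# Hypothetical ratings whose mean is 3.7, as in the question above.
sample = [5, 4, 4, 4, 4, 3, 3, 2, 5, 3]
print(summarize(sample))
# count 10, mean 3.7, rounded 4, truncated 3, median 4.0, mode 4
```

The integer summaries can differ from each other by a full star for the same data, which is one reason a fractional mean carries more information.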
Is the goal of the Prize contest to predict the preference-based rating (similar to the "Average of people who rate like you" figure mentioned by Erasmus)? Can you confirm or correct?