Netflix Prize: Forum

Forum for discussion about the Netflix Prize and dataset.

Announcement

Congratulations to team "BellKor's Pragmatic Chaos" for being awarded the $1M Grand Prize on September 21, 2009. This Forum is now read-only.

#1 2009-07-07 11:45:40

DB
Member
From: Home
Registered: 2006-10-20
Posts: 114

PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

In the first 1000 rows of the leaderboard I have counted more than 100 users whose names all share the same format: jo<4 digits>mo.
They start at place 36 and go down from there, and all of them submitted today at about the same time.

This team now occupies over 10% of the top 1000, and it seems that they are using some kind of bot to create new users and to get around the rules by making hundreds of submissions per day.
They are obviously trying to train against the test set.

PrizeMaster - can you please ban this team and delete all its fake users?

Offline

 

#2 2009-07-07 14:22:14

DB
Member
From: Home
Registered: 2006-10-20
Posts: 114

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

Thanks to PrizeMaster for handling this so fast.
Now that all the jo#mo's except one have been deleted, I have moved back up about 70 places on the leaderboard...

Offline

 

#3 2009-07-08 04:57:48

CS1
Member
From: San Jose, CA
Registered: 2006-10-02
Posts: 151

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

Eeck.... "They're back...."
http://www.youtube.com/watch?v=F8TrCJQAid0#t=0m30s

I don't think the zombie JoMos are aware that they've been killed.  They're respawning.  Get out, get out while you still can!  Run for the hills...steepest ascent!  :D

CS1

Offline

 

#4 2009-07-08 07:54:40

DB
Member
From: Home
Registered: 2006-10-20
Posts: 114

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

Yes, they are back...
Plus, in addition to jo#mo, there is now a new variation: OPAG#.

I guess the user registration and submission mechanisms need at least a CAPTCHA. :/

Since the highest jo#mo is at place 37 on the leaderboard, to me this means that this is the doing of one of the top 36 teams, which is trying to train against the qualifying set.
Without saying whom I suspect, I would say that it is probably one of the teams closest to place 37... ;)

Last edited by DB (2009-07-08 08:01:59)

Offline

 

#5 2009-07-09 07:20:35

DB
Member
From: Home
Registered: 2006-10-20
Posts: 114

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

Today they are back AGAIN, and now with a new level of sophistication: they use a variety of real first names as prefixes, in the following format: <name><number>k.
Examples:
Charlize38k
Claudia1k
Alicia35k
Carrie46k

All of these were submitted at about the same time. I have counted 12 such entries in the top 270 places of the leaderboard alone.

They are obviously defying the PrizeMaster and the rules.
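
The three name formats reported so far in this thread (jo<4 digits>mo, OPAG#, and <name><number>k) are regular enough to be matched mechanically. Below is a minimal Python sketch of such a check. The pattern table and the function names are hypothetical, the exact shape of OPAG# (assumed here to be "OPAG" followed by digits) is a guess, and nothing here reflects how PrizeMaster actually identifies these accounts.

```python
import re

# Hypothetical patterns transcribed from the formats described in this thread.
# "OPAG#" is assumed to mean "OPAG" followed by one or more digits.
BOT_NAME_PATTERNS = {
    "jo#mo": re.compile(r"jo\d{4}mo"),
    "OPAG#": re.compile(r"OPAG\d+"),
    "name#k": re.compile(r"[A-Z][a-z]+\d+k"),
}


def classify_team_name(name):
    """Return the label of the first pattern the whole name matches, or None."""
    for label, pattern in BOT_NAME_PATTERNS.items():
        if pattern.fullmatch(name):
            return label
    return None


def count_suspects(names):
    """Count how many names fall under each pattern label."""
    counts = {}
    for name in names:
        label = classify_team_name(name)
        if label is not None:
            counts[label] = counts.get(label, 0) + 1
    return counts


if __name__ == "__main__":
    sample = ["Charlize38k", "Claudia1k", "Alicia35k", "Carrie46k", "jo1234mo"]
    print(count_suspects(sample))
```

A match on its own proves nothing: a legitimate team could easily have a name of the form <name><number>k, so a check like this is only useful together with other signals such as many matching names submitted at about the same time.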

Last edited by DB (2009-07-09 07:24:06)

Offline

 

#6 2009-07-09 18:40:56

Phil
Member
Registered: 2006-10-09
Posts: 132

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

They are probably hoping to use this information about the test set, and transfer it to some other team that can't be connected with the bot entries.

A simple solution would be for the contest rules to stipulate that the winner must produce code that Netflix can duplicate their results with, without using information from test sets.

Offline

 

#7 2009-07-10 08:53:42

DB
Member
From: Home
Registered: 2006-10-20
Posts: 114

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

Update for today:
The team that brought you the jo#mo's and the girl#k's has, now that PrizeMaster has deleted their previous fake sub-teams, switched to a new method, perhaps manual or driven from prepared lists, that still gets them multiple submissions in the same day.
Here is a list of their new "teams" in the top 300 places in the leaderboard:
impossiblemission
Forwinner
killmethistest
pupubaby
sunnylike
Richardwin
nevergiveup
teufelisme
yeman
Windman
pplive
xiisbadguy
readytogo
newteamforprize


Obviously they don't give up easily, and they think they are outsmarting everybody by using pattern-less names.
Well, they are not.
Further, having followed this for a few days, I now have a strong suspicion as to which "real" team is behind all this.
I am just waiting for some additional evidence, and then I will publish their name.

Last edited by DB (2009-07-10 09:00:28)

Offline

 

#8 2009-07-10 09:28:27

CS1
Member
From: San Jose, CA
Registered: 2006-10-02
Posts: 151

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

DB wrote:

I am just waiting for some additional evidence, and then I will publish their name.

Such as the IP addresses?  :D  I wouldn't worry about it too much.  They lack imagination and ethics.  This is obvious to Netflix and the team is clearly signaling this fact to everyone else.

Anyway, how many teams can one person be on?  I mean, other than Kevin Costner, of course: http://movies.toptenreviews.com/actors/ … _actor.htm   He is the team.  :D

CS1

Offline

 

#9 2009-07-10 11:08:56

LMV
Member
Registered: 2008-05-24
Posts: 46
Website

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

Yeah! It's really amazing that anyone could be so daft as to hope that "no one would notice", least of all the PrizeMaster.
To quote Einstein:
"Two things are infinite: the universe and human stupidity; and I'm not sure about the universe."
:o

Offline

 

#10 2009-07-10 11:58:08

edr2
Member
Registered: 2009-07-02
Posts: 58

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

One draconian solution to the recent score-bot attacks would be a moratorium on new 'teams' until the July 26th deadline.  While this would be grossly unfair to some genius out there who might come up with a winning solution, and just decided to sign up, it wouldn't have any impact on anybody who is already signed up and 'playing by the rules'.

Offline

 

#11 2009-07-10 12:02:59

edr2
Member
Registered: 2009-07-02
Posts: 58

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

... although it would also eliminate the chance of us getting to see "Belkor's Chaotic and Pragmatic Grand Prize Team ride the XLVector to the Vandelay Opera"

Offline

 

#12 2009-07-10 12:13:52

Lazy Monkey
Member
Registered: 2007-12-13
Posts: 93

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

I don't think a moratorium on new teams is called for.

Making hundreds of submissions only makes sense if the guilty team (assuming there is a guilty team) is trying to tune a blending algorithm against the quiz set. If that is what is happening and they get the lowest RMSE, they are bound to be caught and disqualified when they disclose to Netflix how they did it.

It is at least possible that we are seeing associated but permitted teams working on a project in a co-ordinated way (say all of the students in a third year computer science course somewhere in the world.)
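
To make the concern about tuning a blend against the quiz set concrete, here is a small Python sketch on synthetic data. It relies on the identity ||y - p||^2 = N * RMSE^2: if the scorer reports the RMSE of each submitted prediction vector p, and also of an all-zeros vector (an assumption about what the scorer would accept), then the inner product of p with the hidden ratings y can be recovered, and that is all a least-squares blend needs. The function names and the data are hypothetical, and this is not a claim about what any team actually did. A real leaderboard also rounds the reported RMSE, so the recovered values would only be approximate.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rmse(pred, truth):
    """Stand-in for the leaderboard: the only place the hidden ratings are used."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))


def blend_weights_from_rmse(p1, p2, rmse1, rmse2, rmse_zero):
    """Least-squares weights for w1*p1 + w2*p2, using only reported RMSEs."""
    n = len(p1)
    yy = n * rmse_zero ** 2  # ||y||^2, from the all-zeros submission
    # ||y - p||^2 = n * rmse^2  implies  p.y = (||p||^2 + ||y||^2 - n * rmse^2) / 2
    b1 = (dot(p1, p1) + yy - n * rmse1 ** 2) / 2
    b2 = (dot(p2, p2) + yy - n * rmse2 ** 2) / 2
    # Solve the 2x2 normal equations with Cramer's rule.
    a11, a12, a22 = dot(p1, p1), dot(p1, p2), dot(p2, p2)
    det = a11 * a22 - a12 * a12
    w1 = (b1 * a22 - b2 * a12) / det
    w2 = (a11 * b2 - a12 * b1) / det
    return w1, w2


if __name__ == "__main__":
    random.seed(0)
    hidden = [float(random.randint(1, 5)) for _ in range(1000)]  # synthetic ratings
    p1 = [t + random.gauss(0, 1.0) for t in hidden]
    p2 = [t + random.gauss(0, 1.2) for t in hidden]
    # Each rmse() call below plays the role of one leaderboard submission.
    r1, r2 = rmse(p1, hidden), rmse(p2, hidden)
    r0 = rmse([0.0] * len(hidden), hidden)
    w1, w2 = blend_weights_from_rmse(p1, p2, r1, r2, r0)
    blend = [w1 * a + w2 * b for a, b in zip(p1, p2)]
    print("single models:", r1, r2, "blend:", rmse(blend, hidden))
```

With two models this costs three submissions. With hundreds of models it would take roughly one submission per model, which is why a one-submission-per-day limit matters and why a flood of throwaway accounts looks suspicious.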

Offline

 

#13 2009-07-10 15:02:46

CS1
Member
From: San Jose, CA
Registered: 2006-10-02
Posts: 151

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

Lazy Monkey wrote:

It is at least possible that we are seeing associated but permitted teams working on a project in a co-ordinated way (say all of the students in a third year computer science course somewhere in the world.)

True, this could be keeping them off the streets.  If they didn't have this to do, they'd probably be spending time sending us spam.

Oh look! I just got a nice email from the dethroned princess of Nigeria.  Hmm...let me see.  Oh wow, fortune can be mine if I just give them my bank account to put their family funds into.

Why have I been working so hard on this competition, when riches await through easier means?  See ya suckers! :lol:

CS1

Offline

 

#14 2009-07-10 15:56:37

Lazy Monkey
Member
Registered: 2007-12-13
Posts: 93

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

CS1 wrote:

If they didn't have this to do, they'd probably be spending time sending us spam.

That could be closer to the truth than most of us are comfortable admitting.

Maybe they are practicing with data mining so they can do a better job of targeting spam.

Offline

 

#15 2009-07-10 18:57:45

BellKor
Member
Registered: 2007-11-26
Posts: 8

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

edr2 wrote:

... although it would also eliminate the chance of us getting to see "Belkor's Chaotic and Pragmatic Grand Prize Team ride the XLVector to the Vandelay Opera"

now *that* was funny :lol:

Offline

 

#16 2009-07-11 09:44:27

DB
Member
From: Home
Registered: 2006-10-20
Posts: 114

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

CS1 wrote:

Such as the IP addresses?  :D

I don't have THIS data, although PrizeMaster may have it. No, I am using much simpler and public means.
But since they did not re-appear today, I will wait as well. We may have frightened the zombies away, at last... :lol:

Offline

 

#17 2009-07-11 20:44:14

Lazy Monkey
Member
Registered: 2007-12-13
Posts: 93

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

DB wrote:

xlvector is gone from the leaderboard... ;)

xlvector is a member of WeAretheBorg and the Opera Solutions and Vandelay United teams so, technically, he is still on the leaderboard.

While spam submissions (if they were indeed non-compliant) may be a nuisance, they are not going to give anyone much of an advantage in the contest.

Interesting to note that the number of submissions seems to have fallen in the last day or two from 200+ per day to 50.

Last edited by Lazy Monkey (2009-07-11 21:05:05)

Offline

 

#18 2009-07-12 07:05:40

xlvector
Member
Registered: 2009-05-11
Posts: 9

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

I want to explain something. When team VI was created, there was some chaos because every member of VI wanted to submit results from their own blending algorithms using the models in VI. But we did not want to submit these results under our own teams, so we created several new sub-teams of VI (4 teams) and submitted results through them. We have now decided who will submit results as VI, so we have removed all the sub-teams of VI.

I had submitted results using all the models in VI under the team name xlvector, and I now think this may have led other people to believe that I produced those results on my own. So I removed team xlvector myself and am using a new team name, "xiangliang", to submit results from my own models.

P.S. After I removed my team name, a member of VI told me that you were discussing me here, so I wanted to explain.

Last edited by xlvector (2009-07-12 17:28:00)

Offline

 

#19 2009-07-12 11:00:05

ADifferentName
Member
Registered: 2008-06-29
Posts: 19

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

I want to add to xlvector's explanation.
"xlvector" was not removed from the leaderboard by Netflix.

Several teams that were formed by members of Vandelay Industries ! were voluntarily taken down from the leaderboard.  The teams were not breaking any rules.

We were responding to comments from other teams that:  1) Vandelay Industries ! teams were taking up too much space on the leaderboard, blocking the way for other deserving teams  and 2) it gave the appearance that Vandelay Industries ! was trying to bend the rules to make more than one submission each day.

xlvector started as a single person, but later "xlvector" became a team of people.  (I am a member of the xlvector team).  Looking back, I wish that we had chosen a different name for the team so that there would be no confusion between "xlvector" the person and "xlvector" the team.

xlvector was taken off the leaderboard with good motives.  It is unfortunate that anyone mistakenly believed that xlvector was removed by Netflix.

We're working very hard to play by the rules and to be cooperative with the rest of the Netflix Prize community.  There are still two teams ahead of Vandelay Industries ! and we've got a long way to go to catch up.

Just a suggestion:  I think that we ought to wait for Netflix to "out" whoever is making the mass submissions.  Everybody should send any evidence or suspicions to the prize master.

Good luck everybody!

Greg

P.S.  Lazy Monkey is correct that xlvector (the individual) is a member of "We are the borg", "Vandelay Industries !", and "Opera Solutions and Vandelay United".  We are working to remove "We are the borg" from the leaderboard to free up another spot in the top 20.

Last edited by ADifferentName (2009-07-12 11:06:59)

Offline

 

#20 2009-07-12 14:14:26

CS1
Member
From: San Jose, CA
Registered: 2006-10-02
Posts: 151

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

ADifferentName wrote:

We are working to remove "We are the borg" from the leaderboard to free up another spot in the top 20.

The Borg cannot be removed.  They learn quickly, and assimilate everyone.

Still, good luck: may the farce be with you!!!  :lol:

CS1

Last edited by CS1 (2009-07-12 14:14:40)

Offline

 

#21 2009-07-16 08:00:45

DB
Member
From: Home
Registered: 2006-10-20
Posts: 114

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

So, after a few days in the underground, they feel safe and they start emerging again.

None of the following teams existed yesterday, but today they are suddenly here, with submissions all in the top 275.
The highest is in place 24 in the leaderboard, which narrows the list of "real" teams that are able to do this to the top 23.
Some of them use names from previous rounds, and were already eliminated once.

Here are some of the names:
Forwinner
xiisbadguy
you_sb
killmethistest
olympic champion
newteamforprize
readytogo
impossiblemission
Richardwin
youneverknow
hellocat
winnetflix
yuanyuanpp
bigbangfan
teufelisme (place 24)

Offline

 

#22 2009-07-16 09:18:25

dale5351
Member
From: Columbia, MD
Registered: 2008-10-18
Posts: 116

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

Actually, this multiple posting is not new.  There are a number of obvious multiple postings that can be seen by simply sorting the names in the top 1000 of the leaderboard.  Examples:

Auroral...           on 2007-05-01
Wattee...            on 2008-10-05
alone...             on 2009-07-13
eastevil...          on 2007-04-20
hello...             on 2007-07-19
jingqian...          on 2007-05-02
majia...             on 2009-07-15
meow...              on 2007-07-19
postechmlgport...    on 2009-03-30
pqt2001-...          on 2008-04-14
pyoung98...          on 2009-04-06
ttp...               on 2009-03-13
xlzg_china...        on 2007-09-21
Goingsolo...         on 2009-07-10

If Netflix made the effort to eliminate those entries, it would remove 50-100 teams (I did not bother to count) from the top 1000, but IMO that does not really matter.  What matters is that Netflix find some way to prevent such gaming of the system in these final days when the real money is on the line.
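
The "simple sort of names" check can be approximated in a few lines of Python. The sketch below groups entries that share a name stem (the name with any trailing digits, hyphens, or underscores stripped) and a submission date, and reports groups above a size threshold. The stemming rule, the threshold, and the sample names are all assumptions made for illustration. It is not the procedure used to produce the list above.

```python
import re
from collections import defaultdict


def name_stem(name):
    """Lowercase the name and strip trailing digits, hyphens, and underscores."""
    return re.sub(r"[\d_\-]+$", "", name).lower()


def find_name_clusters(entries, min_size=3):
    """Group (name, date) pairs by (stem, date) and keep the large groups.

    entries: iterable of (team_name, submission_date) tuples.
    Returns a dict mapping (stem, date) to the sorted list of matching names.
    """
    groups = defaultdict(list)
    for name, date in entries:
        groups[(name_stem(name), date)].append(name)
    return {key: sorted(names) for key, names in groups.items() if len(names) >= min_size}


if __name__ == "__main__":
    # Made-up leaderboard rows, loosely modeled on the prefixes listed above.
    rows = [
        ("hello1", "2007-07-19"),
        ("hello2", "2007-07-19"),
        ("hello3", "2007-07-19"),
        ("meow1", "2007-07-19"),
        ("SomeOtherTeam", "2009-06-26"),
    ]
    for (stem, date), names in find_name_clusters(rows).items():
        print(stem, date, names)
```

Grouping on the date as well as the stem keeps unrelated teams with similar names apart, at the cost of missing a cluster whose submissions are spread over several days.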

Offline

 

#23 2009-07-16 09:26:03

DB
Member
From: Home
Registered: 2006-10-20
Posts: 114

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

dale5351 wrote:

IMO that does not really matter

Well, it does matter to the rest of us, who will not have the $1M but will have a place on the leaderboard when the contest finishes. If I am pushed down 50-100 (or even 15) places on the leaderboard by a single team, this upsets me... :(

What matters is that Netflix find some way to prevent such gaming of the system in these final days when the real money is on the line.

They should just clean the table once and then prevent creation of any new teams in the last few days. Nothing in the rules prevents them from doing this.

Offline

 

#24 2009-07-16 10:30:31

CS1
Member
From: San Jose, CA
Registered: 2006-10-02
Posts: 151

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

Does anyone remember the time...  when the leader was a big ol DB?  :lol:
http://www.netflixprize.com//community/ … php?id=261

I don't mean to imply the "DB" in the forum was the same as "Giant DB", but I'm a sucker for irony and coincidences.

Hey, where is UToronto, anyway?  They've been awfully quiet.

Uh oh.

CS1

Last edited by CS1 (2009-07-16 10:31:29)

Offline

 

#25 2009-07-16 11:09:19

DB
Member
From: Home
Registered: 2006-10-20
Posts: 114

Re: PrizeMaster: jo#mo polluting the leaderboard w hundreds of fake users

Well, I don't recall this specific team, but it was not me. Unfortunately I was never at the top of the table, although I am now placed in the top 270 (on days when the zombies don't drive me down below place 300... :P)

Offline

 
