chess rating algorithm that performs better than ELO system
Interesting article about finding a chess rating algorithm that performs better than the official Elo rating system:
http://games.slashdot.org/story/10/08/0 ... e-Over-Elo
Re: chess rating algorithm that performs better than ELO sys
I don't know about the data pool in this experiment, but just switching the expectation curve from a bell curve (normal distribution) to a logistic one can give a noticeable correction when typical games are between players with a large skill differential.
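To make the bell-versus-logistic point concrete, here is a minimal sketch comparing the two expectation curves. The constants are assumptions, not something stated in the thread: the conventional logistic scale of 400 rating points, and a normal model in which each player's performance has a standard deviation of 200 points. The function names are hypothetical.

```python
import math

def expected_logistic(diff, scale=400.0):
    """Expected score of a player rated `diff` points above the opponent (logistic curve)."""
    return 1.0 / (1.0 + 10.0 ** (-diff / scale))

def expected_normal(diff, sigma=200.0 * math.sqrt(2.0)):
    """Expected score under a normal model; sigma is the assumed std. dev. of the
    performance difference (200 points per player, combined)."""
    return 0.5 * (1.0 + math.erf(diff / (sigma * math.sqrt(2.0))))

# Print both curves side by side for growing rating gaps.
for diff in (0, 100, 200, 400, 600, 800):
    print(diff, round(expected_logistic(diff), 4), round(expected_normal(diff), 4))
```

Under these assumed constants the two curves agree closely for small gaps and drift apart as the gap grows, with the logistic curve giving the underdog a somewhat better chance.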
However, the Elo Benchmark is back on top (as of right now):
#  Team Name        RMSE      #Submits  Date of Latest Submission
1  Elo Benchmark    0.723834  4         1:27pm, Thursday 5 August 2010
2  EdROpen          0.729125  2         3:47pm, Wednesday 4 August 2010
3  whiteknightOpen  0.731656  4         6:29pm, Wednesday 4 August 2010
4  Chris_ROpen      0.742663  2         1:27am, Thursday 5 August 2010
5  ulvundOpen       0.742744  8         10:23am, Thursday 5 August 2010
6  FirstTryOpen     0.833059  1         2:01pm, Thursday 5 August 2010
"Note, the method for creating seed ratings for Elo Benchmark is being refined, so don't be surprised if the benchmark improves a little in the competition's first week."
I don't know if RMSE is the best metric here. Listing it to 6 significant figures (and with no error bound) with only 781 data points (to be 7809 in the end) is not a good sign from a quality-control standpoint.
"We reserve the right to disqualify any competitor who is blatantly attempting to decode the leaderboard portion of the test dataset."
Well, at least the RE guys have something to do...
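On the point about quoting RMSE without an error bound: one common way to attach a rough bound is a percentile bootstrap. The sketch below is only an illustration under stated assumptions. The data are synthetic (made up here, not the competition's), the sample size of 781 is borrowed from the post, and `rmse` and `bootstrap_rmse_interval` are hypothetical helper names. It does not reproduce how the competition itself scores submissions.

```python
import math
import random

def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def bootstrap_rmse_interval(predicted, actual, n_resamples=500, seed=0):
    """Approximate 95% percentile-bootstrap interval for the RMSE."""
    rng = random.Random(seed)
    n = len(actual)
    estimates = []
    for _ in range(n_resamples):
        # Resample data points with replacement and recompute the metric.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(rmse([predicted[i] for i in idx], [actual[i] for i in idx]))
    estimates.sort()
    return estimates[int(0.025 * n_resamples)], estimates[int(0.975 * n_resamples) - 1]

# Synthetic stand-in data: 781 outcomes and arbitrary predictions.
data_rng = random.Random(42)
actual = [data_rng.choice([0.0, 0.5, 1.0]) for _ in range(781)]
predicted = [data_rng.random() for _ in range(781)]

point = rmse(predicted, actual)
low, high = bootstrap_rmse_interval(predicted, actual)
print(f"RMSE = {point:.3f}, approx. 95% interval [{low:.3f}, {high:.3f}]")
```

If the interval is much wider than the gaps between leaderboard entries, the trailing digits of the reported RMSE carry little information.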
- hyatt
- Posts: 1242
- Joined: Thu Jun 10, 2010 2:13 am
- Real Name: Bob Hyatt (Robert M. Hyatt)
- Location: University of Alabama at Birmingham
Re: chess rating algorithm that performs better than ELO sys
My take on this is that the Elo system is nothing more than a curve that approximates expected results, which has been tuned to be as accurate as possible when dealing with humans playing the game of chess. Curve-fitting is taught in almost every numerical analysis course, and it's based on the idea of minimizing the error of the curve when matched against observed data. But humans and computers are different. And perhaps humans have changed enough since this was first done that the system needs additional tuning. Maybe the original approach is too simplistic. And then again, the Elo system is abused daily when different sorts of time controls are commingled into one rating pool. A human playing game/5min is quite different from a human playing with even a 1 sec increment. The system only deals with wins and losses. What about when one player is sick? What happens when one plays several consecutive games? What about an adjourned game that resumes after the final round of the day, so that one player stays up late and another does not? All of these influence the games, yet none of them are factored into the rating system to affect future predictions.
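The curve-fitting idea described above can be sketched in a few lines: pick the curve parameter that minimizes squared error against observed results. Everything here is an assumption for illustration. The "observations" are generated from the conventional 400-point logistic curve rather than taken from real games, the fit is a plain grid search, and the function names are hypothetical. It is not a reconstruction of how the Elo system was actually tuned.

```python
def expected_score(diff, scale):
    """Logistic expected score for a rating difference `diff` and curve scale `scale`."""
    return 1.0 / (1.0 + 10.0 ** (-diff / scale))

def squared_error(scale, observations):
    """Sum of squared gaps between the curve and observed average scores."""
    return sum((expected_score(d, scale) - s) ** 2 for d, s in observations)

def fit_scale(observations, candidates):
    """Return the candidate scale with the smallest squared error."""
    return min(candidates, key=lambda scale: squared_error(scale, observations))

# Synthetic observations: (rating difference, average score) pairs
# generated from a scale of 400, with no noise added.
observations = [(d, expected_score(d, 400.0)) for d in range(-600, 601, 100)]

best = fit_scale(observations, range(200, 801, 25))
print(best)
```

With real game data the observed scores would be noisy averages per rating-difference bucket, and the best-fitting scale could differ between player pools, which is the kind of retuning the post speculates about.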
Re: chess rating algorithm that performs better than ELO sys
hyatt wrote: [...] What about when one player is sick? What happens when one plays several consecutive games? What about an adjourned game that resumes after the final round of the day so that one player stays up late, another does not? All of these influence the games, yet none of them are factored into the rating system to affect future predictions.
Or if one side is using mind control rays; reference --> Fischer / Spassky
Re: chess rating algorithm that performs better than ELO sys
benstoker wrote: Or if one side is using mind control rays; reference --> Fischer / Spassky
Good call. Also Topalov / Kramnik

Re: chess rating algorithm that performs better than ELO sys
benstoker wrote: Or if one side is using mind control rays; reference --> Fischer / Spassky
lmader wrote: Good call. Also Topalov / Kramnik
Yes, I would also like to know about that. You have gone into a lot of detail about computers and humans.
- noctiferus
- Posts: 122
- Joined: Thu Jun 10, 2010 7:57 am
- Location: Ivrea (To), Italy
Re: chess rating algorithm that performs better than ELO sys
If anybody is interested in more details, this is the site of the rating competition:
http://www.kaggle.com/c/ChessRatings2
(I didn't check if datasets are still available...)
- noctiferus
- Posts: 122
- Joined: Thu Jun 10, 2010 7:57 am
- Location: Ivrea (To), Italy
Re: chess rating algorithm that performs better than ELO sys
Of course, for a correction to the Elo system, an evaluation of the modified rating, and a huge rating exercise covering old and current players (interesting!), you can also look at Sonas' site:
www.chessmetrics.com
- hyatt
- Posts: 1242
- Joined: Thu Jun 10, 2010 2:13 am
- Real Name: Bob Hyatt (Robert M. Hyatt)
- Location: University of Alabama at Birmingham
Re: chess rating algorithm that performs better than ELO sys
The Elo system for humans has worked just fine for years. The problem is that you want relatively slow change after some time, because a human's real skill does not change, although his health and mental acuity vary day by day. Computers are a different kind of player entirely. Their strength is high compared to humans. Their consistency is off the charts compared to humans: they don't get tired, sick, irritable, distracted, hungry, bored, etc. They can play 100 games in a row, non-stop, and play just as strongly in the last game as they did in the first. No human can do that. So at least the smoothing component of the Elo system is not as well tuned for computers as it could be. But then again, I doubt _any_ rating system will fit humans _and_ computers perfectly. It seems almost impossible.
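A minimal sketch of how the rate of change enters the standard Elo update, reading the "smoothing component" as the K-factor (that reading is an assumption, as are the K values of 10 and 32, the 1500 starting rating, and the helper names). It is meant only to show that a smaller K makes a rating respond more slowly to the same run of results.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B (logistic curve, scale 400)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def update(rating, opponent, score, k):
    """One Elo update: move the rating by K times (actual score minus expected score)."""
    return rating + k * (score - expected_score(rating, opponent))

def run(k, results, start=1500.0, opponent=1500.0):
    """Apply a sequence of results against a fixed-rating opponent."""
    rating = start
    for score in results:
        rating = update(rating, opponent, score, k)
    return rating

ten_wins = [1.0] * 10
slow = run(10, ten_wins)   # small K: heavy smoothing, slow response
fast = run(32, ten_wins)   # large K: light smoothing, fast response
print(round(slow, 1), round(fast, 1))
```

A player whose strength is perfectly stable from game to game, as described for engines above, could in principle be rated with different smoothing than a human whose form fluctuates daily.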
- noctiferus
- Posts: 122
- Joined: Thu Jun 10, 2010 7:57 am
- Location: Ivrea (To), Italy
Re: chess rating algorithm that performs better than ELO sys
Bob, I fully agree about the differences between human and computer ratings (my opinion isn't really relevant, of course).
As far as human performance is concerned, however, it looks like Sonas' method has a less inertial response without being too prone to short-term variations. IMHO, it is a good compromise between a nervous evaluation and a rock-solid one.
Of course, the engine scene is quite different. Maybe it could be used as a fairly loose trend evaluation for predicting engine strength, but one should expect a lot of jumps, anomalous predictions, etc.