During the World Chess Championship I developed two prediction models for the outcome, both of which performed very well. Since then I have been working on adapting the approach from a match setting to a tournament setting, and the two differ in important ways. A tournament involves more players, and to win it you have to take more risks in each individual game, which produces more decisive results (wins and losses) than a match.
The basic setup is still mostly the same as for the championship prediction. I assign win, draw and loss probabilities to each game and use a random number generator to simulate the outcome of each of the 28 games. The resulting tournament outcome is recorded, and the simulation is repeated 10,000 times to get a distribution of tournament outcomes.
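The loop described above can be sketched roughly as follows. This is a minimal illustration, not the code used for the post: the function names, the placeholder probabilities and the color assignment are all assumptions made for the example. The real model uses a different probability triple for every game and the colors from the actual draw.

```python
import random
from collections import Counter
from itertools import combinations

PLAYERS = ["Carlsen", "Caruana", "Anand", "Aronian",
           "Adams", "Bacrot", "Naiditsch", "Baramidze"]


def placeholder_probs(white, black):
    """Hypothetical stand-in: the same (white win, draw, black win) for every game."""
    return (0.30, 0.50, 0.20)


def simulate_tournament(pairings, probs_for, rng):
    """Play every game once and return each player's total score."""
    scores = {p: 0.0 for p in PLAYERS}
    for white, black in pairings:
        p_white, p_draw, _ = probs_for(white, black)
        u = rng.random()  # uniform number in [0, 1)
        if u < p_white:
            scores[white] += 1.0
        elif u < p_white + p_draw:
            scores[white] += 0.5
            scores[black] += 0.5
        else:
            scores[black] += 1.0
    return scores


def run(n_sims=10_000, seed=1):
    """Repeat the tournament n_sims times and count who finishes first."""
    rng = random.Random(seed)
    # Single round robin: every pair meets once, so 8 * 7 / 2 = 28 games.
    # Colors simply follow list order here, which is NOT the real schedule.
    pairings = list(combinations(PLAYERS, 2))
    outcomes = Counter()
    for _ in range(n_sims):
        scores = simulate_tournament(pairings, placeholder_probs, rng)
        best = max(scores.values())
        leaders = [p for p, s in scores.items() if s == best]
        label = leaders[0] if len(leaders) == 1 else f"{len(leaders)}-player tie"
        outcomes[label] += 1
    return pairings, outcomes


if __name__ == "__main__":
    _, outcomes = run()
    for label, count in outcomes.most_common():
        print(f"{label}: {count / 10_000:.1%}")
```

Recording the outcome as either a sole winner or an n-player tie keeps the tally close to how the results are reported further down.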
The main difference, and the main challenge, lies in how to assign probabilities to each individual game. How do you adjust for playing with white or black, and how much emphasis do you place on ratings? For the World Championship I used historical match data to assign probabilities of winning, drawing and losing with white, but that is harder to do in a tournament setting. As stated, matches and tournaments are fundamentally different.
Instead, I tried to personalize each parameter by looking at how each player has performed with white and with black. To do this, I downloaded all of each player’s games listed on 365chess.com and analyzed each player individually. I included games played since January 2011 and excluded blindfold, rapid and blitz games. This gave me an idea of how each player performs with each color. Here are the statistics for each player in the GRENKE Chess Classic while playing as white, sorted by rating:
White Player | Win | Draw | Loss |
Carlsen | 49.3% | 41.2% | 9.6% |
Caruana | 49.6% | 38.8% | 11.6% |
Anand | 27.9% | 59.5% | 12.6% |
Aronian | 38.0% | 51.8% | 10.2% |
Adams | 41.3% | 50.4% | 8.3% |
Bacrot | 45.9% | 44.0% | 10.0% |
Naiditsch | 54.8% | 30.5% | 14.8% |
Baramidze | 39.2% | 46.8% | 13.9% |
And while playing with black:
Black Player | Win | Draw | Loss |
Carlsen | 30.2% | 63.6% | 6.2% |
Caruana | 27.8% | 55.1% | 17.2% |
Anand | 21.1% | 67.0% | 11.9% |
Aronian | 20.5% | 62.0% | 17.5% |
Adams | 22.5% | 62.5% | 15.0% |
Bacrot | 27.1% | 55.6% | 17.3% |
Naiditsch | 39.5% | 37.8% | 22.7% |
Baramidze | 23.3% | 54.4% | 22.2% |
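For readers who want to build similar tables, per-color percentages like the ones above could be derived from a list of game results along these lines. This is only a sketch: the record format (white player, black player, PGN-style result string), the function name `color_record` and the sample games are assumptions made for illustration, not the 365chess.com data or the analysis actually used.

```python
from collections import Counter

# PGN-style result strings, seen from each color's point of view.
AS_WHITE = {"1-0": "win", "1/2-1/2": "draw", "0-1": "loss"}
AS_BLACK = {"1-0": "loss", "1/2-1/2": "draw", "0-1": "win"}


def color_record(games, player, color):
    """Return (win, draw, loss) shares for `player` with the given color."""
    tally = Counter()
    for white, black, result in games:
        if color == "white" and white == player:
            tally[AS_WHITE[result]] += 1
        elif color == "black" and black == player:
            tally[AS_BLACK[result]] += 1
    total = sum(tally.values())
    if total == 0:
        raise ValueError(f"no games found for {player} with {color}")
    return tuple(tally[k] / total for k in ("win", "draw", "loss"))


# Made-up sample records, only to show the shape of the input.
sample_games = [
    ("Player A", "Player B", "1-0"),
    ("Player A", "Player C", "1/2-1/2"),
    ("Player A", "Player B", "1-0"),
    ("Player C", "Player A", "0-1"),
    ("Player A", "Player C", "0-1"),
]

white_record = color_record(sample_games, "Player A", "white")
black_record = color_record(sample_games, "Player A", "black")
print(white_record, black_record)
```

Filtering by date and by game type (excluding blindfold, rapid and blitz) would happen before this step, when the list of games is assembled.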
The tables show some individual differences in playing style. Magnus Carlsen hardly ever loses a game, while Naiditsch draws only about one third of the time. Anand, on the other hand, draws a lot (in part because he has played many championship matches, where draws are more common). So what happens when Anand and Naiditsch play each other? On the one hand we expect Anand to draw, while on the other we expect Naiditsch not to draw. And then there is the question of rating differences. On January 30th, when I set up the basics of the model, the players were rated as follows:
Player | Rating |
Carlsen | 2865 |
Caruana | 2811 |
Anand | 2797 |
Aronian | 2777 |
Adams | 2738 |
Bacrot | 2711 |
Naiditsch | 2706 |
Baramidze | 2594 |
As with the World Championship models, I used chess-db.com to find the probability of each result in a game based on the players’ ratings.
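As a rough cross-check on rating-based numbers, the standard Elo expected-score formula can be computed directly. To be clear, this is not necessarily how chess-db.com arrives at its figures, and the formula only gives an expected score (win probability plus half the draw probability). It says nothing about how that score splits into wins and draws, which still has to come from another source.

```python
def elo_expected_score(rating, opponent_rating):
    """Standard Elo expected score: P(win) + 0.5 * P(draw)."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400))


# Ratings from the table above (January 30th).
carlsen, aronian = 2865, 2777

e_carlsen = elo_expected_score(carlsen, aronian)
e_aronian = elo_expected_score(aronian, carlsen)
print(f"Carlsen: {e_carlsen:.3f}, Aronian: {e_aronian:.3f}")
```

For Carlsen against Aronian this comes out at roughly 0.62. The rating row in the worked example further down implies about the same: 0.384 + 0.471 / 2 ≈ 0.62.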
My approach to assigning probabilities to each individual game was then simply to average three estimates of each outcome: one from the white player’s record with white, one from the black player’s record with black, and one from the rating-based probability. As an example, here is the calculation for Magnus Carlsen’s first game against Levon Aronian:
Source | White win | Draw | Black win |
Aronian white | 38.0% | 51.8% | 10.2% |
Carlsen black | 6.2% | 63.6% | 30.2% |
Rating | 14.4% | 47.1% | 38.4% |
Average | 19.5% | 54.2% | 26.3% |
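The averaging in the table can be reproduced in a few lines. The variable and function names below are made up for this sketch. Note that Carlsen's record with black has to be reordered so that all three rows read white win, draw, black win.

```python
# Each tuple is (white win, draw, black win) in percent, taken from the table.
aronian_white = (38.0, 51.8, 10.2)
carlsen_black = (6.2, 63.6, 30.2)   # reordered: Carlsen loss, draw, Carlsen win
rating_based = (14.4, 47.1, 38.4)


def average_probs(*sources):
    """Unweighted average of each outcome across all sources."""
    n = len(sources)
    return tuple(sum(column) / n for column in zip(*sources))


average = average_probs(aronian_white, carlsen_black, rating_based)
print([round(x, 1) for x in average])  # [19.5, 54.2, 26.3]
```

Because the rating row sums to 99.9% after rounding, the unrounded averages add up to slightly less than 100%. Dividing each value by their total would remove that small gap if exact normalization matters.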
To simulate the outcome of the game, the random number generator draws a number between 0 and 1. If that number is less than 0.195, Aronian scores a point; if it is between 0.195 and 0.737 (0.195 + 0.542), each player gets 0.5 points; and if it is greater than 0.737, Carlsen scores a point.
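The threshold rule for a single game can be written out like this. It is an illustrative sketch with hypothetical function names. Splitting the mapping (`outcome_for`) from the random draw (`simulate_game`) makes the cut-offs easy to check by hand.

```python
import random


def outcome_for(u, p_white_win, p_draw):
    """Map a number u in [0, 1) to (white points, black points)."""
    if u < p_white_win:
        return (1.0, 0.0)          # white wins
    if u < p_white_win + p_draw:
        return (0.5, 0.5)          # draw
    return (0.0, 1.0)              # black wins


def simulate_game(p_white_win, p_draw, rng):
    """Draw one uniform random number and convert it to a result."""
    return outcome_for(rng.random(), p_white_win, p_draw)


# Aronian (white) vs Carlsen (black), using the averaged probabilities.
rng = random.Random(2015)  # arbitrary seed, only for repeatability
results = [simulate_game(0.195, 0.542, rng) for _ in range(10_000)]

white_share = sum(r == (1.0, 0.0) for r in results) / len(results)
draw_share = sum(r == (0.5, 0.5) for r in results) / len(results)
print(f"white wins: {white_share:.3f}, draws: {draw_share:.3f}")
```

Over many repetitions the simulated shares should settle close to 0.195 and 0.542, which is a quick sanity check that the thresholds are set correctly.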
Having done that for all 28 games, 10,000 times, the most likely outcome of the tournament is a 2-player tie-break, followed by a sole win for Magnus Carlsen and then a sole win for Fabiano Caruana. Here is the distribution of likely outcomes:
In most of the pre-draw test simulations, Magnus Carlsen as the sole winner was the most likely outcome. One reason this changed in the actual simulation is a pretty bad draw for him. Not only does he have to play 4 games with the black pieces, but he has to play black against Caruana, Anand and Aronian, the three highest-rated players apart from himself. A more in-depth analysis of the outcomes will follow. Let the games begin!
Edit: Analysis of the simulations of the model: