The expected outcome of a random game of chess?

Imagine a game of chess where both players generate a list of legal moves and pick one uniformly at random.

Q: What is the expected outcome for white?

  • 1 point for black checkmated, 0.5 for a draw, 0 for white checkmated. So the expected outcome is given by $$\mathrm{Pr}[\text{black checkmated}]+0.5\ \mathrm{Pr}[\text{draw}].$$
  • Neither player resigns, nor are there any draw offers or claims.

As a chess player, I'm curious if white (who plays first) has some advantage here.


I'm not expecting an exact answer to be possible. Partial results (e.g. that the expectation is >0.5) and experimental results are welcome. (The expectation is not 0 or 1, since there are possible games where white does not win and where black does not win.)

I'm guessing this has been looked at before, so I decided to ask first (rather than implement a chess engine that makes random moves and hope to find something other than "draw, draw, draw, draw, ..."). Searching for "random game of chess" lists Chess960 and other randomized variants, which is not what I want.


Technicalities:

  • En passant capturing, castling, pawn promotion, etc. all apply as usual.

  • The FIDE Laws of Chess will be updated 1 July 2014 with the following:

    9.6 If one or both of the following occur(s) then the game is drawn:

    • a. the same position has appeared, as in 9.2b, for at least five consecutive alternate moves by each player.

    • b. any consecutive series of 75 moves have been completed by each player without the movement of any pawn and without any capture. If the last move resulted in checkmate, that shall take precedence.

    This means that games of chess must be finite, and thus there is a finite number of possible games of chess.


I found a bug in the code given in Hooked's answer (which means that my original reanalysis was also flawed): one also have to check for insufficient material when assessing a draw, i.e.

int(board.is_stalemate())

should be replaced with

int(board.is_insufficient_material() or board.is_stalemate())

This changes things quite a bit. The probabillity of a draw goes up quite a bit. So far with $n = 5\cdot 10^5$ samples I find

$$E[\text{Black}] \approx 0.5002$$ $$E[\text{White}] \approx 0.4998$$ $$P[\text{Draw}] \approx 84.4\%$$

A simple hypotesis test shows that with $P(\text{white})=P(\text{black})=0.078,~P(\text{draw})=0.844$ and $N=5\cdot 10^5$ samples the probabillity to get $|E[\text{Black}] - 0.5| > 0.002$ is $25\%$ so our results are perfectly consistent with $E[\text{White}]=E[\text{Black}]=0.5$. The "hump" remains, but is now easily explained: it is due to black or white winning. Either they win early or the game goes to a draw.

enter image description here
(source: folk.uio.no)

Here is one of the shortest game I found, stupid black getting matted in four moves:

enter image description here
(source: folk.uio.no)


Update: The code below has a small, but significant oversight. I was unaware that a stalemate would not be counted the same way as a board with insufficient pieces to play and this changes the answer. @Winther has fixed the bug and reran the simulations. That said, there is still value to the code being posted so I'll leave it up for anyone else to repeat the experiments (and find more bugs!).


Slightly rephrasing your question,

Is the expected outcome for EX[white] = 1/2 in a random game?

To test this, I simulated 10^5 games using the library python-chess. The code is posted below for those wishing to repeat the numerical experiment (this takes about 4 hours on an 8-core machine). In the 100000 games, 46123 came up as wins for white and 6867 games were ties. This puts the expected value of the game at

$$ \text{EX}[white] = 0.495565 $$

Using the 2-sided normal approximation to the binomial test of a fair game, we get a p-value of 0.00511. Therefore, we can reject the null-hypothesis that the game is fair. This was surprising to me.

In other words, $\text{EX}[white]<1/2$ looks to be statistically significant, however the advantage for black is very small.

Personally, the more interesting question is the distribution of game length, hence a plot of it is included below.

import chess, random, itertools, multiprocessing
simulations = 10**5

def random_move(board):
    return random.choice(list(board.legal_moves))

def play(game_n):
    board = chess.Bitboard()
    ply = 0
    while not board.is_game_over():
        board.push( random_move(board) )
        ply += 1

    # board.turn == 0 -> White, == 1 -> Black
    return game_n, int(board.is_stalemate()), board.turn, ply

P = multiprocessing.Pool()
results = P.imap(play,xrange(simulations))

with open("results.txt",'w') as FOUT:
    for game in results:
        s = "{} {} {} {}\n".format(*game)
        FOUT.write(s)

enter image description here

There is much to be mined out of this dataset, but I am not a chess-aficionado. I'm not sure why the distribution contains two "humps", comments are welcome.