Game of draughts, expected value of first move advantage, part 2

Solution 1:

The technique we used are the same as the first part - using backward recursion, and the recursion is due to the first step analysis - law of total probability.

Note that the definition of the $E_n$ is the expectation for the first player (gain) when $n$ more games to be played.

WLOG let's call be the first player of the game, when there is still $n$ games to go, to be player $A$.

Denote $$ R_i = \begin{cases} 1 & \text{ if } A \text{ is the first player} \\ 2 & \text{ if } A \text{ is the second player} \end{cases}, i = n, n-1, \ldots, 1$$

Define the payoff of $A$ when there are $i$ games to go to be $$ X_i = \begin{cases} 1 & \text{ if } A \text{ wins} \\ -1 & \text{ if } A \text{ loses} \end{cases}, i = n, n-1, \ldots, 1 $$

such that $$\Pr\{X_i = 1|R_i = 1\} = \frac {\mu} {1 + \mu}, \Pr\{X_i = 1|R_i = -1\} = \frac {1} {1 + \mu}$$ following the definition of odds.

Define the gain of player $A$ to be the sum of those payoffs of $i$ games not played yet:

$$ G_i = \sum_{k=1}^i X_k, i = n, n - 1, \ldots, 1$$

and in such case $E_n = E[G_n | R_n = 1]$

Therefore when they played all games and no more games are played, the expected gain $E_0 = 0$, as no more games to be played. The final wealth of the two players are fixed already. This is the boundary conditions of the recursion.

Using the first step analysis - law of total probability,

$$ \begin{align} E_n &= E[G_n|R_n = 1] \\ &= E[G_n|R_n = 1, X_n = 1]\Pr\{X_n = 1\} + E[G_n|R_n = 1, X_n = -1]\Pr\{X_n = -1\} \\ &= E[G_{n-1}+1|R_{n-1} = 1] \frac {\mu} {\mu + 1} + E[G_{n-1}-1|R_{n-1} = 2] \frac {1} {\mu + 1} \\ &= (E_{n-1} + 1)\frac {\mu} {\mu + 1} + (-E_{n-1}-1) \frac {1} {\mu + 1} \\ &= \frac {\mu - 1} {\mu + 1}(E_{n-1} + 1)\end{align} $$ Here the second line follows from the law of total probability; The third line will follows from $G_n = X_n + G_{n-1}$ and note that the distribution of $G_{n-1}$ solely depends on the playing order of $R_{n-1}$ only (Markov property), which carry the same information as the previous game payoff $X_{n-1}$.

When we play this game in long run, those $E_n$ converge and if we pass the limit on both sides, we obtain the equation of the long run gain as in the first part of the answer.

Due to the Markov property, the long run proportion of time that a certain player plays first converge for both players, regardless of the initial playing order. The long run overall gain $E$ does not further increase indefinitely but converge to the answer of the first part.