Dice problem - expected number of throws until the sum is a multiple of $n$

A die (with six sides) is rolled repeatedly and summed. Show that the expected number of rolls until the sum is a multiple of $n$ is $n$.


In http://www.madandmoonly.com/doctormatt/mathematics/dice1.pdf (problem 12, page 15) a proof is given, but it assumes that for every target value $n > 6$ the probability that the running sum ever hits $n$ is uniformly equal to $\frac{2}{7}$. That value is only the limit as $n \to \infty$, not an exact probability, so it is a false assumption that should not be used in a rigorous proof.
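
To see concretely why that assumption fails, here is a small sketch (my own check, not taken from the linked note) that computes the exact probability that the running sum of a fair die ever equals a given value $n$, using the standard first-step recurrence:

```python
from fractions import Fraction

def hit_probability(n, sides=6):
    """Exact probability that the running sum of a fair `sides`-sided die
    ever equals n, via first-step analysis:
    p(m) = (1/sides) * sum_{k=1}^{sides} p(m - k), with p(0) = 1 and p(m) = 0 for m < 0."""
    p = [Fraction(0)] * (n + 1)
    p[0] = Fraction(1)
    for m in range(1, n + 1):
        p[m] = sum(p[m - k] for k in range(1, sides + 1) if m - k >= 0) / sides
    return p[n]

for n in range(7, 13):
    val = hit_probability(n)
    print(n, val, float(val))
# These values hover around, but never equal, 2/7 = 0.2857...;
# 2/7 is only the limiting value as n grows (the denominators are powers of 6,
# so the exact probabilities can never be 2/7).
```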


If $X_m$ is the sum after $m$ rolls, then the process $Y_m = X_m \bmod n$, taking values in $\{0,1,\dots,n-1\}$, is a random walk on the circle: think of the $n$ states as points arranged around a circle, with each roll advancing the walk by the value shown. Since every step adds an independent increment modulo $n$, the transition probabilities are invariant under rotation of the circle, so the unique invariant probability distribution $\pi$ is uniform over the $n$ states. Therefore the expected time for the walk to return to its starting point $0$ is $1/\pi(0)=n$. Since your sum began at $0$, this gives the result.
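
This is easy to sanity-check numerically. Here is a minimal simulation sketch (not part of the proof, and the function names are my own) for a fair six-sided die:

```python
import random

def rolls_until_multiple(n, sides=6, rng=random):
    """Roll a fair `sides`-sided die until the running sum is a multiple of n;
    return the number of rolls taken."""
    total, rolls = 0, 0
    while True:
        total += rng.randint(1, sides)
        rolls += 1
        if total % n == 0:
            return rolls

def estimate_expected_rolls(n, trials=100_000):
    """Monte Carlo estimate of E[number of rolls until the sum is a multiple of n]."""
    return sum(rolls_until_multiple(n) for _ in range(trials)) / trials

for n in (2, 5, 7, 13):
    print(n, estimate_expected_rolls(n))   # each estimate should be close to n
```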

By the way, this argument works for any (fixed) fair or loaded die with any number of sides. It could even be a one-sided "die", and the numbers on its faces could be positive, negative, and/or zero! I only insist that it is possible to get from any state to any other state. That is, the Markov chain is irreducible.
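
To illustrate the "any fixed die" claim, here is a small sketch using a made-up loaded die with faces $-1$, $0$ and $4$ (an arbitrary choice): it builds the transition matrix of $Y_m$ and checks that the uniform distribution is stationary. The check works because each column of the matrix sums to $1$, i.e. the matrix is doubly stochastic.

```python
from fractions import Fraction

def transition_matrix(n, face_probs):
    """Transition matrix of Y_m = X_m mod n for a die whose faces and
    probabilities are given by the dict face_probs = {face_value: probability}."""
    P = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for v, prob in face_probs.items():
            P[i][(i + v) % n] += prob
    return P

def uniform_is_stationary(P):
    """The uniform distribution is invariant iff every column sums to 1
    (rows already sum to 1, so this says the matrix is doubly stochastic)."""
    n = len(P)
    return all(sum(P[i][j] for i in range(n)) == 1 for j in range(n))

# Hypothetical loaded three-faced "die": faces -1, 0, 4 with probabilities 1/2, 1/5, 3/10.
faces = {-1: Fraction(1, 2), 0: Fraction(1, 5), 4: Fraction(3, 10)}
print(uniform_is_stationary(transition_matrix(9, faces)))   # True
```

For this example the chain on $\{0,1,\dots,8\}$ is irreducible (the $-1$ face alone lets the walk reach every state), so the return-time argument applies and the expected number of rolls until the sum is a multiple of $9$ is $9$.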

To prove such results fully, you need the theory of Markov chains on a finite state space, in particular the topic of "return times". Take a look at the book Markov Chains and Mixing Times by Levin, Peres, and Wilmer, which is freely available online; Proposition 1.14 (ii) on page 12 gives what you want. Section 1.4 of Introduction to Stochastic Processes (2nd edition) by Gregory F. Lawler also has a good explanation.
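
For reference, the fact being invoked there is (up to wording) the standard return-time identity: if $(Y_m)$ is an irreducible Markov chain on a finite state space with stationary distribution $\pi$, and $\tau_x^+ = \min\{m \ge 1 : Y_m = x\}$ is the first return time to a state $x$, then

$$\mathbb{E}_x\!\left[\tau_x^+\right] = \frac{1}{\pi(x)}.$$

Applied with $x = 0$ and $\pi(0) = 1/n$, it gives the expected value $n$ used above.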


Here is an intuitive picture. The Markov chain starts over whenever it hits the state $0$. Writing "$z$" when the chain is in state zero and "$y$" otherwise, a typical outcome looks like

$zyyyyyyyyyyyzyyzzyyyyyyyyyzyyyyzyyyyyyyyyyyyyyyyyyzy\dots$

where we have joined together a sequence of independent finite strings, like $zyyyyyyyyyyy$, of random length. Each finite string starts with one $z$ and has zero or more $y$s following.

The fact that $\pi(z)=1/n$ means that, over a very long run of rolls, the proportion of $z$s in the outcome is about $1/n$. For that to happen, the average length of those strings, i.e. the average spacing between consecutive $z$s, must be $n$.
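
If you want to see this picture numerically, here is a rough sketch (again for a fair six-sided die, with $n = 7$ chosen arbitrarily) that tracks both the long-run fraction of time spent at state $0$ and the average spacing between visits to $0$:

```python
import random

def long_run_stats(n, steps=1_000_000, sides=6, rng=random):
    """Run the walk Y_m = X_m mod n for `steps` rolls; return the fraction of
    time spent in state 0 and the average gap between successive visits to 0."""
    state, last_visit, gaps = 0, 0, []
    for t in range(1, steps + 1):
        state = (state + rng.randint(1, sides)) % n
        if state == 0:
            gaps.append(t - last_visit)
            last_visit = t
    return len(gaps) / steps, sum(gaps) / len(gaps)

fraction_at_zero, average_gap = long_run_stats(7)
print(fraction_at_zero)   # about 1/7: the proportion of z's in the outcome
print(average_gap)        # about 7: the average spacing between consecutive z's
```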