A very challenging probability question

In a certain 2-player game, the winner is determined by rolling a single 6-sided die in turn, until a 6 is shown, at which point the game ends immediately. Now, suppose that k dice are now rolled simultaneously by each player on his turn, and the first player to obtain a total of k (or more) 6’s, accumulated over all his throws, wins the game. (For example, if k = 3, then player 1 will throw 3 dice, and keep track of any 6’s that show up. If player 1 did not get all 6’s then player 2 will do the same. Assuming that player 1 gets another turn, he will again throw 3 dice, and any 6’s that show up will be added to his previous total). Compute the expected number of turns that will be needed to complete the game.

I've analysed this problem as follows: The problem can be modeled by a negative binomial distribution with probability $p=\frac{1}{6}$. Now, X is a random variable representing the number of dice being thrown. I need to find the cdf $Pr[X\geq k]$, and then find the expectation as follows $E[X] = \int_0^\infty kPr[X\geq k]$. The problem here is that the cdf of a negative binomal distribution is a regularized beta function and this is quite messy to deal with. I'm wondering is there another way to approach this problem that wouldn't involve that?


Note: The initial answer here was incorrect. Thanks to Markus, who solved the problem in a different way and got different results from my first ones, I found the mistake and rewrote the answer. (If there's any reason for me to repost the original, wrong answer somewhere, let me know.)


You can set up a recurrence to solve if you consider intermediate stages in the game and consider the expected number of moves for the game to end from there.

Let $E(a,b,k)$ be the expected number of turns the game takes to finish if player whose turn it is needs $a$ more sixes to win, the other player needs $b$ more, and $k$ dice are rolled each turn.

If $a\le0$ or $b\le0$, the game is already over, and $E(a,b,k)=0$.

Now consider the possible outcomes of the next two rolls. Let the number of $6$s in those rolls be $i$ and $j$, respectively.

The particular combination $(i,j)$ happens with probability ${\tbinom{k}{i}}{\tbinom{k}{j}}\left(\frac{5}{6}\right)^{2k-i-j}\left(\frac{1}{6}\right)^{i+j}$. The expected game length of a given case depends on whether or not $i\ge a$: If $i\ge a$, the game is over after the first of the two turns, and the (expected) game length is $1$. Otherwise, it's $2+E(a-i,b-j,k)$.

This leads to the following formula:

$$\begin{align} E(a,b,k)&= \sum_{i=0}^{a-1}\sum_{j=0}^k{\tbinom{k}{i}}{\tbinom{k}{j}}\left(\tfrac{5}{6}\right)^{2k-i-j}\left(\tfrac{1}{6}\right)^{i+j}\left(2+E(a-i,b-j,k)\right)\\ &+\sum_{i=a}^{k}{\tbinom{k}{i}}\left(\tfrac{5}{6}\right)^{k-i}\left(\tfrac{1}{6}\right)^{i},\\ \end{align}\\ \mbox{where here and below } E(a,b,k)=0 \mbox{ if $a\le0$ or $b\le0$.} $$

To get a useful recursive formula, first pull the $i=0$, $j=0$ term out of the sum.

$$ \begin{align}E(a,b,k) &= \left(\tfrac{5}{6}\right)^{2k}(2+E(a,b,k))\\ &+ \sum_{i={\color{red}{1}}}^{a-1}\sum_{j={\color{red}{0}}}^{{\color{red}{0}}}{\tbinom{k}{i}}{\tbinom{k}{j}}\left(\tfrac{5}{6}\right)^{2k-i-j}\left(\tfrac{1}{6}\right)^{i+j}\left(2+E(a-i,b-j,k)\right)\\ &+ \sum_{i=0}^{a-1}\sum_{j={\color{red}{1}}}^{k}{\tbinom{k}{i}}{\tbinom{k}{j}}\left(\tfrac{5}{6}\right)^{2k-i-j}\left(\tfrac{1}{6}\right)^{i+j}\left(2+E(a-i,b-j,k)\right)\\ &+\sum_{i=a}^{k}{\tbinom{k}{i}}\left(\tfrac{5}{6}\right)^{k-i}\left(\tfrac{1}{6}\right)^{i}\\ \end{align}$$

Then solve for $E(a,b,k)$. $$\begin{align} \left({1 - \left(\tfrac{5}{6}\right)^{2k}}\right)E(k,k,k) &= 2\left(\tfrac{5}{6}\right)^{2k}\\ &+ \sum_{i=1}^{a-1}{\tbinom{k}{i}}\left(\tfrac{5}{6}\right)^{2k-i}\left(\tfrac{1}{6}\right)^{i}\left(2+E(a-i,b,k)\right)\\ &+ \sum_{i=0}^{a-1}\sum_{j=1}^{k}{\tbinom{k}{i}}{\tbinom{k}{j}}\left(\tfrac{5}{6}\right)^{2k-i-j}\left(\tfrac{1}{6}\right)^{i+j}\left(2+E(a-i,b-j,k)\right)\\ &+\sum_{i=a}^{k}{\tbinom{k}{i}}\left(\tfrac{5}{6}\right)^{k-i}\left(\tfrac{1}{6}\right)^{i}\\ \end{align} $$

Using this to evaluate $E(k,k,k)$ in Mathematica (assuming no typos in my transcription from what I used) gives the same results as Markus put in his answer. Here's the beginning of the sequence $E(k,k,k)$, starting at $k=1$.

\begin{array}{ccc} \it{k} & \it{E(k,k,k)} \\ \hline \\ 1 & 6.00000000000 \\ 2 & 7.84416992031 \\ 3 & 8.69584550140 \\ 4 & 9.20010963516 \\ 5 & 9.54353874272 \\ 6 & 9.79634774936 \\ 7 & 9.99197435248\\ 8 & 10.1488645407\\ 9 & 10.2781431548\\ \cdots\\ 30 & 11.2061391171\\ \cdots\\ \end{array}

These values should be correct to all decimal places, since there's no approximation going on other than the final conversion of a fraction to decimal.


Note: This question seemed to be a nice challenge for me. Here I provide an explicit formula, which is admittedly not very attractive, because it is rather complicated. In fact, I really appreciated the elegant approach of Steve and I still do so. But, my calculations using the explicit formula show different values for $k>1$ and the things became interesting for me. I verified with a small program that the stated figures for $E(k,k,k)$ from Steve were consistent with his recursion formula. But nevertheless my figures were different from those. I wrote small programs calculating first and last expressions of my calculations to verify, that the transformations were correct. Then I did an alternative verification, wrote some R code to produce random samples and simulated the rolls with $1$ to $6$ dice. The pleasant news for me was, that the simulation based upon random samples confirmed convincingly the figures calculated with my explicit formula. So, for me one open question is still left, namely what is the reason for the differing results between the explicit formula and the recursion formula?

@Steve, I would really appreciate, if you could review both answers and help to clarify it.

Added 2014-03-31: Note: Recursion corrected! In the meanwhile Steve found a clever correction of his answer and now both, recursive and direct solution show consistent results.

Now, here's a proof resulting in an explicit formula for the expectation: Let $A^{(k)}_{N,j}$ denote the number of ways player $A$ can get $j(\ge0)$ times a $6$ with $k$ dice in $N$ rolls. Let $A^{(k)}(x,y)=(5x+y)^{Nk}$ be the corresponding generating function, with the exponent of $y$ giving the number of $6$ after $N$ rolls with $k$ dice and the coefficient giving the number of different possibilities of this event. The coefficient of $x$ marks the number of possibilities of all other rolled pips with $k$ dice. Let the coefficient operator $[y^j]$ denote the coefficient of $y^j$ in $A^{(k)}(x,y)$. So,

\begin{align} A^{(k)}_{N,j}=[y^j](5x+y)^{Nk}=[y^j]\sum_{l=0}^{Nk}\binom{Nk}{l}y^l(5x)^{Nk-l}=\binom{Nk}{j}5^{Nk-j} \end{align}

Similarly, let $A^{(k)}_{N,\leq j}$ denote the number of possibilities of player $A$ to get not more than $j$ $6$ with $k$ dice in $N$ rolls.

$$A^{(k)}_{N,\leq j}=\sum_{l=0}^{j}\binom{Nk}{l}5^{Nk-l}$$ We also have $A^{(k)}_{N,\geq j}=6^{Nk} - A^{(k)}_{N,< j}$, since $6^{Nk}$ is the number of all possible results after $N$ rolls with $k$ dice. For player $B$ we use the corresponding notation $B^{(k)}_{N,j}$, etc.

The resulting probabilities for these events are: \begin{align} p(A^{(k)}_{N,j})&=p(B^{(k)}_{N,j})=\left(\frac{5}{6}\right)^{Nk}\binom{Nk}{j}\frac{1}{5^j}\\ p(A^{(k)}_{N,\leq j})&=p(B^{(k)}_{N,\leq j})=\left(\frac{5}{6}\right)^{Nk}\sum_{l=0}^{j}\binom{Nk}{l}\frac{1}{5^l}\\ p(A^{(k)}_{N,\geq j})&=p(B^{(k)}_{N,\geq j})=1-p(A^{(k)}_{N,< j})\\ \end{align}

To calculate $E^{(k)}(X)$, the expectation when playing with $k$ dice, let $P(X=N)$ denote the probability, that player A or player B wins after a total of $N$ rolls. In order to do so, we consider two scenarios. Either $A$ has rolled $k$ (or more) $6$ with his $(N+1)$-st roll of $k$ dice, so that $A$ wins after a total of $2N+1$ rolls, or $B$ wins after $A$ has rolled the $k$ dice $N$ times, so that they have reached a total of $2N$ times. Symbolically

\begin{align} A\ wins:\ \ \ &(A B) (A B)\ldots (A B) (A B)A = (AB)^{N}A&(2N+1)\ rolls\\ B\ wins:\ \ \ &(A B) (A B)\ldots (A B) (A B) = (AB)^{N}&(2N)\ rolls \end{align}

So, the expectation $E^{(k)}(X)$ consists of two parts. The first sum corresponds to the scenario with player $A$ as winner of the game. This implies that foreach $N\ge0$ player $B$ has reached less than $k$ $6$ in $N$ rolls, while player $A$ has reached $j<k$ times a $6$ in $N$ rolls and with his $(N+1)$-st roll he has $k$ or more $6$. Similarly the second sum with $B$ as winner.

\begin{align} E^{(k)}(X)&=\sum_{N\ge0}NP(X=N)\\ &=\sum_{N\ge0}(2N+1)P(X=2N+1) + \sum_{N\ge0}(2N)P(X=2N)\\ &=\sum_{N\ge0}(2N+1)p(B_{N,<k})\sum_{j=0}^{k-1}p(A_{N,j})p(A_{1,\ge k-j})\\ &+\sum_{N\ge0}(2N)p(A_{N,<k})\sum_{j=0}^{k-1}p(B_{N-1,j})p(B_{1,\ge k-j}) \end{align}

Substituting for the probabilities the formulas stated above gives

\begin{align} E^{(k)}(X)&=\sum_{N\ge0}(2N+1) \left(\sum_{i=0}^{k-1}\binom{Nk}{i}\left(\frac{5}{6}\right)^{Nk}\frac{1}{5^i}\right)\\ &\qquad\qquad\cdot\sum_{j=0}^{k-1}\left(\binom{Nk}{j}\left(\frac{5}{6}\right)^{Nk}\frac{1}{5^j} \sum_{l=k-j}^{k}\binom{k}{l}\left(\frac{5}{6}\right)^{k}\frac{1}{5^l}\right)\\ &+\sum_{N\ge0}(2N) \left(\sum_{i=0}^{k-1}\binom{Nk}{i}\left(\frac{5}{6}\right)^{Nk}\frac{1}{5^i}\right)\\ &\qquad\qquad\cdot\sum_{j=0}^{k-1}\left(\binom{(N-1)k}{j}\left(\frac{5}{6}\right)^{(N-1)k}\frac{1}{5^j} \sum_{l=k-j}^{k}\binom{k}{l}\left(\frac{5}{6}\right)^{k}\frac{1}{5^l}\right)\\ &=\sum_{N\ge0}(2N+1) \left(\frac{5}{6}\right)^{(2N+1)k}\sum_{i=0}^{k-1}\binom{Nk}{i}\frac{1}{5^i} \sum_{j=0}^{k-1}\binom{Nk}{j}\frac{1}{5^j} \sum_{l=k-j}^{k}\binom{k}{l}\frac{1}{5^l}\\ &+\sum_{N\ge0}(2N) \left(\frac{5}{6}\right)^{2Nk}\sum_{i=0}^{k-1}\binom{Nk}{i}\frac{1}{5^i} \sum_{j=0}^{k-1}\binom{(N-1)k}{j}\frac{1}{5^j} \sum_{l=k-j}^{k}\binom{k}{l}\frac{1}{5^l}\\ \end{align}

After a few simplifications the resulting value for $E^{(k)}(x)$ is \begin{align} E^{(k)}(X)&=\frac{1}{6^k}\sum_{N\ge0}(2N+1) \left(\frac{5}{6}\right)^{2Nk}\sum_{i=0}^{k-1}\binom{Nk}{i}\frac{1}{5^i} \sum_{j=0}^{k-1}\binom{Nk}{j} \sum_{l=0}^{j}\binom{k}{j-l}\frac{1}{5^l}\\ &+\frac{1}{6^k}\sum_{N\ge0}(2N) \left(\frac{5}{6}\right)^{(2N-1)k}\sum_{i=0}^{k-1}\binom{Nk}{i}\frac{1}{5^i} \sum_{j=0}^{k-1}\binom{(N-1)k}{j} \sum_{l=0}^{j}\binom{k}{j-l}\frac{1}{5^l} \end{align}

I've calculated the expectations for $k=1\ldots 6$ with a small program. The computed values approximating $E^{(k)}(x)$ converge quickly with increasing $N$. The table below lists the resulting figures with $7$ significant digits.

Expectation $E^{(k)}(X)$: The value in column $N$ indicates the smallest value, where the significant digits did no longer change by increasing $N$. $$ \begin{array}{ccc} \it{k} & \it{E^{(k)}(X)} & \it{N}\\ \hline \\ 1 & 6.000 000 & 57 \\ 2 & 7.844 169 & 32 \\ 3 & 8.695 845 & 26 \\ 4 & 9.200 109 & 22 \\ 5 & 9.543 538 & 19 \\ 6 & 9.796 347 & 18 \\ \end{array} $$

In the following I give explicit values for $k=1$ and $2$. With the help of generating functions and $D$ as (formal) differential operator we have for $k=1$:

\begin{align} E^{(1)}(x)&=\frac{1}{6}\sum_{N\ge0}(2N+1)\left(\frac{5}{6}\right)^{2N} + \frac{1}{6}\sum_{N\ge0}(2N)\left(\frac{5}{6}\right)^{2N-1}\\ &=\frac{1}{6}\left.\left(D\frac{1}{1-x}\right)\right\rvert_{x=\frac{5}{6}}=\frac{1}{6}\left.\frac{1}{(1-x)^2}\right\rvert_{x=\frac{5}{6}}\\ &=6 \end{align}

Calculation for $k=2$ gives:

\begin{align} E^{(2)}(x)&=\frac{1}{6^2}\sum_{N\ge0}(2N+1)\left(\frac{5}{6}\right)^{4N} \sum_{i=0}^{1}\binom{2N}{i}\frac{1}{5^i} \sum_{j=0}^{1}\binom{2N}{j}\sum_{l=0}{j}\binom{2}{j-l}\frac{1}{5^l}\\ &+\frac{1}{6^2}\sum_{N\ge0}(2N)\left(\frac{5}{6}\right)^{4N-2} \sum_{i=0}^{1}\binom{2N}{i}\frac{1}{5^i} \sum_{j=0}^{1}\binom{2N-2}{j}\sum_{l=0}{j}\binom{2}{j-l}\frac{1}{5^l}\\ &=\frac{1}{5^26^2}\sum_{N\ge0}(2N+1)(2N+5)(22N+5)\left(\frac{5}{6}\right)^{4N}\\ &+\frac{1}{5^26^2}\sum_{N\ge0}(2N)(2N+5)(22N-17)\left(\frac{5}{6}\right)^{4N-2}\\ &=\frac{1}{5^46^2}\sum_{N\ge0}(5368N^3+12572N^2-1870N+625)\left(\frac{5}{6}\right)^{4N}\\ \end{align}

We use the following well known relations:

\begin{align} \sum_{N\ge0}Nx^N&=(xD)\frac{1}{1-x}=\frac{x}{(1-x)^2}\\ \sum_{N\ge0}N^2x^N&=(xD)^2\frac{1}{1-x}=\frac{x(x+1)}{(1-x)^3}\\ \sum_{N\ge0}N^3x^N&=(xD)^3\frac{1}{1-x}=\frac{x(x^2+4x+1)}{(1-x)^4}\\ \end{align}

to get: \begin{align} E^{(2)}(x)&=\frac{1}{5^46^2}\left.\left(5368\frac{x(x^2+4x+1)}{(1-x)^4} +12572\frac{x(x+1)}{(1-x)^3}\right.\right.\\ &\qquad\qquad\qquad\left.\left.-1870\frac{x}{(1-x)^2} +625\frac{1}{1-x}\right)\right|_{x=\left(\frac{5}{6}\right)^4}\\ &=\frac{1}{5^46^2}\left.\frac{-9699x^3+27087x^2+14195x+625}{(1-x)^4}\right|_{x=\left(\frac{5}{6}\right)^4}\\ &=\frac{636876}{81191}\approx7.844169 \end{align}

Summarised the expectation for $k=1,2$: \begin{align} E^{(1)}(x)&=6\\ E^{(2)}(x)&=\frac{636876}{81191}\approx7.844169 \end{align}

The formulas for $k>2$ can be calculated in a similar manner, but note that the degree of $N$ increases by $2$ whenever $k$ is incremented by $1$. So, manual calculation could become really cumbersome. :-)