How to calculate the probability of rolling 6 at least 5 times in a row, out of 50 tries?

This is equivalent to counting the number of strings with length $50$ over the alphabet $\Sigma=\{1,2,3,4,5,6\}$ that avoid $66666$ as a substring of five consecutive characters. Let $S_n$ be the set of such strings with length $n$ and $L_n=|S_n|$. The prefix of an element in $S_n$ can be only: $$ x,\quad 6x,\quad 66x,\quad 666x,\quad 6666x$$ where $x$ differs from $6$. This gives the recursive formula: $$ L_n = 5(L_{n-1}+L_{n-2}+L_{n-3}+L_{n-4}+L_{n-5})\tag{1}$$ with the initial conditions: $$ L_0=1,\quad L_1=6,\quad L_2=36,\quad L_3=216,\quad L_4=1296,\quad L_5=7775.\tag{2}$$ So we have that: $$ L_n = A_1\sigma_{1}^n + A_2\sigma_2^n + A_3\sigma_3^n+A_4\sigma_4^n + A_5\sigma_5^n \tag{3}$$ where $\sigma_i$ is a root of the characteristic polynomial $f(x)=x^5-5(x^4+x^3+x^2+x+1)$ and the $A_i$s are constants given by the initial conditions. $f(x)$ has only one real root, very close to $6$, namely: $$\sigma_1 = 5.999356651043833111223\ldots $$ and all the other roots satisfy $|\sigma_i|<1$, hence our probability is close to: $$1-\left(\frac{\sigma_1}{6}\right)^{50}=0.0053471814\ldots\sim\frac{1}{187}.$$ An explicit computation of the coefficients in $(3)$ gives: $$ A_1 = 1.00040773044846\ldots,$$ $$ A_2 = A_3 = -0.006863339\ldots,$$ $$ A_4 = A_5 = 0.0066594738\ldots,$$ hence the true probability is:

$$ 0.0049416311686434\ldots\sim\frac{3}{607}.$$


To expand on Jack's solution, you can write a matrix equation

$$ \left[ \begin{matrix} L_n \\ L_{n-1} \\ L_{n-2} \\ L_{n-3} \\ L_{n-4} \end{matrix} \right] = \left[ \begin{matrix} 5 & 5 & 5 & 5 & 5 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ \end{matrix}\right] \left[ \begin{matrix} L_{n-1} \\ L_{n-2} \\ L_{n-3} \\ L_{n-4} \\ L_{n-5} \end{matrix} \right] $$

and thus

$$ \left[ \begin{matrix} L_{n+4} \\ L_{n+3} \\ L_{n+2} \\ L_{n+1} \\ L_{n} \end{matrix} \right] = \left[ \begin{matrix} 5 & 5 & 5 & 5 & 5 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ \end{matrix}\right]^n \left[ \begin{matrix} L_{4} \\ L_{3} \\ L_{2} \\ L_{1} \\ L_{0} \end{matrix} \right] $$

By diagonalizing the matrix, we can obtain an exact formula for the matrix exponential, and thus an exact formula for $L_n$.

However, if you just want the exact value of $L_{50}$, it's probably better to do it by exponentiating the matrix directly (using a square-and-multiply method: e.g. to obtain $A^{25}$, you first find $A^{12}$, then square it, and finally multiply the result by $A$).


There is no straightforward formula for this probability. However it can be computed exactly (within numerical error). You can keep track of the vector of probabilities $\mathbf{p}_t$ that at time $t$ you are in state:

  1. you haven't already rolled 5 6s and at the present time have rolled 0 6s,
  2. you haven't already rolled 5 6s and at the present time have rolled 1 6s,
  3. you haven't already rolled 5 6s and at the present time have rolled 2 6s,
  4. you haven't already rolled 5 6s and at the present time have rolled 3 6s,
  5. you haven't already rolled 5 6s and at the present time have rolled 4 6s, or
  6. you have already rolled 5 6s.

There are thus 6 probabilities in the vector. And initially $\mathbf{p}_0 = [1, 0, 0, 0, 0, 0]^T$. The chance of transitioning from state 1 to state 2 is $1/6$ and similarly for other states upto state 5 to state 6. From states 1 to 5 the chance of transitioning back to state 1 is $5/6$. However once you have already rolled 5 6s you have always already rolled 5 6s so if you get to state 6 you stay in state 6. These probabilities can be specified in a transition matrix $X$:

$$ X = \left\{ \begin{array}{c|cccccc} & S_1 & S_2 & S_3 & S_4 & S_5 & S_6 \\ \hline S_1 & \frac56 & \frac56 & \frac56 & \frac56 &\frac56 & 0 \\ S_2 & \frac16 &0 & 0 &0 & 0 & 0 \\ S_3 & 0 & \frac16 &0 &0 & 0 & 0 \\ S_4 & 0 & 0 &\frac16 &0 & 0 & 0 \\ S_5 & 0 & 0 &0 &\frac16 & 0 & 0 \\ S_6 & 0 & 0 & 0&0 &\frac16 & 1 \end{array}\right\} $$

Probabilities for time $t+1$ can be computed from probabilities for time $t$ by applying the transition matrix: $$ \mathbf{p}_{t+1} = X\mathbf{p}_t $$

Computing $\mathbf{p}_{50} = X^{50}\mathbf{p}_0$ gives the probability of observing at least 5 6s within 50 rolls to be 0.00494 (simulation gave 0.00493). As Hurkyl points out, for large numbers of rolls it can be worth using the square and multiply method for matrix exponentiation to maintain accuracy.


This problem can also be approached using a generating function as in this related answer.


Using A Generating Function

As Jack D'Aurizio notes, we can build up all possible strings where $6$ does not appear at least $5$ times with the $5$ atoms $$ \underbrace{\square\vphantom{6}}_{\large 5x^{\vphantom{1}}},\underbrace{6\square}_{\large 5x^2},\underbrace{66\square}_{\large 5x^3},\underbrace{666\square}_{\large 5x^4},\underbrace{6666\square}_{\large 5x^5} $$ The coefficient of $5$ represents the number of ways to fill the $\square$. Note that any sequence of length $50$ can be made in a unique way by putting together such atoms to a length of $51$ and removing the last $\square$. Therefore, in the sum $$ \begin{align} \sum_{k=0}^\infty5^k(x+x^2+x^3+x^4+x^5)^k &=\frac1{1-5\frac{x-x^6}{1-x}}\\ &=\frac{1-x}{1-6x+5x^6}\tag{1} \end{align} $$ the coefficient of $x^{n+1}$ is $5$ times the number of ways to arrange $n$ numbers with no subsequence of $5$ sixes in a row.

The coefficient of $x^{51}$ is $4021435247555066377711342806458789062500$ and $6^{50}$ is $808281277464764060643139600456536293376$. Dividing their quotient by $5$ and subtracting from $1$, we get a probability of $$ \frac{4109288018262124589373497083105433}{831565100272391008892118930510839808}\doteq0.0049416311686434034927\tag{2} $$


Computing The Coefficients By Recursion

Computing the coefficients of generating function in $(1)$ can be tedious. I used Mathematica to compute $(2)$. However, there is a more manual-friendly way to compute the coefficients using recursion.

The denominator in $(1)$ tells us that the coefficients of the series should satisfy $$ a_n=6a_{n-1}-5a_{n-6}\tag{3} $$ where we compute the first $6$ terms by series division $$ a_0=1,a_1=5,a_2=30,a_3=180,a_4=1080,a_5=6480\tag{4} $$ Using this recursion, it is easier to compute $$ a_{51}=4021435247555066377711342806458789062500\tag{5} $$