Probability of streaks

I thought of this question lately, but I'm not satisfied with the answer I got:

If I flip a coin 100 times, what is the probability that I will get a streak of at least ten of the same side?

The way I thought of solving it was just: $\left(90 \times 0.5 ^ 9\right) = 0.0879$, but this has to be wrong because, for the probability of getting a streak of at least 4 in the same scenario, it yields: $\left(96 \times 0.5 ^ 3\right) = 12$, which obviously doesn't make sense.


Solution 1:

EDIT: I have added to the end of this answer the solution to the generalization of this problem to a coin flip with probability p of heads. The question as stated is based on p = 0.5

This can be readily solved "exactly" by using a 10 state (stationary, i.e., time-homogeneous) discrete time Markov Chain, with state i, from 1 to 10, being the length of the current streak (whether heads or tails). The answer is 0.086659044348362, with which the comment by @Scott above is consistent. The details are below.

Take the initial state to be 1, corresponding to a streak of 1 after 1 coin flip (there will always be a streak of 1 after 1 coin flip). Let state 10 be an absorbing state (once state 10 is reached, a streak of at least 10 has occurred, so "we're done").

The one step transition matrix is populated as follows:

In state 1, there is a 0.5 probability of staying in state 1 (but now we have started a new streak on the other side of the coin), and 0.5 probability of moving to state 2.

In states 2 through 9: there is a 0.5 probability of moving to state 1 (i.e.,the current streak is terminated, and a new streak is started on the other side of the coin), and 0.5 probability of moving to the state incremented by 1 (i.e., from state i to state i+1).

State 10 has probability of 1 of staying in state 10, which is absorbing.

So to summarize, the one-state transition matrix is all zeros except for:

row 1 columns 1 and 2 are 0.5
rows 2 through 9: columns 1 and "row + 1" are 0.5
row 10: column 10 is 1

Because we start the Markov Chain after one coin toss, we need to do 99 more coin tosses, and therefore 99 Markov Chain steps to constitute 100 coin flips. Therefore the answer is the probability of being in state 10 after 99 Markov Chain steps, having started in state 1, and so is the (1,10) entry of the {one-step transition matrix to the 99 power}. The answer, as previously stated, is 0.086659044348362.

Note that with these parameters, it is not correct, contrary to a comment by @user1546083 above, to perform a calculation for getting a streak of at least 10 heads, and then doubling it to include tails, because it is possible to get a streak of at least 10 heads and another of at least 10 tails in the same 100 coin flips. In light of this, the accepted answer by @rajb245 does not answer the question as asked, and as of now, my answer is the only one to have exactly solved the question as asked, with @scott getting a consolation prize for having apparently correctly simulated the problem.

EDIT: Here is the generalization of this problem to a coin flip with probability p of heads.

The astute reader will note that my solution for the p = 0.5 case takes advantage of the symmetry between heads (H) and tails (T) in order to keep the state space down to 10 states, in which the state indicates the length of the current streak, and does not distinguish between H and T. In the general p not necessarily equal to 0.5 case, that "trick" no longer works, because the probability of extending the current streak depends on whether it is a streak of H or T.

Therefore, I now define a 21 state Markov Chain, with state 0 being the initial state of zero length streak, which is the state prior to the first coin toss. The remaining states are labeled 1H, 1T, 2H, 2T, ..., 10H, 10T, where the number indicates the length of the streak and the H or T indicates whether the streak is H or T. States 10H and 10T are both made to be absorbing states, and the probability of having a streak of (at least) 10H or (non-exclusive) 10T is the sum of the (0,10H) and (0,10T) entries of the one step transition matrix raised to the 100 power, i.e., of the 100 step transition matrix. The (0,10H) entry of that 100 step transition matrix is the probability that there is a streak of at least 10 heads and that it occurs before a streak (if any) of 10 tails. And of course, the (0,10T) entry of that 100 step transition matrix is the probability that there is a streak of at least 10 tails and that it occurs before a streak (if any) of 10 heads.

Starting in state 0, transition to 1H with probability p and to 1T with probability 1-p.

In state 1H, transition to 1T with probability 1-p and to 2H with probability p. In state 1T, transition to 1H with probability p and to 2T with probability 1-p. Etc. up through transitioning from states 9H and 9T.

States 10H and 10T are absorbing, so 10H transitions to 10H with probability 1, and 10T transitions to 10T with probability 1.

I will order the states as 0,1H,1T,2H,2T,...10H,10T. Given a value of p, the one-step transition matrix,TM, can be created in MATLAB (note that MATLAB array indices start with 1) with the code:

TM = zeros(21);
TM(1:2:19,2) = p; TM([1 2:2:18],3) = 1-p;
for i = 2:2:18, TM(i,i+2) = p; TM(i+1,i+3) = 1-p; end
MP(20,20) = 1; TM(21,21) = 1;

If we use this approach with p = 0.5, we of course duplicate the result of the earlier approach which was specialized for p = 0.5, namely that the (0,10H) and (0,10T) entries of the 100 step transition matrix are both 0.043329522174181, and their sum matches the previous result of 0.086659044348362 . If we let p = 0.6, then the (0,10H) entry of the 100 step transition matrix is 0.204426792384254, the (0,10T) entry of the 100 step transition matrix is 0.005246574538275, and their sum is 0.209673366922529, which is the probability of getting a H or (non-exclusive) T streak of at least 10 in a row. In the extreme case of p = 1, the probability of getting a streak of at least 10 heads is 1.

Solution 2:

According to this page, there is a closed form expression for just this problem.

...the probability, S, of getting K or more heads in a row in N independent attempts (where p is the probability of heads and q=1-p is the probability of tails) is:

$$ S(N,K) = p^K\sum_{T=0}^\infty {N-(T+1)K\choose T}(-qp^K)^T-\sum_{T=1}^\infty {N-TK\choose T}(-qp^K)^T $$

With the unusual convention that ${A\choose B}= 0$ for $A < B$. Numerical evaluation gives me 0.0441372 for the case of $p=1/2$, $N=100$, $K=10$.

Edit 1

Reworking it a bit to get rid of that weird convention just changes the upper limit.

$$ p^k \sum _{t=0}^{\frac{n-k}{k+1}} \binom{n-k (t+1)}{t} \left(-q p^k\right)^t-\sum _{t=1}^{\frac{n}{k+1}} \binom{n-k t}{t} \left(-q p^k\right)^t $$

The following Mathematica code gives you numbers, just plug in p, n, k in the last substitution bit.

-\!\(
\*UnderoverscriptBox[\(\[Sum]\), \(t = 1\), 
FractionBox[\(n\), \(1 + k\)]]\(
\*SuperscriptBox[\((\(-
\*SuperscriptBox[\(p\), \(k\)]\)\ q)\), \(t\)]\ Binomial[n - k\ t, 
       t]\)\) + p^k \!\(
\*UnderoverscriptBox[\(\[Sum]\), \(t = 0\), 
FractionBox[\(\(-k\) + n\), \(1 + k\)]]\(
\*SuperscriptBox[\((\(-
\*SuperscriptBox[\(p\), \(k\)]\)\ q)\), \(t\)]\ Binomial[
       n - k\ \((1 + t)\), t]\)\) //. {k -> 10, n -> 100, q -> 1 - p, 
   p -> 1/2} // N

Edit 2

It has recently been pointed out by Mark L. Stone that the above is for a streak of heads, but not for the case of either streak occurring. I'd recommend reading his post below.

Solution 3:

Your calculation is essentially the expected number of streaks of that length. A naive approach would be that the chance that you don't start a run of $n$ on a a particular roll is $2^{n-1}$ as you realize. The chance of never starting a run of $10$ is then $(1-2^{9})^{91}$ and the chance of having at least one $1-(1-2^{9})^{91}\approx 0.163$ The reason for $91$ instead of $90$ is that you could start on any flip $1$ through $91$. The reason for the weasel words is this calculation assumes the chance of starting a run on flip 2 is independent of the chance of starting on 1 or 3, but they interact.

Solution 4:

EDIT: This answer is erroneous, because a streak can start at any number, not just at a given multiple at 10. Please disregard.

Q: If I flip a coin 100 times, what is the probability that I will get at least one streak of at least ten of the same side?

Assuming the coin is fair:

P(10 consecutive same side) = $0.5^{10}$
Conversely, the probability of that outcome not occurring is $1-0.5^{10}$. Call this outcome F.

Now, since you're flipping a coin 100 times, and 100 times corresponds to 10 such samples (of 10 flips each), we can do this simply with independence:

P(No Streak in 10 sets of samples): $F^{10}$
$\therefore$ P(At least one streak in 10 sets of samples) $= 1-F^{10}$
$\therefore$ P(At least one streak in 10 sets of samples) $= (1-(1-0.5^{10})^{10})$
$\therefore$ P(At least one streak in 10 sets of samples) $=0.00972...$