Expected number of dice rolls until get the same number twice

In particular, the numbers do not have to be consecutive. The way I solved this was as follows: let $N$ be the number of rolls until we get the same two numbers. Then, define $N = X + Y$ where $X$ is the number of rolls to get an arbitrary face and $Y$ is the number of rolls to get the same face given that we have gotten some arbitrary face.

X = 1. $Y \sim Geo(1/6)$ since there is a $1/6$ chance of getting the same number . Then, E(X) + E(Y) = 7, which is confusing because by Pidgeon Hole, we are guaranteed to get two of the same number in 7 rolls.

I am not exactly sure why the work is wrong, but the result seems to be off.


Solution 1:

In your work, you are finding expected number of rolls to see the same number that you saw on the first roll. That is indeed $6$. But you are ignoring the possibility that it could be another number that could appear twice and not necessarily the first number. One of the approaches is to build system of equations for expected number of rolls to see a repeat from different states -

Once all numbers have appeared, it would take one more roll to see a repeat.

So $ \displaystyle e_6 = 1~$, where $e_6$ is the expected number of additional rolls for a repeat after you have seen all numbers once.

Similarly, once five numbers have appeared, it would take one more roll with probability $5/6$ to see a repeat or there is $1/6$ probability that you will see the sixth number and then the expected number of rolls for a repeat will be $e_6$.

$ \displaystyle e_5 = 1 + \frac{e_6}{6} = \frac{7}{6}$

Continuing,

$ \displaystyle e_4 = 1 + \frac{2 e_5}{6} = \frac{25}{18}$

$ \displaystyle e_3 = 1 + \frac{3 e_4}{6} = \frac{61}{36}$

$ \displaystyle e_2 = 1 + \frac{4 e_3}{6} = \frac{115}{54}$

$ \displaystyle e_1 = 1 + \frac{5 e_2}{6} = \frac{899}{324}$

So, $~ \displaystyle e_0 = 1 + e_1 = \fbox{$\frac{1223}{324}$}$

Solution 2:

Your $X+Y$ decomposition does not seem to work.

I think the easiest way to solve this is using $$ E[N]=\sum_{n=0}^6 P(N>n) $$ Note that $N>n$ if and only if the first $n$ rolls of the die are all different. The probability the first $n$ rolls are different is $\frac{6!}{(6-n)!6^n}$, for each $n\in \{0,1,\dots,6\}$. We conclude $$ E[N]=\sum_{n=0}^n \frac{6!}{(6-n)!6^n}. $$