A question about Poker (and probability in general)

First you need to calculate the correct number of distinct hands. Let's say that suits are always relevant. In that case the number of distinct five-card poker hands is the number of ways of choosing 5 cards from 52, which is

$$\binom{52}{5} = \frac{52!}{5!47!} = \frac{52\cdot 51\cdot 50\cdot 49\cdot 48}{5\cdot4\cdot3\cdot2\cdot1} = 2598960$$
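As a quick sanity check, here's a minimal Python sketch (using the standard library's `math.comb`):

```python
from math import comb, factorial

# Number of ways of choosing 5 cards from 52.
hands = comb(52, 5)
print(hands)  # 2598960

# The same value via the factorial formula 52!/(5! 47!).
assert hands == factorial(52) // (factorial(5) * factorial(47))
```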


Now we can ask how long it takes, on average, before you have seen every hand. Let $T_{k,n}$ be the expected number of hands until you have seen all $n$ possible hands, when you have already seen $k$ of the hands. Then we know that $T_{n,n}=0$ (you have already seen all the hands) and we want to find $T_{0,n}$, the expected number of hands to be dealt when we haven't yet seen any hands dealt. The updating rule follows from considering the probabilities of being dealt a new hand, or a hand that we've already seen:

$$T_{k,n} = \frac{n-k}{n} (1 + T_{k+1,n}) + \frac{k}{n} (1 + T_{k,n})$$

which simplifies to

$$T_{k,n} = \frac{n}{n-k} + T_{k+1,n}$$

which we can solve, giving the solution

$$T_{0,n} = n\sum_{i=1}^n \frac{1}{i} = n H_n$$

where $H_n$ is the $n$th Harmonic number.
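If you'd like to verify this closed form, here's a short Python sketch that sums the simplified recurrence directly (using exact fractions) and compares it to $nH_n$ for small $n$; the function names are just for illustration:

```python
from fractions import Fraction

def expected_deals(n):
    """Solve T_{k,n} = n/(n-k) + T_{k+1,n} with T_{n,n} = 0, returning T_{0,n}."""
    T = Fraction(0)
    for k in range(n - 1, -1, -1):
        T += Fraction(n, n - k)
    return T

def n_times_harmonic(n):
    """The closed form n * H_n."""
    return n * sum(Fraction(1, i) for i in range(1, n + 1))

# The recurrence and the closed form agree exactly.
for n in range(1, 10):
    assert expected_deals(n) == n_times_harmonic(n)
```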

For the problem you want to consider you have $n=2598960$. The $2598960$th Harmonic number is approximately $15.34784$, which means that you expect to wait for

$$15.34784 \times 2598960 \approx 39888416$$

or approximately 40 million deals before you have seen every possible five-card poker hand. To put this in perspective, a dealer working at the phenomenal speed of one full hand per second, and never stopping to eat or sleep, would take approximately 1 year and 3 months to deal out every possible poker hand.
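The arithmetic here is easy to reproduce; a minimal Python sketch (a direct floating-point sum of the harmonic series, which is accurate enough at this scale):

```python
n = 2598960  # number of distinct five-card hands

# Harmonic number H_n via a direct float sum.
H = sum(1.0 / i for i in range(1, n + 1))
print(H)      # ≈ 15.34784

deals = n * H
print(deals)  # ≈ 3.99e7, i.e. about 40 million deals

# At one hand per second, nonstop:
print(deals / 86400 / 365.25)  # ≈ 1.26 years
```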


Another way to derive the average waiting time, which is closely related to Chris' method and of course leads to the same result but might be slightly more accessible if you haven't encountered recurrence relations before, is this: First you have to wait until you've seen at least one hand; then you have to wait until you've seen at least two hands; then you have to wait until you've seen at least three hands, etc. The total average waiting time is just the sum of these individual waiting times.

Now when you've already seen $j$ hands, the probability of getting a hand you haven't seen yet, and thus progressing to the next waiting stage, is $n-j$ in $n$, where $n$ is the number of all possible hands. It's plausible that if you have a $1$ in $m$ chance of seeing something, on average you have to wait $m$ times to see it, and in fact this turns out to be true (see the geometric distribution). So the average waiting time to see something is the reciprocal of the chance of seeing it: if your chance of seeing a new hand is $(n-j)/n$, on average you have to wait $n/(n-j)$ deals to see it. Now you just have to add up all these average waiting times:

$$t=\sum_{j=0}^{n-1}\frac n{n-j}=n\sum_{j=0}^{n-1}\frac1{n-j}=n\sum_{i=1}^{n}\frac1{i}$$

(with $i=n-j$), which is exactly Chris' result.
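This waiting-time argument is also easy to test empirically. The sketch below (a toy Monte Carlo, with hands treated as numbered coupons and a small illustrative $n$) simulates the collection process and compares the observed average to $nH_n$:

```python
import random

def deals_until_all_seen(n, rng):
    """Deal uniformly random 'hands' until all n distinct ones have appeared."""
    seen, deals = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        deals += 1
    return deals

rng = random.Random(42)  # fixed seed for reproducibility
n, trials = 50, 5000
average = sum(deals_until_all_seen(n, rng) for _ in range(trials)) / trials
expected = n * sum(1 / i for i in range(1, n + 1))  # n * H_n
print(average, expected)  # the two agree to within a few percent
```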


Here's another solution, which is obviously less elegant and a lot more complicated than the ones in the other answers, but perhaps serves as a good example that a) in mathematics you can often do things in quite different ways and (sometimes somewhat miraculously) end up with the same result, and b) which of these different ways you choose can make a big difference in how much work you have to do to get the result.

We can model the dealing as a Markov process. The state is given by the number $i$ of hands not yet seen. With each deal, in state $i$ the state is left unchanged with probability $1-i/n$ and a transition from state $i$ to state $i-1$ occurs with probability $i/n$, where again $n$ is the number of possible hands. So for example for $n=5$ we'd have the following transition matrix:

$$\left(\begin{array}{cccccc} 1&\frac15&0&0&0&0\\ 0&\frac45&\frac25&0&0&0\\ 0&0&\frac35&\frac35&0&0\\ 0&0&0&\frac25&\frac45&0\\ 0&0&0&0&\frac15&1\\ 0&0&0&0&0&0\\ \end{array}\right)\;.$$

If we number the rows and columns starting with $0$, this matrix has eigenvectors $x_k$ with components $(x_k)_i=(-1)^i\binom ki$ for $k=0,\dotsc,n$ and corresponding eigenvalues $\lambda_k=1-k/n$ (where the binomial coefficient is zero if the bottom argument is greater than the top one). Our initial state has probability $1$ in state $n$ and $0$ in all other states, and this has the expansion

$$\delta_{in}=\sum_{k=0}^n(-1)^k\binom nk(x_k)_i=\sum_{k=0}^n(-1)^{k+i}\binom nk\binom ki$$
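Both the eigenvector claim and this expansion can be checked exactly for small $n$ with a short Python sketch (pure standard library, exact fractions; the matrix is applied column-by-column as described above):

```python
from fractions import Fraction
from math import comb

n = 5

def apply_transition(v):
    """Apply the transition matrix to a state vector v, where v[i] is the
    probability of having i unseen hands: stay with prob (n-i)/n, or drop
    from i+1 unseen to i unseen with prob (i+1)/n."""
    w = [Fraction(0)] * (n + 1)
    for i in range(n + 1):
        w[i] += Fraction(n - i, n) * v[i]
        if i + 1 <= n:
            w[i] += Fraction(i + 1, n) * v[i + 1]
    return w

# Eigenvectors (x_k)_i = (-1)^i C(k,i) with eigenvalues 1 - k/n.
for k in range(n + 1):
    x = [Fraction((-1) ** i * comb(k, i)) for i in range(n + 1)]
    lam = Fraction(n - k, n)
    assert apply_transition(x) == [lam * c for c in x]

# The initial state (all n hands unseen) equals sum_k (-1)^k C(n,k) x_k.
start = [Fraction(0)] * n + [Fraction(1)]
expansion = [
    sum(Fraction((-1) ** (k + i) * comb(n, k) * comb(k, i)) for k in range(n + 1))
    for i in range(n + 1)
]
assert expansion == start
```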

in terms of the eigenvectors of the transition matrix. The final state has $i=0$, so the probability of having seen all hands at time $t$ is the corresponding component of the state vector at time $t$. Each eigenvector is multiplied by its eigenvalue in each deal, so the probability to be in the state $i=0$ at time $t$ is given by

$$p_t=\sum_{k=0}^n(-1)^k\binom nk\lambda_k^t=\sum_{k=0}^n(-1)^k\binom nk\left(1-\frac kn\right)^t\;.$$

The probability of not being done yet at time $t$ is

$$1-p_t=-\sum_{k=1}^n(-1)^k\binom nk\left(1-\frac kn\right)^t\;,$$

and the average waiting time is the sum of all these probabilities:

$$ \begin{eqnarray} T &=& \sum_{t=0}^\infty(1-p_t) \\ &=& \sum_{t=0}^\infty\left(-\sum_{k=1}^n(-1)^k\binom nk\left(1-\frac kn\right)^t\right) \\ &=& -\sum_{k=1}^n(-1)^k\binom nk\sum_{t=0}^\infty\left(1-\frac kn\right)^t \\ &=& -\sum_{k=1}^n(-1)^k\binom nk\frac nk\;. \end{eqnarray} $$
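Before going through the $\sigma(q)$ trick below, one can confirm for small $n$ that this alternating sum really does equal the $nH_n$ of the other answers; a short Python check with exact fractions:

```python
from fractions import Fraction
from math import comb

def waiting_time_alternating(n):
    """T = -sum_{k=1}^n (-1)^k C(n,k) n/k, from the eigenvalue expansion."""
    return -sum(Fraction((-1) ** k * comb(n, k) * n, k) for k in range(1, n + 1))

def waiting_time_harmonic(n):
    """T = n * H_n, the result from the other answers."""
    return n * sum(Fraction(1, i) for i in range(1, n + 1))

for n in range(1, 12):
    assert waiting_time_alternating(n) == waiting_time_harmonic(n)
```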

To evaluate this, consider

$$\sigma(q)=\sum_{k=1}^n(q-1)^k\binom nk\frac1k\;.$$

Differentiating with respect to $q$ yields

$$\begin{eqnarray} \sigma'(q) &=& \sum_{k=1}^n(q-1)^{k-1}\binom nk \\ &=& \frac1{q-1}\left(\sum_{k=0}^n(q-1)^k\binom nk-1\right) \\ &=& \frac{(q-1+1)^n-1}{q-1} \\ &=& \frac{q^n-1}{q-1} \\ &=& \sum_{i=0}^{n-1}q^i \;, \end{eqnarray} $$

and so

$$\sigma(q)=\sum_{i=1}^{n}\frac{q^i}i+C\;.$$

The integration constant $C$ can be recovered from $\sigma(1)=0$:

$$\sigma(q)=\sum_{i=1}^{n}\frac{q^i}i-\sum_{i=1}^{n}\frac1i\;,$$

and thus we obtain

$$T=-n\sigma(0)=n\sum_{i=1}^{n}\frac1i\;,$$

in agreement with the other answers.