A number of die rolls to see every number at least once

Solution 1:

Let $S_k$ be all the outcomes in which a $k$ is not rolled. For each $k$, $|S_k|=(n-1)^r$ and there are $\binom{n}{1}$ choices for $k$. For each $j\ne k$, $|S_j\cap S_k|=(n-2)^r$ and there are $\binom{n}{2}$ choices for $j$ and $k$. Continuing in this manner and using Inclusion-Exclusion to count the number of outcomes missing at least $1$ number, we get $$ \begin{align} \left|\bigcup_{k=1}^nS_k\right| &=\sum_{k=1}^n|S_k|-\sum_{j< k}|S_j\cap S_k|+\sum_{i<j<k}|S_i\cap S_j\cap S_k|-\dots\\ &=\binom{n}{1}(n-1)^r-\binom{n}{2}(n-2)^r+\binom{n}{3}(n-3)^r-\dots \end{align} $$ Since there are a total of $n^r$ total outcomes, we get the number of outcomes in which all possible numbers are rolled is $$ n^r-\binom{n}{1}(n-1)^r+\binom{n}{2}(n-2)^r-\binom{n}{3}(n-3)^r+\dots $$ Thus, the probability of this occurring is $$ 1-\binom{n}{1}\left(1-\frac1n\right)^r+\binom{n}{2}\left(1-\frac2n\right)^r-\binom{n}{3}\left(1-\frac3n\right)^r+\dots $$ So we need to find the smallest $r$ so that $$ p\le\sum_{k=0}^n(-1)^k\binom{n}{k}\left(1-\frac kn\right)^r $$ Expected Duration

The expected duration until all numbers are rolled can be computed using the formulas above; i.e. $$ \begin{align} \mathrm{E}[r] &=\color{#C00000}{\sum_{r=1}^\infty}\sum_{k=0}^n(-1)^k\binom{n}{k}\color{#C00000}{r\left[\left(1-\frac kn\right)^r-\left(1-\frac kn\right)^{r-1}\right]}\\ &=\sum_{k=1}^n(-1)^k\binom{n}{k}\color{#C00000}{\left(-\frac nk\right)}\\ &=n\sum_{k=1}^n(-1)^{k-1}\color{#00A000}{\binom{n}{k}}\frac1k\\ &=n\sum_{k=1}^n(-1)^{k-1}\color{#00A000}{\sum_{j=k}^n\binom{j-1}{k-1}}\frac1k\\ &=n\sum_{j=1}^n\sum_{k=1}^j(-1)^{k-1}\binom{j-1}{k-1}\frac1k\\ &=n\sum_{j=1}^n\color{#0000FF}{\sum_{k=1}^j(-1)^{k-1}\binom{j}{k}}\frac1j\\ &=n\sum_{j=1}^n\frac1j \end{align} $$ However, there is a simpler way. Since the expected duration for a binomial event with probability $p$ is $\frac1p$, we get that the expected duration for the $k^{\mathrm{th}}$ number is $\frac{n}{n-k+1}$. Therefore, the expected duration to get all numbers is $$ \sum_{k=1}^n\frac{n}{n{-}k{+}1}=n\sum_{k=1}^n\frac1{k} $$

Solution 2:

The probability of $k$ given numbers out of $n$ choices not coming up after $t$ throws is $\left(\frac{n-k}{n}\right)^t$

You can use inclusion exclusion to calculate the probability of every number coming up.

$p = 1 - \binom{n}{1}\left(\frac{n-1}{n}\right)^t + \binom{n}{2}\left(\frac{n-2}{n}\right)^t - ...$

Solution 3:

Let $P(k, r)$ be defined by recursion as follows: $$P(0, r) = 1$$ $$P(k + 1, 0) = 0$$ $$P(k + 1, r + 1) = \frac{k + 1}{n} P(k, r) + \left( 1 - \frac{k + 1}{n} \right) P(k + 1, r)$$ That is, $P(k, r)$ is the probability of seeing a particular set of $k$ different outcomes in $r$ samples. Note that by expanding only the second term of the second equation, we get: $$P(k + 1, r + 1) = \sum_{s = 0}^{r} \left( 1 - \frac{k + 1}{n} \right)^s \frac{k + 1}{n} P(k, r - s)$$ In particular, $$P(1, r) = 1 - \left( 1 - \frac{1}{n} \right)^r$$ as expected, and one can check that $$P(k, r) = \sum_{l=0}^{k} (-1)^l \frac{k!}{(k-l)! l!} \left(1 - \frac{l}{n}\right)^r$$ solves the recurrence.

The probability you are interested in is $P(n, r)$ – or more precisely, what you want to know is what the least number $r$ such that $P(n, r) \ge p$ is. Unfortunately to answer that question would require detailed knowledge about the asymptotics of $P(n, r)$, which seems difficult in general. For $n = 2$ things are easy, because in that case we just need to solve $$1 - \frac{1}{2^{r-1}} \ge p$$ and so we need $r \ge 1 - \log_2 (1-p)$.