Birthday-coverage problem

Solution 1:

As Ross also states, this can be framed as a coupon collector problem where birthdays correspond to coupons and individuals correspond to trials. Then, good bounds and a strong asymptotic result are both known for the time (number of trials) needed to collect at least one coupon of each unique type.

Let $T$ be the number of trials at which all coupons have been collected. Let there be $n$ coupons (so $n = 365$ in our case). For each $n$, and any $c > 0$,

$$ \mathbb{P}(T > n \log n + c n) < e^{-c} $$

and, also,

$$ \mathbb{P}(T < n \log n - c n) < e^{-c} \> . $$

From the first inequality we get that no more than 2407 trials are needed so that there is greater than 50% chance all coupons are collected, and from the second we know that at least 1900 are needed.

For more details, see the recent question and answer here.


The following classical asymptotic result holds for $c \in \mathbb{R}$,

$$ \mathbb{P}(T > n \log n + c n ) \to 1 - e^{-e^{-c}} . $$

See, e.g., Motwani and Raghavan, Randomized Algorithms, pp. 60--63 for a proof.

Note that if you plug in $c = -\log \log 2$ here, one gets 2287.240 which matches very closely to Byron's exact answer and the Monte Carlo estimate reported in the question.

Solution 2:

For the coupon collector's problem with $n$ objects, let $T$ be the number of trials needed to get a complete set. Then we have the formula $$P(T\leq k)=n^{-k}\ n!\ \left\lbrace {k\atop n}\right\rbrace.$$ Here the braces indicate Stirling numbers of the second kind.

With $n=365$, Maple gives me $P(T\leq 2286)=.4994$ while $P(T\leq 2287)=.5003$, so that $2287$ is the smallest number to give a 50% chance to get all 365 birthdays.

Solution 3:

This is the Coupon collector problem The expected value to cover them all (not quite what you asked) is $365 \sum_{i=1}^{365}\frac{1}{i}\approx 365n \ln 365 + 365\gamma + \frac{1}{2}$