Probability of 3 people in a room of 30 having the same birthday

I have been looking at the birthday problem (http://en.wikipedia.org/wiki/Birthday_problem) and I am trying to figure out what the probability of 3 people sharing a birthday in a room of 30 people is. (Instead of 2).

I thought I understood the problem but I guess not since I have no idea how to do it with 3.


Solution 1:

The birthday problem with 2 people is quite easy because finding the probability of the complementary event "all birthdays distinct" is straightforward. For 3 people, the complementary event includes "all birthdays distinct", "one pair and the rest distinct", "two pairs and the rest distinct", etc. To find the exact value is pretty complicated.

The Poisson approximation is pretty good, though. Imagine checking every triple and calling it a "success" if all three have the same birthdays. The total number of successes is approximately Poisson with mean value ${30 \choose 3}/365^2$. Here $30\choose 3$ is the number of triples, and $1/365^2$ is the chance that any particular triple is a success. The probability of getting at least one success is obtained from the Poisson distribution: $$ P(\mbox{ at least one triple birthday with 30 people})\approx 1-\exp(-{30 \choose 3}/365^2)=.0300. $$

You can modify this formula for other values, changing either 30 or 3. For instance, $$ P(\mbox{ at least one triple birthday with 100 people})\approx 1-\exp(-{100 \choose 3}/365^2)=.7029,$$ $$ P(\mbox{ at least one double birthday with 25 people })\approx 1-\exp(-{25 \choose 2}/365)=.5604.$$

Poisson approximation is very useful in probability, not only for birthday problems!

Solution 2:

An exact formula can be found in Anirban DasGupta, The matching, birthday and the strong birthday problem: a contemporary review, Journal of Statistical Planning and Inference 130 (2005), 377-389. This paper claims that if $W$ is the number of triplets of people having the same birthday, $m$ is the number of days in the year, and $n$ is the number of people, then

$$ P(W \ge 1) = 1 - \sum_{i=0}^{\lfloor n/2 \rfloor} {m! n! \over i! (n-2i)! (m-n+i)! 2^i m^n} $$

No derivation or source is given; I think the idea is that the term corresponding to $i$ is the probability that there are $i$ birthdays shared by 2 people each and $n-2i$ birthdays with one person each.

In particular, if $m = 365, n = 30$ this formula gives $0.0285$, not far from Byron's approximation.

Solution 3:

As being pointed out by Micheal Lugo the formulation given by Anirban DasGupta is a exact answer for this problem, however a formal proof is needed. I have found and verified a solution by Doctor Rick from Math Forum, below is the link

http://mathforum.org/library/drmath/view/56650.html

His approach is to partition the sample space as following:

   1.   none share a birthday
   2.   one pair shares a birthday
   3.   two pairs share different birthdays
   4.   three pairs share different birthdays
   :
 1+N/2. N/2 pairs share different birthdays
 2+N/2. three or more share a birthday

Then he points out a clever way to count for each partition by picking different birthday for each pair of person. I have tried and arrived with the same formulation as Anirban DasGupta's. For more detail please take a look at the link above!

Solution 4:

I'm a bit skeptical of this answer. Here is a formula that works.

It probably helps to explain Dasgupta's formula from Michael Lugo's answer first.

Say that a map $f : [m]\to [n]$ is $k$-almost injective if $|f^{-1}(j)|\le k$ for all $j\in [n]$. Counting injective maps is easy, there are

$$ I(1,m,n) := m!\binom{n}{m} $$

of them. You just pick the range and then a bijection to it. This gives right away the standard birthday collision probability for $m$ people and years of length $n$

$$ 1 - n^{-m}I(1,m,n) $$

One gets the generalized birthday probability from $I(k,m,n)$ in the same way, so we can just think about $I(k,m,n)$.

How would we go about counting $2$-injective maps? The same idea as before works. This time, we pick $c$ pairs that will have colliding images, injectively map these into $[n]$, then injectively map the rest to a set of size $n-c$. So we get

$$ I(2,m,n) = \sum_{c=0}^{\lfloor m/2\rfloor}\frac{1}{c!} \left(\prod_{j=0}^{c-1}\binom{m - 2j}{2}\right) I(1,c,n)I(1,m-2c,n-c) $$

This is equivalent to Dasgupta's formula, but it is easier to see the induction.

If we want to get $I(k,m,n)$ in general, we have

$$ I(k,m,n) = \sum_{c=0}^{\lfloor m/k\rfloor}\frac{1}{c!} \left(\prod_{j=0}^{c-1}\binom{m - kj}{k}\right) I(1,c,n)I(k-1,m-kc,n-c) $$

Solution 5:

My own research led to the following result...

Knowing that there are $A$ days in the year (typically $A=365$), the probability $P(A, M, n)$ that at least $n$ children have their birthday the same day within a class of $M$ children is:

$$\boxed{ P(A, M,n) = 1 - \dfrac{ K_n(A, M) }{ A^M } }$$

where $K_n(A, M)$ represents the number of configurations in which one cannot find $n$ children (or more) having their birthday the same day, and can be computed by recurrence as follows:

$$\forall n\ge 2, \quad \boxed{ K_{n+1}(A, M) = \sum_{0\le k\le \left\lfloor{\frac{M}{n}}\right\rfloor} \dfrac{ \binom{A}{k} \; (M)_{nk} \; K_n(A-k, M-nk)}{ (n!)^k} }$$

with the following initialization: $\boxed{ K_2(A,M)=(A)_M }$

and where $(n)_k$ stands for the decreasing factorial : $(n)_k = n(n-1)...(n-k+1)$

Numerically:

Within a class of $M=30$ children, knowing that the year counts $A=365$ days...

The probability that at least $n=2$ children have their birthday the same day is:

$$P(365,30,2)\simeq 70,6\%$$

The probability that at least $n=3$ children have their birthday the same day is:

$$P(365,30,3)\simeq 2,85\%$$

The probability that at least $n=4$ children have their birthday the same day is:

$$P(365,30,4)\simeq 0,0532\%$$

Nicolas