Group of $r$ people at least three people have the same birthday?

What is the probability that in a randomly chosen group of $r$ people at least three people have the same birthday?

  1. $\displaystyle 1- \frac{365\cdot364 \cdots(365-r+1)}{365^r}$
  2. $\displaystyle \frac{365\cdot364 \cdots(365-r+1)}{365^r} +{r\choose 2}\cdot \frac{365\cdot364\cdot363 \cdots (364-(r-2) +1)}{364^{r-2}}$
  3. $\displaystyle 1- \frac{365\cdot364 \cdots(365-r+1)}{365^r} +{r\choose 2}\cdot \frac{365\cdot364\cdot363 \cdots (364-(r-2) +1)}{364^{r-2}}$
  4. $\displaystyle\frac{365\cdot364 \cdots(365-r+1)}{365^r} $

My attempt :

(May be typo in option $(3)$ of question !)

$$P(\text{at least 3 persons have same birthday})$$

$$= 1 - \{P\text{(no one has same birthday) + P(any 2 have same birthday)\}}$$

So, option $(3)$ is true.

Can you explain it, please?

It asked here before, but I'm not satisfied by explanation.


Given $2k$ items, there are $(2k-1)!!$ ways to arrange them into pairs: the first item can be paired with $2k-1$ possibilities; the first unpaired item can be matched with $2k-3$ items; the new first unpaired item can be paired with $2k-5$ items; etc.

The number of functions from $n$ people to $365$ dates with $n-2k$ singles and $k$ pairs is $$ \begin{array}{cc} &\displaystyle\underbrace{ \overbrace{\binom{365}{n-k}}^{\substack{\text{ways to choose}\\\text{$n-k$ dates}\\\text{for birthdays}}} \overbrace{\binom{n-k}{k}}^{\substack{\text{ways to choose}\\\text{$k$ dates}\\\text{for pairs}}} }&\displaystyle\underbrace{ \overbrace{\ \ \binom{n}{2k}\ \ }^{\substack{\text{ways to choose}\\\text{$2k$ people}\\\text{for pairs}}} \overbrace{(n-2k)!\vphantom{\binom{n}{k}}}^{\substack{\text{ways to arrange}\\\text{$n-2k$ singles}}} \overbrace{(2k-1)!!\vphantom{\binom{n}{k}}}^{\substack{\text{ways to pair}\\\text{$2k$ people}}} \overbrace{\ \ \ \ \ k!\ \ \ \ \ \vphantom{\binom{n}{k}}}^{\substack{\text{ways to arrange}\\\text{$k$ pairs}}} }\\ \displaystyle= &\displaystyle\frac{365!}{(365-n+k)!\,(n-2k)!\,k!} &\displaystyle\frac{n!}{2^k} \end{array} $$ Thus, the probability of getting at least one triple is $$ 1-\frac{365!\,n!}{365^n}\sum_{k=0}^{365}\frac1{(365-n+k)!\,(n-2k)!\,k!\,2^k} $$ where we take $\frac1{n!}=0$ for $n\lt0$.

Here is a plot from $n=0$ to $n=730$. For $n\lt3$, the probability of getting a triple is $0$, and for $n\gt730$, the probability is $1$.

enter image description here


Before we start with calculations we should think about which values of $r$ we should take into account.

  • $r=1,2$: We can exclude the trivial cases $r=1$ and $r=2$ which give a probability of zero, since we never reach a configuration with three people having the same birthday.

  • $2<r\leq 365$: The main part where we put the focus of our analysis. Fortunately the reasoning can be easily extended to $r\leq 730$.

  • $365<r\leq 2\cdot 365$: Assuming the year has $365$ days, the situation becomes slightly different if the number of people is greater than $365$. In these cases it is guaranteed that there are people with the same birthday and this needs some additional thoughts.

  • $r>2\cdot 365$: According to the pigeonhole principle groups with more than $730$ people have at least three equal birthdays with probability $p=1$.


Starter:

In order to get a first impression and to have a verification for small numbers at hand we look at a smaller example. It should be small enough to make calculations easy and large enough to be representative for the problem.

We take $r=5$ people and instead of $365$ days we use $6$ days. If we take a fair die with six faces instead, this smaller version of the problem can then be stated as:

Small Problem: What is the probability that in a group of $5$ people, each of them throwing a fair die once, at least three people roll the same number.

Five people can throw a total of $6^5=7776$ different configuration of $5$-tuples from $(1,1,1,1,1)$ to $(6,6,6,6,6)$.

We answer the question by counting the $5$-tuples which have no more than two equal faces.

  • Let $N_{(11111)}$ be the number of $5$-tuples with pairwise different entries. We see that there are $6\cdot5\cdot\ldots\cdot1=6!$ possibilities and get \begin{align*} N_{(11111)}=6!=720 \end{align*} The index $11111$ indicates that there are five pairwise different values counted.

Now we consider the number of $5$-tuples with one or more pairs of equal numbers, but with no occurrence of three or more equal numbers. There are two different constellations:

  • One pair and three other numbers which are pairwise different, denoted by $N_{(1112)}$.

  • The second constellation are two different pairs and another number different to each of them, denoted by $N_{(122)}$.

    We obtain \begin{align*} N_{(1112)}&=\frac{1}{1!}\binom{5}{2}6\cdot5\cdot4\cdot3=3600\tag{1}\\ N_{(122)}&=\frac{1}{2!}\binom{5}{2}6\binom{3}{2}5\cdot4=1800\tag{2} \end{align*}

Comment:

  • In (1) there are $\binom{5}{2}$ pairs which can take the values $1,\ldots,6$. The other three people have $5,4$ and $3$ possibilities to throw a number pairwise different to all the others.

  • In (2) there are $\binom{5}{2}\cdot6$ possibilities for one pair, leaving $\binom{5-2}{2}\cdot5$ possibilities for the second pair. Since we don't want to count different pair constellations more than once, we have to divide them by $2!$.

The distribution table of all different $5$-tuples can be easily calculated

\begin{array}{cccccccc} \text{Total}&N_{(11111)}&N_{(1112)}&N_{(113)}&N_{(122)}&N_{(14)}&N_{(23)}&N_{(5)}\\ 7776&\color{blue}{720}&\color{blue}{3600}& 1200& \color{blue}{1800} & 150 & 300 & 6 \end{array}

We are ready to calculate the wanted probability $p$ for this small example:

\begin{align*} p&=1-\frac{N_{(11111)}}{6^5}-\left(\frac{N_{(1112)}}{6^5}+\frac{N_{(122)}}{6^5}\right)\\ &=1-\frac{720}{6^5}-\frac{5400}{6^5}\\ &=1-\frac{6120}{7776}\\ &=0.212963 \end{align*}


The real problem:

Most of the work has been done by the small example. If we now consider $365$ days and $r$ people we can go on straight forward.

At first we put the focus at

\begin{align*} 3\leq r \leq 365 \end{align*}

  • Let's denote with $N_{(1^{r})}$ the number of $r$-tuples with all entries pairwise different. The exponent in the index $1^{r}$ denotes the multiplicity of $1$.

We obtain \begin{align*} N_{(1^{r})}=\frac{365!}{(365-r)!}\tag{3} \end{align*}

Next we have to consider $r$-tuples containing $k$ pairs of tuples, with $1\leq k\leq \lfloor\frac{r}{2}\rfloor$ together with $r-2k$ single numbers pairwise different to all other single numbers and numbers within tuples. We denote them with $N_{(1^{r-2k}2^k)}$.

We obtain analogously to (2) \begin{align*} N_{(1^{r-2k}2^k)}&=\frac{1}{k!}\binom{r}{2}365\binom{r-2}{2}364\cdots\binom{r-2(k-1)}{2}\left(365-(k-1)\right)\\ &\qquad\cdot(365-k)\cdots((365-k)-(r-2k-1))\\ &=\frac{1}{k!}\binom{r}{2}\binom{r-2}{2}\cdots\binom{r-2(k-1)}{2}\frac{365!}{(365-(r-k))!}\\ &=\frac{1}{k!}\frac{r!}{2^k(r-2k)!}\frac{365!}{(365-(r-k))!}\tag{4} \end{align*}

Plausibility check: If we compare the expression (4) with the small example above we could do a simple check. If we set \begin{align*} r=5,k=2,\text{ and }365\rightarrow 6 \end{align*} we obtain \begin{align*} N_{(1^{5-4}2^2)}=N_{(12^2)}=\frac{1}{2!}\frac{5!}{2^2(5-4)!}\frac{6!}{(6-(5-2))!} =1800 \end{align*} in accordance with the table entry $N_{(122)}$ above.

The range $365<r\leq 2\cdot 365$

If $r>365$ it is guaranteed that at least two people have the same birthday. It follows $$N_{(1^r)}=0$$ We see after short consideration, the approach (4) for a group of people with $1\leq k\leq \lfloor r\rfloor$ distinct pairs is also valid up to $r\leq 730$.


Now it's time to harvest. In order to find the number of all $r$-tuples we have to exclude, we sum up $N_{(1^r)}$ and $N_{(1^{r-2k}2^k)}$ for $1\leq k \leq \lfloor\frac{r}{2}\rfloor$.

In fact, inspired by the comment of @robjohn, we can also respect all other values of $r$ in the same formula, if we take $\frac{1}{n!}=0$ for $n<0$ and finally obtain

The probability $p$ that at least three people from a group of $r\geq 1$ people have the same birthday is according to (3) and (4)

\begin{align*} p&=1-\frac{1}{365^r}\left(N_{(1^r)}+\sum_{k=1}^{\lfloor\frac{r}{2}\rfloor}N_{(1^{r-2k}2^k)}\right)\\ &=1-\frac{365!}{365^r}\left(\frac{1}{(365-r)!} +r!\sum_{k=1}^{\lfloor\frac{r}{2}\rfloor}\frac{1}{2^kk!(r-2k)!(365-(r-k))!}\right)\\ &=1-r!\frac{365!}{365^r}\sum_{k=0}^{\lfloor\frac{r}{2}\rfloor}\frac{1}{2^kk!(r-2k)!(365-(r-k))!} \end{align*}


There are three possible situations: either 1. everyone has a different birthday, 2. there are 1 or more pairs of birthdays, but no triplets, 3. there is any triplet. The three situations add up to all possible situations.

The calculations for each of the cases:

  1. You have the possibility to have all different birthdays: $365 \choose r$
  2. You have the possibility that there is any pair that has the same birthday, this can be calculated with inclusion/exclusion method, taking into account that there can be $1,2,3,..,(r/2)$ pairs. Please invest the time to fully understand the inclusion/exclusion method. Counting the combinations with $(a,b)$ having the same birthday regardless of $c, d$, has to exclude counting twice $(c,d)$ having the same birthday (though different from $(a,b)$.
  3. Final solution: ${(365^r - 1. - 2.)} / {365^r}$

More elaborate discussion on: Probability of 3 people in a room of 30 having the same birthday

Where you obviously have to generalize to $r$ iso 30. (And take the second answer, the first is only an estimation)

Note: Maybe it is more clear when you have 6 people throwing dice. What is the probability that no 3 throw the same number?

There can be 0..3 pairs of pairs. Extend this solution to 365 sided dice, and $r$ people.