I conclude that the result in the question is accurate, but the method by which it is obtained is questionable.

The questionable aspect is the use of $\lvert\mathcal{A}_1\rvert$ and $\lvert\mathcal{A}_2\rvert$, the cardinalities of the sets $\mathcal{A}_1$ and $\mathcal{A}_2$. The usual way to use cardinalities of sets of outcomes in order to compute the probability of a desirable event ("success") is to partition the probability space into some number $D$ of equally likely events such that each event is either completely within the favorable outcomes or completely within the unfavorable outcomes. The favorable outcomes are then represented by a set of events $\mathcal{A}$, and the probability of success is then simply $\lvert\mathcal{A}\rvert/D.$

When working on the birthday paradox under the usual assumptions ($365$ possible birthdays, any person equally likely to be born on any of those dates), for a group of $n$ persons the probability space can be partitioned into $365^n$ equally-likely events, one for each possible combination of birthdays of the $n$ distinct persons in the group. You can count up the number of events in which no two persons share a birthday, which is $^{365}P_n,$ and the probability of no match is therefore $^{365}P_n / 365^n.$
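As a quick numerical check of the counting argument above, here is a minimal Python sketch. The function name `p_no_match_365` is my own choice for illustration; it simply evaluates $^{365}P_n / 365^n$ using the standard library's `math.perm`.

```python
from math import perm

def p_no_match_365(n: int) -> float:
    """Probability that n people all have distinct birthdays,
    assuming 365 equally likely days (the usual birthday problem)."""
    # perm(365, n) counts the assignments with n distinct days;
    # 365**n counts all equally likely assignments.
    return perm(365, n) / 365**n

for n in (22, 23, 24):
    print(n, p_no_match_365(n))
```

The probability of no match drops below $\frac12$ between $22$ and $23$ people, which is the familiar result.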

In the question, however, the probability space has not been partitioned into a set of equally likely events. Instead, the probability is computed with a denominator $365.25^n$ which is not even an integer.

There are, however, at least three ways to fix this. One way (which is detailed in another answer) is to consider each person to be equally likely to have a birthday on every day in a four-year period. I present two more methods below.

The basis of each of these methods is the assumption that each day other than February $29$ is equally likely, whereas February $29$ is $\frac14$ as likely as any other particular day, and these $366$ possibilities cover the entire probability space. The total probability of all events is therefore $365.25$ times the probability of a birthday on January $1$. Since the total must be $1$, the probability of a birthday on January $1$ (or on any other particular ordinary day) is $1/365.25$, while the probability of a birthday on February $29$ is one quarter of that, or $0.25/365.25.$ The probability space is therefore partitioned into $366$ events whose probabilities add up to $1$ as required.
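The bookkeeping for this $366$-event model can be checked with exact arithmetic. The following is only a sketch; the variable names (`DAYS`, `p_regular`, `p_leap`) are mine, and $365.25$ is written as the fraction $1461/4$ so that no rounding occurs.

```python
from fractions import Fraction

DAYS = Fraction(1461, 4)          # 365.25, kept exact
p_regular = 1 / DAYS              # probability of any one ordinary day
p_leap = Fraction(1, 4) / DAYS    # probability of February 29

# 365 ordinary days plus one leap day should cover the whole space.
total = 365 * p_regular + p_leap
print(p_regular, p_leap, total)   # prints: 4/1461 1/1461 1
```

Seen this way, the model is the same as giving each ordinary day $4$ chances out of $1461$ and February $29$ a single chance out of $1461$.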

First alternative method

One typical way to approach the usual ($365$-day) problem is via conditional probabilities. The probability of no match is the product of conditional probabilities $P_k$ that the $k$th person will have a birthday distinct from the previous $k-1$ persons, given that the first $k-1$ persons all have birthdays distinct from each other. If we define $B_0$ equal to the entire probability space, and for $k > 0$ we define $B_k$ as the event that the first $k$ persons all have distinct birthdays, then $P_k = \mathbb P(B_k \mid B_{k-1})$ and the probability of no match in the entire group is $$ \prod_{k=1}^n \mathbb P(B_k \mid B_{k-1}). $$

Adapting this approach to the leap-year version of the problem, let $A_0$ be the event that all $n$ persons have distinct birthdays not including February $29.$ If $B_0$ is the entire probability space, and for $k > 0$ we define $B_k$ as the event that the first $k$ persons all have distinct birthdays not including February $29,$ then $$ \mathbb P(A_0) = \prod_{k=1}^n \mathbb P(B_k \mid B_{k-1}). $$

Now let $A_m$ be the event that all persons in the group were born on distinct days and the $m$th person was born on February $29.$ Let $C_0$ be the entire probability space, let $C_1$ be the event that the first person is born on February $29,$ and for $k > 1$ define $C_k$ as the event that the first person was born on February $29$ and the next $k-1$ persons all have distinct birthdays not including February $29.$ Then $$ \mathbb P(A_1) = \prod_{k=1}^n \mathbb P(C_k \mid C_{k-1}). $$

By symmetry, $\mathbb P(A_1) = \mathbb P(A_2) = \cdots = \mathbb P(A_n).$ Since the events $A_0, A_1, \ldots, A_n$ are mutually exclusive, the probability that all $n$ persons in the group are born on distinct days is

$$ \mathbb P(A_0) + \mathbb P(A_1) + \cdots + \mathbb P(A_n) = \mathbb P(A_0) + n \mathbb P(A_1). $$

Now let's compute $\mathbb P(A_0).$ We suppose that a person has a $1/365.25$ probability to be born on any particular day other than February $29,$ and a $0.25/365.25$ probability to be born on February $29.$ The probability that the first person's birthday is not February $29$ is therefore $\mathbb P(B_1 \mid B_0) = 365/365.25.$ More generally, for $k > 0,$ given that the first $k-1$ persons all have distinct birthdays not including February $29,$ the probability that the $k$th person has a birthday distinct from any of the previous $k-1$ and not on February $29$ (that is, the conditional probability that the first $k$ persons have distinct birthdays not including February $29$) is $\mathbb P(B_k \mid B_{k-1}) = (365 - (k - 1))/365.25.$ Therefore

\begin{align} \mathbb P(A_0) &= \prod_{k=1}^n \frac{365 - (k - 1)}{365.25} \\ &= \frac{365 \cdot 364 \cdot 363 \cdot \cdots \cdot (365 - (n - 1))}{365.25^n} \\ &= \frac{^{365}P_n}{365.25^n}. \end{align}

Next let's compute $\mathbb P(A_1).$ The probability that the first person's birthday is February $29$ is $\mathbb P(C_1 \mid C_0) = 0.25/365.25.$ For $k > 1,$ given that the first person is born on February $29$ and the first $k-1$ persons all have distinct birthdays, the probability that the $k$th person has a birthday distinct from any of the previous $k-1$ and not on February $29$ (that is, the conditional probability that the first $k$ persons have distinct birthdays and the first one was born on February $29$) is $\mathbb P(C_k \mid C_{k-1}) = (365 - (k - 2))/365.25.$ Therefore

\begin{align} \mathbb P(A_1) &= \frac{0.25}{365.25} \prod_{k=2}^n \frac{365 - (k - 2)}{365.25} \\ &= \frac{0.25}{365.25} \cdot \frac{365 \cdot 364 \cdot \cdots \cdot (365 - (n - 2))}{365.25^{n-1}} \\ &= \frac{0.25 \cdot {}^{365}P_{n-1}}{365.25^n}. \end{align}

In conclusion,

$$ \mathbb P(A_0) + n \mathbb P(A_1) = \frac{^{365}P_n}{365.25^n} + \frac{n \cdot 0.25 \cdot {}^{365}P_{n-1}}{365.25^n}, $$ the same result that was given in the question. But the fact that we can use the permutation symbol here seems to be coincidental, since permutations of $n$ persons or $n-1$ persons were not involved in any part of the derivation of this result.
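To see that the chain of conditional probabilities really collapses to the closed form, one can multiply the factors one at a time and compare. This is a minimal sketch under the model assumed in this answer; the function names are hypothetical and $365.25$ is represented exactly as $1461/4$.

```python
from fractions import Fraction
from math import perm

DAYS = Fraction(1461, 4)  # 365.25 as an exact fraction

def p_distinct_sequential(n: int) -> Fraction:
    """P(A_0) + n * P(A_1), built one conditional probability at a time."""
    p_a0 = Fraction(1)
    for k in range(1, n + 1):
        p_a0 *= (365 - (k - 1)) / DAYS      # P(B_k | B_{k-1})
    p_a1 = Fraction(1, 4) / DAYS            # P(C_1 | C_0)
    for k in range(2, n + 1):
        p_a1 *= (365 - (k - 2)) / DAYS      # P(C_k | C_{k-1})
    return p_a0 + n * p_a1

def p_distinct_closed_form(n: int) -> Fraction:
    """The closed form written with permutation counts (valid for n >= 1)."""
    return (perm(365, n) + n * Fraction(1, 4) * perm(365, n - 1)) / DAYS**n

# Exact equality of the two expressions for a range of group sizes.
print(all(p_distinct_sequential(n) == p_distinct_closed_form(n)
          for n in range(1, 31)))
print(float(p_distinct_sequential(23)))
```

Because the comparison uses exact fractions, agreement here is not a floating-point coincidence.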

Second alternative method

A second alternative method also uses conditional probabilities, but not so many of them, and also uses a counting argument involving permutations.

Again we consider two possible kinds of event: $A_0,$ which occurs when all $n$ persons have distinct birthdays from each other and none was born on February $29$; and $A_k$ for $k\geq 1,$ which occurs when the $k$th person is born on February $29$ and the other $n-1$ have distinct birthdays from each other, none of which is February $29.$

The probability of $A_0$ is the probability that none of the $n$ persons was born on February $29,$ which is $(365/365.25)^n,$ times the conditional probability that no two of them share a birthday, given that none was born on February $29.$

The conditional probability is simply the probability of no matching birthdays among $n$ persons in the ordinary $365$-day birthday problem, which we already know is $^{365}P_n / 365^n$ by counting the number of favorable combinations of birthdays among the $365^n$ possible combinations. So $$ \mathbb P(A_0) = \left(\frac{365}{365.25}\right)^n \times \frac{^{365}P_n}{365^n} = \frac{^{365}P_n}{365.25^n}. $$

For $A_1$, the probability is the probability that the first person was born on February $29,$ which is $0.25/365.25,$ times the probability that none of the other $n-1$ persons was born on February $29,$ times the conditional probability that the remaining $n-1$ persons were all born on different days, given that none of them was born on February $29.$

Similarly to the previous case, the conditional probability is simply the probability of $n-1$ distinct birthdays in the ordinary $365$-day birthday problem, which is $^{365}P_{n-1} / 365^{n-1}.$ So $$ \mathbb P(A_1) = \frac{0.25}{365.25} \left(\frac{365}{365.25}\right)^{n-1} \times \frac{^{365}P_{n-1}}{365^{n-1}} = \frac{0.25 \cdot {}^{365}P_{n-1}}{365.25^n}. $$

Therefore the final answer is

$$ \mathbb P(A_0) + n \mathbb P(A_1) = \frac{^{365}P_n}{365.25^n} + \frac{n \cdot 0.25 \cdot {}^{365}P_{n-1}}{365.25^n}. $$

In this method, the fact that we can write parts of the formula using permutations is no coincidence; we actually derived those parts of the formula by counting permutations.
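The formula can also be checked by brute force on a scaled-down calendar. The sketch below is my own illustration, not part of the derivation: it assumes a toy year with `d` ordinary days plus one special day that is a quarter as likely, enumerates every assignment of days to `n` people, and sums the probabilities of those with no repeats. The generalized formula in `formula` simply replaces $365$ by `d`.

```python
from fractions import Fraction
from itertools import product
from math import perm

def brute_force(d: int, n: int) -> Fraction:
    """Sum the probabilities of all assignments with n distinct days.
    Days 0..d-1 are ordinary; day d plays the role of February 29."""
    total_weight = d + Fraction(1, 4)
    day_prob = [1 / total_weight] * d + [Fraction(1, 4) / total_weight]
    result = Fraction(0)
    for days in product(range(d + 1), repeat=n):
        if len(set(days)) == n:          # no two people share a day
            p = Fraction(1)
            for day in days:
                p *= day_prob[day]
            result += p
    return result

def formula(d: int, n: int) -> Fraction:
    """(dPn + n * 0.25 * dP(n-1)) / (d + 0.25)^n, valid for n >= 1."""
    return (perm(d, n) + n * Fraction(1, 4) * perm(d, n - 1)) / (d + Fraction(1, 4))**n

print(brute_force(5, 3) == formula(5, 3))  # prints: True
```

Full enumeration is only feasible for small `d` and `n`, but it exercises exactly the two kinds of event ($A_0$ and the $A_k$) used above.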


The answer below is mildly flawed, because I originally computed $N_1$ incorrectly. This caused me to doubt the OP's answer. With (my) flaw corrected, the OP's math looks good, and my only qualms about the OP's work are:

  1. Having a sample space of 366 elements is not really a good approach, since the elements are not equally likely.

  2. As indicated by the comment of J Moravitz, it is best to work with integers, rather than (for example) a denominator of $(365.25)^n.$

  3. Another minor issue is that while the OP's math is correct, he approached it by examining the probability of various events occurring. In problems like this, I think that it is better to approach it as

$$\frac{\text{Number of pertinent possibilities}}{\text{Number of total possibilities}}.$$

Anyway, given the original flaw in my thinking, I've left the flawed work intact, with the edit-patch pasted on as an example of my moderately going off the rails.


No, it's not correct. In my opinion, the first comment of J Moravitz should be followed. Therefore, your notion of a sample space of $\mathcal{D}^n$, where $\mathcal{D} = 366$ is not tenable.

Superficially examining your math, it looks like your math is (somehow) consistent with the idea of $\mathcal{D} = 365.25.$ Therefore, although your math seems hard to follow, it might be correct; it's hard for me to tell.

I would have approached it as follows:

Probability of no two people out of $n$ sharing a birthday is

$$\frac{N\text{(umerator)}}{D\text{(enominator)}}$$

where $D = (1461)^n.$

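To calculate $N$ you must consider two possibilities: exactly one of the people is born on Feb 29, or none of the people is born on Feb 29. Here each person's birthday is one of the $1461$ equally likely days in a four-year cycle, so each date other than Feb 29 corresponds to $4$ of those days.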

Case-1
Without loss of generality, you can assume that Person-1 was born on Feb 29, and none of the other $(n-1)$ people were.

The number of ways that this can occur is therefore

$$N_1 = 1 \times (1460) \times (1456) \times \cdots \times (1468-4n).$$

Edit
Just realized that this is wrong. Because the denominator is $(1461)^n$ the consistent enumeration of the numerator requires that order be regarded as important. Therefore, I can not assume that Person-1 was the person born on Feb 29. Therefore, the above computation of $N_1$ must be multiplied by the $\binom{n}{1}$ scalar.

The intuition is therefore: there are $\binom{n}{1}$ ways of choosing which person was born on Feb 29. Having chosen this person, you can then line up the other $(n-1)$ people in a row, regarding them as Person-2, Person-3, ..., Person-n.

Then, there are $1460$ possible days to assign to Person-2. Then, once this assignment is made, there are $1456$ possible days to assign to Person-3, and so forth. This type of enumeration is consistent with the enumeration of $D = (1461)^n$, where (for example) "Person-A on Jan 1, Person-B on Jan 2" is deemed distinct from "Person-A on Jan 2, Person-B on Jan 1". So the revised enumeration is:

$$N_1 = \binom{n}{1} \times (1460) \times (1456) \times \cdots \times (1468-4n).$$

Case-2
No one was born on Feb. 29.

The number of ways that this can occur is therefore

$$N_2 = (1460) \times (1456) \times (1452) \times \cdots \times (1464-4n).$$

Final Answer:

$$\frac{N_1 + N_2}{D}.$$
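As a sanity check, the integer count can be compared with the $365.25^n$ form of the result. This is a sketch only; the function names are mine, $N_1$ is the corrected version including the $\binom{n}{1}$ factor, and everything is done in exact arithmetic.

```python
from fractions import Fraction
from math import perm, prod

def p_distinct_integer_count(n: int) -> Fraction:
    """(N_1 + N_2) / D on the 1461-day (four-year) sample space."""
    d = 1461**n
    # N_2: nobody on Feb 29 -> 1460 * 1456 * ... * (1464 - 4n)
    n2 = prod(1460 - 4 * j for j in range(n))
    # N_1: choose who is on Feb 29, then 1460 * 1456 * ... * (1468 - 4n)
    n1 = n * prod(1460 - 4 * j for j in range(n - 1))
    return Fraction(n1 + n2, d)

def p_distinct_365_25(n: int) -> Fraction:
    """The same probability written with a 365.25^n denominator (n >= 1)."""
    return (perm(365, n) + n * Fraction(1, 4) * perm(365, n - 1)) / Fraction(1461, 4)**n

print(all(p_distinct_integer_count(n) == p_distinct_365_25(n)
          for n in range(1, 41)))  # prints: True
```

The two agree because multiplying numerator and denominator of the $365.25^n$ form by $4^n$ turns each factor $365 - j$ into $1460 - 4j$ and $365.25^n$ into $1461^n$.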


The two earlier answers are correct both in their analysis and when they say the effect of including a leap-day is small.

It is worth checking that the effect is not big enough to change the standard answer of $23$ to the question of the smallest number of people for which the probability of no matches is below $\frac12$. Using the following R code to calculate this for $365$ days and $22,23,24$ people, we get

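```r
# Probability that no two of `people` share a birthday. The integer part of
# `daysinyear` is the number of ordinary days; the fractional part (if any)
# is the relative likelihood of one extra leap day.
probnomatch <- function(people, daysinyear){
  (prod((daysinyear %/% 1) - (0:(people-1))) +
   prod((daysinyear %/% 1) - (0:(people-2))) * people * (daysinyear %% 1)) /
     daysinyear ^ people
  }
probnomatch(22, 365)
# 0.5243047
probnomatch(23, 365)
# 0.4927028
probnomatch(24, 365)
# 0.4616557
```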

which is the standard birthday problem result, with the probability falling below $\frac12$ when there are $23$ people.

Increasing the average number of days in a year to $365.25$ gives

```r
probnomatch(22, 365.25)
# 0.5247236
probnomatch(23, 365.25)
# 0.493135
probnomatch(24, 365.25)
# 0.4620987
```

which is similar, and leaves $23$ as the median. Using the current estimate of $365.24217$ mean solar days in a tropical year, or the average of $365.2425$ days across a $400$-year cycle of the Gregorian calendar, would also leave $23$ unchanged. This is not surprising, as a year of $366$ days also leaves it unchanged:

```r
probnomatch(22, 366)
# 0.5252494
probnomatch(23, 366)
# 0.493677
probnomatch(24, 366)
# 0.4626536
```

The switch would happen when there were about $372.34695$ days in a year:

```r
probnomatch(23, 372.34695)
# 0.5
```

These calculations do not take into account seasonal or other effects on particular dates of birth in the year, which would tend to reduce the median number of people or leave it the same.