How often does it happen that the oldest person alive dies?

Solution 1:

I think that this question is best approached through careful modelling, rather than pure mathematics. Here's the approach I took. I don't claim that this is the perfect approach by any means, but it's a start.

Spoiler: My simulations give a rate of approximately once every 0.66 years, for a population of 7 billion people who share US mortality statistics.

First, I took the US mortality tables from the Centers for Disease Control. They only go up to age 100, so I needed to extrapolate beyond that. I fitted a power law to the hazard rate $h(a)$, which gives the probability of dying between age $a$ and $a+1$, getting

$$h(a) = 3.54 \times 10^{-15} \times a^{6.933}$$

I assume $h(a)=1$ in the case that my power law gives me a number above 1. This occurs at $a=122$, which seems realistic (the oldest person to ever live died at age 122).
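As a quick check of the fitted formula and the cap, here is a minimal R sketch. The function name `hazard` and the age range scanned are my own choices for illustration, not part of the original simulation.

```r
# Fitted power-law hazard from the text, capped at 1 (name `hazard` is illustrative).
hazard <- function(a) pmin(1, 3.54e-15 * a^6.933)

# Yearly death probabilities at a few advanced ages.
hazard(c(100, 110, 121, 122))

# First integer age at which the cap applies.
cap_age <- min(which(hazard(0:150) >= 1)) - 1   # -1 because ages start at 0
cap_age                                          # 122, as stated in the text
```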

I then simulated an evolving population until it converged on a stable distribution. I assume $N(a)$ people at age $a$, and a constant birth rate of $9\times 10^7$ people every year (chosen to give a stable population of 7 billion). The result is a reasonable-looking population pyramid:

[Figure: population pyramid of the simulated stable population]

Now that I have a stable population, I simulate again. For each age $a$, the number of people of age $a$ in year $t$ is the number of people who were aged $a-1$ at time $t-1$ and didn't die, i.e.

$$N(t,a) = (1-h(a-1)) \times N(t-1, a-1)$$
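The update rule above, iterated with constant births, can be sketched deterministically as follows. This is only a sketch under a loud assumption: it applies the fitted power law at every age, whereas the original used the actual CDC table up to age 100. The names `step`, `N` and `stable` are illustrative.

```r
# ASSUMPTION: power-law hazard used for all ages (the original used CDC tables below 100).
hazard <- function(a) pmin(1, 3.54e-15 * a^6.933)

ages   <- 0:122
births <- 9e7          # constant yearly births, as in the text

# One yearly step: N(t, a) = (1 - h(a - 1)) * N(t - 1, a - 1), with N(t, 0) = births.
step <- function(N) c(births, head(N, -1) * (1 - hazard(head(ages, -1))))

# Start from an empty population and iterate until every cohort has been filled.
N <- rep(0, length(ages))
for (year in 1:200) N <- step(N)

# The fixed point in closed form: births times cumulative survival to each age.
stable <- births * cumprod(c(1, 1 - hazard(head(ages, -1))))

isTRUE(all.equal(N, stable))   # the iteration has converged to the closed form
sum(N)                         # total population implied by these assumptions
```

Because each cohort depends only on the births of its own birth year, the iteration settles exactly once 122 years have passed; the extra iterations are harmless.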

When appropriate I approximate the number of deaths with the normal distribution, but for small populations I use the binomial distribution. In the case that there are some deaths in the highest age bracket, I calculate the probability that the person who died was the oldest person in the world at that time, and record this as an event.
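A minimal sketch of that death-count step is below. The function name `draw_deaths` and the cutoff of 1000 are hypothetical choices for illustration; the original does not say where it switches between the two distributions. Note that the normal approximation takes the standard deviation $\sqrt{np(1-p)}$, not the variance.

```r
# Draw the number of deaths among n people (n a whole number) who each die
# with probability p this year. The cutoff is an ARBITRARY illustrative value.
draw_deaths <- function(n, p, cutoff = 1000) {
  if (n < cutoff || p <= 0 || p >= 1) {
    # Small groups (or degenerate p): exact binomial draw.
    return(rbinom(1, size = n, prob = p))
  }
  # Large groups: normal approximation with mean n*p and sd sqrt(n*p*(1-p)).
  d <- round(rnorm(1, mean = n * p, sd = sqrt(n * p * (1 - p))))
  min(max(d, 0), n)   # clamp to the feasible range [0, n]
}

set.seed(1)                 # arbitrary seed, for reproducibility only
draw_deaths(12, 0.5)        # binomial branch
draw_deaths(1e6, 0.01)      # normal branch
```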

Taking the total number of events, and dividing by the number of years that I run the simulation for, gives an approximate rate. The punchline is that in my simulation, I see 15,234 events in 10,000 years, for an approximate rate of once in every 0.66 years.

Assuming a population of one billion people (the population of the developed world, to which the US mortality statistics are most likely to apply), we get the following histogram of the age of the oldest person in the world at the time they die. Comparing with the Wikipedia page on the oldest people, it looks as though the numbers are too high by 1-2 years, but otherwise I'm surprised at how accurate this crude model is!

[Figure: histogram of the age at death of the oldest living person, for a population of one billion]

One final chart. This shows how the number of deaths of the oldest living person each year varies as a function of the total population. Roughly, it seems to be linear in the logarithm of the population. I'd be interested to see a more rigorous mathematical treatment that can derive this result.

[Figure: yearly number of oldest-person deaths as a function of total population]

Edit: I corrected a bug which was causing me to estimate the rate as too high. I was approximating the binomial $B(n,p)$ by a normal distribution with $\sigma=np(1-p)$ rather than $\sigma^2=np(1-p)$.

Edit no. 2: It was pointed out in the comments that I had another bug, and I also realized that I wasn't ever checking for the possibility that more than 1 'oldest person' dies in a given year.

Solution 2:

The Gerontology Research Group keeps records.

I brought this up on the GRG, and Louis Epstein posted the table "CHRONOLOGICAL OLDEST LIVING LISTED PERSONS (Since 1955)". I extracted the final column (death dates), formatted it, and computed the intervals between successive death dates. My reasoning: if the Oldest Person In The World who died in 1955 is succeeded by a person who died in 1956, then an observer in 1955 would wait ~1 year for the new Oldest Person to die. The mean interval between deaths turns out to be about 1.2 years (454 days), but the median wait is only about 0.74 years (270 days)! The gap between the two seems to be due in large part to the astounding lifespan of Jeanne Calment, as you will see on the interval graph shortly:

deaths <- as.Date(c("1955-10-24","1959-06-25","1961-02-10","1964-12-30",
                    "1965-08-06","1966-01-10","1968-03-21","1968-06-16",
                    "1970-01-11","1973-02-27","1973-08-18","1973-10-31",
                    "1975-05-31","1976-11-16","1977-12-02","1978-04-25",
                    "1981-01-22","1981-03-09","1982-11-13","1983-10-13",
                    "1985-02-16","1986-10-21","1987-02-02","1987-12-27",
                    "1988-01-11","1997-08-04","1998-04-16","1999-12-30",
                    "2000-11-02","2001-06-06","2002-03-18","2003-10-31",
                    "2003-11-13","2004-05-29","2006-08-27","2006-12-11",
                    "2007-01-24","2007-01-28","2007-08-13","2008-11-26",
                    "2009-01-02","2009-09-11","2010-05-02","2010-11-04",
                    "2011-06-21","2012-12-04","2012-12-17"))

plot(deaths)

Plot of the death dates (the year of each successive death):

[Figure: plot of the death dates in chronological order]

R> intervals <- as.numeric(diff(deaths))  # days between successive deaths
R> intervals
 [1] 1340  596 1419  219  157  801   87  574 1143  172   74  577  535  381  144 1003   46  614  334
[20]  492  612  104  328   15 3493  255  623  308  216  285  592   13  198  820  106   44    4  197
[39]  471   37  252  233  186  229  532   13
R> summary(intervals)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
      4     147     270     454     588    3490
R> # the monstrous 3493 outlier (printed as 3490 above because this version of
R> # summary() rounds to four significant digits) is Jeanne Louise Calment:
R> # her predecessor Florence Knapp died in 1988, she in 1997

plot(intervals)

Graph of the size of each successive interval between the death of the previous Oldest Person and the death of the replacement Oldest Person:

[Figure: plot of the intervals, in days, between successive oldest-person deaths]

I leave it to you guys to compare the empirical observations with the mathematical models. What's interesting to me is that, eyeballing it, there seems to be a roughly linear downward trend: the Oldest Person is dying faster over time. This could be due to data collection improving over time (if the real Oldest Person is 110 but you only found some 105 piker, then your fake Oldest Person will probably live longer than the real Oldest Person did). Or it could be a more interesting phenomenon: medicine is improving or the population is growing, so there are many more centenarians, the Oldest Person is ever closer to the extreme of what is possible, and naturally slips off the cliff that much quicker. (From this perspective, Calment is even more of an extraordinary outlier.)
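One way to go beyond eyeballing is a simple regression of the wait on the calendar year. The sketch below is my own illustration, not an analysis from the original answer: the variable names `end_year`, `fit` and `keep` are hypothetical, and an ordinary least-squares line is a crude tool for 46 skewed intervals, so treat any slope it reports with caution.

```r
deaths <- as.Date(c("1955-10-24","1959-06-25","1961-02-10","1964-12-30",
                    "1965-08-06","1966-01-10","1968-03-21","1968-06-16",
                    "1970-01-11","1973-02-27","1973-08-18","1973-10-31",
                    "1975-05-31","1976-11-16","1977-12-02","1978-04-25",
                    "1981-01-22","1981-03-09","1982-11-13","1983-10-13",
                    "1985-02-16","1986-10-21","1987-02-02","1987-12-27",
                    "1988-01-11","1997-08-04","1998-04-16","1999-12-30",
                    "2000-11-02","2001-06-06","2002-03-18","2003-10-31",
                    "2003-11-13","2004-05-29","2006-08-27","2006-12-11",
                    "2007-01-24","2007-01-28","2007-08-13","2008-11-26",
                    "2009-01-02","2009-09-11","2010-05-02","2010-11-04",
                    "2011-06-21","2012-12-04","2012-12-17"))
intervals <- as.numeric(diff(deaths))    # days between successive deaths

# Waits expressed in years (365.25 days per year).
median(intervals) / 365.25               # 270 days, about 0.74 years
mean(intervals) / 365.25

# Linear trend of the wait against the year in which the wait ended.
end_year <- as.numeric(format(deaths[-1], "%Y"))
fit <- lm(intervals ~ end_year)
coef(fit)                                # slope is in days of wait per calendar year

# Same fit without the single largest interval (Jeanne Calment's),
# to see how much one outlier drives the slope.
keep <- intervals < max(intervals)
coef(lm(intervals[keep] ~ end_year[keep]))
```

Comparing the two slopes gives a rough sense of whether the apparent trend survives once Calment is set aside.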

Solution 3:

Super-simple answer, depending only on some crude properties of mortality rates:

It appears that at very advanced ages, mortality rate varies only slowly with age and is on the order of 50% per year. (If this is wrong, everything else in this answer is wrong. On the other hand, if it's right then it's all we need.)

It will basically always be the case that the oldest person alive is very old.

Therefore, at any given time the death-of-oldest-person process is approximately Poisson with rate 1/2 per year, so typical gaps between these events will be on the order of 2 years.

[EDITED to add: A bit of Wikipedia-grovelling suggests that recent oldest-person deaths have been: Dina Manfredini, 2012-12; Besse Cooper, 2012-12; Maria Gomes Valentim, 2011-06; Eunice Sanborn, 2011-01; Eugenie Blanchard, 2010-10. That's a lot more frequent, which suggests that mortality accelerates pretty rapidly at extreme ages. I'm puzzled because Chris Taylor's simulation gives results comparable to mine despite having that feature, even after he fixed his bug; perhaps he has another? :-)]


This is only weakly sensitive to population size. (In a population of size N, the median age of the oldest person will be roughly the 1-log(2)/N quantile of the age distribution, and given the assumption above this doesn't vary much between, say, N=10^3 and N=10^6.)
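The quantile quoted above follows in one line if the $N$ ages are treated as independent draws from a common distribution function $F$ (an assumption of this sketch). The median $a$ of the oldest age satisfies

$$P(\text{oldest age} \le a) = F(a)^N = \tfrac{1}{2} \quad\Longrightarrow\quad F(a) = 2^{-1/N} = e^{-\log(2)/N} \approx 1 - \frac{\log 2}{N},$$

where the last step uses $e^{-x} \approx 1 - x$ for small $x$, which is accurate for any realistic $N$.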

It is sensitive in the obvious way to variation in that asymptotic mortality rate: if it's p then typical gaps between oldest-person deaths will be 1/p. (So, in particular, smallish errors in estimating p lead to smallish errors in estimating the time between oldest-person deaths.)
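The $1/p$ claim can be illustrated with a short simulation. This is a sketch under the stated assumption of a constant yearly death probability $p$, counting time in whole years; the seed and sample size are arbitrary, and `wait_years` is an illustrative name.

```r
set.seed(42)     # arbitrary seed, for reproducibility only
p <- 0.5         # ASSUMED constant yearly death probability at extreme ages

# rgeom() counts the years survived before the year of death; adding 1 gives
# the number of years waited until the current oldest person dies.
wait_years <- rgeom(1e5, prob = p) + 1

mean(wait_years)   # close to 1 / p
```

Because a constant death probability makes the remaining lifetime memoryless, the same waiting-time distribution applies to whoever becomes the oldest person next, which is why the age of the current record holder does not enter.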


On the other hand, if mortality rate turns out to vary rapidly with age beyond some threshold (e.g., some malicious god kills everyone once they reach 120 or something) then the analysis above could be very wrong. In the extreme case where a malicious god kills everyone once they reach some easily-reachable age, the oldest-person death rate is just the birth rate times the probability of reaching the age in question.

Solution 4:

This problem is similar to finding the distribution of a stochastic process at a certain time. In that case you need to solve a PDE for which you need some initial condition in order to find a unique solution.

So, the solution of this problem involves the knowledge of the distribution of ages for the living people in a given population at time $t_0$. This means that we know at this time how many people of age 1, age 2, age 3 ..., age 122,... we have in our population.

Notation:

  • ${}_xP_n$: probability that a person of age $n$ survives $x$ more years (lives up to age $x+n$);
  • ${}_xQ_n$: probability that a person of age $n$ dies within the next $x$ years (dies before age $x+n$);
  • $E_n$: life expectancy of a person of age $n$ (how many more years she will live on average);
  • $l_n$: number of people of age $n$ at time $t_0$;
  • $h$: highest age for which ${}_1P_h$ is positive.

Now, the person $Y$ born at time $t_0$ is the youngest person alive. The probability that this person is the oldest person alive at some future time $t_x$ (where $t_x=t_0+x$) can be calculated as the product of the probability that $Y$ lives up to age $x$ and the probabilities that the people having ages $n_1,n_2,\dots$ at time $t_0$ live less than (die before) $x+n_1,x+n_2,\dots$:
\begin{equation}
O_x={}_xP_0 \prod_{n=1}^h{}_{x+n}Q_0^{l_n }
\end{equation}
We compute $O_x$ for $x\in\{1,2,\dots,h \}$ to obtain the probabilities that $Y$ is the oldest person for various values of $x$.

Now, at a general time $t_x$, the person $Y$ is the oldest person with probability $O_x$ and her life expectancy is $E_x$. Given the formula for the life expectancy, $E_x=\sum_{t=1}^h {}_tP_x$, we can say that, on average, the oldest person at a general time $t_x$ will live:
\begin{equation}
E_x \cdot O_x=\sum_{t=1}^h {}_tP_x \cdot \Big[{}_xP_0 \prod_{n=1}^h{}_{x+n}Q_0^{l_n }\Big]
\end{equation}

Note that this solution doesn't guarantee that, at time $t_x$, the person $Y$ is the only person of age $x$; that is, $Y$ need not be the unique oldest person. To take uniqueness into consideration we have to multiply the relation above by the probability that the other people born at $t_0$ die before $t_x$.

Solution 5:

An approach would be to assume:

  • a non-changing (period) life table and hence
  • a distribution of the age of a random person.

Let this distribution be called $F$, so the density of the oldest person of a population of $N$ being $t$ years should be $\frac{d}{dt}F(t)^N$. With the constant life table one can calculate the conditional density of having to wait $t'$ until the oldest person dies, given the oldest person is now $t$ years old. Let this density be called $g$. With this the expected time until an oldest person dies would be:

$$\int_0^\omega \frac{d}{dt}F(t)^N \int_t^\omega g(t') \cdot t' \, dt' \, dt$$

If one does not want to impose an ultimate age $\omega$, one can set it to infinity. Perhaps this does not quite answer your question, as it is the expected waiting time from an arbitrary moment and is not conditioned on the oldest person just having died, but it might point others to the right solution.
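For reference, the density of the oldest age that appears in the integral above can be expanded by the chain rule. This assumes, as a sketch, that the $N$ ages are independent draws from $F$ and that $F$ has a density $f = F'$:

$$\frac{d}{dt}F(t)^N = N\,F(t)^{N-1}\,f(t).$$

For large $N$ the factor $F(t)^{N-1}$ is negligible except where $F(t)$ is very close to 1, so nearly all of the weight in the outer integral comes from the extreme upper tail of the life table.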