Why is the inverse of an average of numbers not the same as the average of the inverse of those same numbers?

I have a set of numbers (in my case: mean retention time (MRT) in the stomach (h)) of which I want to calculate the average gastric passage rate (/h). Gastric passage rate = 1/MRT.

My question is why 'the average of the calculated gastric passage rates of those numbers' is not the same as 'the calculated gastric passage rate of the averaged MRTs'. The next question is: what is the right way?

So for example:

$x = 5,\ 10,\ 4,\ 2$. Average $= 5.25$ h $\Rightarrow 1/5.25 = 0.19$/h

$1/x = 0.2,\ 0.1,\ 0.25,\ 0.5$. Average $= 0.26$/h

So should I first take the average of the MRTs and then take the inverse to get the gastric passage rate (first way), or should I first take the inverse of each number to get the individual gastric passage rates and then take the average of those rates (second way)?

Thanks in advance!

Solution 1:

Here's an everyday puzzle that may help.

If you travel from here to there at $30$ miles per hour and back at $60$ miles per hour, what is your average speed? Instinct says it should be the average, which would be $45$ miles per hour.

But speed is (total distance)/(total time). You don't have a distance given, but you can make one up. Suppose your destination was $60$ miles away. Then it took you $2$ hours to get there and $1$ to get back. You drove $120$ miles in $3$ hours so your average speed was $40$ miles per hour.

The moral of the story is that you can't naively average averages, and a rate is an average. So be careful when you have to compute an average rate.

In your case your MRT is like the reciprocal of the speed, whose units are hours/mile. In my example those are $2$ hours per $60$ miles for the slow trip and $1$ hour per $60$ miles for the fast return. You can average those to get the average number of hours per mile. The average is $1.5$ hours per $60$ miles. The reciprocal is $60$ miles per $1.5$ hours, or $40$ miles per hour.
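The round-trip arithmetic can be checked with a short Python sketch. The function name `round_trip_average_speed` is made up for this illustration; `mean` and `harmonic_mean` come from Python's standard `statistics` module. It shows that total distance over total time matches the harmonic mean of the two speeds, not their arithmetic mean, and that the made-up distance does not matter.

```python
from statistics import harmonic_mean, mean


def round_trip_average_speed(distance, speed_out, speed_back):
    """Average speed over an out-and-back trip: total distance / total time."""
    time_out = distance / speed_out    # hours spent going
    time_back = distance / speed_back  # hours spent returning
    return (2 * distance) / (time_out + time_back)


speeds = [30, 60]  # miles per hour, as in the puzzle

avg_speed = round_trip_average_speed(60, *speeds)  # 120 miles in 3 hours
naive_average = mean(speeds)                       # the "instinct" answer
harmonic = harmonic_mean(speeds)                   # reciprocal of the mean reciprocal

print(avg_speed, naive_average, harmonic)
```

Changing the first argument (the assumed distance) leaves `avg_speed` unchanged, because the distance cancels in the ratio.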

So this is right:

take the average of the MRTs and then take the inverse for calculating the gastric passage rate (first way)
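As a quick check, here are both calculations applied to the four MRT values from the question. This is only a sketch: the variable names are chosen for illustration, and it assumes the constant-rate picture used in the speed analogy.

```python
from statistics import mean

mrt_hours = [5, 10, 4, 2]  # mean retention times from the question, in hours

# First way: average the MRTs, then invert.
rate_from_mean_mrt = 1 / mean(mrt_hours)

# Second way: invert each MRT, then average the resulting rates.
mean_of_rates = mean([1 / t for t in mrt_hours])

print(round(rate_from_mean_mrt, 2))  # per hour
print(round(mean_of_rates, 2))       # per hour
```

The two results reproduce the 0.19/h and 0.26/h from the question; the second is larger because the short retention times dominate once they are inverted.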

Edit in light of many comments and clarifications.

The important question is "what is the right way to average the MRT values?", not "why do these two methods differ?" or even "which of these two is right?"

The answer depends on what MRT actually measures. If material moves through the gut at a constant rate then your first method is correct, as discussed above. But if material leaves the gut at a rate proportional to the amount present - that is, a fraction of the amount leaves per hour - then the process is like exponential decay. I don't know a right way to compute the average rate in that case. If you have very few values to average and they are not very different then you may be able to argue that whatever results you get are essentially independent of the way you average the rates.

Solution 2:

You've encountered a phenomenon called the AM-HM inequality, $\frac{1}{n}\sum_{i=1}^n x_i\ge\frac{n}{\sum_{i=1}^n\frac{1}{x_i}}$ for $x_i>0$, with equality iff all $x_i$ are equal. In general, functions don't commute with taking expectations; this is the example with the function $1/x$.
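The inequality is easy to probe numerically. The following sketch uses a helper named `am_hm_gap` (a name invented here) that returns the arithmetic mean minus the harmonic mean; it should be positive for unequal positive values and zero when all values coincide.

```python
from statistics import harmonic_mean, mean


def am_hm_gap(values):
    """Arithmetic mean minus harmonic mean; non-negative for positive values."""
    return mean(values) - harmonic_mean(values)


unequal = [5, 10, 4, 2]  # the MRT values from the question
equal = [3, 3, 3]        # all values identical: the equality case

print(am_hm_gap(unequal))  # strictly positive
print(am_hm_gap(equal))    # zero
```

Taking reciprocals of both sides of the inequality gives the question's observation directly: $1/\text{AM} \le 1/\text{HM}$, and $1/\text{HM}$ is exactly the average of the reciprocals.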

What you do with your data depends on what form you assume the distribution of retention times has. For example, suppose you think it has an Exponential distribution with rate parameter $\lambda$, i.e. density $\lambda\exp(-\lambda x)$ for $x\ge 0$. The statement that $1/\lambda$ is the MRT then reads $1/\lambda=\int_0^\infty\lambda x\exp (-\lambda x)\mathrm{d}x$. I don't advise trying to estimate $\lambda$ from averaged reciprocals, since the expected reciprocal $\int_0^\infty\frac{\lambda}{x}\exp (-\lambda x)\mathrm{d}x$ is infinite.

Let's suppose for the sake of argument that we estimate $1/\lambda$ as the dataset's mean retention time. This is an example of the method of moments. It can be shown that maximum likelihood estimation recommends the same estimator. Pages 3 and 4 here discuss what happens with Bayesian estimation, and it comes to much the same thing for a large sample size. So if I were you, I'd estimate $1/\lambda$ as the mean MRT, which makes the estimated rate $\lambda$ the reciprocal of the mean MRT.
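A small simulation can illustrate the point, under the assumption that retention times really are exponentially distributed. Everything here is hypothetical: the rate `true_rate = 0.2` per hour, the sample size, and the seed are chosen only for the demonstration, and `random.expovariate` from the standard library generates the draws.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
true_rate = 0.2         # hypothetical lambda, per hour (true MRT = 5 h)

# Simulated retention times; drop any exact zeros so 1/x is always defined.
samples = [rng.expovariate(true_rate) for _ in range(100_000)]
samples = [x for x in samples if x > 0]

# Method-of-moments / maximum-likelihood estimate: reciprocal of the sample mean.
sample_mean = math.fsum(samples) / len(samples)
rate_hat = 1 / sample_mean

# Average of reciprocals: dominated by the few smallest draws.
mean_reciprocal = math.fsum(1 / x for x in samples) / len(samples)

print(rate_hat)         # close to true_rate
print(mean_reciprocal)  # much larger, and it changes a lot with the seed
```

The reciprocal of the sample mean settles near the true rate, while the averaged reciprocals are driven by a handful of near-zero draws, which is what the infinite integral predicts.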

Solution 3:

One way to handle questions like this is to reduce it to the simplest possible version of the question and then examine that problem. In this case,

 Why is the inverse of the average of x and y not equal to 
 the average of the inverses of x and y?

So you are asking "Why isn't this true?" $$\dfrac{1}{\left( \dfrac{x+y}{2} \right)} = \dfrac{\left( \dfrac 1x + \dfrac 1y \right)}{2}$$

Personally, I would be surprised if the two sides turned out to be equal. But we can just solve the equation and see what happens.

\begin{align} \dfrac{1}{\left( \dfrac{x+y}{2} \right)} &= \dfrac{\left( \dfrac 1x + \dfrac 1y \right)}{2} \\ \dfrac{1}{\left( \dfrac{x+y}{2} \right)} \cdot \dfrac 22 &= \dfrac{\left( \dfrac 1x + \dfrac 1y \right)}{2} \cdot \dfrac{xy}{xy} \\ \dfrac{2}{x+y} &= \dfrac{x+y}{2xy} \\ (x+y)^2 &= 4xy \\ x^2 - 2xy + y^2 &= 0 \\ (x-y)^2 &= 0 \\ x-y &= 0 \\ x &= y \end{align}

The two sides are equal when $x = y$. Otherwise they are not.
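To see how far apart the two sides are, here is a sketch using exact rational arithmetic from Python's `fractions` module. The function names are invented for this example. Subtracting the right side from the left and simplifying gives $-\dfrac{(x-y)^2}{2xy(x+y)}$, which the code compares against for a few positive pairs.

```python
from fractions import Fraction


def inverse_of_average(x, y):
    """1 / ((x + y) / 2)"""
    return 1 / ((x + y) / 2)


def average_of_inverses(x, y):
    """(1/x + 1/y) / 2"""
    return (1 / x + 1 / y) / 2


def gap(x, y):
    """Left side minus right side, computed exactly with Fractions."""
    x, y = Fraction(x), Fraction(y)
    return inverse_of_average(x, y) - average_of_inverses(x, y)


def closed_form_gap(x, y):
    """The simplified difference: -(x - y)^2 / (2xy(x + y))."""
    x, y = Fraction(x), Fraction(y)
    return -((x - y) ** 2) / (2 * x * y * (x + y))


for x, y in [(3, 3), (2, 6), (5, 10)]:
    print(x, y, gap(x, y), closed_form_gap(x, y))
```

For positive $x$ and $y$ the gap is never positive, so the inverse of the average never exceeds the average of the inverses, and the two agree only when $x = y$.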

If the equality fails in general for two variables, it will also fail in general for more than two: any counterexample with two values can be padded with further values and the two sides will typically still differ.
