Is the average of the averages equal to the average of all the numbers originally averaged?
I am tempted to say yes because of the following pseudo-proof (I say pseudo-proof because I am not convinced):
$$ \frac{\frac{w+x}{2}+\frac{y+z}{2}}{2}=\frac{w+x}{4}+\frac{y+z}{4}=\frac{w+x+y+z}{4} $$
Is this proof enough or am I completely wrong? If I am not wrong but this is not proof, what would be a good proof?
Edit:
I guess the following proves otherwise:
$$ \frac{w+x+y+z}{4} \neq \frac{\frac{w+x+y}{3}+z}{2} $$
That would disprove my original statement by counterexample rather than by contradiction, since the two sides differ for most choices of $w, x, y, z$.
$1,1,1,2,2$
Their average is $\frac{7}{5}$.
But if you split them into $1,1,1$ and $2,2$ and average the averages, you get $\frac{1+2}{2}=\frac{3}{2}$, which is a different result.
But what you said works whenever you split into two equal-sized sets, and if the number of numbers is a power of $2$ you can keep halving this way all the way down. Interestingly, this observation was used by Cauchy to give an inductive proof of the $\text{AM} \ge \text{GM}$ inequality!
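The counterexample above can be checked with a few lines of Python. This is only an illustrative sketch: the helper `mean` and the variable names are made up for this example, and `fractions.Fraction` is used so the results are exact rather than rounded floats.

```python
from fractions import Fraction

def mean(values):
    """Exact arithmetic mean of a non-empty collection of numbers."""
    values = list(values)
    return Fraction(sum(values), len(values))

# Hypothetical names for the two unequal-sized groups from the example.
group_a = [1, 1, 1]
group_b = [2, 2]

overall = mean(group_a + group_b)                      # average of all five numbers
mean_of_means = mean([mean(group_a), mean(group_b)])   # average of the two averages

print(overall)        # 7/5
print(mean_of_means)  # 3/2

# With two equal-sized groups the two quantities agree.
equal_a, equal_b = [1, 1], [2, 2]
print(mean(equal_a + equal_b) == mean([mean(equal_a), mean(equal_b)]))  # True
```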
The correct answer is that it depends. The average of averages is guaranteed to equal the average of all values in two cases:
- if all groups have the same number of elements; or
- if all the group averages are equal (for example, the trivial case when they are all zero).

For two groups, these are the only cases in which the two agree.
As hinted in the comments, your proof is not enough to answer the question, because your calculation only covers the particular case in which the sets have the same size (which is how you arrived at the incorrect conclusion). You need to generalize the math to account for all cases. Below is my attempt at it.
Consider two sets $X = \{x_1, x_2, ..., x_n\}$ and $Y = \{y_1, y_2, ..., y_m\}$ and their averages:
$$ \bar{x} = \frac{\sum_{i=1}^{n}{x_i}}{n} \,,\, \bar{y} = \frac{\sum_{i=1}^{m}{y_i}}{m} $$
The average of the averages is:
$$ \operatorname{average}(\bar{x}, \bar{y}) = \frac{\frac{\sum_{i=1}^{n}{x_i}}{n} + \frac{\sum_{i=1}^{m}{y_i}}{m}}{2} = \frac{\sum_{i=1}^{n}{x_i}}{2n} + \frac{\sum_{i=1}^{m}{y_i}}{2m} $$
Now consider the whole group $Z = \{x_1, x_2, ..., x_n, y_1, y_2, ..., y_m\}$ and its average:
$$ \bar{z} = \frac{\sum_{i=1}^{n}{x_i} + \sum_{i=1}^{m}{y_i}}{n + m}$$
In general (that is, for arbitrary $n$, $m$ and arbitrary values), these two averages differ:
$$ \frac{\sum_{i=1}^{n}{x_i}}{2n} + \frac{\sum_{i=1}^{m}{y_i}}{2m} \ne \frac{\sum_{i=1}^{n}{x_i} + \sum_{i=1}^{m}{y_i}}{n + m} $$
This is why the average of averages usually gives the wrong answer.
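To see exactly when the two quantities differ, one can subtract them. Writing the sums as $\sum_{i=1}^{n} x_i = n\bar{x}$ and $\sum_{i=1}^{m} y_i = m\bar{y}$, which follows directly from the definitions of $\bar{x}$ and $\bar{y}$, a short calculation gives:

$$ \frac{\bar{x} + \bar{y}}{2} - \frac{n\bar{x} + m\bar{y}}{n + m} = \frac{(n+m)(\bar{x}+\bar{y}) - 2n\bar{x} - 2m\bar{y}}{2(n+m)} = \frac{(m - n)(\bar{x} - \bar{y})}{2(n + m)} $$

So, for two groups, the gap is zero precisely when $n = m$ or $\bar{x} = \bar{y}$, and otherwise it grows with both the size imbalance and the distance between the group averages.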
However, if we make $n = m$, we have:
$$ \frac{\sum_{i=1}^{n}{x_i}}{2n} + \frac{\sum_{i=1}^{n}{y_i}}{2n} = \frac{\sum_{i=1}^{n}{x_i} + \sum_{i=1}^{n}{y_i}}{2n} $$
This is why the average of averages is equal to the average of the whole group when the groups have the same size.
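The second case is also straightforward: if $\bar{x} = \bar{y}$, then $\bar{z} = \frac{n\bar{x} + m\bar{y}}{n + m} = \bar{x}$ and $\operatorname{average}(\bar{x}, \bar{y}) = \bar{x}$ as well, whatever the group sizes. The trivial instance is $\bar{x} = \bar{y} = \operatorname{average}(\bar{x}, \bar{y}) = 0$.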
Note that the above reasoning can be extended for any number of groups.
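As a rough illustration with more than two groups, the sketch below compares the plain average of averages with a size-weighted one. Weighting each group average by its group size is the standard way to recover the overall average, since $\bar{z} = \frac{n\bar{x} + m\bar{y}}{n+m}$ in the notation above. The function names and the sample data are invented for this example.

```python
from fractions import Fraction

def mean(values):
    """Exact arithmetic mean of a non-empty collection of numbers."""
    values = list(values)
    return Fraction(sum(values), len(values))

def plain_mean_of_means(groups):
    """Unweighted average of the group averages."""
    return mean(mean(g) for g in groups)

def weighted_mean_of_means(groups):
    """Average of the group averages, each weighted by its group size."""
    total = sum(len(g) for g in groups)
    return sum(Fraction(len(g), total) * mean(g) for g in groups)

# Hypothetical data: three groups of different sizes.
groups = [[1, 1, 1], [2, 2], [4]]
everything = [v for g in groups for v in g]

print(mean(everything))                # 11/6
print(weighted_mean_of_means(groups))  # 11/6
print(plain_mean_of_means(groups))     # 7/3
```

The weighted version matches the overall average regardless of group sizes, while the plain version matches only in special cases such as equal-sized groups.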
See also the answers to this similar question: Why is an average of an average usually incorrect?