What is the intuition behind the formula for the average?
Why is the average of $n$ numbers given by $(a+b+c+\cdots)/n$? I deduced the formula for the average of 2 numbers, which was easy because it's also the midpoint, but I couldn't do it for more than 2 numbers.
Solution 1:
Suppose all of us gathered here in this room take all the money out of our pockets and put it on the table, and then we divide it among us in such a way that we all have the same amount. The total amount is still the same. Then the amount we each have is the average. That's what averages are.
The total amount is $a+b+c+\cdots$. The number of us gathered here is $n$. So the amount that each of us gets is $(\text{total}/n).$
Solution 2:
In most contexts, what passes for an 'average' can be thought of this way: if you replace a collection of separate instances with their 'average', you get the same result.
The usual mean comes from thinking this way for addition: if you have numbers $a_1,\ldots,a_n$, their sum is $a_1+\cdots+a_n$. If you replaced all of them with their mean $\mu$, you should also get $a_1+\cdots+a_n$.
Therefore $\mu$ must satisfy $$ n\mu=a_1+\cdots+a_n, $$ leading to the formula you've seen.
As another example: doing the same thing but for multiplication leads to the geometric mean $\sqrt[n]{a_1a_2\cdots a_n}$.
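This replacement property is easy to check numerically. Here is a minimal Python sketch (with made-up numbers): replacing every number by the arithmetic mean preserves the sum, and replacing every number by the geometric mean preserves the product.

```python
import math

nums = [2.0, 8.0, 9.0]  # illustrative values
n = len(nums)

# Arithmetic mean: replacing every number by it preserves the sum.
mu = sum(nums) / n
assert math.isclose(mu * n, sum(nums))

# Geometric mean: replacing every number by it preserves the product.
g = math.prod(nums) ** (1 / n)
assert math.isclose(g ** n, math.prod(nums))

print(mu, g)
```

The same pattern yields other means: pick an operation, and ask which single value could stand in for all the data without changing the combined result.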
Solution 3:
Here is a slightly different perspective on what Nick and Michael have already said: the average of $n$ numbers $x_i$ is the unique number $\mu$ such that the sum of the deviations $x_i-\mu$ is zero.
Starting from this characteristic property it is easy to derive the formula: $\sum (x_i-\mu)=0$ means $\sum x_i=n\mu$, so
$$\mu=\frac{1}{n}\sum x_i$$
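The zero-deviation property can also be verified directly. A minimal Python sketch, with made-up data:

```python
nums = [3.0, 7.0, 8.0, 14.0]  # illustrative values
mu = sum(nums) / len(nums)    # the arithmetic mean

# The deviations x_i - mu cancel out exactly (up to floating-point rounding).
deviations = [x - mu for x in nums]
print(sum(deviations))
```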
A closely related characterization comes from statistics. Suppose we want to find the "number of best fit" for our data points $x_i$. To find this number, we first need to say what counts as "best."
One popular choice is to measure the "error" of our best-fit "approximation" using a quadratic "cost function." More formally, finding "the number of best fit" amounts to finding the number $m$ that minimizes the sum of the squared errors
$$SSE=\sum (x_i-m)^2$$
If you know any calculus (or simple multivariable geometry) you can easily prove that this function is minimized precisely when $m$ is the average of the $x_i$. In this sense, the average is the minimizer of squared errors.
If instead of measuring error by the sum of the quadratic deviations $(x_i-m)^2$ we use the sum of absolute deviations $|x_i-m|$, the minimizer is the median rather than the average.
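Both characterizations can be checked numerically without any calculus. The sketch below (plain Python, made-up data) scans a grid of candidate values of $m$ and confirms that the squared-error cost bottoms out near the mean, while the absolute-deviation cost bottoms out near the median:

```python
nums = [1.0, 2.0, 4.0, 9.0, 10.0]       # illustrative values
mean = sum(nums) / len(nums)
median = sorted(nums)[len(nums) // 2]   # middle element (odd-length sample)

def sse(m):
    """Sum of squared errors around m."""
    return sum((x - m) ** 2 for x in nums)

def sad(m):
    """Sum of absolute deviations around m."""
    return sum(abs(x - m) for x in nums)

# Scan candidates in steps of 0.01 and keep the minimizer of each cost.
candidates = [i / 100 for i in range(0, 1101)]
best_sse = min(candidates, key=sse)
best_sad = min(candidates, key=sad)

print(best_sse, best_sad)  # close to the mean and the median, respectively
```

The grid search is deliberately naive; it just makes the two minimizers visible without invoking derivatives.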
In fact, other types of means (the geometric mean, the harmonic mean, etc.) can be understood using this same framework. See the Wikipedia page on Fréchet means.
Solution 4:
If Bill Gates walked into a crowded bar, then on average, everyone there would be a millionaire.
Loosely, an average is supposed to be a representative value for a sample. Sort of. But as you can see, that needn't always be the case.
What the average definitely is, every time, is this: if what we collectively have were distributed equally among all of us, the average is what each of us would get.
What we collectively have: $a_1+a_2+\cdots+a_n$
How many of us are there: $n$
What each one would get: $\frac{\text{total}}{\text{number of people}} = \frac{a_1+a_2+\cdots+a_n}{n} = \textbf{average}$
And that's why everyone becomes a millionaire, on average: Bill Gates simply has that much. The moral is that outliers can sometimes skew the average and make it unreliable as a representative value. Other times, everyone really does have about that much.
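The Bill Gates effect is easy to see in a toy sketch (Python, purely illustrative numbers): one outlier drags the mean far from the typical value, while the median barely moves.

```python
# Ten bar patrons with modest, hypothetical net worths (dollars).
patrons = [40_000] * 10

def mean(xs):
    return sum(xs) / len(xs)

def median(xs):
    s = sorted(xs)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2

# Bill Gates walks in (the net worth figure is purely illustrative).
with_gates = patrons + [100_000_000_000]

print(mean(patrons), median(patrons))        # both 40000
print(mean(with_gates), median(with_gates))  # mean explodes; median stays 40000
```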
PS: Call it the arithmetic mean instead of the average. Also, read the answer by @symplectomorphic; it has an interesting (and often very useful) take on how to think of an arithmetic mean.