Arithmetic mean. Why does it work?

Solution 1:

The simplest way to explain arithmetic mean is in terms of "equal sharing":

Abe has 12 cookies, Brianna has 8 cookies, and Chuck has 7 cookies. If they were to redistribute them so that they all have the same amount, how many would each get?

Obviously the way you answer this question is to find the total number of cookies ($12 + 8 + 7 = 27$) and then divide the cookies equally among the three people ($27/3 = 9$). That's precisely what the computation of the arithmetic mean does.

Is that what you're looking for?


Edited to add:

Here's another viewpoint that might help. We would like to find some number $N$ that is in the "middle" of the set $\{12, 8, 7\}$ (using the same numbers from the example above). What does "in the middle" mean? Well, one way to interpret this vague phrase is to imagine that we already had such an $N$ in hand, and we compute the three quantities $12-N$, $8-N$, $7-N$. These three quantities tell us how far $N$ is from each of the three data points -- call these the "deviations".

What if we made a bad choice of $N$? For example, if each of the three deviations were positive, then that would mean that $N$ is smaller than each of the three original numbers, which we don't want. If each of the three deviations were negative, then that would mean that $N$ is larger than each of the three original numbers -- again bad. For $N$ to be in the middle, we would want some of the deviations to be positive and some of them to be negative. In fact, if we could choose $N$ so that the positive deviations exactly cancel out the negative deviations, then we will feel like we've really found the "middle".

Let's translate that now into a computation. We want to find $N$ such that $$(12-N) + (8-N) + (7-N) = 0$$ If you now think about what it would take to solve this equation, you will quickly realize that you end up adding the three numbers in your dataset together and then dividing by 3.
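
To make the cancellation concrete, here is a minimal Python sketch using the cookie counts from the example. The helper name `deviations` and the "bad" guess of 5 are only illustrative choices, not part of the original argument.

```python
from fractions import Fraction

def deviations(data, n):
    """Return the signed deviations x - n for each x in data."""
    return [x - n for x in data]

cookies = [12, 8, 7]

# Solving (12 - N) + (8 - N) + (7 - N) = 0 gives 27 - 3N = 0,
# i.e. N = (12 + 8 + 7) / 3.
N = Fraction(sum(cookies), len(cookies))

balanced = deviations(cookies, N)    # deviations from the mean
too_small = deviations(cookies, 5)   # illustrative bad guess: all deviations positive

print(N)               # 9
print(sum(balanced))   # 0  (the deviations 3, -1, -2 cancel)
print(sum(too_small))  # 12 (nothing cancels)
```

With $N = 9$ the positive deviation $3$ is exactly offset by $-1$ and $-2$; with the guess $5$ every deviation is positive, so the sum cannot be zero.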

Solution 2:

Let's take a step back. Forget you ever learned about the arithmetic mean.


Let's say you have a list of numbers. A natural question is: what is the center of this list?
To answer that, you have to ask yourself: what is a "center" in the first place?
Why, for example, is 9 not the center of the numbers {1, 2, 4, 8}?

If you think about it for a while, you will realize that the center of a list of numbers is the number $\bar x$ whose total distance from all the numbers $x_k$ in the list is minimum.
So that means you want to minimize $\sum_k \lVert x_k - \bar x \rVert$.

But how do you define $\lVert x \rVert$? A natural definition is $|x|$.
When you define it like that, you get $\bar x = $ the median. Why? Try a simple example on a piece of paper to see it visually: the left and right side penalties cancel at the median.
Also notice that when there is an even number of elements, any value in the interval bounded by the two middle elements is "a median". However, by taking an upper limit, you can find a single value rather than an interval -- which in this case is 8/3.
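
If you'd rather check this numerically than on paper, here is a rough Python sketch for the list {1, 2, 4, 8} from above. The grid of candidate centers (steps of 0.5 from 0 to 10) is an arbitrary choice made for illustration.

```python
def total_abs_distance(data, center):
    """Sum of |x_k - center| over the list."""
    return sum(abs(x - center) for x in data)

data = [1, 2, 4, 8]

# Candidate centers 0.0, 0.5, ..., 10.0 (a coarse, illustrative grid).
candidates = [i / 2 for i in range(21)]
costs = {c: total_abs_distance(data, c) for c in candidates}

best = min(costs.values())
minimizers = [c for c in candidates if costs[c] == best]

print(best)        # 9.0
print(minimizers)  # [2.0, 2.5, 3.0, 3.5, 4.0]
print(costs[9.0])  # 21.0
```

Every candidate between the two middle elements 2 and 4 gives the same minimal total distance, which is the "interval of medians" mentioned above, while 9 does far worse.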

But you can also define $\lVert x \rVert =|x|^2$. In that case, you get $\bar x = $ the arithmetic mean.

Why is this the arithmetic mean? The formula for this should explain:
If you have $\bar x = \arg \min_x \sum_k |x_k - x|^2$, then you can set its derivative to zero:

$$\frac{d}{d\bar x}\sum_k |x_k - \bar x|^2 = 0$$ $$-\sum_k 2 (x_k - \bar x) = 0$$ $$\sum_{k=1}^n x_k = n \bar x$$ $$\bar x = \frac{1}{n} \sum_{k=1}^n x_k$$

Notice that this is exactly the arithmetic mean.
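
As a sanity check on the derivation, here is a small Python sketch that searches a grid for the minimizer of the squared distances and compares it with the mean. The list {1, 2, 4, 8} is reused from above; the grid spacing of 1/4 is an assumption picked so that the mean lands on a grid point.

```python
from fractions import Fraction

def total_squared_distance(data, center):
    """Sum of (x_k - center)^2 over the list."""
    return sum((x - center) ** 2 for x in data)

data = [1, 2, 4, 8]
mean = Fraction(sum(data), len(data))

# Candidate centers 0, 1/4, 2/4, ..., 10, kept as exact fractions.
candidates = [Fraction(i, 4) for i in range(41)]
best = min(candidates, key=lambda c: total_squared_distance(data, c))

print(mean, best)                          # 15/4 15/4
print(total_squared_distance(data, best))  # 115/4
print(sum(x - mean for x in data))         # 0, the derivative condition
```

The last line is the same condition as $\sum_k (x_k - \bar x) = 0$ in the derivation.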

This is exactly why the arithmetic mean is a poor measure of central tendency: it penalizes deviations quadratically rather than linearly, so values far from the rest pull it strongly toward themselves.
However, it's easy to compute (try the same thing for the median to see what I mean), and has the nice property that (by definition) multiplying it by $n$ gives you the total sum.
So people use it anyway, even when it's not the right choice.
But when is it the right choice?
It's the right choice when you're looking for the "average" dependent variable rather than the "average" independent variable, so to speak.
For example, if you're looking at the wealth of the average person, then you need to look at the median wealth. This is -- by definition -- useful for understanding how wealthy the average person is. But if you're trying to understand what's happening to the wealth itself rather than the people -- i.e., you want to know the average wealth of a person -- then you need to look at the mean wealth.

Now what if we go further? We've tried $\lVert x \rVert = |x|^1$ (the median) and $\lVert x \rVert = |x|^2$ (the mean).

What if we try $\lVert x \rVert = |x|^0$? If we do, we get back the mode, assuming we define $0^0$ to be $0$ (we have to take a limit here to see what happens). With that convention, $|x_k - \bar x|^0$ is $0$ when $x_k = \bar x$ and $1$ otherwise, so the sum simply counts the values that differ from $\bar x$.
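
Here is a minimal Python sketch of the exponent-0 case. The list {1, 2, 2, 4, 8} is a made-up example (the earlier list has no repeated value), and `zero_power` is a hypothetical helper that hard-codes the $0^0 = 0$ convention.

```python
def zero_power(x):
    """|x|**0 with the convention 0**0 == 0: 0 if x == 0, else 1."""
    return 0 if x == 0 else 1

def total_mismatch(data, center):
    """Sum of |x_k - center|**0, i.e. how many values differ from center."""
    return sum(zero_power(x - center) for x in data)

data = [1, 2, 2, 4, 8]  # illustrative data with one repeated value

# Only values in the list can match anything, so they are the only
# candidates worth trying.
costs = {c: total_mismatch(data, c) for c in sorted(set(data))}
best = min(costs, key=costs.get)

print(costs)  # {1: 4, 2: 3, 4: 4, 8: 4}
print(best)   # 2
```

The candidate with the fewest mismatches is the value that occurs most often, which is the mode.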

What if we try $\lVert x \rVert = |x|^\infty$? In this case we get back the midpoint -- that is, the average of the minimum and the maximum values (again, we have to take a limit to see what happens).
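
A rough Python sketch of the limiting case: as the exponent grows, the largest deviation dominates the sum, so the sketch below assumes the limit amounts to minimizing the single worst distance. The list {1, 2, 4, 8} and the grid spacing of 1/4 are illustrative choices.

```python
from fractions import Fraction

def worst_distance(data, center):
    """Largest |x_k - center|, which dominates as the exponent grows."""
    return max(abs(x - center) for x in data)

data = [1, 2, 4, 8]
midpoint = Fraction(min(data) + max(data), 2)

# Candidate centers 0, 1/4, 2/4, ..., 10, kept as exact fractions.
candidates = [Fraction(i, 4) for i in range(41)]
best = min(candidates, key=lambda c: worst_distance(data, c))

print(midpoint, best)              # 9/2 9/2
print(worst_distance(data, best))  # 7/2
```

Only the minimum and the maximum matter here; the values 2 and 4 have no influence on the result.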

It should make sense why all of these are said to measure "central tendency". :)

Solution 3:

Graph of the Arithmetic, Geometric and Harmonic Means


The Arithmetic and Geometric Mean within a Semi Circle


Arithmetic Mean

In the above image, the arithmetic mean is the midpoint of the total sum because the sum is divided by two. In other words, it is half of the total sum because there are only two values in the sum.

In general, the arithmetic mean divides the total sum into equal parts, regardless of how different each value is. This is represented mathematically as $$ \overline{x}=\frac{1}{n}\sum_{k=1}^n x_k=\frac{1}{n}x_1+\frac{1}{n}x_2+\dots +\frac{1}{n}x_n $$ So for example, if we want to find the arithmetic mean of $\{3, 60, 900\}$, then $$ \overline{x}=\frac{1}{3}\sum_{k=1}^3 x_k=\frac{1}{3}\cdot 3+\frac{1}{3}\cdot 60+\frac{1}{3}\cdot 900=321 $$ where $321$ represents one third of the value of the sum. Also notice that in cases such as this one, the arithmetic mean can be heavily influenced by one value that is much larger or smaller than the rest. For this reason, the arithmetic mean is not considered a robust statistic.
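
To see the lack of robustness in numbers, here is a short Python sketch using the standard `statistics` module. The second list, where 900 is replaced by 9000, is a made-up variation for illustration, and the median is shown only as a point of comparison.

```python
from statistics import mean, median

values = [3, 60, 900]
print(mean(values))    # 321
print(median(values))  # 60

# Hypothetical variation: make the largest value ten times larger.
stretched = [3, 60, 9000]
print(mean(stretched))    # 3021
print(median(stretched))  # 60
```

Changing a single value moves the mean from 321 to 3021, while the middle value stays at 60.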

Weighted Mean

The weighted mean for the values $\{x_1, x_2, \dots, x_n\}$ and the weights $\{w_1, w_2, \dots, w_n\}$ is expressed mathematically as $$ \overline{x}=\frac{\sum_{k=1}^n w_kx_k}{\sum_{k=1}^n w_k}=\frac{w_1x_1+w_2x_2+\cdots +w_nx_n}{w_1+w_2+\cdots +w_n} $$ So for the weighted mean of $\{3, 60, 900\}$ with weights $\left\{\frac{6}{9}, \frac{2}{9}, \frac{1}{9}\right\}$, we have $$ \overline{x}=\frac{\frac{6}{9}\cdot 3+\frac{2}{9}\cdot 60+\frac{1}{9}\cdot 900}{\frac{6}{9}+ \frac{2}{9}+ \frac{1}{9}}=2+13.\overline{3}+100=115.\overline{3} $$ Notice that the arithmetic mean is the special case of the weighted mean in which every value has an equal weight of $\frac{1}{n}$. As seen in the above example, the selected weights can have a huge impact on the result.
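
The same computation can be sketched in Python with exact fractions. The function name `weighted_mean` is just an illustrative choice, not a library function.

```python
from fractions import Fraction

def weighted_mean(values, weights):
    """Sum of w_k * x_k divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

values = [3, 60, 900]

# Weights from the example above.
weights = [Fraction(6, 9), Fraction(2, 9), Fraction(1, 9)]
print(weighted_mean(values, weights))  # 346/3, i.e. 115.333...

# Equal weights of 1/n recover the ordinary arithmetic mean.
equal = [Fraction(1, 3)] * 3
print(weighted_mean(values, equal))    # 321
```

Because the formula divides by the sum of the weights, the weights do not need to add up to 1; any equal weights give 321.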