Intuition for why the average minimizes the sum of squared Euclidean distances

I saw a question with a very similar (if not identical) flavour to mine, but its answer used derivatives, the method I already knew solves this problem.

I feel it is "obvious" which value minimizes the sum of squared Euclidean distances to a set of points, i.e. which $z$ minimizes:

$$\sum^{k}_{i=1}\|x_i - z\|^2$$

I know the solution can be obtained with derivatives and that $z = \frac{\sum^k_{i=1} x_i}{k}$, but even before I tried solving it that way, it seemed "obvious" that this was the answer. Solving it with derivatives is certainly a correct approach, but it felt like overkill for such a simple problem.
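As a quick sanity check of that claim, here is a small numerical sketch (NumPy, arbitrary data; an illustration rather than an argument): the objective evaluated at the average is never beaten by nearby candidates.

```python
# Numerical sketch: the objective sum_i ||x_i - z||^2 evaluated at the average
# of the x_i is never larger than at randomly perturbed candidates.
# (Arbitrary data; an illustration of the claim, not a proof.)
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(7, 3))        # k = 7 points in R^3

def objective(z):
    """Sum of squared Euclidean distances from the points x_i to z."""
    return np.sum(np.linalg.norm(x - z, axis=1) ** 2)

z_bar = x.mean(axis=0)             # the claimed minimizer: the average

for _ in range(1000):
    z = z_bar + rng.normal(scale=0.5, size=3)
    assert objective(z_bar) <= objective(z)
print("objective at the average:", objective(z_bar))
```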

I was wondering if anyone had an intuitive argument for this solution. It just seems so obvious, and there is a way of doing it rigorously, but I am more interested in whether someone knows intuitively why that has to be the solution. Maybe there isn't one, but I am curious whether someone has an alternative view of the problem/solution.

Thanks in advance!


Solution 1:

Here are two ways of viewing it. The second may be (for some people?) more "intuitive":

First way: for any number $m$, write $x_i - z = (x_i - m) + (m - z)$ and expand the square: $$ \sum_{i=1}^k (x_i - z)^2 = \sum_{i=1}^k \Big((x_i - m)^2 + 2(x_i-m)(m-z) + (m-z)^2\Big). $$

In the sum of the middle term, $\displaystyle\sum_{i=1}^k 2(x_i-m)(m-z)$, the factor $2(m-z)$ does not depend on the index $i$, i.e. does not change as $i$ goes from $1$ to $k$, hence this sum is $\displaystyle 2(m-z)\sum_{i=1}^k (x_i-m)$.

That sum is $0$ if and only if $m=\bar x = (x_1+\cdots+x_k)/k$, so take $m = \bar x$ from now on.

In the last term, $\displaystyle\sum_{i=1}^k (m-z)^2$, the whole expression $(m-z)^2$ does not change as $i$ goes from $1$ to $k$, so it's a sum of $k$ terms that are all equal; hence it is $k(m-z)^2$.

Therefore $$ \sum_{i=1}^k (x_i-z)^2 = k(\bar x - z)^2 + \sum_{i=1}^k (x_i-\bar x)^2. $$ Since $z$ appears only in the first term of this last expression, the value of $z$ that minimizes the whole expression is the value that minimizes the first term, namely $z = \bar x$.
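For readers who like to check algebra numerically, here is a small sketch (NumPy, scalar data as in the sums above, arbitrary values) confirming the identity for random inputs:

```python
# Numerical check of the identity
#   sum_i (x_i - z)^2 = k*(xbar - z)^2 + sum_i (x_i - xbar)^2
# for scalar x_i and an arbitrary candidate z.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=10)            # k = 10 scalar observations
z = rng.normal()                   # an arbitrary candidate value
xbar = x.mean()

lhs = np.sum((x - z) ** 2)
rhs = len(x) * (xbar - z) ** 2 + np.sum((x - xbar) ** 2)
print(lhs, rhs)                    # agree up to rounding error
assert np.isclose(lhs, rhs)
```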

That's one way to show that the least-squares estimate of the population mean is the sample mean.

Second way:

But now let's look at it geometrically: $$ \begin{bmatrix} x_1 \\ \vdots \\ x_k \end{bmatrix} = \begin{bmatrix} \bar x \\ \vdots \\ \bar x \end{bmatrix} + \begin{bmatrix} x_1 - \bar x \\ \vdots \\ x_k - \bar x \end{bmatrix}. $$ The first term on the right is the orthogonal projection of the vector on the left onto the one-dimensional subspace of $\mathbb R^k$ spanned by $(1,\dotsc,1)$. The second term on the right is the orthogonal projection of the same vector onto the complementary $(k-1)$-dimensional subspace. The vector in a subspace that is nearest, in Euclidean distance, to a vector not in the subspace is the orthogonal projection onto the subspace; so the constant vector $(z,\dotsc,z)$ nearest to $(x_1,\dotsc,x_k)$ is the one with $z=\bar x$.
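As a concrete illustration of that projection picture (a numerical sketch with NumPy; the data are arbitrary): projecting $(x_1,\dotsc,x_k)$ onto the span of the all-ones vector gives the constant vector whose entries are $\bar x$, and the residual is orthogonal to it.

```python
# Sketch: the orthogonal projection of (x_1, ..., x_k) onto the span of the
# all-ones vector is the constant vector (xbar, ..., xbar); the residual
# (x_i - xbar) is orthogonal to that span.
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=6)                     # a vector in R^k, k = 6
ones = np.ones_like(x)

proj = (x @ ones) / (ones @ ones) * ones   # <x, u> / <u, u> * u with u = ones
residual = x - proj

print(proj)                                # every entry equals x.mean()
print(np.dot(residual, ones))              # ~ 0: residual is orthogonal to ones
assert np.allclose(proj, x.mean())
```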

Solution 2:

The following sketch captures one possible intuition for the situation.

  1. We may assume $z$ lies in the affine hull of the $x_i$. (Indeed, replacing $z$ with its image under orthogonal projection onto that affine subspace reduces each of the distances $\|x_i-z\|$.) That is, we assume $z = \sum_i \lambda_i x_i$, where $\sum_i \lambda_i = 1$.
  2. Each function $z\mapsto\|x_i-z\|^2$ is strictly convex (indeed, its graph is a paraboloid), so their sum $z\mapsto\sum_i\|x_i-z\|^2$ is also strictly convex.
  3. The objective function is symmetric under permutation of the $x_i$: the candidate solutions $z=\sum_i\lambda_i x_i$ and $z^\sigma = \sum_i \lambda_{\sigma(i)} x_i$, where $\sigma$ is a permutation of $\{1,\dotsc,k\}$, have the same value of the objective function. If $z\ne z^\sigma$, then the point $\frac12(z+z^\sigma)$ is a better solution than both $z$ and $z^\sigma$, since the objective function is strictly convex. So the optimal solution must be invariant under such permutations, that is, its $\lambda_i$ are all equal. Since they also sum to $1$, each $\lambda_i = 1/k$, and the minimizer is $z = \frac1k\sum_i x_i$: the average.
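To see the conclusion numerically, here is a small sketch (NumPy, arbitrary data): among combinations $z=\sum_i\lambda_i x_i$ with $\sum_i\lambda_i=1$, the equal weights $\lambda_i = 1/k$, i.e. the average, do at least as well as any other sampled choice of weights.

```python
# Numerical sketch of the conclusion: among combinations z = sum_i lambda_i x_i
# with sum_i lambda_i = 1, the equal weights lambda_i = 1/k (the average of the
# x_i) give the smallest objective value among the sampled candidates.
import numpy as np

rng = np.random.default_rng(3)
k = 5
x = rng.normal(size=(k, 2))                # k points in the plane

def objective(z):
    """Sum of squared Euclidean distances from the points x_i to z."""
    return np.sum(np.linalg.norm(x - z, axis=1) ** 2)

z_equal = np.full(k, 1.0 / k) @ x          # equal weights: the average

for _ in range(1000):
    lam = rng.dirichlet(np.ones(k))        # random weights summing to 1
    assert objective(z_equal) <= objective(lam @ x)
print("objective with equal weights:", objective(z_equal))
```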