Finding a more direct way to reach $\mathbb{E} \left( \sum (X_i - \mu)^2 \right) - \mathbb{E} \left( \sum (X_i - \overline{X})^2 \right) = \sigma^2$

Solution 1:

An intuitive and possibly "completely rigorous" derivation of the result

The result in question follows from an application of the Pythagorean theorem of plane geometry.

Without loss of generality, assume that $\mu=0$ and consider a fixed point $\mathbf{x} = (x_1,x_2,\ldots, x_n)$ in $\mathbb R^n$. Define $\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i$. The set of all points $(y_1,y_2,\ldots, y_n) \in \mathbb R^n$ that satisfy $$y_1+y_2+\cdots + y_n = n\bar{x}$$ is a hyperplane $H$ in $\mathbb R^n$ that contains $\mathbf{x}$, and it is easily shown that the point in $H$ closest to the origin $\mathbf 0$ is $\bar{\mathbf{x}} = (\bar{x},\bar{x}, \ldots, \bar{x})$, and that the straight line through $\mathbf 0$ and $\bar{\mathbf{x}}$ is perpendicular to $H$.

Now, the three points $\mathbf 0$, $\bar{\mathbf{x}}$, and $\mathbf x$ define a plane, and in this plane they are the vertices of a right triangle (with the right angle at $\bar{\mathbf{x}}$). The Pythagorean theorem of plane geometry tells us that $$\sum_{i=1}^n x_i^2 = \sum_{i=1}^n \left(\bar{x}\right)^2 + \sum_{i=1}^n \left(x_i-\bar{x}\right)^2 = n\left(\bar{x}\right)^2 + \sum_{i=1}^n \left(x_i-\bar{x}\right)^2$$ or, equivalently, $$\sum_{i=1}^n x_i^2 - \sum_{i=1}^n \left(x_i-\bar{x}\right)^2 = n\left(\bar{x}\right)^2.$$

The above identity holds for all choices of the $x_i$, and in particular it holds for every realization $(x_1, x_2, \ldots, x_n)$ of the random vector $(X_1, X_2, \ldots, X_n)$, that is, $$\sum_{i=1}^n X_i^2 - \sum_{i=1}^n \left(X_i-\bar{X}\right)^2 = n\left(\bar{X}\right)^2 ~~\text{with probability } 1.$$ Therefore, assuming all the expectations exist, we have that $$E\left[\sum_{i=1}^n X_i^2\right] - E\left[\sum_{i=1}^n \left(X_i-\bar{X}\right)^2\right] = nE\left[\left(\bar{X}\right)^2\right].$$ Introducing a common mean $\mu$ for the $X_i$'s merely translates the origin to $(\mu,\mu, \ldots, \mu)$, giving

$$E\left[\sum_{i=1}^n (X_i-\mu)^2\right] - E\left[\sum_{i=1}^n \left(X_i-\bar{X}\right)^2\right] = nE\left[\left(\bar{X}-\mu\right)^2\right] = n\cdot\operatorname{var}\left(\bar{X}\right).$$
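The geometric claim itself can be checked numerically before any probability enters. Here is a minimal sketch in Python (assuming NumPy; the point and the choice $n=7$ are arbitrary), with $\mu = 0$ as in the argument above:

```python
import numpy as np

# With mu = 0: the leg from 0 to (xbar, ..., xbar) is perpendicular to the
# leg from (xbar, ..., xbar) to x, so the Pythagorean theorem applies.
rng = np.random.default_rng(4)
x = rng.uniform(-5, 5, size=7)   # an arbitrary fixed point in R^7
xbar = x.mean()

leg1 = np.full_like(x, xbar)     # vector from the origin to (xbar, ..., xbar)
leg2 = x - xbar                  # vector from (xbar, ..., xbar) to x

print(np.dot(leg1, leg2))                               # ~ 0: the legs are perpendicular
print(np.sum(x**2), np.sum(leg1**2) + np.sum(leg2**2))  # hypotenuse^2 = leg1^2 + leg2^2
```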

Notice that the result holds for all random variables with common mean $\mu$: we have not made any assumptions about independence or zero correlation or even about common variance. Now, for the special case of uncorrelated random variables with common variance $\sigma^2$, the right side of the above equality is just $$n\cdot\operatorname{var}\left(\bar{X}\right) = n\cdot\operatorname{var}\left(\frac{1}{n}\sum_{i=1}^n X_i\right) = n \cdot \frac{1}{n^2}\sum_{i=1}^n \operatorname{var}(X_i) = \sigma^2$$ giving

$$E\left[\sum_{i=1}^n (X_i-\mu)^2\right] - E\left[\sum_{i=1}^n \left(X_i-\bar{X}\right)^2\right] = \sigma^2.$$
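For a quick sanity check of this last display, here is a short Monte Carlo sketch (assuming NumPy, with i.i.d. normal draws as a convenient special case of uncorrelated variables with common variance; the values of $n$, $\mu$, $\sigma$ are arbitrary):

```python
import numpy as np

# Estimate E[sum (X_i - mu)^2] - E[sum (X_i - Xbar)^2] and compare to sigma^2.
rng = np.random.default_rng(0)
n, reps = 5, 200_000
mu, sigma = 2.0, 3.0

X = rng.normal(mu, sigma, size=(reps, n))   # each row is one realization
Xbar = X.mean(axis=1, keepdims=True)

diff = np.sum((X - mu) ** 2, axis=1).mean() - np.sum((X - Xbar) ** 2, axis=1).mean()
print(diff, sigma**2)   # both close to 9.0, up to Monte Carlo error
```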



(Previous answer: no intuition or geometry, just a simple derivation)

You have already noted that $$E\left[\sum_{i=1}^n (X_i-\mu)^2\right] = \sum_{i=1}^n E[(X_i-\mu)^2] = n\sigma^2.$$ Since $E\left[\bar{X}\right] = \mu = E[X_i]$, the quantity $Y_i = X_i-\bar{X}$ is a zero-mean random variable, and so $E\left[\left(X_i-\bar{X}\right)^2\right]$ is the variance of $Y_i$, which gives $$\begin{align} E\left[\left(X_i-\bar{X}\right)^2\right] &= \operatorname{var}(Y_i) = \operatorname{var}\left(X_i-\bar{X}\right)\\ &= \operatorname{var}\left(\frac{n-1}{n}X_i-\sum_{j\neq i}\frac{X_j}{n}\right) &\scriptstyle{\text{write as a weighted sum of the uncorrelated variables } X_i}\\ &= \left(\frac{(n-1)^2}{n^2}+ (n-1)\frac{1}{n^2}\right)\sigma^2 &\scriptstyle{\text{so that we can use the formula } \operatorname{var}\left(\sum_i a_iX_i\right) = \sum_i a_i^2\operatorname{var}(X_i)}\\ &= \frac{n-1}{n}\sigma^2 &\scriptstyle{\text{for the variance of a sum of uncorrelated random variables}} \end{align}$$ leading to $$E\left[\sum_{i=1}^n\left(X_i-\bar{X}\right)^2\right] = (n-1)\sigma^2.$$
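The intermediate claim $\operatorname{var}\left(X_i-\bar{X}\right) = \frac{n-1}{n}\sigma^2$ is also easy to check by simulation; a minimal sketch, again assuming NumPy and i.i.d. draws (a special case of uncorrelated variables):

```python
import numpy as np

# Estimate var(X_1 - Xbar) across many realizations and compare to (n-1)/n * sigma^2.
rng = np.random.default_rng(1)
n, reps, sigma = 5, 200_000, 3.0

X = rng.normal(0.0, sigma, size=(reps, n))
Y1 = X[:, 0] - X.mean(axis=1)            # Y_1 = X_1 - Xbar, one value per realization
print(Y1.var(), (n - 1) / n * sigma**2)  # both close to 7.2
```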

Solution 2:

Note that if the $X_i$ are independent (in fact, uncorrelated suffices), $$ \begin{align} \mathrm{Var}\left(\bar{X}\right) &=\mathrm{Var}\left(\frac1n\sum X_i\right)\\ &=\frac1{n^2}\sum\mathrm{Var}(X_i)\\ &=\frac1{n^2}n\sigma^2\\ &=\frac{\sigma^2}{n}\tag{1} \end{align} $$

Simply expand and simplify, using $\sum X_i = n\bar{X}$ along the way, to get $$ \begin{align} &\mathbb{E}\left(\sum\left(X_i-\mu\right)^2\right)-\mathbb{E}\left(\sum\left(X_i-\bar{X}\right)^2\right)\\ &=\mathbb{E}\left(\sum\left(X_i^2-2\mu X_i+\mu^2\right)-\sum\left(X_i^2-2\bar{X}X_i+\bar{X}^2\right)\right)\\ &=\mathbb{E}\left(\sum\left(\mu^2+2\left(\bar{X}-\mu\right)X_i-\bar{X}^2\right)\right)\\ &=\mathbb{E}\left(n\mu^2+2n\left(\bar{X}-\mu\right)\bar{X}-n\bar{X}^2\right)\\ &=n\mathbb{E}\left(\mu^2-2\mu\bar{X}+\bar{X}^2\right)\\ &=n\mathbb{E}\left(\left(\mu-\bar{X}\right)^2\right)\\ &=n\frac{\sigma^2}{n}\tag{2}\\ &=\sigma^2 \end{align} $$

Step $(2)$ is just $n$ times $(1)$.
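Everything before the expectation is applied in step $(2)$ is pure algebra, so it can be verified exactly. A small symbolic sketch (assuming SymPy, with a fixed small $n$ for illustration) confirms the pointwise identity $\sum(x_i-\mu)^2 - \sum(x_i-\bar{x})^2 = n(\mu-\bar{x})^2$ that the expansion produces:

```python
import sympy as sp

# Verify sum((x_i - mu)^2) - sum((x_i - xbar)^2) == n*(mu - xbar)^2 symbolically.
n = 4                                   # a small fixed n; the identity holds for every n
x = sp.symbols(f"x1:{n + 1}")           # the symbols x1, x2, x3, x4
mu = sp.Symbol("mu")
xbar = sp.Rational(1, n) * sum(x)

lhs = sum((xi - mu) ** 2 for xi in x) - sum((xi - xbar) ** 2 for xi in x)
rhs = n * (mu - xbar) ** 2
print(sp.expand(lhs - rhs))             # prints 0: the identity is exact
```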

Solution 3:

I think you want to do the tedious calculations and then extract the key insight. And for me, the key insight is that, for each $X_i$:

$$ \textstyle \mathbb{E} \left( X_i \overline{X} \right) = \frac{1}{n}\sum_{j=1}^n \mathbb{E}\left( X_i X_j \right) = \frac{1}{n}\left( \mathbb{E}\left(X_i^2\right) + (n-1)\mu^2 \right) = \frac{\sigma^2}{n} +\mu^2, $$ which does require that $\operatorname{Cov}(X_i,X_j)=0$ for $i\ne j$, so that $\mathbb{E}(X_iX_j) = \mu^2$ whenever $j \neq i$.
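A quick Monte Carlo sketch of this key insight (assuming NumPy and i.i.d. draws; the parameter values are arbitrary):

```python
import numpy as np

# Estimate E[X_1 * Xbar] and compare it to sigma^2/n + mu^2.
rng = np.random.default_rng(2)
n, reps = 4, 500_000
mu, sigma = 1.5, 2.0

X = rng.normal(mu, sigma, size=(reps, n))
Xbar = X.mean(axis=1)
print((X[:, 0] * Xbar).mean(), sigma**2 / n + mu**2)  # both close to 3.25
```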

Once you have this, you can see that:

$$ \textstyle \mathbb{E} \left( \sum (X_i - \overline{X})^2 \right) = \mathbb{E} \left( \sum [(X_i -\mu)+(\mu- \overline{X})]^2 \right) $$

$$ \textstyle = \mathbb{E} \left( \sum (X_i -\mu)^2+\sum2(X_i -\mu)(\mu- \overline{X})+\sum(\mu- \overline{X})^2 \right) $$

To evaluate the three sums: since $\sum (X_i-\mu) = n(\bar{X}-\mu)$, the middle sum collapses to $-2n(\bar{X}-\mu)^2$, the last sum is $n(\mu-\bar{X})^2$, and the key insight gives $\mathbb{E}\left((\bar{X}-\mu)^2\right) = \mathbb{E}(\bar{X}^2)-\mu^2 = \frac{\sigma^2}{n}$. Hence

$$ \textstyle = n\sigma^2-2\sigma^2+\sigma^2=(n-1)\sigma^2 $$

which is indeed tedious to compute, but this is where your "degree of freedom" shows up.
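If you want to watch the three expectations separately, here is a short simulation sketch (same assumptions as the sketches above: NumPy, i.i.d. normal draws, arbitrary parameter values):

```python
import numpy as np

# Estimate the three terms n*sigma^2, -2*sigma^2, and sigma^2 in the expansion.
rng = np.random.default_rng(3)
n, reps = 5, 300_000
mu, sigma = 0.5, 2.0

X = rng.normal(mu, sigma, size=(reps, n))
Xbar = X.mean(axis=1)                                       # one mean per realization

t1 = ((X - mu) ** 2).sum(axis=1).mean()                     # ~ n*sigma^2  = 20
t2 = (2 * (X - mu) * (mu - Xbar)[:, None]).sum(axis=1).mean()  # ~ -2*sigma^2 = -8
t3 = (n * (mu - Xbar) ** 2).mean()                          # ~ sigma^2    = 4
print(t1, t2, t3, (n - 1) * sigma**2)                       # t1+t2+t3 ~ 16
```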