Mean of $ \sum (X_i - \bar{X})^2$

If $X_1,\dots,X_n$ are iid, what is the mean of the following variable?

$ \sum (X_i - \bar{X})^2$

I know the answer is $\sigma^2(n-1)$, but how is this calculated in general? If I expand, I get the variables squared, and that turns quite ugly. What's the trick here?


By expanding out the square, you can easily show that $$\sum_{i=1}^n(X_i-\bar X)^2=\sum_{i=1}^nX_i^2-n\bar X^2,$$ using the fact that $\sum_{i=1}^n(X_i)=n\bar X.$

So we need to calculate $$\mathop{\mathbb{E}}\left[\sum_{i=1}^nX_i^2\right]-\mathop{\mathbb{E}}\left[n\bar X^2\right].$$

Write $\mu=\mathop{\mathbb{E}}\left[X_i\right]$ and $\sigma^2=\mathop{\text{Var}}(X_i)$. The first term, since the $X_i$ are identically distributed, is equal to $n\mathop{\mathbb{E}}\left[X_i^2\right].$ Now note that $\mathop{\text{Var}}(X_i)=\mathop{\mathbb{E}}\left[X_i^2\right]-\mu^2,$ so $\mathop{\mathbb{E}}\left[X_i^2\right]=\sigma^2+\mu^2.$

Now let $Y=\bar X.$ Then $$ \begin{align*} \mathop{\mathbb{E}}\left[Y^2\right] &=\mathop{\text{Var}}(Y)+(\mathop{\mathbb{E}}\left[Y\right])^2\\ &=\mathop{\text{Var}}\left(\frac{1}{n}\sum_{i=1}^{n}X_i\right)+\mu^2\\ &=\frac{1}{n^2}\mathop{\text{Var}}\left(\sum X_i\right)+\mu^2\\ &=\frac{1}{n^2}n\sigma^2+\mu^2\\ &=\frac{\sigma^2}{n}+\mu^2. \end{align*} $$

So the whole expectation becomes $n(\sigma^2+\mu^2)-n(\sigma^2/n+\mu^2)=n\sigma^2-\sigma^2=(n-1)\sigma^2,$ as required.
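As a sanity check on the result, here is a minimal sketch that computes the expectation exactly for a small case by enumerating every outcome. The choice of a fair six-sided die with $n=3$, and the helper names `expected_sum_sq_dev` and `variance`, are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import product


def expected_sum_sq_dev(values, n):
    """Exact E[sum (X_i - Xbar)^2] for n iid draws, uniform on `values`."""
    total = Fraction(0)
    count = 0
    for sample in product(values, repeat=n):
        xbar = Fraction(sum(sample), n)
        total += sum((x - xbar) ** 2 for x in sample)
        count += 1
    return total / count


def variance(values):
    """Variance of a single draw that is uniform on `values`."""
    m = Fraction(sum(values), len(values))
    return sum((v - m) ** 2 for v in values) / len(values)


die = [1, 2, 3, 4, 5, 6]  # illustrative distribution, variance 35/12
n = 3

lhs = expected_sum_sq_dev(die, n)  # enumerates all 6**3 samples
rhs = (n - 1) * variance(die)
print(lhs, rhs)  # prints: 35/6 35/6
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than within a tolerance.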


We have that
\begin{align*}
\sum_{i=1}^n(X_i-\bar X)^2
&=\sum_{i=1}^n(X_i^2-2X_i\bar X+\bar X^2)\\
&=\sum_{i=1}^nX_i^2-2\bar X\sum_{i=1}^nX_i+\sum_{i=1}^n\bar X^2\\
&=\sum_{i=1}^nX_i^2-2n\bar X^2+n\bar X^2\\
&=\sum_{i=1}^nX_i^2-n\bar X^2.
\end{align*}
Assume for now that $\operatorname EX_1=0$ (the remark below shows that this loses no generality), so that $\operatorname EX_1^2=\sigma^2$. Since the $X_i$ are identically distributed,
$$ \operatorname E\biggl[\sum_{i=1}^nX_i^2\biggr]=n\sigma^2. $$
Since they are also independent with mean zero, the cross terms $\operatorname E[X_iX_j]=\operatorname EX_i\operatorname EX_j$ with $i\ne j$ vanish, and so
$$ \operatorname E\biggl[\biggl(\frac1n\sum_{i=1}^nX_i\biggr)^2\biggr]=\frac1{n^2}\operatorname E\biggl[\biggl(\sum_{i=1}^nX_i\biggr)^2\biggr]=\frac n{n^2}\operatorname EX_1^2=\frac{\sigma^2}n. $$
Hence,
$$ \operatorname E\biggl[\sum_{i=1}^n(X_i-\bar X)^2\biggr]=n\sigma^2-n\cdot\frac{\sigma^2}n=(n-1)\sigma^2. $$
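The step $\operatorname E\bigl[(\sum X_i)^2\bigr]=n\operatorname EX_1^2$ relies on the mean being zero. A small sketch that checks it exactly, assuming for illustration a distribution uniform on $\{-1,0,1\}$ and $n=3$ (both are arbitrary choices, not part of the argument):

```python
from fractions import Fraction
from itertools import product

values = [-1, 0, 1]  # illustrative zero-mean distribution, uniform weights
n = 3

# All 3**3 equally likely samples (X_1, X_2, X_3).
samples = list(product(values, repeat=n))

# E[X_1^2] for a single draw.
second_moment = Fraction(sum(v * v for v in values), len(values))

# E[(X_1 + ... + X_n)^2], averaged over every sample.
e_sum_sq = Fraction(sum(sum(s) ** 2 for s in samples), len(samples))

print(e_sum_sq, n * second_moment)  # prints: 2 2
```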

Let us observe that there is no loss of generality in assuming that $\operatorname EX_1=0$, since $$ \sum_{i=1}^n(X_i-\bar X)^2=\sum_{i=1}^n(X_i-\operatorname EX_1+\operatorname EX_1-\bar X)^2=\sum_{i=1}^n\biggl[(X_i-\operatorname EX_1)-\frac1n\sum_{j=1}^n(X_j-\operatorname EX_1)\biggr]^2. $$ So we can set $Y_i=X_i-\operatorname EX_1$ for $1\le i\le n$, which are iid with mean zero and the same variance $\sigma^2$, and investigate $\sum_{i=1}^n(Y_i-\bar Y)^2$ instead.
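The shift invariance used here is easy to see numerically. The following is a minimal sketch with made-up data; the numbers in `xs`, the constant `c`, and the helper `sum_sq_dev` are hypothetical and only illustrate that subtracting a constant from every observation leaves the sum of squared deviations unchanged.

```python
from fractions import Fraction


def sum_sq_dev(xs):
    """Sum of squared deviations from the sample mean."""
    xbar = Fraction(sum(xs), len(xs))
    return sum((x - xbar) ** 2 for x in xs)


xs = [2, 7, 4, 9]               # arbitrary illustrative data
c = 5                           # any constant, playing the role of E[X_1]
shifted = [x - c for x in xs]   # the Y_i = X_i - c

print(sum_sq_dev(xs), sum_sq_dev(shifted))  # prints: 29 29
```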