Proof that $\frac{1}{n}\sum_{i=1}^n(\hat{y}_i-y_i)^2$ is a biased estimator of the residual variance

I know that $\frac{1}{n-2}\sum_{i=1}^n(\hat{y}_i-y_i)^2$ is an unbiased estimator of the residual variance. Therefore I must be making a mistake when I check whether $\frac{1}{n}\sum_{i=1}^n(\hat{y}_i-y_i)^2$ is unbiased, because my calculation says that it is.

My assumptions on the errors $\varepsilon_i$ are: linearity, homoscedasticity, independence, and $\varepsilon_i\sim N(0,\sigma^2)$. This is my work: $$\mathbb{E}\left(\frac{1}{n}\sum_{i=1}^n\varepsilon_i^2\right)=\frac{1}{n}\sum_{i=1}^n\mathbb{E}(\varepsilon_i^2)=\frac{1}{n}\sum_{i=1}^n\left(\mathbb{V}(\varepsilon_i)+[\mathbb{E}(\varepsilon_i)]^2\right)=\frac{1}{n}\sum_{i=1}^n\sigma^2=\sigma^2$$ Where is my mistake?
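
For what it's worth, a quick simulation (a minimal numpy sketch with arbitrary values $n=20$, $\sigma=1$, not part of the argument) does show the $\frac{1}{n}$ version coming in below $\sigma^2$ on average, so the bias is real and my algebra above must be going wrong somewhere:

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma, reps = 20, 1.0, 20_000           # arbitrary illustration values
x = np.linspace(0.0, 1.0, n)               # fixed design points
beta0, beta1 = 1.0, 2.0                    # arbitrary true coefficients

est = np.empty(reps)
for r in range(reps):
    y = beta0 + beta1 * x + rng.normal(0.0, sigma, n)  # simulate the model
    b1, b0 = np.polyfit(x, y, 1)                       # least squares fit
    resid = y - (b0 + b1 * x)                          # residuals y_i - yhat_i
    est[r] = np.mean(resid**2)                         # the 1/n estimator

print(est.mean())   # about sigma^2 * (n - 2) / n = 0.9, not 1.0
```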


Solution 1:

You are confusing errors with residuals.

Consider the simple linear regression model

$$y_i=\beta_0+\beta_1 x_i+\varepsilon_i \quad,\,i=1,2,\ldots,n$$

where the errors $\varepsilon_1,\ldots,\varepsilon_n$ are independent $N(0,\sigma^2)$.

The $i$th residual is $\hat \varepsilon_i=y_i-\hat y_i=y_i-\hat\beta_0-\hat\beta_1 x_i$, where $(\hat\beta_0,\hat\beta_1)$ is the least squares estimator of $(\beta_0,\beta_1)$. This, of course, is not the same as the error $\varepsilon_i$.
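
To see the difference concretely, here is a minimal numpy sketch (the data are arbitrary and simulated, purely for illustration): fit the model once and compare the true errors with the fitted residuals. They are different vectors, and the residuals necessarily have the smaller sum of squares, because least squares minimizes exactly that quantity.

```python
import numpy as np

rng = np.random.default_rng(42)
n, sigma = 10, 1.0                    # arbitrary illustration values
x = np.arange(n, dtype=float)
eps = rng.normal(0.0, sigma, n)       # true (unobservable) errors
y = 3.0 + 0.7 * x + eps               # one simulated data set

b1, b0 = np.polyfit(x, y, 1)          # least squares slope and intercept
resid = y - (b0 + b1 * x)             # residuals y_i - yhat_i

print(np.allclose(eps, resid))        # False: residuals are not the errors
print(eps @ eps, resid @ resid)       # residual sum of squares is the smaller one
```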

The $i$th residual has mean $0$, but its variance is not $\sigma^2$:

$$\operatorname{Var}(\hat \varepsilon_i)=\sigma^2\left(1-\frac1n-\frac{(x_i-\overline x)^2}{\sum_{j=1}^n (x_j-\overline x)^2}\right)$$

This follows from writing the residual vector as $\hat{\boldsymbol\varepsilon}=(I-H)\mathbf{y}=(I-H)\boldsymbol\varepsilon$, where $H$ is the hat matrix: $I-H$ is symmetric and idempotent, so $\operatorname{Var}(\hat\varepsilon_i)=\sigma^2(1-h_{ii})$, and in simple linear regression the leverage is $h_{ii}=\frac1n+\frac{(x_i-\overline x)^2}{\sum_{j=1}^n(x_j-\overline x)^2}$.

It follows that

$$\operatorname E\left[\sum_{i=1}^n \hat \varepsilon_i^2\right]=\sum_{i=1}^n\operatorname E \left[\hat \varepsilon_i^2\right] =\sum_{i=1}^n\operatorname{Var}(\hat \varepsilon_i)=\sigma^2 \left(n-2\right)$$
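
The last equality holds because the two subtracted terms in $\operatorname{Var}(\hat\varepsilon_i)$ sum, over $i$, to the number of estimated coefficients (here $2$):

$$\sum_{i=1}^n\left(\frac1n+\frac{(x_i-\overline x)^2}{\sum_{j=1}^n (x_j-\overline x)^2}\right)=1+1=2, \qquad\text{so}\qquad \sum_{i=1}^n\operatorname{Var}(\hat \varepsilon_i)=\sigma^2\left(n-2\right).$$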

For easier proofs of the last statement, see "Proof $E[\hat \sigma ^2] = E\left( \frac{1}{n-2} \Sigma(y_i-\hat{y_i})^2 \right) = \sigma ^2$: Linear Regression" and "Proof that $E[SS_E] = (n-2)\sigma^2$".
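
Finally, the $(n-2)\sigma^2$ result is easy to check by simulation. Below is a minimal Monte Carlo sketch (the design points, coefficients, and $\sigma$ are arbitrary choices for illustration): the average of $\sum_i\hat\varepsilon_i^2$ over many replications should be close to $(n-2)\sigma^2$, which is why dividing by $n-2$ rather than $n$ gives an unbiased estimator.

```python
import numpy as np

rng = np.random.default_rng(1)
n, sigma, reps = 15, 2.0, 50_000        # arbitrary illustration values
x = rng.uniform(0.0, 10.0, n)           # fixed design, drawn once
X = np.column_stack([np.ones(n), x])    # design matrix with intercept
beta = np.array([1.0, 0.5])             # arbitrary true coefficients

sse = np.empty(reps)
for r in range(reps):
    y = X @ beta + rng.normal(0.0, sigma, n)           # simulate the model
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)   # least squares fit
    resid = y - X @ beta_hat                           # residuals
    sse[r] = resid @ resid                             # sum of squared residuals

print(sse.mean(), (n - 2) * sigma**2)   # the two numbers should be close
```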