Why the sum of residuals equals 0 when we do a sample regression by OLS?

That's my question. I have been looking around online and people post a formula, but they don't explain it. Could anyone please give me a hand with that? Cheers.


If the OLS regression contains a constant term, i.e. if the regressor matrix contains a series of ones as one of its columns, then the sum of residuals is exactly equal to zero, as a matter of algebra.

For the simple regression,
specify the regression model $$y_i = a +bx_i + u_i\,,\; i=1,...,n$$

Then the OLS estimator $(\hat a, \hat b)$ minimizes the sum of squared residuals, i.e.

$$(\hat a, \hat b) : \sum_{i=1}^n(y_i - \hat a - \hat bx_i)^2 = \min$$

For the OLS estimator to be the argmin of the objective function, it must be the case, as a necessary condition, that the first partial derivatives with respect to $a$ and $b$, evaluated at $(\hat a, \hat b)$, equal zero. For our result, we need only consider the partial derivative w.r.t. $a$:

$$\frac {\partial}{\partial a} \sum_{i=1}^n(y_i - a - bx_i)^2 \Big |_{(\hat a, \hat b)} = 0 \Rightarrow -2\sum_{i=1}^n(y_i - \hat a - \hat bx_i) = 0 $$

But $y_i - \hat a - \hat bx_i = \hat u_i$, i.e. is equal to the residual, so we have that

$$\sum_{i=1}^n(y_i - \hat a - \hat bx_i) = \sum_{i=1}^n\hat u_i = 0 $$

The above also implies that if the regression specification does not include a constant term, then the sum of residuals will not, in general, be zero.
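A quick numerical check of both claims (just an illustrative sketch with NumPy; the simulated data and variable names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
y = 2.0 + 3.0 * x + rng.normal(size=n)          # true model with an intercept

# OLS with a constant: regress y on [1, x]
X_const = np.column_stack([np.ones(n), x])
coef_const, *_ = np.linalg.lstsq(X_const, y, rcond=None)
resid_const = y - X_const @ coef_const
print(resid_const.sum())    # ~0, up to floating-point error

# OLS without a constant: regress y on x alone
coef_nc, *_ = np.linalg.lstsq(x[:, None], y, rcond=None)
resid_nc = y - x[:, None] @ coef_nc
print(resid_nc.sum())       # generally far from 0
```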

For the multiple regression,
let $\mathbf X$ be the $n \times k$ matrix containing the regressors, $\hat {\mathbf u}$ the residual vector and $\mathbf y$ the dependent variable vector. Let $\mathbf M = I_n-\mathbf X(\mathbf X'\mathbf X)^{-1}\mathbf X'$ be the "residual-maker" matrix, called thus because we have

$$\hat {\mathbf u} = \mathbf M\mathbf y$$

It is easily verified that $\mathbf M \mathbf X = \mathbf 0$. Also $\mathbf M$ is idempotent and symmetric.

Now, let $\mathbf i$ be a column vector of ones. Then the sum of residuals is

$$\sum_{i=1}^n \hat u_i = \mathbf i'\hat {\mathbf u} =\mathbf i'\mathbf M\mathbf y = \mathbf i'\mathbf M'\mathbf y = (\mathbf M\mathbf i)'\mathbf y = \mathbf 0' \mathbf y = 0$$

So we need the regressor matrix to contain a series of ones, so that we get $\mathbf M\mathbf i = \mathbf 0$.
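Again, this is easy to verify numerically by forming $\mathbf M$ explicitly (an illustrative sketch only; you would not build the $n \times n$ matrix $\mathbf M$ for large $n$):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])   # first column is the constant
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)

M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T   # residual-maker matrix
u_hat = M @ y                                      # residual vector

print(np.allclose(M @ X, 0))    # M X = 0
print(np.allclose(M @ M, M))    # M is idempotent
print(np.allclose(M, M.T))      # M is symmetric
print(u_hat.sum())              # ~0, since M @ ones(n) = 0 when X contains the constant
```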


The accepted solution by Alecos Papadopoulos has a mistake at the end. I can't comment, so I will have to submit this correction as a solution, sorry.

It's true that a series of ones would do the job, but we do not actually need the regressor matrix to contain one in order for $Mi = 0$ to hold.

Theorem: If there exists a $p \times 1$ vector $v$ such that $$Xv = 1_n,$$

where $1_n$ is an $n \times 1$ vector of ones, then $$\sum_{i=1}^n e_i = 0.$$ Proof: $$\sum_{i=1}^n e_i = e^T 1_n = e^T X v = (e^T X) v = (X^T e)^T v = 0^T v = 0.$$

Above I am using the fact that $X^Te=0$, which is just the OLS normal equations (the first-order conditions). Having a series of ones in $X$ (a.k.a. an intercept) is just a special case: if the intercept is in the first column, then $v = (1, 0, 0, \dots, 0)^T$ does the job.
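Here is one concrete illustration of the theorem (a sketch I am adding; a design matrix of group dummies is just one convenient choice of $X$ with no column of ones that still satisfies $Xv = 1_n$, here with $v = (1, 1, \dots, 1)^T$):

```python
import numpy as np

rng = np.random.default_rng(2)
n, groups = 90, 3
g = rng.integers(0, groups, size=n)
X = np.eye(groups)[g]                        # one dummy column per group, no intercept column
y = X @ np.array([1.0, 2.0, 3.0]) + rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta_hat

v = np.ones(groups)                          # X @ v = 1_n, so the theorem applies
print(np.allclose(X @ v, np.ones(n)))        # True: rows of X sum to one
print(e.sum())                               # ~0 even though X has no column of ones
```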