Least Squares Estimation of Linear Regression

I have a linear regression model: $$y = \beta_0 + \beta_1x + \epsilon$$ where $\epsilon$ is a random noise term.

The least squares criterion (the sum of squared residuals) is:

$$S(\beta_0, \beta_1) = \sum^n_{i=1} (y_i - \beta_0 - \beta_1x_i)^2$$

By setting the partial derivatives with respect to $\beta_0$ and $\beta_1$ to zero and solving, I obtained:

$$\hat \beta_1 = \frac{\sum x_iy_i - \frac{\sum x_i \sum y_i}{n}}{\sum x_i^2 - \frac{(\sum x_i)^2}{n}} \tag{1}$$

The final result for $\hat \beta_1$ is:

$$\hat \beta_1 = \frac{\sum(x_i - \bar x)(y_i - \bar y)}{\sum (x_i - \bar x)^2} \tag{2}$$

where $\bar x = \frac{\sum x_i}{n}$ and $\bar y = \frac{\sum y_i}{n}$.

I am stuck on getting from (1) to (2). How should I proceed?
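For what it's worth, a quick numerical check confirms that (1) and (2) give the same value, so this is purely an algebra question. Here is a minimal sketch in NumPy (the data is made up for illustration):

```python
import numpy as np

# Made-up data, purely for illustration.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 3.0 * x + rng.normal(size=50)
n = len(x)

# Formula (1): raw-sum form.
b1_raw = (np.sum(x * y) - np.sum(x) * np.sum(y) / n) / (
    np.sum(x**2) - np.sum(x) ** 2 / n
)

# Formula (2): centered form.
b1_centered = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

print(b1_raw, b1_centered)  # identical up to floating-point rounding
```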


Solution 1:

A few algebraic transformations do the job. Start with the numerator:

$$\sum x_iy_i - \frac{\sum x_i \sum y_i}{n}$$

$$=n\left(\frac1n\sum x_iy_i - \frac{\sum x_i \sum y_i}{n\cdot n}\right)$$

$$=n\left(\frac1n\sum x_iy_i - \overline x\cdot \overline y\right)$$

$$=n\left(\frac1n\sum x_iy_i\underbrace{+\overline x\cdot \overline y-\overline x\cdot \overline y}_{=0} - \overline x\cdot \overline y\right)$$

$$=n\left(\frac1n\sum x_iy_i-\overline x\cdot \overline y-\overline x\cdot \overline y+ \overline x\cdot \overline y\right)$$

$$=n\left(\frac1n\sum x_iy_i- \overline y \cdot \frac1n\sum x_i-\overline x\cdot \frac1n\sum y_i+ \overline x\cdot \overline y\right)$$

$$=\sum x_iy_i- \overline y \sum x_i-\overline x\cdot \sum y_i+ n\cdot \overline x\cdot \overline y$$

$$=\sum x_iy_i- \overline y \cdot\sum x_i-\overline x\cdot \sum y_i+ \sum \overline x\cdot \overline y$$

$$=\sum \left(x_iy_i- \overline y \cdot x_i-\overline x\cdot y_i+ \overline x\cdot \overline y\right)$$

Since $(a-b)(c-d)=ac-ad-bc+bd$, we get

$$=\sum (x_i-\overline x)(y_i-\overline y)$$

which is exactly the numerator of (2).
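As a sanity check of this identity, here is a minimal sketch (made-up numbers, arbitrary names):

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0])
y = np.array([2.0, 3.0, 5.0, 11.0])
n = len(x)

lhs = np.sum(x * y) - np.sum(x) * np.sum(y) / n  # raw-sum form
rhs = np.sum((x - x.mean()) * (y - y.mean()))    # centered form
assert np.isclose(lhs, rhs)
```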

The denominator is handled by the same transformations, replacing $y_i$ with $x_i$ throughout.
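Spelled out in condensed form (using $\frac{(\sum x_i)^2}{n} = n\,\overline x^2$), the same trick gives:

$$\sum x_i^2 - \frac{(\sum x_i)^2}{n} = \sum x_i^2 - n\,\overline x^2 = \sum\left(x_i^2 - 2\overline x\, x_i + \overline x^2\right) = \sum (x_i - \overline x)^2$$

The middle equality holds because $\sum\left(-2\overline x\, x_i + \overline x^2\right) = -2n\,\overline x^2 + n\,\overline x^2 = -n\,\overline x^2$.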