Least Squares Estimation of Linear Regression
I have a linear regression model: $$y = \beta_0 + \beta_1x + \epsilon$$ where $\epsilon$ is random noise.
The least squares criterion (the sum of squared residuals) is:
$$S(\beta_0, \beta_1) = \sum^n_{i=1} (y_i - \beta_0 - \beta_1x_i)^2$$
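(For concreteness, here is a minimal numpy sketch on made-up data; the data and names are my own illustration, not part of the question. It evaluates $S$ and locates its minimum by brute-force grid search.)

```python
import numpy as np

# Made-up data: true beta0 = 2, beta1 = 3
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=50)

def S(b0, b1):
    """Sum of squared residuals for candidate coefficients (b0, b1)."""
    return np.sum((y - b0 - b1 * x) ** 2)

# Brute-force grid search, just to see that S is minimized near the truth.
grid = [(S(b0, b1), b0, b1)
        for b0 in np.linspace(1.5, 2.5, 101)
        for b1 in np.linspace(2.5, 3.5, 101)]
print(min(grid)[1:])  # approximately (2.0, 3.0)
```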
By setting the partial derivatives with respect to $\beta_0$ and $\beta_1$ to zero, I obtained:
$$\hat \beta_1 = \frac{\sum x_iy_i - \frac{\sum x_i \sum y_i}{n}}{\sum x_i^2 - \frac{(\sum x_i)^2}{n}} \tag{1}$$ The final result for $\hat \beta_1$ is:
$$\hat \beta_1 = \frac{\sum(x_i - \bar x)(y_i - \bar y)}{\sum (x_i - \bar x)^2} \tag{2}$$
with $\bar x = \frac{\sum x_i}{n}$ and $\bar y = \frac{\sum y_i}{n}$.
I am stuck getting from (1) to (2), which is the final result. How should I proceed?
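(As a sanity check, a small numpy sketch with made-up data confirms that (1) and (2) give the same value:)

```python
import numpy as np

# Made-up data, only to check that the two formulas coincide numerically.
rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=n)
y = 1.0 - 2.0 * x + rng.normal(size=n)

# Formula (1): raw-sum form
beta1_raw = ((np.sum(x * y) - np.sum(x) * np.sum(y) / n)
             / (np.sum(x ** 2) - np.sum(x) ** 2 / n))

# Formula (2): centered form
xbar, ybar = x.mean(), y.mean()
beta1_centered = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)

print(np.isclose(beta1_raw, beta1_centered))  # True
```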
Solution 1:
Some transformations are necessary. For instance, the numerator:
$$\sum x_iy_i - \frac{\sum x_i \sum y_i}n$$ $$=n\left(\frac1n\sum x_iy_i - \frac{\sum x_i \sum y_i}{n\cdot n}\right)$$
$$=n\left(\frac1n\sum x_iy_i - \overline x\cdot \overline y\right)$$
$$=n\left(\frac1n\sum x_iy_i\underbrace{+\overline x\cdot \overline y-\overline x\cdot \overline y}_{=0} - \overline x\cdot \overline y\right)$$
$$=n\left(\frac1n\sum x_iy_i-\overline x\cdot \overline y-\overline x\cdot \overline y+ \overline x\cdot \overline y\right)$$
$$=n\left(\frac1n\sum x_iy_i- \overline y \cdot \frac1n\sum x_i-\overline x\cdot \frac1n\sum y_i+ \overline x\cdot \overline y\right)$$
$$=\sum x_iy_i- \overline y \sum x_i-\overline x\cdot \sum y_i+ n\cdot \overline x\cdot \overline y$$
$$=\sum x_iy_i- \overline y \cdot\sum x_i-\overline x\cdot \sum y_i+ \sum \overline x\cdot \overline y$$
$$=\sum \left(x_iy_i- \overline y \cdot x_i-\overline x\cdot y_i+ \overline x\cdot \overline y\right)$$
Since $(a-b)(c-d)=ac-ad-bc+bd$ we get
$$=\sum \left(x_i-\overline x\right)\left(y_i-\overline y\right)$$
The denominator transforms the same way; taking $y_i = x_i$ in the identity above gives $$\sum x_i^2 - \frac{(\sum x_i)^2}{n} = \sum (x_i - \bar x)^2.$$
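Both identities are also easy to verify numerically; here is a minimal numpy sketch on made-up data:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=30)
y = rng.normal(size=30)
n = x.size
xbar, ybar = x.mean(), y.mean()

# Numerator: sum x_i y_i - (sum x_i)(sum y_i)/n == sum (x_i - xbar)(y_i - ybar)
lhs_num = np.sum(x * y) - np.sum(x) * np.sum(y) / n
rhs_num = np.sum((x - xbar) * (y - ybar))

# Denominator: sum x_i^2 - (sum x_i)^2/n == sum (x_i - xbar)^2
lhs_den = np.sum(x ** 2) - np.sum(x) ** 2 / n
rhs_den = np.sum((x - xbar) ** 2)

print(np.isclose(lhs_num, rhs_num), np.isclose(lhs_den, rhs_den))  # True True
```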