how is the formula for linear regression connected to the fact that the conditional mean is the optimal estimator?

It’s not the linear function That minimizes the MSE — it’s how you estimate it’s parameters. If you are trying to minimize the MSE of a linear estimator then there are nice linear algebra formulas called normal equations that save you from having to numerically optimize over the $\beta_i$.

However, if you are in nonlinear territory then MSE may be just had hard to optimize.

how is the formula for linear regression connected to the fact that the conditional mean is the optimal estimator?

Related

Recent Posts