Writing projection in terms of projection matrix
Solution 1:
The key point is that, starting from the projection of $b$ onto the line through $a$,
$$p = ax = a\frac{a^Tb}{a^Ta}$$
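we can regroup the factors (the quotient $\frac{a^Tb}{a^Ta}$ is a scalar, so it can be moved past $a$) and write it in matrix form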
$$p = ax = a\frac{a^Tb}{a^Ta}=\frac{aa^T}{a^Ta}b=Pb$$
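As a quick numerical check of the rank-one formula $P=\frac{aa^T}{a^Ta}$, here is a minimal NumPy sketch. The function name `line_projection_matrix` and the vectors `a`, `b` are made up for illustration and are not part of the question.

```python
import numpy as np

def line_projection_matrix(a):
    """Return P = a a^T / (a^T a) for a nonzero 1-D vector a (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    # np.outer(a, a) is the matrix a a^T; np.dot(a, a) is the scalar a^T a.
    return np.outer(a, a) / np.dot(a, a)

# Illustrative data (an assumption, chosen so that a^T b = a^T a = 9).
a = np.array([1.0, 2.0, 2.0])
b = np.array([3.0, 0.0, 3.0])

P = line_projection_matrix(a)
p = P @ b  # projection of b onto the line through a

print(p)
```

Since $a^Tb=a^Ta$ for these vectors, the scalar $x$ equals $1$ and the projection is $a$ itself. The matrix $P$ is also symmetric and satisfies $P^2=P$, as any orthogonal projection must.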
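From here we can generalize to the projection onto a subspace spanned by several vectors $a_i$.
Consider the matrix $A=[a_1 \, a_2\,\dots\, a_n]$, whose columns we assume to be linearly independent (so that $A^TA$ is invertible), and the vector $b$ to project. The projection $p$ lies in $Col(A)$, so for some coefficient vector $x$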
$$Ax=p$$
The error is $e=b-p=b-Ax$, and its length is minimized when $e$ is orthogonal to $Col(A)$, that is, to every column of $A$:
$$A^Te=A^T(b-Ax)=0\implies A^Tb=A^TAx\implies x=(A^TA)^{-1}A^Tb$$
and then
$$p=Ax=A(A^TA)^{-1}A^Tb=Pb\implies P=A(A^TA)^{-1}A^T$$
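The general formula can be checked the same way. The sketch below assumes $A$ has linearly independent columns; the name `projection_matrix` and the data are hypothetical, chosen only for illustration. It solves the normal equations with `np.linalg.solve` instead of forming $(A^TA)^{-1}$ explicitly, which is a numerical design choice and not something the derivation requires.

```python
import numpy as np

def projection_matrix(A):
    """Return P = A (A^T A)^{-1} A^T (hypothetical helper).

    Assumes the columns of A are linearly independent, so A^T A is invertible.
    """
    A = np.asarray(A, dtype=float)
    # Solve (A^T A) X = A^T for X, then P = A X.
    return A @ np.linalg.solve(A.T @ A, A.T)

# Illustrative data (an assumption): two independent columns in R^3.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

P = projection_matrix(A)
p = P @ b   # projection of b onto Col(A)
e = b - p   # error vector, orthogonal to Col(A)

print(p)        # approximately [ 5.  2. -1.]
print(A.T @ e)  # approximately [0. 0.]
```

For this data the normal equations $A^TAx=A^Tb$ give $x=(5,-3)$, hence $p=Ax=(5,2,-1)$ and $e=(1,-2,1)$, which is orthogonal to both columns of $A$.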