Matrix Calculus in the Least-Squares Method
Solution 1:
The first step is to expand $\|r\|^2$ from its definition, with $r = Ax - b$: \begin{align} \|r\|^2 & = \langle r,r \rangle = r^T r \\ &= (Ax-b)^T(Ax-b) = (x^TA^T-b^T)(Ax-b) \\ &= x^TA^TAx - x^TA^Tb - b^TAx + b^Tb \\ &= x^TA^TAx - (b^TAx)^T - b^TAx + b^Tb \end{align} Since $b^TAx$ is a scalar, it equals its own transpose: $b^TAx = (b^TAx)^T$. Thus \begin{align} \|r\|^2 & = x^TA^TAx - 2b^TAx + b^Tb \end{align}
For the derivatives, you could take a look here. Another approach is to write out the matrix-vector expressions in summation form and differentiate term by term; then no matrix calculus is involved.
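As a sketch of the summation-form approach, write the components of the residual as $r_i = \sum_j A_{ij}x_j - b_i$ (the indices $i$, $j$, $k$ are introduced here for illustration) and differentiate with respect to a single component $x_k$:
\begin{align*} \|r\|^2 &= \sum_i r_i^2 = \sum_i \Big( \sum_j A_{ij}x_j - b_i \Big)^2 \\ \frac{\partial \|r\|^2}{\partial x_k} &= \sum_i 2\, r_i\, \frac{\partial r_i}{\partial x_k} = 2\sum_i A_{ik} \Big( \sum_j A_{ij}x_j - b_i \Big) \\ &= 2\,\big[ A^T(Ax-b) \big]_k \end{align*}
Collecting the components over $k$ gives the gradient $2A^T(Ax-b)$, using only ordinary partial derivatives of sums.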
Solution 2:
Below are the matrix/vector derivative rules you will need.
$$\dfrac{d(x^TBx)}{d x_i} = \dfrac{d}{dx_i}\left(\sum_{j,k} x_j B_{jk}x_k\right) = \sum_{j} x_j B_{ji} + \sum_{k}B_{ik} x_k = \sum_{k}\left(B^T + B\right)_{ik}x_k$$ Hence, we have $$\dfrac{d(x^TBx)}{d x} = (B^T+B)x$$ Similarly, we have $$\dfrac{d(c^Tx)}{d x_i} = \dfrac{d}{d x_i}\left(\sum_k c_k x_k\right) = c_i$$ Hence, we have $$\dfrac{d(c^Tx)}{dx} = c$$ Now you should be able to get what you want.
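The two rules above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper `numerical_gradient`, the random test data, and the tolerance are illustrative choices, not part of the original answer. It compares each closed-form gradient with a central-difference approximation, using a deliberately non-symmetric $B$ so that the $B^T$ term matters.

```python
import numpy as np

def numerical_gradient(f, x, eps=1e-6):
    """Central-difference approximation of the gradient of a scalar function f at x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

# Hypothetical test data: any real B, c, x of matching sizes would do.
rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n))   # not symmetric in general
c = rng.standard_normal(n)
x = rng.standard_normal(n)

# Rule 1: d(x^T B x)/dx = (B^T + B) x
quad_analytic = (B.T + B) @ x
quad_numeric = numerical_gradient(lambda v: v @ B @ v, x)

# Rule 2: d(c^T x)/dx = c
lin_analytic = c
lin_numeric = numerical_gradient(lambda v: c @ v, x)

quad_ok = np.allclose(quad_analytic, quad_numeric, atol=1e-6)
lin_ok = np.allclose(lin_analytic, lin_numeric, atol=1e-6)
print(quad_ok, lin_ok)
```

For the least-squares objective, taking $B = A^TA$ and $c = A^Tb$ in these rules gives the derivative of $x^TA^TAx - 2b^TAx + b^Tb$ as $2A^TAx - 2A^Tb$.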
Solution 3:
All we need here is multivariable calculus, not matrix calculus. Let $f(x) = (1/2) \| Ax - b \|^2$. Notice that $f(x) = g(h(x))$, where $h(x) = Ax - b$ and $g(u) = (1/2) \|u \|^2$. It can easily be seen that the derivatives of $g$ and $h$ are $$ h'(x) = A, \qquad g'(u) = u^T. $$ From the multivariable chain rule, we have \begin{align*} f'(x) &= g'(h(x)) h'(x) \\ &= (Ax - b)^T A. \end{align*} If we use the convention that $\nabla f(x)$ is a column vector, then \begin{align} \nabla f(x) &= f'(x)^T \\ &= A^T (Ax - b). \end{align}
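The chain-rule computation can be mirrored step by step in code. This is a rough sketch, assuming NumPy; the sizes `m`, `n`, the random data, and the variable names are assumptions made for illustration. It forms $g'(h(x))$ as a $1 \times m$ row, multiplies by $h'(x) = A$, transposes the result, and compares it with a finite-difference gradient of $f$.

```python
import numpy as np

# Hypothetical problem data for illustration only.
rng = np.random.default_rng(1)
m, n = 6, 3
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
x = rng.standard_normal(n)

def f(v):
    """f(v) = (1/2) * ||A v - b||^2"""
    r = A @ v - b
    return 0.5 * (r @ r)

# Chain rule: f'(x) = g'(h(x)) h'(x), with h(x) = A x - b and g(u) = (1/2)||u||^2.
u = A @ x - b
g_prime = u.reshape(1, m)     # g'(u) = u^T, a 1 x m row vector
h_prime = A                   # h'(x) = A, an m x n matrix
f_prime = g_prime @ h_prime   # 1 x n row vector (Ax - b)^T A
grad = f_prime.T.ravel()      # column-vector convention, flattened to shape (n,)

# Central-difference gradient of f for comparison.
eps = 1e-6
grad_fd = np.array([
    (f(x + eps * e) - f(x - eps * e)) / (2 * eps)
    for e in np.eye(n)
])

chain_rule_ok = np.allclose(grad, grad_fd, atol=1e-6)
direct_ok = np.allclose(grad, A.T @ (A @ x - b))
print(chain_rule_ok, direct_ok)
```

Keeping $g'(u)$ as an explicit row vector makes the shapes in the product $g'(h(x))\,h'(x)$ line up exactly as in the derivation, and the final transpose is the only place where the row/column convention enters.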