Prove that the system $A^T A x = A^T b$ always has a solution

The matrix $A^TA$ need not be invertible. So what you need to prove is that $A^T b$ lies in the image $V= {\rm im\;} A^T A$.

Now, $A^T b\in V$ is equivalent to $V^\perp \subset (A^T b)^\perp$ so it suffices to show that the orthogonal complement of ${\rm im} A^T A$ is also orthogonal to $A^T b$.

So suppose that $z^T A^T A=0$ (i.e. the vector $z$ is orthogonal to the image). Then also $|z^T A^T|^2 = z^T A^T A z=0$ so $z^T A^T=0$. But then $z^T A^Tb=0$ as we had to show.

Let the SVD of $\mathrm A \in \mathrm R^{m \times n}$ be

$$\mathrm A = \mathrm U \Sigma \mathrm V^{\top} = \begin{bmatrix} \mathrm U_1 & \mathrm U_2\end{bmatrix} \begin{bmatrix} \hat\Sigma & \mathrm O\\ \mathrm O & \mathrm O\end{bmatrix} \begin{bmatrix} \mathrm V_1^{\top}\\ \mathrm V_2^{\top}\end{bmatrix}$$

where the zero matrices may be empty. The eigendecomposition of $\mathrm A^{\top} \mathrm A$ is, thus,

$$\mathrm A^{\top} \mathrm A = \mathrm V \Sigma^{\top} \mathrm U^{\top} \mathrm U \Sigma \mathrm V^{\top} = \mathrm V \Sigma^{\top} \Sigma \mathrm V^{\top} = \begin{bmatrix} \mathrm V_1 & \mathrm V_2\end{bmatrix} \begin{bmatrix} \hat\Sigma^2 & \mathrm O\\ \mathrm O & \mathrm O\end{bmatrix} \begin{bmatrix} \mathrm V_1^{\top}\\ \mathrm V_2^{\top}\end{bmatrix}$$

Hence, the normal equations $\mathrm A^{\top} \mathrm A \, \mathrm x = \mathrm A^{\top} \mathrm b$ can be written as follows

$$\mathrm V \begin{bmatrix} \hat\Sigma^2 & \mathrm O\\ \mathrm O & \mathrm O\end{bmatrix} \mathrm V^{\top} \mathrm x = \mathrm V \begin{bmatrix} \hat\Sigma & \mathrm O\\ \mathrm O & \mathrm O\end{bmatrix} \mathrm U^{\top} \mathrm b$$

Let $\mathrm y := \mathrm V^{\top} \mathrm x$. Left-multiplying by $\mathrm V^{\top}$,

$$\begin{bmatrix} \hat\Sigma^2 & \mathrm O\\ \mathrm O & \mathrm O\end{bmatrix} \begin{bmatrix} \mathrm y_1\\ \mathrm y_2\end{bmatrix} = \begin{bmatrix} \hat\Sigma & \mathrm O\\ \mathrm O & \mathrm O\end{bmatrix} \begin{bmatrix} \mathrm U_1^{\top} \mathrm b\\ \mathrm U_2^{\top} \mathrm b\end{bmatrix}$$

and, thus,

$$\begin{bmatrix} \mathrm I_r & \mathrm O\\ \mathrm O & \mathrm O\end{bmatrix} \begin{bmatrix} \mathrm y_1\\ \mathrm y_2\end{bmatrix} = \begin{bmatrix} \hat\Sigma^{-1} & \mathrm O\\ \mathrm O & \mathrm O\end{bmatrix} \begin{bmatrix} \mathrm U_1^{\top} \mathrm b\\ \mathrm U_2^{\top} \mathrm b\end{bmatrix} = \begin{bmatrix} \hat\Sigma^{-1}\mathrm U_1^{\top} \mathrm b\\ \mathrm O\end{bmatrix}$$

which always has a solution, as $\mathrm O \mathrm y_2 = \mathrm O$ always has a solution. The affine solution space is parametrized by

$$\mathrm x = \mathrm V_1 \hat\Sigma^{-1} \mathrm U_1^{\top} \mathrm b + \mathrm V_2 \eta$$

where $\{ \mathrm V_2 \eta \mid \eta \in \mathbb R^{n-r} \}$ is the null space of $\mathrm A$. If $\mathrm A$ has full column rank, then $r = n$, the null space of $\mathrm A$ is $\{0_n\}$, and $\mathrm V_1 \hat\Sigma^{-1} \mathrm U_1^{\top} \mathrm b$ is the only solution to the normal equations.

Addendum

Using the SVD of $\mathrm A$, the original system $\mathrm A \mathrm x = \mathrm b$ can be written as

$$\begin{bmatrix} \hat\Sigma & \mathrm O\\ \mathrm O & \mathrm O\end{bmatrix} \begin{bmatrix} \mathrm y_1\\ \mathrm y_2\end{bmatrix} = \begin{bmatrix} \mathrm U_1^{\top} \mathrm b\\ \mathrm U_2^{\top} \mathrm b\end{bmatrix}$$

which only has a solution if $\mathrm U_2^{\top} \mathrm b = \mathrm 0$, i.e., if $\mathrm b$ is orthogonal to the left null space of $\mathrm A$, i.e., if $\mathrm b$ is in the column space of $\mathrm A$.

Multivariable limit - perhaps a trickier problem I am stuck on. [duplicate]

How to show that $\sum\limits_{k=0}^n (-1)^k\tfrac{{ {n}\choose{k}}}{{ {x+k}\choose{k}}} = \frac{x}{x+n}$

What are "oracle results"?

How to tell two ideals belong to the same ideal class group

Dividing a curve into chords of equal length

Graphically Organizing the Interrelationships of Basic Algebraic Structures

If $\ \sum_{k=1}^n m(E_n) > n-1,$ then prove that $\bigcap_{k=1}^n E_k$ has positive measure.

Find the value $\binom {n}{0} + \binom{n}{4} + \binom{n}{8} + \cdots $, where $n$ is a positive integer.

What does a symmetric matrix transformation do, geometrically?

Criteria of a trivial G bundle of a 3-manifold

Show that $\sum\limits_{n=1}^\infty \frac {\sqrt{a_n}}{n}$ converges if $\sum\limits_{n=1}^\infty{a_n}$ does provided that $a_n>0$ [duplicate]

How does one DERIVE the formula for the maximum of two numbers [duplicate]