Why does $A^TAx = A^Tb$ have infinitely many solution algebraically when $A$ has dependent columns?

Solution 1:

Let $A$ be an $n \times m$ matrix.

Consider $$T_{A^T} : \mathbb R^n \to \mathbb R^m , T_{A^T}=A^Tx \\ T_{A} : \mathbb R^m \to \mathbb R^n , T_{A}=Ax \\ T_{A^TA} : \mathbb R^m \to \mathbb R^m , T_{A^TA}=A^TAx $$

Since $T_{A^TA}=T_{A^T}\circ T_A$ we have $$C(A^TA)=range(T_{A^TA}) \subset range(T_{A^T})= C(A^T)$$

If we can show that these two spaces have the same dimension we are done.

But this comes for free from the rank nullity theorem and the fact that $$ker(T_{A^TA})=\ker(T_A)$$

Indeed, the $\supset$ inclusion is obvious. For $\subset$ let $x \in ker(T_{A^TA})$, then $$A^TAx=0 \Rightarrow x^TA^TAx=0 \Rightarrow (Ax)^T(Ax)=0 \Rightarrow (Ax)\cdot(Ax)=0 \Rightarrow \| Ax\|^2=0 \Rightarrow Ax=0$$

Solution 2:

Existence

First and foremost, we must have existence before we can talk about uniqueness. To have existence, we require that the data vector $b$ is not in the $\color{red}{null}$ space: $$ b\notin\color{red}{\mathcal{N} \left( \mathbf{A} \right)} $$

Uniqueness

When the matrix $\mathbf{A}$ has dependent columns, the $\color{red}{null}$ space $\color{red}{\mathcal{N} \left( \mathbf{A} \right)}$ is nontrivial. Let $\color{red}{z}$ be any vector in the $\color{red}{null}$ space, and let $x_{LS}$ be a known least squares minimizer.

$$ \mathbf{A} \left( x_{LS} + \color{red}{z} \right) = \mathbf{A} x_{LS} + \mathbf{A} \color{red}{z} = \mathbf{A} x_{LS} + \mathbf{0} = \mathbf{A} x_{LS} $$

Under the action of $\mathbf{A}$, the solutions $\left( x_{LS} + \color{red}{z} \right)$ and $x_{LS}$ are equivalent.

Geometrically, the issue is shown below. The general least squares solution is the affine space represented by the red, dashed line.

  1. If the $\color{red}{null}$ space is trivial, $\color{red}{\mathcal{N} \left( \mathbf{A} \right)} = \mathbf{0}$, the least squares solution is the point $\color{blue}{\mathbf{A}^{+}b}$ in the $\color{blue}{range}$ space $\color{blue}{\mathcal{R} \left( \mathbf{A}^{*} \right)}$.

  2. If the $\color{red}{null}$ space is not trivial, $\color{red}{\mathcal{N} \left( \mathbf{A} \right)} \ne \mathbf{0}$, the least squares solution is the affine space going through the point $\color{blue}{\mathbf{A}^{+}b}$ in $\color{blue}{range}$ space $\color{blue}{\mathcal{R} \left( \mathbf{A}^{*} \right)}$ and extending through the $\color{red}{null}$ space.

affine


Least squares solution

Given a matrix $\mathbf{A}\in\mathbb{C}^{m\times n}_{\rho}$, and a data vector $b\in\mathbb{C}^{m}$ such that $$ b\notin\color{red}{\mathcal{N} \left( \mathbf{A} \right)} $$ The least squares solution is defined as $$ x_{LS} = \left\{ x\in\mathbb{C}^{n} \colon \lVert \mathbf{A} x - b \rVert_{2}^{2} \text{ is minimized} \right\} $$ The least squares solution is computed using
$$ x_{LS} = \color{blue}{\mathbf{A}^{+} b} + \color{red}{ \left( \mathbf{I}_{n} - \mathbf{A}^{+} \mathbf{A} \right) y}, \quad y \in \mathbb{C}^{n} $$ In this form, it's clear that having a unique solution demands $$ \mathbf{A}^{+} \mathbf{A} = \mathbf{I}_{n} $$ which happens when $\color{red}{\mathcal{N} \left( \mathbf{A} \right)} = \mathbf{0}$. More details are in the subsequent links.


Explore Stack Exchange:

Existence and uniqueness of least squares solutions: Query about the Moore Penrose pseudoinverse method

Derivation of the SVD solution and conditions on existence and uniqueness: Singular value decomposition proof

Subspace decomposition for least squares: Singular Value Decomposition

How the two null spaces affect the least squares solution: Pseudo-inverse of a matrix that is neither fat nor tall?, What forms does the Moore-Penrose inverse take under systems with full rank, full column rank, and full row rank?

How full column rank changes the inverse: How to find the singular value decomposition of $A^{T}A$ & $\left( A^{T}A \right)^{-1}$

How null spaces affect the pseudoinverse: generalized inverse of a matrix and convergence for singular matrix, When pseudo inverse and general inverse of a invertible square matrix will be equal or not equal?