Orthogonal projection matrix

Let $A\in M_{m\times n}(\mathbb{R})$. Denote by $R(A)$ the column space of $A$ and by $N(A)$ the null space of $A$.

I know that

$z^*=Ax^*$ is the projection of $b\in R^m$ onto $R(A)=N(A^T)^\perp$, where $x^*$ solves $A^TAx^*=A^Tb$

$r^*=b-z^*$ is the projection of $b\in R^m$ onto $N(A^T)=R(A)^\perp$

Similarly

$z^*=A^Tx^*$ is the projection of $b\in R^n$ onto $R(A^T)=N(A)^\perp$, where $x^*$ solves $AA^Tx^*=Ab$

$r^*=b-z^*$ is the projection of $b\in R^n$ onto $N(A)=R(A^T)^\perp$

Finally, I know the orthogonal projection matrix onto the subspace $R(A)$ is given by $P=A(A^TA)^{-1}A^T$, provided $A^TA$ is invertible.

My question is: how do I determine, for example, the projection matrices onto $R(A^T)$, $N(A^T)$, and $N(A)$?

EDIT: The $^*$ means nothing; it is just notation that I used.


Solution 1:

If $A^+$ is the Moore-Penrose pseudo inverse of $A$, that is, the matrix such that $$\tag{1} a)\;AA^+A=A, \quad b)\;A^+AA^+=A^+, \quad c)\;(AA^+)^*=AA^+, \quad d)\;(A^+A)^*=A^+A,$$ then

  • (i) $AA^+$ is the orthogonal projector (OP) onto the range of $A$,
  • (ii) $A^+A$ is OP onto the range of $A^*$.

Note that if $Q$ is OP onto a subspace $S$, that is, $Q=Q^*$ (here, $^*$ denotes conjugate transpose), $Q^2=Q$, and $R(Q)=S$ (note that these three conditions imply that $Q$ is unique), then $I-Q$ is OP onto $S^\perp$. So, if (i) is true, then $I-AA^+$ is OP onto $R(A)^\perp=N(A^*)$ and, if (ii) is true, then $I-A^+A$ is OP onto $R(A^*)^\perp=N(A)$.

For the proof of (i) and (ii), we use only (1) and the fact that if $X=YZ$, then $R(X)\subseteq R(Y)$. We need only to show that the matrices $AA^+$ and $A^+A$ are Hermitian, idempotent, and their ranges are equal to the subspaces on which they are supposed to project.

Both $AA^+$ and $A^+A$ are obviously Hermitian; see (1c) and (1d). In addition, (1a) and/or (1b) imply that they are idempotent. It remains to show that $R(AA^+)=R(A)$ and $R(A^+A)=R(A^*)$. Clearly, $R(AA^+)\subseteq R(A)$; $R(A)\subseteq R(AA^+)$ follows from (1a). From (1d), we have $A^+A=A^*(A^+)^*$, so $R(A^+A)\subseteq R(A^*)$. From (1a) and (1d), $A^*=A^+AA^*$, so $R(A^*)\subseteq R(A^+A)$.
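For readers who want the idempotence and range steps written out, one way to expand them, using nothing beyond (1a) and (1d), is:

$$(AA^+)^2 = (AA^+A)A^+ = AA^+, \qquad (A^+A)^2 = A^+(AA^+A) = A^+A,$$

$$A = (AA^+)A \;\Longrightarrow\; R(A)\subseteq R(AA^+), \qquad A^* = (AA^+A)^* = (A^+A)^*A^* = A^+AA^* \;\Longrightarrow\; R(A^*)\subseteq R(A^+A).$$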

Summarizing, the following are OPs:

  • $AA^+$ on $R(A)$,
  • $I-AA^+$ on $N(A^*)$,
  • $A^+A$ on $R(A^*)$,
  • $I-A^+A$ on $N(A)$.

Note that this holds regardless of the rank of $A$. While $A^*A$ or $AA^*$ (or both) might not be invertible, depending on the rank of $A$, every $A$ has a unique Moore-Penrose pseudoinverse and therefore a unique set of OPs, given above, associated with the four subspaces of $A$.
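As a numerical sanity check of the summary above, here is a minimal sketch in Python. It assumes NumPy is available and uses `np.linalg.pinv` for $A^+$; the matrix `A` and the helper `is_orthogonal_projector` are made up for illustration. The example is real, so $^*$ reduces to the transpose, and `A` is deliberately rank deficient so that neither $A^TA$ nor $AA^T$ is invertible.

```python
import numpy as np

# Hypothetical example: a 3x3 real matrix of rank 2 (row 2 = 2 * row 1),
# so neither A^T A nor A A^T is invertible.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])

A_pinv = np.linalg.pinv(A)      # Moore-Penrose pseudoinverse A^+
m, n = A.shape

P_col = A @ A_pinv              # OP onto R(A)
P_lnull = np.eye(m) - P_col     # OP onto N(A^T)
P_row = A_pinv @ A              # OP onto R(A^T)
P_null = np.eye(n) - P_row      # OP onto N(A)


def is_orthogonal_projector(P):
    """Illustrative helper: symmetric and idempotent, up to rounding."""
    return np.allclose(P, P.T) and np.allclose(P @ P, P)


for name, P in [("P_col", P_col), ("P_lnull", P_lnull),
                ("P_row", P_row), ("P_null", P_null)]:
    print(name, is_orthogonal_projector(P))

# Range checks: each projector fixes its own subspace and kills the complement.
print(np.allclose(P_col @ A, A))        # columns of A are fixed
print(np.allclose(P_lnull @ A, 0))      # columns of A are annihilated
print(np.allclose(P_row @ A.T, A.T))    # rows of A are fixed
print(np.allclose(A @ P_null, 0))       # P_null maps into N(A)
```

Comparisons use `np.allclose` rather than exact equality because the pseudoinverse is computed in floating point.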

Solution 2:

Projection onto $\operatorname{col}(A)$

Let's start at the beginning and derive the matrix which projects a vector $b$ onto the column space of $A$, denoted $\operatorname{col}(A)$.

Let $A$ be an $m\times n$ matrix whose columns are linearly independent, so that they form a basis of an $n$-dimensional subspace of $\Bbb R^m$. Let $b\in M_{m\times 1}(\Bbb R)$, where $b$ is not necessarily an element of the subspace $\operatorname{col}(A)$. Then we can decompose $b$ into parts parallel and orthogonal to $\operatorname{col}(A)$:

$$b = b_\| + b_\bot$$

If $P_{\operatorname{col}(A)}$ is the projection matrix that we hope to find, then $b_\| = P_{\operatorname{col}(A)}b$ and $b_\bot = b-P_{\operatorname{col}(A)}b$. If $A = \begin{bmatrix} a_1 & \cdots & a_n\end{bmatrix}$ (where $a_i$ is the $i$th column of $A$), then we know that $$b_\| = \kappa_1a_1 + \kappa_2a_2 + \cdots + \kappa_na_n = A\begin{bmatrix} \kappa_1 \\ \vdots \\ \kappa_n\end{bmatrix} = A\kappa$$

And knowing that $\operatorname{col}(A)$ is orthogonal to $b_\bot$, we also know that $A^Tb_\bot = A^T(b-A\kappa)=0$. Therefore $A^Tb=A^TA\kappa$. Solving for $\kappa$, we get $$\kappa = (A^TA)^{-1}A^Tb$$ Note here that $A^TA$ is guaranteed to be invertible by the fact that we assume the columns of $A$ are linearly independent. Then if we left multiply both sides of this by $A$ we end up with $$b_\| = P_{\operatorname{col}(A)}b = A(A^TA)^{-1}A^Tb \\ \implies P_{\operatorname{col}(A)} = A(A^TA)^{-1}A^T$$
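The derivation above can be checked on a small example. The following is only a sketch, assuming NumPy; the matrix `A` and vector `b` are made-up values with linearly independent columns. It solves the normal equations $A^TA\kappa = A^Tb$ with `np.linalg.solve` instead of forming $(A^TA)^{-1}$ explicitly, which is a common numerical choice and not something the derivation requires.

```python
import numpy as np

# Hypothetical example: 3x2 matrix with linearly independent columns.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# kappa solves the normal equations A^T A kappa = A^T b.
kappa = np.linalg.solve(A.T @ A, A.T @ b)   # approximately [5, -3]

b_par = A @ kappa        # component in col(A), approximately [5, 2, -1]
b_perp = b - b_par       # component orthogonal to col(A)

# The projection matrix P = A (A^T A)^{-1} A^T, built via a linear solve.
P_col = A @ np.linalg.solve(A.T @ A, A.T)

print(np.allclose(P_col @ b, b_par))     # P b reproduces b_par
print(np.allclose(A.T @ b_perp, 0))      # b_perp is orthogonal to every column
print(np.allclose(P_col @ P_col, P_col)) # idempotent
print(np.allclose(P_col, P_col.T))       # symmetric
```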


Projection onto $\operatorname{lnull}(A)$

Because the left nullspace of $A$ is the orthogonal complement of its column space, we essentially just want the $b_\bot$ from above. Thus $$P_{\operatorname{lnull}(A)}b = b_\bot = b-b_\| = I_mb - A(A^TA)^{-1}A^Tb = (I_m -A(A^TA)^{-1}A^T)b \\ \implies P_{\operatorname{lnull}(A)} = I_m -A(A^TA)^{-1}A^T$$
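A short sketch of the left-nullspace projector, again assuming NumPy and a made-up full-column-rank matrix `A`. For this particular `A` the left nullspace is spanned by the single vector $v=(1,-2,1)^T$, so the result can be compared against the rank-one projector $vv^T/(v^Tv)$.

```python
import numpy as np

# Hypothetical example: 3x2 matrix with linearly independent columns.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
m = A.shape[0]

# P_col = A (A^T A)^{-1} A^T, computed with a solve instead of an explicit inverse.
P_col = A @ np.linalg.solve(A.T @ A, A.T)
P_lnull = np.eye(m) - P_col      # projector onto the left nullspace of A

b = np.array([6.0, 0.0, 0.0])
b_perp = P_lnull @ b             # approximately [1, -2, 1]

# For this A, the left nullspace is spanned by v, so P_lnull = v v^T / (v^T v).
v = np.array([1.0, -2.0, 1.0])
print(np.allclose(A.T @ v, 0))                          # v is in the left nullspace
print(np.allclose(P_lnull, np.outer(v, v) / (v @ v)))   # matches rank-one projector
print(np.allclose(A.T @ b_perp, 0))                     # projection lies in lnull(A)
```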


Unfortunately, we can't use the above approach to find the projection onto $\operatorname{row}(A)$ or $\operatorname{null}(A)$ because, while $A^TA$ is invertible, $AA^T$ is NOT (when $m>n$ it is an $m\times m$ matrix of rank $n$). To find these, you'd need to use the pseudoinverse, which is described in AlgebraicPavel's answer.