I am looking for an intuitive reason for a projection matrix of an orthogonal projection to be symmetric. The algebraic proof is straightforward yet somewhat unsatisfactory.

Take for example another property: $P=P^2$. It's clear that applying the projection one more time shouldn't change anything and hence the equality.

So what's the reason behind $P^T=P$?


Solution 1:

In general, if $P = P^2$, then $P$ is the projection onto $\operatorname{im}(P)$ along $\operatorname{ker}(P)$, so that $$\mathbb{R}^n = \operatorname{im}(P) \oplus \operatorname{ker}(P),$$ but $\operatorname{im}(P)$ and $\operatorname{ker}(P)$ need not be orthogonal subspaces. Given that $P = P^2$, you can check that $\operatorname{im}(P) \perp \operatorname{ker}(P)$ if and only if $P = P^T$, justifying the terminology "orthogonal projection."
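
As a small worked example (the matrices here are chosen for illustration and are not part of the original answer), compare two idempotent matrices on $\mathbb{R}^2$ with the same image: $$ P = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \qquad Q = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}. $$ Both satisfy $P^2 = P$ and $Q^2 = Q$, and both have image $\operatorname{span}\{(1,0)\}$. But $\operatorname{ker}(P) = \operatorname{span}\{(1,-1)\}$, and $(1,0)\cdot(1,-1) = 1 \neq 0$, so $P$ projects along a slanted direction and is not symmetric. In contrast, $\operatorname{ker}(Q) = \operatorname{span}\{(0,1)\}$ is perpendicular to the image, and $Q = Q^T$.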

Solution 2:

There are some nice and succinct answers already. If you'd like even more intuition with as little math and as few higher-level linear algebra concepts as possible, consider two arbitrary vectors $v$ and $w$.

Simplest Answer

Take the dot product of one vector with the projection of the other vector. $$ (P v) \cdot w $$ $$ v \cdot (P w) $$

In both dot products above, one of the factors ($P v$ or $P w$) lies entirely in the subspace you project onto. Because the projection is orthogonal, the part of the other vector that the projection discards is perpendicular to this subspace and contributes nothing to the dot product. So both dot products see only the components in the subspace. This means they are equal to each other, and are in fact equal to: $$ (P v) \cdot (P w) $$

Since $(P v) \cdot w = v \cdot (P w)$, it doesn't matter whether we apply the projection matrix to the first or second argument of the dot product. Some simple identities then imply $P = P^T$, so $P$ is symmetric (see Step 2 below if you aren't familiar with this property).
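
To see the claim numerically, here is a minimal pure-Python sketch. The helper names (`dot`, `matvec`, `three_products`) and the specific vectors and matrices are illustrative assumptions, not part of the argument above. It compares an orthogonal projection onto the line spanned by $(1, 2)$, built from the standard formula $a a^T / (a \cdot a)$, with an oblique projection onto the x-axis.

```python
def dot(x, y):
    """Euclidean dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(x, y))


def matvec(M, x):
    """Matrix-vector product, with M given as a list of rows."""
    return [dot(row, x) for row in M]


def three_products(P, v, w):
    """Return ((Pv).w, v.(Pw), (Pv).(Pw)) for the matrix P."""
    Pv, Pw = matvec(P, v), matvec(P, w)
    return dot(Pv, w), dot(v, Pw), dot(Pv, Pw)


# Orthogonal projection onto the line spanned by a = (1, 2):
# P = a a^T / (a . a), which is symmetric by construction.
a = [1.0, 2.0]
aa = dot(a, a)
P_orth = [[a[i] * a[j] / aa for j in range(2)] for i in range(2)]

# Oblique projection onto the x-axis along the direction (1, -1).
# It satisfies P^2 = P but is not symmetric.
P_oblique = [[1.0, 1.0], [0.0, 0.0]]

# Two arbitrary test vectors (chosen only for illustration).
v = [3.0, -1.0]
w = [2.0, 5.0]

orth = three_products(P_orth, v, w)
obl = three_products(P_oblique, v, w)

print("orthogonal:", orth)
print("oblique:   ", obl)
```

For the orthogonal projection the three numbers agree up to floating-point rounding (about 2.4 each), while for the oblique projection they are 4, 21 and 14, so the "ignore everything outside the subspace" argument really does rely on orthogonality.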

Less intuitive Answer

If the above explanation isn't intuitive, we can use a little more math.

Step 1.

First, prove that the two dot products above are equal.

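Decompose $v$ and $w$ into a component in the subspace (subscript $p$) and a component orthogonal to it (subscript $n$): $$ v = v_p + v_n $$ $$ w = w_p + w_n $$ so that $P v = v_p$ and $P w = w_p$.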

The projection of a vector lies in the subspace, and the dot product of anything in this subspace with anything orthogonal to it is zero. We use this fact on the dot product of one vector with the projection of the other: $$ (P v) \cdot w = v_p \cdot w = v_p \cdot (w_p + w_n) = v_p \cdot w_p + v_p \cdot w_n = v_p \cdot w_p $$ $$ v \cdot (P w) = v \cdot w_p = (v_p + v_n) \cdot w_p = v_p \cdot w_p + v_n \cdot w_p = v_p \cdot w_p $$ Therefore $$ (Pv) \cdot w = v \cdot (Pw) $$

Step 2.

Next, we can show that a consequence of this equality is that the projection matrix $P$ must be symmetric. We begin by expressing the dot product in terms of transposes and matrix multiplication (using the identity $x \cdot y = x^T y$): $$ (P v) \cdot w = v \cdot (P w) $$ $$ (P v)^T w = v^T (P w) $$ $$ v^T P^T w = v^T P w $$ Since $v$ and $w$ can be any vectors, we may take $v = e_i$ and $w = e_j$ (standard basis vectors), which picks out the $(i,j)$ entry of $P^T$ on the left and of $P$ on the right. All entries therefore agree, so: $$ P^T = P $$

Solution 3:

You can also directly verify this by definition of self-adjointness:

Let $U$ be a subspace of $V$, and let $P$ be the orthogonal projection onto $U$. Let $x, y \in V$ and decompose them as $x=u_x+v_x$, $y=u_y+v_y$, with $u_x, u_y \in U$ and $v_x, v_y \in U^\perp$, so that $\langle u_x, v_y \rangle = \langle v_x, u_y \rangle = 0$. By definition of $P$, $P x = u_x$ and $P y = u_y$.

Then $$ \langle Px, y \rangle = \langle u_x,u_y+v_y \rangle = \langle u_x,u_y \rangle = \langle u_x + v_x, u_y \rangle = \langle x, Py \rangle $$ It follows that $P$ is self-adjoint, so its matrix with respect to an orthonormal basis is Hermitian (i.e. symmetric when $V$ is real).

UPDATE: I just read your question again; here's a more intuitive explanation.

Remember:

  1. $\forall A \in \mathbb{R}^{n\times n} $, $\operatorname{null}(A)$ and $\operatorname{row}(A)$ are always orthogonal complements of each other in $\mathbb{R}^n$
  2. when $A$ describes a projection, $A^2=A$ and $\mathbb{R}^n = \operatorname{null}(A) \oplus \operatorname{col}(A)$

Now when $A$ describes an orthogonal projection, we also have $\operatorname{null}(A) \perp \operatorname{col}(A)$, so $\operatorname{null}(A)$ and $\operatorname{col}(A)$ are orthogonal complements of each other in $\mathbb{R}^n$. Since the orthogonal complement of $\operatorname{null}(A)$ is $\operatorname{row}(A)$ by fact 1, this forces $\operatorname{row}(A) = \operatorname{col}(A)$, and for a projection this happens iff $A$ is symmetric.
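
A small worked example may help here (the matrices are picked for illustration only). Take two projections onto the same line $\operatorname{span}\{(1,1)\}$: $$ A = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}, \qquad B = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}. $$ Both are idempotent and both have $\operatorname{col} = \operatorname{span}\{(1,1)\}$. For $A$ we get $\operatorname{row}(A) = \operatorname{span}\{(1,0)\}$ and $\operatorname{null}(A) = \operatorname{span}\{(0,1)\}$: the row space is perpendicular to the null space, as it always is, but the column space is not, since $(1,1)\cdot(0,1) = 1 \neq 0$. So $\operatorname{row}(A) \neq \operatorname{col}(A)$ and $A$ is not symmetric. For $B$ we get $\operatorname{row}(B) = \operatorname{col}(B) = \operatorname{span}\{(1,1)\}$ and $\operatorname{null}(B) = \operatorname{span}\{(1,-1)\}$, which is perpendicular to the column space, and $B = B^T$.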