A geometric way to reason about Schur complements?

Solution 1:

I hope I got the right idea what you would consider a "geometric" interpretation.

You can think of Schur complementation as splitting up a linear transform according to how it couples two complementary subspaces. If you represent a vector as the sum of two components in the subspaces, the result of acting on it with the linear transform has four parts: two corresponding to the diagonal blocks of the matrix, where the transform acts on each component separately, and two corresponding to the off-diagonal blocks of the matrix, where the transform mixes the two components. The Schur complement results from splitting the transform up into steps corresponding to this decomposition.

Imagine you're performing these operations on a computer, and you'd like to do everything "in place" -- you want to allow multiplying a vector by a matrix (x *= A) and adding a matrix multiple of a vector to another vector (x += B y), but you don't want to add two matrix multiples of vectors (z = A x + B y). (Here a op= b is short for a = a op b.) Directly applying a transform

$$Ax=\begin{pmatrix}P&K\\M&R\end{pmatrix}\begin{pmatrix}y\\z\end{pmatrix}$$

would require you to calculate $Py+Kz$ and $My+Rz$. To avoid that, you could first mix the lower component into the upper component (y += K z), then apply the diagonal transforms on the subspaces (y *= P and z *= R), and then mix the upper component into the lower component (z += M y), but the result wouldn't be quite right because you'd be mixing the lower component back into itself and multiplying the mixed components by the diagonal blocks. Correcting for this yields:

$$\begin{pmatrix}P&K\\M&R\end{pmatrix} = \begin{pmatrix}I&\\MP^{-1}&I\end{pmatrix} \begin{pmatrix}P&\\&R-MP^{-1}K\end{pmatrix} \begin{pmatrix}I&P^{-1}K\\&I\end{pmatrix} =: LDU \;, $$

which decomposes the linear transform into a part $U$ that mixes the lower component into the upper component, a block-diagonal part $D$ that acts on the subspaces separately, and a part $L$ that mixes the upper component into the lower component; the block-diagonal part contains the Schur complement $S=R-MP^{-1}K$.

Now if you have

$$EA=AF$$

with $E$ lower triangular and $F$ upper triangular, that says that it doesn't matter whether you first mix the lower component into the upper component using $F$ and then apply $A$, or first apply $A$ and then mix the upper component into the lower component using $E$. With our decomposition of $A$, this becomes

$$ELDU=LDUF\;.$$

Now $E$ and $L$ are both lower diagonal and mix the upper component into the lower component, and $U$ and $F$ are both upper diagonal and mix the lower component into the upper component. Since the inverse of a lower/upper triangular matrix is again lower/upper triangular (the inverse of mixing one component into the other is simply subtracting what was added), multiplying by $L^{-1}$ on the left and $U^{-1}$ on the right collects all the lower diagonal parts and all the upper diagonal parts:

$$\left(L^{-1}EL\right)D=D\left(UFU^{-1}\right)\;.$$

The key here is that now there's only mixing in one direction on each side, upper into lower on the left and lower into upper on the right, and mixing in only one direction doesn't affect the diagonal blocks, which describe how the transform acts separately on the components. So we can simply equate the lower right diagonal blocks, and since $L$ and $U$ are pure mixing and their diagonal blocks are identities, this yields

$$E'S=SF'\;.$$