Is Schur complement better conditioned than the original matrix?

Consider the following linear system (in block form) with s.p.d. matrix: $$ \begin{pmatrix} A & B\\ B^\top & C \end{pmatrix} \begin{pmatrix} x\\y \end{pmatrix} = \begin{pmatrix} f\\g \end{pmatrix} $$ I'm wondering if elimination of some variables does improve conditioning of the system matrix. If we eliminate $y$ first, by substituting $y = C^{-1}g - C^{-1}B^\top x$ the following system is obtained: $$ (A - BC^{-1}B^\top) x = f - BC^{-1}g. $$ The matrix of the new system is simply the Schur complement of the block $C$.

The question is whether the conditioning number of the resulting system is less than the conditioning number of the original one? The case $C = I$ is particularly interesting.

I've tried using formula $$ 0 = \det \begin{pmatrix} A - \lambda I& B\\ B^\top & C - \lambda I \end{pmatrix} = \det(C - \lambda I) \det(A - \lambda I - B (C - \lambda I)^{-1}B^\top), $$ but with no luck, though $A - B (C - \lambda I)^{-1}B^\top$ seems to be quite close to $A - BC^{-1}B^\top$.

Numerical experiments show that the Schur complment is always better conditioned than the original matrix, here's my code.

Experiments also show that not only s.p.d, but also diagonally dominant M-matrices share this property.


Lemma. If $P$ and $Q$ are two $n\times n$ Hermitian matrices and $\operatorname{nullity}(Q)=k>0$, the minimum eigenvalue of $P+Q$ is bounded above by the $k$-th largest eigenvalue of $P$.

Proof of lemma. By Courant-Fischer minimax principle, \begin{align} \lambda_\min(P+Q) &=\min_{\|x\|=1}x^\ast(P+Q)x\\ &\le\min\limits_{\substack{x\in\ker(Q)\\ \|x\|=1}}x^\ast(P+Q)x\\ &=\min\limits_{\substack{x\in\ker(Q)\\ \|x\|=1}}x^\ast Px\\ &\le\max\limits_{\substack{V\subseteq\mathbb C^n\\ \dim V=k}} \min\limits_{\substack{x\in V\\ \|x\|=1}}x^\ast Px\\ &=\lambda_k^{\downarrow}(P).\ {}_{\large\square} \end{align}

Now, return to your question. Suppose $A$ is $k\times k$ and $C$ is $(n-k)\times(n-k)$. Denote by $S$ the Schur complement $A-BC^{-1}B^\top$. Your block matrix is then $P+Q$, where $$ P=\pmatrix{S\\ &0}\ \text{ and }\ Q=\pmatrix{BC^{-1/2}\\ C^{1/2}}\begin{matrix}\pmatrix{C^{-1/2}B^\top&C^{1/2}}\\ {}\end{matrix}. $$ We have $\lambda_\min(P+Q)\le\lambda_k^{\downarrow}(P)=\lambda_\min(S)$ by our lemma and $\lambda_\max(S)=\lambda_\max(P)\le\lambda_\max(P+Q)$ because $Q\succeq0$. Since $\lambda_\min(P+Q)>0$, we conclude that $$ \kappa(S)=\frac{\lambda_\max(S)}{\lambda_\min(S)}\le\frac{\lambda_\max(P+Q)}{\lambda_\min(P+Q)}=\kappa(P+Q). $$