At the risk of introducing one more a priori different topology on the Grassmannian, here's an easy way to see it's compact. I write $\mathbb{R}$ in what follows, though this applies equally well to $\mathbb{C}$.

We can realize $G(V,d)$ as a quotient of something compact, so it is compact. Namely, write $S := \{v \in V: \|v\| = 1 \}$; after picking an isomorphism $(V, \langle\ ,\ \rangle) \simeq (\mathbb{R}^n, \cdot)$, this is just $S^{n-1} \subseteq \mathbb{R}^n$.

Now consider the collection $C \subseteq S^d$ of $d$-tuples of pairwise orthogonal unit vectors, where $S^d$ here denotes the $d$-fold product $S \times \cdots \times S$, which is compact. This collection is the zero locus of the continuous map $$S^d \rightarrow \mathbb{R}^{\binom{d}{2}}, \quad (v_1, \ldots, v_d) \mapsto (v_i \cdot v_j)_{i< j}$$ So $C$ is the preimage of a point, hence closed, hence compact in the topology induced from $S^d$. To get the Grassmannian, quotient $C$ by declaring two $d$-tuples to be the same if they span the same $d$-plane.
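For readers who like to see such constructions concretely, here is a minimal numerical sketch of this picture (a numpy sketch, not part of the proof; the dimensions $n$ and $d$ are arbitrary choices, and a $d$-plane is represented by its projection matrix): orthonormal $d$-tuples are exactly the zero locus of the map above, and two tuples with the same span become the same point of the quotient.

```python
import numpy as np

n, d = 5, 2  # arbitrary ambient dimension and plane dimension

def F(vs):
    """The map (v_1, ..., v_d) |-> (v_i . v_j)_{i<j}; C is its zero locus."""
    return np.array([vs[i] @ vs[j] for i in range(d) for j in range(i + 1, d)])

def span_projection(vs):
    """Orthogonal projection onto the span of an orthonormal d-tuple; two
    tuples are identified in the quotient iff these matrices coincide."""
    B = np.column_stack(vs)
    return B @ B.T

# an orthonormal d-tuple, produced here by QR purely for convenience
Q, _ = np.linalg.qr(np.random.randn(n, d))
vs = [Q[:, j] for j in range(d)]
print(np.allclose(F(vs), 0))  # True: the tuple lies in C

# rotating the tuple inside its span gives a different point of C ...
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
ws = [Q @ R[:, j] for j in range(d)]
print(np.allclose(F(ws), 0))  # ... still a point of C ...
# ... but the same point of the Grassmannian:
print(np.allclose(span_projection(vs), span_projection(ws)))  # True
```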


The easiest explanation that I can think of (at the moment) is the following:

The Grassmann manifold $G_{n}(m)$, consisting of all $n$-dimensional subspaces of $\mathbb{R}^m$, is a homogeneous space: the orthogonal group $O(m)$ acts transitively on it, via its natural action on the Stiefel manifold $V_{n}(m)$ of orthonormal $n$-frames. The Lie group $O(m)$ is compact, and $G_{n}(m)$ is the continuous image of $O(m)$ under this action, so we conclude that $G_{n}(m)$ is a compact space.

I think that there is probably an easier explanation, but the above is the conceptual reason why the Grassmann manifold is compact. In any case, I think Grassmann manifolds are conceptually simple to understand if you think of them as homogeneous spaces.
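To make the homogeneous-space picture concrete, here is a small numerical illustration (a numpy sketch, not part of the argument; the choices of $m$ and $n$ and the QR-based sampling are mine): an element of $O(m)$ is sent to the span of its first $n$ columns, represented by the corresponding projection matrix. This is one way to phrase the surjection from the compact group onto $G_n(m)$ that underlies the argument.

```python
import numpy as np

m, n = 4, 2  # arbitrary choices

def random_orthogonal(m):
    """A rough way to sample an element of O(m): QR of a random square matrix."""
    Q, _ = np.linalg.qr(np.random.randn(m, m))
    return Q

def to_grassmannian(Q, n):
    """O(m) -> G_n(m): span of the first n columns, as a projection matrix."""
    B = Q[:, :n]
    return B @ B.T

Q = random_orthogonal(m)
P = to_grassmannian(Q, n)
# sanity checks: P is a symmetric, idempotent projection of rank n
print(np.allclose(P, P.T), np.allclose(P @ P, P), round(np.trace(P)))
```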


This can be seen directly. I use $d(X,Y)=\|P_X-P_Y\|$, where $P_X$ denotes the orthogonal projection onto $X$. To show $G(V,d)$ is compact, it suffices to show that $P:=\{P_X\}\subseteq End(V)$ is compact, where $P$ is the collection of all orthogonal projections onto $d$-planes.

(That this suffices is arguably the definition: take a sequence of $d$-planes $X_i$ and consider the $P_{X_i}$; extract a convergent subsequence $P_{X_j}\rightarrow P_Y$; then, by definition, $X_j\rightarrow Y$ as desired.)

To show compactness, it is enough by Heine-Borel to show that $P$ is closed and bounded, since choosing a basis for $V$ gives a homeomorphism $End(V)\simeq\mathbb{R}^{n^2}$.

I use the following convention: let $b_i$ be an orthonormal basis for $V$, and for $T\in End(V)$ set $$\|T\|:=\sqrt{\sum_i \|T(b_i)\|^2}=\sqrt{\sum_i\langle T(b_i), T(b_i)\rangle}$$ (this is the Frobenius norm of the matrix of $T$ in the basis $b_i$). One checks (i) this is indeed a norm, and (ii) this norm is invariant under orthogonal change of basis.
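As a quick sanity check of this convention (a numpy sketch, not needed for the proof; the dimension and the random orthogonal matrix are arbitrary choices), the quantity above agrees with the built-in Frobenius norm, and the invariance claim (ii) can be verified numerically for a random orthogonal change of basis:

```python
import numpy as np

def norm(T):
    """||T|| = sqrt( sum_i ||T(b_i)||^2 ) for the standard basis b_i (the columns of T)."""
    return np.sqrt(sum(np.linalg.norm(T[:, i]) ** 2 for i in range(T.shape[1])))

n = 5
T = np.random.randn(n, n)
Q, _ = np.linalg.qr(np.random.randn(n, n))   # a random orthogonal matrix

print(np.isclose(norm(T), np.linalg.norm(T, 'fro')))   # same as the Frobenius norm
print(np.isclose(norm(T), norm(Q.T @ T @ Q)))          # invariance under orthogonal change of basis
```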

Bounded: I claim we have the trivial bound $\|P_X\|\leqslant n$. Why? Each column has norm at most 1, as projection only ever decreases a vector's norm and the $b_i$ have norm 1; hence $\|P_X\|\leqslant\sqrt{n}\leqslant n$. (To see that projection only decreases norms, write any $v$ uniquely as $w+w^\perp$ with $w\in X$, $w^\perp\in X^\perp$; then $$\|P_X(v)\|=\sqrt{\langle w,w\rangle},\quad\|v\|=\sqrt{\langle w+w^\perp,w+w^\perp\rangle}=\sqrt{\langle w,w\rangle+\langle w^\perp,w^\perp\rangle}$$ by bilinearity.)
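Here is a small numerical spot check of the bound (a numpy sketch; $n$ and $d$ are arbitrary, and the random $d$-plane is generated by QR for convenience):

```python
import numpy as np

n, d = 6, 3
B, _ = np.linalg.qr(np.random.randn(n, d))   # orthonormal basis of a random d-plane X
P = B @ B.T                                  # the projection P_X

col_norms = np.linalg.norm(P, axis=0)
print(col_norms.max() <= 1 + 1e-12)          # every column has norm at most 1
# ||P_X|| is well within the bound n (in fact it equals sqrt(d), since trace(P_X) = d)
print(np.linalg.norm(P, 'fro'), np.sqrt(d), n)
```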

Closed: This is slightly more involved. I first outline the steps in the proof, then fill in the details below.

  1. The first observation is that $d(X,Y)$ really does measure "distance" between $d$-planes. To be precise, if $d(X,Y)=\epsilon$, then there exist orthonormal bases $x_i$ for $X$ and $y_i$ for $Y$ such that $d(x_i, y_i)\leqslant f(\epsilon)$, where $\lim_{\epsilon \downarrow 0}f(\epsilon)=0$.

  2. If $P_{X_i}$ is a Cauchy sequence of projection operators then, possibly after passing to a subsequence, one can construct a sequence of orthonormal bases $x_{i,j}$ for $X_i$, $1\leqslant j\leqslant d$, such that for each fixed $j$ the sequence $x_{i,j}$ is Cauchy. Informally, we have a Cauchy sequence not of single vectors but of orthonormal bases.

  3. The $x_{i,j}\rightarrow y_j$, a collection of $d$ orthonormal vectors; writing their span as $Y$, we have $P_{X_i}\rightarrow P_Y$, as desired.
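Before the proofs, here is a simple numerical illustration of what the three steps combine to say (a numpy sketch; the convergent family of planes is constructed explicitly by rotating a fixed plane, so this illustrates the conclusion rather than proving it): a convergent sequence of projections onto $d$-planes has a limit which is again an orthogonal projection onto a $d$-plane, i.e. the limit stays inside $P$.

```python
import numpy as np

n, d = 4, 2
B0, _ = np.linalg.qr(np.random.randn(n, d))   # orthonormal basis of a fixed d-plane Y

def rotated_projection(t):
    """Projection onto the plane obtained by rotating B0 by angle t in the first two coordinates."""
    R = np.eye(n)
    R[0, 0], R[0, 1], R[1, 0], R[1, 1] = np.cos(t), -np.sin(t), np.sin(t), np.cos(t)
    B = R @ B0
    return B @ B.T

limit = rotated_projection(0.0)                # P_Y
for i in [1, 10, 100, 1000]:
    P_i = rotated_projection(1.0 / i)          # P_{X_i}
    print(i, np.linalg.norm(P_i - limit, 'fro'))   # d(X_i, Y) -> 0

# the limit is itself a symmetric, idempotent projection of rank d, so it lies in P
print(np.allclose(limit, limit.T), np.allclose(limit @ limit, limit), round(np.trace(limit)))
```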

Right, now the proofs. To be honest, they're kinda ugly:

(1) Informally, one takes an orthonormal basis for $X$, projects it down to $Y$, tweaks the projected vectors to make them perpendicular, and then normalizes them.

Let $x_1,\ldots, x_d$ be an orthonormal basis for $X$. Write $x_1=P_Y(x_1)+P_Y(x_1)^\perp$; then, since $P_X(x_1)=x_1$, $$d(x_1, P_Y(x_1))=\|P_Y(x_1)^\perp\|=\|(P_X-P_Y)(x_1)\|\leqslant\|P_X-P_Y\|=\epsilon$$ Hence, by the reverse triangle inequality, $\|P_Y(x_1)\|\geqslant 1-\epsilon$, so setting $y_1:=\frac{P_Y(x_1)}{\|P_Y(x_1)\|}$ we get $$d(x_1, y_1)\leqslant d(x_1, P_Y(x_1))+d(P_Y(x_1), y_1)\leqslant 2\epsilon$$ since the second term is $|1-\|P_Y(x_1)\||\leqslant\epsilon$.

Suppose we have defined $y_1,\ldots, y_{k-1}$ satisfying $d(x_i, y_i)\rightarrow 0$ as $\epsilon\downarrow 0$, which I denote by $d(x_i, y_i)\sim v(\epsilon)$ ("vanishes" with $\epsilon$). By abuse of notation, I also use $v(\epsilon)$ to denote vectors whose length vanishes with $\epsilon$, e.g. $x_i-y_i=v(\epsilon)$.

Consider $P_Y(x_k)$. For $i<k$ we have $$\langle P_Y(x_k),y_i\rangle=\langle x_k+v(\epsilon),\,x_i+v(\epsilon)\rangle=\langle x_k,x_i\rangle+\langle x_k, v(\epsilon)\rangle+\langle x_i,v(\epsilon)\rangle+\langle v(\epsilon),v(\epsilon)\rangle=v(\epsilon)$$ by bilinearity and Cauchy-Schwarz, since $\langle x_k,x_i\rangle=0$. Hence subtracting off the components of $P_Y(x_k)$ along $y_1,\ldots, y_{k-1}$ results in a vector $P_Y(x_k)'$ of length $1-v(\epsilon)$, and setting $y_k:=\frac{P_Y(x_k)'}{\|P_Y(x_k)'\|}$, we have $$d(x_k, y_k)\leqslant d(x_k,P_Y(x_k))+d(P_Y(x_k),P_Y(x_k)')+d(P_Y(x_k)',y_k)=v(\epsilon)$$ as desired.
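Here is a numerical sketch of the construction in (1) (numpy; the nearby plane $Y$ is produced by perturbing a basis of $X$, an arbitrary choice made purely for illustration): project an orthonormal basis of $X$ into $Y$, re-orthogonalize and normalize as above, and compare the resulting basis of $Y$ to the original one.

```python
import numpy as np

n, d = 5, 3
BX, _ = np.linalg.qr(np.random.randn(n, d))               # orthonormal basis x_1,...,x_d of X
BY, _ = np.linalg.qr(BX + 1e-3 * np.random.randn(n, d))   # orthonormal basis of a nearby plane Y
PX, PY = BX @ BX.T, BY @ BY.T
eps = np.linalg.norm(PX - PY, 'fro')                       # d(X, Y)

ys = []
for k in range(d):
    w = PY @ BX[:, k]                 # project x_k into Y
    for y in ys:                      # subtract off the components along y_1,...,y_{k-1}
        w = w - (w @ y) * y
    ys.append(w / np.linalg.norm(w))  # normalize to get y_k

for k in range(d):
    print(np.linalg.norm(BX[:, k] - ys[k]), "vs d(X,Y) =", eps)   # comparable to d(X,Y)
```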

(2) Let $P_{X_i}$ be a Cauchy sequence; construct a subsequence $P_{Y_j}$ as follows:

Inductively, suppose we have chosen (i) $Y_{j-1}=X_{m}$ such that for all $k\geqslant m$, given any orthonormal basis of $X_m$ we can pick an orthonormal basis of $X_k$ whose vectors are within $\frac{1}{2^{j-1}}$ of the corresponding vectors of the given basis (we can do this by 1, since the $P_{X_i}$ are Cauchy), and (ii) an orthonormal basis $x_{j-1,i}$ for $Y_{j-1}$.

Now pick $Y_j$ further along the sequence so that (i) holds with the bound $\frac{1}{2^{j}}$ in place of $\frac{1}{2^{j-1}}$, and (ii) there is an orthonormal basis $x_{j,i}$ for $Y_j$ satisfying $d(x_{j-1,i},x_{j,i})\leqslant\frac{1}{2^{j-1}}$.

Why does this work? Well, by construction, for fixed $i$ the distances between consecutive terms of the sequence $x_{j,i}$ decrease geometrically; that the sequence is Cauchy then follows immediately from (i) the triangle inequality and (ii) the convergence of geometric series. Namely, for any $\epsilon$ we can pick $j$ such that $\sum_{k\geqslant j}2^{-k}<\epsilon$; hence for any $m\geqslant n\geqslant j$ we have $$d(x_{n,i}, x_{m,i})\leqslant\sum_{k=n}^{m-1}d(x_{k, i},x_{k+1, i})\leqslant\sum_{k=j}^\infty d(x_{k,i},x_{k+1,i})\leqslant\sum_{k\geqslant j}2^{-k}<\epsilon$$

(3) Since each sequence $x_{j,i}$ is Cauchy and $V$ is complete, $x_{j,i}\rightarrow y_i$; by the (tautological) continuity of $\langle\ ,\ \rangle:V \times V\rightarrow \mathbb{R}$ and $\|\;\|:V\rightarrow\mathbb{R}$, we deduce that the $y_i$ are orthonormal. I claim $P_{Y_j}\rightarrow P_Y$, where $Y$ is the span of the $y_i$. The proof is the "reverse" of 1 and proceeds similarly: namely, if $P_X,P_Y$ are projections such that $X,Y$ admit orthonormal bases $x_i,y_i$ satisfying $d(x_i, y_i)\leqslant\epsilon$, then $d(P_X, P_Y)=v(\epsilon)$. Finally, since the original sequence $P_{X_i}$ is Cauchy and the subsequence $P_{Y_j}$ converges to $P_Y$, the whole sequence converges to $P_Y$, which again lies in $P$.
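Here is a numerical sketch of this "reverse" statement (numpy; the vectorwise-close basis of $Y$ is built by perturbing the basis of $X$ and re-orthogonalizing, an arbitrary choice made for illustration): orthonormal bases that are close vector by vector give projections that are correspondingly close.

```python
import numpy as np

n, d, eps = 6, 3, 1e-4

def gram_schmidt(vectors):
    """Orthonormalize a list of vectors (classical Gram-Schmidt)."""
    out = []
    for v in vectors:
        for u in out:
            v = v - (v @ u) * u
        out.append(v / np.linalg.norm(v))
    return np.column_stack(out)

BX, _ = np.linalg.qr(np.random.randn(n, d))    # orthonormal basis x_i of X
# orthonormal basis y_i of Y, built to be vectorwise within O(eps) of the x_i
BY = gram_schmidt([BX[:, k] + eps * np.random.randn(n) for k in range(d)])

print(np.linalg.norm(BX - BY, axis=0).max())           # the bases are O(eps) apart, vectorwise
print(np.linalg.norm(BX @ BX.T - BY @ BY.T, 'fro'))    # hence d(P_X, P_Y) is O(eps) as well
```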