Difference between Grassmann and Stiefel manifolds

I'm working on an optimization problem on manifolds and I'm having a bit of a conceptual issue with choosing between the Grassmann and Stiefel manifolds. Grassmann(2, 3) is the linear subspace of dimension 2 within the space $\mathbb{R}^3$, so all planes through the origin. So a point on the manifold corresponds to a plane, invariant to linear mixing of support vectors. Stiefel(2, 3) would be all possible planes through the origin that are the span of two orthonormal vectors. So my questions are:

  1. Does that mean that a point on the Stiefel manifold is also a point on the Grassmann manifold (in the sense that all linear subspaces of a given span can be reduced to the same orthonormal components)?

  2. What do I gain by optimizing on the Grassmann vs the Stiefel manifold, if the parametrization of the points is the same?


Solution 1:

To understand how to choose between Grassmann and Stiefel manifolds, it helps to understand better the difference. In the following I will only look at the case of real vector space $\mathbb{R}^n$. Much information can be found at https://en.wikipedia.org/wiki/Stiefel_manifold and https://en.wikipedia.org/wiki/Grassmannian.

The Grassmannian $Gr(k,n)$ is the (compact) manifold of all $k$-dimensional linear subspaces in $\mathbb{R}^n$. So the one-dimensional case $k=1$ is real projective space. The Stiefel manifold $V_k(n)$ is (compact) manifold of all (orthogonal) $k$-frames (hereafter we leave out "orthogonal" which is always implied). A $k$-frame is a set of $k$ orthonormal vectors in $\mathbb{R}^n$, so can alternatively be described as the manifold of $n\times k$ column orthogonal matrices.

Let us compare the definitions for a few low-dim cases:

- $Gr(1,n)$ and $V_1(n)$

$Gr(1,n)$ is the set of all line through the origin, while $V_1(n)$ is the set of all unit vectors, so is the unit sphere. To each point in $Gr(1,n)$ there corresponds two (antipodal) unit vectors in $V_1(n)$.

- $Gr(2,n)$ and $V_2(n)$

$Gr(2,n)$ is the set of all 2-planes (through the origin) while $V_2(n)$ is all 2-frames in $\mathbb{R}^n$. To each point in $Gr(2,n)$ (that is, plane) there corresponds in $V_2(n)$ the set of all orthogonal bases for that plane.

-

General case

There is a natural projection $p\colon V_r(n) \mapsto Gr(r,n)$ which to each frame in $V_r(n)$ sends it to the plane in $Gr(r,n)$ of which it is a basis. In this way, we can define $Gr(r,n)$ as a quotient space of $V_r(n)$.

What does all this mean for your optimization problem? If you are only interested in linear subspaces, use $Gr(r,n)$, but if how you parametrize the space with an orthogonal basis is important, use the Stiefel manifold.

But, algorithmically, it could be easier to represent a Stiefel manifold, so you can use that but build into the step-finding algorithm that you avoid walking to a new frame which is a basis for the same subspace. See Optimization Algorithms on Matrix Manifolds