Maximize trace over Stiefel manifold
This question is the same as the question in this post. The OP of that post changed what they were asking and reduced it to a special case, so I’m asking the question in full generality here.
Given symmetric $A \in \mathbb{R}^{m\times{m}}$, solve the optimization problem in $X \in \mathbb{R}^{m\times{n}}$
$$\begin{array}{rl} \max&\mathrm{Tr}(X^TAX)\\ \text{s.t.}&X^TX=I \end{array}$$
Now, if $X$ is square, then the objective function satisfies $$ \mathrm{Tr}(X^TAX)=\mathrm{Tr}(AXX^T)=\mathrm{Tr}(A) $$ for any orthogonal $X$. So we are interested in the case when $X \in \mathbb{R}^{m\times{n}}$ is tall (more rows than columns).
Attempt: Let $A=VDV^T$ denote the eigendecomposition of $A$. Then the objective function satisfies: $$ \mathrm{Tr}(X^TAX)=\mathrm{Tr}(X^TVDV^TX)=\mathrm{Tr}(DV^TXX^TV)=\langle{D,V^TXX^TV}\rangle. $$ If $D$ has non-negative entries (i.e. $A$ is positive semidefinite), I believe (but I’m not certain) that this expression is maximized when $V^TXX^TV=I$. However, this can never happen, since the outer product of two tall matrices can’t equal the identity. My guess is that, in the positive semidefinite case, you might pick eigenvectors of the $n$ largest eigenvalues of $A$.
Does this problem have a nice solution in general?
Solution 1:
Yes, there is indeed a nice solution in general. We find that $$ \begin{array}{rl} \max&\mathrm{Tr}(X^TAX)\\ \text{s.t.}&X \in \Bbb R^{m \times n}, \ X^TX=I \end{array} $$ is given by $\sum_{i=1}^m \lambda_i$ where $\lambda_1 \geq \cdots \geq \lambda_n$ are the eigenvalues of $A$ (which coincides with your suspicion about the positive semidefinite case).
One can prove that $\mathrm{Tr}(X^TAX) \leq \sum_{i=1}^m \lambda_i$ using the Schur-Horn theorem, for instance. In order to see that this upper-bound is attained, take the columns of $X$ to be the eigenvectors corresponding to the largest eigenvalues.