Maximize $\langle \mathrm A , \mathrm X \rangle$ subject to $\| \mathrm X \|_2 \leq 1$

Given $\mathrm A \in \mathbb R^{m \times n}$,

$$\begin{array}{ll} \text{maximize} & \langle \mathrm A , \mathrm X \rangle\\ \text{subject to} & \| \mathrm X \|_2 \leq 1\end{array}$$


Since $\| \mathrm X \|_2 \leq 1$ is equivalent to $\sigma_{\max} (\mathrm X ) \leq 1$, which is equivalent to $\lambda_{\max} (\mathrm X^\top \mathrm X) \leq 1$, we have

$$1 - \lambda_{\max} (\mathrm X^\top \mathrm X) = \lambda_{\min} (\mathrm I_n - \mathrm X^\top \mathrm X)\geq 0$$

and, thus,

$$\mathrm I_n - \mathrm X^\top \mathrm X \succeq \mathrm O_n$$

Using the Schur complement test for positive semidefiniteness, we obtain the following linear matrix inequality (LMI)

$$\begin{bmatrix} \mathrm I_m & \mathrm X\\ \mathrm X^\top & \mathrm I_n\end{bmatrix} \succeq \mathrm O_{m+n}$$

Hence, we have the following semidefinite program (SDP)

$$\begin{array}{ll} \text{maximize} & \langle \mathrm A , \mathrm X \rangle\\ \text{subject to} & \begin{bmatrix} \mathrm I_m & \mathrm X\\ \mathrm X^\top & \mathrm I_n\end{bmatrix} \succeq \mathrm O_{m+n}\end{array}$$

Is this correct? Any feedback would be highly appreciated. Thank you.


Solution 1:

This can be computed in closed-form via SVD. Indeed, given a matrix norm $\|.\|$ on $\mathbb C^{n,m}$, there is always a corresponding dual norm (i.e on the dual space $\mathbb C^{m,n}$) given by

$$\|A\|^{D} := \max_{\|X\| \le 1}\langle A, X\rangle_{\text{F}}, $$ where $\langle A, X\rangle_{\text{F}} := \text{trace}(AX^T)$ is the Frobenius / Hilbert-Schmidt inner product. In particular, it's not hard to show that $\|A\|_2^D = \|A\|_* := \sum_{k}\sigma_k(A)$, the nuclear norm of $A$.

Solution 2:

Let $A=\sum_i \lambda_i u_i v_i^T$ be the SVD of $A$. The nuclear norm of $A$ is by definition the sum of the singular values, $\sum_i \lambda_i$.

  • The fact that the maximum of your optimization problem is at most the nuclear norm of $A$ follows from $$\langle A, X\rangle = trace(A^TX) =\sum_i \lambda_i v_i^T X u_i \le \sum_i \lambda_i $$ by the triangle inequality because $X$ has operator norm at most one, so $|v_i^T X u_i|\le 1$.

  • The fact that $\sum_i \lambda_i$ can be achieved for some matrix $X$ with operator norm at most one follows by taking $X= \sum_i u_iv_i^T$.