Inner product of matrices bounded by sum of products of their singular values [duplicate]

Let $A,B$ have the appropriate size. How can we show Von Neumann Trace inequality $ \text{Tr}(AB) \leq \sum_{i=1}^n \sigma_{A,i}\sigma_{B,i} $?

Also, what is the intuition behind this inequality?


Solution 1:

One may view the trace inequality as a pile of Cauchy-Schwarz inequalities.

Let us tackle the (more general) complex case here. Using the tracial property and singular value decomposition, one can reduce the inequality $|\operatorname{tr}(AB)|\le\sum_i\sigma_i(A)\sigma_i(B)$ to $$ |\operatorname{tr}(DUSV^\ast)|\le\operatorname{tr}(DS)\tag{1} $$ where $U,V$ are two unitary matrices and $D=\operatorname{diag}(d_1,\ldots,d_n),\,S=\operatorname{diag}(s_1,\ldots,s_n)$ are two diagonal matrices with nonnegative and decreasing diagonal entries. Let $P_k$ denotes the orthogonal projection matrix $I_k\oplus0_{n-k}$. Note that $D$ is a non-negatively weighted combinations of the $P_i$s. In fact, $$ D=(d_1-d_2)P_1+\cdots+(d_{n-1}-d_n)P_{n-1}+d_nP_n $$ and similarly for $S$. For convenience, let us write $D=\sum_ka_kP_k$ and $S=\sum_lb_lP_l$, where the $a_k$s and $b_l$s are nonnegative. The inequality $(1)$ thus becomes $$ \left|\sum_{k,l}a_kb_l\operatorname{tr}(P_kUP_lV^\ast)\right| \le\sum_{k,l}a_kb_l\operatorname{tr}(P_kP_l).\tag{2} $$ So, by triangle inequality, it suffices to prove that $$ \left|\operatorname{tr}(P_kUP_lV^\ast)\right| \le\operatorname{tr}(P_kP_l)\tag{3} $$ for each pair of $k$ and $l$. Assume that $k\ge l$, or else interchange the roles of $k$ and $l$. As $P_kUP_l=[P_ku_1,\,\ldots,\,P_ku_l,\,0,\ldots,0]$, the inequality $(3)$ is equivalent to $$ \left|\sum_{i=1}^l \langle P_ku_i,\,v_i\rangle\right|\le l.\tag{4} $$ Since $P_k$ is an orthogonal projection and the columns of $U$ are unit vectors, $\|P_ku_i\|_2\le1$. Therefore $(4)$ follows from Cauchy-Schwarz inequality.

Solution 2:

A different (non-standard) proof:

Consider the SVD of $A$, with singular values $\sigma_i(A)$ and associated orthonormal bases $e_i$, $\tilde{e}_i$; similarly for $B$, with bases $f_j$, $\tilde{f}_j$. Then \begin{align} \mathrm{tr}(A^*B)&=\mathrm{tr}\langle Ae_i,Be_i\rangle\\ &=\sum_i\sigma_i(A)\langle\tilde{e}_i,Be_i\rangle\\ &=\sum_{ij}\sigma_i(A)\langle\tilde{e}_i,\tilde{f}_j\rangle\langle \tilde{f}_j,Be_i\rangle \\ &=\sum_{ij}\sigma_i(A)\sigma_j(B)\langle\tilde{e}_i\tilde{f}_j\rangle\langle f_j,e_i\rangle\\ |\mathrm{tr}(A^*B)|&\le\sum_{ij}\sigma_i(A)\sigma_j(B)(|\langle\tilde{e}_i,\tilde{f}_j\rangle|^2+|\langle f_j,e_i\rangle|^2)/2\\ &=\vec{\sigma}(A)^*\,T\,\vec{\sigma}(B) \end{align} $T$ is a doubly stochastic matrix since $$\sum_iT_{ij}=\tfrac{1}{2}\sum_i|\langle\tilde{e}_i,\tilde{f}_j\rangle|^2+\tfrac{1}{2}\sum_i|\langle f_j,e_i\rangle|^2|=\tfrac{1}{2}\|\tilde{f}_j\|^2+\tfrac{1}{2}\|f_j\|^2=1$$ (and similarly for columns).

But by Birkhoff's theorem, every doubly stochastic matrix is the convex sum of permutation matrices, $$T=\sum_k\alpha_kP_k,\qquad\qquad \sum_k\alpha_k=1,$$ hence $$|\mathrm{tr}(A^*B)|\le\sum_k\alpha_k(\vec{\sigma}(A)^*P_k\vec{\sigma}(B))\le\max_k\vec{\sigma}(A)^*P_k\vec{\sigma}(B)$$ By the rearrangement inequality, the largest sum of $(\sigma_i(A))$ dotted with the permuted values of $(\sigma_i(B))$ is the one in which the values align from biggest to smallest. This is precisely the von Neumann trace inequality.