I'm looking for a reference for a matrix-norm inequality that I used in this answer, which has a few equivalent forms. I will use notation that applies to complex vector spaces with a sesquilinear inner product, but of course the same applies over real matrices.

The statement is as follows:

Take $A,B \in \Bbb F^{n \times n}$. Then $$\vert\operatorname{tr}(A^*B)\vert \leq \sigma_1(A)\sum_{i=1}^n \sigma_i(B) = \|A\| \operatorname{tr}|B|$$ where $\sigma_i$ denotes the $i$th singular value, $|B| = (B^*B)^{1/2}$, and $\|\cdot\|$ denotes the spectral norm (induced Euclidean norm).
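
As a quick sanity check of the statement, here is a minimal sketch that tests the inequality on random matrices. It assumes real $2\times 2$ matrices, so that the singular values have a closed form via $\sigma_1^2+\sigma_2^2=\|M\|_F^2$ and $\sigma_1\sigma_2=|\det M|$; the helper names (`singular_values_2x2`, `trace_of_AtB`, `check`, `rand_mat`) are made up for illustration, and a numerical check is of course not a proof.

```python
import math
import random


def singular_values_2x2(m):
    """Singular values (largest first) of a real 2x2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    frob_sq = a * a + b * b + c * c + d * d   # sigma_1^2 + sigma_2^2
    det = a * d - b * c                       # |det| = sigma_1 * sigma_2
    # (sigma_1^2 - sigma_2^2)^2; clamp at 0 to absorb rounding error
    disc = math.sqrt(max(frob_sq ** 2 - 4.0 * det * det, 0.0))
    s1 = math.sqrt((frob_sq + disc) / 2.0)
    s2 = math.sqrt(max((frob_sq - disc) / 2.0, 0.0))
    return s1, s2


def trace_of_AtB(a, b):
    """tr(A^T B), which equals the sum of entrywise products in the real case."""
    return sum(a[i][j] * b[i][j] for i in range(2) for j in range(2))


def check(a, b, tol=1e-9):
    """Return True if |tr(A^T B)| <= sigma_1(A) * (sigma_1(B) + sigma_2(B))."""
    lhs = abs(trace_of_AtB(a, b))
    rhs = singular_values_2x2(a)[0] * sum(singular_values_2x2(b))
    return lhs <= rhs + tol


def rand_mat():
    """Random real 2x2 matrix with entries in [-1, 1]."""
    return [[random.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(2)]


random.seed(0)
all_ok = all(check(rand_mat(), rand_mat()) for _ in range(1000))
print(all_ok)
```

Taking $A=B=I$ shows that the bound can be attained: both sides equal $2$.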

I did manage to find some references, but they're overkill, and the texts themselves are not readily accessible to the faint of heart (Bhatia's text is dense and Pedersen's is not about matrices in particular).

A suitable reference would be greatly appreciated.


Solution 1:

Here is a proof that uses only linear algebra. It assumes familiarity with the singular value decomposition (SVD).

Lemma 1. For any square matrix $A$, $|\operatorname{tr}(A)|\le \sum_i \sigma_i(A)$.

Proof: Write the SVD as $A = U\Sigma V$ with $U$, $V$ unitary. By the cyclic property of the trace, $$\operatorname{tr}(A) = \operatorname{tr}(U\Sigma V) = \operatorname{tr}(\Sigma VU). $$ The matrix $Z=VU$ is again unitary, and $$|\operatorname{tr}(\Sigma Z)| = \Big|\sum_i \sigma_i(A)z_{ii}\Big|\le \sum_i |\sigma_i(A)z_{ii}|\le \sum_i \sigma_i(A), $$ since every entry of a unitary matrix satisfies $|z_{ii}|\le 1$.
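
To see that the bound in Lemma 1 can be strict or sharp, here is a small worked example (the matrix $N$ is chosen only for illustration): $$N=\begin{pmatrix}0&1\\0&0\end{pmatrix},\qquad \operatorname{tr}(N)=0,\qquad N^*N=\begin{pmatrix}0&0\\0&1\end{pmatrix},\qquad \sigma_1(N)=1,\ \sigma_2(N)=0,$$ so $|\operatorname{tr}(N)| = 0 < 1 = \sum_i\sigma_i(N)$. For the identity matrix, on the other hand, $\operatorname{tr}(I)=n=\sum_i\sigma_i(I)$, so equality holds.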

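Lemma 2. For any matrices $A,B$, $\sigma_i(A^*B)\le \sigma_i(A)\sigma_1(B)$.

Proof: By the Courant-Fischer min-max theorem, $$ \sigma_i(A^*B) = \max_{\dim V=i}\min_{x\in V,\,\|x\|=1}\|A^*Bx\|. $$ If $B$ is not injective on $V$, the inner minimum is $0$ and there is nothing to prove. Otherwise, writing $\|A^*Bx\| = \|Bx\|\,\big\|A^*\big(Bx/\|Bx\|\big)\big\|$ gives $$ \min_{x\in V,\,\|x\|=1} \|A^*Bx\| \le \max_{x\in V,\,\|x\|=1}\|Bx\| \min_{y\in BV,\,\|y\|=1}\|A^*y\|, $$ so $$ \sigma_i(A^*B) \le \max_{\dim V=i}\Big(\max_{x\in V,\,\|x\|=1}\|Bx\| \min_{y\in BV,\,\|y\|=1}\|A^*y\|\Big) $$ $$ \le \max_{\dim V=i}\max_{x\in V,\,\|x\|=1}\|Bx\| \max_{\dim V=i}\min_{y\in BV,\,\|y\|=1}\|A^*y\| \le \sigma_1(B)\sigma_i(A^*) = \sigma_1(B)\sigma_i(A), $$ where the last equality holds because $A$ and $A^*$ have the same singular values.

Combining the two lemmas gives the claim. Since $\operatorname{tr}(A^*B)=\overline{\operatorname{tr}(B^*A)}$, Lemma 1 applied to $B^*A$ and Lemma 2 with the roles of $A$ and $B$ exchanged yield $$|\operatorname{tr}(A^*B)| = |\operatorname{tr}(B^*A)| \le \sum_i \sigma_i(B^*A) \le \sigma_1(A)\sum_i\sigma_i(B).$$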
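
As a quick illustration of Lemma 2 (the diagonal matrices are picked only to make the singular values easy to read off), take $$A=\begin{pmatrix}2&0\\0&1\end{pmatrix},\qquad B=\begin{pmatrix}1&0\\0&3\end{pmatrix},\qquad A^*B=\begin{pmatrix}2&0\\0&3\end{pmatrix}.$$ Then $\sigma_1(A^*B)=3$ and $\sigma_2(A^*B)=2$, while $\sigma_1(A)\sigma_1(B)=2\cdot 3=6$ and $\sigma_2(A)\sigma_1(B)=1\cdot 3=3$, so indeed $3\le 6$ and $2\le 3$.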

Solution 2:

This holds more generally when $\Bbb F^n$ is replaced by an arbitrary Hilbert space $H$.

Let $\mathcal K(H)$, $\ell^1(H)$, and $\mathcal L(H)$ denote the compact, the trace-class, and all bounded linear operators on $H$, respectively. They form a chain of dual Banach spaces, i.e., each is followed by its (topological) dual.
The duality is implemented by the trace pairing, which is exactly the expression in the question. In detail:
Every continuous linear functional $\varphi$ on $\mathcal K(H)$ has the form $$\varphi(k)=\operatorname{tr}(kt)$$ for some fixed $t\in\ell^1(H)$, and one gets $\big(\mathcal K(H)\big)'=\ell^1(H)$.
Furthermore, every continuous linear functional $\phi$ on $\ell^1(H)$ has the form $$\phi(t)=\operatorname{tr}(tx)$$ for some $x\in\mathcal L(H)$, hence $\big(\ell^1(H)\big)'=\mathcal L(H)$.

Note that $\sum_{i=1}^\infty \sigma_i(t) = \operatorname{tr}(\,|t|\,) = \|t\|_{\ell^1}$ is the trace-class norm, and the inequality in the question expresses the continuity of the pairing in each case.
In the finite-dimensional case one has $\mathcal K(H)=\ell^1(H)=\mathcal L(H)$ as sets (though with different norms), but what remains is the conceptual and worthwhile view that the trace implements the duality.
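
In finite dimensions this duality can be made concrete with a short computation, sketched here under the assumption that $B$ has the polar decomposition $B=U|B|$ with $U$ unitary (which always exists for square matrices). Taking $A=U$ gives $\|A\|=1$ and $$\operatorname{tr}(A^*B)=\operatorname{tr}(U^*U|B|)=\operatorname{tr}|B|,$$ so the inequality is attained and $$\operatorname{tr}|B| = \max_{\|A\|\le 1}\,|\operatorname{tr}(A^*B)|.$$ In other words, the trace norm of $B$ is exactly the norm of the functional $A\mapsto\operatorname{tr}(A^*B)$ with respect to the spectral norm.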

I hope all this is helpful and not overkill.

References off the top of my head:

  • Barry Simon: Trace Ideals and Their Applications
  • Gohberg & Krein: Introduction to the Theory of Linear Nonselfadjoint Operators
  • Reed & Simon: Methods of Modern Mathematical Physics, Volume 1