Let $X\in\mathbb{R}^n$ and $Y\in\mathbb{R}^m$, $n\geq m$, be independent standard Gaussian random vectors, and let $D\in \mathbb{R}^{m\times m}$ be a symmetric positive-definite matrix.

I want to prove that $$ -E\|X\|_2+E\|Y\|_2\leq -\sqrt{n}+\sqrt{m}\quad\quad (\text{1}) $$ and $$ \dfrac{E\|\sqrt{D}Y\|_2}{\sqrt{tr(D)}}\leq \dfrac{E\|Y\|_2}{\sqrt{m}}\quad\quad (\text{2}) $$

(Here $\|x\|_2=\sqrt{x^{T}x}$, $tr(D)$ is the trace of $D$, and $\sqrt{D}$ is the positive-definite square root of $D$, i.e. $(\sqrt{D})^2=D$.)

For (1) I know by Jensen's inequality that $E\|X\|_2\leq \sqrt{E\|X\|_2^2}=\sqrt{n}$ and likewise $E\|Y\|_2\leq \sqrt{m}$. But how does this imply (1)?

For (2) I know (again by Jensen) that $E\|\sqrt{D}Y\|_2\leq \sqrt{tr(D)}$, but that doesn't help me obtain the bound, since the right-hand side of (2) is itself at most $1$.
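For what it's worth, a quick Monte Carlo check (a sketch assuming numpy; the particular $n$, $m$ and $D$ below are arbitrary choices) suggests both inequalities do hold:

```python
# Monte Carlo sanity check of (1) and (2); assumes numpy is available.
import numpy as np

rng = np.random.default_rng(0)
n, m, N = 8, 5, 200_000                  # arbitrary dimensions and sample size

X = rng.standard_normal((N, n))          # N samples of X in R^n
Y = rng.standard_normal((N, m))          # N samples of Y in R^m

# (1): -E||X||_2 + E||Y||_2 <= -sqrt(n) + sqrt(m)
lhs1 = -np.linalg.norm(X, axis=1).mean() + np.linalg.norm(Y, axis=1).mean()
print(lhs1, "<=", -np.sqrt(n) + np.sqrt(m))

# (2): E||sqrt(D) Y||_2 / sqrt(tr D) <= E||Y||_2 / sqrt(m), for an arbitrary PD matrix D
A = rng.standard_normal((m, m))
D = A @ A.T + m * np.eye(m)              # symmetric positive definite
w, V = np.linalg.eigh(D)
sqrtD = V @ np.diag(np.sqrt(w)) @ V.T    # symmetric square root of D
lhs2 = np.linalg.norm(Y @ sqrtD, axis=1).mean() / np.sqrt(np.trace(D))
print(lhs2, "<=", np.linalg.norm(Y, axis=1).mean() / np.sqrt(m))
```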

Any help will be appreciated.


Very nice questions! Can you share where you got them from?

For the second one:

We can show this claim by showing that the left-hand side, viewed as a function of $D$, is maximized at $D = I$ (where it equals the right-hand side, since $tr(I)=m$).

$$\frac{E ||\sqrt D Y||_2}{\sqrt{tr(D)}}=E||\sqrt{\frac{D}{tr(D)}}Y||_2$$

Since squaring is monotonic for positive values, the expectation of the norm and the expectation of the squared norm should be maximized by the same $D$, so we can instead maximize:

$$E||\sqrt{\frac{D}{tr(D)}}Y||_2^2$$

It's not too hard to show that this expression is concave in $D$ by showing the Hessian is negative semidefinite. We can then show that the maximum is attained at $D = I$ by showing that the gradient is $0$ at that point:

$$\nabla_D E||\sqrt{\frac{D}{tr(D)}}Y||_2^2=E[\frac{YY^T}{tr(D)} - \frac{Y^TDY}{tr(D)^2}I]$$

Applying $E[YY^T] = I$ and $E[Y^TDY] = tr(D)$ shows that this is $0$ for $D = I$, and thus the claim holds.
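As a numerical sanity check on this (a sketch assuming numpy; the random trial matrices are arbitrary), the objective evaluated at random positive-definite $D$ never seems to exceed its value at the identity:

```python
import numpy as np

rng = np.random.default_rng(1)
m, N = 5, 200_000
Y = rng.standard_normal((N, m))

def objective(D):
    """Monte Carlo estimate of E||sqrt(D) Y||_2 / sqrt(tr D)."""
    w, V = np.linalg.eigh(D)
    sqrtD = V @ np.diag(np.sqrt(w)) @ V.T
    return np.linalg.norm(Y @ sqrtD, axis=1).mean() / np.sqrt(np.trace(D))

best = objective(np.eye(m))              # value at D = I, i.e. E||Y||_2 / sqrt(m)
for _ in range(20):
    A = rng.standard_normal((m, m))
    D = A @ A.T + 0.1 * np.eye(m)        # random positive-definite trial
    assert objective(D) <= best + 1e-3   # small slack for Monte Carlo noise
print("D = I attained the largest value:", best)
```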

I also tried another way, by expressing the ratio in terms of the eigendecomposition of $D$. Then you end up with something like this for the ratio on the left:

$$E \left\|\sum_i \alpha_i \sqrt{\frac{\lambda_i}{\sum_j \lambda_j}}\,v_i\right\|_2$$

where $\alpha_i = v_i^T Y$ is the component of $Y$ along the eigenvector $v_i$ of $D$. And on the right you get:

$$E \left\|\sum_i \alpha_i \frac{1}{\sqrt m}\,v_i\right\|_2$$

You can then show that the first expression is maximized when all the $\lambda_i$ are equal. Since $V$ (the matrix of eigenvectors) is orthogonal, the $\alpha_i$ are again i.i.d. standard normal, so we may take $V = I$ without loss of generality; this shows that the expression is maximized for $D = I$.
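In this picture the claim reduces to saying that $E\sqrt{\sum_i p_i\alpha_i^2}$, with weights $p_i=\lambda_i/\sum_j\lambda_j$ ranging over the simplex, is maximized at the uniform weights $p_i=1/m$. A quick numerical sketch of that reduced problem (again assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(2)
m, N = 5, 200_000
alpha2 = rng.standard_normal((N, m)) ** 2    # the alpha_i^2 are i.i.d. chi-squared(1)

def value(p):
    """Monte Carlo estimate of E sqrt(sum_i p_i alpha_i^2) for weights p on the simplex."""
    return np.sqrt(alpha2 @ p).mean()

uniform = value(np.full(m, 1.0 / m))         # all lambda_i equal, i.e. D a multiple of I
for _ in range(20):
    p = rng.dirichlet(np.ones(m))            # random point on the simplex
    assert value(p) <= uniform + 1e-3        # small slack for Monte Carlo noise
print("uniform weights were maximal:", uniform)
```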


This was done with the help of a friend.

For the first one: since (1) is equivalent to $E\|X\|_2-\sqrt{n}\geq E\|Y\|_2-\sqrt{m}$, it is sufficient to prove it for $n=m+1$ and then telescope. Let $X_1,X_2,\dots$ be a sequence of i.i.d. $N(0,1)$ random variables and let $Z=\sum_{j=1}^m X_j^2$.

It's easy to see that the functions $$ x\mapsto \sqrt{c+x^2},\;c\geq 0,\quad x\mapsto\sqrt{x+1}-\sqrt{x} $$ are convex in $[0,\infty)$ (by double derivative if you will). Therefore, by Jensen inequality, \begin{align*} E\sqrt{Z+X^2_{m+1}}-E\sqrt{Z}&=E\left[E\left[\sqrt{Z+X^2_{m+1}}\,\Big|\,Z\right]\right]-E\sqrt{Z}\\ &\geq E\left[\sqrt{Z+E[X^2_{m+1}|Z]}\right]-E\sqrt{Z}\\ &=E\left[\sqrt{Z+1}\right]-E\sqrt{Z}\\ &=E[\sqrt{Z+1}-\sqrt{Z}]\\ &\geq \sqrt{EZ+1}-\sqrt{EZ}\\ &=\sqrt{m+1}-\sqrt{m}. \end{align*}