Deriving sample complexity from a given expectation bound

In equations (1.6) and (1.7) of this paper, the authors provide an upper bound on an expectation term and the corresponding sample complexity. I am a bit confused as to how they derive their sample complexity term (1.7), though I could be missing something obvious here.

The bound is of the form:

\begin{align*} \sqrt{E \| Y_n -Y\|^2} \le C \left [ \sqrt{\Gamma \frac{A\log p}{n}} + \Gamma \frac{B\log p \log(np)}{n}\right] \|\Sigma \| \end{align*}

They state that if you assume $n \le p$ and select a tolerance $\epsilon \in (0,1)$, then there is a (different) constant $C$ such that

\begin{align*} n \ge C \left[ \frac{A \log p}{\epsilon^2} + \frac{B \log^2 p}{\epsilon} \right]\Gamma \implies \| Y_n - Y\| \le \epsilon\| \Sigma \|. \end{align*}

My working so far is: \begin{align*} P(\| Y_n - Y\| > \epsilon \|\Sigma\|) & \le \frac{1}{\epsilon\|\Sigma\|} E \| Y_n -Y\|\\ &\le \frac{1}{\epsilon\|\Sigma\|} \sqrt{E \| Y_n -Y\|^2}\\ &\le \frac{1}{\epsilon}C \left [ \sqrt{\Gamma \frac{A\log p}{n}} + \Gamma \frac{B\log p \log(np)}{n}\right]\\ &\le \frac{1}{\epsilon}C \left [ \sqrt{\Gamma \frac{A\log p}{n}} + \Gamma \frac{B\log^2 p }{n}\right] \end{align*} where the first inequality holds by Markov's inequality, the second by Jensen's, the third by the given bound, and the fourth by the assumption that $n\le p$ (which gives $\log(np)\le \log(p^2)=2\log p$, with the factor $2$ absorbed into $C$). I'm just not sure how they get their sample complexity from here.


Solution 1:

I think you did the main part of the job. One useful observation to keep in one's toolkit is that for $a,b\ge 0$, $$ \max(a,b)\le a+b\le 2\max(a,b), $$ so, since we do not care about the value of the universal constant $C$, we may as well treat the two terms separately, depending on which one dominates.

Imagine for example that $$ \sqrt{\Gamma \frac{A\log p}{n}}\ge \Gamma \frac{B\log^2 p}{n}.\tag{1} $$ Then, by your Markov-inequality bound, for $\|Y_n-Y\|\le \varepsilon \|\Sigma\|$ to hold with probability at least $0.99$ it suffices that $$ \frac{2C}{\varepsilon}\sqrt{\Gamma \frac{A\log p}{n}}\le 0.01. $$ This certainly holds if $$ n\ge C'\frac{A\log p}{\varepsilon^2}\Gamma. $$ Similarly, if the reverse of inequality (1) holds, then we get the same high-probability result if we set $$ n\ge C''\frac{B\log^2 p}{\varepsilon}\Gamma. $$

So, in general, taking $$ n\ge \max(C',C'')\left(\frac{A\log p}{\varepsilon^2}+\frac{B\log^2 p}{\varepsilon}\right)\Gamma $$ will always give $\|Y_n-Y\|\le \varepsilon \|\Sigma\|$ with high probability, regardless of the regime we are in.
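
To make the "this certainly holds" steps explicit, here is a sketch of how each condition is inverted for $n$, using the Markov/Jensen bound from the question; the particular constants $(200C)^2$ and $200C$ below are just what this choice of $0.99$ and $0.01$ happens to produce, and any sufficiently large universal constants would do:

\begin{align*} \frac{2C}{\varepsilon}\sqrt{\Gamma \frac{A\log p}{n}} \le 0.01 &\iff \Gamma \frac{A\log p}{n} \le \left(\frac{0.01\,\varepsilon}{2C}\right)^2 \iff n \ge (200C)^2\,\frac{A\log p}{\varepsilon^2}\,\Gamma,\\ \frac{2C}{\varepsilon}\,\Gamma \frac{B\log^2 p}{n} \le 0.01 &\iff n \ge 200C\,\frac{B\log^2 p}{\varepsilon}\,\Gamma, \end{align*}

so one may take $C'=(200C)^2$ and $C''=200C$ in the two cases above.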