Rate of convergence in the central limit theorem (Lindeberg–Lévy)

Solution 1:

  1. I think you've basically defined it. You can say a sequence $Y_n$ of random variables converges of order $a_n$ if $Y_n/a_n$ converges in distribution to a random variable that isn't identically zero. The reason to divide rather than multiply is so that $Y_n = a_n$ itself converges of order $a_n$. You should think of this as meaning "$Y_n$ grows or decays at about the same rate as $a_n$" (a small simulation illustrating this, and the rate comparisons in the next item, is sketched after this list).

  2. This is Slutsky's theorem: if $Z_n \to Z$ in distribution and $c_n \to c$ (a constant), then $c_n Z_n \to cZ$ in distribution. So suppose $Y_n$ converges of order $a_n$, so that $\frac{Y_n}{a_n}$ converges in distribution to some nontrivial $W$. If $b_n / a_n \to \infty$, then $\frac{Y_n}{b_n} = \frac{Y_n}{a_n} \cdot \frac{a_n}{b_n} \to W \cdot 0 = 0$, taking $Z_n = \frac{Y_n}{a_n}$, $Z=W$, $c_n = \frac{a_n}{b_n}$, and $c=0$ in Slutsky. The limit is identically zero, so $Y_n$ does not converge of order $b_n$.

    On the other hand, if $\frac{b_n}{a_n} \to 0$, suppose for contradiction that $\frac{Y_n}{b_n}$ converges in distribution to some $Z$. Then $\frac{Y_n}{a_n} = \frac{Y_n}{b_n} \cdot \frac{b_n}{a_n} \to Z \cdot 0 = 0$ by Slutsky. But $\frac{Y_n}{a_n}$ was assumed to converge in distribution to $W$, which is not identically zero, and limits in distribution are unique. This is a contradiction, so $Y_n$ does not converge of order $b_n$.

    But there isn't generally a unique sequence here. If $Y_n$ converges of order $\frac{1}{n}$, it would also be true to say $Y_n$ converges of order $\frac{1}{n+43}$, or $\frac{1}{n+\log n}$, or $\frac{1}{2n}$, et cetera.

  3. Not sure what you mean here, as this is just a restatement of the CLT, whose proof you seem to know.
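
For concreteness, here is a small simulation sketch of points 1 and 2. It assumes NumPy; the uniform samples and the candidate rates $n^{-1/4}$, $n^{-1/2}$, $n^{-1}$ are just illustrative choices. With $Y_n$ the mean of $n$ i.i.d. mean-zero, variance-one variables, rescaling by $\sqrt{n}$ (i.e. $a_n = n^{-1/2}$) gives a spread that stabilizes, while a rate that decays too slowly sends $Y_n/b_n$ to $0$ and one that decays too fast blows up:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mean(n, reps=2_000):
    """reps independent copies of Y_n = (1/n) * sum_i X_i,
    with X_i i.i.d. uniform on [-sqrt(3), sqrt(3)] (mean 0, variance 1)."""
    x = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(reps, n))
    return x.mean(axis=1)

# Rescale Y_n by three candidate rates and look at the spread:
# only a_n = n^{-1/2} gives a nondegenerate limit (spread stabilizes near 1);
# n^{-1/4} decays too slowly (Y_n / n^{-1/4} -> 0) and n^{-1} too fast (blows up).
for n in (100, 1_000, 10_000):
    y = sample_mean(n)
    print(f"n={n:>6}:  sd(n^0.5 * Y_n)={np.std(n**0.5 * y):.3f}"
          f"   sd(n^0.25 * Y_n)={np.std(n**0.25 * y):.4f}"
          f"   sd(n * Y_n)={np.std(n * y):.2f}")
```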

Solution 2:

This is just a remark that was too long to fit as a comment. The remark is about what people mean when they casually say "$\sqrt{n}$ (or $1/\sqrt{n}$) convergence".

Take $\mu = 0, \sigma = 1$ for simplicity. If $$\frac{1}{\sqrt{n}}\sum_i X_i $$ is "approximately normally distributed", as the CLT guarantees for sufficiently large $n$, then we can approximate deviations of the empirical mean $\frac{1}{n}\sum_i X_i$ from $0$ by using the CLT approximation $$\mathbb{P}\left(-\frac{\epsilon}{\sqrt{n}} < \frac{1}{n}\sum_i X_i < \frac{\epsilon}{\sqrt{n}}\right) \approx \mathbb{P}\left(-\epsilon < N(0,1) < \epsilon\right).\tag{1}$$ Then, heuristically, if you want an extra decimal point of accuracy, i.e. an interval one tenth as wide, at a fixed probability, you need $10^2$ times as many samples: replacing $n$ by $100n$ divides the half-width $\epsilon/\sqrt{n}$ by $10$ while keeping $\epsilon$, and hence both sides of $(1)$, fixed. This is often what people mean when they say "CLT implies $\sqrt{n}$ convergence".
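To make the heuristic concrete, here is a minimal Monte Carlo sketch, assuming NumPy; the uniform distribution, $\epsilon = 1$, the chunked loop, and the sample counts are just illustrative choices. Multiplying $n$ by $100$ shrinks the half-width $\epsilon/\sqrt{n}$ by a factor of $10$ while the probability on the left of $(1)$ stays near $\mathbb{P}(-1 < N(0,1) < 1) \approx 0.683$:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(1)

def coverage(n, eps, reps=20_000, chunk=200):
    """Monte Carlo estimate of P(-eps/sqrt(n) < Y_n < eps/sqrt(n)), where Y_n is
    the mean of n i.i.d. uniform(-sqrt(3), sqrt(3)) samples (mean 0, variance 1).
    Generated in chunks to keep memory bounded."""
    hits = 0
    for _ in range(reps // chunk):
        y = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(chunk, n)).mean(axis=1)
        hits += int(np.sum(np.abs(y) < eps / np.sqrt(n)))
    return hits / reps

eps = 1.0
target = erf(eps / sqrt(2.0))   # P(-eps < N(0,1) < eps) = 0.6827 for eps = 1
for n in (100, 10_000):         # 100x the samples ...
    print(f"n={n:>6}: half-width={eps / sqrt(n):.4f}, "
          f"coverage ~ {coverage(n, eps):.3f} (normal approx: {target:.3f})")
# ... buys a 10x smaller interval at (approximately) the same probability.
```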

The sleight of hand above is: how large does $n$ have to be? In other words, what is the order of convergence of $$\left|\mathbb{P}\left(-\frac{\epsilon}{\sqrt{n}} < \frac{1}{n}\sum_i X_i < \frac{\epsilon}{\sqrt{n}}\right) - \mathbb{P}\left(-\epsilon < N(0,1) < \epsilon\right)\right|?$$ More explicitly, what is the error uniformly in $x$, i.e. $$\sup_x\left|\mathbb{P}\left(\frac{1}{\sqrt{n}}\sum_i X_i < x\right) - \mathbb{P}\left(N(0,1) < x\right)\right|?$$ It turns out this is bounded by a quantity of order $1/\sqrt{n}$, which is the content of the Berry–Esseen theorem, as pointed out in one of the links in the comments.
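
As a sanity check on that rate, the Kolmogorov distance can be computed exactly for a convenient skewed example: if $X_i = E_i - 1$ with $E_i$ i.i.d. $\mathrm{Exp}(1)$ (so $\mu = 0$, $\sigma = 1$), then $\sum_i (X_i + 1) \sim \mathrm{Gamma}(n, 1)$, and the CDF of $\frac{1}{\sqrt{n}}\sum_i X_i$ is available in closed form via the regularized incomplete gamma function. The sketch below, assuming NumPy and SciPy (the grid of $x$ values is an arbitrary choice), shows $\sqrt{n}\,\sup_x |F_n(x) - \Phi(x)|$ settling near a constant, so the Berry–Esseen rate $1/\sqrt{n}$ is actually attained here. (A skewed distribution is deliberate: for symmetric distributions the distance can decay faster.)

```python
import numpy as np
from scipy.special import gammainc, ndtr  # regularized lower incomplete gamma; normal CDF

# For X_i = E_i - 1 with E_i ~ Exp(1) i.i.d., sum_i (X_i + 1) ~ Gamma(n, 1),
# so P(n^{-1/2} sum_i X_i <= x) = gammainc(n, n + x*sqrt(n)) exactly.
x = np.linspace(-5.0, 5.0, 20_001)
for n in (10, 100, 1_000, 10_000):
    F_n = gammainc(n, np.maximum(n + x * np.sqrt(n), 0.0))  # clamp: the Gamma CDF is 0 below 0
    d = np.abs(F_n - ndtr(x)).max()
    print(f"n={n:>6}: sup_x |F_n(x) - Phi(x)| = {d:.5f},  sqrt(n) * sup = {np.sqrt(n) * d:.4f}")
# sqrt(n) * sup hovers near a constant (about 0.13 here), i.e. the Kolmogorov
# distance decays like 1/sqrt(n) -- exactly the Berry-Esseen rate.
```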