Rate of Weak Convergence for a Sequence of Probability Measures

Solution 1:

We can compute directly that $\int_0^x{d\mu_n}=\frac{\lfloor nx\rfloor}{n}$ and $\int_0^x{d\lambda}=x$, so that the Kolmogorov-Smirnov distance is $$\sup_{x\in[0,1]}{\left|\frac{\lfloor nx\rfloor}{n}-x\right|}=\frac{\sup_{x\in[0,1]}{|\lfloor nx\rfloor-nx|}}{n}=\frac{1}{n}$$ since $\sup_{x\in[0,1]}{|\lfloor nx\rfloor-nx|}=1$, approached (but not attained) as $x$ increases to any $\frac{k}{n}$.
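(A quick numerical sanity check, not part of the argument: evaluate $\left|\frac{\lfloor nx\rfloor}{n}-x\right|$ on a fine grid of $[0,1]$. The grid size `m` below is an arbitrary choice of mine.)

```python
import numpy as np

def ks_distance(n, m=100_000):
    """Approximate sup_x |F_{mu_n}(x) - x| by maximizing over a fine grid of [0, 1]."""
    x = np.linspace(0.0, 1.0, m)
    F_mu = np.floor(n * x) / n          # CDF of mu_n (uniform on {k/n : k = 1, ..., n})
    return np.max(np.abs(F_mu - x))

for n in (5, 10, 100):
    # the grid maximum approaches 1/n from below as the grid is refined
    print(n, ks_distance(n), 1.0 / n)
```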

Computing the Wasserstein distance is much harder. To begin, notice that in a coupling $(X,Y)$ where $X\sim\mu_n$ and $Y\sim\lambda$, the random variable $X$ lies in the finite set $\left\{\frac{k}{n}:k\in[n]\right\}$. Thus we need only pick $n$ conditional distributions $\{r_k\}_{k=1}^n$ on $[0,1]$ for $Y$ given $X=\frac{k}{n}$. Since $\lambda=\frac{1}{n}\sum_k{r_k}$ and any singular part of an $r_k$ would contribute a singular part to this mixture, each $r_k$ must be absolutely continuous w.r.t. $\lambda$; let their pdfs be $\{f_k\}_{k=1}^n$. The cost of such a coupling in the $p$-Wasserstein distance is $$\sqrt[p]{\frac{1}{n}\sum_{k=1}^n{\int_0^1{\left|\frac{k}{n}-y\right|^pf_k(y)\,dy}}}$$ We seek to minimize this quantity subject to the conditions \begin{gather*} \frac{1}{n}\sum_k{f_k}=1_{[0,1]} \\ \int_0^1{f_k(y)\,dy}=1 \\ f_k\geq0 \end{gather*}
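To illustrate this setup (not part of the proof), here is a rough numerical evaluator of the objective. The helper name `coupling_cost`, the midpoint grid of size `m`, and the baseline choice $f_k\equiv1$ (the independent coupling) are all ad hoc choices of mine.

```python
import numpy as np

def coupling_cost(f, p, m=200_000):
    """Evaluate (S/n)^(1/p) for conditional densities f_k given as callables.

    f: list of n functions, f[k](y) = density of Y given X = (k+1)/n.
    Integrals are approximated by a midpoint rule on m cells of [0, 1];
    this is an illustrative approximation, not part of the argument.
    """
    n = len(f)
    y = (np.arange(m) + 0.5) / m                      # midpoints of the cells
    F = np.array([fk(y) for fk in f])                 # shape (n, m)
    assert np.allclose(F.mean(axis=0), 1.0, atol=1e-3)   # (1/n) sum_k f_k = 1 on [0,1]
    assert np.allclose(F.mean(axis=1), 1.0, atol=1e-3)   # each f_k integrates to 1
    atoms = (np.arange(1, n + 1) / n)[:, None]
    S = np.mean(np.abs(atoms - y) ** p * F, axis=1).sum()   # sum_k of the integrals
    return (S / n) ** (1.0 / p)

# Baseline: the independent coupling f_k = 1, whose cost stays of constant order.
n, p = 10, 2
print(coupling_cost([lambda y: np.ones_like(y)] * n, p))
```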

For simplicity, call the inner sum $S$. One upper bound comes from matching the atom at $\frac{k}{n}$ with the interval of length $\frac{1}{n}$ immediately to its left, i.e. $$f_k=n\cdot1_{\left[\frac{k-1}{n},\frac{k}{n}\right]},\qquad1\leq k\leq n$$ These $f_k$ are nonnegative, each integrates to $1$, and $\frac{1}{n}\sum_k{f_k}=1_{[0,1]}$ since the intervals tile $[0,1]$. This choice gives \begin{align*} S&=\sum_{k=1}^n{n\int_{\frac{k-1}{n}}^{\frac{k}{n}}{\left(\frac{k}{n}-y\right)^p\,dy}} \\ &=n\cdot n\cdot\frac{\left(\frac{1}{n}\right)^{p+1}}{p+1} \\ &=\frac{1}{(p+1)n^{p-1}} \end{align*} Call this formula (1).
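(A quick numerical check of (1); the grid size `m` is again an arbitrary choice.)

```python
import numpy as np

# Cost of the coupling f_k = n * 1_{[(k-1)/n, k/n]} by midpoint quadrature,
# compared with the closed form S = 1 / ((p+1) n^(p-1)).
def upper_bound_cost(n, p, m=400_000):
    y = (np.arange(m) + 0.5) / m
    S = 0.0
    for k in range(1, n + 1):
        fk = n * ((y > (k - 1) / n) & (y <= k / n))   # density of Y given X = k/n
        S += np.mean(np.abs(k / n - y) ** p * fk)     # approximates the k-th integral
    return S

for n, p in [(5, 1), (5, 2), (20, 3)]:
    print(upper_bound_cost(n, p), 1 / ((p + 1) * n ** (p - 1)))
```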

I claim (1) already has the correct order in $n$. Since we are now interested in a lower bound, we may drop the joint constraint $\frac{1}{n}\sum_j{f_j}=1_{[0,1]}$ and keep only the pointwise bound it implies: together with $f_j\geq0$ it forces $f_k\leq n$. Since $\left|\frac{k}{n}-y\right|^p$ is increasing in $\left|\frac{k}{n}-y\right|$, each term $\int_0^1{\left|\frac{k}{n}-y\right|^pf_k(y)\,dy}$ is minimized by piling the unit of mass as close to $\frac{k}{n}$ as possible (otherwise we could shift mass from where the weight is large to where it is small). With the cap $f_k\leq n$, the closest we can pile it is uniformly on the interval of length $\frac{1}{n}$ centered at $\frac{k}{n}$, which stays inside $[0,1]$ for $2\leq k\leq n-1$; for those $k$, $$\int_0^1{\left|\frac{k}{n}-y\right|^pf_k(y)\,dy}\geq2n\int_0^{\frac{1}{2n}}{t^p\,dt}=\frac{1}{(p+1)(2n)^p}$$ Summing over $2\leq k\leq n-1$ and discarding the (nonnegative) boundary terms, $$S\geq\frac{n-2}{(p+1)(2n)^p}=\frac{1-o(1)}{(p+1)2^pn^{p-1}}$$
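One can spot-check this per-term bound numerically; the grid, the random seed, and the particular atom tested below are arbitrary choices of mine. The extreme points of the relaxed constraint set $\{0\leq f\leq n,\ \int f=1\}$ are (up to null sets) of the form $n\cdot1_A$ with $|A|=\frac{1}{n}$, so those are the densities the check samples.

```python
import numpy as np

# Spot check: over densities with 0 <= f <= n and integral 1, the cost
# ∫ |k/n - y|^p f(y) dy should never drop below 1 / ((p+1) (2n)^p),
# and the interval of length 1/n centered at k/n should (roughly) attain it.
rng = np.random.default_rng(0)
n, p, m = 10, 2, 10_000           # m grid cells; m // n of them carry density n
k = 4                             # an interior atom, located at k/n
y = (np.arange(m) + 0.5) / m
bound = 1 / ((p + 1) * (2 * n) ** p)

costs = []
for _ in range(1000):
    A = rng.choice(m, size=m // n, replace=False)      # random set of measure 1/n
    f = np.zeros(m)
    f[A] = n
    costs.append(np.mean(np.abs(k / n - y) ** p * f))  # approximates the integral
print(min(costs) >= bound, min(costs), bound)

# The centered interval attains (approximately) the bound.
f_star = n * (np.abs(y - k / n) <= 1 / (2 * n))
print(np.mean(np.abs(k / n - y) ** p * f_star), bound)
```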

Combining the two bounds, $$\frac{1-o(1)}{2\sqrt[p]{p+1}\,n}\leq\sqrt[p]{\frac{\inf_{f_k}{S}}{n}}\leq\frac{1}{\sqrt[p]{p+1}\,n}$$ so the $p$-Wasserstein distance between $\mu_n$ and $\lambda$ is of order $\frac{1}{n}$, the same rate as the Kolmogorov-Smirnov distance. In fact the upper bound (1) is exact: on the real line the monotone (quantile) coupling is optimal for the convex cost $|x-y|^p$, and the coupling behind (1) pairs the $u$-th quantile $\frac{\lceil nu\rceil}{n}$ of $\mu_n$ with the $u$-th quantile $u$ of $\lambda$, so $$W_p(\mu_n,\lambda)=\sqrt[p]{\int_0^1{\left|\frac{\lceil nu\rceil}{n}-u\right|^p\,du}}=\frac{1}{\sqrt[p]{p+1}\,n}$$
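As an independent numerical check (again not part of the argument), one can replace $\lambda$ by the uniform measure on `m` bin centers and solve the resulting finite transport problem as a linear program. The bin count, the solver (`scipy.optimize.linprog` with HiGHS), and the parameters below are incidental choices; the LP value differs from the true $W_p$ by at most the $O\!\left(\frac{1}{m}\right)$ discretization error.

```python
import numpy as np
from scipy.optimize import linprog

n, m, p = 5, 400, 2.0
atoms = np.arange(1, n + 1) / n              # support of mu_n, each atom of mass 1/n
centers = (np.arange(m) + 0.5) / m           # bin centers standing in for lambda
C = np.abs(atoms[:, None] - centers[None, :]) ** p   # cost |x - y|^p

# transport plan pi[i, j] >= 0, flattened row-major, with prescribed marginals
A_eq, b_eq = [], []
for i in range(n):                           # row sums: atom i ships total mass 1/n
    row = np.zeros(n * m)
    row[i * m:(i + 1) * m] = 1.0
    A_eq.append(row); b_eq.append(1.0 / n)
for j in range(m):                           # column sums: bin j receives mass 1/m
    col = np.zeros(n * m)
    col[j::m] = 1.0
    A_eq.append(col); b_eq.append(1.0 / m)

res = linprog(C.ravel(), A_eq=np.array(A_eq), b_eq=np.array(b_eq),
              bounds=(0, None), method="highs")
print(res.fun ** (1 / p), 1 / ((p + 1) ** (1 / p) * n))   # should nearly agree
```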