Monte-Carlo for the Wasserstein metric

Let $(X,d)$ be some metric space and assume that $d\leq 1$. Further, let $\mu, $ $\nu$ be two Borel probability measures on $X$ and let $$ \Gamma(\mu,\nu) = \{\gamma - \text{measure on }X\times X:\gamma(A,X) = \mu(A),\gamma(X,A) = \nu(A) \} $$ to be the space of all couplings of $\mu$ and $\nu$. Define $$ W_1(\mu,\nu) = \inf_{\gamma\in \Gamma(\mu,\nu)}\int_{X\times X} d(x,y)\gamma(\mathrm dx\times\mathrm dy) $$ to be the Wasserstein distance (of order $1$) between $\mu$ and $\nu$.

Suppose, we are able to draw two sets of random elements according to these distributions. E.g. $\{X_i\}$ is iid such that $X_i\sim \mu$ and $\{Y_j\}$ is iid such that $Y_j\sim \nu$ where $1\leq i,j\leq n$.

Is it possible to derive any results on $W_1(\mu,\nu)$ using $\{d(X_i,Y_j)\}_{ij}$, with some confidence-like bounds?


Some thoughts: let $\gamma(d):=\int_{X\times X}d(x,y)\mathrm d\gamma$. A dummy way is for a fixed $\gamma$, we can simulate pairs $(X_i,Y_i)\sim \gamma$ and then define $$ \overline{\gamma(d)}:=\frac1n\sum d(X_i,Y_i). $$ By the Hoeffding inequality we get $$ \Bbb P\left(\left|\gamma(d) - \overline{\gamma(d)}\right|\geq t\right)\leq 2 \mathrm e^{-nt^2} $$ and since $\gamma(d) \geq W_1(\mu,\nu)$ we can get some conservative upper bounds on the distance. The conservatism clearly depends on how $\gamma$ is far from the optimal coupling.

Perhaps, we can do something like that: simulate $\{X_i\}$ and $\{Y_j\}$ and consider $$ \bar W:= \frac1n\sum_{i=1}^n \min_j d(X_i,Y_j) $$ thus "coupling" $X$ and $Y$ on-the-run. But I am not sure whether this is an unbiased estimate and whether it allows for some bounds on the error.

Please feel free to retag. Perhaps I shall better ask it on statistics.SE?


The authors of the recent paper "On the empirical estimation of integral probability metrics (2012, Electronic Journal of Statistics)" have a setting where they present an estimator for Wasserstein metrics based on data and where they derive consistency of the estimator with convergence rates. It could be directly applied to a Monte Carlo setting.