How well can the maximum of a Gaussian process be approximated by a finite-dimensional Gaussian variable?
Solution 1:
The supremum of a Gaussian process is tricky to handle, and one way to deal with it is chaining$^1$, which is based on $\epsilon$-nets of $K$ for the canonical pseudo-distance: $$d^2(x,y) = \mathbb{V}[W(x)-W(y)] = k(x,x)-2k(x,y)+k(y,y)\,,$$ where $k(x,y)$ denotes the covariance function of the GP.

With this pseudo-distance in hand, define the covering number $N(\epsilon,K,d)$ to be the smallest number of balls of radius $\epsilon$ with respect to $d$ required to cover $K$. Then you have the general upper bound (Dudley's entropy integral): $$\mathbb{E}\Big[\sup_{x\in K}W(x)\Big] \leq C \int_0^{\sup_{x,y\in K}d(x,y)} \sqrt{\log N(\epsilon,K,d)}\, \mathrm{d}\epsilon\,.$$

Furthermore, if you are given an $\epsilon$-net $x_1,\dots,x_k$ of $K$ with respect to $d$, then it is possible to prove that: $$\mathbb{E}\Big[\sup_{x\in K}W(x)-\max_{i \leq k} W(x_i)\Big] \leq C \int_0^\epsilon \sqrt{\log N(\epsilon',K,d)}\, \mathrm{d}\epsilon'\,.$$ Note that the right-hand side indeed goes to $0$ as $\epsilon$ tends to $0$. It is also possible to obtain high-probability counterparts of such bounds (not only bounds in expectation).
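To see the second bound in action, here is a minimal numerical sketch (the interval $K=[0,1]$, the RBF kernel, its length-scale, and the grid sizes are illustrative assumptions, not taken from your question): it samples GP paths on a fine grid standing in for $K$ and measures how the average gap $\sup_x W(x) - \max_i W(x_i)$ shrinks as the net gets finer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumptions (not from the question): K = [0, 1] and an RBF
# kernel k(x, y) = exp(-(x - y)^2 / (2 * ell^2)) with length-scale ell.
ell = 0.1

def kernel(xs, ys):
    return np.exp(-(xs[:, None] - ys[None, :]) ** 2 / (2 * ell ** 2))

# Fine grid standing in for K: its max approximates sup_{x in K} W(x).
fine = np.linspace(0.0, 1.0, 1001)
cov = kernel(fine, fine) + 1e-6 * np.eye(fine.size)  # jitter for Cholesky stability
chol = np.linalg.cholesky(cov)

n_draws = 500
draws = chol @ rng.standard_normal((fine.size, n_draws))  # columns = GP sample paths

# Max over coarser and coarser evenly spaced nets x_1, ..., x_k.
for k in (5, 20, 80, 320):
    idx = np.linspace(0, fine.size - 1, k).astype(int)
    gap = draws.max(axis=0) - draws[idx].max(axis=0)
    print(f"k = {k:3d}   E[sup W - max_i W(x_i)] ~ {gap.mean():.4f}")
```

You should see the average gap decrease as $k$ grows, consistent with the vanishing entropy integral above.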
In your case, the $x_i$ are not an $\epsilon$-net of $K$ with respect to $d$, but with respect to the metric of $K$. Therefore, you first need to provide an upper bound on $d(x,y)$ in terms of the metric of $K$. I advise you to read Grünewälder et al. (2010), where they address this question in the bandit setting.
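For a concrete illustration of such a bound (assuming a squared-exponential kernel, which is not specified in your question): if $k(x,y)=\sigma^2 e^{-\|x-y\|^2/(2\ell^2)}$ on $K\subset\mathbb{R}^n$, then $$d^2(x,y)=2\sigma^2\left(1-e^{-\|x-y\|^2/(2\ell^2)}\right)\leq \frac{\sigma^2}{\ell^2}\|x-y\|^2\,,$$ using $1-e^{-t}\leq t$. Hence $d(x,y)\leq(\sigma/\ell)\|x-y\|$, and any $(\epsilon\ell/\sigma)$-net of $K$ in the Euclidean metric is an $\epsilon$-net for $d$.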
$^1$ See, for example, Chapter 13 of *Concentration Inequalities: A Nonasymptotic Theory of Independence*, Boucheron, Lugosi, and Massart.