In short: the Lévy-Prokhorov metric, specialized to sets of the form $\{y | y^{(1)} < x^{(1)}, \ldots, y^{(d)} < x^{(d)} \}$ for $x = (x^{(1)}, \ldots, x^{(d)}) \in \mathbb{R}^d$, gives a metric on distribution functions.

The generalization of the Lévy metric to metric spaces is usually the Lévy-Prokhorov metric, defined for two probability measures $\mu, \nu$ on the measurable space $(M, \mathcal{B}(M))$, with $(M, \rho)$ a metric space with distance $\rho$ and $\mathcal{B}(M)$ the Borel sigma-algebra, by $$ d_L(\nu,\mu) := \inf \{\epsilon > 0 | \forall A \in \mathcal{B}(M), ~ \mu(A) \leq \nu(A^\epsilon) + \epsilon ~ \text{and} ~ \nu(A) \leq \mu(A^\epsilon) + \epsilon \}, $$ wherein $A^\epsilon := \{x \in M | \inf_{y \in A} \rho(x,y) < \epsilon \}$ can be seen as fattening $A$ by $\epsilon$.

In the case $M = \mathbb{R}^d$, for any probability measure $\nu$ the corresponding c.d.f. $F_\nu$ is given by $$ F_\nu(x) = \nu(\chi_{y \prec x}), $$ wherein $\chi_{y \prec x} := \{y \in M | y^{(1)} < x^{(1)}, \ldots, y^{(d)} < x^{(d)} \}$. Fattening such a set is controlled by translates of it: $$ \chi_{y \prec x + \epsilon \alpha/\sqrt{d}} \subseteq \chi_{y \prec x}^\epsilon \subseteq \chi_{y \prec x + \epsilon \alpha} $$ (for $\alpha = (1, \ldots, 1)$ as you've defined), since a point of $\chi_{y \prec x}^\epsilon$ is within Euclidean distance $\epsilon$ of $\chi_{y \prec x}$ and hence exceeds each coordinate of $x$ by less than $\epsilon$, while a point exceeding each coordinate by less than $\epsilon/\sqrt{d}$ is within distance $\epsilon$ of $\chi_{y \prec x}$. So if $F_\mu, F_\nu$ are the c.d.f.s corresponding to $\mu, \nu$ respectively and $\epsilon > d_L(\nu,\mu)$, then taking $A = \chi_{y \prec x}$ in the definition gives, for all $x$, $$ F_\nu(x) \leq F_\mu(x + \epsilon \alpha) + \epsilon, \quad \text{equivalently} \quad F_\nu(x - \epsilon \alpha) - \epsilon \leq F_\mu(x),$$ and similarly with the roles of $\mu$ and $\nu$ exchanged; this implies $d_L(\nu, \mu) \geq d(F_\nu, F_\mu)$. It's not hard to see that the metric properties of $d_L$ carry over to $d$ acting on distribution functions. Morally speaking, distribution functions hold the same amount of information as measures: by inclusion-exclusion over their values you can recover the masses of half-open boxes and use these to build back the corresponding measure.
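As a quick sanity check of $d_L \geq d$ (I'm assuming here that your $d$ is $d(F,G) := \inf\{\epsilon > 0 \,|\, F(x) \leq G(x+\epsilon\alpha)+\epsilon ~\text{and}~ G(x) \leq F(x+\epsilon\alpha)+\epsilon ~\text{for all}~ x\}$; adjust if your normalization differs), take the point masses $\mu = \delta_0$, $\nu = \delta_a$ on $\mathbb{R}$, $a > 0$. Choosing $A = \{0\}$ in the definition of $d_L$ forces $1 = \mu(A) \leq \nu(A^\epsilon) + \epsilon$, which requires $\epsilon > a$ (so that $a \in A^\epsilon = (-\epsilon,\epsilon)$) or $\epsilon \geq 1$; conversely such $\epsilon$ work for every $A$, so $$ d_L(\delta_0, \delta_a) = \min(a, 1) .$$ On the distribution-function side, $F_\mu = \mathbf{1}_{(0,\infty)}$, $F_\nu = \mathbf{1}_{(a,\infty)}$, and the condition $F_\mu(x) \leq F_\nu(x+\epsilon)+\epsilon$ fails for $0 < x \leq a - \epsilon$ unless $\epsilon \geq 1$, so likewise $d(F_\mu, F_\nu) = \min(a, 1)$. The two metrics agree here, consistent with $d \leq d_L$.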

In the seminal work

Prokhorov, Convergence of random processes and limit theorems in probability theory, Theory of Probability & Its Applications, SIAM, 1956

where the metric $d_L$ was introduced along with many of its properties, Prokhorov proves the equivalence between weak convergence of measures and convergence in the metric $d_L$. This is statement (A) in section 1.4, and the proof follows it there. With that and the remark above that $d(F_\nu, F_\mu) \leq d_L(\nu,\mu)$, you get $$F_{\mu_n} \to F_\mu \implies d_L(\mu_n ,\mu) \to 0 \implies d(F_{\mu_n},F_\mu) \to 0, $$ the first implication because convergence of the c.d.f.s (at continuity points of $F_\mu$) is the same as weak convergence $\mu_n \to \mu$.

Remark. In a previous version of your question, you mentioned you could show $d(\cdot,\cdot)$ is a metric and that convergence in $d(\cdot,\cdot)$ implies weak convergence of distributions. That direction can also be seen from the proof for $d_L$ by adapting it to sets of the form $\{y | y \prec x \}$ for all $x \in \mathbb{R}^d$. You mentioned you had trouble showing the reverse implication, so both to address that and as an example of how this specialization looks, I'll work it out below.

For completeness, I'll provide the essential arguments from the proof of this more general theorem of Prokhorov, specialized to your case of distribution functions. I'll follow the treatment given here, Theorem 4.2, nearly verbatim. Let $F_\mu, F_{\mu_n}$ be the distribution functions of the measures $\mu, \mu_n$ respectively, and suppose $\mu_n \to \mu$ weakly, which is equivalent to $F_{\mu_n}(x) \to F_\mu(x)$ at every continuity point $x$ of $F_\mu$.

Fix $\epsilon > 0$ and $0 < \delta < \epsilon/3$. Essentially by separability of the space, there exists a countable collection of open balls $\{B(x_j,r_j)\}_{j=1}^\infty$ covering $\mathbb{R}^d$, about points $\{x_j\} \subset \mathbb{R}^d$, with radii $r_j < \delta/2$ and $\mu(\partial B(x_j,r_j)) = 0$ (for this condition of massless boundaries, see Lemma 4.3). Since the balls cover $\mathbb{R}^d$, there is a $k > 0$ such that $$ B := \bigcup_{j=1}^k B(x_j,r_j) \quad \text{satisfies} \quad \mu(B) \geq 1 - \delta .$$ The collection of sets $$ \mathcal{A} := \{ \bigcup_{j \in J} B(x_j,r_j) | J \in 2^{\{1,\ldots,k\}} \}$$ is finite, and each $A \in \mathcal{A}$ is a finite union of balls with $\mu$-null boundaries, so $\mu(\partial A) = 0$ and weak convergence gives $\mu_n(A) \to \mu(A)$. Hence there exists $N > 0$ such that for all $n > N$ and all $A \in \mathcal{A}$, $$ |\mu_n(A) -\mu(A)| < \delta .$$

Now, for each $x \in \mathbb{R}^d$, let $A_x$ be the union in $\mathcal{A}$ formed by all the balls which $\{y \prec x \}$ intersects: $$ A_x := \bigcup \{B(x_j,r_j) | j \in \{1,\ldots,k\} ~\text{and}~ \{y \prec x \} \cap B(x_j, r_j) \neq \emptyset \} .$$ As I explained in the beginning of the post, for any $\epsilon' > 0$ the fattening by $\epsilon'$ satisfies $$\{y \prec x \}^{\epsilon'} = \{z \in \mathbb{R}^d | \inf_{y \prec x} |z - y| < \epsilon' \} \subseteq \{y \prec x + \epsilon'\alpha \} .$$ Since each $B(x_j, r_j)$ has diameter less than $\delta$, every point of $A_x$ lies within $\delta$ of $\{y \prec x\}$, so $A_x \subset \{y \prec x \}^\delta \subset \{y \prec x + \epsilon\alpha \}$ (using $\delta < \epsilon$).

Write $F := F_\mu$ and $F_n := F_{\mu_n}$. Noting that $\mu(\mathbb{R}^d \backslash B) \leq \delta$ and $|\mu_n(B) - \mu(B)| < \delta$, we have $\mu_n(\mathbb{R}^d \backslash B) < 2\delta$. Moreover $\{y \prec x\} \subseteq A_x \cup (\mathbb{R}^d \backslash B)$, since any point of $\{y \prec x\}$ lying in $B$ lies in one of the balls $B(x_j,r_j)$, $j \leq k$, which then meets $\{y \prec x\}$ and so is part of $A_x$. This gives $$ F_n(x) \leq \mu_n(A_x) + \mu_n(\mathbb{R}^d \backslash B) < \mu_n(A_x) + 2\delta < \mu(A_x) + 3\delta \leq F(x + \epsilon\alpha) + \epsilon .$$ As mentioned earlier, this (holding for all $x$) is equivalent to $$ F_n(x - \epsilon\alpha) - \epsilon < F(x) .$$ In the other direction, $$ F(x) \leq \mu(A_x) + \mu(\mathbb{R}^d \backslash B) < \mu(A_x) + \delta < \mu_n(A_x) + 2\delta \leq F_n(x + \epsilon\alpha) + \epsilon .$$ Since both inequalities hold for all $x$ and all $n > N$, we get $d(F_{\mu_n}, F_\mu) \leq \epsilon$ for $n > N$; as $\epsilon > 0$ was arbitrary, $d(F_{\mu_n}, F_\mu) \to 0$.
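In case it's useful, the inclusion $A_x \subset \{y \prec x\}^\delta$ used above is just the triangle inequality: if $z \in B(x_j,r_j)$ and this ball meets $\{y \prec x\}$ at some point $y_0$, then $$ |z - y_0| \leq |z - x_j| + |x_j - y_0| < r_j + r_j < \delta ,$$ so $\inf_{y \prec x} |z - y| < \delta$, i.e. $z \in \{y \prec x\}^\delta$.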

Remarks. In the one-dimensional case, one argues directly using the monotonicity of the distribution function. Measures are, in some sense, the right generalization of this monotonicity to the $\subseteq$-relation on measurable sets, and this proof uses much the same ingredients. For generalizing to arbitrary separable metric spaces this is a useful jump (though I suppose with the help of Urysohn's embedding you could pass to the Hilbert cube). In the case of $\mathbb{R}^d$, it's possible to use the $\prec$-monotonicity of the distribution functions to prove this result directly, by taking essentially a cuboid containing the bulk of the mass (the role played by $B$ above), placing continuity points near a lattice with sufficiently small spacing, and comparing arbitrary points with their projections onto these lattice points. It's significantly more tedious than arguing with measures and $\subseteq$-monotonicity.
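For what it's worth, here is a rough sketch of that one-dimensional argument (just a sketch, with constants handled loosely). Given $\epsilon > 0$, choose continuity points $t_0 < t_1 < \cdots < t_m$ of $F_\mu$ with $F_\mu(t_0) < \epsilon$, $F_\mu(t_m) > 1 - \epsilon$ and $t_i - t_{i-1} < \epsilon$. Since there are finitely many $t_i$, for $n$ large enough $|F_{\mu_n}(t_i) - F_\mu(t_i)| < \epsilon$ for all $i$. Then for $x \in (t_{i-1}, t_i]$, monotonicity gives $$ F_{\mu_n}(x) \leq F_{\mu_n}(t_i) \leq F_\mu(t_i) + \epsilon \leq F_\mu(x + \epsilon) + \epsilon \quad \text{and} \quad F_{\mu_n}(x) \geq F_{\mu_n}(t_{i-1}) \geq F_\mu(t_{i-1}) - \epsilon \geq F_\mu(x - \epsilon) - \epsilon ,$$ while on the two tails $x \leq t_0$ and $x > t_m$ the choice of $t_0, t_m$ gives the same bounds at the cost of one extra $\epsilon$. Altogether $d(F_{\mu_n}, F_\mu) \leq 2\epsilon$ for $n$ large, and $\epsilon$ was arbitrary.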