Convergence of discrete-time Markov chain to Feller processes
Solution 1:
Update: The consequences of Theorem 1, below, are now stated more neatly in Corollary 1 and Lemma 1. To Lemma 1, I am adding an extra assumption: the limit process $X$ has continuous sample paths, almost surely.
I will build from what you proposed and I may repeat some passages to make sure everything is in place.
Theorem 1. $\mathbb{P}\left(\rho_{T}\left(\overline{X}^{(d)},X^{(d)}\right)>\epsilon\right)\overset{d\rightarrow \infty}\longrightarrow 0$, for any $\epsilon>0$.
Theorem 1 will imply Corollary 1 and Lemma 1 further ahead.
Preliminary Remarks. I am assuming that $D_{\left[0,T\right]}$ is endowed with the (incomplete) metric $\rho_{T}(X,Y)=\inf_{\lambda\in \Lambda_T}\left\{\left|\left|\lambda-{\sf id}\right|\right|\vee \left|\left|X\circ\lambda-Y\right|\right|\right\}$, where
$\Lambda_T\overset{\Delta}=\left\{\lambda\,:\,\left[0,T\right]\rightarrow \left[0,T\right]\,:\,\lambda\mbox{ is bijective, continuous and }\lambda(0)=0,\,\lambda(T)=T\right\}$;
${\sf id}$ is the identity map from $\left[0,T\right]$ onto itself; and we have defined
$\left|\left|X\right|\right|=\sup_{t\in\left[0,T\right]} \left|X(t)\right|$
as the $\sup$ norm on the interval $\left[0,T\right]$.
Note that for a particular $\lambda\in\Lambda_T$, we have $\rho_T(X,Y)\leq\left|\left|\lambda-{\sf id}\right|\right|+\left|\left|X\circ\lambda-Y\right|\right|$ (as pointed out in your equation (2)).
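To make this bound concrete, here is a small Python sketch (not part of the proof). The two indicator paths, the jump times `a` and `b`, and all function names are hypothetical choices made only for illustration, and the suprema are approximated on a grid. It exhibits one $\lambda\in\Lambda_T$ for which $X\circ\lambda=Y$, so the bound reduces to $\left|\left|\lambda-{\sf id}\right|\right|=|a-b|$, even though the uniform distance between the two paths is $1$.

```python
def make_time_change(a, b, T):
    """Piecewise-linear bijection of [0, T] with lam(0)=0, lam(b)=a, lam(T)=T."""
    def lam(t):
        if t < b:
            return a * t / b
        return a + (t - b) * (T - a) / (T - b)
    return lam


def step(jump_time):
    """Cadlag indicator path: 0 before jump_time, 1 from jump_time onwards."""
    return lambda t: 1.0 if t >= jump_time else 0.0


def sup_on_grid(f, T, n=10_000):
    """Approximate sup over [0, T] of |f(t)| on a uniform grid (illustration only)."""
    return max(abs(f(T * i / n)) for i in range(n + 1))


# Hypothetical example: two paths that differ only in the time of their jump.
T, a, b = 1.0, 0.30, 0.35
X, Y = step(a), step(b)
lam = make_time_change(a, b, T)

time_distortion = sup_on_grid(lambda t: lam(t) - t, T)        # ||lam - id||
path_mismatch = sup_on_grid(lambda t: X(lam(t)) - Y(t), T)    # ||X o lam - Y||
uniform_distance = sup_on_grid(lambda t: X(t) - Y(t), T)      # ||X - Y||
upper_bound = time_distortion + path_mismatch                 # bounds rho_T(X, Y)

print("||lam - id||   :", time_distortion)
print("||X o lam - Y||:", path_mismatch)
print("||X - Y||      :", uniform_distance)
print("upper bound    :", upper_bound)
```

The same mechanism is used in the proof of Theorem 1: the time change is chosen so that the path term vanishes, and only the time distortion remains to be controlled.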
Proof of Theorem 1. We need to slightly correct your $\lambda^{(d)}_t$ so that it belongs to $\Lambda_T$ (almost surely): in your case $\lambda_T^{(d)}\neq T$ almost surely, whereas we need $\lambda(0)=0$ and $\lambda(T)=T$. Define $n^{\star}\overset{\Delta}=\min\left\{n\in\mathbb{N}_0\,:\,\tau_{n+1}^{(d)}> T\right\}$ (note that $n^{\star}(\omega)=N_{T}^{(d)}(\omega)$ for all $\omega\in\Omega$), and let us redefine your $\lambda_t^{(d)}$ as
$\lambda_t^{(d)}\overset{\Delta}=\sum_{n=0}^{n^{\star}-1}1_{\left[\frac{n}{d},\frac{n+1}{d}\right)}(t)\left(\tau_n^{(d)}+\left(dt-n\right)\left(\tau_{n+1}^{(d)}-\tau_n^{(d)}\right)\right)+1_{\left[\frac{n^{\star}}{d},T\right]}(t)\left(\tau_{n^{\star}}^{(d)}+\left(\frac{dt-n^{\star}}{Td-n^{\star}}\right)\left(T-\tau_{n^{\star}}^{(d)}\right)\right).$
Now we have that $\lambda^{(d)}_t\in \Lambda_T$ for all $d$, almost surely. Note, in particular, that $\lambda^{(d)}_T=T$.
We have that
$\rho_T(\overline{X}^{(d)},X^{(d)})\leq\left|\left|\lambda^{(d)}-{\sf id}\right|\right|+\left|\left|\overline{X}^{(d)}\circ\lambda^{(d)}-X^{(d)}\right|\right|=\left|\left|\lambda^{(d)}-{\sf id}\right|\right|.\tag{1}$
Note that without the correction to $\lambda^{(d)}_t$, the term $\left|\left|\overline{X}^{(d)}\circ\lambda^{(d)}-X^{(d)}\right|\right|$ in (1) would not be zero.
Now, we observe that
$\left|\left|\lambda^{(d)}-{\sf id}\right|\right|=\sup_{k\in\left\{0,1,\ldots,N_{T}^{(d)}\right\}}\left|\tau_k^{(d)}-\frac{k}{d}\right|\leq\sup_{t\in\left[0,\tau_{n^{\star}}^{(d)}\right]}\frac{1}{d}\left|N_{t}^{(d)}-td\right|\leq \sup_{t\in\left[0,T\right]}\frac{1}{d}\left|N_{t}^{(d)}-td\right|$,
where, for the first identity above: (i) since $\lambda^{(d)}-{\sf id}$ is piecewise linear, its supremum is attained at a breakpoint, so we can restrict attention to the jump times plus the final time $T$; (ii) at the final time $T$, $\lambda_T^{(d)}-T=0$, so we can ignore $T$ and restrict attention to the jumps within the interval $\left[0,T\right]$. Observe that without the correction to $\lambda_t^{(d)}$ the first identity would not hold (and the upper bound above would not follow).
Note that $N^{(d)}_t-td$ is a martingale with $N_{T}^{(d)}-Td\in L_2$, so from Doob's inequality
$\mathbb{P}\left(\sup_{t\in\left[0,T\right]}\left|\lambda_{t}^{(d)}-t\right|>\epsilon\right)\leq\mathbb{P}\left(\sup_{t\in\left[0,T\right]}\frac{1}{d}\left|N_{t}^{(d)}-td\right|>\epsilon\right)\leq \frac{E\left[\left(N_{T}^{(d)}-Td\right)^2\right]}{d^2\epsilon^2}=\frac{Td}{d^2 \epsilon^2}=\frac{T}{d\epsilon^2}\overset{d\rightarrow \infty}\longrightarrow 0$.
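As a sanity check on this tail bound (not a substitute for the argument), here is a minimal Monte Carlo sketch in Python. It assumes that $N^{(d)}$ is a Poisson process of rate $d$, which is consistent with $E\left[\left(N_T^{(d)}-Td\right)^2\right]=Td$ above; the function names, the seed and the parameter values are hypothetical choices for illustration. The supremum is computed exactly for each simulated path, because $N_t^{(d)}-td$ is linear between jumps and so its extremes occur at jump times, just before jump times, or at $T$.

```python
import random


def sup_deviation(d, T, rng):
    """Return sup over t in [0, T] of |N_t - d*t| / d for a rate-d Poisson process."""
    t = 0.0      # time of the most recent jump
    k = 0        # number of jumps so far
    worst = 0.0
    while True:
        t_next = t + rng.expovariate(d)   # exponential inter-arrival time, rate d
        if t_next > T:
            # No further jump before T: the count stays at k on [t, T].
            worst = max(worst, abs(k - d * T))
            break
        # Just before the jump the count is k; at the jump it becomes k + 1.
        worst = max(worst, abs(k - d * t_next), abs(k + 1 - d * t_next))
        t, k = t_next, k + 1
    return worst / d


def tail_estimate(d, T, eps, trials, seed=0):
    """Empirical frequency of the event {sup_t |N_t - d*t| / d > eps}."""
    rng = random.Random(seed)
    hits = sum(sup_deviation(d, T, rng) > eps for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    T, eps = 1.0, 0.2
    for d in (10, 100, 1000):
        bound = min(1.0, T / (d * eps ** 2))   # Doob bound, capped at 1
        print(d, tail_estimate(d, T, eps, trials=500), bound)
```

In runs of this sketch one would expect the empirical frequency to sit below $T/(d\epsilon^2)$ and to shrink as $d$ grows; the bound itself is typically far from tight.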
From the bound (1), we have $\rho_T(\overline{X}^{(d)}(\omega),X^{(d)}(\omega))>\epsilon \Rightarrow \left|\left|\lambda^{(d)}(\omega)-{\sf id}\right|\right|>\epsilon$ and thus,
$\mathbb{P}\left(\rho_T(\overline{X}^{(d)},X^{(d)})>\epsilon\right)\leq \mathbb{P}\left(\left|\left|\lambda^{(d)}-{\sf id}\right|\right|>\epsilon\right)\overset{d\rightarrow\infty}\longrightarrow 0$. $\tag*{$\blacksquare$}$
Corollary 1. [Convergence in probability] For every $T>0$, we have
$\mathbb{P}\left(\rho_T\left(X^{(d)},X\right)>\epsilon\right)\longrightarrow 0$ for all $\epsilon>0$ $\Leftrightarrow \mathbb{P}\left(\rho_T\left(\overline{X}^{(d)},X\right)>\epsilon\right)\longrightarrow 0$ for all $\epsilon>0$, i.e., $X^{(d)}\rightarrow X$ in probability w.r.t. $\left(\rho_T, D_{\left[0,T\right]}\right)$ if and only if $\overline{X}^{(d)}\rightarrow X$ in probability w.r.t. $\left(\rho_T, D_{\left[0,T\right]}\right)$.
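Proof of Corollary 1. Immediate from Theorem 1 and the triangle inequality: $\rho_T\left(\overline{X}^{(d)},X\right)\leq\rho_T\left(\overline{X}^{(d)},X^{(d)}\right)+\rho_T\left(X^{(d)},X\right)$, so $\mathbb{P}\left(\rho_T\left(\overline{X}^{(d)},X\right)>\epsilon\right)\leq\mathbb{P}\left(\rho_T\left(\overline{X}^{(d)},X^{(d)}\right)>\epsilon/2\right)+\mathbb{P}\left(\rho_T\left(X^{(d)},X\right)>\epsilon/2\right)$, and symmetrically with the roles of $X^{(d)}$ and $\overline{X}^{(d)}$ exchanged. $\tag*{$\blacksquare$}$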
In what follows, $\rho^{o}_T$ is a metric that is topologically equivalent to $\rho_T$, i.e., it induces the same (Skorokhod) topology on $D_{\left[0,T\right]}$, but with the difference that the metric space $\left(\rho^{o}_T,D_{\left[0,T\right]}\right)$ is complete. $\rho^{o}$ is a metric built upon $\left\{\rho^{o}_T\right\}_{T=1}^{\infty}$ and inducing the Skorokhod topology on $D_{\left[0,\infty\right)}$, with $\left(\rho^{o},D_{\left[0,\infty\right)}\right)$ complete. Their explicit forms are not needed in what follows, but can be found in equations (16.4) for $\rho^{o}$ and (12.16) for $\rho^{o}_{T}$ of Patrick Billingsley, "Convergence of Probability Measures".
Lemma 1. [Weak convergence] If $\mathbb{P}\left(X\in\mathcal{C}_{\left[0,\infty\right)}\right)=1$, then $X^{(d)}\longrightarrow X$ weakly w.r.t. the Skorokhod topology in $D_{\left[0,\infty\right)}$ if and only if $\overline{X}^{(d)}\longrightarrow X$ weakly w.r.t. the Skorokhod topology in $D_{\left[0,\infty\right)}$.
Proof of Lemma 1. Let $X^{(d)}\longrightarrow X$ weakly in $D_{\left[0,\infty\right)}$. Then, by the Skorokhod Representation Theorem (Theorem 6.7 in Billingsley), there exist $\widetilde{X}^{(d)}\equiv X^{(d)}$ and $\widetilde{X}\equiv X$ on a common probability space, where $\equiv$ stands for equality in distribution, such that $\rho^{o}(\widetilde{X}^{(d)}(\omega),\widetilde{X}(\omega))\rightarrow 0$ for all $\omega\in \Omega$. Note that $\mathbb{P}\left(\widetilde{X}\in\mathcal{C}_{\left[0,\infty\right)}\right)=\mathbb{P}\left(X\in\mathcal{C}_{\left[0,\infty\right)}\right)=1$, and so from Theorem 16.2 in Billingsley we have that $\rho^{o}_{T}(\widetilde{X}^{(d)},\widetilde{X})\rightarrow 0$ for all $T>0$, almost surely. This further implies that $X^{(d)}\longrightarrow X$ weakly with respect to $D_{\left[0,T\right]}$.
Now we resort to Theorem 4.28 referred to in the question and to Theorem 1. Let $\epsilon,\delta>0$ and choose $d$ large enough so that $\mathbb{P}\left(\rho_{T}\left(X^{(d)},\overline{X}^{(d)}\right)\leq\epsilon\right)\geq 1-\delta$; then $E\left[\rho_{T}\left(X^{(d)},\overline{X}^{(d)}\right)\wedge 1\right]\leq \epsilon+\delta$. Since $\epsilon$ and $\delta$ are arbitrary, we have $\limsup_{d\rightarrow\infty} E\left[\rho_{T}\left(X^{(d)},\overline{X}^{(d)}\right)\wedge 1\right]=0$. In light of Theorem 4.28, this implies that $\overline{X}^{(d)}\longrightarrow X$ weakly with respect to $D_{\left[0,T\right]}$, and this convergence holds for all $T>0$.
Using the same combination of the Skorokhod representation and Theorem 16.2, we conclude that $\overline{X}^{(d)}\longrightarrow X$ weakly with respect to the Skorokhod topology in $D_{\left[0,\infty\right)}$. The converse implication follows from the same argument with the roles of $X^{(d)}$ and $\overline{X}^{(d)}$ exchanged, since $\rho_T$ is symmetric. $\tag*{$\blacksquare$}$