Computation in Wikipedia's article "Riemann Curvature Tensor"
Solution 1:
Let's work in coordinates $(x^1,\dots,x^n)$ centered at a point $p$. It suffices to prove the equation in the case where $X=\frac{\partial}{\partial x^i}$, $Y=\frac{\partial}{\partial x^j}$ and $Z\in T_pM$ is arbitrary. To help with the computation, write $Z=u_0$ with $u_0=u_0^k\frac{\partial}{\partial x^k}$. For simplicity, set $X=\frac{\partial}{\partial x^1}$ and $Y=\frac{\partial}{\partial x^2}$.
Let $\tau_1^s$ denote the flow of $\frac{\partial}{\partial x^1}$ and $\tau_2^t$ the flow of $\frac{\partial}{\partial x^2}$. It follows that in coordinates $$\tau_1^s(x^1,\dots,x^n)=(x^1+s,x^2,\dots,x^n)$$ and $$\tau_2^t(x^1,\dots,x^n)=(x^1,x^2+t,\dots,x^n).$$ Then $$\tau(s,t):=\tau_2^{-t}\circ\tau_1^{-s}\circ\tau_2^t\circ\tau_1^s(p)$$ is the endpoint of a path around a coordinate parallelogram with corners $p_0=p$, $p_1=\tau_1^s(p)$, $p_2=\tau_2^t(p_1)$, $p_3=\tau_1^{-s}(p_2)$ and $p_4=\tau_2^{-t}(p_3)$. Since the coordinate flows commute, $p_4=p_0$ and the path is closed.
Recall the Christoffel symbols are defined by $$\nabla_{\frac{\partial}{\partial x^i}}\frac{\partial}{\partial x^j}=\Gamma_{ij}^k\frac{\partial}{\partial x^k}$$ and that given $V_p\in T_pM$ and a curve $\gamma$, the parallel transport $\widetilde V$ of $V_p$ along $\gamma$ satisfies $$\frac{d\widetilde V^k}{dt}(t)=-\Gamma_{ij}^k(\gamma(t))\frac{d\gamma^i}{dt}(t)\widetilde V^j(t)$$ for any time $t$. Also recall that in the coordinate frame the curvature tensor looks like $$R_{ijk}{}^m=\frac{\partial \Gamma_{jk}^m}{\partial x^i}-\frac{\partial\Gamma_{ik}^m}{\partial x^j}+\Gamma_{il}^m\Gamma_{jk}^l-\Gamma_{jl}^m\Gamma_{ik}^l.$$ To make the notation clearer, let $\Gamma_i$ denote the $n\times n$ matrix, or endomorphism of the tangent bundle, given by $$\Gamma_i := \Gamma_{ij}^k \text{ with } j,k=1,\dots,n.$$ It follows that the curvature endomorphism $F$ is given by $$F_{ij}=\frac{\partial \Gamma_j}{\partial x^i}-\frac{\partial\Gamma_i}{\partial x^j}+\Gamma_i\Gamma_j-\Gamma_j\Gamma_i=[\nabla_i,\nabla_j].$$ Let $u_1(\gamma(s,t))=u_1(s,t)=u_1(s)$ denote the parallel transport of $u_0$ along the curve that connects $(x^1,\dots,x^n)$ to $(x^1+s,x^2,\dots,x^n)$, i.e. that connects $\tau(0,0)$ to $\tau(s,0)$. Note we are abusing notation by letting $s$ and $t$ denote the variables of $u_1$ as well as the end corners of the parallelogram. Let $u_1^k$ denote the $k$-th component of $u_1$ in the coordinate frame. Taylor expanding $u_1^k(s,t)$ about $(0,0)$ we get that
\begin{align} u^k_1(s,t)&=u^k_1(p_0)+s\frac{\partial}{\partial s}u_1^k(p_0) +t\frac{\partial}{\partial t}u^k_1(p_0) + O(2)\\ &=u^k_1(p_0)+s\frac{\partial}{\partial s}u_1^k(p_0) +O(2)\\ &=u_1^k(p_0)-s\Gamma_{ij}^k(p_0)\frac{d}{dt}\gamma^i(t)u_1^j(p_0)+O(2)\\ &=u_0^k-s\Gamma_{1j}^k(p_0)u_0^j +O(2) & \text{since $\frac{d}{dt}\gamma^i(t)=\delta_{i1}$} \end{align}
Since the sum over $j$ in $s\Gamma_{1j}^k(p_0)u_0^j$ is just matrix multiplication, we get that $$u_1(s,t)=u_0-s\Gamma_1(p_0)u_0+O(2).$$ Now let $u_2(s,t)$ denote the parallel transport of $u_1$ along $\tau(s,t)$ from $\tau(s,0)$ to $\tau(s,t)$. Along this leg $s$ is held fixed and only $t$ varies. Taylor expanding $u_2(s,t)$ centered at the point $(s,0)$, the exact same reasoning as above gives that $$u_2(s,t)=u_1-t\Gamma_2(p_1)u_1+O(2).$$ Substituting the expression of $u_1$ in terms of $u_0$ into this expression yields $$u_2(s,t)=\left(\mathbb{1}-s\Gamma_1(p_0)-t\Gamma_2(p_1)+st\Gamma_2(p_1)\Gamma_1(p_0)\right)u_0+O(2).$$ Notice that there is no $st$ term in $O(2)$, as the remainder of each leg depends on only one variable.
Each entry of the matrices $\Gamma_1(s,t)$ and $\Gamma_2(s,t)$ can also be expanded as a Taylor series through the chain rule. Along the curve connecting $p_0$ to $p_1$ only $s$ varies, so a Taylor expansion of $\Gamma_2$ around $p_0$ is given by
\begin{align} \Gamma_2(p_1)&=\Gamma_2(p_0)+s\frac{\partial\Gamma_2}{\partial x^i}\frac{\partial(\tau_1^s)^i}{\partial s}+O(2)\\ &=\Gamma_2(p_0)+s\frac{\partial\Gamma_2}{\partial x^1}(p_0)+O(2). \end{align}
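Substituting this for $\Gamma_2(p_1)$ in the equation of $u_2$ in terms of $u_0$ (the first-order term $-t\Gamma_2(p_1)$ contributes $-st\frac{\partial\Gamma_2}{\partial x^1}$, while the correction to the $st$ term is of third order) gives $$u_2=\left(\mathbb{1}-s\Gamma_1-t\Gamma_2-st\left(\frac{\partial\Gamma_2}{\partial x^1}-\Gamma_2\Gamma_1\right)\right)u_0+O(2),$$ where all matrices are evaluated at $p_0$.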
Similarly, a Taylor expansion gives $$\Gamma_1(p_1)=\Gamma_1(p_0)+s\frac{\partial\Gamma_1}{\partial x^1}(p_0)+O(2),$$ from which another Taylor expansion gives $$\Gamma_1(p_2)=\Gamma_1(p_1)+t\frac{\partial\Gamma_1}{\partial x^2}(p_1)+O(2)=\Gamma_1(p_0)+s\frac{\partial\Gamma_1}{\partial x^1}(p_0)+t\frac{\partial\Gamma_1}{\partial x^2}(p_0)+O(2).$$
Letting $u_3$ denote the parallel transport along the path from $\tau(s,t)$ to $\tau(0,t)$, that is along the curve $\tau_1^{-s}$, we get the expression $$u_3=u_2+s\Gamma_1(p_2)u_2+O(2).$$ Notice that there is a plus sign in the second term as we have differentiated with respect to $-s$. Substituting the expansion of $\Gamma_1(p_2)$ about $p_0$, whose only contribution at order $st$ is $st\frac{\partial\Gamma_1}{\partial x^2}$, together with the expression for $u_2$, we get that $$u_3=\left(\mathbb{1}-t\Gamma_2+st\left(\frac{\partial\Gamma_1}{\partial x^2}-\frac{\partial\Gamma_2}{\partial x^1}+\Gamma_2\Gamma_1-\Gamma_1\Gamma_2\right)\right)u_0+O(2),$$ where the $O(2)$ does not contain an $st$-term.
Letting $u_4$ denote the parallel transport along the path from $\tau(0,t)$ to $\tau(0,0)$, that is along $\tau_2^{-t}$, we get the expression $$u_4=u_3+t\Gamma_2(p_3)u_3+O(2).$$ Since $p_3$ differs from $p_0$ only in the $x^2$-coordinate, expanding $\Gamma_2(p_3)$ about $p_0$ adds only $t^2$-terms. Substituting in the expression for $u_3$ in terms of $u_0$ we get that $$u_4=u_0+st\left(\frac{\partial\Gamma_1}{\partial x^2}-\frac{\partial\Gamma_2}{\partial x^1}+\Gamma_2\Gamma_1-\Gamma_1\Gamma_2\right)u_0+O(2),$$ where the $O(2)$ term does not contain an $st$-term. The bracket is exactly $-F_{12}$, that is, $$u_4=u_0-stF_{12}u_0+O(2).$$ By the uniqueness of the Taylor expansion it follows that $$\left.\frac{\partial^2u_4}{\partial s\partial t}\right|_{(0,0)}=-F_{12}u_0.$$ But by definition, $u_4$ is the parallel transport of $u_0$ along the curve $\tau(s,t)$. Since $u_0$ was arbitrary, we have shown $$\left.\frac{\partial^2\Pi_{\tau}}{\partial s\partial t}\right|_{(0,0)}=-F_{12}.$$
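As a sanity check on the sign and on the $st$-coefficient, here is a minimal numerical sketch. It is not part of the argument above, and everything in it is an assumption made for illustration: the manifold is the unit 2-sphere with coordinates $(x^1,x^2)=(\theta,\varphi)$, whose nonzero Christoffel symbols are $\Gamma^\theta_{\varphi\varphi}=-\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi}=\Gamma^\varphi_{\varphi\theta}=\cot\theta$, and the helper names (`gamma`, `transport`, `holonomy`, `curvature_f12`) are hypothetical. The code integrates the parallel transport equation $\frac{du}{dr}=-\Gamma_i u$ around the four legs with a hand-written RK4 step and compares $(\Pi_\tau-\mathbb{1})/(st)$ with $-F_{12}$, where $F_{12}$ for the sphere is computed by hand from the formula for $F_{ij}$.

```python
import math

def gamma(i, point):
    """Matrix (Gamma_i)^k_j = Gamma^k_{ij} on the unit 2-sphere, coordinates (theta, phi)."""
    theta = point[0]
    cot = math.cos(theta) / math.sin(theta)
    if i == 0:  # direction d/dtheta
        return [[0.0, 0.0], [0.0, cot]]
    # direction d/dphi
    return [[0.0, -math.sin(theta) * math.cos(theta)], [cot, 0.0]]

def rhs(i, point, u):
    """Right-hand side of the transport equation du/dr = -Gamma_i(point) u."""
    g = gamma(i, point)
    return [-(g[k][0] * u[0] + g[k][1] * u[1]) for k in range(2)]

def shifted(point, i, r):
    """Point reached by moving a coordinate distance r along the i-th coordinate line."""
    q = list(point)
    q[i] += r
    return q

def transport(u, start, i, length, steps=100):
    """RK4 parallel transport of u along start + r*e_i for r from 0 to length (may be negative)."""
    h = length / steps
    u = list(u)
    for n in range(steps):
        r = n * h
        k1 = rhs(i, shifted(start, i, r), u)
        k2 = rhs(i, shifted(start, i, r + h / 2), [u[a] + h / 2 * k1[a] for a in range(2)])
        k3 = rhs(i, shifted(start, i, r + h / 2), [u[a] + h / 2 * k2[a] for a in range(2)])
        k4 = rhs(i, shifted(start, i, r + h), [u[a] + h * k3[a] for a in range(2)])
        u = [u[a] + h / 6 * (k1[a] + 2 * k2[a] + 2 * k3[a] + k4[a]) for a in range(2)]
    return u, shifted(start, i, length)

def holonomy(p, s, t):
    """Matrix of parallel transport around the loop p0 -> p1 -> p2 -> p3 -> p0."""
    columns = []
    for e in ([1.0, 0.0], [0.0, 1.0]):
        u, q = e, list(p)
        for i, length in ((0, s), (1, t), (0, -s), (1, -t)):
            u, q = transport(u, q, i, length)
        columns.append(u)
    # columns[j] is the image of the j-th basis vector; return entries H[k][j].
    return [[columns[j][k] for j in range(2)] for k in range(2)]

def curvature_f12(p):
    """F_12 = d_1 Gamma_2 - d_2 Gamma_1 + [Gamma_1, Gamma_2] for the sphere, worked out by hand."""
    theta = p[0]
    return [[0.0, math.sin(theta) ** 2], [-1.0, 0.0]]

p, s, t = (1.0, 0.3), 1e-3, 1e-3
H = holonomy(p, s, t)
F = curvature_f12(p)
# Finite-difference estimate of the mixed second derivative of the holonomy at (0, 0).
estimate = [[(H[k][j] - (1.0 if k == j else 0.0)) / (s * t) for j in range(2)] for k in range(2)]
print(estimate)
```

For small $s$ and $t$ the entries of `estimate` should agree with those of $-F_{12}$ up to an error of order $s+t$, which is what the expansion $u_4=u_0-stF_{12}u_0+O(2)$ predicts.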
Solution 2:
This question is old but since the above answer was not very helpful for me, I figured I should post the one I got.
I denote $\tau_{tX}$ by $\tau_t^X$. Note that $\left(\tau_t^X\right)^{-1} = \tau_{-t}^X$. Also, denote $g(t_1,t_2,t_3,t_4) = \tau^X_{t_1}\tau_{t_2}^Y\tau_{t_3}^X\tau_{t_4}^YZ$. Then \begin{align*} \left.\frac{d}{dt}\right\vert_0\left.\frac{d}{ds}\right\vert_0 \left(\tau^X_t\right)^{-1}\left(\tau_s^Y\right)^{-1}\tau_t^X\tau_s^YZ =& \left.\frac{d}{dt}\right\vert_0\left.\frac{d}{ds}\right\vert_0 g(-t,-s,t,s)\\ =& \left.\frac{d}{dt}\right\vert_0 \left[-\frac{\partial g}{\partial t_2}(-t,0,t,0) + \frac{\partial g}{\partial t_4}(-t,0,t,0)\right]\\ =& \frac{\partial^2 g}{\partial t_1\partial t_2}(0) - \frac{\partial^2 g}{\partial t_3\partial t_2}(0) - \frac{\partial^2 g}{\partial t_1\partial t_4}(0) + \frac{\partial^2 g}{\partial t_3\partial t_4}(0)\\ =& \left.\frac{d}{dt}\right\vert_0\left.\frac{d}{ds}\right\vert_0 \Big[\tau^X_{t}\tau_{s}^Y\tau_{0}^X\tau_{0}^YZ - \tau^X_{0}\tau_{s}^Y\tau_{t}^X\tau_{0}^YZ\\ &- \tau^X_{t}\tau_{0}^Y\tau_{0}^X\tau_{s}^YZ +\tau^X_{0}\tau_{0}^Y\tau_{t}^X\tau_{s}^YZ \Big]\\ =& \left.\frac{d}{dt}\right\vert_0\left.\frac{d}{ds}\right\vert_0 \Big[\tau^X_{-t}\tau_{-s}^Y\tau_{0}^X\tau_{0}^YZ - \tau^X_{0}\tau_{-s}^Y\tau_{-t}^X\tau_{0}^YZ\\ &- \tau^X_{-t}\tau_{0}^Y\tau_{0}^X\tau_{-s}^YZ +\tau^X_{0}\tau_{0}^Y\tau_{-t}^X\tau_{-s}^YZ \Big](-1)^2\\ =& \nabla_X\nabla_Y Z-\nabla_Y\nabla_X Z-\nabla_X\nabla_Y Z+\nabla_X\nabla_Y Z\\ =& \nabla_X\nabla_Y Z-\nabla_Y\nabla_X Z = R(X,Y)Z, \end{align*} where the second-to-last step uses $\nabla_XW=\left.\frac{d}{dt}\right\vert_0\tau^X_{-t}W$ (and likewise for $Y$), and we recall that in the Wikipedia article it is assumed that the vector fields commute, i.e. $[X,Y]=0$.