The Baker-Campbell-Hausdorff formula says that $$ e^{tX}e^{tY}=e^Z $$ where $$ Z=tX+tY+\frac{t^2}{2}[X,Y]+\frac{t^3}{12}\big([X,[X,Y]]-[Y,[X,Y]]\big)+\dots $$ and the further terms involve higher powers of $t$ and higher-order commutators. Since $[X,Y]$ commutes with both $X$ and $Y$, every commutator of order higher than one vanishes, so the series terminates after the $t^2$ term. Moreover, $\frac{t^2}{2}[X,Y]$ then commutes with $tX+tY$, so the exponential splits: $$ \begin{align} e^{tX}e^{tY} &=e^{tX+tY+\frac{t^2}{2}[X,Y]}\\ &=e^{tX+tY}e^{\frac{t^2}{2}[X,Y]} \end{align} $$
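As a sanity check (not part of the proof), here is a minimal numerical sketch of the claimed identity using strictly upper-triangular $3\times3$ matrices, for which $[X,Y]$ is central so the hypothesis holds. The particular matrices, the value of $t$, and the use of `scipy.linalg.expm` are illustrative choices, not anything required by the argument.

```python
# Quick numerical check of e^{tX} e^{tY} = e^{t(X+Y)} e^{(t^2/2)[X,Y]}
# using 3x3 strictly upper-triangular matrices, for which [X,Y] commutes
# with both X and Y.  X, Y and t are arbitrary illustrative choices.
import numpy as np
from scipy.linalg import expm

X = np.array([[0., 1., 0.],
              [0., 0., 0.],
              [0., 0., 0.]])
Y = np.array([[0., 0., 0.],
              [0., 0., 1.],
              [0., 0., 0.]])
C = X @ Y - Y @ X          # [X,Y]; central here, so the hypothesis holds
t = 0.7

lhs = expm(t * X) @ expm(t * Y)
rhs = expm(t * (X + Y)) @ expm(0.5 * t**2 * C)
print(np.allclose(lhs, rhs))   # True
```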


The Baker-Campbell-Hausdorff Formula

Suppose $X$ and $Y$ are elements in a non-commutative algebra. Define the operator $\mathrm{ad}(X)$ by $\mathrm{ad}(X)Y = [X,Y] = XY-YX$. Then we get the following:

Lemma 1: $\displaystyle X^kY=\sum_{j=0}^k\binom{k}{j}\mathrm{ad}(X)^jYX^{k-j}$

Proof: The case $k=0$ is trivial. Suppose the statement is true for some $k$; then, since $XA=\mathrm{ad}(X)A+AX$ and using the conventions $\binom{k}{-1}=\binom{k}{k+1}=0$, $$ \begin{align} X^{k+1}Y &=\mathrm{ad}(X)\left(X^kY\right)+\left(X^kY\right)X\\ &=\sum_{j=0}^k\binom{k}{j}\mathrm{ad}(X)^{j+1}YX^{k-j} +\sum_{j=0}^k\binom{k}{j}\mathrm{ad}(X)^jYX^{k-j+1}\\ &=\sum_{j=0}^{k+1}\binom{k}{j-1}\mathrm{ad}(X)^{j}YX^{k-j+1} +\sum_{j=0}^{k+1}\binom{k}{j}\mathrm{ad}(X)^jYX^{k-j+1}\\ &=\sum_{j=0}^{k+1}\binom{k+1}{j}\mathrm{ad}(X)^jYX^{k-j+1} \end{align} $$ Thus, the statement is true for $k+1$.$\quad\square$
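Lemma 1 can also be checked mechanically on random matrices; the sketch below does so for small $k$. The helpers `ad` and `ad_power` are ad hoc names introduced only for this illustration.

```python
# Check of Lemma 1: X^k Y = sum_j C(k,j) ad(X)^j(Y) X^{k-j} on random matrices.
import numpy as np
from math import comb

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4))
Y = rng.standard_normal((4, 4))

def ad(A, B):
    """ad(A)B = AB - BA."""
    return A @ B - B @ A

def ad_power(A, B, j):
    """Apply ad(A) to B a total of j times."""
    for _ in range(j):
        B = ad(A, B)
    return B

for k in range(5):
    lhs = np.linalg.matrix_power(X, k) @ Y
    rhs = sum(comb(k, j) * ad_power(X, Y, j) @ np.linalg.matrix_power(X, k - j)
              for j in range(k + 1))
    print(k, np.allclose(lhs, rhs))   # all True
```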

Let $\mathrm{D}$ be a derivation, that is, a linear operator satisfying the product rule $\mathrm{D}(XY) = X\,\mathrm{D}(Y) +\mathrm{D}(X)\,Y$. Then, applying Lemma 1 to $X^k\,\mathrm{D}(X)$ and the hockey-stick identity $\sum_{k=j}^{n-1}\binom{k}{j}=\binom{n}{j+1}$, $$ \begin{align} \mathrm{D}\left(X^n\right) &=\sum_{k=0}^{n-1}X^k\mathrm{D}(X)X^{n-k-1}\\ &=\sum_{k=0}^{n-1}\sum_{j=0}^k\binom{k}{j}\mathrm{ad}(X)^j\mathrm{D}(X)X^{k-j}X^{n-k-1}\\ &=\sum_{j=0}^{n-1}\sum_{k=j}^{n-1}\binom{k}{j}\mathrm{ad}(X)^j\mathrm{D}(X)X^{n-j-1}\\ &=\sum_{j=0}^{n-1}\binom{n}{j+1}\mathrm{ad}(X)^j\mathrm{D}(X)X^{n-j-1}\tag{1} \end{align} $$ Using the power series for $e^X$, we get $$ \begin{align} \mathrm{D}\left(e^X\right) &=\sum_{n=0}^\infty\sum_{j=0}^{n-1}\frac1{n!}\binom{n}{j+1}\mathrm{ad}(X)^j\mathrm{D}(X)X^{n-j-1}\\ &=\sum_{j=0}^\infty\sum_{n=j+1}^\infty\frac1{(j+1)!}\frac1{(n-j-1)!}\mathrm{ad}(X)^j\mathrm{D}(X)X^{n-j-1}\\ &=\sum_{j=0}^\infty\sum_{n=0}^\infty\frac1{(j+1)!}\frac1{n!}\mathrm{ad}(X)^j\mathrm{D}(X)X^n\\ &=\left[\frac{e^{\mathrm{ad}(X)}-1}{\mathrm{ad}(X)}\mathrm{D}(X)\right]e^X\tag{2} \end{align} $$ Since $\frac{\mathrm{d}}{\mathrm{d}t}e^{tX}=Xe^{tX}$ and $\frac{\mathrm{d}}{\mathrm{d}t}e^{-tX}=-e^{-tX}X$, we get the following
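Formula $(2)$ can be tested numerically along the path $X(t)=A+tB$, for which $\mathrm{D}(X)=B$ when $\mathrm{D}=\frac{\mathrm{d}}{\mathrm{d}t}$. In the sketch below, the matrices, the finite-difference step, and the truncation order of the operator series are arbitrary illustrative choices.

```python
# Finite-difference check of (2): for X(t) = A + tB one has D(X) = B, and
# d/dt e^{X(t)} should equal [ (e^{ad X} - 1)/ad(X) applied to B ] e^{X},
# where the bracketed operator is the series sum_j ad(X)^j / (j+1)!.
import numpy as np
from scipy.linalg import expm
from math import factorial

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) * 0.3
B = rng.standard_normal((3, 3)) * 0.3
t, h = 0.5, 1e-6

X = A + t * B
numeric = (expm(A + (t + h) * B) - expm(A + (t - h) * B)) / (2 * h)

# series for (e^{ad X} - 1)/ad(X) applied to B, truncated at 30 terms
op = np.zeros_like(B)
term = B.copy()
for j in range(30):
    op += term / factorial(j + 1)
    term = X @ term - term @ X        # apply ad(X) once more
series = op @ expm(X)

print(np.allclose(numeric, series, atol=1e-5))   # True
```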

Lemma 2: $e^{tX}Ye^{-tX}=e^{t\,\mathrm{ad}(X)}Y$

Proof: Note that $$ \begin{align} \frac{\mathrm{d}}{\mathrm{d}t}\left(e^{tX}Ye^{-tX}\right) &=X\left(e^{tX}Ye^{-tX}\right)-\left(e^{tX}Ye^{-tX}\right)X\\ &=\mathrm{ad}(X)\left(e^{tX}Ye^{-tX}\right) \end{align} $$ Since $e^{t\,\mathrm{ad}(X)}Y$ satisfies the same linear differential equation $\frac{\mathrm{d}}{\mathrm{d}t}W=\mathrm{ad}(X)\,W$, and both sides equal $Y$ at $t=0$, we conclude $e^{tX}Ye^{-tX}=e^{t\,\mathrm{ad}(X)}Y$.$\quad\square$
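A quick numerical check of Lemma 2, with the right-hand side summed as a truncated series; the matrices, the value of $t$, and the truncation order are illustrative.

```python
# Check of Lemma 2: e^{tX} Y e^{-tX} = e^{t ad(X)} Y, with the right-hand
# side evaluated as a truncated exponential series in ad(X).
import numpy as np
from scipy.linalg import expm
from math import factorial

rng = np.random.default_rng(2)
X = rng.standard_normal((3, 3)) * 0.4
Y = rng.standard_normal((3, 3))
t = 0.9

lhs = expm(t * X) @ Y @ expm(-t * X)

rhs = np.zeros_like(Y)
term = Y.copy()
for j in range(30):
    rhs += (t ** j / factorial(j)) * term
    term = X @ term - term @ X        # one more application of ad(X)

print(np.allclose(lhs, rhs))   # True
```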

From this, we get $$ \begin{align} \frac{\mathrm{d}}{\mathrm{d}t}\left(e^{tX}e^{tY}\right) &=X\left(e^{tX}e^{tY}\right)+\left(e^{tX}Ye^{tY}\right)\\ &=\left(X+e^{tX}Ye^{-tX}\right)\left(e^{tX}e^{tY}\right)\\ &=\left(X+e^{t\,\mathrm{ad}(X)}Y\right)\left(e^{tX}e^{tY}\right)\tag{3} \end{align} $$ where the last step uses Lemma 2. Using the power series for $\log$, we see that, formally, there is a $Z=Z(t)$ so that $e^Z=e^{tX}e^{tY}$.
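For completeness, $(3)$ can also be checked by finite differences, using Lemma 2 to evaluate $e^{t\,\mathrm{ad}(X)}Y$ as $e^{tX}Ye^{-tX}$; the specific matrices, $t$, and step size below are illustrative.

```python
# Finite-difference check of (3): d/dt (e^{tX} e^{tY}) equals
# (X + e^{t ad X} Y) e^{tX} e^{tY}, with e^{t ad X} Y = e^{tX} Y e^{-tX}.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)
X = rng.standard_normal((3, 3)) * 0.5
Y = rng.standard_normal((3, 3)) * 0.5
t, h = 0.8, 1e-6

numeric = (expm((t + h) * X) @ expm((t + h) * Y)
           - expm((t - h) * X) @ expm((t - h) * Y)) / (2 * h)
rhs = (X + expm(t * X) @ Y @ expm(-t * X)) @ expm(t * X) @ expm(t * Y)
print(np.allclose(numeric, rhs, atol=1e-5))   # True
```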

Combining $$ \frac{\mathrm{d}}{\mathrm{d}t}\left(e^Z\right)e^{-Z} =\frac{\mathrm{d}}{\mathrm{d}t}\left(e^{tX}e^{tY}\right)e^{-tY}e^{-tX}\tag{4} $$ with $(2)$ and $(3)$, we get that $$ \frac{e^{\mathrm{ad}(Z)}-1}{\mathrm{ad}(Z)}\frac{\mathrm{d}}{\mathrm{d}t}Z =X+e^{t\,\mathrm{ad}(X)}Y\tag{5} $$ and thus, $$ \frac{\mathrm{d}}{\mathrm{d}t}Z =\frac{\mathrm{ad}(Z)}{e^{\mathrm{ad}(Z)}-1}\left(X+e^{t\,\mathrm{ad}(X)}Y\right)\tag{6} $$ Iterating $(6)$ determines the power series in $t$ for $Z$ to at least one more order of $t$, hence one more order of commutator, per iteration, and yields $$ Z=tX+tY+\frac{t^2}{2}[X,Y]+\frac{t^3}{12}\big([X,[X,Y]]-[Y,[X,Y]]\big)+\dots\tag{7} $$ where only terms with commutators of third or higher order have been omitted.
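The expansion $(7)$ can be tested numerically: for generic matrices, the difference between $\log\left(e^{tX}e^{tY}\right)$ and the displayed terms should be $O(t^4)$. The sketch below does this with `scipy.linalg.logm`; the matrices, the helper `com`, and the sample values of $t$ are illustrative.

```python
# Numerical check of (7): the difference between log(e^{tX} e^{tY}) and
# tX + tY + t^2/2 [X,Y] + t^3/12 ([X,[X,Y]] - [Y,[X,Y]]) should shrink
# like t^4 as t -> 0, so err / t^4 stays roughly constant.
import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(3)
X = rng.standard_normal((3, 3))
Y = rng.standard_normal((3, 3))

def com(A, B):
    return A @ B - B @ A

C = com(X, Y)
for t in (0.1, 0.05, 0.025):
    Z_exact = logm(expm(t * X) @ expm(t * Y))
    Z_series = (t * (X + Y) + t**2 / 2 * C
                + t**3 / 12 * (com(X, C) - com(Y, C)))
    err = np.linalg.norm(Z_exact - Z_series)
    print(t, err / t**4)   # roughly constant, i.e. the error is O(t^4)
```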



We want to show that $e^{tA}e^{tB} = e^{t(A + B)}\,e^{[A,B]\,t^{2}/2}$ when $[A,B]$ commutes with both $A$ and $B$. Define $U(t) = e^{-tA}\,e^{t(A + B)}$. Then it suffices to prove that $U(t) = e^{tB}\,e^{-[A,B]\,t^{2}/2}$.

\begin{align} \frac{\mathrm{d}}{\mathrm{d}t}U(t) &= -e^{-tA}A e^{t(A + B)} + e^{-tA}(A + B)e^{t(A + B)} = e^{-tA}B e^{t(A + B)}\\ &= e^{-tA}B e^{tA}\,e^{-tA}e^{t(A + B)} = B(t)\,U(t) \end{align} where $B(t) \equiv e^{-tA}B e^{tA}$. Notice that $\frac{\mathrm{d}}{\mathrm{d}t}B(t) = -A e^{-tA}B e^{tA} + e^{-tA}B e^{tA}A = [B(t),A]$ $$ B(0) = B\,, \quad \left.\frac{\mathrm{d}}{\mathrm{d}t}B(t)\right\vert_{t = 0} = [B,A]\,, \quad \left.\frac{\mathrm{d}^{2}}{\mathrm{d}t^{2}}B(t)\right\vert_{t = 0} = [[B,A],A] = 0 $$ $$ \left.\frac{\mathrm{d}^{3}}{\mathrm{d}t^{3}}B(t)\right\vert_{t = 0} = [[[B,A],A],A] = 0\,, \quad\ldots\quad \left.\frac{\mathrm{d}^{n}}{\mathrm{d}t^{n}}B(t)\right\vert_{t = 0} = 0 \quad (n \geq 2) $$ since $[A,B]$ commutes with $A$. Then $$ B(t) = B + [B,A]\,t $$ \begin{align} \frac{\mathrm{d}}{\mathrm{d}t}U(t) = \big\{B + [B,A]\,t\big\}U(t) \quad\Longrightarrow\quad U(t) = e^{Bt + [B,A]t^{2}/2} \end{align} where the last step uses $U(0) = 1$ and the fact that $B$ commutes with $[B,A]$, so the exponent commutes with its own $t$-derivative. Since $[B,[A,B]] = 0$ and $[A,B] = -[B,A]$:

$$ U(t) = e^{tB}\, e^{-[A,B]\,t^{2}/2} $$
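As with the first answer, this conclusion can be sanity-checked numerically on strictly upper-triangular matrices, for which $[A,B]$ commutes with both $A$ and $B$; the particular $A$, $B$, and $t$ below are illustrative.

```python
# Check of U(t) = e^{-tA} e^{t(A+B)} = e^{tB} e^{-[A,B] t^2/2} on strictly
# upper-triangular 3x3 matrices, where [A,B] is central.
import numpy as np
from scipy.linalg import expm

A = np.array([[0., 2., 0.],
              [0., 0., 0.],
              [0., 0., 0.]])
B = np.array([[0., 0., 0.],
              [0., 0., 3.],
              [0., 0., 0.]])
C = A @ B - B @ A            # [A,B], central for this choice
t = 1.3

U = expm(-t * A) @ expm(t * (A + B))
claim = expm(t * B) @ expm(-0.5 * t**2 * C)
print(np.allclose(U, claim))   # True
```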