Is there a general form for the derivative of a matrix to a power?

Let $S:Mat(2,2) \rightarrow Mat(2,2)$ be the squaring map $S(A)=A^2$; then $[DS(A)]B=AB+BA$. I was wondering whether there is a general form for this derivative (if $S(A)=A^n$, then $[DS(A)]B =$ ...). I have tried using the definition of the derivative, keeping only the terms that are linear in the increment $H$, but the computation gets very messy, very fast.


We have, as a general rule, that for $S_n(A) = A^n$,

$DS_n(A)B = \sum_{l = 0}^{l = n - 1}A^lBA^{n - l - 1}; \tag{1}$

this may readily be proved by a simple induction on $n$, using the Leibniz rule for the derivatives of products; viz:

the cases $n = 1$ and $n = 2$ merely state what we already know, that

$DS_1(A)B = B \tag{2}$

and

$DS_2(A)B = BA + AB, \tag{3}$

and the first of these will serve as the base case ($k = 1$) of our induction. (2) is evident from

$S_1(A + B) = A + B, \tag{4}$

whence

$DS_1(A)B = S_1(A + B) - S_1(A) = B; \tag{5}$

there is no error term in the linear (i.e. first degree) case. (3) is seen from

$S_2(A + B) = (A +B)^2 = A^2 + AB + BA + B^2, \tag{6}$

so that

$S_2(A + B) - S_2(A) = AB + BA + B^2; \tag{7}$

when the second order terms in $B$ are dropped from (7), we see that

$DS_2(A)B = AB + BA. \tag{8}$

Of course, the derivations presented above are not completely rigorous in the strictest sense of the word, but they do give the general flow of the arguments and also serve to show exactly what we are dealing with here. More will be said about this topic in a moment, but first, we finish off the induction which establishes (1).

So assume

$DS_k(A)B = \sum_{l = 0}^{l = k - 1}A^lBA^{k - l - 1} \tag{9}$

holds for all $A, B$. Then we simply observe that

$DS_{k + 1}(A)B = D(AS_k(A))B = (DS_1(A)B)S_k(A) + A(DS_k(A)B)$ $= BS_k(A) + A\sum_{l = 0}^{l = k - 1}A^lBA^{k - l - 1} = BA^k + \sum_{l = 0}^{l = k - 1}A^{l + 1}BA^{k - l - 1}$ $= BA^k + \sum_{l = 1}^{l = k}A^lBA^{k - l} = \sum_{l = 0}^{l = k}A^lBA^{k - l}, \tag{10}$

using (2) and (9). This completes the induction and proves that (1) holds for all $n$.
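
As a quick sanity check of (1), take $n = 3$; expanding directly,

$S_3(A + B) - S_3(A) = (A^2B + ABA + BA^2) + (AB^2 + BAB + B^2A) + B^3,$

and the part linear in $B$ is

$DS_3(A)B = BA^2 + ABA + A^2B,$

which is exactly the sum in (1) with $l = 0, 1, 2$. When $A$ and $B$ commute, this collapses to the familiar $3A^2B$.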

The minor lack of rigor in the above lies in the (unproved, tacit) assertion that the terms of order $B^2$ can be dropped from (7) without formal verification that this is so. Without going into more detail in this direction at present, I would like to add that the formula (1) may also be derived, rigorously, if one accepts the known result that for any matrix-valued function $X(t)$ of $t$ with differentiable entries, we have

$(X^n)' = \sum_{l = 0}^{l = n - 1}X^lX'X^{n - l - 1}. \tag{11}$

This formula is proved in my answer to this question. Taking $X(t) = A + Bt$ in (11), noting that $X(0) = A$ and $X'(t) = B$, and evaluating at $t = 0$ in fact yields formula (1), since $DS_n(A)B$ is the derivative of $S_n(A + Bt)$ at $t = 0$. This is a slightly different take, which can be shown to be equivalent to the present analysis. QED

Finally, it should be evident that the restriction to $Mat(2, 2)$ is inessential; the results are all valid for square matrices of any size $N \times N$, where $N \in \Bbb Z_+$, the positive integers.
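
For anyone who wants to see formula (1) at work numerically, here is a minimal sketch in Python, assuming NumPy is available. The function names `power_derivative` and `finite_difference` are my own, chosen for illustration, and the choices $N = 4$, $n = 5$ are arbitrary. It compares the sum in (1) with a central difference quotient of $t \mapsto (A + tB)^n$ at $t = 0$; this is a numerical check, not a proof.

```python
import numpy as np


def power_derivative(A, B, n):
    """Right-hand side of (1): sum over l = 0..n-1 of A^l B A^(n-l-1)."""
    mp = np.linalg.matrix_power
    return sum(mp(A, l) @ B @ mp(A, n - l - 1) for l in range(n))


def finite_difference(A, B, n, t=1e-6):
    """Central difference quotient of t -> (A + tB)^n at t = 0."""
    mp = np.linalg.matrix_power
    return (mp(A + t * B, n) - mp(A - t * B, n)) / (2 * t)


# Arbitrary test data: random N x N matrices and an arbitrary exponent n.
rng = np.random.default_rng(0)
N, n = 4, 5
A = rng.standard_normal((N, N))
B = rng.standard_normal((N, N))

exact = power_derivative(A, B, n)
approx = finite_difference(A, B, n)

# Largest entrywise discrepancy; it should be tiny compared with the entries.
print(np.max(np.abs(exact - approx)))
```

The printed discrepancy should be small relative to the entries of `exact`, and changing `N` or `n` should not alter that, in line with the remark that the size of the matrices plays no role.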

Hope this helps. Cheerio,

and as always,

Fiat Lux!!!