Why does this "miracle method" for matrix inversion work?

Recently, I answered this question about matrix invertibility using a solution technique I called a "miracle method." The question and answer are reproduced below:

Problem: Let $A$ be a matrix satisfying $A^3 = 2I$. Show that $B = A^2 - 2A + 2I$ is invertible.

Solution: Suspend your disbelief for a moment and suppose $A$ and $B$ were scalars, not matrices. Then, by power series expansion, we would simply be looking for $$ \frac{1}{B} = \frac{1}{A^2 - 2A + 2} = \frac{1}{2}+\frac{A}{2}+\frac{A^2}{4}-\frac{A^4}{8}-\frac{A^5}{8} + \cdots$$ where the coefficient of $A^n$ is $$ c_n = \frac{1+i}{2^{n+2}} \left((1-i)^n-i (1+i)^n\right). $$ But we know that $A^3 = 2$, so $$ \frac{1}{2}+\frac{A}{2}+\frac{A^2}{4}-\frac{A^4}{8}-\frac{A^5}{8} + \cdots = \frac{1}{2}+\frac{A}{2}+\frac{A^2}{4}-\frac{A}{4}-\frac{A^2}{4} + \cdots $$ and by summing the resulting coefficients on $1$, $A$, and $A^2$, we find that $$ \frac{1}{B} = \frac{2}{5} + \frac{3}{10}A + \frac{1}{10}A^2. $$ Now, what we've just done should be total nonsense if $A$ and $B$ are really matrices, not scalars. But try setting $B^{-1} = \frac{2}{5}I + \frac{3}{10}A + \frac{1}{10}A^2$, compute the product $BB^{-1}$, and you'll find that, miraculously, this answer works!
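
Here is a quick numerical check of that last claim, sketched in NumPy; the companion matrix of $x^3 - 2$ is used below purely as one convenient example of a matrix satisfying $A^3 = 2I$:

```python
import numpy as np

# Companion matrix of x^3 - 2, so A^3 = 2I (any matrix with this property would do).
A = np.array([[0., 0., 2.],
              [1., 0., 0.],
              [0., 1., 0.]])
I = np.eye(3)

B = A @ A - 2 * A + 2 * I                          # B = A^2 - 2A + 2I
B_inv = (2/5) * I + (3/10) * A + (1/10) * (A @ A)  # candidate inverse from the series

print(np.allclose(A @ A @ A, 2 * I))   # True: A^3 = 2I
print(np.allclose(B @ B_inv, I))       # True: the "miracle" formula really inverts B
print(np.allclose(B_inv @ B, I))       # True: and it is a two-sided inverse
```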

I discovered this solution technique some time ago while exploring a similar problem in Wolfram Mathematica. However, I have no idea why any of these manipulations should produce a meaningful answer when scalar and matrix inversion are such different operations. Why does this method work? Is there something deeper going on here than a serendipitous coincidence in series expansion coefficients?


The real answer is that the set of $n\times n$ matrices forms a Banach algebra - that is, a Banach space equipped with an associative multiplication that distributes over addition and satisfies $\|xy\| \le \|x\|\,\|y\|$. In the reals, multiplication is the same as scaling, so the distinction doesn't matter and we don't think about it. But for matrices, scaling by a number and multiplying two matrices are different operations. The point is that there is no miracle. Rather, the argument you gave only uses tools available in any Banach algebra (notably, you never needed commutativity of multiplication), so it generalizes nicely.

This kind of trick is used all the time to great effect. One classic example is proving that when $\|A\|<1$ the element $1-A$ has an inverse, given by the Neumann (geometric) series. You take the geometric-series argument from real analysis, check that every step still works in a Banach algebra, and you're done.
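
For instance, here is a minimal NumPy sketch of that classic fact; the matrix below is random and rescaled so that $\|A\| < 1$, and the truncation length is an arbitrary choice for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Any square matrix rescaled so that its operator (spectral) norm is below 1.
A = rng.standard_normal((4, 4))
A *= 0.9 / np.linalg.norm(A, 2)        # now ||A||_2 = 0.9 < 1
I = np.eye(4)

# Partial sums of the geometric (Neumann) series I + A + A^2 + ...
S = np.zeros_like(A)
term = I.copy()
for _ in range(400):
    S += term
    term = term @ A

print(np.allclose(S, np.linalg.inv(I - A)))  # True: the series sums to (I - A)^{-1}
```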


Think about how you derive the finite version of the geometric series formula for scalars. You write:

$$x \sum_{n=0}^N x^n = \sum_{n=1}^{N+1} x^n = \sum_{n=0}^N x^n + x^{N+1} - 1.$$

Writing $S=\sum_{n=0}^N x^n$, this says $xS=S+x^{N+1}-1$. So you move the $S$ over, and you get $(x-1)S=x^{N+1}-1$. Thus $S=(x-1)^{-1}(x^{N+1}-1)$.

There is only one point in this calculation where you need to be careful about commutativity of multiplication, and that is the step where you multiply both sides by $(x-1)^{-1}$. In the above I was careful to apply it on the left, because $S$ was multiplied by $x$ (and hence by $x-1$) on the left. Thus, provided we perform this one multiplication on the left, everything we did works when $x$ is an element of any ring with identity in which $x-1$ has a multiplicative inverse.

As a result, if $A-I$ is invertible, then

$$\sum_{n=0}^N A^n = (A-I)^{-1}(A^{N+1}-I).$$
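
A quick NumPy check of this identity, with a small matrix and exponent chosen arbitrarily (the only requirement is that $A - I$ be invertible):

```python
import numpy as np

# An arbitrary matrix for which A - I happens to be invertible (det(A - I) = -4 here).
A = np.array([[0., 1.],
              [2., 3.]])
I = np.eye(2)
N = 5

lhs = sum(np.linalg.matrix_power(A, n) for n in range(N + 1))
rhs = np.linalg.inv(A - I) @ (np.linalg.matrix_power(A, N + 1) - I)

print(np.allclose(lhs, rhs))  # True
```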

Moreover, if $\| A \| < 1$ in a submultiplicative norm (an operator norm, say), then $\|A^{N+1}\| \le \|A\|^{N+1} \to 0$ as $N \to \infty$. As a result, the partial sums form a Cauchy sequence (their tails are dominated by a convergent geometric series of real numbers), so if the ring in question is also complete with respect to this norm, you obtain

$$\sum_{n=0}^\infty A^n = (I-A)^{-1}.$$

In particular, in this situation we no longer need to assume invertibility: if $\| A \| < 1$, then $I-A$ is automatically invertible, because passing to the limit in $(I-A)\sum_{n=0}^N A^n = \left(\sum_{n=0}^N A^n\right)(I-A) = I - A^{N+1}$ shows that the limit of the partial sums is a two-sided inverse of $I-A$.