Looking for insightful explanation as to why right inverse equals left inverse for square invertible matrices

The simple proof goes:

Let B be a left inverse of A (so BA = I) and C a right inverse (so AC = I).

C = IC = (BA)C = B(AC) = BI = B

This proof relies on associativity yet does not give any insight as to why this surprising fact about matrices is true.

AC means a bunch of linear combinations of the columns of A, while CA means a bunch of linear combinations of the rows of A. Completely different numbers get multiplied in each case.
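For concreteness, here is a tiny example of what I mean (numbers picked just for illustration):

$$A=\begin{pmatrix}1&1\\0&1\end{pmatrix},\qquad C=\begin{pmatrix}1&-1\\0&1\end{pmatrix},\qquad AC=CA=\begin{pmatrix}1&0\\0&1\end{pmatrix}.$$

The $(1,2)$ entry of $AC$ comes from $1\cdot(-1)+1\cdot 1$, while the $(1,2)$ entry of $CA$ comes from $1\cdot 1+(-1)\cdot 1$: different products, yet both come out to $0$.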

The proof above is just a bunch of syntactic steps that don't have much to do with matrices directly; I cannot see why CA = AC = I.

Can anyone shed some light on this?


In fact, this isn't about matrices per se, but about inverses in general, and perhaps more specifically about inverses of functions. The same argument works for any function that has a left and a right inverse (and for elements of a monoid or ring, though these can also be interpreted as "functions" via an appropriate setting).

If you really want to try to understand the proof in terms of "meaning", then you should not think of matrices as a bunch of columns or a bunch of numbers, but rather as functions, i.e., as linear transformations.

Say $A$ is an $m\times n$ matrix; then $A$ is "really" a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$: the columns of $A$ are the images of the standard basis vectors of $\mathbb{R}^n$ under the transformation. If $B$ is a right inverse of $A$, then $B$ is $n\times m$, and $AB$ acts like the identity transformation on $\mathbb{R}^m$. In particular, $AB$ has to be onto, so the rank of $AB$ is $m$; since the rank of $AB$ is at most the rank of $A$, the rank of $A$ has to be $m$; since the rank of $A$ is at most $\min(m,n)$, we get $m\leq n$. This tells us that $A$ is onto (full rank), and that it has at least as many columns as it has rows.
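Written as one chain of inequalities, this step of the argument is just

$$m=\operatorname{rank}(AB)\leq\operatorname{rank}(A)\leq\min(m,n),$$

so $\operatorname{rank}(A)=m$ and $m\leq n$.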

If $C$ is a left inverse of $A$, then $C$ must be an $n\times m$ matrix, and $CA$ acts like the identity on $\mathbb{R}^n$. Because $CA$ is one-to-one, $A$ has to be one-to-one. In particular, its nullspace is trivial. That means that it cannot have more columns than rows (that would require a nontrivial nullspace, by the Rank-Nullity Theorem); since it has at least as many columns as it has rows, $A$ has exactly the same number of columns as rows, so $m=n$.
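Spelled out with the Rank-Nullity Theorem ($\operatorname{rank}(A)+\dim\ker A=n$):

$$\dim\ker A=0\quad\Longrightarrow\quad\operatorname{rank}(A)=n,$$

and since the rank of an $m\times n$ matrix is at most $m$, this gives $n\leq m$; combined with $m\leq n$ from before, $m=n$.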

Moreover, $A$ is now one-to-one and onto. So it is in fact bijective. So it is in fact invertible. Invertible matrices have unique inverses by definition, so $B$ and $C$ have to be equal: they have no choice in the matter. It isn't about the details of the "recipe", it's about the properties of functions: once you have a function that is one-to-one and onto, it has an inverse and the inverse is unique.
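Indeed, the uniqueness is the same computation as in your question, but now read purely at the level of functions: if $g$ and $h$ are both two-sided inverses of $f$, then

$$g=g\circ\mathrm{id}=g\circ(f\circ h)=(g\circ f)\circ h=\mathrm{id}\circ h=h.$$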


I honestly think that trying to puzzle out the details of the "recipe" is not insightful here: it is staring at the bark of a single tree instead of trying to appreciate the forest.

But if you must (and I really think you shouldn't), then you want to realize that $AC$ and $CA$ are talking in different languages: the columns of $A$ specify a basis, $\gamma$, and tell you how to express the elements of $\gamma$ in terms of the standard basis $\beta$; it provides a "translation" from $\gamma$ to $\beta$. That is, $A=[\mathrm{Id}]_{\gamma}^{\beta}$. The inverse, $C$, explains how to express the elements of the standard basis $\beta$ in terms of the vectors in $\gamma$, $C=[\mathrm{Id}]_{\beta}^{\gamma}$. $AC$ talks in the language of the standard basis $\beta$, while $CA$ talks in the language of $\gamma$. Then it becomes clear why "the same recipe" (not really) should work. It's not really the same recipe, because in $CA$ you "hear" vectors in $\gamma$, translate them into $\beta$, and then translate them back into $\gamma$. But in $AC$ you "hear" vectors in $\beta$, translate them into $\gamma$, and then back into $\beta$. The "translation recipes" are the same, whether you do $\beta\to\gamma$ first or second (translating English to Russian is the same process, whether you are translating something written originally in English into Russian, or something that was translated into English first).
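In that notation, the two products are

$$AC=[\mathrm{Id}]_{\gamma}^{\beta}\,[\mathrm{Id}]_{\beta}^{\gamma}=[\mathrm{Id}]_{\beta}^{\beta}=I,\qquad CA=[\mathrm{Id}]_{\beta}^{\gamma}\,[\mathrm{Id}]_{\gamma}^{\beta}=[\mathrm{Id}]_{\gamma}^{\gamma}=I;$$

the first identity matrix "speaks $\beta$" and the second "speaks $\gamma$"; they only look like the same matrix once you forget which basis each one is written in.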

$A$ establishes a bijection between the vectors expressed in terms of $\gamma$, and the vectors expressed in terms of $\beta$. $C$ is the bijection going "the other way". Both $AC$ and $CA$ are the identity, but they are the identity of slightly different structures: $AC$ is the identity of "$\mathbb{R}^n$-with-basis-$\beta$", and $CA$ is the identity of "$\mathbb{R}^n$-with-basis-$\gamma$." Only when you forget that $AC$ and $CA$ are being interpreted as matrices on vector-spaces-with-basis do you realize that in fact you have the same "formula" for the expressions (i.e., that both matrices are "the" identity matrix).

If you want to stare at the bark so intently that you must think of matrices in terms of "linear combinations of rows" or "linear combinations of columns", you are going to miss a lot of important properties of matrices. Matrices are really functions; multiplication of matrices is really composition of functions. They aren't a bunch of vectors or a bunch of numbers thrown into a box that you multiply based on some ad hoc rule. They are functions.

Compare how easy, and, yes, intuitive, it is to realize that matrix multiplication is associative because it is just "composition of functions", versus figuring it out by expanding the double summations of an entry-by-entry expression of $(AB)C$ and $A(BC)$: you are going in the wrong direction for 'intuitive' understanding. Staring at those summations will not tell you why multiplication is associative; it will only tell you that it is. Trying to puzzle out why a right inverse is also a left inverse in terms of "adding columns" and "adding rows" is not going to help either: you need to think of it in terms of functions.
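For the record, here is the one-line "functions" argument for associativity, using only the defining property $(MN)x=M(Nx)$ of the matrix product: for every vector $x$,

$$\bigl((AB)C\bigr)x=(AB)(Cx)=A\bigl(B(Cx)\bigr)=A\bigl((BC)x\bigr)=\bigl(A(BC)\bigr)x,$$

so $(AB)C$ and $A(BC)$ do the same thing to every vector, and hence are the same matrix.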


Arturo's response above is a very good one, as usual, but here's another way to approach the problem: since the set of invertible n x n matrices is a (non-commutative) group under matrix multiplication, what you're really asking is why the right inverse of every element in a group is also a left inverse. This property holds because the set of invertible matrices forms a group, and it's true of groups in general.

If you're unfamiliar with algebra, a group is defined as follows. A group is a set, G, together with a binary operation (i.e. a function from G x G into G) • (called the group law of G) that combines any two elements a and b to form another element, denoted a • b or ab. To qualify as a group, the set and operation, (G, •), must satisfy four requirements known as the group axioms:

i) Closure: for all a, b in G, the result of the operation, a • b, is also in G.

ii) Associativity: for all a, b and c in G, (a • b) • c = a • (b • c).

iii) Identity element: there exists an element e in G such that for every element a in G, the equation e • a = a • e = a holds. The identity element of a group G is often written as 1 or 1_G, a notation inherited from the multiplicative identity.

iv) Inverse element: for each a in G, there exists an element b in G such that a • b = b • a = 1.

It's pretty easy to prove that the set of all invertible n x n matrices is a group under matrix multiplication. Notice that it's usually assumed in the definition that the identity is unique and two-sided, and that the inverse of each element is two-sided and unique to that element. However, it's not necessary to do so. One can modify the axioms of a group so that the inverses are not assumed to be two-sided: i.e. define a group G to be a nonempty set closed under an associative binary operation * such that 1) there exists at least one element e with e*a = a for all elements a in G, and 2) for every element a, there is at least one element b such that b*a = e. You can, of course, alternatively define an identity on the right and inverses on the right. The point here is not to assume the two-sidedness of the identity and the inverses; you should try to prove that every left identity (and every left inverse) must also be a right one, as sketched below. You then prove that each left is unique and so is each right; therefore, they're equal. It's a bit tedious, but I think you'll find it answers your question in a decisive fashion.
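Here is a sketch of how that exercise can go under the weaker (left-sided) axioms; it's one standard route, and the details are worth checking yourself. Suppose b*a = e, and let c be a left inverse of b, so c*b = e. Then

$$a*b=e*(a*b)=(c*b)*(a*b)=c*\bigl(b*(a*b)\bigr)=c*\bigl((b*a)*b\bigr)=c*(e*b)=c*b=e,$$

so the left inverse b is also a right inverse; and then

$$a*e=a*(b*a)=(a*b)*a=e*a=a,$$

so the left identity e is also a right identity.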

Another way: consider the following. Modify the above definition of a group so that e is a LEFT identity and every element has a RIGHT inverse. Is the resulting structure a group? It turns out the answer is NO. For a counterexample, define a*b = b for all a and b. Then we can pick any e to be a left identity for all of the elements, and any b has the right inverse e, because b*e = e. In the trivial case of a single element e, this is clearly a group. BUT if there is more than one element, the result is NOT a group: a two-sided identity x would have to satisfy a*x = a and b*x = b, yet a*x = x and b*x = x, forcing x = a and x = b, which cannot both hold when a and b are distinct. So there is no single (two-sided) identity element!
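To make the counterexample completely concrete, here is the two-element case {e, a} with x*y = y, written out as a multiplication table (rows give the left factor):

$$\begin{array}{c|cc} * & e & a\\\hline e & e & a\\ a & e & a\end{array}$$

Every row reads (e, a), so every element is a left identity; but no column reads (e, a), so no element is a right identity, and in particular there is no two-sided identity.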

Work through the exercise above using the alternate definition of a group with the weaker axioms; that, plus the counterexample, should clear the fog of your confusion.