Why is the condition $A^3 = A$ enough for a matrix to be diagonalizable?

I've heard that for a matrix $A\in M_n(\mathbb{C})$, if $A^3=A$, then $A$ is diagonalizable. Does there happen to be a proof or reference as to why this is true?

Out of curiosity, is it necessary that the entries be from $\mathbb{C}$? Would any field $F$ work just as well, or is this possibly a special fact related to the properties of $\mathbb{C}$?


Solution 1:

You might also want to look at the minimal polynomial $\mu_A$ of $A$. If $A^3-A=0$, then $\mu_A \mid X^3-X=X(X^2-1)$. If $\mathrm{char}(K)\neq 2$, where $K$ is the underlying field, this polynomial factors as $X(X-1)(X+1)$, a product of distinct linear factors, and a matrix whose minimal polynomial splits into distinct linear factors is diagonalizable.
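Not a proof, just a quick sanity check of this criterion in SymPy; the matrices `P` and `D` below are an arbitrary, hypothetical choice made only so that $A^3 = A$ holds by construction:

```python
from sympy import Matrix

# Hypothetical example: conjugate D = diag(1, -1, 0) by an invertible P,
# so that A = P*D*P^(-1) satisfies A^3 = A by construction.
P = Matrix([[1, 1, 0],
            [0, 1, 1],
            [1, 0, 1]])
D = Matrix.diag(1, -1, 0)
A = P * D * P.inv()

assert A**3 == A                 # A satisfies A^3 = A
print(A.is_diagonalizable())     # True
print(A.eigenvals())             # eigenvalues 1, -1, 0, all roots of X^3 - X
```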

Solution 2:

This holds because it holds for the Jordan normal form. Every matrix over $\mathbb C$ is similar to a matrix in Jordan normal form. The third power of a Jordan normal block of size at least $2$ is

$$ \pmatrix{\lambda&1\\ &\lambda&\ddots\\ &&\ddots&1\\ &&&\lambda&}^3 = \pmatrix{\lambda^3&3\lambda^2\\ &\lambda^3&\ddots\\ &&\ddots&3\lambda^2\\ &&&\lambda^3&} $$

(with further non-zero entries higher above the diagonal). If $A^3=A$, then this also holds for any similar matrix, and thus for the Jordan normal form, which implies $\lambda=\lambda^3$ and $3\lambda^2=1$. These equations have no common solution: $\lambda=\lambda^3$ forces $\lambda\in\{0,1,-1\}$, and none of these satisfies $3\lambda^2=1$. This shows that all Jordan blocks are of size $1$, that is, the Jordan normal form is diagonal. This clearly also works for any other equation of the form $A^k=A$.
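As a purely illustrative symbolic check of the displayed formula, cubing a $3\times 3$ Jordan block in SymPy shows $\lambda^3$ on the diagonal, $3\lambda^2$ on the first superdiagonal, and a further entry $3\lambda$ above that:

```python
from sympy import Matrix, symbols

lam = symbols('lambda')

# A 3x3 Jordan block with eigenvalue lambda.
J = Matrix([[lam, 1,   0],
            [0,   lam, 1],
            [0,   0,   lam]])

# Cube it: lambda**3 on the diagonal, 3*lambda**2 on the first
# superdiagonal, and 3*lambda in the corner entry (1, 3).
print(J**3)
```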

Regarding arbitrary fields, this exposition proves that every matrix over an algebraically closed field is similar to a matrix in Jordan normal form. Thus the result holds in any algebraically closed field $\mathbb F$ in which there is no common solution for $\lambda=\lambda^3$ and $3\lambda^2=1$. For $\operatorname{char}\mathbb F=3$, the second equation reads $0=1$, which has no solutions. For $\operatorname{char}\mathbb F\ne3$, we can multiply the first equation by $3$ and substitute for $3\lambda^2$ from the second equation to get $3\lambda=\lambda$ and thus $3=1$ (since $\lambda=0$ doesn't solve the second equation). Subtracting $1$ yields $2=0$, so a common solution can exist only in characteristic $2$; hence the statement holds whenever $\operatorname{char}\mathbb F\ne2$.

For $\operatorname{char}\mathbb F=2$, we have $\pmatrix{1&1\\0&1}^3=\pmatrix{1&1\\0&1}$, and this matrix is not diagonalizable.
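A small SymPy sketch of this counterexample, imitating $\mathbb F_2$ by reducing integer entries mod $2$:

```python
from sympy import Matrix, eye

# The characteristic-2 counterexample: work with integer matrices
# and reduce entries mod 2 to imitate the field with two elements.
A = Matrix([[1, 1],
            [0, 1]])

print((A**3).applyfunc(lambda x: x % 2) == A)   # True: A^3 = A over F_2
print(A - eye(2))                               # nonzero nilpotent, so A = I + N is not diagonalizable
```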

Solution 3:

An $n$ by $n$ matrix $A$ with coefficients in a field $K$ is diagonalizable if and only if its minimal polynomial $f$ has no multiple roots and splits over $K$.

Clearly, if $A$ is diagonalizable, $f$ has no multiple roots and splits over $K$.

For the converse, one can use the Chinese Remainder Theorem to reduce the statement to the case $A=0$.
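To sketch that reduction (a standard argument, spelled out for completeness): write $f=(X-\lambda_1)\cdots(X-\lambda_r)$ with the $\lambda_i$ distinct. The factors are pairwise coprime, so the Chinese Remainder Theorem gives a ring isomorphism

$$ K[X]/(f) \;\cong\; \prod_{i=1}^{r} K[X]/(X-\lambda_i). $$

Pulling back the idempotents of the product yields polynomials $e_1,\dots,e_r$ with $e_1+\cdots+e_r\equiv 1$ and $(X-\lambda_i)e_i\equiv 0 \pmod f$. Since $f(A)=0$, evaluating at $A$ gives $e_1(A)+\cdots+e_r(A)=I$ and

$$ V \;=\; \bigoplus_{i=1}^{r} e_i(A)V \quad\text{with}\quad (A-\lambda_i I)\,e_i(A)V = 0, $$

so on each summand $A$ acts as the scalar $\lambda_i$ (the case $A=0$ after replacing $A$ by $A-\lambda_i I$), and choosing bases of the summands diagonalizes $A$.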

Solution 4:

Here is an argument/explanation that does not use the Jordan normal form or characteristic polynomials. Instead it uses a more linear-transformation-based viewpoint and properties of projectors. I find this a useful way to think about these kinds of questions, which is why I'm posting it.


Note first that $A^3 = A$ implies that $A^4 = A^2$. This means that $A^2$ is a projection. In general, if $P^2 = P$ for some linear transformation $P$ of a vector space $V$ (over any field), then $V$ is the direct sum of the image of $P$ and the image of $I- P$. (Here $I$ denotes the identity.) Furthermore, on the image of $P$, the transformation $P$ acts by the identity, while on the image of $I - P$, it acts by zero.
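To spell out the projector fact being used (a routine check):

$$ v = Pv + (I-P)v, \qquad P\bigl((I-P)v\bigr) = (P-P^2)v = 0, \qquad P(Pv) = P^2v = Pv, $$

so every $v$ is the sum of a vector in the image of $P$ and one in the image of $I-P$, and $P$ acts by the identity on its image and by zero on the image of $I-P$. If $w = Pu = (I-P)u'$ lies in both images, then $w = Pw = P(I-P)u' = 0$, so the sum is direct.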

So in our case, the $n$-dimensional vector space $V$ on which your matrix $A$ is acting splits as the direct sum of the image of $A^2$, on which $A^2$ acts by the identity, and the image of $I - A^2$, on which (using the equation $A - A^3 = 0$) we see that $A$ acts by zero.

So we have partially diagonalized $A$; we have decomposed $V$ into a sum of two subspaces, each invariant under $A$, with $A^2 = I$ on the first, and $A = 0$ on the second.

To complete the diagonalization, we make the same kind of argument, but now we may assume that $A^2 = I$.

From $A^2 = I$, we find that $\bigl(\dfrac{I-A}{2}\bigr)^2 = \dfrac{I-A}{2}$. Thus $(I-A)/2$ is again a projector, and so the subspace on which $A^2 = I$ decomposes as a sum of two subspaces, one on which $(I-A)/2 = I,$ which is to say $A = - I$, and one on which $(I-A)/2 = 0$, which is to say $A = I$.
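Spelling out that computation (using $A^2 = I$ and the invertibility of $2$):

$$ \Bigl(\frac{I-A}{2}\Bigr)^{2} \;=\; \frac{I - 2A + A^{2}}{4} \;=\; \frac{2I - 2A}{4} \;=\; \frac{I-A}{2}. $$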

Putting it all together, we have decomposed our original space $V$ as the sum of three $A$-invariant subspaces, on which $A$ acts by $0$, $-1$, and $1$ respectively.


The argument works over any field where $2$ is invertible. If we are in char. $2$, then the decomposition into the sum of spaces on which $A = 0$ and $A^2 = I$ is still possible, but (as Joriki points out in Solution 2) we can't necessarily diagonalize a matrix satisfying $A^2 = I$.

One way to see this is to note that in char. $2$, $A^2 = I$ is equivalent to $(A-I)^2 = 0$, and so we can construct matrices $A$ such that $A^2 = I$ by choosing nilpotent matrices $N$ such that $N^2 = 0$ and then setting $A = I + N$. If $N \neq 0$ (which is possible for $n\times n$ matrices with $n > 1$), then such an $A$ (identity plus non-zero nilpotent) is not diagonalizable (in any characteristic; but away from char. $2$, matrices of this form can't satisfy $A^2 = I$).
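For completeness, the two characteristic-$2$ computations used here are just binomial expansions:

$$ (A-I)^2 \;=\; A^2 - 2A + I \;=\; A^2 - I \qquad (\text{using } 2A = 0 \text{ and } I = -I), $$

$$ (I+N)^2 \;=\; I + 2N + N^2 \;=\; I \qquad (\text{using } 2N = 0 \text{ and } N^2 = 0), $$

so in char. $2$ the condition $A^2 = I$ says exactly that $A - I$ is nilpotent of square zero.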