Linear homomorphisms of square matrices are conjugations

Problems of this kind are known as linear preserver problems in the literature. The following is a sketch of a proof that immediately comes to mind. There are certainly simpler ways to solve the problem (especially if one makes use of existing results on linear preserver problems), but anyway, let $\{e_1,\ldots,e_n\}$ be the standard basis of $\mathbb R^n$ and $E_{ij}=e_ie_j^T$.

  1. Prove that $\phi$ is injective. Hint: suppose, on the contrary, that $\phi(X)=0$ for some matrix $X$ whose $(r,s)$-th entry is nonzero. Now consider $\phi(E_{ir}XE_{sj})$ for every $(i,j)$, noting that $E_{ir}XE_{sj}=x_{rs}E_{ij}$.

  2. Prove that

    • $\phi$ preserves non-invertibility (hint: if $X$ is singular, then $XY=0$ for some nonzero $Y$),
    • $\phi$ preserves invertibility (hint: if $\phi(P)$ were singular for some invertible $P$, then $\phi(P)Y=0$ for some nonzero matrix $Y$; since $\phi$ is an injective linear operator on a finite-dimensional vector space, it is also surjective, so $Y=\phi(B)$ for some nonzero $B$, but then ...),
    • $\phi(I)=I$.
  3. This is the only interesting step in the whole proof: show that every $\phi(E_{ii})$ is a rank-1 idempotent matrix. Hint: the rank of an idempotent matrix is equal to its trace. (Steps 1–3 are sanity-checked numerically after this list.)

  4. Argue that without loss of generality, we may assume that $\phi(E_{11})=E_{11}$.

  5. Show that whenever $i,j\ne1$, the first column and the first row of $\phi(E_{ij})$ are zero (hint: $E_{ij}E_{11}=0=E_{11}E_{ij}$). By mathematical induction/recursion, show that we may further assume that $\phi(E_{ii})=E_{ii}$ for every $i$.

  6. For any off-diagonal coordinate pair $(i,j)$, show that $\phi(E_{ij})$ is a scalar multiple of $E_{ij}$ (hint: we have $E_{kk}E_{ij}=0$ for every $k\ne i$ and $E_{ij}E_{kk}=0$ for every $k\ne j$).

  7. Hence prove that in addition to all the previous assumptions (i.e. $\phi(E_{ii})=E_{ii}$ and $\phi(E_{ij})$ is a scalar multiple of $E_{ij}$ for every $i,j$), we may further assume that $\phi(E_{\color{red}{1}j})=E_{\color{red}{1}j}$ for every $j$.

  8. Since $\phi$ preserves invertibility and non-invertibility, prove that $\phi(E_{ij})=E_{ij}$ for every $(i,j)$.
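
For concreteness, here is a minimal numerical sanity check of steps 1–3. This is a Python/NumPy sketch, not part of the proof; the map $\phi$ below is built from a randomly chosen conjugation, which is the form the theorem predicts.

```python
import numpy as np

n = 4
rng = np.random.default_rng(0)

# A sample multiplicative linear map: phi(X) = S X S^{-1} for a random invertible S.
S = rng.standard_normal((n, n))
S_inv = np.linalg.inv(S)
phi = lambda X: S @ X @ S_inv

I = np.eye(n)
E = lambda i, j: np.outer(I[i], I[j])  # matrix unit E_{ij} (0-indexed here)

# Step 1: phi is injective -- its matrix w.r.t. the basis {E_ij} is invertible.
M = np.column_stack([phi(E(i, j)).ravel() for i in range(n) for j in range(n)])
assert abs(np.linalg.det(M)) > 1e-8

# Step 2 endpoint: phi(I) = I.
assert np.allclose(phi(I), I)

# Step 3: each phi(E_ii) is a rank-one idempotent, and its trace equals its rank.
for i in range(n):
    P = phi(E(i, i))
    assert np.allclose(P @ P, P)
    assert np.linalg.matrix_rank(P) == 1
    assert np.isclose(np.trace(P), 1.0)
```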


Here is an alternative proof that is much less computational. It works over any field. Let $E_{ij}$ be the matrix with a $1$ at the $(i,j)$-th entry and zeros elsewhere.

  1. $\phi$ is injective and hence an automorphism. Suppose, on the contrary, that $\phi(A)=0$ for some nonzero $A$. Since $A$ is nonzero, every square matrix $B$ can be written as a finite sum of the form $\sum_k X_kAY_k$ (simply because $\operatorname{vec}(XAY)=(Y^T\otimes X)\operatorname{vec}(A)$ and Kronecker products of the form $Y^T\otimes X$ span the set of all $n^2\times n^2$ matrices). But then we would have $\phi(B)=\sum_k\phi(X_k)\phi(A)\phi(Y_k)=0$ for every $B$, which is a contradiction because $\phi\ne0$.

  2. $\phi$ preserves the rank/nullity of a matrix. As $\phi$ is an automorphism, $\phi$ and $\phi^{-1}$ preserve the linear dependence/independence of any set of matrices, and $AX=0$ if and only if $\phi(A)\phi(X)=0$. In turn, $$ \frac1n\dim\{X\in M_n:\,AX=0\}=\frac1n\dim\{Y\in M_n:\,\phi(A)Y=0\}. $$ As the two sides are the respective nullities of $A$ and $\phi(A)$ (the solution space of $AX=0$ consists of all matrices whose columns lie in $\ker A$, so its dimension is $n$ times the nullity of $A$), the conclusion follows. In particular, $\phi$ preserves singularity/invertibility of matrices, and $\phi(I)=I$ because $\phi(I)B=B$ for every $B$ in the range of $\phi$, which is everything.

  3. As $\phi$ is an automorphism with $\phi(I)=I$, we have $\phi(p(A))=p(\phi(A))$ for every polynomial $p$; hence $p$ annihilates a matrix $A$ if and only if it annihilates $\phi(A)$. It follows that

    • $A$ and $\phi(A)$ always have identical minimal polynomials.
    • So, if $A$ is diagonalisable, $\phi(A)$ must be diagonalisable too and the two matrices must share the same set of eigenvalues. The multiplicities of the eigenvalues must be identical too, because $A-\lambda I$ and $\phi(A)-\lambda I$ have the same rank for every scalar $\lambda$. In other words, $\phi(A)$ must be similar to $A$.
    • As $\phi$ also preserves the linear independence of any set of matrices as well as matrix rank, it must map the set of all diagonal matrices onto an $n$-dimensional subspace of commuting diagonalisable matrices. Since commuting diagonalisable matrices are simultaneously diagonalisable, we may assume (after composing $\phi$ with a similarity transform) that $\phi$ maps diagonal matrices to diagonal matrices.
    • So, each $\phi(E_{ii})$ is a diagonal matrix that is similar to $E_{ii}$. Hence $\phi(E_{ii})=E_{jj}$ for some $j$. As $\phi$ is injective, we may assume without loss of generality (conjugating by a permutation matrix if necessary) that $\phi(E_{ii})=E_{ii}$ for each $i$. Hence $\phi(D)=D$ for every diagonal matrix $D$.
  4. Here comes the computational part. Let $C$ be the circulant permutation matrix whose super-diagonal as well as the bottom-left entry are filled with ones. By comparing the coefficients on both sides of $\phi(D_1CD_2)=D_1\phi(C)D_2$ for arbitrary diagonal matrices $D_1$ and $D_2$, we see that $\phi(C)=DC$ for some diagonal matrix $D$. Note that $\det D=1$ because $I=\phi(C^n)=(DC)^n=\det(D)I$. Write $D=\operatorname{diag}(d_1,\ldots,d_n)$. Then $DC=\tilde{D}C\tilde{D}^{-1}$ where $$ \tilde{D}=\operatorname{diag}\left(\prod_{j=1}^nd_j,\ \prod_{j=2}^nd_j,\ \prod_{j=3}^nd_j,\ \ldots,\ d_n\right) $$ (the super-diagonal entries match directly, and the $(n,1)$ entry matches precisely because $\det D=1$). Since every matrix $A$ can be written as $A=\sum_{k=0}^{n-1}D_kC^k$ for some diagonal matrices $D_0,D_1,\ldots,D_{n-1}$, we obtain \begin{align} \phi(A) &=\sum_{k=0}^{n-1}\phi(D_k)\,\phi(C)^k\\ &=\sum_{k=0}^{n-1}D_k\left(\tilde{D}C\tilde{D}^{-1}\right)^k\\ &=\sum_{k=0}^{n-1}D_k\tilde{D}C^k\tilde{D}^{-1}\\ &=\sum_{k=0}^{n-1}\tilde{D}D_kC^k\tilde{D}^{-1}\\ &=\tilde{D}A\tilde{D}^{-1}. \end{align}
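
As a quick check of the arithmetic in step 4, here is a NumPy sketch under the stated assumptions, with a randomly chosen $D$ normalised so that $\det D=1$:

```python
import numpy as np

n = 5
rng = np.random.default_rng(1)

# Circulant permutation matrix C: ones on the super-diagonal and at the bottom-left.
C = np.roll(np.eye(n), 1, axis=1)

# Random positive diagonal D, with the last entry rescaled so that det D = 1.
d = rng.uniform(0.5, 2.0, size=n)
d[-1] = 1.0 / np.prod(d[:-1])
D = np.diag(d)

# (DC)^n = det(D) I = I.
assert np.allclose(np.linalg.matrix_power(D @ C, n), np.eye(n))

# D-tilde from the proof: suffix products d_i d_{i+1} ... d_n on the diagonal.
t = np.cumprod(d[::-1])[::-1]
T = np.diag(t)
assert np.allclose(D @ C, T @ C @ np.linalg.inv(T))
```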


Since $\varphi$ is linear, it's only natural to investigate how it acts on the canonical basis of $M_n(k)$. Let $E_{ij}$ denote the matrix with a single one for the entry $(i,j)$ and $0$'s elsewhere. It is standard that $E_{ij}E_{kl} = \delta_{jk}E_{il}$, hence $\varphi(E_{ij})\varphi(E_{kl}) = \delta_{jk}\varphi(E_{il})$.

Let $e_k$ denote the $k$-th vector of the canonical basis of $k^n$. Note that if $\varphi(E_{ij}) = CE_{ij}C^{-1}$, then $\varphi(E_{ij})Ce_k = \delta_{jk}Ce_i$. With $j=k$, $$\varphi(E_{ij})Ce_j=Ce_i,$$ which yields a construction of $C$ (fix $j$ and let $i$ vary), as follows.

Let $I$ denote the identity matrix. Since $\varphi \neq 0$, there is some $A$ such that $\varphi(A)\neq 0$. As $\varphi(A)=\varphi(AI)=\varphi(A)\varphi(I)$, this forces $\varphi(I)\neq 0$. Since $I=\sum_{k=1}^n E_{kk}$, there exists some $k$ such that $\varphi(E_{kk})\neq 0$. WLOG suppose $k=1$.

Since $\varphi(E_{11})\neq 0$, there exists $e_1'\in \operatorname{im}\varphi(E_{11})\setminus\{0\}$. Define $e_i':=\varphi(E_{i1})(e_1')$. It is easy to check that every $e_i'$ is non-zero and that $(e_1',\ldots, e_n')$ is linearly independent, hence a basis. Let $C$ be the matrix with columns $(e_1',\ldots, e_n')$; it is invertible by construction. It is easily checked that for fixed $(i,j)$, $$\forall k,\quad \varphi(E_{ij})e_k' = CE_{ij}C^{-1}e_k',$$ thus $$\varphi(E_{ij}) = CE_{ij}C^{-1}.$$

Hence by linearity, $\varphi(A) = CAC^{-1}$ for all $A$.
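
The construction above is effective, so it can be run directly. Here is a minimal NumPy sketch; the conjugation by a hidden $S$ merely plays the role of a map satisfying the hypotheses, and the recovered $C$ need not equal $S$, only induce the same conjugation:

```python
import numpy as np

n = 4
rng = np.random.default_rng(2)

def E(i, j):
    """Matrix unit E_{ij} (1-indexed, as in the text)."""
    M = np.zeros((n, n))
    M[i - 1, j - 1] = 1.0
    return M

# A map satisfying the hypotheses, with the witnessing matrix hidden inside.
S = rng.standard_normal((n, n))
S_inv = np.linalg.inv(S)
phi = lambda X: S @ X @ S_inv

# e_1': any nonzero column of phi(E_11) lies in im phi(E_11).
P = phi(E(1, 1))
e1 = P[:, np.argmax(np.linalg.norm(P, axis=0))]

# e_i' := phi(E_i1) e_1'; C has columns e_1', ..., e_n'.
C = np.column_stack([phi(E(i, 1)) @ e1 for i in range(1, n + 1)])

# Check: phi is conjugation by the recovered C.
A = rng.standard_normal((n, n))
assert np.allclose(phi(A), C @ A @ np.linalg.inv(C))
```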