Simple proof that if $A^n=I$ then $\mathrm{tr}(A^{-1})=\overline{\mathrm{tr}(A)}$

Let $A$ be a linear map from a finite dimensional complex vector space to itself. If $A$ has finite order then the trace of its inverse is the conjugate of its trace.

I know two proofs of this fact, but they both require linear algebra facts whose proofs are themselves quite involved.

  1. Since $A^n=I$, the eigenvalues of $A$ are roots of unity. Hence they have unit norm, and so their reciprocals are their conjugates. The result then follows from the following facts: (a) the eigenvalues of $A^{-1}$ are the reciprocals of the eigenvalues of $A$, (b) the dimensions of the eigenspaces of $A^{-1}$ are equal to the dimensions of the corresponding eigenspaces of $A$, (c) the trace is equal to the sum of the (generalised) eigenvalues, counted with multiplicity. The proof of (a) is relatively easy, but (b) and (c) seem to require the existence of the Jordan Normal Form, which requires a lot of work.

  2. By Weyl's Unitary Trick, there's an inner product for which $A$ is unitary (this proof is itself a fair amount of work). So in an orthonormal basis (which we must construct with the Gram–Schmidt procedure) the inverse of $A$ is given by its conjugate transpose (one must also prove this). So the trace of the inverse is the conjugate of the trace.

Since the condition $A^n=I$ and the consequence $\mathrm{tr}(A^{-1})=\overline{\mathrm{tr}(A)}$ are both elementary statements, I'm wondering if there's a short proof from first principles (ideally without quoting any big linear algebra Theorems). Can anyone find one?
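For concreteness, the claim is easy to check numerically (a NumPy sketch, not a proof; building $A$ by conjugating a diagonal matrix of $n$-th roots of unity is just one convenient way to manufacture a matrix of finite order):

```python
import numpy as np

rng = np.random.default_rng(0)
n, dim = 6, 5                            # we want A^n = I on C^dim

# One convenient way to produce a matrix of finite order:
# conjugate a diagonal matrix of n-th roots of unity.
roots = np.exp(2j * np.pi * rng.integers(0, n, size=dim) / n)
P = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
A = P @ np.diag(roots) @ np.linalg.inv(P)

assert np.allclose(np.linalg.matrix_power(A, n), np.eye(dim))         # A^n = I
assert np.isclose(np.trace(np.linalg.inv(A)), np.conj(np.trace(A)))   # tr(A^{-1}) = conj(tr A)
```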


Addressing the question raised in the comments:

Claim. Suppose that the minimal polynomial of $A$ has distinct roots. Then $A$ is diagonalizable, and conversely.

Proof. Write $m_A = (X-\lambda_1)\cdots (X-\lambda_t)$ and set $m_i = m_A/(X-\lambda_i)$. Since the $m_i$ have no common root, we can find polynomials $p_i$ such that $1=\sum_{i=1}^t p_i m_i$. Moreover, $v_i = p_i(A)m_i(A)v$ is either zero or satisfies $(A-\lambda_i)v_i=0$, since $(X-\lambda_i)m_i = m_A$ and $m_A(A)=0$; so $v = \sum_{i=1}^tv_i$ is a sum of eigenvectors of $A$. This shows $V$ is a sum of eigenspaces, and the sum is direct since the $\lambda_i$ are distinct. $\blacktriangleleft$

Since $X^n-1$ annihilates $A$ and has distinct roots, $m_A$ divides it and so also has distinct roots; hence $A$ is diagonalizable. Since its trace is the sum of its eigenvalues, which are roots of unity, the proposed argument in $(1)$ goes through.
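To see the Bézout decomposition in action, take an involution: $A^2=I$, so $m_A$ divides $X^2-1$ and $1=\tfrac12(X+1)-\tfrac12(X-1)$, giving the projectors $\tfrac12(I+A)$ and $\tfrac12(I-A)$. A small numerical sketch, with an arbitrarily chosen non-diagonal involution:

```python
import numpy as np

# A non-diagonal involution: A^2 = I, minimal polynomial (X-1)(X+1).
A = np.array([[1.0, 2.0],
              [0.0, -1.0]])
I = np.eye(2)

# Bezout: 1 = (1/2)(X+1) - (1/2)(X-1), so the p_i(A) m_i(A) are:
P1 = (A + I) / 2    # projects onto the lambda = +1 eigenspace
P2 = (I - A) / 2    # projects onto the lambda = -1 eigenspace

assert np.allclose(A @ A, I)
assert np.allclose(P1 + P2, I)          # every v is a sum of eigenvectors
assert np.allclose((A - I) @ P1, 0)     # (A - 1) kills the first summand
assert np.allclose((A + I) @ P2, 0)     # (A + 1) kills the second summand
```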


Since the question quickly attracted four votes, I'll try to use minimal known linear algebra to get the result. (One more comment: since complex conjugation is involved, there is no purely algebraic proof, e.g. one that uses only polynomial/functional calculus in $A$.)

We start with $A$ a matrix, an endomorphism of a vector space $V$ of finite dimension $\ge 1$ over $\Bbb C$, such that for a suitable natural $n$ we have $$A^n=I\ .$$

Let $v\ne 0$ be a vector in $V$. The sequence $v, Av, A^2 v,\dots, A^nv=v, \dots$ is periodic; let $d$ be its period, a divisor of $n$. If $d=1$ we record this $v$ and set $w=v$. Otherwise, let $\xi$ be a primitive $d$-th root of unity in $\Bbb C$, e.g. $\xi=\exp\frac {2\pi\, i}d$ if we want to fix the ideas (and leave algebra). Consider the following vectors in $V$: $$ \begin{aligned} w_0 &=v +Av+\dots+A^{d-1}v\ ,\\ w_1 &=v +\xi Av+\dots+(\xi A)^{d-1}v\ ,\\ &\ \ \vdots\\ w_k &=v +(\xi^k A)v+\dots+(\xi^k A)^{d-1}v\ ,\\ &\ \ \vdots\\ w_{d-1} &=v +(\xi^{d-1} A)v+\dots+(\xi^{d-1} A)^{d-1}v\ . \end{aligned} $$ If at least one of these vectors is $\ne 0$, we record it and set $w$ to be one choice among them. Else? Else we would have the situation formally described by the following relation: $$ \underbrace{ \begin{bmatrix} 1 & 1 & 1 & \dots & 1\\ 1 & \xi & \xi^2 &\dots & \xi^{d-1}\\ 1 & \xi^2 & \xi^4 &\dots & \xi^{2(d-1)}\\ \vdots &\vdots &\vdots &\ddots &\vdots\\ 1 & \xi^{d-1} & \xi^{2(d-1)} &\dots & \xi^{(d-1)(d-1)} \end{bmatrix} }_{\text{Vandermonde}(1,\xi,\dots,\xi^{d-1})} % \begin{bmatrix} v \\ Av\\ A^2 v\\\vdots\\A^{d-1}v \end{bmatrix} = \begin{bmatrix} 0 \\ 0\\ 0\\\vdots\\0 \end{bmatrix} \ . $$ The Vandermonde matrix is invertible, so we formally multiply from the left by its inverse. To be exact, this corresponds to taking suitable linear combinations of the given formulas for $w_0,w_1,\dots,w_{d-1}$ to isolate $v=Av=\dots=0$, which contradicts $v\ne 0$. In either case we have constructed a $w\ne 0$ such that $Aw=\xi^? w$ for some root of unity $\xi^?$. We then consider $V'$, the quotient space, or some subspace of $V$ spanned by "the other" vectors extending the linearly independent system $\{w\}$ to a basis, and consider the same problem for the induced / restricted $A$ on $V'$.
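The construction is easy to test numerically. The sketch below, with an arbitrarily chosen $A$ of finite order and a random $v$, builds the vectors $w_k=\sum_{j=0}^{d-1}(\xi^k A)^j v$ and confirms that every nonzero $w_k$ satisfies $Aw_k=\xi^{-k}w_k$, and that they cannot all vanish, since $w_0+\dots+w_{d-1}=d\,v$:

```python
import numpy as np

n = 4
A = np.roll(np.eye(n), 1, axis=0)        # cyclic shift: A^4 = I
xi = np.exp(2j * np.pi / n)              # primitive n-th root of unity
rng = np.random.default_rng(1)
v = rng.standard_normal(n)               # a generic v here has orbit period d = n

d = n
W = [sum(((xi**k) ** j) * (np.linalg.matrix_power(A, j) @ v) for j in range(d))
     for k in range(d)]

assert np.allclose(sum(W), d * v)        # w_0 + ... + w_{d-1} = d v, so some w_k != 0
for k, w in enumerate(W):
    if not np.allclose(w, 0):
        assert np.allclose(A @ w, xi**(-k) * w)   # each nonzero w_k is an eigenvector
```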

Inductively we get a basis of $V$ in which $A$ acts diagonally (or upper triangularly, if taking quotients), and the entries on the diagonal are $\xi_1,\xi_2,\dots$, all of them roots of unity.

Then the inverse matrix has the same shape with diagonal $\xi_1^{-1},\xi_2^{-1},\dots$, and the equality of traces reduces to $$ \frac 1{\xi_1}+ \frac 1{\xi_2}+ \dots = \overline{\xi_1}+ \overline{\xi_2}+ \dots $$ which is true, since $|\xi_i|=1$ gives $\xi_i^{-1}=\overline{\xi_i}$.


If the trouble is just finding a simple proof of the fact that $$ \text{tr}\,A=\sum_{i=1}^n\lambda_i $$ you can try the following approach instead of the Jordan Normal Form.

  1. Over the field $\Bbb C$ we can factorize $\det(\lambda I-A)=\prod_{i=1}^n(\lambda-\lambda_i)$, where the $\lambda_i$ are all the eigenvalues (repeated according to multiplicity). The coefficient of $\lambda^{n-1}$ is $\color{red}{-\sum_{i=1}^n\lambda_i}$.
  2. Prove (e.g. expanding the determinant along the first column + induction) that the coefficients of $\lambda^n$ and $\lambda^{n-1}$ are built from the main-diagonal product only: \begin{align} \det(\lambda I-A)&=\begin{vmatrix}\lambda-a_{11} & -a_{12} & \ldots & -a_{1n}\\-a_{21} & \lambda-a_{22} & \ldots & -a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ -a_{n1} & -a_{n2} & \ldots & \lambda-a_{nn} \end{vmatrix}=\prod_{i=1}^n(\lambda-a_{ii})+<\text{terms of $\deg\le n-2$}>=\\ &=\lambda^n\color{red}{-\text{tr}\,A}\,\lambda^{n-1}+<\text{terms of $\deg\le n-2$}>. \end{align} This is because for any cofactor $A_{j1}$, $j>1$, in the first column, you remove two of the $\lambda-a_{ii}$ elements: one from the first column and one from the $j$th row.
  3. Compare the red coefficients to conclude.
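Both red coefficients are easy to check numerically: `numpy.poly` returns the coefficients of $\det(\lambda I-A)$ in decreasing degree, so the $\lambda^{n-1}$ coefficient should be $-\text{tr}\,A$, and the sum of the eigenvalues should be $\text{tr}\,A$. A sketch with an arbitrary random matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))

coeffs = np.poly(A)                      # characteristic polynomial, highest degree first
eigs = np.linalg.eigvals(A)

assert np.isclose(coeffs[0], 1.0)             # monic: coefficient of lambda^n
assert np.isclose(coeffs[1], -np.trace(A))    # coefficient of lambda^{n-1} is -tr A
assert np.isclose(eigs.sum(), np.trace(A))    # sum of eigenvalues equals tr A
```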

I will go along the lines of your point (2), since I think that there are some nice simple arguments here. Essentially all the work is showing that $A$ can be "unitarised" (there is some inner product making $A$ unitary). Once you know that, the trace property follows from essentially a single line of working. I always take inner products to be conjugate-linear in the first spot, and $\mathbb{C}$-linear in the second. Let $A: V \to V$ be a linear operator such that $A^n = I$.

Claim: If $A$ is unitary (with respect to some inner product), then $\mathrm{tr}(A) = \overline{\mathrm{tr}(A^{-1})}$.

Proof: Let $v_1, \ldots, v_k$ be an orthonormal basis. Then, using the definition of the trace as the sum of the diagonal entries of a matrix (in this basis the $(i,i)$ entry of $A$ is $\langle v_i, A v_i \rangle$), we have $$\mathrm{tr}(A) = \sum_{i = 1}^k \langle v_i, A v_i \rangle = \sum_{i = 1}^k \langle A^{n-1} v_i, A^nv_i \rangle = \sum_{i = 1}^k \langle A^{-1} v_i, v_i \rangle = \sum_{i=1}^k \overline{\langle v_i, A^{-1} v_i \rangle} = \overline{\mathrm{tr}(A^{-1})}$$ where the second equality uses that $A$ is unitary (applied $n-1$ times), and the second-last follows from the general property $\langle v, w \rangle = \overline{\langle w, v \rangle}$ of an inner product. $\blacksquare $
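The chain of equalities can be replayed numerically with the standard inner product, for which the standard basis is orthonormal and any $Q$ with $Q^*Q = I$ is unitary (a sketch; taking $Q$ from a $QR$ factorisation is just a convenient way to get a random unitary):

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Q, _ = np.linalg.qr(M)                   # Q is unitary for the standard inner product

# np.vdot conjugates its first argument, matching the convention used above.
e = np.eye(4)                            # the standard basis is orthonormal
tr_Q = sum(np.vdot(e[:, i], Q @ e[:, i]) for i in range(4))   # sum of <v_i, Q v_i>

assert np.isclose(tr_Q, np.trace(Q))
assert np.isclose(np.trace(Q), np.conj(np.trace(np.linalg.inv(Q))))
```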

Of course we still have to show that there is an inner product making $A$ unitary, but at least the above argument avoids anything about conjugate-transpose matrices. I don't know if there is a quicker way of seeing this, but for completeness I'll write out the averaging argument you would encounter in representation theory.

Claim: There is an inner product on $V$ such that $A$ is unitary.

Proof: Let $(-, -)$ be any inner product on $V$, and define a new form $$ \langle v, w \rangle := \sum_{i = 0}^{n-1} (A^i v, A^i w) $$ It is clear from the definition and the fact that $(-, -)$ is an inner product that we have additivity, conjugate-linearity in the first argument, $\mathbb{C}$-linearity in the second, and conjugate symmetry. We also have positive-definiteness, since $\langle v, v \rangle = \sum_{i=0}^{n-1} (A^i v, A^i v) \geq (v, v)$. So $\langle -, - \rangle$ is an inner product. Finally, $\langle Av, Aw \rangle = \sum_{i=0}^{n-1} (A^{i+1} v, A^{i+1} w) = \langle v, w \rangle$, since $A^n = I$ means the substitution $i \mapsto i+1$ merely cycles the $n$ terms of the sum; hence $\langle -, - \rangle$ makes $A$ unitary. $\blacksquare $
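In matrix terms: if the starting inner product is the standard one, $(v,w)=\bar v^{\,\top} w$, the averaged form has Gram matrix $G=\sum_{i=0}^{n-1}(A^i)^*A^i$, and "$A$ is unitary for $\langle -,- \rangle$" becomes $A^*GA=G$. A numerical sketch of the averaging, with $A$ again manufactured as a conjugated diagonal of roots of unity:

```python
import numpy as np

rng = np.random.default_rng(4)
n, dim = 5, 4

# Any A of finite order: conjugate a diagonal of n-th roots of unity.
roots = np.exp(2j * np.pi * rng.integers(0, n, size=dim) / n)
P = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
A = P @ np.diag(roots) @ np.linalg.inv(P)

# Averaged Gram matrix: G = sum_i (A^i)^* (A^i), starting from G = I.
powers = [np.linalg.matrix_power(A, i) for i in range(n)]
G = sum(Ai.conj().T @ Ai for Ai in powers)

assert np.allclose(A.conj().T @ G @ A, G)     # <Av, Aw> = <v, w>: A is unitary for G
assert np.all(np.linalg.eigvalsh(G) > 0)      # G is Hermitian positive definite
```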