Expressing the trace of $A^2$ by trace of $A$

Let $A$ be a square matrix. Is it possible to express $\operatorname{trace}(A^2)$ in terms of $\operatorname{trace}(A)$, or at least something close?


In general,

$\text{Tr}(A^2) = (\text{Tr}A)^2 - 2 \sigma_2(A), \tag{1}$

where $\text{Tr}A$ denotes the trace of $A$, and $\sigma_2(A)$ is the coefficient of $\lambda^{N - 2}$ in the characteristic polynomial $p_A(\lambda)$ of $A$, where $N$ is the size of $A$. We have

$\sigma_2(A) = \sum_{i < j}\lambda_i \lambda_j, \tag{2}$

where $\lambda_1, \lambda_2, \ldots, \lambda_N \in \Bbb C$ are the eigenvalues of $A$. In this formula, repeated eigenvalues are admitted but are assigned distinct indices.

This result may be seen as follows: factoring $p_A(\lambda)$, we have

$p_A(\lambda) = \prod_{i=1}^N (\lambda - \lambda_i) = \sum_{i=0}^N (-1)^i\sigma_i(\lambda_1, \lambda_2, \ldots, \lambda_N) \lambda^{N - i}, \tag{3}$

where the $\sigma_i(\lambda_1, \lambda_2, \ldots, \lambda_N)$ are the so-called elementary symmetric functions/polynomials in the $\lambda_i$. This result is very well-known and is thoroughly discussed in the Wikipedia entry on elementary symmetric polynomials. Inspecting (3), it is easily seen that

$\sigma_1(\lambda_1, \lambda_2, \ldots, \lambda_N) = \text{Tr}A; \tag {4}$

and

$\sigma_2(\lambda_1, \lambda_2, \ldots, \lambda_N) = \sum_{i < j}\lambda_i \lambda_j. \tag{5}$

$\sigma_k(\lambda_1, \lambda_2, \ldots, \lambda_N)$ is the sum of $\binom{N}{k} = \frac{N!}{k! (N - k)!}$ terms, each being the product of precisely $k$ of the $\lambda_i$ with distinct $i$. It is a homogeneous polynomial of degree $k$, and is evidently invariant under any permutation of the indices of the $\lambda_i$. We also take

$\sigma_0(\lambda_1, \lambda_2, \ldots, \lambda_N) = 1. \tag{6}$

An important fact for the present purposes is that, though the $\sigma_k(\lambda_1, \lambda_2, \ldots, \lambda_N)$ may be expressed in terms of the $\lambda_i$, in the case of $p_A(\lambda)$ they may be had without explicit knowledge of the eigenvalues simply by obtaining the coefficients of $p_A(\lambda)$ from the defining equation

$p_A(\lambda) = \det(\lambda I - A); \tag{7}$

thus there is no ambiguity in referring to the $\sigma_k(A)$, as we have done above for the cases $k = 1, 2$.
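To illustrate this point numerically (a sketch, assuming NumPy is available; the matrix below is an arbitrary example), $\sigma_2$ computed from the eigenvalues via (2) agrees with the coefficient of $\lambda^{N-2}$ read off from $\det(\lambda I - A)$; `np.poly` returns the characteristic-polynomial coefficients in descending powers of $\lambda$:

```python
import numpy as np
from itertools import combinations

A = np.array([[2., 1., 0.],
              [0., 3., 1.],
              [1., 0., 4.]])

# sigma_2 from the eigenvalues, per (2): sum over all pairs i < j
lam = np.linalg.eigvals(A)
sigma2_eigs = sum(lam[i] * lam[j] for i, j in combinations(range(len(lam)), 2))

# sigma_2 without the eigenvalues: by (3) and (7), the coefficient of
# lambda^(N-2) in det(lambda I - A) is (+1)^2 sigma_2 = sigma_2
sigma2_coeff = np.poly(A)[2]

print(np.isclose(sigma2_eigs, sigma2_coeff))  # True
```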

Bearing these observations in mind, we recall that the eigenvalues of $A^2$ are precisely the $\lambda_i^2$ and thus

$(\text{Tr}(A))^2 = (\sum_i \lambda_i)^2 = \sum_i \lambda_i^2 + 2\sum_{i < j}\lambda_i \lambda_j = \text{Tr}(A^2) + 2\sigma_2(A); \tag{8}$

(1) follows by way of a minor re-arrangement of (8). QED.
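As a numerical sanity check of (1) (an illustration, not part of the proof; assumes NumPy), we can compare $\text{Tr}(A^2)$ against $(\text{Tr}A)^2 - 2\sigma_2(A)$ for a random matrix, reading $\sigma_2(A)$ off the characteristic polynomial:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))

# Coefficients of p_A(lambda) = det(lambda I - A), highest power first:
# [1, -sigma_1, +sigma_2, -sigma_3, ...], per (3)
coeffs = np.poly(A)
sigma2 = coeffs[2]  # coefficient of lambda^(N-2) is +sigma_2(A)

lhs = np.trace(A @ A)                  # Tr(A^2)
rhs = np.trace(A) ** 2 - 2 * sigma2    # (Tr A)^2 - 2 sigma_2(A), identity (1)
print(np.isclose(lhs, rhs))  # True
```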

The full machinery of symmetric polynomials can actually be avoided by means of a simple induction whereby we may directly show that, for any monic polynomial $p(\lambda)$ with complex coefficients, $\deg p = N$, and roots $\lambda_1, \lambda_2, \ldots, \lambda_N$, the coefficient of $\lambda^{N - 2}$ is $\sigma_2(\lambda_1, \lambda_2, \ldots, \lambda_N)$. The case $N = 2$ is easily verified, and serves as our base case:

$(\lambda - \lambda_1)(\lambda - \lambda_2) = \lambda^2 - (\lambda_1 + \lambda_2)\lambda + \lambda_1 \lambda_2; \tag{9}$

now suppose that

$\prod_{i=1}^k (\lambda - \lambda_i) = \lambda^k - \Big(\sum_{i=1}^k \lambda_i\Big)\lambda^{k - 1} + \Big(\sum_{1 \le i < j \le k} \lambda_i \lambda_j\Big) \lambda^{k - 2} + r(\lambda), \tag{10}$

where if $r(\lambda) \ne 0$ we have $\deg r(\lambda) \le k - 3$. For $\lambda_{k + 1}$ arbitrary,

$\prod_1^{k + 1} (\lambda - \lambda_i) = (\lambda - \lambda_{k + 1}) (\lambda^k - (\sum_1^k \lambda_i)\lambda^{k - 1} + (\sum_{1 \le i < j \le k} \lambda_i \lambda_j) \lambda^{k - 2} + r(\lambda))$ $= \lambda^{k + 1} - (\sum_1^k \lambda_i)\lambda^k + (\sum_{1 \le i < j \le k} \lambda_i \lambda_j) \lambda^{k - 1} + \lambda r(\lambda)$ $- \lambda_{k + 1} \lambda^k + (\sum_1^k \lambda_i \lambda_{k + 1})\lambda^{k - 1} - (\sum_{1 \le i < j \le k} \lambda_i \lambda_j \lambda_{k + 1}) \lambda^{k - 2} - \lambda_{k + 1}r(\lambda)$ $=\lambda^{k + 1} -(\sum_1^{k + 1} \lambda_i) \lambda^k + (\sum_{1 \le i < j \le k + 1} \lambda_i \lambda_j) \lambda^{k - 1}$ $-(\sum_{1 \le i < j \le k} \lambda_i \lambda_j \lambda_{k + 1}) \lambda^{k - 2} + \lambda r(\lambda) - \lambda_{k + 1} r(\lambda). \tag{11}$

Inspection of (11) reveals that the last three summands are all of degree $k - 2$ or less, since $\deg r(\lambda) \le k - 3$; thus (11) shows that the coefficient of $\lambda^{k - 1}$ in $\prod_{i=1}^{k + 1} (\lambda - \lambda_i)$ is in fact $\sigma_2(\lambda_1, \lambda_2, \ldots, \lambda_{k + 1})$, and the induction is complete. QED.

Hope this helps. Cheerio,

and as always,

Fiat Lux!!!


For $2\times 2$ matrices, the answer is $(\operatorname{tr} A)^2-2\det A$.
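A quick numerical check of this $2\times 2$ formula (a sketch, assuming NumPy; the matrix is an arbitrary example):

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])

lhs = np.trace(A @ A)                          # tr(A^2)
rhs = np.trace(A) ** 2 - 2 * np.linalg.det(A)  # (tr A)^2 - 2 det A
print(np.isclose(lhs, rhs))  # True
```

Here $\operatorname{tr}(A^2) = 29$, while $(\operatorname{tr} A)^2 - 2\det A = 25 - 2(-2) = 29$.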

For an $n\times n$ matrix, the characteristic polynomial includes the trace, the determinant, and $n-2$ other coefficient functions in between.

The characteristic polynomial is the determinant $\det(\lambda I-A)$, where $I$ is the identity matrix. This turns out to be a monic polynomial in $\lambda$ of degree $n$. The coefficient of $\lambda^{n-1}$ is $-\operatorname{trace} A$, and the constant term is $(-1)^n\det A$. The trace of $A^2$ can be written in terms of the trace of $A$ and the coefficient of $\lambda^{n-2}$.


It is not. Consider two $2\times 2$ diagonal matrices, one with diagonal $\{1,-1\}$ and one with diagonal $\{0,0\}$. They have the same trace, but their squares have different traces.
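The same counterexample in code form (assuming NumPy):

```python
import numpy as np

A = np.diag([1., -1.])  # diagonal {1, -1}
B = np.diag([0., 0.])   # diagonal {0, 0}

print(np.trace(A), np.trace(B))          # 0.0 0.0 -- equal traces
print(np.trace(A @ A), np.trace(B @ B))  # 2.0 0.0 -- unequal traces of the squares
```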