I am trying to understand the similarities and differences between the minimal polynomial and the characteristic polynomial of a matrix.

  1. When are the minimal polynomial and the characteristic polynomial the same?
  2. When are they different?
  3. What conditions (eigenvalues/eigenvectors/...) would imply 1 or 2?
  4. Please tell me anything else about these two polynomials that is essential in comparing them.

Solution 1:

The minimal polynomial is quite literally the smallest (in the sense of divisibility) nonzero polynomial that the matrix satisfies, normalised to be monic. That is to say, if $A$ has minimal polynomial $m(t)$ then $m(A)=0$, and if $p(t)$ is a nonzero polynomial with $p(A)=0$ then $m(t)$ divides $p(t)$.

The characteristic polynomial, on the other hand, is defined algebraically: $\chi(t) = \det(tI - A)$. If $A$ is an $n \times n$ matrix then its characteristic polynomial $\chi(t)$ has degree exactly $n$. This is not true of the minimal polynomial, whose degree can be smaller than $n$.
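For a concrete illustration (my own small example, taking $\chi(t) = \det(tI - A)$ as the definition), let $A = I$ be the $2 \times 2$ identity matrix. Then $$\chi(t) = \det(tI - I) = (t-1)^2, \qquad m(t) = t - 1,$$ because $A - I = 0$ already. So here $\chi$ has degree $2$ while $m$ has degree $1$.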

It can be proved that if $\lambda$ is an eigenvalue of $A$ then $m(\lambda)=0$. This is reasonably clear: if $\vec v \ne 0$ is a $\lambda$-eigenvector of $A$ then $$m(\lambda) \vec v = m(A) \vec v = 0 \vec v = 0$$ and so $m(\lambda)=0$, since $\vec v \ne 0$. The first equality here uses linearity and the fact that $A^k\vec v = \lambda^k \vec v$ for every $k \ge 0$, which is an easy induction.

It can also be proved that $\chi(A)=0$; this is the Cayley-Hamilton theorem. In particular, $m(t) \mid \chi(t)$, by the defining property of the minimal polynomial.
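As a sanity check on a small matrix of my own choosing (again assuming the convention $\chi(t) = \det(tI - A)$), take $$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \qquad \chi(t) = (t-1)(t-4) - 6 = t^2 - 5t - 2.$$ Then $$A^2 - 5A - 2I = \begin{pmatrix} 7 & 10 \\ 15 & 22 \end{pmatrix} - \begin{pmatrix} 5 & 10 \\ 15 & 20 \end{pmatrix} - \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix},$$ so $\chi(A) = 0$ as claimed.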

So one example of when (1) occurs is when $A$ has $n$ distinct eigenvalues. If this is so then every eigenvalue is a root of $m(t)$, so $m(t)$ has $n$ distinct roots and hence degree $\ge n$; but it has degree $\le n$ because it divides $\chi(t)$. Thus they must be equal (since they're both monic, have the same roots and the same degree, and one divides the other).
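A small example of this situation (chosen by me for illustration): $$A = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}, \qquad \chi(t) = (t-2)(t-3).$$ Neither $A - 2I$ nor $A - 3I$ is the zero matrix, so no polynomial of degree $1$ kills $A$, whereas $$(A - 2I)(A - 3I) = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} -1 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.$$ Hence $m(t) = (t-2)(t-3) = \chi(t)$.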

A more complete characterisation of when (1) occurs (and when (2) occurs) can be gained by considering Jordan Normal Form; but I suspect that you've only just learnt about characteristic and minimal polynomials so I don't want to go into JNF.

Let me know if there's anything else you'd like to know; I no doubt missed some things out.

Solution 2:

The minimal polynomial $m(t)$ is the monic factor of least degree of the characteristic polynomial $f(t)$ such that if $A$ is the matrix, then we still have $m(A) = 0$. The only thing the characteristic polynomial measures is the algebraic multiplicity of an eigenvalue, whereas the minimal polynomial measures the size of the $A$-cycles that form the generalized eigenspaces (a.k.a. the size of the Jordan blocks). These facts can be summarized as follows.

  • If $(t - \lambda)^k$ is the exact power of $(t - \lambda)$ dividing $f(t)$, this means that the eigenvalue $\lambda$ has $k$ linearly independent generalized eigenvectors; that is, the generalized eigenspace of $\lambda$ has dimension $k$.
  • If $(t - \lambda)^p$ is the exact power of $(t - \lambda)$ dividing $m(t)$, this means that the largest $A$-cycle of generalized eigenvectors contains $p$ elements; that is, the largest Jordan block for $\lambda$ is $p \times p$. Notice that this means that $A$ is diagonalizable if and only if $m(t)$ has only simple roots (assuming $f(t)$ splits into linear factors, e.g. over $\mathbb{C}$).
  • Thus $f(t) = m(t)$ if and only if each eigenvalue $\lambda$ corresponds to a single Jordan block, a.k.a. each eigenvalue has a one-dimensional eigenspace (geometric multiplicity $1$).
  • $f(t)$ and $m(t)$ differ if any eigenvalue has more than one Jordan block, a.k.a. if some eigenvalue has two or more linearly independent eigenvectors.
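If you want to experiment with the Jordan block statements above, here is a minimal sketch in plain Python (standard library only). It is not a standard library routine: the names `minimal_polynomial`, `solve_combination` and `same_as_characteristic` are my own, and the method is simply to find the first power $A^k$ that is a linear combination of $I, A, \dots, A^{k-1}$, using exact rational arithmetic. It assumes a square matrix with integer or rational entries.

```python
from fractions import Fraction


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def flatten(X):
    return [x for row in X for x in row]


def solve_combination(vectors, target):
    """Return c with sum(c[j] * vectors[j]) == target, or None if impossible.

    Plain Gauss-Jordan elimination over the rationals; free variables
    (if any) are set to zero.
    """
    rows, cols = len(target), len(vectors)
    M = [[vectors[j][i] for j in range(cols)] + [target[i]]
         for i in range(rows)]
    pivots, r = [], 0
    for c in range(cols):
        p = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        pivot = M[r][c]
        M[r] = [x / pivot for x in M[r]]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    # A leftover row "0 = nonzero" means the system has no solution.
    if any(M[i][cols] != 0 for i in range(r, rows)):
        return None
    coeffs = [Fraction(0)] * cols
    for i, c in enumerate(pivots):
        coeffs[c] = M[i][cols]
    return coeffs


def minimal_polynomial(A):
    """Coefficients of the monic minimal polynomial, lowest degree first."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    powers = [flatten(identity(n))]   # I, A, A^2, ... as flat vectors
    P = identity(n)
    while True:                       # stops by degree n (Cayley-Hamilton)
        P = mat_mul(P, A)
        c = solve_combination(powers, flatten(P))
        if c is not None:
            # A^k = sum c_i A^i, so m(t) = t^k - sum c_i t^i
            return [-x for x in c] + [Fraction(1)]
        powers.append(flatten(P))


def same_as_characteristic(A):
    # m divides the characteristic polynomial and both are monic,
    # so they coincide exactly when deg m equals n.
    return len(minimal_polynomial(A)) - 1 == len(A)


# One Jordan block for eigenvalue 1: m(t) = (t - 1)^2 = f(t)
print(same_as_characteristic([[1, 1], [0, 1]]))
# Two Jordan blocks (sizes 2 and 1) for eigenvalue 1: m(t) = (t - 1)^2, f(t) = (t - 1)^3
print(same_as_characteristic([[1, 1, 0], [0, 1, 0], [0, 0, 1]]))
```

The second matrix has two Jordan blocks for the eigenvalue $1$, which is exactly the situation in the last bullet, while the first has a single block and so falls under the third bullet.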