Minimal polynomial: geometric meaning

Solution 1:

One way to visualize the minimal polynomial (and the characteristic polynomial) is to first ask why they are annihilating polynomials. This intuition is probably easiest for diagonalizable operators, so let us start there.

If $A: V\rightarrow V$ is diagonalizable, then we can write $$V=\bigoplus_{i=1}^k E_{\lambda_i}$$ where each $E_{\lambda_i}$ is the eigenspace of $A$ corresponding to the eigenvalue $\lambda_i$. To annihilate $A$, it is therefore sufficient to annihilate $A$ on each of the eigenspaces, and the annihilator of each $E_{\lambda_i}$ is just $A-\lambda_i I$. Now, as far as invariant subspaces go, eigenspaces are pretty much as simple as possible. This means that a single application of $(A-\lambda_i I)$ is enough to eliminate the space $E_{\lambda_i}$. Therefore the minimal polynomial is $$p(x) = (x-\lambda_1)\cdots (x-\lambda_k)$$ and intuitively this is why the minimal polynomial of a diagonalizable operator splits into distinct linear factors.
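
If you want to see this concretely, here is a small SymPy check (the matrix is just an example I chose, with eigenvalues $1$ and $3$):

```python
from sympy import Matrix, eye

# Illustrative diagonalizable matrix with eigenvalues 1 and 3.
A = Matrix([[2, 1],
            [1, 2]])

# Each factor A - lambda_i*I kills the corresponding eigenspace,
# so the product of the factors annihilates all of V:
print((A - 1*eye(2)) * (A - 3*eye(2)))  # the zero matrix
```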

However, not all operators are diagonalizable, and so we need to talk about generalized eigenspaces instead. If you are familiar with the Jordan form, then generalized eigenspaces need no introduction. If you are not (which I think might be the case, since Jordan forms are chapter 7 in Hoffman), then we can summarize as follows: just as a diagonalizable operator "decomposes" into eigenspaces, any operator (diagonalizable or not) decomposes into what are called generalized eigenspaces.

Ordinary eigenspaces $E_{\lambda_i}$ are defined as the set of vectors $\mathbf{v}\in V$ for which $$(A-\lambda_i I)\mathbf{v} = \mathbf{0},$$ while generalized eigenspaces $W_{\lambda_i}$ are defined as the set of vectors $\mathbf{v} \in V$ for which there exists some $m\in \mathbb{N}$ such that $$(A-\lambda_i I)^m \mathbf{v} = \mathbf{0}.$$ We then have $$V = \bigoplus_{i=1}^k W_{\lambda_i},$$ and the representation of the operator $A$ corresponding to this decomposition is the Jordan normal form of the matrix.
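
To make the "generalized" part concrete, here is a small SymPy sketch (the Jordan block is my own illustrative example): the kernel of $(A-2I)^m$ grows with $m$ until it fills out the whole generalized eigenspace.

```python
from sympy import Matrix, eye

# A single 3x3 Jordan block with eigenvalue 2: the ordinary eigenspace
# E_2 is 1-dimensional, but ker((A - 2I)^m) grows until it is all of V.
A = Matrix([[2, 1, 0],
            [0, 2, 1],
            [0, 0, 2]])
N = A - 2*eye(3)
for m in (1, 2, 3):
    print(m, len((N**m).nullspace()))  # prints dimensions 1, 2, 3
```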

Within each generalized eigenspace, we can make a conceptual distinction. We separate the space into the sets $$S_j = \left\{\mathbf{v}\in V \,\big|\ j\ \text{is the smallest integer such that } (A-\lambda_i I)^j\mathbf{v} = \mathbf{0}\right\},$$ so that, for example, $S_1 \cup \{\mathbf{0}\} = E_{\lambda_i}$. In this notation, we can view the action of the operator $A-\lambda_i I$ as shifting the sets: $$S_m \rightarrow S_{m-1} \rightarrow \cdots \rightarrow S_1 \rightarrow S_0 = \{\mathbf{0}\}.$$ In this way, the exponent $e_i$ of each factor $(x-\lambda_i)^{e_i}$ in the minimal polynomial can be seen as the "depth" of the generalized eigenspace $W_{\lambda_i}$.
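
Continuing the SymPy example above, the shifting is easy to watch: the basis vector $\mathbf{e}_3$ lies in $S_3$, and each application of $A - 2I$ moves it one level down the chain.

```python
from sympy import Matrix, eye

# Same illustrative Jordan block as before.
A = Matrix([[2, 1, 0],
            [0, 2, 1],
            [0, 0, 2]])
N = A - 2*eye(3)
v = Matrix([0, 0, 1])  # a vector of depth 3, i.e. v is in S_3
print(N*v)             # lands in S_2
print(N**2*v)          # lands in S_1 (an eigenvector)
print(N**3*v)          # the zero vector: depth 3 matches the factor (x-2)^3
```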

I am not sure if the above explanation is satisfying, but it gives us a very rough geometric intuition for the minimal polynomial. The minimal polynomial can be seen as a sequence of steps taken to eliminate the vectors of $V$, in a way that reflects the structure of $A$. We decompose $V$ into invariant generalized eigenspaces $V = \bigoplus_{i=1}^k W_{\lambda_i}$. For each generalized eigenspace, we apply the operator $(A-\lambda_i I)$ enough times that the space is eliminated. Each application of the operator corresponds to a shift of the form $S_{j+1} \rightarrow S_j$ in the space. Doing this for all the generalized eigenspaces then eliminates the entire vector space.

Solution 2:

I think the minimal polynomial is an inherently algebraic notion, and you should accept it as such. There are geometric consequences though, such as the fact that a decomposition of the minimal polynomial of $\phi$ into a product of powers of irreducible polynomials leads to a canonical decomposition of the vector space as a direct sum of $\phi$-stable subspaces (generalized eigenspaces in the case where the minimal polynomial splits).

The basic intuition I have for minimal polynomials is that they extend to linear maps the notion of the order of a permutation. If you iterate a permutation, the only "remarkable" thing that can happen is that at some point you get the identity; this defines the order of the permutation. For a linear operator, when you compute successive powers you are unlikely to get the identity (except trivially for the power $0$), but (in finite dimension) you are bound to run into a linear dependence between the powers. This first linear dependence defines your minimal polynomial. Just as further iteration of a permutation is repetitive once you hit the order, further powers of a linear map can be reduced to lower powers once the minimal polynomial is found.
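
This intuition translates directly into a computation. Below is a minimal SymPy sketch (the helper `min_poly_coeffs` is my own illustrative code, not a library routine) that finds the minimal polynomial as the very first linear dependence among the powers $I, A, A^2, \dots$:

```python
from sympy import Matrix, eye

def min_poly_coeffs(A):
    """Monic coefficients c_0, ..., c_{d-1}, 1 of the minimal polynomial,
    found as the first linear dependence among the powers I, A, A^2, ..."""
    n = A.rows
    powers = [eye(n)]
    while True:
        powers.append(powers[-1] * A)
        # Stack each power, flattened into a row vector, and look for a
        # nontrivial dependence among the rows.
        stacked = Matrix([list(p) for p in powers])
        null = stacked.T.nullspace()
        if null:  # first dependence found: this is the minimal polynomial
            c = null[0]
            return [x / c[-1] for x in c]  # normalize so it is monic

# The Jordan block with eigenvalue 2 has minimal polynomial (x-2)^2:
B = Matrix([[2, 1],
            [0, 2]])
print(min_poly_coeffs(B))  # [4, -4, 1], i.e. x^2 - 4x + 4 = (x-2)^2
```

Note that the coefficient of the highest power in the first dependence is automatically nonzero (otherwise the dependence would already have appeared at an earlier power), which is why the normalization at the end is safe.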

For permutations you can also look at the orbits of individual elements being permuted, whose lengths divide the order of the permutation. Similarly, you can look at the first linear dependence among the iterated images of a given vector under the linear operator, leading to a "minimal polynomial" for this vector, which is a divisor of the (global) minimal polynomial of the operator.
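
The same computation works for a single vector. In the sketch below (again my own illustrative code), the matrix has global minimal polynomial $(x-1)(x-2)$, while the vector $\mathbf{e}_1$ has minimal polynomial $x-1$, a proper divisor:

```python
from sympy import Matrix

def vector_min_poly_coeffs(A, v):
    """Monic coefficients of the minimal polynomial of the vector v:
    the first linear dependence among the iterates v, Av, A^2 v, ..."""
    iterates = [v]
    while True:
        iterates.append(A * iterates[-1])
        null = Matrix.hstack(*iterates).nullspace()
        if null:  # first dependence among the iterated images of v
            c = null[0]
            return [x / c[-1] for x in c]

A = Matrix([[1, 0],
            [0, 2]])  # global minimal polynomial (x-1)(x-2)
print(vector_min_poly_coeffs(A, Matrix([1, 0])))  # [-1, 1]: x - 1
print(vector_min_poly_coeffs(A, Matrix([1, 1])))  # [2, -3, 1]: (x-1)(x-2)
```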

Solution 3:

Consider the following matrices: $$ A = \left(\begin{array}{cc}2&0\\0&2\end{array}\right) \ \ \text{ and } \ \ B = \left(\begin{array}{cc}2&1\\0&2\end{array}\right). $$ The first matrix has minimal polynomial $X - 2$ and the second has minimal polynomial $(X-2)^2$. If we subtract $2I_2$ from these matrices then we get $$ \left(\begin{array}{cc}0&0\\0&0\end{array}\right) \ \ \text{ and } \ \ \left(\begin{array}{cc}0&1\\0&0\end{array}\right), $$ where the first has minimal polynomial $X$ and the second has minimal polynomial $X^2$. The different exponents here reflect the fact that a matrix can have a power that is $O$ without being $O$ itself (this doesn't happen with ordinary numbers, like in ${\mathbf R}$ or ${\mathbf C}$). A matrix has a power equal to $O$ precisely when its minimal polynomial is some power of $X$, and the exponent you need on $X$ to achieve that can vary. As another example, compare $$ M = \left(\begin{array}{ccc} 0 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{array} \right) \ \ \text{ and } \ \ N = \left(\begin{array}{ccc} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{array} \right). $$ These are not the zero matrix, but $N^2 = O$ while $M^3 = O$ and $M^2 \not= O$. So $M$ has minimal polynomial $X^3$ and $N$ has minimal polynomial $X^2$.
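
These power computations are easy to check directly, e.g. in SymPy:

```python
from sympy import Matrix, zeros

# The two nilpotent matrices from above.
M = Matrix([[0, 1, 1],
            [0, 0, 1],
            [0, 0, 0]])
N = Matrix([[0, 0, 1],
            [0, 0, 0],
            [0, 0, 0]])
print(N**2 == zeros(3))                    # True:  N has minimal polynomial X^2
print(M**2 == zeros(3), M**3 == zeros(3))  # False True: M needs X^3
```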

To describe a (monic) polynomial we could provide its roots and the multiplicities for those roots. For a square matrix, the roots of its minimal polynomial are easy to connect with the matrix: they are the eigenvalues of the matrix. The subtle part is their multiplicities, which are more algebraic than geometric.

It might be natural to hope that the multiplicity of an eigenvalue $\lambda$ as a root of the minimal polynomial is the dimension of the $\lambda$-eigenspace, but this is false in general, as we can see with the matrices $A$ and $B$ above (e.g., $B$ has minimal polynomial $(X-2)^2$ but its 2-eigenspace is 1-dimensional). In the case when the matrix has all distinct eigenvalues, the minimal polynomial is the characteristic polynomial, so you could think of the distinction between the minimal and characteristic polynomials as reflecting the presence of repeated eigenvalues.
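
For a concrete check of this with the matrix $B$ from above (using SymPy):

```python
from sympy import Matrix, eye

B = Matrix([[2, 1],
            [0, 2]])
# The eigenvalue 2 appears with multiplicity 2 in the minimal polynomial...
print((B - 2*eye(2))**2)  # the zero matrix: (X-2)^2 annihilates B
# ...but the 2-eigenspace ker(B - 2I) is only 1-dimensional:
print(len((B - 2*eye(2)).nullspace()))  # 1
```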