To prove Cayley-Hamilton theorem, why can't we substitute $A$ for $\lambda$ in $p(\lambda) = \det(\lambda I - A)$?

Solution 1:

There is another way to see that the proof must be flawed: by looking at the interesting consequences this proof technique would have. If the proof were valid, then we would also have the following generalisation:

Faulty Lemma. Suppose that $A$ and $B$ are $n\times n$ matrices. Let $p_A$ be the characteristic polynomial for $A$. If $B - A$ is singular, then $B$ must be a zero of $p_A$.

Faulty proof: We have $p_A(B) = \det(BI - A) = \det(B - A) = 0$.$$\tag*{$\Box$}$$

This has the following amazing consequence:

Faulty Corollary. Every singular matrix is nilpotent.

Faulty proof: Let $B$ be a singular matrix and let $A$ be the zero matrix. Now we have $p_A(\lambda) = \lambda^n$. Furthermore, by the above we have $p_A(B) = 0$, because $B - A = B$ is singular. Thus, we have $B^n = 0$ and we see that $B$ is nilpotent.$$\tag*{$\Box$}$$

In particular, this would prove that we have $$ \pmatrix{1 & 0 \\ 0 & 0}^2 = \pmatrix{0 & 0 \\ 0 & 0}. $$ This goes to show just how wrong the proof is!
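For the skeptical reader, a quick numerical check (a minimal NumPy sketch; the variable name `B` is mine) confirms that this square is, of course, not the zero matrix:

```python
import numpy as np

# The "faulty corollary" would force this singular matrix to square to zero.
B = np.array([[1, 0], [0, 0]])

print(B @ B)                     # prints [[1 0], [0 0]], i.e. B itself -- not the zero matrix
print(np.array_equal(B @ B, B))  # True: B is idempotent and nonzero, so it is certainly not nilpotent
```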

Solution 2:

If $$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$$ then $p(\lambda)$ is the determinant of the matrix $$\lambda I - A = \begin{bmatrix} \lambda - 1 & -2 \\ -3 & \lambda - 4 \end{bmatrix}.$$ Now I plug in $A$ for $\lambda$ and get $$\begin{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} - 1 & -2 \\ -3 & \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} - 4 \end{bmatrix}$$ but I don't know what that is, and I certainly don't know how to take its determinant.

So the reason you can't plug $A$ into $\det(\lambda I - A)$ is that this expression only makes sense when $\lambda$ is a scalar. The definition of $p(\lambda)$ isn't really $\det(\lambda I - A)$; rather, $p$ is the polynomial whose value at any scalar $\lambda$ equals $\det(\lambda I - A)$.

On the other hand, I could define a function $P(\lambda) = \det(\lambda - A)$ where I'm now allowed to plug in matrices of the same size as $A$, and I certainly would get zero if I plugged in $A$. But this is a function from matrices to numbers, whereas when I plug a matrix into $p(\lambda)$ I get a matrix as output. So it doesn't make sense to say that these are equal, and the fact that $P(A) = 0$ doesn't imply that $p(A) = 0$, since $P$ and $p$ aren't the same thing.
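To make the type distinction concrete, here is a small SymPy sketch using the example matrix above (the names `p`, `p_of_A`, and `P_of_A` are mine, mirroring the two functions just discussed):

```python
import sympy as sp

lam = sp.symbols('lambda')
A = sp.Matrix([[1, 2], [3, 4]])

# p(lambda) = det(lambda*I - A), a polynomial in the scalar variable lambda
p = (lam * sp.eye(2) - A).det()

# Evaluating p on the matrix A means forming the matrix polynomial A^2 - 5A - 2I:
p_of_A = A**2 - 5 * A - 2 * sp.eye(2)    # a 2x2 matrix (the zero matrix)

# P(A) = det(A - A) is something else entirely: a number.
P_of_A = (A - A).det()                   # the scalar 0

print(sp.expand(p))   # lambda**2 - 5*lambda - 2
print(p_of_A)         # Matrix([[0, 0], [0, 0]])
print(P_of_A)         # 0
```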

Solution 3:

Remember that there is a difference between $p(x)$, where $x$ is a scalar, and $p(A)$, where $A$ is a matrix. The next thing you should notice is that if your deduction were valid, it would give $p(A)=0$ with a matrix on the left-hand side and the scalar $0$ on the right-hand side.

What the Cayley-Hamilton theorem says is that $A$ satisfies its own characteristic polynomial, i.e. $p(A)$ is the zero matrix. If you have worked with minimal polynomials before, the proof of this statement is a simple task (given all of the previous work, obviously).
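As a sanity check of "$A$ satisfies its own characteristic polynomial", here is a small SymPy sketch (variable names are mine, and I reuse the example matrix from Solution 2):

```python
import sympy as sp

A = sp.Matrix([[1, 2], [3, 4]])

# Characteristic polynomial det(lambda*I - A) as a polynomial object
p = A.charpoly()                 # lambda**2 - 5*lambda - 2
coeffs = p.all_coeffs()          # [1, -5, -2], highest degree first

# Evaluate the polynomial at the *matrix* A (Horner's scheme); the result is a matrix
result = sp.zeros(2, 2)
for c in coeffs:
    result = result * A + c * sp.eye(2)

print(result)                    # Matrix([[0, 0], [0, 0]]) -- the zero matrix, as the theorem asserts
```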

Solution 4:

The original question was: Can I write $\det(AI - A) = 0$ in a meaningful way? Yes, if the first $A$ is considered as a scalar in the ring $F[A]\subset M_n(F)$, and the second one as the matrix $A$ itself. And the reason this works is, indeed, that $A$ is an eigenvalue of the matrix $A$. How is this so? Details are below. Note that to make it work we need to work with modules, since the coefficients will be a ring containing $F$.

We'll start with some easy statements:

Let $k$ be a field, $(a_{ij})$ an $n\times n$ matrix with entries in $k$, and $\lambda \in k$ such that there exists a nonzero element $v \in k^n$ with $(a_{ij}) \cdot v = \lambda \cdot v$. Then $\det( (a_{ij}) - \lambda I ) = 0$. Indeed, the matrix $(a_{ij}) - \lambda I$ is not invertible.

The same conclusion holds if we replace $k^n$ by $V^n$ for a $k$-vector space $V$: if there exists a nonzero element $v \in V^n$ with $(a_{ij})\cdot v = \lambda \cdot v$, then $\det((a_{ij}) - \lambda I) = 0$.

More generally: let $k$ be a commutative ring, $V$ a $k$-module, $(a_{ij})$ a matrix in $M_n(k)$, $\lambda \in k$, and $v \in V^n$ such that $(a_{ij})\cdot v = \lambda \cdot v$. Then $\det((a_{ij}) - \lambda I) \cdot v = 0$ in $V^n$. Use the adjugate (classical adjoint) matrix, as spelled out below.
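Spelling out the hinted adjugate step (my notation: $\operatorname{adj}$ denotes the adjugate): write $M = (a_{ij}) - \lambda I \in M_n(k)$, so that $M \cdot v = 0$. Over any commutative ring we have $\operatorname{adj}(M)\, M = \det(M)\, I_n$, hence
$$\det(M)\cdot v = \operatorname{adj}(M)\, M\cdot v = \operatorname{adj}(M)\cdot 0 = 0.$$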

Now let $A$ be an $n\times n$ matrix over $F$. Let $k := F[A]\subset M_n(F)$ be the commutative algebra generated by $A$; then $F^n$ is a $k$-module. Let $e_1$, $\ldots$, $e_n$ be the standard basis of $F^n$, and write $A\cdot e_i = \sum_j a_{ij}\, e_j$, so that $(a_{ij})$ is the transpose of $A$; since $A$ and $A^{T}$ have the same characteristic polynomial, $\det(\lambda I - (a_{ij})) = P_A(\lambda)$. We have

$$ (a_{ij}) \cdot \left( \begin{array}{c} e_1 \\ \ldots \\ e_n\end{array} \right ) = \left( \begin{array}{c} A\cdot e_1 \\ \ldots \\ A \cdot e_n\end{array} \right ) $$

The above equation says: $\left( \begin{array}{c} e_1 \\ \ldots \\ e_n\end{array} \right )$ in $(F^n)^n$ is an eigenvector of $(a_{ij})$ for the eigenvalue $A \in k$. It follows from the module statement above that $$P_A(A) \cdot \left( \begin{array}{c} e_1 \\ \ldots \\ e_n\end{array} \right )= \left( \begin{array}{c} 0 \\ \ldots \\ 0\end{array} \right )$$

that is, $P_A(A)\cdot e_i = 0$ for every $i$, and therefore $P_A(A)=0$.

$\bf{Added}$: Making sense of $\det(AI - A)$:

Consider the example of @Jim (Solution 2 above): $$A= \left[\begin{array}{cc}1&2\\3&4\end{array} \right]$$

$$\begin{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} - 1 & -2 \\ -3 & \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} - 4 \end{bmatrix}\ \ \ ?$$

We need to look at scalars as scalar matrices. So we get

$$\begin{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} - \begin{bmatrix} 1 & 0 \\ 0 & 1\end{bmatrix} & \begin{bmatrix} -2 & 0 \\ 0 & -2\end{bmatrix} \\ \begin{bmatrix} -3 & 0 \\ 0 &-3 \end{bmatrix} & \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} - \begin{bmatrix} 4 & 0 \\ 0 & 4 \end{bmatrix} \end{bmatrix}$$

This $2\times 2$ matrix, with entries in the commutative algebra $F[A]$, has determinant $P_A(A)$, a matrix of the same size as $A$. And it will always be the zero matrix: that is exactly the Cayley-Hamilton theorem.
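As a concrete check of that last claim, here is a minimal NumPy sketch (variable names are mine) that expands the $2\times 2$ determinant over $F[A]$ for the example above and compares it with $P_A(A) = A^2 - 5A - 2I$:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]], dtype=float)
I = np.eye(2)

# Determinant of the 2x2 block matrix over F[A], expanded as for a
# 2x2 matrix with commuting entries: (A - 1*I)(A - 4*I) - (-2*I)(-3*I)
det_block = (A - 1 * I) @ (A - 4 * I) - (-2 * I) @ (-3 * I)

# The characteristic polynomial of A is  P_A(t) = t^2 - 5t - 2,  so P_A(A) should agree
p_of_A = A @ A - 5 * A - 2 * I

print(det_block)   # prints the 2x2 zero matrix
print(p_of_A)      # prints the 2x2 zero matrix as well
```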