Smith normal form of a polynomial matrix

I have the following matrix

$$P(s) := \begin{bmatrix} s^2 & s-1 \\ s & s^2 \end{bmatrix}$$

How does one compute the Smith normal form of this matrix? I can't quite grasp the algorithm.

Solution 1:

Following the algorithm from Wilkinson's "The algebraic eigenvalue problem".

I will denote the current polynomial matrix as $P(\lambda)$ and its elements as $p_{ij}(\lambda)$, meaning that their values will change, with the final goal of having $P(\lambda)$ in the Smith form.

We start with

$$P(\lambda) = \begin{bmatrix} \lambda^2 & \lambda - 1 \\ \lambda & \lambda^2 \end{bmatrix}.$$

Pivot $(1,2)$.

Pick one of the polynomials with the lowest degree.

It seems that it would be easier to pick $p_{21}$ due to the step 3, but I'll go for $p_{12}$, as it seems to show better how that part is done. I'll later do the whole algorithm using $p_{21}$ as a pivot, and we'll see that it gets easier step 3, but is more complicated later on.

$$P(\lambda) = \begin{bmatrix} \lambda^2 & \color{red}{\lambda - 1} \\ \lambda & \lambda^2 \end{bmatrix}.$$
Swap rows/columns so that the pivot comes to the position $(1,1)$. In our case, we need to swap the first and the second column. We now have

$$P(\lambda) = \begin{bmatrix} \color{red}{\lambda - 1} & \lambda^2 \\ \lambda^2 & \lambda \end{bmatrix}.$$
Present all the polynomials in the first row and the first column (i.e., either $i=1$ or $j=1$) as $p_{ij}(\lambda) = p_{11}(\lambda) q_{ij}(\lambda) + r_{ij}(\lambda)$. In our case:

$$P(\lambda) = \begin{bmatrix} \lambda - 1 & \color{red}{\lambda^2} \\ \color{red}{\lambda^2} & \lambda \end{bmatrix} = \begin{bmatrix} \lambda - 1 & \color{red}{(\lambda-1)(\lambda+1)+1} \\ \color{red}{(\lambda-1)(\lambda+1)+1} & \lambda \end{bmatrix}.$$

So, in both cases, we got $q_{ij}(\lambda) = (\lambda+1)$ and $r_{ij}(\lambda) = 1$.

Since $r_{ij} \not\equiv 0$, we need to fix that. Let us first deal with $r_{21}$. We substract $q_{21}$ times the first row from second row (the one in which is our $r_{21}$) and then interchange those two rows.

First, we compute $q_{12}$ times the first row:

$$q_{12}(\lambda) \begin{bmatrix} \lambda - 1 & \lambda^2 \end{bmatrix} = \begin{bmatrix} (\lambda+1)(\lambda - 1) & (\lambda+1)\lambda^2 \end{bmatrix} = \begin{bmatrix} \lambda^2 - 1 & \lambda^3 + \lambda^2 \end{bmatrix}.$$

Substracting that from the second row, we get

$$\begin{bmatrix} \lambda^2 & \lambda \end{bmatrix} - \begin{bmatrix} \lambda^2 - 1 & \lambda^3 + \lambda^2 \end{bmatrix} = \begin{bmatrix} 1 & -\lambda^3 - \lambda^2 + \lambda \end{bmatrix}.$$

Now, exchanging it with the first row, we obtain

$$P(\lambda) = \begin{bmatrix} 1 & -\lambda^3 - \lambda^2 + \lambda \\ \lambda - 1 & (\lambda-1)(\lambda+1)+1 \end{bmatrix}.$$

This is to be repeated until all elements of the first row and the first column are divisible by $p_{11}$. In our case, $p_{11} \equiv 1$, so we got that, and we can write them in the form $p_{ij} = p_{11}q_{ij}$:

$$P(\lambda) = \begin{bmatrix} 1 & \color{red}{1 \cdot (-\lambda^3 - \lambda^2 + \lambda)} \\ \color{red}{1 \cdot (\lambda - 1)} & (\lambda-1)(\lambda+1)+1 \end{bmatrix}.$$
We now subtract the appropriate multiples of the first row from the rows $i=2,\dots,n$ and the appropriate multiples of the first column from the columns $ji=2,\dots,n$.

Let us first find the multiple of the first row, to be subtracted from the second one, denoting it as ${\rm row}_2$. Since $p_{11} \equiv 1$, we see that ${\rm row}_2$ equals the first row multiplied by $p_{21}(\lambda)$:

\begin{align*} {\rm row}_2 &= p_{21}(\lambda) \begin{bmatrix} 1 & -\lambda^3 - \lambda^2 + \lambda \end{bmatrix} = \begin{bmatrix} (\lambda - 1) \cdot 1 & (\lambda - 1) \cdot (-\lambda^3 - \lambda^2 + \lambda) \end{bmatrix} \\ &= \begin{bmatrix} \lambda - 1 & -\lambda^4 + 2 \lambda^2 - \lambda \end{bmatrix}.\end{align*}

We now subtract these from the current second row and the current second column:

\begin{align*} \begin{bmatrix} \lambda - 1 & \lambda^2 \end{bmatrix} - {\rm row}_2 &= \begin{bmatrix} \lambda - 1 & \lambda^2 \end{bmatrix} - \begin{bmatrix} \lambda - 1 & -\lambda^4 + 2 \lambda^2 - \lambda \end{bmatrix} \\ &= \begin{bmatrix} 0 & \lambda^4 - \lambda^2 + \lambda \end{bmatrix}. \\ \end{align*}

Now, we need to be careful here, because we have just changed the second column and must not be working with the old one above. Observe our current $P(\lambda)$:

$$P(\lambda) = \begin{bmatrix} 1 & -\lambda^3 - \lambda^2 + \lambda \\ \color{red}{0} & \color{red}{\lambda^4 - \lambda^2 + \lambda} \end{bmatrix}.$$

Obviously, subtracting the first column times $(-\lambda^3 - \lambda^2 + \lambda)$ gives us

$$P(\lambda) = \begin{bmatrix} 1 & \color{red}{0} \\ 0 & \color{red}{\lambda^4 - \lambda^2 + \lambda} \end{bmatrix}.$$
Repeat the process on the bottom right principal submatrix of order $n-1$. In our case, $n = 2$, so the job is done, and the Smith form is

$$P(\lambda) = \begin{bmatrix} 1 & 0 \\ 0 & \lambda^4 - \lambda^2 + \lambda \end{bmatrix}.$$

Pivot $(2,1)$.

Let us now do the same process, but picking $p_{21}(\lambda) = \lambda$ as the pivot in the first step.

Pick one of the polynomials with the lowest degree.

This time, we choose $p_{21}$.

$$P(\lambda) = \begin{bmatrix} \lambda^2 & \lambda - 1 \\ \color{red}{\lambda} & \lambda^2 \end{bmatrix}.$$
Swap rows/columns so that the pivot comes to the position $(1,1)$. In our case, we need to swap the first and the second row. We now have

$$P(\lambda) = \begin{bmatrix} \color{red}{\lambda} & \lambda^2 \\ \lambda^2 & \lambda - 1 \end{bmatrix}.$$
Present all the polynomials in the first row and the first column (i.e., either $i=1$ or $j=1$) as $p_{ij}(\lambda) = p_{11}(\lambda) q_{ij}(\lambda) + r_{ij}(\lambda)$. In our case:

$$P(\lambda) = \begin{bmatrix} \lambda & \color{red}{\lambda^2} \\ \color{red}{\lambda^2} & \lambda - 1 \end{bmatrix} = \begin{bmatrix} \lambda & \color{red}{\lambda \cdot \lambda + 0} \\ \color{red}{\lambda \cdot \lambda + 0} & \lambda - 1 \end{bmatrix}.$$

So, in both cases, we got $q_{ij}(\lambda) = \lambda$ and $r_{ij}(\lambda) = 0$. This is great, as we can skip the rest of the step 3. $\ddot\smile$
We now subtract the appropriate multiples of the first row from the rows $i=2,\dots,n$ and the appropriate multiples of the first column from the columns $ji=2,\dots,n$.

Let us first find the multiple of the first row, to be subtracted from the second one, denoting it as ${\rm row}_2$. Since $p_{11}(\lambda) = \lambda$ and $p_{21}(\lambda) = \lambda^2$, we see that ${\rm row}_2$ equals the first row multiplied by $(p_{21}/p_{11})(\lambda) = \lambda$:

$${\rm row}_2 = (p_{21}/p_{11})(\lambda) \begin{bmatrix} \lambda & \lambda^2 \end{bmatrix} = \begin{bmatrix} \lambda^2 & \lambda^3 \end{bmatrix}.$$

We now subtract this from the current second row:

$$\begin{bmatrix} \lambda^2 & \lambda - 1 \end{bmatrix} - {\rm row}_2 = \begin{bmatrix} 0 & -\lambda^3 + \lambda - 1 \end{bmatrix}.$$

Our current $P(\lambda)$ is

$$P(\lambda) = \begin{bmatrix} \lambda & \lambda^2 \\ \color{red}{0} & \color{red}{-\lambda^3 + \lambda - 1} \end{bmatrix}.$$

As before, we subtracting the first column's multiple will now just remove the nonzeroes in the first row.

$$P(\lambda) = \begin{bmatrix} \lambda & \color{red}{0} \\ 0 & \color{red}{-\lambda^3 + \lambda - 1} \end{bmatrix}.$$

Now, this may seem great, but notice that $p_{11} \nmid p_{22}$, so we need further corrections. This is done by adding the second row to the first one, repeating the above procedure. So, we get

$$P(\lambda) = \begin{bmatrix} \color{red}{\lambda} & \color{red}{-\lambda^3 + \lambda - 1} \\ 0 & -\lambda^3 + \lambda - 1 \end{bmatrix} = \begin{bmatrix} \lambda & \color{red}{\lambda \cdot (-\lambda^2 + 1) + (-1)} \\ \color{red}{0} & -\lambda^3 + \lambda - 1 \end{bmatrix}.$$

We see that $q_{12}(\lambda) = -\lambda^2 + 1$ and $r_{12}(\lambda) = 1$. We need to subtract the first column times $q_{12}(\lambda)$ from the second colum:

$$P(\lambda) = \begin{bmatrix} \lambda & \color{red}{-1} \\ 0 & \color{red}{-\lambda^3 + \lambda - 1} \end{bmatrix}.$$

Exchanging these two, we get

$$P(\lambda) = \begin{bmatrix} -1 & \lambda \\ -\lambda^3 + \lambda - 1 & 0 \end{bmatrix}.$$

After eliminating $p_{12}$, we have

$$P(\lambda) = \begin{bmatrix} -1 & \color{red}{0} \\ -\lambda^3 + \lambda - 1 & \color{red}{-\lambda^4 + \lambda^2 - \lambda} \end{bmatrix}.$$

Then we eliminate $p_{21}$:

$$P(\lambda) = \begin{bmatrix} -1 & 0 \\ \color{red}{0} & \color{red}{-\lambda^4 + \lambda^2 - \lambda} \end{bmatrix}.$$
Repeat the process on the bottom right principal submatrix of order $n-1$. In our case, as above, $n = 2$, so the job is almost done, as we have

$$P(\lambda) = \begin{bmatrix} -1 & 0 \\ 0 & -\lambda^4 + \lambda^2 - \lambda \end{bmatrix}.$$

For this to be a Smith form, we need the nonzero diagonal polynomials to be monic (i.e., have a leading coefficient equal to $1$). In our case, we do this by multiplying $P(\lambda)$ by $-1$. In a more generale case, we'd need to multiply it from either left or right by some constant diagonal matrix.

Finally, we obtain

$$P(\lambda) = \begin{bmatrix} 1 & 0 \\ 0 & \lambda^4 - \lambda^2 + \lambda \end{bmatrix}.$$

Concluding remarks

In both cases we got monic polynomials $p_{ii}$ such that $p_{11} \mid p_{22}$, and the degree of $(p_{11} \cdot p_{22})(\lambda)$ is $4 = 2 \cdot 2 = l \cdot n$, where $n$ is the order of the matrix and $l$ is the degree of $P$, as it should be. This is due to the fact that $\det P(\lambda)$ remained unchanged (in a more general case, it might get multiplied by a constant factor, in order to get the polynomials on the diagonal to become monic).

I have discarded all the info about the transformations. If one is to keep that, the row transformations are put in $E(\lambda)$, and the column transformations are put in $F(\lambda)$, to obtain

$$E(\lambda) P_{\text{starting}}(\lambda) F(\lambda) = P_{\text{final}}(\lambda).$$

where

$$ E(\lambda)=\begin{bmatrix}-1& 1 + \lambda - \lambda^2\\ -\lambda& -1 + \lambda + \lambda^2 - \lambda^3\end{bmatrix}$$

and

$$ F(\lambda)=\begin{bmatrix}1 - \lambda& -1 + \lambda - \lambda^2 - \lambda^3 + \lambda^4\\ 1& \lambda - \lambda^3\end{bmatrix}$$

If every real valued continuous function on $X$ is uniformly continuous , then is every continuous function to any metric space uniformly continuous?

On the problem $1$ of Putnam $2009$

Why does $\sum a_i \exp(b_i)$ always have root?

Proving the pairing axiom from the rest of ZF

Smoothness and decay property of Fourier transformation

Relationship between tuples, vectors and column/row matrices

Why is the Riemann integral only defined on compact sets?