How do iterative methods applied to the companion matrix of a polynomial $p(\lambda)$ relate to $p$ itself?

Solution 1:

Take a favorite iterative procedure for computing eigenvalues; for concreteness, I'll restrict to ... the power iteration for the dominant eigenvalue. Does applying this procedure to $C$ correspond to a natural root-finding algorithm for $p$?

I'll restrict myself to power iteration proper as well, lest this answer get too long. Most of the variants of power iteration translate readily to a corresponding operation on polynomials, which I might discuss later in a separate answer if there's interest. (I've already said something on the QR algorithm in the comments.)

I'll consider the following form of the (upper Hessenberg) Frobenius companion matrix for $p(x)=c_n x^n +\cdots +c_1 x+c_0$:

$$\mathbf C=\begin{pmatrix}-\frac{c_{n-1}}{c_n}&\cdots&-\frac{c_1}{c_n}&-\frac{c_0}{c_n}\\1&&&\\&\ddots&&\\&&1&\end{pmatrix}$$

which is similar to another form you might be accustomed to:

$$\mathbf T\mathbf C\mathbf T^{-1}=\begin{pmatrix}&&&-\frac{c_0}{c_n}\\1&&&-\frac{c_1}{c_n}\\&\ddots&&\vdots\\&&1&-\frac{c_{n-1}}{c_n}\end{pmatrix}$$

with the similarity transformation $\mathbf T$ given by the unit upper triangular Toeplitz matrix

$$\mathbf T=\begin{pmatrix}1&\frac{c_{n-1}}{c_n}&\cdots&\frac{c_1}{c_n}\\&\ddots&\ddots&\vdots\\&&\ddots&\frac{c_{n-1}}{c_n}\\&&&1\end{pmatrix}$$

Now, the key is that there is an intimate relationship between polynomials, difference equations, and the Frobenius companion matrix; to wit, the difference equation

$$c_n u_{k+n}+\dots+c_1 u_{k+1}+c_0 u_k=0$$

has $p(x)$ as its characteristic polynomial, and its solutions $u_k$ take the form

$$u_k=w_1 \mu_1^k+\dots+w_n \mu_n^k$$

where the $\mu_j$ satisfy $p(\mu_j)=0$ with $|\mu_1|\geq \cdots \geq |\mu_n|$, and the $w_j$ are arbitrary constants. For the rest of this answer, I assume that there is only one dominant root $\mu_1$ (thus, $\mu_1$ is necessarily real and simple).

As $k\to \infty$, the ratio $\dfrac{u_{k+1}}{u_k}$ eventually tends to $\mu_1$, with the convergence rate depending on how far away the dominant root is from the other roots. Thus, one method to find the dominant root $\mu_1$ of a polynomial $p(x)$ would consist of iterating the recurrence relation for the $u_k$ (with starting values chosen such that $w_1$ in the expression for $u_k$ is nonzero) and forming successive ratios. This method is called the Bernoulli iteration.

The connection with companion matrices can be seen if we note that

$$\begin{pmatrix}u_{k+1}\\u_k\\\vdots\\u_{k-n+2}\end{pmatrix}=\begin{pmatrix}-\frac{c_{n-1}}{c_n}&\cdots&-\frac{c_1}{c_n}&-\frac{c_0}{c_n}\\1&&&\\&\ddots&&\\&&1&\end{pmatrix}\cdot\begin{pmatrix}u_k\\u_{k-1}\\\vdots\\u_{k-n+1}\end{pmatrix}$$

(see for instance this related question); we can thus consider Bernoulli's method as equivalent to repeatedly multiplying some arbitrary starting vector $\mathbf u_0$ with $\mathbf C$ ($\mathbf u_j=\mathbf C\mathbf u_{j-1}$), after which the dominant eigenvalue is estimated as the modified Rayleigh quotient

$$\frac{\mathbf e_1^\top\mathbf C\mathbf u_j}{\mathbf e_1^\top\mathbf u_j}$$

where $\mathbf e_1$ is the first column of the identity.

This leaves the problem of starting up the iteration. The customary way is to derive the starting values from the solutions of the equation

$$\mathbf T\mathbf u_0=\mathbf d$$

where $\mathbf d=(c_1\quad 2c_2\quad \cdots\quad n c_n)^\top$ represents the coefficients of $p^\prime(x)$.


As I mentioned earlier, most of the variants of power iteration, when specialized to the Frobenius companion matrix case, give rise to recursive methods for estimating the roots of a polynomial that directly act on the polynomial's coefficients. In particular, the Jenkins-Traub method, the modern descendant of the Bernoulli iteration, can be thought of either as a recursion on polynomials or on companion matrices. Some of the details are discussed in this paper.


Here's some sundry Mathematica code for playing around with Bernoulli's method:

n = 5; p = Expand[Times @@ (x - Range[n])];

cmat[poly_, x_] := Module[
  {cl = CoefficientList[poly, x], n = Exponent[poly, x]},
      SparseArray[{{1, i_} :> -cl[[n - i + 1]]/Last[cl],
      Band[{2, 1}] -> 1}, {n, n}]]

start = LinearSolve[ToeplitzMatrix[UnitVector[n, 1], 
  (Reverse[Rest[#]/Last[#]]) &[CoefficientList[p, x]]],
   CoefficientList[D[p, x], x]];

With[{prec = 20, its = 20},
  Ratios[Drop[LinearRecurrence[
      -(Reverse[Most[#]/Last[#]]) &[CoefficientList[p, x]], 
        Reverse[N[start, prec]], its + n + 1], n - 1]]]

With[{prec = 20, its = 20}, 
  Table[((UnitVector[n, 1].cmat[p, x].#)/(UnitVector[n, 1].#)) &[
     MatrixPower[cmat[p, x], k, N[start, prec]]], {k, 0, its}]]

You can change p here to any polynomial whose largest (in magnitude) root is real (of course, you must then set n to be the degree of p).