Influence of small constant term on roots of polynomial
Let $p(x)=a_n x^n + a_{n-1} x^{n-1} + ... + a_1 x + a_0,\enspace a_i\in\mathbb{C}$ some polynomial.
Suppose that $|a_0|$ is very small (compared to the other coefficients' magnitude).
Is there any way the (complex) roots of $p(x)$ could be heavily affected by setting $a_0 = 0$ ?
I think the answer could be no because the polynomial is continuous in the constant term and thus small changes in the constant term will affect the function only slightly. But does this hold true for the location of the roots?
Why am I asking this?
I try to incorporate a not selfwritten custom root finder into my Matlab-program but unfortunately in some rare cases one of its loops doesn't converge if the input vector's constant coefficient is of the magnitude $\approx 10^{-18}$ and then the algorithm crashes. However it does converge if I set the constant term to zero but I worry whether I could get wrong results.
EDIT 1:
Based on the answer by @Fixed Point, I could find a successor (NAClab) of the Matlab root finder that I was using and it doesn't crash anymore. I then went on to quickly investigate the polynomials Fixed Point proposed. Here are the results:
Figure1: Up to degree 17 the root finder keeps the results on the real axis. From degree 18 onwards the calculated roots gain an imaginary part which grows linearly with the degree of the polynomial.
The ratio between the constant coefficient $a_0$ and the highest order coefficient $a_n$ is in the order of $10^{-16}$ when the roots begin to diverge from the real axis and grows with approximately one order of magnitude per increase of polynomial degree.
Figure2: Here the constant coefficient is set to zero, one can see that the roots that are close to each other get perturbed quite significantly.
The Matlab code to reproduce the results can be downloaded here.
EDIT 2: To address Fixed Point's questions:
Since Brent's algorithm is guaranteed to converge (it could be slow but it will converge), I am curious as to why you were having the problem that you said you were.
When developing the MIMO extension for ANP (animated nyquist diagram, a leisure project for educational purposes) I came to realize that the program would have to deal with high order polynomials even for small MIMO systems. I then noticed that Matlab's 'roots' would produce very inaccurate results when there were roots with high multiplicity present - even in trivial, obvious cases like $(x+1)^4=0$ (try roots(poly([-1,-1,-1,-1]))
).
Even if my program should only be used for entertainment, that wasn't good enough. After finding Multroot (by Zeng) and unit-testing it with quite some success using randomized MIMO systems I found that besides some trivial to solve crashes it had a more severe flaw that had to do with a small constant term.
How/why was your application crashing?
One such polynomial can be defined in Matlab as follows (it's the one that finally lead me to this SE question):
p = hex2num(['bfae7873980ada44';'bfd79794c0074ef6';'bfe9e4c737c98680';'bfe5502ed16afae0';'bf81513e302abba0';'3fc59ae0b4d97164';'bc80000000000000'])';
Use it as input to Multroot: multroot(p)
Most likely it will end with an error saying that an output argument hasn't been assigned. (Beware that the algorithm uses randomized initial vectors and thus succeeds with a small chance)
Was this MATLAB's fzero which was crashing?
As explained I didn't use 'fzero' and unfortunately it can't help me here, as it says in the documentation that it needs a change of sign to detect a zero - which isn't the case for all roots of a general polynomial.
Solution 1:
Consider the polynomial, $$p_1(x)=(x-1)(x-2)...(x-19)(x-20).$$ After expanding it, you get,
\begin{array}{|r|r|} \hline \textrm{Exponent} & \textrm{Coefficients of $p(x)$}\\\hline 0 & 2432902008176640000 \\ 1 & -8752948036761600000 \\ 2 & 13803759753640704000 \\ 3 & -12870931245150988800 \\ 4 & 8037811822645051776 \\ 5 & -3599979517947607200 \\ 6 & 1206647803780373360 \\ 7 & -311333643161390640 \\ 8 & 63030812099294896 \\ 9 & -10142299865511450 \\ 10 & 1307535010540395 \\ 11 & -135585182899530 \\ 12 & 11310276995381 \\ 13 & -756111184500 \\ 14 & 40171771630 \\ 15 & -1672280820 \\ 16 & 53327946 \\ 17 & -1256850 \\ 18 & 20615 \\ 19 & -210 \\ 20 & 1 \\\hline \end{array}
Note that the constant term here is $20!$. Now let us consider the polynomial $$p_2(x)=x^{20}p_1(1/x)$$ which basically flips the coefficients, giving us
\begin{array}{|r|r|} \hline \textrm{Exponent} & \textrm{Coefficients of $p_2(x)$}\\\hline 0 & 1 \\ 1 & -210 \\ 2 & 20615 \\ 3 & -1256850 \\ 4 & 53327946 \\ 5 & -1672280820 \\ 6 & 40171771630 \\ 7 & -756111184500 \\ 8 & 11310276995381 \\ 9 & -135585182899530 \\ 10 & 1307535010540395 \\ 11 & -10142299865511450 \\ 12 & 63030812099294896 \\ 13 & -311333643161390640 \\ 14 & 1206647803780373360 \\ 15 & -3599979517947607200 \\ 16 & 8037811822645051776 \\ 17 & -12870931245150988800 \\ 18 & 13803759753640704000 \\ 19 & -8752948036761600000 \\ 20 & 2432902008176640000 \\\hline \end{array}
This polynomial $p_2(x)$, I present as an answer to your question. The constant term here is very small compared to the other coefficients. The constant term is just one. The roots of this $p_2(x)$ polynomial are just $$x=\frac{1}{20},\frac{1}{19},\frac{1}{18},...,\frac{1}{3},\frac{1}{2},1.$$ The roots are all distinct, rational, and reasonably separated on the real line. Now let us define $p_3(x)=p_2(x)$ except that the constant term is equal to zero instead of one and compare the roots.
\begin{array}{|l|l|} \hline \textrm{Old roots of $p_2(x)$} & \textrm{New roots of $p_3(x)$}\\\hline 0.05 & 0\\ 0.0526316 & 0.00606612 - 0.0292961i\\ 0.0555556 & 0.00606612 + 0.0292961i\\ 0.0588235 & 0.0236616 - 0.0549143i\\ 0.0625 & 0.0236616 + 0.0549143i\\ 0.0666667 & 0.0510481 - 0.0735013i\\ 0.0714286 & 0.0510481 + 0.0735013i\\ 0.0769231 & 0.0855378 - 0.0823447i\\ 0.0833333 & 0.0855378 + 0.0823447i\\ 0.0909091 & 0.123755 - 0.0796379i\\ 0.1 & 0.123755 + 0.0796379i\\ 0.111111 & 0.16195 - 0.064656i\\ 0.125 & 0.16195 + 0.064656i\\ 0.142857 & 0.196345 - 0.0378412i\\ 0.166667 & 0.196345 + 0.0378412i\\ 0.2 & 0.218259\\ 0.25 & 0.249\\ 0.333333 & 0.333333\\ 0.5 & 0.5\\ 1 & 1\\ \hline \end{array}
As you can see that some of the roots stay the same or don't change much. But the majority of the roots "changed significantly" and became complex instead of purely real. So the answer to your question is, don't set the constant term to be zero. It won't work in general and can give you wacky answers.
If these coefficients are too large for you, then just divide the polynomial $p_2(x)$ by $20!$ and you get the coefficients
\begin{array}{|r|r|} \hline \textrm{Exponent} & \textrm{Coefficients of $\frac{p_2(x)}{20!}$}\\\hline 0 & 4.110317623312165\times10^{-19}\\ 1 & -8.631667008955546\times10^{-17}\\ 2 & 8.473419780458027\times10^{-15}\\ 3 & -5.166052704859894\times10^{-13}\\ 4 & 2.1919479625883945\times10^{-11}\\ 5 & -6.873605325572918\times10^{-10}\\ 6 & 1.6511874089046065\times10^{-8}\\ 7 & -3.1078571268337856\times10^{-7}\\ 8 & 4.6488830858656685\times10^{-6}\\ 9 & -0.0000557298 \\ 10 & 0.000537438 \\ 11 & -0.00416881 \\ 12 & 0.0259077 \\ 13 & -0.127968 \\ 14 & 0.495971 \\ 15 & -1.47971 \\ 16 & 3.3038 \\ 17 & -5.29036 \\ 18 & 5.67378 \\ 19 & -3.59774 \\ 20 & 1 \\\hline \end{array}
Making that small $10^{-19}$ constant to zero will give you the same problem with the roots because multiplying a polynomial by a constant doesn't change its roots. The thing to realize here is that the roots of a polynomial do depend continuously on the coefficients but they can be extremely sensitive and you must consider the complex plane as a whole with the real line embedded in it. The real line is just a tiny part of the entire complex plane. The real line is nothing special in this context. Changing the coefficients makes the roots wander but they can wander in any direction in the complex plane even if they were originally strictly on the real line. There is nothing constraining the roots to the real line. You can have real roots becoming complex or the other way around by slightly changing a polynomial's coefficients.
James Wilkinson is one of the most respected numerical analysts of the 20th century and his specialty was coming up with counter examples. He demonstrated that the problem of finding the roots of an arbitrary polynomial using its coefficients is an ill-conditioned problem in general.
He presented a specific example, $$p(x)=(x-1)(x-2)...(x-19)(x-20).$$ The roots are easy to find; $x=1,2,3,...,19,20$ so they are well-seperated. If the polynomial is expanded, the coefficient of the $x^{19}$ term is $-210$. Let us perturb this coefficient by $2^{-23}$ and then round it to $−210.0000001192$ and let's call this new polynomial $q(x)$. The following happens:
- The value of $p(20)=0$ blows up to $q(20)\approx-6.24949\times10^{17}$.
- The roots at $x=1,2,3,4,5,6$ stay roughly the same.
- The roots at $x=7,8,9$ have noticeable change.
- The next ten roots actually turn complex (pairs, because the coefficients are still real). All ten of these roots have a not-so-small imaginary part. The smallest imaginary part is $\approx 0.643439$ so you decide how far this is from being a real number.
- The root which was at $x=20$ has moved to $x\approx20.8469$.
Remember, this is despite the fact that the coefficients were integers (albeit very large integers). The roots were all real integers and well-separated and look what happened by a tiny perturbation. This polynomial is the one on which my answer is based.
Wilkinson presented another polynomial. Define $$q(x)=(x-2^{-1})(x-2^{-2})(x-2^{-3})...(x-2^{-19})(x-2^{-20}).$$ He showed that this polynomial $q(x)$ is rather insensitive to relatively large changes in the coefficients. Therefore in general, one cannot say anything either way. Check out this page for some more detail and I also urge you to read his published works (like the references at the bottom of the wikipedia page). They may be a bit hard to find but they are totally worth it.
Addendum (too long to be a comment)
In response to the OP's comment,
Just out of curiosity - how did you calculate the new roots (for p3) in the third table? Is it an analytical result?
the answer is that no, this is not an analytical result. Plain floating point arithmetic was used to numerically estimate the roots of $p_3(x)$.
I want to point out here that lots of new or improved methods were presented in the past century. We know quite a bit about polynomials and their roots. But there is (still) no single numerical method which will work for all polynomials in an optimal fashion. You can always come up with counterexamples where a well-liked method will be "slow" or sub-optimal for finding the roots of a polynomial. However, there are methods which are much better than others.
One of the most popular is Brent's method which is a hybrid method. Brent actually wrote a book, "Algorithms for Minimization without Derivatives" but the book is now out of print. So they scanned a copy and made it available to public here. In this books, you want chapter 4, "An Algorithm with Guaranteed Convergence for Finding a Zero of a Function" which describes his method. His method has also been peer-reviewed published.
Furthermore, one of the cofounders of MATLAB, Cleve Moler has a blog (useful in itself) and once he had a three-part post (one, two, and three) describing various algorithms and how MATLAB's fzero works. Part two is where he discusses Brent's method which is implemented in MATLAB's fzero. Since Brent's algorithm is guaranteed to converge (it could be slow but it will converge), I am curious as to why you were having the problem that you said you were. How/why was your application crashing? Was this something you yourself wrote? But then I would advise you against this. There are plenty of canned routines and there shouldn't be a need to reinvent the wheel. Was this MATLAB's fzero which was crashing? In which case I would be very interested in knowing those polynomials. Perhaps you can post a few examples here? Lastly, there is this fun book, one of my all-time favorites and talks quite a bit about zeros of polynomials and how to find them.
Addendum - 72 Days Later
The OP provided with a specific example of a polynomial with which he was having a problem. Just out of curiosity, I wanted to take a look at it and see what was happening. The OP provided the coefficients in floating point (in hexadecimal). Since I cannot tell what the actual coefficients were supposed to be, I will assume that the provided floating point coefficients represent the coefficients exactly as rational numbers. First, convert the hexadecimal form into base ten fractions and then decimals just to see what they are.
$$ \begin{array}{|l|l|l|r|r|} \hline &\text{Exponent} & \text{Hex Form} & \text{Fractional Form} & \text{Decimal Form} \\\hline a_6 & 6 & \text{BFAE7873980ADA44} & -\frac{2144171792184977}{36028797018963968} & -0.05951272231 \\ a_5 & 5 & \text{BFD79794C0074EF6} & -\frac{3320294798501755}{9007199254740992} & -0.3686267734 \\ a_4 & 4 & \text{BFE9E4C737C98680} & -\frac{56940771119885}{70368744177664} & -0.8091770258 \\ a_3 & 3 & \text{BFE5502ED16AFAE0} & -\frac{187473016346583}{281474976710656} & -0.6660379496 \\ a_2 & 2 & \text{BF81513E302ABBA0} & -\frac{152325066937821}{18014398509481984} & -0.008455739827 \\ a_1 & 1 & \text{3FC59AE0B4D97164} & \frac{1520316102106201}{9007199254740992} & 0.1687889941 \\ a_0 & 0 & \text{BC80000000000000} & \frac{1}{36028797018963968} & 2.775557561562\times10^{-17} \\ \hline \end{array} $$
Using the exact fractional form of the coefficients, define the polynomials $$p(x)=a_0+a_1x+a_2x^2+a_3x^3+a_4x^4+a_5x^5+a_6x^6$$ $$q(x)=a_1x+a_2x^2+a_3x^3+a_4x^4+a_5x^5+a_6x^6$$ where $q(x)$ is just $p(x)$ with the constant term set to zero. After looking at these, I know that all of the roots are irrational, except for the trivial root $x=0$ for $q(x)$. Further, I don't know for sure, but I suspect that both of these polynomials are not solvable in the radicals (check this and this if the reader doesn't know what this means) so there are no solutions in the radicals. Now, we have no choice but to rely on numerical methods.
This picture above plots both $p(x)$ and $q(x)$. Look at the axis limits and the scales. I can see the three simple real distinct roots easily enough (the three right-most roots). On the left, the graph is very flat near the $x$-axis but there is at least one root there for sure. It could be just one real root with multiplicity three or it could be three distinct real roots very close together or any other combination in between. My guess would be there aren't any complex roots. Notice how both polynomials are indistinguishable from each other. We just see a single curve.
This second image, above, zooms in on that flat region. The third simple root (the one on the right in this plot) is even more obvious now. But we still can't tell what is happening with the other three roots on the left. The graph is still too flat. Note we still cannot distinguish between $p(x)$ and $q(x)$. This means that for this specific polynomial, the polynomial which prompted this question, setting the constant term to zero actually doesn't change the roots by much. We'll verify this now.
Starting with the exact coefficients and carrying about a hundred digits throughout the computations for precision and accuracy, I got the following roots.
$$ \begin{array}{|r|r|r|}\hline \text{Roots of $p(x)$} & \text{Roots of $q(x)$} & \frac{|\text{Roots}(p(x))-\text{Roots}(q(x))|}{|\text{Roots}(p(x))|}\cdot100\%\\ \hline -1.7613269 & -1.7613263 & 0.00003277 \\ -1.3071486 & -1.3071486 & 1.70941691\times10^{-13} \\ -1.64439487\times10^{-16} & 0 & 100 \\ 0.39708223 & 0.39708223 & 1.72600660\times10^{-14} \\ -1.7613452-0.0000106 i & -1.7613455-0.0000111 i & 0.00003276 \\ -1.7613452+0.0000106 i & -1.7613455+0.0000111 i & 0.00003276 \\\hline \end{array} $$
The first column shows the roots of $p(x)$. The second column shows the roots of the perturbed $q(x)$. The third column shows the relative difference in the roots in percent. The absolute value in the third column is the usual norm in $\mathbb{R}$ or $\mathbb{C}$ as appropriate. The relative changes are small so that you can see that none of the roots really changed significantly.
The magnitude of the entire difference vector of the roots is $$||\text{Roots}(p(x))-\text{Roots}(q(x))||_2=9.995367\times10^{-7}$$ and indeed the roots changed very little. This is all relative because another perspective is that a change of the order $10^{-17}$ to the coefficient vector caused the magnitude of the roots vector to change by almost $10^{-6}$ which is eleven orders of magnitude. This is gigantic from another point of view. The vector function which maps the coefficients of a polynomial to its roots is indeed continuous. But being continuous doesn't restrict the gradient in anyway. It can be large or small. And in this case, the gradient of this vector function just happens to be very large at this particular point in its domain.
Solution 2:
If $r$ is a root, thought of a a function of $a_0$, then by implicit differentiation we get $$ \dfrac{dr}{da_0} = - \frac{1}{p'(r)}$$
The roots that will be strongly affected by a small change to $a_0$ are those for which $|p'(r)|$ is small - in particular, multiple roots, which are those where $p'(r) = 0$.
EDIT: On the other hand: given any simple closed contour $\Gamma$ such that no root is on $\Gamma$, let $\eta = \inf_{z \in \Gamma} |p(z)|$. Then any change to $a_0$ by less than $\eta$ in absolute value will leave constant the number of roots inside $\Gamma$, counted by multiplicity.
Solution 3:
The general dictum is that the roots of the polynomial vary continuously with the coefficients, but not analytically.
Typical of what makes the "trajectories" of roots follow curves that are not analytic is what happens to a double root as it reaches the origin. That is consider $x^2 = \varepsilon$, so that small positive values of $\varepsilon$ give you a pair of distinct (positive/negative) real roots which become a double root at $\varepsilon = 0$. If $\varepsilon$ become small negative, the two roots which coalesced at zero fly off as a conjugate pair, i.e. going along a trajectory at right angles to the trajectory that brought the pair together as a double root.