Solving a quadratic equation with precision when using floating point variables

I know how to solve a basic quadratic equation with the formula

$$t_{1,2}=\dfrac{-b\pm\sqrt{b^2-4ac}}{2a}$$

but I learned that if $b \approx \sqrt{b^2-4ac}$, floating-point precision may give slightly wrong results and this approach is better. It works, indeed. But why? Is there a simple explanation of why this works?

We are accustomed to solving the equation $ax^2+bx+c=0$ by using the Quadratic Formula $$x=\frac{-b\pm \sqrt{b^2-4ac}}{2a}.$$ Alternatively, we can use the Citardauq Formula $$x=\frac{2c}{-b\mp \sqrt{b^2-4ac}}.$$ When $|4ac|$ is small in comparison with $b^2$, one of the roots as computed by the Quadratic Formula may suffer serious loss of precision, because in the numerator we are finding the difference of large, nearly equal quantities. Precisely that root is then computed with no such loss of precision by the Citardauq Formula.
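To make the difference concrete, here is a minimal Python sketch that computes the small-magnitude root both ways in IEEE double precision. The coefficients $a=1$, $b=10^8$, $c=1$ and the function names are illustrative assumptions, not part of the answer above; the reference value $-c/b$ is only the first-order approximation of the small root, which is accurate enough for this comparison.

```python
import math

def small_root_quadratic_formula(a, b, c):
    # For b > 0 the "+" branch is the small-magnitude root:
    # -b and sqrt(b^2 - 4ac) are nearly equal and opposite, so they cancel.
    return (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)

def small_root_citardauq(a, b, c):
    # Same root, but the denominator adds two quantities of the same sign,
    # so no cancellation occurs.
    return (2 * c) / (-b - math.sqrt(b * b - 4 * a * c))

# Illustrative coefficients (an assumption): |4ac| is tiny compared with b^2.
a, b, c = 1.0, 1e8, 1.0
reference = -c / b  # first-order approximation of the small root

small_naive = small_root_quadratic_formula(a, b, c)
small_stable = small_root_citardauq(a, b, c)

print("Quadratic Formula:", small_naive)
print("Citardauq Formula:", small_stable)
print("relative error (Quadratic):", abs(small_naive - reference) / abs(reference))
print("relative error (Citardauq):", abs(small_stable - reference) / abs(reference))
```

With these inputs the Quadratic Formula value should be off by a large fraction of the root itself, while the Citardauq value agrees with the reference to roughly machine precision.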

A simpler way of dealing with the problem is to note that the product of the roots of $ax^2+bx+c=0$ is $c/a$. So if we can compute one of the roots $r_1$ to high precision, the other root $r_2$ can be computed from $r_1r_2=c/a$ with no loss of precision.

As the other comments mention, there is loss of precision when subtracting similar values. So you want to compute one root $r_1$ of $a x^2 + b x + c = 0$ without subtraction of possibly similar values and get the other from the relation $r_2 = \frac{c}{a r_1}$. If $4 a c$ is small with respect to $b^2$, then $\sqrt{b^2 - 4 a c} \approx |b|$. So the general strategy is:

If $b < 0$, compute $r_1 = \frac{- b + \sqrt{b^2 - 4 a c}}{2 a}$
If $b > 0$, compute $r_1 = \frac{- b - \sqrt{b^2 - 4 a c}}{2 a}$
If $b = 0$, $r_1 = \sqrt{-\frac{c}{a}}$, $r_2 = - \sqrt{-\frac{c}{a}}$
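The strategy above can be sketched in Python as follows. This is one possible implementation under stated assumptions: `solve_quadratic` is a hypothetical name, the coefficients are real with $a \neq 0$, and the roots are real. It uses `math.copysign` to choose the sign of the square root so that it is never subtracted from $-b$.

```python
import math

def solve_quadratic(a, b, c):
    """Return real roots (r1, r2) of a*x^2 + b*x + c = 0.

    Assumes a != 0 and a non-negative discriminant (real roots only).
    """
    disc = b * b - 4 * a * c
    if disc < 0:
        raise ValueError("complex roots are outside the scope of this sketch")
    if b == 0:
        # a*x^2 + c = 0, so x = +/- sqrt(-c/a); no cancellation is possible.
        r1 = math.sqrt(-c / a)
        return r1, -r1
    # copysign gives sqrt(disc) the sign of b, so -b and the square-root
    # term always have the same sign and are added in magnitude.
    r1 = (-b - math.copysign(math.sqrt(disc), b)) / (2 * a)
    # The product of the roots is c/a, which yields the other root
    # without any subtraction.
    r2 = c / (a * r1)
    return r1, r2

print(solve_quadratic(1.0, -3.0, 2.0))
print(solve_quadratic(1.0, 1e8, 1.0))
```

Because $|r_1| \ge |b| / |2a| > 0$ whenever $b \neq 0$, the division by $a r_1$ is safe in that branch.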

Any time you subtract two nearly equal numbers there is loss of precision. Imagine we work in scientific notation with five decimal digits available. The number $10100$ is represented as $1.0100 E4$ and you can think of it as having a potential error of $\pm 0.5$, as anything from $10099.5$ to $10100.5$ would get the same representation. If we now subtract it from $10200=1.0200 E4$ we get $100=1.0000 E2$, but the error could be $\pm 1$ (in the worst case). We only have about three digits of precision now, instead of five, though the result will still be represented with five digits.
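The five-digit thought experiment can be reproduced with Python's `decimal` module by setting the working precision to five significant digits. The "true" values $10200.4$ and $10099.6$ below are illustrative assumptions, chosen so that each rounds to one of the stored numbers in the example.

```python
from decimal import Decimal, Context, ROUND_HALF_EVEN

# A toy arithmetic with five significant decimal digits.
ctx = Context(prec=5, rounding=ROUND_HALF_EVEN)

# Hypothetical exact values that the stored numbers stand for.
x_true = Decimal("10200.4")
y_true = Decimal("10099.6")

# Storing them rounds to five digits: 10200 and 10100.
x = ctx.create_decimal(x_true)
y = ctx.create_decimal(y_true)

diff_stored = ctx.subtract(x, y)  # difference of the rounded values
diff_true = x_true - y_true       # difference of the exact values

rel_err_input = abs(x - x_true) / x_true
rel_err_diff = abs(diff_stored - diff_true) / diff_true

print("stored x, y:", x, y)
print("stored difference:", diff_stored, " true difference:", diff_true)
print("relative error of x:", rel_err_input)
print("relative error of the difference:", rel_err_diff)
```

The absolute errors of the inputs (at most $0.5$ each) carry over to the difference, but the difference is about a hundred times smaller than the inputs, so its relative error is about a hundred times larger.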

For the specific case of the quadratic formula, if $|4ac| \ll b^2$ and $b > 0$, you have $-b+\sqrt{b^2-4ac}=-b+b\sqrt {1-\frac {4ac}{b^2}}\approx -b+b\left(1-\frac {2ac}{b^2}\right)=-\frac {2ac}b$: the much larger $b$ has canceled, and precision is lost with it. (For $b < 0$ the same cancellation happens in $-b-\sqrt{b^2-4ac}$.)
