Why is it not possible to generate an explicit formula for Newton's method?

Going through the recursive formula for approximating roots every time is extraordinarily tedious, so I was wondering why there was no formula that computed the $n$th iteration of Newton's method.


Solution 1:

First, note that such a formula would violate the "too good to be true" test: it would let you find the minima of any (sufficiently smooth, strictly) convex function in a single step, no matter how complicated!

For intuition, consider what Newton's method is doing: you are starting from an initial guess, and taking a sequence of "downhill steps" until you reach the minimum, where each step uses local information (a quadratic local approximation) about your neighborhood to compute the next destination. Imagine a hiker climbing down a mountain, who first hikes to the lowest point she can see, then reorients herself and goes to the lowest point she can see from her new location, etc.

The only way to make a direct beeline to the lowest point ($n\to\infty$) is if the hiker somehow knows the entire shape of the landscape in advance. In mathematical terms, this means your formula would have to use either:

  • information about the value of your function at all points, or equivalently (for nicely-behaved functions)
  • the values of all derivatives of all orders at your starting point.

And of course neither is practical in practice. Note though that sometimes you do have such complete information, e.g. when your function is quadratic, so that all higher-order derivatives are zero. And then you can easily make a beeline straight to the solution.

Also, you can write down a formula for the $n$th iteration, but as stated in the other answers, this is not useful: it gets messier and messier the larger $n$ gets (as it must, as explained above), doesn't converge to anything nice as $n\to \infty$, and amounts to nothing more than... running Newton's method for $n$ consecutive steps.

Solution 2:

You wouldn't really gain much with a formula for the $n$th iteration.

The appeal of the Newton-Raphson method is that a single step:

  1. is conceptually easy to understand,
  2. is fast to compute, and
  3. can be checked from step to step for closeness to a stationary point.

Nesting $n$ of these computations into a single formula works against all three of these things.

Solution 3:

Given $y \in \mathbb R^+$, suppose we want to compute the square root of $y$. Newton's method produces the recurrence relation $x_{k+1} = g (x_k)$ where

$$g (x) := \frac 12 \left( x + \frac{y}{x}\right)$$

which Hero of Alexandria apparently knew of almost two millennia ago. Suppose that we want a formula for $x_3$ in terms of $y$ and an initial estimate $x_0$. Using SymPy, we define $y$, $x_0$ and $g$:

>>> y, x0 = symbols('y x0')
>>> g = lambdify(x, 0.5 * (x + (y / x)))

Iterating $3$ times:

>>> g(g(g(x0)))
                         0.5*y                     0.25*y       0.125*y
0.125*x0 + --------------------------------- + -------------- + -------
                         0.5*y        0.25*y            0.5*y      x0  
           0.25*x0 + -------------- + ------   0.5*x0 + -----          
                              0.5*y     x0                x0           
                     0.5*x0 + -----                                    
                                x0                                     

Simplifying:

>>> simplify(g(g(g(x0))))
    /           8            6               4  2            2  3             4\
1.0*\0.015625*x0  + 0.4375*x0 *y + 1.09375*x0 *y  + 0.4375*x0 *y  + 0.015625*y /
--------------------------------------------------------------------------------
            /        6           4             2  2          3\             
         x0*\0.125*x0  + 0.875*x0 *y + 0.875*x0 *y  + 0.125*y /             

Is working with the ratio of bivariate polynomials of degrees $8$ and $7$ really less tedious than using the recurrence relation $x_{k+1} = g (x_k)$ thrice?

Suppose now that we want a formula for $x_5$ in terms of $y$ and $x_0$. Iterating $5$ times:

>>> g(g(g(g(g(x0)))))
                                                                                    0.5*y                                                                                                           0.25*y                                                0.125*y                   0.0625*y      0.03125*y
0.03125*x0 + --------------------------------------------------------------------------------------------------------------------------------------------------- + ----------------------------------------------------------------------- + --------------------------------- + -------------- + ---------
                                                          0.5*y                                                  0.25*y                   0.125*y       0.0625*y                            0.5*y                     0.25*y       0.125*y                 0.5*y        0.25*y            0.5*y       x0   
             0.0625*x0 + ----------------------------------------------------------------------- + --------------------------------- + -------------- + --------   0.125*x0 + --------------------------------- + -------------- + -------   0.25*x0 + -------------- + ------   0.5*x0 + -----            
                                                  0.5*y                     0.25*y       0.125*y                 0.5*y        0.25*y            0.5*y      x0                               0.5*y        0.25*y            0.5*y      x0                        0.5*y     x0                x0             
                         0.125*x0 + --------------------------------- + -------------- + -------   0.25*x0 + -------------- + ------   0.5*x0 + -----                         0.25*x0 + -------------- + ------   0.5*x0 + -----                       0.5*x0 + -----                                      
                                                  0.5*y        0.25*y            0.5*y      x0                        0.5*y     x0                x0                                             0.5*y     x0                x0                                   x0                                       
                                    0.25*x0 + -------------- + ------   0.5*x0 + -----                       0.5*x0 + -----                                                             0.5*x0 + -----                                                                                                     
                                                       0.5*y     x0                x0                                   x0                                                                         x0                                                                                                      
                                              0.5*x0 + -----                                                                                                                                                                                                                                               
                                                         x0                                                                                                                                                                                                                                                

Simplifying:

>>> simplify(g(g(g(g(g(x0))))))
    /                       32                         30                           28  2                         26  3                         24  4                        22  5                       20  6                       18  7                       16  8                       14  9                       12  10                        10  11                         8  12                         6  13                         4  14                         2  15                         16\
1.0*\9.31322574615479e-10*x0   + 4.61935997009277e-7*x0  *y + 3.34903597831726e-5*x0  *y  + 0.00084395706653595*x0  *y  + 0.00979593023657799*x0  *y  + 0.0600817054510117*x0  *y  + 0.210285969078541*x0  *y  + 0.439058616757393*x0  *y  + 0.559799736365676*x0  *y  + 0.439058616757393*x0  *y  + 0.210285969078541*x0  *y   + 0.0600817054510117*x0  *y   + 0.00979593023657799*x0 *y   + 0.00084395706653595*x0 *y   + 3.34903597831726e-5*x0 *y   + 4.61935997009277e-7*x0 *y   + 9.31322574615479e-10*y  /
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                    /                      30                         28                            26  2                         24  3                        22  4                       20  5                       18  6                       16  7                       14  8                       12  9                       10  10                        8  11                         6  12                          4  13                         2  14                        15\                 
                 x0*\2.98023223876953e-8*x0   + 4.61935997009277e-6*x0  *y + 0.000187546014785767*x0  *y  + 0.00313469767570496*x0  *y  + 0.0261224806308746*x0  *y  + 0.120163410902023*x0  *y  + 0.323516875505447*x0  *y  + 0.526870340108871*x0  *y  + 0.526870340108871*x0  *y  + 0.323516875505447*x0  *y  + 0.120163410902023*x0  *y   + 0.0261224806308746*x0 *y   + 0.00313469767570496*x0 *y   + 0.000187546014785767*x0 *y   + 4.61935997009277e-6*x0 *y   + 2.98023223876953e-8*y  /                 

Is this formula useful? If we use floating-point arithmetic, almost certainly not.

However, we can use arbitrary-precision rational arithmetic instead:

from sympy import *
from fractions import Fraction

# declare symbols
x, y, x_0 = symbols('x y x_0')

# declare function g
g = lambdify(x, Fraction(1, 2) * (x + (y / x)))

# iterate
x_1 = g(x_0)
x_2 = g(x_1)
x_3 = g(x_2)
x_4 = g(x_3)
x_5 = g(x_4)

print latex(simplify(x_5))

which produces the following monstrous formula for $x_5$:

$$\color{blue}{\frac{x_{0}^{32} + 496 x_{0}^{30} y + 35960 x_{0}^{28} y^{2} + 906192 x_{0}^{26} y^{3} + 10518300 x_{0}^{24} y^{4} + 64512240 x_{0}^{22} y^{5} + 225792840 x_{0}^{20} y^{6} + 471435600 x_{0}^{18} y^{7} + 601080390 x_{0}^{16} y^{8} + 471435600 x_{0}^{14} y^{9} + 225792840 x_{0}^{12} y^{10} + 64512240 x_{0}^{10} y^{11} + 10518300 x_{0}^{8} y^{12} + 906192 x_{0}^{6} y^{13} + 35960 x_{0}^{4} y^{14} + 496 x_{0}^{2} y^{15} + y^{16}}{32 x_{0} \left(x_{0}^{30} + 155 x_{0}^{28} y + 6293 x_{0}^{26} y^{2} + 105183 x_{0}^{24} y^{3} + 876525 x_{0}^{22} y^{4} + 4032015 x_{0}^{20} y^{5} + 10855425 x_{0}^{18} y^{6} + 17678835 x_{0}^{16} y^{7} + 17678835 x_{0}^{14} y^{8} + 10855425 x_{0}^{12} y^{9} + 4032015 x_{0}^{10} y^{10} + 876525 x_{0}^{8} y^{11} + 105183 x_{0}^{6} y^{12} + 6293 x_{0}^{4} y^{13} + 155 x_{0}^{2} y^{14} + y^{15}\right)}}$$

Suppose that we want to compute $\sqrt{10}$ using the initial estimate $x_0 = 3$:

>>> (x_5.subs(y,10)).subs(x_0,3)
9347391150304592810234881/2955904621546382351702304
>>> 9347391150304592810234881.0/2955904621546382351702304.0
3.1622776601683795
>>> 3.1622776601683795**2
10.000000000000002

which is good enough. Hence,

$$\sqrt{10} \approx \frac{9347391150304592810234881}{2955904621546382351702304} \approx 3.16227766$$