Why does Newton's method work?

The method is easiest to justify in one dimension. Say that I have some complicated function $f(x)$ whose root I want to find:

some dumb function

"I don't know how to find its root; it's complicated!" Thus, we use a general idea that has always been used in the design of numerical methods:

Replace a complicated function with a simple approximation.

One of the simplest functions one can deal with is a linear function:

$$f(x)=mx+b$$

In particular, if you want the root of a linear function, it's quite easily figured:

$$x=-\frac{b}{m}$$

Now, it is well-known (or at least, ought to be) that the tangent line of a function is the "best" linear approximation of a function in the vicinity of its point of tangency:

dumb function with tangent line

The first idea of the Newton-Raphson method is that, since it is easy to find the root of a linear function, we pretend that our complicated function is a line, and then find the root of a line, with the hope that the line's crossing is an excellent approximation to the root we actually need.

Mathematically, if we have the tangent line of $f(x)$ at $x=a$, where $a$ is the "starting point":

$$f(x)\approx f(a)+f^\prime(a)(x-a)=0$$

If we want $x$, then

$$x=a-\frac{f(a)}{f^\prime(a)}$$

Let's call this $x_1$.

dumb function with tangent and approximate root

As you can see, the blue point corresponding to the approximation is a bit far off, which brings us to the second idea of Newton-Raphson: if at first you don't succeed, try again:

dumb function with new root approximation

As you can see, the new blue point is much nearer to the red point. Mathematically, this corresponds to finding the root of the new tangent line at $x=x_1$:

$$x_2=x_1-\frac{f(x_1)}{f^\prime(x_1)}$$

We can keep playing this game (with $x_2, x_3, \dots x_n$), up until the point that we find a value where the quantity $\dfrac{f(x_n)}{f^\prime(x_n)}$ is "tiny". We then say that we have converged to an approximation of the root. That is the essence of Newton-Raphson.

As an aside, the previous discussion should tip you on what might happen if the tangent line is nearly horizontal, which is one of the disastrous things that can happen while applying the method.

This animation from the Wikipedia page for Newton's Method might be useful:

enter image description here

Why does Newton's method work?

Related

Recent Posts