Proof that the Newton-Raphson method has quadratic convergence

I've googled this and I've seen different types of proofs but they all use notations that I don't understand.

But first of all, I need to understand what quadratic convergence means; I read that it has to do with the speed of an algorithm. Is this correct?

Ok, so I know that this is the Newton-Raphson method:

$$x_{n+1}=x_n-\dfrac{f(x_n)}{f'(x_n)}$$

How do I prove that it converges?

Thanks.


The method converges under suitable hypotheses. Assume that you have determined by whatever means an interval $[a,b]$ with $$f(a)<0<f(b);\qquad f'(x)>0, \quad f''(x)>0\quad(a<x<b)$$ (or similar, but with different signs of $f$, $f'$, and $f''$). Then $f$ has exactly one zero $\xi\in\ ]a,b[\ $. Furthermore, it is evident from a figure, or from the convexity of $f$, that $$x_0:=b,\qquad x_{n+1}:=x_n-{f(x_n)\over f'(x_n)}\quad(n\geq0)\tag{1}$$ produces a monotonically decreasing sequence of points $x_n>\xi$. It follows that the $x_n$ converge to some $\xi'\in[\xi,x_0]$. Letting $n\to\infty$ in $(1)$ implies $f(\xi')=0$, whence $\xi'=\xi$.
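
As a minimal numerical sketch of this behaviour (with the concrete choice $f(x)=x^3-2$ on $[1,2]$, which is my example rather than part of the argument), the following Python snippet starts at $x_0=b=2$ and shows the iterates decreasing monotonically toward the root $2^{1/3}$:

```python
# A minimal numerical sketch, assuming the choice f(x) = x**3 - 2 on [1, 2]:
# f(1) = -1 < 0 < 6 = f(2), f'(x) = 3x^2 > 0 and f''(x) = 6x > 0 on (1, 2),
# so starting at x0 = b = 2 the iterates should decrease monotonically
# toward the root 2**(1/3).

def f(x):
    return x**3 - 2

def fprime(x):
    return 3 * x**2

x = 2.0                                  # x0 = b
for n in range(8):
    x_next = x - f(x) / fprime(x)        # the Newton step (1)
    print(n, x_next, x_next <= x)        # iterates decrease monotonically
    x = x_next

print("limit:   ", x)
print("2**(1/3):", 2 ** (1 / 3))
```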

In order to analyze the speed of convergence we invoke Taylor's theorem: For each $n\geq0$ there is an $x^*\in[\xi,x_n]$ with $$0=f(\xi)=f(x_n)+f'(x_n)(\xi-x_n)+{f''(x^*)\over 2!}(\xi-x_n)^2\ .$$ This implies, by definition of $x_{n+1}$, that $$x_{n+1}-\xi={f''(x^*)\over 2 f'(x_n)}(x_n-\xi)^2\ .$$ Here for large $n$ the first factor on the right hand side is approximately equal to $$C:={f''(\xi)\over 2 f'(\xi)}\ .$$ This means that for large $n$ we have approximately $$x_{n+1}-\xi\doteq C(x_n-\xi)^2\qquad(n\gg1)\ .$$ Qualitatively this means that with each Newton step the number of correct decimals is about doubled. That is what is meant by "quadratic convergence".
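
To see this estimate in action, here is a small check assuming the concrete example $f(x)=x^2-2$ with $\xi=\sqrt2$ (my choice, not part of the answer): then $C=f''(\xi)/(2f'(\xi))=1/(2\sqrt2)\approx0.3536$, and the ratio $(x_{n+1}-\xi)/(x_n-\xi)^2$ settles near $C$ while the number of correct digits roughly doubles with each step:

```python
# A small check of the estimate, assuming the example f(x) = x**2 - 2 with
# root xi = sqrt(2).  Then C = f''(xi) / (2 f'(xi)) = 1/(2*sqrt(2)) ~ 0.3536,
# and the ratio (x_{n+1} - xi) / (x_n - xi)**2 should settle near C.

import math

xi = math.sqrt(2)
C = 1 / (2 * math.sqrt(2))

x = 2.0
for n in range(4):
    x_next = x - (x**2 - 2) / (2 * x)    # Newton step for f(x) = x^2 - 2
    e, e_next = x - xi, x_next - xi
    print(n, e, e_next / e**2)           # second column approaches C
    x = x_next

print("C =", C)
```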


Let $f$ be twice continuously differentiable on some interval $(a, b)$. Assume that $f(c) = 0$ where $a < c < b$ and that $f'(c) \neq 0$. Then there exists $0 < \epsilon < \min(c - a, b-c)$ with the following property: pick any $x_0 \in (c- \epsilon, c + \epsilon)$ and define iteratively$$x_{n+1} = x_n - {{f(x_n)}\over{f'(x_n)}},\text{ }n \ge 0.\tag*{$(1)$}$$We have that $\{x_n\}_{n=0}^\infty$ converges to $c$ in the following fashion:$$\left|x_{n+1} - c\right| \le M\left|x_n - c\right|^2 \text{ for all }n \ge 0,\tag*{$(2)$}$$where $M$ is some constant.

We may assume $c = 0$, and take $\epsilon$ small enough that$$\left|f'(x)\right| > {{\left|f'(0)\right|}\over2}\quad\text{and}\quad\left|f''(x)\right| \le B$$ when $\left|x\right| < \epsilon$, for some $B \in \mathbb{R}^+$. Then by Taylor's Theorem,$$f(x_n) - x_nf'(x_n) = -{{x_n^2}\over2} f''(y_n)$$for some $y_n$ between $0$ and $x_n$. Thus, for $\left|x_n\right| < \epsilon$, we have$$\begin{aligned}\left|x_{n+1}\right| = \left|-x_{n+1}\right| &=\left| {{f(x_n)}\over{f'(x_n)}} - x_n\right| = {1\over{\left|f'(x_n)\right|}} \cdot \left| f(x_n) - x_n f'(x_n)\right|\\ &= {1\over{\left|f'(x_n)\right|}} \cdot \left| {{x_n^2}\over2} f''(y_n)\right| \le {2\over{\left|f'(0)\right|}} \cdot {B\over2}\, x_n^2 = M\,x_n^2,\end{aligned}$$where $M := B/\left|f'(0)\right|$. What we are doing is taking $x_{n+1}$ to be the point where the tangent line to the graph of $f$ at $x_n$ hits the $x$-axis. The inequality we proved shows that for $\left|x_n\right| < \epsilon$, we have$$\left| {{x_{n+1}}\over{x_n}}\right| \le M\left|x_n\right|,$$so if$$\left| x_n\right| < \min\left( \epsilon, {1\over{2M}}\right),$$then$$\left|x_{n+1}\right| < {{\left|x_n\right|}\over2}.$$Thus,$$\left|x_n\right| \to 0\text{ as } n \to \infty.$$
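
For a concrete check of $(2)$, here is a sketch assuming the example $f(x)=\sin x$ with root $c=\pi$ (my choice, not part of the answer): near $\pi$ one may take $B=1$ and $|f'(\pi)|=1$, so the constant the proof produces is about $M=1$, and $|x_{n+1}-\pi|\le|x_n-\pi|^2$ should hold once the iterates are close to $\pi$:

```python
# A quick numerical check of inequality (2), assuming the example f(x) = sin(x)
# with root c = pi.  Near pi we may take B = 1, since |f''| = |sin| <= 1, and
# |f'(pi)| = 1, so the constant from the proof is about M = B / |f'(c)| = 1;
# hence |x_{n+1} - pi| <= |x_n - pi|**2 should hold close to pi.

import math

c = math.pi
x = 3.5                                  # a starting point close enough to pi
for n in range(3):
    x_next = x - math.sin(x) / math.cos(x)
    lhs, rhs = abs(x_next - c), abs(x - c) ** 2
    print(n, lhs, rhs, lhs <= rhs)
    x = x_next
```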


The rate of convergence in $(2)$ is quadratic and thus faster than in the contraction principle: there the convergence is exponential, here it is super-exponential. This plays an important role in applications, and also in problems of pure mathematics (Nash embedding). If you have experience in programming, you could write a short program which computes this Newton sequence for simple functions of your choice. You will see how rapidly the sequence stabilizes after the decimal point.
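
For instance, here is one such short program (my own sketch, using Python's standard decimal module): it runs Newton's method for $f(x)=x^2-2$, i.e. computes $\sqrt2$, with 60 significant digits, and the number of stable digits roughly doubles with each iterate:

```python
# A short program along the lines suggested above (my own sketch): Newton's
# method for f(x) = x**2 - 2, carried out with 60 significant decimal digits
# so that the doubling of stable digits is visible.  The Newton step
# x - (x**2 - 2) / (2*x) simplifies to (x + 2/x) / 2.

from decimal import Decimal, getcontext

getcontext().prec = 60                   # work with 60 significant digits

x = Decimal(2)                           # x0 = 2
for n in range(7):
    x = (x + Decimal(2) / x) / 2         # Newton step for x^2 - 2
    print(n, x)                          # stable digits roughly double each step
```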


You look at the size of the next function value. For simple roots and close to the root, the function value is a measure of the distance to the root. By Taylor's theorem, $$ f(x+h)=f(x)+f'(x)h+\frac12 f''(x+\theta h)h^2. $$ Denote $L=\max_{x\in I} |f''(x)|$ and choose $h$ so that $f(x)+f'(x)h=0$ (the Newton step); then $$ |f(x+h)|\le \frac L2 h^2=\frac L2\frac{f(x)^2}{f'(x)^2}. $$ Now absorb the derivative bounds into the constant and return to the iteration sequence $(x_n)$ to get $$ |f(x_{n+1})|\le C\,|f(x_n)|^2 \iff |C\,f(x_{n+1})|\le|C\,f(x_n)|^2, $$ where $C=\frac{L}{2m^2}$ with $$ 0< m\le |f'(x)|\le M<\infty \quad(x\in I). $$
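
As a numerical illustration of this bound (my example, not part of the answer), take $f(x)=x^2-2$ on $I=[1,2]$: then $L=2$ and $m=2$, so $C=L/(2m^2)=0.25$, and the inequality $|f(x_{n+1})|\le C\,|f(x_n)|^2$ can be checked along the iterates:

```python
# A numerical illustration, assuming f(x) = x**2 - 2 on I = [1, 2]: there
# L = max|f''| = 2 and m = min|f'| = 2, so C = L / (2*m**2) = 0.25, and we
# expect |f(x_{n+1})| <= C * |f(x_n)|**2 along the Newton iterates.

C = 0.25
x = 2.0
for n in range(4):
    x_next = x - (x**2 - 2) / (2 * x)    # Newton step
    lhs = abs(x_next**2 - 2)             # |f(x_{n+1})|
    rhs = C * abs(x**2 - 2) ** 2         # C |f(x_n)|^2
    print(n, lhs, rhs, lhs <= rhs)
    x = x_next
```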


Repeated squaring leads to a dyadic power in the exponent, so that $$ |C\,f(x_n)|\le|C\,f(x_0)|^{2^n}. $$ This is what is meant by quadratic convergence: the exponent is $2^n$ instead of $n$ as in linear convergence.

The condition to guarantee convergence is then $|C\,f(x_0)|<1$.
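
Continuing the same illustrative example $f(x)=x^2-2$ on $[1,2]$ with $C=0.25$ and $x_0=2$, we have $|C\,f(x_0)|=0.5<1$, so the condition is satisfied and the dyadic bound can be verified directly:

```python
# A check of the dyadic bound for the illustrative example f(x) = x**2 - 2 on
# [1, 2] with C = 0.25 and x0 = 2: then |C f(x0)| = 0.5 < 1, so the condition
# is met and |C f(x_n)| should stay below |C f(x0)|**(2**n).

C = 0.25
x = 2.0
start = C * abs(x**2 - 2)                # |C f(x0)| = 0.5
for n in range(5):
    lhs = C * abs(x**2 - 2)              # |C f(x_n)|
    rhs = start ** (2**n)                # |C f(x0)|^(2^n)
    print(n, lhs, rhs, lhs <= rhs)
    x = x - (x**2 - 2) / (2 * x)         # Newton step
```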


For the distance to the root $x_*$, use the mean value theorem: $$ f(x)=f(x)-f(x_*)= f'\bigl(x_*+\theta(x-x_*)\bigr)\,(x-x_*) $$ for some $\theta\in(0,1)$, so that $$ m\,|x-x_*|\le |f(x)|\le M\,|x-x_*|\iff \frac{|f(x)|}M\le |x-x_*|\le\frac{|f(x)|}m. $$
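
For the same illustrative choice $f(x)=x^2-2$ on $[1,2]$ (so $m=2$, $M=4$, $x_*=\sqrt2$), the two-sided bound is easy to check numerically:

```python
# A quick check of the two-sided bound, again for the illustrative choice
# f(x) = x**2 - 2 on [1, 2], where m = 2 <= |f'(x)| = 2x <= 4 = M and x_* = sqrt(2).

import math

m, M = 2.0, 4.0
x_star = math.sqrt(2)
for x in (1.1, 1.3, 1.5, 1.7, 1.9):
    fx = abs(x**2 - 2)                   # |f(x)|
    dist = abs(x - x_star)               # |x - x_*|
    print(x, m * dist <= fx <= M * dist)
```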