An intuitive explanation of the Taylor expansion

We know that the higher the degree of a polynomial, the more "turning points" its graph may have. For example, a parabola has one "turning point."

[Image: a parabola] (A parabola has an equation of the form $y=ax^2 + bx + c$.)

A cubic of the form $y=ax^3 + bx^2 + cx + d$ can have up to two "turning points," though it may have fewer. In general, a polynomial of degree $n$ may have up to $n-1$ turning points.

[Image: a quartic polynomial]

(Here is the polynomial $f(x) = 2x^4 - x^3 - 3x^2 + 7x - 13$. It is degree 4, so it can have at most $4-1=3$ turning points. But keep in mind, some degree 4 polynomials have only one or two turning points. The degree gives us the MAXIMUM number: $n-1$.)

This is important because, if you want to use a polynomial to approximate a function, you will want to use a polynomial of high enough degree to match the "features" of the function. The Taylor series lets you do this with functions that are "infinitely differentiable," since it uses the derivatives of the function to approximate the function's behavior.

Here are Taylor polynomials of increasing degree and the sine curve. Notice how they are "wrapping around" the sine curve, giving an approximation that fits better and better over more of the curve as the degree of the Taylor polynomial increases.

[Image: Taylor polynomials of increasing degree overlaid on the sine curve]

(Source for this image: http://202.38.126.65/navigate/math/history/Mathematicians/Taylor.html)

Since the sine curve has so many turning points, it is easy to see that to match all of its features we will need to take the limit of the $n^{th}$-degree Taylor polynomial as $n \rightarrow \infty$.*

That's the intuition behind the Taylor series. The higher the degree, the better the "fit." Why? Because higher-degree curves have more "turning points," so they can better match the shape of things like the sine function. (As long as the function we are approximating is infinitely differentiable.)
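As a quick numerical sketch of this idea (not from the original answer), we can sum the odd-degree terms of the Maclaurin series for $\sin x$ and watch the worst-case error on $[0, \pi]$ shrink as the degree grows:

```python
import math

def sin_taylor(x, degree):
    """Partial sum of the Maclaurin series for sin(x) up to the given degree."""
    total = 0.0
    for k in range(degree // 2 + 1):  # only odd terms: x, -x^3/3!, x^5/5!, ...
        n = 2 * k + 1
        if n > degree:
            break
        total += (-1) ** k * x ** n / math.factorial(n)
    return total

# Higher-degree Taylor polynomials "wrap around" more of the sine curve:
# measure the worst error on [0, pi] for increasing degree.
xs = [i * math.pi / 100 for i in range(101)]
for degree in (1, 3, 5, 7, 9):
    worst = max(abs(sin_taylor(x, degree) - math.sin(x)) for x in xs)
    print(f"degree {degree}: max error on [0, pi] = {worst:.2e}")
```

Each extra pair of terms visibly extends the interval over which the polynomial hugs the sine curve.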

*Side note: A function may have only a few turning points and still need infinitely many terms of its Taylor polynomial. Take the catenary, for example, which has only one turning point since it looks like a parabola. Its Taylor series never terminates: the derivatives of the catenary are the hyperbolic functions $\cosh$ and $\sinh$, which never become identically zero. (Expanded around $0$, the odd-degree coefficients do vanish, since $\sinh(0)=0$, but infinitely many even-degree terms remain.)

But, even with the catenary, higher degree polynomials give a better approximation.
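A small sketch of that claim (my addition, using $\cosh$ for the catenary with unit parameter): summing the surviving even-degree terms of its Maclaurin series, the error at a fixed point drops as the degree rises, even though the curve has just one turning point.

```python
import math

def cosh_taylor(x, degree):
    """Partial sum of the Maclaurin series for cosh(x): only even powers survive."""
    return sum(x ** (2 * k) / math.factorial(2 * k) for k in range(degree // 2 + 1))

# Even with just one turning point, the catenary needs more terms
# to stay accurate farther from x = 0.
for degree in (2, 4, 8):
    err = abs(cosh_taylor(2.0, degree) - math.cosh(2.0))
    print(f"degree {degree}: error at x=2 is {err:.2e}")
```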


I'll give it a try: if you want to know where you will be after driving a car for time $x$, you can find out by separating the different components (current position, speed, acceleration, jerk, and so on) and adding them all together.


Think of a Taylor series not as one entity but as a sequence of approximations.

The first term gives a constant approximation: $f(x + h)$ is approximately $f(x).$

The first two terms give a linear approximation: $f(x + h)$ is approximately $f(x)$ plus a trend term, $h f'(x).$

The first three terms include a constant approximation, a linear trend, and a curvature term to account for the change in the linear trend: $f(x + h)$ is approximately $f(x) + h f'(x) + h^2 \dfrac{f''(x)}{2}.$

Next you add a term to account for the change in the curvature, etc.
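To make this sequence of approximations concrete (my example, not the answerer's), take $f = \exp$, where every derivative is again $\exp$, and compare the constant, linear, and quadratic approximations of $f(x + h)$:

```python
import math

# Successive Taylor approximations of f(x + h) for f = exp at x = 1, h = 0.1.
# For exp, every derivative equals exp itself, so f'(x) = f''(x) = f(x).
x, h = 1.0, 0.1
fx = math.exp(x)

constant  = fx                             # f(x)
linear    = fx + h * fx                    # + trend term h f'(x)
quadratic = fx + h * fx + h**2 * fx / 2    # + curvature term h^2 f''(x)/2

true_value = math.exp(x + h)
for name, approx in [("constant", constant), ("linear", linear), ("quadratic", quadratic)]:
    print(f"{name:9s} error: {abs(approx - true_value):.2e}")
```

Each added term accounts for the change the previous approximation ignored, so the error drops at every step.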


Predict global while computing local!

A Taylor expansion of a function $f$ around some value $x_0$ is like predicting the function at a neighboring value $x$ while knowing progressively more about the variation of $f$ at the point $x_0$.

First step, the easiest prediction: nothing changed, that is, $f(x) = f(x_0)$.

Second step: we know the first derivative, so we predict the function was linear between $x_0$ and $x$: $f(x) = f(x_0) + (x-x_0)f'(x_0)$. See, everything is still local, since the derivative is evaluated at $x_0$.

The next steps generalize this prediction to higher derivatives. The different forms of the remainder give bounds on the error, or more knowledge of the residual.
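Written out, the general step with one of those remainder forms (the Lagrange form) looks like:

```latex
f(x) = \sum_{k=0}^{n} \frac{(x - x_0)^k}{k!}\, f^{(k)}(x_0)
     \;+\; \underbrace{\frac{(x - x_0)^{n+1}}{(n+1)!}\, f^{(n+1)}(\xi)}_{\text{remainder, for some } \xi \text{ between } x_0 \text{ and } x}
```

All the derivatives in the sum are still evaluated locally at $x_0$; only the remainder term refers to an unknown intermediate point $\xi$.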


Performing an $n$-th finite Taylor expansion can be thought of as making the approximation that the function's $n$-th derivative is constant.

Try it yourself: let $f$ be an $n$-times differentiable function whose $n$-th derivative is constant, and suppose you know the values of $f^{(i)}(0)$ for $0\leq i\leq n$. By integrating repeatedly, you'll find that this uniquely determines $f$ and produces the formula for the Taylor expansion.
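Sketching that repeated integration (around $0$, with each integration constant fixed by the known value $f^{(i)}(0)$):

```latex
\begin{aligned}
f^{(n)}(x)   &= f^{(n)}(0) \quad \text{(constant by assumption)} \\
f^{(n-1)}(x) &= f^{(n-1)}(0) + f^{(n)}(0)\, x \\
f^{(n-2)}(x) &= f^{(n-2)}(0) + f^{(n-1)}(0)\, x + f^{(n)}(0)\, \frac{x^2}{2} \\
             &\;\;\vdots \\
f(x)         &= \sum_{k=0}^{n} f^{(k)}(0)\, \frac{x^k}{k!}
\end{aligned}
```

The last line is exactly the degree-$n$ Taylor polynomial at $0$.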

Intuitively, I find it plausible that neglecting higher order derivatives of a function shouldn't cause too large of an error. Taylor's theorem confirms this intuition.