Repeatedly taking differences on a polynomial yields the factorial of its degree?

Solution 1:

Here's a fact:

If $p(x)$ is a polynomial of degree $n$ with leading term $ax^n$ then $p(x+1)-p(x)$ is a polyomial of degree $n-1$ with leading term $a \, n \, x^{n-1}$.

(I'll prove this fact below).

Applying this fact together with an induction argument, it follows that after repeating the process $n$ times, one obtains a polynomial of degree zero whose leading term is $$a \, n \, (n-1) \ldots (2) (1) = a \, n! $$ which is just a constant having that value.

So if the original leading coefficient $a$ is equal to $1$, as it is in the specific cases $F(x)=x^n$ that you ask about, repeating the difference process $n$ times yields a constant sequence of $n!$ as you ask.

Here's a proof of the fact by applying induction (the base case $n=1$ is easy).

Assuming the induction hypothesis for polynomials of degree $\le n-1$, suppose that $$p(x) = a \, x^n + q(x) $$ where $q(x)$ is a polynomial of degree $\le n-1$.

We have $$p(x+1)-p(x) = a \, (x+1)^n - a \, x^n + \underbrace{(q(x+1)-q(x))}_{r(x)} $$ and $r(x)$ is a polynomial of degree $\le n-2$ by induction. Thus $$p(x+1)-p(x) = a \, (x^n + n \, x^{n-1} + s(x)) - a\, x^n + r(x) $$ where $s(x)$ is also a polynomial of degree $\le n-2$ (by application of the binomial theorem). Therefore $$p(x+1)-p(x) = a \, n \, x^{n-1} + (a \, s(x)+r(x)) $$ which is a polynomial of degree $n-1$ with leading term as required.

Solution 2:

What you have discovered/invented is known as the forward difference operator $D$ defined as: $ \def\nn{\mathbb{N}} \def\zz{\mathbb{Z}} \def\lfrac#1#2{{\large\frac{#1}{#2}}} \def\lbinom#1#2{{\large\binom{#1}{#2}}} $

$D = ( \text{function $f$ on $\zz$} \mapsto ( \text{int $n$} \mapsto f(n+1) - f(n) ) )$

Namely for any function $f$ on $\zz$ and $n \in \zz$, $D(f)(n) = f(n+1) - f(n)$.

If you think of the functions as sequences (infinite in both directions), then taking the forward difference means replacing each term with the value of the next term minus itself. What you did is essentially to repeatedly take the forward difference of the sequence of cubes:

...,-27,-8,-1, 0, 1, 8,27,...
..., 19, 7, 1, 1, 7,19,37,...
...,-12,-6, 0, 6,12,18,24,...
...,  6, 6, 6, 6, 6, 6, 6,...
...,  0, 0, 0, 0, 0, 0, 0,...
...,  0, 0, 0, 0, 0, 0, 0,...

This powerful abstraction makes it easy to get a lot of things. For instance, the numbers obtained here can be easily used to obtain the general formula for sum of cubes!

General method for indefinite summation

The key is that:

$D\left( \text{int $n$} \mapsto \lbinom{n}{k+1} \right) = \left( \text{int $n$} \mapsto \lbinom{n}{k} \right)$ for any $k \in \zz$.

This is to be expected because it follows directly from Pascal's triangle, especially if we define $\lbinom{n}{k}$ using the triangle.

This means that if we have any function $f$ on $\zz$ such that $f(n) = \sum_{k=0}^\infty a_k \lbinom{n}{k}$ for any $n \in \zz$, then we get the Newton series:

$D(f)(n) = \sum_{k=0}^\infty a_{k+1} \lbinom{n}{k}$ for any $n \in \zz$.

From a high-level perspective, this is the discrete version of the Taylor series, and indeed for such a function we easily see that $f(n) = \sum_{k=0}^\infty D^k(f)(0) \lbinom{n}{k}$ for any $n \in \zz$, because $\binom{0}{0} = 1$ while $\lbinom{0}{k} = 0$ for any $k \in \nn^+$.

This works for any polynomial function $f$ on $\zz$, since $D^k(f)$ is the zero function once $k$ is larger than the degree of $f$, so we can use it to immediately find the series for $(\text{int n} \mapsto n^3)$, and then just take the anti-difference by shifting the coefficients of the series the other way. The undetermined constant that appears will drop out once we perform a definite sum like if we want the sum of the first $m$ cubes.

Note also that $D^k(f)$ is the constant function with value $k!$ if $f(n) = n^k$ for every $n$. Lee Mosher has already explained this particular fact by directly proving it, but another way to see it is that the highest order term in its Newton series is $k! \lbinom{n}{k}$, because $\lbinom{n}{k}$ is the only term that can contribute the $k$-th power of $n$. Since $D$ merely shifts the coefficients, $D^k(f) = \left( \text{int $n$} \mapsto k! \lbinom{n}{0} \right)$ and we are done.

Sum of $p$ powers

For example if we want $\sum_{k=1}^{n-1} k^3$ we first find the iterated forward differences of the sequence of cubes $( n^3 )_{n \in \zz}$:

..., 0, 1, 8,27,64,...
..., 1, 7,19,37,...
..., 6,12,18,...
..., 6, 6,...
..., 0,...

So we immediately get $n^3 = 0 \binom{n}{0} + 1 \binom{n}{1} + 6 \binom{n}{2} + 6 \binom{n}{3}$ and hence $\sum_{k=0}^{n-1} k^3 = 0 \binom{n}{1} + 1 \binom{n}{2} + 6 \binom{n}{3} + 6 \binom{n}{4} = \lfrac{n(n-1)}{2} \Big( 1 + \lfrac{6(n-2)}{3} \big( 1 + \lfrac{n-3}{4} \big) \Big) = \Big( \lfrac{n(n-1)}{2} \Big)^2$.

Computation efficiency

This is far more efficient than the usual method (namely by taking summation on both sides of $(n+1)^3-n^3 = 3n^2+3n+1$ and telescoping) because the series using binomial coefficients is easy to manipulate and easy to compute. For sum of $p$-powers we only need $O(p^2)$ arithmetic operations to find the forward-differences and then $O(p^2)$ more to simplify the series form into a standard polynomial form. In contrast, the other method requires $O(p^3)$ arithmetic operations.

Indefinite summation of non-polynomials

Also, for a wide class of non-polynomial functions, we can still compute the indefinite sum without using the series, by using the discrete analogue to integration by parts, here called summation by parts.

To derive it, simply check that $D(f \times g)(n) = f(n+1) g(n+1) - f(n) g(n) = f(n+1) D(g)(n) - D(f)(n) g(n)$ and so we get the product rule:

$D(f \times g) = R(f) \times D(g) + D(f) \times g$

where $R$ is the right-shift operator defined as:

$R = ( \text{function $f$ on $\zz$} \mapsto ( \text{int $n$} \mapsto f(n+1) ) )$

Namely for any function $f$ on $\zz$ and $n \in Z$, $R(f)(n) = f(n+1)$.

For convenience we also define the summation operator:

$S = ( \text{function $f$ on $\zz$} \mapsto ( \text{int $n$} \mapsto \sum_{k=0}^{n-1} f(k) ) )$

Then we have the important property that $DS(f) = f$ for any function $f$ on $\zz$, analogous to the fundamental theorem of calculus.

Now by substituting $f$ with $S(f)$ into the product rule and taking summation on both sides we get summation by parts:

$S( f \times g ) = S(f) \times g - S( R(S(f)) \times D(g) ) + c$ for some constant function $c$ on $\zz$.

Example indefinite sum

Using this we can easily compute things like $\sum_{k=1}^n k^3 3^k$ by applying it three times, each time reducing the degree of the polynomial part. There are other ways to achieve this using differentiation, but this method is purely discrete.

Exercise 1.6.3 from Alon & Spencer's *The Probabilistic Method*: prove that $Pr[|X-Y| \leq 2] \leq 3 Pr[|X-Y| \leq 1]$ for i.i.d. real RVs $X$ and $Y$

"Efficient version" of Cayley's Theorem in Group Theory

Proofs of the Cauchy-Schwarz Inequality?

In an extension of finitely generated $k$-algebras the contraction of a maximal ideal is also maximal

Does $\sum{\frac{\sin{(nx)}}{n}}$ converge uniformly for all $x$ in $[0,2\pi]$

Proof that convex open sets in $\mathbb{R}^n$ are homeomorphic?

$-1 = 0$ by integration by parts of $\tan(x)$

How to prove triangle inequality for $p$-norm?

Calculating $\lim_{n\to\infty}\sqrt{n}\sin(\sin...(\sin(x)..)$

Number of points in the fibre and the degree of field extension

Finding the maximum likelihood estimator

Number of automorphisms of a direct product of two cyclic $p$-groups