Intuitive explanation of proof of Abel's limit theorem
Since Abel's limit theorem can be proven using the Dirichlet convergence test (see these notes on Ken Davidson's webpage), perhaps you will be satisfied with a geometrically intuitive proof of the latter. I might edit this answer later to incorporate this step.
It is easy to spot that the Dirichlet test is a generalization of the "alternating series test". However, whereas the alternating series test is proven "geometrically", the Dirichlet test is usually proven by an uninspiring application of the summation by parts formula. For motivation, let us briefly look at the alternating series test:
Alternating series test: Let $(a_n)_{n \geq 0}$ be a monotone sequence of positive reals with limit zero. Then, $\sum_{n=0}^\infty (-1)^n a_n$ converges. Indeed, the sequence $s_n = \sum_{i=0}^n (-1)^i a_i$ of partial sums satisfies $|s_m -s_n| \leq a_n$ for $m >n$ and is Cauchy, in particular.
This all follows from the fact that there is a nested sequence of closed intervals $I_0 \supseteq I_1 \supseteq I_2 \supseteq \ldots$ with $\mathrm{length}(I_n) =a_n$ such that $s_n \in I_n$ for each $n$. Just take \begin{align*} I_0=[0,a_0] && I_1 = [s_1,s_0] && I_2 = [s_1,s_2] && I_3 = [s_3,s_2] && I_4 = [s_3,s_4] && \cdots \end{align*}
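This interval picture is easy to sanity-check numerically. Here is a short sketch, with the illustrative choice $a_n = 1/(n+1)$ (the alternating harmonic series); the specific sequence is my own example, not part of the argument:

```python
# Check the nested-interval picture for the alternating harmonic series
# (illustrative choice: a_n = 1/(n+1), monotonically decreasing to 0).
a = [1.0 / (n + 1) for n in range(50)]

s = []                                   # partial sums s_n = sum (-1)^i a_i
total = 0.0
for n, an in enumerate(a):
    total += (-1) ** n * an
    s.append(total)

intervals = [(0.0, a[0])]                # I_0 = [0, a_0]
for n in range(1, len(s)):               # I_n has endpoints s_{n-1} and s_n
    lo, hi = sorted((s[n], s[n - 1]))
    intervals.append((lo, hi))

for n, (lo, hi) in enumerate(intervals):
    assert lo <= s[n] <= hi                      # s_n lies in I_n
    assert abs((hi - lo) - a[n]) < 1e-12         # length(I_n) = a_n
    if n > 0:                                    # I_n nested in I_{n-1}
        plo, phi = intervals[n - 1]
        assert plo <= lo <= hi <= phi
```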
The plan is to generalize this approach to the setting of Dirichlet's test.
Dirichlet test: Let $(z_n)_{n \geq 0}$ be a sequence of complex numbers whose sequence of partial sums is bounded. So, there exists a closed disk $D$ of diameter $d$ such that $\zeta_n = \sum_{i=0}^n z_i \in D$ for all $n$. Let $(a_n)_{n \geq 0}$ be a monotone sequence of positive reals converging to zero. Then, $\sum_{n=0}^\infty a_n z_n$ converges. Indeed, the sequence $s_n = \sum_{i=0}^n a_i z_i$ of partial sums satisfies $|s_m -s_n| \leq da_n$ for $m >n$ and is Cauchy, in particular.
Remark: In the case $z_n=(-1)^n$, the partial sums $\zeta_n$ alternate between $1$ and $0$, so we can take $D$ to be the closed disk with diameter the segment $[0,1]$, giving $d=1$.
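The claimed Cauchy bound $|s_m - s_n| \leq d\,a_n$ can be checked numerically. Here is a sketch with sample data of my own choosing (not from the statement): $z_n$ cycling through $1, i, -1, -i$, so the partial sums $\zeta_n$ cycle through $1, 1+i, i, 0$ and fit in a disk of diameter $d = \sqrt{2}$, together with $a_n = 1/(n+1)$:

```python
# Sanity check of the Dirichlet test bound |s_m - s_n| <= d * a_n.
# Sample data: z_n cycles 1, i, -1, -i, so the partial sums zeta_n cycle
# 1, 1+i, i, 0 and lie in the disk |w - (1+1j)/2| <= sqrt(2)/2,
# which has diameter d = sqrt(2); and a_n = 1/(n+1).
import math

N = 200
cycle = [1, 1j, -1, -1j]
z = [cycle[n % 4] for n in range(N)]
a = [1.0 / (n + 1) for n in range(N)]

s = []                                  # s_n = a_0 z_0 + ... + a_n z_n
total = 0.0
for n in range(N):
    total += a[n] * z[n]
    s.append(total)

d = math.sqrt(2)
for n in range(N):
    for m in range(n + 1, N):
        assert abs(s[m] - s[n]) <= d * a[n] + 1e-9
```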
This will all follow once we construct a nested sequence of closed disks $D_0 \supseteq D_1 \supseteq D_2 \supseteq \ldots$ with $\mathrm{diam}(D_n) = da_n$ such that $s_n \in D_n$ for each $n$. The little piece of geometry we will use to accomplish this is the following:
Rescaling a disk with respect to a point other than the centre: Let $D$ be a closed disk in the plane. Consider a transformation $f(z) = a(z-z_0)+z_0$ where $a \in (0,1]$. This effects a scaling down by the factor $a$ with respect to the basepoint $z_0$. Suppose that $z_0$ belongs to the disk $D$ (but is not necessarily the centre). Then,
- $f(D)$ is, again, a disk.
- $f(D) \subseteq D$
- the diameter of $f(D)$ is $a$ times the diameter of $D$.
A couple of brief comments: (1) is quite intuitive--if you are standing somewhere inside a circle, its "circleness" doesn't depend on whether you measure distance in one unit system or another. (2) would remain true with $D$ replaced by any convex set (or even any set star-shaped about $z_0$).
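All three bullet points can be verified numerically. A minimal sketch, where the disk $D$, the basepoint $z_0$, and the factor $a$ are illustrative values of my own choosing:

```python
# Check the three bullet points for f(z) = a*(z - z0) + z0 with z0 in D.
# The disk D, basepoint z0, and factor a below are illustrative choices.
import cmath

c, r = 1 + 2j, 3.0            # D = closed disk with centre c and radius r
z0 = c + 1.5 + 0.5j           # a non-centre point of D: |z0 - c| < r
a = 0.4                       # scaling factor in (0, 1]

def f(z):
    return a * (z - z0) + z0

# Images of boundary points of D lie on the circle |w - f(c)| = a*r,
# so f(D) is a disk of diameter a * diam(D); and f(D) sits inside D.
for k in range(360):
    z = c + r * cmath.exp(2j * cmath.pi * k / 360)
    w = f(z)
    assert abs(abs(w - f(c)) - a * r) < 1e-9      # f(D) is a disk, scaled by a
    assert abs(w - c) <= r + 1e-9                 # f(D) is contained in D
```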
Recall that $\zeta_n = \sum_{i=0}^n z_i$, so that, by assumption, $\zeta_n \in D$ for all $n$. Define transformations $f_n : \mathbb{C} \to \mathbb{C}$ by $$ f_n(z) = \frac{a_{n}}{a_{n-1}} (z-s_{n-1})+s_{n-1},$$ noting $0 < \frac{a_{n}}{a_{n-1}} \leq 1$. It is easy to check that:
- $s_0=a_0\zeta_0$
- $s_1=f_1(a_0\zeta_1)$
- $s_2=f_2 \circ f_1(a_0\zeta_2)$
- $s_n=f_n \circ \cdots \circ f_1(a_0\zeta_n)$, in general.
In particular \begin{align*} s_n \in D_n && \text{ where } && D_n = f_n \circ \cdots \circ f_1 (a_0D) \end{align*}
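The composition identity $s_n = f_n \circ \cdots \circ f_1(a_0\zeta_n)$ can be checked directly. A sketch with sample data of my own choosing ($z_n$ cycling through $1, i, -1, -i$ and $a_n = 1/(n+1)$):

```python
# Check s_n = f_n ∘ ... ∘ f_1 (a_0 * zeta_n) for sample data (illustrative
# choice: z_n cycles 1, i, -1, -i and a_n = 1/(n+1)).
N = 30
cycle = [1, 1j, -1, -1j]
z = [cycle[n % 4] for n in range(N)]
a = [1.0 / (n + 1) for n in range(N)]

s, zeta = [], []
sz = zz = 0.0
for n in range(N):
    sz += a[n] * z[n]        # s_n = sum a_i z_i
    zz += z[n]               # zeta_n = sum z_i
    s.append(sz)
    zeta.append(zz)

for n in range(1, N):
    w = a[0] * zeta[n]
    for k in range(1, n + 1):                        # apply f_1, f_2, ..., f_n
        w = (a[k] / a[k - 1]) * (w - s[k - 1]) + s[k - 1]
    assert abs(w - s[n]) < 1e-9
```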
This is not a geometric explanation, but I thought I would try to explain why partial summation is fairly natural here by writing it with the infinite series in $z$ throughout.
Let $s_{n}=\sum_{k=0}^n a_k$, with the convention $s_{-1}=0$. Then, for $|z|<1$, we can express the power series with coefficients $a_n$ in terms of the $s_n$, with a factor that vanishes at $z=1$: $$\sum_{n=0}^\infty a_n z^n = \sum_{n=0}^\infty (s_n-s_{n-1}) z^n=(1-z)\sum_{n=0}^\infty s_n z^n.$$ Suppose that $\lim_{n\rightarrow\infty} s_n=\alpha$, that is, the series converges. Then $$\sum_{n=0}^\infty a_n z^n=\alpha + (1-z)\sum_{n=0}^\infty (s_n -\alpha)z^n.$$ In this form, it's easy to see why $(s_n-\alpha)\rightarrow 0$ as $n\rightarrow\infty$ implies the desired convergence: the $(1-z)$ factor will zero out the $\sum_{n=0}^\infty (s_n -\alpha)z^n$ term as $z\rightarrow 1^-$, since the terms tend to zero. That is, since $$\lim_{z\rightarrow 1^-} (1-z)\sum_{n=0}^\infty (s_n -\alpha)z^n =0,$$ we obtain Abel's theorem.
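This rewriting is easy to test numerically on truncated series. A sketch with the illustrative coefficients $a_n = (-1)^n/(n+1)$ (my choice, so that $\alpha = \log 2$), watching the correction term vanish as $z \to 1^-$:

```python
# Check sum a_n z^n = alpha + (1-z) * sum (s_n - alpha) z^n on truncated
# series, with the illustrative choice a_n = (-1)^n/(n+1), so alpha = log 2.
import math

N = 20000
a = [(-1) ** n / (n + 1) for n in range(N)]
s = []
tot = 0.0
for an in a:
    tot += an
    s.append(tot)
alpha = math.log(2)

def lhs(z):
    return sum(an * z ** n for n, an in enumerate(a))

def tail(z):
    return (1 - z) * sum((sn - alpha) * z ** n for n, sn in enumerate(s))

for z in (0.5, 0.9, 0.99):
    assert abs(lhs(z) - (alpha + tail(z))) < 1e-6     # the identity holds
assert abs(tail(0.99)) < abs(tail(0.9)) < abs(tail(0.5))   # (1-z) factor wins
```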
This expands on Mike F's answer, in which he defined $D_n=f_n \circ\cdots \circ f_1(a_0D)$.
To show $D_{n+1}\subseteq D_n$ we will first show that, for every $m \geq n$,
$a_0z_0 + a_1z_1 + \ldots + a_{n-1}z_{n-1}+a_n(z_n+\ldots+z_m)\in D_n$.
In the base case, $a_0z_0 + a_0z_1 + \ldots + a_0z_m = a_0\zeta_m \in D_0 = a_0D$, because the partial sums $\zeta_m$ of the $z_i$ all lie in $D$ by the choice of $D$. Now assume that $a_0z_0 + a_1z_1 + \ldots + a_{n-1}z_{n-1}+a_n(z_n+\ldots+z_m)\in D_n$ holds, and we wish to show the corresponding statement for $n+1$. The function $f_{n+1}$ is a scaling by the factor $a_{n+1}/a_{n}$ centred at $s_n = a_0z_0+\cdots+a_nz_n$. It satisfies
$f_{n+1}(a_0z_0 + a_1z_1 + \ldots + a_{n-1}z_{n-1}+a_n(z_n+\ldots+z_m))=a_0z_0 + a_1z_1 + \ldots + a_{n-1}z_{n-1}+a_nz_n+a_{n+1}(z_{n+1}+\ldots+z_m)$
when $m>n$, since the argument differs from the scaling centre $s_n$ by exactly $a_n(z_{n+1}+\ldots+z_m)$. Therefore, applying $f_{n+1}$ to the inductive hypothesis, we have $a_0z_0 + a_1z_1 + \ldots + a_{n-1}z_{n-1}+a_nz_n+a_{n+1}(z_{n+1}+\ldots+z_m)\in D_{n+1}$. This proves the claim.
Using that result, now set $m=n$: we get $a_0z_0 + a_1z_1 + \ldots + a_{n-1}z_{n-1}+a_nz_n\in D_n$. This says the scaling centre of $f_{n+1}$ is contained in $D_n=f_n \circ\cdots \circ f_1(a_0D)$, which, by the disk-rescaling fact above, shows $D_{n+1}\subseteq D_n$. Combined with $s_n\in D_n$ and $\mathrm{diam}(D_n)=da_n\to 0$, this shows the convergence of $s_n$.
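The key identity in the induction step can also be checked numerically. A sketch with sample data of my own choosing ($z_n$ cycling through $1, i, -1, -i$ and $a_n = 1/(n+1)$):

```python
# Check the induction step: f_{n+1} sends s_{n-1} + a_n (z_n + ... + z_m)
# to s_n + a_{n+1} (z_{n+1} + ... + z_m).  Sample data (illustrative):
# z_n cycles 1, i, -1, -i and a_n = 1/(n+1).
N = 25
cycle = [1, 1j, -1, -1j]
z = [cycle[n % 4] for n in range(N)]
a = [1.0 / (n + 1) for n in range(N)]

s = []                                   # s_n = a_0 z_0 + ... + a_n z_n
tot = 0.0
for n in range(N):
    tot += a[n] * z[n]
    s.append(tot)

for n in range(1, N - 1):
    for m in range(n + 1, N):
        w = s[n - 1] + a[n] * sum(z[n:m + 1])
        fw = (a[n + 1] / a[n]) * (w - s[n]) + s[n]       # f_{n+1}(w)
        assert abs(fw - (s[n] + a[n + 1] * sum(z[n + 1:m + 1]))) < 1e-9
```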
Like a previous answer, this answer does not give geometric intuition, but I hope it shows how someone might reasonably have discovered the proof from scratch.
Note: I'm working WLOG in the situation where the radius of convergence is $r = 1$.
It's hopeless to try to find a proof which uses only the fact that the coefficients $a_n$ tend to zero as $n \to \infty$, and it's reasonable to look for a proof involving sums of the coefficients.
EDIT: Clarification of the above two paragraphs
- Sorry, I should've just said that the convergence of $\sum_n a_n$ is a stronger condition than $a_n \to 0$, so we should focus on somehow using $\sum_n a_n$ and not spend time trying to use $a_n \to 0$.
- Because, if you could somehow prove uniform convergence of $f_n(x) = \sum_{k=0}^n a_k x^k$ to $f(x) = \sum_n a_n x^n$ on $(0,1)$ only from the hypothesis $a_n \to 0$ as $n \to \infty$, then you could actually prove the other hypothesis, namely that $\sum_n a_n$ converges. This isn't reasonable.
- See this question, which references a uniform convergence interchange-of-limits theorem: Proof explanation of some theorem about uniform convergence
Someone who has played around a lot with power series may have noticed the following:
Lemma: If $\{ b_n \}$ is a sequence with $b_n \to 0$ as $n \to \infty$, then the sequence of functions $$ g_k(x) = (1 - x) \sum_{n = 0}^k b_n x^n $$ converges uniformly on $(0, 1)$ to the function $$ g(x) = (1 - x) \sum_{n \ge 0} b_n x^n. $$ The lemma is very easy to prove, since the radius of convergence of $\sum\limits_{n \ge 0} b_n x^n$ is at least one, and so absolute convergence gives $$ |g(x) - g_k(x)| \le (1-x) \sum_{k < n} |b_n| x^n \le (1-x) \sum_{k < n} \epsilon x^n = \epsilon x^{k+1} (1-x) \sum_{0 \le n} x^n = \epsilon x^{k+1} < \epsilon $$ since $|b_n| < \epsilon$ for all $n > k$ once $k$ is large enough.
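The lemma's uniform bound is easy to observe numerically. A sketch with the illustrative sequence $b_n = 1/(n+1)$ (my choice), truncating the infinite series at a large $N$:

```python
# Check the lemma's bound: for b_n = 1/(n+1) (an illustrative sequence
# tending to 0), the error (1-x)*|sum_{n>k} b_n x^n| stays below
# sup_{n>k} b_n = 1/(k+2), uniformly in x on (0, 1).
N = 5000                                  # truncation standing in for the tail

def err(x, k):
    # (1 - x) * |sum_{n=k+1}^{N} b_n x^n|
    return (1 - x) * abs(sum(x ** n / (n + 1) for n in range(k + 1, N + 1)))

for k in (10, 100):
    bound = 1.0 / (k + 2)                 # sup_{n > k} |b_n| = b_{k+1}
    for x in (0.1, 0.5, 0.9, 0.99, 0.999):
        assert err(x, k) <= bound
```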
You'd like to coerce the partial sums for $f(x)$ closer to the form in the lemma. There are a couple of ways to approach it:
Approach #1: Use Summation By Parts \begin{equation*}\begin{aligned} f_k(x) & = \sum_{n = 0}^k a_n x^n \\ & = s_k x^{k+1} + \sum_{n = 0}^k s_n \left( x^n - x^{n+1} \right) \\ & = s_k x^{k+1} + (1 - x) \sum_{n = 0}^k s_n x^n \end{aligned}\end{equation*}
Approach #2: Multiply by $1 = (1 - x)(1 + x + x^2 + \cdots)$ \begin{equation*}\begin{aligned} f_k(x) & = \sum_{n = 0}^k a_n x^n \\ & = (1 - x)(1 + x + x^2 + \cdots) \sum_{n = 0}^k a_n x^n \\ & = (1 - x) \left( \sum_{n = 0}^k s_n x^n + \sum_{n \ge k + 1} s_k x^n \right) \\ & = (1 - x) \sum_{n = 0}^k s_n x^n + s_k (1 - x) \sum_{n \ge k + 1} x^n \\ & = (1 - x) \sum_{n = 0}^k s_n x^n + s_k x^{k+1} (1 - x)(1 + x + x^2 + \cdots) \\ & = s_k x^{k+1} + (1 - x) \sum_{n = 0}^k s_n x^n \end{aligned}\end{equation*}
That was the main trick, and either way we got $$ f_k(x) = s_k x^{k+1} + (1 - x) \sum_{n = 0}^k s_n x^n $$ We have made progress.
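The identity is exact, so it can be verified term by term for any sample coefficients; here is a sketch with the illustrative choice $a_n = (-1)^n/(n+1)$:

```python
# Check f_k(x) = s_k x^{k+1} + (1 - x) * sum_{n<=k} s_n x^n for a sample
# coefficient sequence (illustrative choice: a_n = (-1)^n/(n+1)).
k = 40
a = [(-1) ** n / (n + 1) for n in range(k + 1)]
s = []                                   # partial sums s_n = a_0 + ... + a_n
tot = 0.0
for an in a:
    tot += an
    s.append(tot)

for x in (0.1, 0.5, 0.9, 0.99):
    lhs = sum(an * x ** n for n, an in enumerate(a))
    rhs = s[k] * x ** (k + 1) + (1 - x) * sum(sn * x ** n for n, sn in enumerate(s))
    assert abs(lhs - rhs) < 1e-12
```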
But now the issue is that in general it's not the case that $s_n \to 0$ as $n \to \infty$.
But it is the case that $(s_n - s) \to 0$, so:
\begin{equation*}\begin{aligned} f_k(x) & = s_k x^{k+1} + (1 - x) \sum_{n = 0}^k s_n x^n \\ & = s_k x^{k+1} + (1 - x) \sum_{n = 0}^k s x^n + (1 - x) \sum_{n = 0}^k (s_n - s) x^n \\ & = s_k x^{k+1} + (1 - x) s \sum_{n = 0}^k x^n + (1 - x) \sum_{n = 0}^k (s_n - s) x^n \\ & = s_k x^{k+1} + s (1 - x) \sum_{n = 0}^k x^n + (1 - x) \sum_{n = 0}^k (s_n - s) x^n \\ & = s_k x^{k+1} + s (1 - x^{k + 1})+ (1 - x) \sum_{n = 0}^k (s_n - s) x^n \\ & = \Bigg\{ s + (s_k - s) x^{k+1} \Bigg\} + \Bigg\{ (1 - x) \sum_{n = 0}^k (s_n - s) x^n \Bigg\} \\ & = \Bigg\{ \text{(A)} \Bigg\} + \Bigg\{ \text{(B)} \Bigg\} \end{aligned}\end{equation*} We are in very good shape now, since (A) is obviously uniformly convergent, and (B) is uniformly convergent by the lemma.
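The final decomposition into (A) and (B) is likewise an exact identity, so it too can be checked numerically. A sketch with the illustrative coefficients $a_n = (-1)^n/(n+1)$ (my choice, so the limit of the partial sums is $s = \log 2$):

```python
# Check the decomposition f_k(x) = (A) + (B), i.e.
# f_k(x) = { s + (s_k - s) x^{k+1} } + { (1-x) * sum_{n<=k} (s_n - s) x^n },
# with the illustrative coefficients a_n = (-1)^n/(n+1), so s = log 2.
import math

k = 50
a = [(-1) ** n / (n + 1) for n in range(k + 1)]
partial = []                             # partial sums s_n
tot = 0.0
for an in a:
    tot += an
    partial.append(tot)
s = math.log(2)                          # limit of the partial sums

for x in (0.2, 0.7, 0.95):
    fk = sum(an * x ** n for n, an in enumerate(a))
    A = s + (partial[k] - s) * x ** (k + 1)
    B = (1 - x) * sum((pn - s) * x ** n for n, pn in enumerate(partial))
    assert abs(fk - (A + B)) < 1e-12
```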