Proof of the chain rule in calculus
I was comparing my own attempt to prove the chain rule with the proof given in Spivak's book, but they seem to be rather different. Please tell me if I'm wrong or missing something. I would really appreciate any comments. Thank you so much.
Theorem:
Let $I$ and $J$ be open intervals in $\mathbb{R}$. Let $a\in I$ and let $f:I\longrightarrow J$ and $g:J\longrightarrow \mathbb{R}$ be functions. Suppose that $f$ is differentiable at $a$, and that $g$ is differentiable at $f(a)$. Then $g\circ f$ is differentiable at $a$ and $(g\circ f)'(a)=g'(f(a))f'(a)$.
Idea and observations: We want to find $\lim_{h\to 0}\frac{(g\circ f)(a+h)-(g\circ f)(a)}{h}= \lim_{h\to 0}\frac{g(f(a+h))-g(f(a))}{h}$, which seems to be the same as $\lim_{h\to 0}\frac{g(f(a+h))-g(f(a))}{f(a+h)-f(a)}\cdot \frac{f(a+h)-f(a)}{h}=g'(f(a))\cdot f'(a)$. The problem here, as Spivak points out, is that $f(a+h)$ may equal $f(a)$ for some values of $h$, and then the first quotient is not defined for those values.
To handle this, Spivak defines a new function $\phi$ such that $\phi(h)=\frac{g(f(a+h))-g(f(a))}{f(a+h)-f(a)}$ if $f(a+h)\neq f(a)$ and $\phi(h)=g'(f(a))$ otherwise. Then he proves that $\phi$ is continuous at $0$ by using the $\epsilon$-$\delta$ definition, and since $\frac{(g\circ f)(a+h)-(g\circ f)(a)}{h}=\phi(h)\cdot \frac{f(a+h)-f(a)}{h}$ for $h\neq 0$, the result follows.
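To see why this factorization survives the bad values of $h$, here is a short check in the notation of the theorem above, taking $\phi(h)=g'(f(a))$ whenever $f(a+h)=f(a)$. For $h\neq 0$ the two cases give

$$\frac{g(f(a+h))-g(f(a))}{h}=\begin{cases}\dfrac{g(f(a+h))-g(f(a))}{f(a+h)-f(a)}\cdot\dfrac{f(a+h)-f(a)}{h} & \text{if } f(a+h)\neq f(a),\\[1ex] 0=g'(f(a))\cdot\dfrac{f(a+h)-f(a)}{h} & \text{if } f(a+h)=f(a).\end{cases}$$

In both cases the right-hand side equals $\phi(h)\cdot\frac{f(a+h)-f(a)}{h}$, so once $\lim_{h\to 0}\phi(h)=g'(f(a))$ is known, the product rule for limits gives $(g\circ f)'(a)=g'(f(a))\cdot f'(a)$.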
My approach is rather different: it uses the fact that if $\lim_{x\to a }f=b$ and $\lim_{x\to b} g=l$ then $\lim_{x\to a}g\circ f=l$. (This implies at the same time that $f'(a)=\lim_{h\to 0} \frac{f(a+h)-f(a)}{h}=\lim_{h\to a}\frac{f(h)-f(a)}{h-a}$.)
Proof:
Since $f$ is differentiable at $a$, $f$ is continuous at $a$, so $\lim_{h\to a}f(h)=f(a)$. Since $\lim_{h\to 0}(a+h)=a$, it follows that
$1)$ $\lim_{h\to 0}f(a+h)=f(a)$.
Now, by hypothesis,
$2)$ $ \lim_{h\to f(a)}\frac{g(h)-g(f(a))}{h-f(a)}=g'(f(a))$, and
$3)$ $\lim_{h\to 0}\frac{f(a+h)-f(a)}{h}=f'(a)$.
Therefore, from $1)$ and $2)$, $\lim_{h\to 0}\frac{g(f(a+h))-g(f(a))}{f(a+h)-f(a)}=g'(f(a))$.
Finally, $(g\circ f)'(a)=\lim_{h\to 0}\frac{(g\circ f)(a+h)-(g\circ f)(a)}{h}=\lim_{h\to 0}\frac{g(f(a+h))-g(f(a))}{h}=\lim_{h\to 0}\frac{g(f(a+h))-g(f(a))}{f(a+h)-f(a)}\cdot \frac{f(a+h)-f(a)}{h}=g'(f(a))\cdot f'(a)$.
Solution 1:
In case an example helps: The function $\phi:\mathbf{R} \to \mathbf{R}$ defined by $$\phi(x) = \begin{cases} x^2 \sin(1/x) & \text{if } x \neq 0, \\ 0 & \text{if } x = 0, \end{cases}$$ is differentiable (everywhere; $\phi'(0) = 0$ from the difference quotient definition), and satisfies $\phi(1/(n\pi)) = 0$ for every non-zero integer $n$. Any proof of the chain rule must accommodate the existence of functions like this.
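Both claims can be checked directly from the definition, using nothing beyond $|\sin| \le 1$ and $\sin(n\pi) = 0$. For $h \neq 0$,

$$\left|\frac{\phi(h) - \phi(0)}{h}\right| = \left|h \sin(1/h)\right| \le |h| \to 0 \quad \text{as } h \to 0,$$

so $\phi'(0) = 0$, and for every non-zero integer $n$,

$$\phi\left(\frac{1}{n\pi}\right) = \frac{1}{n^2\pi^2}\sin(n\pi) = 0 = \phi(0).$$

Since $1/(n\pi) \to 0$ as $n \to \infty$, every neighborhood of $0$ contains points $h \neq 0$ with $\phi(h) = \phi(0)$.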
Taking $f = \phi$, and $g$ an arbitrary differentiable function, the equation $$\lim_{h \to 0} \frac{g\bigl(f(a+h)\bigr) - g\bigl(f(a)\bigr)}{f(a+h) - f(a)} = g'\bigl(f(a)\bigr)$$ is (technically) false at $a = 0$ because the quotient on the left-hand side is undefined for $h = 1/(n\pi)$. That is, in every neighborhood $U$ of $h = 0$, the quotient on the left fails to be defined at some point of $U$.
(A function $f$ that is constant in some neighborhood of $a$ also "breaks" the preceding equality; the function $\phi$ above shows that it's not sufficient merely to handle constant functions separately.)
Solution 2:
I have not read Spivak's proof, but the point he mentions is important. Another approach (as given in Hardy's A Course of Pure Mathematics) is to consider two cases:
1) $f'(a) \neq 0$. This will ensure that $f(a + h) - f(a) \neq 0$ for all $h$ satisfying $0 < |h| < \delta$ for some $\delta > 0$ and then the proof proceeds without any problem.
2) $f'(a) = 0$. In this case we may or may not have $f(a + h) - f(a) = 0$ for values of $h$ satisfying $0 < |h| < \delta$. If $f(a + h) - f(a) = 0$, then $g(f(a + h)) - g(f(a)) = 0$ and hence the quotient $\dfrac{g(f(a + h)) - g(f(a))}{h} = 0$. And if $f(a + h) \neq f(a)$ then $$\dfrac{g(f(a + h)) - g(f(a))}{h} = \dfrac{g(f(a + h)) - g(f(a))}{f(a + h) - f(a)}\cdot \dfrac{f(a + h) - f(a)}{h},$$ where the first factor is bounded (because $\lim_{y \to f(a)}\dfrac{g(y) - g(f(a))}{y - f(a)} = g'(f(a))$ exists) and the second factor tends to $f'(a) = 0$, so that in either case the ratio $\{g(f(a + h)) - g(f(a))\}/h \to 0$ as $h \to 0$. Since $g'(f(a))f'(a) = 0$ here, the chain rule is established in this case.
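The two facts used above without proof can be filled in as follows. This is only a sketch, and the constants $\delta$, $\eta$ and $M$ are names introduced here for illustration. In case 1), apply the definition of the limit with $\varepsilon = |f'(a)|/2$: there is $\delta > 0$ such that for $0 < |h| < \delta$,

$$\left|\frac{f(a+h) - f(a)}{h} - f'(a)\right| < \frac{|f'(a)|}{2} \quad\Longrightarrow\quad \left|\frac{f(a+h) - f(a)}{h}\right| > \frac{|f'(a)|}{2} > 0,$$

so $f(a + h) \neq f(a)$ for such $h$. In case 2), since the difference quotient of $g$ at $f(a)$ has a limit, there are $\eta > 0$ and $M \geq 0$ with $\left|\dfrac{g(y) - g(f(a))}{y - f(a)}\right| \le M$ whenever $0 < |y - f(a)| < \eta$. Continuity of $f$ at $a$ gives $|f(a+h) - f(a)| < \eta$ for all sufficiently small $|h|$, and then in both subcases

$$\left|\frac{g(f(a+h)) - g(f(a))}{h}\right| \le M \left|\frac{f(a+h) - f(a)}{h}\right| \to M \cdot |f'(a)| = 0 \quad \text{as } h \to 0.$$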