Better Proofs Than Rudin's For The Inverse And Implicit Function Theorems

I am finding Rudin's proofs of these theorems very non-intuitive and difficult to recall. I can understand and follow both as I work through them, but if you were to ask me a week later to prove one or the other, I couldn't do it.

For instance, the use of a contraction mapping in the inverse function theorem seems to require one to memorize, at the very least, a non-obvious (at least to me) function (viz. $\phi(\mathbf{x}) = \mathbf{x} + \mathbf{A}^{-1}(\mathbf{y}-\operatorname{f}(\mathbf{x}))$) and constant (viz. $\lambda^{-1} = 2 \Vert \mathbf{A}^{-1}\Vert$), where $\mathbf{A}$ is the differential of $\operatorname{f}$ at $\mathbf{a}$.
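(One observation that at least makes $\phi$ less mysterious to me: its fixed points are exactly the solutions of $\operatorname{f}(\mathbf{x}) = \mathbf{y}$, since $\mathbf{A}^{-1}$ is invertible:

$$\phi(\mathbf{x}) = \mathbf{x} \iff \mathbf{A}^{-1}\big(\mathbf{y}-\operatorname{f}(\mathbf{x})\big) = \mathbf{0} \iff \operatorname{f}(\mathbf{x}) = \mathbf{y},$$

so inverting $\operatorname{f}$ is recast as finding a fixed point, which is what the contraction principle delivers. But that still doesn't tell me how I'd come up with $\phi$ in the first place.)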

The implicit function theorem proof, while not as bad, also requires one to construct a new function without ever hinting as to what the motivation is.

I searched the previous questions on this site and haven't found this addressed, so I figured I'd ask. I did find this proof to have a much more intuitive approach to the inverse function theorem, but would like to see what proofs are preferred by others.


Solution 1:

Suppose you want to find the inverse of the mapping $F: \mathbb{R}^n \rightarrow \mathbb{R}^n$ near a point $x_o$ where $F'(x_o)$ is invertible. The derivative (Jacobian matrix) gives an approximate form of the map: $F(x) = F(x_o)+F'(x_o)(x-x_o)+\eta$. If you set $y = F(x)$ and ignore the error term $\eta$, then solving for $x$ gives the first approximation to the inverse mapping: $$ x = x_o+[F'(x_o)]^{-1}(y-F(x_o)). $$ Then you iterate. The technical details are merely there to ensure that this iteration does indeed converge to the inverse mapping; at the start, it's just using the derivative to linearize the problem.
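To make the iteration concrete, here is a small numerical sketch (my own illustration, with a made-up example map, not anything from Rudin):

```python
import numpy as np

# Illustrative example map: a small perturbation of the identity on R^2,
# so F'(x_o) is certainly invertible near x_o = 0.
def F(x):
    return np.array([x[0] + 0.1 * np.sin(x[1]), x[1] + 0.1 * x[0] ** 2])

x_o = np.array([0.0, 0.0])

# Jacobian F'(x_o), approximated by forward differences for simplicity.
eps = 1e-7
J = np.column_stack([(F(x_o + eps * e) - F(x_o)) / eps for e in np.eye(2)])
J_inv = np.linalg.inv(J)

def F_inverse(y, iterations=50):
    # Fixed-point iteration of phi(x) = x + [F'(x_o)]^{-1} (y - F(x)):
    # a fixed point of phi satisfies F(x) = y exactly, and phi is a
    # contraction near x_o because F' varies slowly there.
    x = x_o.copy()
    for _ in range(iterations):
        x = x + J_inv @ (y - F(x))
    return x

target = np.array([0.2, -0.3])
x = F_inverse(F(target))  # recovers `target` up to numerical error
```

Note that the first pass through the loop, starting from $x_o$, is exactly the displayed linear approximation above; the later passes refine it.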

I don't know if this helps or not, but the approach is almost brute force: to invert $F(x)=y$, what do you do? You solve for $x$. We can't do that abstractly for $F$, so instead we solve the next best thing, the linearization. Then the beauty of the contraction mapping technique completes the argument.

Solution 2:

The other answer takes care of the Inverse Function Theorem, so I'll give "another" proof of the Implicit Function Theorem. (Note that I have inverted the order of $\mathbb{R}^n$ and $\mathbb{R}^m$ compared to Rudin's.)


Consider the commutative diagram

*(Commutative diagram: $F$ along the top, with the projection $\pi_2$ onto the second factor, so that $\pi_2 \circ F = f$.)*

where $F(x,y)=\big(x,f(x,y)\big)$. By the Inverse Function Theorem, it follows that locally we have

*(The same diagram with the local inverse $F^{-1}$ added, together with an inclusion $i$ and the projection $\pi_2$, yielding $g = \pi_2 \circ F^{-1} \circ i$.)*

The commutativity of the diagram is immediate from the definition of $F$. Since $F$ is invertible in the given neighbourhood, the function $g = \pi_2 \circ F^{-1} \circ i$ exists, and, being a composition of $C^1$ maps, $g$ is $C^1$. The rest of the Implicit Function Theorem follows by composing the diagonal arrows.


As for why we take the function $F$ as we do: the proof is basically about finding a function $F$ that makes everything commute, and this choice is the natural one (actually, the practically forced one) to make that happen.