What is the 'implicit function theorem'?

Please give me an intuitive explanation of the 'implicit function theorem'. I have read bits and pieces from textbooks, but they look too confusing; in particular, I do not understand why the Jacobian matrix is used to state this theorem.


Let's use a simple example with only two variables. Assume there is some relation $f(x,y)=0$ between these variables (which defines a general curve in 2D). An example would be $f(x,y) = x^2+y^2-1$, whose zero set is the unit circle in $\mathbb{R}^2$. Now suppose you want to find the slope of the tangent to this curve at some point $(x_0,y_0)$ on the curve [i.e., with $f(x_0,y_0)=0$].

What you can do is change $x$ a little bit, $x = x_0 + \Delta x$, and ask how $y$ changes ($y = y_0 + \Delta y$); remember that we are interested in points on the curve, with $f(x,y)=0$. Taylor expanding $f$ around $(x_0,y_0)$ and using $f(x_0,y_0)=0$ yields (to lowest order in $\Delta x$ and $\Delta y$) $$f(x,y)= \partial_x f(x_0,y_0)\, \Delta x + \partial_y f(x_0,y_0)\, \Delta y = 0.$$ Provided $\partial_y f(x_0,y_0) \ne 0$, the slope is thereby given by $$ \frac{\Delta y}{\Delta x} = - \frac{ \partial_x f(x_0,y_0)}{\partial_y f(x_0,y_0)}.$$ As $\Delta x \to 0$, the higher-order terms in the Taylor expansion (which we neglected) vanish, and $\frac{\Delta y}{\Delta x}$ becomes the slope of the curve implicitly defined via $f(x,y)=0$ at $(x_0,y_0)$.
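As a quick numerical sanity check (the point $(x_0, y_0) = (0.6, 0.8)$ is an arbitrary choice, not from the original answer), one can compare the implicit slope $-\partial_x f/\partial_y f$ on the unit circle with a finite-difference slope of the explicit branch $y = \sqrt{1-x^2}$:

```python
import math

# Unit circle: f(x, y) = x^2 + y^2 - 1 = 0
# Partial derivatives (computed by hand for this example):
def fx(x, y):
    return 2 * x

def fy(x, y):
    return 2 * y

# An arbitrary point on the circle with y0 > 0
x0 = 0.6
y0 = math.sqrt(1 - x0**2)  # ≈ 0.8

# Implicit slope: dy/dx = -f_x / f_y  (here ≈ -0.75)
implicit_slope = -fx(x0, y0) / fy(x0, y0)

# Finite-difference slope of the explicit branch y = sqrt(1 - x^2)
dx = 1e-6
explicit_slope = (math.sqrt(1 - (x0 + dx)**2) - math.sqrt(1 - x0**2)) / dx

print(implicit_slope, explicit_slope)
```

The two numbers agree up to the finite-difference error, which is the content of the limit $\Delta x \to 0$ above.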

More variables and higher-dimensional spaces can be treated similarly (using Taylor series in several variables); the condition $\partial_y f \ne 0$ then becomes invertibility of a block of the Jacobian matrix, which is why the theorem is usually stated in terms of the Jacobian. But the example above should provide you with enough intuition and insight to understand the 'implicit function theorem'.


The implicit function theorem really just boils down to this: if I can write down $m$ (sufficiently nice!) equations in $n + m$ variables, then, near any sufficiently nice solution point, there is a function of $n$ variables which gives me the remaining $m$ coordinates of nearby solution points. In other words, I can, in principle, solve those equations and get the last $m$ variables in terms of the first $n$ variables. But (!) in general this function is only valid on some small set and won't give you all the solutions either.

Here's a concrete example. Consider the equation $x^2 + y^2 = 1$. This is a single equation in two variables, and for a fixed point $(x_0, y_0)$ satisfying the equation with $y_0 \ne 0$, there is a function $f$ of $x$ such that $x^2 + f(x)^2 = 1$ for $x$ near $x_0$, and $f(x_0) = y_0$. (Explicitly, for $y_0 > 0$, $f(x) = \sqrt{1 - x^2}$, and for $y_0 < 0$, $f(x) = -\sqrt{1 - x^2}$.) Notice that the function doesn't give you all the solution points — but this isn't surprising, since the solution locus of this equation is a circle, which isn't the graph of any function. Nonetheless, I have basically solved the equation and written $y$ in terms of $x$.
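A small sketch of this branch selection (the helper name `solve_branch` is made up for illustration): the sign of $y_0$ picks the branch, and the theorem fails exactly where $\partial_y(x^2+y^2-1) = 2y_0$ vanishes.

```python
import math

def solve_branch(x0, y0):
    """Return the local solution y = f(x) of x^2 + y^2 = 1 near (x0, y0)."""
    if y0 == 0:
        # f_y = 2*y0 vanishes: the implicit function theorem does not apply
        raise ValueError("cannot solve for y near a point with y0 = 0")
    sign = 1.0 if y0 > 0 else -1.0
    return lambda x: sign * math.sqrt(1 - x**2)

f = solve_branch(0.6, 0.8)
print(f(0.6))                 # recovers y0 ≈ 0.8
print(0.6**2 + f(0.6)**2)     # stays on the circle, ≈ 1
```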


The other answers have done a really good job explaining the implicit function theorem in the setting of multivariable calculus. There is a generalization of the implicit function theorem which is very useful in differential geometry called the rank theorem.

Rank Theorem: Assume $M$ and $N$ are manifolds of dimension $m$ and $n$ respectively. If $F : M \to N$ is a smooth map, $p \in M$, and $F_{*} : T_qM \to T_{F(q)}N$ has rank $k$ for every $q$ in a neighborhood of $p$, then there are coordinates $(x^1, \dots, x^k, \dots, x^m)$ centered around $p$ and $(v^1, \dots, v^n)$ centered around $F(p)$ such that in these local coordinates, $F$ is given by $$ F(x^1, \dots, x^k, \dots, x^m) = (x^1, \dots, x^k, 0, \dots, 0).$$

Essentially, the rank theorem tells us that if the total derivative of $F$ in a neighborhood of $p$ has rank $k$, then locally around $p$, we can think of $F$ as a linear map with rank $k$.
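One way to make the rank condition concrete is to compute the rank of a numerical Jacobian; the map $F : \mathbb{R}^3 \to \mathbb{R}^2$ below is an arbitrary illustrative choice, not from the answer.

```python
import numpy as np

# An arbitrary smooth map F: R^3 -> R^2, F(x, y, z) = (x^2 + y, y + z)
def F(v):
    x, y, z = v
    return np.array([x**2 + y, y + z])

def jacobian(F, p, h=1e-6):
    """Numerical Jacobian of F at p via central differences."""
    p = np.asarray(p, dtype=float)
    cols = []
    for i in range(len(p)):
        e = np.zeros_like(p)
        e[i] = h
        cols.append((F(p + e) - F(p - e)) / (2 * h))
    return np.column_stack(cols)

p = np.array([1.0, 2.0, 3.0])
J = jacobian(F, p)          # [[2, 1, 0], [0, 1, 1]] up to rounding
rank = int(np.linalg.matrix_rank(J))
print(rank)
```

Here the rank is $2$ (maximal), and it stays $2$ in a neighborhood of $p$, so near $p$ the rank theorem lets us choose coordinates in which $F$ looks like the projection $(x^1, x^2, x^3) \mapsto (x^1, x^2)$.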

Here is an example of how to use the rank theorem to prove a version of the implicit function theorem in differential geometry.

Assume $M$ is a manifold of dimension $n+k$ and $N$ is a manifold of dimension $k$. Assume $\Theta : M \to N$ is a smooth map and $\Theta_{*} : T_pM \to T_{\Theta(p)}N$ has rank $k$ (i.e. $p$ is a regular point). Since having maximal rank is an open condition, it follows that $\Theta_*$ has rank $k$ in a neighborhood of $p$. By the rank theorem, there are coordinates $(x^1, \dots, x^{n+k})$ defined on an open set $U$ and centered around $p$, and coordinates centered around $\Theta(p)$, such that $$\Theta(x^1, \dots, x^{n+k}) = (x^1, \dots, x^k).$$ If we write $q = \Theta(p)$, then the above equation tells us that $$\Theta^{-1}(q) \cap U = \{ p' \in U : x^{1}(p') = \dots = x^{k}(p') = 0 \}$$ which implies that $ (x^{k+1}, \dots, x^{k+n}) : \Theta^{-1}(q) \cap U \to \mathbb{R}^n$ is an open topological embedding. This defines an $n$-dimensional smooth structure on $\Theta^{-1}(q) \cap U$, and in coordinates, the inclusion $\iota : \Theta^{-1}(q) \cap U \to U$ is given by $(x^{k+1}, \dots, x^{k+n}) \mapsto (0, \dots, 0, x^{k+1}, \dots, x^{k+n})$, which is smooth.

In summary, if $p$ is a regular point and $q = \Theta(p)$, then there are coordinates $(x^1, \dots, x^{n+k})$ for $M$ around $p$ such that $(x^{k+1}, \dots, x^{k+n})$ are coordinates for $\Theta^{-1}(q)$ around $p$ (once suitably restricted, of course). This is a differential geometer's formulation of the usual implicit function theorem.
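For a concrete instance of this statement, take $M = \mathbb{R}^3$, $N = \mathbb{R}$, and $\Theta(x,y,z) = x^2+y^2+z^2$ (so $n = 2$, $k = 1$); the level set $\Theta^{-1}(1)$ is the unit sphere. The sketch below (my own check, not from the answer) verifies numerically that $d\Theta$ has rank $1$ at sampled points of the sphere, so every such point is regular and the level set is a smooth $2$-manifold.

```python
import numpy as np

# Theta(x, y, z) = x^2 + y^2 + z^2; its differential at p is the row vector 2p.
def grad_theta(p):
    return 2 * np.asarray(p, dtype=float)

# Sample random points on the unit sphere Theta^{-1}(1)
rng = np.random.default_rng(0)
pts = rng.normal(size=(5, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)

# d(Theta) as a 1x3 matrix has rank 1 at every sampled point,
# since 2p is nonzero whenever |p| = 1
ranks = [int(np.linalg.matrix_rank(grad_theta(p).reshape(1, 3))) for p in pts]
print(ranks)
```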

EDIT: In the comments, David pointed out that the statement of the rank theorem above was wrong. The reason is that rank $\geq k$ is an open condition and rank $< k$ is a closed condition, which I didn't fully understand when I wrote this answer two years ago. Things should be fixed now.