Proof that gradient is orthogonal to level set

When we proved that the gradient of a function $f:\mathbb{R}^n\rightarrow \mathbb{R}$ is orthogonal to its level sets $f(\vec{x}) = c$ for a constant $c$, my professor stated explicitly that the implicit function theorem (IFT) is needed for the proof, without giving a clear reason why. In every other proof of the theorem I've seen, however, the IFT was neither used nor even mentioned. This has got me wondering why exactly the IFT was invoked in our proof, and whether it's needed at all.

All of the proofs start by taking an arbitrary differentiable curve, parametrized by $t$, that lies in the level set and passes through the point of interest $\vec{a}$. The chain rule guarantees that the tangent to the curve at $\vec{a}$ is orthogonal to the gradient at $\vec{a}$. Since this holds for every such curve, we can say that the gradient is orthogonal to the surface. I suspect the IFT is needed to prove that such curves actually exist, but I'm not sure exactly how it does that.

If anyone can shed some light on the subject that would be great. Thanks.

If $\nabla f({\bf x}_0) \neq {\bf 0}$, then the Jacobian of $f$ (i.e. $\nabla f$) has maximal rank at ${\bf x}_0$. This means the implicit function theorem can be applied, so that near ${\bf x}_0$ the level set $\{ {\bf x} \in \mathbb{R}^{n} \,|\, f({\bf x})=c \}$ is an $(n-1)$-dimensional submanifold of $\mathbb{R}^n$. In other words, there is a diffeomorphism between a neighborhood of ${\bf x}_0$ in the level set and an open set in $\mathbb{R}^{n-1}$.

At this point, we know the level set has a well-defined tangent space at ${\bf x}_0$. There are $n-1$ curves in the level set through ${\bf x}_0$ whose tangent vectors there are linearly independent, so they span the tangent space. Then we can apply the standard argument to each of these curves. Using the chain rule, we have $f({\bf r}(t))=c$ $\Rightarrow$ $\nabla f({\bf r}(t)) {\bf \cdot} {\bf r}'(t) = 0$. So the gradient is orthogonal to each of these tangents and thus is orthogonal to the level set.

So you are correct. The implicit function theorem is being used to guarantee that the curves we need actually exist.

Edit: A few more details.

Take a point on the level surface, say ${\bf x}_0 = (x_1,\dots,x_{n-1},y_0)=({\bf z}_0,y_0)$. Suppose that $\nabla f({\bf x}_0) \not=0$. For convenience, suppose that the last component of the gradient is non-zero.

Then there exist a region $D$ in $\mathbb{R}^{n-1}$ of points "close to" ${\bf z}_0$ and a differentiable function $g$ from $D$ to $\mathbb{R}$ with $g({\bf z}_0)=y_0$ such that $f(t_1,\dots,t_{n-1},g(t_1,\dots,t_{n-1}))=c$ for all $(t_1,\dots,t_{n-1})$ in $D$. [This is the implicit function theorem in action. It allowed us to "solve" for the last variable in terms of the others.] Now, for $i=1,\dots,n-1$ and $t$ close to $x_i$, we can define ${\bf r}_i(t)=(x_1,\dots,x_{i-1},t,x_{i+1},\dots,x_{n-1},g(x_1,\dots,x_{i-1},t,x_{i+1},\dots,x_{n-1}))$. We have ${\bf r}_i(x_i)={\bf x}_0$ and $f({\bf r}_i(t))=c$. This gives us $n-1$ curves on our level set. Their tangent vectors at ${\bf x}_0$ are ${\bf r}_i'(x_i) = {\bf e}_i + \frac{\partial g}{\partial t_i}({\bf z}_0)\,{\bf e}_n$, where ${\bf e}_1,\dots,{\bf e}_n$ is the standard basis, and these are linearly independent.
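As a sanity check, the curves ${\bf r}_i$ can be written down explicitly in a simple case. The sketch below is only an illustration under assumptions not in the argument above: it takes $f(x,y,z)=x^2+y^2+z^2$ with $c=1$, so that $g(x,y)=\sqrt{1-x^2-y^2}$ is known in closed form on the upper hemisphere, picks the base point ${\bf z}_0=(0.2,0.3)$, and approximates the tangents by central differences. The names `r`, `tangent` and `dots` are made up for this example.

```python
import math

C = 1.0  # assumed level value: the unit sphere x^2 + y^2 + z^2 = 1

def f(x, y, z):
    return x * x + y * y + z * z

def grad_f(x, y, z):
    return (2.0 * x, 2.0 * y, 2.0 * z)

def g(x, y):
    # Explicit solution of f(x, y, g(x, y)) = C on the upper hemisphere.
    return math.sqrt(C - x * x - y * y)

def r(i, t, z0):
    # Curve r_i(t): vary the i-th coordinate of z0, keep the other one fixed,
    # and let g supply the last coordinate so the point stays on the level set.
    z = list(z0)
    z[i] = t
    return (z[0], z[1], g(z[0], z[1]))

def tangent(i, z0, h=1e-6):
    # Central-difference approximation of r_i'(t) at t = z0[i].
    ahead = r(i, z0[i] + h, z0)
    behind = r(i, z0[i] - h, z0)
    return tuple((a - b) / (2.0 * h) for a, b in zip(ahead, behind))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

z0 = (0.2, 0.3)          # assumed base point in D
x0 = r(0, z0[0], z0)     # the corresponding point on the level set
n = grad_f(*x0)
dots = [dot(n, tangent(i, z0)) for i in range(2)]
print(dots)  # both entries should vanish up to finite-difference error
```

For a general $f$ there is no closed form for $g$; the IFT only guarantees that it exists, which is exactly the point of the argument.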

Let $f$ be a scalar-valued function of $N$ real variables, continuously differentiable around a point $p$. Let $n := (\nabla f)(p)$ denote the gradient of $f$ at $p$. Consider the function $F(x,t) := f(x+tn)$, where $t$ is real. Assume that $f(p)=c$ for some fixed real $c$. Therefore also $F(p,0)=c$.

Assume that the gradient $n$ of $f$ at $p$ is nonzero. Then the partial derivative $\partial F/\partial t\,(p,0)=\nabla f(p) \cdot n = n \cdot n$ is nonzero. By the implicit function theorem there exists a function $g$, continuously differentiable in an open ball $B$ around $p$, such that $F(x,g(x))=c$ for all $x \in B$. Using the appropriate version of the implicit function theorem, we may further assume $g(p)=0$.

Let $u$ be any vector perpendicular to $n$. Define the path $P(s) := p+su + g(p+su)n$, which satisfies $P(0)=p$. As can be verified, $dP/ds(0)=u$, the point $p+su$ is in the ball $B$ for small values of $s$ (so that $g(p+su)$ is defined), and $f(P(s))=F(p+su,g(p+su))=c$ by construction.
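One way to carry out the verification of $dP/ds(0)=u$, using only the definitions above: differentiate $F(x,g(x)) = f(x+g(x)n) = c$ with respect to $x$ and evaluate at $x=p$, where $g(p)=0$ and $\nabla f(p)=n$. This gives

$$n + (n\cdot n)\,\nabla g(p) = 0 \quad\Longrightarrow\quad \nabla g(p) = -\frac{n}{n\cdot n}.$$

The chain rule applied to $P(s) = p+su+g(p+su)n$ then yields

$$\frac{dP}{ds}(0) = u + \big(\nabla g(p)\cdot u\big)\,n = u - \frac{n\cdot u}{n\cdot n}\,n = u,$$

since $n\cdot u = 0$.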

Thus we have shown: given that the gradient is nonzero, for any nonzero vector $u$ perpendicular to it, we can construct a differentiable nonconstant path in the level set of $f$ that passes through $p$ with velocity $u$. In particular, given any two perpendicular such vectors, the corresponding paths cross perpendicularly at $p$. This shows that the level set is $(N-1)$-dimensional near $p$, and orthogonal to the gradient.
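The construction can also be tried out numerically. The following is a rough sketch under assumptions that are not part of the proof: it uses the concrete function $f(x)=x_1^2+2x_2^2+3x_3^2$ with $p=(1,1,1)$, and it approximates $g(x)$ by running Newton's method on $t \mapsto f(x+tn)-c$ starting from $t=0$ (the IFT itself only asserts that $g$ exists). The helper names `add`, `dot`, `g` and `P` are chosen for this example.

```python
def f(x):
    return x[0] ** 2 + 2.0 * x[1] ** 2 + 3.0 * x[2] ** 2

def grad_f(x):
    return (2.0 * x[0], 4.0 * x[1], 6.0 * x[2])

def add(a, b, scale=1.0):
    # Componentwise a + scale * b.
    return tuple(ai + scale * bi for ai, bi in zip(a, b))

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

p = (1.0, 1.0, 1.0)   # assumed base point
c = f(p)              # level value through p
n = grad_f(p)         # gradient at p, nonzero here

def g(x, tol=1e-14, max_iter=50):
    # Solve F(x, t) = f(x + t n) = c for t near 0 by Newton's method.
    t = 0.0
    for _ in range(max_iter):
        y = add(x, n, t)
        step = (f(y) - c) / dot(grad_f(y), n)  # dF/dt = grad f(y) . n
        t -= step
        if abs(step) < tol:
            break
    return t

def P(s, u):
    # The path P(s) = p + s u + g(p + s u) n from the argument above.
    x = add(p, u, s)
    return add(x, n, g(x))

u = (1.0, 1.0, -1.0)  # perpendicular to n = (2, 4, 6)
h = 1e-4
velocity = tuple((a - b) / (2.0 * h) for a, b in zip(P(h, u), P(-h, u)))
print(f(P(0.1, u)) - c)  # should be zero up to rounding
print(velocity)          # should be close to u
```

Newton's method is only a convenient stand-in for $g$ here; any root finder for $t \mapsto f(x+tn)-c$ near $t=0$ would serve the same purpose.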

ps. One could set up a local parameterization of the level set as follows. Define the vector-valued function $F(x, (t, y)) = [f(x + tn); y - U (x-p)]$, where $U$ is an $(N-1) \times N$ matrix whose rows form an orthonormal basis for the orthogonal complement of $n$, and $y$ is an $(N-1)$-vector. The IFT gives a function $(t, y) = G(x)$ which is continuously differentiable in a neighborhood of $p$. Its Jacobian at $x=p$ can be checked to be invertible. Thus, $G^{-1}$ exists (locally), and the Jacobian of $y \mapsto G^{-1}(t, y)$ is injective. Moreover, the derivative of $y \mapsto f(G^{-1}(t, y))$ vanishes, showing that the level set has dimension $(N-1)$.

pps. The same game can probably be played with $F((t, y), x) = [f(x + tn); y - U (x-p)]$.
