Why is the gradient vector perpendicular to the tangent plane?

I know what the gradient vector $\nabla F$ is, and I know how to prove by calculation that it is orthogonal to the surface, but that proof is not intuitive.
In the particular case of a function of three variables, I want to know why the gradient vector is perpendicular to the surface. I am looking for an intuitive explanation rather than a formal one.


Solution 1:

Let $F:\mathbb{R}^3 \to \mathbb{R}$ be a function of three variables. Say we are looking at the surface defined by $F(x,y,z) = 0$. By definition, the total derivative of $F$ at a point $p = (a,b,c) \in \mathbb{R}^3$ is the best linear approximation to $F$ near $p$. In other words, with all partial derivatives evaluated at $p$,

$$dF\big|_p (\Delta x,\Delta y,\Delta z)= \dfrac{\partial F}{\partial x} \Delta x + \dfrac{\partial F}{\partial y} \Delta y +\dfrac{\partial F}{\partial z} \Delta z$$

and

$$F(a+\Delta x,b +\Delta y,c + \Delta z) \approx F(p)+dF\big|_p (\Delta x,\Delta y,\Delta z)$$

This is really what the multivariable derivative is all about. Now we can notice that

$$dF\big|_p (\Delta x,\Delta y,\Delta z)= \langle\dfrac{\partial F}{\partial x}, \dfrac{\partial F}{\partial y},\dfrac{\partial F}{\partial z}\rangle \cdot \langle \Delta x,\Delta y,\Delta z\rangle$$

Now defining $\nabla F = \langle\dfrac{\partial F}{\partial x}, \dfrac{\partial F}{\partial y},\dfrac{\partial F}{\partial z}\rangle$, we see that

$$dF\big|_p (\textbf{v})= \nabla F\cdot \textbf{v}$$

This one formula packages a lot of mathematics.

Hopefully this gives you some more intuition about what the gradient does. In words, to see how much $F(\textbf{p})$ changes when you move away a little bit to $\textbf{p}+\textbf{v}$, just take the dot product of $\nabla F\big|_p$ with $\textbf{v}$.
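To make this concrete, here is a minimal numerical sketch. The function $F(x,y,z) = x^2 + y^2 + z^2 - 1$, the point and the displacement are illustrative assumptions, not part of the argument above. It compares the actual change in $F$ with the prediction $\nabla F \cdot \textbf{v}$.

```python
# Hypothetical example function (an assumption for illustration):
# F(x, y, z) = x^2 + y^2 + z^2 - 1
def F(x, y, z):
    return x * x + y * y + z * z - 1.0

def grad_F(x, y, z):
    # Analytic gradient of the example F above.
    return (2.0 * x, 2.0 * y, 2.0 * z)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

p = (0.6, 0.0, 0.8)        # base point
v = (1e-3, -2e-3, 5e-4)    # small displacement

# Actual change F(p + v) - F(p) versus the linear prediction grad F . v
actual_change = F(p[0] + v[0], p[1] + v[1], p[2] + v[2]) - F(*p)
predicted_change = dot(grad_F(*p), v)

print(actual_change, predicted_change)
```

The two numbers should agree up to a term of order $|\textbf{v}|^2$, which is the sense in which the dot product is the "best linear approximation" to the change.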

Armed with this understanding of the gradient, we can see quite intuitively why it must be perpendicular to the level surfaces of $F$.

If $p$ is a point of the surface $F(x,y,z) = 0$, then the tangent vectors $\textbf{v}$ to the surface must satisfy $dF\big|_p(\textbf{v}) = 0$, because moving in a direction along the surface should not change the value of $F$ to first order, since the value of $F$ is constant on the surface. Translating this into a gradient statement, we see that $\nabla F\big|_p \cdot \textbf{v} = 0$ for each tangent vector $\textbf{v}$ to the surface.

This just says that $\nabla F\big|_p$ is perpendicular to the tangent plane to the surface at $p$!
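As a rough numerical check of this, one can take curves lying in a surface, differentiate them to get tangent vectors, and dot those with the gradient. The sketch below assumes the unit sphere $F(x,y,z) = x^2+y^2+z^2-1 = 0$ and a family of great circles through $p = (1,0,0)$; both are illustrative choices, and the helper names are hypothetical.

```python
import math

def grad_F(x, y, z):
    # Gradient of the example F(x, y, z) = x^2 + y^2 + z^2 - 1 (an assumption).
    return (2.0 * x, 2.0 * y, 2.0 * z)

def curve(t, tilt):
    # A great circle on the unit sphere passing through p = (1, 0, 0) at t = 0.
    # `tilt` selects which great circle, hence which tangent direction at p.
    return (math.cos(t),
            math.sin(t) * math.cos(tilt),
            math.sin(t) * math.sin(tilt))

def velocity(t, tilt, h=1e-6):
    # Central-difference approximation of the curve's tangent vector.
    a = curve(t + h, tilt)
    b = curve(t - h, tilt)
    return tuple((ai - bi) / (2.0 * h) for ai, bi in zip(a, b))

p = curve(0.0, 0.0)
dots = []
for tilt in (0.0, 0.7, 1.5, 2.9):
    v = velocity(0.0, tilt)  # a tangent vector to the sphere at p
    dots.append(sum(g * vi for g, vi in zip(grad_F(*p), v)))

print(dots)
```

Each entry of `dots` should be zero up to rounding, one for each tangent direction tried.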

Solution 2:

I've been thinking about this for a while now (well, I first learned of this fact about 15 years ago, so I'd say a while), and I just had a bit of a thought.

First things first: we're discussing a tangent plane, so we're talking about a surface; let's just call it $z = f(x,y)$. In this case, our function will be $F(x,y,z) = f(x,y) - z$, the surface is $F = 0$, and our gradient will look like $\nabla F = (\partial f/ \partial x, \partial f / \partial y, -1)$.

So what does the tangent plane look like? Say we are at a point $p_0 = (x_0, y_0, f(x_0,y_0))$. Given the definition of a derivative, any vector that lies in our tangent plane will start from $p_0$, then go out in some direction. Say we increase $x$ by $t$ and leave $y$ constant; how much does $z$ change? To first order, $t\cdot\partial f/\partial x$ of course (if this is not clear, go meditate on the definition of the [partial] derivative for a while). So this vector will be in the direction $(t, 0, t\cdot \partial f/\partial x)$. Similarly, increasing $y$, we go off in the direction $(0, t, t\cdot \partial f/\partial y)$. Both of these are orthogonal to $\nabla F$, since the dot products are $t\,\partial f/\partial x - t\,\partial f/\partial x = 0$ and $t\,\partial f/\partial y - t\,\partial f/\partial y = 0$. So far so good. These two vectors certainly span a two-dimensional space, so they take care of our whole tangent plane, and our gradient really is orthogonal to it.

If you don't like the form $z = f(x,y)$, then consult the implicit function theorem, and just rotate the space a little bit for those pesky places where $\partial F/\partial z$ vanishes (the theorem still needs $\nabla F \neq 0$). I don't know if this is the answer you need, but this is the explanation I'm satisfied with. Happy to answer any follow-up questions, or receive any criticism.
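As a quick sanity check of the two tangent directions, here is a worked instance with an illustrative choice of $f$ that is not part of the argument above: take $f(x,y) = x^2 + y^2$ at $(x_0, y_0) = (1, 2)$, so $p_0 = (1, 2, 5)$. Then $$\nabla F = (2x_0,\ 2y_0,\ -1) = (2, 4, -1),$$ and with $t = 1$ the two tangent directions are $(1, 0, 2)$ and $(0, 1, 4)$. The dot products are $$(2,4,-1)\cdot(1,0,2) = 2 + 0 - 2 = 0, \qquad (2,4,-1)\cdot(0,1,4) = 0 + 4 - 4 = 0,$$ so in this example the gradient is orthogonal to both spanning vectors, and hence to the whole tangent plane.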

Solution 3:

$\newcommand{\Reals}{\mathbf{R}}\newcommand{\Brak}[1]{\left\langle #1\right\rangle}$Arguably this is not a question of calculus, but of linear algebra. Let $x = (x_{1}, \dots, x_{n})$ denote Cartesian coordinates on $\Reals^{n}$. A function $\phi:\Reals^{n} \to \Reals$ is linear if and only if there exists a vector $a = (a_{1}, \dots, a_{n})$ such that $$ \phi(x) = a_{1} x_{1} + \dots + a_{n} x_{n} = \sum_{i=1}^{n} a_{i} x_{i} = \Brak{a, x}. $$ If $a \neq 0$, then $\phi(x) = 0$ if and only if $\Brak{a, x} = 0$, if and only if $x$ is orthogonal to $a$. That is, $\phi$ decomposes $\Reals^{n}$ into the orthogonal direct sum of the line spanned by $a$ and the kernel of $\phi$ (a.k.a. the hyperplane orthogonal to $a$).

This orthogonality is "why" the gradient is orthogonal to the level set.

Steven Gubkin's excellent answer contains a proper explanation. In the spirit of geometric intuition: Let $F:\Reals^{n} \to \Reals$ be continuously-differentiable, $p$ a point with $\nabla F(p) \neq 0$, and $\Sigma$ the level set of $F$ through $p$.

"Zooming in" on $\Reals^{n}$ at the point $p$ causes $F$ to "look more and more like" its first-order approximation $$ L_{p}(x) = F(p) + \Brak{\nabla F(p), x - p}, $$ and causes $\Sigma$ to "look more and more like" its tangent space at $p$, a.k.a., the level set of $L_{p}$ through $p$, a.k.a., the kernel of $\phi(x) = \Brak{\nabla F(p), x}$ translated by $p$. Since $\nabla F(p)$ is orthogonal to $\ker \phi$, "the gradient is orthogonal to the level set".

(On a tangent, as it were, squinting at this picture "explains" the implicit function theorem for real-valued functions: If $\nabla F(p) \neq 0$, the level set of the first-order approximation of $F$ at $p$ is an affine hyperplane, which "must be tangent" to the level set of $F$ through $p$. That is, the level set of $F$ through $p$ must be a manifold of dimension $(n - 1)$ in some neighborhood of $p$. The implicit function theorem for mappings $F:\Reals^{n} \to \Reals^{m}$ (with $m \leq n$) has a similar interpretation: If $DF(p)$ has rank $m$, then the level set $\Sigma$ of $F$ through $p$ "looks like" the level set of the first-order approximation $L_{p}(x) = F(p) + DF(p)(x - p)$, an affine space of dimension $(n - m)$, which "must be tangent" to $\Sigma$ at $p$; that is, $\Sigma$ "must be an $(n - m)$-manifold near $p$".)
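The "zooming in" picture can be illustrated numerically: the gap between $F$ and its first-order approximation $L_{p}$ shrinks faster than the distance from $p$. The sketch below is only an illustration under assumed choices; the function $F(x,y) = \sin x + x y^{2}$ on $\Reals^{2}$, the point and the direction are hypothetical examples, not taken from the answer.

```python
import math

# Hypothetical example (an assumption): F(x, y) = sin(x) + x * y^2.
def F(x, y):
    return math.sin(x) + x * y * y

def grad_F(x, y):
    # Analytic gradient of the example F above.
    return (math.cos(x) + y * y, 2.0 * x * y)

def L(p, x):
    # First-order approximation L_p(x) = F(p) + <grad F(p), x - p>.
    g = grad_F(*p)
    return F(*p) + g[0] * (x[0] - p[0]) + g[1] * (x[1] - p[1])

p = (0.5, 1.0)
direction = (0.6, 0.8)  # a unit vector

# |F - L_p| divided by the distance h from p, for shrinking h ("zooming in").
relative_errors = []
for h in (1e-1, 1e-2, 1e-3):
    x = (p[0] + h * direction[0], p[1] + h * direction[1])
    relative_errors.append(abs(F(*x) - L(p, x)) / h)

print(relative_errors)
```

The printed ratios should decrease roughly in proportion to $h$, which is what it means for the level set of $F$ to "look more and more like" the level set of $L_{p}$ near $p$.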

Solution 4:

An intuitive explanation could be as follows.

Take a function of two variables $z=f(x,y)$.
Plot two level curves $z=z_0$ and $z=z_0+\Delta z$.
Take a point $(x_0,y_0,z_0)$ on the first curve, and consider the tangent to the curve at that point.
Suppose that $f(x,y)$ is "smooth" enough and that $\Delta z$ is "small" enough that the tangent direction does not vary "abruptly" from one curve to the other: that is, it remains constant (up to $O(\Delta z)$) along the normal to the two curves, so near the point the two curves look like parallel lines.
That given, the path along the normal is the shortest one from $z_0$ to $z_0+\Delta z$. Therefore $\Delta z / \Delta s$ is greatest along the normal, and since the gradient points in the direction of greatest increase, it lies along the normal to the level curve.

If you instead have three variables, and thus level surfaces, the "visual representation" above does not change in substance.
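The picture above can be checked with a small numerical experiment: scan all unit directions from a point, find the one with the largest $\Delta z/\Delta s$, and compare it with the tangent to the level curve. This is only a sketch under assumed choices; the function $f(x,y) = x^2 + 3y^2$ and the point $(1, 0.5)$ are hypothetical examples.

```python
import math

# Hypothetical example (an assumption): f(x, y) = x^2 + 3*y^2.
def f(x, y):
    return x * x + 3.0 * y * y

def grad_f(x, y):
    # Analytic gradient of the example f above.
    return (2.0 * x, 6.0 * y)

x0, y0 = 1.0, 0.5
ds = 1e-4  # small step length

# Scan unit directions in 0.1 degree steps and keep the steepest one.
best_angle, best_rate = 0.0, -math.inf
for k in range(3600):
    angle = 2.0 * math.pi * k / 3600
    rate = (f(x0 + ds * math.cos(angle), y0 + ds * math.sin(angle)) - f(x0, y0)) / ds
    if rate > best_rate:
        best_angle, best_rate = angle, rate

gx, gy = grad_f(x0, y0)
grad_angle = math.atan2(gy, gx)

# Tangent to the level curve x^2 + 3 y^2 = const, obtained from the ellipse
# parametrization x = A cos(t), y = (A / sqrt(3)) sin(t), not from the gradient.
tangent = (-math.sqrt(3.0) * y0, x0 / math.sqrt(3.0))
steepest = (math.cos(best_angle), math.sin(best_angle))
steepest_dot_tangent = steepest[0] * tangent[0] + steepest[1] * tangent[1]

print(best_angle, grad_angle, steepest_dot_tangent)
```

Up to the resolution of the angle scan, the steepest direction should coincide with the gradient direction and have (nearly) zero dot product with the tangent to the level curve.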

Solution 5:

I am using 3 dimensions in my proof, but it easily extends into higher dimensions.

Proof

First, we need to define the tangent plane at a specific point $(x, y, z)$. Suppose our surface satisfies the equation $$f(x) + g(y) + h(z) = k$$ for a constant $k$. Note that $$f(x+\delta_1) + g(y+\delta_2)+h(z+\delta_3)$$ approaches $k$ for small enough deltas. But, to first order, $$f(x+\delta_1) \approx f(x) + \delta_1 f'(x)$$ and similarly for $g$ and $h$. Therefore, the tangent plane is defined as $$[f(x) + \delta_1 f'(x)] + [g(y) + \delta_2 g'(y)] + [h(z) + \delta_3 h'(z)] = k$$ Or, subtracting $f(x) + g(y) + h(z) = k$, $$f'(x)a + g'(y)b + h'(z)c = 0$$ I have replaced the $\delta$s with $(a, b, c)$ to show that the plane extends to infinity (whereas before, the $\delta$s tended towards $0$). It is well known that the normal vector to this plane is $$\langle f'(x),\ g'(y),\ h'(z)\rangle$$

Proof Suppose $(a, b, c)$ satisfies $$k_1a + k_2b + k_3c = 0.$$ Note that $$\langle k_1, k_2, k_3\rangle \cdot \langle a, b, c\rangle = k_1a + k_2b + k_3c = 0$$ Hence they are orthogonal. $\blacksquare$
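
For a concrete instance (the functions here are an illustrative assumption, not part of the argument above), take $f(x) = x^2$, $g(y) = 2y^2$, $h(z) = 3z^2$ and $k = 6$, so the surface is the ellipsoid $x^2 + 2y^2 + 3z^2 = 6$, which passes through $(1, 1, 1)$. There $f'(1) = 2$, $g'(1) = 4$, $h'(1) = 6$, so the tangent plane is $$2a + 4b + 6c = 0,$$ with normal vector $\langle 2, 4, 6\rangle$. Two independent vectors in this plane are $(2, -1, 0)$ and $(3, 0, -1)$, and each has zero dot product with the normal: $$\langle 2,4,6\rangle\cdot\langle 2,-1,0\rangle = 4 - 4 + 0 = 0, \qquad \langle 2,4,6\rangle\cdot\langle 3,0,-1\rangle = 6 + 0 - 6 = 0.$$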