Intuition on the direction of steepest ascent always being orthogonal to the level set of the function

Thanks for reading.

THE QUESTION:

Convince me that when on the surface of a smooth hill, the $(x,y)$ direction I should take a tiny step in such that my current height doesn't change is always perpendicular to the $(x,y)$ direction I should take a tiny step in so that my height changes by the most.


More mathematically formulated:

Convince me, intuitively, that the direction of steepest ascent is perpendicular to the level-set of a function.

Convince me, intuitively, that if I'm standing on a smooth hill, the direction of steepest ascent is perpendicular to the direction I should move in so that the height doesn't change at all.


Why I'm asking it:

(This section is going to be really long, but just because I want to be helpful to potential responders and explain exactly what I understand and what I don't understand in as much depth as possible. If you read it all, thank you so much!)

I've always had trouble understanding that the gradient is the direction of steepest ascent.

I've seen some excellent answers on this site, like this one...

Why is gradient the direction of steepest ascent?

...and this one...

Gradient of a function as the direction of steepest ascent/descent

...and honestly, most answers seem to answer in the same way: by proving that the dot product of a vector of fixed length with the gradient, which by definition is the change in the function at that point, is maximum when the vector of fixed length (the step) points in direction of the gradient.

That answer is fine...but I've always had a little bit of trouble understanding it.

That's because although the phrase "...take the step that points in the direction of the gradient to maximize the dot product between the step's direction and the gradient..." is mathematically sound, the idea of "the direction" of the gradient isn't something I'm really comfortable with, since I view the gradient as an operator on a vector $\begin{bmatrix} dx\\ dy \end{bmatrix}$ that outputs by how much some $f(x,y)$ would change at some specific $(x,y)$ if we took that "step". It's hard for me to think of the gradient as a vector itself.


So yea, I've never really truly understood the "direction of steepest ascent" of a function.

However, something I DO understand is the level-sets of a function. These are all the $(x,y)$ points such that some $f(x,y)$ stays constant.

For example, if $f(x,y)=x+2y$, then $(x+2y)=1$ would be a level-set.

[Figure: the red plane $z=f(x,y)$ and the green plane $(x+2y)=1$.]

In the picture above, the red plane is $z=f(x,y)$, and the green plane is $(x+2y)=1$. As you can see, the intersection of the two planes is a horizontal line, indicating that $f(x,y)$ is constant for all $(x,y)$ such that $(x+2y)=1$.

Now, say I was standing on that intersection, where $z=1$, and I wanted to know which $(x,y)$ direction to take a step in so that I didn't move up or down the mountain?

I would need to move in a $(x,y)$ direction such that $(x+2y)$ stayed constant.

Say I take a tiny step in some arbitrary direction. That step will have an $x$ component and a $y$ component.

We can represent that tiny step as a vector: $\begin{bmatrix} dx\\ dy \end{bmatrix}$.

For whatever tiny amount $dx$ that step corresponds to in the $x$ direction, $f(x,y)$ (my height) will change by $dx$, since at the $(x,y,f(x,y))$ point where I'm standing on that smooth mountain, $\frac{\partial f}{\partial x}=1$.

On the other hand, for whatever tiny amount $dy$ that step corresponds to in the $y$ direction, $f(x,y)$ (my height) will change by $2dy$, since at the $(x,y,f(x,y))$ point where I'm standing on that smooth mountain, $\frac{\partial f}{\partial y}=2$.

In general, at any $(x,y,f(x,y))$, the amount by which $f(x,y)$ changes when I take a tiny step $\begin{bmatrix} dx\\ dy \end{bmatrix}$ is the amount by which it changes due to the component of my step in the $x$ direction, which would be $\frac{\partial f}{\partial x} * dx$, plus the amount by which it changes due to the component of my step in the $y$ direction, which would be $\frac{\partial f}{\partial y} * dy$.
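Written as a single first-order formula, the bookkeeping above looks like this (the symbol $df$ is introduced here as shorthand for the tiny change in height):

$$df \approx \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy$$

For the example $f(x,y)=x+2y$ this reads $df = dx + 2\,dy$, and it is exact rather than approximate because that hill is a plane.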

In this specific example, the function changes twice as much for any step in the $y$ direction as it does for a step of the same size in the $x$ direction. That means that if I don't want $f(x,y)$ to change at all, then for whatever amount I move in the $y$ direction, I must move negative twice that amount in the $x$ direction, since any fixed amount of movement in the $y$ direction corresponds to twice the change in height that the same movement in the $x$ direction does!

In other words, the direction of my step should be: $\begin{bmatrix} -2\\ 1 \end{bmatrix}$.
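A quick numerical sanity check of this step direction, as a minimal sketch: the starting point, the step size `eps`, and the helper name `height_change` are illustrative choices, not part of the question.

```python
def f(x, y):
    """Height of the example hill from the question: f(x, y) = x + 2y."""
    return x + 2 * y


def height_change(x, y, dx, dy):
    """Change in height after stepping from (x, y) to (x + dx, y + dy)."""
    return f(x + dx, y + dy) - f(x, y)


# Start somewhere on the level set x + 2y = 1 (the point itself is arbitrary).
x0, y0 = 0.2, 0.4
eps = 1e-3  # size of the "tiny" step; an illustrative choice

level_change = height_change(x0, y0, -2 * eps, 1 * eps)  # step along (-2, 1)
uphill_change = height_change(x0, y0, 1 * eps, 2 * eps)  # step along (1, 2)

print(level_change)   # essentially zero, up to floating-point rounding
print(uphill_change)  # close to 5 * eps, since dx + 2*dy = eps + 4*eps
```

The step along $(-2, 1)$ leaves the height unchanged, while a step along $(1, 2)$ changes it.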

Let's say I was instead standing at an $(x,y,f(x,y))$ point where a tiny step in the $x$ direction corresponded to 42 times the change in altitude that a tiny step of the same size in the $y$ direction did.

In other words, $\frac{\partial f}{\partial x}=42\frac{\partial f}{\partial y}$ at that point.

Then, to not change height at all (stay on the level-set), I would want to take a tiny step in the $\begin{bmatrix} 1\\ -42 \end{bmatrix}$ direction. I'd want to make sure that my step moves me $-42$ times as much in the $y$ direction as it does in the $x$ direction.

More generally, if I'm standing at some point $(x,y,f(x,y))$ on a smooth mountain, the step I should take such that my altitude doesn't change (such that $f(x,y)$ doesn't change) should always be $\begin{bmatrix} +\frac{\partial f}{\partial y}\\ -\frac{\partial f}{\partial x} \end{bmatrix}$
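The general rule can be checked numerically on a curved hill. This is only a sketch: the function `f` below is a hypothetical hill chosen for illustration, the partial derivatives are estimated by central differences, and the point and step size are arbitrary.

```python
import math


def f(x, y):
    """A hypothetical smooth hill, chosen only for illustration."""
    return math.sin(x) + x * y**2


def partials(x, y, h=1e-6):
    """Central-difference estimates of df/dx and df/dy at (x, y)."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy


x0, y0 = 0.7, 1.3  # arbitrary standing point
fx, fy = partials(x0, y0)
eps = 1e-4  # scale of the tiny step

# Step along (+df/dy, -df/dx): the claimed "direction of no ascent".
flat_change = f(x0 + eps * fy, y0 - eps * fx) - f(x0, y0)
# Step of the same length along (df/dx, df/dy), for comparison.
steep_change = f(x0 + eps * fx, y0 + eps * fy) - f(x0, y0)

print(flat_change)   # tiny: only a second-order (eps squared) effect remains
print(steep_change)  # first-order in eps, so much larger in magnitude
```

On a curved hill the height change along $(+\partial f/\partial y, -\partial f/\partial x)$ is not exactly zero for a finite step, but it shrinks like the square of the step size, which is what "tiny step" means here.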

This makes sense to me - no dot products needed so far!!!!

Now, I know that the direction orthogonal to $\begin{bmatrix} +\frac{\partial f}{\partial y}\\ -\frac{\partial f}{\partial x} \end{bmatrix}$ is the one whose slope ($dy/dx$) is the negative reciprocal of that vector's slope.

That is:

$\begin{bmatrix} \frac{\partial f}{\partial x}\\ \frac{\partial f}{\partial y} \end{bmatrix}$

AND THAT'S THE DIRECTION OF STEEPEST ASCENT!
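One way to confirm the perpendicularity claim without slopes is the standard dot-product test (two vectors are orthogonal exactly when their dot product is zero):

$$\begin{bmatrix} +\frac{\partial f}{\partial y}\\ -\frac{\partial f}{\partial x} \end{bmatrix}\cdot\begin{bmatrix} \frac{\partial f}{\partial x}\\ \frac{\partial f}{\partial y} \end{bmatrix} = \frac{\partial f}{\partial y}\frac{\partial f}{\partial x} - \frac{\partial f}{\partial x}\frac{\partial f}{\partial y} = 0$$

This only shows that the two directions are perpendicular; it does not by itself explain why the second one is the steepest.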

In summary, I understand why the "direction of no ascent" is what it is.

If I could somehow intuitively understand that the "direction of steepest ascent" when climbing a mountain is always perpendicular to the direction of no ascent, then I would understand why the gradient is in the direction of steepest ascent.

Thanks!


One more thing...

I tagged this question as a soft question simply because I'm looking for intuitive answers more than mathematical proofs, and it's hard to say whether or not intuitive answers are correct.

Copied and pasted from a comment below...

I'd like to be able to picture myself standing on the surface of a smooth hill, standing over a spot where someone took a bright neon marker and traced out a level-curve on that hill, and picture the hill in such a way that the direction in which the hill is steepest is OBVIOUSLY perpendicular to that level-curve. And as of now, I just can't! It seems just as plausible that some OTHER direction not perpendicular to that bright yellow level-curve could be the steepest direction instead!


Solution 1:

I don't know how helpful this will be, it's just the way I sometimes like to picture it.

Since your hill is smooth, it's locally just a plane (more precisely, there exists a tangent plane whose approximation error shrinks at least quadratically with the distance from the point where you're standing).

Now take this plane and cut out a small disk where you're standing (it will in general be slanted). Draw its horizontal diameter, which is (a piece of) a level set. If you grab the disk at the points where this diameter intersects the boundary and look at it head-on, being careful to only rotate it about the vertical axis, you may be able to convince yourself that indeed the only possibility is going perpendicular to the diameter.
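One possible way to make the disk picture concrete, as a sketch with notation that is not in the original answer: choose ground coordinates $(u,v)$ centred where you stand, with $u$ running along the horizontal diameter and $v$ perpendicular to it, oriented uphill. Because the height is constant along $u$, the tangent plane has the form

$$z = z_0 + c\,v, \qquad c \ge 0.$$

A unit step at angle $\varphi$ from the diameter is $(\cos\varphi, \sin\varphi)$, so it gains height

$$\Delta z = c\,\sin\varphi,$$

which is zero at $\varphi = 0^\circ$ (along the diameter) and largest at $\varphi = 90^\circ$ (perpendicular to it).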

This is rather vague, I hope it's not completely useless.

Solution 2:

An old question, but a good one, so ...

As you have described, imagine yourself standing on a contour line of the hill (a curve where height is constant). You want to go uphill as quickly as possible.

Imagine another contour line at a height that’s a tiny bit higher than the one you’re currently on. If “tiny” is small enough, the two contour lines will be almost parallel in the small region around your feet.

To go uphill as quickly as possible, you need to follow the shortest path from your current point to the higher contour. That shortest path between the two contour lines is in a direction that’s perpendicular to both of them.
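To put a number on "shortest path", here is a sketch under the assumption that the two contour lines are locally parallel straight lines; the symbols $d$, $\Delta h$, and $\theta$ are introduced here for illustration. Let $d$ be the horizontal distance between the lines and $\Delta h$ their height difference. A straight path that leaves at angle $\theta$ from the perpendicular covers a horizontal length $d/\cos\theta$ before reaching the higher contour, so its slope is

$$\frac{\Delta h}{d/\cos\theta} = \frac{\Delta h}{d}\,\cos\theta,$$

which is greatest at $\theta = 0$, the perpendicular direction, and falls to zero as $\theta \to 90^\circ$, the direction along the contour.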

Simple examples of a hill are a hemisphere or a cone with vertical axis. In both cases the two contour lines are circles, so the shortest path between them is pretty obvious.
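The hemisphere case can also be checked by brute force. The following is a minimal sketch, with the radius, standing point, step length, and one-degree scan resolution all chosen arbitrarily: it tries every direction and compares the best one with the tangent to the contour circle.

```python
import math

R = 1.0  # radius of the hypothetical hemispherical hill


def height(x, y):
    """Height of a hemisphere of radius R centred on the origin."""
    return math.sqrt(R**2 - x**2 - y**2)


x0, y0 = 0.3, 0.4  # standing point; its contour line is the circle r = 0.5
eps = 1e-4         # length of the tiny step

# Try unit directions one degree apart and record the height gained by each.
best_angle, best_gain = None, -math.inf
for deg in range(360):
    t = math.radians(deg)
    gain = height(x0 + eps * math.cos(t), y0 + eps * math.sin(t)) - height(x0, y0)
    if gain > best_gain:
        best_angle, best_gain = deg, gain

# The tangent to the contour circle at (x0, y0) points along (-y0, x0).
best_dir = (math.cos(math.radians(best_angle)), math.sin(math.radians(best_angle)))
dot_with_tangent = best_dir[0] * (-y0) + best_dir[1] * x0

# The inward radial direction (-x0, -y0) sits at about 233.13 degrees,
# so the scan should land on the nearest whole degree to that.
print(best_angle)
print(dot_with_tangent)  # near zero: steepest step is (almost) perpendicular
```

The small leftover in the dot product comes from the one-degree scan resolution, not from the geometry.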