gradient: field of tangent vectors vs. normal to surface at a point

One definition of the gradient says that it is a field of tangent vectors to a surface. The gradient takes a scalar field $f(x,y)$ (i.e., a function) and produces a vector field $\vec{v}(x,y)$, where the vector at each point of the field points in the direction of greatest increase.

$\vec{v}(x,y) = \overbrace{\nabla \underbrace{f(x,y)}_\text{scalar field}}^{\text{vector field}} = \overbrace{\begin{bmatrix}\frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{bmatrix}}^{\text{vector field}}$

Another definition of the gradient says that it is a normal to a surface of the form $F(x,y,z)=c$.

How do I know when to apply which definition of the gradient? How is a field of tangent vectors related to the normal of a surface? They seem like contradictory definitions.


let $\vec{r} = x \hat{\text{i}} + y \hat{\text{j}} + z \hat{\text{k}}$ be the position vector to any point P(x,y,z) on the surface $\phi(x,y,z)=c$. Then: $d\vec{r} = dx~\hat{\text{i}} + dy~\hat{\text{j}} + dz~\hat{\text{k}}$ lies in the tangent plane to the surface at P.

$\phi(x,y,z)=c$

taking differential of both sides:

$d\phi = \frac{\partial \phi}{\partial x} dx + \frac{\partial \phi}{\partial y} dy + \frac{\partial \phi}{\partial z} dz = 0$

therefore:

$\bigg(\frac{\partial \phi}{\partial x} \hat{\text{i}} + \frac{\partial \phi}{\partial y}\hat{\text{j}} + \frac{\partial \phi}{\partial z}\hat{\text{k}}\bigg) \cdot \bigg(dx~\hat{\text{i}} + dy~\hat{\text{j}} + dz~\hat{\text{k}}\bigg) = 0$

$\nabla \phi \cdot d\vec{r} = 0$

therefore $\nabla \phi$ is perpendicular to every $d\vec{r}$ in the tangent plane, i.e., it is normal to the surface at the point P.
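The derivation above can be sanity-checked numerically. The following is only a minimal sketch under stated assumptions: it takes the unit sphere $\phi = x^2 + y^2 + z^2$, $c = 1$ as the example surface and a circle of constant height as a curve lying in it; the helper names `grad`, `curve` and `tangent` are illustrative and not part of the derivation.

```python
import math

def phi(x, y, z):
    # Assumed example: phi(x, y, z) = 1 is the unit sphere.
    return x * x + y * y + z * z

def grad(f, p, h=1e-6):
    # Central-difference approximation of the gradient of f at the point p.
    g = []
    for i in range(3):
        hi = list(p)
        lo = list(p)
        hi[i] += h
        lo[i] -= h
        g.append((f(*hi) - f(*lo)) / (2 * h))
    return g

def curve(t):
    # A curve lying in the surface phi = 1: the circle at height z = 0.6
    # (0.8^2 + 0.6^2 = 1, so every point satisfies phi = 1).
    return (0.8 * math.cos(t), 0.8 * math.sin(t), 0.6)

def tangent(t, h=1e-6):
    # dr/dt along the curve, again by central differences.
    a, b = curve(t + h), curve(t - h)
    return [(a[i] - b[i]) / (2 * h) for i in range(3)]

t0 = 0.7
p = curve(t0)
dot = sum(g * d for g, d in zip(grad(phi, p), tangent(t0)))
# grad(phi) . dr should vanish up to finite-difference error.
print(abs(dot) < 1e-6)
```

Any other curve that stays on the surface could be substituted for `curve`; the dot product should stay at zero up to numerical error.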


Solution 1:

If you have a surface embedded in a Euclidean space (for the sake of example, the radius 1 sphere of $R^3$ which we call $S^2$), there are a few things to disambiguate.

NB: In math, the "sphere" is the peel, the "ball" is the inside of the orange. We're talking only about the surface: the standard equation for an embedding of $S^2$ in $R^3$ is $S^2 = \{(x,y,z) \in R^3 \mid x^2 + y^2 + z^2 = 1\}$, not $B^3 = \{(x,y,z) \in R^3 \mid x^2 + y^2 + z^2 \leq 1\}$.

So let's start simple. We're in $R^3$, and there is no sphere yet, only a scalar field over $R^3$. A scalar field can be defined as a function $f : R^n \to R$, here with $n = 3$. Visually, it's like giving a color to each point of $R^3$, with:

  • darker, redder colors for points of $R^3$ mapped to a value close to $+\infty$
  • white for points of $R^3$ mapped to zero
  • darker, bluer colors for points of $R^3$ mapped to a value close to $-\infty$

Let's take a function which "generates" the radius $1$ sphere, meaning $f(x,y,z) = x^2 + y^2 + z^2 - 1$. With this function, points inside $S^2$ are blue (only light blue, since the minimum is $f(0,0,0) = -1$), points outside are red, and the pure white points are $S^2$ itself. Use a grapher like GeoGebra to check this by plotting $f(x,y,z) = 0$, which should be the level set corresponding to the sphere. Swap $0$ for another constant $c > -1$, and you should get other spheres with other radii, namely $\sqrt{1 + c}$.
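If no grapher is at hand, the level-set claim can also be checked in a few lines. This is a rough sketch, not a proof: it samples points on spheres of radius $\sqrt{1 + c}$ for a few arbitrarily chosen values of $c$ and confirms that $f$ takes the value $c$ there. The helper `point_on_sphere` is a hypothetical name introduced for the illustration.

```python
import math

def f(x, y, z):
    # The scalar field from the text.
    return x * x + y * y + z * z - 1

def point_on_sphere(radius, lat, lon):
    # Hypothetical helper: spherical coordinates to Cartesian coordinates.
    return (radius * math.cos(lat) * math.cos(lon),
            radius * math.cos(lat) * math.sin(lon),
            radius * math.sin(lat))

max_deviation = 0.0
for c in (-0.5, 0.0, 3.0):           # arbitrary sample constants, all > -1
    radius = math.sqrt(1 + c)
    for i in range(-4, 5):           # latitudes from -pi/2 to +pi/2
        for j in range(8):           # longitudes from 0 to 7*pi/4
            p = point_on_sphere(radius, i * math.pi / 8, j * math.pi / 4)
            max_deviation = max(max_deviation, abs(f(*p) - c))

# Every sampled point of the sphere of radius sqrt(1 + c) maps to c.
print(max_deviation < 1e-12)
```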

Your level sets (sets of points that map to the same scalar value) for this function $f$ are the spheres centered at the origin. Equivalently, each one is "the set of points satisfying $f(x,y,z) = c$ for a fixed $c$", or "a set of points of the same color".

In this context, your gradient (no matter the constant $c$, or equivalently the constant $C > 0$ in $f_C(x,y,z) = x^2 + y^2 + z^2 - C$) will be $\nabla f = (2x, 2y, 2z)$, obtained by taking the partial derivative with respect to each coordinate.

Now ask yourself: what does this correspond to in this picture? For each point $p = (x,y,z)$, $\nabla f(p)$ is a vector that points from the origin through $p$, with twice the length of the segment from the origin to $p$. These "rays" are precisely the vectors normal to the sphere passing through $p$ (if you draw them starting from $p$, on each sphere, rather than from the origin).
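A small numerical check of this picture, as a sketch only: it compares the analytic gradient $(2x, 2y, 2z)$ with a finite-difference estimate at one arbitrarily chosen point, then verifies that the gradient is parallel to the position vector (zero cross product) and twice as long. The names `numeric_grad` and `norm` are illustrative helpers.

```python
import math

def f(x, y, z):
    return x * x + y * y + z * z - 1

def grad_f(x, y, z):
    # Analytic gradient from the text.
    return (2 * x, 2 * y, 2 * z)

def numeric_grad(func, p, h=1e-6):
    # Central-difference estimate, used only to cross-check grad_f.
    g = []
    for i in range(3):
        hi = list(p)
        lo = list(p)
        hi[i] += h
        lo[i] -= h
        g.append((func(*hi) - func(*lo)) / (2 * h))
    return g

def norm(v):
    return math.sqrt(sum(c * c for c in v))

p = (0.3, -1.2, 2.0)                 # arbitrary sample point
g = grad_f(*p)
g_num = numeric_grad(f, p)

# Cross product p x g: zero exactly when g is parallel to p.
cross = (p[1] * g[2] - p[2] * g[1],
         p[2] * g[0] - p[0] * g[2],
         p[0] * g[1] - p[1] * g[0])
length_ratio = norm(g) / norm(p)

print(cross, length_ratio)
```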

[If you were to use another scalar function, generating another manifold (or rather, a family of level sets, each of which is a manifold and none of which intersects the others), and calculated its gradient, you would find the same result: the gradient vector at $p$ is normal to the manifold corresponding to the level set of the value $f(p)$. The function $f(x,y,z) = x^2 + y^2 - z^2 - c$, which gives a cone for $c = 0$ and hyperboloids for $c \neq 0$, is a highly instructive example as well (especially if pseudo-Euclidean metrics and spacetime are an objective of yours).]
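As a short worked check of this second example (a sketch using only the function named above), differentiating with respect to each coordinate gives

$\nabla f = (2x, 2y, -2z)$

For any curve $\vec{r}(t) = (x(t), y(t), z(t))$ that stays inside one level set, differentiating $x^2 + y^2 - z^2 = c$ with respect to $t$ gives

$2x\,x' + 2y\,y' - 2z\,z' = \nabla f \cdot \vec{r}\,' = 0$

so $\nabla f$ is again normal to the level set. The one exception is the apex of the cone: at the origin $\nabla f = (0, 0, 0)$, so no normal direction is defined there, which matches the fact that the cone is not a smooth surface at its apex.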

In this context, you can see that the gradient describes "the direction of greatest increase": the direction in which a point $p$ must move to go from "dark blue to dark red" as fast as possible. Since for every point $p$ you can define a gradient vector $\nabla f(p)$, the gradient operator actually turns a scalar field $f$ into a vector field $\nabla f$. Important point: this means that $\nabla f : R^n \to R^n$; it takes a point of our base Euclidean space and returns a vector of the same dimension.
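The "greatest increase" property can be illustrated numerically. The sketch below assumes the field $f(x,y,z) = x^2 + y^2 + z^2 - 1$ from above, uses the standard fact that the rate of change of $f$ in a unit direction $u$ is $\nabla f \cdot u$, and samples many random unit directions at one arbitrary point. None of the samples should beat the normalized gradient direction, whose rate of change equals $|\nabla f|$. The sample point, the seed and the helper `random_unit_vector` are arbitrary choices made for the illustration.

```python
import math
import random

def grad_f(x, y, z):
    # Analytic gradient of f(x, y, z) = x^2 + y^2 + z^2 - 1.
    return (2 * x, 2 * y, 2 * z)

def random_unit_vector(rng):
    # Normalised Gaussian samples give a uniformly distributed direction.
    while True:
        v = [rng.gauss(0, 1) for _ in range(3)]
        n = math.sqrt(sum(c * c for c in v))
        if n > 1e-12:
            return [c / n for c in v]

rng = random.Random(0)               # fixed seed so the run is repeatable
p = (1.0, 2.0, -0.5)                 # arbitrary sample point
g = grad_f(*p)
g_norm = math.sqrt(sum(c * c for c in g))

# Directional derivative along u is grad(f) . u; take the best of many samples.
best_sampled = max(
    sum(gc * uc for gc, uc in zip(g, random_unit_vector(rng)))
    for _ in range(10000)
)
# Directional derivative along the normalised gradient itself.
along_gradient = sum(gc * (gc / g_norm) for gc in g)

print(best_sampled <= along_gradient + 1e-9)
```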

But what if we were to consider a scalar field, and its resulting gradient field, on the sphere $S^2$ itself? I.e., a map $g : S^2 \to R$? Why not generalize scalar functions to all input manifolds?

In this context, the rest of $R^3$ is completely ignored: if you look into the difference between "extrinsic geometry" and "intrinsic geometry", you'll see that we could technically represent our sphere as a distorted 2D map (like the famous Mercator projection, or others like stereographic, etc.). Geometry still works, only you need to keep in mind that some things change (sometimes it'll be angles, sometimes lengths, etc.: it actually depends on the type of projection you use, and how it removes curvature from the manifold to make it flat-like).

If we want to use the sphere, which is curved, embedded in $R^3$, and define vectors on the sphere, we need to define the "tangent bundle". Let's unpack the concept.

Basically, if we define a scalar field on the sphere, it's easy: each point on the sphere is given a color corresponding to its value, from dark blue (very negative) to dark red (very positive). Think of temperatures on a weather map, or on a globe: really simple.

For a vector field, it's a bit more ambiguous. A vector starting from my sphere "goes out from the sphere into $R^3$", because it is a straight arrow, not a curved arrow. But the "winds" on your sphere model stay on your model, and don't go out into the surrounding space, right? So what gives?

Well, we'll decide that for your manifold $M$ of dimension $n$ (here, $n = 2$), at each point $p$ we'll have a copy of $R^n$ that is tangent to the manifold at that point, denoted $T_p(M)$. The collection of all the $T_p(M), p \in M$, is called the tangent bundle of $M$, and is denoted $T(M)$. In the case of $S^2$, you can imagine each $T_p(S^2)$ as the plane tangent to the sphere at the point $p$. The tangent bundle $T(M)$ is like a "hairy sphere", but instead of each "hair" being a vector, each hair is replaced by a tangent plane.

[What's very interesting, and hard to visualize, is that you can generally turn $T(M)$ into a manifold of higher dimension with nice properties. The Wikipedia article on tangent bundles gives the only visualization that I can provide: https://en.wikipedia.org/wiki/Tangent_bundle

This "turning the circle $S^1$ and its tangent lines into a cylinder" is very useful. Why? Because a smooth vector field over $S^1$ is precisely a closed curve drawn on the cylinder that crosses each vertical line exactly once (like a rubber band around the cylinder). The circle itself, sitting on the cylinder, is the graph of the zero function $f(\theta) = 0, \theta \in [0, 2\pi)$. And you've been using such a scheme for years: the graphs of continuous functions $R \to R$, which you know very, very well, are precisely vector fields on $R$, and the space $R^2$ in which you draw these graphs is the tangent bundle $T(R)$. The graph of the function $f(x) = 0$ is precisely the line $R$ itself.

Do note that vectors on a 1D space and scalars on a 1D space are the same thing, so scalar field and vector field are indistinguishable in this example, as the output space is $R^1$ in both. There is a distinction to be made as soon as you start from a 2D space.

Sadly, $T(S^2)$ is 4-dimensional (and $S^2$ is not parallelizable, because of the "hairy ball theorem", which is very problematic), so we have to stick with our hairy-sphere-but-hairs-are-tangent-planes image.]

Say we define the function $g(\theta, \rho) = \theta$, where $\theta$ is the latitude and $\rho$ the longitude on the sphere, with the sign convention $\theta = -\pi/2$ at the north pole and $\theta = +\pi/2$ at the south pole. Your north pole is the bluest point on your sphere, the south pole is the reddest point, and the circles of fixed latitude are your level sets. The equator is the white circle. Then, $\nabla g$ is the vector field over the sphere that has two singularities, one at the south pole and one at the north pole, and whose arrows all follow the meridians from the north pole to the south pole.

In this context, your gradient field is made up of vectors in the tangent bundle, i.e., at each point $p$, the vector $\nabla g(p)$ lives in $T_p(S^2)$.
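One concrete way to see such a tangent gradient, offered as a hedged sketch rather than as the definition: for a sphere embedded in $R^3$, take a scalar field defined on the surrounding space, compute its ordinary gradient, and remove the component along the normal. The code below assumes the stand-in field $G(x,y,z) = -z$, which is not the $g$ of the text but has the same level sets on the sphere (circles of fixed latitude) and also increases toward the south pole. The function name `tangential_gradient` is hypothetical.

```python
import math

def tangential_gradient(ambient_grad, p):
    # Project the ambient gradient onto the tangent plane of the sphere at p
    # by subtracting its component along the unit normal n = p / |p|.
    length = math.sqrt(sum(c * c for c in p))
    n = [c / length for c in p]
    normal_part = sum(a * b for a, b in zip(ambient_grad, n))
    return [a - normal_part * b for a, b in zip(ambient_grad, n)]

# A point on the unit sphere, given by ordinary latitude and longitude.
lat, lon = 0.4, 1.1
p = (math.cos(lat) * math.cos(lon),
     math.cos(lat) * math.sin(lon),
     math.sin(lat))

ambient = (0.0, 0.0, -1.0)           # gradient of the assumed field G = -z
v = tangential_gradient(ambient, p)

# v lies in the tangent plane: no component along the normal direction p.
radial_component = sum(a * b for a, b in zip(v, p))
# v follows the meridian: no component along the eastward direction.
east = (-math.sin(lon), math.cos(lon), 0.0)
east_component = sum(a * b for a, b in zip(v, east))

print(radial_component, east_component, v[2])
```

The ambient gradient $(0, 0, -1)$ sticks out of the sphere, while the projected vector stays in the tangent plane at $p$ and points along the meridian toward the south pole (its $z$ component is negative).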

In a nutshell, this is the distinction one needs to be very clear about: what is the nature of the manifold you use as your input space (especially in the case of an embedding)?

As an added bonus: do you know that spaces of functions with values in a vector space are themselves vector spaces? If so, then you understand that since vector fields from an $n$-manifold $M^n$ to $R^n$ are functions, they form a function space $(M^n \to R^n)$ that behaves like a vector space. As in, you can add and scale elements of $(M^n \to R^n)$, which are vector fields, like you would vectors. You can also define a multiplication between vector fields (the "commutator", also called the Lie bracket) on this function space, turning it into what is called a Lie algebra. Lie algebras play a fundamental role in physics and in the theory of differential equations over manifolds.

Other added bonus: look up the notion of Hodge dual/Hodge star: in the context of an $(n-1)$-space $M$ embedded in an $n$-space $E$, the normal at $p \in M$ is dual to the tangent plane at $p$.