Definition of the Hessian as a bilinear functional on the tangent space

In Milnor's Morse Theory, the Hessian of a smooth function $f : M \to \mathbb R$ defined on a manifold $M$ at a critical point $p$ is the bilinear functional on $T_p M$ defined as follows:

$$f_{**}(v, w) = \tilde v_p(\tilde w(f))$$

where $\tilde v, \tilde w$ are vector fields that extend $v$ and $w$. The book says that $\tilde w(f)$ denotes the directional derivative of $f$ in the direction $\tilde w$, and that $\tilde v_p$ is, of course, $v$. I am assuming that this means $\tilde w(f) = T_p(f)(w)$, but

  1. I do not know what it means to write $v(T_p(f)(w))$
  2. it is claimed that $f_{**}$ is symmetric because $$\tilde v_p(\tilde w(f)) - \tilde w_p(\tilde v(f)) = [v, w]_p(f) = 0$$ since $p$ is critical. If I am not mistaken, the argument used here is that since $p$ is critical, any directional derivative of $f$ at $p$ (e.g. $[v, w]_p(f)$) is zero, but wouldn't that also mean that $\tilde w(f) = \tilde v(f) = 0$?

Hopefully, I am only mixing up notations. I would appreciate any comment.


You don't need anything as sophisticated as Lie derivatives or connections to make sense of this. There are two main points to keep in mind:

First, if $w\in T_pM$ is a tangent vector at $p$, then $w$ acts as a linear map from $C^\infty(M)$ to $\mathbb R$; for any smooth function $f$, the number $w(f)$ is interpreted as the directional derivative of $f$ in the direction $w$. It can be written in a number of equivalent ways: $$ w(f) = df_p(w) = T_p(f)(w). $$
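For instance, in local coordinates $(x^1,\dots,x^n)$ around $p$ (a chart assumed here purely for illustration), writing $w = \sum_i w^i\,\partial/\partial x^i|_p$, this is the familiar formula

$$ w(f) = \sum_{i=1}^n w^i\,\frac{\partial f}{\partial x^i}(p). $$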

Second, if $\bar w$ is a smooth vector field and $f$ is a smooth function, then $\bar w(f)$ is another smooth function, whose value at $p\in M$ is $$ \bar w(f)(p) = \bar w_p(f). $$ So, given another vector field $\bar v$, we can apply $\bar v_p$ to this function and get a real number $\bar v_p(\bar w (f))$. Note that in general this value depends on knowing the vector field $\bar w$ in a neighborhood of $p$, not just on the value $\bar w_p$. Moreover, $\bar v_p(\bar w (f))$ and $\bar w_p(\bar v (f))$ might be different numbers, because $$ \bar v_p(\bar w (f)) - \bar w_p(\bar v (f)) = [\bar v,\bar w]_p (f).\tag{$*$} $$ (This is essentially the definition of the vector $[\bar v,\bar w]_p$.)
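To see the dependence on the extension concretely, here is a small illustrative example (the specific function and vector fields are choices made here, not taken from Milnor). Take $M=\mathbb R^2$, $p=(0,0)$, $f(x,y)=x$, so that $p$ is not a critical point, and let $\bar v = \partial_x$. The two vector fields $\bar w = \partial_y$ and $\bar w' = \partial_y + x\,\partial_x$ both have the value $\partial_y$ at $p$, yet

$$ \bar w(f) = 0,\qquad \bar w'(f) = x,\qquad \bar v_p(\bar w(f)) = 0,\qquad \bar v_p(\bar w'(f)) = 1. $$

This agrees with the bracket identity: $[\bar v,\bar w'] = \partial_x$, so $[\bar v,\bar w']_p(f) = 1$, while $\bar w'_p(\bar v(f)) = \bar w'_p(1) = 0$.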

However, if $p$ is a critical point of $f$, then $[\bar v,\bar w]_p (f) = df_p([\bar v,\bar w]_p) = 0$, so $(*)$ shows that $\bar v_p(\bar w (f)) = \bar w_p(\bar v (f))$. It also shows that this number is independent of the vector fields $\bar v$ and $\bar w$ chosen to extend the vectors $v$ and $w$: the first expression involves $\bar v$ only through its value $\bar v_p = v$, so it does not depend on how $v$ is extended, and the second involves $\bar w$ only through $\bar w_p = w$, so it does not depend on how $w$ is extended. Thus at a critical point $p$, the Hessian of $f$ is a well-defined symmetric bilinear form on $T_pM$. That's why Milnor treats the Hessian only at critical points.
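As a quick check of the independence claim, consider an illustrative example (the function and fields are assumptions chosen here, not Milnor's). On $\mathbb R^2$ with $p=(0,0)$, take $f(x,y)=xy$, which has a critical point at $p$, let $\bar v = \partial_x$, and compare the two extensions $\bar w = \partial_y$ and $\bar w' = \partial_y + x\,\partial_x$ of $w = \partial_y|_p$:

$$ \bar w(f) = x,\qquad \bar w'(f) = x + xy,\qquad \bar v_p(\bar w(f)) = 1,\qquad \bar v_p(\bar w'(f)) = (1+y)\big|_{(0,0)} = 1. $$

Both extensions give the same value, as the argument predicts.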


I have not read Milnor's book, so I cannot be sure that my notions agree with his, but I will take a shot.

First, a bit of intuition in $\mathbb{R}^n$: $H(v,w) \approx df(p+w,v)-df(p,v)$ for small $w$. The Hessian really does tell you how the derivative changes infinitesimally. In order to make sense of this on a general manifold, however, you need to be able to compare the tangent space at $p$ to the tangent space at $p+w$. For this, we need the technology of a connection. You may be interested in this answer for a more in-depth discussion.
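Made precise, and writing $df_p(v)$ for what is denoted $df(p,v)$ above, the approximation becomes a limit. A sketch in the standard coordinates of $\mathbb R^n$, with $v = (v^1,\dots,v^n)$ and $w = (w^1,\dots,w^n)$:

$$ H(v,w) = \lim_{t\to 0}\frac{df_{p+tw}(v) - df_p(v)}{t} = \sum_{i,j=1}^n \frac{\partial^2 f}{\partial x^i\,\partial x^j}(p)\,v^i w^j. $$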

The Hessian can be defined globally as a bilinear form given a connection $\nabla$ on the manifold just as $H = \nabla (df)$. Thinking of a vector field as a gadget which eats functions and returns a new function (the directional derivative in the direction of the tangent field), we can also write $H(X,Y) = X(Y(f)) - df(\nabla_X Y)$.
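In local coordinates this can be sketched as follows, assuming the usual convention $\nabla_{\partial_i}\partial_j = \Gamma^k_{ij}\,\partial_k$ for the Christoffel symbols (notation introduced here for illustration) and summing over the repeated index $k$:

$$ H(\partial_i,\partial_j) = \frac{\partial^2 f}{\partial x^i\,\partial x^j} - \Gamma^k_{ij}\,\frac{\partial f}{\partial x^k}. $$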

This last formula shows that at a critical point, the Hessian is independent of the connection, and we just have $H(X,Y) = X(Y(f))$. So we can actually make this definition of the Hessian at a critical point of a function on a manifold without a connection. Another reason to think this is needed is that the second derivative (because of the product rule) transforms in a way that involves the first derivative, unless the first derivative is $0$. So we cannot hope for an invariant definition of second derivative except at a critical point.
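To spell out the transformation law, here is a sketch for two coordinate systems $(x^i)$ and $(y^a)$ near $p$ (names chosen here for illustration), with summation over repeated indices. The chain and product rules give

$$ \frac{\partial^2 f}{\partial y^a\,\partial y^b} = \frac{\partial x^i}{\partial y^a}\frac{\partial x^j}{\partial y^b}\,\frac{\partial^2 f}{\partial x^i\,\partial x^j} + \frac{\partial^2 x^i}{\partial y^a\,\partial y^b}\,\frac{\partial f}{\partial x^i}. $$

The second term is the one involving the first derivative. It vanishes at a critical point, and what remains is the transformation rule of a bilinear form.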

The Hessian is symmetric in this case because $p$ is a critical point, so $[X,Y]_p(f) = df_p([X,Y]_p) = 0$.


Part 1: In general, for a real-valued function $f$ on $M$ and a tangent vector $v$ at $p \in M$, $v(f)$ denotes the directional derivative of $f$ in the direction $v$.

By extension, if $$ G : M \to \mathbb R^n: s \mapsto (G_1(s), \ldots, G_n(s)) $$ is a vector-valued function, then $v(G)$ denotes $$ (v(G_1), \ldots, v(G_n)) $$ i.e., a list of the directional derivatives of the components of $G$. That's a list of $n$ numbers, which can be thought of as a vector in $\mathbb R^n$.

Part 2: It's true that $\bar{w}(f) (p) = 0$. But you have to understand that $\bar{w}(f)$ is a function defined in a neighborhood of $p$: its value at $p$ is zero, but it might have nonzero values elsewhere, and hence its directional derivative in some direction at $p$ might be nonzero. And $[v, w]_p (f)$ will be zero, for the reason you stated: $p$ is critical for $f$.

Let's make this concrete. Suppose that $M$ is the unit disk in $\mathbb R^2$, $p$ is the origin, and $f(x, y) = x^2 + y^2$. Then $p$ is a critical point of $f$, right?

Let's pick $v = (1, 0)$ and $w = (0, 1)$. (And we can define $\bar{v}(x, y) = (1, 0)$ for $(x, y)$ near $(0,0)$ as well, and similarly for $\bar{w}$.)

Note that $v_p(f)$, the directional derivative of $f$ in the $x$-direction at $(0,0)$, is zero. But $\bar{v}(f)$ is the function $\bar{v}(f)(x, y) = 2x$, defined in a neighborhood of the origin. You can write out $\bar{w}(f)(x, y)$, I'll bet.

What's $w_p(\bar{v}(f))$? It's the directional derivative of $(x, y) \mapsto 2x$ in the $y$-direction at the point $(x, y) = (0, 0)$. That happens to be zero. Same goes for $v_p(\bar{w}(f))$.

You might want to work through all this stuff with a different $f$, like $f(x, y) = xy$, for possible further enlightenment.
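For reference, here is a sketch of how that computation goes, assuming the same constant extensions $\bar v = (1,0)$ and $\bar w = (0,1)$ as above. With $f(x,y) = xy$, the origin is again a critical point, and $\bar v(f)(x,y) = y$, $\bar w(f)(x,y) = x$. Differentiating once more at $p$:

$$ v_p(\bar v(f)) = 0,\qquad w_p(\bar v(f)) = 1,\qquad v_p(\bar w(f)) = 1,\qquad w_p(\bar w(f)) = 0. $$

So in the basis $(v, w)$ the Hessian has matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, which is symmetric, and this time the mixed terms are the nonzero ones.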