Second (and higher) derivatives of maps between manifolds

I'm trying to understand derivatives of maps between manifolds, and specifically something I read in Dodson and Poston's Tensor Geometry. I'll try to provide as much background as I can for those without the book (for anyone who has the second edition the relevant section is VII.2... pages 166-168 in my copy).

For starters, if $(U, \phi)$ and $(U', \phi')$ are charts on a manifold $M$ (e.g., $\phi: U \subset M \to \mathbb{R}^m$), $u \in U \cap U'$, and $\vec{t}, \vec{t}'$ are tangent vectors to $\mathbb{R}^m$, then a tangent vector to $M$ at $u$ (an element of $T_uM$) is an equivalence class of triples under the relation $\sim$ given by

$$ (U,\phi,\vec{t}) \sim (U', \phi', \vec{t}') \iff D_{\phi(u)}(\phi' \circ \phi^{-1})\vec{t} = \vec{t}'. $$

If $f : M \to N$ is differentiable at $u$, and $(V, \psi)$ is a chart on $N$ (say $\psi : V \to \mathbb{R}^n$) with $f(u) \in V$, then

$$ \begin{align*} (U,\phi,\vec{t}) &\sim (U', \phi', \vec{t}') \\ &\Rightarrow D_{\phi(u)}(\psi \circ f \circ \phi^{-1})\vec{t} = D_{\phi'(u)}(\psi \circ f \circ \phi'^{-1})\vec{t}' \\ &\Rightarrow (V,\psi,D_{\phi(u)}(\psi \circ f \circ \phi^{-1})\vec{t}) \sim (V,\psi,D_{\phi'(u)}(\psi \circ f \circ \phi'^{-1})\vec{t}') \end{align*} $$

so that $f$ induces a well defined map

$$ D_uf : T_uM \to T_{f(u)}N $$

taking the $\sim$ equivalence class of $(U,\phi,\vec{t})$ to that of $(V,\psi,D_{\phi(u)}(\psi \circ f \circ \phi^{-1})\vec{t})$.

Note that here $D_{\phi(u)}$ is just the regular derivative... as I understand what's going on, we basically create a map $\psi \circ f \circ \phi^{-1} : \mathbb{R}^m \to \mathbb{R}^n$ and take its derivative. The result (in combination with the arbitrary chart $\psi$) uniquely specifies the tangent vector to $N$ for a given tangent vector to $M$ (answering the question: as we move in some direction on $M$, what direction are we moving on $N$?).

The above construction can easily be extended to higher derivatives, by simply taking more derivatives of $\psi \circ f \circ \phi^{-1}$. Dodson and Poston say "In a similar way we can define higher derivatives, with $D^k_uf \in L^k(T_uM;T_{f(u)}N)$". $L^k(T_uM;T_{f(u)}N)$ is the space of multilinear mappings taking in $k$ vectors in $T_uM$ and returning a vector in $T_{f(u)}N$.

Moving towards my questions. Consider a map $g : U \subset \mathbb{R}^2 \to V \subset S^2$. Following the above, $D^2_u g : T_u \mathbb{R}^2 \times T_u \mathbb{R}^2 \to T_{g(u)}S^2$, i.e. after feeding it two vectors $\vec{t}, \vec{t}'$ we get a vector tangent to the sphere. This doesn't seem to be what you get if you take derivatives of $\iota \circ g : \mathbb{R}^2 \to \mathbb{R}^3$, i.e., considering the sphere as embedded in $\mathbb{R}^3$ ($\iota$ being the inclusion map). Fixing $\vec{t}$, after the first derivative we have a field of tangent vectors to the sphere, one at each point $g(x)$ for $x \in U$, and the second derivative I presume looks at how this field changes with position (yielding vectors no longer necessarily tangent to the sphere, since we're using just a regular and not covariant derivative).

My questions: I think my confusion festers somewhere in that last paragraph. Are second derivatives of $g$ and $\iota \circ g$ not the same because the second derivative of the inclusion map is not the inclusion map between the relevant tangent spaces (i.e., chain rule)? Is my picture of the second derivative of $\iota \circ g$ incorrect or misleading? What does $D^2_u(f)$ actually tell us about $f$? Is there some intuitive question that can be formulated like in the first derivative case, which tells us how we move on N under $f$ as we move on M?

Any and all help in untying my knotty brain would be greatly appreciated.


Your doubt is well-placed: even in the case $f:M\to \mathbb R$ you cannot sensibly define $D^2_u f : T_u M \times T_u M \to \mathbb R$ unless $u$ is a critical point of $f$. Otherwise your definition

$$D^2 f(\partial_i, \partial_j)=\frac{\partial^2 f}{\partial x^i \partial x^j} \ \ \text{extended by multilinearity}$$

depends on the coordinate system $x^i$: if we have some other coordinate system $y^\alpha$ then by multilinearity we should have

$$ D^2 f\left(\frac{\partial}{\partial x^{i}},\frac{\partial}{\partial x^{j}}\right) =\frac{\partial y^{\alpha}}{\partial x^{i}}\frac{\partial y^{\beta}}{\partial x^{j}}D^{2}f\left(\frac{\partial}{\partial y^{\alpha}},\frac{\partial}{\partial y^{\beta}}\right)=\frac{\partial y^{\alpha}}{\partial x^{i}}\frac{\partial y^{\beta}}{\partial x^{j}}\frac{\partial^{2}f}{\partial y^{\alpha}\partial y^{\beta}}; $$

but the chain rule gives

$$\frac{\partial^{2}f}{\partial x^{i}\partial x^{j}}=\frac{\partial}{\partial x^{i}}\left(\frac{\partial y^{\beta}}{\partial x^{j}}\frac{\partial f}{\partial y^{\beta}}\right)=\frac{\partial y^{\alpha}}{\partial x^{i}}\frac{\partial y^{\beta}}{\partial x^{j}}\frac{\partial^{2}f}{\partial y^{\alpha}\partial y^{\beta}}+\frac{\partial^{2}y^{\beta}}{\partial x^{i}\partial x^{j}}\frac{\partial f}{\partial y^{\beta}}.$$

The extra term involves only first derivatives of $f$, so it vanishes in every coordinate system exactly when $df = 0$ at $u$, i.e. at a critical point.
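To see the coordinate dependence in the smallest possible case, here is a worked example (my own choice of function and coordinates, purely for illustration): take $M = \mathbb{R}$ with coordinate $x$, the function $f(x) = x$, and a second coordinate $y = e^x$, so that $f = \ln y$. Then

$$\frac{\partial^2 f}{\partial x^2} = 0, \qquad \left(\frac{\partial y}{\partial x}\right)^{2}\frac{\partial^2 f}{\partial y^2} = y^2\cdot\left(-\frac{1}{y^2}\right) = -1, \qquad \frac{\partial^2 y}{\partial x^2}\,\frac{\partial f}{\partial y} = y\cdot\frac{1}{y} = 1.$$

The naive "tensorial" transformation would force $0 = -1$; the discrepancy is exactly the extra first-derivative term, which is nonzero here because $f$ has no critical points.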

In Riemannian geometry this is fixed by using the covariant Hessian $$D^2f (X,Y) = XYf - (\nabla_X Y) f;$$ but without the additional structure of a connection there is no way to think of $k^\text{th}$-order derivatives as taking $k$ tangent vectors.
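The sphere example from the question shows the same obstruction numerically. The following is a rough sketch, assuming a hypothetical spherical-coordinate parametrization $g(\theta,\varphi)$ (the question does not specify $g$) and a finite-difference step `h` picked for illustration: it estimates $\partial^2(\iota\circ g)/\partial\theta^2$ in $\mathbb{R}^3$ and splits it into normal and tangential parts.

```python
import math

def g(theta, phi):
    """Hypothetical parametrization of the unit sphere (an assumption,
    not taken from the question), viewed as a map into R^3."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def second_derivative_theta(theta, phi, h=1e-4):
    """Central finite-difference estimate of d^2(iota o g)/d theta^2."""
    plus, mid, minus = g(theta + h, phi), g(theta, phi), g(theta - h, phi)
    return tuple((p - 2.0 * m + q) / h**2 for p, m, q in zip(plus, mid, minus))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def split_normal_tangential(vec, point):
    """On the unit sphere the outward unit normal at `point` is `point` itself,
    so the tangential part is vec minus its projection onto `point`."""
    normal_part = dot(vec, point)
    tangential = tuple(v - normal_part * p for v, p in zip(vec, point))
    return normal_part, tangential

theta0, phi0 = 1.0, 0.5          # arbitrary sample point
p = g(theta0, phi0)
acc = second_derivative_theta(theta0, phi0)
normal_part, tangential = split_normal_tangential(acc, p)
print("normal component:", normal_part)
print("tangential part: ", tangential)
```

For this particular parametrization the exact second $\theta$-derivative is $-g(\theta,\varphi)$, so it points along the normal and is not tangent to the sphere. Projecting onto the tangent plane, which is what the Levi-Civita connection of the round sphere does, discards that normal part.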

Instead the general formulation uses higher tangent bundles: the first derivative is a map $TM \to TN$, so the second derivative is a map $TTM \to TTN$. Thus the picture you should have in your head is that the first input (a vector $v \in T_u M$) is the base direction you're moving in, and then the second input (a vector in $T_v TM$) is the direction you're varying that $v$ in.
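In a chart this can be written out explicitly. As a sketch, writing $(x, v)$ for a point of $TM$ and $(x, v, \dot x, \dot v)$ for a point of $TTM$ (notation introduced here for illustration), and identifying $f$ with its coordinate expression:

$$Tf(x, v) = \bigl(f(x),\; D_xf\,v\bigr), \qquad TTf(x, v, \dot x, \dot v) = \bigl(f(x),\; D_xf\,v,\; D_xf\,\dot x,\; D^2_xf(\dot x, v) + D_xf\,\dot v\bigr).$$

The familiar second-derivative term $D^2_xf(\dot x, v)$ shows up in the last slot, but alongside a first-derivative term in $\dot v$, and the two are not separated in a chart-independent way.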