Meaning of derivatives of vector fields
I have a doubt about the real meaning of the derivative of a vector field. This question seems silly at first but the doubt came when I was studying the definition of tangent space.
If I understood well, a vector is a directional derivative operator, i.e., a vector is an operator that can produce derivatives of scalar fields. If that's the case, then a vector acts on a scalar field and tells me how the field changes at that point.
However, if a vector is a derivative operator, then a vector field defines a different derivative operator at each point. So differentiating a vector field would mean differentiating a derivative operator, and that seems strange to me at first. I thought, for example, that the total derivative of a vector field would produce rates of change of the field, but my studies led me to a different picture, in which the total derivative produces rates of change only for scalar fields, while for vector fields it produces the pushforward.
So, what's the real meaning of differentiating a vector field knowing all of this?
Solution 1:
As I understand it, these are your questions:
- How does one define the derivative of a vector field? Do we just take the "derivatives" of each vector in the field? If so, what does it mean to take the derivative of a differential operator, anyway?
- Why does the total derivative of a scalar field give information about rates of change, while the "total derivative" of a vector field gives the pushforward (which doesn't seem to relate to rates of change)?
I think the best way to answer these questions is to provide a broader context:
In calculus, we ask how to find derivatives of functions $F\colon \mathbb{R}^m \to \mathbb{R}^n$. The typical answer is the total derivative $DF\colon \mathbb{R}^m \to L(\mathbb{R}^m, \mathbb{R}^n)$, which assigns to each point $p \in \mathbb{R}^m$ a linear map $D_pF \in L(\mathbb{R}^m, \mathbb{R}^n)$. With respect to the standard bases, this linear map can be represented as a matrix: $$D_pF = \begin{pmatrix} \left.\frac{\partial F^1}{\partial x^1}\right|_p & \cdots & \left.\frac{\partial F^1}{\partial x^m}\right|_p \\ \vdots & & \vdots \\ \left.\frac{\partial F^n}{\partial x^1}\right|_p & \cdots & \left.\frac{\partial F^n}{\partial x^m}\right|_p \end{pmatrix}$$
Personally, I think this encodes the idea of "rate of change" very well. (Just look at all those partial derivatives!)
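As a rough numerical illustration (a sketch, not part of the original argument), the matrix $D_pF$ can be approximated by central differences. The helper name `jacobian`, the step size `h`, and the sample map `F` are all assumptions chosen for this example.

```python
import math

def jacobian(F, p, h=1e-6):
    """Approximate the n x m matrix D_pF of F: R^m -> R^n by central differences."""
    m = len(p)
    n = len(F(p))
    J = [[0.0] * m for _ in range(n)]
    for j in range(m):
        forward = list(p)
        backward = list(p)
        forward[j] += h
        backward[j] -= h
        F_forward = F(forward)
        F_backward = F(backward)
        for i in range(n):
            # Entry (i, j) approximates dF^i/dx^j at p.
            J[i][j] = (F_forward[i] - F_backward[i]) / (2 * h)
    return J

# Hypothetical sample map F: R^2 -> R^3, F(x, y) = (x*y, sin(x), y^2).
def F(p):
    x, y = p
    return [x * y, math.sin(x), y ** 2]

# Analytically, D_pF = [[y, x], [cos(x), 0], [0, 2y]]; evaluate at p = (1, 2).
J = jacobian(F, [1.0, 2.0])
```

Each column of `J` records how all components of `F` respond to a small change in one input coordinate, which is the "rate of change" reading of the matrix.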
Let's now specialize to the case $m = n$. Psychologically, how does one intuit these functions $F\colon \mathbb{R}^n \to \mathbb{R}^n$? There are two usual answers:
(1) We intuit $F\colon \mathbb{R}^n \to \mathbb{R}^n$ as a map between two different spaces. Points from the domain space get sent to points in the codomain space.
(2) We intuit $F\colon \mathbb{R}^n \to \mathbb{R}^n$ as a vector field. Every point in $\mathbb{R}^n$ is assigned an arrow in $\mathbb{R}^n$.
This distinction is important. When we generalize from $\mathbb{R}^n$ to abstract manifolds, these two ideas take on different forms, and so we end up with different concepts of "derivative."
In case (1), the maps $F\colon \mathbb{R}^m \to \mathbb{R}^n$ generalize to smooth maps between manifolds $F \colon M \to N$. In this setting, the concept of "total derivative" generalizes nicely to "pushforward." That is, it makes sense to talk about the pushforward of a smooth map $F \colon M \to N$.
But you asked about vector fields, which brings us to case (2). In this case, we first have to be careful about what we mean by "vector" and "vector field."
A vector $v_p \in T_pM$ at a point $p$ is (as you say) a directional derivative operator at the point $p$. This means that $v_p$ inputs a scalar field $f\colon M \to \mathbb{R}$ and outputs a real number $v_p(f) \in \mathbb{R}$.
A vector field $v$ on $M$ is a map which associates to each point $p \in M$ a vector $v_p \in T_pM$. This means that a vector field defines a derivative operator at each point.
Therefore: a vector field $v$ can be regarded as an operator which inputs scalar fields $f\colon M \to \mathbb{R}$ and outputs scalar fields $v(f)\colon M \to \mathbb{R}$.
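To make the "operator on scalar fields" viewpoint concrete in $\mathbb{R}^2$, here is a minimal sketch. It assumes the coordinate formula $v(f) = \sum_i v^i \, \partial f / \partial x^i$ and approximates the partial derivatives by central differences; the names `apply_field`, `rotation`, `radius_sq`, and `first_coord` are hypothetical.

```python
def apply_field(v, f, h=1e-6):
    """Return the scalar field v(f): p -> sum_i v^i(p) * (df/dx^i)(p)."""
    def v_of_f(p):
        components = v(p)
        total = 0.0
        for i, v_i in enumerate(components):
            forward = list(p)
            backward = list(p)
            forward[i] += h
            backward[i] -= h
            # Central-difference approximation of df/dx^i at p.
            total += v_i * (f(forward) - f(backward)) / (2 * h)
        return total
    return v_of_f

# Hypothetical examples on R^2.
def rotation(p):        # v = -y d/dx + x d/dy
    return [-p[1], p[0]]

def radius_sq(p):       # f(x, y) = x^2 + y^2
    return p[0] ** 2 + p[1] ** 2

def first_coord(p):     # g(x, y) = x
    return p[0]

v_f = apply_field(rotation, radius_sq)    # analytically 0: f is constant along circles
v_g = apply_field(rotation, first_coord)  # analytically -y
```

Note that `apply_field` takes a scalar field and returns a new scalar field, so the vector field really is acting as a map from functions to functions.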
In this setting, it no longer makes sense to talk about the "total derivative" of a vector field. You've said it yourself: what would it even mean to talk about "derivatives" of vectors, anyway? This doesn't make sense, so we'll need to go a different route.
In differential geometry, there are two ways of talking about the derivative of a vector field with respect to another vector field:
- Connections (usually denoted $\nabla_wv$ or $D_wv$)
- Lie derivatives (usually denoted $\mathcal{L}_wv$ or $[w,v]$)
Intuitively, these notions capture the idea of "infinitesimal rate of change of a vector field $v$ in the direction of a vector field $w$."
Question: What do these constructions look like in $\mathbb{R}^n$?
Taking advantage of the fact that we're in $\mathbb{R}^n$, we can look at our vector fields in the calculus way: as functions $v\colon \mathbb{R}^n \to \mathbb{R}^n$. As such, we can write the components as $v = (v^1,\ldots, v^n)$.
The covariant derivative of $v$ with respect to $w$, given by the (Levi-Civita) connection, is defined as $$\nabla_wv = (w(v^1), \ldots, w(v^n)),$$ where $$w(v^i) := w^1\frac{\partial v^i}{\partial x^1} + \cdots + w^n\frac{\partial v^i}{\partial x^n}.$$
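The componentwise formula can be checked numerically. The following is only a sketch under the assumption that we are in $\mathbb{R}^n$ with standard coordinates; the function name `flat_connection` and the sample fields `v` and `w` are made up for illustration.

```python
def flat_connection(v, w, p, h=1e-6):
    """Approximate (nabla_w v)(p) = (w(v^1), ..., w(v^n)) at a point p of R^n."""
    n = len(p)
    w_p = w(p)
    result = [0.0] * n
    for j in range(n):
        forward = list(p)
        backward = list(p)
        forward[j] += h
        backward[j] -= h
        v_forward = v(forward)
        v_backward = v(backward)
        for i in range(n):
            # Accumulate w^j * dv^i/dx^j, with the partial taken by central differences.
            result[i] += w_p[j] * (v_forward[i] - v_backward[i]) / (2 * h)
    return result

# Hypothetical sample fields on R^2.
def v(p):
    x, y = p
    return [x ** 2 * y, x + y]

def w(p):
    x, y = p
    return [1.0, x]

# At p = (1, 2): the Jacobian of v is [[4, 1], [1, 1]] and w(p) = (1, 1),
# so analytically nabla_w v = (5, 2).
value = flat_connection(v, w, [1.0, 2.0])
```

Only the value of `w` at `p` enters the computation, while `v` must be known near `p`; this asymmetry is typical of connections.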
The Lie derivative of $v$ with respect to $w$ has a technical definition in terms of flows that I don't want to go into, but the bottom line is that it's similar to Rod Carvalho's answer.
Also, in $\mathbb{R}^n$ we have the pleasant formula
$$\mathcal{L}_wv = \nabla_wv - \nabla_vw,$$
which aids in computation.
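As a small worked example of this formula (the fields are chosen purely for illustration), take $w = (y, 0)$ and $v = (0, x)$ on $\mathbb{R}^2$, so that $w(f) = y\,\frac{\partial f}{\partial x}$ and $v(f) = x\,\frac{\partial f}{\partial y}$. Then
$$\nabla_wv = (w(0), w(x)) = (0, y), \qquad \nabla_vw = (v(y), v(0)) = (x, 0),$$
and therefore
$$\mathcal{L}_wv = \nabla_wv - \nabla_vw = (-x, y).$$
This agrees with computing the bracket directly on a test function $f$: $w(v(f)) - v(w(f)) = y\,\frac{\partial f}{\partial y} - x\,\frac{\partial f}{\partial x}$, since the mixed second-order terms cancel.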
Solution 2:
Let $\mathbb{v} : \mathbb{R}^n \to \mathbb{R}^n$ be a vector field, and let $\varphi : \mathbb{R}^n \to \mathbb{R}$ be a scalar field. Suppose that we would like to obtain the directional derivative of $\varphi$ at every $x$ in the direction of $\mathbb{v} (x)$, which is the following
$$(D_{\mathbb{v}} \varphi) (x) := \displaystyle\lim_{t \rightarrow 0^+} \frac{\varphi (x + t \mathbb{v} (x)) - \varphi (x)}{t} = \langle \nabla \varphi (x), \mathbb{v} (x) \rangle$$
This is the Lie derivative of $\varphi$ along $\mathbb{v}$. It's widely used in control theory, namely, in the study of Lyapunov stability of dynamical systems. If the vector field $\mathbb{v}$ is the gradient of a scalar field $\psi : \mathbb{R}^n \to \mathbb{R}$, then the Lie derivative of $\varphi$ along $\mathbb{v} (x) := \nabla \psi (x)$ is given by
$$(D_{\mathbb{v}} \varphi) (x) = \langle \nabla \varphi (x), \mathbb{v} (x) \rangle = \langle \nabla \varphi (x), \nabla \psi (x) \rangle$$
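The equality between the one-sided limit and the inner product with the gradient can be sanity-checked numerically. This is a sketch only: the name `difference_quotient`, the step `t`, and the sample fields are assumptions, and the gradient is entered by hand.

```python
def difference_quotient(phi, v, x, t=1e-6):
    """One-sided quotient (phi(x + t*v(x)) - phi(x)) / t for a small t > 0."""
    v_x = v(x)
    shifted = [x_i + t * v_i for x_i, v_i in zip(x, v_x)]
    return (phi(shifted) - phi(x)) / t

# Hypothetical sample data on R^2.
def phi(p):             # phi(x, y) = x^2 + 3y
    return p[0] ** 2 + 3 * p[1]

def v(p):               # v(x, y) = (-y, x)
    return [-p[1], p[0]]

x = [1.0, 2.0]
quotient = difference_quotient(phi, v, x)

# Hand-computed gradient of phi: (2x, 3).
grad_phi = [2 * x[0], 3.0]
inner = sum(g * v_i for g, v_i in zip(grad_phi, v(x)))  # -2*x*y + 3*x at (1, 2)
```

For small `t` the two numbers agree up to an error of order `t`, which is what the limit definition predicts for a differentiable $\varphi$.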
Has this answered, even if remotely, your question?
Update: Since my original post did not answer the OP's question, I will add this update. Let $\mathbb{u}, \mathbb{v} : \mathbb{R}^n \to \mathbb{R}^n$ be vector fields. Let $\mathbb{u}_i$ be the $i$-th component of $\mathbb{u}$, and note that $\mathbb{u}_i$ is a scalar field. We can compute the Lie derivative of $\mathbb{u}_i$ along $\mathbb{v}$, which is the scalar function
$$(D_{\mathbb{v}} \mathbb{u}_i) (x) = \langle \nabla \mathbb{u}_i (x), \mathbb{v} (x) \rangle$$
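Componentwise, we could define a derivative of the vector field $\mathbb{u}$ along the vector field $\mathbb{v}$ as follows (strictly speaking, in $\mathbb{R}^n$ this is the covariant derivative $\nabla_{\mathbb{v}} \mathbb{u}$ rather than the Lie bracket $[\mathbb{v}, \mathbb{u}]$, which also subtracts $((D \mathbb{v}) (x)) \, \mathbb{u} (x)$)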
$$(D_{\mathbb{v}} \mathbb{u}) (x) := \left[\begin{array}{c} (D_{\mathbb{v}} \mathbb{u}_1) (x)\\ (D_{\mathbb{v}} \mathbb{u}_2) (x)\\ \vdots \\ (D_{\mathbb{v}} \mathbb{u}_n) (x)\end{array}\right]$$
Finally, do note that $(D_{\mathbb{v}} \mathbb{u}) (x) = ((D \mathbb{u}) (x)) \, \mathbb{v} (x)$, where $(D \mathbb{u})$ is the Jacobian of $\mathbb{u}$.
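For a concrete check of this identity (with fields picked only for illustration), let $\mathbb{u} (x_1, x_2) = (x_1^2 x_2, \sin x_1)$ and $\mathbb{v} (x_1, x_2) = (-x_2, x_1)$. Componentwise,
$$(D_{\mathbb{v}} \mathbb{u}_1) (x) = \langle (2 x_1 x_2, x_1^2), (-x_2, x_1) \rangle = -2 x_1 x_2^2 + x_1^3, \qquad (D_{\mathbb{v}} \mathbb{u}_2) (x) = \langle (\cos x_1, 0), (-x_2, x_1) \rangle = -x_2 \cos x_1,$$
while the Jacobian product gives
$$((D \mathbb{u}) (x)) \, \mathbb{v} (x) = \left[\begin{array}{cc} 2 x_1 x_2 & x_1^2 \\ \cos x_1 & 0 \end{array}\right] \left[\begin{array}{c} -x_2 \\ x_1 \end{array}\right] = \left[\begin{array}{c} -2 x_1 x_2^2 + x_1^3 \\ -x_2 \cos x_1 \end{array}\right],$$
so the two computations agree, as expected: row $i$ of the Jacobian is exactly $\nabla \mathbb{u}_i (x)$.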