Understanding the covariant derivative (connexion)

My lecturer defined the covariant derivative as in this section from Wikipedia: http://en.wikipedia.org/wiki/Covariant_derivative#Vector_fields. From this, he defines the operator $\nabla_X Y$ to mean the covariant derivative of $Y$ along $X$. I'm confused about the role $\nabla$ plays here: all I understand is that $\nabla_X Y|_p$ is the result of taking a tangent vector (given by $X(p)$) and doing something with it and $Y$, but $Y$ takes a point as input, not a tangent vector.

On some other site I found this covariant derivative defined as a directional derivative but I don't see how that relates. Maybe once I understand this I can understand why $\nabla_X Y = 0 $ means that $Y$ is parallel along $X$.


Solution 1:

First, let's make sure we understand what a connection is. Let $M$ be a smooth manifold, let $\mathscr{O}(M)$ be its ring of smooth functions (scalar fields), and let $TM$ be its tangent bundle. Let $\Gamma(TM)$ denote the space of vector fields on $M$ (i.e. the $\mathscr{O}(M)$-module of smooth sections of $TM$). A connection on $TM$ is a smooth map $\nabla : \Gamma(TM) \times \Gamma(TM) \to \Gamma(TM)$ satisfying the following properties:

  1. $\nabla$ is $\mathscr{O}(M)$-linear in the first argument: so for vector fields $X, Y, Z$ and smooth functions $f, g$, $$\nabla(f X + g Y, Z) = f \nabla(X, Z) + g \nabla(Y, Z)$$

  2. $\nabla$ is $\mathbb{R}$-linear in the second argument, where (by abuse of notation) $\mathbb{R}$ is the subalgebra of constant functions in $\mathscr{O}(M)$; that is, for any constants $a, b$ and vector fields $X$, $Y$, $Z$, $$\nabla(X, a Y + b Z) = a \nabla(X, Y) + b \nabla(X, Z)$$

  3. $\nabla$ obeys the Leibniz rule for the second argument, in the sense that for vector fields $X$ and $Y$ and a smooth function $f$, $$\nabla(X, f Y) = f \nabla(X, Y) + \nabla(X, f) Y$$ where $\nabla(X, f)$ denotes the action of $X$ (as a differential operator) on $f$. (Recall that a tangent vector at a point can be defined as a derivation acting on smooth functions at that point.)

Alternatively, we might define $\nabla$ as an $\mathbb{R}$-linear map $\Gamma(TM) \to \Gamma(T^*M \otimes TM)$ satisfying the Leibniz rule $\nabla(f Y) = df \otimes Y + f \nabla Y$.

It's not hard to show that connections exist: one can be constructed by patching together coordinate differentials using a partition of unity. Since you tagged the question riemannian-geometry, though, I'll give a specific example of a non-trivial connection, for concreteness. A Riemannian manifold is equipped with a metric $g_{ij}$. If we require that $\nabla$ be compatible with the metric, $\nabla_k g_{ij} = 0$, and torsion-free (in coordinates, $\Gamma^i_{\phantom{i}jk} = \Gamma^i_{\phantom{i}kj}$), we obtain a unique connection $\nabla$, called the Levi–Civita connection. It is given in coordinates by the formula $$(\nabla(X, Y))^i = X^j \nabla_j Y^i = X^j \partial_j Y^i + \Gamma^i_{\phantom{i}jk} X^j Y^k$$ where $\Gamma^i_{\phantom{i}jk}$ is the Christoffel symbol, which is defined in coordinates by $$\Gamma^i_{\phantom{i}jk} =\frac{1}{2} g^{il} \left( \partial_k g_{jl} + \partial_j g_{lk} - \partial_l g_{jk} \right)$$ It is a straightforward exercise in symbol-pushing to verify that this does indeed define a connection with the desired properties.
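As a rough numerical sanity check of the Christoffel formula, here is a minimal Python sketch. It assumes the round metric on the unit 2-sphere in coordinates $(\theta, \phi)$, i.e. $g = \mathrm{diag}(1, \sin^2\theta)$, and approximates the partial derivatives by central differences. The function names (`sphere_metric`, `christoffel`, and so on) are made up for this illustration and do not come from any library.

```python
import math


def sphere_metric(x):
    """Round metric on the unit 2-sphere in coordinates x = (theta, phi)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_2x2(g):
    """Inverse of a 2x2 matrix (this sketch only handles 2-dimensional metrics)."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def metric_derivative(metric_fn, x, k, h=1e-5):
    """Central-difference approximation of partial_k g_{ij} at the point x."""
    x_plus, x_minus = list(x), list(x)
    x_plus[k] += h
    x_minus[k] -= h
    g_plus, g_minus = metric_fn(x_plus), metric_fn(x_minus)
    n = len(x)
    return [[(g_plus[i][j] - g_minus[i][j]) / (2 * h) for j in range(n)]
            for i in range(n)]


def christoffel(metric_fn, x):
    """gamma[i][j][k] = 1/2 g^{il} (d_k g_{jl} + d_j g_{lk} - d_l g_{jk})."""
    n = len(x)
    g_inv = inverse_2x2(metric_fn(x))
    # dg[k][a][b] approximates partial_k g_{ab}
    dg = [metric_derivative(metric_fn, x, k) for k in range(n)]
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                gamma[i][j][k] = 0.5 * sum(
                    g_inv[i][l] * (dg[k][j][l] + dg[j][l][k] - dg[l][j][k])
                    for l in range(n)
                )
    return gamma


if __name__ == "__main__":
    theta = 1.0
    gamma = christoffel(sphere_metric, [theta, 0.3])
    # Compare with the closed forms -sin(theta)cos(theta) and cot(theta).
    print(gamma[0][1][1], -math.sin(theta) * math.cos(theta))
    print(gamma[1][0][1], math.cos(theta) / math.sin(theta))
```

For this metric the only non-zero symbols should be $\Gamma^\theta_{\phantom{\theta}\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\phantom{\phi}\theta\phi} = \Gamma^\phi_{\phantom{\phi}\phi\theta} = \cot\theta$, which the finite-difference values approximate.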

Solution 2:

A connection is an additional structure, or, simply speaking, a piece of information, that one may have in a vector bundle on a manifold.

What you are asking about is called technically a linear connection, i.e. a connection in the tangent bundle, so we are only discussing such connections here.

As @Zhen Lin pointed out, there are plenty of connections: just choose some $\Gamma^k{}_{ij}$ to be your Christoffel symbols in each coordinate patch, and then use the partition of unity argument to glue the data together smoothly.

Once a connection is defined, you can compute covariant derivatives of different objects. On functions you just get directional derivatives: $\nabla_X f = X f$. (Notice that this is true for any connection; in other words, all connections agree on scalars.)

On vector fields you get covariant derivatives in the sense that you mentioned in your question. The definitions are kindly provided by @Zhen Lin.

If you write down your vector fields in terms of a coordinate system, say $X=X^i \partial_i$ and $Y=Y^j \partial_j$, then \begin{align} \nabla_X Y &= \nabla_X (Y^j \partial_j) \\ &= \nabla_X (Y^j) \partial_j + Y^j \nabla_{X^i \partial_i} \partial_j \\ &= X(Y^j)\partial_j + X^i Y^j \nabla_i \partial_j \\ &= X(Y^j)\partial_j + X^i Y^j \Gamma^k{}_{ij} \partial_k \end{align}
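As a small worked example of this formula (my own choice of illustration, not part of the original question), take the flat plane in polar coordinates $(r, \theta)$ with metric $g = dr^2 + r^2\, d\theta^2$ and its Levi-Civita connection. The only non-zero Christoffel symbols are $$\Gamma^r{}_{\theta\theta} = -r, \qquad \Gamma^\theta{}_{r\theta} = \Gamma^\theta{}_{\theta r} = \frac{1}{r}.$$ For $X = Y = \partial_\theta$ the components of $Y$ are constant, so the term $X(Y^j)\partial_j$ vanishes and only the Christoffel term survives: $$\nabla_{\partial_\theta} \partial_\theta = \Gamma^k{}_{\theta\theta}\, \partial_k = -r\, \partial_r.$$ Similarly, $$\nabla_{\partial_r} \partial_\theta = \Gamma^k{}_{r\theta}\, \partial_k = \frac{1}{r}\, \partial_\theta.$$ So a field whose components are constant in a given chart need not have zero covariant derivative; the Christoffel term accounts for how the coordinate frame itself changes from point to point.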

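From this simple calculation you can see that the value $\nabla_X Y |_{p}$ at a point $p$ depends on $X$ only through its value $X(p)$, while it depends on $Y$ through its values in an arbitrarily small neighborhood of $p$ (in fact, only along a curve through $p$ with tangent $X(p)$), as you would expect from a derivative. This is why $\nabla$ can take a single tangent vector $X(p)$ as its direction, while $Y$ must be a field defined near $p$.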

You can then extend the notion of covariant derivatives to 1-forms, and then to arbitrary tensor fields: just use the Leibniz rule!

In Riemannian geometry we study manifolds along with an additional structure already given, namely, a Riemannian metric $g$. In this situation there exists a preferred choice of connection. Indeed, the Fundamental theorem of Riemannian geometry guarantees existence and uniqueness of a symmetric (i.e. torsion-free) connection with respect to which the metric tensor is parallel, $\nabla {g}=0$. It is called the Levi-Civita connection. In coordinates you know its Christoffel symbols and can compute covariant derivatives from the formulae provided in the answer of @Zhen Lin.

Geometrically, a connection introduces the notion of parallel transport. Strictly speaking, we transport objects along curves, but vector fields induce curves (their integral curves), so one can speak about objects that are parallel along vector fields in this sense.
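To tie this back to the condition $\nabla_X Y = 0$ from the question, here is a sketch in coordinates. Along a curve $\gamma(t)$ with velocity $\dot\gamma^i$, a vector field $Y$ is parallel when $$\frac{d Y^k}{dt} + \Gamma^k{}_{ij}\, \dot\gamma^i\, Y^j = 0,$$ which is just $\nabla_{\dot\gamma} Y = 0$ written out. As an illustration (my own example, assuming the flat plane in polar coordinates, where $\Gamma^r{}_{\theta\theta} = -r$ and $\Gamma^\theta{}_{r\theta} = \Gamma^\theta{}_{\theta r} = 1/r$), consider the Cartesian field $$Y = \partial_x = \cos\theta\, \partial_r - \frac{\sin\theta}{r}\, \partial_\theta.$$ Its polar components are not constant, yet \begin{align} \nabla_{\partial_\theta} Y &= \left( -\sin\theta + \Gamma^r{}_{\theta\theta}\, Y^\theta \right) \partial_r + \left( -\frac{\cos\theta}{r} + \Gamma^\theta{}_{\theta r}\, Y^r \right) \partial_\theta \\ &= \left( -\sin\theta + \sin\theta \right) \partial_r + \left( -\frac{\cos\theta}{r} + \frac{\cos\theta}{r} \right) \partial_\theta = 0, \end{align} and likewise $$\nabla_{\partial_r} Y = \left( \frac{\sin\theta}{r^2} + \Gamma^\theta{}_{r\theta}\, Y^\theta \right) \partial_\theta = \left( \frac{\sin\theta}{r^2} - \frac{\sin\theta}{r^2} \right) \partial_\theta = 0.$$ So $\partial_x$ is parallel along every direction, matching the intuition that it "points the same way" everywhere in the plane, even though its components vary in polar coordinates.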