
Covariant derivatives and parallelism November 1, 2009

Posted by Akhil Mathew in differential geometry, MaBloWriMo.

[Nobody should read this post without reading the excellent comments below.  It turns out that thinking  more generally (via connections on the pullback bundle) clarifies things. Many thanks to the (anonymous) reader who posted them.  –AM, 5/16]

A couple of days back I covered the definition of a (Koszul) connection. Now I will describe how this gives a way to differentiate vector fields along a curve.

Covariant Derivatives

First of all, here is a minor remark I should have made before. Given a connection {\nabla} and a vector field {Y}, the operation {X \rightarrow \nabla_X Y} is linear in {X} over smooth functions—thus it is a tensor (of type (1,1)), and the value at a point {p} can be defined if {X} is replaced by a tangent vector at {p}. In other words, we get a map {T(M)_p \times \Gamma(TM) \rightarrow T(M)_p}, where {\Gamma(TM)} denotes the space of vector fields. We’re going to need this below.

Next, a curve {c} in the smooth manifold {M} is an immersion {c: J \rightarrow M}, where {J} is an interval in {\mathbb{R}}. A vector field along {c} is a smooth map {X: J \rightarrow T(M)} such that {X(t)} lies above {c(t) \in M}, i.e. {X(t) \in T(M)_{c(t)}}. An example is the derivative {c'}.

Now assume {M} is given a connection {\nabla}. I claim that there is a unique operator {D} sending vector fields along {c} to vector fields along {c} such that:

  • If {X} is a vector field along {c} and {f: M \rightarrow \mathbb{R}} is smooth, then {D(fX)(t) = (f \circ c)'(t) X(t) + f(c(t)) D(X)(t)}, where {fX} denotes the vector field {t \mapsto f(c(t)) X(t)} along {c}.
  • If {X} is the restriction of a vector field {\bar{X}} on {M}, i.e. {X(t) = \bar{X}(c(t))}, then {D(X)(t) = \nabla_{c'(t)} \bar{X}}. Note that {c'(t) \in T(M)_{c(t)}} by definition, so the right-hand side makes sense by the remark above.

This operator is called the covariant derivative along {c}. It is in fact a generalization of the usual directional derivative of vector fields in multivariable calculus, which occurs when you take the connection on {\mathbb{R}^n} with all Christoffel symbols zero.

The first condition means we can, by multiplying {X} by a cut-off function, assume {X} is supported in some coordinate neighborhood {U} with coordinates {x^1, \dots, x^n}. In particular, we may even assume that the image of {c} is contained in {U} by shrinking {J} and using local uniqueness (which we prove below). Moreover, we can assume that {c} is one-to-one by shrinking further.

Now, in the local case, we can write {c(t) = (c^1(t), \dots, c^n(t))}, and {X(t) = \sum_i X^i(t) \partial_i}, where {\partial_i = \frac{\partial}{\partial x^i}}. We can extend {X} to a vector field {\bar{X} = \sum_i \bar{X}^i \partial_i} on {U}. Let the Christoffel symbols of the connection be {\Gamma^k_{ij}}, so that {\nabla_{\partial_i} \partial_j = \sum_k \Gamma^k_{ij} \partial_k}. We write out what {\nabla_{c'(t)} \bar{X}} looks like, and dive into the algebra. By linearity,

\displaystyle \nabla_{c'(t)} \bar{X} = \sum_{i,k} c'^i(t) \nabla_{\partial_i} \left( \bar{X}^k \partial_k \right). 

By the derivation-like identity for connections, this equals

\displaystyle \sum_{i,k} c'^i(t) \frac{\partial \bar{X}^k}{\partial x^i} \partial_k + \sum_{i,j,k} c'^i(t) \bar{X}^k \Gamma^j_{ik} \partial_j .  

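Relabeling the indices, collecting terms, and using that {X} is the restriction of {\bar{X}} (so that {\sum_i c'^i(t) \frac{\partial \bar{X}^k}{\partial x^i}(c(t)) = X'^k(t)} by the chain rule) shows that if we have such an operator {D}, then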

\displaystyle D(X)(t) = \sum_j \left( X'^j(t) + \sum_{i,k} c'^i(t) {X}^k(t) \Gamma^j_{ik}(c(t)) \right) \partial_j. 

So we’re basically out of the woods—this expression depends only on {X}. Thus we define {D(X)} this way in local coordinates; it is easily checked that the conditions are satisfied locally, and one pieces together the local covariant derivatives to get the global ones. The fact that patching is legal follows from the uniqueness assertion and a partition of unity argument.
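To make the local formula concrete, here is a minimal Python sketch that evaluates {D(X)(t)} componentwise. Everything in it is an illustrative assumption rather than part of the argument: the helper names (`christoffel_polar`, `covariant_derivative`), the choice of the flat connection on the plane written in polar coordinates {(r, \theta)}, and the storage convention `gamma[j][i][k]` for {\Gamma^j_{ik}}. The test case is the unit circle with {X = c'}.

```python
def christoffel_polar(point):
    """Christoffel symbols of the flat connection on the plane in polar
    coordinates (r, theta), stored as gamma[j][i][k] = Gamma^j_{ik}.
    (Assumed example connection; any other table of symbols works too.)"""
    r, _theta = point
    gamma = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    gamma[0][1][1] = -r        # Gamma^r_{theta theta}
    gamma[1][0][1] = 1.0 / r   # Gamma^theta_{r theta}
    gamma[1][1][0] = 1.0 / r   # Gamma^theta_{theta r}
    return gamma


def covariant_derivative(c, dc, X, dX, christoffel, t):
    """Components of D(X)(t) = X'^j(t) + sum_{i,k} c'^i(t) X^k(t) Gamma^j_{ik}(c(t))."""
    point, velocity, x, dx = c(t), dc(t), X(t), dX(t)
    gamma = christoffel(point)
    n = len(point)
    return [
        dx[j] + sum(velocity[i] * x[k] * gamma[j][i][k]
                    for i in range(n) for k in range(n))
        for j in range(n)
    ]


# Unit circle c(t) = (r, theta) = (1, t), with X = c' (components (0, 1)).
c = lambda t: (1.0, t)
dc = lambda t: (0.0, 1.0)
X = dc
dX = lambda t: (0.0, 0.0)

result = covariant_derivative(c, dc, X, dX, christoffel_polar, 0.3)
print(result)  # components of D(c') in the basis (d/dr, d/dtheta)
```

In this example the components of {X} are constant in polar coordinates, yet {D(X)} is not zero: the Christoffel term contributes {-\partial_r}, the familiar inward acceleration of uniform circular motion. So the coordinate derivative {X'^j} alone is not the covariant derivative.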


Parallelism

A vector field {X} along the curve {c} is said to be parallel if {D(X) \equiv 0}. For instance, in the case of {\mathbb{R}^n} with the usual connection, this means that all the components are constant.

Now fix a curve {c: [0,1] \rightarrow M} starting at {p} and ending at {q}.

Proposition 1 Given {v \in T_p(M)}, there is a unique parallel vector field {X} along {c} such that {X(0)=v}.

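Indeed, by compactness we can cover {c([0,1])} by finitely many coordinate neighborhoods and treat one piece at a time, so we may assume that {c([0,1])} is contained in a single coordinate neighborhood. In that case, setting {D(X) = 0} in the local formula gives the linear system {X'^j(t) = -\sum_{i,k} c'^i(t) X^k(t) \Gamma^j_{ik}(c(t))}, and the claim follows from the fundamental existence and uniqueness theorem on linear ODEs(!).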

Anyway, this means that we can define a map {\tau_{pq}: T_p(M) \rightarrow T_q(M)} as follows: for {v \in T_p(M)}, take the parallel vector field {X} along {c} with {X(0) = v}, and set {\tau_{pq}(v) = X(1) \in T_q(M)}. It’s smooth because of the smoothness theorem on ODEs. It’s even linear because if {X,Y} are parallel and correspond to {v,w}, then {X+Y} is parallel and corresponds to {v+w}, etc.

The next result tells us what I have been insisting all along—that connections are about connecting different tangent spaces.

Proposition 2 {\tau_{pq}} is a linear isomorphism.

Since {T_p(M)} and {T_q(M)} have the same dimension, we just need to check that it’s one-to-one. This follows because the value at {1} of a parallel vector field {X} along {c} determines {X} on all of {[0,1]}, by the uniqueness theorem on ODEs again; in particular, {X(1) = 0} forces {X \equiv 0}.

However, {\tau_{pq}} does depend on the curve {c}. I believe the extent to which this dependence holds is measured by the holonomy groups, but I don’t (yet) understand what that’s all about, so I’ll let you read about it elsewhere.



2. . - May 13, 2010

This proof is incorrect (as in some standard books too) because it overlooks the possibility that the map may not be locally immersive and the image could be extremely self-crossing: the very concept of “vector field along a parametric curve” is awkward to work with if stuck inside the manifold in cases when the parametric map is not locally immersive. So pull back to bundle with connection over the time interval and use the existence/uniqueness theorem on linear ODE to get the canonical framing by flat sections for the pullback. Sections of the pullback bundle are rigorous meaning of vector field along the parametric curve as well. Anyway, old-style differentiation relative to that framing of flat sections gives the construction, with all properties then easy to verify.

Akhil Mathew - May 15, 2010

Thanks for pointing this out! However, I’m not sure I understand how this differs from defining a vector field along c as a map I \to TM (where I is the parameter interval) – is this not the same as the pullback bundle? I suppose I could have made this more explicit though.

3. . - May 16, 2010

Certainly the smooth maps from I to TM covering c are identified with the (smooth) global sections of the pullback bundle c*(TM) over I (please be careful not to mix up sections of a bundle with the actual bundle, by the way) — it is of course an easy and useful identification, though not literally a definition — but there is a genuine advantage to working with the pullback bundle over I: this pullback bundle over I is also equipped with the pullback *connection* (which is less simple to express along analogous lines to what you say above), and so one can see much more clearly that the essential content has nothing to do with vector fields (or even M!) and everything to do with connections on *arbitrary* vector bundles over a (non-trivial) interval I in the real line.

Namely, one uses the linear ODE stuff and connectedness/compactness argument along I (due to not knowing *a priori* that a vector bundle over I is globally trivial) to prove that any vector bundle with connection over I is canonically trivialized by the space of flat sections (which maps isomorphically onto each fiber). So the bump functions are removed from the argument and the parallel transport also drops out: all fibers over I are transitively identified with each other through their common identification with the global vector space of flat sections over I.

Pulling back to the time interval *before* doing the local analysis has the effect of disentangling the possibly complicated geometry of the curve, and makes the structure of the proof clearer. But one can’t really express this clearly unless the notion of connection on a general vector bundle has been introduced. That’s why some otherwise fine standard texts (such as do Carmo’s) which avoid general vector bundles and connections on them get tripped up when the parametric curve is not locally immersive.

4. . - May 16, 2010

One other point: what I have said may not make sense if you don’t know a definition of connection more general than you have described earlier. You need to define a notion of connection \nabla on a general vector bundle E and understand a notion of “pullback” for it. Loosely speaking, it is the data of directional derivative operators along vector fields for local sections of E. That is, we define (compatibly with localization) an operator \nabla_X on E(V) for any vector field X on any open set U of M containing V. Using a local frame for E over the domain of local coordinate chart (where X admits the usual description) one again gets Christoffel symbols to describe the connection, now depending on the local frame of E and the local coordinates. In the classical case E = TM, by using as local frame the classical one from local coordinates we recover the classical notion on TM. But the more general notion on any E is really useful, as the above stuff with pullback to time interval (and so much more) shows. Connections are useful far beyond the special case of Levi-Civita connection in Riemannian geometry.

Akhil Mathew - May 16, 2010

Thanks for the explanation! Yes, this is necessary for the previous comment. I ought to understand connections on principal bundles next…

5. . - May 16, 2010

Before moving on to the hypergenerality of connections on more general bundles for Lie groups (which is entirely unnecessary for the above purposes, and for which the case of vector bundles ultimately plays a central role as far as covariant differentiation is concerned), you should first fully understand very well the case of connections on general vector bundles: how they behave with respect to the operations of tensor algebra (induced connection on dual bundle, tensor product of bundles, exterior powers, etc.), why this includes the case of Levi-Civita on tensor fields via L-C on tangent bundle as a special case, how they behave under pullback, how to express compatibility of connection with a (pseudo-)metric on the vector bundle, and what Christoffel symbols mean.

In all cases, the interaction with linear algebra operations is to be uniquely characterized by some simple rules on local sections (e.g., “product formula” behavior on elementary tensors of local sections in the case of tensor product of two vector bundles, and “quotient rule” behavior on local sections of dual bundle), and grunge out what this says at the level of Christoffel symbols. The version on nLab has the above errors too, but nLab is so incredibly arid that there is no point in correcting what they say.
