Covariant derivative

This article describes the covariant derivative of a vector field; the covariant derivative of a tensor is an extension of the same concept.

In physics, the covariant derivative D (also written as ∇) of a vector u in the direction of the vector v is a rule that defines a third vector, called D_v u (also ∇_v u), which has the properties of a derivative, specified below. A vector is a geometrical object, independent of the chosen basis (coordinate system). In terms of a coordinate system, this derivative transforms under a change of coordinates "in the same way" as the vector itself (covariant transformation), hence the name.

One would tend to define the derivative of a vector field by a process that involves the difference between two vectors at two nearby points. In Euclidean space with an orthonormal coordinate system, one drags one of the vectors to the origin of the other, keeping it parallel (parallel transport). The covariant derivative in that case is obtained simply by taking the derivative of the components. In the general case, however, one must take into account the change of the coordinate system. Moreover, in a curved space, such as the surface of the Earth, parallel transport depends on the path along which the vector is moved. For example, in polar coordinates in a two-dimensional Euclidean plane, the derivative contains extra terms that describe how the coordinate grid itself "rotates". In other cases the extra terms describe how the coordinate grid expands, contracts, twists, interweaves, etc.

Any vector is known if we know its components on a chosen basis, say the vectors e_i, i=0,1,2,... (here written with lower indices). The covariant derivative is a vector and so can be expressed as a sum over all basis vectors, the linear combination Σ_k Γ^k e_k, where Γ^k are the components (here written with upper indices). The components of the covariant derivative of the basis vectors are known as the Christoffel symbols. Specifying the covariant derivative is also called specifying connections in the space under consideration, or specifying an affine connection (see Affine space). The covariant derivative of e_j in the direction e_i is indicated with lower indices, so

$$ D_{e_i} e_j = \sum_k \Gamma^k_{ij} \, e_k $$

or, writing the derivative D on the left-hand side as D_i,

$$ D_i e_j = \sum_k \Gamma^k_{ij} \, e_k $$
   (definition of the Christoffel symbols Γ^k_{ij})

So a covariant derivative is defined if we know the Christoffel symbols on a given basis.
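As an illustration, consider plane polar coordinates (r, θ) with the coordinate basis e_r = ∂/∂r and e_θ = ∂/∂θ. This choice of basis is an assumption of the example; note that e_θ has length r, so it is not a unit vector. Differentiating the Cartesian components of these basis vectors gives the following sketch of the Christoffel symbols, in the convention D_i e_j = Σ_k Γ^k_{ij} e_k:

```latex
\begin{aligned}
e_r &= (\cos\theta,\ \sin\theta), & e_\theta &= (-r\sin\theta,\ r\cos\theta) \\
D_r e_r &= 0, & D_r e_\theta &= \tfrac{1}{r}\, e_\theta \\
D_\theta e_r &= \tfrac{1}{r}\, e_\theta, & D_\theta e_\theta &= -r\, e_r \\
\Gamma^\theta_{r\theta} &= \Gamma^\theta_{\theta r} = \tfrac{1}{r}, & \Gamma^r_{\theta\theta} &= -r
\end{aligned}
```

All other Christoffel symbols vanish in this example.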


Example of a curve in polar coordinates in a two-dimensional Euclidean space. A vector at curve parameter t (say the acceleration, not shown) is expressed in a coordinate system (e_r, e_θ), where e_r and e_θ are unit tangent vectors for the polar coordinates, serving as a basis to decompose a vector into radial and tangential components. At a slightly later time, the new basis in polar coordinates appears slightly rotated with respect to the first set. The covariant derivative of the basis vectors (the Christoffel symbols) serves to express this change.

The vectors u and v in the definition are defined at the same point p. The covariant derivative D_v u is also a vector defined at p.

The definition of the covariant derivative does not use a metric on the space. However, a given metric uniquely defines the covariant derivative.

The properties of a derivative imply that D_v u depends on the surroundings of the point p, in the same way as, for example, the derivative of a scalar function along a curve at a given point p depends on the surroundings of p. Therefore, the covariant derivative is not a tensor.

The information on the surroundings of a point p contained in the covariant derivative can be used to define parallel transport of a vector. Geometrical properties, such as the "curvature" of space, can also be defined without the use of a metric. The Riemann curvature tensor can be formulated in a coordinate-independent way in terms of the covariant derivative.


In order to appreciate the definition of the covariant derivative, a little background on the spaces involved is needed. The points p at which the vectors are defined are elements of a space which in the most general case is called a manifold. This is a collection of points p together with a set of smooth (differentiable) coordinate functions x^a(p), a=0,1,.... Examples of manifolds are Euclidean space and spacetime. The covariant derivative is extensively used in general relativity, where the points are elements of spacetime.

In such a space, a function f that assigns a real number to every point p in the manifold can be considered as a function of the variables x^a(p), a=0,1,..., simply by saying that f(x^a(p)) = f(p). Curves c in a manifold can be defined as collections of points p that depend on one parameter λ, called the curve parameter, so p = c(λ). The coordinate functions themselves define curves, the coordinate grid, when the other coordinates are held constant. A world line is another example of a curve in spacetime. The derivative of f at a point p with respect to the curve parameter can be considered a vector at p, tangent to the curve at p and therefore called a tangent vector. It has components dx^i/dλ. Conversely, every vector is tangent to a curve. For example, the vector v at the point p with components v^i is tangent to the curve parametrized by x^i(p) + λ v^i.

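A vector v can as well be seen as an operator v that can be applied to a function f and produces the derivative of f along the curve. Consider

$$ \frac{df}{d\lambda} = \sum_i \frac{dx^i}{d\lambda} \frac{\partial f}{\partial x^i} = \sum_i v^i \frac{\partial f}{\partial x^i} $$

Without the function f one could write

$$ \frac{d}{d\lambda} = \sum_i v^i \frac{\partial}{\partial x^i} $$

short for differentiation along the curve to which the vector v is tangent. So v is an operator, in the sense that v[f] is the derivative of f along that curve, v[f] = Σ_i v^i ∂f/∂x^i. In these terms the base vectors are the operators

$$ e_i = \frac{\partial}{\partial x^i} = \partial_i $$

and

$$ v = \sum_i v^i e_i = \sum_i v^i \partial_i $$

where we have written ∂_i for ∂/∂x^i, the tangent vectors to the curves that make up the coordinate grid itself.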
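The statement that v[f] is the derivative of f along a curve tangent to v can be checked numerically. The following is a minimal sketch using central differences; the function names and the test function f(x, y) = x²y are hypothetical choices made for illustration only.

```python
def directional_derivative(f, point, v, h=1e-5):
    """Approximate v[f] at `point` by differencing f along the curve point + lambda * v."""
    forward = [x + h * vi for x, vi in zip(point, v)]
    backward = [x - h * vi for x, vi in zip(point, v)]
    return (f(*forward) - f(*backward)) / (2 * h)


def component_sum(f, point, v, h=1e-5):
    """Approximate sum_i v^i * df/dx^i, one partial derivative per coordinate."""
    total = 0.0
    for i, vi in enumerate(v):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        total += vi * (f(*forward) - f(*backward)) / (2 * h)
    return total


def example_function(x, y):
    # Hypothetical test function f(x, y) = x^2 * y.
    return x * x * y
```

For the vector v = (1, 2) at the point (1, 3), both functions should agree with the analytic value 2xy·1 + x²·2 = 8 up to discretization error.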


The rules defining the covariant derivative D (or ∇) of a vector u in the direction of the vector v are that the vector D_v u should have the following properties of a differentiation. For vectors u, v, w and scalar functions f and g these are
  1. D_v u is algebraically linear in v, so   D_{f v + g w} u = f D_v u + g D_w u
  2. D_v u is additive in u, so               D_v (u + w) = D_v u + D_v w
  3. D_v u obeys the "chain rule"             D_v (f u) = f (D_v u) + (D_v f) u

where D_v f is defined as the ordinary derivative of the real function f in the direction of the vector v, that is, v[f]. Note that D_v u is not algebraically linear in u and depends on the neighborhood of p because of the last property, the chain rule. Therefore the covariant derivative is not a tensor (a tensor depends linearly on all its arguments) and the Christoffel symbols are not the components of a tensor.

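If we express the vector u as a linear combination of the basis vectors e_i, i=0,1,2,..., say with the components u^k,

$$ u = \sum_k u^k e_k , $$

one can apply the rules of covariant differentiation to the product on the right-hand side to obtain

$$ D_v u = \sum_k \left( v[u^k] \, e_k + u^k \, D_v e_k \right) $$

or, writing v = Σ_i v^i e_i and using D_i e_j = Σ_k Γ^k_{ij} e_k,

$$ D_v u = \sum_{i,k} v^i \left( \frac{\partial u^k}{\partial x^i} + \sum_j \Gamma^k_{ij} \, u^j \right) e_k $$
   (Covariant derivative in components)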

In words: the covariant derivative is the ordinary derivative along the coordinates plus correction terms that tell how the coordinate basis changes. In physics textbooks, the covariant derivative is sometimes simply stated in terms of its components, as in this equation.

Often a notation is used in which the covariant derivative is indicated by a semicolon, while an ordinary derivative is indicated by a comma. In this notation the same equation reads

$$ u^k{}_{;i} = u^k{}_{,i} + \sum_j \Gamma^k_{ij} \, u^j $$
   (Semicolon notation)

Once again this shows that the covariant derivative of a vector field is not simply obtained by differentiating the components with respect to the coordinates, u^k_{,i}, but also depends on the vector u itself through Γ^k_{ij} u^j.
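A small numerical sketch may make the role of the correction terms concrete. It assumes plane polar coordinates (r, θ) with the coordinate basis, for which the nonzero Christoffel symbols are Γ^r_{θθ} = −r and Γ^θ_{rθ} = Γ^θ_{θr} = 1/r (standard values, not derived in the text). The field used is the constant Cartesian vector e_x written in polar components: its partial derivatives do not vanish, but its covariant derivative does. All function names are hypothetical.

```python
import math


def christoffel_polar(r, theta):
    """Gamma[k][i][j] for plane polar coordinates (index 0 = r, 1 = theta)."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gamma[0][1][1] = -r          # Gamma^r_{theta theta}
    gamma[1][0][1] = 1.0 / r     # Gamma^theta_{r theta}
    gamma[1][1][0] = 1.0 / r     # Gamma^theta_{theta r}
    return gamma


def covariant_derivative(u, x, h=1e-6):
    """Return D[i][k] = du^k/dx^i + sum_j Gamma^k_{ij} u^j at the point x = (r, theta)."""
    gamma = christoffel_polar(*x)
    u_here = u(*x)
    result = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        u_plus = u(*forward)
        u_minus = u(*backward)
        for k in range(2):
            partial = (u_plus[k] - u_minus[k]) / (2 * h)   # ordinary derivative
            correction = sum(gamma[k][i][j] * u_here[j] for j in range(2))
            result[i][k] = partial + correction
    return result


def constant_x_field(r, theta):
    """Polar components (u^r, u^theta) of the constant Cartesian field e_x."""
    return (math.cos(theta), -math.sin(theta) / r)
```

Calling `covariant_derivative(constant_x_field, (2.0, 0.7))` should return a 2×2 array whose entries are all zero up to discretization error, as expected for a field that is constant in Cartesian terms.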

Table of contents
1 Parallel transport
2 Metric defines a unique covariant derivative
3 Covariant derivative indicates curvature in space

Parallel transport

With a covariant derivative it is possible to compare vectors at different (neighboring) points. This allows a description of the transport of vectors. A vector u is said to be parallel transported in the direction of a vector v if D_v u = 0, since in that case the (infinitesimal) change of the vector u in the direction of v is zero. In other words, u remains the same.
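The condition D_v u = 0 can be turned into a small numerical experiment. The sketch below rests on assumptions that are not part of the text above: the space is the unit sphere with coordinates (θ, φ), its nonzero Christoffel symbols are the standard Γ^θ_{φφ} = −sin θ cos θ and Γ^φ_{θφ} = Γ^φ_{φθ} = cos θ / sin θ, and the vector is carried along a circle of constant θ. In components the condition reads du^k/dφ + Σ_j Γ^k_{φj} u^j = 0, which is integrated here with classical Runge-Kutta steps. Function names are hypothetical.

```python
import math


def transport_rhs(theta0, u):
    """Right-hand side of du^k/dphi = -sum_j Gamma^k_{phi j} u^j on the unit sphere."""
    u_theta, u_phi = u
    d_theta = math.sin(theta0) * math.cos(theta0) * u_phi       # -Gamma^theta_{phi phi} u^phi
    d_phi = -(math.cos(theta0) / math.sin(theta0)) * u_theta    # -Gamma^phi_{phi theta} u^theta
    return (d_theta, d_phi)


def transport_around_latitude(theta0, u_start, steps=2000):
    """Parallel transport (u^theta, u^phi) once around the circle theta = theta0."""
    h = 2 * math.pi / steps
    u = tuple(u_start)
    for _ in range(steps):
        k1 = transport_rhs(theta0, u)
        k2 = transport_rhs(theta0, tuple(a + 0.5 * h * b for a, b in zip(u, k1)))
        k3 = transport_rhs(theta0, tuple(a + 0.5 * h * b for a, b in zip(u, k2)))
        k4 = transport_rhs(theta0, tuple(a + h * b for a, b in zip(u, k3)))
        u = tuple(
            a + h * (b1 + 2 * b2 + 2 * b3 + b4) / 6
            for a, b1, b2, b3, b4 in zip(u, k1, k2, k3, k4)
        )
    return u
```

For θ0 = π/3 a vector starting with components (1, 0) should return with components close to (−1, 0): it has been turned around although it was kept "parallel" at every step. On the equator (θ0 = π/2) it returns unchanged.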

A special role is played by curves that are created by transporting the tangent vector parallel to itself. They are called geodesics. A geodesic is a curve c(λ) for which the tangent vector

$$ u = \frac{d}{d\lambda} $$

satisfies
D_u u = 0
  (coordinate-free geodesic equation)
for every point on the curve. An example can be given in 4-dimensional spacetime for curves that are world lines. For a world line the tangent vector u is the 4-velocity and its derivative is the acceleration. From D_u u = 0 one sees that geodesics are orbits in which the acceleration is zero: the world lines of particles and observers in free fall. When a metric is introduced, it can be shown that geodesics defined in this way are also the routes between two points for which the path length is stationary (an extremum, the "straightest" routes).

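In components, D_u u = 0 is the well-known geodesic equation. Writing u on a basis as

$$ u = \sum_i u^i e_i , \qquad u^i = \frac{dx^i}{d\lambda} , $$

and using Σ_i u^i ∂u^k/∂x^i = du^k/dλ along the curve, the covariant derivative in components becomes

$$ D_u u = \sum_k \left( \frac{du^k}{d\lambda} + \sum_{i,j} \Gamma^k_{ij} \, u^i u^j \right) e_k $$

So D_u u = 0 in components gives the important equation for geodesics

$$ \frac{d^2 x^k}{d\lambda^2} + \sum_{i,j} \Gamma^k_{ij} \, \frac{dx^i}{d\lambda} \frac{dx^j}{d\lambda} = 0 $$
   (Geodesic equation in coordinates)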

Note that all anti-symmetric parts in the lower indices of the Christoffel symbols will cancel out in the summation, so only the symmetric parts will play a role in geodesics.
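As a sanity check of the geodesic equation in coordinates, the following sketch integrates it in plane polar coordinates and verifies that the result is a straight line when converted back to Cartesian coordinates. It assumes the standard polar Christoffel symbols Γ^r_{θθ} = −r and Γ^θ_{rθ} = Γ^θ_{θr} = 1/r, which are not derived in the text; the function names and the Runge-Kutta integrator are illustrative choices.

```python
import math


def geodesic_rhs(state):
    """Derivative of state = (r, theta, dr/dlambda, dtheta/dlambda) along the geodesic."""
    r, theta, dr, dtheta = state
    # d2x^k/dlambda^2 = -sum_{i,j} Gamma^k_{ij} (dx^i/dlambda)(dx^j/dlambda)
    ddr = r * dtheta * dtheta            # from Gamma^r_{theta theta} = -r
    ddtheta = -2.0 * dr * dtheta / r     # from Gamma^theta_{r theta} = Gamma^theta_{theta r} = 1/r
    return (dr, dtheta, ddr, ddtheta)


def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = geodesic_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = geodesic_rhs(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(
        s + h * (a + 2 * b + 2 * c + d) / 6
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate_geodesic(state, lam_end, steps=1000):
    """Integrate the geodesic equation from lambda = 0 to lambda = lam_end."""
    h = lam_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state


def to_cartesian(state):
    """Convert the position part (r, theta) of a state to Cartesian (x, y)."""
    r, theta = state[0], state[1]
    return (r * math.cos(theta), r * math.sin(theta))
```

Starting at r = 1, θ = 0 with dr/dλ = 0 and dθ/dλ = 1 corresponds to the Cartesian point (1, 0) with velocity (0, 1). After λ = 1 the integrated curve should arrive near (1, 1), that is, it moves along the straight line x = 1.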

Metric defines a unique covariant derivative

A metric tensor, metric for short, defines a real number called the inner product (or dot product) for two vectors u and v. With a metric one can derive metric properties, such as the length of a vector and the angle between two vectors. It is a tensor, linearly dependent on both u and v, and is denoted in various ways, for example as the tensor g(u,v) or as

$$ g(u,v) = u \cdot v = \langle u, v \rangle $$

The components are the real numbers obtained by applying the metric tensor to a basis, say e_i, so

$$ g_{ij} = g(e_i, e_j) $$

In components on a basis one writes g(u,v) = Σ_{i,j} g_{ij} u^i v^j, with u = Σ_i u^i e_i and v = Σ_j v^j e_j.

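A given metric defines a unique covariant derivative by the requirement that the chain rule should be applicable to the inner product as well:

$$ D_w \big( g(u,v) \big) = g(D_w u, v) + g(u, D_w v) $$
 (Requirement on D)

for any u, v and w, so also for the base vectors, and we have in components

$$ D_k \big( g(e_i, e_j) \big) = g(D_k e_i, e_j) + g(e_i, D_k e_j) $$

The left-hand side is simply the ordinary derivative, since the inner product is a scalar. Writing out the covariant derivative in terms of the Christoffel symbols, D_k e_i = Σ_l Γ^l_{ki} e_l, and using g_{ij} = g(e_i, e_j), we obtain

$$ \frac{\partial g_{ij}}{\partial x^k} = \sum_l \Gamma^l_{ki} \, g_{lj} + \sum_l \Gamma^l_{kj} \, g_{il} $$

This gives the unique relation between the Christoffel symbols (defining the covariant derivative) and the metric tensor components.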

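We can invert this equation and express the Christoffel symbols with a little trick, by writing this equation three times with a handy choice of the indices (with ∂_i short for ∂/∂x^i):

$$ \partial_i g_{jk} = \sum_l \Gamma^l_{ij} \, g_{lk} + \sum_l \Gamma^l_{ik} \, g_{jl} $$
$$ \partial_j g_{ik} = \sum_l \Gamma^l_{ji} \, g_{lk} + \sum_l \Gamma^l_{jk} \, g_{il} $$
$$ \partial_k g_{ij} = \sum_l \Gamma^l_{ki} \, g_{lj} + \sum_l \Gamma^l_{kj} \, g_{il} $$

Adding the first two equations and subtracting the third, most of the terms on the right-hand side cancel, provided the Christoffel symbols are symmetric in their lower indices, Γ^l_{ij} = Γ^l_{ji} (a connection without torsion, in a coordinate basis), and we are left with

$$ \sum_l g_{lk} \, \Gamma^l_{ij} = \frac{1}{2} \left( \partial_i g_{jk} + \partial_j g_{ik} - \partial_k g_{ij} \right) $$

Or, with the inverse of g, defined by (using the Kronecker delta)

$$ \sum_k g^{mk} g_{kl} = \delta^m_l $$

we write the Christoffel symbols as

$$ \Gamma^m_{ij} = \frac{1}{2} \sum_k g^{mk} \left( \partial_i g_{jk} + \partial_j g_{ik} - \partial_k g_{ij} \right) $$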

In other words, the Christoffel symbols (and hence the covariant derivative) are completely determined by the metric, through equations involving the derivatives of the metric. For a given metric this set of equations can become rather complicated. There are quicker and simpler methods to obtain the Christoffel symbols for a given metric, e.g. using the action integral and the associated Euler-Lagrange equations.
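The relation between metric and Christoffel symbols, Γ^m_{ij} = ½ Σ_k g^{mk} (∂_i g_{jk} + ∂_j g_{ik} − ∂_k g_{ij}), can be evaluated numerically. The sketch below is restricted to two dimensions and uses central differences for the derivatives of the metric; the example metric (the Euclidean plane in polar coordinates, g_rr = 1, g_θθ = r²) and all function names are assumptions made for illustration.

```python
def polar_metric(x):
    """Metric components g_ij of the Euclidean plane in polar coordinates x = (r, theta)."""
    r = x[0]
    return [[1.0, 0.0], [0.0, r * r]]


def metric_derivatives(metric, x, h=1e-5):
    """dg[k][i][j] approximates d g_ij / d x^k by central differences."""
    n = len(x)
    dg = []
    for k in range(n):
        forward = list(x)
        backward = list(x)
        forward[k] += h
        backward[k] -= h
        g_plus = metric(forward)
        g_minus = metric(backward)
        dg.append([
            [(g_plus[i][j] - g_minus[i][j]) / (2 * h) for j in range(n)]
            for i in range(n)
        ])
    return dg


def inverse_2x2(g):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]


def christoffel(metric, x):
    """Gamma[m][i][j] = 1/2 sum_k g^{mk} (d_i g_jk + d_j g_ik - d_k g_ij)."""
    n = 2  # this sketch only handles two dimensions
    g_inv = inverse_2x2(metric(x))
    dg = metric_derivatives(metric, x)
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for m in range(n):
        for i in range(n):
            for j in range(n):
                gamma[m][i][j] = 0.5 * sum(
                    g_inv[m][k] * (dg[i][j][k] + dg[j][i][k] - dg[k][i][j])
                    for k in range(n)
                )
    return gamma
```

At r = 2 the result should reproduce Γ^r_{θθ} = −r = −2 and Γ^θ_{rθ} = Γ^θ_{θr} = 1/r = 0.5, with all other entries close to zero.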

Covariant derivative indicates curvature in space

A vector e on a globe, at a point Q on the equator, is directed to the north. Suppose we parallel transport the vector first along the equator to P, then (keeping it parallel to itself) drag it along a meridian to the pole N, and (keeping the direction there) subsequently transport it along another meridian back to Q. We then notice that the vector, parallel transported along a closed circuit, does not return as the same vector: it has another orientation. This would not happen in Euclidean space and is caused by the curvature of the surface of the globe. The same effect can be noticed if we drag the vector around an infinitesimally small closed surface, subsequently along two directions and then back. The infinitesimal change of the vector is a measure of the curvature.
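The round trip on the globe can be imitated numerically. The sketch below treats the globe as the unit sphere embedded in three-dimensional space and uses the property that parallel transport along a great-circle arc equals the rigid rotation about the axis normal to the plane of that arc. The placement of Q at (1, 0, 0), the choice of N as the north pole and all function names are assumptions of this illustration.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def transport_along_arc(vec, start, end):
    """Carry vec along the great-circle arc from start to end (both unit vectors)."""
    normal = cross(start, end)
    norm = math.sqrt(dot(normal, normal))
    axis = tuple(n / norm for n in normal)
    angle = math.atan2(norm, dot(start, end))
    c, s = math.cos(angle), math.sin(angle)
    k_cross_v = cross(axis, vec)
    k_dot_v = dot(axis, vec)
    # Rodrigues rotation formula: rotate vec about axis by angle
    return tuple(
        v * c + w * s + k * k_dot_v * (1.0 - c)
        for v, w, k in zip(vec, k_cross_v, axis)
    )


def transport_around_triangle(longitude):
    """Transport the north-pointing vector at Q around the circuit Q -> P -> N -> Q.

    Q and P lie on the equator, separated by `longitude` radians
    (0 < longitude < pi); N is the north pole.
    """
    q = (1.0, 0.0, 0.0)
    p = (math.cos(longitude), math.sin(longitude), 0.0)
    n = (0.0, 0.0, 1.0)
    vec = (0.0, 0.0, 1.0)  # tangent vector at Q, pointing north
    for start, end in ((q, p), (p, n), (n, q)):
        vec = transport_along_arc(vec, start, end)
    return vec
```

For a longitude difference L between Q and P, the returned vector should be (0, −sin L, cos L): it is still tangent to the sphere at Q but has turned away from north by the angle L.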

An infinitesimal closed path

In the following section we need a closed circuit in order to illustrate the relation of the covariant derivative with the concept of curvature.

Moving from P0 to P1 and then to P3 in general ends up at a different point than going the other way around: from P0 to P2 and then to P4. How can we identify the difference between these endpoints (P3 and P4) with a vector, in such a way that we can use it in a description of a closed circuit? The following argument looks rather complicated but is merely a matter of proper bookkeeping.

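Consider a set of curves as shown in the figure (for each ζ there is a curve with curve parameter λ, and vice versa) and some arbitrary function f on the space under consideration (a manifold). Suppose u is the tangent vector at a point P0 (see figure) to a curve with curve parameter λ. The vector is drawn at P0 towards P1. For a sufficiently small difference Δλ in the parameter along the curve, the difference f(P1) − f(P0) can be approximated by

$$ f(P_1) - f(P_0) = \Delta\lambda \; u[f] $$

Suppose further that v is the tangent vector at P0 in the other coordinate direction ζ, towards P2. In the same way, for a sufficiently small difference Δζ in the parameter along the curve, the difference f(P2) − f(P0) can be approximated by

$$ f(P_2) - f(P_0) = \Delta\zeta \; v[f] $$

Now consider the step from P1 to P3, which is taken along v at P1. The value of v[f] at P1 follows from its value at P0 by applying u, and since u is linear, we may write

$$ f(P_3) - f(P_1) = \Delta\zeta \left( v[f] + \Delta\lambda \; u[v[f]] \right) = \Delta\zeta \; v[f] + \Delta\lambda \, \Delta\zeta \; u[v[f]] $$

Note that u v is defined at P0. It is a kind of second derivative: first a derivative in one direction and subsequently in the other. Unlike u and v, the product u v is not a vector, because it contains second derivatives of f.

Going the other way around, from P2 to P4, one can show that

$$ f(P_4) - f(P_2) = \Delta\lambda \; u[f] + \Delta\lambda \, \Delta\zeta \; v[u[f]] $$

the same first-order derivatives, now applied in the other order.

We want to know f(P3) − f(P4), the "open part" of the loop in the figure. With a little trick (adding terms that cancel out), we write

$$ f(P_3) - f(P_4) = \big( f(P_3) - f(P_1) \big) + \big( f(P_1) - f(P_0) \big) - \big( f(P_4) - f(P_2) \big) - \big( f(P_2) - f(P_0) \big) $$

Now using the four approximations above (see the figure), the first-order terms Δλ u[f] and Δζ v[f] each appear once with a plus sign and once with a minus sign, so they cancel. Only the mixed second-order terms remain, and we get

$$ f(P_3) - f(P_4) = \Delta\lambda \, \Delta\zeta \, \big( u[v[f]] - v[u[f]] \big) $$

where all derivatives are evaluated at P0 and the equal signs are valid in the limit of infinitesimal displacements.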

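A special notation is used for this difference; it is called the Lie vector or Lie operator

$$ \{u, v\} = u \, v - v \, u $$
   (Lie operator)

So the Lie vector is the commutator of u and v. When u and v are tangent to the curves of a coordinate grid, they commute and the Lie vector is zero. It deviates from zero only when the two families of curves do not fit together into a coordinate grid.

We have shown that the Lie vector forms the fifth vector that closes the loop between the points P3 and P4. Note also that the Lie operator is a vector defined at the point P0 and that it is a linear (first-order) operator,

$$ \{u, v\}[f] = \sum_{i,j} \left( u^i \frac{\partial v^j}{\partial x^i} - v^i \frac{\partial u^j}{\partial x^i} \right) \frac{\partial f}{\partial x^j} $$

since all second-order terms cancel out.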
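A commutator that does not vanish can be seen in a small numerical sketch. The example assumes the unit basis vectors of plane polar coordinates written as operators, u = ∂/∂r and v = (1/r) ∂/∂θ, which do not form a coordinate grid; a short calculation gives u v − v u = −(1/r²) ∂/∂θ for this pair. The operator helpers, the step size and the test function are hypothetical.

```python
import math


def apply_u(f, h=1e-4):
    """u = d/dr (unit radial vector) as an operator: returns the function u[f]."""
    return lambda r, theta: (f(r + h, theta) - f(r - h, theta)) / (2 * h)


def apply_v(f, h=1e-4):
    """v = (1/r) d/dtheta (unit tangential vector): returns the function v[f]."""
    return lambda r, theta: (f(r, theta + h) - f(r, theta - h)) / (2 * h * r)


def lie_bracket(f):
    """Return the function u[v[f]] - v[u[f]]."""
    uv = apply_u(apply_v(f))
    vu = apply_v(apply_u(f))
    return lambda r, theta: uv(r, theta) - vu(r, theta)


def sample_function(r, theta):
    # Hypothetical test function f = r^2 * sin(theta).
    return r * r * math.sin(theta)
```

For f = r² sin θ one has ∂f/∂θ = r² cos θ, so the commutator applied to f should be close to −cos θ at every point, which is not zero although it contains only first derivatives of f.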


Curvature with covariant derivatives

This section shows how the covariant derivative can be used to define curvature. It can be seen as an extended explanation of the definition of the Riemann curvature tensor.

We move some vector w around in a loop spanned by two vectors u and v (all at P0) and closed by the vector {u,v} as discussed in the previous section. Note the directions: moving against the arrows is taken into account with a minus sign.

We will transport w to the five corners in the figure and compare it with the value of w in each corner. For a closed curve all intermediate values cancel, and in the end we can compare w at P0 with the transported w, also at P0. This difference will be defined as the curvature operator, the vector field R(u,v) w (since it will depend on the path defined by u and v).

Note that D_u w is the difference between w at P2 and the w that is parallel shifted from P4 to P2. This difference is a vector at P2. Applying D_v to it thus compares the original vector w with the w that is shifted from P4 via P2 to P0. So

$$ D_v D_u w $$

is a vector at P0. In the same way,

$$ D_u D_v w $$

is a vector at P0, the difference between the original and the one that is dragged from P3 via P1 to P0. In the last part we use that {u,v} is a vector defined at P0, while

$$ D_{\{u,v\}} w $$

is the difference between the w that is transported from P3 to P4 and the w at P4.

We now take care of the signs and compare the "original w" with the one that is dragged from P0 to P1, then from P1 to P3, then from P3 to P4, then from P4 to P2, and finally from P2 back to P0. This difference is defined as the curvature operator, the vector field R(u,v) w with

$$ R(u,v) \, w = D_u D_v w - D_v D_u w - D_{\{u,v\}} w $$
(Definition of Riemann tensor)

So the curvature operator is the difference between the commutator of the covariant derivatives and the covariant derivative along the commutator of the vectors.

For vectors u and v that commute (for example, tangent vectors of a coordinate grid), {u,v} = 0 and one can ignore the last term in the definition. The Riemann tensor is then simply the change in w when it is moved around an infinitesimal parallelogram spanned by the vectors u and v.
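For coordinate basis vectors the last term drops out, and the components of the curvature operator can be computed directly from the Christoffel symbols. With D_i e_j = Σ_k Γ^k_{ij} e_k, writing out R(e_c, e_d) e_b = D_c D_d e_b − D_d D_c e_b gives the component along e_a as ∂_c Γ^a_{db} − ∂_d Γ^a_{cb} + Σ_e (Γ^a_{ce} Γ^e_{db} − Γ^a_{de} Γ^e_{cb}). The sketch below evaluates this with central differences. The example space (the unit sphere with its standard Christoffel symbols) and all function names are assumptions of the illustration.

```python
import math


def sphere_christoffel(x):
    """Gamma[k][i][j] on the unit sphere, coordinates x = (theta, phi)."""
    theta = x[0]
    gamma = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    gamma[0][1][1] = -math.sin(theta) * math.cos(theta)   # Gamma^theta_{phi phi}
    gamma[1][0][1] = math.cos(theta) / math.sin(theta)    # Gamma^phi_{theta phi}
    gamma[1][1][0] = gamma[1][0][1]                       # Gamma^phi_{phi theta}
    return gamma


def riemann(christoffel, x, h=1e-5):
    """curvature[a][b][c][d]: component a of R(e_c, e_d) e_b for coordinate basis vectors."""
    n = len(x)
    gamma = christoffel(x)
    dgamma = []  # dgamma[c][k][i][j] approximates d Gamma^k_{ij} / d x^c
    for c in range(n):
        forward = list(x)
        backward = list(x)
        forward[c] += h
        backward[c] -= h
        g_plus = christoffel(forward)
        g_minus = christoffel(backward)
        dgamma.append([
            [[(g_plus[k][i][j] - g_minus[k][i][j]) / (2 * h) for j in range(n)]
             for i in range(n)]
            for k in range(n)
        ])
    curvature = [[[[0.0] * n for _ in range(n)] for _ in range(n)] for _ in range(n)]
    for a in range(n):
        for b in range(n):
            for c in range(n):
                for d in range(n):
                    value = dgamma[c][a][d][b] - dgamma[d][a][c][b]
                    for e in range(n):
                        value += gamma[a][c][e] * gamma[e][d][b]
                        value -= gamma[a][d][e] * gamma[e][c][b]
                    curvature[a][b][c][d] = value
    return curvature
```

On the unit sphere the component with a = θ, b = φ, c = θ, d = φ should come out close to sin²θ, and swapping c and d should flip its sign.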

Although defined as a combination of derivatives, the curvature operator turns out to be linear in its arguments, and hence is a tensor. It is easily shown that for any real function f and any w, the derivatives of f cancel out, so

$$ R(u,v)(f w) = f \, R(u,v) \, w $$
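The cancellation can be written out in a few lines. The sketch below assumes the definitions R(u,v) w = D_u D_v w − D_v D_u w − D_{u,v} w and {u,v} = u v − v u, and uses only the rule D_v (f w) = f D_v w + v[f] w:

```latex
\begin{aligned}
D_u D_v (f w) &= u[v[f]]\, w + v[f]\, D_u w + u[f]\, D_v w + f\, D_u D_v w \\
D_v D_u (f w) &= v[u[f]]\, w + u[f]\, D_v w + v[f]\, D_u w + f\, D_v D_u w \\
D_{\{u,v\}} (f w) &= \big( u[v[f]] - v[u[f]] \big)\, w + f\, D_{\{u,v\}} w \\
R(u,v)(f w) &= f \left( D_u D_v w - D_v D_u w - D_{\{u,v\}} w \right) = f\, R(u,v)\, w
\end{aligned}
```

Subtracting the second and third lines from the first, every term that contains a derivative of f appears once with each sign, which is why only the term proportional to f survives.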





Wikipedia - All text is available under the terms of the GNU Free Documentation License.
