
Manifolds [7CCMMS18/ CM437Z - Semester 1, 2011]

Jan B. Gutowski

Department of Mathematics, King’s College London

Strand, London

WC2R 2LS

Email: [email protected]

These notes are slightly modified from the lecture notes written by Neil Lambert and Alice Rogers

Contents

1. Manifolds 3

1.1 Elementary Topology and Definitions 3

1.2 Manifolds 4

2. The Tangent Space 11

2.1 Maps from M to R 11

2.2 Tangent Vectors 12

2.3 Curves and their Tangents 17

3. Maps Between Manifolds 21

3.1 Diffeomorphisms 21

3.2 Push Forward of Tangent Vectors 21

4. Vector Fields 24

4.1 Vector Fields 24

4.2 Integral Curves and Local Flows 25

4.3 Lie Derivatives 28

5. Tensors 30

5.1 Co-Tangent Vectors 30

5.2 Pull-back and Lie Derivative of a co-vector 32

5.3 Tensors 34

6. Differential Forms 38

6.1 Forms 38

6.2 Exterior Derivative 41

6.3 Integration on Manifolds 47

7. Connections, Curvature and Metrics 54

7.1 Connections, Curvature and Torsion 54

7.2 Riemannian Manifolds 60

7.3 Symplectic Manifolds 65

– 1 –

Recommended Books

• C. Isham, Modern Differential Geometry for Physicists, World Scientific, 1989.

• M. Nakahara, Geometry, Topology and Physics IOP, 1990.

• I. Madsen and J. Tornehave, From Calculus to Cohomology, CUP, 1997.

• M. Göckeler and T. Schücker, Differential Geometry, Gauge Theories and Gravity,

CUP, 1987.

• S. Kobayashi and K. Nomizu, Foundations of Differential Geometry, vol. I, Wiley,

1963.

A useful review of differential geometry (which includes much of the course material,

but also significant amounts of other material) can be found in

• http://empg.maths.ed.ac.uk/Activities/GT/EGH.pdf

For an introduction to topological spaces, see

• W. A. Sutherland, Introduction to Metric and Topological Spaces, OUP, 2009.

Course information can be found at:

http://www.mth.kcl.ac.uk/courses/cmms18.html

– 2 –

1. Manifolds

1.1 Elementary Topology and Definitions

This section should be a review of concepts (hence it is all definitions and no theorems).

Definition 1.1. A topological space X is a set whose elements are called points together

with a collection U = {Ui} of subsets of X which are called open sets and satisfy

(i) U is closed under finite intersections and arbitrary unions

(ii) ∅, X ∈ U

Definition 1.2. A set is closed if its complement in X is open.

Definition 1.3. An open cover of X is a collection of open sets whose union is X.

A topology allows us to define a notion of local meaning that something is true or

exists in some open set about a point, but not necessarily the whole space or even all open

sets about that point. Some other important concepts are:

Definition 1.4. X is connected if it is not the union of two disjoint open sets.

Definition 1.5. A map f : X → Y between two topological spaces is continuous iff

f^{-1}(V) = {x ∈ X : f(x) ∈ V} is open for any open set V ⊂ Y.

Definition 1.6. If X ⊂ Y , where Y is a topological space then X can be made into a

topological space too by considering the induced topology: open sets in X are generated by

U ∩X ⊂ X where U is an open set of Y .

Problem 1.1. Show that the induced topology indeed satisfies the definition of a topology.

Definition 1.7. A Hausdorff space is a topological space with the additional property that

points can be separated: for any two distinct points x, y ∈ X, there exist open sets Ux and

Uy such that x ∈ Ux, y ∈ Uy and Ux ∩ Uy = ∅. Hausdorff spaces are also known as T2 spaces (as there are also weaker notions of separability).

Non-Hausdorff spaces have various pathologies that we do not want to consider. There-

fore in what follows we will take all topological spaces to be Hausdorff unless otherwise

mentioned.

Definition 1.8. A function f : X → Y is one-to-one iff f(x) = f(y) implies x = y.

This guarantees the existence of a left inverse f−1L : f(X) ⊂ Y → X such that f−1L f(x) = x for all x ∈ X, since every element in the image f(X) comes from a unique point

in X.

Definition 1.9. A function f : X → Y is onto iff f(X) = Y , i.e. for all y ∈ Y there

exists an x ∈ X such that f(x) = y.

– 3 –

This guarantees the existence of a right inverse f−1R : Y → X such that f f−1R (y) = y

for all y ∈ Y , since every element in Y has some x ∈ X (not necessarily unique) which is

mapped to it by f .

Definition 1.10. A bijection is a map which is both one-to-one and onto.

Definition 1.11. A homeomorphism is a bijection f : X → Y which is continuous and

whose inverse is continuous.

We often have much more structure. For example if there is a notion of distance then

the usual topology is that defined by the open balls.

Definition 1.12. A metric space is a point set X together with a map d : X × X → Rsuch that

(i) d(x, y) = d(y, x)

(ii) d(x, y) ≥ 0 with equality iff x = y

(iii) d(x, y) ≤ d(x, z) + d(z, y)

The metric topology is then defined by the open balls

Uε(x) = {y ∈ X : d(x, y) < ε}   (1.1)

where ε > 0, along with finite intersections and arbitrary unions of open balls. Note that it
follows that X is open since it is a union of open balls

X = ⋃_{x∈X} Uε(x)   (1.2)

for any ε > 0 of your choosing. These spaces are always Hausdorff (property (ii) ensures
that any two distinct points are a non-zero distance apart, and hence open balls with
ε taken to be half this distance will separate them), and this also implies that ∅ is open.

In particular we will heavily use Rn viewed as a metric topological space (with the
usual Pythagorean definition of distance).

1.2 Manifolds

Roughly speaking a manifold is a topological space for which one can locally make charts

which piece together in a consistent way.

Definition 1.13. An n-dimensional chart onM is a pair (U, φ) where U is an open subset

of M and φ : U → Rn is a homeomorphism onto its image φ(U) ⊂ Rn

– 4 –

Figure 1: An n-dimensional chart (U, φ)

Definition 1.14. A n-dimensional differentiable structure on M is a collection of n-

dimensional charts (Ui, φi), i ∈ I such that

(i) M = ∪i∈IUi

(ii) For any pair of charts Ui, Uj with Ui ∩ Uj ≠ ∅, the map φj ∘ φi^{-1} : φi(Ui ∩ Uj) → φj(Ui ∩ Uj) is C∞, i.e. all partial derivatives exist up to any order.

(iii) We always take a differentiable structure to be a maximal set of charts, i.e. the union

over all charts which satisfy (i) and (ii).

N. B. The functions φj φ−1i are called transition functions.

Theorem 1.1. If M is connected then n is well defined, i.e. all charts have the same

value of n.

Proof. Suppose that two charts had different values of n. Then it is clear that they cannot

intersect, because the map φj ∘ φi^{-1}, which takes a subset of R^{n_i} to a subset of R^{n_j}, is C∞

smooth and invertible, and hence preserves the dimension. Since this is true for all charts

we see that M must split into disjoint charts, at least one for each different value of n.

Since a chart is an open set we can therefore write M as a union over disjoint open sets,

one for each value of n. Thus if M is connected it must have a unique value of n.

Henceforth we will only consider connected topological spaces.

Definition 1.15. A differentiable manifold M of dimension n is a connected Hausdorff

topological space equipped with an n-dimensional differentiable structure.

– 5 –

Figure 2: Transition functions

N.B. One can also study differentiable structures where the transition functions are

only Ck for some k > 0. Alternatively one could replace Rn by Cn and demand that the

transition functions are holomorphic. These are therefore a special case of 2n-dimensional

real manifolds. This leads to the beautiful and rich subject of complex differential geometry

which we will not have time to consider here.

Example: Trivially Rn is an n-dimensional manifold. A single chart that covers the
whole of Rn is (Rn, id) where id is the identity map id(x) = x.

Example: Any open subset U ⊂ Rn is an n-dimensional manifold. Again the single
chart (U, id) is sufficient. In fact any open subset U of a manifold M with charts (Ui, φi)
is also a manifold since one can use the charts (U ∩ Ui, φi).

Example: The circle is a 1-dimensional manifold.

Let M = {(x, y) ∈ R² : x² + y² = 1}. We need at least two charts, say

U1 = {(x, y) ∈ R² : x² + y² = 1, y > −1/√2}

U2 = {(x, y) ∈ R² : x² + y² = 1, y < 1/√2} .

– 6 –

We define φi : Ui → R by

φ1(x, y) = θ ∈ (−π/4, 5π/4)   where (x, y) = (cos θ, sin θ)

φ2(x, y) = θ′ ∈ (−5π/4, π/4)   where (x, y) = (cos θ′, sin θ′)

Now

U1 ∩ U2 = {(x, y) ∈ R² : x² + y² = 1, −1/√2 < y < 1/√2} = VL ∪ VR

where

VL = {(x, y) ∈ R² : x² + y² = 1, −1/√2 < y < 1/√2, x < 0}

and

VR = {(x, y) ∈ R² : x² + y² = 1, −1/√2 < y < 1/√2, x > 0}

Figure 3: The open sets U1, U2

Now on φ1(VL) = (3π/4, 5π/4), φ2 ∘ φ1^{-1}(θ) = θ − 2π, whereas on φ1(VR) = (−π/4, π/4),
φ2 ∘ φ1^{-1}(θ) = θ. Similarly φ1 ∘ φ2^{-1}(θ′) = θ′ + 2π on φ2(VL) = (−5π/4, −3π/4), and
φ1 ∘ φ2^{-1}(θ′) = θ′ on φ2(VR) = (−π/4, π/4). These maps are C∞ and hence we have a manifold.

Here on the circle we see that locally we can define a single coordinate, θ which we

think of as an angle. But θ is not defined over the whole circle, θ = 0 and θ = 2π are the

same.
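As a quick check of the transition functions (this particular point is not worked out in the notes): the point (−1, 0) lies in VL, and

φ1(−1, 0) = π ,   φ2(−1, 0) = −π = π − 2π ,

consistent with φ2 ∘ φ1^{-1}(θ) = θ − 2π on φ1(VL).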

This illustrates a key point: The maps φi provide coordinates, just like a map in an

atlas provides coordinates in the form of longitude and latitude. However the coordinates

– 7 –

will not in general work over the whole of the manifold. For example the surface of the

earth can be mapped in an atlas but the notion of latitude and longitude will break down

somewhere; at the poles longitude is not defined. Maps that one sees hanging on a wall

always break down somewhere (usually at both poles) but an atlas can smoothly cover the

whole earth.

Often the Ui are called coordinate neighbourhoods, or patches, and the φi are coordi-

nate maps. If p ∈M is some point contained in a given patch Ui then the local coordinates

of p are

φi(p) = (x1(p), x2(p), x3(p), . . . , xn(p)) ∈ Rn   (1.3)

Figure 4: Local coordinates

Clearly there is a huge amount of choice of local coordinates. Typically in any given

patch Ui we could choose from an infinite number of different functions φi. Furthermore

for a given manifold there will be infinitely many choices of open sets Ui which we use to

cover it with.

Differential geometry is the study of manifolds and uses tensorial objects which take

into account this huge redundancy in the actual way that we may choose to describe a given

manifold. This is the so-called coordinate free approach. Often, especially in older texts,

one fixes a covering and coordinate patches and writes any tensor in terms of its values in

some given local coordinate system. This may be convenient for some calculational purposes

but it obscures the true coordinate independent meaning of the important concepts. In

addition it should always be kept in mind when using explicit coordinates that they are

almost certainly not valid everywhere. One might often need to change coordinates, either

because we prefer to use a different choice of coordinates valid in the same patch, or because

– 8 –

we need to transform to a new patch which covers a different portion of the manifold. In

this course we will use the coordinate free approach as much as possible.

Example: Let us consider RPn = (Rn+1 − {0})/∼ where ∼ is the equivalence relation

(x1, x2, x3, . . . , xn+1) ∼ λ(x1, x2, x3, . . . , xn+1)   (1.4)

for any λ ∈ R − {0}. We denote an element of the equivalence class by [x1, x2, x3, . . . , xn+1].

Let us choose for the charts

U1 = {[x1, x2, x3, . . . , xn+1] ∈ RPn : x1 ≠ 0}
U2 = {[x1, x2, x3, . . . , xn+1] ∈ RPn : x2 ≠ 0}
U3 = {[x1, x2, x3, . . . , xn+1] ∈ RPn : x3 ≠ 0}
...
Un+1 = {[x1, x2, x3, . . . , xn+1] ∈ RPn : xn+1 ≠ 0}

with the functions

φ1([x1, x2, x3, . . . , xn+1]) = (x2/x1, x3/x1, x4/x1, . . . , xn+1/x1) ∈ Rn

φ2([x1, x2, x3, . . . , xn+1]) = (x1/x2, x3/x2, x4/x2, . . . , xn+1/x2) ∈ Rn

φ3([x1, x2, x3, . . . , xn+1]) = (x1/x3, x2/x3, x4/x3, . . . , xn+1/x3) ∈ Rn

...

φn+1([x1, x2, x3, . . . , xn+1]) = (x1/xn+1, x2/xn+1, x3/xn+1, . . . , xn/xn+1) ∈ Rn

Clearly ∪_{i=1}^{n+1} Ui = RPn and each φi is a homeomorphism (without loss of generality
we can take xi = 1 in Ui). Consider the intersection of U1 and U2, and consider

(u1, u2, . . . , un) ∈ φ1(U1 ∩ U2)   (1.5)

One must therefore take u1 ≠ 0, and hence

φ1^{-1}(u1, u2, u3, . . . , un) = [1, u1, u2, . . . , un]   (1.6)

and hence

φ2 ∘ φ1^{-1}(u1, u2, u3, . . . , un) = (1/u1, u2/u1, u3/u1, . . . , un/u1) ∈ Rn .   (1.7)

Since u1 ≠ 0 this map is C∞ on φ1(U1 ∩ U2). All the other intersections follow the same
way. Thus RPn is an n-dimensional manifold.

– 9 –

Problem 1.2. What is RP1?

Theorem 1.2. If M and N are m and n-dimensional manifolds respectively then M × N
is an (m + n)-dimensional manifold.

Proof. Let (Ui, φi), i ∈ I be an m-dimensional differential structure for M and (Va, ψa),

a ∈ A be an n-dimensional differential structure for N . We can construct a differential

structure for M×N by taking the following charts:

Wia = Ui × Va ,   χia : Wia → Rm+n ,   χia(x, y) = (φi(x), ψa(y)) ,   i ∈ I, a ∈ A   (1.8)

where x ∈ M, y ∈ N. Clearly ∪ia Wia = M × N, and the χia are homeomorphisms. It is also
clear that the transition functions

χjb ∘ χia^{-1} = (φj ∘ φi^{-1}, ψb ∘ ψa^{-1})   (1.9)

are C∞.
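For example (a standard consequence, not spelled out in the notes): combining this theorem with the circle example above, the torus T² = S1 × S1 is a 2-dimensional manifold, with charts of the form (Ui × Uj , φi × φj) built from the circle charts.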

Problem 1.3. Show that the following

U1 = {(x, y) ∈ S1 : y > 0} ,   φ1(x, y) = x

U2 = {(x, y) ∈ S1 : y < 0} ,   φ2(x, y) = x

U3 = {(x, y) ∈ S1 : x > 0} ,   φ3(x, y) = y

U4 = {(x, y) ∈ S1 : x < 0} ,   φ4(x, y) = y

are a set of charts which cover S1.

Problem 1.4. Show that the 2-sphere S2 = {(x, y, z) ∈ R3 : x² + y² + z² = 1} is a
2-dimensional manifold.

Hint: consider stereographic projection. This requires using two charts

US = {(x, y, z) ∈ S2 : z < 1}   and   UN = {(x, y, z) ∈ S2 : z > −1}   (1.10)

these are clearly open and cover S2. In each chart one constructs φN/S : UN/S → R2 by

taking a straight line through either the south pole (0, 0,−1) or north pole (0, 0, 1) and

then through the point p ∈ UN/S . These lines are defined by the equation

X(λ) = (0, 0, ±1) + λ (x, y, z ∓ 1)   (1.11)

so that X(0) is either the north or south pole and X(1) is a point on S2. We define φN/S(p)

to be the point in the (x, y)-plane where the line intersects z = 0.
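Carrying the hint a little further (these explicit formulae are not written out in the notes, but follow directly from (1.11)): setting the third component of X(λ) to zero gives λ = 1/(1 ∓ z), so that

φS(x, y, z) = ( x/(1 − z) , y/(1 − z) ) ,   φN(x, y, z) = ( x/(1 + z) , y/(1 + z) ) .

On the overlap, writing (u, v) = φS(x, y, z), one finds u² + v² = (1 + z)/(1 − z) and hence

φN ∘ φS^{-1}(u, v) = ( u/(u² + v²) , v/(u² + v²) ) ,

which is C∞ away from the origin, as required.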

– 10 –

Figure 5: Stereographic projection from US = S2 − (0, 0, 1) → R2

2. The Tangent Space

An important notion in geometry is that of a tangent vector. This is intuitively familiar for

a curve in Rn. But the elementary definition of a tangent vector, or indeed any vector, relies

on special properties of Rn such as a fixed coordinate system and its vector space structure.

Once given a vector v = (v1, v2, v3) ∈ R3, for example, we can consider the derivative
in the direction of v:

v1 ∂/∂x1 + v2 ∂/∂x2 + v3 ∂/∂x3   (2.1)

Here we view this expression as an operator acting on functions f : R3 → R. Changing
coordinates, for example by performing a rotation, we must also change the coefficients
v1, v2, v3 in an appropriate way; however, the action on a function remains the same. We

need to generalise the notion of a tangent vector to manifolds in a coordinate free way.

There are several equivalent ways to do this but here we will use the identification of a

vector field with an operator acting on functions.

Our first step is to introduce differentiable functions on manifolds. We will then proceed

to understand vectors as operators on differentiable functions.

2.1 Maps from M to R

Definition 2.1. A function f : M → R is C∞ iff for each chart (Ui, φi) in the differentiable
structure of M

f ∘ φi^{-1} : φi(Ui) → R   (2.2)

is C∞. The set of such functions on a manifold M is denoted C∞(M).

– 11 –

Note that if f ∘ φi^{-1} is C∞, and (Uj, φj) is another chart with Ui ∩ Uj ≠ ∅, then f ∘ φj^{-1} will be C∞ on Ui ∩ Uj. Thus we need only check that f ∘ φi^{-1} is C∞ over a set of charts
that covers M.

Problem 2.1. Consider the circle S1 as above. Show that f : S1 → R with f(x, y) = x2+y

is C∞.

Definition 2.2. An algebra V is a real vector space along with an operation ⋆ : V × V → V such that

(i) v ⋆ 0 = 0 ⋆ v = 0

(ii) (λv) ⋆ u = v ⋆ (λu) = λ(v ⋆ u)

(iii) v ⋆ (u + w) = v ⋆ u + v ⋆ w

(iv) (u + w) ⋆ v = u ⋆ v + w ⋆ v

for all u, v, w ∈ V and λ ∈ R.

Theorem 2.1. C∞(M) is an algebra with addition and multiplication defined pointwise

(f + g)(p) = f(p) + g(p)

(λf)(p) = λf(p)

(f ⋆ g)(p) = f(p)g(p)   (2.3)

Proof. Let us show that f ⋆ g is in C∞(M). Note that

(f ⋆ g) ∘ φi^{-1} = (f ∘ φi^{-1}) · (g ∘ φi^{-1})   (2.4)

where f ∘ φi^{-1} and g ∘ φi^{-1} are C∞. Therefore their pointwise product is too. Hence,
(f ⋆ g) ∘ φi^{-1} is C∞, which is what we needed to show.

The other properties can be shown in a similar manner.

Definition 2.3. For a point p ∈M, we let C∞(p) be the set of functions such that

(i) f : U → R where p ∈ U ⊂M and U is an open set.

(ii) f ∈ C∞(U) (recall that an open subset of a manifold is a manifold)

2.2 Tangent Vectors

We can now state our main definition.

Definition 2.4. A tangent vector at a point p ∈ M is a map Xp : C∞(p) → R, which

satisfies

(i) Xp(f + g) = Xp(f) +Xp(g)

(ii) Xp(constant function) = 0

– 12 –

(iii) Xp(fg) = f(p)Xp(g) + Xp(f)g(p)   (‘Leibniz rule’)

The set of tangent vectors at a point p ∈M is called the tangent space to M at p and

is denoted by TpM.

The union of all tangent spaces to M is called the tangent bundle

TM = ∪p∈MTpM . (2.5)

This is an example of a fibre bundle and is itself a 2n-dimensional manifold.

N.B.: In general objects which satisfy these properties are called derivations.

N.B.: With this definition a tangent vector is a linear map from C∞(p) to R. Since C∞(p)

is a vector space (it is an algebra) the tangent vectors are therefore elements of the dual

vector space to C∞(p). However C∞(p) and hence its dual are infinite dimensional. The

conditions (i), (ii) and (iii) restrict the possible linear maps that we identify as tangent

vectors and in fact we will see that they become a finite dimensional vector space.

Example: Consider Rn as a manifold with the obvious chart U = Rn, φ : U → Rn

taken to be the identity, then

X = Σ_{µ=1}^{n} λµ ∂/∂xµ

is a tangent vector. In fact, we will learn that all tangent vectors have this form.

In what follows we denote

∂f/∂xµ = ∂µf ,   ∂²f/∂xµ∂xν = ∂µ∂νf ,   etc.   (2.6)

where f is defined on some open set in Rn. We can extend this to manifolds by the following

Definition 2.5. Let (x1, . . . , xn) be local coordinates about a point p ∈ M. That is, there
exists a chart (Ui, φi) with p ∈ Ui and φi(q) = (x1(q), x2(q), . . . , xn(q)) for all q ∈ Ui. We
define

∂/∂xµ |p : C∞(p) → R   (2.7)

by

∂/∂xµ |p f = ∂µ(f ∘ φi^{-1})(x1(p), . . . , xn(p)) = ∂µ(f ∘ φi^{-1}) ∘ φi(p)   (2.8)

Theorem 2.2. ∂/∂xµ |p is a tangent vector to M at p.

Proof. Let f, g ∈ C∞(p) be defined on an open set U in M that contains p, then

∂/∂xµ |p (f + g) = ∂µ((f + g) ∘ φi^{-1})(x1(p), . . . , xn(p))
                 = ∂µ(f ∘ φi^{-1} + g ∘ φi^{-1})(x1(p), . . . , xn(p))
                 = ∂µ(f ∘ φi^{-1})(x1(p), . . . , xn(p)) + ∂µ(g ∘ φi^{-1})(x1(p), . . . , xn(p))
                 = ∂/∂xµ |p (f) + ∂/∂xµ |p (g)   (2.9)

– 13 –

It is clear that ∂/∂xµ |p (constant map) = 0.

Also

∂/∂xµ |p (fg) = ∂µ((fg) ∘ φi^{-1})(x1(p), . . . , xn(p))
              = ∂µ((f ∘ φi^{-1}) · (g ∘ φi^{-1}))(x1(p), . . . , xn(p))
              = ∂µ(f ∘ φi^{-1})(x1(p), . . . , xn(p)) (g ∘ φi^{-1})(x1(p), . . . , xn(p))
                + (f ∘ φi^{-1})(x1(p), . . . , xn(p)) ∂µ(g ∘ φi^{-1})(x1(p), . . . , xn(p))
              = ∂/∂xµ |p (f) g(p) + ∂/∂xµ |p (g) f(p)   (2.10)

We will show that all tangent vectors arise in this way

Example: Consider RP1 = (R² − {0})/∼ where (x, y) ∼ λ(x, y), λ ≠ 0.

We have two charts

Ut = {[x1, x2] : x1 ≠ 0} ,   t = φt([x1, x2]) = x2/x1

Us = {[x1, x2] : x2 ≠ 0} ,   s = φs([x1, x2]) = x1/x2

thus on the intersection Us ∩ Ut, s = 1/t. Let p = [1, 3] and consider the tangent vector

X : C∞(p) → R ,   X(f) = ∂/∂t |p f = d/dt (f ∘ φt^{-1}) (t = 3) .   (2.11)

How does X act in the other coordinate system (where they overlap)? Recall that

d/dt (f(t)) = (ds(t)/dt) d/ds (f(t(s)))

Now s(t) = φs ∘ φt^{-1} and t(s) = φt ∘ φs^{-1} so we have

X(f) = d/dt (f ∘ φt^{-1}) (t = 3)
     = (ds(t)/dt) d/ds (f ∘ φt^{-1} ∘ (φt ∘ φs^{-1})) (s = 1/3)
     = −(1/t²) d/ds (f ∘ φs^{-1}) |_{s=1/3}
     = −(1/9) d/ds (f ∘ φs^{-1}) |_{s=1/3} .   (2.12)

Thus the tangent vector can look rather different, depending on the coordinate system one

chooses. However, its definition as a linear map from C∞(p) to R is independent of the

choice of coordinates, i.e. (2.11) and (2.12) will agree on any function in C∞(p).
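As a concrete check (this particular function is not in the notes): take f with f ∘ φt^{-1}(t) = t², so that f ∘ φs^{-1}(s) = 1/s² on the overlap. Then (2.11) gives X(f) = d/dt (t²)|_{t=3} = 6, while (2.12) gives −(1/9) d/ds (1/s²)|_{s=1/3} = −(1/9)(−54) = 6, in agreement.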

– 14 –

Lemma 2.1. Let (x1, . . . , xn) be a coordinate system about p ∈ M. Then for every function
f ∈ C∞(p) there exist n functions f1, . . . , fn ∈ C∞(p) such that

fµ(p) = ∂/∂xµ |p f   (2.13)

and

f(q) = f(p) + Σµ (xµ(q) − xµ(p)) fµ(q)   (2.14)

Proof. To begin, let F = f ∘ φi^{-1}, which is defined on V = φi(U ∩ Ui) where U is the domain
of f. Let B be an open ball in V ⊂ Rn centred on v = φi(p), and take y ∈ B.

Then

F(y1, . . . , yn) = F(y1, . . . , yn) − F(y1, . . . , yn−1, vn)
                 + F(y1, . . . , yn−1, vn) − F(y1, . . . , vn−1, vn)
                 . . .
                 + F(y1, v2, . . . , vn) − F(v1, . . . , vn)
                 + F(v1, . . . , vn)

               = F(v1, . . . , vn) + Σµ ( F(y1, . . . , yµ, vµ+1, . . . , vn) − F(y1, . . . , vµ, vµ+1, . . . , vn) )

               = F(v1, . . . , vn) + Σµ [ F(y1, . . . , vµ + t(yµ − vµ), vµ+1, . . . , vn) ]_{t=0}^{t=1}

               = F(v1, . . . , vn) + Σµ ∫_0^1 (d/dt) F(y1, . . . , vµ + t(yµ − vµ), vµ+1, . . . , vn) dt

               = F(v1, . . . , vn) + Σµ ∫_0^1 ∂µF(y1, . . . , yµ−1, vµ + t(yµ − vµ), vµ+1, . . . , vn) (yµ − vµ) dt   (2.15)

If we define

Fµ(y1, . . . , yn) = ∫_0^1 ∂µF(y1, . . . , vµ + t(yµ − vµ), vµ+1, . . . , vn) dt   (2.16)

then

F(y1, . . . , yn) = F(v1, . . . , vn) + Σµ (yµ − vµ) Fµ(y1, . . . , yn)   (2.17)

Finally, we recall that F = f ∘ φi^{-1}, so that if we let (y1, . . . , yn) = φi(q) = (x1(q), . . . , xn(q))
then this condition can be rewritten as

f ∘ φi^{-1} ∘ φi(q) = f ∘ φi^{-1} ∘ φi(p) + Σµ (xµ(q) − xµ(p)) Fµ ∘ φi(q)   (2.18)

– 15 –

and so

f(q) = f(p) + Σµ (xµ(q) − xµ(p)) fµ(q)   (2.19)

where we identify fµ = Fµ ∘ φi. It also follows that

∂/∂xµ |p f = ∂/∂xµ (f ∘ φi^{-1}) ∘ φi(p)
           = (∂F/∂yµ) ∘ φi(p)
           = ∂/∂yµ ( F(v1, . . . , vn) + Σν (yν − vν) Fν(y1, . . . , yn) ) |_{y=φi(p)}
           = ( Fµ(y1, . . . , yn) + Σν (yν − vν) ∂µFν(y1, . . . , yn) ) |_{y=φi(p)}
           = Fµ(φi(p))
           = fµ(p)   (2.20)
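As a one-dimensional illustration (this example is not in the notes): take M = R with the identity chart, p = 0 and f(x) = cos x. Then (2.16) gives

f1(x) = ∫_0^1 (−sin(tx)) dt = (cos x − 1)/x   for x ≠ 0 ,   f1(0) = 0 ,

so that f(x) = f(0) + x f1(x) reproduces (2.14), while f1(0) = 0 = f′(0) verifies (2.13).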

Theorem 2.3. Tp(M) is an n-dimensional real vector space and a set of basis vectors is

{ ∂/∂xµ |p , µ = 1, . . . , n }   (2.21)

i.e. a general element of Tp(M) can be written as

Xp = Σµ λµ ∂/∂xµ |p   (2.22)

Proof. First we note that if Xp and Yp are two tangent vectors at a point p ∈ M then we

can add them or multiply them by a number λ ∈ R:

(i) (Xp + Yp) : C∞(p)→ R, f → Xpf + Ypf

(ii) (λXp) : C∞(p)→ R, f → λ(Xpf)

(Convince yourself of this).

Next, we show that the basis elements (2.21) span the vector space of tangent vectors.

To do this, note that the previous lemma implied

Xp(f) = Xp ( f(p) + Σµ (xµ − xµ(p)) fµ )   (2.23)

Now Xp(f(p)) = 0 and Xp(xµ(p)) = 0 as f(p) and xµ(p) are constants. Thus we have

Xp(f) = Σµ ( (xµ(q) − xµ(p)) Xp fµ + Xp(xµ) fµ(q) ) |_{q=p}
      = Σµ Xp(xµ) fµ(p)
      = Σµ Xp(xµ) ∂/∂xµ |p f   (2.24)

– 16 –

where we have used (2.20) in the last step. This shows that the elements (2.21) span

Tp(M). We must now show that they are also linearly independent. To this end, suppose

that

Σµ λµ ∂/∂xµ |p = 0   (2.25)

The coordinate functions are in C∞(p), so we may consider

0 = Σµ λµ ∂/∂xµ |p xν = Σµ λµ ∂/∂xµ (xν) = Σµ λµ δνµ = λν   (2.26)

Thus λµ = 0 for all µ.

So why are they called tangent vectors? First consider Rn, and a curve C : (0, 1)→ Rn.

Recall from elementary geometry that a tangent vector at a point p = C(t1) is a line through
p in the direction (i.e. with the slope)

( dC1/dt |_{t=t1} , . . . , dCn/dt |_{t=t1} )   (2.27)

where C(t) = (C1(t), . . . , Cn(t)) ∈ Rn is C∞.

Now if f ∈ C∞(p), then by our definition, Xp : C∞(p) → R defined by

Xp(f) = d/dt f(C(t)) |_{t=t1}   (2.28)

is a tangent vector to Rn at p = C(t1), i.e. it acts linearly on the function f, vanishes on
constant functions, and satisfies the Leibniz rule. On the other hand, we also have

Xp(f) = (dCµ/dt) |_{t=t1} (∂µf) |_{C(t1)}   (2.29)

so that dCµ/dt |_{t=t1} are the components of Xp in the basis ∂/∂xµ |p.

2.3 Curves and their Tangents

We can now discuss curves on manifolds and their tangents.

Definition 2.6. Consider an open interval (a, b) ⊂ R. A map C : (a, b) → M to a

manifold M is called a smooth curve on M if φi ∘ C is C∞ where it is defined, for any
chart (Ui, φi) of M (i.e. with Ui ∩ C((a, b)) ≠ ∅).

Definition 2.7. For a point t1 ∈ (a, b) with C(t1) = p, we can define the tangent vector
Tp(C) ∈ Tp(M) to the curve C at p by

Tp(C)(f) = d/dt f(C(t)) |_{t=t1}   (2.30)

– 17 –

Figure 6: A smooth curve

It should be clear that Tp(C) ∈ Tp(M).

Let p ∈M be a point on a curve C at t = t1 which is covered by a chart (Ui, φi). Then

there is some ε > 0 such that

C((t1 − ε, t1 + ε)) ⊂ Ui (2.31)

We can express the tangent to C at p as

Tp(C)(f) = d/dt (f(C(t))) |_{t=t1}
         = d/dt ( (f ∘ φi^{-1}) ∘ (φi ∘ C)(t) ) |_{t=t1}
         = Σ_{µ=1}^{n} d/dt ( (φi ∘ C)µ(t) ) |_{t=t1} ∂µ(f ∘ φi^{-1})(φi(C(t1)))   (2.32)

Here we have split f ∘ C : (t1 − ε, t1 + ε) → R as the composition of a function f ∘ φi^{-1} : Rn → R
with φi ∘ C : (t1 − ε, t1 + ε) → Rn, and used the chain rule.

Thus we have

Tp(C)(f) = Σ_{µ=1}^{n} d/dt ( (φi ∘ C)µ(t) ) |_{t=t1} ∂/∂xµ |p (f)   (2.33)

so that

Tp(C) = Σ_{µ=1}^{n} d/dt ( (φi ∘ C)µ(t) ) |_{t=t1} ∂/∂xµ |p   (2.34)

– 18 –

Conversely, suppose that Tp is a tangent vector to M at p. We will now show that

there exists a curve through p such that Tp is its tangent at p. Let (x1, . . . , xn) = φi(q) be

local coordinates about p defined on an open set Ui. Hence we may write

Tp = Σ_{µ=1}^{n} Tµ ∂/∂xµ |p   (2.35)

for some numbers Tµ. We define C : (t1 − ε, t1 + ε) → M by

(φi ∘ C(t))µ = xµ(p) + (t − t1)Tµ   (2.36)

where we pick ε sufficiently small such that φi^{-1}(xµ(p) + (t − t1)Tµ) ∈ Ui for t ∈ (t1 − ε, t1 + ε).

It follows that

d/dt (f(C(t))) |_{t=t1} = d/dt ( (f ∘ φi^{-1}) ∘ (φi ∘ C)(t) ) |_{t=t1}
                       = Σ_{µ=1}^{n} d/dt (φi ∘ C)µ |_{t=t1} ∂µ(f ∘ φi^{-1})(φi(C(t1)))
                       = Σ_{µ=1}^{n} Tµ ∂/∂xµ |p (f)   (2.37)

We have therefore shown that all curves through p ∈ M define a tangent vector to

M at p, and that conversely, all tangent vectors to M at p can be realized as the tangent

vector to some curve C. However, this correspondence is not unique. Clearly many distinct

curves may have the same tangent vector at p, and conversely the construction of the curve

through p with tangent Tp was not unique. But this does lead to the following theorem:

Theorem 2.4. TpM is isomorphic to the set of curves through p ∈ M modulo the equivalence relation

C(t) ∼ C̃(t)   iff   d/dt (f ∘ C) |_{t=t1} = d/dt (f ∘ C̃) |_{t=t1}   (2.38)

for all f ∈ C∞(p), where C(t1) = C̃(t1) = p

Proof. It is easy to check that the construction above provides a bijection between these

two spaces. The more difficult part , which we won’t go into here, is to show that the

vector space structure is preserved. Indeed, to do this we would need to give a vector space

structure to the equivalence class of curves through p.

Theorem 2.5. Let (x1, . . . , xn) = φ1 and (y1, . . . , yn) = φ2 be two coordinate systems at a
point p ∈ M with U1 ∩ U2 ≠ ∅, and suppose Xp ∈ TpM. If

Xp = Σ_{µ=1}^{n} Aµ ∂/∂xµ |p   and   Xp = Σ_{µ=1}^{n} Bµ ∂/∂yµ |p   (2.39)

then

Bµ = Σ_{ν=1}^{n} Aν ∂/∂xν |p (yµ)   (2.40)

where yµ(x1, . . . , xn) = φ2 ∘ φ1^{-1} : φ1(U1 ∩ U2) → φ2(U1 ∩ U2) is a smooth function.

– 19 –

Proof. We have in the second coordinate system that

Xp(yµ) =

n∑ν=1

Bν ∂

∂yν∣∣p(yµ) = Bµ (2.41)

but in the first coordinate system we also see that

Xp(yµ) =

n∑ν=1

Aν∂

∂xν∣∣p(yµ) (2.42)

and as these two expressions must agree, we have proved the theorem.

N.B.: This formula is often simply written as

Bµ = Σ_{ν=1}^{n} Aν ∂yµ/∂xν   (2.43)

or even

A′µ = Aν ∂x′µ/∂xν   (2.44)

with a sum over ν and a prime denoting quantities in the new coordinate system.
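For example (a standard computation, not carried out in the notes): on R² with Cartesian coordinates (x1, x2) = (x, y) and polar coordinates (y1, y2) = (r, θ), where r = √(x² + y²) and θ = arctan(y/x), take Xp = ∂/∂x |p, i.e. (A1, A2) = (1, 0). Then (2.40) gives

B1 = ∂r/∂x = x/r = cos θ ,   B2 = ∂θ/∂x = −y/r² = −(sin θ)/r ,

so that ∂/∂x = cos θ ∂/∂r − (sin θ/r) ∂/∂θ away from the origin.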

We can see that

∂xµ

∣∣∣∣p

yν , and∂

∂yµ∣∣pxν (2.45)

are inverses of each other (when viewed as matrices). To see this note that on interchanging

the coordinate systems, we must also have

Aµ =n∑ν=1

Bν ∂

∂yν∣∣p(xµ) (2.46)

It follows that

Aµ =

n∑ν=1

∂yν∣∣p(xµ)

n∑σ=1

Aσ∂

∂xσ∣∣p(yν) =

n∑σ=1

Aσ( n∑ν=1

∂yν∣∣p(xµ)

∂xσ∣∣p(yν)

)(2.47)

As this must be true for all possible Aµ, it implies that

n∑ν=1

∂yν∣∣p(xµ)

∂xσ∣∣p(yν) = δµσ (2.48)

– 20 –

3. Maps Between Manifolds

3.1 Diffeomorphisms

Definition 3.1. Suppose that f : M → N, where M, (Ui, φi), i ∈ I, and N, (Va, ψa), a ∈ A are two manifolds. We say that f is C∞ iff

ψa ∘ f ∘ φi^{-1} : φi(f^{-1}(Va)) → Rn   (3.1)

is C∞ for all i ∈ I, a ∈ A (such that Ui ∩ f^{-1}(Va) ≠ ∅).

Problem 3.1. Show that f : S1 → S1 defined by f(e2πiθ) = e2πinθ is C∞ for any n.

Theorem 3.1. Suppose that M, N and P are manifolds with f : M → N and g : N → P both C∞, then g ∘ f : M → P is also C∞.

Proof. This follows from the chain rule.

Definition 3.2. If f :M→ N is a bijection with both f and f−1 C∞ then f is called a

diffeomorphism.

Two manifolds which are diffeomorphic, i.e. for which there exists a diffeomorphism

between them, are equivalent geometrically.

Problem 3.2. Show that the charts of two diffeomorphic manifolds are in a 1-1 correspon-

dence.

Problem 3.3. Show that the set of diffeomorphisms from a manifold to itself forms a

group under composition.

3.2 Push Forward of Tangent Vectors

Theorem 3.2. Let f : M → N be C∞. If Xp ∈ TpM then

f∗Xf(p) : C∞(f(p)) → R   defined by   g → Xp(g ∘ f)   (3.2)

is in Tf(p)N, i.e. is a tangent vector to N at f(p)

Proof. Let g1, g2 ∈ C∞(f(p)), λ ∈ R. Then

f?Xf(p)(g1 + g2) = Xp((g1 + g2) f)

= Xp(g1 f + g2 f)

= Xp(g1 f) +Xp(g2 f)

= f?Xf(p)(g1) + f?Xf(p)(g2) (3.3)

and

f?Xf(p)(λ) = Xp(λ f) = Xp(λ) = 0 . (3.4)

– 21 –

Finally, we have

f?Xf(p)(g1.g2) = Xp((g1.g2) f)

= Xp((g1 f).(g2 f))

= Xp(g1 f)(g2 f)(p) + (g1 f)(p)Xp(g2 f)

= f?Xf(p)(g1)g2(f(p)) + g1(f(p))f?Xf(p)(g2) (3.5)

Definition 3.3. f?Xf(p) is called the push forward of Xp.

Figure 7: Push forward of tangent vector

Theorem 3.3. Suppose that f : M → N is C∞, (x1, . . . , xm) are local coordinates for a
point p ∈ M and (y1, . . . , yn) are local coordinates for the image f(p) ∈ N. If

Xp = Σ_{µ=1}^{m} Aµ ∂/∂xµ |p   (3.6)

is in TpM then

f∗Xf(p) = Σ_{µ=1}^{m} Σ_{ν=1}^{n} Aµ ∂/∂xµ |p (yν ∘ f) · ∂/∂yν |_{f(p)}   (3.7)

– 22 –

Proof. Let g ∈ C∞(f(p)). Then

f?Xf(p)(g) = Xp(g f)

=

m∑µ=1

Aµ∂

∂xµ∣∣p(g f)

=m∑µ=1

Aµ∂

∂xµ(g f φ−1i ) φi(p)

=m∑µ=1

Aµ∂

∂xµ(g ψ−1a ψa f φ−1i

) φi(p)

=m∑µ=1

n∑ν=1

Aµ∂

∂xµ((ψa f φ−1i )ν

)(φi(p)))

∂yν(g ψ−1a )(ψa(f(p)))

=m∑µ=1

n∑ν=1

Aµ∂

∂xµ∣∣p(yν f).

∂yν∣∣f(p)

(g) (3.8)

Corollary 3.1. The push forward acts linearly on vectors.
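For example (a computation along the lines of Theorem 3.3, not worked out in the notes): let f : R → R² be the curve f(t) = (cos t, sin t), with coordinate t on R and (y1, y2) = (x, y) on R². For Xp = d/dt |_{t=t0}, formula (3.7) gives

f∗Xf(t0) = (d/dt cos t)|_{t0} ∂/∂x |_{f(t0)} + (d/dt sin t)|_{t0} ∂/∂y |_{f(t0)} = −sin t0 ∂/∂x |_{f(t0)} + cos t0 ∂/∂y |_{f(t0)} ,

the familiar velocity vector of the circle parametrisation.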

– 23 –

4. Vector Fields

4.1 Vector Fields

Next we consider vector fields.

Definition 4.1. A vector field is a map X :M→ TM such that X(p) = Xp ∈ TpM and

for all f ∈ C∞(M) the mapping

p→ X(f)(p) (4.1)

is C∞, where X(f)(p) = Xp(f).

For vector fields we will drop the explicit subscript p. Thus a vector field assigns, in a

smooth way, a vector in TpM to each point p ∈M.

N.B.: As we defined it, a vector field is (globally) valid over all M, however it can

also be defined over an open subset U ⊂M (i.e. locally).

Is the product of two vector fields also a vector field? To check this, take two vector

fields X,Y and f, g ∈ C∞(M). With multiplication of vectors taken to mean (X.Y )(f) =

X(Y (f)), we see that

X(Y (f + g)) = X(Y (f) + Y (g)) = X(Y (f)) +X(Y (g))

X(Y (constant map)) = X(0) = 0 (4.2)

However, we find

X(Y (f.g)) = X(Y (f)g + fY (g)) = g.X(Y (f)) + f.X(Y (g))

+ Y (f).X(g) +X(f).Y (g) (4.3)

and this is not a vector field, due to the presence of the terms on the second line. However,

note that the terms on the second line are symmetric in X,Y . Motivated by this, we

construct a vector field by taking

[X,Y ](f) = X(Y (f))− Y (X(f)) (4.4)

so that

[X,Y ](f.g) = g.X(Y (f)) + f.X(Y (g)) + Y (f).X(g) +X(f).Y (g)

−(g.Y (X(f)) + f.Y (X(g) +X(f).Y (g) + Y (f).X(g))

= [X,Y ](f).g + f.[X,Y ](g) (4.5)

as required for a vector field.
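As a simple illustration (this example is not in the notes): on R² take X = ∂/∂x1 and Y = x1 ∂/∂x2. Then for any f,

[X, Y](f) = ∂/∂x1 (x1 ∂f/∂x2) − x1 ∂/∂x2 (∂f/∂x1) = ∂f/∂x2 ,

so [X, Y] = ∂/∂x2, which is again a vector field even though X(Y(f)) alone contains the second-derivative term x1 ∂²f/∂x1∂x2.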

Definition 4.2. [X,Y ] is called the commutator of two vector fields

Problem 4.1. What goes wrong if we try to make a vector field using the definition

(X.Y )(f) = X(f).Y (f)?

– 24 –

Problem 4.2. Show that, if in a particular coordinate system

X = Σµ Xµ(x) ∂/∂xµ |p ,   Y = Σµ Y µ(x) ∂/∂xµ |p   (4.6)

then

[X, Y] = Σµ Σν ( Xµ ∂µY ν − Y µ ∂µXν ) ∂/∂xν |p   (4.7)

Theorem 4.1. With the product of two vector fields defined as the commutator, the space

of vector fields is an algebra

Proof. We have already seen that TpM is a vector space for a particular p ∈M. this ensures

that the space of vector fields is also a vector space, with addition and scalar multiplication

defined pointwise. We need to check the conditions (i)−(iv) in the definition of an algebra.

Conditions (i), (ii) are obviously satisfied. In addition since [X,Y ] = −[Y,X], we need

only check that

[X,Y + Z](f) = X(Y + Z)(f)− (Y + Z)X(f)

= X(Y (f)) +X(Z(f))− Y (X(f))− Z(X(f))

= [X,Y ](f) + [X,Z](f) (4.8)

as required

Problem 4.3. Show that for three vector fields X,Y, Z on M the Jacobi Identity holds:

[X, [Y,Z]] + [Y, [Z,X]] + [Z, [X,Y ]] = 0 (4.9)

Problem 4.4. Consider a manifold with a local coordinate system Ui, φi = (x1, . . . xn).

In Ui we can simply write

∂xµ∣∣p

=∂

∂xµ(4.10)

(i) Show that[∂∂xµ ,

∂∂xν

]= 0.

(ii) Evaluate[∂∂x1

, φ(x1, x2) ∂∂x2

]where φ(x1, x2) is a C∞ function of x1, x2.

4.2 Integral Curves and Local Flows

Given a vector field X, we can construct curves that pass through p ∈ M for which the

tangent vector at p is X.

Definition 4.3. Let X be a vector field on M and consider a point p ∈ M. An integral

curve of X passing through p is a curve C(t) in M such that C(0) = p and

C?

(d

dt

)= XC(t) (4.11)

– 25 –

for all t in some open interval (−ε, ε) ⊂ R. Here we are viewing ddt as a vector field on R,

so that the push forward is defined by

C?

(d

dt

)(f) =

d

dt(f(C(t))) = Tp(C)(f) (4.12)

is just the tangent vector to C(t) at p.

If we introduce a local coordinate system so that in an open set about p ∈M,

X =∑µ

Xµ(x)∂

∂xµ∣∣φ−1i (x)

(4.13)

then we find the integral curve is

C?

(d

dt

)(f) =

d

dtf(C(t))

=d

dt

(f φ−1i φi C

)(t)

=∑µ

dCµ

dt(t)

∂xµ∣∣C(t)

(f) (4.14)

where we have set Cµ = (φi C)µ. On the other hand we have

XC(t)(f) =∑µ

Xµ(C(t))∂

∂xµ∣∣C(t)

(f) . (4.15)

Thus we see that the condition for an integral curve is a first order differential equation

for the coordinates of the curve Cµ(t):

dCµ(t)/dt = Xµ(C(t))   (4.16)

together with the initial condition xµ(C(0)) = xµ(p). This is a first order differential

equation, and as such it has (at least locally) a unique solution with the given initial

condition (Picard’s theorem). However it is by no means clear whether or not the solution

can be extended to all values of t. In particular, even if there is a solution to the differential

equation (4.16) for all t, one must also worry about patching solutions together over the

different coordinate patches. This leads to

Definition 4.4. A vector field X on M is complete if for every point p ∈ M the integral

curve of X can be extended to a curve on M for all values of t

Theorem 4.2. IfM is compact (i.e. all open covers have a finite subcover) then all vector

fields on M are complete.

Proof. We will not prove this

– 26 –

Let σ(t, p) be an integral curve of a vector field X that passes through p at t = 0. This

therefore satisfies

d

dtσµ(t, p) = Xµ(σ(t, p)) (4.17)

along with the initial condition σ(0, p) = p. Thus we have found the following, at least

locally (and globally for complete vector fields):

Definition 4.5. The flow generated by the vector field X is a differentiable map

σ : R×M→M (4.18)

such that

(i) At each point p ∈M, the tangent to the curve Cp(t) = σ(t, p) at p is X.

(ii) σ(0, p) = p

(iii) σ(t+ s, p) = σ(t, σ(s, p))

We must show that the third property holds. To do this we simply note that since

(4.17) is a first order differential equation it has a unique solution for a fixed initial condi-

tion. So consider

d

dtσµ(t+ s, p) =

dσµ(t+ s, p)

d(t+ s)= Xµ(σ(t+ s, p)) (4.19)

which satisfies the initial condition σ(0 + s, p) = σ(s, p). On the other hand we also have

d

dtσµ(t, σ(s, p)) = Xµ(σ(t, σ(s, p))) (4.20)

with the initial condition σ(0, σ(s, p)) = σ(s, p). Thus both σ(t + s, p) and σ(t, σ(s, p))

satisfy the same equation (4.17) with the same initial condition, and so they must be

equal.

We see that each point p ∈ M defines a curve Cp(t) = σ(t, p) in M whose tangent is

X, and such that Cp(0) = p.
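For example (a standard computation, not included in the notes): on M = R take X = x d/dx. The integral curve equation (4.16) is dC/dt = C(t) with C(0) = x(p), so

σ(t, x) = x e^t ,

and one can check the flow property directly: σ(t, σ(s, x)) = (x e^s) e^t = x e^{t+s} = σ(t + s, x). This X is complete even though R is not compact; by contrast X = x² d/dx has integral curves x/(1 − xt), which escape to infinity in finite t, so it is not complete.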

Another way of looking at this is that, for each t the flow defines a map σt :M→Msuch that σt+s = σt σs. Now for t = ε small, we have, from (4.17),

σµε (p) = σµ(ε, p) = xµ(p) + εXµ(p) +O(ε2) (4.21)

Thus, at least for a sufficiently small value of t, σt is 1-1 and C∞. Hence it is a diffeomor-

phism onto its image (at least for some open set in M). This leads to another definition

Definition 4.6. A 1-parameter family of diffeomorphisms of M is a collection of diffeo-

morphisms σt :M→M with t ∈ R, such that

(i) σ0 = id where id is the identity map.

– 27 –

(ii) σt σs = σt+s

(iii) σt σ−t = id

So in effect, we can think of vector fields as generating infinitesimal diffeomorphisms

through their flows. In this sense, vector fields can be identified with the Lie algebra of the

diffeomorphism group.

4.3 Lie Derivatives

A vector field allows us to introduce the notion of a derivative on a manifold. The problem

with the usual derivative,

∂f

∂p= lim

ε→0

f(p+ ε)− f(p)

ε(4.22)

is that we don’t know how to add two points on a manifold, i.e. what is p + ε? However,

we saw that at least locally, a vector field generates a unique integral flow about any given

point p. Therefore we can use the flow to take us to a nearby point and hence form a

derivative. This is the notion of a Lie derivative.

Definition 4.7. Let X be a vector field on M and f ∈ C∞(M). We define the Lie
derivative of f along X to be

LXf(p) = lim_{ε→0} ( f(σ(ε, p)) − f(p) ) / ε   (4.23)

where σ(ε, p) is the flow generated by X at p.

Theorem 4.3.

LXf = X(f) (4.24)

Proof. We have from the definition

LXf(p) = limε→0

f(σ(ε, p))− f(σ(0, p))

ε

=d

dtf(σ(t, p))

∣∣t=0

= σ(t, p)?

(d

dt

)∣∣∣∣t=0

(f)

= Xp(f) (4.25)

where we have used the defining property of the flow that the tangent at p is Xp.

We can also define the Lie derivative of a vector field Y along X;

Definition 4.8.

LXY = lim_{ε→0} ( σ(−ε)∗ Yσ(ε) − Y ) / ε   (4.26)

where again σ(ε) is the flow generated by X, and we have suppressed the dependence on

– 28 –

Theorem 4.4.

LXY = [X,Y ] (4.27)

Proof. Let us introduce coordinates (x1, . . . , xn) about p ∈M such that

X =∑µ

Xµ(x)∂

∂xµ∣∣φ−1(x)

, Y =∑µ

Y µ(x)∂

∂xµ∣∣φ−1(x)

(4.28)

Recall that

σν(ε, p) = xν(p) + εXν(p) +O(ε2) (4.29)

and note for any f ,

Yσ(ε)(f) =∑µ

Y µ(σ(ε))∂

∂xµ∣∣σ(ε)

(f)

=∑

(Y µ + εXλ∂λYµ)(p)

∂xµ∣∣σ(ε)

(f) +O(ε2)

=∑

(Y µ + εXλ∂λYµ)(∂µf + εXρ∂µ∂ρf)(p) +O(ε2)

=∑

(Y µ∂µ + εXλ∂λYµ∂µ + εY µXλ∂µ∂λ)f(p) +O(ε2) (4.30)

N.B. In the 3rd line, we have suppressed the φ−1i term (in f φ−1i ) for convenience.

We also adopt the Einstein summation convention, whereby repeated indices are

summed over.

Therefore

σ(−ε)?Yσ(ε)(f) = Yσε(f σ(−ε))

=∑

(Y µ∂µ + εXλ∂λYµ∂µ + εY µXλ∂µ∂λ)(f − εXν∂νf)(p) +O(ε2)

=∑

(Y µ∂µf(p) + ε(Xλ∂λYµ − Y λ∂λX

µ)∂µf(p) +O(ε2) (4.31)

It follows that

1

ε

(σ(−ε)?Yσ(ε)(f)− Y (f)

)= (Xλ∂λY

µ − Y λ∂λXµ)∂µf(p) +O(ε)

= [X,Y ]p(f) +O(ε) (4.32)

and we are done.
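As a quick check of Definition 4.8 against Theorem 4.4 (this example is not in the notes): on R² take X = ∂/∂x1, whose flow is the translation σ(ε, (x1, x2)) = (x1 + ε, x2), and Y = x1 ∂/∂x2. At σ(ε, p) we have Yσ(ε) = (x1 + ε) ∂/∂x2, and since σ(−ε) is a translation its push forward leaves the components unchanged, so

σ(−ε)∗ Yσ(ε) = (x1 + ε) ∂/∂x2 |p ,   hence   LXY = lim_{ε→0} ( (x1 + ε) − x1 )/ε · ∂/∂x2 = ∂/∂x2 ,

in agreement with [∂/∂x1, x1 ∂/∂x2] = ∂/∂x2.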

– 29 –

5. Tensors

5.1 Co-Tangent Vectors

Recall that the tangent space TpM at a point p ∈ M is a vector space. For any vector

space there is a natural notion of a dual vector space, which is defined as the space of linear
maps from the vector space to R.

Problem 5.1. Show that the dual space is a vector space, of the same dimension as the

original vector space (provided it is finite-dimensional).

Thus we have the

Definition 5.1. The co-tangent space to M at p ∈ M is the dual vector space to Tp(M)

and is denoted by T ?pM.

In other words, ωp ∈ T ?pM iff ωp : TpM→ R is a linear map. We denote the action of

ωp on a vector Xp ∈ TpM by

ωp(Xp) = 〈ωp, Xp〉 (5.1)

Since ωp is a linear map we have

〈ωp, Xp + λYp〉 = ωp(Xp + λYp) = ωp(Xp) + λωp(Yp) = 〈ωp, Xp〉+ λ〈ωp, Yp〉 (5.2)

Furthermore, as the dual space is a vector space,

〈ωp + ληp, Xp〉 = (ωp + ληp)(Xp) = ωp(Xp) + ληp(Xp) = 〈ωp, Xp〉+ λ〈ηp, Xp〉 (5.3)

Thus 〈, 〉 is linear in each of its entries.

Now the dual of the dual of a vector space is just the original space itself. To see this,

note that for a fixed vector Xp, we can construct the map

ωp → ωp(Xp) ∈ R (5.4)

The properties of dual vectors ensure that this is a linear map. Thus we can view vectors

Xp as linear maps acting on co-vectors ωp via

Xp(ωp) = 〈ωp, Xp〉 (5.5)

Just as for the tangent bundle, we can define the co-tangent bundle to be

T ?M =⋃p∈M

T ?pM (5.6)

which is a 2n-dimensional manifold.

Definition 5.2. A smooth co-vector field is a map ω :M→ T ?M such that

(i) ωp ∈ T ?pM.

– 30 –

(ii) ω(X) : M → R, defined by ω(X)(p) = ωp(Xp) is in C∞M for all smooth vector

fields X

Recall that a chart (Ui, φi) defines a natural basis of TpM for p ∈ Ui:∂

∂xµ∣∣p

(5.7)

where φi(p) = (x1(p), . . . , xn(p)). This allows us to define a natural basis for T ?pM by

〈 dxµ |p , ∂/∂xν |p 〉 = δµν   (5.8)

i.e. dxµ |p is a linear map from TpM to R that maps the vector

Σ_{ν=1}^{n} vν ∂/∂xν |p   (5.9)

to vµ.

Thus, if in a local coordinate chart we have a vector

V (p) =

n∑µ=1

V µ(p)∂

∂xµ∣∣p

(5.10)

and a co-vector

ω(p) =n∑ν=1

ων(p)dxν∣∣p

(5.11)

Then

〈ω(p), V (p)〉 = 〈n∑ν=1

ων(p)dxν∣∣p,n∑µ=1

V µ(p)∂

∂xµ∣∣p〉

=n∑µ=1

n∑ν=1

ων(p)V µ(p)〈dxν∣∣p,∂

∂xµ∣∣p〉

=n∑µ=1

n∑ν=1

ων(p)V µ(p)δνµ

=n∑µ=1

ωµ(p)V µ(p) (5.12)
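For instance (a concrete case, not in the notes): on R² with ω(p) = 3 dx1 |p + dx2 |p and V(p) = ∂/∂x1 |p − 2 ∂/∂x2 |p, formula (5.12) gives 〈ω(p), V(p)〉 = 3·1 + 1·(−2) = 1.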

Theorem 5.1. Let (x1, . . . , xn) = φµ1 , and (y1, . . . , yn) = φµ2 be two coordinate systems at

a point p ∈M with p ∈ U1 ∩ U2, and suppose that ωp ∈ T ?pM If

ωp =n∑µ=1

Aµdxµ∣∣p, and ωp =

n∑µ=1

Bµdyµ∣∣p

(5.13)

then

Aµ =

n∑ν=1

Bν∂

∂xµ∣∣pyν (5.14)

where yµ(x1, . . . , xn) = φ2 φ−11 is a smooth function φ1(U1 ∩ U2)→ φ2(U1 ∩ U2).

– 31 –

Proof. In the first coordinate system

ωp

(∂

∂xµ∣∣p

)=

n∑ν=1

Aν〈dxν∣∣p,∂

∂xµ∣∣p〉 = Aµ (5.15)

In the second coordinate system, we see that

ωp

(∂

∂xµ∣∣p

)=

n∑ν=1

Bν〈dyν∣∣p,∂

∂xµ∣∣p〉 (5.16)

Also recall that

∂xµ∣∣p

=∑λ

(∂

∂xµ∣∣p(yλ)

)∂

∂yλ∣∣p

(5.17)

Thus we find

ωp

(∂

∂xµ∣∣p

)=

n∑ν=1

Bν〈dyν∣∣p,∑λ

(∂

∂xµ∣∣p(yλ)

)∂

∂yλ∣∣p〉

=

n∑ν=1

(∂

∂xµ∣∣p(yλ)

)〈dyν

∣∣p,∑λ

∂xλ∣∣p〉

=

n∑ν=1

Bν∂

∂xµ∣∣pyν (5.18)

and since these must agree the theorem is proved.

N.B.: This formula is often simply written as

Aµ =n∑ν=1

Bν∂yν

∂xµ(5.19)

or even

A′µ = Aν∂xν

∂x′µ(5.20)

with a sum over ν understood and a prime denoting quantities in the new coordinate

system. Note the different positions of the prime and unprimed coordinates as compared

to the analogous formula for a vector.

5.2 Pull-back and Lie Derivative of a co-vector

Suppose we have a smooth map f :M→N . We saw that we could push-forward a vector

Xp ∈ TpM to a vector f?Xf(p) ∈ Tf(p)N by

f?Xf(p)(g) = Xp(g f) (5.21)

We therefore can consider the dual map f? : T ?f(p)N → T ?pM defined by

f?(ωf(p))(Xp) = 〈f?(ωf(p)), Xp〉 = 〈ωf(p), f?Xf(p)〉 = ωf(p)(f?Xf(p)) (5.22)

It is clear that if ωf(p) ∈ T ?f(p)N then f?(ωf(p)) is a linear map TpM→ R, this follows from

the linearity of the push-forward f?.

– 32 –

Theorem 5.2. Let f : M→ N be C∞, (y1, . . . , yn) be local coordinates on V ⊂ N , and

(x1, . . . , xm) be local coordinates on U ∩ f−1(V ) ⊂M (assuming U ∩ f−1(V ) 6= ∅). If

ω =n∑ν=1

ωνdyν∣∣f(p)

(5.23)

then

f?ω =m∑µ=1

n∑ν=1

ων∂

∂xµ(yν f)dxµ

∣∣p

(5.24)

Proof. We have

〈ω, f?Xf(p)〉 =

n∑ν=1

ων(f ? Xf(p))ν

=n∑ν=1

m∑µ=1

ωνXµ ∂

∂xµ∣∣p

(yν f) (5.25)

where we used out earlier result for the components of the push forward of a vector.

However we also have

〈f?ω,Xp〉 =

m∑µ=1

(f?ω)µXµ (5.26)

by definition. As these two expressions must be equal for all possible choices of Xµ, the

result follows.

We can also extend the definition of the Lie derivative to co-vector fields (and in fact

all tensor fields).

Definition 5.3. If X is a smooth vector field and ω a smooth co-vector field on M then

LXω = limε→0

1

ε

(σε?ωσ(ε) − ω

)(5.27)

Theorem 5.3. If, in a local coordinate system (x1, . . . , xn),

X =∑µ

Xµ(x)∂

∂xµ∣∣p, ω =

∑µ

ωµ(x)dxµ∣∣p

(5.28)

then

LXω =∑µ

∑ν

(∂νωµX

ν + ων∂µXν)dxµ

∣∣p

(5.29)

– 33 –

Proof. Suppose Y ∈ TpM, consider

σε?ωσ(ε)Y = ωσ(ε)

(σε?Yσ(ε)

)=∑

ωσ(ε)µ(σε?Yσ(ε))

µ

=∑(

ωµ + εXλ∂λωµ

)(σε

?Yσ(ε))µ +O(ε2) (5.30)

where

(σε?Yσ(ε))

µ = Y λ∂λ(σµ(ε))

= Y λ∂λ(xµ + εXµ) +O(ε2)

= Y µ + εY λ∂λXµ +O(ε2) (5.31)

as a consequence of Theorem 3.3.

Combining these expressions one obtains

σε?ωσ(ε)Y =

∑(ωµ + εXλ∂λωµ

)(Y µ + εY λ∂λX

µ

)+O(ε2)

=∑

ωµYµ + ε

∑(Xλ∂λωµ + ωλ∂µX

λ

)Y µ +O(ε2) (5.32)

It follows that

limε→0

1

ε

(σε?ωσ(ε) − ω

)Y =

∑(Xλ∂λωµ + ωλ∂µX

λ

)dxµ (Y ) (5.33)

and the theorem follows.

5.3 Tensors

We can now construct the definition of a (r, s) tensor. First we need to recall the definition

of the tensor product. If V, W are two vector spaces with basis vi : i = 1, . . . , n and

wa : a = 1, . . . ,m respectively, then the vector space sum is a n+m-dimensional vector

space with basis vi,wa : i = 1, . . . , n; a = 1, . . . ,m, i.e.

V ⊕W = Spani,avi,wa (5.34)

so that a general element is

n∑i=1

aivi +

m∑a=1

bawa (5.35)

where the sum is interpreted as a formal sum. Hence V ⊕W is a n+m-dimensional vector

space.

– 34 –

We can also construct the tensor product of V and W, which is spanned by vi⊗wa :

i = 1, . . . , n; a = 1, . . . ,m, i.e.

V ⊗W = Spani,avi ⊗wa (5.36)

where vi ⊗wa is a formal product. A general element is

n∑i=1

m∑a=1

ciavi ⊗wa . (5.37)

and we have the following identities:

(v1 + v2)⊗ w = (v1 ⊗ w) + (v2 ⊗ w)

v ⊗ (w1 + w2) = (v ⊗ w1) + (v ⊗ w2)

(λv)⊗ w = v ⊗ (λw) = λ(v ⊗ w) (5.38)

for v, v1, v2 ∈ V, w,w1, w2 ∈ W and λ ∈ R. V ⊗W is a nm-dimensional vector space.

Definition 5.4. A (r, s)-tensor T at a point p ∈M is an element of

T (r,s)p M =

(⊗r TpM

)⊗(⊗s T ?pM

)(5.39)

where ⊗r denotes the r-th tensor product.

It follows that, given a local coordinate system, a local basis for (r, s)-tensors is given

by

∂xµ1

∣∣p⊗ · · · ⊗ ∂

∂xµr

∣∣p⊗ dxν1

∣∣p⊗ · · · ⊗ dxνs

∣∣p

(5.40)

and if we write

T =∑

Tµ1...µrν1...νs∂

∂xµ1

∣∣p⊗ · · · ⊗ ∂

∂xµr

∣∣p⊗ dxν1

∣∣p⊗ · · · ⊗ dxνs

∣∣p

(5.41)

then the components can be computed as

Tµ1...µrν1...νs = T

(dxµ1

∣∣p, . . . , dxµr

∣∣p,∂

∂xν1

∣∣p, . . . ,

∂xνs

∣∣p

)(5.42)

Problem 5.2. Show that if in some local coordinate system

T =∑

Aµ1...µrν1...νs∂

∂xµ1

∣∣p⊗ · · · ⊗ ∂

∂xµr

∣∣p⊗ dxν1

∣∣p⊗ · · · ⊗ dxνs

∣∣p

(5.43)

is a (r, s)-tensor, and

Xi =∑µ

Xµi

∂xµ∣∣p, i = 1, . . . , s (5.44)

are s vectors and

ωI =∑ν

ωIνdxν∣∣p, I = 1, . . . , r (5.45)

are r co-vectors then

T (ω1, . . . , ωr, X1, . . . , Xs) =∑

Aµ1...µrν1...νsω1µ1 . . . ω

rµrX

ν11 . . . Xνs

s (5.46)

– 35 –

Problem 5.3. Suppose that T is a (r, s)-tensor, (U1, φ1 = (x1, . . . , xn)) and (U1, φ1 =

(y1, . . . , yn)) are two local coordinate charts with U1 ∩ U2 6= ∅, such that

T =∑

Aµ1...µrν1...νs∂

∂xµ1

∣∣p⊗ · · · ⊗ ∂

∂xµr

∣∣p⊗ dxν1

∣∣p⊗ · · · ⊗ dxνs

∣∣p

(5.47)

and

T =∑

Bµ1...µrν1...νs

∂yµ1

∣∣p⊗ · · · ⊗ ∂

∂yµr

∣∣p⊗ dyν1

∣∣p⊗ · · · ⊗ dyνs

∣∣p

(5.48)

Then

Bµ1...µrν1...νs =

n∑λ1=1

· · ·n∑

λs=1

n∑ρ1=1

· · ·n∑

ρr=1

Aρ1...ρrλ1...λs

×(

∂xρ1

∣∣pyµ1). . .

(∂

∂xρr

∣∣pyµr)

×(

∂yν1

∣∣pxλ1). . .

(∂

∂yνs

∣∣pxλs)

(5.49)

where yµ(x) is the µ-th component of the transition function φ2 φ−11 , and xµ(y) is the

µ-th component of the transition function φ1 φ−12 .

Definition 5.5. A (r, s)-tensor field is a map

T :M→ (⊗rTM)⊗ (⊗sT ?M) such that T (p) ∈ (⊗rTpM)⊗ (⊗sT ?pM)(5.50)

which is smooth in the sense that for any choice of r smooth co-vector fields ω1, . . . , ωr,

and s smooth vector fields V 1, . . . , V s, the map T (ω1, . . . , ωr, V 1, . . . , V s) :M→ R is C∞.

Some tensor fields have special names.

(i) A (0, 0)-tensor is a scalar; as a field it assigns a number to each point in M

(ii) A (1, 0)-tensor is a vector; as a field it assigns a tangent vector to each point in M.

(iii) A (0, 1)-tensor is a 1-form; as a field it assigns a co-vector to each point in M

Sometimes, especially in older books, (r, 0)-tensors are called covariant, and (0, s)

tensors are called contravariant.

A (r, 0) tensor is called symmetric if

T (ωP (1), . . . , ωP (r)) = T (ω1, . . . , ωr) (5.51)

and similarly a (0, s) tensor is called symmetric if

T (V P (1), . . . , V P (s)) = T (V 1, . . . , V s) (5.52)

On the other hand, they are called anti-symmetric if

T (ωP (1), . . . , ωP (r)) = sgn(P )T (ω1, . . . , ωr) (5.53)

– 36 –

or

T (V P (1), . . . , V P (s)) = sgn(P )T (V 1, . . . , V s) (5.54)

Here P is a permutation, and sgn(P) is its sign. Recall that P is a bijection P : {1, . . . , k} → {1, . . . , k}, and can always be written in terms of either an even (sgn(P) = 1) or odd
(sgn(P) = −1) number of interchanges (where two neighbouring integers are swapped, and everything else is unaltered).

of a mixed (r, s)-tensor separately.

It is straightforward to extend the Lie derivative to act on (r, s) tensors by requiring

that

(i) LX(T1 + T2) = LXT1 + LXT2, where T1, T2 are (r, s) tensors

(ii) LX(λT ) = λ(LXT ), where T is a (r, s) tensor and λ ∈ R is constant.

(iii) LX(T1⊗T2) = (LXT1)⊗T2 +T1⊗ (LXT2) where T1, T2 are (r, s) and (r′, s′) tensors.

– 37 –

6. Differential Forms

N.B.: Conventions about forms can vary from book to book (by various factors

of p! and minus signs), so be careful when comparing different sources.

6.1 Forms

Definition 6.1. A p-form on a manifold M is a smooth anti-symmetric (0, p)-tensor field

on M. In particular, if ω is a p-form then

ω(XP (1), . . . , XP (p)) = sgn(P )ω(X1, . . . , Xp) (6.1)

Definition 6.2. A 0-form on M is a function in C∞(M).

Theorem 6.1. If M is n-dimensional then all p-forms with p > n vanish.

Proof. First note that a p-form acting on a set of vectors with the same vector appearing

twice vanishes, because

ω(X1, . . . , Y,X2, . . . , Y,X3, . . . ) = sgn(P )ω(Y, Y,X1 . . . )

= (−1)sgn(P )ω(Y, Y,X1, . . . )

= (−1)(sgn(P ))2ω(X1, . . . , Y,X2, . . . , Y,X3, . . . )

= −ω(X1, . . . , Y,X2, . . . , Y,X3, . . . ) (6.2)

where in the first line we used a permutation P to place the two Y vectors next to each

other, in the second line we used an interchange to swap the two Y -vectors, which introduces

an extra coefficient of −1, and in the third line we apply the inverse permutation to P to

restore the original order of the vector fields (recall that the sign of P−1 is the same as

that of P ) which introduces another factor of sgn(P ).

Now, if ω is a p form with p > n, then in any collection of p basis vectors, at least

two must be the same. So ω evaluated on any such collection of basis vector must vanish.

Since it vanishes on any set of basis vectors, it must vanish identically.

The space of p-forms on M is denoted Ωp(M,R), and we let

Ω(M) = Ω0(M,R) ⊕ Ω1(M,R) ⊕ · · · ⊕ Ωn(M,R)   (6.3)

Here we have included an explicit reference to the field R over which the manifold is defined.

Note that if ω and η are a p-form and q-form respectively, then ω⊗η will be a (0, p+q)-

tensor field, but not a (p+ q)-form, since it will not be antisymmetric.

However, one can construct a (p+ q)-form from a p-form and a q-form using the wedge

product ∧:

– 38 –

Definition 6.3. If ω ∈ Ωp(M) and η ∈ Ωq(M) then

(ω ∧ η)(X1, . . . , Xp+q) = ΣP sgn(P) (ω ⊗ η)(XP(1), . . . , XP(p+q))   (6.4)

where

(ω ⊗ η)(XP(1), . . . , XP(p+q)) = ω(XP(1), . . . , XP(p)) · η(XP(p+1), . . . , XP(p+q))   (6.5)

Note that ω ∧ η is clearly a (0, p + q) tensor, because each term in the above sum is.

Also, if Q is some permutation, then

(ω ∧ η)(XQ(1), . . . , XQ(p+q)) =∑P

sgn(P )(ω ⊗ η)(XQP (1), . . . , XQP (p+q))

=∑P

sgn(Q−1P )(ω ⊗ η)(XQQ−1P (1), . . . , XQQ−1P (p+q))

=∑P

sgn(Q−1P )(ω ⊗ η)(XP (1), . . . , XP (p+q))

= sgn(Q)∑P

sgn(P )(ω ⊗ η)(XP (1), . . . , XP (p+q))

= sgn(Q)(ω ∧ η)(X1, . . . , Xp+q) (6.6)

where in going from the first to the second line, the permutation P in the sum is replaced

with Q−1P (this leaves the sum unaffected). So ω ∧ η is indeed a (p+ q)-form.

Example:

dxµ |p ∧ dxν |p = dxµ |p ⊗ dxν |p − dxν |p ⊗ dxµ |p   (6.7)

Example:

dxµ∣∣p∧ (dxν

∣∣p∧ dxλ

∣∣p) = dxµ

∣∣p⊗ dxν

∣∣p⊗ dxλ

∣∣p− dxµ

∣∣p⊗ dxλ

∣∣p⊗ dxν

∣∣p

+ dxν∣∣p⊗ dxλ

∣∣p⊗ dxµ

∣∣p− dxν

∣∣p⊗ dxµ

∣∣p⊗ dxλ

∣∣p

+ dxλ∣∣p⊗ dxµ

∣∣p⊗ dxν

∣∣p− dxλ

∣∣p⊗ dxν

∣∣p⊗ dxµ

∣∣p

(6.8)

Note: by convention if f ∈ Ω0(M,R) and ω ∈ Ωk(M,R) then

f ∧ ω = ω ∧ f = f . ω (6.9)

Theorem 6.2. The wedge product satisfies:

(i) (ω1 + ω2) ∧ η = (ω1 ∧ η) + (ω2 ∧ η)

(ii) ω ∧ (η1 + η2) = (ω ∧ η1) + (ω ∧ η2)

(iii) (λω) ∧ η = ω ∧ (λη) = λ(ω ∧ η)

for ω, ω1, ω2 ∈ Ωp(M), η, η1, η2 ∈ Ωq(M) and λ ∈ R

– 39 –

Proof. This follows directly from the definition of the wedge product, and is left as an

exercise

Theorem 6.3. If ω ∈ Ωp(M) and η ∈ Ωq(M) then ω ∧ η = (−1)pqη ∧ ω.

Proof. Let Q be the permutation that maps the ordered list

1, . . . , p, p+ 1, . . . , p+ q → q + 1, . . . , q + p, 1, . . . , q (6.10)

Note that sgn(Q) = (−1)pq.

Then

(ω ∧ η)(X1, . . . , Xp, Xp+1, . . . , Xp+q)

=∑P

sgn(P )(ω ⊗ η)(XP (1), . . . , XP (p), XP (p+1), . . . , XP (p+q))

=∑P

sgn(PQ)(ω ⊗ η)(XPQ(1), . . . , XPQ(p), XPQ(p+1), . . . , XPQ(p+q))

=∑P

sgn(P )sgn(Q)(ω ⊗ η)(XP (q+1), . . . XP (q+p), XP (1), . . . XP (q))

= (−1)pq∑P

sgn(P )(ω ⊗ η)(XP (q+1), . . . XP (q+p), XP (1), . . . XP (q))

= (−1)pq∑P

sgn(P )(η ⊗ ω)(XP (1), . . . , XP (q), XP (q+1), . . . , XP (p+q))

= (−1)pq(η ∧ ω)(X1, . . . , Xp, Xp+1, . . . , Xp+q) (6.11)

Problem 6.1. If

ω = A12dx1 ∧ dx2 +A34dx

3 ∧ dx4

η = B123dx1 ∧ dx2 ∧ dx3 +B125dx

1 ∧ dx2 ∧ dx5 (6.12)

then what is ω ∧ η? (N.B. here we have dropped the subscript p for convenience)

Theorem 6.4. A basis for Ωk(M) at a point p ∈M is given by

dxν1∣∣p∧ · · · ∧ dxνk

∣∣p

(6.13)

for ν1 < · · · < νk.

Proof. Suppose ω ∈ Ωk(M); define

ωµ1,...µk = ω

(∂

∂xµ1

∣∣∣∣p

, . . . ,∂

∂xµk

∣∣∣∣p

)(6.14)

Also, from the definition of a k-form

ωµP (1),...,µP (k)= ω

(∂

∂xµP (1)

∣∣∣∣p

, . . . ,∂

∂xµP (k)

∣∣∣∣p

)= sgn(P )ω

(∂

∂xµ1

∣∣∣∣p

, . . . ,∂

∂xµk

∣∣∣∣p

)= sgn(P )ωµ1,...,µk (6.15)

– 40 –

for any permutation P . It then follows that

1

k!

∑µi

ωµ1,...µkdxµ1∣∣p∧ · · · ∧ dxµk

∣∣p

=1

k!

∑µi

∑P

sgn(P )ωµ1...µkdxµP (1)

∣∣p⊗ · · · ⊗ dxµP (k)

=1

k!

∑µi

∑P

ωµP (1),...µP (k)dxµP (1)

∣∣p⊗ · · · ⊗ dxµP (k)

=∑µi

ωµ1,...,µkdxµ1∣∣p⊗ · · · ⊗ dxµk

∣∣p

= ω (6.16)

This shows that dxµ1∣∣p∧ · · · ∧ dxµk

∣∣p

span the space of k-forms.

Now consider the k-form dxµ1∣∣p∧ · · · ∧ dxµk

∣∣p. This either vanishes, because not all

the µi are distinct, or there exists a permutation P of 1, . . . , k such that

dxµ1∣∣p∧ · · · ∧ dxµk

∣∣p

= sgn(P )dxµP (1)∣∣p∧ · · · ∧ dxµP (k)

∣∣p

(6.17)

with µP (1) < · · · < µP (k). It therefore follows that the k-forms

dxν1∣∣p∧ · · · ∧ dxνk

∣∣p

(6.18)

for ν1 < · · · < νk span Ωk(M)

Furthermore, note that by definition, if ν1 < · · · < νk and τ1 < · · · < τk then

dxν1∣∣p∧ · · · ∧ dxνk

∣∣p

(∂

∂xτ1

∣∣p⊗ · · · ⊗ ∂

∂xτk

∣∣p

)=

1 if ν1 = τ1, . . . , νk = τk

0 otherwise(6.19)

It follows that if τ1 < · · · < τk, and∑ν1<···<νk

Aν1...νkdxν1∣∣p∧ · · · ∧ dxνk

∣∣p

= 0 (6.20)

then acting on

(∂

∂xτ1

∣∣p⊗ · · · ⊗ ∂

∂xτk

∣∣p

)implies that Aτ1...τk = 0, so the k forms dxν1

∣∣p∧

· · · ∧ dxνk∣∣p

for ν1 < · · · < νk are linearly independent.

Hence these k-forms form a basis for Ωk(M)
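In particular (an immediate consequence of Theorem 6.4, not stated explicitly in the notes), at each point the space of k-forms has dimension equal to the number of increasing multi-indices ν1 < · · · < νk, namely n!/(k!(n − k)!); for example on a 3-dimensional manifold there are 3 independent 1-forms, 3 independent 2-forms, and a single 3-form at each point, up to scale.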

6.2 Exterior Derivative

We can define a notion of a derivative on k-forms by

Definition 6.4. If ω ∈ Ωk(M), then the exterior derivative, dω, is given by

dω(X1, . . . , Xk+1) = Σ_{i=1}^{k+1} (−1)^{i+1} Xi ( ω(X1, . . . , Xi−1, Xi+1, . . . , Xk+1) )
                    + Σ_{i<j} (−1)^{i+j} ω([Xi, Xj], X1, . . . , Xi−1, Xi+1, . . . , Xj−1, Xj+1, . . . , Xk+1)
                                                                                              (6.21)
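In the two lowest degrees (a specialisation of (6.21), not spelled out in the notes), this reads

df(X) = X(f)   for a 0-form f ,

dω(X, Y) = X(ω(Y)) − Y(ω(X)) − ω([X, Y])   for a 1-form ω ,

which is often convenient for explicit computations.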

– 41 –

Theorem 6.5. If ω ∈ Ωk(M) then dω ∈ Ωk+1(M)

Proof. To start with, we shall prove that dω is a (0, k+ 1) tensor. Having established that,

we will prove that it is antisymmetric by evaluating its components in a local basis.

First, fix ` such that 1 ≤ ` ≤ k + 1. Note that from the construction given in (6.21),

one finds

dω(X1, . . . , X` + Y`, . . . , Xk+1) = dω(X1, . . . , X`, . . . , Xk+1) + dω(X1, . . . , Y`, . . . , Xk+1)

(6.22)

Next, suppose that f ∈ C∞(M).

To simplify the expressions in what follows, it will be convenient to denote by Xithe (ordered) k-tuple obtained by removing Xi from the (ordered) list of k+ 1 vector fields

X1, . . . , Xk+1; and by Xi,j the (ordered) (k − 1)-tuple obtained by removing Xi and Xj

from X1, . . . , Xk+1 (i 6= j).

We can proceed to evaluate

    dω(X_1, …, fX_ℓ, …, X_{k+1}) = ∑_{i≠ℓ} (−1)^{i+1} X_i( f ω(X̂_i) ) + (−1)^{ℓ+1} f X_ℓ( ω(X̂_ℓ) )
        + ∑_{i<j; i,j≠ℓ} f (−1)^{i+j} ω( [X_i, X_j], X̂_{i,j} )
        + ∑_{ℓ<j} (−1)^{ℓ+j} ω( [fX_ℓ, X_j], X̂_{ℓ,j} ) + ∑_{i<ℓ} (−1)^{i+ℓ} ω( [X_i, fX_ℓ], X̂_{i,ℓ} )        (6.23)

where the first two lines of (6.23) correspond to the first line of (6.21) and the last two
lines of (6.23) come from the second line of (6.21), on restricting the various summations
appropriately. Recall that the Leibniz rule implies

    X_i( f ω(X̂_i) ) = (X_i(f)) ω(X̂_i) + f X_i( ω(X̂_i) )        (6.24)

Using this, the first line of (6.23) can be rewritten as

    f ∑_i (−1)^{i+1} X_i( ω(X̂_i) ) + ∑_{i≠ℓ} (−1)^{i+1} (X_i(f)) ω(X̂_i)        (6.25)

Also, on recalling that

    [X_i, fX_j] = f [X_i, X_j] + (X_i(f)) X_j        (6.26)

the remainder of (6.23) can be rewritten as

    f ∑_{i<j} (−1)^{i+j} ω( [X_i, X_j], X̂_{i,j} )
        + ∑_{ℓ<j} (−1)(−1)^{ℓ+j} (X_j(f)) ω(X_ℓ, X̂_{ℓ,j}) + ∑_{i<ℓ} (−1)^{i+ℓ} (X_i(f)) ω(X_ℓ, X̂_{i,ℓ})
    = f ∑_{i<j} (−1)^{i+j} ω( [X_i, X_j], X̂_{i,j} ) + ∑_{i≠ℓ} (−1)^i (X_i(f)) ω(X̂_i)        (6.27)


On adding together the contributions from (6.25) and (6.27), one finds that the terms
involving X_i(f) cancel, and one obtains

    dω(X_1, …, fX_ℓ, …, X_{k+1}) = f dω(X_1, …, X_ℓ, …, X_{k+1})        (6.28)

Having established linearity, we compute the components of dω in the usual basis of
vectors:

    dω( ∂/∂x^{μ_1}|_p , …, ∂/∂x^{μ_{k+1}}|_p )
        = ∑_{i=1}^{k+1} (−1)^{i+1} ∂/∂x^{μ_i}|_p ( ω( ∂/∂x^{μ_1}|_p , …, ∂/∂x^{μ_{i−1}}|_p , ∂/∂x^{μ_{i+1}}|_p , …, ∂/∂x^{μ_{k+1}}|_p ) )        (6.29)

where we have used the fact that

    [ ∂/∂x^μ|_p , ∂/∂x^ν|_p ] = 0        (6.30)

so that the second term in the definition of dω (6.21) vanishes. So, on defining the
components of dω as

    dω_{μ_1,…,μ_{k+1}} = dω( ∂/∂x^{μ_1}|_p , …, ∂/∂x^{μ_{k+1}}|_p )        (6.31)

one obtains

    dω_{μ_1,…,μ_{k+1}} = ∑_{i=1}^{k+1} (−1)^{i+1} ∂_{μ_i} ω_{μ_1,…,μ_{i−1},μ_{i+1},…,μ_{k+1}}
                       = (k+1) ∂_{[μ_1} ω_{μ_2,…,μ_{k+1}]}        (6.32)

where the square brackets are defined as

    Y_{[μ_1,…,μ_k]} = (1/k!) ∑_P sgn(P) Y_{μ_{P(1)},…,μ_{P(k)}}        (6.33)

It is clear that the components of dω are antisymmetric.

Note that in this coordinate basis

    dω = (1/(k+1)!) ∑ (dω)_{μ_1,…,μ_{k+1}} dx^{μ_1}|_p ∧ ⋯ ∧ dx^{μ_{k+1}}|_p
       = (1/k!) ∑ ∂_{[μ_1} ω_{μ_2,…,μ_{k+1}]} dx^{μ_1}|_p ∧ ⋯ ∧ dx^{μ_{k+1}}|_p
       = (1/k!) ∑ ∂_{μ_1} ω_{μ_2,…,μ_{k+1}} dx^{μ_1}|_p ∧ ⋯ ∧ dx^{μ_{k+1}}|_p        (6.34)

where the last line follows due to the antisymmetry of the wedge product terms.

Before proceeding further, we shall establish a correspondence between the exterior
derivative acting on the co-ordinate functions (which, although not globally defined, are
nevertheless locally defined) and the dual basis vectors dx^μ|_p.


In particular we have denoted by dx^μ|_p the dual vectors to the tangent vectors ∂/∂x^ν|_p
at p ∈ M, defined in equation (5.8).

Suppose that (x^1, …, x^n) is some local co-ordinate system associated with the chart
(U_i, φ_i). Note that the x^μ(p) = (φ_i(p))^μ are C^∞(U_i) functions, and hence we can (locally)
define their exterior derivative dx^μ, where

    dx^μ(X) = X(x^μ)        (6.35)

for any (locally defined) smooth vector field X. So dx^μ is a (smooth) 1-form locally defined
on the patch U_i. To evaluate this 1-form at p ∈ U_i note that

    (dx^μ)|_p ( ∂/∂x^ν|_p ) = ∂/∂x^ν|_p (x^μ) = δ^μ_ν        (6.36)

where on the LHS of this expression (dx^μ)|_p denotes the evaluation of dx^μ at p. From
the above equation it is clear that dx^μ evaluated at p is identical to the dual vector dx^μ|_p
as defined in equation (5.8), and so our choice of notation in section 5 is consistent with
the exterior derivative as defined in this section.

Example: Consider R^3. A 0-form f is just a function, and

    df = ∑_μ ∂_μ f dx^μ        (6.37)

is just the gradient of f. A 1-form ω = ∑ ω_μ dx^μ has

    dω = ∑ ∂_ν ω_μ dx^ν ∧ dx^μ
       = (∂_1 ω_2 − ∂_2 ω_1) dx^1 ∧ dx^2 + (∂_2 ω_3 − ∂_3 ω_2) dx^2 ∧ dx^3 + (∂_1 ω_3 − ∂_3 ω_1) dx^1 ∧ dx^3        (6.38)

whose components are those of curl(ω). Lastly, for a 2-form ω = (1/2) ∑ ω_{μν} dx^μ ∧ dx^ν we
have

    dω = (1/2) ∑ ∂_λ ω_{μν} dx^λ ∧ dx^μ ∧ dx^ν = (∂_1 ω_{23} + ∂_2 ω_{31} + ∂_3 ω_{12}) dx^1 ∧ dx^2 ∧ dx^3        (6.39)

and these are the components of div(ω̃), where ω̃ is the associated 1-form

    ω̃ = ω_{23} dx^1 + ω_{31} dx^2 + ω_{12} dx^3        (6.40)

Next we prove the most important property of the exterior derivative.

Theorem 6.6. d^2 = 0.

Proof. Let us choose a coordinate system as previously, so that

    dω = (1/k!) ∑ ∂_ν ω_{μ_1,…,μ_k} dx^ν ∧ dx^{μ_1} ∧ ⋯ ∧ dx^{μ_k}        (6.41)


Then it follows that

    d^2ω = (1/k!) ∑ ∂_λ ∂_ν ω_{μ_1,…,μ_k} dx^λ ∧ dx^ν ∧ dx^{μ_1} ∧ ⋯ ∧ dx^{μ_k}        (6.42)

However this vanishes, because

    ∂_λ ∂_ν ω_{μ_1,…,μ_k} = ∂_ν ∂_λ ω_{μ_1,…,μ_k}        (6.43)

is symmetric in λ, ν, whereas dx^λ ∧ dx^ν is antisymmetric.

Theorem 6.7. If ω ∈ Ω^p(M) and η ∈ Ω^q(M), then

    d(ω ∧ η) = (dω) ∧ η + (−1)^p ω ∧ dη        (6.44)

Proof. In components we have

    ω = (1/p!) ∑ ω_{μ_1…μ_p} dx^{μ_1} ∧ ⋯ ∧ dx^{μ_p}        (6.45)

and

    η = (1/q!) ∑ η_{ν_1…ν_q} dx^{ν_1} ∧ ⋯ ∧ dx^{ν_q}        (6.46)

and hence

    d(ω ∧ η) = (1/(p!q!)) ∑ ∂_λ( ω_{μ_1…μ_p} η_{ν_1…ν_q} ) dx^λ ∧ dx^{μ_1} ∧ ⋯ ∧ dx^{μ_p} ∧ dx^{ν_1} ∧ ⋯ ∧ dx^{ν_q}
             = (1/(p!q!)) ∑ ( ∂_λ ω_{μ_1…μ_p} η_{ν_1…ν_q} + ω_{μ_1…μ_p} ∂_λ η_{ν_1…ν_q} ) dx^λ ∧ dx^{μ_1} ∧ ⋯ ∧ dx^{μ_p} ∧ dx^{ν_1} ∧ ⋯ ∧ dx^{ν_q}
             = (1/(p!q!)) ∑ ∂_λ ω_{μ_1…μ_p} η_{ν_1…ν_q} dx^λ ∧ dx^{μ_1} ∧ ⋯ ∧ dx^{μ_p} ∧ dx^{ν_1} ∧ ⋯ ∧ dx^{ν_q}
               + (−1)^p (1/(p!q!)) ∑ ω_{μ_1…μ_p} ∂_λ η_{ν_1…ν_q} dx^{μ_1} ∧ ⋯ ∧ dx^{μ_p} ∧ dx^λ ∧ dx^{ν_1} ∧ ⋯ ∧ dx^{ν_q}
             = (dω) ∧ η + (−1)^p ω ∧ dη        (6.47)

where the factor of (−1)^p appears because

    dx^λ ∧ dx^{μ_1} ∧ ⋯ ∧ dx^{μ_p} = (−1)^p dx^{μ_1} ∧ ⋯ ∧ dx^{μ_p} ∧ dx^λ        (6.48)

Let us return to the notion of a pull-back. This can be easily extended to any p-form.
Consider a C^∞ map f : M → N between two manifolds, and let ω be a p-form on N.
Then we define

    (f^*ω)_p(X_1, …, X_p) = ω_{f(p)}( f_* X_1, …, f_* X_p )        (6.49)

for tangent vectors X_i ∈ T_pM. Clearly f^*ω is antisymmetric, because ω is, so if ω ∈ Ω^p(N, R)
then f^*ω ∈ Ω^p(M, R). We can also define the pull-back of a 0-form, or function,
g : N → R, by the rule

    f^*g = g ∘ f        (6.50)

so that f^*g : M → R.


Theorem 6.8. Let f : M → N, let (y^1, …, y^n) be local coordinates on V ⊂ N, and (x^1, …, x^m)
local coordinates on U ∩ f^{−1}(V) ⊂ M. If

    ω = (1/k!) ∑_{ν=1}^{n} ω_{ν_1,…,ν_k}(q) dy^{ν_1}|_q ∧ ⋯ ∧ dy^{ν_k}|_q        (6.51)

then

    (f^*ω)_p = (1/k!) ∑_{μ=1}^{m} ∑_{ν=1}^{n} ω_{ν_1,…,ν_k}(f(p)) ∂/∂x^{μ_1}(y^{ν_1} ∘ f) ⋯ ∂/∂x^{μ_k}(y^{ν_k} ∘ f) dx^{μ_1}|_p ∧ ⋯ ∧ dx^{μ_k}|_p        (6.52)

for q ∈ N, p ∈ M.

Proof. It is sufficient to compute

    (f^*ω)_p( ∂/∂x^{μ_1}|_p , …, ∂/∂x^{μ_k}|_p ) = ω_{f(p)}( f_* ∂/∂x^{μ_1}|_p , …, f_* ∂/∂x^{μ_k}|_p )        (6.53)

We have already proven that

    f_* ∂/∂x^{μ_i}|_p = ∑_{ν_i} ∂/∂x^{μ_i}(y^{ν_i} ∘ f) ∂/∂y^{ν_i}|_{f(p)}        (6.54)

and hence on substituting this back into (6.53), one obtains the result.
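A concrete instance of (6.52): pull back the 2-form dx ∧ dy on R² by the polar-coordinate map f(r, θ) = (r cos θ, r sin θ). By (6.52) the coefficient of dr ∧ dθ is the Jacobian determinant of f, which a short sympy computation (purely illustrative, not part of the notes) confirms is r:

    import sympy as sp

    r, theta = sp.symbols('r theta', positive=True)

    # the map f(r, θ) = (x, y) = (r cos θ, r sin θ)
    x = r * sp.cos(theta)
    y = r * sp.sin(theta)

    # by (6.52), f*(dx ∧ dy) = (∂x/∂r ∂y/∂θ − ∂x/∂θ ∂y/∂r) dr ∧ dθ
    coeff = sp.diff(x, r) * sp.diff(y, theta) - sp.diff(x, theta) * sp.diff(y, r)
    print(sp.simplify(coeff))   # r, i.e. f*(dx ∧ dy) = r dr ∧ dθ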

Theorem 6.9. The exterior derivative and the pull-back commute: d(f^*ω) = f^*dω.

Proof. In a local coordinate system we have proven that

    f^*ω = (1/k!) ∑ ω_{ν_1,…,ν_k}(f(x)) (∂f^{ν_1}/∂x^{μ_1}) ⋯ (∂f^{ν_k}/∂x^{μ_k}) dx^{μ_1} ∧ ⋯ ∧ dx^{μ_k}        (6.55)

where we have defined f^ν = y^ν ∘ f. Therefore,

    d f^*ω = (1/k!) ∑ ∂/∂x^λ ( ω_{ν_1,…,ν_k}(f(x)) (∂f^{ν_1}/∂x^{μ_1}) ⋯ (∂f^{ν_k}/∂x^{μ_k}) ) dx^λ ∧ dx^{μ_1} ∧ ⋯ ∧ dx^{μ_k}
           = (1/k!) ∑ (∂f^ρ/∂x^λ) (∂/∂y^ρ) ω_{ν_1,…,ν_k}(f(x)) (∂f^{ν_1}/∂x^{μ_1}) ⋯ (∂f^{ν_k}/∂x^{μ_k}) dx^λ ∧ dx^{μ_1} ∧ ⋯ ∧ dx^{μ_k}
             + (1/k!) ∑ ω_{ν_1,…,ν_k} ∂/∂x^λ ( (∂f^{ν_1}/∂x^{μ_1}) ⋯ (∂f^{ν_k}/∂x^{μ_k}) ) dx^λ ∧ dx^{μ_1} ∧ ⋯ ∧ dx^{μ_k}
           = (1/k!) ∑ (∂/∂y^ρ) ω_{ν_1,…,ν_k}(f(x)) (∂f^ρ/∂x^λ) (∂f^{ν_1}/∂x^{μ_1}) ⋯ (∂f^{ν_k}/∂x^{μ_k}) dx^λ ∧ dx^{μ_1} ∧ ⋯ ∧ dx^{μ_k}
           = f^*dω        (6.56)


Note that the second line of this expression is obtained by using the chain rule. The
contribution from the third line vanishes: this is because ∂²f^{ν_1}/∂x^λ∂x^{μ_1} is symmetric
in λ, μ_1, whereas dx^λ ∧ dx^{μ_1} ∧ ⋯ ∧ dx^{μ_k} is antisymmetric in λ, μ_1; similarly all other
contributions from the ∂²f^{ν_j}/∂x^λ∂x^{μ_j} terms vanish.

There is also a particularly elegant relationship between the exterior derivative d and
the Lie derivative.

Definition 6.5. Given a vector field Y, the interior product i_Y is a map i_Y : Ω^p(M) → Ω^{p−1}(M)
defined by

    (i_Y η)(X_1, …, X_{p−1}) = η(Y, X_1, …, X_{p−1})        (6.57)

for η ∈ Ω^p(M) and vector fields X_1, …, X_{p−1}.

It is clear that if η is a p-form then i_Y η is a (p−1)-form, as it is linear in the X_i, and
interchange of any pair X_i, X_j changes the sign. Note that if ω is a 1-form then i_Y ω = ω(Y).

Theorem 6.10. If X is a vector field and ω a 1-form then

    L_X ω = d(i_X ω) + i_X dω        (6.58)

Proof. Working in local co-ordinates x^μ, note that

    ( d(i_X ω) + i_X dω )_μ = ∑_ν ( ∂_μ(ω_ν X^ν) + X^ν (dω)_{νμ} )
                            = ∑_ν ( (∂_μ ω_ν) X^ν + ω_ν ∂_μ X^ν + X^ν (∂_ν ω_μ − ∂_μ ω_ν) )
                            = ∑_ν ( X^ν ∂_ν ω_μ + ω_ν ∂_μ X^ν )
                            = (L_X ω)_μ        (6.59)

In fact, one can show that L_X ω = d(i_X ω) + i_X dω for any p-form ω; however, we shall
not do this here.
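For a 1-form on R² with arbitrary smooth components, the identity (6.58) can be checked symbolically in components, using the last line of (6.59) for (L_X ω)_μ. The following sympy sketch is our own illustration, not part of the notes:

    import sympy as sp

    x1, x2 = sp.symbols('x1 x2')
    coords = (x1, x2)

    # arbitrary components of a vector field X and a 1-form ω on R^2
    X = [sp.Function(f'X{i}')(x1, x2) for i in (1, 2)]
    w = [sp.Function(f'w{i}')(x1, x2) for i in (1, 2)]

    # (L_X ω)_μ = X^ν ∂_ν ω_μ + ω_ν ∂_μ X^ν
    lie = [sum(X[n] * sp.diff(w[m], coords[n]) + w[n] * sp.diff(X[n], coords[m])
               for n in range(2)) for m in range(2)]

    # d(i_X ω)_μ = ∂_μ(ω_ν X^ν)  and  (i_X dω)_μ = X^ν (∂_ν ω_μ − ∂_μ ω_ν)
    cartan = [sp.diff(sum(w[n] * X[n] for n in range(2)), coords[m])
              + sum(X[n] * (sp.diff(w[m], coords[n]) - sp.diff(w[n], coords[m]))
                    for n in range(2)) for m in range(2)]

    print([sp.simplify(lie[m] - cartan[m]) for m in range(2)])   # [0, 0]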

6.3 Integration on Manifolds

Let us first recall how we would integrate ω = p(x, y)dx + q(x, y)dy along a curve C :
[0, 1] → R^2 in R^2. A natural prescription is

    ∫_C p(x, y)dx + q(x, y)dy = ∫_0^1 ( p(C(t)) dC^x/dt + q(C(t)) dC^y/dt ) dt        (6.60)

Here we are thinking of dx and dy as the infinitesimal change in x and y along the curve
C; (dx, dy) = ( (dC^x/dt) dt, (dC^y/dt) dt ). We can rewrite this as

    ∫_C ω = ∫_0^1 C^*ω        (6.61)


Problem 6.2. Show that

    C^*ω = ( p(C(t)) d/dt (x ∘ C) + q(C(t)) d/dt (y ∘ C) ) dt        (6.62)

This definition clearly extends to the integral of a 1-form along a curve in an arbitrary

manifold. To define the integral of a general p-form over a manifold, we need to generalize

a curve to a p-dimensional surface.

Definition 6.6. Let I^p = [0, 1]^p = { (x^1, …, x^p) ∈ R^p : 0 ≤ x^μ ≤ 1 } be the p-cube in R^p.

(i) A p-simplex on M is a C^∞ map C : J → M where J is an open set in R^p that
contains I^p.

(ii) A 0-simplex is a map from {0} → M, i.e. it is just a point in M.

(iii) The support |C| of a p-simplex is the set C(I^p) ⊂ M.

Next we can consider "sums" of such surfaces.

Definition 6.7. A p-chain on M is a finite formal linear combination of p-simplices on
M with real coefficients, i.e. a general p-chain is

    σ_p = r_1 C_1 + ⋯ + r_k C_k        (6.63)

where r_i ∈ R and the C_i are p-simplices. The support of a p-chain σ_p = r_1 C_1 + ⋯ + r_k C_k is

    |σ_p| = ⋃_{i : r_i ≠ 0} |C_i|        (6.64)

Definition 6.8. We define the maps π^{(1)}_i : I^{p−1} → I^p and π^{(0)}_i : I^{p−1} → I^p by

    π^{(1)}_i(t_1, …, t_{p−1}) = (t_1, …, t_{i−1}, 1, t_i, …, t_{p−1})
    π^{(0)}_i(t_1, …, t_{p−1}) = (t_1, …, t_{i−1}, 0, t_i, …, t_{p−1})        (6.65)

i.e. these project a (p−1)-cube in R^{p−1} onto a side of the p-cube in R^p.

The boundary of a p-cube can then be constructed as the sum of all sides weighted with

a plus sign for front sides and a minus sign for the back sides, in other words a (p−1)-chain.

Example: If we look at I^2 then

    π^{(0)}_1(t) = (0, t)        π^{(0)}_2(t) = (t, 0)
    π^{(1)}_1(t) = (1, t)        π^{(1)}_2(t) = (t, 1)        (6.66)

As a point set the boundary is

    {(1, t) : t ∈ I^1} ∪ {(t, 1) : t ∈ I^1} ∪ {(0, t) : t ∈ I^1} ∪ {(t, 0) : t ∈ I^1}        (6.67)

but this does not take into account the fact that some sides are oriented differently to
others. This is achieved by considering the 1-chain

    π^{(1)}_1 − π^{(0)}_1 − π^{(1)}_2 + π^{(0)}_2        (6.68)

whose support is the point set consisting of the boundary of I^2.

This allows us to define the boundary of a p-simplex in M.


Figure 8: The (oriented) boundary of I^2

Definition 6.9. If C is a p-simplex in M then the boundary of C is denoted by ∂C and
is defined as the (p−1)-chain

    ∂C = ∑_{i=1}^{p} (−1)^{i+1} ( C ∘ π^{(1)}_i − C ∘ π^{(0)}_i )        (6.69)

For a p-chain σ = r_1 C_1 + ⋯ + r_k C_k we define

    ∂σ = ∑ r_i ∂C_i        (6.70)

The next theorem summarizes the notion that boundaries have no boundaries.

Theorem 6.11. ∂^2 = 0.

Proof. It suffices to show this for a p-simplex C in M. As

    ∂C = ∑_{i=1}^{p} (−1)^{i+1} ( C ∘ π^{(1)}_i − C ∘ π^{(0)}_i )        (6.71)

it follows that

    ∂^2 C = ∑_{i=1}^{p} (−1)^{i+1} ( ∂(C ∘ π^{(1)}_i) − ∂(C ∘ π^{(0)}_i) )
          = ∑_{j=1}^{p−1} ∑_{i=1}^{p} (−1)^{i+j} ( C ∘ π^{(1)}_i ∘ π^{(1)}_j − C ∘ π^{(1)}_i ∘ π^{(0)}_j − C ∘ π^{(0)}_i ∘ π^{(1)}_j + C ∘ π^{(0)}_i ∘ π^{(0)}_j )        (6.72)


Now, if j < i then

    π^{(α)}_i ∘ π^{(β)}_j (t_1, …, t_{p−2}) = π^{(α)}_i (t_1, …, β, …, t_{p−2}) = (t_1, …, β, …, α, …, t_{p−2})        (6.73)

and also (as j ≤ i − 1)

    π^{(β)}_j ∘ π^{(α)}_{i−1} (t_1, …, t_{p−2}) = π^{(β)}_j (t_1, …, α, …, t_{p−2}) = (t_1, …, β, …, α, …, t_{p−2})        (6.74)

where the final expression has β in the j-th position and α in the i-th position. Thus if
j < i then

    π^{(α)}_i ∘ π^{(β)}_j = π^{(β)}_j ∘ π^{(α)}_{i−1}        (6.75)

This shift of i → i − 1 introduces a minus sign into the sum due to the (−1)^{i+j} factor.
Hence, we see that the first term and the last term in ∂^2 C each sum to zero, and the
middle two terms together sum to zero.

Finally, we can define the integral of a p-form over a p-chain.

Definition 6.10. Let C be a p-simplex in M and ω a p-form. Then

    ∫_C ω = ∫_{I^p} C^*ω        (6.76)

where, if C^*ω = f(t_1, …, t_p) dt_1 ∧ ⋯ ∧ dt_p, the RHS is understood to mean the usual integral

    ∫_{I^p} C^*ω = ∫_0^1 … ∫_0^1 f(t_1, …, t_p) dt_1 … dt_p        (6.77)

If σ = ∑ r_i C_i is a p-chain then

    ∫_σ ω = ∑_i r_i ∫_{C_i} ω        (6.78)

Example: Consider the manifold R^2 − {(0, 0)}, the 1-form

    ω = (y dx)/(x^2 + y^2) − (x dy)/(x^2 + y^2)        (6.79)

and the curve C(t) = (cos(2πt), sin(2πt)). Then

    ∫_C ω = ∫_0^1 C^*ω
          = ∫_0^1 ( sin(2πt) d/dt(cos(2πt)) − cos(2πt) d/dt(sin(2πt)) ) dt
          = −2π ∫_0^1 ( sin^2(2πt) + cos^2(2πt) ) dt
          = −2π        (6.80)
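This integral is easy to verify symbolically; the following sympy snippet (purely illustrative) evaluates the pulled-back integrand of (6.80):

    import sympy as sp

    t = sp.symbols('t')
    x = sp.cos(2 * sp.pi * t)
    y = sp.sin(2 * sp.pi * t)

    # C*ω = (y x' − x y')/(x² + y²) dt, then integrate over [0, 1]
    integrand = (y * sp.diff(x, t) - x * sp.diff(y, t)) / (x**2 + y**2)
    print(sp.integrate(sp.simplify(integrand), (t, 0, 1)))   # -2*pi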


Problem 6.3. Consider the manifold R^2 − {(0, 0)}, the 1-form

    ω = (y dx)/(x^2 + y^2) − (x dy)/(x^2 + y^2)        (6.81)

What is

    ∫_C ω        (6.82)

along the curve C(t) = (2 + cos(2πt), 2 + sin(2πt))? Next consider the 2-form

    ω = (dx ∧ dy)/(x^2 + y^2)        (6.83)

What is

    ∫_C ω        (6.84)

where C : I^2 → R^2 − {(0, 0)} is given by C(t_1, t_2) = (t_1 + 1)(cos(2πt_2), sin(2πt_2))?

Problem 6.4. Consider the manifold R^2 − {(0, 0)} and the 1-form

    ω = (y dx)/(x^2 + y^2) − (x dy)/(x^2 + y^2)        (6.85)

Show that dω = 0. Is there a smooth function f such that ω = df?

The exterior derivative has one very important property:

    d^2 = 0        (6.86)

Thus if ω = dη then it follows that dω = 0. This motivates two definitions.

Definition 6.11. A p-form ω is closed if dω = 0. We denote the set of closed p-forms on
M by Z^p(M, R).

Definition 6.12. A p-form ω is exact if ω = dη for some (p−1)-form η on M. We denote
the set of exact p-forms on M by B^p(M, R).

Theorem 6.12. B^p(M, R) ⊂ Z^p(M, R).

Proof. If ω ∈ B^p(M, R) then ω = dη for some (p−1)-form η, hence dω = d^2η = 0, so
ω ∈ Z^p(M, R).

Since the space of p-forms is a vector space over R, we can define the following:

Definition 6.13. The p-th de Rham cohomology group H^p(M, R) is the quotient space

    H^p(M, R) = Z^p(M, R) / B^p(M, R)        (6.87)

where two closed p-forms are viewed as equivalent iff their difference is an exact form. The
dimension of H^p(M, R) is called the p-th Betti number b_p.


Theorem 6.13. If M and N are two manifolds, and f : M → N is a diffeomorphism,
then H^p(M, R) ≅ H^p(N, R).

Proof. Recall that we proved that the pull-back and exterior derivative commute (Theorem
6.9). Therefore if ω is a closed p-form on N then f^*ω is a closed form on M:

    d f^*ω = f^* dω = 0 .        (6.88)

Furthermore, if ω = dη is an exact form on N then

    f^*ω = f^* dη = d(f^*η)        (6.89)

is an exact p-form on M. Similarly, closed forms on M are pulled back using f^{−1} to closed
forms on N, and exact forms on M are pulled back to exact forms on N.

Thus the de Rham cohomology groups are capable of distinguishing between two distinct
manifolds, where two manifolds are regarded as equivalent if there is a diffeomorphism
between them. Note though that the converse is not true: there are plenty of examples of
inequivalent manifolds that have the same de Rham cohomology groups H^k(M, R). The
general idea of cohomology can be applied to any operator which is nilpotent, i.e. whose
action squares to zero, and is a central element of modern algebraic and geometric topology.

We finally arrive at a central theorem in differential geometry.

Theorem 6.14 (Stokes's Theorem). If ω ∈ Ω^{p−1}(M, R) and σ is a p-chain then

    ∫_σ dω = ∫_{∂σ} ω        (6.90)

Proof. By linearity, it suffices to show this is true for p-simplices C. Recall that as a
consequence of Theorem 6.9,

    ∫_C dω = ∫_{I^p} C^* dω = ∫_{I^p} d C^*ω        (6.91)

and by definition

    ∫_{∂C} ω = ∫_{∂I^p} C^*ω        (6.92)

So to prove that these two quantities are equal, it is sufficient to show that

    ∫_{I^p} dψ = ∫_{∂I^p} ψ        (6.93)

for any (p−1)-form ψ on R^p. As this condition is linear in ψ, it is sufficient to consider

    ψ = f(x) dx^1 ∧ ⋯ ∧ dx^{p−1}        (6.94)


so

    dψ = (−1)^{p−1} ∂_p f dx^1 ∧ ⋯ ∧ dx^p        (6.95)

and hence we evaluate

    ∫_{I^p} dψ = (−1)^{p−1} ∫_{I^p} ∂_p f dx^1 ∧ ⋯ ∧ dx^p
               = (−1)^{p−1} ∫_0^1 dx^1 … dx^{p−1} ∫_0^1 ∂_p f dx^p
               = (−1)^{p−1} ∫_0^1 dx^1 … dx^{p−1} ( f(x^1, …, x^{p−1}, 1) − f(x^1, …, x^{p−1}, 0) )        (6.96)

On the other hand, the boundary of I^p is

    ∂I^p = ∑_{i=1}^{p} (−1)^{i+1} ( π^{(1)}_i − π^{(0)}_i )
         = ∑_{i=1}^{p} (−1)^{i+1} ( {(x^1, …, x^{i−1}, 1, x^{i+1}, …, x^p) : x^j ∈ I^1}
                                   − {(x^1, …, x^{i−1}, 0, x^{i+1}, …, x^p) : x^j ∈ I^1} )        (6.97)

Now ψ = f dx^1 ∧ ⋯ ∧ dx^{p−1} will only have a non-vanishing contribution to the total
integral on those boundary components with x^p constant and x^1, …, x^{p−1} varying. Hence

    ∫_{∂I^p} ψ = (−1)^{p+1} ∫_{{(x^1,…,x^{p−1},1) : x^i ∈ I^1}} ψ − (−1)^{p+1} ∫_{{(x^1,…,x^{p−1},0) : x^i ∈ I^1}} ψ
               = (−1)^{p+1} ∫_0^1 dx^1 … dx^{p−1} f(x^1, …, x^{p−1}, 1) − (−1)^{p+1} ∫_0^1 dx^1 … dx^{p−1} f(x^1, …, x^{p−1}, 0)        (6.98)

and, since (−1)^{p+1} = (−1)^{p−1}, this establishes the proof.

This is a beautiful generalization of the following well-known result for 1-forms on R:

    ∫_a^b df = f(b) − f(a)        (6.99)

In particular, Stokes's theorem enables us to see quite explicitly the connection between
the identities d^2 = 0 and ∂^2 = 0, because if σ is a p-chain and ω is a (p−2)-form then

    0 = ∫_σ d^2ω = ∫_{∂σ} dω = ∫_{∂^2σ} ω = 0        (6.100)

where the first equality (reading from left to right) follows from d^2 = 0, the next two
equalities follow from Stokes's theorem, and the last equality follows from ∂^2 = 0.
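As a concrete check of (6.90) in the simplest nontrivial case p = 2, one can take a 1-form ψ = a(x, y) dx + b(x, y) dy on R², integrate dψ = (∂_x b − ∂_y a) dx ∧ dy over I², and compare with the integral of ψ over the oriented boundary 1-chain (6.68). The following sympy computation is purely illustrative; the particular polynomial choice of a and b is arbitrary:

    import sympy as sp

    x, y, t = sp.symbols('x y t')

    # an arbitrary polynomial 1-form ψ = a dx + b dy on R^2
    a = x**2 * y + 3 * y
    b = x * y**3

    # ∫_{I^2} dψ with dψ = (∂_x b − ∂_y a) dx ∧ dy
    lhs = sp.integrate(sp.diff(b, x) - sp.diff(a, y), (x, 0, 1), (y, 0, 1))

    # ∫_{∂I^2} ψ over the 1-chain π^(1)_1 − π^(0)_1 − π^(1)_2 + π^(0)_2 of (6.68);
    # each side is parametrized by t ∈ [0, 1] and ψ is pulled back to that side
    def pull_back(cx, cy):
        return (a.subs({x: cx, y: cy}) * sp.diff(cx, t)
                + b.subs({x: cx, y: cy}) * sp.diff(cy, t))

    rhs = (sp.integrate(pull_back(sp.Integer(1), t), (t, 0, 1))    # x = 1 side
           - sp.integrate(pull_back(sp.Integer(0), t), (t, 0, 1))  # x = 0 side
           - sp.integrate(pull_back(t, sp.Integer(1)), (t, 0, 1))  # y = 1 side
           + sp.integrate(pull_back(t, sp.Integer(0)), (t, 0, 1))) # y = 0 side

    print(lhs, rhs, sp.simplify(lhs - rhs))   # the two integrals agree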


7. Connections, Curvature and Metrics

So far, everything that we have discussed about manifolds has been intrinsic to the man-

ifold, defined as a topological space with a differentiable structure, and has not required

introducing any additional structures. However, it is very common and intuitive to intro-

duce some additional structures.

7.1 Connections, Curvature and Torsion

The first additional structure that we can introduce is that of a connection. We have
been emphasising that we cannot just differentiate a generic object such as a tensor on a
manifold, because we don't know how to construct something like x + ε where x ∈ M and
ε is a small parameter.

Problem 7.1. Show that if

    X = ∑_μ X^μ ∂/∂x^μ|_p        (7.1)

is a vector field expanded in some local coordinate chart φ_i(p) = (x^1(p), …, x^n(p)) and we
define

    d̃X = ∑_{μ,ν} ∂_ν X^μ dx^ν|_p ⊗ ∂/∂x^μ|_p        (7.2)

then d̃X is not a tensor. [N.B. the operator d̃ is to be distinguished from the exterior
derivative d; here d̃ is simply an operator defined by the above equation!]

However, check that if ω = ∑ ω_μ dx^μ|_p is a co-vector (1-form) then

    d̃ω = ∑_{μ,ν} (∂_ν ω_μ − ∂_μ ω_ν) dx^ν|_p ⊗ dx^μ|_p        (7.3)

is a tensor. Hint: consider how things look in a different coordinate chart ψ_i(p) = (y^1, …, y^n).

However, we can simply declare that there is a suitable derivative.

Definition 7.1. A connection (or, more accurately, an affine connection) on a manifold M
is an operator D which assigns to each vector field X on M a mapping D_X : TM → TM
such that for all X, Y, Z ∈ TM and f ∈ C^∞(M):

(i) D_X(Y + Z) = D_X Y + D_X Z

(ii) D_{X+Y} Z = D_X Z + D_Y Z

(iii) D_{fX} Y = f D_X Y

(iv) D_X(fY) = X(f) Y + f D_X Y

D_X Y is called the covariant derivative of Y along X.

N.B. The commutator (Lie derivative) obeys all but condition (iii).

In other words, D_X acts as a directional derivative along the direction determined by
X. However, the existence of a connection does not follow from the definition of a manifold
but requires us to add it in. In particular, a typical manifold can be endowed with infinitely
many different connections.

It is convenient to introduce a new notation. If x^1, …, x^n are local coordinates on
some chart of M, we let

    D_μ = D_{∂/∂x^μ|_p}        (7.4)

It is straightforward to see that D_X is entirely determined by its action on a set of basis
vectors, hence we introduce

    D_μ ∂/∂x^ν|_p = ∑_λ Γ^λ_{μν} ∂/∂x^λ|_p        (7.5)

and the Γ^λ_{μν} are known as the connection coefficients. Thus if

    X = ∑_μ X^μ ∂/∂x^μ|_p    and    Y = ∑_μ Y^μ ∂/∂x^μ|_p        (7.6)

then it follows from the definition of D_X that

    D_X Y = D_{∑ X^μ ∂/∂x^μ|_p} ( ∑_ν Y^ν ∂/∂x^ν|_p )
          = ∑ X^μ D_μ ( ∑_ν Y^ν ∂/∂x^ν|_p )
          = ∑ X^μ ( ∂/∂x^μ|_p (Y^ν) ∂/∂x^ν|_p + Y^ν D_μ ∂/∂x^ν|_p )
          = ∑ ( X^μ ∂_μ Y^λ + Γ^λ_{μν} X^μ Y^ν ) ∂/∂x^λ|_p        (7.7)

Theorem 7.1. Let Γ^λ_{μν} be the connection coefficients in a coordinate system (x^1, …, x^n);
then in another overlapping coordinate system (y^1, …, y^n), we have

    Γ^λ_{μν} = ∑ (∂x^ρ/∂y^μ) (∂x^τ/∂y^ν) (∂y^λ/∂x^σ) Γ^σ_{ρτ} + ∑ (∂y^λ/∂x^σ) (∂²x^σ/∂y^μ∂y^ν)        (7.8)

where we think of x^μ(y^ν) as the transition functions.

Proof. We have seen that the relationship between the standard vector field basis elements
in two overlapping coordinate systems is

    ∂/∂x^μ|_p = ∑ (∂y^ν(x)/∂x^μ) ∂/∂y^ν|_p    and    ∂/∂y^μ|_p = ∑ (∂x^ν(y)/∂y^μ) ∂/∂x^ν|_p        (7.9)


By definition, we have

    D_{∂/∂y^μ|_p} ( ∂/∂y^ν|_p ) = ∑ Γ^λ_{μν} ∂/∂y^λ|_p        (7.10)

However, we also have

    D_{∂/∂y^μ|_p} ( ∂/∂y^ν|_p )
        = D_{∑ (∂x^ρ(y)/∂y^μ) ∂/∂x^ρ|_p} ( ∑ (∂x^σ(y)/∂y^ν) ∂/∂x^σ|_p )
        = ∑ (∂x^ρ(y)/∂y^μ) D_{∂/∂x^ρ|_p} ( ∑ (∂x^σ(y)/∂y^ν) ∂/∂x^σ|_p )
        = ∑ (∂x^ρ(y)/∂y^μ) (∂x^σ(y)/∂y^ν) Γ^λ_{ρσ} ∂/∂x^λ|_p + ∑ (∂²x^σ(y)/∂y^μ∂y^ν) ∂/∂x^σ|_p
        = ∑ (∂x^ρ(y)/∂y^μ) (∂x^τ(y)/∂y^ν) Γ^σ_{ρτ} ∂/∂x^σ|_p + ∑ (∂²x^σ(y)/∂y^μ∂y^ν) ∂/∂x^σ|_p
        = ( ∑ (∂x^ρ(y)/∂y^μ) (∂x^τ(y)/∂y^ν) Γ^σ_{ρτ} (∂y^λ(x)/∂x^σ) + ∑ (∂²x^σ(y)/∂y^μ∂y^ν) (∂y^λ(x)/∂x^σ) ) ∂/∂y^λ|_p

On comparing the coefficients of ∂/∂y^λ|_p in these two expressions, one establishes the proof.

Thus the connection coefficients cannot be thought of as the components of a tensor,

because they do not transform in the appropriate way. However, from the connection we

can construct two associated tensors.

Definition 7.2. The torsion tensor is a (1, 2) tensor defined by

    T(X, Y, ω) = ω( D_X Y − D_Y X − [X, Y] )        (7.11)

for X, Y ∈ TM, ω ∈ T*M.

Theorem 7.2. In a local coordinate system, the torsion tensor is

    T = ∑ T^λ_{μν} dx^μ|_p ⊗ dx^ν|_p ⊗ ∂/∂x^λ|_p        (7.12)

where

    T^λ_{μν} = Γ^λ_{μν} − Γ^λ_{νμ}        (7.13)

Proof. We evaluate

    T^λ_{μν} = dx^λ( D_μ(∂/∂x^ν) − D_ν(∂/∂x^μ) )
             = dx^λ( Γ^σ_{μν} ∂/∂x^σ − Γ^σ_{νμ} ∂/∂x^σ )
             = Γ^λ_{μν} − Γ^λ_{νμ}        (7.14)


Secondly, we have the curvature (1, 3)-tensor R:

Definition 7.3. The curvature (1, 3) tensor is

    R(X, Y, Z, ω) = ω( −D_X(D_Y Z) + D_Y(D_X Z) + D_{[X,Y]} Z )        (7.15)

for X, Y, Z ∈ TM and ω ∈ T*M.

Problem 7.2. Prove that R is a tensor.

Theorem 7.3. In a local coordinate system the curvature tensor is

    R = ∑ R_{μνλ}{}^ρ dx^μ|_p ⊗ dx^ν|_p ⊗ dx^λ|_p ⊗ ∂/∂x^ρ|_p        (7.16)

where

    R_{μνλ}{}^ρ = −∂_μ Γ^ρ_{νλ} + ∂_ν Γ^ρ_{μλ} + Γ^σ_{μλ} Γ^ρ_{νσ} − Γ^σ_{νλ} Γ^ρ_{μσ}        (7.17)

Problem 7.3. Prove this.

The important property of these tensors is that they contain coordinate independent

information. In particular, if a tensor, such as torsion or curvature, vanishes in one coor-

dinate system, then it vanishes in all. This cannot be said of things like the connection

coefficients or other quantities that one might encounter when working with a particular

coordinate system.

Definition 7.4. A vector field X is parallel transported along a curve C if

    D_{T_C} X = 0        (7.18)

at each point on C, where T_C is the tangent to C.

This means that we think of X as being transported in such a way that it points in

the same direction along the curve. This is possible because we have a connection which

tells us how to compare vectors in the tangent spaces at different points.

We can give a geometric meaning to both the torsion and curvature tensors. First the
torsion: consider an infinitesimal displacement of the coordinate x^μ by a vector X:

    δ_X x^ν = ε X^ν        (7.19)

Then parallel transport this displacement along a direction Y. Parallel transport means
that

    0 = Y^μ ( ∂_μ X^ν + ∑ Γ^ν_{μλ} X^λ )        (7.20)

so that

    δ_Y X^ν = ε Y^μ ∂_μ X^ν = −ε ∑ Y^μ Γ^ν_{μλ} X^λ        (7.21)

Thus

    δ_Y δ_X x^ν = −ε^2 ∑ Y^μ Γ^ν_{μλ} X^λ        (7.22)

On changing the order of displacement, first along Y and then X, one similarly finds

    δ_X δ_Y x^ν = −ε^2 ∑ X^μ Γ^ν_{μλ} Y^λ        (7.23)

Therefore the difference between these two is measured by the torsion:

    [δ_X, δ_Y] x^ν = −ε^2 ∑ X^μ Y^λ T^ν_{μλ}        (7.24)

To understand the curvature, we first parallel transport a vector Z along a curve
with tangent X by an infinitesimal amount. Using the above formula one finds

    Z^ρ → Z^ρ − ε ∑ X^μ Γ^ρ_{μλ} Z^λ        (7.25)

Let us now parallel transport this along Y by an infinitesimal amount:

    Z^ρ → Z^ρ − ε ∑ X^μ Γ^ρ_{μλ} Z^λ − ∑ (ε Y^π) Γ^ρ_{πσ}(x + εX) ( Z^σ − ε X^μ Γ^σ_{μλ} Z^λ )
        = Z^ρ − ε ∑ X^μ Γ^ρ_{μλ} Z^λ − ε ∑ Y^π ( Γ^ρ_{πσ} + ε X^τ ∂_τ Γ^ρ_{πσ} ) ( Z^σ − ε X^μ Γ^σ_{μλ} Z^λ )
        = Z^ρ − ε ∑ (X^μ + Y^μ) Γ^ρ_{μλ} Z^λ − ε^2 ∑ Y^μ X^τ ∂_τ Γ^ρ_{μσ} Z^σ + ε^2 ∑ Γ^σ_{μλ} Γ^ρ_{πσ} Y^π X^μ Z^λ + O(ε^3)        (7.26)

If we first transport along Y and then X we find

    Z^ρ → Z^ρ − ε ∑ (Y^μ + X^μ) Γ^ρ_{μλ} Z^λ − ε^2 ∑ X^μ Y^τ ∂_τ Γ^ρ_{μσ} Z^σ + ε^2 ∑ Γ^σ_{μλ} Γ^ρ_{πσ} X^π Y^μ Z^λ + O(ε^3)        (7.27)

Then, with a little work, one sees that the difference between parallel transport along X
then Y, minus parallel transport along Y and then X, is expressed in terms of the
curvature:

    [δ_X, δ_Y] Z^ρ = ε^2 ∑ R_{μνλ}{}^ρ X^μ Y^ν Z^λ        (7.28)

Figure 9: Parallel transport along great circle arcs on S2


In the above, the vector field at A can be taken to C by parallel transport A→ B → C

or by A→ D → C. The resulting vector fields at C are not equal.

Figure 10: Parallel transport of vector field Z around a small parallelogram

The connection can be extended to define a covariant derivative on any tensor field.
We start by defining it on a co-vector ω by

    (D_X ω)(Y) = X(ω(Y)) − ω(D_X Y)        (7.29)

for any vector field Y.

Clearly,

    (D_X ω)(Y + Z) = (D_X ω)(Y) + (D_X ω)(Z)        (7.30)

for vector fields X, Y, Z, and if f ∈ C^∞(M) then

    (D_X ω)(fY) = X(ω(fY)) − ω(D_X(fY))
                = X(f ω(Y)) − ω( X(f) Y + f D_X Y )
                = X(f) ω(Y) + f X(ω(Y)) − X(f) ω(Y) − f ω(D_X Y)
                = f (D_X ω)(Y)        (7.31)

and hence D_X ω is a co-vector as claimed. In coordinates this is

    D_μ ω_ν = ∂_μ ω_ν − ∑ ω_λ Γ^λ_{μν}        (7.32)

where we have taken X = ∂/∂x^μ|_p , Y = ∂/∂x^ν|_p, so that

    ω(Y) = ω_ν    and    (D_X Y)^λ = Γ^λ_{μν}        (7.33)


This implies that

    D_μ(dx^ν) = −∑ Γ^ν_{μλ} dx^λ        (7.34)

The extension to an (r, s)-tensor is

    D_X T(ω^1, …, ω^r, Y_1, …, Y_s) = X( T(ω^1, …, ω^r, Y_1, …, Y_s) )
        − ∑_i T(ω^1, …, D_X ω^i, …, ω^r, Y_1, …, Y_s)
        − ∑_i T(ω^1, …, ω^r, Y_1, …, D_X Y_i, …, Y_s)        (7.35)

Problem 7.4. Convince yourself that D_X T is an (r, s)-tensor and that in a local coordinate
system

    D_λ T^{μ_1,…,μ_r}{}_{ν_1,…,ν_s} = ∂_λ T^{μ_1,…,μ_r}{}_{ν_1,…,ν_s}
        + ∑_i Γ^{μ_i}_{λρ} T^{μ_1,…,ρ,…,μ_r}{}_{ν_1,…,ν_s}
        − ∑_i Γ^ρ_{λν_i} T^{μ_1,…,μ_r}{}_{ν_1,…,ρ,…,ν_s}        (7.36)

7.2 Riemannian Manifolds

Another object that is frequently discussed is a metric. This allows one to measure distances

and angles on manifolds. Again, this is not implicit to a manifold, and typically there are

infinitely many possible metrics for a given manifold. For example, General Relativity is a

theory of gravity which postulates that spacetime is a manifold. The dynamical equations

of General Relativity (Einstein’s equations) then determine the metric.

Definition 7.5. A metric g on a manifold M is a non-degenerate symmetric (0, 2) tensor
defined at each point p ∈ M; that is, a map g_p : T_pM ⊗ T_pM → R such that

    g(X, Y) = g(Y, X),    g(X, Y + fZ) = g(X, Y) + f g(X, Z)        (7.37)

for X, Y, Z ∈ TM, f ∈ C^∞(M). We will assume that g is a smooth (0, 2) tensor field on M.

The non-degeneracy condition is equivalent to requiring that in any local co-ordinate
system (x^1, …, x^n), the components of the metric

    g_{μν} = g( ∂/∂x^μ , ∂/∂x^ν )        (7.38)

form a matrix with nonzero determinant.

Definition 7.6. A Riemannian manifold is a manifold with a positive definite metric

tensor field (positive definite means g(X,X) ≥ 0 with equality iff X = 0). If the metric is

not positive definite it is called a pseudo-Riemannian manifold.


From elementary linear algebra, an inner product allows us to define the lengths and
angles of vectors. Thus, with a (positive definite) metric we can define the lengths and
angles of tangent vectors. For example, we can define the angle between two intersecting
curves as

    arccos( g(T_1, T_2) / √( g(T_1, T_1) g(T_2, T_2) ) )        (7.39)

where T_1, T_2 are the tangent vectors to the two curves where they intersect. We can also
define the length of a curve C with tangent vector T to be

    ∫_C √( g(T, T) ) dτ        (7.40)

So we simply integrate the length of the tangent vector at each point of the curve.

Thus we can give a metric structure to the manifold by defining

    d(p, q) = inf_C ∫_C √( g(T, T) ) dτ        (7.41)

where C is a curve on M such that C(0) = p, C(1) = q.

Example: The Euclidean metric on R^n is simply

    g( ∂/∂x^μ , ∂/∂x^ν ) = δ_{μν}        (7.42)

This is in Cartesian coordinates. The length of a curve is therefore just

    ∫_C √( ∑_μ (dC^μ/dτ)(dC^μ/dτ) ) dτ        (7.43)

so in particular, for a straight line C^μ = p^μ + (q^μ − p^μ)τ one has

    ∫_0^1 √( ∑_μ (p^μ − q^μ)(p^μ − q^μ) ) dτ = √( ∑_μ (p^μ − q^μ)(p^μ − q^μ) )        (7.44)

which is the Pythagorean distance.

However, in general, the metric coefficients g_{μν} can be functions of the coordinates.
Indeed, even R^n with a different coordinate system will have non-trivial g_{μν}.

Example: We can think of M = R^2 − {(0, 0)}. This is clearly a manifold, as it is
an open subset of R^2. By putting different metrics on it, though, we can think of it in a
variety of ways. With the flat metric, written in polar coordinates,

    g = dr ⊗ dr + r^2 dθ ⊗ dθ        (7.45)

this is just what we naturally think of as M = R^2 − {(0, 0)}, a subset of the plane.

But we could also consider

    g' = dr ⊗ dr + dθ ⊗ dθ        (7.46)

This turns the manifold into a cylinder S^1 × R, although since r > 0 it is really only half
a cylinder.

There are also more exotic possibilities, such as

    g'' = dr ⊗ dr + cosh^2(r) dθ ⊗ dθ        (7.47)

This looks like a funnel where the radius of the circle starts at one and then grows expo-
nentially with r.

So it is clear that M admits numerous different metrics.
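One way to see that these are genuinely different geometries is to compute the length (7.40) of the circle r = r_0, parametrized by θ ∈ [0, 2π] so that its tangent vector is T = ∂/∂θ, in each metric. A short sympy computation, included purely as an illustration:

    import sympy as sp

    r0, theta = sp.symbols('r0 theta', positive=True)

    # sqrt(g(∂_θ, ∂_θ)) on the circle r = r0 for each of the metrics (7.45)-(7.47)
    speeds = {"g": r0, "g'": sp.Integer(1), "g''": sp.cosh(r0)}

    # length of the circle, ∫_0^{2π} sqrt(g(T, T)) dθ, as in (7.40)
    for name, speed in speeds.items():
        print(name, sp.integrate(speed, (theta, 0, 2 * sp.pi)))
        # 2*pi*r0, 2*pi, 2*pi*cosh(r0): three different circumferences for the "same" circle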

As is well known, an inner product induces an isomorphism between a vector space
and its dual. A metric tensor induces an isomorphism between T_pM and T*_pM at each
point p ∈ M. To be precise, given a vector field X, we can construct a co-vector ω_X by

    ω_X(Y) = g(X, Y)        (7.48)

Clearly this defines a linear map, i.e. ω_X ∈ T*M. If in a local coordinate system we have

    X = ∑ X^μ ∂/∂x^μ    and    g = ∑ g_{μν} dx^μ ⊗ dx^ν        (7.49)

then

    ω_X(Y) = ∑ (ω_X)_ν Y^ν = ∑ g_{μν} X^μ Y^ν        (7.50)

and therefore we see that

    (ω_X)_ν = ∑ g_{μν} X^μ        (7.51)

To see that all co-vectors arise in this way, suppose that ω is a co-vector (i.e. a linear map
from T_pM to R). Then it is defined by its action on a basis:

    ω( ∂/∂x^μ ) = ω_μ        (7.52)

where

    ω = ∑ ω_μ dx^μ    and    g = ∑ g_{μν} dx^μ ⊗ dx^ν        (7.53)

We can therefore define the vector

    X_ω = ∑ g^{μν} ω_ν ∂/∂x^μ        (7.54)


where g^{μν} is the matrix inverse to g_{μν}, which exists since g is non-degenerate. It now
follows that

    ω_{X_ω}(Y) = g(X_ω, Y)
               = ∑ g_{μν} X^μ_ω Y^ν
               = ∑ g_{μν} g^{μσ} ω_σ Y^ν
               = ∑ ω_ν Y^ν
               = ω(Y)        (7.55)

for all vector fields Y, so ω_{X_ω} = ω.

A metric tensor gives rise to an inverse metric (2, 0) tensor by

    g^{−1}(ω_X, ω_Y) = g(X, Y)        (7.56)

where we have used the fact that each co-vector can be identified with a unique vector.
Since this identification is linear, we see that

    g^{−1} : T*M ⊗ T*M → R        (7.57)

is linear, and also symmetric.

particular coordinate system

g−1(ωX , ωY ) =∑

(g−1)µν(ωX)µ(ωY )ν =∑

(g−1)µνgµλXλgνρY

ρ (7.58)

However, we also have

g(X,Y ) =∑

gλρXλY ρ (7.59)

and as these two expressions must be equal for all choices of X, Y , we find

(g−1)µν = gµν (7.60)

So the components of the tensor g−1 are those of the inverse metric. A metric tensor allows

us to raise and lower the indices on tensors.

Once a metric is supplied there is a natural choice of connection, known as the Levi-
Civita connection.

Theorem 7.4. On a (pseudo-)Riemannian manifold, there is a unique connection D such
that

(i) D_X g = 0 for any vector field X.

(ii) The torsion of D vanishes.


Proof. Let us start by assuming that such a connection exists. From the definition of a
covariant derivative on a (0, 2)-tensor we have

    0 = (D_X g)(Y, Z) = X(g(Y, Z)) − g(D_X Y, Z) − g(Y, D_X Z)        (7.61)

for three vector fields X, Y, Z. This implies, along with its cyclic permutations,

    X(g(Y, Z)) = g(D_X Y, Z) + g(Y, D_X Z)
    Y(g(Z, X)) = g(D_Y Z, X) + g(Z, D_Y X)
    Z(g(X, Y)) = g(D_Z X, Y) + g(X, D_Z Y)        (7.62)

Next, we assume that D has no torsion, so that D_X Y − D_Y X = [X, Y]. We use the
torsion-free condition to rewrite the first term on the RHS of all three lines above:

    X(g(Y, Z)) = g(D_Y X, Z) + g(D_X Z, Y) + g([X, Y], Z)
    Y(g(Z, X)) = g(D_Z Y, X) + g(D_Y X, Z) + g([Y, Z], X)
    Z(g(X, Y)) = g(D_X Z, Y) + g(D_Z Y, X) + g([Z, X], Y)        (7.63)

Consider the second plus the third minus the first of these expressions:

    Z(g(X, Y)) + Y(g(Z, X)) − X(g(Y, Z)) = 2 g(D_Z Y, X)
        + g([Y, Z], X) + g([Z, X], Y) − g([X, Y], Z)        (7.64)

Rearranging gives

    2 g(D_Z Y, X) = Z(g(X, Y)) + Y(g(Z, X)) − X(g(Y, Z))
        − g([Y, Z], X) − g([Z, X], Y) + g([X, Y], Z)        (7.65)

Because g is non-degenerate and X is arbitrary, this uniquely determines D_Z Y.

Conversely, if D is defined by (7.65), observe that on taking (7.65), interchanging
Z ↔ Y, and subtracting, one finds

    2 g( D_Z Y − D_Y Z + [Y, Z], X ) = 0        (7.66)

and as this must hold for all X, we find that the torsion must vanish. Also, on taking (7.65),
interchanging X ↔ Y, and adding, one finds

    2 g(D_Z Y, X) + 2 g(D_Z X, Y) = 2 Z(g(X, Y))        (7.67)

and we recover the condition D_Z g = 0.

Having established these identities, it remains to check that D defined in (7.65) does
indeed define a connection. Properties (i), (ii) of the definition of a connection follow
trivially. To test property (iii) we substitute Z → fZ in (7.65) and find

    2 g(D_{fZ} Y, X) = f Z(g(X, Y)) + Y(g(fZ, X)) − X(g(Y, fZ))
        − g([Y, fZ], X) − g([fZ, X], Y) + g([X, Y], fZ)
        = 2 f g(D_Z Y, X) + Y(f) g(Z, X) − X(f) g(Y, Z) − Y(f) g(Z, X) + X(f) g(Z, Y)
        = 2 f g(D_Z Y, X)        (7.68)


To test property (iv) we substitute Y → fY in (7.65) and find

    2 g(D_Z(fY), X) = Z(g(X, fY)) + fY(g(Z, X)) − X(g(fY, Z))
        − g([fY, Z], X) − g([Z, X], fY) + g([X, fY], Z)
        = 2 f g(D_Z Y, X) + Z(f) g(X, Y) − X(f) g(Y, Z) + Z(f) g(Y, X) + X(f) g(Y, Z)
        = 2 g( f D_Z Y + Z(f) Y, X )        (7.69)

and so D is a connection.

Therefore, given a metric tensor we also find a natural curvature tensor, the Riemann
curvature, which is the curvature tensor of the Levi-Civita connection.

Theorem 7.5. In a local coordinate system (x^1, …, x^n) the coefficients of the Levi-Civita
connection are

    Γ^λ_{μν} = (1/2) ∑ g^{λρ} ( ∂_μ g_{ρν} + ∂_ν g_{ρμ} − ∂_ρ g_{μν} )        (7.70)

Problem 7.5. Prove this.
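To see (7.70) in action, one can compute the Levi-Civita connection coefficients for a concrete metric, for instance the round metric g = dθ ⊗ dθ + sin²θ dφ ⊗ dφ on a chart of S². The following sympy sketch, which is our own illustration rather than part of the notes, implements (7.70) directly; it can be extended with (7.17) to check that the Riemann curvature of this metric does not vanish.

    import sympy as sp

    theta, phi = sp.symbols('theta phi', positive=True)
    coords = [theta, phi]
    n = 2

    # round metric on S^2: g = dθ⊗dθ + sin^2(θ) dφ⊗dφ
    g = sp.Matrix([[1, 0], [0, sp.sin(theta)**2]])
    ginv = g.inv()

    # Γ^λ_{μν} = (1/2) g^{λρ} (∂_μ g_{ρν} + ∂_ν g_{ρμ} − ∂_ρ g_{μν}),  formula (7.70)
    Gamma = [[[sp.simplify(sp.Rational(1, 2) * sum(
                  ginv[lam, rho] * (sp.diff(g[rho, nu], coords[mu])
                                    + sp.diff(g[rho, mu], coords[nu])
                                    - sp.diff(g[mu, nu], coords[rho]))
                  for rho in range(n)))
               for nu in range(n)] for mu in range(n)] for lam in range(n)]

    # only nonzero coefficients: Γ^θ_{φφ} = −sin θ cos θ, Γ^φ_{θφ} = Γ^φ_{φθ} = cos θ / sin θ
    for lam in range(n):
        for mu in range(n):
            for nu in range(n):
                if Gamma[lam][mu][nu] != 0:
                    print(f'Gamma^{coords[lam]}_{{{coords[mu]} {coords[nu]}}} =',
                          Gamma[lam][mu][nu])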

7.3 Symplectic Manifolds

Definition 7.7. A symplectic manifold (M, Ω) is a manifold equipped with a non-degenerate
2-form Ω which is closed (i.e. dΩ = 0).

The non-degeneracy condition is equivalent to requiring that det Ω ≠ 0, where here the
determinant is taken with respect to the components of Ω in any local co-ordinate system,
treating Ω as an antisymmetric matrix with components

    Ω_{μν} = Ω( ∂/∂x^μ , ∂/∂x^ν ).        (7.71)

The condition on the determinant is formulated in terms of co-ordinates, but can be
re-expressed in a co-ordinate independent fashion. The non-vanishing of det Ω is equivalent
to requiring the following: given any co-vector field ω, there exists a unique vector field X
such that

    ω = i_X Ω        (7.72)

(where we recall that (i_X Ω)(Y) = Ω(X, Y) for any vector field Y).

We also have the following theorem.

Theorem 7.6. If (M, Ω) is a symplectic manifold then the dimension of M must be even.

Proof. The fact that a symplectic manifold must have even dimension follows from

    det Ω = det Ω^T = det(−Ω) = (−1)^{dim(M)} det Ω        (7.73)

where in the above line we regard Ω as an antisymmetric matrix with components in
some local basis. So if M were odd-dimensional, this would force Ω to have vanishing
determinant, which is a contradiction.


We remark that there are many examples of symplectic manifolds. Of particular
importance to symplectic manifolds is the following theorem.

Theorem 7.7 (Darboux Theorem). If (M, Ω) is a 2m-dimensional symplectic manifold,
then there exists about each point a local co-ordinate system (q^1, …, q^m, p_1, …, p_m) such
that

    Ω = ∑_{i=1}^{m} dp_i ∧ dq^i        (7.74)

Proof. We shall not prove this here. The proof is non-examinable.

Definition 7.8. If (M, Ω) is a symplectic manifold, denote by Ω^{μν} the matrix inverse of
Ω_{μν} (in local co-ordinates), which satisfies Ω^{μλ} Ω_{λν} = δ^μ_ν.

It is instructive to compare Riemannian or pseudo-Riemannian manifolds with sym-

plectic manifolds. In the former case, the fundamental structure is the metric. In the latter

case, the fundamental structure is provided by the symplectic form Ω.

There are some similarities; for example, for Riemannian or pseudo-Riemannian man-
ifolds we have seen that given a vector field X one can construct an associated 1-form
ω_X via ω_X(Y) = g(X, Y). For symplectic manifolds, given X, there is a natural 1-form
associated with X, namely i_X Ω.

However, there are also crucial differences between Riemannian/pseudo-Riemannian

manifolds and symplectic manifolds. In particular, we have seen that symplectic mani-

folds must be even-dimensional, whereas Riemannian/pseudo-Riemannian manifolds can

be either even or odd-dimensional.

The Darboux theorem also indicates a significant difference. This can be seen when

one recalls that for a Riemannian or pseudo-Riemannian manifold, given a particular point

p ∈ M, one can always diagonalize the metric at p, the diagonal components of g being

non-zero constants. However, in general, one cannot argue that this form of g holds in

some patch containing p, for some local co-ordinates. This is because, if this were true,

then the connection coefficients of the Levi-Civita connection would vanish in these co-

ordinates, which would in turn imply that the Riemann curvature would vanish; and the

vanishing of the Riemann curvature is a co-ordinate independent condition. However, there

are manifolds (such as S2) for which it is known that the Riemann curvature is nonzero. So

it is apparent that in general, given p ∈M, one cannot arrange for some local co-ordinate

system around p in which the components of g are constant. However, for symplectic

manifolds, the Darboux theorem implies that given p ∈ M there does exist some local

co-ordinate system in which the components of Ω are constant!

Definition 7.9. If (M, Ω) is a symplectic manifold, then the Poisson bracket { , } is a map
C^∞(M) × C^∞(M) → C^∞(M) defined by

    {f, g} = ∑ Ω^{μν} ∂_μ f ∂_ν g        (7.75)

Theorem 7.8. If (M, Ω) is a symplectic manifold then the Poisson bracket satisfies

(i) {f, g} = −{g, f}

(ii) {f_1 + f_2, g} = {f_1, g} + {f_2, g}

(iii) {αf, g} = α{f, g} (for constant α ∈ R)

(iv) {f, gh} = {f, g}h + g{f, h}

(v) {f, {g, h}} + {g, {h, f}} + {h, {f, g}} = 0

where f, g, h, f_1, f_2 ∈ C^∞(M).

Proof. (i)− (iv) follow straightforwardly from the definition of the Poisson bracket and are

left as an exercise. One does not require the condition dΩ = 0 for these.

To prove (v), one must make use of dΩ = 0, and the proof is left as a problem for the

final example sheet. (It is not necessary to use the Darboux theorem to prove (v)).
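Property (v), the Jacobi identity, can at least be verified symbolically in a Darboux chart, where (7.74) with one degree of freedom gives {f, g} = ∂_q f ∂_p g − ∂_p f ∂_q g. The following sympy check is purely illustrative; it verifies (v) only for this local canonical form, not in general:

    import sympy as sp

    q, p = sp.symbols('q p')

    def pb(f, g):
        """Poisson bracket {f, g} = ∂_q f ∂_p g − ∂_p f ∂_q g in a Darboux chart."""
        return sp.diff(f, q) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, q)

    f = sp.Function('f')(q, p)
    g = sp.Function('g')(q, p)
    h = sp.Function('h')(q, p)

    jacobi = pb(f, pb(g, h)) + pb(g, pb(h, f)) + pb(h, pb(f, g))
    print(sp.simplify(jacobi))   # 0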


Material on this page is non-examinable

There is a close relationship between symplectic manifolds and Hamiltonian dynamics.
Using the Darboux theorem, work with local co-ordinates q^1, …, q^m, p_1, …, p_m such that

    Ω = ∑_{i=1}^{m} dp_i ∧ dq^i        (7.76)

so that in components (with the co-ordinates ordered as q^1, …, q^m, p_1, …, p_m)

    Ω_{μν} = [[ 0, −I_m ], [ I_m, 0 ]],    Ω^{μν} = [[ 0, I_m ], [ −I_m, 0 ]]        (7.77)

Suppose that a classical system is described by a Hamiltonian function H = H(p, q) ∈ C^∞(M).
We define the Hamiltonian vector field X_H by

    X^μ_H = ∑ Ω^{μν} ∂_ν H        (7.78)

The classical trajectories of particles whose motion is governed by H correspond to
integral curves of X_H parametrized by t. The integral curves are given by

    dq^i/dt = ∂H/∂p_i ,    dp_i/dt = −∂H/∂q^i        (7.79)

which are simply Hamilton's equations.

Furthermore, along these curves

    dH/dt = L_{X_H} H = X_H(H) = ∑ Ω^{μν} ∂_μ H ∂_ν H = 0        (7.80)

i.e. H is constant along the integral curves of X_H, which corresponds to conservation of
energy. Also note that for any function f = f(p, q),

    L_{X_H} f = X_H(f) = {f, H}        (7.81)
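As a simple illustration (our own, with an assumed Hamiltonian), take one degree of freedom and H = (p² + q²)/2, the harmonic oscillator. Then (7.79) gives dq/dt = p, dp/dt = −q, and one can check symbolically that H is conserved along the resulting integral curves:

    import sympy as sp

    t = sp.symbols('t')
    q0, p0 = sp.symbols('q0 p0')        # initial conditions

    # integral curve of X_H for H = (p^2 + q^2)/2 through (q0, p0):
    # it solves dq/dt = p, dp/dt = -q, i.e. Hamilton's equations (7.79)
    q = q0 * sp.cos(t) + p0 * sp.sin(t)
    p = p0 * sp.cos(t) - q0 * sp.sin(t)

    # check the equations of motion and conservation of H along the curve
    print(sp.simplify(sp.diff(q, t) - p))            # 0
    print(sp.simplify(sp.diff(p, t) + q))            # 0
    print(sp.simplify((p**2 + q**2) / 2))            # (p0**2 + q0**2)/2, independent of t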


