Date post: 07-Jul-2020

Connections, gauge theory and characteristic classes

Arie Peterson and Loek Spitz¹

Universiteit van Amsterdam

¹ Supervised by prof. dr. Erik Verlinde and prof. dr. Eric Opdam


Abstract

This paper tries to explain the mathematics behind gauge theory, especially fibre bundles and connections. We use those concepts to describe electromagnetism as a gauge theory, and show how the Aharonov-Bohm effect can be understood as a gauge-theoretic phenomenon. Finally, we introduce characteristic classes of vector bundles. It is aimed at advanced undergraduate students with an interest in both mathematics and physics.


Contents

0.1 Introduction

1 Fibre bundles and vector bundles
  1.1 Fibre bundles
  1.2 G-bundles
  1.3 G-torsors
  1.4 Principal G-bundles
  1.5 Vector bundles

2 Connections
  2.1 Connections
    2.1.1 Linear connections
    2.1.2 Invariant connections
  2.2 Parallel transport
  2.3 Curvature
  2.4 Holonomy and monodromy
  2.5 Covariant derivative

3 Applications
  3.1 Electromagnetism as a gauge theory
    3.1.1 The abelian Aharonov-Bohm effect
  3.2 Chern-Simons theory
  3.3 $S^2$ is no Lie group

4 Characteristic classes
  4.1 Introduction
  4.2 Characteristic classes
  4.3 Covariant derivative revisited
    4.3.1 Curvature
  4.4 Invariant polynomials
  4.5 Chern classes


0.1 Introduction

This paper is written as a bachelor thesis for the physics and mathematics programme at the Universiteit van Amsterdam. We were aided in this work by prof. dr. Erik Verlinde and prof. dr. Eric Opdam. While working on this project we have had, at different times, several different goals in mind: among those were understanding and explaining the Verlinde algebra, Chern-Simons theory, an article by Witten about topological quantum field theory, an article by Verlinde about the non-abelian Aharonov-Bohm effect, gauge theory, and characteristic classes. In the end we settled for explaining the mathematics behind gauge theory, describing electromagnetism as a gauge theory, showing how the Aharonov-Bohm effect can be understood as a gauge-theoretic phenomenon, and giving an introduction to characteristic classes.

In the first chapter we explain the concept of a fibre bundle quite extensively, as this is the basis on which everything else depends. In the second chapter we widen the mathematical basis by talking about connections, parallel transport, curvature, holonomy and monodromy, and covariant derivatives. Throughout these two chapters we try to give examples of the concepts we introduce, but most of these examples will be mathematical in nature (recurring most often is the Möbius strip). The physical examples will have to wait until chapter three, where we discuss electromagnetism and the Aharonov-Bohm effect. In the last chapter we treat characteristic classes, through use of invariant polynomials and the curvature of a connection.


Chapter 1

Fibre bundles and vector bundles

If you have two topological spaces, you can use them to define a new one: the product space. In a way, this is not a very interesting construction, as you have only one way of doing it. It is straightforward and easily pictured by putting one space on a horizontal axis, the other on a vertical one; the plane then represents the product space.

Now sometimes we want to have a few more options: we would like to be able to make a product of two spaces, but in a twisted way.

Example 1.0.1. In physics, this might occur in the following simple situation. Imagine you are walking along a path winding through a mountainous landscape, and you want to know what velocities you can have at a certain point on the path. Obviously, your velocity can only be in the direction of the path, either forwards or backwards. It can have any size (putting relativistic objections aside for a moment, for the sake of this simple example); therefore, we could say it can take values in the vector space $\mathbb{R}$. This goes for any point on the path. To describe the whole situation at once then, it would be useful to be able to look at a space which consists of the path with the vector space $\mathbb{R}$ attached to it at every point, in such a way as to make it tangent to the path. Since the path winds through the mountains, each copy of $\mathbb{R}$ has to be attached in its own way; we need (mathematical) tools to do this. These tools are precisely given by the concept of a fibre bundle (actually, in this example we are dealing with a very special sort of fibre bundle, namely a vector bundle, as we’ll see in section 1.5).

Example 1.0.2. A very simple mathematical use of fibre bundles is given by the Möbius strip (see figure 1.1). The Möbius strip is almost a product of the circle with the unit interval, but with one twist given to it. Another way of saying this is that the Möbius strip locally (a notion we will define precisely in a moment) looks like a product, but globally it is given a twist. This kind of construction is exactly what a fibre bundle does for you.


Figure 1.1: Möbius strip

1.1 Fibre bundles

Having given some intuitive ideas about fibre bundles, it is now time to give the mathematical definition and look at some examples more closely.

Definition 1.1.1. A fibre bundle consists of the following data:

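• Three topological spaces: the base space $X$, the fibre $F$ and the total space $E$.

• A map $p : E \to X$ called the projection.

• An open covering $\bigcup_\alpha U_\alpha = X$ of the base space $X$.

• For each $\alpha$ a homeomorphism $h_\alpha : p^{-1}(U_\alpha) \to U_\alpha \times F$ called a local trivialisation, with the property that if $y \in p^{-1}(x)$, then $h_\alpha(y) = (x, f)$ for some $f \in F$.

Furthermore, if $U_\alpha \cap U_\beta \neq \emptyset$ then the map

$$h_\alpha \circ h_\beta^{-1} : (U_\alpha \cap U_\beta) \times F \to (U_\alpha \cap U_\beta) \times F$$

can be written as

$$h_\alpha \circ h_\beta^{-1}(x, f) = (x, h_{\alpha\beta}(x)(f)), \tag{1.1}$$

where

$$h_{\alpha\beta} : U_\alpha \cap U_\beta \to \mathrm{Aut}(F)$$

are called the transition maps.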

Now before going any further, let us examine this definition more closely to try and find out why we want it to be like this. To begin with it might be a good idea to look at the simplest example of a fibre bundle: the product of two spaces $X$ and $F$. Just going through the definition, we have:

A base space $X$, a fibre $F$ and a total space $E$ which we define as $E = X \times F$. The projection $p$ we will take to be the usual projection onto the base space. The open covering of $X$ will be given by just the set $X$ and the local trivialisation can then be the identity on $p^{-1}(X) = E = X \times F$. Equation (1.1) is automatically satisfied by taking the sole transition map (for $U_\alpha = U_\beta = X$) to take every element of $X$ to the identity in $\mathrm{Aut}(F)$. The fibre bundle we have thus created is called the trivial bundle over $X$ with fibre $F$.


So it is possible to see a product of two spaces as a fibre bundle over one of them. Furthermore, this bundle is called the trivial bundle. In the definition above we have already seen the word ‘trivial’; we said that to have a fibre bundle you have to have some maps called local trivialisations. Now this is an interesting name for these maps; there must be a good reason for using it.

We call these maps local trivialisations because they trivialise the bundle locally; they tell you how the bundle can locally (in the inverse image of a $U_\alpha$ under $p$) be seen as a product of $U_\alpha$ with $F$, which as we saw is a trivial example of a fibre bundle. In the example above local is the same as global (the only open set in the covering being $X$ itself), therefore the bundle is not only trivialised locally, but even globally. Hence the name: trivial bundle.

There is another name in the definition of a fibre bundle that might give us a hint as to how we could visualize the concept: the transition maps. In order to appreciate their name, it is a good idea to examine them a bit closer. First, it is easy to see that these maps satisfy the following relations for $U_\alpha \cap U_\beta \neq \emptyset$ and $U_\alpha \cap U_\beta \cap U_\gamma \neq \emptyset$:

$$\begin{aligned}
h_{\alpha\alpha} &= \mathrm{Id}_F \\
h_{\beta\alpha} &= (h_{\alpha\beta})^{-1} \\
h_{\alpha\beta} \circ h_{\beta\gamma} \circ h_{\gamma\alpha} &= \mathrm{Id}_F
\end{aligned} \tag{1.2}$$

The thing to notice about these three relations is that they are very similar to the three conditions for an equivalence relation. If we want to understand the transition maps it might therefore be a good idea to see if we could use them to define an equivalence relation. This is indeed the way to go and it is done as follows.

Remark 1.1.2. Instead of being given a bundle and looking at the transition maps themselves to find out what they do, we will examine the situation where we have a base space $X$ with an open covering $\bigcup_\alpha U_\alpha = X$, a fibre $F$ and a set of maps satisfying (1.2), and use those data to define a bundle. First we define a space $\tilde{E}$ as the disjoint union of the products of the open sets with the fibre:

$$\tilde{E} = \bigcup_\alpha \{ (x, f, \alpha) \mid x \in U_\alpha,\ f \in F \}. \tag{1.3}$$

Next, we define an equivalence relation on $\tilde{E}$ by

$$(x, f, \alpha) \sim (x, h_{\alpha\beta}(x)(f), \beta) \tag{1.4}$$

for all $x \in U_\alpha \cap U_\beta$ and for all $U_\alpha$ and $U_\beta$ with non-empty intersection. The fact that the maps $h_{\alpha\beta}$ satisfy (1.2) ensures that this is indeed an equivalence relation (check this yourself!). Now that we have an equivalence relation on $\tilde{E}$ we can define a new space $E$ as the quotient space of $\tilde{E}$ under $\sim$:

$$E = \tilde{E} / \sim \tag{1.5}$$

It can be proven without much difficulty that, if we take $E$ as the total space and define a projection and local trivialisations in a natural way, this will give us a fibre bundle. We will not do so here, but the reader can verify it for him- or herself.


Before going through all this we promised that it would give some insight into why the maps $h_{\alpha\beta}$ are called the transition maps. Now that we have done some work, we can see that the transition maps are indeed accurately named; they describe the transition from looking at the fibre bundle locally at $U_\alpha$ to looking at it locally at $U_\beta$, whenever these two have nonempty intersection. Equation (1.4) describes exactly how we go from the one viewpoint to the other.

Another way to look at equation (1.4) (and hence at the transition maps) is to view it as a gluing instruction for the total space $E$. To understand this, let us look back at what we have done above. We constructed a space $\tilde{E}$ consisting of products of open sets with the fibre, which are all totally disjoint from one another. Next, we defined an equivalence relation which related points in different products to each other, using the transition maps. Lastly, we defined our total space as the quotient of $\tilde{E}$ under $\sim$, thus taking equivalent points and making them identical. This last action could be visualised as gluing together the space $E$ from the loose fragments of $\tilde{E}$.

Fibre bundles are often constructed in exactly this way, by telling you how to glue together different products of open sets with the fibre; i.e. by giving the transition maps. A simple example of this is the Möbius strip.

Example 1.1.3. The Möbius strip can be constructed in the following way. Take the circle $S^1$ (seen as the circle in $\mathbb{R}^2$ in this example) as the base space and the interval $F = [-1, 1]$ as the fibre. Take as an open cover of $S^1$ the sets $U$, consisting of all points with second coordinate greater than $-\tfrac{1}{2}$, and $V$, consisting of all points with second coordinate smaller than $\tfrac{1}{2}$. The intersection of these then consists of two parts: one to the left of the vertical axis (call it $W_1$), and one to the right ($W_2$). Now define the transition function $h_{UV} : U \cap V \to \mathrm{Aut}(F)$ so that $h_{UV}(x)(i) = i$ for $x \in W_1$ and $h_{UV}(x)(i) = -i$ for $x \in W_2$. The fibre bundle constructed from this data by the method described above is precisely the Möbius strip, as can be easily seen.
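The transition function of this example can be tried out numerically. The following Python sketch is only an illustration and not part of the construction itself: the names `circle_point`, `in_U`, `in_V`, `h_UV` and `h_VU` are our own, points of the circle are coordinate pairs in the plane, and elements of Aut(F) are modelled as plain Python functions on the interval [-1, 1].

```python
import math

def circle_point(theta):
    """Point of S^1 in R^2 at angle theta (hypothetical helper)."""
    return (math.cos(theta), math.sin(theta))

def in_U(pt):
    """U: points with second coordinate greater than -1/2."""
    return pt[1] > -0.5

def in_V(pt):
    """V: points with second coordinate smaller than 1/2."""
    return pt[1] < 0.5

def h_UV(pt):
    """Transition map at a point of the overlap, returned as a map F -> F.

    On W1 (left of the vertical axis) it is the identity,
    on W2 (right of the vertical axis) it is the flip i -> -i.
    """
    if not (in_U(pt) and in_V(pt)):
        raise ValueError("point is not in the overlap of U and V")
    sign = 1 if pt[0] < 0 else -1
    return lambda i: sign * i

def h_VU(pt):
    """Inverse transition map; here every h_UV(x) is its own inverse."""
    return h_UV(pt)

left = circle_point(math.pi)   # lies in W1
right = circle_point(0.0)      # lies in W2
for pt in (left, right):
    for i in (-1.0, -0.25, 0.0, 0.5, 1.0):
        # second relation of (1.2): h_VU(x) undoes h_UV(x)
        assert h_VU(pt)(h_UV(pt)(i)) == i
```

With only two open sets in the cover, the first two relations of (1.2) are the only ones that need checking; the third involves no genuinely new triple overlap.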

There is some additional terminology we will often use:

Remark 1.1.4. If $F \to E \xrightarrow{p} X$ is a fibre bundle, we call $F$ the abstract fibre of the bundle. This contrasts with the concrete fibre $p^{-1}(x) \subseteq E$ ‘above’ a point $x \in X$ in the base space. In these terms, the trivialisation maps provide identifications of the concrete fibres with the abstract fibre.

Definition 1.1.5. A section of a fibre bundle $F \to E \xrightarrow{p} X$ on a subset $U$ of the base space $X$ is a continuous map $s : U \to E$, such that $p \circ s = \mathrm{id}_U$ (possible additional smoothness constraints on $s$ depend on the context). In words, a section on $U$ chooses for each $x \in U$ an element of the fibre over $x$. A global section is a section on the whole base space $X$.

We could now go on by giving more examples of fibre bundles, defining what a morphism of fibre bundles is and inspecting what it means to be an isomorphism. However, since we will not need these in the remainder of the text we will not do so, but concentrate instead on those notions that will be important to us later on. The first of these is the concept of a G-bundle, which we explain in the next section.


1.2 G-bundles

Often it is useful to have some additional algebraic structure on a fibre bundle. An example of this is the notion of a G-bundle. If $G$ is a group, then a G-bundle is a fibre bundle together with a left action¹ $\lambda$ of $G$ on the fibre such that the transition functions can be represented by elements of $G$. To be more precise: we must have that $h_{\alpha\beta}(x) \in \lambda(G)$ for all $x \in U_\alpha \cap U_\beta$ and for all $\alpha$ and $\beta$. Using multiplicative notation for the action: for every $x \in U_\alpha \cap U_\beta$ there must be a $g \in G$ such that $h_{\alpha\beta}(x)(f) = g \cdot f$ for all $f \in F$.

Since we have, in a G-bundle, that $h_{\alpha\beta}(x) \in \lambda(G)$, we can make maps $g_{\alpha\beta} : U_\alpha \cap U_\beta \to G$ such that $h_{\alpha\beta}(x) = \lambda(g_{\alpha\beta}(x))$. If we take the action to be effective (which can always be done by looking at $G/N$ instead of $G$, where $N$ is the normal subgroup of elements that act trivially), these maps are unique; we will then also speak of them as transition maps. It is easy to see from (1.2) that these maps satisfy

$$g_{\alpha\alpha}(x) = e \tag{1.6}$$

$$g_{\beta\alpha}(x) = g_{\alpha\beta}(x)^{-1} \tag{1.7}$$

$$g_{\alpha\beta}(x)\, g_{\beta\gamma}(x)\, g_{\gamma\alpha}(x) = e \tag{1.8}$$

Remark 1.2.1. There is a slightly different point of view: if we are given a G-bundle, we say that $G$ is the structure group of this bundle. Note, however, that this is somewhat sloppy language, since there are in general many possible structure groups for a given bundle. (For example, every bundle with fibre $F$ is an $\mathrm{Aut}(F)$-bundle.) It would be more precise in this case to say that the bundle can be made into a G-bundle. Alternatively, we could change the definition of ‘G-bundle’, to make the group $G$ part of its data.

On a related note, suppose that we are given a G-bundle. One may wonder whether we really need all of the elements of $G$ for the transition maps; perhaps it is also an H-bundle, for some subgroup $H$ of $G$. This can be non-obvious: it may be necessary to pick a very specific covering of the base space, or transition maps that differ from the original ones. If the bundle is in fact also an H-bundle, this insight is called reduction of the structure group (from $G$ to $H$).

We shall see more examples in the section on principal G-bundles.

1.3 G-torsors

G-torsors are quite common in physics. Usually, their presence is not made explicit and their properties are described in an intuitive way only.

Let us first give a few examples.

Example 1.3.1. The energy of a system is not a well-defined quantity: only energy differences can be measured (and thereby have physical significance)². One may agree to elect a certain state of the system as ‘ground state’ and declare it to have zero energy. The energy of other states is then defined to be the energy difference with the ground state (just a real number).

However, which state qualifies as ground state depends on the situation and may be open to discussion.

¹ For those who are not familiar with the concept of an action, it is explained in section 1.3.

² In general relativity, this is no longer true.


Example 1.3.2. Consider positions in our three-dimensional world. It does not make sense to add two such positions. What would be the sum of the summit of Kilimanjaro and the centre of the moon? We can subtract positions, though, obtaining something we might call a displacement: a vector having both a length and a direction. Furthermore, it does make sense to add two displacements (as vectors in 3-space).

After choosing an origin, positions are conveniently specified by giving the displacement between the position and the origin. This does introduce a certain danger, though: you might be tempted to add two positions (after all, now that we chose an origin, they are just vectors), even though we agreed that that is a meaningless thing to do.
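The distinction between positions and displacements can be mirrored in code by giving them different types. The Python sketch below is an illustration under our own assumptions: the class names `Position` and `Displacement`, the integer coordinates, and the sample points `summit` and `moon` are all made up for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Displacement:
    dx: int
    dy: int
    dz: int

    def __add__(self, other):
        # displacements form a group under addition
        if not isinstance(other, Displacement):
            return NotImplemented
        return Displacement(self.dx + other.dx, self.dy + other.dy, self.dz + other.dz)

@dataclass(frozen=True)
class Position:
    x: int
    y: int
    z: int

    def __sub__(self, other):
        # the difference of two positions is a displacement
        if not isinstance(other, Position):
            return NotImplemented
        return Displacement(self.x - other.x, self.y - other.y, self.z - other.z)

    def __add__(self, other):
        # a displacement can act on a position; adding two positions is refused
        if not isinstance(other, Displacement):
            return NotImplemented
        return Position(self.x + other.dx, self.y + other.dy, self.z + other.dz)

summit = Position(1, 2, 3)   # made-up coordinates
moon = Position(4, 6, 8)     # made-up coordinates
step = moon - summit         # a Displacement
assert summit + step == moon
```

Evaluating `summit + moon` raises a `TypeError`, which is the programmatic version of the remark that the sum of two positions is meaningless.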

Example 1.3.3. A more mathematical example concerns the many possible bases of a vector space. Let’s confine ourselves to a finite-dimensional one $V$, say of dimension $n$. We consider the set $B(V)$ of all ordered bases of $V$. Given such a basis and an element of $GL(n)$ (the set of invertible $n \times n$ matrices), we can construct a new basis by taking respective linear combinations (specified by the columns of the matrix) of the old basis vectors. This construction has some nice properties:

• If we first apply a matrix $A$ to a basis, and then apply another matrix $B$ to the outcome, we get the same as when we apply the product $AB$ to the first basis.

• Given two ordered bases of $V$, there is a unique matrix in $GL(n)$ so that the matrix applied to the first basis gives the second one (we might call it the quotient of the second basis by the first).

It does not make sense to multiply two bases. However, once we pick some basis as ‘origin’, we can identify other bases with their quotients by the origin (elements of $GL(n)$).
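The two properties listed in this example can be checked by hand in a small case. The following Python sketch assumes $V = \mathbb{Q}^2$ and stores an ordered basis as the 2×2 matrix whose columns are the basis vectors; the helper names `matmul`, `inverse_2x2`, `act` and `quotient` are our own and not taken from the text.

```python
from fractions import Fraction

def matmul(P, Q):
    """Product of two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse_2x2(M):
    """Exact inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

def act(basis, A):
    """Apply A to a basis: column j of the result is the linear combination
    of the old basis vectors with coefficients from column j of A."""
    return matmul(basis, A)

def quotient(first, second):
    """The unique matrix A with act(first, A) == second."""
    return matmul(inverse_2x2(first), second)

C = [[1, 1], [0, 1]]   # basis vectors (1, 0) and (1, 1)
A = [[2, 0], [1, 1]]
B = [[0, 1], [1, 0]]

# first property: applying A and then B equals applying the product AB
assert act(act(C, A), B) == act(C, matmul(A, B))
# second property: the quotient recovers the matrix that was applied
assert quotient(C, act(C, A)) == A
```

Storing a basis as a matrix already uses the standard basis of $\mathbb{Q}^2$ as ‘origin’, so this sketch illustrates the identification of bases with elements of $GL(n)$ rather than avoiding it.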

The general pattern will be clear. The mathematical description is as follows:

Definition 1.3.4.

A left action of a group $G$ on a set $X$ is a map $\mu : G \times X \to X : (g, x) \mapsto g \cdot x$ such that

1. $(gh) \cdot x = g \cdot (h \cdot x)$ for all $g, h \in G$ and $x \in X$;

2. $e \cdot x = x$ for all $x \in X$, with $e$ the identity element of $G$.

A right action of a group $G$ on a set $X$ is a map $\nu : X \times G \to X : (x, g) \mapsto x \cdot g$ such that

1. $x \cdot (gh) = (x \cdot g) \cdot h$ for all $g, h \in G$ and $x \in X$;

2. $x \cdot e = x$ for all $x \in X$, with $e$ the identity element of $G$.

Alternatively, we can curry the map $\mu : G \times X \to X$ to get a map $\lambda : G \to \mathrm{End}(X)$ (where $\mathrm{End}(X)$ is the monoid of functions from $X$ to itself). The reader should check that $\mu$ is a left action if and only if the image of $\lambda$ is contained in $\mathrm{Aut}(X)$ (the group of bijections from $X$ to itself) and $\lambda$ is a group homomorphism.
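Currying is easy to make concrete for a finite group. The sketch below is a toy case of our own choosing: the group is $\mathbb{Z}/4\mathbb{Z}$ (written additively), the set consists of four compass directions, and the names `mu`, `curry`, `lam`, `is_bijection` and `is_homomorphism` are hypothetical.

```python
from itertools import product

N = 4
G = list(range(N))                        # Z/4Z, written additively, identity 0
X = ["north", "east", "south", "west"]

def mu(g, x):
    """Left action: rotate a compass direction by g quarter turns."""
    return X[(X.index(x) + g) % N]

def curry(action):
    """Turn mu : G x X -> X into lam : G -> (X -> X)."""
    return lambda g: (lambda x: action(g, x))

lam = curry(mu)

def is_bijection(f):
    """Check that f : X -> X hits every element exactly once."""
    return sorted(f(x) for x in X) == sorted(X)

def is_homomorphism():
    """Check lam(g + h) = lam(g) composed with lam(h) on all of X."""
    return all(lam((g + h) % N)(x) == lam(g)(lam(h)(x))
               for g, h in product(G, G) for x in X)

assert all(is_bijection(lam(g)) for g in G)
assert is_homomorphism()
```

Both checks succeed here because `mu` satisfies the two axioms of a left action; the exercise in the text is the statement that this always goes both ways.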


A set together with a left (right) action of a group $G$ on it is sometimes called a left (right) G-set. It is easy to check that left actions of $G$ on $X$ correspond to right actions of $G^{\mathrm{op}} \cong G$ on $X$³: if $\mu : G \times X \to X$ is a left action, then $\nu : X \times G \to X : (x, g) \mapsto \mu(g^{-1}, x)$ is a right action. The following definitions regarding left actions can be extended by declaring a right action to have a property if and only if its corresponding left action has it.

• If $G$ acts on $X$ and $x$ is a point of $X$, then the orbit of $x$ is the set $G \cdot x = \{ g \cdot x : g \in G \}$ of points that can be reached by acting on $x$. One element being in another element’s orbit is an equivalence relation, so the set $X$ is partitioned into disjoint orbits.

• An action $\mu : G \times X \to X$ is transitive if for all $x, y \in X$ there exists some $g \in G$ that sends $x$ to $y$: $g \cdot x = y$. Equivalently, $\mu$ is transitive iff $X$ has exactly one orbit (namely $X$ itself).

• It is called free if different elements of $G$ act differently on every point of $X$: $g \cdot x = h \cdot x \Rightarrow g = h$.

• Now, a G-torsor is a set $X$ together with a free transitive action of $G$ on $X$. $X$ is then also called a principal homogeneous space for $G$. The unique element $g$ such that $g \cdot x = y$ is called the quotient of $y$ by $x$ and denoted by $y/x$ (in case of a right action, we denote the element $g$ such that $x \cdot g = y$ by $x \backslash y$).

You can think of a G-torsor as a copy of $G$ that has forgotten which of its elements is the identity. As soon as we choose an $x_0 \in X$, we can identify $X$ with $G$ by sending $x$ to $x/x_0$.
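For a finite group, the defining properties of a torsor can be tested by brute force. The Python sketch below uses an example of our own: $\mathbb{Z}/12\mathbb{Z}$ acting on the hours of a clock face. All function names are hypothetical, and `act_twice` is included only as a contrasting action that is not a torsor.

```python
N = 12
G = list(range(N))            # Z/12Z, written additively
X = list(range(1, N + 1))     # hours on a clock face, 1..12

def act(g, x):
    """Left action: advance the hour x by g hours."""
    return (x - 1 + g) % N + 1

def act_twice(g, x):
    """Another action: advance by 2*g hours (neither free nor transitive)."""
    return (x - 1 + 2 * g) % N + 1

def is_transitive(action):
    return all(any(action(g, x) == y for g in G) for x in X for y in X)

def is_free(action):
    return all(g == h for x in X for g in G for h in G
               if action(g, x) == action(h, x))

def quotient(y, x):
    """The unique g with g . x = y, written y/x in the text."""
    solutions = [g for g in G if act(g, x) == y]
    assert len(solutions) == 1
    return solutions[0]

def identify(x, x0):
    """Identify X with G after choosing the base point x0: x -> x/x0."""
    return quotient(x, x0)

assert is_free(act) and is_transitive(act)
assert not is_free(act_twice) and not is_transitive(act_twice)
```

Different base points give different identifications: `identify(3, 12)` is 3 while `identify(3, 3)` is 0, which is the sense in which the clock face has ‘forgotten’ its identity element.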

Remark 1.3.5. Typically, if $G$ is a topological group and $X$ a topological space, we demand that the map $\mu : G \times X \to X$ is continuous, and then say that the action is continuous. Analogously, if $G$ is a Lie group, $X$ a smooth manifold and $\mu$ a smooth map, then the action is said to be smooth as well.

1.4 Principal G-bundles

Definition 1.4.1. A principal G-bundle is a G-bundle $F \to E \xrightarrow{p} X$ with a right action of $G$ on $F$ and a right action of $G$ on $E$, such that

1. the orbits of the action on $E$ are precisely the fibres of the bundle, and every fibre is a G-torsor (that is, restricting the action of $G$ to a fibre makes that fibre a G-torsor);

2. for every trivialisation $h_U : p^{-1}(U) \to U \times F$, and every $a \in p^{-1}(U)$ and $g \in G$, if

$$h_U(a) = (x, f) \tag{1.9}$$

then

$$h_U(a \cdot g) = (x, f \cdot g) \tag{1.10}$$

holds. Note that in the last equation the left-hand side features the action of $G$ on $E$, while the right-hand side mentions the action of $G$ on $F$.

³ $G^{\mathrm{op}}$, the opposite of $G$, is a group with the same underlying set as $G$, but with multiplication reversed: $a *_{\mathrm{op}} b = b * a$. $G^{\mathrm{op}}$ and $G$ are isomorphic via $a \mapsto a^{-1}$.


Remarks.

• Explicitly, condition 1 means that for $x, y \in E$ the equation $x \cdot g = y$ can be solved for $g \in G$ only if $p(x) = p(y)$, and that it then has a unique solution.

• Note that in addition to the local triviality of a general fibre bundle, condition 2 demands that the right action of $G$ on the principal bundle is locally ‘trivial’ as well: on a chart, it must be given by the action of $G$ on the abstract fibre $F$.

• Since the action on a fibre is free and transitive, the action of $G$ on $F$ is free and transitive as well: $F$ is a G-torsor. Because of this, many authors define a principal G-bundle to have $G$ itself as abstract fibre instead of some $F$, and replace the action of $G$ on $F$ by right multiplication of $G$ on itself. This is no restriction (for all G-torsors are isomorphic to $G$ itself), but it sometimes imposes an inconvenient (because unnatural) choice.

• Global sections on a principal G-bundle $G \to E \xrightarrow{p} X$ can be hard to come by⁴: they exist if and only if the bundle is trivial! The ‘if’ part is obvious: just pick an element $g \in G$, take the global section $x \mapsto (x, g)$ of the trivial bundle and use a global trivialisation to make this into a global section of $p$. The ‘only if’ part goes as follows: let $s : X \to E$ be a global section. We can make a global trivialisation $h : E \to X \times G$ by defining $h^{-1}(x, g) = s(x) \cdot g$ (this is invertible; $h$ can be given explicitly as $h(e) = (p(e), s(p(e)) \backslash e)$), so the bundle is trivial.
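The claim that the two maps in the last remark are mutually inverse can be spelled out in two lines. The computation below is a sketch that uses only facts stated above: $p \circ s = \mathrm{id}_X$, the action of $G$ preserves the fibres of $p$, and $a \backslash b$ denotes the unique $g$ with $a \cdot g = b$. It checks bijectivity only and says nothing about continuity.

```latex
\begin{aligned}
h\bigl(h^{-1}(x, g)\bigr)
  &= h\bigl(s(x) \cdot g\bigr)
   = \bigl(p(s(x) \cdot g),\; s(p(s(x) \cdot g)) \backslash (s(x) \cdot g)\bigr) \\
  &= \bigl(x,\; s(x) \backslash (s(x) \cdot g)\bigr)
   = (x, g), \\[4pt]
h^{-1}\bigl(h(e)\bigr)
  &= h^{-1}\bigl(p(e),\; s(p(e)) \backslash e\bigr)
   = s(p(e)) \cdot \bigl(s(p(e)) \backslash e\bigr)
   = e .
\end{aligned}
```

In the first line, $p(s(x) \cdot g) = p(s(x)) = x$ because the action stays within a fibre, and $s(x) \backslash (s(x) \cdot g) = g$ because the action on a fibre is free. In the second line, the quotient $s(p(e)) \backslash e$ exists because $s(p(e))$ and $e$ lie in the same fibre and the action on that fibre is transitive.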

Example 1.4.2. A major example of a principal G-bundle is the so calledframe bundle of a vector bundle. We will treat it together with vector bundlesin 1.5.

Example 1.4.3. The map $p : S^1 \to S^1 : z \mapsto z^2$ (where we see $S^1$ as the unit complex numbers) is a fibre bundle with a two-point fibre. We can make it a principal $\mathbb{Z}/2\mathbb{Z}$-bundle by letting the non-identity element $1 \in \mathbb{Z}/2\mathbb{Z}$ act by switching on the abstract fibre and by inversion ($z \mapsto -z$) on $E = S^1$. The latter action preserves the fibres (since $(-z)^2 = z^2$) and acts freely and transitively on them (since all fibres are of the form $\{z, -z\}$). Condition 2 is also satisfied: the case $g = 0$ is trivial, while in the case $g = 1$ both $h_U(a \cdot g) = h_U(-a)$ and $(x, f \cdot g)$ are the other element (other than $h_U(a) = (x, f)$) of the (two-element) fibre above $x$. If you take a covering of the base space by two connected open proper subsets $U$ and $V$ and examine the transition maps, you will see that this fibre bundle is much like the Möbius strip of example 1.0.2: at one component of the intersection $U \cap V$, the fibres are connected in the trivial manner, while in the other component, they are switched (twisted). In a sense, this bundle and the Möbius strip have different fibres, but are twisted in the same way. The next definition makes this intuition precise.
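The statements about the action in this example can be confirmed at a sample point. The Python sketch below is only a numerical illustration; the names `p` and `act`, the encoding of $\mathbb{Z}/2\mathbb{Z}$ as the integers 0 and 1, and the sample point `z` are our own choices.

```python
import cmath

def p(z):
    """Projection S^1 -> S^1, z -> z^2."""
    return z * z

def act(z, g):
    """Right action of g in Z/2Z = {0, 1} on the total space:
    the non-identity element 1 acts by z -> -z."""
    return -z if g % 2 == 1 else z

z = cmath.exp(0.7j)   # an arbitrary point of the unit circle

# the action preserves the fibres of p
assert p(act(z, 1)) == p(z)
# the non-identity element moves every point (freeness on the fibre {z, -z})
assert act(z, 1) != z
# acting twice by 1 is acting by 1 + 1 = 0
assert act(act(z, 1), 1) == act(z, 0)
```

Transitivity on each fibre is immediate here, since a fibre has only the two points $z$ and $-z$ and the non-identity element swaps them.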

Definition 1.4.4. Given a principal G-bundle $F \to E \xrightarrow{p} X$ and a right action $\mu$ of $G$ on some set (or topological space, or smooth manifold) $A$, we construct a new G-bundle $\tilde{A} \to E \times_\mu A \xrightarrow{\tilde{p}} X$, the associated bundle.

⁴ In this instance, it is convenient to have $G$ as the abstract fibre, so we will assume this.


Intuitively, we would like to replace the fibre $G$ by $A$, and obtain the transition maps by composing the old ones with the action of $G$ on $A$. In order to do so, we first define a right action of $G$ on $E \times A$ by

$$(e, a) \cdot g := (e \cdot g, a \cdot g). \tag{1.11}$$

We let the total space $E \times_\mu A$ of the associated bundle be the quotient of $E \times A$ by the action of $G$ (so points of $E \times_\mu A$ are orbits of that action). Since the action of $G$ on $E$ preserves the fibres, we can define the projection $\tilde{p} : E \times_\mu A \to X$ by

$$\tilde{p}([(e, a)]) := p(e). \tag{1.12}$$

The abstract fibre $\tilde{A}$ of the associated bundle is defined by $\tilde{A} := (F \times A)/G$, so points of $\tilde{A}$ are orbits of the logical right action of $G$ on $F \times A$ (the action on $F$ is the one of the original (principal) bundle, and the action on $A$ is of course $\mu$). This fibre is of course homeomorphic to $A$, and even isomorphic to it as a right G-set, but (once more) not in a canonical way, because $F$ cannot be canonically identified with $G$. For every trivialisation $h : p^{-1}(U) \to U \times F$ of $E$, we define a trivialisation $\tilde{h} : \tilde{p}^{-1}(U) \to U \times \tilde{A}$ on $E \times_\mu A$ by

$$\tilde{h}([(e, a)]) := (p(e), [(h_2(e), a)]), \tag{1.13}$$

where $h_2 : p^{-1}(U) \to F$ is the second component of $h$.

Example 1.4.5. The motivating example is formulated in this language as follows: take the principal $\mathbb{Z}/2\mathbb{Z}$-bundle of example 1.4.3 as the starting point. Let $A$ be the unit interval $I = [0, 1]$, and define the action $\mu$ of $\mathbb{Z}/2\mathbb{Z}$ on $I$ by letting 1 act by reflection $x \mapsto 1 - x$. Then the associated bundle $S^1 \times_\mu I$ is isomorphic to the Möbius strip.
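The twist in this associated bundle can be seen by computing with orbits directly. In the Python sketch below we make several assumptions of our own: the total space $E = S^1$ is parametrised as $\mathbb{R}/\mathbb{Z}$ by an exact fraction `theta` of a full turn, so that $z \mapsto -z$ becomes `theta -> theta + 1/2` and $z \mapsto z^2$ becomes `theta -> 2*theta mod 1`; the function names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def act_E(theta, g):
    """Z/2Z acting on E = R/Z: the element 1 shifts theta by half a turn."""
    return (theta + g * HALF) % 1

def act_I(t, g):
    """Z/2Z acting on the interval I = [0, 1]: the element 1 reflects t."""
    return 1 - t if g % 2 == 1 else t

def orbit(theta, t):
    """The point [(e, a)] of the associated bundle, stored as the set of
    its representatives in E x I."""
    return frozenset((act_E(theta, g), act_I(t, g)) for g in (0, 1))

def projection(point):
    """The projection of the associated bundle: p(e) for any representative."""
    images = {(2 * theta) % 1 for theta, _ in point}
    assert len(images) == 1   # well defined: independent of the representative
    return images.pop()

t = Fraction(1, 4)
# moving half-way around E carries the base point once around X;
# the end point (1/2, t) is the same point of the bundle as (0, 1 - t)
assert orbit(HALF, t) == orbit(0, 1 - t)
assert orbit(HALF, t) != orbit(0, t)
```

So a point that starts at height `t` in the fibre above a base point returns at height `1 - t` after one trip around the base circle, which is the reflection that characterises the Möbius strip.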

Example 1.4.6. If the new fibre $A$ is a vector space and the action is linear (so $\mu$ is a representation of $G$ on $A$), then the associated bundle is what we call a vector bundle (see the next section). This has a great physical significance: in the language of gauge theory, the ‘charge’ of a particle is given by a representation of the ‘gauge group’ $G$, and the associated bundle construction describes how the presence of a ‘gauge field’, which lives on a principal G-bundle, affects the particle in question (which then lives on a bundle with fibre $A$).

1.5 Vector bundles

Another way to add additional structure to the concept of a fibre bundle is to look at bundles where fibres have some structure in addition to being a topological space. One instance of this is the concept of a vector bundle. This is a fibre bundle in which the abstract fibre is also a vector space. Of course, for this to be in any way useful, we need the transition functions to respect this structure; thus for a vector bundle we furthermore require that each $h_{\alpha\beta}(x) : F \to F$ is a linear isomorphism.

Consequently, in a vector bundle the concrete fibre $F_x = p^{-1}(x)$ above a point $x$ can be made into a vector space in a meaningful way. That is, for $(x, f_1, \alpha) \sim (x, h_{\alpha\beta}(x)(f_1), \beta)$, $(x, f_2, \alpha) \sim (x, h_{\alpha\beta}(x)(f_2), \beta)$ and scalars $a_1$ and $a_2$ we have $(x, a_1 f_1 + a_2 f_2, \alpha) \sim (x, a_1 h_{\alpha\beta}(x)(f_1) + a_2 h_{\alpha\beta}(x)(f_2), \beta)$. This follows directly from the fact that $h_{\alpha\beta}(x)$ is linear, and it implies that we can define addition and scalar multiplication on $F_x$ by the same operations in a trivialisation.

A well known example of a vector bundle is the tangent bundle of a smooth manifold $M$. Take as base space the manifold, and take as the concrete fibre over $x \in M$ the tangent space $T_xM$. The projection is of course the map taking the whole of $T_xM$ to $x$ and the trivialisations can be constructed from the charts on your manifold. The abstract fibre can be taken to be $\mathbb{R}^n$, where $n$ is the dimension of $M$.

Given a vector bundle V - Wp- X of dimension n, the frame bundle

of W is a principal GL(n)-bundle B(V ) - Eq- X defined as follows:

• the base space of the frame bundle is $X$, the base space of the vector bundle;

• the fibre $q^{-1}(x)$ above a point $x \in X$ is the set of all ordered bases of the vector space $p^{-1}(x)$ (the corresponding fibre of the vector bundle), so the total space $E$ is the set of all bases of all fibres of $W$;

• for every trivialisation $h : p^{-1}(U) \to U \times V$ above an open set $U \subset X$, we construct a trivialisation $\tilde h : q^{-1}(U) \to U \times B(V)$. Given $C \in q^{-1}(U)$, say $q(C) = x$, $C$ is a basis of the vector space $p^{-1}(x)$. By applying $h$ to these basis vectors, we obtain a basis $D$ of $V$ (since $h$ restricted to a fibre is an isomorphism of vector spaces). So, define $\tilde h(C) = (x, D) \in U \times B(V)$;

• the right action of $GL(n)$ on $E$ is (restricted to a fibre) the same as the one in example 1.3.3 of section 1.3: if $C \in E$ is a basis and $A \in GL(n)$, we define $C \cdot A$ to be the basis obtained by taking linear combinations of the vectors of $C$ with coefficients as specified by the columns of $A$.
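The right action in the last item can be made concrete for $V = \mathbb{R}^n$. The following Python sketch is only an illustration under our own conventions: the helper names `act` and `matmul` are not from the text, and a basis is stored as a plain list of vectors. It checks on one example that $(C \cdot A) \cdot B = C \cdot (AB)$, the property that makes this a right action.

```python
# Illustrative sketch (names are ours): a basis C of R^n is a list of n vectors.

def matmul(A, B):
    """Ordinary matrix product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def act(C, A):
    """Return C.A: the j-th new basis vector is sum_i A[i][j] * C[i],
    i.e. its coefficients are read off from column j of A."""
    n = len(C)
    dim = len(C[0])
    return [[sum(A[i][j] * C[i][k] for i in range(n)) for k in range(dim)]
            for j in range(n)]

C = [[1, 0], [1, 1]]   # a basis of R^2
A = [[2, 1], [0, 1]]   # invertible, det = 2
B = [[1, 3], [1, 1]]   # invertible, det = -2

left = act(act(C, A), B)       # first act with A, then with B
right = act(C, matmul(A, B))   # act once with the product AB
print(left == right)
```

Reading the coefficients from the columns (rather than the rows) of $A$ is what makes the action a right action; with rows one would get $(C \cdot A) \cdot B = C \cdot (BA)$ instead.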

This construction is most commonly applied to the tangent bundle $TM$ of a smooth manifold $M$. It is then called simply the frame bundle of $M$. A section of this frame bundle is called a frame; it is a choice of basis of the various tangent spaces to $M$.

By the way, a one-dimensional vector bundle is called a line bundle. But beware, this word is also used for one-dimensional complex vector spaces.


Chapter 2

Connections

Let us go back to our favourite example of a fibre bundle: the Möbius strip (call the total space $E$ for convenience). The tangent space $T_eE$ at some point $e$ is obviously isomorphic to $\mathbb{R}^2$ for all points $e$. It is intuitively clear what we would mean by the vertical subspace $V_eE \subset T_eE$; it is the space of all vectors tangent to the fibre of $e$ (we will be more precise in a moment, for now just try to see things in a picture). What is not clear is what we would mean by a horizontal subspace $H_eE \subset T_eE$. There is just no obvious way of defining such a subspace. This is where the notion of a connection comes in; it is a choice of the horizontal subspace of $T_eE$. Note that since there is no canonical horizontal subspace of the tangent space, we have in general many different possible connections on a fibre bundle; we can really choose what we want ‘horizontal’ to mean on a fibre bundle.

Connections have many applications. For physicists their main use might be in the ability to define a covariant derivative (see section 2.5). It turns out that many field theories can be described naturally by viewing the field as a connection on a certain fibre bundle. We will show this explicitly for the case of electromagnetism (section 3.1).

2.1 Connections

In the introduction to this chapter we said that it was intuitively clear what we would mean by the vertical subspace $V_eE$ of the tangent space $T_eE$ at some point on the Möbius strip; it was the subspace tangent to the fibre of the point $e$. Now let us define in a more precise manner what the vertical subspace $V_eE$ of $T_eE$ is for an arbitrary fibre bundle $F \to E \xrightarrow{p} M$ in which all spaces are manifolds: it is the subspace of all vectors $\eta \in T_eE$ such that $dp(\eta) = 0$. If you think about the example of the Möbius strip this does exactly what we want; $dp$ sends a vector to zero if and only if it is tangent to the fibre.

As said, however, we have no canonical way of choosing a horizontal subspace. Instead we use a connection to define which vectors are horizontal. In other words, a connection is a smooth (we will specify what we mean by this in a moment) choice of a horizontal subspace $H_eE$ of $T_eE$ at each $e \in E$, such that $H_eE$ is complementary to $V_eE$ (as should clearly be the case for the horizontal subspace). Such a connection then gives us projections $\pi^v_e$ and $\pi^h_e$ from the tangent space onto the vertical and horizontal subspaces respectively (since $V_eE$ and $H_eE$ are complementary to each other, there is a unique decomposition of any tangent vector $\eta = \eta^v + \eta^h$ as the sum of a vertical and a horizontal vector, and these $\eta^v$ and $\eta^h$ are the projections of $\eta$ onto $V_eE$ and $H_eE$). Now we can specify what we mean by a smooth choice: for any smooth vector field $X$ on $E$ the horizontally projected field $\pi^hX$ should also be smooth.

Remark 2.1.1. In order to define even the vertical projection, we needed the choice of horizontal subspace. You might be tempted to think this unnecessary: we could have taken the orthogonal projection onto the vertical subspace. The horizontal subspace is then automatically the orthogonal complement of the vertical one. However, you need an inner product to even talk about orthogonality.

Figure 2.1: Vertical projection needs choice of horizontal subspace.

A manifold equipped with the extra structure of an inner product on the tangent space at each point is called a pseudo-Riemannian manifold. (There are two more requirements: the choice of inner product at all the different points must vary smoothly with the point, and all inner products must be non-degenerate.) So, if the total space of a fibre bundle is a pseudo-Riemannian manifold, this gives a natural choice of connection on the bundle.

Well, so much for the definition; let us look at some examples.

Example 2.1.2. Take again the Möbius strip. How could we define a connection on it? One way to do this is to embed the Möbius strip in $\mathbb{R}^3$ and then take the horizontal subspace to be the subspace orthogonal to the vertical one, which is well defined because in $\mathbb{R}^3$ we have an inner product. It is complementary to the vertical subspace by definition. That it is smooth is clear intuitively, and can be checked by introducing a specific parametrisation of the Möbius strip and looking at it in local coordinates.

Example 2.1.3 (Connection on a covering). Consider a fibre bundle $F \to E \xrightarrow{p} X$ with $F$ a discrete space (a 0-dimensional manifold). (Note that fibre bundles with discrete fibres are just covering maps, so $p$ is a smooth covering map.) Then the vertical tangent space is 0-dimensional (just like the fibre), so any horizontal tangent space must be the entire tangent space: there is only one possible connection.


Now that we have seen some examples, let us look at the concept more closely: what happens in a trivialisation?

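First the tangent space. In the following let $e$ be a point in $p^{-1}(U)$ and let $(x, f) = h_U(e)$ be its representation in $U \times F$. We have, for the tangent space of a trivialisation, that $T_{(x,f)}(U \times F) \cong T_xU \times T_fF$. Now, the horizontal and vertical subspaces of $T_eE$ are carried over to the trivialisation by the isomorphism $dh_U|_e : T_e(p^{-1}(U)) \cong T_eE \to T_{(x,f)}(U \times F)$. The vertical subspace in the trivialisation is easy to describe: since the following two diagrams commute

$$
\begin{array}{ccc}
p^{-1}(U) & \xrightarrow{\;h_U\;} & U \times F \\
\downarrow{\scriptstyle p} & & \downarrow{\scriptstyle \mathrm{pr}_1} \\
U & \xrightarrow{\;\mathrm{id}\;} & U
\end{array}
\qquad\qquad
\begin{array}{ccc}
T_e\,p^{-1}(U) & \xrightarrow{\;dh_U\;} & T_{(x,f)}(U \times F) \\
\downarrow{\scriptstyle dp} & & \downarrow{\scriptstyle d\mathrm{pr}_1} \\
T_xU & \xrightarrow{\;\mathrm{id}\;} & T_xU
\end{array}
$$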

(the first states that the trivialisation respects the projection, which is one of the conditions in definition 1.1.1 of a fibre bundle; the second follows from the first by taking derivatives), we have $dp(v) = (d\mathrm{pr}_1 \circ dh_U)(v)$. A vector $(\xi, y) \in T_xU \times T_fF$ (which can of course be written as $dh_U(v)$ for some $v \in T_eE$) is vertical if and only if $0 = dp(v) = (d\mathrm{pr}_1 \circ dh_U)(v) = d\mathrm{pr}_1((\xi, y)) = \xi$.

We are now in a position to conclude what the vertical projection must look like in a trivialisation. If $(\xi, y)$ is again a vector in $T_{(x,f)}(U \times F)$, then (realising that the projections are linear maps):

$$\pi^v_{(x,f)}(\xi, y) = (0, \tilde\Gamma(x, f)\,y + \Gamma(x, f)\,\xi) \tag{2.1}$$

for some maps $\tilde\Gamma : U \times F \to \mathrm{Lin}(T_\cdot F)$ and $\Gamma : U \times F \to \mathrm{Lin}(T_\cdot U, T_\cdot F)$. But we can do better than that, because we know that if a vector is vertical, then the vertical projection should leave it intact. So $(0, y)$ should be sent to itself for all $y$, giving that $\tilde\Gamma(x, f)$ must be the identity for all $(x, f) \in U \times F$:

$$\pi^v_{(x,f)}(\xi, y) = (0, y + \Gamma(x, f)\,\xi) \tag{2.2}$$

This is all we can tell without actually specifying the horizontal subspace (choosing the connection). When a horizontal subspace is designated, that choice determines the vertical projection and thereby the map $\Gamma$. As a further useful fact, we know that a vector has vertical projection zero if and only if it is itself horizontal, so we can conclude that the vector $(\xi, -\Gamma(x, f)\xi)$ is horizontal for all $\xi \in T_xU$ (and any horizontal vector can be written in this way). The horizontal and vertical projections are complementary, so $\pi^v_{(x,f)}(\xi, y) + \pi^h_{(x,f)}(\xi, y) = (\xi, y)$. This immediately gives us an equation for the horizontal projection in terms of $\Gamma$:

$$\pi^h_{(x,f)}(\xi, y) = (\xi, -\Gamma(x, f)\,\xi) \tag{2.3}$$
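The formulas (2.2) and (2.3) are easy to try out numerically at a single point $(x, f)$. The sketch below is an illustration only: it assumes a two-dimensional base and a one-dimensional fibre, represents $\Gamma(x, f)$ by an arbitrarily chosen matrix `Gamma`, and uses helper names (`apply`, `vertical`, `horizontal`) of our own. It checks that the two maps are complementary projections.

```python
# Illustrative sketch: Gamma(x, f) at one fixed point, as a (dim F) x (dim U) matrix.

def apply(M, v):
    """Apply the matrix M to the vector v."""
    return [sum(M[a][i] * v[i] for i in range(len(v))) for a in range(len(M))]

def vertical(Gamma, xi, y):
    """Vertical projection (2.2): (xi, y) -> (0, y + Gamma xi)."""
    G = apply(Gamma, xi)
    return ([0] * len(xi), [y[a] + G[a] for a in range(len(y))])

def horizontal(Gamma, xi, y):
    """Horizontal projection (2.3): (xi, y) -> (xi, -Gamma xi)."""
    G = apply(Gamma, xi)
    return (list(xi), [-G[a] for a in range(len(y))])

Gamma = [[2, -1]]     # hypothetical value of Gamma(x, f); dim U = 2, dim F = 1
xi, y = [3, 4], [5]   # a tangent vector (xi, y)

pv = vertical(Gamma, xi, y)      # ([0, 0], [7])
ph = horizontal(Gamma, xi, y)    # ([3, 4], [-2])

# The two parts add up to the original vector.
total = ([a + b for a, b in zip(pv[0], ph[0])], [a + b for a, b in zip(pv[1], ph[1])])
print(total == (xi, y))
# Projecting twice changes nothing, and the vertical part of a horizontal vector is zero.
print(vertical(Gamma, *pv) == pv, horizontal(Gamma, *ph) == ph)
print(vertical(Gamma, *ph) == ([0, 0], [0]))
```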

That is about all we can say about the projections in this trivialisation; to make things interesting it is a good plan to see what happens if we look at them from another trivialisation. To do this, define a map

$$\psi(x, f) = (x, h_{VU}(x)(f)) \tag{2.4}$$

from $(U \cap V) \times F$ to $(U \cap V) \times F$ for two trivialisations $h_U$ and $h_V$. (Thus $\psi$ is nothing other than $h_V \circ h_U^{-1}$, as you can directly see in the definition of a fibre bundle (see 1.1.1).) We can now show that $d\psi$ intertwines the local representatives of the vertical projection, meaning that the following relation holds:

$$d\psi \circ \pi^v_{(x,f)} = \pi^v_{\psi(x,f)} \circ d\psi . \tag{2.5}$$

Intuitively speaking, to project at $U$ and then go over to $V$ should be the same as first going over to $V$ and then projecting there (in other words: the vertical projections in different trivialisations should be ‘compatible’ with one another). This is reasonable, because the vertical projections in different trivialisations are by definition the local representatives of the true vertical projection (in the tangent space of the fibre bundle), which is defined without reference to any trivialisation.

To prove (2.5), write the vertical projection $\pi^v_e$ in terms of the respective local vertical projections:

$$\pi^v_e = dh_U^{-1} \circ \pi^v_{(x,f)} \circ dh_U = dh_V^{-1} \circ \pi^v_{\psi(x,f)} \circ dh_V .$$

Now pre-compose with $dh_U^{-1}$ and post-compose with $dh_V$:

$$dh_V \circ dh_U^{-1} \circ \pi^v_{(x,f)} = \pi^v_{\psi(x,f)} \circ dh_V \circ dh_U^{-1} .$$

This is the desired result, once we note that $h_V \circ h_U^{-1} = \psi$.

We now write out (2.5) using the explicit formulas for the projections we derived above and the expression for $d\psi$ in terms of its components:

$$d\psi((\xi, y)) = (\xi, d_1h_{VU}(\xi) + d_2h_{VU}(y)) \tag{2.6}$$

where $d_1h_{VU}$ denotes the derivative of $h_{VU}$ with respect to the first argument (in the base space) and $d_2h_{VU}$ the derivative of $h_{VU}$ with respect to the second argument (in the fibre).

Performing those substitutions gives us:

$$(0, d_2h_{VU}(y + \Gamma_U(x, f)(\xi))) = (0, d_1h_{VU}(\xi) + d_2h_{VU}(y) + \Gamma_V(x, h_{VU}(x)(f))(\xi)) . \tag{2.7}$$

Ignoring the left component and cancelling a common term $d_2h_{VU}(y)$, we get:

$$d_2h_{VU} \circ \Gamma_U(x, f) = d_1h_{VU} + \Gamma_V(x, h_{VU}(x)(f)) . \tag{2.8}$$

We can solve this for $\Gamma_V$ by replacing $f$ with $h_{VU}(x)^{-1}(f)$, resulting in:

$$\Gamma_V(x, f) = d_2h_{VU}(\Gamma_U(x, h_{VU}(x)^{-1}(f))) - d_1h_{VU} . \tag{2.9}$$

Now this may seem like a lot of work just to get some complicated formula relating the connection maps $\Gamma(x, f)$ in different trivialisations, but it is an important observation that given maps $\Gamma_U : U \times F \to \mathrm{Lin}(T_\cdot U, T_\cdot F)$ satisfying (2.9) there is a unique connection having those maps as its connection maps (this will later allow us to construct connections representing the electromagnetic field). This construction is very similar to that of remark 1.1.2: we can form a connection (respectively a fibre bundle) with given connection maps (respectively transition maps) if they obey a certain compatibility relation. We do not prove this result; the reader should be able to do so quite easily (transfer the alleged horizontal subspace $\{(\xi, -\Gamma_U(x, f)\xi) \mid \xi \in T_xU\}$ from the trivialisation to the fibre bundle by $h_U$, and show that the result of this procedure is the same for overlapping trivialisations).

Well now, there is a lot of freedom in the choice of connection: one may choose horizontal tangent spaces mostly independently (only subject to smoothness demands) in different points of the total space, even within a fibre. For many applications featuring additional structure on the fibre, this is too much freedom: we need the connection to behave itself within a fibre, with respect to this structure. We will encounter two instances of this principle. On a principal G-bundle, there is the notion of an invariant connection; as an example of its use, the gauge field (the electromagnetic field, in the case we describe in chapter 3) is described by an invariant connection on the gauge group. We treat invariant connections in subsection 2.1.2. First, we examine linear connections on a vector bundle.

2.1.1 Linear connections

Let us now study the notion of a connection in a context where we have some additional structure; in this case a vector bundle. First we remark that there is a canonical identification $V_eE \cong F_{p(e)}$: for $q \in F_{p(e)}$ define a curve $\gamma_q(t) = e + tq \in F_{p(e)}$. Then identification of $q$ with $[\gamma_q]$ gives an isomorphism $F_{p(e)} \xrightarrow{\sim} V_eE$.

Definition 2.1.4. A connection is called linear if the map $F_x \to \mathrm{Lin}(T_\cdot E, T_\cdot E) : e \mapsto \pi^h_e$ is affine and the canonical zero section of $E$ is horizontal (by which we actually mean that the space tangent to the zero section is everywhere horizontal).

Using the canonical identification mentioned above, we can represent $\pi^h_e : T_{(\cdot,\cdot)}(U \times F) \to H_{(\cdot,\cdot)}(U \times F)$ in a trivialisation by a map $\pi^h_e : T_\cdot U \times F \to T_\cdot U \times F$. The condition that the map $e \mapsto \pi^h_e$ be affine for the connection to be linear now gives us that we can write (omitting the $x$-dependence, as we are working within a single fibre):

$$\pi^h_f(\xi, y) = L(f)(\xi, y) + c(\xi, y) \tag{2.10}$$

for some linear map $L$ (from the fibre to the space of linear operators on the tangent space of the trivialisation in $(x, f)$), and some constant linear operator $c$. From the horizontality of the zero section (call it $\sigma$, and $\tilde\sigma$ in a trivialisation) we can derive

$$
\begin{aligned}
& \tilde\sigma(x) = (x, 0) && \text{(because } \sigma \text{ is the zero section)} \\
\Longrightarrow\ & d\tilde\sigma(\xi) = (\xi, 0) \\
\Longrightarrow\ & H_{(x,0)}E = d\tilde\sigma(T_xU) = T_xU \times 0 && \text{(because } T_{(x,0)}(\tilde\sigma(U)) = d\tilde\sigma(T_xU)\text{)} \\
\Longrightarrow\ & \pi^h_0(\xi, y) = (\xi, 0) && \text{(also using equation (2.3))}
\end{aligned}
$$


If we substitute this result into equation (2.10), we get (using the linearity of $L$)

$$(\xi, 0) = \pi^h_0(\xi, y) = L(0)(\xi, y) + c(\xi, y) = c(\xi, y)$$

which gives us a simple expression for $c$. From equations (2.3) and (2.10), using this expression for $c$, we can then find an equation for $\Gamma$:

$$(0, -\Gamma(x, f)\xi) = L(f)(\xi, y)$$

and since $L$ is linear in $f$, $\Gamma$ must be linear in $f$ too. This means that if we are working in the context of a vector bundle with a linear connection defined on it we can write:

$$\Gamma(x, f)\xi = \Gamma(x)(\xi)f \tag{2.11}$$

where $\Gamma(x)(\xi)$ is a linear map from the fibre to its tangent space (varying linearly in $\xi$, of course). This also gives us a new way to write equation (2.9) for the specific case of a linear connection (we will leave out the arguments of the maps here, for brevity):

$$\Gamma_V = h_{VU}\,\Gamma_U\,h_{VU}^{-1} - dh_{VU}\,h_{VU}^{-1} . \tag{2.12}$$
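For the reader who wants to see how (2.12) comes out of (2.9), here is a sketch of the computation. The shorthand $g(x)$ for the linear map $h_{VU}(x)$ is ours and is used only in this block, so that $h_{VU}(x)(f) = g(x)f$.

```latex
% Assumption (our shorthand): g(x) := h_{VU}(x) is linear in the fibre, hence
%   d_2 h_{VU} = g(x),   and   d_1 h_{VU}(\xi) at the fibre point f' equals (dg(\xi)) f'.
\begin{align*}
\Gamma_V(x, f)\,\xi
  &= g\,\Gamma_U(x, g^{-1} f)\,\xi - dg(\xi)\, g^{-1} f
     && \text{(2.9), with } d_1 h_{VU} \text{ evaluated at } f' = g^{-1} f \\
\Gamma_V(x)(\xi)\, f
  &= g\,\Gamma_U(x)(\xi)\, g^{-1} f - dg(\xi)\, g^{-1} f
     && \text{by (2.11), applied to both } \Gamma_U \text{ and } \Gamma_V
\end{align*}
% This holds for every f, so as operators on the fibre:
%   \Gamma_V = g\,\Gamma_U\,g^{-1} - dg\,g^{-1},   which is (2.12).
```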

This is all we need to know about linear connections, for now.

2.1.2 Invariant connections

As promised, we introduce another type of well-behaved connection, namely an invariant connection on a principal G-bundle $P_G$, where $G$ is a Lie group. Before we define anything, let us study this situation a bit. We have a canonical identification $V_eP_G \cong \mathfrak{g}$ (where $\mathfrak{g}$ is the associated Lie algebra of $G$) by sending a point $L \in \mathfrak{g}$ to the equivalence class $L_e$ of the curve $\gamma(t) = e \cdot \exp(tL)$ in $V_eP_G$ (this is clearly an identification if we keep the definition of $\mathfrak{g}$, as tangent space to $G$ at its unit element, in mind). We then have that the vertical projection $\pi^v_e$ can for all $e \in P_G$ be represented by a map $\omega_e : T_eP_G \to \mathfrak{g}$. We shall call the map $\omega$, which actually is a $\mathfrak{g}$-valued 1-form, the connection 1-form (note that, although the vertical tangent space does not depend upon the connection, the vertical projection, and thus $\omega_e$, do (see Remark 2.1.1)). Note that $\omega$ should obviously satisfy

$$\omega_e(L_e) = L , \tag{2.13}$$

a result we shall use later on in this section.

Now for the promised definition, we have:

Definition 2.1.5. A connection is called invariant if all the horizontal subspaces $H_eE$ satisfy $H_{e\cdot g}E = dR_g(H_eE) = H_eE \cdot g$, where $R_g$ is the map $e \mapsto e \cdot g$ (the second equality is always true, see footnote 1; the constraint is in the first equality).

If a connection is invariant this immediately has a nice consequence: if at $e$ we have a decomposition of a tangent vector $\eta = \eta^h + \eta^v$ into horizontal and vertical components, then the decomposition at the point $e \cdot g$ is given by $\eta \cdot g = \eta^h \cdot g + \eta^v \cdot g$. That both sides of the latter equality are in fact equal is of course independent of the connection being invariant or not; it follows immediately from the fact that $dR_g$ is linear (see footnote 1). We need the invariance of the connection to make sure that $\eta^h \cdot g$ is again horizontal and $\eta^v \cdot g$ is again vertical. Actually, even for the verticality we do not need any assumption about the connection; since $R_g$ preserves the fibre, a vertical vector in the tangent space will stay vertical under $dR_g$ (remember that the vertical tangent space is the same as the space tangent to the fibre). So all we need to show is that $\eta^h \cdot g$ is horizontal. But $\eta^h \cdot g$ is an element of $H_eE \cdot g$ (by horizontality of $\eta^h$) and thus, by invariance of the connection, it is an element of $H_{e\cdot g}E$, which is what we wanted. Note that this decomposition of $\eta \cdot g$ also gives that

1 Actually, the second equality can be read as the definition of a convenient notation: writing “$v \cdot g$” for “$dR_g(v)$.”

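$$\pi^v_{e\cdot g}(\eta \cdot g) = \pi^v_e(\eta) \cdot g \tag{2.14}$$

We now want to find an expression for $\omega_{e\cdot g}(\eta \cdot g)$. By the equation above and the definition of $\omega$ we have that

$$\omega_{e\cdot g}(\eta \cdot g) = \omega_e(\eta) \cdot g \tag{2.15}$$

The right-hand side of (2.14) is of a general form for which we can derive

$$L_e \cdot g = \left.\frac{d}{dt}\, e \cdot \exp(tL)\,g \right|_{t=0} = \left.\frac{d}{dt}\, e \cdot g\,(g^{-1}\exp(tL)\,g) \right|_{t=0} = (\mathrm{Ad}_{g^{-1}}L)_{e\cdot g} ,$$

where the last equality follows directly from the definition of $\mathrm{Ad}$ as a map on the Lie algebra $\mathfrak{g}$. Using this result on equation (2.14) we get:

$$\pi^v_{e\cdot g}(\eta \cdot g) = (\mathrm{Ad}_{g^{-1}}\omega_e(\eta))_{e\cdot g} .$$

Applying the above-mentioned identification on both sides of this equation:

$$\omega_{e\cdot g}(\eta \cdot g) = \mathrm{Ad}_{g^{-1}}\omega_e(\eta) \tag{2.16}$$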
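The step behind the appearance of $\mathrm{Ad}_{g^{-1}}$ is the identity $g^{-1}\exp(tL)\,g = \exp(t\,\mathrm{Ad}_{g^{-1}}L)$. For a matrix group, where $\mathrm{Ad}_{g^{-1}}L = g^{-1}Lg$, it can be checked numerically. The following sketch assumes $G = GL(2, \mathbb{R})$ and uses small helper functions of our own (`matmul`, `expm`, `inv2`); the particular $g$, $L$ and $t$ are arbitrary choices.

```python
# Illustrative sketch for the matrix group GL(2, R); all names are ours.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=40):
    """Matrix exponential via its Taylor series (adequate for small matrices)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # holds A^k / k!
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def inv2(g):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = g
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

g = [[1.0, 2.0], [0.0, 1.0]]     # a group element
L = [[0.0, -1.0], [1.0, 0.0]]    # a Lie algebra element
t = 0.7

tL = [[t * x for x in row] for row in L]
lhs = matmul(inv2(g), matmul(expm(tL), g))           # g^{-1} exp(tL) g
AdL = matmul(inv2(g), matmul(L, g))                  # Ad_{g^{-1}} L = g^{-1} L g
rhs = expm([[t * x for x in row] for row in AdL])    # exp(t Ad_{g^{-1}} L)

err = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(err)   # should be at the level of floating-point rounding
```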

which is the desired expression for $\omega_{e\cdot g}$.

As usual it is interesting to consider these notions in a trivialisation. Let therefore $\omega^U$ be defined on $U \times G$ as $\omega^U = (h_U^{-1})^*(\omega)$. We will furthermore use the identification (by $dR_{h^{-1}}$)

$$T_{(x,h)}(U \times G) \cong T_xU \times T_hG \cong T_xU \times \mathfrak{g}$$

If now $\eta = (\xi, L) \in T_xU \times \mathfrak{g}$, then we have $\eta \cdot g = (\xi, \mathrm{Ad}_{g^{-1}}L)$ and furthermore, using equation (2.13), that $\omega^U_{(x,h)}(\xi, L) = \omega^U_{(x,h)}(\xi, 0) + \omega^U_{(x,h)}(0, L) = \omega^U_{(x,h)}(\xi, 0) + L$. Using (2.16) and the fact that $(x, e_G) \cdot h = (x, h)$ (where $e_G \in G$ is the unit element of the group $G$, not to be confused with the point $e \in E$), we get

$$\omega^U_{(x,h)}(\xi, h^{-1} \cdot L \cdot h) = h^{-1} \cdot \omega^U_{(x,e_G)}(\xi, L) \cdot h = h^{-1} \cdot (\omega^U_{(x,e_G)}(\xi, 0) + L) \cdot h$$

Taking $L = 0$ then gives us that we can define a $\mathfrak{g}$-valued 1-form $A_U$ on $U$, called the local principal gauge potential, such that $\omega^U_{(x,h)}(\xi, 0) = h^{-1} \cdot A_U(x)(\xi) \cdot h$ (just take $A_U(x)(\xi) = \omega^U_{(x,e_G)}(\xi, 0)$). Re-entering this into the equation above finally gives us

$$\omega^U_{(x,h)}(\xi, L) = h^{-1} \cdot (A_U(x)(\xi) + L) \cdot h \tag{2.17}$$

Lastly, it is of course of interest to know what happens when we go over to a different trivialisation. Since we have done similar work a couple of times before in this paper we will just state the result and leave the calculation to the reader. The result is:

$$A_V = \mathrm{Ad}_{g_{VU}}A_U - dg_{VU} \cdot g_{VU}^{-1} \tag{2.18}$$

Having gone through enough mathematical preparations, we can finally have a look at something more physical in nature and study the concept of parallel transport, which will finally illustrate what we need this whole mathematical framework for.

2.2 Parallel transport

Once we have a connection at our disposal, we can relate elements of different fibres. In a sense, a connection is a choice of identification of different fibres, in differential form. Parallel transport is the global version of the same identification.

In this section, we assume that all spaces are smooth manifolds and all maps are smooth. Let $F \to E \xrightarrow{p} X$ be a fibre bundle equipped with a connection. In general, the identification of the fibre above $x_0 \in X$ with the fibre above $x_1 \in X$ depends on a choice of path from $x_0$ to $x_1$.

Definition 2.2.1. A smooth path $a : [0, 1] \to E$ is called horizontal if at every point $a(t)$ ($t \in (0, 1)$) the tangent vector to $a$ is contained in the horizontal subspace of the tangent space $T_{a(t)}E$ (as specified by the connection). A lift of a path $b : [0, 1] \to X$ is a path $\tilde b : [0, 1] \to E$ such that $p \circ \tilde b = b$ (in other words, $\tilde b(t)$ lies in the fibre over $b(t)$ for all $t \in [0, 1]$). A horizontal lift of a path is just that, a lift which is horizontal.

A lift of a path can move in an arbitrary manner in the vertical direction; demanding that it be horizontal restricts that freedom, by prescribing in which direction the lift must move as we go along the path in the base space $X$. You might guess that the horizontal lift of a path is essentially unique (it can of course be shifted as a whole in the vertical direction). We will see in a moment that this is true.

Let us see what a horizontal lift $\tilde b$ of a path $b : [0, 1] \to X$ looks like in local coordinates. Let $h : p^{-1}(U) \to U \times F$ be a trivialisation over the open set $U \subseteq X$. We assume for the moment that the image of $b$ lies completely in $U$. Write $(h \circ \tilde b)(t) =: (x(t), f(t))$, for $t \in [0, 1]$. That $\tilde b$ is a lift of $b$ means that $x(t) = b(t)$. Horizontality of $\tilde b$ at $t$ means that the vertical projection of $d\tilde b|_t$ is zero, or equivalently (since $h$ is a diffeomorphism by assumption) that the vertical projection of $d(h \circ \tilde b)|_t =: (dx|_t, df|_t)$ is zero. According to (2.2), that vertical projection is $(0, df|_t + \Gamma(x(t), f(t))\,dx|_t) = (0, df|_t + \Gamma(b(t), f(t))\,db|_t)$. So $\tilde b$ is horizontal if and only if

$$df|_t = -\Gamma(b(t), f(t))\,db|_t \tag{2.19}$$

holds for all $t \in [0, 1]$.

For a given path $b$, this is an ordinary differential equation for $f$. As an initial condition, we choose an element $e_0$ of the fibre above $b(0)$ and demand that $\tilde b(0) = e_0$. According to the theory of ordinary differential equations there may or may not be a solution $f$ for all $t$, so there may or may not be a horizontal lift $\tilde b$ of $b$ with $\tilde b(0) = e_0$, but if there is one, it is unique. In this case, we call $\tilde b(1)$ the parallel transport of $e_0$ along the path $b$.
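Equation (2.19) can be integrated numerically once $\Gamma$ and the path are given. The sketch below makes several assumptions of our own for the sake of illustration: the base is $\mathbb{R}^2$, the fibre is $\mathbb{R}$, the connection is the hypothetical $\Gamma(x, f)\xi = \tfrac{1}{2}f(-x_2\xi_1 + x_1\xi_2)$, the path is the unit circle, and the integrator is the classical fourth-order Runge-Kutta scheme. For this choice (2.19) reduces to $df/dt = -\pi f$, so the transported value should be $e^{-\pi}f(0)$.

```python
import math

def transport(gamma, b, db, f0, steps=1000):
    """Integrate df/dt = -Gamma(b(t), f) db/dt over [0, 1] with classical RK4.

    gamma(x, f) returns the list of coefficients Gamma_i (scalar fibre),
    b(t) is the path in the base and db(t) its velocity.
    """
    def rhs(t, f):
        x, v = b(t), db(t)
        G = gamma(x, f)
        return -sum(G[i] * v[i] for i in range(len(v)))

    h = 1.0 / steps
    f = f0
    for k in range(steps):
        t = k * h
        k1 = rhs(t, f)
        k2 = rhs(t + h / 2, f + h / 2 * k1)
        k3 = rhs(t + h / 2, f + h / 2 * k2)
        k4 = rhs(t + h, f + h * k3)
        f += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return f

# Hypothetical connection and path, chosen only for this example.
gamma = lambda x, f: [-x[1] * f / 2, x[0] * f / 2]
b = lambda t: (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))
db = lambda t: (-2 * math.pi * math.sin(2 * math.pi * t),
                2 * math.pi * math.cos(2 * math.pi * t))

f1 = transport(gamma, b, db, 1.0)
print(f1, math.exp(-math.pi))   # the two numbers should agree closely
```

Note that the path is closed, yet $f(1) \neq f(0)$: for this connection, parallel transport around a loop does not return to the starting point of the fibre.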

Remarks.

1. Reparametrisation has no effect on parallel transport: if $\tilde b$ is a lift of $b$ (so $p \circ \tilde b = b$) and $\varphi : [0, 1] \to [0, 1]$ is a smooth map with $\varphi(0) = 0$ and $\varphi(1) = 1$, then $\tilde c := \tilde b \circ \varphi$ is a lift of $c := b \circ \varphi$ (since $p \circ \tilde c = p \circ (\tilde b \circ \varphi) = b \circ \varphi = c$) with $\tilde c(0) = \tilde b(0)$ and $\tilde c(1) = \tilde b(1)$. Furthermore, if $\tilde b$ is horizontal, then so is $\tilde c$, since $d\tilde c|_t = d\tilde b|_{\varphi(t)} \circ d\varphi|_t$ is a scalar multiple of $d\tilde b|_{\varphi(t)}$, so the former is in the horizontal tangent space if the latter is.

2. The composition of parallel transport along a path with parallel transport along another path is the same as parallel transport along the concatenation of the paths (traversed at double speed, if desired, to be a path with domain $[0, 1]$). This follows directly from the remark above and the observation that the concatenation of horizontal lifts is a horizontal lift of the concatenation.

3. The earlier assumption that the image of $b$ lies within a single trivialising open subset of $X$ is no real restriction. Given an arbitrary path $\gamma : [0, 1] \to X$, we can slice it up in a finite (since $[0, 1]$ is compact) number $n$ of pieces $\gamma_j : [t_{j-1}, t_j] \to X$ ($0 = t_0 < t_1 < \dots < t_{n-1} < t_n = 1$), such that the image of each piece $\gamma_j$ is completely within a trivialising open subset $U_j$. We then define the parallel transport of an $e_0 \in p^{-1}(\gamma(0))$ along $\gamma$ by consecutively transporting $e_0$ along the $\gamma_j$ (by the definition above). (This does not depend on the choice of slicing. We will not prove that here; it follows from the two remarks above.)

There are two common cases where parallel transport always exists and we can be more explicit about its form.

Example 2.2.2. If the bundle is a vector bundle and the connection is linear, then the differential equation for $f$ is linear, so it has a (unique) solution (given $b$ and $e_0$). Let us write $A(t)f := \Gamma(b(t), f)\,db|_t$, so $A(t)$ is a linear map from $F$ to $T_{f(t)}F$; (2.19) then reads

$$df|_t = -A(t)f(t), \tag{2.20}$$

while $f(0)$ is determined by $(b(0), f(0)) = h(\tilde b(0)) = h(e_0)$.

Now, if $A(t)$ were independent of $t$, the solution would be simply $f(t) = \exp\left(\int_0^t -A(s)\,ds\right)f(0)$. In the general case the exponential and integral must be replaced by what is known as the ordered exponential (or path-ordered exponential, or time-ordered exponential), commonly written as $P\exp$ or $T\exp$. In these terms, the solution to (2.20) is (we do not prove this)

$$f(t) = P\exp\left(\int_0^t -A(s)\,ds\right)f(0) . \tag{2.21}$$

This notation is easy on the eye and signifies the similarity to the ordinary exponential well, but it unfortunately does not show that the $P$, $\exp$ and $\int_0^t ds$ together form one mathematical operator. The ordered exponential can be defined by the differential equation above, or as a limit:

$$P\exp\left(\int_0^t B(s)\,ds\right) := \lim_{n\to\infty} \prod_{k=n}^{1} \exp(\Delta_n B(s_k)) , \tag{2.22}$$

where $s_k = kt/n$, $\Delta_n = t/n$ and the $\exp$ in the right-hand side is the exponential of linear operators from $F$ to itself (which can be defined by its Taylor series: $\exp(Q) = \sum_{i=0}^{\infty} Q^i/i!$). This looks somewhat like the definition of the Riemann integral of analysis: we partition the interval $[0, t]$ in small pieces, represent $\exp(B)$ on a small piece with the exponential of the length of the piece times $B$ in some point of the piece, multiply those factors over the interval and finally let the partition become infinitely fine. Apart from the fact that we multiply the factors $\exp(B)$ (see footnote 2), the major difference with the Riemann integral is that the factors of the product in the right-hand side of (2.22) may not commute: the order is important! The path-ordering or time-ordering now lies in the fact that we take the product in order of decreasing $t$ from left to right.
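The role of the ordering in (2.22) can be seen in a small numerical experiment. The sketch below approximates the limit by a finite product with $n = 400$ factors; the family $B(s) = (1-s)X + sY$ of $2 \times 2$ matrices and all function names are our own choices for illustration. It compares the product taken in the prescribed order (larger $s$ to the left) with the product in the opposite order.

```python
# Illustrative sketch of (2.22) with a finite number of factors; names are ours.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(A, c):
    return [[c * x for x in row] for row in A]

def expm(A, terms=30):
    """Matrix exponential via its Taylor series exp(Q) = sum_i Q^i / i!."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(matmul(term, A), 1.0 / k)
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def ordered_exp(B, t, n, reverse=False):
    """Finite-n version of (2.22): exp(dt B(s_n)) ... exp(dt B(s_1)).

    With reverse=True the factors are multiplied in the opposite order."""
    dt = t / n
    dim = len(B(0.0))
    result = [[float(i == j) for j in range(dim)] for i in range(dim)]
    for k in range(1, n + 1):
        factor = expm(scale(B(k * t / n), dt))
        # Each new (later) factor is put on the left, unless reversed.
        result = matmul(result, factor) if reverse else matmul(factor, result)
    return result

X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]

def B(s):
    """A family of matrices that do not commute at different values of s."""
    return [[(1 - s) * X[i][j] + s * Y[i][j] for j in range(2)] for i in range(2)]

ordered = ordered_exp(B, 1.0, 400)
wrong_order = ordered_exp(B, 1.0, 400, reverse=True)
gap = max(abs(ordered[i][j] - wrong_order[i][j]) for i in range(2) for j in range(2))
print(gap)   # clearly non-zero: the order of the factors matters
```

If the matrices $B(s)$ commute for all $s$, both orders give the same product and the ordered exponential reduces to the ordinary exponential of the integral.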

With some suggestive notation (identifying $A$ with $\Gamma$) we can write the parallel transport of $e_0 = h^{-1}(b(0), f(0))$ along the curve $b$ as

$$P\exp\left(-\int_b \Gamma\right)f(0) . \tag{2.23}$$

Example 2.2.3. If the bundle is a principal G-bundle and the connection is invariant, we can proceed in a way analogous to the solution above for linear connections on vector bundles. According to (2.17), in terms of the local principal gauge potential $A_U$ horizontality is equivalent to

$$f(t)^{-1} \cdot \left(A_U(db|_t(1)) + df|_t(1) \cdot f(t)^{-1}\right) \cdot f(t) = 0 , \tag{2.24}$$

or equivalently

$$A_U(db|_t(1)) + df|_t(1) \cdot f(t)^{-1} = 0 , \tag{2.25}$$

so

$$df|_t(1) = -A_U(db|_t(1)) \cdot f(t) . \tag{2.26}$$

This differential equation is similar to the linear case (2.20); here the solution always exists and is given by (no proof):

$$f(t) = P\exp\left(-\int_0^t A_U(b(s), db|_s(1))\,ds\right)f(0) , \tag{2.27}$$

where we define the ordered exponential by the same equation (2.22) as above. The only difference is that while before, the integrand $B(s)$ was a linear operator on $F$ and $\exp$ the operator exponential, now $B(s)$ is an element of the Lie algebra $\mathfrak{g}$ and $\exp$ is the Lie exponential (producing an element of the Lie group $G$). As above, we write the parallel transport of $e_0 = h^{-1}(b(0), f(0))$ along the path $b$ in compact form as

$$P\exp\left(-\int_b A_U\right)f(0) . \tag{2.28}$$

2 This is logical on an intuitive level: since we are trying to define something like the exponential of an integral (a continuous sum), it should be something like the (continuous) product of exponentials.


2.3 Curvature

Before we introduce the concept of curvature, we illustrate it with a well known example.

Example 2.3.1. For this example, we must introduce some notions from Riemannian geometry. With a metric on a smooth manifold $X$, we here mean a smoothly varying choice of a nondegenerate inner product on the tangent spaces of $X$. A manifold equipped with such a metric is called a pseudo-Riemannian manifold. Pseudo, because the metric might not be positive-definite; if it is, the manifold together with this metric is called Riemannian. The metric makes it possible to talk about length and angle of tangent vectors, and by integration also about length of a path in the manifold.

One can define a linear connection on the tangent bundle of a pseudo-Riemannian manifold that interacts nicely with the metric. It is called the Levi-Civita connection. Along a geodesic path, parallel transport preserves the tangent vectors to the path (see footnote 3). This is usually taken as the definition of a geodesic path, but in the Riemannian case, geodesics are precisely the paths with (locally) minimal length (straight lines in Euclidean space $\mathbb{R}^n$, great circles on the sphere $S^2$).

Now, for a concrete example of parallel transport, let us consider the tangent bundle of the sphere $S^2$. There is a logical metric on the sphere, induced by the embedding of the sphere in $\mathbb{R}^3$, so we have the Levi-Civita connection at our disposal. It turns out that parallel transport with respect to this connection is just what you would hope it to be: along a geodesic path, a tangent vector $v$ is transported by preserving the angle between $v$ and the tangent vector to the path and keeping $v$ at constant length.

We will take the following path in $S^2$: start at the north pole; go down to the equator along a meridian; move east along the equator for a quarter of the circumference; return to the north pole by a meridian. This is a (sphere-)triangular piecewise-smooth path that encloses one eighth of the surface of the sphere. As our victim of parallel transport, we pick a tangent vector at the north pole that points in the direction of the first piece of our path. Transporting it along that first piece, it remains tangent to the path, so when we arrive at the equator it points due south. While moving east and keeping it perpendicular to the equator, it keeps pointing south. As we go up again, our tangent vector is constantly directed back along the meridian we are now following. Back at the north pole, the transported version of our vector is tangent to the meridian on which the last piece of our path lies: it has been rotated counterclockwise over 90 degrees with respect to the original tangent vector!
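This thought experiment can be reproduced in a few lines of code. The sketch below relies on a standard fact that is not derived in the text: on the unit sphere in $\mathbb{R}^3$, Levi-Civita parallel transport along a great-circle arc is given by the rigid rotation of $\mathbb{R}^3$ that moves the point along that arc. The coordinates (north pole at $(0,0,1)$, first meridian through $(1,0,0)$) and the function name `rot` are our own choices.

```python
import math

def rot(axis, angle, v):
    """Rotate the vector v about a coordinate axis by the given angle (right-handed)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = v
    if axis == "x":
        return (x, c * y - s * z, s * y + c * z)
    if axis == "y":
        return (c * x + s * z, y, -s * x + c * z)
    if axis == "z":
        return (c * x - s * y, s * x + c * y, z)
    raise ValueError(axis)

quarter = math.pi / 2
legs = [
    ("y", quarter),   # north pole (0,0,1) down a meridian to (1,0,0)
    ("z", quarter),   # east along the equator to (0,1,0)
    ("x", quarter),   # back up a meridian to the north pole
]

point = (0.0, 0.0, 1.0)    # start at the north pole
start = (1.0, 0.0, 0.0)    # tangent vector pointing along the first leg
vector = start
for axis, angle in legs:
    # The same rotation moves the base point and parallel-transports the vector.
    point = rot(axis, angle, point)
    vector = rot(axis, angle, vector)

# Signed angle from the original to the transported vector, measured
# counterclockwise as seen from outside the sphere at the north pole.
cross_z = start[0] * vector[1] - start[1] * vector[0]
dot = sum(a * b for a, b in zip(start, vector))
angle_deg = math.degrees(math.atan2(cross_z, dot))
print(point, vector, angle_deg)
```

The point returns to the north pole, while the vector comes back as $(0, 1, 0)$ up to rounding, a quarter turn away from where it started.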

This example shows that parallel transport on the sphere indeed does depend on the path taken: if we had taken the constant path from the north pole to itself, our tangent vector would not have changed at all. If you perform the same experiment on the plane ($\mathbb{R}^2$) or on the surface of a cylinder ($S^1 \times \mathbb{R}$), you will see that parallel transport does not depend on the choice of path. The crucial difference is that the sphere is curved, while the plane and the surface of a cylinder are not. (The latter may seem to be curved, but not in the sense we mean. This is commonly illustrated by the fact that one can fold a (flat) piece of paper perfectly tight around a cylinder, while one cannot do so around a sphere.)

3 Note that there are two entirely different levels of "tangentness" at work here. When we are talking about tangent vectors to the path we are talking about elements of the tangent bundle. The concept of parallel transport itself, however, deals with the tangent space of this bundle; it is one level of "tangentness" deeper.

Definition 2.3.2. We would like to define a general notion of curvature for a smooth fibre bundle $F \to E \xrightarrow{p} X$. In general, the degree of curvature might vary from point to point, so let’s fix a point e ∈ E. We must also choose two vectors $\eta, \xi \in T_{p(e)}X$. The curvature R(e)(η, ξ) is by definition the tangent vector given by the parallel transport of e around the infinitesimal parallelogram in X based at x = p(e) with subsequent corners x, x + ξ, x + ξ + η, x + η, x.⁴ The parallelogram is infinitesimal, so the result of parallel transport is not an element of the fibre, but an element of the vertical tangent space (vertical, since parallel transport along a closed path does not change the basepoint). We will see in a moment that it depends linearly on both η and ξ. It is obviously antisymmetric in η and ξ (since switching them gives the same parallelogram traversed in the opposite direction), so R(e) is something like a $V_eE$-valued 2-form. R is therefore sometimes called the curvature 2-form.

We say the curvature vanishes at e ∈ E if R(e) is the zero 2-form, and that it vanishes at a point x ∈ X if it vanishes on the whole fibre above x. Finally, we call the connection flat if the curvature vanishes everywhere.

To actually calculate the curvature in terms of the connection, we choose local coordinates $f^a$ ($a \in \{1, \dots, \dim F\}$) of F and $x^i$ ($i \in \{1, \dots, \dim X\}$) of X around (x, f) = h(e) (h a local trivialisation). We can obtain the parallel transport along a side of the rectangle up to second order in the length by using Picard iteration on the differential equation of parallel transport (2.19). Next we must compose those four operators, throwing out all higher order terms that come up. We will not do this calculation here. The answer is relatively simple: in coordinates and index notation, parallel transport is given by

$$(x^i, f^a) \mapsto \left(x^i,\; f^a + R(x,f)^a_{ij}\,\eta^i \xi^j\right) \tag{2.29}$$

where

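$$R(x,f)^a_{ij} = \frac{\partial \Gamma^a_j}{\partial x^i}(x,f) - \frac{\partial \Gamma^a_i}{\partial x^j}(x,f) + \frac{\partial \Gamma^a_i}{\partial f^b}(x,f)\,\Gamma^b_j(x,f) - \frac{\partial \Gamma^a_j}{\partial f^b}(x,f)\,\Gamma^b_i(x,f) . \tag{2.30}$$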

Example 2.3.3. We will simplify this in the all-important case of a linear connection on a vector bundle. We use the same notation as above. Because of the linearity of the connection, we can write

$$\Gamma^a_i(x,f) = \Lambda^a_{bi}(x)\,f^b . \tag{2.31}$$

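Substitute this in (2.30), noting that $\frac{\partial \Gamma^a_i}{\partial f^b}(x,f) = \Lambda^a_{bi}(x)$:

$$R(x,f)^a_{ij} = \frac{\partial \Lambda^a_{bj}}{\partial x^i}(x)\,f^b - \frac{\partial \Lambda^a_{bi}}{\partial x^j}(x)\,f^b + \Lambda^a_{bi}(x)\,\Lambda^b_{cj}(x)\,f^c - \Lambda^a_{bj}(x)\,\Lambda^b_{ci}(x)\,f^c .$$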

We see that this is linear in f, so after identifying the vertical tangent space with F in the natural way, we can interpret R as a 2-form that takes values in the space of operators from F to itself (an End(F)-valued 2-form). We write $R(x,f)^a_{ij} = R^a_{bij}\,f^b$ (dropping the x-dependence for aesthetic reasons and abusing the same letter R), with

$$R^a_{bij} = \frac{\partial \Lambda^a_{bj}}{\partial x^i} - \frac{\partial \Lambda^a_{bi}}{\partial x^j} + \Lambda^a_{ci}\,\Lambda^c_{bj} - \Lambda^a_{cj}\,\Lambda^c_{bi} .$$

It is now clear that the first two terms are together the ij-component of $d\Lambda^a_b$, while the latter two terms are the ij-component of the 2-form that is usually rather tersely written as $[\Lambda,\Lambda]^a_b$, defined by [Λ, Λ](η, ξ) = [Λ(η), Λ(ξ)] (the commutator is that of linear operators on F). In this terse notation, we have the following simple formula for the curvature (identifying Λ with Γ):

$$R = d\Gamma + [\Gamma, \Gamma] . \tag{2.32}$$

⁴ More formally, one should construct an actual parallelogram with (probably slightly curved) sides of variable length in the aforementioned directions, and then take the derivative at length 0 of the parallel transport along the parallelogram.

Example 2.3.4. In the case of an invariant connection on a principal G-bundle, the curvature is given by

$$R = dA + [A, A] \tag{2.33}$$

where A is the local principal gauge potential and the brackets denote the Lie bracket of $\mathfrak{g}$-valued differential forms ($\mathfrak{g}$ the Lie algebra of G).⁵ This can be proved with an argument analogous to the one above.

These two special cases can also be obtained from the respective ordered exponential solutions of the parallel transport equation, in combination with the Baker-Campbell-Hausdorff formula (the non-commutative analogue of the formula exp(a) exp(b) = exp(a + b)).
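To make the link between curvature and transport around a small parallelogram tangible, here is a minimal numerical sketch for a linear connection with constant components, so that the derivative terms in the curvature vanish and only the commutator [Λ(η), Λ(ξ)] remains. Everything concrete in it is an assumption made for illustration: the two 2×2 matrices, the helper names, and in particular the sign convention that transport over a parameter distance ε in direction v acts as exp(−ε Λ(v)); the convention of (2.19) is not restated here and may differ by a sign.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    """Multiply a 2x2 matrix by the scalar c."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def expm(a, terms=12):
    """Matrix exponential by its power series (adequate for small matrices)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        power = scale(1.0 / n, matmul(power, a))  # a^n / n!
        result = [[result[i][j] + power[i][j] for j in range(2)]
                  for i in range(2)]
    return result

# Hypothetical constant connection components, so that d(Lambda) = 0.
LAMBDA_XI = [[0.0, 1.0], [0.0, 0.0]]   # Lambda(xi)
LAMBDA_ETA = [[0.0, 0.0], [1.0, 0.0]]  # Lambda(eta)

def holonomy(eps):
    """Transport around x, x+xi, x+xi+eta, x+eta, x with sides scaled by eps.

    Assumed convention: transport over eps in direction v is exp(-eps*Lambda(v)).
    """
    legs = [scale(-eps, LAMBDA_XI), scale(-eps, LAMBDA_ETA),
            scale(eps, LAMBDA_XI), scale(eps, LAMBDA_ETA)]
    total = [[1.0, 0.0], [0.0, 1.0]]
    for generator in legs:
        total = matmul(expm(generator), total)  # later legs act on the left
    return total

eps = 1e-3
hol = holonomy(eps)
# (holonomy - identity) / eps^2 should approximate [Lambda(eta), Lambda(xi)].
curvature_estimate = [[(hol[i][j] - (1.0 if i == j else 0.0)) / eps ** 2
                       for j in range(2)] for i in range(2)]
eta_xi = matmul(LAMBDA_ETA, LAMBDA_XI)
xi_eta = matmul(LAMBDA_XI, LAMBDA_ETA)
commutator = [[eta_xi[i][j] - xi_eta[i][j] for j in range(2)]
              for i in range(2)]
print(curvature_estimate)
print(commutator)
```

The two printed matrices agree up to terms of order ε, which is the content of the Baker-Campbell-Hausdorff argument: the product of the four exponentials is the identity plus ε² times a commutator, plus higher-order corrections.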

2.4 Holonomy and monodromy

The path dependence of parallel transport is a symptom of a phenomenon called anholonomy. We can measure anholonomy by describing what changes parallel transport along a closed path can cause, in the form of the holonomy group (which might be more properly called the anholonomy group).

⁵ Just as above, the $\mathfrak{g}$-valued 2-form [C, D] is defined by [C, D](η, ξ) = [C(η), D(ξ)], where now the right-hand side features the Lie bracket of the Lie algebra.

In the special (but common) case that the Lie algebra $\mathfrak{g}$ is a matrix Lie algebra (that is, the elements of $\mathfrak{g}$ are matrices and the Lie bracket [x, y] is the commutator xy − yx of matrices), we can see a $\mathfrak{g}$-valued differential form as a matrix of real-valued forms. We then define the exterior product C ∧ D of two $\mathfrak{g}$-valued forms C and D as the matrix product of C and D, multiplying elements of the matrices according to the exterior product of ordinary (real-valued) forms. In index notation, we write this as

$$(C \wedge D)^a_c = C^a_b \wedge D^b_c . \tag{2.34}$$

This exterior product is associative, just as in the real-valued case, but it is not in general antisymmetric on 1-forms, because the matrices need not commute.

For the special case where C and D are 1-forms, we have the formula $(C^a_b \wedge D^b_c)(\eta, \xi) = C^a_b(\eta)\,D^b_c(\xi) - C^a_b(\xi)\,D^b_c(\eta)$, so when C = D = A, we have

$$(A \wedge A)^a_c(\eta, \xi) = A^a_b(\eta)\,A^b_c(\xi) - A^a_b(\xi)\,A^b_c(\eta) = (A(\eta)A(\xi))^a_c - (A(\xi)A(\eta))^a_c = [A(\eta), A(\xi)]^a_c \tag{2.35}$$

so dropping the indices this says that A ∧ A = [A, A]. Formula (2.33) is thus also written as R = dA + A ∧ A.

Definition 2.4.1. Let $F \to E \xrightarrow{p} X$ be a fibre bundle with a connection.

The holonomy group of the bundle (and connection) at basepoint x ∈ X is the subgroup of $\mathrm{Aut}(p^{-1}(x))$, the automorphism group of the fibre above x, obtained by parallel transporting (according to the connection) along closed paths based at x. The local holonomy group is the subgroup of the holonomy group obtained by parallel transporting only along contractible paths based at x.

So, given a closed path γ based at x, the map that sends $e \in p^{-1}(x)$ to the parallel transport of e along γ is an element of the holonomy group at basepoint x.

Remarks.

1. The (local) holonomy group is indeed a group, because of remarks 1 and 2 of section 2.2, and since the closed paths based at x modulo reparametrisation form a group.

2. If the base space X is path-connected, switching to another basepoint y ∈ X changes the holonomy group only by isomorphism (namely conjugation by parallel transport along a path from x to y), so the basepoint is then usually not mentioned. This situation is analogous to that regarding the fundamental group of a topological space. Especially when the basepoint is omitted, the fibre above the basepoint is identified with the abstract fibre F, so the holonomy group can be seen as a subgroup of Aut(F). (Choosing an identification induced by another trivialisation only changes the holonomy group by conjugation with the transition map evaluated at the basepoint.)

3. The (local) holonomy group of a pseudo-Riemannian manifold is defined to be the (local) holonomy group of the tangent bundle of the manifold with respect to the Levi-Civita connection. Since the Levi-Civita connection is linear, parallel transport is always a linear transformation, so the holonomy group of an n-dimensional pseudo-Riemannian manifold is a subgroup of GL(n) (identifying the tangent space with $\mathbb{R}^n$). In fact, because of the special features of the Levi-Civita connection (loosely speaking, it preserves the metric), parallel transport is actually an orthogonal transformation (and so has determinant ±1), so the holonomy group is a subgroup of O(n). If the manifold is orientable, a parallel transport must have determinant 1, so the holonomy group is even a subgroup of SO(n), the special orthogonal transformations.

Example 2.4.2 (Holonomy of $S^2$). Let us return to example 2.3.1. What is the holonomy group of $S^2$? We saw in remark 3 above that it is a subgroup of SO(2) (the rotations of the plane). But with a simple modification of the path of 2.3.1, we can obtain any rotation of the tangent space: start at the north pole; go down to the equator along a meridian; move east along the equator over a length (or angle) α; then return to the north pole along a meridian; parallel transport along this path is obviously rotation over α in the counterclockwise direction. This shows that the holonomy group of $S^2$ is SO(2). Because $S^2$ is simply connected, this is also the local holonomy group.

Remark 2.4.3. In general, oriented n-dimensional pseudo-Riemannian manifolds have as holonomy group the full SO(n), just like the 2-sphere. If the holonomy group is smaller, this signifies some special property of the manifold. For example:

• The 2n-dimensional oriented Riemannian manifolds with holonomy U(n) ⊆ SO(2n) are precisely the Kähler manifolds (manifolds that admit a certain complex structure).

• The 2n-dimensional oriented Riemannian manifolds with holonomy SU(n) ⊆ SO(2n) are precisely the Calabi-Yau manifolds (Kähler manifolds with a certain additional topological property – namely that the first Chern class (a so-called characteristic class) is zero).

A serious investigation of these correspondences is beyond the scope of this paper.

Example 2.4.4 (Parallel transport on a covering). On a covering (fibre bundle with a discrete fibre) with its unique connection, parallel transport around a closed loop coincides with what is known in topology as the monodromy action. Any (continuous) lift of a path in the base space is automatically horizontal. Because the total space locally looks like the base space (or because the vertical tangent space is zero), parallel transport around an infinitesimally small closed path is the identity. In particular, the curvature vanishes.

However, if the fibre has more than one point and the total space is path-connected, there are loops in the base space around which parallel transport is not the identity. Consider for example the covering $\exp : \mathbb{R} \to S^1$ of the circle by the line. Parallel transport around a path that winds one time around the circle moves an element of a fibre by 2π.

The above example shows that even if the curvature is zero, so that locally parallel transport is path-independent, there can still be anholonomy caused by global path dependence. This phenomenon is called monodromy.
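The monodromy of the covering of the circle by the line can be illustrated with a few lines of code. The sketch below lifts a sampled loop on the unit circle to the real line by accumulating small signed angle increments, which is a discrete stand-in for the unique continuous (horizontal) lift. The covering map is taken to be t ↦ (cos t, sin t), and the function name `lift_path` and the sample loops are choices made here for illustration.

```python
import math

def lift_path(points, start):
    """Lift a finely sampled path on the unit circle to the real line.

    The covering map is t -> (cos t, sin t); `start` is the chosen point of
    the fibre above points[0]. Consecutive samples must be close together.
    """
    lifted = [start]
    previous = points[0]
    for point in points[1:]:
        cross = previous[0] * point[1] - previous[1] * point[0]
        dot = previous[0] * point[0] + previous[1] * point[1]
        lifted.append(lifted[-1] + math.atan2(cross, dot))  # signed small step
        previous = point
    return lifted

n = 1000

# A loop that winds once around the circle.
winding_loop = [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
                for k in range(n + 1)]
# A contractible loop: it swings out and comes back the same way.
swing = [math.sin(2 * math.pi * k / n) for k in range(n + 1)]
contractible_loop = [(math.cos(t), math.sin(t)) for t in swing]

shift_winding = lift_path(winding_loop, 0.0)[-1]
shift_contractible = lift_path(contractible_loop, 0.0)[-1]
print(shift_winding, shift_contractible)
```

The lift of the winding loop ends one full period away from where it started, while the lift of the contractible loop returns to its starting point; the curvature plays no role in either case.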

Equivalence of flatness with trivial local holonomy. If a fibre bundle with connection has trivial local holonomy group, then of course the connection is flat (a manifold is locally simply connected, so the curvature can be obtained as a limit of parallel transport around contractible loops, and by assumption those have no effect). It is an important fact that the reverse implication holds as well. We will sketch a proof. Let γ : [0, 1] → X be a contractible loop based at x and let F : [0, 1] × [0, 1] → X be a homotopy from the constant loop at x to γ (we may take it to be a smooth homotopy). Consider the parallel transport around the path F(·, t). For t = 0, this does nothing, since F(·, 0) is the constant loop. As we move from F(·, t) to F(·, t + dt), the difference between these two paths can be written as the concatenation of many infinitesimal rectangles with corners F(s, t), F(s, t + dt), F(s + ds, t + dt) and F(s + ds, t) (this is appreciated best using a drawing). Parallel transport around each of these rectangles is the identity (because the curvature vanishes by assumption), so the difference between parallel transport around F(·, t) and F(·, t + dt) is also the identity (it is just the continuous composition of the parallel transport around each of the infinitesimal rectangles); by continuous composition of all those differences, we see that parallel transport around γ(·) = F(·, 1) is the identity as well.

This result can be interpreted as a justification of the definition of curvature: curvature can detect all local anholonomy.

2.5 Covariant derivative

An important reason to consider connections is that they facilitate the differentiation of sections. Let $F \to E \xrightarrow{p} X$ be a fibre bundle, as always. Given a (local) section s : U → E on an open subset U ⊆ X, a point x ∈ U and a vector $\xi \in T_xX$, we can take the derivative $ds(\xi) \in T_{s(x)}E$. This is perfectly legitimate, but we end up with a tangent vector to the total space E. That is usually not what you want. We already know how the horizontal component of s changes as we move along the base space (since s is a section); we really would like to know how the vertical component of s changes – but we cannot even talk about the vertical component of a section without a connection.

When we do have a connection at our disposal, we can take the vertical projection of the derivative: $\pi_v\,ds(\xi) \in V_{s(x)}E$. This is called the covariant derivative of s at x in direction ξ and denoted by $D_\xi s$. More generally, if $\mathcal{X}$ is a vector field on U, we can take the covariant derivative at every point of U and obtain a function $D_{\mathcal{X}} s : U \to VE : x \mapsto D_{\mathcal{X}(x)} s$, a section over U of the vertical tangent bundle.

If furthermore the bundle is a vector bundle, we can identify the vertical tangent space $V_{s(x)}E$ with the fibre $p^{-1}(x)$ in the usual way. The covariant derivative $D_\xi s$ is then simply an element of the fibre over x, and the covariant derivative $D_{\mathcal{X}} s$ of a section s is again a section of the vector bundle.

In local coordinates, we can use formula (2.2) for the vertical projection to obtain an explicit formula for the covariant derivative. Using the same notation as above, let $h : p^{-1}(U) \to U \times F$ be a trivialisation, let $s_U : U \to F$ be the local representative of the section s (given by $(h \circ s)(x) = (x, s_U(x))$) and let $\Gamma_U$ be the local representative of the connection. Substituting (2.2) in the definition of covariant derivative above, we see that the local representative of $D_\xi s$ is given by

$$ds_U(\xi) + \Gamma_U(x, s_U(x))\,\xi . \tag{2.36}$$

For $\xi = \mathcal{X}(x)$, the first term is just $\mathcal{X}(s_U)(x)$; we can write the second term as $\Gamma_U(\mathcal{X})(x, s_U(x))$ if we let $\Gamma_U$ act on the vector field $\mathcal{X}$ in the logical way (emphasizing the 1-form character of $\Gamma_U$). So in terse point-free notation, we can say that the local form of $D_{\mathcal{X}} s$ is

$$\mathcal{X}(s_U) + \Gamma_U(\mathcal{X})(s_U) . \tag{2.37}$$

Remark 2.5.1. The complementary description, using differential forms instead of vector fields, is closer to the usual physical notation. We consider the case of a vector bundle with a linear connection. Let t : X → E be a section (written in index notation as $t^a$, the components of the vertical part of t according to some implicit trivialisation). Then $\pi_v \circ dt : TX \to VE$ is called the covariant derivative of t, denoted by Dt – this is the same definition as above, except that here we have not yet chosen in which direction we will differentiate: $D_\xi t = Dt(\xi)$. Since the bundle is a vector bundle, we can identify VE with E, obtaining a map Dt : TX → E, or – one more identification – a section of the tensor product bundle $T^*X \otimes E$.

This last object is written in index notation as $D_\mu t^a$; from (2.36), we see that

$$D_\mu = \partial_\mu + A_\mu , \tag{2.38}$$

where A is the connection 1-form,⁶ an End(F)-valued 1-form, given by $A(\xi)(t) = \Gamma_U(x, t)\,\xi$. Intuitively, the connection form of a linear connection (with respect to some trivialisation) gives the difference between the connection and the trivial connection (induced by the trivialisation).

⁶ This ‘connection form’ of a linear connection is similar, but not directly related, to the ‘connection form’ of an invariant connection (from section 2.1.2).
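As a small worked illustration of (2.38), consider the simplest possible setting, which is an assumption made here for concreteness and not an example taken from the text: a trivial line bundle over the real line with connection 1-form A = a(x) dx. A section t is parallel along the line exactly when its covariant derivative vanishes, and in this case the equation can be solved in closed form:

```latex
\begin{align*}
D_x t = \frac{dt}{dx} + a(x)\,t = 0
\quad &\Longrightarrow \quad
t(x) = t(0)\,\exp\!\Big(-\int_0^x a(y)\,dy\Big), \\
a(x) \equiv a_0
\quad &\Longrightarrow \quad
t(x) = t(0)\,e^{-a_0 x}.
\end{align*}
```

For a ≡ 0 the parallel sections are the constant ones, which matches the intuition that the connection form measures the deviation from the trivial connection of the chosen trivialisation.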


Chapter 3

Applications

To give a taste of how the theory we introduced can be put to work, we will now review some applications.

3.1 Electromagnetism as a gauge theory

Classically, electromagnetism is described by two vector fields: the electric field and the magnetic field. From a special-relativistic point of view, those are really two manifestations of the same object, known by physicists as the field strength tensor, typically written in index notation as $F_{\mu\nu}$ (see [1, Chapter 12]). It is an antisymmetric (0, 2)-tensor: a 2-form on the 4-dimensional spacetime manifold (let’s call that X). We will denote the field strength by F.

Maxwell’s equations of electrodynamics take a particularly simple form in terms of differential forms. The two homogeneous equations translate to the condition that F is a closed 2-form:

$$dF = 0 \tag{3.1}$$

(where d is the ordinary exterior derivative of differential forms). The two inhomogeneous equations can be written as¹

$$*d(*F) = J , \tag{3.2}$$

where ∗ is the so-called Hodge star (given a p-form on an n-dimensional pseudo-Riemannian manifold, it produces an (n − p)-form, making explicit use of the metric; physicists might know it under the name of “contracting with the Levi-Civita tensor”) and J is a 1-form called the 4-current (the source of the electromagnetic field: the temporal component is (minus) the electric charge distribution ρ and the spatial component is the ordinary current J). We will not go through this in detail; this is done for example in [2, Chapter 7, sections 4.1-4].

Just as in the field formulation, we can exploit the homogeneous equation dF = 0 to write the field strength F in terms of a potential: F = dA (so A is a 1-form).² (In index notation, this is written as $F_{\mu\nu} = \frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu}$, and $A_\mu$ has (minus) the electric scalar potential V as temporal component and the magnetic vector potential A as spatial component.) This potential formulation automatically takes care of the homogeneous Maxwell equation (dF = ddA = 0); the only remaining equation is

$$*d(*dA) = J . \tag{3.3}$$

¹ We use natural units, so that $4\pi\varepsilon_0 = 1$.

² Note that for this to work, spacetime must be contractible (or at least its second de Rham cohomology group must vanish). To simplify a bit, we will assume a contractible spacetime in this section, so in particular all closed forms are exact.

The description of the electromagnetic fields in terms of the potential A has an obvious redundancy: if we add a closed 1-form α (= dλ) to it, $A \mapsto A + \alpha$, the field strength F does not change. (In the tensor formulation, this is written as $A_\mu \mapsto A_\mu + \frac{\partial \lambda}{\partial x^\mu}$ with λ a scalar function on X.) This local symmetry (local, because we can change A independently at different points of spacetime) is called a gauge symmetry; changing to a different but equivalent A (by adding a closed form) is called a gauge transformation.
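The invariance of the field strength under a gauge transformation is easy to test numerically. The sketch below restricts attention to two coordinates (x, y) and a single component $F_{xy} = \partial_x A_y - \partial_y A_x$, computed by central differences. The particular potential and the gauge function λ(x, y) = sin(x) exp(y) are hypothetical examples chosen only for illustration.

```python
import math

def field_strength(potential, x, y, h=1e-4):
    """F_xy = dA_y/dx - dA_x/dy by central differences.

    `potential(x, y)` must return the pair (A_x, A_y).
    """
    d_ay_dx = (potential(x + h, y)[1] - potential(x - h, y)[1]) / (2 * h)
    d_ax_dy = (potential(x, y + h)[0] - potential(x, y - h)[0]) / (2 * h)
    return d_ay_dx - d_ax_dy

def potential(x, y):
    """A hypothetical example potential (A_x, A_y)."""
    return (-0.5 * y * math.cos(x), 0.5 * x + y * y)

def grad_lambda(x, y):
    """Gradient of the hypothetical gauge function lambda = sin(x) * exp(y)."""
    return (math.cos(x) * math.exp(y), math.sin(x) * math.exp(y))

def transformed_potential(x, y):
    """The gauge-transformed potential A + d(lambda)."""
    a = potential(x, y)
    g = grad_lambda(x, y)
    return (a[0] + g[0], a[1] + g[1])

point = (0.3, -0.7)
f_before = field_strength(potential, *point)
f_after = field_strength(transformed_potential, *point)
print(f_before, f_after)
```

The two values agree up to the discretisation error of the finite differences, because the mixed second derivatives of λ cancel in the antisymmetric combination; for this potential the exact value is 0.5 + 0.5 cos(x).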

Now comes the starting point of gauge theory: using this potential A, we can interpret the electromagnetic fields as a connection: specifically, an invariant connection on the trivial principal U(1)-bundle over spacetime X (U(1) ≅ $S^1$ is the group of complex numbers of unit modulus). To do so, we let A be the local principal gauge potential of our connection. This makes sense, since A is a real-valued 1-form, and the Lie algebra of U(1) is isomorphic to $\mathbb{R}$. Furthermore, transformations of A due to a change of trivialisation correspond to gauge transformations of A in the sense defined above: let $h_1$ and $h_2$ be two different global trivialisations, and let g : X → U(1) be the difference ($(h_2 \circ h_1^{-1})(x, f) = (x, g(x)f)$). According to formula (2.18), the transformation of A is given by

$$A_2 = g \cdot A_1 \cdot g^{-1} - dg \cdot g^{-1} . \tag{3.4}$$

U(1) is abelian, so the first term on the right-hand side is just $A_1$. The multiplication with $g^{-1}$ in the second term is just the identification of the tangent space at g(x) with the Lie algebra, so the second term is a u(1)-valued 1-form, say α:

$$A_2 = A_1 - \alpha ; \tag{3.5}$$

this is exactly what we called a gauge transformation! Conversely, starting with such a gauge transformation, we can exponentiate it to get a change of trivialisation. This is a nice development. We have incorporated the redundancy of the potential in a natural way: different gauges (equivalent As) turn out to be different “chartings” of the same underlying object (the connection).

The curvature of the connection introduced above is given by (see (2.33))

$$R = dA + [A, A] = dA \tag{3.6}$$

(the second term vanishes, because the Lie algebra of U(1) is abelian), so the curvature of the connection is just the field strength F!

We state without proof that the sourceless (J = 0) Maxwell equations are the Euler-Lagrange equations emanating from the Lagrangian F ∧ ∗F ($F_{\mu\nu}F^{\mu\nu}$ in index notation) (note that if F is a 2-form, then ∗F is an (n − 2)-form, so F ∧ ∗F is an n-form that can be integrated over n-dimensional spacetime). This is an instance of the general Yang-Mills Lagrangian

$$-k\,\mathrm{tr}(F \wedge *F) , \tag{3.7}$$

where k is some constant and tr denotes the trace (of matrices) on the matrix Lie algebra. The study of this Yang-Mills action is an important part of gauge theory.

Maxwell’s equations (or, if you prefer, the Yang-Mills Lagrangian) only describe the evolution of the electromagnetic field, not how it influences particles. To do that, you need to add some sort of interaction term to the Yang-Mills Lagrangian. There is a nice procedure to derive such a term. First you must choose an irreducible unitary representation of the gauge group, U(1) in our case – this will turn out to be the charge of the particle.³ Use this representation to construct the associated bundle from the principal bundle. The wave function of the particle is assumed to be a section of this associated bundle. Next, the connection representing the gauge field can be transferred in a natural way to a connection on the associated bundle. Finally, we obtain an interaction term by taking the usual ‘free’ Lagrangian of the particle and replacing all occurrences of the derivative by the covariant derivative, with respect to the connection we just obtained. This procedure is called minimal coupling.
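The reason why the charge label of a U(1) representation has to be an integer can be seen in a short experiment. A candidate representation sends $e^{i\theta}$ to multiplication by $e^{iq\theta}$; since θ and θ + 2π describe the same group element, the candidate is only well defined if both give the same answer. The function names and the sample values of q below are illustrative choices.

```python
import cmath
import math

def rho(q, theta):
    """Candidate representation of U(1): e^{i theta} -> e^{i q theta}."""
    return cmath.exp(1j * q * theta)

def is_well_defined(q, samples=50, tol=1e-9):
    """Check that theta and theta + 2*pi are sent to the same number."""
    angles = (2 * math.pi * k / samples for k in range(samples))
    return all(abs(rho(q, t) - rho(q, t + 2 * math.pi)) < tol for t in angles)

def is_homomorphism(q, a=0.4, b=1.1, tol=1e-12):
    """Check rho(a) * rho(b) = rho(a + b) at one pair of sample angles."""
    return abs(rho(q, a) * rho(q, b) - rho(q, a + b)) < tol

for q in (2, -1, 0.5):
    print(q, is_well_defined(q), is_homomorphism(q))
```

The multiplicative property holds for every real q, but only integer values survive the periodicity check, which is how the quantisation of charge enters the picture.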

3.1.1 The abelian Aharonov-Bohm effect

In quantum electrodynamics, there is the following phenomenon, first noted by Ehrenberg and Siday in 1949, and later by Aharonov and Bohm ([3]), after whom it is named. Suppose that we have an infinitely long solenoid. Inside, there is a nonzero magnetic field. Outside, the magnetic field is zero (we assume for simplicity that the configuration is static and that there are no electric fields). We send a beam of electrically charged particles – electrons, say – towards the solenoid, split it into two beams, let the beams go past the solenoid on different sides, and then rejoin the beams. If you think about this experiment classically, you wouldn’t expect anything special to happen: the electrons stay away from the solenoid, so they never encounter any magnetic field. Quantum mechanically, however, it turns out that there is an electromagnetic effect: electrons going past the solenoid on different sides acquire a relative phase factor, thereby creating interference. Let’s see if we can find out what happens. To simplify things, we constrain the problem to 2 dimensions (the electrons stay in the plane z = 0).

In electrodynamics the Hamiltonian is given by⁴

$$H = \frac{(\mathbf{P} - q\mathbf{A})^2}{2m} . \tag{3.8}$$

Note that, to avoid confusion with the base of the natural logarithm, we have denoted the electron charge by −q. To better exploit the symmetry of our situation, we express this in cylindrical coordinates. Substituting $\mathbf{P} \to -i\nabla$ and realising that in cylindrical coordinates the nabla operator is given by $\nabla = \hat{r}\,\frac{\partial}{\partial r} + \hat{\theta}\,\frac{1}{r}\frac{\partial}{\partial \theta}$, we can rewrite the Hamiltonian with respect to these coordinates to give

$$H = \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\left(\frac{\partial}{\partial \theta} + i\alpha\right)^2 . \tag{3.9}$$

³ Note that the unitary irreducible representations of U(1) are (complex) one-dimensional and are parameterised by an integer q, sending $e^{i\theta}$ to multiplication by $e^{iq\theta}$; this nicely incorporates the quantised nature of electrical charge.

⁴ In other words, it turns out that using this Hamiltonian gives the correct predictions with respect to experiments and reduces to classical electrodynamics in an appropriate limit. One way to obtain this Hamiltonian is by minimal coupling: take the Hamiltonian $H = \frac{\mathbf{P}^2}{2m}$ of a free neutral particle and replace the partial derivative ∇ (in P) with the covariant derivative ∇ − iqA.

Figure 3.1: Aharonov-Bohm effect

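Here we used a gauge⁵ in which $A_r = 0$ and $A_\theta = \frac{\phi}{2\pi r}$, where φ is the magnetic flux through the solenoid, and where we have defined $\alpha := -\frac{q\phi}{2\pi}$. Putting this in the time-independent Schrödinger equation Hψ = Eψ and setting⁶ $E =: -k^2$, we get the following wave equation:

$$\left(\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\left(\frac{\partial}{\partial \theta} + i\alpha\right)^2 + k^2\right)\psi = 0 . \tag{3.10}$$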

The solution to this equation is given by:

$$\psi(r, \theta) = \sum_{m=-\infty}^{\infty} e^{im\theta}\left[a_m J_{m+\alpha}(kr) + b_m J_{-(m+\alpha)}(kr)\right] , \tag{3.11}$$

where $a_m$ and $b_m$ are arbitrary constants and $J_{\pm(m+\alpha)}$ is a Bessel function of order ±(m + α) (indeed, a common approach is to define these Bessel functions as the solutions of this very differential equation). Note that this solution holds only outside the solenoid.
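One can check a single mode of this expansion numerically. Acting with $(\partial/\partial\theta + i\alpha)^2$ on $e^{im\theta}$ gives $-(m+\alpha)^2 e^{im\theta}$, so for a mode $e^{im\theta}R(r)$ the wave equation (3.10) reduces to $R'' + R'/r + (k^2 - \nu^2/r^2)R = 0$ with ν = |m + α|. The sketch below evaluates the Bessel function through its standard power series (restricted here to ν ≥ 0) and measures the residual of that radial equation by finite differences. The values of α, k and r are arbitrary sample values, and the helper names are illustrative.

```python
import math

def bessel_j(nu, x, terms=40):
    """J_nu(x) from its power series, for nu >= 0 and moderate x > 0."""
    return sum((-1) ** j / (math.factorial(j) * math.gamma(j + nu + 1))
               * (x / 2) ** (2 * j + nu) for j in range(terms))

def radial_residual(m, alpha, k, r, h=1e-3):
    """Residual of R'' + R'/r + (k^2 - nu^2/r^2) R for R(r) = J_nu(k r)."""
    nu = abs(m + alpha)

    def radial(s):
        return bessel_j(nu, k * s)

    first = (radial(r + h) - radial(r - h)) / (2 * h)
    second = (radial(r + h) - 2 * radial(r) + radial(r - h)) / (h * h)
    return second + first / r + (k * k - nu * nu / (r * r)) * radial(r)

alpha, k, r = 0.3, 2.0, 1.5  # arbitrary sample values
residuals = [radial_residual(m, alpha, k, r) for m in (-2, 0, 1)]
print(residuals)
```

The residuals are zero up to the discretisation error of the finite differences, for integer as well as non-integer order.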

From now on, we will work in the limit where the radius of the solenoid goes to zero while the total flux φ remains fixed. We do so because it will simplify the calculations a lot, and while this means that we will be working in an idealized situation that does not describe a real physical system, it still demonstrates an interesting and important quantum phenomenon, which has no classical counterpart.

⁵ It is easy to see that this gauge indeed describes our situation outside the solenoid: the curl in cylindrical coordinates of A is

$$\nabla \times \mathbf{A} = \left(\frac{1}{r}\frac{\partial A_z}{\partial \theta} - \frac{\partial A_\theta}{\partial z}\right)\hat{r} + \left(\frac{\partial A_r}{\partial z} - \frac{\partial A_z}{\partial r}\right)\hat{\theta} + \frac{1}{r}\left(\frac{\partial (r A_\theta)}{\partial r} - \frac{\partial A_r}{\partial \theta}\right)\hat{z} = 0 = \mathbf{B}_{\text{outside}} .$$

Furthermore, the line integral of A along a path that encloses the solenoid once is φ, the flux through the enclosed surface.

⁶ Note that k will also be the wave number of the incoming electron beam as given in (3.13). This makes perfect sense, since far to the right of the solenoid, where the magnetic influence is negligible, a solution with energy $-k^2$ should look like a plane wave with wave number k.

The concrete advantage of working in this limit has to do with the Bessel functions of negative order. Of course, the wave function must be defined on all of space, and it must also be continuous. This means that we have to extend the solution (3.11) to the region inside the solenoid. To do this, in the limit where the radius of the solenoid goes to zero, we only have to specify a value at the origin. The Bessel functions of negative order, however, have a pole at the origin, so there is no continuous way to extend them there. Therefore they cannot be part of the solution, in this limit, and we have

$$\psi(r, \theta) = \sum_{m=-\infty}^{\infty} a_m J_{|m+\alpha|}(kr)\,e^{im\theta} . \tag{3.12}$$

Remember that the situation we want to study is as follows. A beam of electrons comes in from the right, splits up, passes the solenoid on both sides and is rejoined on the other side. This means that we have the initial condition that for large r to the right of the solenoid (where the effect of the solenoid is not yet felt), ψ must represent such an incoming wave; in that limit, we must have

$$\psi = \psi_{\text{inc}} = e^{-i(\alpha\theta + kx)} . \tag{3.13}$$

We mention two ways to see that this ψ indeed represents the desired incoming wave. The first is an intuitive argument. Let’s say we are working in a different gauge, where we have gauged away the potential on the right side of the solenoid. This is possible since the curl of A is equal to zero here, as we have shown above (in footnote 5). So locally, we can write A as the gradient of some scalar function f and gauge it away. In fact, we can easily write down this scalar function by solving the equation ∇f = A, which is a system of partial differential equations with solution (as the reader will be able to check easily) $f = \frac{\phi}{2\pi}\theta = -\frac{\alpha}{q}\theta$. Since the potential in the region of the incoming wave is then zero, we know what the incoming wave should look like; it should be the normal free particle wave function $\psi_{\text{inc}} = e^{-ikx}$. Now return to the original gauge $A_\theta = \frac{\phi}{2\pi r}$. When performing this gauge transformation, the wave function is multiplied by $e^{iqf}$,⁷ so after gauging back to our original situation, the incoming wave function becomes $\psi_{\text{inc}} = e^{-i\alpha\theta}e^{-ikx}$, as we claimed in (3.13).

The second reason to want (3.13) as the incoming wave appears once we rephrase the initial condition as the demand that far to the right of the solenoid, the wave function should have a constant current density in the −x direction. If we take the gauge covariant form of the current density (with covariant derivatives instead of ordinary ones), then this implies that $\psi_{\text{inc}}$ must have the given form (see also [3]).

⁷ This transformation behaviour of ψ is just that of a section of the (electromagnetic) bundle associated to a particle with charge −q.

A compelling reason to demand this behaviour is that it is necessary to make the Schrödinger equation gauge invariant: if we changed only the potential, the Hamiltonian alone would change; we need to change the wave function simultaneously with the potential. Of course, a gauge transformation cannot change the physics of a situation, so the only thing we can do to the wave function is tag on a phase factor. For a gauge transformation $\mathbf{A} \mapsto \mathbf{A} + \nabla f$ the right phase factor is $e^{iqf}$, as the reader can check by direct calculation.

It turns out that the correct choice for $a_m$ for this initial condition is given by $a_m = (-i)^{|m+\alpha|}$, thus giving

$$\psi(r, \theta) = \sum_{m=-\infty}^{\infty} (-i)^{|m+\alpha|} J_{|m+\alpha|}(kr)\,e^{im\theta} . \tag{3.14}$$

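In [3] it is shown that this indeed gives the right $\psi_{\text{inc}}$ by calculating that, in the limit of large r, the wave function is given by

$$\psi \longrightarrow e^{-i(\alpha\theta + kx)} + \frac{e^{ikr}}{(2\pi i k r)^{1/2}}\,\sin(\pi\alpha)\,\frac{e^{-i\theta/2}}{\cos(\theta/2)} . \tag{3.15}$$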

We recognise the first term to be the incoming beam because it does not decrease with growing r, as a scattered term in a multi-dimensional problem should (to keep the total probability to find the electron somewhere equal to one).

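The scattering cross section can now be calculated by writing

$$\psi = \psi_{\text{inc}} + e^{ikr}\,\frac{g(\theta)}{\sqrt{r}} \tag{3.16}$$

and using⁸ $\frac{d\sigma}{d\Omega} = |g(\theta)|^2$. This results in:

$$\frac{d\sigma}{d\Omega} = \frac{\sin^2(\pi\alpha)}{2\pi k}\,\frac{1}{\sin^2(\theta/2)} . \tag{3.17}$$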

We see that if α is an integer, there is no scattering at all. However, if α is some arbitrary real number, an interference pattern emerges; particles passing the solenoid on one side interfere with particles that took the other route! This is interesting, because neither of the particles directly encounters the magnetic field. So either the field influences the particle over a finite distance, which would be incompatible with relativity, or there is something more to the potential than just being a mathematical aid; the potential itself influenced the particle. This is what Aharonov and Bohm argued in their article [3], and what makes this effect an important one for understanding electromagnetism.

Gauge-theoretic interpretation. From the gauge-theoretic point of view, this effect is not so strange – or at least we can describe it neatly: in the region outside of the solenoid the electromagnetic field strength is zero, so the connection $A$ is flat there. We can conclude that there is no local holonomy, but not that there is no monodromy (especially since the region has non-trivial fundamental group: you can walk around the solenoid). Therefore the connection may be non-trivial, and the electrons can feel this as an electromagnetic influence. The reason the Aharonov-Bohm effect may seem so strange at first is that we are used to being able to describe electromagnetism entirely in terms of the curvature (field strength) $F$ (since space(time) is simply connected). We now see that it may be more appropriate to view the connection as the fundamental physical object, not its curvature.
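The difference between "flat" and "trivial" can be checked numerically. The sketch below assumes an idealised, infinitely thin solenoid along the $z$-axis carrying flux `flux`, and the standard gauge choice in which $A$ points in the angular direction with magnitude $\text{flux}/(2\pi r)$; these modelling choices and all function names are ours. It integrates $A$ around circular loops: loops that encircle the solenoid give the full flux, whatever their radius, while a loop that does not encircle it gives zero.

```python
import math

def vector_potential(x, y, flux):
    """A outside an idealised thin solenoid on the z-axis (assumed gauge choice)."""
    r2 = x * x + y * y
    return (-flux * y / (2 * math.pi * r2), flux * x / (2 * math.pi * r2))

def loop_integral(cx, cy, radius, flux, n=20000):
    """Approximate the line integral of A around a circle with centre (cx, cy)."""
    total = 0.0
    for i in range(n):
        t0 = 2 * math.pi * i / n
        t1 = 2 * math.pi * (i + 1) / n
        tm = 0.5 * (t0 + t1)
        # Evaluate A at the midpoint of the arc, and use the chord as dl.
        ax, ay = vector_potential(cx + radius * math.cos(tm),
                                  cy + radius * math.sin(tm), flux)
        dx = radius * (math.cos(t1) - math.cos(t0))
        dy = radius * (math.sin(t1) - math.sin(t0))
        total += ax * dx + ay * dy
    return total

if __name__ == "__main__":
    flux = 2.5  # arbitrary sample value
    print(loop_integral(0.0, 0.0, 1.0, flux))  # encircles the solenoid
    print(loop_integral(0.0, 0.0, 5.0, flux))  # larger loop, same result
    print(loop_integral(3.0, 0.0, 1.0, flux))  # does not encircle the solenoid
```

The loop integral depends only on whether the loop winds around the solenoid, which is the monodromy mentioned above.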

⁸ We will actually take this as the definition of the differential cross section $\frac{d\sigma}{d\Omega}$. This is not an illogical thing to do, since it is intuitively clear that $g(\theta)$ represents some measure of the probability to have a particle scattering off in direction $\theta$.


3.2 Chern-Simons theory

To define the Yang-Mills action (3.7) we employed the Hodge star, which makes explicit use of the metric. This is not a flaw of this particular formulation: the Yang-Mills action does depend on the metric. Chern-Simons theory is a gauge theory that does not need a metric; it is therefore called a topological gauge theory. The starting point of the theory is a smooth manifold $M$ together with a Lie group $G$ (the “gauge group”). Usually some properties are demanded of $G$, but we will not be concrete enough to need them; we only require the Lie algebra of $G$ to be a matrix Lie algebra. The theory then considers the trivial principal $G$-bundle over $M$. We will look at the special case where the underlying manifold is 3-dimensional (there are generalisations to all odd dimensions). Given an invariant connection (represented by the 1-form $A$), we can form the Lagrangian

$$L = \frac{k}{4\pi}\, \mathrm{tr}\left( A \wedge dA + \frac{2}{3}\, A \wedge A \wedge A \right) \tag{3.18}$$

($k$ is some constant). This is a real-valued (because of the trace) 3-form, so we can integrate it over $M$ to obtain an action. Unfortunately, this Lagrangian is not gauge invariant: a different choice of trivialisation (a gauge transformation) can give a different $L$. However, this dependence is not too wild: the difference depends only on the homotopy type of the gauge transformation, and even then $L$ changes only by an integer multiple of some constant. If we choose $k$ in the right way, then $L$ changes only by integer multiples of $2\pi$ (the normalisation above is conveniently chosen, so that in fact $k$ must be integer). When the theory is quantised, this means that the expectation value $e^{iL}$ does not change at all under gauge transformations, which is all you need to end up with a proper gauge quantum field theory. Now, the so-called Feynman path integral of our quantum field theory can be calculated (this, and actually all we hint at in this section, is done properly in [4]). The result is a number that can be used as an invariant of the manifold $M$.

This Chern-Simons theory has an unexpected application: it can help to classify and understand knots⁹! Very roughly, this is done by multiplying the expectation value $e^{iL}$ with the holonomy (parallel transport) around the knot $K$ we wish to examine. Actually we multiply with the trace of this holonomy: the holonomy itself is an element of $G$, and it is well-defined only up to conjugation (picking a different basepoint for the holonomy results in conjugation); taking the trace (that is, computing the matrix that corresponds to the element of $G$ under some representation, and taking the trace of the resulting matrix) gives a number that does not change under change of basepoint. This extra factor (known as a Wilson line¹⁰) is clearly gauge invariant, since we defined it in terms of (parallel transport around) the connection itself and we made no reference to any trivialisation or representative $A$. The new Feynman path integral can be calculated, and (for fixed $M$) can be used to classify the knot $K$.

⁹ A knot is an embedding of the circle $S^1$ in space ($\mathbb{R}^3$).

¹⁰ This is analogous to electromagnetism: if you view the connection as a physical field on the manifold $M$, the action (3.18) can be read as the action of free evolution of this field. If you see $K$ as the world line of a ‘charged’ particle, the Wilson line factor then couples the particle to the field.


After all this interesting hand-waving, we will do a small concrete calculation: we derive the Euler-Lagrange equation from the Chern-Simons Lagrangian (3.18). We can leave out the factor $\frac{k}{4\pi}$ for this purpose. In index notation, the Lagrangian is

$$L = \left( A^a_{b\mu}\, \partial_\nu A^b_{a\rho} + \frac{2}{3}\, A^a_{b\mu} A^b_{c\nu} A^c_{a\rho} \right) dx^\mu \wedge dx^\nu \wedge dx^\rho \tag{3.19}$$

(where $A^a_b$ is the $(a,b)$-entry of the matrix of real-valued 1-forms $A$, and $A^a_{b\mu}$ is its $\mu$-component), and the Euler-Lagrange equation reads $\frac{\partial L}{\partial A^a_{b\mu}} - \partial_\nu \frac{\partial L}{\partial(\partial_\nu A^a_{b\mu})} = 0$. So let’s first compute:

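$$\begin{aligned}
\frac{\partial L}{\partial A^a_{b\mu}}
&= \left( \partial_\nu A^b_{a\rho} + \frac{2}{3} \left( A^b_{c\nu} A^c_{a\rho} - A^c_{a\nu} A^b_{c\rho} - A^b_{c\rho} A^c_{a\nu} \right) \right) dx^\mu \wedge dx^\nu \wedge dx^\rho \\
&= \left( \partial_\nu A^b_{a\rho} + 2 A^b_{c\nu} A^c_{a\rho} \right) dx^\nu \wedge dx^\rho \wedge dx^\mu \\
&= (dA + 2 A \wedge A)^b_a \wedge dx^\mu
\end{aligned} \tag{3.20}$$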

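and $\frac{\partial L}{\partial(\partial_\nu A^a_{b\mu})} = A^b_{a\lambda}\, dx^\lambda \wedge dx^\nu \wedge dx^\mu$, so

$$\begin{aligned}
\partial_\nu \frac{\partial L}{\partial(\partial_\nu A^a_{b\mu})}
&= \partial_\nu A^b_{a\lambda}\, dx^\lambda \wedge dx^\nu \wedge dx^\mu \\
&= -\partial_\nu A^b_{a\lambda}\, dx^\nu \wedge dx^\lambda \wedge dx^\mu \\
&= -dA^b_a \wedge dx^\mu .
\end{aligned} \tag{3.21}$$

Plugging this in the Euler-Lagrange equation, we get

$$\begin{aligned}
0 &= \frac{\partial L}{\partial A^a_{b\mu}} - \partial_\nu \frac{\partial L}{\partial(\partial_\nu A^a_{b\mu})} \\
&= (dA + 2 A \wedge A)^b_a \wedge dx^\mu - \left( -dA^b_a \wedge dx^\mu \right) \\
&= 2\, (dA + A \wedge A)^b_a \wedge dx^\mu .
\end{aligned} \tag{3.22}$$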

This is true for all $a$, $b$ and $\mu$ if and only if $dA + A \wedge A = 0$. This says that the curvature of the connection must vanish, i.e. that the connection must be flat!
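As a cross-check, the same equation of motion can be obtained without indices. The following is a sketch under the assumption that the variation $\delta A$ is a matrix of 1-forms and that we may use the graded cyclicity of the trace, $\mathrm{tr}(\alpha \wedge \beta) = (-1)^{pq}\, \mathrm{tr}(\beta \wedge \alpha)$ for a matrix-valued $p$-form $\alpha$ and $q$-form $\beta$; the factor $\frac{k}{4\pi}$ is again left out.

```latex
\begin{align*}
\delta L
  &= \mathrm{tr}\left( \delta A \wedge dA + A \wedge d\,\delta A
       + 2\, \delta A \wedge A \wedge A \right) \\
  &= \mathrm{tr}\left( 2\, \delta A \wedge dA + 2\, \delta A \wedge A \wedge A \right)
       - d\, \mathrm{tr}\left( A \wedge \delta A \right)
       && \text{(using } d(A \wedge \delta A) = dA \wedge \delta A - A \wedge d\,\delta A \text{)} \\
  &= 2\, \mathrm{tr}\left( \delta A \wedge (dA + A \wedge A) \right)
       - d\, \mathrm{tr}\left( A \wedge \delta A \right) .
\end{align*}
```

The last term is exact, so it does not contribute to the variation of the action for variations that vanish on the boundary; what remains vanishes for all $\delta A$ precisely when $dA + A \wedge A = 0$.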

In more physical language: fields (i.e. connections) that satisfy the equations of motion have zero field strength. Intuitively, this means that this Chern-Simons action is ‘sourceless’: we included no charge that generates a field.

3.3 $S^2$ is no Lie group

As a small mathematical application of the fibre-bundle theory we introduced, we show that the sphere $S^2$ cannot be made into a Lie group. This follows directly from the next two facts:

1. The tangent bundle of a Lie group $G$ is trivial. By multiplying from the right with $g^{-1}$, the tangent space above a point $g \in G$ is sent to the tangent space of the identity element, i.e. the Lie algebra $\mathfrak{g}$ of $G$. We therefore can take the following global trivialisation: $h : TG \to G \times \mathfrak{g} : v \mapsto (p(v), v \cdot p(v)^{-1})$ (where $p : TG \to G$ is the projection). This $h$ is indeed a trivialisation: a smooth bijection that commutes with projection.


2. The tangent bundle of $S^2$ is not trivial. If it were, there would be a global section that is nowhere zero (simply compose a constant section $x \mapsto (x, f_0)$ (for some nonzero $f_0$) with (the inverse of) a global trivialisation). The ‘hairy ball theorem’ of algebraic topology asserts that such non-vanishing vector fields do not exist on $S^2$.
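To see the first fact at work on a sphere that is a Lie group, consider $S^3$, viewed as the unit quaternions. The sketch below is our own illustration (the helper names are hypothetical): it takes the basis $i, j, k$ of the Lie algebra (the imaginary quaternions) and translates it to an arbitrary point $g$ by right multiplication, i.e. it evaluates the inverse of the trivialisation $h$ on $(g, X)$. The result is a frame of three tangent vectors at $g$, none of which vanishes.

```python
import math

def qmul(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def dot(p, q):
    """Euclidean inner product on R^4."""
    return sum(x * y for x, y in zip(p, q))

# Basis i, j, k of the Lie algebra of S^3 (the imaginary quaternions).
LIE_ALGEBRA_BASIS = [(0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]

def frame_at(g):
    """Tangent vectors X*g at the unit quaternion g, one for each basis element X."""
    return [qmul(X, g) for X in LIE_ALGEBRA_BASIS]

if __name__ == "__main__":
    norm = math.sqrt(30.0)
    g = (1 / norm, 2 / norm, 3 / norm, 4 / norm)  # an arbitrary point of S^3
    for v in frame_at(g):
        # Each v is tangent to S^3 at g (orthogonal to g) and has length one.
        print(v, dot(v, g), dot(v, v))
```

By the hairy ball theorem, no analogous construction can exist on $S^2$.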


Chapter 4

Characteristic classes

4.1 Introduction

In mathematics, the following situation is quite common: we have some class of objects (say, topological spaces or knots) and we have some equivalence relation on this class (homeomorphism, homotopy). Now, we often want to know if two given objects in this class are equivalent, but this is not always easy to find out. In particular, it can be quite hard to prove that two objects are not equivalent, because this entails proving that something does not exist. This section is about a tool which can help you in such a situation; this is the notion of an invariant, which can help you discover whether two spaces are equivalent.

Example 4.1.1. In topology two spaces are seen as being equivalent if there exists a homeomorphism between the two. To check whether two spaces are equivalent it is often convenient to calculate the fundamental groups of those spaces; if these are not isomorphic, then the topological spaces are not equivalent either.

Example 4.1.2. In knot theory we regard two knots as being the same if one can be continuously deformed into the other. When we want to know if two given knots are in fact the same, we can for example calculate the 3-coloring for both knots, and if they don’t come out the same, the knots are not equivalent either.
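As an illustration of Example 4.1.2, here is a small brute-force sketch that counts the 3-colorings of a knot diagram. The encoding is our own assumption: a diagram is given by its number of arcs and a list of crossings, each crossing being a triple of arc indices (over-arc, first under-arc, second under-arc). A coloring assigns one of three colours to every arc such that at each crossing the three arcs are either all the same or all different, which for colours in $\{0, 1, 2\}$ is the condition $2\,c_{\text{over}} \equiv c_{\text{under},1} + c_{\text{under},2} \pmod 3$.

```python
from itertools import product

def count_3_colourings(num_arcs, crossings):
    """Count the 3-colourings of a knot diagram.

    crossings is a list of (over, under_1, under_2) arc indices.
    """
    count = 0
    for colours in product(range(3), repeat=num_arcs):
        if all((2 * colours[o] - colours[u1] - colours[u2]) % 3 == 0
               for o, u1, u2 in crossings):
            count += 1
    return count

# The unknot: one arc, no crossings.
UNKNOT = (1, [])
# The standard trefoil diagram: three arcs, and all three arcs meet at every crossing.
TREFOIL = (3, [(0, 1, 2), (1, 2, 0), (2, 0, 1)])

if __name__ == "__main__":
    print("unknot: ", count_3_colourings(*UNKNOT))
    print("trefoil:", count_3_colourings(*TREFOIL))
```

The two counts differ (3 for the unknot, 9 for the trefoil), so the trefoil cannot be deformed into the unknot.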

The fundamental group of a topological space and the 3-coloring of a knot are both examples of invariants. An invariant is a way of associating something to each object of the class (the fundamental group to a topological space) in the above situation, such that if two objects are equivalent, then the same thing will be associated to them. For those readers who have seen a little category theory, a nice example of an invariant is given by a functor from the category of objects you want to be able to separate to some other category.

There are two properties of an invariant which determine its usefulness. First is of course computability; it should be easier to compute the invariant than to directly check whether the original objects are the same. Second is strictness; a very simple invariant is given by associating the number ‘0’ to each of our objects. While very easy to calculate, it obviously does not help us in any way to check for equivalence of objects. The stricter an invariant, the more objects it can separate, but this usually means that it is more difficult to calculate...


The reason we are going through all this is of course that we ourselves are in the above situation. In chapter one we introduced the concept of a bundle and with it that of a bundle isomorphism. We regard two bundles as being equivalent if there exists some bundle isomorphism between those two, and we would like to have a way to check for this equivalence. As we explained, this is exactly what an invariant is, so what we are going to look for in the coming chapter is a vector bundle invariant. The reason we are limiting ourselves to vector bundles at this point is the fact that there is a whole class of invariants which can be easily defined for vector bundles.

4.2 Characteristic classes

In this chapter we will often sloppily denote a vector bundle $F \to V \xrightarrow{\pi} M$ by $V$. Let us begin by just giving a definition:

Definition 4.2.1. A characteristic class $c$ is a natural transformation associating to all vector bundles $V$ over a manifold $M$ some element $c(V)$ of the cohomology group $H^*M$, such that if $V_1 \simeq V_2$ then $c(V_1) = c(V_2)$.

This needs some explaining, of course. First we need to know what is meant by a natural transformation. This is a concept from category theory and a detailed explanation can be found in any book on this subject. In short, if we have two categories $\mathcal{A}$ and $\mathcal{B}$ and two functors $F, G : \mathcal{A} \to \mathcal{B}$ between those categories, then a natural transformation between those functors is a way to relate them to each other. More concretely, for each object $A$ of $\mathcal{A}$ it is a morphism $c(A) : F(A) \to G(A)$ in $\mathcal{B}$, such that for every morphism $f \in \mathcal{A}(A_1, A_2)$, the following diagram commutes:

$$\begin{array}{ccc}
F(A_1) & \xrightarrow{\;c(A_1)\;} & G(A_1) \\
\Big\downarrow {\scriptstyle F(f)} & & \Big\downarrow {\scriptstyle G(f)} \\
F(A_2) & \xrightarrow{\;c(A_2)\;} & G(A_2)
\end{array}$$

How does this apply to our situation? Well, we have two contravariant functors:

$H^* : \mathbf{Man} \to \mathbf{Set}$, the cohomology functor, here seen as a functor to $\mathbf{Set}$

$\mathrm{Vect} : \mathbf{Man} \to \mathbf{Set}$, to be explained below.

Here, $\mathrm{Vect}$ associates to a manifold $M$ the set of all isomorphism classes of vector bundles over $M$, and to a map $f : M \to N$ the map between the two corresponding sets induced by the pullbacks $f^*$ of $f$ from vector bundles over $N$ to vector bundles over $M$. It is thus a contravariant functor.

Note that the condition from the definition, that $V_1 \simeq V_2$ should imply $c(V_1) = c(V_2)$, is exactly what is necessary to make $c$ act on isomorphism classes of vector bundles, instead of vector bundles.
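It may help to spell out what the commuting diagram says for these two particular functors. The following is our own unpacking of the definition, with $c_M$ denoting the component of $c$ at the manifold $M$; because both functors are contravariant, the vertical arrows run from $N$ to $M$.

```latex
% For a smooth map f : M -> N, naturality of c means that
\[
  c_M \circ \mathrm{Vect}(f) \;=\; H^*(f) \circ c_N
  \qquad \text{as maps } \mathrm{Vect}(N) \to H^*(M),
\]
% or, evaluated on (the isomorphism class of) a vector bundle V over N:
\[
  c(f^*V) \;=\; f^*\bigl(c(V)\bigr) \in H^*(M) .
\]
```

In words: the characteristic class of a pulled-back bundle is the pullback of the characteristic class.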


Characteristic classes provide us with a nice invariant for vector bundles, as $V_1 \simeq V_2$ implies $c(V_1) = c(V_2)$. However, we have not yet seen any examples, and our first job will be to construct some. There is more than one way to do so; we will take a differential-geometric approach, using connections, curvature and invariant polynomials. In the next section, we will recast some of the earlier material on connections and curvature into a form more appropriate to this chapter. After that, we will define what invariant polynomials are and use them to construct characteristic classes.

4.3 Covariant derivative revisited

In this section we take another look at the covariant derivative induced by a linear connection on a vector bundle. First we prove a few of its properties. Next we show how to recover the connection from a given differential operator satisfying those properties. This allows us to view the covariant derivative as an alternative presentation of the geometric information contained in a connection. Finally we re-interpret curvature in this new context.

Definition 4.3.1. Let $F \to E \xrightarrow{p} M$ be a vector bundle. A CD on $E$ is an $\mathbb{R}$-linear operator $\nabla : \Gamma(E) \to \Gamma(T^*M \otimes E)$ satisfying the following Leibniz-like property: for all sections $s \in \Gamma(E)$ and functions $f \in C^\infty(M)$,

$$\nabla(fs) = f\nabla s + df \otimes s . \tag{4.1}$$
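As a quick sanity check of this definition, consider the simplest possible situation. We assume (for illustration only) a trivial bundle $E = M \times \mathbb{R}^m$, so that a section $s$ is just an $\mathbb{R}^m$-valued function, and an operator of the form $\nabla s = ds + As$, where $d$ acts componentwise and $A$ is an arbitrary $m \times m$ matrix of 1-forms.

```latex
\begin{align*}
\nabla(fs) &= d(fs) + A(fs) \\
           &= df \otimes s + f\, ds + f\, As
              && \text{(Leibniz rule for $d$; $A$ is $C^\infty(M)$-linear)} \\
           &= df \otimes s + f\, (ds + As) \\
           &= df \otimes s + f\, \nabla s .
\end{align*}
```

So every operator of this form is a CD, whatever the choice of $A$; the derivative $d$ alone is responsible for the term $df \otimes s$.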

Proposition 4.3.2. Given a linear connection on a vector bundle $E$, the covariant derivative $\nabla : \Gamma(E) \to \Gamma(T^*M \otimes E)$ (defined in section 2.5) is a CD.

Proof. From the definition $\nabla = \pi_v \circ d$ we immediately see that $\nabla$ is $\mathbb{R}$-linear. Proof of (4.1):

$$\begin{aligned}
\nabla(fs) &= \pi_v(d(fs)) \\
&= \pi_v(df \otimes s + f\, ds) && \text{(usual Leibniz rule for $d$)} \\
&= \pi_v(df \otimes s) + \pi_v(f\, ds) && \text{(linearity of the projection)} \\
&= \pi_v(df \otimes s) + f\, \pi_v(ds) && \text{(linearity of the connection)} \\
&= df \otimes s + f\, \nabla s && \text{($df \otimes s$ is already vertical)}
\end{aligned}$$

(To see that $df \otimes s$ is vertical, recall the various identifications involved: $df \otimes s$ stands for the bundle map $\xi_x \mapsto (df(\xi_x))\, s(x)$, and $s(x)$ is identified with a vector in the vertical tangent space $VE$.)

Theorem 4.3.3. Given a vector bundle $E$ with a CD $\nabla$, there is a unique connection on $E$ that induces $\nabla$.

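Proof. Choose a local trivialisation of $E$, say over an open subset $U \subseteq M$. Transfer the trivial connection on $U \times F$ to a connection on $p^{-1}(U) \subseteq E$, with induced CD $\tilde\nabla$. Consider $A := \nabla - \tilde\nabla$. By the Leibniz rule for $\nabla$ and $\tilde\nabla$, we have

$$A(fs) = f\nabla s + df \otimes s - f\tilde\nabla s - df \otimes s = f\nabla s - f\tilde\nabla s = fAs \tag{4.2}$$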


for any section $s$ and function $f$, so the value of $As$ at $x$ depends only on the value of $s$ at $x$¹. Because the CDs are $\mathbb{R}$-linear, $A$ is $\mathbb{R}$-linear, so we can write $(As)(x) = L(x)(s(x))$, where $L(x) : F_x \to (T_xM)^* \otimes F_x$ is a linear map – or equivalently, using a familiar identification, $L$ is an $\mathrm{End}(F_x)$-valued 1-form. In local coordinates, this looks like

$$\nabla = \tilde\nabla + A \cong \partial_\mu + A_\mu , \tag{4.3}$$

where $A_\mu$ is the local representative of $L$. Comparing this with (2.38), we see that $\nabla$ is in fact the covariant derivative induced by the connection on $p^{-1}(U)$ with connection form $A_\mu$.

We must show that this local inducing connection does not depend on the choice of trivialisation, in order to glue different such local inducing connections together to form a global inducing connection. Indeed, let $s$ be a section over $U \subseteq M$, $x \in U$ a point ($s(x) \cong (x, f)$ in our trivialisation) and $\xi \in T_xM$. Then for the tangent vector $\alpha := ds(\xi) \in T_{s(x)}E$ (locally represented by $(\xi, y)$), we have

$$\begin{aligned}
\alpha \text{ is horizontal} &\iff y = -A(\xi)(f) && \text{(property of connection form)} \\
&\iff y = \tilde\nabla(s)(\xi) - \nabla(s)(\xi) && \text{(our choice of connection form)} \\
&\iff y = y - \nabla(s)(\xi) && \text{(definition of $\tilde\nabla$)} \\
&\iff \nabla(s)(\xi) = 0 .
\end{aligned}$$

The last statement is independent of the trivialisation. Now, given any tangent vector $\alpha$, we can clearly find a local section $s$ such that $ds(\xi) = \alpha$ (where $\xi = dp(\alpha)$), and we can apply the above criterion. Therefore “being horizontal”, and hence the connection, are independent of trivialisation.

We have proved the ‘existence’ part of the theorem. Uniqueness follows from the fact (see section 2.1.1) that a connection is fully determined by its local connection forms, which in turn are determined by our construction (they are the difference between $\nabla$ and the CD induced by the respective trivial connection).

This result allows us to identify a linear connection with its induced covariant derivative. In fact, what we call a CD, many authors call a connection.

Remarks.

1. From the above proof, we see that the set of linear connections on $E$ is a $T^*M \otimes \mathrm{End}(F)$-torsor: the difference between two connections is an $\mathrm{End}(F)$-valued 1-form (because it does not involve differentiation: it is of order zero or ‘tensorial’), but the connections themselves are not, and there is no canonical choice of ‘zero’ or ‘trivial’ connection.

2. Let’s rephrase the trivialisation-independent description of the connection inducing a given CD: “a section is horizontal at a point iff its covariant derivative there vanishes”. This means that the covariant derivative measures the deviation from being horizontal – not so strange, considering its definition as the vertical projection of the derivative.

¹ One says that $A$ is an operator of order zero. Similarly, a CD is an operator of order one, i.e. it depends only on the behaviour of its argument up to first order (value and derivative) (this follows from the Leibniz rule).


4.3.1 Curvature

Suppose that we have a vector bundle $E$ with a linear connection. The covariant derivative $\nabla$ sends sections of $E$ to $E$-valued 1-forms. Another way to look at sections is as $E$-valued 0-forms. This suggests that $\nabla : \Gamma(E) = \Gamma(\Lambda^0(T^*M) \otimes E) \to \Gamma(\Lambda^1(T^*M) \otimes E) = \Gamma(T^*M \otimes E)$ might be the first map in a cochain complex $\Gamma(\Lambda^p(T^*M) \otimes E)$, analogous to the usual exterior derivative $d$. We define those higher maps by enforcing a generalised Leibniz rule:

$$\begin{aligned}
\nabla &: \Gamma(\Lambda^p(T^*M) \otimes E) \to \Gamma(\Lambda^{p+1}(T^*M) \otimes E) \\
\nabla(\alpha \otimes s) &:= d\alpha \otimes s + (-1)^p\, \alpha \wedge \nabla s
\end{aligned} \tag{4.4}$$

(for a section $s$ and $p$-form $\alpha$)². Observe that the case $p = 0$ is actually the Leibniz rule (4.1) for the covariant derivative, which we proved before.

To get used to this generalised covariant derivative and to prepare for future calculations, we prove a simple property.

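Lemma 4.3.4. For a function $f : M \to \mathbb{R}$ and $E$-valued $p$-form $\rho$, we have $\nabla(f\rho) = df \wedge \rho + f\nabla\rho$.

Proof. First look at the pure case $\rho = \alpha \otimes s$, for a section $s$ and $p$-form $\alpha$:

$$\begin{aligned}
\nabla(f\rho) &= \nabla(f(\alpha \otimes s)) = \nabla((f\alpha) \otimes s) \\
&= d(f\alpha) \otimes s + (-1)^p f\alpha \wedge \nabla s && \text{(definition (4.4))} \\
&= df \wedge \alpha \otimes s + f\, d\alpha \otimes s + f(-1)^p \alpha \wedge \nabla s && \text{(distribution of $d$ over $\wedge$)} \\
&= df \wedge \alpha \otimes s + f\nabla(\alpha \otimes s) && \text{(definition (4.4))} \\
&= df \wedge \rho + f\nabla\rho .
\end{aligned}$$

The general case follows because both sides are $\mathbb{R}$-linear in $\rho$.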

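Now, let’s investigate whether $\nabla$ forms a cochain – that is, whether $\nabla^2 = 0$. First of all, let $\alpha$ be a $p$-form and $s$ a section, and for convenience denote $\nabla s =: \beta_j \otimes t_j$ (for 1-forms $\beta_j$ and sections $t_j$, with summation over $j$ implied); then

$$\begin{aligned}
\nabla^2(\alpha \otimes s) &= \nabla(d\alpha \otimes s + (-1)^p \alpha \wedge \nabla s) \\
&= d^2\alpha \otimes s + (-1)^{p+1} d\alpha \wedge \nabla s + (-1)^p d(\alpha \wedge \beta_j) \otimes t_j - \alpha \wedge \beta_j \wedge \nabla t_j \\
&= -(-1)^p d\alpha \wedge \beta_j \otimes t_j + (-1)^p d\alpha \wedge \beta_j \otimes t_j + (-1)^{2p} \alpha \wedge d\beta_j \otimes t_j - \alpha \wedge \beta_j \wedge \nabla t_j \\
&= \alpha \wedge d\beta_j \otimes t_j - \alpha \wedge \beta_j \wedge \nabla t_j \\
&= \alpha \wedge \nabla(\beta_j \otimes t_j) \\
&= \alpha \wedge \nabla^2 s ,
\end{aligned} \tag{4.5}$$

so we see that $\nabla^2$ acts only on the $E$-part of an $E$-valued $p$-form, and we can reduce to the case $p = 0$. In that case we see from the lemma above and (4.4) (and $d^2 = 0$) that

$$\begin{aligned}
\nabla^2(fs) &= \nabla(df \otimes s + f\nabla s) = d^2 f \otimes s - df \wedge \nabla s + df \wedge \nabla s + f\nabla^2 s \\
&= f\nabla^2 s
\end{aligned} \tag{4.6}$$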

² A general section of the tensor product bundle is a sum of such pure terms $\alpha \otimes s$. Because the right-hand side is $\mathbb{R}$-linear in $\alpha$ and $s$, (4.4) is in fact well-defined.


so at least $\nabla^2$ is an operator of order zero, and we can identify it with a linear map from $E$ to $\Lambda^2(T^*M) \otimes E$, i.e. an $\mathrm{End}(E)$-valued 2-form. However, this need not be the zero form! In fact, it is equal to that other $\mathrm{End}(E)$-valued 2-form we met before:

Proposition 4.3.5. For any $E$-valued $p$-form $\rho$, we have $\nabla^2\rho = R \wedge \rho$, where $R$ is the curvature of the connection associated to $\nabla$³.

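Proof. We already saw that $\nabla^2(\alpha \otimes s) = \alpha \wedge \nabla^2 s$, so it suffices to prove the case $p = 0$. Now, let $s$ be a section. To prove that the forms $\nabla^2 s$ and $Rs$ are equal, it suffices to prove that they are equal at every point. Choose a local trivialisation of $E$; we will treat the trivialisation as an equality. Denote the connection form by $\Lambda$, and write $\nabla s = dx^j \otimes t_j$, where thus $t_j = \partial_j s + \Lambda_j s$. We compute

$$\begin{aligned}
\nabla^2 s &= \nabla(dx^j \otimes t_j) = d^2x^j \otimes t_j - dx^j \wedge \nabla t_j \\
&= -dx^j \wedge dx^i \left( \partial_i(\partial_j s + \Lambda_j s) + \Lambda_i(\partial_j s + \Lambda_j s) \right) \\
&= dx^i \wedge dx^j \left( \partial_i(\Lambda_j s) + \Lambda_i(\partial_j s + \Lambda_j s) \right) && \text{($\partial_i\partial_j$ is symmetric in $i, j$)} \\
&= dx^i \wedge dx^j \left( (\partial_i\Lambda_j)s + \Lambda_j\partial_i s + \Lambda_i\partial_j s + \Lambda_i\Lambda_j s \right) \\
&= dx^i \wedge dx^j \left( (\partial_i\Lambda_j)s + \Lambda_i\Lambda_j s \right) && \text{($\Lambda_j\partial_i + \Lambda_i\partial_j$ is symmetric in $i, j$)} \\
&= (d\Lambda)s + [\Lambda, \Lambda]s && \text{(definition of $d$ and $[\cdot,\cdot]$)} \\
&= Rs && \text{(formula (2.32))} ,
\end{aligned}$$

which was to be proved.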

Concisely said: curvature is the obstruction to the covariant derivative being a cochain map.
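The computation becomes very short in the rank-one case. As an illustration, assume a trivialised line bundle, so that a section $s$ is a function, the connection form $\Lambda$ is an ordinary 1-form, and the generalised covariant derivative (4.4) acts on a form $\rho$ as $\nabla\rho = d\rho + \Lambda \wedge \rho$.

```latex
\begin{align*}
\nabla^2 s &= \nabla(ds + \Lambda s) \\
           &= d(ds + \Lambda s) + \Lambda \wedge (ds + \Lambda s) \\
           &= d^2 s + (d\Lambda)\, s - \Lambda \wedge ds
              + \Lambda \wedge ds + (\Lambda \wedge \Lambda)\, s \\
           &= (d\Lambda)\, s
              && \text{($d^2 = 0$, and $\Lambda \wedge \Lambda = 0$ for an ordinary 1-form).}
\end{align*}
```

So for a line bundle $\nabla^2$ is multiplication by the 2-form $d\Lambda$, which has the same shape as the relation between the field strength and the potential in electromagnetism.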

4.4 Invariant polynomials

Again, let’s start with a definition.

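Definition 4.4.1. Let $\mathfrak{gl}_m(\mathbb{C})$ denote the Lie algebra of $m \times m$ matrices over $\mathbb{C}$. An invariant polynomial on $\mathfrak{gl}_m(\mathbb{C})$ is a polynomial function $P : \mathfrak{gl}_m(\mathbb{C}) \to \mathbb{C}$ such that for all $X, Y \in \mathfrak{gl}_m(\mathbb{C})$, $P(XY) = P(YX)$.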

As above, we have some additional explaining to do about this definition; we have to make clear what we mean by a polynomial function from $\mathfrak{gl}_m(\mathbb{C})$ to $\mathbb{C}$, for the first definition for polynomials on $\mathfrak{gl}_m(\mathbb{C})$ that comes to mind would take values in $\mathfrak{gl}_m(\mathbb{C})$ itself, not in $\mathbb{C}$. What we mean in this case is a map $P : \mathfrak{gl}_m(\mathbb{C}) \to \mathbb{C}$ that is a polynomial in the entries of the matrix in the argument.

As a first example, the trace and determinant of matrices are invariant polynomials.
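This is easily tested on a concrete pair of matrices. The sketch below (our own illustration; the helper names and the sample matrices are arbitrary) checks $P(XY) = P(YX)$ for the trace and the determinant of $2 \times 2$ integer matrices, and shows that a polynomial such as "the top-left entry" fails the test.

```python
def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))

def det2(X):
    """Determinant of a 2x2 matrix."""
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def top_left(X):
    """A polynomial in the entries that is NOT invariant."""
    return X[0][0]

if __name__ == "__main__":
    X = [[1, 2], [3, 4]]
    Y = [[0, 1], [5, -2]]
    XY, YX = matmul(X, Y), matmul(Y, X)
    for name, P in (("trace", trace), ("det", det2), ("top-left entry", top_left)):
        print(f"{name}: P(XY) = {P(XY)}, P(YX) = {P(YX)}")
```

A single numerical check is of course no proof, but one counterexample does suffice to show that the top-left entry is not invariant.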

³ $R$ is understood to work only on the $E$-part of $\rho$, i.e. $R \wedge (\alpha \otimes s) = \alpha \wedge Rs$. We don’t have to worry about the order of $R$ and $\alpha$, because $R$ has even degree, so it commutes with any $p$-form.

Now, let’s see how this notion helps us to construct characteristic classes. Let $V$ be a complex vector bundle, over some manifold $M$, and equipped with a connection. We denote by $R$ the curvature of the connection; it is a 2-form on $M$ with values in $\mathrm{End}(F)$ (recall that $V$ was actually short for $F \to V \xrightarrow{\pi} M$). Choosing a basis for the fibre above some point $x \in M$, $R$ locally (i.e. at $x$) becomes a matrix-valued 2-form. An equivalent way to look at this is to regard $R$ locally as a matrix of ordinary 2-forms. If now $P$ is an invariant polynomial, we can apply it to this matrix and get a form of even degree, which we will denote by $P(R)$. Of course, $R$ is not an ordinary $\mathfrak{gl}_m(\mathbb{C})$ matrix, but a matrix of 2-forms, so we have to state what we mean by applying $P$ to it. In order to make sure the outcome is again a form, the only logical way to do this is to let $P$ act on the entries of the matrix as it would in the ordinary case, but replace the product of two entries with the wedge product of differential forms.⁴

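For example, if $F$ is a two-dimensional vector space, $P_1$ is the trace, $P_2$ is the determinant, and $\alpha$, $\beta$, $\gamma$ and $\delta$ are differential forms, then:

$$\begin{aligned}
P_1\left( \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \right) &= \alpha + \delta \\
P_2\left( \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \right) &= \alpha \wedge \delta - \beta \wedge \gamma .
\end{aligned}$$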

The first is a two-form, the second a four-form.

Finally we note that, since $P$ is invariant, $P(R)$ does not depend on the choice of basis we made earlier⁵. Therefore, we can define the differential form $P(R)$ globally.

So what have we done? We have defined the notion of an invariant polynomial, and we have shown that this provides us with a way to associate a differential form to the curvature on a vector bundle. (Since curvature comes from a connection, we could just as well say that it is a way to associate a differential form to the connection on a vector bundle.) While this may be a nice thing to have, it is not what we were looking for in the first place; we wanted to associate a differential form to the vector bundle itself, rather than to a connection defined upon it.

Luckily for us, there is a very nice theorem to save the day. It tells us that, although our construction of a differential form very much uses the curvature (and thus also the connection), the cohomology class of the result actually does not depend upon the choice of connection at all! As we shall see, this means that any invariant polynomial provides us with a characteristic class by the prescription “apply it to the curvature of any connection on your bundle”. Let us now state and prove this theorem.

Theorem 4.4.2. For any invariant polynomial $P : \mathfrak{gl}_m(\mathbb{C}) \to \mathbb{C}$, the differential form $P(R)$ (where $R$ is the curvature of some connection on a vector bundle $V$) is closed and its de Rham cohomology class is independent of the choice of connection.

Proof. To show that $P(R)$ is closed means to show that $dP(R) = 0$, where $d$ denotes the exterior derivative. We use a few tricks to accomplish this. First, let us assume that $P$ is homogeneous of degree $k$. This can be done without loss of generality, because $d$ is a linear operator, so if $dP_1(R) = 0$ and $dP_2(R) = 0$, then $d(P_1(R) + P_2(R)) = 0$.

⁴ By definition, $P$ is an ordinary, commutative polynomial in the entries of a matrix, i.e. the different ‘variables’ (entries) commute. Therefore, we can only substitute elements of a commutative ring for the variables of $P$ (otherwise, $P(R)$ would not be well-defined). Because the curvature matrix $R$ is a matrix of two-forms, this is no problem; two-forms commute with each other.

⁵ A matrix with respect to any basis can be written with respect to any other basis by a transformation of the form $X \to UXU^{-1}$, and we have $P(UXU^{-1}) = P(U^{-1}UX) = P(X)$.

Now that $P$ is a homogeneous polynomial, we can use the polarization $\tilde P$ of $P$. This is defined as follows: if we view $P(t_1X_1 + \cdots + t_kX_k)$ as a polynomial in $t_1, \ldots, t_k$, then $k!\,\tilde P(X_1, \ldots, X_k)$ is the coefficient of $t_1 \cdots t_k$ in its expansion, seen as a polynomial in $X_1, \ldots, X_k$. To get a feeling for what is going on here, note that $\tilde P$ is a polynomial in $m^2k$ variables (remember that, even though we write $P(X)$, $P$ is a polynomial in the $m^2$ entries of the matrix $X$). Furthermore, $\tilde P$ is symmetric; since the $t_1, \ldots, t_k$ commute, each permutation of $X_1 \cdots X_k$ enters in the coefficient of $t_1 \cdots t_k$. To relate $P$ and $\tilde P$ we have $P(X) = \tilde P(X, \ldots, X)$.

• This is best appreciated by an example: take $P(x, y) = xy + x^2$ as your homogeneous polynomial (of degree 2); then

$$P(t_1x_1 + t_2x_2,\, t_1y_1 + t_2y_2) = \ldots + (x_1y_2 + x_2y_1 + x_1x_2 + x_2x_1)\, t_1t_2 , \tag{4.7}$$

so

$$2!\,\tilde P(x_1, x_2; y_1, y_2) = x_1y_2 + x_2y_1 + x_1x_2 + x_2x_1 , \tag{4.8}$$

and we see that

$$2!\,\tilde P(x, x; y, y) = xy + xy + xx + xx = 2P(x, y) . \tag{4.9}$$

• To give a sketch of the proof of the general fact, we first note that we only need to look at monomials, since if $P(X) = \tilde P(X, \ldots, X)$ holds for two polynomials, it will also hold for the sum. Next, we see, for a monomial $P$, that the terms of $P(t_1X_1 + \cdots + t_kX_k)$ are obtained by picking for each factor of degree one⁶ a term $t_iX_i$ from $t_1X_1 + \cdots + t_kX_k$ and substituting the appropriate entry of $X_i$. However, $\tilde P$ is (up to the factor $k!$) the coefficient of $t_1 \cdots t_k$ in this expansion, so only the terms where we picked a different $i$ for each factor of degree one enter in $\tilde P$. These terms individually all look like $P$, only each factor of degree one has an extra index; furthermore, these indices are all different within a term. So, if we replace $X_i$ by $X$ in $\tilde P$ for all $i$, each term will be equal to $P$, and since there are $k!$ possible orders in which to pick the $i$’s, we exactly compensate the $\frac{1}{k!}$ in the definition of $\tilde P$, and we conclude: $\tilde P(X, \ldots, X) = P(X)$.
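For degree $k = 2$ the polarization can be computed without expanding anything: for a homogeneous polynomial of degree 2, the coefficient of $t_1t_2$ in $P(t_1u + t_2v)$ equals $P(u + v) - P(u) - P(v)$. The sketch below uses this shortcut (which is specific to $k = 2$; the function names are our own) and checks it on the example polynomial $P(x, y) = xy + x^2$ from the first bullet above.

```python
from fractions import Fraction

def polarize2(P):
    """Polarization of a homogeneous polynomial function P of degree 2.

    The arguments u and v of the result are tuples of variables.
    """
    def P_tilde(u, v):
        w = tuple(a + b for a, b in zip(u, v))
        # Coefficient of t1*t2 in P(t1*u + t2*v), divided by 2!.
        return Fraction(P(*w) - P(*u) - P(*v), 2)
    return P_tilde

def P(x, y):
    """The example polynomial P(x, y) = xy + x^2."""
    return x * y + x * x

if __name__ == "__main__":
    P_tilde = polarize2(P)
    u, v = (2, 3), (5, 7)  # arbitrary sample points (x1, y1) and (x2, y2)
    print(P_tilde(u, v))           # (x1*y2 + x2*y1 + 2*x1*x2) / 2
    print(P_tilde(u, u), P(*u))    # these two agree
```

Evaluating on the diagonal recovers $P$, and swapping the two arguments does not change the value, as claimed for the polarization.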

In the same way as we did above for $P$, we can interpret $\tilde P$ as a polynomial in matrices of two-forms instead of a polynomial in ordinary matrices. Even better, since two-forms also commute with three-forms and one-forms, we may substitute a one- or three-form in one of the entries of $\tilde P$ as long as all the other arguments are two-forms. We will use this later on for $\tilde P(dR, R, \ldots, R)$ and $\tilde P(\theta, R, \ldots, R)$, where $\theta$ will be some one-form.⁷

⁶ Example: one of the $x$’s, $y$’s or $z$’s in the monomial given by $P(x, y, z) = xxyzzz$.

⁷ For this all to work, we also need that $\tilde P$ is again invariant in some way; we want the matrix form of $\tilde P(R_1, \ldots, R_k)$ to be independent of the choice of local basis. As a matter of fact, we have:

$$P(ht_1X_1h^{-1} + \cdots + ht_kX_kh^{-1}) = P(h(t_1X_1 + \cdots + t_kX_k)h^{-1}) = P(t_1X_1 + \cdots + t_kX_k) ,$$

so since $\tilde P$ is the coefficient of $t_1 \cdots t_k$ in the above expansion, this is okay.

To complete the proof of the first part of the theorem, choose a point $x_0 \in M$ and work in geodesic coordinates based at this point, so that the connection one-form and (therefore) the exterior derivative of the curvature vanish at $x_0$: $A(x_0) = dR(x_0) = 0$. (Such coordinates can always be found; see [5, chapter 1].) Then we have (to be explained below):

$$dP(R)(x_0) = d\tilde P(R, \ldots, R)(x_0) = k\,\tilde P(dR, R, \ldots, R)(x_0) = 0 , \tag{4.10}$$

where we understand $dP(X)$ to mean $d(P(X))$ (as is the only logical interpretation), and the same for $\tilde P$.

• The first equality is just the property of $\tilde P$ we saw above.

• The second follows from the Leibniz rule for the exterior derivative: since each term in the expansion of $\tilde P(R, \ldots, R)$ is a product of $k$ entries from $R$, it will create $k$ terms in $d\tilde P(R, \ldots, R)$ – one for every possible position of $d$ in this term. From the symmetry of $\tilde P$, they will all appear $k!$ times. On the other hand, $\tilde P(dR, R, \ldots, R)$ is almost the same as $\tilde P(R, \ldots, R)$; only in the $(k-1)!$ terms where the first entry in $\tilde P$ is in first place, we get a $d$ in front of the first factor; for the $(k-1)!$ terms where it is in second place we get a $d$ in front of the second factor, etc. As you see, this is the same as with $d\tilde P(R, \ldots, R)$, with the difference that there we had a factor $k!$ in front. This amounts to exactly the factor $k$ in the second equality. If you do not follow this argument, just try it out for yourself; it will become a lot clearer.

• The third equality follows from the linearity of $\tilde P$ and the fact that, by our choice of frame, $dR(x_0) = 0$.

Because $P$ is invariant, $P(R)$ does not depend on our choice of frame. This implies that (4.10) holds in any frame. Since the point $x_0$ was chosen arbitrarily, it also holds at any point. Thus, $dP(R) = 0$, or, in other words, $P(R)$ is closed.

Next, we need to show that the cohomology class of $P(R)$ does not depend on the connection chosen. This means that for two connections with curvature $R_0$ and $R_1$ the difference $P(R_1) - P(R_0)$ should be exact; there should be some form $TP$ such that $dTP = P(R_1) - P(R_0)$. We will do this by explicitly constructing the form $TP$.

Let two connections $\nabla_0$ and $\nabla_1$ be given. We define a family of linear operators $\nabla_t$ on the space of sections $C^\infty(V)$ of $V$ by:

$$\nabla_t = t\nabla_1 + (1 - t)\nabla_0$$

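We claim that for all $t$, $\nabla_t$ defines a connection on $V$. Since $\nabla_t$ is obviously a linear differential operator, we only have to check that it obeys equation (4.1). Let us do this directly:

$$\begin{aligned}
\nabla_t(fs) &= t\nabla_1(fs) + (1 - t)\nabla_0(fs) \\
&= t\, df \otimes s + tf\nabla_1 s + (1 - t)\, df \otimes s + (1 - t) f\nabla_0 s \\
&= df \otimes s + f(t\nabla_1 + (1 - t)\nabla_0)s = df \otimes s + f\nabla_t s
\end{aligned}$$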

Note that for this to work, we really need the fact that t and 1 − t sum to 1;otherwise the first term would get some factor in front.
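The role of the weights summing to 1 can be checked in a toy setting. The sketch below assumes the trivial line bundle over $\mathbb{R}$, with sections and functions represented as polynomials (lists of coefficients, lowest degree first) and a connection of the form $s \mapsto s' + a\,s$ for a polynomial $a$. This model and all helper names are our own illustration, not part of the text.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim(a + b for a, b in zip(p, q))

def scale(c, p):
    return trim(c * a for a in p)

def mul(p, q):
    out = [0] * (len(p) + len(q))
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def deriv(p):
    return trim(i * a for i, a in enumerate(p) if i > 0)

def connection(a):
    """The operator s -> s' + a*s on the trivial line bundle over R."""
    return lambda s: add(deriv(s), mul(a, s))

def combine(u, nabla0, v, nabla1):
    """The operator u*nabla0 + v*nabla1."""
    return lambda s: add(scale(u, nabla0(s)), scale(v, nabla1(s)))

def obeys_leibniz(nabla, f, s):
    """Check nabla(f s) = f' s + f nabla(s) for one function f and section s."""
    return nabla(mul(f, s)) == add(mul(deriv(f), s), mul(f, nabla(s)))

nabla0 = connection([1, 2])      # a_0 = 1 + 2x
nabla1 = connection([0, 0, 3])   # a_1 = 3x^2
f = [1, 1]                       # f = 1 + x
s = [2, 0, 1]                    # s = 2 + x^2

t = Fraction(1, 3)
nabla_t = combine(1 - t, nabla0, t, nabla1)  # weights sum to 1
bad = combine(1, nabla0, 1, nabla1)          # weights sum to 2

print(obeys_leibniz(nabla_t, f, s), obeys_leibniz(bad, f, s))
```

With weights $1-t$ and $t$ the Leibniz rule holds for this sample; with weights summing to 2 the term $df \otimes s$ picks up a factor 2 and the check fails.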

We also want to find the connection one-form for $\nabla_t$. As might be expected, this is given by

\[ A_t = tA_1 + (1-t)A_0 = A_0 + t\theta , \tag{4.11} \]

where $\theta = A_1 - A_0$. This follows directly from the fact (see the proof of Theorem 4.3.3) that the connection one-form is the difference between the connection and the trivial connection: $A_t = \nabla_t - \nabla$.

Furthermore, we will want to be able to differentiate and integrate differential forms with respect to the parameter $t$, so we have to state what we will mean by that. If $\alpha_t$ is a family of $k$-forms, then for a $k$-tuple $\xi$ of tangent vectors we can regard $\alpha_{\cdot}(\xi)$ as a map $\mathbb{R} \to \mathbb{R}$, sending $t$ to $\alpha_t(\xi)$. This map can of course be differentiated in the usual sense, providing us with a new map $\frac{d}{dt}(\alpha_t(\xi)) : \mathbb{R} \to \mathbb{R}$ for every $k$-tuple $\xi$ of tangent vectors. Now we define the $k$-form $\frac{d\alpha_t}{dt}$ by stating that it sends $\xi$ to $\frac{d}{dt}(\alpha_t(\xi))$. Integration is completely analogous: $\int \alpha_t\,dt$ sends $\xi$ to $\int \alpha_t(\xi)\,dt$, where $\alpha_{\cdot}(\xi)$ is again seen as a function $\mathbb{R} \to \mathbb{R}$ for all $\xi$.

Now we have, by the fundamental theorem of calculus:

\[
\begin{aligned}
P(R_1) - P(R_0) &= P(R_1, \dots, R_1) - P(R_0, \dots, R_0) \\
&= \int_0^1 \frac{d}{dt} P(R_t, \dots, R_t)\,dt ,
\end{aligned}
\tag{4.12}
\]

where $R_t$ is the curvature of the connection $\nabla_t$.

We would like to be able to pull the same trick for the time derivative as we did above for the exterior derivative: pull it through $P$ so that instead of working on the whole thing, it is now only in front of the first argument (all at the cost of a factor $k$). The only property of the exterior derivative we used in our argument there was that it obeys some sort of Leibniz rule. If we can show that $\frac{d}{dt}$ obeys this same property, then the rest of the argument is just the same. Let us therefore look at (for families of $k_1$- and $k_2$-forms $\alpha_t$ and $\beta_t$, and $k_1$- and $k_2$-vectors $\xi_1$ and $\xi_2$):

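\[
\begin{aligned}
\frac{d}{dt}(\alpha_t \wedge \beta_t)(\xi_1, \xi_2)
&= \frac{d}{dt}\Big(\alpha_t(\xi_1)\beta_t(\xi_2) - \alpha_t(\xi_2)\beta_t(\xi_1)\Big) \\
&= \frac{d\alpha_t(\xi_1)}{dt}\beta_t(\xi_2) + \alpha_t(\xi_1)\frac{d\beta_t(\xi_2)}{dt} - \frac{d\alpha_t(\xi_2)}{dt}\beta_t(\xi_1) - \alpha_t(\xi_2)\frac{d\beta_t(\xi_1)}{dt} \\
&= \Big(\frac{d\alpha_t(\xi_1)}{dt}\beta_t(\xi_2) - \frac{d\alpha_t(\xi_2)}{dt}\beta_t(\xi_1)\Big) + \Big(\alpha_t(\xi_1)\frac{d\beta_t(\xi_2)}{dt} - \alpha_t(\xi_2)\frac{d\beta_t(\xi_1)}{dt}\Big) \\
&= \Big(\frac{d\alpha_t}{dt} \wedge \beta_t\Big)(\xi_1, \xi_2) + \Big(\alpha_t \wedge \frac{d\beta_t}{dt}\Big)(\xi_1, \xi_2)
= \Big(\frac{d\alpha_t}{dt} \wedge \beta_t + \alpha_t \wedge \frac{d\beta_t}{dt}\Big)(\xi_1, \xi_2)
\end{aligned}
\]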

So by definition

\[ \frac{d}{dt}(\alpha_t \wedge \beta_t) = \frac{d\alpha_t}{dt} \wedge \beta_t + \alpha_t \wedge \frac{d\beta_t}{dt} , \]

or in other words, the time derivative satisfies the Leibniz rule. Therefore, by the same reasoning as for the exterior derivative, we may conclude:

\[ \frac{dP(R_t, \dots, R_t)}{dt} = kP\Big(\frac{dR_t}{dt}, R_t, \dots, R_t\Big) \]

and, by equation (4.12):

\[ P(R_1) - P(R_0) = k\int_0^1 P\Big(\frac{dR_t}{dt}, R_t, \dots, R_t\Big)\,dt \]

Before we go on, it is a good idea to look back a little at what we have done, and what we still need to accomplish. What we wanted to show was that the cohomology class of the differential form $P(R)$ does not actually depend upon the connection (and thus curvature) chosen. In other words, for two different connections on our bundle, with curvatures $R_0$ and $R_1$, we want to show that $P(R_1) - P(R_0)$ is exact. Or, in yet other words, there exists some differential form $TP(R_0, R_1)$ such that $dTP(R_0, R_1) = P(R_1) - P(R_0)$.

We now claim that the following $TP$ does the job:

\[ TP(R_0, R_1) = k\int_0^1 P(\theta, R_t, \dots, R_t)\,dt \in \Lambda^{2k-1}(T^*M) . \]

What remains to be shown is that

\[ dTP(R_0, R_1) = d\Big(k\int_0^1 P(\theta, R_t, \dots, R_t)\,dt\Big) = k\int_0^1 P\Big(\frac{dR_t}{dt}, R_t, \dots, R_t\Big)\,dt \]

i.e.

\[ dP(\theta, R_t, \dots, R_t) = P\Big(\frac{dR_t}{dt}, R_t, \dots, R_t\Big) . \tag{4.13} \]

Since $P$ is invariant, both sides of equation (4.13) are independent of choice of frame. Let $x_0 \in M$ and $t_0 \in \mathbb{R}$. We use this freedom to set $dR_{t_0}(x_0) = A_{t_0}(x_0) = 0$ (i.e. we work in geodesic coordinates based at $x_0$). Working at the point $x_0$, we have:

\[
\begin{aligned}
\frac{dR_t}{dt} &= \frac{d}{dt}(dA_t + A_t \wedge A_t) = \frac{d}{dt}(dA_0 + t\,d\theta + A_t \wedge A_t) \\
&= d\theta + \frac{dA_t}{dt} \wedge A_t + A_t \wedge \frac{dA_t}{dt} \\
\left.\frac{dR_t}{dt}\right|_{x_0, t_0} &= d\theta .
\end{aligned}
\]

The first equality is just equation (2.32), the second is equation (4.11). The third follows from the fact that $dA_0$ and $d\theta$ do not depend on $t$, as well as from the Leibniz rule we derived above for the time derivative. The last equality uses $A_{t_0}(x_0) = 0$ (by choice of coordinates).

On the other hand:

\[ dP(\theta, R_t, \dots, R_t)|_{x_0, t_0} = P(d\theta, R_t, \dots, R_t)|_{x_0, t_0} , \]

which follows from the Leibniz rule for the exterior derivative and the fact that in these coordinates $dR_{t_0}(x_0) = 0$. Putting these last two together results in:

\[ dP(\theta, R_t, \dots, R_t)|_{x_0, t_0} = P\Big(\frac{dR_t}{dt}, R_t, \dots, R_t\Big)\Big|_{x_0, t_0} , \]

which, noting that the points $x_0$ and $t_0$ were entirely arbitrary, proves equation (4.13). With that, we now have

\[ P(R_1) - P(R_0) = d(TP(R_0, R_1)) \]

completing the proof of the theorem.

Let us once more review what has been going on. We began by defining the concept of a characteristic class; loosely speaking, it is a way to associate a cohomology class on the base manifold to a vector bundle. We wanted to have this concept because it can help us distinguish different vector bundles; it is an example of a vector-bundle invariant. Next, we defined the notion of an invariant polynomial. We showed that we could use such a polynomial as a way to associate a differential form to the curvature of the connection on a vector bundle. We noted, however, that this was not what we originally set out to find; we wanted to associate one cohomology class to a vector bundle, not one for every possible connection on that bundle.

In the end, the solution was found in the theorem we just proved; it shows that, although the differential form associated to the curvature by an invariant polynomial may differ according to the connection chosen, the cohomology class of this differential form does not. Therefore, invariant polynomials provide us with a very easy recipe to create characteristic classes; if $P$ is an invariant polynomial, then we can define a characteristic class $c$ by stating that it associates to a vector bundle $V$ the cohomology class of $P(R)$, where $R$ is the curvature of an arbitrary connection on $V$.

4.5 Chern classes

We will now study an important set of characteristic classes on a complex vector bundle: the Chern classes. They come from a special set of invariant polynomials, and it turns out that any characteristic class of complex vector bundles coming from an invariant polynomial can be written as a polynomial in the Chern classes.

Definition 4.5.1. Let $k \in \mathbb{N}$, then define $c_k : \mathfrak{gl}_m(\mathbb{C}) \to \mathbb{C}$ to be the invariant polynomial given by $c_k(X) = (-2\pi i)^{-k}\,\mathrm{tr}(\Lambda^k X)$. Here we mean by $\Lambda^k X : \Lambda^k\mathbb{C}^m \to \Lambda^k\mathbb{C}^m$ the linear transformation given for any pure $k$-form $v_1 \wedge \dots \wedge v_k$ by

\[ (\Lambda^k X)(v_1 \wedge \dots \wedge v_k) = (Xv_1) \wedge \dots \wedge (Xv_k) . \]

Now the $k$th Chern class is defined to be the characteristic class associated to $c_k$.

As has been the case a couple of times before, we need to justify this definition. What we did is define some maps $c_k : \mathfrak{gl}_m(\mathbb{C}) \to \mathbb{C}$, and call them invariant polynomials, while in fact it is not clear at all that they are invariant, or even polynomials. Let us therefore check this now. First invariance, because this is easy. On any pure $k$-form $\alpha = v_1 \wedge v_2 \wedge \dots \wedge v_k \in \Lambda^k\mathbb{C}^m$,

\[
\begin{aligned}
\Lambda^k(XY)(\alpha) &= XYv_1 \wedge XYv_2 \wedge \dots \wedge XYv_k \\
&= \Lambda^k X(Yv_1 \wedge Yv_2 \wedge \dots \wedge Yv_k) = \Lambda^k X(\Lambda^k Y(\alpha)) ,
\end{aligned}
\]

so by linearity $\Lambda^k(XY) = \Lambda^k(X)\Lambda^k(Y)$, and hence

\[ \mathrm{tr}(\Lambda^k(XY)) = \mathrm{tr}(\Lambda^k X\,\Lambda^k Y) = \mathrm{tr}(\Lambda^k Y\,\Lambda^k X) = \mathrm{tr}(\Lambda^k(YX)) ; \]

$c_k$ is indeed invariant. To see that $c_k(X)$ is a polynomial in the entries of $X$, we will show that the entries of $\Lambda^k X$, when seen as a matrix, are all polynomials in the entries of $X$. Then, because the trace is a polynomial in the entries of $\Lambda^k X$, we will be able to conclude that $c_k(X)$ is a polynomial in the entries of $X$.


We write $e_{(i_1, \dots, i_k)}$ for the vector $e_{i_1} \wedge \dots \wedge e_{i_k} \in \Lambda^k\mathbb{C}^m$, and take as a basis of $\Lambda^k\mathbb{C}^m$ the set of all $e_I$, where $I$ is a strictly increasing $k$-tuple of elements of $\{1, \dots, m\}$. We compute the $I,J$-entry of the matrix of $\Lambda^k X$ with respect to this basis:

\[ \Lambda^k X(e_J) = Xe_{j_1} \wedge \dots \wedge Xe_{j_k} = \Big(\sum_{l_1=1}^{m} X_{l_1, j_1} e_{l_1}\Big) \wedge \dots \wedge \Big(\sum_{l_k=1}^{m} X_{l_k, j_k} e_{l_k}\Big) , \]

(where $X_{a,b}$ is the $a,b$-entry of the matrix of $X$, with respect to the $e_i$-basis) and $(\Lambda^k X)_{I,J}$ is the $I$-component of this:

\[ (\Lambda^k X)_{I,J} = \sum_{\pi \in S_m} \varepsilon(\pi)\, X_{\pi(i_1), j_1} X_{\pi(i_2), j_2} \cdots X_{\pi(i_k), j_k} \tag{4.14} \]

where $S_m$ is the permutation group on $\{1, \dots, m\}$ and $\varepsilon(\pi)$ is the sign of the permutation $\pi$. Thus we have shown that the entries of $\Lambda^k X$ are polynomials in the entries of $X$, and therefore $c_k$ is a polynomial in the entries of $X$. Since it was also invariant, we have justified the statement that $c_k$ is an invariant polynomial.
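Both properties can be tested on small integer matrices. The sketch below reads the sum in (4.14) as running over the rearrangements of the indices $i_1, \dots, i_k$ (our reading of the formula, under which the $I,J$-entry is the $k \times k$ minor of $X$ with rows $I$ and columns $J$). The function names and sample matrices are our own; indices start at 0 rather than 1.

```python
from itertools import combinations, permutations

def sign(perm):
    """Sign of a permutation of 0..k-1, computed by counting inversions."""
    inversions = sum(1 for a in range(len(perm))
                     for b in range(a + 1, len(perm)) if perm[a] > perm[b])
    return -1 if inversions % 2 else 1

def minor(x, rows, cols):
    """Signed sum over rearrangements of the row indices, as in (4.14)."""
    k = len(rows)
    total = 0
    for perm in permutations(range(k)):
        term = sign(perm)
        for l in range(k):
            term *= x[rows[perm[l]]][cols[l]]
        total += term
    return total

def exterior_power(x, k):
    """Matrix of the k-th exterior power in the basis e_I, I strictly increasing."""
    index_sets = list(combinations(range(len(x)), k))
    return [[minor(x, I, J) for J in index_sets] for I in index_sets]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

# Arbitrary sample matrices (assumptions for illustration only).
X = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
Y = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]

# Multiplicativity: the exterior power of XY equals the product of the exterior powers.
multiplicative = exterior_power(matmul(X, Y), 2) == matmul(
    exterior_power(X, 2), exterior_power(Y, 2))

# Invariance of the trace under swapping the factors.
invariant = trace(exterior_power(matmul(X, Y), 2)) == trace(
    exterior_power(matmul(Y, X), 2))

print(multiplicative, invariant, trace(exterior_power(X, 2)))
```

Every entry produced by `minor` is a sum of signed products of entries of `X`, which is the polynomiality claim; the two boolean checks correspond to the multiplicativity and trace identities used for invariance.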

Next, we prove the statement made at the beginning of this paragraph:

Theorem 4.5.2. The polynomials ck generate the ring of invariant polynomials.

Proof. Let $P : \mathfrak{gl}_m(\mathbb{C}) \to \mathbb{C}$ be an invariant polynomial. The idea of the proof will be to use the invariance of $P$ to reduce to diagonal matrices, and then apply the algebraic fact that symmetric polynomials are generated by the elementary symmetric polynomials. We will need to extend this with a continuity argument, because not all matrices are diagonalisable.

In the remainder of the proof, let $G \subseteq \mathfrak{gl}_m(\mathbb{C})$ denote the subset of diagonalisable matrices, and $D \subseteq G$ the set of diagonal ones. Recall that $G$ is dense in $\mathfrak{gl}_m(\mathbb{C})$.

Let us look at the restriction $P|_D$ of $P$ to the diagonal matrices in $\mathfrak{gl}_m(\mathbb{C})$. Since $P$ is invariant, and since the diagonal entries of a diagonal matrix can be interchanged by conjugation, we see that it must be possible to write $P|_D$ as a symmetric polynomial in the diagonal entries.
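The claim that diagonal entries can be interchanged by conjugation is easy to see concretely. The sketch below conjugates a diagonal matrix by a permutation matrix that swaps two basis vectors (such a matrix is its own inverse) and evaluates one sample invariant polynomial, $\mathrm{tr}(X^2)$, before and after. The choice of polynomial and all names are assumptions made for illustration.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def diag(entries):
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def swap_matrix(n, a, b):
    """Permutation matrix exchanging basis vectors a and b; it is its own inverse."""
    perm = list(range(n))
    perm[a], perm[b] = perm[b], perm[a]
    return [[1 if perm[i] == j else 0 for j in range(n)] for i in range(n)]

def P(x):
    """A sample invariant polynomial: tr(X^2)."""
    x2 = matmul(x, x)
    return sum(x2[i][i] for i in range(len(x)))

D = diag([5, 7, 11])
S = swap_matrix(3, 0, 2)
conjugated = matmul(matmul(S, D), S)  # S D S^{-1}, using S^{-1} = S

print(conjugated == diag([11, 7, 5]), P(D), P(conjugated))
```

Since every permutation is a product of such swaps, an invariant polynomial restricted to diagonal matrices cannot distinguish between orderings of the diagonal entries.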

If $X \in G$, we can diagonalise it by conjugation with some matrix $U$. As we know from linear algebra, it then has the eigenvalues of $X$ on the diagonal. This gives us:

\[ P(X) = P(UXU^{-1}) = P|_D(UXU^{-1}) . \]

Since $P|_D$ was a symmetric polynomial in the diagonal entries, and since in this case they are the eigenvalues of $X$, we see that on diagonalisable matrices $P$ equals a symmetric polynomial in the eigenvalues. In other words, we can make the following commutative diagram:

\[
\begin{array}{ccc}
G & \xrightarrow{\;\omega|_G\;} & \mathbb{C}^m/S_m \\
 & {\scriptstyle P|_G}\searrow & \downarrow{\scriptstyle f} \\
 & & \mathbb{C}
\end{array}
\]

where $\omega : \mathfrak{gl}_m(\mathbb{C}) \to \mathbb{C}^m/S_m$ is the map sending a matrix to its eigenvalues (and dividing out by the permutations $S_m$, because we want the eigenvalues to remain unordered), and where $f$ is some map (a symmetric polynomial, as we argued above) making the diagram commutative. As a polynomial, $f$ is equal to $P|_D$, when the last is seen as a polynomial in the diagonal entries alone.

Now, $P$ is continuous (being a polynomial), so $P|_G : G \to \mathbb{C}$ is continuous on a dense subset of $\mathfrak{gl}_m(\mathbb{C})$, so if it can be extended to a continuous function on $\mathfrak{gl}_m(\mathbb{C})$, this extension is unique. But $P$ obviously constitutes one such extension, while another is given by $f \circ \omega$: $\omega$ is continuous (the reader should prove this if it is not familiar) and so is $f$ (it is a polynomial), and we argued above that $f \circ \omega$ extends $P|_G$ (this is expressed by the diagram). We conclude that $P = f \circ \omega$.

We have now shown that $P$ equals some symmetric polynomial in the eigenvalues of its argument. But the ring of symmetric polynomials in $m$ variables is generated by the elementary symmetric polynomials of $m$ variables. To prove the theorem, we just have to show that $c_k$ constitutes the $k$th elementary symmetric polynomial in $m$ variables, applied to the eigenvalues of its argument.

Since the trace of a matrix is nothing but the sum of the diagonal entries, we have to ask ourselves what the diagonal entries of the matrix $\Lambda^k X$ are. We know the form of the entries of $\Lambda^k X$; it is given by equation (4.14). To find the diagonal ones, we have to realise that those are just the ones where $i_l = j_l$ for all $l \in \{1, \dots, k\}$. Furthermore, we want each one to count exactly once, so we should order them (remember that the basis of $\Lambda^k\mathbb{C}^m$ is given by $k$-fold wedge products of basis vectors of $\mathbb{C}^m$, where for each combination of basis vectors, only one ordering is taken). The sum of the diagonal entries then becomes:

\[ \mathrm{tr}(\Lambda^k X) = \sum_{i_1 < \dots < i_k} \sum_{\pi \in S_m} \varepsilon(\pi)\, X_{\pi(i_1), i_1} \cdots X_{\pi(i_k), i_k} . \tag{4.15} \]

Now we use the invariance of $c_k$ to put $X$ in its Jordan normal form. As we know from linear algebra, this is always possible. Now we know that $X$ has its eigenvalues on the diagonal, and maybe has some ones directly above it. The rest of the entries are zero. With this in mind, let us look at equation (4.15) again. We see that, for each term in the inner sum, if we have some entry of $X$ in it which is not on the diagonal of $X$ (i.e. $i_l \neq \pi(i_l)$ for some $l$), then there must be at least one other which is not either. Even better, if there is some $l$ such that $i_l < \pi(i_l)$, then there must be an $l'$ such that $i_{l'} > \pi(i_{l'})$, and vice versa (because $\pi$ is a permutation of a finite set).

But then we see that if a term in the inner sum contains entries that are not on the diagonal, it will contain at least one entry from below the diagonal, and since $X$ was a Jordan matrix, these are all zero. We can conclude that only the terms in the inner sum that contain only entries from the diagonal of $X$ will remain. These are of course precisely the ones for which $\pi$ is the identity. This leaves us with:

\[ \mathrm{tr}(\Lambda^k X) = \sum_{i_1 < \dots < i_k} X_{i_1, i_1} \cdots X_{i_k, i_k} = \sum_{i_1 < \dots < i_k} \lambda_{i_1} \cdots \lambda_{i_k} , \]

where the last equality is true because a Jordan matrix has its eigenvalues on the diagonal.

This shows us that $c_k$ is indeed the $k$th elementary symmetric polynomial in the eigenvalues of $X$ (apart from the normalisation constant $(-2\pi i)^{-k}$, which is not important since all we want to do is generate the ring of symmetric polynomials). They generate the ring of symmetric polynomials in the eigenvalues of $X$, which by our earlier arguments was equal to the ring of invariant polynomials, and so we have proved the theorem.
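The identification of $\mathrm{tr}(\Lambda^k X)$ with the $k$th elementary symmetric polynomial in the eigenvalues can be checked on a small example. The sketch below takes a $3 \times 3$ Jordan matrix with eigenvalues 2, 2, 5, conjugates it by an integer matrix so that it is no longer triangular, and compares traces of exterior powers with elementary symmetric polynomials. The matrices and helper names are our own choices, and the normalisation constant $(-2\pi i)^{-k}$ is left out.

```python
from itertools import combinations, permutations
from math import prod

def sign(perm):
    inversions = sum(1 for a in range(len(perm))
                     for b in range(a + 1, len(perm)) if perm[a] > perm[b])
    return -1 if inversions % 2 else 1

def minor(x, rows, cols):
    """k x k minor of x with the given rows and columns (Leibniz expansion)."""
    k = len(rows)
    return sum(sign(p) * prod(x[rows[p[l]]][cols[l]] for l in range(k))
               for p in permutations(range(k)))

def trace_exterior_power(x, k):
    """Trace of the k-th exterior power: the sum of the principal k x k minors."""
    return sum(minor(x, I, I) for I in combinations(range(len(x)), k))

def elementary_symmetric(values, k):
    return sum(prod(c) for c in combinations(values, k))

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

eigenvalues = [2, 2, 5]
J = [[2, 1, 0], [0, 2, 0], [0, 0, 5]]       # Jordan normal form
U = [[1, 0, 0], [1, 1, 0], [0, 0, 1]]
U_inv = [[1, 0, 0], [-1, 1, 0], [0, 0, 1]]  # inverse of U
X = matmul(matmul(U, J), U_inv)             # same eigenvalues, not triangular

traces = [trace_exterior_power(X, k) for k in (1, 2, 3)]
symmetric = [elementary_symmetric(eigenvalues, k) for k in (1, 2, 3)]
print(traces, symmetric)
```

For $k = 1$ this is the ordinary trace and for $k = m$ the determinant; the intermediate values are, up to sign, the remaining coefficients of the characteristic polynomial.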

A quick recapitulation of what we have done in this chapter: first, we have defined characteristic classes to be special invariants of vector bundles. Next, we have seen that invariant polynomials can be used to generate such classes (this was essentially theorem 4.4.2). Finally, we defined a special set of invariant polynomials, which turned out to generate the ring of invariant polynomials (this was theorem 4.5.2). This showed us that any characteristic class which comes from an invariant polynomial is a polynomial in the Chern classes, making them important for further study.

What we would have liked to do now is work out an example which shows the use of the Chern classes. It would have been nice, for example, to show that the tangent bundle to the sphere is non-trivial. We already noted in section 3.3 that this follows from a theorem of algebraic topology; we could now prove it using characteristic classes. However, we would then be discussing a real vector bundle, instead of a complex one; thus, Chern classes would not apply.

Instead, we would need to introduce the Pontryagin classes of a real vector bundle, which are, loosely speaking, the Chern classes of its complexification. We decided not to do this: we felt that it would not fit in that well, adding significant volume while not offering any new viewpoint. (We do recommend that the reader look up the concept of Pontryagin classes and work out the example for him or herself; it is a nice way to have some practice with the material from this and earlier chapters.) Moreover, we have been working on our bachelor project for quite a while now, and it is time to finish.

Finally, we would like to summarise what we have done. We started by introducing fibre bundles, which we specialised to vector bundles on the one hand and principal bundles on the other. We defined what a connection on these bundles is (with linear and invariant connections being structure preserving kinds, belonging to vector and principal bundles respectively), and how such a connection gives rise to parallel transport and curvature. To connect all this with physics we explained how bundles can be used to describe electromagnetism as a gauge theory. As an explicit example, we discussed the abelian Aharonov-Bohm effect quite thoroughly, while skimming over the more involved subject of Chern-Simons theory.

Going back to mathematics, we introduced the concept of characteristic classes, which can be used as an invariant of vector bundles. We also showed how characteristic classes can be created from invariant polynomials. We specialised these to the Chern classes, which were specific characteristic classes associated to some special polynomials.

Now, it is obvious that one can travel a lot further along the paths we set out on, far beyond the scope of this paper. We encourage the reader (and ourselves) to make this trip, for it is sure to be an interesting one.


Bibliography

[1] D. J. Griffiths, Introduction to electrodynamics. Englewood Cliffs: Prentice-Hall, 3 ed., 1998.

[2] C. von Westenholz, Differential Forms in Mathematical Physics. New York:North-Holland, 1978.

[3] Y. Aharonov and D. Bohm, “Significance of electromagnetic potentials inquantum theory,” Phys. Rev., vol. 115, no. 3, p. 485, 1959.

[4] E. Witten, “Quantum field theory and the jones polynomial,” Commun.Math. Phys., vol. 121, pp. 351–399, 1989.

[5] J. Roe, Elliptic operators, topology and asymptotic methods. Research Notesin Mathematics, Chapman & Hall/CRC, second ed., 1999.
