
Mathematical Methods of Physics III

Lecture Notes

Esko Keski-Vakkuri, Claus Montonen and Marco Panero

Introduction to group theory, topology, and geometry for physics applications


Contents

1 Introduction

2 Group Theory
    2.1 Groups
    2.2 Smallest Finite Groups
        2.2.1 More about the permutation groups Sn
    2.3 Continuous Groups
        2.3.1 Examples of Lie groups
    2.4 Groups Acting on a Set
        2.4.1 Conjugacy classes and cosets
        2.4.2 Normal subgroups and quotient groups

3 Representation Theory of Groups
    3.1 Complex Vector Spaces and Representations
    3.2 Symmetry Transformations in Quantum Mechanics
    3.3 Reducibility of Representations
    3.4 Irreducible Representations
    3.5 Characters

4 Differentiable Manifolds
    4.1 Topological Spaces
        4.1.1 Continuous Maps
    4.2 Homotopy Groups
        4.2.1 Paths and Loops
        4.2.2 Homotopy
        4.2.3 Properties of the Fundamental Group
        4.2.4 Higher Homotopy Groups
    4.3 Differentiable Manifolds
        4.3.1 Manifold with a Boundary
    4.4 The Calculus on Manifolds
        4.4.1 Differentiable Maps
        4.4.2 Tangent Vectors
        4.4.3 Dual Vector Space
        4.4.4 1-forms (i.e. cotangent vectors)
        4.4.5 Tensors
        4.4.6 Tensor Fields
        4.4.7 Differential Map and Pullback
        4.4.8 Flow Generated by a Vector Field
        4.4.9 Lie Derivative
        4.4.10 Differential Forms
        4.4.11 Exterior derivative
        4.4.12 Integration of Differential Forms
    4.5 Integral of an r-form over a manifold M; Stokes’ theorem
        4.5.1 Simplexes in a Euclidean space
        4.5.2 Simplexes and Chains on Manifolds
    4.6 Briefly about Lie Groups and Algebras
        4.6.1 Structure Constants of the Lie Algebra
        4.6.2 The adjoint representation of G

5 Riemannian Geometry (Metric Manifolds)
    5.1 The Metric Tensor
    5.2 The Induced Metric
    5.3 Affine Connection
    5.4 Parallel Transport and Geodesics
    5.5 The Covariant Derivative of Tensor Fields
    5.6 The Transformation Properties of Connection Coefficients
    5.7 The Metric Connection
    5.8 Curvature And Torsion
    5.9 Geodesics of Levi-Civita Connections
    5.10 Lie Derivative And the Covariant Derivative
    5.11 Isometries
    5.12 Killing Vector Fields

6 Semisimple Lie algebras and their unitary representations
    6.1 SU(2)
    6.2 Roots and weights
        6.2.1 Weights
        6.2.2 Roots
        6.2.3 Raising and lowering operators
    6.3 SU(3)
    6.4 Simple roots
        6.4.1 Highest weight
        6.4.2 Simple root
        6.4.3 Dynkin diagrams
        6.4.4 Fundamental weights

A Appendix
    A.1 Some miscellaneous definitions and formulæ
    A.2 Algebras, representations and Young calculus
    A.3 Homotopy groups and exact sequences
    A.4 Classification of simple Lie algebras

1 Introduction

The course Mathematical Methods of Physics III (MMP III) is the third in a series of courses introducing mathematical concepts and tools which are often needed in physics. The first two courses, MMP I-II, focused on analysis, providing tools to analyze and solve the dynamics of physical systems. In MMP III the emphasis is on geometrical and topological concepts, needed for the understanding of the symmetry principles and topological structures of physics. In particular, we will learn group theory (the basic tool to understand symmetry in physics, especially useful in quantum mechanics, quantum field theory and beyond), topology (needed for many subtler effects in quantum mechanics and quantum field theory), and differential geometry (the language of general relativity and modern gauge field theories). Many more sophisticated areas of mathematics are also often used in physics; notable omissions in this course are fiber bundles and complex geometry.

All the course material is available on the course homepage,

http://theory.physics.helsinki.fi/~fymm3/

Let me know of any typos or confusing passages that you find. The lecture notes are based on those prepared and used by Claus Montonen and later revised by Esko Keski-Vakkuri, who lectured the course before me. In practice, they often follow very closely (and often verbatim) the following textbooks:

• H.F. Jones: Groups, Representations and Physics (IOP Publishing, 2nd edition, 1998)

• M. Nakahara: Geometry, Topology and Physics (IOP Publishing, 1990)

• H. Georgi: Lie Algebras in Particle Physics (Addison-Wesley, 1982)

which are common reference textbooks.

Finally, we warmly thank Klaus Larjo and Otso Huuska for their great help in typesetting these notes with LaTeX, and Yann Kempf for pointing out a list of typos in a previous version of these lecture notes.

2 Group Theory

2.1 Groups

Definitions: magmas, semigroups, monoids, groups and Abelian groups

Consider an arbitrary set G = {a, b, ...} and a law of composition (multiplication) which assigns to each ordered pair a, b ∈ G an element a · b. If the set is closed under this law of composition, namely if for all a ∈ G and all b ∈ G the product a · b is also in G, then (G, ·) is called a magma. Furthermore, it is a semigroup if the following condition is also satisfied:

G1 (associativity): for all a, b, c ∈ G, a · (b · c) = (a · b) · c.

A semigroup for which

G2 (existence of the unit element): There is an element e ∈ G such that for all a ∈ G: a · e = e · a = a.

also holds is called a monoid. We define a group to be a monoid for which:

G3 (existence of the inverse): For all a ∈ G there is an element $a^{-1}$ ∈ G such that $a \cdot a^{-1} = a^{-1} \cdot a = e$.

holds. Finally, we define an Abelian group to be a group for which:

AG4 (commutativity): For all a, b ∈ G: a · b = b · a.

In the following, with a slight abuse of notation, we will also use the simpler notation G, rather than (G, ·), to denote the group. The number of elements in the set G is called the order of the group, and is denoted by |G|. If |G| is finite, then G is a finite group. If G is a discrete set (i.e., a set whose elements can be put in correspondence with the natural numbers), then G is a discrete group, while if G is a continuous set, G is a continuous group.

Comments

i) The inverse element is unique: suppose that both b, b′ are inverse elements of a. Then b′ = b′e = b′(ab) = (b′a)b = eb = b.

ii) Note that, by definition, the unit element commutes with all elements of the group. In general, the subset of group elements which commute with all elements of the group is called the center of the group. The center of a group is always an Abelian group; if it only contains the unit element, then the group is said to have a trivial center.

iii) The definition of a group is, essentially, the definition of the group multiplication among its elements, while the elements of the set G do not necessarily need to be specified. In particular, this means that the same group can have different representations.


Examples of groups

Let us list some examples of groups:

1. Z with + (addition) as a multiplication is a discrete Abelian group.

2. R with + as a multiplication is a continuous Abelian group, e = 0. R \ {0} with · (product) is also a continuous Abelian group, e = 1. We had to remove 0 in order to ensure that all elements have an inverse.

3. $Z_2$ = {0, 1} with addition modulo 2 is a finite Abelian group with order 2: the identity element is e = 0, and the inverse of 1 is 1 itself.

4. The set {1, −1} with the ordinary multiplication is another, equivalent way to represent the group $Z_2$; in this case, the identity element is e = 1 and the other group element, −1, is the inverse of itself.

5. Consider Boolean logic: {FALSE, TRUE} with the binary operator XOR (“exclusive or”) as the group multiplication. Recalling that a XOR b is TRUE when a is TRUE and b is FALSE, or vice versa, while it is FALSE when a and b are both TRUE or both FALSE, one can explicitly check that the $Z_2$ group structure is realized by identifying FALSE as the identity element and TRUE as the other element of the group.
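The three realizations of $Z_2$ listed above can be checked mechanically. The following is a minimal sketch, not part of the original notes: the helper name `is_group` is hypothetical, and it simply tests closure and the axioms G1 to G3 by brute force, which is only feasible for small finite sets.

```python
from itertools import product

def is_group(elements, op):
    """Check closure and axioms G1-G3 by brute force on a finite set."""
    elements = list(elements)
    # Closure (magma): every product must land back in the set.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # G1, associativity (semigroup).
    if any(op(a, op(b, c)) != op(op(a, b), c)
           for a, b, c in product(elements, repeat=3)):
        return False
    # G2, unit element (monoid).
    units = [e for e in elements
             if all(op(a, e) == a == op(e, a) for a in elements)]
    if not units:
        return False
    e = units[0]
    # G3, inverses (group).
    return all(any(op(a, b) == e == op(b, a) for b in elements)
               for a in elements)

xor = lambda a, b: a != b                            # Boolean XOR
print(is_group([False, True], xor))                  # {FALSE, TRUE} with XOR
print(is_group([1, -1], lambda a, b: a * b))         # {1, -1} with multiplication
print(is_group([0, 1], lambda a, b: (a + b) % 2))    # {0, 1} with addition mod 2
print(is_group([0, 1], lambda a, b: a * b))          # fails: 0 has no inverse
```

The last call illustrates why 0 had to be removed from R to obtain a group under multiplication.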

It is also instructive to list some examples of structures (G, ·) which are not groups:

1. N with addition as the group multiplication is a monoid (the unit element being the number 0), but it is not a group, because no element (except for 0) admits an inverse in N.

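2. N \ {0} = {1, 2, 3, ...} with addition as the group multiplication is a semigroup, but not a monoid, because it contains no unit element. (By contrast, R with the law a · b = a + b + 1 is not an example of this kind: the law is associative, e = −1 is a unit element and −a − 2 is the inverse of a, so that structure is in fact an Abelian group.)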

3. $R^3$ with the group multiplication law defined by the cross-product of vectors, namely $(a \cdot b)_i = \sum_{j=1}^{3} \sum_{k=1}^{3} \epsilon_{ijk} a_j b_k$, is a magma, but not a semigroup.

4. $R^3$ with the group multiplication law defined by the scalar product of vectors, namely $(a \cdot b) = \sum_{i=1}^{3} a_i b_i$, is not even a magma (because the result of the group multiplication is a real number, rather than an element of $R^3$).
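To see concretely why the cross product fails associativity (G1), it is enough to exhibit one counterexample. The sketch below is an illustration added here, with a hypothetical helper `cross` acting on 3-tuples; it compares the two bracketings for the standard basis vectors.

```python
def cross(a, b):
    """Cross product of two vectors in R^3, given as 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

e1, e2 = (1, 0, 0), (0, 1, 0)

left = cross(cross(e1, e1), e2)    # (e1 x e1) x e2 = 0 x e2
right = cross(e1, cross(e1, e2))   # e1 x (e1 x e2) = e1 x e3
print(left, right)                 # (0, 0, 0) (0, -1, 0)
```

Since the two results differ, the law is closed on $R^3$ but not associative, so the structure is a magma and nothing more.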

Let us also consider the set of mappings (functions) from a set X to a set Y, Map(X, Y) = {f : X → Y | f(x) ∈ Y for all x ∈ X, f(x) is uniquely determined}. There are special cases of functions:

i) f : X → Y is called an injection (or one-to-one) if f(x) ≠ f(x′) ∀x ≠ x′.

ii) f : X → Y is called a surjection (or onto) if ∀y ∈ Y ∃x ∈ X s.t. f(x) = y.


iii) if f is both an injection and a surjection, it is called a bijection.

Now take the composition of maps as a multiplication: fg = f ∘ g, (f ∘ g)(x) = f(g(x)). Then (Map(X, X), ∘) (the set of functions f : X → X with ∘ as the multiplication) is a semigroup. We had to choose Y = X to be able to use the composition, as g maps to Y but f is defined in X. Further, (Map(X, X), ∘) is in fact a monoid with the identity map id : id(x) = x as the unit element. However, it is not a group, unless we restrict to bijections. The set of bijections f : X → X is called the set of permutations of X; we denote Perm(X) = {f ∈ Map(X, X) | f is a bijection}. Every f ∈ Perm(X) has an inverse map, so Perm(X) is a group. However, in general f(g(x)) ≠ g(f(x)), so Perm(X) is in general not an Abelian group. An important special case is when X has a finite number N of elements. In this case Perm(X) is called the symmetric group or the permutation group, and is denoted by $S_N$. The order of $S_N$ is $|S_N| = N!$ (exercise).
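For small N the statements about Perm(X) can be verified by enumeration. This is only a sketch under stated assumptions: the labels are taken as 0, ..., N−1 rather than 1, ..., N, a map is stored as the tuple of its images, and the names `compose` and `perm_group` are hypothetical.

```python
from itertools import permutations
from math import factorial

def compose(f, g):
    """Return f∘g, i.e. x -> f(g(x)); maps on {0,...,N-1} are stored as tuples."""
    return tuple(f[g[x]] for x in range(len(g)))

def perm_group(N):
    """All bijections of {0, ..., N-1} onto itself."""
    return list(permutations(range(N)))

# |S_N| = N! for the first few N.
orders_match = all(len(perm_group(N)) == factorial(N) for N in range(1, 6))

# Look for pairs in S_3 that do not commute.
S3 = perm_group(3)
non_commuting = [(f, g) for f in S3 for g in S3
                 if compose(f, g) != compose(g, f)]
print(orders_match, len(non_commuting) > 0)
```

Enumeration does not replace the counting argument asked for in the exercise; it only confirms the result for the values of N that were tried.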

Definitions

i) We denote $g^2 = gg$, $g^3 = ggg = g^2 g$, ..., $g^n = \overbrace{g \cdots g}^{n}$ for products of the element g ∈ G.

ii) The order n of a generic element g ∈ G is the smallest positive integer n such that $g^n = e$.

2.2 Smallest Finite Groups

Let us find all the groups of order n for n = 1, ..., 4. First we need a handy definition. A homomorphism in general is a mapping f from one set X to another set Y preserving some structure. Further, if f is a bijection, it is called an isomorphism. We will see several examples of such structure-preserving mappings. The first one is the one that preserves the multiplication structure of groups.

Definition. A mapping f : G → H between groups G and H is called a group homomorphism if for all $g_1, g_2 \in G$, $f(g_1 g_2) = f(g_1) f(g_2)$. Further, if f is also a bijection, it is called a group isomorphism. If there exists a group isomorphism between groups G and H, we say that the groups are isomorphic, and denote G ≅ H. Isomorphic groups have an identical structure, so they can be identified: there is only one abstract group of that structure.

Example. Take G = $R_+$ with “·” and H = R with “+” as a multiplication. Define the mapping f : G → H, f(x) = ln x. Now f is a group homomorphism, because f(xy) = ln(xy) = ln x + ln y = f(x) + f(y). In fact, f is also a group isomorphism, because it is a bijection: $f^{-1}(x) = e^x$.
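As a quick numerical illustration of this example (a sketch only; the sample values x and y are arbitrary choices, and floating-point arithmetic means equality is tested up to rounding):

```python
import math

f = math.log        # f: (R_+, ·) -> (R, +)
f_inv = math.exp    # candidate inverse map

x, y = 2.5, 7.0
print(math.isclose(f(x * y), f(x) + f(y)))   # homomorphism property
print(math.isclose(f_inv(f(x)), x))          # f_inv undoes f
```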


Now let us move ahead to groups of order n.

Order n = 1. This is the trivial group G = {e}, $e^2 = e$.

Order n = 2. Now G = {e, a}, a ≠ e. The multiplications are $e^2 = e$, ea = ae = a. For $a^2$, let us first try $a^2 = a$. But then $a = ae = a(aa^{-1}) = a^2 a^{-1} = aa^{-1} = e$, a contradiction. So the only possibility is $a^2 = e$. We can summarize this in the multiplication table or Cayley table:

      | e  a
    --+------
    e | e  a
    a | a  e

This group is called $Z_2$. You have already seen another realization of it: the set {0, 1} with addition modulo 2 as the multiplication. Yet another realization of the group is {1, −1} with product as the multiplication. This illustrates what was said before: for a given abstract group, there can be many ways to describe it. Consider one more realization: the permutation group $S_2$ = Perm({1, 2}). Its elements are

\[
e = \begin{pmatrix} 1 & 2 \\ \downarrow & \downarrow \\ 1 & 2 \end{pmatrix} \equiv \begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix},
\qquad
a = \begin{pmatrix} 1 & 2 \\ \downarrow & \downarrow \\ 2 & 1 \end{pmatrix} \equiv \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}.
\]

The arrows indicate how the numbers are permuted; we usually use the notation on the right-hand side, without the arrows. For products of permutations, the order in which they are performed is “right to left”: we first perform the permutation on the far right, then continue with the next one to the left, and so on. This convention is inherited from that with composite mappings: (fg)(x) = f(g(x)). We can now easily show that $S_2$ is isomorphic with $Z_2$. Take e.g. {1, −1} with the product as the realization of $Z_2$. Then we define the mapping $i : Z_2 \to S_2$: i(1) = e, i(−1) = a. It is easy to see that i is a group homomorphism, and it is obviously a bijection. Hence it is an isomorphism, and $Z_2 \cong S_2$. There is only one abstract group of order 2.

Order n = 3. Consider now the set G = {e, a, b}. It turns out that there is again only one possible group of order 3. We can try to determine it by completing its multiplication table:

      | e  a  b
    --+---------
    e | e  a  b
    a | a  ?  ?
    b | b  ?  ?

First, guess ab = b. But then $a = a(bb^{-1}) = (ab)b^{-1} = bb^{-1} = e$, a contradiction. Try then ab = a. But now $b = (a^{-1}a)b = a^{-1}(ab) = a^{-1}a = e$, again a contradiction. So ab = e. Similarly, ba = e. Then, guess $a^2 = a$. Now $a = aaa^{-1} = aa^{-1} = e$, which doesn’t work. How about $a^2 = e$? Now $b = a^2 b = a(ab) = ae = a$, which doesn’t work either. So $a^2 = b$. Similarly, one can show $b^2 = a$. Now we have worked out the complete multiplication table:

      | e  a  b
    --+---------
    e | e  a  b
    a | a  b  e
    b | b  e  a

Our group is actually called $Z_3$. We can simplify the notation and call $b = a^2$, so $Z_3 = \{e, a, a^2\}$. $Z_3$ and $Z_2$ are special cases of cyclic groups $Z_n = \{e, a, a^2, \ldots, a^{n-1}\}$. They have a single “generating element” a with order n: $a^n = e$. The multiplication rules are $a^p a^q = a^{(p+q) \bmod n}$, $(a^p)^{-1} = a^{n-p}$. Sometimes in the literature cyclic groups are denoted by $C_n$. One possible realization of them is by complex numbers, $Z_n = \{e^{2\pi i k/n} \mid k = 0, 1, \ldots, n-1\}$ with product as a multiplication. This also shows their geometric interpretation: $Z_n$ is the symmetry group of rotations of a regular directed polygon with n sides (see H.F. Jones). You can easily convince yourself that $Z_n = \{0, 1, \ldots, n-1\}$ with addition modulo n is another realization.
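The multiplication rules of the cyclic group can be checked numerically in the complex-number realization. This is a sketch under assumptions: n = 5 is an arbitrary choice, the helper names are hypothetical, and equality of complex numbers is tested up to a small tolerance because of rounding.

```python
import cmath

def roots_of_unity(n):
    """The realization Z_n = { exp(2*pi*i*k/n) | k = 0, ..., n-1 }."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]

def close(z, w, tol=1e-12):
    """Equality of complex numbers up to rounding errors."""
    return abs(z - w) < tol

n = 5
z = roots_of_unity(n)   # z[p] plays the role of a^p

# a^p a^q = a^{(p+q) mod n}
rule_holds = all(close(z[p] * z[q], z[(p + q) % n])
                 for p in range(n) for q in range(n))
# (a^p)^{-1} = a^{n-p}; the index is reduced mod n so that p = 0 maps to e
inverse_holds = all(close(z[p] * z[(n - p) % n], 1) for p in range(n))
print(rule_holds, inverse_holds)
```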

Order n = 4. So far the groups have been uniquely determined, but we’ll see that from order 4 onwards we’ll have more possibilities. Let us start with a definition.

Definition. A direct product $G_1 \times G_2$ of two groups is the set of all pairs $(g_1, g_2)$ where $g_1 \in G_1$ and $g_2 \in G_2$, with the multiplication $(g_1, g_2) \cdot (g_1', g_2') = (g_1 g_1', g_2 g_2')$. The unit element is $(e_1, e_2)$ where $e_i$ is the unit element of $G_i$ (i = 1, 2). It is easy to see that $G_1 \times G_2$ is a group, and its order is $|G_1 \times G_2| = |G_1| |G_2|$.

Now we can immediately find at least one group of order 4: the direct product $Z_2 \times Z_2$. Denote $Z_2 = \{e, f\}$ with $f^2 = e$, and introduce a shorter notation for the pairs: E = (e, e), A = (e, f), B = (f, e), C = (f, f). We can easily find the multiplication table,

      | E  A  B  C
    --+------------
    E | E  A  B  C
    A | A  E  C  B
    B | B  C  E  A
    C | C  B  A  E

The group $Z_2 \times Z_2$ is sometimes also called “Vierergruppe” and denoted by $V_4$.


There is another group of order 4, namely the cyclic group $Z_4 = \{e, a, a^2, a^3\}$. It is not isomorphic with $Z_2 \times Z_2$. (You can easily check that it has a different multiplication table.) It can be shown (exercise) that there are no other groups of order 4, just the above two.
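One way to confirm that $Z_4$ and $Z_2 \times Z_2$ are not isomorphic is to compare the orders of their elements, since an isomorphism maps each element to an element of the same order. The sketch below assumes the additive realizations ($Z_4$ as {0, 1, 2, 3} modulo 4, and $Z_2 \times Z_2$ as pairs added componentwise modulo 2); the helper name `element_order` is hypothetical.

```python
from itertools import product

def element_order(g, op, e):
    """Smallest n >= 1 with g^n = e."""
    n, power = 1, g
    while power != e:
        power = op(power, g)
        n += 1
    return n

# Z_4 realized as {0, 1, 2, 3} with addition modulo 4.
z4 = [0, 1, 2, 3]
z4_op = lambda a, b: (a + b) % 4

# Z_2 x Z_2 realized as pairs with componentwise addition modulo 2.
v4 = list(product([0, 1], repeat=2))
v4_op = lambda a, b: ((a[0] + b[0]) % 2, (a[1] + b[1]) % 2)

z4_orders = sorted(element_order(g, z4_op, 0) for g in z4)
v4_orders = sorted(element_order(g, v4_op, (0, 0)) for g in v4)
print(z4_orders)   # [1, 2, 4, 4]
print(v4_orders)   # [1, 2, 2, 2]
```

$Z_4$ contains elements of order 4 while every non-trivial element of $Z_2 \times Z_2$ has order 2, so no bijection between them can preserve the multiplication.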

Order n ≥ 5. As can be expected, there are more possible non-isomorphic groups of higher finite order. We will not attempt to categorize them much further, but will mention some interesting facts and examples.

Definition. If H is a non-empty subset of the group G such that

i) $\forall\, h_1, h_2 \in H$: $h_1 h_2 \in H$

ii) $\forall\, h \in H$: $h^{-1} \in H$,

then H is called a subgroup of G. Note that, as a result of i) and ii), every subgroup must include the unit element e of G.

Trivial examples of subgroups are {e} and G itself. Other subgroups H are called proper subgroups of G. For those, |H| ≤ |G| − 1.

Example. Take G = $Z_3$. Are there any proper subgroups? The only possibilities could be H = {e, a} or H = {e, $a^2$}. Note that in order for H to be a group of order 2, it should be isomorphic with $Z_2$. But since $a^2 \neq e$ (because $a^3 = e$) and $(a^2)^2 = a^3 a = a \neq e$, neither is. So $Z_3$ has no proper subgroups.

2.2.1 More about the permutation groups Sn

It is worth spending some more time on the permutation groups, because on one hand they have a special status in the theory of finite groups (for a reason that will be explained later) and on the other hand they often appear in physics.

Let X = {1, 2, ..., n}. Denote a bijection of X by $p : X \to X$, $i \mapsto p(i) \equiv p_i$. We will now generalize our notation for the elements of $S_n$; you already saw it for $S_2$. We denote a $P \in S_n \equiv \mathrm{Perm}(X)$ by

\[
P = \begin{pmatrix} 1 & 2 & \cdots & n \\ p_1 & p_2 & \cdots & p_n \end{pmatrix}.
\]

Recall that the multiplication rule for permutations was the composite operation, with the “right to left” rule. In general, the multiplication is not commutative:

\[
PQ = \begin{pmatrix} 1 & 2 & \cdots & n \\ p_1 & p_2 & \cdots & p_n \end{pmatrix}
\begin{pmatrix} 1 & 2 & \cdots & n \\ q_1 & q_2 & \cdots & q_n \end{pmatrix} \neq QP .
\]


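So, in general, $S_n$ is not an Abelian group (except $S_2$). For example, in $S_3$,

\[
\begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}
\begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}
= \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix} \tag{1}
\]

but

\[
\begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}
\begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}
= \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}, \tag{2}
\]

which is not the same.

The identity element is

\[
E = \begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix}
\]

and the inverse of P is

\[
P^{-1} = \begin{pmatrix} p_1 & p_2 & \cdots & p_n \\ 1 & 2 & \cdots & n \end{pmatrix}.
\]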
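The “right to left” product and the inverse can be reproduced in a few lines. In this sketch, a permutation in two-row notation is stored as a dictionary mapping each top-row label to the entry below it; this representation and the helper names `compose` and `inverse` are choices made for the illustration, not something prescribed by the notes.

```python
def compose(P, Q):
    """Product PQ with the "right to left" rule: first Q, then P."""
    return {i: P[Q[i]] for i in Q}

def inverse(P):
    """Swap the two rows of the two-row notation."""
    return {p_i: i for i, p_i in P.items()}

# The two S_3 elements used in Eqs. (1) and (2).
P = {1: 1, 2: 3, 3: 2}
Q = {1: 3, 2: 1, 3: 2}
E = {1: 1, 2: 2, 3: 3}

print(compose(P, Q))   # {1: 2, 2: 1, 3: 3}, the right-hand side of Eq. (1)
print(compose(Q, P))   # {1: 3, 2: 2, 3: 1}, the right-hand side of Eq. (2)
print(compose(P, inverse(P)) == E and compose(inverse(Q), Q) == E)
```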

An alternative and very useful way of writing permutations is the cycle notation. In this notation we follow the permutations of one label, say 1, until we get back to where we started (in this case back to 1), giving one cycle. Then we start again from a label which was not already included in the previously found cycle, and find another cycle, and so on until all the labels have been accounted for. The original permutation has then been decomposed into a certain number of disjoint cycles. This is best illustrated by an example. The permutation

\[
\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 3 & 1 \end{pmatrix}
\]

of $S_4$ decomposes into the disjoint cycles 1 → 2 → 4 → 1 and 3 → 3. Reordering the columns we can write it as

\[
\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 3 & 1 \end{pmatrix}
= \left(\begin{array}{ccc|c} 1 & 2 & 4 & 3 \\ 2 & 4 & 1 & 3 \end{array}\right)
= \begin{pmatrix} 1 & 2 & 4 \\ 2 & 4 & 1 \end{pmatrix}
\begin{pmatrix} 3 \\ 3 \end{pmatrix}.
\]

In a cycle the bottom row is superfluous: all the information about the cycle (like 1 → 2 → 4 → 1) is already included in the order of the labels in the top row. So we can shorten the notation by simply omitting the bottom row. The above example is then written as

\[
\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 3 & 1 \end{pmatrix} = (124)(3) .
\]

As a further abbreviation of the notation, we omit the 1-cycles (like (3) above), it being understood that any labels not appearing explicitly just transform into themselves. With the new shortened cycle notation, (1) reads

\[
(23)(132) = (12) \tag{3}
\]

and (2) reads as

\[
(132)(23) = (13) . \tag{4}
\]

In general, any permutation can always be written as the product of disjoint cycles. What’s more, the cycles commute since they operate on different indices, hence the cycles can be written in any order in the product. In listing the individual permutations of $S_n$ it is convenient to group them by cycle structure, i.e. by the number and length of cycles. For illustration, we list the first permutation groups $S_n$:

n = 2: $S_2$ = {E, (12)}.

n = 3: $S_3$ = {E, (12), (13), (23), (123), (132)}.

n = 4: $S_4$ = {E, (12), (13), (14), (23), (24), (34), (12)(34), (13)(24), (14)(23),
(123), (132), (124), (142), (134), (143), (234), (243),
(1234), (1243), (1324), (1342), (1423), (1432)}.

You can see that the notation makes it quite easy and systematic to write down all the elements in a concise fashion.
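The procedure for finding the disjoint cycles, and the grouping of $S_4$ by cycle structure, can be sketched as follows. The dictionary representation of a permutation and the helper name `cycles` are assumptions made for this illustration.

```python
from collections import Counter
from itertools import permutations

def cycles(P):
    """Disjoint cycles of a permutation {label: image}, with 1-cycles omitted."""
    seen, result = set(), []
    for start in sorted(P):
        if start in seen:
            continue
        cycle, label = [], start
        while label not in seen:    # follow start -> P(start) -> ... back to start
            seen.add(label)
            cycle.append(label)
            label = P[label]
        if len(cycle) > 1:
            result.append(tuple(cycle))
    return result

print(cycles({1: 2, 2: 4, 3: 3, 4: 1}))   # [(1, 2, 4)]

# Group the elements of S_4 by cycle structure (sorted tuple of cycle lengths).
labels = (1, 2, 3, 4)
structure = Counter(
    tuple(sorted(len(c) for c in cycles(dict(zip(labels, bottom)))))
    for bottom in permutations(labels)
)
for lengths, count in sorted(structure.items()):
    print(lengths, count)
```

The counts per cycle structure should match the list above: one identity, six 2-cycles, three products of two 2-cycles, eight 3-cycles and six 4-cycles.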

The simplest non-trivial permutations are the 2-cycles, which interchange two labels. In fact, any permutation can be built up from products of 2-cycles. First, an r-cycle can be written as the product of r − 1 overlapping 2-cycles:

\[
(n_1 n_2 \ldots n_r) = (n_1 n_2)(n_2 n_3) \cdots (n_{r-1} n_r) .
\]

Then, since any permutation is a product of cycles, it can be written as a product of 2-cycles. This allows us to classify permutations as “even” and “odd”. First, a 2-cycle, which involves just one interchange of labels, is counted as odd. Then, a product of 2-cycles is even (odd) if there is an even (odd) number of 2-cycles. Thus, an r-cycle is even (odd) if r is odd (even), since it is a product of r − 1 2-cycles. Finally, a generic product of cycles is even if it contains an even number of odd cycles, otherwise it is odd. In particular, the identity E is even. This allows us to find an interesting subgroup of $S_n$, the alternating group $A_n$, which consists of the even permutations of $S_n$. The order of $A_n$ is $|A_n| = \frac{1}{2} |S_n|$ (for n ≥ 2). Hence $A_n$ is a proper subgroup of $S_n$. Note that the odd permutations do not form a subgroup, since any subgroup must contain the identity E, which is even.
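The parity rule can be turned into a short computation: each r-cycle contributes a factor $(-1)^{r-1}$. The following sketch uses a hypothetical helper `sign` that takes the bottom row of the two-row notation, and counts the even permutations for a few small n.

```python
from itertools import permutations
from math import factorial

def sign(bottom):
    """+1 for an even permutation, -1 for an odd one.

    `bottom` is the bottom row (p_1, ..., p_n) of the two-row notation.
    An r-cycle is a product of r - 1 two-cycles, so it contributes (-1)**(r - 1).
    """
    P = dict(zip(range(1, len(bottom) + 1), bottom))
    seen, result = set(), 1
    for start in P:
        r, label = 0, start
        while label not in seen:    # walk once around the cycle containing `start`
            seen.add(label)
            label = P[label]
            r += 1
        if r > 0:
            result *= (-1) ** (r - 1)
    return result

for n in range(2, 6):
    A_n = [p for p in permutations(range(1, n + 1)) if sign(p) == 1]
    print(n, len(A_n), factorial(n) // 2)   # the last two columns should agree
```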

To keep up a promise, we now mention the reason why permutation groups havea special status among finite groups. This is because of the following theorem (westate it without proof).

Theorem 2.1 (Cayley’s Theorem) Every finite group of order n is isomorphic to a subgroup of $S_n$.


Thus, because of Cayley’s theorem, in principle we know everything about finite groups if we know everything about permutation groups and their subgroups.
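Although the theorem is stated here without proof, the standard construction behind it is easy to illustrate: each group element g is mapped to the permutation x ↦ g · x of the group elements themselves. The sketch below does this for $Z_3$ in its additive realization; the helper names `left_regular` and `compose` are hypothetical, and a permutation is stored as a tuple of indices.

```python
def left_regular(elements, op):
    """Map each g to the permutation x -> g·x, stored as a tuple of indices."""
    index = {x: i for i, x in enumerate(elements)}
    return {g: tuple(index[op(g, x)] for x in elements) for g in elements}

def compose(p, q):
    """(p∘q)(i) = p(q(i)) for permutations stored as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

# Z_3 realized as {0, 1, 2} with addition modulo 3.
Z3 = [0, 1, 2]
add = lambda a, b: (a + b) % 3
L = left_regular(Z3, add)
print(L)   # {0: (0, 1, 2), 1: (1, 2, 0), 2: (2, 0, 1)}

# Homomorphism property L(g·h) = L(g)∘L(h), and injectivity.
is_hom = all(L[add(g, h)] == compose(L[g], L[h]) for g in Z3 for h in Z3)
is_injective = len(set(L.values())) == len(Z3)
print(is_hom, is_injective)
```

The three permutations obtained this way form a subgroup of $S_3$ with the same multiplication table as $Z_3$.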

As for physics uses of finite groups, the classic example is their role in solid state physics, where they are used to classify general crystal structures (the so-called crystallographic point groups). They are also useful in classical mechanics, reducing the number of relevant degrees of freedom in systems with symmetry. We may later study an example, finding the vibrational normal modes of a water molecule. In addition to these canonical examples, they appear in different places and roles in all kinds of areas of modern physics.

2.3 Continuous Groups

Continuous groups have an uncountable infinity of elements. The dimension of a continuous group G, denoted dim G, is the number of continuous real parameters (coordinates) which are needed to uniquely parameterize its elements. In the product g′′ = g′g, the coordinates of g′′ must be continuous functions of the coordinates of g and g′. (We will make this more precise later when we discuss topology. The above requirement means that the set of real parameters of the group must be a manifold, in this context called the group manifold.)

Examples.

1. The set of real numbers R with addition as the product is a continuous group; dim R = 1. Simple generalization: $R^n = \{(r_1, \ldots, r_n) \mid r_i \in R,\ i = 1, \ldots, n\} = \overbrace{R \times \cdots \times R}^{n \text{ times}}$, with product $(r_1, \ldots, r_n) \cdot (r_1', \ldots, r_n') = (r_1 + r_1', \ldots, r_n + r_n')$; dim $R^n$ = n.

2. The set of complex numbers C with addition as the product, dim C = 2 (recall that we count the number of real parameters).

3. The set of n × n real matrices M(n, R) with addition as the product, dim M(n, R) = $n^2$. Note the group isomorphism $M(n, R) \cong R^{n^2}$.

4. $U(1) = \{z \in C \mid |z|^2 = 1\}$, with multiplication of complex numbers as the product. dim U(1) = 1 since there is only one real parameter θ ∈ [0, 2π), $z = e^{i\theta}$. Note a difference between U(1) and R: both have dim = 1, but the group manifold of the former is the circle $S^1$ while the group manifold of the latter is the whole infinite x-axis. A generalization of U(1) is $U(1)^n = \overbrace{U(1) \times \cdots \times U(1)}^{n \text{ times}}$, with product $(e^{i\theta_1}, \ldots, e^{i\theta_n}) \cdot (e^{i\theta_1'}, \ldots, e^{i\theta_n'}) = (e^{i(\theta_1 + \theta_1')}, \ldots, e^{i(\theta_n + \theta_n')})$. The group manifold of $U(1)^n$ is an n-torus $\overbrace{S^1 \times \cdots \times S^1}^{n}$. Again, the n-torus is different from $R^n$: on the former it is possible to draw loops which cannot be smoothly contracted to a point, while this is not possible on $R^n$.

All of the above examples are actually examples of Lie groups. Their group manifolds must be differentiable manifolds, meaning that we can take smooth (partial) derivatives of the group elements with respect to the real parameters. We’ll give a precise definition later; for now we’ll just focus on listing further examples of them.

2.3.1 Examples of Lie groups

1. The group of general linear transformations $GL(n, R) = \{A \in M(n, R) \mid \det A \neq 0\}$, with matrix multiplication as the product; dim GL(n, R) = $n^2$. While GL(n, R) and M(n, R) have the same dimension, their group manifolds have a different structure. To parameterize the elements of M(n, R), only one coordinate neighborhood is needed ($R^{n^2}$ itself). The coordinates are the matrix entries $a_{ij}$:

\[
A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix}.
\]

In GL(n, R), the condition det A ≠ 0 removes a hypersurface (a set of measure zero) from $R^{n^2}$, dividing it into two disconnected coordinate regions. In each region, the entries $a_{ij}$ are again suitable coordinates.

2. A generalization of the above is GL(n, C) = {n × n complex matrices with non-zero determinant}, with matrix multiplication as the product. This has dim GL(n, C) = $2n^2$. Note that GL(n, R) is a (proper) subgroup of GL(n, C). The following examples are subgroups of these two.

3. The group of special linear transformations $SL(n, R) = \{A \in GL(n, R) \mid \det A = 1\}$. It is a subgroup of GL(n, R) since det(AB) = det A det B. The dimension is dim SL(n, R) = $n^2 - 1$.

4. The orthogonal group $O(n, R) = \{A \in GL(n, R) \mid A^T A = 1_n\}$, i.e. the group of orthogonal matrices. ($1_n$ denotes the n × n unit matrix.) $A^T$ is the transpose of the matrix A:

\[
A^T = \begin{pmatrix} a_{11} & \cdots & a_{n1} \\ \vdots & \ddots & \vdots \\ a_{1n} & \cdots & a_{nn} \end{pmatrix},
\]

i.e. if $A = (a_{ij})$ then $A^T = (a_{ji})$: the rows and columns are interchanged. Let us prove that O(n, R) is a subgroup of GL(n, R):

a) $1_n^T = 1_n$, so the unit element ∈ O(n, R).

b) If A, B are orthogonal, then AB is also orthogonal: $(AB)^T (AB) = B^T A^T A B = B^T B = 1_n$.

c) Every A ∈ O(n, R) has an inverse in O(n, R): $(A^{-1})^T = (A^T)^{-1}$, so $(A^{-1})^T A^{-1} = (A^T)^{-1} A^{-1} = (A A^T)^{-1} = ((A^T)^T A^T)^{-1} = 1_n^{-1} = 1_n$. (Here we used that $A^T A = 1_n$ means $A^T = A^{-1}$, so that also $A A^T = 1_n$.)

Note that orthogonal matrices preserve the length of a vector. The length of a vector $\vec{v}$ is $\sqrt{v_1^2 + \cdots + v_n^2} = \sqrt{\vec{v}^T \vec{v}}$. A vector $\vec{v}$ gets mapped to $A\vec{v}$, so its length gets mapped to $\sqrt{(A\vec{v})^T (A\vec{v})} = \sqrt{\vec{v}^T A^T A \vec{v}} = \sqrt{\vec{v}^T \vec{v}}$, the same. We can interpret the orthogonal group as the group of rotations (and reflections) in $R^n$.

What is the dimension of O(n, R)? A ∈ GL(n, R) has $n^2$ independent parameters, but the orthogonality requirement $A^T A = 1_n$ imposes relations between the parameters. Let us count how many relations (equations) there are. The diagonal entries of $A^T A$ must be equal to one, which gives n equations; the entries above the diagonal must vanish, which gives a further n(n − 1)/2 equations. The same condition is then automatically satisfied by the “below the diagonal” entries, because the condition $A^T A = 1_n$ is symmetric: $(A^T A)^T = A^T A = (1_n)^T = 1_n$. Thus there are only $n^2 - n - n(n-1)/2 = n(n-1)/2$ free parameters. So dim O(n, R) = n(n − 1)/2.

Another fact of interest is that det A = ±1 for every A ∈ O(n, R). Proof: $\det(A^T A) = \det(A^T) \det A = \det A \det A = (\det A)^2 = \det 1_n = 1 \Rightarrow \det A = \pm 1$. Thus the group O(n, R) is divided into two parts: the matrices with det A = +1 and the matrices with det A = −1. The former part actually forms a subgroup of O(n, R), called SO(n, R) (you can figure out why this is true, and not true for the part with det A = −1). So we have one more example:

5. The group of special orthogonal transformations, denoted by $SO(n, R) = \{A \in O(n, R) \mid \det A = 1\}$; dim SO(n, R) = dim O(n, R) = n(n − 1)/2.

6. The group of unitary matrices (transformations) $U(n) = \{A \in GL(n, C) \mid A^\dagger A = 1_n\}$, where $A^\dagger = (A^*)^T = (A^T)^*$: $(A^\dagger)_{ij} = (A_{ji})^*$. Note that $(AB)^\dagger = B^\dagger A^\dagger$. These preserve the length of complex vectors $\vec{z}$. The length is defined as $\sqrt{z_1^* z_1 + \cdots + z_n^* z_n} = \sqrt{\vec{z}^\dagger \vec{z}}$. Under A this gets mapped to $\sqrt{(A\vec{z})^\dagger A\vec{z}} = \sqrt{\vec{z}^\dagger A^\dagger A \vec{z}} = \sqrt{\vec{z}^\dagger \vec{z}}$. The unitary matrices are rotations in $C^n$. We leave it as an exercise to show that U(n) is a subgroup of GL(n, C), and dim U(n) = $n^2$. Note that $U(1) = \{a \in C \mid a^* a = 1\}$; its group manifold is the unit circle $S^1$ in the complex plane.

7. The special unitary group $SU(n) = \{A \in U(n) \mid \det A = 1\}$. This is the complex analogue of SO(n, R), and is a subgroup of U(n). Exercise: dim SU(n) = $n^2 - 1$. The U(n) and SU(n) groups are important in modern physics. You will probably first become familiar with U(1), the group of phase transformations in quantum mechanics, and with SU(2), in the context of spin. Let us take a closer look at the latter. Its dimension is three. What does its group manifold look like? Let us first parameterize the SU(2) matrices with complex numbers a, b, c, d:

\[
A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad
A^\dagger = \begin{pmatrix} a^* & c^* \\ b^* & d^* \end{pmatrix}.
\]

Then

\[
\det A = ad - bc = 1,
\]

\[
A^\dagger A = \begin{pmatrix} |a|^2 + |c|^2 & a^* b + c^* d \\ b^* a + d^* c & |b|^2 + |d|^2 \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
\]

Let us first assume a ≠ 0. Then $b = -c^* d / a^*$. Substituting this into the determinant condition gives $ad - bc = d(|a|^2 + |c|^2)/a^* = d/a^* = 1 \Rightarrow d = a^*$. Then $c = -b^*$. So

\[
A = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix}.
\]

Assume then a = 0. Now $|c|^2 = 1$ and $c^* d = 0 \Rightarrow d = 0$. Then $|c|^2 = |b|^2 = 1$. Write $b = e^{i\beta}$, $c = e^{i\gamma}$. Then $\det A = -bc = e^{i(\beta + \gamma + \pi)} = 1 \Rightarrow \gamma = -\beta + (2n+1)\pi$ with n an integer. Then $c = e^{i\gamma} = e^{-i\beta} e^{i(2n+1)\pi} = -e^{-i\beta} = -b^*$. Thus

\[
A = \begin{pmatrix} 0 & b \\ -b^* & 0 \end{pmatrix}.
\]

Let us trade the two complex parameters for four real parameters $x_1, x_2, x_3, x_4$: $a = x_1 + i x_2$, $b = x_3 + i x_4$. Then A becomes

\[
A = \begin{pmatrix} x_1 + i x_2 & x_3 + i x_4 \\ -x_3 + i x_4 & x_1 - i x_2 \end{pmatrix}.
\]

The determinant condition det A = 1 then turns into the constraint

\[
x_1^2 + x_2^2 + x_3^2 + x_4^2 = 1
\]

for the four real parameters. This defines a unit 3-sphere. More generally, we define an n-sphere $S^n = \{(x_1, \ldots, x_{n+1}) \in R^{n+1} \mid \sum_{i=1}^{n+1} x_i^2 = 1\}$. The group manifold of SU(2) is a three-sphere $S^3$. (And the group manifold of U(1) was a 1-sphere $S^1$. As a matter of fact, for n ≥ 1 these are the only Lie groups with n-sphere group manifolds.) The n-sphere is an example of so-called pseudospheres. We’ll meet other examples in an exercise.

8. As an aside, note that O(n,R), SO(n,R), U(n), SU(n) were associated with rotations in $\mathbb{R}^n$ or $\mathbb{C}^n$, keeping invariant the lengths of real or complex vectors. One can generalize from real and complex numbers to quaternions and octonions, and look for generalizations of the rotation groups. This produces other examples of (compact) Lie groups, the Sp(2n), $G_2$, $F_4$, $E_6$, $E_7$ and $E_8$. The symplectic group Sp(2n) plays an important role in classical mechanics: it is associated with canonical transformations in phase space. The other groups crop up in string theory.
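The SU(2) parameterization of item 7 can be checked numerically. The following is a minimal Python sketch, assuming nothing beyond the formulas above; the helper names (`su2`, `from_s3`, `matmul`, `dagger`, `det`) and the two sample points on the 3-sphere are illustrative choices, not part of the notes.

```python
import math

def su2(a, b):
    """Return the matrix ((a, b), (-b*, a*)) as nested tuples."""
    return ((a, b), (-b.conjugate(), a.conjugate()))

def from_s3(x1, x2, x3, x4):
    """Map a point of the unit 3-sphere to a matrix with a = x1 + i x2, b = x3 + i x4."""
    return su2(complex(x1, x2), complex(x3, x4))

def matmul(A, B):
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def dagger(A):
    """Hermitian conjugate: transpose and complex-conjugate."""
    return tuple(tuple(A[j][i].conjugate() for j in range(2)) for i in range(2))

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def is_close_matrix(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

IDENTITY = ((1, 0), (0, 1))

# Two arbitrary points on S^3 (each satisfies x1^2 + x2^2 + x3^2 + x4^2 = 1).
A = from_s3(0.5, 0.5, 0.5, 0.5)
B = from_s3(math.cos(0.3), 0.0, 0.0, math.sin(0.3))

print(is_close_matrix(matmul(dagger(A), A), IDENTITY))  # unitarity of A
print(abs(det(A) - 1) < 1e-12)                          # unit determinant
```

The product `matmul(A, B)` can be tested in the same way; it should again have the form ((a, b), (−b*, a*)), which is the closure property of the group.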


2.4 Groups Acting on a Set

We already talked about the orthogonal groups as rotations, implying that the group acts on points in $\mathbb{R}^n$. We should make this notion more precise. First, review the definition of a homomorphism from p. 6; then you are ready to understand the following

Definition. Let G be a group, and X a set. The (left) action of G on X is a homomorphism $L : G \to \mathrm{Perm}(X)$, $G \ni g \mapsto L_g \in \mathrm{Perm}(X)$. Thus, L satisfies $(L_{g_2} \circ L_{g_1})(x) = L_{g_2}(L_{g_1}(x)) = L_{g_2 g_1}(x)$, where $x \in X$. The last equality follows from the homomorphism property. We often simplify the notation and denote $gx \equiv L_g(x)$. Given such an action, we say that X is a (left) G-space. Correspondingly, the right action of G on X is a map $R : G \to \mathrm{Perm}(X)$ satisfying $R_{g_2} \circ R_{g_1} = R_{g_1 g_2}$ (note the order in the subscript!), $xg \equiv R_g(x)$. We then say that X is a right G-space.

Two (left) G-spaces X, X′ can be identified if there is a bijection $i : X \to X'$ such that $i(L_g(x)) = L'_g(i(x))$, where L, L′ are (left) actions of G on X, X′. A mathematician would say this in the following way: the diagram

$$\begin{array}{ccc} X & \xrightarrow{\;i\;} & X' \\ {\scriptstyle L_g}\downarrow & & \downarrow{\scriptstyle L'_g} \\ X & \xrightarrow{\;i\;} & X' \end{array}$$

commutes, i.e. the map in the diagonal can be composed from the vertical and horizontal maps through either corner.

Definition. The orbit of a point $x \in X$ under the action of G is the set $O_x = \{L_g(x) \mid g \in G\}$. In other words, the orbit is the set of all points that can be reached from x by acting on it with elements of G. Let us put this in another way, by first introducing a useful concept.

Definition. An equivalence relation ∼ in a set X is a relation between points in the set which satisfies

i) a ∼ a (reflexive) ∀ a ∈ X

ii) a ∼ b ⇒ b ∼ a (symmetric) ∀ a, b ∈ X

iii) a ∼ b and b ∼ c ⇒ a ∼ c (transitive) ∀ a, b, c ∈ X

Given a set X and an equivalence relation ∼, we can partition X into mutually disjoint subsets called equivalence classes. An equivalence class $[a] = \{x \in X \mid x \sim a\}$ is the set of all points which are equivalent to a under ∼. The element a (or any other element in its equivalence class) is called a representative of the class. Note that [a] is not an empty set, since a ∼ a. If $[a] \cap [b] \neq \emptyset$, there is an $x \in X$ s.t. x ∼ a and x ∼ b. But then, by symmetry and transitivity, a ∼ b and [a] = [b]. Thus, different equivalence classes must be mutually disjoint ($[a] \neq [b] \Rightarrow [a] \cap [b] = \emptyset$). The set of all equivalence classes is called the quotient space and denoted by X/∼.

Example. Let n be a positive integer. Define an equivalence relation among integers $r, s \in \mathbb{Z}$: r ∼ s if r − s = 0 (mod n). (Prove that this indeed is an equivalence relation.) The quotient space is $\mathbb{Z}/\!\sim\; = \{[0], [1], [2], \ldots, [n-1]\}$. Define the addition of equivalence classes: [a] + [b] = [a + b]. Then $\mathbb{Z}/\!\sim$ with addition as the group multiplication is a finite Abelian group, isomorphic to the cyclic group: $\mathbb{Z}/\!\sim\; \cong \mathbb{Z}_n$. (Exercise: prove the details.)
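The key point of this example, that [a] + [b] = [a + b] does not depend on the chosen representatives, can be illustrated with a short Python sketch. The value n = 5, the function names and the sampled representatives are arbitrary illustrative choices.

```python
def residue_class(a, n):
    """Canonical representative of [a], where r ~ s iff r - s = 0 (mod n)."""
    return a % n

def add_classes(a, b, n):
    """[a] + [b] = [a + b], computed on representatives."""
    return residue_class(a + b, n)

n = 5

# Every integer falls into one of the n classes [0], ..., [n-1].
classes = sorted({residue_class(r, n) for r in range(-20, 21)})

# Replacing a by a + k*n and b by b + m*n must not change the sum class.
well_defined = all(
    add_classes(a, b, n) == add_classes(a + k * n, b + m * n, n)
    for a in range(n) for b in range(n)
    for k in (-2, 0, 3) for m in (-1, 0, 4)
)

print(classes)
print(well_defined)
```

The check samples only a few shifted representatives; the general statement is the exercise posed in the text.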

Back to orbits then. A point belonging to the orbit of another point defines an equivalence relation: y ∼ x if $y \in O_x$. The equivalence class is the orbit itself: $[x] = O_x$. Since the set X is partitioned into mutually disjoint equivalence classes, it is partitioned into mutually disjoint orbits under the action of G. We denote the quotient space by X/G. It may happen that there is only one such orbit; then $O_x = X$ ∀ x ∈ X. In this case we say that the action of G on X is transitive, and X is a homogeneous space.

Examples.

1. $G = \mathbb{Z}_2 = \{1, -1\}$, $X = \mathbb{R}$. Left actions: $L_1(x) = x$, $L_{-1}(x) = -x$. Orbits: $O_0 = \{0\}$, $O_x = \{x, -x\}$ (∀ x ≠ 0). The action is not transitive.

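2. $G = SO(2,\mathbb{R})$, $X = \mathbb{R}^2$. Parameterize

$$SO(2,\mathbb{R}) \ni g = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},$$

and write

$$\mathbb{R}^2 \ni x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.$$

Left action:

$$L_g(x) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} \cos\theta\, x_1 - \sin\theta\, x_2 \\ \sin\theta\, x_1 + \cos\theta\, x_2 \end{pmatrix}$$

(rotate vector x counterclockwise about the origin by angle θ). Orbits are circles with radius r about the origin: $O_0 = \{0\}$, $O_{x \neq 0} = \{y \in \mathbb{R}^2 \mid y_1^2 + y_2^2 = r^2\}$ with $r = \sqrt{x_1^2 + x_2^2}$. The action is not transitive. $\mathbb{R}^2/SO(2,\mathbb{R}) = \{r \in \mathbb{R} \mid r \geq 0\}$.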

3. $G = GL(n,\mathbb{R})$, $X = \mathbb{R}^n$. Left action: $L_A(x) = x'$ where $x'_i = \sum_{j=1}^n A_{ij} x_j$. The orbit of the origin 0 is $O_0 = \{0\}$; other points have other orbits. So the action is not transitive.
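The partition of a set into orbits can be made concrete for the first example. The sketch below, a minimal illustration rather than a general-purpose routine, applies the two maps of $\mathbb{Z}_2$ to a finite symmetric sample of integers standing in for the real line; the function name `orbits` and the sample range are assumptions made for the illustration.

```python
def orbits(elements, group_actions):
    """Partition `elements` into orbits.

    `group_actions` lists one map x -> L_g(x) per group element, so the orbit
    of x is simply the set of all images of x.
    """
    remaining = set(elements)
    result = []
    while remaining:
        x = min(remaining)
        orbit = {act(x) for act in group_actions}
        result.append(orbit)
        remaining -= orbit
    return result

# Z_2 = {1, -1}: L_1(x) = x and L_{-1}(x) = -x
z2_actions = [lambda x: x, lambda x: -x]
sample = range(-3, 4)

z2_orbits = orbits(sample, z2_actions)
print(z2_orbits)
```

Each nonzero point shares its orbit with its mirror image, while 0 forms an orbit of its own, so there is more than one orbit and the action is not transitive.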


2.4.1 Conjugacy classes and cosets

We can also let the group act on itself, i.e. take X = G. A simple way to define the left action of G on G is the translation, $L_g(g') = gg'$. Every group element belongs to the orbit of the identity, since $L_g(e) = ge = g$. So $O_e = G$: the action is transitive. A more interesting way to define a group action on itself is by conjugation.

Definition. Two elements $g_1, g_2$ of a group G are conjugate if there is an element $g \in G$ such that $g_1 = g g_2 g^{-1}$. The element g is called the conjugating element.

We then take conjugation as the left action, $L_g(g') = g g' g^{-1}$. In general conjugation is not transitive. The orbits have a special name: they are called conjugacy classes.
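As an illustration, the conjugacy classes of the permutation group $S_3$ can be computed by brute force. In this sketch permutations are stored as tuples `p` with `p[i]` the image of `i`; this encoding and the function names are choices made here, not notation from the notes.

```python
from itertools import permutations

def compose(p, q):
    """(p q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(group):
    """Orbits of the action L_g(x) = g x g^{-1} of the group on itself."""
    classes = []
    seen = set()
    for x in group:
        if x in seen:
            continue
        cls = {compose(compose(g, x), inverse(g)) for g in group}
        classes.append(cls)
        seen |= cls
    return classes

S3 = list(permutations(range(3)))
s3_classes = conjugacy_classes(S3)
class_sizes = sorted(len(c) for c in s3_classes)
print(class_sizes)
```

The three classes found are the identity, the two cyclic permutations and the three transpositions, so conjugation is indeed not transitive.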

It is also very interesting to consider the action of subgroups H of G on G. Define this time a right action of H on G by translation, $R_h(g) = gh$. If H is a proper subgroup, the action is not transitive (the orbit of e is H itself).

Definition. The orbits, or the equivalence classes

$$[g] = \{g' \in G \mid \exists h \in H \text{ s.t. } g' = gh\} = \{gh \mid h \in H\}$$

are called left cosets of H, and usually they are denoted gH. The quotient space $G/H = \{gH \mid g \in G\}$ is the set of left cosets. (Similarly, we can define the left action $L_h(g) = hg$ and consider the right cosets Hg. Then the quotient space is denoted H\G.)

Comments.

1. ghH = gH for all h ∈ H.

2. If $g_1 H = g_2 H$, there is an $h \in H$ such that $g_2 = g_1 h$, i.e. $g_1^{-1} g_2 \in H$.

3. There is a one-to-one correspondence between the elements of every coset and the elements of H itself. The map $f_g : H \to gH$, $f_g(h) = gh$ is obviously a surjection; it is also an injection since $gh_1 = gh_2 \Rightarrow h_1 = h_2$. In particular, if H is finite, all the orders are the same: |H| = |gH| = |g′H|. This leads to the following theorem:

Theorem 2.2 (Lagrange’s Theorem) The order |H| of any subgroup H of a finite group G must be a divisor of |G|: |G| = n|H| where n is a positive integer.

Proof. Under the right action of H, G is partitioned into mutually disjoint orbits gH, each having the same order as H. Hence |G| = n|H|, where n is the number of cosets.
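The counting argument of the proof can be reproduced on a small case. The sketch below lists the left cosets of a two-element subgroup of $S_3$; the tuple encoding of permutations and the particular subgroup H are illustrative assumptions.

```python
from itertools import permutations

def compose(p, q):
    """(p q)(i) = p(q(i)); permutations are tuples with p[i] the image of i."""
    return tuple(p[q[i]] for i in range(len(q)))

def left_cosets(group, subgroup):
    """Return the distinct left cosets gH as frozensets."""
    return {frozenset(compose(g, h) for h in subgroup) for g in group}

S3 = list(permutations(range(3)))
H = [(0, 1, 2), (1, 0, 2)]   # identity and the transposition swapping 0 and 1

cosets = left_cosets(S3, H)
index = len(cosets)          # the integer n in |G| = n |H|

print(index, [sorted(c) for c in cosets])
```

All cosets have |H| = 2 elements and together they exhaust the six elements of $S_3$, so n = 3 here.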


Corollary. If p = |G| is a prime number, then $G \cong \mathbb{Z}_p$.

Proof. Pick $g \in G$, $g \neq e$, and denote the order of the element g by m. Then $H = \{e, g, \ldots, g^{m-1}\} \cong \mathbb{Z}_m$ is a subgroup of G. But according to Lagrange’s theorem |G| = nm. For this to be prime, n = 1 or m = 1. But g ≠ e, so m > 1, so n = 1 and |G| = |H|. But then it must be that H = G.

Definition. Let group G act on a set X. The little group of $x \in X$ is the subgroup $G_x = \{g \in G \mid L_g(x) = x\}$ of G. It contains all elements of G which leave x invariant. It obviously contains the unit element e; you can easily show the other properties of a subgroup. The little group is also sometimes called the isotropy group, stabilizer or stability group.

Back to cosets. The set of cosets G/H is a G-space, if we define the left action $l_g : G/H \to G/H$, $l_g(g'H) = gg'H$. The action is transitive: if $g_1 H \neq g_2 H$, then $l_{g_1 g_2^{-1}}(g_2 H) = g_1 H$. The converse is also true:

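Theorem 2.3 Let group G act transitively on a set X. Then there exists a subgroup H such that X can be identified with G/H. In other words, there exists a bijection $i : G/H \to X$ such that the diagram

$$\begin{array}{ccc} G/H & \xrightarrow{\;i\;} & X \\ {\scriptstyle l_g}\downarrow & & \downarrow{\scriptstyle L_g} \\ G/H & \xrightarrow{\;i\;} & X \end{array}$$

commutes.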

Proof. Choose a point $x \in X$, denote its isotropy group $G_x$ by H. Define a map $i : G/H \to X$, $i(gH) = L_g(x)$. It is well defined: if gH = g′H, then g = g′h with some h ∈ H and $L_g(x) = L_{g'h}(x) = L_{g'}(L_h(x)) = L_{g'}(x)$. It is an injection: $i(gH) = i(g'H) \Rightarrow L_g(x) = L_{g'}(x) \Rightarrow x = L_{g^{-1}}(L_{g'}(x)) = L_{g^{-1}g'}(x) \Rightarrow g^{-1}g' \in H \Rightarrow g' = gh \Rightarrow gH = g'H$. It is also a surjection: G acts transitively, so for all x′ ∈ X there exists g s.t. $x' = L_g(x) = i(gH)$. The diagram commutes: $(L_g \circ i)(g'H) = L_g(L_{g'}(x)) = L_{gg'}(x) = i(gg'H) = (i \circ l_g)(g'H)$.

Corollary. A consequence of the proof is that the orbit of a point x ∈ X, $O_x$, can be identified with $G/G_x$, since G acts transitively on each of its orbits. Thus the orbits are determined by the subgroups of G; in other words the action of G on X is determined by the subgroup structure.
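The identification of an orbit with $G/G_x$ can be checked explicitly in a finite case. The following sketch lets $S_3$ act on X = {0, 1, 2} by $L_p(x) = p[x]$ and tabulates the map $i(gH) = L_g(x)$ from the proof; the choice of group, set and base point is an assumption made purely for illustration.

```python
from itertools import permutations

def compose(p, q):
    """(p q)(i) = p(q(i)); permutations are tuples with p[i] the image of i."""
    return tuple(p[q[i]] for i in range(len(q)))

G = list(permutations(range(3)))          # S_3 acting on X = {0, 1, 2}
x = 0                                     # chosen base point

stabilizer = [g for g in G if g[x] == x]  # little group G_x
orbit = {g[x] for g in G}                 # O_x

# i(gH) = L_g(x): for every coset, collect the images of x under its elements
coset_to_point = {}
for g in G:
    coset = frozenset(compose(g, h) for h in stabilizer)
    coset_to_point.setdefault(coset, set()).add(g[x])

print(len(coset_to_point), len(orbit), len(stabilizer))
```

Each coset is sent to a single point and different cosets to different points, so the number of cosets equals the size of the orbit, and |orbit| · |G_x| = |G|.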


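Example. $G = SO(3,\mathbb{R})$ acts on $\mathbb{R}^3$; the orbits are the spheres $|x|^2 = x_1^2 + x_2^2 + x_3^2 = r^2$, i.e. $S^2$ when r > 0. Choose the point x = north pole = (0, 0, r) on every orbit r > 0. Its little group is

$$G_x = \left\{ \begin{pmatrix} A_{2\times 2} & 0 \\ 0 & 1 \end{pmatrix} \;\middle|\; A_{2\times 2} \in SO(2,\mathbb{R}) \right\} \cong SO(2,\mathbb{R}).$$

By Theorem 2.3 and its Corollary, $SO(3,\mathbb{R})/SO(2,\mathbb{R}) = S^2$.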

2.4.2 Normal subgroups and quotient groups

Since the quotient space G/H is constructed out of a group and its subgroup, it is natural to ask if it can also be a group. The first guess for a multiplication law would be

$$(g_1 H)(g_2 H) = g_1 g_2 H.$$

This product is well defined only if the right hand side is independent of the labeling of the cosets. For example $g_1 H = g_1 h H$, so we then need $g_1 g_2 H = g_1 h g_2 H$, i.e. we must find $h' \in H$ s.t. $g_1 g_2 h' = g_1 h g_2$. But this is not always possible. We can circumvent the problem if H belongs to a particular class of subgroups, the so-called normal (also called invariant or self-conjugate) subgroups.

Definition. A normal subgroup H of G is one which satisfies $gHg^{-1} = \{ghg^{-1} \mid h \in H\} = H$ for all $g \in G$.

Another way to say this is that H is a normal subgroup if for all g ∈ G, h ∈ H there exists an h′ ∈ H such that gh = h′g.

Consider again the problem in defining a product for cosets. If H is a normal subgroup, then $g_1 h g_2 = g_1 (h g_2) = g_1 (g_2 h') = g_1 g_2 h'$ is possible. One can show that the above multiplication satisfies associativity, existence of identity (it is eH) and existence of inverse, $(gH)^{-1} = g^{-1}H$. Hence G/H is a group if H is a normal subgroup. When G/H is a group, it is called a quotient group.

Comments:

1. If H is a normal subgroup, its left and right cosets are the same: gH = Hg.

2. If G is Abelian, all of its subgroups are normal.

3. |G/H| = |G|/|H| (follows from Lagrange’s theorem).

Example. Consider the cyclic group $C_{2n} = \{e, a, \ldots, a^{2n-1}\}$, with n a positive integer. Take $H = \{e, a^2, a^4, \ldots, a^{2(n-1)}\}$. You can easily see that H is a subgroup of $C_{2n}$. Because cyclic groups are Abelian, H is normal. The two cosets are $H = a^2 H = \cdots = a^{2(n-1)} H$ and $aH = \{a, a^3, a^5, \ldots, a^{2n-1}\} = a^3 H = \cdots = a^{2n-1} H$. Because (aH)H = aH, HH = H and $(aH)(aH) = a^2 H = H$, the quotient group $C_{2n}/H \cong C_2$.
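The role of normality in making the coset product well defined can be seen in a non-Abelian example. The sketch below compares two subgroups of $S_3$: the cyclic permutations (normal) and a subgroup generated by one transposition (not normal). The tuple encoding of permutations and the function names are assumptions of this illustration.

```python
from itertools import permutations

def compose(p, q):
    """(p q)(i) = p(q(i)); permutations are tuples with p[i] the image of i."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_normal(group, subgroup):
    """Check that g h g^{-1} lies in H for all g in G, h in H."""
    members = set(subgroup)
    return all(
        compose(compose(g, h), inverse(g)) in members
        for g in group for h in subgroup
    )

def coset_product_well_defined(group, subgroup):
    """Check that g1 g2 H does not depend on the representatives of g1 H, g2 H."""
    def coset(g):
        return frozenset(compose(g, h) for h in subgroup)
    return all(
        coset(compose(a, b)) == coset(compose(g1, g2))
        for g1 in group for g2 in group
        for a in coset(g1) for b in coset(g2)
    )

S3 = list(permutations(range(3)))
A3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # cyclic permutations
H = [(0, 1, 2), (1, 0, 2)]               # identity and one transposition

print(is_normal(S3, A3), coset_product_well_defined(S3, A3))
print(is_normal(S3, H), coset_product_well_defined(S3, H))
```

For the normal subgroup the product of cosets is consistent, while for the transposition subgroup different representatives give different product cosets.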


Example. Consider G = SU(2), $H = \{1_2, -1_2\} \cong \mathbb{Z}_2$. Since $A(\pm 1_2) = (\pm 1_2)A$ for all A ∈ SU(2), H is a normal subgroup. One can show that the quotient group $G/H = SU(2)/\mathbb{Z}_2$ is isomorphic with $SO(3,\mathbb{R})$. This is an important result for quantum mechanics; we will analyze it more in a future problem set.

This is also an example of a center. The center of a group G is the set of all elements g′ ∈ G which commute with every element g ∈ G. In other words, it is the set $\{g' \in G \mid g'g = gg' \ \forall g \in G\}$. You can show that the center is a normal subgroup, so the quotient of a group by its center is a group. The center of SU(2) is $\{1_2, -1_2\}$.

We finish by showing another way of finding normal subgroups and quotient groups. Let the map $\mu : G_1 \to G_2$ be a group homomorphism. Its image is the set

$$\mathrm{Im}\,\mu = \{g_2 \in G_2 \mid \exists g_1 \in G_1 \text{ s.t. } g_2 = \mu(g_1)\}$$

and its kernel is the set

$$\mathrm{Ker}\,\mu = \{g_1 \in G_1 \mid \mu(g_1) = e_2\}.$$

In other words, the kernel is the set of all elements of $G_1$ which map to the unit element of $G_2$. You can show that Im µ is a subgroup of $G_2$, and Ker µ a subgroup of $G_1$. Further, Ker µ is a normal subgroup: if k ∈ Ker µ then $\mu(gkg^{-1}) = \mu(g)\, e_2\, \mu(g^{-1}) = \mu(gg^{-1}) = \mu(e_1) = e_2$, i.e. $gkg^{-1} \in \mathrm{Ker}\,\mu$. Hence $G_1/\mathrm{Ker}\,\mu$ is a quotient group. In fact, it is also isomorphic with Im µ!

Theorem 2.4 $G_1/\mathrm{Ker}\,\mu \cong \mathrm{Im}\,\mu$.

Proof. Denote K ≡ Ker µ. Define $i : G_1/K \to \mathrm{Im}\,\mu$, $i(gK) = \mu(g)$. If gK = g′K then there is a k ∈ K s.t. g = g′k. Then $i(gK) = \mu(g) = \mu(g'k) = \mu(g')\, e_2 = i(g'K)$, so i is well defined. Injection: if i(gK) = i(g′K) then µ(g) = µ(g′), so $e_2 = (\mu(g))^{-1}\mu(g') = \mu(g^{-1})\mu(g') = \mu(g^{-1}g')$, so $g^{-1}g' \in K$. Hence ∃ k ∈ K s.t. g′ = gk, so g′K = gK. Surjection: i is a surjection by definition. Thus i is a bijection. Homomorphism: $i(gK\, g'K) = i(gg'K) = \mu(gg') = \mu(g)\mu(g') = i(gK)\, i(g'K)$. So i is a homomorphism and a bijection, i.e. an isomorphism.
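Theorem 2.4 can be illustrated with a small additive example. The sketch below takes $G_1 = G_2 = \mathbb{Z}_{12}$ (addition mod 12) and the map x ↦ 3x mod 12; this particular homomorphism is a hypothetical example chosen here, not one used in the notes.

```python
n = 12
G1 = range(n)                      # Z_12 under addition mod 12

def mu(x):
    """Example homomorphism Z_12 -> Z_12: x -> 3x mod 12 (the unit element is 0)."""
    return (3 * x) % n

kernel = [g for g in G1 if mu(g) == 0]
image = sorted({mu(g) for g in G1})

# i(gK) = mu(g): collect the values mu takes on each coset g + K
coset_values = {}
for g in G1:
    coset = frozenset((g + k) % n for k in kernel)
    coset_values.setdefault(coset, set()).add(mu(g))

print(kernel, image, len(coset_values))
```

The map µ is constant on each coset of the kernel and takes distinct values on distinct cosets, so the cosets are in one-to-one correspondence with the elements of the image.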

For example, our previous example $SU(2)/\mathbb{Z}_2 \cong SO(3,\mathbb{R})$ can be shown this way, by constructing a surjective homomorphism $\mu : SU(2) \to SO(3,\mathbb{R})$ such that $\mathrm{Ker}\,\mu = \{1_2, -1_2\}$.

3 Representation Theory of Groups

In the previous section we discussed the action of a group on a set. We also listed some examples of Lie groups, their elements being n × n matrices. For example, the elements of the orthogonal group O(n,R) corresponded to rotations of vectors in $\mathbb{R}^n$.


Now we are going to continue along these lines and consider the action of a generic group on a (complex) vector space, so that we can represent the elements of the group by matrices. However, a vector space is more than just a set, so in defining the action of a group on it, we have to ensure that it respects the vector space structure.

3.1 Complex Vector Spaces and Representations

Definition. A complex vector space V is an Abelian group (we denote its multiplication by “+” and call it a sum), where an additional operation, scalar multiplication by a complex number $\mu \in \mathbb{C}$, has been defined, such that the following conditions are satisfied:

i) $\mu(\vec v_1 + \vec v_2) = \mu \vec v_1 + \mu \vec v_2$

ii) $(\mu_1 + \mu_2)\vec v = \mu_1 \vec v + \mu_2 \vec v$

iii) $\mu_1(\mu_2 \vec v) = (\mu_1 \mu_2)\vec v$

iv) $1\, \vec v = \vec v$

v) $0\, \vec v = \vec 0$ ($\vec 0$ is the unit element of V)

We could have replaced complex numbers by real numbers, to define a real vector space, or in general replaced the set of scalars by something called a “field”. Complex vector spaces are relevant for quantum mechanics. A comment on notations: we denote vectors with arrows: $\vec v$, but textbooks written in English often denote them in boldface: $\mathbf{v}$. If it is clear from the context whether one means a vector or its component, one may also simply use the notation v for a vector.

Definition. Vectors $\vec v_1, \ldots, \vec v_n \in V$ are linearly independent, if $\sum_{i=1}^n \mu_i \vec v_i = \vec 0$ only if the coefficients $\mu_1 = \mu_2 = \cdots = \mu_n = 0$. If there exist at most n linearly independent vectors, n is the dimension of V, and we denote dim V = n. If dim V = n, a set $\{\vec e_1, \ldots, \vec e_n\}$ of linearly independent vectors is called a basis of the vector space. Given a basis, any vector $\vec v$ can be written in the form $\vec v = \sum_{i=1}^n v_i \vec e_i$, where the components $v_i$ of the vector are determined uniquely.

Definition. A map $L : V_1 \to V_2$ between two vector spaces $V_1, V_2$ is linear, if it satisfies

$$L(\mu_1 \vec v_1 + \mu_2 \vec v_2) = \mu_1 L(\vec v_1) + \mu_2 L(\vec v_2)$$

for all $\mu_1, \mu_2 \in \mathbb{C}$ and $\vec v_1, \vec v_2 \in V_1$. A linear map is also called a linear transformation, or especially in physics context, a (linear) operator. If a linear map is also a bijection, it is called an isomorphism; then the vector spaces $V_1$ and $V_2$ are isomorphic, $V_1 \cong V_2$. It then follows that $\dim V_1 = \dim V_2$. Further, all n-dimensional vector spaces are isomorphic. An isomorphism from V to itself is called an automorphism. The set of automorphisms of V is denoted Aut(V). It is a group, with composition of mappings $L \circ L'$ as the law of multiplication. (Existence of inverse is guaranteed since automorphisms are bijections.)

Definition. The image of a linear transformation is

$$\mathrm{Im}\,L = L(V_1) = \{L(\vec v_1) \mid \vec v_1 \in V_1\} \subset V_2$$

and its kernel is the set of vectors of $V_1$ which map to the null vector $\vec 0_2$ of $V_2$:

$$\mathrm{Ker}\,L = \{\vec v_1 \in V_1 \mid L(\vec v_1) = \vec 0_2\} \subset V_1.$$

You can show that both the image and the kernel are vector spaces. I also quote a couple of theorems without proofs.

Theorem 3.1 $\dim V_1 = \dim \mathrm{Ker}\,L + \dim \mathrm{Im}\,L$.

Theorem 3.2 A linear map $L : V \to V$ is an automorphism if and only if $\mathrm{Ker}\,L = \{\vec 0\}$.

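Note that a linear map is defined uniquely by its action on the basis vectors:

$$L(\vec v) = L\Big(\sum_{i=1}^n v_i \vec e_i\Big) = \sum_i v_i L(\vec e_i).$$

Then we expand the vectors $L(\vec e_i)$ in the basis $\{\vec e_j\}$ and denote the components by $L_{ji}$:

$$L(\vec e_i) = \sum_j L_{ji}\, \vec e_j.$$

Now

$$L(\vec v) = \sum_i \sum_j v_i L_{ji}\, \vec e_j = \sum_j \Big(\sum_i L_{ji} v_i\Big) \vec e_j,$$

so the image vector $L(\vec v)$ has the components $L(\vec v)_j = \sum_i L_{ji} v_i$. Let $\dim V_1 = \dim V_2 = n$. The above can be written in the familiar matrix language:

$$\begin{pmatrix} L(\vec v)_1 \\ L(\vec v)_2 \\ \vdots \\ L(\vec v)_n \end{pmatrix} = \begin{pmatrix} L_{11} & L_{12} & \cdots & L_{1n} \\ L_{21} & L_{22} & \cdots & L_{2n} \\ \vdots & & \ddots & \vdots \\ L_{n1} & L_{n2} & \cdots & L_{nn} \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix}.$$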

We will often shorten the notation for linear maps and write $L\vec v$ instead of $L(\vec v)$, and $L_1 L_2 \vec v$ instead of $L_1(L_2(\vec v))$. From the above it should also be clear that the group of automorphisms of V is isomorphic with the group of invertible n × n complex matrices:

$$\mathrm{Aut}(V) = \{L : V \to V \mid L \text{ is an automorphism}\} \cong GL(n,\mathbb{C}).$$

(The multiplication laws are composition of maps and matrix multiplication.)

Now we have the tools to give a definition of a representation of a group. The idea is that we define the action of a group G on a vector space V. If V were just a set, we would associate with every group element g ∈ G a permutation $L_g \in \mathrm{Perm}(V)$. However, we have to preserve the vector space structure of V. So we define the action just as before, but replace the group Perm(V) of permutations of V by the group Aut(V) of automorphisms of V.

Definition. A (linear) representation of a group G in a vector space V is a homomorphism $D : G \to \mathrm{Aut}(V)$, $G \ni g \mapsto D(g) \in \mathrm{Aut}(V)$. The dimension of the representation is the dimension of the vector space, dim V.

Note:

1. D is a homomorphism: D(g1g2) = D(g1)D(g2).

2. $D(g^{-1}) = (D(g))^{-1}$.

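Example. Let $G = C_4 = \{e, c, c^2, c^3\}$ and $V = \mathbb{R}^2$. One possible representation of G is $D : G \to \mathrm{Aut}(V)$,

$$D(c) = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad D(e) = D(c^4) = (D(c))^4 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = 1.$$

Note that the matrix D(c) corresponds to a 90° rotation in the $\mathbb{R}^2$ plane.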
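The homomorphism property of this representation is easy to verify by direct multiplication. In the sketch below the group element $c^m$ is labeled by the integer m mod 4, and D(m) is built as the m-th power of D(c); the labeling and the function names are conventions adopted only for this illustration.

```python
def matmul(A, B):
    n = len(A)
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )

IDENTITY = ((1, 0), (0, 1))
D_c = ((0, -1), (1, 0))          # D(c): rotation by 90 degrees

def D(m):
    """Matrix representing c^m, obtained as the m-th power of D(c)."""
    result = IDENTITY
    for _ in range(m % 4):
        result = matmul(result, D_c)
    return result

# D(c^m1 c^m2) = D(c^m1) D(c^m2) for all pairs of group elements
homomorphism = all(
    D((m1 + m2) % 4) == matmul(D(m1), D(m2))
    for m1 in range(4) for m2 in range(4)
)

# Four distinct matrices: only the identity element is mapped to the unit matrix
faithful = len({D(m) for m in range(4)}) == 4

print(homomorphism, faithful)
```

Integer entries keep the arithmetic exact, so the comparisons need no tolerance.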

We say that a representation D is faithful if Ker D = {e}. Then $g_1 \neq g_2 \Rightarrow D(g_1) \neq D(g_2)$. Whatever Ker D is, D always induces a faithful representation of the quotient group G/Ker D.

A mathematician would next like to classify all possible representations of a group. Then the first question is when two representations are the same (equivalent).

Definition. Let $D_1, D_2$ be representations of a group G in vector spaces $V_1, V_2$. An intertwining operator is a linear map $A : V_1 \to V_2$ such that the diagram

$$\begin{array}{ccc} V_1 & \xrightarrow{\;A\;} & V_2 \\ {\scriptstyle D_1(g)}\downarrow & & \downarrow{\scriptstyle D_2(g)} \\ V_1 & \xrightarrow{\;A\;} & V_2 \end{array}$$

commutes, i.e. $D_2(g)A = AD_1(g)$ for all g ∈ G. If A is an isomorphism (we then need $\dim V_1 = \dim V_2$), the representations $D_1$ and $D_2$ are equivalent. In other words, there then exists a similarity transformation $D_2(g) = AD_1(g)A^{-1}$ for all g ∈ G.


Example. Let $\dim V_1 = n$ and $V_2 = \mathbb{C}^n$. Since $V_1 \cong \mathbb{C}^n$, any n-dimensional representation is equivalent with a representation of G by invertible complex matrices, a homomorphism $D_2 : G \to GL(n,\mathbb{C})$.

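Definition. A scalar product in a vector space V is a map $V \times V \to \mathbb{C}$, $(\vec v_1, \vec v_2) \mapsto \langle \vec v_1 | \vec v_2 \rangle \in \mathbb{C}$ which satisfies the following properties:

i) $\langle \vec v | \mu_1 \vec v_1 + \mu_2 \vec v_2 \rangle = \mu_1 \langle \vec v | \vec v_1 \rangle + \mu_2 \langle \vec v | \vec v_2 \rangle$

ii) $\langle \vec v | \vec w \rangle = \langle \vec w | \vec v \rangle^*$

iii) $\langle \vec v | \vec v \rangle \geq 0$ and $\langle \vec v | \vec v \rangle = 0 \Leftrightarrow \vec v = \vec 0$.

Given a scalar product, it is possible to orthonormalize (e.g. by the Gram-Schmidt method) the basis vectors such that $\langle \vec e_i | \vec e_j \rangle = \delta_{ij}$. Such an orthonormal basis is usually the most convenient one to use. The adjoint $A^\dagger$ of an operator (linear map) $A : V \to V$ is the one which satisfies $\langle \vec v | A^\dagger \vec w \rangle = \langle A \vec v | \vec w \rangle$ for all $\vec v, \vec w \in V$.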

Definition. An operator (linear map) $U : V \to V$ is unitary if $\langle \vec v | \vec w \rangle = \langle U\vec v | U\vec w \rangle$ for all $\vec v, \vec w \in V$. Equivalently, a unitary operator must satisfy $U^\dagger U = \mathrm{id}_V = 1$. It follows that the corresponding n × n matrix (in an orthonormal basis) must be unitary, i.e. an element of U(n). Unitary operators form a subgroup Unit(V) of $\mathrm{Aut}(V) \cong GL(n,\mathbb{C})$.

Definition. A unitary representation of a group G is a homomorphism $D : G \to \mathrm{Unit}(V)$.

Definition. If $U_1, U_2$ are unitary representations of G in $V_1, V_2$, and there exists an intertwining isomorphic operator $A : V_1 \to V_2$ which preserves the scalar product, $\langle A\vec v | A\vec w \rangle_{V_2} = \langle \vec v | \vec w \rangle_{V_1}$ for all $\vec v, \vec w \in V_1$, the representations are unitarily equivalent.

Example. Every n-dimensional unitary representation is unitarily equivalent with a representation by unitary matrices, a homomorphism $G \to U(n)$.

As always after defining a fundamental concept, we would like to classify all possibilities. The basic problem in group representation theory is to classify all unitary representations of a group, up to unitary equivalence.

3.2 Symmetry Transformations in Quantum Mechanics

We have been aiming at unitary representations in complex vector spaces because of their applications in Quantum Mechanics (QM). Recall that the set of all possible states of a quantum mechanical system is the Hilbert space $\mathcal{H}$, a complex vector space with a scalar product. State vectors are usually denoted by $|\psi\rangle$ as opposed to our previous notation $\vec v$, and the scalar product of two vectors $|\psi\rangle, |\chi\rangle$ is denoted $\langle \psi | \chi \rangle$. Note that usually the Hilbert space is an infinite dimensional vector space, whereas in our discussion of representation theory we’ve been focusing on finite dimensional vector spaces. Let us not be concerned about the possible subtleties which ensue; in fact in many cases finite dimensional representations will still be relevant, as you will see.

According to QM, the time evolution of a state is controlled by the Schrödinger equation,

$$i\hbar \frac{d}{dt} |\psi\rangle = H |\psi\rangle$$

where H is the Hamilton operator, the generator of the time evolution of the system. Suppose that the system possesses a symmetry, with the symmetry operations forming a group G. In order to describe the symmetry, we need to specify how it acts on the state vectors of the system: we need to find its representation in the vector space of the states, the Hilbert space. The norm of a state vector, its scalar product with itself $\langle \psi | \psi \rangle$, is associated with a probability density and normalized to one; similarly the scalar product $\langle \psi | \chi \rangle$ of two states is associated with the probability (density) of measurements. Thus the representations of the symmetry group G must preserve the scalar product. In other words, the representations must be unitary. Moreover, in a closed system probability is preserved under the time evolution. Thus, unitarity of the representations must also be preserved under the time evolution.

We can summarize the above in a more formal way: if $g \mapsto U_g$ is a faithful unitary representation of a group G in the Hilbert space of a quantum mechanical system, such that for all g ∈ G

$$U_g H U_g^{-1} = H \qquad (5)$$

where H is the Hamilton operator of the system, the group G is a symmetry group of the system.

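The condition (5) arises as follows. Suppose a state vector $|\psi\rangle$ is a solution of the Schrödinger equation. In performing a symmetry operation on the system, the state vector is mapped to a new vector $U_g|\psi\rangle$. But if the system is symmetric, the new state $U_g|\psi\rangle$ must also be a solution of the Schrödinger equation: $i\hbar (d/dt)\, U_g|\psi\rangle = H U_g|\psi\rangle$. But then (taking $U_g$ to be time independent) it must be that $i\hbar (d/dt)|\psi\rangle = i\hbar (d/dt)\, U_g^{-1} U_g |\psi\rangle = U_g^{-1} H U_g |\psi\rangle = H|\psi\rangle \Rightarrow U_g^{-1} H U_g = H$. Since this holds for all g ∈ G, it is equivalent to (5).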

Consider in particular the energy eigenstates $|\phi_n\rangle$ at energy level $E_n$:

$$H|\phi_n\rangle = E_n |\phi_n\rangle.$$

An energy level may be degenerate, say with k linearly independent energy eigenstates $|\phi_{n1}\rangle, \ldots, |\phi_{nk}\rangle$. They span a k-dimensional vector space $\mathcal{H}_n$, a subspace of the full Hilbert space. If the system has a symmetry group,

$$H U_g |\phi_n\rangle = U_g H |\phi_n\rangle = E_n U_g |\phi_n\rangle$$

so all states $U_g|\phi_n\rangle$ are eigenstates at the same energy level $E_n$. Thus the representation $U_g$ maps the eigenspace $\mathcal{H}_n$ to itself; in other words the representation $U_g$ is a k-dimensional representation of G acting in $\mathcal{H}_n$. By an inverse argument, suppose that the system has a symmetry group G. Its representations then determine the possible degeneracies of the energy levels of the system.

3.3 Reducibility of Representations

It turns out that some representations are more fundamental than others. A generic representation can be decomposed into so-called irreducible representations. That is our next topic. Again, we start with some definitions.

Definition. A subset W of a vector space V is called a subspace if it includes all possible linear combinations of its elements: if $\vec v, \vec w \in W$ then $\lambda \vec v + \mu \vec w \in W$ for all $\lambda, \mu \in \mathbb{C}$.

Let D be a representation of a group G in vector space V. The representation space V is also called a G-module. (This terminology is used in Jones.) Let W be a subspace of V. We say that W is a submodule if it is closed under the action of the group G: $\vec w \in W \Rightarrow D(g)\vec w \in W$ for all g ∈ G. Then, the restriction of D(g) to W is an automorphism $D(g)|_W : W \to W$.

Definition. A representation $D : G \to \mathrm{Aut}(V)$ is irreducible, if the only submodules are $\{\vec 0\}$ and V. Otherwise the representation is reducible.

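Example. Choose a basis $\{\vec e_i\}$ in V, let dim V = n. Suppose that all the matrices $D(g)_{ij} = \langle \vec e_i | D(g) \vec e_j \rangle$ turn out to have the form

$$D(g) = \begin{pmatrix} M(g) & S(g) \\ 0 & T(g) \end{pmatrix} \qquad (6)$$

where M(g) is an $n_1 \times n_1$ matrix, T(g) is an $n_2 \times n_2$ matrix, $n_1 + n_2 = n$, and S(g) is an $n_1 \times n_2$ matrix. Then the representation is reducible, since

$$W = \left\{ \begin{pmatrix} \vec v \\ \vec 0 \end{pmatrix} \;\middle|\; \vec v = \begin{pmatrix} v_1 \\ \vdots \\ v_{n_1} \end{pmatrix} \right\} \qquad (7)$$

is a submodule:

$$D(g) \begin{pmatrix} \vec v \\ \vec 0 \end{pmatrix} = \begin{pmatrix} M(g)\vec v + S(g)\vec 0 \\ T(g)\vec 0 \end{pmatrix} = \begin{pmatrix} M(g)\vec v \\ \vec 0 \end{pmatrix} \in W. \qquad (8)$$

If in addition S(g) = 0 for all g ∈ G, the representation is obviously built up by combining two representations M(g) and T(g). It is then an example of a completely reducible representation. We’ll give a formal definition shortly.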


Definition. A direct sum $V_1 \oplus V_2$ of two vector spaces $V_1$ and $V_2$ consists of all pairs $(v_1, v_2)$ with $v_1 \in V_1$, $v_2 \in V_2$, with the addition of vectors and scalar multiplication defined as

$$(v_1, v_2) + (v'_1, v'_2) = (v_1 + v'_1, v_2 + v'_2)$$

$$\lambda (v_1, v_2) = (\lambda v_1, \lambda v_2)$$

It is simple to show that $\dim(V_1 \oplus V_2) = \dim V_1 + \dim V_2$. If a scalar product has been defined in $V_1$ and $V_2$, one can define a scalar product in $V_1 \oplus V_2$ by

$$\langle (v_1, v_2) | (v'_1, v'_2) \rangle = \langle v_1 | v'_1 \rangle + \langle v_2 | v'_2 \rangle.$$

Suppose $D_1, D_2$ are representations of G in $V_1, V_2$; one can then define a direct sum representation $D_1 \oplus D_2$ in $V_1 \oplus V_2$:

$$(D_1 \oplus D_2)(g)(v_1, v_2) = (D_1(g) v_1, D_2(g) v_2).$$

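In this case it is useful to adopt the notation

$$V_1 = \left\{ \begin{pmatrix} \vec v_1 \\ \vec 0 \end{pmatrix} \right\}; \qquad V_2 = \left\{ \begin{pmatrix} \vec 0 \\ \vec v_2 \end{pmatrix} \right\}$$

so that

$$V_1 \oplus V_2 = \left\{ \begin{pmatrix} \vec v_1 \\ \vec v_2 \end{pmatrix} = (\vec v_1, \vec v_2) \right\}.$$

Now the matrices of the direct sum representation are of the block diagonal form

$$(D_1 \oplus D_2)(g) = \begin{pmatrix} D_1(g) & 0 \\ 0 & D_2(g) \end{pmatrix}.$$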

Definition. A representation D in vector space V is completely reducible if for every submodule W ⊂ V there exists a complementary submodule W′ such that V = W ⊕ W′ and $D \cong D_W \oplus D_{W'}$.

Comments.

1. According to the definition, we need to show that D is equivalent with the direct sum representation $D_W \oplus D_{W'}$. For the matrices of the representation, this means that there must be a similarity transformation which maps all the matrices D(g) into a block diagonal form:

$$A D(g) A^{-1} = \begin{pmatrix} D_W(g) & 0 \\ 0 & D_{W'}(g) \end{pmatrix}.$$


2. Strictly speaking, according to the definition an irreducible representation is also completely reducible, as W = V, W′ = $\{\vec 0\}$ or vice versa satisfy the requirements. We will exclude this case, and from now on by completely reducible representations we mean those which are not irreducible.

The goal in the reduction of a representation is to decompose it into irreducible pieces, such that

$$D \cong D_1 \oplus D_2 \oplus D_3 \oplus \cdots$$

(then $\dim D = \sum_i \dim D_i$). This is possible if D is completely reducible. So, given a representation, how do we know if it is completely reducible or not? Interesting representations from the quantum mechanics point of view turn out to be completely reducible:

Theorem 3.3 Unitary representations are completely reducible.

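Proof. Since we are talking about unitary representations, it is implied that the representation space V has a scalar product. Let W be a submodule. We define its orthogonal complement $W^\perp = \{\vec v \in V \mid \langle \vec v | \vec w \rangle = 0 \ \forall \vec w \in W\}$. I leave it as an exercise to show that $V \cong W \oplus W^\perp$. We then only need to show that $W^\perp$ is also a submodule (closed under the action of G). Let $\vec v \in W^\perp$, and denote the unitary representation by U. For all $\vec w \in W$ and g ∈ G,

$$\langle U(g)\vec v | \vec w \rangle = \langle U(g)\vec v | U(g)U^{-1}(g)\vec w \rangle = \langle \vec v | U^\dagger(g)U(g)U^{-1}(g)\vec w \rangle \overset{a}{=} \langle \vec v | U^{-1}(g)\vec w \rangle = \langle \vec v | U(g^{-1})\vec w \rangle \overset{b}{=} \langle \vec v | \vec w' \rangle \overset{c}{=} 0,$$

where step a follows since U is unitary, step b since W is a G-module (so that $\vec w' \equiv U(g^{-1})\vec w \in W$), and step c is true since $\vec v \in W^\perp$. Thus $U(g)\vec v \in W^\perp$, so $W^\perp$ is closed under the action of G.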

If G is a finite group, we can say more.

Theorem 3.4 Let D be a finite dimensional representation of a finite group G, in vector space V. Then there exists a scalar product in V such that D is unitary.

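Proof. We can always define a scalar product in a finite dimensional vector space, e.g. by choosing a basis and defining $\langle \vec v | \vec w \rangle = \sum_{i=1}^n v_i^* w_i$ where $v_i, w_i$ are the components of the vectors. Given a scalar product, we then define a “group averaged” scalar product $\langle\langle \vec v | \vec w \rangle\rangle = \frac{1}{|G|} \sum_{g' \in G} \langle D(g')\vec v | D(g')\vec w \rangle$. It is straightforward to show that $\langle\langle\,\cdot\,|\,\cdot\,\rangle\rangle$ satisfies the requirements of a scalar product. Further,

$$\begin{aligned} \langle\langle D(g)\vec v | D(g)\vec w \rangle\rangle &= \frac{1}{|G|} \sum_{g' \in G} \langle D(g')D(g)\vec v | D(g')D(g)\vec w \rangle \\ &= \frac{1}{|G|} \sum_{g' \in G} \langle D(g'g)\vec v | D(g'g)\vec w \rangle \\ &= \frac{1}{|G|} \sum_{g'' \in G} \langle D(g'')\vec v | D(g'')\vec w \rangle = \langle\langle \vec v | \vec w \rangle\rangle. \end{aligned}$$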


In other words, D is unitary with respect to the scalar product 〈〈|〉〉.
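The group-averaging construction of the proof can be carried out by hand for a two-element group. The sketch below uses a hypothetical real, non-orthogonal 2 × 2 representation of $\mathbb{Z}_2 = \{e, c\}$; since the matrices are real, the adjoint reduces to the transpose and the averaged scalar product is encoded by its Gram matrix M, with $\langle\langle \vec v | \vec w \rangle\rangle = \vec v^T M \vec w$. The matrix D(c) and all names are assumptions made for the illustration.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

I2 = [[1, 0], [0, 1]]
D_c = [[1, 1], [0, -1]]            # squares to the identity, but is not orthogonal
group_matrices = [I2, D_c]         # D(e), D(c)

# Gram matrix of the averaged product: M = (1/|G|) sum_g D(g)^T D(g)
M = [[Fraction(sum(matmul(transpose(D), D)[i][j] for D in group_matrices),
               len(group_matrices))
      for j in range(2)] for i in range(2)]

def preserves(D, gram):
    """True if D leaves the scalar product with Gram matrix `gram` invariant."""
    return matmul(matmul(transpose(D), gram), D) == gram

print(preserves(D_c, I2))                            # original scalar product
print(all(preserves(D, M) for D in group_matrices))  # averaged scalar product
```

Exact rational arithmetic is used so that the invariance check is an equality rather than a floating-point comparison.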

Since we have previously shown that unitary representations are completely reducible, we have shown the following fact, called Maschke’s theorem.

Theorem 3.5 (Maschke’s Theorem) Every finite dimensional representation of a finite group is completely reducible.

3.4 Irreducible Representations

Now that we have shown that many representations of interest are completely reducible, and can be decomposed into a direct sum of irreducible representations, the next task is to classify the latter. We will first develop ways to identify inequivalent irreducible representations. Before doing so, we must discuss some general theorems.

Theorem 3.6 (Schur’s Lemma) Let $D_1$ and $D_2$ be two irreducible representations of a group G. Every intertwining operator between them is either a null map or an isomorphism; in the latter case the representations are equivalent, $D_1 \cong D_2$.

Proof. Let A be an intertwining operator between the representations, i.e. the diagram

$$\begin{array}{ccc} V_1 & \xrightarrow{\;A\;} & V_2 \\ {\scriptstyle D_1(g)}\downarrow & & \downarrow{\scriptstyle D_2(g)} \\ V_1 & \xrightarrow{\;A\;} & V_2 \end{array}$$

commutes: $D_2(g)A = AD_1(g)$ for all g ∈ G. Let us first examine if A can be an injection. Note first that if $\mathrm{Ker}\,A \equiv \{\vec v \in V_1 \mid A\vec v = \vec 0_2\} = \{\vec 0_1\}$, then A is an injection, since if $A\vec v = A\vec w$ then $A(\vec v - \vec w) = \vec 0_2 \Rightarrow \vec v - \vec w \in \mathrm{Ker}\,A = \{\vec 0_1\} \Rightarrow \vec v = \vec w$. So what is Ker A? Recall that Ker A is a subspace of $V_1$. Is it also a submodule, i.e. closed under the action of G? Let $\vec v \in \mathrm{Ker}\,A$. Then $AD_1(g)\vec v = D_2(g)A\vec v = \vec 0_2$, hence $D_1(g)\vec v \in \mathrm{Ker}\,A$, i.e. Ker A is a submodule. But since $D_1$ is an irreducible representation, either $\mathrm{Ker}\,A = V_1$ or $\mathrm{Ker}\,A = \{\vec 0_1\}$. In the former case all vectors of $V_1$ map to the null vector of $V_2$, so A is a null map, A = 0. In the latter case, A is an injection. We then use a similar reasoning to examine if A is also a surjection. Let $\vec v_2 \in \mathrm{Im}\,A \equiv \{\vec v \in V_2 \mid \exists \vec v_1 \in V_1 \text{ s.t. } \vec v = A\vec v_1\}$. Then we can write $\vec v_2 = A\vec v_1$. Then $D_2(g)\vec v_2 = D_2(g)A\vec v_1 = A(D_1(g)\vec v_1)$, so also $D_2(g)\vec v_2 \in \mathrm{Im}\,A$. Thus, Im A is a submodule of $V_2$. But since $D_2$ is irreducible, either $\mathrm{Im}\,A = \{\vec 0_2\}$, i.e. A = 0, or $\mathrm{Im}\,A = V_2$, i.e. A is a surjection. To summarize, either A = 0 or A is a bijection, i.e. an isomorphism (since it is also a linear operator).

Corollary. If D is an irreducible representation of a group G in a (complex) vector space V, then the only operator which commutes with all D(g) is a multiple of the identity operator.


Proof. If $AD(g) = D(g)A$ for all g ∈ G, then for all $\mu \in \mathbb{C}$ also $(A - \mu 1)D(g) = D(g)(A - \mu 1)$. According to Schur’s lemma, for every $\mu \in \mathbb{C}$ either $(A - \mu 1)^{-1}$ exists or $(A - \mu 1) = 0$. However, it is always possible to find at least one $\mu \in \mathbb{C}$ such that $(A - \mu 1)$ is not invertible. In the finite dimensional case this follows from the fundamental theorem of algebra, which guarantees that the polynomial equation $\det(A - \mu 1) = 0$ has solutions for µ. (The infinite dimensional case is more delicate, but turns out to be true as well.) So it must be that A = µ1.

We will next discuss a sequence of theorems, starting from the rather abstract fundamental orthogonality theorem and then moving towards its more intuitive and user-friendly forms. Since we are interested in applications, we shall cut some corners and skip the proof of the fundamental orthogonality theorem. It can be found in the literature (or in Montonen’s handwritten notes) if you are interested in the details.

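Theorem 3.7 (Fundamental Orthogonality Theorem) Let $U_1$ and $U_2$ be two unitary irreducible representations of a group $G$ in vector spaces $V_1$ and $V_2$. Then

\[
\sum_{g \in G} \langle \vec w_1 | U_1(g)\vec v_1 \rangle^*_{V_1} \, \langle \vec w_2 | U_2(g)\vec v_2 \rangle_{V_2} =
\begin{cases}
0, & \text{if } U_1 \text{ and } U_2 \text{ are not equivalent,} \\[4pt]
\dfrac{|G|}{\dim V}\, \langle \vec w_1 | \vec w_2 \rangle^* \langle \vec v_1 | \vec v_2 \rangle, & \text{if } U_1 = U_2,\ V_1 = V_2 = V,
\end{cases}
\]

for all $\vec v_1, \vec w_1 \in V_1$, $\vec v_2, \vec w_2 \in V_2$. In the latter case also $\dim V < \infty$.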

Note that in the latter case $V_1 = V_2 = V$, so $\vec v_1, \vec v_2, \vec w_1, \vec w_2 \in V$ and the scalar products on the right hand side are those of $V$. While this is the generic form of the theorem, it is more insightful to consider a special case. In the latter case, pick an orthonormal basis $\{\vec e_i\}$ in $V$ and choose $\vec w_1 = \vec e_i$, $\vec v_1 = \vec e_j$, $\vec w_2 = \vec e_k$, $\vec v_2 = \vec e_l$. Then on the left hand side appear the matrices of the representation, $D^{(\alpha)}(g)_{kl} = \langle \vec e_k | U_\alpha(g) \vec e_l \rangle$, and the right hand side reduces to a product of Kronecker deltas. In other words, the FOGT takes the basis-dependent form

\[
\sum_{g \in G} D^{(\alpha)*}_{ij}(g)\, D^{(\beta)}_{kl}(g) = \frac{|G|}{\dim D^{(\alpha)}}\, \delta_{\alpha\beta}\, \delta_{ik}\, \delta_{jl} . \tag{9}
\]

The left hand side can be interpreted as a scalar product of two vectors; the right hand side is then an orthogonality relation for them. Namely, consider a given representation (labeled by $\alpha$), and the $ij$th elements of its representation matrices. They form a $|G|$-component vector $(D^{(\alpha)}_{ij}(g_1), D^{(\alpha)}_{ij}(g_2), \ldots, D^{(\alpha)}_{ij}(g_{|G|}))$, where the $g_i$ are all the elements of the group $G$. So we have a collection of vectors, labeled by $\alpha, i, j$. Then (9) is an orthogonality relation for these vectors with respect to the scalar product $\langle \vec v | \vec v' \rangle = \sum_{i=1}^{|G|} v_i^* v'_i$. However, in a $|G|$-dimensional vector space there can be at most $|G|$ mutually orthogonal vectors. The index pair $ij$ has $(\dim D^{(\alpha)})^2$ possible values, so the total number of the above vectors is bounded:

\[
\sum_\alpha (\dim D^{(\alpha)})^2 \le |G| ,
\]

where the sum is taken over all inequivalent unitary irreducible representations (labeled by $\alpha$). In fact (you can try to show it), the sum turns out to be equal to the order $|G|$. This theorem is due to Burnside:

Theorem 3.8 (Burnside’s Theorem) $\sum_\alpha (\dim D^{(\alpha)})^2 = |G|$.

Burnside’s theorem helps to rule out possibilities for irreducible representations. Consider e.g. $G = S_3$, $|S_3| = 6$. The possible dimensions of inequivalent irreducible representations are 2,1,1 or 1,1,1,1,1,1. It turns out that $S_3$ has only three inequivalent irreducible representations (show it). So the irreps have dimensions 2,1,1.

3.5 Characters

Characters are a convenient way to classify inequivalent irreducible representations. To start with, let $\{\vec e_1, \ldots, \vec e_n\}$ be an orthonormal basis in an $n$-dimensional vector space $V$ with respect to a scalar product $\langle\,|\,\rangle$.

Definition. The trace of a linear operator $A$ is

\[
\mathrm{tr}\, A \equiv \sum_{i=1}^{n} \langle \vec e_i | A \vec e_i \rangle .
\]

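Note. The trace is well defined, since it is independent of the choice of orthonormal basis. Let $\{\vec e'_1, \ldots, \vec e'_n\}$ be another orthonormal basis. Then

\[
\mathrm{tr}\, A = \sum_i \langle \vec e_i | A \vec e_i \rangle
= \sum_{ij} \langle \vec e_i | \vec e'_j \rangle \langle \vec e'_j | A \vec e_i \rangle
= \sum_{ij} \langle A^\dagger \vec e'_j | \vec e_i \rangle \langle \vec e_i | \vec e'_j \rangle
= \sum_j \langle A^\dagger \vec e'_j | \vec e'_j \rangle
= \sum_j \langle \vec e'_j | A \vec e'_j \rangle .
\]

Recall also that associated with the operator $A$ is an $n \times n$ matrix with components $A_{ij} = \langle \vec e_i | A \vec e_j \rangle$. Thus $\mathrm{tr}\, A$ is equal to the trace of the matrix.

Now, let $D^{(\alpha)}(g)$ be a unitary representation of a finite group $G$ in $V$.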

Definition. The character of the representation D(α) is the map

χ(α) : G→ C, χ(α)(g) = tr D(α)(g) .

Note. Equivalent representations have the same characters: $\mathrm{tr}\,(A D^{(\alpha)} A^{-1}) = \mathrm{tr}\,(A^{-1} A D^{(\alpha)}) = \mathrm{tr}\, D^{(\alpha)}$, where we used cyclicity of the trace: $\mathrm{tr}\, ABC = \mathrm{tr}\, CAB = \mathrm{tr}\, BCA$ etc.

Recall that conjugation $L_g(g_0) = g g_0 g^{-1}$ is one way to define how $G$ acts on itself; the orbits $\{g g_0 g^{-1} \mid g \in G\}$ were called conjugacy classes. Since $\mathrm{tr}\, D(g g_0 g^{-1}) = \mathrm{tr}\,(D(g) D(g_0) D^{-1}(g)) = \mathrm{tr}\, D(g_0)$, group elements related by conjugation have the same character (again, use cyclicity of the trace). So characters can be interpreted as mappings

\[
\chi^{(\alpha)} : \{\text{conjugacy classes of } G\} \to \mathbb{C} .
\]

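Note also that the character of the unit element is the same as the dimension of the representation: $\chi^{(\alpha)}(e) = \mathrm{tr}\, D^{(\alpha)}(e) = \mathrm{tr}\, \mathrm{id}_V = \dim V = \dim D^{(\alpha)}$.

Recall then the fundamental orthogonality theorem, in its basis-dependent form (9). Now we are going to set $i = j$, $k = l$ in (9) and sum over $i$ and $k$. The left hand side becomes

\[
\sum_{g \in G} \sum_i D^{(\alpha)*}_{ii}(g) \sum_k D^{(\beta)}_{kk}(g) = \sum_{g \in G} \chi^{(\alpha)*}(g)\, \chi^{(\beta)}(g) .
\]

The right hand side becomes

\[
\frac{|G|}{\dim D^{(\alpha)}}\, \delta_{\alpha\beta} \sum_{ik} \delta_{ik} \delta_{ik} = \frac{|G|}{\dim D^{(\alpha)}}\, \delta_{\alpha\beta} \sum_i \delta_{ii} = |G|\, \delta_{\alpha\beta} .
\]

We have derived an orthogonality theorem for characters:

\[
\sum_{g \in G} \chi^{(\alpha)*}(g)\, \chi^{(\beta)}(g) = |G|\, \delta_{\alpha\beta} . \tag{10}
\]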

It can be used to analyze the reduction of a representation. In the reduction of a representation $D$, it may happen that an irreducible representation $D^{(\alpha)}$ appears multiple times in the direct sum:

\[
D = D^{(1)} \oplus D^{(1)} \oplus D^{(1)} \oplus D^{(2)} \oplus D^{(3)} \oplus \cdots
\]

Then we shorten the notation and multiply each irreducible representation by an integer $n_\alpha$ to account for how many times $D^{(\alpha)}$ appears:

\[
D = 3D^{(1)} \oplus D^{(2)} \oplus D^{(3)} \oplus \cdots = \bigoplus_\alpha n_\alpha D^{(\alpha)} .
\]

$n_\alpha$ is called the multiplicity of the representation $D^{(\alpha)}$ in the decomposition. Since tr is a linear operation, the characters of the representation satisfy

\[
\chi = \sum_\alpha n_\alpha \chi^{(\alpha)}
\]

with the same coefficients $n_\alpha$. If we know the character $\chi$ of the reducible representation $D$, and all the characters $\chi^{(\alpha)}$ of the irreducible representations, we can calculate the multiplicity of each irreducible representation in the decomposition by using the orthogonality theorem of characters:

\[
n_\alpha = \frac{1}{|G|} \sum_g \chi^{(\alpha)*}(g)\, \chi(g) .
\]

Then, once we know all the multiplicities, we know the decomposition of the representation $D$. In practice, characters of finite groups can be looked up from character tables. You can find them e.g. in Atoms and Molecules, by M. Weissbluth, pages 115-125. For more explanation of the construction of character tables, see Jones, section 4.4. You will work out some character tables in a problem set.
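As a small illustration of the multiplicity formula, the following sketch decomposes the three-dimensional permutation representation of $S_3$ (the one that permutes the basis vectors of $\mathbb{C}^3$). The character table and the choice of example are assumptions brought in for illustration, not taken from these notes; the sum over group elements is carried out class by class.

```python
from fractions import Fraction

# Conjugacy classes of S3: identity, the three transpositions, the two 3-cycles.
CLASS_SIZES = [1, 3, 2]
ORDER = sum(CLASS_SIZES)  # |S3| = 6

# Characters of the irreducible representations on those classes
# (standard S3 character table, assumed here; all values are real).
IRREPS = {
    "trivial": [1, 1, 1],
    "sign": [1, -1, 1],
    "standard": [2, 0, -1],
}


def multiplicity(chi_irrep, chi):
    """n_alpha = (1/|G|) sum_g chi_alpha(g)^* chi(g), summed class by class.

    The characters used here are real, so complex conjugation is trivial.
    """
    total = sum(n * a * b for n, a, b in zip(CLASS_SIZES, chi_irrep, chi))
    return Fraction(total, ORDER)


# Character of the permutation representation: number of fixed points.
chi_perm = [3, 1, 0]

multiplicities = {name: multiplicity(chi, chi_perm) for name, chi in IRREPS.items()}
print(multiplicities)
```

Under these assumptions the permutation representation contains the trivial and the two-dimensional representation once each, and the sign representation not at all, consistent with $3 = 1 + 2$.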

Again, the orthogonality of characters can be interpreted as an orthogonality relation for vectors, with useful consequences. Let $C_1, C_2, \ldots, C_k$ be the conjugacy classes of $G$, and denote the number of elements of $C_i$ by $|C_i|$. Then (10) implies

\[
\sum_{C_i} |C_i|\, \chi^{(\alpha)*}(C_i)\, \chi^{(\beta)}(C_i) = |G|\, \delta_{\alpha\beta} . \tag{11}
\]

Consider then the vectors $\vec v_\alpha = (\sqrt{|C_1|}\, \chi^{(\alpha)}(C_1), \ldots, \sqrt{|C_k|}\, \chi^{(\alpha)}(C_k))$. The number of such vectors is the same as the number of inequivalent irreducible representations. On the other hand, (11) tells us that the vectors are mutually orthogonal, so their number cannot be larger than the dimension $k$ of the vector space, i.e. the number of conjugacy classes. Again, it can be shown that the numbers are actually the same:

Theorem 3.9 The number of inequivalent unitary irreducible representations of a finite group is the same as the number of its conjugacy classes.
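The class form (11) of the orthogonality relation, and the counting statements that follow from it, can be checked on a small example. The sketch below uses the character table of $S_3$, which is an assumption brought in for illustration rather than something derived in these notes.

```python
# Class sizes |C_1|, |C_2|, |C_3| of S3: identity, transpositions, 3-cycles.
CLASS_SIZES = [1, 3, 2]
ORDER = sum(CLASS_SIZES)

# Rows: characters of the irreducible representations (assumed table).
CHARACTERS = [
    [1, 1, 1],    # trivial
    [1, -1, 1],   # sign
    [2, 0, -1],   # two-dimensional
]


def class_inner(chi_a, chi_b):
    """Left hand side of (11): sum over classes of |C| chi_a(C)^* chi_b(C)."""
    return sum(
        n * complex(a).conjugate() * b
        for n, a, b in zip(CLASS_SIZES, chi_a, chi_b)
    )


# Gram matrix of the characters; (11) says it equals |G| times the identity.
gram = [[class_inner(a, b) for b in CHARACTERS] for a in CHARACTERS]
for row in gram:
    print(row)

# Number of irreps equals number of classes, and the squared dimensions add up to |G|.
print(len(CHARACTERS) == len(CLASS_SIZES))
print(sum(chi[0] ** 2 for chi in CHARACTERS) == ORDER)
```

The first column of the table holds the dimensions $\chi^{(\alpha)}(e)$, which is why the last line doubles as a check of Burnside's theorem for this group.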

If the group is Abelian, the conjugacy class of each element contains only the element itself: $g g_0 g^{-1} = g_0 g g^{-1} = g_0$. So the number of conjugacy classes is the same as the order of the group $|G|$; this is then also the number of inequivalent unitary irreducible representations. On the other hand, according to Burnside’s theorem,

\[
\sum_{\alpha=1}^{|G|} (\dim D^{(\alpha)})^2 = |G| .
\]

Since there are $|G|$ terms on the left hand side, each at least 1, it must be that $\dim D^{(\alpha)} = 1$ for all $\alpha$. Hence:

Theorem 3.10 All unitary irreducible representations of an Abelian group are one dimensional.

This fact can be shown to be true even for continuous Abelian groups. (Hence no word “finite” in the above.)

4 Differentiable Manifolds

4.1 Topological Spaces

The topology of a space $X$ is defined via its open sets.
Let $X$ be a set and $\tau = \{X_\alpha\}_{\alpha \in I}$ a (finite or infinite) collection of subsets of $X$. $(X, \tau)$ is a topological space if

T1 $\emptyset \in \tau$, $X \in \tau$

T2 all possible unions of $X_\alpha$’s belong to $\tau$ ($\bigcup_{\alpha \in I'} X_\alpha \in \tau$ for all $I' \subseteq I$)

T3 all intersections of a finite number of $X_\alpha$’s belong to $\tau$ ($\bigcap_{i=1}^{n} X_{\alpha_i} \in \tau$)

The $X_\alpha$ are called the open sets of $X$ in the topology $\tau$, and $\tau$ is said to give a topology to $X$.
So: topology = a specification of which subsets of $X$ are open.
The same set $X$ can carry several different topologies (see examples).

Examples

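(i) $\tau = \{\emptyset, X\}$ “trivial topology”

(ii) $\tau = \{\text{all subsets of } X\}$ “discrete topology”

(iii) Let $X = \mathbb{R}$, $\tau = \{\text{open intervals } ]a, b[ \text{ and their unions}\}$ “usual topology”

(iv) $X = \mathbb{R}^n$, $\tau = \{\, ]a_1, b_1[ \times \ldots \times ]a_n, b_n[ \text{ and unions of these}\}$.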

Definition: A metric on $X$ is a function $d : X \times X \to \mathbb{R}$ such that

M1 $d(x, y) = d(y, x)$

M2 $d(x, y) \ge 0$, and $d(x, y) = 0$ if and only if $x = y$.

M3 $d(x, y) + d(y, z) \ge d(x, z)$ “triangle inequality”

Example:

\[
X = \mathbb{R}^n, \qquad d_p(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}, \qquad p \ge 1 .
\]

If $p = 2$ we call it the Euclidean metric. (For $0 < p < 1$ the triangle inequality M3 fails, so $d_p$ is then not a metric.)

If $X$ has a metric, then the metric topology is defined by choosing all the “open disks” $U_\varepsilon(x) = \{y \in X \mid d(x, y) < \varepsilon\}$ and all their unions as open sets.
The metric topology of $\mathbb{R}^n$ with metric $d_p$ is equivalent to the usual topology (for all $p \ge 1$!).
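The role of the exponent in $d_p$ can be probed numerically. The following sketch evaluates the triangle inequality M3 for one triple of points in $\mathbb{R}^2$; the points and the helper names are chosen here for illustration. For $p = 1$ and $p = 2$ the inequality holds for this triple, while for $p = 1/2$ it is violated, which is why $d_p$ is a metric only for $p \ge 1$.

```python
def d_p(x, y, p):
    """d_p(x, y) = (sum_i |x_i - y_i|^p)^(1/p)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def triangle_holds(x, y, z, p, tol=1e-12):
    """Check M3, d(x, y) + d(y, z) >= d(x, z), up to rounding."""
    return d_p(x, y, p) + d_p(y, z, p) >= d_p(x, z, p) - tol


# One illustrative triple of points in R^2.
x, y, z = (0.0, 0.0), (1.0, 0.0), (1.0, 1.0)

ok_p2 = triangle_holds(x, y, z, 2)      # Euclidean metric
ok_p1 = triangle_holds(x, y, z, 1)      # "taxicab" metric
ok_half = triangle_holds(x, y, z, 0.5)  # p < 1: here d(x,z) = 4 but d(x,y) + d(y,z) = 2

print(ok_p2, ok_p1, ok_half)
```

A single triple can only disprove M3, not prove it; for $p \ge 1$ the inequality is Minkowski's inequality.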

Let $(X, \tau)$ be a topological space and $A \subset X$ a subset. The topology $\tau$ induces the relative topology $\tau'$ in $A$,

\[
\tau' = \{U_i \cap A \mid U_i \in \tau\} .
\]

This is how we obtain a topology for all subsets of $\mathbb{R}^n$ (like $S^n$).

4.1.1 Continuous Maps

Let $(X, \tau)$ and $(Y, \sigma)$ be topological spaces. A map $f : X \to Y$ is continuous if and only if the inverse image of every open set $V \in \sigma$, $f^{-1}(V) = \{x \in X \mid f(x) \in V\}$, is an open set in $X$: $f^{-1}(V) \in \tau$.

A function $f : X \to Y$ is a homeomorphism if $f$ is continuous and has an inverse $f^{-1} : Y \to X$ which is also continuous.

If there exists a homeomorphism $f : X \to Y$, then we say that $X$ is homeomorphic to $Y$ and vice versa. Denote $X \approx Y$.
This ($\approx$) is an equivalence relation.

Intuitively: $X$ and $Y$ are homeomorphic if we can continuously deform $X$ to $Y$ (without cutting or pasting).
Example: coffee cup $\approx$ donut.

The fundamental question of topology: classify all spaces up to homeomorphism.

One method of classification: topological invariants, i.e. quantities which are invariant under homeomorphisms.
If a topological invariant for $X_1$ $\ne$ the same invariant for $X_2$, then $X_1 \not\approx X_2$.

A neighborhood $N$ of a point $x \in X$ is a subset $N \subset X$ such that there exists an open set $U \in \tau$ with $x \in U$ and $U \subset N$.
($N$ does not have to be an open set.)

$(X, \tau)$ is a Hausdorff space if for an arbitrary pair $x, x' \in X$, $x \ne x'$, there always exist neighborhoods $N \ni x$, $N' \ni x'$ such that $N \cap N' = \emptyset$.
We’ll assume from now on that all topological spaces (that we’ll consider) are Hausdorff.

Example: $\mathbb{R}^n$ with the usual topology is Hausdorff.
All spaces $X$ with a metric topology are Hausdorff.

A subset $A \subset X$ is closed if its complement $X - A = \{x \in X \mid x \notin A\}$ is open.
N.B. $X$ and $\emptyset$ are both open and closed.

A collection $\{A_i\}$ of subsets $A_i \subset X$ is called a covering of $X$ if $\bigcup_i A_i = X$.
If all $A_i$ are open sets in the topology $\tau$ of $X$, $\{A_i\}$ is an open covering.

A topological space $(X, \tau)$ is compact if, for every open covering $\{U_i \mid i \in I\}$, there exists a finite subset $J \subset I$ such that $\{U_i \mid i \in J\}$ is also a covering of $X$, i.e. every open covering has a finite subcovering.

$X$ is connected if it cannot be written as $X = X_1 \cup X_2$ with $X_1, X_2$ both open, nonempty and disjoint, i.e. $X_1 \cap X_2 = \emptyset$.

A loop in a topological space $X$ is a continuous map $f : [0, 1] \to X$ such that $f(0) = f(1)$. If any loop in $X$ can be continuously shrunk to a point, $X$ is called simply connected.

Examples: $\mathbb{R}^2$ is simply connected.
The torus $T^2$ is not simply connected.

Examples of topological invariants (quantities or properties invariant under homeomorphisms):

1. Connectedness

2. Simple connectedness

3. Compactness

4. The Hausdorff property

5. Euler characteristic (see below)

Let $X \subset \mathbb{R}^3$, $X \approx$ polyhedron $K$ (Finnish: monitahokas).
Euler characteristic:

\[
\chi(X) = \chi(K) = (\#\text{ vertices in } K) - (\#\text{ edges in } K) + (\#\text{ faces in } K)
\]

Example: $\chi(T^2) = 16 - 32 + 16 = 0$.
$\chi(S^2) = \chi(\text{cube}) = 8 - 12 + 6 = 2$.
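The invariance of the Euler characteristic can be made concrete by evaluating the formula on different polyhedral decompositions of the same space. In the sketch below the tetrahedron and octahedron are additional decompositions of $S^2$ chosen for illustration, and the torus counts are assumed to come from a 4 by 4 grid of quadrilaterals with opposite sides identified, which reproduces the numbers 16, 32, 16 quoted above.

```python
def euler_characteristic(vertices, edges, faces):
    """chi = V - E + F for a polyhedral decomposition."""
    return vertices - edges + faces


# (vertices, edges, faces); the first three all decompose the sphere S^2.
POLYHEDRA = {
    "tetrahedron": (4, 6, 4),
    "cube": (8, 12, 6),
    "octahedron": (6, 12, 8),
    # Assumed decomposition of the torus: a 4 x 4 periodic grid of squares.
    "torus (4 x 4 grid)": (16, 32, 16),
}

chi = {name: euler_characteristic(*counts) for name, counts in POLYHEDRA.items()}
for name, value in chi.items():
    print(f"{name}: chi = {value}")
```

All three decompositions of the sphere give the same value, as expected for a topological invariant.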

4.2 Homotopy Groups

4.2.1 Paths and Loops

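Let $X$ be a topological space and $I = [0, 1] \subset \mathbb{R}$.
A continuous map $\alpha : I \to X$ is a path in $X$. The path $\alpha$ starts at $\alpha_0 = \alpha(0)$ and ends at $\alpha_1 = \alpha(1)$.
If $\alpha_0 = \alpha_1 \equiv x_0$, then $\alpha$ is a loop with base point $x_0$. We will focus on loops.

Definition: The product of two loops $\alpha, \beta$ with the same base point $x_0$, denoted by $\alpha * \beta$, is the loop

\[
(\alpha * \beta)(t) =
\begin{cases}
\alpha(2t), & 0 \le t \le \tfrac{1}{2} \\
\beta(2t - 1), & \tfrac{1}{2} \le t \le 1 .
\end{cases}
\]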

4.2.2 Homotopy

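Let $\alpha, \beta$ be two loops in $X$ with base point $x_0$. $\alpha$ and $\beta$ are homotopic, $\alpha \sim \beta$, if there exists a continuous map $F : I \times I \to X$ such that

\[
\begin{aligned}
F(s, 0) &= \alpha(s) && \forall s \in I \\
F(s, 1) &= \beta(s) && \forall s \in I \\
F(0, t) &= F(1, t) = x_0 && \forall t \in I .
\end{aligned}
\]

$F$ is called a homotopy between $\alpha$ and $\beta$.

Homotopy is an equivalence relation:

1. $\alpha \sim \alpha$: choose $F(s, t) = \alpha(s)$ $\forall t \in I$

2. $\alpha \sim \beta$ with homotopy $F(s, t)$ $\Rightarrow$ $\beta \sim \alpha$ with homotopy $F(s, 1 - t)$

3. $\alpha \sim \beta$ with homotopy $F(s, t)$; $\beta \sim \gamma$ with homotopy $G(s, t)$. Then choose

\[
H(s, t) =
\begin{cases}
F(s, 2t), & 0 \le t \le \tfrac{1}{2} \\
G(s, 2t - 1), & \tfrac{1}{2} \le t \le 1
\end{cases}
\]

$\Rightarrow$ $H(s, t)$ is a homotopy between $\alpha$ and $\gamma$, so $\alpha \sim \gamma$.

The equivalence class $[\alpha]$ is called the homotopy class of $\alpha$.
($[\alpha]$ = all loops homotopic to $\alpha$.)

Lemma: If $\alpha \sim \alpha'$ and $\beta \sim \beta'$, then $\alpha * \beta \sim \alpha' * \beta'$.
Proof: Let $F(s, t)$ be a homotopy between $\alpha$ and $\alpha'$ and let $G(s, t)$ be a homotopy between $\beta$ and $\beta'$. Then

\[
H(s, t) =
\begin{cases}
F(2s, t), & 0 \le s \le \tfrac{1}{2} \\
G(2s - 1, t), & \tfrac{1}{2} \le s \le 1
\end{cases}
\]

is a homotopy between $\alpha * \beta$ and $\alpha' * \beta'$. This concludes the proof.

By the lemma, we can define a product of homotopy classes: $[\alpha] * [\beta] \equiv [\alpha * \beta]$.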

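Theorem: The set of homotopy classes of loops at $x_0 \in X$, with the product defined as above, is a group called the fundamental group (or first homotopy group) of $X$ at $x_0$. It is denoted by $\Pi_1(X, x_0)$.

Proof:

(0) Closure under multiplication: For all $[\alpha], [\beta] \in \Pi_1(X, x_0)$ we have $[\alpha] * [\beta] = [\alpha * \beta] \in \Pi_1(X, x_0)$, since $\alpha * \beta$ is also a loop at $x_0$.

(1) Associativity: We need to show $(\alpha * \beta) * \gamma \sim \alpha * (\beta * \gamma)$. A homotopy is

\[
F(s, t) =
\begin{cases}
\alpha\!\left( \dfrac{4s}{1 + t} \right), & 0 \le s \le \dfrac{1 + t}{4} \\[8pt]
\beta(4s - t - 1), & \dfrac{1 + t}{4} \le s \le \dfrac{2 + t}{4} \\[8pt]
\gamma\!\left( \dfrac{4s - t - 2}{2 - t} \right), & \dfrac{2 + t}{4} \le s \le 1
\end{cases}
\]

$\Rightarrow [(\alpha * \beta) * \gamma] = [\alpha * (\beta * \gamma)] \equiv [\alpha * \beta * \gamma]$.

(2) Unit element: Let us show that the unit element is $e = [C_{x_0}]$, where $C_{x_0}$ is the constant path $C_{x_0}(s) = x_0$ $\forall s \in I$. This follows since we have the homotopies

\[
\alpha * C_{x_0} \sim \alpha : \quad F(s, t) =
\begin{cases}
\alpha\!\left( \dfrac{2s}{1 + t} \right), & 0 \le s \le \dfrac{1 + t}{2} \\[8pt]
x_0, & \dfrac{1 + t}{2} \le s \le 1
\end{cases}
\]

\[
C_{x_0} * \alpha \sim \alpha : \quad F(s, t) =
\begin{cases}
x_0, & 0 \le s \le \dfrac{1 - t}{2} \\[8pt]
\alpha\!\left( \dfrac{2s - 1 + t}{1 + t} \right), & \dfrac{1 - t}{2} \le s \le 1
\end{cases}
\]

$\Rightarrow [\alpha * C_{x_0}] = [C_{x_0} * \alpha] = [\alpha]$.

(3) Inverse: Define $\alpha^{-1}(s) = \alpha(1 - s)$. We need to show that $[\alpha^{-1}]$ is really the inverse of $[\alpha]$: $[\alpha * \alpha^{-1}] = [C_{x_0}]$. Define

\[
F(s, t) =
\begin{cases}
\alpha(2s(1 - t)), & 0 \le s \le \tfrac{1}{2} \\
\alpha(2(1 - s)(1 - t)), & \tfrac{1}{2} \le s \le 1 .
\end{cases}
\]

Now we have $F(s, 0) = (\alpha * \alpha^{-1})(s)$ and $F(s, 1) = C_{x_0}(s)$, so $\alpha * \alpha^{-1} \sim C_{x_0}$. Similarly $\alpha^{-1} * \alpha \sim C_{x_0}$, so we have proven the claim: $[\alpha^{-1} * \alpha] = [\alpha * \alpha^{-1}] = [C_{x_0}]$, quod erat demonstrandum.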
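The explicit homotopies in the proof can be sanity-checked numerically. The sketch below does this for the associativity homotopy of step (1), using three concrete loops on the unit circle with base point $(1, 0)$; the loops, sample grid and tolerance are choices made here for illustration. It checks that $F(s, 0)$ reproduces $(\alpha * \beta) * \gamma$, that $F(s, 1)$ reproduces $\alpha * (\beta * \gamma)$, and that the end points stay at the base point.

```python
import math


def alpha(t):
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))


def beta(t):
    return (math.cos(4 * math.pi * t), math.sin(4 * math.pi * t))


def gamma(t):
    return (math.cos(2 * math.pi * t), -math.sin(2 * math.pi * t))


def concat(a, b):
    """Loop product a * b: run a on [0, 1/2], then b on [1/2, 1]."""
    def loop(t):
        return a(2 * t) if t <= 0.5 else b(2 * t - 1)
    return loop


def assoc_homotopy(s, t):
    """F(s, t) from step (1) of the proof."""
    if s <= (1 + t) / 4:
        return alpha(4 * s / (1 + t))
    if s <= (2 + t) / 4:
        return beta(4 * s - t - 1)
    return gamma((4 * s - t - 2) / (2 - t))


def close(p, q, tol=1e-9):
    return all(abs(a - b) <= tol for a, b in zip(p, q))


samples = [k / 200 for k in range(201)]
left = concat(concat(alpha, beta), gamma)    # (alpha * beta) * gamma
right = concat(alpha, concat(beta, gamma))   # alpha * (beta * gamma)

starts_ok = all(close(assoc_homotopy(s, 0.0), left(s)) for s in samples)
ends_ok = all(close(assoc_homotopy(s, 1.0), right(s)) for s in samples)
base_ok = all(
    close(assoc_homotopy(end, k / 10), (1.0, 0.0))
    for end in (0.0, 1.0)
    for k in range(11)
)
print(starts_ok, ends_ok, base_ok)
```

A finite sample cannot establish continuity of $F$; it only confirms that the piecewise formulas match at the boundaries $t = 0$, $t = 1$ and at the base point.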

4.2.3 Properties of the Fundamental Group

1. If $x_0$ and $x_1$ can be connected by a path, then $\Pi_1(X, x_0) \cong \Pi_1(X, x_1)$. If $X$ is arcwise connected, then the fundamental group is independent of the choice of $x_0$ up to an isomorphism: $\Pi_1(X, x_0) \cong \Pi_1(X)$.

(A space $X$ is arcwise connected if any two points $x_0, x_1 \in X$ can be connected by a path. It can be shown that an arcwise connected space is always connected, but the converse is not true in general. However, a connected space that is also locally arcwise connected, for example a connected manifold, is arcwise connected.)

2. $\Pi_1(X)$ is a topological invariant: $X \approx Y \Rightarrow \Pi_1(X) \cong \Pi_1(Y)$.

3. Examples:

• $\Pi_1(\mathbb{R}^2) = 0$ (= the trivial group)

• $\Pi_1(T^2) = \Pi_1(S^1 \times S^1) = \mathbb{Z} \oplus \mathbb{Z}$.

(One can show that $\Pi_1(X \times Y) = \Pi_1(X) \oplus \Pi_1(Y)$ for arcwise connected spaces $X$ and $Y$.)

The real projective space is defined as $\mathbb{R}P^n = \{\text{lines through the origin in } \mathbb{R}^{n+1}\}$. If $x = (x_0, x_1, \ldots, x_n) \ne 0$, then $x$ defines a line. All $y = \lambda x$ with nonzero $\lambda \in \mathbb{R}$ are on the same line, and thus we have an equivalence relation: $y \sim x \Leftrightarrow y = \lambda x,\ \lambda \in \mathbb{R} - \{0\} \Leftrightarrow$ ($x$ and $y$ are on the same line).
So $\mathbb{R}P^n = \{[x] \mid x \in \mathbb{R}^{n+1} - \{0\}\}$ with the above equivalence relation.

Example: $\mathbb{R}P^2 \approx$ ($S^2$ with opposite points identified),
$\Pi_1(\mathbb{R}P^2) = \mathbb{Z}_2$.

4.2.4 Higher Homotopy Groups

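Define: $I^n = \{(s_1, \ldots, s_n) \mid 0 \le s_i \le 1,\ 1 \le i \le n\}$,
$\partial I^n$ = boundary of $I^n$ = $\{(s_1, \ldots, s_n) \in I^n \mid \text{some } s_i = 0 \text{ or } 1\}$.

A continuous map $\alpha : I^n \to X$ which maps every point on $\partial I^n$ to the same point $x_0 \in X$ is called an n-loop at $x_0 \in X$. Let $\alpha$ and $\beta$ be n-loops at $x_0$. We say that $\alpha$ is homotopic to $\beta$, $\alpha \sim \beta$, if there exists a continuous map $F : I^n \times I \to X$ such that

\[
\begin{aligned}
F(s_1, \ldots, s_n, 0) &= \alpha(s_1, \ldots, s_n) \\
F(s_1, \ldots, s_n, 1) &= \beta(s_1, \ldots, s_n) \\
F(s_1, \ldots, s_n, t) &= x_0 \quad \forall t \in I \text{ when } (s_1, \ldots, s_n) \in \partial I^n .
\end{aligned}
\]

Homotopy $\alpha \sim \beta$ is again an equivalence relation, with equivalence classes the homotopy classes $[\alpha]$.

Define:

\[
(\alpha * \beta)(s_1, \ldots, s_n) =
\begin{cases}
\alpha(2s_1, s_2, \ldots, s_n), & 0 \le s_1 \le \tfrac{1}{2} \\
\beta(2s_1 - 1, s_2, \ldots, s_n), & \tfrac{1}{2} \le s_1 \le 1 ,
\end{cases}
\]

\[
\alpha^{-1}(s_1, \ldots, s_n) = \alpha(1 - s_1, s_2, \ldots, s_n) ,
\]

\[
[\alpha] * [\beta] = [\alpha * \beta]
\]

$\Rightarrow$ $\Pi_n(X, x_0)$, the nth homotopy group of $X$ at $x_0$. (This classifies continuous maps $S^n \to X$.)
Example: $\Pi_2(S^2) = \mathbb{Z}$.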

4.3 Differentiable Manifolds

Definition: $M$ is an m-dimensional differentiable manifold if

(i) $M$ is a topological space

(ii) $M$ is provided with a family of pairs $\{(U_i, \varphi_i)\}$, where $\{U_i\}$ is an open covering of $M$: $\bigcup_i U_i = M$, and every $\varphi_i : U_i \to U'_i \subset \mathbb{R}^m$, $U'_i$ open, is a homeomorphism.

- The pair $(U_i, \varphi_i)$ is called a chart, $\{(U_i, \varphi_i)\}$ an atlas, $U_i$ the coordinate neighborhood and $\varphi_i$ the coordinate function.
$\varphi_i(p) = (x^1(p), \ldots, x^m(p))$, $p \in U_i$, are the coordinate(s) of $p$.

(iii) Given $U_i$ and $U_j$ such that $U_i \cap U_j \ne \emptyset$, the map $\psi_{ij} = \varphi_i \circ \varphi_j^{-1}$ from $\varphi_j(U_i \cap U_j)$ to $\varphi_i(U_i \cap U_j)$ is infinitely differentiable (or: $C^\infty$ or smooth).

- $\psi_{ij}$ is called a transition function.

Recall: $f : \mathbb{R}^m \to \mathbb{R}^n$, $f = (f^1, \ldots, f^n)$, is $C^k$ if the partial derivatives

\[
\frac{\partial^k f^l}{\partial (x^1)^{k_1} \cdots \partial (x^m)^{k_m}}, \qquad l = 1, \ldots, n, \qquad k_1 + k_2 + \ldots + k_m = k
\]

exist and are continuous. The function $f$ is $C^\infty$ if all partial derivatives exist and are continuous for any $k$. We also call a $C^\infty$ function $f$ smooth.

The number $m$ is the dimension of the manifold: $\dim M = m$.

If the union of two atlases $\{(U_i, \varphi_i)\}$, $\{(V_i, \psi_i)\}$ is again an atlas, they are said to be compatible. This gives an equivalence relation among atlases; the equivalence class is called a differentiable structure.

A given manifold $M$ can have several different differentiable structures: for example $S^7$ has 28 and $\mathbb{R}^4$ has infinitely (!) many differentiable structures.

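Examples of differentiable manifolds: $S^n$

Let us realize $S^n$ as a subset of $\mathbb{R}^{n+1}$: $S^n = \{x \in \mathbb{R}^{n+1} \mid \sum_{i=0}^{n} (x^i)^2 = 1\}$.
One possible atlas:

• coordinate neighborhoods:

\[
U_{i+} \equiv \{x \in S^n \mid x^i > 0\}, \qquad U_{i-} \equiv \{x \in S^n \mid x^i < 0\}
\]

• coordinates:

\[
\begin{aligned}
\varphi_{i+}(x^0, \ldots, x^n) &= (x^0, \ldots, x^{i-1}, x^{i+1}, \ldots, x^n) \in \mathbb{R}^n \\
\varphi_{i-}(x^0, \ldots, x^n) &= (x^0, \ldots, x^{i-1}, x^{i+1}, \ldots, x^n) \in \mathbb{R}^n
\end{aligned}
\]

(so these are projections onto the plane $x^i = 0$.)

The transition functions ($i \ne j$, $\alpha = \pm$, $\beta = \pm$; written here for $i < j$),

\[
\psi_{i\alpha j\beta} = \varphi_{i\alpha} \circ \varphi_{j\beta}^{-1} : \quad
(x^0, \ldots, x^i, \ldots, x^{j-1}, x^{j+1}, \ldots, x^n)
\mapsto \Big(x^0, \ldots, x^{i-1}, x^{i+1}, \ldots, x^{j-1},\ \beta \sqrt{1 - \textstyle\sum_{k \ne j} (x^k)^2},\ x^{j+1}, \ldots, x^n\Big),
\]

are $C^\infty$.
There are other compatible atlases, e.g. the one defined by the stereographic projection.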
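The projection atlas can be tried out on $S^2$ ($n = 2$). The sketch below implements the charts, their inverses, and a transition function as plain Python functions; the function names and the sample point are illustrative assumptions, and the indices run over 0, 1, 2 as in the text.

```python
import math


def phi(i, x):
    """Chart phi_{i+} or phi_{i-}: drop the i-th coordinate of x."""
    return tuple(c for k, c in enumerate(x) if k != i)


def phi_inverse(j, sign, u):
    """Inverse of phi_{j, sign}: reinsert x^j = sign * sqrt(1 - |u|^2)."""
    xj = sign * math.sqrt(1.0 - sum(c * c for c in u))
    return u[:j] + (xj,) + u[j:]


def transition(i, j, sign_j, u):
    """psi = phi_i o phi_{j, sign_j}^{-1}, defined on phi_j(U_i and U_j)."""
    return phi(i, phi_inverse(j, sign_j, u))


# A point of S^2 lying in both U_{0+} (x^0 > 0) and U_{2+} (x^2 > 0).
point = (0.6, 0.0, 0.8)

u = phi(2, point)                  # coordinates in the chart (U_{2+}, phi_{2+})
back = phi_inverse(2, +1, u)       # should reproduce the point
v = transition(0, 2, +1, u)        # coordinates in the chart (U_{0+}, phi_{0+})
print(u, back, v)
```

The square root in `phi_inverse` is smooth as long as $|u| < 1$, which is exactly the image of the open hemisphere; this is where the smoothness of the transition functions comes from.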

4.3.1 Manifold with a Boundary

Let $H^m$ be the “upper” half-space: $H^m = \{(x^1, \ldots, x^m) \in \mathbb{R}^m \mid x^m \ge 0\}$.
Now require for the coordinate functions: $\varphi_i : U_i \to U'_i \subset H^m$, where $U'_i$ is open in $H^m$. (The topology on $H^m$ is the relative topology induced from $\mathbb{R}^m$.)
Points with coordinate $x^m = 0$ belong to the boundary of $M$ (denoted by $\partial M$). The transition functions must now satisfy: $\psi_{ij} : \varphi_j(U_i \cap U_j) \to \varphi_i(U_i \cap U_j)$ are $C^\infty$ in an open set of $\mathbb{R}^m$ which contains $\varphi_j(U_i \cap U_j)$.

4.4 The Calculus on Manifolds

4.4.1 Differentiable Maps

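Let $M, N$ be differentiable manifolds with dimensions $\dim M = m$ and $\dim N = n$. Let $f$ be a map $f : M \to N$, $p \mapsto f(p)$. Take charts $(U, \varphi)$ and $(V, \psi)$ such that $p \in U$ and $f(p) \in V$. If the combined map $\psi \circ f \circ \varphi^{-1} : \mathbb{R}^m \to \mathbb{R}^n$ is $C^\infty$ at $\varphi(p)$, then $f$ is differentiable at $p$. The definition is independent of the choice of charts, since if $(U_1, \varphi_1)$ is some other chart at $p$, then

\[
\psi \circ f \circ \varphi_1^{-1} = \underbrace{(\psi \circ f \circ \varphi^{-1})}_{C^\infty} \circ \underbrace{(\varphi \circ \varphi_1^{-1})}_{C^\infty} \quad \Rightarrow \quad \psi \circ f \circ \varphi_1^{-1} \text{ is } C^\infty .
\]

If in addition $\psi \circ f \circ \varphi^{-1}$ is invertible, i.e. the inverse map $\varphi \circ f^{-1} \circ \psi^{-1}$ exists and is also $C^\infty$, then $f$ is called a diffeomorphism between $M$ and $N$. In this case we say that $M$ is diffeomorphic to $N$ and denote it by $M \equiv N$.

Note: homeomorphism = continuous deformation,
diffeomorphism = smooth deformation.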

• An open curve on $M$ is a map $c : \,]a, b[\, \to M$, where $]a, b[$ is an open interval in $\mathbb{R}$ (notation: $(a, b) = \,]a, b[$).

• A closed curve is a map $S^1 \to M$.

• On a chart $(U, \varphi)$ a curve $c$ has a coordinate representation $x(t) = (\varphi \circ c)(t) : \mathbb{R} \to \mathbb{R}^m$.

A function $f$ on $M$ is a smooth map $M \to \mathbb{R}$.
$\mathcal{F}$ = the set of smooth functions = $\{f : M \to \mathbb{R} \mid f \text{ is smooth}\}$.

4.4.2 Tangent Vectors

Tangent vectors are defined using curves. Let $c : (a, b) \to M$ be a curve (we can assume $0 \in (a, b)$). Denote $c(0) = p$ and let $f : M \to \mathbb{R}$ be a function.
The rate of change of $f$ along the curve $c$ at the point $p$ is

\[
\left. \frac{df(c(t))}{dt} \right|_{t=0} = \frac{\partial f}{\partial x^\mu} \left. \frac{dx^\mu(c(t))}{dt} \right|_{t=0} ,
\]

where $x^\mu(p) = \varphi^\mu(p)$ are local coordinates and

\[
\frac{\partial f}{\partial x^\mu} \equiv \frac{\partial (f \circ \varphi^{-1}(x))}{\partial x^\mu} .
\]

Also we have introduced the Einstein summation convention:

• When an index appears once as a subscript and once as a superscript, it is understood to be summed over. For example $x_\mu y^\mu \equiv \sum_{\mu=1}^{m} x_\mu y^\mu = x_1 y^1 + \ldots + x_m y^m$.

In other words, $\left. \frac{df(c(t))}{dt} \right|_{t=0}$ is obtained by acting on the function $f$ with the differential operator

\[
X_p \equiv X_p^\mu \left( \frac{\partial}{\partial x^\mu} \right)_p , \qquad \text{where} \quad X_p^\mu = \left. \frac{dx^\mu(c(t))}{dt} \right|_{t=0} .
\]

The operator $X_p$ is called a tangent vector of $M$ at $p$. It depends on the curve, but several curves can give rise to the same tangent vector $X_p$. We can see that two curves $c_1$ and $c_2$ give the same $X_p$ if and only if

(i) $c_1(0) = c_2(0) = p$

(ii) $\left. \dfrac{dx^\mu(c_1(t))}{dt} \right|_{t=0} = \left. \dfrac{dx^\mu(c_2(t))}{dt} \right|_{t=0}$

This gives an equivalence relation between curves, $c_1 \sim c_2$. Thus the equivalence classes can be identified with the tangent vectors $X_p$.

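Example. Define curves $\alpha, \beta, \gamma : \mathbb{R} \to \mathbb{R}^2$,

\[
\begin{aligned}
\alpha(t) &= (1 + \sin t \cos t,\ 1 + 3t \cos 2t) \\
\beta(t) &= (1 + t,\ 1 + 3t e^{3t}) \\
\gamma(t) &= (e^t,\ e^{3t})
\end{aligned}
\]

All these curves pass through the point $p = (1, 1)$ at $t = 0$. The tangent vector to $\alpha$ at $p$ is

\[
\begin{aligned}
X_{\alpha p} &= \left. \frac{dx^1(\alpha(t))}{dt} \right|_{t=0} \left( \frac{\partial}{\partial x^1} \right)_p + \left. \frac{dx^2(\alpha(t))}{dt} \right|_{t=0} \left( \frac{\partial}{\partial x^2} \right)_p \\
&= \left. (\cos^2 t - \sin^2 t) \right|_{t=0} \left( \frac{\partial}{\partial x^1} \right)_p + \left. (3 \cos 2t - 6t \sin 2t) \right|_{t=0} \left( \frac{\partial}{\partial x^2} \right)_p \\
&= \left( \frac{\partial}{\partial x^1} \right)_p + 3 \left( \frac{\partial}{\partial x^2} \right)_p .
\end{aligned}
\]

The tangent vectors to the other curves are the same, hence the curves belong to the same equivalence class.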
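The claim that the three curves define the same tangent vector at $p = (1, 1)$ can be checked numerically. The sketch below approximates the components $X^\mu_p = dx^\mu(c(t))/dt|_{t=0}$ by central differences; the step size and helper name are choices made here, and the result is only accurate up to discretization error.

```python
import math


def alpha(t):
    return (1 + math.sin(t) * math.cos(t), 1 + 3 * t * math.cos(2 * t))


def beta(t):
    return (1 + t, 1 + 3 * t * math.exp(3 * t))


def gamma(t):
    return (math.exp(t), math.exp(3 * t))


def tangent_components(curve, h=1e-6):
    """Approximate (dx^1/dt, dx^2/dt) at t = 0 by a central difference."""
    forward, backward = curve(h), curve(-h)
    return tuple((f - b) / (2 * h) for f, b in zip(forward, backward))


components = {
    name: tangent_components(curve)
    for name, curve in (("alpha", alpha), ("beta", beta), ("gamma", gamma))
}
for name, comps in components.items():
    print(name, comps)
```

All three results agree with the components $(1, 3)$ found analytically for $\alpha$, so the curves lie in one equivalence class.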

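The set of all tangent vectors at $p$ is the tangent space $T_pM$ at $p$. It is a real vector space, $\dim T_pM = m$:

• $X_{1p} + X_{2p} = (X_{1p}^\mu + X_{2p}^\mu) \left( \dfrac{\partial}{\partial x^\mu} \right)_p$

• $cX_p = (cX_p^\mu) \left( \dfrac{\partial}{\partial x^\mu} \right)_p$

$(e_\mu)_p = \left( \dfrac{\partial}{\partial x^\mu} \right)_p$ is called the coordinate basis.

The vectors are independent of the choice of coordinates, provided their components are transformed in the correct way. Let $x(p) = \varphi_i(p)$ and $y(p) = \varphi_j(p)$ be two sets of coordinates. For the vector to be independent of the choice of coordinates we must have

\[
X = X^\mu \frac{\partial}{\partial x^\mu} = Y^\mu \frac{\partial}{\partial y^\mu} .
\]

But on the other hand by the chain rule we have

\[
X^\mu \frac{\partial}{\partial x^\mu} = X^\nu \frac{\partial y^\mu}{\partial x^\nu} \frac{\partial}{\partial y^\mu} .
\]

Thus we get the transformation rule for the components:

\[
Y^\mu = X^\nu \frac{\partial y^\mu}{\partial x^\nu} .
\]

Note the abuse of notation:

\[
X^\nu \frac{\partial y^\mu}{\partial x^\nu} \frac{\partial}{\partial y^\mu} \equiv X_p^\nu\, \frac{\partial (\varphi_j \circ \varphi_i^{-1})^\mu (x(p))}{\partial x^\nu(p)} \left( \frac{\partial}{\partial y^\mu} \right)_p .
\]

Let us now leave calculus on manifolds for a while and study vector spaces some more.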

4.4.3 Dual Vector Space

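Let $V$ be a complex vector space and $f$ a linear function $V \to \mathbb{C}$. Now $V^* = \{f \mid f \text{ is a linear function } V \to \mathbb{C}\}$ is also a complex vector space, the dual vector space to $V$:

• $(f_1 + f_2)(\vec v) = f_1(\vec v) + f_2(\vec v)$

• $(af)(\vec v) = a(f(\vec v))$

• $\vec 0_{V^*}(\vec v) = 0$ $\forall \vec v \in V$

The elements of $V^*$ are called dual vectors.
Let $\{\vec e_1, \ldots, \vec e_n\}$ be a basis of $V$. Then any vector $\vec v \in V$ can be written as $\vec v = v^i \vec e_i$. We define a dual basis in $V^*$ such that $e^{*i}(\vec e_j) = \delta^i_j$. From this it follows that $\dim V = \dim V^* = n$ (dual basis = $\{e^{*1}, \ldots, e^{*n}\}$). We can then expand any $f \in V^*$ as $f = f_i e^{*i}$ for some coefficients $f_i \in \mathbb{C}$. Now we have

\[
f(\vec v) = f_i e^{*i}(v^j \vec e_j) = f_i v^j e^{*i}(\vec e_j) = f_i v^i .
\]

This can be interpreted as an inner product:

\[
\langle\ ,\ \rangle : V^* \times V \to \mathbb{C}, \qquad \langle f, \vec v \rangle = f_i v^i .
\]

(Note that this is not the same inner product $\langle\,|\,\rangle$ which we discussed before: $\langle\ ,\ \rangle : V^* \times V \to \mathbb{C}$ but $\langle\,|\,\rangle : V \times V \to \mathbb{C}$.)

Pullback: Let $f : V \to W$ and $g : W \to \mathbb{C}$ be linear maps ($g \in W^*$). It follows that $g \circ f : V \to \mathbb{C}$ is a linear map, i.e. $g \circ f \in V^*$.

\[
V \xrightarrow{\ f\ } W \xrightarrow{\ g\ } \mathbb{C}, \qquad g \circ f : V \to \mathbb{C}
\]

Now $f$ induces a map $f^* : W^* \to V^*$, $g \mapsto g \circ f$, i.e. $f^*(g) = g \circ f \in V^*$. $f^*(g)$ is called the pullback (Finnish: takaisinveto) of $g$.

Dual of a Dual: Let $\omega : V^* \to \mathbb{C}$ be a linear function ($\omega \in (V^*)^*$). Every $\vec v \in V$ induces via the inner product an element $\omega_{\vec v} \in (V^*)^*$ defined by $\omega_{\vec v}(f) = \langle f, \vec v \rangle$. On the other hand, it can be shown that this gives all $\omega \in (V^*)^*$. So we can identify $(V^*)^*$ with $V$.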

Tensors: A tensor of type $(p, q)$ is a function of $p$ dual vectors and $q$ vectors which is linear in each of its arguments (so $T$ is a multilinear object):

\[
T : \underbrace{V^* \times \ldots \times V^*}_{p} \times \underbrace{V \times \ldots \times V}_{q} \to \mathbb{C} .
\]

Examples: (0,1) tensor = dual vector : $V \to \mathbb{C}$
(1,0) tensor = (dual of a dual) vector

(1,2) tensor: $T : V^* \times V \times V \to \mathbb{C}$. Choose a basis $\{\vec e_i\}$ in $V$ and $\{e^{*i}\}$ in $V^*$:

\[
T(f, \vec v, \vec w) = T(f_i e^{*i}, v^j \vec e_j, w^k \vec e_k) = f_i v^j w^k\, T(e^{*i}, \vec e_j, \vec e_k) = T^i{}_{jk}\, f_i v^j w^k , \qquad T^i{}_{jk} \equiv T(e^{*i}, \vec e_j, \vec e_k) ,
\]

where $T^i{}_{jk}$ are the components of the tensor and they uniquely determine the tensor. Note the positioning of the indices.
In general, $(p, q)$ tensor components have $p$ upper and $q$ lower indices.

Tensor product: Let $R$ be a $(p, q)$ tensor and $S$ be a $(p', q')$ tensor. Then $T = R \otimes S$ is defined as the $(p + p', q + q')$ tensor

\[
\begin{aligned}
& T(f_1, \ldots, f_p; f_{p+1}, \ldots, f_{p+p'}; \vec v_1, \ldots, \vec v_q; \vec v_{q+1}, \ldots, \vec v_{q+q'}) \\
& \quad = R(f_1, \ldots, f_p; \vec v_1, \ldots, \vec v_q)\, S(f_{p+1}, \ldots, f_{p+p'}; \vec v_{q+1}, \ldots, \vec v_{q+q'}) .
\end{aligned}
\]

In terms of components:

\[
T^{i_1 \ldots i_p i_{p+1} \ldots i_{p+p'}}{}_{j_1 \ldots j_q j_{q+1} \ldots j_{q+q'}} = R^{i_1 \ldots i_p}{}_{j_1 \ldots j_q}\, S^{i_{p+1} \ldots i_{p+p'}}{}_{j_{q+1} \ldots j_{q+q'}} .
\]

Contraction: This is an operation that produces a $(p-1, q-1)$ tensor from a $(p, q)$ tensor:

\[
\underbrace{T}_{(p, q)} \mapsto \underbrace{T_{c(ij)}}_{(p-1, q-1)} ,
\]

where the $(p-1, q-1)$ tensor $T_{c(ij)}$ is

\[
T_{c(ij)}(f_1, \ldots, f_{p-1}; \vec v_1, \ldots, \vec v_{q-1}) = T(f_1, \ldots, \underbrace{e^{*k}}_{i\text{th}}, \ldots, f_{p-1}; \vec v_1, \ldots, \underbrace{\vec e_k}_{j\text{th}}, \ldots, \vec v_{q-1}) .
\]

Note the sum over $k$ in the formula above. In component form this is

\[
T_{c(ij)}{}^{l_1 \ldots l_{p-1}}{}_{m_1 \ldots m_{q-1}} = T^{l_1 \ldots l_{i-1} k\, l_i \ldots l_{p-1}}{}_{m_1 \ldots m_{j-1} k\, m_j \ldots m_{q-1}} .
\]

Now we can return to calculus on manifolds.

4.4.4 1-forms (i.e. cotangent vectors)

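Tangent vectors of a differentiable manifold $M$ at a point $p$ were elements of the vector space $T_pM$. Cotangent vectors or 1-forms are their dual vectors, i.e. linear functions $T_pM \to \mathbb{R}$. In other words, they are elements of the dual vector space $T_p^*M$. Let $w \in T_p^*M$ and $v \in T_pM$; then the inner product $\langle\ ,\ \rangle : T_p^*M \times T_pM \to \mathbb{R}$ is

\[
\langle w, v \rangle = w(v) \in \mathbb{R} .
\]

The inner product is bilinear:

\[
\begin{aligned}
\langle w, \alpha_1 v_1 + \alpha_2 v_2 \rangle &= w(\alpha_1 v_1 + \alpha_2 v_2) = \alpha_1 \langle w, v_1 \rangle + \alpha_2 \langle w, v_2 \rangle \\
\langle \alpha_1 w_1 + \alpha_2 w_2, v \rangle &= (\alpha_1 w_1 + \alpha_2 w_2)(v) = \alpha_1 \langle w_1, v \rangle + \alpha_2 \langle w_2, v \rangle .
\end{aligned}
\]

Let $\{e_\mu\} = \{\frac{\partial}{\partial x^\mu}\}$ be a coordinate basis of $T_pM$. (Note that the correct notation would be $\left( \frac{\partial}{\partial x^\mu} \right)_p$, but this is somewhat cumbersome so we use the shorter notation.) The dual basis is denoted by $\{dx^\mu\}$ and it satisfies by definition

\[
\left\langle dx^\mu, \frac{\partial}{\partial x^\nu} \right\rangle = dx^\mu \left( \frac{\partial}{\partial x^\nu} \right) = \delta^\mu_\nu .
\]

Now we can expand $w = w_\mu dx^\mu$ and $v = v^\nu \frac{\partial}{\partial x^\nu}$. Then

\[
w(v) = \langle w, v \rangle = w_\mu v^\nu\, dx^\mu \left( \frac{\partial}{\partial x^\nu} \right) = w_\mu v^\mu .
\]

Consider now a function $f \in \mathcal{F}(M)$ (i.e. $f$ is a smooth map $M \to \mathbb{R}$). Its differential $df \in T_p^*M$ is the map

\[
df(v) = \langle df, v \rangle \equiv v(f) = v^\mu \frac{\partial f}{\partial x^\mu} .
\]

Thus the components of $df$ are $\frac{\partial f}{\partial x^\mu}$ and

\[
df = \frac{\partial f}{\partial x^\mu}\, dx^\mu .
\]

Consider two coordinate patches $U_i$ and $U_j$ with $p \in U_i \cap U_j$. Let $x = \varphi_i(p)$ and $y = \varphi_j(p)$ be the coordinates in $U_i$ and $U_j$ respectively. We can derive how the components of a 1-form transform under the change of coordinates:
Let $w = w_\mu dx^\mu = \tilde w_\nu dy^\nu \in T_p^*M$ and $v = v^\rho \frac{\partial}{\partial x^\rho} = \tilde v^\sigma \frac{\partial}{\partial y^\sigma} \in T_pM$ be a 1-form and a vector, where a tilde marks the components in the $y$ coordinates. We already know that $\tilde v^\nu = \frac{\partial y^\nu}{\partial x^\mu} v^\mu$, so we get

\[
w(v) = w_\mu v^\mu = \tilde w_\nu \tilde v^\nu = \tilde w_\nu \frac{\partial y^\nu}{\partial x^\mu} v^\mu ,
\]

so we find the transformed components

\[
w_\mu = \tilde w_\nu \frac{\partial y^\nu}{\partial x^\mu} \qquad \text{or} \qquad \tilde w_\mu = w_\nu \frac{\partial x^\nu}{\partial y^\mu} .
\]

The dual basis vectors transform as

\[
dy^\nu = \frac{\partial y^\nu}{\partial x^\mu}\, dx^\mu .
\]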

4.4.5 Tensors

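A tensor of type $(q, r)$ is a multilinear map

\[
T : \underbrace{T_p^*M \times \ldots \times T_p^*M}_{q} \times \underbrace{T_pM \times \ldots \times T_pM}_{r} \to \mathbb{R} .
\]

Denote the set of type $(q, r)$ tensors at $p \in M$ by $T^q_{r,p}(M)$. Note that $T^1_{0,p}(M) = (T_p^*M)^* = T_pM$ and $T^0_{1,p}(M) = T_p^*M$.

The basis of $T^q_{r,p}(M)$ is

\[
\left\{ \frac{\partial}{\partial x^{\mu_1}} \otimes \cdots \otimes \frac{\partial}{\partial x^{\mu_q}} \otimes dx^{\nu_1} \otimes \cdots \otimes dx^{\nu_r} \right\} .
\]

The basis vectors satisfy (as a mapping $T_p^*M \times \ldots \times T_p^*M \times T_pM \times \ldots \times T_pM \to \mathbb{R}$):

\[
\left( \frac{\partial}{\partial x^{\mu_1}} \otimes \cdots \otimes \frac{\partial}{\partial x^{\mu_q}} \otimes dx^{\nu_1} \otimes \cdots \otimes dx^{\nu_r} \right) \left( dx^{\alpha_1}, \ldots, dx^{\alpha_q}, \frac{\partial}{\partial x^{\beta_1}}, \ldots, \frac{\partial}{\partial x^{\beta_r}} \right) = \delta^{\alpha_1}_{\mu_1} \cdots \delta^{\alpha_q}_{\mu_q}\, \delta^{\nu_1}_{\beta_1} \cdots \delta^{\nu_r}_{\beta_r} .
\]

(Note that $\frac{\partial}{\partial x^\mu}(dx^\alpha) \equiv \langle dx^\alpha, \frac{\partial}{\partial x^\mu} \rangle = \delta^\alpha_\mu$. On the left $\frac{\partial}{\partial x^\mu}$ is interpreted as an element of $(T_p^*M)^*$.)
We can expand $T = T^{\mu_1 \ldots \mu_q}{}_{\nu_1 \ldots \nu_r}\, \frac{\partial}{\partial x^{\mu_1}} \otimes \cdots \otimes \frac{\partial}{\partial x^{\mu_q}} \otimes dx^{\nu_1} \otimes \cdots \otimes dx^{\nu_r}$, so

\[
T(w_1, \ldots, w_q; v_1, \ldots, v_r) = T^{\mu_1 \ldots \mu_q}{}_{\nu_1 \ldots \nu_r}\, w_{1\mu_1} \cdots w_{q\mu_q}\, v_1^{\nu_1} \cdots v_r^{\nu_r} .
\]

The tensor product of tensors $T \in T^q_{r,p}(M)$ and $U \in T^s_{t,p}(M)$ is the tensor $T \otimes U \in T^{q+s}_{r+t,p}(M)$ with

\[
\begin{aligned}
& (T \otimes U)(w_1, \ldots, w_q, w_{q+1}, \ldots, w_{q+s}; v_1, \ldots, v_r, v_{r+1}, \ldots, v_{r+t}) \\
& \quad = T(w_1, \ldots, w_q; v_1, \ldots, v_r)\, U(w_{q+1}, \ldots, w_{q+s}; v_{r+1}, \ldots, v_{r+t}) \\
& \quad = T^{\mu_1 \ldots \mu_q}{}_{\nu_1 \ldots \nu_r}\, w_{1\mu_1} \cdots w_{q\mu_q}\, v_1^{\nu_1} \cdots v_r^{\nu_r} \cdot U^{\alpha_1 \ldots \alpha_s}{}_{\beta_1 \ldots \beta_t}\, w_{(q+1)\alpha_1} \cdots w_{(q+s)\alpha_s}\, v_{r+1}^{\beta_1} \cdots v_{r+t}^{\beta_t} .
\end{aligned}
\]

Contraction maps a tensor $T \in T^q_{r,p}(M)$ to a tensor $T' \in T^{q-1}_{r-1,p}(M)$ with components

\[
T'^{\mu_1 \ldots \mu_{q-1}}{}_{\nu_1 \ldots \nu_{r-1}} = T^{\mu_1 \ldots \mu_{i-1} \rho\, \mu_i \ldots \mu_{q-1}}{}_{\nu_1 \ldots \nu_{j-1} \rho\, \nu_j \ldots \nu_{r-1}} .
\]

Under a coordinate transformation, a tensor of type $(q, r)$ transforms like a product of $q$ vectors and $r$ one-forms (note that $v_1 \otimes \cdots \otimes v_q \otimes w_1 \otimes \ldots \otimes w_r$ is one example of a $(q, r)$ tensor). For example, for a tensor $T \in T^1_{2,p}(M)$ of type (1, 2):

\[
T = T^\alpha{}_{\beta_1 \beta_2}\, \frac{\partial}{\partial x^\alpha} \otimes dx^{\beta_1} \otimes dx^{\beta_2} = \tilde T^\mu{}_{\nu_1 \nu_2}\, \frac{\partial}{\partial y^\mu} \otimes dy^{\nu_1} \otimes dy^{\nu_2}
\]

gives us the transformation rule for the components, where a tilde marks the components in the $y$ coordinates:

\[
\tilde T^\mu{}_{\nu_1 \nu_2} = \frac{\partial y^\mu}{\partial x^\alpha}\, \frac{\partial x^{\beta_1}}{\partial y^{\nu_1}}\, \frac{\partial x^{\beta_2}}{\partial y^{\nu_2}}\, T^\alpha{}_{\beta_1 \beta_2} .
\]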

4.4.6 Tensor Fields

Suppose that a vector $v(p)$ has been assigned to every point $p$ in $M$. This is a (smooth) vector field if for every $C^\infty$ function $f \in \mathcal{F}$ the function $p \mapsto v(p)(f)$, $M \to \mathbb{R}$, is also a smooth function. We denote this function by $v[f]$. The set of smooth vector fields on $M$ is denoted by $\chi(M)$.

Smooth cotangent vector field: An assignment of $w(p) \in T_p^*M$ to every $p \in M$ such that for every $V \in \chi(M)$ the function

\[
w[V] : M \to \mathbb{R}, \qquad p \mapsto w[V](p) = w(p)(V(p))
\]

is smooth. The set of cotangent vector fields is denoted by $\Omega^1(M)$.

Smooth $(q, r)$-tensor field: An assignment of $T(p) \in T^q_{r,p}(M)$ to every $p \in M$ such that, whenever $w_1, \ldots, w_q$ are smooth cotangent vector fields and $v_1, \ldots, v_r$ are smooth tangent vector fields, the map

\[
p \mapsto T[w_1, \ldots, w_q; v_1, \ldots, v_r](p) = T(p)(w_1(p), \ldots, w_q(p); v_1(p), \ldots, v_r(p))
\]

is smooth on $M$.

4.4.7 Differential Map and Pullback

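Let $M$ and $N$ be differentiable manifolds and $f : M \to N$ smooth.
$f$ induces a map called the differential map (Finnish: työntökuvaus) $f_* : T_pM \to T_{f(p)}N$. It is defined as follows:
If $g \in \mathcal{F}(N)$ (i.e. $g : N \to \mathbb{R}$ smooth), and $v \in T_pM$, then

\[
(f_* v)[g] = v[g \circ f] .
\]

In other words, if $v$ characterizes the rate of change of a function along a curve $c(t)$, then $f_* v$ characterizes the rate of change of a function along the curve $f(c(t))$.
Let $x$ be local coordinates on $M$ and $y$ be local coordinates on $N$, “$y = f(x)$”. Also let $v = v^\mu \frac{\partial}{\partial x^\mu}$ and $f_* v = (f_* v)^\nu \frac{\partial}{\partial y^\nu}$. Then

\[
v[g \circ f] = v^\mu \frac{\partial (g(f(x)))}{\partial x^\mu} = v^\mu \frac{\partial g}{\partial y^\nu} \frac{\partial y^\nu}{\partial x^\mu} \equiv (f_* v)^\nu \frac{\partial g}{\partial y^\nu}
\]

and we get

\[
(f_* v)^\nu = v^\mu \frac{\partial y^\nu}{\partial x^\mu} , \qquad \text{where } y = f(x) .
\]

[More precisely $x^\mu = \varphi^\mu(p)$, $y^\nu = \psi^\nu(f(p))$ and $\frac{\partial y^\nu}{\partial x^\mu} = \frac{\partial (\psi \circ f \circ \varphi^{-1})^\nu}{\partial x^\mu}$.]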

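Example. Let $(x^1, x^2)$ and $(y^1, y^2, y^3)$ be the coordinates in $M$ and $N$, respectively, and let $V = a \frac{\partial}{\partial x^1} + b \frac{\partial}{\partial x^2}$ be a tangent vector at $(x^1, x^2)$. Let $f : M \to N$ be a map whose coordinate presentation is $y = (x^1, x^2, \sqrt{1 - (x^1)^2 - (x^2)^2})$. Then

\[
f_* V = V^\mu \frac{\partial y^\alpha}{\partial x^\mu} \frac{\partial}{\partial y^\alpha} = a \frac{\partial}{\partial y^1} + b \frac{\partial}{\partial y^2} - \left( a \frac{y^1}{y^3} + b \frac{y^2}{y^3} \right) \frac{\partial}{\partial y^3} .
\]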
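The closed form for $f_* V$ in this example can be compared with a direct numerical evaluation of $(f_* V)^\alpha = V^\mu \, \partial y^\alpha / \partial x^\mu$. In the sketch below the Jacobian is approximated by central differences; the sample point, the values of $a$ and $b$, and the function names are illustrative choices made here.

```python
import math


def f(x1, x2):
    """Coordinate presentation y = (x1, x2, sqrt(1 - x1^2 - x2^2))."""
    return (x1, x2, math.sqrt(1.0 - x1 * x1 - x2 * x2))


def pushforward(x, v, h=1e-6):
    """(f_* v)^alpha = v^mu dy^alpha/dx^mu, with a finite-difference Jacobian."""
    x1, x2 = x
    a, b = v
    dy_dx1 = [(p - m) / (2 * h) for p, m in zip(f(x1 + h, x2), f(x1 - h, x2))]
    dy_dx2 = [(p - m) / (2 * h) for p, m in zip(f(x1, x2 + h), f(x1, x2 - h))]
    return tuple(a * c1 + b * c2 for c1, c2 in zip(dy_dx1, dy_dx2))


def pushforward_formula(x, v):
    """Closed form from the example: (a, b, -(a y1/y3 + b y2/y3))."""
    y1, y2, y3 = f(*x)
    a, b = v
    return (a, b, -(a * y1 / y3 + b * y2 / y3))


x = (0.3, 0.4)     # sample point with (x1)^2 + (x2)^2 < 1
v = (2.0, -1.0)    # components (a, b) of V

numeric = pushforward(x, v)
exact = pushforward_formula(x, v)
print(numeric)
print(exact)
```

The two results agree up to the finite-difference error, which supports the sign and the $1/y^3$ factors in the third component.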

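The function $f$ also induces the map

\[
f^* : T^*_{f(p)}N \to T^*_pM, \qquad (f^* w)(v) = w(f_* v) ,
\]

where $v \in T_pM$ and $w \in T^*_{f(p)}N$ are arbitrary. $f^*$ is called the pullback.
In local coordinates, $w = w_\nu dy^\nu$,

\[
w(f_* v) = w_\nu\, dy^\nu \left( v^\mu \frac{\partial y^\alpha}{\partial x^\mu} \frac{\partial}{\partial y^\alpha} \right) = w_\nu v^\mu \frac{\partial y^\nu}{\partial x^\mu} = (f^* w)_\mu v^\mu = (f^* w)(v) ,
\]

from which we get

\[
(f^* w)_\mu = w_\nu \frac{\partial y^\nu}{\partial x^\mu} .
\]

The pullback $f^*$ can also be generalized to $(0, r)$ tensors and similarly the differential map $f_*$ can be generalized to $(q, 0)$ tensors.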

4.4.8 Flow Generated by a Vector Field

Let $X$ be a vector field on $M$. An integral curve $x(t)$ of $X$ is a curve on $M$ whose tangent vector at $x(t)$ is $X|_{x(t)}$. In local coordinates, the integral curve is the solution of the differential equations

\[ \frac{dx^\mu(t)}{dt} = X^\mu(x(t)) \qquad \left( X = X^\mu \frac{\partial}{\partial x^\mu} \right). \]

The existence and uniqueness theorem of ordinary differential equations guarantees that the equation has a unique solution (at least locally, in some neighborhood of $t = 0$) once the initial condition $x^\mu(t = 0) = x^\mu_0$ has been specified. If $M$ is compact, the solution exists for all $t$.

Let us denote the integral curve of $X$ which passes through the point $x_0$ at $t = 0$ by $\sigma(t, x_0)$. Thus

\[ \frac{d\sigma^\mu(t, x_0)}{dt} = X^\mu(\sigma(t, x_0)), \qquad \sigma^\mu(t = 0, x_0) = x^\mu_0. \]

The map $\sigma : I \times M \to M$ is called a flow generated by $X$ ($I \subset \mathbb{R}$). It satisfies $\sigma(t, \sigma(s, x_0)) = \sigma(t + s, x_0)$ (as long as $t + s \in I$).

Proof: As functions of $t$, the left and right hand sides satisfy the same differential equation,

\[ \frac{d}{dt} \sigma^\mu(t, \sigma(s, x_0)) = X^\mu(\sigma(t, \sigma(s, x_0))), \qquad \frac{d}{dt} \sigma^\mu(t + s, x_0) = X^\mu(\sigma(t + s, x_0)), \]

and the same initial condition: at $t = 0$ both are equal to $\sigma(s, x_0)$. Thus by uniqueness they are the same map, as we wanted to prove (see also the book by Nakahara, page 15).


Example. Let $M = \mathbb{R}^2$ and let $X((x, y)) = -y \frac{\partial}{\partial x} + x \frac{\partial}{\partial y}$ be a vector field on $M$. The flow generated by $X$ is

\[ \sigma(t, (x, y)) = (x \cos t - y \sin t, \; x \sin t + y \cos t). \]

Hence the flow through $(x, y)$ is a circle whose center is at the origin.
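The rotation flow and the composition rule $\sigma(t, \sigma(s, x_0)) = \sigma(t + s, x_0)$ can be illustrated by integrating the defining differential equation numerically. This is a minimal sketch: the Runge-Kutta integrator `flow`, the step count and the sample values are assumptions made for illustration, not part of the notes.

```python
import math

def X(p):
    """Vector field X = -y d/dx + x d/dy on R^2, as a component tuple."""
    x, y = p
    return (-y, x)

def flow(p, t, steps=2000):
    """Approximate sigma(t, p) by integrating dx/dt = X(x) with classical RK4."""
    h = t / steps
    x, y = p
    for _ in range(steps):
        k1 = X((x, y))
        k2 = X((x + 0.5 * h * k1[0], y + 0.5 * h * k1[1]))
        k3 = X((x + 0.5 * h * k2[0], y + 0.5 * h * k2[1]))
        k4 = X((x + h * k3[0], y + h * k3[1]))
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return (x, y)

def exact(p, t):
    """Closed-form flow from the example: rotation by the angle t."""
    x, y = p
    return (x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t))

p0 = (1.0, 0.5)                    # illustrative starting point
s, t = 0.7, 1.1                    # illustrative flow parameters
composed = flow(flow(p0, s), t)    # sigma(t, sigma(s, p0))
direct = flow(p0, s + t)           # sigma(t + s, p0)
```

Both `composed` and `direct` should agree with `exact(p0, s + t)` up to the integration error.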

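For a fixed $t$, $\sigma(t, x)$ is a diffeomorphism $\sigma_t : M \to M$, $x \mapsto \sigma(t, x)$. The family of diffeomorphisms $\{\sigma_t \mid t \in I\}$ is a commutative (Abelian) group (when $I = \mathbb{R}$):

\[ \sigma_t \cdot \sigma_s \equiv \sigma_t \circ \sigma_s = \sigma_{t+s}, \qquad \sigma_{-t} = (\sigma_t)^{-1}, \qquad \sigma_0 = \mathrm{id}_M. \]

The group is called the one-parameter group of transformations.

Let $t = \varepsilon$ be infinitesimally close to 0. Now,

\[ \sigma^\mu_\varepsilon(x) = \sigma^\mu(\varepsilon, x) = \sigma^\mu(0, x) + \left. \frac{d\sigma^\mu(t, x)}{dt} \right|_{t=0} \varepsilon + O(\varepsilon^2) = x^\mu + X^\mu(x) \varepsilon + O(\varepsilon^2). \]

In this context the vector field $X$ is called the infinitesimal generator of the transformation $\sigma_t$.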

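Given a vector field $X$, the corresponding flow is often denoted by

\[ \sigma^\mu_t(x) = \sigma^\mu(t, x) = \exp(tX) x^\mu = (e^{tX}) x^\mu \]

and called the exponentiation of $X$. This is because

\[ \begin{aligned}
\sigma^\mu_t(x) &= x^\mu + t \left. \frac{d\sigma^\mu(s, x)}{ds} \right|_{s=0} + \frac{1}{2!} t^2 \left. \frac{d^2\sigma^\mu(s, x)}{ds^2} \right|_{s=0} + \cdots \\
&= \left. \left( 1 + t \frac{d}{ds} + \frac{1}{2!} t^2 \frac{d^2}{ds^2} + \cdots \right) \sigma^\mu(s, x) \right|_{s=0} \\
&= \left. e^{t \frac{d}{ds}} \sigma^\mu(s, x) \right|_{s=0} = e^{tX} x^\mu.
\end{aligned} \]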

4.4.9 Lie Derivative

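Let $\sigma_t(x)$ be a flow on $M$ generated by a vector field $X$: $\frac{d\sigma^\mu_t(x)}{dt} = X^\mu(\sigma_t(x))$. Let $Y$ be another vector field on $M$. We want to calculate the rate of change of $Y$ along the curve $x^\mu(t) = \sigma^\mu_t(x)$.

The Lie derivative of a vector field $Y$ is defined by

\[ \mathcal{L}_X Y = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon} \left( (\sigma_{-\varepsilon})_* Y|_{\sigma_\varepsilon(x)} - Y|_x \right). \]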


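Let us rewrite this in a more user-friendly form. First,

\[ Y|_x = Y^\mu(x) \frac{\partial}{\partial x^\mu}, \qquad Y|_{\bar{x}} = Y^\mu(\bar{x}) \frac{\partial}{\partial \bar{x}^\mu}, \]

where we have for the coordinates

\[ \bar{x}^\mu \equiv \sigma^\mu_\varepsilon(x) = x^\mu + \varepsilon X^\mu(x) + O(\varepsilon^2) \quad \Rightarrow \quad x^\mu = \bar{x}^\mu - \varepsilon X^\mu(\bar{x}) + O(\varepsilon^2). \]

Thus

\[ Y|_{\bar{x}} = Y^\mu(x + \varepsilon X) \frac{\partial}{\partial \bar{x}^\mu} = \left( Y^\mu(x) + \varepsilon X^\nu \frac{\partial Y^\mu(x)}{\partial x^\nu} \right) \frac{\partial}{\partial \bar{x}^\mu} + O(\varepsilon^2). \]

Differential map from $\bar{x}$ to $x$:

\[ \begin{aligned}
\left( (\sigma_{-\varepsilon})_* Y|_{\bar{x}} \right)^\alpha &= Y^\mu|_{\bar{x}} \frac{\partial x^\alpha}{\partial \bar{x}^\mu} = \left( Y^\mu(x) + \varepsilon X^\nu(x) \frac{\partial Y^\mu(x)}{\partial x^\nu} \right) \left( \delta^\alpha_\mu - \varepsilon \frac{\partial X^\alpha(\bar{x})}{\partial \bar{x}^\mu} \right) \\
&= Y^\alpha(x) + \varepsilon \left( X^\nu(x) \frac{\partial Y^\alpha}{\partial x^\nu} - Y^\mu(x) \frac{\partial X^\alpha}{\partial x^\mu} \right) + O(\varepsilon^2),
\end{aligned} \]

where we used $\frac{\partial X^\alpha(\bar{x})}{\partial \bar{x}^\mu} = \frac{\partial X^\alpha(x)}{\partial x^\mu} + O(\varepsilon)$. So we got

\[ \mathcal{L}_X Y = \left( X^\nu \frac{\partial Y^\mu}{\partial x^\nu} - Y^\nu \frac{\partial X^\mu}{\partial x^\nu} \right) \frac{\partial}{\partial x^\mu} = [X, Y], \]

where the commutator ("Lie bracket") acts on functions by

\[ [X, Y] f = X[Y[f]] - Y[X[f]]. \]

Note that $XY$ is not a vector field but $[X, Y]$ is:

\[ XYf = X[Y[f]] = X^\mu \partial_\mu [Y^\nu \partial_\nu f] = \underbrace{X^\mu (\partial_\mu Y^\nu) \partial_\nu}_{\text{vector field}} f + \underbrace{X^\mu Y^\nu \partial_\mu \partial_\nu}_{\text{not a vector field}} f. \]

The second-derivative term is symmetric in $X$ and $Y$, so it cancels in the commutator.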
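The component formula for the Lie bracket lends itself to a quick numerical check. The sketch below uses central differences; the function name `lie_bracket` and the two sample fields (a rotation and a translation on $\mathbb{R}^2$) are illustrative assumptions.

```python
def lie_bracket(X, Y, p, h=1e-5):
    """Components of [X, Y] at p: X^nu d_nu Y^mu - Y^nu d_nu X^mu (central differences)."""
    n = len(p)

    def partial(F, nu):
        # Derivative of all components of the field F along coordinate nu.
        pp, pm = list(p), list(p)
        pp[nu] += h
        pm[nu] -= h
        Fp, Fm = F(pp), F(pm)
        return [(Fp[mu] - Fm[mu]) / (2 * h) for mu in range(n)]

    Xp, Yp = X(p), Y(p)
    out = [0.0] * n
    for nu in range(n):
        dY = partial(Y, nu)
        dX = partial(X, nu)
        for mu in range(n):
            out[mu] += Xp[nu] * dY[mu] - Yp[nu] * dX[mu]
    return out

rotation = lambda p: (-p[1], p[0])     # X = -y d/dx + x d/dy
translation = lambda p: (1.0, 0.0)     # Y = d/dx
bracket = lie_bracket(rotation, translation, (0.3, -1.2))
```

For these two fields the formula gives $[X, Y] = -\partial/\partial y$ by hand, so `bracket` should be close to `[0, -1]` at any point.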

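Lie derivative of a one-form: Let $w \in \Omega^1(M)$ be a one-form (cotangent vector field). Define the Lie derivative of $w$ along $X$ as

\[ \mathcal{L}_X w = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon} \left( \sigma^*_\varepsilon w|_{\sigma_\varepsilon(x)} - w|_x \right). \]

Let us simplify this. The coordinates at $\sigma_\varepsilon(x)$ are $y^\mu \equiv \sigma^\mu_\varepsilon(x) \approx x^\mu + \varepsilon X^\mu(x)$, so

\[ \begin{aligned}
(\sigma^*_\varepsilon w)_\alpha &= w_\beta(y) \frac{\partial y^\beta}{\partial x^\alpha} = w_\beta(x + \varepsilon X) \frac{\partial}{\partial x^\alpha} (x^\beta + \varepsilon X^\beta) \\
&= \left( w_\beta(x) + \varepsilon X^\mu \partial_\mu w_\beta(x) \right) \left( \delta^\beta_\alpha + \varepsilon \partial_\alpha X^\beta \right) \\
&= w_\alpha + \varepsilon \left( X^\mu \partial_\mu w_\alpha + w_\mu \partial_\alpha X^\mu \right) + O(\varepsilon^2).
\end{aligned} \]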


Thus we find

\[ \mathcal{L}_X w = \left( X^\mu \partial_\mu w_\alpha + w_\mu \partial_\alpha X^\mu \right) dx^\alpha. \]

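Lie derivative of a function: A natural guess would be $\mathcal{L}_X f = X[f]$. Let us check if this works:

\[ \mathcal{L}_X f = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon} \left( f(\sigma_\varepsilon(x)) - f(x) \right) = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon} \left( f(x + \varepsilon X) - f(x) \right) = X^\mu \partial_\mu f = X[f]. \]

Thus the definition works.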

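Lie derivative of a tensor field: We define these using the Leibniz rule: we require that

\[ \mathcal{L}_X (t_1 \otimes t_2) = (\mathcal{L}_X t_1) \otimes t_2 + t_1 \otimes (\mathcal{L}_X t_2). \]

This is true if $t_1$ is a function ($(0,0)$ tensor) and $t_2$ is a one-form or a vector field, or vice versa (exercise).

Example: Let us find the Lie derivative of a $(1,1)$ tensor $t = t_\mu{}^\nu \, dx^\mu \otimes e_\nu$, where $e_\nu = \frac{\partial}{\partial x^\nu}$:

\[ \begin{aligned}
\mathcal{L}_X t &= (\mathcal{L}_X t_\mu{}^\nu) \, dx^\mu \otimes e_\nu + t_\mu{}^\nu (\mathcal{L}_X dx^\mu) \otimes e_\nu + t_\mu{}^\nu \, dx^\mu \otimes (\mathcal{L}_X e_\nu) \\
&= (X^\alpha \partial_\alpha t_\mu{}^\nu) \, dx^\mu \otimes e_\nu + t_\mu{}^\nu (\partial_\alpha X^\mu) \, dx^\alpha \otimes e_\nu - t_\mu{}^\nu \, dx^\mu \otimes (\partial_\nu X^\alpha) e_\alpha \\
&= \left( X^\alpha \partial_\alpha t_\mu{}^\nu + t_\alpha{}^\nu \partial_\mu X^\alpha - t_\mu{}^\alpha \partial_\alpha X^\nu \right) dx^\mu \otimes e_\nu.
\end{aligned} \]

[We used here $e_\nu = \frac{\partial}{\partial x^\nu}$, $(e_\nu)^\alpha = \delta_\nu{}^\alpha$, $(dx^\mu)_\alpha = \delta^\mu_\alpha$, $(\mathcal{L}_X e_\nu)^\alpha = X^\mu \partial_\mu (e_\nu)^\alpha - (e_\nu)^\mu \partial_\mu X^\alpha = -\partial_\nu X^\alpha$ and also $(\mathcal{L}_X dx^\mu)_\alpha = X^\nu \partial_\nu (dx^\mu)_\alpha + (dx^\mu)_\nu \partial_\alpha X^\nu = \partial_\alpha X^\mu$.]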

4.4.10 Differential Forms

A differential form of order $r$ (or $r$-form) is a totally antisymmetric $(0, r)$-tensor: for every permutation $p \in S_r$,

\[ w(v_{p(1)}, \ldots, v_{p(r)}) = \mathrm{sgn}(p) \, w(v_1, \ldots, v_r), \]

where $\mathrm{sgn}(p)$ is the sign of the permutation $p$:

\[ \mathrm{sgn}(p) = (-1)^{\text{number of exchanges}} = \begin{cases} +1 & \text{for an even permutation,} \\ -1 & \text{for an odd permutation.} \end{cases} \]

Example: $p : (123) \to (231)$: two exchanges [$(231) \to (213) \to (123)$] lead back to $(123)$, thus $p$ is an even permutation.
$p : (123) \to (321)$: one exchange to $(231)$ and then two exchanges to $(123)$, thus $p$ is an odd permutation.

The $r$-forms at a point $p \in M$ form a vector space $\Omega^r_p(M)$. What is its basis?

We define the wedge product of 1-forms:

\[ dx^{\mu_1} \wedge dx^{\mu_2} \wedge \ldots \wedge dx^{\mu_r} = \sum_{p \in S_r} \mathrm{sgn}(p) \, dx^{\mu_{p(1)}} \otimes \ldots \otimes dx^{\mu_{p(r)}}. \]


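Then $\{ dx^{\mu_1} \wedge \ldots \wedge dx^{\mu_r} \mid \mu_1 < \mu_2 < \ldots < \mu_r \}$ forms a basis of $\Omega^r_p(M)$.

Examples:

\[ dx^\mu \wedge dx^\nu = dx^\mu \otimes dx^\nu - dx^\nu \otimes dx^\mu, \]

\[ \begin{aligned}
dx^1 \wedge dx^2 \wedge dx^3 = {} & dx^1 \otimes dx^2 \otimes dx^3 + dx^2 \otimes dx^3 \otimes dx^1 + dx^3 \otimes dx^1 \otimes dx^2 \\
& - dx^2 \otimes dx^1 \otimes dx^3 - dx^3 \otimes dx^2 \otimes dx^1 - dx^1 \otimes dx^3 \otimes dx^2.
\end{aligned} \]

Note:

• $dx^{\mu_1} \wedge \ldots \wedge dx^{\mu_r} = 0$ if the same index appears twice (or more times).

• $dx^{\mu_1} \wedge \ldots \wedge dx^{\mu_r} = \mathrm{sgn}(p) \, dx^{\mu_{p(1)}} \wedge \ldots \wedge dx^{\mu_{p(r)}}$ (reshuffling of terms).

In the above basis, an $r$-form $w \in \Omega^r_p(M)$ is expanded as

\[ w = \frac{1}{r!} w_{\mu_1 \ldots \mu_r} \, dx^{\mu_1} \wedge \ldots \wedge dx^{\mu_r}. \]

Note: the components $w_{\mu_1 \ldots \mu_r}$ are totally antisymmetric in the indices (e.g. $w_{\mu_1 \mu_2 \mu_3 \ldots \mu_r} = -w_{\mu_2 \mu_1 \mu_3 \ldots \mu_r}$).

One can show that $\dim \Omega^r_p(M) = \frac{m!}{r!(m-r)!} = \binom{m}{r}$, where $m = \dim M$.

Note also: $\Omega^1_p(M) = T^*_p M$ (the cotangent space) and $\Omega^0_p(M) = \mathbb{R}$ by convention.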

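Now we generalize the wedge product to the product of a $q$-form and an $r$-form and call it the exterior product.

Definition: The exterior product of a $q$-form $\omega$ and an $r$-form $\eta$ is the $(q + r)$-form $\omega \wedge \eta$:

\[ (\omega \wedge \eta)(v_1, \ldots, v_{q+r}) = \frac{1}{q! \, r!} \sum_{p \in S_{q+r}} \mathrm{sgn}(p) \, \omega(v_{p(1)}, \ldots, v_{p(q)}) \cdot \eta(v_{p(q+1)}, \ldots, v_{p(q+r)}). \]

If $q + r > m = \dim M$, then $\omega \wedge \eta = 0$. The exterior product satisfies the following properties:

(i) $\omega \wedge \omega = 0$, if $q$ is odd.

(ii) $\omega \wedge \eta = (-1)^{qr} \eta \wedge \omega$.

(iii) $(\omega \wedge \eta) \wedge \xi = \omega \wedge (\eta \wedge \xi)$.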

[Proof: exercise]
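The basis counting and the properties of the exterior product can be explored with a small toy implementation. This is a sketch under stated assumptions: a form is stored as a dictionary mapping a strictly increasing index tuple $(\mu_1 < \ldots < \mu_r)$ to the constant coefficient of $dx^{\mu_1} \wedge \ldots \wedge dx^{\mu_r}$, and the names `sort_with_sign` and `wedge` are hypothetical helpers introduced here.

```python
from itertools import combinations
from math import comb

def sort_with_sign(indices):
    """Sort an index tuple by adjacent swaps; return (sign, sorted tuple), sign 0 if an index repeats."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, ()
    sign = 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)

def wedge(omega, eta):
    """Exterior product of forms stored as {increasing index tuple: coefficient}."""
    result = {}
    for I, a in omega.items():
        for J, b in eta.items():
            sign, K = sort_with_sign(I + J)
            if sign:
                result[K] = result.get(K, 0) + sign * a * b
    return {K: c for K, c in result.items() if c != 0}

m = 4
# Number of basis r-forms dx^{mu_1} ^ ... ^ dx^{mu_r} with mu_1 < ... < mu_r, for r = 0..m.
basis_counts = [len(list(combinations(range(1, m + 1), r))) for r in range(m + 1)]

dx1, dx2, dx3 = {(1,): 1}, {(2,): 1}, {(3,): 1}
two_form = wedge(dx1, dx2)        # q = 2
one_form = {(1,): 2, (3,): 5}     # q = 1, arbitrary illustrative coefficients
```

With this representation, `wedge(one_form, one_form)` is the empty dictionary (property (i)), and swapping the arguments of `wedge(two_form, one_form)` produces no sign because $qr = 2$ is even (property (ii)).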


Example. Take the Cartesian coordinates $(x, y)$ in $\mathbb{R}^2$. The two-form $dx \wedge dy$ is the oriented area element (the vector product in elementary vector algebra). In polar coordinates this becomes

\[ \begin{aligned}
dx \wedge dy &= (\cos\theta \, dr - r \sin\theta \, d\theta) \wedge (\sin\theta \, dr + r \cos\theta \, d\theta) \\
&= \cos\theta \sin\theta \, dr \wedge dr + r \cos^2\theta \, dr \wedge d\theta - r \sin^2\theta \, d\theta \wedge dr - r^2 \sin\theta \cos\theta \, d\theta \wedge d\theta \\
&= r \, dr \wedge d\theta.
\end{aligned} \]

We may assign an $r$-form smoothly at each point $p$ of a manifold $M$ to obtain an $r$-form field. The $r$-form field will also be called an $r$-form for short.

The corresponding vector spaces of $r$-forms ($r$-form fields) are called $\Omega^r(M)$:

$\Omega^0(M) = \mathcal{F}(M)$, smooth functions on $M$

$\Omega^1(M) = T^*(M)$, cotangent vector fields on $M$

$\Omega^2(M) = \mathrm{span}\{ dx^\mu \wedge dx^\nu \mid \mu < \nu \}$

...

4.4.11 Exterior derivative

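The exterior derivative $d$ is a map $\Omega^r(M) \to \Omega^{r+1}(M)$,

\[ \omega = \frac{1}{r!} \omega_{\mu_1 \ldots \mu_r} \, dx^{\mu_1} \wedge \ldots \wedge dx^{\mu_r} \;\mapsto\; d\omega = \frac{1}{r!} \frac{\partial \omega_{\mu_1 \ldots \mu_r}}{\partial x^\nu} \, dx^\nu \wedge dx^{\mu_1} \wedge \ldots \wedge dx^{\mu_r}. \]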

Example: dim M = m = 3. We have the following r-forms:

• r = 0 : ω0 = f(x, y, z),

• r = 1 : ω1 = ωx(x, y, z)dx+ ωy(x, y, z)dy + ωz(x, y, z)dz,

• r = 2 : ω2 = ωxy(x, y, z)dx ∧ dy + ωyzdy ∧ dz + ωzxdz ∧ dx,

• r = 3 : ω3 = ωxyzdx ∧ dy ∧ dz.

The exterior derivatives are:

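• $d\omega_0 = \frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy + \frac{\partial f}{\partial z} dz$. Thus the components are the components of $\nabla f$.

• For the one-form,

\[ \begin{aligned}
d\omega_1 &= \frac{\partial \omega_x}{\partial y} dy \wedge dx + \frac{\partial \omega_x}{\partial z} dz \wedge dx + \frac{\partial \omega_y}{\partial x} dx \wedge dy + \frac{\partial \omega_y}{\partial z} dz \wedge dy + \frac{\partial \omega_z}{\partial x} dx \wedge dz + \frac{\partial \omega_z}{\partial y} dy \wedge dz \\
&= \left( \frac{\partial \omega_y}{\partial x} - \frac{\partial \omega_x}{\partial y} \right) dx \wedge dy + \left( \frac{\partial \omega_z}{\partial y} - \frac{\partial \omega_y}{\partial z} \right) dy \wedge dz + \left( \frac{\partial \omega_x}{\partial z} - \frac{\partial \omega_z}{\partial x} \right) dz \wedge dx.
\end{aligned} \]

These are the components of $\nabla \times \vec{\omega}$ (with $\vec{\omega} = (\omega_x, \omega_y, \omega_z)$).

• For the two-form,

\[ \begin{aligned}
d\omega_2 &= \frac{\partial \omega_{xy}}{\partial z} dz \wedge dx \wedge dy + \frac{\partial \omega_{yz}}{\partial x} dx \wedge dy \wedge dz + \frac{\partial \omega_{zx}}{\partial y} dy \wedge dz \wedge dx \\
&= \left( \frac{\partial \omega_{yz}}{\partial x} + \frac{\partial \omega_{zx}}{\partial y} + \frac{\partial \omega_{xy}}{\partial z} \right) dx \wedge dy \wedge dz.
\end{aligned} \]

The component is a divergence: $\nabla \cdot \vec{\omega}'$ (where $\vec{\omega}' = (\omega_{yz}, \omega_{zx}, \omega_{xy})$).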


• Thus the exterior derivatives correspond to the gradient, curl and divergence! [$d\omega_3 = 0$]

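What is $d(d\omega)$?

\[ d(d\omega) = \frac{1}{r!} \underbrace{\frac{\partial^2 \omega_{\mu_1 \ldots \mu_r}}{\partial x^\alpha \partial x^\beta}}_{\text{symmetric in } \alpha \text{ and } \beta} \; \underbrace{dx^\alpha \wedge dx^\beta}_{\text{antisymmetric in } \alpha \text{ and } \beta} \wedge \, dx^{\mu_1} \wedge \ldots \wedge dx^{\mu_r} = 0. \]

So $d^2 = 0$. Note that (for $\dim M = 3$)

\[ d(df) = d(\partial_x f \, dx + \partial_y f \, dy + \partial_z f \, dz) = \left( \frac{\partial^2 f}{\partial x \partial y} - \frac{\partial^2 f}{\partial y \partial x} \right) dx \wedge dy + \ldots = 0, \]

so we recover $\nabla \times \nabla f = 0$. Similarly $d(d\omega_1) = 0 \leftrightarrow \nabla \cdot (\nabla \times \vec{\omega}) = 0$.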
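The statement $d^2 = 0$ in three dimensions can be checked numerically in its vector-calculus form. The sketch below is illustrative only: derivatives are approximated by central differences with an assumed step `H`, and the test function `f` and one-form components `w` are arbitrary choices.

```python
import math

H = 1e-3  # finite-difference step (an illustrative choice)

def partial(F, i, p):
    """Central-difference partial derivative of the scalar function F along axis i at p."""
    pp, pm = list(p), list(p)
    pp[i] += H
    pm[i] -= H
    return (F(pp) - F(pm)) / (2 * H)

def grad(f):
    """Components of df as a list of three scalar functions."""
    return [lambda p, i=i: partial(f, i, p) for i in range(3)]

def curl(w):
    """Components (yz, zx, xy) of d(omega_1) for omega_1 = w[0] dx + w[1] dy + w[2] dz."""
    return [
        lambda p: partial(w[2], 1, p) - partial(w[1], 2, p),
        lambda p: partial(w[0], 2, p) - partial(w[2], 0, p),
        lambda p: partial(w[1], 0, p) - partial(w[0], 1, p),
    ]

def div(u):
    """Coefficient of dx ^ dy ^ dz in d(omega_2), with u = (omega_yz, omega_zx, omega_xy)."""
    return lambda p: sum(partial(u[i], i, p) for i in range(3))

f = lambda p: math.sin(p[0]) * p[1] ** 2 + math.exp(p[2]) * p[0]
w = [lambda p: p[1] * p[2] ** 2, lambda p: math.cos(p[0] * p[2]), lambda p: p[0] * p[1]]
point = (0.4, -0.7, 1.3)

ddf = [c(point) for c in curl(grad(f))]   # components of d(df)
ddw = div(curl(w))(point)                 # coefficient of d(d omega_1)
```

Because central-difference operators along different axes commute, `ddf` and `ddw` vanish up to rounding error, mirroring the symmetric-times-antisymmetric argument above.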

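If $d\omega = 0$, we say that $\omega$ is a closed $r$-form. If there exists an $(r-1)$-form $\omega_{r-1}$ such that $\omega_r = d\omega_{r-1}$, then we say that $\omega_r$ is an exact $r$-form.

The exterior derivative induces the sequence of maps

\[ 0 \xrightarrow{\; i \;} \Omega^0 \xrightarrow{\; d_1 \;} \Omega^1 \xrightarrow{\; d_2 \;} \Omega^2 \xrightarrow{\; d_3 \;} \cdots \xrightarrow{\; d_{m-1} \;} \Omega^{m-1} \xrightarrow{\; d_m \;} \Omega^m \xrightarrow{\; d_{m+1} \;} 0, \]

where $\Omega^r = \Omega^r(M)$, $i$ is the inclusion map $0 \to \Omega^0(M)$ and $d_r$ denotes the map $d_r : \Omega^{r-1} \to \Omega^r$, $\omega \mapsto d\omega$. Since $d^2 = 0$, we have

\[ \underbrace{\mathrm{Im} \, d_r}_{\text{exact } r\text{-forms}} \subset \underbrace{\ker d_{r+1}}_{\text{closed } r\text{-forms}}. \]

A sequence with this property is called a complex (it is an exact sequence only if $\mathrm{Im} \, d_r = \ker d_{r+1}$ for every $r$). This particular sequence is called the de Rham complex. The quotient space $\ker d_{r+1} / \mathrm{Im} \, d_r$ is called the $r$th de Rham cohomology group.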

4.4.12 Integration of Differential Forms

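Orientable manifolds: Let $\dim M = m$. We can define the integral of an $m$-form over $M$ only if $M$ is an orientable manifold.

Let $p \in M$, $p \in U_i \cap U_j$, and denote the coordinates on $U_i$ by $\{x^\mu\}$ and on $U_j$ by $\{y^\mu\}$. $T_pM$ is spanned by $e_\mu = \frac{\partial}{\partial x^\mu}$ or by $\tilde{e}_\mu = \frac{\partial}{\partial y^\mu}$. [Recall that $\tilde{e}_\mu = \frac{\partial x^\nu}{\partial y^\mu} e_\nu$ (chain rule).] Let $J$ denote the determinant

\[ J = \det\left( \frac{\partial x^\mu}{\partial y^\nu} \right). \]

If $J > 0$, we say that $\{e_\mu\}$ and $\{\tilde{e}_\mu\}$ define the same orientation on $U_i \cap U_j$.
If $J < 0$, we say that $\{e_\mu\}$ and $\{\tilde{e}_\mu\}$ define the opposite orientation on $U_i \cap U_j$.
($J = 0$ is not possible if the coordinates $x^\mu$ and $y^\nu$ are properly defined.)

We say that $(M, \{U_i, x_i\})$ (a manifold $M$ with an atlas $\{U_i, x_i\}$) is orientable if for any overlapping charts $U_i$ and $U_j$ the determinant $J = \det\left( \frac{\partial x^\mu_i}{\partial x^\nu_j} \right)$ is positive, $J > 0$.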


(Note that $i$ and $j$ are fixed, while $\mu$ and $\nu$ denote the components of the matrix. In other words, the determinant is taken over $\mu$ and $\nu$.)

If $M$ is orientable, then there exists an $m$-form $\omega$ which is non-vanishing everywhere on $M$ (proof skipped). This $m$-form $\omega$ is called a volume element and it plays the role of an integration measure on $M$. Two volume elements $\omega$ and $\omega'$ are equivalent if $\omega = h \omega'$, where $h \in \mathcal{F}(M)$ is a smooth, positive function on $M$, i.e. $h(p) > 0$ for all $p \in M$. We then write $\omega \sim \omega'$ (this is clearly an equivalence relation).

If $\omega'' \not\sim \omega$, then (on a connected $M$) $\omega = h'' \omega''$, where $h''(p) < 0$ for all $p \in M$. So there are two equivalence classes of volume elements, corresponding to two inequivalent orientations. We call one of them right-handed and the other left-handed.

Integration of forms: Let $M$ be orientable and $f : M \to \mathbb{R}$ a function which is nonzero only on one chart $(U_i, x^\mu(p) = \varphi^\mu_i(p))$, and $\omega$ a volume element on $U_i$: $\omega = h(p) \, dx^1 \wedge \ldots \wedge dx^m$. We define

\[ \int_{U_i} f \omega = \int_{\varphi_i(U_i)} dx^1 dx^2 \ldots dx^m \, h(\varphi_i^{-1}(x)) \, f(\varphi_i^{-1}(x)). \]

Note that the right hand side is a regular integral in $\mathbb{R}^m$. For a generic function on $M$, we need to use the "partition of unity".

Let $\{U_i\}$ be an open covering of $M$ such that every point $p \in M$ belongs to only a finite number of the $U_i$'s. (If such an open covering exists, the manifold $M$ is called paracompact.) The partition of unity is a family of differentiable functions $\varepsilon_i(p)$ such that

(i) $0 \leq \varepsilon_i(p) \leq 1$

(ii) $\varepsilon_i(p) = 0 \;\; \forall p \notin U_i$

(iii) $\sum_i \varepsilon_i(p) = 1 \;\; \forall p \in M$.

The partition of unity $\{\varepsilon_i\}$ depends on the choice of $\{U_i\}$.

Now let $f : M \to \mathbb{R}$. We can write $f(p) = f(p) \sum_i \varepsilon_i(p) = \sum_i f_i(p)$, where $f_i = f \varepsilon_i$. Then $f_i(p) = 0$ when $p \notin U_i$, so we can use the previous definition to extend the integral over all of $M$:

\[ \int_M f \omega = \sum_i \int_{U_i} f_i \omega. \]

Note that due to the paracompactness condition, the sum over $i$ is finite and thus there are no problems with the convergence of the sum. One can show that although a different atlas $\{(V_i, \psi_i)\}$ gives different coordinates and a different partition of unity, the integral remains the same.


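Example: Let $M = S^1$, $U_1 = S^1 - \{(1, 0)\}$, $U_2 = S^1 - \{(-1, 0)\}$. Choose the (inverse) coordinate functions as

\[ \varphi_1^{-1} : (0, 2\pi) \to U_1, \quad \theta_1 \mapsto (\cos\theta_1, \sin\theta_1), \]

\[ \varphi_2^{-1} : (-\pi, \pi) \to U_2, \quad \theta_2 \mapsto (\cos\theta_2, \sin\theta_2). \]

Partition of unity: $\varepsilon_1(\theta_1) = \sin^2 \frac{\theta_1}{2}$, $\varepsilon_2(\theta_2) = \cos^2 \frac{\theta_2}{2}$. (Note that this satisfies (i)-(iii).) Choose $f : S^1 \to \mathbb{R}$ as $f(\theta) = \sin^2\theta$, and $\omega = 1 \cdot d\theta_1$ on $U_1$ and $\omega = 1 \cdot d(\theta_2 + 2\pi) = 1 \cdot d\theta_2$ on $U_2$. Now

\[ \int_{S^1} f \omega = \sum_{i=1}^{2} \int_{U_i} f_i \omega = \int_0^{2\pi} d\theta_1 \, \sin^2 \frac{\theta_1}{2} \sin^2 \theta_1 + \int_{-\pi}^{\pi} d\theta_2 \, \cos^2 \frac{\theta_2}{2} \sin^2 \theta_2 = \frac{\pi}{2} + \frac{\pi}{2} = \pi, \]

as expected.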
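The two chart integrals of this example are easy to reproduce numerically. The following is a minimal sketch: the helper `midpoint_integral` and its number of subintervals are illustrative assumptions, not part of the notes.

```python
import math

def midpoint_integral(func, a, b, n=20000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

f = lambda theta: math.sin(theta) ** 2
eps1 = lambda theta1: math.sin(theta1 / 2) ** 2   # vanishes at the point missing from U1
eps2 = lambda theta2: math.cos(theta2 / 2) ** 2   # vanishes at the point missing from U2

part1 = midpoint_integral(lambda t: eps1(t) * f(t), 0.0, 2 * math.pi)
part2 = midpoint_integral(lambda t: eps2(t) * f(t), -math.pi, math.pi)
total = part1 + part2
```

Each of `part1` and `part2` should come out close to $\pi/2$, so that `total` reproduces $\int_0^{2\pi} \sin^2\theta \, d\theta = \pi$.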

4.5 Integral of an r-form over a manifold M; Stokes’ theorem

4.5.1 Simplexes in a Euclidean space

We define simplexes in $\mathbb{R}^m$ as follows:

0-simplex: point $s_0 = p_0$

1-simplex: oriented line $s_1 = (p_0, p_1)$

2-simplex: oriented triangle $s_2 = (p_0, p_1, p_2)$

3-simplex: oriented tetrahedron $s_3 = (p_0, p_1, p_2, p_3)$

...

An $n$-simplex $(p_0, \ldots, p_n)$ is made of $(n+1)$ geometrically independent² points (vertices) $p_0, \ldots, p_n$, in this order, and the $n$-dimensional object spanned by them:

\[ s_n = \left\{ x \in \mathbb{R}^m \;\middle|\; x^\mu = \sum_{i=0}^{n} t_i x^\mu(p_i), \; \sum_{i=0}^{n} t_i = 1, \; t_i \geq 0 \right\}. \]

The numbers $t_0, \ldots, t_n$ are the barycentric coordinates on $s_n$. As a subset of $\mathbb{R}^m$, $s_n$ is closed and bounded and therefore compact. The orientation is defined by the order of the vertices. If $\Pi \in S_{n+1}$ is a permutation of $(n+1)$ elements, then we define

\[ (p_{\Pi(0)}, \ldots, p_{\Pi(n)}) = (-1)^{\Pi} (p_0, \ldots, p_n), \]

where $(-1)^\Pi$ is the sign of the permutation, so even permutations of the vertices give the same oriented simplex $s_n$, and odd permutations give the simplex $-s_n$ with the opposite orientation.

The boundary $\partial s_n$ of an $n$-simplex $s_n$ is a combination of $(n-1)$-simplexes: if $s_n = (p_0, \ldots, p_n)$, then

\[ \partial s_n = \sum_{i=0}^{n} (-1)^i (p_0, \ldots, p_{i-1}, p_{i+1}, \ldots, p_n). \]

² Geometrically independent ≡ the vectors $p_0 - p_1, \ldots, p_0 - p_n$ are linearly independent and thus span an $n$-dimensional space.


Example: $\partial s_0 = 0$

$s_1 = (p_0, p_1)$, $\quad \partial s_1 = p_1 - p_0$

$s_2 = (p_0, p_1, p_2)$, $\quad \partial s_2 = (p_1, p_2) - (p_0, p_2) + (p_0, p_1) = (p_1, p_2) + (p_0, p_1) + (p_2, p_0)$

$s_3 = (p_0, p_1, p_2, p_3)$, $\quad \partial s_3 = (p_1, p_2, p_3) - (p_0, p_2, p_3) + (p_0, p_1, p_3) - (p_0, p_1, p_2) = (p_1, p_2, p_3) + (p_0, p_3, p_2) + (p_0, p_1, p_3) + (p_1, p_0, p_2)$.

An $n$-chain $c$ is a formal sum

\[ c = \sum_i a_i s_{n,i}, \qquad a_i \in \mathbb{R}, \quad s_{n,i} \text{ an } n\text{-simplex}. \]

Thus $\partial s_n$ is an $(n-1)$-chain. The boundary of the chain is $\partial c \equiv \sum_i a_i \, \partial s_{n,i}$. A boundary has no boundary, so we should have $\partial^2 c = 0$. Let us prove this. It is enough to prove this for a simplex, since $\partial$ is defined as a linear operator.

\[ \partial^2 s_n = \partial \left( \sum_{i=0}^{n} (-1)^i (p_0, \ldots, p_{i-1}, p_{i+1}, \ldots, p_n) \right) \]

Let $j < k$. In $\partial^2 s_n$ the simplex $(p_0, \ldots, p_{j-1}, p_{j+1}, \ldots, p_{k-1}, p_{k+1}, \ldots, p_n)$ is created in two ways:

1. The first $\partial$ removes $p_k$ and the second $p_j$: sign $(-1)^{k+j}$.

2. The first $\partial$ removes $p_j$ and the second $p_k$: sign $(-1)^{j+(k-1)}$ (after $p_j$ has been removed, $p_k$ sits in position $k - 1$).

Thus the two terms have opposite signs and cancel each other $\Rightarrow \partial^2 s_n = 0$.

Two $n$-simplexes, $P = (p_0, \ldots, p_n)$ and $Q = (q_0, \ldots, q_n)$, can be mapped onto each other with an orientation preserving linear homeomorphism. The image of $p \in P$ in $Q$ is the point with the same barycentric coordinates $t_i$.
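The boundary operator and the identity $\partial^2 = 0$ can be tried out on small chains. In this sketch a chain is stored as a dictionary from an ordered vertex tuple to its real coefficient; the function name `boundary` and the vertex labels are illustrative assumptions. For simplicity the code does not identify permuted vertex tuples with signed copies of each other, which is sufficient here because deleting a vertex preserves the order of the remaining ones.

```python
def boundary(chain):
    """Boundary of a chain {vertex tuple: coefficient}: drop the i-th vertex with sign (-1)^i."""
    result = {}
    for simplex, coeff in chain.items():
        if len(simplex) == 1:
            continue  # the boundary of a 0-simplex is zero
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]
            result[face] = result.get(face, 0) + (-1) ** i * coeff
    return {face: c for face, c in result.items() if c != 0}

s2 = {("p0", "p1", "p2"): 1}
s3 = {("p0", "p1", "p2", "p3"): 1}
ds2 = boundary(s2)                 # (p1, p2) - (p0, p2) + (p0, p1)
dds3 = boundary(boundary(s3))      # every face of a face appears twice with opposite signs
```

Here `dds3` is the empty chain, in line with the pairwise cancellation described in the proof above.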

In $\mathbb{R}^m$ we define the standard simplex $\bar{s}_m = (p_0, \ldots, p_m)$ as follows:

$p_0 = (0, 0, \ldots, 0)$ (origin)

$p_1 = (1, 0, \ldots, 0)$

$p_2 = (0, 1, \ldots, 0)$

...

$p_m = (0, 0, \ldots, 1)$.

Now let $\omega$ be an $m$-form on $U \subset \mathbb{R}^m$, where $\bar{s}_m \subset U$. Then $\omega$ can be written as

\[ \omega = A(x^1, x^2, \ldots, x^m) \, dx^1 \wedge dx^2 \wedge \ldots \wedge dx^m. \]

Let us define the integral of $\omega$ over the standard simplex:

\[ \int_{\bar{s}_m} \omega \equiv \int_{\bar{s}_m} dx^1 \ldots dx^m \, A(x^1, \ldots, x^m). \]


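Example: Consider $m = 3$, $\omega = dx \wedge dy \wedge dz$:

\[ \begin{aligned}
\int_{\bar{s}_3} \omega &= \int_0^1 dx \int_0^{1-x} dy \int_0^{1-x-y} dz = \int_0^1 dx \int_0^{1-x} dy \, (1 - x - y) \\
&= \int_0^1 dx \left( (1 - x)^2 - \frac{1}{2} (1 - x)^2 \right) = \frac{1}{2} \int_0^1 dx \, (1 - x)^2 = \frac{1}{6}.
\end{aligned} \]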

4.5.2 Simplexes and Chains on Manifolds

Let $M$ be a manifold of dimension $m$ and $s_n \subset U \subset \mathbb{R}^n$ a Euclidean $n$-simplex ($s_n = (p_0, \ldots, p_n)$). In addition, let $\varphi : U \to M$ be a smooth map (it does not need to be injective or surjective), where $U$ is open. A "protosimplex" on $M$ is $(s_n, U, \varphi)$. If $t_n = (q_0, \ldots, q_n) \subset V \subset \mathbb{R}^n$ is another Euclidean $n$-simplex and $\psi : V \to M$, then $(s_n, U, \varphi) \sim (t_n, V, \psi)$ if

\[ \psi\left( \sum_{i=0}^{n} t_i x^\mu(q_i) \right) = \varphi\left( \sum_{i=0}^{n} t_i x^\mu(p_i) \right) \]

with the same $t_i$. (So the points with the same barycentric coordinates map to the same point on $M$.) We can see that $\sim$ is an equivalence relation.

An $n$-simplex $\sigma_n$ on $M$ is an equivalence class of this equivalence relation. If $(s_n, U, \varphi)$ is a representative of $\sigma_n$ and the "sides" of $s_n$ are $t_0, \ldots, t_n$, with $\partial s_n = \sum \pm t_i$, then the sides of $\sigma_n$ are $\tau_i = (t_i, V_i, \varphi)$, where $t_i \subset V_i \subset U$ ($V_i$ open in $\mathbb{R}^{n-1}$), and the boundary of $\sigma_n$ is $\partial \sigma_n = \sum \pm \tau_i$.

An $n$-chain on $M$ is a formal sum $c = \sum_i a_i \sigma_{n,i}$, where $a_i \in \mathbb{R}$ and $\sigma_{n,i}$ is an $n$-simplex. Addition of chains is defined by $\alpha c + \beta c' \equiv \sum_i (\alpha a_i + \beta a'_i) \sigma_{n,i}$. The boundary of the chain is $\partial c \equiv \sum_i a_i \, \partial \sigma_{n,i}$.

If we denote by $C_n(M)$ the set of chains ($C_n(M) = \{n\text{-chains on } M\}$), then we have a linear map $\partial : C_n(M) \to C_{n-1}(M)$ with the property $\partial^2 = 0$. A cycle $z$ is a chain with a vanishing boundary: $\partial z = 0$. (Compare with closed $n$-forms: $d\omega = 0$.) A cycle $b$ is a boundary cycle, or boundary, if there exists an $(n+1)$-chain $c$ such that $b = \partial c$. (Compare with exact $n$-forms: $\omega = d\alpha$ for some $(n-1)$-form $\alpha$.) Every boundary is a cycle, but not vice versa. (Compare: all exact forms are closed, but not vice versa.)

Integration of Forms: Let $M$ be a manifold, $\omega$ a $p$-form on $M$ and $c$ a $p$-chain on $M$. We wish to define

\[ \int_c \omega. \]

Let us write $c = \sum_i a_i s_i$, where the $s_i$'s are $p$-simplexes, and let us define

\[ \int_c \omega = \sum_i a_i \int_{s_i} \omega. \]


This means that we have to define the integral of $\omega$ over a simplex $s$. We can write the simplex in the form $(\bar{s}_p, U, \varphi)$, where $\bar{s}_p$ is a standard simplex in $\mathbb{R}^p$, $\varphi : U \to M$, $\bar{s}_p \subset U$. Now we can define

\[ \int_s \omega \equiv \int_{\bar{s}_p} \varphi^* \omega. \]

In practice there are often more convenient ways to compute the integral.

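Stokes' Theorem: Let $\omega \in \Omega^{r-1}(M)$ and let $c$ be an $r$-chain on $M$. Then

\[ \int_c d\omega = \int_{\partial c} \omega. \]

Proof: Due to linearity it is enough to show this for a simplex: $\int_s d\omega = \int_{\partial s} \omega$. Writing $s$ as $(\bar{s}_r, U, \varphi)$ we can write

\[ \int_s d\omega = \int_{\bar{s}_r} \varphi^*(d\omega) \overset{*}{=} \int_{\bar{s}_r} d(\varphi^* \omega), \]

where $(*)$ is an exercise. Similarly

\[ \int_{\partial s} \omega = \int_{\partial \bar{s}_r} \varphi^* \omega. \]

Thus it is enough to show that in $\mathbb{R}^r$ we have

\[ \int_{\bar{s}_r} d\eta = \int_{\partial \bar{s}_r} \eta, \qquad \eta \in \Omega^{r-1}(\mathbb{R}^r). \]

In general $\eta = \sum_\mu a_\mu(x) \, dx^1 \wedge \ldots \wedge dx^{\mu-1} \wedge dx^{\mu+1} \wedge \ldots \wedge dx^r$. It is enough to examine one term, for instance $\eta = a(x) \, dx^1 \wedge \ldots \wedge dx^{r-1}$. Then $d\eta = (-1)^{r-1} \frac{\partial a(x)}{\partial x^r} \, dx^1 \wedge \ldots \wedge dx^r$. A direct calculation gives

\[ \begin{aligned}
\int_{\bar{s}_r} d\eta &= (-1)^{r-1} \int_{\bar{s}_r} \frac{\partial a(x)}{\partial x^r} \, dx^1 \ldots dx^r \\
&= (-1)^{r-1} \int_{x^\mu \geq 0, \; \sum_{\mu=1}^{r-1} x^\mu \leq 1} dx^1 \ldots dx^{r-1} \int_0^{1 - \sum_{\mu=1}^{r-1} x^\mu} dx^r \, \frac{\partial a(x)}{\partial x^r} \\
&= (-1)^{r-1} \int dx^1 \ldots dx^{r-1} \left( a\Big(x^1, \ldots, x^{r-1}, 1 - \sum_{\mu=1}^{r-1} x^\mu\Big) - a(x^1, \ldots, x^{r-1}, 0) \right).
\end{aligned} \tag{12} \]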

Now $\partial \bar{s}_r = (p_1, \ldots, p_r) - (p_0, p_2, \ldots, p_r) + \ldots + (-1)^r (p_0, \ldots, p_{r-1})$. The sides $(p_0, p_2, \ldots, p_r), \ldots, (p_0, p_1, \ldots, p_{r-2}, p_r)$ are all subsets of the planes $x^\mu = 0$, $\mu = 1, 2, \ldots, r-1$. In the plane $x^\mu = 0$ the $\mu$ component of tangent vectors is zero, i.e. $\eta(v_1, \ldots, v_{r-1}) = 0$. Therefore $\eta = 0$ on these sides, and only the sides $(p_1, \ldots, p_r)$ and $(-1)^r (p_0, \ldots, p_{r-1})$ contribute. The latter is a standard simplex:

\[ (-1)^r \int_{(p_0, \ldots, p_{r-1})} \eta = (-1)^r \int_{\bar{s}_{r-1}} dx^1 \ldots dx^{r-1} \, a(x^1, \ldots, x^{r-1}, 0). \]

This is the second term in (12). The side $\sigma \equiv (p_1, \ldots, p_r)$ is not a standard simplex. The integral over it is defined by mapping $\sigma$ to a standard simplex, preserving orientation. This is done by mapping points with the same barycentric coordinates to each other, which here simply means a projection to the $x^r = 0$ plane:

\[ (p_1, \ldots, p_{r-1}, p_r) \mapsto (p_1, \ldots, p_{r-1}, p_0) = (-1)^{r-1} (p_0, \ldots, p_{r-1}) = (-1)^{r-1} \bar{s}_{r-1}. \]

Therefore

\[ \int_{(p_1, \ldots, p_r)} \eta = (-1)^{r-1} \int_{\bar{s}_{r-1}} dx^1 \ldots dx^{r-1} \, a\Big(x^1, \ldots, x^{r-1}, 1 - \sum_{\mu=1}^{r-1} x^\mu\Big). \]

This is the first term in (12). Therefore $\int_c d\omega = \int_{\partial c} \omega$, quod erat demonstrandum.
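Stokes' theorem on the standard 2-simplex can be verified numerically for a concrete one-form. This is a sketch under stated assumptions: the one-form $\omega = A \, dx + B \, dy$ with $A = x y^2$, $B = x^2 + y$ is an arbitrary illustrative choice, its exterior derivative is differentiated by hand, and all integrals use a simple midpoint rule.

```python
def midpoint(func, a, b, n=400):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

# Illustrative 1-form omega = A dx + B dy (A and B are arbitrary choices).
A = lambda x, y: x * y ** 2
B = lambda x, y: x ** 2 + y
# d(omega) = (dB/dx - dA/dy) dx ^ dy, differentiated by hand for these A and B.
d_omega = lambda x, y: 2 * x - 2 * x * y

def integrate_over_triangle(g):
    """Integral of g dx dy over the standard simplex with vertices (0,0), (1,0), (0,1)."""
    return midpoint(lambda x: midpoint(lambda y: g(x, y), 0.0, 1.0 - x), 0.0, 1.0)

def integrate_over_edge(p, q):
    """Integral of omega along the oriented segment (p, q), parametrised by t in [0, 1]."""
    dx, dy = q[0] - p[0], q[1] - p[1]

    def integrand(t):
        x, y = p[0] + t * dx, p[1] + t * dy
        return A(x, y) * dx + B(x, y) * dy

    return midpoint(integrand, 0.0, 1.0)

p0, p1, p2 = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)
lhs = integrate_over_triangle(d_omega)
# Boundary of (p0, p1, p2) is (p1, p2) - (p0, p2) + (p0, p1).
rhs = integrate_over_edge(p1, p2) - integrate_over_edge(p0, p2) + integrate_over_edge(p0, p1)
```

A hand calculation gives $1/4$ for both sides, and `lhs` and `rhs` should agree with that value up to the quadrature error.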

4.6 Briefly about Lie Groups and Algebras

A Lie group G is a differentiable manifold with a group structure,

(i) product G×G→ G, (g1, g2) 7→ g1g2, such that g1(g2g3) = (g1g2)g3,

(ii) unit element: point e ∈ G such that eg = ge = g ∀g ∈ G,

(iii) inverse element: ∀g ∈ G ∃g−1 ∈ G such that gg−1 = g−1g = e,

in such a way that the map $G \times G \to G$, $(g_1, g_2) \mapsto g_1 g_2$ is differentiable. We already know some examples: GL, SL, O, U, SU and SO.

Example: Coordinates on $GL(n, \mathbb{R})$: $x^{ij}(g) = g^{ij}$ (and thus $x^{ij}(e) = \delta^{ij}$). One chart is sufficient: $U = GL(n, \mathbb{R})$ (thus $U$ is open in any topology).

• To be exact, we do not yet have a topology on $GL(n, \mathbb{R})$. We can define the topology in several (inequivalent) ways. One way would be to choose a topology manually, for instance the discrete or the trivial topology. This is rarely a useful method. A better way of defining the topology is to choose a map $f$ from $GL(n, \mathbb{R})$ to some known topological space $N$ and then choose the topology on $GL(n, \mathbb{R})$ so that the map $f$ is continuous, i.e. define

$V \subset GL(n, \mathbb{R})$ is open $\Leftrightarrow V = f^{-1}(W)$ for some $W$ open in $N$

(check that this defines a topology). Here are two possible topologies:


1. Choose $f : GL(n, \mathbb{R}) \to \mathbb{R}$, $g \mapsto \det(g)$. (So we choose $N = \mathbb{R}$.) The induced topology is: $V \subset GL(n, \mathbb{R})$ is open $\Leftrightarrow V = f^{-1}(W)$ for some $W$ open in $\mathbb{R}$. Note that $GL(n, \mathbb{R})$ is not Hausdorff with respect to this topology, since if $g_1, g_2 \in GL(n, \mathbb{R})$, $g_1 \neq g_2$, and $\det g_1 = \det g_2$, then any open set containing $g_1$ also contains $g_2$.

2. Choose $N = \mathbb{R}^{n^2}$ and $f : GL(n, \mathbb{R}) \to \mathbb{R}^{n^2}$ defined by

\[ \begin{pmatrix} x^{11} & \cdots & x^{1n} \\ \vdots & \ddots & \vdots \\ x^{n1} & \cdots & x^{nn} \end{pmatrix} \mapsto (x^{11}, x^{12}, \ldots, x^{1n}, x^{21}, \ldots, x^{nn}) \in \mathbb{R}^{n^2}. \]

This is clearly injective, and when we define the topology as above, we see that $f$ is a homeomorphism from $GL(n, \mathbb{R})$ to an open subset of $\mathbb{R}^{n^2}$. Since $\mathbb{R}^{n^2}$ is Hausdorff, so is $GL(n, \mathbb{R})$ with this topology. Thus this topology is not equivalent to the one defined in the first example. This is the usual topology one has on $GL(n, \mathbb{R})$.

Let $a \in G$ be a given element. We can define the left-translation

\[ L_a : G \to G, \qquad L_a(g) = ag \quad \text{(group action on itself from the left).} \]

This is a diffeomorphism $G \to G$.

A vector field $X$ on $G$ is left-invariant if the pushforward satisfies

\[ (L_a)_* X|_g = X|_{ag}. \]

Using coordinates, this means

\[ (L_a)_* X|_g = X^\mu(g) \frac{\partial x^\alpha(ag)}{\partial x^\mu(g)} \left. \frac{\partial}{\partial x^\alpha} \right|_{ag} = X|_{ag} = X^\alpha(ag) \left. \frac{\partial}{\partial x^\alpha} \right|_{ag}, \]

and thus

\[ X^\alpha(ag) = X^\mu(g) \frac{\partial x^\alpha(ag)}{\partial x^\mu(g)}. \]

A left-invariant vector field is uniquely defined by its value at a point, for example at $e \in G$, because

\[ X|_g = (L_g)_* X|_e \equiv L_{g*} V, \]

where $V = X|_e \in T_eG$. Let us denote the set of left-invariant vector fields by $\mathfrak{g}$. It is a vector space (since $L_{g*}$ is a linear map); it is isomorphic with $T_eG$. Thus we have $\dim \mathfrak{g} = \dim G$.


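Example: The left-invariant fields of $GL(n, \mathbb{R})$:

\[ V = V^{ij} \left. \frac{\partial}{\partial x^{ij}} \right|_e \in T_e GL(n, \mathbb{R}), \]

\[ \begin{aligned}
X|_g = L_{g*} V &= V^{ij} \frac{\partial \left( x^{kl}(g) x^{lm}(e) \right)}{\partial x^{ij}(e)} \frac{\partial}{\partial x^{km}(g)} = V^{ij} x^{kl}(g) \delta^l_i \delta^m_j \frac{\partial}{\partial x^{km}(g)} \\
&= V^{ij} x^{ki}(g) \frac{\partial}{\partial x^{kj}(g)} = \underbrace{x^{ki}(g) V^{ij}}_{(gV)^{kj}} \frac{\partial}{\partial x^{kj}(g)} = (gV)^{kj} \frac{\partial}{\partial x^{kj}(g)},
\end{aligned} \]

where $x^{kl}(g) x^{lm}(e) = x^{km}(ge) = x^{km}(g)$ and $V^{ij}$ is an arbitrary $n \times n$ real matrix.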

Since $\mathfrak{g}$ is a collection of vector fields, we can compute their commutators. The result is again left-invariant!

\[ L_{a*} [X, Y]|_g = [L_{a*} X|_g, L_{a*} Y|_g] \overset{\text{l. inv.}}{=} [X|_{ag}, Y|_{ag}] \equiv [X, Y]|_{ag}. \]

So if $X, Y \in \mathfrak{g}$, also $[X, Y] \in \mathfrak{g}$.

Definition: The set of left-invariant vector fields $\mathfrak{g}$ with the commutator (Lie bracket) $[\;,\;] : \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$ is called the Lie algebra of a Lie group $G$.

Examples:

1. $gl(n, \mathbb{R}) = \{ n \times n \text{ real matrices} \}$ (Lie algebras are written with lower case letters).

2. $sl(n, \mathbb{R})$: Take a curve $c(t)$ that passes through $e \in SL(n, \mathbb{R})$ and compute its tangent vector ($c(0) = e = 1_n$). For small $t$: $c(t) = 1_n + tA$, $\left. \frac{dc}{dt} \right|_{t=0} = A \in T_e SL(n, \mathbb{R})$. Now $\det c(t) = \det(1_n + tA) = 1 + t \, \mathrm{tr} A + \ldots = 1$. Thus $\mathrm{tr} A = 0$ and $sl(n, \mathbb{R}) = \{ A \mid A \text{ is an } n \times n \text{ real matrix}, \; \mathrm{tr} A = 0 \}$.

3. $so(n)$: $c(t) = 1_n + tA$. We need $c(t)$ to be orthogonal: $c(t) c(t)^T = (1 + tA)(1 + tA^T) = 1 + t(A + A^T) + O(t^2) = 1$. Thus we need to have $A = -A^T$, and so $so(n) = \{ A \mid A \text{ is an antisymmetric } n \times n \text{ matrix} \}$.

For complex matrices, the coordinates are taken to be the real and imaginary parts of the matrix.

4. $u(n)$: $c(t) = 1_n + tA$. Thus $c(t) c(t)^\dagger = (1 + tA)(1 + tA^\dagger) = 1 + t(A + A^\dagger) + O(t^2) = 1$. So $A = -A^\dagger$ and $u(n) = \{ A \mid A \text{ is an antihermitian } n \times n \text{ complex matrix} \}$.

Note: In physics, we usually use the convention $c(t) = 1 + itA \Rightarrow A^\dagger = A \Rightarrow u(n) = \{ \text{Hermitian } n \times n \text{ matrices} \}$.

5. $su(n) = \{ n \times n \text{ antihermitian traceless matrices} \}$.
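For matrix groups the Lie bracket becomes the matrix commutator $[A, B] = AB - BA$ (a standard fact which we assume here), so the closure of the algebras listed above under the bracket can be tested directly. The matrices below are arbitrary illustrative elements, and the helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def is_antisymmetric(A):
    n = len(A)
    return all(A[i][j] == -A[j][i] for i in range(n) for j in range(n))

# Two elements of so(3): antisymmetric matrices with arbitrary integer entries.
A = [[0, 1, -2], [-1, 0, 3], [2, -3, 0]]
B = [[0, 4, 1], [-4, 0, -5], [-1, 5, 0]]
C = commutator(A, B)

# Two elements of sl(2, R): traceless matrices.
P = [[1, 2], [3, -1]]
Q = [[0, 5], [-7, 0]]
R = commutator(P, Q)
```

The commutator `C` is again antisymmetric and `R` is again traceless, as expected if $so(3)$ and $sl(2, \mathbb{R})$ are closed under the bracket.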


4.6.1 Structure Constants of the Lie Algebra

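Let $\{V_1, \ldots, V_n\}$ be a basis of $T_eG$ (assume $\dim G = n < \infty$). Then $X_\mu|_g = L_{g*} V_\mu$, $\mu = 1, \ldots, n$, is a basis of $T_gG$ (usually it is not a coordinate basis). Since the vectors $\{V_1, \ldots, V_n\}$ are linearly independent, $\{X_1|_g, \ldots, X_n|_g\}$ are also linearly independent. ($L_{g*}$ is an isomorphism between $T_eG$ and $T_gG$; $(L_{g*})^{-1} = L_{g^{-1}*}$.) Since the $V_\mu$ are basis vectors of $T_eG$, we can expand

\[ [V_\mu, V_\nu] = c_{\mu\nu}{}^{\lambda} V_\lambda. \]

Let us then push this to $T_gG$:

\[ \begin{aligned}
L_{g*} [V_\mu, V_\nu] &= [L_{g*} V_\mu, L_{g*} V_\nu] = [X_\mu|_g, X_\nu|_g], \\
L_{g*} (c_{\mu\nu}{}^{\lambda} V_\lambda) &= c_{\mu\nu}{}^{\lambda} X_\lambda|_g \\
\Rightarrow \; [X_\mu|_g, X_\nu|_g] &= c_{\mu\nu}{}^{\lambda} X_\lambda|_g.
\end{aligned} \]

Letting $g$ vary over all of $G$, we get the same equation everywhere on $G$ with the same numbers $c_{\mu\nu}{}^{\lambda}$. Thus we can write

\[ [X_\mu, X_\nu] = c_{\mu\nu}{}^{\lambda} X_\lambda. \]

The $c_{\mu\nu}{}^{\lambda}$ are called the structure constants of the Lie algebra. Evidently we have $c_{\mu\nu}{}^{\lambda} = -c_{\nu\mu}{}^{\lambda}$. We also have the Jacobi identity (of commutators)

\[ c_{\mu\nu}{}^{\tau} c_{\tau\rho}{}^{\sigma} + c_{\nu\rho}{}^{\tau} c_{\tau\mu}{}^{\sigma} + c_{\rho\mu}{}^{\tau} c_{\tau\nu}{}^{\sigma} = 0. \]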
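As an illustration, the antisymmetry and the Jacobi identity can be checked by brute force for a concrete set of structure constants. We take $c_{\mu\nu}{}^{\lambda} = \epsilon_{\mu\nu\lambda}$, the structure constants of $so(3)$ in a standard basis; this choice, and the helper names `epsilon` and `jacobi`, are assumptions made only for this sketch.

```python
from itertools import product

def epsilon(i, j, k):
    """Levi-Civita symbol on the indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

n = 3
# c[mu][nu][lam] = c_{mu nu}^{lam}, here the so(3) structure constants.
c = [[[epsilon(m, v, l) for l in range(n)] for v in range(n)] for m in range(n)]

def jacobi(m, v, r, s):
    """c_{mu nu}^tau c_{tau rho}^sigma plus the cyclic permutations of (mu, nu, rho)."""
    return sum(
        c[m][v][t] * c[t][r][s] + c[v][r][t] * c[t][m][s] + c[r][m][t] * c[t][v][s]
        for t in range(n)
    )

antisymmetric = all(c[m][v][l] == -c[v][m][l] for m, v, l in product(range(n), repeat=3))
jacobi_holds = all(jacobi(m, v, r, s) == 0 for m, v, r, s in product(range(n), repeat=4))
```

Both `antisymmetric` and `jacobi_holds` evaluate to `True` for this choice of constants.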

4.6.2 The adjoint representation of G

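Let $b$ be some element of $G$, $b \in G$. Let us define the map

\[ \mathrm{ad}_b : G \to G, \qquad \mathrm{ad}_b(g) \equiv \mathrm{ad}_b \, g = b g b^{-1}. \]

This is a homomorphism: $\mathrm{ad}_b g_1 \cdot \mathrm{ad}_b g_2 = \mathrm{ad}_b (g_1 g_2)$, and at the same time it defines an action of $G$ on itself (conjugation): $\mathrm{ad}_b \cdot \mathrm{ad}_c = \mathrm{ad}_{bc}$, $\mathrm{ad}_e = \mathrm{id}_G$. (Note that this is really a composition of maps: $\mathrm{ad}_b \cdot \mathrm{ad}_c \equiv \mathrm{ad}_b \circ \mathrm{ad}_c$.) The differential map $\mathrm{ad}_{b*}$ pushes vectors from $T_gG$ to $T_{\mathrm{ad}_b g}G$. If $g = e$, $\mathrm{ad}_b e = b e b^{-1} = e$, so $\mathrm{ad}_{b*}$ maps $T_eG$ to itself. Let us denote this map by $\mathrm{Ad}_b$:

\[ \mathrm{Ad}_b : T_eG \to T_eG, \qquad \mathrm{Ad}_b = \mathrm{ad}_{b*}|_{T_eG}. \]

One can easily show that $(f \circ g)_* = f_* \circ g_*$, thus $\mathrm{ad}_{b*} \, \mathrm{ad}_{c*} = \mathrm{ad}_{bc*}$. It then follows that $\mathrm{Ad}_b$ is a representation of $G$ in the vector space $\mathfrak{g} \cong T_eG$, the so-called adjoint representation:

\[ \mathrm{Ad} : G \to \mathrm{Aut}(\mathfrak{g}), \qquad b \mapsto \mathrm{Ad}_b. \]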


If $G$ is a matrix group (O, SO, ...), then $V \in T_eG \cong \mathfrak{g}$ is a matrix and

\[ \mathrm{Ad}_g V = g V g^{-1}. \]

(This follows from $\mathrm{ad}_g(e + tV) = e + t \, g V g^{-1}$.) So, if $\{V_\mu\}$ is a basis of $\mathfrak{g}$,

\[ g V_\mu g^{-1} = V_\nu \, D^{(\mathrm{adj})\nu}{}_{\mu}(g). \]

We will next move to a new topic, Riemannian geometry, and will return to discuss Lie algebras more in Section 6.

5 Riemannian Geometry (Metric Manifolds)

(Chapter 7 of Nakahara’s book)

5.1 The Metric Tensor

Let $M$ be a differentiable manifold. A Riemannian metric $g$ on $M$ is a $(0, 2)$ tensor field which satisfies

(i) $g_p(U, V) = g_p(V, U) \;\; \forall p \in M, \; U, V \in T_pM$ (i.e. $g$ is symmetric)

(ii) $g_p(U, U) \geq 0$, and $g_p(U, U) = 0 \Leftrightarrow U = 0$ ($g$ is positive definite).

If instead of (ii) $g$ satisfies

(ii') if $g_p(U, V) = 0$ for all $U \in T_pM$, then $V = 0$,

we say that $g$ is a pseudo-Riemannian metric (symmetric and non-degenerate). $(M, g)$ with a (pseudo-)Riemannian metric is called a (pseudo-)Riemannian manifold. The spacetime in general relativity is an example of a pseudo-Riemannian manifold.

In local coordinates $g = g_{\mu\nu} \, dx^\mu \otimes dx^\nu$. (The Euclidean metric: $g_{\mu\nu} = \delta_{\mu\nu}$. Then $g(U, V) = \sum_{i=1}^{n} U^i V^i$.)

5.2 The Induced Metric

Let $(N, g_N)$ be a Riemannian manifold, $\dim N = n$. We define an $m$-dimensional submanifold $M$ of $N$ as follows. Let $f : M \to N$ be a smooth map such that $f$ is an injection and the pushforward $f_* : T_pM \to T_{f(p)}N$ is also an injection. Then $f$ is an embedding of $M$ in $N$ and the image $f(M)$ is a submanifold of $N$. However, it follows that $M$ and $f(M)$ are diffeomorphic, so we can call $M$ a submanifold of $N$.


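Now the pullback $f^*$ of $f$ induces the natural metric $g_M$ on $M$:

\[ g_M = f^* g_N. \]

The components of $g_M$ are given by

\[ g_{M\mu\nu}(x) = g_{N\alpha\beta}(f(x)) \frac{\partial f^\alpha}{\partial x^\mu} \frac{\partial f^\beta}{\partial x^\nu}. \]

[By the chain rule: $g_{M\mu\nu} \, dx^\mu \otimes dx^\nu = g_{N\alpha\beta} \frac{\partial f^\alpha}{\partial x^\mu} \frac{\partial f^\beta}{\partial x^\nu} \, dx^\mu \otimes dx^\nu$.]

Example: Let $(\theta, \phi)$ be the polar coordinates on $S^2$ and $f : S^2 \to \mathbb{R}^3$ the usual embedding: $f(\theta, \phi) = (\sin\theta \cos\phi, \sin\theta \sin\phi, \cos\theta)$. On $\mathbb{R}^3$ we have the Euclidean metric $\delta_{\mu\nu}$. We denote $y^1 = \theta$, $y^2 = \phi$. We obtain the induced metric on $S^2$:

\[ g_{\mu\nu} \, dy^\mu \otimes dy^\nu = \delta_{\alpha\beta} \frac{\partial f^\alpha}{\partial y^\mu} \frac{\partial f^\beta}{\partial y^\nu} \, dy^\mu \otimes dy^\nu = d\theta \otimes d\theta + \sin^2\theta \, d\phi \otimes d\phi. \]

Thus the components of the metric are $g_{11}(\theta, \phi) = 1$, $g_{22}(\theta, \phi) = \sin^2\theta$, $g_{12}(\theta, \phi) = g_{21}(\theta, \phi) = 0$.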
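The induced metric on $S^2$ can be recomputed numerically from the embedding. This is a minimal sketch: the helper `induced_metric` approximates the Jacobian $\partial f^\alpha / \partial y^\mu$ by central differences, and the sample point is an arbitrary choice.

```python
import math

def embedding(theta, phi):
    """The usual embedding f: S^2 -> R^3 in polar coordinates."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def induced_metric(f, y, h=1e-6):
    """g_{mu nu} = delta_{alpha beta} (df^alpha/dy^mu)(df^beta/dy^nu), by central differences."""
    jac = []  # jac[mu][alpha] = df^alpha / dy^mu
    for mu in range(len(y)):
        yp, ym = list(y), list(y)
        yp[mu] += h
        ym[mu] -= h
        fp, fm = f(*yp), f(*ym)
        jac.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    return [[sum(u * v for u, v in zip(jac[mu], jac[nu])) for nu in range(len(y))]
            for mu in range(len(y))]

theta, phi = 0.9, 2.1        # arbitrary sample point (illustrative)
g = induced_metric(embedding, (theta, phi))
```

Up to discretization error, `g` should be the diagonal matrix with entries $1$ and $\sin^2\theta$.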

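Why is the notation $ds^2$ often used for the metric?

Often the metric is denoted $ds^2 = g_{\mu\nu} \, dx^\mu \otimes dx^\nu$. The reason for this is as follows. Let $c(t)$ be a curve on a manifold $M$ with the metric $g$. The tangent vector of the curve is $\dot{c}(t)$, which in local coordinates is $\dot{c}(t) = \left( \frac{dx^\mu(t)}{dt} \right)$. [$c(t) = (x^\mu(t))$]

If $M = \mathbb{R}^3$ with the Euclidean metric $g_{\mu\nu} = \delta_{\mu\nu}$, the length of the curve between $t_0$ and $t_1$ would be

\[ L_{\mathbb{R}^3} = \int_{t_0}^{t_1} dt \, \sqrt{(\dot{x}^1)^2 + (\dot{x}^2)^2 + (\dot{x}^3)^2} = \int_{t_0}^{t_1} dt \, \sqrt{\delta_{\mu\nu} \dot{x}^\mu \dot{x}^\nu}. \]

In the general case, the length of the part of the curve between $t_0$ and $t_1$ is then

\[ L = \int_{t_0}^{t_1} dt \, \sqrt{g_{\mu\nu} \dot{x}^\mu \dot{x}^\nu}. \tag{13} \]

If $t_0$ and $t_1$ are infinitesimally close, $t_1 = t_0 + \Delta t$, then

\[ \Delta s \equiv L \approx \Delta t \, \sqrt{g_{\mu\nu} \dot{x}^\mu \dot{x}^\nu} \approx \Delta t \, \sqrt{g_{\mu\nu} \frac{\Delta x^\mu}{\Delta t} \frac{\Delta x^\nu}{\Delta t}} = \sqrt{g_{\mu\nu} \Delta x^\mu \Delta x^\nu}. \]

Thus $ds^2 = g_{\mu\nu} \, dx^\mu dx^\nu$ is the square of an "infinitesimal length element" $ds$. We will have more to say about (13) later.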


5.3 Affine Connection

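Recall that $\chi(M) = \{\text{vector fields on } M\}$. An (affine) connection $\nabla$ is a map $\chi(M) \times \chi(M) \to \chi(M)$, $(X, Y) \mapsto \nabla_X Y$, such that

1. $\nabla_X (Y + Z) = \nabla_X Y + \nabla_X Z$ (linear in the 2nd argument)

2. $\nabla_{(X+Y)} Z = \nabla_X Z + \nabla_Y Z$ (linear in the 1st argument)

3. if $f$ is a function on $M$ ($f \in \mathcal{F}(M)$), then $\nabla_{fX} Y = f \nabla_X Y$

4. $\nabla_X (fY) = X[f] Y + f \nabla_X Y$.

Now take a chart $(U, \varphi)$ with coordinates $x = \varphi(p)$. Let $\{ e_\nu = \frac{\partial}{\partial x^\nu} \}$ be the coordinate basis of $T_pM$. We define $(\dim M)^3$ connection coefficients $\Gamma^\lambda{}_{\mu\nu}$ by

\[ \nabla_{e_\mu} e_\nu = \Gamma^\lambda{}_{\mu\nu} e_\lambda. \]

We can express the connection in the coordinate basis with the help of the connection coefficients. Let $X = X^\mu e_\mu$ and $Y = Y^\nu e_\nu$ be two vector fields, and denote $\nabla_\mu \equiv \nabla_{e_\mu}$. Now

\[ \begin{aligned}
\nabla_X Y &\overset{2,3}{=} X^\mu \nabla_\mu (Y^\nu e_\nu) \overset{4}{=} X^\mu e_\mu[Y^\nu] e_\nu + X^\mu Y^\nu \nabla_\mu e_\nu = X^\mu \frac{\partial Y^\nu}{\partial x^\mu} e_\nu + X^\mu Y^\nu \Gamma^\lambda{}_{\mu\nu} e_\lambda \\
&= X^\mu \left( \frac{\partial Y^\lambda}{\partial x^\mu} + \Gamma^\lambda{}_{\mu\nu} Y^\nu \right) e_\lambda \equiv X^\mu (\nabla_\mu Y)^\lambda e_\lambda,
\end{aligned} \]

where we have

\[ (\nabla_\mu Y)^\lambda = \frac{\partial Y^\lambda}{\partial x^\mu} + \Gamma^\lambda{}_{\mu\nu} Y^\nu. \]

Note that $\nabla_X Y$ contains no derivatives of $X$, unlike $\mathcal{L}_X Y$.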

5.4 Parallel Transport and Geodesics

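Let $c : (a, b) \to M$ be a curve on $M$ with coordinate representation $x^\mu = x^\mu(t)$. Its tangent vector is

\[ V = V^\mu e_\mu|_{c(t)} = \left. \frac{dx^\mu(c(t))}{dt} e_\mu \right|_{c(t)}. \]

If a vector field $X$ satisfies

\[ \nabla_V X = 0 \quad \text{(along } c(t)\text{)}, \]

then we say that $X$ is parallel transported along the curve $c(t)$. In component form this is

\[ \frac{dX^\mu}{dt} + \Gamma^\mu{}_{\nu\lambda} \frac{dx^\nu(t)}{dt} X^\lambda = 0. \]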


If the tangent vector $V$ itself is parallel transported along the curve $c(t)$,

\[ \nabla_V V = 0, \tag{14} \]

then the curve $c(t)$ is called a geodesic. Equation (14) is the geodesic equation, and in component form it is

\[ \frac{d^2 x^\mu}{dt^2} + \Gamma^\mu{}_{\nu\lambda} \frac{dx^\nu}{dt} \frac{dx^\lambda}{dt} = 0. \]

Geodesics can be interpreted as the straightest possible curves in a Riemannian manifold. If $M = \mathbb{R}^n$ and $g = \delta$ (the Euclidean metric), then the geodesics are straight lines.

5.5 The Covariant Derivative of Tensor Fields

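Connection was the term that we used for the map $\nabla : (X, Y) \mapsto \nabla_X Y$. The map $\nabla_X : \chi(M) \to \chi(M)$, $Y \mapsto \nabla_X Y$ is called the covariant derivative. It is a proper generalization of the directional derivative of functions to vector fields and, as we will discuss next, to tensor fields.

For a function, we define $\nabla_X f$ to be the same as the directional derivative:

\[ \nabla_X f = X[f]. \]

Thus condition number 4 in the definition of $\nabla$ is the Leibniz rule:

\[ \nabla_X (fY) = (\nabla_X f) Y + f (\nabla_X Y). \]

Let us require that this should be true for any product of tensors:

\[ \nabla_X (T_1 \otimes T_2) = (\nabla_X T_1) \otimes T_2 + T_1 \otimes (\nabla_X T_2), \]

where $T_1$ and $T_2$ are tensor fields of arbitrary types. The formula must also be true when some of the indices are contracted. Thus we can define the covariant derivative of a one-form as follows. Let $\omega \in \Omega^1(M)$ be a one-form ($(0,1)$ tensor field) and $Y \in \chi(M)$ a vector field ($(1,0)$ tensor field). Then $\langle \omega, Y \rangle \in \mathcal{F}(M)$ is a smooth function on $M$. Recall that $\langle \omega, Y \rangle \equiv \omega[Y] = \omega_\mu Y^\mu$. (Here $\mu$ is the contracted index.) Then

\[ \nabla_X \langle \omega, Y \rangle = X[\omega[Y]] = X^\mu \frac{\partial}{\partial x^\mu} (\omega_\nu Y^\nu) = X^\mu \frac{\partial \omega_\nu}{\partial x^\mu} Y^\nu + X^\mu \omega_\nu \frac{\partial Y^\nu}{\partial x^\mu}. \]

On the other hand, because of the Leibniz rule we must have

\[ \begin{aligned}
\nabla_X \langle \omega, Y \rangle &= \langle \nabla_X \omega, Y \rangle + \langle \omega, \nabla_X Y \rangle = (\nabla_X \omega)_\nu Y^\nu + \omega_\nu (\nabla_X Y)^\nu \\
&= (\nabla_X \omega)_\nu Y^\nu + \omega_\nu X^\mu \frac{\partial Y^\nu}{\partial x^\mu} + \omega_\nu \Gamma^\nu{}_{\mu\alpha} X^\mu Y^\alpha.
\end{aligned} \]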


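From these two formulas we find $(\nabla_X \omega)_\nu$. (Note that the two $X^\mu \omega_\nu \frac{\partial Y^\nu}{\partial x^\mu}$ terms cancel.)

\[ \Rightarrow \; (\nabla_X \omega)_\nu = X^\mu \left( \frac{\partial \omega_\nu}{\partial x^\mu} - \Gamma^\alpha{}_{\mu\nu} \omega_\alpha \right). \]

When $X = \frac{\partial}{\partial x^\mu}$, this reduces to

\[ (\nabla_\mu \omega)_\nu = \frac{\partial \omega_\nu}{\partial x^\mu} - \Gamma^\alpha{}_{\mu\nu} \omega_\alpha. \]

Further, when $\omega = dx^\sigma$: $\nabla_\mu dx^\sigma = -\Gamma^\sigma{}_{\mu\nu} \, dx^\nu$.

For a generic tensor, the result turns out to be

\[ \begin{aligned}
\nabla_\nu t^{\lambda_1 \ldots \lambda_p}_{\mu_1 \ldots \mu_q} = {} & \partial_\nu t^{\lambda_1 \ldots \lambda_p}_{\mu_1 \ldots \mu_q} + \Gamma^{\lambda_1}{}_{\nu\rho} \, t^{\rho \lambda_2 \ldots \lambda_p}_{\mu_1 \ldots \mu_q} + \ldots + \Gamma^{\lambda_p}{}_{\nu\rho} \, t^{\lambda_1 \ldots \lambda_{p-1} \rho}_{\mu_1 \ldots \mu_q} \\
& - \Gamma^{\rho}{}_{\nu\mu_1} \, t^{\lambda_1 \ldots \lambda_p}_{\rho \mu_2 \ldots \mu_q} - \ldots - \Gamma^{\rho}{}_{\nu\mu_q} \, t^{\lambda_1 \ldots \lambda_p}_{\mu_1 \ldots \mu_{q-1} \rho}.
\end{aligned} \]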

(Note that we should really have written the indices in staggered positions, $t^{\lambda_1 \ldots \lambda_p}{}_{\mu_1 \ldots \mu_q}$, but this was not done for typographical reasons.)

5.6 The Transformation Properties of Connection Coefficients

Let U and V be two overlapping charts with coordinates:

on U : x eµ =∂

∂xµ,

on V : y eν =∂

∂yν=∂xµ

∂yνeµ.

Let p ∈ U ∩ V 6= ∅. The connection coefficients on V are

∇eα eβ = Γγαβ eγ = Γγαβ∂xν

∂yγeν

On the other hand

∇eα eβ = ∇eα(∂xµ

∂yβeµ) =

(∂2xν

∂yαyβ+∂xλ

∂yα∂xµ

∂yβΓνλµ

)eν

Thus

Γγαβ∂xν

∂yγ=

(∂2xν

∂yαyβ+∂xλ

∂yα∂xµ

∂yβΓνλµ

).

From this we find the transformation rule for the connection coefficients:

Γγαβ =∂yγ

∂xν∂xλ

∂yα∂xµ

∂yβΓνλµ +

∂2xν

∂yαyβ∂yγ

∂xν.

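We notice that the first term is just the transformation rule for the components of a (1,2)-tensor. But we also have an additional second term, which is symmetric in $\alpha$ and $\beta$. Thus $\Gamma$ is almost like a (1,2)-tensor, but not quite. Because the extra term is symmetric, it drops out of the antisymmetric part of $\Gamma$. To construct a (1,2)-tensor out of $\Gamma$, we therefore define

$$T^\gamma{}_{\alpha\beta} = \Gamma^\gamma_{\alpha\beta} - \Gamma^\gamma_{\beta\alpha} \equiv 2\Gamma^\gamma_{[\alpha\beta]} = \text{the torsion tensor}$$

(note: $t_{[\alpha\beta]} = \frac12(t_{\alpha\beta} - t_{\beta\alpha})$ is the antisymmetrization of indices.)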
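
As a quick illustration of the inhomogeneous term (a sketch added for concreteness; the choice of example is ours): take $M = \mathbb{R}^2$ with Cartesian coordinates $(x^1, x^2)$, in which the flat connection has vanishing coefficients, and let the second chart be polar coordinates $(y^1, y^2) = (r, \varphi)$ with $x^1 = r\cos\varphi$, $x^2 = r\sin\varphi$. Writing $\tilde\Gamma$ for the coefficients in the polar chart, only the second-derivative term of the transformation rule survives:

$$\tilde\Gamma^{r}_{\varphi\varphi} = \frac{\partial^2 x^1}{\partial\varphi^2}\frac{\partial r}{\partial x^1} + \frac{\partial^2 x^2}{\partial\varphi^2}\frac{\partial r}{\partial x^2} = (-r\cos\varphi)\cos\varphi + (-r\sin\varphi)\sin\varphi = -r,$$

$$\tilde\Gamma^{\varphi}_{r\varphi} = \frac{\partial^2 x^1}{\partial r\,\partial\varphi}\frac{\partial\varphi}{\partial x^1} + \frac{\partial^2 x^2}{\partial r\,\partial\varphi}\frac{\partial\varphi}{\partial x^2} = (-\sin\varphi)\left(-\frac{\sin\varphi}{r}\right) + \cos\varphi\,\frac{\cos\varphi}{r} = \frac1r.$$

The coefficients vanish in one chart but not in the other, which is impossible for the components of a tensor. They are symmetric in the lower indices, so the torsion vanishes in both charts.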

5.7 The Metric Connection

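Let $c$ be an arbitrary curve and $V$ its tangent vector. If a connection $\nabla$ satisfies (see footnote 3)

$$\nabla_V(g(X,Y)) = 0 \quad\text{when}\quad \nabla_V X = 0 \ \text{and}\ \nabla_V Y = 0,$$

then we say that $\nabla$ is a metric connection. Since

$$\nabla_V(g(X,Y)) = (\nabla_V g)(X,Y) + g(\overbrace{\nabla_V X}^{=0}, Y) + g(X, \overbrace{\nabla_V Y}^{=0}) = 0,$$

the metric connection satisfies $\nabla_V g = 0$.

In component form:

1. $(\nabla_\mu g)_{\alpha\beta} = \partial_\mu g_{\alpha\beta} - \Gamma^\lambda_{\mu\alpha}g_{\lambda\beta} - \Gamma^\lambda_{\mu\beta}g_{\alpha\lambda} = 0.$

And by cyclic permutation of $\mu$, $\alpha$ and $\beta$ we get:

2. $(\nabla_\alpha g)_{\beta\mu} = \partial_\alpha g_{\beta\mu} - \Gamma^\lambda_{\alpha\beta}g_{\lambda\mu} - \Gamma^\lambda_{\alpha\mu}g_{\beta\lambda} = 0$

3. $(\nabla_\beta g)_{\mu\alpha} = \partial_\beta g_{\mu\alpha} - \Gamma^\lambda_{\beta\mu}g_{\lambda\alpha} - \Gamma^\lambda_{\beta\alpha}g_{\mu\lambda} = 0$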

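Let us denote the symmetrization of indices by $\Gamma^\gamma_{(\alpha\beta)} \equiv \frac12(\Gamma^\gamma_{\alpha\beta} + \Gamma^\gamma_{\beta\alpha})$. Then adding $-(1)+(2)+(3)$ gives

$$-\partial_\mu g_{\alpha\beta} + \partial_\alpha g_{\beta\mu} + \partial_\beta g_{\mu\alpha} + T^\lambda{}_{\mu\alpha}g_{\lambda\beta} + T^\lambda{}_{\mu\beta}g_{\lambda\alpha} - 2\Gamma^\lambda_{(\alpha\beta)}g_{\lambda\mu} = 0.$$

In other words

$$\Gamma^\lambda_{(\alpha\beta)}g_{\lambda\mu} = \frac12\left(\partial_\alpha g_{\beta\mu} + \partial_\beta g_{\mu\alpha} - \partial_\mu g_{\alpha\beta} + T^\lambda{}_{\mu\alpha}g_{\lambda\beta} + T^\lambda{}_{\mu\beta}g_{\lambda\alpha}\right).$$

Thus, contracting with $g^{\kappa\mu}$,

$$\Gamma^\kappa_{(\alpha\beta)} = \left\{{\kappa \atop \alpha\beta}\right\} + \frac12\left(T_\alpha{}^\kappa{}_\beta + T_\beta{}^\kappa{}_\alpha\right),$$

where $\left\{{\kappa \atop \alpha\beta}\right\} = \frac12 g^{\kappa\mu}(\partial_\alpha g_{\beta\mu} + \partial_\beta g_{\mu\alpha} - \partial_\mu g_{\alpha\beta})$ are the Christoffel symbols and $T_\alpha{}^\kappa{}_\beta = g_{\alpha\lambda}g^{\kappa\mu}T^\lambda{}_{\mu\beta}$.

The coefficients of a metric connection thus satisfy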

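$$\Gamma^\kappa_{\alpha\beta} = \Gamma^\kappa_{(\alpha\beta)} + \Gamma^\kappa_{[\alpha\beta]} = \left\{{\kappa \atop \alpha\beta}\right\} + \underbrace{\frac12\left(T_\alpha{}^\kappa{}_\beta + T_\beta{}^\kappa{}_\alpha + T^\kappa{}_{\alpha\beta}\right)}_{\equiv\, K^\kappa{}_{\alpha\beta}\, =\, \text{contorsion}}.$$

If the torsion tensor vanishes, $T^\kappa{}_{\alpha\beta} = 0$, the metric connection is called the Levi-Civita connection:

$$\Gamma^\kappa_{\alpha\beta} = \left\{{\kappa \atop \alpha\beta}\right\}.$$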

Footnote 3: This condition means that the inner product of two vectors, and hence the angle between them, is preserved under parallel transport.
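
To make the Christoffel-symbol formula concrete, the following is a minimal numerical sketch (not part of the original notes). It assumes the unit 2-sphere with metric $g = \mathrm{diag}(1, \sin^2\theta)$ in coordinates $(\theta, \varphi)$ as a test case, and it approximates the metric derivatives by central differences; the function names are hypothetical helpers chosen for this illustration.

```python
import math

def metric_sphere(x):
    """Metric of the unit 2-sphere in coordinates x = (theta, phi) (assumed example)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def metric_derivative(metric, x, mu, h=1e-5):
    """Central-difference estimate of d g_{ab} / d x^mu at the point x."""
    xp, xm = list(x), list(x)
    xp[mu] += h
    xm[mu] -= h
    gp, gm = metric(xp), metric(xm)
    n = len(x)
    return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(n)] for a in range(n)]

def christoffel(metric, x):
    """Return G[k][a][b] = 1/2 g^{km} (d_a g_{bm} + d_b g_{ma} - d_m g_{ab})."""
    n = len(x)
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, mu) for mu in range(n)]  # dg[mu][a][b]
    return [[[0.5 * sum(g_inv[k][m] * (dg[a][b][m] + dg[b][m][a] - dg[m][a][b])
                        for m in range(n))
              for b in range(n)] for a in range(n)] for k in range(n)]

theta = 1.0
G = christoffel(metric_sphere, (theta, 0.3))
# Compare with the closed forms -sin(theta)cos(theta) and cot(theta).
print(G[0][1][1], -math.sin(theta) * math.cos(theta))
print(G[1][0][1], math.cos(theta) / math.sin(theta))
```

The two printed pairs should agree to the accuracy of the finite-difference step. The helper `inverse_2x2` restricts this sketch to two-dimensional metrics.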

5.8 Curvature And Torsion

We define two new tensors:

(Riemann) curvature tensor: $R : \chi(M)\times\chi(M)\times\chi(M)\to\chi(M)$,

$$R(X,Y,Z) \equiv R(X,Y)Z \equiv \nabla_X\nabla_Y Z - \nabla_Y\nabla_X Z - \nabla_{[X,Y]}Z.$$

Torsion tensor: $T : \chi(M)\times\chi(M)\to\chi(M)$,

$$T(X,Y) \equiv \nabla_X Y - \nabla_Y X - [X,Y].$$

Let us check that these definitions really define tensors, i.e. multilinear maps. Obviously $R(X+X',Y,Z) = R(X,Y,Z) + R(X',Y,Z)$ etc. are true, but it is less obvious that $R(fX,gY,hZ) = fgh\,R(X,Y,Z)$ where $f,g,h\in\mathcal{F}(M)$. Let us calculate:

$$[fX, gY] = fX[g]\,Y - gY[f]\,X + fg[X,Y] \tag{15}$$

Using (15) we obtain

$$\begin{aligned}
R(fX,gY)(hZ) ={}& f\nabla_X(g\nabla_Y(hZ)) - g\nabla_Y(f\nabla_X(hZ)) \\
& - fX[g]\nabla_Y(hZ) + gY[f]\nabla_X(hZ) - fg\nabla_{[X,Y]}(hZ).
\end{aligned}$$

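Here the first term is

$$\begin{aligned}
f\nabla_X(g\nabla_Y(hZ)) ={}& f\nabla_X(gY[h]Z + gh\nabla_Y Z) = fX[g]Y[h]Z + fg(X[Y[h]])Z \\
& + fgY[h]\nabla_X Z + fgX[h]\nabla_Y Z + fhX[g]\nabla_Y Z + fgh\nabla_X\nabla_Y Z,
\end{aligned}$$

and the second term is obtained by changing $X\leftrightarrow Y$ and $f\leftrightarrow g$. Continuing,

$$\begin{aligned}
R(fX,gY)(hZ) ={}& fX[g]Y[h]Z + fg(X[Y[h]])Z + fgY[h]\nabla_X Z + fgX[h]\nabla_Y Z \\
& + fhX[g]\nabla_Y Z + fgh\nabla_X\nabla_Y Z - gY[f]X[h]Z - fg(Y[X[h]])Z \\
& - fgX[h]\nabla_Y Z - fgY[h]\nabla_X Z - ghY[f]\nabla_X Z - fgh\nabla_Y\nabla_X Z \\
& - fX[g]Y[h]Z - fhX[g]\nabla_Y Z + gY[f]X[h]Z + ghY[f]\nabla_X Z \\
& - fg([X,Y][h])Z - fgh\nabla_{[X,Y]}Z \\
={}& fgh(\nabla_X\nabla_Y Z - \nabla_Y\nabla_X Z - \nabla_{[X,Y]}Z) \\
={}& fgh\,R(X,Y)Z.
\end{aligned}$$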

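Thus $R$ is a multilinear map. In other words, when $X = X^\mu e_\mu$, $Y = Y^\nu e_\nu$ and $Z = Z^\lambda e_\lambda$, we have

$$R(X,Y)Z = X^\mu Y^\nu Z^\lambda R(e_\mu,e_\nu)e_\lambda.$$

$R$ maps three vector fields to a vector field, so it is a (1,3)-tensor. A similar (but shorter) calculation shows that $T(fX,gY) = fg\,T(X,Y)$, so $T(X,Y) = X^\mu Y^\nu T(e_\mu,e_\nu)$. $T$ is a (1,2)-tensor.

The operations of $R$ and $T$ on vectors are obtained by knowing their actions on the basis vectors $e_\mu = \partial/\partial x^\mu$. Since $R(e_\mu,e_\nu)e_\lambda$ is a vector, we can expand it in the basis $e_\kappa$ and denote

$$R(e_\mu,e_\nu)e_\lambda = R^\kappa{}_{\lambda\mu\nu}e_\kappa.$$

Note the placement of indices. We can derive a formula for obtaining the components $R^\kappa{}_{\lambda\mu\nu}$. Recall that $[e_\mu,e_\nu] = 0$ and $dx^\kappa(e_\sigma) = \delta^\kappa_\sigma$. Thus we get

$$\begin{aligned}
R^\kappa{}_{\lambda\mu\nu} &= dx^\kappa(R(e_\mu,e_\nu)e_\lambda) = dx^\kappa(\nabla_\mu\nabla_\nu e_\lambda - \nabla_\nu\nabla_\mu e_\lambda) = dx^\kappa\left(\nabla_\mu(\Gamma^\eta_{\nu\lambda}e_\eta) - \nabla_\nu(\Gamma^\eta_{\mu\lambda}e_\eta)\right) \\
&= dx^\kappa\left((\partial_\mu\Gamma^\eta_{\nu\lambda})e_\eta + \Gamma^\eta_{\nu\lambda}\Gamma^\rho_{\mu\eta}e_\rho - (\partial_\nu\Gamma^\eta_{\mu\lambda})e_\eta - \Gamma^\eta_{\mu\lambda}\Gamma^\rho_{\nu\eta}e_\rho\right).
\end{aligned} \tag{16}$$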

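Therefore

$$R^\kappa{}_{\lambda\mu\nu} = \partial_\mu\Gamma^\kappa_{\nu\lambda} - \partial_\nu\Gamma^\kappa_{\mu\lambda} + \Gamma^\eta_{\nu\lambda}\Gamma^\kappa_{\mu\eta} - \Gamma^\eta_{\mu\lambda}\Gamma^\kappa_{\nu\eta}.$$

Similarly, if we denote $T(e_\mu,e_\nu) = T^\lambda{}_{\mu\nu}e_\lambda$, we can derive the components $T^\lambda{}_{\mu\nu}$:

$$T^\lambda{}_{\mu\nu} = dx^\lambda(T(e_\mu,e_\nu)) = dx^\lambda(\nabla_\mu e_\nu - \nabla_\nu e_\mu) = dx^\lambda(\Gamma^\eta_{\mu\nu}e_\eta - \Gamma^\eta_{\nu\mu}e_\eta),$$

and therefore

$$T^\lambda{}_{\mu\nu} = \Gamma^\lambda_{\mu\nu} - \Gamma^\lambda_{\nu\mu}.$$

Thus this is the same torsion tensor as the one we had defined earlier.

Geometric interpretation: see the figures in Section 7.3.2 of Nakahara.

Let us also define:

The Ricci tensor: $\mathrm{Ric}(X,Y) = dx^\lambda(R(e_\lambda,Y)X)$. Thus the components are

$$(\mathrm{Ric})_{\mu\nu} = \mathrm{Ric}(e_\mu,e_\nu) = R^\lambda{}_{\mu\lambda\nu}. \quad\text{(Usual notation: } (\mathrm{Ric})_{\mu\nu}\equiv R_{\mu\nu}.)$$

The scalar curvature: $R = g^{\mu\nu}(\mathrm{Ric})_{\mu\nu} = R^{\lambda\nu}{}_{\lambda\nu}$.

The Einstein tensor: $G_{\mu\nu} = (\mathrm{Ric})_{\mu\nu} - \frac12 R\,g_{\mu\nu}$.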
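
As a worked illustration of these definitions (our choice of example, not taken from the notes), consider the unit 2-sphere with metric $g = \mathrm{diag}(1, \sin^2\theta)$ in coordinates $(\theta,\varphi)$ and the Levi-Civita connection, whose only nonvanishing coefficients are $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = \cot\theta$. Using the component formula $R^\kappa{}_{\lambda\mu\nu} = \partial_\mu\Gamma^\kappa_{\nu\lambda} - \partial_\nu\Gamma^\kappa_{\mu\lambda} + \Gamma^\eta_{\nu\lambda}\Gamma^\kappa_{\mu\eta} - \Gamma^\eta_{\mu\lambda}\Gamma^\kappa_{\nu\eta}$:

$$R^\theta{}_{\varphi\theta\varphi} = \partial_\theta\Gamma^\theta_{\varphi\varphi} - \Gamma^\varphi_{\theta\varphi}\Gamma^\theta_{\varphi\varphi} = (\sin^2\theta - \cos^2\theta) + \cos^2\theta = \sin^2\theta,$$

$$R^\varphi{}_{\theta\varphi\theta} = -\partial_\theta\Gamma^\varphi_{\varphi\theta} - \Gamma^\varphi_{\varphi\theta}\Gamma^\varphi_{\theta\varphi} = \frac{1}{\sin^2\theta} - \cot^2\theta = 1.$$

Hence $(\mathrm{Ric})_{\theta\theta} = 1$, $(\mathrm{Ric})_{\varphi\varphi} = \sin^2\theta$, so $(\mathrm{Ric})_{\mu\nu} = g_{\mu\nu}$ and the scalar curvature is $R = g^{\theta\theta}\cdot 1 + g^{\varphi\varphi}\sin^2\theta = 2$. The Einstein tensor $G_{\mu\nu} = g_{\mu\nu} - \frac12\cdot 2\,g_{\mu\nu}$ vanishes, as it does for every two-dimensional metric.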

5.9 Geodesics of Levi-Civita Connections

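The length of a curve $c(s) = (x^\mu(s))$ is defined by

$$I(c) = \int_c ds = \int_c\sqrt{g_{\mu\nu}\frac{dx^\mu}{ds'}\frac{dx^\nu}{ds'}}\;ds' \equiv \int_c L\,ds'.$$

Since $L\,ds' = ds$, $L$ is constant along the curve whenever the parameter $s'$ is proportional to the arc length $s$. One can normalize $s'$ such that $L = 1$, so $s' = s$. Curves with extremal (minimum or maximum) length satisfy $\delta I = 0$ about the curve (variational principle). They satisfy the Euler-Lagrange equations (familiar from calculus of variations (FYMM II)):

$$\frac{d}{ds}\left(\frac{\partial L}{\partial x'^\mu}\right) - \frac{\partial L}{\partial x^\mu} = 0, \quad\text{where } x'^\mu = \frac{dx^\mu}{ds}. \tag{17}$$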

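$L$ is the Lagrange function, or Lagrangian. Instead of $L$, which contains a square root, we can equivalently use a simpler Lagrange function

$$F = \frac12 g_{\mu\nu}\frac{dx^\mu}{ds}\frac{dx^\nu}{ds} = \frac12 L^2,$$

because

$$\frac{d}{ds}\left(\frac{\partial F}{\partial x'^\mu}\right) - \frac{\partial F}{\partial x^\mu} = L\underbrace{\left(\frac{d}{ds}\left(\frac{\partial L}{\partial x'^\mu}\right) - \frac{\partial L}{\partial x^\mu}\right)}_{=0} + \frac{\partial L}{\partial x'^\mu}\underbrace{\frac{dL}{ds}}_{=0} = 0,$$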

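when $x^\mu(s)$ satisfies the Euler-Lagrange equation (17). Then $\delta\left(\int F\,ds\right) = 0$ gives

$$\begin{aligned}
& \frac{d}{ds}\left(g_{\lambda\mu}\frac{dx^\mu}{ds}\right) - \frac12\frac{\partial g_{\mu\nu}}{\partial x^\lambda}\frac{dx^\mu}{ds}\frac{dx^\nu}{ds} = 0 \\
\Rightarrow\ & \frac{\partial g_{\lambda\mu}}{\partial x^\nu}\frac{dx^\mu}{ds}\frac{dx^\nu}{ds} + g_{\lambda\mu}\frac{d^2x^\mu}{ds^2} - \frac12\frac{\partial g_{\mu\nu}}{\partial x^\lambda}\frac{dx^\mu}{ds}\frac{dx^\nu}{ds} = 0 \\
\Rightarrow\ & g_{\lambda\mu}\frac{d^2x^\mu}{ds^2} + \frac12\left(\frac{\partial g_{\lambda\mu}}{\partial x^\nu} + \frac{\partial g_{\lambda\nu}}{\partial x^\mu} - \frac{\partial g_{\mu\nu}}{\partial x^\lambda}\right)\frac{dx^\mu}{ds}\frac{dx^\nu}{ds} = 0.
\end{aligned}$$

Multiply this by $g^{\kappa\lambda}$ and sum over $\lambda$:

$$\frac{d^2x^\kappa}{ds^2} + \left\{{\kappa \atop \mu\nu}\right\}\frac{dx^\mu}{ds}\frac{dx^\nu}{ds} = 0. \tag{18}$$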

This is the geodesic equation with a Levi-Civita connection! The action $I = \int F\,ds$ sometimes provides a convenient starting point for computing the Christoffel symbols $\left\{{\kappa \atop \mu\nu}\right\}$: plug the metric into $I$, derive the Euler-Lagrange equations, and read off the Christoffel symbols by comparing the Euler-Lagrange equations with (18).

Note: previously, when we discussed the geodesic equation in the context of a general connection, we said that geodesics are the "straightest" possible curves. Now, in the context of the Levi-Civita connection, which is based only on the metric, we see that the geodesics are also the curves of extremal length (locally, the shortest possible curves).
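
As a worked illustration of this recipe (the example is our choice, not the notes'), take the unit 2-sphere with $ds^2 = d\theta^2 + \sin^2\theta\,d\varphi^2$. Writing a prime for $d/ds$, the simpler Lagrange function is

$$F = \frac12\left(\theta'^2 + \sin^2\theta\,\varphi'^2\right).$$

The Euler-Lagrange equations for $\theta$ and $\varphi$ are

$$\theta'' - \sin\theta\cos\theta\,\varphi'^2 = 0, \qquad \frac{d}{ds}\left(\sin^2\theta\,\varphi'\right) = 0 \ \Rightarrow\ \varphi'' + 2\cot\theta\,\theta'\varphi' = 0.$$

Comparing with the geodesic equation $x''^\kappa + \left\{{\kappa \atop \mu\nu}\right\}x'^\mu x'^\nu = 0$, and remembering that the mixed term appears twice in the symmetric sum, we read off

$$\left\{{\theta \atop \varphi\varphi}\right\} = -\sin\theta\cos\theta, \qquad \left\{{\varphi \atop \theta\varphi}\right\} = \left\{{\varphi \atop \varphi\theta}\right\} = \cot\theta,$$

with all other symbols vanishing.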

Note also that we can explicitly restore a parameter $m$ and write the action for the length of the curve as $I = m\int\sqrt{g_{\mu\nu}\frac{dx^\mu}{ds'}\frac{dx^\nu}{ds'}}\;ds'$. This is the relativistic action of a free massive point particle (with mass $m$) moving on a curved spacetime. Thus free point particles move along geodesics. If $m^2 > 0$ (usual particles), we say that the corresponding geodesics (on a pseudo-Riemannian manifold) are timelike; if $m^2 < 0$ (tachyonic particles), the geodesics are spacelike. Massless particles (such as the photon) move along null geodesics. The invariant length vanishes along a null geodesic, $ds^2 = 0$. This equation can be used to determine the null geodesics.

5.10 Lie Derivative And the Covariant Derivative

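Let $\Gamma^\mu_{\nu\lambda}$ be an arbitrary symmetric ($\Gamma^\mu_{\nu\lambda} = \Gamma^\mu_{\lambda\nu}$) connection. We can then re-express the Lie derivative with the help of the covariant derivative as follows:

$$(\mathcal{L}_X Y)^\mu = X^\nu\partial_\nu Y^\mu - Y^\nu\partial_\nu X^\mu = X^\nu\nabla_\nu Y^\mu - (\nabla_\nu X^\mu)Y^\nu.$$

This is true because of the symmetry of the connection:

$$\begin{aligned}
X^\nu\nabla_\nu Y^\mu - (\nabla_\nu X^\mu)Y^\nu &= X^\nu(\partial_\nu Y^\mu + \Gamma^\mu_{\nu\lambda}Y^\lambda) - (\partial_\nu X^\mu + \Gamma^\mu_{\nu\lambda}X^\lambda)Y^\nu \\
&= X^\nu\partial_\nu Y^\mu - Y^\nu\partial_\nu X^\mu + \underbrace{(\Gamma^\mu_{\nu\lambda} - \Gamma^\mu_{\lambda\nu})}_{=0}X^\nu Y^\lambda.
\end{aligned}$$

For a generic $(p,q)$-tensor:

$$\begin{aligned}
\mathcal{L}_X T^{\mu_1\ldots\mu_p}_{\nu_1\ldots\nu_q} ={}& (X^\lambda\nabla_\lambda)T^{\mu_1\ldots\mu_p}_{\nu_1\ldots\nu_q} - (\nabla_\lambda X^{\mu_1})T^{\lambda\mu_2\ldots\mu_p}_{\nu_1\ldots\nu_q} - \ldots - (\nabla_\lambda X^{\mu_p})T^{\mu_1\ldots\mu_{p-1}\lambda}_{\nu_1\ldots\nu_q} \\
& + (\nabla_{\nu_1}X^\lambda)T^{\mu_1\ldots\mu_p}_{\lambda\nu_2\ldots\nu_q} + \ldots + (\nabla_{\nu_q}X^\lambda)T^{\mu_1\ldots\mu_p}_{\nu_1\ldots\nu_{q-1}\lambda}.
\end{aligned}$$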

5.11 Isometries

Isometries are a very important concept. They are symmetries of a Riemannian manifold. If the manifold is a spacetime, we usually require a physical theory to be invariant under isometries.

Definition. Let $(M,g)$ be a (pseudo-)Riemannian manifold. A diffeomorphism $f : M\to M$ is an isometry if it preserves the metric,

$$f^*g_{f(p)} = g_p,$$

for all $p\in M$. If we interpret the metric as a map on vector fields, the above requirement means

$$g_{f(p)}(f_*X, f_*Y) = g_p(X,Y) \tag{19}$$

for all tangent vectors $X, Y\in T_pM$. In component form, (19) is

$$\frac{\partial y^\alpha}{\partial x^\mu}\frac{\partial y^\beta}{\partial x^\nu}g_{\alpha\beta}(f(p)) = g_{\mu\nu}(p) \tag{20}$$

where $x, y$ are coordinates of the points $p, f(p)$ respectively. What (19) means is that an isometry must preserve the lengths of all tangent vectors and the angles between them.

The identity map is trivially an isometry, and the composite map $f\circ g$ of two isometries $f, g$ is also an isometry. Further, if $f$ is an isometry, so is its inverse $f^{-1}$. This means that isometries form a group with composition of maps as the product, called the isometry group. The isometry group is a group of symmetries of a (pseudo-)Riemannian manifold.

Examples.

• $(M,g)$ = the Euclidean space $(\mathbb{R}^n, \delta)$ with the Euclidean metric. All translations $x^\mu\mapsto x^\mu + a^\mu$ in some direction $a = (a^\mu)$ are isometries, and so are rotations. The isometry group {translations, rotations, and their combinations} is called the Euclidean group or Galilean group and denoted by $E_n$.

• $(M,g)$ = the $(d+1)$-dimensional Minkowski space(time) $(\mathbb{R}^{1,d}, \eta)$ with the Minkowski metric $\eta$. Again, spacetime translations $x^\mu\mapsto x^\mu + a^\mu$ are isometries; additional isometries are (combinations of these and) space rotations and boosts. The isometry group {translations, rotations, boosts, and their combinations} is called the Poincaré group.

At typical laboratory scales, our spacetime is approximately flat (a Minkowski space), so its approximate isometry group is the Poincaré group. That is the reason for special relativity and the requirement that physics in the laboratory be relativistic, i.e. Poincaré invariant. More precisely, that requirement is necessary for experiments which involve scales where relativistic effects become important. For lower scales, time "decouples" and we can make a further approximation where only the Euclidean isometries of the spacelike directions are relevant. Recall also that symmetries such as the time translations and space translations lead to conservation laws, like the conservation of energy and momentum. As you can see, important physical principles are a reflection of the isometries of the spacetime.
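
For linear maps $y = \Lambda x$ the component condition (20) reduces to the matrix statement $\Lambda^T g\,\Lambda = g$. The following minimal sketch (not from the notes) checks this for a rotation of the Euclidean plane and a boost of 1+1-dimensional Minkowski space. The signature $\eta = \mathrm{diag}(-1, +1)$ and all function names are assumptions made for this illustration.

```python
import math

def transpose(a):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    """Matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def preserves_metric(jacobian, g, tol=1e-12):
    """Check eq. (20) for a linear map: J^T g J == g, with J[alpha][mu] = dy^alpha/dx^mu."""
    pulled_back = matmul(transpose(jacobian), matmul(g, jacobian))
    n = len(g)
    return all(abs(pulled_back[i][j] - g[i][j]) < tol for i in range(n) for j in range(n))

angle = 0.7
rotation = [[math.cos(angle), -math.sin(angle)],
            [math.sin(angle), math.cos(angle)]]
euclid = [[1.0, 0.0], [0.0, 1.0]]

rapidity = 0.4
boost = [[math.cosh(rapidity), math.sinh(rapidity)],
         [math.sinh(rapidity), math.cosh(rapidity)]]
minkowski = [[-1.0, 0.0], [0.0, 1.0]]  # assumed signature (-, +)

shear = [[1.0, 0.5], [0.0, 1.0]]  # a linear map that is not an isometry

print(preserves_metric(rotation, euclid))    # rotation of the Euclidean plane
print(preserves_metric(boost, minkowski))    # boost of Minkowski space
print(preserves_metric(shear, euclid))       # shear distorts lengths and angles
```

Translations have the identity as Jacobian, so they pass the same test trivially whenever the metric components are constant.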

5.12 Killing Vector Fields

Let us now consider the limit of "small" isometries, i.e. infinitesimal displacements $x = p\mapsto f(p) = y\approx x + \varepsilon X$. Here $\varepsilon$ is an infinitesimal parameter and $X$ is a vector field indicating the direction of the infinitesimal displacement. If the above map is an isometry, the vector field $X$ is called a Killing vector field. Since the infinitesimal displacement is an isometry, eq. (20) must be satisfied and it now takes the form

$$\frac{\partial(x^\alpha + \varepsilon X^\alpha)}{\partial x^\mu}\frac{\partial(x^\beta + \varepsilon X^\beta)}{\partial x^\nu}g_{\alpha\beta}(x + \varepsilon X) = g_{\mu\nu}(x). \tag{21}$$

By Taylor expanding the left hand side, and requiring that the leading infinitesimal term of order $\varepsilon$ vanishes (there is no $\varepsilon$-dependence on the right hand side), we obtain the equation

$$X^\xi\partial_\xi g_{\mu\nu} + \partial_\mu X^\alpha g_{\alpha\nu} + \partial_\nu X^\beta g_{\mu\beta} = 0. \tag{22}$$

We can recognize the left hand side as a Lie derivative, so (22) can be rewritten as

$$\mathcal{L}_X g_{\mu\nu} = 0.$$

Expressing $\mathcal{L}_X g_{\mu\nu}$ with the help of the covariant derivative,

$$\mathcal{L}_X g_{\mu\nu} = X^\lambda\overbrace{\nabla_\lambda g_{\mu\nu}}^{=0} + (\nabla_\mu X^\lambda)g_{\lambda\nu} + (\nabla_\nu X^\lambda)g_{\mu\lambda} = 0$$

($\nabla_\lambda g_{\mu\nu} = 0$ for a metric connection). Thus a Killing vector field satisfies

$$\nabla_\mu X_\nu + \nabla_\nu X_\mu = 0 \qquad\text{(Killing equation)}.$$

Let $X$ and $Y$ be two Killing vector fields. We can easily verify that

a) all linear combinations $aX + bY$ with $a, b\in\mathbb{R}$ are also Killing vector fields

b) the Lie bracket $[X,Y]$ is a Killing vector field

It then follows that the Killing vector fields form an algebra, the Lie algebra of the isometry group. (The isometry group is usually a Lie group.)

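Now let $x^\mu(t)$ be a geodesic with tangent vector $U^\mu = \dfrac{dx^\mu}{dt}$, and let $V^\mu$ be a Killing vector. Then,

$$(U^\nu\nabla_\nu)(U^\mu V_\mu) = \underbrace{U^\mu U^\nu\nabla_\nu V_\mu}_{=\frac12 U^\mu U^\nu(\nabla_\mu V_\nu + \nabla_\nu V_\mu)} + V_\mu\underbrace{U^\nu\nabla_\nu U^\mu}_{=0\ \text{(geodesic)}} = 0.$$

Thus $U^\mu V_\mu = U\cdot V$ is a constant on a geodesic.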

An $m$-dimensional manifold $M$ can have at most $\frac12 m(m+1)$ linearly independent Killing vector fields. Manifolds with the maximum number of Killing vector fields are called maximally symmetric. E.g. $\mathbb{R}^m$ is maximally symmetric ($g_{\mu\nu} = \delta_{\mu\nu}\Rightarrow\Gamma = 0$). The Killing equation $\partial_\mu V_\nu + \partial_\nu V_\mu = 0$ has solutions:

$$V^\mu_{(i)} = \delta^\mu_i \quad (m\text{ of these}),$$

$$V_\mu = a_{\mu\nu}x^\nu \quad\text{with}\quad \underbrace{a_{\mu\nu} = -a_{\nu\mu}}_{\frac12 m(m-1)\ \text{components}} = \text{constant}\neq 0. \tag{23}$$

Thus in total we have $m + \frac12 m(m-1) = \frac12 m(m+1)$ linearly independent Killing vector fields.
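
The conservation law $U\cdot V = \text{constant}$ can be checked directly in the simplest case. The sketch below (an illustration under our own assumptions, not part of the notes) uses the flat plane $\mathbb{R}^2$, where geodesics are straight lines, and the rotational Killing vector $V = (-y, x)$, which is of the form (23). The conserved quantity is then the familiar angular momentum about the origin. The function names are hypothetical.

```python
def rotation_killing_vector(point):
    """Rotational Killing vector of the flat plane, V = (-y, x), at the given point."""
    x, y = point
    return (-y, x)

def conserved_charge(start, velocity, t):
    """U . V at parameter t along the straight-line geodesic x(t) = start + velocity * t."""
    point = (start[0] + velocity[0] * t, start[1] + velocity[1] * t)
    v = rotation_killing_vector(point)
    return velocity[0] * v[0] + velocity[1] * v[1]

def max_killing_vectors(m):
    """Maximum number m(m+1)/2 of independent Killing vector fields in m dimensions."""
    return m * (m + 1) // 2

start, velocity = (1.0, 2.0), (3.0, -1.0)
for t in (0.0, 0.5, 2.0):
    # The same value is printed for every t: U . V does not change along the line.
    print(t, conserved_charge(start, velocity, t))

print(max_killing_vectors(2))  # 2 translations + 1 rotation in the plane
```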

You will learn more Riemannian geometry in a course on General Relativity. We will next move to

6 Semisimple Lie algebras and their unitary representations

(Following Howard Georgi: Lie Algebras in Particle Physics)

We have already introduced Lie groups, and briefly discussed various examples of them. In what follows, we will focus on compact Lie groups, whose parameters are coordinates of a compact differentiable manifold.

One aspect which makes compact Lie groups simpler to discuss is that all representations of compact Lie groups are equivalent to unitary representations. Consider an $m$-dimensional representation of a Lie group $G$ whose dimension is $N$, so that its elements are parameterized by $N$ real parameters $\alpha_a$, $a = 1,\ldots,N$. So the group elements correspond to $m\times m$ matrices $D(\alpha)$, $\alpha = (\alpha_a)$. We assume that the origin of the parameters has been chosen so that the unit element $e$ corresponds to $D(\alpha)|_{\alpha_a = 0} = 1_m$. Consider then a curve through the unit element, with parameters $\alpha(t)$ such that $\alpha(t=0) = 0$ there. Then, for small $t$, the curve is $\alpha(t) = 0 + \dot\alpha(0)\,t + \ldots$. We can Taylor expand as we did on p. 63:

$$D(\alpha(t)) = 1 + itA + \cdots = 1 + it\,\dot\alpha_a(0)X_a + \cdots,$$

where $A$ represents the tangent vector of the curve at the origin, and $X_a$ are a set of $N$ linearly independent vectors (a basis) so that $A$ can be expanded as above. The $X_a$, $a = 1,\ldots,N$, are called the generators of the group. Because we can assume that the representation is unitary, as a (physics) convention we have added an $i$ in the above so that the $X_a$ are linearly independent and Hermitian ($X_a^\dagger = X_a$). In general, any group element which can be obtained from the identity by continuous changes in the parameters can be written as

$$D(\beta) = e^{i\beta_a X_a} \tag{24}$$

where $\beta_a$ ($a = 1,\ldots,N = \dim G$) are real parameters.

Along a curve through the unit element, infinitesimally close to it, we can multiply simply by the rule

$$e^{it_1\dot\alpha_a(0)X_a}\,e^{it_2\dot\alpha_a(0)X_a} = e^{i(t_1+t_2)\dot\alpha_a(0)X_a}, \tag{25}$$

so that the multiplication is Abelian. Of course, in general Lie groups are non-Abelian, so that for group elements close to unity

$$e^{i\beta_a X_a}e^{i\gamma_a X_a} \neq e^{i\gamma_a X_a}e^{i\beta_a X_a}. \tag{26}$$

You may already know the Baker-Campbell-Hausdorff lemma, which gives a rule for products of exponentials:

$$e^Ae^B = \exp\left\{A + B + \frac12[A,B] + \frac1{12}\left([A,[A,B]] + [B,[B,A]]\right) + \cdots\right\}. \tag{27}$$

Thus

$$e^{i\beta_a X_a}e^{i\gamma_a X_a} = e^{i\beta_a X_a + i\gamma_a X_a - \frac12[\beta_a X_a,\,\gamma_b X_b] + \cdots}. \tag{28}$$

The third term in the exponent can also be expanded in the $X_a$ basis, with some coefficients $\delta_a$,

$$[\beta_a X_a, \gamma_b X_b] = i\delta_c X_c. \tag{29}$$

Since this must be true for all $\beta_a$ and $\gamma_b$, it must be possible to introduce new coefficients $f_{abc}$ so that we can write

$$\delta_c = \beta_a\gamma_b f_{abc}. \tag{30}$$

So we obtain a commutation rule for the generators,

$$[X_a, X_b] = if_{abc}X_c. \tag{31}$$

Now you should compare this with our previous discussion in subsection 4.6.1. The generators $X_a$ form a basis of the Lie algebra $\mathfrak{g}$ of the Lie group $G$. In 4.6.1 we called $if_{abc}\equiv C^c{}_{ab}$ the structure constants. Georgi uses the physics convention, where the $f_{abc}$ are called the structure constants. The actual numerical values of the structure constants naturally depend on the choice of basis of the Lie algebra (any set of $N$ linearly independent combinations of the basis vectors is again a basis).
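
The Baker-Campbell-Hausdorff expansion (27) can be checked exactly in a case where it terminates. The sketch below is our own illustration, not an example from the notes: it uses strictly upper-triangular $3\times3$ matrices, for which every product of three factors vanishes, so that $e^Ae^B = \exp(A + B + \frac12[A,B])$ holds with no further terms. Exact rational arithmetic avoids any rounding issues; the matrices `A` and `B` and the helper names are arbitrary choices.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def linear_combination(*terms):
    """Sum of coefficient * matrix over the given (coefficient, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)] for i in range(n)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    return linear_combination((1, matmul(a, b)), (-1, matmul(b, a)))

def exp_nilpotent(a):
    """Exact exp(a) = 1 + a + a^2/2, valid for strictly upper-triangular 3x3 a (a^3 = 0)."""
    identity = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
    return linear_combination((1, identity), (1, a), (Fraction(1, 2), matmul(a, a)))

A = [[Fraction(v) for v in row] for row in [[0, 1, 2], [0, 0, 3], [0, 0, 0]]]
B = [[Fraction(v) for v in row] for row in [[0, 4, 5], [0, 0, 6], [0, 0, 0]]]

lhs = matmul(exp_nilpotent(A), exp_nilpotent(B))
bch_exponent = linear_combination((1, A), (1, B), (Fraction(1, 2), commutator(A, B)))
rhs = exp_nilpotent(bch_exponent)

print(lhs == rhs)                                                 # True: BCH with the commutator term
print(lhs == exp_nilpotent(linear_combination((1, A), (1, B))))   # False: the naive Abelian rule fails
```

For generic matrices the series does not terminate, and the higher nested commutators in (27) are needed as well.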

To choose a convenient basis, consider the $N\times N$ matrix with elements

$$\mathrm{tr}\,(X_aX_b).$$

The $N\times N$ matrix is symmetric and real, so we can diagonalize it by choosing appropriate linear combinations of the $X_a$'s with real coefficients. Then,

$$\mathrm{tr}\,(X_aX_b) = k_a\delta_{ab}$$

(with no sum over $a$). We could rescale the generators to set $|k_a| = 1\ \forall\ a = 1,\ldots,N$. But we cannot change the sign of the $k_a$'s. We will focus on (compact semisimple) Lie groups for which all $k_a$'s will be positive, and we can set

$$\mathrm{tr}\,(X_aX_b) = \lambda\delta_{ab}, \quad \lambda > 0.$$

In this choice of basis, the structure constants $f_{abc}$ are completely antisymmetric. This can be seen explicitly by writing

$$f_{abc} = -\frac{i}{\lambda}\,\mathrm{tr}\,([X_a,X_b]X_c).$$
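
The trace formula for $f_{abc}$ can be tried out on the simplest case. The sketch below is an illustration under our own assumptions: it takes $X_a = \sigma_a/2$ (the Pauli matrices), for which $\mathrm{tr}(X_aX_b) = \frac12\delta_{ab}$, i.e. $\lambda = \frac12$, and evaluates $f_{abc} = -\frac{i}{\lambda}\mathrm{tr}([X_a,X_b]X_c)$. Indices run over 0, 1, 2 rather than 1, 2, 3, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
X = [[[entry / 2 for entry in row] for row in sigma] for sigma in PAULI]
LAMBDA = 0.5  # tr(X_a X_b) = LAMBDA * delta_ab for X_a = sigma_a / 2

def structure_constant(a, b, c):
    """f_abc = (-i / lambda) tr([X_a, X_b] X_c); the result is real for Hermitian X_a."""
    value = -1j / LAMBDA * trace(matmul(commutator(X[a], X[b]), X[c]))
    return value.real

print(structure_constant(0, 1, 2))  # cyclic order of the three indices
print(structure_constant(1, 0, 2))  # first two indices exchanged
print(structure_constant(0, 0, 2))  # repeated index
```

The three printed values illustrate the complete antisymmetry: exchanging two indices flips the sign, and a repeated index gives zero.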

Definitions.

I) An invariant subalgebra is generated by a subset of generators of the algebra $\mathfrak{g}$ which goes into itself or zero under commutation with any element of the algebra: if $X\in$ subalgebra, then $[X,Y]\in$ subalgebra $\forall\ Y\in\mathfrak{g}$.

II) If an algebra has no nontrivial invariant subalgebras ($\mathfrak{g}$ itself is a trivial subalgebra), it is called simple. A simple algebra generates a simple group.

III) If an algebra has no Abelian invariant subalgebras, it is called semisimple. Semisimple algebras are products of simple algebras, and generate semisimple groups.

Note that generators of Abelian invariant subalgebras contribute vanishing structure constants: if $X_a$ is a generator of an Abelian invariant subalgebra, then

$$f_{abc} = 0\quad\forall\ b, c.$$

Using the structure constants $f_{abc}$ we can define a set of matrices $T_a$ with components

$$(T_a)_{bc}\equiv -if_{abc}.$$

They satisfy $[T_a,T_b] = if_{abc}T_c$, so the matrices $T_a$ give an $N\times N$ matrix representation of the algebra, i.e. an $N$-dimensional representation. The dimension of the representation is the same as the dimension of the algebra: this is the adjoint representation which we described in an abstract way earlier! Note that in fact Georgi defines the number $\lambda$ in

$$\mathrm{tr}\,(X_aX_b) = \lambda\delta_{ab}$$

using the adjoint representation, so $X_a = T_a$ above. Note that we said we use a basis $X_a$ where the $f_{abc}$ are completely antisymmetric. Then the matrices $T_a$ in the adjoint representation are Hermitian:

$$[(T_a)^\dagger]_{bc} = [(T_a)_{cb}]^* = if_{acb} = -if_{abc} = (T_a)_{bc}.$$

Further, a generator of an Abelian invariant subalgebra is represented in the adjoint representation by $T_a = 0_{N\times N}$, since $f_{abc} = 0\ \forall\ b, c$.
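
As a concrete check of the adjoint construction, the sketch below builds $(T_a)_{bc} = -if_{abc}$ for the assumed case $f_{abc} = \varepsilon_{abc}$ (the SU(2) structure constants) and verifies both $[T_a,T_b] = if_{abc}T_c$ and Hermiticity. This is an illustration only; indices run over 0, 1, 2 and the helper names are hypothetical.

```python
def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

def adjoint_generators():
    """Matrices (T_a)_{bc} = -i f_abc with f_abc = epsilon_abc."""
    return [[[-1j * levi_civita(a, b, c) for c in range(3)] for b in range(3)]
            for a in range(3)]

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def satisfies_algebra(T, tol=1e-12):
    """Check [T_a, T_b] = i epsilon_abc T_c entry by entry."""
    for a in range(3):
        for b in range(3):
            ab, ba = matmul(T[a], T[b]), matmul(T[b], T[a])
            for i in range(3):
                for j in range(3):
                    lhs = ab[i][j] - ba[i][j]
                    rhs = sum(1j * levi_civita(a, b, c) * T[c][i][j] for c in range(3))
                    if abs(lhs - rhs) > tol:
                        return False
    return True

def is_hermitian(m, tol=1e-12):
    """Check m[i][j] == conj(m[j][i]) for all entries."""
    n = len(m)
    return all(abs(m[i][j] - m[j][i].conjugate()) < tol for i in range(n) for j in range(n))

T = adjoint_generators()
print(satisfies_algebra(T))
print(all(is_hermitian(t) for t in T))
```

Since the algebra has dimension 3, the adjoint representation consists of $3\times3$ matrices, which is the spin-1 representation of SU(2) in a Cartesian basis.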

6.1 SU(2)

The simplest non-Abelian Lie algebra is that of SU(2), where the generators $J_a$, $a = 1, 2, 3$, satisfy

$$[J_a,J_b] = i\varepsilon_{abc}J_c \quad (\text{i.e. } f_{abc} = \varepsilon_{abc})
\quad\Leftrightarrow\quad
\begin{cases}
[J_1,J_2] = iJ_3 \\
[J_2,J_3] = iJ_1 \\
[J_3,J_1] = iJ_2
\end{cases}$$

Let us find all its finite-dimensional irreducible representations. Suppose $J_3$ is represented by an $N\times N$ Hermitian matrix. We can then diagonalize $J_3$. Suppose that $\vec v_j$ is an eigenvector of $J_3$ with eigenvalue $j$:

$$J_3\vec v_j = j\cdot\vec v_j.$$

There could be more than one such eigenvector, so let us label them by $\alpha$:

$$J_3\vec v_{j,\alpha} = j\cdot\vec v_{j,\alpha}, \quad \alpha = 1, 2, \ldots$$

Georgi uses the notation $|j,\alpha\rangle$ instead of $\vec v_{j,\alpha}$. The vectors are also complex vectors: all components of $\vec v_{j,\alpha}$ are complex numbers, $\vec v_{j,\alpha}\in\mathbb{C}^N$.

Define a scalar product:

$$\langle\vec v|\vec w\rangle = \sum_{i=1}^N v_i^*w_i,$$

where $v_i, w_i$ are the components of $\vec v, \vec w$ respectively. We can take the eigenvectors $\vec v_{j,\alpha}$ to be orthonormal:

$$\langle j,\alpha|j,\beta\rangle\equiv\langle\vec v_{j,\alpha}|\vec v_{j,\beta}\rangle = \delta_{\alpha\beta}.$$

Now define linear combinations of $J_1, J_2$:

$$J^+ = \frac1{\sqrt2}(J_1 + iJ_2);\qquad J^- = \frac1{\sqrt2}(J_1 - iJ_2).$$

Note: $(J^\pm)^\dagger = J^\mp$.

They satisfy the commutation relations

$$[J_3,J^+] = J^+,\qquad [J_3,J^-] = -J^-,\qquad [J^+,J^-] = J_3.$$

Then $J^+$ ($J^-$) maps an eigenvector of $J_3$ with eigenvalue $j$ to an eigenvector with eigenvalue $j+1$ ($j-1$):

$$J_3J^\pm|j,\alpha\rangle = ([J_3,J^\pm] + J^\pm J_3)|j,\alpha\rangle = \pm J^\pm|j,\alpha\rangle + J^\pm J_3|j,\alpha\rangle = (j\pm1)J^\pm|j,\alpha\rangle.$$

This is why we will call $J^+$ ($J^-$) the raising (lowering) operator.

In a finite dimensional representation, there will be a vector with highest $J_3$ eigenvalue; let us call that eigenvalue $j$. There might be more than one such vector (meaning that the eigenspace is degenerate), so we label them by an index $\alpha$:

$$J_3|j,\alpha\rangle = j|j,\alpha\rangle.$$

We say that these vectors span a degenerate eigenspace. We can take suitable linear combinations so that the $|j,\alpha\rangle$ form an orthonormal basis:

$$\langle j,\alpha|j,\beta\rangle = \delta_{\alpha,\beta}.$$

Since $j$ is the highest $J_3$ eigenvalue, we must have

$$J^+|j,\alpha\rangle = 0.$$

The states $J^-|j,\alpha\rangle$ have $J_3 = j-1$. Define

$$J^-|j,\alpha\rangle\equiv N_{j,\alpha}|j-1,\alpha\rangle.$$

States with different $\alpha$ are again orthogonal:

$$N^*_{j,\beta}N_{j,\alpha}\langle j-1,\beta|j-1,\alpha\rangle = \langle j,\beta|J^+J^-|j,\alpha\rangle = \langle j,\beta|[J^+,J^-]|j,\alpha\rangle = \langle j,\beta|J_3|j,\alpha\rangle = j\,\delta_{\alpha,\beta}.$$

So we can choose the $|j-1,\alpha\rangle$ orthonormal, normalizing

$$N_{j,\alpha} = \sqrt{j}\equiv N_j.$$

Then

$$J^+|j-1,\alpha\rangle = \frac1{N_j}J^+J^-|j,\alpha\rangle = \frac1{N_j}[J^+,J^-]|j,\alpha\rangle = N_j|j,\alpha\rangle.$$

Note: $J^\pm$ change $j$ without changing $\alpha$.

By an analogous argument, we find states $|j-2,\alpha\rangle$:

$$J^-|j-1,\alpha\rangle = N_{j-1}|j-2,\alpha\rangle,\qquad J^+|j-2,\alpha\rangle = N_{j-1}|j-1,\alpha\rangle,$$

etc., up to

$$J^-|j-k,\alpha\rangle = N_{j-k}|j-k-1,\alpha\rangle,\qquad J^+|j-k-1,\alpha\rangle = N_{j-k}|j-k,\alpha\rangle.$$

The $N$'s (which can be chosen to be real) satisfy

$$\begin{aligned}
N^2_{j-k} &= \langle j-k,\alpha|J^+J^-|j-k,\alpha\rangle \\
&= \langle j-k,\alpha|[J^+,J^-]|j-k,\alpha\rangle + \langle j-k,\alpha|J^-J^+|j-k,\alpha\rangle \\
&= j - k + N^2_{j-k+1} \quad\leftarrow\text{recursion relation for the } N\text{'s}.
\end{aligned}$$

Solve the recursion relation by addition:

$$\begin{aligned}
N^2_j &= j \\
N^2_{j-1} - N^2_j &= j - 1 \\
&\ \ \vdots \\
N^2_{j-k} - N^2_{j-k+1} &= j - k \\
\hline
N^2_{j-k} &= \tfrac12(k+1)(2j-k)
\end{aligned}$$

Or, with $j - k\equiv m$,

$$N_m = \frac1{\sqrt2}\sqrt{(j+m)(j-m+1)}.$$

Eventually we must come to a $j - l$ so that $J^-|j-l,\alpha\rangle = 0$. But this requires $N_{j-l} = 0$, or $l = 2j$. Thus

$$j = \frac{l}{2}\quad\text{for some integer } l.$$

So the representation breaks up into subspaces labeled by $\alpha$, each of dimension $2j+1$ ($J_3 = j, j-1, \ldots, -j$): Rep $= \oplus_\alpha V^\alpha$, $\dim V^\alpha = 2j+1$. This is reducible. But if we are interested in irreducible representations, there can be only one $\alpha$. So we conclude:

Unitary irreducible representations of SU(2) are $(2j+1)$-dimensional, characterized by their highest $J_3$ eigenvalue $j$, with $j$ being integer or half-integer: $j = l/2$, $l = 0, 1, 2, \ldots$
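
The construction above is explicit enough to build the matrices directly. The following sketch (an illustration under our own conventions, not code from Georgi or the notes) constructs $J_3$ and $J^\pm$ for spin $j$ in the basis $m = j, j-1, \ldots, -j$ using $N_m = \sqrt{(j+m)(j-m+1)/2}$, and checks the commutation relations together with $J^+J^- + J^-J^+ + J_3^2 = j(j+1)$, which equals $J_1^2 + J_2^2 + J_3^2$ in units with $\hbar = 1$. The function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def combine(a, b, sign):
    """Entrywise a + sign * b."""
    n = len(a)
    return [[a[i][j] + sign * b[i][j] for j in range(n)] for i in range(n)]

def is_close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices."""
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

def spin_matrices(two_j):
    """Return (J3, Jplus, Jminus) for spin j = two_j / 2 in the basis m = j, j-1, ..., -j."""
    j = two_j / 2
    dim = two_j + 1
    J3 = [[(j - k) if k == l else 0.0 for l in range(dim)] for k in range(dim)]
    Jminus = [[0.0] * dim for _ in range(dim)]
    for k in range(dim - 1):
        m = j - k
        # J^- |m> = N_m |m-1>, with N_m = sqrt((j + m)(j - m + 1) / 2)
        Jminus[k + 1][k] = math.sqrt((j + m) * (j - m + 1) / 2)
    Jplus = [list(row) for row in zip(*Jminus)]  # real matrix, so dagger = transpose
    return J3, Jplus, Jminus

def check_spin(two_j):
    """Verify [J3, J+] = J+, [J+, J-] = J3 and the Casimir value j(j+1)."""
    J3, Jp, Jm = spin_matrices(two_j)
    j, dim = two_j / 2, two_j + 1
    comm_3p = combine(matmul(J3, Jp), matmul(Jp, J3), -1)
    comm_pm = combine(matmul(Jp, Jm), matmul(Jm, Jp), -1)
    casimir = combine(combine(matmul(Jp, Jm), matmul(Jm, Jp), +1), matmul(J3, J3), +1)
    expected = [[j * (j + 1) if k == l else 0.0 for l in range(dim)] for k in range(dim)]
    return is_close(comm_3p, Jp) and is_close(comm_pm, J3) and is_close(casimir, expected)

for two_j in range(0, 6):
    print(two_j / 2, two_j + 1, check_spin(two_j))  # spin, dimension, checks passed?
```

For `two_j = 1` the matrices reduce to $J_3 = \mathrm{diag}(\frac12, -\frac12)$ and $J^+$ with the single entry $1/\sqrt2$, which is the spin-1/2 representation.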

The eigenvalues of $J_3$ are called weights, and $j$ is the highest weight (of the irreducible representation).

In quantum mechanics, representations with highest weight $j$ are called spin $j$ representations because $j$ is associated with the spin angular momentum of a particle at rest:

$$J^2 = \hbar^2j(j+1)\qquad(J^2 = J_1^2 + J_2^2 + J_3^2).$$

All reducible representations reduce to a direct sum of spin $j$ representations. The simplest nontrivial representation is spin 1/2, generated by

$$J_a = \frac12\sigma_a,$$

where $\sigma_a$ are the Hermitian, traceless Pauli matrices

$$\sigma_1 = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix},\qquad \sigma_2 = \begin{pmatrix}0 & -i\\ i & 0\end{pmatrix},\qquad \sigma_3 = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}.$$

6.2 Roots and weights

Now we will try to generalize our way of classifying the unitary irreducible representations of SU(2) to arbitrary simple Lie algebras. We divide the generators into two sets: one set analogous to $J_3$ of SU(2) (diagonal generators), the other set analogous to $J^\pm$ (raising and lowering operators).

6.2.1 Weights

Let $X_a$, $a = 1, 2, \ldots, \dim\mathfrak{g}$, be the generators of a simple Lie algebra $\mathfrak{g}$. Suppose they are represented in an irreducible representation $D$. We try to diagonalize as many $X_a$'s as possible. You may know that this requires that these $X_a$ commute with each other. To maximize the number of mutually commuting generators, consider linear combinations $H_i = C_{ia}X_a$, $i = 1, 2, \ldots, m\leq\dim\mathfrak{g}$. They must satisfy the following properties:

I) $H_i$ are Hermitian $\Rightarrow C_{ia}\in\mathbb{R}\ \forall\ i, a$

II) $[H_i,H_j] = 0\ \forall\ i, j = 1, 2, \ldots, m$

III) $\mathrm{tr}\,(H_iH_j) = k_D\delta_{ij}$

IV) $m$ is as large as possible

The $H_i$'s are the analogues of $J_3$; they form a maximal set of commuting Hermitian operators within $\mathfrak{g}$, called the Cartan subalgebra. The integer $m$ is called the rank of the Lie algebra $\mathfrak{g}$ (and of the corresponding Lie group). We then diagonalize the $H_i$'s in the representation $D$ (we can diagonalize them all since $[H_i,H_j] = 0$) and denote their eigenvectors with eigenvalues $\mu_i$ by $|\vec\mu,D\rangle$:

$$H_i|\vec\mu,D\rangle = \mu_i|\vec\mu,D\rangle$$

(the label $\vec\mu$ is an $m$-component vector $\vec\mu = (\mu_1,\ldots,\mu_m)$ where all $\mu_i$'s are real). The eigenvalues $\mu_i$ are called weights and the vector $\vec\mu$ is a weight vector.

6.2.2 Roots

The weights $\mu_i$ can be defined for any irreducible representation $D$. Now suppose that $D$ is the adjoint representation. Then $\dim D = \dim\mathfrak{g}$. As a vector space, the Lie algebra is isomorphic with the vector space of the adjoint representation. So we can choose a basis where we identify the basis vectors of the adjoint representation with the generators, and we can label the basis vectors in the same notation as the generators: $X_a\to|X_a\rangle$. In particular, in the representation space we must have picked a basis where the basis vectors transform under the $X_a$'s as follows:

$$X_a|X_b\rangle = +if_{abc}|X_c\rangle,$$

where $f_{abc}$ are the structure constants. We can rewrite this as

$$X_a|X_b\rangle = |[X_a,X_b]\rangle.$$

We are also assuming that the basis vectors are normalized with the scalar product

$$\langle X_a|X_b\rangle = \frac1\lambda\mathrm{tr}\,(X_a^\dagger X_b) = \frac1\lambda\mathrm{tr}\,(X_aX_b) = \delta_{ab}\qquad(\text{recall } X_a \text{ Hermitian}).$$

Recall that in the adjoint representation we used a basis with normalization

$$\mathrm{tr}\,(X_aX_b) = \lambda\delta_{ab}.$$

Now suppose that the first $m$ of the generators $X_a$ are the Cartan generators $H_i$. Then the first $m$ states $|X_a\rangle$ in the basis are eigenvectors of the $H_i$'s with eigenvalue zero:

$$H_i|H_j\rangle = 0\quad\forall\ i, j = 1, \ldots, m.$$

We then proceed to find the rest of the $\dim\mathfrak{g}$ eigenvectors of the $H_i$'s and call them $|E_{\vec\alpha}\rangle$:

$$H_i|E_{\vec\alpha}\rangle = \alpha_i|E_{\vec\alpha}\rangle \tag{32}$$

where $\vec\alpha = (\alpha_i) = (\alpha_1,\alpha_2,\ldots,\alpha_m)$. There are $\dim\mathfrak{g} - m$ such vectors $|E_{\vec\alpha}\rangle$, so that the label $E_{\vec\alpha}$ has $\dim\mathfrak{g} - m$ values. We can rewrite (32) as

$$H_i|E_{\vec\alpha}\rangle = |[H_i,E_{\vec\alpha}]\rangle = \alpha_i|E_{\vec\alpha}\rangle = |\alpha_iE_{\vec\alpha}\rangle.$$

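So the states $|E_{\vec\alpha}\rangle$ correspond to generators

$$E_{\vec\alpha} = C_{\vec\alpha a}X_a,\qquad [H_i,E_{\vec\alpha}] = \alpha_iE_{\vec\alpha}.$$

Since the $H_i$ are Hermitian, the eigenvalues $\alpha_i$ are real. Note that the commutation relation

$$[H_i,E_{\vec\alpha}] = \alpha_iE_{\vec\alpha}$$

is true regardless of what representation $D$ is used to represent the generators. Note also that although the $H_i$ are Hermitian, the $E_{\vec\alpha}$ are not necessarily Hermitian:

$$[H_i,E_{\vec\alpha}]^\dagger = [E_{\vec\alpha}^\dagger,H_i^\dagger] = -[H_i,E_{\vec\alpha}^\dagger] = \alpha_i^*E_{\vec\alpha}^\dagger = \alpha_iE_{\vec\alpha}^\dagger\qquad(\alpha_i\text{ real})$$

$$\Rightarrow\ [H_i,E_{\vec\alpha}^\dagger] = -\alpha_iE_{\vec\alpha}^\dagger\ \Rightarrow\ E_{\vec\alpha}^\dagger = E_{-\vec\alpha}.$$

So the $E_{\vec\alpha}$ come in pairs $E_{\pm\vec\alpha}$.

The vectors $|H_i\rangle$, $|E_{\vec\alpha}\rangle$ form a basis of the adjoint representation. We normalize them as

$$\langle E_{\vec\alpha}|E_{\vec\beta}\rangle = \frac1\lambda\mathrm{tr}\,(E_{\vec\alpha}^\dagger E_{\vec\beta}) = \delta_{\vec\alpha,\vec\beta},\qquad \langle H_i|H_j\rangle = \frac1\lambda\mathrm{tr}\,(H_iH_j) = \delta_{i,j}.$$

Note that the $\alpha_i$ are the weights of the adjoint representation (along with the zero weights). However, they are given a special name, and are called roots. The vector $\vec\alpha$ is a root vector.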

6.2.3 Raising and lowering operators

Whereas the $H_i$ were analogous to $J_3$ of SU(2), the $E_{\vec\alpha}$ are analogous to the raising and lowering operators $J^\pm$. (From now on, we shall simplify notation and denote $\vec\alpha$ by $\alpha$, $E_{\vec\alpha}$ by $E_\alpha$.) The $E_\alpha$ are raising and lowering operators for the weights $\vec\mu$ (we shall use both the $\vec\mu$ and $\mu$ notations):

$$H_iE_\alpha|\mu,D\rangle = [H_i,E_\alpha]|\mu,D\rangle + E_\alpha H_i|\mu,D\rangle = \alpha_iE_\alpha|\mu,D\rangle + \mu_iE_\alpha|\mu,D\rangle = (\mu+\alpha)_iE_\alpha|\mu,D\rangle.$$

This is true in any representation $D$.

Thus, starting with any $|\mu,D\rangle$, we can derive properties of the roots and weights which are analogous to the statement that the eigenvalues of $J_3$ are integers or half-integers. If there is more than one vector with weight $\mu$, we will choose $|\mu,D\rangle$ to be an eigenvector of $E_{-\alpha}E_\alpha$, with fixed $\alpha$, and define

$$E_{\pm\alpha}|\mu,D\rangle = N_{\pm\alpha,\mu}|\mu\pm\alpha,D\rangle.$$

We will find a recursion relation for the $N_{\pm\alpha,\mu}$'s as we did for the $N_m$'s for SU(2).

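In the adjoint representation, the vector $E_\alpha|E_{-\alpha}\rangle = |[E_\alpha,E_{-\alpha}]\rangle$ has weight zero:

$$H_iE_\alpha|E_{-\alpha}\rangle = ([H_i,E_\alpha] + E_\alpha H_i)|E_{-\alpha}\rangle = (\alpha_i - \alpha_i)E_\alpha|E_{-\alpha}\rangle = 0.$$

Thus, $E_\alpha|E_{-\alpha}\rangle$ can be expanded in $|H_i\rangle$:

$$E_\alpha|E_{-\alpha}\rangle = \beta_i|H_i\rangle,$$

where

$$\beta_i = \langle H_i|E_\alpha|E_{-\alpha}\rangle = \frac1\lambda\mathrm{tr}\,(H_i[E_\alpha,E_{-\alpha}]) = \frac1\lambda\mathrm{tr}\,(E_{-\alpha}[H_i,E_\alpha]) = \alpha_i,$$

so

$$[E_\alpha,E_{-\alpha}] = \alpha_iH_i.$$

Now let us consider a generic (irreducible) representation $D$. Consider

$$\begin{aligned}
\langle\mu,D|[E_\alpha,E_{-\alpha}]|\mu,D\rangle &= \alpha_i\langle\mu,D|H_i|\mu,D\rangle = \alpha\cdot\mu \\
&= \langle\mu,D|E_\alpha E_{-\alpha}|\mu,D\rangle - \langle\mu,D|E_{-\alpha}E_\alpha|\mu,D\rangle \\
&= |N_{-\alpha,\mu}|^2 - |N_{\alpha,\mu}|^2.
\end{aligned}$$

But

$$N_{-\alpha,\mu} = \langle\mu-\alpha,D|E_{-\alpha}|\mu,D\rangle = \langle\mu-\alpha,D|E_\alpha^\dagger|\mu,D\rangle = \langle\mu,D|E_\alpha|\mu-\alpha,D\rangle^* = N^*_{\alpha,\mu-\alpha}.$$

Thus:

$$|N_{\alpha,\mu-\alpha}|^2 - |N_{\alpha,\mu}|^2 = \alpha\cdot\mu.$$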

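Assuming that $D$ is a finite dimensional representation, if we apply $E_\alpha$ (or $E_{-\alpha}$) repeatedly to $|\mu,D\rangle$, we must eventually get zero (because $\mu\to\mu+\alpha\to\mu+2\alpha\to\cdots$). Suppose

$$E_\alpha|\mu+p\alpha,D\rangle = 0;\qquad E_{-\alpha}|\mu-q\alpha,D\rangle = 0$$

for non-negative integers $p, q$. Then,

$$\begin{aligned}
|N_{\alpha,\mu+(p-1)\alpha}|^2 - 0 &= \alpha\cdot(\mu+p\alpha) \\
|N_{\alpha,\mu+(p-2)\alpha}|^2 - |N_{\alpha,\mu+(p-1)\alpha}|^2 &= \alpha\cdot(\mu+(p-1)\alpha) \\
&\ \ \vdots \\
|N_{\alpha,\mu-\alpha}|^2 - |N_{\alpha,\mu}|^2 &= \alpha\cdot\mu \\
&\ \ \vdots \\
|N_{\alpha,\mu-q\alpha}|^2 - |N_{\alpha,\mu-(q-1)\alpha}|^2 &= \alpha\cdot(\mu-(q-1)\alpha) \\
0 - |N_{\alpha,\mu-q\alpha}|^2 &= \alpha\cdot(\mu-q\alpha)
\end{aligned}$$

Adding all these equations gives

$$\begin{aligned}
0 &= (p+q+1)(\alpha\cdot\mu) + \alpha^2\left[\frac{p(p+1)}2 - \frac{q(q+1)}2\right] \\
&= (p+q+1)\left[(\alpha\cdot\mu) + \frac{\alpha^2}2(p-q)\right]
\end{aligned}$$

$$\Rightarrow\ (\alpha\cdot\mu) + \frac{\alpha^2}2(p-q) = 0\ \Rightarrow\ \frac{\alpha\cdot\mu}{\alpha^2} = -\frac12(p-q),$$

i.e. the combination $\dfrac{\alpha\cdot\mu}{\alpha^2}$ is an integer or a half-integer.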

Using the recursion relation, we could work "backwards" and determine all the $|N|^2$ in terms of $\mu, \alpha, p, q$. The relation $\frac{\alpha\cdot\mu}{\alpha^2} = -\frac12(p-q)$ has particularly simple and strong consequences if we take $D$ to be the adjoint representation. There the nonzero weights $\mu$ are roots; let us denote them by $\beta$. Then,

$$\frac{\alpha\cdot\beta}{\alpha^2} = -\frac12(p-q)\equiv\frac{m}2,\quad m\in\mathbb{Z}.$$

But equally well, we could switch the roles of $\alpha$ and $\beta$ and let $E_{\pm\beta}$ act on $|\alpha,D_{\mathrm{adj}}\rangle\equiv|E_\alpha\rangle$. Then

$$\frac{\beta\cdot\alpha}{\beta^2} = -\frac12(p'-q')\equiv\frac{m'}2.$$

Multiplying the two relations, we obtain

$$\frac{mm'}4 = \frac{(\alpha\cdot\beta)^2}{\alpha^2\beta^2} = \cos^2\theta,$$

where $\theta$ is the angle between the root vectors $\alpha, \beta$. The full list of possible values is:

mm'    θ
0      90°
1      60°, 120°
2      45°, 135°
3      30°, 150°
4      0°, 180°
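
The table of allowed angles follows from $\cos^2\theta = mm'/4$ with $mm'$ a non-negative integer not exceeding 4. A minimal sketch that reproduces it (an illustration only; the function name is hypothetical, and the angles are rounded to whole degrees, which is exact for these values):

```python
import math

def allowed_angles():
    """Map each allowed product m*m' in {0,...,4} to the angles (in degrees) with cos^2 = m*m'/4."""
    table = {}
    for product in range(5):  # cos^2(theta) <= 1 restricts m*m' to 0, 1, 2, 3, 4
        cosine = math.sqrt(product / 4)
        # Both signs of the cosine are allowed, giving an angle and its supplement.
        table[product] = sorted({round(math.degrees(math.acos(sign * cosine)))
                                 for sign in (1, -1)})
    return table

for product, angles in allowed_angles().items():
    print(product, angles)
```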

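These are the only angles that are allowed between roots of simple Lie algebras.

We have been assuming that for each nonzero root vector $\alpha$ there is a unique generator $E_\alpha$. Let us prove this. Suppose there are two generators with the same root, $\alpha\to E_\alpha, E'_\alpha$. Choose them to be orthogonal:

$$\langle E_\alpha|E'_\alpha\rangle = \frac1\lambda\mathrm{tr}\,(E_\alpha^\dagger E'_\alpha) = 0.$$

Applying $E_{\pm\alpha}$ to $|E'_\alpha\rangle$, we get

$$\frac{\alpha\cdot\alpha}{\alpha^2} = -\frac12(p-q),$$

but $q = 0$ because $E_{-\alpha}|E'_\alpha\rangle = 0$: write $E_{-\alpha}|E'_\alpha\rangle = \beta_i|H_i\rangle$; then the coefficients vanish,

$$\beta_i = \langle H_i|E_{-\alpha}|E'_\alpha\rangle = \frac1\lambda\mathrm{tr}\,(H_i[E_{-\alpha},E'_\alpha]) = \frac1\lambda\mathrm{tr}\,(E'_\alpha[H_i,E_{-\alpha}]) = -\frac{\alpha_i}\lambda\mathrm{tr}\,(E'_\alpha E_{-\alpha}) = 0.$$

Thus

$$\frac{\alpha\cdot\alpha}{\alpha^2} = 1 = -\frac12p\ \Rightarrow\ \text{contradiction, because } p\geq0.$$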

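So $E'_\alpha = E_\alpha$, quod erat demonstrandum.

To clarify the relation of $H_i, E_{\pm\alpha}$ to the $J_3, J^\pm$ in the SU(2) case, define the rescaled generators

$$E^\pm\equiv\frac1{|\alpha|}E_{\pm\alpha}\quad(\text{where } |\alpha|^2 = \alpha\cdot\alpha),\qquad E_3\equiv|\alpha|^{-2}\alpha_iH_i.$$

It can easily be seen that

$$[E_3,E^\pm] = \pm E^\pm;\qquad [E^+,E^-] = E_3.$$

Thus, each $E_\alpha$ is associated with an SU(2) subalgebra of $\mathfrak{g}$, with $E_\alpha$, $E_{-\alpha}$, $\alpha_iH_i$ (rescaled) as its generators. If $|\mu,D\rangle$ is an eigenvector of $E_{-\alpha}E_\alpha$, then the vectors

$$|\mu\pm k\alpha,D\rangle\propto(E_{\pm\alpha})^k|\mu,D\rangle$$

form an irreducible representation of the SU(2) subalgebra. (The condition $E_{-\alpha}E_\alpha|\mu,D\rangle\propto|\mu,D\rangle$ ensures that $|\mu,D\rangle$ is not a linear combination of vectors from different representations but with the same $\mu$.)

The vectors $|\mu+k\alpha,D\rangle$ satisfy

$$E_3|\mu+k\alpha,D\rangle = \frac1{|\alpha|^2}(\alpha\cdot\mu + k\alpha^2)|\mu+k\alpha,D\rangle.$$

The highest $E_3$ eigenvalue, corresponding to $|\mu+p\alpha,D\rangle$, must be

$$j = \frac{\alpha\cdot\mu}{\alpha^2} + p$$

for the spin $j$ representation. The lowest $E_3$ eigenvalue, corresponding to $|\mu-q\alpha,D\rangle$, is

$$-j = \frac{\alpha\cdot\mu}{\alpha^2} - q.$$

(Then

$$0 = j - j = 2\frac{\alpha\cdot\mu}{\alpha^2} + (p-q)\ \Rightarrow\ \frac{\alpha\cdot\mu}{\alpha^2} = -\frac12(p-q).)$$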

Let us now use this abstract formalism in a specific example of a simple Lie algebra other than SU(2), namely SU(3).

6.3 SU(3)

SU(3) = $\{3\times3$ unitary matrices $W\ |\ \det W = 1\}$

Generators of SU(3) = $3\times3$ Hermitian traceless matrices

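The standard basis consists of the Gell-Mann matrices $\lambda_a$ (the analog of the Pauli matrices $\sigma_a$ of SU(2)):

$$\lambda_1 = \begin{pmatrix}0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 0\end{pmatrix},\quad
\lambda_2 = \begin{pmatrix}0 & -i & 0\\ i & 0 & 0\\ 0 & 0 & 0\end{pmatrix},\quad
\lambda_3 = \begin{pmatrix}1 & 0 & 0\\ 0 & -1 & 0\\ 0 & 0 & 0\end{pmatrix},$$

$$\lambda_4 = \begin{pmatrix}0 & 0 & 1\\ 0 & 0 & 0\\ 1 & 0 & 0\end{pmatrix},\quad
\lambda_5 = \begin{pmatrix}0 & 0 & -i\\ 0 & 0 & 0\\ i & 0 & 0\end{pmatrix},\quad
\lambda_6 = \begin{pmatrix}0 & 0 & 0\\ 0 & 0 & 1\\ 0 & 1 & 0\end{pmatrix},$$

$$\lambda_7 = \begin{pmatrix}0 & 0 & 0\\ 0 & 0 & -i\\ 0 & i & 0\end{pmatrix},\quad
\lambda_8 = \frac1{\sqrt3}\begin{pmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & -2\end{pmatrix}.$$

The generators of SU(3) are

$$T_a = \frac12\lambda_a. \tag{33}$$

Notice that

$$\mathrm{tr}\,(T_aT_b) = \frac12\delta_{ab}.$$

The generators $T_1, T_2, T_3$ generate an SU(2) subgroup of SU(3), called the isospin group in particle physics applications.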
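
The normalization $\mathrm{tr}(T_aT_b) = \frac12\delta_{ab}$ is easy to confirm by brute force. The sketch below (an illustration only; the helper names are hypothetical, and indices run from 0 to 7 instead of 1 to 8) enters the eight Gell-Mann matrices as nested lists and computes all 64 traces.

```python
import math

S3 = 1 / math.sqrt(3)
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
]
# Generators T_a = lambda_a / 2, eq. (33).
T = [[[entry / 2 for entry in row] for row in lam] for lam in GELL_MANN]

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))

def gram_matrix():
    """Matrix of traces tr(T_a T_b) for a, b = 0, ..., 7."""
    return [[trace(matmul(T[a], T[b])) for b in range(8)] for a in range(8)]

gram = gram_matrix()
is_normalized = all(abs(gram[a][b] - (0.5 if a == b else 0.0)) < 1e-12
                    for a in range(8) for b in range(8))
print(is_normalized)
```

The diagonal generators `T[2]` and `T[7]` (that is, $T_3$ and $T_8$) commute with each other, which is what makes them suitable as Cartan generators.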

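Choose the Cartan subalgebra generators to be $H_1 = T_3$, $H_2 = T_8$ (the rank of SU(3) is 2 $\Rightarrow$ no others can exist). This choice is convenient since $T_3, T_8$ are diagonal. The eigenvectors of $T_3, T_8$ are

$$\begin{pmatrix}1\\0\\0\end{pmatrix}:\quad T_3\text{ eigenvalue} = \frac12,\quad T_8\text{ eigenvalue} = \frac1{2\sqrt3}.$$

So this has the weight vector $\mu = \left(\frac12, \frac1{2\sqrt3}\right)$ and we denote

$$\begin{pmatrix}1\\0\\0\end{pmatrix}\equiv\left|\tfrac12, \tfrac1{2\sqrt3}\right\rangle,\quad\text{similarly}\quad
\begin{pmatrix}0\\1\\0\end{pmatrix}\equiv\left|-\tfrac12, \tfrac1{2\sqrt3}\right\rangle,\quad
\begin{pmatrix}0\\0\\1\end{pmatrix}\equiv\left|0, -\tfrac1{\sqrt3}\right\rangle.$$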

Let us plot the weight vectors:

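[Figure: weight diagram in the $(H_1, H_2)$ plane, showing the three weights $\left(\frac12, \frac1{2\sqrt3}\right)$, $\left(-\frac12, \frac1{2\sqrt3}\right)$ and $\left(0, -\frac1{\sqrt3}\right)$.]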

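They form an equilateral triangle.

There are six root vectors. They correspond to generators $E_{\pm\alpha}$ that map from one weight to another, so the $\pm\alpha$ are found from differences of weights:

$$\pm\alpha = \pm(1, 0);\quad \pm\left(\frac12, \frac{\sqrt3}2\right);\quad \pm\left(-\frac12, \frac{\sqrt3}2\right).$$

The corresponding generators are matrices that map the eigenvectors to one another:

$$\begin{pmatrix}1\\0\\0\end{pmatrix}\to\begin{pmatrix}0\\1\\0\end{pmatrix}\ \text{or}\ \begin{pmatrix}0\\0\\1\end{pmatrix}\ \text{etc.}$$

Such matrices will have only a single off-diagonal element. Thus,

$$\frac1{\sqrt2}(T_1\pm iT_2) = E_{(\pm1,\,0)},\qquad
\frac1{\sqrt2}(T_4\pm iT_5) = E_{\left(\pm\frac12,\,\pm\frac{\sqrt3}2\right)},\qquad
\frac1{\sqrt2}(T_6\pm iT_7) = E_{\left(\mp\frac12,\,\pm\frac{\sqrt3}2\right)}.$$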

The roots form a regular hexagon:

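[Figure: root diagram in the $(H_1, H_2)$ plane, showing the six roots $(\pm1, 0)$, $\left(\pm\frac12, \frac{\sqrt3}2\right)$ and $\left(\pm\frac12, -\frac{\sqrt3}2\right)$ at the corners of a regular hexagon, plus two "roots" $(0,0)$ at the origin for the generators $H_1, H_2$ of the Cartan subalgebra.]

All angles between root vectors of SU(3) are multiples of 60°.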
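
The statement that roots are differences of weights can be checked numerically. The sketch below (an illustration only, with hypothetical function names) takes the three weights of the triangle, forms all differences of distinct weights, and verifies that the six resulting vectors have unit length and mutual angles that are multiples of 60°.

```python
import math
from itertools import permutations

WEIGHTS = [
    (0.5, 1 / (2 * math.sqrt(3))),
    (-0.5, 1 / (2 * math.sqrt(3))),
    (0.0, -1 / math.sqrt(3)),
]

def roots_from_weights(weights):
    """All differences mu - mu' of two distinct weights."""
    return [(a[0] - b[0], a[1] - b[1]) for a, b in permutations(weights, 2)]

def angle_degrees(u, v):
    """Angle between two plane vectors, in degrees."""
    cosine = (u[0] * v[0] + u[1] * v[1]) / (math.hypot(*u) * math.hypot(*v))
    # Clamp to [-1, 1] to protect acos against rounding errors.
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))

roots = roots_from_weights(WEIGHTS)
for root in roots:
    print(root, math.hypot(*root))

angles = sorted({round(angle_degrees(u, v)) for u in roots for v in roots if u is not v})
print(angles)
```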

6.4 Simple roots

(Section VIII of Georgi’s book)

Recall how we built all the states of a spin $j$ representation of SU(2) starting from the highest $J_3$ eigenvalue $j$ and then acting on $|j\rangle$ by the lowering operator $J^-$ to generate all the other (linearly independent) states

$$|j-1\rangle, |j-2\rangle, \ldots, |-j+1\rangle, |-j\rangle.$$

We then wanted to generalize this to other simple Lie groups, finding the analogue of $J_3$ (the generators $H_i$ of the Cartan subalgebra) and the analogue of $J^\pm$ (the rest of the generators, the $E_{\pm\alpha}$). Next we want to find the analogue of the highest $J_3$ eigenvalue $j$ from which we generated the rest of the states (of the representation). The analogue or generalized concept is called "highest weight".

6.4.1 Highest weight

Pick a basis of the Cartan subalgebra, i.e. choose some particular ordering for the generators, $H_1, H_2, \ldots, H_m$. The associated weights are then $\mu_1, \mu_2, \ldots, \mu_m$. A weight vector $\vec{\mu} = (\mu_1, \mu_2, \ldots, \mu_m)$ is then said to be

• positive if its first nonzero component is $> 0$,

• negative if its first nonzero component is $< 0$,

• or zero if all components are zero, $\vec{\mu} = (0, 0, \ldots, 0)$.

We can then define an ordering among all possible weight vectors:

\[
\vec{\mu} > \vec{\mu}' \ \Leftrightarrow \ \vec{\mu} - \vec{\mu}' \text{ is positive.}
\]

All states $|\vec{\mu}, D\rangle$ of an irreducible representation $D$ are then ordered, and if the representation is finite dimensional there is a weight $\vec{\mu}$ which is greater than any of the others. This is the highest weight. It can be shown that the highest weight is unique. Also, although it seems that all the above notions depend on the initial arbitrary choice of the ordering $H_1, H_2, \ldots, H_m$, it can be shown that the choice does not matter for what follows (i.e. any other choice would work equally well).
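The ordering just defined is a lexicographic comparison, which can be sketched in a few lines of Python. The function names and the tolerance `tol` are illustrative choices, and the sample data are the three weights of the defining representation of SU(3) quoted earlier.

```python
import math


def sign_of_weight(mu, tol=1e-12):
    """Return +1, -1 or 0 according to the first non-zero component of mu."""
    for component in mu:
        if component > tol:
            return 1
        if component < -tol:
            return -1
    return 0


def is_greater(mu, nu):
    """mu > nu if and only if mu - nu is a positive vector."""
    return sign_of_weight([a - b for a, b in zip(mu, nu)]) == 1


def highest_weight(weights):
    """Scan the list and keep the weight that is greater than all others."""
    best = weights[0]
    for w in weights[1:]:
        if is_greater(w, best):
            best = w
    return best


s3 = math.sqrt(3)
su3_weights = [(0.5, 1 / (2 * s3)), (-0.5, 1 / (2 * s3)), (0.0, -1 / s3)]
top = highest_weight(su3_weights)
```

With the ordering $H_1 = T_3$, $H_2 = T_8$, the highest weight found this way is $(\tfrac{1}{2}, \tfrac{1}{2\sqrt{3}})$.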

Consider then the adjoint representation. In this case the weight vectors $\vec{\mu}$ were called root vectors. Of these, $m$ vectors were zero vectors $\vec{\mu} = \vec{0} = (0, 0, \ldots, 0)$, the ones corresponding to the $m$ generators of the Cartan subalgebra. There were $\dim g - m$ non-zero roots $\pm\vec{\alpha}$ (we will again return to the notation $\mu \equiv \vec{\mu}$, $\alpha \equiv \vec{\alpha}$).

6.4.2 Simple root

A simple root is a positive root $\alpha$ which cannot be written as a sum $\beta + \gamma$ of two positive roots $\beta$, $\gamma$. The simple roots are important, as they turn out to determine the whole structure of the Lie algebra $g$ (or Lie group $G$)!

Note: If $\alpha$, $\beta$ are simple roots, then $\beta - \alpha$ and $\alpha - \beta$ are not roots.

Proof: Pick $\beta - \alpha$ or $\alpha - \beta$ depending on which one is positive. Suppose $\beta - \alpha > 0$. Assume it is a root. Then $\beta = (\beta - \alpha) + \alpha$ is a sum of two positive roots, thus $\beta$ is not simple. Likewise for $\alpha - \beta$. This concludes the proof.

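Recall the formula defining $p$, $q$:

\[
E_\alpha |\mu + p\alpha, D\rangle = 0; \qquad E_{-\alpha} |\mu - q\alpha, D\rangle = 0
\]

and the subsequent formula

\[
\frac{\alpha \cdot \mu}{\alpha^2} = -\frac{1}{2}(p - q).
\]

Then, choosing $D$ = adjoint representation and $\mu = \beta$ (now a root) we had

\[
\frac{\alpha \cdot \beta}{\alpha^2} = -\frac{1}{2}(p - q)
\]

and likewise, interchanging $\alpha$, $\beta$,

\[
\frac{\beta \cdot \alpha}{\beta^2} = -\frac{1}{2}(p' - q')
\]

with some (other) $p'$, $q'$. Suppose we now pick $\alpha$, $\mu \equiv \beta$ to be simple roots. Then $\alpha - \beta$ is not a root, so $q$ must be zero, so

\[
\frac{2\alpha \cdot \beta}{\alpha^2} = -p \leq 0
\]

and likewise $q' = 0$ and

\[
\frac{2\beta \cdot \alpha}{\beta^2} = -p' \leq 0 .
\]

Then, since $\alpha \cdot \beta = \beta \cdot \alpha = |\alpha||\beta| \cos\theta$ ($\theta$ being the angle between $\alpha$ and $\beta$) and $\alpha^2 = |\alpha|^2$, we get

\[
\cos\theta = -\frac{1}{2}\sqrt{pp'}, \qquad \frac{\beta^2}{\alpha^2} = \frac{p}{p'} .
\]

This means that the angle $\theta$ between any pair of simple roots is within the range

\[
\frac{\pi}{2} \leq \theta < \pi .
\]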
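Since $p$ and $p'$ are non-negative integers and $\theta < \pi$ requires $pp' < 4$, only four angles are possible. The short Python sketch below tabulates them from $\cos\theta = -\tfrac{1}{2}\sqrt{pp'}$; the dictionary name `allowed` is an illustrative choice.

```python
import math

# Map each admissible product p * p' (0, 1, 2 or 3) to the angle between
# the two simple roots, in degrees, using cos(theta) = -sqrt(p p') / 2.
allowed = {}
for pp in range(4):
    theta = math.degrees(math.acos(-math.sqrt(pp) / 2))
    allowed[pp] = round(theta, 6)
```

The four values are 90, 120, 135 and 150 degrees. When $pp' \neq 0$, the relation $\beta^2/\alpha^2 = p/p'$ gives squared-length ratios of 1, 2 and 3 for the last three cases; for $pp' = 0$ the ratio is not fixed by this formula.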

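Claim: Any set of simple root vectors satisfying the above constraint is linearly independent.

Proof: Suppose the simple roots $\alpha$ are not linearly independent. Then there exist coefficients $x_\alpha$, not all zero, such that

\[
\sum_\alpha x_\alpha \alpha = 0 . \tag{34}
\]

In the above combination, define two sets $\Gamma_\pm$ so that

\[
\alpha \in \Gamma_+ \ \text{if} \ x_\alpha > 0, \qquad \alpha \in \Gamma_- \ \text{if} \ x_\alpha < 0
\]

(simple roots with $x_\alpha = 0$ drop out of the sum). Then

\[
(34) \ \Leftrightarrow \ y \equiv \sum_{\alpha \in \Gamma_+} x_\alpha \alpha = -\sum_{\alpha \in \Gamma_-} x_\alpha \alpha \equiv z .
\]

Notice that $y = \sum_{\alpha \in \Gamma_+} x_\alpha \alpha$ is a positive vector, since all $\alpha > 0$ and $x_\alpha > 0$ for $\alpha \in \Gamma_+$. Also $z = \sum_{\alpha \in \Gamma_-} (-x_\alpha)\alpha$ is a positive vector, since $-x_\alpha > 0$ for $\alpha \in \Gamma_-$. But then, using $y = z$,

\[
0 < y^2 = y \cdot z = \sum_{\alpha \in \Gamma_+} \sum_{\beta \in \Gamma_-} \underbrace{x_\alpha}_{>0} \, \underbrace{(-x_\beta)}_{>0} \, \underbrace{\alpha \cdot \beta}_{\leq 0} \leq 0
\]

(recall that $\alpha \cdot \beta \leq 0$ between any two simple roots), which is a contradiction. This concludes the proof.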

Now then, all positive roots $\phi$ can be written as a linear combination of simple roots $\alpha$ with non-negative integer coefficients $k_\alpha$:

\[
\phi = \sum_\alpha k_\alpha \alpha . \tag{35}
\]

Proof: If $\phi$ is simple, (35) is trivially true. If $\phi$ is not simple, it is a sum of two positive roots, $\phi = \phi_1 + \phi_2$. If $\phi_1$ and $\phi_2$ are both simple, (35) is true. If not, then $\phi_1 = \phi_{11} + \phi_{12}$ or $\phi_2 = \phi_{21} + \phi_{22}$, a sum of two positive roots, and so on. This concludes the proof.

Claim: The number $m$ of simple roots is equal to the rank of the group.

Proof: Simple roots $\alpha$ are weights, i.e. $m$-component vectors, and they are linearly independent, so their number must be $\leq m$. Suppose it is $< m$. They then span a vector space whose dimension is less than $m$, so one can introduce a basis where these $m$-component vectors all have first component equal to zero. By (35) the first component of every root $\phi$ is then also equal to zero. But then

\[
[H_1, E_\phi] = \underbrace{\phi_1}_{=0} E_\phi = 0
\]

for all roots $\phi$. Recall also

\[
[H_1, H_i] = 0 \quad \text{for all } i .
\]

Thus $H_1$ commutes with all generators $\Rightarrow$ it generates an invariant subalgebra. But this cannot exist, since we assumed that the Lie algebra was simple. As a consequence, $m$ is equal to the rank of the group, quod erat demonstrandum.

Consider then all possible linear combinations of simple roots,

\[
\sum_\alpha k_\alpha \alpha . \tag{36}
\]

Which ones of them are actually roots? Let us proceed by induction, denoting $k \equiv \sum_\alpha k_\alpha$.

$k = 1$: The sums (36) are just the simple roots themselves.

$k = 2$: $\alpha + \beta$ is a root if $\alpha \cdot \beta < 0$, because then

\[
\frac{2\alpha \cdot \beta}{\alpha^2} = -(p - q) = -p < 0 \qquad (q = 0 \text{ since } \alpha, \beta \text{ are simple})
\]

$\Rightarrow p > 0 \Rightarrow p \geq 1$, so that $\beta + p\alpha$ is still a root, and hence so is $\beta + \alpha$.

$k = n + 1$: Suppose $\gamma = \sum_\alpha k_\alpha \alpha$ has $k = n$ and was found to be a root. Consider then $\gamma + \alpha$, where $\alpha$ is a simple root. From previous considerations at $k \leq n$ one knows the value of $q$ such that $\gamma - q\alpha$ is still a root. Knowing $q$, one can evaluate $\gamma \cdot \alpha$ and use the master formula

\[
\frac{2\alpha \cdot \gamma}{\alpha^2} = -(p - q)
\]

to determine if $p > 0$. If it is, then $\gamma + \alpha$ is a root, since $\gamma + p\alpha$ with $p \geq 1$ is still a root. But this also gives all the roots with $k = n + 1$. Suppose there is some root $\rho$ which does not have the form $\gamma + \alpha$ for any $\alpha$. Then, since $\rho - \alpha$ is not a root, $q = 0$ and

\[
\frac{2\alpha \cdot \rho}{\alpha^2} = -p \leq 0 \quad \text{for all } \alpha .
\]

But then $\rho$ is linearly independent of all the simple roots, which is a contradiction.

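Summary: Knowing the simple roots $\alpha$, one can construct all roots.

Example: SU(3). With the ordering $(H_1, H_2)$, the positive roots are those whose first nonzero component is positive:

\[
(1, 0); \quad \left(\tfrac{1}{2}, \tfrac{\sqrt{3}}{2}\right); \quad \left(\tfrac{1}{2}, -\tfrac{\sqrt{3}}{2}\right).
\]

$\alpha_1 = (1, 0) = \alpha_2 + \alpha_3$ is a sum of two positive roots, thus $\alpha_1$ is not simple. $\alpha_2 = (\tfrac{1}{2}, \tfrac{\sqrt{3}}{2})$ and $\alpha_3 = (\tfrac{1}{2}, -\tfrac{\sqrt{3}}{2})$ are simple roots.

\[
\alpha_2^2 = \alpha_3^2 = 1, \qquad \alpha_2 \cdot \alpha_3 = -\tfrac{1}{2},
\]

so

\[
\frac{2\alpha_2 \cdot \alpha_3}{\alpha_2^2} = \frac{2\alpha_3 \cdot \alpha_2}{\alpha_3^2} = -1 = -(p - \underbrace{q}_{=0}) = -p \ \Rightarrow \ p = 1
\]

$\Rightarrow \alpha_2 + \alpha_3 \ (= \alpha_1)$ is a positive root, but $\alpha_2 + 2\alpha_3$ is not a root.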
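The SU(3) example can be reproduced mechanically. The Python sketch below (standard library only; helper names are illustrative) starts from the six non-zero roots, selects the positive ones with the first-nonzero-component convention of Section 6.4.1, discards those that are sums of two positive roots, and evaluates $p$ from the master formula with $q = 0$.

```python
import math
from itertools import combinations_with_replacement

s = math.sqrt(3) / 2
roots = [(1.0, 0.0), (-1.0, 0.0), (0.5, s), (-0.5, -s), (0.5, -s), (-0.5, s)]


def is_positive(v, tol=1e-12):
    """A vector is positive if its first non-zero component is > 0."""
    for c in v:
        if abs(c) > tol:
            return c > 0
    return False


def close(u, v, tol=1e-9):
    return all(abs(a - b) < tol for a, b in zip(u, v))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


positive = [r for r in roots if is_positive(r)]


def is_sum_of_two_positive(r):
    """True if r = a + b for some pair of positive roots a, b."""
    return any(
        close(r, (a[0] + b[0], a[1] + b[1]))
        for a, b in combinations_with_replacement(positive, 2)
    )


simple = [r for r in positive if not is_sum_of_two_positive(r)]

# Master formula with q = 0: 2 (a2 . a3) / a2^2 = -p.
a2, a3 = simple
p = -2 * dot(a2, a3) / dot(a2, a2)
```

The sketch finds three positive roots and two simple ones, with $p = 1$ up to rounding, so exactly one step $\alpha_2 + \alpha_3$ is allowed.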

6.4.3 Dynkin diagrams

Let us restate what we know. The $m$ simple roots can be used to find all the roots, which in turn we can use to write down the algebra by finding the $N_{\alpha,\beta}$. The simple roots are all we need to know. A Dynkin diagram is just a shorthand, diagrammatic notation for writing down the simple roots. Each simple root is written as an open circle. Pairs of circles are then connected by lines depending on the angle $\theta$ (recall $\theta \in [\tfrac{\pi}{2}, \pi)$) between the pair of roots to which the circles correspond, as follows:

• three lines if the angle is 150°,

• two lines if the angle is 135°,

• one line if the angle is 120°,

• no line if the angle is 90°.

The complete Dynkin diagram determines all the angles between pairs of simple roots. This doesn’t quite determine all the simple roots, because there may be several possible choices for the relative lengths, but we will worry about that later.

Example: The diagram for SU(2) is a single circle; the diagram for SU(3) is two circles connected by a single line.

6.4.4 Fundamental weights

Label the simple roots $\alpha^i$ ($m$-component vectors), $i = 1, \ldots, m$. Now consider the highest weight of an arbitrary irreducible representation, $D$. A weight $\mu$ in $D$ is the highest weight in the representation if and only if $\mu + \phi$ is not a weight in the representation for all positive roots $\phi$. Clearly, it is sufficient to require that $\mu + \alpha^i$ is not a weight for all the simple roots $\alpha^i$. That means that for $E_{\alpha^i}$ acting on $\mu$, $p = 0$, thus

\[
\frac{2\alpha^i \cdot \mu}{(\alpha^i)^2} = q^i \tag{37}
\]

where the $q^i$’s are non-negative integers.

Since the $\alpha^i$’s are linearly independent, the $q^i$’s completely determine $\mu$. Every set of $q^i$ gives a vector $\mu$ satisfying (37) which is the highest weight of some irreducible representation, and we can construct the entire representation by acting on $\mu$ with the lowering operators $E_{-\alpha^i}$.

We have shown that the irreducible representations of a rank $m$ simple Lie group can be labeled by a set of $m$ non-negative integers, $q^i$. To get a feeling for what this means, it is useful to consider the weight vectors $\mu^i$, such that

\[
\frac{2\alpha^i \cdot \mu^j}{(\alpha^i)^2} = \delta_{ij} .
\]

$\mu^i$ is the highest weight of a representation with $q^i = 1$ and $q^j = 0$ for $j \neq i$. Clearly, any highest weight $\mu$ can be written uniquely as a sum of the $\mu^i$’s,

\[
\mu = \sum_i q^i \mu^i .
\]

We can build the representation with highest weight $\mu$ out of the tensor product of $q^1$ representations with highest weight $\mu^1$, $q^2$ with highest weight $\mu^2$, etc., just as we build the spin $\tfrac{n}{2}$ representation of SU(2) out of $n$ spin $\tfrac{1}{2}$ representations.

The vectors $\mu^i$ are called the fundamental weights, and the $m$ irreducible representations whose highest weights are $\mu^i$ are the fundamental representations, $D^i$.

Do not confuse the upper indices, which label the simple roots and the fundamental weights, with the lower indices, which label the components of all the weight and root vectors. Both indices run from 1 to the rank of the group, but the first is just a label while the second is a vector index.
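For SU(3) the defining condition $2\alpha^i \cdot \mu^j / (\alpha^i)^2 = \delta_{ij}$ is a $2 \times 2$ linear system, which the following Python sketch solves by inverting the matrix by hand. It assumes the simple roots $(\tfrac{1}{2}, \pm\tfrac{\sqrt{3}}{2})$ of SU(3) in the $(H_1, H_2)$ ordering; the names `M`, `mu` and `highest_weight` are illustrative.

```python
import math

s = math.sqrt(3) / 2
alpha = [(0.5, s), (0.5, -s)]  # simple roots of SU(3) (assumed convention)


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


# Row i of M is 2 alpha^i / (alpha^i)^2, so that M applied to mu^j gives e_j.
M = [[2 * c / dot(a, a) for c in a] for a in alpha]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
inverse = [
    [M[1][1] / det, -M[0][1] / det],
    [-M[1][0] / det, M[0][0] / det],
]
# Column j of the inverse matrix is the fundamental weight mu^j.
mu = [(inverse[0][j], inverse[1][j]) for j in range(2)]


def highest_weight(q):
    """Highest weight mu = sum_i q^i mu^i for non-negative integers q^i."""
    return tuple(sum(q[i] * mu[i][k] for i in range(2)) for k in range(2))


adjoint_top = highest_weight((1, 1))
```

The first fundamental weight comes out as $(\tfrac{1}{2}, \tfrac{1}{2\sqrt{3}})$, the highest weight of the defining representation found earlier, and the labels $(q^1, q^2) = (1, 1)$ give $(1, 0)$, the highest root.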


A Appendix

A.1 Some miscellaneous definitions and formulæ

⋆ A magma is defined as a set $M$ equipped with a binary operation $M \times M \to M$.

⋆ A (partial) order relation in a set $X$ is a binary relation $\preceq$ among elements of $X$ which, $\forall a, b, c \in X$, satisfies:

• $a \preceq a$

• $[(a \preceq b) \text{ AND } (b \preceq a)] \Rightarrow (a = b)$

• $[(a \preceq b) \text{ AND } (b \preceq c)] \Rightarrow (a \preceq c)$

namely, which is reflexive, antisymmetric, and transitive.

⋆ Given a Lie algebra $A$ and a subalgebra $I \subset A$, we say that $I$ is an ideal when, for all $x \in A$ and for all $y \in I$, one has: $[x, y] \in I$.

⋆ A Lie algebra is said to be semisimple if it does not contain non-trivial Abelian ideals.

⋆ A Lie group $G$ is said to be semisimple if its Lie algebra is semisimple, or, equivalently, if $G$ does not have any non-trivial connected, normal, Abelian subgroups.

⋆ The zero-th homotopy group of a topological space $M$, denoted by $\pi_0(M)$, is the set of connected components of $M$; its cardinality is the number of disconnected pieces of $M$.

⋆ The $n$-dimensional unit pseudospheres are $n$-dimensional hypersurfaces of constant curvature in the $(n+1)$-dimensional real vector space $\mathbb{R}^{n+1}$. There are four of them:

1. the $n$-dimensional sphere $S^n$, defined as the set of points $x \in \mathbb{R}^{n+1}$ satisfying the equation:

\[
x_0^2 + \left(x_1^2 + x_2^2 + \cdots + x_{n-1}^2\right) + x_n^2 = 1,
\]

2. the $n$-dimensional hyperbolic space $H^n$, described by the equation:

\[
-x_0^2 + \left(x_1^2 + x_2^2 + \cdots + x_{n-1}^2\right) + x_n^2 = -1,
\]

3. the $n$-dimensional de Sitter space $dS_n$, described by the equation:

\[
-x_0^2 + \left(x_1^2 + x_2^2 + \cdots + x_{n-1}^2\right) + x_n^2 = 1,
\]

4. the $n$-dimensional anti-de Sitter space $AdS_n$, described by the equation:

\[
-x_0^2 + \left(x_1^2 + x_2^2 + \cdots + x_{n-1}^2\right) - x_n^2 = -1,
\]

where $x_0, \ldots, x_n$ denote the components of a generic element $x$ of $\mathbb{R}^{n+1}$.

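⋆ Let $A$ be a generic, diagonalizable matrix. Then:

\[
\det e^A = e^{\operatorname{tr} A} .
\]

To prove this statement, recall that:

\[
e^A = \sum_{n=0}^{+\infty} \frac{A^n}{n!} .
\]

If $A$ is diagonal ($A = \operatorname{diag}(\lambda_1, \lambda_2, \ldots)$), then it is trivial to check that $A^n = \operatorname{diag}(\lambda_1^n, \lambda_2^n, \ldots)$ and, hence, $\exp A = \operatorname{diag}(e^{\lambda_1}, e^{\lambda_2}, \ldots)$, which implies:

\[
\det e^A = \prod_i (\exp A)_{ii} = \prod_i e^{\lambda_i} = \exp\left(\sum_i \lambda_i\right) = e^{\operatorname{tr} A} .
\]

On the other hand, if $A$ is not diagonal, then, since it is diagonalizable, there exists a similarity transformation that diagonalizes it:

\[
A = U^{-1} \Lambda U,
\]

where $\det U \neq 0$, and $\Lambda$ is diagonal. Then $A^n = U^{-1} \Lambda^n U$, so $e^A = U^{-1} e^{\Lambda} U$ and $\det e^A = \det(U^{-1} e^{\Lambda} U) = \det(U^{-1}) \cdot \det e^{\Lambda} \cdot \det U = (\det U)^{-1} \cdot \det e^{\Lambda} \cdot \det U = \det e^{\Lambda}$, and the previous argument can be applied to $\Lambda$. Note that, if $A = U^{-1} \Lambda U$, then $\operatorname{tr} A = \operatorname{tr}(U^{-1} \Lambda U) = \operatorname{tr}(U U^{-1} \Lambda) = \operatorname{tr} \Lambda$, due to the invariance of the trace of a product of matrices under cyclic permutations of the factors.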
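The identity can be checked numerically on a small example. The sketch below, in plain Python, sums the exponential series for a $2 \times 2$ matrix with a fixed number of terms; the sample matrix `A` and the truncation `terms=40` are arbitrary illustrative choices, not part of the notes.

```python
import math


def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(A, terms=40):
    """Truncated power series  sum_{k < terms} A^k / k!."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term A^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result


A = [[0.3, 1.2], [-0.7, 0.5]]  # non-diagonal, with two distinct eigenvalues
E = expm(A)
det_exp = E[0][0] * E[1][1] - E[0][1] * E[1][0]
exp_trace = math.exp(A[0][0] + A[1][1])
```

Here `det_exp` and `exp_trace` agree to within rounding error.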

A.2 Algebras, representations and Young calculus

Let $G$ be a Lie group (i.e., a continuous group whose elements can be parametrized in terms of real parameters, for which the group manifold is differentiable). The generators of $G$ are denoted by $t_a$, and can be defined by looking at elements $q \in G$ which are infinitesimally close to the identity element:

\[
q = 1 + i \sum_a x_a t_a + O(x^2)
\]

where the summation index ranges from 1 to the dimension of the group, and the $x_a$’s are infinitesimally small parameters.

The generators $t_a$ form a basis for a vector space (called the “algebra of generators of $G$”, and usually denoted as $g$), in which it is possible to define an internal “product” operation denoted as $[\ ,\ ]$ (“Lie bracket”), which is linear in each of its two arguments, and satisfies the following properties:

1. the Lie bracket is “alternating in $g$”: $\forall u \in g : [u, u] = 0$. Due to the bilinearity, this condition is equivalent to antisymmetry: $\forall u, v \in g : [u, v] = -[v, u]$

2. “Jacobi identity”: $\forall u, v, w \in g : [[u, v], w] + [[w, u], v] + [[v, w], u] = 0$

In a representation of the group $G$ in terms of square matrices of size $N \times N$ (in which, obviously, also the elements of $g$ are square matrices of the same size), the Lie bracket can be expressed as the commutator of two matrices:

\[
[A, B] = AB - BA \tag{B.1}
\]

Since the Lie product is closed in the algebra, and since the $t_a$’s are a basis for the algebra itself, Lie products of different generators can be expressed as linear combinations of the $t_a$’s themselves, namely:

\[
[t_a, t_b] = i \sum_c C_{abc} t_c \tag{B.2}
\]

where the $C_{abc}$ are called the “structure constants” of the algebra; owing to the properties of the Lie bracket, they satisfy certain conditions (for example, the antisymmetry of the Lie bracket implies that $C_{abc} = -C_{bac}$).

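As an example, if we consider the SU(2) group in its fundamental (defining) representation in terms of $2 \times 2$ unitary complex matrices with unit determinant, it turns out that the generators can be written in terms of the three Pauli matrices $\sigma_a$ ($a \in \{1, 2, 3\}$) as:

\[
t_a = \frac{1}{2} \sigma_a
\]

with:

\[
\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad
\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} .
\]

With this definition of the $t_a$’s, it is possible to show that:

\[
[t_a, t_b] = i \sum_{c=1}^{3} \epsilon_{abc} t_c \tag{B.3}
\]

(where $\epsilon_{123} = \epsilon_{231} = \epsilon_{312} = 1$, while $\epsilon_{132} = \epsilon_{321} = \epsilon_{213} = -1$, and $\epsilon_{abc} = 0$ if at least two indices are equal), where the structure constants of su(2) appear: $C_{abc} = \epsilon_{abc}$.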
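Relation (B.3) is easy to verify by brute force. The Python sketch below uses nested lists of complex numbers for the matrices and indices 0, 1, 2 in place of 1, 2, 3; the closed-form expression used for $\epsilon_{abc}$ is a convenience valid only for that index range, and all function names are illustrative.

```python
sigma = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
# Generators t_a = sigma_a / 2.
t = [[[x / 2 for x in row] for row in s] for s in sigma]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(2)] for i in range(2)]


def epsilon(a, b, c):
    """Levi-Civita symbol for indices taking the values 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) / 2


def max_violation():
    """Largest entry of [t_a, t_b] - i sum_c eps_abc t_c over all a, b."""
    worst = 0.0
    for a in range(3):
        for b in range(3):
            lhs = commutator(t[a], t[b])
            for i in range(2):
                for j in range(2):
                    rhs = sum(1j * epsilon(a, b, c) * t[c][i][j] for c in range(3))
                    worst = max(worst, abs(lhs[i][j] - rhs))
    return worst
```

Calling `max_violation()` returns zero to machine precision, confirming $C_{abc} = \epsilon_{abc}$ for these generators.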

Note that, with the conventions introduced above, the $t_a$’s of a generic su($N$) algebra can be represented as Hermitean traceless matrices. The fact that the trace of the $t_a$’s vanishes is related to the requirement that we are dealing with a special unitary group, i.e. that the determinant of the group elements is 1.

The structure constants are at the very core of the geometric structure of an algebra: essentially, defining an algebra means defining its structure constants. In particular, this implies that an algebra can be realized using different matrices of different sizes: the relations defined by eq. (B.2) could be, for example, satisfied by a set of matrices of size $N_1 \times N_1$, or by a different set of matrices of size $N_2 \times N_2$. This corresponds to different “representations” of the algebra.

By constructing different representations of an algebra, it is also possible to obtain the corresponding representations of the group, of the same size: for group elements infinitesimally close to the identity element, the group and algebra elements are related by eq. (B.1), which, for a generic element $q$ of the group, generalizes to the exponential map:

\[
q = \exp\left(i \sum_a x_a t_a\right) = \sum_{n=0}^{+\infty} \frac{1}{n!} \left(i \sum_a x_a t_a\right)^n
\]

If the group elements are represented as (real or complex) matrices of size $N \times N$, they can be considered as linear transformations acting on an $N$-component (real or complex) vector space.

Many problems in physics have to do with the properties of a system which is made of several simpler objects. If each component of the system has well-defined transformation properties with respect to a certain group of transformations (meaning, for example, that it transforms according to a particular representation of the group), it is often useful to work out the transformation properties of the system as a whole. One example could be the computation of the total spin of a quantum system of two non-relativistic, indistinguishable particles: if one of the constituents transforms as an $N$-component real or complex vector under the given group of transformations, and the other transforms as an $M$-component vector, then the compound system of the two particles could be represented as a vector with $N \cdot M$ components, given by the products of the components of the two original vectors. The action of a transformation in the group on such a vector can then be described via the “tensor product” of the two matrices that describe the transformations of the two original vectors, which is an $(N \cdot M) \times (N \cdot M)$ matrix with the structure:

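\[
A \otimes B =
\begin{pmatrix} a_{11} & \ldots & a_{1N} \\ \ldots & \ldots & \ldots \\ a_{N1} & \ldots & a_{NN} \end{pmatrix}
\otimes
\begin{pmatrix} b_{11} & \ldots & b_{1M} \\ \ldots & \ldots & \ldots \\ b_{M1} & \ldots & b_{MM} \end{pmatrix}
=
\begin{pmatrix} a_{11}B & a_{12}B & \ldots & a_{1N}B \\ \ldots & \ldots & \ldots & \ldots \\ a_{N1}B & a_{N2}B & \ldots & a_{NN}B \end{pmatrix}
\]

\[
=
\begin{pmatrix}
a_{11}b_{11} & a_{11}b_{12} & \ldots & a_{11}b_{1M} & a_{12}b_{11} & \ldots & a_{1N}b_{1M} \\
a_{11}b_{21} & a_{11}b_{22} & \ldots & a_{11}b_{2M} & a_{12}b_{21} & \ldots & a_{1N}b_{2M} \\
\ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\
a_{N1}b_{M1} & a_{N1}b_{M2} & \ldots & a_{N1}b_{MM} & a_{N2}b_{M1} & \ldots & a_{NN}b_{MM}
\end{pmatrix}
\]

(Note that in general the tensor product of two matrices is not commutative: $A \otimes B \neq B \otimes A$, but the two resulting matrices can be mapped into each other by a permutation of rows and columns.)

If $A$ is a matrix of size $N \times N$, and $B$ is of size $M \times M$, it is easy to prove that: $\operatorname{tr}(A \otimes B) = (\operatorname{tr} A) \cdot (\operatorname{tr} B)$, and $\det(A \otimes B) = (\det A)^M \cdot (\det B)^N$.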
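The block structure and the two identities can be illustrated with a small integer example. The Python sketch below builds $A \otimes B$ entry by entry and computes the determinant by Laplace expansion, which is slow but exact for integers; the sample matrices and helper names are illustrative.

```python
def kron(A, B):
    """Tensor (Kronecker) product: the (i, j) block of the result is A[i][j] * B."""
    n, m = len(A), len(B)
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )


A = [[1, 2], [3, 4]]                    # N = 2, tr A = 5, det A = -2
B = [[2, 0, 1], [1, 3, 0], [0, 1, 1]]   # M = 3, tr B = 6, det B = 7
AB = kron(A, B)
```

For these matrices `trace(AB)` equals $5 \cdot 6 = 30$ and `det(AB)` equals $(-2)^3 \cdot 7^2 = -392$, in agreement with the two formulas.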

A different operation on matrices is the “tensor sum”, which can be used to describe the case of two transformations acting separately on two independent vectors: in this case, the result is a matrix of size $(N + M) \times (N + M)$:

\[
A \oplus B = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} =
\begin{pmatrix}
a_{11} & \ldots & a_{1N} & 0 & \ldots & 0 \\
a_{21} & \ldots & a_{2N} & 0 & \ldots & 0 \\
\ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\
a_{N1} & \ldots & a_{NN} & 0 & \ldots & 0 \\
0 & \ldots & 0 & b_{11} & \ldots & b_{1M} \\
0 & \ldots & 0 & b_{21} & \ldots & b_{2M} \\
\ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\
0 & \ldots & 0 & b_{M1} & \ldots & b_{MM}
\end{pmatrix}
\]

(Similarly to what happens for the tensor product of two matrices, also the “tensor sum” of two matrices is not commutative: $A \oplus B \neq B \oplus A$, but, again, the two resulting matrices can be mapped into each other by a permutation of rows and columns.)

It is obvious that: $\operatorname{tr}(A \oplus B) = \operatorname{tr} A + \operatorname{tr} B$, and $\det(A \oplus B) = (\det A) \cdot (\det B)$.

The operations above are useful in the composition of different representations. In particular, it is interesting to look at the tensor products of different irreducible representations (which, roughly speaking, are those that cannot be written in terms of matrices in a block-triangular or block-diagonal form), and to decompose them into a sum of other irreducible representations. To do this, it is useful to use the rules of Young calculus, which we describe below. For simplicity, we only discuss the cases relevant for the algebras of generators of unitary or special unitary groups, u($N$) and su($N$) respectively.

• Each representation is denoted by a Young diagram made of square boxes, arranged in horizontal rows and vertical columns.

• The lengths of the horizontal rows of a Young diagram are non-increasing, from top to bottom.

• The maximum number of horizontal rows in the Young diagram of a representation of u($N$) is $N$; for su($N$), the length of the $N$-th row is always zero.

The box diagrams can also be represented by a sequence of $N$ non-increasing integers $\lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_{N-1} \geq \lambda_N$, with $\lambda_N = 0$ for su($N$), that represent the lengths of subsequent rows: $[\lambda_1, \ldots, \lambda_N]$ (sometimes the shortened notation $[\lambda_1, \ldots, \lambda_{N-1}]$ is used for su($N$)).

Young diagrams of particularly interesting representations include:


• The Young diagram of the fundamental representation consists of only one box.

• The Young diagram of the trivial representation (in which each generator $t_a$ is represented by the number 0, and each element of the group is represented by 1) does not contain any box: it can be denoted by the symbol $\emptyset$.

• Given an irreducible representation $r$ whose first row has length $\lambda_1$, the Young diagram of the “conjugate representation” $\bar{r}$ is obtained from the rectangle of $N$ rows and $\lambda_1$ columns in which the Young diagram of $r$ can be inscribed, by removing the boxes belonging to the Young diagram of $r$, and by rotating the diagram made of the remaining boxes by 180 degrees. For example, for su(3), the conjugate representation of [Young diagram] is [Young diagram], whereas for su(4) it is [Young diagram].

• In particular, the Young diagram of the “anti-fundamental” representation is given by a vertical column of $N - 1$ boxes.

• If the Young diagram of the conjugate representation is equal to the one of the initial representation, then the latter is self-conjugate; in particular, this implies that the traces of all group elements in that representation are real.

• The Young diagram of the “adjoint representation” is made of a vertical column of $N - 1$ boxes, together with a further box in the first horizontal row; the adjoint representation is always self-conjugate.

Given a non-empty Young diagram, the size of the corresponding representation (i.e., the dimension of the vector space on which the representation acts) is given by the number of possible ways to fit the numbers from 1 to $N$ in each of the boxes of the diagram, with the constraints that:

• In every horizontal row, the sequence of numbers from left to right must be non-decreasing.

• In every vertical column, the sequence of numbers from top to bottom must be strictly increasing.

Roughly speaking, higher representations are obtained from the indices $1, \ldots, N$ of the fundamental one, either by symmetrization (of indices in the same horizontal row) or by antisymmetrization (of indices in the same vertical column). The two rules listed above enforce these properties and help avoid multiple counting.

Equivalently, the dimension of the representation $[\lambda_1, \lambda_2, \ldots, \lambda_N]$ is given by:

\[
\prod_{i=1}^{N-1} \prod_{j=i+1}^{N} \frac{l_i - l_j}{l_i^0 - l_j^0}, \qquad \text{with: } l_i = \lambda_i + N - i, \quad l_i^0 = N - i
\]

(with $\lambda_N = 0$ for su($N$)).
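Both ways of computing the dimension can be compared directly. In the Python sketch below, `dim_formula` implements the product formula and `dim_by_counting` enumerates all fillings of the boxes by brute force, keeping those that obey the row and column rules. The function names are illustrative, and the brute-force count is only practical for small diagrams.

```python
from fractions import Fraction
from itertools import product


def dim_formula(lam):
    """Dimension of [lambda_1, ..., lambda_N] from the product formula."""
    N = len(lam)
    l = [lam[i] + N - (i + 1) for i in range(N)]   # l_i = lambda_i + N - i
    l0 = [N - (i + 1) for i in range(N)]           # l_i^0 = N - i
    result = Fraction(1)
    for i in range(N - 1):
        for j in range(i + 1, N):
            result *= Fraction(l[i] - l[j], l0[i] - l0[j])
    return int(result)


def dim_by_counting(lam):
    """Count fillings with rows non-decreasing and columns strictly increasing."""
    N = len(lam)
    cells = [(r, c) for r, length in enumerate(lam) for c in range(length)]
    count = 0
    for values in product(range(1, N + 1), repeat=len(cells)):
        filling = dict(zip(cells, values))
        ok = all(
            ((r, c - 1) not in filling or filling[(r, c - 1)] <= v)
            and ((r - 1, c) not in filling or filling[(r - 1, c)] < v)
            for (r, c), v in filling.items()
        )
        count += ok
    return count
```

For su(3), both functions give 3 for $[1, 0, 0]$ (the fundamental representation) and 8 for $[2, 1, 0]$ (the adjoint), and the formula gives 10 for $[3, 0, 0]$ and 27 for $[4, 2, 0]$.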


Using the Young diagrams, it is possible to decompose generic tensor products of different representations into a direct sum of irreducible representations: this is called the Young calculus. Using the properties of the traces of $A \otimes B$ and $A \oplus B$, the Young calculus allows one to compute the characters of group elements in different representations.

The tensor products of representations that are easiest to simplify are those for which one of the two factors is the fundamental representation (for convenience, it is easiest to have the fundamental representation as the second factor; this does not change the representation decomposition of the product). They are decomposed into a direct sum of new representations, which are obtained by adding the box of the fundamental representation in each of the rows of the initial diagram, with the constraint that the resulting diagrams should still have rows of non-increasing length. For su($N$), if in this process the first column gets completely filled, it can be deleted from the corresponding diagram.

Some examples, for su(3):

[Three Young-diagram decompositions appeared here; the diagrams are not reproduced in this text version.]

Note that for u($N \geq 3$), or for su($N \geq 4$), the latter two products yield instead:

[Two Young-diagram decompositions appeared here; the diagrams are not reproduced in this text version.]

When neither factor is the fundamental representation, the Young calculus rules get considerably more cumbersome; again, it is easiest to have the representation with the simplest (smallest) Young diagram as the second factor. Then the construction of the representations appearing in the decomposition of the product goes as follows:

1. Replace the boxes appearing in the Young diagram of the second factor by letters, with the following rules: a’s in the first row, b’s in the second row, etc.

2. Attach the a’s to the right end of the rows of the Young diagram appearing as the first factor in all possible ways, but without placing any two a’s in the same column.

3. Repeat the previous step for the b’s, then the c’s, etc.

4. At the end, for each of the possible diagrams obtained, read the sequence of letters from right to left and from top to bottom, and discard those representations which violate the following admissibility rule: at any point in the sequence, at least as many a’s have occurred as b’s, at least as many b’s have occurred as c’s, etc. (for example: the sequences aab and aba are admissible, while baa is not).

5. In the diagrams that satisfy the admissibility rule, replace the a’s, b’s, c’s, etc. with boxes to get the Young diagrams of the decomposition into a direct sum of representations (for su($N$), cancel out possible columns of $N$ boxes, if any).
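The admissibility rule in step 4 is a running-count condition on the letter sequence, which can be sketched in Python as follows. The function name is illustrative, and the sequence is assumed to be already read off the diagram in the prescribed order, as a string over the lowercase letters a, b, c, and so on.

```python
def is_admissible(sequence):
    """At every point: #a >= #b >= #c >= ... among the letters read so far."""
    counts = {}
    for letter in sequence:
        counts[letter] = counts.get(letter, 0) + 1
        if letter != "a":
            previous = chr(ord(letter) - 1)
            # A new letter may never outnumber its predecessor in the alphabet.
            if counts[letter] > counts.get(previous, 0):
                return False
    return True
```

It is enough to test the condition each time a letter is added, because the count of the preceding letter can only grow afterwards. The examples quoted in step 4 behave as stated: `aab` and `aba` pass, `baa` fails.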

An example (for su($N > 3$), or for u($N > 2$)) is the following:

[Young-diagram decomposition with five terms on the right-hand side; the diagrams are not reproduced in this text version.]

Note that, according to the rules above, for su(3) the same product would yield:

[Young-diagram decomposition with six terms on the right-hand side, the last one being $\emptyset$; the diagrams are not reproduced in this text version.]

Finally, as a cross-check of the representation composition rules, it may be useful to evaluate the size of the different representations: interpreting the representation composition operations in terms of the matrix products and sums discussed above, the matrix sizes on the l.h.s. and r.h.s. of any decomposition should match. For example, in eq. (B.4) for su(3) we have:

\[
8 \cdot 8 = 27 + 10 + 10 + 8 + 8 + 1
\]

Note, however, that this is a necessary but not sufficient condition for the correctness of the decomposition.

A.3 Homotopy groups and exact sequences

Given a topological space $X$, its different homotopy groups $\pi_q(X)$, for $q \in \mathbb{N}$, classify the homotopically inequivalent ways to map an $S^q$ sphere (which can be obtained from the $q$-fold product of intervals $[0, 1] \subset \mathbb{R}$:

\[
I^q = \underbrace{[0, 1] \times [0, 1] \times \ldots \times [0, 1]}_{q \text{ factors}} \subset \mathbb{R}^q
\]

by identifying all the points on its boundary) to $X$, where “homotopically inequivalent” means “that cannot be mapped into each other by continuous deformations”. In particular:

• $\pi_0(X)$ is the set of connected components of $X$,

• $\pi_1(X)$ is the set of homotopically inequivalent loops in $X$,

• $\pi_2(X)$ is the set of homotopically inequivalent closed surfaces in $X$,

etc. (Note, in particular, that, since a generic $\mathbb{Z}_N$ consists only of $N$ points, it does not contain any one-, two- or three-loop or any higher loops either, and the only possibly non-trivial homotopy group is $\pi_0(\mathbb{Z}_N) \cong \mathbb{Z}_N$.)


The homotopy groups of direct products of (group) manifolds are given by the direct sum of the corresponding homotopy groups of each factor:

\[
\pi_q(X_1 \times X_2) = \pi_q(X_1) \oplus \pi_q(X_2)
\]

(where the meaning of the direct sum on the right-hand side of this equation is that the two homotopy groups are independent of each other).

Of particular interest are the homotopy groups of $N$-dimensional spheres. Recall that: $S^N = O(N+1)/O(N) = SO(N+1)/SO(N)$ (and, furthermore, for spheres of odd dimension larger than or equal to three, one also has: $S^{2k+1} = U(k+1)/U(k)$); it turns out that:

\[
\pi_q(S^N) = \begin{cases} \mathbb{Z}_1 & \text{for } q < N \\ \mathbb{Z} & \text{for } q = N \end{cases}
\]

(where $\mathbb{Z}_1 = \{e\}$ denotes the trivial group, containing only the identity element). For $q > N$, in general $\pi_q(S^N)$ can be non-trivial; in particular, the simplest non-trivial case is $\pi_3(S^2) = \mathbb{Z}$ (which is related to the “Hopf fibration”).

In order to compute the homotopy groups of various manifolds, it is often useful to resort to exact sequences of group homomorphisms.

A generic sequence of groups $G_i$ and group homomorphisms $f_i : G_i \to G_{i+1}$:

\[
\ldots \longrightarrow G_i \xrightarrow{\ f_i\ } G_{i+1} \xrightarrow{\ f_{i+1}\ } G_{i+2} \longrightarrow \ldots
\]

is said to be “exact” if, for every $i$ (except, possibly, at the end of the sequence), one has: $\operatorname{Im} f_i = \operatorname{Ker} f_{i+1}$, i.e. the image of each homomorphism coincides with the kernel of the next one.

Furthermore, recall that, due to a fundamental theorem of group homomorphisms, given a group homomorphism $f$ defined on a group $G$, one has:

\[
\operatorname{Im} f \cong G / \operatorname{Ker} f,
\]

where the symbol $\cong$ denotes group isomorphism.

Given a Lie group $G$, and a compact Lie subgroup $H \subset G$, it is possible to prove that the following sequence:

\[
\cdots \to \pi_q(H) \to \pi_q(G) \to \pi_q(G/H) \to \pi_{q-1}(H) \to \pi_{q-1}(G) \to \pi_{q-1}(G/H) \to \ldots
\]

(constructed by repeating the basic block: $\cdots \to \pi_m(H) \to \pi_m(G) \to \pi_m(G/H) \to \ldots$ for values of $m$ which decrease by 1 every time) is exact.

Typically, the determination of non-trivial homotopy groups using exact sequences can be done by:

• considering a portion of the sequence starting and ending from the trivial group $\mathbb{Z}_1$,

• using the definition of exact sequence,

• using the group homomorphisms’ theorem.

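As an example, consider the computation of $\pi_3(SO(3))$; since $SO(3) \cong SU(2)/\mathbb{Z}_2$, we can write:

\[
\pi_3(\mathbb{Z}_2) \xrightarrow{\ f\ } \pi_3(SU(2)) \xrightarrow{\ g\ } \pi_3(SO(3)) \xrightarrow{\ h\ } \pi_2(\mathbb{Z}_2)
\]

First of all, we have: $\pi_3(\mathbb{Z}_2) \cong \pi_2(\mathbb{Z}_2) \cong \mathbb{Z}_1$. Second, note that $\pi_3(SU(2)) \cong \mathbb{Z}$, because SU(2) is isomorphic to $S^3$. Next, one can observe that $h$ necessarily maps every element of its domain $\pi_3(SO(3))$ to the unique element of $\pi_2(\mathbb{Z}_2) \cong \mathbb{Z}_1$, which is the identity element of $\mathbb{Z}_1$: hence, $\operatorname{Ker} h \cong \pi_3(SO(3))$, and, since the sequence is exact, one obtains: $\operatorname{Im} g \cong \operatorname{Ker} h \cong \pi_3(SO(3))$. Due to the homomorphism theorem, one also has: $\operatorname{Im} g \cong \pi_3(SU(2)) / \operatorname{Ker} g$. Given that the sequence is exact, one has: $\operatorname{Ker} g \cong \operatorname{Im} f$, but, since the domain of $f$ is isomorphic to $\mathbb{Z}_1$, and $f$ is a homomorphism, $f$ necessarily maps the unique element of its domain to the identity element in $\pi_3(SU(2))$, so $\operatorname{Im} f \cong \mathbb{Z}_1$. Thus: $\operatorname{Ker} g = \mathbb{Z}_1$. Therefore we find that:

\[
\pi_3(SO(3)) \cong \operatorname{Ker} h \cong \operatorname{Im} g \cong \pi_3(SU(2)) / \operatorname{Ker} g \cong \pi_3(SU(2)) / \operatorname{Im} f \cong \pi_3(S^3) / \operatorname{Im} f \cong \pi_3(S^3) / \mathbb{Z}_1 \cong \mathbb{Z} / \mathbb{Z}_1 \cong \mathbb{Z} .
\]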

A.4 Classification of simple Lie algebras

Simple roots of a simple Lie algebra are such that:

1. they are linearly independent vectors;

2. if $\alpha$ and $\beta$ are simple roots, then:

\[
-\frac{2\alpha \cdot \beta}{\alpha^2} \in \mathbb{N}; \tag{D.1}
\]

3. the set of simple roots cannot be split into non-trivial disjoint subsets, such that every element from one of them is orthogonal to every element from another subset.

These conditions define a so-called “Π-system”. Π-systems can be associated with Dynkin diagrams, made of circles (representing simple roots), in which all possible pairs of circles are either connected by no line (if the angle between the associated simple roots is $\pi/2$), one line (if the angle between the corresponding simple roots is $2\pi/3$), two lines (if the angle is $3\pi/4$) or three lines (if the angle is $5\pi/6$). Furthermore, in the case of diagrams with a double or a triple line, the simple roots have different lengths (the ratio between the lengths is $\sqrt{2}$ for a double line and $\sqrt{3}$ for a triple line): in that case, short simple roots are denoted by filled circles, and/or are pointed at by an arrow on the double or triple line.


The linear independence of simple roots (together with the fact that the only allowed angles between simple roots are $\pi/2$, $2\pi/3$, $3\pi/4$ and $5\pi/6$, and with the fact that the sum of the angles between three linearly independent vectors in $\mathbb{R}^3$ is always less than $2\pi$) and the requirement that a Π-system must be connected (hence no simple root can be at an angle $\pi/2$ with all of the others) imply that the only Π-systems of three simple roots are the two shown in Figure 1.

[Figure 1: The only Π-systems made of three simple roots.]

Because of this, any subset of a Dynkin diagram must still be a Dynkin diagram, thus Dynkin diagrams with three or more roots can only consist of pieces of the two forms in the figure above.

In particular, this implies that triple lines can only appear in a diagram of two simple roots, shown in Figure 2.

[Figure 2: The only Π-system including a triple line.]

This diagram corresponds to the exceptional algebra $G_2$, which has rank 2 and dimension 14; one of its two simple roots is shorter than the other, and this is denoted by the filled circle (and pointed at by the arrow).

Furthermore, it is possible to prove that, in any Dynkin diagram, the operation of merging two roots which are connected by a single line leads to another Dynkin diagram. Thus, a Π-system cannot have two or more double lines (otherwise, by merging all roots connected by single lines, it could be reduced to a three-root diagram with two double lines, which is not allowed), and it cannot have any loop (otherwise, by merging enough roots connected by single lines, it could be reduced to a three-root loop diagram, which is not allowed).

It is also possible to prove that, if a Π-system is of the form: (where the rectangle

generically denotes the rest of the diagram) then also: is a Π-system. This impliesthat, if in a Π-system there are branches, they can only have the form of three singlelines coming out of the simple root in the center: Otherwise, diagrams with four ormore branches, and/or with branches and double lines, and/or with more branchings

108

Figure 3: Π-systems can have at most one branching, in the form of three single linescoming out of a simple root.

stemming out of different simple roots, could be shrunk to a three-root diagram withtwo double lines, which is not allowed.

Finally, no Π-system can be (or include as a subdiagram) any of the diagrams shown in Figure 4. This is so because it can be proven that for all of them there exists a vanishing linear combination of the simple roots with non-vanishing integer coefficients (i.e., their simple roots are not linearly independent). Note that, in the diagram with the double line, for which one can choose the direction in which the arrow points (i.e., which simple roots are short and which are long), both possible choices of the long and short simple roots can be proven to violate the requirement of linear independence.

Figure 4: Diagrams which are forbidden, because they violate the condition of linear independence of the simple roots.
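Linear independence of a set of simple roots can be tested numerically: the roots are linearly dependent exactly when the determinant of their matrix of scalar products (the Gram matrix) vanishes. The sketch below is restricted to simply laced diagrams, with every root normalised to squared length 2, so that roots joined by a single line (angle 120°) have scalar product −1 and unconnected roots are orthogonal. The function names and the edge-list encoding are illustrative assumptions. The three-root loop is the one excluded above; the seven-root star with three arms of two roots each is included as a further example, and whether it is among the diagrams drawn in Figure 4 is an assumption.

```python
from fractions import Fraction


def gram_matrix(n_roots, single_lines):
    """Gram matrix of a simply laced diagram (all roots of squared length 2)."""
    g = [[Fraction(2 if i == j else 0) for j in range(n_roots)]
         for i in range(n_roots)]
    for i, j in single_lines:
        g[i][j] = g[j][i] = Fraction(-1)  # 120 degrees between the two roots
    return g


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot in this column: singular matrix
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


# Allowed: chain of three roots (A3).
det_chain = determinant(gram_matrix(3, [(0, 1), (1, 2)]))
# Forbidden: loop of three roots.
det_loop = determinant(gram_matrix(3, [(0, 1), (1, 2), (2, 0)]))
# Star with centre 0 and three arms of two roots each (hypothetical example).
star = [(0, 1), (1, 2), (0, 3), (3, 4), (0, 5), (5, 6)]
det_star = determinant(gram_matrix(7, star))
print(det_chain, det_loop, det_star)
```

A vanishing determinant signals a vanishing linear combination of the roots; for the loop it is simply the sum of the three roots.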

This leads to the following classification of all possible Dynkin diagrams: there are four infinite series An, Bn, Cn and Dn (where n denotes the rank, i.e. the number of simple roots, which is also equal to the number of Cartan generators), which are associated to the algebras of classical simple Lie groups, and five exceptional diagrams: G2, F4, E6, E7 and E8, associated to corresponding exceptional Lie groups. In particular:

• An diagrams are chains of n simple roots, connected by (n − 1) single lines, and correspond to the algebras of the SU(n + 1) groups, with rank n and dimension n(n + 2);

Figure 5: Structure of the Dynkin diagrams for the An algebras.

• Bn diagrams are chains of n simple roots, connected by (n − 2) single lines, and by one double line at one end of the chain; all simple roots are long, except the one that is only connected to the others by the double line. These diagrams correspond to the algebras of the SO(2n + 1) groups, with rank n and dimension n(2n + 1);

Figure 6: Structure of the Dynkin diagrams for the Bn algebras.

• Cn diagrams have the same structure as Bn diagrams, but with the root lengths exchanged: all simple roots are short, apart from the one that is only connected to the others by the double line, which is long. These diagrams correspond to the algebras of the Sp(2n) groups, with rank n and dimension n(2n + 1);

Figure 7: Structure of the Dynkin diagrams for the Cn algebras.

• Dn diagrams consist of n simple roots: the first (n − 2) simple roots form a chain connected by single lines, and the simple root at one of the ends of the chain is also attached, by single lines, to two further simple roots. These diagrams correspond to the algebras of the SO(2n) groups, with rank n and dimension n(2n − 1);

Figure 8: Structure of the Dynkin diagrams for the Dn algebras.

• G2 is a diagram with two simple roots connected by a triple line. One of the two simple roots is short. The rank of G2 is 2 and its dimension is 14;

Figure 9: Dynkin diagram of the G2 algebra.

• F4 is a diagram which is a chain of four simple roots: they are connected by two single lines and by one double line (at the center of the chain). The first two (or, equivalently, the third and the fourth) simple roots are short. The rank of the F4 algebra is 4 and its dimension is 52;

Figure 10: Dynkin diagram of the F4 algebra.

• E6 is a diagram consisting of 6 simple roots: 5 of them form a linear chain, connected by single lines, and the last one is connected (again, by a single line) to the third root in the chain. The rank of E6 is 6 and its dimension is 78;

Figure 11: Dynkin diagram of the E6 algebra.

• E7 is a diagram similar to E6, except that it includes 7 simple roots: 6 of them form a linear chain, connected by single lines, and the last one is connected by a single line to the third root in the chain. The rank of E7 is 7, and its dimension is 133;

• E8 is a diagram made of 8 simple roots: 7 of them form a linear chain, connected by single lines, and the last one is connected by a single line to the third root in the chain. The rank of E8 is 8, and its dimension is 248.
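The ranks and dimensions quoted in the list can be collected in a short lookup, sketched here under the assumption that an algebra is identified by a series letter and its rank n; the function name algebra_dimension is hypothetical. The classical formulas are cross-checked against the standard dimensions of the corresponding groups: N² − 1 for SU(N), N(N − 1)/2 for SO(N), and n(2n + 1) for Sp(2n).

```python
def algebra_dimension(series, n):
    """Dimension of the simple Lie algebra of the given series and rank n."""
    classical = {
        "A": lambda k: k * (k + 2),      # SU(n + 1)
        "B": lambda k: k * (2 * k + 1),  # SO(2n + 1)
        "C": lambda k: k * (2 * k + 1),  # Sp(2n)
        "D": lambda k: k * (2 * k - 1),  # SO(2n)
    }
    exceptional = {
        ("G", 2): 14, ("F", 4): 52,
        ("E", 6): 78, ("E", 7): 133, ("E", 8): 248,
    }
    if series in classical:
        return classical[series](n)
    return exceptional[(series, n)]  # KeyError for a non-existent algebra


# Cross-check the classical series against the group dimensions.
for n in range(1, 10):
    assert algebra_dimension("A", n) == (n + 1) ** 2 - 1
    assert algebra_dimension("B", n) == (2 * n + 1) * (2 * n) // 2
    assert algebra_dimension("D", n) == (2 * n) * (2 * n - 1) // 2

# Number of roots = dimension minus rank (the Cartan generators).
roots_of_e8 = algebra_dimension("E", 8) - 8
print(algebra_dimension("A", 2), algebra_dimension("G", 2), roots_of_e8)
```

Subtracting the rank from the dimension gives the number of roots of the algebra, since the remaining generators are the Cartan generators.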


Note that, for the algebras represented by the diagrams of type A, D and E, all simple roots have the same length: these algebras are said to be “simply laced”.

