Differential Geometry: A coarse outline

Ryan Budney March 2, 2017

This is a set of notes used over the years for a variety of courses on manifolds, differential geometry and Lie groups. The central idea of these notes is to minimize technical formalism as much as possible, both in terms of distracting abstractions and hard-to-parse formulae. Inspired by the Guillemin and Pollack textbook, we keep all our manifolds in Euclidean space, unless a proof or construction demands otherwise. Similarly, we use the Ehresmann formalism for connections to minimize our interactions with Christoffel symbols, as most of the time they make computations more difficult and less intuitive. The notes have grown out of several offerings of a variety of courses, so they are something of a grab-bag of topics to choose from.

These notes assume a fairly robust background in multi-variable calculus. A terse review of the

material is supplied.

Copyright © Ryan Budney.

Index

Chapter 0: Sets, functions and linear algebra. This section is a quick review of conventions for describing sets, as well as a quick review of linear algebra. Specifically: vector spaces, linear maps, bases, matrix representations of linear maps, determinants, trace, dual spaces, adjoints, norms, inner products, Gram-Schmidt, angles, non-degenerate bilinear forms, signatures of forms, multi-linear algebra, wedge products of forms, tensor products, and the Hodge star.

Chapter 1: A review of basic calculus. The derivative, the chain rule, diffeomorphisms, manifolds, tangent spaces, inverse and implicit function theorems, the tangent bundle, vector fields, integral curves, the double tangent bundle and the Clairaut involution, the Hessian, the Lie bracket, Newton's method, Kantorovich's theorem, alternative versions of the implicit function theorem, critical points, Sard's Theorem, Taylor's Theorem.

Chapter 2: Existence and uniqueness theorem for solutions of ordinary differential equations. Flows, Lie derivatives, the commutator, Lie groups, Lie group actions, Lie algebras.

Chapter 3: Riemann metrics, isometries and conformal diffeomorphisms. Liouville's theorem. Connections: Ehresmann, Koszul, linear and Christoffel symbols. Parallel transport, holonomy, torsion, fundamental theorem of affine connections. Levi-Civita connections, geodesic completeness, Hopf-Rinow theorem.

Chapter 4: Deeper into holonomy. Riemann curvature tensor. Sectional curvature. The pull-back of the metric via the exponential map. Constant curvature. Ricci curvature and scalar curvature. Volume distortion of the exponential map. Ricci flow. Appendix on the volumes of spheres.

Chapter 5: Classification of the constant-curvature geometries. Models for hyperbolic space. The compactification of hyperbolic space. Classification of isometries of hyperbolic space. Horospheres. Examples of finite-volume cusped hyperbolic 2-manifolds, and the hyperbolic structure on the complement of the figure-8 knot. The Gauss-Bonnet theorem for 2-manifolds. Totally-geodesic submanifolds.

Chapter 6: Basic differential topology. Tubular neighbourhood theorem. Isotopy extension theorem. Fibre bundles. The submersions-are-fibre-bundles theorem. Connections and holonomy on fibre bundles. The manifolds that are regular values of smooth functions. Classification of 0, 1 and 2-dimensional manifolds.

Chapter 7: Representation theory. This section starts with a quick discussion of integration of forms on manifolds. We continue with the basic theory of representations of compact Lie groups.


References

Chapter 0:

• Friedberg, Insel and Spence. Linear Algebra. Pearson.

• Hoffman and Kunze. Linear Algebra. Pearson.

• Hubbard, J. Vector Calculus, Linear Algebra and Differential Forms: A Unified Approach, 4th ed, Matrix Editions.

Chapter 1:

• Spivak, M. A Comprehensive Introduction to Differential Geometry, Vol 1. Publish or Perish.

• Spivak, M. Calculus on Manifolds, Westview Press.

• Guillemin and Pollack. Differential Topology, Prentice-Hall. Only the 1st chapter is relevant.

• Hubbard, J. Vector Calculus, Linear Algebra and Differential Forms: A Unified Approach, 4th ed, Matrix Editions.

Chapter 2:


• Stillwell, J. Naive Lie Theory, Springer. (only for the Lie Groups content)

• Bröcker, T. Dieck, T. Representations of compact Lie groups, GTM 98, Springer.

Chapter 3:


Chapter 4:


• Sakai. Riemannian Geometry. Translations of Mathematical Monographs, vol 149. AMS. 1996.

Chapter 5:


• Sakai. Riemannian Geometry. Translations of Mathematical Monographs, vol 149. AMS. 1996.

• Thurston. Three-dimensional geometry and topology, Vol 1, Princeton University Press 1997.

Chapter 6:

• Hirsch, M. Differential Topology, GTM 33, Springer-Verlag.

Chapter 7:

• Bröcker, T. Dieck, T. Representations of compact Lie Groups, Springer GTM 98.


Sets, functions, linear algebra

Sets and functions

Below is a brief summary of notations used for sets and functions. $\{1, 2, 3, 3, 4\}$ could be described as the set that contains every integer from 1 to 4. We use curly braces to indicate that everything contained inside of them is considered an element of that set. $\{\} = \emptyset$ is the empty set, the set that contains no elements. $\{\{\}\}$ is the set that contains the empty set, so $\{\{\}\} \neq \{\}$, since the set on the left contains a set, while the set on the right does not contain anything. Two sets are equal if and only if they contain precisely the same objects:

$$S = R \iff (s \in S \implies s \in R) \text{ and } (r \in R \implies r \in S).$$

Sets are frequently defined as subsets of other sets,

$$X = \{p \in \mathbb{Q} : p^2 + p - 2 = 0\}.$$

An ordered pair $(a, b)$ is defined in various ways in terms of sets; a common one is $(a, b) = \{\{a\}, \{a, b\}\}$. The key property of the ordered pair is that two ordered pairs are equal, $(a, b) = (c, d)$, if and only if $a = c$ and $b = d$. One can similarly define ordered triples, etc.

We will use $A \subset B$ for `A is a subset of B', meaning

$$a \in A \implies a \in B.$$

Similarly, we use $A \cap B$ and $A \cup B$ for intersections and unions of pairs of sets, and $A \setminus B$ for the complement of $B$ in $A$.

$$A \cup B = \{a : a \text{ is a member of at least one of } A \text{ or } B\}$$

$$A \cap B = \{a : a \text{ is a member of both } A \text{ and } B\}$$

$$A \setminus B = \{a \in A : a \notin B\}.$$

We use the notation $\mathbb{N}$ for the natural numbers, $\mathbb{N} = \{1, 2, 3, \cdots\}$; $\mathbb{Z}$ for the integers, $\mathbb{Z} = \{\cdots, -3, -2, -1, 0, 1, 2, 3, \cdots\}$; $\mathbb{Q}$ for the rational numbers; $\mathbb{R}$ for the real numbers; and $\mathbb{C}$ for the complex numbers. We will frequently blur over the equivalence of $\mathbb{R}^2$ and $\mathbb{C}$ via $x + iy \in \mathbb{C}$ being considered essentially the same as $(x, y) \in \mathbb{R}^2$. So $i$ as a complex number is $(0, 1) \in \mathbb{R}^2$ and the real number 1 as a complex number is $(1, 0) \in \mathbb{R}^2$.

If $S_i$ is a set for all $i \in I$ (with $I$ a set), we call the collection $\{S_i : i \in I\}$ an indexed collection of sets. Most often $I$ is a subset of $\mathbb{N}$, but sometimes we will use larger index sets. Intersections and unions make sense for indexed collections of sets as well:

$$\bigcap_{i \in I} S_i = \{s : s \in S_i \ \forall i \in I\}$$

$$\bigcup_{i \in I} S_i = \{s : \exists i \in I \text{ such that } s \in S_i\}$$

A function is a triple $(D, R, f)$ where both $D$ and $R$ are sets. $D$ is the domain of the function, $R$ is the range. $f \subset D \times R$ has the property that $(\{d\} \times R) \cap f$ is a set consisting of precisely one element, for all $d \in D$. In this context, $f \subset D \times R$ is frequently called the graph of the function $(D, R, f)$. If $(d, v) \in f$ we write $f(d) = v$. Frequently people are rather sloppy with functional notation and simply call $f$ the function, while the graph is specified as $\mathrm{graph}(f) \subset D \times R$.

So for us, functions have a well-defined domain, they are defined everywhere on their domain, and at each point of their domain they have precisely one value. The set

$$\{r \in R : \exists d \in D \text{ such that } f(d) = r\}$$


is called the image of the function, and is denoted $\mathrm{img}(f) = f(D)$, thus $\mathrm{img}(f) \subset R$.

The composite of two functions $f : D \to R$ and $g : A \to B$ is defined provided $\mathrm{img}(f) \subset A$, and it is denoted $g \circ f : D \to B$ and defined by $(g \circ f)(d) = g(f(d))$. A right-inverse for $f$ is a function $g : R \to D$ such that $f \circ g = \mathrm{Id}_R$ where $\mathrm{Id}_R : R \to R$ is the identity function, meaning $\mathrm{Id}_R(x) = x$ for all $x \in R$. A left-inverse for $f$ is a function $g : R \to D$ such that $g \circ f = \mathrm{Id}_D$. A function is one-to-one if it has a left inverse and onto if it has a right inverse. If a function is both one-to-one and onto, the left and right inverses of $f$ are identical, and called the inverse to $f$, and denoted $f^{-1} : R \to D$.

Given $f : D \to R$ and a set $B \subset R$, the pre-image of $B$ is denoted $f^{-1}(B) = \{d \in D : f(d) \in B\}$. Although the notation $f^{-1}$ is used here, there is no assumption that $f$ is invertible. When $f$ is invertible, the pre-image of $B$ agrees with the image of $B$ under the inverse function, so this is consistent with our previous notation.

One common abuse of notation is the symbol $f^{-1}(r)$ when $f$ is not an invertible function. In these situations $f^{-1}(r)$ is a synonym for $f^{-1}(\{r\})$, i.e. it is a subset of $D$, not just an element of $D$.
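For readers who like to experiment, images and pre-images can be sketched in a few lines of Python, with finite sets standing in for the domain and range. The names `image`, `preimage` and the sample function `square` are illustrative choices, not notation from these notes.

```python
def image(f, domain):
    """img(f) = {f(d) : d in domain}."""
    return {f(d) for d in domain}


def preimage(f, domain, subset):
    """f^{-1}(B) = {d in domain : f(d) in B}; f need not be invertible."""
    return {d for d in domain if f(d) in subset}


D = {-2, -1, 0, 1, 2}


def square(d):
    return d * d


# f^{-1}(r) is shorthand for f^{-1}({r}): a subset of D, not an element of D.
fibre_over_4 = preimage(square, D, {4})
fibre_over_3 = preimage(square, D, {3})
```

Here `square` is neither one-to-one nor onto the integers, yet its pre-images are perfectly well defined; a pre-image may contain several points or none at all.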

Given a set $I$ and a set $X$, the exponential notation $X^I$ is meant to indicate the set of functions

$$X^I = \{f : I \to X\}.$$

This notation is fairly suggestive, and for a good reason. There is a canonical bijection between $(X^I)^J$ and $X^{I \times J}$ called the adjoint. The idea is that if one has a function $f : J \to X^I$ one can define a function $\tilde{f} : I \times J \to X$ by $\tilde{f}(i, j) = f(j)(i)$.
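Programmers know this bijection as currying. A minimal sketch follows; the helper names `adjoint` and `coadjoint` and the sample function are assumptions made for illustration only.

```python
def adjoint(f):
    """Turn f : J -> X^I into a function I x J -> X sending (i, j) to f(j)(i)."""
    return lambda i, j: f(j)(i)


def coadjoint(g):
    """Inverse direction: g : I x J -> X becomes a function J -> X^I."""
    return lambda j: (lambda i: g(i, j))


# Hypothetical example with I = J = X = the integers.
def f(j):
    return lambda i: 10 * j + i


f_tilde = adjoint(f)
f_back = coadjoint(f_tilde)
```

Applying `coadjoint` after `adjoint` returns a function with the same values as `f`, which is the content of the bijection.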

Vector spaces and linear functions

We will be assuming basic familiarity with vector spaces over fields, primarily vector spaces over $\mathbb{R}$ and $\mathbb{C}$. This section is largely a review of notation to do with linear functions and inner products. The linear algebra references at the start of these notes and the Wikipedia page for vector spaces are a good review of the basics before continuing below.

Given a vector space $V$ over a field $F$, and another vector space $W$ over the same field $F$, a linear function $f : V \to W$ is a function which satisfies $f(av + bw) = af(v) + bf(w)$ for all $v, w \in V$ and $a, b \in F$.

A linear function $f : V \to W$ is a monomorphism if it is one-to-one and an epimorphism if it is onto. It is an isomorphism if it is both one-to-one and onto. Notice, the inverse of an isomorphism is also linear (and an isomorphism).

One primary purpose of a basis for a vector space is to represent elements. Let $\beta = \{e_i : i \in I\}$ be a basis for $V$. This means that given $v \in V$ there is a unique way of writing $v = \sum_{i \in I} c_i e_i$ where only finitely many of the $c_i \in F$ are non-zero. There is a corresponding function

$$[\cdot]_\beta : V \to F^I$$

which sends $v \in V$ to the coefficients $(c_i)_{i \in I}$, called the representation function (with respect to $\beta$). The subspace of $F^I$ where only finitely many terms are non-zero is typically denoted $\bigoplus_{i \in I} F$ and called the direct sum over $I$ of $F$. A basic observation is that $[\cdot]_\beta : V \to \bigoplus_{i \in I} F$ is an isomorphism of vector spaces. Given $v \in V$, $[v]_\beta = (c_i)_{i \in I}$ where $v = \sum_{i \in I} c_i e_i$. When $I = \{1, 2, \cdots, n\}$, $F^I = \bigoplus_{i \in I} F = \bigoplus_{i=1}^n F$ is simply denoted $F^n$.

Typical examples of vector spaces in this course include:


• $F^n$, the standard $n$-dimensional vector space over $F$. When $F = \mathbb{R}$ it is called $n$-dimensional Euclidean space. We will frequently interpret this as either a space of row vectors or column vectors, depending on notational convenience.

• The set of all linear functions $f : V \to W$ is a vector space with operations of addition $(f + g)(v) = f(v) + g(v)$ and scalar multiplication $(cf)(v) = cf(v)$. This space is typically denoted $\mathrm{Hom}(V, W)$. Notice that if both $V$ and $W$ are finite-dimensional, then $\dim(\mathrm{Hom}(V, W)) = \dim(V) \cdot \dim(W)$. An easy way to see this is to choose bases for $V$ and $W$ and take the matrix representative of the linear function.

• Similarly, the set of all $m \times n$ matrices is a vector space, with addition and scalar multiplication both being entry-wise.

• Given a vector space $V$, the dual space $V^* = \mathrm{Hom}(V, F)$ is the space of linear functions to the base field. The vector space operations are defined as in the second item.

Given a linear function $f : V \to W$ and bases for $V$ and $W$ respectively, $\alpha = \{e_j : j \in J\} \subset V$ and $\beta = \{e'_i : i \in I\} \subset W$, the matrix representation of $f$ is denoted $[f]_\alpha^\beta$. It is defined so that the $(i, j)$-th entry of the matrix is denoted $f_{ij}$, and this is the coefficient of $e'_i$ when one represents $f(e_j)$ with respect to the basis $\beta$, i.e. $[f(e_j)]_\beta = (f_{ij})_{i \in I}$.

The key property of this construction is that if $g : W \to U$ is linear and $\gamma$ is a basis for $U$, then $[g \circ f]_\alpha^\gamma = [g]_\beta^\gamma [f]_\alpha^\beta$, where the right-hand side of this equation is the matrix product of $[g]_\beta^\gamma$ and $[f]_\alpha^\beta$. Similarly, using matrix multiplication, $[f(v)]_\beta = [f]_\alpha^\beta [v]_\alpha$, where we interpret $[v]_\alpha$ as a column vector.

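A key property of matrix representations is how one views a change of coordinates. Let $f : V \to V$ be linear, and let $\alpha$ and $\beta$ be two bases for $V$. Then the above formula tells us that

$$[f]_\beta^\beta = [\mathrm{Id}_V]_\alpha^\beta [f]_\alpha^\alpha [\mathrm{Id}_V]_\beta^\alpha$$

where $\mathrm{Id}_V : V \to V$ is the identity function $\mathrm{Id}_V(v) = v$. Moreover, since $\mathrm{Id}_V = \mathrm{Id}_V \circ \mathrm{Id}_V$, $I = [\mathrm{Id}_V]_\beta^\beta = [\mathrm{Id}_V \circ \mathrm{Id}_V]_\beta^\beta = [\mathrm{Id}_V]_\alpha^\beta \cdot [\mathrm{Id}_V]_\beta^\alpha$, i.e. $[\mathrm{Id}_V]_\alpha^\beta = ([\mathrm{Id}_V]_\beta^\alpha)^{-1}$. Thus

$$[f]_\beta^\beta = ([\mathrm{Id}_V]_\beta^\alpha)^{-1} [f]_\alpha^\alpha [\mathrm{Id}_V]_\beta^\alpha$$

i.e. representing $f$ with respect to two different bases is the same as conjugating the matrix representative by the change-of-basis matrix. The entries of $[\mathrm{Id}_V]_\beta^\alpha$ are simply the coefficients one uses to represent the basis $\beta$ with respect to the basis $\alpha$.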

One way in which this plays a role is that determinants are originally defined for matrices. But given a linear function $f : V \to V$, we can define $\det(f) = \det([f]_\alpha^\alpha)$ and by the above formula, this definition does not depend on the choice of $\alpha$, since

$$\det([f]_\beta^\beta) = \det(([\mathrm{Id}_V]_\beta^\alpha)^{-1} [f]_\alpha^\alpha [\mathrm{Id}_V]_\beta^\alpha) = \det(([\mathrm{Id}_V]_\beta^\alpha)^{-1}) \det([f]_\alpha^\alpha) \det([\mathrm{Id}_V]_\beta^\alpha) = \det([f]_\alpha^\alpha).$$

Thus the determinant is a well-defined geometric invariant of a linear function on any finite-dimensional vector space. Similarly, the trace $\mathrm{tr}(f)$ of a linear function $f : V \to V$ is well-defined, since for matrices $\mathrm{tr}(AB) = \mathrm{tr}(BA)$. Thus both concepts represent geometric invariants of linear maps.
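A small exact-arithmetic check of this invariance, for one hypothetical $2 \times 2$ example. The matrices `f_alpha` and `P` are made-up data, and `P` plays the role of the change-of-basis matrix whose columns express the new basis in the old coordinates.

```python
from fractions import Fraction


def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def inv2(A):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    d = det2(A)
    return [[A[1][1] / d, -A[0][1] / d],
            [-A[1][0] / d, A[0][0] / d]]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


F = Fraction
f_alpha = [[F(2), F(1)], [F(0), F(3)]]   # matrix of f in the basis alpha
P = [[F(1), F(1)], [F(1), F(2)]]         # columns: beta written in alpha coordinates
f_beta = matmul(matmul(inv2(P), f_alpha), P)  # conjugation by the change of basis
```

The two matrices `f_alpha` and `f_beta` look different, but their determinants and traces agree, as the argument above predicts.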

Many dual spaces

The construction of the dual vector space $V^*$ to a given vector space $V$ is one that is typically done often and with many different input vector spaces $V$. Because of this we are particularly interested in how $V_1^*$ relates to $V_2^*$ when the vector spaces $V_1$ and $V_2$ are themselves related. Given a linear map $f : V_1 \to V_2$, there is a corresponding linear function $f^* : V_2^* \to V_1^*$ defined by $f^*(g) = g \circ f$, where $g \in V_2^*$, i.e. $g : V_2 \to F$. $f^*$ is called the pull-back function, and $f^*(g)$ is the pull-back of $g$ (by $f$).

The pull-back satisfies two basic properties:

(1) If $f_1 : V_1 \to V_2$ and $f_2 : V_2 \to V_3$ are linear, then $(f_2 \circ f_1)^* = f_1^* \circ f_2^*$.

(2) If $I : V \to V$ is the identity function, i.e. $I(v) = v$ for all $v \in V$, then $I^* : V^* \to V^*$ is the identity function on $V^*$, i.e. $I^*(g) = g$ for all $g \in V^*$.

In a branch of mathematics called abstract nonsense (now known as category theory) the above properties would be stated as `the dual space construction is a contravariant functor from the category of vector spaces to the category of vector spaces'. A functor is a special type of function. A functor is a procedure that generates new objects $V^*$ from old $V$, but it also produces maps between the new objects $f^* : V_2^* \to V_1^*$ when you have a map between the original objects $f : V_1 \to V_2$. The statement that it is contravariant means that the direction of the new map is reversed from the original. To be covariant the induced function $f^*$ would have to be of the form $f^* : V_1^* \to V_2^*$. An example of a covariant functor is the double-dual construction $V \longmapsto (V^*)^*$, or more simply, the identity functor $V \longmapsto V$.

Proposition 0.1. For any vector space there is a canonical monomorphism $V \to (V^*)^*$. When $V$ is finite-dimensional this map is an isomorphism. The map is called duality. The definition is that one sends $v \in V$ to $\phi(v) \in (V^*)^*$ defined by $\phi(v)(g) = g(v)$, where $g \in V^*$.

This map is natural in the sense that if $f : V \to W$ is a map, then one has an induced map $(f^*)^* : (V^*)^* \to (W^*)^*$, and $(f^*)^* \circ \phi = \phi \circ f$.

The point about naturality is that if V = W , f could be viewed as an arbitrary change of coordinates

on V . The last identity is saying the map φ `looks the same' in all coordinate systems.

The above proposition is stating that the isomorphism between V and (V ∗)∗ is independent of

coordinates.

Proof. If $v \neq 0$, let $g \in V^*$ be any function such that $g(v) \neq 0$. Such a $g$ exists since $v$ can be extended to a basis of $V$. Thus $\phi(v) \neq 0$. $\phi$ is of course a linear function since $\phi(av + bw)(g) = g(av + bw) = ag(v) + bg(w) = (a\phi(v) + b\phi(w))(g)$.

To check that $\phi$ is an isomorphism when $V$ is finite-dimensional, let $\beta = \{b_1, b_2, \cdots, b_n\}$ be a basis for $V$. A standard method to construct a basis for $V^*$ is called the dual basis and it goes like this. The dual basis $\beta^* = \{b_1^*, b_2^*, \cdots, b_n^*\}$ is defined on the basis $\beta$ by the rule $b_i^*(b_j) = \delta_{ij}$ where $\delta_{ij}$ is the Kronecker delta, meaning $\delta_{ij} = 1$ when $i = j$ and $\delta_{ij} = 0$ otherwise. Let $\beta^{**} = \{b_1^{**}, b_2^{**}, \cdots, b_n^{**}\}$ be the basis for $(V^*)^*$ dual to $\beta^*$. Then by design $b_i^{**}(b_j^*) = \delta_{ij}$. But notice, $\phi(b_i)(b_j^*) = b_j^*(b_i) = \delta_{ij}$. Thus, $b_i^{**} = \phi(b_i)$, so $\phi$ carries a basis to a basis. We leave it as an exercise to verify $\beta^*$ is a basis of $V^*$.

Regarding the naturality statement, it is simply a careful chase through the definitions. Let $v \in V$ and $g \in W^*$. By definition $(\phi \circ f)(v)(g) = \phi(f(v))(g) = g(f(v)) = (g \circ f)(v)$. Similarly, $((f^*)^* \circ \phi)(v)(g) = (\phi(v) \circ f^*)(g) = \phi(v)(g \circ f) = (g \circ f)(v)$.
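In coordinates the dual basis can be computed by inverting a matrix: if the basis vectors are the columns of a matrix, the dual basis covectors are the rows of its inverse. The sketch below assumes this standard fact; the helper `inverse`, the pairing function `pair` and the sample basis are illustrative choices, not part of the notes.

```python
from fractions import Fraction


def inverse(M):
    """Gauss-Jordan inverse of an invertible square matrix of Fractions."""
    n = len(M)
    # Augment M with the identity matrix.
    A = [list(row) + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(M)]
    for col in range(n):
        # Swap a row with a non-zero pivot into place, then normalize it.
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [x / p for x in A[col]]
        # Clear the pivot column in every other row.
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    return [row[n:] for row in A]


# Hypothetical basis b_1, b_2, b_3 of Q^3, one vector per list entry.
basis = [[Fraction(1), Fraction(0), Fraction(0)],
         [Fraction(1), Fraction(1), Fraction(0)],
         [Fraction(1), Fraction(1), Fraction(1)]]

# Matrix whose columns are the basis vectors; row i of its inverse is b_i^*.
B = [[basis[j][i] for j in range(3)] for i in range(3)]
dual = inverse(B)


def pair(covector, vector):
    """Evaluate a covector (row of coefficients) on a vector."""
    return sum(c * x for c, x in zip(covector, vector))
```

With these definitions `pair(dual[i], basis[j])` is the Kronecker delta, which is exactly the defining rule for the dual basis.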

One can check that the isomorphism $V \to V^*$ defined by sending $b_i$ to $b_i^*$ is not a natural isomorphism. So this is a rather odd situation. There is an isomorphism $V \to V^*$ for every finite-dimensional vector space $V$, but it requires a choice of basis. But if one composes the two isomorphisms $V \to V^*$ and $V^* \to V^{**}$ one gets a canonical isomorphism $V \to V^{**}$. Although this might seem paradoxical, it relies heavily on choosing $\beta^*$ and $\beta^{**}$ inductively as dual bases.


When $V$ isn't finite-dimensional, we leave it as an exercise to prove $V$ and $V^*$ are not isomorphic because the number of elements in a basis for $V^*$ is strictly larger than for $V$. So $V$ and $(V^*)^*$ are similarly not isomorphic, although $V$ naturally embeds in $(V^*)^*$ via the map $\phi : V \to (V^*)^*$.

Norms and inner products

Definition 0.2. A norm on a vector space $V$ over the field $\mathbb{R}$ is a function $\mu : V \to \mathbb{R}$ which satisfies:

• $\mu(cv) = |c|\mu(v)$ for all $c \in \mathbb{R}$ and $v \in V$.

• $\mu(v + w) \leq \mu(v) + \mu(w)$ for all $v, w \in V$.

• $\mu(v) \geq 0$ always, with $\mu(v) > 0$ if $v \neq 0$.

In the above, $|c|$ is the absolute value of $c \in \mathbb{R}$. Typically one denotes a norm with a symbol similar to the absolute value, like $\mu(v) = \|v\|$ or $\mu(v) = |v|$.

A bilinear form on a vector space $V$ over a field $F$ is a function $\mu : V \times V \to F$ which satisfies $\mu(av + bw, u) = a\mu(v, u) + b\mu(w, u)$ and $\mu(v, aw + bu) = a\mu(v, w) + b\mu(v, u)$ for all $a, b \in F$ and $v, w, u \in V$. The adjoint of a bilinear form is a linear map denoted $\mu^* : V \to V^*$, where $\mu^*(v)(w) = \mu(v, w)$.

• We say $\mu$ is non-degenerate if $\mu^* : V \to V^*$ is a monomorphism.

• $\mu$ is said to be symmetric if $\mu(v, w) = \mu(w, v)$ always.

• $\mu$ is anti-symmetric if $\mu(v, w) = -\mu(w, v)$ always.

• When $F = \mathbb{R}$ the form is positive definite if $\mu(v, v) > 0$ for all $v \neq 0$.

• A positive definite, symmetric bilinear form is called an inner product.

• If $F = \mathbb{C}$, a form is said to be sesquilinear if $\mu(av + bw, u) = a\mu(v, u) + b\mu(w, u)$ and $\mu(v, aw + bu) = \overline{a}\mu(v, w) + \overline{b}\mu(v, u)$.

• If $F = \mathbb{C}$ the form $\mu$ is conjugate-symmetric if $\mu(v, w) = \overline{\mu(w, v)}$.

• A sesquilinear, conjugate-symmetric form is called Hermitian.

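Notice that an inner product always produces a norm, $|v| = \sqrt{\mu(v, v)}$, but not all norms come from inner products. One way to see this is that $|v|^2 = \mu(v, v)$ so $\mu(v + w, v + w) = \mu(v, v) + 2\mu(v, w) + \mu(w, w)$ by bilinearity and symmetry. Thus, $\mu(v, w) = \frac{1}{2}(|v + w|^2 - |v|^2 - |w|^2)$, so the inner product is determined by its norm, and the right-hand side must be bilinear in $v$ and $w$. For example, the norm on $\mathbb{R}^n$ (with $n \geq 2$) given by $|(x_1, \cdots, x_n)| = \max\{|x_1|, \cdots, |x_n|\}$ does not come from an inner product.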
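One way to convince oneself about the max norm is to feed it into the formula $\frac{1}{2}(|v + w|^2 - |v|^2 - |w|^2)$ and observe that the result is not additive in $v$. The sketch below does this in $\mathbb{R}^2$; the function names are illustrative.

```python
def max_norm(v):
    return max(abs(x) for x in v)


def polarization(norm, v, w):
    """Candidate inner product (|v + w|^2 - |v|^2 - |w|^2) / 2."""
    s = [a + b for a, b in zip(v, w)]
    return (norm(s) ** 2 - norm(v) ** 2 - norm(w) ** 2) / 2


e1, e2 = (1, 0), (0, 1)
u = (1, 1)  # u = e1 + e2

# If the max norm came from an inner product these two would be equal.
lhs = polarization(max_norm, u, e1)
rhs = polarization(max_norm, e1, e1) + polarization(max_norm, e2, e1)
```

Since `lhs` and `rhs` differ, the candidate form is not linear in its first slot, so no inner product induces the max norm on $\mathbb{R}^2$.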

Example 0.3. a) The standard inner product on $\mathbb{R}^n$ is $\mu(v, w) = \sum_{i=1}^n v_i w_i$ where $v = \sum_{i=1}^n v_i e_i$ and $w = \sum_{i=1}^n w_i e_i$. In this case, $\mu(v, w)$ is frequently denoted $v \cdot w$, the `dot product'.

b) The standard Hermitian form on $\mathbb{C}^n$ is $\mu(v, w) = \sum_{i=1}^n v_i \overline{w_i}$.

c) The (standard) Minkowski space is $\mathbb{R}^n$ equipped with the form of signature $(n - 1, 1)$ given by $\mu(v, w) = -v_1 w_1 + v_2 w_2 + \cdots + v_n w_n$.

d) The standard signature $(a, b)$-form on $\mathbb{R}^n$ where $a + b = n$ is given by $\mu(v, w) = -(v_1 w_1 + v_2 w_2 + \cdots + v_b w_b) + (v_{b+1} w_{b+1} + \cdots + v_n w_n)$, so that (c) is the case $(a, b) = (n - 1, 1)$.

Notice that items (a) and (b) are positive definite, while (c) is not, and (d) is positive definite only when $b = 0$.

Proposition 0.4. If $\mu$ is an inner product on a finite-dimensional vector space $V$ over $\mathbb{R}$, the map $\mu^* : V \to V^*$ is an isomorphism.

So while in a typical vector space there is no natural isomorphism between $V$ and $V^*$, there is such an isomorphism in an inner product space. One can go further and define an inner product on $V^*$ so that the map $\mu^* : V \to V^*$ is an isometry; then Proposition 0.4 gives a further isomorphism $V^* \to (V^*)^*$. Is the composite $V \to V^* \to (V^*)^*$ the canonical isomorphism? We leave it to the reader as an exercise to check this is true.

Proposition 0.5. Inner products satisfy the Cauchy-Schwarz inequality, that

$$|\langle v, w \rangle| \leq |v||w| \text{ for all } v, w \in V.$$

Proof. Consider $0 \leq |v - tw|^2 = \langle v - tw, v - tw \rangle = |v|^2 - 2t\langle v, w \rangle + t^2|w|^2$. Cauchy-Schwarz is true if either $v$ or $w$ is zero, so consider both non-zero. Let $t = \frac{\langle v, w \rangle}{|w|^2}$. Then

$$|v|^2 - 2t\langle v, w \rangle + t^2|w|^2 = |v|^2 - 2\frac{\langle v, w \rangle^2}{|w|^2} + \frac{\langle v, w \rangle^2 |w|^2}{|w|^4} = |v|^2 - \frac{\langle v, w \rangle^2}{|w|^2},$$

so $0 \leq |v|^2|w|^2 - \langle v, w \rangle^2$, giving the result. Notice that this further proves that equality holds in the Cauchy-Schwarz inequality if and only if $v$ and $w$ are linearly dependent.

Definition 0.6. Given an inner product, one says a vector $v$ is of unit length if $\mu(v, v) = 1$. One says vectors $v, w$ are orthogonal if $\mu(v, w) = 0$. One says they are orthonormal if they are orthogonal and of unit length.

An isomorphism $f : V \to W$ between two spaces equipped with bilinear forms is called an isometry if $\mu_W(f(v), f(w)) = \mu_V(v, w)$ for all $v, w \in V$.

The Gram-Schmidt method is a technique for taking two linearly independent vectors $v, w$ and replacing them with orthonormal vectors that span the same subspace of $V$. The idea has many variants. A key idea is to subtract off the projection of one vector onto another. Specifically, $w' = w - \frac{\mu(v, w)}{\mu(v, v)} v$. By bilinearity

$$\mu(v, w') = \mu(v, w) - \frac{\mu(v, w)}{\mu(v, v)} \mu(v, v) = 0.$$

We leave it to the reader to check that if $v$ and $w$ are linearly independent, $v$ and $w'$ are as well. The output of the Gram-Schmidt process is the vectors $\frac{v}{|v|}$ and $\frac{w'}{|w'|}$. One can apply this process to arbitrary linearly independent sets in $V$ by simply repeating the process. There are many further variants of this simple idea.
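The repeated process can be sketched as follows, using floating-point arithmetic and the dot product by default. This is one of the variants alluded to above: each vector is normalized as soon as it is produced, so $\mu(u, u) = 1$ and the projection coefficient reduces to $\mu(u, w)$. The function name `gram_schmidt` and its `mu` parameter are illustrative.

```python
import math


def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


def gram_schmidt(vectors, mu=dot):
    """Orthonormalize a linearly independent list with respect to mu."""
    ortho = []
    for w in vectors:
        # Subtract the projection of w onto each earlier unit vector.
        for u in ortho:
            c = mu(u, w)
            w = [wi - c * ui for wi, ui in zip(w, u)]
        length = math.sqrt(mu(w, w))
        ortho.append([wi / length for wi in w])
    return ortho


basis = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
```

Up to rounding error, the vectors in `basis` have pairwise inner products equal to the Kronecker delta.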

Corollary 0.7. Any two finite-dimensional inner product spaces of the same dimension are isometric. In particular, by Gram-Schmidt, any such space has an orthonormal basis.

So finite-dimensional inner product spaces look like $\mathbb{R}^n$.

Non-degenerate bilinear forms which are not inner products are also frequently of interest, in particular because they provide the local model for relativity (gravitation). We mention a key theorem regarding the diversity of bilinear forms that are not inner products.

Given a non-degenerate symmetric bilinear form $\mu$ on a finite-dimensional vector space $V$ over $\mathbb{R}$, or a non-degenerate Hermitian form $\mu$ on a finite-dimensional vector space over $\mathbb{C}$, the function $V \to F$ given by $v \longmapsto \mu(v, v)$ is called the quadratic form associated to $\mu$. There is an invariant called the signature of such a form. One way of stating the signature is that $V$ decomposes as a direct sum $V = V_p \oplus V_m$ where $\mu$ restricts to a positive-definite form on $V_p$ and a negative-definite form on $V_m$, in which case the signature of $\mu$ is the pair $(\dim(V_p), \dim(V_m))$. Some people prefer to call the difference $\dim(V_p) - \dim(V_m)$ the signature as it contains the same information (provided you know $\dim(V)$). Thus, one can find a basis $\beta = \{b_1, b_2, \cdots, b_n\}$ such that $\mu(b_i, b_j) = \pm\delta_{ij}$. The fact that the dimensions of $V_p$ and $V_m$ are well-defined is known as Sylvester's Law of Inertia. There is also a variant of this theorem in the case the form is degenerate, in which case there is one additional number that describes $\mu$, the dimension of the kernel of $\mu^* : V \to V^*$.

Definition 0.8. The angle $0 \leq \theta \leq \pi$ between two non-zero vectors $v, w$ in an inner product space is defined by $\cos\theta = \frac{\langle v, w \rangle}{|v||w|}$. Two vectors are orthogonal if $\langle v, w \rangle = 0$, i.e. $\theta = \pi/2$. A collection of vectors $v_1, \cdots, v_k$ is orthonormal if $\langle v_i, v_j \rangle = \delta_{ij}$ for all $i, j$.

Definition 0.9. A Minkowski space in general will denote any $n$-dimensional vector space $V$ over $\mathbb{R}$ with a form $\mu$ of signature $(n - 1, 1)$. A vector $v$ in a Minkowski space $V$ is timelike if $\mu(v, v) < 0$, spacelike if $\mu(v, v) > 0$ and null or lightlike if $\mu(v, v) = 0$. The Minkowski pseudo-norm of a vector is defined as $|v| = \sqrt{|\mu(v, v)|}$.

The Minkowski pseudo-norm is not technically a norm, since there are non-zero lightlike vectors.

Proposition 0.10. Every finite-dimensional real vector space with a symmetric non-degenerate bilinear form is isometric to $\mathbb{R}^n$ (for some $n$) together with a signature $(a, b)$-form for some uniquely determined pair $(a, b)$.

Proposition 0.10 is a restatement of Sylvester's Law of Inertia.

Adjoints

Let $V$ and $W$ be vector spaces. If $f : V \to W$ is linear, there is the induced dual map $f^* : W^* \to V^*$. If both $V$ and $W$ are finite-dimensional inner product spaces, there are isomorphisms $V \simeq V^*$ and $W \simeq W^*$ coming from the inner product. If we compose $f^*$ with these isomorphisms, we get a new map $\mathrm{adj}(f) : W \to V$, making the diagram commute.

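$$\begin{array}{ccc}
W^* & \xrightarrow{\;f^*\;} & V^* \\
\uparrow & & \uparrow \\
W & \xrightarrow{\;\mathrm{adj}(f)\;} & V
\end{array}$$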

The vertical maps $W \to W^*$ are given by $w \longmapsto \langle w, \cdot \rangle$, where the angle-brackets indicate the inner product on $W$, similarly for $V \to V^*$. One way to express this commutative diagram is that $\mathrm{adj}(f) : W \to V$ is the unique linear function such that

$$\langle w, f(v) \rangle = \langle \mathrm{adj}(f)(w), v \rangle$$

for all $w \in W$ and $v \in V$.

Proposition 0.11. If $\alpha = \{v_1, \cdots, v_n\}$ is an orthonormal basis for $V$ and $\beta = \{w_1, \cdots, w_m\}$ is an orthonormal basis for $W$, with $f : V \to W$ linear, let $A = [f]_\alpha^\beta$ and $B = [\mathrm{adj}(f)]_\beta^\alpha$. Then $B = A^t$, i.e. the representation of the adjoint is the transpose.
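A quick numerical sanity check of this proposition in the standard orthonormal bases of $\mathbb{R}^3$ and $\mathbb{R}^2$. The matrix `A` and the vectors `v`, `w` are arbitrary sample data chosen for illustration.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]


def apply(A, v):
    """Matrix-vector product, with A stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


# Hypothetical f : R^3 -> R^2 written in the standard bases.
A = [[1, 2, 0],
     [-1, 3, 4]]
B = transpose(A)  # candidate matrix for adj(f) : R^2 -> R^3

v = [2, -1, 5]
w = [3, 7]
left = dot(w, apply(A, v))    # <w, f(v)>
right = dot(apply(B, w), v)   # <adj(f)(w), v>
```

Both sides evaluate to the same number, as the defining identity of the adjoint requires.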

Multi-linear algebra and determinants

Definition 0.12. Given vector spaces $V_i$ for $i \in \{1, 2, \cdots, k\}$ and $W$ over a field $F$, a function $f : V_1 \times V_2 \times \cdots \times V_k \to W$ is multi-linear (also sometimes called $k$-multi-linear) if when you fix all but one variable, it is linear in that variable.

Specifically, let $v_i \in V_i$ for all $i$ and let $w \in V_j$ for some fixed $j \in \{1, 2, \cdots, k\}$. We demand

$$f(v_1, \cdots, v_{j-1}, av_j + bw, v_{j+1}, \cdots, v_k) = af(v_1, \cdots, v_k) + bf(v_1, \cdots, v_{j-1}, w, v_{j+1}, \cdots, v_k)$$

for all $a, b \in F$, all $j$, and all vectors $v_1, \cdots, v_k$ and $w$.

Bilinear forms such as inner products are 2-multi-linear. Another commonly-known multi-linear function is the determinant. Think of the determinant of a matrix as a function of either the column vectors or row vectors. Then the determinant is an $n$-multi-linear function where $k = n$, $V_i = F^n$ for all $i$, and $W = F$.

A standard example of a bilinear function that never seems to be mentioned until one studies multi-linear functions is the evaluation map. It is a function $\alpha : V \times V^* \to F$ given by $\alpha(v, f) = f(v)$.

Multi-linear functions have familiar adjoints. In the case of $\alpha : V \times V^* \to F$, the adjoint is the function

$$\alpha^* : V \to (V^*)^*$$

given by $\alpha^*(v)(f) = \alpha(v, f) = f(v)$, i.e. the adjoint of the evaluation map is the canonical map between $V$ and $(V^*)^*$, which is an isomorphism when $V$ is finite-dimensional.

Similarly, the adjoint of an inner product $\mu : V \times V \to \mathbb{R}$ is the canonical isomorphism between $V$ and $V^*$ that one has in a finite-dimensional inner product space.

Example 0.13. Consider the determinant to be an $n$-linear function $(\mathbb{R}^n)^n \to \mathbb{R}$. If we write $(\mathbb{R}^n)^n = \mathbb{R}^n \times (\mathbb{R}^n)^{n-1}$, we can take the adjoint of the determinant to get a map:

$$\mathrm{Det} : (\mathbb{R}^n)^{n-1} \to (\mathbb{R}^n)^*$$

Consider $\mathbb{R}^n$ to be an inner product space with the standard Euclidean/dot product. Then there is the canonical isomorphism between $\mathbb{R}^n$ and its dual $(\mathbb{R}^n)^*$. The composite of these two maps $(\mathbb{R}^n)^{n-1} \to (\mathbb{R}^n)^* \to \mathbb{R}^n$ is called the cross product. When $n = 3$ this is the standard cross product in $\mathbb{R}^3$ that one learns in calculus courses. One can check it has the same heuristic description as the `determinant' of the $n \times n$ matrix where the top row consists of the standard basis vectors, and the remaining $n - 1$ rows are the vectors one is taking the cross product of. When $n = 2$ this is the map $\mathbb{R}^2 \to \mathbb{R}^2$ sending $(v_1, v_2)$ to $(v_2, -v_1)$, which rotates a vector by $\pi/2$, clockwise with the top-row convention just described (placing the free slot last instead gives the counter-clockwise rotation). This way of viewing the cross product is the big-concept way to see that the cross product is geometric, i.e. that $f(v) \times f(w) = f(v \times w)$ where $f : \mathbb{R}^3 \to \mathbb{R}^3$ is an orientation-preserving isometry.
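The heuristic `determinant with the standard basis in the top row' translates directly into code. The sketch below assumes that convention, i.e. the free slot of the determinant is the first row, so that the result $c$ satisfies $u \cdot c = \det(u, v_1, \cdots, v_{n-1})$ for every $u$. With this convention the $n = 2$ case sends $(v_1, v_2)$ to $(v_2, -v_1)$; putting the free slot last would flip that sign. The function names are illustrative.

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j in range(len(M)):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total


def cross(*vectors):
    """Generalized cross product of n - 1 vectors in R^n.

    Component i is the determinant whose top row is the i-th standard
    basis vector and whose remaining rows are the given vectors.
    """
    n = len(vectors) + 1
    standard = [[int(i == j) for j in range(n)] for i in range(n)]
    return [det([e] + [list(v) for v in vectors]) for e in standard]


def dot(v, w):
    return sum(a * b for a, b in zip(v, w))
```

For example, `cross((1, 2, 3), (4, 5, 6))` reproduces the calculus-course cross product, and the result is orthogonal to both inputs because a determinant with a repeated row vanishes.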

Proposition 0.14. The adjoint gives an isomorphism between the vector space of bilinear functions $V \times W \to U$ and the vector space of linear functions $V \to \mathrm{Hom}(W, U)$. The former is frequently called $\mathrm{Bilin}(V \times W, U)$ while the latter is $\mathrm{Hom}(V, \mathrm{Hom}(W, U))$.

Multi-linear functions are sometimes also called tensors.

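Definition 0.15. Let $V$ be a vector space over a field $F$ and $\mu : V^n \to W$ multi-linear. $\mu$ is said to be symmetric if $\mu(v_1, v_2, \cdots, v_n) = \mu(v_{\sigma(1)}, v_{\sigma(2)}, \cdots, v_{\sigma(n)})$ for all permutations $\sigma$ of the set $\{1, 2, \cdots, n\}$.

The multi-linear function $\mu$ is anti-symmetric (also known as alternating) if $\mu(v_1, v_2, \cdots, v_n) = (-1)^{|\sigma|}\mu(v_{\sigma(1)}, v_{\sigma(2)}, \cdots, v_{\sigma(n)})$ for all permutations $\sigma$, where $|\sigma|$ is the number of transpositions in an expression of $\sigma$ as a composite of transpositions. Only the parity of $|\sigma|$ matters, and it does not depend on the expression chosen.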


Theorem 0.16. Let $V$ be an $n$-dimensional vector space over the field $F$. Then the space of $k$-multi-linear functions $V^k \to F$ has dimension $n^k$.

The space of symmetric $k$-multi-linear functions $V^k \to F$ has dimension $\binom{n+k-1}{k}$; this is also sometimes denoted `$n$ multichoose $k$'.

Provided $2 \neq 0$ in the field $F$, the space of alternating $k$-multi-linear functions $V^k \to F$ has dimension $\binom{n}{k}$.

The condition $2 \neq 0$ is equivalent to $1 \neq -1$. When $2 = 0$ anti-symmetric forms are symmetric forms, so the second dimension count applies.

Among other things, Theorem 0.16 tells us that the space of alternating $n$-linear functions $V^n \to F$ has dimension 1. When $V = F^n$, the standard basis for this one-dimensional space is of course the determinant.

The proof is the most interesting part of the above theorem.

Proof. Let $\mu : V^k \to F$ be multi-linear, and let $\beta = \{e_1, \cdots, e_n\}$ be a basis for $V$. Then we can write arbitrary vectors $v_i \in V$ as linear combinations

$$v_i = \sum_{j=1}^n a_{ij} e_j$$

and expand $\mu$ using the multi-linearity

$$\mu(v_1, \cdots, v_k) = \mu\Big(\sum_{j=1}^n a_{1j} e_j, \cdots, \sum_{j=1}^n a_{kj} e_j\Big) = \sum_{j_1, \cdots, j_k} a_{1j_1} a_{2j_2} \cdots a_{kj_k} \mu(e_{j_1}, \cdots, e_{j_k})$$

where in the last sum all $j_1, \cdots, j_k$ run from 1 through $n$. Thus we can specify $\mu$ uniquely by its values on all possible $k$-tuples of basis vectors. The $k$-linear function $V^k \to F$ which takes value 1 on $(e_{j_1}, \cdots, e_{j_k})$ and which is zero on every other possible $k$-tuple of basis elements is a basis element for the space of $k$-linear functions, and there is one for every possible $j_1, \cdots, j_k \in \{1, 2, \cdots, n\}$, giving the dimension result. The $k$-multilinear function which is nonzero on precisely $(e_{j_1}, \cdots, e_{j_k})$ is frequently denoted $e_{j_1}^* \otimes \cdots \otimes e_{j_k}^*$. In this context, the symbols $\beta^* = \{e_1^*, \cdots, e_n^*\}$ are meant to be interpreted as the dual basis to the basis $\beta$. In general, if $f : V \to F$ and $g : W \to F$, the tensor product $f \otimes g : V \times W \to F$ is defined as $(f \otimes g)(v, w) = f(v)g(w)$. So if $f$ and $g$ are (multi-)linear, so is $f \otimes g$.

In the symmetric case, by symmetry one can always permute the entries of $\mu$ so that $j_1 \leq j_2 \leq \cdots \leq j_k$. Thus the dimension of the space of symmetric multi-linear functions $V^k \to F$ is the number of weakly increasing $k$-tuples in $\{1, 2, \cdots, n\}$.

In the anti-symmetric case with $2 \neq 0$, $\mu$ vanishes on any $k$-tuple of basis vectors with a repeated entry, so the same argument as above implies the dimension is the number of strictly increasing $k$-tuples $j_1 < j_2 < \cdots < j_k$. This number is denoted $\binom{n}{k}$ or `$n$ choose $k$,' the number of $k$-element subsets of an $n$-element set. It's rather simple to describe such a generator. Let $\pi : V \to F^k$ be the linear function that sends $\sum_{i=1}^n a_i e_i$ to the $k$-tuple $(a_{j_1}, a_{j_2}, \cdots, a_{j_k})$. Then the alternating $k$-linear function $V^k \to F$ corresponding to $j_1 < j_2 < \cdots < j_k$ is the determinant of the $k$-by-$k$ matrix $(\pi v_1, \pi v_2, \cdots, \pi v_k)$. In the literature, this element is typically denoted $e_{j_1}^* \wedge e_{j_2}^* \wedge \cdots \wedge e_{j_k}^*$.

As far as I'm aware, there is no conventional special notation for the basis for the symmetric multi-linear functions. Since every anti-symmetric multi-linear function is a multi-linear function, there must be some formula expressing $e^*_{j_1} \wedge \cdots \wedge e^*_{j_k}$ in terms of tensor products of the dual basis $\beta^*$. One can check that it is

\[ e^*_{j_1} \wedge \cdots \wedge e^*_{j_k} = \sum_{\sigma \in \Sigma_k} (-1)^{|\sigma|}\, e^*_{j_{\sigma(1)}} \otimes \cdots \otimes e^*_{j_{\sigma(k)}} \]

In the case $k = n$ this is precisely the expansion of the determinant function, written explicitly as a polynomial function in its entries.

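Given a multi-linear function $\mu : V^k \to F$, and provided the field $F$ is of characteristic zero, the symmetrization and anti-symmetrization of $\mu$ respectively can be formed:

\[ \mathrm{sym}(\mu) = \frac{1}{k!} \sum_{\sigma \in \Sigma_k} \mu.\sigma, \qquad \mathrm{asym}(\mu) = \frac{1}{k!} \sum_{\sigma \in \Sigma_k} (-1)^{|\sigma|} \mu.\sigma, \]

where we use the right action of $\Sigma_k$ on functions $V^k \to F$.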

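Given an $n$-form $\omega : V^n \to F$ and an $m$-form $\eta : V^m \to F$ the tensor product $\omega \otimes \eta : V^n \times V^m \to F$ is defined as

\[ (\omega \otimes \eta)(v_1, \cdots, v_n, v_{n+1}, \cdots, v_{n+m}) = \omega(v_1, \cdots, v_n) \cdot \eta(v_{n+1}, \cdots, v_{n+m}). \]

The wedge product is defined as

\[ \omega \wedge \eta = \frac{(n+m)!}{n!\, m!} \cdot \mathrm{asym}(\omega \otimes \eta). \]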

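Proposition 0.17. Both the tensor product and the wedge product are associative, i.e.

\[ \omega \otimes (\eta \otimes \zeta) = (\omega \otimes \eta) \otimes \zeta, \qquad \omega \wedge (\eta \wedge \zeta) = (\omega \wedge \eta) \wedge \zeta. \]

Moreover if the forms $\omega$ and $\eta$ are alternating then

\[ \omega \wedge \eta = (-1)^{nm}\, \eta \wedge \omega. \]

Finally, this notation is consistent with our previous notation for the wedge product.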
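The definitions of asym, the tensor product and the wedge product are concrete enough to compute with. The following is a minimal Python sketch, not part of the original notes: it represents a k-form on R^n as a function of k tuples, and the helper names (`sign`, `asym`, `tensor`, `wedge`, `dual`, `det`) are chosen here purely for illustration. On sample vectors in R^3 it compares the wedge of two dual basis elements with the corresponding 2-by-2 minor, swaps the factors, and compares both bracketings of the triple wedge with the determinant.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial, prod


def sign(perm):
    """Sign (-1)^|sigma| of a permutation given as a tuple of 0-based indices."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def asym(mu, k):
    """Anti-symmetrization (1/k!) * sum over sigma of (-1)^|sigma| mu.sigma."""
    def result(*vs):
        total = sum(
            sign(s) * mu(*(vs[i] for i in s)) for s in permutations(range(k))
        )
        return Fraction(total) / factorial(k)
    return result


def tensor(omega, n, eta, m):
    """Tensor product of an n-form omega and an m-form eta."""
    return lambda *vs: omega(*vs[:n]) * eta(*vs[n:])


def wedge(omega, n, eta, m):
    """Wedge product (n+m)!/(n! m!) * asym(omega tensor eta)."""
    coeff = factorial(n + m) // (factorial(n) * factorial(m))
    anti = asym(tensor(omega, n, eta, m), n + m)
    return lambda *vs: coeff * anti(*vs)


def dual(i):
    """Dual basis element (0-based index): v -> i-th coordinate of v."""
    return lambda v: v[i]


def det(vectors):
    """Leibniz expansion of the determinant of the matrix with these rows."""
    n = len(vectors)
    return sum(
        sign(s) * prod(vectors[j][s[j]] for j in range(n))
        for s in permutations(range(n))
    )


e1, e2, e3 = dual(0), dual(1), dual(2)
v, w, u = (1, 2, 3), (4, 5, 6), (7, 8, 10)

minor = wedge(e1, 1, e2, 1)(v, w)      # v[0]*w[1] - v[1]*w[0]
swapped = wedge(e2, 1, e1, 1)(v, w)    # opposite sign, as in Proposition 0.17
left = wedge(wedge(e1, 1, e2, 1), 2, e3, 1)(v, w, u)
right = wedge(e1, 1, wedge(e2, 1, e3, 1), 2)(v, w, u)
```

Exact rational arithmetic is used so that the comparison between `left`, `right` and `det([v, w, u])` is an equality rather than a floating-point approximation.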

In $\mathbb{R}^n$, the dual basis $\beta^* = \{e^*_1, \cdots, e^*_n\}$ is frequently given different notation. In the language of a standard calculus course, the function $\mathbb{R}^n \to \mathbb{R}$ which sends $(x_1, x_2, \cdots, x_n) \mapsto x_i$ would typically be denoted by the symbol $x_i$, i.e. representing the function by the formula that defines it. Since this function is linear, it is equal to its own derivative. Typically $D(x_i)$ is just denoted $dx_i$ in this context, and if we think of $dx_i \in (\mathbb{R}^n)^*$, notice that $dx_i(a_1, a_2, \cdots, a_n) = a_i$, i.e. $dx_i = e^*_i$: the $dx_i$'s are the standard dual basis $\beta^*$ on $\mathbb{R}^n$.

Further operations on forms

This section assumes the reader is familiar with the basic operations on forms as covered in Spivak or Guillemin and Pollack. We discuss the Hodge Star operation.

Definition 0.18. The Hodge star is a map of forms $* : \Lambda^k(\mathbb{R}^n)^* \to \Lambda^{n-k}(\mathbb{R}^n)^*$ given by the formula

\[ *\omega = \sum_{I} \omega(e_I)\, \mathrm{Det}(e_I, \cdot) \]

where in the formula, the summation is over all multi-indices $I = (i_1, \cdots, i_k)$ where $1 \le i_1 < i_2 < \cdots < i_k \le n$, and $e_I = (e_{i_1}, \cdots, e_{i_k})$. We are thinking of $\mathbb{R}^n$ as column vectors, so $\mathrm{Det}(e_I, \cdot)$ means the $(n-k)$-form on $\mathbb{R}^n$ where the first $k$ columns are $e_{i_1}, e_{i_2}, \cdots, e_{i_k}$ respectively, and the remaining columns are where the $n-k$ input vectors are put, so

\[ (*\omega)(v_{k+1}, v_{k+2}, \cdots, v_n) = \sum_{I} \omega(e_{i_1}, e_{i_2}, \cdots, e_{i_k})\, \mathrm{Det}(e_{i_1}, e_{i_2}, \cdots, e_{i_k}, v_{k+1}, v_{k+2}, \cdots, v_n). \]

We wish to verify

Theorem 0.19. The Hodge star is a linear isomorphism $* : \Lambda^k(\mathbb{R}^n)^* \to \Lambda^{n-k}(\mathbb{R}^n)^*$ for all $k \in \{0, 1, \cdots, n\}$. The Hodge star satisfies

\[ \omega(v_1, \cdots, v_k) = (*\omega)(v_{k+1}, \cdots, v_n) \]

whenever $(v_1, \cdots, v_n)$ is an orthonormal basis for $\mathbb{R}^n$ which is oriented in the sense that $\mathrm{Det}(v_1, \cdots, v_n) = 1$. Moreover

\[ *(v^*_{\sigma(1)} \wedge \cdots \wedge v^*_{\sigma(k)}) = (-1)^{\sigma}\, v^*_{\sigma(k+1)} \wedge \cdots \wedge v^*_{\sigma(n)} \]

whenever $\sigma \in \Sigma_n$ is a permutation. If $f : \mathbb{R}^n \to \mathbb{R}^n$ is an orientation-preserving linear isometry, then $f^*(*\omega) = *(f^*\omega)$ for all $\omega \in \Lambda^k(\mathbb{R}^n)^*$.

The nice thing about Definition 0.18 is that it is a formula: well-defined, linear and computable. The bad thing about it is that it does not appear to have any immediate geometric meaning. Theorem 0.19 fixes this problem, in that it gives the definition geometric meaning.
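To illustrate that Definition 0.18 really is computable, here is a small Python sketch, added as an illustration with hypothetical helper names (`sign`, `det`, `hodge`). A k-form is represented as a function of k vectors, and `hodge` evaluates the defining sum over increasing multi-indices. The oriented orthonormal basis of R^3 with rational entries is an example chosen here; it is used to try out the identity of Theorem 0.19 for the 1-form e_1^*.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import prod


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based indices."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def det(columns):
    """Determinant of the square matrix with the given columns."""
    n = len(columns)
    return sum(
        sign(s) * prod(columns[j][s[j]] for j in range(n))
        for s in permutations(range(n))
    )


def hodge(omega, k, n):
    """Hodge star of a k-form omega on R^n, following the defining formula."""
    e = [tuple(1 if r == i else 0 for r in range(n)) for i in range(n)]

    def star(*vs):
        # Sum over multi-indices I of omega(e_I) * Det(e_I, vs).
        return sum(
            omega(*(e[i] for i in I)) * det([e[i] for i in I] + list(vs))
            for I in combinations(range(n), k)
        )
    return star


dx = lambda v: v[0]          # the 1-form e_1^* on R^3
star_dx = hodge(dx, 1, 3)    # an (n-k) = 2-form on R^3

# An oriented orthonormal basis of R^3 with rational entries (example data).
v1 = (Fraction(3, 5), Fraction(4, 5), 0)
v2 = (Fraction(-4, 5), Fraction(3, 5), 0)
v3 = (0, 0, 1)

lhs = dx(v1)              # omega(v_1)
rhs = star_dx(v2, v3)     # (*omega)(v_2, v_3)
```

Here `star_dx(v, w)` is the determinant with first column e_1, which is the minor v[1]*w[2] - v[2]*w[1], in agreement with the formula for the star of a wedge of dual basis elements.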

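Proof. Let $\sigma \in \Sigma_n$. If $\omega = e^*_{\sigma(1)} \wedge \cdots \wedge e^*_{\sigma(k)}$, then

\[ *\omega = \mathrm{Det}(e_{\sigma(1)}, \cdots, e_{\sigma(k)}, \cdot). \]

By the multi-linearity of the determinant we can, without affecting the value of $(*\omega)(v_{k+1}, \cdots, v_n)$, project the vectors $v_{k+1}, \cdots, v_n$ into the orthogonal complement of $\{e_{\sigma(1)}, \cdots, e_{\sigma(k)}\}$. This orthogonal complement is just the span of $\{e_{\sigma(k+1)}, \cdots, e_{\sigma(n)}\}$. But notice,

\[ (*\omega)(e_{\sigma(k+1)}, \cdots, e_{\sigma(n)}) = \mathrm{Det}(e_{\sigma(1)}, \cdots, e_{\sigma(n)}) = (-1)^{\sigma}. \]

This verifies that

\[ *(e^*_{\sigma(1)} \wedge \cdots \wedge e^*_{\sigma(k)}) = (-1)^{\sigma}\, e^*_{\sigma(k+1)} \wedge \cdots \wedge e^*_{\sigma(n)}. \]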

Given a multi-index $(i_1, \cdots, i_k)$ where $1 \le i_1 < i_2 < \cdots < i_k \le n$, notice that there is a unique permutation $\sigma \in \Sigma_n$ such that $\sigma(j) = i_j$ for all $j \le k$ and where $\sigma(k+1) < \sigma(k+2) < \cdots < \sigma(n)$. So we will think of multi-indices as being these permutations which preserve order on $\{1, 2, \cdots, k\}$ and $\{k+1, k+2, \cdots, n\}$ respectively. If we use the notation $v^*_{\sigma}$ for the $k$-form $v^*_{\sigma(1)} \wedge v^*_{\sigma(2)} \wedge \cdots \wedge v^*_{\sigma(k)}$ and $\bar v^*_{\sigma}$ for the $(n-k)$-form $(-1)^{\sigma} v^*_{\sigma(k+1)} \wedge v^*_{\sigma(k+2)} \wedge \cdots \wedge v^*_{\sigma(n)}$ then the definition of the Hodge star can be re-stated as

\[ \text{if } \omega = \sum_{\sigma} a_{\sigma} e^*_{\sigma} \text{ then } *\omega = \sum_{\sigma} a_{\sigma} \bar e^*_{\sigma} \]

and we are trying to prove that we can replace the oriented orthonormal basis $\{e_1, \cdots, e_n\}$ with an arbitrary oriented orthonormal basis $\{v_1, \cdots, v_n\}$ in this formula. So we let $\{v_1, \cdots, v_n\}$ be an oriented orthonormal basis for $\mathbb{R}^n$ and evaluate $\omega(v_1, \cdots, v_k)$ and $(*\omega)(v_{k+1}, \cdots, v_n)$. If we could verify that

\[ e^*_{\sigma}(v_1, \cdots, v_k) = \bar e^*_{\sigma}(v_{k+1}, \cdots, v_n) \]

we would be done. If we think of the column vectors $v_1, \cdots, v_n$ as being the columns of a matrix $A$, the fact that $\{v_1, \cdots, v_n\}$ is an oriented orthonormal basis is saying that $A \in SO_n$. $e^*_{\sigma}(v_1, \cdots, v_k)$ is the sub-determinant of $A$ where one takes the first $k$ columns of $A$ and the rows $\sigma(1), \cdots, \sigma(k)$ in that order. $\bar e^*_{\sigma}(v_{k+1}, \cdots, v_n)$ is the sub-determinant of $A$ where one takes the final $(n-k)$ columns and rows $\sigma(k+1), \cdots, \sigma(n)$ respectively, multiplied by $(-1)^{\sigma}$. Think of the permutation $\sigma$ as inducing the linear isometry $\sigma_* : \mathbb{R}^n \to \mathbb{R}^n$ which sends $e_i$ to $e_{\sigma(i)}$. This is an orientation-preserving isometry if and only if $(-1)^{\sigma} = 1$. Thus up to a permutation of the rows, this reduces to the case where $\sigma$ is the identity, since when $\sigma$ is an odd permutation it permutes our vectors to be a negatively-oriented basis for $\mathbb{R}^n$. Thus we've reduced the problem to checking that whenever $A \in SO_n$, if we write

\[ A = \begin{pmatrix} B & C \\ D & E \end{pmatrix} \]

where $B$ is a $k \times k$ matrix and $E$ is $(n-k) \times (n-k)$, then $\mathrm{Det}(B) = \mathrm{Det}(E)$.

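Since $A$ is orthogonal, observe

\[ A^t A = I = \begin{pmatrix} B^t B + D^t D & B^t C + D^t E \\ C^t B + E^t D & C^t C + E^t E \end{pmatrix} \]

which can be further written as the relations

\[ B^t B + D^t D = I, \qquad C^t C + E^t E = I, \qquad B^t C + D^t E = 0. \]

Notice,

\[ \begin{pmatrix} B^t & D^t \\ 0 & I \end{pmatrix} A = \begin{pmatrix} B^t & D^t \\ 0 & I \end{pmatrix} \begin{pmatrix} B & C \\ D & E \end{pmatrix} = \begin{pmatrix} B^t B + D^t D & B^t C + D^t E \\ D & E \end{pmatrix} = \begin{pmatrix} I & 0 \\ D & E \end{pmatrix}. \]

Taking the determinants of the far left and right sides gives

\[ \mathrm{Det}(B^t)\, \mathrm{Det}(A) = \mathrm{Det}(E). \]

But $\mathrm{Det}(A) = 1$ by the assumption $A \in SO_n$, and $\mathrm{Det}(B^t) = \mathrm{Det}(B)$ gives the result.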
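The block identity Det(B) = Det(E) for A in SO_n is easy to try out on an example. The sketch below is an illustration only: the rational rotation matrix `A` is an example chosen here, and `corner_dets` is a hypothetical helper that extracts the top-left k-by-k block B and the bottom-right (n-k)-by-(n-k) block E.

```python
from fractions import Fraction
from itertools import permutations
from math import prod


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based indices."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def det(M):
    """Leibniz expansion of the determinant; the empty matrix has determinant 1."""
    n = len(M)
    return sum(
        sign(s) * prod(M[j][s[j]] for j in range(n))
        for s in permutations(range(n))
    )


def corner_dets(A, k):
    """Return (Det(B), Det(E)) for the block decomposition A = [[B, C], [D, E]]."""
    B = [row[:k] for row in A[:k]]
    E = [row[k:] for row in A[k:]]
    return det(B), det(E)


# A rotation matrix with rational entries (example data, an element of SO_3).
A = [[Fraction(x, 3) for x in row] for row in ([2, -1, 2], [2, 2, -1], [-1, 2, 2])]
n = len(A)

# A^t A, which should be the identity matrix since A is orthogonal.
gram = [[sum(A[r][i] * A[r][j] for r in range(n)) for j in range(n)] for i in range(n)]
identity = [[int(i == j) for j in range(n)] for i in range(n)]

# One (Det(B), Det(E)) pair for each block size k = 0, ..., n.
results = [corner_dets(A, k) for k in range(n + 1)]
```

Each pair in `results` should consist of two equal numbers; the cases k = 0 and k = n reduce to Det(A) = 1.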

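The result that the Hodge star commutes with pull-backs via orientation-preserving isometries $f : \mathbb{R}^n \to \mathbb{R}^n$ follows from observing $f^*(e^*_{\sigma(1)} \wedge \cdots \wedge e^*_{\sigma(k)}) = f^*(e^*_{\sigma(1)}) \wedge \cdots \wedge f^*(e^*_{\sigma(k)})$, and since $f^*(e^*_i) = e^*_i \circ f$, $\{f^*(e^*_1), \cdots, f^*(e^*_n)\}$ is precisely the dual basis to the basis $\{f^{-1}(e_1), \cdots, f^{-1}(e_n)\}$ for $\mathbb{R}^n$. Write $v_i = f^{-1}(e_i)$, then $\{v_1, \cdots, v_n\}$ is an oriented orthonormal basis. Thus,

\[ f^*(e^*_{\sigma(1)} \wedge \cdots \wedge e^*_{\sigma(k)}) = v^*_{\sigma(1)} \wedge \cdots \wedge v^*_{\sigma(k)} \]

giving

\[ *(f^*(e^*_{\sigma(1)} \wedge \cdots \wedge e^*_{\sigma(k)})) = (-1)^{\sigma}\, v^*_{\sigma(k+1)} \wedge \cdots \wedge v^*_{\sigma(n)}. \]

Conversely,

\[ f^*(*(e^*_{\sigma(1)} \wedge \cdots \wedge e^*_{\sigma(k)})) = f^*((-1)^{\sigma} e^*_{\sigma(k+1)} \wedge \cdots \wedge e^*_{\sigma(n)}) = (-1)^{\sigma}\, v^*_{\sigma(k+1)} \wedge \cdots \wedge v^*_{\sigma(n)}. \]

So we've checked $*$ commutes with $f^*$ on a basis, and since $f^*$ is linear this is enough.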

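Proposition 0.20. The map

\[ \langle \cdot, \cdot \rangle : \Lambda^k(\mathbb{R}^n)^* \times \Lambda^k(\mathbb{R}^n)^* \to \mathbb{R} \]

given by $\langle \omega, \eta \rangle = *(\omega \wedge *\eta)$ is an inner product on $\Lambda^k(\mathbb{R}^n)^*$. Moreover, if $f : \mathbb{R}^n \to \mathbb{R}^n$ is an orientation-preserving isometry of $\mathbb{R}^n$, then $f^* : \Lambda^k(\mathbb{R}^n)^* \to \Lambda^k(\mathbb{R}^n)^*$ is an isometry of the space of $k$-forms with this inner product.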


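Proof. $\langle \cdot, \cdot \rangle$ is bilinear since $*$ is linear and the wedge product is bilinear. Let $\sigma, \alpha \in \Sigma_n$ be any two permutations that preserve the order on $\{1, 2, \cdots, k\}$ and $\{k+1, k+2, \cdots, n\}$ respectively. Notice that

\[ \langle e^*_{\sigma(1)} \wedge \cdots \wedge e^*_{\sigma(k)},\, e^*_{\alpha(1)} \wedge \cdots \wedge e^*_{\alpha(k)} \rangle = (-1)^{\alpha} * (e^*_{\sigma(1)} \wedge \cdots \wedge e^*_{\sigma(k)} \wedge e^*_{\alpha(k+1)} \wedge \cdots \wedge e^*_{\alpha(n)}). \]

The form $e^*_{\sigma(1)} \wedge \cdots \wedge e^*_{\sigma(k)} \wedge e^*_{\alpha(k+1)} \wedge \cdots \wedge e^*_{\alpha(n)}$ is non-zero if and only if $\{\sigma(1), \cdots, \sigma(k), \alpha(k+1), \cdots, \alpha(n)\} = \{1, 2, \cdots, n\}$, which because of our conventions is equivalent to $\sigma = \alpha$, and

\[ e^*_{\sigma(1)} \wedge \cdots \wedge e^*_{\sigma(n)} = (-1)^{\sigma}\, e^*_1 \wedge \cdots \wedge e^*_n \]

so

\[ \langle e^*_{\sigma(1)} \wedge \cdots \wedge e^*_{\sigma(k)},\, e^*_{\alpha(1)} \wedge \cdots \wedge e^*_{\alpha(k)} \rangle = \begin{cases} 1 & \text{if } \alpha = \sigma \\ 0 & \text{otherwise.} \end{cases} \]

Thus the bilinear form $\langle \cdot, \cdot \rangle$ is the standard Euclidean inner product on $\Lambda^k(\mathbb{R}^n)^*$ provided we identify $\Lambda^k(\mathbb{R}^n)^*$ with $\mathbb{R}^{\binom{n}{k}}$ by sending the forms $e^*_{\sigma(1)} \wedge \cdots \wedge e^*_{\sigma(k)}$ to the standard basis vectors of $\mathbb{R}^{\binom{n}{k}}$.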

If $f : \mathbb{R}^n \to \mathbb{R}^n$ is an orientation-preserving isometry, notice

\[ \langle f^*(\omega), f^*(\eta) \rangle = *(f^*\omega \wedge *(f^*\eta)) \]

but the Hodge star commutes with pull-backs via orientation-preserving isometries and wedge products commute with pull-backs via arbitrary linear maps, so this equals

\[ *(f^*\omega \wedge f^*(*\eta)) = *(f^*(\omega \wedge *\eta)) = f^*(*(\omega \wedge *\eta)) = *(\omega \wedge *\eta) \]

because $f^*$ acting on 0-forms (or $n$-forms) is trivial.

Technically, we did not officially define the pull-back of a 0-form ($\Lambda^0(\mathbb{R}^n)^*$ is conventionally defined as simply $\mathbb{R}$). But $f^*$ acts trivially on $\Lambda^n(\mathbb{R}^n)^*$ since this space is one-dimensional and $f$ preserves the orientation. In general of course $f^* : \Lambda^n(\mathbb{R}^n)^* \to \Lambda^n(\mathbb{R}^n)^*$ is multiplication by $\mathrm{Det}(f)$, but since $f$ is an orientation-preserving isometry, $\mathrm{Det}(f) = 1$. So it does us no harm to define $f^* : \Lambda^0(\mathbb{R}^n)^* \to \Lambda^0(\mathbb{R}^n)^*$ as the identity map on $\mathbb{R} = \Lambda^0(\mathbb{R}^n)^*$.

Differential Geometry - Coarse Outline Calculus and manifolds

1. The Derivative, Notation

Definition 1.1. (The Derivative) Given a function $f : U \to \mathbb{R}^m$ where $U \subset \mathbb{R}^n$ is open, and a point $p \in U$, we say $f$ is differentiable at $p$ if there exists a linear map $L : \mathbb{R}^n \to \mathbb{R}^m$ such that

\[ \lim_{h \to 0} \frac{f(p + h) - f(p) - L(h)}{|h|} = 0. \]

In this case, the derivative is $L$ and is denoted $Df_p$. So the derivative of $f$ at a point $p$ is a linear map. The directional derivative of $f$ at $p$ in the direction $v \in \mathbb{R}^n$ is the limit

\[ D_v f_p = \lim_{t \to 0} \frac{f(p + tv) - f(p)}{t} \]

where $t \in \mathbb{R}$ is a scalar. If $e_1, \cdots, e_n$ represent the standard basis vectors of $\mathbb{R}^n$, the partial derivatives of $f$ at $p$ are denoted

\[ \frac{\partial f}{\partial e_i}(p) = D_{e_i} f_p. \]

Generality note: In the definition of the derivative we use the concepts of a vector space and a norm. This notion of derivative therefore makes sense in any Banach space and is frequently called the Fréchet derivative in that context. When one leaves finite-dimensional vector spaces, linear functions need not be continuous. In the context of Banach spaces one sometimes puts additional restrictions on $Df$. Also notice that we only use the norm in the context of showing a certain limit is zero. In the language of norms this means we only care about the norm up to `topological equivalence' (this is the equivalence relation between two norms, when they induce the same topology on the vector space). In finite-dimensional vector spaces all norms are equivalent so it does not matter which norm you use in the definition of the derivative. In these notes we will frequently use calculus on finite-dimensional spaces that are not $\mathbb{R}^n$: things like spaces of matrices, affine spaces and such. So keep in mind that for all the definitions below one is allowed to replace $\mathbb{R}^n$ by any finite-dimensional vector space over $\mathbb{R}$. Also, one can do calculus over other fields. The complex numbers $\mathbb{C}$ are a common variation. If in the definition of differentiability one demands that $L : \mathbb{C}^n \to \mathbb{C}^m$ is complex linear, then one has a stronger notion of differentiability, called complex differentiability.

If one uses the notation $v \overset{\varepsilon}{\simeq} w$ to mean $|v - w| < \varepsilon$, the statement of differentiability at $p \in U$ becomes: for every $\varepsilon > 0$ there exists $\delta > 0$ such that $0 < |h| < \delta$ implies $f(p + h) \overset{\varepsilon |h|}{\simeq} f(p) + L(h)$. This statement looks much like the statement of continuity, except we allow the displacement between $f(p + h)$ and $f(p) + L(h)$ to be bounded by a small linear function instead of a small constant.
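As a numerical illustration of this statement, the following Python sketch evaluates the quotient |f(p+h) - f(p) - L(h)| / |h| for shrinking h. The function f(x, y) = (xy, x + y), the base point and the direction of approach are all example choices made here, and `L` is the candidate derivative written out by hand.

```python
from math import hypot


def f(p):
    """Example map R^2 -> R^2."""
    x, y = p
    return (x * y, x + y)


def L(p, h):
    """Candidate derivative of f at p applied to h (Jacobian of f times h)."""
    x, y = p
    return (y * h[0] + x * h[1], h[0] + h[1])


def remainder_ratio(p, h):
    """|f(p+h) - f(p) - L(h)| / |h|, which should tend to 0 as h -> 0."""
    shifted = f((p[0] + h[0], p[1] + h[1]))
    base, linear = f(p), L(p, h)
    r = [s - b - l for s, b, l in zip(shifted, base, linear)]
    return hypot(*r) / hypot(*h)


p = (1.0, 2.0)
steps = [10.0 ** (-e) for e in range(1, 6)]
ratios = [remainder_ratio(p, (t, t)) for t in steps]
```

For this particular f the remainder is exactly (h_1 h_2, 0), so the ratio is bounded by |h|, which is the `small linear function' bound with epsilon as small as one likes once |h| is small.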

Proposition 1.2. With the same conventions as in the above definition,

• If $f$ is differentiable at $p$, all directional derivatives exist and are equal to

\[ D_v f_p = Df_p(v). \]

• If all partial derivatives of $f$ exist in a neighbourhood of $p$, and if they are continuous at $p$, then $f$ is differentiable at $p$.

• (Clairaut's Theorem) If the 2nd order partial derivatives exist in a neighbourhood of $p$ and are continuous at $p$, then the mixed partials agree at $p$, i.e.

\[ \frac{\partial^2 f}{\partial e_i \partial e_j}(p) = \frac{\partial^2 f}{\partial e_j \partial e_i}(p). \]

• A function $f : U \to \mathbb{R}^m \times \mathbb{R}^k$ is differentiable at $p$ if and only if $f_1 : U \to \mathbb{R}^m$ and $f_2 : U \to \mathbb{R}^k$ are differentiable at $p$, where $f(x) = (f_1(x), f_2(x))$ for all $x$. Moreover, $Df_p(v) = (D(f_1)_p(v), D(f_2)_p(v))$.

One consequence of Proposition 1.2 is that if $f$ is differentiable at $p$, $Df_p(v) = A \cdot v$ where $A$ is the $m \times n$ matrix whose $(i, j)$-th entry is $\frac{\partial f_i}{\partial x_j}(p)$, and $f_i$ is the $i$-th component of $f$, i.e. $f_i = f \cdot e_i$. For these formulae to make sense, one must think of $\mathbb{R}^n$ and $\mathbb{R}^m$ as spaces of column vectors.

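Theorem 1.3. (Chain Rule) If $f : U \to \mathbb{R}^m$ is differentiable at $p \in U \subset \mathbb{R}^n$ and $g : W \to \mathbb{R}^k$ is differentiable at $f(p) \in W \subset \mathbb{R}^m$, then provided $f(U) \subset W$, $g \circ f : U \to \mathbb{R}^k$ is defined and differentiable at $p$, moreover

\[ D(g \circ f)_p = Dg_{f(p)} \circ Df_p. \]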

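Proof. We demonstrate the proof of the chain rule using the $v \overset{\varepsilon}{\simeq} w$ notation. Note that in the proof the roles of $f$ and $g$ are interchanged relative to the statement: we differentiate $f \circ g$ at $p$. Differentiability of $g$ at $p$ and of $f$ at $g(p)$ translates to: for any $\varepsilon_1 > 0$ and $\varepsilon_2 > 0$ there exist $\delta_1 > 0$ and $\delta_2 > 0$ such that

\[ g(p + h_1) \overset{\varepsilon_1 |h_1|}{\simeq} g(p) + Dg_p(h_1) \]

\[ f(g(p) + h_2) \overset{\varepsilon_2 |h_2|}{\simeq} f(g(p)) + Df_{g(p)}(h_2) \]

provided $|h_1| < \delta_1$ and $|h_2| < \delta_2$. We let $h_2 = g(p + h_1) - g(p)$, this gives

\[ f(g(p + h_1)) \overset{\varepsilon_2 |h_2|}{\simeq} f(g(p)) + Df_{g(p)}(g(p + h_1) - g(p)) \overset{\|Df_{g(p)}\| \cdot \varepsilon_1 |h_1|}{\simeq} f(g(p)) + Df_{g(p)} \circ Dg_p(h_1). \]

To turn this into a statement of differentiability of $f \circ g$ notice that $h_2 = g(p + h_1) - g(p) \overset{\varepsilon_1 |h_1|}{\simeq} Dg_p(h_1)$, so by the triangle inequality $|h_2| < \varepsilon_1 |h_1| + |Dg_p(h_1)| \le (\varepsilon_1 + \|Dg_p\|) |h_1|$, thus

\[ (f \circ g)(p + h_1) \overset{K |h_1|}{\simeq} (f \circ g)(p) + Df_{g(p)} \circ Dg_p(h_1) \]

where $K = \varepsilon_2 (\varepsilon_1 + \|Dg_p\|) + \|Df_{g(p)}\| \cdot \varepsilon_1$, and for this to be true, one needs $|h_2| < \delta_2$, so $(\varepsilon_1 + \|Dg_p\|) \delta_1 < \delta_2$ suffices. Given $\varepsilon > 0$, let $\varepsilon_1 = \frac{\varepsilon}{2 \|Df_{g(p)}\| + 1}$ and $\varepsilon_2 = \frac{\varepsilon}{2 (\varepsilon_1 + \|Dg_p\|)}$, so that $K < \varepsilon$. The definition of differentiability of $g$ and $f$ respectively gives us a $\delta_1$ and $\delta_2$, and we then let $\delta = \min\{\delta_1, \frac{\delta_2}{\varepsilon_1 + \|Dg_p\|}\}$.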
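The chain rule can also be checked numerically in matrix form. The sketch below is an illustration, not part of the argument: the maps `f` and `g` are polynomial examples chosen here, `jacobian` is a hypothetical helper that approximates the matrix of the derivative by central finite differences, and the comparison is between the Jacobian of the composite and the product of the Jacobians, following the statement of Theorem 1.3.

```python
def jacobian(fn, p, eps=1e-6):
    """Central finite-difference approximation to the matrix of D(fn)_p."""
    columns = []
    for j in range(len(p)):
        plus, minus = list(p), list(p)
        plus[j] += eps
        minus[j] -= eps
        columns.append([(a - b) / (2 * eps) for a, b in zip(fn(plus), fn(minus))])
    # columns[j][i] approximates d(fn_i)/d(x_j); transpose so rows are indexed by i.
    return [list(row) for row in zip(*columns)]


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def f(p):
    x, y = p
    return [x * y, x + y]


def g(q):
    u, v = q
    return [u * u + v, u * v]


def g_after_f(p):
    return g(f(p))


p = [1.0, 2.0]
lhs = jacobian(g_after_f, p)                      # matrix of D(g o f)_p
rhs = matmul(jacobian(g, f(p)), jacobian(f, p))   # matrix of Dg_{f(p)} o Df_p
max_gap = max(abs(a - b) for ra, rb in zip(lhs, rhs) for a, b in zip(ra, rb))
```

For these example maps a hand computation gives the matrix with rows (9, 5) and (8, 5) at p = (1, 2), and both `lhs` and `rhs` should agree with it up to finite-difference error.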

Example 1.4. • If $f : \mathbb{R}^n \to \mathbb{R}^m$ is linear, it is differentiable and $Df_p = f$.

• The multiplication function $\mu : \mathbb{R}^2 \to \mathbb{R}$ given by $(x, y) \mapsto xy$ is differentiable, moreover $D\mu_{(x,y)}(a, b) = xb + ay$. In matrix notation, $D\mu_{(x,y)} = [y, x]$.

• The addition function $+ : \mathbb{R}^2 \to \mathbb{R}$ given by $(x, y) \mapsto x + y$ is differentiable.

• Polynomial functions $\mathbb{R}^n \to \mathbb{R}^m$ are differentiable. This follows from the previous items, the chain rule and a long induction.

• If $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^k$ is bilinear, it is differentiable and

\[ Df_{(p,q)}(v, w) = f(p, w) + f(v, q). \]

The proof of this has two steps. First, since $f$ is bilinear it is a quadratic polynomial in the entries of $p$ and $q$, so it is differentiable. Second, $Df_{(p,q)}(x, y) = Df_{(p,q)}(x, 0) + Df_{(p,q)}(0, y)$ by linearity, and each term on the right-hand side can be viewed as a directional derivative in only one of the two variables. Since $f$ is linear as a function of each of its variables separately, $Df_{(p,q)}(x, 0) = f(x, q)$ and $Df_{(p,q)}(0, y) = f(p, y)$, giving the result.

• Let $M_{n,m}$ be the space of $n \times m$ matrices, then addition, scalar multiplication and matrix multiplication are all differentiable functions:

\[ + : M_{n,m} \times M_{n,m} \to M_{n,m}, \qquad \cdot : \mathbb{R} \times M_{n,m} \to M_{n,m}, \qquad \mu : M_{n,m} \times M_{m,k} \to M_{n,k}, \]

moreover,

\[ D\mu_{(A,B)}(H, J) = AJ + HB. \]
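The formula for the derivative of matrix multiplication can be checked exactly, since (A + tH)(B + tJ) - AB = t(AJ + HB) + t^2 HJ. The following Python sketch uses exact rational arithmetic; the matrices are example data chosen here, and `matmul`, `combine` and `difference_quotient` are hypothetical helper names.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def combine(A, B, s):
    """Entrywise A + s * B for matrices of the same shape."""
    return [[a + s * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[1, 2, 0], [0, 1, 3]]      # a point of M_{2,3}
B = [[1, 0], [2, 1], [0, 4]]    # a point of M_{3,2}
H = [[0, 1, 1], [2, 0, 1]]      # a direction in M_{2,3}
J = [[1, 1], [0, 3], [5, 0]]    # a direction in M_{3,2}


def difference_quotient(t):
    """(mu(A + tH, B + tJ) - mu(A, B)) / t, computed entrywise."""
    moved = matmul(combine(A, H, t), combine(B, J, t))
    base = matmul(A, B)
    return [[(m - b) / t for m, b in zip(rm, rb)] for rm, rb in zip(moved, base)]


claimed = combine(matmul(A, J), matmul(H, B), 1)        # AJ + HB
t = Fraction(1, 1000)
error = combine(difference_quotient(t), claimed, -1)    # equals t * HJ
```

The error term is linear in t, so the difference quotient converges to AJ + HB as t tends to 0.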

Regarding the last example with matrix spaces: if one were to insist on doing calculus only in Euclidean space, the above computations would be awkward. We can identify $M_{n,m}$ with $\mathbb{R}^{nm}$ by thinking of the matrix with entries $[a_{ij}]$ as the vector $(a_{11}, \cdots, a_{1m}, a_{21}, \cdots, a_{2m}, \cdots, a_{n1}, \cdots, a_{nm}) \in \mathbb{R}^{nm}$. The perspective we prefer to encourage is to simply think of $M_{n,m}$ as a vector space with a norm. Specifically, define the standard inner product on $M_{n,m}$ by $\langle A, B \rangle = \mathrm{tr}(A^t B)$. A computation shows $\langle A, B \rangle = \sum_{i=1}^{n} \sum_{j=1}^{m} a_{ij} b_{ij}$, i.e. this is the inner product on $M_{n,m}$ one would get by identifying $M_{n,m}$ with $\mathbb{R}^{nm}$ and using the standard Euclidean inner product on $\mathbb{R}^{nm}$. In particular the induced norm is $|A| = \sqrt{\sum_{i=1}^{n} \sum_{j=1}^{m} a_{ij}^2}$. We will call $|A|$ the length of $A$.

There are several big theorems in multi-variable calculus. One family of such theorems relates the infinitesimal behavior of a function to its local behavior, in the case that the derivative at a point has maximal rank. These theorems go by the names: Implicit Function Theorem (Epi-Split Lemma), Inverse Function Theorem (Iso Lemma) and Mono-Split Lemma (local embedding theorem), depending on whether $n - m$ is positive, zero or negative respectively.

Theorem 1.5. Assume $f : U \to \mathbb{R}^m$ is $C^k$ for $k \ge 1$, with $U \subset \mathbb{R}^n$ open. Let $p \in U$ and let $Df_p$ have rank equal to $\min\{m, n\}$.

• If $n > m$, the Implicit Function Theorem says that there exists a $C^\infty$ function $g : U \to \mathbb{R}^{n-m}$ and a perhaps smaller open neighbourhood of $p$, $p \in U' \subset U$, such that $f \oplus g : U \to \mathbb{R}^m \times \mathbb{R}^{n-m}$ given by $(f \oplus g)(x) = (f(x), g(x))$ restricts to a map $(f \oplus g)|_{U'} : U' \to (f \oplus g)(U')$ which is one-to-one, onto and invertible, moreover the inverse is $C^k$.

• If $n = m$, the Inverse Function Theorem says there is an open neighbourhood of $p$ in $U$, $p \in U' \subset U$, such that $f$ restricts to a map $f|_{U'} : U' \to f(U')$ with $f(U')$ open, $f|_{U'}$ one-to-one and onto, and such that the inverse is $C^k$.

• If $n < m$, the Mono-Split Lemma says there exists a function $F : U \times \mathbb{R}^{m-n} \to \mathbb{R}^m$ such that $F|_{U \times \{0\}} = f$, with $DF_{(p,0)}$ an isomorphism; therefore there is a neighbourhood $U'$ of $(p, 0)$ in $U \times \mathbb{R}^{m-n}$ such that $F|_{U'} : U' \to F(U')$ is a one-to-one onto map, whose inverse is $C^k$, moreover $F(U')$ is open in $\mathbb{R}^m$.

The name for the implicit function theorem corresponding to the mono-split lemma is the epi-split lemma.

Note 1.6. • In some textbooks, when the implicit function theorem is stated, frequently $(f \oplus g)(U') = f(U') \times g(U')$ is taken to be $f(U') \times \mathbb{R}^{n-m}$. This can always be achieved, with a little more work, but of course one has to modify the function slightly. The key idea is that there is a diffeomorphism $(-1, 1) \to \mathbb{R}$ given by $h(x) = \frac{x}{1 - x^2}$.

• Usually one derives the implicit function theorem from the inverse function theorem, letting $g(v) = (w_1 \cdot v, w_2 \cdot v, \cdots, w_{n-m} \cdot v)$ where $\{w_1, \cdots, w_{n-m}\}$ is a basis for $\ker(Df_p)$, and $v \cdot w$ indicates the dot product of two vectors in $\mathbb{R}^n$.

• In many textbooks, the implicit function theorem does not look like the version above. The connection will be made later after the basic definitions of manifolds and tangent spaces are given.

• A one-to-one, onto, $C^k$ map between open subsets of Euclidean space such that the inverse is also $C^k$ is called a $C^k$-diffeomorphism. Roughly speaking this says that the two open sets are equivalent from the point of view of calculus.

• The mono-split lemma is proven from the inverse function theorem by setting $F(v, w) = f(v) + \sum_{i=1}^{m-n} w_i p_i$ where $p_1, \cdots, p_{m-n}$ together with the image of $Df_p$ span $\mathbb{R}^m$, and $w = \sum_{i=1}^{m-n} w_i e_i$.
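The diffeomorphism h(x) = x / (1 - x^2) from the first item of this note can be inverted explicitly by solving the quadratic y x^2 + x - y = 0 for the root lying in (-1, 1). The Python sketch below is an illustration added here; the name `h_inverse` and the sample values are assumptions of the sketch. Since h'(x) = (1 + x^2) / (1 - x^2)^2 is positive on (-1, 1), h is strictly increasing there.

```python
from math import sqrt


def h(x):
    """The map (-1, 1) -> R, x -> x / (1 - x^2)."""
    return x / (1.0 - x * x)


def h_inverse(y):
    """Solve y = x / (1 - x^2) for the root x in (-1, 1).

    Written as 2y / (1 + sqrt(1 + 4y^2)); this is the root
    (-1 + sqrt(1 + 4y^2)) / (2y) rationalised so that y = 0 is allowed.
    """
    return 2.0 * y / (1.0 + sqrt(1.0 + 4.0 * y * y))


samples = [-10.0, -0.5, 0.0, 0.5, 10.0]
preimages = [h_inverse(y) for y in samples]
round_trip = [h(x) for x in preimages]
```

The list `round_trip` should reproduce `samples` up to floating-point error, and every entry of `preimages` should lie strictly between -1 and 1.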

Manifolds

Manifolds play many roles in mathematics. On the purely geometric side, a manifold is something

that looks locally like Euclidean space. So a 3-dimensional manifold can be viewed as a hypothetical

universe in which we might live. The subject of general relativity takes this idea and runs with it,

imagining all of space and time to be a 4-dimensional manifold. On a rather formal mathematical

side, the idea of a manifold represents a challenge to mathematics. They were the first instance of

mathematical objects that are `locally trivial', in the sense that locally they all look like one common

object, a vector space. Much of mathematics has its focus on objects that are inherently combinatorial,

with many structures one can count or otherwise study. Manifolds are by design objects that give you

no way to grab onto them, they are relatively elusive objects. In that regard it has taken mathematics

a relatively long time to get a handle on manifolds, which triggered the development of subjects such

as algebraic topology and category theory. On an analytic side, manifolds are extremely natural objects

as the level-sets of smooth functions are generically manifolds (see Sard's Theorem towards the end

of these notes). In that regard, the study of manifolds is a very natural activity and this is much

of the reason why manifold theory got off the ground originally in attempts to study the shape of

constant-energy hypersurfaces for conservative mechanical systems in classical mechanics.

Definition 1.7. A subset $M \subset \mathbb{R}^n$ is a $C^k$ manifold of dimension $m$ if for every $p \in M$ there is a $C^k$-diffeomorphism $\phi : U \to V$ where $p \in U \subset \mathbb{R}^n$ is open, $V \subset \mathbb{R}^n$ is open, and $\phi(U \cap M) = V \cap (\mathbb{R}^m \times \{0\})$. The map $\phi$ is called a chart for $M$ about $p$. Provided $k \ge 1$ and $p \in M$, the tangent space to $M$ at $p$ is defined as

\[ T_p M = \{ v \in \mathbb{R}^n : v = \gamma'(0) \text{ where } \gamma : (-\varepsilon, \varepsilon) \to M \text{ is } C^k\text{-smooth with } \gamma(0) = p \}, \]

i.e. the tangent space at $p$ is the set of all the tangent vectors to all the curves in $M$ through $p$. A $C^k$-manifold for $k \ge 1$ is called a smooth manifold, with some authors defaulting to smooth being synonymous with $C^\infty$-smooth.

In the above definition, $k$ is allowed to be any integer $k \in \{0, 1, 2, 3, \cdots\}$, but we also allow $k = \infty$ for infinitely-differentiable manifolds. In applied topics as well as many other branches of mathematics one puts further constraints on the diffeomorphism $\phi$, such as analyticity, complex differentiability or a host of other constraints, giving various specialized types of structured manifolds. Historically, smooth manifolds were the first notion of manifold. Smooth manifolds are particularly natural since they are objects which are locally approximately linear, and smooth maps are similarly approximately linear.

Example 1.8. We list some examples of manifolds and leave the proofs as exercises.

• The square $S = \partial([0, 1] \times [0, 1])$ is a 1-dimensional $C^0$ manifold but it is not a $C^1$ manifold.

• $S^n = \{v \in \mathbb{R}^{n+1} : |v| = 1\}$ is an $n$-manifold, called the (unit) $n$-sphere. $T_p S^n = \{v \in \mathbb{R}^{n+1} : v \cdot p = 0\}$.

• Open subsets of $\mathbb{R}^n$ are manifolds. Moreover, if $U \subset \mathbb{R}^n$ is open, $T_p U = \mathbb{R}^n$.

• Given a set $X \subset \mathbb{R}^n$, a subset $U \subset X$ is relatively open in $X$ if $U = U' \cap X$ with $U'$ open in $\mathbb{R}^n$. Relatively open subsets of $m$-manifolds are $m$-manifolds. Moreover, $T_p U = T_p X$ for all $p \in U$.

• $\{(z, w) \in \mathbb{C}^2 : |z| = 1, w = t\sqrt{z} \text{ for some } t \in \mathbb{R}\}$ is a 2-dimensional manifold called the Möbius band. $|z|$ is the modulus, and $\sqrt{z}$ means either of the two square roots of $z$.

Proposition 1.9. If $M$ is a non-empty $m$-dimensional $C^k$-manifold with $k \ge 1$, the tangent spaces to $M$ are $m$-dimensional vector subspaces of $\mathbb{R}^n$. The dimension of a non-empty manifold is therefore well-defined.

Proof. Given $p \in M$ and $\gamma : (-\varepsilon, \varepsilon) \to M$ with $\gamma(0) = p$, and given a chart for $M$ at $p$, $\phi : U \to V$, by the continuity of $\gamma$, for some $\delta > 0$, $\gamma((-\delta, \delta)) \subset U$, so without loss of generality assume the image of $\gamma$ is contained in $U \cap M$. Since $\phi \circ \gamma$ takes its image in $\mathbb{R}^m \times \{0\}$, $(\phi \circ \gamma)'(0) \in \mathbb{R}^m \times \{0\}$. Thus $T_p M = ((D\phi)_p)^{-1}(\mathbb{R}^m \times \{0\})$.

Note 1.10. • An equivalent definition of a $C^k$-smooth $m$-dimensional manifold (for $k \ge 1$) is a set $M \subset \mathbb{R}^n$ such that for every $p \in M$ there is an open neighbourhood $U$ of $p$ in $\mathbb{R}^n$ and a $C^k$-smooth map $f : U \to \mathbb{R}^{n-m}$ such that $U \cap M = f^{-1}(0)$ and $Df_p$ is an onto linear map. Moreover, $T_p M = \ker Df_p$. The proof amounts to applying the implicit function theorem and keeping track of the tangent space under $Df_p$.

• An immediate corollary of the above is that pre-images of regular values of smooth functions $f : U \to \mathbb{R}^j$ are smooth manifolds.

• The empty set is a manifold of every dimension.
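As a concrete instance of the first item of this note, the unit sphere is the zero set of f(v) = |v|^2 - 1, whose derivative is Df_p(v) = 2 p.v, so ker Df_p is the set of vectors orthogonal to p. The Python sketch below is an illustration with example data chosen here (the point `p`, the direction `u` and the great-circle curve `gamma`); it differentiates a curve on the sphere numerically and evaluates Df_p on its velocity.

```python
from math import cos, sin


def f(v):
    """f(v) = |v|^2 - 1, so that the zero set of f is the unit sphere."""
    return sum(c * c for c in v) - 1.0


def Df(p, v):
    """Derivative of f at p applied to v: Df_p(v) = 2 p.v."""
    return 2.0 * sum(a * b for a, b in zip(p, v))


p = (0.6, 0.0, 0.8)      # a point of the 2-sphere
u = (0.8, 0.0, -0.6)     # a unit vector orthogonal to p


def gamma(t):
    """A great-circle curve on the sphere with gamma(0) = p and gamma'(0) = u."""
    return tuple(cos(t) * a + sin(t) * b for a, b in zip(p, u))


step = 1e-5
# Central difference approximation to gamma'(0).
velocity = tuple((a - b) / (2 * step) for a, b in zip(gamma(step), gamma(-step)))
on_sphere = f(gamma(0.3))        # should be (numerically) zero
in_kernel = Df(p, velocity)      # should be (numerically) zero
```

Both `on_sphere` and `in_kernel` should vanish up to rounding, which is the statement that tangent vectors to curves in the sphere lie in the kernel of Df_p.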

There are several common small variations of the definition of a manifold. Above we have discussed the `diffeomorphism' version of the definition, as well as the `implicit function theorem' version. There is of course a mono-split lemma version, where the manifold is described in terms of local parametrizations.

Definition 1.11. Let $M$ and $N$ be two $C^k$-manifolds of dimensions $m$ and $n$ respectively, with $M \subset \mathbb{R}^k$, $N \subset \mathbb{R}^j$. A function $f : M \to N$ is $C^k$-smooth if for all $p \in M$ there exists an open set $U$, $p \in U \subset \mathbb{R}^k$, and a $C^k$-smooth function $\tilde f : U \to \mathbb{R}^j$ such that $\tilde f|_{U \cap M} = f|_{U \cap M}$. Moreover, $Df_p : T_p M \to T_{f(p)} N$ is defined as $Df_p(v) = D\tilde f_p(v)$. An equivalent definition using the chain rule is that $Df_p(v) = (f \circ \gamma)'(0)$ where $\gamma(0) = p$, $\gamma'(0) = v$.

The fact that $Df_p : T_p M \to T_{f(p)} N$ is well-defined follows from the observation that $Df_p(v) = D\tilde f_p \cdot v = (\tilde f \circ \gamma)'(0) = (f \circ \gamma)'(0)$ by the chain rule.

The above notion of smooth function between manifolds is perhaps the most convenient for this

course. But there are other (equivalent) notions. The one most relevant to abstract manifolds comes from charts.


Lemma 1.12. A function $f : M \to N$ is $C^k$-smooth if and only if for every $p \in M$ there are relatively open sets $U \subset M$ and $V \subset N$ with $p \in U$ and $f(U) \subset V$, together with charts $\phi : U' \to U''$ and $\psi : V' \to V''$, where $U', U'' \subset \mathbb{R}^k$ and $V', V'' \subset \mathbb{R}^j$ are open, $U = U' \cap M$, $V = V' \cap N$, such that $\psi \circ f \circ (\phi^{-1}|_{U'' \cap (\mathbb{R}^m \times \{0\})})$ is $C^k$-smooth in the sense of calculus.

Another way to think of Lemma 1.12 is that a function between manifolds is $C^k$ if it is $C^k$-smooth when `put into local coordinates'. $f$ in the local coordinates of $\phi$ and $\psi$ is precisely $\psi \circ f \circ (\phi^{-1}|_{U'' \cap (\mathbb{R}^m \times \{0\})})$.

We mention a few constructions of smooth manifolds and maps of manifolds. We leave the proofs

as exercises for the reader.

Proposition 1.13. 1) Linear functions $L : \mathbb{R}^m \to \mathbb{R}^n$ are smooth, moreover $DL_p(v) = L(v)$ for all $p, v \in \mathbb{R}^m$. Similarly, all the standard functions from calculus such as trig functions, the exponential, etc, are smooth on their domains, with the standard exceptions such as $\mathrm{abs}(x)$ and $\sqrt{x}$, which fail to be smooth at precisely one point of their domains.

2) If $M, N, L$ are three manifolds, with $L \subset M$, and if $f : M \to N$ is smooth, then $f|_L : L \to N$ is smooth.

3) (Chain Rule) If $f : M \to N$ is $C^k$ and $g : N \to L$ is $C^k$ then $g \circ f : M \to L$ is $C^k$, moreover $D(g \circ f)_p = Dg_{f(p)} \circ Df_p$.

4) If $M \subset \mathbb{R}^k$ and $N \subset \mathbb{R}^l$ are $m$ and $n$-manifolds respectively, $M \times N \subset \mathbb{R}^k \times \mathbb{R}^l$ is an $(m + n)$-manifold. Moreover, $T_{(p,q)}(M \times N) = T_p M \times T_q N$. A function $f : P \to M \times N$ can be written in terms of its coordinates: let $\pi_M : M \times N \to M$ and $\pi_N : M \times N \to N$ be the coordinate projections $\pi_M(p, q) = p$, $\pi_N(p, q) = q$. Then $f : P \to M \times N$ is smooth if and only if $\pi_M \circ f$ and $\pi_N \circ f$ are smooth. Moreover, $Df_x(v) = (D(\pi_M \circ f)_x(v), D(\pi_N \circ f)_x(v))$ for $x \in P$ and $v \in T_x P$. A function $f : P \to M \times N$ such that $\pi_M \circ f = g$ and $\pi_N \circ f = h$ is sometimes denoted $f = g \times h$.

5) Polynomial functions $p : \mathbb{R}^n \to \mathbb{R}^m$ are smooth. A polynomial $f : \mathbb{R}^n \to \mathbb{R}$ is any linear combination of functions of the form $(x_1, \cdots, x_n) \mapsto x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}$ where $k_i \in \{0, 1, 2, \cdots\}$ for all $i$. A polynomial function $p : \mathbb{R}^n \to \mathbb{R}^m$ is any function whose coordinate projections $p_i$, where $p = (p_1, \cdots, p_m)$, are polynomials.

Example 1.14. (A standard smoothness argument) The function $f : \mathbb{R}^3 \setminus \{0\} \to \mathbb{R}^4$ given by $f(x, y, z) = \frac{1}{x^2 + y^2 + z^2}(x^2 - y^2, xy, zx, zy)$ is smooth since we can write $f = G \circ ((I \circ L) \times p)$ where $G : \mathbb{R} \times \mathbb{R}^4 \to \mathbb{R}^4$ is scalar multiplication $G(t, v) = tv$, $L : \mathbb{R}^3 \setminus \{0\} \to (0, \infty)$ is given by $L(x, y, z) = x^2 + y^2 + z^2$, $I : (0, \infty) \to (0, \infty)$ is $I(x) = 1/x$ and $p(x, y, z) = (x^2 - y^2, xy, zx, zy)$. $L$, $p$ and $G$ are polynomials, while $I$ is inversion, which is a standard smooth function from calculus. Thus $f$ is a composite of smooth functions.
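The decomposition in Example 1.14 can be checked mechanically. The Python sketch below is an illustration under the following reading, which is an assumption of the sketch: `G` is scalar multiplication from R x R^4 to R^4, `L` is the squared length on R^3 minus the origin, `I` is inversion on the positive reals, and `p` is the polynomial map (x^2 - y^2, xy, zx, zy). Exact rational sample points, chosen here, make the comparison an equality.

```python
from fractions import Fraction


def L(v):
    """Squared length, a polynomial map from R^3 minus the origin to (0, oo)."""
    x, y, z = v
    return x * x + y * y + z * z


def I(t):
    """Inversion on the positive reals."""
    return 1 / t


def p(v):
    """The polynomial map R^3 -> R^4 appearing in the example."""
    x, y, z = v
    return (x * x - y * y, x * y, z * x, z * y)


def G(t, w):
    """Scalar multiplication R x R^4 -> R^4 (bilinear, hence polynomial)."""
    return tuple(t * c for c in w)


def f_composite(v):
    """f written as G o ((I o L) x p)."""
    return G(I(L(v)), p(v))


def f_direct(v):
    """f written out by its defining formula."""
    x, y, z = v
    s = x * x + y * y + z * z
    return ((x * x - y * y) / s, x * y / s, z * x / s, z * y / s)


points = [
    (Fraction(1), Fraction(2), Fraction(3)),
    (Fraction(-1, 2), Fraction(0), Fraction(5)),
]
agree = [f_composite(v) == f_direct(v) for v in points]
```

Writing a map as an explicit composite of building blocks like this is exactly the bookkeeping that the smoothness argument relies on.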

Definition 1.15. We define terminology relevant to $C^k$ smooth functions $f : M \to N$.

• $f$ is a diffeomorphism if it is one-to-one, onto and if $f^{-1} : N \to M$ is $C^k$.

• $f$ is an immersion if $Df_p : T_p M \to T_{f(p)} N$ is one-to-one for all $p \in M$.

• $f$ is a submersion if $Df_p : T_p M \to T_{f(p)} N$ is onto for all $p \in M$.

• $p \in N$ is a regular value of $f$ if for all $q \in f^{-1}(p)$, $Df_q : T_q M \to T_p N$ is an onto linear map.

• $q \in M$ is a critical point of $f$ if $Df_q : T_q M \to T_{f(q)} N$ is not onto.

• $f : M \to N$ is an embedding if $f(M)$ is a manifold, such that $f : M \to f(M)$ is a diffeomorphism. In this situation we say $f(M)$ is a submanifold of $N$.


Example 1.16. • Let $\pi : \mathbb{R}^n \to \mathbb{R}^k$ be any onto linear map; $\pi$ is a submersion.

• $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = (x - 1)^2$. 1 is the only critical point, and 0 is the only point in $\mathbb{R}$ which is not a regular value.

• If $M$ and $N$ are manifolds, the projection maps $\pi_M : M \times N \to M$ and $\pi_N : M \times N \to N$ given by $\pi_M(p, q) = p$ and $\pi_N(p, q) = q$ are submersions.

• If $M \subset \mathbb{R}^k$ is an $m$-manifold, the inclusion map $i_M : M \to \mathbb{R}^k$ given by $i_M(x) = x$ is an immersion.

• The function $f : S^1 \to S^1$ given by $f(z) = z^n$ is an immersion provided $n \in \mathbb{Z}$ and $n \neq 0$. When $n \neq 0$, every point in $S^1$ is a regular value, but when $n = 0$ there is precisely one non-regular value, $(1, 0) \in S^1$.

• Provided $p, q \in \mathbb{R}$ and at least one of $p$ or $q$ is non-zero, the function $\gamma(t) = (e^{ipt}, e^{iqt})$, $\gamma : \mathbb{R} \to S^1 \times S^1$ is an immersion. The function is periodic if and only if one of $p$ or $q$ is a rational multiple of the other. It is a one-to-one immersion if and only if they are not rational multiples of each other, but in this case $\gamma$ is not an embedding. If $p$ and $q$ are rational multiples of each other, the image of $\gamma$ is a submanifold, diffeomorphic to $S^1$.
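The periodic case of the last item can be made explicit when p and q are integers, not both zero: the curve repeats after T = 2 pi / gcd(p, q), and no smaller positive time works. The Python sketch below illustrates this for one example pair of integers chosen here; `minimal_period` and `distance` are hypothetical helper names, and the restriction to integers is an assumption of the sketch (the general rational-multiple case reduces to it by rescaling t).

```python
from cmath import exp
from math import gcd, pi


def gamma(t, p, q):
    """The curve t -> (e^{ipt}, e^{iqt}) in the torus, as a pair of complex numbers."""
    return (exp(1j * p * t), exp(1j * q * t))


def minimal_period(p, q):
    """Smallest T > 0 with gamma(t + T) = gamma(t), for integers p, q not both zero."""
    return 2 * pi / gcd(p, q)


def distance(a, b):
    """Largest coordinate-wise distance between two points of the torus."""
    return max(abs(x - y) for x, y in zip(a, b))


p, q = 4, 6
T = minimal_period(p, q)    # gcd(4, 6) = 2, so T = pi

times = [0.0, 0.3, 1.7]
gaps = [distance(gamma(t + T, p, q), gamma(t, p, q)) for t in times]
half_gap = distance(gamma(T / 2, p, q), gamma(0.0, p, q))
```

The entries of `gaps` should vanish up to rounding, while `half_gap` should not, since at time T/2 the second coordinate has only travelled an odd multiple of half a turn.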

The inverse, implicit and mono-split lemmas extend to manifolds, and are listed below. The proofs

amount to reducing to the Euclidean case via charts.

Lemma 1.17. Assume $f : M \to N$ is $C^k$ for $k \ge 1$, with $M$ and $N$ manifolds of dimensions $m$ and $n$ respectively. Let $p \in M$ and let $Df_p$ have maximal rank, meaning $\mathrm{rank}(Df_p) = \min\{m, n\}$. Then:

• If $m > n$, the Implicit Function Theorem says that there exists a $C^\infty$ function $g : M \to \mathbb{R}^{m-n}$ such that $f \oplus g : M \to N \times \mathbb{R}^{m-n}$ given by $(f \oplus g)(x) = (f(x), g(x))$, when restricted to some relatively open neighbourhood of $p$ in $M$, is a $C^k$ diffeomorphism to its image, which is a relatively open neighbourhood of $(f(p), g(p)) \in N \times \mathbb{R}^{m-n}$.

• If $n = m$, the Inverse Function Theorem says there is a relatively open neighbourhood $U$ of $p$ such that $f|_U$ is a $C^k$ diffeomorphism onto its image, which is a relatively open set in $N$.

• If $m < n$, the Mono-Split Lemma says there exists a function $F : M \times \mathbb{R}^{n-m} \to N$ such that $F|_{M \times \{0\}} = f$, with $DF_{(p,0)}$ an isomorphism; therefore there is a neighbourhood $U$ of $(p, 0)$ in $M \times \mathbb{R}^{n-m}$ such that $F|_U : U \to F(U)$ is a $C^k$-diffeomorphism, with $F(U) \subset N$ relatively open.

Definition 1.18. The tangent bundle of $M$ is $TM = \{(p, v) \in M \times \mathbb{R}^k : v \in T_p M\}$.

Proposition 1.19. If $M \subset \mathbb{R}^k$ is an $m$-manifold, $TM \subset (\mathbb{R}^k)^2$ is a $2m$-manifold.

Proof. By the equivalent definition of a smooth manifold in Note 1.10, if $p \in M$ there exists an open neighbourhood $U$ of $p$ in $\mathbb{R}^k$ and a smooth function $f : U \to \mathbb{R}^{k-m}$ with $f^{-1}(0) = U \cap M$ and $Df_p$ onto.

Think of the derivative as giving a function $Df : U \times \mathbb{R}^k \to (\mathbb{R}^{k-m})^2$ by $Df(p, v) = (f(p), Df_p(v))$. This is a smooth map that satisfies $TM \cap (U \times \mathbb{R}^k) = Df^{-1}(0, 0)$. A computation shows the derivative of $Df$ has full rank, so $TM$ is a manifold.

A rather fundamental example of tangent bundles is the case of vector spaces. In this course we will frequently have use for doing calculus in vector spaces that are not $\mathbb{R}^n$. Of course they will be isomorphic to $\mathbb{R}^n$, but as set-theoretic objects they will not literally be $\mathbb{R}^n$. All the above calculus works perfectly well on finite-dimensional normed vector spaces, and the tangent bundle also makes perfect sense.

Example 1.20. If $V$ is a finite-dimensional normed vector space, $TV = V \times V$. If $U \subset V$ is an open subset then $TU = U \times V$.

Definition 1.21. Given a $C^k$-smooth function $f : M \to N$ we interpret the derivative of $f$ as a function $Df : TM \to TN$ given by $Df(p, v) = (f(p), Df_p(v))$. The bundle projection of the tangent bundle is the map $\pi_{TM} : TM \to M$ given by $\pi_{TM}(p, v) = p$.

Notice that $\pi_{TM}$ is a smooth submersion, and $\pi_{TM}^{-1}(p) = \{p\} \times T_p M$. This is called the fibre of $TM$ over the point $p$. Notice also that if $f : M \to N$ is $C^k$, $Df : TM \to TN$ is $C^{k-1}$. The chain rule has perhaps its most natural extension in terms of the derivative as a map between tangent bundles.

Theorem 1.22. (Chain Rule) Provided $f : M \to N$ and $g : N \to P$ are $C^k$ smooth for $k \ge 1$,

\[ D(g \circ f) = Dg \circ Df. \]

Since $\pi_{TM}^{-1}(p) = \{p\} \times T_p M$, we frequently consider $\pi_{TM}^{-1}(p)$ to be the vector space where the vector space operations are $(p, v) + (p, w) = (p, v + w)$, $t(p, v) = (p, tv)$. This makes the identification $T_p M \equiv \pi_{TM}^{-1}(p)$ an isomorphism of vector spaces.

Given γ : (a, b) → M ⊂ R^k, notice that γ′(t) ∈ T_{γ(t)}M. Thus γ′, as a function, is simply a function of the form γ′ : (a, b) → R^k. This can be viewed as a formal annoyance, in that γ′ is not a map to a geometric object associated to M unless the tangent spaces to M are constant.

In this context, we frequently like to reinterpret the notation of partial derivatives to make it friendly to manifolds. In the setting that γ : (a, b) → N is a smooth map, the partial derivative (in the bundle-theoretic sense) is

\[ \frac{\partial\gamma}{\partial t}(t) = \Big(\gamma(t),\ \lim_{h\to 0}\frac{\gamma(t+h)-\gamma(t)}{h}\Big), \]

i.e. the bundle-theoretic ∂γ/∂t(t) = Dγ(t, 1) ∈ π_N^{-1}(γ(t)) = {γ(t)} × T_{γ(t)}N ⊂ TN, while the traditional ∂γ/∂t(t) ∈ T_{γ(t)}N.

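In these notes, when we use this bundle-theoretic partial derivative notation we will consistently use colour, but this is simply a pedagogical device; the context always makes it clear whether one is using the traditional definition or the bundle-theoretic partial derivative. The traditional partial derivative will be in black and white, so that

\[ \frac{\partial\gamma}{\partial t}(t) = \Big(\gamma(t),\ \frac{\partial\gamma}{\partial t}(t)\Big), \]

with the bundle-theoretic derivative on the left and the traditional derivative on the right.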

Proposition 1.23. M ⊂ N is an m-dimensional submanifold of N ⊂ R^k if and only if for every point p ∈ M there is a (relatively) open neighbourhood U of p in N and a diffeomorphism φ : U → V ⊂ R^n such that φ(M ∩ U) = (R^m × {0}) ∩ V. A map from a manifold f : X → M is C^k-smooth if and only if i ∘ f : X → N is C^k-smooth, where i : M → N is inclusion.

Proof. Exercise.

Example 1.24. If we think of R^{m+1} ≡ R^{m+1} × {0} ⊂ R^{n+1}, then S^m ≡ S^m × {0} ⊂ S^n is a submanifold for all 0 ≤ m ≤ n. Moreover, the function f : S^n → R^{n−m} given by f(x_0, x_1, ···, x_n) = (x_{m+1}, x_{m+2}, ···, x_n) has 0 as a regular value, and f^{-1}(0) = S^m ⊂ S^n.

Definition 1.25. Given a function π : X → Y with X and Y sets, a section of π is a function s : Y → X such that π ∘ s = Id_Y, i.e. π(s(y)) = y ∀y ∈ Y. A smooth section of the tangent bundle π : TN → N is called a vector field on N.

Example 1.26. • On S^2, s(x, y, z) = ((x, y, z), (−y, x, 0)).

• On S^{2n−1} thought of as the unit sphere in C^n, s(p) = (p, ip).

• On R^2, s(x, y) = ((x, y), (x^2, y)).

• Frequently when the context is clear, the base-point of a section is omitted, so the previous example might have been written s(x, y) = x^2 e_1 + y e_2, or s(x, y) = (x^2, y).

Vector fields can be thought of as defining differential equations on a manifold. For example, does there exist γ : (a, b) → N such that ∂γ/∂t(t) = s(γ(t)) for all t? Notice we use the bundle-theoretic partial derivative for this to make sense. A curve γ which satisfies this is called an integral curve of s : N → TN.

Definition 1.27. Given a vector field v : N → TN and a smooth function f : N → R, the directional derivative of f in the direction of the vector field v is Df ∘ v : N → R (here we keep only the vector component of Df ∘ v : N → TR). Directional derivatives are frequently denoted v(f).

It's a good exercise to check that if v and w are two vector fields on an open subset of Euclidean space, and if f is smooth, then w(vf) − v(wf) can be written as [v, w]f for some vector field [v, w]; work out a formula for this vector field. This is true for any two vector fields on an arbitrary manifold as well, and will be one of several topics in the next section.

Note 1.28. Since vector fields give directional derivatives, and every directional derivative comes from a vector field (or at a point, a vector), this has led to certain notational short-hand in the literature. For example, the vector field s(x, y) = (x^2, y) from Example 1.26 is frequently denoted x^2 ∂/∂x + y ∂/∂y. This is especially true in the differential geometry literature, in the notation called Ricci Calculus. The reason for this notation is of course that the formula x^2 ∂/∂x + y ∂/∂y can be thought of as the directional derivative in the direction s(x, y), since if one applies this formula to a function f : R^2 → R one gets x^2 ∂f/∂x + y ∂f/∂y, which is precisely sf. Similarly, the first example from 1.26 might be denoted ∂/∂θ (in polar coordinates), since polar coordinates have the form P(θ, φ) = (cos θ cos φ, sin θ cos φ, sin φ) and ∂P/∂θ = (−sin θ cos φ, cos θ cos φ, 0); that is, if P(θ, φ) = (x, y, z) then ∂P/∂θ = (−y, x, 0).
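The identity ∂P/∂θ = (−y, x, 0) can be checked numerically. The following is a minimal sketch using only the Python standard library; the sample angles, the step size h and all function names are assumptions made for illustration, not part of the notes.

```python
import math

def P(theta, phi):
    """Parametrization of S^2 from Note 1.28."""
    return (math.cos(theta) * math.cos(phi),
            math.sin(theta) * math.cos(phi),
            math.sin(phi))

def s(p):
    """Vector part of the first vector field of Example 1.26."""
    x, y, z = p
    return (-y, x, 0.0)

def dP_dtheta(theta, phi, h=1e-6):
    """Central-difference approximation of the traditional partial derivative."""
    a, b = P(theta + h, phi), P(theta - h, phi)
    return tuple((ai - bi) / (2 * h) for ai, bi in zip(a, b))

theta, phi = 0.7, -0.3                      # arbitrary sample point
p = P(theta, phi)
field_error = max(abs(u - w) for u, w in zip(dP_dtheta(theta, phi), s(p)))
tangency_defect = sum(pi * vi for pi, vi in zip(p, s(p)))   # p . s(p)
print(field_error, tangency_defect)
```

The second printed number is the dot product p · s(p), which vanishes because s(p) lies in the tangent space T_pS^2.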

Structure of the tangent bundle

Definition 1.29. The set of all vector fields on a manifold N is

\[ \Gamma(TN) = \{ v : N \to TN \ \text{a vector field} \}. \]

Given two vector fields v, w ∈ Γ(TN) one can add them, v + w ∈ Γ(TN). The idea is that if p ∈ N, both v(p), w(p) ∈ π_N^{-1}(p) = {p} × T_pN. Since the tangent space is a vector space, we can add v(p) and w(p) there. We mention some natural formalism that ensures v + w is smooth when v and w both are.

Proposition 1.30. If N is a manifold, then

\[ TN \oplus TN = \{ (p, v, w) : p \in N,\ v, w \in T_pN \} \]

is a manifold. It is called the fibrewise direct sum of TN with itself. There are smooth maps

π_⊕ : TN ⊕ TN → N,   (p, v, w) ↦ p
π_1 : TN ⊕ TN → TN,   (p, v, w) ↦ (p, v)
π_2 : TN ⊕ TN → TN,   (p, v, w) ↦ (p, w)

and smooth maps

+ : TN ⊕ TN → TN,   (p, v, w) ↦ (p, v + w)
· : R × TN → TN,   (a, (p, v)) ↦ (p, av)
i : TN ⊕ TN → TN × TN,   (p, v, w) ↦ ((p, v), (p, w))

These maps satisfy the properties:

(a) If we let ∆ : N → N × N be the diagonal embedding ∆(p) = (p, p) and π_{N×N} : TN × TN ≡ T(N × N) → N × N be the bundle projection map, then i is a diffeomorphism between TN ⊕ TN and π_{N×N}^{-1}(∆(N)).

(b) The maps `+' and `·' above give π_N^{-1}(p) ⊂ TN the structure of vector spaces for all p ∈ N.

(c) If v, w ∈ Γ(TN) are vector fields, and v × w : N × N → TN × TN is the product map, then + ∘ i^{-1} ∘ (v × w) ∘ ∆ : N → TN is the vector field v + w on N.

(d) Given any two smooth maps f, g : M → TN, provided π_N ∘ f = π_N ∘ g there exists a unique smooth map f ⊕ g : M → TN ⊕ TN such that π_1 ∘ (f ⊕ g) = f and π_2 ∘ (f ⊕ g) = g. Explicitly, f ⊕ g = i^{-1} ∘ (f × g) ∘ ∆ where f × g : M × M → TN × TN is given by (f × g)(p, q) = (f(p), g(q)).

Proof. Exercise.

The Double Tangent Bundle

It takes some adjustment to think of the derivative as a map of bundles Df : TM → TN. But it is also one of the most convenient notions for the derivative on manifolds. In particular, in this notation the second derivative D^2f : T(TM) → T(TN) is defined. The purpose of this is that with this setup, second derivatives not only make sense, but they make sense in a manner that a resident of the manifold can conceive of.

To start making sense of the last statement in the previous paragraph, recall that if p ∈ M, T_pM consists of all the velocity vectors of the curves in M that pass through p. So if (p, v) ∈ TM, tautologically, T_{(p,v)}TM consists of the velocity vectors of all curves in TM that pass through (p, v), which are themselves velocity vectors of curves in M. So we should expect double tangent vectors to represent the mixed partials of 2-variable functions to M. The next proposition makes this explicit.

Proposition 1.31. Any point in T^2M can be realized as ∂²γ/∂b∂a (0, 0) for some smooth function γ : (−ε, ε)^2 → M. Here we use a and b to denote the variables of γ(a, b).

Example 1.32. Consider f : R → R a C^∞-smooth map. We will compute D^2f : T^2R → T^2R. Let's reserve the notation f′ : R → R for the calculus notion of derivative, i.e.

\[ f'(x) = \lim_{t\to 0}\frac{f(x+t)-f(x)}{t} = \frac{\partial f}{\partial x}(x). \]

We identify TR with R^2, and T^2R with R^4, giving

\[ Df(x, v) = (f(x),\ f'(x)\cdot v) \]

\[ D^2f(x, v, w, y) = (f(x),\ f'(x)\cdot v,\ f'(x)\cdot w,\ f''(x)\cdot wv + f'(x)\cdot y). \]
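As a sanity check on the formula for D^2f, one can differentiate the map Df numerically. The sketch below assumes the concrete choice f = sin (so f′ = cos and f″ = −sin) and an arbitrary sample point; both are illustrative assumptions.

```python
import math

def Df(x, v):
    # Derivative of f = sin as a bundle map TR -> TR, with TR identified with R^2.
    return (math.sin(x), math.cos(x) * v)

def D2f_formula(x, v, w, y):
    # Formula of Example 1.32 with f = sin, f' = cos, f'' = -sin.
    return (math.sin(x), math.cos(x) * v,
            math.cos(x) * w, -math.sin(x) * w * v + math.cos(x) * y)

def D2f_numeric(x, v, w, y, h=1e-5):
    # Differentiate t -> Df(x + t w, v + t y) at t = 0 by a central difference.
    plus, minus = Df(x + h * w, v + h * y), Df(x - h * w, v - h * y)
    rate = tuple((a - b) / (2 * h) for a, b in zip(plus, minus))
    return Df(x, v) + rate          # base point (2 entries) followed by the rate of change

point = (0.4, 1.5, -2.0, 0.3)       # (x, v, w, y), chosen arbitrarily
d2f_error = max(abs(a - b) for a, b in zip(D2f_formula(*point), D2f_numeric(*point)))
print(d2f_error)
```

The first two coordinates are the base point Df(x, v) ∈ TR, and the last two are the velocity of Df along the direction (w, y).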

Given f : U → R^n with U ⊂ R^m open, Df : TU → TR^n is a map of bundles, and Df_p : R^m → R^n is linear. Df_p can be thought of as a point in the space of linear functions R^m → R^n, denoted Lin(R^m, R^n) or sometimes Hom(R^m, R^n). By defining f + g ∈ Lin(R^m, R^n) when f, g ∈ Lin(R^m, R^n) by (f + g)(v) = f(v) + g(v), Lin(R^m, R^n) becomes a vector space. Interpreting the derivative as a map Df : U → Lin(R^m, R^n) is called taking the adjoint of Df. This gives another formalism in which the derivative can be further differentiated.

Given a function f : M → N, its derivative, for the purposes of manifold theory, is a function Df : TM → TN and its second derivative is a function D^2f : T^2M → T^2N, where T^2M = T(TM) is the double tangent bundle, i.e. the tangent bundle of the tangent bundle. The notation of the double tangent bundle has the potential to introduce some confusion, but it also allows us to identify some basic phenomena, so we spend some time discussing its basic structure and symmetries here. Notice the formula in the computation of D^2f in Example 1.32 had redundancies. Phenomena such as that will be explained.

If N ⊂ R^k is a manifold we've seen its tangent bundle is a manifold,

\[ TN = \{ (p, v) : p \in N,\ v \in T_pN \} \subset \mathbb{R}^k \times \mathbb{R}^k. \]

The role of the tangent bundle is to describe motions of points on a manifold, and so it is the natural domain and range for the derivative. The double tangent bundle captures the infinitesimal motions of tangent vectors. By design,

\[ T^2N = T(TN) = \{ (p, v, w, y) : p \in N,\ v \in T_pN,\ (w, y) \in T_{(p,v)}TN \} \subset (\mathbb{R}^k)^4. \]

Example 1.33. Since S^n was f^{-1}(1) for the function f : R^{n+1} → R given by f(p) = p · p (dot product), and 1 is a regular value of f with Df_p(v) = 2p · v, we have

\[ TS^n = \{ (p, v) : p\cdot p = 1,\ p\cdot v = 0 \}. \]

In particular, TS^n = g^{-1}(1, 0) where g : R^{n+1} × R^{n+1} → R^2 is given by g(p, v) = (p · p, p · v). Since (1, 0) is a regular value of g and Dg_{(p,v)}(w, y) = (2p · w, p · y + w · v) we have

\[ T^2S^n = \{ (p, v, w, y) : p\cdot p = 1,\ p\cdot v = 0,\ p\cdot w = 0,\ p\cdot y + w\cdot v = 0 \}. \]

We wish to generalize Example 1.33 but in order to do this we need more notation from multi-

variable calculus.

Definition 1.34. Given a smooth function f : U → R^n, where U ⊂ R^m is open, Df : U → Lin(R^m, R^n) is also smooth, and D(Df)_p : R^m → Lin(R^m, R^n) is therefore adjoint to a unique map called the Hessian of f,

\[ Hf : U \to \mathrm{BiLin}(\mathbb{R}^m \times \mathbb{R}^m, \mathbb{R}^n) \]

given by Hf_p(v, w) = D(Df)_p(v)(w). By design Hf_p is bilinear, in that Hf_p(av_1 + bv_2, w) = aHf_p(v_1, w) + bHf_p(v_2, w) and Hf_p(v, aw_1 + bw_2) = aHf_p(v, w_1) + bHf_p(v, w_2). BiLin(R^m × R^m, R^n) denotes the vector space of all bi-linear functions R^m × R^m → R^n.

Example 1.35. If f : U → R^n is smooth with U ⊂ R^m open, Hf_p : R^m × R^m → R^n is the bilinear function

\[ Hf_p(v, w) = \sum_{i,j=1}^{m} \frac{\partial^2 f}{\partial e_i \partial e_j}(p)\, v_i w_j \]

where v = (v_1, ···, v_m), w = (w_1, ···, w_m). Since mixed partials agree for C^2-smooth functions, the Hessian is a symmetric function, Hf_p(v, w) = Hf_p(w, v) for all p, v, w.

In general, if f : U → R^n is smooth, the function Df : TU → TR^n given by (p, v) ↦ (f(p), Df_p(v)) has derivative D^2f : T^2U → T^2R^n given by

\[ D^2f(p, v, w, y) = (f(p),\ Df_p(v),\ Df_p(w),\ Hf_p(v, w) + Df_p(y)). \]


Proposition 1.36 generalizes Example 1.33.

Proposition 1.36. Let N = f^{-1}(0) where f : U → R^j is smooth, 0 a regular value of f, U ⊂ R^k open. Then

\[ TN = \{ (p, v) \in U \times \mathbb{R}^k : f(p) = 0,\ Df_p(v) = 0 \}; \]

moreover, F : TU → (R^j)^2 defined by F(p, v) = (f(p), Df_p(v)) has (0, 0) ∈ (R^j)^2 as a regular value, and

\[ T^2N = \{ (p, v, w, y) \in U \times (\mathbb{R}^k)^3 : f(p) = 0,\ Df_p(v) = Df_p(w) = 0,\ Hf_p(v, w) + Df_p(y) = 0 \}. \]

Note 1.37. The tangent bundle TN has its projection map π_N : TN → N which gives the base-point of a tangent vector. This map is smooth, so it has a derivative Dπ_N : T^2N → TN. Thinking of T^2N as the tangent bundle of TN, T^2N = T(TN), we have another map π_TN : T^2N → TN.

Dπ_N : T^2N → TN,   (p, v, w, y) ↦ (p, w)
π_TN : T^2N → TN,   (p, v, w, y) ↦ (p, v)

Notice that both Dπ_N and π_TN restrict to linear maps T_{(p,v)}TN → T_pN for all (p, v) ∈ TN. Between these two maps, the one thing that isn't `seen' from T^2N is the vector y. y is what measures the infinitesimal rate of change of a tangent vector (forgetting about the vector's base-point).

Proposition 1.38. The map ι_N : T^2N → T^2N given by ι_N(p, v, w, y) = (p, w, v, y) is well defined and smooth. In particular Dπ_N ∘ ι_N = π_TN and π_TN ∘ ι_N = Dπ_N. Moreover, if f : M → N is smooth, ι_N ∘ D^2f = D^2f ∘ ι_M.

Proof. That ι_N is well-defined follows from Proposition 1.36.

There is a certain amount of symmetry in the double tangent bundle. The above involution is part of the story. For example, in the spirit of Proposition 1.31, the fixed points of ι_N, as a subset of T^2N, are precisely the double tangent vectors that can be realized as 2nd order derivatives of curves ∂²γ/∂t²(0), where γ : (−ε, ε) → N.

Another way to describe ι_N is if f : (−ε, ε)^2 → N is a C^2-smooth function defined on a neighbourhood of the origin in R^2, and if we denote the variables x and y, i.e. f(x, y), then

\[ \frac{\partial^2 f}{\partial x \partial y}(0, 0) = \iota_N\Big(\frac{\partial^2 f}{\partial y \partial x}(0, 0)\Big). \]

For this reason one might want to call ι_N the Clairaut involution of T^2N, as it is a reflection of Clairaut's theorem in the double tangent bundle notation.

Definition 1.39. (Lie brackets) Given two vector fields v, w : N → TN, the two functions

\[ Dv \circ w \quad\text{and}\quad \iota_N \circ Dw \circ v \ :\ N \to T^2N \]

both send a point p into the vector space T_{v(p)}TN. This means the difference Dv ∘ w − ι_N ∘ Dw ∘ v is well defined. This difference is an element of ker(Dπ) where π : TN → N is bundle projection. ker(Dπ) has a canonical identification with TN ⊕ TN, and we let π_2 : ker(Dπ) → TN be projection onto the second factor. Then

\[ [v, w] = \pi_2 \circ (Dv \circ w - \iota_N \circ Dw \circ v). \]

To elaborate a little on Definition 1.39, write v(p) = (p, v⃗(p)) and w(p) = (p, w⃗(p)), so that v⃗ and w⃗ denote the vector parts of v and w. Notice that

\[ (Dv \circ w)(p) = (p,\ \vec v(p),\ \vec w(p),\ D\vec v_p(\vec w(p))) \]

and

\[ \iota_N(Dw \circ v)(p) = (p,\ \vec v(p),\ \vec w(p),\ D\vec w_p(\vec v(p))), \]

so

\[ (Dv \circ w)(p) - \iota_N(Dw \circ v)(p) = (p,\ \vec v(p),\ 0,\ D\vec v_p(\vec w(p)) - D\vec w_p(\vec v(p))). \]

Moreover,

\[ \ker(D\pi) = \{ (p, v, 0, y) : p \in N,\ v \in T_pN,\ y \in T_pN \}, \]

giving the identification

\[ \ker(D\pi) \equiv TN \oplus TN, \qquad \ker(D\pi) \ni (p, v, 0, y) \longmapsto (p, v, y) \in TN \oplus TN \]

and π_2(p, v, y) = (p, y).

The Lie bracket is explored in Chapter 2.
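For vector fields on an open subset of Euclidean space, the coordinate formula above can be tested numerically. The sketch below uses central differences on R^2; the fields v and w, the test function f, the sample point and the step size are all illustrative assumptions. It uses the sign convention of these notes, in which [v, w]f = w(vf) − v(wf).

```python
def v(p):
    x, y = p
    return (x * x, y)          # vector part of s(x, y) = (x^2, y) from Example 1.26

def w(p):
    x, y = p
    return (-y, x)             # assumed second field: rotation about the origin

def f(p):
    x, y = p
    return x * y * y           # assumed test function

def directional(g, p, u, h=1e-4):
    """Central-difference derivative of g at p in direction u (g scalar- or tuple-valued)."""
    plus = g((p[0] + h * u[0], p[1] + h * u[1]))
    minus = g((p[0] - h * u[0], p[1] - h * u[1]))
    if isinstance(plus, tuple):
        return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))
    return (plus - minus) / (2 * h)

def bracket(p):
    """Vector part of [v, w](p) = Dv_p(w(p)) - Dw_p(v(p))."""
    a = directional(v, p, w(p))
    b = directional(w, p, v(p))
    return (a[0] - b[0], a[1] - b[1])

vf = lambda q: directional(f, q, v(q))      # the function v(f)
wf = lambda q: directional(f, q, w(q))      # the function w(f)

p = (0.8, -0.5)
lhs = directional(vf, p, w(p)) - directional(wf, p, v(p))   # w(vf) - v(wf) at p
rhs = directional(f, p, bracket(p))                          # [v, w] f at p
bracket_gap = abs(lhs - rhs)
print(bracket(p), bracket_gap)
```

For these particular fields a hand computation gives [v, w](x, y) = (−2xy + y, x − x^2), which the numerical bracket reproduces up to rounding.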

Additional results from calculus

The inverse and implicit function theorems are fundamental to manifold theory, in that they anticipate the subject and their usage is everywhere in the subject. We cover a strengthening of the inverse function theorem here, known as Kantorovich's theorem. We also describe various classical perspectives on the implicit and inverse function theorems.

Definition 1.40. If L : R^n → R^m is a linear function, the operator norm of L is defined to be

\[ \|L\| = \max\{ |L(x)| : x \in \mathbb{R}^n,\ |x| = 1 \}. \]

The key property of the norm is that |L(x)| ≤ ||L|| · |x| for all x ∈ R^n; moreover, ||L|| is the minimum number K ∈ R so that |L(x)| ≤ K · |x| for all x. So it tells you the maximum amount of `stretching' that happens when applying a linear function. If m = n and L is invertible, by design |L^{-1}(x)| ≤ ||L^{-1}|| · |x|, so we can conclude that

\[ \frac{1}{\|L^{-1}\|}\cdot|x| \le |L(x)| \le \|L\|\cdot|x|. \]

So in particular if B_r is the ball of radius r centred at 0, then L(B_r) ⊂ B_{||L|| r} and B_{r/||L^{-1}||} ⊂ L(B_r).

Given a matrix A, the operator norm of A is denoted ||A||. This is the operator norm of the linear map v ↦ Av. It's an elementary argument that ||A|| ≤ |A|, where |A| is the Euclidean length of A thought of as a vector of its entries.

We give an example of what the norm does for calculus, the analogous theorem to the above observation, but for smooth functions instead of linear functions.
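Before moving to smooth functions, the linear inequalities above can be illustrated numerically. This is a rough sketch for a 2 × 2 matrix; the matrix A, the sampling resolution and the helper names are assumptions. Sampling the unit circle only gives a (slightly low) estimate of the operator norm, not an exact value.

```python
import math

A = ((2.0, 1.0),
     (0.0, 3.0))               # assumed invertible 2x2 example

def apply(M, x):
    return (M[0][0] * x[0] + M[0][1] * x[1], M[1][0] * x[0] + M[1][1] * x[1])

def inverse(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def operator_norm(M, samples=20000):
    # Estimate max |Mx| over the unit circle by sampling.
    return max(math.hypot(*apply(M, (math.cos(t), math.sin(t))))
               for t in (2 * math.pi * k / samples for k in range(samples)))

norm_A = operator_norm(A)
norm_A_inv = operator_norm(inverse(A))
entry_norm = math.sqrt(sum(a * a for row in A for a in row))   # |A|

x = (0.6, -0.8)                # a unit vector
length_Ax = math.hypot(*apply(A, x))
print(1 / norm_A_inv, length_Ax, norm_A, entry_norm)
```

The four printed numbers should appear in increasing order, reflecting |x|/||A^{-1}|| ≤ |Ax| ≤ ||A|| · |x| ≤ |A| · |x| for the unit vector x.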

Proposition 1.41. Provided f : U → R^n is C^1, U ⊂ R^n an open ball centred at p ∈ U with radius R, and K is a Lipschitz constant for Df, i.e.

\[ \|Df_x - Df_y\| \le K|x - y| \]

for all x, y ∈ U, let

\[ M = \sup\{ \|Df_x\| : x \in U \}; \]

then

\[ (1)\qquad |f(q) - f(p)| \le M\cdot|q - p| \]

for all q ∈ U. Provided Df_p is invertible,

\[ (2)\qquad \frac{1}{2\cdot\|Df_p^{-1}\|}\cdot|q - p| \le |f(q) - f(p)| \]

is true for all q ∈ U, provided we further require R ≤ 1/(2 · K · ||Df_p^{-1}||). We further claim that all points within distance 1/(4 · K · ||Df_p^{-1}||^2) of f(p) are in the image of f.

Proof. For part (1), let γ(t) = p + t(q − p), γ : [0, 1] → U. Then

\[ f(q) - f(p) = \int_0^1 (f\circ\gamma)'(t)\,dt \]

so

\[ |f(q) - f(p)| = \Big|\int_0^1 Df_{\gamma(t)}(\gamma'(t))\,dt\Big| \le \int_0^1 |Df_{\gamma(t)}(\gamma'(t))|\,dt \le \int_0^1 \|Df_{\gamma(t)}\|\cdot|\gamma'(t)|\,dt \le M\int_0^1 |\gamma'(t)|\,dt = M|q - p|. \]

For the lower bound (2), the idea is essentially the same. Starting from

\[ f(q) - f(p) = \int_0^1 Df_{\gamma(t)}(\gamma'(t))\,dt \]

we write

\[ Df_{\gamma(t)}\cdot\gamma' = Df_p\cdot\gamma' + (Df_{\gamma(t)} - Df_p)\cdot\gamma' \]

giving

\[ f(q) - f(p) = \int_0^1 Df_p\cdot\gamma'\,dt + \int_0^1 (Df_{\gamma(t)} - Df_p)\cdot\gamma'\,dt. \]

Applying the length and the triangle inequality gives

\[ |f(q) - f(p)| \ge |Df_p\cdot(q - p)| - \Big|\int_0^1 (Df_{\gamma(t)} - Df_p)\cdot\gamma'\,dt\Big|, \]

but |Df_p · (q − p)| ≥ (1/||Df_p^{-1}||) |q − p|, and

\[ \Big|\int_0^1 (Df_{\gamma(t)} - Df_p)\cdot\gamma'\,dt\Big| \le \int_0^1 \|Df_{\gamma(t)} - Df_p\|\cdot|\gamma'|\,dt \le \int_0^1 K|\gamma(t) - p|\cdot|\gamma'|\,dt \le K|q - p|^2, \]

giving

\[ |f(q) - f(p)| \ge \Big(\frac{1}{\|Df_p^{-1}\|} - K|q - p|\Big)\cdot|q - p|. \]

The function h : R → R given by h(x) = (a − bx)x for a, b > 0 obtains its maximum at x = a/(2b), and a − bx ≥ a/2 whenever x ≤ a/(2b); thus

\[ |f(q) - f(p)| \ge \frac{1}{2\|Df_p^{-1}\|}\,|q - p| \]

provided |q − p| ≤ 1/(2K||Df_p^{-1}||). The statement that the image of f contains all points within distance 1/(4K||Df_p^{-1}||^2) of f(p) follows from the standard proofs of the inverse function theorem. Alternatively, Kantorovich's theorem (next) tells us when Newton's method can be used to solve the equation f(q) = x.

Theorem 1.42. (Kantorovich) Let f : U → R^n be a C^1-smooth map, with U ⊂ R^n open. Let K be a Lipschitz constant for Df, i.e.

\[ \|Df_x - Df_y\| \le K|x - y| \]

for all x, y ∈ U. If Df_p is invertible, and if

\[ |f(p)|\cdot\|Df_p^{-1}\|^2\cdot K \le \frac{1}{2} \]

then the equation f(x) = 0 has a unique solution in the closure of the set

\[ U_0 = \{ q \in \mathbb{R}^n : |q - p_1| < |Df_p^{-1}(f(p))| \}, \qquad p_1 = p - Df_p^{-1}(f(p)), \]

provided the closure of U_0 is contained in U. Moreover, Newton's Method starting at p converges to this solution, where `Newton's method' means the sequence p_i = g(p_{i−1}) with p_0 = p, and

\[ g(x) = x - Df_x^{-1}(f(x)). \]

From the perspective of computation, the operator norm is sometimes difficult. If we let A = [a_{i,j}] be the matrix representing a linear map L : R^n → R^n, then ||L|| ≤ |A| where |A| = √(∑_{i,j} a_{i,j}^2). Also, notice that if f is a C^2-differentiable map, then a Lipschitz constant for Df is given by any upper bound on

\[ \sqrt{\sum_{i,j=1}^{n} \Big|\frac{\partial^2 f}{\partial e_i \partial e_j}\Big|^2} \]

on U.

Kantorovich's theorem is also true if one replaces the set U with U_0 in the Lipschitz constraint. Moreover, when |f(p)| · ||Df_p^{-1}||^2 · K < 1/2, one can ensure Newton's method superconverges, meaning the error term goes to zero at a much faster than exponential rate. This allows one to find solutions to smooth systems of equations in a computationally-effective manner. See the Hubbard reference for details.
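To see the hypothesis of Kantorovich's theorem and the fast convergence of Newton's method in a concrete case, here is a small one-variable sketch. The function f(x) = x^2 − 2, the starting point p = 1.5 and the number of steps are assumptions chosen purely for illustration.

```python
import math

def f(x):
    return x * x - 2.0          # assumed example; its positive zero is sqrt(2)

def df(x):
    return 2.0 * x

p = 1.5
K = 2.0                         # |f'(x) - f'(y)| = 2|x - y|, so K = 2 is a Lipschitz constant
kantorovich_quantity = abs(f(p)) * (1.0 / abs(df(p))) ** 2 * K

def newton(x, steps):
    history = [x]
    for _ in range(steps):
        x = x - f(x) / df(x)    # g(x) = x - Df_x^{-1}(f(x))
        history.append(x)
    return history

iterates = newton(p, 5)
errors = [abs(x - math.sqrt(2.0)) for x in iterates]
print(kantorovich_quantity, errors)
```

Here the quantity |f(p)| · ||Df_p^{-1}||^2 · K is well below 1/2, and the printed errors shrink roughly quadratically from one step to the next until they reach machine precision.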

Example 1.43. Consider the `squaring' function s : M_{n,n} → M_{n,n} given by s(A) = A^2. Since Ds_A(H) = AH + HA, which for A = I is the isomorphism Ds_I(H) = 2H, Proposition 1.41 can be used to determine a lower bound on the radius R such that s^{-1} is defined and smooth on {A ∈ M_{n,n} : |A − I| < R}. For this we need to know K and ||Ds_I^{-1}||. Since Ds_I is multiplication by 2, ||Ds_I^{-1}|| = 1/2.

To find K we need a Lipschitz constant for Ds. (Ds_A − Ds_B)H = (A − B)H + H(A − B). A little argument should convince you that ||Ds_A − Ds_B|| ≤ 2||A − B||, i.e. the Lipschitz constant is K = 2. Thus 1/(2 · K · ||Ds_I^{-1}||) = 1/2 and 1/(4 · K · ||Ds_I^{-1}||^2) = 1/2, i.e. the image under the squaring function of the ball of radius 1/2 centred at I contains the ball of radius 1/2 centred at I. In particular, there is a smooth `square root function' s^{-1} such that s^{-1}(I) = I, and its domain at least contains the ball of radius 1/2 about the identity matrix.
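A square root near the identity can also be produced explicitly. The sketch below is not the argument used in these notes: it swaps in the binomial series √(I + E) = ∑_k binom(1/2, k) E^k, which converges when E is small, and restricts to 2 × 2 matrices. The sample matrix A and the number of terms are assumptions.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def sqrt_near_identity(A, terms=60):
    """Approximate a square root of A = I + E, E small, by sum_k binom(1/2, k) E^k."""
    E = [[A[i][j] - (1.0 if i == j else 0.0) for j in range(2)] for i in range(2)]
    result = [[0.0, 0.0], [0.0, 0.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]      # E^0
    coeff = 1.0                            # binom(1/2, 0)
    for k in range(terms):
        for i in range(2):
            for j in range(2):
                result[i][j] += coeff * power[i][j]
        power = matmul(power, E)           # E^(k+1)
        coeff *= (0.5 - k) / (k + 1)       # binom(1/2, k+1) from binom(1/2, k)
    return result

A = [[1.2, 0.1], [-0.15, 0.9]]             # assumed sample with |A - I| < 1/2
S = sqrt_near_identity(A)
S2 = matmul(S, S)
square_error = max(abs(S2[i][j] - A[i][j]) for i in range(2) for j in range(2))
print(S, square_error)
```

The matrix S satisfies S^2 ≈ A and is close to I, consistent with the existence of the `square root function' on the ball of radius 1/2 about the identity.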

One nds various classical versions of the implicit function theorem in calculus textbooks. One

fairly standard one is this.

Theorem 1.44. (Implicit Function Theorem) Let f : U → R^m be C^k-smooth, with U ⊂ R^n open and k ≥ 1. If for some p ∈ U the m × m matrix

\[ \Big[\frac{\partial f_i}{\partial e_j}(p)\Big]_{i,j=1,\cdots,m} \]

is invertible, where f(x_1, ···, x_n) = (f_1(x_1, ···, x_n), ···, f_m(x_1, ···, x_n)), and if we let p = (p_1, ···, p_n), then in some neighbourhood of p, all solutions to the equation f(x) = f(p) can be written as

\[ (g_1(x_{m+1}, \cdots, x_n),\ g_2(x_{m+1}, \cdots, x_n),\ \cdots,\ g_m(x_{m+1}, \cdots, x_n),\ x_{m+1},\ x_{m+2},\ \cdots,\ x_n) \]

where g = (g_1, ···, g_m) is a smooth function defined in a neighbourhood of (p_{m+1}, ···, p_n) ∈ R^{n−m}. So in particular, (p_1, ···, p_m) = g(p_{m+1}, ···, p_n).

The above variation of the implicit function theorem follows directly from Theorem 1.5 and some

basic observations from linear algebra.

Lemma 1.45. Let V ⊂ R^n be a j-dimensional vector subspace of R^n. Then there exist integers 1 ≤ i_1 < i_2 < ··· < i_j ≤ n such that orthogonal projection π : R^n → R^j onto the subspace spanned by e_{i_1}, ···, e_{i_j}, given by π(x_1, ···, x_n) = (x_{i_1}, ···, x_{i_j}), restricts to an isomorphism π|_V : V → R^j.

Proof. If v_1, ···, v_j is a basis for V, this claim immediately reduces to the claim that an n × j matrix (in our case, the matrix consisting of the column vectors v_1, ···, v_j) has rank j if and only if a j × j subdeterminant is non-zero.

To see how this implicit function theorem follows from Theorem 1.5, notice that the tangent space to f^{-1}(f(p)) is by design (n − m)-dimensional, since T_p(f^{-1}(f(p))) is the kernel of Df_p. This is all the vectors in R^n orthogonal to the gradients ∇f_i(p) for i = 1, ···, m. But since the left m × m submatrix of Df_p is invertible, this means orthogonal projection from T_p(f^{-1}(f(p))) onto the subspace spanned by e_{m+1}, ···, e_n is an isomorphism. So one deduces this version of the implicit function theorem from Theorem 1.5 by first deducing that f^{-1}(f(p)) is a manifold, and then applying the inverse function theorem to the orthogonal projection onto the subspace spanned by e_{m+1}, ···, e_n; this gives us the function g.

Critical points of real-valued functions

Definition 1.46. Given a smooth function f : M → N, a point p ∈ M is a critical point of f if Df_p : T_pM → T_{f(p)}N is not an onto linear function. A point q ∈ N is a critical value of f if some p ∈ f^{-1}(q) is a critical point. A point p ∈ M that is not a critical point is called a regular point and q ∈ N that is not a critical value is a regular value.

The purpose of the above definition is to be useful in describing the generic properties of smooth functions, which is initiated by Sard's Theorem (below). Notice that if dim(N) > dim(M), if q ∉ img(f) then q is a regular value of f. But if q ∈ img(f) then q can not be a regular value of f. So the regular values of f : M → N when dim(M) < dim(N) are precisely the points of N that are not in the image of f, N \ img(f).

Some authors have slightly different notions of critical points. One somewhat common alternative definition would be to say p ∈ M is a critical point provided rank(Df_p) < min{dim(M), dim(N)}. The advantage of this definition is that functions can have both critical points and regular points regardless of the dimensions of M and N. But we will stick with Definition 1.46.

Definition 1.47. A subset X ⊂ R^n has Lebesgue measure zero provided for every ε > 0 there is a cover of X by intervals, X ⊂ ∪_{i=1}^∞ I_i, such that ∑_{i=1}^∞ μ(I_i) < ε. For the purposes of this definition, an interval in R^n is a set of the form I = ∏_{i=1}^n [a_i, b_i] where a_i ≤ b_i for all i, and μ(I) = ∏_{i=1}^n (b_i − a_i).

Similarly, if M is a manifold and X ⊂ M, one says X has measure zero in M analogously, but the `intervals' in M are slightly different. We say X has measure zero in M provided for any ε > 0 there are sets J_i with X ⊂ ∪_{i=1}^∞ J_i and charts φ_i : U_i → V_i ⊂ R^m with U_i ⊂ M relatively open and J_i ⊂ U_i, such that ∑_{i=1}^∞ μ(φ_i(J_i)) < ε. Equivalently X ⊂ M has measure zero in M if it is a countable union of sets which correspond to measure zero sets in R^m under charts of M.

The fact that the above definition does not depend on the choice of charts boils down to proving that differentiable maps are locally Lipschitz, and the image of a measure zero set under a Lipschitz map is also measure zero.

Lemma 1.48. The union of countably-many measure zero sets (in a manifold or in Rn) also has

measure zero.

Theorem 1.49. (Sard's Theorem) Given f : M → N C^k-smooth, if we let reg(f) ⊂ N be the regular values of f, then provided k ≥ max{dim(M) − dim(N) + 1, 1}, N \ reg(f) has measure zero in N.

Notice this implies that the regular values of f are dense in N. Moreover, Lemma 1.48 implies that the intersection of the regular values of a countable collection of smooth functions f_i : M → N with i ∈ {1, 2, 3, ···} is also dense in N.

Taylor approximations

The derivative is motivated by finding first order approximations to smooth functions. Taylor polynomials are motivated by finding higher order approximations.

Definition 1.50. We say two functions f, g : U → R^n agree to order k at p ∈ U ⊂ R^m if for any ε > 0 there exists δ > 0 such that |f(x) − g(x)| ≤ ε · |x − p|^k whenever |x − p| < δ.

Lemma 1.51. Provided the two functions f, g : U → R^n are C^k, they agree to order k at p ∈ U ⊂ R^m if and only if all their partial derivatives of order i ≤ k agree at p.

Proof. Without loss of generality assume g is the constant zero function, p = 0 and k ≥ 1. Given x ∈ U let h_x : [0, 1] → R^n be defined by h_x(t) = f(tx). Then

\[ f(x) = h_x(1) - h_x(0) = \int_0^1 \frac{\partial h_x}{\partial t}\,dt = \int_0^1 Df_{tx}(x)\,dt. \]

By the intermediate value theorem, f(x) = Df_{ax}(x) for some a ∈ [0, 1], thus if |f(x)| < ε|x| then |Df_{ax}(x)| < ε|x| and so Df_0 = 0.

Similarly, ∂²h_x/∂t² = Hf_{tx}(x, x) and so

\[ f(x) = \int_0^1\!\!\int_0^t Hf_{wx}(x, x)\,dw\,dt, \]

so the result again follows from the mean value theorem. The result proceeds by induction.

Lemma 1.51 gives the key to deciding what one should approximate by, since we can readily manufacture polynomial functions with prescribed partial derivatives.

Definition 1.52. The k-th degree Taylor approximation to f : U → R^n at p ∈ U ⊂ R^m is the function

\[ T^k_p(x) = \sum_{|I| \le k} \frac{1}{I!}\, D^I f_p\, (x - p)^I \]

where in the sum, I ranges over all multi-indices of degree no greater than k.

A multi-index of order j in m variables is an m-tuple I = (i_1, i_2, ···, i_m) where i_a ∈ {0, 1, 2, 3, ···} for all a ∈ {1, 2, ···, m} and i_1 + i_2 + ··· + i_m = j; we write |I| = j. The purpose of the multi-index is to describe arbitrary partial derivatives:

\[ D^I f_p = \frac{\partial^{i_1}\partial^{i_2}\cdots\partial^{i_m} f}{\partial e_1^{i_1}\,\partial e_2^{i_2}\cdots\partial e_m^{i_m}}(p). \]

Similarly the notation I! = i_1! i_2! ··· i_m! is the multi-index factorial, and given v = (v_1, v_2, ···, v_m) ∈ R^m, v^I = v_1^{i_1} v_2^{i_2} ··· v_m^{i_m} is the elementary monomial corresponding to I.

Example 1.53. T^2_0 for f(x, y) = cos(x) cos(y) is 1 − x^2/2 − y^2/2. T^2_A for f(A) = A^3, where A is an n × n matrix, is

\[ T^2_A(A + H) = A^3 + A^2H + AHA + HA^2 + AH^2 + HAH + H^2A. \]
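Expanding (A + H)^3 without assuming that A and H commute gives one term of degree zero in H, three of degree one, three of degree two (AH^2, HAH and H^2A) and the single cubic term H^3. The sketch below checks this with exact integer arithmetic; the matrices A and H and the helper names are assumptions chosen for illustration.

```python
from functools import reduce

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def prod(*Ms):
    return reduce(matmul, Ms)

def add(*Ms):
    n = len(Ms[0])
    return [[sum(M[i][j] for M in Ms) for j in range(n)] for i in range(n)]

def neg(M):
    return [[-a for a in row] for row in M]

A = [[1, 2], [0, 1]]            # assumed integer matrices, so the arithmetic is exact
H = [[0, 1], [3, -1]]

first_order = add(prod(A, A, H), prod(A, H, A), prod(H, A, A))
second_order = add(prod(A, H, H), prod(H, A, H), prod(H, H, A))
taylor2 = add(prod(A, A, A), first_order, second_order)      # degree-2 Taylor polynomial at A

B = add(A, H)
remainder = add(prod(B, B, B), neg(taylor2))                  # (A + H)^3 minus the polynomial
print(remainder == prod(H, H, H))
```

The remainder is exactly H^3, which is of third order in H, as a second order Taylor approximation requires.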

Theorem 1.54. (Taylor remainder estimate) Provided f : U → R is of class C^{k+1},

\[ f(x) = T^k_p(x) + \sum_{|I| = k+1} \frac{1}{I!}\, D^I f_c\, (x - p)^I \]

where c ∈ [p, x], i.e. c = (1 − t)p + tx for some t ∈ [0, 1].

Integration

This section describes the basics of (Riemann) integration of functions in Rn, Jordan measure and

Lebesgue measure. In Section 7 we will return to this topic and prove some improvements to the

theorems presented here. We refer the reader to other textbooks for proofs of the statements below.

Definition 1.55. An interval in R^n is a product I = ∏_{i=1}^n [a_i, b_i] with a_i < b_i. The content of an interval is defined as μ(I) = ∏_{i=1}^n (b_i − a_i). A set S ⊂ R^n has zero (Jordan) measure if for any ε > 0 there is a collection of intervals I_i with i = 1, 2, ···, k with S ⊂ ∪_{i=1}^k I_i and ∑_{i=1}^k μ(I_i) < ε. S has zero Lebesgue measure if for any ε > 0 there is a collection of intervals I_i with i ∈ {1, 2, 3, ···} such that S ⊂ ∪_{i=1}^∞ I_i and ∑_{i=1}^∞ μ(I_i) < ε.

Example 1.56. Q ⊂ R has Lebesgue measure zero, but it does not have zero Jordan measure. The set {0} ∪ {1/n : n ∈ N} has zero Jordan measure.
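The second claim of Example 1.56 can be made concrete: one interval [0, ε/2] catches 0 and all but finitely many of the points 1/n, and the remaining points each get a tiny interval of their own. The following sketch builds such a finite cover; the function names and the particular way of splitting ε are assumptions.

```python
def finite_cover(eps):
    """Finitely many closed intervals, total length < eps, covering {0} and every 1/n."""
    N = int(2 / eps) + 2                     # n >= N implies 1/n < eps/2
    first = (0.0, eps / 2)                   # contains 0 and every 1/n with n >= N
    width = eps / (4 * N)                    # N - 1 small intervals, total length < eps/4
    rest = [(1 / n - width / 2, 1 / n + width / 2) for n in range(1, N)]
    return [first] + rest

def total_length(intervals):
    return sum(b - a for a, b in intervals)

def is_covered(x, intervals):
    return any(a <= x <= b for a, b in intervals)

eps = 0.01
cover = finite_cover(eps)
all_covered = is_covered(0.0, cover) and all(is_covered(1 / n, cover) for n in range(1, 5001))
print(len(cover), total_length(cover), all_covered)
```

No such finite cover exists for Q ∩ [0, 1], since finitely many closed intervals covering a dense subset of [0, 1] must cover all of [0, 1] and so have total length at least 1.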

Definition 1.57. A partition of an interval I = ∏_{i=1}^n [a_i, b_i] is a set P = ∏_{i=1}^n P_i where P_i = {a_i = t_{i,1} < t_{i,2} < ··· < t_{i,m_i} = b_i} for some m_i ∈ N for all i ∈ {1, 2, ···, n}. Given a partition P of I there is the refinement of P given by |P| = {∏_{i=1}^n [t_{i,k_i}, t_{i,k_i+1}] : 1 ≤ k_i < m_i}.

Given a function f : S → R with S ⊂ R^n bounded, let I be an interval with S ⊂ I and let P be a partition of I. A Riemann sum for f is a sum of the form

\[ \sum_{J \in |P|} f(v_J)\,\mu(J) \]

where v_J ∈ J ∩ S and we define f(v_J) = 0 if J ∩ S = ∅.

f : S → R is Riemann integrable if there is some number L ∈ R such that, given any ε > 0 there is a partition P_ε of I such that for any refinement P of P_ε every Riemann sum approximates L in the sense

\[ \Big|\sum_{J \in |P|} f(v_J)\,\mu(J) - L\Big| < \varepsilon. \]

In this setting we define ∫_S f = L, and call this number the Riemann integral of f over S.

The Jordan measure of a bounded set S ⊂ R^n is defined to be μ(S) = ∫_S 1.



Riemann sums for functions taking values in a Banach space are a direct generalization of the above definition.

Theorem 1.58. The Jordan measure of a bounded set S ⊂ R^n exists if and only if ∂S has Lebesgue measure zero. The Riemann integral ∫_S f exists if and only if f is bounded and the set of discontinuities of χ_S f : R^n → R has Lebesgue measure zero, where

\[ \chi_S f(x) = \begin{cases} f(x) & \text{if } x \in S \\ 0 & \text{if } x \notin S. \end{cases} \]

Typically when one has a function given in terms of specific variables, one would write ∫_S f(x) dx to indicate we are integrating with respect to the Jordan measure for the Euclidean space that the variable x resides in. There are many variations of this notation, one given by Fubini's Theorem.

Theorem 1.59. (Fubini) Given a Riemann integrable function f : I × J → R, with I ⊂ R^n and J ⊂ R^m intervals, provided the inner integral ∫_J f(x, y) dy exists for every x ∈ I, the iterated Riemann integral

\[ \int_I \Big(\int_J f(x, y)\,dy\Big)dx \]

exists and is equal to the Riemann integral ∫_{I×J} f.

Thus one often sees expressions such as

\[ \int_{[0,1]\times[2,3]} xy^2 = \int_{[0,1]}\Big(\int_{[2,3]} xy^2\,dy\Big)dx = \int_0^1\!\!\int_2^3 xy^2\,dy\,dx. \]
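The value of this integral is (∫_0^1 x dx)(∫_2^3 y^2 dy) = (1/2)(19/3) = 19/6, and a Riemann sum over a product partition approximates it. The sketch below uses midpoints as the sample points v_J and a uniform partition with n subdivisions per axis; both choices, and the function name, are assumptions.

```python
def midpoint_riemann_sum(f, x_range, y_range, n):
    """Riemann sum of f over a rectangle, uniform n-by-n partition, midpoint samples."""
    (a, b), (c, d) = x_range, y_range
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        for j in range(n):
            y = c + (j + 0.5) * hy
            total += f(x, y) * hx * hy       # f(v_J) * mu(J)
    return total

approx = midpoint_riemann_sum(lambda x, y: x * y * y, (0.0, 1.0), (2.0, 3.0), 200)
exact = 19.0 / 6.0
print(approx, exact)
```

Refining the partition (increasing n) brings the sum closer to 19/6, in line with the definition of Riemann integrability.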

Of course, the above theorem is symmetric in I and J, so the same statement holds for the iterated integral ∫_J (∫_I f(x, y) dx) dy, and if all three integrals exist, they must all be equal.

Theorem 1.60. If f is continuous on a compact, connected Jordan measurable set S then

\[ \min\{ f(x) : x \in S \}\,\mu(S) \le \int_S f \le \max\{ f(x) : x \in S \}\,\mu(S). \]

Typically, the number (∫_S f)/μ(S) is defined to be the average of f over S; thus the above theorem, together with the intermediate value theorem, states that such a function achieves its average value.

The last major theorem from calculus is the change of variables theorem. One standard version of

the theorem that can be proven rather directly appears below.

Theorem 1.61. Let f be Riemann integrable on a bounded set S ⊂ R^n. Assume φ : U → V is a C^1-diffeomorphism between open subsets of R^n and that S ⊂ V. Then

\[ \int_{\varphi^{-1}(S)} (f \circ \varphi)\,|\mathrm{Det}(D\varphi)| = \int_S f, \]

meaning both integrals exist and are equal. |Det(Dφ)| is the absolute value of the determinant of the derivative of φ. Theorem 1.61 is proven directly by concretely estimating how φ changes the Jordan measure of intervals. Perhaps one of the disappointments of calculus is that this change of variables theorem is much weaker than the corresponding one for integrals on R. The theorem that

\[ \int_a^b f(g(t))\,g'(t)\,dt = \int_{g(a)}^{g(b)} f(x)\,dx \]

only requires that g is smooth; there is no condition that it needs to be one-to-one. Once we learn about differential forms we will see how the above change of variables theorem can be fixed to be a complete generalization of the single-variable theorem.

Exercises

Problem 1.62. If X ⊂ R^n is a subset, A ⊂ X is relatively open (also called open in X) if A = U ∩ X where U ⊂ R^n is open. Similarly we define B ⊂ X to be relatively closed if B = X \ A where A ⊂ X is (relatively) open.

a) Prove that B ⊂ X is closed in X if and only if B = C ∩ X where C ⊂ R^n is closed. Use whichever definition of open in R^n you are familiar with, but be sure to indicate what that definition is.

b) Prove that an arbitrary union of relatively open sets (in X) is open in X.

c) Prove that a finite intersection of relatively closed sets in X is closed in X.

d) Let X and Y be subsets of R^n. Prove that a function f : X → Y is continuous in the traditional ε − δ sense if and only if f^{-1}(U) is relatively open in X for all U ⊂ Y relatively open. The traditional ε − δ sense of continuity means, for any x ∈ X and any ε > 0 there exists a δ > 0 such that whenever |x − y| < δ, it's true that |f(x) − f(y)| < ε.

Problem 1.63. a) If f : (a, b) → R is C^1 and has an everywhere non-zero derivative, argue that it is a diffeomorphism onto its image, in particular that the image is an open interval.

b) Find an example of f : R → R, one-to-one, onto and differentiable but which is not a diffeomorphism.

Problem 1.64. Let f : S^1 → R^2 be continuous, and define g : R^2 → R^2 by g(v) = |v| f(v/|v|) if v ≠ 0, and g(0) = 0. Prove that g is differentiable at the origin if and only if f satisfies f(cos θ, sin θ) = cos(θ) f(1, 0) + sin(θ) f(0, 1) for all θ ∈ R. Hint: Start by computing the directional derivatives of g at 0.

Problem 1.65. This is a problem about dierentiating elementary many-variable functions. It has

several parts. Let V be a nite-dimensional vector space over R, and µ : V × V → V a bi-linear36


function, meaning

µ(a~u + b~v, c ~w + d~x) = acµ(~u, ~w) + bcµ(~v, ~w) + adµ(~u, ~x) + bdµ(~v, ~x)

a) Prove that µ is differentiable.

b) Show that Dµ(~v, ~w)(~x, ~y) = µ(~x, ~w) + µ(~v, ~y). Hint: Consider Dµ(~v, ~w)(~x, 0) and interpret bilinearity as saying the function µ is linear in the first variable.

c) Let V be the space of n × n matrices. Notice that matrix multiplication is bilinear, and use

part (b) to compute the derivative, in matrix notation.

d) Compute the derivative of the function of an n × n matrix given by f(A) = A^2 using part (c) and the chain rule. The idea is to write A^2 as the composite of the functions A ↦ (A, A) and matrix multiplication.

e) Do the same kind of thing to compute the derivative of the cubing function f(A) = A^3.

f) Compute the derivatives of A^2 and A^3 using the alternative method of considering the function g(t) = (A + tH)^2, taking the derivative with respect to t, and applying the chain rule. Similarly with (A + tH)^3.

g) Show that matrix inversion f(A) = A^{-1} is a differentiable function on the set of all n × n matrices with non-zero determinant. Hint: Remember Cramer's Rule?

h) Compute the derivative of the inversion function, i.e. write and justify a formula for DfA(H),

the derivative of f at the matrix A in the direction of H. I suggest computing a formula for

the derivative implicitly, once you know from part (g) that f is differentiable. Consider the

formula f (A) · A = I and use the techniques from (c), (d).

Problem 1.66. There are quite a few alternative approaches to proving the matrix inversion function is differentiable. The Cayley-Hamilton theorem asserts that every matrix satisfies its own characteristic polynomial. Use this to deduce a formula for A^{-1}. Specifically, you can use it to show that Det(A)A^{-1} is a polynomial in A of degree n − 1. As a reminder, the characteristic polynomial of an n × n matrix A is p_A(t) = Det(A − tI), thought of as a polynomial in the variable t. In particular,

\[ p_A(t) = (-1)^n t^n + (-1)^{n-1} \operatorname{tr}(A)\, t^{n-1} + c_{n-2} t^{n-2} + \cdots + c_2 t^2 + c_1 t + \operatorname{Det}(A) \]

and the coefficients c_k are generally degree n − k polynomials in the entries of the matrix A. The Cayley-Hamilton theorem asserts that if one replaces the variable t by A, one gets the zero matrix, i.e. p_A(A) = 0, or

\[ p_A(A) = (-1)^n A^n + (-1)^{n-1} \operatorname{tr}(A)\, A^{n-1} + c_{n-2} A^{n-2} + \cdots + c_2 A^2 + c_1 A + \operatorname{Det}(A) I = 0. \]

(a) Rewrite the above equation as a formula for Det(A)A^{-1}, as a polynomial in A.

(b) Check that when n = 2 this is precisely the Cramer's Rule formula for A^{-1}.

Problem 1.67. Find a number ε > 0 such that if |(a, b, c)| < ε then the equations

x + y^2 = a, y + z^2 = b, z + x^2 = c

have a solution. Hint: Apply Kantorovich's theorem, or Proposition 1.41



Problem 1.68. Given a 2 × 2 matrix A ∈ M_{2,2} such that A^2 = I, give an if and only if statement (which depends only on A) for when there exists an open neighbourhood U of I in M_{2,2} and a smooth function g : U → M_{2,2} such that g(I) = A, and the equation g(B)^2 = B holds for all B ∈ U.

Problem 1.69. Consider the function f (x, y) = xy . Determine the critical points and irregular values.

For which c ∈ R is the set {(x, y) ∈ R^2 : f(x, y) = c} a manifold? Note: the implicit function

theorem answers this for you in most cases, but for an irregular value you have to do more work!

Problem 1.70. Consider the two functions Det : M_{n,n}(R) → R and tr : M_{n,n}(R) → R, the determinant and trace of matrices. Prove that D(Det)_I = tr. Suggestion: Perhaps the easiest way to show this is to compute the partial/directional derivatives of Det at I (there are n^2 independent first-order

partial derivatives).

Problem 1.71. Let Det : Mn,n → R be the determinant function.

(a) Compute the derivative of the determinant at the identity matrix

D(Det)IH =?

(b) Compute the derivative of the determinant D(Det)A when A is invertible. Hint: Let LA :

M_{n,n} → M_{n,n} be left multiplication by a matrix A, and check Det ∘ L_A = Det(A) · Det. Do

you see a general formula that works?

(c) Compute the Hessian of the determinant at the identity matrix H(Det)I .

Problem 1.72. Provide an explicit example of a one-to-one smooth function f : R → R2 such that

f(R) = {(x, 0) : x ≥ 0} ∪ {(0, x) : x ≥ 0}. Hint: I suggest you think about what the derivatives and

anti-derivatives of bump functions look like.

Problem 1.73. Given an m-manifold M ⊂ Rn and p ∈ M show that there is a relatively open set

U ⊂ M containing p such that the orthogonal projection map π : R^n → T_pM, when restricted to U, gives a diffeomorphism π|_U : U → T_pM onto an open subset of T_pM. Hint: compute the derivative of

π|U at p.

Problem 1.74. Prove that the square S = ∂([0, 1] × [0, 1]) is a C0 manifold but not a C1 manifold.

Hint: the previous problem can help.



Problem 1.75. Prove the Möbius band from Example 1.8 is a smooth manifold. Hint: Divide the

problem into two cases, where Re(z) > −1 and Re(z) < 1. Also, compute the tangent space at

(−1, i).

Problem 1.76. (a) Give a complete argument, starting from the definition of the tangent bundle, that if U ⊂ R^n is open, TU = U × R^n.

(b) Similarly, if V ⊂ R^n is a vector subspace, TV = V × V.

(c) Show that T(S^1) is diffeomorphic to S^1 × R. Note: We will eventually see that T(S^2) is not diffeomorphic to S^2 × R^2.

(d) Prove that if U is a relatively open subset of a manifold M, then U is a manifold of the same dimension, and T_pU = T_pM for any p ∈ U.

Problem 1.77. Prove that it is not always true that an immersion f : R^n → R^n is a diffeomorphism

onto its image provided n ≥ 2.

Problem 1.78. Prove that if M is a compact manifold, and if f : M → N is a smooth one-to-one

immersion, then f (M) is also a manifold. In each of the three cases, give examples of how this

proposition would be false if (a) M were non-compact, (b) f were not one-to-one and (c) f were not

an immersion.

Problem 1.79. Show that the map φ : S^3 → S^2 defined by φ(z_0, z_1) = (2 z_0 \overline{z_1}, |z_0|^2 − |z_1|^2) is a submersion. In the definition of φ, we are considering S^3 to be the unit sphere in R^4 = C^2, so z_0 and z_1 are complex numbers such that |z_0|^2 + |z_1|^2 = 1. The target space is S^2, the unit sphere in R^3 = C × R. This map is sometimes called the Hopf fibration. Verify also that φ is onto, and that φ(z_0, z_1) = φ(z'_0, z'_1) if and only if (z_0, z_1) = λ(z'_0, z'_1) for some unit complex number λ.

Problem 1.80. This problem concerns basic properties of maps between manifolds.

(a) Prove that if f : M → N is a submersion, it is an open map, meaning it takes relatively open

sets to relatively open sets.

(b) Prove that if M is compact then C ⊂ M is closed iff it is compact. So if f : M → N is continuous, it sends (relatively) closed sets to (relatively) closed sets.

(c) Argue that if a set X ⊂ R^n is pathwise connected, then it can not be decomposed X = A ∪ B with A ∩ B = ∅, with both A and B relatively open and non-empty. Note: this is the point-set topological notion of connectedness.

(d) Show that it follows that if f : M → N is a smooth submersion from a non-empty compact manifold M to a pathwise-connected manifold N then f must be onto.


Problem 1.81. Show that there is a Hessian chain rule. Precisely, if f : U → R^n, with U ⊂ R^m open, and g : V → U with V ⊂ R^k open, argue that

\[ H(f \circ g)(p)(v, w) = Hf_{g(p)}(Dg(p)(v), Dg(p)(w)) + Df(g(p))(Hg(p)(v, w)). \]

Hint: The chain rule says D^2(f ∘ g) = D(D(f ∘ g)) = D(Df ∘ Dg) = D^2f ∘ D^2g, but Example 1.35 expresses Hf in terms of D^2f. Compare the two.


2. Differential equations and flows

Theorem 2.1. (Existence and Uniqueness theorem for ODEs on manifolds) Given a vector field v : N → TN on a manifold N, there exists a unique function Φ_v : A_v → N where A_v ⊂ R × N satisfying:

(1) A_v = ∪_{p∈N} I_p × {p} with I_p ⊂ R an open interval for all p ∈ N.

(2) {0} × N ⊂ Int(A_v), and Φ_v : Int(A_v) → N is smooth. Int(A_v) is the interior of A_v in the relative sense: the union of all relatively open subsets of R × N that are contained in A_v. Φ_v is C^l provided v is C^l.

(3) ∂Φ_v(t, p)/∂t = v(Φ_v(t, p)) for all (t, p) ∈ A_v.

(4) Φ_v(t, Φ_v(s, p)) = Φ_v(t + s, p) whenever either side is defined. Moreover Φ_v(0, p) = p for all p ∈ N.

The point of the above theorem is that Φ_v(t, p) as a function of t is the integral curve for the vector field v, with the initial condition defined via Φ_v(0, p) = p. This means one should think of Φ_v(t, p) as `flow along the solution to the DE t units of time, starting from p', thus the identity Φ_v(t, Φ_v(s, p)) = Φ_v(t + s, p) is a version of the existence and uniqueness theorem, since both the left and right hand sides can be viewed as integral curves with common initial conditions. Given t ∈ R, if Φ_v(t, p) is defined for all p ∈ N, the function Φ_v(t, ·) : N → N is a diffeomorphism (with inverse Φ_v(−t, ·)).

Definition 2.2. • Φ_v is the flow associated to the vector field v.

• If A_v = R × N the flow is said to be complete. If the flow of a vector field v is complete, we say v is a complete vector field.

• A vector field v : N → TN is bounded if |v(p)| is bounded, as a function of p ∈ N. Specifically, v(p) ∈ π_N^{-1}(p) = {p} × T_pN, so write v(p) = (p, ~v(p)), then the notation |v(p)| is meant as |v(p)| = |~v(p)|, i.e. ignore the basepoint. This would be called the fibrewise length in the fibre π_N^{-1}(p) = {p} × T_pN (to distinguish it from the regular Euclidean norm of the vector (p, ~v(p)) ∈ (R^k)^2 which would be \sqrt{|p|^2 + |~v(p)|^2}), provided N ⊂ R^k.

A pleasant consequence is that for a complete vector field, the adjoint of Φ_v is a group homomorphism \overline{Φ}_v : R → Diff(N) from the additive real numbers to the group of diffeomorphisms of the manifold N. By adjoint we mean \overline{Φ}_v(t) ∈ Diff(N) defined by \overline{Φ}_v(t)(p) = Φ_v(t, p).

Proposition 2.3. If M ⊂ Rk is closed, and if w : M → TM is bounded, i.e. there is some L ∈ R such

that |w(p)| ≤ L ∀p ∈ M then w is complete.

So if M is compact, all flows are complete.

Example 2.4. Here are some examples of vector fields and flows.

• v : S^{2n−1} → TS^{2n−1}, v(p) = (p, ip). We think of S^{2n−1} as the unit sphere in C^n to make sense of this formula. Φ_v(t, p) = e^{it}p.

• v : S^2 → TS^2 given by v(x, y, z) = ((x, y, z), (−y, x, 0)). Φ_v(t, (x, y, z)) = (x cos t − y sin t, x sin t + y cos t, z). This flow might be called `rotation of the surface of a planet'.

• v : R^2 → TR^2, v(x, y) = ((x, y), (x^2, y)). Φ_v(t, x, y) = (x/(1 − tx), e^t y). The previous two flows are complete, but this flow is not. A_v = {(t, x, y) : t ∈ (−∞, 1/x) if x > 0, t ∈ (1/x, ∞) if x < 0, and t ∈ R otherwise}.
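A quick numerical sanity check of properties (3) and (4) of Theorem 2.1 can be run against the closed-form flows in this example. The sketch below is only illustrative: the function names, the sample point and the finite-difference step are assumptions made here, not part of the notes. It tests the incomplete flow of (x^2, y) and the circle case n = 1 of the first bullet.

```python
import cmath
import math

def flow_circle(t, p):
    """Flow of v(p) = (p, i*p) on the unit circle in C: Phi(t, p) = e^{it} p."""
    return cmath.exp(1j * t) * p

def flow_blowup(t, p):
    """Flow of v(x, y) = ((x, y), (x^2, y)); defined only while 1 - t*x > 0."""
    x, y = p
    if 1.0 - t * x <= 0.0:
        raise ValueError("(t, p) is outside the domain A_v")
    return (x / (1.0 - t * x), math.exp(t) * y)

def ddt(flow, t, p, h=1e-6):
    """Central-difference approximation of the t-derivative of flow(t, p)."""
    a, b = flow(t + h, p), flow(t - h, p)
    if isinstance(a, tuple):
        return tuple((u - w) / (2 * h) for u, w in zip(a, b))
    return (a - b) / (2 * h)

p = (0.5, 2.0)   # sample point (an arbitrary choice)
t, s = 0.3, 0.4  # sample times, chosen so that (t + s) * x < 1

# Property (3): the t-derivative equals the vector field at the flowed point.
x_t, y_t = flow_blowup(t, p)
dx, dy = ddt(flow_blowup, t, p)
ode_error = max(abs(dx - x_t ** 2), abs(dy - y_t))

# Property (4): flowing for s and then for t equals flowing for t + s.
lhs = flow_blowup(t, flow_blowup(s, p))
rhs = flow_blowup(t + s, p)
group_error = max(abs(u - w) for u, w in zip(lhs, rhs))

# Same ODE check for the circle flow: d/dt (e^{it} q) = i e^{it} q.
q = cmath.exp(0.7j)
circle_error = abs(ddt(flow_circle, t, q) - 1j * flow_circle(t, q))

print(ode_error, group_error, circle_error)
```

All three printed errors should be at the level of floating-point noise. Asking for `flow_blowup(2.5, p)` raises an error, reflecting that (2.5, 0.5, 2.0) lies outside A_v.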



Definition 2.5. Let π_M : TM → M and π_N : TN → N be the tangent bundles of manifolds M and N respectively. Let w : M → TM and v : N → TN be vector fields. If f : M → N is a diffeomorphism, the pull-back of v along f is denoted f^*v : M → TM. It is a vector field and it is defined by f^*v = Df^{-1} ∘ v ∘ f.

(Commutative diagram: Df : TM → TN lying over f : M → N, with the bundle projections π_M and π_N pointing down and the vector fields f^*v : M → TM and v : N → TN pointing up.)

Evaluating at p this gives (f^*v)(p) = (p, D(f^{-1})(f(p))(~v(f(p)))) ∈ π_M^{-1}(p). This is using the notation that ~v : N → R^k is the function such that v(q) = (q, ~v(q)), i.e. ~v is the `important' part of the vector field v. Similarly, if w : M → TM is a vector field, its push-forward f_*w : N → TN is the vector field f_*w = Df ∘ w ∘ f^{-1}.

Proposition 2.6. If f : M → N and g : N → P are diffeomorphisms, (g ∘ f)_* = g_* ∘ f_* and (g ∘ f)^* = f^* ∘ g^*.

Definition 2.7. If f : M → N is a smooth map and h : N → R is smooth, the pull-back of h along f is f^*h = h ∘ f : M → R. If f : M → N is a diffeomorphism and h : M → R is smooth, the push-forward of h along f is f_*h = h ∘ f^{-1}.

Definition 2.8. (Lie Derivatives) Given a vector field v : N → TN, notice that if (−ε, ε) × N ⊂ A_v for some ε > 0, the function Φ_{v,t} : N → N defined by Φ_{v,t}(p) = Φ_v(t, p) is a diffeomorphism, for all t ∈ (−ε, ε). So whenever one has a notion of pull-back along diffeomorphisms (such as we have for real-valued functions and vector fields) we can pull back along Φ_{v,t} and take the derivative with respect to t, giving a new object of the same type (real-valued function, vector field, etc, we will have more!).

• Let h : N → R be smooth, and v : N → TN a vector field, then L_v h : N → R is defined as

\[ L_v h(p) = \lim_{t \to 0} \frac{(\Phi_{v,t}^* h)(p) - h(p)}{t} = \frac{\partial (\Phi_{v,t}^* h)(p)}{\partial t}\Big|_{t=0} \]

• Let w : N → TN and v : N → TN be vector fields, then L_v w : N → TN is defined as

\[ L_v w(p) = \lim_{t \to 0} \frac{(\Phi_{v,-t}^* w)(p) - w(p)}{t} = \frac{\partial (\Phi_{v,-t}^* w)(p)}{\partial t}\Big|_{t=0} \]

Notice Lvh = vh is the directional derivative of h in the direction of v .
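The identity L_v h = vh can be checked numerically whenever the flow is known in closed form. The following sketch uses the flow of v(x, y) = (x^2, y) from Example 2.4 together with a test function h(x, y) = xy; the choice of h, the sample point and the step size are assumptions made for illustration only.

```python
import math

def flow(t, p):
    """Closed-form flow of v(x, y) = (x^2, y), valid while 1 - t*x > 0."""
    x, y = p
    return (x / (1.0 - t * x), math.exp(t) * y)

def h(p):
    """An arbitrary smooth test function (assumed for this example)."""
    x, y = p
    return x * y

def lie_derivative_h(p, dt=1e-6):
    """Central difference for d/dt (Phi_{v,t}^* h)(p) = d/dt h(Phi_v(t, p)) at t = 0."""
    return (h(flow(dt, p)) - h(flow(-dt, p))) / (2 * dt)

def directional_derivative_h(p):
    """vh = Dh(p)(v(p)), computed by hand: (y, x) . (x^2, y) = y*x^2 + x*y."""
    x, y = p
    return y * x * x + x * y

p = (0.5, 2.0)
gap = abs(lie_derivative_h(p) - directional_derivative_h(p))
print(lie_derivative_h(p), directional_derivative_h(p), gap)
```

The two values should agree up to finite-difference error, which is what the remark above predicts.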

Proposition 2.9. The Lie derivative of vector fields on N satisfies:

a) L_v w = [v, w] for all vector fields v and w on N, where [v, w] = π_2(Dv ∘ w − ι_N ∘ Dw ∘ v) from Chapter 1.

b) [v, w]h = w(vh) − v(wh) for all h : N → R smooth, and it is the unique vector field with this property (for any given v, w : N → TN fixed).

c) f_*[v, w] = [f_*v, f_*w] for any diffeomorphism f : N → M.

d) [v, w](p) = (p, D~v(p)(~w(p)) − D~w(p)(~v(p))) if N is an open subset of R^n, where v(p) = (p, ~v(p)), and w(p) = (p, ~w(p)) for all p ∈ N.

e) [·, ·] is bilinear, as a function Γ(TN) × Γ(TN) → Γ(TN).

f) [v, w] = −[w, v] for all v, w ∈ Γ(TN).

g) [v, [w, x]] + [w, [x, v]] + [x, [v, w]] = 0 for all v, w, x ∈ Γ(TN). This is called the Jacobi identity.

Proof. We will prove parts (a), (c) and (g) leaving the rest as an exercise.

(a) We can locally extend the vector fields to vector fields on an open subset of Euclidean space. The existence and uniqueness theorem for flows tells us the flow of the extended vector field is an extension of our flow on N. So it suffices to verify part (a) on an open subset of Euclidean space. The definition of φ^*_{v,−t}w is φ^*_{v,−t}w = D(φ_{v,−t})^{-1} ∘ w ∘ φ_{v,−t}. Notice φ_{v,t}^{-1} = φ_{v,−t}, so by the chain rule D(φ_{v,−t})^{-1} = Dφ_{v,t}. By the chain rule we can break this derivative up into

\[ \frac{\partial}{\partial t}\Big( D(\varphi_{v,t}) \circ w \circ \varphi_{v,t}^{-1} \Big)\Big|_{t=0} = \frac{\partial}{\partial t}\Big( D\varphi_{v,t} \circ w \circ \varphi_{v,0}^{-1} \Big)\Big|_{t=0} + \frac{\partial}{\partial t}\Big( D(\varphi_{v,0}) \circ w \circ \varphi_{v,-t} \Big)\Big|_{t=0} \]

but φ_{v,0} is the identity function, giving

\[ L_v w(p) = \frac{\partial}{\partial t}\big( D\varphi_{v,t} \circ w \big)\Big|_{t=0} + \frac{\partial}{\partial t}\big( w \circ \varphi_{v,-t} \big)\Big|_{t=0} \]

Since the variable t is independent of the spatial variables, we can commute differentiating with respect to t and the spatial variables, giving

\[ L_v w(p) = Dv \circ w - Dw \circ v \]

which is what we set out to prove.

(c) In Chapter 1 we observed that if f : N → M is a diffeomorphism, then D^2f : T^2N → T^2M has the form D^2f(p, v, w, y) = (f(p), Df_p(v), Df_p(w), Hf_p(v, w) + Df_p(y)). Using the Chapter 1 definition of the Lie Bracket gives the result immediately.

(g) There are a variety of ways to prove this, for instance one could evaluate the vector field [v, [w, x]] + [w, [x, v]] + [x, [v, w]] directly on an arbitrary smooth, real-valued function using part (b). One could also consider this as a coefficient in a Taylor expansion for the commutativity of three flows. We will give an iterated tangent bundle interpretation.

Let Φ_x be the flow associated to the vector field x, then by part (a), [x, [v, w]] = L_x[v, w], so

\[ [x, [v, w]](p) = \frac{\partial\, \Phi_{x,-t}^*[v, w](p)}{\partial t}\Big|_{t=0} = \frac{\partial\, [\Phi_{x,-t}^* v, \Phi_{x,-t}^* w](p)}{\partial t}\Big|_{t=0} = [[x, v], w] + [v, [x, w]] \]

where the last identity is via the Chain Rule. If we collect the terms on one side using (f), this gives us

[x, [v, w]] + [w, [x, v]] + [v, [w, x]] = 0,

which is the Jacobi identity.

The proof of part (c) gives this natural generalization.

Proposition 2.10. If v_1, v_2 : M → TM and w_1, w_2 : N → TN are vector fields, and if f : M → N is any smooth map such that w_i ∘ f = Df ∘ v_i for both i ∈ {1, 2} then

[w_1, w_2] ∘ f = Df ∘ [v_1, v_2]

i.e. Lie brackets are natural with respect to arbitrary smooth functions.

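Notice that the latter coordinate in formula (d) can be written as

\[ \sum_{i=1}^{n} \Big( \sum_{j=1}^{n} \frac{\partial v_i}{\partial e_j}(p)\, w_j(p) - \frac{\partial w_i}{\partial e_j}(p)\, v_j(p) \Big) e_i \]

where ~v(p) = \sum_{i=1}^{n} v_i(p) e_i and ~w(p) = \sum_{i=1}^{n} w_i(p) e_i with e_1, · · · , e_n the standard basis for R^n.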
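The coordinate formula lends itself to a direct finite-difference computation. Here is a minimal sketch, using the sign convention of formula (d) above (which is the one these notes use; many references use the opposite sign). The two vector fields, the sample point and the helper names are hypothetical choices made for this illustration.

```python
def jacobian_apply(field, p, direction, h=1e-6):
    """Approximate D(field)(p)(direction) by a central difference."""
    plus = field([pi + h * di for pi, di in zip(p, direction)])
    minus = field([pi - h * di for pi, di in zip(p, direction)])
    return [(a - b) / (2 * h) for a, b in zip(plus, minus)]

def bracket(v, w, p):
    """[v, w](p) = Dv(p)(w(p)) - Dw(p)(v(p)), the convention of formula (d)."""
    dv_w = jacobian_apply(v, p, w(p))
    dw_v = jacobian_apply(w, p, v(p))
    return [a - b for a, b in zip(dv_w, dw_v)]

# Illustrative vector fields on R^2 (only the "important" R^2-valued parts).
def v(p):
    return [1.0, 0.0]      # the constant field e_1

def w(p):
    x, y = p
    return [0.0, x]        # the shear field x * e_2

result = bracket(v, w, [0.3, -1.2])
print(result)
```

By hand, Dv = 0 and Dw(p)(e_1) = e_2, so with this convention [v, w] = −e_2 at every point; the printed vector should be close to (0, −1).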



Definition 2.11. Given two vector fields v, w ∈ Γ(TN) on N, their flows commute if Φ_v(t, Φ_w(s, p)) = Φ_w(s, Φ_v(t, p)) whenever both sides of the equation are defined.

Note: for any p ∈ N, there is an ε > 0 such that both Φ_v(t, Φ_w(s, p)) and Φ_w(s, Φ_v(t, p)) are defined whenever |t| < ε and |s| < ε so the above definition is always non-vacuous.

Theorem 2.12. [v , w ] = 0 if and only if the ows Φv and Φw commute.

Proof. `⇐=' If the flows commute, the Lie Bracket is zero by Proposition 2.9 part (a).

`=⇒' This is the more substantial part of the proof. Consider what it would take for the flows of v and w to commute:

Φ_w(s, Φ_v(t, p)) = Φ_v(t, Φ_w(s, p)) ∀ t, s, p.

Differentiating this with respect to the variable s at s = 0 gives

w ∘ Φ_v(t, p) = D(Φ_{v,t}) ∘ w(p) (∗)

One way to read this formula is that it says that the curves Φ_v(t, Φ_w(s, p)), as a function of the variable s, are integral curves for the vector field w. But the curve Φ_w(s, Φ_v(t, p)) is also such an integral curve, and at parameter time s = 0 they agree, so they must be the same by the existence and uniqueness theorem for ODEs. So the key to the proof is showing that the identity (∗) holds if we only know [v, w] = 0.

Since [v, w] = ∂/∂t (Φ^*_{v,−t}w)|_{t=0}, and since Φ_{v,t_1} ∘ Φ_{v,t_2} = Φ_{v,t_1+t_2}, the chain rule tells us that ∂/∂t (Φ^*_{v,t}w)|_{t=0} = 0 implies that Φ^*_{v,t}w is constant as a function of t alone, as its derivative with respect to t is zero for all time, not just when t = 0. So Φ^*_{v,t}w = w for all time t, since it is true for t = 0. But Φ^*_{v,t}w = D(Φ_{v,t})^{-1} ∘ w ∘ Φ_{v,t}, so

D(Φ_{v,t}) ∘ w = w ∘ Φ_{v,t} ∀ t

Although we defined the Lie Bracket for vector fields, as we have seen, the Lie bracket at a particular point [v, w](p) only depends on Dv(p), Dw(p), v(p) and w(p). The first two take values in T(TN) and the latter two in TN.

Proposition 2.13. A variant of Theorem 2.12 is the statement that the second order Taylor expansion

of the function

Φv (a,Φw (b, p))−Φw (b,Φv (a, p))

with respect to the variables a and b at (a, b) = (0, 0) is

ab[v , w ]p.

Technically Proposition 2.13 does not make intrinsic sense in the manifold, as it uses the subtraction

operation from the ambient Euclidean (vector) space the manifold is sitting in. One can correct in a

variety of ways. For example, consider

Φv (−a,Φw (−b,Φv (a,Φw (b, p))))

which one can think of as a function (−ε, ε)^2 → N. This function has a 2nd order Taylor expansion with respect to the variables (a, b) at (0, 0), and it is p + ab[v, w]_p. This is readily derivable from the above proposition. More generally, one can prove that the 2nd order Taylor expansions of the functions below are:

\[ \Phi_w(b, \Phi_v(a, p)) \overset{\epsilon \cdot |(a,b)|^2}{\simeq} p + a v_p + b w_p + \frac{a^2}{2} Dv_p(v_p) + \frac{b^2}{2} Dw_p(w_p) + ab\, Dw_p(v_p) \]

\[ \Phi_v(a, \Phi_w(b, p)) \overset{\epsilon \cdot |(a,b)|^2}{\simeq} p + a v_p + b w_p + \frac{a^2}{2} Dv_p(v_p) + \frac{b^2}{2} Dw_p(w_p) + ab\, Dv_p(w_p). \]

The proof of the proposition, and a sketch of how to prove the above statements is below.

Proof. (of Proposition 2.13) Φ_v(a, Φ_w(b, p)) = Φ_w(b, Φ_v(a, p)) whenever either a = 0 or b = 0, so the only potential non-zero term in the 2nd order Taylor expansion of Φ_v(a, Φ_w(b, p)) − Φ_w(b, Φ_v(a, p)) is the mixed partial ∂^2/∂a∂b. We compute this in local coordinates, i.e. assuming our manifold is an open subset of Euclidean space.

\[ \frac{\partial^2}{\partial a \partial b}\big( \Phi_v(a, \Phi_w(b, p)) - \Phi_w(b, \Phi_v(a, p)) \big)(0, 0) \]

\[ = \frac{\partial^2}{\partial b \partial a} \Phi_v(a, \Phi_w(b, p)) - \frac{\partial^2}{\partial a \partial b} \Phi_w(b, \Phi_v(a, p)) \]

\[ = \frac{\partial}{\partial b}\big( v(\Phi_w(b, p)) \big)(0) - \frac{\partial}{\partial a}\big( w(\Phi_v(a, p)) \big)(0) \]

\[ = Dv_p(w(p)) - Dw_p(v(p)) = [v, w]_p \]
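The statement of Proposition 2.13 can be observed numerically for a pair of vector fields whose flows are known exactly. In the sketch below the fields v = (1, 0) and w = (0, x^2) on R^2, their flows, and the sample point are all assumptions chosen for illustration; the bracket is computed by hand with the convention [v, w]_p = Dv_p(w_p) − Dw_p(v_p) used in the proof above.

```python
def flow_v(a, p):
    """Flow of v(x, y) = (1, 0): translation in the x-direction."""
    x, y = p
    return (x + a, y)

def flow_w(b, p):
    """Flow of w(x, y) = (0, x^2): x is constant along w, so y moves linearly."""
    x, y = p
    return (x, y + b * x * x)

def bracket_vw(p):
    """[v, w]_p = Dv_p(w_p) - Dw_p(v_p) = (0, -2x), computed by hand."""
    x, y = p
    return (0.0, -2.0 * x)

def defect(a, b, p):
    """Phi_v(a, Phi_w(b, p)) - Phi_w(b, Phi_v(a, p))."""
    left = flow_v(a, flow_w(b, p))
    right = flow_w(b, flow_v(a, p))
    return tuple(l - r for l, r in zip(left, right))

p = (0.5, 1.0)
ratios = []
for eps in (1e-1, 1e-2, 1e-3):
    d = defect(eps, eps, p)
    br = bracket_vw(p)
    # Size of the part of the defect not explained by the term a*b*[v, w]_p.
    remainder = max(abs(di - eps * eps * bi) for di, bi in zip(d, br))
    ratios.append(remainder / (eps * eps))
print(ratios)
```

For this pair the defect is exactly ab[v, w]_p − (0, a^2 b), so the printed ratios shrink linearly with eps: the discrepancy is of third order, consistent with ab[v, w]_p being the full second-order Taylor expansion.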

This `defect to commutativity of flows' is one of the more satisfying interpretations of the Jacobi identity. Given two flows, the question of whether they commute roughly asks if their flowlines form (smooth analogues of) rectangles. When they do not commute, think of the flowlines as forming broken rectangles. Given three vector fields, one can assemble the three broken rectangles into three of the faces of a broken cube. The side lengths of the non-broken edges of the rectangles are linear in the parameter t (as in Proposition 2.13), while the broken edge has length proportional to t^2. Completing the diagram we get the double Lie brackets, resulting in a `broken triangle' in the top corner. By Proposition 2.13 the side lengths of the broken triangle are proportional to t^3, but the sum of the displacements represented by the edges is zero by design, and the sum represents t^3 times the Jacobi identity, again by Proposition 2.13.

Lie Groups

The subject of Lie groups was originally motivated by the study of differential equations. As we've seen above, flows of ODEs produce (in reasonable situations) homomorphisms from the additive group of real numbers to the group of diffeomorphisms of a manifold. Sophus Lie was motivated by a somewhat more abstract notion of the symmetry properties of arbitrary differential equations. Differential equations tend to have infinite families of symmetries. The Platonic solids would be the archetypal example of objects with finite symmetries. The basic subject of Lie Groups is meant to be a framework for the study of continuously-varying (therefore infinite) symmetry.


Definition 2.14. A group is a pair (G, µ) where G is a set and µ : G × G → G is a function that

satises:

a) µ(a, µ(b, c)) = µ(µ(a, b), c) for all a, b, c ∈ G. This condition is called associativity of

(G,µ).

b) There is an element e ∈ G such that µ(e, a) = a = µ(a, e) for all a ∈ G. e is called an

identity element for (G,µ).

c) For every a ∈ G there is some b ∈ G such that µ(a, b) = e = µ(b, a). b is the inverse to a

with respect to (G,µ).

Typically, µ(a, b) is denoted ab or a ·_µ b. Rewriting (a) with this notation gives a(bc) = (ab)c, (b) gives ea = a = ae and (c) gives ab = e = ba. In (c), typically b is denoted a^{-1} as one can argue b is unique. If one erases condition (c) in the above definition, the object is called a monoid.

Example 2.15. Standard examples of groups are:

• The symmetric group of a set X. If X is a set, let G be all bijections X → X and µ the composition operation, i.e. if f, g : X → X are one-to-one and onto, µ(f, g) = f ∘ g, i.e. (f ∘ g)(x) = f(g(x)) for all x ∈ X. In the case X = {1, 2, · · · , n} the symmetric group on X is denoted Σ_n. In general, the symmetric group of a set X is denoted Σ(X) or sometimes one might see Bij(X) (`bijections of X') or Aut_Set(X) (`set-theoretic automorphisms of X').

• If F is a field (such as R or C), GL_n(F) is the General Linear Group. It consists of all n × n invertible matrices with entries in F, with µ being matrix multiplication. The neutral element of this group is typically denoted I.

• SL_n(F) ⊂ GL_n(F) is the subset of matrices such that det(A) = 1, called the special linear group of F. The group operation is also matrix multiplication.

• GL_n^+(R) ⊂ GL_n(R) is the subset of matrices such that det(A) > 0.

• O_n ⊂ GL_n(R) is the subset of matrices such that A^tA = I. This is the orthogonal group.

• U_n ⊂ GL_n(C) is the Unitary group. It is all matrices with complex entries satisfying A^*A = I, where A^* is the conjugate-transpose of A.

• GL_n(Z), the invertible n × n matrices with integer entries whose inverses also have integer entries, is a group.

• If L is the n × n matrix L = diag(−1, 1, 1, · · · , 1), the set of matrices A ∈ GL_n(R) such that A^tLA = L is called the Lorentz group, also sometimes denoted O_{1,n−1}.

Definition 2.16. (1) If f : G → H is a function whose domain G has a group structure (G, µ) and co-domain H also has a group structure (H, ν), f is a homomorphism of groups if ν(f(a), f(b)) = f(µ(a, b)) for all a, b ∈ G. In the alternative notation this says f(a) ·_ν f(b) = f(a ·_µ b).

(2) If H ⊂ G and µ : G ×G → G restricts to a map µ|H×H : H ×H → H making (H,µ|H×H) into

a group, then H is said to be a subgroup of G.

(3) A left coset of a subgroup H in a group G is a set

gH = {gh : h ∈ H}

and a right coset of a subgroup H in a group G is a set

Hg = {hg : h ∈ H}.

(4) A subgroup is normal if its left cosets are equal to its right cosets, i.e.

gH = Hg ∀g ∈ G.


(5) A one-to-one homomorphism of groups is called an injection or embedding or monomor-

phism, an onto homomorphism of groups a surjection or epimorphism, and a bijective ho-

momorphism of groups is an isomorphism.

Example 2.17. SLn(F ) is a subgroup of GLn(F ). On is a subgroup of GLn(R). Un is a subgroup of

GLn(C).

Proposition 2.18. Given a homomorphism f : G → H, the kernel

ker(f) = {g ∈ G : f(g) = e}

is a normal subgroup of G. Given any normal subgroup K of G, the quotient

G/K = {gK : g ∈ G}

is naturally a group, with operations

g_1K · g_2K = (g_1g_2)K

The map G → G/K given by g ↦ gK is an onto homomorphism with kernel equal to K. This is universal in a sense. Specifically, the homomorphism f : G → H fits into a commutative diagram of homomorphisms: f is the composite of the quotient map G → G/ker(f) with a homomorphism \overline{f} : G/ker(f) → H, and \overline{f} is injective (one-to-one).

Example 2.19. The n-th roots of unity in the complex plane

{z ∈ C : z^n = 1}

form a group under multiplication. The function

f : Z → S^1

given by f(k) = e^{2πik/n} is a homomorphism and its image is the n-th roots of unity. The kernel of f is nZ ⊂ Z, thus f induces an isomorphism of groups between Z/nZ and the n-th roots of unity. Sometimes Z/nZ is denoted Z_n.
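The claims in this example are easy to test numerically for a specific n. The sketch below takes n = 6 as an assumed sample value and checks the homomorphism property, the image, and the kernel on a finite window of integers; it is an illustration only, since a finite check cannot replace the proof.

```python
import cmath

def f(k, n):
    """f(k) = exp(2*pi*i*k/n), a map from Z to the unit circle."""
    return cmath.exp(2j * cmath.pi * k / n)

n = 6  # sample value, chosen arbitrarily

# Homomorphism property: f(j + k) = f(j) f(k) on a window of integers.
hom_error = max(abs(f(j + k, n) - f(j, n) * f(k, n))
                for j in range(-n, n) for k in range(-n, n))

# Every value of f is an n-th root of unity.
root_error = max(abs(f(k, n) ** n - 1) for k in range(n))

# Distinct values of f, rounded so that floating-point noise is merged.
image = {(round(f(k, n).real, 9), round(f(k, n).imag, 9)) for k in range(3 * n)}

# Integers in a window that map to 1: these should be the multiples of n.
kernel = [k for k in range(-2 * n, 2 * n + 1) if abs(f(k, n) - 1) < 1e-9]

print(hom_error, root_error, len(image), kernel)
```

The image has exactly n elements and the kernel window consists of multiples of n, matching the isomorphism between Z/nZ and the n-th roots of unity.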

Example 2.20. If G and H are groups, the product G × H is a group with group operations (g_1, h_1) · (g_2, h_2) = (g_1g_2, h_1h_2). The product has homomorphisms π_G : G × H → G and π_H : G × H → H defined by π_G(g, h) = g and π_H(g, h) = h. Moreover, if K is any group, α : K → G and β : K → H are any homomorphisms, there exists a unique homomorphism γ : K → G × H such that π_G ∘ γ = α and π_H ∘ γ = β.

Definition 2.21. A Lie Group is a pair (M, µ) which is a group, with M a manifold, such that µ : M × M → M is smooth, and the inversion function i : M → M, i(g) = g^{-1}, is smooth.

Example 2.22. GL_n(R), GL_n(C), SL_n(R), SL_n(C), GL_n(Z), SL_n(Z) = GL_n(Z) ∩ SL_n(R), O_n, SO_n = O_n ∩ SL_n(R), U_n, SU_n = U_n ∩ SL_n(C) and the Lorentz group O_{1,n−1} are all Lie groups.


Proof. (for O_n) First of all, some additional context. Although O_n = {A ∈ GL_n(R) : A^tA = I} we can reinterpret this. The standard inner product on R^n is the function µ : R^n × R^n → R given by µ(v, w) = v^tw where v and w are thought of as n × 1 matrices, and v^t means the transpose of v. Notice that a linear function f : R^n → R^n satisfies µ(f(v), f(w)) = µ(v, w) for all v, w ∈ R^n if and only if v^tA^tAw = v^tw for all v, w, where A is the matrix representation of f with respect to the standard basis e_1, · · · , e_n. But if we let v = e_i and w = e_j this formula becomes e_i^tA^tAe_j = e_i^te_j = δ_{i,j} (Kronecker delta), thus A^tA = I. Thus O_n is the set of matrix representations of all linear isometries R^n → R^n. This explains why it is a group: composition preserves the property of being an isometry.

The explanation of why it is a manifold is entertaining. Consider f : GL_n(R) → M_{n,n}(R) given by f(A) = A^tA − I. Notice that f^{-1}(0) = O_n. A computation gives Df_A(H) = H^tA + A^tH. Notice that as a function of H, this is not an onto linear transformation since this matrix is symmetric. So reconsider f to be a function to the symmetric n × n matrices, f : GL_n(R) → S_{n,n}. S_{n,n} is a finite-dimensional vector space, and for A ∈ O_n, Df_A(H) = H^tA + A^tH is an onto linear transformation: if B is symmetric, B = (B + B^t)/2, so if we solve H^tA = B/2, then A^tH = B^t/2 comes for free, as one is the transpose of the other. Since A^tA = I, H = AB/2 is our solution. Since S_{n,n} is (n+1)n/2-dimensional, the implicit function theorem tells us O_n is \binom{n}{2}-dimensional.
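Both the derivative formula Df_A(H) = H^tA + A^tH and the solution H = AB/2 can be sanity-checked numerically for n = 2. The sketch below uses plain nested lists; the rotation angle, the symmetric matrix B, the direction K and all helper names are assumptions picked for illustration.

```python
import math

def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(X):
    return [list(col) for col in zip(*X)]

def combine(a, X, b, Y):
    """Entrywise a*X + b*Y."""
    return [[a * x + b * y for x, y in zip(r, s)] for r, s in zip(X, Y)]

def f(A):
    """f(A) = A^t A - I."""
    n = len(A)
    P = matmul(transpose(A), A)
    return [[P[i][j] - (1.0 if i == j else 0.0) for j in range(n)]
            for i in range(n)]

def Df(A, H):
    """Df_A(H) = H^t A + A^t H, the formula from the text."""
    return combine(1.0, matmul(transpose(H), A), 1.0, matmul(transpose(A), H))

def max_abs(X):
    return max(abs(x) for row in X for x in row)

theta = 0.4
A = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]      # a point of O_2
B = [[2.0, -1.0], [-1.0, 3.0]]                # an arbitrary symmetric target

H = combine(0.5, matmul(A, B), 0.0, B)        # H = AB/2
solve_error = max_abs(combine(1.0, Df(A, H), -1.0, B))

K = [[0.3, -0.7], [1.1, 0.2]]                 # an arbitrary direction
h = 1e-6
finite_diff = combine(1 / (2 * h), f(combine(1.0, A, h, K)),
                      -1 / (2 * h), f(combine(1.0, A, -h, K)))
deriv_error = max_abs(combine(1.0, finite_diff, -1.0, Df(A, K)))

print(solve_error, deriv_error)
```

Both errors should be near machine precision. The choice H = AB/2 relies on A being orthogonal; for a general invertible A one would instead solve H^tA = B/2 directly.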

(for GLn(Z)) This is a zero-dimensional manifold, since GLn(Z) ⊂ Mn,n(R) is discrete.

(for the rest) Exercise.

Definition 2.23. • A homomorphism of Lie Groups is a smooth function f : G → H where

G and H are Lie groups, and f is a homomorphism of the underlying groups.

• If H and G are Lie groups, and H ⊂ G with inclusion i : H → G an embedding which is also

a homomorphism of Lie groups, then we say H is a Lie subgroup of G.

Example 2.24. • Let H and G be any pair from Example 2.22 such that H ⊂ G, then H is a

Lie subgroup of G.

• The product of two Lie groups is a Lie group.

Proposition 2.25. Let f : G → H be a homomorphism of Lie groups. Then Dfp : TpG → Tf (p)H has

a rank that is independent of p ∈ G, i.e. the dimension of Dfp(TpG) does not depend on p.

Proof. Given p ∈ G, L_p : G → G defined by L_p(g) = pg is called left-multiplication by p. Notice that L_p ∘ L_q = L_{pq}, and L_{p^{-1}} ∘ L_p = Id_G, so L_{p^{-1}} = L_p^{-1}, in particular L_p is a diffeomorphism of G for all p ∈ G. Moreover, since f is a homomorphism, f(gh) = f(g)f(h), so L_{f(g)} ∘ f = f ∘ L_g for all g ∈ G. Applying the chain rule gives:

(DL_{f(g)})(f(p))(Df_p(v)) = Df_{gp}((DL_g)(p)(v))

Since DL_g and DL_{f(g)} are isomorphisms of tangent spaces for all g, this gives the result.

Actions of groups

A group is in a sense always an object that's a step away from the object one is usually interested

in. Most often, the thing one cares about for a group is its action on an object. For Lie Groups, this

will be the action on a manifold.


Definition 2.26. An action of a group G on a set X is a function µ : G × X → X such that µ(g, µ(h, x)) = µ(gh, x) for all (g, h, x) ∈ G × G × X, and µ(e, x) = x for all x ∈ X. Typically, µ(g, x) is denoted g.x, so the conditions for an action are expressed as

g.(h.x) = (gh).x    e.x = x

i.e. the first condition is a type of associativity axiom, and the second is the identity axiom.

Technically, the above condition is called a left action of G on X. In a right action, the associativity condition g.(h.x) = (gh).x is replaced by g.(h.x) = (hg).x. This looks more natural if one writes x.g instead of g.x for µ(g, x), i.e. (x.h).g = x.(hg) is the associativity axiom for a right action. Given a left action of G on X, there is always an associated right action, and vice-versa. This is because every group G admits the anti-automorphism g ↦ g^{-1}. So if µ is a left action of G on X, then µ'(g, x) = µ(g^{-1}, x) is a right action.

Remark 2.27. When µ : G × X → X is an action, the map

\overline{µ} : G → Σ(X)

defined by \overline{µ}(g)(x) = µ(g, x) is a homomorphism from G to the symmetric group of X. \overline{µ} is called the adjoint of µ. Similarly, given a homomorphism from a group to the symmetric group of a set, you can write it as the adjoint of a group action on the set. So group actions can be thought of as being precisely homomorphisms to symmetric groups.

Example 2.28. • The symmetric group Σ(X) acts on X, where if f : X → X is a bijection,

f .x = f (x).

• GLn(F ) acts on F n by A.v = A(v) where v is a column vector.

• A group G acts on itself `on the left' g.h = gh.

• Also `on the right' g.h = hg^{-1}.

• Also by conjugation, g.h = ghg−1. Since conjugation is a homomorphism of groups, the

adjoint to this action is a homomorphism G → Aut(G) from G to the group of automorphisms

of G.

• S1 ⊂ C acts on S2n−1 ⊂ Cn by complex multiplication.

Definition 2.29. If G is a Lie Group and M a smooth manifold, a group action

µ : G × M → M

is a Lie Group action if it is a smooth map.

Notice that if µ is a Lie Group action, the adjoint map \overline{µ} : G → Σ(M) is mapping to the subgroup Diff(M) ⊂ Σ(M), i.e. Lie Group actions produce adjoints that are homomorphisms to the diffeomorphism group of M. Notice that not every homomorphism G → Diff(M) is the adjoint to a Lie Group action, since the corresponding map G × M → M may not be smooth. To rectify this problem one would have to find a suitable notion of `smooth structure' on Diff(M) to talk about maps G → Diff(M) being smooth. This is the birthpoint of a subject called global analysis, which is the subject where calculus of variations lives.

Definition 2.30. Let x ∈ X and suppose µ : G × X → X is an action of G on the set X.

• The orbit of x is G.x = {g.x : g ∈ G}. The set of all orbits of the action is typically denoted X/G.

• The stabilizer subgroup of x ∈ X is G_x = {g ∈ G : g.x = x}. This is also sometimes called the isotropy subgroup of x.

• A fixed point of the action is an element x ∈ X such that g.x = x ∀g ∈ G. The set of fixed points of the action is typically denoted X^G.

• The stabilizer subgroup of a subset Y ⊂ X is typically denoted Stab_G(Y) = {g ∈ G : g.y ∈ Y ∀y ∈ Y}. Another way to think of this is that if G acts on X it acts on the set of all subsets of X, and Y is a stabilized element under that induced action.

• If a Lie group G is acting on R^n, the action is said to be linear if g.(av + bw) = a(g.v) + b(g.w) for all a, b ∈ R and v, w ∈ R^n.
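For a finite group these notions can be computed by brute force. The sketch below takes the symmetric group on {0, 1, 2} acting on itself by conjugation (one of the actions listed in Example 2.28); the tuple encoding of permutations and the helper names are assumptions of this illustration. It also checks the standard orbit-stabilizer count |G.x| · |G_x| = |G|, a fact from elementary group theory that the notes do not state here.

```python
from itertools import permutations

def compose(f, g):
    """(f o g)(x) = f(g(x)) for permutations stored as tuples."""
    return tuple(f[g[x]] for x in range(len(g)))

def inverse(f):
    inv = [0] * len(f)
    for x, fx in enumerate(f):
        inv[fx] = x
    return tuple(inv)

G = list(permutations(range(3)))   # the symmetric group on {0, 1, 2}
e = tuple(range(3))                # identity permutation

def act(g, h):
    """Conjugation action g.h = g h g^{-1} of G on itself."""
    return compose(compose(g, h), inverse(g))

# The two action axioms: e.x = x and g.(h.x) = (gh).x.
axioms_hold = all(act(e, x) == x for x in G) and all(
    act(g, act(h, x)) == act(compose(g, h), x)
    for g in G for h in G for x in G)

def orbit(x):
    return {act(g, x) for g in G}

def stabilizer(x):
    return [g for g in G if act(g, x) == x]

orbits = {frozenset(orbit(x)) for x in G}            # the set X/G
orbit_sizes = sorted(len(o) for o in orbits)
fixed_points = [x for x in G if all(act(g, x) == x for g in G)]
orbit_stabilizer = all(len(orbit(x)) * len(stabilizer(x)) == len(G) for x in G)

print(axioms_hold, orbit_sizes, fixed_points, orbit_stabilizer)
```

The orbits here are the conjugacy classes (identity, 3-cycles, transpositions), and the only fixed point is the identity permutation.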

Proposition 2.31. If µ : G ×M → M is a Lie group action, the stabilizers of points in M are closed

subgroups of G.

Among the most basic actions of Lie groups are the representations.

Definition 2.32. If G is a Lie group, a representation of G is a smooth homomorphism

G → GLnR.

The representation is said to be faithful if it is one-to-one.

Proposition 2.33. Given a representation f : G → GL_n R, the corresponding action

f : G × R^n → R^n

is smooth and linear, f(g, v) = f(g)(v). Moreover, every smooth linear action G × R^n → R^n corresponds to a unique representation of G.

Example 2.34. • The tautological action of GL_n R on R^n is A.v = Av, i.e. matrix multiplication.

• The unit complex numbers S^1 act on S^3 ⊂ C^2 by the (p, q)-Seifert action

z.(z_1, z_2) = (z^p z_1, z^q z_2).

Provided the integers (p, q) are coprime, the adjoint map S^1 → SO_4 is injective. The orbits of this action are all diffeomorphic to circles, but the stabilizer group is either trivial, cyclic of order p, or cyclic of order q.

A basic theorem in Lie theory is the closed subgroup theorem.

Theorem 2.35. A subgroup H ⊂ G of a Lie group G is a Lie subgroup (i.e. a submanifold) if and

only if it is closed as a subset of G.

The proof will appear below in the `Some Computations' section.

For example, Q ⊂ R is a subgroup but it is not closed. The point of the closed subgroup theorem

is that a subgroup of a Lie group fails to be a Lie subgroup only when it is not closed.

Corollary 2.36. A continuous group homomorphism between Lie groups is smooth, i.e. it is a Lie group homomorphism.

Proof. If f : G → H is a continuous homomorphism, the graph graph(f) ⊂ G × H is a subgroup. Since f is continuous, and G × H is Hausdorff, the graph is closed, therefore by Theorem 2.35 it is a Lie subgroup. So f is smooth. This last step is a direct application of the inverse function theorem.


Definition 2.37. Given Lie groups G and H, and an action of G on H by automorphisms, the semi-direct product G ⋉ H is defined as the manifold G × H with the group operation

(g_1, h_1) · (g_2, h_2) = (g_1 g_2, (g_2^{-1}.h_1) h_2).

The phrase `G acts on H by automorphisms' means that g.(h_1 h_2) = (g.h_1)(g.h_2) for all g ∈ G and h_1, h_2 ∈ H.
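The group operation above can be tested on a small finite example. The following is a minimal sketch, not part of the notes' development: it assumes the illustrative choice G = Z_2 acting on H = Z_n by inversion (which yields the dihedral group of order 2n), and the names `make_semidirect`, `act` and `mul` are hypothetical helpers introduced only for this check.

```python
from itertools import product

def make_semidirect(n):
    """Build Z_2 ⋉ Z_n, where Z_2 acts on Z_n (written additively) by g.h = (-1)^g * h."""
    def act(g, h):
        # The non-trivial element of Z_2 acts by inversion, which is an automorphism of Z_n.
        return h % n if g == 0 else (-h) % n

    def mul(a, b):
        # (g1, h1) * (g2, h2) = (g1 g2, (g2^{-1}.h1) h2), with both groups written additively.
        (g1, h1), (g2, h2) = a, b
        g2_inv = g2  # every element of Z_2 is its own inverse
        return ((g1 + g2) % 2, (act(g2_inv, h1) + h2) % n)

    elements = list(product(range(2), range(n)))
    return elements, mul

elements, mul = make_semidirect(3)
identity = (0, 0)

# Exhaustively check the group axioms that the formula is supposed to guarantee.
is_associative = all(
    mul(mul(a, b), c) == mul(a, mul(b, c))
    for a in elements for b in elements for c in elements
)
has_identity = all(mul(identity, a) == a == mul(a, identity) for a in elements)
is_abelian = all(mul(a, b) == mul(b, a) for a in elements for b in elements)
```

With n = 3 the resulting group has six elements and is not abelian, which illustrates that the semi-direct product operation genuinely differs from the product group operation on the same underlying set.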

Proposition 2.38. G ⋉ H is a Lie group. It has the properties that:

• The map π_G : G ⋉ H → G given by π_G(g, h) = g is an onto homomorphism of Lie groups.

• The inclusion i_H : H → G ⋉ H given by i_H(h) = (e, h) is a homomorphism of Lie groups, and i_H(H) = ker π_G is a normal subgroup of G ⋉ H.

• The inclusion i_G : G → G ⋉ H given by i_G(g) = (g, e) is a homomorphism of Lie groups.

• i_G(G) is a normal subgroup of G ⋉ H if and only if the action of G on H is trivial, i.e. g.h = h for all g ∈ G and h ∈ H. In this case, G ⋉ H = G × H as Lie groups.

The purpose of semi-direct products is that one frequently encounters Lie groups that are products

(as manifolds), but their group operation isn't the product group operation. We have a proposition

useful for recognition of this situation.

Proposition 2.39. Given a Lie group A and an onto homomorphism of Lie groups f : A → G, the group A is isomorphic to G ⋉ ker(f) provided there is a Lie group homomorphism i_G : G → A such that f ∘ i_G = Id_G. The action of G on ker(f) is g.h = i_G(g) h i_G(g)^{-1}.

Proof. The homomorphism φ : G ⋉ ker(f) → A is given by mapping (g, h) ↦ i_G(g)h. The inverse is given by a ↦ (f(a), i_G(f(a))^{-1} a). First, let's verify φ is a homomorphism.

φ(g_1, h_1)φ(g_2, h_2) = i_G(g_1) h_1 i_G(g_2) h_2 = i_G(g_1) i_G(g_2) i_G(g_2)^{-1} h_1 i_G(g_2) h_2

= i_G(g_1 g_2) i_G(g_2^{-1}) h_1 i_G(g_2) h_2 = φ(g_1 g_2, (g_2^{-1}.h_1) h_2)

as we claimed. We leave it to the reader to verify the formula for φ^{-1}. That φ and its inverse are smooth follows from the formulas: they are composites of smooth functions.

Corollary 2.40. As a Lie group, O_n ≅ Z_2 ⋉ SO_n. This follows because we have the extension SO_n → O_n → {+1, −1} given by the determinant. Moreover, the inclusion {+1, −1} → O_n is given by associating to −1 any mirror reflection in O_n or, more generally, any element of order 2 in O_n whose determinant is −1.

The above corollary indicates that semi-direct product decompositions are frequently not unique.

For On one has to make a choice of certain elements of order 2. A Lie group rarely has a canonical

such decomposition.

In the special case that n is odd, there is an excellent choice to make: −I is an element of order 2 in O_n with determinant −1. Moreover, −I commutes with every element in SO_n (indeed, every element of O_n), thus in this case O_n ≅ Z_2 × SO_n, i.e. the semi-direct product is a direct product.

Typically, a semi-direct product is written in two ways, G ⋉ H and H ⋊ G. In both situations, G is expected to be acting on H by automorphisms, with G ⋉ H = G × H as manifolds, and H ⋊ G = H × G as manifolds. But the multiplication is defined slightly differently in the latter case. It is assumed here that G acts on H from the right.

Lie Algebras


Definition 2.41. A vector field v : G → TG on a Lie group G is left-invariant if (L_g)_* v = v for all g ∈ G, where L_g : G → G is left-multiplication by g, i.e. L_g(h) = gh. Similarly, a vector field v : G → TG is right-invariant if (R_g)_* v = v for all g ∈ G. A vector field is bi-invariant if it is both left- and right-invariant. One can make the same kinds of definitions for real-valued functions, bilinear forms, and any object that has push-forwards with respect to diffeomorphisms.

Example 2.42. Notice that every Lie group has non-trivial left-invariant vector fields, and also right-invariant vector fields. Given a vector v_e ∈ T_e G, we can extend it to a left-invariant vector field v^L : G → TG via the formula

v^L_g = D(L_g)_e v_e.

v^L is left-invariant since

((L_g)_* v^L)_p = D(L_g)_{g^{-1}p} D(L_{g^{-1}p})_e v_e = D(L_p)_e v_e

by the chain rule.

In an abelian Lie group, a left-invariant vector field is also a right-invariant vector field, since L_g = R_g for all g. But in a non-abelian Lie group, sometimes the only bi-invariant vector field is zero.

In GL_n R the only bi-invariant vector fields are A ↦ (A, tA) for t ∈ R. To see this, recall that T GL_n R = GL_n R × M_{n,n} R. Given H ∈ M_{n,n} R, the left-invariant vector field that agrees with H at I is A ↦ (A, AH), and the right-invariant vector field that agrees with H at I is A ↦ (A, HA). Thus these vector fields are bi-invariant if and only if

AH = HA ∀A ∈ GL_n R.

Since GL_n R is an open dense subset of M_{n,n} R the above formula must hold for all A ∈ M_{n,n} R as well.

The matrix H = tI commutes with all matrices. We will show that no other matrices do. Let HE = EH where E is the elementary matrix such that H ↦ EH adds the i-th row to the j-th, with i ≠ j. This tells us that the (i, i)-th and (j, j)-th entries of H are identical, and the (i, j)-th entry of H is zero. If we use all i ≠ j then we see H = tI for some t ∈ R.

Proposition 2.43. The Lie bracket of two left-invariant (resp. right-invariant) vector fields is left-invariant (resp. right-invariant). The map that associates v_e ∈ T_e G to a left (or right)-invariant vector field v identifies the left (or right)-invariant vector fields on G with T_e G. Thus left- and right-invariant vector fields form a vector space of dimension dim(G).

Proof. That the bracket of left/right-invariant vector fields is left/right-invariant follows from Proposition 2.9 part (c). Denote the left-invariant vector fields on G by Γ_L(G), and the right-invariant vector fields by Γ_R(G). That the map Γ_L(G) → T_e G given by v ↦ v_e is a bijection follows from the converse construction, v_e ↦ v^L.

Definition 2.44. • The left-invariant vector fields of a Lie Group G together with the Lie Bracket are called the Lie Algebra of the Lie Group. Denote the left-invariant vector fields on G by Γ_L(G).

• Given g ∈ G, conjugation by g, C_g = L_g ∘ R_{g^{-1}}, is a diffeomorphism of G, C_g(x) = gxg^{-1}. C_g fixes the identity: C_g(e) = geg^{-1} = e.

• The derivative of C_g at e gives a homomorphism

Ad : G → Aut(T_e G), g ↦ D(C_g)_e

called the Adjoint representation of G. Notice that Ad(g) : T_e G → T_e G is not just a linear automorphism of the vector space, but Ad(g)[v, w] = [Ad(g)(v), Ad(g)(w)], i.e. it is an automorphism of the Lie algebra.

Definition 2.45. A vector space V over a field F, together with a bilinear map [·, ·] : V × V → V such that:

(a) [v, w] + [w, v] = 0 ∀v, w ∈ V

(b) [v, [w, x]] + [w, [x, v]] + [x, [v, w]] = 0 ∀v, w, x ∈ V

is called a Lie Algebra.

Example 2.46. Provided 1 + 1 ≠ 0 in the field F, prove that condition (a) in the definition of a Lie Algebra is equivalent to [v, v] = 0 ∀v ∈ V.

Proposition 2.47. If f : G → H is a homomorphism of Lie Groups, and v : G → TG is a left-invariant vector field, then there is a unique left-invariant vector field w on H such that Df ∘ v = w ∘ f, i.e. Df_e : T_e G → T_e H is a homomorphism of Lie algebras.

Proposition 2.47 has a somewhat more difficult partial converse, which states that if there exists a Lie algebra homomorphism T_e G → T_e H and if G is simply-connected, i.e. π_0 G = π_1 G = 0, then there

is a Lie group homomorphism f : G → H whose derivative Dfe : TeG → TeH is the original Lie algebra

homomorphism. So the theory of Lie groups becomes closely connected to the theory of Lie algebras.

This converse uses the language of algebraic topology, and the proof uses some basic facts about the

fundamental group π1 and covering spaces. The simple-connectivity assumption is essential, and we

encourage the reader to do a few experiments to see why.

Proposition 2.48. The derivative of the Adjoint representation Ad : G → Aut(TeG) at the identity

is the Lie Bracket, i.e.

D(Ad)e(A)(B) = [A,B].

We are considering Aut(T_e G) as an open subset of the vector space Hom(T_e G, T_e G), thus T_{Id} Aut(T_e G) = Hom(T_e G, T_e G), and we can consider D(Ad)_e : T_e G → Hom(T_e G, T_e G) or, equivalently, we could think of it as a bilinear map in Bilin(T_e G × T_e G, T_e G).

Frequently D(Ad)e is denoted ad .

The exponential map

Given a left-invariant vector field v on a Lie Group, notice that for g ∈ G, the curve R ∋ t ↦ L_g(Φ_v(t, L_{g^{-1}}(p))) satisfies the same ODE as Φ_v(t, p). Thus L_g ∘ Φ_v(t, ·) = Φ_v(t, ·) ∘ L_g for all t for which the expression is defined, and for all g ∈ G. Thus the maximal domains of our integral curves are independent of the point in G they pass through, and so the flow of every left-invariant vector field is complete, i.e. Φ_v : R × G → G.


Given a vector v ∈ T_e G, let Φ^L_v : R × G → G be the flow of the associated left-invariant vector field which agrees with v at e. Similarly, let Φ^R_v : R × G → G be the flow associated to the corresponding right-invariant vector field.

Definition 2.49. The exponential map T_e G → G is defined to be the map

exp : T_e G → G, v ↦ Φ^L_v(1, e).

Proposition 2.50. Notice that Φ^L_v(1, e) = Φ^R_v(1, e) for all v ∈ T_e G. Moreover,

Φ^L_v(t, g) = g · exp(tv)

Φ^R_v(t, g) = exp(tv) · g

for all (t, g) ∈ R × G and v ∈ T_e G. The derivative of the exponential map at 0 is the identity, D exp_0 : T_0 T_e G ≡ T_e G → T_e G, D exp_0 = Id_{T_e G}, and so the exponential map is a diffeomorphism in some neighbourhood of 0 ∈ T_e G.

Proof. In the initial paragraph we had observed that Φ^L_v is `left invariant' in the sense that gΦ^L_v(t, p) = Φ^L_v(t, gp) for all g, p ∈ G and t ∈ R, thus Φ^L_v(t, g) = gΦ^L_v(t, e) = gΦ^L_{tv}(1, e) = g · exp(tv). Similarly, Φ^R_v is right-invariant in the sense that Φ^R_v(t, pg) = Φ^R_v(t, p) · g, thus Φ^R_v(t, g) = Φ^R_v(t, e) · g = Φ^R_{tv}(1, e) · g.

Using the group action property of flows, we can further see that left-invariant vector fields, when restricted to their flow-line out of e, are also right-invariant. Specifically,

Φ^L_v(t_1 + t_2, e) = Φ^L_v(t_1, Φ^L_v(t_2, e)) = Φ^L_v(t_2, e) · Φ^L_v(t_1, e) = Φ^L_v(t_1, e) · Φ^L_v(t_2, e).

Differentiating this expression with respect to t_2, for t_2 = 0, gives

v^L(Φ^L_v(t_1, e)) = D(R_{Φ^L_v(t_1, e)})(v) = D(L_{Φ^L_v(t_1, e)})(v)

thus Φ^R_v(t, e) = Φ^L_v(t, e) for all v ∈ T_e G and for all t ∈ R.

In the case of matrix groups, the exponential map has a particularly simple analytic form.

Proposition 2.51. In GL_n R, exp : T_I GL_n R → GL_n R is given by the power series

exp(A) = ∑_{k=0}^∞ (1/k!) A^k.

One can of course worry if the formula is well-dened, smooth, and if exp(A) ∈ GLnR. These

questions are addressed below.

Proposition 2.52. The series for the exponential map above converges absolutely and uniformly on compact sets. If ||A|| denotes the matrix norm, then

||exp(A)|| ≤ e^{||A||}.

So the exponential map is well-defined and C^∞-smooth. Moreover, it satisfies the differential equation

∂exp(tA)/∂t = A exp(tA) = exp(tA) A.

Using the properties of the determinant, one can show that the function Det(exp(tA)) satisfies the differential equation ∂Det(exp(tA))/∂t = tr(A) Det(exp(tA)), thus Det(exp(tA)) = e^{tr(A)t}.
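The determinant identity can be checked numerically from a truncated power series. The following is a rough sketch under stated assumptions: the helpers `mat_mul`, `mat_exp` and `det2` are hypothetical names, the series is cut off after 40 terms (ample for a matrix of small norm, but not a general-purpose algorithm), and the sample matrix A is arbitrary.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=40):
    """Partial sum of sum_k A^k / k! for k = 0, ..., terms - 1."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # holds A^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[0.3, -1.2], [0.7, 0.5]]   # arbitrary sample matrix
E = mat_exp(A)
trace_A = A[0][0] + A[1][1]
det_gap = abs(det2(E) - math.exp(trace_A))  # should be at round-off level
```

In particular Det(exp(A)) is positive, which is consistent with exp(A) lying in GL_n R.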



Proposition 2.53. If f : G → H is a homomorphism of Lie groups, f(exp(v)) = exp(Df_e(v)), i.e. one has a commutative diagram:

    T_e G ---Df_e---> T_e H
      |                 |
     exp               exp
      |                 |
      v                 v
      G -------f------> H

The next theorem at its heart is very simple. It states that a subgroup of a Lie group is itself a Lie group (i.e. a submanifold) if and only if it is a closed subspace. Before proving the theorem, consider what might (at first) appear to be a massive over-simplification. Let V be a subgroup of R^n, thought of as an additive group via addition of vectors. Let's try to prove V is a manifold. Of course, V need not be a manifold, as V = Q^n is not locally Euclidean. To rule out cases such as this, consider the case that V is closed. If V is closed, I claim that the path-component of V containing 0 is a vector space, and therefore V is a manifold. Notice that V itself may not be a vector subspace, as Z^n ⊂ R^n is a closed subgroup. So assume V ⊂ R^n is a closed, connected subgroup. We want to prove it is a vector subspace of R^n. Without loss of generality, we can assume span(V) = R^n, otherwise the problem reduces to the identical problem with n of smaller dimension. In particular, we can assume that in any neighbourhood of 0, V contains a spanning set for R^n, otherwise V would not be connected. By the homogeneity of groups, this implies V is dense in R^n, and since V is closed, V = R^n.

Theorem 2.54. A subgroup of a Lie group is a submanifold if and only if it is closed.

Proof. The proof is very much in the same spirit as the case where G is a vector space. Let H be a connected, closed subgroup of G. Without loss of generality we can assume that for every neighbourhood U of 0 ∈ T_e G on which the exponential map is a diffeomorphism, (exp|_U)^{-1}(H) contains a spanning set for T_e G, since if this were not the case we could reduce to the same problem but with G of lower dimension. The lower-dimensional group would be the group generated by the exponential of the subspace of T_e G that (exp|_U)^{-1}(H) generates a dense subspace of. This follows from the linear case, since

D(µ)_{(e,e)} : T_e G × T_e G → T_e G, (v, w) ↦ v + w

and thus multiplication in G, when pulled back to the tangent space via the exponential map, looks approximately like vector addition. Thus by induction on the dimension of the ambient group we have a proof.

Corollary 2.55. A continuous homomorphism of Lie groups G → H must be smooth.

Proof. Given any homomorphism f : G → H (not even continuous), its graph

graph(f) = {(g, f(g)) : g ∈ G} ⊂ G × H

is a subgroup. If f is continuous, it is a closed subspace, so by Theorem 2.54 it is a Lie subgroup. The composition of the inclusion graph(f) → G × H together with the projection G × H → G is a local diffeomorphism by the inverse function theorem, so f is a smooth map.

Definition 2.56. A subgroup T ⊂ G of a Lie group G is said to be a torus if T is abelian, connected and compact. T is said to be a maximal torus if it is a torus and it is not contained in any strictly larger torus.


Example 2.57. The only Lie subgroups of SO3 that are abelian and connected are either trivial, or

isomorphic to S1, thus these latter groups are maximal tori. These non-trivial connected abelian

subgroups are precisely all the rotations about a given axis in R3. One proves this by observing every

element of SO3 has a +1-eigenvector, and two elements of SO3 commute if and only if they share a

+1-eigenvector. Thus, every element of SO3 sits in a subgroup isomorphic to S1, moreover, any two

subgroups of SO3 isomorphic to S1 are conjugate.

The next theorem is a vast generalization of the above observation.

Theorem 2.58. Any two maximal tori in a compact connected Lie group G are conjugate. Every

element of G is contained in a maximal torus.

Before we begin the proof of Theorem 2.58 we need a few preliminary results about abelian groups,

and something called the Weyl group.

Lemma 2.59. A connected Lie group is abelian if and only if the exponential map is a homomor-

phism of Lie groups. Moreover, for connected abelian Lie groups, the exponential map is an onto

homomorphism.

Proof. In an abelian Lie group, left and right multiplication are identical, L_g = R_g, so Φ^L_v = Φ^R_v, which we will label unambiguously Φ_v : R × G → G. Consider the ODE for the flow Φ_{v+w}(t, g). We wish to show it satisfies the same ODE as the function Φ_v(t, Φ_w(t, g)). By Proposition 2.50 we have

Φv (t,Φw (t, g)) = Φv (t, e)Φw (t, e)g = exp(tv)exp(tw)g

and similarly

Φv+w (t, g) = Φv+w (t, e)g = exp(t(v + w))g.

Differentiating both expressions with respect to t gives

∂Φ_v(t, Φ_w(t, g))/∂t = ∂(exp(tv) exp(tw) g)/∂t

= D(L_{exp(tw)g})_{exp(tv)}(v(exp(tv))) + D(L_{exp(tv)g})_{exp(tw)}(w(exp(tw)))   (by the chain rule)

= v(exp(tv) exp(tw) g) + w(exp(tv) exp(tw) g)

= (v + w)(Φ_v(t, Φ_w(t, g)))

∂Φ_{v+w}(t, g)/∂t = ∂(exp(t(v + w)) g)/∂t = (v + w)(Φ_{v+w}(t, g))

Thus they both satisfy the same ODE, so they are the same ows by the existence and uniqueness

theorem. We have just veried that

exp(v + w) = Φv+w (1, e) = Φv (1, e)Φw (1, e) = exp(v)exp(w)

i.e. that the exponential map is a homomorphism of Lie groups provided the Lie group is abelian. To show the converse, i.e. if it is a homomorphism then the Lie group is abelian, notice that since T_e G is an abelian group, the image of exp is an abelian group. But we have seen that exp is a local diffeomorphism in a neighbourhood of 0, thus the image of exp is an open abelian subgroup of G. But as we have seen, any connected Lie group is generated by a neighbourhood of the identity element, thus exp is onto and G is abelian.


Example 2.60. SL_2 R is a connected but non-compact Lie group. The exponential map is not onto. To see this, consider the Lie algebra of SL_2 R, T_I SL_2 R = {H ∈ M_{2,2} R : tr(H) = 0}. We will compute exp(H) via the Jordan form. The characteristic polynomial of a 2 × 2 matrix H is

λ^2 − tr(H)λ + Det(H)

thus for our H ∈ T_I SL_2 R the eigenvalues are λ = ±√(−Det(H)).

• In the case Det(H) > 0 the eigenvalues are ±iθ with θ = √(Det(H)), and this gives

    H = B [  0   θ ] B^{-1}
          [ −θ   0 ]

where B ∈ M_{2,2} R is invertible. Thus

    exp(H) = B [  cos θ   sin θ ] B^{-1}.
               [ −sin θ   cos θ ]

• In the case Det(H) < 0 the eigenvalues are ±α with α = √(−Det(H)), and this gives

    H = B [ α    0 ] B^{-1}
          [ 0   −α ]

where B ∈ M_{2,2} R is invertible. Thus

    exp(H) = B [ e^α     0    ] B^{-1}.
               [ 0     e^{−α} ]

• In the case Det(H) = 0 the only eigenvalue is zero, so either H = 0 and exp(H) = I, or

    H = B [ 0   0 ] B^{-1},
          [ 1   0 ]

i.e. H has rank 1 and

    exp(H) = B [ 1   0 ] B^{-1}.
               [ 1   1 ]

Thus tr(exp(H)) ≥ −2, and it is equal to −2 if and only if exp(H) = −I. Thus the matrices conjugate to

    [ −1    0 ]
    [  1   −1 ]

are not in the image of the exponential map for SL_2 R. One can further argue that, apart from the matrices of trace less than −2 (which the inequality above also excludes), they are the only matrices not in the image of the exponential map.
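The trace bound in this example can be explored numerically. The sketch below is an illustration only: it relies on the Cayley–Hamilton identity H^2 = −Det(H) I for a traceless 2 × 2 matrix (a standard fact that is not stated in the notes), which collapses the power series to exp(H) = c I + s H with c and s given by trigonometric or hyperbolic functions. The function name `exp_traceless` and the integer sample grid are assumptions made for the illustration.

```python
import math

def exp_traceless(a, b, c):
    """exp(H) for H = [[a, b], [c, -a]], using H^2 = -det(H) * I."""
    d = -a * a - b * c  # det(H)
    if d > 0:            # eigenvalues +-i*sqrt(d): rotation-like case
        r = math.sqrt(d)
        cs, sn = math.cos(r), math.sin(r) / r
    elif d < 0:          # real eigenvalues +-sqrt(-d): hyperbolic case
        r = math.sqrt(-d)
        cs, sn = math.cosh(r), math.sinh(r) / r
    else:                # H is nilpotent, so exp(H) = I + H
        cs, sn = 1.0, 1.0
    return [[cs + sn * a, sn * b], [sn * c, cs - sn * a]]

# Sample traceless matrices with small integer entries and record tr(exp(H)).
traces = []
for a in range(-3, 4):
    for b in range(-3, 4):
        for c in range(-3, 4):
            E = exp_traceless(a, b, c)
            traces.append(E[0][0] + E[1][1])
min_trace = min(traces)

# H = [[0, pi], [-pi, 0]] has determinant pi^2, and exp(H) should be -I.
rot = exp_traceless(0.0, math.pi, -math.pi)
```

On this sample the trace never drops below −2, and the value −2 is attained at exp(H) = −I, in agreement with the case analysis above.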

We will see in Proposition 2.68 that for compact connected Lie groups, the exponential map is onto.

Theorem 2.61. An abelian Lie group is isomorphic to a product of three groups G_1 × G_2 × G_3 where G_1 is a finitely-generated abelian group, G_2 is a product of circles, and G_3 is a Euclidean space (as a vector space). A connected Lie group is abelian if and only if the exponential map is a homomorphism of groups. If an abelian Lie group G is compact and connected it is isomorphic to (S^1)^n for some n. A topological generator of a compact connected abelian Lie group G is any element z ∈ G such that the closure of {z^k : k ∈ Z} is G. For compact connected abelian Lie groups, the topological generators are dense. Moreover, any non-zero power of a generator is also a generator.

Proof. If G is an arbitrary abelian Lie group, we have the extension

G_0 → G → G/G_0

where G_0 is the path-component of the identity. This reduces the theorem to proving:

(1) G/G_0 is a finitely generated abelian group.

(2) There is a homomorphism G/G_0 → G so that the composite with the quotient map G → G/G_0 is the identity on G/G_0.

(3) Proving the theorem for G_0.


Proofs of parts (1) and (2) will appear in a later iteration of these notes.

To prove (3), Lemma 2.59 gives us an onto homomorphism exp : T_e G → G provided G is connected and abelian. Since the exponential map is a local diffeomorphism everywhere, ker(exp) is a discrete subgroup of T_e G. One can readily check that there exists a collection of linearly independent vectors {v_1, ···, v_k} ⊂ T_e G such that every element ω ∈ ker(exp) has a unique representation

ω = b_1 v_1 + ··· + b_k v_k

where b_1, ···, b_k ∈ Z. Extend the collection {v_1, ···, v_k} to a basis {v_1, ···, v_n} for T_e G. Define

f : (S^1)^k × R^{n−k} → G

by the formula

(e^{2πi t_1}, ···, e^{2πi t_k}, t_{k+1}, ···, t_n) ↦ exp(∑_{i=1}^n t_i v_i).

This is the isomorphism of Lie groups we seek.

Regarding the claim about topological generators, we prove this by induction. Notice that in S^1 an element is a topological generator if and only if it is of infinite order; this follows from the group property of S^1 together with compactness. Since the elements of finite order are roots of unity, they form a countable set. Consider the general case of (S^1)^n. We have projection maps (S^1)^n → (S^1)^{n−1}, thus an element of (S^1)^n is a topological generator if and only if it projects to a topological generator of (S^1)^{n−1} and it is not in the graph of some splitting (S^1)^{n−1} → (S^1)^n. Since the splittings are a countable set, we have proven by induction that topological generators are dense, and that powers of them are also topological generators.

The idea of topological generators can be taken further. What we have observed is that compact

connected abelian Lie groups have dense cyclic groups. But what about non-abelian Lie groups? One

can check that SO_3 for example has a dense free group on two generators. The group U(n2) turns out

to have a dense group isomorphic to the braid group on n strands. In general, discrete groups have

many ways of being embedded (as groups) into Lie groups. As topological groups their embeddings

are a far more rigid theory.

Finitely generated abelian groups have been classified. The proof employs little other than crafty usage of the division algorithm in the guise of matrix algebra. We quote it below.

Theorem 2.62. If G is a finitely-generated abelian group, then G is isomorphic to a unique group of the form

G ≅ Z^n ⊕ Z_{m_1} ⊕ Z_{m_2} ⊕ ··· ⊕ Z_{m_k}.

The integer n ≥ 0 is called the rank of G. The integer n + k is the minimal number of generators of G. The integers m_1, ···, m_k are required to satisfy

1 < m_1 | m_2 | ··· | m_k

i.e. m_2 is an integer multiple of m_1, and so on, and k ≥ 0.

Thus Z_2 ⊕ Z_2 is not isomorphic to Z ⊕ Z_2 ⊕ Z_4, because they are both in their canonical forms. But Z_6 is isomorphic to Z_2 ⊕ Z_3. The latter group is not in its canonical form, since 2 does not divide 3. In general if p and q have no common divisors, Z_{pq} ≅ Z_p ⊕ Z_q.
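The closing remark about Z_{pq} and Z_p ⊕ Z_q can be confirmed by brute force for small moduli. This is a minimal sketch and not a proof; the helper names `element_order` and `is_cyclic` are hypothetical, and the check uses the fact that a finite group of order pq is cyclic exactly when it contains an element of order pq.

```python
from math import gcd

def element_order(x, y, p, q):
    """Order of (x, y) in Z_p ⊕ Z_q, found by repeated addition."""
    k, a, b = 1, x % p, y % q
    while (a, b) != (0, 0):
        a, b, k = (a + x) % p, (b + y) % q, k + 1
    return k

def is_cyclic(p, q):
    """Z_p ⊕ Z_q is cyclic exactly when some element has order p * q."""
    return any(element_order(x, y, p, q) == p * q
               for x in range(p) for y in range(q))

# Compare cyclicity with coprimality for all small pairs of moduli.
agreement = all(is_cyclic(p, q) == (gcd(p, q) == 1)
                for p in range(2, 8) for q in range(2, 8))
```

For instance `is_cyclic(2, 3)` holds, matching Z_6 ≅ Z_2 ⊕ Z_3, while `is_cyclic(2, 4)` fails, matching the fact that Z_2 ⊕ Z_4 is already in canonical form and is not Z_8.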


Definition 2.63. If H ⊂ G is a subgroup of a group G, let N(H) be the normalizer of H, i.e.

N(H) = {g ∈ G : gH = Hg}

then N(H)/H is called the Weyl group of H in G.

The context to interpret the above definitions is to think of a Lie group as an object with two primary sources of symmetry: the left-multiplication maps L_g : G → G and the right-multiplication maps R_g : G → G, L_g(h) = gh and R_g(h) = hg. It is only natural to ask, after taking the quotient by a subgroup, how many of these symmetries survive, giving symmetries of G/H. To make this precise, consider any function f : G → G. First, we ask when it descends to a well-defined function f : G/H → G/H, i.e. such that the diagram below commutes.

     G ------f-----> G
     |               |
     v               v
    G/H -----f----> G/H

To have such a commutative diagram the requirement is that f(g h_1) = f(g) h_2, meaning that for any g ∈ G and any h_1 ∈ H there must exist an h_2 ∈ H such that f(g h_1) = f(g) h_2. Consider the cases where f is left and right multiplication respectively.

(1) If f = L_ω our well-definedness formula becomes ω g h_1 = ω g h_2. We can left multiply by g^{-1} ω^{-1}, giving us h_2 = h_1. Thus any ω ∈ G induces a symmetry of G/H by left multiplication. So there is an onto homomorphism from G to the left-multiplication-induced automorphisms of G/H. Such an automorphism is the identity on G/H if and only if gH = ωgH for all g ∈ G, i.e. if and only if ω lies in the intersection ∩_{g∈G} gHg^{-1}, which is the maximal subgroup of H that is normal in G. Thus we can describe the left-multiplication induced automorphism group of G/H as G/A where A = ∩_{g∈G} gHg^{-1}.

(2) If f = R_ω the above formula tells us that g h_1 ω = g ω h_2, i.e. ω h_2 ω^{-1} = h_1, i.e. ω ∈ N(H). So we have an onto homomorphism from N(H) to the right-multiplication induced automorphisms of G/H, and the kernel is H. Thus the group of right-multiplication induced automorphisms of G/H is the Weyl group N(H)/H.

Notice that N(H) acts on H by conjugation. In the case that H is a maximal torus, this action

therefore descends to an action of the Weyl group N(H)/H on H.

Proposition 2.64. Given a compact Lie group G, let H be a maximal torus. Then the Weyl group N(H)/H is a finite group.

Proof. View the action of N(H) on H as a continuous function N(H) → Aut(H) ≡ GL_k(Z). Since GL_k(Z) is discrete, the path-component of the identity element in N(H) is contained in the kernel; denote this path-component by N_0(H). Thus both H and N_0(H) act trivially on H. Since H is connected and a subset of N(H), clearly H ⊂ N_0(H). Now consider an element h ∈ N_0(H) which is not an element of H. There is a path from e to h in N_0(H), thus if we let γ : [0, 1] → N_0(H) be that path, γ(t) acts trivially on H for all t. Thus if we take the group generated by both H and {γ(t) : t ∈ [0, 1]}, it is a connected abelian group containing H. By the maximality condition, it must be precisely H, thus N_0(H) = H. Since N(H) is closed and G compact, N(H) is also compact, thus N(H)/H is compact and discrete, therefore finite.


Proof. (of Theorem 2.58) ♦ Uses Oriented Transverse Intersections, 6. ♦ There is a well-defined function

F : (G/H) × H → G, ([g], t) ↦ gtg^{-1}

which will be central to the proof. For now, assume F is onto. To finish the proof, we need to show that if H′ is some other maximal torus, then H′ = gHg^{-1} for some g. Let x ∈ H′ be a generator of H′ in the sense that H′ is the closure of {x^n : n ∈ Z}, and let x = gyg^{-1} for y ∈ H. Existence of such generators was seen in Theorem 2.61. Thus gHg^{-1} ⊃ H′, and since both H and H′ are maximal tori, gHg^{-1} = H′. Thus we have reduced the proof to showing that F is onto.

To prove that F is onto, the key observation is to consider F^{-1}(y) with y ∈ G. We will argue this can never be empty. Consider the special case where y ∈ H is a generator of our maximal torus. So we are asking which elements ([g], x) ∈ (G/H) × H satisfy gxg^{-1} = y. Since y is a generator of H, and x ∈ H, x must also generate a maximal torus, which is a subset of H, which by maximality must also be H. Thus g is in the normalizer of H, and F^{-1}(y) has exactly as many elements as the Weyl group.

Generators of compact connected abelian Lie groups are quite common, so it is only a matter of

book-keeping to go from this step to the complete proof. Unfortunately, the appropriate book-keeping

tools are degree theory, which is described in Chapter 6. We will complete the proof assuming the

reader is familiar with transversality, tubular neighbourhoods, and the degree of a smooth map, from

Chapter 6. I would like to emphasize, one can get some insights out of the proof without knowing

these tools.

[Figure omitted; its labels are U, H and νH.]

The key idea to finish the proof is to argue the degree of F is the order of the Weyl group, and therefore F must be an onto smooth function. To make sense of this, we need to turn G/H into an abstract manifold. A direct computation gives

DF_{([g],t)}(h, l) = D(R_{tg^{-1}})_g(h) + D(L_g ∘ R_{g^{-1}})_t(l) − D(L_{gtg^{-1}} ∘ R_{g^{-1}})_g(h).

To better understand this map, we use left-multiplication to identify all the tangent spaces with the tangent space at e. This amounts to post-composing the map with the derivative of L_{gt^{-1}g^{-1}}, and in the G/H factor pre-composing with D(L_g), and in the H factor pre-composing with D(L_t), which gives us the map

D(L_{gt^{-1}g^{-1}})_{gtg^{-1}} ∘ DF_{([g],t)} ∘ (D(L_g)_e × D(L_t)_e) : T_e(G/H) × T_e H → T_e G

([h], l) ↦ Ad_g(Ad_{t^{-1}}(h) − h + l).


In the above formula we are considering [h] to be the coset in T_e(G/H) = T_e G/T_e H; one can easily check the above formula is unambiguous. Since G is a connected Lie group, the sign of the determinant of the above map is the same regardless of the choice of g ∈ G. Moreover, the map is the identity on T_e H, so the sign of the determinant of the above map is the same as the sign of the determinant of T_e(G/H) ∋ [h] ↦ [Ad_{t^{-1}}(h) − h] ∈ T_e(G/H). If this map were to have a real eigenvalue λ ∈ R then the map [h] ↦ [Ad_{t^{-1}}(h)] would have the real eigenvalue λ + 1. Thus one map has real eigenvalues if and only if the other does. Recall that t is a generator of the maximal torus H; moreover, t^k is therefore a generator of the maximal torus for every k ∈ Z \ {0}. So if [h] ↦ [Ad_{t^{-1}}(h)] were to have a real eigenvalue λ, the map [h] ↦ [Ad_{t^{-k}}(h)] would have real eigenvalue λ^k. Since the maximal torus is compact, we can assume the k-th iterate of the map is close to the first, for infinitely many k ∈ Z. Thus there is a sequence k_1 < k_2 < k_3 < ··· such that λ^{k_i} → λ, thus λ ∈ {±1}. By replacing t by t^2 we can assume λ = 1. Thus conjugation by t ∈ H leaves a subspace of T_e(G/H) fixed. By Proposition 2.53, this means conjugation by t fixes a subgroup of G which is not a subgroup of H, and so H is not maximal, a contradiction. Thus all the eigenvalues of our map are non-real, and so its determinant is positive, completing the proof.

Definition 2.65. The dimension of a maximal torus in a Lie group G is called the rank of the Lie

group, rank(G).

Corollary 2.66. In a compact connected Lie group, dim(G)− rank(G) ∈ 2Z. This follows from the

proof of Theorem 2.58, where we observed that a tangent space to G/H had a linear automorphism

with only complex eigenvalues.

So for example, a compact connected 3-dimensional Lie group is either abelian, or its maximal torus

is one-dimensional.

Example 2.67. In the group SO_3 the Weyl group is isomorphic to Z_2. This is because the maximal tori are specified by the 1-dimensional subspaces of R^3, and the elements of SO_3 that normalize such a torus are precisely the elements of SO_3 that preserve the corresponding subspace, a group isomorphic to O_2. So the quotient N(H)/H = O_2/SO_2 ≅ Z_2.

In O_3 the Weyl group is isomorphic to Z_2 × Z_2: this is the Weyl group of SO_3 together with the mirror reflection in the plane orthogonal to the axis of rotation.

In S3 the Weyl group is Z2.

Proposition 2.68. If G is a compact and connected Lie group, the exponential map TeG → G is onto.

Proof. We know that every g ∈ G belongs to some maximal torus g ∈ T . The exponential map is

onto for a torus, therefore by the naturality Proposition 2.53 it is onto for G.

It turns out one can prove that in every connected Lie group G, every element g ∈ G can be written

as the product g = g_1 g_2 with both g_1, g_2 ∈ exp(T_e G), [Wüstner, 2002]. At present there is no completely understood simple criterion for deciding, for a given Lie group G, whether exp(T_e G) = G or not.

Some computations and examples

Proposition 2.69. Given A, B ∈ T_I GL_n R, their Lie Bracket in the Lie algebra of GL_n R agrees with the matrix commutator

[A, B] = AB − BA.
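As a sanity check that the matrix commutator satisfies the two axioms of Definition 2.45, one can test antisymmetry and the Jacobi identity on sample matrices. The sketch below is an illustration only: it uses exact integer arithmetic on three arbitrarily chosen 2 × 2 matrices, and the helper names `mat_mul`, `mat_add` and `bracket` are hypothetical.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def bracket(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    AB, BA = mat_mul(A, B), mat_mul(B, A)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(AB, BA)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [-1, 2]]
C = [[2, 0], [5, -3]]

# (a) antisymmetry: [A, B] + [B, A] should vanish.
antisymmetry = mat_add(bracket(A, B), bracket(B, A))

# (b) Jacobi identity: the cyclic sum of double brackets should vanish.
jacobi = mat_add(
    mat_add(bracket(A, bracket(B, C)), bracket(B, bracket(C, A))),
    bracket(C, bracket(A, B)),
)
```

Both `antisymmetry` and `jacobi` come out as the zero matrix for these samples, and the commutator itself has trace zero, as any commutator must.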


Definition 2.70. A linear function on a finite-dimensional vector space f : V → V is hyperbolic if every eigenvalue has non-zero real component.

Given a vector field v : M → TM on a manifold, a fixed point of v is any point p ∈ M such that v(p) = 0_p. We say p is a hyperbolic fixed point if Dv_p is a hyperbolic linear function.

Technically, the domain of Dv_p is T_p M while the target space is T_{v(p)} TM. To make sense of this definition, write the vector field as v(q) = (q, \vec{v}(q)), thus v(p) = (p, 0), and Dv_p(w) = (w, D\vec{v}_p(w)), and D\vec{v}_p(w) ∈ T_p M. Stated another way, T_{(p,0)} TM = T_p M × T_p M, and D\vec{v}_p is the projection of Dv_p onto the second factor. Thus we say p ∈ M is a hyperbolic fixed point if v(p) = 0_p and D\vec{v}_p : T_p M → T_p M is a hyperbolic linear function.

At a fixed point p, the linearization of the ODE is the vector field on T_p M given by w′ = D\vec{v}_p(w).

Theorem 2.71. Given a complete vector field v on a manifold M, if p is a hyperbolic fixed point then there exists a unique open set U containing p in M and a differentiable homeomorphism g : TpM → U which is a conjugacy between the flow of v on U and that of its linearization at p, i.e.

Φv(t, g(w)) = g(ΦD~vp(t, w)) ≡ g(exp(tD~vp)w).

Since T0TpM = TpM, think of the derivative of g at the origin as a map Dg0 : TpM → TpM. We can ensure Dg0 = IdTpM; moreover, this condition ensures g is unique. The map g is a Ck diffeomorphism away from the origin, provided the original vector field is Ck.

Hyperbolicity is the only hypothesis one can make on Dvp which results in the conclusion of the theorem being true, provided the manifold M is at least one dimensional. Specifically, if the derivative were not hyperbolic, there would be two vector fields having the same fixed point and derivative at p whose flows are not conjugate.

We can not ensure g is a Ck diffeomorphism in a neighbourhood of the origin. To see this, consider the ODE on R3:

(x, y, z)′ = (ax, (a − b)y + cxz, −bz)

where a, b > 0, a ≠ b and c ≠ 0. The flow has the form

Φ(t, (x, y, z)) = (e^{at} x, (cxzt + y) e^{(a−b)t}, e^{−bt} z)

while the flow of the linearization has the form

Φ(t, (x, y, z)) = (e^{at} x, e^{(a−b)t} y, e^{−bt} z).
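The two flow formulas above can be sanity-checked numerically. The following is a minimal sketch, not part of the notes: the function names and the sample parameter values are illustrative assumptions, and the check uses a central finite difference in t rather than a symbolic proof.

```python
import math

def vector_field(p, a, b, c):
    """Right-hand side (ax, (a-b)y + cxz, -bz) of the ODE."""
    x, y, z = p
    return (a * x, (a - b) * y + c * x * z, -b * z)

def flow(t, p, a, b, c):
    """Claimed flow of the nonlinear vector field, started at p."""
    x, y, z = p
    return (math.exp(a * t) * x,
            (c * x * z * t + y) * math.exp((a - b) * t),
            math.exp(-b * t) * z)

def linear_flow(t, p, a, b):
    """Flow of the linearization at the origin, started at p."""
    x, y, z = p
    return (math.exp(a * t) * x,
            math.exp((a - b) * t) * y,
            math.exp(-b * t) * z)

def flow_residual(t, p, a, b, c, h=1e-6):
    """Largest component of d/dt flow - vector_field(flow), via central differences."""
    fwd = flow(t + h, p, a, b, c)
    bwd = flow(t - h, p, a, b, c)
    deriv = tuple((f - g) / (2 * h) for f, g in zip(fwd, bwd))
    rhs = vector_field(flow(t, p, a, b, c), a, b, c)
    return max(abs(d - r) for d, r in zip(deriv, rhs))
```

The two flows differ only in the second coordinate, by the resonant term cxzt e^{(a−b)t}.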

Theorem 2.71 gives a partial decomposition of a manifold with a hyperbolic vector field into orbit types. It only misses the subtleties of the dynamics where the vector field is fixed-point free.

Example 2.72. The vector field on S2 given by

(x, y, z) ↦ (−y − z^3 x, x − z^3 y, z^2 − z^4)

has two hyperbolic fixed points, at (0, 0, ±1): a spiral sink at (0, 0, 1) and a spiral source at (0, 0, −1). These decompose the manifold into two open hemispheres centred on the fixed points, together with the one closed orbit S1 × 0.
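A quick numerical look at this example: the sketch below (helper names are illustrative assumptions, not from the notes) checks that the field is tangent to S2, that it vanishes at the poles, and reads off the real part of the eigenvalues of the linearization at each pole. At the pole (0, 0, z) the tangent plane is the (x, y)-plane and the (x, y)-part of the derivative is the matrix with rows (−z^3, −1) and (1, −z^3), whose eigenvalues are −z^3 ± i.

```python
import math

def v(p):
    """The vector field of the example, evaluated at p = (x, y, z)."""
    x, y, z = p
    return (-y - z**3 * x, x - z**3 * y, z**2 - z**4)

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def sphere_point(theta, phi):
    """Point of S^2 in spherical coordinates."""
    return (math.sin(phi) * math.cos(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(phi))

def pole_linearization(z):
    """(x, y)-part of Dv at the pole (0, 0, z), for z = +1 or z = -1."""
    return [[-z**3, -1.0], [1.0, -z**3]]

def eigen_real_part(m):
    """Real part of the eigenvalues of a 2x2 matrix with complex-conjugate eigenvalues."""
    return (m[0][0] + m[1][1]) / 2.0
```

The real part comes out as −1 at (0, 0, 1) and +1 at (0, 0, −1), and on the equator the field reduces to (−y, x, 0), the unit-speed rotation.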

Proposition 2.73. If f : G → H is a surjective homomorphism of compact and connected Lie groups, and H is abelian, then f(T) = H where T ⊂ G is a maximal torus. In particular, rank(G) ≥ rank(H).

Proof. Since f is onto, we know that if q ∈ H there exists p ∈ G with f(p) = q; moreover, p is contained in some maximal torus of G. Since H is abelian, f(gpg−1) = f(p), and since all maximal tori of G are conjugate, any maximal torus of G maps onto H.


Definition 2.74. If H is a subgroup of G, the centralizer of H is

Z(H) = {g ∈ G : gh = hg ∀h ∈ H}.

It is a subgroup of G. The centre of G is Z(G).

Proposition 2.75. Let T ⊂ G be a maximal torus in a compact connected Lie group G. Then

(1) If S ⊂ G is a connected abelian group then Z(S) is the union of the maximal tori that contain

S.

(2) The center of G is the intersection of all the maximal tori in G.

Corollary 2.76. In a compact connected Lie group G with maximal torus H, the action of the Weyl group N(H)/H on H is effective, i.e. the homomorphism

N(H)/H → Aut(H)

is one-to-one. Two elements of H are conjugate in G if and only if they are in the same orbit of the

Weyl group's action. The map from the orbits of N(H)/H acting on H to the conjugacy classes in G

is a bijection.

The next theorem can be thought of as a generalization of Proposition 2.73.

Theorem 2.77. Let f : G → H be an onto homomorphism of compact connected Lie groups. If

T ⊂ G is a maximal torus, then f (T ) ⊂ H is also a maximal torus. The kernel of f is contained in a

maximal torus if and only if it is in the centre of the group, and in this case f induces isomorphisms

of Weyl groups. This always happens if dim(G) = dim(H).

Proof. Let q be a generator for a maximal torus T of H, and let p ∈ G be such that f(p) = q. So the closure of the group generated by p maps onto T, the closure of the group generated by q. The group generated by p is contained in some maximal torus of G, and this torus maps onto T; if it did not, T would not be maximal. To see the claim when dim(G) = dim(H), observe that f must be a local diffeomorphism in this case, and so the kernel is a finite subgroup of G. But discrete normal subgroups are always in the centres of connected Lie groups.

Exercises

Problem 2.78. Let GL+n be the invertible n × n matrices with real entries and positive determinant. Prove GL+n is pathwise connected, meaning for any two matrices A, B ∈ GL+n there exists a continuous function f : [0, 1] → GL+n such that f(0) = A and f(1) = B.

Hint: either using multiplication in GL+n or a concatenation argument, it is enough to consider the case A = I. Via a standard linear algebra argument, all invertible matrices are products of elementary matrices (corresponding to row and column operations). Unfortunately not all elementary matrices are in GL+n, so revisit the argument and write B as a product of `elementary' matrices such that the `elementary' matrices are in GL+n and such that you can find a path from each `elementary' matrix to I. The transposition matrices are the problem: try to replace them with an appropriate `rotation by π/2' matrix. Note how this implies that On has precisely two path components: the component of I and the component of a (any) mirror-reflection. For this part, the determinant is your friend. Bonus points: also find a way to make the paths smooth.

Problem 2.79. Check that the map A ↦ A^2 is a homomorphism of a Lie group G if and only if G is abelian. Compute the kernel in the case G = SO2.

Problem 2.80. The complex numbers C are a field, defined as R2 as a vector space over R but with the additional multiplication (a, b)(c, d) = (ac − bd, ad + bc). An alternative way to say this is that C is a 2-dimensional vector space over R with basis vectors 1 and i, and the product map C × C → C is the bilinear map uniquely determined by 1 · x = x = x · 1 ∀x, and i · i = −1. The quaternions H (H is for Hamilton) are defined similarly: they are R4 as a vector space, but if we let 1, i, j, k be a basis for R4 we define multiplication to be the bilinear map determined by 1 · x = x = x · 1 ∀x, i · j = k, j · k = i, k · i = j, j · i = −k, k · j = −i, i · k = −j, and i^2 = j^2 = k^2 = −1. So really 1 is shorthand for (1, 0, 0, 0) and i for (0, 1, 0, 0) and so on. So when we write a quaternion as a + bi + cj + dk, we really mean the vector (a, b, c, d) ∈ R4. Given a quaternion q = a + bi + cj + dk, the conjugate is $\bar{q} = a - bi - cj - dk$.

(a) Verify that conjugation respects multiplication in the sense that $\overline{q_1 \cdot q_2} = \bar{q}_2 \cdot \bar{q}_1$. Also check that $q\bar{q} = |q|^2$, where $|q| = \sqrt{a^2 + b^2 + c^2 + d^2}$ is the Euclidean norm in R4.

(b) A division algebra is an algebraic object that satisfies the axioms of a field, except that multiplication need not be commutative. Verify that the quaternions H are a division algebra. Check that qx = xq for all real numbers x and quaternions q.

(c) Check that the unit-length quaternions, S3 ⊂ H with operation S3 × S3 → S3 given by quaternion multiplication, satisfy the axioms of a Lie group.

(d) Given a unit quaternion q ∈ S3, verify that the left-multiplication map Lq : H → H given by Lq(p) = qp is an isometry of H = R4 with the Euclidean metric. Similarly for right multiplication Rq(p) = pq. Verify that the two maps S3 → O4 given by q ↦ Lq and q ↦ Rq−1 are homomorphisms of groups.

(e) By part (d), the map f : S3 → O4 given by f(q) = Lq ∘ Rq−1 is a homomorphism of Lie groups, but by part (b), f(q) fixes the real numbers R and therefore preserves their orthogonal complement 0 × R3 ⊂ H, and so one can view f as a homomorphism S3 → O3. Since S3 is pathwise connected and f(1) = I, f is a map f : S3 → SO3. Prove that f is a submersion and therefore by problem (3) an open map; for this it's probably best to argue Df1 : T1S3 → TISO3 is an onto linear map, and then use Proposition 0.9 from the Flows notes. This means the image of f, f(S3) ⊂ SO3, is both (relatively) open and closed (since it is compact), and non-empty. Argue that this implies f(S3) = SO3. Further, show that f is 2 : 1, i.e. that its kernel is {±1} ⊂ S3.

Note: The map f from (4) part (e) is usually called the spinor cover of SO3, and S3 thought of as a group in this way is the 3-dimensional `spin group' or `spinor group'.
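For experimenting with Problem 2.80, here is a small sketch of quaternion arithmetic on 4-tuples (a, b, c, d) standing for a + bi + cj + dk. The function names are illustrative assumptions; the multiplication table is the one stated in the problem, and `rotate` computes q p q̄, which agrees with Lq ∘ Rq−1 applied to p only when q has unit length.

```python
def qmul(p, q):
    """Quaternion product, bilinear with i*j = k, j*k = i, k*i = j, i^2 = j^2 = k^2 = -1."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qconj(q):
    """Quaternion conjugate a - bi - cj - dk."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def qnorm2(q):
    """Squared Euclidean norm of q as a vector in R^4."""
    return sum(t * t for t in q)

def rotate(q, p):
    """q * p * conj(q); for unit q this is the map p -> q p q^{-1}."""
    return qmul(qmul(q, p), qconj(q))
```

With integer entries all of these identities can be checked exactly; note that `rotate(q, p)` and `rotate(-q, p)` coincide, which is the 2 : 1 behaviour asked about in part (e).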


Problem 2.81. Compute the tangent spaces to the identity matrix for the Lie groups SLn(R), Un, SUn,

SLn(C) and O1,n−1. Hint: It will help to recall problem 11 from chapter 1, that D(Det)I = tr .

Problem 2.82. This problem asks you to compute a few Lie brackets of vector fields on different manifolds, using different techniques.

(a) Consider vector fields ~i, ~j, ~k on S2 given by

~i(x, y, z) = (0, −z, y)

~j(x, y, z) = (z, 0, −x)

~k(x, y, z) = (−y, x, 0)

Compute the flows of ~i, ~j and ~k on S2, and also compute the Lie brackets [~i, ~j], [~j, ~k] and [~k, ~i]. Hint: For computing the Lie brackets, notice that the formulas for the vector fields naturally extend to R3, so perhaps compute the Lie bracket there. Also, sketch a picture comparing the double-flows Φv(t, Φw(t, x)) and Φw(t, Φv(t, x)) for v = ~i, w = ~j to see if your answer resembles Theorem 0.13 from the Flows notes.

(b) Consider the vector fields ~v(x, y) = (−y, x) and ~w(x, y) = (x^2, y) on R2. Compute the flows Φ~v and Φ~w and the Lie bracket [~v, ~w]. Also compute the limit

$$\lim_{t \to 0} \frac{\Phi_{\vec v}(t, \Phi_{\vec w}(t, (x, y))) - \Phi_{\vec w}(t, \Phi_{\vec v}(t, (x, y)))}{t^2}$$

and compare your answer to the Lie bracket.

Problem 2.83. Consider the homomorphism S3 × S3 → SO4 given by mapping (x, y) ↦ Lx ∘ Ry−1. Similar to Problem (2), verify this map is an onto submersion. Compute the kernel.

Problem 2.84. Argue that if G ⊂ SO2 is a closed subgroup, then G is either finite and cyclic, or G = SO2. Give an example of a subgroup of SO2 which is infinite cyclic. Up to isomorphism of groups, are there any other subgroups of SO2?

Problem 2.85. The group of Euclidean isometries of Rn is denoted Isom(Rn). Every element of Isom(Rn) is a product of a rotation matrix A ∈ On and a translation function Rn ∋ x ↦ x + v ∈ Rn, where v ∈ Rn is some vector. Prove that, as a Lie group,

Isom(Rn) ≃ On ⋉ Rn

where Rn is a Lie group under vector addition. Identify the action of On on Rn.



Problem 2.86. If H is a subgroup of a Lie group G, and if H is abelian, argue that the closure $\overline{H}$ is also abelian. Argue that every Lie group has a maximal abelian subgroup, i.e. an abelian subgroup H ⊂ G that is not a proper subset of any other abelian subgroup of G. Moreover, argue it is closed. In the theory of Lie groups, such a group is called a maximal torus.

Problem 2.87. Compute a maximal torus of SO4.

Problem 2.88. The centre of a Lie group G is defined as

Z(G) = {g ∈ G : gh = hg ∀h ∈ G}.

Compute the centre of SO3, S3, SO4 and S3 × S3.

Problem 2.89. Compute a maximal torus and the centres of On and GLnR.

Problem 2.90. The affine group of Rn is the group generated by GLnR and the group of translations of Rn. Argue that

Aff(Rn) ≃ GLnR ⋉ Rn.

Construct an embedding of Aff(R) into GL2(R), and use this to compute the Lie bracket on Aff(R).

Problem 2.91. Generalize Problem 12, by constructing an embedding

Aff(Rn) → GLn+1R.

Compute the Lie bracket in Aff(Rn).

Problem 2.92. Prove that det(exp(A)) = exp(tr(A)) for any matrix A ∈ GLnR or GLnC. Hint: this function satisfies a differential equation.
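Before attempting a proof of the identity in Problem 2.92, it can be reassuring to test it numerically. The sketch below is only a sanity check under stated assumptions: the matrix exponential is approximated by a truncated Taylor series (adequate for small matrices of modest norm), the determinant uses Laplace expansion, and all function names are illustrative.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=40):
    """Truncated Taylor series I + A + A^2/2! + ... for exp(A)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in mat_mul(term, A)]  # A^m / m!
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def det(A):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

def trace(A):
    return sum(A[i][i] for i in range(len(A)))
```

Comparing `det(mat_exp(A))` with `math.exp(trace(A))` for a few matrices, including non-invertible ones, suggests the identity does not depend on A being invertible.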

Problem 2.93. Prove that if G is a Lie group, and G0 the path-component of the identity element,

then G0 is an open and closed normal subgroup of G.



Problem 2.94. An embedding of Lie groups is a homomorphism of Lie groups f : G → H which is an embedding of manifolds, i.e. f(G) ⊂ H is a submanifold of H, and f is a diffeomorphism between G and f(G). Construct an embedding On−1 → SOn which restricts to the map SOn−1 → SOn where A ↦ A × IdR.

Problem 2.95. Determine all the bi-invariant vector fields on On. Hint: Perhaps follow Example 2.42?

Problem 2.96. Prove that if G is a smooth manifold, and it has an operation µ : G × G → G which is smooth, making G into a group, then the inversion function i : G → G given by i(g) = g−1 must be smooth, i.e. the smoothness axiom for inversion in a Lie group is redundant. Hint: argue that for every g ∈ G the left and right multiplication maps are diffeomorphisms Lg : G → G, Rg : G → G. Then consider the function F : G × G → G × G given by F(g1, g2) = (µ(g1, g2), g2). Argue that F is a diffeomorphism, and compute F−1 explicitly; that F−1 is smooth you should be able to deduce from the inverse function theorem. Notice a formula for inversion appears in your formula for F−1.

Problem 2.97. Let G be a connected Lie group.

• Argue that if U ⊂ G is a neighbourhood of the identity element, then U generates the group

G.

• Let U be any non-empty open subset of G. Argue that it generates the group G.

• Argue that a discrete normal subgroup of G must be in Z(G), the centre of G.

Z(G) = {g ∈ G : gh = hg ∀h ∈ G}

Problem 2.98. Show that in any Lie group G there is a neighbourhood of e not containing any subgroup other than the trivial group {e}.

Problem 2.99. Argue if G is a Lie group then its tangent bundle is trivial, in the sense that if dim(G) = n then there are n everywhere linearly independent smooth vector fields. Hint: How about left-invariant fields? Go further and find an explicit diffeomorphism TG ≃ G × Rn.

Problem 2.100. This exercise aims to determine all the finite subgroups of O2. Let G ⊂ O2 be finite. The group O2 has its canonical representation on R2 given by matrix multiplication, so O2 acts on the unit circle S1. For this question, let µ : G × S1 → S1 be the restriction of this tautological action to G, i.e. µ(A, v) = Av. If there is a point p ∈ S1 fixed by a non-trivial element g ∈ G, g.p = p, then argue that G is a dihedral group. If every non-trivial element of G acts without fixed points on the circle, argue G ⊂ SO2 and G is cyclic. If you like to think of cyclic and dihedral groups as subgroups of the symmetric group, the orbit G.p works in the dihedral case, and G.(1, 0) works in the cyclic case.

Problem 2.101. Show that if T ⊂ G is a torus, its Lie algebra satisfies [v, w] = 0 ∀v, w ∈ TeT. This is called an abelian Lie algebra. Show that the Lie algebra of a maximal torus is a maximal abelian Lie subalgebra of TeG.

Problem 2.102. Find a maximal abelian subgroup of SOn for all n ≥ 2 which is not a maximal torus.

Problem 2.103. Show that if G is a compact and connected Lie group with dim(G) ≤ 2 then it is a

torus.

Problem 2.104. Argue that compact connected Lie groups are divisible, i.e. given any g ∈ G, then for any nonzero n ∈ Z, g = h^n for some h ∈ G.

Problem 2.105. Show that every compact Lie group contains a finitely-generated dense subgroup.

Give an upper bound on the number of generators.

Problem 2.106. Argue that the group of diagonal matrices is a maximal torus in Un, and that the

diagonal matrices are also a maximal torus in SUn. Prove that the Weyl group of Un and SUn is the

symmetric group Σn.

Problem 2.107. Compute the maximal tori and Weyl groups for SOn.

Problem 2.108. If G is a Lie group and H is a normal Lie subgroup, argue that if v ∈ TeG and w ∈ TeH then

[v, w] ∈ TeH.

A Lie subalgebra of a Lie algebra that satisfies this condition is called an ideal.



In a Lie algebra V, both V and 0 are ideals. If the Lie algebra has no other ideals, it is said to be simple. This is because if one has an ideal W ⊂ V, the quotient V/W inherits the structure of a Lie algebra with the definition [v + W, w + W] = [v, w] + W where we use additive notation v + W for cosets. Thus simple Lie algebras are precisely the Lie algebras that can not be written as extensions of any other Lie algebras.

Problem 2.109. Argue that the Lie algebra of GLnC is not simple. Hint: The determinant is a

homomorphism from GLnC to the group of non-zero complex numbers.

Problem 2.110. Prove that the Lie algebra of SLnC is simple.

Problem 2.111. Prove that the Lie algebra of SOn is simple, provided n ≠ 4.

If you wish to see why the Lie algebra of SO4 is not simple, look back to Problem 2.83.

Problem 2.112.


3. Riemann Manifolds

Definition 3.1. A Riemann metric on N is a smooth function µ : TN ⊕ TN → R such that the restrictions µ|π−1(p) : TpN ⊕ TpN → R are inner products, for all p. Similarly, a Lorentz metric on N is one where the restrictions endow each TpN with the structure of a Minkowski space.

Note 3.2. If M is a submanifold of a Riemann manifold (N, µN : TN ⊕ TN → R), M inherits a Riemann metric from N via µM(p, v, w) = µN(p, v, w). Since Rk is an inner product space, it is a Riemann manifold, and so all submanifolds M ⊂ Rk are Riemann manifolds. Although these induced Riemann metrics are interesting, frequently we will be interested in non-induced metrics. Notice also that any fibrewise bilinear function µN : TN ⊕ TN → R restricts to a fibrewise bilinear function TM ⊕ TM → R, but if µN isn't positive or negative definite, one would need additional restrictions on how M sits in N to ensure the induced bilinear function has any special properties.

Whenever M is a submanifold of N, and N is a Riemann manifold, we generally give M the induced Riemann structure unless explicitly stated otherwise.

Example 3.3. • Euclidean n-space is Rn with the standard Riemann metric $\mu(x, y) = \sum_{i=1}^n x_i y_i$.

• Hyperbolic n-space Hn is the manifold

$$-x_0^2 + x_1^2 + \cdots + x_n^2 = -1 \quad\text{and}\quad x_0 > 0$$

in Rn+1 (with coordinates (x0, x1, · · · , xn)) with the induced metric, with Rn+1 given the Minkowski metric µ(x, y) = −x0y0 + x1y1 + · · · + xnyn, µ : Rn+1 × Rn+1 → R. To prove this, first notice that if we define f : Rn+1 → R by f(x) = µ(x, x), we see that Hn is the component of f−1(−1) contained in the x0 > 0 open half-space. Dfx(v) = 2µ(x, v), so −1 is a regular value of f and Hn is an n-manifold. TxHn = {v ∈ Rn+1 : µ(v, x) = 0}.

In order to show Hn is a Riemann manifold, we need to argue that µ(v, v) > 0 whenever µ(x, x) = −1, x0 > 0, µ(v, x) = 0 and v ≠ 0. Write x as $x = (x_0, \vec{x})$, i.e. $\vec{x} \in \mathbb{R}^n$, and $v = (v_0, \vec{v})$. Then by design, $\mu(v, v) = -v_0^2 + |\vec{v}|^2$, where $|\vec{v}|$ is the standard Euclidean norm in Rn. In this language, our assumptions are $-x_0^2 + |\vec{x}|^2 = -1$ (a), $x_0 > 0$ (b) and $x_0 v_0 = \vec{x} \cdot \vec{v}$ (c). We want to determine whether or not µ(v, v) is positive. Equation (a) says $x_0^2 = 1 + |\vec{x}|^2$, and so we can solve for v0 in (c), giving $v_0 = \frac{\vec{x} \cdot \vec{v}}{\sqrt{1 + |\vec{x}|^2}}$. Then

$$\mu(v, v) = -v_0^2 + |\vec{v}|^2 = -\frac{(\vec{x} \cdot \vec{v})^2}{1 + |\vec{x}|^2} + |\vec{v}|^2 = \frac{|\vec{v}|^2 + |\vec{x}|^2 |\vec{v}|^2 - (\vec{x} \cdot \vec{v})^2}{1 + |\vec{x}|^2} \geq \frac{|\vec{v}|^2}{1 + |\vec{x}|^2} \geq 0$$

The second to last inequality is Cauchy-Schwarz in Euclidean n-space. By equations (c) and (b), $\vec{v} = 0$ if and only if v = 0, giving the result.

• The Poincaré Disc model for hyperbolic geometry is the unit ball Bn = {x ∈ Rn : |x| < 1} with the metric $\mu(p, v, w) = \frac{4}{(1 - |p|^2)^2}(v \cdot w)$ where v · w is the standard Euclidean metric in Rn. Stated in terms of the components of the vectors,

$$\mu(p, v, w) = \frac{4}{\left(1 - \sum_{i=1}^n p_i^2\right)^2} \sum_{i=1}^n v_i w_i.$$

Since it is just a pointwise rescaling of the standard Euclidean metric on Bn, this is also a Riemann metric.


Definition 3.4. • A diffeomorphism f : M → N between two Riemann manifolds with metrics µM and µN respectively is an isometry if

µN(f(p), Dfp(v), Dfp(w)) = µM(p, v, w) ∀(p, v, w) ∈ TM ⊕ TM.

• Given a smooth function γ : [a, b] → M in a Riemann manifold M, its length (or arc-length) is $l(\gamma) = \int_a^b |\gamma'(t)|\,dt$, where |γ′(t)| means the length of the vector in the inner product space Tγ(t)M, with the inner product µM restricted to Tγ(t)M, i.e. $|\gamma'(t)| = \sqrt{\mu_M(\gamma'(t), \gamma'(t))}$.

• A great circle in Sn is the intersection of a 2-dimensional vector subspace of Rn+1 with Sn. A round circle in Sn is the intersection of an affine 2-dimensional subspace of Rn+1 with Sn, provided the intersection is a 1-manifold.

Example 3.5. • A round circle in Sn sits on a plane in Rn+1, which has a minimal distance a to the origin, orthogonal to the plane. So provided n ≥ 2, every round circle admits a parametrization γ(θ) = av1 + b(cos(θ)v2 + sin(θ)v3) where v1, v2, v3 are orthonormal in Rn+1 and a^2 + b^2 = 1, and av1 is the displacement vector from the origin to the closest point on the plane. Thinking of γ as having domain [0, 2π], the length of the round circle is $\int_0^{2\pi} |\gamma'(\theta)|\,d\theta = 2\pi b$.

• In hyperbolic n-space, the curve γ(t) = cosh(t)v + sinh(t)w with v, w ∈ Rn+1 satisfying µ(v, v) = −1, µ(w, w) = 1 and µ(v, w) = 0, on the interval [a, b] has length

$$\int_a^b \sqrt{-\sinh^2(t) + \cosh^2(t)}\,dt = \int_a^b 1\,dt = b - a.$$

• In the Poincaré Disc, consider the path γ(t) = (t, 0, · · · , 0). Then γ′(t) = (1, 0, · · · , 0) and $|\gamma'(t)| = \sqrt{\frac{4}{(1 - t^2)^2}} = \frac{2}{1 - t^2}$, so the length of γ along [a, b] is

$$\int_a^b \frac{2}{1 - t^2}\,dt = \cdots \text{some math 101 manipulations} \cdots = \left.\ln\left(\frac{1 + t}{1 - t}\right)\right]_a^b$$

• The Poincaré ball model Bn is isometric to Hn. The isometry is given by f : Hn → Bn, $f(x_0, x_1, \cdots, x_n) = \frac{1}{1 + x_0}(x_1, \cdots, x_n)$. The inverse is given by f−1 : Bn → Hn,

$$f^{-1}(x_1, \cdots, x_n) = \frac{1}{1 - \sum_{i=1}^n x_i^2}\left(1 + \sum_{i=1}^n x_i^2,\ 2x_1,\ 2x_2,\ \cdots,\ 2x_n\right).$$

Exercise!

• If A ∈ On and b ∈ Rn, the function f : Rn → Rn given by f (x) = Ax + b is an isometry of

n-dimensional Euclidean space.

• If A ∈ On+1, the function f : Sn → Sn given by f (v) = Av is an isometry of Sn.

• If A ∈ GLn+1, and if AtLA = L where L is the diagonal matrix L = diag(−1, 1, 1, · · · , 1),

and if b ∈ Rn+1 then f (v) = Av + b is an isometry of Minkowski (n + 1)-dimensional space.

The subgroup that xes the origin (i.e. b = 0) is sometimes called the Lorentz group, but

frequently this terminology is reserved for the n + 1 = 4 dimensional case.

• If A ∈ GLn+1, and AtLA = L as before, but further if the entries of A are ai ,j with a1,1 > 0,

then f : Hn → Hn given by f (v) = Av is an isometry of hyperbolic n-space.

We will (hopefully) see before the end of the course that the last four items actually describe all

the isometries of the respective objects.
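Several of the formulas in Example 3.5 lend themselves to a quick numerical check. The sketch below (function names and sample values are illustrative assumptions) implements the maps between the hyperboloid and the ball, and compares a midpoint-rule approximation of the Poincaré length integral with the closed form ln((1 + t)/(1 − t)).

```python
import math

def minkowski(x, y):
    """Minkowski form -x0*y0 + x1*y1 + ... + xn*yn."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def to_ball(x):
    """The map f : H^n -> B^n, (x0, ..., xn) -> (x1, ..., xn) / (1 + x0)."""
    return tuple(xi / (1.0 + x[0]) for xi in x[1:])

def to_hyperboloid(y):
    """The claimed inverse B^n -> H^n."""
    r2 = sum(yi * yi for yi in y)
    return ((1.0 + r2) / (1.0 - r2),) + tuple(2.0 * yi / (1.0 - r2) for yi in y)

def disc_length(a, b, steps=20000):
    """Midpoint-rule approximation of the integral of 2 / (1 - t^2) over [a, b]."""
    h = (b - a) / steps
    return h * sum(2.0 / (1.0 - (a + (k + 0.5) * h) ** 2) for k in range(steps))

def disc_length_closed(a, b):
    """Closed form ln((1 + t) / (1 - t)) evaluated between a and b."""
    return math.log((1.0 + b) / (1.0 - b)) - math.log((1.0 + a) / (1.0 - a))
```

As a consistency check between the two models: the hyperboloid curve cosh(s)v + sinh(s)w with v = (1, 0, ..., 0) and w = (0, 1, 0, ..., 0) maps under `to_ball` to the point with first coordinate tanh(s/2), and `disc_length_closed(0, tanh(s/2))` returns s, matching the length b − a computed on the hyperboloid.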

Definition 3.6. A smooth function f : M → N between Riemann manifolds is a conformal map if there is a smooth real-valued function h : M → (0, ∞) such that

µN(f(p), Dfp(v), Dfp(w)) = h(p)µM(p, v, w)

for all (p, v, w) ∈ TM ⊕ TM. The function h is called the conformal factor. f is a conformal diffeomorphism if it is a diffeomorphism that is also a conformal map.

By design, conformal maps are ones that infinitesimally preserve all angles, but unless the conformal factor is constant 1, they do not preserve lengths. So any immersion S1 → M is a conformal map, but conformality is a rather stringent requirement on a map S2 → M.

Example 3.7. • In the complex plane C ≡ R2 given the Euclidean metric, if f : U → C is any complex differentiable mapping for U ⊂ C open, then f is conformal on the subset of U where f′ ≠ 0.

• If Sn is the n-sphere, and p ∈ Sn, stereographic projection πp : (Sn \ {p}) → TpSn is a conformal diffeomorphism. Stereographic projection is given by considering the straight line lp,q from p to any point q ∈ Sn, q ≠ p. lp,q is not tangent to Sn, so it intersects TpSn in precisely one point: πp(q) = lp,q ∩ TpSn. With a little algebra we see this is:

$$\pi_p(x) = p + \frac{1}{1 - x \cdot p}(x - p)$$

The rest: exercise.

• Given a sphere of radius r centred at p ∈ Rn, inversion about that sphere is the map ip,r : Rn \ {p} → Rn \ {p} that has the property that ip,r(x) − p is a positive multiple of x − p, with $\frac{|x - p| \cdot |i_{p,r}(x) - p|}{r^2} = 1$. This uniquely determines ip,r. A little algebra gives

$$i_{p,r}(x) = p + \frac{r^2}{|x - p|^2}(x - p)$$

Inversion is a conformal diffeomorphism as well.

Note: if p ∈ Sn−1, $i_{p,\sqrt{2}}$ restricted to Sn−1 \ {p} is equal to πp. This allows one to use the fact that ip,r is conformal to prove πp is conformal.
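The relation between stereographic projection and inversion of radius √2 is easy to test on sample points. A minimal sketch follows; the helper names are illustrative assumptions, and the target of πp is read as the hyperplane of vectors orthogonal to p, which is what the formula for πp produces.

```python
import math

def add(a, b):
    return tuple(s + t for s, t in zip(a, b))

def sub(a, b):
    return tuple(s - t for s, t in zip(a, b))

def scale(c, a):
    return tuple(c * s for s in a)

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def stereographic(p, x):
    """pi_p(x) = p + (x - p) / (1 - x.p), for unit vectors x != p."""
    return add(p, scale(1.0 / (1.0 - dot(x, p)), sub(x, p)))

def inversion(p, r, x):
    """i_{p,r}(x) = p + r^2 (x - p) / |x - p|^2."""
    d = sub(x, p)
    return add(p, scale(r * r / dot(d, d), d))
```

For a unit vector x one has |x − p|^2 = 2 − 2 x·p, which is why the two formulas agree on the sphere when r^2 = 2.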

Conformal diffeomorphisms are linked to many aspects of differential geometry. So we will spend some time exploring their geometry. Inversions in spheres are an excellent example of a conformal diffeomorphism. A beautiful property of the inversion ip,r is that it sends `round spheres' to `round spheres'. Define a `round (n−1)-sphere' in Rn to be either a sphere {v ∈ Rn : |v − p| = r} centred at a point p ∈ Rn of some radius r > 0, or a hyperplane {v ∈ Rn : v · u = d} where u ∈ Sn−1 and d ∈ R. We call the former (actual spheres) compact round spheres, and the latter non-compact round spheres. Then the key property of inversion is:

Proposition 3.8. If S ⊂ Rn is a round sphere, then the closure

$$\overline{i_{p,r}(S \setminus \{p\})}$$

is also a round sphere. We remove p since if p ∈ S, ip,r(S) is not defined. Similarly, if S is a hyperplane, ip,r(S \ {p}) is generally non-compact so we need to take the closure if there's any hope of it being a compact round sphere.

Proof. Consider the inclusion α : Rn → Rn+1 by the map x ↦ (x, 0). Via this inclusion, α(ip,r(x)) = iα(p),r(α(x)), i.e. ip,r makes sense in any dimension, and via this inclusion the definition in Rn+1 is an extension of the definition in Rn.

Consider the case n = 1. Basically, by design we know inversion must send round spheres to round spheres, as compact round spheres are simply pairs of points in R. Here

$$i_{p,r}(x) = p + \frac{r^2}{x - p}$$

So ip,r sends the sphere of radius s centred at q to the sphere of radius

$$\left|\frac{r^2 s}{(q - p)^2 - s^2}\right|$$

centred at

$$p + \frac{r^2 (q - p)}{(q - p)^2 - s^2}.$$

If f : Rn → Rn is an isometry,

$$f \circ i_{p,r} \circ f^{-1} = i_{f(p),r}$$

and via an isometry we can ensure both p and q are sent to a fixed axis. Thus we should expect in general ip,r should send the sphere of radius s centred at q ∈ Rn to the sphere of radius $\left|\frac{r^2 s}{|q - p|^2 - s^2}\right|$ centred at

$$p + \frac{r^2 (q - p)}{|q - p|^2 - s^2}.$$

This is a direct (although somewhat involved) computation.
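The predicted centre and radius can be spot-checked numerically before doing the computation by hand. The following sketch (helper names and sample data are illustrative assumptions) inverts a few points of a sphere that does not pass through p and measures their distance to the predicted centre.

```python
import math

def add(a, b):
    return tuple(s + t for s, t in zip(a, b))

def sub(a, b):
    return tuple(s - t for s, t in zip(a, b))

def scale(c, a):
    return tuple(c * s for s in a)

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def inversion(p, r, x):
    """i_{p,r}(x) = p + r^2 (x - p) / |x - p|^2."""
    d = sub(x, p)
    return add(p, scale(r * r / dot(d, d), d))

def image_sphere(p, r, q, s):
    """Predicted (centre, radius) of the image of the sphere of radius s about q.

    Assumes |q - p|^2 != s^2, i.e. the sphere does not pass through p.
    """
    d = sub(q, p)
    denom = dot(d, d) - s * s
    return add(p, scale(r * r / denom, d)), abs(r * r * s / denom)
```

When |q − p| = s the denominator vanishes, which corresponds to the case where the image is a hyperplane rather than a compact round sphere.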

If you would prefer a more geometric argument, look up the proof in Thurston's book Three-

Dimensional Geometry and Topology (vol 1). There the n = 2 case is the base case, which he argues

preserves circles using orthogonal trajectories.

The intersection of two round (n−1)-spheres in Rn is either another round (n−1)-sphere, a point,

or it is what would be called a round (n−2)-sphere. So inversions send round (n−2)-spheres to round

(n − 2)-spheres, all the way down to sending round circles to round circles.

A classical theorem which we can potentially prove later in the course gives very strong control over

conformal dieomorphisms.

Theorem 3.9. (Liouville) Provided n ≥ 3, with U ⊂ Rn connected and open and f : U → Rn conformal, then f either has the form

f(x) = p + rA(x − q)

for some A ∈ On, r > 0, and p, q ∈ Rn, or f has the form

$$f(x) = p + \frac{r}{|x - q|^2} A(x - q).$$

In particular, f extends to a conformal diffeomorphism of either all Rn or, in the latter case, Rn \ {q} → Rn \ {p}.

The diffeomorphisms in Theorem 3.9 are called Möbius transformations. In a typical complex analysis course one proves these are precisely the conformal diffeomorphisms of the Riemann sphere (or conformal diffeomorphisms of open subsets of the complex plane with at most one pole). When n = 2 one typically writes these maps as `fractional linear transformations' i.e. f(z) = (az + b)/(cz + d). Of course, when n = 2 there are plenty of conformal maps f : R2 → R2 that do not extend to Möbius transformations, for example f(z) = e^z, the exponential map. The situation is even more degenerate when n = 1 since a function f : R → R is conformal if and only if the derivative is never zero.

Since stereographic projection is a conformal diffeomorphism, the Möbius transformations conjugate up to conformal diffeomorphisms of the sphere. A basic consequence then, is:


Corollary 3.10. The conformal diffeomorphisms of Sn are precisely the Möbius transformations, provided n ≥ 2.

The above corollary should be taken to mean that one has to conjugate the Möbius transformation with stereographic projection to consider it as a Möbius transformation of S2.

Connections, Relating nearby tangent spaces

One of the most interesting aspects of the tangent bundle of a manifold M is that it looks almost like a product M × Rm. The only time when the tangent bundle is literally a product is when M ⊂ Rn is a relatively open subset of a vector subspace of Rn (exercise). Sometimes the tangent bundle is diffeomorphic to a product, for example a diffeomorphism f : TS1 → S1 × R is given by f(z, v) = (z, i v z^{−1}) where the latter operations are interpreted as multiplication in C ≡ R2. But frequently, tangent bundles are not even diffeomorphic to products; although we lack the tools to prove it at present, it can be shown that TS2 is not diffeomorphic (or even homeomorphic) to S2 × R2.

The key issue is that, as one moves from point to point in M, the tangent spaces of M twist, and as one travels around the manifold, the tangent spaces may flip about in a way that no diffeomorphism can straighten. Perhaps simpler than the S2 example above, a non-orientable manifold has a tangent bundle that is not diffeomorphic to a product. This is something we will be able to prove, and the proof will be given in these notes.

A connection allows one to relate nearby tangent spaces. It will also allow us to address some of the more subtle aspects raised in the previous paragraphs. Given a manifold M, the bundle projection map πM : TM → M has its derivative DπM : T 2M → TM. The kernel of D(πM)(p,v) : T(p,v)TM → TpM is called the vertical subspace of T(p,v)TM, and the union of all the kernels for all (p, v) ∈ TM the vertical subspace of T 2M. Since there is the inclusion ip : {p} × TpM → TM, and since {p} × TpM = $\pi_M^{-1}(p)$ is a vector space, there is a canonical identification T({p} × TpM) = {p} × TpM × {p} × TpM. The images of D(ip) : T({p} × TpM) → T 2M for p ∈ M by design give the vertical subspaces of T 2M.

Definition 3.11. An Ehresmann connection on the tangent bundle TN of a manifold N is a smooth map

c : T 2N → T 2N

which satisfies πTN ∘ c = πTN, meaning c restricts to a map c| : T(p,v)TN → T(p,v)TN. We demand these restrictions are linear for all (p, v) ∈ TN. Moreover, we demand that these restrictions be projection maps onto the vertical subspace: c ∘ c = c and the image of c is the vertical subspace of T 2N.

A linear function on a vector space L : V → V is said to be idempotent if L ∘ L = L. When we say c is a projection in the above definition, we mean it is an idempotent linear map. One can check that if V is a vector space and L idempotent, V ≃ img(L) ⊕ ker(L), i.e. an idempotent map gives a decomposition of the ambient space as a direct sum. The map that gives this isomorphism is v ↦ (L(v), v − L(v)), and the inverse is (v, w) ↦ v + w.
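The direct sum decomposition attached to an idempotent map can be illustrated concretely. In the sketch below the matrix L is an arbitrary example chosen for illustration (an oblique, non-orthogonal projection of R2 onto the x-axis), and the function names are likewise assumptions.

```python
def apply(L, v):
    """Matrix-vector product, with L a list of rows."""
    return tuple(sum(a * x for a, x in zip(row, v)) for row in L)

def split(L, v):
    """The map v -> (L(v), v - L(v)) into img(L) + ker(L)."""
    image_part = apply(L, v)
    kernel_part = tuple(x - y for x, y in zip(v, image_part))
    return image_part, kernel_part

def join(image_part, kernel_part):
    """The inverse map (v, w) -> v + w."""
    return tuple(x + y for x, y in zip(image_part, kernel_part))

# Example idempotent: L(x, y) = (x + y, 0), so L(L(x, y)) = (x + y, 0).
L = [[1, 1],
     [0, 0]]
```

For v = (2, 3) this gives the image part (5, 0) and the kernel part (−3, 3); applying L to the kernel part returns zero, and adding the two parts recovers v.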

An Ehresmann connection takes as input a double tangent vector, and returns as output a vector which represents the `vertical part' of the double tangent vector. If you prefer less categorical language, perhaps call the application of c to a double tangent vector the `tangential component' of the double tangent vector.

Note 3.12. When c is an Ehresmann connection, there is a canonical way we can identify T(p,v)TN with TpN ⊕ TpN, which we will describe. Fix (p, v) ∈ TN. Since {p} × TpN ⊂ TN, there is the inclusion T({p} × TpN) = {p} × TpN × {p} × TpN ⊂ T 2N. Let π2 : ({p} × TpN)^2 → TpN be projection onto the final factor, π2(p, v, p, w) = w. Then given V ∈ T 2N with πTN(V) = (p, v), (π2 ∘ c)(V) is defined and is an element of TpN. The map

V ↦ ((π2 ∘ c)(V), (DπN)p(V))

is the isomorphism between $\pi_{TN}^{-1}(p, v)$ and TpN ⊕ TpN. In the above πN : TN → N and πTN : T 2N → TN are the bundle projections. We could have written the above formula as (p, v, w, y) ↦ (π2(c(p, v, w, y)), (DπN)p(p, v, w, y)) where V = (p, v, w, y) but that would be rather redundant. The reason the above map is an isomorphism is that if (DπN)p(V) = 0, V is by design a vertical vector, and c(V) = V, and π2(V) = π2(p, v, 0, y) = (p, y) is zero if and only if y = 0.

Example 3.13. • For a manifold N ⊂ R^k, the induced connection (coming from the manifold sitting in Euclidean space) is c(p, v, w, y) = (p, v, 0, π_{T_pN}(y)) where π_{T_pN} : R^k → T_pN is orthogonal projection onto T_pN.

• On R^3 the football connection is given by c(p, v, w, y) = (p, v, 0, v × w + y), where v × w is the standard cross product. Notice that the kernel of c consists of all V = (p, v, w, y) such that y = w × v. This says that if one takes a tangent vector (p, v) and attempts to push it in the V direction, then v will want to turn towards the w × v direction. Keep in mind (p, w) = D(π_N)(V), i.e. when we push (p, v) in the V direction, p = π_N(p, v) will want to move in the D(π_N)(V) direction.

• Since c produces vertical double tangent vectors, which can be identified with elements of T_pN by Note 3.12, given a connection c there is an equivalent, associated map

c : T^2N → TN

defined by V ↦ π_2(c(V)) with the conventions of Note 3.12. This map, which is also written c when no confusion can arise, is called the covariant derivative associated to c.

• If U ⊂ R^n is open, any Ehresmann connection has the form c(p, v, w, y) = (p, v, 0, f_{(p,v)}(w) + y) where for every (p, v) ∈ U × R^n the function f_{(p,v)} : R^n → R^n is linear. In this notation, the associated covariant derivative has the form c(p, v, w, y) = (p, f_{(p,v)}(w) + y).
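The football connection from the list above is simple enough to experiment with numerically. The following Python sketch encodes a double tangent vector of R^3 as a tuple (p, v, w, y) of 3-vectors. That encoding and the helper names (cross, add, football, football_cov) are choices made for this illustration only, not notation from these notes. It can be used to check that c fixes vertical vectors (those with w = 0) and that vectors with y = w × v lie in its kernel.

```python
# Illustrative sketch: the football connection on R^3 (Example 3.13).
# A double tangent vector is encoded as a tuple (p, v, w, y) of 3-vectors.
# The encoding and all function names are assumptions of this sketch.

ZERO = (0.0, 0.0, 0.0)


def cross(a, b):
    """Standard cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def add(a, b):
    """Componentwise sum of two 3-vectors."""
    return tuple(x + y for x, y in zip(a, b))


def football(p, v, w, y):
    """Ehresmann form c(p, v, w, y) = (p, v, 0, v x w + y)."""
    return (p, v, ZERO, add(cross(v, w), y))


def football_cov(p, v, w, y):
    """Covariant-derivative form (p, v x w + y): base point and last slot only."""
    return (p, football(p, v, w, y)[3])


if __name__ == "__main__":
    p, v, w = (0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
    print(football(p, v, w, (7.0, 8.0, 10.0)))   # vertical part of V
    print(football_cov(p, v, w, cross(w, v)))    # V chosen in the kernel of c
```

Applying football twice gives the same result as applying it once, which is the idempotence property used later in the proof of Proposition 3.27.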

Note 3.14. A common trap is an inappropriate use of the linearity of c as a map T_{(p,v)}TN → T_pN, for example by writing c(p, v, w, y) = c(p, v, w, 0) + c(p, v, 0, y). This only makes sense if both (p, v, w, 0) and (p, v, 0, y) lie in T_{(p,v)}TN, which would require both (p, w) ∈ T_pN and (p, y) ∈ T_pN. This is rarely the case: the only manifolds for which it always holds are those that are locally open subsets of vector subspaces of the ambient space.

There is one further common constraint people put on Ehresmann connections, which relates them to another type of connection, so we introduce that connection first.

Definition 3.15. • A Koszul connection ∇ on TN is a bilinear function

∇ : Γ(TN) × Γ(TN) → Γ(TN)

that satisfies

∇_{fv} w = f ∇_v w    ∀ f ∈ C^∞(N, R)

∇_v (f w) = (v f) w + f ∇_v w    ∀ f ∈ C^∞(N, R)

for all v, w ∈ Γ(TN).

• If c : T^2N → TN is the covariant derivative associated to an Ehresmann connection on TN, define

∇^c : Γ(TN) × Γ(TN) → Γ(TN)

by

∇^c_v(w) = c ∘ Dw ∘ v.

Note 3.16. • Assume ∇ is a Koszul connection on TU where U ⊂ R^n is open, and let e_1, …, e_n be the standard basis of R^n. We abuse notation slightly: when v ∈ R^n, we will frequently consider v as a section of π_U : TU → U, i.e. the function p ↦ (p, v) will also be called v, even though it is technically not v. With this convention, we can represent two arbitrary smooth vector fields on U by v(p) = Σ_{i=1}^n v_i(p) e_i(p) and w(p) = Σ_{i=1}^n w_i(p) e_i(p), where the v_i and w_i are functions U → R for all i. Then

\[ \nabla_v w = \nabla_{\sum_{i=1}^n v_i e_i} \sum_{j=1}^n w_j e_j = \sum_{i,j=1}^n \nabla_{v_i e_i} (w_j e_j) = \sum_{i,j=1}^n v_i \nabla_{e_i}(w_j e_j) = \sum_{i,j=1}^n v_i \big( (e_i w_j) e_j + w_j \nabla_{e_i} e_j \big). \]

In the above formula, e_i w_j is the directional derivative. Sometimes one prefers the notation ∂_{e_i} w_j = e_i w_j in these situations. Keeping to the directional-derivative notation, one has to be mindful of the order in which one writes an expression: e_i w_j is a directional derivative, while w_j e_i is scalar multiplication of e_i by the real-valued function w_j. By design, ∇_{e_i} e_j is just some vector field on U, so it is a pointwise linear combination of the vector fields e_1, …, e_n. The coefficients of these vector fields are called the Christoffel symbols of the connection. Specifically, ∇_{e_i} e_j = Σ_{k=1}^n Γ^k_{i,j} e_k. So for every i, j and k ∈ {1, 2, …, n}, Γ^k_{i,j} is a smooth function U → R. This gives

\[ \nabla_v w = Dw(v) + \sum_{i,j,k=1}^n v_i w_j \Gamma^k_{i,j} e_k. \]

• Notice that the covariant derivative associated to an Ehresmann connection need not be a Koszul connection. But every Koszul connection is the covariant derivative of a unique Ehresmann connection. The case where U ⊂ R^n is open is perhaps the most notationally straightforward to write out. In Example 3.13 we noticed that c(p, v, w, y) = (p, f_{(p,v)}(w) + y). The associated connection has the form ∇_X Y = c ∘ DY ∘ X. So if we write X(p) = (p, X⃗(p)) and Y(p) = (p, Y⃗(p)), then DY(p, v) = (p, Y⃗(p), v, DY⃗_p(v)), giving

\[ \nabla_X Y(p) = \big(p,\ f_{(p,\vec Y(p))}(\vec X(p)) + D\vec Y_p(\vec X(p))\big). \]

In this notation, a Koszul connection has the form

\[ \nabla_X Y(p) = \Big(p,\ D\vec Y_p(\vec X(p)) + \sum_{i,j,k=1}^n x_i y_j \Gamma^k_{i,j}(p) e_k\Big) \]

where x_i and y_j are the components of X⃗(p) and Y⃗(p). What this says is that given a Koszul connection, the corresponding Ehresmann connection is specified by

\[ f_{(p,v)}(w) + y = y + \sum_{i,j,k=1}^n v_i w_j \Gamma^k_{i,j}(p) e_k. \]

This formula tells us that f_{(p,v)}(w) must be bilinear in the pair (v, w). An Ehresmann connection by definition must be linear in w, but the linearity in v is an additional constraint on the Ehresmann connection. We will formalize this observation next.
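As a concrete reading of the coordinate formula ∇_v w = Dw(v) + Σ v_i w_j Γ^k_{i,j} e_k, the following Python sketch evaluates it at a single point from plain lists. The function names, the array layout gamma[k][i][j] = Γ^k_{i,j}, and the sample Christoffel symbols (the standard ones for polar coordinates (r, θ) on the plane, which are not taken from these notes) are assumptions made for illustration.

```python
# Illustrative sketch: evaluate
#   nabla_v w = Dw(v) + sum_{i,j,k} v_i w_j Gamma^k_{i,j} e_k
# at one point. Layout assumptions: gamma[k][i][j] = Gamma^k_{i,j} and
# Dw[k][i] = partial derivative of the k-th component of w in direction e_i.


def covariant_derivative(gamma, v, w, Dw):
    """Return the components of nabla_v w at a point as a list."""
    n = len(v)
    result = []
    for k in range(n):
        directional = sum(Dw[k][i] * v[i] for i in range(n))      # Dw(v)
        correction = sum(v[i] * w[j] * gamma[k][i][j]
                         for i in range(n) for j in range(n))     # Christoffel term
        result.append(directional + correction)
    return result


def polar_christoffel(r):
    """Standard Christoffel symbols of polar coordinates (index 0 = r, 1 = theta)."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    gamma[0][1][1] = -r          # Gamma^r_{theta,theta}
    gamma[1][0][1] = 1.0 / r     # Gamma^theta_{r,theta}
    gamma[1][1][0] = 1.0 / r     # Gamma^theta_{theta,r}
    return gamma
```

For constant coordinate fields the directional term vanishes, so the output is read directly off the Christoffel symbols, for example ∇_{e_θ} e_θ = −r e_r.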

Definition 3.17. An Ehresmann connection is linear if its covariant derivative c : T^2N → TN satisfies

c ∘ DA = c ∘ Dπ_1 + c ∘ Dπ_2

and

c ∘ D(M_a) = M_a ∘ c

for all a ∈ R. Here M_a : TN → TN is scalar multiplication by a, i.e. M_a(p, v) = (p, av), A : TN ⊕ TN → TN is fibrewise addition, A(p, v, w) = (p, v + w), π_1 : TN ⊕ TN → TN remembers only the 1st vector, π_1(p, v, w) = (p, v), and π_2 : TN ⊕ TN → TN remembers only the 2nd, π_2(p, v, w) = (p, w). The sum and the scalar multiple on the right-hand sides are intended to mean fibrewise addition and multiplication, respectively, i.e. c(DA(W)) = c(Dπ_1(W)) + c(Dπ_2(W)) for W ∈ T(TN ⊕ TN), and c(D(M_a)(V)) = a · c(V) for V ∈ T^2N.

Note 3.18. Notice that Ehresmann connections form an affine space, i.e. if both c_1 and c_2 are Ehresmann connections, t c_1 + (1 − t) c_2 is an Ehresmann connection for all t ∈ [0, 1]. Similarly, if both c_1 and c_2 are linear, the above affine combination is linear.

Theorem 3.19. A Koszul connection is induced by a unique Ehresmann connection. Moreover, the covariant derivative of an Ehresmann connection is a Koszul connection if and only if the Ehresmann connection is linear.

Proof. We only consider the case U ⊂ R^n open. This suffices since locally any manifold is diffeomorphic to some such U, and we can pull back the Ehresmann connection on N to U. Consider the linearity conditions on c, and what they say about covariant differentiation ∇_X Y. The addition function A : TU ⊕ TU → TU has the form A(p, v, w) = (p, v + w), so DA has the form DA(p, v, w, p⃗, v⃗, w⃗) = (p, v + w, p⃗, v⃗ + w⃗). Similarly, M_a(p, v) = (p, av) so D(M_a)(p, v, p⃗, v⃗) = (p, av, p⃗, av⃗). Also Dπ_1(p, v, w, p⃗, v⃗, w⃗) = (p, v, p⃗, v⃗) and Dπ_2(p, v, w, p⃗, v⃗, w⃗) = (p, w, p⃗, w⃗). So the additivity condition says

(p, f_{(p,v+w)}(p⃗) + v⃗ + w⃗) = (p, f_{(p,v)}(p⃗) + v⃗) + (p, f_{(p,w)}(p⃗) + w⃗)

i.e.

f_{(p,v+w)}(p⃗) = f_{(p,v)}(p⃗) + f_{(p,w)}(p⃗).

Similarly, the scalar multiplication condition says

f_{(p,av)}(p⃗) + av⃗ = a(f_{(p,v)}(p⃗) + v⃗)

which reduces to

f_{(p,av)}(p⃗) = a f_{(p,v)}(p⃗).

The remainder follows once we define Γ^k_{i,j} so that

\[ \sum_{k=1}^n \Gamma^k_{i,j}(p) e_k = f_{(p,e_i)}(e_j). \]

So one can regard Koszul connections as a subspace of Ehresmann connections, and in this regard it is easy to check they form an affine subspace.

Parallel transport, holonomy, geodesics

Definition 3.20. A 1-parameter family of vectors v : (a, b) → TN is parallel with respect to an Ehresmann connection c : T^2N → TN if

c ∘ Dv = 0⃗.

The above formula requires some interpretation: c ∘ Dv : T(a, b) → TN, and moreover (c ∘ Dv)(t, w) ∈ π_N^{-1}(π_N(v(t))) for all (t, w) ∈ T(a, b) = (a, b) × R, where π_N : TN → N is the bundle projection. So the right hand side 0⃗ means the zero vector over the point π_N(v(t)) in TN. One could write this as c ∘ Dv = 0⃗_{π_N ∘ v} to be precise. Thinking of t as the parameter of the underlying curve γ = π_N ∘ v, i.e. γ(t) being the point on the curve at time t, one could equivalently write

\[ c\Big(\frac{\partial v}{\partial t}\Big) = \vec 0_{\pi_N \circ v}. \]

We say a smooth curve γ : (a, b) → N is a geodesic provided its velocity vector ∂γ/∂t : (a, b) → TN is parallel, i.e.

\[ c\Big(\frac{\partial^2 \gamma}{\partial t^2}\Big) = \vec 0_{\gamma}. \]

One should think of the expression c ∘ Dv as representing the amount the vector v appears to change (from the perspective of the connection) as one moves from tangent space to tangent space, along the curve π_N ∘ v : (a, b) → N.

Example 3.21. • In R^n with the standard connection c(p, v, w, y) = (p, y), a curve γ : (a, b) → R^n is a geodesic if and only if γ(t) = p + tv where p ∈ R^n and v ∈ R^n. This is because the equation that says γ is a geodesic is c(D(γ, γ′)) = 0, which states c(γ, γ′, γ′, γ″) = (γ, γ″) = (γ, 0), i.e. the 2nd derivative of γ is zero.

• In R^n with the standard connection, a smooth family of vectors v(t) = (γ(t), v⃗(t)) ∈ TR^n is parallel if and only if v⃗(t) is constant. Indeed, (c ∘ Dv)(t, 1) = (γ(t), v⃗′(t)), which is the zero vector if and only if v⃗(t) is constant.

• In the football connection, the curve γ(t) = p + tv is a geodesic for any p ∈ R^3 and v ∈ R^3. This is because γ′(t) = (γ(t), v), so D(γ′)(t, 1) = (γ(t), v, v, 0), giving (c ∘ D(γ′))(t, 1) = (γ(t), v × v + 0) = (γ(t), 0), which is the zero vector. Moreover, these are all the geodesics in R^3 with the football connection.

• In the football connection, notice the family of vectors v(t) = (t k⃗, cos t i⃗ + sin t j⃗) is parallel.

• In S^n a curve γ is a geodesic if and only if γ(t) = v cos(kt) + w sin(kt) with v, w ∈ R^{n+1} unit length and orthogonal.

• In H^n, a curve γ is a geodesic if and only if γ(t) = cosh(kt)v + sinh(kt)w with v, w ∈ R^{n+1} Minkowski space, v ∈ H^n, µ(w, w) = 1 and µ(v, w) = 0.

• Consider the Ehresmann connection

c : T^2R^2 → TR^2

given by c(p, v, w, y) = (p, f_{(p,v)}(w) + y) where f_{(p,v)}(w) = (w_1, 0), provided w = (w_1, w_2) ∈ R^2. The geodesics for this connection have the form

γ(t) = (−a_1 e^{−t} + a_2, a_3 + t a_4)

where a_1, a_2, a_3, a_4 are constants. So if we want the geodesics satisfying γ(0) = 0 we get

γ(t) = (a_2(1 − e^{−t}), t a_4)

which is a parametrization of the curve x = a(1 − e^{−y/b}) with a = a_2 and b = a_4. Moreover, γ′(0) = (a_2, a_4). So if one re-scales the initial velocity vector, the geodesic traces out a different curve! In Proposition 3.27 we will see this cannot happen for linear connections.
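Two of the claims in this example can be checked by direct substitution. The Python sketch below does so, using the convention that for v(t) = (γ(t), v⃗(t)) one has Dv(t, 1) = (γ, v⃗, γ′, v⃗′) and c(p, v, w, y) = (p, f_{(p,v)}(w) + y). The function names are hypothetical and chosen for this illustration. It evaluates the parallel-transport residual for the football family v(t) = (t k⃗, cos t i⃗ + sin t j⃗), and the geodesic residual for the last connection on R^2.

```python
import math

# Illustrative sketch for Example 3.21; all function names are assumptions.


def cross(a, b):
    """Standard cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def football_parallel_residual(t):
    """Vector part of c(Dv(t, 1)) for v(t) = (t k, cos t i + sin t j)."""
    vec = (math.cos(t), math.sin(t), 0.0)        # vector part of v(t)
    base_vel = (0.0, 0.0, 1.0)                   # velocity of the base point t k
    vec_vel = (-math.sin(t), math.cos(t), 0.0)   # derivative of the vector part
    twist = cross(vec, base_vel)                 # f_(p,v)(w) = v x w
    return tuple(a + b for a, b in zip(twist, vec_vel))


def nonlinear_geodesic(a2, a4, t):
    """gamma(t) = (a2 (1 - e^{-t}), t a4), so gamma(0) = 0 and gamma'(0) = (a2, a4)."""
    return (a2 * (1.0 - math.exp(-t)), a4 * t)


def nonlinear_geodesic_residual(a2, a4, t):
    """gamma'' + f(gamma') for f_(p,v)(w) = (w_1, 0); vanishes for geodesics."""
    vel = (a2 * math.exp(-t), a4)
    acc = (-a2 * math.exp(-t), 0.0)
    return (acc[0] + vel[0], acc[1] + 0.0)
```

Comparing nonlinear_geodesic(1, 1, t) with nonlinear_geodesic(2, 2, t/2) at the same height y shows different x-coordinates, which is the failure of rescaling noted in the last bullet.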

Theorem 3.22. Let N be a manifold with an Ehresmann connection c. Given any smooth curve γ : [a, b] → N and any vector v_a ∈ π_N^{-1}(γ(a)), there exists a unique smooth function v : [a, b] → TN such that π_N ∘ v = γ, v(a) = v_a and v is parallel. This v is called the parallel transport of v_a along γ.

Proof. The idea is to break [a, b] up into intervals such that for each interval I, γ|_I is contained in a chart. This reduces the problem to the local case, where we pull back the Ehresmann connection to the chart domain (an open subset of Euclidean space) U ⊂ R^n. So WLOG consider the problem for γ : [a, b] → U and c : T^2U → TU an Ehresmann connection. Write v(t) = (γ(t), v⃗(t)), so that Dv(t, 1) = (γ(t), v⃗(t), γ′(t), v⃗′(t)). The differential equation v⃗(t) must satisfy is then c(γ(t), v⃗(t), γ′(t), v⃗′(t)) = 0, which in our local coordinates is f_{(γ(t),v⃗(t))}(γ′(t)) + v⃗′(t) = 0. This is a first-order ODE in v⃗(t), and so there is a unique solution for any initial condition v(a) = v_a. We know the solution depends smoothly on the input vector v_a and on the curve γ(t), as the existence and uniqueness theorem provides smooth dependence on initial conditions.
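The local form of the parallel transport equation can be integrated numerically. The following Python sketch does this with a classical fourth-order Runge-Kutta step, under the convention Dv(t, 1) = (γ, v⃗, γ′, v⃗′) and c(p, v, w, y) = (p, f_{(p,v)}(w) + y), so that the equation reads v⃗′ = −f_{(γ, v⃗)}(γ′). The integrator, the function names and the signature f(p, v, w) are assumptions of this illustration. It is a sketch of one way to approximate the transport, not a method prescribed by these notes.

```python
import math

# Illustrative sketch: numerical parallel transport in an open subset of R^n.
# All names and the signature f(p, v, w) are assumptions of this sketch.


def cross(a, b):
    """Standard cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rk4_step(rhs, t, v, h):
    """One classical Runge-Kutta step for v' = rhs(t, v)."""
    def shifted(k, s):
        return tuple(x + s * dx for x, dx in zip(v, k))
    k1 = rhs(t, v)
    k2 = rhs(t + h / 2, shifted(k1, h / 2))
    k3 = rhs(t + h / 2, shifted(k2, h / 2))
    k4 = rhs(t + h, shifted(k3, h))
    return tuple(x + h / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(v, k1, k2, k3, k4))


def parallel_transport(f, gamma, gamma_dot, v0, t0, t1, steps=1000):
    """Integrate v' = -f(gamma(t), v, gamma'(t)) from t0 to t1 with v(t0) = v0."""
    def rhs(t, v):
        return tuple(-x for x in f(gamma(t), v, gamma_dot(t)))
    h = (t1 - t0) / steps
    v = tuple(v0)
    for n in range(steps):
        v = rk4_step(rhs, t0 + n * h, v, h)
    return v


def football_f(p, v, w):
    """f_(p,v)(w) = v x w for the football connection."""
    return cross(v, w)
```

For instance, transporting (1, 0, 0) along the z-axis γ(t) = (0, 0, t) from t = 0 to t = 1 with football_f should reproduce, up to integration error, the parallel family cos t i⃗ + sin t j⃗ of Example 3.21 at t = 1.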

Definition 3.23. A continuous function γ : I → N where I ⊂ R is piecewise smooth if I = [x_0, x_1] ∪ [x_1, x_2] ∪ ⋯ ∪ [x_{j−1}, x_j] and γ|_{[x_{i−1}, x_i]} : [x_{i−1}, x_i] → N is smooth for all i.

So for example, the absolute value function R → R is piecewise smooth, as is √(sin(x) + 1), but √|x| is not piecewise smooth, since it is not differentiable on any interval [0, ε].

Notice that we can make sense of parallel transport for any path γ : I → N which is piecewise smooth. We simply demand that the parallel transport ODE holds on each interval on which γ is smooth, and that the vector v(t) is continuous. By design, the above existence theorem also holds for piecewise-smooth curves γ.

Definition 3.24. • Given a manifold N and two points p, q ∈ N let Ω_{p,q}N be the set of all piecewise-smooth paths γ : [0, a] → N such that γ(0) = p and γ(a) = q. The holonomy of the pair (N, c) is the function HOL_{p,q} : Ω_{p,q}N × T_pN → T_qN given by HOL_{p,q}(γ, v_0) = v(a) where v : [0, a] → TN is the parallel transport of v_0 along γ.

• If c is an Ehresmann connection on N, we say the holonomy is linear if for every p, q ∈ N and every γ ∈ Ω_{p,q}N the map f : T_pN → T_qN, f(v) = HOL_{p,q}(γ, v), is a linear function.

• If N has a Riemann metric µ and Ehresmann connection c, the connection c is said to be orthogonal if it is linear and if for every p, q ∈ N and every γ ∈ Ω_{p,q}N the map f : T_pN → T_qN, f(v) = HOL_{p,q}(γ, v), is an isometry with respect to µ.


Proposition 3.25. (Groupoid property of holonomy) Let c be an Ehresmann connection on N. Let γ ∈ Ω_{p,q}N and δ ∈ Ω_{q,r}N, where γ : [0, a] → N and δ : [0, b] → N. Their concatenation δ · γ ∈ Ω_{p,r}N is defined as

\[ (\delta \cdot \gamma)(t) = \begin{cases} \gamma(t) & \text{if } t \in [0, a] \\ \delta(t - a) & \text{if } t \in [a, a + b]. \end{cases} \]

Holonomy respects concatenation, in that

HOL_{q,r}(δ, HOL_{p,q}(γ, v)) = HOL_{p,r}(δ · γ, v)

and if γ̄ ∈ Ω_{q,p}N is the reverse of γ, i.e. γ̄(t) = γ(a − t), then

HOL_{q,p}(γ̄, HOL_{p,q}(γ, v)) = v.

A consequence of the above groupoid property is that for an arbitrary Ehresmann connection and path γ, the map T_pN → T_qN given by T_pN ∋ v ↦ HOL_{p,q}(γ, v) ∈ T_qN is a diffeomorphism. The linearity condition (Definition 3.17) on a connection is precisely the condition to ensure this diffeomorphism is a linear one.

Notice that the subscripts in the holonomy notation are somewhat redundant. Given γ ∈ Ω_{p,q}N we will sometimes use HOL(γ, v) to denote HOL_{p,q}(γ, v), since p = γ(a) and q = γ(b) where γ : [a, b] → N.
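The groupoid property can be seen concretely for the football connection along straight segments. Under the convention Dv(t, 1) = (γ, v⃗, γ′, v⃗′), the transport equation along γ(t) = p + t(q − p), t ∈ [0, 1], is v⃗′ = (q − p) × v⃗, whose time-1 solution is the rotation of v⃗ about the axis q − p through the angle |q − p| (Rodrigues' rotation formula). The Python sketch below implements this closed form. The function name football_holonomy and the restriction to straight segments are assumptions made for the illustration.

```python
import math

# Illustrative sketch: holonomy of the football connection along straight
# segments in R^3. Function names are assumptions of this sketch.


def cross(a, b):
    """Standard cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Euclidean inner product."""
    return sum(x * y for x, y in zip(a, b))


def football_holonomy(p, q, v):
    """Transport v along the straight segment from p to q."""
    u = tuple(b - a for a, b in zip(p, q))     # gamma'(t) for gamma(t) = p + t (q - p)
    angle = math.sqrt(dot(u, u))
    if angle == 0.0:
        return tuple(v)
    k = tuple(x / angle for x in u)            # unit rotation axis
    kxv, kv = cross(k, v), dot(k, v)
    c, s = math.cos(angle), math.sin(angle)
    # Rodrigues' rotation formula: the time-1 solution of v' = u x v.
    return tuple(c * vi + s * kxvi + (1.0 - c) * kv * ki
                 for vi, kxvi, ki in zip(v, kxv, k))
```

Transporting from p to q and then back from q to p returns the original vector, and for collinear points p, q, r (in that order) transporting in two stages agrees with transporting along the single segment from p to r, since the two-stage path is a reparametrization of that segment. The map v ↦ football_holonomy(p, q, v) is linear, as expected for a linear connection.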

Theorem 3.26. Let N be a manifold with an Ehresmann connection c. Given any v ∈ TN there is a unique maximal geodesic γ : (a, b) → N such that γ′(0) = v. Moreover, there is a map

exp : A_c → N

where A_c ⊂ TN is the set of all v ∈ TN such that the maximal geodesic γ : I → N with γ′(0) = v is defined at t = 1, and exp(v) = γ(1). An open neighbourhood of the 0-section of TN (i.e. {(p, 0) : p ∈ N}) is contained in A_c, and when exp is restricted to that neighbourhood, it is smooth.

Proof. The proof is similar to Theorem 3.22. That maximal geodesics exist and are unique boils down to a local computation, so assume c : T^2U → TU is an Ehresmann connection on U ⊂ R^n open. The differential equation that says γ is a geodesic has the form

c(γ, γ′, γ′, γ″) = f_{(γ,γ′)}(γ′) + γ″ = 0

where f_{(p,v)}(w) is a linear function of w and depends smoothly on (p, v) ∈ TU. As-is, this is a 2nd-order differential equation. Reduction of order turns it into the system

f_{(γ,β)}(β) + β′ = 0,  γ′ = β

which is now a first-order ODE and has unique maximal solutions for any initial condition, which in this case is a point in TU. Smoothness of exp follows from smooth dependence on initial conditions.
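The reduction of order in the proof translates directly into a numerical scheme for exp. The Python sketch below integrates the first-order system γ′ = β, β′ = −f_{(γ,β)}(β) with a classical Runge-Kutta step and returns γ(1). The integrator, the function names and the signature f(p, v, w) are assumptions of this illustration, and the sketch assumes the geodesic exists up to the requested time. The sample connection is the non-linear one on R^2 from Example 3.21, for which the closed form γ(t) = (a_2(1 − e^{−t}), t a_4) is available for comparison.

```python
import math

# Illustrative sketch: the exponential map via reduction of order.
# All names and the signature f(p, v, w) are assumptions of this sketch.


def rk4_step(rhs, state, h):
    """One classical Runge-Kutta step for the autonomous system state' = rhs(state)."""
    def shifted(k, s):
        return tuple(x + s * dx for x, dx in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, h / 2))
    k3 = rhs(shifted(k2, h / 2))
    k4 = rhs(shifted(k3, h))
    return tuple(x + h / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))


def geodesic_point(f, p, v, t_end=1.0, steps=1000):
    """Approximate gamma(t_end) for the geodesic with gamma(0) = p, gamma'(0) = v."""
    n = len(p)

    def rhs(state):
        gamma, beta = state[:n], state[n:]
        # gamma' = beta,  beta' = -f_(gamma, beta)(beta)
        return tuple(beta) + tuple(-x for x in f(gamma, beta, beta))

    state = tuple(p) + tuple(v)
    h = t_end / steps
    for _ in range(steps):
        state = rk4_step(rhs, state, h)
    return state[:n]


def exp_map(f, p, v, steps=1000):
    """exp(p, v) = gamma(1), assuming the geodesic exists up to time 1."""
    return geodesic_point(f, p, v, 1.0, steps)


def nonlinear_f(p, v, w):
    """f_(p,v)(w) = (w_1, 0): the non-linear connection on R^2 from Example 3.21."""
    return (w[0], 0.0)
```

For this connection exp(0, 2v) differs from the point at time 2 on the geodesic with initial velocity v, which is another way to see that it is not linear.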

A key property of linear Ehresmann connections is that if v : (a, b) → TN is parallel, then any multiple kv : (a, b) → TN is also parallel. Similarly, if v, w : (a, b) → TN are parallel along the same curve, then v + w is also parallel. In particular, if γ : (a, b) → N is a geodesic and k > 0, then δ : (a/k, b/k) → N given by δ(t) = γ(kt) is also a geodesic. So linearly re-parametrizing geodesics results in geodesics that trace out the same image. There are non-linear Ehresmann connections where all these observations fail, such as the last connection in Example 3.21.


Proposition 3.27. If c : T^2N → TN is a linear Ehresmann connection and γ : (a, b) → N is smooth, with h : (c, d) → (a, b) smooth, then

\[ c \circ \frac{\partial^2 (\gamma \circ h)}{\partial t^2} = h'' \cdot \Big(\frac{\partial \gamma}{\partial t} \circ h\Big) + (h')^2 \cdot \Big(c \circ \frac{\partial^2 \gamma}{\partial t^2} \circ h\Big). \]

Moreover:

• Reparametrization of a geodesic γ results in a curve γ ∘ h such that the covariant derivative of its velocity vector, c ∘ ∂^2(γ ∘ h)/∂t^2, is a multiple of its velocity vector ∂(γ ∘ h)/∂t. If h is of constant speed, then γ ∘ h is also a geodesic.

• Assuming further that c is a Levi-Civita connection, given any curve γ such that c ∘ ∂^2γ/∂t^2 is always a multiple of dγ/dt, the unit-speed reparametrization of γ is a continuous curve which is piecewise a geodesic. The points of non-differentiability of the reparametrization correspond to a subset of {t ∈ (a, b) : dγ/dt(t) = 0}. So for example, if the velocity vector of γ is always non-zero, its unit-speed reparametrization is a geodesic. If this set is discrete, the reparametrization is a piecewise-smooth curve which on every smooth segment is a geodesic.

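Proof. By the chain rule

\[ \frac{\partial (\gamma \circ h)}{\partial t} = \big(\gamma \circ h,\ (\gamma' \circ h) \cdot h'\big) \]

\[ \frac{\partial^2 (\gamma \circ h)}{\partial t^2} = \big(\gamma \circ h,\ (\gamma' \circ h) \cdot h',\ (\gamma' \circ h) \cdot h',\ (\gamma'' \circ h) \cdot (h')^2 + (\gamma' \circ h) \cdot h''\big). \]

Notice that the vector (γ ∘ h, (γ′ ∘ h) · h′, 0, (γ′ ∘ h) · h″) is tangent to π_N^{-1}(γ ∘ h) in TN at every parameter time. So by the idempotence condition,

\[ c \circ \frac{\partial^2 (\gamma \circ h)}{\partial t^2} = \big(\gamma \circ h,\ (\gamma' \circ h) \cdot h''\big) + c\big(\gamma \circ h,\ (\gamma' \circ h) \cdot h',\ (\gamma' \circ h) \cdot h',\ (\gamma'' \circ h) \cdot (h')^2\big). \]

By the linearity condition (specifically c ∘ D(M_a) = M_a ∘ c),

\[ c\big(\gamma \circ h,\ (\gamma' \circ h) \cdot h',\ (\gamma' \circ h) \cdot h',\ (\gamma'' \circ h) \cdot (h')^2\big) = h' \cdot c\big(\gamma \circ h,\ \gamma' \circ h,\ (\gamma' \circ h) \cdot h',\ (\gamma'' \circ h) \cdot h'\big) \]

and Ehresmann connections are linear functions on the double tangent fibres T_{(p,v)}TN → T_pN, giving

\[ c\big(\gamma \circ h,\ \gamma' \circ h,\ (\gamma' \circ h) \cdot h',\ (\gamma'' \circ h) \cdot h'\big) = h' \cdot c(\gamma, \gamma', \gamma', \gamma'') \circ h \]

which is the formula we were looking for.

All curves on Riemann manifolds have unit speed reparametrizations. The idea is to let t_0 ∈ (a, b) and define L(t) = ∫_{t_0}^t |∂γ/∂t| dt. So L : (a, b) → R is a non-decreasing function, which can have zero derivative. For now, assume L has positive derivative on (a, b), and let (c, d) be the image of L. Then h = L^{-1} is our reparametrization of the curve, i.e. γ ∘ h is unit speed and a geodesic. One can do this on every open interval in the domain of γ where the derivative is non-zero, assembling the piecewise reparametrization.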

Another consequence of linearity is that if ∂γ/∂t is parallel, then given any k ∈ R, k ∂γ/∂t is also parallel. Notice that this is not the case for non-linear connections, like the last one in Example 3.21.

The torsion of connections


Recall the football connection on R^3, c(p, v, w, y) = (p, v × w + y). The characteristic property of the football connection is that geodesics are precisely the Euclidean straight lines, yet parallel vector fields along a geodesic `twist' about that geodesic. In this regard, the football connection is quite similar to the usual connection on Euclidean space, as it shares the same geodesics. But it is different in that parallel vectors twist. If you view parallel transport as a manner of transporting a `reference frame' from one tangent space to another, the football connection rotates reference frames, in the manner that happens when one throws a football. In this section we make these observations precise, resulting in the notion of the torsion of a connection.

Consider the general case of a manifold N with a Koszul connection c. Let v_p, w_p ∈ TN be vectors such that π_N(v_p) = p, π_N(w_p) = p. Consider the function

f : (−ε, ε)^2 → N

given by

\[ f(t, x) = \exp\big(x \cdot \mathrm{HOL}(\gamma|_{[0,t]}, w_p)\big) = \exp\big(\mathrm{HOL}(\gamma|_{[0,t]}, x w_p)\big) \]

where γ : (−ε, ε) → N is the geodesic such that ∂γ/∂t(0) = v_p. We can think of f as describing the motion of a `rigid' rod: an observer is travelling along the geodesic γ. At time t the observer is at the point γ(t), and f(t, x) is the point a parameter-distance x from the observer along the rod, at time t. Thus the question of whether or not ∂f/∂t is parallel along the rod amounts to asking whether or not the rod appears to twist as the observer moves.

[Figure: the rod construction, showing the point p = f(0, 0), the curves f(0, x) and f(t, 0), and the vectors v_p and w_p.]

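Consider ∂f/∂x : (−ε, ε)^2 → TN and ∂f/∂t : (−ε, ε)^2 → TN. By design, ∂f/∂x is parallel along the curve f|_{(−ε,ε)×{0}}, and so in local coordinates

\[ \frac{\partial^2 f}{\partial t \partial x} + \sum_{i,j,k=1}^n \frac{\partial f_i}{\partial t} \frac{\partial f_j}{\partial x} \Gamma^k_{i,j} e_k = 0 \tag{1} \]

where (1) holds provided x = 0.

We now ask if ∂f/∂t is parallel along the rod at time t = 0. This would require

\[ \frac{\partial^2 f}{\partial x \partial t} + \sum_{i,j,k=1}^n \frac{\partial f_i}{\partial x} \frac{\partial f_j}{\partial t} \Gamma^k_{i,j} e_k \tag{2} \]

to be zero. Both of the above equations apply at (t, x) = (0, 0). By (1) we know

\[ \frac{\partial^2 f}{\partial t \partial x} = - \sum_{i,j,k=1}^n \frac{\partial f_i}{\partial t} \frac{\partial f_j}{\partial x} \Gamma^k_{i,j} e_k \]

which, since mixed partial derivatives commute in local coordinates, we can substitute into (2) to observe that the `twisting' experienced by the rod (to first order) at (t, x) = (0, 0) is given by

\[ \sum_{i,j,k=1}^n \frac{\partial f_i}{\partial x} \frac{\partial f_j}{\partial t} \Gamma^k_{i,j} e_k - \sum_{i,j,k=1}^n \frac{\partial f_i}{\partial t} \frac{\partial f_j}{\partial x} \Gamma^k_{i,j} e_k. \]

It is important to keep in mind that this is quite literally c ∘ ∂^2 f/∂x∂t evaluated at (t, x) = (0, 0), but expressed in local coordinates. Relabelling i ↔ j in the second sum, we can rewrite the above as

\[ \sum_{i,j,k=1}^n \frac{\partial f_i}{\partial x} \frac{\partial f_j}{\partial t} \big(\Gamma^k_{i,j} - \Gamma^k_{j,i}\big) e_k \]

which can be further expressed as

\[ \sum_{i,j,k=1}^n w_{p,i}\, v_{p,j} \big(\Gamma^k_{i,j} - \Gamma^k_{j,i}\big) e_k \]

where w_{p,i} and v_{p,j} are the coordinates of w_p and v_p, expressed with respect to the standard basis of R^n.

So clearly, there is a fibrewise bilinear function TN ⊕ TN → TN, meaning for every p ∈ N it restricts to a bilinear function T_pN ⊕ T_pN → T_pN, given by the above geometric construction or by the above formula using the Christoffel symbols in local coordinates.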

Definition 3.28. The torsion of a Koszul connection is the fibrewise bilinear function

τ_c : TN ⊕ TN → TN

defined by the property that if f : R^2 → N is a smooth function of variables (x, y) such that ∂f/∂x(0, 0) = v and ∂f/∂y(0, 0) = w, then

\[ \tau_c(v, w) = c\Big(\frac{\partial^2 f}{\partial x \partial y}(0, 0)\Big) - c\Big(\frac{\partial^2 f}{\partial y \partial x}(0, 0)\Big). \]

Torsion by design is anti-symmetric, τ_c(v, w) = −τ_c(w, v).

In our preamble, where we interpret torsion as describing the `twisting' experienced by a rod that is being dragged along a parallel family of vectors, we are using the above formula in the special case c(∂^2 f/∂t∂x(0, 0)) = 0. We did not need to assume γ was a geodesic in the rod example.

The definition of torsion is reminiscent of a few of our definitions of the Lie bracket. These two notions are very closely related; one expression of this is the proposition below.

Proposition 3.29. The torsion τ_c of a Koszul connection satisfies

∇_v w − ∇_w v = τ_c(v, w) + [v, w]

for all v, w ∈ Γ(TN).


In the above formula, the input to τ_c is (v, w), which we are interpreting as a section of TN ⊕ TN in the natural way: technically (v, w) is i^{-1} ∘ (v ⊕ w) where v ⊕ w : N → TN × TN and i : TN ⊕ TN → TN × TN is the standard inclusion, i(p, v, w) = (p, v, p, w).

So an alternative interpretation of the torsion is that it measures the extent to which the difference of mixed covariant derivatives fails to give the Lie bracket of vector fields.

In local coordinates the torsion has the form

\[ \tau_c(X, Y)(p) = \sum_{i,j,k=1}^n x_i y_j \big(\Gamma^k_{i,j}(p) - \Gamma^k_{j,i}(p)\big) e_k. \]

If the torsion is zero everywhere, the connection c is called symmetric or torsion free. From a more physical perspective, one might prefer to think of a torsion-free connection as one where holonomy along geodesics describes what you might like to call inertial reference frames, since to first order, dragging a stick along a parallel family of vectors does not result in the stick twisting.

We give one more interpretation of the torsion.

Proposition 3.30. Let ι_N : T^2N → T^2N be the involution of the double tangent bundle. Let c : T^2N → TN be a linear Ehresmann connection, π_N : TN → N the tangent bundle projection, and π_TN : T^2N → TN the corresponding projection for the double tangent bundle. Then π_N ∘ c ∘ ι_N = π_N ∘ c. In particular, c − c ∘ ι_N is defined, and it satisfies

c − c ∘ ι_N = τ_c(π_TN, π_TN ∘ ι_N)

i.e. for V ∈ T^2N, c(V) − c(ι_N V) = τ_c(π_TN(V), π_TN(ι_N(V))). If we write V = (p, v, w, y) then recall π_TN(V) = (p, v), π_N(p, v) = p, ι_N(V) = (p, w, v, y).

Thus, from this perspective a connection being symmetric means c ∘ ι_N = c. The involution ι_N acts on the space of all connections, and symmetric connections are the fixed points of this action.

Example 3.31. Any connection on any 1-manifold is torsion free. On R^3 with the football connection, τ_c(v, w) = 2v × w.
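The football value in Example 3.31 can be recovered from the characterization c(V) − c(ι_N V) = τ_c(π_TN(V), π_TN(ι_N V)) of Proposition 3.30. The Python sketch below encodes V = (p, v, w, y) as a tuple of 3-vectors and returns only the vector part of the covariant derivative. The encoding and the function names are assumptions of this illustration.

```python
# Illustrative sketch: torsion of the football connection via Proposition 3.30.
# The tuple encoding (p, v, w, y) and all function names are assumptions.


def cross(a, b):
    """Standard cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def football_cov(p, v, w, y):
    """Vector part of the covariant derivative: v x w + y."""
    return tuple(a + b for a, b in zip(cross(v, w), y))


def involution(p, v, w, y):
    """The involution of the double tangent bundle: swap the two middle slots."""
    return (p, w, v, y)


def torsion(cov, p, v, w, y=(0.0, 0.0, 0.0)):
    """c(V) - c(iota(V)) for V = (p, v, w, y)."""
    V = (p, v, w, y)
    return tuple(a - b for a, b in zip(cov(*V), cov(*involution(*V))))
```

The result does not depend on the last slot y, since y cancels in the difference, and it changes sign when v and w are swapped.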

The torsion τ_c of a connection has an adjoint map τ_c : T_pN → Hom(T_pN, T_pN) given by τ_c(v)(w) = τ_c(v, w). A productive way to think of this is that GL(T_pN) is a Lie group, and Hom(T_pN, T_pN) is the tangent space to the identity in GL(T_pN). Given v ∈ T_pN, τ_c(v) ∈ Hom(T_pN, T_pN) describes the direction a rod twists if its base is at p, moving in the direction v, where the observer holds the rod so that its base is being parallel transported. Each such rod has a direction vector w ∈ T_pN, and τ_c(v)(w) measures the instantaneous amount of twist experienced as one moves from the observer towards the end of the rod (while the rod is being dragged by the observer).

Can you think of an example of a connection on R2 with non-zero torsion?

Theorem 3.32. (Fundamental Theorem of Affine Connections) For any Koszul connection on a manifold N there exists a unique torsion-free Koszul connection which shares the same geodesics.

Proof. Conceptually, the idea here is that Koszul connections form an affine space, and Koszul connections that share the same geodesics form an affine subspace. What we need to show is that among Koszul connections that share the same geodesics as ∇^c, there is precisely one torsion-free connection. So there are two steps in this proof: (1) finding a torsion-free connection that shares the same geodesics, and (2) showing that any two torsion-free connections that have the same geodesics are the same.


(1) Recall that the condition for γ : (a, b) → N to be a geodesic is

\[ c \circ \frac{\partial^2 \gamma}{\partial t^2} = 0_\gamma. \]

In local coordinates this is

c(γ, γ′, γ′, γ″) = 0_γ

or

(γ, γ″ + f_{(γ,γ′)}(γ′)) = (γ, 0)

where c(p, v, w, y) = (p, y + f_{(p,v)}(w)). Notice that if c : T^2N → TN is a linear Ehresmann connection, c ∘ ι_N is also, where ι_N : T^2N → T^2N is the involution of the double tangent bundle. Moreover, c ∘ ι_N shares the same geodesics as c, since ι_N fixes (γ, γ′, γ′, γ″). Since connections form an affine space, we can take the average of the two:

\[ c' = \frac{c + c \circ \iota_N}{2}. \]

This is by design a linear Ehresmann connection, and it has the same geodesics as c. It is also by design torsion free.

(2) Let c and c′ be two torsion-free linear Ehresmann connections that share the same geodesics. We want to prove that c = c′. Consider the two connections in local coordinates

c(p, v, w, y) = (p, y + f_{(p,v)}(w)),  c′(p, v, w, y) = (p, y + g_{(p,v)}(w)).

If γ is a geodesic, we know c(γ, γ′, γ′, γ″) = 0 = c′(γ, γ′, γ′, γ″). Given any p ∈ N and v ∈ T_pN, the geodesic through p with initial velocity v realizes some y such that (γ, γ′, γ′, γ″) = (p, v, v, y). Plugging this into our formulas for the connections, we see

f_{(p,v)}(v) = g_{(p,v)}(v)

for all (p, v) ∈ TN. But it is a basic fact that if F : V × V → V and G : V × V → V are symmetric bilinear functions such that F(v, v) = G(v, v) for all v ∈ V, then F = G. The proof of this is that F(v + w, v + w) = F(v, v) + F(w, w) + 2F(v, w), i.e. F(v, w) = ½(F(v + w, v + w) − F(v, v) − F(w, w)). In this case, both f and g restrict to symmetric bilinear functions on every tangent space.
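Both steps of the proof can be tried out on the football connection. The Python sketch below forms the average (c + c ∘ ι_N)/2 of step (1), working with the vector part of the covariant derivative only, and implements the polarization identity of step (2) for vector-valued quadratic maps. The tuple encoding (p, v, w, y) and the function names are assumptions of this illustration.

```python
# Illustrative sketch for the proof of Theorem 3.32.
# The tuple encoding and all function names are assumptions of this sketch.


def cross(a, b):
    """Standard cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def football_cov(p, v, w, y):
    """Vector part of the football covariant derivative: v x w + y."""
    return tuple(a + b for a, b in zip(cross(v, w), y))


def symmetrize(cov):
    """Return the averaged connection (c + c o iota)/2 as a function of (p, v, w, y)."""
    def averaged(p, v, w, y):
        a = cov(p, v, w, y)
        b = cov(p, w, v, y)          # c composed with the involution
        return tuple((x + z) / 2.0 for x, z in zip(a, b))
    return averaged


def polarize(quadratic):
    """Recover a symmetric bilinear F from Q(v) = F(v, v) by polarization."""
    def bilinear(v, w):
        s = tuple(a + b for a, b in zip(v, w))
        return tuple((x - y - z) / 2.0
                     for x, y, z in zip(quadratic(s), quadratic(v), quadratic(w)))
    return bilinear
```

Since v × w is anti-symmetric, the averaged football connection returns y unchanged, i.e. it is the standard connection on R^3, which is consistent with the two connections sharing the straight lines as geodesics.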

Example 3.33. If c is the football connection on R^3 we observed that holonomy along a geodesic rotates normal vectors like a football being thrown by a right-handed person. The opposite connection c ∘ ι_N does the reverse: holonomy rotates normal vectors like a football being thrown by a left-handed person.

Further observe, if Γ^k_{i,j} are the Christoffel symbols for a connection c, then the Christoffel symbols Γ̃^k_{i,j} for c ∘ ι_N are Γ̃^k_{i,j} = Γ^k_{j,i}.

Levi-Civita connections

We have used the word linear in two contexts in regard to Ehresmann connections: with regard to how addition TN ⊕ TN → TN and multiplication are compatible with the connection, and also with regard to the holonomy HOL(γ, ·) : T_pN → T_qN. In this section we show these two notions of linearity are the same, and discuss how linear Ehresmann connections can relate to Riemann metrics.


Proposition 3.34. (1) An Ehresmann connection is linear if and only if its holonomy is linear.

(2) A Koszul connection is orthogonal if and only if ∂_X µ(v, w) = µ(∇_X v, w) + µ(v, ∇_X w) for all vector fields v, w, X ∈ Γ(TN).

Proof. First the claim about linearity (1). If v, w : I → TN are parallel, with π_N ∘ v = π_N ∘ w, we claim that av + bw, the fibrewise linear combination, is also parallel. This requires checking c ∘ D(av + bw) = 0, but the first definition of linearity says c ∘ D(av + bw) = c ∘ D(A ∘ (av, bw)) = c ∘ D(av) + c ∘ D(bw) = c ∘ D(M_a ∘ v) + c ∘ D(M_b ∘ w) = (M_a ∘ c ∘ Dv) + (M_b ∘ c ∘ Dw) = 0. By the uniqueness of parallel transport, HOL(γ, av_0 + bw_0) = a HOL(γ, v_0) + b HOL(γ, w_0). To deduce the converse, let γ : (−ε, ε) → N be a curve and differentiate the function HOL(γ|_{[0,t]}, v_0) : T_{γ(0)}N → TN with respect to t at t = 0.

For the claim (2) about orthogonality, consider the formula

∂_X µ(v, w) = µ(∇_X v, w) + µ(v, ∇_X w).

Re-written, this says

Dµ ∘ D(v ⊕ w) ∘ X = µ((c ∘ Dv ∘ X) ⊕ w) + µ(v ⊕ (c ∘ Dw ∘ X)).

If we evaluate the above formula at a point p ∈ N we get

Dµ_{(v(p),w(p))}(D(v ⊕ w)_p(X(p))) = µ(c(Dv_p(X(p))), w(p)) + µ(v(p), c(Dw_p(X(p)))).

The important thing to notice about this formula is that from X, v and w the only relevant data is v(p), w(p), X(p), Dv_p(X(p)) and Dw_p(X(p)). So we could restate this formula as

Dµ(p, v, w, p⃗, v⃗, w⃗) = µ(c(p, v, p⃗, v⃗), (p, w)) + µ((p, v), c(p, w, p⃗, w⃗))

where (p, v, w) ∈ TN ⊕ TN and (p, v, w, p⃗, v⃗, w⃗) ∈ T(TN ⊕ TN). In particular, we could restate this condition as

(µ ∘ γ)′(0) = µ(c((π_1 ∘ γ)′(0)), π_2 ∘ γ(0)) + µ(π_1 ∘ γ(0), c((π_2 ∘ γ)′(0)))

where γ : I → TN ⊕ TN is smooth and π_1 : TN ⊕ TN → TN and π_2 : TN ⊕ TN → TN remember only the first and second tangent vectors respectively. So by design if π_1 ∘ γ and π_2 ∘ γ are parallel, µ ∘ γ must be constant, i.e. our vector field definition of orthogonality implies the holonomy is orthogonal. To prove the converse, write the vectors π_1 ∘ γ and π_2 ∘ γ as fibrewise linear combinations of a parallel orthonormal frame and differentiate.
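As a small sanity check of criterion (2), consider the football connection on R^3 with the Euclidean inner product µ. Under the convention ∇_X v = c ∘ Dv ∘ X of Definition 3.15, the football connection contributes the extra term v × X to the ordinary directional derivative, so the orthogonality condition reduces to the pointwise identity µ(v × X, w) + µ(v, w × X) = 0. The Python sketch below evaluates the left-hand side. The function names are hypothetical, and the reduction just described is an observation made for this illustration rather than a statement from the notes.

```python
# Illustrative sketch: the correction terms of the football connection cancel
# in the orthogonality condition for the Euclidean inner product.
# Function names are assumptions of this sketch.


def cross(a, b):
    """Standard cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Euclidean inner product mu."""
    return sum(x * y for x, y in zip(a, b))


def orthogonality_defect(v, w, X):
    """mu(v x X, w) + mu(v, w x X); zero means the condition in (2) holds."""
    return dot(cross(v, X), w) + dot(v, cross(w, X))
```

So the football connection satisfies the orthogonality criterion while having non-zero torsion, which illustrates why both conditions are needed to single out one connection.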

Theorem 3.35. (Fundamental Lemma of Riemannian Geometry) On a Riemannian manifold there exists a unique orthogonal torsion-free Koszul connection.

Proof. The proof is a somewhat inspired computation. The orthogonality condition says

∂_X µ(Y, Z) = µ(∇_X Y, Z) + µ(Y, ∇_X Z)    (1)

∂_Y µ(Z, X) = µ(∇_Y Z, X) + µ(Z, ∇_Y X)    (2)

∂_Z µ(X, Y) = µ(∇_Z X, Y) + µ(X, ∇_Z Y)    (3)

So if we take the linear combination (1) + (2) − (3) of the above equations, we get

∂_X µ(Y, Z) + ∂_Y µ(Z, X) − ∂_Z µ(X, Y)

= µ(∇_X Z − ∇_Z X, Y) + µ(∇_Y Z − ∇_Z Y, X) + µ(∇_X Y + ∇_Y X, Z)

which, if the connection were to be torsion-free, would have to be equal to

= µ([X, Z], Y) + µ([Y, Z], X) + µ(2∇_X Y − [X, Y], Z)

and this can be viewed as a formula for ∇_X Y, i.e. (*)

µ(2∇_X Y, Z) = µ([X, Y], Z) − µ([X, Z], Y) − µ([Y, Z], X) + ∂_X µ(Y, Z) + ∂_Y µ(Z, X) − ∂_Z µ(X, Y)

One can check directly that the above formula uniquely determines ∇_X Y from µ, since µ is non-degenerate and Z is arbitrary.

To see that the above formula defines an orthogonal, torsion-free connection, recall the torsion-free condition amounts to checking

µ(∇_X Y − ∇_Y X, Z) = µ([X, Y], Z)

for all X, Y, Z. This amounts to checking that the difference between the formula (*) and (*) with X and Y switched gives 2µ([X, Y], Z), which is a direct check. Similarly the orthogonality condition amounts to adding (*) to the copy of (*) with Y and Z switched and checking that the sum gives 2∂_X µ(Y, Z), which is again direct.

Frequently the formula above is given in component notation. If N is an open subset of R^n, represent vector fields X and Y by X(p) = Σ_i x_i(p) e_i, Y(p) = Σ_i y_i(p) e_i. Then

\[ \nabla_X Y(p) = DY_p(X(p)) + \sum_{i,j,k=1}^n x_i(p) y_j(p) \Gamma^k_{i,j}(p) e_k. \]

Plug X = e_i, Y = e_j and Z = e_l into the above formula for µ(2∇_X Y, Z). The Lie brackets of the constant vector fields vanish, giving

\[ 2 \sum_{k=1}^n \Gamma^k_{i,j} g_{k,l} = \partial_i g_{j,l} + \partial_j g_{l,i} - \partial_l g_{i,j} \tag{$*$} \]

where g_{i,j}(p) = µ(p)(e_i, e_j). To give the formula a little more meaning, notice that if µ is any inner product on a finite-dimensional vector space V, there is a natural isomorphism between V and its dual space V^*. The dual space is the space of linear functions V → R. The map µ+ : V → V^* is usually defined as µ+(v)(w) = µ(v, w), or in more relaxed notation, µ+(v) = µ(v, ·). That µ+ : V → V^* is an isomorphism only depends on µ being bilinear and non-degenerate and V being finite-dimensional. When V has a basis {e_1, …, e_n} there is the corresponding dual basis {e_1^*, …, e_n^*} ⊂ V^* defined by e_i^*(Σ_{k=1}^n a_k e_k) = a_i. So from this point of view, µ+(e_i) = Σ_{j=1}^n g_{i,j} e_j^*, i.e. the matrix [g_{i,j}] is the representation of the linear transformation µ+ with respect to the basis {e_1, …, e_n} for V and {e_1^*, …, e_n^*} for V^* respectively. Since µ+ is an isomorphism, the matrix [g_{i,j}] is invertible, and we call the entries of the inverse matrix [g^{i,j}]. Rewriting equation (∗) we get

\[ \sum_{k=1}^n \Gamma^k_{i,j} g_{k,l} = \frac{1}{2}\big(\partial_i g_{j,l} + \partial_j g_{l,i} - \partial_l g_{i,j}\big). \]

For each i and j fixed, we can think of this equation as an equality of n-tuples of real numbers, one entry for each l ∈ {1, 2, …, n}. So we can think of it as a vector in R^n, indexed by l; in particular, we want to think of it as the coefficients of a vector in (R^n)^*. If we apply the inverse transform (µ+)^{-1} : (R^n)^* → R^n to this vector, we get

\[ \sum_{k,l=1}^n \Gamma^k_{i,j} g_{k,l} g^{l,m} = \sum_{l=1}^n g^{l,m}\, \frac{1}{2}\big(\partial_i g_{j,l} + \partial_j g_{l,i} - \partial_l g_{i,j}\big). \]

The sum on the left is

\[ \sum_{k,l=1}^n \Gamma^k_{i,j} g_{k,l} g^{l,m} = \sum_{k=1}^n \Gamma^k_{i,j} \Big( \sum_{l=1}^n g_{k,l} g^{l,m} \Big) \]

but Σ_{l=1}^n g_{k,l} g^{l,m} = δ_{k,m} since [g^{l,m}] is the inverse matrix to [g_{k,l}]. So we have the formula

\[ \Gamma^m_{i,j} = \sum_{l=1}^n g^{l,m}\, \frac{1}{2}\big(\partial_i g_{j,l} + \partial_j g_{l,i} - \partial_l g_{i,j}\big). \]
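The component formula Γ^m_{i,j} = Σ_l g^{l,m} ½(∂_i g_{j,l} + ∂_j g_{l,i} − ∂_l g_{i,j}) is easy to evaluate numerically. The Python sketch below does so in dimension 2, approximating the partial derivatives of the metric by central differences. The restriction to 2 × 2 matrices, the step size h, the function names, and the sample metric (the Euclidean metric in polar coordinates, g = diag(1, r^2), which is a standard example not taken from these notes) are all assumptions of this illustration.

```python
# Illustrative sketch: Christoffel symbols of a Riemann metric in dimension 2.
# Names, the finite-difference step and the sample metric are assumptions.


def polar_metric(x):
    """Matrix [g_{i,j}] of the Euclidean metric in polar coordinates x = (r, theta)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]


def inverse_2x2(g):
    """Inverse of a 2 x 2 matrix, giving the entries g^{i,j}."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def christoffel(metric, x, h=1e-5):
    """Return gamma with gamma[m][i][j] approximating Gamma^m_{i,j} at x."""
    n = 2
    dg = []   # dg[l][i][j] approximates the partial derivative d_l g_{i,j}
    for l in range(n):
        xp, xm = list(x), list(x)
        xp[l] += h
        xm[l] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)]
                   for i in range(n)])
    ginv = inverse_2x2(metric(x))
    return [[[sum(ginv[l][m] * 0.5 * (dg[i][j][l] + dg[j][l][i] - dg[l][i][j])
                  for l in range(n))
              for j in range(n)] for i in range(n)] for m in range(n)]
```

For the polar metric at r = 2 this should return approximately Γ^r_{θ,θ} = −2 and Γ^θ_{r,θ} = Γ^θ_{θ,r} = 1/2, with the remaining symbols close to zero.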


Definition 3.36. An orthogonal, torsion-free Koszul connection is called a Levi-Civita connection. On a Riemann manifold it is conventional to take the Levi-Civita connection as the standard connection, unless otherwise specified.

Note 3.37. Notice that if $N \subset \mathbb{R}^k$, what we called the induced connection, $c(p, v, w, y) = (p, \pi_{T_pN}(y))$, is an orthogonal, torsion-free connection; it is orthogonal with respect to the induced Riemann metric. So it is the Levi-Civita connection for the induced Riemann metric. Similarly for our example with hyperbolic space, but where we induce from the Lorentz metric.

Gauss's formula for the Levi-Civita connection, together with its classical derivation given in Theorem 3.35, might make the formula appear a little mysterious.

$$\mu(2\nabla_X Y, Z) = \mu([X, Y], Z) - \mu([X, Z], Y) - \mu([Y, Z], X) + \partial_X \mu(Y, Z) + \partial_Y \mu(Z, X) - \partial_Z \mu(X, Y)$$

I claim that it should not appear mysterious: the complications in this formula are primarily because it is written in the formalism of vector fields. Let's write it out in a more elementary formalism, to see how it could be derived more simply.

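Recall our terminology: if $f : I \to N$ is smooth, with $I \subset \mathbb{R}$ an interval, then $\frac{\partial f}{\partial t} : I \to TN$ is the function $\frac{\partial f}{\partial t}(t) = Df(t, 1)$. Given a function $v : I \to TN$, $\frac{\partial v}{\partial t} : I \to T^2N$ is similarly defined. Let the symbol $\nabla_t v = c \circ \frac{\partial v}{\partial t}$. Then if $f : (-\varepsilon, \varepsilon)^3 \to N$ is a smooth map, and if we denote the variables of $f$ by $(x, y, z)$, then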

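$$2\mu\left(c \circ \frac{\partial^2 f}{\partial x \partial y}, \frac{\partial f}{\partial z}\right) = \frac{\partial}{\partial x}\mu\left(\frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) + \frac{\partial}{\partial y}\mu\left(\frac{\partial f}{\partial z}, \frac{\partial f}{\partial x}\right) - \frac{\partial}{\partial z}\mu\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right). \qquad (*)$$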

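is a restatement of the formula appearing in the Fundamental theorem of Riemannian geometry. The statement that the connection is torsion free is equivalent to $\nabla_x \frac{\partial f}{\partial y} = c \circ \frac{\partial^2 f}{\partial x \partial y} = c \circ \frac{\partial^2 f}{\partial y \partial x} = \nabla_y \frac{\partial f}{\partial x}$. The statement that the connection is orthogonal is equivalent to $\partial_x \mu(\frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}) = \mu(\nabla_x \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}) + \mu(\frac{\partial f}{\partial y}, \nabla_x \frac{\partial f}{\partial z})$.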

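$$\partial_x \mu\left(\frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) = \mu\left(\nabla_x \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) + \mu\left(\frac{\partial f}{\partial y}, \nabla_x \frac{\partial f}{\partial z}\right) \qquad (1)$$

$$\partial_y \mu\left(\frac{\partial f}{\partial z}, \frac{\partial f}{\partial x}\right) = \mu\left(\nabla_y \frac{\partial f}{\partial z}, \frac{\partial f}{\partial x}\right) + \mu\left(\frac{\partial f}{\partial z}, \nabla_y \frac{\partial f}{\partial x}\right) \qquad (2)$$

$$\partial_z \mu\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right) = \mu\left(\nabla_z \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right) + \mu\left(\frac{\partial f}{\partial x}, \nabla_z \frac{\partial f}{\partial y}\right) \qquad (3)$$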

Just as in the original proof of the fundamental theorem, we add the first two equations and subtract the third. We need only use the torsion-free condition to derive the above version (*) of the Fundamental Theorem. Moreover, one needs only use orthogonality and the torsion-free condition on the right hand side of the equation (*) to deduce it is equal to the left side.
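To make the cancellation explicit, here is a sketch of the combination (1) + (2) - (3). The shorthand $f_x = \partial f/\partial x$ (and similarly $f_y$, $f_z$) is introduced only for this display and is not notation from the notes; the symmetry of $\mu$ is used freely.

```latex
\begin{align*}
(1) + (2) - (3)
  ={}& \mu(\nabla_x f_y, f_z) + \mu(f_y, \nabla_x f_z)
     + \mu(\nabla_y f_z, f_x) + \mu(f_z, \nabla_y f_x) \\
     & - \mu(\nabla_z f_x, f_y) - \mu(f_x, \nabla_z f_y) \\
  ={}& \mu(\nabla_x f_y, f_z) + \mu(f_z, \nabla_y f_x)
     && \text{since } \nabla_x f_z = \nabla_z f_x \text{ and } \nabla_y f_z = \nabla_z f_y \\
  ={}& 2\,\mu(\nabla_x f_y, f_z)
     && \text{since } \nabla_y f_x = \nabla_x f_y .
\end{align*}
```

Since $\nabla_x f_y = c \circ \frac{\partial^2 f}{\partial x \partial y}$, the last line is the left side of (*), while the left sides of (1), (2) and (3) combine to the right side of (*).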



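Corollary 3.38. Let $f : N \to M$ be a smooth map of manifolds, moreover let $c_N$ be a connection on $TN$ and $c_M$ a connection on $TM$. Provided $f$ preserves connections in the sense that the diagram

$$\begin{array}{ccc} T^2N & \xrightarrow{\ c_N\ } & TN \\ {\scriptstyle D^2f}\downarrow & & \downarrow{\scriptstyle Df} \\ T^2M & \xrightarrow{\ c_M\ } & TM \end{array}$$

commutes, then $f$ sends geodesics in $N$ to geodesics in $M$ and $Df$ sends parallel vectors in $TN$ to parallel vectors in $TM$. Moreover, $f \circ \exp_N = \exp_M \circ Df$ wherever both are defined. In particular, if $f$ is an isometry of Riemann manifolds and the connections on both are the Levi-Civita connections, then the diagram

$$\begin{array}{ccc} TN & \xrightarrow{\ \exp\ } & N \\ {\scriptstyle Df}\downarrow & & \downarrow{\scriptstyle f} \\ TM & \xrightarrow{\ \exp\ } & M \end{array}$$

commutes. This latter property is sometimes called the universal property of isometries.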

Corollary 3.38 tells us some beautiful things about the group of isometries of a Riemann manifold.

Theorem 3.39. Let $M$ be a Riemann manifold of dimension $m$ with its Levi-Civita connection. Then

$$FM = \{(p, v_1, \cdots, v_m) : p \in M,\ v_1, \cdots, v_m \in T_pM \text{ orthonormal}\}$$

is a smooth manifold of dimension $m + \binom{m}{2}$. If $f : M \to N$ is an isometry, there is an induced diffeomorphism $f_* : FM \to FN$ given by

$$f_*(p, v_1, \cdots, v_m) = (f(p), Df_p(v_1), \cdots, Df_p(v_m)).$$

Denote the group of isometries of $M$ by $\mathrm{Isom}(M)$. If $M$ is connected and $P \in FM$, the map

$$\mathrm{Isom}(M) \to FM$$

given by sending $f \in \mathrm{Isom}(M)$ to $f_*(P) \in FM$ is a one-to-one function.

Conceptually Theorem 3.39 is important because it says the group of isometries of a Riemann manifold (given the compact-open topology) is finite dimensional, and it gives an upper bound of $m + \binom{m}{2}$ on the dimension of the isometry group of a Riemann manifold. Moreover, it also gives us a criterion to know when we have found all the isometries of a manifold. If the isometries you know can send any point to any other and, moreover, can send any orthonormal basis to any other, then you have found all the isometries of the manifold.

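Corollary 3.40.

$$\mathrm{Isom}(\mathbb{R}^n) = \{f : \mathbb{R}^n \to \mathbb{R}^n : f(x) = Ax + p \text{ where } A \in O_n,\ p \in \mathbb{R}^n\}$$

$$\mathrm{Isom}(S^n) = O_{n+1}$$

And the group of isometries of hyperbolic space $H^n$ is the Lorentz group,

$$\mathrm{Isom}(H^n) = O_{1,n}$$

where $O_{1,n} = \{A \in GL_{n+1}\mathbb{R} : A^t J A = J,\ a_{11} > 0\}$ and $J = \mathrm{diag}(-1, 1, 1, \cdots, 1)$.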
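The defining condition $A^t J A = J$, $a_{11} > 0$ is easy to test numerically. The following is a small sketch only: the helper names `in_lorentz_group` and `boost` are hypothetical, and the hyperbolic rotation used as the example is a standard element of $O_{1,2}$ rather than something taken from the notes.

```python
import math


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def in_lorentz_group(a, tol=1e-9):
    """Check A^t J A = J with J = diag(-1, 1, ..., 1), together with a_11 > 0."""
    n = len(a)
    j = [[(-1.0 if r == 0 else 1.0) if r == c else 0.0 for c in range(n)]
         for r in range(n)]
    lhs = matmul(matmul(transpose(a), j), a)
    preserves_form = all(abs(lhs[r][c] - j[r][c]) < tol
                         for r in range(n) for c in range(n))
    return preserves_form and a[0][0] > 0


def boost(s):
    """Hyperbolic rotation in the first two coordinates of R^3."""
    return [[math.cosh(s), math.sinh(s), 0.0],
            [math.sinh(s), math.cosh(s), 0.0],
            [0.0, 0.0, 1.0]]


print(in_lorentz_group(boost(1.3)))
print(in_lorentz_group([[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]))
```

The second matrix preserves the form $J$ but has $a_{11} < 0$, so it swaps the two sheets of the hyperboloid and fails the membership test.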


The metric in a Riemann metric

Definition 3.41. A manifold $N$ with connection $c$ is called geodesically complete if every maximal geodesic has domain $\mathbb{R}$. Equivalently, if the exponential map $\exp : TN \to N$ is defined on the entire tangent bundle.

Definition 3.42. Given a connected Riemann manifold $(N, \mu)$, and any two points $p, q \in N$ define their distance to be

$$d(p, q) = \inf\{\mathrm{Length}(\gamma) : \gamma : [0, 1] \to N,\ \gamma(0) = p,\ \gamma(1) = q\}.$$

The above definition is the reason why the function $\mu : TN \oplus TN \to \mathbb{R}$ for Riemann manifolds is called the Riemann metric, in that $\mu$ induces the function $d$. If $N$ is not connected, typically one defines $d(p, q) = \infty$ provided there is no path between $p$ and $q$. $d(p, q)$ is sometimes called the intrinsic metric or path metric on $(N, \mu)$, to distinguish it from the Euclidean (ambient) metric $|p - q|$, where $|\cdot| : \mathbb{R}^k \to \mathbb{R}$ is the standard norm on $\mathbb{R}^k$ and $N \subset \mathbb{R}^k$.

Proposition 3.43. On a Riemann manifold $(N, \mu)$, the distance function $d$ is a metric, meaning:

(1) $d(x, y) + d(y, z) \ge d(x, z)$ $\forall x, y, z \in N$.
(2) $d(x, y) = d(y, x)$ $\forall x, y \in N$.
(3) $d(x, y) \ge 0$ $\forall x, y \in N$ and is equal to 0 if and only if $x = y$.

Moreover, the metric is compatible with the regular topology on $N$, meaning $U \subset N$ is relatively open if and only if for all $p \in U$, there is some $\varepsilon > 0$ such that $B(p, \varepsilon) \subset U$, where $B(p, \varepsilon) = \{q \in N : d(p, q) < \varepsilon\}$, and moreover $B(p, \varepsilon)$ is relatively open in $N$ for all $p \in N$ and $\varepsilon > 0$.

Proof. Sketch. For (1) use concatenation of paths; to do this smoothly one needs bump functions. Alternatively, one could use piecewise smooth paths in the definition of $d(p, q)$ and argue the two definitions are equivalent. For (2) use path reversal, and for (3) the mean value theorem for integrals. To show compatibility of the topologies, let $U \subset N$ be relatively open, in particular let $U = \{q \in \mathbb{R}^k : |p - q| < \delta\} \cap N$. We need to find some $B(p, \varepsilon) \subset U$. Let $U_{1/2} = \{q \in N : |p - q| \le \delta/2\}$. Then $U_{1/2} \subset U$, moreover

$$\{(q, v) \in TN : q \in U_{1/2},\ |v| = 1\}$$

is compact, where $|v|$ is the regular Euclidean length in $\mathbb{R}^k$, $N \subset \mathbb{R}^k$. The function $f$ from this set to $\mathbb{R}$ given by $f(v) = \mu(v, v)$ is a smooth function on a compact set, so it has a maximum and a minimum. Let $m, M \in \mathbb{R}$ be the minimum and maximum of $f$ respectively. $m > 0$ since $\mu$ is a Riemann metric on $N$. In particular, this tells us that $m|v|^2 \le \mu(v, v) \le M|v|^2$ for any $v \in \pi_N^{-1}(U_{1/2})$. Therefore, since $q - p = \int_0^1 \gamma'(t)\,dt$,

$$|q - p| \le \int_0^1 |\gamma'(t)|\,dt \le \int_0^1 \frac{\sqrt{\mu(\gamma'(t), \gamma'(t))}}{\sqrt{m}}\,dt$$

where $\gamma : [0, 1] \to U_{1/2}$ is any path connecting $p$ to $q$. Since the number on the right can be arbitrarily close to $\frac{d(p,q)}{\sqrt{m}}$, we let $\varepsilon = \sqrt{m}\,\delta/2$.



Theorem 3.44. (Hopf-Rinow) A connected Riemann manifold $N$ with its Levi-Civita connection is geodesically complete if and only if the metric $d$ on $N$ is complete as a metric space. Moreover, for a geodesically complete manifold, if there exists a path between two points $p, q \in N$ then there exists a geodesic $\gamma : [0, 1] \to N$ such that $\gamma(0) = p$, $\gamma(1) = q$ and $L(\gamma) = d(p, q)$. Such a geodesic is called length-minimizing or minimal.

Example 3.45. Before we begin the proof of Hopf-Rinow, notice some consequences. Sn is compact,

so all Cauchy sequences are convergent. So any Riemann metric on Sn is complete. In particular,

the standard Riemann metric on Sn is complete. So we can compute d(u, v) as the length of the

shortest geodesic between u, v ∈ Sn. Since the geodesics in Sn are simply the great circles, this tells

us d(u, v) = arccos(u · v) where u · v is the standard inner product in Rn+1. There is a similar formula

in hyperbolic space.
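As a sanity check on $d(u, v) = \arccos(u \cdot v)$, the sketch below compares the formula with a polyline approximation of the length of the great-circle arc from $u$ to $v$. The function names are hypothetical, and the parametrization $\gamma(t) = \frac{\sin((1-t)\theta)\,u + \sin(t\theta)\,v}{\sin\theta}$ of the arc (with $\theta = \arccos(u \cdot v)$) is a standard one assumed here, valid when $u$ and $v$ are neither equal nor antipodal.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sphere_distance(u, v):
    """Intrinsic distance between unit vectors u and v on the unit sphere."""
    # Clamp to [-1, 1] to guard against rounding slightly outside the domain of acos.
    return math.acos(max(-1.0, min(1.0, dot(u, v))))


def great_circle_point(u, v, t):
    """Point at parameter t in [0, 1] on the short great-circle arc from u to v.

    Assumes u and v are unit vectors that are neither equal nor antipodal.
    """
    theta = sphere_distance(u, v)
    a = math.sin((1 - t) * theta) / math.sin(theta)
    b = math.sin(t * theta) / math.sin(theta)
    return [a * x + b * y for x, y in zip(u, v)]


def polyline_length(u, v, steps=2000):
    """Approximate the arc length by summing chord lengths between sample points."""
    pts = [great_circle_point(u, v, k / steps) for k in range(steps + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))


u = [1.0, 0.0, 0.0]
v = [0.0, 1.0, 0.0]
print(sphere_distance(u, v), polyline_length(u, v))
```

Both printed numbers should be close to $\pi/2$; the polyline value is slightly smaller because chords are shorter than the arcs they span.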

Proof. We first prove that geodesic completeness implies $(N, d)$ is complete as a metric space. This has two parts. Given $r > 0$ define property $M_r$ to be

$$M_r = (\forall q \in D_r(p)\ \exists \text{ a minimal geodesic connecting } p \text{ to } q)$$

where $D_r(p) = \{q \in N : d(p, q) \le r\}$. The first part of the proof is to show $M_r$ is true for some $r > 0$. The second part is to show it is true for all $r$.

Part 1. The exponential map $\exp : T_pN \to N$ satisfies that $D(\exp)_0 : T_0T_pN \equiv T_pN \to T_pN$ is the identity map. By the inverse function theorem, there is some $r > 0$ so that $\exp$ when restricted to $B_r = \{v \in T_pN : |v| < r\}$ is a diffeomorphism onto its image. We will argue that $M_{r'}$ is true for any $r' < r$. Let $S_p = \{v \in T_pN : |v| = 1\}$. Let $f : \mathbb{R} \times S_p \to N$ be defined as $f(t, v) = \exp(tv)$. By design $f$ restricted to $(0, r) \times S_p$ is a diffeomorphism onto its image, so we can consider the `unit radial' vector field $\vec{r} = \frac{\partial f}{\partial t} \circ f^{-1} : \exp(B_r \setminus \{0\}) \to T(\exp(B_r \setminus \{0\}))$. By design if $\gamma : [0, 1] \to \exp(B_r)$ is a smooth path with $\gamma(0) = p$ and $\gamma(1) = q$ then

$$\mathrm{Length}(\gamma) = \int_0^1 \left|\frac{\partial \gamma}{\partial t}\right| dt = \int_0^1 \sqrt{\mu\left(\vec{r} \circ \gamma, \frac{\partial \gamma}{\partial t}\right)^2 + \left|\pi_{\vec{r}^\perp} \frac{\partial \gamma}{\partial t}\right|^2}\,dt$$

where $\pi_{\vec{r}^\perp}$ is meant to be the orthogonal projection onto the orthogonal complement of $\vec{r}$. Clearly this length is at least as long as the geodesic connecting $p$ and $q$, $\exp(t \exp^{-1}(q))$; moreover these lengths are equal only when the orthogonal projection is everywhere zero, which means $\gamma(t) = \exp(t \exp^{-1}(q))$.

Part 2. Let $R = \sup\{r : M_r \text{ is true}\}$. If $R$ were finite, first observe that $\partial D_R(p)$ must be compact, and $\partial D_R(p) \subset \{q \in N : d(p, q) = R\}$. The key idea in arguing that this set is compact is to consider a sequence in $\partial D_R(p)$; by design there are points $q_i$ near this sequence with $d(p, q_i) < R$, so there is a minimal geodesic $\gamma_i$ connecting $p$ to $q_i$. Since $S_p$ is compact, the velocity vectors $\frac{\partial \gamma_i}{\partial t}(0)$ have a convergent subsequence. We leave the rest to the reader. So we can apply Part 1 to all points $q \in \partial D_R(p)$. The compactness of $\partial D_R(p)$ tells us that there is an $\varepsilon > 0$ so that if $q \in \partial D_R(p)$ and $d(q', q) < \varepsilon$ then $q$ and $q'$ are connected by a minimal geodesic. So given $q' \in N$ whose distance from $\partial D_R(p)$ is less than $\varepsilon$, let $q \in \partial D_R(p)$ be a closest point to $q'$. By our compactness argument, there is a minimal geodesic from $p$ to $q$, and further one from $q$ to $q'$. These geodesics share a common tangent vector (direction) at $q$, since if they did not we could argue there is a closer point to $q'$ in $\partial D_R(p)$. This is a `rounding the corner' argument which we will describe in more detail once we get to normal coordinates.

To argue that (N, d) is a complete metric space, take a Cauchy sequence pi in N. Then d(pi , p1) is

Cauchy in R so it is convergent, in particular pi ∈ Dr (p1) for some r > 0. Since Dr (p1) is compact (it

is the continuous image of a compact set under the exponential map), the sequence pi is convergent.


To argue the converse, that $(N, d)$ complete implies $N$ geodesically complete, take a maximal geodesic $\gamma : (a, b) \to N$ where either $a$ or $b$ is finite. Without loss of generality, assume $b$ is finite. If $t_i \in (a, b)$ with $\lim_{i \to \infty} t_i = b$ then $\gamma(t_i)$ is Cauchy in $N$, therefore convergent. Similarly, $\frac{\partial \gamma}{\partial t}(t_i)$ converges to some vector $v \in TN$. But $\exp((t - b)v)$ is defined, and this is a geodesic that extends $\gamma$ at $b$.

If one thinks of manifolds as `universes,' then the lack of geodesic completeness means that the universe unexpectedly ceases to be. So the standard setting for most differential geometry is that of complete Riemannian manifolds.

There are many types of incomplete Riemann manifolds. The simplest example would be to take

any complete Riemann manifold like Rn, or Sn or Hn and remove a closed subset. This is perhaps the

most vanilla type of incomplete Riemann manifold. In the theory of metric spaces, there is the notion

of the completion of a metric space. If (X, d) is a metric space, there exists (X ′, d) and a 1−1 metric

embedding X → X ′ which preserves all distances, moreover, (X ′, d) is complete. If one removes a

closed subset from Rn or Sn or Hn, provided the closed subset has no interior, the completion, as a

metric space is your space before you removed the closed subset.

There are much more interesting incomplete Riemann manifolds. For example,

$$\{(x, y, z) \in \mathbb{R}^3 : z^2 = x^2 + y^2,\ z > 0\}$$

with the induced Riemann metric. This is incomplete because the lines $\gamma(t) = (t \cos\theta, t \sin\theta, t)$ are geodesics for all $\theta$, as a function of $t \in (0, \infty)$. But these geodesics do not extend to $t = 0$. Moreover, the completion of this metric space is

$$\{(x, y, z) \in \mathbb{R}^3 : z^2 = x^2 + y^2,\ z \ge 0\}$$

which is not (naturally) a manifold.

There are more extreme examples of incomplete manifolds, which you will perhaps see in the

homework.

Exercises

Problem 3.46. Read the section on stereographic projection and inversion in the Differential Geometry notes (Example 3.7). Prove that inversion $i_{p,r} : \mathbb{R}^n \setminus \{p\} \to \mathbb{R}^n \setminus \{p\}$ is a conformal diffeomorphism. Verify that stereographic projection satisfies $\pi_p(x) = i_{p,\sqrt{2}}(x)$ for all $x \in S^{n-1} \setminus \{p\}$, and so stereographic projection is also conformal.

Problem 3.47. Consider the function $f : S^2 \to \mathbb{R}^4$ given by

$$f(x, y, z) = \left(xy,\ xz,\ \frac{y^2 - z^2}{2},\ yz\right)$$

a) Show that $f$ is an immersion.

b) Show that $f(x_1, y_1, z_1) = f(x_2, y_2, z_2)$ if and only if $(x_1, y_1, z_1) = \pm(x_2, y_2, z_2)$.

c) Use the mono-split theorem (and some sweat) to prove that the image of $f$ is a submanifold of $\mathbb{R}^4$.

d) Is $f$ conformal? If so, compute the conformal factor.

Note: the image of $f$ in Problem 3.47 is called the real projective plane.

Problem 3.48. Recall the football connection on R3 given by c : T 2R3 ≡ (R3)4 → TR3 ≡ (R3)2,

c(p, v , w, y) = (p, v × w + y).

(a) Show that Euclidean straight lines are geodesics with respect to the football connection.

(b) Compute the holonomy along a straight line according to the football connection.

(c) Compute the holonomy along a round circle according to the football connection.

Problem 3.49. Finish the proof that the map from the Poincaré ball $B^n$ to $H^n$ given by

$$(x_1, \cdots, x_n) \longmapsto \frac{1}{1 - \sum_{i=1}^n x_i^2}\left(1 + \sum_{i=1}^n x_i^2,\ 2x_1, 2x_2, \cdots, 2x_n\right)$$

is an isometry.
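Before attempting the isometry claim it can be reassuring to confirm that the map actually lands in hyperbolic space. The sketch below assumes $H^n$ is the sheet $\{y : -y_0^2 + y_1^2 + \cdots + y_n^2 = -1,\ y_0 > 0\}$ of the hyperboloid, consistent with $J = \mathrm{diag}(-1, 1, \cdots, 1)$ above; the function names are hypothetical, and the check says nothing about the metric itself.

```python
def ball_to_hyperboloid(x):
    """Map a point x of the open unit ball to R^{n+1} using the formula of Problem 3.49."""
    s = sum(c * c for c in x)
    if s >= 1.0:
        raise ValueError("point must lie in the open unit ball")
    scale = 1.0 / (1.0 - s)
    return [scale * (1.0 + s)] + [scale * 2.0 * c for c in x]


def minkowski(u, v):
    """Bilinear form with signature (-, +, ..., +), i.e. the form given by J."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))


y = ball_to_hyperboloid([0.3, -0.5])
print(minkowski(y, y))  # close to -1
print(y[0] > 0)
```

The identity behind the check is $(1 + s)^2 - 4s = (1 - s)^2$ with $s = \sum_i x_i^2$.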

Problem 3.50. The flat torus is $S^1 \times S^1 \subset \mathbb{R}^4$ with the induced Riemann metric and connection. Prove that for any $p, q \in \mathbb{R}$, $\gamma : \mathbb{R} \to S^1 \times S^1$ defined by $\gamma(t) = (e^{ipt}, e^{iqt})$ is a geodesic in $S^1 \times S^1$. Further, argue that if $p$ and $q$ are not rational multiples of each other, then $\gamma$ is a one-to-one immersion, but not an embedding. Hint: I think this is most easily handled by thinking of the Ehresmann connection as being given by orthogonal projection.

Problem 3.51. Show that $\mathbb{R}^2$ with the metric $\mu : T\mathbb{R}^2 \oplus T\mathbb{R}^2 \to \mathbb{R}$ given by

$$\mu(p, v, w) = e^x (v \cdot w)$$

where $p = (x, y)$ is incomplete. Here $v \cdot w$ is the standard Euclidean dot product. Note: You will have to compute the Christoffel symbols for the Levi-Civita connection to determine the geodesics.

Problem 3.52. Prove that if $M$ is a 2-dimensional Riemann manifold, and if a 1-dimensional submanifold $C \subset M$ is fixed by a non-trivial isometry $i : M \to M$, i.e. $i \ne \mathrm{Id}_M$ yet $i(p) = p$ for all $p \in C$, then $C$, suitably parametrized, is a geodesic in $M$.

Problem 3.53. Let $\omega$ be the 1-form on $\mathbb{R}^2 \setminus \{(0, 0)\}$ given by

$$\omega(x, y) = \frac{-y}{x^2 + y^2}\,dx + \frac{x}{x^2 + y^2}\,dy$$

(a) Compute $\int_C \omega$ where $C$ is any circle of radius $R$ centred at the origin.

(b) Check that $d\omega = 0$.

(c) Show that if you restrict $\omega$ to the half-plane $P = \{(x, y) \in \mathbb{R}^2 : x > 0\}$, then $\omega|_P = df$ for some $f : P \to \mathbb{R}$. Compute an explicit formula for one such $f(x, y)$.

(d) Show that $\omega$ can not be the exterior derivative of any smooth function $f : \mathbb{R}^2 \setminus \{(0, 0)\} \to \mathbb{R}$. Hint: let $\gamma : [0, 1] \to \mathbb{R}^2 \setminus \{(0, 0)\}$ be $\gamma(t) = e^{2\pi i t}$ and consider $\int_{[0,1]} \gamma^*\omega$.

Problem 3.54. For each of the two manifolds $N = S^2$ and $N = H^2$, choose a point $p \in N$ and compute the pull-back $\exp^*(\mu)$ where $\mu : TN \oplus TN \to \mathbb{R}$ is the standard Riemann metric on $S^2$ and $H^2$ respectively, and $\exp : T_pN \to N$ (i.e. we restrict the exponential map to an individual tangent space). In both cases, determine the injectivity radius,

$$\sup\{r \in \mathbb{R} : \exp| : B_r(p) \to N \text{ is an isometry to its image}\}$$

where $B_r(p) = \{v \in T_pN : |v| < r\}$. Here `compute' means write out an explicit formula for $\exp^*(\mu)_q(v, w)$ so that it's clear exactly which bilinear function this is for all $q \in B_r(p)$. The Riemann metric on $B_r(p)$ is the pull-back metric.

Problem 3.55. On $S^2$, given $0 < r < \pi$, let $C_r = \{(x, y, z) \in S^2 : x = \cos(r)\}$; we call $C_r$ the round circle centred at $(1, 0, 0)$ of radius $r$. One reason for this definition is that $C_r = \exp_{(1,0,0)}(\{v : |v| = r\})$. Compute the holonomy about $C_r$. Hint: Define $\gamma(t, r) = (\cos r, \sin r \cos t, \sin r \sin t)$, then as a function of $t$ this is a parametrization of $C_r$, moreover, $\mathbb{R}^3$ has $\gamma(t, r)$, $\frac{\partial \gamma}{\partial t}(t, r)$, $\frac{\partial \gamma}{\partial r}(t, r)$ as a basis for every $(t, r)$. So if $(\gamma(t), v(t))$ is a parallel vector field, we can write $v(t)$ as $v(t) = x(t)\frac{\partial \gamma}{\partial t}(t, r) + y(t)\frac{\partial \gamma}{\partial r}(t, r)$ for $x(t)$ and $y(t)$ any $2\pi$-periodic functions.

Problem 3.56. Let $H = \{(x, y) \in \mathbb{R}^2 : x > 0\}$. Give $H$ the Riemann metric (in `traditional' notation)

$$x\,dx^2 + dy^2$$

In our more explicit notation, we would say

$$\mu((x, y), (v_1, v_2), (w_1, w_2)) = x v_1 w_1 + v_2 w_2.$$

Compute the Christoffel symbols for the Levi-Civita connection and check that the $x$-axis is a geodesic (suitably parametrized). Is $H$ complete with this Riemann metric?

Problem 3.57. Consider two objects moving in the plane $\mathbb{R}^2$. The first object (the `horse') moves along the $x$-axis, and it tows the `plow' via a rigid rod of unit length. This means the two objects always maintain a unit distance, and the velocity vector of the plow points directly at the horse. If the horse starts at $(0, 0)$ and the plow at $(0, 1)$ respectively, the contour traced out by the plow as the horse moves along the $x$-axis is called the tractrix.

a) Argue that if $\gamma(t) = (x(t), y(t))$ is the unit speed parametrization of the tractrix, then $\frac{dy}{dt} = -y$.

b) Argue that $\gamma(t) = (t - \tanh t, \frac{1}{\cosh t})$ is a parametrization of the tractrix.

The pseudo-sphere $P$ is the surface of rotation in $\mathbb{R}^3$ generated by rotating the tractrix about the $x$-axis. The cusp of the pseudo-sphere is the points on the surface of rotation corresponding to $t = 0$.

$$P = \left\{(x, y, z) \in \mathbb{R}^3 : x = t - \tanh t,\ y^2 + z^2 = \frac{1}{\cosh^2 t} \text{ where } t \in \mathbb{R}\right\}$$

c) Argue that $\{(x, y, z) : x > 0, \text{ and } (x, y, z) \in P\}$ is a 2-dimensional manifold. We further consider it to be a Riemann manifold by restricting the Euclidean metric to $P$.

d) Compute the Riemann metric $f^*\mu$ on the domain of $f$, where

$$f(t, \theta) = (x(t), y(t)\cos\theta, y(t)\sin\theta)$$

where $\gamma(t) = (x(t), y(t))$ is the unit speed parametrization of the tractrix. Notice that this metric can be written as

$$\frac{1}{t^2}(dt^2 + d\theta^2)$$

Problem 3.58. In a connected, complete Riemann manifold, the exponential map $\exp : T_pN \to N$ is known to be onto. If this manifold is compact, one can imagine $\exp(B_r(p))$ as the result of blowing up a balloon, centred at $p \in N$. Since the exponential map is onto, every point in $N$ is in $\exp(B_r(p))$ for suitably-large $r$. But for a compact manifold, some finite $r$ will do; this is called the diameter of the manifold. If we imagine inflating an actual balloon, at some point the balloon will come into contact with itself. Eventually the `interior' of the balloon will fill the entire manifold. This must give a decomposition of $N$ as a compact ball with some identifications on the boundary. This is called the Dirichlet Domain for the manifold. Specifically, define the Dirichlet domain of $N$, centred at $p$, to be the subspace of $T_pN$ consisting of all $v \in T_pN$ such that there is a unique length-minimizing geodesic between $p$ and $\exp(v)$, and the geodesic is $\exp(tv)$ where $t \in [0, 1]$. Determine the Dirichlet domain for $S^1 \times S^1 \subset \mathbb{R}^4$, i.e. the flat torus.


Differential Geometry - Coarse Outline Curvature, first pass

4. Playing with curvature

We start off looking at a few possible ideas for curvature, via some naive computations.

Example 4.1. Consider the 2-sphere $S^2$. Let $C_r$ be a circle of (Hopf-Rinow) radius $r$ centred about the point $(0, 0, 1) \in S^2$, so $C_r = \{(x, y, z) \in S^2 : z = \cos r\}$. This is a Euclidean circle of radius $\sin r$, so $\mathrm{Length}(C_r) = 2\pi \sin(r)$. A direct computation shows that if $q \in C_r$ the holonomy around $C_r$ is the rotation

$$R_\theta : T_qS^2 \to T_qS^2$$

where $\theta = -2\pi \cos r$, where the angle $\theta$ is in the same orientation as the parametrization of $C_r$. The angle only makes sense modulo $2\pi$ so perhaps it would be best to describe the holonomy as `multiplication by $e^{-2\pi i \cos r} = e^{2\pi i (1 - \cos r)}$'. Notice if $r$ is small, this is a similarly small rotation, in the same direction as the circle orientation.
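The `direct computation' can be imitated numerically. The following sketch is only an illustration under stated assumptions: it uses the induced connection on $S^2 \subset \mathbb{R}^3$, for which a tangent field $v(t)$ along $\gamma$ is parallel exactly when $v'$ is normal to the sphere, i.e. $v' = -(v \cdot \gamma')\gamma$, and it integrates this equation with a classical fourth-order Runge-Kutta scheme. The function name `parallel_transport_latitude` and the step count are hypothetical choices. Only the cosine of the holonomy angle is compared, so the direction of rotation is not tested.

```python
import math


def parallel_transport_latitude(r, v0, steps=4000):
    """Parallel transport v0 once around C_r = {z = cos r} on the unit sphere.

    v0 must be tangent to the sphere at gamma(0) = (sin r, 0, cos r).
    Returns the transported vector at t = 2*pi.
    """
    def gamma(t):
        return (math.sin(r) * math.cos(t), math.sin(r) * math.sin(t), math.cos(r))

    def gamma_dot(t):
        return (-math.sin(r) * math.sin(t), math.sin(r) * math.cos(t), 0.0)

    def rhs(t, v):
        # v' = -(v . gamma') gamma keeps v tangent and makes its tangential derivative zero.
        lam = -sum(a * b for a, b in zip(v, gamma_dot(t)))
        return tuple(lam * a for a in gamma(t))

    h = 2 * math.pi / steps
    v, t = tuple(v0), 0.0
    for _ in range(steps):
        k1 = rhs(t, v)
        k2 = rhs(t + h / 2, tuple(a + h / 2 * b for a, b in zip(v, k1)))
        k3 = rhs(t + h / 2, tuple(a + h / 2 * b for a, b in zip(v, k2)))
        k4 = rhs(t + h, tuple(a + h * b for a, b in zip(v, k3)))
        v = tuple(a + h / 6 * (b + 2 * c + 2 * d + e)
                  for a, b, c, d, e in zip(v, k1, k2, k3, k4))
        t += h
    return v


r = 1.0
v_start = (0.0, 1.0, 0.0)  # unit tangent to C_r at gamma(0)
v_end = parallel_transport_latitude(r, v_start)
cos_holonomy = sum(a * b for a, b in zip(v_start, v_end))
print(cos_holonomy, math.cos(2 * math.pi * math.cos(r)))
```

The two printed values should agree to many digits, consistent with a rotation by $-2\pi\cos r$ modulo $2\pi$.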

For comparison's sake, the holonomy about any closed curve in $\mathbb{R}^n$ is trivial. One way to interpret the above computation is that when $r$ is small, $C_r$ looks much like a circle in the (Euclidean) plane. In the Euclidean plane, if one performs parallel transport around a circle, the parallel vector `turns to the right' from the perspective of the circle's tangent vector, and it turns at a rate of one radian per radian of the circle. But as $r$ increases (towards $\pi/2$), $C_r$ looks more and more like a geodesic, so the tangent vector itself is parallel. In this case, parallel transport around $C_r$ results in a vector field that turns less rapidly in relation to $C_r$'s tangent vectors. Hyperbolic geometry turns out to be the opposite.

Example 4.2. Consider hyperbolic 2-space $H^2$, and let $C_r$ be the circle of (Hopf-Rinow) radius $r$ centred about the point $(1, 0, 0) \in H^2$, so $C_r = \{(z, x, y) \in H^2 : z = \cosh(r)\}$ and $\mathrm{Length}(C_r) = 2\pi \sinh(r)$. A direct computation shows that if $q \in C_r$ the holonomy around $C_r$ is the rotation

$$R_\theta : T_qH^2 \to T_qH^2$$

where $\theta = -2\pi \cosh(r)$. Again, this angle only makes sense modulo $2\pi$, so we should think of it as multiplication by $e^{(1 - \cosh r)2\pi i}$. So in this geometry, small circles are a little longer than in Euclidean space, and holonomy about small circles is small, but in the opposite direction as the circle's orientation.

The previous two examples were ones where the holonomy was local in nature. We will describe a

manifold where there is non-trivial holonomy, but it is purely global in nature.

Example 4.3. The $k$-twisted band $M_k = \{(z_1, z_2) : |z_1| = 1,\ z_2 = t z_1^{k/2}\} \subset \mathbb{C}^2$ has trivial holonomy for all closed curves except when $k$ is odd. In this case, the holonomy about $S^1 \times \{0\}$ is a mirror reflection, but the holonomy about small loops is trivial.

Deeper into holonomy

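Given two paths $\gamma_1, \gamma_2 : [0, 1] \to N$ that both start at $p$ and end at $q \in N$, the question of whether they have the same holonomy is, by the groupoid property, equivalent to showing $\mathrm{Hol}(\overline{\gamma_2} \cdot \gamma_1, v) = v$ for all $v \in T_pN$ (with $\overline{\gamma_2}$ denoting $\gamma_2$ traversed in reverse). Here $\overline{\gamma_2} \cdot \gamma_1 \in \Omega_{p,p}N$, i.e. it is a closed loop. In this section we compute the holonomy over `small' closed loops.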

Definition 4.4. Given two tangent vectors $v, w \in TN$ over $p$ (meaning $\pi_N(v) = p = \pi_N(w)$) the box in $N$ spanned by $v$ and $w$ is the function

$$\gamma_{v,w} : \mathbb{R}^2 \to N$$

given by

$$\gamma_{v,w}(x, y) = \exp(xv + yw).$$

$\gamma_{v,w}$ satisfies $\frac{\partial \gamma_{v,w}}{\partial x}(0, 0) = v$ and $\frac{\partial \gamma_{v,w}}{\partial y}(0, 0) = w$.

We will consider rectangles $R_{a,b} = [0, a] \times [0, b]$ and the holonomies over the boundaries of such rectangles, $\gamma_{v,w}|_{\partial R_{a,b}}$.

Proposition 4.5. Given two tangent vectors $v, w \in TN$ over $p \in N$, the holonomy over the loop $\gamma_{v,w}|_{\partial R_{a,b}}$ is an element of the orthogonal group of $T_pN$ (or the general-linear group if the connection is not Levi-Civita). The second order Taylor expansion of the holonomy is given by

$$\mathrm{Hol}(\gamma_{v,w}|_{\partial R_{a,b}}) \simeq_{\varepsilon \cdot |(a,b)|^2} \mathrm{Id}_{T_pN} + ab\,R(v, w)$$

where $R : T_pN \times T_pN \to \mathrm{Hom}(T_pN, T_pN)$ is a well-defined bilinear anti-symmetric function taking its image in the (skew-adjoint if the connection is Levi-Civita) linear transformations of $T_pN$. It's conventional that when one evaluates $R(v, w)$ at $z \in T_pN$ one simply denotes it $R(v, w)z$.

Proof. Property (1), the anti-symmetry of $R$, follows from $\gamma_{w,v}$ being $\gamma_{v,w}$ pre-composed with mirror reflection across the diagonal of $\mathbb{R}^2$; the associated boundary rectangles are parametrized with the opposite orientation. Property (2) is literally the statement that $R$ takes values in the skew-adjoint linear transformations.

To see that the limit is well-defined, let $z_0 \in T_pN$ and define a 2-parameter holonomy $Z_{vw} : \mathbb{R}^2 \to TN$ by

$$Z_{vw}(x, y) = \mathrm{Hol}(\gamma_{v,w}|_{\{x\} \times [0,y]}, \mathrm{Hol}(\gamma_{v,w}|_{[0,x] \times \{0\}}, z_0)).$$

Similarly, define the 2-parameter holonomy

$$Z_{wv}(x, y) = \mathrm{Hol}(\gamma_{v,w}|_{[0,x] \times \{y\}}, \mathrm{Hol}(\gamma_{v,w}|_{\{0\} \times [0,y]}, z_0)).$$

If we think of the initial condition $z_0$ as being a variable, the above holonomies are 2-parameter families of isometries $Z_{vw} : \mathbb{R}^2 \times T_pN \to TN$ and $Z_{wv} : \mathbb{R}^2 \times T_pN \to TN$ where $\pi_N \circ Z_{vw} = \pi_N \circ Z_{wv}$. The original holonomy we were interested in is related to these via the identity

$$\mathrm{Hol}(\gamma_{v,w}|_{\partial R_{x,y}}) = Z_{wv}((x, y), \cdot)^{-1} \circ Z_{vw}((x, y), \cdot). \qquad (*)$$

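By design, the above holonomy is trivial if either $x = 0$ or $y = 0$, so the only second-order data near $(x, y) = (0, 0)$ is the 2nd order partial

$$\frac{\partial^2 \mathrm{Hol}(\gamma_{v,w}|_{\partial R_{x,y}})}{\partial x \partial y}(0, 0).$$

To compute this 2nd order partial, let's write $Z_{vw}(p) = (p, \vec{Z}(p))$ and $Z_{wv}(p) = (p, Z(p))$, and assume $N = U \subset \mathbb{R}^n$ is open. By formula (*) this 2nd order partial can be computed as

$$\frac{\partial^2 \mathrm{Hol}(\gamma_{v,w}|_{\partial R_{x,y}})}{\partial x \partial y}(0, 0) = \frac{\partial^2 \vec{Z}}{\partial x \partial y}(0, 0) - \frac{\partial^2 Z}{\partial x \partial y}(0, 0).$$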

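The differential equations satisfied by $Z_{vw}$ and $Z_{wv}$ can be written as

$$\begin{aligned}
\frac{\partial \vec{Z}}{\partial y} + \sum_{ij} \vec{\Gamma}_{ij}\, \vec{z}_i \frac{\partial \gamma_j}{\partial y} &= 0 \quad \forall x, y, &
\frac{\partial \vec{Z}}{\partial x} + \sum_{ij} \vec{\Gamma}_{ij}\, \vec{z}_i \frac{\partial \gamma_j}{\partial x} &= 0 \quad \text{for } y = 0, \\
\frac{\partial Z}{\partial x} + \sum_{ij} \vec{\Gamma}_{ij}\, z_i \frac{\partial \gamma_j}{\partial x} &= 0 \quad \forall x, y, &
\frac{\partial Z}{\partial y} + \sum_{ij} \vec{\Gamma}_{ij}\, z_i \frac{\partial \gamma_j}{\partial y} &= 0 \quad \text{for } x = 0,
\end{aligned}$$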


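where $\vec{\Gamma}_{ij} = \sum_k \Gamma^k_{ij} e_k = \nabla_{e_i} e_j$. Differentiating the two formulas on the left gives

$$\begin{aligned}
R(v, w)z &= \frac{\partial}{\partial y} \sum_{ij} \vec{\Gamma}_{ij}\, z_i \frac{\partial \gamma_j}{\partial x} - \frac{\partial}{\partial x} \sum_{ij} \vec{\Gamma}_{ij}\, \vec{z}_i \frac{\partial \gamma_j}{\partial y} \\
&= \sum_{ijk} \left(\partial_k \vec{\Gamma}_{ij}\, z_i \frac{\partial \gamma_j}{\partial x} \frac{\partial \gamma_k}{\partial y} - \partial_k \vec{\Gamma}_{ij}\, \vec{z}_i \frac{\partial \gamma_j}{\partial y} \frac{\partial \gamma_k}{\partial x}\right) + \sum_{ij} \left(\vec{\Gamma}_{ij} \frac{\partial z_i}{\partial y} \frac{\partial \gamma_j}{\partial x} - \vec{\Gamma}_{ij} \frac{\partial \vec{z}_i}{\partial x} \frac{\partial \gamma_j}{\partial y}\right)
\end{aligned}$$

and the two equations on the right allow us to solve for $\frac{\partial z_i}{\partial y}$ and $\frac{\partial \vec{z}_i}{\partial x}$ at $(x, y) = (0, 0)$, giving the formula

$$R(v, w)z = \sum_{ijk} \left(\partial_k \vec{\Gamma}_{ij}\, v_j w_k z_i - \partial_k \vec{\Gamma}_{ij}\, v_k w_j z_i\right) + \sum_{ijkl} \left(\vec{\Gamma}_{ij} \Gamma^i_{kl}\, v_l w_j z_k - \vec{\Gamma}_{ij} \Gamma^i_{kl}\, v_j w_l z_k\right).$$

At this stage we can see $R$ is well-defined and bilinear in $v$ and $w$.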

Another way to say Proposition 4.5 is that the second-order Taylor expansion of $\mathrm{Hol}(\gamma_{v,w}|_{\partial R_{a,b}}, \cdot)$ in $a$ and $b$ is $\mathrm{Id}_{T_pN} + ab\,R(v, w)$.

The function $R$ in Proposition 4.5 is called the Riemann curvature tensor. When looking at other references, be aware that some authors' conventions for $R$ agree with ours, while other authors use conventions where their Riemann curvature tensor would be equal to the negative of our $R$.

As Proposition 4.5 shows, the Riemann curvature tensor is an invariant of the connection $c : T^2N \to TN$, and at any given point it only depends on the connection and its derivative. So in principle, there should be a means of expressing $R$ directly in terms of $c$ and $Dc$, using only linear algebra. The next proposition outlines this observation in detail.

As we've observed, the double tangent bundle $T^2N$ has an involution $\iota_N : T^2N \to T^2N$ given by $\iota_N(p, v, w, y) = (p, w, v, y)$. $\iota_N$ gives rise to an involution of $T^3N$ by taking its derivative $D\iota_N : T^3N \to T^3N$. Any torsion-free linear Ehresmann connection $c : T^2N \to TN$ satisfies $Dc \circ D\iota_N = Dc$.

More generally, the triple tangent bundle $T^3N$ admits a group of six automorphisms. To see this, envision the double tangent vector $(p, v, w, y)$ as a square diagram

$$\begin{array}{ccc} p & \xrightarrow{\ 1\ } & v \\ {\scriptstyle 2}\downarrow & & \downarrow \\ w & \longrightarrow & y \end{array}$$

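In the diagram we think of $v$ as being the `first variation' of $p$, and $w$ the `second variation' of $p$, with $y$ being the 2nd order variation among the two first order variations. Then $\iota_N$ corresponds to the symmetry where one switches the first and second order variations. In that regard, a triple tangent vector has the form

[Cubical diagram: a cube with vertices $p, p_1, p_2, p_3, p_{12}, p_{13}, p_{23}, p_{123}$. The three edges leaving $p$ are labelled 1, 2 and 3 and lead to $p_1$, $p_2$ and $p_3$ respectively; $p_{123}$ is the vertex opposite $p$.]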

These six automorphisms of $T^3N$ correspond to the permutations of the edges labelled 1, 2 and 3, i.e. the elements of the symmetric group on three elements $\Sigma_3 = \mathrm{Bij}(\{1, 2, 3\})$, equivalently all the symmetries of the cube that leave the vertex labelled $p$ fixed. The permutations essentially just re-arrange which order one considers the first variations to be occurring in. Let's denote the automorphism that permutes via a permutation $\sigma$ by $\iota_\sigma$. Notice that $D\iota_N = \iota_{(12)}$, where we interpret $(12) \in \Sigma_3$ as a transposition.

Proposition 4.6. Let $c : T^2N \to TN$ be any torsion-free, linear Ehresmann connection. Let $\pi_i : T^3N \to TN$ for $i \in \{1, 2, 3\}$ be the bundle projection that forgets all but the $i$-th first-order variation. If $P$ is the cubical diagram above, then $\pi_i(P) = (p, p_i)$.

There exists a fibre-wise tri-linear function $R : TN \oplus TN \oplus TN \to TN$ satisfying

$$c \circ Dc - c \circ Dc \circ \iota_{(23)} = R(\pi_2, \pi_3, \pi_1),$$

i.e. $c(Dc(P)) - c(Dc(\iota_{(23)}(P))) = R(p_2, p_3, p_1) \equiv R(p_2, p_3)p_1$. $R$ is the Riemann curvature tensor of the connection, as in Proposition 4.5.

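Proof. We outline the proof, assuming (WLOG) that $N = U$ is an open subset of Euclidean space with some connection. We write the connection $c : T^2U \to TU$ as $c(p, p_1, p_2, p_{12}) = (p, p_{12} + f_p(p_1, p_2))$ where $f : U \to \mathrm{Bilin}(\mathbb{R}^n \times \mathbb{R}^n, \mathbb{R}^n)$ is smooth, taking its image in the symmetric bilinear functions. Then

$$Dc(P) = \begin{array}{ccc} p & \xrightarrow{\ 1\ } & p_{12} + f_p(p_1, p_2) \\ {\scriptstyle 2}\downarrow & & \downarrow \\ p_3 & \longrightarrow & p_{123} + Df_p(p_3)(p_1, p_2) + f_p(p_{13}, p_2) + f_p(p_1, p_{23}) \end{array}$$

which gives the formula for $c \circ Dc - c \circ Dc \circ \iota_{(23)}$,

$$P \longmapsto \left(p,\ Df_p(p_3)(p_1, p_2) - Df_p(p_2)(p_1, p_3) + f_p(f_p(p_1, p_2), p_3) - f_p(f_p(p_1, p_3), p_2)\right),$$

which is $R(p_2, p_3)p_1$.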

Example 4.7. On $S^n$, given $p \in S^n$ and $v, w, z \in T_pS^n$,

$$R(v, w)z = (v \cdot z)w - (w \cdot z)v.$$

On $H^n$, given $p \in H^n$ and $v, w, z \in T_pH^n$,

$$R(v, w)z = (w \cdot z)v - (v \cdot z)w$$

where in both identities $v \cdot z$ indicates the standard (Euclidean or Minkowski respectively) bilinear form.

The way to think of the formulas for the Riemann curvature tensors above is they are the skew-adjoint transformations of the tangent spaces corresponding to infinitesimal rotations in the $v, w$-plane (with no motion in the orthogonal complement). In the sphere, small loops give rise to holonomies that rotate slightly in the direction of the orientation of the loop (or in this case, rectangle), and in hyperbolic geometry the holonomy is in the opposite direction to the orientation of the loop.

We sketch the proof of the formula for $S^n$. The Levi-Civita connection on $S^n$ is

$$c(p, v, w, y) = (p, \pi_{T_pS^n} y) = (p, y - (y \cdot p)p) = (p, y + (v \cdot w)p)$$

where $y \cdot p$ is the Euclidean inner product on $\mathbb{R}^{n+1}$. The last step in the above equation follows from our computation of $T^2S^n$,

$$T^2S^n = \{(p, v, w, y) \in (\mathbb{R}^{n+1})^4 : p \cdot p = 1,\ p \cdot v = 0,\ p \cdot w = 0,\ v \cdot w + p \cdot y = 0\}.$$

The formula $c(p, v, w, y) = (p, y + (v \cdot w)p)$ gives a connection on $\mathbb{R}^{n+1}$ that extends the Levi-Civita connection on $S^n$. Since the curvature $R$ only depends on $Dc$, we can therefore compute $R$ for this extension using the formula at the end of the proof of Proposition 4.6, and restrict to $T^2S^n$, giving the curvature for $S^n$.
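A sketch of that computation, under the following assumptions: we take $f_p(v, w) = (v \cdot w)p$ in the notation $c(p, p_1, p_2, p_{12}) = (p, p_{12} + f_p(p_1, p_2))$ from the proof of Proposition 4.6, so that $Df_p(h)(v, w) = (v \cdot w)h$, and we take $p_1, p_2, p_3 \in T_pS^n$, so that $p \cdot p_i = 0$.

```latex
\begin{align*}
R(p_2, p_3)p_1
  &= Df_p(p_3)(p_1, p_2) - Df_p(p_2)(p_1, p_3)
     + f_p(f_p(p_1, p_2), p_3) - f_p(f_p(p_1, p_3), p_2) \\
  &= (p_1 \cdot p_2)\,p_3 - (p_1 \cdot p_3)\,p_2
     + (p_1 \cdot p_2)(p \cdot p_3)\,p - (p_1 \cdot p_3)(p \cdot p_2)\,p \\
  &= (p_2 \cdot p_1)\,p_3 - (p_3 \cdot p_1)\,p_2 .
\end{align*}
```

Setting $v = p_2$, $w = p_3$ and $z = p_1$ gives $R(v, w)z = (v \cdot z)w - (w \cdot z)v$, in agreement with Example 4.7.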

The next problem asks you to generalize Example 4.7.

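Problem 4.8. Let $M$ be the pre-image of a regular value of a smooth function, $M = f^{-1}(q)$, with $f : U \to \mathbb{R}$ smooth where $U \subset \mathbb{R}^n$ is open. Compute the Riemann curvature tensor for $M$ with the pull-back metric along the inclusion $i : M \to \mathbb{R}^n$, $i(p) = p$, with $\mathbb{R}^n$ given the standard Euclidean metric. Given $p \in M$ and $v, w, z \in T_pM$ argue that

$$R(v, w)z = \frac{1}{|\nabla f_p|^2}\left(Hf_p(z, v)\,Hf_p(w) - Hf_p(z, w)\,Hf_p(v)\right) - \frac{\nabla f_p}{|\nabla f_p|^4}\left(Hf_p(z, v)\,Hf_p(w, \nabla f_p) - Hf_p(z, w)\,Hf_p(v, \nabla f_p)\right).$$

(The sign of the second line is chosen so that $R(v, w)z \cdot \nabla f_p = 0$, i.e. so that $R(v, w)z$ is tangent to $M$.) In the above equation $\nabla f$ is the traditional gradient, i.e. $\nabla f : U \to \mathbb{R}^n$ is such that $\nabla f_p \cdot v = Df_p(v)$. The notation $Hf_p(v)$ is a synonym for $Hf_p(v) = D(\nabla f)_p(v)$, so $Hf_p(v) \cdot w = Hf_p(v, w)$, the Hessian. $v \cdot w$ is the standard Euclidean inner product.

Hint: Recall,

$$TM = \{(p, v) \in (\mathbb{R}^n)^2 : f(p) = q,\ Df_p(v) = 0\}$$

$$T^2M = \{(p, v, w, y) \in (\mathbb{R}^n)^4 : f(p) = q,\ Df_p(v) = Df_p(w) = 0,\ Hf_p(v, w) + Df_p(y) = 0\}$$

and extend the techniques of Example 4.7.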

The formula in Problem 4.8 is a little more complicated than what appears in Example 4.7, primarily because the second line in Problem 4.8 is zero in Example 4.7. If one is `reasonably thoughtful' about choosing $f$, one can always ensure that the second line is zero. The idea is that $D(\nabla f \cdot \nabla f)_p(v) = 2D(\nabla f)_p(v) \cdot \nabla f_p = 2Hf_p(v, \nabla f_p)$, so if $\nabla f$ is of constant length on $M$, this would be zero, giving a formula that quickly produces Example 4.7, as $f(p) = |p|^2$, $\nabla f_p = 2p$ and $Hf_p(v, w) = 2v \cdot w$ in that case.

Riemann and Sectional Curvature


Since the Riemann curvature tensor is a tri-linear function on every tangent space it can be interpreted as a function $T_pN \times T_pN \times T_pN \to T_pN$ for every $p \in N$ or globally as a smooth function $TN \oplus TN \oplus TN \to TN$. Much like how a bilinear form $V \times V \to \mathbb{R}$ gives us two adjoint maps $V \to V^*$, the Riemann curvature tensor can be viewed from many perspectives. We start this section off with the traditional interpretation of the tensor: given three vector fields $X, Y, Z \in \Gamma(TN)$, $R(X, Y)Z \in \Gamma(TN)$ makes sense as a vector field. We mention the basic results about this operation here.

Theorem 4.9. The Riemann curvature tensor, defined in Proposition 4.5, satisfies

\begin{align*}
R(X,Y)Z &= \nabla_Y\nabla_X Z - \nabla_X\nabla_Y Z - \nabla_{[Y,X]}Z \tag{0}\\
0 &= R(X,Y)Z + R(Y,X)Z \tag{1}\\
0 &= R(X,Y)Z + R(Y,Z)X + R(Z,X)Y \tag{2}\\
0 &= \mu(R(X,Y)Z,W) + \mu(R(X,Y)W,Z) \tag{3}\\
\mu(R(X,Y)Z,W) &= \mu(R(Z,W)X,Y) \tag{4}
\end{align*}

for arbitrary smooth vector fields $X, Y, Z, W \in \Gamma(TN)$.

Proof. Identity (0) is a common definition of the Riemann curvature tensor. It follows from a direct computation of the right-hand side, and is conceptually almost immediate, as in our definition of R it was clear R was the infinitesimal obstruction to two holonomies commuting, which in principle is the question of how mixed covariant differentiations interact. The only surprise should be the form in which the Lie bracket [Y,X] appears. In our definition of R the Lie bracket did not appear, but this was because the vector fields corresponding to Y and X in the definition commuted by design.

Identity (1) is the anti-symmetry property given in Proposition 4.5.

Identity (2) is called the first (or algebraic) Bianchi identity. It's reminiscent of the Jacobi identity for Lie brackets

\[ [X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] = 0 \]

for good reason, as it's more or less a `hidden' expression of it.

\begin{align*}
R(X,Y)Z + R(Y,Z)X + R(Z,X)Y &= \nabla_Y\nabla_X Z - \nabla_X\nabla_Y Z - \nabla_{[Y,X]}Z + \nabla_Z\nabla_Y X - \nabla_Y\nabla_Z X - \nabla_{[Z,Y]}X \\
&\quad + \nabla_X\nabla_Z Y - \nabla_Z\nabla_X Y - \nabla_{[X,Z]}Y \\
&= \nabla_Y[X,Z] + \nabla_Z[Y,X] + \nabla_X[Z,Y] - \nabla_{[Y,X]}Z - \nabla_{[Z,Y]}X - \nabla_{[X,Z]}Y \\
&= [Y,[X,Z]] + [Z,[Y,X]] + [X,[Z,Y]] = 0
\end{align*}

Line 1 to 2 is three applications of identity (0). Line 2 to 3 and line 3 to 4 are both applications of the fact that our connection is torsion free, and the very last line is zero since this is the Jacobi identity.

Identity (3) is the statement that if we view R as a map $T_pN \times T_pN \to T_{Id}O(T_pN)$, the target space is the tangent space to the orthogonal group of $T_pN$ at the identity, and its elements are the skew-symmetric linear maps.

Identity (4) is sometimes called the interchange symmetry of the tensor. Like identity (2) it might appear unmotivated on first inspection, but it turns out to be a consequence of (1), (2) and (3). Since it's a consequence of (1), (2) and (3), in principle one can ignore it, but it turns out to be convenient.


We give Milnor's derivation of (4). Given four vector fields X, Y, Z and W, there are precisely 4! = 24 ways one can insert them into the empty spots in the expression $\mu(R(\cdot,\cdot)\cdot,\cdot)$. Using relation (1) we can permute the first two entries (after negating the expression) and using (3) we can permute the last two entries, after negating the expression. These moves generate a group of 4 permutations acting on the 24 possible expressions. 24/4 = 6 is the number of equivalence classes of expressions $\mu(R(\cdot,\cdot)\cdot,\cdot)$ modulo exchanging 1 ↔ 2, 3 ↔ 4. There are precisely 4 distinct relations of type (2) in this case, distinguished by the rightmost entry $\mu(R(\cdot,\cdot)\cdot,*)$, and each such relation involves three of the 24/4 = 6 vertices, so our relations (2) form half of the faces of an octahedron! In the octahedron diagram, the yellow face corresponds to relation (2) with ∗ = W, green ∗ = Z, red ∗ = Y and blue ∗ = X.
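The counting in this paragraph can be checked by brute force. The sketch below is only an illustration: the helper names `orbit` and `bianchi_faces` are made up for this example. It enumerates the 24 arrangements, groups them modulo the swaps 1 ↔ 2 and 3 ↔ 4, and collects the four Bianchi relations as triples of equivalence classes.

```python
from itertools import permutations

def orbit(p):
    """All arrangements reachable from p by swapping slots 1<->2 and/or 3<->4."""
    a, b, c, d = p
    return frozenset({(a, b, c, d), (b, a, c, d), (a, b, d, c), (b, a, d, c)})

fields = ("X", "Y", "Z", "W")
arrangements = list(permutations(fields))
classes = {orbit(p) for p in arrangements}

def bianchi_faces():
    """One relation of type (2) per choice of rightmost entry."""
    faces = set()
    for last in fields:
        a, b, c = [f for f in fields if f != last]
        # Cyclically permute the first three slots, keep the last slot fixed.
        face = frozenset(orbit(p) for p in
                         [(a, b, c, last), (b, c, a, last), (c, a, b, last)])
        faces.add(face)
    return faces

faces = bianchi_faces()
print(len(arrangements), len(classes), len(faces))
```

The three printed numbers are the number of arrangements, the number of vertices, and the number of faces cut out by relation (2).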

So for example, for the green face we write

\[ \mu(R(X,Y)Z,W) \overset{(3)}{=} -\mu(Z,R(X,Y)W) \overset{(1)}{=} \mu(Z,R(Y,X)W) = \mu(R(Y,X)W,Z) \]

and then apply (2) to generate the entire face.

I encourage you to treat this diagram like a jigsaw puzzle, and verify the identity on your own. Line-by-line manipulations following this schematic are provided below.


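\begin{align*}
\mu(R(X,Y)Z,W) &= \frac{\mu(R(X,Y)Z,W)}{2} + \frac{\mu(R(X,Y)Z,W)}{2} \\
&= \frac{\mu(R(X,Y)Z,W)}{2} + \frac{\mu(Z,R(Y,X)W)}{2} \\
&= \frac{-\mu(R(Y,Z)X,W) - \mu(R(Z,X)Y,W)}{2} + \frac{-\mu(Z,R(X,W)Y) - \mu(Z,R(W,Y)X)}{2} \\
&= \frac{-\mu(R(Z,Y)W,X) - \mu(R(Y,W)Z,X)}{2} + \frac{-\mu(R(X,Z)W,Y) - \mu(R(W,X)Z,Y)}{2} \\
&= \frac{\mu(R(W,Z)Y,X)}{2} + \frac{\mu(R(Z,W)X,Y)}{2} = \mu(R(Z,W)X,Y)
\end{align*}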

In the above equations, the only sign changes occur on application of identity (2). Whenever we

apply relation (3) we correspondingly apply (1) to maintain signs.
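Identities (1) through (4) are purely algebraic, so they can be sanity-checked numerically on a model tensor. The sketch below is an illustration only. It assumes the Euclidean inner product for µ and uses the model tensor R(x, y)z = H(z, x)H(y) − H(z, y)H(x) built from a random symmetric matrix H, which has the shape of the first line of Problem 4.8 with the scalar factor dropped. The names `h`, `riem` and `rand_vec` are made up for this example.

```python
import random

random.seed(0)
n = 4  # dimension of the model tangent space (arbitrary choice)

def rand_vec():
    return [random.uniform(-1.0, 1.0) for _ in range(n)]

# A random symmetric matrix H plays the role of a Hessian.
A = [rand_vec() for _ in range(n)]
H = [[A[i][j] + A[j][i] for j in range(n)] for i in range(n)]

def h(x, y):
    """The symmetric bilinear form H(x, y)."""
    return sum(x[i] * H[i][j] * y[j] for i in range(n) for j in range(n))

def riem(x, y, z, w):
    """mu(R(x, y)z, w) for the model tensor R(x, y)z = H(z, x)H(y) - H(z, y)H(x)."""
    return h(z, x) * h(y, w) - h(z, y) * h(x, w)

X, Y, Z, W = (rand_vec() for _ in range(4))

# Left-hand side minus right-hand side of each identity; all should vanish.
residuals = {
    "(1)": riem(X, Y, Z, W) + riem(Y, X, Z, W),
    "(2)": riem(X, Y, Z, W) + riem(Y, Z, X, W) + riem(Z, X, Y, W),
    "(3)": riem(X, Y, Z, W) + riem(X, Y, W, Z),
    "(4)": riem(X, Y, Z, W) - riem(Z, W, X, Y),
}
for name, value in residuals.items():
    print(name, value)
```

Each residual should be zero up to floating-point rounding. This checks the algebra on one family of tensors and is not a substitute for the proof.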

In many references one will find the expression

\[ R(v,w)z = \sum_{ijk}\left(\partial_k\vec\Gamma_{ij}\,v_jw_kz_i - \partial_k\vec\Gamma_{ij}\,v_kw_jz_i\right) + \sum_{ijkl}\left(\vec\Gamma_{ij}\Gamma^i_{kl}\,v_lw_jz_k - \vec\Gamma_{ij}\Gamma^i_{kl}\,v_jw_lz_k\right) \]

written in a slightly different form. Since the Riemann tensor is multi-linear, one can describe it by a collection of smoothly-varying real valued functions $R^l_{ijk} : U \to \mathbb{R}$ by evaluating the tensor on a standard basis for $\mathbb{R}^n$. This approach to keeping track of multi-linear functions (tensors) is called the Ricci calculus. One writes $R(e_i,e_j)e_k = \sum_l R^l_{ijk}e_l$, giving

\[ R(e_i,e_j)e_k = \partial_j\vec\Gamma_{ki} - \partial_i\vec\Gamma_{kj} + \sum_m\left(\vec\Gamma_{mj}\Gamma^m_{ki} - \vec\Gamma_{mi}\Gamma^m_{kj}\right), \]

which when we substitute in $\vec\Gamma_{ij} = \sum_k \Gamma^k_{ij}e_k$ gives

\[ R^l_{ijk} = \partial_j\Gamma^l_{ki} - \partial_i\Gamma^l_{kj} + \sum_m\left(\Gamma^l_{mj}\Gamma^m_{ki} - \Gamma^l_{mi}\Gamma^m_{kj}\right). \]

One similarly defines $R_{ijkl} = \mu(R(e_i,e_j)e_k, e_l)$, which by design satisfies $R_{ijkl} = \sum_m R^m_{ijk}g_{lm}$. From this perspective the interchange identity states that $R_{ijkl} = R_{klij}$ and so if one is representing the Riemann tensor via the coefficients $R_{ijkl}$, the interchange identity is a time-saving device.

An important thing to note is that formulas such as the Bianchi identity are simply identities among multi-linear maps on all the tangent spaces $T_pN$. So in principle we could have made the proof of the Bianchi identity a one line proof, because given any three vectors $x, y, z \in T_pN$ we can extend them to a collection of commuting vector fields in a neighbourhood of $p \in N$. After substituting identity (0) into the Bianchi relation, the fact that the connection is torsion-free finishes the proof. Indeed, using arbitrary vector fields complicates the proof above.

Proposition 4.10. Given any two paths $\gamma_0, \gamma_1 \in \Omega_{p,q}$, where $\gamma_i : [0,1] \to N$, if the two paths are homotopic rel their endpoints and if the Riemann curvature tensor for N is zero, then

\[ Hol(\gamma_0, \cdot) = Hol(\gamma_1, \cdot). \]


Proof. The paths being homotopic is the statement that there is a smooth function

\[ \gamma : [0,1] \times [0,1] \to N \]

with $\gamma(0,\cdot) = \gamma_0$, $\gamma(1,\cdot) = \gamma_1$, and with $\gamma(t,0) = p$ and $\gamma(t,1) = q$ for all $t \in [0,1]$.

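To complete the proof, denote the variables for γ by (t, s). Let $v : [0,1]^2 \to TN$ be a smooth family of vectors over γ which is `parallel' in the s direction, meaning $\nabla_s v = c \circ \frac{\partial v}{\partial s} = 0_\gamma$, and with $v(t,0)$ independent of t. Since the Riemann tensor is zero, we have $\nabla_s\nabla_t v = \nabla_t\nabla_s v = \nabla_t 0_\gamma = 0_\gamma$, i.e. $\nabla_t v$ is parallel along each curve $s \mapsto \gamma(t,s)$. Since $\nabla_t v$ vanishes at s = 0, it vanishes everywhere. In particular $v(t,1) = Hol(\gamma(t,\cdot), v(t,0))$ is constant in t, as the only parallel vector over a constant base-point is a constant vector; this is the idempotence condition of a connection.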

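More generally, observe that if $v : \mathbb{R}^2 \to TN$ is smooth, if we denote the variables of $\mathbb{R}^2$ by (x, y), and if we let $\nabla_x v \equiv c \circ \frac{\partial v}{\partial x}$, then Proposition 4.7 states, in this notation, that

\[ \nabla_y\nabla_x v - \nabla_x\nabla_y v = R\left(\frac{\partial\gamma}{\partial x}, \frac{\partial\gamma}{\partial y}\right)v \]

where $\pi_N \circ v = \gamma$, $\pi_N : TN \to N$.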

A Riemann manifold N is said to be locally flat if every point $p \in N$ has a relatively open neighbourhood $U \subset N$ and an isometry $f : U \to V \subset \mathbb{R}^n$ where $V \subset \mathbb{R}^n$ has the pull-back metric from the Euclidean Riemann metric on $\mathbb{R}^n$.

Theorem 4.11. A Riemann manifold N is locally flat if and only if its Riemann curvature tensor is zero everywhere.

Proof. The Riemann tensor is zero on $\mathbb{R}^n$, so if the manifold is locally flat, the tensor must be zero, as the tensor is defined in terms of holonomy, which is an isometry invariant by the Fundamental Theorem of Riemannian Geometry.

If the Riemann tensor is zero, let r > 0 be such that the exponential map is a diffeomorphism from the ball $B_r$ of radius r in $T_pN$ to its image, let's call it $U \subset N$. Restrict the Riemann metric for N, $\mu : TN \oplus TN \to \mathbb{R}$, to $T_pN$, $\mu| : T_pN \times T_pN \to \mathbb{R}$. This turns $T_pN$ into an inner product (vector) space, and hence it is isometric to Euclidean space. We claim $\exp : B_r \to U$ is the isometry we're looking for. This amounts to checking that for any $q \in B_r$ and any $v, w \in T_pN$, $\mu(D\exp_q(v), D\exp_q(w)) = \mu(v,w)$.

Fix for the remainder of the proof $v, w, q \in T_pN$. Let $f(r) = \mu(D\exp_{rq}(v), D\exp_{rq}(w))$. Our goal is to show f is constant as a function of r.

Let $\gamma : \mathbb{R}^3 \to N$ be defined as $\gamma(r,x,y) = \exp(r(q + xv + yw))$. As a function of r, γ is the geodesic emanating from p with initial velocity vector $q + xv + yw$, so $\nabla_r\frac{\partial\gamma}{\partial r} = 0_\gamma$. But also $\frac{\partial\gamma}{\partial x}(r,0,0) = r\,D\exp_{rq}(v)$ and $\frac{\partial\gamma}{\partial y}(r,0,0) = r\,D\exp_{rq}(w)$ by the chain rule. So $r^2f(r) = \mu\left(\frac{\partial\gamma}{\partial x}(r,0,0), \frac{\partial\gamma}{\partial y}(r,0,0)\right)$. Define g(r) to be $r^2f(r)$. We will compute the derivatives of g, which will in turn tell us the derivatives of f.

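\begin{align*}
g &= \mu\left(\frac{\partial\gamma}{\partial x}, \frac{\partial\gamma}{\partial y}\right) \\
g' &= \mu\left(\nabla_r\frac{\partial\gamma}{\partial x}, \frac{\partial\gamma}{\partial y}\right) + \mu\left(\frac{\partial\gamma}{\partial x}, \nabla_r\frac{\partial\gamma}{\partial y}\right) \\
g''(r) &= \mu\left(\nabla_r^2\frac{\partial\gamma}{\partial x}, \frac{\partial\gamma}{\partial y}\right) + 2\mu\left(\nabla_r\frac{\partial\gamma}{\partial x}, \nabla_r\frac{\partial\gamma}{\partial y}\right) + \mu\left(\frac{\partial\gamma}{\partial x}, \nabla_r^2\frac{\partial\gamma}{\partial y}\right) \\
g^{(3)}(r) &= \mu\left(\nabla_r^3\frac{\partial\gamma}{\partial x}, \frac{\partial\gamma}{\partial y}\right) + 3\mu\left(\nabla_r^2\frac{\partial\gamma}{\partial x}, \nabla_r\frac{\partial\gamma}{\partial y}\right) + 3\mu\left(\nabla_r\frac{\partial\gamma}{\partial x}, \nabla_r^2\frac{\partial\gamma}{\partial y}\right) + \mu\left(\frac{\partial\gamma}{\partial x}, \nabla_r^3\frac{\partial\gamma}{\partial y}\right)
\end{align*}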

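We derive the above equations by differentiating inductively, from top to bottom, with respect to r. The equation for $g^{(3)}$ is the first of interest. By definition, $\nabla_r^3\frac{\partial\gamma}{\partial x} \equiv \nabla_r\nabla_r\nabla_r\frac{\partial\gamma}{\partial x} = \nabla_r\nabla_r\left(c \circ \frac{\partial^2\gamma}{\partial r\partial x}\right)$. Since our connection is torsion-free, $c \circ \frac{\partial^2\gamma}{\partial x\partial r} = c \circ \frac{\partial^2\gamma}{\partial r\partial x} = \nabla_r\frac{\partial\gamma}{\partial x}$. Similarly, we can write $\nabla_r\nabla_x\frac{\partial\gamma}{\partial r} = \nabla_x\nabla_r\frac{\partial\gamma}{\partial r} + R\left(\frac{\partial\gamma}{\partial x}, \frac{\partial\gamma}{\partial r}\right)\frac{\partial\gamma}{\partial r}$ by the definition of curvature. Since the Riemann tensor is zero everywhere, we can conclude $\nabla_r\nabla_x\frac{\partial\gamma}{\partial r} = \nabla_x\nabla_r\frac{\partial\gamma}{\partial r} = 0$, as γ as a function of r is a geodesic. We can similarly conclude all the other terms in our expression for $g^{(3)}(r)$ are zero, so $g^{(3)}(r) = 0\ \forall r$ and g(r) is a quadratic polynomial in r.

Similarly, $g''(0) = 2\mu\left(\nabla_r\frac{\partial\gamma}{\partial x}, \nabla_r\frac{\partial\gamma}{\partial y}\right) = 2\mu\left(\nabla_x\frac{\partial\gamma}{\partial r}, \nabla_y\frac{\partial\gamma}{\partial r}\right) = 2\mu(v,w)$ and $g(0) = g'(0) = 0$, so $g(r) = r^2\mu(v,w)$, thus $f(r) = \mu(v,w)$ is constant.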

The above argument shows us how we can understand the local shape of the manifold from rather simple manipulations with the Riemann curvature tensor. Moreover, viewing the manifold through the `eyes' of the exponential map is a very natural way to see the manifold from the inside.

In the above theorem, if the Riemann curvature tensor was non-zero, of course we would not have been able to come to the same conclusions. But in principle we could come to some conclusions about the Taylor expansion of $\exp^*(\mu)$ on $T_pN$. The next theorem takes this perspective and runs with it.

Theorem 4.12. (Riemann) Given a point p in a Riemann manifold N, we can consider the pull-back of the Riemann metric on N to the tangent space $T_pN$ via the exponential map. Denote this smoothly-varying bilinear form on $T_pN$ by ζ. Denote the space of symmetric bi-linear functions $T_pN \times T_pN \to \mathbb{R}$ by S, then

\[ \zeta : T_pN \to S \]

is a smooth function. Specifically, $\zeta(q)(v,w) = \mu(D\exp_q(v), D\exp_q(w))$, where we are restricting the exponential map to $T_pN$, $\exp : T_pN \to N$.

Riemann's theorem states that the second-order Taylor expansion of ζ is

\[ \zeta(q)(v,w) \overset{\varepsilon\cdot|q|^2}{\simeq} \mu_p(v,w) + \frac{1}{3}\mu_p(R(q,v)w, q) \]

where R is the Riemann tensor of N at p.

Proof. We will model the proof on Theorem 4.11. Fix for the remainder of the proof $v, w, q \in T_pN$. Let $f(r) = \mu(D\exp_{rq}(v), D\exp_{rq}(w))$. We will compute the 2nd order Taylor approximation of f centred at r = 0.

Let $\gamma : \mathbb{R}^3 \to N$ be defined as $\gamma(r,x,y) = \exp(r(q + xv + yw))$. As a function of r, γ is the geodesic emanating from p with initial velocity vector $q + xv + yw$, so $\nabla_r\frac{\partial\gamma}{\partial r} = 0_\gamma$. But also $\frac{\partial\gamma}{\partial x}(r,0,0) = r\,D\exp_{rq}(v)$ and $\frac{\partial\gamma}{\partial y}(r,0,0) = r\,D\exp_{rq}(w)$ by the chain rule. So $r^2f(r) = \mu\left(\frac{\partial\gamma}{\partial x}(r,0,0), \frac{\partial\gamma}{\partial y}(r,0,0)\right)$. Define g(r) to be $r^2f(r)$. We will compute the 2nd order Taylor expansion of f by computing the 4th order Taylor expansion of g, which we can do in the spirit of Theorem 4.11. Differentiating repeatedly the formula $g(r) = r^2f(r)$ we get the Taylor coefficient relations

\[ f(0) = \frac{1}{2}g''(0), \qquad f'(0) = \frac{1}{6}g^{(3)}(0), \qquad f''(0) = \frac{1}{12}g^{(4)}(0). \]
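The Taylor coefficient relations above can be checked on any polynomial f, since multiplying by $r^2$ just shifts the coefficient list by two places. The sketch below is an illustration with an arbitrary made-up polynomial; the helper `derivative_at_zero` is a name introduced for this example.

```python
from math import factorial

def derivative_at_zero(coeffs, k):
    """k-th derivative at 0 of the polynomial sum(coeffs[i] * r**i)."""
    return factorial(k) * coeffs[k] if k < len(coeffs) else 0.0

# f(r) = 5 - 3r + 7r^2 + 2r^3, an arbitrary test polynomial.
f_coeffs = [5.0, -3.0, 7.0, 2.0]
# g(r) = r^2 f(r): the same coefficients shifted up by two degrees.
g_coeffs = [0.0, 0.0] + f_coeffs

f0 = derivative_at_zero(g_coeffs, 2) / 2    # should equal f(0)
f1 = derivative_at_zero(g_coeffs, 3) / 6    # should equal f'(0)
f2 = derivative_at_zero(g_coeffs, 4) / 12   # should equal f''(0)

print(f0, f1, f2)
```

The denominators 2, 6 and 12 are the ratios (k+2)!/k! for k = 0, 1, 2, which is where the relations come from.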

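We can compute the derivatives of g at r = 0 similarly,

\begin{align*}
g &= \mu\left(\frac{\partial\gamma}{\partial x}, \frac{\partial\gamma}{\partial y}\right) & g(0) &= 0 \\
g' &= \mu\left(\nabla_r\frac{\partial\gamma}{\partial x}, \frac{\partial\gamma}{\partial y}\right) + \mu\left(\frac{\partial\gamma}{\partial x}, \nabla_r\frac{\partial\gamma}{\partial y}\right) & g'(0) &= 0 \\
g'' &= \mu\left(\nabla_r^2\frac{\partial\gamma}{\partial x}, \frac{\partial\gamma}{\partial y}\right) + 2\mu\left(\nabla_r\frac{\partial\gamma}{\partial x}, \nabla_r\frac{\partial\gamma}{\partial y}\right) + \mu\left(\frac{\partial\gamma}{\partial x}, \nabla_r^2\frac{\partial\gamma}{\partial y}\right) & g''(0) &= 2\mu(v,w) \\
g^{(3)} &= \mu\left(\nabla_r^3\frac{\partial\gamma}{\partial x}, \frac{\partial\gamma}{\partial y}\right) + 3\mu\left(\nabla_r^2\frac{\partial\gamma}{\partial x}, \nabla_r\frac{\partial\gamma}{\partial y}\right) + 3\mu\left(\nabla_r\frac{\partial\gamma}{\partial x}, \nabla_r^2\frac{\partial\gamma}{\partial y}\right) \\
&\quad + \mu\left(\frac{\partial\gamma}{\partial x}, \nabla_r^3\frac{\partial\gamma}{\partial y}\right) & g^{(3)}(0) &= 0 \\
g^{(4)} &= \mu\left(\nabla_r^4\frac{\partial\gamma}{\partial x}, \frac{\partial\gamma}{\partial y}\right) + 4\mu\left(\nabla_r^3\frac{\partial\gamma}{\partial x}, \nabla_r\frac{\partial\gamma}{\partial y}\right) + 6\mu\left(\nabla_r^2\frac{\partial\gamma}{\partial x}, \nabla_r^2\frac{\partial\gamma}{\partial y}\right) \\
&\quad + 4\mu\left(\nabla_r\frac{\partial\gamma}{\partial x}, \nabla_r^3\frac{\partial\gamma}{\partial y}\right) + \mu\left(\frac{\partial\gamma}{\partial x}, \nabla_r^4\frac{\partial\gamma}{\partial y}\right) & g^{(4)}(0) &= 8\mu(R(v,q)q, w)
\end{align*}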

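In the above equations, the first equation $g = \mu\left(\frac{\partial\gamma}{\partial x}, \frac{\partial\gamma}{\partial y}\right)$ is simply the definition of g, and the conclusion g(0) = 0 follows since $\frac{\partial\gamma}{\partial x}(0,0,0) = 0$ by the definition of γ.

The computation of g′(0) follows similarly. There is the one application of orthogonality to get the formula for g′(r); it's a sum of two applications of µ, and in each occurs either $\frac{\partial\gamma}{\partial x}$ or $\frac{\partial\gamma}{\partial y}$, which are both zero when r = 0.

In the computation of g′′(0), only the central term is potentially non-zero. $\nabla_r\frac{\partial\gamma}{\partial x} = c \circ \frac{\partial^2\gamma}{\partial r\partial x}$, which when r = 0 is equal to v, by direct computation. Similarly, $\nabla_r\frac{\partial\gamma}{\partial y}(0,0,0) = w$.

In the computation of $g^{(3)}(0)$ there is a sum of four applications of µ. The two terms on the end are zero at r = 0 since at least one of the terms in µ is zero. The two central terms involve expressions like $\nabla_r^2\frac{\partial\gamma}{\partial x} \equiv \nabla_r\nabla_r\frac{\partial\gamma}{\partial x}$. That our connection is torsion-free gives us $\nabla_r\nabla_r\frac{\partial\gamma}{\partial x} = \nabla_r\nabla_x\frac{\partial\gamma}{\partial r}$. This is equal to $\nabla_x\nabla_r\frac{\partial\gamma}{\partial r} + R\left(\frac{\partial\gamma}{\partial x}, \frac{\partial\gamma}{\partial r}\right)\frac{\partial\gamma}{\partial r}$ by our observation following Proposition 4.10. But since γ is a geodesic as a function of r, $\nabla_r\frac{\partial\gamma}{\partial r} = 0$ everywhere. And at r = 0, $\frac{\partial\gamma}{\partial x} = 0$, so the second term in this expression is zero as well.

The computation of $g^{(4)}(0)$ follows similarly, with the exception of the terms $\mu\left(\nabla_r^3\frac{\partial\gamma}{\partial x}, \nabla_r\frac{\partial\gamma}{\partial y}\right)$ and $\mu\left(\nabla_r\frac{\partial\gamma}{\partial x}, \nabla_r^3\frac{\partial\gamma}{\partial y}\right)$. By similar reasoning as in the $g^{(3)}(0)$ case we see $\nabla_r^3\frac{\partial\gamma}{\partial x} = \nabla_r\left(R\left(\frac{\partial\gamma}{\partial x}, \frac{\partial\gamma}{\partial r}\right)\frac{\partial\gamma}{\partial r}\right)$. We can argue that at r = 0, $\nabla_r\left(R\left(\frac{\partial\gamma}{\partial x}, \frac{\partial\gamma}{\partial r}\right)\frac{\partial\gamma}{\partial r}\right) = R\left(\nabla_r\frac{\partial\gamma}{\partial x}, \frac{\partial\gamma}{\partial r}\right)\frac{\partial\gamma}{\partial r}$. The reason for this is that R is a multi-linear function so it must satisfy a type of `product rule' where one differentiates each factor at a time and adds up the terms. Since $\frac{\partial\gamma}{\partial x} = 0$ when r = 0 this would complete the argument. We describe the product rule in detail in the next proposition.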

In the above proof, although we did not expressly need to compute the derivative of the Riemann curvature tensor, we came across it in the proof and needed it to satisfy a type of product rule. We describe some basic formal properties of the derivative of the Riemann tensor in the next proposition. While we're at it, we'll mention that this kind of product rule is a fairly general result.


We will use $\bigoplus_k TN$ as shorthand to denote the space $TN \oplus TN \oplus \cdots \oplus TN$,

\[ \bigoplus_k TN = \{(p, v_1, \cdots, v_k) : p \in N,\ v_i \in T_pN\ \forall i\}. \]

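Proposition 4.13. (`Product rule' for DR and Dµ) The `product rule' for Dµ states that there is a tri-linear map

\[ \nabla\mu : \bigoplus_3 TN \to T\mathbb{R} \]

called the co-variant derivative of µ which allows for an alternative computation of Dµ. Specifically,

\[ D\mu = \nabla\mu\left(D(\pi_\oplus), \pi_1 \circ \pi_\oplus, \pi_2 \circ \pi_\oplus\right) + \mu\left(c \circ D\pi_1, \pi_2 \circ \pi_\oplus\right) + \mu\left(\pi_1 \circ \pi_\oplus, c \circ D\pi_2\right) \]

where $\pi_i : TN \oplus TN \to TN$ are projections onto the first and second factors respectively, $\pi_1(p,v,w) = (p,v)$. $\pi_\oplus : TN \oplus TN \to N$ is the bundle projection map $\pi_\oplus(p,v,w) = p$. An alternative way to state this is if $f : \mathbb{R}^3 \to N$ is smooth, we can think of the pair $\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right)$ as a map $\mathbb{R}^3 \to TN \oplus TN$, so $\frac{\partial}{\partial z}\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right)$ is a map $\mathbb{R}^3 \to T(TN \oplus TN)$. The above equality states that

\[ D\mu\left(\frac{\partial}{\partial z}\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right)\right) = \frac{\partial}{\partial z}\mu\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right) = \nabla\mu\left(\frac{\partial f}{\partial z}, \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right) + \mu\left(\nabla_z\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right) + \mu\left(\frac{\partial f}{\partial x}, \nabla_z\frac{\partial f}{\partial y}\right). \]

Similarly, given a function $f : \mathbb{R}^4 \to N$ with variables denoted (x, y, z, w) there is a product rule for computing $c \circ DR$. There is a multi-linear function $\nabla R : \bigoplus_4 TN \to TN$ called the covariant derivative of R which satisfies

\begin{align*}
(c \circ DR)\left(\frac{\partial}{\partial w}\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right)\right) = {} & \nabla R\left(\frac{\partial f}{\partial w}, \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) + R\left(\nabla_w\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) \\
& + R\left(\frac{\partial f}{\partial x}, \nabla_w\frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) + R\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \nabla_w\frac{\partial f}{\partial z}\right)
\end{align*}

By the chain rule, $(c \circ DR)\left(\frac{\partial}{\partial w}\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right)\right) = c \circ \frac{\partial}{\partial w}R\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right)$.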

Notice that the above proposition holds for any connection $c : T^2N \to TN$ and any Riemann metric on N. The connection c being orthogonal with respect to the Riemann metric is equivalent to the statement ∇µ = 0. Further, notice there are product rules for any fibre-wise multi-linear functions $\bigoplus TN \to TN$. One can even replace the target TN with any vector bundle over a manifold, provided the map sends fibres to fibres and is multi-linear.

Proof. Recall how a connection decomposes the double tangent bundle. If $c : T^2N \to TN$ is a connection, it corresponds uniquely with an idempotent linear map $c : T^2N \to T^2N$ whose image is the kernel of $D\pi_N : T^2N \to TN$. $(p,v,w,y) \in \ker D\pi_N$ if and only if w = 0 and $y \in T_pN$. This allows us to identify $T_{(p,v)}TN$ with $T_pN \oplus T_pN$. Given $(p,v,w,y) \in T_{(p,v)}TN$, $(c(p,v,w,y), D\pi_N(p,v,w,y)) \in T_pN \oplus T_pN$. This is an isomorphism. Under this isomorphism the subspace of $T_{(p,v)}TN$ corresponding to $T_pN \times \{0\}$ consists of the `vertical' directions, and $\{0\} \times T_pN$ of the `parallel' directions. Similarly, given any $V \in T(\bigoplus_3 TN)$, we can write it uniquely as a sum $V = V_0 + V_1$ where $V_0$ is `parallel' and $V_1$ `vertical'. Precisely, $V_0$ satisfies $c \circ D\pi_i(V_0) = 0\ \forall i \in \{1,2,3\}$, where $\pi_i : \bigoplus_3 TN \to TN$ is projection onto the i-th factor, and $V_1$ satisfies $D\pi_\oplus(V_1) = 0$, where $\pi_\oplus : \bigoplus_3 TN \to N$ is the bundle projection. Thus, $c \circ DR(V) = c \circ DR(V_0 + V_1) = c \circ DR(V_0) + c \circ DR(V_1)$. The former we would call $(\nabla R)(V)$ and the latter is computable from the multi-linearity of R.


In principle one can extend Theorem 4.12 indefinitely to a formula for the full Taylor expansion of ζ(q). If our Riemann manifold was analytic, meaning both the charts being analytic functions and the metric µ being analytic, this Taylor expansion would converge to ζ in some neighbourhood of p. We give an example of how one would compute the next step, below.

Problem 4.14. Verify the cubic Taylor approximation for ζ(q)(v, w) is

\[ \zeta(q)(v,w) \overset{\varepsilon\cdot|q|^3}{\simeq} \mu_p(v,w) + \frac{1}{3}\mu_p(R(q,v)w, q) + \frac{1}{6}\mu_p\left((\nabla R)(q,q,v,w), q\right). \]

Definition 4.15. Given a vector space W, the Stiefel manifold of k linearly independent vectors in W is typically denoted $V_k(W) = \{(w_1, \cdots, w_k) \in W^k \text{ lin. indep}\}$; one typically thinks of this as an open subset of $W^k$.

At a point $p \in N$ the sectional curvature of a Riemann manifold N is the function $K : V_2T_pN \to \mathbb{R}$ given by

\[ K(v,w) = \frac{\mu(R(v,w)v, w)}{|v|^2|w|^2 - \mu(v,w)^2}. \]

Globally we think of K as a smooth function $K : V_2TN \to \mathbb{R}$, where $V_2TN$ is taken to be an open subset of $TN \oplus TN$.

The above formula might seem contrived, because it is. Notice that since $\mu(R(x,y)z,w)$ is multi-linear and anti-symmetric in both (x, y) and (z, w), the expression $\mu(R(v,w)v,w)$ transforms in a simple way under a change of basis of the plane spanned by v and w. If we let $v' = av + bw$ and $w' = cv + dw$ then

\[ \mu(R(v',w')v',w') = \left(\mathrm{Det}\begin{pmatrix} a & b \\ c & d \end{pmatrix}\right)^2 \mu(R(v,w)v,w). \]

But the denominator $|v|^2|w|^2 - \mu(v,w)^2$ is the formula for the square of the area of the parallelogram spanned by v and w, so

\[ |v'|^2|w'|^2 - \mu(v',w')^2 = \left(\mathrm{Det}\begin{pmatrix} a & b \\ c & d \end{pmatrix}\right)^2 \left(|v|^2|w|^2 - \mu(v,w)^2\right). \]

So the point of the formula for K(v, w) is that it only depends on the plane spanned by v and w. We define it for all $(v,w) \in V_2T_pN$ because we will be interested in differentiating K.
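The determinant-squared scaling of both numerator and denominator can be checked numerically. The sketch below is an illustration only: it assumes the Euclidean inner product for µ and a model curvature tensor R(x, y)z = H(z, x)H(y) − H(z, y)H(x) built from a random symmetric matrix H, which has the symmetries (1) and (3). All function names are made up for this example.

```python
import random

random.seed(1)
n = 4  # dimension of the model tangent space (arbitrary choice)

def rand_vec():
    return [random.uniform(-1.0, 1.0) for _ in range(n)]

A = [rand_vec() for _ in range(n)]
H = [[A[i][j] + A[j][i] for j in range(n)] for i in range(n)]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def h(x, y):
    return sum(x[i] * H[i][j] * y[j] for i in range(n) for j in range(n))

def riem(x, y, z, w):
    """mu(R(x, y)z, w) for the model tensor."""
    return h(z, x) * h(y, w) - h(z, y) * h(x, w)

def area2(v, w):
    """Squared area of the parallelogram spanned by v and w."""
    return dot(v, v) * dot(w, w) - dot(v, w) ** 2

def K(v, w):
    return riem(v, w, v, w) / area2(v, w)

def combo(a, v, b, w):
    return [a * vi + b * wi for vi, wi in zip(v, w)]

v, w = rand_vec(), rand_vec()
a, b, c, d = 2.0, -1.0, 0.5, 3.0      # any matrix with non-zero determinant
det = a * d - b * c
v2, w2 = combo(a, v, b, w), combo(c, v, d, w)

print(riem(v2, w2, v2, w2), det ** 2 * riem(v, w, v, w))
print(area2(v2, w2), det ** 2 * area2(v, w))
print(K(v2, w2), K(v, w))
```

Each printed pair should agree up to rounding, so K takes the same value on any two bases of the same plane.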

By design, sectional curvature tells us strictly about the component of holonomy in the plane where one performs the loop. Perhaps the most fundamental property of sectional curvature is the theorem below.

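Theorem 4.16. In a Riemann manifold N, let $v, w \in T_pN$ be unit length and orthogonal. Let $f : [0,\infty) \times [0,2\pi] \to N$ be $f(r,\theta) = \exp(r\cos\theta\, v + r\sin\theta\, w)$. Let $L : [0,\infty) \to \mathbb{R}$ be the length of $f(r,\cdot)$, i.e.

\[ L(r) = \int_0^{2\pi}\left|\frac{\partial f}{\partial\theta}(r,\theta)\right| d\theta. \]

Then the third order Taylor expansion of L centred at 0 is

\[ L(r) \overset{\varepsilon\cdot r^3}{\simeq} 2\pi r - \frac{\pi r^3}{3}K(v,w). \]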

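Proof. Since the theorem is local, we can use the fact that a neighbourhood of p in N is isometric to a neighbourhood of 0 in $T_pN$ given the pull-back metric $\zeta = \exp^*\mu$. Riemann's Theorem 4.12 tells us the second order behavior of ζ, and this will suffice. Let $g(r,\theta) = r\cos\theta\, v + r\sin\theta\, w$. Then by Riemann's Theorem,

\begin{align*}
\zeta\left(g(r,\theta), \frac{\partial g}{\partial\theta}, \frac{\partial g}{\partial\theta}\right) &\overset{\varepsilon\cdot r^4}{\simeq} \mu_p\left(\frac{\partial g}{\partial\theta}, \frac{\partial g}{\partial\theta}\right) + \frac{1}{3}\mu_p\left(R\left(g(r,\theta), \frac{\partial g}{\partial\theta}\right)\frac{\partial g}{\partial\theta}, g(r,\theta)\right) \\
&= r^2 + \frac{1}{3}r^4\mu_p(R(v,w)w, v) \quad \text{at } \theta = 0.
\end{align*}

The same value results for every θ, since $\frac{1}{r}g$ and $\frac{1}{r}\frac{\partial g}{\partial\theta}$ are obtained from v and w by a rotation of their plane, which has determinant 1. Taking square roots, $\sqrt{r^2 + \frac{1}{3}r^4 X} \simeq r + \frac{1}{6}r^3 X$, and for unit length orthogonal v, w we have $\mu_p(R(v,w)w,v) = -\mu_p(R(v,w)v,w) = -K(v,w)$, so

\begin{align*}
L(r) = \int_0^{2\pi}\left|\frac{\partial f}{\partial\theta}(r,\theta)\right| d\theta &= \int_0^{2\pi}\left(\zeta\left(g(r,\theta), \frac{\partial g}{\partial\theta}, \frac{\partial g}{\partial\theta}\right)\right)^{1/2} d\theta \\
&\overset{\varepsilon\cdot r^3}{\simeq} \int_0^{2\pi}\left(r + \frac{r^3}{6}\mu_p(R(v,w)w, v)\right) d\theta \\
&= 2\pi r\left(1 - \frac{r^2}{6}K(v,w)\right) = 2\pi r - \frac{\pi r^3}{3}K(v,w).
\end{align*}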

Example 4.17. We saw in Example 4.7, the Riemann tensor for $S^n$ is given by

\[ R(v,w)z = (v \cdot z)w - (w \cdot z)v \]

and for $H^n$

\[ R(v,w)z = (w \cdot z)v - (v \cdot z)w. \]

So for $S^n$, let $v, w \in T_pS^n$ be unit length and orthogonal, then the sectional curvature can be computed as

\[ K(v,w) = \mu(R(v,w)v, w) = \mu(|v|^2w, w) = 1 \]

and similarly in $H^n$,

\[ K(v,w) = \mu(R(v,w)v, w) = \mu(-|v|^2w, w) = -1. \]

So the sectional curvatures of $S^n$ and $H^n$ are constant: they do not depend on where in the manifold you are, nor do they depend on the plane one evaluates K in.
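With K = 1 on the unit sphere, Theorem 4.16 can be compared against an exact length. The sketch below is an illustration that assumes the standard fact that a geodesic circle of radius r on the unit sphere $S^2$ is a circle of latitude with Euclidean radius sin r, hence of length 2π sin r. The function names are made up for this example.

```python
import math

def circle_length(r):
    """Exact length of the geodesic circle of radius r on the unit sphere S^2."""
    return 2.0 * math.pi * math.sin(r)

def taylor_length(r, K=1.0):
    """Third order expansion from Theorem 4.16 with sectional curvature K."""
    return 2.0 * math.pi * r - math.pi * r ** 3 / 3.0 * K

for r in (0.2, 0.1, 0.05):
    error = circle_length(r) - taylor_length(r)
    # The error should shrink like r^5, so error / r^5 should stay bounded.
    print(r, error, error / r ** 5)
```

The last column staying roughly constant as r shrinks indicates the two expressions agree through third order.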

Definition 4.18. A manifold is said to be of constant curvature if the sectional curvature is a constant function.

The sectional curvature and Riemann curvature tensors are known to be essentially equivalent: one can compute either one from the other. Although at first this might appear surprising, recall that if we take a symmetric bilinear function $f : V \times V \to V$ where V is a vector space, the function $N : V \to V$ given by $N(v) = f(v,v)$ determines f. Specifically, the Hessian $H(N)_p(v,w) = 2f(v,w)$. Of course, if f were not symmetric, $H(N)_p(v,w) = f(v,w) + f(w,v)$ is not enough to determine f. The next proposition shows that R has enough symmetry so that we can re-derive it from K.

Proposition 4.19. For every $p \in N$, the Riemann curvature $R_p : \bigoplus_3 T_pN \to T_pN$ is determined by the sectional curvature $K : V_2(T_pN) \to \mathbb{R}$.

Proof. We will use essentially the same idea as the derivation of f from N in the previous paragraph. Define $f : T_pN \times T_pN \to \mathbb{R}$ by

\[ f(v,w) = \mu(R(v,w)v, w) = K(v,w) \cdot \left(|v|^2|w|^2 - \mu(v,w)^2\right). \]

Clearly f is determined by the sectional curvature (and µ) on $T_pN$. Fixing $v, w \in T_pN$ consider the Taylor expansion of the function $f(v + \alpha x, w + \beta y)$ as a function of (α, β). The coefficient in front of the term αβ is

\[ \mu(R(x,y)v,w) + \mu(R(x,w)v,y) + \mu(R(v,y)x,w) + \mu(R(v,w)x,y). \]

So if we take the difference of this expression with itself after switching x ↔ y we get

\begin{align*}
& \left[\mu(R(x,y)v,w) + \mu(R(x,w)v,y) + \mu(R(v,y)x,w) + \mu(R(v,w)x,y)\right] \\
& \quad - \left[\mu(R(y,x)v,w) + \mu(R(y,w)v,x) + \mu(R(v,x)y,w) + \mu(R(v,w)y,x)\right] \\
&= 4\mu(R(x,y)v,w) + 2\mu(R(x,w)v,y) + 2\mu(R(x,v)y,w) \\
&= 6\mu(R(x,y)v,w)
\end{align*}

where for the last line we applied the Bianchi identity to get $\mu(R(x,w)v,y) = -\mu(R(w,v)x,y) - \mu(R(v,x)w,y)$.
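The recovery of R from f in this proof can be run as a computation. The sketch below is an illustration only: it assumes the Euclidean inner product for µ and the model tensor R(x, y)z = H(z, x)H(y) − H(z, y)H(x) for a random symmetric H, which satisfies identities (1) through (4). Because f(v + αx, w + βy) has degree at most 2 in each of α and β, the αβ coefficient is extracted exactly by a four-point difference. All names are made up for this example.

```python
import random

random.seed(3)
n = 4  # dimension of the model tangent space (arbitrary choice)

def rand_vec():
    return [random.uniform(-1.0, 1.0) for _ in range(n)]

A = [rand_vec() for _ in range(n)]
H = [[A[i][j] + A[j][i] for j in range(n)] for i in range(n)]

def h(x, y):
    return sum(x[i] * H[i][j] * y[j] for i in range(n) for j in range(n))

def riem(x, y, z, w):
    """mu(R(x, y)z, w) for the model tensor."""
    return h(z, x) * h(y, w) - h(z, y) * h(x, w)

def f(v, w):
    """The function f(v, w) = mu(R(v, w)v, w) from the proof."""
    return riem(v, w, v, w)

def shifted(v, t, x):
    return [vi + t * xi for vi, xi in zip(v, x)]

def mixed_coefficient(v, w, x, y):
    """Coefficient of alpha*beta in f(v + alpha x, w + beta y)."""
    F = lambda al, be: f(shifted(v, al, x), shifted(w, be, y))
    return (F(1, 1) - F(1, -1) - F(-1, 1) + F(-1, -1)) / 4.0

v, w, x, y = (rand_vec() for _ in range(4))

# Antisymmetrize in x <-> y and divide by 6, as in the proof.
recovered = (mixed_coefficient(v, w, x, y) - mixed_coefficient(v, w, y, x)) / 6.0
direct = riem(x, y, v, w)
print(recovered, direct)
```

The function `mixed_coefficient` only ever calls f, so the agreement of the two printed numbers shows that f alone reproduces $\mu(R(x,y)v,w)$ for this model tensor.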

The next proposition follows immediately from the above proof.

Proposition 4.20. A Riemann manifold has constant sectional curvature κ if and only if

R(v , w)z = κ (µ(v , z)w − µ(w, z)v) .

Example 4.21. If M and N are Riemann manifolds, the product M × N is canonically a Riemann manifold, as $T_{(p,q)}(M \times N) = T_pM \times T_qN$. If $\mu_M$ and $\mu_N$ are Riemann metrics on M and N respectively, $\mu_{M\times N} : \bigoplus_2 T(M \times N) \to \mathbb{R}$ is defined by

\[ \mu_{M\times N}((p,q),(v_1,v_2),(w_1,w_2)) = \mu_M(p,v_1,w_1) + \mu_N(q,v_2,w_2). \]

We are assigning the Riemann metrics for M and N on the respective factors, and making the tangent spaces to M orthogonal to the tangent spaces for N. We will compute the connection, holonomy, Riemann, and sectional curvatures of M × N in terms of the corresponding items for M and N respectively.

• If $c_M$ and $c_N$ are the Levi-Civita connections on M and N respectively, we will use the identification $T(M \times N) \equiv TM \times TN$. Then the Levi-Civita connection on M × N is given by

\[ c_{M\times N} : T^2M \times T^2N \to TM \times TN, \qquad c_{M\times N}(V,W) = (c_M(V), c_N(W)). \]

• A smooth function $I \to TM \times TN$ is parallel if and only if the compositions with the projections $\pi_1 : TM \times TN \to TM$ and $\pi_2 : TM \times TN \to TN$ are parallel. Therefore $\gamma : (a,b) \to M \times N$ is a geodesic if and only if $\pi_1 \circ \gamma$ and $\pi_2 \circ \gamma$ are geodesics. The holonomy of a curve $\gamma : [a,b] \to M \times N$ is the product of the two holonomies of $\pi_1 \circ \gamma$ and $\pi_2 \circ \gamma$.

• The Riemann curvature tensor $R_{M\times N}$ is the product of the Riemann tensors for M and N respectively, i.e.

\[ R_{M\times N}((v_1,v_2),(w_1,w_2))(z_1,z_2) = (R_M(v_1,w_1)z_1, R_N(v_2,w_2)z_2). \]

• The sectional curvature satisfies

\[ K_{M\times N}((v_1,v_2),(w_1,w_2)) = \frac{\left(|v_1|^2|w_1|^2 - \mu(v_1,w_1)^2\right)K_M(v_1,w_1) + \left(|v_2|^2|w_2|^2 - \mu(v_2,w_2)^2\right)K_N(v_2,w_2)}{\left(|v_1|^2 + |v_2|^2\right)\left(|w_1|^2 + |w_2|^2\right) - \left(\mu(v_1,w_1) + \mu(v_2,w_2)\right)^2}. \]

$K_M(v_1,w_1)$ is not defined if $v_1$ and $w_1$ are not linearly independent, but one can substitute $\left(|v_1|^2|w_1|^2 - \mu(v_1,w_1)^2\right)K_M(v_1,w_1) = \mu(R_M(v_1,w_1)v_1,w_1)$, i.e. if $v_1$ and $w_1$ are not independent take $\left(|v_1|^2|w_1|^2 - \mu(v_1,w_1)^2\right)K_M(v_1,w_1)$ to be zero.

Notice that regardless of what the sectional curvatures of M and N are, in M × N there are directions where the sectional curvature must be zero, as $K_{M\times N}((v_1,0),(0,w_2)) = 0$ regardless of what $v_1$ or $w_2$ are. So the only way a product of two Riemann manifolds could have constant curvature is if both factors are flat.
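The vanishing of the mixed sectional curvature can be seen in a small computation for $S^2 \times S^2$. The sketch below is an illustration only: each tangent space $T_pS^2$ is modelled as $\mathbb{R}^2$ with the Euclidean inner product, a tangent vector to the product is a pair of such vectors, and the curvature of each factor is the formula from Example 4.17. All names are made up for this example.

```python
import random

random.seed(2)

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def riem_sphere(v, w, z, u):
    """mu(R(v, w)z, u) on the unit sphere, with R(v, w)z = (v.z)w - (w.z)v."""
    return dot(v, z) * dot(w, u) - dot(w, z) * dot(v, u)

def riem_product(V, W, Z, U):
    """Curvature of the product: the sum of the contributions of the two factors."""
    return (riem_sphere(V[0], W[0], Z[0], U[0])
            + riem_sphere(V[1], W[1], Z[1], U[1]))

def mu_product(V, W):
    return dot(V[0], W[0]) + dot(V[1], W[1])

def K_product(V, W):
    area2 = mu_product(V, V) * mu_product(W, W) - mu_product(V, W) ** 2
    return riem_product(V, W, V, W) / area2

def rand_tangent():
    return ([random.uniform(-1, 1) for _ in range(2)],
            [random.uniform(-1, 1) for _ in range(2)])

# A plane spanned by one vector from each factor.
mixed = K_product(([1.0, 0.0], [0.0, 0.0]), ([0.0, 0.0], [0.0, 1.0]))
samples = [K_product(rand_tangent(), rand_tangent()) for _ in range(1000)]
print(mixed, min(samples), max(samples))
```

The mixed plane gives curvature zero, and in this model the sampled values stay between 0 and 1, so the curvature is non-negative but not constant.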

Notice that $S^2 \times S^2$ with its standard metric (the product metric) has sectional curvature that is always greater than or equal to zero. A fundamental unsolved problem in Riemannian geometry is the question of whether or not $S^2 \times S^2$ admits a Riemann metric where the sectional curvature is everywhere positive. Heinz Hopf conjectured the answer is no.

Ricci and Scalar Curvatures

Recall that if $f : V \to V$ is a linear transformation of a finite-dimensional vector space V, both the determinant det(f) and the trace tr(f) are well-defined. Their definitions are $\det(f) = \det([f]^\beta_\beta)$ and $tr(f) = tr([f]^\beta_\beta)$ where β is any basis for V and $[f]^\beta_\beta$ is the matrix representation of f with respect to β. The determinant had the interpretation as the amount f distorts volume in V. The trace had the interpretation that $tr(f) = D(\mathrm{Det})_I(f)$, i.e. $\mathrm{Det}(Id_V + tf)$ has linear approximation $1 + tr(f)t$. Since the Riemann curvature tensor takes values in the tangent space $T_{Id}O(T_pN)$, we can take the trace of the Riemann curvature tensor and interpret it geometrically. Orthogonal transformations do not distort volume, thus $tr(R(v,w)) = 0$ always. But we can construct other bilinear functions $T_pN \times T_pN \to Hom(T_pN, T_pN)$ from the Riemann curvature tensor.
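The statement that $\mathrm{Det}(Id_V + tf)$ has linear approximation $1 + tr(f)t$ is easy to test numerically. The sketch below is an illustration with an arbitrary 3 × 3 matrix chosen for this example; `det3` and `det_identity_plus` are made-up helper names.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# An arbitrary matrix representing f.
A = [[0.3, -1.2, 0.5],
     [2.0, 0.1, -0.7],
     [0.4, 0.9, -0.6]]
trace = sum(A[i][i] for i in range(3))

def det_identity_plus(t):
    """Det(Id + t f)."""
    M = [[(1.0 if i == j else 0.0) + t * A[i][j] for j in range(3)]
         for i in range(3)]
    return det3(M)

for t in (1e-1, 1e-2, 1e-3):
    error = det_identity_plus(t) - (1.0 + t * trace)
    # The error should shrink like t^2, so error / t^2 should stay bounded.
    print(t, error, error / t ** 2)
```

The ratio in the last column approaches a constant as t shrinks, which is the second order coefficient that the linear approximation ignores.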

Definition 4.22. The Ricci Curvature Tensor is the function

\[ Ric : TN \oplus TN \to \mathbb{R}, \qquad Ric(v,w) = tr\left(z \longmapsto R(v,z)w\right). \]

The Ricci tensor is known to be bi-linear and symmetric. Bilinearity follows from multi-linearity of the Riemann tensor. To see symmetry, compute the trace using an orthonormal basis and apply the exchange identity.

Note: in Ricci calculus the act of taking the trace of a multi-linear function is called `contracting the tensor'. If $R^l_{ijk}$ are the components of the Riemann tensor, the Ricci tensor has components $\sum_l R^l_{ilk}$.

Definition 4.23. Since the Ricci tensor is symmetric, it is completely determined by Ric(v, v) for v a unit length vector.

\[ STN := \{v \in TN : |v|^2 = \mu(v,v) = 1\} \]

STN is called the sphere bundle of the tangent bundle. If Ric is constant on STN, N is called an Einstein manifold. Notice that this is equivalent to $Ric = k \cdot \mu$ where $k \in \mathbb{R}$ and $\mu : TN \oplus TN \to \mathbb{R}$ is the Riemann metric.

Example 4.24. We compute the Ricci tensors for $S^n$ and $H^n$. On $S^n$, $R(v,w)z = (v \cdot z)w - (w \cdot z)v$ so $Ric(v,w) = tr\left[z \longmapsto (v \cdot w)z - (w \cdot z)v\right]$. We can compute the trace of this map directly, $Ric(v,w) = n(v \cdot w) - (v \cdot w) = (n-1)(v \cdot w)$, so $S^n$ is an Einstein manifold.

For $H^n$,

\[ R(v,w)z = (w \cdot z)v - (v \cdot z)w. \]

This Riemann curvature is the negative of the Riemann curvature for $S^n$, so $Ric(v,w) = (1-n)(v \cdot w)$, and $H^n$ is an Einstein manifold as well.
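The trace computation for $S^n$ can be carried out explicitly in coordinates. The sketch below is an illustration for n = 3: it models the tangent space $T_pS^3$ as $\mathbb{R}^3$ with its standard orthonormal basis and sums the diagonal entries of the map $z \longmapsto R(v,z)w$. The function names and the sample vectors are made up for this example.

```python
n = 3  # dimension of the sphere, so the tangent space is modelled as R^3

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def curvature(v, z, w):
    """R(v, z)w on the unit sphere: (v.w) z - (z.w) v."""
    return [dot(v, w) * zi - dot(z, w) * vi for zi, vi in zip(z, v)]

# Orthonormal basis of the model tangent space.
basis = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def ricci(v, w):
    """Trace of z -> R(v, z)w, computed in the orthonormal basis."""
    return sum(dot(curvature(v, e, w), e) for e in basis)

v = [1.0, 2.0, -0.5]
w = [0.3, -1.0, 4.0]
print(ricci(v, w), (n - 1) * dot(v, w))
```

The two printed numbers agree, matching $Ric(v,w) = (n-1)(v \cdot w)$. Negating `curvature` gives the hyperbolic case.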

We've seen that the Riemann tensor tells us, to 2nd order, the pull-back of the Riemann metric via

the exponential map. It also tells us, to 2nd order, the holonomy about small loops. The Sectional

Curvature tells us the ratio of the lengths of short loops (compared to Euclidean loops), to 2nd order.

In that regard, the Ricci tensor tells us the extent to which the exponential map distorts volume,

again to second order. This is made precise in the next theorem. In order to state the theorem, we

need the notion of the determinant of a linear function between two inner product spaces of the same

dimension.

Definition 4.25. Let $f : V \to W$ be a linear function, and assume $\mu_V : V \times V \to \mathbb{R}$ is an inner product, as well as $\mu_W : W \times W \to \mathbb{R}$. If dim(V) = n = dim(W), define |det(f)| to be |det(A)| where $A = [a_{ij}]$ is given by $f(v_j) = \sum_{i=1}^n a_{ij}w_i$, with $\alpha = \{v_1, \cdots, v_n\}$ an orthonormal basis for V and $\beta = \{w_1, \cdots, w_n\}$ an orthonormal basis for W.

\[ |\det(f)| = |\det([f]^\beta_\alpha)| \]

There is a typical formula used for computing the determinant above. Denote $\mu_W(f(v_j), f(v_k))$ by $g_{jk}$, then the matrix $[g_{jk}] = ([f]^\beta_\alpha)^t[f]^\beta_\alpha$, and $|\det(f)| = \sqrt{\det[g_{jk}]}$.

Theorem 4.26. Given a Riemann manifold with Riemann metric $\mu : TN \oplus TN \to \mathbb{R}$ and $p \in N$, consider $T_pN$ to be an inner product space whose inner product is the restriction of µ to $T_pN$. Consider the exponential map restricted to $T_pN$, $\exp : T_pN \to N$, and let $\Delta_p : T_pN \to \mathbb{R}$ be defined as $\Delta_p(q) = \det\left(D\exp_q : T_qT_pN = T_pN \to T_{\exp(q)}N\right)$. Then the 2nd order Taylor expansion of $\Delta_p$, centred at $0 \in T_pN$, is

\[ \Delta_p(q) \overset{\varepsilon\cdot|q|^2}{\simeq} 1 - \frac{1}{3}Ric(q,q). \]

If one imagines a Riemann manifold as a place where one lives, and imagines light as traveling along geodesics, the Ricci tensor Ric(v, v) for $v \in STN$ describes to second order the distortion in surface area between a patch of light that hits your eye, and the amount of area that light took up in space, moments before it hit your eye. The Ricci tensor describes how that area shrinks, to second order. So positive Ricci curvature in the direction v means area shrinks, negative Ricci curvature means area grows. Being an Einstein manifold means that $Ric = k \cdot \mu$, so being an Einstein manifold is equivalent to the statement that $\Delta_p(q) \overset{\varepsilon\cdot|q|^2}{\simeq} 1 - \frac{k}{3}|q|^2$.

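Proof. Fix $q \in T_pN$. Given any two vectors $v, w \in T_pN$ we know by Riemann's Theorem 4.12 that $\mu(D\exp_q(v), D\exp_q(w)) \overset{\varepsilon\cdot|q|^2}{\simeq} \mu_p(v,w) + \frac{1}{3}\mu_p(R(q,v)w, q)$. If we let $v_1, \cdots, v_n$ be an orthonormal basis for $T_pN$, then

\[ \Delta_p(q) \overset{\varepsilon\cdot|q|^2}{\simeq} \sqrt{\det\left(\left[\delta_{ij} + \frac{1}{3}\mu_p(R(q,v_i)v_j, q)\right]_{ij}\right)} \overset{\varepsilon\cdot|q|^2}{\simeq} 1 + tr\left(\frac{1}{3}\left[\mu_p(R(q,v_i)v_j, q)\right]_{ij}\right) = 1 - \frac{1}{3}Ric(q,q). \]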


Next we will compute the Ricci tensor of a product in terms of the Ricci tensors of the factors.

Example 4.27. Let M and N be Riemann manifolds with Ricci tensors $Ric_M$ and $Ric_N$ respectively. The Ricci tensor of the product $Ric_{M\times N}$ is given by

\[ Ric_{M\times N}((v_1,v_2),(w_1,w_2)) = Ric_M(v_1,w_1) + Ric_N(v_2,w_2). \]

So we can conclude that M × N is an Einstein manifold if and only if both M and N are, with $Ric_M = k_M\mu_M$, $Ric_N = k_N\mu_N$ and $k_M = k_N$. This means $S^2 \times S^2$ is an Einstein manifold, while $S^2 \times H^2$ is not.

Since the Ricci tensor describes directional volume distortion, it's natural to consider the average volume distortion over all directions. This brings us to another interpretation of the trace and a new notion of curvature. To define scalar curvature we need the notion of trace of a symmetric bilinear form.

Definition 4.28. Given a symmetric bilinear function $f : V \times V \to \mathbb{R}$ on an inner product space V with inner product µ, its trace is defined as the trace of the composite $\mu^{-1} \circ f$ where $f : V \to V^*$ and $\mu : V \to V^*$ are the adjoints of f and µ respectively, $f(v)(w) = f(v,w)$. Using this definition, given an orthonormal basis $v_1, \cdots, v_n$ for V, $tr(f) = \sum_i f(v_i, v_i)$.

Given a point $p \in N$, the scalar curvature at p, $Scal_p$, is defined as the trace of the Ricci tensor, $Scal_p := tr(Ric_p)$. This gives a smooth function $Scal : N \to \mathbb{R}$.

Proposition 4.29. If f is a symmetric bilinear form on an inner product space V, the average of the function $v \longmapsto f(v,v)$ over the unit sphere is $\frac{tr(f)}{\dim(V)}$.

See the appendix at the end of this section for the proof of the above proposition.
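In dimension 2 the proposition can be checked directly by averaging over the unit circle. The sketch below is an illustration for one arbitrary symmetric form on $\mathbb{R}^2$ with the Euclidean inner product; the coefficients a, b, c and the function name `q` are made up for this example, and only the two-dimensional case is covered.

```python
import math

# f(v, v) = a x^2 + 2 b x y + c y^2 for v = (x, y); trace of f is a + c.
a, b, c = 2.0, -0.7, 5.0

def q(x, y):
    return a * x * x + 2.0 * b * x * y + c * y * y

# Average over N equally spaced points of the unit circle.
N = 3600
average = sum(q(math.cos(2.0 * math.pi * k / N), math.sin(2.0 * math.pi * k / N))
              for k in range(N)) / N
expected = (a + c) / 2.0   # tr(f) / dim(V)

print(average, expected)
```

The off-diagonal coefficient b drops out of the average, which is why only the trace survives.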

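Corollary 4.30. Let $S_p(r) \subset T_pN$ denote the sphere of radius r. Then the second order Taylor expansion of the ratio of the (n − 1)-dimensional content of $\exp(S_p(r))$ to the (n − 1)-dimensional content of $S_p(r) \subset T_pN$ is

\[ \frac{\int_{\exp(S_p(r))} 1}{\int_{S_p(r)} 1} \overset{\varepsilon\cdot|r|^2}{\simeq} 1 - \frac{r^2}{3}Scal_p. \]

Stated more explicitly

\[ \int_{\exp(S_p(r))} 1 \overset{\varepsilon\cdot|r|^{n+1}}{\simeq} \left(1 - \frac{r^2}{3}Scal_p\right)\left(\frac{2\pi^{\frac{n+1}{2}}}{\Gamma\left(\frac{n+1}{2}\right)}r^{n-1}\right) \]

The proof follows immediately from our interpretation of Ricci curvature.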

Definition 4.31. The (unnormalized) Ricci flow is a differential equation on a smooth manifold (i.e. it is not given a Riemann metric) of the form

\[ \partial_t\mu = -2Ric. \]

$\mu : (-\varepsilon, \varepsilon) \times TN \oplus TN \to \mathbb{R}$ is required to be a smoothly-varying family of Riemann metrics on N, and $Ric : (-\varepsilon, \varepsilon) \times TN \oplus TN \to \mathbb{R}$ is required to satisfy that $Ric(t,\cdot)$ is the Ricci tensor for $(N, \mu(t,\cdot))$ for all times $t \in (-\varepsilon, \varepsilon)$.


When attempting to make sense of this differential equation geometrically, it's helpful to think of $\mu$ locally, through the lens of Theorem 4.26. This states that near any point $p \in N$ the volume distortion of the exponential map is $1 - \frac{1}{3}Ric(q, q)$, to second order in $q \in T_pN$. This means that the Ricci flow is literally attempting to undo the volume distortion in a direct manner: if the manifold is `getting smaller' in the $q$ direction, the Ricci flow attempts to enlarge the metric in that direction.

As the next example shows, there are unintended consequences if everyone attempts to do the same `smart' thing using only local data, everywhere, all simultaneously!

Note 4.32. If $N$ is a Riemann manifold with Riemann metric $\mu : TN \oplus TN \to \mathbb{R}$, then given a positive constant $A > 0$ the function $\tilde\mu : TN \oplus TN \to \mathbb{R}$ given by $\tilde\mu = A\mu$ is also a Riemann metric. The Levi-Civita connections for $(N, \mu)$ and $(N, \tilde\mu)$ are the same; this follows from the definition. Since the two manifolds have the same connection, they have the same holonomy and the same Riemann curvature tensors. The sectional curvatures are related by the formula

$$K_{\tilde\mu} = \frac{1}{A} K_\mu,$$

so the sphere of radius $r$ in a Euclidean space has sectional curvature $\frac{1}{r^2}$. Similarly, the Ricci tensors of $(N, \mu)$ and $(N, \tilde\mu)$ are identical, while the scalar curvatures are related by $Scal_{\tilde\mu} = \frac{1}{A} Scal_\mu$, since the trace is taken with respect to the metric.

Example 4.33. (Ricci flow on the spheres) We had earlier shown that the unit sphere in $\mathbb{R}^{n+1}$, $S^n$, has Ricci tensor

$$Ric(v, w) = (n-1)(v \cdot w),$$

where $v \cdot w$ is the Euclidean inner product on $\mathbb{R}^{n+1}$. The sphere of radius $r$ in $\mathbb{R}^{n+1}$ is isometric to the unit sphere in $\mathbb{R}^{n+1}$ once we give it the Riemann metric $\mu_r(v, w) = r^2 v \cdot w$. The isometry is a re-scaling. For the remainder of this example we will consider `the sphere of radius $r$' to be $(S^n, \mu_r)$ as a Riemann manifold. By Note 4.32, these manifolds have a common Ricci curvature. So the Ricci flow $\partial_t \mu_r = -2Ric$ with initial condition the unit sphere in $\mathbb{R}^{n+1}$ has a solution that stays inside the family of spheres (of various radii) in Euclidean space. Write $r(t)$ for the radius of the solution at time $t$. Since $\partial_t \mu_r = 2rr'(v \cdot w)$, the DE in $r(t)$ is

$$r r' = 1 - n,$$

whose solution is

$$r = \sqrt{1 - 2(n-1)t},$$

which terminates at $t = \frac{1}{2(n-1)}$ when the sphere's radius goes to zero.
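As a numerical sanity check of Example 4.33, the following Python sketch integrates $r r' = 1 - n$ from $r(0) = 1$ and compares the result with the closed form $\sqrt{1 - 2(n-1)t}$. The function names and the choice of the classical Runge-Kutta scheme are assumptions made for illustration only; they are not part of the notes.

```python
import math

def radius_closed_form(n, t):
    """Closed-form radius r(t) = sqrt(1 - 2(n-1)t) from Example 4.33."""
    return math.sqrt(1.0 - 2.0 * (n - 1) * t)

def radius_numeric(n, t_end, steps=10000):
    """Integrate r' = (1 - n)/r from r(0) = 1 with the classical RK4 scheme."""
    def f(r):
        return (1.0 - n) / r
    r, h = 1.0, t_end / steps
    for _ in range(steps):
        k1 = f(r)
        k2 = f(r + 0.5 * h * k1)
        k3 = f(r + 0.5 * h * k2)
        k4 = f(r + h * k3)
        r += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return r

def extinction_time(n):
    """Time at which the radius reaches zero, t = 1/(2(n-1))."""
    return 1.0 / (2.0 * (n - 1))

if __name__ == "__main__":
    n = 3
    t = 0.5 * extinction_time(n)  # halfway to the singular time
    print(radius_numeric(n, t), radius_closed_form(n, t))
```

The integration should only be run for times strictly before `extinction_time(n)`, since the right-hand side $(1-n)/r$ blows up as the radius approaches zero.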

Problem 4.34. Repeat Example 4.33 for a general Einstein manifold $N$. You will find a solution to the Ricci flow whose initial condition is the original Einstein manifold $N$. When does the solution exist for all $t \geq 0$? When does it terminate in finite time? If the maximal interval $[0, t)$ is finite, determine $t$.

Notice that the round sphere is a fixed point of this differential equation only up to re-scaling: the flow preserves its shape while its radius shrinks. The genuine fixed points of the unnormalized Ricci flow are simply manifolds with $Ric = 0$.

Example 4.35. Consider the manifold $S^n \times S^m$, given the product Riemann metric, where $S^n$ has the geometry of the sphere of radius $r_1$ and $S^m$ the sphere of radius $r_2$ in Euclidean space. We will compute the unnormalized Ricci flow on this manifold. As in Example 4.33, a solution stays within this family of manifolds, so can be described in terms of the two radii $r_1(t)$ and $r_2(t)$. Starting from unit radii, the differential equation has solution $r_1 = \sqrt{1 - 2(n-1)t}$, $r_2 = \sqrt{1 - 2(m-1)t}$. The result is that the higher-dimensional sphere has its radius collapse faster than the lower-dimensional sphere (if they are of different dimensions).


The act of re-scaling the metric as in Note 4.32 is sometimes called a homothety. Given a solution to the Ricci flow, in principle one could re-scale the metric on $S^n \times S^m$ so that the Riemann metric always has constant volume. This would no longer be a solution to the Ricci flow, but it helps to get a sense for what is happening. If one did this, and if $n < m$, the $S^m$ factor would have radius going to zero, while the $S^n$ factor would have radius going to infinity to compensate. Moreover, this re-scaled flow would satisfy a new but related differential equation, called the volume normalized Ricci flow. It can be written explicitly as

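$$\frac{\partial \mu}{\partial t} = -2Ric + \frac{2}{n} Avg(R) \cdot \mu$$

where $Avg(R) \equiv Avg_{(N,\mu)}(R)$ is the scalar-valued function of $t$ which is the average of the scalar curvature $R$ over the Riemann manifold $(N, \mu)$, i.e.

$$Avg_{(N,\mu)}(R) = \frac{\int_N R \, dV}{\int_N 1 \, dV} = \frac{\int_N R \, dV}{Vol(N, \mu)}.$$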

We leave it as an exercise to check that solutions to the volume-normalized Ricci flow are re-parametrizations (in $t$) and rescalings (of $\mu$ for each $t$) of the un-normalized Ricci flow. Along the volume-normalized flow the volume of the Riemann manifold is constant. The main downside to this differential equation is it only makes sense if $N$ is compact.

Exercises

Problem 4.36. Prove that the $k$-twisted Möbius band from Example 4.3 has zero Riemann curvature.

Problem 4.37. Let $X, Y, Z, W : N \to TN$ be vector fields on a Riemann manifold with Levi-Civita connection $c : T^2N \to TN$. Determine a formula for $\nabla_W R(X, Y)Z$ in the spirit of Proposition 4.13. Is it as simple as

$$\nabla_W R(X, Y)Z = (\nabla R)(W, X, Y, Z) + R(\nabla_W X, Y)Z + R(X, \nabla_W Y)Z + R(X, Y)(\nabla_W Z)$$

or is it more complicated? Keep in mind, the formula in Proposition 4.13 is being evaluated on a function whose partial derivatives are vector fields that commute, in the sense that the mixed partials satisfy $\frac{\partial^2 f}{\partial x \partial y} = \iota_N \frac{\partial^2 f}{\partial y \partial x}$. Our vector fields are not such special double tangent vectors.

Problem 4.38. The second Bianchi identity states $\nabla R(X, Y, Z, U) + \nabla R(Y, Z, X, U) + \nabla R(Z, X, Y, U) = 0$. Give a proof of this. Hint: You could use the result from Problem 4.37 as your definition of $\nabla R(X, Y, Z, W)$ to convert this problem into a find-the-Jacobi-identity puzzle.

Problem 4.39. Write up a complete proof for Problem 4.8.

Problem 4.40. Write up a complete argument that the Riemann curvature tensor of $H^n$ is as described in Example 4.7.

Problem 4.41. Complete Problem 4.14.

Problem 4.42. A manifold was defined to have constant curvature if the sectional curvature is constant. In a sense this is a definition of convenience, as the Riemann tensor is not a real-valued function, so how would you compare the tensor at one point to the tensor at another? This problem gives you various notions of `constant curvature' to consider, to compare and contrast.

(a) Given a path $\gamma : [0, 1] \to N$ with $\gamma(0) = p$ and $\gamma(1) = q$ the holonomy gives an isometry $Hol(\gamma, \cdot) : T_pN \to T_qN$. Let $R_p$ and $R_q$ be the Riemann tensors at $p$ and $q$ respectively. Let $f_\gamma = Hol(\gamma, \cdot)$. By design, the function $f_\gamma^* R_q : \oplus_3 T_pN \to T_pN$ is a tri-linear function on $T_pN$, just as $R_p : \oplus_3 T_pN \to T_pN$ is, where we define $(f_\gamma^* R_q)(v, w, z) = f_\gamma^{-1}(R_q(f_\gamma(v), f_\gamma(w), f_\gamma(z)))$. This leads to a competing notion of constant curvature. We say the manifold $N$ is of constant curvature if $f_\gamma^* R_q = R_p$ for all paths $\gamma$ in $N$.

(b) The same condition as in (a), but instead of using arbitrary paths one restricts to loops contained in a normal coordinate patch.

(c) $R(v, w)R(y, z) - R(y, z)R(v, w) = R(R(v, w)y, z) + R(y, R(v, w)z)$. Hint: Suppose the Riemann tensor is invariant under the holonomy along a small closed loop, $f_\gamma^* R = R$. You have an expression for the 2nd order Taylor expansion of such a holonomy.

(d) $\nabla R = 0$.

Work out the chain of implications between these notions of constant curvature: determine if one implies the other, which ones are equivalent, which ones are not.

Problem 4.43. Repeat Example 4.35 on the product R× S2.

Problem 4.44. Show that a three-dimensional Einstein manifold has constant sectional curvature.

Problem 4.45. Consider the connection on the complex plane given by

$$c(p, v, w, y) = (p, y - 2((v \cdot p)(w \cdot ip) + (v \cdot ip)(w \cdot p))(ip) + ((v \cdot ip)(w \cdot ip))p).$$

In the above equation, if complex numbers appear concatenated, it means complex multiplication. So $ip$ means $p$ rotated counter-clockwise by $\pi/2$, and $v \cdot p$ means the Euclidean dot product.

(a) Compute the holonomy around the unit circle $S^1 \subset \mathbb{C}$, with the counter-clockwise orientation.

(b) Argue that the connection $c$ can not be the Levi-Civita connection of any Riemann metric on $\mathbb{C}$.

Problem 4.46. Compute the Riemann curvature tensor for the connection in Problem 4.45.

Problem 4.47. Consider the `torus' in $\mathbb{R}^3$

$$\left\{ (x, y, z) \in \mathbb{R}^3 : \left(\sqrt{x^2 + y^2} - 2\right)^2 + z^2 = 1 \right\}.$$

Compute the Riemann curvature tensor for the Levi-Civita connection. Compute also the Sectional / Gauss curvature for this surface. Sketch the surface as well as where the Sectional curvature is positive, negative and zero. Hint: Feel free to use the formula in Problem 4.8, as this avoids the computation of the connection. You might want to manipulate the above formula to an equivalent one without the square root!



Problem 4.48. Compute the Riemann curvature tensor for $S^1 \times S^1 \subset \mathbb{C}^2$, thinking of it as a submanifold of $\mathbb{C}^2 \equiv \mathbb{R}^4$, Euclidean space.

Problem 4.49. Compute the Riemann curvature tensor for the Klein bottle manifold

$$K^2 = \{ ((\cos\theta, \sin\theta), (x, y\cos(\theta/2), y\sin(\theta/2))) : x^2 + y^2 = 1, \theta \in \mathbb{R} \} \subset \mathbb{R}^5.$$

Problem 4.50. Consider the symmetric bilinear function on $\mathbb{R}^{2n}$ given by $v \cdot w = -v_1w_1 - v_2w_2 - \cdots - v_nw_n + v_{n+1}w_{n+1} + \cdots + v_{2n}w_{2n}$. Consider the `unit sphere'

$$S = \{ p \in \mathbb{R}^{2n} : p \cdot p = -1 \}.$$

Show that the above bilinear form pulls-back to a non-degenerate bilinear form on $S$ of signature $(n, n-1)$.

Problem 4.51. Use the definition of Ricci curvature to argue that the Ricci curvature of an $n$-manifold for $n \leq 3$ determines the Riemann curvature. Give an example in dimension $n = 4$ where the Ricci curvature does not determine the Riemann curvature.

Problem 4.52. Give an example where the Riemann curvature does not determine the Riemann metric. Hint: we have computed how re-scaling a metric affects the Riemann tensor.

Appendix: Volumes of Spheres

In these notes there are various instances where certain numbers pop up, like the content of a

sphere or ball in Euclidean space. Although we do not need to know closed form descriptions of these

numbers, they are not particularly difficult to compute. As they appear in many different places in the

literature and they're something of a curiosity of their own, we outline their computation here.

Theorem 4.53. The $n$-dimensional measure of $S^n \subset \mathbb{R}^{n+1}$ is

$$\int_{S^n} 1 = \frac{2\pi^{\frac{n+1}{2}}}{\Gamma(\frac{n+1}{2})},$$

where $\Gamma(z) = \int_0^\infty t^{z-1}e^{-t}dt$ is the Gamma Function.

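Proof. Consider the improper integral

$$\int_{\mathbb{R}^{n+1}} e^{-|p|^2} dp.$$

We have two ways to apply Fubini's theorem to this integral. There is the change of variables $\phi : S^n \times [0, \infty) \to \mathbb{R}^{n+1}$ given by $\phi(v, r) = rv$. This allows us to write

$$\int_{\mathbb{R}^{n+1}} e^{-|p|^2} dp = \int_0^\infty r^n e^{-r^2} dr \int_{S^n} 1,$$

which gives us `access' to $\int_{S^n} 1$, which is the content of the sphere. Alternatively, write $p = (x_0, x_1, \cdots, x_n)$. We can apply Fubini's theorem to deduce

$$\begin{aligned}
\int_{\mathbb{R}^{n+1}} e^{-|p|^2} dp &= \int_{\mathbb{R}^{n+1}} e^{-x_0^2 - x_1^2 - \cdots - x_n^2} dx_0 dx_1 \cdots dx_n \\
&= \int_{\mathbb{R}} e^{-x_0^2} dx_0 \int_{\mathbb{R}} e^{-x_1^2} dx_1 \cdots \int_{\mathbb{R}} e^{-x_n^2} dx_n \\
&= \left( \int_{\mathbb{R}} e^{-x^2} dx \right)^{n+1}.
\end{aligned}$$

Notice that the integral $\int_0^\infty r e^{-r^2} dr$ is perfectly computable using a substitution $u = -r^2$,

$$\int_0^\infty r e^{-r^2} dr = \frac{-1}{2} \int_0^{-\infty} e^u du = \frac{1}{2}.$$

So, taking $n = 1$ in the first formula and using $\int_{S^1} 1 = 2\pi$, we conclude

$$\int_{\mathbb{R}^2} e^{-|p|^2} dp = \pi$$

and

$$\int_{\mathbb{R}^{n+1}} e^{-|p|^2} dp = \left( \int_{\mathbb{R}^2} e^{-|p|^2} dp \right)^{\frac{n+1}{2}} = \pi^{\frac{n+1}{2}}$$

and

$$\int_{S^n} 1 = \frac{\pi^{\frac{n+1}{2}}}{\int_0^\infty r^n e^{-r^2} dr}.$$

If we perform a $u = r^2$ substitution in this integral we get

$$\int_0^\infty r^n e^{-r^2} dr = \frac{1}{2} \int_0^\infty u^{\frac{n-1}{2}} e^{-u} du.$$

The function $\Gamma(z) = \int_0^\infty t^{z-1}e^{-t}dt$ is known as the Gamma Function. It's a commonly-occurring function in complex analysis and number theory, and satisfies the key properties $\Gamma(z+1) = z\Gamma(z)$ and $\Gamma(1) = 1$, so by induction $\Gamma(n) = (n-1)!$ provided $n \in \mathbb{N}$. This leaves us with the equation

$$\int_{S^n} 1 = \frac{2\pi^{\frac{n+1}{2}}}{\Gamma(\frac{n+1}{2})}.$$

Since we know the length of the unit circle to be $2\pi$, this gives $\Gamma(1) = 1$. On the extreme end we could argue that $S^0$ has two points, so its content should be two, and $\Gamma(1/2)$ must then be $\sqrt{\pi}$. Alternatively, we could argue that the content of $S^2$ is known to be $4\pi$ via a spherical polar coordinates argument. This allows us to solve for $\Gamma(3/2) = \frac{\sqrt{\pi}}{2}$. At this stage one either accepts or verifies the recursion relation $\Gamma(z+1) = z\Gamma(z)$. This allows computation of $\Gamma(\frac{n+1}{2})$ for all $n$, giving

$$\mu_1 S^1 = 2\pi, \quad \mu_2 S^2 = 4\pi, \quad \mu_3 S^3 = 2\pi^2, \quad \mu_4 S^4 = \frac{8}{3}\pi^2, \quad \mu_5 S^5 = \pi^3, \cdots$$

So the sphere of radius $r$ in $\mathbb{R}^5$ has 4-dimensional measure $\mu_4 = \frac{8}{3}\pi^2 r^4$.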


A corollary is that the $n$-dimensional content of the sphere of radius $r$ in $\mathbb{R}^{n+1}$ is

$$\mu_n S^n(r) = \frac{2\pi^{\frac{n+1}{2}}}{\Gamma(\frac{n+1}{2})} r^n$$

and the $n$-dimensional content of the ball of radius $r$, $B^n(r) = \{ p \in \mathbb{R}^n : |p| \leq r \}$, in Euclidean space is given by

$$\mu_n B^n(r) = \frac{2}{n} \frac{\pi^{\frac{n}{2}}}{\Gamma(\frac{n}{2})} r^n.$$

A rather elementary fact one can check is that the two limits are zero,

$$\lim_{n \to \infty} \mu_n B^n(r) = 0, \qquad \lim_{n \to \infty} \mu_n S^n(r) = 0,$$

regardless of what $r$ is. It's quite common to find this near paradoxical at first.
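The closed forms for the content of spheres and balls are easy to evaluate numerically. The sketch below uses `math.gamma` from the Python standard library; the function names `sphere_content` and `ball_content` are hypothetical, chosen only for this illustration.

```python
import math

def sphere_content(n, r=1.0):
    """n-dimensional content of the sphere S^n(r) of radius r in R^(n+1)."""
    return 2.0 * math.pi ** ((n + 1) / 2.0) / math.gamma((n + 1) / 2.0) * r ** n

def ball_content(n, r=1.0):
    """n-dimensional content of the ball B^n(r) of radius r in R^n."""
    return (2.0 / n) * math.pi ** (n / 2.0) / math.gamma(n / 2.0) * r ** n

if __name__ == "__main__":
    # Print a few unit-radius values to watch the eventual decay in n.
    for n in (1, 2, 3, 4, 5, 10, 20, 50, 100):
        print(n, sphere_content(n), ball_content(n))
```

Printing the values for increasing $n$ shows the content of the unit sphere growing for a few dimensions before decaying rapidly, in line with the two limits above. Note that `math.gamma` overflows for arguments beyond roughly 171, so this sketch is only suitable for moderate $n$.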

We need one more standard integral on spheres.

Theorem 4.54. Let $A$ be a symmetric $(n+1) \times (n+1)$ matrix with real entries. The integral of $p \longmapsto p^tAp$ over $S^n$ can be computed as

$$\int_{S^n} p^tAp \, dp = tr(A) \frac{2}{n+1} \frac{\pi^{\frac{n+1}{2}}}{\Gamma(\frac{n+1}{2})}.$$

Proof. By the spectral theorem, there is a matrix $M \in SO_{n+1}$ such that $M^{-1}AM = B$ is diagonal. So by a change of variables we have

$$\int_{S^n} p^tAp \, dp = \int_{S^n} p^tBp \, dp = \int_{S^n} \sum_{i=0}^n b_{ii} x_i^2 \, dp.$$

There are orthogonal transformations of $\mathbb{R}^{n+1}$ that permute the coordinate lines, so $\int_{S^n} x_i^2 dp = \int_{S^n} x_j^2 dp$ for all $i, j$. Moreover, since $x_0^2 + \cdots + x_n^2$ is constant and equal to 1 on $S^n$ we have $\int_{S^n} x_i^2 dp = \frac{1}{n+1} \int_{S^n} 1 dp$. Therefore

$$\int_{S^n} p^tAp \, dp = tr(B) \frac{1}{n+1} \int_{S^n} 1 dp = tr(A) \frac{\int_{S^n} 1 dp}{n+1}.$$

Of course, the message to get out of this theorem is that the integral of $p \longmapsto p^tAp$ over the unit sphere is some constant (that only depends on the dimension of the sphere) times the trace.

Corollary 4.55. The average value of the function $p \longmapsto p^tAp$ on $S^n$ is $\frac{1}{n+1} tr(A)$.

Notice, this number is also the average of the diagonal entries in $A$. Moreover, notice that in the above argument we could have deduced this statement without ever computing $\mu_n S^n$.
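Corollary 4.55 can be tested empirically. The sketch below estimates the average of $p \longmapsto p^tAp$ by Monte Carlo sampling, producing uniform points on the sphere by normalizing Gaussian vectors. This sampling technique, the sample size, the seed and the function names are all assumptions of this illustration rather than anything used in the notes.

```python
import math
import random

def average_quadratic_form(A, samples=20000, seed=0):
    """Monte Carlo average of p -> p^T A p over the unit sphere in R^(n+1).

    A is a square matrix given as a list of rows. Uniform points on the
    sphere are obtained by normalizing standard Gaussian vectors.
    """
    rng = random.Random(seed)
    dim = len(A)
    total = 0.0
    for _ in range(samples):
        g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in g))
        p = [x / norm for x in g]
        total += sum(p[i] * A[i][j] * p[j] for i in range(dim) for j in range(dim))
    return total / samples

def trace(A):
    """Sum of the diagonal entries of A."""
    return sum(A[i][i] for i in range(len(A)))

if __name__ == "__main__":
    A = [[1.0, 0.5, 0.0], [0.5, 2.0, 0.0], [0.0, 0.0, 3.0]]
    print(average_quadratic_form(A), trace(A) / len(A))
```

The estimate should agree with `trace(A) / len(A)` only up to sampling error, which shrinks like the inverse square root of the number of samples.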


5. The constant curvature geometries

The point of this section is to come to a basic understanding of Riemann manifolds of constant sectional curvature. To state the theorem we need the notion of simply connected. A manifold $N$ is simply connected if it is connected and any two paths in $\Omega_{p,q}N$ are homotopic. The latter statement, that any two paths in $\Omega_{p,q}N$ (for any $p$ and any $q$ in $N$) are homotopic, is equivalent to the statement that any two paths in $\Omega_{p,p}N$ are homotopic (for any choice of $p \in N$). This follows from the groupoid property of the sets $\{\Omega_{p,q} : p, q \in N\}$ and compatibility with the homotopy relation. In algebraic topology one would say `$N$ is simply connected' by saying both $\pi_0N = *$ and $\pi_1N = *$, where $\pi_0N$ is the set of path-components of $N$ and $\pi_1N$ is the `fundamental group' of $N$.

Definition 5.1. If $N$ with Riemann metric $\mu$ has constant sectional curvature $\kappa$, the normalized Riemann metric on $N$ is defined as $\tilde\mu = \mu$ provided $\kappa = 0$, and if $\kappa \neq 0$ we define $\tilde\mu = |\kappa|\mu$.

Thus, with its normalized Riemann metric, a Riemann manifold with constant sectional curvature has sectional curvature $\kappa \in \{-1, 0, 1\}$.

Theorem 5.2. If $N$ is a complete, simply-connected Riemann manifold with constant sectional curvature $\kappa = 0$ it is isometric to Euclidean space $\mathbb{R}^n$.

If $N$ is a complete, simply-connected Riemann manifold with constant sectional curvature $\kappa = 1$ it is isometric to the round sphere $S^n$.

If $N$ is a complete, simply-connected Riemann manifold with constant sectional curvature $\kappa = -1$ it is isometric to hyperbolic space $H^n$.

Theorem 5.2 should probably not come as a surprise by now. In Theorem 4.19 we saw that the sectional curvature determines the Riemann tensor, so sectional curvature $\kappa = 0$ implies the Riemann tensor is constant zero. Theorem 4.11 tells us the manifold is locally isometric to $\mathbb{R}^n$, so in the $\kappa = 0$ case we need to only show simple-connectivity allows us to extend this isometry to a global isometry. One might expect the $\kappa = \pm 1$ cases to be similar, and they are. One would expect simple-connectivity in the statement of the theorem, since manifolds like $S^1 \times S^1$ are flat.

Proof. Let's begin with the $\kappa = 0$ case. By Theorem 4.11 we know $N$ is locally isometric to Euclidean space, and the exponential map $\exp : T_pN \to N$ is an isometry between a neighbourhood of $0 \in T_pN$ and a neighbourhood of $p \in N$. Our argument in Theorem 4.11 proves that the derivative of the exponential map is an isometry at every point of $T_pN$, and so the exponential map is locally a diffeomorphism at every point in $T_pN$. Even more, it is a covering map. So by the theory of covering spaces, the only way $\exp$ could fail to be one-to-one is if $N$ has a non-trivial fundamental group, which we have ruled-out by assumption.

We wish to make a comparable argument but for hyperbolic and spherical geometries. The most direct argument is to show $\exp^*(\mu)$ is the same for both the model geometry and the constant-curvature geometry, and use this to define an isometry between the corresponding manifolds. We continue the conventions as in the proof of Theorem 4.12. Notice that

$$\nabla_r^2 \frac{\partial\gamma}{\partial x} = \nabla_r \nabla_x \frac{\partial\gamma}{\partial r} = \nabla_x \nabla_r \frac{\partial\gamma}{\partial r} + R\left(\frac{\partial\gamma}{\partial x}, \frac{\partial\gamma}{\partial r}\right)\frac{\partial\gamma}{\partial r} = \kappa\left(\mu\left(\frac{\partial\gamma}{\partial x}, \frac{\partial\gamma}{\partial r}\right)\frac{\partial\gamma}{\partial r} - \frac{\partial\gamma}{\partial x}\right)$$

by Proposition 4.20. The function $\mu(\frac{\partial\gamma}{\partial x}, \frac{\partial\gamma}{\partial r})$ is constant in $r$ (simply differentiate it), so it is constant zero. Thus, we have $\nabla_r^2 \frac{\partial\gamma}{\partial x} = -\kappa \frac{\partial\gamma}{\partial x}$. If we plug this into our formula for $g''$ we get

$$g'' = -4\kappa g + 2\mu(v, w).$$


This gives the solution

$$g(r) = \mu(v, w) \cdot \begin{cases} r^2 & \text{if } \kappa = 0 \\ \sin^2(r) & \text{if } \kappa = 1 \\ \sinh^2(r) & \text{if } \kappa = -1 \end{cases}$$

Thus the pull-back of the Riemann metric via the exponential map only depends on the curvature, and is therefore the same as in the canonical constant curvature geometries $\mathbb{R}^n$, $S^n$ and $H^n$.

To see that if the manifold is simply-connected it is isometric to the model geometry, the idea is that in both $N$ and the model geometry ($\mathbb{R}^n$, $H^n$ or $S^n$) we have exponential maps, and the pull-backs in the case of $N$ and the model geometry are the same (up to an isometry between the tangent spaces). So the idea is to define a map from the model geometry to $N$ via an isometry of the tangent spaces, together with the exponential maps.
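The three solutions of $g'' = -4\kappa g + 2\mu(v, w)$ quoted in the proof can be checked numerically. The sketch below normalizes $\mu(v, w) = 1$ and approximates $g''$ by a central difference; the normalization, the step size and the function names are assumptions made for this illustration.

```python
import math

def g(kappa, r):
    """Candidate solution of g'' = -4*kappa*g + 2, with mu(v, w) normalized to 1."""
    if kappa == 0:
        return r * r
    if kappa == 1:
        return math.sin(r) ** 2
    if kappa == -1:
        return math.sinh(r) ** 2
    raise ValueError("kappa must be -1, 0 or 1")

def ode_residual(kappa, r, h=1e-4):
    """g'' + 4*kappa*g - 2 at r, with g'' from a central finite difference."""
    second = (g(kappa, r + h) - 2.0 * g(kappa, r) + g(kappa, r - h)) / (h * h)
    return second + 4.0 * kappa * g(kappa, r) - 2.0

if __name__ == "__main__":
    for kappa in (-1, 0, 1):
        print(kappa, ode_residual(kappa, 0.7))
```

The residual should vanish up to discretization and rounding error for each $\kappa \in \{-1, 0, 1\}$, and each candidate also satisfies $g(0) = 0$.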

Consider the case $\kappa = 1$, and fix $x \in S^n$. Let $\exp_x : T_xS^n \to S^n$ and $\exp_p : T_pN \to N$ be the corresponding exponential maps, and let $L : T_xS^n \to T_pN$ be any linear isometry. Then we claim the map $\phi : S^n \to N$ defined by $\phi(q) = \exp_p(L(\exp_x^{-1}(q)))$ is well-defined and an isometry. Certainly this map is well-defined provided $|\exp_x^{-1}(q)| \leq \pi$. The key to finishing the argument is showing that $\phi$ is a covering-map, meaning that it is a submersion, and its fibers are discrete. We leave this as an exercise. The basic theory of covering spaces kicks in and tells us that a covering map from a simply-connected space to another simply-connected space must be a homeomorphism. If this is unknown to the reader, we leave it as an unproved statement that one can prove `by hand'.

We end this section with some examples of constant curvature geometries. Hyperbolic geometry is the most foreign. To get a sense for how to construct hyperbolic manifolds it's helpful to be accustomed to some alternative models. We have seen two models of hyperbolic space, the sphere in Minkowski space (also called the hyperboloid model) $H^n$, and the Poincaré ball model $P^n_H$ which was the unit ball in $\mathbb{R}^n$ with the metric $\mu(p, v, w) = \frac{4}{(1-|p|^2)^2} v \cdot w$. We give two more models of hyperbolic space, the upper half-space model, and the Klein model.

Example 5.3. [Figure: the identification between the hyperboloid model / Minkowski sphere $H^n$ and the Poincaré ball $P^n_H$.]

Definition 5.4. The upper half-space model for hyperbolic space is

$$U^n_H = \{ (x_1, \cdots, x_n) \in \mathbb{R}^n : x_n > 0 \}$$

with metric $\mu(p, v, w) = \frac{1}{x_n^2} v \cdot w$ provided $p = (x_1, \cdots, x_n)$.

In Problem 5.36 you are asked to show the surface of rotation of the tractrix is locally isometric to the upper-half space model of the hyperbolic plane $H^2$. This was historically the first `concrete' realization of hyperbolic geometry.

The Klein model is a variant of the Poincaré model. In the Poincaré model, we used stereographic projection from $H^n$ to the unit ball $\{0\} \times B^n$, via the stereographic projection point $(-1, 0)$. In the Klein model, one uses the origin as the projection point, and projects to the translated ball $\{1\} \times B^n$.

Definition 5.5. The Klein model $K^n_H$ is the ball $B^n$ with Riemann metric

$$\mu(p, v, w) = \frac{1}{(1-|p|^2)^2}\left((1-|p|^2) v \cdot w + (p \cdot v)(p \cdot w)\right).$$

Example 5.6. [Figure: the identification between the hyperboloid model / Minkowski sphere $H^n$ and the Klein model/ball $K^n_H$.]

Proposition 5.7. All four Riemann manifolds, $H^n$, $U^n_H$, $P^n_H$ and $K^n_H$ are isometric.

Proof. In chapter 3 of the notes we constructed an isometry $P^n_H \to H^n$. We recall the isometry. The key observation was that the Euclidean straight line between the point $(-1, 0)$ and a point $(0, p)$ where $p \in B^n$ passes through $H^n$ at a unique point, and that point is the image of $p$ under our map. Specifically, $p \longmapsto \frac{1}{1-|p|^2}(1 + |p|^2, 2p)$.


The isometry $K^n_H \to H^n$ is given by $f(p) = \frac{1}{\sqrt{1-|p|^2}}(1, p)$. The way to think of this is one takes the straight line from $0$ through the point $(1, p) \in \{1\} \times K^n_H$. This line intersects the hyperboloid in a unique point, being $f(p)$. One can check this map is an isometry by computing the derivative

$$Df_p(v) = \frac{1}{(1-|p|^2)^{3/2}}\left(p \cdot v, (p \cdot v)p + (1-|p|^2)v\right).$$

In fact, the Klein metric is frequently just defined as the pull-back of the hyperboloid's metric via this identification.
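The claim that $f(p) = \frac{1}{\sqrt{1-|p|^2}}(1, p)$ pulls the Minkowski form back to the Klein metric of Definition 5.5 can be spot-checked numerically. The sketch below differentiates $f$ by central finite differences instead of using the closed form for $Df_p$; the helper names and the step size are assumptions of this illustration.

```python
import math

def dot(a, b):
    """Euclidean dot product."""
    return sum(x * y for x, y in zip(a, b))

def minkowski(a, b):
    """Minkowski form -a0*b0 + a1*b1 + ... + an*bn."""
    return -a[0] * b[0] + dot(a[1:], b[1:])

def klein_to_hyperboloid(p):
    """f(p) = (1, p) / sqrt(1 - |p|^2)."""
    s = 1.0 / math.sqrt(1.0 - dot(p, p))
    return [s] + [s * x for x in p]

def klein_metric(p, v, w):
    """Klein metric of Definition 5.5 at the point p on the vectors v, w."""
    c = 1.0 - dot(p, p)
    return (c * dot(v, w) + dot(p, v) * dot(p, w)) / (c * c)

def pullback_metric(p, v, w, h=1e-6):
    """Minkowski form applied to finite-difference approximations of Df_p."""
    def derivative(u):
        plus = klein_to_hyperboloid([x + h * y for x, y in zip(p, u)])
        minus = klein_to_hyperboloid([x - h * y for x, y in zip(p, u)])
        return [(a - b) / (2.0 * h) for a, b in zip(plus, minus)]
    return minkowski(derivative(v), derivative(w))

if __name__ == "__main__":
    p, v, w = [0.3, -0.2], [1.0, 0.5], [-0.4, 2.0]
    print(pullback_metric(p, v, w), klein_metric(p, v, w))
```

The two printed numbers should agree up to finite-difference error, and `minkowski(f, f)` for `f = klein_to_hyperboloid(p)` should equal $-1$ up to rounding, confirming that the image lies on the hyperboloid.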

The isometry between $U^n_H$ and $P^n_H$ is given by an inversion construction. Recall from chapter 3 that stereographic projection $\pi_p : S^n \setminus \{p\} \to T_pS^n$ is the restriction to the sphere of inversion about the sphere with radius $\sqrt{2}$ centred at $p$. The map $i_{p,r}(x) = p + \frac{r^2}{|x-p|^2}(x - p)$ is called inversion about the sphere of radius $r$ centred at $p$, so $\pi_p = i_{p,\sqrt{2}}$ on the sphere. Notice that $|q| < 1$ if and only if $i_{p,\sqrt{2}}(q) \cdot p < 0$. So our isometry between $U^n_H$ and $P^n_H$ is given by the map $P^n_H \to U^n_H$, $q \longmapsto i_{-e_n,\sqrt{2}}(q)$, where $e_n = (0, 0, \cdots, 0, 1)$. As in chapter 3 we can compute the derivative of this map,

$$D(i_{-e_n,\sqrt{2}})_q(v) = \frac{2}{|q+e_n|^2}\left(v - 2\left(v \cdot \frac{q+e_n}{|q+e_n|}\right)\frac{q+e_n}{|q+e_n|}\right),$$

which is a multiple of the mirror reflection map along the plane orthogonal to $q + e_n$. A direct computation shows this is an isometry between the two Riemann metrics.

The upper half-space and Poincaré ball models have the advantage that the angles between vectors

are the same as their Euclidean angles. The Klein model has the advantage that geodesics in the Klein

model are Euclidean straight lines. Geodesics in the half-space and Poincaré ball models turn out to

be a special class of round circles.

Example 5.8. [Figure: the identification between the Poincaré ball $P^n_H$ and the upper half-space model $U^n_H$, with the points $-e_n$ and $0$ marked.]

Proposition 5.9. The (non-constant) geodesics in $H^n$ are the intersections with $H^n$ of 2-dimensional linear subspaces of Minkowski space that contain both a timelike and a lightlike vector. One can parametrize them explicitly as $\gamma_{p,v}(t) = \cosh(kt)p + \sinh(kt)v$, where $p \cdot p = -1$, $v \cdot v = 1$, $p \cdot v = 0$, and $|k|$ is the speed of the curve, $k \in \mathbb{R}$.

The geodesics in the Poincaré model $P^n_H$ are the round circles in $\mathbb{R}^n$ which intersect the boundary sphere at right angles (with respect to the Euclidean metric on $\mathbb{R}^n$).

The geodesics in the upper half-space model $U^n_H$ are the round circles in $\mathbb{R}^n$ that intersect the boundary $\mathbb{R}^{n-1} \times \{0\}$ in right angles.

The geodesics in the Klein model $K^n_H$ are the Euclidean straight lines, intersected with the ball.


Proof. We prove the statement for the Poincaré model. The claim for the upper half-space model follows from inversion preserving `roundness' of circles. The statement for the Klein model follows from our characterization of geodesics in $H^n$, from chapter 3.

Since a unit-speed geodesic in $H^n$ has the form $\gamma(t) = \cosh(t)p + \sinh(t)v$, and since our isometry $P^n_H \to H^n$ had the form $p \longmapsto \frac{1}{1-|p|^2}(1 + |p|^2, 2p)$, we compute the inverse. Write $p \in H^n$ as $p = (p_0, \vec{p})$; then the inverse map is $(p_0, \vec{p}) \longmapsto \frac{1}{1+p_0}\vec{p}$. Composing $\gamma$ with this map gives

$$t \longmapsto \frac{\cosh(t)\vec{p} + \sinh(t)\vec{v}}{\cosh(t)p_0 + \sinh(t)v_0 + 1}.$$

By reparametrizing via the map $t \longmapsto t - \mathrm{arctanh}(\frac{v_0}{p_0})$ we can ensure that our geodesic $\gamma$ satisfies $v_0 = 0$, and so $\vec{p} \cdot \vec{v} = 0$ and $\vec{v} \cdot \vec{v} = 1$, giving us the geodesic parametrized as

$$t \longmapsto \frac{\cosh(t)\vec{p} + \sinh(t)\vec{v}}{p_0\cosh(t) + 1}.$$

This is a parametrization of the segment of the circle (the part in the Poincaré ball) of radius $\frac{1}{\sqrt{p_0^2 - 1}}$ centred at $\frac{p_0}{p_0^2 - 1}\vec{p}$ in the plane spanned by $\vec{p}, \vec{v}$. If $p_0 = 1$ it is of course a straight line, as $\vec{p} = 0$ in that case.
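The circle described at the end of the proof can be verified on sample data. The sketch below evaluates the projected geodesic and compares it with the claimed centre and radius; the function names and the specific sample values (such as $p_0 = 2$) are hypothetical choices made for illustration. It assumes the normalization $v_0 = 0$, $\vec{p} \cdot \vec{v} = 0$, $|\vec{v}| = 1$ and $|\vec{p}|^2 = p_0^2 - 1$ from the proof.

```python
import math

def poincare_geodesic_point(p0, p_vec, v_vec, t):
    """Point (cosh(t) p + sinh(t) v) / (p0 cosh(t) + 1) of the projected geodesic."""
    d = p0 * math.cosh(t) + 1.0
    return [(math.cosh(t) * a + math.sinh(t) * b) / d for a, b in zip(p_vec, v_vec)]

def circle_data(p0, p_vec):
    """Centre p0/(p0^2 - 1) * p_vec and radius 1/sqrt(p0^2 - 1) claimed in the proof."""
    scale = p0 / (p0 * p0 - 1.0)
    return [scale * a for a in p_vec], 1.0 / math.sqrt(p0 * p0 - 1.0)

def distance(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

if __name__ == "__main__":
    p0, p_vec, v_vec = 2.0, [math.sqrt(3.0), 0.0], [0.0, 1.0]
    centre, radius = circle_data(p0, p_vec)
    for t in (-2.0, 0.0, 1.5):
        x = poincare_geodesic_point(p0, p_vec, v_vec, t)
        print(t, distance(x, centre), radius)
```

For a circle with centre $c$ and radius $R$, meeting the unit sphere at right angles is equivalent to $|c|^2 = 1 + R^2$, which gives a second quick consistency check on `circle_data`.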

Hyperbolic geometry has a feature called the visual sphere which is not present in spherical or

Euclidean geometry. We start with an observation.

Proposition 5.10. Let $\mathbb{R}^{n+1}$ denote Minkowski space, $H^n \subset \mathbb{R}^{n+1}$. A vector $p \in \mathbb{R}^{n+1}$ is on the light cone if $p \cdot p = -p_0^2 + p_1^2 + \cdots + p_n^2 = 0$. We say the vector is time-positive if $p_0 > 0$. Given $p \in H^n$ and a unit tangent vector $v \in T_pH^n$, consider the 2-dimensional subspace of $\mathbb{R}^{n+1}$ spanned by $p$ and $v$. This is a 2-dimensional vector space with a bilinear form, isomorphic to a 2-dimensional Minkowski space. From the pair $(p, v)$ we can form the pair $(p + v, p - v)$. Notice that these two vectors are time-positive vectors on the light cone; moreover we can recover $(p, v)$ from this pair by taking averages: $p$ is the average of the pair and $v$ is half of their difference. This gives us a bijection between unit tangent vectors $(p, v) \in TH^n$ and pairs of time-positive points on the light cone $(q, w)$ such that $q \cdot w = -2$.

Consider a unit-speed geodesic $\gamma_{p,v}(t) = \cosh(t)p + \sinh(t)v$ in $H^n$. We can perform the above construction on the pair $(\gamma_{p,v}(t), \gamma'_{p,v}(t))$ giving

$$(e^t(p + v), e^{-t}(p - v)).$$

Thus oriented, unit-speed parametrized geodesics correspond to pairs of time-positive points on the light cone whose Minkowski product is $-2$. Denote such a pair by $(p, q)$. Then if $(p_1, q_1)$ and $(p_2, q_2)$ correspond to the same geodesic, perhaps with different unit-speed parametrizations, then $(p_1, q_1) = (\lambda p_2, \frac{1}{\lambda}q_2)$ if they share the same orientation, and $(p_1, q_1) = (\lambda q_2, \frac{1}{\lambda}p_2)$ if they have the opposite orientation, where $\lambda > 0$ is some positive number.

Notice that if q and w are any two time-positive vectors on the light cone, q · w < 0, so the

condition q · w = −2 is simply a normalization condition.
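The algebra of Proposition 5.10 is easy to confirm on a concrete unit tangent vector. In the sketch below the helper names and the sample point of $H^2$ are assumptions chosen for illustration; the code checks that $(p + v, p - v)$ are time-positive light-like vectors with Minkowski product $-2$, and that flowing along the geodesic rescales the pair by $e^{t}$ and $e^{-t}$.

```python
import math

def minkowski(a, b):
    """Minkowski form -a0*b0 + a1*b1 + ... + an*bn."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))

def geodesic(p, v, t):
    """Return (gamma(t), gamma'(t)) for gamma(t) = cosh(t) p + sinh(t) v."""
    c, s = math.cosh(t), math.sinh(t)
    point = [c * a + s * b for a, b in zip(p, v)]
    velocity = [s * a + c * b for a, b in zip(p, v)]
    return point, velocity

def light_cone_pair(p, v):
    """The pair (p + v, p - v) attached to a unit tangent vector (p, v)."""
    return [a + b for a, b in zip(p, v)], [a - b for a, b in zip(p, v)]

if __name__ == "__main__":
    a = 0.4
    p = [math.cosh(a), math.sinh(a), 0.0]              # a point of H^2
    v = [0.6 * math.sinh(a), 0.6 * math.cosh(a), 0.8]  # a unit tangent vector at p
    q, w = light_cone_pair(p, v)
    print(minkowski(q, q), minkowski(w, w), minkowski(q, w))
```

The three printed values should be $0$, $0$ and $-2$ up to rounding.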

Although the above argument was primarily algebraic, we will see there is an underlying geometric fact that it is describing. Roughly, it records to what extent an observer believes two geodesics `have the same asymptote'.

Perhaps the simplest way to understand the visual sphere is to consider two tangent vectors $v, w \in TH^n$, and ask if their corresponding geodesics $\exp(tv)$ and $\exp(tw)$ get `close' as $t \to \infty$. Precisely,



we are asking what it takes for the limit

$$\lim_{t \to \infty} \inf\{ d(\exp(xv), \exp(yw)) : x, y \in \mathbb{R}, x > t, y > t \}$$

to be zero, where $d : H^n \times H^n \to \mathbb{R}$ is the Hopf-Rinow metric, which for hyperbolic space has the form $d(p, q) = \cosh^{-1}(-p \cdot q)$ where $p \cdot q$ is the Minkowski form on $\mathbb{R}^{n+1}$, $p \cdot q = -p_0q_0 + p_1q_1 + \cdots + p_nq_n$.

Proposition 5.11. Given two unit-speed geodesics $\gamma_{p,v}$ and $\gamma_{q,w}$ in $H^n$, there are three exclusive possibilities:

(1) The geodesics are reparametrizations of each other, i.e. $(p + v, p - v) \sim (q + w, q - w)$.

(2) Neither $q + w$ nor $q - w$ are positive multiples of $p + v$ or $p - v$. In this situation, there is a unique minimum of the function $(t_1, t_2) \longmapsto d(\gamma_{p,v}(t_1), \gamma_{q,w}(t_2))$ given by

$$\frac{1}{2}\cosh^{-1}\Big( (q \cdot p)^2 - (q \cdot v)^2 - (p \cdot w)^2 + (v \cdot w)^2 - 1 + \sqrt{[(p+v)\cdot(q+w)][(p+v)\cdot(q-w)][(p-v)\cdot(q+w)][(p-v)\cdot(q-w)]} \Big)$$

(3) One of $q + w$ or $q - w$ is a positive multiple of either $p + v$ or $p - v$. In this situation, the distance between points on the two geodesics has no minimum, but it does have an infimum and this is zero.

The above proposition can be seen as the birth of `asymptotic geometry', that some Riemann manifolds have large-scale features not present in Euclidean geometry.

Proof. We will start by computing a formula for the infimum of the distance between a point $q \in H^n$ and all the points on the geodesic $\gamma_{p,v}(t) = \cosh(t)p + \sinh(t)v$. A little playing-about with hyperbolic trig functions gives us that if $q$ is not on the geodesic, the infimum is realized and it is represented by

$$\cosh(d(q, \gamma_{p,v})) = \sqrt{(q \cdot (p+v))(q \cdot (p-v))},$$

giving

$$\begin{aligned}
\cosh^2(d(\gamma_{q,w}(t), \gamma_{p,v})) = {} & \frac{\cosh(2t)}{2}\left((q \cdot p)^2 - (q \cdot v)^2 + (w \cdot p)^2 - (w \cdot v)^2\right) \\
& + \frac{\sinh(2t)}{2}\left(2(q \cdot p)(w \cdot p) - 2(q \cdot v)(w \cdot v)\right) \\
& + \frac{1}{2}\left((q \cdot p)^2 - (q \cdot v)^2 - (w \cdot p)^2 + (w \cdot v)^2\right)
\end{aligned}$$

as a function of $t$. The limit as $t \to \infty$ exists if and only if the leading term is zero, but the leading term can be expressed as the product of $(q+w)\cdot(p+v)$ and $(q+w)\cdot(p-v)$. But $q + w$, $p + v$ and $p - v$ are all light-like vectors, and the product of two light-like vectors is zero if and only if they are linearly dependent. This means the limit exists if and only if $q + w$ is a positive multiple of either $p + v$ or $p - v$. The infimum is clearly the constant term in the above formula. We compute the constant term $(q \cdot p)^2 - (q \cdot v)^2 - (w \cdot p)^2 + (w \cdot v)^2$ when $(p+v)\cdot(q+w) = 0$ below.

$$\begin{aligned}
(q \cdot p)^2 - (q \cdot v)^2 - (w \cdot p)^2 + (w \cdot v)^2 &= (q \cdot (p+v))(q \cdot (p-v)) - (w \cdot (p+v))(w \cdot (p-v)) \\
&= ((q \cdot (p+v))q - (w \cdot (p+v))w) \cdot (p-v) \\
&= ((q \cdot (p+v))q + (q \cdot (p+v))w) \cdot (p-v) \\
&= ((q \cdot (p+v))(q+w)) \cdot (p-v) \\
&= ((q \cdot (q+w))(p+v)) \cdot (p-v) \\
&= (q \cdot q)(p \cdot p - v \cdot v) = -(-2) = 2
\end{aligned}$$

In the above manipulations, the third line is an application of the relation $(q+w)\cdot(p+v) = 0$, the fifth line is an application of $q + w = \lambda(p+v)$, and the last line uses $q \cdot w = 0$. So the above gives us that

$$\lim_{t \to \infty} d(\gamma_{q,w}(t), \gamma_{p,v}) = \cosh^{-1}(1) = 0.$$

The computation of (2) is now direct, if a little tedious.

Proposition 5.11 tells us several things. Two observers travelling along geodesics either diverge from each other at an exponential rate, or they converge towards the same (asymptotic) destination, in the sense that they are arbitrarily close. This is quite different from what happens in Euclidean space or on a sphere. In Euclidean space, two observers travelling along geodesics either keep a constant distance or the distance diverges linearly. In the sphere the distance is always bounded and periodic.

Proposition 5.11 allows us to construct a compactification of hyperbolic space, by appending the `asymptotic directions' or the `ends of the geodesics' to hyperbolic space, giving it a boundary. The most direct way to see this construction is in the Poincaré ball $P^n_H$. A geodesic $\gamma_{p,v}$ in $H^n$ projects to a geodesic $t \longmapsto \frac{\cosh(t)\vec{p} + \sinh(t)\vec{v}}{1 + p_0\cosh(t)} \in P^n_H$ as we saw in Proposition 5.9. The limit as $t \to \infty$ of this is $\frac{\vec{p} + \vec{v}}{p_0} \in S^{n-1}$. As information, this is also equivalent to the light-like vector $p + v$ taken up to positive scalar multiple. So we call $S^{n-1}$ the visual sphere for the Poincaré (and Klein) models. There is a model-independent definition which we give below.

Definition 5.12. Let $N$ be a complete Riemann manifold. Given two unit-speed geodesics, $\gamma_1, \gamma_2 : \mathbb{R} \to N$, we say they have the same asymptotic direction if the function

$$[0, \infty) \ni t \longmapsto \inf\{ d(\gamma_1(t_1), \gamma_2(t_2)) : t_1, t_2 \geq t \} \in \mathbb{R}$$

is bounded. One would say they have the same asymptote if the limit of this function as $t \to \infty$ were zero.

Let $UTN$ be the unit tangent bundle,

$$UTN = \{ v \in TN : |v| = 1 \},$$

then the space of asymptotic directions in $N$ is the quotient $UTN/\sim$ where $\sim$ is the relation $v \sim w \longleftrightarrow$ the geodesics $\exp(tv)$ and $\exp(tw)$ have the same asymptotic direction. One can similarly define the space of asymptotes in $N$ to be the quotient of $UTN$ but where we use the relation `$\exp(tv)$ and $\exp(tw)$ have the same asymptote'.

Let $ASD(N)$ denote the space of asymptotic directions, and $ASM(N)$ the space of asymptotes. We topologise $N \cup ASD(N)$ using a basis, consisting of the open sets in $N$ union the sets of the form $\{\exp(tv) : t \geq 0, v \in W\} \cup \{[t \longmapsto \exp(tv)] : v \in W\}$ where $W \subset UTN$ is an open subset of the unit tangent bundle. When it is not ambiguous, we will frequently denote $N \cup ASD(N)$ by $\overline{N}$.


The next theorem identifies (when notationally convenient) the above objects for various models of hyperbolic space.

Theorem 5.13. • For $N = H^n$ we can canonically identify $ASD(H^n)$ with the time-positive rays on the light cone.

• In the Poincaré model, $\overline{P^n_H}$ is canonically homeomorphic to the unit ball $D^n = \{p \in \mathbb{R}^n : |p| \le 1\}$, similarly for the Klein model $\overline{K^n_H} \equiv D^n$.

• In the upper half-space model $\overline{U^n_H}$ can be canonically identified with the one-point compactification of the upper half-space.

• $\overline{S^n}$ is non-Hausdorff, as $ASD(S^n)$ consists of a single point.

Proof. Denote the compact unit ball in $\mathbb{R}^n$ by $D^n = \{p \in \mathbb{R}^n : |p| \le 1\}$. The homeomorphism

$$P^n_H \cup ASD(P^n_H) \to D^n$$

is the identity map on the Poincaré ball $P^n_H$. Given a geodesic in $P^n_H$ its limit is well-defined on $S^{n-1} = \partial D^n$. This is the homeomorphism.

Having the visual sphere compactification of hyperbolic space allows us to come to a general understanding of hyperbolic isometries. By design, a hyperbolic isometry extends to a map of the visual sphere, and by design it is a homeomorphism. A basic theorem in topology, the Brouwer fixed-point theorem, says that any continuous function $f : D^n \to D^n$ has a fixed point. This allows us to dissect the group of hyperbolic isometries of $H^n$.

Definition 5.14. An isometry of $H^n$ that has a fixed point in $H^n$ is called elliptic. Isometries of $H^n$ that have no fixed points in $H^n$ must have fixed points on the visual sphere. If there is precisely one fixed point on the visual sphere, the isometry is called parabolic. If it fixes precisely two points on the visual sphere, it is called hyperbolic.

Example 5.15. • An element $A \in O_n$ produces a standard elliptic element of $Isom(H^n)$. To define such, consider $Isom(H^n)$ as a subgroup of the linear isometries of Minkowski space $\mathbb{R}^{n+1}$. Then $A \in O_n$ produces the elliptic element $R_A \in Isom(H^n)$

$$(x_0, x_1, \cdots, x_n) \overset{R_A}{\longmapsto} (x_0, A(x_1, \cdots, x_n)).$$

In this case the point $(1, 0, \cdots, 0)$ is fixed, $R_A(1, 0, \cdots, 0) = (1, 0, \cdots, 0)$. If $f : H^n \to H^n$ is any isometry, then $f \circ R_A \circ f^{-1}$ fixes $f(1, 0, \cdots, 0)$. Provided $A \in SO_n$ we would call such an elliptic element a rotation about $f(1, 0, \cdots, 0)$.

• Given a light-like vector $l$, $l \cdot l = 0$, and an orthogonal vector $s$, $s \cdot l = 0$, there is a transformation $P_{l,s}$ associated to the two, defined by

$$p \overset{P_{l,s}}{\longmapsto} p + (p \cdot s)l - (p \cdot l)s - \frac{s \cdot s}{2}(p \cdot l)l,$$

which is parabolic provided $s \cdot s > 0$ and $p \neq 0$. The way to think about this map is that the map $v \longmapsto (v \cdot s)l - (v \cdot l)s$ is a skew-adjoint linear map of Minkowski space, and thus a tangent vector to the identity in the group of isometries of Minkowski space. $P_{l,s}$ is the exponential of this map. In particular, for $l$ fixed the Lie bracket of these tangent vectors is zero, so we have $P_{l,s} \circ P_{l,r} = P_{l,r} \circ P_{l,s} = P_{l,r+s}$, provided both $l \cdot s = 0 = l \cdot r$. One elementary observation is that $P_{l,r}(l) = l$ always. This is the unique fixed point on the visual sphere.


• The isometry

$$(x_0, x_1, \cdots, x_n) \longmapsto (\cosh(t)x_0 + \sinh(t)x_1, \ \sinh(t)x_0 + \cosh(t)x_1, \ x_2, \cdots, x_n)$$

is called 'translation along the geodesic $\cosh(t)e_0 + \sinh(t)e_1$', provided $t \neq 0$. It is a basic example of a hyperbolic isometry. More generally, given $A \in O_{n-1}$ the map

$$(x_0, x_1, \cdots, x_n) \longmapsto (\cosh(t)x_0 + \sinh(t)x_1, \ \sinh(t)x_0 + \cosh(t)x_1, \ A(x_2, \cdots, x_n))$$

is a hyperbolic isometry. If A is not the identity this can be thought of as a combination of translation along a geodesic together with a rotation orthogonal to the geodesic. When A is not the identity it is called a loxodromic transformation.
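The claims about the parabolic maps $P_{l,s}$ in the example above can be spot-checked numerically. The sketch below implements the displayed formula directly and tests three statements: the image of a point of the hyperboloid stays on the hyperboloid, $l$ is fixed, and $P_{l,s} \circ P_{l,r} = P_{l,r+s}$. The specific vectors are assumed sample data, and the function names are hypothetical.

```python
import math

def minkowski(a, b):
    """Minkowski form with signature (-, +, ..., +)."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))

def parabolic(l, s, p):
    """P_{l,s}(p) = p + (p.s) l - (p.l) s - (s.s)/2 (p.l) l."""
    ps, pl, ss = minkowski(p, s), minkowski(p, l), minkowski(s, s)
    return [x + ps * y - pl * z - 0.5 * ss * pl * y
            for x, y, z in zip(p, l, s)]

# Assumed sample data in Minkowski space R^{3+1}.
l = [1.0, 0.0, 0.0, 1.0]     # light-like: l.l = 0
s = [0.0, 0.5, -1.2, 0.0]    # s.l = 0 and s.s > 0
r = [0.0, 2.0, 0.3, 0.0]     # r.l = 0
b = 0.4
p = [math.cosh(b), 0.6 * math.sinh(b), 0.0, 0.8 * math.sinh(b)]  # p.p = -1

image = parabolic(l, s, p)
norm_defect = abs(minkowski(image, image) + 1.0)        # image stays on the hyperboloid
fixed_defect = math.dist(parabolic(l, s, l), l)         # P_{l,s}(l) = l
r_plus_s = [x + y for x, y in zip(r, s)]
composition_defect = math.dist(parabolic(l, s, parabolic(l, r, p)),
                               parabolic(l, r_plus_s, p))
print(norm_defect, fixed_defect, composition_defect)
```

All three defects should be at the level of floating-point rounding.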

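Lemma 5.16. Consider the parabolic transformations of $H^n$ fixing $l$ in the visual sphere, i.e. the group

$$\{P_{l,s} \in Isom(H^n) : l \cdot s = 0\}$$

then this group is isomorphic to the abelian group $\mathbb{R}^{n-1}$ (the vectors $s$ range over the $n$-dimensional space orthogonal to $l$, but $P_{l,s}$ is unchanged when a multiple of $l$ is added to $s$). Moreover, it acts freely and transitively on the complement of $l$ in the visual sphere.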

One can prove the above lemma directly, but it sits in a richer, more classical context, which we

give below.

Definition 5.17. Given $l$ in the time-forward light cone, i.e. $l \cdot l = 0$, and given a real number $c > 0$ the horosphere associated to $(l, c)$ is the set

$$HB_{l,c} = \{p \in H^n : p \cdot l + c = 0\}.$$

The terminology horosphere is meant in the context of the next proposition.

Proposition 5.18. The parabolic isometry $P_{l,s}$ preserves the above horosphere for all $s \cdot l = 0$. Moreover, if we denote by $\pi : H^n \to P^n_H$ the isometry $\pi(p) = \frac{1}{1+p_0}\vec p$ from Proposition 5.7, then the image of the above horosphere under $\pi$ is the set

$$\left\{ q \in \mathbb{R}^n : |q| < 1, \ \left|q - \frac{1}{1+c}\vec l\right| = \frac{c}{c+1} \right\}$$

i.e. the horosphere projects to the intersection of $P^n_H$ with the (Euclidean) sphere of radius $\frac{c}{c+1}$ centred at $\frac{1}{1+c}\vec l$. This assumes the convention $l = (1, \vec l)$, so $|\vec l| = 1$.

If one goes the additional step to the upper half-space model $U^n_H$, the horosphere corresponding to $l = e_0 + e_n$ with parameter $c$ is the subspace $\{x \in U^n_H : 0 < x_n \le \frac{1}{c}\}$. The parabolic transformation $P_{e_0+e_n,s}$ conjugates to an isometry of $U^n_H$ which fixes the point at infinity. Write $s = (0, \vec s)$, then $\vec s \cdot e_n = 0$ and the parabolic transformation $P_{e_0+e_n,s}$ conjugates to the motion $p \longmapsto p + \vec s$, i.e. it agrees with Euclidean translation in $\mathbb{R}^n$, moreover it preserves the $x_n$-coordinate.
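The projection statement in Proposition 5.18 can be tested on sample points. The sketch below works in $H^2$ with the assumed choice $l = (1, 1, 0)$, picks a base point on the horosphere with $c = e^{-a}$, moves it along the horosphere with parabolic maps, and measures how far each projected point is from the predicted Euclidean circle. The helper names and sample values are illustrative assumptions.

```python
import math

def minkowski(a, b):
    """Minkowski form with signature (-, +, ..., +)."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))

def parabolic(l, s, p):
    """P_{l,s}(p) = p + (p.s) l - (p.l) s - (s.s)/2 (p.l) l."""
    ps, pl, ss = minkowski(p, s), minkowski(p, l), minkowski(s, s)
    return [x + ps * y - pl * z - 0.5 * ss * pl * y
            for x, y, z in zip(p, l, s)]

def poincare_projection(p):
    """pi(p) = vec p / (1 + p_0)."""
    return [x / (1.0 + p[0]) for x in p[1:]]

# Assumed sample data: l = (1, vec l) with |vec l| = 1, and a base point with p.l = -c.
l = [1.0, 1.0, 0.0]
a = 0.3
c = math.exp(-a)
base = [math.cosh(a), math.sinh(a), 0.0]

centre = [l[1] / (1.0 + c), l[2] / (1.0 + c)]   # predicted centre (1/(1+c)) vec l
radius = c / (1.0 + c)                          # predicted radius c/(c+1)

deviations = []
for u in (-3.0, -1.0, 0.0, 0.5, 2.0):
    s = [0.0, 0.0, u]                           # s.l = 0
    q = poincare_projection(parabolic(l, s, base))
    deviations.append(abs(math.dist(q, centre) - radius))
print(max(deviations))
```

The parabolic maps are used only as a convenient way to produce several points of the same horosphere.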

Theorem 5.19. Every isometry of $H^n$ is either elliptic, parabolic or hyperbolic. Elliptic isometries are conjugate to $R_A$ for some $A \in O_n$ as above. Parabolic transformations are equal to maps $P_{l,s}$ for some $l$ lightlike, and some $s \cdot l = 0$, $s \cdot s \neq 0$. Hyperbolic isometries are conjugate to the above translation-plus-rotation along a geodesic.

Proof. We provide a sketch. Consider the isometries of $H^n$ that have $p$ as a fixed point. In the theory of Lie groups, this is called $Stab(p)$. From our theorem that an isometry is determined by its derivative at a single point, we know that $Stab((1, 0, \cdots, 0))$ is the orthogonal group $O_n$. So if $f \in Isom(H^n)$ is any isometry that sends a point $p \in H^n$ to a point $q \in H^n$ then $f^{-1}\,Stab(q)\,f = Stab(p)$.


If $f$ has no fixed points in $H^n$, it must have at least one fixed point on the visual sphere by the Brouwer fixed-point theorem. If it has one, it must preserve the horospheres corresponding to that fixed point. So in the upper half-space model $U^n_H$ it must give a Euclidean isometry of the $x_n$-constant level-sets. The only isometry of a Euclidean space that has no fixed points is a translation (this is a basic linear-algebra argument).

If $f$ has two fixed points on the visual sphere, it must preserve the geodesic between those two fixed points, so if we compose $f$ with the corresponding translation $T$ along that geodesic, the new map $T \circ f$ fixes all the points on that geodesic, so it must be elliptic, proving the claim.

Notice if an isometry of $H^n$ has three fixed points on the boundary, it must fix the corresponding ideal triangle and so it must be the identity on that 2-dimensional hyperbolic subspace.

There are many ways to explore the core ideas in the proof of Theorem 5.19. We will mention a

few below.

Definition 5.20. A geodesic triangle in a Riemann manifold is a piecewise-smooth closed curve which is the union of three geodesics. An ideal geodesic triangle in $H^n$ is the union of three geodesics, corresponding to three ideal points on the visual sphere. Since an ideal geodesic triangle is specified uniquely by three points on the visual sphere we consider the set of ordered distinct 3-tuples on the visual sphere to be the space of ideal geodesic triangles.

Proposition 5.21. Up to an isometry of $H^n$ there is only one ideal triangle. $Isom(H^2)$ acts freely and transitively on the space of ideal geodesic triangles in $H^2$. An ideal triangle in $H^2$ separates $H^2$ into four regions, only one having finite area. We call the finite-area region the interior of the triangle. This has area equal to $\pi$.

Proof. Using the upper half-space model choose the ideal triangle consisting of the circle of radius 1 centred at the origin together with the two vertical lines to $\infty$. This gives the integral

$$A = \int_{-1}^{1}\int_{1}^{\infty} \frac{1}{y^2}\,dy\,dx + 2\int_{0}^{1}\int_{\sqrt{1-y^2}}^{1} \frac{1}{y^2}\,dx\,dy$$

which reduces to $2 + 2\int_0^1 \frac{dy}{1+\sqrt{1-y^2}}$. Making a $y = \sin\theta$ substitution this becomes

$$2 + 2\int_0^{\pi/2} \frac{\cos\theta}{1 + \cos\theta}\,d\theta$$

which is

$$2 + 2\left[\theta - \tan(\theta/2)\right]_0^{\pi/2} = \pi.$$
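The last step of the computation is easy to confirm numerically. The sketch below applies a composite Simpson rule (a generic quadrature choice, not something used in these notes) to the $\theta$-integral and compares the result with $\pi$.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

# The strip above y = 1 contributes exactly 2; the rest is the theta-integral.
area = 2.0 + 2.0 * simpson(lambda th: math.cos(th) / (1.0 + math.cos(th)),
                           0.0, math.pi / 2)
print(area, math.pi)
```

The integrand is smooth on $[0, \pi/2]$, so the quadrature error is far below the tolerance one would care about here.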

It's a fun problem to determine the space of ideal tetrahedra in H3 (up to isometry), and ideal

simplices in Hn in general. There is a 2-dimensional family of ideal tetrahedra in H3, what is it?

Example 5.22. Let T be an ideal geodesic triangle in $H^2$. Label the geodesics of the triangle by $\alpha, \beta, \gamma$ as in the figure. We will also let $\alpha, \beta, \gamma \in Isom(H^2)$ double for the reflections in those geodesics. Define the group

$$G_1 = \langle \gamma\alpha, \beta\alpha \rangle.$$

As a group $G_1$ acts freely and transitively on the black triangles in the figure, and so it is a free group on two generators. The generator $\beta\alpha$ is the parabolic element that fixes the common endpoint of the $\alpha$ and $\beta$ geodesics (and cycles the incident black triangles). Similarly, $\gamma\alpha$ is a parabolic element that fixes the common endpoint of the $\gamma$ and $\alpha$ geodesics, and cycles the incident black triangles. A direct computation shows $H^2/G_1$ is homeomorphic to a 3-times punctured sphere.

[Figure: ideal geodesic triangle with sides labelled $\alpha$, $\beta$, $\gamma$.]

Let R denote the counter-clockwise rotation by $\pi/3$ about the center of the central triangle. Consider the group

$$G_2 = \langle \gamma\alpha R^{-1}, \beta\alpha R \rangle.$$

$G_2$ also acts freely and transitively on the black triangles, and so it is also a free group on two generators, but the generators of $G_2$ are hyperbolic isometries, and $H^2/G_2$ is a once-punctured $S^1 \times S^1$.

We will find some other hyperbolic structures of finite volume on familiar objects.

Example 5.23. In the language of surfaces, a pair of pants is a compact surface diffeomorphic to $S^2$ with the interior of three disjoint embedded discs removed, i.e. it is a compact orientable surface of genus zero with three boundary circles. An orientable surface of genus $g \ge 2$ has what is known as a pants decomposition. This is a collection of curves that separate the surface into a union of pairs of pants. An Euler characteristic argument shows there are $2g - 2$ pants and $3g - 3$ curves in any pants decomposition.

The key idea for finding hyperbolic structures on compact surfaces is to find hyperbolic structures on pants so that the 'cuffs' of the pants are geodesics of arbitrary length. It turns out there is essentially only one such hyperbolic structure. We sketch the construction below.

Example 5.24. Recall Example 5.22. We gave an example of a group $G_1 = \langle \gamma\alpha, \beta\alpha \rangle$. It was a free group generated by two parabolic elements. $G_1$ was the group generated by parabolics that preserve the blue and orange geodesic triangles in the figure above. From the perspective of the Poincaré ball, parabolics can be thought of as 'rotations' about points on the visual circle (sphere). More precisely, one can view them as the transition between an elliptic and a hyperbolic element: imagine a 1-parameter family of elliptic elements where the fixed point slides off to the visual sphere. We can make this quite precise. In the figure to the right we have taken the ideal geodesic triangle and pushed the ideal points on the visual sphere apart. If we label these new geodesics by $\alpha', \beta', \gamma'$ then the group $G_3 = \langle \gamma'\alpha', \beta'\alpha' \rangle$ acts freely and transitively on the blue triangles in the figure; these can also be thought of as the triangles one obtains by inductively reflecting in the curves $\alpha', \beta', \gamma'$ and then 2-colouring them orange and blue. But now $\gamma'\alpha'$ and $\beta'\alpha'$ are no longer parabolics, they are hyperbolic transformations. The transverse curves in the figures represent the geodesics that connect the closest points. So $G_3$ is a new free group on two generators, sitting inside $Isom(H^2)$. The quotient is also a three-punctured sphere, but there are three closed geodesics which we can give any length we choose (the distances between the curves $\alpha', \beta', \gamma'$). The quotient is no longer finite-volume, but if we cut the pants at the geodesics, we obtain a compact hyperbolic manifold with geodesic boundary, a proper hyperbolic pair of pants. One can then glue such pants together along isometries of their boundaries to construct hyperbolic surfaces of arbitrary genus $g \ge 2$.

[Figure: two panels, the ideal triangle with geodesics labelled $\alpha$, $\beta$, $\gamma$ and the deformed configuration with geodesics labelled $\alpha'$, $\beta'$, $\gamma'$.]

The above example shows that hyperbolic structures arise rather naturally in 2-dimensional manifolds, which was known towards the end of the 19th century. Towards the early 1970s there was an accumulation of evidence suggesting hyperbolic structures are also natural on 3-manifolds. We give one of the first examples to be discovered, due to Robert Riley and Bill Thurston.

Example 5.25. If we let $K \subset S^3$ be the figure-8 knot thought of as a 1-dimensional manifold, then $S^3 \setminus K$ admits a complete metric of finite volume and constant curvature $\kappa = -1$. To see this, consider the figure below.

In the top-left corner is the figure-8 knot, pictured in $\mathbb{R}^3$, whose one-point compactification is $S^3$. The first step is to thicken the knot. At this step, imagine placing a balloon both over and under the knot in $S^3$. Now inflate the balloons until they fill up the exterior of the knot. At this stage the two balloons come into contact with (a) the thickened knot and (b) each other. The membrane where the two balloons come in contact is drawn in the bottom-right corner of the figure above. The membrane is divided into regions, with divisions sketched in red corresponding to the crossings in the original knot diagram. Each region is given a colour. So if we imagine the original balloons as impressionable surfaces, and the membrane as coloured in ink, once we deflate the balloons there will be an impression on their surface, indicating how one glues them together. This pattern is sketched in the bottom-left of the figure. What we see is that the balloons should naturally be viewed as truncated tetrahedra. The vertices of the tetrahedra are sent to the surface of the fattened knot, while the faces are glued to each other. In the truncated tetrahedron diagram there are two coloured rectangular facets on each tetrahedron. Simply erase a red edge between these rectangular facets and a truncated triangle that it borders on (in the language of CW-complexes this is a type of Whitehead move / handle slide). This gives the knot complement as the union of two truncated tetrahedra, where we only remove a neighbourhood of the tetrahedra vertices. Label the faces of the tetrahedra by the vertices that they are opposite. Then the gluing map from the top to the bottom tetrahedron sends faces 1, 2, 3, 4 in that order to the faces 3, 4, 1, 2 in that order. In particular, notice that after gluing the tetrahedra together, all the edges of the tetrahedra have collapsed into only two arcs in $S^3 \setminus K$. Moreover, around each edge there are six tetrahedra incident (this takes some careful thought regarding the above Whitehead move). Consider a regular ideal tetrahedron in $H^3$: ideal meaning all the vertices are on the visual sphere, and regular meaning there are isometries that permute all the vertices. Consider the angle between two faces of the tetrahedron. This can be computed easily in the $U^3_H$ model, since a regular ideal tetrahedron would have to give regular triangular $x_3 = c$ cross-sections, thus the angle between faces of a regular ideal tetrahedron in $H^3$ is $\pi/3 = 2\pi/6$. So the union of six regular ideal tetrahedra around a common edge gives a neighbourhood of that geodesic edge in $H^3$. Thus, $S^3 \setminus K$ admits a complete hyperbolic structure, having volume twice whatever the volume of a regular ideal tetrahedron in $H^3$ is.
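The example leaves the volume of a regular ideal tetrahedron unspecified. A standard result that is not derived in these notes (Milnor's formula) expresses the volume of an ideal tetrahedron with dihedral angles $\alpha, \beta, \gamma$ as $\Lambda(\alpha) + \Lambda(\beta) + \Lambda(\gamma)$, where $\Lambda$ is the Lobachevsky function with Fourier series $\Lambda(\theta) = \frac{1}{2}\sum_{n \ge 1} \sin(2n\theta)/n^2$. Taking that formula as an assumption, the sketch below estimates the volume from a truncated series; the function name and truncation length are illustrative choices.

```python
import math

def lobachevsky(theta, terms=100000):
    """Partial sum of Lambda(theta) = (1/2) * sum_{n>=1} sin(2 n theta) / n^2."""
    return 0.5 * sum(math.sin(2 * n * theta) / (n * n)
                     for n in range(1, terms + 1))

dihedral = math.pi / 3                       # dihedral angle found in the example
regular_ideal_tetrahedron = 3 * lobachevsky(dihedral)
figure8_volume = 2 * regular_ideal_tetrahedron   # two tetrahedra in the decomposition
print(regular_ideal_tetrahedron, figure8_volume)
```

Under the stated assumption the second number is the hyperbolic volume of the figure-8 knot complement, roughly 2.03.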

Example 5.25 is due to Bill Thurston. Robert Riley had made the observation (via a computer-aided calculation) that the figure-8 knot complement admits a complete hyperbolic structure of finite volume. Example 5.25 is Thurston's re-interpretation of that result, giving a more direct geometric construction of the figure-8 knot complement as a hyperbolic manifold.

We end this section with a brief description of $Isom(H^n)$ acting on the visual sphere. The first thing to notice is that the visual sphere admits a canonical Riemann metric making it isometric to the round sphere, i.e. the unit sphere in Euclidean space. This happens naturally in the Poincaré and Klein models but it is model-independent, although it takes a little work to see why. The visual sphere $ASD(H^n)$ was defined abstractly as a quotient space of the unit sphere bundle $UT(H^n)$. The standard way to make $ASD(H^n)$ into a smooth (abstract) manifold is to demand the quotient map $UT(H^n) \to ASD(H^n)$ is a smooth submersion. This allows us to define the Riemann metric on $ASD(H^n)$ as a bilinear function on $T(UT(H^n))$ which respects the equivalence relation. We leave this as an exercise to the motivated reader. We will simply assume the Riemann metric on the visual sphere is the round metric, in the Poincaré model.

Theorem 5.26. Given an isometry $f \in Isom(H^n)$ we can extend it uniquely to a homeomorphism of $\overline{H^n}$, and restrict to the visual sphere. This gives a map

$$Isom(H^n) \to Diff^c(\partial H^n)$$

where $Diff^c(\partial H^n)$ is the group of conformal diffeomorphisms of the visual sphere. Moreover, provided $n \ge 3$ this map is one-to-one and onto. When $n = 2$ the map is simply injective, since the group of conformal diffeomorphisms of $S^1$ is the whole group of diffeomorphisms of $S^1$ which is infinite-dimensional, while $Isom(H^2)$ is three-dimensional.

Perhaps the most startling revelation in Theorem 5.26 is that every conformal diffeomorphism of a round sphere extends uniquely to a hyperbolic isometry of hyperbolic space provided the sphere is at least 2-dimensional, and every hyperbolic isometry is of this form.

There are many ways to prove Theorem 5.26. An informative major step is to prove that hyperbolic isometries restrict to conformal diffeomorphisms of the boundary. This is essentially direct in the Poincaré ball model, since the Poincaré metric is a conformal multiple of the Euclidean metric. One can also prove this using more general principles, such as the sketched Riemann metric on the visual sphere in the preamble to Theorem 5.26.

The Liouville Theorem (0.9 in Chapter 3) then tells us that $Isom(H^n)$ and $Diff^c(\partial H^n)$ have the same dimension provided $n \ge 3$. It is a quick argument from there that our map must be one-to-one and onto.

Theorem 5.26 can be expressed in many different ways. For example, in much of mathematics people prefer not to think about the group $Isom(H^n)$ but the conformal diffeomorphisms of the one-point compactification of $\mathbb{R}^{n-1}$, via the upper half-space model $U^n_H$.

Theorem 5.26 provides perhaps the most appealing argument that the conformal diffeomorphisms of a sphere preserve round circles. Each diffeomorphism extends to an isometry of $H^n$, which extends uniquely to a linear automorphism of the ambient Euclidean space. Linear automorphisms send 2-dimensional subspaces to 2-dimensional subspaces. The boundaries of these subspaces in the visual sphere are precisely the round circles.

Proposition 5.27. The isometries of $U^2_H$ when restricted to the visual sphere are the automorphisms of $\dot{\mathbb{R}}$ (the one-point compactification $\mathbb{R} \cup \{\infty\}$) having the form

$$f(t) = \frac{at + b}{ct + d}$$

where $a, b, c, d \in \mathbb{R}$ satisfy $ad - bc \neq 0$. If we re-scale the numerator and denominator appropriately we can assume $ad - bc = \pm 1$. Thus the orientation-preserving subgroup is isomorphic to the group $PSL_2\mathbb{R}$; this is the group of $2 \times 2$ matrices with real entries whose determinant is $+1$, modulo $\pm I$.

If we think of the visual boundary of $U^3_H$ as the one-point compactification of $\mathbb{C} \times 0$ then the conformal diffeomorphisms of $\dot{\mathbb{C}}$ have the form

$$f(z) = \frac{az + b}{cz + d}$$

if orientation-preserving and

$$f(z) = \frac{a\bar z + b}{c\bar z + d}$$

if orientation-reversing, where $a, b, c, d \in \mathbb{C}$ satisfy $ad - bc \neq 0$. Multiplication of the coefficient matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ by $\lambda I$ for $\lambda \in S^1 \subset \mathbb{C}$ does not change the above function; this is why people frequently talk about $PSL_2(\mathbb{C})$ rather than the conformal diffeomorphisms of $H^3$.
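The remark about scaling the coefficient matrix can be illustrated with complex arithmetic. The sketch below checks, for assumed sample matrices, that multiplying by $\lambda I$ leaves the fractional linear map unchanged and that composing two such maps corresponds to multiplying their matrices (the latter is a standard fact, stated here as background rather than taken from the text). All names are hypothetical.

```python
import cmath

def mobius(m, z):
    """Evaluate z -> (a z + b) / (c z + d) for m = ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

def matmul(m1, m2):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = m1
    (e, f), (g, h) = m2
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def scale(m, lam):
    """Multiply every entry of m by lam, i.e. form (lam I) m."""
    return tuple(tuple(lam * x for x in row) for row in m)

# Assumed sample data with non-zero determinants.
m1 = ((1 + 2j, 0.5), (0.25j, 1.0))
m2 = ((2.0, -1j), (1.0, 3.0))
z = 0.3 - 0.7j
lam = cmath.exp(0.9j)          # a point of the unit circle S^1

scaling_defect = abs(mobius(scale(m1, lam), z) - mobius(m1, z))
composition_defect = abs(mobius(matmul(m1, m2), z) - mobius(m1, mobius(m2, z)))
print(scaling_defect, composition_defect)
```

Both defects should vanish up to rounding, which is the numerical face of passing from matrices to the projective group.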

Surfaces

This section is devoted to the differential geometry of surfaces, meaning 2-dimensional manifolds. We will give two proofs of the Gauss-Bonnet Theorem, Hopf's proof and the Chern proof. We will also talk about surfaces in $\mathbb{R}^3$, the classical interpretation of curvature and Gauss's Theorema Egregium.

Proposition 5.28. In a 2-manifold the sectional curvature $K_p : V_2(T_pN) \to \mathbb{R}$ is constant for every $p \in N$. Denote by $\kappa : N \to \mathbb{R}$ the function for which $K_p$ is constant equal to $\kappa_p$. So the Riemann curvature is given by $R(v, w)z = \kappa(p)\,(\mu(v, z)w - \mu(w, z)v)$, the Ricci curvature is given by $Ric_p(v, w) = \kappa_p\,\mu(v, w)$ and the scalar curvature by $Scal_p = 2\kappa_p$.

We start by proving the Gauss-Bonnet theorem, which states that the total curvature of a surface

is closely related to the Euler characteristic of the surface.

Definition 5.29. Let $\gamma : (a, b) \to N$ be a smooth unit-speed curve and $\vec n : (a, b) \to UTN$ a smoothly-varying unit normal vector in the sense that $\mu(\frac{\partial\gamma}{\partial t}, \vec n) = 0$, with $\pi_N \circ \vec n = \gamma$. Define the signed geodesic curvature of $\gamma$ in the direction of $\vec n$ to be $\kappa_g(t) = \mu(\nabla_t \frac{\partial\gamma}{\partial t}, \vec n)$. Typically we think of $\kappa_g$ as a function defined on the image of $\gamma$ as it is independent of the choice of parametrization.

Theorem 5.30. (Gauss-Bonnet) Let N be a compact, oriented 2-dimensional Riemann manifold (perhaps with boundary), then

$$\int_N \kappa\,dA + \int_{\partial N} \kappa_g\,dl = 2\pi\chi(N)$$

where for $\kappa_g$ we use the inward-pointing normal vector $\vec n$ on the boundary.
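A spherical cap gives a concrete instance of the formula. For the cap of the unit sphere with polar angle at most $\phi_0$ we have $\kappa = 1$, and the sketch below assumes the classical facts that the boundary circle has length $2\pi\sin\phi_0$ and geodesic curvature $\cot\phi_0$ with respect to the inward normal. The area is computed by a midpoint rule; since a cap is a disc, the total should be $2\pi\chi = 2\pi$. The function name and step count are illustrative.

```python
import math

def cap_gauss_bonnet(phi0, steps=2000):
    """Left-hand side of Gauss-Bonnet for a cap of the unit sphere, polar angle <= phi0."""
    h = phi0 / steps
    # Area element on the unit sphere is sin(phi) dphi dtheta (midpoint rule in phi).
    area = 2 * math.pi * h * sum(math.sin((k + 0.5) * h) for k in range(steps))
    curvature_term = 1.0 * area                               # kappa = 1 on the unit sphere
    boundary_length = 2 * math.pi * math.sin(phi0)
    geodesic_curvature = math.cos(phi0) / math.sin(phi0)      # assumed: cot(phi0), inward normal
    return curvature_term + geodesic_curvature * boundary_length

totals = [cap_gauss_bonnet(phi0) for phi0 in (0.3, 1.0, math.pi / 2, 2.5)]
print(totals, 2 * math.pi)
```

For $\phi_0 = \pi/2$ the boundary is a great circle, the boundary term vanishes, and the hemisphere's area alone accounts for $2\pi$.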

We will give the proof below but for now consider one of the consequences: the total curvature of the manifold is independent of the Riemann metric on the manifold. One way to view this is that if one is to perturb the Riemann metric in a way that creates positive curvature somewhere, there must be a balancing negative curvature created somewhere else. So it is impossible to put a metric of strictly negative curvature on $S^2$.

Gauss's proof of the Gauss-Bonnet theorem was very much a computation, giving little in the way of insight into how it fits in to differential geometry more generally. Chern has given one of the most satisfying proofs of the Gauss-Bonnet theorem, giving a satisfying reason for the appearance of the $2\pi$ in the formula: it is the length of a unit circle in the tangent bundle to N (with a certain natural Riemann metric on TN). We describe the natural Riemann metric on the tangent bundle.

Definition 5.31. Given a Riemann manifold $(N, \mu_N)$ with a connection $c_N : T^2N \to TN$, we can define a Riemann metric on TN, $\mu_{TN} : T(TN) \oplus T(TN) \to \mathbb{R}$ by the formula

$$(V, W) \longmapsto \mu_N(c_N(V), c_N(W)) + \mu_N(D\pi_N(V), D\pi_N(W)).$$

Provided $c_N$ is the Levi-Civita connection on N, we call $\mu_{TN}$ the canonical Riemann metric on TN. In the above definition, we use the convention that $T(TN) \oplus T(TN)$ is the subset of $T^2N \times T^2N$ where $\pi_{TN}(V) = \pi_{TN}(W)$, where $\pi_{TN} : T^2N \to TN$ is the bundle projection.

The bundle projection $\pi_{TN} : T^2N \to TN$ has fibers $\pi_{TN}^{-1}(p, v)$ which are canonically isomorphic to the direct sum $T_pN \oplus T_pN$ for all $(p, v) \in TN$. The isomorphism is simply the direct sum of $c_N$ and $D\pi_N$ restricted to $\pi_{TN}^{-1}(p, v)$. So our connection $c_N$ allows us to think of the double tangent spaces as canonically a direct sum of two copies of $T_pN$. These two spaces are the 'vertical' and 'horizontal' subspaces respectively. What $\mu_{TN}$ does is take the direct sum of $\mu_N$ on these two spaces, making the vertical and horizontal subspaces orthogonal and isometric. For the next theorem, recall that the group $\Sigma_k$ acts on the iterated tangent bundle $T^kN$ by permuting the order of the variations. Precisely, if $P \in T^kN$ is represented by a cube whose primary (first order) variations are the edges $p \to p_i$ for $i = 1, 2, \cdots, k$ in that order, then $\sigma.P$ is the cube whose primary variations are $p \to p_{\sigma(i)}$ for $i = 1, 2, \cdots, k$ in that order.

Definition 5.32. If M is a Riemann manifold with Levi-Civita connection $c_M$ and $N \subset M$ is a Riemann manifold whose Riemann metric $\mu_N = i^*(\mu_M)$ is the pull-back of $\mu_M$ along the inclusion $i : N \to M$, $i(p) = p$, we call N totally geodesic if the Levi-Civita connection on N is the pull-back of $c_M$, i.e. if the diagram commutes

$$\begin{array}{ccc} T^2N & \xrightarrow{\;D^2 i\;} & T^2M \\ \downarrow{\scriptstyle c_N} & & \downarrow{\scriptstyle c_M} \\ TN & \xrightarrow{\;Di\;} & TM \end{array}$$

It is an equivalent statement that $N \subset M$ is totally geodesic if and only if all the geodesics of N are also geodesics for M.

Theorem 5.33. Given a Riemann manifold N with a linear Ehresmann connection $c_N : T^2N \to TN$, we claim that $c_{TN} : T^3N \to T^2N$ defined by

$$c_{TN} = \iota_{(12)} \circ Dc \circ \iota_{(123)}$$

is a linear Ehresmann connection on TN. Moreover, if $c_N$ is Levi-Civita for the Riemann metric $\mu : TN \oplus TN \to \mathbb{R}$, then $c_{TN}$ is Levi-Civita on TN for the Riemann metric $\mu_{TN}$ in Definition 5.31. TN with its Riemann metric $\mu_{TN}$ and Levi-Civita connection $c_{TN}$ satisfies:

• A curve $\gamma : (-\epsilon, \epsilon) \to TN$ is a geodesic (for $c_{TN}$) if and only if $\pi_N \circ \gamma$ is a geodesic in N and $\nabla_t\gamma$ is parallel, i.e. $\nabla_t\nabla_t\gamma = 0$.

• The inclusion $T_pN \to TN$ given by $v \longmapsto (p, v)$ is an isometric embedding in TN as a totally geodesic subspace.

• The inclusion $N \to TN$ given by $p \longmapsto (p, 0)$ is an isometric embedding in TN as a totally geodesic subspace.

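Example 5.34. • If $(\mathbb{R}^n, \mu)$ is Euclidean space with its Ehresmann connection $c(p, v, w, y) = (p, y)$ then $(T\mathbb{R}^n, \mu_{T\mathbb{R}^n})$ is a Euclidean space of double the dimension.

• If $(S^n, \mu_{S^n})$ is the unit sphere in $\mathbb{R}^{n+1}$ then $(TS^n, \mu_{TS^n})$ has Levi-Civita connection sending the cube with vertices $p, p_1, p_2, p_3, p_{12}, p_{13}, p_{23}, p_{123}$ (edges in the directions 1, 2, 3) to the square with vertices

$$p, \qquad p_1, \qquad p_{23} + (p_2 \cdot p_3)p, \qquad p_{123} + p_1(p_2 \cdot p_3) + (p_{12} \cdot p_3 + p_2 \cdot p_{13})p,$$

where the horizontal edges $p \to p_1$ and $p_{23} + (p_2 \cdot p_3)p \to p_{123} + p_1(p_2 \cdot p_3) + (p_{12} \cdot p_3 + p_2 \cdot p_{13})p$ are in direction 1 and the remaining edges are in direction 2.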

We return now to the proof of the Gauss-Bonnet theorem. For this, recall that N is a compact oriented 2-dimensional manifold, perhaps with boundary. We consider UTN, the unit tangent bundle of N, as a Riemann manifold, giving UTN the pull-back metric from the inclusion $UTN \to TN$. There is a vector field on UTN, which we will denote $\vec T : UTN \to T(UTN)$, defined to be the unit length vector pointing in the positively-oriented direction of the fibres of $\pi : UTN \to N$.

Proof. (of Theorem 5.30) Define the function $\alpha : T(UTN) \to \mathbb{R}$ by $\alpha = \mu_{UTN}(\vec T, \cdot)$. This is a differential 1-form on UTN. Chern observed that $\alpha$ is related to curvature in two ways

$$d\alpha = -\pi^*(*\kappa), \qquad \alpha|_{\partial UTN} = \pi^*(*\kappa_g)$$

where $\kappa : N \to \mathbb{R}$ is the sectional/Gauss curvature on N, and $*\kappa$ is the Hodge star of $\kappa$, i.e. if $v_1, v_2 \in T_pN$ are oriented and orthonormal, then $(*\kappa)(v_1, v_2) = \kappa$. Alternatively, $\int_N *\kappa = \int_N \kappa\,dA$, i.e. the Hodge star is the form one integrates so that the integral is an integral with respect to surface area. Similarly, $\int_{\partial N} *\kappa_g = \int_{\partial N} \kappa_g\,dl$ where $dl$ indicates integration with respect to arc-length. Chern's observation follows for fairly elementary reasons. Given a unit-speed curve $\gamma : [0, l] \to N$, since the surface is oriented there is a well-defined function $\theta : [0, l] \to \mathbb{R}$ so that $Hol(\gamma_{[0,t]}, \frac{\partial\gamma}{\partial t}(0))$ is a rotation by $\theta(t)$ radians counter-clockwise from $\frac{\partial\gamma}{\partial t}$. Via a quick computation we see

$$\alpha\left(\frac{\partial^2\gamma}{\partial t^2}\right) = \mu_{UTN}\left(\frac{\partial^2\gamma}{\partial t^2}, \vec T\right) = \mu_N\left(\nabla_t\frac{\partial\gamma}{\partial t}, \vec n\right) = \kappa_g = -\mu_N\left(\frac{\partial\gamma}{\partial t}, \nabla_t\vec n\right) = -\frac{d\theta}{dt}.$$

In the above formula $\vec n$ is the rotation of $\frac{\partial\gamma}{\partial t}$ by $\pi/2$ radians counter-clockwise in the tangent space $T_{\gamma(t)}N$, using the orientation of N. But we also know that the integral of $\frac{d\theta}{dt}$ with respect to $t$ over a small parallelogram is the total change in $\theta$, and the holonomy over that loop is rotation by $\theta$. From our definition of the Riemann curvature at the start of these notes and Stokes' Theorem this gives us $d\alpha = -\pi^*_{UTN}(*\kappa)$.


To complete the proof, assume $v : N \to TN$ is a vector field on N with finitely-many zeros, all of which are in the interior of N. Such a vector field exists by transversality (a generic vector field has such properties). Let us further assume that $v$ restricts to the unit (oriented) tangent vector field on $\partial N$, so that the Poincaré-Hopf index theorem applies to $v$. Let $N_0$ be N with a small open neighbourhood of the zeros of $v$ removed. Let $\hat v : N_0 \to UTN$ be the corresponding unit vector field to $v$ restricted to $N_0$. Then

$$\int_{N_0} \kappa\,dA = \int_{N_0} (*\kappa) = -\int_{\hat v(N_0)} d\alpha = -\int_{\partial\hat v(N_0)} \alpha = -\int_{\hat v(\partial N)} \alpha - \int_{\hat v(\partial N_0 \setminus \partial N)} \alpha.$$

But

$$\int_{\hat v(\partial N)} \alpha = \int_{\partial N} *\kappa_g = \int_{\partial N} \kappa_g\,dl$$

and

$$\int_{\hat v(\partial N_0 \setminus \partial N)} \alpha \overset{\epsilon}{\simeq} -2\pi\chi(N),$$

the latter being the Poincaré-Hopf index theorem together with the observation that this integral is approximately the sum of the local indices of the vector field, but with a minus sign. To finish the argument observe that $\int_N \kappa\,dA \overset{\epsilon}{\simeq} \int_{N_0} \kappa\,dA$ provided $N_0$ is N with a sufficiently small neighbourhood of the zeros of $v$ removed.

Theorem 5.35. (de Rham) If M is a simply connected complete Riemann manifold such that there is a decomposition $T_pM = V_1 \oplus V_2 \oplus \cdots \oplus V_k$ and the holonomy at $p$ preserves the subspaces $V_i$ for each $i$, then $\exp(V_i) \subset M$ is a totally-geodesic submanifold for all $i$, moreover

$$M = \exp(V_1) \times \exp(V_2) \times \cdots \times \exp(V_k).$$

Notice the simple connectivity hypothesis is fundamental. A flat torus $S^1 \times S^1$ has trivial holonomy, so we can choose the subspaces $V_1, V_2$ arbitrarily. So if $\exp(V_1)$ is a non-closed geodesic the theorem is clearly not satisfied.

Proof. (of Theorem 5.35) Since the holonomy preserves the decomposition $T_pM = \oplus_{i=1}^k V_i$, we can parallel translate the decomposition to every tangent space of M, giving a decomposition of the tangent bundle TM as a direct sum of k subbundles. Near $p$ we use the exponential map to parallel translate two vectors in $T_pM$ to vector fields defined in a small neighbourhood of $p$. Using the fact that our connection is torsion-free, $\nabla_v w - \nabla_w v = [v, w]$

Use parallel translation to transport the decomposition of $T_pM$ to a decomposition of TM as an orthogonal direct sum of k subbundles. Application of the Frobenius theorem (TODO) shows this is an integrable distribution, i.e. that this decomposition comes from the tangent spaces of foliations.

TODO show this foliation is by genuine manifolds and they are a trivial product.

Exercises

Problem 5.36. Consider two objects moving in the plane $\mathbb{R}^2$. The first object (the 'horse') moves along the x-axis, and it tows the 'plow' via a rigid rod of unit length. This means the two objects always maintain a unit distance, and the velocity vector of the plow points directly at the horse. If the horse starts at (0, 0) and the plow at (0, 1) respectively, the contour traced out by the plow as the horse moves along the x-axis is called the tractrix.

a) Argue that if $\gamma(t) = (x(t), y(t))$ is the unit speed parametrization of the tractrix, then $\frac{dy}{dx} = -\frac{y}{\sqrt{1-y^2}}$.

b) Argue that $\gamma(t) = (t - \tanh t, \frac{1}{\cosh t})$ is a parametrization of the tractrix.

The pseudo-sphere P is the surface of rotation in $\mathbb{R}^3$ generated by rotating the tractrix about the x-axis. The cusp of the pseudo-sphere is the set of points on the surface of rotation corresponding to $t = 0$.

$$P = \left\{(x, y, z) \in \mathbb{R}^3 : x = t - \tanh t, \ y^2 + z^2 = \frac{1}{\cosh^2 t} \text{ where } t \in \mathbb{R}\right\}$$

c) Argue that $\{(x, y, z) : x > 0 \text{ and } (x, y, z) \in P\}$ is a 2-dimensional manifold. We further consider it to be a Riemann manifold by restricting the Euclidean metric to P.

d) Compute the Riemann metric $f^*\mu$ on the domain of $f$, where

$$f(t, \theta) = (x(t), y(t)\cos\theta, y(t)\sin\theta)$$

where $\gamma(t) = (x(t), y(t))$ is the unit speed parametrization of the tractrix. Notice that this metric can be written as

$$\frac{1}{t^2}(dt^2 + d\theta^2)$$
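Before attempting parts (a) and (b) of the problem above, it can help to check the stated formulas numerically. The sketch below is only a sanity check, not an argument: it assumes the horse sits at $(t, 0)$ when the plow is at $\gamma(t) = (t - \tanh t, 1/\cosh t)$, verifies the rod has unit length, and compares a finite-difference slope with $-y/\sqrt{1-y^2}$. The function names and sample parameters are illustrative.

```python
import math

def tractrix(t):
    """Parametrization from part (b): (t - tanh t, 1 / cosh t)."""
    return (t - math.tanh(t), 1.0 / math.cosh(t))

def rod_length(t):
    """Distance from the plow to the horse, assuming the horse is at (t, 0)."""
    x, y = tractrix(t)
    return math.hypot(t - x, y)

def slope_defect(t, h=1e-5):
    """Compare a central-difference estimate of dy/dx with -y / sqrt(1 - y^2)."""
    x0, y0 = tractrix(t - h)
    x1, y1 = tractrix(t + h)
    _, y = tractrix(t)
    numerical = (y1 - y0) / (x1 - x0)
    predicted = -y / math.sqrt(1.0 - y * y)
    return abs(numerical - predicted)

samples = (0.5, 1.0, 2.0, 3.0)
rod_errors = [abs(rod_length(t) - 1.0) for t in samples]
slope_errors = [slope_defect(t) for t in samples]
print(max(rod_errors), max(slope_errors))
```

The samples avoid $t = 0$, where $y = 1$ and the slope formula is singular.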

Problem 5.37. • Argue $S^n \subset \mathbb{R}^{n+1}$ is not totally geodesic.

• Argue that a geodesic in any Riemann manifold is totally geodesic.

• Let $f : N \to N$ be an isometry of a Riemann manifold N. Define $Fix(f) = \{p \in N : f(p) = p\}$. Assume $Fix(f)$ is a manifold (this can in fact be proven), and show that $Fix(f)$ is totally geodesic.

Problem 5.38. Argue that in hyperbolic space $H^n$ there is the notion of average. Precisely, if $p_1, \cdots, p_k \in H^n$ show that the function

$$q \longmapsto d^2(q, p_1) + d^2(q, p_2) + \cdots + d^2(q, p_k)$$

has a unique minimum $q \in H^n$.

Problem 5.39. Using Problem 5.38 argue that any finite group acting on $H^n$ has a fixed point.

Problem 5.40. Argue that if a non-orientable compact surface admits a metric of constant curvature,

then its Euler characteristic has the same sign as the curvature.

Problem 5.41. Argue that if $\mu_1$ and $\mu_2$ are any two complete hyperbolic metrics of finite surface-area on the three-times punctured sphere $S^2 \setminus \{p_1, p_2, p_3\}$, then there is an isometry between the two. Moreover, compute the group of isometries of this manifold; it should be a familiar group.

Problem 5.42. Argue that the two-times punctured sphere $S^2 \setminus \{p_1, p_2\}$ does not admit a complete hyperbolic metric of finite area. Does it admit a complete spherical (curvature +1) or Euclidean (curvature 0) metric of finite area?

Problem 5.43. Let G be a finite group of isometries of the standard round 2-sphere $S^2$.


• Argue that if the action of G is xed-point free, G is either trivial or a group of order 2, acting

by the antipodal map.

• In the case that the action of G preserves orientation and has non-trivial xed points, argue

that either G preserves a great circle and is therefore either cyclic or dihedral, or G does not

preserve a great circle and is a group of isometries of some platonic solid.

Hint: To construct a platonic solid, consider the set X to be the points in S2 with maximal

number of elements in their point-stabilizers. One definition of a platonic solid is it is a convex

polyhedron that is 'regular' in the sense that there is an isometry sending any vertex to any other, and

the point stabilizer of a vertex acts transitively on the incident edges. The convex polyhedron will be

the convex-hull of the set X, thus an edge is any straight line between the vertices of X which remains

on the boundary of the convex hull of X.

Problem 5.44. Let G be a finite group of isometries of the standard round 3-sphere S3. Consider G to be a subgroup of O4.

• Argue that if two elements of G commute, then they are simultaneously diagonalizable (over C), i.e. R4 is an orthogonal direct-sum of two 2-dimensional subspaces that are invariant subspaces of both elements.

• Argue that if G is fixed-point free and cyclic, then the action of G on S3 is conjugate to

Zp × S3 → S3

given by

\[ t.(z_1, z_2) = \left(e^{2\pi i q t/p} z_1,\ e^{2\pi i t/p} z_2\right) \]

where GCD(p, q) = 1. This action is called the (p, q) lens-space action.

• If the action of G is fixed-point free but G is not cyclic, define the `2-sheeted cover' of G, denoted $\tilde G$, to be the pre-image of G under the homomorphism S3 × S3 → SO4 from Problem 2.83, and consider the projections of $\tilde G$ to the two S3 factors. Argue that $\tilde G$ is therefore an extension of a finite subgroup of S3 by another finite subgroup of S3. In Problem 5.43 you determined the finite subgroups of SO3, thus you know all the finite subgroups of S3, via the 2 : 1 covering map S3 → SO3.


Differential Geometry - Coarse Outline Differential Topology

6. Tubular neighbourhoods

Definition 6.1. Let M be a submanifold of a Riemann manifold N. The geometric normal bundle of M in N is denoted ν(M,N) and defined as:

ν(M,N) = {(p, v) : p ∈ M, v ∈ TpN, v ⊥ TpM}

meaning the vector v is tangent to N, but orthogonal (with respect to N's Riemann metric) to TpM ⊂ TpN. Given r ≥ 0,

νr(M,N) = {(p, v) ∈ ν(M,N) : |v| < r}

Proposition 6.2. In the above situation, ν(M,N) is an n-dimensional submanifold of TN. νr (M,N)

is a relatively open subset of ν(M,N). Moreover, the map

πν : ν(M,N)→ M

given by πν(p, v) = p is a submersion.

Proof. exercise.

Theorem 6.3. Let M be a compact submanifold of a Riemann manifold N. Then there exists r > 0

such that the restriction of N's exponential map to νr (M,N) ⊂ TN

exp : νr (M,N)→ N

is a diffeomorphism to an open subset of N. Let V ⊂ N be the image of this map. Then the function f : V → M given by the composite f = πν ∘ exp−1 has the property that d(q, f(q)) = |v| where exp(p, v) = q. Here d : N × N → R is the induced path-length metric on N, i.e. the metric used in Hopf-Rinow. That is, f : V → M associates to any point q ∈ V the nearest point to q in M. Moreover, since we have written this map as a composite of smooth maps, it is itself smooth.

Proof. Notice that there is the standard inclusion

i : M → ν(M,N)

given by i(p) = (p, 0), embedding M in ν(M,N), called the 0-section of ν(M,N). From the Hopf-Rinow theorem, we know the derivative of exp : TN → N at a point (p, 0) ∈ TN is the identity map T(p,0)TN = TpN → TpN. We can think of T(p,0)ν(M,N) as the direct sum of T(p,0)i(M) together with T(p,0)π_ν^{-1}(p). Thus, if we restrict exp to the map exp : ν(M,N) → N, Dexp(p,0) : T(p,0)ν(M,N) → Tp(N) is an isomorphism. Further, using the above constructions we can identify T(p,0)ν(M,N) with TpN, and under this identification, Dexp(p,0) is the identity map.

Thus by the compactness of M and the inverse function theorem, there is some r > 0 such that the restriction exp : νr(M,N) → N is a local diffeomorphism. So the image is open, and we need only show that for r sufficiently small exp is one-to-one. If it were not true, there would be sequences (pi, vi) ∈ ν(M,N) and (qi, wi) ∈ ν(M,N) with (pi, vi) ≠ (qi, wi) for all i, yet exp(pi, vi) = exp(qi, wi) for all i. Further, lim_{i→∞} vi = 0 = lim_{i→∞} wi. Since pi, qi ∈ M and M is compact, we can pass to a convergent subsequence, so lim_{i→∞} pi = p, lim_{i→∞} qi = q. By continuity, p = exp(p, 0) = exp(q, 0) = q. Since exp is a local diffeomorphism near (p, 0), this forces (pi, vi) = (qi, wi) for i sufficiently large, a contradiction.
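The simplest instance of Theorem 6.3 can be checked numerically. The sketch below takes M to be the unit circle in N = R2, where geodesics are straight lines, so exp(p, v) = p + v. The function names `exp_normal`, `nearest_point` and `dist` are illustrative choices, not notation from these notes. For |v| < 1 the retraction f = πν ∘ exp−1 is radial projection, and d(q, f(q)) = |v|.

```python
import math

def exp_normal(theta, s):
    """Normal exponential map of the unit circle in R^2.

    The pair (theta, s) stands for the normal vector v = s * p based at
    p = (cos theta, sin theta). Geodesics of R^2 are straight lines,
    so exp(p, v) = p + v = (1 + s) * p.
    """
    return ((1 + s) * math.cos(theta), (1 + s) * math.sin(theta))

def nearest_point(q):
    """The retraction f = pi_nu o exp^{-1}: radial projection onto the circle.

    Only valid away from the origin, i.e. on the image of nu_r with r <= 1.
    """
    n = math.hypot(q[0], q[1])
    return (q[0] / n, q[1] / n)

def dist(a, b):
    """Euclidean distance, which is the path-length metric on R^2."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

if __name__ == "__main__":
    theta, s = 0.4, -0.6
    q = exp_normal(theta, s)
    # The two printed numbers should agree: d(q, f(q)) and |v|.
    print(dist(q, nearest_point(q)), abs(s))
```

Here r = 1 is the largest radius that works: at s = −1 every normal line passes through the origin, so exp stops being one-to-one.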


Definition 6.4. The neighbourhood V of M in N is called a tubular neighbourhood of M in N. Theorem 6.3 is called the tubular neighbourhood theorem.

In a Riemann manifold N, given a point p ∈ N,

Inj(p,N) := sup{r ∈ [0,∞) : exp : νr(p,N) → N is an embedding}

is called the injectivity radius at p. The number

Inj(N) := inf{Inj(p,N) : p ∈ N}

is called the injectivity radius of N. Similarly, given M ⊂ N,

Inj(M,N) := sup{r ∈ [0,∞) : exp : νr(M,N) → N is an embedding}

is the injectivity radius of M in N.

Given p ∈ N and r ∈ R, the ball of radius r centred at p is

Bp(r) = {q ∈ N : d(p, q) < r}

Remark 6.5. It's a fun exercise to show that the injectivity radius as a function

Inj(·, N) : N → (0,∞]

is a continuous function. The interval (0,∞] is topologized as a half-open interval. Less abstractly, arctan ∘ Inj(·, N) : N → (0, π/2] is a continuous function.

The tubular neighbourhood theorem states that when M is compact Inj(M,N) is positive. When M is non-compact, unfortunately it's possible for Inj(M,N) to be zero. For example, consider the spiral {e^{tz} : t ∈ R} ⊂ C where, say, z ∈ C is a complex number with Re(z) ≠ 0 and Im(z) ≠ 0. When M is compact, if r is the injectivity radius, by design it is the smallest positive number so that either (1) exp(p, v) = exp(q, w) where |v| = |w| = r and (p, v) ≠ (q, w), or (2) Dexp(p,v) : T(p,v)ν(M,N) → Texp(p,v)N is not an isomorphism for some p ∈ M and |v| = r. In case (2) the point exp(p, v) ∈ N is said to be a focal point of the exponential map, and |v| is the focal radius of M at p.
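To see why the spiral has injectivity radius zero, one can use the standard fact (assumed here, not proven in these notes) that the focal points of a plane curve are its centres of curvature, so the focal radius at a point is the radius of curvature there. The sketch below computes that radius for γ(t) = e^{tz}; the function name `radius_of_curvature` is an illustrative choice.

```python
import cmath

def radius_of_curvature(z, t):
    """Radius of curvature of the plane curve gamma(t) = exp(t*z) in C = R^2.

    Uses kappa = |x'y'' - y'x''| / |gamma'|^3, where x'y'' - y'x'' is the
    imaginary part of conj(gamma') * gamma''.
    """
    d1 = z * cmath.exp(t * z)          # gamma'(t)
    d2 = z * z * cmath.exp(t * z)      # gamma''(t)
    cross = (d1.conjugate() * d2).imag
    return abs(d1) ** 3 / abs(cross)

if __name__ == "__main__":
    z = 1 + 2j  # nonzero real and imaginary parts: a genuine spiral
    for t in (0, -5, -10, -20):
        # The radius shrinks towards 0 as the spiral winds into the origin.
        print(t, radius_of_curvature(z, t))
```

In closed form the radius is |z| e^{t Re(z)} / |Im(z)|, which tends to 0 as the spiral approaches the origin, so no uniform r > 0 can work.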

M being compact is a strong restriction. Sometimes we will want something similar to a tubular neighbourhood, but when M is not compact. The weakening we're looking for is called the ε-neighbourhood theorem.

Theorem 6.6. (ε-Neighbourhood Theorem) Let M be a submanifold of a Riemann manifold N. Then

there exists a smooth function ε : M → (0,∞) such that the restriction of N's exponential map

exp : νε(M,N)→ N

where νε(M,N) = {(p, v) ∈ ν(M,N) : |v| < ε(p)} is a diffeomorphism onto an open subset of N. If we call the image Vε ⊂ N, then for all q ∈ Vε, if we let (p, v) = exp−1(q) then d(p, q) = |v|. Further, we can ensure that Bp(ε(p)) ⊂ Vε for all p ∈ M.

Proof. The proof of the ε-Neighbourhood Theorem is essentially the same as that of the tubular neighbourhood theorem. The key point is of course to determine an appropriate ε : M → (0,∞). Given p ∈ M, let Rf(p) be the focal radius at p. Similarly, define

Ri(p) = inf{r ∈ [0,∞) : exp(p, v) = exp(q, w), (p, v) ≠ (q, w) ∈ ν(M,N) with |v| = |w| = r}.

Ri(p) > 0 for all p ∈ M by an argument similar to the proof of the tubular neighbourhood theorem. Moreover, given a sufficiently small neighbourhood U of p in N, we can guarantee that

{Ri(p′) : p′ ∈ U ∩ M} ∪ {Rf(p′) : p′ ∈ U ∩ M}

has a positive lower bound. Let Rp be that lower bound, with Up = U ∩ M.

Define ε : M → (0,∞) to be

\[ \varepsilon = \sum_p \mu_p R_p \]

where the collection µp : M → [0, 1] is a partition of unity subordinate to the cover of M by the sets Up. This works by design.

Isotopies and extensions

Definition 6.7. Given a smooth embedding f : M → N, an isotopy of f is a smooth function F : [0, 1]×M → N such that if we define Ft : M → N by Ft(p) = F(t, p) for all (t, p) ∈ [0, 1]×M, then we require Ft : M → N to be an embedding for all t ∈ [0, 1], and F0 = f.

The support of an isotopy, supp(F), is defined as

supp(F) = {p ∈ M : F(t, p) ≠ f(p) for some t ∈ [0, 1]}

An ambient isotopy is a smooth map G : [0, 1]×N → N such that G0 = IdN and Gt : N → N is a diffeomorphism for all t ∈ [0, 1]. The support of G is defined as

supp(G) = ∪t∈[0,1] supp(Gt) = {p ∈ N : G(t, p) ≠ p for some t ∈ [0, 1]}

Theorem 6.8. (Isotopy Extension Theorem) If F : [0, 1] × M → N is an isotopy of an embedding

f : M → N such that supp(F ) is compact, then there exists an ambient isotopy G : [0, 1] × N → N

such that F (t, p) = G(t, f (p)) for all (t, p) ∈ [0, 1]×M.

Proof. Given an ambient isotopy G : [0, 1]×N → N, the track of G is the map $\bar G$ : [0, 1]×N → [0, 1]×N defined by $\bar G$(t, p) = (t, G(t, p)). The track is a diffeomorphism of [0, 1]×N. Notice that

\[ \frac{\partial \bar G}{\partial t} \circ \bar G^{-1} \]

is a vector field on [0, 1]×N. Denote this vector field by v. The flow satisfies Φv(t, (0, p)) = $\bar G$(t, p).

Given F an isotopy of f, let $\bar F$(t, p) = (t, F(t, p)) be its track. Then $w = \frac{\partial \bar F}{\partial t} \circ \bar F^{-1}$ is a vector field defined on the image of $\bar F$, which is a submanifold of [0, 1]×N. The main idea of the proof is to find a smooth extension of w to a vector field on all of [0, 1]×N which is $\frac{\partial \bar G}{\partial t} \circ \bar G^{-1}$ for some ambient isotopy G.

Since w is a smooth vector field on the image of $\bar F$, around every point p in the image there are local smooth extensions wp : Up → T(Up) where Up ⊂ [0, 1]×N is relatively open. Choose the local smooth extensions so that the component of wp in the [0, 1] direction is 1, i.e. Dπ ∘ wp = 1, where π : [0, 1]×N → [0, 1] is projection onto the first factor. So we can cover [0, 1]×N with the open sets {Up : p ∈ img($\bar F$)} together with U∅, where U∅ = [0, 1]×N \ img($\bar F$). On U∅ we define w∅ to be the constant vector field w∅(t, q) = ((t, q), (1, 0)). Then define w : [0, 1]×N → T([0, 1]×N) as

\[ w = \sum_p \mu_p w_p \]

where the µp : [0, 1]×N → [0, 1] form a smooth partition of unity subordinate to the cover {Up : p ∈ img($\bar F$)} ∪ {U∅} of [0, 1]×N, and the sum includes the term µ∅w∅.

Notice that the flow Φw(t, (0, q)) is defined for all t ∈ [0, 1]. In particular the flow is complete, since the vector field remains purely vertical outside of a compact set; moreover, it is never zero. The component of w in the [0, 1] direction is 1, i.e. π ∘ Φw(t, (0, q)) = t. Thus Φw(t, (0, q)) = $\bar G$(t, q) for some ambient isotopy G of N.

Notice that by the nature of the proof, we can choose the open sets Up to be arbitrarily small, thus supp(G1) can be made to fit in an arbitrarily small neighbourhood containing supp(F).

Example 6.9. A classical (long) knot is a smooth embedding f : R → R3 such that f(x) = (x, 0, 0) for all |x| ≥ 1. A variant of the Alexander trick (from the homework) is the re-scaling map:

\[ F(t, x) = \begin{cases} (1-t)\, f\!\left(\frac{1}{1-t}\, x\right) & \text{if } t \in [0, 1) \\ (x, 0, 0) & \text{if } t = 1 \end{cases} \]

Notice that this map satisfies all the conditions of a smooth isotopy except F is not differentiable at the point (1, 0) unless f(x) = (x, 0, 0). F is a continuous function, though. This is sometimes called `pulling a knot tight'.

[Figure: pulling a knot tight, showing F0 = f, F1/2 and F1.]

An example of an actual isotopy and an extension.

Example 6.10. Our embedding f will be an embedding of a 0-dimensional manifold in R2,

f : {1, 2, · · · , n} → R2

given by f(k) = (k, 0). We will describe an isotopy that has an integer parameter i,

F^i : [0, 1] × {1, 2, · · · , n} → R2

given by

\[ F^i(t, k) = \begin{cases} (k, 0) & \text{if } k \geq i + 2 \text{ or } k \leq i - 1 \\ (k, 0) + \left(\frac{1 - \cos \pi t}{2}, -\frac{\sin \pi t}{2}\right) & \text{if } k = i \\ (k, 0) - \left(\frac{1 - \cos \pi t}{2}, -\frac{\sin \pi t}{2}\right) & \text{if } k = i + 1 \end{cases} \]

In the figure below, $\bar F^i$ is the track of F^i, and G^i_1 is a representation of the isotopy extension of F^i, evaluated at t = 1.

[Figure: the image of the track $\bar F^i$ and the time-1 map G^i_1 of its extension, with the points labelled 1, · · · , i, i + 1, · · · , n.]
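The isotopy of Example 6.10 is easy to check numerically. The sketch below implements the three-case formula, with the function name `F` and the argument order (i, t, k) chosen for illustration. Since the domain is a 0-manifold, each F^i_t is an embedding exactly when it is one-to-one.

```python
import math

def F(i, t, k):
    """Isotopy F^i(t, k) of Example 6.10.

    Points i and i+1 trade places along opposite half-circles of radius 1/2;
    every other point k stays at (k, 0).
    """
    dx = (1 - math.cos(math.pi * t)) / 2
    dy = -math.sin(math.pi * t) / 2
    if k == i:
        return (k + dx, dy)
    if k == i + 1:
        return (k - dx, -dy)
    return (float(k), 0.0)

if __name__ == "__main__":
    n, i = 5, 2
    for t in (0.0, 0.5, 1.0):
        print(t, [F(i, t, k) for k in range(1, n + 1)])
```

At t = 1 the points i and i + 1 have swapped, and at every intermediate time the two moving points are at distance exactly 1 from each other, so the map stays one-to-one throughout.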

There is a much weaker notion than isotopy that we will study in the transversality section, called homotopy.

Definition 6.11. A homotopy of a smooth map f : M → N is a smooth map

F : [0, 1]×M → N

such that F0 = f. We say the maps F0 and F1 : M → N are homotopic.

Thus, an isotopy is a homotopy which at every time is an embedding. An ambient isotopy is a homotopy of the identity map which at every time is a diffeomorphism.

Fibre bundles

Definition 6.12. Let f : M → N be a smooth function, with M and N manifolds. If for all p ∈ N there exists a neighbourhood U of p and a diffeomorphism φ : f−1(U) → U × f−1(p) such that the diagram

\[ \begin{array}{ccc} f^{-1}(U) & \xrightarrow{\ \varphi\ } & U \times f^{-1}(p) \\ & {\scriptstyle f}\searrow \quad \swarrow{\scriptstyle \pi} & \\ & U & \end{array} \]

commutes (that is, π ∘ φ = f), where π(x, y) = x, then f is said to be a locally trivial fibre bundle. f−1(p) is the fibre over p.

Roughly speaking, the concept of fibre bundle is the `manifold concept' but applied to smooth functions. Whereas manifolds are objects that are `locally trivial' in the sense that they're locally Euclidean, fibre bundles are the maps that are `locally trivial' in the sense that they're locally projection maps. Just like with manifolds, for fibre bundles the diffeomorphisms φ : f−1(U) → U × f−1(p) are called bundle charts.

If f : M → N is a fibre bundle, we would say `M is a fibre bundle over N', and call N the base space, M the total space and f the bundle map.

Theorem 6.13. Let f : M → N be a submersion of manifolds. Provided f−1(p) is compact for all p ∈ N, f is a locally trivial fibre bundle. (The compactness criterion automatically holds if M itself is compact.)

Proof. Let exp : νr(f−1(p),M) → V ⊂ M be a tubular neighbourhood of f−1(p) in M. By design, there exists U ⊂ N open with p ∈ U such that f−1(U) ⊂ V.

Our map φ : f−1(U) → U × f−1(p) is given by φ(q) = (f(q), πν(exp−1(q))). One can check that along f−1(p), Dφ has rank m = dim(M). So for r small enough, φ is a diffeomorphism, much like in the proof of the tubular neighbourhood theorem.

Note 6.14. • If N is connected, then f−1(p) and f−1(q) are diffeomorphic for any p, q ∈ N: simply take a path from p to q and cover it by bundle charts. There are various different notations for fibre bundles in the literature. Because of this observation, sometimes people restrict to fibre bundles where all the fibres are diffeomorphic, but from the perspective of the above theorem, this is a little unnatural.

• When the fibres of a locally trivial fibre bundle are 0-dimensional, it is called a covering space. One of the main sources of fibre bundles is the next theorem.

• There are many other specialized notions of fibre bundles. When the fibres of a fibre bundle are vector spaces such that the vector space operations are smooth, it is called a vector bundle.

Example 6.15. • If M and N are manifolds, the projection map M × N → N, (p, q) ↦ q, is a fibre bundle. This would be called the trivial bundle with base N and fibre M.

• The tangent bundle TM → M is a vector bundle.

• The Hopf fibration S3 → S2 is a fibre bundle with fibres diffeomorphic to the circle.

• The map S1 → S1 given by z ↦ z^n is a fibre bundle provided n ≠ 0, and the fibre is diffeomorphic to a 0-manifold containing precisely |n| points. This map is not a fibre bundle if n = 0.

• The spinor cover S3 → SO3 constructed in the homework is a fibre bundle with fibre a two-point space. This would usually be called a `2-sheeted covering space'.

• The map from S2 to the real projective plane constructed in the homework is a 2-sheeted covering space.

• Notice that not all submersions are fibre bundles. For example, the function f : R2 \ {(0, 0)} → R given by f(x, y) = x is a submersion, but not a fibre bundle, since there are no charts for 0 ∈ R.

• The Möbius band from Chapter 1, M = {(z, v) : z ∈ S1, v ∈ R2, v = t√z for some t ∈ R}, is a vector bundle over S1; the map M → S1 is given by (z, v) ↦ z.
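As a concrete check of the covering-space example z ↦ z^n above, the following sketch lists the fibre over a point w of the unit circle. The function name `fibre` is an illustrative choice; the code only confirms that the fibre has |n| distinct points, not the local triviality itself.

```python
import cmath
import math

def fibre(w, n):
    """All z on the unit circle with z**n == w, for |w| = 1 and n != 0."""
    if n == 0:
        # z -> z**0 is constant, so it is not even a submersion.
        raise ValueError("n must be non-zero")
    phi = cmath.phase(w)
    m = abs(n)
    sign = 1 if n > 0 else -1
    # The |n| solutions are equally spaced around the circle.
    return [cmath.exp(1j * sign * (phi + 2 * math.pi * k) / m) for k in range(m)]

if __name__ == "__main__":
    w = cmath.exp(0.7j)
    for n in (1, 2, 5, -3):
        print(n, len(fibre(w, n)))
```

For every w the fibre has the same number of points, which is what one expects from a fibre bundle over a connected base.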

An Ehresmann connection on the tangent bundle of a manifold was a way of relating vectors in nearby tangent spaces to the manifold. The idea applies just as well to fibre bundles.

Definition 6.16. Let f : M → N be a fibre bundle. Given p ∈ M, the vertical subspace of TpM is Tp(f−1(f(p))). An Ehresmann connection on f is a smooth function c : TM → TM such that πM ∘ c = πM, where πM : TM → M is the tangent bundle projection map; moreover the restriction of c must satisfy c|TpM : TpM → Tp(f−1(f(p))) and c(p, v) = (p, v) for all v ∈ Tp(f−1(f(p))), i.e. c is a projection map from TpM to the vertical subspace. The kernel of c|TpM : TpM → TpM is called the horizontal subspace of TpM.

Note 6.17. By definition, the restriction of Df : TM → TN to any horizontal subspace of TpM will give an isomorphism between that horizontal subspace and Tf(p)N. Thus an Ehresmann connection on f allows us to associate motions in M with motions in N, so a vector field on N will pull back to a vector field on M.

Example 6.18. • If f : M → N is a fibre bundle and M is a Riemann manifold, let c : TM → TM be orthogonal projection onto the vertical subspaces. This is an Ehresmann connection. Thus, every fibre bundle admits an Ehresmann connection.

• Given an idempotent Ehresmann connection on M in the sense of the Riemannian geometry notes, c : T2M → TM, we can re-interpret it to be an Ehresmann connection on the fibre bundle πM : TM → M in the above sense. Our differential geometry sense of connection restricted to a linear map T(p,v)TM → TpM for all (p, v) ∈ TM. But there is a canonical identification of ker(DπM)(p,v) ⊂ T(p,v)TM with TpM, since there is a canonical identification of the tangent space of a vector space with the vector space.

Definition 6.19. Given an Ehresmann connection c : TM → TM on a fibre bundle f : M → N, a smooth function g : X → M is parallel if c ∘ Dg = 0.

Theorem 6.20. If f : M → N is a fibre bundle with Ehresmann connection c : TM → TM, and if p ∈ M and γ : [0, 1] → N satisfy f(p) = γ(0), then there exists a unique smooth parallel function $\tilde\gamma$ : [0, 1] → M such that f ∘ $\tilde\gamma$ = γ and $\tilde\gamma$(0) = p. This is called the parallel transport of p along γ.

The proof of the above is the same as the proof for vector bundles, in the differential geometry notes.

Definition 6.21. The holonomy of a fibre bundle with Ehresmann connection is the maps

HOLp,q : Ωp,qN × f−1(p) → f−1(q)

defined by HOLp,q(γ, x) = $\tilde\gamma$(a), where γ : [0, a] → N satisfies γ(0) = p, γ(a) = q, $\tilde\gamma$ is parallel, and $\tilde\gamma$(0) = x.

As with vector bundles, the holonomy satisfies the groupoid property. So HOLp,q(γ, ·) is a diffeomorphism between f−1(p) and f−1(q).

A useful feature of connections we have not yet discussed is that they provide certain canonical

trivializations of bundles. We develop the idea by examples.

Example 6.22. Let M be the Möbius band, considering it to be a vector bundle with the projection map f : M → S1 from Example 6.15. We consider M to be a Riemann manifold, giving it the Riemann metric as a submanifold of R2 × R2. Consider the path γ : [0, 1] → S1 given by γ(t) = e^{2πit}. Then the holonomy is HOL1,1(γ, (1, v)) = (1, −v). (Note: computation skipped!)

More generally, the map [0, 1]×R → M given by

\[ (t, x) \longmapsto \mathrm{HOL}_{1,\, e^{2\pi i t}}\left(\gamma|_{[0,t]},\ (1, (0, x))\right) \]

is therefore one-to-one with the sole exception that (0, x) and (1, −x) get mapped to the same place. So the Möbius band looks like the cylinder [0, 1]×R with the boundaries glued together with a twist.


Example 6.23. Let f : M → N be a fibre bundle with a connection c. Let N be a Riemann manifold, p ∈ N and r > 0 such that the exponential map when restricted to the ball Br ⊂ TpN of radius r in TpN is a diffeomorphism to its image exp : Br → N. Denote the image by U = exp(Br) ⊂ N.

A diffeomorphism between f−1(U) and U × f−1(p) is given by

φ : U × f−1(p) → f−1(U)

with φ(q, w) = HOLp,q(γ, w) where γ : [0, 1] → U is given by γ(t) = exp(t · exp−1(q)).

Consider applying the above construction to the tangent bundle of Sn, π : TSn → Sn. Define the northern hemisphere of Sn to be HN = {p ∈ Sn : p = (x0, · · · , xn), x0 ≥ 0} and similarly the southern hemisphere HS = {p ∈ Sn : p = (x0, · · · , xn), x0 ≤ 0}. Define the north and south poles to be pN = (1, 0, · · · , 0) and pS = −pN. Then the exponential map is a diffeomorphism between the compact ball of radius π/2 in TpN Sn and HN. Similarly, the exponential map is a diffeomorphism between the compact ball of radius π/2 in TpS Sn and HS. Thus we have diffeomorphisms

φN : HN × π−1(pN) → π−1(HN)

φS : HS × π−1(pS) → π−1(HS).

But since HN and HS are diffeomorphic to compact discs, and HN ∩ HS = Sn−1 × {0}, and π−1(pN) = π−1(pS) = {0} × Rn, we have therefore expressed TSn as the union of two spaces diffeomorphic to Dn × Rn, glued together along their common Sn−1 × Rn boundary. The natural question is, what is the gluing map?

Specifically, if we restrict φN to (Sn−1 × {0}) × π−1(pN) we can post-compose this map with φS−1, giving a map

\[ \varphi_S^{-1} \circ \left(\varphi_N|_{S^{n-1} \times \{0\} \times \mathbb{R}^n}\right) : S^{n-1} \times \mathbb{R}^n \to S^{n-1} \times \mathbb{R}^n. \]

It is a homework problem to check that this map is given by

(p, v) ↦ (p, v − 2(p · v)p)

where p · v is the standard Euclidean inner product. So this map fixes p but in the fibres it is mirror-reflection in the hyperplane orthogonal to p.
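The fibrewise part of the gluing map, v ↦ v − 2(p · v)p, can be sanity-checked directly. The sketch below (with the illustrative name `glue`) only tests the properties a mirror-reflection must have; it does not derive the formula from parallel transport, which remains the homework problem.

```python
def glue(p, v):
    """Fibrewise gluing map v -> v - 2 (p . v) p for a unit vector p."""
    c = 2 * sum(pi * vi for pi, vi in zip(p, v))
    return [vi - c * pi for pi, vi in zip(p, v)]

if __name__ == "__main__":
    p = [0.6, 0.8, 0.0]          # a unit vector in R^3
    v = [1.0, -2.0, 0.5]
    print(glue(p, v))            # the reflected vector
    print(glue(p, glue(p, v)))   # reflecting twice should recover v
```

A reflection in the hyperplane orthogonal to p should negate p, fix every vector orthogonal to p, preserve lengths, and square to the identity.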

We have used extensively the corollary of the implicit function theorem that states that if f : M → N is smooth and p ∈ N is a regular value, then f−1(p) is a submanifold of M. One might further ask, when is a submanifold of M the pre-image of a regular value of a smooth function? This is largely answered in the next theorem.

Theorem 6.24. Let M ⊂ N be a compact submanifold. Then M = f−1(p) for a regular value p of a smooth function f : N → X for some smooth manifold X if and only if ν(M,N) is trivial, in the sense that there exist (n − m) pointwise linearly-independent sections of the bundle πν : ν(M,N) → M. This condition is sometimes stated as πν : ν(M,N) → M is a trivial vector bundle.

Proof. `=⇒' If M = f−1(p), then for any small neighbourhood U of p ∈ X, some tubular neighbourhood of M in N is contained in f−1(U). The proof of Theorem 6.13 applies, giving us a diffeomorphism f−1(U) → U × f−1(p), for perhaps a smaller neighbourhood U of p. Since U × f−1(p) is a product, ν({p} × f−1(p), U × f−1(p)) is a trivial vector bundle, and so is ν(M,N).

`⇐=' This is a more substantial argument, so we provide more of a sketch. Let exp : νr(M,N) → V ⊂ N be a tubular neighbourhood of M in N. Define f : V → Rn−m by f(q) = (t1, · · · , tn−m) where

\[ \exp^{-1}(q) = \sum_{i=1}^{n-m} t_i v_i(p), \]

where v1, · · · , vn−m : M → ν(M,N) are the linearly independent sections of πν. Notice that M = f−1(0) and 0 is a regular value of f, but the domain of f is not all of N, it is only V. To extend f to all of N we need to determine where to send the points of N \ V. The idea is to think of Rn−m ≡ {0} × Rn−m as the tangent space to Sn−m at the point (1, 0, · · · , 0). Let Exp : T(1,0,··· ,0)Sn−m → Sn−m be the exponential map for TSn−m. The composite

\[ \mathrm{Exp} \circ \left(\frac{\pi}{r} f\right) : V \to S^{n-m} \]

is smooth; moreover, all points near the boundary of V are sent to points near the south pole. So this map extends to a continuous function

N → Sn−m.

Unfortunately, this function is not generally differentiable at points in $\bar V \setminus V$. There are two ways this can be fixed. One would be to take a smooth approximation to this function, and use the tubular neighbourhood of Sn−m in Rn−m+1 to project it back to a map into Sn−m; another would be to use bump functions to `slow' the function down near $\bar V \setminus V$. Either way, we obtain the function we are looking for, with M the pre-image of the north pole, which is a regular value.

The proof above of Theorem 6.24 is sometimes called the Pontryagin construction. As we learn

the basics of transversality, we will see that this construction gives a strong relationship between

homotopy-classes of maps f : N → Sn−m and m-dimensional submanifolds of N.

0, 1 and 2-dimensional manifolds

We give a brief sketch of the classification of the low-dimensional manifolds. The proofs will be

complete for 0 and 1-dimensional manifolds, but will have some gaps for 2-dimensional manifolds,

which we will repair later once we know a little intersection theory / transversality.

Dimension zero is the simplest to work with, since a 0-manifold is a collection of isolated points,

and is therefore countable. Moreover, a smooth map of 0-manifolds is simply a function. In particular

there is no notion of 0-manifold with boundary.

Theorem 6.25. Two 0-dimensional manifolds are diffeomorphic if and only if they have the same cardinality. Therefore the list:

∅, {1}, {1, 2}, {1, 2, 3}, · · · , {1, 2, 3, · · · , n}, · · · , Z

is a complete and non-redundant list of 0-manifolds up to diffeomorphism. In particular, there are only two connected 0-manifolds up to diffeomorphism: the empty set ∅, and a 1-point space.

One-dimensional manifolds are more complicated. We list only the connected ones, since every 1-manifold is characterized by how many path components are diffeomorphic to the various standard connected 1-manifolds, and there can be at most countably-many such components.

Theorem 6.26. Up to diffeomorphism, the only connected 1-manifolds are:

∅, S1, [0, 1], [0, 1), (−1, 1)

Proof. The key idea is to argue that the exponential map from a tangent space to the manifold is an onto function. Then consider a maximal open interval in a tangent space such that the exponential map (for some Riemann metric) is one-to-one, and ask where the endpoints go, or if the exponential map is defined there.


Two-dimensional manifolds have also been classified. The classification for non-compact 2-manifolds is too complicated to state in these notes. To get a sense for how complicated it can get, consider the complement of a Cantor set in R2. This is a non-compact 2-manifold.

We will not complete the proof of the classification for compact 2-manifolds here, but we outline the basic ideas in the proof and point out the key technical parts of the proof. In particular we will build up some basic terminology to describe 2-manifolds and important 1-manifolds contained in them.

We will call a 2-dimensional manifold a surface and a 1-dimensional manifold a curve. Given a compact surface without boundary M and n distinct points p1, · · · , pn ∈ M, we will call the complement of a tubular neighbourhood of the points in M a punctured surface. Technically, for this to be a manifold with boundary we need the closures of these tubular neighbourhoods of the individual points to be disjoint.

Proposition 6.27. Every compact surface M is diffeomorphic to a punctured surface. One obtains M by puncturing a unique (up to diffeomorphism) compact boundaryless surface.

Although we will not prove the above proposition, the rough idea is that the boundary of M has to be diffeomorphic to a finite collection of circles, which one caps off with discs. `Capping off' might require embedding the surface M in a higher-dimensional Euclidean space: consider an embedding of the Möbius band in R3 for example. Capping off is formally a simpler idea when dealing with abstract manifolds, since one does not need to worry about producing an embedding in Euclidean space.

Manifolds frequently are studied inductively: surfaces by the curves contained in them, 3-dimensional manifolds by the surfaces contained in them, and so on. A connected compact boundaryless curve in a surface is diffeomorphic to a circle. Consider a tubular neighbourhood of an embedded circle in the interior of a surface. By design, it is a 1-dimensional vector bundle over the circle. Similar to the argument in Example 6.22, one can prove that there are precisely two surfaces that are the total spaces of 1-dimensional vector bundles over the circle: the Möbius band and S1 × R. The key idea is to put a connection on the bundle, so the manifold looks like [0, 1]×R with a gluing operation on the boundary. The gluing operation is a diffeomorphism {0} × R → {1} × R, given by the holonomy once about the base circle. Proposition 6.28 tells us the holonomy is of two kinds.

Proposition 6.28. Up to isotopy, there are precisely two diffeomorphisms of R: the identity IdR and −IdR.

Proof. Let f : R → R be a diffeomorphism. We will define two isotopies of f.

F(t, x) = (1 − t)f(x) + t(f(0) + x(f(1) − f(0)))

starts at F0 = f and ends with F1(x) = f(0) + x(f(1) − f(0)), i.e. an affine-linear map which agrees with f at x = 0 and x = 1. There is a similar isotopy

\[ G(t, x) = f(x) + t\left(-f(0) + x\left(\frac{f(1) - f(0)}{|f(1) - f(0)|} - f(1) + f(0)\right)\right) \]

which starts with G0 = f and adds an affine-linear function to f so that G(1, 0) = 0 and G(1, 1) = (f(1) − f(0))/|f(1) − f(0)|.

Call G translation and rescaling of f, and F taking out the kinks. We would like to do both. Think of the formula for F as giving F as a function of three variables, F(f, t, x), where f : R → R is a diffeomorphism, and similarly G(f, t, x). Then our isotopy is H : [0, 1]×R → R given by H(t, x) = F(G(f, t, ·), t, x) = F(Gf,t, t, x), i.e. apply F to the time-t G-isotopy of f. This is clearly smooth, H0 = f and H1 is ±IdR.
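The formulas for F, G and H in the proof can be evaluated directly. The sketch below treats F and G as functions of three arguments (f, t, x), as the proof suggests, and uses two sample diffeomorphisms of R that are assumptions chosen for illustration. It only checks the endpoints H0 = f and H1 = ±IdR; it does not verify that each intermediate Ht is a diffeomorphism.

```python
def F(f, t, x):
    """'Taking out the kinks': interpolate from f to the affine map
    that agrees with f at x = 0 and x = 1."""
    return (1 - t) * f(x) + t * (f(0) + x * (f(1) - f(0)))

def G(f, t, x):
    """'Translation and rescaling': add t times an affine function so that
    at t = 1 the value at 0 is 0 and the value at 1 is +1 or -1."""
    s = (f(1) - f(0)) / abs(f(1) - f(0))
    return f(x) + t * (-f(0) + x * (s - f(1) + f(0)))

def H(f, t, x):
    """H(t, x) = F(G(f, t, .), t, x): apply F to the time-t G-isotopy of f."""
    g_t = lambda y: G(f, t, y)
    return F(g_t, t, x)

if __name__ == "__main__":
    increasing = lambda x: x ** 3 + x + 2   # sample orientation-preserving map
    decreasing = lambda x: 5 - x ** 3 - x   # sample orientation-reversing map
    for x in (-2.0, 0.0, 0.5, 3.0):
        print(x, H(increasing, 1, x), H(decreasing, 1, x))
```

The sign of f(1) − f(0) decides which of the two classes f lands in: increasing maps end at the identity, decreasing maps at its negative.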


One could also finish the above argument by concatenating the two isotopies F and G, but to do that one would have to apply bump functions to ensure the concatenation is smooth.

To construct a diffeomorphism between the total space of an arbitrary 1-dimensional vector bundle over S1 and either S1 × R or the Möbius band, one simply defines the map on an individual fibre and extends via the holonomy. As one completes the loop about the base circle, the holonomies might not match exactly, so one modifies the map via the isotopy given in Proposition 6.28 in a little tubular neighbourhood of a fibre.

So the tubular neighbourhood of a circle in a surface can be trivial or non-trivial. If it is non-trivial, it is diffeomorphic to the Möbius band; if trivial, diffeomorphic to S1 × R, a cylinder. Curves with trivial tubular neighbourhoods are said to be 2-sided. Curves with Möbius band tubular neighbourhoods are called 1-sided. If the surface is connected, removal of a 1-sided curve never disconnects the manifold. But for a 2-sided curve, sometimes it does, and sometimes it does not. The terminology is separating and non-separating. If a curve is separating, capping off results in two compact boundaryless manifolds, M1 and M2. In this case we say M is the connect-sum of M1 and M2. If the 2-sided curve is non-separating, capping off results in a new connected surface, N. In this case, M is either the connect-sum of N with a torus S1 × S1 or it is the connect-sum of N with a Klein bottle (which is itself the connect-sum of two real projective planes RP2). The reason for this is fairly simple. Consider the surface that consists of the tubular neighbourhood of our 2-sided curve, together with a small neighbourhood of an embedded circle in the surface that intersects the curve only once, connecting the two sides of M with the tubular neighbourhood removed. Either this curve travels an orientable path or not.

So in this fashion one can repeatedly cut surfaces into (one might hope!) simpler surfaces. This is the case.

Theorem 6.29. In S2, every embedded circle bounds an embedded disc D2. Moreover, if a compact connected boundaryless 2-manifold is such that every embedded circle bounds an embedded disc, then it is diffeomorphic to S2.

Corollary 6.30. Every compact connected boundaryless surface is the connect-sum of finitely many copies of S1 × S1 or RP2, the former if it is orientable, the latter if not. Moreover, the number of copies is well-defined up to diffeomorphism, and it is called the genus of the surface. Let $\Sigma^+_g$ denote the orientable surface of genus g, and $\Sigma^-_g$ the non-orientable surface of genus g.

So $\Sigma^+_0 = S^2$, $\Sigma^+_1 = S^1 \times S^1$, etc. $\Sigma^-_0$ does not exist, $\Sigma^-_1 = \mathbb{RP}^2$.

It is fairly typical to denote the n-times punctured surface by $\Sigma^+_{g,n}$, $\Sigma^-_{g,n}$, etc.

Exercises

Problem 6.31. Consider the parabola P = {(x, y) ∈ R2 : y = x2}.

(a) Construct an explicit diffeomorphism f : R2 → ν(P,R2) such that f(R × {0}) = P with f(x, y) an affine-linear function of y (with x fixed). Hint: Try to write f(x, y) = ((x, x2), y v⃗(x)) where v⃗(x) is orthogonal to T(x,x2)P and of unit length.

(b) Compute the set of focal points of P, i.e. the points in the plane where the derivative of exp : ν(P,R2) → R2 is not an isomorphism. Make a sketch of the parabola together with all the focal points.

(c) Compute the injectivity radius Inj(P,R2).

(d) If k > 0, compute the injectivity radius of the parabola Pk = {(x, y) ∈ R2 : y = kx2}.

Problem 6.32. Compute the injectivity radius for the sphere Inj(Sn), also for Euclidean space, and

hyperbolic n-space Hn.

Problem 6.33. What is the injectivity radius of S1 × S1 ⊂ R2 × R2?

Problem 6.34. What is the injectivity radius of the Möbius band?

Problem 6.35. Compute the injectivity radius of $S^j \times \{0\} \subset S^n$ for any pair $j < n$.

Problem 6.36. Let $M$ be a connected, compact manifold without boundary. Let $p, q \in M$. Prove that there exists a diffeomorphism of $M$, $f : M \to M$, such that $f(p) = q$. Hint: Interpret a path $\gamma : [0, 1] \to M$ as an isotopy of a one-point embedding!

Problem 6.37. Let M ⊂ C2 be the Möbius band from Chapter 5 of the course notes, considered as

a vector bundle π : M → S1 given by π(p, v) = p. Compute an explicit smooth section of π which

is transverse to the 0-section. Compute the intersection of the 0-section with this transverse section.

How many points are in it?

Problem 6.38. a) Show that the function $f : \mathbb{R} \to \mathbb{R}^2$ defined by
$$f(t) = \left(\frac{1}{1 + t^2},\; t - \frac{2t}{1 + t^2}\right)$$
is an immersion. Sketch the image of $f$. Determine if the self-intersections are transverse.

b) Consider the function $g : \mathbb{R}^2 \to \mathbb{R}^3$ defined by
$$g(t, a) = \left(\frac{1}{(1 + t^2)(1 + a^2)},\; t - \frac{2t}{(1 + t^2)(1 + a^2)},\; \frac{ta}{(1 + t^2)(1 + a^2)}\right).$$
Sketch $g(\mathbb{R}, a)$ for $a = -1, 0, 1$ separately.

c) Consider the function $G : \mathbb{R}^2 \to \mathbb{R}^4$ defined by $G(x, y) = (g(x, y), y)$. Show that $G$ is an immersion but not an embedding. Are the self-intersections transverse?

Comment: The above family of maps is due to Hassler Whitney. These maps are central in the proof of the `Whitney trick', which is the basis for the strong Whitney embedding theorem and the h-cobordism theorem.


7. Representation Theory

This section furthers the basic theory of compact Lie groups. We start with a quick introduction to integration of forms on manifolds. Integration lets us average over group actions, and so define invariant Riemann metrics and volume forms on manifolds. This in turn allows us to build up the theory of representations of compact Lie groups, as well as to construct many interesting examples of Riemann manifolds. We use these tools to describe equivariant tubular neighbourhoods, and to describe the basic local structure of Lie group actions on manifolds.

Definition 7.1. We say $\omega$ is a $k$-form on a manifold $M \subset \mathbb{R}^n$ if it is an alternating, multi-linear function on each tangent space of $M$. We demand $\omega$ is smooth, in the sense that it is a smooth function
$$\omega : \oplus^k TM \to \mathbb{R}.$$
Differential forms are a vector space (and a module over $C^\infty(M, \mathbb{R})$), and we denote the vector space / module of all differential $k$-forms on $M$ by $\Lambda^k(M)$.

For example, a 1-form is a function $\omega : TM \to \mathbb{R}$ which is linear on each fibre, $\omega_p : T_pM \to \mathbb{R}$ for all $p \in M$, with $\omega_p(v) = \omega(p, v)$. Thus 1-forms can be pre-composed with vector fields on $M$.

One can interpret a 1-form as a type of vector field, but not a vector field on the tangent bundle of $M$. Define a vector bundle over the manifold $M$ to be one where the fibre over $p \in M$ is the dual space to $T_pM$. This is called the co-tangent bundle to $M$, and denoted $T^*M$. A 1-form is therefore a smooth section of the co-tangent bundle. Since the functor that turns a vector space into its dual space is contra-variant, there is no canonical way to embed the co-tangent bundle in Euclidean space. So typically people define the co-tangent bundle as an abstract manifold, rather than a submanifold of a Euclidean space. For this reason we will maintain the formalism of Definition 7.1, which avoids the co-tangent bundle and abstract manifolds.

Definition 7.2. Recall that if $f : M \to N$ is a smooth map of manifolds, the derivative was the map $Df : TM \to TN$ defined as $Df(p, v) = (f(p), Df_p(v))$. Similarly, there is a sum of derivatives $Df^{\oplus k} : \oplus^k TM \to \oplus^k TN$ defined by $Df^{\oplus k}(p, v_1, \cdots, v_k) = (f(p), Df_p(v_1), \cdots, Df_p(v_k))$.

If $\omega : \oplus^k TN \to \mathbb{R}$ is a $k$-form and $f : M \to N$ is smooth, the pull-back $f^*\omega$ is defined as the composite
$$f^*\omega = \omega \circ Df^{\oplus k}.$$
Similarly, if $\omega$ is a $k$-form and $\eta$ is a $j$-form then $\omega \wedge \eta$ is a $(k + j)$-form, defined tangent space by tangent space as
$$(\omega \wedge \eta)(p, v_1, \cdots, v_k, v_{k+1}, \cdots, v_{k+j}) = (\omega_p \wedge \eta_p)(v_1, \cdots, v_k, v_{k+1}, \cdots, v_{k+j}).$$
The wedge product of forms on a given vector space is defined as in Chapter 0, Definition 0.17:
$$(\omega \wedge \eta)_p = \frac{(k + j)!}{k!\,j!} \cdot \mathrm{asym}(\omega_p \otimes \eta_p).$$

Example 7.3. Typically, if $M \subset \mathbb{R}^n$, we represent a $k$-form on the manifold $M$ in terms of the standard basis for the $k$-forms on $\mathbb{R}^n$. That is, we extend the form on $M$ to a form defined on all of $\mathbb{R}^n$. We can do this whenever $M$ is closed, by the $\varepsilon$-neighbourhood theorem. So it makes perfect sense to consider expressions of the form
$$\frac{1}{x^2 + y^2}(y\,dx - x\,dy)$$
to be a 1-form on $\mathbb{R}^2 \setminus \{0\}$ as well as on any 1-manifold in $\mathbb{R}^2$. Precisely, if $M \subset \mathbb{R}^2$ the one-form on $M$ is the pull-back via the inclusion $i : M \to \mathbb{R}^2$. If one needs to carefully keep track of the domains of forms, $\omega = \frac{1}{x^2+y^2}(y\,dx - x\,dy)$ would be considered to be an element of $\Lambda^1(\mathbb{R}^2 \setminus \{0\})$, while $i^*\omega$ would be an element of $\Lambda^1 S^1$, with $i : S^1 \to \mathbb{R}^2 \setminus \{0\}$ being the inclusion.

The 2-form
$$x\,dy \wedge dz + y\,dz \wedge dx + z\,dx \wedge dy$$
is the area form on $S^2$, since it returns the oriented area spanned by the parallelepiped of any two tangent vectors to $S^2$.
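The area-form claim can be spot-checked numerically. The following is a minimal Python sketch, not part of the original notes: the helper names `area_form` and `spherical_frame` are illustrative, and the orthonormal tangent frame comes from standard spherical coordinates, which is an assumption about how one chooses to parametrize $S^2$.

```python
import math

def area_form(p, u, v):
    """Evaluate x dy^dz + y dz^dx + z dx^dy at the point p on the pair (u, v)."""
    x, y, z = p
    dy_dz = u[1] * v[2] - u[2] * v[1]
    dz_dx = u[2] * v[0] - u[0] * v[2]
    dx_dy = u[0] * v[1] - u[1] * v[0]
    return x * dy_dz + y * dz_dx + z * dx_dy

def spherical_frame(theta, phi):
    """Return a point of S^2 and an orthonormal tangent basis (e_theta, e_phi) there."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    p = (st * cp, st * sp, ct)
    e_theta = (ct * cp, ct * sp, -st)   # unit vector in the direction of increasing theta
    e_phi = (-sp, cp, 0.0)              # unit vector in the direction of increasing phi
    return p, e_theta, e_phi

# An orthonormal tangent pair spans a unit square, so the form should return +1 or -1
# depending on the order of the pair.
p, e_theta, e_phi = spherical_frame(0.7, 2.1)
unit_area = area_form(p, e_theta, e_phi)
flipped_area = area_form(p, e_phi, e_theta)
```

Swapping the two tangent vectors flips the sign, which is the "oriented" part of oriented area.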

One way in which this notation suffers is that extensions are never unique. For example, both $y\,dx - x\,dy$ and $\frac{1}{x^2+y^2}(y\,dx - x\,dy)$ define the same 1-form on $S^1$.
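As a quick sanity check of this non-uniqueness, here is a small Python sketch (the function names are hypothetical, chosen only for illustration). It evaluates both extensions on tangent vectors to $S^1$, where they should agree, and at a point off the circle, where they need not.

```python
import math

def omega_a(p, v):
    """The 1-form y dx - x dy on R^2, evaluated at the point p on the vector v."""
    x, y = p
    return y * v[0] - x * v[1]

def omega_b(p, v):
    """The 1-form (y dx - x dy)/(x^2 + y^2) on R^2 minus the origin."""
    x, y = p
    return (y * v[0] - x * v[1]) / (x * x + y * y)

def max_difference_on_circle(samples=360):
    """Largest |omega_a - omega_b| over sampled unit tangent vectors to S^1."""
    worst = 0.0
    for i in range(samples):
        t = 2 * math.pi * i / samples
        p = (math.cos(t), math.sin(t))
        v = (-math.sin(t), math.cos(t))  # tangent to S^1 at p
        worst = max(worst, abs(omega_a(p, v) - omega_b(p, v)))
    return worst

# Off the unit circle the two extensions differ, e.g. at (2, 0) on the vector (0, 1).
off_circle_gap = abs(omega_a((2.0, 0.0), (0.0, 1.0)) - omega_b((2.0, 0.0), (0.0, 1.0)))
```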

The pull-back operation is functorial, linear, and compatible with the wedge product.

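• $(f \circ g)^*\omega = g^*(f^*\omega)$

• $\mathrm{Id}_M^*\omega = \omega$

• $f^*(c_1\omega_1 + c_2\omega_2) = c_1 f^*\omega_1 + c_2 f^*\omega_2$

• $f^*(\omega_1 \wedge \omega_2) = f^*\omega_1 \wedge f^*\omega_2$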

Just like with real-valued functions, the support of a differential $k$-form is the closure of the collection of points $p \in M$ such that $\omega(p, \cdot) \neq 0$.
$$\mathrm{supp}(\omega) = \overline{\{p \in M : \omega_p \neq 0\}}$$

Definition 7.4. Given a differential $n$-form $\omega$ with compact support on an oriented manifold $N$ of dimension $n$, the Riemann integral $\int_N \omega$ is defined as
$$\int_N \omega = \sum_{j=1}^m \varepsilon_j \int_{\mathbb{R}^n} \phi_j^*\omega_j$$
where $\omega = \sum_{j=1}^m \omega_j$ and the supports of the $\omega_j$ are required to be compact. We demand that $\phi_j : U_j \to N$ is an embedding with $U_j \subset \mathbb{R}^n$ open and connected, and $\mathrm{supp}(\omega_j) \subset \phi_j(U_j)$. The number $\varepsilon_j \in \{+1, -1\}$ is chosen to be positive precisely when $\phi_j$ is orientation-preserving. The differential form $\phi_j^*\omega_j$ can be written uniquely as $\phi_j^*\omega_j = f_j\,dx_1 \wedge \cdots \wedge dx_n$ where $f_j : \mathbb{R}^n \to \mathbb{R}$ is a function with compact support. We define the integral $\int_{\mathbb{R}^n} \phi_j^*\omega_j = \int_{\mathbb{R}^n} f_j$, i.e. the integral of a form in Euclidean space is the Riemann integral of the scalar-valued function in front of the determinant $n$-form. One can replace the Riemann integral with the Lebesgue integral, to similarly define the Lebesgue integral of forms on manifolds.

Proposition 7.5. Definition 7.4 is consistent, and independent of the choice of charts or decomposition $\omega = \sum \omega_i$. Moreover, the integral satisfies

• $\int_N (c_1\omega + c_2\eta) = c_1 \int_N \omega + c_2 \int_N \eta$

• $\int_N f^*\omega = \int_M \omega$ if $f : N \to M$ is an orientation-preserving diffeomorphism.

• $\int_{-N} \omega = -\int_N \omega$.

Definition 7.6. We define the exterior derivative as a function $d : \Lambda^k M \to \Lambda^{k+1} M$. The idea is much like a regular derivative. We give the initial definition in Euclidean space to simplify matters. Let $\omega$ be a $k$-form on $\mathbb{R}^n$. Vectors $v_0, v_1, \cdots, v_k \in \mathbb{R}^n$ and a point $p \in \mathbb{R}^n$ define the parallelepiped based at $p$ as
$$P_v(p) = \{p + t_0v_0 + \cdots + t_kv_k : 0 \leq t_i \leq 1\ \forall i\}.$$


If we define
$$P^\varepsilon_{i,v}(p) = \{p + t_0v_0 + \cdots + t_kv_k : 0 \leq t_j \leq 1 \text{ if } j \neq i,\ t_i = \varepsilon\}$$
a natural way to orient these facets is via the parametrization
$$[0, 1]^k \ni (t_0, t_1, \cdots, \widehat{t_i}, \cdots, t_k) \longmapsto p + t_0v_0 + \cdots + t_{i-1}v_{i-1} + \varepsilon v_i + t_{i+1}v_{i+1} + \cdots + t_kv_k \in P^\varepsilon_{i,v}(p)$$
which gives us the expression
$$\partial P_v(p) = \bigcup_{i=0}^k (-1)^{1+i} P^0_{i,v}(p) \cup (-1)^i P^1_{i,v}(p)$$
for the oriented boundary. Thus the most coarse Riemann sum for the integral of $\omega$ over $\partial P_v(p)$ is given by
$$\sum_{i=0}^k (-1)^{1+i}\omega_p(v_0, \cdots, \widehat{v_i}, \cdots, v_k) + \sum_{i=0}^k (-1)^i\omega_{p+v_i}(v_0, \cdots, \widehat{v_i}, \cdots, v_k)$$
which has as an approximation a $(k + 1)$-st order Taylor expansion (in $v$) whose only non-trivial term is a $(k + 1)$-form called the exterior derivative.
$$\sum_{i=0}^k (-1)^{1+i}\omega_p(v_0, \cdots, \widehat{v_i}, \cdots, v_k) + \sum_{i=0}^k (-1)^i\omega_{p+v_i}(v_0, \cdots, \widehat{v_i}, \cdots, v_k) \simeq_{\varepsilon|v|^{k+1}} (d\omega)_p(v_0, v_1, \cdots, v_k)$$

Definition 7.6 tells us immediately that for a 0-form $\omega \in \Lambda^0\mathbb{R}^n \equiv C^\infty(\mathbb{R}^n, \mathbb{R})$, the exterior derivative is essentially the derivative in disguise, i.e.
$$d\omega_p(v) = D\omega_p(v).$$
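To make the coarse Riemann sum of Definition 7.6 concrete in the next case up, here is a small Python sketch for a 1-form on $\mathbb{R}^2$ (so $k = 1$, with vectors $v_0, v_1$). The choice of test form $\omega = x^2\,dy$, whose exterior derivative is $2x\,dx \wedge dy$, and all function names are illustrative assumptions, not taken from the notes. The sketch scales the vectors by $t$ and divides the boundary sum by $t^2$.

```python
def omega(q, v):
    """The test 1-form x^2 dy, evaluated at the point q on the vector v."""
    return q[0] ** 2 * v[1]

def d_omega(p, v0, v1):
    """The expected exterior derivative 2x dx^dy, evaluated at p on (v0, v1)."""
    return 2 * p[0] * (v0[0] * v1[1] - v0[1] * v1[0])

def boundary_sum(p, v0, v1):
    """Coarse Riemann sum over the oriented boundary of the parallelogram P_v(p).

    For k = 1 the two sums in Definition 7.6 reduce to
    [omega_{p+v0}(v1) - omega_p(v1)] - [omega_{p+v1}(v0) - omega_p(v0)].
    """
    shift = lambda a, b: (a[0] + b[0], a[1] + b[1])
    return (omega(shift(p, v0), v1) - omega(p, v1)) - (omega(shift(p, v1), v0) - omega(p, v0))

def scaled_estimate(p, v0, v1, t):
    """Boundary sum for the parallelogram spanned by t*v0, t*v1, divided by t^2."""
    scale = lambda v: (t * v[0], t * v[1])
    return boundary_sum(p, scale(v0), scale(v1)) / t ** 2

p, v0, v1 = (1.5, -0.3), (1.0, 2.0), (-1.0, 0.5)
exact = d_omega(p, v0, v1)
estimate = scaled_estimate(p, v0, v1, 1e-4)
```

As $t$ shrinks the estimate approaches the value of $d\omega$; the leftover error is the higher-order part of the Taylor expansion.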

Proposition 7.7. The exterior derivative given in Definition 7.6 is well-defined. It is also natural, i.e. $f^*(d\omega) = d(f^*\omega)$ for all $k$-forms $\omega$ on $\mathbb{R}^n$ and all smooth functions $f : \mathbb{R}^m \to \mathbb{R}^n$. Given a $k$-form $\omega$ on $\mathbb{R}^n$, $\omega = \sum_I f_I\,dx_I$ with $f_I : \mathbb{R}^n \to \mathbb{R}$ smooth, there is the formula
$$d\omega = \sum_I d(f_I) \wedge dx_I.$$
Because of the naturality, the exterior derivative extends to arbitrary manifolds. For example, one definition of $d\omega$ for $\omega \in \Lambda^k N$ would be to extend $\omega$ locally to a $k$-form on a neighbourhood (in the ambient Euclidean space), take the exterior derivative there and pull back to the manifold. The exterior derivative of forms on manifolds is also natural, $d(f^*\omega) = f^*(d\omega)$, where $f : M \to N$ is any smooth function, and $\omega \in \Lambda^k N$. It is also nilpotent of order two, i.e. $d^2\omega = d(d\omega) = 0$ always. Given two differential forms on a manifold, $\omega \in \Lambda^k N$ and $\eta \in \Lambda^j N$, the exterior derivative satisfies
$$d(\omega \wedge \eta) = (d\omega) \wedge \eta + (-1)^k\omega \wedge (d\eta).$$

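Proof. To see that the exterior derivative is well-defined let's first verify the formula for a $k$-form $\omega \in \Lambda^k\mathbb{R}^n$
$$d\omega = \sum_I d(f_I) \wedge dx_I \quad \text{where} \quad \omega = \sum_I f_I\,dx_I.$$
Let $v_i = x_ie_{j_i}$ with $j_0 < j_1 < \cdots < j_k$. Definition 7.6 tells us that
$$\begin{aligned}
&\sum_{i=0}^k (-1)^i\left(\omega_{(p+x_ie_{j_i})}(x_0e_{j_0}, \cdots, \widehat{x_ie_{j_i}}, \cdots, x_ke_{j_k}) - \omega_p(x_0e_{j_0}, \cdots, \widehat{x_ie_{j_i}}, \cdots, x_ke_{j_k})\right) \\
&\quad = \sum_{i=0}^k (-1)^i \prod_{j \neq i} x_j \left(\omega_{(p+x_ie_{j_i})}(e_{j_0}, \cdots, \widehat{e_{j_i}}, \cdots, e_{j_k}) - \omega_p(e_{j_0}, \cdots, \widehat{e_{j_i}}, \cdots, e_{j_k})\right) \\
&\quad = \sum_{i=0}^k (-1)^i \prod_{j \neq i} x_j \left(f_{J_i}(p + x_ie_{j_i}) - f_{J_i}(p)\right) \qquad J_i = \{j_0, \cdots, j_k\} \setminus \{j_i\} \\
&\quad \simeq_{\varepsilon|\prod_i x_i|} \left(\prod_{i=0}^k x_i\right) \sum_{i=0}^k (-1)^i \frac{\partial f_{J_i}}{\partial e_{j_i}}(p)
\end{aligned}$$
Our formula for $d\omega$ states
$$\begin{aligned}
d\omega_p(e_{j_0}, \cdots, e_{j_k}) &= \sum_I d(f_I) \wedge dx_I(e_{j_0}, \cdots, e_{j_k}) \\
&= \sum_I \sum_{j \notin I} \frac{\partial f_I}{\partial e_j}(p)\,dx_j \wedge dx_I(e_{j_0}, \cdots, e_{j_k}) \\
&= \sum_{i=0}^k (-1)^i \frac{\partial f_{J_i}}{\partial e_{j_i}}(p)
\end{aligned}$$
which confirms the identity.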

Nilpotence follows directly from the above formula: it is just Clairaut's theorem on equality of mixed partials, buried in notation.

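To confirm the compatibility with the wedge product, let $\omega = \sum_I f_I\,dx_I$ and $\eta = \sum_J g_J\,dx_J$. Then
$$\omega \wedge \eta = \sum_{I \cap J = \emptyset} f_Ig_J\,dx_I \wedge dx_J$$
so the exterior derivative is given by
$$\begin{aligned}
d(\omega \wedge \eta) &= \sum_{I \cap J = \emptyset} d(f_Ig_J) \wedge dx_I \wedge dx_J \\
&= \sum_{I \cap J = \emptyset} \left(d(f_I)g_J + f_I\,d(g_J)\right) \wedge dx_I \wedge dx_J \\
&= \sum_{I \cap J = \emptyset} d(f_I) \wedge dx_I \wedge (g_J\,dx_J) + (-1)^k f_I\,dx_I \wedge (d(g_J) \wedge dx_J) \\
&= d\omega \wedge \eta + (-1)^k\omega \wedge d\eta
\end{aligned}$$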


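To confirm naturality, we will verify it for real-valued functions and forms of the type $\omega = dx_i \wedge \eta$. This gives the proof by induction, since every form is a linear combination of forms of this kind.
$$\begin{aligned}
f^*d\omega &= f^*d(dx_i \wedge \eta) \\
&= f^*\left(d^2x_i \wedge \eta - dx_i \wedge d\eta\right) \\
&= f^*(-dx_i \wedge d\eta) \\
&= -f^*(dx_i) \wedge f^*(d\eta) \\
&= -d(f^*x_i) \wedge d(f^*\eta)
\end{aligned}$$
$$\begin{aligned}
d(f^*\omega) &= d(f^*dx_i \wedge f^*\eta) \\
&= d(d(f^*x_i) \wedge f^*\eta) \\
&= -d(f^*x_i) \wedge d(f^*\eta)
\end{aligned}$$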

Recall that if $\omega \in \Lambda^kM$ and $v : M \to TM$ is a vector field, then $\omega(v, \cdot)$ is a $(k - 1)$-form on the manifold, i.e. $\omega(v, \cdot) \in \Lambda^{k-1}M$, called the contraction of $\omega$ along $v$. Since a $k$-form takes $k$ vectors as input, the contraction is a type of partial evaluation of the form. The next proposition is an omnibus proposition that describes how the operations of contraction are compatible with the wedge product, the exterior derivative and the Lie derivative. It also gives a relation between the Lie derivative and the exterior derivative.

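Proposition 7.8. If $v : M \to TM$ is a vector field, and $\omega \in \Lambda^kM$ and $\eta \in \Lambda^jM$ then,

(1) $d(\omega(v, \cdot)) = L_v\omega - (d\omega)(v, \cdot)$

(2) $L_v(\omega(w, \cdot)) = (L_v\omega)(w, \cdot) + \omega([v, w], \cdot)$

(3) $(\omega \wedge \eta)(v, \cdot) = \omega(v, \cdot) \wedge \eta + (-1)^k\omega \wedge (\eta(v, \cdot))$

(4) $d(L_v\omega) = L_v(d\omega)$

(5) $L_v(\omega \wedge \eta) = (L_v\omega) \wedge \eta + \omega \wedge (L_v\eta)$.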

The Lie derivative of a form is defined exactly the same way as the Lie derivative of a vector field. The vector field $v$ has a flow $\Phi_v$, so we can compute the pull-back $\Phi^*_{v,-t}\omega$, and this is a $k$-form on our manifold for all $t$. The derivative of this at $t = 0$ is the Lie derivative. Since it can be computed at every point of our manifold, and since alternating multi-linear functions are a vector space, the Lie derivative of a $k$-form, defined in this way, is also a $k$-form.

Proof. Consider (1). (sketch of proof) This has a simple interpretation in terms of our original Riemann sum interpretation of the exterior derivative. If $\omega$ is a $k$-form, $d(\omega(v, \cdot))$ is a $k$-form which, when one plugs in a $k$-tuple of vectors $w_1, \cdots, w_k$, satisfies
$$d(\omega(v, \cdot))_p(w_1, \cdots, w_k) \simeq \int_{\partial P_w(p)} \omega(v, \cdot).$$
Contrast this with
$$(d\omega)_p(v, w_1, \cdots, w_k) \simeq \int_{\partial P_{v,w}(p)} \omega.$$
This latter integral involves all $2k + 2$ faces of the parallelepiped $P_{v,w}(p)$ while the first integral misses two of those faces. Our orientation conventions ensure these terms have opposite signs, thus $d(\omega(v, \cdot)) = -d\omega(v, \cdot) + (\text{correction term})$ where the correction term has to do with the missing two faces, this being the Lie derivative.


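Consider (2). The Lie derivative has the expression
$$L_v(\omega(w, \cdot)) = \left.\frac{\partial\,\Phi^*_{(v,-t)}(\omega(w, \cdot))}{\partial t}\right|_{t=0}$$
but the pull-back of the contraction is
$$\Phi^*_{(v,-t)}(\omega(w, \cdot)) = \left(\Phi^*_{(v,-t)}\omega\right)\left(\Phi^*_{v,-t}w, \cdot\right)$$
which we can compute using the chain rule, giving the result.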

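Consider (3).
$$\begin{aligned}
(\omega \wedge \eta)(v, \cdot) &= \frac{(k + j)!}{k!\,j!}\,\mathrm{asym}(\omega \otimes \eta)(v, \cdot) \\
&= \frac{1}{k!\,j!} \sum_{\sigma \in \Sigma(k+j)} (-1)^{|\sigma|}(\omega \otimes \eta).\sigma(v, \cdot)
\end{aligned}$$
Notice that $(\omega \otimes \eta).\sigma(v, \cdot) = (\omega \otimes \eta)(\sigma.(v, \cdot))$ where we use the left action of $\Sigma(k + j)$ on $(k + j)$-tuples of vectors. Partition $\Sigma(k + j)$ into the two subsets $\Sigma_K$, where elements send 1 into the set $\{1, 2, \cdots, k\}$, and $\Sigma_J$, where elements send 1 to $\{k + 1, k + 2, \cdots, k + j\}$. This gives us
$$\begin{aligned}
(\omega \wedge \eta)(v, \cdot) &= \frac{1}{k!\,j!}\left(\sum_{\sigma \in \Sigma_K} (-1)^{|\sigma|}(\omega \otimes \eta)\sigma.(v, \cdot) + \sum_{\sigma \in \Sigma_J} (-1)^{|\sigma|}(\omega \otimes \eta)\sigma.(v, \cdot)\right) \\
&= \frac{1}{k!\,j!} \sum_{\sigma \in \Sigma_K} (-1)^{|\sigma|}(\omega(v, \cdot) \otimes \eta).\overline{\sigma} \qquad \text{where } \overline{\sigma} = (1, \sigma(1)) \cdot \sigma \\
&\quad + \frac{1}{k!\,j!} \sum_{\sigma \in \Sigma_J} (-1)^{|\sigma|}(\omega \otimes \eta(v, \cdot)).\overline{\sigma} \qquad \text{where } \overline{\sigma} = (k + 1, \sigma(1)) \cdot \sigma \\
&= \frac{(k + j - 1)!}{(k - 1)!\,j!}\,\mathrm{asym}(\omega(v, \cdot) \otimes \eta) + (-1)^k \frac{(k + j - 1)!}{k!\,(j - 1)!}\,\mathrm{asym}(\omega \otimes \eta(v, \cdot)) \\
&= \omega(v, \cdot) \wedge \eta + (-1)^k\omega \wedge \eta(v, \cdot)
\end{aligned}$$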

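Consider (4). If $\omega$ is a form, $L_v\omega$ is the derivative $\left.\frac{\partial\,\Phi^*_{v,-t}\omega}{\partial t}\right|_{t=0}$, so
$$L_v(d\omega) = \left.\frac{\partial\,\Phi^*_{v,-t}d\omega}{\partial t}\right|_{t=0} = \left.\frac{\partial\left(d\,\Phi^*_{v,-t}\omega\right)}{\partial t}\right|_{t=0} = dL_v\omega,$$
where the last equality comes from the variable $t$ being independent of the spatial variable.

We leave (5) as an exercise.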

Example 7.9. If $\omega = x_1\,de_2 \wedge de_3 + x_2\,de_1 \wedge de_3 + x_3\,de_1 \wedge de_2$ then $d\omega = de_1 \wedge de_2 \wedge de_3$, which is the determinant function.
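The computation behind Example 7.9 can be spelled out using the formula of Proposition 7.7. The sketch below assumes $x_i$ denotes the $i$-th coordinate function, so that $dx_i = de_i$; this reading of the notation is an assumption.

```latex
\begin{aligned}
d\omega &= dx_1 \wedge de_2 \wedge de_3 + dx_2 \wedge de_1 \wedge de_3 + dx_3 \wedge de_1 \wedge de_2 \\
        &= de_1 \wedge de_2 \wedge de_3 + de_2 \wedge de_1 \wedge de_3 + de_3 \wedge de_1 \wedge de_2 \\
        &= (1 - 1 + 1)\, de_1 \wedge de_2 \wedge de_3 .
\end{aligned}
```

The middle term changes sign because it takes one transposition to reorder it, while the last term needs two.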

Theorem 7.10. (Poincaré Lemma) Given a $k$-form $\omega$ with compact support defined on $\mathbb{R}^n$, if $d\omega = 0$ then $\omega = d\alpha$ for some $(k - 1)$-form $\alpha$ on $\mathbb{R}^n$, also with compact support.


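Proof. The standard basis for $\Lambda^k\mathbb{R}^{n+1}$ is the set $\{dx_I : I \subset \{0, 1, \cdots, n\} \text{ is a } k \text{ element set}\}$. Partition this set into two disjoint subsets, corresponding to the indices $I$ where $0 \notin I$ and the ones where $0 \in I$; call these $A$ and $B$ respectively. This allows us to split $\Lambda^k\mathbb{R}^{n+1}$ as a direct sum of two free $C^\infty(\mathbb{R}^{n+1}, \mathbb{R})$-modules of rank $|A| = \binom{n}{k}$ and rank $|B| = \binom{n}{k-1}$ respectively.

Define a map $\alpha : \Lambda^k(\mathbb{R}^{n+1}) \to \Lambda^{k-1}(\mathbb{R}^{n+1})$ linearly by $f\,dx_I \longmapsto 0$ if $I \in A$ and $f\,dx_0 \wedge dx_I \longmapsto \left(\int_0^{x_0} f\,dx_0\right)dx_I$ if $\{0\} \cup I \in B$.

Check that
$$\mathrm{Id} - \pi^* \circ i^* = (d \circ \alpha + \alpha \circ d)$$
where $\mathrm{Id} : \Lambda^k\mathbb{R}^{n+1} \to \Lambda^k\mathbb{R}^{n+1}$ is the identity function, $\pi : \mathbb{R}^{n+1} \to \mathbb{R}^n$ is the map $\pi(x_0, x_1, \cdots, x_n) = (x_1, \cdots, x_n)$, $\pi^* : \Lambda^k\mathbb{R}^n \to \Lambda^k\mathbb{R}^{n+1}$ is the pull-back operation, and similarly $i : \mathbb{R}^n \to \mathbb{R}^{n+1}$ is the inclusion $i(x_1, \cdots, x_n) = (0, x_1, \cdots, x_n)$. So if we let $\omega \in \Lambda^k\mathbb{R}^{n+1}$ be a closed form, i.e. $d\omega = 0$, this formula tells us that
$$\omega = d(\alpha(\omega)) + \pi^*(i^*\omega)$$
so we have reduced the problem of showing $\omega$ is an exterior derivative to the same problem in one dimension lower: showing that $i^*\omega \in \Lambda^k\mathbb{R}^n$ is an exterior derivative. Applying the procedure recursively gives an inductive procedure to find $\eta$ such that $d\eta = \omega$.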

Theorem 7.11. (Stokes's) If $\omega$ is an $(n - 1)$-form with compact support defined on an $n$-manifold $N$, then
$$\int_N d\omega = \int_{\partial N} \omega$$

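Proof. Write $\omega = \sum_i \omega_i$ where $\mathrm{supp}(\omega_i) \subset U_i$ and $\phi_i : U_i \to V_i \subset H^n$ are charts. This can be done by a partition of unity argument. Then
$$\begin{aligned}
\int_N d\omega &= \sum_i \int_N d\omega_i \\
&= \sum_i \int_{H^n} (\phi_i^{-1})^*d\omega_i \\
&= \sum_i \int_{H^n} d(\phi_i^{-1})^*\omega_i
\end{aligned}$$
But notice that Stokes's theorem holds in $H^n$ by a direct argument. Write the form on $H^n$ as $\omega = \sum_i f_i\,de_1 \wedge \cdots \wedge \widehat{de_i} \wedge \cdots \wedge de_n$, then $d\omega = \sum_i (-1)^{i-1}\frac{\partial f_i}{\partial e_i}\,de_1 \wedge \cdots \wedge de_n$.
$$\begin{aligned}
\int_{H^n} d\omega &= \sum_i \int_{-\mathbb{R}^{n-1}} \int_0^\infty (-1)^{i-1}\frac{\partial f_i}{\partial e_i} \\
&= \int_{\mathbb{R}^{n-1}} f_1 \\
&= \int_{\partial H^n} \omega
\end{aligned}$$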

Recall that in single variable calculus there was the `strong' change of variables theorem, which states that $\int_a^b (f \circ g)g' = \int_{g(a)}^{g(b)} f$. The only conditions required on the functions are that both sides are integrable, $g$ is differentiable everywhere and $g([a, b]) \subset \mathrm{domain}(f)$. We do not require $g([a, b]) = [g(a), g(b)]$, i.e. $g$ need not be a diffeomorphism, nor even one-to-one; $g$ can even be constant. The reason we have such a clean theorem in the one-dimensional case is that one can assume $f$ has an anti-derivative $F$. The function $F(g(t)) - F(g(a))$ by the chain rule has $(f \circ g)g'$ as its derivative. The Poincaré Lemma gives us exactly the same tool for integration of forms, giving us a strong change-of-variables theorem for integration in $\mathbb{R}^n$.
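The one-dimensional statement is easy to test numerically with a $g$ that is not one-to-one. In the Python sketch below, the choices $f(u) = u^2$ and $g(t) = \sin t$ on $[0, 3\pi/2]$, along with the midpoint-rule helper, are illustrative assumptions rather than anything from the notes. Here $g$ rises to 1 and then falls to $-1$, so $g([a, b]) \neq [g(a), g(b)]$.

```python
import math

def midpoint_integral(func, a, b, steps=20000):
    """Midpoint-rule approximation of the oriented integral of func from a to b.

    If b < a the step is negative, so the result carries the expected sign.
    """
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

f = lambda u: u * u
g = math.sin
g_prime = math.cos
a, b = 0.0, 1.5 * math.pi

# Left side: integral over [a, b] of (f o g) * g'.
lhs = midpoint_integral(lambda t: f(g(t)) * g_prime(t), a, b)
# Right side: oriented integral of f from g(a) = 0 to g(b) = -1.
rhs = midpoint_integral(f, g(a), g(b))
```

Both sides come out near $-1/3$, even though $g$ doubles back over part of its range.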

Theorem 7.12. Let $\omega$ be an $n$-form defined on an open subset $V \subset \mathbb{R}^n$. Let $N \subset V$ be a compact $n$-dimensional submanifold. Provided $M$ is any compact $n$-dimensional submanifold of $\mathbb{R}^n$ and $f : M \to V$ is smooth,
$$\int_M f^*\omega = \int_N \omega$$
provided $f(\partial M) = \partial N$ and the restriction of $f$ to $\partial M$ is a degree one map on every component of $\partial M$. The degree one condition can be ensured by demanding that $f_{|\partial M} : \partial M \to \partial N$ is an orientation-preserving diffeomorphism.

Proof. Using a tubular neighbourhood of $N$ we can find a smooth real-valued function $\alpha : \mathbb{R}^n \to [0, 1]$ such that $\alpha(N) = 1$ and $\alpha(\mathbb{R}^n \setminus V) = 0$, thus $\omega \cdot \alpha$ admits a smooth extension to all of $\mathbb{R}^n$ (defined to be zero outside of $V$). Without loss of generality, we simply replace $\omega$ by this form. The advantage is $\omega = d\eta$ for some $(n - 1)$-form $\eta$ by the Poincaré Lemma.
$$\begin{aligned}
\int_N \omega &= \int_N d\eta && \text{by the Poincaré Lemma} \\
&= \int_{\partial N} \eta && \text{by Stokes's theorem} \\
&= \int_{\partial M} f^*\eta && \text{* see below} \\
&= \int_M df^*\eta && \text{by Stokes's theorem} \\
&= \int_M f^*d\eta && \text{naturality of } d \\
&= \int_M f^*\omega && \text{since } \omega = d\eta
\end{aligned}$$
The key step is that $\int_{\partial N} \eta = \int_{\partial M} f^*\eta$. In the case that $f : \partial M \to \partial N$ is an orientation-preserving diffeomorphism, this follows from Proposition 7.17. More generally, if we only know that $f_{|\partial M} : \partial M \to \partial N$ is degree one on every component of $\partial M$, the result follows from Lemma 7.13, whose proof is much like the proof of Proposition 7.17 but with the added input of degree theory.

Lemma 7.13. If $M$ and $N$ are compact, connected, oriented $n$-manifolds and $\omega \in \Lambda^nN$, then for any smooth function $f : M \to N$ we have
$$\int_M f^*\omega = \deg(f)\int_N \omega.$$

There are times when one wants to integrate real-valued functions on non-orientable manifolds, or when otherwise the framework of differential forms is a little too restrictive. Such integrals do not have quite as pleasant formal properties as integrals with respect to differential forms, but they are closely related.


Definition 7.14. A density on a finite-dimensional vector space $V$ is a function
$$\mu : V^n \to \mathbb{R}$$
where $n = \dim(V)$ with the properties that
$$\mu(v_1, v_2, \cdots, v_{k-1}, \lambda v_k, v_{k+1}, \cdots, v_n) = |\lambda|\,\mu(v_1, v_2, \cdots, v_n)$$
$$\mu\Big(v_1, v_2, \cdots, v_{j-1}, v_j + \sum_{k \neq j} \lambda_kv_k, v_{j+1}, \cdots, v_n\Big) = \mu(v_1, v_2, \cdots, v_n)$$
and
$$\mu(v_{\sigma(1)}, v_{\sigma(2)}, \cdots, v_{\sigma(n)}) = \mu(v_1, v_2, \cdots, v_n)$$
for any $\sigma \in \Sigma_n$.

It is not difficult to check that the last property follows from the first two.

Proposition 7.15. If µ is a density on V , then there is an alternating n-linear function ω such that

|µ(v1, · · · , vn)| = |ω(v1, · · · , vn)|

for all v1, · · · , vn ∈ V . Moreover, if f : V → V is linear, then

µ(f (v1), · · · , f (vn)) = |Det(f )|µ(v1, · · · , vn).
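The axioms of Definition 7.14 and the transformation rule of Proposition 7.15 can be checked by hand on the simplest example. The Python sketch below uses the standard density $|dx_1 \wedge dx_2|$ on $\mathbb{R}^2$, i.e. the absolute value of the determinant; the helper names and the particular test vectors and matrix are illustrative assumptions.

```python
def standard_density(v1, v2):
    """The density |dx1 ^ dx2| on R^2: the absolute value of det(v1, v2)."""
    return abs(v1[0] * v2[1] - v1[1] * v2[0])

def apply(matrix, v):
    """Apply a 2x2 matrix (given as a list of rows) to a vector in R^2."""
    return (matrix[0][0] * v[0] + matrix[0][1] * v[1],
            matrix[1][0] * v[0] + matrix[1][1] * v[1])

v1, v2 = (2.0, 1.0), (-1.0, 3.0)
base = standard_density(v1, v2)                       # |det| = 7

# Scaling one slot by lambda multiplies the density by |lambda|.
scaled = standard_density(v1, (-3.0 * v2[0], -3.0 * v2[1]))

# Adding a multiple of another slot (a shear) leaves the density unchanged.
sheared = standard_density((v1[0] + 5.0 * v2[0], v1[1] + 5.0 * v2[1]), v2)

# Permuting the slots leaves the density unchanged (unlike an alternating form).
swapped = standard_density(v2, v1)

# A linear map f rescales the density by |Det(f)|; here Det(f) = -2.
f = [[1.0, 2.0], [3.0, 4.0]]
transformed = standard_density(apply(f, v1), apply(f, v2))
```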

When one works through the proof of Proposition 7.15 one sees that if $\mu \neq 0$ there are precisely two forms $\omega$ such that $|\mu| = |\omega|$, since if it holds for $\omega$ it also holds for $-\omega$. The point is that a density can take only one sign, $\mu \geq 0$ or $\mu \leq 0$, but non-zero multi-linear functions must take all values in $\mathbb{R}$. Just as with forms, one can use densities to integrate, analogous to Definition 7.4 and Proposition 7.17. As with forms, densities on manifolds are defined as densities on all the tangent spaces, thus a density is a continuous or smooth function $\mu : \oplus^n TN \to \mathbb{R}$.

Definition 7.16. Given a density $\mu$ with compact support on a manifold $N$ of dimension $n$, the Riemann integral $\int_N \mu$ is defined as
$$\int_N \mu = \sum_{j=1}^m \int_{\mathbb{R}^n} \phi_j^*\omega_j$$
where $\mu = \sum_{j=1}^m \omega_j$ is a decomposition into densities $\omega_j$ whose supports are required to be compact. We demand that $\phi_j : U_j \to N$ is an embedding with $U_j \subset \mathbb{R}^n$ open and connected, and $\mathrm{supp}(\omega_j) \subset \phi_j(U_j)$. Pull-backs of densities are defined exactly as they are for forms, i.e. $\phi_j^*\omega_j(p, v_1, \cdots, v_n) = \omega_j(\phi_j(p), D(\phi_j)_p(v_1), \cdots, D(\phi_j)_p(v_n))$. The density $\phi_j^*\omega_j$ can be written uniquely as $\phi_j^*\omega_j = f_j\,|dx_1 \wedge \cdots \wedge dx_n|$ where $f_j : \mathbb{R}^n \to \mathbb{R}$ is a function with compact support. We define the integral $\int_{\mathbb{R}^n} \phi_j^*\omega_j = \int_{\mathbb{R}^n} f_j$, i.e. the integral of a density in Euclidean space is the Riemann integral of the scalar-valued function in front of the standard density. One can replace the Riemann integral with the Lebesgue integral, to similarly define the Lebesgue integral of densities on manifolds. This is only relevant when one wishes to pass to densities less regular than continuous functions $\mu : \oplus^n TN \to \mathbb{R}$.

Proposition 7.17. Definition 7.16 is consistent, and independent of the choice of charts or decomposition $\omega = \sum \omega_i$. Moreover, the integral satisfies

• $\int_N (c_1\omega + c_2\eta) = c_1\int_N \omega + c_2\int_N \eta$

• $\int_N f^*\omega = \int_M \omega$ if $f : N \to M$ is a diffeomorphism.


Averaging arguments in Lie Groups

There is a family of constructions in Lie groups that use a type of averaging argument. One way or another you have seen these types of arguments before, be it in Fourier analysis or the simpler observation that every function $f : \mathbb{R} \to \mathbb{R}$ decomposes uniquely as a sum $f = S + A$ where $S : \mathbb{R} \to \mathbb{R}$ is symmetric, i.e. $S(-x) = S(x)$, and $A : \mathbb{R} \to \mathbb{R}$ is anti-symmetric, i.e. $A(-x) = -A(x)$. One defines
$$S(x) = \frac{f(x) + f(-x)}{2} \quad \text{and} \quad A(x) = \frac{f(x) - f(-x)}{2}.$$

Before we begin proving theorems about Lie groups, we `warm up' by considering the simpler case that $G$ is a finite group. If $f : G \to GL_n\mathbb{R}$ is a representation, then we can show that there is a matrix $A \in GL_n\mathbb{R}$ such that $A^{-1}f(g)A \in O_n$ for all $g \in G$, i.e. every representation of a finite group is conjugate to a representation into the orthogonal group.

Proof. (of claim) Let $\mu$ be the standard inner product on $\mathbb{R}^n$, i.e. $\mu(v, w) = v \cdot w = \sum_{i=1}^n v_iw_i$. If $f : G \to GL_n$ is a representation of a finite group, and $g \in G$, consider $f(g) : \mathbb{R}^n \to \mathbb{R}^n$ to be the linear map associated to the matrix. The pull-back of the bilinear function $\mu$ along $f(g)$ is defined in the standard way
$$((f(g))^*\mu)(v, w) = \mu(f(g)(v), f(g)(w)).$$
So $f(g)^*\mu$ must also be an inner product on $\mathbb{R}^n$. The set of symmetric bilinear functions on a vector space is itself a vector space under the operations $(\mu + \nu)(v, w) = \mu(v, w) + \nu(v, w)$ and $(\alpha\mu)(v, w) = \alpha(\mu(v, w))$. The set of inner products on $\mathbb{R}^n$ is not a subspace of this vector space, because of the non-degeneracy condition. But notice that it is a convex set, i.e. if $\mu$ and $\nu$ are inner products then $t\mu + (1 - t)\nu$ is also an inner product for any $0 \leq t \leq 1$. More generally, if $\mu_i$ is an inner product on $\mathbb{R}^n$ for $i = 1, 2, \cdots, k$ and if $0 \leq t_i \leq 1$ satisfy $\sum_{i=1}^k t_i = 1$ then the sum $\sum_{i=1}^k t_i\mu_i$ is an inner product on $\mathbb{R}^n$. This implies that the average
$$\overline{\mu} := \frac{1}{|G|}\sum_{g \in G} f(g)^*\mu$$
is an inner product on $\mathbb{R}^n$, where $|G|$ is the number of elements in $G$. But notice, for all $g \in G$, $f(g)^*\overline{\mu} = \overline{\mu}$. Letting $A$ be the matrix whose columns are an orthonormal basis for $\mathbb{R}^n$ with respect to the inner product $\overline{\mu}$ we have our proof ($A$ should perhaps be thought of as the representation of the isometry between $(\mathbb{R}^n, \mu)$ and $(\mathbb{R}^n, \overline{\mu})$ as inner product spaces).
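The averaging argument for a finite group can be carried out explicitly on a small example. The Python sketch below is illustrative only: it assumes the group $\mathbb{Z}/3$ acting on $\mathbb{R}^2$ through a deliberately non-orthogonal conjugate of the rotation by $2\pi/3$, represents an inner product by its Gram matrix $S$ (so the pull-back along $M$ has Gram matrix $M^TSM$), and obtains the matrix $A$ from a Cholesky factorisation. None of the helper names come from the notes.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def group_elements():
    """Matrices of Z/3 acting on R^2 by a non-orthogonal conjugate of a rotation."""
    c, s = math.cos(2 * math.pi / 3), math.sin(2 * math.pi / 3)
    rot = [[c, -s], [s, c]]
    shear = [[1.0, 1.0], [0.0, 1.0]]
    b = matmul(matmul(shear, rot), inverse(shear))
    return [[[1.0, 0.0], [0.0, 1.0]], b, matmul(b, b)]

def averaged_gram(elements):
    """Gram matrix of the averaged inner product: the mean of M^T M over the group."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for m in elements:
        mtm = matmul(transpose(m), m)
        for i in range(2):
            for j in range(2):
                total[i][j] += mtm[i][j] / len(elements)
    return total

def orthonormalising_matrix(s):
    """A matrix A with A^T S A = I, built from the Cholesky factor S = L L^T."""
    l11 = math.sqrt(s[0][0])
    l21 = s[1][0] / l11
    l22 = math.sqrt(s[1][1] - l21 * l21)
    return inverse(transpose([[l11, 0.0], [l21, l22]]))

def orthogonality_defect(m):
    """Largest entry of |M^T M - I|; zero exactly when M is orthogonal."""
    mtm = matmul(transpose(m), m)
    return max(abs(mtm[i][j] - (1.0 if i == j else 0.0)) for i in range(2) for j in range(2))

elements = group_elements()
A = orthonormalising_matrix(averaged_gram(elements))
conjugated = [matmul(matmul(inverse(A), m), A) for m in elements]
```

The columns of `A` are orthonormal for the averaged inner product, and each conjugated matrix in `conjugated` is orthogonal up to rounding, while the original generator is not.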

In order to make the above kind of argument on a Lie group, we would need a notion of integration that has some kind of symmetry with respect to the group action. A natural framework for this would be a top-dimensional differential form on a Lie group $G$ which is invariant under either left or right multiplication, or both.

Definition 7.18. If $\omega \in \Lambda^kG$ is a differential form on a Lie group $G$, it is left-invariant if $L_g^*\omega = \omega$ for all $g \in G$. It is right-invariant if $R_g^*\omega = \omega$ for all $g \in G$, and it is bi-invariant if it is both left and right invariant.

Proposition 7.19. The left-invariant $k$-forms on a Lie group $G$ form a vector space of dimension $\binom{\dim(G)}{k}$, i.e. every $k$-form on $T_eG$ extends uniquely to a left-invariant form. Similarly for right-invariant forms. Not all Lie groups have non-zero bi-invariant forms.


The proof of the above is much like the characterisation of left and right-invariant vector fields. To see that Lie groups generally do not have bi-invariant forms, the Lie group $O_2$ provides perhaps the simplest example. $O_2$ has a 1-dimensional space of left-invariant 1-forms, and a 1-dimensional space of right-invariant forms, but the intersection of these spaces is 0-dimensional. To see this concretely, consider $O_2$ as the semi-direct product $SO_2 \rtimes \{\pm 1\}$. The group $\{\pm 1\} \simeq \mathbb{Z}_2$ acts on $SO_2$ by conjugation by mirror reflection over the $x$-axis. Thus $(I, -1) \cdot (A, 1) \cdot (I, -1) = (MAM, -1) \cdot (I, -1) = (A^{-1}, 1)$ where $M \in O_2$ is the mirror reflection. Since matrix inversion is orientation-reversing on $SO_2$, there are no non-zero bi-invariant 1-forms on $O_2$.

Proposition 7.20. A Lie group $G$ has a non-zero top-dimensional bi-invariant form $\omega$ if and only if the action $Ad : G \times T_eG \to T_eG$ has determinant 1, i.e. the map $Ad : G \to \mathrm{Aut}(T_eG)$ must have image in the special-linear group of $T_eG$. But every compact Lie group has a unique left-invariant form $\omega_L$ such that $\int_G \omega_L = 1$. Moreover, every compact Lie group has a unique right-invariant form $\omega_R$ such that $\int_G \omega_R = 1$. $\omega_L = \omega_R$ if and only if the conjugation action of $G$ on itself is orientation-preserving. Thus $\omega_L = \omega_R$ if $G$ is connected. Taking one step back, compact Lie groups have a unique bi-invariant density $\mu$ such that $\int_G \mu = 1$; this is sometimes called the Haar measure. In this case, $\mu = |\omega_L| = |\omega_R|$.

Proof. If $G$ is compact, the composite $\mathrm{Det} \circ Ad : G \to \mathbb{R} \setminus \{0\}$ is a homomorphism of groups, thinking of $\mathbb{R} \setminus \{0\}$ as a group using multiplication. This is a continuous function so its image is a compact set, but there are only two compact subgroups of $\mathbb{R} \setminus \{0\}$ and they are subsets of $\{\pm 1\}$. If $\mathrm{Det} \circ Ad = 1$, then any left-invariant top-dimensional form is automatically right-invariant, so we are done. Define $\mu = |\omega_L|$.

If $G$ is a non-compact Lie group, the function $\mathrm{Det} \circ Ad$ can have a non-compact image in $\mathbb{R} \setminus \{0\}$. For example, if $G$ is the group of affine-linear automorphisms of $\mathbb{R}$ then the image of $\mathrm{Det} \circ Ad$ is $\mathbb{R} \setminus \{0\}$, thus this group does not have a bi-invariant 1-form, nor even a bi-invariant density.

Theorem 7.21. Let γ : G → GLnR be a representation of a compact Lie group G, then there is

some matrix A ∈ GLnR such that A−1γA : G → On, i.e. the representation γ is conjugate to an

orthogonal representation. Similarly, if γ : G → GLnC then it is conjugate to a unitary representation,

with A ∈ GLnC.

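Proof. Let $\mu$ be the unique bi-invariant density on $G$ with $\int_G \mu = 1$. Let $\eta$ be the Euclidean dot product on $\mathbb{R}^n$. Notice that
$$\overline{\eta} = \int_G (\gamma_g^*\eta)\,\mu$$
is an inner product on $\mathbb{R}^n$; moreover, the representation $\gamma$ is orthogonal with respect to $\overline{\eta}$.
$$\begin{aligned}
\overline{\eta}(\gamma_h(v), \gamma_h(w)) &= \int_{g \in G} (\gamma_g^*\eta)(\gamma_h(v), \gamma_h(w))\,\mu \\
&= \int_{g \in G} \eta(\gamma_{gh}(v), \gamma_{gh}(w))\,\mu \\
&= \int_{g \in G} \eta(\gamma_g(v), \gamma_g(w))\,\mu && \text{by right-invariance of } \mu \\
&= \overline{\eta}(v, w).
\end{aligned}$$
If we let the column vectors of the matrix $A$ be an orthonormal basis for $\mathbb{R}^n$ with respect to the inner product $\overline{\eta}$ we are done.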


The proof of Theorem 7.21 only used the fact that $\mu$ is right-invariant, but we use that $G$ is compact in order to define the integral. As we know, a right-invariant density on a compact Lie group must be left-invariant, so our assumptions in Theorem 7.21 are minimal.

Definition 7.22. Given a representation $\rho : G \to GL_n\mathbb{R}$ the character associated to $\rho$ is $\mathrm{tr}(\rho) : G \to \mathbb{R}$. This is a function defined on the Lie group $G$ whose value at $g \in G$ is the trace, $\mathrm{tr}(\rho)(g) = \mathrm{tr}(\rho(g))$.

Notice that $\mathrm{tr}(\rho(hgh^{-1})) = \mathrm{tr}(\rho(g))$ since the trace of a product of two matrices satisfies $\mathrm{tr}(AB) = \mathrm{tr}(BA)$. Thus the characters of representations are functions on the group that are constant on conjugacy classes.

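Proposition 7.23. (Weyl integral formula) If $\chi : G \to \mathbb{R}$ is a continuous function on a compact connected Lie group $G$ then
$$\int_{g \in G} \chi(g)\,\mu_G = \frac{1}{|W|}\int_{h \in H} \mathrm{Det}(Ad_{h^{-1}} - \mathrm{Id}_{T_eG})_{|T_eG/T_eH}\left(\int_{[g] \in G/H} \chi(ghg^{-1})\,\mu_{G/H}\right)\mu_H$$
where $|W|$ is the order of the Weyl group and $H \subset G$ is a maximal torus.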

Proof. The idea is to consider the map $F : G/H \times H \to G$ from Theorem 2.58 given by $F([g], h) = ghg^{-1}$. We showed this map has degree $|W|$. Since $G$ is compact and connected, the canonical bi-invariant density $\mu$ can be represented by a differential form $\eta$. Lemma 7.13 tells us that
$$\int_{G/H \times H} F^*(\chi\eta) = |W|\int_G \chi\eta.$$
We divide both sides by $|W|$. The effect of $F^*$ on $\chi\eta$ is fairly direct to compute, as we saw in Theorem 2.58. Up to a left multiplication, the derivative of $F$ is
$$T_e(G/H) \times T_eH \xrightarrow{\;D(L_{gt^{-1}g^{-1}})_{gtg^{-1}} \circ DF_{([g],t)} \circ (D(L_g)_e \times D(L_t)_e)\;} T_eG, \qquad ([h], l) \longmapsto Ad_g(Ad_{t^{-1}}(h) - h + l).$$
But $Ad_g$ has determinant $+1$ since $G$ is compact and connected. So we are left with computing the determinant of $([h], l) \longmapsto Ad_{t^{-1}}(h) - h + l$, which is the determinant of $h \longmapsto Ad_{t^{-1}}(h) - h$. Fubini's theorem gives the rest.

Definition 7.24. If $\rho : G \to GL_nF$ is a representation of a Lie group $G$ with $F$ a field, the function $\chi : G \to F$ given by $\chi(g) = \mathrm{tr}(\rho(g))$ is called the character of the representation.

Notice that since the trace satisfies the identity $\mathrm{tr}(AB) = \mathrm{tr}(BA)$, the character is a conjugation-invariant function, i.e. $\chi(ghg^{-1}) = \chi(h)$ for all $g, h \in G$.
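A two-line numerical check of conjugation-invariance, as a minimal Python sketch. The two matrices are arbitrary invertible examples standing in for $\rho(g)$ and $\rho(h)$, not tied to any particular group, and the helper names are illustrative.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def trace(a):
    return a[0][0] + a[1][1]

rho_g = [[1.0, 2.0], [3.0, 4.0]]   # stand-in for rho(g)
rho_h = [[2.0, 1.0], [1.0, 1.0]]   # stand-in for rho(h), invertible

# tr(AB) = tr(BA), hence tr(h g h^{-1}) = tr(g).
trace_ab = trace(matmul(rho_g, rho_h))
trace_ba = trace(matmul(rho_h, rho_g))
conjugated_trace = trace(matmul(matmul(rho_h, rho_g), inverse(rho_h)))
```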

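Corollary 7.25. If $\chi : G \to F$ is the character of a representation (or any conjugation-invariant function) with $F = \mathbb{R}$ or $F = \mathbb{C}$ then
$$\int_{g \in G} \chi(g)\,\mu_G = \frac{1}{|W|}\int_{h \in H} \mathrm{Det}(Ad_{h^{-1}} - \mathrm{Id}_{T_eG})_{|T_eG/T_eH}\,\chi(h)\,\mu_H$$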
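For a concrete feel for Corollary 7.25, the formula can be evaluated numerically for $G = SU_2$. The Python sketch below rests on several standard facts that are assumptions relative to these notes: the maximal torus is $H = \{\mathrm{diag}(e^{i\theta}, e^{-i\theta})\}$ with normalised density $d\theta/2\pi$, the Weyl group has order 2, and $Ad_h$ acts on the 2-dimensional quotient $T_eG/T_eH$ as rotation by $2\theta$. The function names are illustrative.

```python
import math

def weyl_weight(theta):
    """Det(Ad_{h^{-1}} - Id) on T_eG/T_eH for SU_2, with h = diag(e^{i theta}, e^{-i theta}).

    Assumes Ad_{h^{-1}} acts there as rotation by -2*theta, so the determinant is
    (cos(2 theta) - 1)^2 + sin(2 theta)^2 = 4 sin(theta)^2.
    """
    c, s = math.cos(2 * theta), math.sin(2 * theta)
    return (c - 1.0) ** 2 + s ** 2

def integrate_class_function(chi, steps=20000):
    """Right-hand side of Corollary 7.25 for SU_2, with |W| = 2 and mu_H = d(theta)/(2 pi).

    chi is given as a function of the torus angle theta.
    """
    h = 2 * math.pi / steps
    total = sum(weyl_weight((i + 0.5) * h) * chi((i + 0.5) * h) for i in range(steps))
    return total * h / (2 * math.pi) / 2

# The constant function 1 should integrate to the total mass 1.
total_mass = integrate_class_function(lambda theta: 1.0)
# Character of the defining 2-dimensional representation: chi(theta) = 2 cos(theta).
mean_character = integrate_class_function(lambda theta: 2 * math.cos(theta))
mean_character_squared = integrate_class_function(lambda theta: (2 * math.cos(theta)) ** 2)
```

The values come out as 1, 0 and 1 respectively: the weight integrates to total mass one, and the character of the defining representation has mean zero and mean square one.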

Given two representations of a Lie group $G$, $\rho_1 : G \to GL(V)$ and $\rho_2 : G \to GL(W)$, a morphism between $\rho_1$ and $\rho_2$ is a linear function $f : V \to W$ such that $f(\rho_1(g)(v)) = \rho_2(g)(f(v))$ for all $v \in V$ and $g \in G$. Another way to state this is that $f$ is a conjugacy between the action of $G$ on $V$ and the action of $G$ on $W$ which happens to also be a linear function. $f$ would be an isomorphism of representations if $f$ were a one-to-one and onto function. Notice that if $f$ is simply onto then not only is the kernel of $f$ invariant under the action of $\rho_1$ but the representation $\rho_1$ is isomorphic to the restriction of $\rho_1$ to $\ker(f)$ direct product with the representation $\rho_2$.

Definition 7.26. A representation $\rho : G \to GL_nF$ is reducible if there is a decomposition $F^n = V \oplus W$ such that $\rho(g)(v) \in V$ for all $v \in V$ and $g \in G$, and $\rho(g)(w) \in W$ for all $w \in W$ and $g \in G$. The decomposition must be non-trivial, i.e. neither $V = F^n$ nor $V = 0$ is allowed.

A representation is reducible if and only if it is isomorphic to a product of two non-trivial representations of $G$. As we have seen, this is equivalent to having a non-trivial conjugacy from the representation to a lower-dimensional representation.

Theorem 7.27. (Schur's Lemma) If $\rho_1 : G \to GL(V)$ and $\rho_2 : G \to GL(W)$ are irreducible representations of a compact Lie group $G$ then a morphism $f : V \to W$ is either zero or an isomorphism. A morphism $f : V \to V$ must be of the form $f(v) = \lambda v$ with $\lambda \in \mathbb{C}$, provided $V$ is a complex representation of $G$.

Proof. The image of $f : V \to W$, if non-zero, is a $G$-invariant subspace of $W$. So if $\mathrm{img}(f) \neq W$ then the quotient map $W \to W/\mathrm{img}(f)$ is a non-trivial homomorphic image of $W$ on which $G$ acts via $\rho_2$. Thus $W$ is not irreducible.

Let $\lambda \in \mathbb{C}$ be an eigenvalue of $f$, then the $\lambda$-eigenspace is a factor of the representation $\rho$, thus it must be all of $V$.

Proposition 7.28. The complex irreducible representations of compact abelian Lie groups are one

dimensional.

Proof. If ρ : G → GL(V ) is a representation, then for any g ∈ G the map Lg : V → V given

by Lg(v) = ρ(g)(v) is a conjugacy between ρ and itself. Thus if λ is an eigenvalue of Lg then

Lg(v) = λv for all v ∈ V . If we choose g to be a topological generator of G then this would imply any

1-dimensional subspace of V would be G-invariant, thus if V is not complex 1-dimensional, ρ could

not be an irreducible representation.

Similarly, the real irreducible representations of abelian Lie groups are at most two dimensional, as the above argument applies to the complexification. In less sophisticated language, think of $V$ as $\mathbb{R}^n$ and think of the linear maps as being represented by square matrices with real entries. Now treat that matrix as a square complex matrix (that just happens to have real entries).

TODO: Orthogonality relations for characters.

TODO: reps determined by characters?

TODO: go back to densities and mention mean value theorem, etc.

Exercises

There are more direct proofs of the Poincaré Lemma (Theorem 7.10). The next problem asks you

to work out one such.


Problem 7.29. Given a k-form $\omega$ on $\mathbb{R}^n$ define

\[ (\Omega\omega)(p) = \int_0^1 t^{k-1}\,\omega(tp)(p, \cdot)\,dt \]

Prove $d(\Omega\omega)(p) + (\Omega d\omega)(p) = \omega(p)$.
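As a concrete illustration of the operator $\Omega$ in the simplest case k = 1 on $\mathbb{R}^2$, the following Python sketch approximates $(\Omega\omega)(p) = \int_0^1 \omega(tp)(p)\,dt$ by the midpoint rule. The sample form $\omega = \cos(x)\,dx + dy$, the function names and the step count are assumptions chosen for illustration. Since this sample satisfies $\omega = d(\sin x + y)$, the identity in Problem 7.29 predicts $\Omega\omega = \sin x + y$, which the sketch compares against.

```python
import math

def sample_one_form(q):
    """Coefficients (a, b) of the illustrative 1-form a dx + b dy at the point q."""
    x, y = q
    return (math.cos(x), 1.0)

def homotopy_operator(one_form, p, steps=2000):
    """Midpoint-rule approximation of the integral over [0, 1] of one_form(t p)(p) dt.

    For k = 1 the weight t^(k-1) equals 1, and contracting with p leaves a number.
    """
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) / steps
        a, b = one_form((t * p[0], t * p[1]))
        total += a * p[0] + b * p[1]
    return total / steps

p = (0.7, -1.3)                      # arbitrary test point
approx = homotopy_operator(sample_one_form, p)
exact = math.sin(p[0]) + p[1]        # potential of the sample form, vanishing at the origin
```

The two numbers `approx` and `exact` should agree to within the quadrature error. This is only a numerical spot check at one point, not a proof of the identity.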

The next problem sketches yet another proof of the Poincaré Lemma.

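Problem 7.30. The parallelepiped based at $p \in \mathbb{R}^n$ in the directions of $v = (v_1, \cdots, v_k)$ was denoted

\[ P_v(p) = \Big\{ p + \sum_{i=1}^{k} t_i v_i \;:\; 0 \le t_i \le 1 \ \forall i \Big\} \]

The cone on $P_v(p)$ is denoted

\[ C_v(p) = \{ t_0 q \;:\; q \in P_v(p),\ 0 \le t_0 \le 1 \}. \]

If $\omega \in \Lambda^k\mathbb{R}^n$ is a k-form on $\mathbb{R}^n$ we define $c\omega \in \Lambda^{k-1}\mathbb{R}^n$, the `cone' on $\omega$, as

\[ (c\omega)_p(v_1, \cdots, v_{k-1}) = \lim_{t \to 0} \frac{1}{t^{k-1}} \int_{C_{tv}(p)} \omega \]

where in this equation $v = (v_1, \cdots, v_{k-1})$. Show $c\omega$ is a $(k-1)$-form, and as in Problem 7.29, show

\[ d(c\omega) + c(d\omega) = \omega. \]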

Problem 7.31. Given the form $\omega = y\,dx + x\,dy$, $d\omega = 0$ and so both the Poincaré Lemma (Theorem 7.10) and the alternative proofs in Problems 7.29 and 7.30 give forms $\eta$ such that $d\eta = \omega$. Compute $\eta$ using all three methods, directly.

Problem 7.32. Derive a formula for the exterior derivative of a double contraction, analogous to Proposition 7.8 (1). If $v, w : N \to TN$ are vector fields and $\omega \in \Lambda^k N$, compute

\[ d(\omega(v, w, \cdot)). \]

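Problem 7.33. Verify on $\mathbb{R}^4$ that the 2-form $\omega = dx_1 \wedge dx_2 + dx_3 \wedge dx_4$ satisfies $\omega \wedge \omega = 2\,dx_1 \wedge dx_2 \wedge dx_3 \wedge dx_4$. In general, find a 2-form $\omega$ on $\mathbb{R}^{2n}$ such that $\omega \wedge \omega \wedge \cdots \wedge \omega = dx_1 \wedge dx_2 \wedge \cdots \wedge dx_{2n}$.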
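For readers who want to cross-check wedge computations like the one in Problem 7.33 mechanically, here is a minimal Python sketch of the wedge product for constant-coefficient forms. The dictionary representation (increasing index tuples mapped to coefficients) and the function names are assumptions made for this illustration. It is not a substitute for the hand computation the problem asks for.

```python
from itertools import product

def sort_sign(indices):
    """Return (sorted index tuple, sign of the sorting permutation).

    The sign is 0 when an index repeats, since dx_i ^ dx_i = 0.
    """
    if len(set(indices)) < len(indices):
        return tuple(sorted(indices)), 0
    inversions = sum(
        1
        for i in range(len(indices))
        for j in range(i + 1, len(indices))
        if indices[i] > indices[j]
    )
    return tuple(sorted(indices)), (-1) ** inversions

def wedge(a, b):
    """Wedge product of two constant-coefficient forms given as {index tuple: coefficient}."""
    result = {}
    for (ia, ca), (ib, cb) in product(a.items(), b.items()):
        idx, sign = sort_sign(ia + ib)
        if sign:
            result[idx] = result.get(idx, 0) + sign * ca * cb
    # Drop terms whose coefficients cancelled.
    return {k: v for k, v in result.items() if v != 0}

# omega = dx1 ^ dx2 + dx3 ^ dx4 on R^4
omega = {(1, 2): 1, (3, 4): 1}
omega_squared = wedge(omega, omega)
```

Inspecting `omega_squared` shows which basis 4-forms survive and with what coefficient. The same helper can be iterated to experiment with candidate forms for the general $\mathbb{R}^{2n}$ question.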

Problem 7.34. Compute the integral of the 2-form:

ω = (xdy − ydx) ∧ (zdw − wdz)

over the manifold

$M = S^1 \times S^1 \subset \mathbb{R}^4$

Use whichever orientation is convenient.

Problem 7.35. Compute the integral

\[ \int_{S^1} x\,dy - y\,dx \]

using the definition and also by using Stokes' theorem.
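A hand computation "using the definition" can be cross-checked numerically by pulling the 1-form back along a parametrization and integrating. The Python sketch below does this with the midpoint rule. The counterclockwise parametrization of the circle, the function names and the step count are assumptions made for illustration.

```python
import math

def line_integral(one_form, curve, velocity, a, b, steps=10000):
    """Midpoint-rule approximation of the integral of P dx + Q dy along a parametrised curve.

    `one_form(x, y)` returns (P, Q); `curve(t)` and `velocity(t)` return the point
    and its t-derivative, so the pulled-back integrand is P x'(t) + Q y'(t).
    """
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * h
        x, y = curve(t)
        dx, dy = velocity(t)
        P, Q = one_form(x, y)
        total += (P * dx + Q * dy) * h
    return total

def form(x, y):
    """Coefficients of x dy - y dx, written as P dx + Q dy."""
    return (-y, x)

def circle(t):
    """Counterclockwise parametrization of the unit circle (an assumed orientation)."""
    return (math.cos(t), math.sin(t))

def circle_velocity(t):
    return (-math.sin(t), math.cos(t))

value = line_integral(form, circle, circle_velocity, 0.0, 2.0 * math.pi)
```

Comparing `value` with the result obtained from Stokes' theorem gives an independent check. Reversing the orientation of the parametrization flips the sign.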


Problem 7.36. Compute the integral of the 2-form:

\[ \omega = \frac{z}{\sqrt{x^2 + y^2}}\,dx \wedge dy + \frac{\sqrt{x^2 + y^2} - 2}{x^2 + y^2}\,(y\,dz \wedge dx + x\,dy \wedge dz) \]

over the manifold

\[ M = \{ (x, y, z) \in \mathbb{R}^3 : (\sqrt{x^2 + y^2} - 2)^2 + z^2 = 1 \} \subset \mathbb{R}^3 \]

Use the orientation of this as the boundary of a compact 3-dimensional submanifold of $\mathbb{R}^3$. Hint: find a parametrization of the manifold and pull back the form along the parametrization.

Problem 7.37. Compute the integral of the 2-form

\[ \omega = x\,dy \wedge dz + y\,dz \wedge dx + z\,dx \wedge dy \]

over the sphere

\[ S^2 = \{ v \in \mathbb{R}^3 : |v| = 1 \} \]

Give the sphere the outward orientation. Verify your computation using Stokes' theorem.
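The "parametrize and pull back" strategy used in these surface-integral problems can be sketched numerically. The Python helper below treats a 2-form $a\,dy \wedge dz + b\,dz \wedge dx + c\,dx \wedge dy$ as the triple (a, b, c) and integrates its pullback, which at each parameter value is the dot product of (a, b, c) with the cross product of the two partial derivatives of the parametrization. The spherical-coordinate parametrization, the finite-difference step and all names are assumptions made for illustration.

```python
import math

def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def surface_integral(field, param, u_range, v_range, steps=200, eps=1e-5):
    """Midpoint-rule integral of a 2-form pulled back along `param(u, v)`.

    `field(point)` returns the coefficients (a, b, c); partial derivatives of
    `param` are approximated by central differences with step `eps`.
    """
    (u0, u1), (v0, v1) = u_range, v_range
    hu, hv = (u1 - u0) / steps, (v1 - v0) / steps
    total = 0.0
    for i in range(steps):
        u = u0 + (i + 0.5) * hu
        for j in range(steps):
            v = v0 + (j + 0.5) * hv
            pu = [(p - m) / (2 * eps) for p, m in zip(param(u + eps, v), param(u - eps, v))]
            pv = [(p - m) / (2 * eps) for p, m in zip(param(u, v + eps), param(u, v - eps))]
            n = cross(pu, pv)
            f = field(param(u, v))
            total += (f[0] * n[0] + f[1] * n[1] + f[2] * n[2]) * hu * hv
    return total

def sphere(phi, theta):
    """Spherical coordinates; with (phi, theta) ordering the induced normal points outward."""
    return (math.sin(phi) * math.cos(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(phi))

def radial_field(point):
    """Coefficients (a, b, c) = (x, y, z) of x dy^dz + y dz^dx + z dx^dy."""
    return point

value = surface_integral(radial_field, sphere, (0.0, math.pi), (0.0, 2.0 * math.pi))
```

The number `value` can be compared with the answer obtained by hand and via Stokes' theorem. Swapping in a parametrization of the torus and the coefficients of the form from Problem 7.36 gives a similar check there, provided the parameter ordering matches the required orientation.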

Problem 7.38. Compute the integral of the 3-form

\[ \omega = x_1\,dx_2 \wedge dx_3 \wedge dx_4 + x_2\,dx_3 \wedge dx_4 \wedge dx_1 + x_3\,dx_4 \wedge dx_1 \wedge dx_2 + x_4\,dx_1 \wedge dx_2 \wedge dx_3 \]

over the 3-sphere

\[ S^3 = \{ v \in \mathbb{R}^4 : |v| = 1 \} \]

Orient the sphere as the boundary of the 4-disc $D^4$. Verify your computation using Stokes' theorem.

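Problem 7.39. Check that the form

\[ \omega = \sum_{i=0}^{n} (-1)^i x_i\,dx_0 \wedge dx_1 \wedge \cdots \wedge \widehat{dx_i} \wedge \cdots \wedge dx_n \]

on the sphere $S^n$ is $SO_{n+1}$-invariant, i.e. if $A \in SO_{n+1}$ then $A^*\omega = \omega$. Here the hat indicates that the factor $dx_i$ is omitted, and we use the convention that $S^n = \{ (x_0, x_1, \cdots, x_n) \in \mathbb{R}^{n+1} : \sum_{i=0}^{n} x_i^2 = 1 \}$. Since the group $O_{n+1}$ acts on $\mathbb{R}^{n+1}$ by isometries fixing the origin, it also acts on the unit sphere, so we can consider $A \in SO_{n+1}$ as an orientation-preserving diffeomorphism of $S^n$.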

Problem 7.40. Interpret your answers to Problems 7.35, 7.36, 7.38 and 7.39 in the language of the Hodge Star operation from Chapter 0, Definition 0.18. The forms are Hodge dual to 1-forms which are themselves dual (in the sense of the isomorphism between a vector space and its dual induced by an inner product) to familiar vector fields. This gives a pleasant geometric interpretation of the original form.

The problems below assume familiarity with Riemann manifolds and the Hodge star.

Problem 7.41. Let $N \subset \mathbb{R}^k$ be the pre-image of a regular value of a smooth function $f : U \to \mathbb{R}^j$, $N = f^{-1}(q)$ where $q \in \mathbb{R}^j$ and $U \subset \mathbb{R}^k$ is open. Then $f^*(dx_1 \wedge \cdots \wedge dx_j)$ is a smooth j-form on U. Argue that the Hodge star of this form is a volume form on N.

Notice: The above problem proves non-orientable manifolds are not the pre-image of regular values

of smooth functions.

Problem 7.42. For this question, N will be a compact boundaryless oriented Riemann manifold.


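Definition 7.43. If $\alpha : \Lambda^j(N) \to \Lambda^i(N)$ is linear, a linear function $\alpha^* : \Lambda^i(N) \to \Lambda^j(N)$ is called a metric adjoint to $\alpha$ if

\[ \langle \alpha(\omega), \zeta \rangle = \langle \omega, \alpha^*(\zeta) \rangle \]

for all $\omega \in \Lambda^j(N)$, $\zeta \in \Lambda^i(N)$.

Given $\omega \in \Lambda^i(N)$, the current associated to $\omega$ is denoted $\omega[\cdot]$; it is a linear function $\omega[\cdot] : \Lambda^{n-i}(N) \to \mathbb{R}$ given by

\[ \omega[\zeta] = \int_N \omega \wedge \zeta \]

where $\zeta \in \Lambda^{n-i}(N)$.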

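Given $\alpha : \Lambda^j(N) \to \Lambda^i(N)$ linear, a linear function $\overline{\alpha} : \Lambda^{n-i}(N) \to \Lambda^{n-j}(N)$ is a topological adjoint to $\alpha$ provided $(\alpha(\omega))[\zeta] = \omega[\overline{\alpha}(\zeta)]$ for all $\omega \in \Lambda^j(N)$ and $\zeta \in \Lambda^{n-i}(N)$.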

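(a) Prove that the function $\langle \cdot, \cdot \rangle : \Lambda^j(N) \times \Lambda^j(N) \to \mathbb{R}$ defined by

\[ \langle \omega, \zeta \rangle = \int_N \omega \wedge (*_j \zeta) \]

is an inner product on $\Lambda^j(N)$ for all j.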

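(b) Show that if either the metric adjoint $\alpha^*$ or the topological adjoint $\overline{\alpha}$ exists, it is unique. Moreover, prove that $\alpha$ has a metric adjoint if and only if it has a topological adjoint, and they are related by $\overline{\alpha} \circ *_i = *_j \circ \alpha^*$.

(c) Define $\theta_i : \Lambda^i(N) \to \Lambda^i(N)$ by $\theta_i(\omega) = (-1)^i \omega$. Check $\theta_i$ is its own metric adjoint, and the topological adjoint is given by $*_i \circ \theta_i \circ *_i^{-1}$. Also notice that $d(\omega \wedge \zeta) = (d\omega) \wedge \zeta + (\theta_i \omega) \wedge d\zeta$, where $\omega \in \Lambda^i(N)$.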

(d) Check that the metric adjoint of the exterior derivative $d_i : \Lambda^i(N) \to \Lambda^{i+1}(N)$ is $*_i^{-1} \circ d_{n-i-1} \circ *_{i+1} \circ \theta_{i+1}$. This is sometimes denoted $\delta_{i+1}$. Hint: First find its topological adjoint, and remember Stokes' theorem.

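Definition 7.44. The Laplacian $\triangle_i : \Lambda^i(N) \to \Lambda^i(N)$ is defined by $\triangle_i(\omega) = d_{i-1}(\delta_i \omega) + \delta_{i+1}(d_i \omega)$. One can readily verify $\triangle_{n-i} *_i = *_i \triangle_i$, $d_i \triangle_i = \triangle_{i+1} d_i$, and $\triangle_{i-1} \delta_i = \delta_i \triangle_i$; moreover, like the exterior derivative, $\delta_{i-1} \delta_i = 0$. The Laplacian also agrees, up to sign, with the standard definition on smooth functions in $\mathbb{R}^n$.

A differential i-form $\omega$ is harmonic if $\triangle_i \omega = 0$.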

(e) Check that $\triangle_i \omega = 0$ if and only if $d_i \omega = 0$ and $\delta_i \omega = 0$. Hint: use what you know from above to play around with $\langle \triangle_i \omega, \omega \rangle$.

Problem 7.45. Let $c : \mathbb{R}^n \to (\mathbb{R}^n)^*$ be the isomorphism between $\mathbb{R}^n$ and its dual given by the standard inner product on $\mathbb{R}^n$, i.e. $c(v) \in (\mathbb{R}^n)^*$ is defined by $c(v)(w) = v \cdot w$. If v is a vector field on $\mathbb{R}^n$, the operator c turns v into a 1-form, $c(v) \in \Lambda^1\mathbb{R}^n$.

(a) If v is a vector field on $\mathbb{R}^2$, $d(c(v))$ is a 2-form on $\mathbb{R}^2$, so its Hodge star $*dc(v)$ is a smooth function on $\mathbb{R}^2$. Identify this function as a familiar mathematical object from calculus.

(b) If v is a vector field on $\mathbb{R}^3$, $d(c(v))$ is a 2-form on $\mathbb{R}^3$, so $*dc(v)$ is equal to $c(w)$ for some vector field w on $\mathbb{R}^3$. Identify w as a familiar mathematical object from calculus.

(c) If v and w are vector fields on $\mathbb{R}^3$, $*(cv \wedge cw)$ is equal to $cu$ for some vector field u on $\mathbb{R}^3$. Identify u as a familiar vector field from calculus.

(d) If v is a vector field on $\mathbb{R}^n$, $*cv$ is an $(n-1)$-form on $\mathbb{R}^n$, so $*d*cv$ is a smooth real-valued function on $\mathbb{R}^n$. Identify this as a familiar object from calculus.


(e) Given two vector fields v, w on $\mathbb{R}^4$, $d(cv \wedge cw)$ is a 3-form on $\mathbb{R}^4$, so $*d(cv \wedge cw)$ is equal to $cu$ for some vector field u on $\mathbb{R}^4$. Can you give a geometric interpretation of this vector field?

(f) Given two vector fields v, w on $\mathbb{R}^4$, there are four real-valued functions on $\mathbb{R}^4$: $*(dcv \wedge dcw)$, $*(*dcv \wedge dcw)$, $*(dcv \wedge *dcw)$ and $*(*dcv \wedge *dcw)$. Can you interpret these geometrically?

Problem 7.46. Let $\omega$ and $\eta$ be 1-forms on an oriented Riemann manifold. Argue that

\[ *(\omega \wedge \eta) = (*\omega)(c^{-1}\eta, \cdot) = -(*\eta)(c^{-1}\omega, \cdot) \]

where $c^{-1}\eta$ and $c^{-1}\omega$ are the vector fields dual to $\eta$ and $\omega$ respectively.

The above formula has a variant for $\omega$ a k-form and $\eta$ a j-form, for arbitrary k and j, but there is an intermediate step that lives outside the language of forms. Precisely, the formula is

\[ *(\omega \wedge \eta) = (*\omega)(c^{-1}\eta, \cdot) = (-1)^{jk} (*\eta)(c^{-1}\omega, \cdot) \]
