PHY483F/1483F Relativity Theory I (2019-20) · 2019-11-18 · on explaining Einstein’s famous...

PHY483F/1483F Relativity Theory I

(2019-20)

Department of Physics

University of Toronto

Instructor: Prof. A.W. Peet

Sources:-

• M.P. Hobson, G.P. Efstathiou, and A.N. Lasenby, “General relativity: an introduction for physicists” (Cambridge University Press, 2005) [recommended textbook];

• Sean Carroll, “Spacetime and geometry: an introduction to general relativity” (Addison-Wesley, 2004);

• Ray d’Inverno, “Introducing Einstein’s relativity” (Oxford University Press, 1992);

• Jim Hartle, “Gravity: an introduction to Einstein’s general relativity” (Pearson, 2003);

• Bob Wald, “General relativity” (University of Chicago Press, 1984);

• Tomás Ortín, “Gravity and strings” (Cambridge University Press, 2004);

• Noel Doughty, “Lagrangian Interaction” (Westview Press, 1990);

• my personal notes over three decades.

Version: Monday 18th November, 2019 @ 11:01

Contents

1 Thu.05.Sep
    1.1 Invitation to General Relativity
    1.2 Course website

2 Fri.06.Sep
    2.1 Galilean relativity, 3-vectors in Euclidean space, and index notation

3 Mon.09.Sep
    3.1 Special relativity and 4-vectors in Minkowski spacetime
    3.2 Partial derivative 4-vector

4 Thu.12.Sep
    4.1 Relativistic particle: position, momentum, acceleration 4-vectors
    4.2 Electromagnetism: 4-vector potential and field strength tensor

5 Mon.16.Sep
    5.1 Constant relativistic acceleration and the twin paradox
    5.2 The Equivalence Principle
    5.3 Spacetime as a curved Riemannian manifold

6 Thu.19.Sep
    6.1 Basis vectors in curved spacetime
    6.2 Tensors in curved spacetime
    6.3 Rules for tensor index gymnastics

7 Mon.23.Sep
    7.1 Building a covariant derivative
    7.2 How basis vectors change: the role of the affine connection
    7.3 The covariant derivative and parallel transport

8 Thu.26.Sep
    8.1 The geodesic equations for test particle motion in curved spacetime
    8.2 Example computation for affine connection and geodesic equations

9 Mon.30.Sep
    9.1 Spacetime curvature
    9.2 The Riemann tensor
    9.3 Example computations for Riemann

10 Thu.03.Oct
    10.1 Geodesic deviation
    10.2 Tidal forces and taking the Newtonian limit for Christoffels

11 Mon.07.Oct
    11.1 Newtonian limit for Riemann
    11.2 Riemann normal coordinates and the Bianchi identity
    11.3 The information in Riemann

12 Thu.10.Oct
    12.1 Lie derivatives
    12.2 Killing vectors and tensors

13 Thu.17.Oct
    13.1 Maximally symmetric spacetimes
    13.2 Einstein’s equations

14 Mon.21.Oct
    14.1 Birkhoff’s theorem and the Schwarzschild black hole

15 Thu.24.Oct
    15.1 TOV equation for a star
    15.2 Geodesics of Schwarzschild

16 Mon.28.Oct
    16.1 Causal structure of Schwarzschild

17 Thu.31.Oct
    17.1 Charged black holes
    17.2 Rotating black holes

18 Mon.11.Nov
    18.1 The Kerr solution
    18.2 The Penrose process

19 Thu.14.Nov
    19.1 Gravitational redshift
    19.2 Planetary perihelion precession

20 Mon.18.Nov
    20.1 Bending of light
    20.2 Radar echoes

21 Thu.21.Nov
    21.1 Geodesic precession of gyroscopes
    21.2 Accretion disks

22 Mon.25.Nov
    22.1 Finding the wave equation for metric perturbations
    22.2 Solving the linearized Einstein equations

23 Thu.28.Dec
    23.1 Gravitational plane waves
    23.2 Energy loss from gravitational radiation

1 Thu.05.Sep

1.1 Invitation to General Relativity

From a particle physics perspective, the gravitational force is the weakest of the four known forces. So why does gravity dominate the dynamics of the universe? A simple first answer is that there is a lot of matter in the universe that gravitates. Even though the gravitational attraction between any two subatomic particles is weak, if you get enough of them together you can eventually make a black hole! A slightly more sophisticated answer focuses on the range of the gravitational force and what sources it. The only two long-range forces we know of in Nature are gravity and electromagnetism. By contrast, the strong nuclear force binding atomic nuclei and the weak nuclear force responsible for the fusion reaction powering our Sun are very short-range. Electromagnetic fields are sourced by charges and currents, but the universe is electrically neutral on average, so electromagnetism does not dominate its evolution. Gravity, on the other hand, is sourced by energy-momentum. Since everything has energy-momentum, even the graviton, you can never get away from gravity.

Newton wowed the world a third of a millennium ago with his Law of Universal Gravitation, which explained both celestial and terrestrial observations. Our focus in this course is on explaining Einstein’s famous General Theory of Relativity (GR), which is a generalization of both Newtonian gravity and Special Relativity proven useful for describing the dynamics of the cosmos. By the end of term, you will be familiar with Albert Einstein’s famous equation for the gravitational field $g_{\mu\nu}(x)$,

$$ R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R + \Lambda g_{\mu\nu} = -8\pi G_N T_{\mu\nu} , \tag{1.1} $$

where $R_{\mu\nu}$ and $R$ involve (first and) second derivatives of $g_{\mu\nu}$, and $T_{\mu\nu}$ describes the energy-momentum tensor of all non-gravitational fields, which are collectively known as matter fields. You will also understand how GR gives back Newton’s theory of gravity in the limit where speeds are small and spacetime is weakly curved. You can think of Einstein’s GR as Gravity 2.0, built on the foundation of Gravity 1.0 established by Newton – an upgrade.

The name for this course is “Relativity Theory 1”. Another name by which it is commonly known is “GR 1”, which stands for “General Relativity 1”. The main thing we learn how to do in this course is how to write the dynamical equations of physics in the language of tensor analysis. Tensor analysis always sounds scary when you start, but it is not much more complicated conceptually than vector analysis, something you have been doing for years. We will show how to take your vector analysis knowledge from flat space and generalize it to spacetime. We begin with flat spacetime, which is pertinent to Special Relativity, and then we build on that knowledge to figure out how to write dynamical equations of physics even in curved spacetime.

Einstein taught us that the speed of light is constant and is the same in all inertial frames of reference. We will therefore adopt the relativistic convention that $c = 1$ throughout the course. This implies that time is measured in metres, and mass is measured in units of energy, e.g. $m_e = 511$ keV. We will keep all other physical constants explicit, such as Planck’s constant $\hbar$ characterizing the strength of quantum effects and the Newton constant $G_N$ characterizing the strength of gravity. If you feel queasy about missing factors of $c$ in any equation, they can always be easily restored by using dimensional analysis.

1.2 Course website

Please have a careful read of the course website at http://ap.io/483f/. It contains lots of vital and useful information for all students taking PHY483F/PHY1483F, including the syllabus, online lecture notes, and how to contact me. Everything you need to know about the course is contained in the pages listed, and in all the clickable links in those pages.

2 Fri.06.Sep

2.1 Galilean relativity, 3-vectors in Euclidean space, and index notation

Before we review some aspects of Special Relativity and introduce some new ones, let us begin by reminding ourselves of the non-relativistic version of relativity, also known as Galilean relativity. When we want to transform from one inertial frame of reference to another moving at relative velocity $\vec{v}$, there are three things we must think about:

(a) how time intervals relate,

(b) how spatial position intervals relate,

(c) how velocities relate.

In Galilean relativity, all clocks are synchronized,

$$ dt' = dt , \tag{2.1} $$

displacements are related via

$$ d\vec{x}' = d\vec{x} - \vec{v}\, dt , \tag{2.2} $$

and velocities $\vec{u} = d\vec{x}/dt$ compose by simple addition,

$$ \vec{u}' = \vec{u} - \vec{v} , \tag{2.3} $$

where $\vec{v}$ is the relative velocity between the unprimed and primed frames of reference. Einstein upgraded these formulæ when he invented Special Relativity, and you have seen the results before: they are known as Lorentz Transformations. We will get to them soon enough – and we will show you how simple they can look when written in terms of rapidity rather than velocity. But for now, let us inspect more closely how 3-vectors work. This will serve as a pattern for the relativistic case.

Lots of things of interest in physics are vectors, which are in essence things that point. I like to say that a vector has a ‘leg’ that sticks out, telling you where it points. Mathematically, the vector components are what you get when you resolve the vector along an orthonormal basis. In Special and General Relativity, we will need to be scrupulously careful to distinguish where we put our indices (up/down and left/right). For arbitrary vectors $v$, we write the index telling you which component is which with an upstairs index: $v^1, v^2, \ldots, v^d$, i.e. $v^i$ where $i = 1, \ldots, d$ and $d$ is the spatial dimension. Note that the upstairs index $i$ used here is not a power; instead, it specifies which component $v^i$ is being discussed: the $i$th one. If you think of a contravariant vector as a column vector, the upstairs index $i$ denotes which row of the column vector you are talking about. If you need to take a power of a vector component, the GR convention is to write parentheses around it, e.g. $(v^1)^2$. Note also that it is common in GR literature to write the vector $v$ as $v^i$ – technically, $v^i$ is a component of $v$, but letting the index show explicitly rather than suppressing it helps us remember its transformation properties.

Vectors provide a useful notational shorthand, preventing us from having to write out all the components explicitly every time we write a physics equation like $\vec{F} = m\vec{a}$. Tensor analysis in GR is nothing scary – it is the natural generalization of vector analysis to curved spacetime and multilegged objects. Its underlying idea is twofold:-

• In physics, the most useful dynamical variables transform in well-defined ways under coordinate transformations, and are known as tensors. Example: the momentum vector $p^i$.

• The laws of physics should be tensorial equations. A Newtonian example you will recognize is

$$ F^i = m a^i . \tag{2.4} $$

When we change coordinates, the components of tensors on both sides of the equation change, but the underlying physical relations between them do not.

The natural type of vector we defined above is called a contravariant vector. This is like a column vector. It has a natural counterpart called a covariant vector, also known as a dual vector. This is like a row vector. A covariant vector $\omega$ has components $\omega_i$; note that this is a downstairs index rather than an upstairs index. The index $i$ tells you which column of the row vector you are talking about. There is a natural inner product between contravariant vectors $v$ and covariant vectors $\omega$:

$$ \omega \cdot v = \sum_i \omega_i v^i . \tag{2.5} $$

A very useful convention that we will use throughout the course is the Einstein summation convention. This is a notational shorthand in which a repeated index is automatically summed over when it occurs precisely once upstairs and precisely once downstairs. This convention suppresses the unwieldy $\Sigma$ signs so that it becomes easier to see the wood for the trees. The thing that signals that you are summing over an index is that it is repeated. Note that a repeated (summed over) index must appear precisely twice in any given term: if it occurs more times, the writer has made a mistake. Summing over a repeated index is also called index contraction, because what you get for the result has none of the summed-over indices remaining. In our $\omega \cdot v$ example above, the result is a scalar: a tensor with zero legs.
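For readers who like to see conventions spelled out concretely, here is a minimal Python sketch of the contraction $\omega_i v^i$ written as an explicit sum. It is an illustration added alongside the notes rather than part of the course material; the function name `contract` and the component values are assumptions chosen purely for the demonstration.

```python
# Illustrative sketch: the contraction omega_i v^i written as an explicit sum.
# The names and component values below are assumptions chosen for the demo.

def contract(omega, v):
    """Return the sum over i of omega_i * v^i (the repeated index is summed)."""
    if len(omega) != len(v):
        raise ValueError("both vectors must have the same number of components")
    return sum(omega_i * v_i for omega_i, v_i in zip(omega, v))

omega = [1.0, 2.0, 3.0]   # covariant components omega_i (think: row vector)
v = [4.0, 5.0, 6.0]       # contravariant components v^i (think: column vector)

result = contract(omega, v)
print(result)  # 1*4 + 2*5 + 3*6 = 32.0
```

The result carries no free index, which is the code-level counterpart of saying that the contraction is a scalar.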

Why is it important to distinguish between contravariant and covariant vectors? In a nutshell, because they transform differently under coordinate transformations. Let us see how this works for a rotation. You may be used to writing a rotation of (say) a displacement vector as a $d \times d$ matrix $R$. Rotation matrices are orthogonal,

$$ R^{-1} = R^{T} . \tag{2.6} $$

Alternatively, we can say that they preserve the Euclidean norm in 3-space:

$$ R^{T}\, \mathbf{1}_3\, R = \mathbf{1}_3 , \tag{2.7} $$

where $\mathbf{1}_3$ is the identity matrix. While $R$ transforms contravariant vectors $v$ in Euclidean space as

$$ v \to v' = R \cdot v , \tag{2.8} $$

where the prime indicates the transformed vector and the unprimed vector is the original, the transpose $R^T$ transforms covariant vectors $v^T$ as

$$ v^{T} \to v^{T\prime} = v^{T} \cdot R^{T} . \tag{2.9} $$

However, we strongly discourage writing coordinate transformations in terms of matrices in future, and instead encourage you to get the hang of index notation. Once time is included, coordinates are curvilinear, and spacetime is physically curved, index notation and the Einstein summation convention will help us keep track of indices in a much more succinct way and therefore reduce the error rate when handling tensors.

A rotation is expressed in index notation for the contravariant vector components as

$$ v^{i'} = R^{i'}{}_{j}\, v^{j} . \tag{2.10} $$

The $R^{i'}{}_{j}$ is the component of the rotation matrix from the $i'$th row and $j$th column. Note that the left-right placement of indices here is physically important, as well as the upstairs-downstairs placement. The physics reason why is that rotation matrices are not symmetric, so casually switching them makes no sense. Let us write out the above transformation law more explicitly, so that you can see how it encodes matrix multiplication in a disciplined way. For a rotation of the vector $v$ with components $v^i$ about the $z$-axis, it reads¹

$$
\begin{aligned}
v^{1'} &= R^{1'}{}_{1} v^{1} + R^{1'}{}_{2} v^{2} + R^{1'}{}_{3} v^{3} = +\cos\theta\, v^{1} + \sin\theta\, v^{2} , \\
v^{2'} &= R^{2'}{}_{1} v^{1} + R^{2'}{}_{2} v^{2} + R^{2'}{}_{3} v^{3} = -\sin\theta\, v^{1} + \cos\theta\, v^{2} , \\
v^{3'} &= R^{3'}{}_{1} v^{1} + R^{3'}{}_{2} v^{2} + R^{3'}{}_{3} v^{3} = v^{3} .
\end{aligned} \tag{2.11}
$$
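The component form of eq. (2.11) translates directly into nested sums. The following Python sketch is an illustration only; the helper names `rotation_z` and `transform`, the angle, and the vector components are assumptions made for the example.

```python
import math

def rotation_z(theta):
    """Components R^{i'}_j from eq. (2.11): rows are labelled by i', columns by j."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transform(R, v):
    """v^{i'} = R^{i'}_j v^j, with the sum over the repeated index j written out."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

theta = math.pi / 2      # assumed demo angle: a quarter turn of the axes
v = [1.0, 0.0, 2.0]      # assumed demo components v^i
v_prime = transform(rotation_z(theta), v)
print(v_prime)
```

With the axes turned by a quarter turn, a vector that pointed along the old $x$-axis acquires a component of $-1$ along the new $y'$-axis, while its third component is untouched; this matches the fixed-vector, rotating-coordinates viewpoint described in the footnote.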

For the covariant vector $\omega$ specified by its components $\omega_i$, we have

$$ \omega_{i'} = R_{i'}{}^{j}\, \omega_{j} . \tag{2.12} $$

Note that the left-right and upstairs-downstairs index placements are deliberate and physically meaningful here, as for the contravariant case earlier. $R_{i'}{}^{j}$ is the $i'$th column of the $j$th row of $R^T$.

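Rotations are interesting mathematically because they preserve the Euclidean norm. In index notation, this condition reads

$$ R_{j'}{}^{i}\, \delta^{j'}{}_{k'}\, R^{k'}{}_{\ell} = \delta^{i}{}_{\ell} , \tag{2.13} $$

where

$$ \delta^{i}{}_{\ell} = \begin{cases} 1, & i = \ell \\ 0, & i \neq \ell \end{cases} . \tag{2.14} $$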

Mathematically, the tensor $\delta^{i}{}_{j}$ with one contravariant index and one covariant index is the identity matrix. Since the identity is a symmetric matrix, we do not have to be picky about left-right index placement on Kronecker deltas like we do for other tensors. A physical implication of the above formula is that even if you rotate your velocity vector, it still produces the same kinetic energy for a non-relativistic particle, because the kinetic energy is proportional to the norm of the velocity vector.

¹ Nitpickers: please note that, like Carroll, we take the point of view that the vector stays fixed while the coordinate system changes under the relevant transformation.
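The statement that rotations preserve the Euclidean norm can be checked numerically. Below is a small Python sketch under stated assumptions: the rotation is the $z$-axis rotation of eq. (2.11), and the angle, the sample velocity, and the helper names are all illustrative choices rather than anything prescribed by the notes.

```python
import math

def rotation_z(theta):
    """Components R^{i'}_j for a rotation about the z-axis, as in eq. (2.11)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def norm_squared(v):
    """|v|^2 = delta_ij v^i v^j in Cartesian coordinates."""
    return sum(x * x for x in v)

theta = 0.3                      # assumed demo angle in radians
R = rotation_z(theta)

# (R^T R)_{il} = sum_k R[k][i] * R[k][l]; orthogonality says this is the identity.
RtR = [[sum(R[k][i] * R[k][l] for k in range(3)) for l in range(3)]
       for i in range(3)]

u = [3.0, -1.0, 2.0]             # assumed demo velocity components
u_prime = [sum(R[i][j] * u[j] for j in range(3)) for i in range(3)]
print(norm_squared(u), norm_squared(u_prime))
```

Both printed numbers agree to rounding error, so the non-relativistic kinetic energy $\tfrac{1}{2} m |v|^2$ computed from either set of components is the same.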

Using the components of $R^{i'}{}_{j}$ given above, use eq.(2.13) to check that you have correctly identified the $R_{\ell}{}^{k'}$. Then, using the transformation laws for contravariant and covariant vectors, eq.s (2.10, 2.12), show that covariant vector components are transformed in the $-\theta$ direction while the contravariant vector components are transformed in the $+\theta$ direction. This sign difference might seem rather trivial, but it is anything but! It is our first glimpse of why we do need to be very careful to distinguish between upstairs and downstairs indices for vectors – and more generally for tensors, which are multilegged generalizations of contravariant and covariant vectors.

In physics, we often want to find the norm (length) of a vector, or the angle between two distinct vectors via the dot product. In index notation, what we need to be able to do is to convert contravariant vectors into covariant vectors or vice versa. To achieve this, we need extra structure on the space (or later on, the spacetime) in which the vectors live, called a metric tensor $g$, which must be invertible. In flat Euclidean 3-space in Cartesian coordinates, the role of the metric tensor is played by the Kronecker delta tensor, which is the identity matrix in both upstairs and downstairs components,

$$ \delta_{ij} = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases} , \tag{2.15} $$

and

$$ \delta^{ij} = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases} . \tag{2.16} $$

The upstairs spacetime metric is the inverse of the downstairs metric,

$$ g^{ij} g_{jk} = \delta^{i}{}_{k} = g_{kj} g^{ji} . \tag{2.17} $$

As you can see, this equation is easily satisfied for flat Euclidean 3-space in Cartesian coordinates. Soon we will see that this equation must also hold when using more general coordinate systems or when operating in curved spacetime – or both.

Converting between contravariant and covariant components of vectors is achieved via

$$ v_i = g_{ij} v^j , \tag{2.18} $$

and

$$ \omega^i = g^{ij} \omega_j , \tag{2.19} $$

where again we used the Einstein summation convention. The fact that the metric is so trivial in flat Euclidean 3-space in Cartesian coordinates is why people are often very careless with index placement – if you write it out explicitly you will see that (for example) $v^2 = v_2$ because $g_{ij} = \delta_{ij}$ and $g^{ij} = \delta^{ij}$. Reminder: if you need to take a power of a vector component, put parentheses around it to make it unambiguous. For example, $(v^1)^2$ means the square of the first contravariant component of the vector $v$.

Notice that since the spacetime metric is what we use to raise and lower indices, the half-upstairs-half-downstairs version of the metric is just the identity,

$$ g^{i}{}_{j} = \delta^{i}{}_{j} = g_{j}{}^{i} , \tag{2.20} $$

regardless of what the metric is. Notice also how if we have a physics equation for contravariant vectors, say $F^j = m a^j$, we can multiply both sides by $\delta_{ij}$ and sum over the repeated index $j$ to obtain a covariant vector equation $F_i = m a_i$. This chain of logic only works because we have a metric available – otherwise, we would have no way of converting upstairs-index equations to lower-index ones.

As you will recall from Newtonian physics, the kinetic energy is proportional to the square of the velocity vector, i.e. its norm $|v|^2 = \delta_{ij} v^i v^j$. This is a scalar, i.e. invariant under rotations. So is the inner product or dot product of any two contravariant vectors $a^i$ and $b^j$, formed by using the metric tensor,

$$ a \cdot b = g_{ij}\, a^i b^j . \tag{2.21} $$

In 3 spatial dimensions only, we can build another 3-vector out of two contravariant vectors $a^i$ and $b^j$ by taking an outer product or cross product. We will be able to write an expression for this in our handy index notation by making use of another new object known as the permutation pseudotensor $E_{ijk}$, which is totally antisymmetric in its $d$ indices,

$$ E_{i_1 \ldots i_d} = \begin{cases} +1 , & (i_1 \ldots i_d) = \text{even permutation of } (12 \ldots d) , \\ -1 , & (i_1 \ldots i_d) = \text{odd permutation of } (12 \ldots d) , \\ \phantom{+}0 , & \text{otherwise} . \end{cases} \tag{2.22} $$

There is also an upstairs version $E^{ijk}$ defined according to the same rules, in flat Euclidean 3-space. What is this permutation pseudotensor used for? Well, one of the first things it can do is to help us find the determinant of a matrix,

$$ \det(M) = E^{ijk}\, M^{1}{}_{i}\, M^{2}{}_{j}\, M^{3}{}_{k} . \tag{2.23} $$

(Applying this formula to an orthogonal transformation matrix allows us to discover that $E_{ijk}$ is a pseudotensor, rather than a proper tensor, because it does not flip sign under a parity transformation $x^i \to -x^i$. It is also invariant under rotations and translations.)
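To make eqs. (2.22) and (2.23) concrete, here is a hedged Python sketch that builds the permutation symbol from the parity of a permutation and uses it to evaluate a $3 \times 3$ determinant. The function names `levi_civita` and `det3` and the sample matrix are assumptions for illustration; code indices run from 0 rather than 1.

```python
def levi_civita(*indices):
    """Permutation symbol for indices drawn from 0..d-1.

    Returns +1 for an even permutation, -1 for an odd permutation,
    and 0 if any index is repeated (the 'otherwise' case of eq. (2.22)).
    """
    d = len(indices)
    if sorted(indices) != list(range(d)):
        return 0
    # The parity of a permutation equals the parity of its number of inversions.
    inversions = sum(1 for a in range(d) for b in range(a + 1, d)
                     if indices[a] > indices[b])
    return -1 if inversions % 2 else 1

def det3(M):
    """det(M) as a triple contraction with the permutation symbol, as in eq. (2.23)."""
    return sum(levi_civita(i, j, k) * M[0][i] * M[1][j] * M[2][k]
               for i in range(3) for j in range(3) for k in range(3))

# Assumed sample matrix for the demonstration.
M = [[2.0, 0.0, 1.0],
     [1.0, 3.0, 0.0],
     [0.0, 1.0, 4.0]]
print(det3(M))  # cofactor expansion gives 2*(12 - 0) + 1*(1 - 0) = 25.0
```

Only six of the 27 terms in the triple sum survive, one for each permutation of the three column labels, which is exactly the familiar expansion of a determinant.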

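When handling expressions containing $E_{ijk}$ or its upstairs version $E^{ijk}$, we may need to know what products of these beasts look like. The identities it obeys are very handy to know,

$$
\begin{aligned}
E^{ijk} E_{ijk} &= 3! , \\
E^{ijk} E_{ij\ell} &= 2!\, \delta^{k}_{\ell} , \\
E^{ijk} E_{imn} &= 1!\, \delta^{jk}_{mn} , \\
E^{ijk} E_{\ell mn} &= 0!\, \delta^{ijk}_{\ell mn} ,
\end{aligned} \tag{2.24}
$$

where the generalized Kronecker deltas are defined by

$$ \delta^{jk}_{mn} \equiv \begin{vmatrix} \delta^{j}_{m} & \delta^{k}_{m} \\ \delta^{j}_{n} & \delta^{k}_{n} \end{vmatrix} = \delta^{j}_{m} \delta^{k}_{n} - \delta^{j}_{n} \delta^{k}_{m} , \tag{2.25} $$

and

$$ \delta^{ijk}_{\ell mn} \equiv \begin{vmatrix} \delta^{i}_{\ell} & \delta^{j}_{\ell} & \delta^{k}_{\ell} \\ \delta^{i}_{m} & \delta^{j}_{m} & \delta^{k}_{m} \\ \delta^{i}_{n} & \delta^{j}_{n} & \delta^{k}_{n} \end{vmatrix} = +\delta^{i}_{\ell} \delta^{j}_{m} \delta^{k}_{n} + \delta^{i}_{m} \delta^{j}_{n} \delta^{k}_{\ell} + \delta^{i}_{n} \delta^{j}_{\ell} \delta^{k}_{m} - \delta^{i}_{m} \delta^{j}_{\ell} \delta^{k}_{n} - \delta^{i}_{\ell} \delta^{j}_{n} \delta^{k}_{m} - \delta^{i}_{n} \delta^{j}_{m} \delta^{k}_{\ell} . \tag{2.26} $$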
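The first three identities in eq. (2.24) can be verified by brute force, since each index only takes three values. The Python sketch below does this; the closed-form expression used for the 3D permutation symbol and the helper names are assumptions of this illustration (indices run over 0, 1, 2 in code).

```python
def levi_civita(i, j, k):
    """3D permutation symbol for indices in {0, 1, 2}.

    The product (i-j)(j-k)(k-i) is +2 for even permutations, -2 for odd
    permutations, and 0 when any two indices coincide.
    """
    return (i - j) * (j - k) * (k - i) // 2

def delta(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0

idx = range(3)

# E^{ijk} E_{ijk} = 3!
full = sum(levi_civita(i, j, k) ** 2 for i in idx for j in idx for k in idx)

# E^{ijk} E_{ijl} = 2! delta^k_l, checked for every value of the free indices k, l.
two_index_ok = all(
    sum(levi_civita(i, j, k) * levi_civita(i, j, l) for i in idx for j in idx)
    == 2 * delta(k, l)
    for k in idx for l in idx)

# E^{ijk} E_{imn} = delta^j_m delta^k_n - delta^j_n delta^k_m, as in eq. (2.25).
one_index_ok = all(
    sum(levi_civita(i, j, k) * levi_civita(i, m, n) for i in idx)
    == delta(j, m) * delta(k, n) - delta(j, n) * delta(k, m)
    for j in idx for k in idx for m in idx for n in idx)

print(full, two_index_ok, one_index_ok)
```

In flat Euclidean 3-space in Cartesian coordinates the upstairs and downstairs components of $E$ coincide numerically, which is why a single function suffices here.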


Using all of this, we can finally write out the components of the outer (cross) product in index notation in 3D,

$$ (a \times b)_i = E_{ijk}\, a^j b^k . \tag{2.27} $$

Notice that in writing the outer product here, we have again used the Einstein summation convention – twice – on both $j$ and $k$. This makes the expression more compact. Also, since this is a bona fide 3-vector equation, we can raise the index using our (spatial, trivial) metric. As you should convince yourself, the result is $(a \times b)^i = E^{ijk} a_j b_k$. We can use these expressions to find, for example,

$$
\begin{aligned}
[a \times (b \times c)]_{\ell} &= E_{\ell mn}\, a^m (b \times c)^n \\
&= E_{\ell mn}\, a^m E^{npq}\, b_p c_q \\
&= \delta^{pq}_{\ell m}\, a^m b_p c_q \\
&= (\delta^{p}_{\ell} \delta^{q}_{m} - \delta^{p}_{m} \delta^{q}_{\ell})\, a^m b_p c_q \\
&= a^q b_{\ell} c_q - a^p b_p c_{\ell} \\
&= [b (a \cdot c) - c (a \cdot b)]_{\ell} ,
\end{aligned} \tag{2.28}
$$

which should look familiar from vector calculus classes earlier in your education.

Note that in general dimension $d$, the outer product between two contravariant vectors $a^i$ and $b^j$ is more properly thought of as a pseudotensor with $(d - 2)$ legs, because it is formed via the contraction $E_{i_1 i_2 \ldots i_d}\, a^{i_1} b^{i_2}$ of two vectors $a$ and $b$ with the $d$-legged $E$ pseudotensor. Note that $E$ is defined in any dimension as long as the manifold is orientable.
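As a numerical cross-check of eqs. (2.27) and (2.28), the sketch below computes the cross product through the permutation symbol and compares both sides of the double cross product identity. It is an illustration only: the helper names and the integer test vectors are assumptions, and code indices run over 0, 1, 2.

```python
def levi_civita(i, j, k):
    """3D permutation symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def cross(a, b):
    """(a x b)_i = E_ijk a^j b^k, with both repeated indices summed explicitly."""
    return [sum(levi_civita(i, j, k) * a[j] * b[k]
                for j in range(3) for k in range(3))
            for i in range(3)]

def dot(a, b):
    """Euclidean dot product delta_ij a^i b^j."""
    return sum(x * y for x, y in zip(a, b))

# Assumed integer test vectors, so that the comparison below is exact.
a, b, c = [1, 2, 3], [4, 5, 6], [7, 8, 10]

lhs = cross(a, cross(b, c))
rhs = [b[i] * dot(a, c) - c[i] * dot(a, b) for i in range(3)]
print(lhs, rhs)
```

Both lists come out as `[-12, 9, -2]` for these vectors, in agreement with the last line of eq. (2.28).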


3 Mon.09.Sep

3.1 Special relativity and 4-vectors in Minkowski spacetime

Let us now turn to studying how to generalize spatial vectors in flat Euclidean space to spacetime vectors in flat Minkowski spacetime, in Cartesian coordinates to begin with.

The bedrock principle of the constancy of the speed of light has some fairly dramatic physics implications, chief among them being time dilation and length contraction. Both of these ideas have been rigorously tested experimentally, e.g. in particle collider and cosmic ray contexts, and found to hold true. Also, velocities no longer add simply, obeying a composition law that looks pretty mysterious the first time you see it. Let me now demystify this and Lorentz boosts by using a clever parametrization.

When you first saw Lorentz boosts, probably at the end of first year Newtonian mechanics or in a second year modern physics course, they likely looked like the following. For a Lorentz boost with velocity $v$ in the $x$ direction, acting on infinitesimal coordinate intervals, in units where $c = 1$,

$$ dt' = \gamma_v\, (dt - v\, dx) , \quad dx' = \gamma_v\, (dx - v\, dt) , \quad dy' = dy , \quad dz' = dz , \tag{3.1} $$

where $\gamma_v \equiv 1/\sqrt{1 - v^2}$. Using these expressions, you can easily figure out how velocities transform for a Lorentz boost along the $x$ axis,

$$ u'_x = \frac{dx'}{dt'} = \frac{u_x - v}{1 - u_x v} , \quad u'_y = \frac{dy'}{dt'} = \frac{u_y}{\gamma_v (1 - u_x v)} , \quad u'_z = \frac{dz'}{dt'} = \frac{u_z}{\gamma_v (1 - u_x v)} . \tag{3.2} $$
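A quick way to build confidence in eq. (3.2) is to feed it a light ray and confirm that its speed stays equal to 1. The Python sketch below does this under stated assumptions: the function name `boost_velocity`, the boost speed 0.6, and the choice of a light ray along the $y$ axis are all illustrative.

```python
import math

def boost_velocity(u, v):
    """Apply eq. (3.2) to a 3-velocity u = (ux, uy, uz) for a boost of speed v along x.

    Units with c = 1 are assumed, so |v| < 1 is required.
    """
    ux, uy, uz = u
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    denom = 1.0 - ux * v
    return ((ux - v) / denom, uy / (gamma * denom), uz / (gamma * denom))

# A light ray travelling along y in the unprimed frame, viewed from a frame
# moving at v = 0.6 along x (both choices are assumptions for the demo).
u_light = (0.0, 1.0, 0.0)
u_prime = boost_velocity(u_light, 0.6)
speed = math.sqrt(sum(component ** 2 for component in u_prime))
print(u_prime, speed)
```

The direction of the ray changes (this is relativistic aberration), but its speed does not. Boosting into the rest frame of a particle, e.g. `boost_velocity((0.5, 0.0, 0.0), 0.5)`, returns a vanishing velocity, as it should.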

You can also work out the 3-accelerations

$$ a'_x = \frac{a_x}{\gamma_v^3 (1 - u_x v)^3} , \quad a'_y = \frac{a_y}{\gamma_v^2 (1 - u_x v)^2} + \frac{(u_y v)\, a_x}{\gamma_v^2 (1 - u_x v)^3} , \quad a'_z = \frac{a_z}{\gamma_v^2 (1 - u_x v)^2} + \frac{(u_z v)\, a_x}{\gamma_v^2 (1 - u_x v)^3} . \tag{3.3} $$

Notice that, unlike for Galilean relativity, acceleration is not an invariant in Special Relativity. But whether or not someone is accelerating is an absolute concept: if the acceleration is zero in one frame of reference, then it is also zero in a Lorentz boosted frame of reference. Note: these formulæ are written in older notation that we will not continue using later in this course.

We can write Lorentz boost formulæ in a much prettier way by using the rapidity $\zeta$, which is defined by

$$ v = \tanh\zeta . \tag{3.4} $$

Note that while the speed ranges over $v \in (-1, +1)$, the rapidity ranges over $\zeta \in (-\infty, +\infty)$. The really awesome thing about rapidity is that it is additive. To add the rapidities, you literally just add them, like for rotation angles: $\zeta_{\rm tot} = \zeta_1 + \zeta_2$. It is a simple exercise to recover the relativistic velocity addition law from the definition of rapidity and its additive nature. Give it a go yourself to be sure you understand. Now we are in a position to show you a Lorentz boost along the $x$ direction in rapidity variables – da-daah!

$$
\begin{aligned}
dt' &= +\cosh\zeta\, dt - \sinh\zeta\, dx , \\
dx' &= -\sinh\zeta\, dt + \cosh\zeta\, dx , \\
dy' &= dy , \\
dz' &= dz .
\end{aligned} \tag{3.5}
$$

This looks a bit like a rotation, except for two physically important differences: (1) it mixes temporal and spatial intervals, rather than different spatial intervals, and (2) it involves hyperbolic trig functions, rather than normal trig functions. Another difference is that it is not the 3D Euclidean norm that is preserved under Lorentz transformations, but rather the 4D Minkowski norm, also known as the invariant interval

$$ ds^2 = dt^2 - dx^2 - dy^2 - dz^2 . \tag{3.6} $$

The invariant interval so defined is positive if the points are timelike separated, negative if they are spacelike separated, and zero if they are null separated. This classification works regardless of which inertial reference frame you use, because it is invariant under symmetry transformations of Minkowski spacetime: rotations, [Lorentz] boosts, and translations.
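The additivity of rapidity and the invariance of the interval under the boost (3.5) are both easy to test numerically. The following Python sketch restricts attention to the $t$-$x$ plane; the function names, the two boost speeds, and the sample interval are assumptions made for the illustration.

```python
import math

def boost(zeta, dt, dx):
    """Eq. (3.5): boost along x with rapidity zeta (dy and dz are unchanged)."""
    return (math.cosh(zeta) * dt - math.sinh(zeta) * dx,
            -math.sinh(zeta) * dt + math.cosh(zeta) * dx)

def interval(dt, dx):
    """Invariant interval ds^2 = dt^2 - dx^2, restricted to the t-x plane."""
    return dt * dt - dx * dx

v1, v2 = 0.5, 0.6                          # assumed demo boost speeds (c = 1)
zeta1, zeta2 = math.atanh(v1), math.atanh(v2)
dt, dx = 2.0, 1.0                          # assumed demo interval components

# Two successive boosts should equal a single boost with the summed rapidity.
two_steps = boost(zeta2, *boost(zeta1, dt, dx))
one_step = boost(zeta1 + zeta2, dt, dx)

# The summed rapidity corresponds to the relativistic composition of the speeds.
v_total = math.tanh(zeta1 + zeta2)
v_composed = (v1 + v2) / (1.0 + v1 * v2)

print(two_steps, one_step)
print(v_total, v_composed)
print(interval(dt, dx), interval(*one_step))
```

The last line prints the same value of $ds^2$ before and after the boost, which follows from $\cosh^2\zeta - \sinh^2\zeta = 1$ in the same way that rotation invariance follows from $\cos^2\theta + \sin^2\theta = 1$.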

The invariant interval $ds^2 = dt^2 - |d\vec{x}|^2$ gives rise to the concept of a light cone. For a point $p$, this is the cone defined by all light rays emanating from $p$ into the future or the past. Points that are timelike separated from $p$ are inside its light cone (positive $ds^2$), those that are spacelike separated from $p$ are outside it (negative $ds^2$), and those that are null separated from $p$ (zero $ds^2$) lie on the light cone itself. Put more colloquially, if you had just died at point $p$, then your past light cone and its interior would contain all possible suspects for who had murdered you. If on the other hand you had set off a bomb at $p$, then your future light cone and its interior would contain beings you could have killed (using any form of explosive, TNT and photon torpedoes included!).

[Figure: pictorial representation of the light cone, for the $D = 2 + 1$ case. Figure credit: Wikipedia, https://en.wikipedia.org/wiki/Light_cone]

Note that light rays are conventionally drawn at a 45 degree angle on spacetime diagrams, in flat spacetime, to represent the fact that $c = 1$. In curved spacetime, the story gets more complicated, because the spacetime metric varies with position, rather than being constant.

It might be worth reminding you of the definition of proper time. To set the context, consider two events that are timelike separated. The proper time between two spacetime events measures the time elapsed as seen by an observer for whom the two events occur at the same spatial position. In our signature convention, the invariant interval is positive in the timelike case, so $ds^2 = d\tau^2$.
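The classification of separations and the definition of proper time can be summarized in a few lines of Python. This is a sketch under stated assumptions: it uses finite coordinate differences in flat spacetime with $c = 1$ and the mostly-minus signature of eq. (3.6), and the function names and the numerical tolerance are illustrative choices.

```python
import math

def interval(dt, dx, dy, dz):
    """Invariant interval ds^2 = dt^2 - dx^2 - dy^2 - dz^2, as in eq. (3.6)."""
    return dt ** 2 - dx ** 2 - dy ** 2 - dz ** 2

def classify(ds2, tol=1e-12):
    """Label a separation by the sign of ds^2 (tol is an assumed rounding tolerance)."""
    if ds2 > tol:
        return "timelike"
    if ds2 < -tol:
        return "spacelike"
    return "null"

def proper_time(dt, dx, dy, dz):
    """Proper time between timelike separated events, using ds^2 = dtau^2."""
    ds2 = interval(dt, dx, dy, dz)
    if ds2 < 0:
        raise ValueError("proper time is only defined for timelike or null separations")
    return math.sqrt(ds2)

print(classify(interval(5.0, 3.0, 0.0, 0.0)), proper_time(5.0, 3.0, 0.0, 0.0))
print(classify(interval(1.0, 1.0, 0.0, 0.0)))
print(classify(interval(1.0, 2.0, 0.0, 0.0)))
```

In the first example the coordinate time elapsed is 5 while the proper time is 4: the observer who is present at both events ages less than the coordinate time suggests, which is time dilation in action.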


Motivated by the form of the matrices representing Lorentz boost transformations, let us define a relativistic 4-vector $x$ with components $x^\mu$ given by

$$
\begin{aligned}
x^0 &= (c)\, t , \\
x^i &= (\vec{x})^i .
\end{aligned} \tag{3.7}
$$

Here, $\mu \in \{0, 1, \ldots, d\}$. Notice how time is totally different conceptually than it was in Galilean relativity: it is the zeroth position coordinate, not an invariant. We can then define the invariant interval as

$$ ds^2 = g_{\mu\nu}\, dx^\mu dx^\nu . \tag{3.8} $$

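In flat Minkowski spacetime in Cartesian coordinates, the metric tensor has downstairs components

$$ \eta_{\mu\nu} = \begin{cases} +1, & \mu = \nu = 0 \\ -1, & \mu = \nu \in \{1, 2, \ldots, d\} \\ \phantom{+}0, & \mu \neq \nu \end{cases} . \tag{3.9} $$

Its upstairs counterpart, the inverse, has components

$$ \eta^{\mu\nu} = \begin{cases} +1, & \mu = \nu = 0 \\ -1, & \mu = \nu \in \{1, 2, \ldots, d\} \\ \phantom{+}0, & \mu \neq \nu \end{cases} . \tag{3.10} $$

It satisfies

$$ \eta^{\alpha\beta} \eta_{\beta\gamma} = \delta^{\alpha}{}_{\gamma} . \tag{3.11} $$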

Again, we have used the Einstein summation convention where repeated indices are summed over. Note that we have chosen the mostly minus signature convention here. Be aware that formulæ that you may obtain from various GR textbooks may have been written in the opposite sign convention. This can be quite annoying when you are trying to track down minus sign errors in a calculation. HEL (Hobson, Efstathiou, and Lasenby) has a useful table on p.193 outlining key signature convention differences with d’Inverno, Misner-Thorne-Wheeler and Weinberg.

The Minkowski metric tensor $\eta$ is useful for raising and lowering indices. Specifically, for a contravariant vector $V^\nu$ we can find its covariant components $V_\mu$ by contracting with $\eta_{\mu\nu}$:

$$ V_\mu = \eta_{\mu\nu} V^\nu . \tag{3.12} $$

Contracting an index means repeating it (precisely once) and summing over it. For example, in the above equation, the index $\nu$ is contracted, while the index $\mu$ is not. Let us calculate one component, $V_0$.

$$
\begin{aligned}
V_0 &= \eta_{0\nu} V^\nu \\
&= \eta_{00} V^0 + \eta_{01} V^1 + \eta_{02} V^2 + \eta_{03} V^3 \\
&= (+1) V^0 + (0) V^1 + (0) V^2 + (0) V^3 \\
&= +V^0 .
\end{aligned} \tag{3.13}
$$

To find the contravariant components $\omega^\mu$ of a covariant vector $\omega_\nu$, we need to contract with the upstairs metric $\eta^{\mu\nu}$:-

$$ \omega^\mu = \eta^{\mu\nu} \omega_\nu . \tag{3.14} $$

Using the Minkowski metric, we can define a relativistic dot product between two contravariant vectors $a^\mu$ and $b^\nu$,

$$ a \cdot b = \eta_{\mu\nu}\, a^\mu b^\nu . \tag{3.15} $$
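Lowering, raising, and the relativistic dot product can all be written as explicit index sums. Here is a minimal Python sketch for $D = 3 + 1$ in the mostly-minus convention of eq. (3.9); the names `ETA`, `lower`, `raise_index`, and `minkowski_dot`, as well as the sample components, are assumptions of this illustration.

```python
# Minkowski metric in Cartesian coordinates, mostly-minus signature (eq. (3.9)).
# Its inverse eta^{mu nu} has the same numerical components (eq. (3.10)).
ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]

def lower(V):
    """V_mu = eta_{mu nu} V^nu  (eq. (3.12))."""
    return [sum(ETA[mu][nu] * V[nu] for nu in range(4)) for mu in range(4)]

def raise_index(omega):
    """omega^mu = eta^{mu nu} omega_nu  (eq. (3.14))."""
    return [sum(ETA[mu][nu] * omega[nu] for nu in range(4)) for mu in range(4)]

def minkowski_dot(a, b):
    """a . b = eta_{mu nu} a^mu b^nu  (eq. (3.15))."""
    return sum(ETA[mu][nu] * a[mu] * b[nu] for mu in range(4) for nu in range(4))

V = [2, 1, 0, 3]                 # assumed contravariant components V^mu
V_low = lower(V)
print(V_low)                     # time component unchanged, spatial components flip sign
print(raise_index(V_low))        # raising after lowering returns the original components
print(minkowski_dot(V, V))       # 4 - 1 - 0 - 9 = -6, so V is spacelike
```

Unlike the Euclidean case, upstairs and downstairs components now differ by signs, so careless index placement produces genuine errors.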

Before we move on to defining tensors in a more general way, let us make a couple of comments about the symmetry group of Minkowski spacetime for those who might be interested. We talked about rotations earlier, and noted that they preserved the norm of 3-vectors in flat Euclidean space. A rotation matrix is orthogonal and preserves the Euclidean 3-norm. The group of such matrices in 3D is known as $SO(3)$. What is the analogous condition for 4-vectors in flat Minkowski spacetime? If you work out the algebra, you will find that both rotation and boost transformations written as $4 \times 4$ matrices $\Lambda$ preserve the Minkowski norm, $\Lambda^T \eta \Lambda = \eta$, where $\eta$ is the Minkowski metric tensor we defined above. In index notation,

$$ \Lambda_{k'}{}^{i}\, \eta^{k'}{}_{\ell'}\, \Lambda^{\ell'}{}_{j} = \eta^{i}{}_{j} . \tag{3.16} $$

Such matrices $\Lambda$ in $D = d + 1$ dimensions are said to belong to the group $SO(1, d)$. Rotation and boost matrices together are known as Lorentz transformations and they form a Lie (continuous) group known as the Lorentz group.
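The matrix condition $\Lambda^T \eta \Lambda = \eta$ is straightforward to test for explicit boosts and rotations. The Python sketch below does so with plain nested lists; all function names, the rapidity, the angle, and the Galilean counterexample are assumptions chosen for illustration.

```python
import math

ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

def boost_x(zeta):
    """Boost along x with rapidity zeta, in the form of eq. (3.5)."""
    ch, sh = math.cosh(zeta), math.sinh(zeta)
    return [[ch, -sh, 0.0, 0.0], [-sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def rotation_z(theta):
    """Spatial rotation about z, embedded in a 4x4 matrix that leaves time alone."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0, 0.0], [0.0, c, s, 0.0],
            [0.0, -s, c, 0.0], [0.0, 0.0, 0.0, 1.0]]

def galilean_boost_x(v):
    """Galilean boost t' = t, x' = x - v t, included as a counterexample."""
    return [[1.0, 0.0, 0.0, 0.0], [-v, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(A):
    return [[A[j][i] for j in range(4)] for i in range(4)]

def preserves_metric(L, tol=1e-12):
    """Check Lambda^T eta Lambda = eta entry by entry, up to rounding error."""
    lhs = matmul(transpose(L), matmul(ETA, L))
    return all(abs(lhs[i][j] - ETA[i][j]) < tol for i in range(4) for j in range(4))

print(preserves_metric(boost_x(0.7)))
print(preserves_metric(rotation_z(0.4)))
print(preserves_metric(matmul(boost_x(0.7), rotation_z(0.4))))
print(preserves_metric(galilean_boost_x(0.5)))
```

The first three checks succeed, reflecting the fact that products of Lorentz transformations are again Lorentz transformations, while the Galilean boost fails because it does not preserve the Minkowski norm.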

If we include translations as well, the resulting group of transformations is known as the Poincaré group $ISO(1, d)$ (mathematically, it is a semidirect product). An interesting fact about the Poincaré group is that, without even looking at an experiment, you can prove theoretically that there are only two invariants²: the mass $m$ and the intrinsic spin $s$. They are always the same in different inertial frames of reference related by rotations, boosts, or translations. This is why subatomic particles are differentiated by their mass and spin. The third label we use to distinguish subatomic particles, that also respects Poincaré invariance, is the set of conserved charges under whichever gauge symmetries are relevant, e.g. $SU(3) \times SU(2) \times U(1)$ of the Standard Model of Particle Physics.

So, how do we define vectors and tensors in flat spacetime? The signature property of a vector and, more generally, of a tensor, is that it transforms in a specific and well-defined way under changes of reference frame, using the spacetime coordinates as the quintessential example. For a single-index tensor V with upstairs components $V^\mu$,

$$V^{\mu'} = \frac{\partial x^{\mu'}}{\partial x^{\nu}}\, V^{\nu} = \Lambda^{\mu'}{}_{\nu}\, V^{\nu} , \tag{3.17}$$

which is known as a contravariant vector. There are also covariant vectors which obey

$$V_{\mu'} = \frac{\partial x^{\nu}}{\partial x^{\mu'}}\, V_{\nu} = \Lambda_{\mu'}{}^{\nu}\, V_{\nu} . \tag{3.18}$$

Look closely at the above two equations: they are materially different. In the equation for the contravariant vectors, the transformed coordinates x′ appear in the numerator of the Jacobian and the original coordinates x appear in the denominator in the transformation law; for covariant vectors the opposite happens.

² If you want to know why, and are unafraid of a little Lie group theory, you can find out by reading my PHY2404S notes at https://ap.io/archives/courses/2014-2020/2404s/qft.pdf. I also explain there why helicity is the relevant thing for massless particles and why spin has the character of an angular momentum for massive particles.


Mathematically speaking, contravariant vectors live in the tangent space, which is defined at every point in spacetime. Covariant vectors live in the cotangent space. They obey the usual axioms of vector spaces: associativity and commutativity of addition, existence of identity and inverse under addition, distributivity, and compatibility with scalar multiplication.

A rank (m, n) tensor has m contravariant indices and n covariant indices. In mathematical language, a rank (m, n) tensor is a multilinear map from the direct product of m copies of the cotangent space with n copies of the tangent space into the real numbers. Alternatively, you can think of it as a machine with m slots for covariant vectors and n slots for contravariant vectors to make a scalar. For instance, a rank (0, 1) tensor (a covariant vector) is a machine with one slot for a contravariant vector (a rank (1, 0) tensor), which when inserted will produce a scalar (a rank (0, 0) tensor). The spacetime metric is a (0, 2) tensor; its inverse is a (2, 0) tensor.

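To find out how the components of a tensor transform, you use the transformation matrices on each index in turn,

$$T^{\mu_1' \ldots \mu_m'}{}_{\nu_1' \ldots \nu_n'} = \frac{\partial x^{\mu_1'}}{\partial x^{\lambda_1}} \cdots \frac{\partial x^{\mu_m'}}{\partial x^{\lambda_m}}\, \frac{\partial x^{\sigma_1}}{\partial x^{\nu_1'}} \cdots \frac{\partial x^{\sigma_n}}{\partial x^{\nu_n'}}\; T^{\lambda_1 \ldots \lambda_m}{}_{\sigma_1 \ldots \sigma_n} . \tag{3.19}$$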

Note that each of the indices $\lambda_1, \ldots, \lambda_m$ and $\sigma_1, \ldots, \sigma_n$ in this equation is repeated and summed over, keeping to the Einstein summation convention. So if you were to expand out all the components one by one, this would be a pretty long equation. It’s just as well we know how to represent it compactly using index notation!

The general idea of tensor analysis is that all laws of physics should be expressible in terms of tensor equations. In tensorial equations, indices can be raised and lowered, as long as this is done consistently on both sides. In other words, you should not raise an index on the left side of a tensor equation while failing to do the same on the right hand side. Every equation should have the same number and type of indices on both sides. Tensorial equations hold equally well in any frame of reference, even though the components are different in different frames of reference.

Now let us turn to a few examples of the utility of tensors in Minkowski spacetime.

3.2 Partial derivative 4-vector

We can use Minkowski spacetime tensors to describe more objects than a massive point particle. For starters, we can form a very important covariant vector out of derivatives,

$$\partial_\mu \equiv \frac{\partial}{\partial x^\mu} . \tag{3.20}$$

Its zeroth component describes the time derivative

$$\partial_0 = \frac{\partial}{\partial (c) t} = \frac{1}{(c)} \frac{\partial}{\partial t} , \tag{3.21}$$

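while the spatial parts $\partial_i$ describe spatial derivatives. As you can see, $\partial_\mu$ arises naturally as a covariant vector. It is a straightforward and worthwhile exercise to show that in flat Minkowski spacetime,

$$\partial_\mu \partial^\mu = \frac{1}{(c)^2} \frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2} - \frac{\partial^2}{\partial z^2} . \tag{3.22}$$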

This differential operator appears in relativistic wave equations, e.g. the Maxwell equations, or the equation of motion for a Klein-Gordon (scalar) field $\Phi$, which with our (+, −, −, −) signature reads $\partial_\mu \partial^\mu \Phi = -m^2 \Phi$.

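For fun, let us try applying $-i\hbar\partial_\mu$ to a plane wave of the form $f(x) = e^{ik\cdot x}$ and see what happens.

$$\begin{aligned} -i\hbar\, \partial_\mu f(x) &= -i\hbar\, \partial_\mu \exp(i k_\lambda x^\lambda) \\ &= -i\hbar\, [\, i k_\nu \delta^\nu_\mu \,] \exp(i k_\lambda x^\lambda) \\ &= \hbar k_\mu f(x) . \end{aligned} \tag{3.23}$$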

In other words, $-i\hbar\partial_\mu$ is playing the role of the momentum when acting on plane waves of the form $f(x) = e^{ik\cdot x}$, producing the eigenvalue $p_\mu = \hbar k_\mu$. In mathematical lingo, we say that the plane wave carries a representation of the translation group. If we only had discrete translation invariance up to a lattice vector, we would end up with Bloch waves instead of continuous spectrum plane waves.


4 Thu.12.Sep

4.1 Relativistic particle: position, momentum, acceleration 4-vectors

For any point particle, massive or massless, we can define its 4-momentum $p^\mu$ by

$$p^0 = E , \qquad p^i = (\vec p)^i , \tag{4.1}$$

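where $\vec p$ is the relativistic 3-momentum and E is the relativistic energy. For a massive particle, we have

$$p^0 = \frac{m}{\sqrt{1 - v^2}} = m \cosh\zeta , \qquad p^i = \frac{m}{\sqrt{1 - v^2}}\, v^i = m \sinh\zeta\; \hat v^i , \tag{4.2}$$

where $\zeta$ is the rapidity and $\hat v^i = v^i / |\vec v|$ is the unit vector along the velocity.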

Check out for yourself what happens to components of the 4-momentum under Lorentz transformations.

Notice that the relativistic norm of the momentum 4-vector is a constant,

$$p^\mu p_\mu = E^2 - |\vec p|^2 = m^2 . \tag{4.3}$$

This is known as the mass shell relation. It holds for any particle, massless or massive. For massless particles like the photon, $E^2 = |\vec p|^2$.
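The mass shell relation and its frame independence can be illustrated numerically. This is a small sketch restricted to motion along one spatial axis, in units with c = 1; the function names are hypothetical and chosen only for this example.

```python
import math

def four_momentum(mass, rapidity):
    """(p^0, p^1) for a massive particle moving along x, as in eq. (4.2)."""
    return (mass * math.cosh(rapidity), mass * math.sinh(rapidity))

def minkowski_norm_sq(p):
    """p^mu p_mu = E^2 - |p|^2 with signature (+, -)."""
    energy, px = p
    return energy**2 - px**2

def boost(p, rapidity):
    """Components of p as seen from a frame moving along x with this rapidity."""
    energy, px = p
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return (ch * energy - sh * px, -sh * energy + ch * px)
```

Boosting `four_momentum(2.0, 0.8)` by rapidity 0.5 changes both components but leaves `minkowski_norm_sq` equal to m² = 4, and the boosted energy equals m cosh(0.3), reflecting the additivity of rapidity.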

The 4-velocity is defined for massive particles only, via

$$u^\mu = \frac{dx^\mu(\tau)}{d\tau} , \tag{4.4}$$

where τ is the proper time. It is related to the momentum 4-vector by

$$p^\mu = m u^\mu . \tag{4.5}$$

Note that the 4-velocity satisfies

$$u^\mu u_\mu = +1 , \tag{4.6}$$

by the mass shell constraint. Work out for yourself how the spatial components of $u^\mu$ relate to the Newtonian velocity; you should find $u^0 = \gamma_v$, $u^i = \gamma_v (\vec v)^i$.

The 4-acceleration is defined for massive particles only, via

$$a^\mu = \frac{du^\mu(\tau)}{d\tau} = \frac{d^2 x^\mu(\tau)}{d\tau^2} , \tag{4.7}$$

where τ is proper time. Work out for yourself how this relativistic acceleration $a^\mu$ connects with the Newtonian version of acceleration $\vec a$ that you used in first-year undergrad physics.

What if we wanted to get a bit more sophisticated and write down an action principle for the point particle? First, let us do a lightning review of some salient points from classical mechanics. In a general dynamical system, our variables are the coordinates

$$q^a(\lambda) , \tag{4.8}$$

where the index a labels which coordinate we are discussing and λ is a parameter that measures where we are along a particle path. For a non-relativistic system we will pick λ = t, Newtonian time. The velocities are $\dot q^a(\lambda)$, where the dot denotes d/dλ. We also have the expression for the canonical momenta in terms of the velocities,

$$p_a = \frac{\partial L}{\partial \dot q^a} , \tag{4.9}$$

which are found from the Lagrangian L appearing in the action, which is a functional of the coordinates,

$$S = S[q^a(\lambda)] = \int d\lambda\, L(q^a(\lambda), \dot q^b(\lambda)) . \tag{4.10}$$

Using the Lagrangian L and the expressions for the canonical momenta in terms of the velocities, we can form the Hamiltonian H which depends on the coordinates on phase space, the coordinates and their conjugate momenta,

$$H = H(q^a, p_b) = \sum_a p_a \dot q^a - L . \tag{4.11}$$

The principle of least action δS = 0, combined with your knowledge of the calculus of variations, results in the Euler-Lagrange equations

$$\frac{\partial L}{\partial q^a} - \frac{d}{d\lambda} \left( \frac{\partial L}{\partial \dot q^a} \right) = 0 . \tag{4.12}$$

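These equations of motion are equivalent to Hamilton’s equations,

$$\frac{dp_a}{d\lambda} = \{p_a, H\}_{PB} , \qquad \frac{dq^a}{d\lambda} = \{q^a, H\}_{PB} , \tag{4.13}$$

where the Poisson bracket is defined via

$$\{f, g\}_{PB} = \sum_a \left( \frac{\partial f}{\partial q^a} \frac{\partial g}{\partial p_a} - \frac{\partial g}{\partial q^a} \frac{\partial f}{\partial p_a} \right) . \tag{4.14}$$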

The basic dynamical variables for a non-relativistic point particle are $x^i(t)$, where t is the non-relativistic time and i = 1, 2, 3. These are 3 bona fide independent functions. There is no issue about how to parametrize t, because all observers agree on time, by Galilean relativity. For a free nonrelativistic particle, the Lagrangian is just the kinetic energy,

$$S_{nonrel} = \int dt\, \frac{1}{2} m |\vec v|^2 . \tag{4.15}$$

This action respects Galilean invariance in Euclidean 3-space. The canonical momenta are

$$p_i = m v_i , \tag{4.16}$$

and the Hamiltonian is

$$H_{NR} = \frac{1}{2m}\, p_i p_i . \tag{4.17}$$

This is just the kinetic energy written in terms of momentum rather than velocity.

So, that was all well and good, but what about an action principle for the relativistic point particle? This will be an integral over the worldline of the particle, which is the path it traces out as it moves through spacetime. For relativistic point particles we cannot use the Newtonian kinetic energy, because it is not invariant under Lorentz boosts. We will have to use a generalization that respects Einsteinian relativity. The simplest guess for an action generalizing the above that people typically write for a massive particle is proportional to the arc length,

$$S^{(1)}_{rel} = -m \int d\tau\, \sqrt{\eta_{\mu\nu}\, \frac{dx^\mu(\tau)}{d\tau} \frac{dx^\nu(\tau)}{d\tau}} , \tag{4.18}$$

where τ is the proper time (an invariant, unlike the time coordinate). This action has the benefit that, at low speeds, it reduces to the familiar non-relativistic action, up to an additive constant (try it yourself to see how, by doing a Taylor series). It assumes that the particle position $x^\mu(\tau)$ can be parametrized by the proper time τ.
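The low-speed limit can be checked numerically before doing the Taylor series by hand. The sketch below assumes units with c = 1 and rewrites the proper-time integral in terms of coordinate time using dτ = dt √(1 − v²), so that the Lagrangian per unit coordinate time is −m√(1 − v²); the function names are illustrative only.

```python
import math

def lagrangian_rel(mass, speed):
    """Arc-length Lagrangian per unit coordinate time: -m sqrt(1 - v^2), c = 1."""
    return -mass * math.sqrt(1.0 - speed**2)

def lagrangian_nonrel(mass, speed):
    """Newtonian kinetic-energy Lagrangian of eq. (4.15)."""
    return 0.5 * mass * speed**2

def mismatch(mass, speed):
    """Difference between the relativistic Lagrangian and (Newtonian - m).

    The additive constant -m is the rest-energy term; what is left over
    should be of order m v^4 / 8 at small speeds.
    """
    return lagrangian_rel(mass, speed) - (lagrangian_nonrel(mass, speed) - mass)
```

At v = 0.01 the leftover is about m v⁴/8, far smaller than the kinetic term m v²/2, which is the sense in which the relativistic action reduces to the Newtonian one up to the constant −m.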

The drawback of this first choice of relativistic particle action is twofold. First, the particle is assumed to be massive, so that proper time can be used to parametrize the worldline. If we want to write down equations of motion for massless particles like photons, it will not suffice. Second, and more seriously, our 4 dynamical variables $x^\mu(\tau)$ are not actually independent functions. They must obey the mass shell constraint,

$$\dot x^\mu \dot x_\mu = +1 , \tag{4.19}$$

where the dot denotes d/dτ. As a result, only 3 of the 4 $x^\mu(\tau)$ are independent functions. It is a physics fib to pretend that all 4 can be independently varied in the action principle. Oops.

Suppose further that we tried to use the above Lagrangian $L_\tau = -m\sqrt{\dot x^2}$ to find the canonical momenta and Hamiltonian. What we would end up with is $H_\tau = 0$. A related fact is that the geometric arc length Lagrangian is ‘singular’. What does this mean? Well, if we inspect the general Euler-Lagrange equations for the $q^a(t)$, we can rearrange to see that

$$\frac{\partial^2 L}{\partial \dot q^b\, \partial \dot q^a}\, \ddot q^b = \frac{\partial L}{\partial q^a} - \frac{\partial^2 L}{\partial q^b\, \partial \dot q^a}\, \dot q^b - \frac{\partial^2 L}{\partial t\, \partial \dot q^a} . \tag{4.20}$$

Everything on the RHS of this equation is a function of t, $q^a$, and $\dot q^a$. So for our relativistic point particle, finding all of the accelerations $\ddot x^\mu$ in terms of τ, $x^\mu(\tau)$, $\dot x^\mu(\tau)$ only works if our Hessian tensor

$$\frac{\partial^2 L}{\partial \dot x^\nu\, \partial \dot x^\mu} \tag{4.21}$$

has maximal rank. It actually has one zero eigenvalue, and this signals the presence of a local gauge symmetry: reparametrization invariance.³

Let us now show the correct way to handle a constraint. The key is to impose the mass shell constraint via a Lagrange multiplier e(λ). In general, a Lagrange multiplier is something that appears in your action principle only via dependence on “coordinates” but not on “velocities”: it is not a dynamical field. Its only function is to implement the constraint that you need to impose, in a way that respects the symmetries of your system. In our case, we want to preserve Poincaré symmetry: invariance under rotations, boosts, and translations.

³ For more details on this and a number of related topics, see the 1990 textbook “Lagrangian Interaction” by Noel Doughty, intended for senior undergraduates. When I was doing my B.Sc.(Hons) degree in New Zealand, I took a course from Doughty, and his notes and background material were published as this book a year later. I am very grateful to Doughty for helping inspire me to be a theoretical physicist. If you take a peek into the Acknowledgements section, you will see that he thanked me and four of my classmates. :D

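The einbein action is

$$S^{(2)}_{rel} = \int d\lambda \left[ \frac{1}{2}\, e^{-1}(\lambda)\, \eta_{\mu\nu}\, \frac{dx^\mu(\lambda)}{d\lambda} \frac{dx^\nu(\lambda)}{d\lambda} + \frac{1}{2}\, e(\lambda)\, m^2 \right] . \tag{4.22}$$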

This action is invariant under reparametrizations λ → λ′, as long as the einbein transforms contragrediently,

$$e \to e' = \frac{d\lambda}{d\lambda'}\, e . \tag{4.23}$$

Varying this action w.r.t. e(λ) gives the constraint equation,

$$\dot x^\mu(\lambda)\, \dot x_\mu(\lambda) = +m^2\, [e(\lambda)]^2 . \tag{4.24}$$

For massive particles we can pick the proper time gauge in which e(λ) = 1/m; then λ = τ. Because we have derived the constraint directly from the action, we can be confident that we truly have only 3 independent functions $x^\mu(\lambda)$ in our dynamical system, not 4. Varying this action w.r.t. $x^\mu(\lambda)$ gives the equations of motion,

$$\dot p_\mu(\lambda) = 0 , \tag{4.25}$$

where the canonical momenta are

$$p_\mu(\lambda) = e^{-1}(\lambda)\, \dot x_\mu(\lambda) . \tag{4.26}$$

The above equation of motion $\dot p_\mu = 0$ is equally valid for massive or massless particles.

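The Hamiltonian is

$$\begin{aligned} H_\lambda &= p_\mu(\lambda)\, \dot x^\mu(\lambda) - L_\lambda \\ &= \frac{1}{2}\, e(\lambda) \left[ p^\mu(\lambda)\, p_\mu(\lambda) - m^2 \right] . \end{aligned} \tag{4.27}$$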

This Hamiltonian is proportional to the constraint, and this is the correct answer because it gives all the correct Poisson brackets:

$$\{x^\mu, p_\nu\}_{PB} = \delta^\mu_\nu , \qquad \{x^\mu, x^\nu\}_{PB} = 0 , \qquad \{p_\mu, p_\nu\}_{PB} = 0 . \tag{4.28}$$

If we wanted to canonically quantize a system (something we will not be doing in this course), we would replace classical Poisson brackets with quantum mechanical commutators.

4.2 Electromagnetism: 4-vector potential and field strength tensor

A less trivial example of a special relativistic tensor is Maxwell’s electromagnetism. Having played with the Maxwell equations, you know why EM waves travel (in vacuum) at the speed of light. You may already know that, decades before Einstein invented special relativity, Maxwell had baked it into the very fabric of his eponymous equations! What you may not know is that the familiar electric and magnetic field strengths are actually not correctly described by vectors, but instead by a two-index covariant antisymmetric tensor. Specifically, in four spacetime dimensions, the gauge field strength components $F_{\mu\nu}$ are built out of

$$F_{0i} = +\delta_{ij}\, (\vec E)^j , \qquad F_{ij} = -\epsilon_{ijk}\, (\vec B)^k . \tag{4.29}$$

In this equation, we used the totally antisymmetric permutation symbol $\epsilon_{ijk}$ in 3 dimensions.

The electromagnetic 4-vector gauge potential $A^\mu$ is built out of the scalar potential and the 3-vector potential, with components

$$A^0 = \Phi , \qquad A^i = (\vec A)^i . \tag{4.30}$$

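It is related to the field strength via the covariant curl,

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu . \tag{4.31}$$

This splits up in 3+1 notation as

$$\vec B = \vec\nabla \times \vec A , \qquad \vec E = -\vec\nabla \Phi - \frac{\partial \vec A}{\partial t} . \tag{4.32}$$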

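Note that $A_\mu(x^\lambda)$ is the basic dynamical field of electromagnetism. The field strength $F_{\mu\nu}$ is a derived quantity.

Using the above definitions, the four Maxwell equations

$$\vec\nabla \times \vec B - \frac{\partial \vec E}{\partial t} = \vec J , \qquad \vec\nabla \cdot \vec E = \rho , \tag{4.33}$$

$$\vec\nabla \times \vec E + \frac{\partial \vec B}{\partial t} = \vec 0 , \qquad \vec\nabla \cdot \vec B = 0 , \tag{4.34}$$

neatly collapse into two manifestly relativistic Maxwell equations,

$$\partial_\mu F^{\mu\nu} = J^\nu , \qquad \epsilon^{\mu\nu\lambda\rho}\, \partial_\nu F_{\lambda\rho} = 0 . \tag{4.35}$$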

Later when we generalize to curved spacetime, the partial derivatives $\partial_\mu$ will be replaced by covariant derivatives $\nabla_\mu$.

In the above relativistic Maxwell equations, the 4-vector current is built out of the charge density and the 3-vector current, with components

$$J^0 = \rho , \qquad J^i = (\vec j)^i . \tag{4.36}$$

The 4-vector current obeys a conservation law,

$$\partial_\mu J^\mu = 0 . \tag{4.37}$$

Here we are working in four spacetime dimensions. If we write more generally the spacetime dimension D as⁴ D = d + 1, then the D-index pseudotensor $\epsilon_{\mu_0 \ldots \mu_d}$ is defined via

$$\epsilon_{\mu_0 \ldots \mu_d} = \begin{cases} +1 , & (\mu_0 \ldots \mu_d) = \text{even permutation of } (012 \ldots d) , \\ -1 , & (\mu_0 \ldots \mu_d) = \text{odd permutation of } (012 \ldots d) , \\ \phantom{+}0 , & \text{otherwise} . \end{cases} \tag{4.38}$$

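If you want the permutation tensor with upstairs indices, you can easily build it by using $\eta^{\mu\nu}$ to raise the indices. Note that our 4-index permutation pseudotensor $\epsilon_{\mu\nu\lambda\sigma}$ obeys some handy identities, in a Minkowski space generalization of what we saw before for Euclidean 3-space. Defining

$$\delta^{\mu\nu}_{\alpha\beta} = \begin{vmatrix} \delta^\mu_\alpha & \delta^\nu_\alpha \\ \delta^\mu_\beta & \delta^\nu_\beta \end{vmatrix} \tag{4.39}$$

and

$$\delta^{\mu\nu\lambda}_{\alpha\beta\gamma} = \begin{vmatrix} \delta^\mu_\alpha & \delta^\nu_\alpha & \delta^\lambda_\alpha \\ \delta^\mu_\beta & \delta^\nu_\beta & \delta^\lambda_\beta \\ \delta^\mu_\gamma & \delta^\nu_\gamma & \delta^\lambda_\gamma \end{vmatrix} \tag{4.40}$$

and

$$\delta^{\mu\nu\lambda\sigma}_{\alpha\beta\gamma\delta} = \begin{vmatrix} \delta^\mu_\alpha & \delta^\nu_\alpha & \delta^\lambda_\alpha & \delta^\sigma_\alpha \\ \delta^\mu_\beta & \delta^\nu_\beta & \delta^\lambda_\beta & \delta^\sigma_\beta \\ \delta^\mu_\gamma & \delta^\nu_\gamma & \delta^\lambda_\gamma & \delta^\sigma_\gamma \\ \delta^\mu_\delta & \delta^\nu_\delta & \delta^\lambda_\delta & \delta^\sigma_\delta \end{vmatrix} \tag{4.41}$$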

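gives, after quite a bit of algebra,

$$\begin{aligned} \epsilon^{\mu\nu\lambda\sigma} \epsilon_{\mu\nu\lambda\sigma} &= -4! , \\ \epsilon^{\mu\nu\lambda\sigma} \epsilon_{\mu\nu\lambda\delta} &= -3!\, \delta^\sigma_\delta , \\ \epsilon^{\mu\nu\lambda\sigma} \epsilon_{\mu\nu\gamma\delta} &= -2!\, \delta^{\lambda\sigma}_{\gamma\delta} , \\ \epsilon^{\mu\nu\lambda\sigma} \epsilon_{\mu\beta\gamma\delta} &= -1!\, \delta^{\nu\lambda\sigma}_{\beta\gamma\delta} , \\ \epsilon^{\mu\nu\lambda\sigma} \epsilon_{\alpha\beta\gamma\delta} &= -0!\, \delta^{\mu\nu\lambda\sigma}_{\alpha\beta\gamma\delta} . \end{aligned} \tag{4.42}$$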
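The first two of these identities can be verified by brute force, which is a useful sanity check on the overall minus sign coming from the Minkowski signature. The sketch below assumes the downstairs symbol is defined as in (4.38) and that indices are raised with the diagonal metric (+1, −1, −1, −1); all function names are illustrative.

```python
from itertools import permutations, product

# Diagonal entries of the inverse Minkowski metric, signature (+, -, -, -).
ETA_DIAG = (1, -1, -1, -1)

def parity(perm):
    """+1 for an even permutation, -1 for an odd one, by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

# Downstairs permutation symbol: nonzero only on permutations of (0, 1, 2, 3).
EPS_LOWER = {perm: parity(perm) for perm in permutations(range(4))}

def eps_lower(*idx):
    return EPS_LOWER.get(idx, 0)

def eps_upper(*idx):
    """Raise all four indices using the diagonal inverse metric."""
    factor = 1
    for i in idx:
        factor *= ETA_DIAG[i]
    return factor * eps_lower(*idx)

def full_contraction():
    """Sum over all four indices of eps^{mnls} eps_{mnls}."""
    return sum(eps_upper(*idx) * eps_lower(*idx)
               for idx in product(range(4), repeat=4))

def triple_contraction(sigma, delta):
    """Sum over three indices of eps^{mnl sigma} eps_{mnl delta}."""
    return sum(eps_upper(m, n, l, sigma) * eps_lower(m, n, l, delta)
               for m, n, l in product(range(4), repeat=3))
```

Here `full_contraction()` evaluates to −24 = −4!, and `triple_contraction(s, d)` evaluates to −6 when s = d and 0 otherwise, matching the first two lines of (4.42).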

The relativistic Lorentz force law can be written very nicely in relativistic tensor notation,

$$m a^\mu = q F^{\mu}{}_{\nu}\, u^\nu , \tag{4.43}$$

where $u^\mu$ is the relativistic 4-velocity and $a^\mu$ is the relativistic 4-acceleration. You will work out some aspects of this EM story in your HW1 assignment. In particular, you will be able to compute the effect of a Lorentz boost on the electromagnetic fields $\vec E$ and $\vec B$, which many of you will not have seen before.

We now turn to the question of what happens for accelerated observers moving with constant relativistic acceleration.

⁴ We only ever consider spacetimes with one timelike dimension. Currently, it is not generally known how to make sense of quantum theory with two or more timelike dimensions. Which ∂/∂t should we use in the Schrödinger equation?


5 Mon.16.Sep

5.1 Constant relativistic acceleration and the twin paradox

When I was an undergraduate, a professor introduced the idea of the Twin Paradox to us. Could the space traveller twin really live longer by travelling at relativistic speeds? The maddening thing was that he never equipped us with the technology to answer the question! Here is how we can solve that without having to resort to General Relativity: we will only use what we know about Special Relativity to solve it, along with a tiny bit of calculus.

We all know that time dilation lengthens time intervals as compared to what is measured in the rest frame. We also know that each observer sees the other person’s clock as running slow. So why is there even a difference between what the space twin sees and what the homebody twin sees? Acceleration. The space twin must accelerate in order to turn around and come back to Earth, before they can compare clocks with the homebody twin. This is what makes the space twin physically distinct from the homebody twin, who stays in a relatively boring inertial reference frame while the space twin gallivants around the galaxy.

What do we mean by constant relativistic acceleration, exactly? Without loss of generality, we may take the astronaut acceleration to be pointing along the $x^1$ direction. Since the rapidity is additive under two successive Lorentz boosts, we may take a guess that constant relativistic acceleration occurs when rapidity increases linearly with proper time. Let the magnitude of the constant relativistic acceleration be g. For an infinitesimal addition to rapidity in the $x^1$ direction dζ, the proposal is

dζ = g dτ . (5.1)

Here we have suppressed the factors of c, which can be easily restored via dimensional analysis. This formula is interesting because it actually holds for any kind of acceleration g(τ), not just the constant kind. Let us now see why.

Our key tool for analysis will be to define the instantaneous inertial rest frame (IIRF) for the accelerating astronaut, which we will denote by primes. This is obviously distinct from the ordinary inertial reference frame (IRF) of the homebody twin, and it is different at each point along the space twin’s trajectory because they accelerate. The key physical feature of the IIRF at any τ is that the astronaut is at rest in that frame at the instant in question: $u'_x = 0$. And since we know the relationship between 3-velocities and 3-accelerations in different inertial reference frames from our experience with Lorentz transformations, we can figure out what happens for the astronaut’s trajectory measured in the lab frame. We have

$$u'_x = \frac{u_x - v}{1 - u_x v} , \qquad a'_x = \frac{a_x}{\gamma_v^3\, (1 - u_x v)^3} , \tag{5.2}$$

where $\gamma_v \equiv 1/\sqrt{1 - v^2}$. Accordingly, at each instant along the astronaut trajectory,

$$u_x = v . \tag{5.3}$$

Therefore, for a general acceleration $a'_x = g(\tau)$ in the IIRF,

$$a_x = \gamma_v^{-3}\, g(\tau) . \tag{5.4}$$


We also know from elementary Lorentz transformations that $dt = \gamma_v\, d\tau$ (this is just like muons from cosmic rays lasting longer in the lab frame than in the muon rest frame because they are whizzing down to earth at relativistic speed). Remembering that $a_x = du_x/dt$, rearranging the above equation as a function of v, and integrating gives for the 3-velocity in the $x^1$ direction

$$\mathrm{arctanh}[u_x(\tau)] = \int_0^\tau d\sigma\, g(\sigma) . \tag{5.5}$$

In turn, this can be easily rearranged to give the rapidity along the $x^1$ direction

$$\zeta(\tau) = \int_0^\tau d\sigma\, g(\sigma) , \tag{5.6}$$

where we assumed that ζ(τ = 0) = 0. Next, we would like to compute the distance in the homebody frame moved by the astronaut during homebody time dt. This is simply obtained from the speed

$$dx = u_x\, dt = \tanh\zeta\, dt . \tag{5.7}$$

To convert to astronaut time, we again use the standard time dilation formula,

$$dt = \cosh\zeta\, d\tau . \tag{5.8}$$

This implies that

$$dx = \sinh\zeta\, d\tau . \tag{5.9}$$

If we know ζ(τ), we can integrate these equations. It is especially easy to do so for constant acceleration g. The position of the space twin in homebody coordinates becomes

$$x(\tau) = \frac{1}{g} \left[ \cosh(g\tau) - 1 \right] + x_0 . \tag{5.10}$$

The time for the space twin in homebody coordinates integrates to

$$t(\tau) = \frac{1}{g} \sinh(g\tau) + t_0 . \tag{5.11}$$

Using these equations, you can figure out the physical effect of acceleration on the ageing process. In your first homework assignment, you will find out that acceleration serves to enhance the familiar constant-speed time dilation effect, rather than reduce it. This is because the free particle trajectory actually maximizes proper time elapsed during motion; any acceleration applied reduces it. We will be able to see why this is later on when we study geodesics. Geodesics are, morally speaking, the closest thing to a straight line that is available in curved spacetime. They describe the trajectories of test particles in freefall.

Getting back to our equations above, we can see that for the case of a constant relativistic acceleration g, the trajectory of the accelerating space twin satisfies

$$\left[ x(\tau) - x_0 + \frac{1}{g} \right]^2 - \left[ t(\tau) - t_0 \right]^2 = \frac{1}{g^2} . \tag{5.12}$$

As you can see by inspection, this is a hyperbola. The asymptotes of the hyperbola are known as acceleration horizons or Rindler horizons. These asymptotes are lines with a 1:1 slope on a spacetime diagram. We can see why they are horizons by recalling that light rays also move at 45 degrees. An observer on a timelike trajectory going at higher acceleration hugs the hyperbola asymptote more tightly, but still cannot ‘see’ beyond the Rindler horizon.
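The constant-acceleration worldline can be explored numerically. The sketch below works in units with c = 1, evaluates the closed-form expressions (5.10) and (5.11), checks the hyperbola relation (5.12), and also integrates dζ = g dτ, dt = cosh ζ dτ, dx = sinh ζ dτ step by step so that a non-constant g(τ) could be substituted. The function names and the midpoint stepping scheme are assumptions made for this illustration.

```python
import math

def worldline(g, tau, x0=0.0, t0=0.0):
    """Homebody-frame (t, x) at astronaut proper time tau, eqs. (5.10)-(5.11)."""
    t = math.sinh(g * tau) / g + t0
    x = (math.cosh(g * tau) - 1.0) / g + x0
    return t, x

def hyperbola_residual(g, tau):
    """Left side minus right side of eq. (5.12), with x0 = t0 = 0."""
    t, x = worldline(g, tau)
    return (x + 1.0 / g) ** 2 - t**2 - 1.0 / g**2

def integrate_worldline(g_of_tau, tau_end, steps=20000):
    """Midpoint-rule integration of the rapidity, time, and position equations.

    g_of_tau is any function giving the acceleration felt by the astronaut
    at proper time tau; the rapidity starts from zero at tau = 0.
    """
    dtau = tau_end / steps
    zeta = t = x = 0.0
    for n in range(steps):
        g_mid = g_of_tau((n + 0.5) * dtau)
        zeta_mid = zeta + 0.5 * g_mid * dtau  # rapidity at the step midpoint
        t += math.cosh(zeta_mid) * dtau
        x += math.sinh(zeta_mid) * dtau
        zeta += g_mid * dtau
    return t, x
```

With g = 1 and τ = 2, the step-by-step integration reproduces the closed form to better than one part in a million. In units of years and light-years, an acceleration of one Earth gravity corresponds to g of roughly 1.03 per year, so these numbers are close to a human-scale voyage.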

In fact, the physics is even more interesting than this. The accelerated astronaut not only finds that there are parts of spacetime that they cannot communicate with because of their acceleration, but also that the physics of quantum fields for them is qualitatively and quantitatively different from what the homebody sees. The Minkowski vacuum (the state with no particles), seen in the reference frame of the astronaut with constant acceleration, turns out to have plenty of particles in it, and they can be measured with a detector. Not only that, the spectrum is thermal, at the Rindler temperature. Including factors of c, the formula for this reads

$$T_{Rindler} = \frac{\hbar g}{2\pi c\, k_B} . \tag{5.13}$$

The greater the acceleration, the higher the temperature that the detector will register. This phenomenon of acceleration radiation is known as the Unruh effect. For those who are interested, the physics of particle detectors in GR is explained nicely in the advanced GR textbook by Birrell and Davies “Quantum Field Theory in Curved Space”.
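To get a feel for the size of the effect, the Rindler temperature formula can be evaluated in SI units. This is a minimal sketch; the constants are the standard SI values, and the choice of one Earth gravity as the sample acceleration is an illustration rather than something taken from the lecture.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
C = 299792458.0         # speed of light in vacuum, m/s
K_B = 1.380649e-23      # Boltzmann constant, J/K

def rindler_temperature(g):
    """Rindler temperature in kelvin for proper acceleration g in m/s^2."""
    return HBAR * g / (2.0 * math.pi * C * K_B)
```

For g = 9.80665 m/s², `rindler_temperature` gives about 4 × 10⁻²⁰ K, which shows why the effect is so hard to detect at everyday accelerations. The temperature scales linearly with g.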


5.2 The Equivalence Principle

Einstein became famous for several different accomplishments. One which is legendary among theoretical physicists is the concept of the Gedankenexperiment (German for thought experiment). It allows us to work out all sorts of imaginative ideas without having to actually spend any money. So imagine, if you will, that you are an astronaut on the space station. Imagine that you are blindfolded and kidnapped and then one of two things happens to you. Either you feel the acceleration due to gravity or you take a ride in a rocket ship capable of that same acceleration. How would you tell the difference?

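The gravitational force from a body of mass M on a test mass $m_g$ is

$$\vec F_{grav} = -\frac{G_N M m_g}{r^2}\, \hat r , \tag{5.14}$$

where $m_g$ is the gravitational mass and $G_N$ is the Newton constant. Setting this equal to $m_i\, \vec a_{grav}$ by Newton’s second law, where $m_i$ is the inertial mass, we have

$$\vec a_{grav} = -\left[ \frac{m_g}{m_i} \right] \frac{G_N M\, \hat r}{r^2} . \tag{5.15}$$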

If $m_i = m_g$, then this acceleration $\vec a_{grav}$ does not depend on the properties of the test mass feeling the gravitational force.

The universality of gravitation was first put forward by Newton, centuries before Einstein. Others hypothesized that the acceleration due to gravity should be universal, not depending on the composition of the falling object. This idea has since been tested exquisitely well. It implies that an object’s inertial mass (what makes you hard to move in the morning) is equal to its gravitational mass (what responds to gravity), and is known as the Weak Equivalence Principle.

When Einstein formulated his theory of General Relativity (GR) he decided to bake the equivalence principle into the very fabric of spacetime. In GR, there is no local experiment you can do to tell the difference between acceleration due to rockets and acceleration due to gravity. This is known as the Einstein Equivalence Principle.

The really cool thing about the equivalence principle? It implies that every reference frame, including accelerating ones, can be instantaneously approximated by a Lorentz frame. This might seem like mathematical nitpicking, but it is actually a key physics insight, as it implies that locally in spacetime, everything is just Special Relativity. What makes General Relativity interesting and nontrivial is the story of how those individual infinitesimal neighbourhoods are sewn together into the fabric of curved spacetime.

It is important to note that this equivalence between gravity and acceleration holds only in an infinitesimal patch about a point. If we have access to a finite sized patch of spacetime, we can distinguish gravity from acceleration by measuring tidal forces. We will develop that story later on when we get to discussing geodesic deviation.

Consider a photon in Earth’s gravitational field. If it gets aimed upwards, then after a time interval dt, what is the effect? Well, photons cannot change their speed, as they always go at c. What can change for a photon is its energy (or equivalently the magnitude of its momentum, because the photon mass shell relation is $E = |\vec p|$). It can also change its heading. When a photon moves upwards in a gravitational field, it gains gravitational potential energy, so it must lose kinetic energy (conservation of the total energy is valid near Earth, because there is a time translation symmetry). The photon should therefore suffer a redshift in going upwards. This phenomenon is known as gravitational redshift, and it implies that clocks run slower when they are deeper in a gravitational field. Black holes take this to an extreme, as we will see much later in the course.

Did you know that GPS devices rely on both Special and General Relativity to locate you accurately? They need to take account of the fact that the GPS transmitter satellites are (a) travelling at a measurable fraction of the speed of light, requiring Special Relativistic Doppler corrections, and (b) higher up in Earth’s gravitational field than we are, necessitating General Relativistic corrections. Without those corrections together, you would probably be kilometres off your intended position after a day’s canoeing. So GR does actually touch your life in a measurable way, if you ever use a GPS unit, say in your smartphone.
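The size of the two GPS corrections can be estimated with a short calculation. The sketch below is a rough leading-order estimate under stated assumptions that are not in the lecture: a circular orbit of radius about 26,560 km, a spherical non-rotating Earth of radius 6371 km, and the weak-field formulas −v²/(2c²) for velocity time dilation and ΔΦ/c² for the gravitational shift. The function names are illustrative.

```python
GM_EARTH = 3.986004418e14  # Earth's gravitational parameter, m^3/s^2
C = 299792458.0            # speed of light, m/s
R_EARTH = 6.371e6          # mean Earth radius, m (assumed spherical, non-rotating)
R_GPS = 2.656e7            # assumed GPS orbital radius, m
SECONDS_PER_DAY = 86400.0

def sr_offset_per_day():
    """Seconds per day the satellite clock loses from its orbital speed."""
    v_squared = GM_EARTH / R_GPS  # circular-orbit speed squared
    return -0.5 * v_squared / C**2 * SECONDS_PER_DAY

def gr_offset_per_day():
    """Seconds per day the satellite clock gains from sitting higher in the potential."""
    delta_phi = GM_EARTH * (1.0 / R_EARTH - 1.0 / R_GPS)
    return delta_phi / C**2 * SECONDS_PER_DAY

def ranging_error_per_day():
    """Distance in metres that light travels during the net daily clock offset."""
    return (sr_offset_per_day() + gr_offset_per_day()) * C
```

Under these assumptions the velocity effect is about −7 microseconds per day, the gravitational effect about +46 microseconds per day, and the net offset corresponds to a ranging error of roughly 10 km per day if left uncorrected.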

5.3 Spacetime as a curved Riemannian manifold

Newton conceptualized gravity via forces that act at a distance instantaneously. This instantaneous propagation of gravitational effects is in direct contradiction to the relativistic principle that the speed of light is the upper speed limit for everyone. In Einstein’s GR, the speed of propagation of gravitational disturbances is tied to be exactly equal to the speed of light in vacuum. The formalism of GR is designed to express all the effects of gravity in a relativistic way, like gravitational redshift, via geometrical properties of the fabric of spacetime. The mathematical name for the type of geometry used is (pseudo)Riemannian geometry.

A (p + q)-dimensional manifold with signature (p, q) is a spacetime that locally looks like a patch of $\mathbb{R}^{p,q}$. For example, for our D = 3 + 1 universe with three large spatial dimensions, this would be $\mathbb{R}^{3,1}$. The manifold is the collection (union) of these patches, known as coordinate charts, along with the transition functions that teach you how to sew the patches together. The manifold needs to be continuous, and in order for us to compute sensible physical quantities it should also be differentiable. The mathematical concept of the coordinate chart is equivalent to the usual physics idea of a coordinate system or reference frame.

As an example of how you might need more than one coordinate chart to cover a manifold, consider a circle $S^1$. Each coordinate chart must be an open set of $\mathbb{R}$ (emphasis on open). So the minimum number of coordinate charts required to cover the 1-sphere is two.

For the 2-sphere $S^2$, you need a projection to get $S^2$ onto $\mathbb{R}^2$, or a patch thereof like a map. The most commonly used projection is the Mercator projection, which preserves angles rather than area. It is possible to use a different projection that preserves area, such as the Peters projection. However, the price of maintaining areas on the map is that angles are not preserved: countries look funny shaped compared to their Mercator cousins. Because the sphere is curved and the plane is not, you cannot create a map that preserves both angles and areas. The reason why the Mercator projection has been so dominant is a technical one: because it preserves angles, it is optimal for navigation of marine vessels and aeroplanes. But it massively overstates the size of countries closer to the poles. In particular, Western Europe looks more important on Mercator maps than it should, while Africa and Brazil look much smaller. Colonialism also had a role in the dominance of the Mercator projection.

Examples of manifolds include Minkowski space, the sphere, the torus, and 2D Riemann surfaces with arbitrary genus. What about spaces that are not manifolds? Any intersection of lines with k-planes will do. A cone is an example of a non-differentiable manifold, because of what happens at its apex. Some manifolds have a boundary, for instance a line segment. Some manifolds have no boundary.

General Relativity treats the fabric of spacetime as a differentiable manifold. Note that it is also possible to handle discontinuities in the spacetime metric in some situations in GR, but only if the appropriate source of energy-momentum is available at the discontinuity to enforce consistency with Einstein’s equations. The formalism for handling this non-differentiable case is known as the Israel junction conditions, and its equations are derived by integrating Einstein’s equations across discontinuities in suitably covariant ways. This works a lot like deriving equations for shock waves in fluid mechanics.

Spacetime being a differentiable manifold is not enough structure to describe gravity as we see it in experiments. The geometry should be suitably constrained by some physical equations, which should, by the Correspondence Principle, reduce to Newtonian mechanics in the limit of small speeds and weak gravity. Our spacetime manifolds will satisfy the Einstein equations.


6 Thu.19.Sep

6.1 Basis vectors in curved spacetime

How do vectors and tensors work when spacetime is curved? We will have to be more careful than before, and the signature difference is that the matrices showing us how to transform between different coordinate systems are no longer constant matrices. Suppose that we have coordinates $x^\mu$ on our manifold and that we consider an arbitrary function f of these coordinates. Then the directional derivative along a curve parametrized by λ is

$$\frac{df}{d\lambda} = \frac{\partial f}{\partial x^\mu} \frac{dx^\mu}{d\lambda} , \tag{6.1}$$

so that we can write

$$\frac{d}{d\lambda} = \frac{dx^\mu}{d\lambda}\, \partial_\mu . \tag{6.2}$$

In other words, $e_\mu = \partial_\mu$ is a set of basis vectors.

This story goes deeper. Mathematically, the tangent space $T_p(M)$ at a point p of a manifold M is isomorphic to the space of directional derivative operators on curves through p. It is a vector space, and the Leibniz rule is obeyed. Vector fields can then be defined on M. An example of a vector field would be the wind direction at the surface of the Earth. Take a look at https://earth.nullschool.net/ for a very beautiful interactive visualization of winds on Earth.

What about a basis for covariant vectors living in the cotangent space $T^*_p(M)$? There is a very natural candidate: the differentials $e^\mu = dx^\mu$. Note that these $dx^\mu$ are not the same as the contravariant basis vectors $\partial_\mu = \partial/\partial x^\mu$; you can tell the difference partly by where the index is placed. The coordinate bases for contravariant and covariant vectors obey a natural inner product,

$$ (\partial_\nu)(dx^\mu) = \frac{\partial x^\mu}{\partial x^\nu} = \delta^\mu_\nu . \tag{6.3} $$

We do not have to stick to only using partial derivatives and differentials as bases. More generally, we can denote a basis for contravariant vectors living in the tangent space as

$$ e_\mu . \tag{6.4} $$

Any contravariant vector $v$ can be expanded in terms of this basis,

$$ v = v^\mu e_\mu . \tag{6.5} $$

Also, we denote a basis for covariant vectors living in the cotangent space as

$$ e^\nu . \tag{6.6} $$

Any covariant vector $\omega$ can be expanded in terms of this basis,

$$ \omega = \omega_\nu e^\nu . \tag{6.7} $$

Generally, a basis for contravariant vectors $e_\mu$ and a basis for covariant vectors $e^\nu$ must be reciprocals,

$$ e^\nu \cdot e_\mu = \delta^\nu_\mu . \tag{6.8} $$


What if we wanted to measure distances and angles on our spacetime manifold? In terms of the basis $e_\mu$ we can write the vector displacement $ds$ between a point at $x^\mu$ and another point at $x^\mu + dx^\mu$ in terms of our general basis vectors,

$$ ds = e_\mu\, dx^\mu . \tag{6.9} $$

Accordingly, the line element $ds^2$ is

$$ ds^2 = ds \cdot ds = e_\mu dx^\mu \cdot e_\nu dx^\nu . \tag{6.10} $$

From this equation we can identify the metric tensor, denoted by $g_{\mu\nu}$,

$$ g_{\mu\nu} = e_\mu \cdot e_\nu . \tag{6.11} $$

This is a generalization of the flat Minkowski metric that we encountered in our quick review of Special Relativity, and it tells us how to measure distances and angles. The above line element in curved spacetime obeys a very important principle: it is invariant under arbitrary coordinate changes which are invertible and $C^\infty$, known as diffeomorphisms.

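The inverse metric is denoted as $g^{\mu\nu}$ and it is built in an exactly similar fashion,

$$ g^{\mu\nu} = e^\mu \cdot e^\nu . \tag{6.12} $$

The downstairs metric $g_{\mu\nu}$ and its inverse the upstairs metric $g^{\nu\lambda}$ obey

$$ g_{\mu\nu}\, g^{\nu\lambda} = \delta_\mu^\lambda . \tag{6.13} $$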

The spacetime metric and its inverse are used all the time in GR, for raising and lowering indices on tensors.

Sometimes it is physically useful to use a special basis called the orthonormal basis. In this case, we denote the basis vectors with hats, and they obey

$$ e_{\hat\mu} \cdot e_{\hat\nu} = \eta_{\hat\mu\hat\nu} , \qquad e^{\hat\mu} \cdot e^{\hat\nu} = \eta^{\hat\mu\hat\nu} . \tag{6.14} $$

Flat, boring Minkowski spacetime $\mathbb{R}^{1,3}$ written in a spherical polar coordinate basis is not a curved spacetime, but it has a spacetime metric and tensor transformation laws that depend on spacetime position. As an exercise to test your understanding, check explicitly that the line element is

$$ ds^2 = dt^2 - dr^2 - r^2 d\theta^2 - r^2 \sin^2\theta\, d\phi^2 , \tag{6.15} $$

by starting from the expressions for the Cartesian spatial coordinates $x^1, x^2, x^3$ in terms of the spherical polar spatial coordinates $r, \theta, \phi$,

$$ x^1 = r\cos\theta , \tag{6.16} $$

$$ x^2 = r\sin\theta\cos\phi , \tag{6.17} $$

$$ x^3 = r\sin\theta\sin\phi . \tag{6.18} $$
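A quick numerical sanity check of this exercise can be sketched in Python. This is only an illustration, not part of the original notes: it assumes the coordinate map of eqs. (6.16)-(6.18), estimates the Jacobian by central finite differences, and pulls back the flat spatial metric $\delta_{ij}$. The helper names (`cartesian`, `jacobian`, `induced_spatial_metric`), the step size `h`, and the sample point are hypothetical choices.

```python
import math

def cartesian(r, theta, phi):
    """Cartesian coordinates (x1, x2, x3) as in eqs. (6.16)-(6.18)."""
    return (r * math.cos(theta),
            r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi))

def jacobian(q, h=1e-6):
    """J[a][i] = d x^i / d q^a, estimated by central differences."""
    rows = []
    for a in range(3):
        q_plus, q_minus = list(q), list(q)
        q_plus[a] += h
        q_minus[a] -= h
        x_plus, x_minus = cartesian(*q_plus), cartesian(*q_minus)
        rows.append([(p - m) / (2 * h) for p, m in zip(x_plus, x_minus)])
    return rows

def induced_spatial_metric(q):
    """h_ab = sum_i (dx^i/dq^a)(dx^i/dq^b): the flat spatial metric in (r, theta, phi)."""
    J = jacobian(q)
    return [[sum(J[a][i] * J[b][i] for i in range(3)) for b in range(3)]
            for a in range(3)]

# Sample point (an arbitrary choice away from the coordinate axis).
r, theta, phi = 2.0, 0.7, 1.1
h_ab = induced_spatial_metric((r, theta, phi))
expected_diagonal = [1.0, r ** 2, (r * math.sin(theta)) ** 2]
```

Up to finite-difference error, `h_ab` should come out diagonal with entries $1$, $r^2$, $r^2\sin^2\theta$; in the signature used here the spatial part enters $ds^2$ with an overall minus sign, which reproduces eq. (6.15).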


6.2 Tensors in curved spacetime

Tensors work in curved spacetime a lot like they do in flat spacetime. The most important physical difference is that under a change of reference frame represented by

$$ \Lambda^{\mu'}{}_{\nu} \equiv \frac{\partial x^{\mu'}}{\partial x^\nu} \tag{6.19} $$

the new coordinates are related to the old ones by coordinate-dependent factors, rather than simple constants like $\cos\theta$ or $\sinh\zeta$. Our central physics strategy will be to remain focused on the transformation properties of our tensors of interest. That is the essence of what a tensor does: it transforms in very specific, well-defined ways when the coordinate system changes.

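Earlier, we introduced bases $e_\mu$ for rank (1,0) vectors and $e^\nu$ for rank (0,1) vectors. Accordingly, a general rank (1,0) vector $v$ in curved spacetime can be written in components as

$$ v = v^\mu e_\mu , \tag{6.20} $$

and a covariant vector $\omega$ in curved spacetime can be written in components as

$$ \omega = \omega_\mu e^\mu . \tag{6.21} $$

Under coordinate transformations, their components transform as

$$ v^{\mu'} = \Lambda^{\mu'}{}_{\nu}\, v^\nu , \tag{6.22} $$

and

$$ \omega_{\mu'} = \Lambda_{\mu'}{}^{\nu}\, \omega_\nu . \tag{6.23} $$

The transformation matrices

$$ \Lambda^{\mu'}{}_{\nu} = \frac{\partial x^{\mu'}}{\partial x^\nu} \quad\text{and}\quad \Lambda_{\mu'}{}^{\nu} = \frac{\partial x^\nu}{\partial x^{\mu'}} \tag{6.24} $$

now generically depend on spacetime position, and they satisfy

$$ \Lambda_{\mu'}{}^{\sigma}\, \Lambda^{\mu'}{}_{\nu} = \delta^\sigma_\nu , \qquad \Lambda_{\mu'}{}^{\sigma}\, \Lambda^{\nu'}{}_{\sigma} = \delta^{\nu'}_{\mu'} . \tag{6.25} $$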

Contravariant vectors on a pseudo-Riemannian manifold representing curved spacetime live in the tangent space, which is a vector space. Covariant vectors live in the cotangent space. The collection of all (co)tangent spaces over $M$ is known mathematically as the (co)tangent bundle. One of the key properties of a covariant vector $\omega$ is that we can naturally take its inner product with a contravariant vector $v$ without using the metric, and it yields a scalar:

$$ \omega(v) = \omega_\nu e^\nu \cdot v^\mu e_\mu = \omega_\mu v^\mu . \tag{6.26} $$

This enables us to recognize that another way to think about a covariant vector is that it is a machine that takes a contravariant vector and produces a scalar. Or in mathematical words, it is a linear map from the tangent space into the real numbers $\mathbb{R}$. The pairing is linear in $\omega$ as well as in $v$, obeying

(aω1 + bω2)(v) = aω1(v) + bω2(v) , (6.27)

ω(av1 + bv2) = aω(v1) + bω(v2) . (6.28)


Vectors obey entirely analogous rules.

A rank (m,n) tensor in curved spacetime is defined by direct analogy as a multilinear map from a collection of m covariant vectors and n contravariant vectors to $\mathbb{R}$. Its components in a coordinate basis can be extracted from $T$ by slotting in the right number of covariant and contravariant basis vectors,

$$ T^{\mu_1\ldots\mu_m}{}_{\nu_1\ldots\nu_n} = T(e^{\mu_1}, \ldots, e^{\mu_m}, e_{\nu_1}, \ldots, e_{\nu_n}) . \tag{6.29} $$

Alternatively, it can be written in terms of basis tensors as

$$ T = T^{\mu_1\ldots\mu_m}{}_{\nu_1\ldots\nu_n}\; e_{\mu_1} \otimes \ldots \otimes e_{\mu_m} \otimes e^{\nu_1} \otimes \ldots \otimes e^{\nu_n} , \tag{6.30} $$

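where $\otimes$ denotes the outer product (not the inner product!). Note that in this picture expanding a tensor in components in one basis versus a second basis results in different components, as we would expect; the tensor stays the same.

The coordinate transformation law for the components of a rank (m,n) tensor in curved spacetime is

$$ T^{\mu_1'\ldots\mu_m'}{}_{\nu_1'\ldots\nu_n'} = \frac{\partial x^{\mu_1'}}{\partial x^{\lambda_1}} \cdots \frac{\partial x^{\mu_m'}}{\partial x^{\lambda_m}}\, \frac{\partial x^{\sigma_1}}{\partial x^{\nu_1'}} \cdots \frac{\partial x^{\sigma_n}}{\partial x^{\nu_n'}}\; T^{\lambda_1\ldots\lambda_m}{}_{\sigma_1\ldots\sigma_n} . \tag{6.31} $$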

Notice that now the Jacobians involved typically depend on spacetime position.

6.3 Rules for tensor index gymnastics

There are very specific rules for manipulating tensors. We already met one of them: the Einstein summation convention. In curved spacetime it works exactly the same way as in flat spacetime: repeated indices are summed over. But let us also make explicit some other specific tensor manipulation rules.

First and foremost among them is the fact that when you write a tensor equation, indices on the LHS and RHS must be exactly matched. For example, $p^\mu = m u^\mu$ is a sensible tensor equation (it has one upstairs index on both sides) while the erroneous $p_\mu = m u^\mu$ is not (on the LHS the $\mu$ index is downstairs while on the RHS it is upstairs).

Second, vertical moves of tensor indices – up or down – can only be made by lowering or raising them with the rank (0,2) metric tensor or its rank (2,0) inverse. Said less pedantically, we raise or lower indices using the metric. For example,

$$ T_\mu{}^{\nu}{}_{\lambda\sigma} = g_{\mu\rho}\, T^{\rho\nu}{}_{\lambda\sigma} , \tag{6.32} $$

and similarly for other raised/lowered components: you use as many factors of the metric/inverse metric as needed (with appropriate contractions) to lower/raise all the requisite indices.

Third, we must always preserve the horizontal ordering of the indices when calculating, for both upstairs and downstairs indices. For example, for a general rank (2,2) tensor,

$$ T^{\mu\nu}{}_{\lambda\sigma} \neq T^{\nu\mu}{}_{\lambda\sigma} . \tag{6.33} $$


(The RHS has the $\mu$ and $\nu$ indices switched compared to the LHS.) Other horizontal switches of indices are equally verboten, unless you know that the tensor has appropriate symmetry properties. The only standard exception to the rule that horizontal index ordering matters is the Kronecker $\delta^\alpha_\beta$ tensor, which is symmetrical by definition.

Let us now make a few remarks about symmetries among tensors. Tensors can have symmetries on their indices, which reduce the number of independent components, but this is not generic. For example, under interchange of its indices a two-index tensor might be symmetric

$$ S_{\mu\nu} = +S_{\nu\mu} , \tag{6.34} $$

or antisymmetric

$$ A_{\mu\nu} = -A_{\nu\mu} . \tag{6.35} $$

For rank two tensors only, an arbitrary tensor $T$ can actually be written as the sum of a symmetric tensor $S$ and an antisymmetric tensor $A$. In components,

$$ T_{\mu\nu} = S_{\mu\nu} + A_{\mu\nu} , \tag{6.36} $$

where

$$ S_{\mu\nu} = \frac{1}{2!}\,(T_{\mu\nu} + T_{\nu\mu}) , \qquad A_{\mu\nu} = \frac{1}{2!}\,(T_{\mu\nu} - T_{\nu\mu}) . \tag{6.37} $$

This works because the total number of independent components of a 2-index tensor is $D \times D = D^2$, while a symmetric 2-index tensor has $D(D+1)/2$ components and an antisymmetric 2-index tensor has $D(D-1)/2$, so that $D(D+1)/2 + D(D-1)/2 = D^2$. For larger rank, such a split cannot be done, because totally symmetric and totally antisymmetric tensors do not have enough independent components between them to cover the total number.
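To make the counting concrete, here is a short worked check for $D = 4$. It is an illustration added for convenience; the rank-3 counts use the standard binomial formulas for totally symmetric and totally antisymmetric tensors, which are not quoted in the text above.

```latex
% Rank 2, D = 4: the split is exhaustive.
\frac{D(D+1)}{2} + \frac{D(D-1)}{2} = 10 + 6 = 16 = D^2 .

% Rank 3, D = 4: totally symmetric plus totally antisymmetric falls short.
\binom{D+2}{3} + \binom{D}{3} = 20 + 4 = 24 < 64 = D^3 .
```

The remaining $40$ components at rank 3 belong to tensors of mixed symmetry, which is why the simple two-piece split only works for two indices.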

Any tensor can be symmetrized on any number $k$ of upper or lower indices. For symmetrization, we have

$$ T_{(\mu_1\ldots\mu_k)} = \frac{1}{k!}\,\bigl(T_{\mu_1\mu_2\ldots\mu_k} + \text{sum over permutations of } (\mu_1 \ldots \mu_k)\bigr) , \tag{6.38} $$

while for antisymmetrization we have

$$ T_{[\mu_1\ldots\mu_k]} = \frac{1}{k!}\,\bigl(T_{\mu_1\mu_2\ldots\mu_k} + \text{alternating sum over permutations of } (\mu_1 \ldots \mu_k)\bigr) , \tag{6.39} $$

where the alternating sum counts even permutations with a + sign and odd ones with a − sign. Note how the round parentheses denote symmetrization and the square brackets antisymmetrization.

Suppose you know that a tensor is symmetric with all indices down. How do you work out the symmetry of its counterpart with some or all of its indices up? By using known symmetry properties of the downstairs components and using the metric tensor to raise indices. Remember that the metric tensor itself is symmetric under interchange of its two indices, and so is its inverse. Note also that the contraction of the metric tensor with itself is

$$ g^{\mu\nu} g_{\mu\nu} = g^\mu{}_\mu = \delta^\mu_\mu = 1 + \ldots + 1 = D . \tag{6.40} $$


7 Mon.23.Sep

7.1 Building a covariant derivative

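Because coordinate change matrices generically depend on spacetime position, simple partial derivatives of tensors are typically not themselves tensors. For example, the partial derivative of a covariant vector $W$, $\partial_\mu W_\nu$, changes under coordinate changes as

$$ \frac{\partial}{\partial x^\mu} W_\nu \longrightarrow \frac{\partial}{\partial x^{\mu'}} W_{\nu'} = \frac{\partial x^\mu}{\partial x^{\mu'}} \frac{\partial}{\partial x^\mu} \left( \frac{\partial x^\nu}{\partial x^{\nu'}} W_\nu \right) = \frac{\partial x^\mu}{\partial x^{\mu'}} \frac{\partial x^\nu}{\partial x^{\nu'}} \frac{\partial}{\partial x^\mu} W_\nu + W_\nu \frac{\partial x^\mu}{\partial x^{\mu'}} \frac{\partial}{\partial x^\mu} \left( \frac{\partial x^\nu}{\partial x^{\nu'}} \right) . \tag{7.1} $$

Although the first term looks good for tensoriality, we see that the second term ruins the fun for generic changes of coordinates.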

At any particular point $p$, we can choose a reference frame (denoted here by hats) in which the first derivatives of the metric vanish, $\partial_{\hat\sigma} g_{\hat\mu\hat\nu}|_p = 0$. The way to see this mathematically is to use (a) Taylor expansions around a particular point and (b) the tensorial transformation property of the two-index metric tensor $g_{\mu\nu}$. However, this cannot be made to work beyond first order in derivatives, because there are not enough components. Physically, this means that we will need extra structure on our spacetime manifold in order to be able to define covariant derivatives that transform like tensors.

The structure we need is known as an affine connection. It will enable us to make covariant versions of partial derivatives $\partial_\mu$, denoted $\nabla_\mu$, designed to transform like tensors. For taking covariant derivatives of tensorial indices relevant to bosonic fields, we will use the Levi-Civita connection or Christoffel symbols $\Gamma^\mu{}_{\nu\lambda}$. For taking covariant derivatives of spinorial indices relevant to fermionic fields, a researcher would use a different beast known as the spin connection $\omega_\mu{}^{ab}$ (see GR2 for details). We will work on manifolds without torsion, and for this case knowing the metric tensor is sufficient to determine both connections.

7.2 How basis vectors change: the role of the affine connection

In curved spacetime, the partial derivative of a basis vector generically does not appear to lie in the tangent space. But as HEL explain in §3, this can be easily fixed up by defining the derivative in the manifold of the coordinate basis vectors by projecting into the tangent space at the point in question. Then we can expand out the expression for the partial derivative $\partial_\lambda$ of a contravariant basis vector $e_\mu$ in terms of the basis $e_\nu$, with coefficients $\Gamma^\nu{}_{\mu\lambda}$:

$$ \partial_\lambda e_\mu = \Gamma^\nu{}_{\mu\lambda}\, e_\nu . \tag{7.2} $$

We can figure out the analogous equation for the covariant basis vectors $e^\mu$ by taking the partial derivative of both sides of the equation $e^\mu \cdot e_\nu = \delta^\mu_\nu$, yielding

$$ \partial_\lambda e^\mu = -\Gamma^\mu{}_{\nu\lambda}\, e^\nu . \tag{7.3} $$

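We can find the expressions for the coefficients $\Gamma^\nu{}_{\mu\lambda}$ by taking the partial derivative of our metric tensor,

$$ \partial_\lambda g_{\mu\nu} = (\partial_\lambda e_\mu) \cdot e_\nu + e_\mu \cdot \partial_\lambda e_\nu = \Gamma^\sigma{}_{\mu\lambda}\, e_\sigma \cdot e_\nu + e_\mu \cdot \Gamma^\sigma{}_{\nu\lambda}\, e_\sigma . \tag{7.4} $$

Using this, we can now form the combination

$$ \partial_\lambda g_{\mu\nu} + \partial_\nu g_{\lambda\mu} - \partial_\mu g_{\nu\lambda} = 2\,\Gamma^\sigma{}_{\lambda\nu}\, g_{\mu\sigma} , \tag{7.5} $$

where in simplifying we used the fact that the Christoffels are symmetric under interchange of their lower indices. This fact stems from the assumption that our spacetime has zero torsion⁵. Rearranging the above gives the full expression for the coefficients $\Gamma^\mu{}_{\nu\lambda}$ in terms of first derivatives of the metric tensor,

$$ \Gamma^\mu{}_{\nu\lambda} = \frac{1}{2}\, g^{\mu\sigma} \left( \partial_\nu g_{\sigma\lambda} + \partial_\lambda g_{\nu\sigma} - \partial_\sigma g_{\nu\lambda} \right) . \tag{7.6} $$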

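How about an example? Consider the 2D plane $\mathbb{R}^2$ with Cartesian coordinates $x, y$. The basis vectors $e_x$ and $e_y$ are maximally boring: they do not change with position. However, if we transform to plane polar coordinates $\rho, \phi$ given by

$$ x = \rho\cos\phi , \quad \rho = \sqrt{x^2 + y^2} , \qquad y = \rho\sin\phi , \quad \phi = \arctan(y/x) , \tag{7.7} $$

then in plane polar coordinates the basis vectors $e_\rho$ and $e_\phi$ definitely do change with position, which is the generic situation in GR. To see this, recall that

$$ ds = e_x\, dx + e_y\, dy = e_\rho\, d\rho + e_\phi\, d\phi , \tag{7.8} $$

and use the above coordinate transformations to identify

$$ e_\rho = +\cos\phi\; e_x + \sin\phi\; e_y , \qquad e_\phi = \rho\,(-\sin\phi\; e_x + \cos\phi\; e_y) . \tag{7.9} $$

It follows quickly from the above that

$$ ds^2 = d\rho^2 + \rho^2\, d\phi^2 . \tag{7.10} $$

We can inspect how the basis vectors change with position, to obtain

$$ \begin{aligned}
\frac{\partial e_\rho}{\partial \rho} &= 0 , \\
\frac{\partial e_\rho}{\partial \phi} &= -\sin\phi\; e_x + \cos\phi\; e_y = \frac{1}{\rho}\, e_\phi , \\
\frac{\partial e_\phi}{\partial \rho} &= -\sin\phi\; e_x + \cos\phi\; e_y = \frac{1}{\rho}\, e_\phi , \\
\frac{\partial e_\phi}{\partial \phi} &= -\rho\cos\phi\; e_x - \rho\sin\phi\; e_y = -\rho\, e_\rho .
\end{aligned} \tag{7.11} $$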

⁵ Torsion is a rank (1,2) tensor, and it falls outside the scope of this course.


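From eq.(7.2), we have that $\partial_\lambda e_\mu = \Gamma^\nu{}_{\mu\lambda}\, e_\nu$, so we can read off the Christoffels,

$$ \begin{aligned}
\Gamma^\rho{}_{\rho\rho} &= 0 , & \Gamma^\phi{}_{\rho\rho} &= 0 , \\
\Gamma^\rho{}_{\rho\phi} &= 0 , & \Gamma^\phi{}_{\rho\phi} &= +\frac{1}{\rho} , \\
\Gamma^\rho{}_{\phi\rho} &= 0 , & \Gamma^\phi{}_{\phi\rho} &= +\frac{1}{\rho} , \\
\Gamma^\rho{}_{\phi\phi} &= -\rho , & \Gamma^\phi{}_{\phi\phi} &= 0 .
\end{aligned} \tag{7.12} $$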

Alternatively, we could have obtained these expressions for the Christoffels by taking derivatives of the metric tensor as in eq.(7.6). But I think also showing the explicit effect on the basis vectors as in eq.(7.11) helps us understand the physics better. I recommend drawing yourself some pictures to illustrate explicitly how the plane polar coordinate basis vectors change with position according to the above equations.
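The metric route of eq.(7.6) can also be checked numerically. The sketch below is an illustration rather than part of the original notes: it assumes the plane polar metric of eq.(7.10), a diagonal metric (so the inverse is trivial), and central finite differences for the derivatives. All function names and the sample point are hypothetical.

```python
def polar_metric(x):
    """Metric components g_{mu nu} for ds^2 = d rho^2 + rho^2 d phi^2."""
    rho, _phi = x
    return [[1.0, 0.0], [0.0, rho * rho]]

def inverse_diagonal(g):
    """Inverse of a diagonal metric (off-diagonal entries assumed zero)."""
    n = len(g)
    return [[1.0 / g[i][i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def metric_derivatives(metric, x, h=1e-5):
    """dg[s][m][k] = d g_{mk} / d x^s, estimated by central differences."""
    n = len(x)
    dg = []
    for s in range(n):
        x_plus, x_minus = list(x), list(x)
        x_plus[s] += h
        x_minus[s] -= h
        g_plus, g_minus = metric(x_plus), metric(x_minus)
        dg.append([[(g_plus[m][k] - g_minus[m][k]) / (2 * h) for k in range(n)]
                   for m in range(n)])
    return dg

def christoffel(metric, x):
    """gamma[mu][nu][lam] = Gamma^mu_{nu lam}, following eq. (7.6)."""
    n = len(x)
    g_inv = inverse_diagonal(metric(x))
    dg = metric_derivatives(metric, x)
    return [[[0.5 * sum(g_inv[mu][s] * (dg[nu][s][lam] + dg[lam][nu][s] - dg[s][nu][lam])
                        for s in range(n))
              for lam in range(n)]
             for nu in range(n)]
            for mu in range(n)]

RHO, PHI = 0, 1
gamma = christoffel(polar_metric, [2.0, 0.3])  # evaluate at rho = 2, phi = 0.3
```

At $\rho = 2$ the nonzero entries `gamma[RHO][PHI][PHI]`, `gamma[PHI][RHO][PHI]` and `gamma[PHI][PHI][RHO]` should approximate $-\rho$, $1/\rho$ and $1/\rho$ respectively, in agreement with eq.(7.12).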

Previously, we noticed that taking the partial derivative of a tensor does not give another tensor, generically. The problem was that the coordinate transformation generally depends on spacetime position. Let us delineate the properties that a covariant derivative $\nabla$ should have. It should be linear,

∇(T + S) = ∇T +∇S , (7.13)

and it should obey the Leibniz rule

∇(T ⊗ S) = (∇T )⊗ S + T ⊗ (∇S) . (7.14)

It should also commute with contractions, which is tantamount to assuming that

$$ \nabla_\sigma g_{\mu\nu} = 0 , \tag{7.15} $$

a very reasonable assumption. The covariant derivative should reduce to the partial derivative when operating upon scalars, because those tensors have no legs. The combination of the first two properties above implies that $\nabla$ can be written as the sum of the partial derivative $\partial$ and a linear transformation, which you can think of as a ‘correction’ to keep the derivative tensorial. The coefficients of this correction term are known as the connection coefficients or the Christoffel symbols $\Gamma^\mu{}_{\alpha\beta}$ (no, this is not pronounced “Christawful”!).

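To see how this works, let us consider the derivative of a contravariant vector $v$,

$$ \begin{aligned}
\partial_\nu v &= (\partial_\nu v^\mu)\, e_\mu + v^\mu (\partial_\nu e_\mu) \\
&= (\partial_\nu v^\mu)\, e_\mu + v^\mu (\Gamma^\lambda{}_{\mu\nu}\, e_\lambda) \\
&= (\partial_\nu v^\mu)\, e_\mu + v^\lambda (\Gamma^\mu{}_{\lambda\nu}\, e_\mu) \\
&= (\partial_\nu v^\mu + v^\lambda\, \Gamma^\mu{}_{\lambda\nu})\, e_\mu ,
\end{aligned} \tag{7.16} $$

where in the third line we relabelled dummy indices. The part in brackets is known as the covariant derivative,

$$ \nabla_\nu v^\mu \equiv \partial_\nu v^\mu + \Gamma^\mu{}_{\lambda\nu}\, v^\lambda . \tag{7.17} $$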


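By exactly similar logic, we can find the covariant derivative of a covariant vector,

$$ \nabla_\nu \omega_\mu = \partial_\nu \omega_\mu - \Gamma^\lambda{}_{\mu\nu}\, \omega_\lambda . \tag{7.18} $$

If we want to take the covariant derivative of a rank (m,n) tensor, then we just act on each of its legs in turn with the connection,

$$ \begin{aligned}
\nabla_\sigma T^{\mu_1\ldots\mu_m}{}_{\nu_1\ldots\nu_n} = {} & \partial_\sigma T^{\mu_1\ldots\mu_m}{}_{\nu_1\ldots\nu_n} \\
& + \Gamma^{\mu_1}{}_{\sigma\lambda}\, T^{\lambda\mu_2\ldots\mu_m}{}_{\nu_1\ldots\nu_n} + \Gamma^{\mu_2}{}_{\sigma\lambda}\, T^{\mu_1\lambda\mu_3\ldots\mu_m}{}_{\nu_1\ldots\nu_n} + \ldots \\
& - \Gamma^{\lambda}{}_{\sigma\nu_1}\, T^{\mu_1\ldots\mu_m}{}_{\lambda\nu_2\ldots\nu_n} - \Gamma^{\lambda}{}_{\sigma\nu_2}\, T^{\mu_1\ldots\mu_m}{}_{\nu_1\lambda\nu_3\ldots\nu_n} - \ldots
\end{aligned} \tag{7.19} $$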

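How about a quick example? We can use the Christoffels to teach us how to take the covariant Laplacian in 2D plane polar coordinates. If we have a scalar field $\Psi(\rho, \phi)$, then

$$ \begin{aligned}
\nabla_\mu \nabla^\mu \Psi &= \nabla_\mu \partial^\mu \Psi \\
&= \partial_\mu \partial^\mu \Psi + \Gamma^\mu{}_{\mu\nu}\, \partial^\nu \Psi \\
&= (\partial_\rho \partial^\rho + \partial_\phi \partial^\phi)\Psi + \Gamma^\rho{}_{\rho\rho}\, \partial^\rho \Psi + \Gamma^\rho{}_{\rho\phi}\, \partial^\phi \Psi + \Gamma^\phi{}_{\phi\rho}\, \partial^\rho \Psi + \Gamma^\phi{}_{\phi\phi}\, \partial^\phi \Psi \\
&= \frac{\partial^2 \Psi}{\partial \rho^2} + \frac{1}{\rho} \frac{\partial \Psi}{\partial \rho} + \frac{1}{\rho^2} \frac{\partial^2 \Psi}{\partial \phi^2} .
\end{aligned} \tag{7.20} $$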

This should look familiar from vector calculus class: we just derived it from first principles.
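As a quick check of eq.(7.20), one can apply it to a function known to be harmonic in Cartesian coordinates. The choice $\Psi = x^2 - y^2 = \rho^2\cos 2\phi$ below is an illustrative assumption, not an example from the lecture.

```latex
\Psi = \rho^2 \cos 2\phi :
\qquad
\frac{\partial^2 \Psi}{\partial \rho^2} = 2\cos 2\phi , \quad
\frac{1}{\rho}\frac{\partial \Psi}{\partial \rho} = 2\cos 2\phi , \quad
\frac{1}{\rho^2}\frac{\partial^2 \Psi}{\partial \phi^2} = -4\cos 2\phi ,
\qquad\Longrightarrow\qquad
\nabla_\mu \nabla^\mu \Psi = 0 .
```

This agrees with the Cartesian computation $(\partial_x^2 + \partial_y^2)(x^2 - y^2) = 2 - 2 = 0$, as it must for a scalar quantity.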

One of the most important things to remember is that the connection is not a tensor. It has components labelled with Greek indices, but that does not make it a tensor in and of itself. Indeed, the connection is designed specifically to correct the non-tensorial property of the partial derivative in order to create a new tensor from an old one. Its transformation law under a coordinate transformation is

$$ \Gamma^{\nu'}{}_{\mu'\lambda'} = \frac{\partial x^\mu}{\partial x^{\mu'}} \frac{\partial x^\lambda}{\partial x^{\lambda'}} \frac{\partial x^{\nu'}}{\partial x^\nu}\, \Gamma^\nu{}_{\mu\lambda} - \frac{\partial x^\mu}{\partial x^{\mu'}} \frac{\partial x^\lambda}{\partial x^{\lambda'}} \frac{\partial^2 x^{\nu'}}{\partial x^\mu \partial x^\lambda} . \tag{7.21} $$

From this, you can see that the difference between two connections is a tensor, because the second term (which is independent of the $\Gamma$s) drops out of their transformation law.

Our connection is metric compatible, meaning that the covariant derivative constructed from it obeys

$$ \nabla_\sigma g_{\mu\nu} = 0 . \tag{7.22} $$

There are two other useful equations that follow from this one,

$$ \nabla_\sigma g^{\mu\nu} = 0 , \tag{7.23} $$

$$ \nabla_\lambda \varepsilon_{\mu_0\mu_1\ldots\mu_d} = 0 , \tag{7.24} $$

where the completely antisymmetric tensor density $\varepsilon_{\mu_0\mu_1\ldots\mu_d}$ is used in integrating covariantly over spacetime. We will not have occasion to use it here, but it will play an important role in deriving Einstein’s equations from an action principle in the GR2 course. It follows that the metric-compatible covariant derivative commutes with raising and lowering of indices.


This is very fortunate – as you may be able to imagine, if there were torsion, you would have to be scrupulously careful about your index placements.

In discussing covariant derivatives of tensors, it is worth noting here that some people use a different convention than ours. They abbreviate by defining commas after indices to represent partial derivatives, while semicolons represent covariant derivatives. We will stick with keeping $\partial$ and $\nabla$ explicit, because in pages full of long GR equations it is all too easy to lose track of punctuation marks.

7.3 The covariant derivative and parallel transport

Introducing a covariant derivative (as compared to a plain derivative) was a really great idea for doing physics. It allows us to write tensor equations wherever we go. All we need to do is to be sure to write $\nabla$s rather than $\partial$s. But one question worth asking is this: what rate of change does $\nabla$ actually measure?

A way to answer that question and get a better handle on $\nabla$ is to ask when $\nabla$ of some tensor is zero. For this we actually have to specify the path along which we hope to compare tensors – because comparing tensors at two different points is, a priori, meaningless in GR. After all, the spacetime metric varies from point to point.

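Consider a parametrized curve $x^\mu(\lambda)$, and a vector field $v(\lambda) = v^\mu(\lambda)\, e_\mu(\lambda)$. The derivative of the vector field $v$ with respect to the curve parameter $\lambda$ is

$$ \begin{aligned}
\frac{dv}{d\lambda} &= \frac{dv^\mu}{d\lambda}\, e_\mu + v^\mu\, \frac{\partial e_\mu}{\partial x^\sigma} \frac{dx^\sigma}{d\lambda} \\
&= \left( \frac{dv^\mu}{d\lambda} + \Gamma^\mu{}_{\nu\sigma}\, v^\nu \frac{dx^\sigma}{d\lambda} \right) e_\mu \\
&\equiv \frac{Dv^\mu}{D\lambda}\, e_\mu .
\end{aligned} \tag{7.25} $$

The quantity

$$ \frac{D}{D\lambda} \equiv \frac{dx^\sigma}{d\lambda}\, \nabla_\sigma \tag{7.26} $$

is known as the directional covariant derivative. This animal is only defined along the path $x^\mu(\lambda)$, and when acting on a tensor it produces another tensor.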

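We say that a tensor is parallel transported along the path if

$$ \left( \frac{D}{D\lambda} T \right)^{\mu_1\ldots\mu_k}{}_{\nu_1\ldots\nu_\ell} = \frac{dx^\sigma}{d\lambda}\, \nabla_\sigma T^{\mu_1\ldots\mu_k}{}_{\nu_1\ldots\nu_\ell} = 0 . \tag{7.27} $$

This is known as the equation of parallel transport, and it is a proper tensor equation. Now, since we have a metric compatible connection, $\nabla_\sigma g_{\mu\nu} = 0$, parallel transport preserves the inner product of two tensors. For example, for two vectors $V^\mu$ and $W^\mu$,

$$ \frac{D}{D\lambda} \left( g_{\mu\nu} V^\mu W^\nu \right) = \left( \frac{D}{D\lambda} g_{\mu\nu} \right) V^\mu W^\nu + g_{\mu\nu} \left( \left( \frac{D}{D\lambda} V^\mu \right) W^\nu + V^\mu \left( \frac{D}{D\lambda} W^\nu \right) \right) \tag{7.28} $$

$$ = 0 + 0 + 0 = 0 , \tag{7.29} $$

if the vectors are both parallel transported. You can visualize what parallel transport does by imagining that it keeps the same angle between the vector and the directional derivative along the path $x^\mu(\lambda)$.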


To see what parallel transporting can imply, consider the two-sphere. Imagine that we start at the North Pole with a vector at an angle. We keep the angle of our vector constant as we move along a line of longitude, (say) the Greenwich meridian, down to the Equator. Then imagine that we turn East and continue parallel transporting our vector some way around the equator. Then we turn North and parallel transport our vector up a second line of longitude, back to the North Pole. If you have visualized this correctly in your mind, you will see that our vector, regardless of the direction it was initially pointing, has undergone a finite rotation. This is because the sphere is (positively) curved.
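The thought experiment above can be sketched numerically. The snippet below is an illustrative assumption-laden sketch, not the method used in the lectures: it embeds the unit sphere in $\mathbb{R}^3$ and approximates Levi-Civita parallel transport by repeatedly projecting the vector onto the tangent plane at the next point of the path and renormalizing. The function names, the step count, and the starting vector are hypothetical choices.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(c / n for c in v)

def great_circle(p, q, steps=200):
    """Points on the great-circle arc from unit vector p to unit vector q."""
    omega = math.acos(max(-1.0, min(1.0, dot(p, q))))  # arc length on the unit sphere
    pts = []
    for k in range(steps + 1):
        s = k / steps
        w_p = math.sin((1.0 - s) * omega) / math.sin(omega)
        w_q = math.sin(s * omega) / math.sin(omega)
        pts.append(tuple(w_p * a + w_q * b for a, b in zip(p, q)))
    return pts

def transport(v, path):
    """Carry the unit tangent vector v along path by repeated tangent-plane projection."""
    for p in path[1:]:
        overlap = dot(v, p)
        v = normalize(tuple(a - overlap * b for a, b in zip(v, p)))
    return v

def holonomy_angle(delta):
    """Rotation of a vector carried pole -> equator -> east by delta -> back to the pole."""
    pole = (0.0, 0.0, 1.0)
    a = (1.0, 0.0, 0.0)
    b = (math.cos(delta), math.sin(delta), 0.0)
    loop = great_circle(pole, a) + great_circle(a, b) + great_circle(b, pole)
    v_start = (1.0, 0.0, 0.0)  # tangent at the pole, pointing down the first meridian
    v_end = transport(v_start, loop)
    return math.atan2(v_end[1], v_end[0])

angle = holonomy_angle(math.pi / 2)  # quarter of the way around the equator
```

For this triangle the returned rotation angle should match the longitude difference `delta` between the two meridians, which on the unit sphere is also the area enclosed by the loop.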


8 Thu.26.Sep

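8.1 The geodesic equations for test particle motion in curved spacetime

A geodesic is a path $x^\mu(\lambda)$ that parallel transports its own tangent vector. It follows that the equation satisfied by the geodesic is

$$ \frac{D}{D\lambda} \left( \frac{dx^\mu}{d\lambda} \right) = \frac{d^2 x^\mu}{d\lambda^2} + \Gamma^\mu{}_{\nu\sigma} \frac{dx^\nu}{d\lambda} \frac{dx^\sigma}{d\lambda} = 0 . \tag{8.1} $$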

We can also think about parallel transport in the following way. When we take an ordinary partial derivative, we do it by taking

$$ \lim_{\Delta x \to 0} \frac{f(x^\mu + \Delta x^\mu) - f(x^\mu)}{\Delta x^\mu} = \frac{\partial f}{\partial x^\mu} . \tag{8.2} $$

In curved spacetime, the result of this is not a tensor. What we do is instead take the covariant derivative, as follows.

1. We take $x^\mu(\lambda + d\lambda)$ as our “x plus an infinitesimal change” and find $T$ there.
2. We parallel transport $T$ back to the original point at $x^\mu(\lambda)$, along the path $x^\mu(\lambda)$.
3. We compare the parallel-transported-back $T$ to the original $T$ at $x^\mu(\lambda)$, and we ‘divide by’ $d\lambda$.

The result is $DT/D\lambda$, the covariant rate of change of the tensor with respect to $\lambda$ at the spacetime point $x^\mu(\lambda)$.

Let us now see another way that the geodesic equation can be derived, using a variational approach. Consider a massive point particle in proper time gauge. The relativistic einbein action is, up to a constant that is physically irrelevant at the classical level,

$$ S = -m \int d\tau\; g_{\mu\nu}(x^\lambda)\, \frac{dx^\mu(\tau)}{d\tau} \frac{dx^\nu(\tau)}{d\tau} . \tag{8.3} $$

What happens when we vary

$$ x^\mu \to x^\mu + \delta x^\mu \; ? \tag{8.4} $$

Under such a variation,

$$ g_{\mu\nu} \to g_{\mu\nu} + (\partial_\sigma g_{\mu\nu})\, \delta x^\sigma . \tag{8.5} $$

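Varying the action, we have

$$ -\frac{1}{m}\, \delta S = \int d\tau\; \delta \left( g_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} \right) \tag{8.6} $$

$$ = \int d\tau \left[ (\partial_\sigma g_{\mu\nu})\, \delta x^\sigma \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} + \left\{ g_{\mu\nu} \left( \frac{d\,\delta x^\mu}{d\tau} \right) \frac{dx^\nu}{d\tau} + (\mu \leftrightarrow \nu) \right\} \right] \tag{8.7} $$

$$ = \int d\tau \left[ (\partial_\sigma g_{\mu\nu}) \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau}\, \delta x^\sigma - \left\{ (\partial_\sigma g_{\mu\nu}) \frac{dx^\sigma}{d\tau} \frac{dx^\nu}{d\tau}\, \delta x^\mu + g_{\mu\nu} \frac{d^2 x^\nu}{d\tau^2}\, \delta x^\mu + (\mu \leftrightarrow \nu) \right\} \right] , \tag{8.8} $$

where the $(\mu \leftrightarrow \nu)$ applies only to the terms inside the braces, and where in the last step we integrated by parts⁶. We also used the fact that

$$ \delta\, \frac{dx^\mu}{d\tau} = \frac{d\,\delta x^\mu}{d\tau} . \tag{8.9} $$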

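Collecting all the terms, relabelling dummy indices, and using the symmetry of the metric, we have

$$ -\frac{1}{m}\, \delta S = -2 \int d\tau \left[ g_{\mu\sigma} \frac{d^2 x^\sigma}{d\tau^2} + g_{\mu\sigma}\, \Gamma^\sigma{}_{\nu\rho} \frac{dx^\nu}{d\tau} \frac{dx^\rho}{d\tau} \right] \delta x^\mu . \tag{8.10} $$

Demanding that this be zero for arbitrary variations $\delta x^\mu$, we obtain the geodesic equation,

$$ \frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu{}_{\nu\rho} \frac{dx^\nu}{d\tau} \frac{dx^\rho}{d\tau} = 0 . \tag{8.11} $$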

An affine parameter $\lambda$ is defined to be $\lambda = a\tau + b$ for constants $a, b$. In other words, $\lambda$ is an affine parameter if it is linearly related to $\tau$ (for a massive particle). For a massless particle, we can still define an affine parameter. In fact, our geodesic equation requires just such an affine parametrization, regardless of the particle mass.
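To see why the affine condition matters, consider what happens under a general reparametrization. The short derivation below is an added illustration; the symbol $\sigma(\lambda)$ for the new parameter is an assumption introduced here and does not appear in the original notes.

```latex
% Let \lambda be affine and \sigma = \sigma(\lambda) a new parameter, with
% \sigma' = d\sigma/d\lambda and \sigma'' = d^2\sigma/d\lambda^2. The chain rule gives
\frac{dx^\mu}{d\lambda} = \sigma'\,\frac{dx^\mu}{d\sigma} ,
\qquad
\frac{d^2x^\mu}{d\lambda^2} = \sigma''\,\frac{dx^\mu}{d\sigma} + (\sigma')^2\,\frac{d^2x^\mu}{d\sigma^2} .

% Substituting into the geodesic equation and dividing by (\sigma')^2:
\frac{d^2x^\mu}{d\sigma^2}
+ \Gamma^\mu{}_{\nu\rho}\,\frac{dx^\nu}{d\sigma}\frac{dx^\rho}{d\sigma}
= -\,\frac{\sigma''}{(\sigma')^2}\,\frac{dx^\mu}{d\sigma} .
```

The right-hand side vanishes precisely when $\sigma'' = 0$, that is when $\sigma$ is linearly related to $\lambda$, so the geodesic equation keeps its simple form only for affine parameters.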

For either massive or massless particles, the geodesic equation can be written in very compact form in terms of the momentum vector,

$$ p^\nu \nabla_\nu\, p^\mu = 0 . \tag{8.12} $$

For point particles, we relate the momentum $p^\mu$ to the four-velocity $u^\mu$ via

$$ p^\mu = m u^\mu = m \frac{dx^\mu}{d\tau} , \quad m^2 > 0 , \qquad\qquad p^\mu = u^\mu , \quad m^2 = 0 . \tag{8.13} $$

The second formula follows Carroll’s convention for defining the four-velocity for massless particles.

⁶ We assume that the manifold has sufficiently trivial topology for the integration by parts to work.


There is a central physics point to understand about this extremization. Is it a minimization or a maximization? In fact, the geodesic maximizes proper time. Why? Well, if we were to lower the proper time interval $(\Delta\tau)^2$ along a changed path, we would get closer to $(\Delta\tau)^2 = 0$, which is a null path. To go lower, to $(\Delta\tau)^2 < 0$, we would have to use an illegal spacelike path. So minimizing $(\Delta\tau)^2$ does not make sense, and in fact the proper time is maximized via the variational principle. The fact that the proper time is maximized happens precisely because it is infinitesimally close to paths with lower proper time. Carroll has a morally similar argument: he shows that for any timelike path, we can approximate it by a (jaggedy looking) piecewise continuous bunch of null paths, all of the pieces of which have zero invariant interval. Since the geodesic is infinitesimally nearby to null paths with zero proper time, it must maximize proper time.

The physical consequence of this mathematical fact that geodesics maximize proper time is that accelerated observers – those who are not in freefall – measure less proper time than those who are in freefall. This is why the space twin in the Twin Paradox always comes back younger, not older, than the homebody twin. The more you accelerate around with your rockets, the younger you are compared to a homebody who stays on a geodesic.

If all geodesics on a spacetime manifold go as far as they please, then the manifold is said to be geodesically complete. But if some geodesic(s) bang into a singularity, or end prematurely, then the manifold is geodesically incomplete. For spacetimes with matter, this is the generic case, actually. We will see why when we get to singularity theorems later on.

8.2 Example computation for affine connection and geodesic equations

Let us now work a relatively simple example of calculating Christoffel components for a spacetime with dependence on only one coordinate, $x^0 = t$. We will take the spatially flat⁷ Friedmann-Robertson-Walker ansatz in $D = d + 1$ spacetime dimensions,

$$ ds^2 = dt^2 - a^2(t)\, |d\vec{x}|^2 , \tag{8.14} $$

where $a(t)$ is the scale factor. Since

$$ ds^2 = g_{\mu\nu}\, dx^\mu dx^\nu , \tag{8.15} $$

we have

$$ g_{00} = +1 , \qquad g_{ij} = -[a(t)]^2\, \delta_{ij} . \tag{8.16} $$

Because the metric is diagonal, we can invert it by eye, to obtain

$$ g^{00} = +1 , \qquad g^{ij} = -[a(t)]^{-2}\, \delta^{ij} . \tag{8.17} $$

⁷ For the more general case with nontrivial spatial metric, see Carroll §8.3.


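Finding the Christoffels is relatively straightforward, as many of them are zero. Notice that the only coordinate dependence in the metric is on the time coordinate.

First, let us try for $\Gamma^0{}_{00}$,

$$ \begin{aligned}
\Gamma^0{}_{00} &= \frac{1}{2}\, g^{0\sigma} \left( \partial_0 g_{0\sigma} + \partial_0 g_{0\sigma} - \partial_\sigma g_{00} \right) \\
&= \frac{1}{2}\, g^{00}\, \partial_0 g_{00} = 0 ,
\end{aligned} \tag{8.18} $$

because the metric is diagonal and because $g_{00}$ is a constant.

Next up is

$$ \begin{aligned}
\Gamma^0{}_{0i} &= \frac{1}{2}\, g^{0\sigma} \left( \partial_0 g_{i\sigma} + \partial_i g_{0\sigma} - \partial_\sigma g_{0i} \right) \\
&= \frac{1}{2}\, g^{00} \left( \partial_0 g_{i0} + \partial_i g_{00} - \partial_0 g_{0i} \right) \\
&= \frac{1}{2}\, g^{00}\, \partial_i g_{00} = 0 ,
\end{aligned} \tag{8.19} $$

because the metric is diagonal and because $g_{00}$ is a constant.

A more interesting case is $\Gamma^0{}_{ij}$, which is nonzero.

$$ \begin{aligned}
\Gamma^0{}_{ij} &= \frac{1}{2}\, g^{0\sigma} \left( \partial_i g_{j\sigma} + \partial_j g_{i\sigma} - \partial_\sigma g_{ij} \right) \\
&= \frac{1}{2}\, g^{00} \left( \partial_i g_{j0} + \partial_j g_{i0} - \partial_0 g_{ij} \right) \\
&= -\frac{1}{2}\, g^{00}\, \partial_0 g_{ij} \\
&= a \dot{a}\, \delta_{ij} ,
\end{aligned} \tag{8.20} $$

where the overdot denotes $d/dt$. Along the way, we again used the fact that the metric is diagonal and $g_{00}$ is a constant.

Now consider $\Gamma^i{}_{00}$.

$$ \begin{aligned}
\Gamma^i{}_{00} &= \frac{1}{2}\, g^{i\sigma} \left( \partial_0 g_{0\sigma} + \partial_0 g_{0\sigma} - \partial_\sigma g_{00} \right) \\
&= 0 ,
\end{aligned} \tag{8.21} $$

because the metric is diagonal and because $g_{00}$ is a constant.

Next, let us look at the only other nonzero Christoffel symbol $\Gamma^i{}_{0j}$. We have

$$ \begin{aligned}
\Gamma^i{}_{0j} &= \frac{1}{2}\, g^{i\sigma} \left( \partial_0 g_{j\sigma} + \partial_j g_{0\sigma} - \partial_\sigma g_{0j} \right) \\
&= \frac{1}{2}\, g^{ik} \left( \partial_0 g_{jk} + \partial_j g_{0k} - \partial_k g_{0j} \right) \\
&= \frac{1}{2}\, g^{ik}\, \partial_0 g_{jk} \\
&= \frac{1}{2}\, [a(t)]^{-2}\, \delta^{ik}\, \partial_0 [a(t)]^2\, \delta_{jk} \\
&= \frac{\dot{a}}{a}\, \delta^i_j .
\end{aligned} \tag{8.22} $$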


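Finally, what about the all-spatial Christoffels $\Gamma^i{}_{jk}$? We have

$$ \begin{aligned}
\Gamma^i{}_{jk} &= \frac{1}{2}\, g^{i\ell} \left( \partial_j g_{k\ell} + \partial_k g_{j\ell} - \partial_\ell g_{jk} \right) \\
&= 0 ,
\end{aligned} \tag{8.23} $$

because none of the spatial components of the metric depends on spatial position.

In summary, we have:

$$ \Gamma^0{}_{ij} = a \dot{a}\, \delta_{ij} , \tag{8.24} $$

$$ \Gamma^i{}_{j0} = \frac{\dot{a}}{a}\, \delta^i_j , \tag{8.25} $$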

with all other components zero. Notice how it is the “velocity” of the scale factor $a(t)$ that appears here. The quantity

$$ \frac{\dot{a}}{a} = H(t) \tag{8.26} $$

is known as the Hubble parameter; it reduces to a constant, the Hubble constant, if the scale factor is exponential. (Whether or not the scale factor can behave in this fashion is determined by the energy-momentum of matter in the spacetime, as we will discover later on in the course.)
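Two simple cases illustrate the distinction. Both scale factors below are illustrative assumptions chosen for their simplicity (the constant $H_0$ and the power law $t^{2/3}$ are not taken from the text above).

```latex
% Exponential scale factor: the ratio is time-independent.
a(t) = e^{H_0 t}
\quad\Longrightarrow\quad
H(t) = \frac{\dot a}{a} = H_0 .

% Power-law scale factor: the ratio decays with time.
a(t) = t^{2/3}
\quad\Longrightarrow\quad
H(t) = \frac{\dot a}{a} = \frac{2}{3t} .
```

Only in the first case is $H$ genuinely a constant; in the second it is a function of $t$, and the nonzero Christoffels in eqs.(8.24)-(8.25) become $\Gamma^0{}_{ij} = \tfrac{2}{3}\, t^{1/3}\, \delta_{ij}$ and $\Gamma^i{}_{j0} = \tfrac{2}{3t}\, \delta^i_j$.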

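Now let us look at the geodesic equations in this simple spacetime, doing a time-space split like for the Christoffels above. In general, we have

$$ \frac{d^2 x^\mu}{d\lambda^2} + \Gamma^\mu{}_{\nu\sigma} \frac{dx^\nu}{d\lambda} \frac{dx^\sigma}{d\lambda} = 0 . \tag{8.27} $$

The 0th component of this equation reads

$$ \begin{aligned}
0 &= \frac{d^2 x^0}{d\lambda^2} + \Gamma^0{}_{\nu\sigma} \frac{dx^\nu}{d\lambda} \frac{dx^\sigma}{d\lambda} \\
&= \frac{d^2 x^0}{d\lambda^2} + \Gamma^0{}_{ij} \frac{dx^i}{d\lambda} \frac{dx^j}{d\lambda} \\
&= \frac{d^2 x^0}{d\lambda^2} + a \dot{a}\, \delta_{ij} \frac{dx^i}{d\lambda} \frac{dx^j}{d\lambda}
\end{aligned} \tag{8.28} $$

because all the other terms contributing to the sums over $\nu$ and $\sigma$ involve Christoffel components that are zero.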

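The $i$th component reads

$$ \begin{aligned}
0 &= \frac{d^2 x^i}{d\lambda^2} + \Gamma^i{}_{\nu\sigma} \frac{dx^\nu}{d\lambda} \frac{dx^\sigma}{d\lambda} \\
&= \frac{d^2 x^i}{d\lambda^2} + \Gamma^i{}_{0j} \frac{dx^0}{d\lambda} \frac{dx^j}{d\lambda} + \Gamma^i{}_{j0} \frac{dx^j}{d\lambda} \frac{dx^0}{d\lambda} \\
&= \frac{d^2 x^i}{d\lambda^2} + \frac{2\dot{a}}{a} \frac{dx^0}{d\lambda} \frac{dx^i}{d\lambda} .
\end{aligned} \tag{8.29} $$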

The first thing to notice about these geodesic equations we have derived is that they are coupled and nonlinear. The equation for $dx^0/d\lambda$ depends on what $dx^i/d\lambda$ are doing, and vice versa. This is why solving for motions of massless particles (photons) or massive particles (like electrons) in the background of a general curved spacetime is generically much more complicated than doing Newton’s Laws for non-relativistic physics.

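The second thing to notice about our super-simple spacetime is that the spatial geodesic equations actually have a first integral (!). To see this, let us try taking the $\lambda$ derivative of

$$ p_i = a^2(t)\, \delta_{ij} \frac{dx^j}{d\lambda} . \tag{8.30} $$

We have, by the Leibniz rule and the chain rule,

$$ \begin{aligned}
\frac{d}{d\lambda}\, p_i &= \delta_{ij} \frac{d}{d\lambda} \left[ a^2(x^0) \frac{dx^j}{d\lambda} \right] = \delta_{ij} \left( 2 a \dot{a} \frac{dx^0}{d\lambda} \right) \frac{dx^j}{d\lambda} + \delta_{ij}\, a^2 \frac{d^2 x^j}{d\lambda^2} \\
&= \delta_{ij}\, a^2 \left[ \frac{d^2 x^j}{d\lambda^2} + \frac{2\dot{a}}{a} \frac{dx^0}{d\lambda} \frac{dx^j}{d\lambda} \right] = 0 .
\end{aligned} \tag{8.31} $$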

Therefore, $p_i$ is a conserved quantity along the geodesic. As we will see a bit later in the course, this conservation law arises because our spacetime metric has a symmetry: none of the components of the metric tensor depends on spatial coordinates. This is your first example of how Noether’s Theorem works in General Relativity.
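The conservation law can be watched at work by integrating the coupled geodesic equations (8.28)-(8.29) numerically. The sketch below is a hedged illustration only: it assumes a power-law scale factor $a(t) = t^{2/3}$ (chosen purely for definiteness), restricts the motion to one spatial direction, and uses a hand-written fourth-order Runge-Kutta step. The function names, initial data, and step size are hypothetical.

```python
import math

def scale(t):
    """Illustrative scale factor a(t) = t^(2/3); an assumption, not fixed by the notes."""
    return t ** (2.0 / 3.0)

def scale_dot(t):
    """Time derivative of the illustrative scale factor."""
    return (2.0 / 3.0) * t ** (-1.0 / 3.0)

def rhs(state):
    """Right-hand side of eqs. (8.28)-(8.29) for motion along a single spatial axis."""
    t, _x, ut, ux = state
    dut = -scale(t) * scale_dot(t) * ux * ux
    dux = -2.0 * (scale_dot(t) / scale(t)) * ut * ux
    return (ut, ux, dut, dux)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h in the affine parameter."""
    def nudge(k, f):
        return tuple(s + f * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(nudge(k1, h / 2))
    k3 = rhs(nudge(k2, h / 2))
    k4 = rhs(nudge(k3, h))
    return tuple(s + h / 6 * (c1 + 2 * c2 + 2 * c3 + c4)
                 for s, c1, c2, c3, c4 in zip(state, k1, k2, k3, k4))

def integrate(state, h=1e-3, steps=1000):
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

def momentum(state):
    """p_i = a^2 dx^i/dlambda as in eq. (8.30), for the single spatial direction."""
    t, _x, _ut, ux = state
    return scale(t) ** 2 * ux

# Initial data at t = 1 for a massive particle, normalized so that u.u = 1.
ux0 = 0.5
start = (1.0, 0.0, math.sqrt(1.0 + scale(1.0) ** 2 * ux0 ** 2), ux0)
end = integrate(start)
```

Comparing `momentum(start)` with `momentum(end)` should show agreement to within the integrator's error, even though $dx^i/d\lambda$ itself decreases as the scale factor grows.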


9 Mon.30.Sep

9.1 Spacetime curvature

Einstein’s General Theory of Relativity upgraded the way we think about gravitational physics. Instead of imposing Newton’s three laws of motion and imposing his force law for universal gravitation, we assume that the starting point is the fabric of spacetime. We worked quite hard already to define tensors on arbitrary spacetimes, by focusing intensely on their transformation properties under changes of reference frame, i.e., changes of coordinates. We also figured out in our last lecture how to define a covariant derivative, with the help of the Levi-Civita connection. We went to all that trouble of wrangling the Christoffel symbols because this enabled us to do two exciting things: (a) to define a derivative $\nabla_\mu$ that is a tensor, even in curved spacetime, and (b) to derive the geodesic equation, which is the equation obeyed by any relativistic particle undergoing freefall in the spacetime in question. Along the way, we learned that geodesics maximize the proper time.

As we alluded to earlier, the Riemann curvature tensor is the mathematical quantity that Albert Einstein discovered was the key to gravitational physics expressed in the language of curved spacetime. He realized that the Riemann tensor, which contains at most two derivatives of the metric tensor, could even be used to build an action principle for general relativity. We will derive the Einstein action and the Einstein equations of motion for the gravitational field in the GR2 course. For now, all we need to keep in mind is that the Riemann tensor encodes a wide variety of gravitational phenomena in its tensor components, including the physics of tidal forces and the motion of particles in spacetime. In particular, we will soon show how in the Newtonian limit of weak gravity and slow speeds, we will recover familiar expressions from Newtonian physics – without ever having to use the concept of a force! First, we need to develop a bit more formalism.

9.2 The Riemann tensor

Consider an infinitesimal parallelogram, with vectors $A^\mu$ and $B^\nu$ forming the sides.

In hand-waving terms, the Riemann curvature is what tells us how much a vector $V^\mu$ gets rotated under parallel transport around the parallelogram. The infinitesimal change in $V$, $\delta V$, is a (1,0) tensor, and so are $A$, $B$, and $V$. Roughly speaking, we expect $\delta V$ to be proportional to $V$ and to the size of the parallelogram. To connect $\delta V$ to $A, B, V$ we need a (1,3) tensor with which to contract indices naturally, and the role of this is played by the Riemann curvature. The resulting equation from our handwaving is therefore

$$ \delta V^\mu \sim R^\mu{}_{\nu\alpha\beta}\, V^\nu A^\alpha B^\beta . \tag{9.1} $$

While this sketch of Riemann’s origin gives us the gist, we now need to be more precise and make a proper definition.


Recall that earlier we found parallel transport to be the right way of thinking about how to compare vectors at different places in spacetime. Combined with our little parallelogram hand-wave just now, this can be used to motivate a mathematical definition of the Riemann tensor as arising from taking commutators of covariant derivatives. On a (1,0) vector $V$, Riemann is defined via (see footnote 8)

$$ [\nabla_\mu, \nabla_\nu] V^\rho = +R^\rho{}_{\lambda\mu\nu} V^\lambda , \tag{9.2} $$

for a torsion-free connection. This formula teaches us how to find the components of the Riemann tensor in terms of Christoffel connection coefficients. Let us write out the pieces individually to see how it works out. First, note that for any (1,1) tensor $T^\rho{}_\nu$,

$$ \nabla_\mu T^\rho{}_\nu = \partial_\mu T^\rho{}_\nu + \Gamma^\rho{}_{\mu\sigma} T^\sigma{}_\nu - \Gamma^\lambda{}_{\mu\nu} T^\rho{}_\lambda . \tag{9.3} $$

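So with $T^\rho{}_\nu = \nabla_\nu V^\rho$, we have

\begin{align}
\nabla_\mu(\nabla_\nu V^\rho) &= \partial_\mu(\nabla_\nu V^\rho) + \Gamma^\rho{}_{\mu\lambda}(\nabla_\nu V^\lambda) - \Gamma^\lambda{}_{\mu\nu}(\nabla_\lambda V^\rho) \tag{9.4} \\
&= \partial_\mu(\partial_\nu V^\rho + \Gamma^\rho{}_{\nu\lambda} V^\lambda) + \Gamma^\rho{}_{\mu\lambda}(\partial_\nu V^\lambda + \Gamma^\lambda{}_{\nu\sigma} V^\sigma) - \Gamma^\lambda{}_{\mu\nu}(\partial_\lambda V^\rho + \Gamma^\rho{}_{\lambda\sigma} V^\sigma) \tag{9.5} \\
&= \partial_\mu\partial_\nu V^\rho + \Gamma^\rho{}_{\nu\lambda}\partial_\mu V^\lambda + \Gamma^\rho{}_{\mu\lambda}\partial_\nu V^\lambda - \Gamma^\lambda{}_{\mu\nu}\partial_\lambda V^\rho \notag \\
&\quad + (\partial_\mu\Gamma^\rho{}_{\nu\sigma}) V^\sigma + \Gamma^\rho{}_{\mu\lambda}\Gamma^\lambda{}_{\nu\sigma} V^\sigma - \Gamma^\lambda{}_{\mu\nu}\Gamma^\rho{}_{\lambda\sigma} V^\sigma . \tag{9.6}
\end{align}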

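Then

\begin{align}
\nabla_\mu(\nabla_\nu V^\rho) - (\mu \leftrightarrow \nu) &= \left(\partial_\mu\Gamma^\rho{}_{\nu\sigma} + \Gamma^\rho{}_{\mu\lambda}\Gamma^\lambda{}_{\nu\sigma}\right) V^\sigma - \left[\Gamma^\lambda{}_{\mu\nu}\Gamma^\rho{}_{\lambda\sigma} V^\sigma\right] \notag \\
&\quad + \Gamma^\rho{}_{\nu\lambda}\partial_\mu V^\lambda + \Gamma^\rho{}_{\mu\lambda}\partial_\nu V^\lambda - \left[\Gamma^\lambda{}_{\mu\nu}\partial_\lambda V^\rho\right] - (\mu \leftrightarrow \nu) \tag{9.7} \\
&= \left(\partial_\mu\Gamma^\rho{}_{\nu\sigma} + \Gamma^\rho{}_{\mu\lambda}\Gamma^\lambda{}_{\nu\sigma} - (\mu \leftrightarrow \nu)\right) V^\sigma . \tag{9.8}
\end{align}

The terms in square brackets are symmetric in $\mu \leftrightarrow \nu$ for a torsion-free connection, as is the pair $\Gamma^\rho{}_{\nu\lambda}\partial_\mu V^\lambda + \Gamma^\rho{}_{\mu\lambda}\partial_\nu V^\lambda$ and the term $\partial_\mu\partial_\nu V^\rho$, so all of them drop out of the commutator.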

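Now we can put the pieces together to see the general formula for taking the commutator of covariant derivatives acting on a vector. Using

$$ [\nabla_\mu, \nabla_\nu] V^\rho = +R^\rho{}_{\sigma\mu\nu} V^\sigma \tag{9.9} $$

gives us the formula for the Riemann tensor components,

$$ R^\rho{}_{\sigma\mu\nu} = \partial_\mu\Gamma^\rho{}_{\sigma\nu} - \partial_\nu\Gamma^\rho{}_{\sigma\mu} + \Gamma^\lambda{}_{\sigma\nu}\Gamma^\rho{}_{\lambda\mu} - \Gamma^\lambda{}_{\sigma\mu}\Gamma^\rho{}_{\lambda\nu} . \tag{9.10} $$

For a covariant derivative acting on a (0,1) tensor, a covariant vector, one finds the same Riemann tensor coefficients and

$$ [\nabla_\mu, \nabla_\nu]\, \omega_\rho = -R^\lambda{}_{\rho\mu\nu}\, \omega_\lambda . \tag{9.11} $$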

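If you slog through the details, you can compute the commutator of covariant derivatives on a rank $(k, \ell)$ tensor $V$ as well. This is not much worse than the calculation we have just done, and we suppress the details here. The result is

\begin{align}
[\nabla_\rho, \nabla_\sigma] V^{\mu_1 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell} &= R^{\mu_1}{}_{\lambda\rho\sigma} V^{\lambda \mu_2 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell} + R^{\mu_2}{}_{\lambda\rho\sigma} V^{\mu_1 \lambda \mu_3 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell} + \ldots \notag \\
&\quad - R^\lambda{}_{\nu_1\rho\sigma} V^{\mu_1 \ldots \mu_k}{}_{\lambda \nu_2 \ldots \nu_\ell} - R^\lambda{}_{\nu_2\rho\sigma} V^{\mu_1 \ldots \mu_k}{}_{\nu_1 \lambda \nu_3 \ldots \nu_\ell} - \ldots . \tag{9.12}
\end{align}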

Footnote 8: We are using the sign conventions of HEL.


Riemann arises naturally as a rank (1,3) tensor. By doing a partial contraction of two of its indices, we can define the Ricci tensor $R_{\mu\nu}$, which naturally arises as a rank (0,2) tensor,

$$ R_{\mu\nu} = R^\alpha{}_{\mu\nu\alpha} . \tag{9.13} $$

Notice that we are contracting the first and fourth indices here to make the Ricci tensor. This is a choice of convention, and we have chosen to use the same convention as the HEL textbook.

By contracting the Ricci tensor with the metric, we can form the Ricci scalar $R$, which has rank (0,0),

$$ R = g^{\mu\nu} R_{\mu\nu} . \tag{9.14} $$

Other kinds of contractions involving Riemann are also possible, such as “Riemann squared” and “Ricci squared”. For our purposes in this course, we only need to know about the Ricci tensor and the Ricci scalar – because both of them will appear on the left hand side of Einstein’s equations.

Note that if you change the signature of our Lorentzian spacetime from mostly minus to mostly plus, the Christoffels $\Gamma^\lambda{}_{\mu\nu}$ would stay the same, the Riemann tensor $R^\rho{}_{\lambda\mu\nu}$ would also stay the same, and so would the Ricci tensor $R_{\mu\nu}$, but the Ricci scalar $R$ would develop a relative minus sign.

9.3 Example computations for Riemann

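Suppose that we study 2D Euclidean space in plane polar coordinates $(x^1, x^2) = (\rho, \varphi)$,

$$ ds^2 = d\rho^2 + \rho^2 d\varphi^2 . \tag{9.15} $$

We previously found nonzero Christoffels for this space in these coordinates when we introduced basis vectors,

$$ \Gamma^2{}_{12} = \frac{1}{\rho} , \qquad \Gamma^1{}_{22} = -\rho . \tag{9.16} $$

From this, we can find the Riemann tensor using our formula from above,

$$ R^\rho{}_{\sigma\mu\nu} = \partial_\mu\Gamma^\rho{}_{\nu\sigma} - \partial_\nu\Gamma^\rho{}_{\mu\sigma} + \Gamma^\rho{}_{\mu\lambda}\Gamma^\lambda{}_{\nu\sigma} - \Gamma^\rho{}_{\nu\lambda}\Gamma^\lambda{}_{\mu\sigma} . \tag{9.17} $$

Substituting in gives

$$
\begin{aligned}
R^1{}_{212} &= \partial_1\Gamma^1{}_{22} - \partial_2\Gamma^1{}_{21} + \Gamma^\lambda{}_{22}\Gamma^1{}_{\lambda 1} - \Gamma^\lambda{}_{21}\Gamma^1{}_{\lambda 2} \\
&= \partial_1\Gamma^1{}_{22} - 0 + 0 - \Gamma^2{}_{21}\Gamma^1{}_{22} \\
&= \partial_\rho(-\rho) - \frac{1}{\rho}(-\rho) \\
&= -1 - (-1) = 0 .
\end{aligned}
\tag{9.18}
$$

All the other components of Riemann that might have been nonzero are actually zero, also. This result reflects the fact that this 2D space is flat.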


Now suppose we try a space which we already suspect is curved: the two-sphere with coordinates $(x^1, x^2) = (\theta, \phi)$,

$$ ds^2 = d\theta^2 + \sin^2\theta \, d\phi^2 . \tag{9.19} $$

Computing the Christoffels is straightforward, using either the formula in terms of derivatives of the metric tensor or the formula for how basis vectors change. The only nonzero components turn out to be

$$ \Gamma^1{}_{22} = -\sin\theta\cos\theta , \qquad \Gamma^2{}_{12} = \frac{\cos\theta}{\sin\theta} . \tag{9.20} $$

As we will discover next week, there is only one independent component of Riemann in 2D, and it is $R^1{}_{212}$. To compute it, we substitute in again,

$$
\begin{aligned}
R^1{}_{212} &= \partial_1\Gamma^1{}_{22} - \Gamma^2{}_{21}\Gamma^1{}_{22} \\
&= \partial_\theta(-\sin\theta\cos\theta) - \frac{\cos\theta}{\sin\theta}(-\sin\theta\cos\theta) \\
&= -(\cos^2\theta - \sin^2\theta) + \cos^2\theta \\
&= +\sin^2\theta .
\end{aligned}
\tag{9.21}
$$

If we lift the second index using $g^{\mu\nu}$, we obtain

$$ R^{12}{}_{12} = +1 . \tag{9.22} $$

The answer is positive because the sphere is positively curved. All of the other nonzero components of Riemann can be expressed in terms of this one, for instance $R^{21}{}_{21} = +R^{12}{}_{12}$.
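The two hand computations above can be cross-checked numerically. The following is only a sketch: it evaluates formula (9.17) with central finite differences for the derivatives of the Christoffels. The function names and the step size `h` are illustrative choices, not part of the course material, and indices 0 and 1 in the code stand for the coordinates $x^1$ and $x^2$.

```python
import math

def riemann_component(christoffel, x, rho, sigma, mu, nu, h=1e-5):
    """Approximate R^rho_{sigma mu nu} at the point x using eq. (9.17).

    `christoffel(x)` must return a nested list G with G[a][b][c] = Gamma^a_{bc}.
    Derivatives of the Christoffels are taken by central finite differences.
    """
    n = len(x)

    def d_gamma(direction, a, b, c):
        # Central-difference estimate of d(Gamma^a_{bc}) / d(x^direction).
        x_plus, x_minus = list(x), list(x)
        x_plus[direction] += h
        x_minus[direction] -= h
        return (christoffel(x_plus)[a][b][c] - christoffel(x_minus)[a][b][c]) / (2 * h)

    G = christoffel(x)
    quadratic = sum(
        G[rho][mu][lam] * G[lam][nu][sigma] - G[rho][nu][lam] * G[lam][mu][sigma]
        for lam in range(n)
    )
    return d_gamma(mu, rho, nu, sigma) - d_gamma(nu, rho, mu, sigma) + quadratic

def christoffel_polar(x):
    """Christoffels (9.16) for plane polar coordinates (rho, varphi)."""
    r = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r                      # Gamma^1_{22}
    G[1][0][1] = G[1][1][0] = 1.0 / r    # Gamma^2_{12} = Gamma^2_{21}
    return G

def christoffel_sphere(x):
    """Christoffels (9.20) for the unit two-sphere (theta, phi)."""
    th = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -math.sin(th) * math.cos(th)              # Gamma^1_{22}
    G[1][0][1] = G[1][1][0] = math.cos(th) / math.sin(th)  # Gamma^2_{12}
    return G

# R^1_{212} in the notes' labelling corresponds to indices (0, 1, 0, 1) here.
flat_value = riemann_component(christoffel_polar, [2.0, 0.3], 0, 1, 0, 1)
sphere_value = riemann_component(christoffel_sphere, [1.0, 0.5], 0, 1, 0, 1)
```

Up to finite-difference error, `flat_value` should vanish as in (9.18), and `sphere_value` should agree with $\sin^2\theta$ at $\theta = 1$ as in (9.21).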

Finally, let us work a slightly more nontrivial example of calculating Riemann components for a spacetime with dependence on only one coordinate. As with our geodesic equation example at the end of the previous section, we take the spatially flat FRW ansatz in $D = d + 1$ spacetime dimensions,

$$ ds^2 = dt^2 - a^2(t)\, |d\vec{x}|^2 , \tag{9.23} $$

where $a(t)$ is the scale factor. Most of the components of Riemann for this simple spacetime are actually zero. Let us sketch how to find the ones that are nonzero.

We had for the Christoffels

$$ \Gamma^0{}_{ij} = a\dot{a}\, \delta_{ij} , \qquad \Gamma^i{}_{j0} = \frac{\dot{a}}{a}\, \delta^i{}_j . \tag{9.24} $$

The first group of nonzero Riemann components has one time index up and one down, and two spatial indices:

$$
\begin{aligned}
R^0{}_{i0j} &= \partial_0\Gamma^0{}_{ji} - \Gamma^0{}_{jk}\Gamma^k{}_{0i} \\
&= (\dot{a}^2 + a\ddot{a})\, \delta_{ij} - a\dot{a}\, \delta_{jk}\, \frac{\dot{a}}{a}\, \delta^k{}_i \\
&= a\ddot{a}\, \delta_{ij} .
\end{aligned}
\tag{9.25}
$$


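Then we have

$$
\begin{aligned}
R^i{}_{00j} &= \partial_0\Gamma^i{}_{j0} + \Gamma^i{}_{0k}\Gamma^k{}_{j0} \\
&= \frac{a\ddot{a} - \dot{a}^2}{a^2}\, \delta^i{}_j + \frac{\dot{a}^2}{a^2}\, \delta^i{}_j \\
&= \frac{\ddot{a}}{a}\, \delta^i{}_j .
\end{aligned}
\tag{9.26}
$$

The second group of nonzero Riemann components has all spatial indices,

$$
\begin{aligned}
R^i{}_{jk\ell} &= \Gamma^i{}_{k0}\Gamma^0{}_{\ell j} - \Gamma^i{}_{\ell 0}\Gamma^0{}_{kj} \\
&= \left(\frac{\dot{a}}{a}\, \delta^i{}_k\right)\left(a\dot{a}\, \delta_{\ell j}\right) - \left(\frac{\dot{a}}{a}\, \delta^i{}_\ell\right)\left(a\dot{a}\, \delta_{kj}\right) \\
&= \dot{a}^2 \left(\delta^i{}_k \delta_{j\ell} - \delta^i{}_\ell \delta_{jk}\right) .
\end{aligned}
\tag{9.27}
$$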

Notice how we have discovered both “velocity squared” $\dot{a}^2$ terms, which arise via $\Gamma\Gamma$ parts in Riemann, and “acceleration” $\ddot{a}$ terms, which arise via $\partial^2 g$ parts in Riemann. It is not until you compute the curvature that you see the appearance of the “acceleration” pieces. Notice also how the “acceleration” of the scale factor showed up in the Riemann components involving the time direction; the all-spatial Riemanns gave only “velocity squared” contributions.

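Since we now have the Riemann tensor, we can contract it to find Ricci. The nonzero components are

\begin{align}
R_{00} &= R^i{}_{00i} = +\delta^i{}_i\, \frac{\ddot{a}}{a} = +d\, \frac{\ddot{a}}{a} , \notag \\
R_{ij} &= R^0{}_{ij0} + R^k{}_{ijk} \notag \\
&= -a\ddot{a}\, \delta_{ij} - \dot{a}^2 \left(\delta^k{}_k \delta_{ij} - \delta^k{}_j \delta_{ik}\right) \notag \\
&= -a\ddot{a}\, \delta_{ij} - \dot{a}^2 (d - 1)\, \delta_{ij} \notag \\
&= -\delta_{ij} \left[a\ddot{a} + (d - 1)\, \dot{a}^2\right] , \tag{9.28}
\end{align}

where $d$ is the spatial dimension ($d = 3$ in our universe), and we used the antisymmetry of Riemann in its last two indices to write $R^0{}_{ij0} = -R^0{}_{i0j}$. Contracting the Ricci tensor with the metric tensor gives the Ricci scalar,

$$
\begin{aligned}
R &= g^{00} R_{00} + g^{ij} R_{ij} \\
&= +\frac{\ddot{a}}{a}\, d - d \left(-\frac{1}{a^2}\right) \left[a\ddot{a} + (d - 1)\, \dot{a}^2\right] \\
&= +2d\, \frac{\ddot{a}}{a} + d(d - 1)\, \frac{\dot{a}^2}{a^2} .
\end{aligned}
\tag{9.29}
$$

Using $D = d + 1$, we can write this in terms of the spacetime dimension $D$,

$$ R = +2(D - 1)\, \frac{\ddot{a}}{a} + (D - 1)(D - 2)\, \frac{\dot{a}^2}{a^2} . \tag{9.30} $$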
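As a quick sanity check on (9.29) and (9.30), here is a small sketch that evaluates the Ricci scalar for two trial scale factors. The choices $a = e^{Ht}$ and $a = t^{1/2}$ are illustrative assumptions made here for the check, as are the function names; units with $c = 1$ are assumed, as in (9.23).

```python
import math

def frw_ricci_scalar(a, a_dot, a_ddot, d=3):
    """Ricci scalar of eq. (9.29) for d spatial dimensions."""
    return 2 * d * a_ddot / a + d * (d - 1) * (a_dot / a) ** 2

def frw_ricci_scalar_spacetime_dim(a, a_dot, a_ddot, D=4):
    """Same quantity written with the spacetime dimension D, eq. (9.30)."""
    return 2 * (D - 1) * a_ddot / a + (D - 1) * (D - 2) * (a_dot / a) ** 2

# Trial 1 (assumed example): exponential scale factor a = exp(H t).
# Then a_dot = H a and a_ddot = H^2 a, so (9.29) gives R = d (d + 1) H^2.
H, t = 0.7, 1.3
a = math.exp(H * t)
R_exponential = frw_ricci_scalar(a, H * a, H * H * a)

# Trial 2 (assumed example): power law a = t^(1/2).
# The acceleration and velocity-squared terms cancel, so R = 0 for d = 3.
t = 2.0
R_power_law = frw_ricci_scalar(t ** 0.5, 0.5 * t ** -0.5, -0.25 * t ** -1.5)
```

The first trial shows both terms adding with the same sign, while the second shows them cancelling exactly, which illustrates how sensitively $R$ depends on the evolution of $a(t)$.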


The time evolution of this depends sensitively on the details of how the scale factor evolves. We will need to develop the Einstein equation to see how scale factor evolution is tied to the energy-momentum of the type of matter hanging out in the spacetime. Arbitrary scale factors $a(t)$ are not allowed; the Einstein equations will determine them in terms of the energy density and the pressures.


10 Thu.03.Oct

10.1 Geodesic deviation

Geodesics are generally not straight lines in curved spacetime. Physically, they deviate from one another, because of spacetime curvature. How can we make this intuition mathematically precise? Consider a one-parameter family of geodesics $\gamma_s(\lambda)$, where $\lambda$ is the affine parameter along the geodesic in question. The parameter $s \in \mathbb{R}$ tells you which geodesic you are referring to. We can choose coordinates $s$ and $\lambda$ on the manifold as long as the geodesics do not cross.

Then we have two naturally defined vector fields,

$$ S^\mu = \frac{\partial x^\mu}{\partial s} , \qquad T^\mu = \frac{\partial x^\mu}{\partial \lambda} . \tag{10.1} $$

A useful mnemonic here is that $S$ is for Separation while $T$ is for Tangent.

We would now like to build the covariant analogue of the ‘relative velocity’ between geodesics,

$$ V^\mu = T^\alpha \nabla_\alpha S^\mu , \tag{10.2} $$

and the ‘relative acceleration’

$$ A^\mu := T^\alpha \nabla_\alpha V^\mu . \tag{10.3} $$

Note that the acceleration of a path away from being a geodesic is different. That would be

$$ T^\alpha \nabla_\alpha T^\mu . \tag{10.4} $$

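Since our proposed definitions above are tensor equations, they are well-defined. Now, $S^\mu$ and $T^\mu$ are basis vectors adapted to a coordinate system, with coordinates $s$ and $\lambda$. Therefore,

$$ [S, T] = 0 . \tag{10.5} $$

On our way towards building the relative acceleration vector, we will need an identity for vector fields,

\begin{align}
[X, Y]^\mu &= X^\alpha \partial_\alpha Y^\mu - Y^\alpha \partial_\alpha X^\mu \tag{10.6} \\
&= X^\alpha \nabla_\alpha Y^\mu - Y^\alpha \nabla_\alpha X^\mu , \tag{10.7}
\end{align}

where the second line holds because the Christoffel terms cancel for a torsion-free connection.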


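This allows us to relate $S$-directional derivatives of $T$ to $T$-directional derivatives of $S$,

$$ S^\alpha \nabla_\alpha T^\mu = T^\alpha \nabla_\alpha S^\mu . \tag{10.8} $$

Now we can compute the relative acceleration vector.

\begin{align}
A^\mu &= T^\alpha \nabla_\alpha \left(T^\sigma \nabla_\sigma S^\mu\right) \tag{10.9} \\
&= T^\alpha \nabla_\alpha \left(S^\sigma \nabla_\sigma T^\mu\right) \notag \\
&= (T^\alpha \nabla_\alpha S^\sigma)(\nabla_\sigma T^\mu) + T^\alpha S^\sigma \left[\nabla_\sigma \nabla_\alpha T^\mu + R^\mu{}_{\nu\alpha\sigma} T^\nu\right] \notag \\
&= (S^\alpha \nabla_\alpha T^\sigma)(\nabla_\sigma T^\mu) + R^\mu{}_{\nu\alpha\sigma} T^\nu T^\alpha S^\sigma + \left[S^\sigma \nabla_\sigma (T^\alpha \nabla_\alpha T^\mu) - (S^\sigma \nabla_\sigma T^\alpha)(\nabla_\alpha T^\mu)\right] \notag \\
&= +R^\mu{}_{\nu\alpha\sigma} T^\nu T^\alpha S^\sigma , \tag{10.10}
\end{align}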

where we used (i) $[S, T] = 0$, (ii) $\nabla$ obeys the Leibniz rule and Riemann is defined in terms of a commutator of covariant derivatives, (iii) the Leibniz rule and rearranging terms, (iv) relabelling of dummy indices to cancel terms and $T$ being the tangent vector of a geodesic.

Summarizing, we have the geodesic deviation equation

$$ A^\mu = \frac{D^2 S^\mu}{D\lambda^2} = (\nabla_T \nabla_T S)^\mu = +R^\mu{}_{\nu\alpha\sigma} T^\nu T^\alpha S^\sigma . \tag{10.11} $$

Here we see how the Riemann curvature tensor governs the deviation of geodesics in a very precise way. The covariant acceleration deviation of this one-parameter family of geodesics is given by the Riemann tensor contracted with the tangent vector $T$ twice, on its second and third indices, and contracted with the separation vector $S$ once, on its fourth index.

10.2 Tidal forces and taking the Newtonian limit for Christoffels

Remember the tides? If you, like me, have spent any length of time near the ocean, then you know that the water level rises and falls twice a day. But do you know why? Newton first explained this in his Principia. Basically, oceanic water on the side near the Moon bulges because it is closer to the Moon than the ocean on the far side and hence feels stronger gravity; the bulge on the far side can be understood as arising from ‘centrifugal force’. So we see two tides per day. (Note: distances in the figure are not to scale.)


How do tidal forces work in Newtonian and Einsteinian gravity? Well, you cannot detect curvature using only one test particle, or only one geodesic. You need more than one to see the physical effects of curvature of space or spacetime. So let us think about geodesic deviation in the Newtonian limit, even before we recruit the heavy machinery of tensor analysis in curved spacetime and the Riemann tensor. We will soon see how Riemann and the Newtonian potential are connected by the Newtonian limit of weak gravity and slow speeds.

In an inertial frame, the equation of motion of the first particle moving in a Newtonian gravitational potential $\Phi(x^k)$ is

$$ \frac{d^2 x^i}{dt^2} = -\delta^{ij} \partial_j \Phi(x^k) . \tag{10.12} $$

Next, we define the vector $y^i$ to be the separation of the second particle from the first, which is assumed to be small. We have that

$$ \frac{d^2}{dt^2}(x^i + y^i) = -\delta^{ij} \partial_j \Phi(x^k + y^k) . \tag{10.13} $$

Taylor expanding gives

$$ \partial_j \Phi(x^k + y^k) = \partial_j \Phi(x^k) + \left(\partial_\ell \partial_j \Phi(x^k)\right) y^\ell + O(y^2) , \tag{10.14} $$

so that the Newtonian trajectory deviation equation is

$$ \frac{d^2}{dt^2} y^i = -\delta^{ij} (\partial_j \partial_k \Phi)\, y^k . \tag{10.15} $$

The left hand side is known as the tidal acceleration, and it is described by the second mixed partial derivatives of the Newtonian potential.

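For simplicity, let us ignore the fact that the Earth is rotating on its own axis as well as the rotation of the Earth around the Sun. Letting the Moon be at $(x, y, z) = (0, 0, d)$, we have for the Newtonian gravitational potential

$$ \Phi_m(x, y, z) = -\frac{G_N M_m}{\left[x^2 + y^2 + (z - d)^2\right]^{1/2}} . \tag{10.16} $$

From this we can calculate the second derivatives that control the acceleration deviation vector, evaluated at the origin,

$$ \left.\left(\frac{\partial^2 \Phi}{\partial x^i \partial x^j}\right)\right|_0 = +\frac{G_N M_m}{d^3}\, \mathrm{diag}(1, 1, -2) . \tag{10.17} $$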

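Why the asymmetry between the $z$ and $x, y$ directions? Simple. The functional dependence in the denominator is different. Writing $[\ldots]$ for $x^2 + y^2 + (z - d)^2$,

\begin{align}
\left.\frac{\partial^2 \Phi}{\partial x^2}\right|_0 &= -G_N M_m\, \partial_x \left.\left[2x \cdot \left(-\tfrac{1}{2}\right) [\ldots]^{-3/2}\right]\right|_0 \tag{10.18} \\
&= G_N M_m \left.\left[[\ldots]^{-3/2} + x \cdot 2x \cdot \left(-\tfrac{3}{2}\right) [\ldots]^{-5/2}\right]\right|_0 \tag{10.19} \\
&= \frac{G_N M_m}{d^3} + 0 \tag{10.20}
\end{align}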


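whereas

\begin{align}
\left.\frac{\partial^2 \Phi}{\partial z^2}\right|_0 &= -G_N M_m\, \partial_z \left.\left[2(z - d) \cdot \left(-\tfrac{1}{2}\right) [\ldots]^{-3/2}\right]\right|_0 \tag{10.21} \\
&= G_N M_m \left.\left[[\ldots]^{-3/2} + (z - d) \cdot 2(z - d) \cdot \left(-\tfrac{3}{2}\right) [\ldots]^{-5/2}\right]\right|_0 \tag{10.22} \\
&= +\frac{G_N M_m}{d^3} - 3\, \frac{G_N M_m d^2}{d^5} \tag{10.23} \\
&= -2\, \frac{G_N M_m}{d^3} . \tag{10.24}
\end{align}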
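The second derivatives in (10.17) can also be checked numerically. The sketch below takes finite differences of the potential (10.16) at the origin, in units where $G_N M_m = 1$ and $d = 1$; these units, the step size, and the function names are assumptions made for illustration.

```python
def phi_moon(x, y, z, gm=1.0, d=1.0):
    """Newtonian potential (10.16) of a point mass at (0, 0, d)."""
    return -gm / (x * x + y * y + (z - d) ** 2) ** 0.5

def hessian_at_origin(phi, h=1e-3):
    """Matrix of second partial derivatives of phi at the origin.

    Uses the four-point central-difference stencil for each pair (i, j).
    For i == j this reduces to a second difference with step 2h.
    """
    hessian = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            def shifted(sign_i, sign_j):
                point = [0.0, 0.0, 0.0]
                point[i] += sign_i * h
                point[j] += sign_j * h
                return phi(*point)
            hessian[i][j] = (
                shifted(1, 1) - shifted(1, -1) - shifted(-1, 1) + shifted(-1, -1)
            ) / (4 * h * h)
    return hessian

tidal_matrix = hessian_at_origin(phi_moon)
```

In these units the result should be close to `diag(1, 1, -2)`, with vanishing off-diagonal entries, matching (10.17), (10.20) and (10.24).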

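Another way to write the same set of equations is to use a unit normal vector $n^i = x^i / r$ pointing in the radial direction; then

$$ a_{ij} = -\left.\left(\frac{\partial^2 \Phi}{\partial x^i \partial x^j}\right)\right|_0 = -\left(\delta_{ij} - 3 n_i n_j\right) \frac{G_N M_m}{r^3} . \tag{10.25} $$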

This (tensor) equation tells us that you get stretched in the radial direction and squeezed in the transverse directions. Quite generally, you can think of gravity as a stretchy-squeezy force. This originates in the fact that gravitational interactions in our universe are transmitted by a spin-two boson known as the graviton. It has a polarization tensor rather than a polarization vector. After symmetries under arbitrary changes of coordinates are taken into account, there are two independent physical polarizations for the graviton in four spacetime dimensions, like there are for the photon. But please do not mistake one for the other: the photon only has spin one, and in dimensions other than $D = 3 + 1$ the numbers of independent physical polarizations of photons and gravitons will not match. That they do in $D = 3 + 1$ is a numerical accident.

How big are tidal forces, in orders of magnitude? First, we need to figure out which of the solar system bodies is relevant. If you do the calculation using the above formula for tidal accelerations, you find that the Moon is actually the biggest contributor, because although it is much lighter than the Sun (about 27,100,000 times) it is much closer (about 388 times), and it is the cube of the distance that counts. Plugging in the numbers, you will find that the Sun’s tidal acceleration is only about 45% of the Moon’s. So we focus on the Moon. We would like to compare the magnitude to the acceleration due to gravity. So, to get the order of magnitude, we are computing the ratio of the tidal force on a piece of ocean to the g-force,

$$ \frac{G_N M_M r_E}{d^3} \cdot \frac{r_E^2}{G_N M_E} \sim \frac{M_M}{M_E} \left(\frac{r_E}{d}\right)^3 \sim 10^{-7} . \tag{10.26} $$

Tidal forces might seem like teeny weeny forces, but when you multiply by entire oceans, you get physical effects that human beings can relate to.
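The two estimates above are easy to reproduce. In the sketch below, the Sun-to-Moon mass and distance ratios are the ones quoted in the text, while the Moon-to-Earth mass ratio, the Earth radius and the Earth-Moon distance are approximate values assumed here for illustration.

```python
# Ratios quoted in the text.
sun_to_moon_mass = 27.1e6
sun_to_moon_distance = 388.0

# Tidal acceleration scales as M / d^3, so the Sun-to-Moon ratio is:
sun_over_moon_tide = sun_to_moon_mass / sun_to_moon_distance ** 3

# Approximate values assumed for this estimate (not given in the notes).
moon_to_earth_mass = 0.0123
earth_radius_km = 6371.0
earth_moon_distance_km = 384400.0

# Ratio of the lunar tidal acceleration to g, following eq. (10.26).
tide_over_g = moon_to_earth_mass * (earth_radius_km / earth_moon_distance_km) ** 3
```

With these inputs `sun_over_moon_tide` comes out a little under one half, and `tide_over_g` is a few times $10^{-8}$, which is the order of magnitude quoted in (10.26).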

We can make a little table comparing what we have found in Newtonian gravity versus Einsteinian General Relativity so far.


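What: Newton; Einstein

gravity: $\Phi(x^i, t)$; $g_{\alpha\beta}(x^\lambda)$

test particle EOM: $\dfrac{d^2 x^i}{dt^2} = -\delta^{ij} \partial_j \Phi$; $\dfrac{d^2 x^\mu}{d\lambda^2} = -\Gamma^\mu{}_{\nu\sigma} \dfrac{dx^\nu}{d\lambda} \dfrac{dx^\sigma}{d\lambda}$

deviation: $\dfrac{d^2 y^i}{dt^2} = -\delta^{ij} \partial_j \partial_k \Phi \, y^k$; $\dfrac{D^2 S^\mu}{D\lambda^2} = +R^\mu{}_{\nu\sigma\rho} T^\nu T^\sigma S^\rho$

tidal forces: $\partial_i \partial_j \Phi$; $R^\rho{}_{\sigma\mu\nu} = +\partial_\mu \Gamma^\rho{}_{\nu\sigma} - \partial_\nu \Gamma^\rho{}_{\mu\sigma} + \Gamma^\rho{}_{\mu\lambda} \Gamma^\lambda{}_{\nu\sigma} - \Gamma^\rho{}_{\nu\lambda} \Gamma^\lambda{}_{\mu\sigma}$

gravity EOM: $\nabla^2 \Phi = 4\pi G_N \rho$; ??? (Einstein equations, coming soon!)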

In the Newtonian equation of motion for $\Phi$, $\rho$ is the mass density of whatever is sourcing the gravitational field, and $G_N$ is the Newton constant characterizing the strength of gravity.

In order to see how the covariant geodesic deviation equation reduces to the familiar Newtonian equations, we need to take the Newtonian limit in which gravity is weak and speeds are low. (Recall also that $x^0 = ct$ and we will need to put back the factors of $c$ here to make the approximation clear.) Either we can assume staticity, or we can note that $\partial_0 = \partial_t / c$, which is a factor $1/c$ smaller than $\partial_i$. In the Newtonian approximation, we treat the Newtonian potential as a perturbation on 1, and we will ignore terms of order $\Phi^2$ compared to terms of order $\Phi$.

In the weak-field limit, the line element is diagonal and quite simple,

$$ ds^2 = (1 + 2\Phi/c^2)\, c^2 dt^2 - (1 - 2\Phi/c^2)(dx^2 + dy^2 + dz^2) . \tag{10.27} $$

For the moment, you will need to take this equation on faith, as I have not yet developed the machinery required to see how it emerges. What I will do for now is to assume it as an ansatz, and show that it correctly gives back the familiar Newtonian limit in the limit of weak gravity and slow speeds. Later on in the course, I will give a fuller explanation of where this expression for the approximate line element comes from.

In the low-speed Newtonian limit, there is no difference between proper time and coordinate time $t$. The dynamical variables of interest become $x^\mu(\lambda) \to x^i(t)$. What this means is that we only need to consider the spatial components of the geodesic deviation equation, as the temporal component takes care of itself automatically. In the limit of slow speeds compared to the speed of light, then, we have

$$ \frac{d^2 y^i}{dt^2} = +R^i{}_{ttj}\, y^j . \tag{10.28} $$

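To check that this does reduce to the Newtonian expression we need to compute $R^i{}_{ttj}$ for the above line element. In the limit of weak gravity, we can find the components of the inverse metric to first order in $\Phi$,

$$ g^{tt} \simeq (1 - 2\Phi/c^2)/c^2 , \qquad g^{ij} \simeq -\delta^{ij} (1 + 2\Phi/c^2) . \tag{10.29} $$

For our general Christoffel symbol we have

$$ \Gamma^\mu{}_{\nu\lambda} = \frac{1}{2}\, g^{\mu\sigma} \left(\partial_\nu g_{\sigma\lambda} + \partial_\lambda g_{\sigma\nu} - \partial_\sigma g_{\nu\lambda}\right) , \tag{10.30} $$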


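so we can pick off the 0 and $i$ parts individually. Assuming that gravity is weak allows us to keep only first order terms in $\Phi$. Assuming that $\Phi$ does not depend on time (to first order in small quantities) sets some Christoffels to zero. For example,

$$ \Gamma^0{}_{00} = \frac{1}{2}\, g^{00} (\partial_0 g_{00}) = 0 , \tag{10.31} $$

and

$$ \Gamma^0{}_{ij} = \frac{1}{2}\, g^{00} \left(\partial_i g_{0j} + \partial_j g_{0i} - \partial_0 g_{ij}\right) = 0 , \tag{10.32} $$

and

$$ \Gamma^i{}_{0j} = \frac{1}{2}\, g^{ik} \left(\partial_0 g_{kj} + \partial_j g_{0k} - \partial_k g_{0j}\right) = 0 . \tag{10.33} $$

Then we have

$$ \Gamma^0{}_{0i} = \frac{1}{2}\, g^{00} \partial_i g_{00} \simeq \frac{1}{2} (1 - 2\Phi/c^2)\, \partial_i (1 + 2\Phi/c^2) \simeq \partial_i \Phi / c^2 \quad \Rightarrow \quad \Gamma^t{}_{ti} = \partial_i \Phi / c^2 . \tag{10.34} $$

Another nontrivial component is

$$ \Gamma^i{}_{00} = \frac{1}{2}\, g^{ik} \left(\partial_0 g_{k0} + \partial_0 g_{0k} - \partial_k g_{00}\right) = -\frac{1}{2}\, g^{ik} \partial_k g_{00} = \delta^{ik} \partial_k \Phi / c^2 \quad \Rightarrow \quad \Gamma^i{}_{tt} = \delta^{ik} \partial_k \Phi . \tag{10.35} $$

Finally, we have

\begin{align}
\Gamma^i{}_{jk} &= \frac{1}{2}\, g^{i\ell} \left(\partial_j g_{\ell k} + \partial_k g_{\ell j} - \partial_\ell g_{jk}\right) \tag{10.36} \\
&= \frac{1}{2}\, \delta^{i\ell} (1 + 2\Phi/c^2)(-2/c^2) \left(\delta_{\ell k} \partial_j \Phi + \delta_{\ell j} \partial_k \Phi - \delta_{jk} \partial_\ell \Phi\right) \tag{10.37} \\
\Rightarrow \quad \Gamma^i{}_{jk} &= \frac{1}{c^2} \left(-\delta^i{}_k \partial_j \Phi - \delta^i{}_j \partial_k \Phi + \delta^{i\ell} \delta_{jk} \partial_\ell \Phi\right) . \tag{10.38}
\end{align}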
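As a consistency check on these Christoffels, one can ask what the geodesic equation itself becomes in the same limit. The following is a sketch under the assumptions already stated in this section: slow motion, $|dx^i/dt| \ll c$, affine parameter $\lambda \simeq t$, and static $\Phi$. Keeping only the dominant $tt$ term of the geodesic equation and using (10.35) gives

```latex
\frac{d^2 x^i}{dt^2}
  \simeq -\Gamma^i{}_{tt}\, \frac{dt}{dt}\, \frac{dt}{dt}
  = -\delta^{ik}\, \partial_k \Phi ,
```

which is the Newtonian equation of motion (10.12). The terms involving $\Gamma^i{}_{jk}$ are suppressed by $v^2/c^2$ relative to this one, and $\Gamma^i{}_{0j}$ vanishes by (10.33).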


11 Mon.07.Oct

11.1 Newtonian limit for Riemann

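From the Christoffels we computed last time, we can compute the Riemann components,

\begin{align}
R^t{}_{xtx} &= -\frac{1}{c^2} \frac{\partial^2 \Phi}{\partial x^2} , \tag{11.1} \\
R^t{}_{xty} &= -\frac{1}{c^2} \frac{\partial^2 \Phi}{\partial x \partial y} , \tag{11.2} \\
R^x{}_{yxy} &= +\frac{1}{c^2} \left(\frac{\partial^2 \Phi}{\partial x^2} + \frac{\partial^2 \Phi}{\partial y^2}\right) , \tag{11.3} \\
R^x{}_{yxz} &= +\frac{1}{c^2} \left(\frac{\partial^2 \Phi}{\partial y \partial z}\right) , \tag{11.4}
\end{align}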

plus eight more equations from cyclic permutations of $(x, y, z)$. Note that we do not obtain any squares of partial derivatives here in our Riemanns because we are only working to first order in the Newtonian potential $\Phi$. Then, using our geodesic deviation equation in the Newtonian limit, we have

$$ \frac{d^2 y^i}{dt^2} = +R^i{}_{ttj}\, y^j . \tag{11.5} $$

Since we also know that

$$ R^i{}_{tjt} = \partial_j \Gamma^i{}_{tt} - 0 = +\partial_j (\delta^{ik} \partial_k \Phi) = +\delta^{ik} \partial_j \partial_k \Phi , \tag{11.6} $$

and that $R^i{}_{ttj} = -R^i{}_{tjt}$ by antisymmetry in the last two indices, we can see that the General Relativistic geodesic deviation equation involving Riemann gives back the Newtonian expression (10.15), which is exactly what we set out to prove last time.

To illustrate the abstract concept of geodesic deviation, let us work a very simple example. Suppose that we have a two-sphere of unit radius with line element

$$ d\Omega_2^2 = d\theta^2 + \sin^2\theta \, d\phi^2 . \tag{11.7} $$

If you did the Homework 1 assignment, you will already know how to find the Christoffels for this case. There are only two that are nonzero,

$$ \Gamma^\theta{}_{\phi\phi} = -\sin\theta\cos\theta , \qquad \Gamma^\phi{}_{\phi\theta} = +\cot\theta . \tag{11.8} $$

Denoting $d/d\lambda$ by an overdot, we find for the geodesic equation

\begin{align}
\ddot{\theta} - \sin\theta\cos\theta \, \dot{\phi}^2 &= 0 , \tag{11.9} \\
\ddot{\phi} + 2\cot\theta \, \dot{\theta} \dot{\phi} &= 0 . \tag{11.10}
\end{align}

These are coupled second order nonlinear ODEs, and solving them can be a battle if you do not choose your initial conditions cleverly.


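If we wish, we can use spherical symmetry to pick a particular initial condition to make integrating these equations simpler. We choose the initial conditions

$$ \theta(\lambda)|_{\lambda=0} = \frac{\pi}{2} , \qquad \dot{\theta}(\lambda)|_{\lambda=0} = -\Omega_0 , \qquad \phi(\lambda)|_{\lambda=0} = 0 , \qquad \dot{\phi}(\lambda)|_{\lambda=0} = 0 . \tag{11.11} $$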

This corresponds to pointing your tangent vector down a line of longitude. The constant $\Omega_0$ is the angular speed with which the polar angle $\theta$ is changing with the affine parameter $\lambda$.

What is Riemann? The only nonzero component on the 2-sphere $S^2$ is

$$ R^\theta{}_{\phi\theta\phi} = +\sin^2\theta . \tag{11.12} $$

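So the components of the geodesic deviation acceleration are

$$
\begin{aligned}
A^\theta &= R^\theta{}_{\nu\alpha\sigma} T^\nu T^\alpha S^\sigma \\
&= R^\theta{}_{\phi\theta\phi} T^\phi T^\theta S^\phi + R^\theta{}_{\phi\phi\theta} T^\phi T^\phi S^\theta \\
&= +\sin^2\theta \left[T^\phi T^\theta S^\phi - (T^\phi)^2 S^\theta\right] ,
\end{aligned}
\tag{11.13}
$$

and

$$
\begin{aligned}
A^\phi &= R^\phi{}_{\theta\theta\phi} T^\theta T^\theta S^\phi + R^\phi{}_{\theta\phi\theta} T^\theta T^\phi S^\theta \\
&= +\left[T^\theta T^\phi S^\theta - (T^\theta)^2 S^\phi\right] .
\end{aligned}
\tag{11.14}
$$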

Now we need to specify $S$ and $T$. Since the tangent vector to a geodesic running down a line of longitude points in the (negative of the) polar direction, and the separation vector between two adjacent such geodesics points in the azimuthal direction, we have that

$$ T^\theta = -\Omega_0 , \qquad T^\phi = 0 , \qquad S^\theta = 0 , \qquad S^\phi = 1 . \tag{11.15} $$

So

$$ A^\theta = 0 , \qquad A^\phi = -\Omega_0^2 . \tag{11.16} $$

The magnitude is what you should expect for an angular acceleration of the type represented here. The minus sign is physical.
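One way to see that the minus sign is physical is to integrate the geodesic equations (11.9) and (11.10) numerically and watch two neighbouring lines of longitude approach each other. The sketch below makes several assumptions that are not in the notes: a hand-written fourth-order Runge-Kutta integrator, the use of $\sin\theta \, \delta\phi$ (read off from the metric (11.7)) as the proper separation of two meridians a small angle $\delta\phi$ apart, and a finite-difference second derivative at $\lambda = 0$.

```python
import math

def geodesic_rhs(state):
    """Right-hand side of the geodesic equations (11.9)-(11.10) as a first-order system."""
    theta, phi, theta_dot, phi_dot = state
    theta_ddot = math.sin(theta) * math.cos(theta) * phi_dot ** 2
    phi_ddot = -2.0 * (math.cos(theta) / math.sin(theta)) * theta_dot * phi_dot
    return (theta_dot, phi_dot, theta_ddot, phi_ddot)

def rk4_step(state, step):
    """Advance the state by one classical Runge-Kutta step in the affine parameter."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, step / 2))
    k3 = geodesic_rhs(shifted(state, k2, step / 2))
    k4 = geodesic_rhs(shifted(state, k3, step))
    return tuple(
        s + step / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def meridian_separation(lam, omega0, dphi, steps=100):
    """Proper distance sin(theta) * dphi between two nearby meridian geodesics at parameter lam."""
    state = (math.pi / 2, 0.0, -omega0, 0.0)  # initial conditions (11.11)
    step = lam / steps
    for _ in range(steps):
        state = rk4_step(state, step)
    return math.sin(state[0]) * dphi

def relative_acceleration(omega0, dphi=1e-3, dlam=1e-2):
    """Second derivative of the separation at lam = 0, by central differences."""
    s_minus = meridian_separation(-dlam, omega0, dphi)
    s_zero = meridian_separation(0.0, omega0, dphi)
    s_plus = meridian_separation(dlam, omega0, dphi)
    return (s_plus - 2 * s_zero + s_minus) / dlam ** 2

omega0 = 0.7
ratio = relative_acceleration(omega0) / 1e-3  # acceleration per unit initial separation
```

Here `ratio` should be close to $-\Omega_0^2$, in line with (11.16): geodesics that start out parallel at the equator accelerate towards one another.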

It is possible to get considerably more sophisticated in discussing the physics of geodesic deviation. In order to derive more precise equations, one studies a congruence of geodesics, which is a set of curves in an open region of spacetime such that every point in the region lies on precisely one curve. The story of how geodesics deviate can be expressed in more sophisticated tensor language by studying the covariant derivative of the four-velocity vector $\nabla_\mu U_\nu$ and decomposing it into three independent parts: (a) the trace part $\theta$, known as the expansion of the congruence, (b) the symmetric traceless part $\sigma_{\mu\nu}$, known as the shear of the congruence, and (c) the antisymmetric part $\omega_{\mu\nu}$, known as the rotation of the congruence. Each of these affects the evolution of the others, and the equations obtained are different for massive and massless particles. We will not show the details here because the algebra is too long-winded.


11.2 Riemann normal coordinates and the Bianchi identity

Riemann normal coordinates are a handy coordinate system that you can always use based about any point $p$. They are defined in a smallish patch in the neighbourhood of $p$, and do not necessarily extend infinitely in all directions, as we will explain when we talk about geodesic deviation soon. But they are a great little coordinate system that you can use to evaluate tensor equations, and to help prove tensor equations. We will use the notational convention that equations written in Riemann normal coordinates have bars over the tensors. Strictly speaking we should also bar all the indices, but this is beyond my typing patience at present, so please imagine barred indices everywhere in your head.

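A Riemann normal coordinate system is one built using geodesics about a point $p$. More concretely for our purposes, it is the coordinate system in which the metric is locally Minkowskian, and the Christoffels are zero – at the point $p$,

$$ \bar{\Gamma}^\mu{}_{\alpha\beta} = 0 . \tag{11.17} $$

Then, since $\nabla_\sigma g_{\alpha\beta} = 0$ everywhere, including at $p$,

\begin{align}
\nabla_\sigma \bar{g}_{\mu\nu} &= \partial_\sigma \bar{g}_{\mu\nu} - \bar{\Gamma}^\lambda{}_{\sigma\mu} \bar{g}_{\lambda\nu} - \bar{\Gamma}^\lambda{}_{\sigma\nu} \bar{g}_{\lambda\mu} \tag{11.18} \\
&= \partial_\sigma \bar{g}_{\mu\nu} + 0 = 0 . \tag{11.19}
\end{align}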

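Therefore, in a Riemann normal coordinate system, we have the special relations

\begin{align}
\partial_\sigma \bar{g}_{\mu\nu} &= 0 , \tag{11.20} \\
\bar{\Gamma}^\alpha{}_{\lambda\sigma} &= 0 , \tag{11.21} \\
\bar{R}^\mu{}_{\nu\sigma\rho} &= \partial_\sigma \bar{\Gamma}^\mu{}_{\nu\rho} - \partial_\rho \bar{\Gamma}^\mu{}_{\nu\sigma} . \tag{11.22}
\end{align}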

As you can imagine, using this coordinate system we can more quickly check tensor equations. This is not a trick – tensor equations are valid in any coordinate system. Therefore, they must hold in any frame, including the Riemann normal coordinate frame in which our tensor components simplify. This conceptual tool can be super handy.

We are now going to make use of this special coordinate system to identify all the symmetries of Riemann. This is an important quest, because it will enable us to compute how many independent components Riemann has in arbitrary spacetime dimension $D = d + 1$. In turn, that helps us understand the physics of this four-legged tensor. Computing it can be arduous for a general spacetime, and this is why I set computer algebra as part of HW1.

To help us find the symmetries, it helps to start by using the spacetime metric to build the (0,4) version of Riemann from the natural (1,3) version,

$$ R_{\alpha\beta\mu\nu} = g_{\alpha\lambda} R^\lambda{}_{\beta\mu\nu} . \tag{11.23} $$

The first symmetry we can notice by inspection of the formula for Riemann in terms of Christoffels. We see immediately that Riemann is antisymmetric upon exchange of its final two indices,

$$ R^\rho{}_{\sigma\mu\nu} = -R^\rho{}_{\sigma\nu\mu} . \tag{11.24} $$


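In Riemann normal coordinates,

\begin{align}
\bar{R}_{\rho\sigma\mu\nu} &= \bar{g}_{\rho\lambda} \left(\partial_\mu \bar{\Gamma}^\lambda{}_{\nu\sigma} - \partial_\nu \bar{\Gamma}^\lambda{}_{\mu\sigma}\right) \tag{11.25} \\
&= \bar{g}_{\rho\lambda}\, \partial_\mu \left[\frac{1}{2}\, \bar{g}^{\lambda\alpha} \left(\partial_\sigma \bar{g}_{\nu\alpha} + \partial_\nu \bar{g}_{\sigma\alpha} - \partial_\alpha \bar{g}_{\nu\sigma}\right)\right] - (\mu \leftrightarrow \nu) \tag{11.26} \\
&= \frac{1}{2}\, \bar{g}_{\rho\lambda} \bar{g}^{\lambda\alpha} \left(\partial_\mu \partial_\sigma \bar{g}_{\nu\alpha} + \partial_\mu \partial_\nu \bar{g}_{\sigma\alpha} - \partial_\mu \partial_\alpha \bar{g}_{\nu\sigma}\right) - (\mu \leftrightarrow \nu) \tag{11.27} \\
&= \frac{1}{2} \left(\partial_\mu \partial_\sigma \bar{g}_{\nu\rho} + \partial_\mu \partial_\nu \bar{g}_{\sigma\rho} - \partial_\mu \partial_\rho \bar{g}_{\nu\sigma}\right) - (\mu \leftrightarrow \nu) \tag{11.28} \\
&= \frac{1}{2} \left(\partial_\mu \partial_\sigma \bar{g}_{\nu\rho} - \partial_\mu \partial_\rho \bar{g}_{\nu\sigma}\right) - (\mu \leftrightarrow \nu) , \tag{11.29}
\end{align}

where in the third line above we used the fact that $\partial_\mu \bar{g}^{\lambda\alpha} = 0$ in Riemann normal coordinates, and in the last step we used the fact that $\partial_\mu \partial_\nu \bar{g}_{\sigma\rho}$ is symmetric in $\mu \leftrightarrow \nu$ and so drops out. Therefore, we can see two additional identities satisfied by Riemann,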

$$ R_{\rho\sigma\mu\nu} = -R_{\sigma\rho\mu\nu} , \tag{11.30} $$

i.e., Riemann is antisymmetric upon exchange of its first two indices as well as its last two, and

$$ R_{\rho\sigma\mu\nu} = R_{\mu\nu\rho\sigma} , \tag{11.31} $$

i.e., Riemann is symmetric under interchange of the first two indices with the last two.

We can also look at a version of Riemann with cyclic permutations on the last three indices,

$$ Q_{\rho\sigma\mu\nu} := R_{\rho\sigma\mu\nu} + R_{\rho\mu\nu\sigma} + R_{\rho\nu\sigma\mu} . \tag{11.32} $$

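Evaluating again in Riemann normal coordinates gives

\begin{align}
2\bar{Q}_{\rho\sigma\mu\nu} &= (\partial_\mu \partial_\sigma \bar{g}_{\nu\rho} - \partial_\mu \partial_\rho \bar{g}_{\nu\sigma}) + (\partial_\nu \partial_\mu \bar{g}_{\rho\sigma} - \partial_\nu \partial_\rho \bar{g}_{\mu\sigma}) + (\partial_\sigma \partial_\nu \bar{g}_{\mu\rho} - \partial_\sigma \partial_\rho \bar{g}_{\mu\nu}) - (\mu \leftrightarrow \nu) \tag{11.33} \\
&= \partial_\rho \left(-\partial_\mu \bar{g}_{\nu\sigma} - \partial_\nu \bar{g}_{\mu\sigma} - \partial_\sigma \bar{g}_{\mu\nu} + \partial_\nu \bar{g}_{\mu\sigma} + \partial_\mu \bar{g}_{\nu\sigma} + \partial_\sigma \bar{g}_{\nu\mu}\right) \notag \\
&\quad + \partial_\sigma \left(\partial_\mu \bar{g}_{\nu\rho} + \partial_\nu \bar{g}_{\mu\rho} - \partial_\nu \bar{g}_{\mu\rho} - \partial_\mu \bar{g}_{\nu\rho}\right) + \partial_\mu \left(\partial_\nu \bar{g}_{\sigma\rho}\right) - \partial_\nu \left(\partial_\mu \bar{g}_{\sigma\rho}\right) \tag{11.34} \\
&= (0) + (0) + 0 - 0 \tag{11.35} \\
&= 0 , \tag{11.36}
\end{align}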

where we have used the fact that mixed partial derivatives commute and the fact that the metric is symmetric. Because of the antisymmetry properties, an equivalent way of writing this is (check this!)

$$ R_{\rho[\sigma\mu\nu]} = 0 , \tag{11.37} $$

and it immediately follows from this that

$$ R_{[\rho\sigma\mu\nu]} = 0 , \tag{11.38} $$

i.e., the totally antisymmetric part of Riemann vanishes too. With straightforward but tedious algebra of very similar type, we can also derive the Bianchi identity which governs covariant derivatives of Riemann. It can be written in (at least) two ways that look different but carry the same content, being related by the symmetries of Riemann. The first form is

$$ \nabla_\lambda R_{\rho\sigma\mu\nu} + \nabla_\rho R_{\sigma\lambda\mu\nu} + \nabla_\sigma R_{\lambda\rho\mu\nu} = 0 , \tag{11.39} $$

and the second form is

$$ \nabla_{[\lambda} R_{\mu\nu]\rho\sigma} = 0 , \tag{11.40} $$

which constrains Riemann by relating components at different points. You can think of the Bianchi identity for Riemann as like a Jacobi identity for covariant derivatives,

$$ [[\nabla_\mu, \nabla_\nu], \nabla_\lambda] + [[\nabla_\nu, \nabla_\lambda], \nabla_\mu] + [[\nabla_\lambda, \nabla_\mu], \nabla_\nu] = 0 . \tag{11.41} $$

11.3 The information in Riemann

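Now we have all the ingredients we need in order to compute the number of independent Riemann coefficients. We know that as a (0,4) tensor Riemann satisfies

\begin{align}
R_{\alpha\beta\gamma\delta} &= -R_{\alpha\beta\delta\gamma} , \tag{11.42} \\
R_{\alpha\beta\gamma\delta} &= -R_{\beta\alpha\gamma\delta} , \tag{11.43} \\
R_{\alpha\beta\gamma\delta} &= R_{\gamma\delta\alpha\beta} , \tag{11.44} \\
R_{[\alpha\beta\gamma\delta]} &= 0 . \tag{11.45}
\end{align}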

Suppose that we bunch the indices of Riemann in twos. Then we can think of Riemann as like a symmetric combination of two antisymmetric blocks. Recall that the number of independent components of an antisymmetric $D \times D$ matrix is $D(D-1)/2$ while that of a symmetric matrix is $D(D+1)/2$. Then the number of independent components of Riemann should be

$$ n_R(D) = \frac{1}{2} \left[\frac{1}{2} D(D-1)\right] \left[\frac{1}{2} D(D-1) + 1\right] - \binom{D}{4} . \tag{11.46} $$

We obtained this by using the symmetries of the first three identities to compute the tentative total and then subtracting off the number of completely antisymmetric components to satisfy the fourth identity. This process works because the four constraints are independent. Then, with very simple algebra, we obtain

$$ n_R(D) = \frac{1}{12} D^2 (D^2 - 1) . \tag{11.47} $$

Notice a few things about this formula. In one spacetime dimension, $n_R(1) = 0$ and Riemann has no components. This makes sense, as there is only one independent direction, so you cannot build a nonzero commutator of covariant derivatives. There is not enough room in spacetime to build a parallelogram. In two spacetime dimensions, we have $n_R(2) = 1$ and Riemann has just one independent component. This makes gravitational physics in $D = 1 + 1$ quite easy compared to higher dimensions. In three spacetime dimensions, we get $n_R(3) = 6$, and in four spacetime dimensions we have $n_R(4) = 20$. This number is, not accidentally, equal to the number of degrees of freedom in the second partial derivatives of the metric that we cannot set to zero by a clever choice of coordinate system when Taylor expanding the metric.


As we keep going up in dimension, $n_R(D)$ proliferates like a quartic polynomial of $D$. By the time we get to ten or eleven spacetime dimensions, we are dealing with $n_R(10) = 825$ or $n_R(11) = 1210$ independent components! This is why we often use computer algebra in research, when calculating in spacetime dimensions relevant to string theory. Of course, it is also possible with clever techniques to cut through the algebra and find quicker ways to calculate analytically, when your metric is diagonal or sparse in other significant ways.
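The counting argument and the closed form can be compared directly. This short sketch implements (11.46) and (11.47) as two separate functions and tabulates them; the function names are illustrative.

```python
from math import comb

def n_riemann_counting(D):
    """Independent components via the counting argument of eq. (11.46)."""
    pairs = D * (D - 1) // 2  # independent antisymmetric index pairs
    symmetric_blocks = pairs * (pairs + 1) // 2  # symmetric matrix on the pair index
    return symmetric_blocks - comb(D, 4)  # remove totally antisymmetric part

def n_riemann_closed_form(D):
    """Independent components via the closed form of eq. (11.47)."""
    return D * D * (D * D - 1) // 12

# Tabulate both for the dimensions mentioned in the text.
table = {D: (n_riemann_counting(D), n_riemann_closed_form(D)) for D in (1, 2, 3, 4, 10, 11)}
```

Both functions return 0, 1, 6, 20, 825 and 1210 for $D = 1, 2, 3, 4, 10, 11$, which are the values quoted in the text.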


12 Thu.10.Oct

12.1 Lie derivatives

So far we have developed covariant derivatives and curvature, which required having a Christoffel connection. An interesting fact is that there are some structures that can be defined on a curved spacetime manifold even without reference to a connection or curvature. We will introduce the idea of the Lie derivative (see footnote 9) today, because studying it acting on the metric tensor of spacetime will lead us to the General Relativistic version of Noether’s Theorem, which is one of the most important ideas of all time in theoretical physics. We will find that a symmetry of the spacetime metric gives an integral of the motion, a conserved quantity, which we can use to help solve for trajectories of test particles in some important cases.

The key concept we will need for our discussion of Noether’s Theorem is how to take a Lie derivative along the congruence defined by a vector field. So, first things first, what is a congruence? On a spacetime manifold, a congruence is a set of curves that fill the manifold (or more generally some part of it) without intersecting. Therefore, the congruence provides a mapping of a manifold onto itself, in the following sense. If the parameter on the curves is $\lambda$, then any tiny $\Delta\lambda$ defines a mapping, where each point is advanced by $\Delta\lambda$ along the same curve in the congruence. This is a 1-1 mapping if the vector field is $C^1$, and if it is $C^\infty$ it is called a diffeomorphism. If there is such a map for any $\Delta\lambda$, then we have a one-parameter Lie group, and the mapping is called a Lie dragging along the congruence.

Suppose that we have a scalar function $f$ defined on our spacetime manifold. Then our above mapping defined by $\Delta\lambda$ lets us define a new function $f^*_{\Delta\lambda}$ in the obvious way: if a point $P$ on a certain curve in the congruence gets mapped to the point $Q$, then

$$ f(P) = f^*_{\Delta\lambda}(Q) . \tag{12.1} $$

If it happens that we have a function for which the new value $f^*_{\Delta\lambda}(Q)$ is equal to the old one $f(Q)$, for all $Q$,

$$ f = f^*_{\Delta\lambda} , \tag{12.2} $$

then the function is invariant under the mapping. If it is invariant for all $\Delta\lambda$, then the function is said to be Lie dragged. In less fancy language,

$$ \frac{df}{d\lambda} = 0 . \tag{12.3} $$

Acting on any given tensor, the Lie derivative along some vector field $V$, written as $\mathcal{L}_V$, measures how fast the tensor changes along integral curves of $V$. Acting on a scalar function $f$, that is just the directional derivative,

$$ \mathcal{L}_V(f) = V^\lambda \partial_\lambda f . \tag{12.4} $$

Note that this is the partial derivative: we have not involved any affine connection here.

Footnote 9: Pronunciation note: “Lie” rhymes with “see”.


What about a vector field? Any vector field $V$ is defined by the congruence of curves for which it is the tangent field,

$$ V^\mu = \frac{dx^\mu}{d\lambda} . \tag{12.5} $$

A familiar example from undergraduate electromagnetism is that the magnetic flux lines are the integral curves of the magnetic field 3-vector. Now suppose that we have two general vector fields $X$ and $Y$. Recall that any vector $V$ can be expanded in the coordinate basis as $V = V^\mu \partial_\mu$. Then we can define the commutator $[X, Y]$ of two vector fields via

$$ [X, Y](f) \equiv X(Y(f)) - Y(X(f)) , \tag{12.6} $$

where $f$ is an arbitrary function. The neat thing about $[X, Y]$ is that it is a bona fide vector field: it is linear,

$$ [X, Y](af + bg) = a[X, Y]f + b[X, Y]g , \tag{12.7} $$

and it obeys the Leibniz rule,

$$ [X, Y](fg) = f[X, Y]g + g[X, Y]f . \tag{12.8} $$

In the coordinate basis, the new vector field $[X, Y]$ has components

$$ [X, Y]^\mu = X^\lambda \partial_\lambda Y^\mu - Y^\lambda \partial_\lambda X^\mu . \tag{12.9} $$

This is a well-defined tensor, because the non-tensorial pieces from the partial derivativescancel by antisymmetry of the commutator. If you prefer, you can write the above formulawith covariant derivatives instead – that way, it looks more tensorial.

Suppose that we adapt our coordinate system so that $V = \partial/\partial x^d$. The utility of choosing this coordinate system is that a diffeomorphism by λ amounts to a coordinate transformation from $(x^0, x^1, \ldots, x^d)$ to $(x^0, x^1, \ldots, x^d + \lambda)$. Then the components of a different vector $T^\mu$ pulled back from the transformed point to the original are simply $T^\mu(x^0, x^1, \ldots, x^d + \lambda)$. In this coordinate system, the Lie derivative then becomes

$$ \mathcal{L}_V T^\mu = \frac{\partial}{\partial x^d} T^\mu \,. \tag{12.10} $$

This expression is clearly not covariant, but we know that for two vector fields V and T the commutator [V, T] is a well-defined tensor, and in this coordinate system it happens to have components

$$ [V, T]^\mu = V^\nu \partial_\nu T^\mu - T^\nu \partial_\nu V^\mu = \frac{\partial T^\mu}{\partial x^d} \,. \tag{12.11} $$

Since both $\mathcal{L}_V T$ and [V, T] are vectors (rank (1, 0) tensors), and their components agree in this coordinate system, they must be equal in every coordinate system, and so we finally have the formula we want,

$$ \mathcal{L}_V T = [V, T] \,. \tag{12.12} $$

This quantity on the RHS is called the Lie bracket. The equation itself says that how the vector T changes along integral curves of another vector V is encoded in the commutator of the two vector fields.
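As a hedged illustration of the component formula (12.9), the sketch below evaluates a Lie bracket numerically with central differences. The three rotation generators on flat $\mathbb{R}^3$ in Cartesian coordinates, the sample point, and the helper name `bracket` are all choices made here for illustration, not part of the notes.

```python
def bracket(X, Y, p, h=1e-5):
    """Components of [X, Y]^mu = X^l d_l Y^mu - Y^l d_l X^mu at the point p,
    with partial derivatives approximated by central differences."""
    n = len(p)

    def along(F, W):
        # W^l d_l F^mu evaluated at p
        out = [0.0] * n
        Wp = W(p)
        for l in range(n):
            plus, minus = list(p), list(p)
            plus[l] += h
            minus[l] -= h
            Fp, Fm = F(plus), F(minus)
            for m in range(n):
                out[m] += Wp[l] * (Fp[m] - Fm[m]) / (2 * h)
        return out

    a = along(Y, X)   # X^l d_l Y^mu
    b = along(X, Y)   # Y^l d_l X^mu
    return [a[m] - b[m] for m in range(n)]

# Illustrative vector fields: rotation generators on R^3, components (V^x, V^y, V^z)
R = lambda p: [-p[1], p[0], 0.0]   # x d_y - y d_x
S = lambda p: [p[2], 0.0, -p[0]]   # z d_x - x d_z
T = lambda p: [0.0, -p[2], p[1]]   # y d_z - z d_y

point = [0.3, -1.2, 0.7]           # arbitrary sample point
lie_R_S = bracket(R, S, point)     # Lie derivative of S along R, i.e. [R, S]
```

For these particular fields the bracket closes on the third generator, [R, S] = T, which is the familiar rotation algebra; comparing `lie_R_S` with `T(point)` checks this at the sample point.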


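The formula for the action of the Lie derivative on covariant vectors follows directly from what we have just derived for contravariant vectors and the Leibniz rule. For a general rank $(k, \ell)$ tensor, the Lie derivative is

$$ \begin{aligned} \mathcal{L}_V (T)^{\mu_1\ldots\mu_k}{}_{\nu_1\ldots\nu_\ell} ={}& V^\sigma \partial_\sigma T^{\mu_1\ldots\mu_k}{}_{\nu_1\ldots\nu_\ell} \\ & - (\partial_\lambda V^{\mu_1})\, T^{\lambda\mu_2\ldots\mu_k}{}_{\nu_1\ldots\nu_\ell} - \ldots \\ & + (\partial_{\nu_1} V^\lambda)\, T^{\mu_1\ldots\mu_k}{}_{\lambda\nu_2\ldots\nu_\ell} + \ldots \,. \end{aligned} \tag{12.13} $$

This equation may make you a bit uncomfortable because it involves partial derivatives. In fact, if you do the straightforward but tedious algebra, you will find that it is just as valid with covariant derivatives replacing the partial ones,

$$ \begin{aligned} \mathcal{L}_V (T)^{\mu_1\ldots\mu_k}{}_{\nu_1\ldots\nu_\ell} ={}& V^\sigma \nabla_\sigma T^{\mu_1\ldots\mu_k}{}_{\nu_1\ldots\nu_\ell} \\ & - (\nabla_\lambda V^{\mu_1})\, T^{\lambda\mu_2\ldots\mu_k}{}_{\nu_1\ldots\nu_\ell} - \ldots \\ & + (\nabla_{\nu_1} V^\lambda)\, T^{\mu_1\ldots\mu_k}{}_{\lambda\nu_2\ldots\nu_\ell} + \ldots \,. \end{aligned} \tag{12.14} $$

This equation certainly looked less tensorial written the first way. But the first equation has the advantage that it makes clear that no connection is necessary to define Lie derivatives of tensors. It is an independent structure.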

12.2 Killing vectors and tensors

In this section, we will be especially interested in the expression above for the Lie derivative of the metric tensor, which characterizes everything about gravity in our spacetime. We have

$$ \mathcal{L}_V (g)_{\mu\nu} = \nabla_\mu V_\nu + \nabla_\nu V_\mu \,. \tag{12.15} $$

So if

$$ 0 = \nabla_\mu K_\nu + \nabla_\nu K_\mu \,, \tag{12.16} $$

for some vector K, the metric is unchanged. K is known as a Killing vector, and the metric is unchanged along its integral curves, i.e., it has a symmetry. This is Noether’s Theorem in curved spacetime, and it plays an extremely important role in the physics of GR. So, what is the corresponding conservation law?

Consider the quantity K · p. Its covariant derivative is

$$ \nabla_\mu (K_\lambda p^\lambda) = (\nabla_\mu K_\lambda)\, p^\lambda + K_\lambda (\nabla_\mu p^\lambda) \,. \tag{12.17} $$

Contracting this with $p^\mu$ gives

$$ p^\mu \nabla_\mu (K_\lambda p^\lambda) = p^\mu p^\lambda \nabla_\mu K_\lambda + K_\lambda\, p^\mu (\nabla_\mu p^\lambda) \,, \tag{12.18} $$

and the second term disappears by the geodesic equation. The first term can also be seen to vanish by virtue of symmetry and the Killing vector equation. So the Killing equation implies conservation of K · p along geodesics. More generally, if we have a Killing tensor obeying

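$$ \nabla_{(\mu} K_{\nu_1\ldots\nu_\ell)} = 0 \tag{12.19} $$

then

$$ p^\mu \nabla_\mu \left( K_{\nu_1\ldots\nu_\ell}\, p^{\nu_1} \cdots p^{\nu_\ell} \right) = 0 \,. \tag{12.20} $$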
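To spell out why the first term on the right of (12.18) vanishes, a short worked step may help. It uses only the symmetry of $p^\mu p^\lambda$ and the Killing equation (12.16).

```latex
\begin{align*}
p^\mu p^\lambda \nabla_\mu K_\lambda
  &= \tfrac{1}{2}\, p^\mu p^\lambda
     \left( \nabla_\mu K_\lambda + \nabla_\lambda K_\mu \right)
     && \text{(since $p^\mu p^\lambda$ is symmetric in $\mu,\lambda$)} \\
  &= 0
     && \text{(by the Killing equation (12.16))} .
\end{align*}
```

The same symmetrization argument, applied to the totally symmetric product $p^\mu p^{\nu_1} \cdots p^{\nu_\ell}$ contracted with (12.19), is what gives (12.20).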

Finding conserved quantities is physically important because it can help us solve for geodesics. Soon, when we introduce black holes, we will see just how crucial conserved quantities can be in analyzing geodesic motion and its physical consequences. So let us derive an alternative form of the geodesic equation which will be handy for future reference. What is the directional covariant derivative of the downstairs version of the tangent vector to the curve $x^\mu(\lambda)$?

$$ \frac{D}{D\lambda} \left( \frac{dx_\mu}{d\lambda} \right) = \frac{d^2 x_\mu}{d\lambda^2} - \Gamma^\alpha_{\sigma\mu} \frac{dx^\sigma}{d\lambda} \frac{dx_\alpha}{d\lambda} \,. \tag{12.21} $$

This should be zero for geodesics, giving

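$$ \begin{aligned} \frac{d^2 x_\mu}{d\lambda^2} &= \frac{1}{2} g^{\alpha\beta} \left( \partial_\sigma g_{\beta\mu} + \partial_\mu g_{\beta\sigma} - \partial_\beta g_{\sigma\mu} \right) \frac{dx^\sigma}{d\lambda} \frac{dx_\alpha}{d\lambda} \,, \\ &= \frac{1}{2} \left( \partial_\sigma g_{\beta\mu} + \partial_\mu g_{\beta\sigma} - \partial_\beta g_{\sigma\mu} \right) \frac{dx^\sigma}{d\lambda} \frac{dx^\beta}{d\lambda} \,, \\ &= \frac{1}{2} \left( + \partial_\mu g_{\beta\sigma} \right) \frac{dx^\sigma}{d\lambda} \frac{dx^\beta}{d\lambda} \end{aligned} \tag{12.22} $$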

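which yields (upon relabelling dummy indices)

$$ \frac{d}{d\lambda} \left( \frac{dx_\mu}{d\lambda} \right) = \frac{1}{2} \left( \partial_\mu g_{\alpha\beta} \right) \frac{dx^\alpha}{d\lambda} \frac{dx^\beta}{d\lambda} \,. \tag{12.23} $$

So if the entire spacetime metric has zero dependence on a particular coordinate $x^\mu$, the corresponding lower-index tangent vector $dx_\mu/d\lambda$ is conserved! For a massive particle, this quantity is none other than $p_\mu/m$. For the massless particle, we can choose a convention in which $p_\mu = dx_\mu/d\lambda$. Therefore,

$$ \text{if } \partial_\mu g_{\alpha\beta} = 0 \text{ for some } \mu \text{ and all } \alpha, \beta, \text{ then } p_\mu = \text{constant} \,. \tag{12.24} $$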
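A minimal numerical sketch of this statement, under assumptions chosen purely for illustration: take the flat Euclidean plane in polar coordinates, $ds^2 = dr^2 + r^2 d\phi^2$ (not a spacetime from these notes), whose metric is independent of φ but not of r. A straight line $x = b$, $y = v\lambda$ is a geodesic, so the lower-index component $p_\phi = r^2\, d\phi/d\lambda$ should stay constant while $p_r = dr/d\lambda$ need not. The function names and the values of `b` and `v` are hypothetical.

```python
import math

def p_phi(lam, b=2.0, v=0.5, h=1e-6):
    """g_{phi phi} * dphi/dlambda on the straight line x = b, y = v*lam."""
    r2 = b * b + (v * lam) ** 2
    phi = lambda s: math.atan2(v * s, b)
    dphi = (phi(lam + h) - phi(lam - h)) / (2 * h)   # central difference
    return r2 * dphi

def p_r(lam, b=2.0, v=0.5, h=1e-6):
    """g_{rr} * dr/dlambda on the same line (g_{rr} = 1)."""
    r = lambda s: math.hypot(b, v * s)
    return (r(lam + h) - r(lam - h)) / (2 * h)

samples = (-3.0, -1.0, 0.0, 2.0, 5.0)
p_phi_values = [p_phi(lam) for lam in samples]   # each close to b*v
p_r_values = [p_r(lam) for lam in samples]       # changes along the line
```

Here $p_\phi$ is just the angular momentum $x\dot{y} - y\dot{x} = bv$ per unit mass, which is the conserved quantity associated with the φ-independence of the metric.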

Let us do an ultra-simple example of a Killing vector. Consider Minkowski space in 4D, namely $\mathbb{R}^{3,1}$ with the flat spacetime metric. In Cartesian coordinates, we obviously have spacetime translation invariance. This implies that all components of $p_\mu$ are conserved.

As a less trivial example, take our spatially flat FRW universe for which we previously worked out the Christoffels. Notice that the metric depended only on time. Obviously, this means that energy is not conserved. Stop and think on that for a minute. You probably thought that conservation of energy must be true in all circumstances, even for the whole universe. You would be wrong. It requires a symmetry! Since none of the components of the metric depend on spatial coordinates, the spatial momenta $p_i$ are conserved.

For our third example of Killing vectors, consider the two-sphere S2 with round metric

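$$ ds^2 = d\theta^2 + \sin^2\theta\, d\phi^2 \,. \tag{12.25} $$

How do we find the Killing vectors? We need to solve the D(D + 1)/2 Killing equations,

$$ \begin{aligned} 0 &= \nabla_\mu K_\nu + \nabla_\nu K_\mu \\ &= \partial_\mu K_\nu + \partial_\nu K_\mu - 2\Gamma^\alpha_{\mu\nu} K_\alpha \,. \end{aligned} \tag{12.26} $$

First, we need the nonzero Christoffels,

$$ \Gamma^\phi_{\phi\theta} = \frac{\cos\theta}{\sin\theta} \,, \qquad \Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta \,. \tag{12.27} $$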

Then the three independent Killing vector equations involve θθ, φφ, θφ:

$$ \begin{aligned} 0 &= \partial_\theta K_\theta \,, \\ 0 &= \partial_\phi K_\phi + \sin\theta\cos\theta\, K_\theta \,, \\ 0 &= \partial_\phi K_\theta + \partial_\theta K_\phi - \frac{2\cos\theta}{\sin\theta} K_\phi \,. \end{aligned} \tag{12.28} $$

The first Killing equation teaches us that

$$ K_\theta = K_\theta(\phi) \,. \tag{12.29} $$

Taking $\partial_\phi$ of the third Killing equation and using the second to eliminate $\partial_\phi K_\phi$ gives, after a little bit of massaging of trig functions,

$$ \partial_\phi^2 K_\theta + K_\theta = 0 \,. \tag{12.30} $$

We can readily solve this,

$$ K_\theta(\phi) = A\sin\phi + B\cos\phi \,, \tag{12.31} $$

where A, B are constants of integration. Using this in the second Killing equation and partially integrating w.r.t. φ to find $K_\phi$ gives

$$ K_\phi = F(\theta) + A\sin\theta\cos\theta\cos\phi - B\sin\theta\cos\theta\sin\phi \,, \tag{12.32} $$

where F is an arbitrary function of integration. Substituting this back into the third Killing equation gives, after more trigonometric algebraic massage,

$$ \partial_\theta F(\theta) - \frac{2\cos\theta}{\sin\theta} F(\theta) = 0 \,, \tag{12.33} $$

which is readily integrated to

$$ F(\theta) = C\sin^2\theta \,, \tag{12.34} $$

where C is a constant of integration. Therefore, the general form of our Killing vectors for the two-sphere is, for the downstairs components,

$$ \begin{aligned} K_\theta &= A\sin\phi + B\cos\phi \,, \\ K_\phi &= C\sin^2\theta + \sin\theta\cos\theta\, (A\cos\phi - B\sin\phi) \,. \end{aligned} \tag{12.35} $$

If we take A = 0, B = 0, C = 1, we get a Killing vector R with upstairs components

$$ R^\theta = 0 \,, \qquad R^\phi = 1 \,. \tag{12.36} $$

If we take A = 0, B = 1, C = 0, we get a Killing vector S with upstairs components

$$ S^\theta = \cos\phi \,, \qquad S^\phi = -\cot\theta\sin\phi \,. \tag{12.37} $$

If we take A = −1, B = 0, C = 0, we get a Killing vector T with upstairs components

$$ T^\theta = -\sin\phi \,, \qquad T^\phi = -\cot\theta\cos\phi \,. \tag{12.38} $$

As you can check by transforming between spherical polar coordinates and Cartesian coordinates, these three Killing vectors correspond to $R = x\partial_y - y\partial_x$, $S = z\partial_x - x\partial_z$, $T = y\partial_z - z\partial_y$.
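The general solution (12.35) can be sanity-checked numerically. The sketch below, offered only as an illustration, plugs it back into the three Killing equations (12.28) using central differences; the values of A, B, C and the sample point are arbitrary assumptions, and the helper names are hypothetical.

```python
import math

A, B, C = 0.7, -1.3, 0.4   # arbitrary constants of integration (illustrative)

def K_theta(th, ph):
    return A * math.sin(ph) + B * math.cos(ph)

def K_phi(th, ph):
    return (C * math.sin(th) ** 2
            + math.sin(th) * math.cos(th) * (A * math.cos(ph) - B * math.sin(ph)))

def d(f, th, ph, wrt, h=1e-5):
    """Central-difference partial derivative of f(theta, phi)."""
    if wrt == "theta":
        return (f(th + h, ph) - f(th - h, ph)) / (2 * h)
    return (f(th, ph + h) - f(th, ph - h)) / (2 * h)

def killing_residuals(th, ph):
    """Right-hand sides of the three equations in (12.28); all should vanish."""
    e_tt = d(K_theta, th, ph, "theta")
    e_pp = d(K_phi, th, ph, "phi") + math.sin(th) * math.cos(th) * K_theta(th, ph)
    e_tp = (d(K_theta, th, ph, "phi") + d(K_phi, th, ph, "theta")
            - 2 * math.cos(th) / math.sin(th) * K_phi(th, ph))
    return e_tt, e_pp, e_tp

residuals = killing_residuals(1.1, 0.4)   # sample point away from the poles
```

All three residuals come out at the level of finite-difference error for any choice of A, B, C, which reflects the three-parameter family of rotations of the sphere.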


13 Thu.17.Oct

13.1 Maximally symmetric spacetimes

Spacetimes are distinguished by how many symmetries they possess. The more symmetric, the more calculable. The less symmetric, the less calculable. Even though maximally symmetric spacetimes possess an unrealistic amount of symmetry for experimental purposes, they are still very useful to study because calculations are easier to complete and they help build intuition.

What are the maximally symmetric spacetimes? We need to specify the spacetime signature¹⁰ in order to get started on this discussion. In Euclidean signature, Riemannian manifolds with maximal symmetry are (up to local isometry) either: Euclidean space $\mathbb{R}^D$, the sphere $S^D$, or hyperbolic space $H^D$. In Lorentzian signature, there are also three options, and they split up according to the value of the cosmological constant Λ (a.k.a. dark energy density). When Λ = 0, we get Minkowski space $\mathbb{R}^{d,1}$, where D = d + 1. For Λ < 0 we get Anti de Sitter spacetime (AdS), and for Λ > 0 we get de Sitter spacetime (dS).

Recall that Minkowski spacetime is invariant under (d + 1) translations, d(d − 1)/2 rotations, and d boosts. Adding the numbers together gives a total of

$$ (d + 1) + \frac{1}{2} d(d - 1) + d = \frac{1}{2}(d + 1)(d + 2) = \frac{1}{2} D(D + 1) \tag{13.1} $$

symmetries. We therefore say that a spacetime manifold of dimension D is maximally symmetric if it possesses D(D + 1)/2 independent symmetries.
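The counting in (13.1) is easy to confirm for a range of dimensions. The snippet below is a small sketch with hypothetical function names; it just tallies translations, rotations, and boosts and compares with D(D + 1)/2.

```python
def minkowski_symmetries(d):
    """Count the isometries of Minkowski space with d spatial dimensions."""
    translations = d + 1
    rotations = d * (d - 1) // 2
    boosts = d
    return translations + rotations + boosts

def maximal_symmetries(D):
    """Maximal number of independent symmetries in D spacetime dimensions."""
    return D * (D + 1) // 2

counts = {d: minkowski_symmetries(d) for d in range(1, 11)}
```

For d = 3 this gives the ten generators of the familiar Poincaré group.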

What equation should the Riemann tensor obey in maximally symmetric spacetimes? It had better be invariant under local Lorentz transformations, because there is no preferred direction in spacetime. There are only a very few tensors which we can use: $g_{\mu\nu}$ and $\epsilon_{\mu_1\ldots\mu_D}$. The epsilon tensor turns out to have the wrong symmetry properties to build Riemann components, and the metric ends up the winner. The sole combination of metric tensor components that possesses the right symmetries to be Riemann is antisymmetric, and tracing gives the constant of proportionality,

$$ R_{\rho\sigma\mu\nu} = \frac{R}{D(D - 1)} \left( g_{\rho\nu} g_{\sigma\mu} - g_{\rho\mu} g_{\sigma\nu} \right) \,. \tag{13.2} $$

The Ricci scalar R is constant over the entire manifold for maximally symmetric spacetimes.

¹⁰ If we were in a mathematically picky mood, we would also want to specify the spacetime topology.

Anti de Sitter spacetime $\text{AdS}_{D=d+1}$ can be embedded in a Minkowski spacetime of one higher dimension $\mathbb{R}^{d,2}$, via

$$ -(t^1)^2 - (t^2)^2 + (x^1)^2 + \ldots + (x^d)^2 = -L^2 \tag{13.3} $$

where L is the radius of curvature of the $\text{AdS}_D$. There are several different coordinate systems in common usage for $\text{AdS}_D$. One of the most useful is global coordinates, in which

$$ t^1 = L\cosh\rho\cos\tau \,, \tag{13.4} $$

$$ t^2 = L\cosh\rho\sin\tau \,, \tag{13.5} $$

$$ x^i = L\sinh\rho\; \hat{x}^i \,, \quad \text{where} \quad \sum_{i=1}^{d} (\hat{x}^i)^2 = 1 \,. \tag{13.6} $$

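In general dimension, spherical coordinates for the unit vector $\hat{x}^i$ are defined via

$$ \begin{aligned} \hat{x}^1 &= \cos\theta_1 \,, \\ \hat{x}^p &= \cos\theta_p \prod_{m=1}^{p-1} \sin\theta_m \,, \quad p \in \{2, \ldots, d - 1\} \,, \\ \hat{x}^d &= \prod_{m=1}^{d-1} \sin\theta_m \,. \end{aligned} \tag{13.7} $$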

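You can check yourself, either by hand or using Maxima, that the resulting line element of $\text{AdS}_D$ in global coordinates is

$$ ds^2 = L^2 \left( \cosh^2\rho\, d\tau^2 - d\rho^2 - \sinh^2\rho\, d\Omega_{d-1}^2 \right) \,, \tag{13.8} $$

where

$$ d\Omega_{d-1}^2 = d\theta_1^2 + \sum_{\ell=2}^{d-1} \left( \prod_{m=1}^{\ell-1} \sin^2\theta_m \right) d\theta_\ell^2 \,. \tag{13.9} $$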
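A quick numerical check that the global coordinates really do lie on the embedded hyperboloid (13.3) can be sketched as follows. It assumes d = 3 spatial embedding directions and the usual unit vector $(\cos\theta_1, \sin\theta_1\cos\theta_2, \sin\theta_1\sin\theta_2)$; the function names and sample values are illustrative only.

```python
import math

def ads_embedding(L, rho, tau, theta1, theta2):
    """Embedding coordinates (t1, t2, [x1, x2, x3]) for global AdS with d = 3."""
    t1 = L * math.cosh(rho) * math.cos(tau)
    t2 = L * math.cosh(rho) * math.sin(tau)
    xhat = (math.cos(theta1),
            math.sin(theta1) * math.cos(theta2),
            math.sin(theta1) * math.sin(theta2))   # unit vector on S^2
    xs = [L * math.sinh(rho) * c for c in xhat]
    return t1, t2, xs

def constraint(L, rho, tau, theta1, theta2):
    """Left-hand side of (13.3); should equal -L**2 for any coordinate values."""
    t1, t2, xs = ads_embedding(L, rho, tau, theta1, theta2)
    return -t1 ** 2 - t2 ** 2 + sum(x * x for x in xs)

value = constraint(2.0, 0.8, 1.3, 0.5, 2.1)   # arbitrary sample point, L = 2
```

The result is −L² regardless of ρ, τ, and the angles, because cosh²ρ − sinh²ρ = 1 and the unit vector has norm one.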

With a further coordinate transformation in time and radius,

t = L τ ,

r = L sinh ρ , (13.10)

we obtain

$$ ds^2 = \left( 1 + \frac{r^2}{L^2} \right) dt^2 - \left( 1 + \frac{r^2}{L^2} \right)^{-1} dr^2 - r^2 d\Omega_{d-1}^2 \,. \tag{13.11} $$
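As a check of the step from (13.8) to (13.11), the substitution (13.10) can be carried out term by term. Nothing beyond the identity $\cosh^2\rho - \sinh^2\rho = 1$ is assumed.

```latex
\begin{align*}
r = L\sinh\rho \;&\Rightarrow\;
  \cosh^2\rho = 1 + \frac{r^2}{L^2}\,, \qquad dr = L\cosh\rho\, d\rho\,, \\
t = L\tau \;&\Rightarrow\;
  L^2\cosh^2\rho\, d\tau^2 = \left(1 + \frac{r^2}{L^2}\right) dt^2\,, \\
L^2 d\rho^2 &= \frac{dr^2}{\cosh^2\rho}
  = \left(1 + \frac{r^2}{L^2}\right)^{-1} dr^2\,, \qquad
L^2\sinh^2\rho\, d\Omega_{d-1}^2 = r^2\, d\Omega_{d-1}^2 \,.
\end{align*}
```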

The scale L is the radius of curvature, and it sets the scale for all the physics in $\text{AdS}_D$.

The physics of Anti de Sitter (or de Sitter) spacetime in D = d + 1 dimensions differs markedly from the physics of Minkowski spacetime. One of the quickest ways to illustrate this is to compare the falloff of partial waves in AdS versus flat spacetime. Solving a wave equation for a simple type of field is a straightforward way to see this.

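Consider a Klein-Gordon (scalar) field living in flat Minkowski spacetime. In our signature convention, its equation of motion in spherical coordinates $t, r, \Omega_{D-2}$ is

$$ \nabla_\mu \nabla^\mu \Phi = -m^2 \Phi \,. \tag{13.12} $$

If we write

$$ \Phi(t, r, \Omega_{D-2}) = e^{-i\omega t} \chi(r)\, Y_{\ell,m}(\Omega_{D-2}) \,, \tag{13.13} $$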

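where the spherical harmonics obey

$$ \nabla^2_{S^{d-1}} Y_{\ell,m} = -\ell(\ell + d - 2)\, Y_{\ell,m} \,, \tag{13.14} $$

and separate variables, we find

$$ \left\{ \frac{\partial^2}{\partial r^2} + \frac{(d - 1)}{r} \frac{\partial}{\partial r} + \left[ \omega^2 - \frac{\ell(\ell + d - 2)}{r^2} - m^2 \right] \right\} \chi(r) = 0 \,. \tag{13.15} $$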

The most physically important thing to understand from this radial differential equation is that higher partial waves with ℓ > 0 are less important at large radius than the ℓ = 0 mode. A related fact is that when we write out the multipole expansion for electric and magnetic fields in Minkowski spacetime, higher multipole fields fall off with larger powers of radius. This physics is inherent to Minkowski spacetime with Λ = 0. It may surprise you to learn that it does not carry over to other values of the cosmological constant.

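Suppose that we now consider instead $\text{AdS}_{d+1}$ with global coordinates $\tau, \rho, \Omega_{d-1}$, where the radial coordinate is now compactified: the new ρ is related to the ρ of (13.8) by $\tan\rho = \sinh\rho_{(13.8)}$, so that

$$ ds^2 = \frac{L^2}{\cos^2\rho} \left( d\tau^2 - d\rho^2 - \sin^2\rho\, d\Omega_{d-1}^2 \right) \,. \tag{13.16} $$

In this set of coordinates, ρ ranges from 0 (the interior of AdS) to π/2 (the boundary) and the coordinate τ ranges from −∞ to +∞. What does the scalar wave equation look like in this spacetime? Anticipating separation of variables again, let us write

$$ \Phi(\tau, \rho, \Omega_{d-1}) = e^{-i\omega\tau} \chi(\rho)\, Y_{\ell,m}(\Omega_{d-1}) \,. \tag{13.17} $$

Then the equation of motion becomes (in units with L = 1)

$$ \left\{ \frac{1}{(\tan\rho)^{d-1}} \partial_\rho \left( (\tan\rho)^{d-1} \partial_\rho \right) + \left[ \omega^2 - \ell(\ell + d - 2)\csc^2\rho - m^2\sec^2\rho \right] \right\} \chi(\rho) = 0 \,. \tag{13.18} $$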

Notice that as we approach the boundary, the higher angular momentum modes are not suppressed compared to the ℓ = 0 mode. This is the germ of why the AdS/CFT correspondence discovered in the context of string theory in 1997 can work: an observer living on the boundary of the spacetime can see lots of information about what is happening in the interior of the spacetime all the way from the boundary. If we want to know the character of solutions to the above differential equation, we can substitute

$$ \chi(\rho) = (\cos\rho)^{2h} (\sin\rho)^{2b} f(\rho) \,, \tag{13.19} $$

which, upon the substitution

$$ y \equiv \sin^2\rho \,, \tag{13.20} $$

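gives

$$ y(1 - y)\, \partial_y^2 f + \left[ 2b + \frac{d}{2} - (2h + 2b + 1)\, y \right] \partial_y f - \left[ (h + b)^2 - \frac{\omega^2}{4} \right] f = 0 \,. \tag{13.21} $$

The solutions to this equation are hypergeometric functions, with

$$ h_\pm = \frac{d \pm \sqrt{d^2 + 4m^2}}{4} \,, \qquad b = +\frac{\ell}{2} \quad \text{or} \quad b = -\frac{\ell}{2} + 1 - \frac{d}{2} \,. \tag{13.22} $$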

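(For further details, see e.g. hep-th/9805171, https://arxiv.org/abs/hep-th/9805171.)

de Sitter spacetime $\text{dS}_D$ can be embedded in $\mathbb{R}^{D,1}$ via

$$ t^1 = \sqrt{L^2 - r^2}\, \sinh(t/L) \,, \tag{13.23} $$

$$ x^i = r\, \hat{x}^i \,, \quad \text{where} \quad \sum_{i=1}^{d} (\hat{x}^i)^2 = 1 \,, \tag{13.24} $$

$$ x^D = \sqrt{L^2 - r^2}\, \cosh(t/L) \,. \tag{13.25} $$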

This gives rise to static coordinates (like AdS, dS can alternatively be sliced with flat, positively curved, or negatively curved spatial sections). In static coordinates, the de Sitter line element becomes

$$ ds^2 = \left( 1 - \frac{r^2}{L^2} \right) dt^2 - \left( 1 - \frac{r^2}{L^2} \right)^{-1} dr^2 - r^2 d\Omega_{d-1}^2 \,. \tag{13.26} $$

This has a cosmological horizon at r = L. We will not have time to develop the similarities between cosmological horizons and black hole horizons in this course.

13.2 Einstein’s equations

In plain language, Einstein’s equations express the fact that matter tells spacetime how to curve and spacetime tells matter how to move. Later in the course, I will show you how to derive Einstein’s equations of General Relativity. For now, we will just write them down for you and show you how to use them. They relate a geometrical quantity on the left hand side, built out of the Riemann curvature tensor, to an energy-momentum tensor of any matter fields in the physical system containing gravitation as well. In tensor notation, they read as follows,

$$ R_{\alpha\beta} - \frac{1}{2} g_{\alpha\beta} R + \Lambda g_{\alpha\beta} = -8\pi G_N T_{\alpha\beta} \,. \tag{13.27} $$

The quantity Λ is known as the cosmological constant. (Note: you can put back the powers of c very easily by recruiting dimensional analysis.)

A very important characteristic of Einstein’s equations is that they are nonlinear. You can see this by eye by recalling the formula for Christoffels in terms of metric derivatives, which is nonlinear, as well as the formula for the Riemanns in terms of derivatives of Christoffels and contractions of Christoffels, which is also nonlinear. Nonlinearity makes GR qualitatively very different from Newtonian gravity. It is only in the Newtonian limit of GR that the linearity with which you are familiar emerges and shows itself as the superposition principle for the Newtonian potential Φ(x). For generic situations in GR, nonlinearity is present in the partial differential equations for the evolution of spacetime. The mathematics of nonlinear PDEs is hugely complicated compared to linear ones, and for generic spacetimes often no general statements can be made. Symmetry helps enormously with the task of trying to solve the differential equations, classify spacetimes, or find their geodesics.

The energy-momentum tensor on the RHS of Einstein’s equations is covariantly conserved. The way to see this is to take covariant derivatives of both sides of the Einstein equations. The Einstein tensor is defined as

$$ G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R \,. \tag{13.28} $$

Notice that this is denoted with a big-$G_{\mu\nu}$, rather than the small-$g_{\mu\nu}$ metric or the $G_N$ denoting the Newton gravitational constant. By itself, the rank (0,2) Einstein tensor $G_{\mu\nu}$ does not look like much. But it obeys an extremely useful identity by virtue of the Bianchi identity for the Riemann tensor. To see this, let us take the first form of our Bianchi identity and contract with two factors of the upstairs metric,

$$ 0 = g^{\nu\sigma} g^{\mu\lambda} \left( \nabla_\lambda R_{\rho\sigma\mu\nu} + \nabla_\rho R_{\sigma\lambda\mu\nu} + \nabla_\sigma R_{\lambda\rho\mu\nu} \right) \tag{13.29} $$

$$ \phantom{0} = \nabla^\mu R_{\rho\mu} - \nabla_\rho R + \nabla^\nu R_{\rho\nu} \,. \tag{13.30} $$

Rearranging this expression gives a relationship between the covariant derivative of the Ricci tensor and the covariant derivative of the Ricci scalar,

$$ \nabla^\mu R_{\rho\mu} = \frac{1}{2} \nabla_\rho R \,. \tag{13.31} $$

This identity is handy because it enables us to prove that

$$ \nabla^\mu G_{\mu\nu} = 0 \,. \tag{13.32} $$

In other words, the Einstein tensor is covariantly conserved. We also have the metric compatibility condition on our affine connection,

$$ \nabla_\sigma g_{\mu\nu} = 0 \,. \tag{13.33} $$

Then, taking the covariant divergence of both sides of Einstein’s equations (13.27), we have

$$ \nabla^\mu T^{\text{matter}}_{\mu\nu} = 0 \,. \tag{13.34} $$

Covariant conservation of the energy-momentum tensor in GR is mandatory, not voluntary.

How about some examples of energy-momentum tensors? Consider a perfect fluid, which is a spherical cow approximation to real fluids, characterized only by three things: energy density ρ, pressure p, and fluid velocity $u^\mu$. Its energy-momentum tensor is constructed from those three quantities and the metric tensor,

$$ T^{\text{p.f.}}_{\mu\nu} = \left( \rho + \frac{p}{c^2} \right) u_\mu u_\nu - p\, g_{\mu\nu} \,. \tag{13.35} $$

More generally, if we have an action principle for some classical matter (non-gravitational) field coupled to gravity, $S_{\text{matter}}$, then the energy-momentum tensor is determined by varying the action w.r.t. $g^{\mu\nu}$ according to the following recipe¹¹:

$$ T_{\mu\nu}(x^\sigma) = \frac{2}{\sqrt{-g(x^\sigma)}} \frac{\delta S_{\text{matter}}}{\delta g^{\mu\nu}(x^\sigma)} \,, \tag{13.36} $$

where g is an abbreviation for the determinant of the downstairs metric,

$$ g \equiv \det(g_{\alpha\beta}) \,. \tag{13.37} $$

The combination $\sqrt{-g}$ arises in writing down a general relativistically invariant measure of integration, $d^D x \sqrt{-g}$. (For the case of spherical coordinates on flat Minkowski spacetime, it is $r^2 \sin\theta$, which should be familiar to you from undergraduate multivariable calculus.) A handy formula is

¹¹ I will prove this near the beginning of the GR2 PHY[1]484S course.

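$$ \delta\sqrt{-g} = -\frac{1}{2} \sqrt{-g}\; g_{\alpha\beta}\, \delta g^{\alpha\beta} = +\frac{1}{2} \sqrt{-g}\; g^{\alpha\beta}\, \delta g_{\alpha\beta} \,. \tag{13.38} $$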
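The claim that the invariant measure factor is $r^2 \sin\theta$ for flat spacetime in spherical coordinates can be verified directly from the determinant. The sketch below assumes four dimensions, c = 1, and the diagonal metric $ds^2 = dt^2 - dr^2 - r^2 d\theta^2 - r^2 \sin^2\theta\, d\phi^2$; the function name is hypothetical.

```python
import math

def sqrt_minus_g(r, theta):
    """sqrt(-det g) for flat 4D spacetime in spherical coordinates (c = 1)."""
    # Diagonal components (g_tt, g_rr, g_theta_theta, g_phi_phi), signature (+,-,-,-)
    diag = (1.0, -1.0, -r ** 2, -(r * math.sin(theta)) ** 2)
    det = 1.0
    for entry in diag:
        det *= entry   # determinant of a diagonal matrix is the product of its entries
    return math.sqrt(-det)

measure = sqrt_minus_g(2.0, 0.9)   # arbitrary sample point
```

The determinant is $-r^4 \sin^2\theta$, so for 0 < θ < π the result is exactly the familiar volume factor.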

For a relativistic massive point particle,

$$ T^{\text{particle}}_{\mu\nu}(x) = \frac{m}{\sqrt{-g(x)}} \int d\tau\; \dot{z}_\mu \dot{z}_\nu\; \delta^4(x - z(\tau)) \,. \tag{13.39} $$

We can see how this arises by starting from the Einbein action in curved spacetime in proper time gauge for a massive particle,

$$ S^{(2)}_{\text{rel}} = m \int d\tau \left[ \frac{1}{2} g_{\mu\nu} \frac{dz^\mu(\tau)}{d\tau} \frac{dz^\nu(\tau)}{d\tau} + \frac{1}{2} m \right] \,. \tag{13.40} $$

The only part of this action that depends on the spacetime metric is the first term. Also, we will only get a nonzero result when we are on the particle path.

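How about for a scalar field Φ? For minimal coupling to gravity,

$$ S_{\text{scalar}}[\Phi] = \int d^D x \sqrt{-g} \left( \frac{1}{2} \nabla_\mu \Phi \nabla^\mu \Phi - \frac{1}{2} m^2 \Phi^2 - V(\Phi) \right) \,. \tag{13.41} $$

It follows that

$$ T^{\text{scalar}}_{\mu\nu} = \nabla_\mu \Phi \nabla_\nu \Phi - g_{\mu\nu} \left[ \frac{1}{2} (\nabla\Phi)^2 - V(\Phi) \right] \,. \tag{13.42} $$

For the electromagnetic field $A_\mu$,

$$ S_{\text{EM}}[A_\alpha] = -\frac{1}{4} \int d^D x \sqrt{-g}\; F^{\mu\nu} F_{\mu\nu} \,. \tag{13.43} $$

It follows that

$$ T^{\text{EM}}_{\mu\nu} = -\left( F_{\mu\lambda} F_\nu{}^\lambda - \frac{1}{4} g_{\mu\nu} F^{\lambda\sigma} F_{\lambda\sigma} \right) \,. \tag{13.44} $$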


14 Mon.21.Oct

14.1 Birkhoff’s theorem and the Schwarzschild black hole

Let us now attack the question of solving the vacuum Einstein equations when we have a static, spherically symmetric spacetime. After a bit of work, we will be able to show that the Schwarzschild black hole possessing mass M is the unique solution.

Our methodology follows that of Carroll §5.2, and will involve a few steps. We will first use spherical symmetry to constrain the possible metric components that might be turned on. Then we will use the vacuum Einstein equations to prove that the time dependence must drop out. Then we will solve the remaining vacuum Einstein equations, and we will obtain the Schwarzschild solution. The last piece of the puzzle will be provided by the Newtonian limit, which will connect a mathematically arbitrary constant of integration to the physical quantity $G_N M$, where M is the mass of the Schwarzschild geometry and $G_N$ is the Newton constant, which has dimensions of $\text{length}^{D-2}$ and parametrizes the strength of gravity.

First, let us discuss the definition of a static spacetime in Lorentzian signature. Calling the timelike coordinate $x^0$, we define a static spacetime as one for which (a) there is no explicit time dependence in the metric and (b) the invariant interval possesses time reversal invariance,

$$ \frac{\partial}{\partial x^0} \left( g_{\mu\nu}(x^\lambda) \right) = 0 \,, \tag{14.1} $$

$$ ds^2 \ \text{invariant under} \ x^0 \to -x^0 \,. \tag{14.2} $$

A spacetime that only obeys the first condition is called a stationary spacetime. In essence, a static spacetime basically does nothing at all over time, while a stationary spacetime does exactly the same thing at all times. Note that staticity requires that there be no time-space cross terms in the invariant interval, only time-time and space-space components.

Isotropy is also a big requirement. Having this much symmetry eliminates a lot of possibly independent components of the metric tensor. In particular, writing in terms of either Cartesian coordinates $\vec{x}$ or spherical polar coordinates r, θ, φ, we can only use three ingredients,

$$ \begin{aligned} \vec{x} \cdot \vec{x} &= r^2 \,, \\ d\vec{x} \cdot d\vec{x} &= dr^2 + r^2 d\Omega_2^2 \,, \\ \vec{x} \cdot d\vec{x} &= r\, dr \,, \end{aligned} \tag{14.3} $$

where

$$ d\Omega_2^2 = d\theta^2 + \sin^2\theta\, d\phi^2 \,. \tag{14.4} $$

Any other thing we could build from the available ingredients would not respect spherical symmetry.

Given the spherical symmetry of our ansatz, it is traditional to use spherical polar coordinates, in which the metric on the $S^2$ is round throughout the spacetime. For now, we will allow the metric to have time dependence, but bear in mind that shortly we will find it is disallowed by the Einstein equations. We write the metric as

$$ ds^2 = e^{2\alpha''(t',r')} (dt')^2 - e^{2\beta''(t',r')} (dr')^2 - 2 e^{2\gamma''(t',r')} dt'\, dr' - e^{2\delta''(t',r')} (r')^2 d\Omega_2^2 \,. \tag{14.5} $$

Next, we can change to a new radial coordinate $r(t', r')$ by

$$ r^2 = (r')^2 e^{2\delta''(t',r')} \tag{14.6} $$

and adjust the definitions of all functions dependent on time and radius accordingly, to new functions, single primed,

$$ ds^2 = e^{2\alpha'(t',r)} (dt')^2 - e^{2\beta'(t',r)} dr^2 - 2 e^{2\gamma'(t',r)} dt'\, dr - r^2 d\Omega_2^2 \,. \tag{14.7} $$

In order to be able to get rid of the $2\, dt'\, dr$ term in this line element, we are going to have to work harder. Let us start by trying the easiest-looking trick,

$$ dt \overset{??}{=} e^{2\alpha(t',r)} dt' - e^{2\gamma(t',r)} dr \,. \tag{14.8} $$

If you try to follow this path further, you will find that second mixed partial derivatives of the new t coordinate w.r.t. the old coordinates fail to commute, so the equation (14.8) above is inconsistent¹². Our simple trick failed. As you may recall from the general theory of ODEs, the right strategy is to recruit an integrating factor, which must be a function of both t′ and r: Φ(t′, r). We define a new time coordinate t(t′, r) by

$$ dt = e^{2\Phi(t',r)} \left[ e^{2\alpha(t',r)} dt' - e^{2\gamma(t',r)} dr \right] \,. \tag{14.9} $$

Note that the very explicit factor of $e^{2\Phi(t',r)}$ in front of the [. . .] parts we wanted is designed precisely such that the right hand side of the above expression is an exact differential. This is absolutely necessary, because for any newly defined coordinates, second mixed partial derivatives w.r.t. the old coordinates must commute! Then, using the above equations, we have

$$ e^{-4\Phi} dt^2 = e^{4\alpha} (dt')^2 - 2 e^{2(\alpha+\gamma)} dt'\, dr + e^{4\gamma} dr^2 \,. \tag{14.10} $$

Rearranging this and forming our $(dt')^2$ and $2\, dt'\, dr$ pieces gives

$$ e^{2\alpha(t',r)} (dt')^2 - 2 e^{2\gamma(t',r)} dt'\, dr = e^{-2\alpha(t,r) - 4\Phi(t,r)} dt^2 - e^{-2\alpha(t,r) + 4\gamma(t,r)} dr^2 \,. \tag{14.11} $$

So redefining our metric ansatz functions according to

$$ \begin{aligned} e^{2\alpha} &= e^{-2\alpha' - 4\Phi'} \,, \\ e^{2\beta} &= e^{2\beta'} + e^{-2\alpha' + 4\gamma'} \end{aligned} \tag{14.12} $$

gives

$$ ds^2 = e^{2\alpha(t,r)} dt^2 - e^{2\beta(t,r)} dr^2 - r^2 d\Omega_2^2 \,. \tag{14.13} $$

The point of all this wrestling with differentials was to show that we can always choose a coordinate system in which off-diagonal metric components are absent, even if our spherically symmetric system is time dependent.

Our next task is going to be to show that the time dependence in the metric functions also has to drop out. For this part, we will need to use the equations of motion for the metric tensor field on spacetime. For the Einstein equations, we need to compute Christoffels to get Riemanns which we can then contract to get Ricci components, e.g. via Maxima. We get

$$ \begin{aligned} \Gamma^t_{tt} &= \partial_t\alpha \,, & \Gamma^t_{tr} &= \partial_r\alpha \,, \\ \Gamma^t_{rr} &= e^{2(\beta-\alpha)} \partial_t\beta \,, & \Gamma^r_{tt} &= e^{2(\alpha-\beta)} \partial_r\alpha \,, \\ \Gamma^r_{tr} &= \partial_t\beta \,, & \Gamma^r_{rr} &= \partial_r\beta \,, \\ \Gamma^r_{\theta\theta} &= -r e^{-2\beta} \,, & \Gamma^r_{\phi\phi} &= -r \sin^2\theta\, e^{-2\beta} \,, \\ \Gamma^\theta_{r\theta} &= \frac{1}{r} \,, & \Gamma^\theta_{\phi\phi} &= -\sin\theta\cos\theta \,, \\ \Gamma^\phi_{r\phi} &= \frac{1}{r} \,, & \Gamma^\phi_{\theta\phi} &= \frac{\cos\theta}{\sin\theta} \,. \end{aligned} \tag{14.14} $$

Note that the pieces involving $\partial_t$ in the above equation are exactly those that arise from allowing time dependence. For the Ricci tensor, we obtain

¹² To see a simple example of how this process works when done right, try transforming from Cartesian coordinates (x, y) on the plane to polar coordinates (r, θ), and checking that mixed second partials commute.

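$$ \begin{aligned} R_{tt} &= e^{2(\alpha-\beta)} \left[ -(\partial_r^2\alpha) - (\partial_r\alpha)^2 + \partial_r\alpha\, \partial_r\beta - \frac{2}{r} (\partial_r\alpha) \right] + \left\{ -(\partial_t^2\beta) + (\partial_t\alpha)(\partial_t\beta) - (\partial_t\beta)^2 \right\} \,, \\ R_{tr} &= -\frac{2}{r} (\partial_t\beta) \,, \\ R_{rr} &= \left[ (\partial_r^2\alpha) + (\partial_r\alpha)^2 - (\partial_r\alpha)(\partial_r\beta) - \frac{2}{r} (\partial_r\beta) \right] + e^{2(\beta-\alpha)} \left\{ -(\partial_t^2\beta) - (\partial_t\beta)^2 + (\partial_t\alpha)(\partial_t\beta) \right\} \,, \\ R_{\theta\theta} &= -\left[ e^{-2\beta} \left( r\partial_r\beta - r\partial_r\alpha - 1 \right) + 1 \right] \,, \\ R_{\phi\phi} &= \sin^2\theta\, R_{\theta\theta} \,. \end{aligned} \tag{14.15} $$

All these components must be zero for us to have a solution of the vacuum Einstein equations. First, have a look at $R_{tr}$. This must be zero, which demands of β(t, r) that

$$ \partial_t\beta(t, r) = 0 \quad \Rightarrow \quad \beta = \beta(r) \,. \tag{14.16} $$

You can see by looking for the $\partial_t\beta$ parts in the Riccis (the terms in curly braces) that many terms now drop out completely because β is a function of r only. Obviously, this simplifies our life quite a lot!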

Second, let us notice that the $R_{\theta\theta} = 0$ equation (a first order constraint equation) is relatively simple. Let us take a time derivative of it,

$$ \partial_t (R_{\theta\theta}) = 0 = -2 (\partial_t\beta)\, e^{-2\beta} \left[ r\partial_r\beta - r\partial_r\alpha - 1 \right] + e^{-2\beta} \left[ r\, \partial_t\partial_r (\beta - \alpha) \right] \,. \tag{14.17} $$

But since $\partial_t\beta = 0$ by our $R_{tr} = 0$ equation, we have

$$ e^{-2\beta}\, r\, \partial_t\partial_r (\beta - \alpha) = 0 \,. \tag{14.18} $$

Then, using what we know about β(r), we can partially integrate to get

$$ \alpha(t, r) = f(r) + g(t) \,. \tag{14.19} $$

Notice how the only remaining place where we have time dependence is in the tt component of the metric. What a stroke of luck! This means that we can absorb it simply by doing a coordinate transformation involving only time (not radius or angular coordinates),

$$ d\tilde{t} = dt\; e^{g(t)} \,. \tag{14.20} $$

Let us redefine our time coordinate to correspond to this $\tilde{t}$ (we drop the tilde, for notational clarity). Then we have

$$ \beta = \beta(r) \,, \qquad \alpha = \alpha(r) \,. \tag{14.21} $$

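Third, let us look at the remaining (more complex) tt and rr Einstein equations,

$$ 0 = \left[ (\partial_r^2\alpha) + (\partial_r\alpha)^2 - \partial_r\alpha\, \partial_r\beta + \frac{2}{r} (\partial_r\alpha) \right] \,, \tag{14.22} $$

$$ 0 = \left[ -(\partial_r^2\alpha) - (\partial_r\alpha)^2 + (\partial_r\alpha)(\partial_r\beta) + \frac{2}{r} (\partial_r\beta) \right] \,. \tag{14.23} $$

By simply adding these equations together, we obtain

$$ \partial_r (\alpha + \beta) = 0 \,. \tag{14.24} $$

This means that

$$ \beta(r) = \text{const} - \alpha(r) \,. \tag{14.25} $$

The constant can be set to zero by a constant rescaling of the time coordinate, which shifts α by a constant.

Fourth, we can plug this expression for β(r) in terms of α(r) back in to the $R_{\theta\theta} = 0$ Einstein equation to obtain

$$ \left[ 2r\partial_r\alpha + 1 \right] e^{2\alpha} = 1 \,. \tag{14.26} $$

By quick inspection you can see that this becomes

$$ \partial_r \left( r e^{2\alpha(r)} \right) = 1 \,, \tag{14.27} $$

so that

$$ r e^{2\alpha(r)} = r + c_1 \,, \tag{14.28} $$

where $c_1$ is a mathematically arbitrary constant. Dividing by r gives

$$ e^{2\alpha(r)} = 1 + \frac{c_1}{r} \,. \tag{14.29} $$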
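As a hedged cross-check of this chain of steps, the sketch below substitutes $e^{2\alpha} = 1 + c_1/r$ with β = −α back into (14.22), (14.23), and (14.26), using finite differences for the radial derivatives. The value of `c1`, the sample radius, and the helper names are arbitrary choices for illustration.

```python
import math

c1 = -1.5   # illustrative constant of integration (any value with 1 + c1/r > 0 works)

def alpha(r):
    return 0.5 * math.log(1.0 + c1 / r)

def beta(r):
    return -alpha(r)   # (14.25) with the constant set to zero

def d1(f, r, h=1e-4):
    """Central-difference first derivative."""
    return (f(r + h) - f(r - h)) / (2 * h)

def d2(f, r, h=1e-4):
    """Central-difference second derivative."""
    return (f(r + h) - 2 * f(r) + f(r - h)) / h ** 2

def residuals(r):
    """Right-hand sides of (14.22) and (14.23), and LHS minus RHS of (14.26)."""
    a1, a2, b1 = d1(alpha, r), d2(alpha, r), d1(beta, r)
    eq_22 = a2 + a1 ** 2 - a1 * b1 + 2 * a1 / r
    eq_23 = -a2 - a1 ** 2 + a1 * b1 + 2 * b1 / r
    eq_26 = (2 * r * a1 + 1) * math.exp(2 * alpha(r)) - 1
    return eq_22, eq_23, eq_26

res = residuals(5.0)   # sample radius outside r = -c1
```

All three residuals vanish to within finite-difference error, so the one-parameter family labelled by $c_1$ does solve the remaining vacuum equations.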

We are nearly done, but we need one more physical ingredient. We need to know the physical meaning of $c_1$, because it is what controls all the nontrivial radial dependence in our new static spherically symmetric metric satisfying the vacuum Einstein equations. This is where the Newtonian limit comes to our rescue. We know that in regions of weak gravity, far away from the centre of our spacetime as r → ∞, $g_{tt}$ should take the form $g_{tt} \simeq 1 + 2\Phi/c^2$ in our signature convention, where $\Phi = -G_N M/r$. This fixes our arbitrary constant of integration to be $c_1 = -2G_N M$ (with c = 1).

Therefore, we finally obtain the famous Schwarzschild metric in four¹³ spacetime dimensions:

$$ ds^2 = \left( 1 - \frac{2G_N M}{r} \right) dt^2 - \left( 1 - \frac{2G_N M}{r} \right)^{-1} dr^2 - r^2 d\Omega_2^2 \,. \tag{14.30} $$

¹³ Cousins of Schwarzschild which are asymptotically flat are available in various dimensions for D > 3. They have $1/r^{D-3}$ dependence rather than 1/r dependence in the metric.

Birkhoff’s Theorem says that this is the unique static spherically symmetric solution of the vacuum Einstein equations. We sketched a proof of this en route, when we found that the Einstein equations would not allow time dependence. Note that in the solution we see $G_N$, which is a theory parameter, and M, which is a solution parameter.

One of the two most physically intriguing things about this solution, in this coordinate system, is that there is a place where $g_{rr}$ blows up (and $g_{tt}$ goes to zero). This is known as the event horizon. It is located at the Schwarzschild radius

$$ r_S = \frac{2G_N M}{c^2} \,, \tag{14.31} $$

where I have temporarily shown the factors of c for physical clarity. Try calculating your own Schwarzschild radius. You do not fit inside this radius, so you are not a black hole.
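For anyone who wants to take up that invitation, here is a minimal sketch of the arithmetic in SI units. The 70 kg body mass and the approximate solar mass are assumed sample inputs, and the function name is hypothetical.

```python
G_N = 6.674e-11          # Newton constant in m^3 kg^-1 s^-2 (approximate)
C = 299_792_458.0        # speed of light in m/s

def schwarzschild_radius(mass_kg):
    """Schwarzschild radius r_S = 2 G_N M / c^2, in metres."""
    return 2.0 * G_N * mass_kg / C ** 2

r_person = schwarzschild_radius(70.0)       # assumed 70 kg person
r_sun = schwarzschild_radius(1.989e30)      # approximate solar mass
```

The result for a person is of order 10⁻²⁵ m, vastly smaller than an atomic nucleus, while for the Sun it is about 3 km.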

The second physically intriguing thing about this solution of Einstein’s equations is that it has a singularity at r = 0 that is not just a coordinate singularity. It is a truly physical singularity, as you can see by computing a curvature invariant like Riemann squared,

$$ R^{\mu\nu\sigma\rho} R_{\mu\nu\sigma\rho} = \frac{48 (G_N M)^2}{r^6} \,, \tag{14.32} $$

which diverges at r = 0. Notice that at $r = r_S = 2G_N M$, this curvature invariant is small in Planck units for a big black hole,

$$ \ell_P^4\, R^{\mu\nu\sigma\rho} R_{\mu\nu\sigma\rho} (r = r_S) = \frac{12\, \ell_P^4}{r_S^4} \,. \tag{14.33} $$
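The arithmetic behind (14.33) is a one-line substitution of $G_N M = r_S/2$ (with c = 1) into (14.32), shown here for completeness:

```latex
R^{\mu\nu\sigma\rho} R_{\mu\nu\sigma\rho} \Big|_{r = r_S}
  = \frac{48\, (r_S/2)^2}{r_S^6}
  = \frac{12}{r_S^4} \,,
\qquad \text{so} \qquad
\ell_P^4\, R^{\mu\nu\sigma\rho} R_{\mu\nu\sigma\rho} \Big|_{r = r_S}
  = 12 \left( \frac{\ell_P}{r_S} \right)^4 \ll 1
  \quad \text{for } r_S \gg \ell_P \,.
```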

Conversely, for a Planck mass black hole, the curvature would be Planckian.

A third aspect of this solution should also grab your physical interest: the nonlinearity of gravity manifest in it. Nonlinearity is what allows there to be a nontrivial solution of the vacuum Einstein equations (ones with $T_{\mu\nu} = 0$) at all. Compare to Newtonian gravity, where a zero mass density on the RHS of the Poisson equation, with the potential required to vanish at infinity, would result in a zero Newtonian potential!

Mathematically, the mass M of a classical Schwarzschild black hole might in principletake any value from −∞ to +∞, because rS arose as a mere constant of integration ofEinstein’s equations. However, physically there are limits to what the mass can be. Forstarters, M must be finite for a physically reasonable solution. More importantly, the massmust be nonnegative, M ≥ 0, because the singularity is not covered by a horizon if theSchwarzschild radius is negative! When M < 0, the gravitational redshift also walks off intothe complex plane, and we are in trouble interpreting what the heck our spacetime mightmean. So already at the classical level, we can imagine why taking our black holes to havenon-negative mass is a sensible physical precaution. (Of course, the case M = 0 is Minkowskispacetime.)

There is a more sophisticated argument available for mass non-negativity that takes intoaccount quantum corrections to classical gravity, first made in 1995 by G.T. Horowitz andR.C. Myers. They argued that if a negative-mass black hole solution were physical, in thesense that quantum gravity corrections somehow ‘fixed up’ the negative-mass naked singu-larity into some physical blob with large-but-finite curvature, then the vacuum of quantumgravity would be unstable. Their logic was this: we could reduce the energy of our system

76

from the vacuum state by simply pair-producing more and more blob-antiblob pairs. Thisworks because each blob has negative energy and so does each anti-blob! The existence ofnegative-mass ‘black holes’ would therefore thoroughly destabilize the vacuum of quantumgravity, which is the foundation upon which we lay excitations of quantum fields describingthe fluctuating degrees of freedom of the system. The result would be a horrible, physicallyinconsistent mess.

The moral of this mass positivity story is this: do not trust that every mathematicalsolution of a physically interesting set of PDEs is physical. We must also check that physicalboundary conditions are obeyed, and ensure that basic physical principles like stability ofthe vacuum are preserved. This is why we assume henceforth that MBH ≥ 0.


15 Thu.24.Oct

15.1 TOV equation for a star

Let us now see what changes when we allow an energy-momentum tensor in our static, spherically symmetric spacetime. The simplest kind of thing to consider is called a perfect fluid. What is a perfect fluid? Physically, it is a kind of spherical cow approximation, in which we model a system like the ball of gas we call our Sun by a simple macroscopic fluid, described only by its proper energy density $\rho$ and pressure $p$ in the instantaneous rest frame. We ignore shear viscosity, bulk viscosity, and heat conduction. For a perfect fluid, the energy-momentum tensor $T_{\mu\nu}$ can be written in the form

$$ T^{\rm p.f.}_{\mu\nu} = \left(\rho + \frac{p}{c^2}\right) U_\mu U_\nu - p\, g_{\mu\nu} . \tag{15.1} $$

This obeys the conservation equation

$$ \nabla^\mu T^{\rm p.f.}_{\mu\nu} = 0 . \tag{15.2} $$

In flat Minkowski spacetime in Cartesian coordinates, in the Newtonian limit, conservation of energy-momentum can be seen to reduce to (a) the continuity equation, and (b) the Euler equation, the classical equation of motion for a perfect fluid. For details, see §8.3 of HEL. Here, we work in curved spacetime so our story is more involved.

As you can check, in comoving coordinates, we have only the time component of the 4-velocity and its magnitude is set by the timelike condition $U^\mu U_\mu = 1$. Then the Einstein equations for our static, spherically symmetric star involve only radial dependence, and they are (with $c = 1$)

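$$ \frac{1}{r^2}\, e^{-2\beta} \left[ 2r\,\partial_r\beta - 1 + e^{2\beta} \right] = 8\pi G_N \rho , \tag{15.3} $$

$$ \frac{1}{r^2}\, e^{-2\beta} \left[ 2r\,\partial_r\alpha + 1 - e^{2\beta} \right] = 8\pi G_N p , \tag{15.4} $$

$$ e^{-2\beta} \left[ \partial_r^2\alpha + (\partial_r\alpha)^2 - \partial_r\alpha\,\partial_r\beta + \frac{1}{r}\left(\partial_r\alpha - \partial_r\beta\right) \right] = 8\pi G_N p . \tag{15.5} $$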

Note very carefully here the difference between $\rho(r)$ and $p(r)$. Make sure you write $\rho$s in your own handwritten notes in such a way that they are easily distinguishable from $p$s.

Now, we have a set of three coupled ODEs in $\alpha(r)$, $\beta(r)$, $\rho(r)$, $p(r)$. Without some physical input there are not enough equations to solve the system. But we can do it if we recruit conservation of the energy-momentum tensor $\nabla^\mu T_{\mu\nu} = 0$ and provide an equation of state.

The $tt$ Einstein equation is a function of $\beta$ only: it does not involve $\alpha$. This allows us to define a mass function $m(r)$ such that

$$ e^{2\beta} = \left( 1 - \frac{2G_N m(r)}{r} \right)^{-1} . \tag{15.6} $$

Then in terms of $m(r)$ rather than $\beta(r)$, the $tt$ Einstein equation becomes $dm/dr = 4\pi r^2 \rho(r)$, which we can immediately integrate to

$$ m(r) = 4\pi \int_0^r dr'\, r'^2 \rho(r') . \tag{15.7} $$


You might look at this formula and think “Oh! This is just the natural answer: you take the mass density and multiply by the surface area, and integrate radially.” But that would be too quick, because the spatial volume element in our curved spacetime metric is actually $dr\,d\theta\,d\phi\; r^2 \sin\theta\, e^{\beta(r)}$. So if we wanted to define the true integrated energy density of a star of radius $R$, we would instead calculate

$$ \bar{M} = 4\pi \int_0^R dr\, \frac{r^2 \rho(r)}{\sqrt{1 - 2G_N m(r)/r}} , \tag{15.8} $$

and this is greater than $M = m(R)$ because of the binding energy (a concept which does make sense in GR for spherical stars).
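To get a feel for the size of the difference between the two integrals, here is a minimal numerical sketch for a constant-density star. It assumes units with $G_N = c = 1$, takes the angular integration to supply a factor of $4\pi$ in both integrals, and uses a simple midpoint rule; the function name and the sample values are illustrative only.

```python
import math

def masses_constant_density(rho_star, R, n=20000, G=1.0):
    """Return (M, M_bar) for a star of constant density rho_star and radius R.

    M     = 4 pi * integral of r^2 rho dr                 (mass function at r = R)
    M_bar = 4 pi * integral of r^2 rho e^{beta} dr        (proper-volume integral)
    Units with c = 1; midpoint rule with n radial shells.
    """
    M = 0.0
    M_bar = 0.0
    dr = R / n
    for i in range(n):
        r = (i + 0.5) * dr                              # midpoint of the shell
        m_r = 4.0 * math.pi * rho_star * r**3 / 3.0     # m(r) for constant density
        shell = 4.0 * math.pi * r**2 * rho_star * dr    # flat-space shell mass
        M += shell
        M_bar += shell / math.sqrt(1.0 - 2.0 * G * m_r / r)  # e^{beta} weighting
    return M, M_bar

# Sample star chosen so that r_S / R = 2 G M / R = 0.5.
R = 1.0
rho_star = 3.0 * 0.25 / (4.0 * math.pi * R**3)
M, M_bar = masses_constant_density(rho_star, R)
print(M, M_bar, M_bar - M)
```

The difference `M_bar - M` is the quantity the text identifies with the binding energy.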

The radial Einstein equation becomes

$$ \frac{d\alpha}{dr} = \frac{G_N m(r) + 4\pi G_N r^3 p(r)}{r\,[r - 2G_N m(r)]} . \tag{15.9} $$

To get any further, we need to recruit energy-momentum tensor conservation. With only radial dependence, this gives

$$ [\rho(r) + p(r)]\, \frac{d\alpha(r)}{dr} = -\frac{dp(r)}{dr} , \tag{15.10} $$

which lets us eliminate $d\alpha/dr$ in favour of $dp/dr$. We obtain

$$ \frac{dp(r)}{dr} = -\frac{[\rho(r) + p(r)]\,[G_N m(r) + 4\pi G_N r^3 p(r)]}{r\,[r - 2G_N m(r)]} . \tag{15.11} $$

This is the Tolman-Oppenheimer-Volkoff (TOV) equation for hydrostatic equilibrium in a star, for the static spherically symmetric case in 4D.

In order to actually solve the TOV equation, we need to know one more equation: the equation of state, which is a relationship $p = p(\rho)$. For astrophysical systems, a polytropic equation of state is often employed, which takes the form $p = K\rho^\gamma$ for some constants $K, \gamma$. As a toy model, we can consider an incompressible star with finite constant mass density $\rho_*$ out to some radius $R$. Then the mass function is easily integrated, and $M = 4\pi R^3 \rho_*/3$. This in turn gives

$$ p(r) = \rho_* \, \frac{\sqrt{R^3 - r_S R^2} - \sqrt{R^3 - r_S r^2}}{\sqrt{R^3 - r_S r^2} - 3\sqrt{R^3 - r_S R^2}} . \tag{15.12} $$

Integrating again to find $g_{tt}$ yields

$$ e^{\alpha(r)} = \frac{3}{2}\sqrt{1 - \frac{r_S}{R}} - \frac{1}{2}\sqrt{1 - \frac{r_S r^2}{R^3}} , \qquad r < R . \tag{15.13} $$

The pressure increases near the core, even though we have assumed absolute incompressibility of the fluid. In particular, as $M \to M_{\rm max} = 4R/(9G_N)$, the pressure at the core goes to infinity. Oops! With our simplistic ansatz, we have managed to evolve ourselves outside the regime of validity of Einstein’s equations. Of course, real stars do not obey such a simplistic model as an incompressible fluid. Still, it is interesting that we can get the right order of magnitude estimate of when a star can be too big to be gravitationally stable. Sometimes, the stellar object collapses gravitationally into a black hole. If the initial configuration had no overall angular momentum, it will settle down eventually to a Schwarzschild solution. If it is rotating, it settles down instead to the Kerr black hole, whose metric we will discuss soon.
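As a cross-check on the incompressible-star results, the following sketch integrates the TOV equation (15.11) numerically for constant density and compares the central pressure with the closed form (15.12) evaluated at $r = 0$. It assumes units with $G_N = c = 1$ and uses a hand-rolled fourth-order Runge-Kutta stepper; the function names, step count, and sample star are illustrative choices, not part of the lecture material.

```python
import math

def tov_rhs(r, p, rho_star, G=1.0):
    """Right-hand side of the TOV equation for constant density rho_star (c = 1)."""
    m = 4.0 * math.pi * rho_star * r**3 / 3.0            # mass function for constant density
    return -(rho_star + p) * (G * m + 4.0 * math.pi * G * r**3 * p) / (r * (r - 2.0 * G * m))

def central_pressure_numeric(rho_star, R, steps=4000, r_min=1e-6):
    """March inward from the surface, where p(R) = 0, down to r = r_min * R using RK4."""
    h = -(R - r_min * R) / steps                         # negative step: integrate inward
    r, p = R, 0.0
    for _ in range(steps):
        k1 = tov_rhs(r, p, rho_star)
        k2 = tov_rhs(r + h / 2, p + h * k1 / 2, rho_star)
        k3 = tov_rhs(r + h / 2, p + h * k2 / 2, rho_star)
        k4 = tov_rhs(r + h, p + h * k3, rho_star)
        p += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        r += h
    return p

def central_pressure_analytic(rho_star, R, G=1.0):
    """Closed-form pressure of the incompressible star, evaluated at r = 0."""
    r_s = 2.0 * G * (4.0 * math.pi * R**3 * rho_star / 3.0)
    a = math.sqrt(R**3 - r_s * R**2)
    b = math.sqrt(R**3)                                  # sqrt(R^3 - r_S r^2) at r = 0
    return rho_star * (a - b) / (b - 3.0 * a)

# Sample star with r_S / R = 0.5, safely below the limit r_S / R = 8/9.
R = 1.0
rho_star = 3.0 * 0.25 / (4.0 * math.pi * R**3)
p_num = central_pressure_numeric(rho_star, R)
p_exact = central_pressure_analytic(rho_star, R)
print(p_num, p_exact)
```

Pushing the sample value of $r_S/R$ towards $8/9$ (equivalently $M \to 4R/9G_N$) makes the denominator of the closed form approach zero, which is the divergence described above.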

Stellar evolution produces different endpoints depending on the initial mass of the star in question. For small stars like ours, when they run out of gas for nuclear fusion, they contract and become white dwarfs. If they are somewhat larger, above about $1.4\,M_\odot$, known as the Chandrasekhar limit, then electron degeneracy pressure is not sufficient to hold them up, and they collapse further to become a neutron star (a class that includes pulsars). Above about $3$-$4\,M_\odot$, known as the Oppenheimer-Volkoff limit, even neutron degeneracy pressure is not enough. Bigger stars collapse to produce black holes.

People like to categorize black holes by size. We can distinguish three basic classes by formation mechanism. Stellar mass black holes are produced by collapse of individual stars, and have masses of a few to a few hundred solar masses. We also have supermassive black holes at the centres of most galaxies, at millions to billions of solar masses. The third class is known as primordial black holes because the only way these smaller-mass objects could have been formed would have been in the Big Bang. The density of primordial black holes is small, if there were any at all to begin with, because of the period of inflation which grew the universe by gigantic amounts early in the history of its evolution, diluting them.

15.2 Geodesics of Schwarzschild

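We now move to studying geodesics in the Schwarzschild spacetime explicitly. The nonzero Christoffels for this geometry are

$$ \Gamma^t_{tr} = \frac{r_S}{2r(r - r_S)} ; \tag{15.14} $$

$$ \Gamma^r_{rr} = -\Gamma^t_{tr} , \quad \Gamma^r_{tt} = \frac{r_S}{2r^3}(r - r_S) , \quad \Gamma^r_{\theta\theta} = -(r - r_S) , \quad \Gamma^r_{\phi\phi} = \sin^2\theta\; \Gamma^r_{\theta\theta} ; \tag{15.15} $$

$$ \Gamma^\theta_{r\theta} = \frac{1}{r} , \quad \Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta ; \tag{15.16} $$

$$ \Gamma^\phi_{r\phi} = \frac{1}{r} , \quad \Gamma^\phi_{\theta\phi} = \frac{\cos\theta}{\sin\theta} , \tag{15.17} $$

where $r_S$ is the Schwarzschild radius. Then our geodesic equations become

$$ \frac{d^2 t}{d\lambda^2} + \frac{r_S}{r(r - r_S)} \frac{dt}{d\lambda}\frac{dr}{d\lambda} = 0 , \tag{15.18} $$

$$ \frac{d^2 r}{d\lambda^2} + \frac{r_S}{2r^3}(r - r_S)\left(\frac{dt}{d\lambda}\right)^2 - \frac{r_S}{2r(r - r_S)}\left(\frac{dr}{d\lambda}\right)^2 - (r - r_S)\left[\left(\frac{d\theta}{d\lambda}\right)^2 + \sin^2\theta \left(\frac{d\phi}{d\lambda}\right)^2\right] = 0 , \tag{15.19} $$

$$ \frac{d^2\theta}{d\lambda^2} + \frac{2}{r}\frac{d\theta}{d\lambda}\frac{dr}{d\lambda} - \sin\theta\cos\theta \left(\frac{d\phi}{d\lambda}\right)^2 = 0 , \tag{15.20} $$

$$ \frac{d^2\phi}{d\lambda^2} + \frac{2}{r}\frac{d\phi}{d\lambda}\frac{dr}{d\lambda} + 2\,\frac{\cos\theta}{\sin\theta}\frac{d\theta}{d\lambda}\frac{d\phi}{d\lambda} = 0 . \tag{15.21} $$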


These equations look rather formidable until you realize that finding the Killing vectors allows you to find first integrals of two out of four of the geodesic equations. This follows because $\partial_t g_{\mu\nu} = 0$ and $\partial_\phi g_{\mu\nu} = 0$. We write the energy

$$ E = p_t = \left(1 - \frac{r_S}{r}\right) \frac{dt}{d\lambda} , \tag{15.22} $$

and the angular momentum

$$ L = -p_\phi = r^2 \sin^2\theta\, \frac{d\phi}{d\lambda} . \tag{15.23} $$

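The next equation we can recruit is

$$ U^\mu U_\mu = \epsilon , \tag{15.24} $$

where $\epsilon = 0$ for null geodesics and $\epsilon = +1$ for timelike geodesics. For either type of geodesic, we have $g_{\mu\nu} U^\mu U^\nu = \epsilon$, or

$$ \epsilon = \left(1 - \frac{r_S}{r}\right)\left(\frac{dt}{d\lambda}\right)^2 - \left(1 - \frac{r_S}{r}\right)^{-1}\left(\frac{dr}{d\lambda}\right)^2 - r^2\left[\left(\frac{d\theta}{d\lambda}\right)^2 + \sin^2\theta\left(\frac{d\phi}{d\lambda}\right)^2\right] . \tag{15.25} $$

Substituting in our conserved angular momentum $L$ and energy $E$ gives

$$ \epsilon = \left(1 - \frac{r_S}{r}\right)^{-1} E^2 - \left(1 - \frac{r_S}{r}\right)^{-1}\left(\frac{dr}{d\lambda}\right)^2 - r^2\left(\frac{d\theta}{d\lambda}\right)^2 - \frac{L^2}{r^2\sin^2\theta} . \tag{15.26} $$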

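Our next step is a piece of physics input. We can use rotational symmetry to pick $\theta = \pi/2$. It is consistent with the geodesic equations to leave $d\theta/d\lambda = 0$ for all affine time. Then

$$ \frac{1}{2}\left(\frac{dr}{d\lambda}\right)^2 = \frac{1}{2}E^2 - \frac{1}{2}\left(1 - \frac{r_S}{r}\right)\left(\epsilon + \frac{L^2}{r^2}\right) . \tag{15.27} $$

Some textbooks like to help you visualize this setup by making a mapping onto a familiar non-relativistic Newtonian system, as follows,

$$ m \to 1 , \tag{15.28} $$

$$ |\vec{v}|^2 \to \left(\frac{dr}{d\lambda}\right)^2 , \tag{15.29} $$

$$ E_{\rm tot} \to \frac{E^2}{2} , \tag{15.30} $$

$$ V_{\rm eff}(r) \to \frac{1}{2}\left(1 - \frac{r_S}{r}\right)\left(\epsilon + \frac{L^2}{r^2}\right) = \frac{\epsilon}{2} - \frac{\epsilon\, r_S}{2r} + \frac{L^2}{2r^2} - \frac{r_S L^2}{2r^3} . \tag{15.31} $$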

You can learn everything you need to know about the availability of various types of orbits (for either null or timelike geodesics) by plotting this “effective potential”. Carroll has two great figures in §5.4, Figures 5.4 and 5.5:-


For Newtonian gravity, there are no massless particle orbits. Massive particles can have stable bound orbits, depending on the angular momentum per unit mass.

For Einsteinian gravity, photons can orbit, but they are unstable. Any small perturbation and the path flings off back out to infinity (sometimes after buzzing around the black hole horizon a few times) or falls inexorably into the black hole. Massive particles, on the other hand, can have bound orbits, and the outer solution radius gives a stable orbit while the inner one gives an unstable orbit.

Circular orbits can happen when $dV_{\rm eff}/dr = 0$ at $r = r_*$, solving the equation

$$ \frac{\epsilon\, r_S}{2}\, r_*^2 - L^2 r_* + \frac{3 r_S}{2} L^2 \gamma = 0 , \tag{15.32} $$

where (following Carroll) we introduce $\gamma = 1$ for GR and $\gamma = 0$ for NG (Newtonian gravity). Specifically, for massless geodesics, $r_* = 3 r_S \gamma/2$, and as you can see by evaluating the second derivative of $V_{\rm eff}(r)$, it is an unstable maximum. Massive geodesics provide a richer context. We find two solutions,

$$ \frac{r_*}{r_S} = \frac{L^2}{r_S^2} \pm \sqrt{\frac{L^4}{r_S^4} - \frac{3L^2\gamma}{r_S^2}} . \tag{15.33} $$

From this you can quickly see that NG has only one solution, at $r_* = 2L^2/r_S$. But for GR the story is a lot more interesting. There are two solutions and, as you can see by computing the second derivative of the effective potential, the outer one is stable while the inner one is unstable. As you can discover by inspecting the negative root of eq.(15.33) carefully, for radii smaller than $r = r_{*,{\rm ICO}}$, where

$$ r_{*,{\rm ICO}} = \frac{3 r_S}{2} , \tag{15.34} $$

there are no circular orbits at all, stable or unstable: the inner root only approaches $3r_S/2$ from above as $L \to \infty$. Nothing can orbit that close without falling across the horizon. Gravity is too strong. The angular momentum at which the stable and unstable orbits coalesce for timelike geodesics is $L^4 = 3 r_S^2 L^2 \gamma$, i.e., where the discriminant in eq.(15.33) vanishes. The corresponding radius is called the ISCO, or the Innermost Stable Circular Orbit,

$$ r_{*,{\rm ISCO}} = 3 r_S = 2\, r_{*,{\rm ICO}} . \tag{15.35} $$
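The circular-orbit statements above are easy to check numerically. The sketch below assumes timelike geodesics in GR ($\epsilon = 1$, $\gamma = 1$), works in units of $r_S$ by default, and takes $L^2$ rather than $L$ as the input to avoid rounding trouble at the ISCO; the function names are illustrative.

```python
import math

def dVeff_dr(r, L2, r_s=1.0, eps=1.0):
    """First derivative of V_eff = (1/2)(1 - r_s/r)(eps + L2/r^2)."""
    return eps * r_s / (2 * r**2) - L2 / r**3 + 3 * r_s * L2 / (2 * r**4)

def d2Veff_dr2(r, L2, r_s=1.0, eps=1.0):
    """Second derivative of V_eff; positive means a stable circular orbit."""
    return -eps * r_s / r**3 + 3 * L2 / r**4 - 6 * r_s * L2 / r**5

def circular_orbits(L2, r_s=1.0):
    """Return (inner, outer) circular-orbit radii for timelike geodesics, or None."""
    disc = L2**2 - 3 * r_s**2 * L2          # discriminant of the quadratic for r_*
    if disc < 0:
        return None                          # too little angular momentum: no circular orbit
    root = math.sqrt(disc)
    return (L2 - root) / r_s, (L2 + root) / r_s

inner, outer = circular_orbits(4.0)          # L^2 = 4 r_S^2
print(inner, outer)                                        # radii in units of r_S
print(d2Veff_dr2(inner, 4.0), d2Veff_dr2(outer, 4.0))      # signs give stability
print(circular_orbits(3.0))                  # the two roots merge at the ISCO
print(circular_orbits(1.0e6)[0])             # inner root approaches 3 r_S / 2 at large L
```

For $L^2 = 4 r_S^2$ the two roots come out at $2 r_S$ and $6 r_S$, with the inner one a maximum of the potential and the outer one a minimum.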

The following image is an artist’s rendition of what a black hole in front of the Large Magellanic Cloud might look like (credit: Alain Riazuelo / CC BY-SA 2.5). It looks weird because you are not used to photon trajectories being bent. The strong and nonlinear gravitational effects of the black hole are quite extreme!


16 Mon.28.Oct

16.1 Causal structure of Schwarzschild

How do light cones behave in the spacetime of the Schwarzschild black hole? In the original Schwarzschild coordinates, we had the spacetime metric

$$ ds^2 = \left(1 - \frac{r_S}{r}\right) dt^2 - \left(1 - \frac{r_S}{r}\right)^{-1} dr^2 - r^2 d\Omega_2^2 . \tag{16.1} $$

Obviously, we will have to suppress some of our four spacetime coordinates in order to fit a diagram onto a two-dimensional page. It will assist our visualizations to suppress the angular directions and focus attention on the time and radial directions. (Tip: be sure to double-check spacetime diagrams in textbooks to eliminate avoidable confusion over which coordinates are suppressed.)

For a null trajectory we have $ds^2 = 0$. For purely radial motion, we can immediately read off the slope of the light cone,

$$ \frac{dt}{dr} = \pm\left(1 - \frac{r_S}{r}\right)^{-1} . \tag{16.2} $$

As we would expect, the magnitude of this tends to unity at $r \to \infty$. Light rays go at 45° on a $(t, r)$ diagram. At slightly smaller radii, it increases a little. What happens at $r \to r_S$ you may not have expected: the magnitude of the slope of the light cone blows up! The light cone is physically squashed down to have zero opening angle. This is a coordinate singularity.

Inside the Schwarzschild radius, $g_{tt}$ and $g_{rr}$ both flip sign, seemingly switching roles. This is a symptom of the fact that this coordinate system does not actually cover the region of the black hole spacetime inside the horizon. Another symptom of the disease we see here is that it appears a photon would take an infinite amount of time to fall into the black hole. It does – in these coordinates.

Redshifting depends on the coordinate system. To do a better job of probing the causal structure of the Schwarzschild black hole spacetime, we are actually better off aiming to answer more invariant questions, like “How much affine parameter does it take before a freely falling particle hits the singularity?” We can also hunt for better coordinate systems which do cover the entire black hole spacetime, not just the region outside the horizon.


Let us start by inspecting what we have so far in our Schwarzschild coordinates. For radial null paths,

$$ \frac{dt}{dr} = \frac{\pm 1}{(1 - r_S/r)} . \tag{16.3} $$

Defining a new radial coordinate $r_*$ by

$$ \frac{dt}{dr_*} = \pm 1 \tag{16.4} $$

gives

$$ \frac{r_*}{r_S} = \int \frac{d(r/r_S)}{(1 - r_S/r)} , \tag{16.5} $$

so that

$$ r_* = r + r_S \ln\left(\frac{r}{r_S} - 1\right) , \tag{16.6} $$

which is known as the tortoise coordinate. This ranges over $r_* \in (-\infty, +\infty)$ as $r$ runs over $(r_S, \infty)$, whereas the original radial coordinate ranged over $r \in [0, \infty)$. So this tortoise coordinate also only covers the region outside the horizon. The benefit of using these coordinates is that the radial null paths are simple,

$$ t = \pm r_* + c . \tag{16.7} $$

The light cones are all at 45° in tortoise coordinates.

Using the tortoise coordinate, our black hole metric becomes

$$ ds^2 = \left(1 - \frac{r_S}{r(r_*)}\right)\left[dt^2 - dr_*^2\right] - r^2(r_*)\, d\Omega_2^2 . \tag{16.8} $$
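The defining property of the tortoise coordinate, $dr_*/dr = (1 - r_S/r)^{-1}$, can be confirmed with a quick finite-difference check. This is only a sketch: it works in units of $r_S$ by default, the function names are illustrative, and the sample radii are arbitrary.

```python
import math

def tortoise(r, r_s=1.0):
    """Tortoise coordinate r_* = r + r_s ln(r/r_s - 1), defined for r > r_s."""
    return r + r_s * math.log(r / r_s - 1.0)

def null_slope(r, r_s=1.0):
    """Magnitude of dt/dr for radial null rays in Schwarzschild coordinates."""
    return 1.0 / (1.0 - r_s / r)

r, h = 3.0, 1e-5
drstar_dr = (tortoise(r + h) - tortoise(r - h)) / (2 * h)   # central difference
print(drstar_dr, null_slope(r))

# The horizon is pushed off to r_* -> -infinity, though only logarithmically.
for r_near in (1.1, 1.0 + 1e-6, 1.0 + 1e-12):
    print(r_near, tortoise(r_near))
```

Since $dt/dr_* = (dt/dr)/(dr_*/dr)$, agreement of the two printed numbers is the statement that radial light rays travel at 45° in the $(t, r_*)$ plane.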

Next, let us try adapting our coordinates to null motion. Define null coordinates in the time-radius plane,

$$ u \equiv t - r_* , \tag{16.9} $$

$$ v \equiv t + r_* . \tag{16.10} $$

Then our black hole spacetime metric takes the form

$$ ds^2 = \left(1 - \frac{r_S}{r(u,v)}\right) dv^2 - 2\,dr\,dv - r^2(u,v)\, d\Omega_2^2 . \tag{16.11} $$

These coordinates are called Eddington-Finkelstein coordinates. As you can check, this metric remains invertible, including at the horizon.

Then for radial null motion, we have

$$ \left(1 - \frac{r_S}{r}\right)\left(\frac{dv}{dr}\right)^2 - 2\,\frac{dv}{dr} = 0 , \tag{16.12} $$

so that

$$ \frac{dv}{dr} = \begin{cases} \dfrac{2}{(1 - r_S/r)} & \text{(outgoing)} , \\[2ex] 0 & \text{(ingoing)} . \end{cases} \tag{16.13} $$


Because the first solution is positive, it is relevant for outgoing radial null paths. The second solution is the one relevant for ingoing radial null paths.

Notice what this implies about our light cones in $(v, r)$ coordinates. We have that for the ingoing ‘side’ (on a 2D diagram) of the light cones, this always hugs $v = {\rm const}$. For the outgoing ‘side’ of the light cones, the slope depends on $r/r_S$. At $r \to \infty$, this slope is 2. If we are at a finite $r > r_S$, then the slope is positive and bigger than 2. At $r = r_S$ the slope becomes infinite, pointing straight up the $v$-axis. For $r < r_S$ the slope becomes negative, and points towards the inside of the black hole only. This represents the physics that we want in a rather more elegant way than Schwarzschild coordinates did. Infalling photons do not make it out of the black hole once they have crossed the horizon.
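The behaviour of the outgoing side of the light cone described above can be tabulated directly from eq.(16.13). The snippet below is a small sketch in units of $r_S$; the function name and the sample radii are illustrative.

```python
def outgoing_slope(r, r_s=1.0):
    """dv/dr along the outgoing branch of radial null rays; the ingoing branch has dv/dr = 0."""
    return 2.0 / (1.0 - r_s / r)

# Far away, just outside, and inside the horizon (r in units of r_S).
for r in (1.0e9, 10.0, 2.0, 1.01, 0.99, 0.5):
    print(f"r = {r:>12g}   dv/dr = {outgoing_slope(r):.6g}")
```

Once $r < r_S$ the slope is negative, so along both sides of the light cone increasing $v$ goes with non-increasing $r$.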

The following picture is a summary of what we have found out about light cones in Eddington-Finkelstein coordinates.

A nice feature of Eddington-Finkelstein coordinates is that our light-cones do not get squished down to infinitely thin pencils. But note carefully that they do turn over at the horizon. Note that with these new coordinates we have managed to cover the region $t \to +\infty$ of the black hole spacetime, because at constant $v$, decreasing $r$ sends $t \to +\infty$. So we have extended in one direction. How about the other direction?

Are there other coordinates that might restore the symmetry between $u$ and $v$? Our Eddington-Finkelstein coordinates so far privileged $v$. Because of that, they are known as ingoing Eddington-Finkelstein coordinates. It turns out that we can alternatively find a second set of Eddington-Finkelstein coordinates, adapted for outgoing rather than ingoing null paths, in which we have

$$ ds^2 = \left(1 - \frac{r_S}{r(u,v)}\right) du^2 + 2\,dr\,du - r^2(u,v)\, d\Omega_2^2 . \tag{16.14} $$

Working back from our definitions, we see that this corresponds to the region $t \to -\infty$, as compared to the region $t \to +\infty$ which the first set of Eddington-Finkelstein coordinates extended us to. In outgoing Eddington-Finkelstein coordinates $(u, r)$, the slope of the light cones is

$$ \frac{du}{dr} = \begin{cases} \dfrac{-2}{(1 - r_S/r)} & \text{(ingoing)} , \\[2ex] 0 & \text{(outgoing)} . \end{cases} \tag{16.15} $$

The accompanying picture illustrates this.


Can we uncover yet more regions of the Schwarzschild black hole spacetime? It turns out that the answer is yes, if we use another even smarter coordinate system known as Kruskal-Szekeres coordinates.

Our first guess for how to get further is furnished by choosing both light-cone coordinates $(u, v)$, in place of $(u, r)$, or $(v, r)$, or $(t, r_*)$, or $(t, r)$. We find immediately that

$$ ds^2 = \left(1 - \frac{r_S}{r(u,v)}\right) du\,dv - r^2(u,v)\, d\Omega_2^2 , \tag{16.16} $$

where

$$ \frac{1}{2 r_S}(v - u) = \frac{r}{r_S} + \ln\left(\frac{r}{r_S} - 1\right) , \tag{16.17} $$

which implicitly defines $r$ as a function of $(u, v)$.

This is looking more promising: our light-cones will stay at 45° in these $(u, v)$ coordinates. But there is still one big fly in the ointment: we still have the problem that the horizon is located infinitely far away. To cure this symptom, we make an exponential mapping to bring the horizon to a finite place,

$$ U = -\exp\left(-\frac{u}{2r_S}\right) , \qquad V = +\exp\left(+\frac{v}{2r_S}\right) . \tag{16.18} $$

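In these Kruskal-Szekeres coordinates we find, using $du\,dv = -4r_S^2\, dU\,dV/(UV)$ and $UV = -(r/r_S - 1)\,e^{r/r_S}$,

$$ ds^2 = \left[\frac{4 r_S^3}{r(U,V)}\, e^{-r(U,V)/r_S}\right] dU\,dV - r^2(U,V)\, d\Omega_2^2 . \tag{16.19} $$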

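Picking apart the null $(U, V)$ coordinates into time $T$ and radius $R$ coordinates, via

$$ T = \frac{1}{2}(V + U) = \sqrt{\frac{r}{r_S} - 1}\; \exp\left(\frac{r}{2r_S}\right) \sinh\left(\frac{t}{2r_S}\right) , \tag{16.20} $$

$$ R = \frac{1}{2}(V - U) = \sqrt{\frac{r}{r_S} - 1}\; \exp\left(\frac{r}{2r_S}\right) \cosh\left(\frac{t}{2r_S}\right) , \tag{16.21} $$

gives the spacetime metric

$$ ds^2 = \left[\frac{4 r_S^3}{r(T,R)}\, e^{-r(T,R)/r_S}\right]\left(dT^2 - dR^2\right) - r^2(T,R)\, d\Omega_2^2 , \tag{16.22} $$

where $r(T,R)$ is implicitly defined by

$$ T^2 - R^2 = \left(1 - \frac{r}{r_S}\right) e^{r/r_S} . \tag{16.23} $$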


This slick manipulation probably feels like it just happened at 100km/h. So let us slow down a little, and unpack all of what these new amazing Kruskal coordinates allow us to see for the physics of the Schwarzschild black hole.

In Kruskal-Szekeres coordinates,

• Radial null motion occurs along

$$ T = \pm R + c_1 . \tag{16.24} $$

• Surfaces of constant $r$ are at

$$ T^2 - R^2 = \left(1 - \frac{r}{r_S}\right) e^{r/r_S} , \tag{16.25} $$

which are hyperbolas in the $(T, R)$ plane.

• Surfaces of constant $t$ are at

$$ \frac{T}{R} = \tanh\left(\frac{t}{2r_S}\right) , \tag{16.26} $$

which are simply straight lines in the $(T, R)$ plane.

• The event horizon is at

$$ T = \pm R . \tag{16.27} $$

This has two solutions, corresponding physically to having both a black hole horizon and a white hole horizon.

• The singularity is at

$$ T^2 - R^2 = 1 . \tag{16.28} $$

This has two solutions, one which corresponds to a black hole singularity and one which corresponds to a white hole singularity.

• What ranges do our coordinates cover? We see that $(T, R)$, or equivalently $(U, V)$, range over all possible values aside from where the curvature singularity occurs:

$$ -\infty < T < \infty , \qquad -\infty < R < \infty , \qquad T^2 - R^2 < 1 . \tag{16.29} $$

Note: it appears that the $U, V$ may be ill-defined inside the horizon, but it is actually the original $t, r$ coordinates that are ill-defined there. The $U, V$ Kruskal coordinates are well-defined, except of course in the disallowed singular region. This is the really key part of using Kruskal coordinates which allows us to obtain what is known as the maximal analytic extension of the Schwarzschild spacetime. (In the figure below, $r_g$ is our $r_S$, $t$ is our $T$, and $r$ is our $R$.)
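Several of the bullet points above can be spot-checked numerically in the exterior region. The sketch below maps a Schwarzschild point $(t, r)$ with $r > r_S$ to $(T, R)$ using the right-hand sides of eqs.(16.20)-(16.21), then tests the constant-$r$ hyperbola and the constant-$t$ straight line. Units of $r_S$ are assumed by default and the function name is illustrative.

```python
import math

def kruskal_exterior(t, r, r_s=1.0):
    """Map Schwarzschild (t, r) with r > r_s to Kruskal-Szekeres (T, R)."""
    amp = math.sqrt(r / r_s - 1.0) * math.exp(r / (2.0 * r_s))
    T = amp * math.sinh(t / (2.0 * r_s))
    R = amp * math.cosh(t / (2.0 * r_s))
    return T, R

t, r = 0.7, 2.5
T, R = kruskal_exterior(t, r)

# Constant-r surfaces are hyperbolas: T^2 - R^2 = (1 - r/r_s) e^{r/r_s}.
print(T**2 - R**2, (1.0 - r) * math.exp(r))

# Constant-t surfaces are straight lines through the origin: T/R = tanh(t / 2 r_s).
print(T / R, math.tanh(t / 2.0))

# Approaching the horizon at fixed t, the point slides towards T = R = 0.
print(kruskal_exterior(t, 1.0 + 1e-10))
```

Note that at fixed finite $t$ the limit $r \to r_S$ lands on the origin $T = R = 0$; the rest of the horizon $T = \pm R$ is only reached by sending $t \to \pm\infty$ at the same time.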


Notice how the Kruskal diagram actually has extra regions by comparison to the original Schwarzschild coordinate patch. These new extra regions can be abbreviated as II, III, and IV. From region I we can, via future-directed null rays, go into region II. So it makes sense to interpret this part as the region behind the black hole event horizon. And you can see from the picture above that the black hole singularity is in region II.

Suppose, from region I, we followed instead a past-directed null ray. Then what? According to our Kruskal diagram, we would cross a horizon to go into another region – III – with another singularity, the white hole singularity, which can be loosely called the ‘mirror image’ of the singularity in region II under time reversal. The horizon in region III is the white hole horizon that we identified in our list of bullet points above.

By following future-directed null rays from region III, or past-directed null rays from region II, we can see a second asymptotically flat region. But we can never communicate with it! It is a causally separated place unconnected by timelike or null geodesics to the original asymptotic region. Some people like to speak of the Schwarzschild geometry as a “wormhole” connecting two asymptotically flat regions, but it is not physical in any sense to call it a wormhole because it is not traversable. It closes up too quickly for any physical observer (even an electron) to cross from I to IV. For more details, see p.228 of Carroll.

A black hole formed in gravitational collapse would involve at most regions I and II. Regions III and IV would not be present; there would be no white hole, only a black hole.

What is a white hole, physically? Mathematically, it is the time-reverse of a black hole. Such a beast cannot actually be formed in gravitational collapse – that produces a black hole with a future horizon, not a white hole with a past horizon. The other interesting fact about white holes, shown by D.M. Eardley in 1974, is that a white hole is unstable to collapsing into a black hole. For these and other reasons, you do not need to worry about the physics of white holes if you are considering classical gravity. Only quantum gravity theorists need to worry our heads about such things.


17 Thu.31.Oct

17.1 Charged black holes

The Reissner-Nordström solution is obtained when we assume staticity and spherical symmetry, and allow an energy-momentum tensor coming from the electromagnetic field. Since there are no known magnetic monopoles that could source a magnetic field, we will stick with an electric field.[14] Then the only nonzero component of $F^{\mu\nu}$ is $F^{tr}$. Let us assume the same metric ansatz as we had for Schwarzschild,

$$ ds^2 = e^{2\alpha(r)} dt^2 - e^{2\beta(r)} dr^2 - r^2 d\Omega_2^2 . \tag{17.1} $$

The covariant source-free Maxwell equation $\nabla_\mu F^{\mu\nu} = 0$ can be rewritten in the form

$$ \frac{1}{\sqrt{-g}}\, \partial_\mu\left(\sqrt{-g}\, F^{\mu\nu}\right) = 0 , \tag{17.2} $$

where $-g$ is shorthand for the negative of the determinant of the downstairs spacetime metric. This Maxwell equation simplifies significantly by virtue of spherical symmetry. In our spacetime ansatz, we have

$$ \sqrt{-g} = e^{\alpha + \beta}\, r^2 \sin\theta , \tag{17.3} $$

so that the Maxwell equation implies

$$ \partial_r\left(r^2 \sin\theta\, e^{\alpha + \beta} F^{tr}\right) = 0 , \tag{17.4} $$

which we can immediately integrate by eye to

$$ F^{tr} = \frac{c_1}{r^2}\, e^{-\alpha - \beta} . \tag{17.5} $$

(Note: if we had been more rigorous and put in a delta function charge source on the RHS of the Maxwell equation, $c_1$ would have been the electric charge.)

The next step in solving the Einstein-Maxwell system is to substitute in the above electric field into the energy-momentum tensor and apply Einstein’s equations. The details are similar in spirit but longer in practice than what we did before in deriving Schwarzschild, so we will not drag you through the algebra. The really nice thing is that, even with the electric field turned on, it turns out that the Einstein equations still furnish the relationship

$$ \alpha = -\beta + {\rm const.} , \tag{17.6} $$

between the time-time component of the metric and the space-space component. Integrating up the $\theta\theta$ Einstein equation like we did for Schwarzschild produces the solution,

$$ ds^2_{\rm RN} = -\left(1 - \frac{2G_N M}{c^2 r} + \frac{G_N Q^2}{4\pi\varepsilon_0 c^4 r^2}\right) dt^2 + \left(1 - \frac{2G_N M}{c^2 r} + \frac{G_N Q^2}{4\pi\varepsilon_0 c^4 r^2}\right)^{-1} dr^2 + r^2 d\Omega_2^2 , \tag{17.7} $$

[14] If you wanted to do the magnetic case, you would find $F_{\theta\phi} = P \sin\theta$, where $P \propto$ magnetic charge.


where we have temporarily restored physical constants that we would usually set to unity. We can make this look slightly prettier by defining

$$ \mu \equiv \frac{G_N M}{c^2} , \quad \text{and} \quad q^2 \equiv \frac{G_N Q^2}{4\pi\varepsilon_0 c^4} ; \tag{17.8} $$

then the metric function $1 - 2\mu/r + q^2/r^2$ vanishes at

$$ r_\pm = \mu \pm \sqrt{\mu^2 - q^2} . \tag{17.9} $$

The geometry has two event horizons, an outer horizon and an inner horizon. As you can check by computing the full contraction of the Riemann tensor with itself, the curvature singularity is located at $r = 0$. The “singularities” in the metric at $r = r_\pm$ are just coordinate singularities, like the one we encountered for Schwarzschild.

There are three cases for Reissner-Nordström metrics depending on the sign of what is under the square root in the above formula.

1. $\mu^2 < q^2$: This is unphysical. The event horizon walks off into the complex plane and the singularity at the origin is then naked. Oops!

2. $\mu^2 > q^2$: This is physical. It includes the limit of zero charge, which gives back Schwarzschild ($r_+ = 2\mu$, $r_- = 0$). Here there are two horizons, at $r = r_\pm$. The singularity in this case is timelike, as compared to spacelike for Schwarzschild.

3. $\mu^2 = q^2$: This is also physical, and is known as the extremal Reissner-Nordström spacetime. You can think of it as having exquisitely balanced gravitational attraction and electric repulsion.
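The three cases above amount to the sign of the discriminant $\mu^2 - q^2$. As a small sketch, with $\mu$ and $q$ in the same length units and an illustrative function name:

```python
import math

def rn_horizons(mu, q):
    """Return (r_plus, r_minus) for the given mu and q, or None if mu^2 < q^2."""
    disc = mu**2 - q**2
    if disc < 0:
        return None                      # case 1: no horizon, naked singularity
    root = math.sqrt(disc)
    return mu + root, mu - root          # case 2 (root > 0) or case 3 (root == 0)

def metric_function(r, mu, q):
    """The combination 1 - 2 mu / r + q^2 / r^2 appearing in g_tt and 1/g_rr."""
    return 1.0 - 2.0 * mu / r + q**2 / r**2

print(rn_horizons(1.0, 0.0))    # zero charge: Schwarzschild, r_+ = 2 mu and r_- = 0
print(rn_horizons(1.0, 0.6))    # two distinct horizons
print(rn_horizons(1.0, 1.0))    # extremal: the horizons coincide at r = mu = |q|
print(rn_horizons(1.0, 2.0))    # over-charged: None
```

Evaluating `metric_function` at either returned radius gives zero up to rounding, which is the defining property of $r_\pm$.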

The Penrose diagrams for Reissner-Nordström black hole spacetimes are available in HEL §12.6, if you wish to peruse them to obtain intuition. Note however one important caveat on the maximal analytic extensions that display an infinite number (!) of asymptotic regions. The inner horizon has the property that probe perturbations coming in from past null infinity $\mathscr{I}^-$ tend to bunch up there: their magnitude grows out of control. But if the perturbation amplitude were that big, then it would surely backreact on the geometry, from having so much energy-momentum. This would entail changing the solution that we already wrote down. What this teaches us is that the semiclassical perturbation analysis is breaking down. Most likely, the singularity of a real physical charged black hole would become spacelike, covered by only one horizon, not two.

At this point we can make one more advanced comment, concerning the physical realism of charged black hole solutions. Quantum field theory shows you that charged black holes in real astrophysical situations will actually discharge rather quickly, via the Schwinger process, which nucleates charged particle-antiparticle pairs (e.g. electron-positron pairs) in an electric field, about a Compton wavelength apart. So if you mention a charged black hole to an astrophysicist, they tend to burst out laughing. But in some ways the joke is on them, because focusing on the dynamics of charged black holes was what led string theorists to perform the first-ever first-principles computation of the entropy of black holes in 1996, a discovery whose development indirectly helped me get hired! To finesse the astrophysicist’s objection, you can imagine that the “electric” charge we are discussing is not carried by light quanta in the theory.


A note about two neat properties of our extremal Reissner-Nordström black hole. First, we will be able to see pretty quickly that there are multi black hole solutions in this case. This spacetime has one double horizon at $r = r_+ = r_- = |q|$,

$$ ds^2_{\rm ERN} = -\left(1 - \frac{|q|}{r}\right)^2 dt^2 + \left(1 - \frac{|q|}{r}\right)^{-2} dr^2 + r^2 d\Omega_2^2 . \tag{17.10} $$

We can easily define a shifted radial coordinate

$$ \rho := r - |q| . \tag{17.11} $$

Then $d\rho = dr$ and

$$ \left(1 - \frac{|q|}{r}\right) = \left(\frac{r - |q|}{r}\right) = \frac{\rho}{\rho + |q|} = \left(1 + \frac{|q|}{\rho}\right)^{-1} . \tag{17.12} $$

Defining

$$ H(\rho) = 1 + \frac{|q|}{\rho} , \tag{17.13} $$

we have

$$ \begin{aligned} ds^2_{\rm ERN} &= -H^{-2} dt^2 + H^2 d\rho^2 + (\rho + |q|)^2 d\Omega_2^2 \\ &= -H^{-2} dt^2 + H^2 \left(d\rho^2 + \rho^2 d\Omega_2^2\right) . \end{aligned} \tag{17.14} $$

This coordinate system is known as isotropic coordinates because the metric in parentheses is the standard Euclidean metric in spherical polar coordinates. We also find that

$$ \sqrt{G_N}\, A_t = H^{-1} - 1 . \tag{17.15} $$

If we substituted this ansatz for the gauge potential and the metric into Maxwell’s equations and the Einstein equations, we would find that they require only one equation between them,

$$ \vec{\nabla}^2 H = 0 . \tag{17.16} $$

In other words, $H$ is a harmonic function of Cartesian coordinates $\vec{x}$ obtained from the isotropic spherical coordinates. It is actually possible to have multi black hole solutions of this system, because of the exact cancellation between gravitational attraction and electric repulsion between any two of the black hole centres!

$$ H = 1 + \sum_{a=1}^{N} \frac{G_N M_a}{|\vec{x} - \vec{x}_a|} . \tag{17.17} $$

This ability to superpose is extremely niche: it very generally fails in GR, a nonlinear theory.

Another interesting feature of the extremal Reissner-Nordström spacetime is what happens when you take the near-horizon limit $|\vec{x}| \to 0$. This in effect removes the 1 from the harmonic function. If you look carefully at the single-centred black hole metric in this limit, you will find that it produces AdS$_2 \times S^2$, two-dimensional Anti de Sitter spacetime times a two-sphere. This fact is related to the famous AdS/CFT correspondence of string theory.
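The reason the multi-centre function (17.17) is allowed is simply that a sum of $1/|\vec{x} - \vec{x}_a|$ terms is harmonic away from the centres. The sketch below checks this with a second-order finite-difference Laplacian; the centre positions, the values of $G_N M_a$, the sample point, and the step size are all arbitrary illustrative choices, and `math.dist` requires Python 3.8 or later.

```python
import math

def H(x, centres):
    """Multi-centre harmonic function; centres is a list of (G_N * M_a, position) pairs."""
    return 1.0 + sum(gm / math.dist(x, xa) for gm, xa in centres)

def laplacian(f, x, h=1e-3):
    """Second-order central-difference Laplacian of f at the 3D point x."""
    total = -6.0 * f(x)
    for i in range(3):
        for step in (+h, -h):
            y = list(x)
            y[i] += step
            total += f(y)
    return total / h**2

# Three hypothetical black hole centres with different masses.
centres = [
    (1.0, (1.0, 0.0, 0.0)),
    (0.5, (-1.0, 0.5, 0.0)),
    (2.0, (0.0, -1.0, 1.0)),
]
point = (0.3, -0.2, 0.5)           # a sample point away from all three centres

lap_H = laplacian(lambda x: H(x, centres), point)
print(H(point, centres), lap_H)    # the Laplacian should vanish up to discretization error
```

The residual scales like $h^2$, so it is discretization error rather than a failure of $\vec\nabla^2 H = 0$.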


17.2 Rotating black holes

Now we move to discussing the Kerr black hole, which has not only mass but also angular momentum. Our discussion here will be based largely on HEL §13. Demanding that the spacetime be stationary and axisymmetric requires an ansatz of the form

$$ ds^2 = e^{2\alpha(r,\theta)} dt^2 - e^{2\gamma(r,\theta)} \left[d\phi - \omega(r,\theta)\, dt\right]^2 - e^{2\beta(r,\theta)} dr^2 - e^{2\delta(r,\theta)} d\theta^2 . \tag{17.18} $$

Note how many more functions we have turned on here, and the fact that there is now both $r$ and $\theta$ dependence in all our metric functions. Mathematically speaking, this complicates the hell out of the process of solving the Einstein equations, because we now have PDEs in two variables instead of ODEs in $r$ only. We will not actually prove that the Kerr solution solves the vacuum Einstein equations, because the algebra is awful.[15] Instead, we will derive some fascinating physical properties of spacetimes of the above form, and just present the Kerr solution, gift wrapped with a bow on top.

From the above ansatz we see that

$$ g_{tt} = e^{2\alpha} - e^{2\gamma}\omega^2 . \tag{17.19} $$

Since the metric has an off-diagonal component, $g_{t\phi}$, inverting to find the upstairs metric is slightly more complicated. We can easily read off two of the components,

$$ g^{rr} = -e^{-2\beta} , \tag{17.20} $$

$$ g^{\theta\theta} = -e^{-2\delta} , \tag{17.21} $$

but for the $(t, \phi)$ block we need to invert the $2 \times 2$ matrix. The result is

$$ g^{tt} = e^{-2\alpha} , \tag{17.22} $$

$$ g^{\phi\phi} = -e^{-2\gamma} + \omega^2 e^{-2\alpha} , \tag{17.23} $$

$$ g^{t\phi} = +\omega\, e^{-2\alpha} . \tag{17.24} $$

It is possible to see one of the most intriguing consequences of GR, known as the dragging of inertial frames, without getting specific about the form of any of the functions in our metric ansatz. Since the metric obeys $\partial_\phi g_{\mu\nu} = 0$, $p_\phi$ is conserved along a geodesic. Then

$$ p^\phi = g^{\phi\mu} p_\mu = g^{\phi\phi} p_\phi + g^{\phi t} p_t , \tag{17.25} $$

and similarly

$$ p^t = g^{tt} p_t + g^{t\phi} p_\phi . \tag{17.26} $$

Let us specialize to the case of $p_\phi = 0$: no initial angular momentum. This quantity remains zero along the geodesic. Then, recalling our relationship between the momentum and the tangent vector for either massive or massless geodesics,

$$ p^\mu \propto \frac{dx^\mu}{d\lambda} , \tag{17.27} $$

[15] If you are a masochist and want to see it for yourself, please make Maxima do it.


we have

$$ \frac{d\phi}{dt} = \frac{p^\phi}{p^t} = \frac{g^{t\phi}}{g^{tt}} = \omega(r,\theta) . \tag{17.28} $$

In other words, $\omega$ is the coordinate angular velocity of a particle (massive or massless) with no angular momentum. What we have obtained here might not look like much, but it is physically remarkable. A particle dropped straight inwards from infinity will not end up continuing straight inwards – instead, gravity drags the particle around so it acquires an angular velocity. This effect is known as the dragging of inertial frames.
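The inversion of the $(t, \phi)$ block and the resulting angular velocity can be verified at a single point with arbitrary sample values of the metric functions. In this sketch the numbers chosen for $\alpha$, $\gamma$, $\omega$ are purely illustrative, and the lower components are read off from the stationary ansatz as $g_{tt} = e^{2\alpha} - e^{2\gamma}\omega^2$, $g_{t\phi} = e^{2\gamma}\omega$, $g_{\phi\phi} = -e^{2\gamma}$.

```python
import math

def t_phi_block(alpha, gamma, omega):
    """Lower-index (t, phi) block of the stationary ansatz at one point."""
    g_tt = math.exp(2 * alpha) - math.exp(2 * gamma) * omega**2
    g_tphi = math.exp(2 * gamma) * omega
    g_phiphi = -math.exp(2 * gamma)
    return g_tt, g_tphi, g_phiphi

def invert_symmetric_2x2(a, b, d):
    """Inverse of the symmetric matrix [[a, b], [b, d]], returned as (A, B, D)."""
    det = a * d - b * b
    return d / det, -b / det, a / det

alpha, gamma, omega = 0.3, -0.2, 0.7          # arbitrary sample values
g_tt, g_tphi, g_phiphi = t_phi_block(alpha, gamma, omega)
gtt_up, gtphi_up, gphiphi_up = invert_symmetric_2x2(g_tt, g_tphi, g_phiphi)

# Compare with the closed-form upstairs components quoted in the text.
print(gtt_up, math.exp(-2 * alpha))
print(gphiphi_up, -math.exp(-2 * gamma) + omega**2 * math.exp(-2 * alpha))
print(gtphi_up, omega * math.exp(-2 * alpha))

# Zero-angular-momentum geodesics: dphi/dt = g^{t phi} / g^{tt} = omega.
print(gtphi_up / gtt_up, omega)
```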

Our next task is to define a physically important surface known as the stationary limitsurface. To get basic intuition for this phenomenon, consider what happens if we assumethat a particle/observer could in principle remain at fixed (r, θ, φ). This would require a4-velocity of the form [uµ] = [ut,~0]T . Is this compatible with our spacetime? The answeris: not everywhere! In the region where gtt is negative, we see that our assumed 4-velocityis incompatible with the condition that u2 = 1. Oops. The equation gtt = 0 delineates thesurface inside which a particle/observer cannot stay stationary, and it is called the stationarylimit surface. Let us now dig a little deeper.

Imagine photons emitted from (r, θ, φ) purely in the ±φ direction at first, so that onlydt and dφ are nonzero along the photon path. Using ds2 = 0, we have

gttdt2 + 2gtφdtdφ+ gφφdφ

2 = 0 , (17.29)

so that

dφ

dt= − gtφ

gφφ±

√g2tφ

g2φφ

− gttgφφ

(17.30)

If at the emission point $g_{tt}/g_{\phi\phi} < 0$, then $d\phi/dt$ is positive (negative) for photons emitted in the $\pm\phi$ direction, even though the magnitudes differ. But when $g_{tt} = 0$, we cross over to a different behaviour. In particular, on the surface $g_{tt}(r, \theta) = 0$, known as the stationary limit surface, there are two qualitatively different solutions:

$$ \frac{d\phi}{dt} = -2\,\frac{g_{t\phi}}{g_{\phi\phi}} = 2\omega \quad \text{or} \quad \frac{d\phi}{dt} = 0 . \tag{17.31} $$

The first solution corresponds to a photon sent off in the same direction as the source rotation. The second solution shows that frame dragging is so severe that initially the photon does not move at all. This implies that a massive particle, which must always go slower than a photon, also has to rotate with the source. This is true even if it has an arbitrarily large angular momentum with opposite orientation!

As we will find next week when we start talking about experimental successes of GR, the formula for gravitational redshift of an observer at a fixed spatial location in a stationary spacetime is

$$ \left(\frac{\nu_R}{\nu_E}\right)\bigg|_{\text{stationary, fixed}} = \sqrt{\frac{g_{tt}(E)}{g_{tt}(R)}} . \tag{17.32} $$

This is why the stationary limit surface is also known as the infinite redshift surface.


18 Mon.11.Nov

18.1 The Kerr solution

Today’s material is based on parts of HEL §13. All figures shown are theirs.

How would we find the horizon in our rotating spacetime? The defining property of an event horizon is that it is a null surface. In stationary axisymmetric spacetimes, its equation must be of the form $f(r, \theta) = 0$. Nullness then implies that $g^{\mu\nu}\partial_\mu f\, \partial_\nu f = 0$, or $g^{rr}(\partial_r f)^2 + g^{\theta\theta}(\partial_\theta f)^2 = 0$. In fact, it turns out that it is possible to choose our coordinates $r$ and $\theta$ such that the equation for the horizon can be put in the form $f(r) = 0$. In this case, our condition reduces to $g^{rr}(\partial_r f)^2 = 0$, and therefore we see that the event horizon occurs when

$$ g^{rr} = 0 . \tag{18.1} $$

In our previous case of Schwarzschild, this was equivalent to the condition $g_{tt} = 0$, but that only holds for static black holes, not stationary ones.

This is a good place to mention a definition of a horizon associated to Killing vectors. Suppose that we have a Killing vector $\chi^\mu$. If that Killing vector is null along some null hypersurface $\Sigma$, then $\Sigma$ is a Killing horizon of $\chi^\mu$. Note that $\chi^\mu$ is normal to $\Sigma$ because a null surface cannot have two linearly independent null tangent vectors. Some important facts are as follows.

• Every event horizon $\Sigma$ in a stationary, asymptotically flat spacetime is a Killing horizon for some Killing vector $\chi^\mu$.

• If the spacetime is static, then $\chi^\mu$ will be the Killing vector $K^\mu = (\partial_t)^\mu$ representing time translations at infinity.

• If the spacetime is stationary but not static, then it will be axisymmetric with a rotational Killing vector $R^\mu = (\partial_\phi)^\mu$, and $\chi^\mu$ will be a linear combination $K^\mu + \Omega_H R^\mu$ for some constant $\Omega_H$.

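To prove that the empty space Einstein equations are satisfied, we need to show that the Ricci tensor is zero for metrics of our form with $\alpha, \beta, \gamma, \delta, \omega$. Take my word for it: this is a very tedious computation. Here is the Kerr metric that emerges after all the calculational dust has settled:

$$ \begin{aligned} ds^2 &= \left(1 - \frac{2\mu r}{\rho^2}\right) dt^2 + \frac{4\mu a r \sin^2\theta}{\rho^2}\, dt\, d\phi - \frac{\rho^2}{\Delta}\, dr^2 - \rho^2\, d\theta^2 - \left(r^2 + a^2 + \frac{2\mu r a^2 \sin^2\theta}{\rho^2}\right)\sin^2\theta\, d\phi^2 \\ &= \frac{\rho^2 \Delta}{\Sigma^2}\, dt^2 - \frac{\Sigma^2 \sin^2\theta}{\rho^2}\, (d\phi - \omega\, dt)^2 - \frac{\rho^2}{\Delta}\, dr^2 - \rho^2\, d\theta^2 , \end{aligned} \tag{18.2} $$

where

$$ \rho^2 = r^2 + a^2\cos^2\theta , \qquad \Sigma^2 = (r^2 + a^2)^2 - a^2 \Delta \sin^2\theta , \tag{18.3} $$

$$ \Delta = r^2 - 2\mu r + a^2 , \qquad \omega = \frac{2\mu a r}{\Sigma^2} . \tag{18.4} $$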

The coordinate system in which we have presented the Kerr metric is known as the Boyer-Lindquist coordinate system. Note: this was not the original coordinate system used by Kerr when he derived the black hole solution; those coordinates are known as Kerr-Schild coordinates.

Where is the singularity of the Kerr spacetime describing a rotating black hole? Computing the full contraction of Riemann with itself shows that only at $\rho^2 = 0$ do we see a physical singularity. This happens at

$$ r^2 + a^2\cos^2\theta = 0 , \tag{18.5} $$

yielding

$$ r = 0 , \qquad \theta = \frac{\pi}{2} . \tag{18.6} $$

Careful inspection reveals that this singularity is ring shaped. To see this, take the limit $M \to 0$ while keeping $a$ nonzero; the result gives Minkowski spacetime in oblate spheroidal coordinates, which are related to Cartesian coordinates by

$$ x = \sqrt{r^2 + a^2}\, \sin\theta \cos\phi , \qquad y = \sqrt{r^2 + a^2}\, \sin\theta \sin\phi , \qquad z = r\cos\theta . \tag{18.7} $$

This is to be contrasted with Schwarzschild, where the singularity was pointlike.

Where are the horizons? These occur where $g^{rr} \to 0$. This requires $\Delta = 0$, or

$$ r = r_\pm = \mu \pm \sqrt{\mu^2 - a^2} . \tag{18.8} $$

Note that, with factors of c temporarily restored for physical clarity,

$$ \mu = \frac{G_N M}{c^2} , \quad \text{and} \quad J = Mac . \tag{18.9} $$

So then we require

$$ \mu \geq |a| \tag{18.10} $$

for cosmic censorship.

Where is the stationary limit surface, which bounds the ergoregion? This happens when $g_{tt} \to 0$,

$$ r_{S\pm} = \mu \pm \sqrt{\mu^2 - a^2\cos^2\theta} . \tag{18.11} $$
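As a quick numerical illustration of eqs. (18.8) and (18.11), here is a minimal Python sketch in units where lengths are measured in multiples of $\mu$. The function names and the sample values $\mu = 1$, $a = 0.6$ are illustrative choices, not part of the notes. It checks that $\Delta$ vanishes at $r_\pm$, that $g_{tt}$ vanishes on the stationary limit surface, and that the two surfaces touch only at the poles.

```python
import math

def kerr_horizons(mu, a):
    """Inner and outer horizon radii (r_-, r_+) from eq. (18.8); needs mu >= |a|."""
    if abs(a) > mu:
        raise ValueError("no horizon: need mu >= |a|")
    root = math.sqrt(mu**2 - a**2)
    return mu - root, mu + root

def stationary_limit(mu, a, theta):
    """Radii (r_S-, r_S+) where g_tt = 0, eq. (18.11)."""
    root = math.sqrt(mu**2 - (a * math.cos(theta))**2)
    return mu - root, mu + root

def g_tt(mu, a, r, theta):
    """g_tt = 1 - 2 mu r / rho^2, read off from eq. (18.2)."""
    rho2 = r**2 + (a * math.cos(theta))**2
    return 1.0 - 2.0 * mu * r / rho2

def delta(mu, a, r):
    """Delta = r^2 - 2 mu r + a^2 from eq. (18.4)."""
    return r**2 - 2.0 * mu * r + a**2

if __name__ == "__main__":
    mu, a = 1.0, 0.6          # illustrative values only
    r_minus, r_plus = kerr_horizons(mu, a)
    print("r_-, r_+ =", r_minus, r_plus)
    for theta in (0.0, math.pi / 4, math.pi / 2):
        _, rs_plus = stationary_limit(mu, a, theta)
        print("theta =", theta, " r_S+ =", rs_plus,
              " g_tt there =", g_tt(mu, a, rs_plus, theta))
```

The region between $r_+$ and $r_{S+}$ is widest in the equatorial plane and pinches off at $\theta = 0, \pi$.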

The following figure summarizes these aspects from a side-on perspective.


18.2 The Penrose process

Previously we started the discussion of frame dragging in GR for the Kerr spacetime. Let us now finish that line of reasoning, which will help lead us into the subject of black hole thermodynamics.

Suppose that you had the ability to fire rockets and wanted to remain fixed at $(r, \theta)$ but rotate around $\phi$. Then the 4-velocity is

$$ [u^\mu] = u^t [1, 0, 0, \Omega]^T , \tag{18.12} $$

where

$$ \Omega = \frac{d\phi}{dt} \tag{18.13} $$

is the angular velocity w.r.t. an observer at infinity. Demanding that the 4-velocity squares to $\epsilon$ gives a quadratic equation for $u^t$:

$$ g_{tt}(u^t)^2 + 2 g_{t\phi} u^t u^\phi + g_{\phi\phi}(u^\phi)^2 = (u^t)^2 \left[ g_{tt} + 2 g_{t\phi}\Omega + g_{\phi\phi}\Omega^2 \right] = \epsilon . \tag{18.14} $$

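For real solutions for $u^t$, we need

$$ g_{\phi\phi}\Omega^2 + 2 g_{t\phi}\Omega + g_{tt} \geq 0 . \tag{18.15} $$

Since $g_{\phi\phi} < 0$ everywhere, $\Omega$ must lie in the interval $\Omega \in (\Omega_-, \Omega_+)$, where

$$ \Omega_\pm = -\frac{g_{t\phi}}{g_{\phi\phi}} \pm \sqrt{\frac{g_{t\phi}^2}{g_{\phi\phi}^2} - \frac{g_{tt}}{g_{\phi\phi}}} = \omega \pm \sqrt{\omega^2 - \frac{g_{tt}}{g_{\phi\phi}}} . \tag{18.16} $$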

Notice how $\Omega_-$ can be negative if $g_{tt} > 0$. Where $g_{tt} = 0$, $\Omega_- = 0$ and $\Omega_+ = 2\omega$. This occurs on the stationary limit surface $S_+$, which is outside (or at, for $\theta = 0, \pi$) the event horizon. A special situation ensues when $\omega^2 = g_{tt}/g_{\phi\phi}$, so that $\Omega_\pm = \omega$. This holds at $\Delta = 0$, i.e. at the outer horizon. At $r = r_+$, the angular velocity has to be one value only,

$$ \Omega_H = \omega(r_+, \theta) = \frac{a}{2\mu r_+} . \tag{18.17} $$

This is independent of $\theta$, which is a highly nontrivial physics fact. It is also the maximum allowed value of the angular velocity inside the ergoregion.
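The statements around eqs. (18.16) and (18.17) can be spot-checked numerically. The sketch below is a rough illustration: it builds $g_{tt}$, $g_{t\phi}$, $g_{\phi\phi}$ from eq. (18.2) (with $g_{t\phi}$ taken as half the coefficient of $dt\,d\phi$), evaluates $\Omega_\pm$, and confirms that the allowed interval collapses to $a/(2\mu r_+)$ at the outer horizon for several polar angles. The helper names, the sample values, and the small tolerance used to absorb rounding error are all assumptions made for illustration.

```python
import math

def kerr_metric(mu, a, r, theta):
    """(g_tt, g_tphi, g_phiphi) from eq. (18.2), signature (+,-,-,-)."""
    s2 = math.sin(theta)**2
    rho2 = r**2 + a**2 * math.cos(theta)**2
    g_tt = 1.0 - 2.0 * mu * r / rho2
    g_tphi = 2.0 * mu * a * r * s2 / rho2   # half the dt dphi coefficient
    g_phiphi = -(r**2 + a**2 + 2.0 * mu * r * a**2 * s2 / rho2) * s2
    return g_tt, g_tphi, g_phiphi

def omega_range(mu, a, r, theta):
    """Return (Omega_-, omega, Omega_+) of eq. (18.16); theta must not be 0 or pi."""
    g_tt, g_tphi, g_pp = kerr_metric(mu, a, r, theta)
    omega = -g_tphi / g_pp
    disc = omega**2 - g_tt / g_pp
    if disc < -1e-12:
        raise ValueError("no real Omega: fixed-(r, theta) motion is not allowed here")
    root = math.sqrt(max(disc, 0.0))        # clip rounding noise at the horizon
    return omega - root, omega, omega + root

if __name__ == "__main__":
    mu, a = 1.0, 0.6                          # illustrative values only
    r_plus = mu + math.sqrt(mu**2 - a**2)
    print("a / (2 mu r_+) =", a / (2.0 * mu * r_plus))
    for theta in (0.3, 1.0, math.pi / 2):
        print("theta =", theta, omega_range(mu, a, r_plus, theta))
```

Evaluating the same function on the stationary limit surface instead of the horizon should return $\Omega_- \approx 0$ and $\Omega_+ \approx 2\omega$, as stated in the text.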

Now we have all the ingredients at hand to discuss the Penrose process. Suppose that we have an observer at infinity with fixed position who fires particle A into the Kerr black hole ergoregion. Then the energy of A measured at the emission event $E$ is

$$ E^{(A)} = p^{(A)}(E) \cdot u_{\rm obs} = p^{(A)}_t(E) , \tag{18.18} $$

where the observer 4-velocity is $[u^\mu_{\rm obs}] = [1, \vec{0}]^T$. Now suppose that inside the ergoregion, particle A decays into two other particles: $A \to B + C$. Then momentum conservation implies that

$$ p^{(A)}(D) = p^{(B)}(D) + p^{(C)}(D) , \tag{18.19} $$

where $D$ denotes the decay event.

Suppose that C eventually makes it out to infinity. The observer at infinity measures the particle energy at the reception event $R$ to be

$$ E^{(C)} = p^{(C)}_t(R) = p^{(C)}_t(D) \tag{18.20} $$

because $p_t$ is conserved along a geodesic by virtue of stationarity: $\partial_t g_{\mu\nu} = 0$. Similarly, for the original particle,

$$ p^{(A)}_t(D) = p^{(A)}_t(E) . \tag{18.21} $$

Then the time component of the above momentum conservation equation can be rearranged to

$$ E^{(C)} = E^{(A)} - p^{(B)}_t(D) , \tag{18.22} $$

because $p^{(B)}_t$ is conserved along a geodesic.

Now, if B were to escape the ergoregion, $p^{(B)}_t$ would be timelike, and hence proportional to the particle energy as measured by an observer with purely timelike 4-velocity. Since $p^{(B)}_t > 0$, this implies that $E^{(C)} < E^{(A)}$, i.e. you get less energy out than you put in. But if B were to instead fall into the black hole, then it would forever remain in the region where $g_{tt}$ has opposite sign. Then $p^{(B)}_t$ would be interpreted as a component of spatial momentum, which could in principle be either positive or negative. (If it were the energy, it would have to be positive for a physical particle.) If $p^{(B)}_t$ happened to be negative, then $E^{(C)} > E^{(A)}$. This means that we can extract energy from a rotating black hole!

Once B has fallen inside the event horizon, it becomes part of the black hole, whose mass and angular momentum are then

$$ Mc^2 \to Mc^2 + p^{(B)}_t , \qquad J \to J - p^{(B)}_\phi . \tag{18.23} $$

If we have an observer at fixed $r, \theta$, we already worked out the 4-velocity: $[u^\mu] = u^t[1, 0, 0, \Omega]^T$, where $\Omega = d\phi/dt$ is the angular velocity w.r.t. infinity. This observer measures B’s energy to be

$$ E^{(B)} = p^{(B)}_\mu u^\mu = u^t \left( p^{(B)}_t + p^{(B)}_\phi\, \Omega \right) . \tag{18.24} $$

This quantity must be positive for a physical particle, so

$$ -p^{(B)}_\phi < \frac{p^{(B)}_t}{\Omega} . \tag{18.25} $$

Consider the quantity $L = -p^{(B)}_\phi$. What is it? This is the component of B’s angular momentum along the black hole rotation axis: there is a $-$ sign because we are working in mostly minus signature. Now, because $p^{(B)}_t < 0$ for the Penrose process and $\Omega > 0$, this means that $L < 0$, resulting in a loss of angular momentum for the black hole. You can keep extracting energy from a black hole like this until you have spun Kerr down all the way to Schwarzschild. Earlier, we learned that the angular velocity is maximal at $r = r_+$, when $\Omega = \Omega_H$. So in fact for any observer at fixed $r, \theta$, we have a general bound,

$$ \delta J < \frac{\delta M c^2}{\Omega_H} . \tag{18.26} $$

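Let us sketch the calculation of the area of the outer horizon $r_+$ in the Kerr spacetime. Writing

$$ \begin{aligned} \gamma_{ij}\, dx^i dx^j &= -ds^2(dt = 0,\ dr = 0,\ r = r_+) \qquad (18.27) \\ &= (r_+^2 + a^2\cos^2\theta)\, d\theta^2 + \left[ \frac{(r_+^2 + a^2)^2 \sin^2\theta}{(r_+^2 + a^2\cos^2\theta)} \right] d\phi^2 , \qquad (18.28) \end{aligned} $$

we define the area $A$ as

$$ A(r) = \int \sqrt{|\gamma|}\, d\theta\, d\phi . \tag{18.29} $$

From the metric, we have

$$ \sqrt{|\gamma|} = (r_+^2 + a^2)\sin\theta , \tag{18.30} $$

so that

$$ A(r_+) = 4\pi (r_+^2 + a^2) . \tag{18.31} $$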

A very cute fact about the Penrose process is that the area of the black hole horizon does not shrink when it occurs. What is the physics behind this? The angular momentum is reduced more than the mass each time we do it, and this ensures that the area of the black hole never decreases. To see a few more details, let us define the irreducible mass by

$$ \begin{aligned} M_{\rm irr}^2 &= \frac{A}{16\pi G_N^2} \qquad (18.32) \\ &= \frac{1}{4 G_N^2}\, (r_+^2 + a^2) \qquad (18.33) \\ &= \frac{1}{2}\left( M^2 + \sqrt{M^4 - \frac{J^2}{G_N^2}} \right) . \qquad (18.34) \end{aligned} $$

This might seem a tad unmotivated until we realize how it is affected by changes in $M$ and $J$. We find after some straightforward but boring algebra that

$$ \delta M_{\rm irr} = \frac{a}{4 G_N M_{\rm irr} \sqrt{G_N^2 M^2 - J^2/M^2}} \left( \frac{\delta M}{\Omega_H} - \delta J \right) . \tag{18.35} $$

Look carefully at what this implies. We had earlier that for a Penrose process, $\delta J < \delta M/\Omega_H$ (where both $\delta M$ and $\delta J$ are negative), so

$$ \delta M_{\rm irr} > 0 . \tag{18.36} $$

Therefore, the maximum work you can extract via the Penrose process is

$$ M - M_{\rm irr} = M - \frac{1}{\sqrt{2}} \sqrt{ M^2 + \sqrt{M^4 - \frac{J^2}{G_N^2}} } , \tag{18.37} $$

and this is maximized to $(1 - 1/\sqrt{2}) \simeq 29\%$ of the original energy for extreme Kerr.

The moral of the story here is that we are discovering relationships between macroscopic variables of the black hole, and this opens the door to discussing black hole thermodynamics (a topic on which I am an expert). I will have more to say about this in Winter/Spring in the second GR course, PHY484S/PHY1484S.
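To make eqs. (18.34) and (18.37) concrete, here is a small Python sketch in units with $c = 1$. It evaluates the irreducible mass and the extractable fraction $1 - M_{\rm irr}/M$, and cross-checks the result against the horizon-area form $M_{\rm irr}^2 = (r_+^2 + a^2)/(4G_N^2)$ that follows from eqs. (18.31) and (18.32). The function names and sample values are illustrative assumptions.

```python
import math

def irreducible_mass(M, J, G=1.0):
    """M_irr from eq. (18.34), units with c = 1; needs J <= G M^2."""
    return math.sqrt(0.5 * (M**2 + math.sqrt(M**4 - (J / G)**2)))

def extractable_fraction(M, J, G=1.0):
    """Fraction (M - M_irr) / M available to the Penrose process, eq. (18.37)."""
    return 1.0 - irreducible_mass(M, J, G) / M

def irreducible_mass_from_area(M, J, G=1.0):
    """Same quantity from A = 4 pi (r_+^2 + a^2) and M_irr^2 = A / (16 pi G^2)."""
    mu, a = G * M, J / M
    r_plus = mu + math.sqrt(mu**2 - a**2)
    return math.sqrt((r_plus**2 + a**2) / (4.0 * G**2))

if __name__ == "__main__":
    M = 1.0
    for J in (0.0, 0.6, 1.0):   # Schwarzschild, intermediate, extreme Kerr (G = 1)
        print("J =", J, " M_irr =", irreducible_mass(M, J),
              " extractable fraction =", extractable_fraction(M, J))
```

For $J = 0$ nothing can be extracted, while the extreme case $J = G_N M^2$ reproduces the $1 - 1/\sqrt{2}$ bound quoted above.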


19 Thu.14.Nov

The reason we teach GR is not based in theoretical aesthetics, although those are really quite beautiful and many great intellects have fallen in love with it! We teach GR and use it because it works as an experimental description of gravity. In the next few lectures, we will discuss some of the signature experiments that established GR firmly in the minds of humans worldwide.

Material on experimental tests, not including gravitational waves, is based pretty closely on Appendix 9A and §10 of the HEL textbook. All the figures displayed for this material are theirs.

19.1 Gravitational redshift

Suppose that we have a stationary spacetime of the form

$$ ds^2 = g_{00}(x^k)\,(dx^0)^2 + 2 g_{0i}(x^k)\, dx^0 dx^i + g_{ij}(x^k)\, dx^i dx^j . $$

This includes all of the types of black holes we have studied so far: Schwarzschild, Reissner-Nordström, and Kerr. Imagine two different physical observers who are massive and therefore move slower than light. Call them E for emitter and R for receiver, with worldlines $x^\mu_E(\tau_E)$ and $x^\mu_R(\tau_R)$ respectively, where $\tau_E, \tau_R$ are the proper times for those two observers. Now let E moving with 4-velocity $U_E(A)$ emit a photon at event A and R moving with 4-velocity $U_R(B)$ receive it at event B.

We can find the energy of a photon in the reference frame of a massive observer by taking the dot product of the photon’s 4-momentum with the observer’s 4-velocity,

$$ E = p_\mu U^\mu . \tag{19.1} $$

This works because we can choose the affine parameter of a null geodesic such that

$$ p^\mu = \frac{dx^\mu}{d\lambda} . \tag{19.2} $$

(Note that this is different from the convention for massive particles, for which the constant of proportionality in the above equation is the rest mass, rather than unity.) Then we have

$$ E(A) = p_\mu(A)\, U^\mu_E(A) , \tag{19.3} $$

$$ E(B) = p_\mu(B)\, U^\mu_R(B) . \tag{19.4} $$

Since in both cases $E = h\nu$, we have

$$ \frac{\nu_R}{\nu_E} = \frac{p_\mu(B)\, U^\mu_R(B)}{p_\mu(A)\, U^\mu_E(A)} . \tag{19.5} $$

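Now, since the photon’s 4-momentum is tangent to its geodesic, it is parallel transported along its path. Equivalently, the directional covariant derivative of $p_\mu$ is zero along the geodesic,

$$ 0 = \frac{D}{D\lambda}\, p_\mu = \frac{dx^\sigma}{d\lambda} \nabla_\sigma p_\mu = \frac{dx^\sigma}{d\lambda} \left( \partial_\sigma p_\mu - \Gamma^\nu_{\sigma\mu} p_\nu \right) = \frac{d}{d\lambda}\, p_\mu - \Gamma^\nu_{\mu\sigma}\, p_\nu \frac{dx^\sigma}{d\lambda} . \tag{19.6} $$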

We can use this to relate $p_\mu(B)$ to $p_\mu(A)$. Recruiting our convention $p^\mu = dx^\mu/d\lambda$, we have

$$ \frac{d}{d\lambda}\, p_\mu = \Gamma^\nu_{\mu\sigma}\, p_\nu\, p^\sigma . \tag{19.7} $$

Recall that we also have the mass shell relation for the photon,

$$ p^\mu p_\mu = 0 . \tag{19.8} $$

Suppose the emitter E and receiver R are at fixed spatial coordinates. (This would not be true for freely falling observers.) Then the spatial components of the observers’ 4-velocities vanish,

$$ U^i_E = \frac{dx^i_E}{d\tau_E} = 0 , \quad \text{and} \quad U^i_R = \frac{dx^i_R}{d\tau_R} = 0 . \tag{19.9} $$

Using $U^\mu U_\mu = 1$ for massive observers gives $U^0 = 1/\sqrt{g_{00}}$, so that

$$ \frac{\nu_R}{\nu_E}\bigg|_{\text{fixed}} = \frac{p_0(B)}{p_0(A)} \sqrt{\frac{g_{00}(A)}{g_{00}(B)}} . \tag{19.10} $$

If the metric is stationary, i.e. $\partial_0 g_{\mu\nu} = 0$, then $p_0$ is conserved by the geodesic equation. Then since the momentum vector for a photon is equal to the tangent vector, as in eq. (19.2), $p_0$ is constant along a photon geodesic, and so

$$ \frac{\nu_R}{\nu_E}\bigg|_{\text{fixed, stationary}} = \sqrt{\frac{g_{00}(x^k_E)}{g_{00}(x^k_R)}} . \tag{19.11} $$

For Schwarzschild, we obtain

$$ \frac{\nu_R}{\nu_E}\bigg|_{\text{fixed, stationary}} = \sqrt{\frac{1 - 2G_N m/(c^2 r_E)}{1 - 2G_N m/(c^2 r_R)}} . \tag{19.12} $$

For the Kerr spacetime, we previously found that the location where $g_{00} = 0$ marks the stationary limit surface (SLS), the surface inside which a stationary observer gets involuntarily dragged around with the rotating black hole spacetime geometry. Here, we see that the SLS is also the location where the gravitational redshift for an observer at a fixed spatial location becomes infinite.

The quantity z for the redshift is defined by

$$ \frac{\nu_R}{\nu_E} = \frac{1}{1 + z} . \tag{19.13} $$

If we want to find out redshifts for freely falling observers, then we need to solve the geodesic equations for the stationary spacetime in question. The analysis is even more complicated for spacetimes that are not stationary.
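For observers at fixed position in Schwarzschild, eqs. (19.12) and (19.13) are easy to evaluate. The following Python sketch does so for light leaving the solar surface and received far away. The solar values used (gravitational radius about 1.48 km, radius about $6.957 \times 10^8$ m) are assumed inputs for illustration, and the function names are hypothetical.

```python
import math

def frequency_ratio(mu, r_emit, r_recv):
    """nu_R / nu_E for fixed emitter and receiver in Schwarzschild, eq. (19.12).

    mu = G_N m / c^2 in the same length units as the radii.
    """
    return math.sqrt((1.0 - 2.0 * mu / r_emit) / (1.0 - 2.0 * mu / r_recv))

def redshift(mu, r_emit, r_recv):
    """z defined through nu_R / nu_E = 1 / (1 + z), eq. (19.13)."""
    return 1.0 / frequency_ratio(mu, r_emit, r_recv) - 1.0

if __name__ == "__main__":
    MU_SUN = 1.48e3     # m, assumed gravitational radius of the Sun
    R_SUN = 6.957e8     # m, assumed solar radius
    print("z from the solar surface to infinity:",
          redshift(MU_SUN, R_SUN, math.inf))
    # The ratio tends to zero as the emitter approaches r = 2 mu.
    print("ratio for an emitter just outside r = 2 mu:",
          frequency_ratio(1.0, 2.000001, math.inf))
```

In the weak-field limit the result reduces to $z \simeq \mu/r_E - \mu/r_R$, which is a useful sanity check on the output.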

19.2 Planetary perihelion precession

How will we discover the perihelion advance we are after? We will start by using the geodesic equations derived for the Schwarzschild geometry introduced previously. The analysis can also be done for Kerr, but for our purposes here the non-rotating case will suffice to show the essential physics. We had a conserved energy

$$ E = \left(1 - \frac{2\mu}{r}\right)\dot{t} , \tag{19.14} $$

and a conserved angular momentum

$$ L = r^2 \dot{\phi} , \tag{19.15} $$

where an overdot denotes $d/d\lambda$. Then the norm condition $g_{\mu\nu}\dot{x}^\mu \dot{x}^\nu = \epsilon$ (with $\epsilon = 0$ for photons and $+1$ for massive particles) gives

$$ -E^2 + \dot{r}^2 + \left(1 - \frac{2\mu}{r}\right)\left[ \frac{L^2}{r^2} + \epsilon \right] = 0 . \tag{19.16} $$

We saw previously that defining

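$$ V_{\rm eff}(r) = \frac{1}{2}\left[ \frac{L^2}{r^2} + \epsilon \right]\left(1 - \frac{2\mu}{r}\right) , \tag{19.17} $$

in analogy with Newtonian experience allows the rewriting

$$ \frac{1}{2}\dot{r}^2 + V_{\rm eff}(r) = \frac{1}{2}E^2 \equiv \mathcal{E} . \tag{19.18} $$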

We can combine the knowledge above to find the shape equation,

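$$ \frac{d\phi}{dr} = \frac{d\phi}{d\lambda}\left(\frac{dr}{d\lambda}\right)^{-1} , \tag{19.19} $$

giving

$$ \frac{d\phi}{dr} = \pm\frac{L}{r^2}\left[ 2\left(\mathcal{E} - V_{\rm eff}(r)\right) \right]^{-1/2} . \tag{19.20} $$

Defining the orbit parameter $b$, via

$$ b = \frac{L}{E} , \tag{19.21} $$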

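gives

$$ \frac{d\phi}{dr} = \frac{1}{r^2}\left[ \frac{1}{b^2} - \left( \frac{1}{r^2} + \frac{\epsilon}{L^2} \right)\left(1 - \frac{2\mu}{r}\right) \right]^{-1/2} , \tag{19.22} $$

where $\epsilon = 1$ for massive particles. (For photons, we would set $\epsilon = 0$ in this equation.) Now we make a change of variables, to

$$ u = \frac{L^2}{G_N M}\frac{1}{r} = \frac{L^2}{\mu}\frac{1}{r} . \tag{19.23} $$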

The radial equation for massive particles (planets, etc.) then turns into

$$ \left(\frac{du}{d\phi}\right)^2 + \frac{L^2}{\mu^2} - 2u + u^2 - \frac{2\mu^2}{L^2}u^3 = \frac{2\mathcal{E}L^2}{\mu^2} . \tag{19.24} $$

On the face of it, this does not look any simpler than before. The neat trick is to realize that differentiating this again yields a simpler second order equation! Straightforward but unilluminating algebra yields

$$ \frac{d^2u}{d\phi^2} + u = 1 + \frac{3\mu^2}{L^2}u^2 . \tag{19.25} $$

This equation is the full unadulterated GR result, and involves no approximations. The second term on the RHS of this equation would be absent in the Newtonian computation.

For the Newtonian case, you can check that the solution to the shape equation is

$$ u_0 = 1 + e\cos\phi . \tag{19.26} $$

Treating this as the zeroth order approximation to the GR result, we can substitute back

$$ u(\phi) \simeq u_0(\phi) + u_1(\phi) + \ldots \tag{19.27} $$

into eq. (19.25) and obtain a perturbative equation for first order corrections $u_1$. This gives

$$ \frac{d^2u_1}{d\phi^2} + u_1 \simeq \frac{3\mu^2}{L^2}u_0^2 . \tag{19.28} $$

As you should expect for an inherently nonlinear theory like GR, perturbation theory here is nonlinear. Substituting in the specific form of $u_0$ gives

$$ \frac{d^2u_1}{d\phi^2} + u_1 \simeq \frac{3\mu^2}{L^2}\left[ \left(1 + \frac{e^2}{2}\right) + 2e\cos\phi + \frac{e^2}{2}\cos 2\phi \right] . \tag{19.29} $$

You can check by explicitly differentiating that the solution to this is

$$ u_1 \simeq \frac{3\mu^2}{L^2}\left[ \left(1 + \frac{e^2}{2}\right) + e\phi\sin\phi - \frac{e^2}{6}\cos 2\phi \right] . \tag{19.30} $$

Notice that the first term here is a constant displacement and that the third term is oscillatory about zero. The second term, which gives rise to a cumulative effect per orbit, is the most physically important one.
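The claim that eq. (19.30) solves eq. (19.29) can be spot-checked without any algebra. The sketch below is a rough numerical illustration, not a proof: it approximates $d^2u_1/d\phi^2$ by a central finite difference and evaluates the residual of eq. (19.29) at a few angles. The prefactor $k = 3\mu^2/L^2$, the eccentricity, the step size $h$ and the function names are illustrative assumptions.

```python
import math

def u1(phi, e, k):
    """Proposed first-order solution, eq. (19.30), with k = 3 mu^2 / L^2."""
    return k * ((1.0 + e**2 / 2.0) + e * phi * math.sin(phi)
                - (e**2 / 6.0) * math.cos(2.0 * phi))

def source(phi, e, k):
    """Right-hand side of eq. (19.29)."""
    return k * ((1.0 + e**2 / 2.0) + 2.0 * e * math.cos(phi)
                + (e**2 / 2.0) * math.cos(2.0 * phi))

def residual(phi, e, k, h=1e-4):
    """u1'' + u1 - source, with u1'' from a central finite difference."""
    second = (u1(phi + h, e, k) - 2.0 * u1(phi, e, k) + u1(phi - h, e, k)) / h**2
    return second + u1(phi, e, k) - source(phi, e, k)

if __name__ == "__main__":
    e, k = 0.2056, 1.0   # Mercury-like eccentricity; k set to 1 for convenience
    for phi in (0.0, 1.0, 2.5, 7.0):
        print("phi =", phi, " residual =", residual(phi, e, k))
```

The residuals should be zero up to finite-difference and rounding error, which for this step size is many orders of magnitude below the size of the individual terms.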


Figure credit: Mpfiz - Own work, Public Domain.

From here on we just focus on that key second cumulative term on top of the zeroth order Newtonian contribution. We have

$$ u_{\rm key} = 1 + e\cos\phi + \frac{3\mu^2}{L^2}\, e\,\phi\sin\phi . \tag{19.31} $$

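This can be rewritten as

$$ u_{\rm key} = 1 + e\cos\left[(1 - \alpha)\phi\right] , \tag{19.32} $$

where

$$ \alpha = \frac{3\mu^2}{L^2} , \tag{19.33} $$

as you can see by doing a Taylor expansion to first order in small quantities,

$$ \cos[(1 - \alpha)\phi] \simeq \cos\phi + \alpha\phi\sin\phi + O(\alpha^2) . \tag{19.34} $$

Then the precession per orbit is

$$ \Delta\phi \simeq 2\pi\alpha = \frac{6\pi G_N^2 M^2}{L^2} . \tag{19.35} $$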

In order to massage this expression a little further, we need to relate $L^2$ to physical quantities we know. For the Newtonian (uncorrected) ellipse, the EOM show that

$$ a = \frac{L^2}{\mu(1 - e^2)} , \tag{19.36} $$

so that

$$ \Delta\phi = \frac{6\pi G_N M}{c^2 a(1 - e^2)} . \tag{19.37} $$

The first experimental test of this was with Mercury. For that planet, the Sun’s gravitational radius $\mu = G_N M/c^2$ is about 1.48 km, the eccentricity is about $e = 0.2056$, and the semimajor axis is about $a = 5.79 \times 10^{10}$ m. This results in a perihelion advance of about $5 \times 10^{-7}$ radians per orbit, or about 43 seconds of arc per century. Note that the observed value is actually considerably greater, but most of it comes from two prosaic places: (a) precession of the equinoxes in our geocentric coordinate system, and (b) other planets perturbing Mercury’s orbit. The residual amount of 43 seconds of arc per century is perfectly described by GR, to within experimental errors. This was not settled definitively in the experimental realm until the 1960s. For Earth, our perihelion precession is less, only about 4 seconds of arc per century. Mercury is affected most because it is closest to the Sun.
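The numbers quoted for Mercury follow directly from eq. (19.37). Here is a minimal Python sketch of the arithmetic, using the values of $\mu$, $e$ and $a$ given above. The orbital period of about 87.969 days is an assumed input not stated in these notes, and the names are illustrative.

```python
import math

MU_SUN = 1.48e3          # m, G_N M / c^2 for the Sun (value quoted in the text)
A_MERCURY = 5.79e10      # m, semimajor axis (value quoted in the text)
E_MERCURY = 0.2056       # eccentricity (value quoted in the text)
PERIOD_DAYS = 87.969     # days, assumed orbital period of Mercury

def precession_per_orbit(mu, a, e):
    """Perihelion advance in radians per orbit, eq. (19.37) with mu = G_N M / c^2."""
    return 6.0 * math.pi * mu / (a * (1.0 - e**2))

def arcsec_per_century(mu, a, e, period_days):
    """Accumulated advance over one Julian century, in seconds of arc."""
    orbits = 36525.0 / period_days
    return math.degrees(precession_per_orbit(mu, a, e) * orbits) * 3600.0

if __name__ == "__main__":
    print("radians per orbit:", precession_per_orbit(MU_SUN, A_MERCURY, E_MERCURY))
    print("arcsec per century:",
          arcsec_per_century(MU_SUN, A_MERCURY, E_MERCURY, PERIOD_DAYS))
```

Swapping in another planet's semimajor axis, eccentricity and period shows how quickly the effect falls off with distance from the Sun.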


https://commons.wikimedia.org/w/index.php?curid=11707205

20 Mon.18.Nov

20.1 Bending of light

Now let us focus on the bending of light. To start with, let us remind ourselves first of the Newtonian result. Most people think that because two photons with zero mass should feel zero Newtonian force between them, that implies that photons do not feel gravity. This is incorrect. Newton imagined light as corpuscular, and it feels gravity like any other corpuscle. The gravitational acceleration of a test mass does not depend on the mass.

In Newtonian mechanics, particles in unbound orbits move on hyperbolae rather than ellipses. The incoming path asymptotes in the infinite past to one of the separatrices, and the outgoing path asymptotes in the infinite future to the other separatrix. In principle, the particle could come as close as the radius of the stellar object as it slingshots around the star.

We can estimate the size of the effect just using dimensional analysis. The variables in the problem are: $G_N$ and $c$ (theory constants), $M$ (a solution parameter), and $b$, the radius of closest approach. Since the deflection angle we are looking for is dimensionless, we estimate that

$$ \theta \sim \frac{G_N M}{c^2 b} . \tag{20.1} $$

In principle, $\theta$ could have been any function of the dimensionless RHS. We have chosen a linear functional dependence on purpose, because we expect zero deflection angle when there is no star and because we expect a small effect overall.

We can get more precise and confirm the linear dependence by asking about the gravitational force felt by a corpuscle. Suppose that far from the star it starts out moving along the $x$-axis, and that the star is located along the negative $y$-axis. To first order in small quantities, $p_x$ is unaffected by the gravitational deflection, and the corpuscle develops a small $p_y$ by gravitational attraction. The deflection angle $|\Delta\phi| = -(p_y/p_x)_{\rm final}$ is, to first order in small quantities,

$$ |\Delta\phi| = -\frac{1}{p_x}\int_{-\infty}^{\infty} dx\, \frac{dp_y}{dx} = -\frac{1}{p_x c}\int_{-\infty}^{\infty} dx\, \frac{dp_y}{dt} = -\frac{1}{p_x c}\int_{-\infty}^{\infty} dx\, \frac{G_N M m}{(x^2 + y^2)}\frac{y}{\sqrt{x^2 + y^2}} = \frac{2 G_N M}{c^2 b} , \tag{20.2} $$

where $b$ is the impact parameter. Note that the factor of $m$ for the corpuscle cancelled out: the $m$ in the numerator arising from the gravitational force killed the $m$ in the momentum denominator $p_x = mc$. Overall, we see that the Newtonian angle for deflection of light is small but nonzero.

To analyze the answer in General Relativity, our starting point is again the geodesic equations. For photons executing equatorial motion ($\theta = \pi/2$), we had two Killing vectors giving rise to two conserved quantities and also the tangent vector norm condition,

$$ \left(1 - \frac{2\mu}{r}\right)\dot{t} = E , \tag{20.3} $$

$$ r^2\dot{\phi} = L , \tag{20.4} $$

$$ \left(1 - \frac{2\mu}{r}\right)\dot{t}^2 - \left(1 - \frac{2\mu}{r}\right)^{-1}\dot{r}^2 - r^2\dot{\phi}^2 = 0 . \tag{20.5} $$

Substituting in the conserved quantities gives for the radial equation

$$ \dot{r}^2 + \frac{L^2}{r^2}\left(1 - \frac{2\mu}{r}\right) = E^2 . \tag{20.6} $$

From the above, we can find the GR shape equation for photons moving in a Schwarzschild geometry,

$$ \frac{d\phi}{dr} = \frac{1}{r^2}\left[ \frac{1}{b^2} - \frac{1}{r^2}\left(1 - \frac{2\mu}{r}\right) \right]^{-1/2} . \tag{20.7} $$

Substituting this time

$$ u = \frac{1}{r} \tag{20.8} $$

into the shape equation, and massaging the algebra a bit further, gives

$$ \frac{d^2u}{d\phi^2} + u = 3\mu u^2 . \tag{20.9} $$

When there is no matter, the RHS of the above shape equation is zero. In that case, the solution is

$$ u(\phi) = u_0(\phi) = \frac{1}{b}\sin\phi , \tag{20.10} $$

where $b$ is the impact parameter. This just describes a straight line at an offset of $b$ from the origin. Note that with a little bit of bending from matter, we should expect a solution that is symmetrical about the axis perpendicular to the no-bending trajectory. Hence when we add matter and find the final angle as $r \to$ large, we will need to double it to get the deflection angle. Now let us work perturbatively in the gravitational bending, writing

$$ u(\phi) \simeq \frac{1}{b}\sin\phi + u_1(\phi) . \tag{20.11} $$

Substituting to find the equation of motion for the perturbation gives

$$ \frac{d^2u_1(\phi)}{d\phi^2} + u_1(\phi) \simeq \frac{3\mu}{b^2}\sin^2\phi . \tag{20.12} $$

As you can check explicitly, this is solved by

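$$ u_1(\phi) \simeq \frac{3\mu}{2b^2}\left(1 + \frac{1}{3}\cos 2\phi\right) , \tag{20.13} $$

so that

$$ u(\phi) \simeq \frac{1}{b}\sin\phi + \frac{3\mu}{2b^2}\left(1 + \frac{1}{3}\cos 2\phi\right) . \tag{20.14} $$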

This is the equation describing the trajectory of the photon in GR, to first order in perturbations about the undeflected straight-line result. So let us ask the question: what does the angle tend to as we go very far away from the gravitating body? This amounts to taking $r \to \infty$, which corresponds in our variables to $u \to 0$. In other words, we need to look for solutions of $u(\phi) = 0$. For slight deflections, $\sin\phi \simeq \phi$ and $\cos 2\phi \simeq 1$. Solving for the angle gives a slightly negative answer, corresponding to one of the separatrices of the hyperbola,

$$ \phi \simeq -\frac{2 G_N M}{c^2 b} . \tag{20.15} $$

We are not quite finished. As mentioned earlier, the GR deflection angle for photons is twice the above result,

$$ |\Delta\phi_{\rm GR}| \simeq \frac{4 G_N M}{c^2 b} . \tag{20.16} $$

Notice that GR’s final answer is twice the Newtonian result for the bending of light. For a grazing deflection by our Sun, it is about 1.75 seconds of arc.
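As a sanity check on the quoted 1.75 seconds of arc, the Newtonian estimate $2G_NM/(c^2 b)$ of eq. (20.2) and the GR result $4G_NM/(c^2 b)$ of eq. (20.16) can be evaluated for a ray grazing the Sun. This short Python sketch assumes a solar radius of about $6.957 \times 10^8$ m as the impact parameter and takes the Sun's gravitational radius to be about 1.48 km; both inputs and the function names are illustrative assumptions.

```python
import math

MU_SUN = 1.48e3     # m, G_N M / c^2 for the Sun (assumed)
R_SUN = 6.957e8     # m, assumed solar radius, used as the impact parameter b

def newtonian_deflection(mu, b):
    """Corpuscular (Newtonian) deflection angle in radians, eq. (20.2)."""
    return 2.0 * mu / b

def gr_deflection(mu, b):
    """First-order GR deflection angle in radians, eq. (20.16)."""
    return 4.0 * mu / b

def to_arcsec(angle_rad):
    """Convert radians to seconds of arc."""
    return math.degrees(angle_rad) * 3600.0

if __name__ == "__main__":
    print("Newtonian:", to_arcsec(newtonian_deflection(MU_SUN, R_SUN)), "arcsec")
    print("GR:       ", to_arcsec(gr_deflection(MU_SUN, R_SUN)), "arcsec")
```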

What if we cannot apply a perturbation analysis because the deflection angle is large? Then we would need to use the full GR geodesic equations for photons without any approximations. In that case, by making use of previous results we have derived for the shape equation for geodesics, we find the solution in terms of an integral. The total angle swept out is twice the integral of $d\phi/dr$ from eq. (20.7), and the deflection is what remains after subtracting the angle $\pi$ swept out along an undeflected straight line,

$$ |\Delta\phi_{\rm GR}| = 2\int_{r_0}^{\infty} \frac{dr}{r^2}\left[ \frac{1}{b^2} - \frac{1}{r^2}\left(1 - \frac{2\mu}{r}\right) \right]^{-1/2} - \pi , \tag{20.17} $$

where $r_0$ is the point of closest approach. At $r_0$, the $[\ldots]$ in the integrand vanishes.

Historical note: Eddington’s eclipse expedition to measure bending of light while the Sun was blocked by the Moon was accepted in 1919 and made Einstein a rock star, despite poorly understood systematic errors, because it appealed to Western Europeans in the post WWI climate of wanting peace between nations that had been at war.
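Returning to the large-deflection case: the exact deflection can be evaluated numerically. The sketch below is one possible approach, under stated assumptions rather than the method of the notes. It integrates $d\phi/dr$ of eq. (20.7) from $r_0$ to infinity, doubles the result, and subtracts $\pi$. The substitution $1/r = \sin\psi / r_0$ together with $1/b^2 = (1 - 2\mu/r_0)/r_0^2$ (the bracket vanishing at $r_0$) turns the integrand into the smooth function $\left[1 - (2\mu/r_0)(1 + s + s^2)/(1 + s)\right]^{-1/2}$ with $s = \sin\psi$ on $[0, \pi/2]$, which a composite Simpson rule handles easily. The substitution, the quadrature choice and the function name are my own illustrative choices.

```python
import math

def deflection_angle(mu, r0, n=2000):
    """Exact photon deflection in Schwarzschild for closest approach r0 > 3 mu.

    Uses 1/r = sin(psi)/r0, so the total swept angle is
    2 * integral_0^{pi/2} dpsi / sqrt(1 - k (1 + s + s^2)/(1 + s)),
    with s = sin(psi) and k = 2 mu / r0; pi is subtracted at the end.
    n is the (even) number of Simpson intervals.
    """
    if r0 <= 3.0 * mu:
        raise ValueError("r0 must exceed the photon sphere radius 3 mu")
    k = 2.0 * mu / r0

    def integrand(psi):
        s = math.sin(psi)
        return 1.0 / math.sqrt(1.0 - k * (1.0 + s + s * s) / (1.0 + s))

    h = (math.pi / 2.0) / n
    total = integrand(0.0) + integrand(math.pi / 2.0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return 2.0 * (h / 3.0) * total - math.pi

if __name__ == "__main__":
    mu = 1.0
    for r0 in (1000.0, 100.0, 10.0, 4.0):
        print("r0 =", r0, " exact:", deflection_angle(mu, r0),
              " weak-field 4 mu / r0:", 4.0 * mu / r0)
```

For $r_0 \gg \mu$ the output approaches the perturbative $4\mu/b$ of eq. (20.16), while closer to the photon sphere at $r_0 = 3\mu$ the exact angle grows much faster than the weak-field formula suggests.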


20.2 Radar echoes

One other important test of GR is measuring radar echoes in the solar system, which is about the interplay between distance and time. To analyze this, we need two ingredients. First, one of our geodesic equations for photons from earlier that took the form

$$ \dot{r}^2 + \frac{L^2}{r^2}\left(1 - \frac{2\mu}{r}\right) = E^2 . \tag{20.18} $$

We also had the energy equation

$$ \left(1 - \frac{2\mu}{r}\right)\dot{t} = E , \tag{20.19} $$

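which is the second ingredient. These can be combined to find the $t$-$r$ shape equation. Using

$$ \left(\frac{dr}{d\lambda}\right)^2 = \left(\frac{dr}{dt}\frac{dt}{d\lambda}\right)^2 = \left(\frac{dr}{dt}\right)^2 E^2 \left(1 - \frac{2\mu}{r}\right)^{-2} , \tag{20.20} $$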

we have that

$$ \left(1 - \frac{2\mu}{r}\right)^{-3}\left(\frac{dr}{dt}\right)^2 + \frac{(L/E)^2}{r^2} = \left(1 - \frac{2\mu}{r}\right)^{-1} . \tag{20.21} $$

At the distance of closest approach, which we will call $R$, we have

$$ \left(\frac{dr}{dt}\right)^2\bigg|_{r=R} = 0 , \tag{20.22} $$

so that at that point

$$ \frac{(L/E)^2}{R^2} = \left(1 - \frac{2\mu}{R}\right)^{-1} . \tag{20.23} $$

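Then, after a bit of algebra, the expression

$$ \left(\frac{dr}{dt}\right)^2 = \left(1 - \frac{2\mu}{r}\right)^2 - \frac{(L/E)^2}{r^2}\left(1 - \frac{2\mu}{r}\right)^3 \tag{20.24} $$

can then be massaged into the form

$$ \frac{dr}{dt} = \left(1 - \frac{2\mu}{r}\right)\left[ 1 - \frac{R^2(1 - 2\mu/r)}{r^2(1 - 2\mu/R)} \right]^{1/2} . \tag{20.25} $$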

We can integrate this to get the time taken to travel from radial position $R$ to $r$. It helps to begin by expanding the integrand to first order in $\mu/r$. After some algebra, we get

$$ t(r, R) \simeq \int_R^r dr\, \frac{r}{\sqrt{r^2 - R^2}}\left[ 1 + \frac{2\mu}{r} + \frac{\mu R}{r(r + R)} + \ldots \right] . \tag{20.26} $$

Then we integrate, to obtain

$$ t(r, R) \simeq \sqrt{r^2 - R^2} + 2\mu\ln\left[ \frac{r + \sqrt{r^2 - R^2}}{R} \right] + \mu\sqrt{\frac{r - R}{r + R}} + \ldots . \tag{20.27} $$

The first term on the RHS here is just what we would have got if we had drawn a straight line. So the second and third terms are quantifying the bending of photon trajectories.

Now, suppose that we bounced a radar beam out to Venus and back, grazing the Sun. Then we would have twice the sum of the second and third terms above (twice for there and back). Using the approximation that the closest approach distance is much less than the distance of either Earth or Venus from the Sun ($r_E \gg R$, $r_V \gg R$), so that $\ln\left[(r + \sqrt{r^2 - R^2})/R\right] \simeq \ln(2r/R)$ on each leg, gives

$$ \Delta t \simeq \frac{4 G_N M}{c^3}\left[ \ln\frac{4 r_E r_V}{R^2} + 1 \right] . \tag{20.28} $$

Note that if we wanted to take into account the gravitational redshift of Earth, this is an order $\mu_E/r_E$ correction to what we have already calculated and therefore negligible.

Experimentally, when Venus is on the opposite side of the Sun to the Earth, the numerical value of the time delay for a grazing passing of the Sun is about 220 µs, if you convert time back from metres to seconds. HEL goes into more detail about the experimental nuances in §10.3. One has to correct for the motion of Venus and Earth in their orbits, their individual gravitational fields, the variance of reflecting surfaces on Venus, and refraction by the Solar corona. After all the experimental dust settles, you get the data agreeing in a pretty way with the GR prediction.
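The size of the radar echo delay can be estimated directly from the second and third terms of eq. (20.27), evaluated once for the Earth leg and once for the Venus leg and then doubled for the round trip. The Python sketch below does this for a ray grazing the Sun. The orbital radii, solar radius and speed of light are assumed round values, and the names are hypothetical, so the output should be read as an order-of-magnitude check rather than a precise prediction.

```python
import math

MU_SUN = 1.48e3            # m, G_N M / c^2 for the Sun (assumed)
C = 2.998e8                # m/s, speed of light
R_SUN = 6.957e8            # m, assumed closest approach for a grazing ray
R_EARTH_ORBIT = 1.496e11   # m, assumed Earth-Sun distance
R_VENUS_ORBIT = 1.082e11   # m, assumed Venus-Sun distance

def excess_path(mu, r, R):
    """Second plus third term of eq. (20.27), in length units (c = 1)."""
    root = math.sqrt(r**2 - R**2)
    return 2.0 * mu * math.log((r + root) / R) + mu * math.sqrt((r - R) / (r + R))

def round_trip_delay(mu, r1, r2, R, c=C):
    """Excess round-trip travel time in seconds for a signal passing at radius R."""
    one_way = excess_path(mu, r1, R) + excess_path(mu, r2, R)
    return 2.0 * one_way / c

if __name__ == "__main__":
    dt = round_trip_delay(MU_SUN, R_EARTH_ORBIT, R_VENUS_ORBIT, R_SUN)
    print("excess round-trip delay:", dt * 1e6, "microseconds")
```

With these inputs the delay comes out at a couple of hundred microseconds, the same order as the figure quoted above; the exact number depends on the orbital radii and closest-approach distance assumed.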


21 Thu.21.Nov

21.1 Geodesic precession of gyroscopes

Precession of gyroscopes is another experimental test of General Relativity. Gyros are interesting because they spin on an axis, and this spin vector $s^\mu$ feels the effects of General Relativity through the physics of parallel transport. Let us see how this works.

The geodesic is a physically special curve because it parallel transports its own tangent vector,

$$ \frac{d}{d\lambda}u^\mu + \Gamma^\mu_{\nu\sigma}u^\nu u^\sigma = 0 . \tag{21.1} $$

Physically, the spin must be orthogonal to the tangent vector,

$$ g_{\mu\nu}s^\mu u^\nu = 0 . \tag{21.2} $$

In other words, the spin cannot have a timelike component in the instantaneous rest frame of the test object. If we want this zero inner product to be conserved at all points along the worldline of the gyro, we need to insist that the spin vector $s^\mu$ be parallel transported,

$$ \frac{d}{d\lambda}s^\mu + \Gamma^\mu_{\nu\sigma}s^\nu u^\sigma = 0 . \tag{21.3} $$

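To demonstrate the effect we are after, it is sufficient to use the approximation that Earth’s gravitational field (in which GPB flew) is described by the Schwarzschild metric. This will simplify our computations because there are fewer Christoffel symbols for Schwarzschild than for Kerr. Imagine that our test gyroscope is orbiting Earth in a circle, in the equatorial plane of our spherical polar coordinate system. Circular motion occurs at fixed $(r, \theta)$, so that $u^1(\lambda) = 0$ and $u^2(\lambda) = 0\ \forall\, \lambda$. Because $\theta = \pi/2$, $\Gamma^\theta_{\varphi\varphi}$ and $\Gamma^\varphi_{\theta\varphi}$ are zero and $\Gamma^r_{\varphi\varphi} = \Gamma^r_{\theta\theta}$. So our spin parallel transport equations in $(t, r, \theta, \varphi)$ coordinates become

$$ \frac{ds^t}{d\lambda} + \Gamma^t_{rt}\, s^r u^t = 0 , \tag{21.4} $$

$$ \frac{ds^r}{d\lambda} + \Gamma^r_{tt}\, s^t u^t + \Gamma^r_{\varphi\varphi}\, s^\varphi u^\varphi = 0 , \tag{21.5} $$

$$ \frac{ds^\theta}{d\lambda} = 0 , \tag{21.6} $$

$$ \frac{ds^\varphi}{d\lambda} + \Gamma^\varphi_{r\varphi}\, s^r u^\varphi = 0 , \tag{21.7} $$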

where

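$$ \Gamma^t_{rt} = \frac{\mu}{r^2}\left(1 - \frac{2\mu}{r}\right)^{-1} , \quad \Gamma^r_{tt} = \frac{\mu}{r^2}\left(1 - \frac{2\mu}{r}\right) , \quad \Gamma^r_{\varphi\varphi} = -r\left(1 - \frac{2\mu}{r}\right) , \quad \Gamma^\varphi_{r\varphi} = \frac{1}{r} . \tag{21.8} $$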

To proceed further, we need to know something about the normalization of the velocity vector. We can write it as $[u^\mu] = u^t[1, 0, 0, \Omega]^T$, where $\Omega$ is our angular velocity for circular motion. What is the angular velocity for our case? We actually mentioned the key ingredients already, in passing, when we discussed massless and massive particle geodesics in the Schwarzschild spacetime. In particular, we derived the shape equation for (quasi-)elliptical orbits. Circular orbits are a special case, and the shape equation can easily be rearranged to find $L$. We obtain

$$ L^2 = \frac{\mu R^2}{R - 3\mu} , \tag{21.9} $$

where $R$ is the radius of the circular orbit. Then using the norm condition on the velocity vector gives

$$ E = \frac{(1 - 2\mu/R)}{\sqrt{1 - 3\mu/R}} . \tag{21.10} $$

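We can also find the angular velocity, by using the geodesic equations to find $\varphi(t)$,

$$ \left(\frac{d\varphi}{dt}\right)^2 = \left( \frac{d\varphi}{d\lambda}\left(\frac{dt}{d\lambda}\right)^{-1} \right)^2 . \tag{21.11} $$

After the dust settles, this gives the very simple expression

$$ \Omega^2 = \frac{\mu}{r^3} . \tag{21.12} $$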

The norm of the 4-velocity must be unity, as appropriate to a massive particle (our gyroscope). This gives the equation

$$ u^0 = \left[ \left(1 - \frac{2\mu}{r}\right) - r^2\Omega^2 \right]^{-1/2} = \left(1 - \frac{3\mu}{r}\right)^{-1/2} . \tag{21.13} $$

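In this system, we have $u^r = 0 = u^\theta$, and so the condition that the spin vector be orthogonal to the velocity vector becomes

$$ \left(1 - \frac{2\mu}{r}\right)s^t u^t - r^2 s^\varphi u^\varphi = 0 . \tag{21.14} $$

Since $u^\varphi/u^t = d\varphi/dt = \Omega$, we can express $s^t$ in terms of $s^\varphi$,

$$ s^t = \frac{\Omega r^2}{(1 - 2\mu/r)}\, s^\varphi . \tag{21.15} $$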

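As you can check for yourself, this means that the first and fourth of the parallel transport equations are equivalent. Then the remaining equations are

$$ \frac{ds^r}{d\lambda} - \frac{r\Omega}{u^t}\, s^\varphi = 0 , \qquad \frac{ds^\theta}{d\lambda} = 0 , \qquad \frac{ds^\varphi}{d\lambda} + \frac{u^t\Omega}{r}\, s^r = 0 . \tag{21.16} $$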

We can convert the experimentally relatively unfamiliar affine parameter $\lambda$ to the coordinate time $t$ using $u^t = dt/d\lambda$. Using the third equation to eliminate $s^\varphi$ from the first gives for the set of three

$$ \frac{d^2s^r}{dt^2} + \frac{\Omega^2}{(u^t)^2}\, s^r = 0 , \qquad \frac{ds^\theta}{dt} = 0 , \qquad \frac{ds^\varphi}{dt} + \frac{\Omega}{r}\, s^r = 0 . \tag{21.17} $$

This has solution

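$$ s^r(t) = s^r(0)\cos\Omega' t , \qquad s^\theta(t) = 0 , \qquad s^\varphi(t) = -\frac{\Omega}{r\Omega'}\, s^r(0)\sin\Omega' t , \tag{21.18} $$

where

$$ \Omega' = \frac{\Omega}{u^t} = \Omega\sqrt{1 - 3\mu/r} . \tag{21.19} $$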

Therefore, the spatial part of the spin vector is rotating relative to the radial direction $r$ with a coordinate angular speed $-\Omega'$ in the direction $-\varphi$. But the radial direction itself is rotating with coordinate angular speed $+\Omega$. So it is the difference in speeds which gives rise to geodesic precession.

If you revolve once in a coordinate time $t = 2\pi/\Omega$, the final direction of the spatial spin vector is $2\pi + \alpha$, where $\alpha = 2\pi(1 - \Omega'/\Omega)$. Per revolution, then, the angular precession is

$$ \alpha = 2\pi\left[ 1 - \sqrt{1 - \frac{3\mu}{r}} \right] . \tag{21.20} $$

This effect is not very big, but it is cumulative. That means if you can machine almost-perfect gyros and leave them in orbit for a veeeeeeeery long time, then you have a chance of these effects adding up and being measurable.

From the GPB website: “Gravity Probe B, launched 20 April 2004, is a space experiment testing two fundamental predictions of Einstein’s theory of General Relativity (GR), the geodetic and frame-dragging effects, by means of cryogenic gyroscopes in Earth orbit. Data collection started 28 August 2004 and ended 14 August 2005. Analysis of the data from all four gyroscopes results in a geodetic drift rate of −6,601.8 ± 18.3 mas/yr and a frame-dragging drift rate of −37.2 ± 7.2 mas/yr, to be compared with the GR predictions of −6,606.1 mas/yr and −39.2 mas/yr, respectively (‘mas’ is milliarc-second; 1 mas = $4.848 \times 10^{-9}$ radians or $2.778 \times 10^{-7}$ degrees).”
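The magnitude of the geodetic rate quoted above can be roughly reproduced from eq. (21.20). The Python sketch below evaluates the precession per orbit for a circular orbit around a Schwarzschild Earth and multiplies by the number of orbits per year, using $\Omega^2 = \mu/r^3$ from eq. (21.12) for the period. The orbital altitude of about 642 km, Earth's radius and $G_N M_\oplus$ are assumed inputs not given in these notes, and the names are illustrative.

```python
import math

GM_EARTH = 3.986e14             # m^3/s^2, assumed G_N M for Earth
C = 2.998e8                     # m/s
R_ORBIT = 6.371e6 + 6.42e5      # m, assumed Earth radius plus ~642 km altitude
YEAR = 365.25 * 86400.0         # s

def geodetic_per_orbit(mu, r):
    """Precession angle per revolution in radians, eq. (21.20)."""
    return 2.0 * math.pi * (1.0 - math.sqrt(1.0 - 3.0 * mu / r))

def orbital_period(gm, r):
    """Coordinate period 2 pi / Omega with Omega^2 = G_N M / r^3, cf. eq. (21.12)."""
    return 2.0 * math.pi * math.sqrt(r**3 / gm)

def geodetic_mas_per_year(gm, r, c=C):
    """Accumulated geodetic precession in milliarcseconds per year."""
    mu = gm / c**2
    orbits_per_year = YEAR / orbital_period(gm, r)
    total_rad = geodetic_per_orbit(mu, r) * orbits_per_year
    return math.degrees(total_rad) * 3600.0 * 1000.0

if __name__ == "__main__":
    print("geodetic precession:", geodetic_mas_per_year(GM_EARTH, R_ORBIT), "mas/yr")
```

This simple estimate lands within about a percent of the magnitude of the quoted prediction; the sketch ignores Earth's oblateness and the details of the actual orbit.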


21.2 Accretion disks

Lastly, let us mention one more experimental test of GR: accretion disks around compact objects. They have matter swirling around the central black hole at millions of kelvin, and tend to emit strongly in the X-ray part of the spectrum. Even at such extreme temperatures, some atoms can retain electrons and then emit radiation as they jump between energy levels, and one such element is iron. Looking at the shape of the broadened iron emission line from the whole accretion disk actually gives a probe of the strong-field regime of GR, as we will now motivate.

There are two types of redshift that operate in this system: gravitational redshift, and Doppler shifting from relative velocity w.r.t. an observer here on Earth. Supposing that we view an accretion disk and black hole system side-on, we would see a range of Doppler shifting depending on which part of the disk we were looking at. This would even happen in the Newtonian approximation! The really key part is the gravitational redshift. The essential reason is that the smallest-possible frequency present in the observed spectrum must have been emitted at the smallest possible value of $r$, so that it could experience maximum redshift on the way out. Knowing the radius of the ISCO, we can then get a handle on the biggest frequency ratio possible.


The ratio of the photon frequency at reception compared to that at emission is given by

$$\frac{\nu_R}{\nu_E} = \frac{p_\mu(R)\,u^\mu_R}{p_\mu(E)\,u^\mu_E} . \tag{21.21}$$

Using what we derived in the previous experiment’s discussion concerning the angular velocity and the tangent vector norm condition, you can show with straightforward algebra that

$$\frac{\nu_R}{\nu_E} = \frac{p_0(R)}{p_0(E)\,u^0_E + p_3(E)\,u^3_E} = \left(1 - \frac{3\mu}{r}\right)^{1/2}\frac{p_0(R)}{p_0(E)}\left[1 \pm \frac{p_3(E)}{p_0(E)}\,\Omega\right]^{-1} , \tag{21.22}$$

where $+$ corresponds to emitting matter on the side of the disk moving towards the observer and $-$ corresponds to matter on the other side. Now, because Schwarzschild is a stationary metric, the covariant time component $p_0$ of the photon momentum is conserved along a geodesic, so that $p_0(R) = p_0(E)$.

Our last ingredient is to find the ratio $p_3(E)/p_0(E)$, and this is done using the null norm condition for the photon momentum. Working in the equatorial plane, we find

$$\left(1 - \frac{2\mu}{r}\right)^{-1}(p_0)^2 - \left(1 - \frac{2\mu}{r}\right)(p_1)^2 - \frac{1}{r^2}(p_3)^2 = 0 . \tag{21.23}$$

To get any further for a general angle between the accretion disk and us, we would need to recruit the full photon geodesic equations. But in two special cases we can actually do a slick avoidance manoeuvre and finesse this issue! When the matter is transverse to the observer (or in a face-on disk), $\phi = 0, \pi$. Then $p_3(E) = 0$, and so

$$\frac{\nu_R}{\nu_E} = \sqrt{1 - \frac{3\mu}{r}} . \tag{21.24}$$

When matter moves either directly towards or away from the observer, $\phi = \pm\pi/2$. Then the radial component of the photon momentum is zero, and so

$$\frac{\nu_R}{\nu_E} = \frac{\sqrt{1 - 3\mu/r}}{1 \pm 1/\sqrt{r/\mu - 2}} . \tag{21.25}$$

Evaluating these at the ISCO, $r = 6\mu$, you find that the smallest frequency represented in the iron emission line will be $\nu_R/\nu_E = 1/\sqrt{2} \simeq 0.71$ for face-on disks and $\sqrt{2}/3 \simeq 0.47$ for edge-on disks.
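To make the numbers concrete, the following short Python sketch evaluates (21.24) and (21.25) at the Schwarzschild ISCO. The value $r = 6\mu$ for the ISCO radius is the standard result and is assumed here; the function names are purely illustrative.

```python
import math


def ratio_transverse(r_over_mu):
    """nu_R / nu_E for transverse motion (face-on disk), eq. (21.24)."""
    return math.sqrt(1.0 - 3.0 / r_over_mu)


def ratio_line_of_sight(r_over_mu, sign):
    """nu_R / nu_E for line-of-sight motion, eq. (21.25).

    sign is the +/- appearing in the denominator of (21.25): +1 or -1.
    """
    return math.sqrt(1.0 - 3.0 / r_over_mu) / (1.0 + sign / math.sqrt(r_over_mu - 2.0))


R_ISCO = 6.0  # assumed ISCO radius in units of mu

face_on = ratio_transverse(R_ISCO)
edge_on_min = ratio_line_of_sight(R_ISCO, +1)
edge_on_max = ratio_line_of_sight(R_ISCO, -1)

print(f"transverse motion:     {face_on:.3f}")
print(f"line of sight, + sign: {edge_on_min:.3f}")
print(f"line of sight, - sign: {edge_on_max:.3f}")
```

The transverse case gives $1/\sqrt{2}$, while the line-of-sight case spans the range from $\sqrt{2}/3$ to $\sqrt{2}$, so the edge-on geometry is the one that reaches the lowest observed frequencies.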


22 Mon.25.Nov

The following gravitational waves material is based on §17 and §18 of HEL. All figures shown are from HEL.

22.1 Finding the wave equation for metric perturbations

For this section we will assume that the cosmological constant is zero and that spacetime is approximately flat. We will figure out the equations obeyed by small (perturbative) ripples in the fabric of spacetime about the Minkowski metric, which are known as gravitational waves. To begin, we assume that

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} , \tag{22.1}$$

where $|h_{\mu\nu}| \ll 1$. To first order in small quantities,

$$g^{\mu\nu} = \eta^{\mu\nu} - h^{\mu\nu} , \tag{22.2}$$

where we raise and lower indices to this order by using the Minkowski metric,

$$h^{\mu\nu} = \eta^{\mu\rho}\eta^{\nu\sigma}h_{\rho\sigma} . \tag{22.3}$$

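It is important to know how the perturbations are affected by changes of coordinates. Under global Lorentz transformations $x'^\mu = \Lambda^\mu{}_\nu x^\nu$, we know that

$$\begin{aligned} g'_{\mu\nu} &= \frac{\partial x^\rho}{\partial x'^\mu}\frac{\partial x^\sigma}{\partial x'^\nu}\, g_{\rho\sigma} \\ &= \Lambda_\mu{}^\rho\,\Lambda_\nu{}^\sigma\,(\eta_{\rho\sigma} + h_{\rho\sigma}) \\ &= \eta_{\mu\nu} + \Lambda_\mu{}^\rho\,\Lambda_\nu{}^\sigma\, h_{\rho\sigma} \end{aligned} \tag{22.4}$$

because the Minkowski metric is invariant under global Lorentz transformations. Therefore,

$$h'_{\mu\nu} = \Lambda_\mu{}^\rho\,\Lambda_\nu{}^\sigma\, h_{\rho\sigma} . \tag{22.5}$$

In other words, $h_{\mu\nu}$ transforms like a tensor under global Lorentz transformations.

We can also ask about how perturbations in spacetime are affected by a general coordinate transformation of the form

$$x'^\mu = x^\mu + \xi^\mu(x) . \tag{22.6}$$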

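Therefore,

$$\frac{\partial x'^\mu}{\partial x^\nu} = \delta^\mu_\nu + \partial_\nu\xi^\mu . \tag{22.7}$$

By eye, we can see using the above equation that to first order in small quantities the inverse transformation obeys

$$\frac{\partial x^\mu}{\partial x'^\nu} = \delta^\mu_\nu - \partial_\nu\xi^\mu . \tag{22.8}$$

Accordingly, under these general coordinate transformations, we have

$$\begin{aligned} g'_{\mu\nu} &= \left(\delta^\rho_\mu - \partial_\mu\xi^\rho\right)\left(\delta^\sigma_\nu - \partial_\nu\xi^\sigma\right)\left(\eta_{\rho\sigma} + h_{\rho\sigma}\right) \\ &= \eta_{\mu\nu} + \left(h_{\mu\nu} - \partial_\mu\xi_\nu - \partial_\nu\xi_\mu\right) \end{aligned} \tag{22.9}$$

where we defined $\xi_\mu = \eta_{\mu\nu}\xi^\nu$, and worked to first order in small quantities. Therefore, the transformation law of the perturbations under general coordinate transformations (22.6) is

$$h'_{\mu\nu} = h_{\mu\nu} - \partial_\mu\xi_\nu - \partial_\nu\xi_\mu . \tag{22.10}$$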

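What are the Christoffels to first order in perturbations? We did this type of approximation earlier when we first linearized a GR expression to recover its Newtonian limit. Here, we obtain

$$\Gamma^\sigma{}_{\mu\nu} = \frac{1}{2}\left(\partial_\nu h^\sigma{}_\mu + \partial_\mu h^\sigma{}_\nu - \partial^\sigma h_{\mu\nu}\right) . \tag{22.11}$$

From this, it follows that, to first order in perturbations,

$$R^\sigma{}_{\mu\nu\rho} = \frac{1}{2}\left(\partial_\nu\partial_\mu h^\sigma{}_\rho + \partial_\rho\partial^\sigma h_{\mu\nu} - \partial_\nu\partial^\sigma h_{\mu\rho} - \partial_\rho\partial_\mu h^\sigma{}_\nu\right) . \tag{22.12}$$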

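The really neat thing about this expression for the Riemann tensor is that it is invariant under general coordinate transformations (22.6). As you can check, this property also holds for the Ricci tensor and the Ricci scalar. For convenience, let us define

$$h = h^\sigma{}_\sigma \tag{22.13}$$

and write

$$\Box = \partial^\mu\partial_\mu . \tag{22.14}$$

Then we obtain

$$R_{\mu\nu} = \frac{1}{2}\left(\partial_\mu\partial_\nu h + \Box h_{\mu\nu} - \partial_\mu\partial_\rho h^\rho{}_\nu - \partial_\nu\partial_\rho h^\rho{}_\mu\right) , \tag{22.15}$$

and

$$R = \Box h - \partial_\mu\partial_\nu h^{\mu\nu} . \tag{22.16}$$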
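The claimed invariance of the linearized Riemann tensor (22.12) under the shift (22.10) can be checked quickly for plane-wave perturbations. The sketch below is an illustration under stated assumptions: it works in Fourier space, where each $\partial_\mu$ becomes a factor proportional to $k_\mu$, lowers all indices, and drops overall constants. Because (22.12) is linear in $h$, it suffices to show that a pure-gauge perturbation $k_\mu\varepsilon_\nu + k_\nu\varepsilon_\mu$ gives zero. The integer values of `k` and `eps` are arbitrary and hypothetical.

```python
from itertools import product


def riemann_fourier(h, k):
    """Bracket of (22.12) with all indices lowered and each derivative replaced by k.

    Overall constant factors are dropped, since only vanishing is tested.
    """
    R = {}
    for s, m, n, r in product(range(4), repeat=4):
        R[s, m, n, r] = (k[n] * k[m] * h[s][r] + k[r] * k[s] * h[m][n]
                         - k[n] * k[s] * h[m][r] - k[r] * k[m] * h[s][n])
    return R


def pure_gauge(eps, k):
    """Fourier amplitude of d_mu xi_nu + d_nu xi_mu, up to a common factor."""
    return [[k[m] * eps[n] + k[n] * eps[m] for n in range(4)] for m in range(4)]


k = [3, 1, -2, 5]     # arbitrary wave covector components (hypothetical)
eps = [2, -7, 4, 1]   # arbitrary gauge amplitude components (hypothetical)

shift = riemann_fourier(pure_gauge(eps, k), k)
gauge_part_vanishes = all(value == 0 for value in shift.values())

# A generic perturbation, for contrast, has non-zero linearized curvature.
generic = [[1, 0, 0, 0], [0, 2, 0, 0], [0, 0, 3, 0], [0, 0, 0, 4]]
generic_nonzero = any(value != 0 for value in riemann_fourier(generic, k).values())

print(gauge_part_vanishes, generic_nonzero)
```

Integer arithmetic is used so that the cancellation is exact rather than subject to rounding.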

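Plugging the above expressions into the Einstein equations yields a second-order PDE for the perturbations. In order to aid in wrangling all the pertinent algebra, it is convenient to define the trace reverse of $h_{\mu\nu}$,

$$\bar{h}_{\mu\nu} \equiv h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}h . \tag{22.17}$$

This obeys the property that

$$\bar{\bar{h}}_{\mu\nu} = h_{\mu\nu} . \tag{22.18}$$

We also have (again to first order in perturbations)

$$\bar{h} = \eta^{\mu\nu}\bar{h}_{\mu\nu} , \tag{22.19}$$

which obeys $\bar{h} = (1 - D/2)h$. In $D = 4$, $\bar{h} = -h$. In terms of $\bar{h}_{\mu\nu}$, the Einstein equations become

$$\Box\bar{h}_{\mu\nu} + \eta_{\mu\nu}\partial_\rho\partial_\sigma\bar{h}^{\rho\sigma} - \partial_\nu\partial_\rho\bar{h}^\rho{}_\mu - \partial_\mu\partial_\rho\bar{h}^\rho{}_\nu = -16\pi G_N T_{\mu\nu} . \tag{22.20}$$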

On the face of it, this equation does not look very much like a familiar wave equation involving the d’Alembertian!

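In order to figure out what our Einstein equations for the perturbations imply physically, it is crucial that we understand how $\bar{h}_{\mu\nu}$ transforms under general coordinate transformations (22.6). First, recall our equation (22.10), $h'_{\mu\nu} = h_{\mu\nu} - \partial_\mu\xi_\nu - \partial_\nu\xi_\mu$. From this, it follows directly that

$$\begin{aligned} h' &= \eta^{\mu\nu}\left(h_{\mu\nu} - \partial_\mu\xi_\nu - \partial_\nu\xi_\mu\right) \\ &= h - 2\partial_\mu\xi^\mu . \end{aligned} \tag{22.21}$$

Therefore,

$$\begin{aligned} \bar{h}'^{\mu\rho} &= h'^{\mu\rho} - \frac{1}{2}\eta^{\mu\rho}h' \\ &= \left(h^{\mu\rho} - \partial^\mu\xi^\rho - \partial^\rho\xi^\mu\right) - \frac{1}{2}\eta^{\mu\rho}\left(h - 2\partial_\sigma\xi^\sigma\right) \\ &= \bar{h}^{\mu\rho} - \partial^\mu\xi^\rho - \partial^\rho\xi^\mu + \eta^{\mu\rho}\partial_\sigma\xi^\sigma . \end{aligned} \tag{22.22}$$

Taking the partial derivative of this expression gives

$$\partial_\rho\bar{h}'^{\mu\rho} = \partial_\rho\bar{h}^{\mu\rho} - \Box\xi^\mu . \tag{22.23}$$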

So far, this description of algebra manipulations might seem a tad dry. But this is where the real money is to be made in careful observation. Suppose that we are smart enough to choose a coordinate system in which $\Box\xi^\mu = \partial_\rho\bar{h}^{\mu\rho}$. Then $\partial_\rho\bar{h}'^{\mu\rho} = 0$, which massively simplifies the Einstein equation. In particular, all the terms on the LHS which did not involve the d’Alembertian operator become equal to zero in this coordinate system. Wow!

To summarize, let us drop the primes for clarity, raise the indices with $\eta$, and write our Einstein equation in this awesome new coordinate system,

$$\Box\bar{h}^{\mu\nu} = -\frac{16\pi G_N}{c^4}\, T^{\mu\nu} . \tag{22.24}$$

For our metric perturbations to obey this simple wave equation, our coordinate system must obey

$$\partial_\mu\bar{h}^{\mu\nu} = 0 . \tag{22.25}$$

Any further coordinate change $x^\mu \to x^\mu + \xi^\mu$ within this gauge class would be OK, as long as it satisfied $\Box\xi^\mu = 0$. This is very reminiscent of the Lorenz gauge in electromagnetism, $\partial_\mu A^\mu = 0$, which still allows further gauge transformations of the form $A^\mu \to A^\mu + \partial^\mu\lambda$, where $\Box\lambda = 0$. Accordingly, this gauge for metric perturbations is sometimes rather loosely called the Lorenz gauge. More properly, it is called the de Donder gauge.

22.2 Solving the linearized Einstein equations

As always, if we are trying to solve a wave equation, it helps to start by finding the Green’s function,

$$\Box_x G(x^\sigma - y^\sigma) = \delta^{(4)}(x^\sigma - y^\sigma) . \tag{22.26}$$

As explained in detail in HEL §17.6, this is solved by the retarded Green’s function

$$G(x^\sigma) = \frac{\delta(x^0 - |\vec{x}|)\,\theta(x^0)}{4\pi|\vec{x}|} , \tag{22.27}$$

as you can check by substituting it in. Note that the retarded Green’s function (as compared to, say, the advanced Green’s function) is required by causality: we cannot expect a gravitational wave to be influenced by sources in its future light cone, only those in its past light cone. Using the retarded Green’s function, we can see immediately that the solution to the Einstein equation for the metric perturbation is

$$\bar{h}^{\mu\nu}(ct, \vec{x}) = -\frac{4G_N}{c^4}\int d^3\vec{y}\;\frac{T^{\mu\nu}(ct_r, \vec{y})}{|\vec{x} - \vec{y}|} \tag{22.28}$$

HEL Fig.17.2 is very helpful for visualizing the meaning of the retarded time variable $t_r$ in this equation, which is defined by

$$ct_r = ct - |\vec{x} - \vec{y}| . \tag{22.29}$$

Plodding through the details of how to check whether this satisfies the de Donder gauge condition requires careful attention to the retarded time story, using the chain rule for derivatives, and integration by parts. The net result is

$$\frac{\partial}{\partial x^\mu}\bar{h}^{\mu\nu} = -\frac{4G_N}{c^4}\int d^3\vec{y}\;\frac{1}{|\vec{x} - \vec{y}|}\,\frac{\partial}{\partial y^\mu}T^{\mu\nu}(y^0, \vec{y}) . \tag{22.30}$$

But since the energy-momentum tensor is conserved in the linearized theory,

$$\partial_\mu T^{\mu\nu} = 0 , \tag{22.31}$$

we have what we need:

$$\partial_\mu\bar{h}^{\mu\nu} = 0 . \tag{22.32}$$

A very important idea from electromagnetism was the multipole expansion. Here, it is the conserved energy-momentum that sources our gravitational wave, rather than the conserved current sourcing the EM wave, but the principle is analogous. In an asymptotically flat spacetime, higher partial waves fall off with higher powers of distance, so the lowest pertinent multipole moment for a compact source dominates the physics of wave propagation far from the source. As you learned in 3rd year EM class, in order to generate EM waves, a time-dependent dipole moment is needed. In order to generate gravitational waves, it turns out that we will need a time-dependent quadrupole moment.

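To start our way towards that result, let us Taylor expand the denominator in the integral for $\bar{h}^{\mu\nu}$, with $|\vec{x}| = r$ and small $\vec{y}$,

$$\begin{aligned} \frac{1}{|\vec{x} - \vec{y}|} &\simeq \frac{1}{r} + (-y^i)\,\partial_i\!\left(\frac{1}{r}\right) + \frac{1}{2!}(-y^i)(-y^j)\,\partial_i\partial_j\!\left(\frac{1}{r}\right) + \ldots \\ &= \frac{1}{r} + y^i\,\frac{x_i}{r^3} + \frac{1}{2}\,y^i y^j\left(\frac{3x_i x_j - \delta_{ij}r^2}{r^5}\right) + \ldots \end{aligned} \tag{22.33}$$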
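A quick numerical check of the expansion (22.33) is sketched below. It treats the spatial indices as Euclidean (so $x_i$ and $x^i$ are the same numbers) and keeps the $1/2!$ from the Taylor series in the quadrupole term. The sample points are arbitrary, hypothetical values with $|\vec{y}| \ll r$.

```python
import math


def exact(x, y):
    """1 / |x - y| evaluated directly."""
    return 1.0 / math.dist(x, y)


def expansion(x, y):
    """Monopole + dipole + quadrupole terms of (22.33), Euclidean indices."""
    r = math.hypot(*x)
    dipole = sum(yi * xi for yi, xi in zip(y, x)) / r**3
    quadrupole = 0.0
    for i in range(3):
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            quadrupole += 0.5 * y[i] * y[j] * (3.0 * x[i] * x[j] - delta * r**2) / r**5
    return 1.0 / r + dipole + quadrupole


x = (3.0, 4.0, 12.0)     # field point, r = 13 (hypothetical)
y = (0.1, -0.2, 0.05)    # source point, |y| << r (hypothetical)

error = abs(exact(x, y) - expansion(x, y))
print(f"exact = {exact(x, y):.10f}, truncated series = {expansion(x, y):.10f}")
```

The leftover error is of order $|\vec{y}|^3/r^4$, as expected for a series truncated after the quadrupole term.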

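Motivated by this, we define the multipoles

$$M^{\mu\nu\sigma_1\sigma_2\ldots\sigma_\ell}(ct_r) = \int d^3\vec{y}\; T^{\mu\nu}(ct_r, \vec{y})\, y^{\sigma_1}y^{\sigma_2}\ldots y^{\sigma_\ell} , \tag{22.34}$$

and obtain

$$\bar{h}^{\mu\nu}(ct, \vec{x}) = -\frac{4G_N}{c^4}\sum_{\ell=0}^{\infty}\frac{(-1)^\ell}{\ell!}\, M^{\mu\nu\sigma_1\sigma_2\ldots\sigma_\ell}(ct_r)\,\partial_{\sigma_1}\partial_{\sigma_2}\ldots\partial_{\sigma_\ell}\!\left(\frac{1}{r}\right) \tag{22.35}$$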

For the case of a compact source, we can use these general expressions to find approximations for our linearized metric perturbations. First we need to consider what the components of $T^{\mu\nu}$ tell us physically. $T^{00}$ is the energy density of the source particles, and if this is integrated over all space then it gives $Mc^2$, the conserved energy. $T^{0i}$ is the momentum density of source particles, and if this is integrated over all space it gives $P^i c$, which is also conserved at this order in perturbations. The $T^{ij}$ are the internal stresses, and they are not necessarily zero when integrated over all space. Without loss of generality, we may take our spatial coordinates $x^i$ to be in the centre of momentum (CoM) frame of the particles, so that $P^i = 0$. Then in CoM coordinates,

$$\bar{h}^{00} = -\frac{4G_N M}{c^2 r} , \qquad \bar{h}^{0i} = \bar{h}^{i0} = 0 . \tag{22.36}$$

The remaining parts are

$$\bar{h}^{ij}(ct, \vec{x}) = -\frac{4G_N}{c^4 r}\int d^3\vec{y}\,\left[T^{ij}(ct', \vec{y})\right]\Big|_{ct' = ct - r} \tag{22.37}$$

It is not especially easy to compute this integral directly. HEL explain carefully in §17.8 that a slightly indirect yet algebraically shorter route can be found by recruiting energy-momentum conservation $\partial_\mu T^{\nu\mu} = 0$. In a 3+1 split, we have

$$\begin{aligned} 0 &= \partial_0 T^{00} + \partial_k T^{0k} , \\ 0 &= \partial_0 T^{i0} + \partial_k T^{ik} . \end{aligned} \tag{22.38}$$

These two equations can be used to turn our integral over $T^{ij}$ into integrals over higher moments of $T^{0i}$ and $T^{00}$. The first trick is to integrate $\partial_k(T^{ik}y^j)$ over a volume completely enclosing the source and use Gauss’s theorem, so that the surface term vanishes. The second conservation equation in (22.38) then yields

$$\int d^3\vec{y}\; T^{ij} = \frac{1}{2c}\,\frac{d}{dt_r}\int d^3\vec{y}\,\left(T^{i0}y^j + T^{j0}y^i\right) . \tag{22.39}$$

The second trick is to integrate $\partial_k(T^{0k}y^i y^j)$ over the same enclosing volume; together with the first conservation equation in (22.38), it yields

$$\int d^3\vec{y}\; T^{ij} = \frac{1}{2c^2}\,\frac{d^2}{dt_r^2}\int d^3\vec{y}\; T^{00}y^i y^j . \tag{22.40}$$

Defining the quadrupole moment $I^{ij}$ by

$$I^{ij}(ct) = \int d^3\vec{y}\; T^{00}(ct, \vec{y})\, y^i y^j \tag{22.41}$$

gives the solution

$$\bar{h}^{ij}(ct, \vec{x}) = -\frac{2G_N}{c^6 r}\left[\frac{d^2 I^{ij}(ct')}{dt'^2}\right]\Bigg|_{t' = t_r} . \tag{22.42}$$

This is known as the quadrupole formula.

23 Thu.28.Nov

As a warm-up example of solving for gravitational perturbations, we can consider a stationary non-relativistic source which is a perfect fluid. In this case the energy-momentum tensor is constant in time, and the distinction between time and retarded time is irrelevant. Then we have directly that

$$\bar{h}^{\mu\nu}(\vec{x}) = -\frac{4G_N}{c^4}\int d^3\vec{y}\;\frac{T^{\mu\nu}(\vec{y})}{|\vec{x} - \vec{y}|} . \tag{23.1}$$

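When our perfect fluid is non-relativistic, all speeds are much smaller than $c$ and to lowest order in perturbation theory we can neglect the pressure. This gives

$$T^{00} = \rho c^2 , \qquad T^{0i} = \rho c\, u^i , \qquad T^{ij} = \rho\, u^i u^j , \tag{23.2}$$

where $\rho$ is the proper density distribution of the source. The solution can be written as

$$\bar{h}^{00} = \frac{4\Phi}{c^2} , \qquad \bar{h}^{0i} = \frac{A^i}{c} , \qquad \bar{h}^{ij} = 0 , \tag{23.3}$$

where

$$\begin{aligned} \Phi(\vec{x}) &= -G_N\int d^3\vec{y}\;\frac{\rho(\vec{y})}{|\vec{x} - \vec{y}|} , \\ A^i(\vec{x}) &= -\frac{4G_N}{c^2}\int d^3\vec{y}\;\frac{\rho(\vec{y})\,u^i(\vec{y})}{|\vec{x} - \vec{y}|} . \end{aligned} \tag{23.4}$$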

We can easily obtain $h^{\mu\nu}$ as functions of $\bar{h}^{\mu\nu}$ using our earlier formula connecting them. The result is

$$h^{00} = h^{11} = h^{22} = h^{33} = \frac{2\Phi}{c^2} , \qquad h^{0i} = \frac{A^i}{c} . \tag{23.5}$$
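The step from (23.3) to (23.5) is just the trace reversal (22.17) applied with upper indices, which can be spelled out numerically. In this minimal sketch the sample values of $\Phi/c^2$ and $A^i/c$ are made-up illustrative numbers, and the signature is taken to be $(+,-,-,-)$.

```python
import math

# Minkowski metric with signature (+,-,-,-); numerically the same with indices up or down.
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def trace_reverse(t):
    """Return t^{mu nu} - (1/2) eta^{mu nu} (eta_{rho sigma} t^{rho sigma})."""
    trace = sum(ETA[m][n] * t[m][n] for m in range(4) for n in range(4))
    return [[t[m][n] - 0.5 * ETA[m][n] * trace for n in range(4)] for m in range(4)]


PHI_OVER_C2 = -7.0e-10               # assumed sample value of Phi / c^2
A_OVER_C = [1e-15, -2e-15, 3e-15]    # assumed sample values of A^i / c

# Build hbar^{mu nu} from (23.3).
hbar = [[0.0] * 4 for _ in range(4)]
hbar[0][0] = 4.0 * PHI_OVER_C2
for i in range(3):
    hbar[0][i + 1] = hbar[i + 1][0] = A_OVER_C[i]

# Trace reversing hbar recovers h, by property (22.18).
h = trace_reverse(hbar)
print("diagonal of h in units of Phi/c^2:", [h[m][m] / PHI_OVER_C2 for m in range(4)])
```

All four diagonal entries come out equal to $2\Phi/c^2$ and the $0i$ entries are unchanged, in agreement with (23.5).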

This provides the derivation that we promised quite a long time ago of the lowest-order Newtonian approximation to the spacetime metric,

$$ds^2 = \left(1 + \frac{2\Phi}{c^2}\right)(c\,dt)^2 + 2\,\frac{A_i}{c}\,(c\,dt)\,dx^i - \left(1 - \frac{2\Phi}{c^2}\right)\delta_{ij}\,dx^i dx^j , \tag{23.6}$$

with the bonus that we now allow for slow rotation of the source. An example of a stationary non-relativistic source would be a rigidly rotating sphere.

23.1 Gravitational plane waves

Another simple example of solving for gravitational perturbations is the case of gravitational plane waves. These take the form

$$\bar{h}^{\mu\nu} = \frac{1}{2}A^{\mu\nu}\exp(ik_\rho x^\rho) + \text{c.c.} \tag{23.7}$$

The de Donder gauge condition requires that the polarization tensor $A^{\mu\nu}$ obeys

$$k_\mu A^{\mu\nu} = 0 , \tag{23.8}$$

i.e., it is transverse to the direction of propagation of the wave.

Let us count polarizations. We started off with ten components of our symmetric tensor metric perturbations. Fixing de Donder gauge reduces that to six independent components. We can further fix the gauge by doing a coordinate transformation $x^\mu \to x^\mu + \xi^\mu$, as long as we stay within de Donder gauge, which further reduces the number of independent components down to two. Let us see how this works, in more detail. Consider a $\xi^\mu$ of the form

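$$\xi^\mu = \varepsilon^\mu\exp(ik_\nu x^\nu) . \tag{23.9}$$

This clearly obeys $\Box\xi^\mu = 0$ if $\varepsilon^\mu = \text{const}$ (given that $k^\mu$ is null, as it must be for a vacuum plane wave). Under this transformation, we know how $\bar{h}^{\mu\nu}$ transforms, which tells us that the polarization tensor must also transform as

$$A'^{\mu\nu} = A^{\mu\nu} - i\varepsilon^\mu k^\nu - i\varepsilon^\nu k^\mu + i\eta^{\mu\nu}\varepsilon^\rho k_\rho . \tag{23.10}$$

Let our wavevector lie along the $z$-direction: $\vec{k} = k\hat{z}$. Then by our de Donder gauge condition, $A^{\mu 3} = A^{\mu 0}\ \forall\mu$. Using this and the above two equations, we can straightforwardly show that the components of $\varepsilon^\mu$ can always be chosen to ensure that the only nonzero components of the polarization tensor are

$$[A^{\mu\nu}_{\rm TT}] = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & a & b & 0 \\ 0 & b & -a & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} . \tag{23.11}$$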

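This cleverly chosen gauge is known as the transverse traceless (TT) gauge. If we wish, we can write the polarization tensor as

$$A^{\mu\nu}_{\rm TT} = a\,e_+^{\mu\nu} + b\,e_\times^{\mu\nu} . \tag{23.12}$$

More generally, we define the TT gauge via

$$\bar{h}^{0i}_{\rm TT} \equiv 0 , \qquad \bar{h}_{\rm TT} \equiv 0 . \tag{23.13}$$

Using this and the de Donder gauge condition $\partial_\mu\bar{h}^{\mu\nu}_{\rm TT} = 0$, we have that

$$\partial_0\bar{h}^{00}_{\rm TT} = 0 , \qquad \partial_i\bar{h}^{ij}_{\rm TT} = 0 . \tag{23.14}$$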

What effect does such a gravitational plane wave have on a bunch of free particles? We can work this out by using the geodesic equation for their motion,

$$\frac{du^\sigma}{d\tau} + \Gamma^\sigma{}_{\mu\nu}u^\mu u^\nu = 0 . \tag{23.15}$$

Suppose a particle is initially at rest before the wave comes by. Then $[u^\mu] = c\,[1, \vec{0}\,]^T$, and so

$$\begin{aligned} \frac{du^\sigma}{d\tau} &= -c^2\,\Gamma^\sigma{}_{00} \\ &= -\frac{c^2}{2}\,\eta^{\sigma\rho}\left(\partial_0 h_{\rho 0} + \partial_0 h_{0\rho} - \partial_\rho h_{00}\right) \\ &= 0 \end{aligned} \tag{23.16}$$

because we are working in TT gauge. So hey: our coordinate system is adapted to individual particles! But even though the coordinate separation of particles is constant, their physical separation is not, because $\bar{h}^{\mu\nu} \neq 0$. Let us parametrize the coordinate spatial separation between two nearby particles as $S^i$. Then the physical spatial separation is

$$\ell^2 \equiv -g_{ij}S^i S^j = (\delta_{ij} - h_{ij})\,S^i S^j . \tag{23.17}$$

To first order in perturbations, we can define a new physical separation vector $\zeta^i$ by

$$\ell^2 = \delta_{ij}\,\zeta^i\zeta^j , \tag{23.18}$$

or

$$\zeta^i = S^i + \frac{1}{2}h^i{}_k S^k . \tag{23.19}$$

To see the effect of our gravitational plane wave in the $z$ direction, let us inspect two particles in the $(x, y)$ plane. Then $S^3 = 0$. Also, because $h^{k3}_{\rm TT} = 0\ \forall k$, there is no change in their $z$-separation due to the plane wave. Their moving around is going to happen in the $(x, y)$ plane only. Another advantage of TT gauge is that $\bar{h} = 0$, which also implies that $h = 0$. Picking the $e_+^{\mu\nu}$ polarization tensor for definiteness, we find easily that

$$h^{\mu\nu}_{\rm TT} = a\,e_+^{\mu\nu}\cos(k_\rho x^\rho) = a\,e_+^{\mu\nu}\cos[k(x^0 - x^3)] , \tag{23.20}$$

where $k = |\vec{k}| = \omega/c$. So

$$(\zeta^i) = (S^1, S^2, 0)^T - \frac{a}{2}\cos[k(x^0 - x^3)]\,(S^1, -S^2, 0)^T . \tag{23.21}$$

This is illustrated nicely in Fig.18.1 of HEL.

For the other case of the crossed polarization $e_\times^{\mu\nu}$, Fig.18.2 of HEL shows how to visualize its effect.

Either way, you can think of gravitational waves as stretchy-squeezy waves.
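To see the stretching and squeezing in numbers, here is a small Python sketch of (23.21) for the $e_+$ polarization, applied to particles whose coordinate positions sit on a unit circle in the $(x, y)$ plane. The amplitude `A_DEMO` is hugely exaggerated for visibility and, like the function names, is an assumption made purely for illustration.

```python
import math


def physical_separation(s1, s2, a, phase):
    """(zeta^1, zeta^2) from eq. (23.21); phase stands for k (x^0 - x^3)."""
    stretch = 0.5 * a * math.cos(phase)
    return (s1 * (1.0 - stretch), s2 * (1.0 + stretch))


def ring(a, phase, n=8):
    """Physical positions of n particles whose coordinate positions form a unit circle."""
    points = []
    for m in range(n):
        angle = 2.0 * math.pi * m / n
        points.append(physical_separation(math.cos(angle), math.sin(angle), a, phase))
    return points


A_DEMO = 0.2  # exaggerated amplitude (assumption); real strains are many orders of magnitude smaller

for label, phase in [("phase 0", 0.0), ("phase pi/2", math.pi / 2), ("phase pi", math.pi)]:
    points = ring(A_DEMO, phase, n=4)
    x_extent = points[0][0]   # particle that started on the x axis
    y_extent = points[1][1]   # particle that started on the y axis
    print(f"{label}: x extent {x_extent:.2f}, y extent {y_extent:.2f}")
```

The ring is squeezed along $x$ and stretched along $y$ at phase 0, undistorted a quarter period later, and stretched along $x$ and squeezed along $y$ after half a period.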


23.2 Energy loss from gravitational radiation

There is no local notion of gravitational energy density in GR, because we could always change it via a coordinate transformation. Also, in generic spacetimes in GR, neither energy nor momentum is conserved. But we can still motivate an expression for the energy-momentum tensor of the gravitational field itself in the perturbative approximation, in order to allow us to derive the famous formula for the power radiated by gravitational waves.

Our perturbative approach to spacetime metric perturbations started from the full equations,

$$G_{\mu\nu} = -\frac{8\pi G_N}{c^4}\,T_{\mu\nu} . \tag{23.22}$$

Now imagine that we go one step beyond linear order, keeping up to second-order terms in small quantities. Then we have

$$G^{(1)}_{\mu\nu} + G^{(2)}_{\mu\nu} + \ldots = -\frac{8\pi G_N}{c^4}\,T_{\mu\nu} . \tag{23.23}$$

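We could try moving the second-order approximation to the Einstein tensor over to the RHS and calling it $t^{\rm grav}_{\mu\nu}$. The problem with this idea is that unfortunately this expression is not gauge invariant. HEL explain in detail how to fix this by averaging over a small region about any given point and writing

$$t^{\rm grav}_{\mu\nu} \equiv \frac{c^4}{8\pi G_N}\,\langle G^{(2)}_{\mu\nu}\rangle . \tag{23.24}$$

After a good deal of fairly unilluminating algebra, the resulting expression becomes

$$\begin{aligned} t^{\rm grav}_{\mu\nu} = {}& \frac{c^4}{32\pi G_N}\left\langle(\partial_\mu\bar{h}_{\rho\sigma})\,\partial_\nu\bar{h}^{\rho\sigma} - 2(\partial_\sigma\bar{h}^{\rho\sigma})\,\partial_{(\mu}\bar{h}_{\nu)\rho} - \frac{1}{2}(\partial_\mu\bar{h})\,\partial_\nu\bar{h}\right\rangle \\ & - \left\langle\bar{h}_{\rho(\mu}T^\rho{}_{\nu)} + \frac{1}{4}\eta_{\mu\nu}\bar{h}^{\rho\sigma}T_{\rho\sigma}\right\rangle . \end{aligned} \tag{23.25}$$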

The key property of this thing is that it is invariant under gauge transformations, as required. We consider gravitational plane waves in vacuo so that only the top line will appear for us.

Now, in TT de Donder gauge, we have $\partial_\mu\bar{h}^{\mu\nu}_{\rm TT} = 0$, $\bar{h}_{\rm TT} = 0$, and $\bar{h}^{\mu\nu}_{\rm TT} = h^{\mu\nu}_{\rm TT}$. So then in vacuo, we have only the first term in the complicated expression above turned on. In our TT gauge, considering only the radiative part of the gravitational field shows that $\bar{h}^{0i}_{\rm TT} = 0$, so that in fact only the spatial components of the perturbations actually appear,

$$t^{\rm grav}_{\mu\nu} = \frac{c^4}{32\pi G_N}\,\langle(\partial_\mu h^{\rm TT}_{ij})(\partial_\nu h^{ij}_{\rm TT})\rangle \tag{23.26}$$

Physically, the energy flux (energy/area/time) in the $n^i$ spatial direction is

$$F(\vec{n}) = -c\,t^{0k}n_k = +c\,\delta_{kj}\,t^{0k}n^j , \tag{23.27}$$

in our signature convention, because in general an energy-momentum tensor $t^{\mu\nu}$ encodes the flux of $\mu$-momentum in the $\nu$-direction.

Let us consider a compact source and aim for the far-field result, choosing $\vec{n}$ to be pointing in the radial direction away from the source. Then we have

$$F(r) = -\frac{c^4}{32\pi G_N}\,\langle(\partial_t h^{\rm TT}_{ij})\,\partial_r h^{ij}_{\rm TT}\rangle . \tag{23.28}$$

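But from our quadrupole formula (22.42) from earlier, we know that

$$\bar{h}^{ij} = -\frac{2G_N}{c^6 r}\left[\ddot{I}^{ij}\right]_r \tag{23.29}$$

where $\dot{}\equiv d/dt$ and the subscript $r$ on the bracket means using retarded time. We need an expression for the TT part of the quadrupole moment, so we define

$$J_{ij} \equiv I_{ij} - \frac{1}{3}\delta_{ij}I , \tag{23.30}$$

where $I = I^i{}_i$. Then

$$\bar{h}^{ij}_{\rm TT} = h^{ij}_{\rm TT} = -\frac{2G_N}{c^6 r}\left[\ddot{J}^{ij}_{\rm TT}\right]_r \tag{23.31}$$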

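Now, in order to finish this line of reasoning, we need to slow down a little and be careful about how to take $t$ and $r$ derivatives at retarded time. Our definition of retarded time was

$$x^0_r \equiv ct_r = x^0 - |\vec{x} - \vec{y}| , \tag{23.32}$$

and so for any function $f(x^0_r, \vec{y})$, we have

$$\begin{aligned} \frac{\partial f(x^0_r, \vec{y})}{\partial x^\mu} &= \left[\frac{\partial f(y^0, \vec{y})}{\partial y^0}\right]_r\frac{\partial x^0_r}{\partial x^\mu} , \\ \frac{\partial f(x^0_r, \vec{y})}{\partial y^i} &= \left[\frac{\partial f(y^0, \vec{y})}{\partial y^i}\right]_r + \left[\frac{\partial f(y^0, \vec{y})}{\partial y^0}\right]_r\frac{\partial x^0_r}{\partial y^i} , \end{aligned} \tag{23.33}$$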

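where the subscript $r$ means to evaluate at $y^0 = x^0_r$. We therefore have that

$$\partial_t h^{\rm TT}_{ij} = -\frac{2G_N}{c^6 r}\left[\dddot{J}^{\rm TT}_{ij}\right]_r \tag{23.34}$$

We can also evaluate

$$\partial_r h^{\rm TT}_{ij} = \frac{2G_N}{c^6 r^2}\left[\ddot{J}^{\rm TT}_{ij}\right]_r + \frac{2G_N}{c^7 r}\left[\dddot{J}^{\rm TT}_{ij}\right]_r . \tag{23.35}$$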

The second term here dominates over the first at large $r$, and so our expression for the radiation flux from the gravitational wave source is

$$F(r) = \frac{G_N}{8\pi r^2 c^9}\,\langle\dddot{J}^{\rm TT}_{ij}\,\dddot{J}^{ij}_{\rm TT}\rangle . \tag{23.36}$$

Our last task is to express this in terms of the original quadrupole. For that we need a handy projection tensor,

$$P_{ij} \equiv \delta_{ij} - n_i n_j . \tag{23.37}$$

Applying this to an arbitrary spatial vector allows one to see that it obeys the properties we expect of a projector. Then the transverse part of the polarization tensor for the gravitational wave is $A^{ij}_{\rm T} = P^i{}_k P^j{}_\ell A^{k\ell}$. To ensure that there is no trace part, we need to form $A^{ij}_{\rm TT} = \left(P^i{}_k P^j{}_\ell - \frac{1}{2}P^{ij}P_{k\ell}\right)A^{k\ell}$. By direct analogy, we find for the quadrupole

$$J^{ij}_{\rm TT} = \left(P^i{}_k P^j{}_\ell - \frac{1}{2}P^{ij}P_{k\ell}\right)J^{k\ell} . \tag{23.38}$$

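Denoting the components of the unit radial vector by $\hat{x}_i$, this gives

$$J^{\rm TT}_{ij}J^{ij}_{\rm TT} = J_{ij}J^{ij} - 2J_i{}^j J^{ik}\hat{x}_j\hat{x}_k + \frac{1}{2}J^{ij}J^{k\ell}\hat{x}_i\hat{x}_j\hat{x}_k\hat{x}_\ell . \tag{23.39}$$

To get the integrated luminosity, we integrate this over $4\pi$ of solid angle. After the boring dust settles, we have (at last!) the famous formula we wanted,

$$\frac{dE}{dt} = -L_{\rm GW} = -\frac{G_N}{5c^9}\,\langle[\dddot{J}^{ij}\,\dddot{J}_{ij}]_r\rangle . \tag{23.40}$$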
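The "boring dust" in the last step is the solid-angle integral of (23.39). The Python sketch below checks numerically that, for a symmetric traceless $J$, this integral equals $(8\pi/5)\,J_{ij}J_{ij}$, which is what turns the prefactor $G_N/(8\pi c^9)$ of (23.36) into $G_N/(5c^9)$. It treats all spatial indices as Euclidean, uses a simple midpoint rule, and the entries of `J` are arbitrary illustrative numbers; all of these are assumptions of the sketch.

```python
import math


def tt_contraction(J, n):
    """Right-hand side of (23.39) for a unit vector n, with Euclidean indices."""
    JJ = sum(J[i][j] * J[i][j] for i in range(3) for j in range(3))
    Jn = [sum(J[i][j] * n[j] for j in range(3)) for i in range(3)]
    JnJn = sum(v * v for v in Jn)                 # J_ij J_ik n_j n_k (J symmetric)
    nJn = sum(n[i] * Jn[i] for i in range(3))     # J_ij n_i n_j
    return JJ - 2.0 * JnJn + 0.5 * nJn**2


def sphere_integral(f, n_u=400, n_phi=16):
    """Midpoint rule over u = cos(theta) in [-1, 1] and phi in [0, 2 pi)."""
    du, dphi = 2.0 / n_u, 2.0 * math.pi / n_phi
    total = 0.0
    for a in range(n_u):
        u = -1.0 + (a + 0.5) * du
        s = math.sqrt(1.0 - u * u)
        for b in range(n_phi):
            phi = (b + 0.5) * dphi
            total += f((s * math.cos(phi), s * math.sin(phi), u))
    return total * du * dphi


# Arbitrary symmetric, traceless sample tensor (hypothetical values).
J = [[1.0, 0.3, -0.2],
     [0.3, -0.4, 0.5],
     [-0.2, 0.5, -0.6]]

JJ = sum(J[i][j] * J[i][j] for i in range(3) for j in range(3))
ratio = sphere_integral(lambda n: tt_contraction(J, n)) / JJ
print(f"angular integral / (J_ij J_ij) = {ratio:.5f}, 8 pi / 5 = {8.0 * math.pi / 5.0:.5f}")
```

Multiplying the flux (23.36) by $r^2$ and by this angular factor then reproduces the coefficient in (23.40).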

This shows that you not only need a quadrupole (not a monopole or a dipole) to produce gravitational radiation, you also need the third time derivative of it turned on. Again, the reason why we use retarded times in this expression is to ensure the correct boundary conditions for our Green’s function reflecting causality.

Gravitational radiation was discovered indirectly via the famous observations by Russell Hulse and Joseph Taylor of a binary pulsar, first detected in 1974, which won them the 1993 Nobel Prize in Physics. The orbital period of the binary decreased over time, at a rate precisely predicted by GR for energy loss to gravitational radiation. What was a far more impressive technological feat was the building of LIGO, the Laser Interferometer Gravitational-Wave Observatory. It won the 2017 Nobel Prize in Physics for Rainer Weiss, Barry Barish, and Kip Thorne for the direct discovery of gravitational waves: tiny ripples in the very fabric of spacetime, long thought technologically impossible to detect. Here are a few URLs for checking out their discoveries:-

• https://www.youtube.com/watch?v=FXlg3cr-q44

• https://www.ligo.caltech.edu/news/ligo20160211

• https://www.ligo.caltech.edu/page/four-new-detections-o1-o2-catalog
