PHY483F/1483F Relativity Theory I

(2017-18)

Department of Physics

University of Toronto

Instructor: Prof. A.W. Peet

Sources:-

• M.P. Hobson, G.P. Efstathiou, and A.N. Lasenby, “General relativity: an introduction for physicists” (Cambridge University Press, 2005) [recommended textbook];

• Sean Carroll, “Spacetime and geometry: an introduction to general relativity” (Addison-Wesley, 2004);

• Ray d’Inverno, “Introducing Einstein’s relativity” (Oxford University Press, 1992);

• Jim Hartle, “Gravity: an introduction to Einstein’s general relativity” (Pearson, 2003);

• Bob Wald, “General relativity” (University of Chicago Press, 1984);

• Tomás Ortín, “Gravity and strings” (Cambridge University Press, 2004);

• Noel Doughty, “Lagrangian Interaction” (Westview Press, 1990);

• my personal notes over three decades.

Version: Thursday 26th April, 2018 @ 18:14

Contents

1 Special Relativity and tensors
    1.1 Invitation to GR
    1.2 Galilean relativity and 3-vectors in Euclidean space
    1.3 Special relativity and 4-vectors in Minkowski spacetime
    1.4 Relativistic particle
    1.5 Partial derivatives, and electromagnetism
    1.6 Constant relativistic acceleration

2 Curved spacetime and tensors
    2.1 The Equivalence Principle
    2.2 Spacetime as a curved manifold
    2.3 Tensors in curved spacetime
    2.4 Rules for tensor index gymnastics

3 The covariant derivative
    3.1 The Christoffel symbols
    3.2 The covariant derivative and parallel transport
    3.3 The geodesic equations
    3.4 Example computation for Christoffels and geodesic equations

4 Spacetime curvature
    4.1 Curvature and the Riemann tensor
    4.2 Example computation for Riemann
    4.3 Riemann normal coordinates and the Bianchi identity
    4.4 The information in Riemann
    4.5 Geodesic deviation
    4.6 Tidal forces

5 The power of symmetry, and Einstein’s equations
    5.1 Lie derivatives
    5.2 Killing tensors
    5.3 Maximally symmetric spacetimes
    5.4 Einstein’s equations

6 Black holes
    6.1 Birkhoff’s theorem and the Schwarzschild solution
    6.2 TOV equation for a star
    6.3 Geodesics of Schwarzschild
    6.4 Causal structure of Schwarzschild
    6.5 Kerr black holes
    6.6 The Penrose process

7 Classic experimental tests of GR
    7.1 Gravitational redshift
    7.2 Planetary perihelion precession
    7.3 Bending of light
    7.4 Radar echoes
    7.5 Geodesic precession of gyroscopes
    7.6 Accretion disks

8 Gravitational waves
    8.1 Finding the wave equation for metric perturbations
    8.2 Solving the linearized Einstein equations
    8.3 Energy loss from gravitational radiation

9 Cosmology
    9.1 FRW metrics
    9.2 Solving Einstein’s equations
    9.3 Energy-momentum tensors

10 Deriving Einstein’s equations
    10.1 Covariant integration over spacetime and classical fields
    10.2 Relativistic scalar gravity
    10.3 Building tensor gravity with the right Newtonian limit
    10.4 Deriving Einstein’s equations from an action principle

11 Appendix: advanced topics*
    11.1 Spherical polars in D dimensions and electric fields*
    11.2 Noncoordinate bases and the spin connection*
    11.3 Differential forms*
    11.4 Cartan structure equations*
    11.5 Timelike geodesic congruences and the Raychaudhuri equation*
    11.6 Null geodesic congruences*
    11.7 Conformal transformations and Carter-Penrose diagrams*
    11.8 Reissner-Nordström black holes*
    11.9 Black hole thermodynamics*

1 Special Relativity and tensors

1.1 Invitation to GR

The gravitational force is the weakest of the four known forces. So why does it dominate the dynamics of the universe? A simple first answer is that there is a lot of matter (stuff) in the universe. Even though the gravitational attraction between any two atoms is weak, if you get enough of them together you can eventually make a black hole! A more sophisticated answer focuses on the range of the gravitational force and what sources it. The only two long-range forces we know of in Nature are gravity and electromagnetism. By contrast, the strong nuclear force binding atomic nuclei and the weak nuclear force responsible for the fusion reaction powering our Sun are very short-range. EM fields are sourced by charges and currents, but the universe is electrically neutral on average, so EM does not dominate its evolution. Gravity, on the other hand, is sourced by energy-momentum. Since every particle has energy-momentum, even the graviton, you can never get away from gravity.

Newton wowed the world 330 years ago with his Law of Universal Gravitation, which explained both celestial and terrestrial observations. Our focus in this course is on explaining Einstein’s famous General Theory of Relativity (GR), which is a generalization of Newtonian gravity and Special Relativity proven useful for describing the dynamics of the cosmos. By the end of term, you will know how to wrangle tensors, and how Einstein’s equations for the gravitational field

\[ R_{\alpha\beta} - \frac{1}{2} g_{\alpha\beta} R + \Lambda g_{\alpha\beta} = 8\pi G_N T_{\alpha\beta} \tag{1.1} \]

can be derived starting from an action principle for gravity. You will also understand how GR gives back Newton’s theory of gravity in the limit where speeds are small and spacetime is weakly curved. You can think of Einstein’s GR as Gravity 2.0, built on the foundation of Gravity 1.0 established by Newton – an upgrade.

The name for this course is “Relativity Theory 1”. Another name by which it is commonly known is “GR 1”, which stands for “General Relativity 1”. The main thing we learn how to do in this course is how to write the dynamical equations of physics in the language of tensor analysis. Tensor analysis always sounds scary when you start, but it is not much more complicated conceptually than vector analysis, something you have been doing for years. We will show how to take your vector analysis knowledge from flat space and generalize it to spacetime. We begin with flat spacetime, which is pertinent to Special Relativity, and then we build on that knowledge to figure out how to write dynamical equations of physics even in curved spacetime.

Albert Einstein taught us that the speed of light is constant and is the same in all inertial frames of reference. We will therefore adopt the relativistic convention that $c = 1$ throughout the course. This implies that time is measured in metres, and mass is measured in units of energy, e.g. $m_e = 511$ keV. We will keep all other physical constants explicit, such as Planck’s constant $\hbar$ characterizing the strength of quantum effects and the Newton constant $G_N$ characterizing the strength of gravity. If you feel queasy about missing factors of $c$ in any equation, they can always be easily restored by using dimensional analysis.

Greek letters tend to be very handy when you run out of Latin ones. We will mostly use the small letters that look different from Latin ones,

α alpha    β beta     γ gamma    δ delta    ε epsilon    ζ zeta    η eta
θ theta    κ kappa    λ lambda   µ mu       ν nu         ξ xi      π pi
ρ rho      σ sigma    τ tau      φ phi      χ chi        ψ psi     ω omega

and also use a smattering of capital letters that look different from Latin ones,

Γ Gamma    ∆ Delta    Θ Theta    Λ Lambda   Ξ Xi
Π Pi       Σ Sigma    Φ Phi      Ψ Psi      Ω Omega

It is worth practising making these letters if you handwrite, as Greek indices are traditional in a lot of GR literature. To get started, you can print the above tables at an appropriate magnification and trace your pen over them. Most people get the hang of them fairly quickly; ones that often trip people up are ζ, η, ξ and not making κ, ν, ρ sufficiently distinguishable from k, v, p.

1.2 Galilean relativity and 3-vectors in Euclidean space

Before we review some aspects of Special Relativity and introduce some new ones, let us begin by reminding ourselves of the non-relativistic version of relativity, also known as Galilean relativity. When we want to transform from one inertial frame of reference to another moving at relative velocity $v$, there are three things we must think about:

(a) how time intervals relate,

(b) how spatial position intervals relate,

(c) how velocities relate.

In Galilean relativity, all clocks are synchronized,

\[ dt' = dt , \tag{1.2} \]

displacements are related via

\[ dx' = dx + v\,dt , \tag{1.3} \]

and velocities compose by simple addition,

\[ v_{\rm tot} = v_1 + v_2 . \tag{1.4} \]

Einstein upgraded these formulæ when he invented Special Relativity, and you have seen the results before: they are known as Lorentz Transformations. We will get to them soon enough – and we will show you how simple they can look when written in terms of rapidity rather than velocity. But for now, let us inspect more closely how 3-vectors work. This will serve as a pattern for the relativistic case.

Lots of things of interest in physics are vectors, which are in essence things that point. I like to say that a vector has a ‘leg’ that sticks out, telling you where it points. Mathematically, the vector components are what you get when you resolve the vector along an orthonormal basis. In Special and General Relativity, we will need to be scrupulously careful to distinguish where we put our indices (up/down and left/right). For vectors $v$, we write the index telling you which component is which with an upstairs index: $v^1, v^2, \ldots, v^d$, where $i = 1, \ldots, d$ and $d$ is the spatial dimension. Note that the upstairs index $i$ used here is not a power; instead, it specifies which component $v^i$ is being discussed: the $i$th one. If you think of a contravariant vector as a column vector, the upstairs index $i$ denotes which row of the column vector you are talking about. If you need to take a power of a vector component, the GR convention is to write parentheses around it, e.g. $(v^1)^2$. Note also that it is common in GR literature to write the vector $v$ as $v^i$ – technically, $v^i$ is a component of $v$, but letting the index show explicitly rather than suppressing it helps us remember its transformation properties.

Vectors provide a useful notational shorthand, preventing us from having to write out all the components explicitly every time we write a physics equation like $\vec{F} = m\vec{a}$. Tensor analysis in GR is nothing scary – it is the natural generalization of vector analysis to curved spacetime and multilegged objects. Its underlying idea is twofold:

• In physics, the most useful dynamical variables transform in well-defined ways under coordinate transformations, and are known as tensors. Example: the momentum vector $p^i$.

• The laws of physics should be tensorial equations. A Newtonian example you will recognize is

\[ F^i = m a^i . \tag{1.5} \]

When we change coordinates, the components of tensors on both sides of the equation change, but the underlying physical relations between them do not.

The natural type of vector we defined above is called a contravariant vector. This is like a column vector. It has a natural counterpart called a covariant vector, also known as a dual vector. This is like a row vector. A covariant vector $\omega$ has components $\omega_i$; note that this is a downstairs index rather than an upstairs index. The index $i$ tells you which column of the row vector you are talking about. There is a natural inner product between contravariant vectors $v$ and covariant vectors $\omega$:

\[ v \cdot \omega = \sum_i v^i \omega_i . \tag{1.6} \]

A very useful convention that we will use throughout the course is the Einstein summation convention. This is a notational shorthand in which a repeated index is automatically summed over when it occurs precisely once upstairs and precisely once downstairs. This convention suppresses the unwieldy $\Sigma$ signs so that it becomes easier to see the wood for the trees. The thing that signals that you are summing over an index is that it is repeated. Note that a repeated (summed over) index can appear precisely twice in any given term: if it occurs more times, the writer has made a mistake. Summing over a repeated index is also called index contraction, because what you get for the result has none of the summed-over indices remaining. In our $v \cdot \omega$ example above, the result is a scalar: a tensor with zero legs.
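As a minimal numerical sketch of the summation convention (the component values are made up for illustration, and NumPy is an assumed tool rather than anything these notes rely on), the contraction $v^i \omega_i$ can be written either as an explicit sum or with `np.einsum`, where a repeated index label is summed automatically:

```python
import numpy as np

# Hypothetical example components, chosen only for illustration.
v = np.array([1.0, 2.0, 3.0])       # contravariant components v^i
omega = np.array([0.5, -1.0, 2.0])  # covariant components omega_i

# Explicit sum over the repeated index i.
explicit = sum(v[i] * omega[i] for i in range(3))

# The same contraction with einsum: the repeated label "i" is summed over,
# and the empty output "->" says no free index remains (a scalar).
# NumPy does not track upstairs/downstairs placement; that bookkeeping is ours.
contracted = np.einsum("i,i->", v, omega)

print(explicit, contracted)
```

The empty output specification mirrors the statement that the contracted result has no legs left.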

Why is it important to distinguish between contravariant and covariant vectors? In a nutshell, because they transform differently under coordinate transformations. Let us see how this works for a rotation. You may be used to writing a rotation of (say) a displacement vector as a $d \times d$ matrix $R$. Rotation matrices are orthogonal,

\[ R^{-1} = R^T . \tag{1.7} \]

Alternatively, we can say that they preserve the Euclidean norm in 3-space:

\[ R^T \mathbb{1}_3 R = \mathbb{1}_3 , \tag{1.8} \]

where $\mathbb{1}_3$ is the identity matrix. While $R$ transforms contravariant vectors $v$ in Euclidean space as

\[ v \to v' = R \cdot v , \tag{1.9} \]

where the prime indicates the transformed vector and the unprimed vector is the original, the transpose $R^T$ transforms covariant vectors $v^T$ as

\[ v^T \to v^{T\prime} = v^T \cdot R^T . \tag{1.10} \]

However, we strongly discourage writing coordinate transformations in terms of matrices in future, and instead encourage you to get the hang of index notation. Once time is included, coordinates are curvilinear, and spacetime is physically curved, index notation and the Einstein summation convention will help us keep track of indices in a much more succinct way and therefore reduce the error rate when handling tensors.

A rotation is expressed in index notation for the contravariant vector components as

\[ v^{i'} = R^{i'}{}_{j} v^{j} . \tag{1.11} \]

Let us write this out more explicitly so that you can see how it encodes matrix multiplication in a disciplined way. For a rotation of the vector $v$ with components $v^i$ about the $z$-axis, it reads¹

\[ \begin{aligned}
v^{1'} &= R^{1'}{}_{1} v^1 + R^{1'}{}_{2} v^2 + R^{1'}{}_{3} v^3 = +\cos\theta \, v^1 + \sin\theta \, v^2 \\
v^{2'} &= R^{2'}{}_{1} v^1 + R^{2'}{}_{2} v^2 + R^{2'}{}_{3} v^3 = -\sin\theta \, v^1 + \cos\theta \, v^2 \\
v^{3'} &= R^{3'}{}_{1} v^1 + R^{3'}{}_{2} v^2 + R^{3'}{}_{3} v^3 = v^3 .
\end{aligned} \tag{1.12} \]

For the covariant vector $\omega$ specified by its components $\omega_i$, we have

\[ \omega_{i'} = R^{j}{}_{i'} \omega_j . \tag{1.13} \]

Check explicitly that this equation also reproduces what you expect from your linear algebra experience. Use vector components and the Einstein summation convention like I did above. You should find that the covariant vector components are transformed in the $-\theta$ direction while the contravariant vector components are transformed in the $+\theta$ direction. This sign difference might seem rather trivial, but it is anything but! It is our first glimpse of why we do need to be very careful to distinguish between upstairs and downstairs indices for vectors – and more generally for tensors, which are multilegged generalizations of contravariant and covariant vectors.
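A numerical version of this check can be sketched as follows. It assumes the $z$-axis rotation matrix read off from the explicit component equations above, takes the matrix $R^{j}{}_{i'}$ to be the matrix inverse of $R^{i'}{}_{j}$, and uses made-up component values; the helper name `rotation_z` is hypothetical.

```python
import numpy as np

def rotation_z(theta):
    """Matrix R[i', j] = R^{i'}_j for a rotation about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, s, 0.0],
                     [-s, c, 0.0],
                     [0.0, 0.0, 1.0]])

theta = 0.3
R = rotation_z(theta)        # acts on contravariant components
R_inv = np.linalg.inv(R)     # R_inv[j, i'] = R^j_{i'}, acts on covariant components

# Hypothetical example components.
v = np.array([1.0, 2.0, 3.0])
omega = np.array([0.5, -1.0, 2.0])

v_new = np.einsum("ij,j->i", R, v)               # v^{i'} = R^{i'}_j v^j
omega_new = np.einsum("ji,j->i", R_inv, omega)   # omega_{i'} = R^j_{i'} omega_j

# The matrix that appears in the covariant law is the rotation matrix for -theta.
inverse_is_minus_theta = np.allclose(R_inv, rotation_z(-theta))

# The contraction v^i omega_i is unchanged: it is a scalar.
contraction_invariant = np.isclose(v_new @ omega_new, v @ omega)

print(inverse_is_minus_theta, contraction_invariant)
```

The invariance of the contraction is the reason the two transformation laws have to involve mutually inverse matrices.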

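Rotations are interesting mathematically because they preserve the Euclidean norm. In index notation, this condition reads

\[ R^{i}{}_{j'} \, \delta^{j'}{}_{k'} \, R^{k'}{}_{\ell} = \delta^{i}{}_{\ell} , \tag{1.14} \]

where

\[ \delta^{i}{}_{\ell} = \begin{cases} 1, & i = \ell \\ 0, & i \neq \ell \end{cases} . \tag{1.15} \]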

Mathematically, the tensor $\delta^{i}{}_{j}$ with one contravariant index and one covariant index is the identity matrix. The physical implication is that even if you rotate your velocity vector, it still produces the same kinetic energy for a non-relativistic particle, because the latter is proportional to the squared norm of the velocity vector.

¹ Nitpickers: please note that, like Carroll, we take the point of view that the vector stays fixed while the coordinate system changes under the relevant transformation.

In physics, we often want to find the norm (length) of a vector, or the angle between two distinct vectors via the dot product. In index notation, what we need to be able to do is to convert contravariant vectors into covariant vectors or vice versa. To achieve this, we need extra structure on the space (or later on, the spacetime) in which the vectors live, called a metric tensor $g$. In flat Euclidean 3-space in Cartesian coordinates, the role of the metric tensor is played by the Kronecker delta tensor, which is the identity matrix in both upstairs and downstairs components,

\[ \delta_{ij} = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases} , \tag{1.16} \]

and

\[ \delta^{ij} = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases} . \tag{1.17} \]

In Euclidean space, the conversion between contravariant and covariant components of vectors and vice versa is achieved via

\[ v_i = \delta_{ij} v^j , \tag{1.18} \]

and

\[ v^i = \delta^{ij} v_j , \tag{1.19} \]

where again we used the Einstein summation convention. The fact that the Euclidean metric is so trivial is why people are often very careless with index placement – if you write it out explicitly you will see that (for example) $v^2 = v_2$ in flat Euclidean 3-space. Reminder: if you need to take a power of a vector component, put parentheses around it to make it unambiguous. For example, $(v^1)^2$ means the square of the first contravariant component of the vector $v$.

Our metric for flat Euclidean 3-space must obey the identity that the upstairs metric contracted with the downstairs metric gives the unit tensor,

\[ \delta^{ij} \delta_{jk} = \delta^{i}{}_{k} . \tag{1.20} \]

This encodes the fact that the upstairs metric is physically the inverse of the downstairs metric, and this must also hold in curved spacetime. Using the index notation as above will ensure that no mistakes are made when we graduate from flat space to curved spacetime.

As you will recall from Newtonian physics, the kinetic energy is proportional to the square of the velocity vector, i.e. its squared norm $|v|^2 = \delta_{ij} v^i v^j$. This is a scalar, i.e. invariant under rotations. So is the inner product or dot product of any two contravariant vectors $a^i$ and $b^j$, formed by using the Kronecker delta tensor,

\[ a \cdot b = \delta_{ij} a^i b^j . \tag{1.21} \]

In 3 spatial dimensions only, we can build another 3-vector (actually a pseudovector) out of two contravariant vectors $a^i$ and $b^j$ by taking an outer product or cross product. The components are given by

\[ (a \times b)_i = E_{ijk} a^j b^k . \tag{1.22} \]

Notice that in writing the outer product here, we have again used the Einstein summation convention – twice – on both $j$ and $k$. This makes the expression more compact. To see that the above formula reproduces what you already know about cross products from earlier in your undergraduate education, write out each of the components and see. You will need the components of the permutation pseudotensor $E_{ijk}$, which is totally antisymmetric in its $d$ indices,

\[ E_{i_1 \ldots i_d} = \begin{cases} +1 , & (i_1 \ldots i_d) = \text{even permutation of } (12 \ldots d) , \\ -1 , & (i_1 \ldots i_d) = \text{odd permutation of } (12 \ldots d) , \\ \phantom{+}0 , & \text{otherwise} . \end{cases} \tag{1.23} \]

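Note that this 3-index tensor obeys some handy identities,

\[ \begin{aligned}
E_{ijk} E^{imn} &= 1! \left( \delta^{m}{}_{j} \delta^{n}{}_{k} - \delta^{m}{}_{k} \delta^{n}{}_{j} \right) , \\
E_{ijk} E^{ij\ell} &= 2! \left( \delta^{\ell}{}_{k} \right) , \\
E_{ijk} E^{ijk} &= 3! .
\end{aligned} \tag{1.24} \]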

In general spacetime dimension $d$, the outer product between two contravariant vectors $a^i$ and $b^j$ is more properly thought of as a pseudotensor with $(d-2)$ legs, because it is formed via the contraction $E_{i_1 i_2 \ldots i_d} a^{i_1} b^{i_2}$ of two vectors $a$ and $b$ with the $d$-legged $E$ pseudotensor. Note that $E$ is defined in any dimension as long as the manifold is orientable.
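The cross product formula and the three contraction identities can be checked numerically. The following is a sketch only: the helper `levi_civita` is a hypothetical name, indices are 0-based as usual in Python, the vectors are made-up examples, and in Cartesian coordinates on flat space upstairs and downstairs components are numerically equal, so index placement is not tracked.

```python
import itertools
import numpy as np

def levi_civita(d):
    """Return the d-index permutation symbol as an array (0-based indices)."""
    eps = np.zeros((d,) * d)
    for perm in itertools.permutations(range(d)):
        # The parity of a permutation equals the parity of its inversion count.
        inversions = sum(perm[i] > perm[j]
                         for i in range(d) for j in range(i + 1, d))
        eps[perm] = -1.0 if inversions % 2 else 1.0
    return eps

eps = levi_civita(3)
delta = np.eye(3)

# Hypothetical example vectors.
a = np.array([1.0, 2.0, 3.0])
b = np.array([-1.0, 0.5, 4.0])

# (a x b)_i = E_ijk a^j b^k, summing over j and k.
cross = np.einsum("ijk,j,k->i", eps, a, b)

# E_ijk E^imn = delta^m_j delta^n_k - delta^m_k delta^n_j
lhs1 = np.einsum("ijk,imn->jkmn", eps, eps)
rhs1 = (np.einsum("mj,nk->jkmn", delta, delta)
        - np.einsum("mk,nj->jkmn", delta, delta))

# E_ijk E^ijl = 2! delta^l_k   and   E_ijk E^ijk = 3!
lhs2 = np.einsum("ijk,ijl->kl", eps, eps)
lhs3 = np.einsum("ijk,ijk->", eps, eps)

print(cross, np.cross(a, b), lhs3)
```

Calling `levi_civita(4)` gives the 4-index symbol by the same construction.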

1.3 Special relativity and 4-vectors in Minkowski spacetime

Let us now turn to studying how to generalize spatial vectors in flat Euclidean space to spacetime vectors in flat Minkowski spacetime, in Cartesian coordinates to begin with.

The bedrock principle of the constancy of the speed of light has some fairly dramatic physics implications, chief among them being time dilation and length contraction. Both of these ideas have been rigorously tested experimentally, e.g. in particle collider and cosmic ray contexts, and found to hold true. Also, velocities no longer add simply, obeying a composition law that looks pretty mysterious the first time you see it. Let me now demystify this and Lorentz boosts by using a clever parametrization.

When you first saw Lorentz boosts, probably at the end of first year Newtonian mechanics or in a second year modern physics course, they probably looked like the following. For an infinitesimal Lorentz boost in the $x$ direction (recall that $c = 1$),

\[ dt' = \frac{1}{\sqrt{1 - v^2}} \, (dt + v\,dx) , \tag{1.25} \]

\[ dx' = \frac{1}{\sqrt{1 - v^2}} \, (dx + v\,dt) , \tag{1.26} \]

\[ dy' = dy , \tag{1.27} \]

\[ dz' = dz , \tag{1.28} \]

and velocities are composed according to

\[ v_{\rm tot} = \frac{v_1 + v_2}{1 + v_1 v_2} . \tag{1.29} \]

Let us define the rapidity $\zeta$ via

\[ v = \tanh \zeta . \tag{1.30} \]

Note that while the speed ranges over $v \in (-1, +1)$, the rapidity ranges over $\zeta \in (-\infty, +\infty)$. The really awesome thing about rapidity is that it is additive. To add the rapidities, you literally just add them, like for rotation angles:

\[ \zeta_{\rm tot} = \zeta_1 + \zeta_2 . \tag{1.31} \]

It is a simple exercise to recover the relativistic velocity addition law from the definition of rapidity and its additive nature. Give it a go yourself, and you will see what I mean.
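For reference once you have tried it, one way to do this exercise uses the addition formula for the hyperbolic tangent, which is a standard identity assumed here rather than derived:

```latex
\[
v_{\rm tot} = \tanh(\zeta_1 + \zeta_2)
            = \frac{\tanh\zeta_1 + \tanh\zeta_2}{1 + \tanh\zeta_1 \tanh\zeta_2}
            = \frac{v_1 + v_2}{1 + v_1 v_2} .
\]
```

When both speeds are much smaller than 1, the denominator is close to 1 and the Galilean rule of simple addition is recovered.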

Now we are in a position to show you a Lorentz boost along the $x$ direction in rapidity variables – da-daah!

\[ \begin{aligned}
dt' &= \cosh\zeta \, dt + \sinh\zeta \, dx , \\
dx' &= \sinh\zeta \, dt + \cosh\zeta \, dx , \\
dy' &= dy , \\
dz' &= dz .
\end{aligned} \tag{1.32} \]

This looks a bit like a rotation, except for two physically important differences: (1) it mixes temporal and spatial intervals, rather than different spatial intervals, and (2) it involves hyperbolic trig functions, rather than normal trig functions. Another difference is that it is not the 3D Euclidean norm that is preserved under Lorentz transformations, but rather the 4D Minkowski norm, also known as the invariant interval

\[ ds^2 = dt^2 - dx^2 - dy^2 - dz^2 . \tag{1.33} \]

The invariant interval so defined is positive if the points are timelike separated, negative if they are spacelike separated, and zero if they are null separated. This classification works regardless of which inertial reference frame you use, because it is invariant under symmetry transformations of Minkowski spacetime: rotations, [Lorentz] boosts, and translations.
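The invariance of the interval under a boost, and the additivity of rapidity, can both be checked numerically. This is a sketch under stated assumptions: the function names `boost_x` and `interval`, the rapidity values, and the sample displacement are all made up for illustration, with components ordered as $(dt, dx, dy, dz)$ and $c = 1$.

```python
import numpy as np

def boost_x(zeta):
    """Boost along x with rapidity zeta, acting on (dt, dx, dy, dz)."""
    ch, sh = np.cosh(zeta), np.sinh(zeta)
    return np.array([[ch, sh, 0.0, 0.0],
                     [sh, ch, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0]])

def interval(d):
    """Invariant interval ds^2 in the mostly-minus convention."""
    dt, dx, dy, dz = d
    return dt**2 - dx**2 - dy**2 - dz**2

zeta1, zeta2 = 0.4, 1.1                      # hypothetical rapidities
d = np.array([2.0, 0.5, -1.0, 0.3])          # hypothetical timelike displacement

d_boosted = boost_x(zeta1) @ d

# Two successive boosts along x equal one boost with the summed rapidity.
composed = boost_x(zeta1) @ boost_x(zeta2)

print(interval(d), interval(d_boosted))
print(np.allclose(composed, boost_x(zeta1 + zeta2)))
```

The sign of `interval(d)` is what classifies the separation as timelike, spacelike or null, and it is the same before and after the boost.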

The invariant interval $ds^2 = dt^2 - |d\vec{x}|^2$ gives rise to the concept of a light cone. For a point $p$, this is the cone defined by all light rays emanating from $p$ into the future or the past. Points that are timelike separated from $p$ are inside its light cone (positive $ds^2$), those that are spacelike separated from $p$ are outside it (negative $ds^2$), and those that are null separated from $p$ (zero $ds^2$) lie on the light cone itself. Put more colloquially, if you had just died at point $p$, then your past light cone and its interior would contain all possible suspects for who had murdered you. If on the other hand you had set off a bomb at $p$, then your future light cone and its interior would contain beings you could have killed (using any form of explosive, TNT and photon torpedoes included!).

Note that light rays are conventionally drawn at a 45 degree angle on spacetime diagrams, in flat spacetime, to represent the fact that $c = 1$. In curved spacetime, the story gets more complicated, because the spacetime metric varies with position, rather than being constant.

It might be worth reminding you of the definition of proper time. To set the context, consider two events that are timelike separated. The proper time between two spacetime events measures the time elapsed as seen by an observer for whom the two events occur at the same spatial position. In our signature convention, the invariant interval is positive in the timelike case, so $ds^2 = d\tau^2$.

Motivated by the form of the matrices representing Lorentz boost transformations, let us define a relativistic 4-vector $x$ with components $x^\mu$ given by

\[ \begin{aligned}
x^0 &= (c)t , \\
x^i &= (\vec{x})^i .
\end{aligned} \tag{1.34} \]

Here, $\mu \in \{0, 1, \ldots, d\}$. Notice how time is totally different conceptually than it was in Galilean relativity: it is the zeroth position coordinate, not an invariant. We can then define the invariant interval as

\[ ds^2 = \eta_{\mu\nu} \, dx^\mu dx^\nu \tag{1.35} \]

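where the Minkowski metric tensor has downstairs components

\[ \eta_{\mu\nu} = \begin{cases} +1, & \mu = \nu = 0 \\ -1, & \mu = \nu = 1 \\ -1, & \mu = \nu = 2 \\ \ldots \\ -1, & \mu = \nu = d \\ \phantom{+}0, & \mu \neq \nu \end{cases} . \tag{1.36} \]

Its upstairs counterpart, the inverse, has components

\[ \eta^{\mu\nu} = \begin{cases} +1, & \mu = \nu = 0 \\ -1, & \mu = \nu = 1 \\ -1, & \mu = \nu = 2 \\ \ldots \\ -1, & \mu = \nu = d \\ \phantom{+}0, & \mu \neq \nu \end{cases} . \tag{1.37} \]

It satisfies

\[ \eta^{\alpha\beta} \eta_{\beta\gamma} = \eta^{\alpha}{}_{\gamma} = \delta^{\alpha}{}_{\gamma} . \tag{1.38} \]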

Again, we have used the Einstein summation convention where repeated indices are summed over. Note that we have chosen the mostly minus signature convention here. Be aware that formulæ that you may obtain from various GR textbooks may have been written in the opposite sign convention. This can be quite annoying when you are trying to track down minus sign errors in a calculation. HEL (Hobson, Efstathiou and Lasenby) has a useful table on p.193 outlining key signature convention differences with d’Inverno, Misner-Thorne-Wheeler and Weinberg.

The Minkowski metric tensor $\eta$ is useful for raising and lowering indices. Specifically, for a contravariant vector $V^\nu$ we can find its covariant components $V_\mu$ by contracting with $\eta_{\mu\nu}$:

\[ V_\mu = \eta_{\mu\nu} V^\nu . \tag{1.39} \]

Contracting an index means repeating it (precisely once) and summing over it. For example, in the above equation, the index $\nu$ is contracted, while the index $\mu$ is not. Let us calculate one component, $V_0$.

\[ \begin{aligned}
V_0 &= \eta_{0\nu} V^\nu \\
    &= \eta_{00} V^0 + \eta_{01} V^1 + \eta_{02} V^2 + \eta_{03} V^3 \\
    &= (+1) V^0 + (0) V^1 + (0) V^2 + (0) V^3 \\
    &= + V^0 .
\end{aligned} \tag{1.40} \]

To find the contravariant components $\omega^\mu$ of a covariant vector $\omega_\nu$, we need to contract with the upstairs metric $\eta^{\mu\nu}$:

\[ \omega^\mu = \eta^{\mu\nu} \omega_\nu . \tag{1.41} \]

Using the Minkowski metric, we can define a relativistic dot product between two contravariant vectors $a^\mu$ and $b^\nu$,

\[ a \cdot b = \eta_{\mu\nu} a^\mu b^\nu . \tag{1.42} \]
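Lowering, raising and the relativistic dot product can be sketched numerically in $D = 4$ with the mostly minus metric. The component values below are made up for illustration, and the variable names are assumptions of this sketch rather than notation from the notes.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])   # eta_{mu nu}, mostly minus
eta_inv = np.linalg.inv(eta)             # eta^{mu nu}, numerically the same matrix

# Hypothetical contravariant components V^mu.
V_up = np.array([3.0, 1.0, -2.0, 0.5])

# Lower the index: V_mu = eta_{mu nu} V^nu (spatial components flip sign).
V_down = np.einsum("mn,n->m", eta, V_up)

# Raise it again: V^mu = eta^{mu nu} V_nu, which recovers the original.
V_back = np.einsum("mn,n->m", eta_inv, V_down)

# Relativistic dot product a.b = eta_{mu nu} a^mu b^nu.
a = np.array([2.0, 1.0, 0.0, 0.0])
b = np.array([1.0, 3.0, 0.0, 0.0])
dot = np.einsum("mn,m,n->", eta, a, b)

print(V_down, V_back, dot)
```

Here the time component is unchanged by lowering while the spatial components change sign, matching the component-by-component calculation of $V_0$ above.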

Before we move on to defining tensors in a more general way, let us make a couple of comments about the symmetry group of Minkowski spacetime for those who might be interested. We talked about rotations earlier, and noted that they preserved the norm of 3-vectors in flat Euclidean space. A rotation matrix is orthogonal and preserves the Euclidean 3-norm. The group of such matrices in 3D is known as SO(3). What is the analogue condition for 4-vectors in flat Minkowski spacetime? If you work out the algebra, you will find that both rotation and boost transformations written as $4 \times 4$ matrices $\Lambda$ preserve the Minkowski norm, $\Lambda^T \eta \Lambda = \eta$, where $\eta$ is the Minkowski metric tensor we defined above. In index notation,

\[ \Lambda^{i}{}_{k'} \, \eta^{k'}{}_{\ell'} \, \Lambda^{\ell'}{}_{j} = \eta^{i}{}_{j} . \tag{1.43} \]

Such matrices $\Lambda$ in $D = d + 1$ dimensions are said to belong to the group SO(1, d). Rotation and boost matrices together are known as Lorentz transformations and they form a Lie (continuous) group known as the Lorentz group.
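The matrix condition $\Lambda^T \eta \Lambda = \eta$ can be tested numerically for a boost, a rotation, and their product. In this sketch the helper names (`boost_x`, `rotation_z`, `galilean_x`, `preserves_minkowski`) and parameter values are hypothetical; a Galilean transformation is included only as a contrast case that fails the condition.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])   # mostly-minus Minkowski metric, D = 4

def boost_x(zeta):
    """Boost along x with rapidity zeta, acting on (t, x, y, z)."""
    L = np.eye(4)
    L[0, 0] = L[1, 1] = np.cosh(zeta)
    L[0, 1] = L[1, 0] = np.sinh(zeta)
    return L

def rotation_z(theta):
    """Spatial rotation about z, embedded in a 4x4 matrix."""
    c, s = np.cos(theta), np.sin(theta)
    L = np.eye(4)
    L[1, 1], L[1, 2], L[2, 1], L[2, 2] = c, s, -s, c
    return L

def galilean_x(v):
    """Galilean transformation dt' = dt, dx' = dx + v dt, for comparison."""
    G = np.eye(4)
    G[1, 0] = v
    return G

def preserves_minkowski(L):
    """True if L^T eta L = eta up to floating-point tolerance."""
    return np.allclose(L.T @ eta @ L, eta)

boost = boost_x(0.7)
rotation = rotation_z(0.3)
combined = rotation @ boost   # a product of Lorentz transformations is again one

print(preserves_minkowski(boost), preserves_minkowski(rotation),
      preserves_minkowski(combined), preserves_minkowski(galilean_x(0.5)))
```

The product passing the same test is a small illustration of the group property.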

If we add in translations as well, the resulting group of transformations is known as the Poincaré group ISO(1, d) (mathematically, it is a semidirect product). An interesting fact that you probably don’t know about the Poincaré group is that, without even looking at an experiment, you can prove theoretically that there are only two² invariants of the Poincaré group: the mass $m$ and the intrinsic spin $s$. They are always the same in different inertial frames of reference related by rotations, boosts, or translations. This is why subatomic particles are differentiated by their mass and spin. The third particle label that respects Poincaré invariance is the set of conserved gauge charges, under whichever gauge symmetries are relevant, e.g. $SU(3) \times SU(2) \times U(1)$ of the Standard Model of Particle Physics. The mass, spin, and charges of a point particle do not change under a coordinate transformation, and that is precisely why they are physically useful. We can agree on how to identify an electron whether we are in Montreal or on Mars, or in motion somewhere between them.

² If you want to know why, and are unafraid of a little Lie group theory, you can find out by reading my PHY2404S notes at http://ap.io/2404s/qft.pdf. I also explain there why helicity is the relevant thing for massless particles and why spin has the character of an angular momentum for massive particles.

So, how do we define vectors and tensors in flat spacetime? The signature property of a vector and, more generally, of a tensor, is that it transforms in a specific and well-defined way under changes of reference frame, using the spacetime coordinates as the quintessential example. For a single-index tensor V with upstairs components $V^\mu$,

$$ V^{\mu'} = \frac{\partial x^{\mu'}}{\partial x^\nu} V^\nu = \Lambda^{\mu'}{}_\nu V^\nu \,, \tag{1.44} $$

which is known as a contravariant vector. There are also covariant vectors which obey

$$ V_{\mu'} = \frac{\partial x^\nu}{\partial x^{\mu'}} V_\nu = \Lambda^\nu{}_{\mu'} V_\nu \,. \tag{1.45} $$

Look closely at the above two equations: they are materially different. In the equation for the contravariant vectors, the transformed coordinates x′ appear in the numerator of the Jacobian and the original coordinates x appear in the denominator in the transformation law; for covariant vectors the opposite happens. Right now, the distinction between contravariant vectors and covariant vectors may not seem all that important, because we have been working in flat Minkowski spacetime in Cartesian coordinates. Once we use curvilinear coordinates or turn on a nontrivial spacetime metric, the distinction will become physically and mathematically crucial.

Mathematically speaking, contravariant vectors live in the tangent space, which is defined at every point in spacetime. Covariant vectors live in the cotangent space. They obey the usual axioms of vector spaces: associativity and commutativity of addition, existence of identity and inverse under addition, distributivity, and compatibility with scalar multiplication.

A rank (m, n) tensor has m contravariant indices and n covariant indices. In mathematical language, a rank (m, n) tensor is a multilinear map from the direct product of m copies of the cotangent space with n copies of the tangent space into the real numbers. Alternatively, you can think of it as a machine with m slots for covariant vectors and n slots for contravariant vectors to make a scalar. For instance, a rank (0, 1) tensor (a covariant vector) is a machine with one slot for a contravariant vector (a rank (1, 0) tensor), which when inserted will produce a scalar (a rank (0, 0) tensor). The spacetime metric is a (0, 2) tensor; its inverse is a (2, 0) tensor.
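The "machine" picture can be made concrete with a small numerical sketch: feed a contravariant vector into a covariant vector, transform both with the two different laws, and confirm that the resulting scalar is unchanged. The example assumes a boost along x with c = 1, and uses the fact that the inverse of a boost with rapidity ζ is the boost with rapidity −ζ; all names and sample components are illustrative.

```python
import math


def boost_x(zeta):
    """Boost along x with rapidity zeta (c = 1)."""
    ch, sh = math.cosh(zeta), math.sinh(zeta)
    return [[ch, -sh, 0, 0], [-sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def transform_contravariant(lam, v_up):
    """V^{mu'} = Lambda^{mu'}_nu V^nu, with lam[mu][nu] = Lambda^{mu'}_nu."""
    return [sum(lam[mu][nu] * v_up[nu] for nu in range(4)) for mu in range(4)]


def transform_covariant(lam_inv, w_down):
    """w_{mu'} = Lambda^nu_{mu'} w_nu, with lam_inv[nu][mu] = Lambda^nu_{mu'}."""
    return [sum(lam_inv[nu][mu] * w_down[nu] for nu in range(4))
            for mu in range(4)]


def pair(w_down, v_up):
    """Insert a vector into the one slot of a covector: the scalar w_mu V^mu."""
    return sum(w * v for w, v in zip(w_down, v_up))


zeta = 0.5
v_up = [1.0, 2.0, 3.0, 4.0]        # illustrative contravariant components
w_down = [0.5, -1.0, 2.0, 0.25]    # illustrative covariant components

scalar_before = pair(w_down, v_up)
scalar_after = pair(
    transform_covariant(boost_x(-zeta), w_down),
    transform_contravariant(boost_x(zeta), v_up),
)
```

The individual components of both objects change under the boost, but the number produced by the pairing does not.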

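To find out how the components of a tensor transform, you use the transformation matrices on each index in turn,

$$
T^{\mu'_1 \ldots \mu'_m}{}_{\nu'_1 \ldots \nu'_n}
= \frac{\partial x^{\mu'_1}}{\partial x^{\lambda_1}} \cdots \frac{\partial x^{\mu'_m}}{\partial x^{\lambda_m}} \,
\frac{\partial x^{\sigma_1}}{\partial x^{\nu'_1}} \cdots \frac{\partial x^{\sigma_n}}{\partial x^{\nu'_n}} \,
T^{\lambda_1 \ldots \lambda_m}{}_{\sigma_1 \ldots \sigma_n} \,.
\tag{1.46}
$$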

Note that each of the indices $\lambda_1, \ldots, \lambda_m$ and $\sigma_1, \ldots, \sigma_n$ in this equation is repeated and summed over, keeping to the Einstein summation convention. So if you were to expand out all the components one by one, this would be a pretty long equation. It’s just as well we know how to represent it compactly using index notation!

The general idea of tensor analysis is that all laws of physics should be expressible in terms of tensor equations. In tensorial equations, indices can be raised and lowered, as long as this is done consistently on both sides. In other words, you should not raise an index on the left side of a tensor equation while failing to do the same on the right hand side. Every equation should have the same number and type of free indices on both sides. Tensorial equations hold equally well in any frame of reference, even though the components are different in different frames of reference.

Now let us turn to a few examples of the utility of tensors in Minkowski spacetime.

1.4 Relativistic particle

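For any point particle, massive or massless, we can define its 4-momentum $p^\mu$ by

$$ p^0 = E \,, \qquad p^i = (\vec{p})^i \,, \tag{1.47} $$

where $\vec{p}$ is the relativistic 3-momentum and E is the relativistic energy. For a massive particle with speed $v = |\vec{v}|$, rapidity ζ, and unit direction vector $\hat{v}^i = v^i / v$, we have

$$
p^0 = \frac{m}{\sqrt{1 - v^2}} = m \cosh\zeta \,, \qquad
p^i = \frac{m}{\sqrt{1 - v^2}} \, v^i = m \sinh\zeta \; \hat{v}^i \,.
\tag{1.48}
$$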

Check out for yourself what happens to components of the 4-momentum under Lorentz transformations.

Notice that the relativistic norm of the momentum 4-vector is a constant,

$$ p^\mu p_\mu = E^2 - |\vec{p}|^2 = m^2 \,. \tag{1.49} $$

This is known as the mass shell relation. It holds for any particle, massless or massive. For massless particles like the photon, $E^2 = |\vec{p}|^2$.

The 4-velocity is defined for massive particles only, via

$$ u^\mu = \frac{dx^\mu(\tau)}{d\tau} \,, \tag{1.50} $$

where τ is the proper time. It is related to the momentum 4-vector by

$$ p^\mu = m u^\mu \,. \tag{1.51} $$

Note that the 4-velocity satisfies

$$ u^\mu u_\mu = +1 \,, \tag{1.52} $$

by the mass shell constraint. Work out for yourself how the spatial components of $u^\mu$ relate to the Newtonian velocity.
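The normalization of the 4-velocity and the mass shell relation are easy to check numerically. The sketch below assumes c = 1 and the signature (+, −, −, −), and uses the standard result that the 4-velocity components are (γ, γ v) with γ = 1/√(1 − v²); the mass and velocity values are arbitrary illustrations.

```python
import math


def four_velocity(v3):
    """Return u^mu = (gamma, gamma * v) for a Newtonian 3-velocity v3 (c = 1)."""
    speed_sq = sum(c * c for c in v3)
    gamma = 1.0 / math.sqrt(1.0 - speed_sq)
    return [gamma] + [gamma * c for c in v3]


def minkowski_norm_sq(a_up):
    """Return a^mu a_mu with signature (+, -, -, -)."""
    return a_up[0] ** 2 - sum(c * c for c in a_up[1:])


m = 2.0                          # illustrative mass
v3 = [0.3, 0.4, 0.0]             # illustrative velocity with speed 0.5
u = four_velocity(v3)
p = [m * c for c in u]           # p^mu = m u^mu

rapidity = math.atanh(0.5)       # tanh(zeta) = v
u_norm = minkowski_norm_sq(u)    # expected to be +1
p_norm = minkowski_norm_sq(p)    # expected to be m^2
```

The energy component agrees with m cosh ζ and the magnitude of the 3-momentum with m sinh ζ, where tanh ζ = v.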

The 4-acceleration is defined for massive particles only, via

$$ a^\mu = \frac{du^\mu(\tau)}{d\tau} = \frac{d^2 x^\mu(\tau)}{d\tau^2} \,, \tag{1.53} $$

where τ is proper time. Work out for yourself how this relativistic acceleration connects with the Newtonian version of acceleration that you used in first-year undergrad physics. (Hint: it will differ, by relativistic factors!)

What if we wanted to get a bit more sophisticated and write down an action principle for the point particle? First, let us do a lightning review of some salient points from classical mechanics. In a general dynamical system that is non-relativistic, our variables are the coordinates

$$ q^a(t) \,, \tag{1.54} $$

where the index a labels which coordinate we are discussing and t is the non-relativistic time. The velocities are $\dot{q}^a(t)$, where the overdot denotes d/dt. We also have the expression for the canonical momenta in terms of the velocities,

$$ p_a = \frac{\partial L}{\partial \dot{q}^a} \,, \tag{1.55} $$

which are found from the action, which is a functional of the coordinates,

$$ S = S[q^a(t)] = \int dt \, L(q^a, \dot{q}^b) \,. \tag{1.56} $$

Using the Lagrangian L and the expressions for the canonical momenta in terms of the velocities, we can form the Hamiltonian H which depends on the coordinates on phase space, the coordinates and their conjugate momenta,

$$ H = H(q^a, p_b) = \sum_a p_a \dot{q}^a - L \,. \tag{1.57} $$

The principle of least action δS = 0, combined with your knowledge of the calculus of variations, results in the Euler-Lagrange equations

$$ \frac{\partial L}{\partial q^a} - \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{q}^a} \right) = 0 \,. \tag{1.58} $$

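These equations of motion are equivalent to Hamilton’s equations,

$$ \frac{dp_a}{dt} = \{ p_a, H \}_{\rm PB} \,, \qquad \frac{dq^a}{dt} = \{ q^a, H \}_{\rm PB} \,, \tag{1.59} $$

where the Poisson bracket is defined via

$$ \{ f, g \}_{\rm PB} = \sum_a \left( \frac{\partial f}{\partial q^a} \frac{\partial g}{\partial p_a} - \frac{\partial g}{\partial q^a} \frac{\partial f}{\partial p_a} \right) . \tag{1.60} $$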

The basic dynamical variables for a non-relativistic point particle are $x^i(t)$, where i = 1, 2, ..., d. There is no issue about how to parametrize t, because all observers agree on time, by Galilean relativity. For a free nonrelativistic particle, the Lagrangian is just the kinetic energy,

$$ S_{\rm nonrel} = \int dt \, \frac{1}{2} m |\vec{v}|^2 \,. \tag{1.61} $$

This action respects rotational invariance of Euclidean 3-space, because it depends only on the norm of the velocity vector. The canonical momenta are

$$ p_i = m v_i \,, \tag{1.62} $$

and the Hamiltonian is

$$ H_t = \frac{1}{2m} \, p_i p_i \,. \tag{1.63} $$

This is just the kinetic energy written in terms of momentum rather than velocity.

So, that was all well and good, but what about an action principle for the relativistic point particle? This will be an integral over the worldline of the particle, which is the path it traces out as it moves through spacetime. For relativistic point particles we cannot use the Newtonian kinetic energy, because it is not invariant under Lorentz boosts. We will have to use a generalization that respects Einsteinian relativity. The simplest guess for an action generalizing the above that people typically write is proportional to the arc length,

$$ S^{(1)}_{\rm rel} = -m \int d\tau \, \sqrt{ \eta_{\mu\nu} \frac{dx^\mu(\tau)}{d\tau} \frac{dx^\nu(\tau)}{d\tau} } \,, \tag{1.64} $$

where τ is the proper time (an invariant, unlike the time coordinate). This action has the benefit that, at low speeds, it reduces to the familiar non-relativistic action, up to an additive constant (try it yourself to see how, by doing a Taylor series). It assumes that the particle position $x^\mu(\tau)$ can be parametrized by the proper time τ.
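For readers who want to check their Taylor series, here is a sketch of the low-speed expansion. It assumes units with c = 1, uses the fact that the square root in the arc length action equals one when the worldline is parametrized by proper time, and uses the standard time dilation relation $d\tau = \sqrt{1 - |\vec{v}|^2}\, dt$ to trade proper time for coordinate time.

```latex
\begin{aligned}
S^{(1)}_{\rm rel} &= -m \int d\tau
  = -m \int dt \, \sqrt{1 - |\vec{v}|^2} \\
&= \int dt \left[ -m + \tfrac{1}{2} m |\vec{v}|^2
  + \tfrac{1}{8} m |\vec{v}|^4 + \mathcal{O}\!\left(|\vec{v}|^6\right) \right] .
\end{aligned}
```

The first term is the additive constant (the rest energy, with a minus sign), the second is the Newtonian kinetic term, and the remaining terms are relativistic corrections that are negligible at small speeds.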

The drawback of this first choice of relativistic action is twofold: the dynamical variables $x^\mu(\tau)$ are not independent functions, and the particle is assumed to be massive so that proper time can be used to parametrize the worldline. To see the first problem, we just need to recall the mass shell constraint,

$$ p^\mu p_\mu = E^2 - |\vec{p}|^2 = m^2 \,, \tag{1.65} $$

or equivalently that

$$ u^\mu u_\mu = +1 \,, \tag{1.66} $$

where

$$ u^\mu = \frac{dx^\mu}{d\tau} \,. \tag{1.67} $$

In other words, at most three of the $x^\mu(\tau)$ are independent functions. It is a physics fib to pretend that all four of them can be independently varied in the action principle. Oops! To see the second problem, suppose that we tried to use the above Lagrangian to find the canonical momenta and Hamiltonian. What we would find is that the geometric arc length Lagrangian is singular: it does not provide a path for making a sensible Hamiltonian.

Both of these problems turn out to be caused by the fact that we forgot to take account of a symmetry of the arc length action, known as reparametrization invariance, or equivalently the non-independence of the 4 components of the 4-vector $x^\mu(\tau)$. We can become more sophisticated about writing down a Lagrangian for the relativistic particle with dynamical variables $x^\mu(\lambda)$, where λ is an arbitrary worldline parameter. The key is to impose the mass shell constraint via a Lagrange multiplier e(λ). In general, a Lagrange multiplier is something that appears in your action principle only via dependence on “coordinates” but not on “velocities”: it is not a dynamical field. Its only function is to implement the constraint that you need to impose, in a way that respects the symmetries of your system: Poincaré invariance, in our case.

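The einbein Lagrangian is

$$ S^{(2)}_{\rm rel} = \int d\lambda \left[ \frac{1}{2} e^{-1}(\lambda) \frac{dx^\mu(\lambda)}{d\lambda} \frac{dx_\mu(\lambda)}{d\lambda} + \frac{1}{2} e(\lambda) \, m^2 \right] . \tag{1.68} $$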

This action is invariant under reparametrizations, as long as the einbein transforms in a very specific way,

$$ \lambda \to \lambda' \,, \qquad e \to e' = \frac{d\lambda}{d\lambda'} \, e \,. \tag{1.69} $$

Varying w.r.t. e(λ) gives the constraint equation as

$$ \frac{dx^\mu(\lambda)}{d\lambda} \frac{dx_\mu(\lambda)}{d\lambda} = +m^2 \, [e(\lambda)]^2 \,. \tag{1.70} $$

For massive particles we can pick the proper time gauge in which e(λ) = 1/m; then, λ = τ. Varying w.r.t. $x^\mu(\lambda)$ gives the equations of motion

$$ \frac{d}{d\lambda} \left[ e^{-1}(\lambda) \frac{dx^\mu(\lambda)}{d\lambda} \right] = 0 \,, \tag{1.71} $$

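which are equally valid for massive or massless particles. The canonical momenta are

$$ p_\mu = \frac{1}{e(\lambda)} \frac{dx_\mu(\lambda)}{d\lambda} \tag{1.72} $$

and so the Hamiltonian is

$$ H_\lambda = \frac{1}{2} e \left( p^\mu p_\mu - m^2 \right) . \tag{1.73} $$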

This Hamiltonian is proportional to the constraint, and this is the correct answer (!), because it gives all the correct Poisson Brackets:

$$ \{ x^\mu, p_\nu \}_{\rm PB} = \delta^\mu_\nu \,, \qquad \{ x^\mu, x^\nu \}_{\rm PB} = 0 \,, \qquad \{ p_\mu, p_\nu \}_{\rm PB} = 0 \,. \tag{1.74} $$

If we wanted to canonically quantize a system (something we will not be doing in this course), we would replace classical Poisson brackets with quantum mechanical commutators.
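As a check on the einbein Hamiltonian, here is a sketch of the Legendre transform that produces it from the einbein action. The overdot is an assumed shorthand for d/dλ, and the einbein e is treated as a non-dynamical multiplier, so it has no conjugate momentum.

```latex
\begin{aligned}
L &= \frac{1}{2} e^{-1} \dot{x}^\mu \dot{x}_\mu + \frac{1}{2} e \, m^2 ,
  \qquad \dot{x}^\mu \equiv \frac{dx^\mu}{d\lambda} , \\
p_\mu &= \frac{\partial L}{\partial \dot{x}^\mu} = e^{-1} \dot{x}_\mu
  \quad \Longrightarrow \quad \dot{x}^\mu = e \, p^\mu , \\
H_\lambda &= p_\mu \dot{x}^\mu - L
  = e \, p^\mu p_\mu - \frac{1}{2} e \, p^\mu p_\mu - \frac{1}{2} e \, m^2
  = \frac{1}{2} e \left( p^\mu p_\mu - m^2 \right) .
\end{aligned}
```

Unlike the arc length Lagrangian, the relation between velocities and momenta here is invertible, which is why the Hamiltonian can be constructed at all.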

1.5 Partial derivatives, and electromagnetism

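We can use Minkowski spacetime tensors to describe more objects than a massive point particle. For starters, we can form a very important covariant vector out of derivatives,

$$ \partial_\mu \equiv \frac{\partial}{\partial x^\mu} \,. \tag{1.75} $$

Its zeroth component describes the time derivative

$$ \partial_0 = \frac{\partial}{\partial (c) t} = \frac{1}{(c)} \frac{\partial}{\partial t} \,, \tag{1.76} $$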

while the spatial parts $\partial_i$ describe spatial derivatives. As you can see, $\partial_\mu$ arises naturally as a covariant vector. It is a straightforward and worthwhile exercise to show that in flat Minkowski spacetime,

$$ \partial^\mu \partial_\mu = \frac{1}{(c)^2} \frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2} - \frac{\partial^2}{\partial z^2} \,. \tag{1.77} $$

This differential operator appears in relativistic wave equations, e.g. for a Klein-Gordon (scalar) field.

For fun, let us try applying $-i\hbar\partial_\mu$ to a plane wave of the form $f(x) = e^{ik \cdot x}$ and see what happens.

$$
\begin{aligned}
-i\hbar \partial_\mu f(x) &= -i\hbar \, \partial_\mu \exp(i k_\lambda x^\lambda) \\
&= -i\hbar \, [\, i k_\nu \delta^\nu_\mu \,] \exp(i k_\lambda x^\lambda) \\
&= \hbar k_\mu f(x) \,.
\end{aligned}
\tag{1.78}
$$

In other words, $-i\hbar\partial_\mu$ is playing the role of the momentum when acting on plane waves of the form $f(x) = e^{ik \cdot x}$, producing the eigenvalue $p_\mu = \hbar k_\mu$. In mathematical lingo, we say that f(x) carries a representation of the translation group. If we only had discrete translation invariance up to a lattice vector instead, we would end up with Bloch waves instead of continuous spectrum plane waves.
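The eigenvalue statement can be checked numerically by replacing the partial derivative with a central finite difference. This is only a sketch: it sets ħ = 1, picks arbitrary illustrative values for the wave vector and the spacetime point, and the function names are made up for the example.

```python
import cmath

HBAR = 1.0  # natural units assumed for this sketch


def plane_wave(k_down, x_up):
    """f(x) = exp(i k_lambda x^lambda)."""
    phase = sum(k * x for k, x in zip(k_down, x_up))
    return cmath.exp(1j * phase)


def momentum_eigenvalue(k_down, x_up, mu, h=1e-6):
    """Estimate (-i hbar d_mu f) / f at x using a central difference."""
    x_plus = list(x_up)
    x_minus = list(x_up)
    x_plus[mu] += h
    x_minus[mu] -= h
    derivative = (plane_wave(k_down, x_plus) - plane_wave(k_down, x_minus)) / (2 * h)
    return (-1j * HBAR * derivative) / plane_wave(k_down, x_up)


k_down = [1.5, -0.4, 0.9, 2.0]   # illustrative covariant components k_mu
x_up = [0.3, 1.1, -0.7, 0.2]     # illustrative point x^mu

eigenvalues = [momentum_eigenvalue(k_down, x_up, mu) for mu in range(4)]
```

Each entry of `eigenvalues` should agree with ħ times the corresponding component of the wave vector, up to small finite-difference and rounding errors, and independently of the point chosen.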

A less trivial example of a special relativistic tensor is Maxwell’s electromagnetism. Having played with the Maxwell equations, you know why EM waves travel (in vacuum) at the speed of light. You may already know that, decades before Einstein invented special relativity, Maxwell had baked it into the very fabric of his eponymous equations! What you may not know is that the familiar electric and magnetic field strengths are actually not correctly described by vectors, but instead by a two-index covariant antisymmetric tensor. Specifically, in four spacetime dimensions, the gauge field strength components $F_{\mu\nu}$ are built out of

$$
\begin{aligned}
F_{0i} &= +\delta_{ij} E^j \,, \\
F_{ij} &= -\epsilon_{ijk} B^k \,.
\end{aligned}
\tag{1.79}
$$

In this equation, we used the totally antisymmetric permutation symbol in 3 dimensions.

The electromagnetic 4-vector gauge potential $A^\mu$ is built out of the scalar potential and the 3-vector potential, with components

$$ A^0 = \Phi \,, \qquad A^i = (\vec{A})^i \,. \tag{1.80} $$

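It is related to the field strength via the covariant curl,

$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \,. \tag{1.81} $$

This splits up in 3+1 notation as

$$ \vec{B} = \vec{\nabla} \times \vec{A} \,, \qquad \vec{E} = -\vec{\nabla}\Phi - \frac{\partial \vec{A}}{\partial t} \,. \tag{1.82} $$

Note that $A_\mu(x^\lambda)$ is the basic dynamical field of electromagnetism. The field strength $F_{\mu\nu}$ is a derived quantity.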
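To see how the electric and magnetic fields sit inside the antisymmetric field strength, here is a small Python sketch that assembles the covariant components from 3-vectors, using the time-space components equal to the electric field and the space-space components equal to minus the permutation symbol contracted with the magnetic field. The function names and field values are illustrative assumptions, and units with c = 1 are used.

```python
def levi_civita_3(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}; returns -1, 0, or +1."""
    return (i - j) * (j - k) * (k - i) // 2


def field_strength(e_field, b_field):
    """Return the 4x4 array F[mu][nu] of covariant field strength components."""
    f = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        f[0][i + 1] = e_field[i]       # F_{0i} = +E^i
        f[i + 1][0] = -e_field[i]      # antisymmetry: F_{i0} = -F_{0i}
        for j in range(3):
            # F_{ij} = -epsilon_{ijk} B^k
            f[i + 1][j + 1] = -sum(
                levi_civita_3(i, j, k) * b_field[k] for k in range(3)
            )
    return f


E = [1.0, 2.0, 3.0]   # illustrative electric field components
B = [4.0, 5.0, 6.0]   # illustrative magnetic field components
F = field_strength(E, B)
```

The six independent entries of the antisymmetric 4 × 4 array are exactly the three electric and three magnetic components.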

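Using the above definitions, the four Maxwell equations

$$ \vec{\nabla} \times \vec{B} - \frac{\partial \vec{E}}{\partial t} = \vec{J} \,, \qquad \vec{\nabla} \cdot \vec{E} = \rho \,, \tag{1.83} $$

$$ \vec{\nabla} \times \vec{E} + \frac{\partial \vec{B}}{\partial t} = \vec{0} \,, \qquad \vec{\nabla} \cdot \vec{B} = 0 \,, \tag{1.84} $$

neatly collapse into two manifestly relativistic Maxwell equations,

$$
\begin{aligned}
\partial_\mu F^{\mu\nu} &= J^\nu \,, \\
\epsilon^{\mu\nu\lambda\rho} \, \partial_\nu F_{\lambda\rho} &= 0 \,.
\end{aligned}
\tag{1.85}
$$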

Later when we generalize to curved spacetime, the partial derivatives $\partial_\mu$ will be replaced by covariant derivatives $\nabla_\mu$.

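Note that the 4-index permutation pseudotensor $\epsilon_{\mu\nu\lambda\sigma}$ obeys some handy identities, in a Minkowski space generalization of how it worked in Euclidean 3-space. Defining

$$
\delta^{\mu\nu}_{\alpha\beta} =
\begin{vmatrix}
\delta^\mu_\alpha & \delta^\mu_\beta \\
\delta^\nu_\alpha & \delta^\nu_\beta
\end{vmatrix}
\tag{1.86}
$$

and

$$
\delta^{\mu\nu\lambda}_{\alpha\beta\gamma} =
\begin{vmatrix}
\delta^\mu_\alpha & \delta^\mu_\beta & \delta^\mu_\gamma \\
\delta^\nu_\alpha & \delta^\nu_\beta & \delta^\nu_\gamma \\
\delta^\lambda_\alpha & \delta^\lambda_\beta & \delta^\lambda_\gamma
\end{vmatrix}
\tag{1.87}
$$

gives us

$$
\begin{aligned}
\epsilon^{\mu\nu\lambda\sigma} \epsilon_{\mu\beta\gamma\delta} &= -1! \; \delta^{\nu\lambda\sigma}_{\beta\gamma\delta} \,, \\
\epsilon^{\mu\nu\lambda\sigma} \epsilon_{\mu\nu\gamma\delta} &= -2! \; \delta^{\lambda\sigma}_{\gamma\delta} \,, \\
\epsilon^{\mu\nu\lambda\sigma} \epsilon_{\mu\nu\lambda\sigma} &= -4! \,.
\end{aligned}
\tag{1.88}
$$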

The minus signs arise from the fact that the determinant of the Minkowski metric tensor is −1, not +1. If you want the permutation tensor with upstairs indices, you can easily build it by using $\eta^{\mu\nu}$ to raise the indices.
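The origin of the minus sign can be seen in a brute-force check of the full contraction. The sketch below assumes the normalization in which the downstairs symbol with indices 0123 equals +1 (a convention chosen for this example) and raises all four indices with the diagonal Minkowski metric of signature (+, −, −, −).

```python
from itertools import permutations

ETA_DIAG = [1, -1, -1, -1]  # diagonal entries of eta^{mu nu}


def permutation_sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign


# Nonzero components of epsilon with four downstairs indices,
# normalized so that epsilon_{0123} = +1 (an assumed convention).
EPS_DOWN = {perm: permutation_sign(perm) for perm in permutations(range(4))}


def eps_up(indices):
    """Raise all four indices with the diagonal metric."""
    value = EPS_DOWN.get(indices, 0)
    for idx in indices:
        value *= ETA_DIAG[idx]
    return value


# Full contraction epsilon^{mu nu lambda sigma} epsilon_{mu nu lambda sigma}.
full_contraction = sum(eps_up(p) * EPS_DOWN[p] for p in EPS_DOWN)
```

Each of the 24 nonzero components contributes the product of one timelike and three spacelike metric factors, which is the determinant of the metric, so the sum is minus four factorial.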

In the Maxwell equations, the 4-vector current is built out of the charge density and the 3-vector current, with components

$$ J^0 = \rho \,, \qquad J^i = (\vec{j})^i \,. \tag{1.89} $$

The EM current obeys a conservation law,

$$ \partial_\mu J^\mu = 0 \,. \tag{1.90} $$

The relativistic Lorentz force law also simplifies,

$$ m a^\mu = q F^\mu{}_\nu u^\nu \,, \tag{1.91} $$

where $u^\mu$ is the relativistic 4-velocity and $a^\mu$ is the relativistic 4-acceleration. You will work out some aspects of this EM story in your HW1 assignment. In particular, you will be able to compute the effect of a Lorentz boost on the electromagnetic fields $\vec{E}$ and $\vec{B}$, which many of you will not have seen before.

We now turn to the question of what happens for accelerated observers moving with constant relativistic acceleration.

1.6 Constant relativistic acceleration

When I was an undergraduate, a professor introduced the idea of the Twin Paradox to us. Could the space traveller twin really live longer by travelling at relativistic speeds? The maddening thing was that he never equipped us with the technology to answer the question! Here is how you can solve that without having to resort to General Relativity: we will only use what we know about Special Relativity to solve this problem.

We all know that time dilation lengthens time intervals as compared to what is measured in the rest frame. We also know that each observer sees the other person’s clock as running slow. So why is there even a difference between what the space twin sees and what the homebody twin sees? Acceleration. The space twin must accelerate in order to turn around and come back to Earth, before they can compare clocks with the homebody twin.

We will handle the case of constant acceleration by making a series of instantaneous Lorentz transformations along the accelerated path of the space twin. At each instant, we approximate the path by an inertial frame of reference moving at that particular velocity at that moment. This is doable using our current technology when we have a constant relativistic acceleration.

But what do we mean by constant acceleration, exactly? It is possible to get yourself quite knotted up trying to answer this question by fiddling with 4-acceleration and its 3-vector components, along with time dilation and length contraction factors. The way out of the thicket is to focus on the idea of constant acceleration as rapidity increasing linearly with proper time. This is the physically key observation, which arises from the mathematical fact that it is the rapidity which is additive under two successive Lorentz transformations. For an infinitesimal addition to rapidity, then, dζ = g dτ, which can be integrated up to

$$ \zeta = g\tau \,. \tag{1.92} $$

We have dropped the constant of integration by picking ζ = 0 when τ = 0.

Now we would like to compute the distance in the homebody twin’s frame moved by the astronaut twin during homebody time dt. This is simply given by the speed

$$ dx = v \, dt = \tanh\zeta \, dt \,. \tag{1.93} $$

Next, to convert to astronaut time, we use the time dilation formula from the Lorentz boost matrix,

$$ dt = \cosh\zeta \, d\tau \,. \tag{1.94} $$

This implies that

$$ dx = \sinh\zeta \, d\tau \,. \tag{1.95} $$

Notice that since we know ζ(τ) we can integrate this equation. Assuming that x(τ = 0) = 0, the position of the space twin in homebody coordinates is

$$ x(\tau) = \frac{1}{g} \left[ \cosh(g\tau) - 1 \right] . \tag{1.96} $$

The time for the space twin in homebody coordinates integrates to

$$ t(\tau) = \frac{1}{g} \sinh(g\tau) \,. \tag{1.97} $$

Using these equations, you can figure out the physical effect of acceleration on the ageing process. You will find out that acceleration serves to enhance the familiar constant-speed time dilation effect, rather than reducing it. This is because the free particle trajectory actually maximizes proper time elapsed during motion; any acceleration applied reduces it. We will be able to see this later on when we study geodesics.

Using the equations above, we can see that the trajectory of the space twin satisfies

$$ \left[ x(\tau) + \frac{1}{g} \right]^2 - [t(\tau)]^2 = \frac{1}{g^2} \,. \tag{1.98} $$

As you can see by inspection, this is a hyperbola. The asymptotes of the hyperbola correspond to Rindler horizons. We can see why they are horizons by recalling that light rays move at 45 degrees. Observers going at higher and higher accelerations hug the asymptotes more and more tightly, but they still cannot ‘see’ beyond the Rindler horizon.
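The trajectory formulas are easy to explore numerically. The sketch below works in units with c = 1, so that g has units of inverse time; as a rough guide, an acceleration of 9.8 m/s² corresponds to about one inverse year in these units. The function names and the sample values of g and τ are illustrative assumptions.

```python
import math


def space_twin_position(g, tau):
    """Homebody-frame position x(tau) = (cosh(g tau) - 1) / g, with c = 1."""
    return (math.cosh(g * tau) - 1.0) / g


def space_twin_time(g, tau):
    """Homebody-frame time t(tau) = sinh(g tau) / g, with c = 1."""
    return math.sinh(g * tau) / g


g = 1.0     # illustrative acceleration in units with c = 1
tau = 2.0   # illustrative proper time elapsed on the astronaut's clock

x = space_twin_position(g, tau)
t = space_twin_time(g, tau)

# Left-hand side of the hyperbola equation; it should equal 1 / g**2.
hyperbola_lhs = (x + 1.0 / g) ** 2 - t ** 2

# Speed seen from the homebody frame, v = tanh(g tau); always below 1.
speed = math.tanh(g * tau)
```

For any positive proper time, the homebody time t exceeds τ, and the gap grows exponentially once gτ is larger than about one, while the speed approaches but never reaches the speed of light.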

In fact, the physics is even more interesting than this. The accelerated space twin not only finds that there are parts of spacetime that they cannot reach, but also that the physics of quantum fields differs from that in the homebody twin’s frame. The Minkowski vacuum, seen in the reference frame of the space twin with constant acceleration, turns out to have a thermal spectrum at the Rindler temperature. Including the factor of c, the formula for this reads

$$ T = \frac{\hbar g}{2\pi c \, k_B} \,. \tag{1.99} $$

The greater the acceleration, the higher the temperature that your detector will feel. This phenomenon of acceleration radiation is known as the Unruh effect. For those who are interested, the physics of particle detectors in GR is explained nicely in the advanced GR textbook by Birrell and Davies on quantum field theory in curved spacetime.
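To get a feel for the size of the effect, here is a short Python sketch that evaluates the Rindler temperature in SI units. The constants are the exact SI values for the speed of light and the Boltzmann constant and the SI value of the reduced Planck constant; the function name and the choice of Earth's standard gravity as an example are assumptions made for illustration.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 299792458.0          # speed of light, m / s
K_B = 1.380649e-23       # Boltzmann constant, J / K


def unruh_temperature(g):
    """Rindler temperature T = hbar g / (2 pi c k_B) for acceleration g in m/s^2."""
    return HBAR * g / (2.0 * math.pi * C * K_B)


# Temperature for an observer accelerating at Earth's standard gravity.
t_earth = unruh_temperature(9.80665)

# Acceleration needed to reach a temperature of one kelvin.
g_for_one_kelvin = 2.0 * math.pi * C * K_B / HBAR
```

For everyday accelerations the temperature is of order 10⁻²⁰ K, which gives a sense of why the effect is so hard to observe directly.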

2 Curved spacetime and tensors

2.1 The Equivalence Principle

Einstein became famous for several different accomplishments. One which is legendary among theoretical physicists is the concept of the Gedankenexperiment (German for thought experiment). It allows us to work out all sorts of imaginative ideas without having to actually spend any money. So imagine, if you will, that you are an astronaut on the space station. Imagine that you are blindfolded and kidnapped and then one of two things happens to you. Either you feel the acceleration due to gravity or you take a ride in a rocket ship capable of that same acceleration. How would you tell the difference?

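The gravitational force from a body of mass M on a test mass is

$$ \vec{F}_{\rm grav} = -\frac{G_N M m_g}{r^2} \, \hat{r} \,, \tag{2.1} $$

where $m_g$ is the gravitational mass of the test body and $G_N$ is the Newton constant. Using Newton’s second law, $\vec{F} = m_i \vec{a}$, where $m_i$ is the inertial mass of the test body, we have

$$ \vec{a}_{\rm grav} = -\left[ \frac{m_g}{m_i} \right] \frac{G_N M \, \hat{r}}{r^2} \,. \tag{2.2} $$

If $m_i = m_g$, then this acceleration $\vec{a}_{\rm grav}$ does not depend on the properties of the test mass feeling the gravitational force.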

The universality of gravitation was first put forward by Newton, centuries before Einstein. Others hypothesized that the acceleration due to gravity should be universal, not depending on the composition of the falling object. This idea has since been tested exquisitely well. It implies that an object’s inertial mass (what makes you hard to move in the morning) is equal to its gravitational mass (what responds to gravity), and is known as the Weak Equivalence Principle.

When Einstein formulated his theory of General Relativity (GR) he decided to bake the equivalence principle into the very fabric of spacetime. In GR, there is no local experiment you can do to tell the difference between acceleration due to rockets and acceleration due to gravity. This is known as the Einstein Equivalence Principle.

The really cool thing about the equivalence principle? It implies that every reference frame, including accelerating ones, can be instantaneously approximated by a Lorentz frame. This might seem like mathematical nitpicking, but it is actually a key physics insight, as it implies that locally in spacetime, everything is just Special Relativity. What makes General Relativity interesting and nontrivial is the story of how those individual infinitesimal neighbourhoods are sewn together into the fabric of curved spacetime.

It is important to note that this equivalence between gravity and acceleration holds only in an infinitesimal patch about a point. If we have access to a finite sized patch of spacetime, we can distinguish gravity from acceleration by measuring tidal forces. We will develop that story later on when we get to the Raychaudhuri equation for geodesic deviation.

Consider a photon in Earth’s gravitational field. If it gets aimed upwards, then after a time interval dt, what is the effect? Well, photons cannot change their speed, as they always go at c. What can change for a photon is its energy (or equivalently the magnitude of its momentum, because the photon mass shell relation is $E = |\vec{p}|$). It can also change its heading. When a photon moves upwards in a gravitational field, it gains gravitational potential energy, so it must lose kinetic energy (conservation of the total energy is valid near Earth, because there is a time translation symmetry). The photon should therefore suffer a redshift in going upwards. This phenomenon is known as gravitational redshift, and it implies that clocks run slower when they are deeper in a gravitational field. Black holes take this to an extreme, as we will see much later in the course.

Did you know that GPS devices rely on both Special and General Relativity to locate you accurately? They need to take account of the fact that the GPS transmitter satellites are (a) travelling at a measurable fraction of the speed of light, requiring Special Relativistic Doppler corrections, and (b) higher up in Earth’s gravitational field than we are, necessitating General Relativistic corrections. Without those corrections together, you would probably be kilometres off your intended position after a day’s canoeing. So GR does actually touch your life in a measurable way, if you ever use a GPS unit, say in your smartphone.
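A back-of-the-envelope estimate shows where "kilometres per day" comes from. The sketch below is a rough leading-order calculation, not a model of how GPS is actually engineered: it assumes a circular orbit of radius about 26,560 km, a mean Earth radius of 6371 km, the standard value of Earth's gravitational parameter GM, and it ignores Earth's rotation, orbital eccentricity, and higher-order effects. All names are illustrative.

```python
import math

C = 299792458.0              # speed of light, m / s
GM_EARTH = 3.986004418e14    # Earth's gravitational parameter, m^3 / s^2
R_EARTH = 6.371e6            # mean Earth radius, m (assumed receiver location)
R_ORBIT = 2.656e7            # approximate GPS orbital radius, m (assumed circular)
SECONDS_PER_DAY = 86400.0

# Special relativity: a moving satellite clock runs slow by about v^2 / (2 c^2).
orbital_speed = math.sqrt(GM_EARTH / R_ORBIT)
sr_rate = orbital_speed ** 2 / (2.0 * C ** 2)

# General relativity: a clock higher in the potential runs fast by about
# (GM / c^2) * (1 / R_EARTH - 1 / R_ORBIT) relative to one on the ground.
gr_rate = (GM_EARTH / C ** 2) * (1.0 / R_EARTH - 1.0 / R_ORBIT)

sr_drift_per_day = sr_rate * SECONDS_PER_DAY    # seconds lost per day
gr_drift_per_day = gr_rate * SECONDS_PER_DAY    # seconds gained per day
net_drift_per_day = gr_drift_per_day - sr_drift_per_day

# A timing error dt translates into a ranging error of roughly c * dt.
range_error_km = C * net_drift_per_day / 1000.0
```

With these inputs, the velocity effect is a few microseconds per day and the gravitational effect is several times larger with the opposite sign, so the net uncorrected drift corresponds to a ranging error of order ten kilometres per day.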

2.2 Spacetime as a curved manifold

Newton conceptualized gravity via forces that act at a distance instantaneously. This instantaneous propagation of gravitational effects is in direct contradiction to the relativistic principle that the speed of light is the upper speed limit for everyone. In Einstein’s GR, the speed of propagation of gravitational disturbances is tied to be exactly equal to the speed of light in vacuum. The formalism of GR is designed to express all the effects of gravity in a relativistic way, like gravitational redshift, via geometrical properties of the fabric of spacetime. The mathematical name for the type of geometry used is (pseudo)Riemannian geometry.

A (p + q)-dimensional manifold with signature (p, q) is a spacetime that locally looks like a patch of $\mathbb{R}^{p,q}$. For example, with three spatial dimensions this would be $\mathbb{R}^{1,3}$. The manifold is the collection (union) of these patches, known as coordinate charts, along with the transition functions that teach you how to sew the patches together. The manifold needs to be continuous, and in order for us to compute sensible physical quantities it should also be differentiable. The mathematical concept of the coordinate chart is equivalent to the usual physics idea of a coordinate system or reference frame.

As an example of how you might need more than one coordinate chart to cover a manifold, consider a circle $S^1$. Each coordinate chart must be an open set of $\mathbb{R}$ (emphasis on open). So the minimum number of coordinate charts required to cover the 1-sphere is two.

For the 2-sphere $S^2$, you need a projection to get $S^2$ onto $\mathbb{R}^2$, or a patch thereof like a map. The most commonly used projection is the Mercator projection, which preserves angles rather than area. It is possible to use a different projection that preserves area, such as the Peters projection. However, the price of maintaining areas on the map is that angles are not preserved: countries look funny shaped compared to their Mercator cousins. Because the sphere is curved and the plane is not, you cannot create a map that preserves both angles and areas. The reason why the Mercator projection has been so dominant is a technical one: because it preserves angles, it is optimal for navigation of marine vessels and aeroplanes. But it massively overstates the size of countries closer to the poles. In particular, Western Europe looks more important on Mercator maps than it should, while Africa and Brazil look much smaller. Colonialism also had a role in the dominance of the Mercator projection.

Examples of manifolds include Minkowski space, the sphere, the torus, and 2D Riemann surfaces with arbitrary genus. What about spaces that are not manifolds? Any intersection of lines with k-planes will do. A cone is an example of a non-differentiable manifold, because of what happens at its apex. Some manifolds have a boundary, for instance a line segment. Some manifolds have no boundary.

General Relativity treats the fabric of spacetime as a differentiable manifold. Note that it is also possible to handle discontinuities in the spacetime metric in some situations in GR, but only if the appropriate source of energy-momentum is available at the discontinuity to enforce consistency with Einstein’s equations. The formalism for handling this non-differentiable case is known as the Israel junction conditions, and its equations are derived by integrating Einstein’s equations across discontinuities in suitably covariant ways. This works a lot like deriving equations for shock waves in fluid mechanics.

Spacetime being a differentiable manifold is not enough structure to describe gravity as we see it in experiments. The geometry should be suitably constrained by some physical equations, which should, by the Correspondence Principle, reduce to Newtonian mechanics in the limit of small speeds and weak gravity. Our spacetime manifolds will satisfy the Einstein equations.

How do vectors and tensors work when spacetime is curved? We will have to be more careful than before, and the signature difference is that the matrices showing us how to transform between different coordinate systems are no longer constant matrices. Suppose that we have coordinates $x^\mu$ on our manifold and that we consider an arbitrary function f of these coordinates. Then the directional derivative of f along a curve parametrized by λ is

$$ \frac{df}{d\lambda} = \frac{\partial f}{\partial x^\mu} \frac{dx^\mu}{d\lambda} \,, \tag{2.3} $$

so that we can write

$$ \frac{d}{d\lambda} = \frac{dx^\mu}{d\lambda} \, \partial_\mu \,. \tag{2.4} $$

In other words,

$$ e_{(\mu)} = \partial_\mu \tag{2.5} $$

is a set of basis vectors. Note the parentheses around the lower index, which we use to distinguish basis vectors from other common-or-garden vectors. This story goes deeper: the tangent space $T_p(M)$ at a point p of a manifold M is isomorphic to the space of directional derivative operators on curves through p. It is a vector space, and the Leibniz rule is obeyed. Vector fields can then be defined on M. An example of a vector field would be the wind direction at the surface of the Earth. (It must have at least one zero because it lives on $S^2$.) Take a look at https://earth.nullschool.net/ for a very beautiful interactive visualization of winds on Earth.

Note that the above derivative basis vectors naturally carry a lower (covariant-type) index. What about a basis for the dual vectors living in the cotangent space $T^*_p(M)$? There is a very natural dual, the differentials:

$$ \theta^{(\mu)} = dx^\mu \,. \tag{2.6} $$

These rank (0, 1) and (1, 0) basis tensors (the coordinate basis) have a natural inner product,

$$ (\partial_\nu)(dx^\mu) = \frac{\partial x^\mu}{\partial x^\nu} = \delta^\mu_\nu \,. \tag{2.7} $$

In order to measure distances and angles on the spacetime manifold, we need a metric tensor. This is denoted by $g_{\mu\nu}$ and is a generalization of the flat Minkowski metric $\eta_{\mu\nu}$. It gives rise to the general relativistic line element

$$ ds^2 = g_{\mu\nu} \, dx^\mu dx^\nu \,, \tag{2.8} $$

which is invariant under arbitrary coordinate changes which are invertible and $C^\infty$. These are known as diffeomorphisms. The inverse metric is denoted as $g^{\mu\nu}$ and it obeys

$$ g^{\mu\nu} g_{\nu\lambda} = \delta^\mu_\lambda \,. \tag{2.9} $$

The metric and its inverse are used all the time in GR, for raising and lowering indices on tensors.

Flat, boring Minkowski spacetime $\mathbb{R}^{1,d}$ written in spherical polar coordinates is not a curved spacetime, but it does have tensor transformation laws that depend on spacetime position. As an exercise to test your understanding, you should check explicitly that (for $d = 3$) the line element is

\begin{equation}
ds^2 = dt^2 - dr^2 - r^2 d\theta^2 - r^2\sin^2\theta\, d\phi^2 , \tag{2.10}
\end{equation}

where the spherical polar spatial coordinates $r, \theta, \phi$ are related to the Cartesian spatial coordinates $x^1, x^2, x^3$ by

\begin{align}
x^1 &= r\cos\theta , \tag{2.11}\\
x^2 &= r\sin\theta\cos\phi , \tag{2.12}\\
x^3 &= r\sin\theta\sin\phi . \tag{2.13}
\end{align}

Check also that the spatial volume element is

\begin{equation}
r^2\sin\theta\, dr\, d\theta\, d\phi . \tag{2.14}
\end{equation}
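The exercise above can also be checked numerically. The following is only a sketch under stated assumptions: it builds the Jacobian of (2.11)-(2.13) by central finite differences and pulls the Minkowski metric back to polar coordinates with the rank (0,2) transformation law. The function names (`cartesian`, `jacobian`, `polar_metric`), the sample point, and the step size `h` are illustrative choices, not part of the notes.

```python
import math

def cartesian(t, r, theta, phi):
    # Eqs. (2.11)-(2.13); note that the polar axis is x^1 in these notes.
    return (t,
            r * math.cos(theta),
            r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi))

ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of the mostly-minus Minkowski metric

def jacobian(point, h=1e-6):
    """J[a][m] = d(Cartesian x^a)/d(polar x^m), by central differences."""
    J = [[0.0] * 4 for _ in range(4)]
    for m in range(4):
        up = list(point); up[m] += h
        dn = list(point); dn[m] -= h
        xu, xd = cartesian(*up), cartesian(*dn)
        for a in range(4):
            J[a][m] = (xu[a] - xd[a]) / (2 * h)
    return J

def polar_metric(point):
    """g_{mn} = eta_{ab} J^a_m J^b_n: the rank (0,2) transformation law."""
    J = jacobian(point)
    return [[sum(ETA[a] * J[a][m] * J[a][n] for a in range(4))
             for n in range(4)] for m in range(4)]

point = (0.3, 2.0, 0.7, 1.1)          # arbitrary (t, r, theta, phi)
g = polar_metric(point)
r, theta = point[1], point[2]
expected = [1.0, -1.0, -r ** 2, -(r * math.sin(theta)) ** 2]   # diagonal of (2.10)
for m in range(4):
    print(m, g[m][m], expected[m])
```

The off-diagonal entries of `g` come out at the level of finite-difference noise, and the square root of minus the determinant of the spatial block reproduces the factor $r^2\sin\theta$ in (2.14).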

2.3 Tensors in curved spacetime

Tensors work in curved spacetime a lot like they do in flat spacetime. The most important physical difference is that under a change of reference frame represented by

\begin{equation}
\Lambda^{\mu'}{}_{\nu} \equiv \frac{\partial x^{\mu'}}{\partial x^\nu} \tag{2.15}
\end{equation}

the new coordinates are related to the old ones by coordinate-dependent factors, rather than simple constants (like $\cos\theta$ or $\sinh\zeta$). Our central physics strategy will be to remain focused on the transformation properties of our tensors of interest. That is the essence of what a tensor does: it transforms in very specific, well-defined ways when the coordinate system does.

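We introduced basis tensors a little earlier. Our coordinate bases for dual vectors ($\theta^{(\mu)} = dx^\mu$) and vectors ($e_{(\nu)} = \partial_\nu$) obeyed a natural inner product

\begin{equation}
\theta^{(\mu)}\cdot e_{(\nu)} = \delta^\mu{}_\nu . \tag{2.16}
\end{equation}

A dual vector $\omega$ can be written in components

\begin{equation}
\omega = \omega_\mu\,\theta^{(\mu)} . \tag{2.17}
\end{equation}

Dual vectors are also known as one-forms. Under coordinate transformations, components transform as

\begin{equation}
\omega_{\mu'} = \Lambda^\nu{}_{\mu'}\,\omega_\nu \tag{2.18}
\end{equation}

where the matrices

\begin{equation}
\Lambda^{\mu'}{}_\nu = \frac{\partial x^{\mu'}}{\partial x^\nu} \quad\text{and}\quad \Lambda^\nu{}_{\mu'} = \frac{\partial x^\nu}{\partial x^{\mu'}} \tag{2.19}
\end{equation}

satisfy

\begin{equation}
\Lambda^\nu{}_{\mu'}\Lambda^{\mu'}{}_\lambda = \delta^\nu{}_\lambda , \quad\text{and}\quad \Lambda^{\mu'}{}_\nu\Lambda^\nu{}_{\lambda'} = \delta^{\mu'}{}_{\lambda'} . \tag{2.20}
\end{equation}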

Dual vectors live in the cotangent space $T^*_p$, which is a vector space. (The collection of all cotangent spaces over $M$ is known as the cotangent bundle.) They are linear maps $\omega : T_p \to \mathbb{R}$, and the pairing $\omega(V)$ is linear in both $\omega$ and $V$,

\begin{align}
(a\omega_1 + b\omega_2)(V) &= a\,\omega_1(V) + b\,\omega_2(V) , \tag{2.21}\\
\omega(aV_1 + bV_2) &= a\,\omega(V_1) + b\,\omega(V_2) . \tag{2.22}
\end{align}

Vectors live in the tangent space $T_p$ and obey analogous rules. (Their bundle is known as the tangent bundle.)

Note that the components of the gradient dual vector $\partial_\mu$ acting on some function $\phi$ are sometimes denoted $\phi_{,\mu}$. Covariant derivatives, which we will develop shortly, are denoted in the same convention by $\phi_{;\mu}$. Personally, I find it far too easy to drop a comma or a semicolon (or a dot!) in handwritten notes, so I prefer not to use these conventions, but instead keep the $\partial_\mu$ and so forth explicit.

For any vector $V$ and dual vector $\omega$, the scalar is

\begin{equation}
\omega(V) = V(\omega) = V^\mu\omega_\mu . \tag{2.23}
\end{equation}

Notice that a vector cannot be turned into a scalar without the help of either (a) something with a downstairs index (e.g. $\omega$) or (b) a metric tensor. The object with components $g_{\mu\nu}V^\nu$ is a dual vector; that with components $g^{\mu\nu}\omega_\nu$ is a vector.

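A rank $(m,n)$ tensor in curved spacetime is defined, as before, as a multilinear map from a collection of $m$ dual vectors and $n$ vectors to $\mathbb{R}$. Its components in a coordinate basis can be extracted from $T$ by slotting in the right number of dual vector and vector basis elements,

\begin{equation}
T^{\mu_1\ldots\mu_m}{}_{\nu_1\ldots\nu_n} = T(dx^{\mu_1},\ldots,dx^{\mu_m},\partial_{\nu_1},\ldots,\partial_{\nu_n}) . \tag{2.24}
\end{equation}

Alternatively, it can be written in terms of basis tensors as

\begin{equation}
T = T^{\mu_1\ldots\mu_m}{}_{\nu_1\ldots\nu_n}\; e_{(\mu_1)}\otimes\cdots\otimes e_{(\mu_m)}\otimes\theta^{(\nu_1)}\otimes\cdots\otimes\theta^{(\nu_n)} . \tag{2.25}
\end{equation}

The coordinate transformation law for rank $(m,n)$ tensors is

\begin{equation}
T^{\mu_1'\ldots\mu_m'}{}_{\nu_1'\ldots\nu_n'} = \frac{\partial x^{\mu_1'}}{\partial x^{\lambda_1}}\cdots\frac{\partial x^{\mu_m'}}{\partial x^{\lambda_m}}\,\frac{\partial x^{\sigma_1}}{\partial x^{\nu_1'}}\cdots\frac{\partial x^{\sigma_n}}{\partial x^{\nu_n'}}\; T^{\lambda_1\ldots\lambda_m}{}_{\sigma_1\ldots\sigma_n} , \tag{2.26}
\end{equation}

where now the Jacobian of the coordinate transformation typically depends on spacetime position.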

2.4 Rules for tensor index gymnastics

There are very specific rules for manipulating tensors. We already met one of them: the Einstein summation convention. In curved spacetime it works exactly the same way as in flat spacetime: repeated indices are summed over. But let us also make explicit some other specific tensor manipulation rules.

First and foremost among them is the fact that when you write a tensor equation, indices on the LHS and RHS must be exactly matched. For example, $p^\mu = m u^\mu$ is a sensible tensor equation (it has one upstairs index on both sides) while the erroneous $p_\mu = m u^\mu$ is not (on the LHS the $\mu$ index is downstairs while on the RHS it is upstairs).

Second, vertical moves of tensor indices, up or down, can only be made by lowering or raising them with the rank (0,2) metric tensor or its rank (2,0) inverse. Said less pedantically, we raise or lower indices using the metric. For example,

\begin{equation}
T_\mu{}^\nu{}_{\lambda\sigma} = g_{\mu\rho}\,T^{\rho\nu}{}_{\lambda\sigma} , \tag{2.27}
\end{equation}

and similarly for other raised/lowered components: you use as many factors of the metric/inverse metric as needed (with appropriate contractions) to lower/raise all the requisite indices.

Third, we must always preserve the horizontal ordering of the indices when calculating, for both upstairs and downstairs indices. For example, for a general rank (2,2) tensor,

\begin{equation}
T^{\mu\nu}{}_{\lambda\sigma} \neq T^{\nu\mu}{}_{\lambda\sigma} . \tag{2.28}
\end{equation}

(The RHS has the $\mu$ and $\nu$ indices switched compared to the LHS.) Other horizontal switches of indices are equally verboten, unless you know that the tensor has appropriate symmetry properties. The only standard exception to the rule that horizontal index ordering matters is the Kronecker $\delta^\alpha_\beta$ tensor, which is symmetrical by definition.

Let us now make a few remarks about symmetries among tensors. Tensors can have symmetries on their indices, which reduce the number of independent components, but this is not generic. For example, under interchange of its indices a two-index tensor might be symmetric

\begin{equation}
S_{\mu\nu} = +S_{\nu\mu} , \tag{2.29}
\end{equation}

or antisymmetric

\begin{equation}
A_{\mu\nu} = -A_{\nu\mu} . \tag{2.30}
\end{equation}

For rank two tensors only, an arbitrary tensor $T$ can actually be written as the sum of a symmetric tensor $S$ and an antisymmetric tensor $A$. In components,

\begin{equation}
T_{\mu\nu} = S_{\mu\nu} + A_{\mu\nu} , \tag{2.31}
\end{equation}

where

\begin{equation}
S_{\mu\nu} = \frac{1}{2!}\left(T_{\mu\nu} + T_{\nu\mu}\right) , \qquad A_{\mu\nu} = \frac{1}{2!}\left(T_{\mu\nu} - T_{\nu\mu}\right) . \tag{2.32}
\end{equation}

This works because the total number of independent components of a 2-index tensor is $D\times D = D^2$, while a symmetric 2-index tensor has $D(D+1)/2$ components and an antisymmetric 2-index tensor has $D(D-1)/2$, so that $D(D+1)/2 + D(D-1)/2 = D^2$. For larger rank, such a split cannot be done, because totally symmetric and totally antisymmetric tensors do not have enough independent components between them to cover the total number.
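As a concrete instance of this counting, take $D = 4$. The rank-3 numbers below use the standard binomial counts for totally symmetric and totally antisymmetric tensors, $\binom{D+2}{3}$ and $\binom{D}{3}$, which are not derived in these notes and are quoted here as an assumption.

```latex
\begin{align*}
\text{rank 2:}\quad & \underbrace{\tfrac{4\cdot 5}{2}}_{10\ \text{symmetric}}
  + \underbrace{\tfrac{4\cdot 3}{2}}_{6\ \text{antisymmetric}} = 16 = 4^2 ,\\
\text{rank 3:}\quad & \underbrace{\tbinom{6}{3}}_{20\ \text{totally symmetric}}
  + \underbrace{\tbinom{4}{3}}_{4\ \text{totally antisymmetric}} = 24 < 64 = 4^3 .
\end{align*}
```

The remaining 40 components of a generic rank-3 tensor in four dimensions have mixed symmetry, which is why the two-term split stops working beyond rank two.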

Any tensor can be symmetrized on any number $k$ of upper or lower indices. For symmetrization, we have

\begin{equation}
T_{(\mu_1\ldots\mu_k)} = \frac{1}{k!}\left(T_{\mu_1\mu_2\ldots\mu_k} + \text{sum over permutations of } (\mu_1\ldots\mu_k)\right) , \tag{2.33}
\end{equation}

while for antisymmetrization we have

\begin{equation}
T_{[\mu_1\ldots\mu_k]} = \frac{1}{k!}\left(T_{\mu_1\mu_2\ldots\mu_k} + \text{alternating sum over permutations of } (\mu_1\ldots\mu_k)\right) , \tag{2.34}
\end{equation}

where the alternating sum counts even permutations with a $+$ sign and odd ones with a $-$ sign. Note how the round parentheses denote symmetrization and the square brackets antisymmetrization.
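Formulas (2.33) and (2.34) translate directly into a sum over permutations. The sketch below is illustrative only: it stores a tensor as a dictionary from index tuples to numbers, and the names `parity` and `project`, the dimension `D = 3`, and the sample components are all hypothetical choices made for the example.

```python
from itertools import permutations, product
from math import factorial

def parity(perm):
    """Return +1 for an even permutation of 0..k-1 and -1 for an odd one."""
    sign, seen = 1, [False] * len(perm)
    for start in range(len(perm)):
        if seen[start]:
            continue
        length, j = 0, start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        if length % 2 == 0:      # a cycle of even length is an odd permutation
            sign = -sign
    return sign

def project(T, D, k, antisymmetric=False):
    """Apply (2.33), or (2.34) if antisymmetric=True, to a rank-k tensor."""
    out = {}
    for idx in product(range(D), repeat=k):
        total = 0.0
        for perm in permutations(range(k)):
            weight = parity(perm) if antisymmetric else 1
            total += weight * T[tuple(idx[p] for p in perm)]
        out[idx] = total / factorial(k)
    return out

D = 3
# Rank 2: the symmetric and antisymmetric parts add back up to T, as in (2.31).
T2 = {idx: float(10 * idx[0] + idx[1] ** 2) for idx in product(range(D), repeat=2)}
S, A = project(T2, D, 2), project(T2, D, 2, antisymmetric=True)
rank2_gap = max(abs(S[i] + A[i] - T2[i]) for i in T2)

# Rank 3: the totally symmetric and totally antisymmetric parts do not suffice.
T3 = {idx: float(idx[0] + 2 * idx[1] ** 2 + 3 * idx[2] ** 3)
      for idx in product(range(D), repeat=3)}
S3, A3 = project(T3, D, 3), project(T3, D, 3, antisymmetric=True)
rank3_gap = max(abs(S3[i] + A3[i] - T3[i]) for i in T3)

print(rank2_gap, rank3_gap)
```

The rank-2 gap vanishes while the rank-3 gap does not, which is the component-counting statement made just before (2.33) seen numerically.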

Suppose you know that a tensor is symmetric with all indices down. How do you work out the symmetry of its counterpart with some or all of its indices up? By using known symmetry properties of the downstairs components and using the metric tensor to raise indices. Remember that the metric tensor itself is symmetric under interchange of its two indices, and so is its inverse. Note also that the contraction of the metric tensor with its inverse is

\begin{equation}
g^{\mu\nu}g_{\mu\nu} = g^\mu{}_\mu = \delta^\mu{}_\mu = 1 + \ldots + 1 = D . \tag{2.35}
\end{equation}

3 The covariant derivative

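Because coordinate change matrices generically depend on spacetime position, simple partial derivatives of tensors are typically not themselves tensors. For example, the partial derivative of a covariant vector $W$, $\partial_\mu W_\nu$, changes under coordinate changes as

\begin{align}
\frac{\partial}{\partial x^\mu}W_\nu \longrightarrow \frac{\partial}{\partial x^{\mu'}}W_{\nu'}
&= \frac{\partial x^\mu}{\partial x^{\mu'}}\frac{\partial}{\partial x^\mu}\left(\frac{\partial x^\nu}{\partial x^{\nu'}}W_\nu\right) \notag\\
&= \frac{\partial x^\mu}{\partial x^{\mu'}}\frac{\partial x^\nu}{\partial x^{\nu'}}\frac{\partial}{\partial x^\mu}W_\nu
 + W_\nu\,\frac{\partial x^\mu}{\partial x^{\mu'}}\frac{\partial}{\partial x^\mu}\left(\frac{\partial x^\nu}{\partial x^{\nu'}}\right) . \tag{3.1}
\end{align}

Although the first term looks good for tensoriality, we see that the second term ruins the fun for generic changes of coordinates.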

At any particular point $p$, we can choose a reference frame (denoted here by hats) in which the first derivatives of the metric vanish at that point, $\partial_{\hat\sigma}g_{\hat\mu\hat\nu}|_p = 0$. The way to see this mathematically is to use (a) Taylor expansions around a particular point and (b) the tensorial transformation property of the two-index metric tensor $g_{\mu\nu}$. However, this cannot be made to work beyond first order in derivatives, because the coordinate transformation does not have enough free components to cancel all the second derivatives of the metric. Physically, this means that we will need extra structure on our spacetime manifold in order to be able to define covariant derivatives that transform like tensors.

The structure we need is known as an affine connection. It will enable us to make covariant versions of partial derivatives $\partial_\mu$, denoted $\nabla_\mu$, designed to transform like tensors. For taking covariant derivatives of tensorial indices relevant to bosonic fields, we will use the Levi-Civita connection or Christoffel symbols $\Gamma^\mu{}_{\nu\lambda}$. For taking covariant derivatives of spinorial indices relevant to fermionic fields, a researcher would use a different beast known as the spin connection $\omega_\mu{}^{ab}$ (see Appendix for a brief summary). We will work on manifolds without torsion, and for this case knowing the metric tensor is sufficient to determine both connections.

3.1 The Christoffel symbols

Previously, we noticed that taking the partial derivative of a tensor does not give another tensor, generically. The problem was that the coordinate transformation generally depends on spacetime position. Let us delineate the properties that a covariant derivative $\nabla$ should have. It should be linear,

\begin{equation}
\nabla(T + S) = \nabla T + \nabla S , \tag{3.2}
\end{equation}

and it should obey the Leibniz rule

\begin{equation}
\nabla(T\otimes S) = (\nabla T)\otimes S + T\otimes(\nabla S) . \tag{3.3}
\end{equation}

It should also commute with contractions, which is tantamount to assuming that

\begin{equation}
\nabla_\sigma g_{\mu\nu} = 0 , \tag{3.4}
\end{equation}

a very reasonable assumption. The covariant derivative should reduce to the partial derivative when operated upon scalars, because those tensors have no legs.

The combination of the first two properties above implies that $\nabla$ can be written as the sum of the partial derivative $\partial$ and a linear transformation, which you can think of as a ‘correction’ to keep the derivative tensorial. The coefficients of this correction term are known as the connection coefficients or the Christoffel symbols $\Gamma^\mu{}_{\alpha\beta}$ (no, this is not pronounced “Christawful”!). I have no intention of making you suffer through all the steps required to construct this beast from the ground up. Instead, I will offer the formulæ for you to work with, and motivate why they make sense physically.

Without further ado, here is the formula for the components of the covariant derivative of a vector $V^\mu$,

\begin{equation}
\nabla_\sigma V^\mu = \partial_\sigma V^\mu + \Gamma^\mu{}_{\sigma\lambda}V^\lambda . \tag{3.5}
\end{equation}

Note that neither of the two terms appearing on the RHS of this equation is separately a tensor, but as a combination they do make a tensor. For a dual vector $W_\mu$, the covariant derivative acts differently; in components,

\begin{equation}
\nabla_\sigma W_\mu = \partial_\sigma W_\mu - \Gamma^\lambda{}_{\sigma\mu}W_\lambda . \tag{3.6}
\end{equation}

(If you want a bit more detail about why it should be the same $\Gamma$ appearing in upper and lower covariant derivatives, see Carroll §3.2 on pages 96-97.)

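If you want to take the covariant derivative of a rank $(m,n)$ tensor, then you just act on each of its legs in turn with the connection,

\begin{align}
\nabla_\sigma T^{\mu_1\ldots\mu_m}{}_{\nu_1\ldots\nu_n} = {}& \partial_\sigma T^{\mu_1\ldots\mu_m}{}_{\nu_1\ldots\nu_n} \notag\\
&+ \Gamma^{\mu_1}{}_{\sigma\lambda}T^{\lambda\mu_2\ldots\mu_m}{}_{\nu_1\ldots\nu_n} + \Gamma^{\mu_2}{}_{\sigma\lambda}T^{\mu_1\lambda\mu_3\ldots\mu_m}{}_{\nu_1\ldots\nu_n} + \ldots \notag\\
&- \Gamma^{\lambda}{}_{\sigma\nu_1}T^{\mu_1\ldots\mu_m}{}_{\lambda\nu_2\ldots\nu_n} - \Gamma^{\lambda}{}_{\sigma\nu_2}T^{\mu_1\ldots\mu_m}{}_{\nu_1\lambda\nu_3\ldots\nu_n} - \ldots \tag{3.7}
\end{align}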

One of the most important things to remember is that the connection is not a tensor. It has components labelled with Greek indices, but that does not make it a tensor in and of itself. Indeed, the connection is designed specifically to correct the non-tensorial property of the partial derivative in order to create a new tensor from an old one. Its transformation law under a coordinate transformation is

\begin{equation}
\Gamma^{\nu'}{}_{\mu'\lambda'} = \frac{\partial x^\mu}{\partial x^{\mu'}}\frac{\partial x^\lambda}{\partial x^{\lambda'}}\frac{\partial x^{\nu'}}{\partial x^\nu}\,\Gamma^\nu{}_{\mu\lambda}
 - \frac{\partial x^\mu}{\partial x^{\mu'}}\frac{\partial x^\lambda}{\partial x^{\lambda'}}\frac{\partial^2 x^{\nu'}}{\partial x^\mu\,\partial x^\lambda} . \tag{3.8}
\end{equation}

From this, you can see that the difference between two connections is a tensor, because the second term (which is independent of the $\Gamma$s) drops out of their transformation law.

In discussing covariant derivatives of tensors, it is worth noting here that some people use a different convention than ours. They abbreviate by defining commas after indices to represent partial derivatives, while semicolons represent covariant derivatives. We will stick with keeping $\partial$ and $\nabla$ explicit, because in pages full of long GR equations it is all too easy to lose track of punctuation marks.

Advanced note: we will assume throughout this course that our spacetime has zero torsion. Torsion is a fancy advanced concept beyond the scope of this course involving a rank (1, 2) tensor. This torsion-free physics assumption is very handy, as it implies that the Christoffel symbols are symmetric in their lower indices,

\begin{equation}
\Gamma^\lambda{}_{\mu\nu} = \Gamma^\lambda{}_{\nu\mu} . \tag{3.9}
\end{equation}

Further, when torsion is zero, the relationship between the Christoffel symbols and the metric tensor is completely determined.

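The Christoffel symbols are

\begin{equation}
\Gamma^\lambda{}_{\mu\nu} = \frac{1}{2}\,g^{\lambda\sigma}\left(\partial_\mu g_{\nu\sigma} + \partial_\nu g_{\sigma\mu} - \partial_\sigma g_{\mu\nu}\right) . \tag{3.10}
\end{equation}

Our connection is metric compatible, meaning that the covariant derivative constructed from it obeys

\begin{equation}
\nabla_\sigma g_{\mu\nu} = 0 . \tag{3.11}
\end{equation}

There are two other useful equations that follow from this one,

\begin{align}
\nabla_\sigma g^{\mu\nu} &= 0 , \tag{3.12}\\
\nabla_\lambda\,\varepsilon_{\mu_0\mu_1\ldots\mu_d} &= 0 . \tag{3.13}
\end{align}

It follows that the metric-compatible covariant derivative commutes with raising and lowering of indices. This is very fortunate: as you may be able to imagine, if the connection were not metric compatible, you would have to be scrupulously careful about your index placements. We will have more to say about the completely antisymmetric $\varepsilon_{\mu_0\mu_1\ldots\mu_d}$ tensor density later in the course, near the end.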
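Formula (3.10) is mechanical enough to evaluate numerically. The sketch below does so for the unit two-sphere, $ds^2 = d\theta^2 + \sin^2\theta\,d\phi^2$, which is an assumed test metric (positive definite, chosen for simplicity and not taken from these notes). Derivatives of the metric are approximated by central differences, and the helper names (`sphere_metric`, `inverse_2x2`, `d_metric`, `christoffel`) are hypothetical.

```python
import math

def sphere_metric(x):
    """Unit two-sphere in coordinates x = (theta, phi)."""
    theta = x[0]
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

def d_metric(metric, x, h=1e-5):
    """dg[s][m][n] = partial_s g_{mn}, by central differences."""
    n = len(x)
    dg = []
    for s in range(n):
        up = list(x); up[s] += h
        dn = list(x); dn[s] -= h
        gu, gd = metric(up), metric(dn)
        dg.append([[(gu[m][k] - gd[m][k]) / (2 * h) for k in range(n)]
                   for m in range(n)])
    return dg

def christoffel(metric, x):
    """Gamma[l][m][n] = Gamma^l_{mn}, evaluated term by term from (3.10)."""
    n = len(x)
    ginv, dg = inverse_2x2(metric(x)), d_metric(metric, x)
    return [[[0.5 * sum(ginv[l][s] * (dg[m][nu][s] + dg[nu][s][m] - dg[s][m][nu])
                        for s in range(n))
              for nu in range(n)] for m in range(n)] for l in range(n)]

theta = 0.8
G = christoffel(sphere_metric, [theta, 0.3])
print(G[0][1][1], -math.sin(theta) * math.cos(theta))   # Gamma^theta_{phi phi}
print(G[1][0][1], math.cos(theta) / math.sin(theta))    # Gamma^phi_{theta phi}
```

The two printed pairs compare the numerical values with the standard closed forms $\Gamma^\theta{}_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi{}_{\theta\phi} = \cot\theta$; the symmetry (3.9) in the lower indices can be read off from `G[1][0][1]` and `G[1][1][0]`.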

3.2 The covariant derivative and parallel transport

Introducing a covariant derivative (as compared to a plain derivative) was a really great idea for doing physics. It allows us to write tensor equations wherever we go. All we need to do is to be sure to write $\nabla$s rather than $\partial$s. But one question worth asking is this: what rate of change does $\nabla$ actually measure?

A way to answer that question and get a better handle on $\nabla$ is to ask when $\nabla$ of some tensor is zero. For this we actually have to specify the path along which we hope to compare tensors, because comparing tensors at two different points is, a priori, meaningless in GR. After all, the spacetime metric varies from point to point.

Consider a path $x^\mu(\lambda)$, and define the directional covariant derivative

\begin{equation}
\frac{D}{D\lambda} \equiv \frac{dx^\mu}{d\lambda}\nabla_\mu . \tag{3.14}
\end{equation}

This animal is only defined along the path $x^\mu(\lambda)$, and when acting on a tensor it produces another tensor. We say that a tensor is parallel transported along the path if

\begin{equation}
\left(\frac{D}{D\lambda}T\right)^{\mu_1\ldots\mu_k}{}_{\nu_1\ldots\nu_\ell} = \frac{dx^\sigma}{d\lambda}\nabla_\sigma T^{\mu_1\ldots\mu_k}{}_{\nu_1\ldots\nu_\ell} = 0 . \tag{3.15}
\end{equation}

This is known as the equation of parallel transport, and it is a proper tensor equation.

Now, since we have a metric compatible connection, $\nabla g = 0$, parallel transport preserves the inner product of two tensors. For example, for two vectors $V^\mu$ and $W^\mu$,

\begin{align}
\frac{D}{D\lambda}\left(g_{\mu\nu}V^\mu W^\nu\right) &= \left(\frac{D}{D\lambda}g_{\mu\nu}\right)V^\mu W^\nu + g_{\mu\nu}\left(\left(\frac{D}{D\lambda}V^\mu\right)W^\nu + V^\mu\left(\frac{D}{D\lambda}W^\nu\right)\right) \tag{3.16}\\
&= 0 + 0 + 0 = 0 , \tag{3.17}
\end{align}

since the vectors are both parallel transported. You can visualize what parallel transport does by imagining that it keeps the same angle between the vector and the directional derivative along the path $x^\mu(\lambda)$.

To see what parallel transporting can imply, consider the two-sphere. Imagine that we start at the North Pole with a vector at an angle. We keep the angle of our vector constant as we move along a line of longitude, (say) the Greenwich meridian, down to the Equator. Then imagine that we turn East and continue parallel transporting our vector some way around the equator. Then we turn North and parallel transport our vector up a second line of longitude, back to the North Pole. If you have visualized this correctly in your mind, you will see that our vector, regardless of the direction it was initially pointing, has undergone a finite rotation. This is because the sphere is (positively) curved.
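The rotation produced by a closed loop can be seen in a short numerical experiment. The sketch below is not the pole-equator-pole triangle described above: because the coordinates $(\theta, \phi)$ are singular at the pole, it instead transports a vector once around a circle of constant colatitude $\theta_0$ on the unit sphere, which is an assumed substitute loop. It integrates the parallel transport equation (3.15) for a vector with a fourth-order Runge-Kutta step, using the standard unit-sphere Christoffels $\Gamma^\theta{}_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi{}_{\phi\theta} = \cot\theta$; the function name and step count are illustrative.

```python
import math

def transport_around_latitude(theta0, V, steps=2000):
    """Parallel transport V = (V^theta, V^phi) once around the circle theta = theta0.

    Along the path x = (theta0, phi) with parameter phi, (3.15) reads
    dV^mu/dphi = -Gamma^mu_{phi nu} V^nu.
    """
    s, c = math.sin(theta0), math.cos(theta0)

    def rhs(v):
        # -Gamma^theta_{phi phi} V^phi  and  -Gamma^phi_{phi theta} V^theta
        return (s * c * v[1], -(c / s) * v[0])

    h = 2 * math.pi / steps
    for _ in range(steps):
        k1 = rhs(V)
        k2 = rhs((V[0] + 0.5 * h * k1[0], V[1] + 0.5 * h * k1[1]))
        k3 = rhs((V[0] + 0.5 * h * k2[0], V[1] + 0.5 * h * k2[1]))
        k4 = rhs((V[0] + h * k3[0], V[1] + h * k3[1]))
        V = (V[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
             V[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)
    return V

theta0 = 1.0
V0 = (1.0, 0.0)                       # initially pointing along e_theta
V1 = transport_around_latitude(theta0, V0)

# The inner product g_{mu nu} V^mu V^nu is preserved, as in (3.16)-(3.17).
norm0 = V0[0] ** 2 + math.sin(theta0) ** 2 * V0[1] ** 2
norm1 = V1[0] ** 2 + math.sin(theta0) ** 2 * V1[1] ** 2

# Closed-form solution of the two ODEs above for this loop.
alpha = 2 * math.pi * math.cos(theta0)
predicted = (math.cos(alpha), -math.sin(alpha) / math.sin(theta0))
print(V1, predicted, norm0, norm1)
```

The vector comes back with the same length but rotated through $2\pi\cos\theta_0$ relative to the local basis, which differs from a full turn by $2\pi(1 - \cos\theta_0)$, the solid angle enclosed by the loop on the unit sphere.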

3.3 The geodesic equations

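A geodesic is a path $x^\mu(\lambda)$ that parallel transports its own tangent vector. It follows that the equation satisfied by the geodesic is

\begin{equation}
\frac{D}{D\lambda}\left(\frac{dx^\mu}{d\lambda}\right) = \frac{d^2x^\mu}{d\lambda^2} + \Gamma^\mu{}_{\nu\sigma}\frac{dx^\nu}{d\lambda}\frac{dx^\sigma}{d\lambda} = 0 . \tag{3.18}
\end{equation}

We can also think about parallel transport in the following way. When we take an ordinary partial derivative, we do it by taking

\begin{equation}
\lim_{\Delta x\to 0}\frac{f(x^\mu + \Delta x^\mu) - f(x^\mu)}{\Delta x^\mu} = \frac{\partial f}{\partial x^\mu} . \tag{3.19}
\end{equation}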

In curved spacetime, the result of this is not a tensor. What we do is instead take the covariant derivative, as follows.

1. We take $x^\mu(\lambda + d\lambda)$ as our “x plus an infinitesimal change” and find $T$ there.
2. We parallel transport $T$ back to the original point at $x^\mu(\lambda)$, along the path $x^\mu(\lambda)$.
3. We compare the parallel-transported-back $T$ to the original $T$ at $x^\mu(\lambda)$, and we ‘divide by’ $d\lambda$.

The result is $DT/D\lambda$, the covariant rate of change of the tensor with respect to $\lambda$ at the spacetime point $x^\mu(\lambda)$.

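Let us now see another way that the geodesic equation can be derived, using a variational approach. Consider a massive point particle in proper time gauge. The relativistic einbein action is, up to a constant that is physically irrelevant at the classical level,

\begin{equation}
S = \frac{m}{2}\int d\tau\; g_{\mu\nu}(x^\lambda)\,\frac{dx^\mu(\tau)}{d\tau}\frac{dx^\nu(\tau)}{d\tau} . \tag{3.20}
\end{equation}

What happens when we vary

\begin{equation}
x^\mu \to x^\mu + \delta x^\mu \ ? \tag{3.21}
\end{equation}

Under such a variation,

\begin{equation}
g_{\mu\nu} \to g_{\mu\nu} + (\partial_\sigma g_{\mu\nu})\,\delta x^\sigma . \tag{3.22}
\end{equation}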

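Varying the action, we have

\begin{align}
\frac{2}{m}\,\delta S &= \int d\tau\; \delta\left(g_{\mu\nu}\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau}\right) \tag{3.23}\\
&= \int d\tau \left[(\partial_\sigma g_{\mu\nu})\,\delta x^\sigma\,\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau} + \left\{ g_{\mu\nu}\frac{d\,\delta x^\mu}{d\tau}\frac{dx^\nu}{d\tau} + (\mu\leftrightarrow\nu)\right\}\right] \tag{3.24}\\
&= \int d\tau \left[(\partial_\sigma g_{\mu\nu})\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau}\,\delta x^\sigma - \left\{ (\partial_\sigma g_{\mu\nu})\frac{dx^\sigma}{d\tau}\frac{dx^\nu}{d\tau}\,\delta x^\mu + g_{\mu\nu}\frac{d^2x^\nu}{d\tau^2}\,\delta x^\mu + (\mu\leftrightarrow\nu)\right\}\right] , \tag{3.25}
\end{align}

where in the last step we integrated by parts³, which is the origin of the minus sign in front of the braces, and $(\mu\leftrightarrow\nu)$ stands for the other terms inside the braces with $\mu$ and $\nu$ interchanged. We also used the fact that

\begin{equation}
\delta\,\frac{dx^\mu}{d\tau} = \frac{d\,\delta x^\mu}{d\tau} . \tag{3.26}
\end{equation}

³ We assume that the manifold has sufficiently trivial topology for the integration by parts to work.

Collecting all the terms, relabelling dummy indices, and using the symmetry of $g_{\mu\nu}$ together with (3.10), we have

\begin{equation}
\frac{2}{m}\,\delta S = -2\int d\tau \left[ g_{\mu\sigma}\frac{d^2x^\sigma}{d\tau^2} + g_{\mu\sigma}\Gamma^\sigma{}_{\nu\rho}\frac{dx^\nu}{d\tau}\frac{dx^\rho}{d\tau}\right]\delta x^\mu . \tag{3.27}
\end{equation}

Demanding that this be zero for arbitrary variations $\delta x^\mu$, we obtain the geodesic equation,

\begin{equation}
\frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu{}_{\nu\rho}\frac{dx^\nu}{d\tau}\frac{dx^\rho}{d\tau} = 0 . \tag{3.28}
\end{equation}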

An affine parameter $\lambda$ is defined to be $\lambda = a\tau + b$ for constants $a, b$ with $a \neq 0$. In other words, $\lambda$ is an affine parameter if it is linearly related to $\tau$ (for a massive particle). For a massless particle, we can still define an affine parameter. In fact, our geodesic equation requires just such an affine parametrization, regardless of the particle mass.

For either massive or massless particles, the geodesic equation can be written in very compact form in terms of the momentum vector,

\begin{equation}
p^\nu\nabla_\nu\,p^\mu = 0 . \tag{3.29}
\end{equation}

For point particles, we relate the momentum $p^\mu$ to the four-velocity $u^\mu$ via

\begin{align}
p^\mu &= m u^\mu = m\frac{dx^\mu}{d\tau} , & m^2 &> 0 , \notag\\
p^\mu &= u^\mu , & m^2 &= 0 . \tag{3.30}
\end{align}

The second formula follows Carroll’s convention for defining the four-velocity for massless particles.

There is a central physics point to understand about this extremization. Is it a minimization or a maximization? In fact, the geodesic maximizes proper time. Why? Well, if we were to lower the proper time interval $\Delta\tau$ along a changed path, we would get closer to $\Delta\tau = 0$, which is a null path. To go lower, to $\Delta\tau < 0$, we would have to use an illegal spacelike path. So minimizing $\Delta\tau$ does not make sense, and in fact the proper time is maximized via the variational principle. The fact that the proper time is maximized happens precisely because it is infinitesimally close to paths with lower proper time. Carroll has a morally similar argument: he shows that for any timelike path we can approximate it by a (jaggedy looking) piecewise continuous bunch of null paths, all of the pieces of which have zero invariant interval. Since the geodesic is infinitesimally nearby to null paths with zero proper time, it must maximize proper time.

The physical consequence of this mathematical fact that geodesics maximize proper time is that accelerated observers, those who are not in freefall, measure less proper time than those who are in freefall. This is why the space twin in the Twin Paradox always comes back younger, not older, than the homebody twin. The more you accelerate around with your rockets, the younger you are compared to a homebody who stays on a geodesic.

If all geodesics on a spacetime manifold go as far as they please, then the manifold is said to be geodesically complete. But if some geodesic(s) bang into a singularity, or end prematurely, then the manifold is geodesically incomplete. For spacetimes with matter, this is the generic case, actually. We will see why when we get to singularity theorems later on.

3.4 Example computation for Christoffels and geodesic equations

Let us now work a relatively simple example of calculating Christoffel components for a spacetime with dependence on only one coordinate, $x^0 = t$. We will take the spatially flat⁴ Friedman-Robertson-Walker ansatz in $D = d+1$ spacetime dimensions,

\begin{equation}
ds^2 = dt^2 - a^2(t)\,|d\vec{x}|^2 , \tag{3.31}
\end{equation}

where $a(t)$ is the scale factor. Since

\begin{equation}
ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu , \tag{3.32}
\end{equation}

we have

\begin{align}
g_{00} &= +1 , \notag\\
g_{ij} &= -[a(t)]^2\,\delta_{ij} . \tag{3.33}
\end{align}

Because the metric is diagonal, we can invert it by eye, to obtain

\begin{align}
g^{00} &= +1 , \notag\\
g^{ij} &= -[a(t)]^{-2}\,\delta^{ij} . \tag{3.34}
\end{align}

Finding the Christoffels is relatively straightforward, as many of them are zero. Notice that the only coordinate dependence in the metric is on the time coordinate.

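First, let us try for $\Gamma^0{}_{00}$,

\begin{align}
\Gamma^0{}_{00} &= \frac{1}{2}\,g^{0\sigma}\left(\partial_0 g_{0\sigma} + \partial_0 g_{0\sigma} - \partial_\sigma g_{00}\right) \notag\\
&= \frac{1}{2}\,g^{00}\,\partial_0 g_{00} = 0 , \tag{3.35}
\end{align}

because the metric is diagonal and because $g_{00}$ is a constant.

Next up is

\begin{align}
\Gamma^0{}_{0i} &= \frac{1}{2}\,g^{0\sigma}\left(\partial_0 g_{i\sigma} + \partial_i g_{0\sigma} - \partial_\sigma g_{0i}\right) \notag\\
&= \frac{1}{2}\,g^{00}\left(\partial_0 g_{i0} + \partial_i g_{00} - \partial_0 g_{0i}\right) \notag\\
&= \frac{1}{2}\,g^{00}\,\partial_i g_{00} = 0 , \tag{3.36}
\end{align}

because the metric is diagonal and because $g_{00}$ is a constant.

A more interesting case is $\Gamma^0{}_{ij}$, which is nonzero.

\begin{align}
\Gamma^0{}_{ij} &= \frac{1}{2}\,g^{0\sigma}\left(\partial_i g_{j\sigma} + \partial_j g_{i\sigma} - \partial_\sigma g_{ij}\right) \notag\\
&= \frac{1}{2}\,g^{00}\left(\partial_i g_{j0} + \partial_j g_{i0} - \partial_0 g_{ij}\right) \notag\\
&= -\frac{1}{2}\,g^{00}\,\partial_0 g_{ij} \notag\\
&= a\dot{a}\,\delta_{ij} , \tag{3.37}
\end{align}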

⁴ For the more general case with nontrivial spatial metric, see Carroll §8.3.

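where an overdot denotes $d/dt$. Along the way, we again used the fact that the metric is diagonal and $g_{00}$ is a constant.

Now consider $\Gamma^i{}_{00}$.

\begin{align}
\Gamma^i{}_{00} &= \frac{1}{2}\,g^{i\sigma}\left(\partial_0 g_{0\sigma} + \partial_0 g_{0\sigma} - \partial_\sigma g_{00}\right) \notag\\
&= 0 , \tag{3.38}
\end{align}

because the metric is diagonal and because $g_{00}$ is a constant.

Next, let us look at the only other nonzero Christoffel symbol, $\Gamma^i{}_{0j}$. We have

\begin{align}
\Gamma^i{}_{0j} &= \frac{1}{2}\,g^{i\sigma}\left(\partial_0 g_{j\sigma} + \partial_j g_{0\sigma} - \partial_\sigma g_{0j}\right) \notag\\
&= \frac{1}{2}\,g^{ik}\left(\partial_0 g_{jk} + \partial_j g_{0k} - \partial_k g_{0j}\right) \notag\\
&= \frac{1}{2}\,g^{ik}\,\partial_0 g_{jk} \notag\\
&= \frac{1}{2}\,[a(t)]^{-2}\,\delta^{ik}\,\partial_0[a(t)]^2\,\delta_{jk} \notag\\
&= \frac{\dot{a}}{a}\,\delta^i{}_j . \tag{3.39}
\end{align}

Finally, what about the all-spatial Christoffels $\Gamma^i{}_{jk}$? We have

\begin{align}
\Gamma^i{}_{jk} &= \frac{1}{2}\,g^{i\ell}\left(\partial_j g_{k\ell} + \partial_k g_{j\ell} - \partial_\ell g_{jk}\right) \notag\\
&= 0 , \tag{3.40}
\end{align}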

because none of the spatial components of the metric depends on spatial position.

In summary, we have:

\begin{align}
\Gamma^0{}_{ij} &= a\dot{a}\,\delta_{ij} , \tag{3.41}\\
\Gamma^i{}_{j0} &= \frac{\dot{a}}{a}\,\delta^i{}_j , \tag{3.42}
\end{align}

with all other components zero. Notice how it is the “velocity” of the scale factor $a(t)$ that appears here. The quantity

\begin{equation}
\frac{\dot{a}}{a} = H(t) \tag{3.43}
\end{equation}

is known as the Hubble parameter; it reduces to a constant, the Hubble constant, if the scale factor is exponential. (Whether or not the scale factor can behave in this fashion is determined by the energy-momentum of matter in the spacetime, as we will discover later on in the course.)
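As a quick illustration of the exponential case, assume a scale factor $a(t) = e^{Ht}$ with constant $H$ (an assumed example, not one worked in the notes). Substituting into (3.41)-(3.43) gives

```latex
\begin{align*}
a(t) = e^{Ht} \;\Longrightarrow\; \dot{a} = H e^{Ht} , \qquad
\Gamma^0{}_{ij} = a\dot{a}\,\delta_{ij} = H e^{2Ht}\,\delta_{ij} , \qquad
\Gamma^i{}_{j0} = \frac{\dot{a}}{a}\,\delta^i{}_j = H\,\delta^i{}_j ,
\end{align*}
```

so that $\dot{a}/a$ is indeed independent of $t$ in this case, while $\Gamma^0{}_{ij}$ still grows with time.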

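Now let us look at the geodesic equations in this simple spacetime, doing a time-space split like for the Christoffels above. In general, we have

\begin{equation}
\frac{d^2x^\mu}{d\lambda^2} + \Gamma^\mu{}_{\nu\sigma}\frac{dx^\nu}{d\lambda}\frac{dx^\sigma}{d\lambda} = 0 . \tag{3.44}
\end{equation}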

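The 0th component of this equation reads

\begin{align}
0 &= \frac{d^2x^0}{d\lambda^2} + \Gamma^0{}_{\nu\sigma}\frac{dx^\nu}{d\lambda}\frac{dx^\sigma}{d\lambda} \notag\\
&= \frac{d^2x^0}{d\lambda^2} + \Gamma^0{}_{ij}\frac{dx^i}{d\lambda}\frac{dx^j}{d\lambda} \notag\\
&= \frac{d^2x^0}{d\lambda^2} + a\dot{a}\,\delta_{ij}\frac{dx^i}{d\lambda}\frac{dx^j}{d\lambda} \tag{3.45}
\end{align}

because all the other terms contributing to the sums over $\nu$ and $\sigma$ involve Christoffel components that are zero.

The $i$th component reads

\begin{align}
0 &= \frac{d^2x^i}{d\lambda^2} + \Gamma^i{}_{\nu\sigma}\frac{dx^\nu}{d\lambda}\frac{dx^\sigma}{d\lambda} \notag\\
&= \frac{d^2x^i}{d\lambda^2} + \Gamma^i{}_{0j}\frac{dx^0}{d\lambda}\frac{dx^j}{d\lambda} + \Gamma^i{}_{j0}\frac{dx^j}{d\lambda}\frac{dx^0}{d\lambda} \notag\\
&= \frac{d^2x^i}{d\lambda^2} + \frac{2\dot{a}}{a}\frac{dx^0}{d\lambda}\frac{dx^i}{d\lambda} . \tag{3.46}
\end{align}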

The first thing to notice about these geodesic equations we have derived is that they are coupled and nonlinear. The equation for $dx^0/d\lambda$ depends on what $dx^i/d\lambda$ are doing, and vice versa. This is why solving for motions of massless particles (photons) or massive particles (like electrons) in the background of a general curved spacetime is generically much more complicated than doing Newton’s Laws for non-relativistic physics.

The second thing to notice about our super-simple spacetime is that the spatial geodesic equations actually have a first integral (!). To see this, let us try taking the $\lambda$ derivative of

\begin{equation}
p_i = a^2(t)\,\delta_{ij}\frac{dx^j}{d\lambda} . \tag{3.47}
\end{equation}

We have, by the Leibniz rule and the chain rule,

\begin{align}
\frac{d}{d\lambda}p_i &= \delta_{ij}\frac{d}{d\lambda}\left[a^2(x^0)\frac{dx^j}{d\lambda}\right]
 = \delta_{ij}\left(2a\dot{a}\frac{dx^0}{d\lambda}\right)\frac{dx^j}{d\lambda} + \delta_{ij}\,a^2\frac{d^2x^j}{d\lambda^2} \notag\\
&= \delta_{ij}\,a^2\left[\frac{d^2x^j}{d\lambda^2} + \frac{2\dot{a}}{a}\frac{dx^0}{d\lambda}\frac{dx^j}{d\lambda}\right] = 0 . \tag{3.48}
\end{align}

Therefore, $p_i$ is a conserved quantity along the geodesic. As we will see a bit later in the course, this conservation law arises because our spacetime metric has a symmetry: none of the components of the metric tensor depends on spatial coordinates. This is your first example of how Noether’s Theorem works in General Relativity.
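The conservation law can be watched at work by integrating the geodesic equations (3.45) and (3.46) numerically. The sketch below makes several assumptions that are not in the notes: one spatial dimension, a power-law scale factor $a(t) = t^{2/3}$, a fixed-step fourth-order Runge-Kutta integrator, and arbitrary initial data. All function names are illustrative.

```python
def a(t):
    # Assumed scale factor a(t) = t^(2/3); any smooth positive a(t) would do.
    return t ** (2.0 / 3.0)

def adot(t):
    return (2.0 / 3.0) * t ** (-1.0 / 3.0)

def rhs(state):
    """Geodesic equations (3.45)-(3.46) restricted to one spatial dimension."""
    t, x, tdot, xdot = state
    return (tdot, xdot,
            -a(t) * adot(t) * xdot ** 2,
            -2.0 * adot(t) / a(t) * tdot * xdot)

def rk4_step(state, h):
    def shifted(k, f):
        return tuple(s + f * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, 0.5 * h))
    k3 = rhs(shifted(k2, 0.5 * h))
    k4 = rhs(shifted(k3, h))
    return tuple(s + h * (p + 2 * q + 2 * r + w) / 6
                 for s, p, q, r, w in zip(state, k1, k2, k3, k4))

def p_x(state):
    """First integral (3.47): a^2 dx/dlambda."""
    t, _, _, xdot = state
    return a(t) ** 2 * xdot

state = (1.0, 0.0, 1.25, 0.75)   # (t, x, dt/dlambda, dx/dlambda) at lambda = 0
p_start = p_x(state)
for _ in range(2000):            # integrate from lambda = 0 to lambda = 2
    state = rk4_step(state, 1e-3)
print(p_start, p_x(state))
```

Along the integrated path the coordinate velocity $dx/d\lambda$ changes as the scale factor grows, while the combination $a^2\,dx/d\lambda$ stays fixed to integrator accuracy, as (3.48) requires.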

4 Spacetime curvature

Einstein’s General Theory of Relativity upgraded the way we think about gravitational physics. Instead of imposing Newton’s three laws of motion and imposing his force law for universal gravitation, we assume that the starting point is the fabric of spacetime. We worked quite hard already to define tensors on arbitrary spacetimes, by focusing intensely on their transformation properties under changes of reference frame, i.e., changes of coordinates. We also figured out in our last lecture how to define a covariant derivative, with the help of the Levi-Civita connection. We went to all that trouble of wrangling the Christoffel symbols because this enabled us to do two exciting things: (a) to define a derivative $\nabla_\mu$ that is a tensor, even in curved spacetime, and (b) to derive the geodesic equation, which is the equation obeyed by any relativistic particle undergoing freefall in the spacetime in question. Along the way, we learned that geodesics maximize the proper time.

As we alluded to earlier, the Riemann curvature tensor is the mathematical quantity that Albert Einstein discovered was the key to gravitational physics expressed in the language of curved spacetime. He realized that the Riemann tensor, which contains at most two derivatives of the metric tensor, could even be used to build an action principle for general relativity. We will spend some time deriving the Einstein action and the Einstein equations of motion for the gravitational field at the end of the course. For now, all we need to keep in mind is that the Riemann tensor encodes a wide variety of gravitational phenomena in its tensor components, including the physics of tidal forces and the motion of particles in spacetime. In particular, we will soon show how in the Newtonian limit of weak gravity and slow speeds, we will recover familiar expressions from Newtonian physics, without ever having to use the concept of a force! First, we need to develop a bit more formalism.

4.1 Curvature and the Riemann tensor

Consider an infinitesimal parallelogram, with vectors $A^\mu$ and $B^\nu$ forming the sides.

In hand-waving terms, the Riemann curvature is what tells us how much a vector $V^\mu$ gets rotated under parallel transport around the parallelogram. The infinitesimal change in $V$, $\delta V$, is a (1,0) tensor, and so are $A$, $B$, and $V$. Roughly speaking, we expect $\delta V$ to be proportional to $V$ and to the size of the parallelogram. To connect $\delta V$ to $A, B, V$ we need a (1,3) tensor with which to contract indices naturally, and the role of this is played by the Riemann curvature. The resulting equation from our handwaving is therefore

\begin{equation}
\delta V^\mu \sim R^\mu{}_{\nu\alpha\beta}\,V^\nu A^\alpha B^\beta . \tag{4.1}
\end{equation}

While this sketch of Riemann’s origin gives us the gist, we now need to be more precise and make a proper definition.

Recall that earlier we found parallel transport to be the right way of thinking about how to compare vectors at different places in spacetime. Combined with our little parallelogram hand-wave just now, this can be used to motivate a mathematical definition of the Riemann tensor as arising from taking commutators of covariant derivatives. On a (1,0) vector $V$, Riemann is defined via⁵

\begin{equation}
[\nabla_\mu,\nabla_\nu]V^\alpha = -R^\alpha{}_{\lambda\mu\nu}V^\lambda , \tag{4.2}
\end{equation}

for a torsion-free connection. This formula teaches us how to find the components of the Riemann tensor in terms of Christoffel connection coefficients. Let us write out the pieces individually to see how it works out. First, note that for any (1,1) tensor $T^\rho{}_\nu$,

\begin{equation}
\nabla_\mu T^\rho{}_\nu = \partial_\mu T^\rho{}_\nu + \Gamma^\rho{}_{\mu\sigma}T^\sigma{}_\nu - \Gamma^\lambda{}_{\mu\nu}T^\rho{}_\lambda . \tag{4.3}
\end{equation}

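So with $T^\rho{}_\nu = \nabla_\nu V^\rho$, we have

\begin{align}
\nabla_\mu(\nabla_\nu V^\rho) &= \partial_\mu(\nabla_\nu V^\rho) + \Gamma^\rho{}_{\mu\lambda}(\nabla_\nu V^\lambda) - \Gamma^\lambda{}_{\mu\nu}(\nabla_\lambda V^\rho) \tag{4.4}\\
&= \partial_\mu(\partial_\nu V^\rho + \Gamma^\rho{}_{\nu\lambda}V^\lambda) + \Gamma^\rho{}_{\mu\lambda}(\partial_\nu V^\lambda + \Gamma^\lambda{}_{\nu\sigma}V^\sigma) - \Gamma^\lambda{}_{\mu\nu}(\partial_\lambda V^\rho + \Gamma^\rho{}_{\lambda\sigma}V^\sigma) \tag{4.5}\\
&= \partial_\mu\partial_\nu V^\rho + \Gamma^\rho{}_{\nu\lambda}\partial_\mu V^\lambda + \Gamma^\rho{}_{\mu\lambda}\partial_\nu V^\lambda - \Gamma^\lambda{}_{\mu\nu}\partial_\lambda V^\rho \notag\\
&\quad + (\partial_\mu\Gamma^\rho{}_{\nu\sigma})V^\sigma + \Gamma^\rho{}_{\mu\lambda}\Gamma^\lambda{}_{\nu\sigma}V^\sigma - \Gamma^\lambda{}_{\mu\nu}\Gamma^\rho{}_{\lambda\sigma}V^\sigma . \tag{4.6}
\end{align}

Then

\begin{align}
\nabla_\mu(\nabla_\nu V^\rho) - (\mu\leftrightarrow\nu) &= \left(\partial_\mu\Gamma^\rho{}_{\nu\sigma} + \Gamma^\rho{}_{\mu\lambda}\Gamma^\lambda{}_{\nu\sigma}\right)V^\sigma - \left[\Gamma^\lambda{}_{\mu\nu}\Gamma^\rho{}_{\lambda\sigma}V^\sigma\right] \notag\\
&\quad + \Gamma^\rho{}_{\nu\lambda}\partial_\mu V^\lambda + \Gamma^\rho{}_{\mu\lambda}\partial_\nu V^\lambda - \left[\Gamma^\lambda{}_{\mu\nu}\partial_\lambda V^\rho\right] - (\mu\leftrightarrow\nu) \tag{4.7}\\
&= \left(\partial_\mu\Gamma^\rho{}_{\nu\sigma} + \Gamma^\rho{}_{\mu\lambda}\Gamma^\lambda{}_{\nu\sigma} - (\mu\leftrightarrow\nu)\right)V^\sigma . \tag{4.8}
\end{align}

The terms in square brackets are symmetric in $\mu$ and $\nu$ because the torsion-free connection obeys (3.9); the pair $\Gamma^\rho{}_{\nu\lambda}\partial_\mu V^\lambda + \Gamma^\rho{}_{\mu\lambda}\partial_\nu V^\lambda$ is symmetric as a whole; and so is $\partial_\mu\partial_\nu V^\rho$, which has already been dropped in (4.7). All of these cancel against their $(\mu\leftrightarrow\nu)$ partners.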

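This gives us the formula for the Riemann tensor components,

\begin{equation}
R^\rho{}_{\sigma\mu\nu} = -\partial_\mu\Gamma^\rho{}_{\nu\sigma} + \partial_\nu\Gamma^\rho{}_{\mu\sigma} - \Gamma^\rho{}_{\mu\lambda}\Gamma^\lambda{}_{\nu\sigma} + \Gamma^\rho{}_{\nu\lambda}\Gamma^\lambda{}_{\mu\sigma} . \tag{4.9}
\end{equation}

Now we can put the pieces together to see the general formula for taking the commutator of covariant derivatives acting on a vector,

\begin{equation}
[\nabla_\mu,\nabla_\nu]V^\rho = -R^\rho{}_{\sigma\mu\nu}V^\sigma . \tag{4.10}
\end{equation}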

If you slog through the details, you can compute the commutator of covariant derivativeson a rank (k, `) tensor V as well. This is not much worse than the calculation we have justdone, and we suppress the details here. The result is

[∇ρ,∇σ]V µ1...µkν1...ν`

= −Rµ1λρσV

λ...µkν1...ν`

−Rµ2λρσV

µ1λµ3...µkν1...ν`

− . . .+Rλ

ν1ρσV µ1...µk

λν2...ν`+Rλ

ν2ρσV µ1...µk

ν1λν3...ν`+ . . . . (4.11)

Riemann arises naturally as a rank (1,3) tensor. By doing a partial contraction of two of its indices, we can define the Ricci tensor $R_{\mu\nu}$, which naturally arises as a rank (0,2) tensor,

$$R_{\mu\nu} = -R^\alpha{}_{\mu\alpha\nu} . \tag{4.12}$$

Footnote 5: We are using the sign conventions of HEL (Hobson, Efstathiou, and Lasenby).


By contracting the Ricci tensor with the metric, we can form the Ricci scalar $R$, which has rank (0,0),

$$R = g^{\mu\nu}R_{\mu\nu} . \tag{4.13}$$

Other kinds of contractions involving Riemann are also possible, such as “Riemann squared” and “Ricci squared”. For our purposes in this course, we only need to know about the Ricci tensor and the Ricci scalar, because both of them will appear on the left hand side of Einstein’s equations.

Note that if you change the signature of our Lorentzian spacetime from mostly minus to mostly plus, the Christoffels $\Gamma^\lambda{}_{\mu\nu}$ would stay the same, the Riemann tensor $R^\rho{}_{\lambda\mu\nu}$ would also stay the same, and so would the Ricci tensor $R_{\mu\nu}$, but the Ricci scalar $R$ would develop a relative minus sign.

4.2 Example computation for Riemann

Let us now work a relatively simple example of calculating Riemann components for a spacetime with dependence on only one coordinate. As with our geodesic equation example at the end of the previous section, we take the spatially flat FRW ansatz in $D = d + 1$ spacetime dimensions,

$$ds^2 = dt^2 - a^2(t)\,|d\vec{x}|^2 , \tag{4.14}$$

where $a(t)$ is the scale factor. To compute Riemann components, we use the formula from above,

$$R^\rho{}_{\sigma\mu\nu} = -\partial_\mu\Gamma^\rho{}_{\nu\sigma} + \partial_\nu\Gamma^\rho{}_{\mu\sigma} - \Gamma^\rho{}_{\mu\lambda}\Gamma^\lambda{}_{\nu\sigma} + \Gamma^\rho{}_{\nu\lambda}\Gamma^\lambda{}_{\mu\sigma} . \tag{4.15}$$

Most of the components of Riemann for this simple spacetime are actually zero. Let us sketch how to find the ones that are nonzero.

We had for the Christoffels

$$\Gamma^0{}_{ij} = a\dot{a}\,\delta_{ij} , \qquad \Gamma^i{}_{j0} = \frac{\dot{a}}{a}\,\delta^i{}_j . \tag{4.16}$$

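The first group of nonzero Riemann components have one time index up and one down, and two spatial indices:

$$
\begin{aligned}
R^0{}_{i0j} &= -\partial_0\Gamma^0{}_{ji} + \Gamma^0{}_{jk}\Gamma^k{}_{0i} \\
&= -(\dot{a}^2 + a\ddot{a})\,\delta_{ij} + a\dot{a}\,\delta_{jk}\,\frac{\dot{a}}{a}\,\delta^k{}_i \\
&= -a\ddot{a}\,\delta_{ij} . && (4.17)
\end{aligned}
$$

Then we have

$$
\begin{aligned}
R^i{}_{00j} &= -\partial_0\Gamma^i{}_{j0} - \Gamma^i{}_{0k}\Gamma^k{}_{j0} \\
&= -\frac{a\ddot{a} - \dot{a}^2}{a^2}\,\delta^i{}_j - \frac{\dot{a}^2}{a^2}\,\delta^i{}_j \\
&= -\frac{\ddot{a}}{a}\,\delta^i{}_j . && (4.18)
\end{aligned}
$$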


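The second group of nonzero Riemann components has all spatial indices,

$$
\begin{aligned}
R^i{}_{jk\ell} &= -\Gamma^i{}_{k0}\Gamma^0{}_{\ell j} + \Gamma^i{}_{\ell 0}\Gamma^0{}_{kj} \\
&= -\left(\frac{\dot{a}}{a}\,\delta^i{}_k\right)\left(a\dot{a}\,\delta_{\ell j}\right) + \left(\frac{\dot{a}}{a}\,\delta^i{}_\ell\right)\left(a\dot{a}\,\delta_{kj}\right) \\
&= -\dot{a}^2\left(\delta^i{}_k\delta_{j\ell} - \delta^i{}_\ell\delta_{jk}\right) . && (4.19)
\end{aligned}
$$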

Notice how we have discovered both “velocity squared” $\dot{a}^2$ terms, which arise via $\Gamma\,\Gamma$ parts in Riemann, and “acceleration” $\ddot{a}$ terms, which arise via $\partial^2 g$ parts in Riemann. It is not until you compute the curvature that you see the appearance of the “acceleration” pieces. Notice also how the “acceleration” of the scale factor showed up in the Riemann components involving the time direction; the all-spatial Riemanns gave only “velocity squared” contributions.
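As a sanity check on the three results (4.17), (4.18) and (4.19), here is a minimal numerical sketch in plain Python. It takes the Christoffels from (4.16), feeds them into the component formula (4.15), and differentiates with a central finite difference. The sample scale factor $a(t) = t^{2/3}$, the step size `h`, and all function names are assumptions made purely for illustration; they are not part of the notes.

```python
# Spot-check of the FRW Riemann components from (4.15) and (4.16).
# Index 0 is time; indices 1..D-1 are the spatial directions.

D = 4  # spacetime dimension, i.e. d = 3 spatial dimensions


def a(t):
    """Assumed sample scale factor a(t) = t^(2/3)."""
    return t ** (2.0 / 3.0)


def adot(t):
    """First time derivative of the sample scale factor."""
    return (2.0 / 3.0) * t ** (-1.0 / 3.0)


def christoffel(t):
    """G[rho][mu][nu] = Gamma^rho_{mu nu}, taken from (4.16)."""
    G = [[[0.0] * D for _ in range(D)] for _ in range(D)]
    for i in range(1, D):
        G[0][i][i] = a(t) * adot(t)               # Gamma^0_{ij}
        G[i][i][0] = G[i][0][i] = adot(t) / a(t)  # Gamma^i_{j0}
    return G


def riemann(t, h=1e-4):
    """R[rho][sigma][mu][nu] = R^rho_{sigma mu nu}, from (4.15)."""
    G, Gp, Gm = christoffel(t), christoffel(t + h), christoffel(t - h)

    def dG(mu, rho, nu, sigma):
        # partial_mu Gamma^rho_{nu sigma}; only the time derivative is nonzero
        if mu != 0:
            return 0.0
        return (Gp[rho][nu][sigma] - Gm[rho][nu][sigma]) / (2.0 * h)

    R = [[[[0.0] * D for _ in range(D)] for _ in range(D)] for _ in range(D)]
    for rho in range(D):
        for sigma in range(D):
            for mu in range(D):
                for nu in range(D):
                    val = -dG(mu, rho, nu, sigma) + dG(nu, rho, mu, sigma)
                    for lam in range(D):
                        val -= G[rho][mu][lam] * G[lam][nu][sigma]
                        val += G[rho][nu][lam] * G[lam][mu][sigma]
                    R[rho][sigma][mu][nu] = val
    return R


R = riemann(1.0)
addot_1 = -2.0 / 9.0  # analytic second derivative of t^(2/3) at t = 1
print(R[0][1][0][1], -a(1.0) * addot_1)   # compare with (4.17)
print(R[1][0][0][1], -addot_1 / a(1.0))   # compare with (4.18)
print(R[1][2][1][2], -adot(1.0) ** 2)     # compare with (4.19)
```

Each printed pair should agree to within the finite-difference error, which is one way of confirming that only the time-direction components pick up $\ddot{a}$.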

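Since we now have the Riemann tensor, we can contract it to find Ricci. The nonzero components are

$$
\begin{aligned}
R_{00} &= -R^i{}_{0i0} = +\delta^i{}_i\,\frac{\ddot{a}}{a} = +d\,\frac{\ddot{a}}{a} , \\
R_{ij} &= -R^0{}_{i0j} - R^k{}_{ikj} \\
&= -a\ddot{a}\,\delta_{ij} - \dot{a}^2\left(\delta^k{}_k\delta_{ij} - \delta^k{}_j\delta_{ik}\right) \\
&= -a\ddot{a}\,\delta_{ij} - \dot{a}^2\,(d-1)\,\delta_{ij} \\
&= -\delta_{ij}\left[a\ddot{a} + (d-1)\,\dot{a}^2\right] , && (4.20)
\end{aligned}
$$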

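where $d$ is the spatial dimension ($d = 3$ in our universe). Contracting the Ricci tensor with the metric tensor gives the Ricci scalar,

$$
\begin{aligned}
R &= g^{00}R_{00} + g^{ij}R_{ij} \\
&= +\frac{\ddot{a}}{a}\,d - d\left(-\frac{1}{a^2}\right)\left[a\ddot{a} + (d-1)\,\dot{a}^2\right] \\
&= +2d\,\frac{\ddot{a}}{a} + d(d-1)\,\frac{\dot{a}^2}{a^2} . && (4.21)
\end{aligned}
$$

Using $D = d + 1$, we can write this in terms of the spacetime dimension $D$,

$$R = +2(D-1)\,\frac{\ddot{a}}{a} + (D-1)(D-2)\,\frac{\dot{a}^2}{a^2} . \tag{4.22}$$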

The time evolution of this depends sensitively on the details of how the scale factor evolves. We will need to develop the Einstein equation to see how scale factor evolution is tied to the energy-momentum of the type of matter hanging out in the spacetime. Arbitrary scale factors $a(t)$ are not allowed; the Einstein equations will determine them in terms of the energy density and the pressures.
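To make the dependence on the scale factor concrete, the following short sketch evaluates (4.21) and (4.22) for two sample scale factors. Both choices, a power law $a = t^{2/3}$ and an exponential $a = e^{Ht}$, are assumptions picked only for illustration, as are the function names. Substituting by hand gives $R = 4/(3t^2)$ for the power law in $d = 3$ and $R = d(d+1)H^2$ for the exponential.

```python
import math


def ricci_scalar_frw(a, adot, addot, d=3):
    """Ricci scalar from (4.21): R = 2 d (addot/a) + d (d-1) (adot/a)^2."""
    return 2 * d * addot / a + d * (d - 1) * (adot / a) ** 2


def ricci_scalar_frw_D(a, adot, addot, D=4):
    """The same quantity in terms of D = d + 1, as in (4.22)."""
    return 2 * (D - 1) * addot / a + (D - 1) * (D - 2) * (adot / a) ** 2


t = 2.0

# Assumed power-law scale factor a = t^(2/3) and its derivatives
a_p = t ** (2.0 / 3.0)
adot_p = (2.0 / 3.0) * t ** (-1.0 / 3.0)
addot_p = -(2.0 / 9.0) * t ** (-4.0 / 3.0)
R_power = ricci_scalar_frw(a_p, adot_p, addot_p)

# Assumed exponential scale factor a = exp(H t) and its derivatives
H = 0.5
a_e = math.exp(H * t)
adot_e = H * a_e
addot_e = H * H * a_e
R_exp = ricci_scalar_frw(a_e, adot_e, addot_e)

print(R_power, 4.0 / (3.0 * t * t))  # numerical value vs. 4 / (3 t^2)
print(R_exp, 12.0 * H * H)           # numerical value vs. d (d+1) H^2 at d = 3
```

The power-law curvature decays as $1/t^2$ while the exponential one is constant, which illustrates how strongly $R$ depends on the form of $a(t)$.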


4.3 Riemann normal coordinates and the Bianchi identity

Riemann normal coordinates are a handy coordinate system that you can always use based about any point $p$. They are defined in a smallish patch in the neighbourhood of $p$, and do not necessarily extend infinitely in all directions, as we will explain when we talk about geodesic deviation soon. But they are a great little coordinate system that you can use to evaluate tensor equations, and to help prove tensor equations. We will use the notational convention that equations written in Riemann normal coordinates have bars over the tensors. Strictly speaking we should also bar all the indices, but this is beyond my typing patience at present, so please imagine barred indices everywhere in your head.

The idea is to use geodesics to define basis vectors, and coordinates for nearby points. Consider a tangent vector $k^\mu$. We write, at $p$,

$$x^\mu(\lambda) = \lambda k^\mu . \tag{4.23}$$

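From this, it follows immediately that

$$\frac{d^2x^\mu}{d\lambda^2} = 0 . \tag{4.24}$$

But since the curve $x^\mu(\lambda)$ is a geodesic, it obeys

$$\frac{d^2x^\mu}{d\lambda^2} = -\Gamma^\mu{}_{\alpha\beta}\,k^\alpha k^\beta , \tag{4.25}$$

which is true for arbitrary $k^\mu$. Therefore,

$$\Gamma^\mu{}_{\alpha\beta} = 0 . \tag{4.26}$$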

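Then, since $\nabla_\sigma g_{\alpha\beta} = 0$ everywhere, including at $p$,

$$
\begin{aligned}
\nabla_\sigma g_{\mu\nu} &= \partial_\sigma g_{\mu\nu} - \Gamma^\lambda{}_{\sigma\mu}\,g_{\lambda\nu} - \Gamma^\lambda{}_{\sigma\nu}\,g_{\lambda\mu} && (4.27)\\
&= \partial_\sigma g_{\mu\nu} + 0 = 0 . && (4.28)
\end{aligned}
$$

Therefore, in a Riemann normal coordinate system, we have the special relations

$$
\begin{aligned}
\partial_\sigma g_{\mu\nu} &= 0 , && (4.29)\\
\Gamma^\alpha{}_{\lambda\sigma} &= 0 , && (4.30)\\
R^\mu{}_{\nu\sigma\rho} &= -\partial_\sigma\Gamma^\mu{}_{\nu\rho} + \partial_\rho\Gamma^\mu{}_{\nu\sigma} . && (4.31)
\end{aligned}
$$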

As you can imagine, using this coordinate system we can more quickly check tensor equations. This is not a trick: tensor equations are valid in any coordinate system. Therefore, they must hold in any frame, including the Riemann normal coordinate frame in which our tensor components simplify. This conceptual tool can be super handy.

We are now going to make use of this special coordinate system to identify all the symmetries of Riemann. This is an important quest, because it will enable us to compute how many independent components Riemann has in arbitrary spacetime dimension $D = d + 1$. In turn, that helps us understand the physics of this four-legged tensor. Computing it can be arduous for a general spacetime, and this is why I set computer algebra as part of HW1.

To help us find the symmetries, it helps to start by using the spacetime metric to build the (0,4) version of Riemann from the natural (1,3) version,

$$R_{\alpha\beta\mu\nu} = g_{\alpha\lambda}R^\lambda{}_{\beta\mu\nu} . \tag{4.32}$$

The first symmetry we can notice by inspection of the formula for Riemann in terms of Christoffels. We see immediately that Riemann is antisymmetric upon exchange of its final two indices,

$$R^\rho{}_{\sigma\mu\nu} = -R^\rho{}_{\sigma\nu\mu} . \tag{4.33}$$

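In Riemann normal coordinates,

$$
\begin{aligned}
R_{\rho\sigma\mu\nu} &= -g_{\rho\lambda}\left(\partial_\mu\Gamma^\lambda{}_{\nu\sigma} - \partial_\nu\Gamma^\lambda{}_{\mu\sigma}\right) && (4.34)\\
&= -g_{\rho\lambda}\,\partial_\mu\left[\tfrac{1}{2}g^{\lambda\alpha}\left(\partial_\sigma g_{\nu\alpha} + \partial_\nu g_{\sigma\alpha} - \partial_\alpha g_{\nu\sigma}\right)\right] - (\mu\leftrightarrow\nu) && (4.35)\\
&= -\tfrac{1}{2}\,g_{\rho\lambda}g^{\lambda\alpha}\left(\partial_\mu\partial_\sigma g_{\nu\alpha} + \partial_\mu\partial_\nu g_{\sigma\alpha} - \partial_\mu\partial_\alpha g_{\nu\sigma}\right) - (\mu\leftrightarrow\nu) && (4.36)\\
&= -\tfrac{1}{2}\left(\partial_\mu\partial_\sigma g_{\nu\rho} + \partial_\mu\partial_\nu g_{\sigma\rho} - \partial_\mu\partial_\rho g_{\nu\sigma}\right) - (\mu\leftrightarrow\nu) && (4.37)\\
&= -\tfrac{1}{2}\left(\partial_\mu\partial_\sigma g_{\nu\rho} - \partial_\mu\partial_\rho g_{\nu\sigma}\right) - (\mu\leftrightarrow\nu) , && (4.38)
\end{aligned}
$$

where in the third line above we used the fact that $\partial_\mu g^{\lambda\alpha} = 0$ in Riemann normal coordinates, and in the last line we used symmetry: the term $\partial_\mu\partial_\nu g_{\sigma\rho}$ is symmetric in $\mu$ and $\nu$, so it cancels against its $(\mu\leftrightarrow\nu)$ partner. Therefore, we can see two additional identities satisfied by Riemann,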

$$R_{\rho\sigma\mu\nu} = -R_{\sigma\rho\mu\nu} , \tag{4.39}$$

i.e., Riemann is antisymmetric upon exchange of its first two indices as well as its last two, and

$$R_{\rho\sigma\mu\nu} = R_{\mu\nu\rho\sigma} , \tag{4.40}$$

i.e., Riemann is symmetric under interchange of the first two indices with the last two. We can also look at a version of Riemann with cyclic permutations on the last three indices,

$$Q_{\rho\sigma\mu\nu} := R_{\rho\sigma\mu\nu} + R_{\rho\mu\nu\sigma} + R_{\rho\nu\sigma\mu} . \tag{4.41}$$

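Evaluating again in Riemann normal coordinates gives

$$
\begin{aligned}
-2Q_{\rho\sigma\mu\nu} &= \left(\partial_\mu\partial_\sigma g_{\nu\rho} - \partial_\mu\partial_\rho g_{\nu\sigma}\right) + \left(\partial_\nu\partial_\mu g_{\rho\sigma} - \partial_\nu\partial_\rho g_{\mu\sigma}\right) + \left(\partial_\sigma\partial_\nu g_{\mu\rho} - \partial_\sigma\partial_\rho g_{\mu\nu}\right) - (\mu\leftrightarrow\nu) && (4.42)\\
&= \partial_\rho\left(-\partial_\mu g_{\nu\sigma} - \partial_\nu g_{\mu\sigma} - \partial_\sigma g_{\mu\nu} + \partial_\nu g_{\mu\sigma} + \partial_\mu g_{\nu\sigma} + \partial_\sigma g_{\nu\mu}\right) \\
&\quad + \partial_\sigma\left(\partial_\mu g_{\nu\rho} + \partial_\nu g_{\mu\rho} - \partial_\nu g_{\mu\rho} - \partial_\mu g_{\nu\rho}\right) + \partial_\mu\left(\partial_\nu g_{\sigma\rho}\right) - \partial_\nu\left(\partial_\mu g_{\sigma\rho}\right) && (4.43)\\
&= (0) + (0) + 0 - 0 && (4.44)\\
&= 0 , && (4.45)
\end{aligned}
$$

where we have used the fact that mixed partial derivatives commute and the fact that the metric is symmetric. Because of the antisymmetry properties, an equivalent way of writing this is (check this!)

$$R_{\rho[\sigma\mu\nu]} = 0 , \tag{4.46}$$

and it immediately follows from this that

$$R_{[\rho\sigma\mu\nu]} = 0 , \tag{4.47}$$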

i.e., the totally antisymmetric part of Riemann vanishes too. With straightforward but tedious algebra of very similar type, we can also derive the Bianchi identity which governs covariant derivatives of Riemann. It can be written in (at least) two mathematically different but physically identical ways, which are related by the symmetries of Riemann. The first form is

$$\nabla_\lambda R_{\rho\sigma\mu\nu} + \nabla_\rho R_{\sigma\lambda\mu\nu} + \nabla_\sigma R_{\lambda\rho\mu\nu} = 0 , \tag{4.48}$$

and the second form is

$$\nabla_{[\lambda}R_{\mu\nu]\rho\sigma} = 0 , \tag{4.49}$$

which constrains Riemann by relating components at different points. You can think of the Bianchi identity for Riemann as like a Jacobi identity for covariant derivatives,

$$[[\nabla_\mu,\nabla_\nu],\nabla_\lambda] + [[\nabla_\nu,\nabla_\lambda],\nabla_\mu] + [[\nabla_\lambda,\nabla_\mu],\nabla_\nu] = 0 . \tag{4.50}$$
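For readers who want the "check this!" step between the cyclic sum (4.41) and the antisymmetrized form (4.46) spelled out, here is one way to do it. The only assumption is the usual normalization of the bracket, with a factor $1/3!$ and signs given by the parity of each permutation. Expanding and then using the antisymmetry of Riemann in its last two indices on the three negative terms gives

$$
\begin{aligned}
R_{\rho[\sigma\mu\nu]} &= \frac{1}{3!}\left(R_{\rho\sigma\mu\nu} + R_{\rho\mu\nu\sigma} + R_{\rho\nu\sigma\mu} - R_{\rho\sigma\nu\mu} - R_{\rho\mu\sigma\nu} - R_{\rho\nu\mu\sigma}\right) \\
&= \frac{1}{3}\left(R_{\rho\sigma\mu\nu} + R_{\rho\mu\nu\sigma} + R_{\rho\nu\sigma\mu}\right) = \frac{1}{3}\,Q_{\rho\sigma\mu\nu} ,
\end{aligned}
$$

so the vanishing of $Q_{\rho\sigma\mu\nu}$ and the vanishing of $R_{\rho[\sigma\mu\nu]}$ are the same statement.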

4.4 The information in Riemann

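Now we have all the ingredients we need in order to compute the number of independent Riemann coefficients. We know that as a (0,4) tensor Riemann satisfies

$$
\begin{aligned}
R_{\alpha\beta\gamma\delta} &= -R_{\alpha\beta\delta\gamma} , && (4.51)\\
R_{\alpha\beta\gamma\delta} &= -R_{\beta\alpha\gamma\delta} , && (4.52)\\
R_{\alpha\beta\gamma\delta} &= R_{\gamma\delta\alpha\beta} , && (4.53)\\
R_{[\alpha\beta\gamma\delta]} &= 0 . && (4.54)
\end{aligned}
$$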

Suppose that we bunch the indices of Riemann in twos. Then we can think of Riemann as like a symmetric combination of two antisymmetric blocks. Recall that the dimension of an antisymmetric $D\times D$ matrix is $D(D-1)/2$ while that of a symmetric matrix is $D(D+1)/2$. Then the number of components of Riemann should be

$$n_R(D) = \frac{1}{2}\left[\frac{1}{2}D(D-1)\right]\left[\frac{1}{2}D(D-1) + 1\right] - \binom{D}{4} . \tag{4.55}$$

We obtained this by using the symmetries of the first three identities to compute the tentative total and then subtracting off the number of completely antisymmetric components to satisfy the fourth identity. This process works because the four constraints are independent. Then, with very simple algebra, we obtain

$$n_R(D) = \frac{1}{12}\,D^2(D^2-1) . \tag{4.56}$$


Notice a few things about this formula. In one spacetime dimension, $n_R(1) = 0$ and Riemann has no components. This makes sense, as there is only one independent direction, so you cannot build a nonzero commutator of covariant derivatives. There is not enough room in spacetime to build a parallelogram. In two spacetime dimensions, we have $n_R(2) = 1$ and Riemann has just one independent component. This makes gravitational physics in $D = 1+1$ quite easy compared to higher dimensions. In three spacetime dimensions, we get $n_R(3) = 6$, and in four spacetime dimensions we have $n_R(4) = 20$. This number is, not accidentally, equal to the number of degrees of freedom in the second partial derivatives of the metric that we cannot set to zero by a clever choice of coordinate system when Taylor expanding the metric.

As we keep going up in dimension, $n_R(D)$ proliferates like a quartic polynomial of $D$. By the time we get to ten or eleven spacetime dimensions, we are dealing with $n_R(10) = 825$ or $n_R(11) = 1210$ independent components! This is why we often use computer algebra in research, when calculating in spacetime dimensions relevant to string theory. Of course, it is also possible with clever techniques to cut through the algebra and find quicker ways to calculate analytically, when your metric is diagonal or sparse in other significant ways.
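The counting argument is easy to check by machine. The sketch below evaluates both the block-counting expression (4.55) and the closed form (4.56) and confirms that they agree for small $D$; the function names are made up for this illustration.

```python
from math import comb


def n_riemann_blocks(D):
    """Count via (4.55): symmetric pairs of antisymmetric blocks, minus C(D, 4)."""
    m = D * (D - 1) // 2  # independent components of one antisymmetric index pair
    return m * (m + 1) // 2 - comb(D, 4)


def n_riemann_closed(D):
    """Closed form (4.56): D^2 (D^2 - 1) / 12."""
    return D * D * (D * D - 1) // 12


for D in range(1, 12):
    assert n_riemann_blocks(D) == n_riemann_closed(D)
    print(D, n_riemann_closed(D))
```

Note that `comb(D, 4)` returns zero for $D < 4$, which matches the fact that a totally antisymmetric four-index object has no components in fewer than four dimensions.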

4.5 Geodesic deviation

Remember flat Euclidean space $\mathbb{R}^d$? We learned in high school that parallel lines do not meet. But in curved spacetime, we have two major differences. First, straight lines are replaced by geodesics, which are the closest thing we have to straight lines in curved spacetime. Second, geodesics physically deviate because of spacetime curvature. Let us now make this intuition a bit more mathematically precise.

Consider a one-parameter family of geodesics $\gamma_s(\lambda)$, where $\lambda$ is the affine parameter along the geodesic in question. (Note that Carroll refers to $\lambda$ as $t$.) The parameter $s \in \mathbb{R}$ tells you which geodesic you are referring to. We can choose coordinates $s$ and $\lambda$ on the manifold as long as the geodesics do not cross.

Then we have two naturally defined vector fields,

$$S^\mu = \frac{\partial x^\mu}{\partial s} , \qquad T^\mu = \frac{\partial x^\mu}{\partial\lambda} . \tag{4.57}$$

A useful mnemonic here is that $S$ is for Separation while $T$ is for Tangent. We would now like to build the covariant analogue of the ‘relative velocity’ between geodesics,

$$V^\mu = T^\alpha\nabla_\alpha S^\mu , \tag{4.58}$$

and the ‘relative acceleration’

$$A^\mu := T^\alpha\nabla_\alpha V^\mu . \tag{4.59}$$

Note that the acceleration of a path away from being a geodesic is different. That would be

$$T^\alpha\nabla_\alpha T^\mu . \tag{4.60}$$

Since our proposed definitions above are tensor equations, they are well-defined. Now, $S^\mu$ and $T^\mu$ are basis vectors adapted to a coordinate system, with $s$ and $\lambda$. Therefore,

$$[S, T] = 0 . \tag{4.61}$$

On our way towards building the relative acceleration vector, we will need an identity for vector fields,

$$
\begin{aligned}
[X, Y]^\mu &= X^\alpha\partial_\alpha Y^\mu - Y^\alpha\partial_\alpha X^\mu && (4.62)\\
&= X^\alpha\nabla_\alpha Y^\mu - Y^\alpha\nabla_\alpha X^\mu . && (4.63)
\end{aligned}
$$

This allows us to relate $S$-directional derivatives of $T$ to $T$-directional derivatives of $S$,

$$S^\alpha\nabla_\alpha T^\mu = T^\alpha\nabla_\alpha S^\mu . \tag{4.64}$$

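Now we can compute the relative acceleration vector.

$$
\begin{aligned}
A^\mu &= T^\alpha\nabla_\alpha\left(T^\sigma\nabla_\sigma S^\mu\right) && (4.65)\\
&= T^\alpha\nabla_\alpha\left(S^\sigma\nabla_\sigma T^\mu\right) \\
&= \left(T^\alpha\nabla_\alpha S^\sigma\right)\left(\nabla_\sigma T^\mu\right) + T^\alpha S^\sigma\left(\nabla_\sigma\nabla_\alpha T^\mu - R^\mu{}_{\nu\alpha\sigma}T^\nu\right) \\
&= \left(S^\alpha\nabla_\alpha T^\sigma\right)\left(\nabla_\sigma T^\mu\right) - R^\mu{}_{\nu\alpha\sigma}T^\nu T^\alpha S^\sigma \\
&\quad + \left[S^\sigma\nabla_\sigma\left(T^\alpha\nabla_\alpha T^\mu\right) - \left(S^\sigma\nabla_\sigma T^\alpha\right)\left(\nabla_\alpha T^\mu\right)\right] \\
&= -R^\mu{}_{\nu\alpha\sigma}T^\nu T^\alpha S^\sigma , && (4.66)
\end{aligned}
$$

where we used (i) $[S, T] = 0$, (ii) $\nabla$ obeys the Leibniz rule and Riemann is defined in terms of a commutator of covariant derivatives, (iii) the Leibniz rule and rearranging terms, (iv) relabelling of dummy indices to cancel terms and $T$ being the tangent vector of a geodesic.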

Summarizing, we have the geodesic deviation equation

$$A^\mu = \frac{D^2 S^\mu}{D\lambda^2} = \left(\nabla_T\nabla_T S\right)^\mu = -R^\mu{}_{\nu\alpha\sigma}T^\nu T^\alpha S^\sigma . \tag{4.67}$$

Here we see how the Riemann curvature tensor governs the deviation of geodesics in a very precise way. The covariant acceleration deviation of this one-parameter family of geodesics is given by the Riemann tensor contracted with the tangent vector $T$ twice, on its second and third indices, and contracted with the separation vector $S$ once, on its fourth index.


4.6 Tidal forces

Remember the tides? If you, like me, have spent any length of time near the ocean, then you know that the water level rises and falls twice a day. But do you know why? Newton first explained this in his Principia. Basically, oceanic water on the near side to the Moon bulges because it is closer to the Moon than ocean on the far side and hence feels stronger gravity; the bulge on the far side can be seen to arise through ‘centrifugal force’. So we see two tides per day.

How do tidal forces work in Newtonian and Einsteinian gravity? Well, you cannot detect curvature using only one test particle, or only one geodesic. You need to use multiples to see the physical effects of curvature of space or spacetime. So let us think about geodesic deviation in the Newtonian limit, even before we recruit the heavy machinery of tensor analysis in curved spacetime and the Riemann tensor. We will soon see how Riemann and the Newtonian potential are connected by the Newtonian limit of weak gravity and slow speeds.

In an inertial frame, the equation of motion of the first particle moving in a Newtonian gravitational potential $\Phi(x^k)$ is

$$\frac{d^2x^i}{dt^2} = -\delta^{ij}\partial_j\Phi(x^k) . \tag{4.68}$$

Next, we define the vector $y^i$ to be the separation of the second particle from the first, which is assumed to be small. We have that

$$\frac{d^2}{dt^2}\left(x^i + y^i\right) = -\delta^{ij}\partial_j\Phi(x^k + y^k) . \tag{4.69}$$

Taylor expanding gives

$$\partial_j\Phi(x^i + y^i) = \partial_j\Phi(x^k) + \left(\partial_\ell\partial_j\Phi(x^k)\right)y^\ell + O(y^2) , \tag{4.70}$$

so that the Newtonian trajectory deviation equation is

$$\frac{d^2}{dt^2}y^i = -\delta^{ij}\left(\partial_j\partial_k\Phi\right)y^k . \tag{4.71}$$

The left hand side is known as the tidal acceleration, and it is described by the second mixed partial derivatives of the Newtonian potential.

Now we get to picturing Earth’s tides. For simplicity, we ignore the fact that the Earth is rotating on its own axis as well as the rotation of the Earth around the Sun. This is a pretty decent spherical cow approximation.


Letting the Moon be at $(x, y, z) = (0, 0, d)$, we have for the Newtonian gravitational potential

$$\Phi_m(x, y, z) = -\frac{G_N M_m}{\left[x^2 + y^2 + (z-d)^2\right]^{1/2}} . \tag{4.72}$$

From this we can calculate the acceleration deviation vector

$$\left.\left(\frac{\partial^2\Phi}{\partial x^i\partial x^j}\right)\right|_0 = +\frac{G_N M_m}{d^3}\,\mathrm{diag}(1, 1, -2) . \tag{4.73}$$

Why the asymmetry between the $z$ and $x, y$ directions? Simple. The functional dependence in the denominator is different.

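Writing $[\ldots]$ for $x^2 + y^2 + (z-d)^2$, we have

$$
\begin{aligned}
\left.\frac{\partial^2\Phi}{\partial x^2}\right|_0 &= -G_N M_m\,\partial_x\left.\left[2x\cdot\left(-\tfrac{1}{2}\right)[\ldots]^{-3/2}\right]\right|_0 && (4.74)\\
&= G_N M_m\left.\left[[\ldots]^{-3/2} + x\cdot 2x\cdot\left(-\tfrac{3}{2}\right)[\ldots]^{-5/2}\right]\right|_0 && (4.75)\\
&= \frac{G_N M_m}{d^3} + 0 , && (4.76)
\end{aligned}
$$

whereas

$$
\begin{aligned}
\left.\frac{\partial^2\Phi}{\partial z^2}\right|_0 &= -G_N M_m\,\partial_z\left.\left[2(z-d)\cdot\left(-\tfrac{1}{2}\right)[\ldots]^{-3/2}\right]\right|_0 && (4.77)\\
&= G_N M_m\left.\left[[\ldots]^{-3/2} + (z-d)\cdot 2(z-d)\cdot\left(-\tfrac{3}{2}\right)[\ldots]^{-5/2}\right]\right|_0 && (4.78)\\
&= +\frac{G_N M_m}{d^3} - 3\,\frac{G_N M_m\,d^2}{d^5} && (4.79)\\
&= -2\,\frac{G_N M_m}{d^3} . && (4.80)
\end{aligned}
$$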

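Another way to write the same set of equations is to use a unit normal vector $n^i = x^i/r$ pointing in the radial direction; then

$$a_{ij} = -\left.\left(\frac{\partial^2\Phi}{\partial x^i\partial x^j}\right)\right|_0 = -\left(\delta_{ij} - 3n_i n_j\right)\frac{G_N M_m}{r^3} . \tag{4.81}$$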

This (tensor) equation tells us that you get stretched in the radial direction and squeezed in the transverse directions. Quite generally, you can think of gravity as a stretchy-squeezy force. This originates in the fact that gravitational interactions in our universe are transmitted by a spin-two boson known as the graviton. It has a polarization tensor rather than a polarization vector. After symmetries under arbitrary changes of coordinates are taken into account, there are two independent physical polarizations for the graviton in four spacetime dimensions, like there are for the photon. But please do not mistake one for the other: the photon only has spin one, and in dimensions other than $D = 3+1$ the numbers of independent physical polarizations of photons and gravitons will not match. That they do in $D = 3+1$ is a numerical accident.


How big are tidal forces, in orders of magnitude? First, we need to figure out which of the solar system bodies is relevant. If you do the calculation using the above formula for tidal accelerations, you find that the Moon is actually the biggest contributor, because although it is much lighter than the Sun (about 27,100,000 times) it is much closer (about 388 times), and it is the cube of the distance that counts. Plugging in the numbers, you will find that the Sun’s tidal acceleration is only about 45% of the Moon’s. So we focus on the Moon. We would like to compare the magnitude to the acceleration due to gravity. So, to get the order of magnitude, we are computing the ratio of the tidal force on a piece of ocean to the g-force,

$$\frac{G_N M_M\,r_E}{d^3}\cdot\frac{r_E^2}{G_N M_E} \sim \frac{M_M}{M_E}\left(\frac{r_E}{d}\right)^3 \sim 10^{-7} . \tag{4.82}$$

Tidal forces might seem like teeny weeny forces, but when you multiply by entire oceans, you get physical effects that human beings can relate to.
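The two estimates above are quick to reproduce. In the sketch below, the masses, the Earth radius, and the Earth-Moon distance are approximate textbook values that I am supplying as assumptions (they are not quoted in these notes), while the Sun-to-Moon ratios 27,100,000 and 388 are the ones given in the text.

```python
# Approximate inputs, quoted to three significant figures (assumed values).
M_MOON = 7.35e22   # kg
M_EARTH = 5.97e24  # kg
R_EARTH = 6.37e6   # m
D_MOON = 3.84e8    # m, mean Earth-Moon distance

# Ratio (4.82) of the lunar tidal acceleration across the Earth to surface gravity.
tidal_over_g = (M_MOON / M_EARTH) * (R_EARTH / D_MOON) ** 3

# Sun-to-Moon tidal ratio: mass ratio divided by the cube of the distance ratio.
sun_over_moon = 27_100_000 / 388 ** 3

print(f"tidal / g       ~ {tidal_over_g:.1e}")
print(f"Sun / Moon tide ~ {sun_over_moon:.2f}")
```

With these inputs the first ratio comes out at a few times $10^{-8}$, consistent with the order-of-magnitude estimate $10^{-7}$, and the second comes out a little under one half, consistent with the quoted 45%.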

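We can make a little table comparing what we have found in Newtonian gravity versus Einsteinian General Relativity so far.

| What | Newton | Einstein |
|---|---|---|
| gravity | $\Phi(x^i, t)$ | $g_{\alpha\beta}(x^\lambda)$ |
| test particle EOM | $\dfrac{d^2x^i}{dt^2} = -\delta^{ij}\partial_j\Phi$ | $\dfrac{d^2x^\mu}{d\lambda^2} = -\Gamma^\mu{}_{\nu\sigma}\dfrac{dx^\nu}{d\lambda}\dfrac{dx^\sigma}{d\lambda}$ |
| deviation | $\dfrac{d^2y^i}{dt^2} = -\delta^{ij}\partial_j\partial_k\Phi\,y^k$ | $\dfrac{D^2S^\mu}{D\lambda^2} = -R^\mu{}_{\nu\sigma\rho}T^\nu T^\sigma S^\rho$ |
| tidal forces | $\partial_i\partial_j\Phi$ | $R^\rho{}_{\sigma\mu\nu} = -\partial_\mu\Gamma^\rho{}_{\nu\sigma} + \partial_\nu\Gamma^\rho{}_{\mu\sigma} - \Gamma^\rho{}_{\mu\lambda}\Gamma^\lambda{}_{\nu\sigma} + \Gamma^\rho{}_{\nu\lambda}\Gamma^\lambda{}_{\mu\sigma}$ |
| gravity EOM | $\nabla^2\Phi = 4\pi G_N\rho$ | ??? (Einstein equations, coming soon!) |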

In the Newtonian equation of motion for $\Phi$, $\rho$ is the mass density of whatever is sourcing the gravitational field, and $G_N$ is the Newton constant characterizing the strength of gravity.

In order to see how the covariant geodesic deviation equation reduces to the familiar Newtonian equations, we need to take the Newtonian limit in which gravity is weak and speeds are low. (Recall also that $x^0 = ct$ and we will need to put back the factors of $c$ here to make the approximation clear.) Either we can assume staticity, or we can note that $\partial_0 = \partial_t/c$, which is a factor $1/c$ smaller than $\partial_i$. In the Newtonian approximation, we treat the Newtonian potential as a perturbation on 1, and we will ignore terms of order $\Phi^2$ compared to terms of order $\Phi$. In the weak-field limit, the line element is diagonal and quite simple,

$$ds^2 = \left(1 + 2\Phi/c^2\right)c^2dt^2 - \left(1 - 2\Phi/c^2\right)\left(dx^2 + dy^2 + dz^2\right) . \tag{4.83}$$

For the moment, you will need to take this equation on faith, as I have not yet developed the machinery required to see how it emerges. What I will do for now is to assume it as an ansatz, and show that it correctly gives back the familiar Newtonian limit in the limit of weak gravity and slow speeds. Later on in the course, I will give a fuller explanation of where this expression for the approximate line element comes from.

In the low-speed Newtonian limit, there is no difference between proper time and coordinate time $t$. The dynamical variables of interest become $x^\mu(\lambda) \to x^i(t)$. What this means is that we only need to consider the spatial components of the geodesic deviation equation, as the temporal component takes care of itself automatically. In the limit of slow speeds compared to the speed of light, then, we have

$$\frac{d^2y^i}{dt^2} = +R^i{}_{tjt}\,y^j . \tag{4.84}$$

To check that this does reduce to the Newtonian expression we need to compute $R^i{}_{tjt}$ for the above line element. In the limit of weak gravity, we can find the components of the inverse metric to first order in $\Phi$,

$$g^{tt} \simeq \left(1 - 2\Phi/c^2\right) , \qquad g^{ij} \simeq -\delta^{ij}\left(1 + 2\Phi/c^2\right) . \tag{4.85}$$

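For our general Christoffel symbol we have

$$\Gamma^\mu{}_{\nu\lambda} = \frac{1}{2}g^{\mu\sigma}\left(\partial_\nu g_{\sigma\lambda} + \partial_\lambda g_{\sigma\nu} - \partial_\sigma g_{\nu\lambda}\right) , \tag{4.86}$$

so we can pick off the 0 and $i$ parts individually. Assuming that gravity is weak allows us to keep only first order terms in $\Phi$. Assuming that $\Phi$ does not depend on time (to first order in small quantities) sets some Christoffels to zero. For example,

$$\Gamma^0{}_{00} = \frac{1}{2}g^{00}\left(\partial_0 g_{00}\right) = 0 , \tag{4.87}$$

and

$$\Gamma^0{}_{ij} = \frac{1}{2}g^{00}\left(\partial_i g_{0j} + \partial_j g_{0i} - \partial_0 g_{ij}\right) = 0 , \tag{4.88}$$

and

$$\Gamma^i{}_{0j} = \frac{1}{2}g^{ik}\left(\partial_0 g_{kj} + \partial_j g_{0k} - \partial_k g_{0j}\right) = 0 . \tag{4.89}$$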

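Then we have

$$\Gamma^0{}_{0i} = \frac{1}{2}g^{00}\partial_i g_{00} \simeq \frac{1}{2}\left(1 - 2\Phi/c^2\right)\partial_i\left(1 + 2\Phi/c^2\right) \simeq \partial_i\Phi/c^2 \quad\Rightarrow\quad \Gamma^t{}_{ti} = \partial_i\Phi/c^2 . \tag{4.90}$$

Another nontrivial component is

$$\Gamma^i{}_{00} = \frac{1}{2}g^{ik}\left(\partial_0 g_{k0} + \partial_0 g_{0k} - \partial_k g_{00}\right) = -\frac{1}{2}g^{ik}\partial_k g_{00} = \delta^{ik}\partial_k\Phi/c^2 \quad\Rightarrow\quad \Gamma^i{}_{tt} = \delta^{ik}\partial_k\Phi . \tag{4.91}$$

Finally, we have

$$
\begin{aligned}
\Gamma^i{}_{jk} &= \frac{1}{2}g^{i\ell}\left(\partial_j g_{\ell k} + \partial_k g_{\ell j} - \partial_\ell g_{jk}\right) && (4.92)\\
&= \frac{1}{2}\delta^{i\ell}\left(1 + 2\Phi/c^2\right)\left(-2/c^2\right)\left(\delta_{\ell k}\partial_j\Phi + \delta_{\ell j}\partial_k\Phi - \delta_{jk}\partial_\ell\Phi\right) && (4.93)\\
\Rightarrow\quad \Gamma^i{}_{jk} &= \frac{1}{c^2}\left(-\delta^i{}_k\partial_j\Phi - \delta^i{}_j\partial_k\Phi + \delta^{i\ell}\delta_{jk}\partial_\ell\Phi\right) . && (4.94)
\end{aligned}
$$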


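From these, you can find the Riemann components,

$$
\begin{aligned}
R^t{}_{xtx} &= +\frac{1}{c^2}\frac{\partial^2\Phi}{\partial x^2} , && (4.95)\\
R^t{}_{xty} &= +\frac{1}{c^2}\frac{\partial^2\Phi}{\partial x\,\partial y} , && (4.96)\\
R^x{}_{yxy} &= -\frac{1}{c^2}\left(\frac{\partial^2\Phi}{\partial x^2} + \frac{\partial^2\Phi}{\partial y^2}\right) , && (4.97)\\
R^x{}_{yxz} &= -\frac{1}{c^2}\left(\frac{\partial^2\Phi}{\partial y\,\partial z}\right) , && (4.98)
\end{aligned}
$$

plus eight more equations from cyclic permutations of $(x, y, z)$. Note that we do not obtain any squares of partial derivatives here in our Riemanns because we are only working to first order in the Newtonian potential $\Phi$. Then, using our geodesic deviation equation in the Newtonian limit, we have

$$\frac{d^2y^i}{dt^2} = +R^i{}_{tjt}\,y^j . \tag{4.99}$$

Since we also know that

$$R^i{}_{tjt} = -\partial_j\Gamma^i{}_{tt} + 0 = -\partial_j\left(\delta^{ik}\partial_k\Phi\right) = -\delta^{ik}\partial_j\partial_k\Phi , \tag{4.100}$$

we can see that the General Relativistic geodesic deviation equation involving Riemann gives back the Newtonian expression, which is exactly what we set out to prove.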

It is possible to get considerably more sophisticated in discussing the physics of geodesic deviation. In order to derive more precise equations, one studies a congruence of geodesics, which is a set of curves in an open region of spacetime such that every point in the region lies on precisely one curve. The story of how geodesics deviate can be expressed in more sophisticated tensor language by studying the covariant derivative of the four-velocity vector $\nabla_\mu U_\nu$ and decomposing it into three independent parts: (a) the trace part $\theta$, known as the expansion of the congruence, (b) the symmetric traceless part $\sigma_{\mu\nu}$, known as the shear of the congruence, and (c) the antisymmetric part $\omega_{\mu\nu}$, known as the rotation of the congruence. The details are described in a section of the Appendix for timelike geodesics, and another section of the Appendix for null geodesics, which have a qualitatively different structure because the massless limit of massive physics is non-analytic. The coupled nonlinear equations for the evolution of the expansion, rotation, and shear are derived in a qualitatively similar way to our previous analysis, just with more sophisticated mathematics. The resulting equation for the expansion is

$$\frac{D\theta}{d\lambda} = -\frac{1}{3}\theta^2 - \sigma_{\mu\nu}\sigma^{\mu\nu} + \omega_{\mu\nu}\omega^{\mu\nu} - R_{\mu\nu}U^\mu U^\nu . \tag{4.101}$$

Notice that the first two terms on the RHS for the directional covariant derivative of the expansion are negative semidefinite, because they are the negative of a sum of squares. By contrast, the third term involving the rotation is positive semidefinite. The equation obtained for the rotation is the next simplest,

$$\frac{D\omega_{\mu\nu}}{d\lambda} = -\frac{2}{3}\theta\,\omega_{\mu\nu} + \sigma_\mu{}^\alpha\,\omega_{\nu\alpha} - \sigma_\nu{}^\alpha\,\omega_{\mu\alpha} . \tag{4.102}$$


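The shear evolution equation is the most complicated of the three,

$$
\begin{aligned}
\frac{D\sigma_{\mu\nu}}{d\lambda} = {} & -\frac{2}{3}\theta\,\sigma_{\mu\nu} - \sigma_{\mu\alpha}\sigma^\alpha{}_\nu - \omega_{\mu\alpha}\omega^\alpha{}_\nu + \frac{1}{3}P_{\mu\nu}\left(\sigma_{\alpha\beta}\sigma^{\alpha\beta} - \omega_{\alpha\beta}\omega^{\alpha\beta}\right) \\
& + C_{\alpha\nu\mu\beta}U^\alpha U^\beta + \frac{1}{2}\tilde{R}_{\mu\nu} , && (4.103)
\end{aligned}
$$

where the spatially projected trace-free part of the Ricci tensor is

$$\tilde{R}_{\mu\nu} = P^\alpha{}_\mu P^\beta{}_\nu R_{\alpha\beta} - \frac{1}{3}P_{\mu\nu}P^{\alpha\beta}R_{\alpha\beta} . \tag{4.104}$$

In this equation, the projection tensor $P_{\mu\nu}$ is built from the 4-velocity vector as

$$P^\mu{}_\nu = \delta^\mu{}_\nu + U^\mu U_\nu . \tag{4.105}$$

The tensor $C_{\rho\sigma\mu\nu}$ is called the Weyl tensor and is formed as a particular contraction of Riemann,

$$C_{\rho\sigma\mu\nu} = R_{\rho\sigma\mu\nu} - \frac{2}{(D-2)}\left(g_{\rho[\mu}R_{\nu]\sigma} - g_{\sigma[\mu}R_{\nu]\rho}\right) + \frac{2}{(D-1)(D-2)}\,g_{\rho[\mu}g_{\nu]\sigma}R . \tag{4.106}$$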

Note: the Weyl tensor is clearly only defined for $D \geq 3$. For lower dimensions, there are simply not enough independent components of anything to summon a nonzero Weyl tensor. In $D = 3$, the rigidity of the structure of GR means that the Weyl tensor vanishes identically. So interesting Weyl tensor physics starts at $D = 4$. Petrov classified the distinct types of eigenvalues and eigenvectors of Weyl. The most important physical property of the Weyl tensor is that it is invariant under conformal transformations, i.e. local changes of scale.

The main physics fact to draw from all this analysis is that it is the Riemann tensor that controls everything about the evolution of geodesic congruences. This is one reason why we are so obsessed with computing the Riemann tensor!

5 The power of symmetry, and Einstein’s equations

5.1 Lie derivatives

So far we have developed covariant derivatives and curvature. An interesting fact is there are some structures that can be defined on a curved spacetime manifold even without reference to a connection, let alone a Riemann tensor. Suppose that we have two vector fields $X$ and $Y$. Recall that any vector $V$ can be expanded in the coordinate basis as $V = V^\mu\partial_\mu$. Then we can define the commutator $[X, Y]$ of two vector fields as

$$[X, Y](f) \equiv X(Y(f)) - Y(X(f)) , \tag{5.1}$$

where $f$ is some arbitrary function. The neat thing about $[X, Y]$ is that it is a bona fide vector field: it is linear

$$[X, Y](af + bg) = a[X, Y]f + b[X, Y]g \tag{5.2}$$

and it obeys the Leibniz rule

$$[X, Y](fg) = f[X, Y]g + g[X, Y]f . \tag{5.3}$$

In the coordinate basis, the new vector field $[X, Y]$ has components

$$[X, Y]^\mu = X^\lambda\partial_\lambda Y^\mu - Y^\lambda\partial_\lambda X^\mu . \tag{5.4}$$

This is a well-defined tensor, because the non-tensorial pieces from the partial derivatives cancel by antisymmetry of the commutator. If you prefer, you can write the above formula with covariant derivatives instead of partial derivatives.
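The component formula (5.4) is simple enough to evaluate numerically. The sketch below does so with central finite differences for two pairs of vector fields on the plane. The fields (a translation, a shear, a dilation and a rotation), the sample point, and all names are assumptions chosen for illustration. By hand, $[\partial_x, x\,\partial_y] = \partial_y$, while the dilation $x\,\partial_x + y\,\partial_y$ commutes with the rotation $-y\,\partial_x + x\,\partial_y$.

```python
def bracket(X, Y, p, h=1e-5):
    """Components of [X, Y] at the point p, using (5.4) with central differences."""
    n = len(p)

    def partial(F, lam):
        # d F^mu / d x^lam at p, returned as a list over mu
        plus, minus = list(p), list(p)
        plus[lam] += h
        minus[lam] -= h
        Fp, Fm = F(plus), F(minus)
        return [(Fp[mu] - Fm[mu]) / (2.0 * h) for mu in range(n)]

    Xp, Yp = X(p), Y(p)
    result = [0.0] * n
    for lam in range(n):
        dY = partial(Y, lam)
        dX = partial(X, lam)
        for mu in range(n):
            result[mu] += Xp[lam] * dY[mu] - Yp[lam] * dX[mu]
    return result


def translation(p):
    return [1.0, 0.0]      # the field d/dx


def shear(p):
    return [0.0, p[0]]     # the field x d/dy


def dilation(p):
    return [p[0], p[1]]    # the field x d/dx + y d/dy


def rotation(p):
    return [-p[1], p[0]]   # the field -y d/dx + x d/dy


point = [0.3, -1.2]
print(bracket(translation, shear, point))  # compare with d/dy, components (0, 1)
print(bracket(dilation, rotation, point))  # compare with the zero vector
```

Swapping the arguments flips the sign of every component, which is the antisymmetry that makes the non-tensorial pieces cancel.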

A more general construction than the commutator is known as the Lie derivative. (Pronunciation note: “Lie” rhymes with “see”.) This is actually a more primitive concept than our covariant derivative $\nabla$ that we have already introduced, and can be defined without reference to it. Let us start by defining the integral curves of a vector field $V(x)$ to be those curves $x^\mu(\lambda)$ satisfying

$$\frac{dx^\mu}{d\lambda} = V^\mu . \tag{5.5}$$

A familiar example is that the magnetic flux lines are the integral curves of the magnetic field 3-vector. It is interesting to ask the following question: how fast does a tensor $T$ change along integral curves of $V$? This is the Lie derivative of $T$ along $V$, $\mathcal{L}_V(T)$. Our vector field $V^\mu(x)$ gives us a family of diffeomorphisms parametrized by $\lambda$, and for each $\lambda$ we can define the infinitesimal change between the pullback of the tensor to $p$ and its original value at $p$. So, on functions $f$, also known as rank $(0, 0)$ tensors, the Lie derivative acts as

$$\mathcal{L}_V(f) = V^\lambda\partial_\lambda f , \tag{5.6}$$

which is just the directional derivative.

For a rank $(k, \ell)$ tensor, the procedure is a bit more complicated. Let us start by seeing how it works on contravariant vectors, then we will outline how it goes for covariant vectors, then we will just quote the formula for the general rank $(k, \ell)$ tensor. To do this, we will need to use a little bit of math about things called pullbacks and pushforwards. These may seem a bit abstract, but they are really quite simple concepts.

If we have a map $\phi$ from one manifold $M$ with coordinates $x$ to another $N$ with coordinates $y$, we can define the pullback of a function $f$ on $N$ by $\phi$ as

$$\phi^*f = (f\circ\phi) . \tag{5.7}$$

In general we cannot define the pushforward of the function $f$, because the map $\phi$ may not be invertible. But we can actually define the pushforward of a vector, because a vector can be thought of as a derivative operator mapping smooth functions to real numbers. We define

$$(\phi_*V)(f) = V(\phi^*f) . \tag{5.8}$$

In component form,

$$(\phi_*V)^\alpha\partial_\alpha f = V^\mu\partial_\mu(\phi^*f) = V^\mu\partial_\mu(f\circ\phi) = V^\mu\left(\frac{\partial y^\alpha}{\partial x^\mu}\right)\partial_\alpha f . \tag{5.9}$$


So if you think of the pushforward operation as a matrix operator, it has components

$$(\phi_*)^\alpha{}_\mu = \frac{\partial y^\alpha}{\partial x^\mu} . \tag{5.10}$$

This looks a lot like the vector transformation law under a change of coordinates, but when $M$ and $N$ are unequal it is a more general concept. While we have that contravariant vectors can be pushed forward, they cannot be pulled back. For the case of covariant vectors, as you might suspect, they can be pulled back but cannot be pushed forward. Indeed, we define

$$(\phi^*\omega)(V) = \omega(\phi_*V) . \tag{5.11}$$

The chain rule yields

$$(\phi^*)_\mu{}^\alpha = \frac{\partial y^\alpha}{\partial x^\mu} , \tag{5.12}$$

which is the same matrix appearing in the pushforward of a contravariant vector. Of course, a different index is contracted when it acts to pull back covariant vectors.

Let us now get back to the business of defining the Lie derivative on vectors. Suppose that we have a vector field $V$ and that we adapt our coordinate system so that $V = \partial/\partial x^1$. The utility of choosing this coordinate system is that a diffeomorphism by $\lambda$ amounts to a coordinate transformation from $(x^1, x^2, \ldots, x^D)$ to $(x^1 + \lambda, x^2, \ldots, x^D)$. Then using our above formula for pushforwards,

$$(\phi_{\lambda *})^\nu{}_\mu = \delta^\nu_\mu , \tag{5.13}$$

and the components of the tensor pulled back from $\phi_\lambda(p)$ to $p$ are simply

$$\phi_\lambda^* \left[ T^{\mu_1 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell} (\phi_\lambda(p)) \right] = T^{\mu_1 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell} (x^1 + \lambda, x^2, \ldots, x^D) . \tag{5.14}$$

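In this coordinate system, the Lie derivative becomes

$$\mathcal{L}_V T^{\mu_1 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell} = \frac{\partial}{\partial x^1} T^{\mu_1 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell} . \tag{5.15}$$

In particular, the Lie derivative for a vector field $U^\mu(x)$ is

$$\mathcal{L}_V U^\mu = \frac{\partial}{\partial x^1} U^\mu . \tag{5.16}$$

This is clearly not covariant, but we know that $[V, U]$ is a well-defined tensor, and in this coordinate system it happens to have components

$$[V, U]^\mu = V^\nu \partial_\nu U^\mu - U^\nu \partial_\nu V^\mu = \frac{\partial U^\mu}{\partial x^1} . \tag{5.17}$$

Since both sides are vectors (rank $(1, 0)$ tensors) whose components agree in this coordinate system, they must agree in every coordinate system, and so we finally have the formula we want,

$$\mathcal{L}_V U = [V, U] . \tag{5.18}$$

This is called the Lie bracket.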
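To see the component formula $[V, U]^\mu = V^\nu \partial_\nu U^\mu - U^\nu \partial_\nu V^\mu$ at work in coordinates that are not adapted to $V$, here is a minimal sketch on the Cartesian plane. The particular vector fields and function names are assumptions for illustration only, and partial derivatives are approximated by central differences.

```python
def V(p):
    """Rotation generator in the plane: V = -y d/dx + x d/dy."""
    x, y = p
    return (-y, x)

def U_radial(p):
    """Radial field U = x d/dx + y d/dy, which is invariant under rotations."""
    x, y = p
    return (x, y)

def U_const(p):
    """Constant field U = d/dx, which is not invariant under rotations."""
    return (1.0, 0.0)

def partial(field, p, nu, mu, h=1e-6):
    """Central difference for d(field^mu)/dx^nu at the point p."""
    up = list(p)
    dn = list(p)
    up[nu] += h
    dn[nu] -= h
    return (field(up)[mu] - field(dn)[mu]) / (2 * h)

def lie_bracket(A, B, p):
    """Components [A, B]^mu = A^nu d_nu B^mu - B^nu d_nu A^mu at p."""
    return [
        sum(A(p)[nu] * partial(B, p, nu, mu) - B(p)[nu] * partial(A, p, nu, mu)
            for nu in range(2))
        for mu in range(2)
    ]

p = (0.8, -1.3)
bracket_radial = lie_bracket(V, U_radial, p)  # expected to vanish
bracket_const = lie_bracket(V, U_const, p)    # expected to be -d/dy
print(bracket_radial, bracket_const)
```

The radial field is dragged into itself by rotations, so its Lie derivative along $V$ vanishes, while the constant field $\partial_x$ is rotated and picks up a nonzero bracket.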


The formula for the action of the Lie derivative on covariant vectors follows directly from what we have just derived for contravariant vectors and the Leibniz rule. For a general rank $(k, \ell)$ tensor, the Lie derivative is

$$\mathcal{L}_V T^{\mu_1 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell} = V^\sigma \partial_\sigma T^{\mu_1 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell} - (\partial_\lambda V^{\mu_1}) T^{\lambda \mu_2 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell} - \ldots + (\partial_{\nu_1} V^\lambda) T^{\mu_1 \ldots \mu_k}{}_{\lambda \nu_2 \ldots \nu_\ell} + \ldots . \tag{5.19}$$

This equation may make you a bit uncomfortable because it involves partial derivatives. In fact, if you do the straightforward but tedious algebra, you will find that it is just as valid with covariant derivatives replacing the partial ones,

$$\mathcal{L}_V T^{\mu_1 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell} = V^\sigma \nabla_\sigma T^{\mu_1 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell} - (\nabla_\lambda V^{\mu_1}) T^{\lambda \mu_2 \ldots \mu_k}{}_{\nu_1 \ldots \nu_\ell} - \ldots + (\nabla_{\nu_1} V^\lambda) T^{\mu_1 \ldots \mu_k}{}_{\lambda \nu_2 \ldots \nu_\ell} + \ldots . \tag{5.20}$$

This equation certainly looked less tensorial written the first way. But the first equation has the advantage that it makes clear that no connection is necessary to define Lie derivatives of tensors. It is an independent structure.

5.2 Killing tensors

In this section, we will be especially interested in the expression above for the Lie derivative of the metric tensor, which characterizes everything about gravity in our spacetime. We have

$$\mathcal{L}_V g_{\mu\nu} = \nabla_\mu V_\nu + \nabla_\nu V_\mu . \tag{5.21}$$

So if $\nabla_\mu K_\nu + \nabla_\nu K_\mu = 0$ for some vector $K$, the metric is unchanged! This thing $K$ is known as a Killing vector. The metric is unchanged along integral curves of $K$, i.e., it has a symmetry. This is an incredibly important concept in General Relativity, because as we will shortly see, when you have a Killing vector for a given spacetime metric, you have a conservation law. This is how Noether's Theorem is represented in curved spacetime.

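Consider the quantity $K \cdot p$. Its covariant derivative is

$$\nabla_\mu (K_\lambda p^\lambda) = (\nabla_\mu K_\lambda) p^\lambda + K_\lambda (\nabla_\mu p^\lambda) . \tag{5.22}$$

Contracting this with $p^\mu$ gives

$$p^\mu \nabla_\mu (K_\lambda p^\lambda) = p^\mu p^\lambda \nabla_\mu K_\lambda + K_\lambda p^\mu (\nabla_\mu p^\lambda) , \tag{5.23}$$

and the second term disappears by the geodesic equation. The first term can also be seen to vanish: $p^\mu p^\lambda$ is symmetric, so it picks out only the symmetric part of $\nabla_\mu K_\lambda$, which is zero by the Killing vector equation. So the Killing equation implies conservation of $K \cdot p$ along geodesics. More generally, if we have a Killing tensor obeying

$$\nabla_{(\mu} K_{\nu_1 \ldots \nu_\ell)} = 0 \tag{5.24}$$

then

$$p^\mu \nabla_\mu (K_{\nu_1 \ldots \nu_\ell} \, p^{\nu_1} \ldots p^{\nu_\ell}) = 0 . \tag{5.25}$$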


Finding conserved quantities is physically important because it can help us solve for geodesics. Soon, when we introduce black holes, we will see just how crucial conserved quantities can be in analyzing geodesic motion and physical consequences of it! The alternative form of the geodesic equation we would like to explain is

$$\frac{d}{d\lambda} p_\mu = \frac{1}{2} (\partial_\mu g_{\nu\sigma}) \, p^\nu p^\sigma . \tag{5.26}$$

Let us see both how to prove this and what it implies about the physics. We start from the geodesic equation

$$p^\mu \nabla_\mu p^\nu = 0 , \tag{5.27}$$

and notice that because our connection is metric compatible, we can pull the indices down, to get

$$p^\mu \nabla_\mu p_\nu = 0 = p^\mu \left( \partial_\mu p_\nu - \Gamma^\lambda_{\mu\nu} p_\lambda \right) . \tag{5.28}$$

To simplify the first term in parentheses, we use the fact that for either massive or massless particles, $p^\mu$ is proportional to $dx^\mu/d\lambda$, the tangent vector. (In Carroll's textbook, he uses the standard convention $p^\mu = m U^\mu$ for massive particles, and he chooses $p^\mu = U^\mu$ for massless particles.) Accordingly,

$$p^\mu \nabla_\mu p_\nu = \frac{d}{d\lambda} p_\nu - \Gamma^\lambda_{\mu\nu} p^\mu p_\lambda = 0 . \tag{5.29}$$

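Now let us use the formula for the Christoffels in terms of metric derivatives to work on the second term,

$$\begin{aligned}
\Gamma^\lambda_{\mu\nu} p^\mu p_\lambda &= \frac{1}{2} g^{\lambda\sigma} (\partial_\mu g_{\nu\sigma} + \partial_\nu g_{\mu\sigma} - \partial_\sigma g_{\mu\nu}) \, p^\mu p_\lambda \\
&= \frac{1}{2} (\partial_\mu g_{\nu\sigma} + \partial_\nu g_{\mu\sigma} - \partial_\sigma g_{\mu\nu}) \, p^\mu p^\sigma \\
&= \frac{1}{2} (\partial_\nu g_{\mu\sigma}) \, p^\mu p^\sigma ,
\end{aligned} \tag{5.30}$$

where we did an index contraction in step 1 in raising the index on the second momentum vector, and in step 2 used the symmetry of $p^\mu p^\sigma$ under exchange of $\mu$ and $\sigma$, which kills the antisymmetric combination $\partial_\mu g_{\nu\sigma} - \partial_\sigma g_{\mu\nu}$. Then our alternative form of the geodesic equation becomes

$$\frac{d}{d\lambda} p_\nu = \frac{1}{2} (\partial_\nu g_{\mu\sigma}) \, p^\mu p^\sigma . \tag{5.31}$$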

The alternative rewriting we have just found makes very clear when we can find conservation laws: it is when the metric has a symmetry. When the metric tensor components obey $\partial_\nu g_{\mu\sigma}(x^\lambda) = 0$ for some particular coordinate $x^\nu$, we have $p_\nu$ conserved, i.e., the downstairs $\nu$th component of the momentum.
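The following sketch integrates the geodesic equation in the form $dp_\nu/d\lambda = \frac{1}{2}(\partial_\nu g_{\mu\sigma})p^\mu p^\sigma$ for a deliberately simple case: the flat Euclidean plane in polar coordinates, $ds^2 = dr^2 + r^2 d\phi^2$. That choice of metric, the normalization $p^\mu = dx^\mu/d\lambda$, the initial data, and the Runge-Kutta integrator are all assumptions made for illustration. Since nothing in the metric depends on $\phi$, $p_\phi$ should stay constant, and the path should be a straight line when converted back to Cartesian coordinates.

```python
import math

def rhs(state):
    """Geodesic equations for ds^2 = dr^2 + r^2 dphi^2, with p^mu = dx^mu/dlambda."""
    r, phi, p_r, p_phi = state
    p_up_r = p_r                  # p^r = g^{rr} p_r
    p_up_phi = p_phi / r ** 2     # p^phi = g^{phi phi} p_phi
    # dp_nu/dlambda = (1/2) (d_nu g_{mu sigma}) p^mu p^sigma
    dp_r = 0.5 * (2.0 * r) * p_up_phi ** 2   # only g_{phi phi} = r^2 depends on r
    dp_phi = 0.0                             # no metric component depends on phi
    return (p_up_r, p_up_phi, dp_r, dp_phi)

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(k, factor):
        return tuple(s + factor * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, 0.5 * h))
    k3 = rhs(shifted(k2, 0.5 * h))
    k4 = rhs(shifted(k3, h))
    return tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

# Initial data given in Cartesian form: position (x0, y0), velocity (vx, vy).
x0, y0, vx, vy = 1.0, 0.0, 0.3, 0.5
r0 = math.hypot(x0, y0)
state = (r0, math.atan2(y0, x0),
         (x0 * vx + y0 * vy) / r0,   # p_r = dr/dlambda
         x0 * vy - y0 * vx)          # p_phi = r^2 dphi/dlambda

steps, lam_final = 2000, 2.0
h = lam_final / steps
for _ in range(steps):
    state = rk4_step(state, h)

r, phi, p_r, p_phi = state
x_num, y_num = r * math.cos(phi), r * math.sin(phi)
x_exact, y_exact = x0 + vx * lam_final, y0 + vy * lam_final
print(x_num - x_exact, y_num - y_exact, p_phi)
```

Note that $p_r$ is not conserved here, because $g_{\phi\phi} = r^2$ depends on $r$; only the momentum conjugate to the symmetry direction is.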

Let us do an ultra-simple example of a Killing vector. Consider Minkowski space in 4D, namely $\mathbb{R}^{3,1}$ with the flat spacetime metric. In Cartesian coordinates, we obviously have translation invariance. Then the presence of the Killing vector $(1, 0, 0, 0)$ implies that $p_0$ is conserved. Similar arguments for all four spacetime coordinates imply that all components of $p_\mu$ are conserved.


Suppose instead that we used spherical polar coordinates, still sticking with flat Minkowski spacetime. Then rotations in the azimuthal angle $\phi$ are clearly a symmetry of the metric, which implies that the azimuthal angular momentum $p_\phi$ is conserved. The other two rotational Killing vectors are $-\cos\phi \, \partial_\theta + \cot\theta \sin\phi \, \partial_\phi$ and $-\sin\phi \, \partial_\theta - \cot\theta \cos\phi \, \partial_\phi$.
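One way to convince yourself that these really are Killing vectors is to evaluate the Lie derivative of the metric numerically, using the partial-derivative form $(\mathcal{L}_K g)_{\mu\nu} = K^\sigma \partial_\sigma g_{\mu\nu} + (\partial_\mu K^\sigma) g_{\sigma\nu} + (\partial_\nu K^\sigma) g_{\mu\sigma}$. The sketch below does this on the unit round two-sphere with coordinates $(\theta, \phi)$; restricting to the angular part of the metric is an assumption that is harmless here because these vectors have no $t$ or $r$ components. The function names and the sample point are illustrative.

```python
import math

def metric(x):
    """Round metric on the unit two-sphere in coordinates (theta, phi)."""
    theta, phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def K_phi(x):
    """Azimuthal rotation: d/dphi."""
    return (0.0, 1.0)

def K_a(x):
    """-cos(phi) d/dtheta + cot(theta) sin(phi) d/dphi."""
    theta, phi = x
    return (-math.cos(phi), math.sin(phi) / math.tan(theta))

def K_b(x):
    """-sin(phi) d/dtheta - cot(theta) cos(phi) d/dphi."""
    theta, phi = x
    return (-math.sin(phi), -math.cos(phi) / math.tan(theta))

def not_killing(x):
    """d/dtheta, which is NOT a symmetry of the sphere metric."""
    return (1.0, 0.0)

def partial(func, x, i, h=1e-6):
    """Central finite-difference partial derivative of a scalar func along axis i."""
    up = list(x)
    dn = list(x)
    up[i] += h
    dn[i] -= h
    return (func(up) - func(dn)) / (2 * h)

def lie_derivative_of_metric(K, x):
    """(L_K g)_{mu nu} = K^s d_s g_{mu nu} + (d_mu K^s) g_{s nu} + (d_nu K^s) g_{mu s}."""
    g = metric(x)
    Kx = K(x)
    result = [[0.0, 0.0], [0.0, 0.0]]
    for mu in range(2):
        for nu in range(2):
            total = 0.0
            for s in range(2):
                total += Kx[s] * partial(lambda p: metric(p)[mu][nu], x, s)
                total += partial(lambda p: K(p)[s], x, mu) * g[s][nu]
                total += partial(lambda p: K(p)[s], x, nu) * g[mu][s]
            result[mu][nu] = total
    return result

def max_abs(matrix):
    return max(abs(entry) for row in matrix for entry in row)

point = (1.0, 0.6)  # generic point away from the poles
violations = {K.__name__: max_abs(lie_derivative_of_metric(K, point))
              for K in (K_phi, K_a, K_b, not_killing)}
print(violations)
```

The three rotational vectors give a Lie derivative that vanishes to within finite-difference error, while $\partial_\theta$ does not, since $g_{\phi\phi}$ depends on $\theta$.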

As another example, take our spatially flat FRW universe. Notice that the metric depends only on time. Obviously, this means that energy is not conserved. Stop and think on that for a minute. You probably thought that conservation of energy must be true in all circumstances, even for the whole universe. You would be wrong. It requires a symmetry! For the spatially flat FRW universe, the downstairs components of the spatial momenta are conserved.

5.3 Maximally symmetric spacetimes

Spacetimes are distinguished by how many symmetries they possess. The more symmetric, the more calculable. The less symmetric, the less calculable. Even though maximally symmetric spacetimes possess an unrealistic amount of symmetry for experimental purposes, they are still very useful to study because calculations are easier to complete and they help build intuition. So we will not shy away from showing you maximally symmetric spacetimes.

What are the maximally symmetric spacetimes? We need to specify the spacetime signature⁶ in order to get started on this discussion. In Euclidean signature, Riemannian manifolds with maximal symmetry are (up to local isometry) either: Euclidean space $\mathbb{R}^D$, the sphere $S^D$, or hyperbolic space $H^D$. In Lorentzian signature, there are also three options, and they split up according to the value of the cosmological constant $\Lambda$. When $\Lambda = 0$, we get Minkowski space $\mathbb{R}^{d,1}$. For $\Lambda < 0$ we get Anti de Sitter spacetime (AdS), and for $\Lambda > 0$ we get de Sitter spacetime (dS).

Recall that Minkowski spacetime is invariant under $d + 1$ translations, $d(d-1)/2$ rotations, and $d$ boosts. Adding the numbers together gives a total of

$$(d + 1) + \frac{1}{2} d(d - 1) + d = \frac{1}{2}(d + 1)(d + 2) = \frac{1}{2} D(D + 1) \tag{5.32}$$

symmetries. We therefore say that a spacetime manifold of dimension $D$ is maximally symmetric if it possesses $D(D + 1)/2$ independent symmetries.
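The counting in (5.32) is easy to check mechanically. The short sketch below (the function name is an illustrative assumption) adds up translations, rotations, and boosts for a range of spatial dimensions $d$ and compares with $D(D+1)/2$ where $D = d + 1$.

```python
def minkowski_symmetry_count(d):
    """Number of independent isometries of Minkowski space with d spatial dimensions."""
    translations = d + 1
    rotations = d * (d - 1) // 2
    boosts = d
    return translations + rotations + boosts

# Compare against D(D+1)/2 with D = d + 1 for several dimensions.
for d in range(1, 11):
    D = d + 1
    assert minkowski_symmetry_count(d) == D * (D + 1) // 2

print(minkowski_symmetry_count(3))  # four-dimensional Minkowski space
```

For $d = 3$ this reproduces the ten generators of the Poincaré group.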

What equation should the Riemann tensor obey in maximally symmetric spacetimes? It had better be invariant under local Lorentz transformations, because there is no preferred direction in spacetime. There are only a very few tensors which we can use: $g_{\mu\nu}$ and $\epsilon_{\mu_1 \ldots \mu_D}$. The epsilon tensor turns out to have the wrong symmetry properties to build Riemann components, and the metric ends up the winner. The sole combination of metric tensor components that possesses the right symmetries to be Riemann is antisymmetric, and tracing gives the constant of proportionality,

$$R_{\rho\sigma\mu\nu} = \frac{-R}{D(D - 1)} (g_{\rho\mu} g_{\sigma\nu} - g_{\rho\nu} g_{\sigma\mu}) . \tag{5.33}$$

The Ricci scalar R is a constant over the manifold for maximally symmetric cases.

⁶ If we were in a mathematically picky mood, we would also want to specify the spacetime topology.


Anti de Sitter spacetime $\mathrm{AdS}_{D=d+1}$ can be embedded in a Minkowski spacetime of one higher dimension $\mathbb{R}^{d,2}$, via

$$-(t^1)^2 - (t^2)^2 + (x^1)^2 + \ldots + (x^d)^2 = -L^2 \tag{5.34}$$

where $L$ is the radius of curvature of the $\mathrm{AdS}_D$. There are several different coordinate systems in common usage for $\mathrm{AdS}_D$. One of the most useful is global coordinates, in which

$$t^1 = L \cosh\rho \cos\tau , \tag{5.35}$$

$$t^2 = L \cosh\rho \sin\tau , \tag{5.36}$$

$$x^i = L \sinh\rho \; \hat{x}^i , \quad \text{where} \quad \sum_{i=1}^{d} (\hat{x}^i)^2 = 1 . \tag{5.37}$$

In general dimension, spherical coordinates are defined via

$$\hat{x}^1 = \cos\theta_1 , \tag{5.38}$$

$$\hat{x}^2 = \sin\theta_1 \cos\theta_2 , \tag{5.39}$$

$$\ldots \tag{5.40}$$

$$\hat{x}^{d-1} = \sin\theta_1 \ldots \sin\theta_{d-2} \cos\theta_{d-1} , \tag{5.41}$$

$$\hat{x}^{d} = \sin\theta_1 \ldots \sin\theta_{d-2} \sin\theta_{d-1} . \tag{5.42}$$
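Before computing the induced metric, it is worth checking that the global coordinates really do land on the embedding hyperboloid. The sketch below builds the unit vector $\hat{x}^i$ from the angles and verifies both its normalization and the constraint $-(t^1)^2 - (t^2)^2 + \sum_i (x^i)^2 = -L^2$. The function names and numerical values are illustrative assumptions.

```python
import math

def unit_vector(angles):
    """Point on the unit sphere S^{d-1} built from d-1 angles in the nested sine/cosine pattern."""
    components = []
    sin_product = 1.0
    for theta in angles:
        components.append(sin_product * math.cos(theta))
        sin_product *= math.sin(theta)
    components.append(sin_product)  # last component has sines only
    return components

def ads_embedding(L, rho, tau, angles):
    """Embedding coordinates (t1, t2, [x1..xd]) of a point of AdS in global coordinates."""
    n = unit_vector(angles)
    t1 = L * math.cosh(rho) * math.cos(tau)
    t2 = L * math.cosh(rho) * math.sin(tau)
    xs = [L * math.sinh(rho) * ni for ni in n]
    return t1, t2, xs

L = 2.5
angles = [0.4, 1.1, 2.3]          # three angles, so d = 4 and D = 5
t1, t2, xs = ads_embedding(L, rho=0.9, tau=1.7, angles=angles)

norm = sum(n * n for n in unit_vector(angles))
quadric = -t1 ** 2 - t2 ** 2 + sum(x * x for x in xs)
print(norm, quadric, -L ** 2)
```

The constraint holds for any $\rho$, $\tau$, and angles because $\cosh^2\rho - \sinh^2\rho = 1$ and $\cos^2\tau + \sin^2\tau = 1$.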

You can check yourself, either by hand or using Maxima, that the resulting line element of $\mathrm{AdS}_D$ in global coordinates is

$$ds^2 = L^2 \left( \cosh^2\rho \, d\tau^2 - d\rho^2 - \sinh^2\rho \, d\Omega^2_{d-1} \right) . \tag{5.43}$$

With a further coordinate transformation in time and radius,

$$r = L \sinh\rho , \tag{5.44}$$

$$t = L \tau , \tag{5.45}$$

we obtain

$$ds^2 = \left( 1 + \frac{r^2}{L^2} \right) dt^2 - \left( 1 + \frac{r^2}{L^2} \right)^{-1} dr^2 - r^2 d\Omega^2_{d-1} . \tag{5.46}$$

The scale $L$ is the radius of curvature, and it sets the scale for all the physics in $\mathrm{AdS}_D$.

If you computed the covariant Laplacian of a scalar field $\Psi$, $\nabla^\mu \nabla_\mu \Psi$, you would find a qualitatively different partial differential equation than you know from flat Minkowski spacetime. Indeed, in $\mathrm{AdS}_D$, higher angular momentum partial waves do not fall off with distance as you go out in radius like they do in Minkowski spacetime. This means that observers at asymptotic infinity can actually see what is happening in detail in the interior of $\mathrm{AdS}_D$. This non-falloff of partial waves is one of the key aspects of the physics of $\mathrm{AdS}_D$ that led to the celebrated $\mathrm{AdS}_D/\mathrm{CFT}_d$ correspondence of string theory.

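de Sitter spacetime $\mathrm{dS}_D$ can be embedded in $\mathbb{R}^{D,1}$ via

$$t^1 = \sqrt{L^2 - r^2} \, \sinh(t/L) , \tag{5.47}$$

$$x^i = r \, \hat{x}^i , \quad \text{where} \quad \sum_{i=1}^{d} (\hat{x}^i)^2 = 1 , \tag{5.48}$$

$$x^D = \sqrt{L^2 - r^2} \, \cosh(t/L) . \tag{5.49}$$

With these definitions, $-(t^1)^2 + \sum_i (x^i)^2 + (x^D)^2 = L^2$, as required for the de Sitter hyperboloid.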


This gives rise to static coordinates (like AdS, dS can alternatively be sliced with flat, positively curved, or negatively curved spatial sections). In static coordinates, the de Sitter line element becomes

$$ds^2 = \left( 1 - \frac{r^2}{L^2} \right) dt^2 - \left( 1 - \frac{r^2}{L^2} \right)^{-1} dr^2 - r^2 d\Omega^2_{d-1} . \tag{5.50}$$

This has a cosmological horizon at $r = L$. We will not have time to develop the similarities between cosmological horizons and black hole horizons in this course.

5.4 Einstein’s equations

In plain language, Einstein's equations express the fact that matter tells spacetime how to curve and spacetime tells matter how to move. Later in the course, I will show you how to derive Einstein's equations of General Relativity. For now, we will just write them down for you and show you how to use them. They relate a geometrical quantity on the left hand side, built out of the Riemann curvature tensor, to an energy-momentum tensor of any matter fields in the physical system containing gravitation as well. In tensor notation, they read as follows,

$$R_{\alpha\beta} - \frac{1}{2} g_{\alpha\beta} R + \Lambda g_{\alpha\beta} = -8\pi G_N T_{\alpha\beta} . \tag{5.51}$$

The quantity $\Lambda$ is known as the cosmological constant. (Note: you can put back the powers of $c$ very easily by recruiting dimensional analysis.)

A very important characteristic of Einstein's equations is that they are nonlinear. You can see this by eye by recalling the formula for Christoffels in terms of metric derivatives, which is nonlinear, as well as the formula for the Riemanns in terms of derivatives of Christoffels and contractions of Christoffels, which is also nonlinear. Nonlinearity makes GR qualitatively very different from Newtonian gravity. It is only in the Newtonian limit of GR that the linearity with which you are familiar emerges and shows itself as the superposition principle for the Newtonian potential $\Phi(x)$. For generic situations in GR, nonlinearity is present in the partial differential equations for the evolution of spacetime. The mathematics of nonlinear PDEs is hugely complicated compared to linear ones, and for generic spacetimes often no general statements can be made. Symmetry helps enormously with the task of trying to solve the differential equations, classify spacetimes, or find their geodesics.

The energy-momentum tensor on the RHS of Einstein's equations is covariantly conserved. The way to see this is to take covariant derivatives of both sides of the Einstein equations. The Einstein tensor is defined as

$$G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R . \tag{5.52}$$

Notice that this is denoted with a big-$G_{\mu\nu}$, rather than the small-$g_{\mu\nu}$ metric or the $G_N$ denoting the Newton gravitational constant. By itself, the rank $(0, 2)$ Einstein tensor $G_{\mu\nu}$ does not look like much. But it obeys an extremely useful identity by virtue of the Bianchi identity for the Riemann tensor. To see this, let us take the first form of our Bianchi identity and contract with two factors of the upstairs metric,

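$$0 = g^{\nu\sigma} g^{\mu\lambda} \left( \nabla_\lambda R_{\rho\sigma\mu\nu} + \nabla_\rho R_{\sigma\lambda\mu\nu} + \nabla_\sigma R_{\lambda\rho\mu\nu} \right) \tag{5.53}$$

$$\phantom{0} = \nabla^\mu R_{\rho\mu} - \nabla_\rho R + \nabla^\nu R_{\rho\nu} . \tag{5.54}$$

Rearranging this expression gives a relationship between the covariant derivative of the Ricci tensor and the covariant derivative of the Ricci scalar,

$$\nabla^\mu R_{\rho\mu} = \frac{1}{2} \nabla_\rho R . \tag{5.55}$$

This identity is handy because it enables us to prove that

$$\nabla^\mu G_{\mu\nu} = 0 . \tag{5.56}$$

In other words, the Einstein tensor is covariantly conserved. We also have the metric compatibility condition on our affine connection,

$$\nabla_\sigma g_{\mu\nu} = 0 . \tag{5.57}$$

Then, taking the covariant divergence of both sides of Einstein's equations, we have

$$\nabla^\mu T^{\rm matter}_{\mu\nu} = 0 . \tag{5.58}$$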

Covariant conservation of the energy-momentum tensor in GR is mandatory, not voluntary.

6 Black holes

6.1 Birkhoff’s theorem and the Schwarzschild solution

Let us now attack the question of solving the vacuum Einstein equations when we have a static, spherically symmetric spacetime. After a bit of work, we will be able to show that the Schwarzschild black hole possessing mass $M$ is the unique solution.

Our methodology follows that of Carroll §5.2, and will involve a few steps. We will first use spherical symmetry to constrain the possible metric components that might be turned on. Then we will use the vacuum Einstein equations to prove that the time dependence must drop out. Then we will solve the remaining vacuum Einstein equations, and we will obtain the Schwarzschild solution. The last piece of the puzzle will be provided by the Newtonian limit, which will connect a mathematically arbitrary constant of integration to the physical quantity $G_N M$, where $M$ is the mass of the Schwarzschild geometry and $G_N$ is the Newton constant, which has dimensions of $\text{length}^{D-2}$ and parametrizes the strength of gravity.

First, let us discuss the definition of a static spacetime in Lorentzian signature. Calling the timelike coordinate $x^0$, we define a static spacetime as one for which (a) there is no explicit time dependence in the metric and (b) the invariant interval possesses time reversal invariance,

$$\partial_{x^0} \left( g_{\mu\nu}(x^\lambda) \right) = 0 , \tag{6.1}$$

$$ds^2 \ \text{invariant under} \ x^0 \to -x^0 . \tag{6.2}$$


A spacetime that only obeys the first condition is called a stationary spacetime. In essence, a static spacetime basically does nothing at all over time, while a stationary spacetime does exactly the same thing at all times. Note that staticity requires that there be no time-space cross terms in the invariant interval, only time-time and space-space components.

Isotropy is also a big requirement. Having this much symmetry eliminates a lot of possibly independent components of the metric tensor. In particular, writing in terms of either Cartesian coordinates $\vec{x}$ or spherical polar coordinates $r, \theta, \phi$, we can only use three ingredients,

$$\vec{x} \cdot \vec{x} = r^2 , \qquad d\vec{x} \cdot d\vec{x} = dr^2 + r^2 d\Omega_2^2 , \qquad \vec{x} \cdot d\vec{x} = r \, dr , \tag{6.3}$$

where

$$d\Omega_2^2 = d\theta^2 + \sin^2\theta \, d\phi^2 . \tag{6.4}$$

Any other thing we could build from the available ingredients would not respect spherical symmetry.

Given the spherical symmetry of our ansatz, it is traditional to use spherical polar coordinates, in which the metric on the $S^2$ is round throughout the spacetime. For now, we will allow the metric to have time dependence, but bear in mind that shortly we will find it is disallowed by the Einstein equations. We write the metric as

$$ds^2 = e^{2\alpha''(t',r')} (dt')^2 - e^{2\beta''(t',r')} (dr')^2 - 2 e^{2\gamma''(t',r')} dt' dr' - e^{2\delta''(t',r')} (r')^2 d\Omega_2^2 . \tag{6.5}$$

Next, we can change to a new radial coordinate $r(t', r')$ by

$$r^2 = (r')^2 e^{2\delta''(t',r')} \tag{6.6}$$

and adjust the definitions of all functions dependent on time and radius accordingly, to new functions, single primed,

$$ds^2 = e^{2\alpha'(t',r)} (dt')^2 - e^{2\beta'(t',r)} dr^2 - 2 e^{2\gamma'(t',r)} dt' dr - r^2 d\Omega_2^2 . \tag{6.7}$$

In order to be able to get rid of the $2 \, dt' dr$ term in this line element, we are going to have to work harder. Let us start by trying the easiest-looking trick,

$$dt \overset{??}{=} e^{2\alpha(t',r)} dt' - e^{2\gamma(t',r)} dr . \tag{6.8}$$

If you try to follow this path further, you will find that second mixed partial derivatives of the new $t$ coordinate w.r.t. the old coordinates fail to commute, so the equation (6.8) above is inconsistent⁷. Our simple trick failed. As you may recall from the general theory of ODEs, the right strategy is to recruit an integrating factor, which must be a function of both $t'$ and $r$: $\Phi(t', r)$. We define a new time coordinate $t(t', r)$ by

$$dt = e^{2\Phi(t,r)} \left[ e^{2\alpha(t',r)} dt' - e^{2\gamma(t',r)} dr \right] . \tag{6.9}$$

⁷ To see a simple example of how this process works when done right, try transforming from Cartesian coordinates $(x, y)$ on the plane to polar coordinates $(r, \theta)$, and checking that mixed second partials commute.


Note that the very explicit factor of $e^{2\Phi(t,r)}$ in front of the $[\ldots]$ parts we wanted is designed precisely such that the right hand side of the above expression is an exact differential. This is absolutely necessary, because for any newly defined coordinates, second mixed partial derivatives w.r.t. the old coordinates must commute! Then, using the above equations, we have

$$e^{-4\Phi} dt^2 = e^{4\alpha} (dt')^2 - 2 e^{2(\alpha+\gamma)} dt' dr + e^{4\gamma} dr^2 . \tag{6.10}$$

Rearranging this and forming our $(dt')^2$ and $2 \, dt' dr$ pieces gives

$$e^{2\alpha(t',r)} (dt')^2 - 2 e^{2\gamma(t',r)} dt' dr = e^{-2\alpha(t,r) - 4\Phi(t,r)} dt^2 - e^{-2\alpha(t,r) + 4\gamma(t,r)} dr^2 . \tag{6.11}$$

So redefining our metric ansatz functions according to

$$e^{2\alpha} = e^{-2\alpha' - 4\Phi'} , \qquad e^{2\beta} = e^{2\beta'} + e^{-2\alpha' + 4\gamma'} \tag{6.12}$$

gives

$$ds^2 = e^{2\alpha(t,r)} dt^2 - e^{2\beta(t,r)} dr^2 - r^2 d\Omega_2^2 . \tag{6.13}$$

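The point of all this wrestling with differentials was to show that we can always choose a coordinate system in which off-diagonal metric components are absent, even if our spherically symmetric system is time dependent.

Our next task is going to be to show that the time dependence in the metric functions also has to drop out. For this part, we will need to use the equations of motion for the metric tensor field on spacetime. For the Einstein equations, we need to compute Christoffels to get Riemanns which we can then contract to get Ricci components, e.g. via Maxima. We get

$$\begin{aligned}
\Gamma^t_{tt} &= \partial_t \alpha , & \Gamma^t_{tr} &= \partial_r \alpha , \\
\Gamma^t_{rr} &= e^{2(\beta - \alpha)} \partial_t \beta , & \Gamma^r_{tt} &= e^{2(\alpha - \beta)} \partial_r \alpha , \\
\Gamma^r_{tr} &= \partial_t \beta , & \Gamma^r_{rr} &= \partial_r \beta , \\
\Gamma^r_{\theta\theta} &= -r e^{-2\beta} , & \Gamma^r_{\phi\phi} &= -r \sin^2\theta \, e^{-2\beta} , \\
\Gamma^\theta_{r\theta} &= \frac{1}{r} , & \Gamma^\theta_{\phi\phi} &= -\sin\theta \cos\theta , \\
\Gamma^\phi_{r\phi} &= \frac{1}{r} , & \Gamma^\phi_{\theta\phi} &= \frac{\cos\theta}{\sin\theta} .
\end{aligned} \tag{6.14}$$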

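Note the pieces involving $\partial_t$ in the above equation: they show clearly the effect of allowing time dependence. For the Ricci tensor, we obtain

$$\begin{aligned}
R_{tt} &= e^{2(\alpha - \beta)} \left[ (\partial_r^2 \alpha) + (\partial_r \alpha)^2 - \partial_r \alpha \, \partial_r \beta + \frac{2}{r} (\partial_r \alpha) \right] - \left[ (\partial_t^2 \beta) - (\partial_t \alpha)(\partial_t \beta) + (\partial_t \beta)^2 \right] , \\
R_{tr} &= \frac{2}{r} (\partial_t \beta) , \\
R_{rr} &= \left[ -(\partial_r^2 \alpha) - (\partial_r \alpha)^2 + (\partial_r \alpha)(\partial_r \beta) + \frac{2}{r} (\partial_r \beta) \right] + e^{2(\beta - \alpha)} \left[ (\partial_t^2 \beta) + (\partial_t \beta)^2 - (\partial_t \alpha)(\partial_t \beta) \right] , \\
R_{\theta\theta} &= \left[ e^{-2\beta} \left( r \partial_r \beta - r \partial_r \alpha - 1 \right) + 1 \right] , \\
R_{\phi\phi} &= \sin^2\theta \, R_{\theta\theta} .
\end{aligned} \tag{6.15}$$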

All these components must be zero for us to have a solution of the vacuum Einstein equations. First, have a look at $R_{tr}$. This must be zero, which demands of $\beta(t, r)$ that

$$\partial_t \beta(t, r) = 0 \quad \Rightarrow \quad \beta = \beta(r) . \tag{6.16}$$

You can see by looking for the $\partial_t \beta$ parts in the Riccis that many terms now drop out completely because $\beta$ is a function of $r$ only. Obviously, this simplifies our life quite a lot!

Second, let us notice that the $R_{\theta\theta} = 0$ equation (a first order constraint equation) is relatively simple. Let us take a time derivative of it,

$$\partial_t (R_{\theta\theta}) = 0 = -2 (\partial_t \beta) e^{-2\beta} \left[ r \partial_r \beta - r \partial_r \alpha - 1 \right] + e^{-2\beta} \left[ r \partial_t \partial_r (\beta - \alpha) \right] . \tag{6.17}$$

But since $\partial_t \beta = 0$ by our $R_{tr} = 0$ equation, we have

$$e^{-2\beta} r \, \partial_t \partial_r (\beta - \alpha) = 0 . \tag{6.18}$$

Then, using what we know about $\beta(r)$, we can partially integrate to get

$$\alpha(t, r) = f(r) + g(t) . \tag{6.19}$$

Notice how the only remaining place where we have time dependence is in the $tt$ component of the metric. What a stroke of luck! This means that we can absorb it simply by doing a coordinate transformation involving only time (not radius or angular coordinates),

$$d\tilde{t} = dt \, e^{g(t)} . \tag{6.20}$$

Let us redefine our time coordinate to correspond to this $\tilde{t}$ (we drop the tilde, for notational clarity). Then we have

$$\beta = \beta(r) , \qquad \alpha = \alpha(r) . \tag{6.21}$$

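Third, let us look at the remaining (more complex) $tt$ and $rr$ Einstein equations,

$$0 = \left[ (\partial_r^2 \alpha) + (\partial_r \alpha)^2 - \partial_r \alpha \, \partial_r \beta + \frac{2}{r} (\partial_r \alpha) \right] , \tag{6.22}$$

$$0 = \left[ -(\partial_r^2 \alpha) - (\partial_r \alpha)^2 + (\partial_r \alpha)(\partial_r \beta) + \frac{2}{r} (\partial_r \beta) \right] . \tag{6.23}$$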


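By simply adding these equations together, we obtain

$$\partial_r (\alpha + \beta) = 0 . \tag{6.24}$$

This means that

$$\beta(r) = \text{const} - \alpha(r) . \tag{6.25}$$

The constant can be set to zero by rescaling the time coordinate. Fourth, we can plug this expression for $\beta(r)$ in terms of $\alpha(r)$ back in to the $R_{\theta\theta} = 0$ Einstein equation to obtain

$$\left[ 2r \partial_r \alpha + 1 \right] e^{2\alpha} = 1 . \tag{6.26}$$

By quick inspection you can see that this becomes

$$\partial_r \left( r e^{2\alpha(r)} \right) = 1 , \tag{6.27}$$

so that

$$r e^{2\alpha(r)} = r + c_1 , \tag{6.28}$$

where $c_1$ is a mathematically arbitrary constant. Dividing by $r$ gives

$$e^{2\alpha(r)} = 1 + \frac{c_1}{r} . \tag{6.29}$$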
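As a sanity check on this last step, the sketch below plugs $e^{2\alpha} = 1 + c_1/r$ back into $[2r\partial_r\alpha + 1]e^{2\alpha} = 1$ and evaluates the residual numerically at a few radii. The value of $c_1$, the sample radii, and the function names are illustrative assumptions; the derivative is a central finite difference.

```python
import math

def alpha(r, c1):
    """alpha(r) defined by exp(2 alpha) = 1 + c1 / r (requires 1 + c1/r > 0)."""
    return 0.5 * math.log(1.0 + c1 / r)

def d_alpha_dr(r, c1, h=1e-6):
    """Central finite-difference approximation to d(alpha)/dr."""
    return (alpha(r + h, c1) - alpha(r - h, c1)) / (2.0 * h)

def residual(r, c1):
    """Left side minus right side of [2 r alpha' + 1] exp(2 alpha) = 1."""
    return (2.0 * r * d_alpha_dr(r, c1) + 1.0) * math.exp(2.0 * alpha(r, c1)) - 1.0

c1 = -2.0                       # illustrative negative constant
radii = [3.0, 5.0, 10.0, 50.0]  # all chosen so that 1 + c1/r > 0
residuals = [residual(r, c1) for r in radii]
print(residuals)
```

The residuals vanish to within finite-difference error for any value of $c_1$, which is the statement that $c_1$ is a free constant of integration at this stage.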

We are nearly done, but we need one more physical ingredient. We need to know the physical meaning of $c_1$, because it is what controls all the nontrivial radial dependence in our new static spherically symmetric metric satisfying the vacuum Einstein equations. This is where the Newtonian limit comes to our rescue. We know that in regions of weak gravity, far away from the centre of our spacetime as $r \to \infty$, $g_{tt}$ should take the form $g_{tt} \simeq 1 + 2\Phi/c^2$, where $\Phi = -G_N M / r$. This fixes our arbitrary constant of integration to be $c_1 = -2 G_N M$ (with $c = 1$).

Therefore, we finally obtain the famous Schwarzschild metric in four⁸ spacetime dimensions:

$$ds^2 = \left( 1 - \frac{2 G_N M}{r} \right) dt^2 - \left( 1 - \frac{2 G_N M}{r} \right)^{-1} dr^2 - r^2 d\Omega_2^2 . \tag{6.30}$$

Birkhoff's Theorem says that this is the unique spherically symmetric solution of the vacuum Einstein equations, and in particular that it is necessarily static. We sketched a proof of this en route, when we found that the Einstein equations would not allow time dependence. Note that in the solution we see $G_N$, which is a theory parameter, and $M$, which is a solution parameter.

One of the two most physically intriguing things about this solution, in this coordinate system, is that there is a place where $g_{rr}$ blows up (and $g_{tt}$ goes to zero). This is known as the event horizon. It is located at the Schwarzschild radius

$$r_S = \frac{2 G_N M}{c^2} , \tag{6.31}$$

where I have temporarily shown the factors of $c$ for physical clarity. Try calculating your own Schwarzschild radius. You do not fit inside this radius, so you are not a black hole.
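To get a feel for the numbers, here is a minimal sketch that evaluates $r_S = 2 G_N M / c^2$ in SI units. The constants are rounded textbook values, and the 70 kg body mass is an illustrative assumption.

```python
G_N = 6.674e-11    # Newton constant in m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light in m/s (rounded)
M_SUN = 1.989e30   # solar mass in kg (rounded)

def schwarzschild_radius(mass_kg):
    """Schwarzschild radius in metres for a given mass in kilograms."""
    return 2.0 * G_N * mass_kg / C ** 2

r_person = schwarzschild_radius(70.0)   # an illustrative 70 kg person
r_sun = schwarzschild_radius(M_SUN)

print(f"person: {r_person:.2e} m")
print(f"Sun:    {r_sun / 1e3:.2f} km")
```

The Sun comes out at roughly 3 km, while a person comes out around $10^{-25}$ m, some ten orders of magnitude smaller than a proton.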

⁸ Cousins of Schwarzschild which are asymptotically flat are available in various dimensions for $D > 3$. They have $1/r^{D-3}$ dependence rather than $1/r$ dependence in the metric.


The second physically intriguing thing about this solution of Einstein's equations is that it has a singularity at $r = 0$ that is not just a coordinate singularity. It is a truly physical singularity, as you can see by computing a curvature invariant like Riemann squared,

$$R^{\mu\nu\sigma\rho} R_{\mu\nu\sigma\rho} = \frac{48 (G_N M)^2}{r^6} , \tag{6.32}$$

which diverges at $r = 0$. Notice that at $r = r_S = 2 G_N M$, this curvature invariant is small in Planck units for a big black hole,

$$\ell_P^4 \, R^{\mu\nu\sigma\rho} R_{\mu\nu\sigma\rho} (r = r_S) = \frac{12 \, \ell_P^4}{r_S^4} . \tag{6.33}$$

Conversely, for a Planck mass black hole, the curvature would be Planckian.

A third aspect of this solution should also grab your physical interest: the nonlinearity of gravity manifest in it. Nonlinearity is what allows there to be a nontrivial solution of the vacuum Einstein equations (ones with $T_{\mu\nu} = 0$) at all. Compare to Newtonian gravity, where a zero mass on the RHS of the Laplace equation would result in a zero Newtonian potential!

Mathematically, the mass $M$ of a classical Schwarzschild black hole might in principle take any value from $-\infty$ to $+\infty$, because $r_S$ arose as a mere constant of integration of Einstein's equations. However, physically there are limits to what the mass can be. For starters, $M$ must be finite for a physically reasonable solution. More importantly, the mass must be nonnegative, $M \geq 0$, because the singularity is not covered by a horizon if the Schwarzschild radius is negative! When $M < 0$, the gravitational redshift also walks off into the complex plane, and we are in trouble interpreting what the heck our spacetime might mean. So already at the classical level, we can imagine why taking our black holes to have non-negative mass is a sensible physical precaution. (Of course, the case $M = 0$ is Minkowski spacetime.)

There is a more sophisticated argument available for mass non-negativity that takes into account quantum corrections to classical gravity, first made in 1995 by G.T. Horowitz and R.C. Myers. They argued that if a negative-mass black hole solution were physical, in the sense that quantum gravity corrections somehow 'fixed up' the negative-mass naked singularity into some physical blob with large-but-finite curvature, then the vacuum of quantum gravity would be unstable. Their logic was this: we could reduce the energy of our system from the vacuum state by simply pair-producing more and more blob-antiblob pairs. This works because each blob has negative energy and so does each anti-blob! The existence of negative-mass 'black holes' would therefore thoroughly destabilize the vacuum of quantum gravity, which is the foundation upon which we lay excitations of quantum fields describing the fluctuating degrees of freedom of the system. The result would be a horrible, physically inconsistent mess.

The moral of this mass positivity story is this: do not trust that every mathematical solution of a physically interesting set of PDEs is physical. We must also check that physical boundary conditions are obeyed, and ensure that basic physical principles like stability of the vacuum are preserved. This is why we assume henceforth that $M_{\rm BH} \geq 0$.


6.2 TOV equation for a star

Let us now see what changes when we allow an energy-momentum tensor in our static, spherically symmetric spacetime. The simplest kind of thing to consider is called a perfect fluid. What is a perfect fluid? Physically, it is a kind of spherical cow approximation, in which we model a system like the ball of gas we call our Sun by a simple macroscopic fluid, described only by its proper energy density $\rho$ and pressure $p$ in the instantaneous rest frame. We ignore shear viscosity, bulk viscosity, and heat conduction. For a perfect fluid, the energy-momentum tensor $T_{\mu\nu}$ can be written in the form

$$T^{\rm p.f.}_{\mu\nu} = \left( \rho + \frac{p}{c^2} \right) U_\mu U_\nu - p \, g_{\mu\nu} . \tag{6.34}$$

This obeys the conservation equation

$$\nabla^\mu T^{\rm p.f.}_{\mu\nu} = 0 . \tag{6.35}$$

In flat Minkowski spacetime in Cartesian coordinates, in the Newtonian limit, conservation of energy-momentum can be seen to reduce to (a) the continuity equation, and (b) the Euler equation, the classical equation of motion for a perfect fluid. For details, see §8.3 of HEL. Here, we work in curved spacetime so our story is more involved.

As you can check, in comoving coordinates, we have only the time component of the 4-velocity and its magnitude is set by the timelike condition $U^\mu U_\mu = 1$. Then the Einstein equations for our static, spherically symmetric star involve only radial dependence, and they are (with $c = 1$)

$$\frac{1}{r^2} e^{-2\beta} \left[ 2r \partial_r \beta - 1 + e^{2\beta} \right] = 8\pi G_N \rho , \tag{6.36}$$

$$\frac{1}{r^2} e^{-2\beta} \left[ 2r \partial_r \alpha + 1 - e^{2\beta} \right] = 8\pi G_N p , \tag{6.37}$$

$$e^{-2\beta} \left[ \partial_r^2 \alpha + (\partial_r \alpha)^2 - \partial_r \alpha \, \partial_r \beta + \frac{1}{r} (\partial_r \alpha - \partial_r \beta) \right] = 8\pi G_N p . \tag{6.38}$$

Note very carefully here the difference between $\rho(r)$ and $p(r)$. Make sure you write $\rho$s in your own handwritten notes in such a way that they are easily distinguishable from $p$s.

Now, we have a set of three coupled ODEs in $\alpha(r), \beta(r), \rho(r), p(r)$. Without some physical input there are not enough equations to solve the system. But we can do it if we recruit conservation of the energy-momentum tensor $\nabla^\mu T_{\mu\nu} = 0$ and provide an equation of state.

The $tt$ Einstein equation is a function of $\beta$ only: it does not involve $\alpha$. This allows us to define a mass function $m(r)$ such that

$$e^{2\beta} = \left( 1 - \frac{2 G_N m(r)}{r} \right)^{-1} . \tag{6.39}$$

Then in terms of $m(r)$ rather than $\beta(r)$, the $tt$ Einstein equation becomes $dm/dr = 4\pi r^2 \rho(r)$, which we can immediately integrate to

$$m(r) = 4\pi \int_0^r dr' \, r'^2 \rho(r') . \tag{6.40}$$


You might look at this formula and think "Oh! This is just the natural answer: you take the mass density and multiply by the surface area, and integrate radially." But that would be too quick, because the volume element in our curved spacetime metric is actually $dr \, d\theta \, d\phi \; r^2 \sin\theta \, e^{\beta(r)}$. So if we wanted to integrate the true energy density over the proper volume out to the surface of the star at $r = R$, we would instead calculate

$$\bar{M} = 4\pi \int_0^R dr \, \frac{r^2 \rho(r)}{\sqrt{1 - 2 G_N m(r)/r}} \tag{6.41}$$

and this is greater than $M = m(R)$ because of the binding energy (a concept which does make sense in GR for spherical stars).
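The difference between the two mass integrals is easy to see numerically. The sketch below assumes a toy star of constant density with total mass $M = m(R)$, units with $G_N = c = 1$, the factor $4\pi$ from the angular integration in both integrals, and a simple midpoint rule; all names and values are illustrative.

```python
import math

def masses(M, R, n=20000, G=1.0):
    """Return (m(R), proper-volume mass) for a constant-density star, by the midpoint rule."""
    rho = 3.0 * M / (4.0 * math.pi * R ** 3)   # constant density giving total mass M
    h = R / n
    m_total = 0.0
    m_proper = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        m_r = M * (r / R) ** 3                 # mass function m(r) for constant density
        shell = 4.0 * math.pi * r ** 2 * rho * h
        m_total += shell                       # flat-looking integral for m(R)
        m_proper += shell / math.sqrt(1.0 - 2.0 * G * m_r / r)  # includes exp(beta)
    return m_total, m_proper

M, R = 0.2, 1.0
m_total, m_proper = masses(M, R)
binding_energy = m_proper - m_total
print(m_total, m_proper, binding_energy)
```

The proper-volume integral comes out larger, and the difference is the (positive) gravitational binding energy of this toy star.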

The radial Einstein equation becomes

$$\frac{d\alpha}{dr} = \frac{G_N m(r) + 4\pi G_N r^3 \, p(r)}{r \left[ r - 2 G_N m(r) \right]} . \tag{6.42}$$

To get any further, we need to recruit energy-momentum tensor conservation. With only radial dependence, this gives

$$\left[ \rho(r) + p(r) \right] \frac{d\alpha(r)}{dr} = -\frac{dp(r)}{dr} , \tag{6.43}$$

which lets us eliminate $d\alpha/dr$ in favour of $dp/dr$. We obtain

$$\frac{dp(r)}{dr} = -\frac{\left[ \rho(r) + p(r) \right] \left[ G_N m(r) + 4\pi G_N r^3 \, p(r) \right]}{r \left[ r - 2 G_N m(r) \right]} . \tag{6.44}$$

This is the Tolman-Oppenheimer-Volkov equation for hydrostatic equilibrium in a star, for the static spherically symmetric case in 4D.

In order to actually solve the TOV equation, we need to know one more equation: the equation of state, which is a relationship p = p(ρ). For astrophysical systems, a polytropic equation of state is often employed, which takes the form p = Kρ^γ for some constants K, γ. As a toy model, we can consider an incompressible star with finite constant mass density ρ_* out to some radius R. Then the mass function is easily integrated, and M = 4πR³ρ_*/3. This in turn gives

$$ p(r) = \rho_* \, \frac{\sqrt{R^3 - r_S R^2} - \sqrt{R^3 - r_S r^2}}{\sqrt{R^3 - r_S r^2} - 3\sqrt{R^3 - r_S R^2}} , \tag{6.45} $$

where r_S = 2G_N M. Integrating again to find g_tt yields

$$ e^{\alpha(r)} = \frac{3}{2}\sqrt{1 - \frac{r_S}{R}} - \frac{1}{2}\sqrt{1 - \frac{r_S r^2}{R^3}} , \qquad r < R . \tag{6.46} $$
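The closed-form pressure profile (6.45) can be cross-checked against a direct numerical integration of the TOV equation (6.44). The following is a minimal sketch, not part of the original notes: it assumes units with G_N = c = 1, picks the illustrative values R = 1 and M = 0.25 (so r_S/R = 0.5), and uses a hand-written fourth-order Runge-Kutta step. The function names are hypothetical.

```python
import math

G_N = 1.0                                  # assumption: geometric units, G_N = c = 1
R = 1.0                                    # stellar radius (illustrative value)
M = 0.25                                   # total mass, chosen so that r_S / R = 0.5
R_S = 2.0 * G_N * M                        # Schwarzschild radius
RHO = 3.0 * M / (4.0 * math.pi * R**3)     # constant density rho_*


def mass(r):
    """Mass function (6.40) for constant density."""
    return 4.0 * math.pi * r**3 * RHO / 3.0


def dp_dr(r, p):
    """Right-hand side of the TOV equation (6.44)."""
    m = mass(r)
    numerator = (RHO + p) * (G_N * m + 4.0 * math.pi * G_N * r**3 * p)
    return -numerator / (r * (r - 2.0 * G_N * m))


def p_exact(r):
    """Closed-form pressure profile (6.45)."""
    a = math.sqrt(R**3 - R_S * R**2)
    b = math.sqrt(R**3 - R_S * r**2)
    return RHO * (a - b) / (b - 3.0 * a)


def integrate_inward(r_end, steps=2000):
    """RK4 integration of (6.44) from the surface, where p(R) = 0, down to r_end."""
    h = (r_end - R) / steps                # negative step: we march inward
    r, p = R, 0.0
    for _ in range(steps):
        k1 = dp_dr(r, p)
        k2 = dp_dr(r + h / 2.0, p + h * k1 / 2.0)
        k3 = dp_dr(r + h / 2.0, p + h * k2 / 2.0)
        k4 = dp_dr(r + h, p + h * k3)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        r += h
    return p


if __name__ == "__main__":
    for r_end in (0.75, 0.5, 0.25):
        print(r_end, integrate_inward(r_end), p_exact(r_end))
```

The two printed columns should agree to many digits. Starting the integration at the surface rather than the centre avoids having to guess the central pressure.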

The pressure increases towards the core, even though we have assumed absolute incompressibility of the fluid. In particular, as M approaches M_max = 4R/(9G_N) the pressure at the core goes to infinity, and for M > M_max there is no static solution at all. Oops! With our simplistic ansatz, we have managed to evolve ourselves outside the regime of validity of Einstein’s equations. Of course, real stars do not obey such a simplistic model as an incompressible fluid. Still, it is interesting that we can get the right order of magnitude estimate of when a star can be too big to be gravitationally stable. Sometimes, the stellar object collapses gravitationally into a black hole. If the initial configuration had no overall angular momentum, it will settle down eventually to a Schwarzschild solution. If it is rotating, it will settle down to the Kerr black hole, whose metric we will discuss soon.
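The critical mass quoted above can be read off directly from the pressure profile (6.45). The short derivation below is a sketch that uses only (6.45) and r_S = 2G_N M; the symbol p_c for the central pressure is introduced here purely for convenience.

```latex
\begin{align}
p_c \equiv p(0) &= \rho_* \, \frac{\sqrt{1 - r_S/R} - 1}{1 - 3\sqrt{1 - r_S/R}} , \\
p_c \to \infty \quad &\Longleftrightarrow \quad \sqrt{1 - \frac{r_S}{R}} = \frac{1}{3}
\quad \Longleftrightarrow \quad \frac{r_S}{R} = \frac{8}{9}
\quad \Longleftrightarrow \quad M = \frac{4R}{9G_N} .
\end{align}
```

For smaller masses both numerator and denominator are negative, so the central pressure is finite and positive.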

Stellar evolution produces different endpoints depending on the initial mass of the star in question. For small stars like ours, when they run out of fuel for nuclear fusion, they contract and become white dwarfs. If they are somewhat larger, above about 1.4 M_⊙, known as the Chandrasekhar limit, then electron degeneracy pressure is not sufficient to hold them up, and they collapse further to become a neutron star (a class that includes pulsars). Above about 3-4 M_⊙, known as the Oppenheimer-Volkoff limit, even neutron degeneracy pressure is not enough. Bigger stars collapse to produce black holes.

People like to categorize black holes by size. We can distinguish three basic classes by formation mechanism. Stellar mass black holes are produced by collapse of individual stars, and have masses of a few to a few hundred solar masses. We also have supermassive black holes at the centres of most galaxies, at millions to billions of solar masses. The third class is known as primordial black holes because the only way these smaller-mass objects could have been formed would have been in the Big Bang. The density of primordial black holes is small, if there were any at all to begin with, because of the period of inflation which grew the universe by gigantic amounts early in the history of its evolution, diluting them.

6.3 Geodesics of Schwarzschild

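We now move to studying geodesics in the Schwarzschild spacetime explicitly. The nonzero Christoffels for this geometry are

$$ \Gamma^t_{tr} = \frac{r_S}{2r(r - r_S)} ; \tag{6.47} $$

$$ \Gamma^r_{rr} = -\Gamma^t_{tr} , \quad \Gamma^r_{tt} = \frac{r_S}{2r^3}(r - r_S) , \quad \Gamma^r_{\theta\theta} = -(r - r_S) , \quad \Gamma^r_{\phi\phi} = \sin^2\theta \, \Gamma^r_{\theta\theta} ; \tag{6.48} $$

$$ \Gamma^\theta_{r\theta} = \frac{1}{r} , \quad \Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta ; \tag{6.49} $$

$$ \Gamma^\phi_{r\phi} = \frac{1}{r} , \quad \Gamma^\phi_{\theta\phi} = \frac{\cos\theta}{\sin\theta} , \tag{6.50} $$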

where rS is the Schwarzschild radius. Then our geodesic equations become

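$$ \frac{d^2 t}{d\lambda^2} + \frac{r_S}{r(r - r_S)} \frac{dt}{d\lambda}\frac{dr}{d\lambda} = 0 , \tag{6.51} $$

$$ \frac{d^2 r}{d\lambda^2} + \frac{r_S}{2r^3}(r - r_S)\left(\frac{dt}{d\lambda}\right)^2 - \frac{r_S}{2r(r - r_S)}\left(\frac{dr}{d\lambda}\right)^2 - (r - r_S)\left[\left(\frac{d\theta}{d\lambda}\right)^2 + \sin^2\theta\left(\frac{d\phi}{d\lambda}\right)^2\right] = 0 , \tag{6.52} $$

$$ \frac{d^2\theta}{d\lambda^2} + \frac{2}{r}\frac{d\theta}{d\lambda}\frac{dr}{d\lambda} - \sin\theta\cos\theta\left(\frac{d\phi}{d\lambda}\right)^2 = 0 , \tag{6.53} $$

$$ \frac{d^2\phi}{d\lambda^2} + \frac{2}{r}\frac{d\phi}{d\lambda}\frac{dr}{d\lambda} + 2\,\frac{\cos\theta}{\sin\theta}\frac{d\theta}{d\lambda}\frac{d\phi}{d\lambda} = 0 . \tag{6.54} $$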


These equations look rather formidable until you realize that finding the Killing vectors allows you to find first integrals of two out of four of the geodesic equations. This follows because ∂_t g_{μν} = 0 and ∂_φ g_{μν} = 0. We write the energy

$$ E = p_t = \left(1 - \frac{r_S}{r}\right)\frac{dt}{d\lambda} , \tag{6.55} $$

and the angular momentum

$$ L = p_\phi = r^2 \sin^2\theta \, \frac{d\phi}{d\lambda} . \tag{6.56} $$

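The next equation we can recruit is

$$ U^\mu U_\mu = \varepsilon , \tag{6.57} $$

where ε = 0 for null geodesics and ε = +1 for timelike geodesics. For either type of geodesic, we have g_{μν}U^μU^ν = ε, or

$$ \varepsilon = \left(1 - \frac{r_S}{r}\right)\left(\frac{dt}{d\lambda}\right)^2 - \left(1 - \frac{r_S}{r}\right)^{-1}\left(\frac{dr}{d\lambda}\right)^2 - r^2\left[\left(\frac{d\theta}{d\lambda}\right)^2 + \sin^2\theta\left(\frac{d\phi}{d\lambda}\right)^2\right] . \tag{6.58} $$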

Substituting in our conserved angular momentum L and energy E gives

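$$ \varepsilon = \left(1 - \frac{r_S}{r}\right)^{-1} E^2 - \left(1 - \frac{r_S}{r}\right)^{-1}\left(\frac{dr}{d\lambda}\right)^2 - r^2\left(\frac{d\theta}{d\lambda}\right)^2 - \frac{L^2}{r^2\sin^2\theta} . \tag{6.59} $$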

Our next step is a piece of physics input. We can use rotational symmetry to pick θ = π/2. It is consistent with the geodesic equations to leave dθ/dλ = 0 for all affine time. Then

$$ \left(\frac{dr}{d\lambda}\right)^2 = E^2 - \left(1 - \frac{r_S}{r}\right)\left(\varepsilon + \frac{L^2}{r^2}\right) . \tag{6.60} $$
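The claim that E, L and ε are first integrals can be tested numerically by integrating the second-order geodesic equations (6.51), (6.52) and (6.54) in the equatorial plane and watching whether the three quantities drift. The sketch below is an illustration only: it assumes r_S = 1 as the unit of length, a timelike orbit started at r = 10 with L = 2 and no radial velocity, and a hand-written RK4 integrator. All function names are hypothetical.

```python
import math

R_S = 1.0  # assumption: Schwarzschild radius used as the unit of length


def rhs(state):
    """Equatorial (theta = pi/2) geodesic equations (6.51), (6.52), (6.54)."""
    t, r, phi, tdot, rdot, phidot = state
    f = r - R_S
    tddot = -R_S / (r * f) * tdot * rdot
    rddot = (-R_S * f / (2.0 * r**3) * tdot**2
             + R_S / (2.0 * r * f) * rdot**2
             + f * phidot**2)
    phiddot = -2.0 / r * phidot * rdot
    return (tdot, rdot, phidot, tddot, rddot, phiddot)


def rk4_step(state, h):
    """One classical Runge-Kutta step of size h in the affine parameter."""
    def nudge(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(nudge(k1, h / 2.0))
    k3 = rhs(nudge(k2, h / 2.0))
    k4 = rhs(nudge(k3, h))
    return tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def invariants(state):
    """Energy (6.55), angular momentum (6.56) and norm (6.58) at theta = pi/2."""
    _, r, _, tdot, rdot, phidot = state
    g = 1.0 - R_S / r
    energy = g * tdot
    ang_mom = r**2 * phidot
    eps = g * tdot**2 - rdot**2 / g - r**2 * phidot**2
    return energy, ang_mom, eps


def initial_state(r0, ang_mom, eps=1.0):
    """Start at radius r0 with zero radial velocity; tdot is fixed by (6.58)."""
    g = 1.0 - R_S / r0
    phidot = ang_mom / r0**2
    tdot = math.sqrt((eps + r0**2 * phidot**2) / g)
    return (0.0, r0, 0.0, tdot, 0.0, phidot)


def max_drift(r0=10.0, ang_mom=2.0, h=0.02, steps=5000):
    """Largest absolute change in (E, L, eps) seen along the trajectory."""
    state = initial_state(r0, ang_mom)
    start = invariants(state)
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, h)
        now = invariants(state)
        worst = max(worst, max(abs(a - b) for a, b in zip(now, start)))
    return worst


if __name__ == "__main__":
    print("max drift in (E, L, eps):", max_drift())
```

A tiny drift, limited only by the integrator's truncation error, is what the Killing-vector argument predicts. Since ε is conserved, the radial equation (6.60) holds along the whole trajectory as well.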

Some textbooks like to help you visualize this setup by making a mapping onto a familiar non-relativistic Newtonian system, as follows,

$$ m \to 1 , \tag{6.61} $$

$$ |\vec{v}|^2 \to \left(\frac{dr}{d\lambda}\right)^2 , \tag{6.62} $$

$$ E_{\rm tot} \to \frac{E^2}{2} , \tag{6.63} $$

$$ V_{\rm eff}(r) \to \frac{1}{2}\left(1 - \frac{r_S}{r}\right)\left(\varepsilon + \frac{L^2}{r^2}\right) = \frac{\varepsilon}{2} - \frac{\varepsilon r_S}{2r} + \frac{L^2}{2r^2} - \frac{r_S L^2}{2r^3} . \tag{6.64} $$

You can learn everything you need to know about the availability of various types of orbits (for either null or timelike geodesics) by plotting this “effective potential”. Carroll has two great figures in §5.4, Figures 5.4 and 5.5.


For Newtonian gravity, there are no massless particle orbits. Massive particles can have stable bound orbits, depending on the angular momentum per unit mass.

For Einsteinian gravity, photons can orbit, but the orbits are unstable. Any small perturbation and the path flings off back out to infinity (sometimes after buzzing around the black hole horizon a few times) or falls inexorably into the black hole. Massive particles, on the other hand, can have bound orbits: the outer circular-orbit radius gives a stable orbit while the inner one gives an unstable orbit.

Circular orbits can happen when dV_eff/dr = 0 at r = r_*, solving the equation

$$ \frac{\varepsilon r_S}{2} r_*^2 - L^2 r_* + \frac{3 r_S}{2} L^2 \gamma = 0 , \tag{6.65} $$

where (following Carroll) we introduce a bookkeeping factor γ multiplying the 1/r³ term of V_eff, with γ = 1 for GR and γ = 0 for NG (Newtonian gravity). Specifically, for massless geodesics, r_* = 3r_S γ/2, and as you can see by evaluating the second derivative of V_eff(r), it is an unstable maximum. Massive geodesics provide a richer context. We find two solutions,

$$ \frac{r_*}{r_S} = \frac{L^2}{r_S^2} \pm \sqrt{\frac{L^4}{r_S^4} - \frac{3L^2\gamma}{r_S^2}} . \tag{6.66} $$

From this you can quickly see that NG has only one solution, at r_* = 2L²/r_S. But for GR the story is a lot more interesting. There are two solutions and, as you can see by computing the second derivative of the effective potential, the outer one is stable while the inner one is unstable. As you can discover by inspecting the negative root of eq. (6.66) carefully, for radii smaller than r = r_{*,ICO}, where

$$ r_{*,{\rm ICO}} = \frac{3r_S}{2} , \tag{6.67} $$

there are no circular orbits at all. Nothing can orbit that close without falling across the horizon. Gravity is too strong. The angular momentum at which the stable and unstable orbits coalesce for timelike geodesics is given by L⁴ = 3r_S²L²γ, i.e., where the discriminant in eq. (6.66) vanishes. The corresponding radius is called the ISCO, or the Innermost Stable Circular Orbit,

$$ r_{*,{\rm ISCO}} = 3r_S = 2r_{*,{\rm ICO}} . \tag{6.68} $$
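The circular-orbit analysis is easy to reproduce numerically. The sketch below codes up the effective potential (6.64) with Carroll's bookkeeping factor γ on the 1/r³ term, its first two derivatives, and the roots (6.66). It is an illustration under stated assumptions: r_S = 1 by default, the functions take L² rather than L to avoid rounding trouble at the ISCO, and all names are hypothetical.

```python
import math


def v_eff(r, l_sq, eps=1.0, r_s=1.0, gamma=1.0):
    """Effective potential (6.64); gamma = 1 for GR, gamma = 0 for Newtonian gravity."""
    return (0.5 * eps - eps * r_s / (2.0 * r)
            + l_sq / (2.0 * r**2) - gamma * r_s * l_sq / (2.0 * r**3))


def dv_eff(r, l_sq, eps=1.0, r_s=1.0, gamma=1.0):
    """First derivative of V_eff with respect to r."""
    return (eps * r_s / (2.0 * r**2)
            - l_sq / r**3 + 3.0 * gamma * r_s * l_sq / (2.0 * r**4))


def d2v_eff(r, l_sq, eps=1.0, r_s=1.0, gamma=1.0):
    """Second derivative of V_eff; its sign decides stability."""
    return (-eps * r_s / r**3
            + 3.0 * l_sq / r**4 - 6.0 * gamma * r_s * l_sq / r**5)


def circular_orbits(l_sq, r_s=1.0, gamma=1.0):
    """Timelike circular-orbit radii (inner, outer) from (6.66), or None if absent."""
    disc = l_sq**2 - 3.0 * gamma * r_s**2 * l_sq
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return (l_sq - root) / r_s, (l_sq + root) / r_s


if __name__ == "__main__":
    for l_sq in (2.0, 3.0, 4.0, 9.0):
        print("L^2 =", l_sq, "->", circular_orbits(l_sq))
```

With r_S = 1, the choice L² = 4 gives radii 2 and 6, L² = 3 gives the coalesced ISCO at 3, and L² < 3 gives no circular orbit. Letting L² grow pushes the inner root down towards 3/2.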

The following image is an artist’s rendition of what the black hole in the Large Magellanic Cloud might look like. It looks weird because you are not used to photon trajectories being bent. The strong and nonlinear gravitational effects of the black hole are quite extreme!


6.4 Causal structure of Schwarzschild

How do light cones behave in the spacetime of the Schwarzschild black hole? In the original Schwarzschild coordinates, we had the spacetime metric

$$ ds^2 = -\left(1 - \frac{r_S}{r}\right) dt^2 + \left(1 - \frac{r_S}{r}\right)^{-1} dr^2 + r^2 d\Omega_2^2 . \tag{6.69} $$

Obviously, we will have to suppress some of our four spacetime coordinates in order to fit a diagram onto a two-dimensional page. It will assist our visualizations to suppress the angular directions and focus attention on the time and radial directions. (Tip: be sure to double-check spacetime diagrams in textbooks to eliminate avoidable confusion over which coordinates are suppressed.)

For a null trajectory we have ds² = 0. For purely radial motion, we can immediately read off the slope of the light cone,

$$ \frac{dt}{dr} = \pm\left(1 - \frac{r_S}{r}\right)^{-1} . \tag{6.70} $$

As we would expect, the magnitude of this tends to unity at r → ∞. Light rays go at 45° on a (t, r) diagram. At slightly smaller radii, it increases a little. What happens at r → r_S you may not have expected: the magnitude of the slope of the light cone blows up! The light cone is squashed down to have zero opening angle. This is a coordinate singularity.

Inside the Schwarzschild radius, g_tt and g_rr both flip sign, seemingly switching roles. This is a symptom of the fact that this coordinate system does not actually cover the region of the black hole spacetime inside the horizon. Another symptom of the disease we see here is that it appears a photon would take an infinite amount of time to fall into the black hole. It does, in these coordinates.

Redshifting depends on the coordinate system. To do a better job of probing the causal structure of the Schwarzschild black hole spacetime, we are actually better off aiming to answer more invariant questions, like “How much affine parameter does it take before a freely falling particle hits the singularity?” We can also hunt for better coordinate systems which do cover the entire black hole spacetime, not just the region outside the horizon.

Let us start by inspecting what we have so far in our Schwarzschild coordinates. For radial null paths,

$$ \frac{dt}{dr} = \frac{\pm 1}{(1 - r_S/r)} . \tag{6.71} $$


Defining

$$ \frac{dt}{dr_*} = \pm 1 \tag{6.72} $$

gives

$$ \frac{r_*}{r_S} = \int \frac{d(r/r_S)}{(1 - r_S/r)} , \tag{6.73} $$

so that

$$ r_* = r + r_S \ln\left(\frac{r}{r_S} - 1\right) , \tag{6.74} $$

which is known as the tortoise coordinate. This ranges over r_* ∈ (−∞, +∞) as r ranges over (r_S, ∞), while the original radial coordinate ranged over r ∈ [0, ∞). So this tortoise coordinate also only covers the region outside the horizon. The benefit of using these coordinates is that the radial null paths are simple,

$$ t = \pm r_* + c . \tag{6.75} $$

The light cones are all at 45° in tortoise coordinates.

Another advantage of the tortoise coordinate is that the wave equation for a probe⁹ scalar field propagating in the background of the Schwarzschild solution is simpler-looking. I recommend looking into this, and inspecting the form of the effective potential V_eff(r_*). It makes very clear how the angular momentum barrier affects s-waves differently than higher partial waves with ℓ > 0.
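Returning to the tortoise coordinate itself, the defining property dr_*/dr = (1 − r_S/r)^{-1} and the fact that the horizon is pushed to r_* → −∞ are easy to confirm numerically. The following sketch assumes r_S = 1 by default and uses a simple central finite difference; the function names are hypothetical.

```python
import math


def tortoise(r, r_s=1.0):
    """Tortoise coordinate (6.74); only defined outside the horizon, r > r_s."""
    if r <= r_s:
        raise ValueError("tortoise coordinate (6.74) requires r > r_s")
    return r + r_s * math.log(r / r_s - 1.0)


def slope_numeric(r, r_s=1.0, h=1e-6):
    """Central finite-difference estimate of d r_* / d r."""
    return (tortoise(r + h, r_s) - tortoise(r - h, r_s)) / (2.0 * h)


def slope_exact(r, r_s=1.0):
    """Magnitude of the light-cone slope dt/dr in Schwarzschild coordinates, (6.70)."""
    return 1.0 / (1.0 - r_s / r)


if __name__ == "__main__":
    for r in (1.001, 1.1, 2.0, 3.0, 100.0):
        print(r, tortoise(r), slope_numeric(r, h=1e-7), slope_exact(r))
```

Close to the horizon r_* runs off logarithmically towards −∞, while far away r_* and r differ only by a slowly growing logarithm.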

Using the tortoise coordinate, our black hole metric becomes

$$ ds^2 = \left(1 - \frac{r_S}{r}\right)\left[-dt^2 + dr_*^2\right] + r^2(r_*)\, d\Omega_2^2 . \tag{6.76} $$

Next, let us try adapting our coordinates to null motion. Define null coordinates in the time-radius plane,

$$ u \equiv t - r_* , \tag{6.77} $$

$$ v \equiv t + r_* . \tag{6.78} $$

Then, trading (t, r_*) for (v, r), our black hole spacetime metric takes the form

$$ ds^2 = -\left(1 - \frac{r_S}{r}\right) dv^2 + 2\, dr\, dv + r^2 d\Omega_2^2 . \tag{6.79} $$

These coordinates are called Eddington-Finkelstein coordinates. As you can check, this metric remains invertible, including at the horizon.
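For readers who want to see the check spelled out, here is a sketch of the two steps: the change of variables from (t, r_*) to (v, r), and the determinant of the (v, r) block of the metric. Only the definitions (6.74) and (6.78) are used.

```latex
\begin{align}
dt &= dv - dr_* , \qquad dr_* = \frac{dr}{1 - r_S/r} , \\
\left(1 - \frac{r_S}{r}\right)\left[-dt^2 + dr_*^2\right]
 &= \left(1 - \frac{r_S}{r}\right)\left[-dv^2 + 2\,dv\,dr_*\right]
  = -\left(1 - \frac{r_S}{r}\right)dv^2 + 2\,dv\,dr , \\
\det\begin{pmatrix} g_{vv} & g_{vr} \\ g_{rv} & g_{rr} \end{pmatrix}
 &= \det\begin{pmatrix} -(1 - r_S/r) & 1 \\ 1 & 0 \end{pmatrix} = -1 .
\end{align}
```

The determinant is independent of r, so nothing special happens to the (v, r) block at r = r_S.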

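Then for radial null motion, we have

$$ \left(1 - \frac{r_S}{r}\right)\left(\frac{dv}{dr}\right)^2 - 2\,\frac{dv}{dr} = 0 , \tag{6.80} $$

so that

$$ \frac{dv}{dr} = \begin{cases} \dfrac{2}{(1 - r_S/r)} & \text{(outgoing)} , \\[1ex] 0 & \text{(ingoing)} . \end{cases} \tag{6.81} $$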

⁹ In the probe (a.k.a. test particle) approximation, gravitational backreaction is assumed to be negligible.


Because the first solution is positive (for r > r_S), it is relevant for outgoing radial null paths. The second solution is the one relevant for ingoing radial null paths.

Notice what this implies about our light cones in (v, r) coordinates. We have that for the ingoing ‘side’ (on a 2D diagram) of the light cones, this always hugs v = const. For the outgoing ‘side’ of the light cones, the slope depends on r/r_S. At r → ∞, this slope is 2. If we are at a finite r > r_S, then the slope is positive and bigger than 2. At r = r_S the slope becomes infinite, pointing straight up the v-axis. For r < r_S the slope becomes negative, and points towards the inside of the black hole only. This represents the physics that we want in a rather more elegant way than Schwarzschild coordinates did. Infalling photons do not make it out of the black hole once they have crossed the horizon.

The following picture is a summary of what we have found out about light cones in Eddington-Finkelstein coordinates.

A nice feature of Eddington-Finkelstein coordinates is that our light-cones do not get squished down to infinitely thin pencils. But note carefully that they do turn over at the horizon. Note that with these new coordinates we have managed to cover the region t → +∞ of the black hole spacetime, because at constant v, decreasing r sends t → +∞. So we have extended in one direction. How about the other direction?

Are there other coordinates that might restore the symmetry between u and v? Our Eddington-Finkelstein coordinates so far privileged v. Because of that, they are known as ingoing Eddington-Finkelstein coordinates. It turns out that we can alternatively find a second set of Eddington-Finkelstein coordinates, adapted for outgoing rather than ingoing null paths, in which we have

$$ ds^2 = -\left(1 - \frac{r_S}{r}\right) du^2 - 2\, dr\, du + r^2 d\Omega_2^2 . \tag{6.82} $$

Working back from our definitions, we see that this corresponds to the region t → −∞, as compared to the region t → +∞ which the first set of Eddington-Finkelstein coordinates extended us to. In outgoing Eddington-Finkelstein coordinates (u, r), the slope of the light cones is

$$ \frac{du}{dr} = \begin{cases} \dfrac{-2}{(1 - r_S/r)} & \text{(ingoing)} , \\[1ex] 0 & \text{(outgoing)} . \end{cases} \tag{6.83} $$

The accompanying picture illustrates this.


Can we uncover yet more regions of the Schwarzschild black hole spacetime? It turns out that the answer is yes, if we use another even smarter coordinate system known as Kruskal-Szekeres coordinates.

Our first guess for how to get further is furnished by choosing both light-cone coordinates (u, v), in place of (u, r), or (v, r), or (t, r_*), or (t, r). We find immediately that

$$ ds^2 = -\left(1 - \frac{r_S}{r}\right) du\, dv + r^2(u, v)\, d\Omega_2^2 , \tag{6.84} $$

where

$$ \frac{1}{2}(v - u) = r + r_S \ln\left(\frac{r}{r_S} - 1\right) , \tag{6.85} $$

which implicitly defines r as a function of (u, v).

This is looking more promising: our light-cones will stay at 45° in these (u, v) coordinates. But there is still one big fly in the ointment: we still have the problem that the horizon is located infinitely far away. To cure this symptom, we make an exponential mapping of light-cone coordinates to bring infinity in to a finite place,

$$ U = -\exp\left(-\frac{u}{2r_S}\right) , \tag{6.86} $$

$$ V = +\exp\left(+\frac{v}{2r_S}\right) . \tag{6.87} $$

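In these Kruskal-Szekeres coordinates we find

$$ ds^2 = -\frac{4r_S^3}{r}\, e^{-r/r_S}\, dU\, dV + r^2(U, V)\, d\Omega_2^2 , \tag{6.88} $$

where everywhere we have written r above we really mean r(U, V). Picking apart the null (U, V) coordinates into a time T and radius R via

$$ T = \frac{1}{2}(U + V) = \sqrt{\frac{r}{r_S} - 1}\, \exp\left(\frac{r}{2r_S}\right) \sinh\left(\frac{t}{2r_S}\right) , \tag{6.89} $$

and

$$ R = \frac{1}{2}(V - U) = \sqrt{\frac{r}{r_S} - 1}\, \exp\left(\frac{r}{2r_S}\right) \cosh\left(\frac{t}{2r_S}\right) , \tag{6.90} $$

gives the spacetime metric

$$ ds^2 = \frac{4r_S^3}{r}\, e^{-r/r_S} \left(-dT^2 + dR^2\right) + r^2(T, R)\, d\Omega_2^2 , \tag{6.91} $$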


where

$$ T^2 - R^2 = \left(1 - \frac{r}{r_S}\right) e^{r/r_S} \tag{6.92} $$

implicitly defines r(T, R).

This slick manipulation probably feels like it just happened at 100 km/h. So let us slow down a little, and pull apart all the pieces of what these new amazing Kruskal coordinates allow us to see for the physics of the Schwarzschild black hole.
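One way to slow down is to spot-check the algebra numerically. The sketch below maps a Schwarzschild point (t, r) with r > r_S to Kruskal (T, R) using the closed forms in (6.89) and (6.90), then confirms that T² − R² depends only on r, as in (6.92), and that T/R depends only on t. It assumes r_S = 1 by default; the function names are hypothetical.

```python
import math


def kruskal_tr(t, r, r_s=1.0):
    """Kruskal (T, R) from Schwarzschild (t, r) via (6.89)-(6.90); exterior region only."""
    if r <= r_s:
        raise ValueError("this chart of (6.89)-(6.90) requires r > r_s")
    amplitude = math.sqrt(r / r_s - 1.0) * math.exp(r / (2.0 * r_s))
    big_t = amplitude * math.sinh(t / (2.0 * r_s))
    big_r = amplitude * math.cosh(t / (2.0 * r_s))
    return big_t, big_r


def constant_r_invariant(r, r_s=1.0):
    """Right-hand side of (6.92): the value of T^2 - R^2 on a surface of constant r."""
    return (1.0 - r / r_s) * math.exp(r / r_s)


if __name__ == "__main__":
    for t, r in ((0.0, 1.5), (0.7, 2.5), (-3.0, 4.0)):
        big_t, big_r = kruskal_tr(t, r)
        print(t, r, big_t**2 - big_r**2, constant_r_invariant(r), big_t / big_r)
```

As r approaches r_S the invariant tends to 0, which is the horizon T = ±R, while r = 0 gives the value 1, which is where the singularity sits.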

In Kruskal-Szekeres coordinates,

• Radial null motion occurs along

$$ T = \pm R + c_1 . \tag{6.93} $$

• Surfaces of constant r are at

$$ T^2 - R^2 = \left(1 - \frac{r}{r_S}\right) e^{r/r_S} , \tag{6.94} $$

which are hyperbolas in the (T, R) plane.

• Surfaces of constant t are at

$$ \frac{T}{R} = \tanh\left(\frac{t}{2r_S}\right) , \tag{6.95} $$

which are simply straight lines in the (T, R) plane.

• The event horizon is at

$$ T = \pm R . \tag{6.96} $$

This has two solutions, corresponding physically to having both a black hole horizon and a white hole horizon.

• The singularity is at

$$ T^2 - R^2 = 1 . \tag{6.97} $$

This has two solutions, one which corresponds to a black hole singularity and one which corresponds to a white hole singularity.

• What ranges do our coordinates cover? We see that (T, R), and hence (U, V), range over all possible values aside from where the curvature singularity occurs:

$$ -\infty \le T \le \infty , \qquad -\infty \le R \le \infty , \qquad T^2 - R^2 < 1 . \tag{6.98} $$

Note: it appears that the U, V may be ill-defined inside the horizon, but it is actually the original t, r coordinates that are ill-defined there. The U, V Kruskal coordinates are well-defined, except of course in the disallowed singular region. This is the really key part of using Kruskal coordinates which allows us to obtain what is known as the maximal analytic extension of the Schwarzschild spacetime. (In the figure below, r_g is our r_S, t is our T, and r is our R.)


Notice how the Kruskal diagram actually has extra regions by comparison to the original Schwarzschild coordinate patch. These new extra regions can be abbreviated as II, III, and IV. From region I we can, via future-directed null rays, go into region II. So it makes sense to interpret this part as the region behind the black hole event horizon. And you can see from the picture above that the black hole singularity is in region II.

Suppose, from region I, we followed instead a past-directed null ray. Then what? According to our Kruskal diagram, we would cross a horizon to go into another region, III, with another singularity, the white hole singularity, which can be loosely called the ‘mirror image’ of the singularity in region II under time reversal. The horizon in region III is the white hole horizon that we identified in our list of bullet points above.

By following future-directed null rays from region III, or past-directed null rays from region II, we can see a second asymptotically flat region. But we can never communicate with it! It is a causally separated place unconnected by timelike or null geodesics to the original asymptotic region. Some people like to speak of the Schwarzschild geometry as a “wormhole” connecting two asymptotically flat regions, but it is not physical in any sense to call it a wormhole because it is not traversable. It closes up too quickly for any physical observer (even an electron) to cross from I to IV. For more details, see p.228 of Carroll.

A black hole formed in gravitational collapse would involve at most regions I and II. Regions III and IV would not be present; there would be no white hole, only a black hole.

What is a white hole, physically? Mathematically, it is the time-reverse of a black hole. Such a beast cannot actually be formed in gravitational collapse: that produces a black hole with a future horizon, not a white hole with a past horizon. The other interesting fact about white holes, shown by D.M. Eardley in 1974, is that a white hole is unstable to collapsing into a black hole. For these and other reasons, you do not need to worry about the physics of white holes if you are considering classical gravity. Only quantum gravity theorists need to worry our heads about such things.


6.5 Kerr black holes

Now we move to discussing the Kerr black hole, which has not only mass but also angular momentum. Demanding that the spacetime be stationary and spheroidally symmetric requires an ansatz of the form

$$ ds^2 = e^{2\alpha(r,\theta)} dt^2 - e^{2\gamma(r,\theta)} \left[d\phi - \omega(r,\theta)\, dt\right]^2 - e^{2\beta(r,\theta)} dr^2 - e^{2\delta(r,\theta)} d\theta^2 . \tag{6.99} $$

Note how many more functions we have turned on here, and the fact that there is now both r and θ dependence in all our metric functions. Mathematically speaking, this complicates the hell out of the process of solving the Einstein equations, because we now have PDEs in two variables instead of ODEs in r only. We will not actually prove that the Kerr solution solves the vacuum Einstein equations, because the algebra is awful.¹⁰ Instead, we will derive some fascinating physical properties of spacetimes of the above form, and just present the Kerr solution, gift wrapped with a bow on top.

From the above ansatz we see that

$$ g_{tt} = e^{2\alpha} - e^{2\gamma}\omega^2 . \tag{6.100} $$

Since the metric has an off-diagonal component, g_{tφ}, inverting to find the upstairs metric is slightly more complicated. We can easily read off two of the components,

$$ g^{rr} = -e^{-2\beta} , \tag{6.101} $$

$$ g^{\theta\theta} = -e^{-2\delta} , \tag{6.102} $$

but for the (t, φ) block we need to invert the 2 × 2 matrix. The result is

$$ g^{tt} = e^{-2\alpha} , \tag{6.103} $$

$$ g^{\phi\phi} = -e^{-2\gamma} + \omega^2 e^{-2\alpha} , \tag{6.104} $$

$$ g^{t\phi} = +\omega e^{-2\alpha} . \tag{6.105} $$
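The 2 × 2 inversion can be confirmed numerically in a few lines. The sketch below builds the downstairs (t, φ) block implied by the ansatz (6.99), inverts it directly, and compares with the quoted upstairs components (6.103) to (6.105). The sample values of α, γ and ω are arbitrary, and the function names are hypothetical.

```python
import math


def covariant_block(alpha, gamma, omega):
    """Downstairs (t, phi) block of the ansatz (6.99): g_tt, g_tphi, g_phiphi."""
    g_tt = math.exp(2.0 * alpha) - math.exp(2.0 * gamma) * omega**2
    g_tphi = math.exp(2.0 * gamma) * omega      # from expanding -e^{2 gamma}(dphi - omega dt)^2
    g_phiphi = -math.exp(2.0 * gamma)
    return g_tt, g_tphi, g_phiphi


def inverse_block_quoted(alpha, gamma, omega):
    """Upstairs components as quoted in (6.103)-(6.105): g^tt, g^tphi, g^phiphi."""
    g_up_tt = math.exp(-2.0 * alpha)
    g_up_tphi = omega * math.exp(-2.0 * alpha)
    g_up_phiphi = -math.exp(-2.0 * gamma) + omega**2 * math.exp(-2.0 * alpha)
    return g_up_tt, g_up_tphi, g_up_phiphi


def inverse_block_direct(g_tt, g_tphi, g_phiphi):
    """Direct inverse of the symmetric 2x2 matrix [[g_tt, g_tphi], [g_tphi, g_phiphi]]."""
    det = g_tt * g_phiphi - g_tphi**2
    return g_phiphi / det, -g_tphi / det, g_tt / det


if __name__ == "__main__":
    sample = (0.3, -0.2, 0.7)                   # arbitrary (alpha, gamma, omega)
    print(inverse_block_direct(*covariant_block(*sample)))
    print(inverse_block_quoted(*sample))
```

The determinant of the block works out to −e^{2α+2γ}, which is why g^{tt} comes out so simple even though g_tt is not.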

It is possible to see one of the most intriguing consequences of GR, known as the dragging of inertial frames, without getting specific about the form of any of the functions in our metric ansatz. Since the metric obeys ∂_φ g_{μν} = 0, p_φ is conserved along a geodesic. Then

$$ p^\phi = g^{\phi\mu} p_\mu = g^{\phi\phi} p_\phi + g^{\phi t} p_t , \tag{6.106} $$

and similarly

$$ p^t = g^{tt} p_t + g^{t\phi} p_\phi . \tag{6.107} $$

Let us specialize to the case of p_φ = 0 (no initial photon angular momentum). Then, recalling our relationship between the momentum and the tangent vector for photons,

$$ p^\mu = \frac{dx^\mu}{d\lambda} , \tag{6.108} $$

¹⁰ If you are a masochist and want to see it for yourself, please make Maxima do it.


we have

$$ \frac{d\phi}{dt} = \frac{p^\phi}{p^t} = \frac{g^{t\phi}}{g^{tt}} = \omega(r, \theta) . \tag{6.109} $$

In other words, ω is the coordinate angular velocity of a massless particle with no angular momentum. What we have obtained here might not look like much, but it is physically remarkable. A photon dropped straight inwards from infinity will not end up continuing straight inwards. Instead, gravity drags the photon around so it acquires an angular velocity. This effect is known as the dragging of inertial frames. It turns out to be suppressed by 1/r³ at large r, so it is a small effect far from the source.

Our next task is to define a physically important surface known as the stationary limit surface. Imagine photons emitted from (r, θ, φ) purely in the ±φ direction at first, so that only dt and dφ are nonzero along the photon path. Using ds² = 0, we have

$$ g_{tt}\, dt^2 + 2 g_{t\phi}\, dt\, d\phi + g_{\phi\phi}\, d\phi^2 = 0 , \tag{6.110} $$

so that

$$ \frac{d\phi}{dt} = -\frac{g_{t\phi}}{g_{\phi\phi}} \pm \sqrt{\frac{g_{t\phi}^2}{g_{\phi\phi}^2} - \frac{g_{tt}}{g_{\phi\phi}}} . \tag{6.111} $$

If at the emission point g_tt/g_φφ < 0, then dφ/dt is positive (negative) for photons emitted in the +φ (−φ) direction, even though the magnitudes differ. But when g_tt = 0, we cross over to a different behaviour. In particular, on the surface g_tt(r, θ) = 0, known as the stationary limit surface, there are two qualitatively different solutions:

$$ \frac{d\phi}{dt} = -2\,\frac{g_{t\phi}}{g_{\phi\phi}} = 2\omega \qquad \text{or} \qquad \frac{d\phi}{dt} = 0 . \tag{6.112} $$

The first solution corresponds to a photon sent off in the same direction as the source rotation. The second solution shows that frame dragging is so severe that initially the photon sent against the rotation does not move at all. This implies that a massive particle, which must always go slower than a photon, has to rotate with the source. This is true even if it has an arbitrarily large angular momentum with opposite orientation.

Another way to see that a particle/observer cannot remain at fixed (r, θ, φ) inside the stationary limit surface, where g_tt < 0, is by looking at the 4-velocity. We would have [u^μ] = [u^t, \vec{0}]^T, but this is incompatible with the condition u² = g_tt (u^t)² = 1 when g_tt < 0.

Remember our gravitational redshift formula? We had

$$ \left.\left(\frac{\nu_R}{\nu_E}\right)\right|_{\text{stationary, fixed}} = \sqrt{\frac{g_{tt}(E)}{g_{tt}(R)}} . \tag{6.113} $$

As the emitter approaches the surface where g_tt(E) = 0, the received frequency goes to zero. This is why the stationary limit surface is also known as the infinite redshift surface.

How would we find the horizon in our rotating spacetime? The defining property of an event horizon is that it is a null surface. In stationary axisymmetric spacetimes, its equation must be of the form f(r, θ) = 0. Nullness then implies that g^{μν} ∂_μ f ∂_ν f = 0, or g^{rr} (∂_r f)² + g^{θθ} (∂_θ f)² = 0. In fact, we may actually choose our coordinates r and θ such that the equation of the surface can be written as f(r) = 0. In this case, our condition reduces to g^{rr} (∂_r f)² = 0, and therefore we see that the event horizon occurs when

$$ g^{rr} = 0 . \tag{6.114} $$

In our previous case of Schwarzschild, this was equivalent to the condition g_tt = 0, but that only holds for static black holes, not stationary ones.

This is a good place to mention a definition of a horizon associated to Killing vectors. Suppose that we have a Killing vector χ^μ. If that Killing vector is null along some null hypersurface Σ, then Σ is a Killing horizon of χ^μ. Note that χ^μ is normal to Σ because a null surface cannot have two linearly independent null tangent vectors. Some important facts are as follows.

• Every event horizon Σ in a stationary, asymptotically flat spacetime is a Killing horizon for some Killing vector χ^μ.

• If the spacetime is static, then χ^μ will be the Killing vector K^μ = (∂_t)^μ representing time translations at infinity.

• If the spacetime is stationary but not static, then it will be axisymmetric with a rotational Killing vector R^μ = (∂_φ)^μ, and χ^μ will be a linear combination K^μ + Ω_H R^μ for some constant Ω_H.

To prove that the Ricci tensor is zero for metrics of our form with α, β, γ, δ, ω is a tedious computation. Here is the Kerr metric that emerges after all the calculational dust has settled:

$$ \begin{aligned} ds^2 = {} & \left(1 - \frac{2\mu r}{\rho^2}\right) dt^2 + \frac{4\mu a r \sin^2\theta}{\rho^2}\, dt\, d\phi - \frac{\rho^2}{\Delta}\, dr^2 - \rho^2 d\theta^2 \\ & - \left(r^2 + a^2 + \frac{2\mu r a^2 \sin^2\theta}{\rho^2}\right) \sin^2\theta\, d\phi^2 , \end{aligned} \tag{6.115} $$

where

$$ \rho^2 = r^2 + a^2\cos^2\theta , \tag{6.116} $$

$$ \Delta = r^2 - 2\mu r + a^2 . \tag{6.117} $$

Where is the singularity? Computing the full contraction of Riemann with itself shows that only ρ = 0 is physically singular. This happens at

$$ r^2 + a^2\cos^2\theta = 0 , \tag{6.118} $$

yielding

$$ r = 0 , \qquad \theta = \frac{\pi}{2} . \tag{6.119} $$

Careful inspection reveals that this singularity is ring shaped. For Schwarzschild it was a point.

Where are the horizons? These occur where g^{rr} → 0. This requires Δ = 0, or

$$ r = r_\pm = \mu \pm \sqrt{\mu^2 - a^2} . \tag{6.120} $$


Note that, with factors of c temporarily restored for physical clarity,

$$ \mu = \frac{G_N M}{c^2} , \qquad \text{and} \qquad J = Mac . \tag{6.121} $$

So then we require

$$ \mu \ge |a| \tag{6.122} $$

for cosmic censorship.

Where is the stationary limit surface, also referred to as the ergosphere? This happens when g_tt → 0,

$$ r_{S\pm} = \mu \pm \sqrt{\mu^2 - a^2\cos^2\theta} . \tag{6.123} $$

The following figure summarizing these aspects is taken from HEL.
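In the absence of the figure, a few numbers convey the same geometry. The sketch below evaluates the horizon radii (6.120) and the stationary limit radii (6.123) for given μ and a, in units with c = 1. The sample values μ = 1 and a = 0.6 are arbitrary, and the function names are hypothetical.

```python
import math


def horizons(mu, a):
    """Inner and outer horizon radii r_-, r_+ from (6.120); requires mu >= |a| as in (6.122)."""
    if abs(a) > mu:
        raise ValueError("no horizon: the bound (6.122) mu >= |a| is violated")
    root = math.sqrt(mu * mu - a * a)
    return mu - root, mu + root


def stationary_limit(mu, a, theta):
    """Inner and outer stationary limit radii r_S-, r_S+ from (6.123) at polar angle theta."""
    root = math.sqrt(mu * mu - (a * math.cos(theta))**2)
    return mu - root, mu + root


if __name__ == "__main__":
    mu, a = 1.0, 0.6
    print("horizons:", horizons(mu, a))
    for theta in (0.0, math.pi / 4.0, math.pi / 2.0):
        print("theta =", theta, "stationary limit:", stationary_limit(mu, a, theta))
```

The outer stationary limit surface touches the outer horizon at the poles and bulges out to r = 2μ at the equator. The region in between is the ergoregion used in the Penrose process.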

6.6 The Penrose process

Previously we started the discussion of frame dragging in GR for the Kerr spacetime. Let us now finish that line of reasoning, which will help lead us into the subject of black hole thermodynamics.

Suppose that you wanted to remain fixed at (r, θ) but rotate around φ. Then the 4-velocity is

$$ [u^\mu] = u^t [1, 0, 0, \Omega]^T , \tag{6.124} $$

where

$$ \Omega = \frac{d\phi}{dt} \tag{6.125} $$

is the angular velocity w.r.t. an observer at infinity. Demanding that the 4-velocity squares to 1 gives a quadratic equation for u^t:

$$ g_{tt}(u^t)^2 + 2 g_{t\phi} u^t u^\phi + g_{\phi\phi}(u^\phi)^2 \tag{6.126} $$

$$ = (u^t)^2 \left[g_{tt} + 2 g_{t\phi}\Omega + g_{\phi\phi}\Omega^2\right] = 1 . \tag{6.127} $$

For the roots to be real, we require (remembering that g_φφ < 0 in our signature)

$$ g_{\phi\phi}\Omega^2 + 2 g_{t\phi}\Omega + g_{tt} > 0 . \tag{6.128} $$

This gives that Ω ∈ (Ω_−, Ω_+), where

$$ \Omega_\pm = -\frac{g_{t\phi}}{g_{\phi\phi}} \pm \sqrt{\frac{g_{t\phi}^2}{g_{\phi\phi}^2} - \frac{g_{tt}}{g_{\phi\phi}}} = \omega \pm \sqrt{\omega^2 - \frac{g_{tt}}{g_{\phi\phi}}} . \tag{6.129} $$

This implies two things. First, when g_tt = 0, Ω_− = 0 and Ω_+ = 2ω. This occurs on the stationary limit surface S_+. Second, when ω² = g_tt/g_φφ, Ω_± = ω. This holds at Δ = 0, i.e. at the outer horizon. Note that at r = r_+ the angular velocity is restricted to be just

$$ \Omega_H = \omega(r_+, \theta) = \frac{ac}{2\mu r_+} . \tag{6.130} $$

This is independent of θ, which is a highly nontrivial physics feat!
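The θ-independence of Ω_H is striking enough to deserve a numerical check. The sketch below reads the (t, φ) block off the Kerr metric (6.115), forms ω = −g_tφ/g_φφ and the window (Ω_−, Ω_+) of (6.129), and evaluates them on the outer horizon and on the stationary limit surface. It assumes c = 1, arbitrary sample values μ = 1 and a = 0.6, and avoids the axis θ = 0 where g_φφ vanishes. The function names are hypothetical.

```python
import math


def kerr_block(r, theta, mu, a):
    """g_tt, g_tphi, g_phiphi of the Kerr metric (6.115), with c = 1."""
    s2 = math.sin(theta)**2
    rho2 = r * r + (a * math.cos(theta))**2
    g_tt = 1.0 - 2.0 * mu * r / rho2
    g_tphi = 2.0 * mu * a * r * s2 / rho2       # half the coefficient of dt dphi
    g_phiphi = -(r * r + a * a + 2.0 * mu * r * a * a * s2 / rho2) * s2
    return g_tt, g_tphi, g_phiphi


def frame_dragging(r, theta, mu, a):
    """omega = -g_tphi / g_phiphi, the angular velocity of zero-angular-momentum photons."""
    _, g_tphi, g_phiphi = kerr_block(r, theta, mu, a)
    return -g_tphi / g_phiphi


def omega_window(r, theta, mu, a):
    """Allowed range (Omega_-, Omega_+) from (6.129) for observers at fixed (r, theta)."""
    g_tt, _, g_phiphi = kerr_block(r, theta, mu, a)
    w = frame_dragging(r, theta, mu, a)
    disc = max(w * w - g_tt / g_phiphi, 0.0)    # clamp rounding noise at the horizon
    root = math.sqrt(disc)
    return w - root, w + root


if __name__ == "__main__":
    mu, a = 1.0, 0.6
    r_plus = mu + math.sqrt(mu * mu - a * a)
    print("a / (2 mu r_+) =", a / (2.0 * mu * r_plus))
    for theta in (0.3, 1.0, math.pi / 2.0):
        print(theta, frame_dragging(r_plus, theta, mu, a), omega_window(r_plus, theta, mu, a))
```

On the horizon the window collapses onto the single value a/(2μ r_+) for every θ, while on the stationary limit surface its lower edge sits at zero.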

Now we have all the ingredients at hand to discuss the Penrose process, a clever idea that can be used to extract energy from any spacetime with an ergoregion, at the cost of reducing its angular momentum. Suppose that we have an observer at infinity with fixed position who fires particle A into the Kerr black hole ergoregion. Then the energy of A measured at the emission event E is

$$ E^{(A)} = p^{(A)}(E) \cdot u_{\rm obs} = p^{(A)}_t(E) , \tag{6.131} $$

where the observer 4-velocity is [u^μ_obs] = [1, \vec{0}]^T. Now suppose that inside the ergoregion A decays: A → B + C.


Then momentum conservation implies that

$$ p^{(A)}(D) = p^{(B)}(D) + p^{(C)}(D) , \tag{6.132} $$

where D denotes the decay event.

Suppose that C eventually makes it out to infinity. The observer at infinity measures the particle energy at the reception event R to be

$$ E^{(C)} = p^{(C)}_t(R) = p^{(C)}_t(D) \tag{6.133} $$

because p_t is conserved along a geodesic by virtue of stationarity: ∂_t g_{μν} = 0. Similarly, for the original particle,

$$ p^{(A)}_t(D) = p^{(A)}_t(E) . \tag{6.134} $$

Then the time component of the above momentum conservation equation can be rearranged to

$$ E^{(C)} = E^{(A)} - p^{(B)}_t(D) , \tag{6.135} $$

where p^{(B)}_t is likewise conserved along B's geodesic.

Now, if B were to escape the ergoregion, p^{(B)}_t would be timelike, and hence proportional to the particle energy as measured by an observer with purely timelike 4-velocity. Since p^{(B)}_t > 0, this implies that E^{(C)} < E^{(A)}, i.e. you get less energy out than you put in. But if B were to instead fall into the black hole, then it would forever remain in the region where g_tt has opposite sign. Then p^{(B)}_t would be interpreted as a component of spatial momentum, which could in principle be either positive or negative. If it happened to be negative, p^{(B)}_t < 0, then E^{(C)} > E^{(A)}. Yes, Virginia, you can extract energy from a rotating black hole!

Once the particle has fallen inside the event horizon, the black hole mass and angular momentum are changed as

$$ Mc^2 \to Mc^2 + p^{(B)}_t , \qquad J \to J - p^{(B)}_\phi . \tag{6.136} $$

If we have an observer at fixed r, θ, we already worked out the 4-velocity: [u^μ] = u^t [1, 0, 0, Ω]^T, where Ω = dφ/dt is the angular velocity w.r.t. infinity. This observer measures B’s energy to be

$$ E^{(B)} = p^{(B)}_\mu u^\mu = u^t \left(p^{(B)}_t + p^{(B)}_\phi \Omega\right) . \tag{6.137} $$

This quantity must be positive, so

$$ L < \frac{p^{(B)}_t}{\Omega} , \tag{6.138} $$

where L = −p^{(B)}_φ is the component of B’s angular momentum along the black hole rotation axis. Because p^{(B)}_t < 0 for the Penrose process and Ω > 0, this means that L < 0, resulting in a loss of angular momentum for the black hole. You can keep extracting energy from a black hole like this until you have spun Kerr down all the way to Schwarzschild. Earlier, we learned that the angular velocity is maximal at r = r_+, when Ω = Ω_H. So in fact for any observer at fixed r, θ, we have a general bound,

$$ \delta J < \frac{\delta M c^2}{\Omega_H} . \tag{6.139} $$


In a nutshell: the Penrose process allows extraction of energy from the black hole at the price of spinning it down.

Let us calculate the area of the outer horizon r_+ in the Kerr spacetime. We have

$$ \gamma_{ij}\, dx^i dx^j = -ds^2(dt = 0, dr = 0, r = r_+) \tag{6.140} $$

$$ = (r_+^2 + a^2\cos^2\theta)\, d\theta^2 + \left[\frac{(r_+^2 + a^2)^2 \sin^2\theta}{(r_+^2 + a^2\cos^2\theta)}\right] d\phi^2 , \tag{6.141} $$

so that

$$ A(r_+) = \int \sqrt{|\gamma|}\, d\theta\, d\phi . \tag{6.142} $$

We have

$$ \sqrt{|\gamma|} = (r_+^2 + a^2)\sin\theta , \tag{6.143} $$

so that

$$ A(r_+) = 4\pi(r_+^2 + a^2) . \tag{6.144} $$

A very cute fact about the Penrose process is that the area of the black hole horizon does not actually decrease. The angular momentum is reduced more than the mass each time we do it, and this ensures that the area of the black hole never decreases. To see this, let us define the irreducible mass by

$$M_{\rm irr}^2 = \frac{A}{16\pi G_N^2} \tag{6.145}$$

$$= \frac{1}{4G_N^2}\,(r_+^2 + a^2) \tag{6.146}$$

$$= \frac{1}{2}\left(M^2 + \sqrt{M^4 - \frac{J^2}{G_N^2}}\right) \tag{6.147}$$

This might seem a tad unmotivated until we realize how it is affected by changes in $M$ and $J$. We find after some straightforward but boring algebra that

$$\delta M_{\rm irr} = \frac{a}{4G_N M_{\rm irr}\sqrt{G_N^2M^2 - J^2/M^2}}\left(\frac{\delta M}{\Omega_H} - \delta J\right) . \tag{6.148}$$

Look carefully at what this implies. We had earlier that for a Penrose process, $\delta J < \delta M/\Omega_H$ (where both $\delta M$ and $\delta J$ are negative), so

$$\delta M_{\rm irr} > 0 . \tag{6.149}$$

Therefore, the maximum work you can extract via the Penrose process is

$$M - M_{\rm irr} = M - \frac{1}{\sqrt{2}}\sqrt{M^2 + \sqrt{M^4 - \frac{J^2}{G_N^2}}} , \tag{6.150}$$

and it turns out that this is maximized to 29% of the original energy for an extreme Kerr black hole.
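
As a quick numerical check of the 29% figure, here is a minimal Python sketch that evaluates the irreducible mass of (6.147) and the extractable fraction $(M - M_{\rm irr})/M$. The function names and the choice of units ($G_N = c = 1$, $M = 1$) are assumptions made for illustration only.

```python
import math

def irreducible_mass(M, J, G=1.0):
    """Irreducible mass from eq. (6.147), in units with c = 1."""
    return math.sqrt(0.5 * (M**2 + math.sqrt(M**4 - (J / G)**2)))

def extractable_fraction(M, J, G=1.0):
    """Fraction (M - M_irr)/M of the mass-energy available to the Penrose process."""
    return 1.0 - irreducible_mass(M, J, G) / M

M = 1.0
for spin in (0.0, 0.5, 0.9, 1.0):      # spin = J / (G M^2), dimensionless
    J = spin * M**2
    print(f"J/(G M^2) = {spin:.1f}: extractable fraction = {extractable_fraction(M, J):.4f}")
```

The extremal case $J = G_N M^2$ gives $M_{\rm irr} = M/\sqrt{2}$, so the fraction is $1 - 1/\sqrt{2}$, while the Schwarzschild case $J = 0$ gives zero.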

The moral of the story here is that we are discovering relationships between macroscopic variables of the black hole, and this opens the door to discussing black hole thermodynamics (a topic on which I am an expert). A bit of introductory material on this fascinating topic can be found in the Appendix.


7 Classic experimental tests of GR

The reason we teach GR is not based in theoretical aesthetics, although those are really quite beautiful and many great intellects have fallen in love with it! We teach it and use it because it works as an experimental description of gravity. In this section we discuss some of the signature experiments that established GR firmly in the minds of humans worldwide.

This section is based pretty closely on Appendix 9A and §10 of the HEL textbook. All the figures displayed in this chapter are theirs.

7.1 Gravitational redshift

Suppose that we have a stationary spacetime in which we can choose coordinates such that the metric takes the form

$$ds^2 = g_{00}(\vec{x})\,dt^2 + g_{ij}(\vec{x})\,dx^i dx^j .$$

Imagine that you have two different physical observers who are massive and therefore move slower than light. Call the two observers E for emitter and R for receiver, with worldlines $x^\mu_E(\tau_E)$ and $x^\mu_R(\tau_R)$ respectively, where $\tau_E, \tau_R$ are proper times. Now let E, with 4-velocity $U_E(A)$, emit a photon at event A, and let R, with 4-velocity $U_R(B)$, receive it at event B. We can find the energy of a photon in the reference frame of an observer by taking the dot product of the photon momentum with the observer 4-velocity,

$$E = p_\mu U^\mu , \tag{7.1}$$

which works because (a la Carroll) we choose the convention for affine parameters of null particles to be

$$\frac{dx^\mu}{d\lambda} = p^\mu . \tag{7.2}$$

(Note that this is different from the convention for massive particles.) Then we have

$$E(A) = p_\mu(A)\,U^\mu_E(A) , \tag{7.3}$$

$$E(B) = p_\mu(B)\,U^\mu_R(B) . \tag{7.4}$$

Since in both cases $E = h\nu$, we have

$$\frac{\nu_R}{\nu_E} = \frac{p_\mu(B)\,U^\mu_R(B)}{p_\mu(A)\,U^\mu_E(A)} \tag{7.5}$$

Now, the photon 4-momentum is parallel transported along its path,

$$\frac{d}{d\lambda}p_\mu - \Gamma^\nu_{\mu\sigma}\,p_\nu\frac{dx^\sigma}{d\lambda} = 0 , \tag{7.6}$$

so we can use this to relate $p_\mu(B)$ to $p_\mu(A)$, using our convention $p^\mu = dx^\mu/d\lambda$. Then

$$\frac{d}{d\lambda}p_\mu = \Gamma^\nu_{\mu\sigma}\,p_\nu p^\sigma . \tag{7.7}$$

Recall that we also have the mass shell relation for the photon, $p^\mu p_\mu = 0$.


Suppose the emitter E and receiver R are at fixed spatial coordinates. (This works as if they are tethered to the coordinate system, which would not be true for freely falling observers.) Then the spatial components of both 4-velocities vanish,

$$U^i_E = \frac{dx^i_E}{d\tau_E} = 0 , \quad\text{and}\quad U^i_R = \frac{dx^i_R}{d\tau_R} = 0 . \tag{7.8}$$

Using $U^\mu U_\mu = 1$ gives $U^0 = 1/\sqrt{g_{00}}$, so that

$$\left.\frac{\nu_R}{\nu_E}\right|_{\rm fixed} = \frac{p_0(B)}{p_0(A)}\sqrt{\frac{g_{00}(A)}{g_{00}(B)}} . \tag{7.9}$$

If the metric is stationary, i.e. $\partial_0 g_{\mu\nu} = 0$, then $p_0$ is conserved by the geodesic equation. Since the momentum vector for a photon is equal to the tangent vector, $p_0$ is constant along a photon geodesic, and so

$$\left.\frac{\nu_R}{\nu_E}\right|_{\rm fixed,\ stationary} = \sqrt{\frac{g_{00}(\vec{x}_E)}{g_{00}(\vec{x}_R)}} . \tag{7.10}$$

For Schwarzschild, in particular, we find

$$\left.\frac{\nu_R}{\nu_E}\right|_{\rm fixed,\ stationary} = \sqrt{\frac{(1 - 2G_N m/r_E)}{(1 - 2G_N m/r_R)}} . \tag{7.11}$$

The quantity $z$ for the redshift is defined by

$$\frac{\nu_R}{\nu_E} = \frac{1}{1 + z} . \tag{7.12}$$
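
To get a feel for the size of the effect, the following Python sketch evaluates (7.11) and (7.12) for static observers. The function names are illustrative, `mu` stands for $G_N m/c^2$ in metres, and the solar values used in the example (gravitational radius about 1.48 km, solar radius about $6.957\times 10^8$ m) are assumed inputs rather than numbers taken from this section.

```python
import math

def frequency_ratio(r_emit, r_recv, mu):
    """nu_R / nu_E for static observers in Schwarzschild, eq. (7.11); mu = G m / c^2."""
    return math.sqrt((1 - 2 * mu / r_emit) / (1 - 2 * mu / r_recv))

def redshift(r_emit, r_recv, mu):
    """Redshift z defined through nu_R / nu_E = 1 / (1 + z), eq. (7.12)."""
    return 1.0 / frequency_ratio(r_emit, r_recv, mu) - 1.0

# Assumed illustrative values: light leaving the solar surface, received far away.
mu_sun = 1.48e3        # m
r_surface = 6.957e8    # m
z_sun = redshift(r_surface, math.inf, mu_sun)
print(f"z from the solar surface to infinity: {z_sun:.3e}")
```

For weak fields this reduces to $z \simeq \mu/r_E - \mu/r_R$, so the solar example gives a redshift of order $10^{-6}$.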

If we want to find out redshifts for freely falling timelike observers, then we need to solve the geodesic equations for the Schwarzschild spacetime.

7.2 Planetary perihelion precession

How will we discover the perihelion advance we are after? We will start by using the geodesic equations derived for the Schwarzschild geometry introduced previously. The analysis can also be done for Kerr, but for our purposes here the non-rotating case will suffice to show the essential physics. We had a conserved energy

$$E = \left(1 - \frac{2\mu}{r}\right)\dot{t} , \tag{7.13}$$

and a conserved angular momentum

$$L = r^2\dot{\phi} , \tag{7.14}$$

where $\dot{\ } = d/d\lambda$. Then the norm condition $g_{\mu\nu}\dot{x}^\mu\dot{x}^\nu = \varepsilon$ (with $\varepsilon = 0$ for photons and $+1$ for massive particles) gives

$$-E^2 + \dot{r}^2 + \left(1 - \frac{2\mu}{r}\right)\left[\frac{L^2}{r^2} + \varepsilon\right] = 0 . \tag{7.15}$$


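We saw previously that defining

$$V_{\rm eff}(r) = \frac{1}{2}\left[\frac{L^2}{r^2} + \varepsilon\right]\left(1 - \frac{2\mu}{r}\right) , \tag{7.16}$$

in analogy with Newtonian experience allows the rewriting

$$\frac{1}{2}\dot{r}^2 + V_{\rm eff}(r) = \frac{1}{2}E^2 \equiv \mathcal{E} . \tag{7.17}$$

We can combine the knowledge above to find the shape equation,

$$\frac{d\phi}{dr} = \frac{d\phi}{d\lambda}\left(\frac{dr}{d\lambda}\right)^{-1} , \tag{7.18}$$

giving

$$\frac{d\phi}{dr} = \pm\frac{L}{r^2}\left[2\left(\mathcal{E} - V_{\rm eff}(r)\right)\right]^{-1/2} \tag{7.19}$$

Defining

$$b = \frac{L}{E} \tag{7.20}$$

gives, for $\varepsilon = 0$,

$$\frac{d\phi}{dr} = \pm\frac{1}{r^2}\left[\frac{1}{b^2} - \frac{1}{r^2}\left(1 - \frac{2\mu}{r}\right)\right]^{-1/2} \tag{7.21}$$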

Now we make a change of variables, to

$$u = \frac{L^2}{G_N M}\frac{1}{r} = \frac{L^2}{\mu}\frac{1}{r} . \tag{7.22}$$

The radial equation for massive particles (planets, etc.) then turns into

$$\left(\frac{du}{d\phi}\right)^2 + \frac{L^2}{\mu^2} - 2u + u^2 - \frac{2\mu^2}{L^2}u^3 = \frac{2\mathcal{E}L^2}{\mu^2} . \tag{7.23}$$

On the face of it, this does not look any simpler than before. The neat trick is to realize that differentiating this again yields a simpler second order equation! Straightforward but unilluminating algebra yields

$$\frac{d^2u}{d\phi^2} + u = 1 + \frac{3\mu^2}{L^2}u^2 . \tag{7.24}$$

The second term on the RHS of this equation would be absent in the Newtonian computation. For the Newtonian case, you can check that the solution to the shape equation is

$$u_0 = 1 + e\cos\phi . \tag{7.25}$$

Treating this as the zeroth order approximation, we can substitute back

$$u(\phi) = u_0(\phi) + u_1(\phi) + \ldots \tag{7.26}$$


into eq.(7.24) and obtain a perturbative equation for first order corrections $u_1$. This gives

$$\frac{d^2u_1}{d\phi^2} + u_1 = \frac{3\mu^2}{L^2}u_0^2 . \tag{7.27}$$

As you should expect for an inherently nonlinear theory like GR, perturbation theory here is nonlinear. Substituting in the specific form of $u_0$ gives

$$\frac{d^2u_1}{d\phi^2} + u_1 = \frac{3\mu^2}{L^2}\left[\left(1 + \frac{e^2}{2}\right) + 2e\cos\phi + \frac{e^2}{2}\cos 2\phi\right] \tag{7.28}$$

You can check by explicitly differentiating that the solution to this is

$$u_1 = \frac{3\mu^2}{L^2}\left[\left(1 + \frac{e^2}{2}\right) + e\phi\sin\phi - \frac{e^2}{6}\cos 2\phi\right] \tag{7.29}$$

Notice that the first term here is a constant displacement and that the third term is oscillatory about zero. The second term, which gives rise to a cumulative effect per orbit, is the most physically important one.
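
The check "by explicitly differentiating" can also be done numerically. The sketch below, which is only an illustration, plugs the candidate solution (7.29) into the left-hand side of (7.27), using a central finite difference for the second derivative, and compares with the right-hand side. The symbol `k` is shorthand introduced here for $3\mu^2/L^2$, and the sample values of `e` and `k` are arbitrary.

```python
import math

def u1(phi, e, k):
    """Candidate first-order solution (7.29); k stands for 3 mu^2 / L^2."""
    return k * ((1 + e**2 / 2) + e * phi * math.sin(phi) - (e**2 / 6) * math.cos(2 * phi))

def source(phi, e, k):
    """Right-hand side of (7.27): k * u0^2 with u0 = 1 + e cos(phi)."""
    return k * (1 + e * math.cos(phi))**2

def residual(phi, e, k, h=1e-4):
    """u1'' + u1 - source, with u1'' from a central finite difference of step h."""
    d2 = (u1(phi + h, e, k) - 2 * u1(phi, e, k) + u1(phi - h, e, k)) / h**2
    return d2 + u1(phi, e, k) - source(phi, e, k)

e, k = 0.2056, 1e-3    # arbitrary sample values
worst = max(abs(residual(0.1 * n, e, k)) for n in range(200))
print(f"largest residual over 0 <= phi < 20: {worst:.2e}")
```

The residual should be tiny compared with the source term, which is of order `k`; what is left over is finite-difference and rounding error.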

From here on we just focus on that key second cumulative term on top of the zeroth order Newtonian contribution. We have

$$u_{\rm key} = 1 + e\cos\phi + \frac{3\mu^2}{L^2}e\phi\sin\phi . \tag{7.30}$$

This can be rewritten as

$$u_{\rm key} = 1 + e\cos\left[(1 - \alpha)\phi\right] , \tag{7.31}$$

where

$$\alpha = \frac{3\mu^2}{L^2} , \tag{7.32}$$

as you can see by doing a Taylor expansion to first order in small quantities,

$$\cos[(1 - \alpha)\phi] \simeq \cos\phi + \alpha\phi\sin\phi + O(\alpha^2) . \tag{7.33}$$

Then the precession per orbit is

$$\Delta\phi \simeq 2\pi\alpha = \frac{6\pi G_N^2M^2}{L^2} . \tag{7.34}$$


In order to massage this expression a little further, we need to relate $L^2$ to physical quantities we know. For the Newtonian (uncorrected) ellipse, the EOM show that

$$a = \frac{L^2}{\mu(1 - e^2)} , \tag{7.35}$$

so that

$$\Delta\phi = \frac{6\pi G_N M}{c^2 a(1 - e^2)} . \tag{7.36}$$

The first experimental test of this was with Mercury. For that planet, the gravitational radius $\mu = G_N M/c^2$ is about 1.48 km, the eccentricity is about $e = 0.2056$, and the semimajor axis is about $a = 5.79\times 10^{10}$ m. This results in a perihelion advance of about $5\times 10^{-7}$ radians per orbit, or about 43 seconds of arc per century. Note that the observed value is actually considerably greater, but most of it comes from two prosaic places: (a) precession of the equinoxes in our geocentric coordinate system, and (b) other planets perturbing Mercury's orbit. The residual amount of 43 seconds of arc per century is perfectly described by GR, to within experimental errors. This was not settled definitively in the experimental realm until the 1960s. Eddington's eclipse expedition was accepted in 1919 and made Einstein a rock star, despite poorly understood systematic errors, because it appealed to Western Europeans in the post WWI climate of wanting peace between nations that had been at war.
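
The Mercury numbers can be reproduced in a few lines. The Python sketch below evaluates (7.36) with the values of $\mu$, $a$ and $e$ quoted above; the orbital period of about 87.97 days is an assumed input that is not given in the text, and the names are illustrative.

```python
import math

ARCSEC_PER_RAD = 180 * 3600 / math.pi

def precession_per_orbit(mu, a, e):
    """Perihelion advance per orbit in radians, eq. (7.36); mu = G M / c^2 in metres."""
    return 6 * math.pi * mu / (a * (1 - e**2))

mu_sun = 1.48e3        # m, gravitational radius quoted in the text
a = 5.79e10            # m, semimajor axis quoted in the text
e = 0.2056             # eccentricity quoted in the text
period_days = 87.97    # assumed orbital period of Mercury, not from the text

per_orbit = precession_per_orbit(mu_sun, a, e)
orbits_per_century = 36525 / period_days
per_century_arcsec = per_orbit * orbits_per_century * ARCSEC_PER_RAD

print(f"advance per orbit:   {per_orbit:.3e} rad")
print(f"advance per century: {per_century_arcsec:.1f} arcsec")
```

With these inputs the advance comes out at roughly $5\times 10^{-7}$ radians per orbit and about 43 seconds of arc per century.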

7.3 Bending of light

Now let us focus on the bending of light. To start with, let us remind ourselves first of the Newtonian result. Most people think that because two photons with zero mass should feel zero Newtonian force between them, that implies that photons do not feel gravity. This is incorrect. Newton imagined light as corpuscular, and it feels gravity like any other corpuscle. The gravitational acceleration of a test mass does not depend on the mass.

In Newtonian mechanics, particles in unbound orbits move on hyperbolas. The incoming path asymptotes in the infinite past to one of the separatrices, and the outgoing path asymptotes in the infinite future to the other separatrix. In principle, it could come as close as the radius of the stellar object as it slingshots around the star.

We can estimate the size of the effect just using dimensional analysis. The variables in the problem are: $G_N$ and $c$ (theory constants), $M$ (a solution parameter), and $b$, the radius of closest approach. Since the deflection angle we are looking for is dimensionless, we estimate that

$$\theta \sim \frac{G_N M}{c^2 b} . \tag{7.37}$$


In principle, $\theta$ could have been any function of the dimensionless RHS. We have chosen a linear functional dependence on purpose, because we expect zero deflection angle when there is no star and because we expect a small effect overall.

We can get more precise and confirm the linear dependence by asking about the gravitational force felt by a corpuscle. Suppose that far from the star it starts out moving along the $x$-axis, and that the star is located along the negative $y$-axis. To first order in small quantities, $p_x$ is unaffected by the gravitational deflection, and the corpuscle develops a small $p_y$ by gravitational attraction. The deflection angle $|\Delta\phi| = -(p_y/p_x)_{\rm final}$ is, to first order in small quantities,

$$|\Delta\phi| = -\frac{1}{p_x}\int_{-\infty}^{\infty}dx\,\frac{dp_y}{dx} = -\frac{1}{p_x c}\int_{-\infty}^{\infty}dx\,\frac{dp_y}{dt} = -\frac{1}{p_x c}\int_{-\infty}^{\infty}dx\,\frac{G_N M m}{(x^2 + y^2)}\frac{y}{\sqrt{x^2 + y^2}} = \frac{2G_N M}{c^2 b} , \tag{7.38}$$

where $b$ is the distance of closest approach. Note that the factor of $m$ for the corpuscle cancelled out: the $m$ in the numerator arising from the gravitational force killed the $m$ in the momentum denominator $p_x = mc$. Overall, we see that the Newtonian angle for deflection of light is small but nonzero.
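
The integral in (7.38) is easy to confirm numerically. The Python sketch below is an illustration under stated assumptions: it takes the magnitude of the integrand with $|y| = b$, writes $\mu = G_N M/c^2$, and uses the substitution $x = b\tan t$ (introduced here, not in the text) to map the infinite range onto $(-\pi/2, \pi/2)$ before applying a midpoint rule.

```python
import math

def newtonian_deflection(mu, b, n=20000):
    """Midpoint-rule estimate of mu * integral of b dx / (x^2 + b^2)^(3/2) over all x.

    With x = b tan(t), the integrand b dx / (x^2 + b^2)^(3/2) becomes cos(t) dt / b.
    """
    dt = math.pi / n
    total = 0.0
    for i in range(n):
        t = -math.pi / 2 + (i + 0.5) * dt
        total += math.cos(t) * dt
    return mu * total / b

mu, b = 1.0, 10.0   # arbitrary sample values in the same length units
print(newtonian_deflection(mu, b), 2 * mu / b)
```

The two printed numbers should agree closely, confirming the factor of 2 in $2G_N M/(c^2 b)$.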

To analyze the answer in General Relativity, our starting point is again the geodesic equations. For photons executing equatorial motion ($\theta = \pi/2$), we had two Killing vectors giving rise to two conserved quantities and also the tangent vector norm condition,

$$\left(1 - \frac{2\mu}{r}\right)\dot{t} = E , \tag{7.39}$$

$$r^2\dot{\phi} = L , \tag{7.40}$$

$$\left(1 - \frac{2\mu}{r}\right)\dot{t}^2 - \left(1 - \frac{2\mu}{r}\right)^{-1}\dot{r}^2 - r^2\dot{\phi}^2 = 0 . \tag{7.41}$$

Substituting in the conserved quantities gives for the radial equation

$$\dot{r}^2 + \frac{L^2}{r^2}\left(1 - \frac{2\mu}{r}\right) = E^2 . \tag{7.42}$$

Notice that this has $\varepsilon = 0$, unlike the shape equation for orbiting planets that we studied earlier in this section. Then substituting this time

$$u = \frac{1}{r} \tag{7.43}$$

into the shape equation for photons in GR we have

$$\frac{d^2u}{d\phi^2} + u = 3\mu u^2 . \tag{7.44}$$


When there is no matter, the RHS of the above shape equation is zero. In that case, the solution is

$$u(\phi) = u_0(\phi) = \frac{1}{b}\sin\phi , \tag{7.45}$$

where $b$ is the impact parameter. Here, we go one step further than this perturbatively, writing

$$u(\phi) = \frac{1}{b}\sin\phi + u_1(\phi) , \tag{7.46}$$

and substituting to find the equation of motion for the perturbation gives

$$\frac{d^2u_1(\phi)}{d\phi^2} + u_1(\phi) = \frac{3\mu}{b^2}\sin^2\phi . \tag{7.47}$$

As you can check explicitly, this is solved by

$$u_1(\phi) = \frac{3\mu}{2b^2}\left(1 + \frac{1}{3}\cos 2\phi\right) , \tag{7.48}$$

so that

$$u(\phi) = \frac{1}{b}\sin\phi + \frac{3\mu}{2b^2}\left(1 + \frac{1}{3}\cos 2\phi\right) . \tag{7.49}$$

In the limit $r \to \infty$, $u \to 0$. For slight deflections, $\sin\phi \simeq \phi$ and $\cos 2\phi \simeq 1$ so that the GR deflection angle is

$$\phi \simeq -\frac{2G_N M}{c^2 b} . \tag{7.50}$$

We are not quite finished. As indicated in the figure, the total deflection angle is twice the above result,

$$\Delta\phi_{\rm GR} \simeq \frac{4G_N M}{c^2 b} . \tag{7.51}$$

Notice that this is also twice the Newtonian result for the bending of light. For a grazing deflection by our Sun, it is about 1.75 seconds of arc.
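
The small-angle step from (7.49) to (7.50) can be checked numerically. The Python sketch below finds the small negative root of $u(\phi) = 0$ in (7.49) by bisection and doubles it to get the total deflection. The solar values (gravitational radius about 1.48 km, radius about $6.957\times 10^8$ m for a grazing ray) are assumed inputs, and the function names are illustrative.

```python
import math

ARCSEC_PER_RAD = 180 * 3600 / math.pi

def u(phi, b, mu):
    """Perturbative photon orbit (7.49): u = 1/r as a function of phi."""
    return math.sin(phi) / b + 1.5 * mu / b**2 * (1 + math.cos(2 * phi) / 3)

def asymptote_angle(b, mu, lo=-0.1, hi=0.0, steps=200):
    """Bisection for the small negative root of u(phi) = 0, i.e. the r -> infinity direction."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if u(mid, b, mu) > 0:
            hi = mid      # root lies at more negative phi
        else:
            lo = mid
    return 0.5 * (lo + hi)

mu_sun = 1.48e3      # m, assumed
b_graze = 6.957e8    # m, assumed solar radius for a grazing ray
root = asymptote_angle(b_graze, mu_sun)
deflection_arcsec = 2 * abs(root) * ARCSEC_PER_RAD
print(f"asymptote angle: {root:.4e} rad (compare -2 mu / b = {-2 * mu_sun / b_graze:.4e})")
print(f"total deflection: {deflection_arcsec:.2f} arcsec")
```

The root agrees with $-2\mu/b$ up to corrections of higher order in $\mu/b$, and the total deflection for these inputs is close to the 1.75 seconds of arc quoted above.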

What if we cannot apply a perturbation analysis because the deflection angle is large? Then we would need to use the full GR geodesic equations for photons without any approximations.


7.4 Radar echoes

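One other important test of GR is measuring radar echoes in the solar system. To look at this, we need two ingredients. First, one of our geodesic equations for photons from earlier that took the form

$$\dot{r}^2 + \frac{L^2}{r^2}\left(1 - \frac{2\mu}{r}\right) = E^2 . \tag{7.52}$$

We also had the energy equation

$$\left(1 - \frac{2\mu}{r}\right)\dot{t} = E , \tag{7.53}$$

which is the second ingredient. These can be combined to find the $t$-$r$ shape equation. Using

$$\left(\frac{dr}{d\lambda}\right)^2 = \left(\frac{dr}{dt}\frac{dt}{d\lambda}\right)^2 = \left(\frac{dr}{dt}\right)^2 E^2\left(1 - \frac{2\mu}{r}\right)^{-2} , \tag{7.54}$$

we have that

$$\left(1 - \frac{2\mu}{r}\right)^{-3}\left(\frac{dr}{dt}\right)^2 + \frac{(L/E)^2}{r^2} = \left(1 - \frac{2\mu}{r}\right)^{-1} . \tag{7.55}$$

At the distance of closest approach, which we will call $R$, we have

$$\left.\left(\frac{dr}{dt}\right)^2\right|_{r=R} = 0 , \tag{7.56}$$

so that at that point

$$\frac{(L/E)^2}{R^2} = \left(1 - \frac{2\mu}{R}\right)^{-1} \tag{7.57}$$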

Then the expression

$$\left(\frac{dr}{dt}\right)^2 = \left(1 - \frac{2\mu}{r}\right)^2 - \frac{(L/E)^2}{r^2}\left(1 - \frac{2\mu}{r}\right)^3 \tag{7.58}$$

can, after a bit of algebra, be massaged into the form

$$\frac{dr}{dt} = \left(1 - \frac{2\mu}{r}\right)\left[1 - \frac{R^2(1 - 2\mu/r)}{r^2(1 - 2\mu/R)}\right]^{1/2} \tag{7.59}$$

We can integrate this to get the time taken to travel from radial position $R$ to $r$. It helps to begin by expanding the integrand to first order in $\mu/r$. After some algebra, we get

$$t(r, R) \simeq \int_R^r dr\,\frac{r}{\sqrt{r^2 - R^2}}\left[1 + \frac{2\mu}{r} + \frac{\mu R}{r(r + R)} + \ldots\right] . \tag{7.60}$$

Then we integrate, to obtain

$$t(r, R) \simeq \sqrt{r^2 - R^2} + 2\mu\ln\left[\frac{r + \sqrt{r^2 - R^2}}{R}\right] + \mu\sqrt{\frac{r - R}{r + R}} + \ldots . \tag{7.61}$$

The first term on the RHS here is just what we would have got if we had drawn a straight line. So the second and third terms are quantifying the bending of photon trajectories.


Now, suppose that we bounced a radar beam out to Venus and back, grazing the Sun. Then we would have twice the sum of the second and third terms above (twice for there and back), evaluated for both the Earth leg and the Venus leg. Using the approximation that the closest approach distance is much less than the distance of either Earth or Venus from the Sun ($r_E \gg R$, $r_V \gg R$), each leg contributes $2\mu\ln(2r/R) + \mu$, which gives

$$\Delta t \simeq \frac{4G_N M}{c^3}\left[\ln\frac{4r_E r_V}{R^2} + 1\right] . \tag{7.62}$$

Note that if we wanted to take into account the gravitational redshift of Earth, this is an order $\mu/r_E$ correction to what we have already calculated and therefore negligible.
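
A short numerical sketch can confirm that the far-field limit is a good approximation to the first-order terms in (7.61). In the Python below, each one-way leg contributes $2\mu\ln[(r + \sqrt{r^2 - R^2})/R] + \mu\sqrt{(r - R)/(r + R)}$, which tends to $2\mu\ln(2r/R) + \mu$ for $r \gg R$. The orbital radii, solar radius and gravitational radius are assumed illustrative inputs, and the resulting delay depends on them and on how closely the ray grazes the Sun.

```python
import math

C = 299_792_458.0   # m/s

def one_leg_excess(r, R, mu):
    """Relativistic terms of (7.61) for one leg, in metres of light-travel distance."""
    root = math.sqrt(r**2 - R**2)
    return 2 * mu * math.log((r + root) / R) + mu * math.sqrt((r - R) / (r + R))

def round_trip_delay(r_e, r_v, R, mu):
    """Excess round-trip time in seconds: out and back over the Earth and Venus legs."""
    return 2 * (one_leg_excess(r_e, R, mu) + one_leg_excess(r_v, R, mu)) / C

def round_trip_delay_far(r_e, r_v, R, mu):
    """Limit r_e, r_v >> R, where each leg tends to 2 mu ln(2 r / R) + mu."""
    return 4 * mu / C * (math.log(4 * r_e * r_v / R**2) + 1)

# Assumed illustrative inputs (not taken from the text).
mu_sun, R_sun = 1.48e3, 6.957e8      # m
r_earth, r_venus = 1.496e11, 1.082e11  # m

full = round_trip_delay(r_earth, r_venus, R_sun, mu_sun)
far = round_trip_delay_far(r_earth, r_venus, R_sun, mu_sun)
rel_diff = abs(full - far) / far
print(f"relative difference between the two expressions: {rel_diff:.1e}")
```

The two expressions differ only at relative order $R/r$ divided by the logarithm, so the far-field formula is accurate to better than a percent for these inputs.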

Experimentally, when Venus is on the opposite side of the Sun to the Earth, the numerical value of the time delay for a grazing passing of the Sun is about 220 µs, if you convert time back from metres to seconds. HEL goes into more detail about the experimental nuances in §10.3. One has to correct for the motion of Venus and Earth in their orbits, their individual gravitational fields, the variance of reflecting surfaces on Venus, and refraction by the Solar corona. After all the experimental dust settles, you get the data agreeing in a pretty way with the GR prediction.


7.5 Geodesic precession of gyroscopes

Precession of gyroscopes is another experimental test of General Relativity. Gyros are interesting because they spin on an axis, and this spin vector $s^\mu$ feels the effects of General Relativity through the physics of parallel transport. Let us see how this works.

The geodesic is a physically special curve because it parallel transports its own tangent vector,

$$\frac{d}{d\lambda}u^\mu + \Gamma^\mu_{\nu\sigma}u^\nu u^\sigma = 0 . \tag{7.63}$$

Physically, the spin must be orthogonal to the tangent vector,

$$g_{\mu\nu}s^\mu u^\nu = 0 . \tag{7.64}$$

In other words, the spin cannot have a timelike component in the instantaneous rest frame of the test object. If we want this zero inner product to be conserved at all points along the worldline of the gyro, we need to insist that the spin vector $s^\mu$ be parallel transported,

$$\frac{d}{d\lambda}s^\mu + \Gamma^\mu_{\nu\sigma}s^\nu u^\sigma = 0 . \tag{7.65}$$

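To demonstrate the effect we are after, it is sufficient to use the approximation that Earth's gravitational field (in which GPB flew) is described by the Schwarzschild metric. This will simplify our computations because there are fewer Christoffel symbols for Schwarzschild than for Kerr. Imagine that our test gyroscope is orbiting Earth in a circle, in the equatorial plane of our spherical polar coordinate system. Circular motion occurs at fixed $(r, \theta)$, so that $u^1(\lambda) = 0$ and $u^2(\lambda) = 0\ \forall\,\lambda$. Because $\theta = \pi/2$, $\Gamma^\theta_{\varphi\varphi}$ and $\Gamma^\varphi_{\theta\varphi}$ are zero and $\Gamma^r_{\varphi\varphi} = \Gamma^r_{\theta\theta}$. So our spin parallel transport equations in $(t, r, \theta, \varphi)$ coordinates become

$$\frac{ds^t}{d\lambda} + \Gamma^t_{rt}s^r u^t = 0 , \tag{7.66}$$

$$\frac{ds^r}{d\lambda} + \Gamma^r_{tt}s^t u^t + \Gamma^r_{\varphi\varphi}s^\varphi u^\varphi = 0 , \tag{7.67}$$

$$\frac{ds^\theta}{d\lambda} = 0 , \tag{7.68}$$

$$\frac{ds^\varphi}{d\lambda} + \Gamma^\varphi_{r\varphi}s^r u^\varphi = 0 , \tag{7.69}$$

where

$$\Gamma^t_{rt} = \frac{\mu}{r^2}\left(1 - \frac{2\mu}{r}\right)^{-1} , \quad \Gamma^r_{tt} = \frac{\mu}{r^2}\left(1 - \frac{2\mu}{r}\right) , \quad \Gamma^r_{\varphi\varphi} = -r\left(1 - \frac{2\mu}{r}\right) , \quad \Gamma^\varphi_{r\varphi} = \frac{1}{r} . \tag{7.70}$$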

To proceed further, we need to know something about the normalization of the velocity vector. We can write it as $[u^\mu] = u^t[1, 0, 0, \Omega]^T$, where $\Omega$ is our angular velocity for circular motion. What is the angular velocity for our case? We actually mentioned the key ingredients already, in passing, when we discussed massless and massive particle geodesics in the Schwarzschild spacetime. In particular, we derived the shape equation for (quasi-)elliptical orbits. Circular orbits are a special case, and the shape equation can easily be rearranged to find $L$. We obtain

$$L^2 = \frac{\mu R^2}{R - 3\mu} \tag{7.71}$$

where $R$ is the radius of the circular orbit. Then using the norm condition on the velocity vector gives

$$E = \frac{(1 - 2\mu/R)}{\sqrt{1 - 3\mu/R}} . \tag{7.72}$$

We can also find the angular velocity, by using the geodesic equations to find $\varphi(t)$,

$$\left(\frac{d\varphi}{dt}\right)^2 = \left(\frac{d\varphi}{d\lambda}\left(\frac{dt}{d\lambda}\right)^{-1}\right)^2 . \tag{7.73}$$

After the dust settles, this gives the very simple expression

$$\Omega^2 = \frac{\mu}{r^3} . \tag{7.74}$$

The norm of the 4-velocity must be unity, as appropriate to a massive particle (our gyroscope). This gives the equation

$$u^0 = \left[\left(1 - \frac{2\mu}{r}\right) - r^2\Omega^2\right]^{-1/2} = \left(1 - \frac{3\mu}{r}\right)^{-1/2} . \tag{7.75}$$

In this system, we have $u^r = 0 = u^\theta$, and so the condition that the spin vector be orthogonal to the velocity vector becomes

$$\left(1 - \frac{2\mu}{r}\right)s^t u^t - r^2 s^\varphi u^\varphi = 0 . \tag{7.76}$$

Since $u^\varphi/u^t = d\varphi/dt = \Omega$, we can express $s^t$ in terms of $s^\varphi$,

$$s^t = \frac{\Omega r^2}{(1 - 2\mu/r)}\,s^\varphi . \tag{7.77}$$

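As you can check for yourself, this means that the first and fourth of the parallel transport equations are equivalent. Then the remaining equations are

$$\frac{ds^r}{d\lambda} - \frac{r\Omega}{u^t}s^\varphi = 0 , \qquad \frac{ds^\theta}{d\lambda} = 0 , \qquad \frac{ds^\varphi}{d\lambda} + \frac{u^t\Omega}{r}s^r = 0 . \tag{7.78}$$

We can convert the experimentally relatively unfamiliar affine parameter $\lambda$ to the coordinate time $t$ using $u^t = dt/d\lambda$. Using the third equation to eliminate $s^\varphi$ from the first gives for the set of three

$$\frac{d^2s^r}{dt^2} + \frac{\Omega^2}{(u^t)^2}s^r = 0 , \qquad \frac{ds^\theta}{dt} = 0 , \qquad \frac{ds^\varphi}{dt} + \frac{\Omega}{r}s^r = 0 . \tag{7.79}$$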


This has solution

$$s^r(t) = s^r(0)\cos\Omega' t , \qquad s^\theta(t) = 0 , \qquad s^\varphi(t) = -\frac{\Omega}{r\Omega'}s^r(0)\sin\Omega' t , \tag{7.80}$$

where

$$\Omega' = \frac{\Omega}{u^t} = \Omega\sqrt{1 - 3\mu/r} . \tag{7.81}$$

Therefore, the spatial part of the spin vector is rotating relative to the radial direction $r$ with a coordinate angular speed $\Omega'$ in the $-\varphi$ direction. But the radial direction itself is rotating with coordinate angular speed $+\Omega$. So it is the difference in speeds which gives rise to geodesic precession.

If you revolve once in a coordinate time $t = 2\pi/\Omega$, the final direction of the spatial spin vector is $2\pi + \alpha$, where $\alpha = 2\pi(1 - \Omega'/\Omega)$. Per revolution, then, the angular precession is

$$\alpha = 2\pi\left[1 - \sqrt{1 - \frac{3\mu}{r}}\right] . \tag{7.82}$$

This effect is not very big, but it is cumulative. That means if you can machine almost-perfect gyros and leave them in orbit for a veeeeeeeery long time, then you have a chance of these effects adding up and being measurable.
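
To see how the per-orbit angle (7.82) accumulates, here is a rough Python estimate for a gyroscope in a circular low Earth orbit. All numerical inputs (Earth's $G_N M$, mean radius, and an orbital altitude of about 642 km) are assumptions chosen for illustration, and the Schwarzschild approximation ignores Earth's oblateness and rotation, so only rough agreement with a real mission should be expected.

```python
import math

GM_EARTH = 3.986004e14     # m^3/s^2, assumed
C = 299_792_458.0          # m/s
R_EARTH = 6.371e6          # m, assumed mean radius
ALTITUDE = 6.42e5          # m, assumed orbital altitude
YEAR = 365.25 * 86400      # s
MAS_PER_RAD = 180 * 3600 * 1000 / math.pi

def precession_per_orbit(mu, r):
    """Geodesic precession per revolution, eq. (7.82), in radians."""
    x = 3 * mu / r
    # 1 - sqrt(1 - x) is rewritten as x / (1 + sqrt(1 - x)) to avoid cancellation for tiny x.
    return 2 * math.pi * x / (1 + math.sqrt(1 - x))

mu = GM_EARTH / C**2
r = R_EARTH + ALTITUDE
period = 2 * math.pi * math.sqrt(r**3 / GM_EARTH)   # from Omega^2 = mu / r^3 with units restored
drift = precession_per_orbit(mu, r) * (YEAR / period) * MAS_PER_RAD
print(f"geodetic drift: about {drift:.0f} mas/yr")
```

With these assumed inputs the drift comes out at several thousand milliarcseconds per year, the same order as the geodetic rate measured by GPB.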

From the GPB website: "Gravity Probe B, launched 20 April 2004, is a space experiment testing two fundamental predictions of Einstein's theory of General Relativity (GR), the geodetic and frame-dragging effects, by means of cryogenic gyroscopes in Earth orbit. Data collection started 28 August 2004 and ended 14 August 2005. Analysis of the data from all four gyroscopes results in a geodetic drift rate of −6,601.8 ± 18.3 mas/yr and a frame-dragging drift rate of −37.2 ± 7.2 mas/yr, to be compared with the GR predictions of −6,606.1 mas/yr and −39.2 mas/yr, respectively ('mas' is milliarc-second; 1 mas = $4.848\times 10^{-9}$ radians or $2.778\times 10^{-7}$ degrees)."


7.6 Accretion disks

Lastly, let us mention one more experimental test of GR: accretion disks around compact objects. They have matter swirling around the central black hole at millions of kelvin, and tend to emit strongly in the X-ray part of the spectrum. Even at such extreme temperatures, some atoms can retain electrons and then emit radiation as they jump between energy levels, and one such element is iron. Looking at the shape of the broadened iron emission line from the whole accretion disk actually gives a probe of the strong-field regime of GR, as we will now motivate.

There are two types of redshift that operate in this system: gravitational redshift, and Doppler shifting from relative velocity w.r.t. an observer here on Earth. Supposing that we view an accretion disk and black hole system side-on, we would see a range of Doppler shifting depending on which part of the disk we were looking at. This would even happen in the Newtonian approximation! The really key part is the gravitational redshift. The essential reason is that the smallest-possible frequency present in the observed spectrum must have been emitted at the smallest possible value of $r$, so that it could experience maximum redshift on the way out. Knowing the radius of the ISCO, we can then get a handle on the biggest frequency ratio possible.


The ratio of the photon frequency at reception compared to that at emission is given by

$$\frac{\nu_R}{\nu_E} = \frac{p_\mu(R)\,u^\mu_R}{p_\mu(E)\,u^\mu_E} . \tag{7.83}$$

Using what we derived in the previous experiment's discussion concerning the angular velocity and the tangent vector norm condition, you can show with straightforward algebra that

$$\frac{\nu_R}{\nu_E} = \frac{p_0(R)}{p_0(E)\,u^0_E + p_3(E)\,u^3_E} = \left(1 - \frac{3\mu}{r}\right)^{1/2}\frac{p_0(R)}{p_0(E)}\left[1 \pm \frac{p_3(E)}{p_0(E)}\Omega\right]^{-1} , \tag{7.84}$$

where $+$ corresponds to emitting matter on the side of the disk moving towards the observer and $-$ corresponds to matter on the other side. Now, because Schwarzschild is a stationary metric, the downstairs time component of the photon momentum is conserved along a geodesic, so $p_0(R) = p_0(E)$.

Our last ingredient is to find the ratio $p_3(E)/p_0(E)$, and this is done using the null photon momentum norm condition. Working in the equatorial plane, we find

$$\left(1 - \frac{2\mu}{r}\right)^{-1}(p_0)^2 - \left(1 - \frac{2\mu}{r}\right)(p_1)^2 - \frac{1}{r^2}(p_3)^2 = 0 . \tag{7.85}$$

To get any further for a general angle between the accretion disk and us, we would need to recruit the full photon geodesic equations. But in two special cases we can actually do a slick avoidance manoeuvre and finesse this issue! When the matter is transverse to the observer (or in a face-on disk), $\varphi = 0, \pi$. Then $p_3(E) = 0$, and so

$$\frac{\nu_R}{\nu_E} = \sqrt{1 - \frac{3\mu}{r}} . \tag{7.86}$$

When matter moves either directly towards or away from the observer, $\varphi = \pm\pi/2$. Then the radial component of the photon momentum is zero, and so

$$\frac{\nu_R}{\nu_E} = \frac{\sqrt{1 - 3\mu/r}}{1 \pm 1/\sqrt{r/\mu - 2}} \tag{7.87}$$

Evaluating these at the ISCO, $r = 6\mu$, you find that the smallest frequency represented in the iron emission line will be $\nu_R/\nu_E = \sqrt{2}/3 \simeq 0.47$ for edge-on disks and $1/\sqrt{2} \simeq 0.71$ for face-on disks.
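
The two special cases are simple enough to evaluate directly. The Python sketch below computes (7.86) and (7.87) at the Schwarzschild ISCO, taken here to be $r = 6\mu$; the function names and the `sign` argument are illustrative conventions, not notation from the text.

```python
import math

def ratio_transverse(r_over_mu):
    """nu_R / nu_E from eq. (7.86): emitting matter moving transverse to the line of sight."""
    return math.sqrt(1 - 3 / r_over_mu)

def ratio_line_of_sight(r_over_mu, sign):
    """nu_R / nu_E from eq. (7.87); sign = +1 or -1 picks the upper or lower sign."""
    return math.sqrt(1 - 3 / r_over_mu) / (1 + sign / math.sqrt(r_over_mu - 2))

isco = 6.0   # r / mu at the innermost stable circular orbit (Schwarzschild)
print(f"(7.86) at the ISCO:             {ratio_transverse(isco):.4f}")
print(f"(7.87) at the ISCO, upper sign: {ratio_line_of_sight(isco, +1):.4f}")
print(f"(7.87) at the ISCO, lower sign: {ratio_line_of_sight(isco, -1):.4f}")
```

At $r = 6\mu$ expression (7.86) gives $1/\sqrt{2}$, while (7.87) gives $\sqrt{2}/3$ for the upper sign and $\sqrt{2}$ for the lower sign, so the smallest ratio comes from (7.87).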


8 Gravitational waves

This section is based on §17 and §18 of HEL. All figures shown are from HEL.

8.1 Finding the wave equation for metric perturbations

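For this section we will assume that the cosmological constant is zero and that spacetime is approximately flat. We will figure out the equations obeyed by small (perturbative) ripples in the fabric of spacetime about the Minkowski metric, which are known as gravitational waves. To begin, we assume that

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} , \tag{8.1}$$

where $|h_{\mu\nu}| \ll 1$. To first order in small quantities,

$$g^{\mu\nu} = \eta^{\mu\nu} - h^{\mu\nu} , \tag{8.2}$$

where we raise and lower indices to this order by using the Minkowski metric,

$$h^{\mu\nu} = \eta^{\mu\rho}\eta^{\nu\sigma}h_{\rho\sigma} . \tag{8.3}$$

It is important to know how the perturbations are affected by changes of coordinates. Under global Lorentz transformations $x'^\mu = \Lambda^\mu{}_\nu x^\nu$, we know that

$$g'_{\mu\nu} = \frac{\partial x^\rho}{\partial x'^\mu}\frac{\partial x^\sigma}{\partial x'^\nu}g_{\rho\sigma} = \Lambda_\mu{}^\rho\Lambda_\nu{}^\sigma\left(\eta_{\rho\sigma} + h_{\rho\sigma}\right) = \eta_{\mu\nu} + \Lambda_\mu{}^\rho\Lambda_\nu{}^\sigma h_{\rho\sigma} \tag{8.4}$$

because the Minkowski metric is invariant under global Lorentz transformations. Therefore,

$$h'_{\mu\nu} = \Lambda_\mu{}^\rho\Lambda_\nu{}^\sigma h_{\rho\sigma} . \tag{8.5}$$

In other words, $h_{\mu\nu}$ transforms like a tensor under global Lorentz transformations.

We can also ask about how perturbations in spacetime are affected by a general coordinate transformation of the form

$$x'^\mu = x^\mu + \xi^\mu(x) . \tag{8.6}$$

Therefore,

$$\frac{\partial x'^\mu}{\partial x^\nu} = \delta^\mu_\nu + \partial_\nu\xi^\mu . \tag{8.7}$$

By eye, we can see using the above equation that to first order in small quantities the inverse transformation obeys

$$\frac{\partial x^\mu}{\partial x'^\nu} = \delta^\mu_\nu - \partial_\nu\xi^\mu . \tag{8.8}$$

Accordingly, under these general coordinate transformations, we have

$$g'_{\mu\nu} = \left(\delta^\rho_\mu - \partial_\mu\xi^\rho\right)\left(\delta^\sigma_\nu - \partial_\nu\xi^\sigma\right)\left(\eta_{\rho\sigma} + h_{\rho\sigma}\right) = \eta_{\mu\nu} + \left(h_{\mu\nu} - \partial_\mu\xi_\nu - \partial_\nu\xi_\mu\right) \tag{8.9}$$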


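where we defined $\xi_\mu = \eta_{\mu\nu}\xi^\nu$, and worked to first order in small quantities. Therefore, the transformation law of the perturbations under general coordinate transformations (8.6) is

$$h'_{\mu\nu} = h_{\mu\nu} - \partial_\mu\xi_\nu - \partial_\nu\xi_\mu . \tag{8.10}$$

What are the Christoffels to first order in perturbations? We did this type of approximation earlier when we first linearized a GR expression to recover its Newtonian limit. Here, we obtain

$$\Gamma^\sigma_{\mu\nu} = \frac{1}{2}\left(\partial_\nu h^\sigma{}_\mu + \partial_\mu h^\sigma{}_\nu - \partial^\sigma h_{\mu\nu}\right) . \tag{8.11}$$

From this, it follows that, to first order in perturbations,

$$R^\sigma{}_{\mu\nu\rho} = \frac{1}{2}\left(\partial_\nu\partial_\mu h^\sigma{}_\rho + \partial_\rho\partial^\sigma h_{\mu\nu} - \partial_\nu\partial^\sigma h_{\mu\rho} - \partial_\rho\partial_\mu h^\sigma{}_\nu\right) . \tag{8.12}$$

The really neat thing about this expression for the Riemann tensor is that it is invariant under general coordinate transformations (8.6). As you can check, this property also holds for the Ricci tensor and the Ricci scalar. For convenience, let us define

$$h = h^\sigma{}_\sigma \tag{8.13}$$

and write

$$\Box = \partial^\mu\partial_\mu . \tag{8.14}$$

Then we obtain

$$R_{\mu\nu} = \frac{1}{2}\left(\partial_\mu\partial_\nu h + \Box h_{\mu\nu} - \partial_\mu\partial_\rho h^\rho{}_\nu - \partial_\nu\partial_\rho h^\rho{}_\mu\right) , \tag{8.15}$$

and

$$R = \Box h - \partial_\mu\partial_\nu h^{\mu\nu} . \tag{8.16}$$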

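Plugging the above expressions into the Einstein equations yields a second-order PDE for the perturbations. In order to aid in wrangling all the pertinent algebra, it is convenient to define the trace reverse of $h_{\mu\nu}$,

$$\bar{h}_{\mu\nu} \equiv h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}h . \tag{8.17}$$

This obeys the property that

$$\bar{\bar{h}}_{\mu\nu} = h_{\mu\nu} . \tag{8.18}$$

We also have (again to first order in perturbations)

$$\bar{h} = \eta^{\mu\nu}\bar{h}_{\mu\nu} , \tag{8.19}$$

which obeys $\bar{h} = (1 - D/2)h$. In $D = 4$, $\bar{h} = -h$. In terms of $\bar{h}_{\mu\nu}$, the Einstein equations become

$$\Box\bar{h}_{\mu\nu} + \eta_{\mu\nu}\partial_\rho\partial_\sigma\bar{h}^{\rho\sigma} - \partial_\nu\partial_\rho\bar{h}^\rho{}_\mu - \partial_\mu\partial_\rho\bar{h}^\rho{}_\nu = -16\pi G_N T_{\mu\nu} . \tag{8.20}$$

On the face of it, this equation does not look very much like a familiar wave equation involving the d'Alembertian!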
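
The two algebraic properties of the trace reverse, namely that reversing twice returns the original perturbation and that the trace flips sign in $D = 4$, can be checked on a sample matrix. The Python sketch below assumes the signature $(+,-,-,-)$, stores $h_{\mu\nu}$ with lower indices as a nested list, and uses arbitrary made-up entries for the symmetric sample perturbation.

```python
ETA_DIAG = [1.0, -1.0, -1.0, -1.0]   # Minkowski metric, signature (+,-,-,-); equal to its inverse

def trace(h):
    """h = eta^{mu nu} h_{mu nu} for a 4x4 nested list with lower indices."""
    return sum(ETA_DIAG[m] * h[m][m] for m in range(4))

def trace_reverse(h):
    """bar h_{mu nu} = h_{mu nu} - (1/2) eta_{mu nu} h, as in eq. (8.17)."""
    t = trace(h)
    return [[h[m][n] - (0.5 * ETA_DIAG[m] * t if m == n else 0.0) for n in range(4)]
            for m in range(4)]

# Arbitrary symmetric sample perturbation (illustrative values only).
h = [[0.30, 0.10, 0.00, 0.20],
     [0.10, -0.40, 0.05, 0.00],
     [0.00, 0.05, 0.25, -0.10],
     [0.20, 0.00, -0.10, 0.15]]

hbar = trace_reverse(h)
hbarbar = trace_reverse(hbar)
print("trace h =", trace(h), " trace hbar =", trace(hbar))
print("max |hbarbar - h| =", max(abs(hbarbar[m][n] - h[m][n]) for m in range(4) for n in range(4)))
```

The sign flip of the trace relies on $\eta^{\mu\nu}\eta_{\mu\nu} = 4$, which is why the relation $\bar{h} = -h$ is specific to four dimensions.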


In order to figure out what our Einstein equations for the perturbations imply physically, it is crucial that we understand how $\bar{h}_{\mu\nu}$ transforms under general coordinate transformations (8.6). First, recall our equation (8.10), $h'_{\mu\nu} = h_{\mu\nu} - \partial_\mu\xi_\nu - \partial_\nu\xi_\mu$. From this, it follows directly that

$$h' = \eta^{\mu\nu}\left(h_{\mu\nu} - \partial_\mu\xi_\nu - \partial_\nu\xi_\mu\right) = h - 2\partial_\mu\xi^\mu . \tag{8.21}$$

Therefore,

$$\bar{h}'^{\mu\rho} = h'^{\mu\rho} - \frac{1}{2}\eta^{\mu\rho}h' = \left(h^{\mu\rho} - \partial^\mu\xi^\rho - \partial^\rho\xi^\mu\right) - \frac{1}{2}\eta^{\mu\rho}\left(h - 2\partial_\sigma\xi^\sigma\right) = \bar{h}^{\mu\rho} - \partial^\mu\xi^\rho - \partial^\rho\xi^\mu + \eta^{\mu\rho}\partial_\sigma\xi^\sigma . \tag{8.22}$$

Taking the partial derivative of this expression gives

$$\partial_\rho\bar{h}'^{\mu\rho} = \partial_\rho\bar{h}^{\mu\rho} - \Box\xi^\mu . \tag{8.23}$$

So far, this description of algebra manipulations might seem a tad dry. But this is where the real money is to be made in careful observation. Suppose that we are smart enough to choose a coordinate system in which $\Box\xi^\mu = \partial_\rho\bar{h}^{\mu\rho}$. Then $\partial_\rho\bar{h}'^{\mu\rho} = 0$, which massively simplifies the Einstein equation. In particular, all the terms on the LHS which did not involve the d'Alembertian operator become equal to zero in this coordinate system. Wow!

To summarize, let us drop the primes for clarity, raise the indices with $\eta$, and write our Einstein equation in this awesome new coordinate system,

$$\Box\bar{h}^{\mu\nu} = -\frac{16\pi G_N}{c^4}T^{\mu\nu} . \tag{8.24}$$

In order for our metric perturbations to obey this simple wave equation, our coordinate system must obey

$$\partial_\mu\bar{h}^{\mu\nu} = 0 . \tag{8.25}$$

Any further coordinate change $x^\mu \to x^\mu + \xi^\mu$ within this gauge class would be OK, as long as it satisfied $\Box\xi^\mu = 0$. This is very reminiscent of the Lorenz gauge in electromagnetism, $\partial_\mu A^\mu = 0$, which still allows further gauge transformations of the form $A_\mu \to A_\mu + \partial_\mu\lambda$, where $\Box\lambda = 0$. Accordingly, this gauge for metric perturbations is sometimes rather loosely called the Lorenz (or, commonly, Lorentz) gauge. More properly, it is called the de Donder gauge.

8.2 Solving the linearized Einstein equations

As always, if we are trying to solve a wave equation, it helps to start by finding the Green's function,

$$\Box_x G(x^\sigma - y^\sigma) = \delta^{(4)}(x^\sigma - y^\sigma) . \tag{8.26}$$


As explained in detail in HEL §17.6, this is solved by the retarded Green's function

$$ G(x^\sigma) = \frac{\delta(x^0 - |\vec x|)\,\theta(x^0)}{4\pi|\vec x|} , \tag{8.27} $$

as you can check by substituting it in. Note that the retarded Green's function (as compared to, say, the advanced Green's function) is required by causality: we cannot expect a gravitational wave to be influenced by sources in its future light cone, only those in its past light cone. Using the retarded Green's function, we can see immediately that the solution to the Einstein equation for the metric perturbation is

$$ \bar h^{\mu\nu}(ct, \vec x) = -\frac{4G_N}{c^4}\int d^3\vec y\;\frac{T^{\mu\nu}(ct_r, \vec y)}{|\vec x - \vec y|} . \tag{8.28} $$

HEL Fig. 17.2 is very helpful for visualizing the meaning of the retarded time variable $t_r$ in this equation, which is defined by

$$ ct_r = ct - |\vec x - \vec y| . \tag{8.29} $$

Plodding through the details of how to check whether this satisfies the de Donder gauge condition requires careful attention to the retarded time story, using the chain rule for derivatives, and integration by parts. The net result is

$$ \frac{\partial}{\partial x^\mu}\bar h^{\mu\nu} = -\frac{4G_N}{c^4}\int d^3\vec y\;\frac{1}{|\vec x - \vec y|}\,\frac{\partial}{\partial y^\mu}T^{\mu\nu}(y^0, \vec y) . \tag{8.30} $$

But since the energy-momentum tensor is conserved in the linearized theory,

$$ \partial_\mu T^{\mu\nu} = 0 , \tag{8.31} $$

we have what we need:

$$ \partial_\mu\bar h^{\mu\nu} = 0 . \tag{8.32} $$

A very important idea from electromagnetism was the multipole expansion. Here, it is the conserved energy-momentum that sources our gravitational wave, rather than the conserved current sourcing the EM wave, but the principle is analogous. In an asymptotically flat spacetime, higher partial waves fall off with higher powers of distance, so the lowest pertinent multipole moment for a compact source dominates the physics of wave propagation far from the source. As you learned in 3rd year EM class, in order to generate EM waves, a time-dependent dipole moment is needed. In order to generate gravitational waves, it turns out that we will need a time-dependent quadrupole moment.

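To start our way towards that result, let us Taylor expand the denominator in the integral for $\bar h^{\mu\nu}$, with $|\vec x| = r$ and small $\vec y$,

$$ \begin{aligned} \frac{1}{|\vec x - \vec y|} &\simeq \frac{1}{r} + (-y^i)\partial_i\!\left(\frac{1}{r}\right) + \frac{1}{2!}(-y^i)(-y^j)\partial_i\partial_j\!\left(\frac{1}{r}\right) + \ldots \\ &= \frac{1}{r} + y^i\frac{x_i}{r^3} + \frac12\,y^iy^j\left(\frac{3x_ix_j - \delta_{ij}r^2}{r^5}\right) + \ldots \end{aligned} \tag{8.33} $$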
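As a quick numerical sanity check of the expansion (8.33), here is a minimal Python sketch. The field point and source point are made-up values chosen so that $|\vec y| \ll r$, the function name is hypothetical, and the second-order term is coded with the $1/2!$ from the Taylor series. It only illustrates that each successive multipole term reduces the error.

```python
import math

def inverse_distance_series(x, y):
    """Return (exact, l=0, l<=1, l<=2) approximations to 1/|x - y| for |y| << |x|."""
    r = math.sqrt(sum(xi * xi for xi in x))
    exact = 1.0 / math.dist(x, y)
    monopole = 1.0 / r
    dipole = sum(yi * xi for xi, yi in zip(x, y)) / r**3
    quadrupole = 0.0
    for i in range(3):
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            # (1/2!) y^i y^j (3 x_i x_j - delta_ij r^2) / r^5
            quadrupole += 0.5 * y[i] * y[j] * (3 * x[i] * x[j] - delta * r * r) / r**5
    return exact, monopole, monopole + dipole, monopole + dipole + quadrupole

x = (3.0, -4.0, 12.0)    # illustrative field point, r = 13
y = (0.1, 0.2, -0.05)    # illustrative source point, |y| << r
exact, l0, l1, l2 = inverse_distance_series(x, y)
errors = [abs(approx - exact) for approx in (l0, l1, l2)]
print(errors)            # each entry should be smaller than the one before
```

The errors shrink roughly by a factor of $|\vec y|/r$ per order, which is why the lowest non-vanishing multipole dominates far from a compact source.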

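Motivated by this, we define the multipoles

$$ M^{\mu\nu\sigma_1\sigma_2\ldots\sigma_\ell}(ct_r) = \int d^3\vec y\;T^{\mu\nu}(ct_r, \vec y)\,y^{\sigma_1}y^{\sigma_2}\cdots y^{\sigma_\ell} , \tag{8.34} $$

and obtain

$$ \bar h^{\mu\nu}(ct, \vec x) = -\frac{4G_N}{c^4}\sum_{\ell=0}^{\infty}\frac{(-1)^\ell}{\ell!}\,M^{\mu\nu\sigma_1\sigma_2\ldots\sigma_\ell}(ct_r)\,\partial_{\sigma_1}\partial_{\sigma_2}\cdots\partial_{\sigma_\ell}\!\left(\frac{1}{r}\right) . \tag{8.35} $$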

For the case of a compact source, we can use these general expressions to find approximations for our linearized metric perturbations. First we need to consider what the components of $T^{\mu\nu}$ tell us physically. $T^{00}$ is the energy density of the source particles, and if this is integrated over all space then it gives $Mc^2$, the conserved energy. $T^{0i}$ is the momentum density of source particles, and if this is integrated over all space it gives $P^ic$, which is also conserved at this order in perturbations. The $T^{ij}$ are the internal stresses, and they are not necessarily zero when integrated over all space. Without loss of generality, we may take our spatial coordinates $x^i$ to be in the centre of momentum (CoM) frame of the particles, so that $P^i = 0$. Then in CoM coordinates,

$$ \bar h^{00} = -\frac{4G_NM}{c^2r} , \qquad \bar h^{0i} = \bar h^{i0} = 0 . \tag{8.36} $$

The remaining parts are

$$ \bar h^{ij}(ct, \vec x) = -\frac{4G_N}{c^4r}\int d^3\vec y\,\left[T^{ij}(ct', \vec y)\right]\Big|_{ct'=ct-r} . \tag{8.37} $$

It is not especially easy to compute this integral directly. HEL explain carefully in §17.8 that a slightly indirect yet algebraically shorter route can be found by recruiting energy-momentum conservation $\partial_\mu T^{\nu\mu} = 0$. In a 3+1 split, we have

$$ \begin{aligned} 0 &= \partial_0T^{00} + \partial_kT^{0k} , \\ 0 &= \partial_0T^{i0} + \partial_kT^{ik} . \end{aligned} \tag{8.38} $$

These two equations can be used to turn our integral over $T^{ij}$ into integrals over higher moments of $T^{0i}$ and $T^{00}$. The first trick is to consider the integral of $\partial_k(T^{ik}y^j)$ over a volume completely enclosing the source and using Gauss's theorem. The second conservation equation in (8.38) then yields

$$ \int d^3\vec y\;T^{ij} = \frac{1}{2c}\frac{d}{dt_r}\int d^3\vec y\,\left(T^{i0}y^j + T^{j0}y^i\right) . \tag{8.39} $$

The second trick is to consider the integral of $\partial_k(T^{0k}y^iy^j)$ over the same enclosing volume; together with the first conservation equation in (8.38), it yields

$$ \int d^3\vec y\;T^{ij} = \frac{1}{2c^2}\frac{d^2}{dt_r^2}\int d^3\vec y\;T^{00}y^iy^j . \tag{8.40} $$

Defining the quadrupole moment $I^{ij}$ by

$$ I^{ij}(ct) = \int d^3\vec y\;T^{00}(ct, \vec y)\,y^iy^j \tag{8.41} $$

gives the solution

$$ \bar h^{ij}(ct, \vec x) = -\frac{2G_N}{c^6r}\left[\frac{d^2I^{ij}(ct')}{dt'^2}\right]\Bigg|_{t'=t_r} . \tag{8.42} $$

This is known as the quadrupole formula.

As a very easy example of solving for gravitational perturbations, we can consider a stationary non-relativistic source which is a perfect fluid. In this case the energy-momentum tensor is constant in time, and the distinction between time and retarded time is irrelevant. Then we have directly that

$$ \bar h^{\mu\nu}(\vec x) = -\frac{4G_N}{c^4}\int d^3\vec y\;\frac{T^{\mu\nu}(\vec y)}{|\vec x - \vec y|} . \tag{8.43} $$

When our perfect fluid is non-relativistic, all speeds are much smaller than c and to lowest order in perturbation theory we can neglect the pressure. This gives

$$ T^{00} = \rho c^2 , \qquad T^{0i} = \rho c\,u^i , \qquad T^{ij} = \rho\,u^iu^j , \tag{8.44} $$

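where ρ is the proper density distribution of the source. The solution can be written as

$$ \bar h^{00} = \frac{4\Phi}{c^2} , \qquad \bar h^{0i} = \frac{A^i}{c} , \qquad \bar h^{ij} = 0 , \tag{8.45} $$

where

$$ \begin{aligned} \Phi(\vec x) &= -G_N\int d^3\vec y\;\frac{\rho(\vec y)}{|\vec x - \vec y|} , \\ A^i(\vec x) &= -\frac{4G_N}{c^2}\int d^3\vec y\;\frac{\rho(\vec y)\,u^i(\vec y)}{|\vec x - \vec y|} . \end{aligned} \tag{8.46} $$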

We can easily obtain $h^{\mu\nu}$ as functions of $\bar h^{\mu\nu}$ using our earlier formula connecting them. The result is

$$ h^{00} = h^{11} = h^{22} = h^{33} = \frac{2\Phi}{c^2} , \qquad h^{0i} = \frac{A^i}{c} . \tag{8.47} $$

This provides the derivation that we promised quite a long time ago of the lowest-order Newtonian approximation to the spacetime metric,

$$ ds^2 = \left(1 + \frac{2\Phi}{c^2}\right)(c\,dt)^2 + 2\frac{A_i}{c}(c\,dt)\,dx^i - \left(1 - \frac{2\Phi}{c^2}\right)\delta_{ij}\,dx^idx^j , \tag{8.48} $$

with the bonus that we now allow for slow rotation of the source. An example of a stationary non-relativistic source would be a rigidly rotating sphere.
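To get a feel for how small the perturbation in (8.48) is in practice, the following Python sketch evaluates $h_{00} = 2\Phi/c^2$ at the surface of a spherical body, using $\Phi = -G_NM/r$ for the exterior potential. The masses, radii, and constants are standard reference values that are not part of these notes, and the function name is hypothetical.

```python
G_N = 6.674e-11    # m^3 kg^-1 s^-2
C = 2.998e8        # m/s

def metric_perturbation(mass_kg, radius_m):
    """h_00 = 2*Phi/c^2 outside a spherical mass, with Phi = -G_N M / r."""
    phi = -G_N * mass_kg / radius_m
    return 2.0 * phi / C**2

# (mass in kg, radius in m); illustrative reference values
bodies = {"Earth surface": (5.972e24, 6.371e6), "Sun surface": (1.989e30, 6.957e8)}
h00 = {name: metric_perturbation(m, r) for name, (m, r) in bodies.items()}
for name, value in h00.items():
    print(f"{name}: h_00 = {value:.3e}")
```

Both values are tiny compared with 1, which is what justifies treating the metric as flat spacetime plus a small perturbation in these settings.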

Another simple example of solving for gravitational perturbations is the case of gravitational plane waves. These take the form

$$ \bar h^{\mu\nu} = \tfrac12A^{\mu\nu}\exp(ik_\rho x^\rho) + \text{c.c.} \tag{8.49} $$

The de Donder gauge condition requires that the polarization tensor $A^{\mu\nu}$ obeys

$$ k_\mu A^{\mu\nu} = 0 , \tag{8.50} $$

i.e., it is transverse to the direction of propagation of the wave.

Let us count polarizations. We started off with ten components of our symmetric tensor metric perturbations. Fixing de Donder gauge reduces that to six independent components. We can further fix the gauge by doing a coordinate transformation $x^\mu \to x^\mu + \xi^\mu$, as long as we stay within de Donder gauge, which further reduces the number of independent components down to two. Let us see how this works, in more detail. Consider a $\xi^\mu$ of the form

$$ \xi^\mu = \epsilon^\mu\exp(ik_\nu x^\nu) . \tag{8.51} $$

This clearly obeys $\Box\xi^\mu = 0$ if $\epsilon^\mu = $ const. and the wavevector is null, $k_\nu k^\nu = 0$. Under this transformation, we know how $\bar h^{\mu\nu}$ transforms, which tells us that the polarization tensor must also transform as

$$ A'^{\mu\nu} = A^{\mu\nu} - i\epsilon^\mu k^\nu - i\epsilon^\nu k^\mu + i\eta^{\mu\nu}\epsilon^\rho k_\rho . \tag{8.52} $$

Let our wavevector lie along the z-direction: $\vec k = k\hat z$. Then by our de Donder gauge condition, $A^{\mu3} = A^{\mu0}\ \forall\mu$. Using this and the above two equations, we can straightforwardly show that the components of $\epsilon^\mu$ can always be chosen to ensure that the only nonzero components of the polarization tensor are

$$ [A^{\mu\nu}_{\rm TT}] = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & a & b & 0 \\ 0 & b & -a & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} . \tag{8.53} $$

This cleverly chosen gauge is known as the transverse traceless (TT) gauge. If we wish, we can write the polarization tensor as

$$ A^{\mu\nu}_{\rm TT} = a\,e^{\mu\nu}_+ + b\,e^{\mu\nu}_\times . \tag{8.54} $$

More generally, we define the TT gauge via

$$ h^{0i}_{\rm TT} \equiv 0 , \qquad h_{\rm TT} \equiv 0 . \tag{8.55} $$

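Using this and the de Donder gauge condition $\partial_\mu h^{\mu\nu}_{\rm TT} = 0$, we have that

$$ \partial_0h^{00}_{\rm TT} = 0 , \qquad \partial_ih^{ij}_{\rm TT} = 0 . \tag{8.56} $$

What effect does such a gravitational plane wave have on a bunch of free particles? We can work this out by using the geodesic equation for their motion,

$$ \frac{du^\sigma}{d\tau} + \Gamma^\sigma{}_{\mu\nu}u^\mu u^\nu = 0 . \tag{8.57} $$

Suppose a particle is initially at rest before the wave comes by. Then $[u^\mu] = c[1, \vec 0]^T$, and so

$$ \begin{aligned} \frac{du^\sigma}{d\tau} &= -c^2\,\Gamma^\sigma{}_{00} \\ &= -\frac{c^2}{2}\eta^{\sigma\rho}\left(\partial_0h_{\rho0} + \partial_0h_{0\rho} - \partial_\rho h_{00}\right) \\ &= 0 \end{aligned} \tag{8.58} $$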

because we are working in TT gauge. So hey: our coordinate system is adapted to individual particles! But even though the coordinate separation of particles is constant, their physical separation is not, because $\bar h^{\mu\nu} \neq 0$. Let us parametrize the coordinate spatial separation between two nearby particles as $S^i$. Then the physical spatial separation is

$$ \ell^2 \equiv -g_{ij}S^iS^j = (\delta_{ij} - h_{ij})S^iS^j . \tag{8.59} $$

To first order in perturbations, we can define a new physical separation vector $\zeta^i$ by

$$ \ell^2 = \delta_{ij}\zeta^i\zeta^j , \tag{8.60} $$

or

$$ \zeta^i = S^i + \tfrac12h^i{}_kS^k . \tag{8.61} $$

To see the effect of our gravitational plane wave in the z direction, let us inspect two particles in the (x, y) plane. Then $S^3 = 0$. Also, because $h^{k3}_{\rm TT} = 0\ \forall k$, there is no change in their z-separation due to the plane wave. Their moving around is going to happen in the (x, y) plane only. Another advantage of TT gauge is that $\bar h = 0$, which also implies that $h = 0$. Picking the $e^{\mu\nu}_+$ polarization tensor for definiteness, we find easily that

$$ h^{\mu\nu}_{\rm TT} = a\,e^{\mu\nu}_+\cos(k_\rho x^\rho) = a\,e^{\mu\nu}_+\cos[k(x^0 - x^3)] , \tag{8.62} $$

where $k = |\vec k| = \omega/c$. So

$$ (\zeta^i) = (S^1, S^2, 0)^T - \frac{a}{2}\cos[k(x^0 - x^3)]\,(S^1, -S^2, 0)^T . \tag{8.63} $$

This is illustrated nicely in Fig.18.1 of HEL.

For the other case of the crossed polarization $e^{\mu\nu}_\times$, Fig. 18.2 of HEL shows how to visualize its effect.

Either way, you can think of gravitational waves as stretchy-squeezy waves.
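The stretching and squeezing can be made concrete with a small Python sketch based on (8.61) and (8.63). The plus case is (8.63) directly. The cross case is obtained by the same steps with the off-diagonal entries $b$ of (8.53), which is my own working rather than a formula quoted from the notes. The function name, the amplitude 0.2 (wildly exaggerated for visibility), and the sample points are all illustrative assumptions.

```python
import math

def physical_separation(s1, s2, amplitude, phase, polarization="plus"):
    """Map a coordinate separation (S1, S2) in the x-y plane to (zeta1, zeta2).

    phase stands for k(x^0 - x^3); amplitude stands for a (plus) or b (cross).
    """
    c = 0.5 * amplitude * math.cos(phase)
    if polarization == "plus":
        # Eq. (8.63): x-separations shrink while y-separations grow, then vice versa
        return (s1 - c * s1, s2 + c * s2)
    # Cross polarization: unit entries of e_x sit in the (1,2) and (2,1) slots
    return (s1 - c * s2, s2 - c * s1)

# Particles on the x and y axes, at two phases of a plus-polarized wave
for phase in (0.0, math.pi):
    on_x = physical_separation(1.0, 0.0, 0.2, phase)
    on_y = physical_separation(0.0, 1.0, 0.2, phase)
    print(f"phase={phase:.2f}: x-axis particle {on_x}, y-axis particle {on_y}")
```

For the cross polarization the same pattern appears rotated by 45 degrees: separations along the diagonal $S^1 = S^2$ and the anti-diagonal $S^1 = -S^2$ alternate between being squeezed and stretched.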

8.3 Energy loss from gravitational radiation

There is no local notion of gravitational energy density in GR, because we could always change it via a coordinate transformation. Also, in generic spacetimes in GR, neither energy nor momentum is conserved. But we can still motivate an expression for the energy-momentum tensor of the gravitational field itself in the perturbative approximation, in order to allow us to derive the famous formula for the power radiated by gravitational waves.

We began our perturbative treatment of the spacetime metric from the full equations,

$$ G_{\mu\nu} = -\frac{8\pi G_N}{c^4}\,T_{\mu\nu} . \tag{8.64} $$

Now imagine that we go one step beyond linear order, keeping up to second-order terms in small quantities. Then we have

$$ G^{(1)}_{\mu\nu} + G^{(2)}_{\mu\nu} + \ldots = -\frac{8\pi G_N}{c^4}\,T_{\mu\nu} . \tag{8.65} $$

We could try moving the second-order approximation to the Einstein tensor over to the RHS and calling it $t^{\rm grav}_{\mu\nu}$. The problem with this idea is that unfortunately this expression is not gauge invariant. HEL explain in detail how to fix this by averaging over a small region about any given point and writing

$$ t^{\rm grav}_{\mu\nu} \equiv \frac{c^4}{8\pi G_N}\langle G^{(2)}_{\mu\nu}\rangle . \tag{8.66} $$

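After a good deal of fairly unilluminating algebra, the resulting expression becomes

$$ \begin{aligned} t^{\rm grav}_{\mu\nu} = {} & \frac{c^4}{32\pi G_N}\left\langle (\partial_\mu\bar h_{\rho\sigma})\,\partial_\nu\bar h^{\rho\sigma} - 2(\partial_\sigma\bar h^{\rho\sigma})\,\partial_{(\mu}\bar h_{\nu)\rho} - \tfrac12(\partial_\mu\bar h)\,\partial_\nu\bar h\right\rangle \\ & - \left\langle \bar h_{\rho(\mu}T^\rho{}_{\nu)} + \tfrac14\eta_{\mu\nu}\bar h^{\rho\sigma}T_{\rho\sigma}\right\rangle . \end{aligned} \tag{8.67} $$

The key property of this thing is that it is invariant under gauge transformations, as required. We consider gravitational plane waves in vacuo so that only the top line will appear for us.

Now, in TT de Donder gauge, we have $\partial_\mu h^{\mu\nu}_{\rm TT} = 0$, $h_{\rm TT} = 0$, and $\bar h^{\mu\nu}_{\rm TT} = h^{\mu\nu}_{\rm TT}$. So then in vacuo, we have only the first term in the complicated expression above turned on. In our TT gauge, considering only the radiative part of the gravitational field shows that $\bar h^{0i}_{\rm TT} = 0$, so that in fact only the spatial components of the perturbations actually appear,

$$ t^{\rm grav}_{\mu\nu} = \frac{c^4}{32\pi G_N}\left\langle (\partial_\mu h^{\rm TT}_{ij})(\partial_\nu h^{ij}_{\rm TT})\right\rangle . \tag{8.68} $$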

Physically, the energy flux (energy/area/time) in the $n^i$ spatial direction is

$$ F(\vec n) = -c\,t^{0k}n_k = +c\,\delta_{kj}\,t^{0k}n^j , \tag{8.69} $$

in our signature convention, because in general an energy-momentum tensor $t^{\mu\nu}$ encodes the flux of μ-momentum in the ν-direction.

Let us consider a compact source and aim for the far-field result, choosing $\vec n$ to be pointing in the radial direction away from the source. Then we have

$$ F(r) = -\frac{c^4}{32\pi G_N}\left\langle (\partial_th^{\rm TT}_{ij})\,\partial_rh^{ij}_{\rm TT}\right\rangle . \tag{8.70} $$

But from our quadrupole formula from earlier, we know that

$$ \bar h^{ij} = -\frac{2G_N}{c^6r}\left[\ddot I^{ij}\right]_r , \tag{8.71} $$

where a dot denotes $d/dt$ and the subscript r means using retarded time. We need an expression for the TT part of the quadrupole moment, so we first define its traceless part

$$ J_{ij} \equiv I_{ij} - \tfrac13\delta_{ij}I , \tag{8.72} $$

where $I = I^i{}_i$. Then

$$ \bar h^{ij}_{\rm TT} = h^{ij}_{\rm TT} = -\frac{2G_N}{c^6r}\left[\ddot J^{ij}_{\rm TT}\right]_r . \tag{8.73} $$

Now, in order to finish this line of reasoning, we need to slow down a little and be careful about how to take t and r derivatives at retarded time. Our definition of retarded time was

$$ x^0_r \equiv ct_r = x^0 - |\vec x - \vec y| , \tag{8.74} $$

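and so for any function $f(x^0_r, \vec y)$, we have

$$ \begin{aligned} \frac{\partial f(x^0_r, \vec y)}{\partial x^\mu} &= \left[\frac{\partial f(y^0, \vec y)}{\partial y^0}\right]_r\frac{\partial x^0_r}{\partial x^\mu} , \\ \frac{\partial f(x^0_r, \vec y)}{\partial y^i} &= \left[\frac{\partial f(y^0, \vec y)}{\partial y^i}\right]_r + \left[\frac{\partial f(y^0, \vec y)}{\partial y^0}\right]_r\frac{\partial x^0_r}{\partial y^i} , \end{aligned} \tag{8.75} $$

where the subscript r means to evaluate at $y^0 = x^0_r$. We therefore have that

$$ \partial_th^{\rm TT}_{ij} = -\frac{2G_N}{c^6r}\left[\dddot J^{\rm TT}_{ij}\right]_r . \tag{8.76} $$

We can also evaluate

$$ \partial_rh^{\rm TT}_{ij} = \frac{2G_N}{c^6r^2}\left[\ddot J^{\rm TT}_{ij}\right]_r + \frac{2G_N}{c^7r}\left[\dddot J^{\rm TT}_{ij}\right]_r . \tag{8.77} $$

The second term here dominates over the first at large r, and so our expression for the radiation flux from the gravitational wave source is

$$ F(r) = \frac{G_N}{8\pi r^2c^9}\left\langle \dddot J^{\rm TT}_{ij}\,\dddot J^{ij}_{\rm TT}\right\rangle . \tag{8.78} $$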

Our last task is to express this in terms of the original quadrupole. For that we need a handy projection tensor,

$$ P_{ij} \equiv \delta_{ij} - n_in_j . \tag{8.79} $$

Applying this to an arbitrary spatial vector allows one to see that it obeys the properties we expect of a projector. Then the transverse part of the polarization tensor for the gravitational wave is $A^{ij}_{\rm T} = P^i{}_kP^j{}_\ell A^{k\ell}$. To ensure that there is no trace part, we need to form $A^{ij}_{\rm TT} = \left(P^i{}_kP^j{}_\ell - \tfrac12P^{ij}P_{k\ell}\right)A^{k\ell}$. By direct analogy, we find for the quadrupole

$$ J^{ij}_{\rm TT} = \left(P^i{}_kP^j{}_\ell - \tfrac12P^{ij}P_{k\ell}\right)J^{k\ell} . \tag{8.80} $$

Denoting the components of the unit radial vector by $\hat x^i$, this gives

$$ J^{\rm TT}_{ij}J^{ij}_{\rm TT} = J_{ij}J^{ij} - 2J^j{}_iJ^{ik}\hat x_j\hat x_k + \tfrac12J^{ij}J^{k\ell}\hat x_i\hat x_j\hat x_k\hat x_\ell . \tag{8.81} $$

To get the integrated luminosity, we integrate this over 4π of solid angle. After the boring dust settles, we have (at last!) the famous formula we wanted,

$$ \frac{dE}{dt} = -L_{\rm GW} = -\frac{G_N}{5c^9}\left\langle\left[\dddot J^{ij}\,\dddot J_{ij}\right]_r\right\rangle . \tag{8.82} $$
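For readers who want to see where the factor of 1/5 comes from, here is a sketch of the angular integration. It is written in Euclidean notation with all spatial indices lowered, which is my own simplification of the index placement in (8.81), and it uses the tracelessness of $J_{ij}$. The standard averages of products of unit vectors over the sphere are

$$ \frac{1}{4\pi}\int d\Omega\;\hat x_i\hat x_j = \frac13\delta_{ij} , \qquad \frac{1}{4\pi}\int d\Omega\;\hat x_i\hat x_j\hat x_k\hat x_\ell = \frac{1}{15}\left(\delta_{ij}\delta_{k\ell} + \delta_{ik}\delta_{j\ell} + \delta_{i\ell}\delta_{jk}\right) . $$

Applying these to the three terms of (8.81), with $J_{ii} = 0$, gives

$$ \frac{1}{4\pi}\int d\Omega\;J^{\rm TT}_{ij}J^{\rm TT}_{ij} = \left(1 - \frac23 + \frac12\cdot\frac{2}{15}\right)J_{ij}J_{ij} = \frac25\,J_{ij}J_{ij} , $$

so that integrating the flux (8.78) over a sphere of radius r yields

$$ L_{\rm GW} = 4\pi r^2\cdot\frac{G_N}{8\pi r^2c^9}\cdot\frac25\left\langle\dddot J_{ij}\,\dddot J_{ij}\right\rangle = \frac{G_N}{5c^9}\left\langle\dddot J_{ij}\,\dddot J_{ij}\right\rangle , $$

in agreement with (8.82).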

This shows that you not only need a quadrupole (not a monopole or a dipole) to produce gravitational radiation, you also need the third derivative of it turned on. Again, the reason why we use retarded times in this expression is to ensure the correct boundary conditions for our Green's function reflecting causality.
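As an illustration of the formula (8.82), the following Python sketch evaluates it for two point masses on a circular Newtonian orbit and compares the result with the well-known closed form $L = \frac{32}{5}\frac{G_N^4}{c^5}\frac{(m_1m_2)^2(m_1+m_2)}{a^5}$ for circular binaries. Several things here are assumptions not taken from the notes: the binary parameters, the function names, and the use of finite differences. Since the quadrupole (8.41) is built from $T^{00} = \rho c^2$, it equals $c^2$ times the mass quadrupole $Q_{ij}$, so the prefactor $G_N/(5c^9)$ becomes $G_N/(5c^5)$ when written in terms of $Q_{ij}$.

```python
import math

G_N = 6.674e-11    # m^3 kg^-1 s^-2
C = 2.998e8        # m/s

def mass_quadrupole(t, mu, sep, omega):
    """x-y block of Q_ij = mu * sep^2 * n_i n_j for a circular orbit in the x-y plane."""
    n = (math.cos(omega * t), math.sin(omega * t))
    return [[mu * sep**2 * n[i] * n[j] for j in range(2)] for i in range(2)]

def third_derivative(f, t, h):
    """Central finite-difference estimate of f'''(t), accurate to O(h^2)."""
    return (f(t + 2 * h) - 2 * f(t + h) + 2 * f(t - h) - f(t - 2 * h)) / (2 * h**3)

def quadrupole_luminosity(m1, m2, sep, samples=16):
    """Time-averaged G_N/(5 c^5) <Q'''_ij Q'''_ij> for a circular binary."""
    total_mass = m1 + m2
    mu = m1 * m2 / total_mass                      # reduced mass
    omega = math.sqrt(G_N * total_mass / sep**3)   # Kepler's third law
    period = 2 * math.pi / omega
    h = period / 200
    mean_square = 0.0
    for s in range(samples):
        t = s * period / samples
        for i in range(2):
            for j in range(2):
                # The trace of Q_ij is constant in time, so the third derivative
                # of the traceless part J_ij equals that of Q_ij itself.
                d3 = third_derivative(
                    lambda u: mass_quadrupole(u, mu, sep, omega)[i][j], t, h)
                mean_square += d3 * d3 / samples
    return G_N * mean_square / (5 * C**5)

def closed_form_circular(m1, m2, sep):
    """Closed-form quadrupole luminosity of a circular binary."""
    return 32 * G_N**4 * (m1 * m2) ** 2 * (m1 + m2) / (5 * C**5 * sep**5)

M_SUN = 1.989e30
numeric = quadrupole_luminosity(1.4 * M_SUN, 1.4 * M_SUN, 1.0e9)
analytic = closed_form_circular(1.4 * M_SUN, 1.4 * M_SUN, 1.0e9)
print(f"numeric  = {numeric:.3e} W")
print(f"analytic = {analytic:.3e} W")
```

The two numbers should agree to within the finite-difference error, which is a few parts in a thousand for the step size chosen here.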

Gravitational radiation was first detected indirectly, via the binary pulsar discovered in 1974 by Russell Hulse and Joseph Taylor, work which won them the 1993 Nobel Prize in Physics. The orbital period of the binary decreased over time, at a rate precisely predicted by GR. What was a far more impressive technological feat was the building of LIGO, the Laser Interferometer Gravitational-Wave Observatory. It won the 2017 Nobel Prize in Physics for Rainer Weiss, Barry Barish, and Kip Thorne for the direct discovery of gravitational waves: tiny ripples in the very fabric of spacetime, long thought technologically impossible to detect. Here are some URLs for checking out their discoveries:

• https://www.youtube.com/watch?v=FXlg3cr-q44

• https://www.ligo.caltech.edu/page/press-release-gw170817

• https://www.ligo.caltech.edu/news/ligo20160211

• https://www.ligo.caltech.edu/news/ligo20160615

• https://www.ligo.caltech.edu/news/ligo20170601

9 Cosmology

This section is based partly on pieces of §14, §15, and §16 of HEL.

Have you ever wondered why the night sky is mostly dark, punctuated by starlight? Assume for the moment that the universe is approximately the same in every direction, which is a decent first approximation. There ought to be stars in every direction. But more distant stars appear dimmer: the flux of light received from a source obeys an inverse-square law. A competing effect is that as you go further away, the surface area of a shell of night sky, and hence the number of stars in it, grows like radius-squared. Combining these two facts tells us that the night sky should be ablaze with light in every direction. The fact that it is not is called Olbers' Paradox. In reaching this conclusion, we assumed implicitly that the universe is static and infinite in size. In fact, our observable universe is actually finite and expanding! So how do astrophysicists know this?

The key technique is spectrographic analysis of starlight. Quantum mechanics tells us that photons obey $E = h\nu$. So if ν is redshifted compared to what is observed in the atomic rest frame, then special relativity tells us that the source must be moving away from us, and how fast. By making use of the work of others such as Slipher, Hubble found that, on average, the more distant light sources were moving away faster, and codified his observations into a "law": $v = H_0D$, where D is distance. The constant $H_0$ is known as the Hubble constant.

Physicists realized that the reason why the universe is expanding is that the fabric of spacetime itself is expanding. This can be explained in a similar way to how pointlike dots drawn on a balloon move apart from each other faster as the balloon is blown up. Note that this expansion of spacetime does not change the size of gravitationally bound systems like stars with planets or galaxies; it just increases the space in between them.

What if a galaxy were far enough away that its recessional speed became greater than c? Does it mean that signals are being transmitted in violation of causality? No. The light from it simply cannot reach us any more. We call the patch of spacetime containing things with which we could in principle communicate the causal patch. It has a finite size because our universe has lived for only a finite amount of time (about 13.8 billion years).
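To put a number on "far enough away", here is a minimal Python sketch of Hubble's law $v = H_0D$. The value $H_0 = 70$ km/s/Mpc is an assumed round figure rather than one quoted in these notes, and the function name is hypothetical. The distance at which the naive recession speed reaches c is $c/H_0$.

```python
C_KM_S = 299792.458    # speed of light in km/s
H0 = 70.0              # assumed Hubble constant in km/s/Mpc (illustrative value)

def recession_speed(distance_mpc, hubble=H0):
    """Hubble's law v = H0 * D, with D in Mpc and v in km/s."""
    return hubble * distance_mpc

hubble_radius_mpc = C_KM_S / H0    # distance at which v = c
print(f"v at 100 Mpc: {recession_speed(100.0):.0f} km/s")
print(f"c / H0 = {hubble_radius_mpc:.0f} Mpc")
```

This is only the lowest-order estimate; relating distance and redshift at such large separations requires the full history of the scale factor.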

9.1 FRW metrics

The universe is isotropic and homogeneous to a pretty decent level of approximation. We will not have nearly enough time in this course to develop the fascinating story of cosmological perturbations describing the cosmic microwave background and structure formation, but we can at least offer you a cosmological aperitif. So let us see where the assumptions of isotropy (sameness in all directions) and homogeneity (sameness in all locations) take us.

If our metric is of the form

$$ ds^2 = c^2dt^2 - g_{ij}\,dx^idx^j , \tag{9.1} $$

then you can check that $[x^\mu(\tau)] = [\tau, \vec x_*]^T$ obeys the geodesic equation, where $\vec x_* = $ constant.

This is why the time coordinate t is called synchronous and the spatial coordinates are called comoving: an observer at $\vec x_*$ measures proper time $\tau = t$. Isotropy and homogeneity require uniform expansion,

$$ g_{ij}(x^\alpha) = a^2(t)\,h_{ij}(x^k) , \tag{9.2} $$

where $a(t)$ is an overall time-dependent scale factor. The spatial metric, being isotropic, must be a function of only

$$ \vec x\cdot\vec x , \qquad \vec x\cdot d\vec x , \qquad d\vec x\cdot d\vec x . \tag{9.3} $$

In spherical polars, where $\vec x\cdot\vec x = r^2$, $\vec x\cdot d\vec x = r\,dr$, and $d\vec x\cdot d\vec x = dr^2 + r^2d\theta^2 + r^2\sin^2\theta\,d\phi^2$,

$$ \begin{aligned} d\sigma^2 \equiv g_{ij}\,dx^idx^j &= C(r)\,(\vec x\cdot d\vec x)^2 + D(r)\,(d\vec x\cdot d\vec x) \\ &= C(r)\,r^2dr^2 + D(r)\left(dr^2 + r^2d\theta^2 + r^2\sin^2\theta\,d\phi^2\right) \\ &:= B(r)\,dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right) \end{aligned} \tag{9.4} $$

after a redefinition of the radial coordinate. If we further require a maximally symmetric space, for which

$$ R_{ijk\ell} = \kappa\left(g_{ik}g_{j\ell} - g_{i\ell}g_{jk}\right) , \tag{9.5} $$

then a little bit of further algebra yields

$$ ds^2 = c^2dt^2 - a^2(t)\left[\frac{dr^2}{1 - \kappa r^2} + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right] , \tag{9.6} $$

where κ = +1, −1, 0 for positively curved, negatively curved, and flat spatial sections respectively. Notice that for the case κ = +1, $g_{rr}$ blows up at r = 1. It is convenient for seeing the physics to define a new radial coordinate χ via $r = \sin\chi$, which you can check gives a metric without a coordinate singularity. For the flat case, we simply write $r = \chi$. For the case κ = −1, by analogy with the κ = +1 case we pick $r = \sinh\chi$. All three cases can be combined efficiently by writing

$$ ds^2 = c^2dt^2 - a^2(t)\left[d\chi^2 + S^2(\chi)\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right] \tag{9.7} $$

where

$$ S(\chi) = \begin{cases} \sin\chi , & \kappa = +1 ; \\ \chi , & \kappa = 0 ; \\ \sinh\chi , & \kappa = -1 . \end{cases} \tag{9.8} $$
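The claim that $r = S(\chi)$ removes the $1/(1 - \kappa r^2)$ factor from the radial part of the metric can be checked numerically. The Python sketch below differentiates $S(\chi)$ by finite differences and confirms that $(dr/d\chi)^2/(1 - \kappa r^2) = 1$ for each curvature. The function names and the sample value of χ are illustrative assumptions.

```python
import math

def S(chi, kappa):
    """Radial function of Eq. (9.8) for kappa = +1, 0, -1."""
    if kappa == 1:
        return math.sin(chi)
    if kappa == 0:
        return chi
    if kappa == -1:
        return math.sinh(chi)
    raise ValueError("kappa must be +1, 0 or -1")

def radial_factor(chi, kappa, h=1e-6):
    """(dr/dchi)^2 / (1 - kappa r^2) with r = S(chi); equals 1 if dr^2/(1 - kappa r^2) = dchi^2."""
    r = S(chi, kappa)
    dr_dchi = (S(chi + h, kappa) - S(chi - h, kappa)) / (2 * h)
    return dr_dchi**2 / (1.0 - kappa * r * r)

factors = {kappa: radial_factor(0.7, kappa) for kappa in (1, 0, -1)}
print(factors)    # each value should be 1 up to finite-difference error
```

For κ = +1 this is just $\cos^2\chi/(1 - \sin^2\chi) = 1$, and for κ = −1 it is $\cosh^2\chi/(1 + \sinh^2\chi) = 1$.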

What do geodesics look like in the above spacetimes (9.7)? Let us inspect the alternative form of the geodesic equation,

$$ \dot u_\mu = \tfrac12\left(\partial_\mu g_{\nu\sigma}\right)u^\nu u^\sigma , \tag{9.9} $$

where $u^\mu$ is the usual four-velocity. By the homogeneity and isotropy of FRW spacetimes above, without loss of generality we can take the spatial origin as a point P on the geodesic we are analyzing, at χ = 0. Then inspecting the φ component of eq. (9.9) shows us that $u_\phi$ is conserved. If we have $u_\phi = 0$ at P, then it will continue to be zero. Since the metric is diagonal and invertible, we have that also $u^\phi = 0$. Accordingly, this component of the geodesic equation is solved by

$$ \phi = \text{constant} . \tag{9.10} $$

Inspecting the θ component in turn yields, by a similar chain of logic,

$$ \theta = \text{constant} . \tag{9.11} $$

How about the radial component? We know that $u^\phi$, $u^\theta$ are both zero, from above. We also know that $\partial_\chi g_{tt} = 0$ and $\partial_\chi g_{\chi\chi} = 0$ in this coordinate system. Therefore, $u_\chi$ is constant,

$$ a^2(t)\,\dot\chi = \text{const} . \tag{9.12} $$

We can find $\dot t$ from $u^\mu u_\mu = \epsilon c^2$, where ε = +1 for massive particles and ε = 0 for massless particles,

$$ \dot t^2 = \epsilon + \frac{a^2(t)\,\dot\chi^2}{c^2} . \tag{9.13} $$

Can we recover Hubble's Law from all this? There are a few steps involved, but yes. The first step is to figure out a quantity known as the cosmological redshift z. Let us see how this works. Suppose that we emit a photon at $[x^\mu_E] = [t_E, \chi_E, \theta_E, \phi_E]^T$ and receive it at $[x^\mu_R] = [t_R, \vec 0]^T$. Its momentum four-vector obeys $p_\mu p^\mu = 0$, which yields $p_0 = c\,p_1/a(t)$, where $p_1 = $ const. is conserved along a geodesic. If the emitter and receiver are at fixed spatial coordinates, which they are, we can use our redshift formula worked out for application to the Pound-Rebka experiment,

$$ \frac{\nu_R}{\nu_E} = \frac{p_0(R)}{p_0(E)}\sqrt{\frac{g_{00}(E)}{g_{00}(R)}} . \tag{9.14} $$

For our FRW metric, $g_{00} = c^2$ for all t, so that

$$ \frac{\nu_R}{\nu_E} \equiv \frac{1}{1+z} = \frac{a(t_E)}{a(t_R)} . \tag{9.15} $$

Accordingly, when the universe is expanding, the scale factor at reception is larger than that at emission, and so the photon is redshifted. If the universe were contracting, we would see a blueshift instead.

The second step is to recognize that, for small look-back times like Hubble was investigating, we can Taylor expand about our current epoch time $t_0$,

$$ \begin{aligned} a(t) &\simeq a(t_0) - (t_0 - t)\,\dot a(t_0) + \tfrac12(t_0 - t)^2\,\ddot a(t_0) + \ldots \\ &= a(t_0)\left[1 - (t_0 - t)H(t_0) - \tfrac12(t_0 - t)^2\,q(t_0)H^2(t_0) - \ldots\right] , \end{aligned} \tag{9.16} $$

where we have defined the Hubble parameter $H(t)$ and deceleration parameter $q(t)$

$$ H(t) \equiv \frac{\dot a(t)}{a(t)} , \qquad q(t) \equiv -\frac{\ddot a(t)\,a(t)}{\dot a^2(t)} . \tag{9.17} $$

Then

$$ \begin{aligned} z = \frac{a(t_0)}{a(t)} - 1 &\simeq \left[1 - (t_0 - t)H_0 - \tfrac12(t_0 - t)^2q_0H_0^2 - \ldots\right]^{-1} - 1 \\ \Rightarrow\quad z &\simeq (t_0 - t)H_0 + (t_0 - t)^2\left[1 + \frac{q_0}{2}\right]H_0^2 + \ldots \end{aligned} \tag{9.18} $$

For small z, this can be inverted to give

$$ t_0 - t \simeq H_0^{-1}z - H_0^{-1}\left(1 + \frac{q_0}{2}\right)z^2 + \ldots \tag{9.19} $$

Note that, when we are not looking back very far in time, none of the above perturbative expressions depends on any variables other than $H_0$, $a_0$, $q_0$.

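We are nearly there. To get a relationship between redshift and distance, we need the χ coordinate of the emitting galaxy, also expanded in a Taylor series. Since we are analyzing a photon, for which $c\,dt = a(t)\,d\chi$ along a radial path, we have

$$ \begin{aligned} \chi = \int_t^{t_0}\frac{c\,dt}{a(t)} &\simeq \int_t^{t_0}dt\;c\,a_0^{-1}\left[1 - (t_0 - t)H_0 - \ldots\right]^{-1} \\ &\simeq \frac{c}{a_0}\left[(t_0 - t) + \tfrac12(t_0 - t)^2H_0 + \ldots\right] . \end{aligned} \tag{9.20} $$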

Combining this with our expression for $(t_0 - t)$ in terms of z gives

$$ \chi \simeq \frac{c}{a_0H_0}\left[z - \tfrac12(1 + q_0)z^2 + \ldots\right] . \tag{9.21} $$
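The series (9.19) and (9.21) can be tested against a case where everything is known in closed form. The Python sketch below assumes a flat matter-dominated model with $a(t) = a_0(t/t_0)^{2/3}$, which is my choice of test case and not something used at this point in the notes. For it, $H_0 = 2/(3t_0)$, $q_0 = 1/2$, $t = t_0(1+z)^{-3/2}$, and $\chi = \frac{2c}{a_0H_0}\left[1 - (1+z)^{-1/2}\right]$. The function name and units ($t_0 = a_0 = c = 1$) are illustrative.

```python
def matter_dominated_check(z, t0=1.0, a0=1.0, c=1.0):
    """Compare exact look-back time and chi with the series (9.19) and (9.21)."""
    hubble0 = 2.0 / (3.0 * t0)    # H0 for a(t) proportional to t^(2/3)
    q0 = 0.5                      # deceleration parameter for the same model
    lookback_exact = t0 * (1.0 - (1.0 + z) ** -1.5)
    lookback_series = (z - (1.0 + q0 / 2.0) * z**2) / hubble0
    chi_exact = 2.0 * c / (a0 * hubble0) * (1.0 - (1.0 + z) ** -0.5)
    chi_series = c / (a0 * hubble0) * (z - 0.5 * (1.0 + q0) * z**2)
    return lookback_exact, lookback_series, chi_exact, chi_series

lb_exact, lb_series, chi_exact, chi_series = matter_dominated_check(0.01)
lookback_rel_error = abs(lb_series - lb_exact) / lb_exact
chi_rel_error = abs(chi_series - chi_exact) / chi_exact
print(lookback_rel_error, chi_rel_error)    # both should be of order z^2
```

The residual errors scale like $z^2$ relative to the leading term, as expected for series truncated after the quadratic term.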

Now we can do the last step. The proper distance to an emitting galaxy is $d = a_0\chi$. So for nearby galaxies,

$$ d \simeq c\,(t_0 - t) . \tag{9.22} $$

We had earlier that $z \simeq (t_0 - t)H_0$ at lowest order. Combining them and interpreting the cosmological redshift as a Doppler shift due to recession velocity v of the emitting galaxy gives

$$ v = cz = H_0d . \tag{9.23} $$

This is Hubble's Law. If we wanted to look back further than perturbatively, we would need to know the whole history of evolution of the scale factor of the universe to do it.

Another concept worth mentioning here is known as the particle horizon. To see what it is, let us consider a comoving observer at χ = 0 and ask about light signals. The emitter's coordinate $\chi_1$ is determined by

$$ \chi_1 = c\int_{t_1}^{t}\frac{dt'}{a(t')} . \tag{9.24} $$

If this integral diverges as $t_1 \to 0$, then $\chi_1$ can be as big as we like by taking $t_1$ sufficiently small; this means it is possible for us to see any comoving particle like a galaxy. But if the integral converges as $t_1 \to 0$, then there is an upper bound on $\chi_1$, and our view is limited by a particle horizon. Does this mean stars just pop into our view as the universe grows? No. In fact, the particle horizon is the surface of infinite redshift, so a galaxy coming into our view would begin with infinite redshift. As the cosmos expanded, its redshift would gradually reduce.

9.2 Solving Einstein’s equations

FRW cosmological models assume homogeneity and isotropy, and that the universe can be described in terms of an overall spatial scale factor $a(t)$. What I now want to discuss is what kinds of $T_{\mu\nu}$ can actually support this type of solution of the Einstein equations. We will find that the scale factor evolution will be tied to the energy density and pressure of the system.

To see how the Einstein equations work, it is more convenient to use our comoving coordinates, in which the areal radius is r and the metric is

$$ ds^2 = c^2dt^2 - a^2(t)\left[\frac{dr^2}{1 - \kappa r^2} + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right] . \tag{9.25} $$

The Christoffels, and the Riemann and Ricci tensors, can be computed by hand or via your favourite computer algebra app like Maxima. As you can check, the nonzero components of the rank (1,1) Ricci tensor are

$$ R^t{}_t = \frac{d}{c^2}\,\frac{\ddot a}{a} , \tag{9.26} $$

$$ R^r{}_r = R^\theta{}_\theta = R^\phi{}_\phi = \frac{1}{c^2}\left\{\frac{\ddot a}{a} + (d-1)\left[\left(\frac{\dot a}{a}\right)^2 + \frac{\kappa c^2}{a^2}\right]\right\} , \tag{9.27} $$

where $d = D - 1 = 3$ in our universe. Then the Ricci scalar is

$$ R = \frac{1}{c^2}\left\{2d\,\frac{\ddot a}{a} + d(d-1)\left[\left(\frac{\dot a}{a}\right)^2 + \frac{\kappa c^2}{a^2}\right]\right\} . \tag{9.28} $$

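How about the stress-energy tensor of our perfect fluid? This generally takes the form

$$ T^\mu{}_\nu = \left(\rho + \frac{p}{c^2}\right)u^\mu u_\nu - p\,\delta^\mu{}_\nu . \tag{9.29} $$

The usefulness of the coordinate system we have chosen is that the coordinates are comoving, meaning that we can assign a timelike four-velocity vector field consistently obeying $u^\mu u_\mu = c^2$ that has $[u^\mu] = [1, \vec 0]^T$ in the coordinates $(t, r, \theta, \phi)$, for which $g_{tt} = c^2$. Then we have

$$ T^t{}_t = c^2\rho , \qquad T^r{}_r = T^\theta{}_\theta = T^\phi{}_\phi = -p . \tag{9.30} $$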

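Substituting this into the Einstein equations, with the cosmological constant term included,

$$ R_{\mu\nu} = -\frac{8\pi G_N}{c^4}\left(T_{\mu\nu} - \tfrac12g_{\mu\nu}T^\lambda{}_\lambda\right) + \Lambda g_{\mu\nu} \tag{9.31} $$

gives

$$ \left(\frac{\dot a}{a}\right)^2 = \frac{8\pi G_N}{3}\rho + \frac13\Lambda c^2 - \frac{\kappa c^2}{a^2} , \tag{9.32} $$

$$ \frac{\ddot a}{a} = -\frac{4\pi G_N}{3}\left(\rho + 3\frac{p}{c^2}\right) + \frac13\Lambda c^2 . \tag{9.33} $$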

These are known as the Friedmann-Lemaître equations. Note that the first one is a first order constraint, while the second one is a second order equation of motion. The constraint must be satisfied at all times throughout the evolution by the (nonlinear) second order equation of motion. This is a general feature: when you have a solution of the Einstein equations, there is always one first order constraint as well as the requisite number of second order equations.

Recall our definition for the Hubble parameter,

$$ H = \frac{\dot a}{a} . \tag{9.34} $$

Let us also define

$$ \Omega = \frac{\rho}{\rho_{\rm crit}} , \tag{9.35} $$

where

$$ \rho_{\rm crit} = \frac{3}{8\pi G_N}\left(H^2 - \frac{\Lambda c^2}{3}\right) . \tag{9.36} $$

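Then we can rearrange the constraint equation to obtain

$$ \Omega - 1 = \frac{\kappa c^2}{\dot a^2 - \tfrac13\Lambda c^2a^2} , \tag{9.37} $$

which reduces to $\kappa c^2/\dot a^2$ when Λ is negligible. In that form we can see immediately that it is driven towards zero dynamically whenever $\dot a$ grows with time, that is, whenever the expansion accelerates. Experimentally, our universe appears to have critical density, to a very high degree of accuracy, so this means the universe is likely to be spatially flat with κ = 0. Also, for our universe in the present day, $\rho_{{\rm crit},0} \sim 9.2\times10^{-27}\ {\rm kg\,m^{-3}}$, which is a few protons per cubic metre.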

We also need to know how the scale factor accelerates with time, and this is provided by the second order Friedmann equation, which involves both the energy density and the pressure. Note that the constraint equation only involved the energy density.

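A note regarding the contribution of the cosmological constant. We can shift it over to the RHS of the Einstein equation,

$$ R_{\mu\nu} - \tfrac12g_{\mu\nu}R = -\frac{8\pi G_N}{c^4}\,T_{\mu\nu} - \Lambda g_{\mu\nu} , \tag{9.38} $$

allowing us to define a $T_{\mu\nu}(\Lambda)$. Raising one index, it is

$$ T^\mu{}_\nu(\Lambda) = \frac{\Lambda c^4}{8\pi G_N}\,\delta^\mu{}_\nu , \tag{9.39} $$

and writing this in perfect fluid form gives

$$ \rho_\Lambda c^2 = \frac{\Lambda c^4}{8\pi G_N} = -p_\Lambda . \tag{9.40} $$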

It is traditional to define the constant w via the equation of state

$$ p = w\rho c^2 , \tag{9.41} $$

and so for the cosmological constant we have w = −1.

There is one other important equation we have not written down yet: the equation expressing covariant conservation of the energy-momentum tensor. In our comoving coordinate system in D = 4, it yields

$$ \dot\rho + 3\frac{\dot a}{a}\left(\rho + \frac{p}{c^2}\right) = 0 . \tag{9.42} $$

Using our equation of state this can be rewritten as

$$ \frac{\dot\rho}{\rho} = -3(1 + w)\left(\frac{\dot a}{a}\right) . \tag{9.43} $$

Both sides are total time derivatives of logarithms, so for constant w the solution is

$$ \rho \propto a^{-3(1+w)} . \tag{9.44} $$

For dust, which is the cosmologist's name for any set of collisionless nonrelativistic particles, p = 0, which implies that

$$ \rho_{\rm dust} \propto \frac{1}{a^3} . \tag{9.45} $$

This shows us that the energy density of nonrelativistic particles gets diluted by three powers of the scale factor as time progresses in an expanding universe. For radiation, which is composed of massless stuff like photons, the energy-momentum tensor is traceless (in D = 4, as we will show soon). For a perfect fluid, $T^\sigma{}_\sigma = \rho c^2 + p - Dp = \rho c^2 - 3p$. Therefore, for radiation, w = +1/3, so that

$$ \rho_{\rm rad} \propto \frac{1}{a^4} . \tag{9.46} $$

Radiation dilutes faster than matter does as the universe expands.

Our actual universe has contributions to Ω from matter, radiation, cosmological constant, and spatial curvature, as we can see from the Friedmann-Lemaître equations. For such a model, we would have a multi-component perfect fluid, and this makes solving analytically for the forms of the scale factors rather more nontrivial. The principles are very straightforward, though, and you can easily make Maxima do the work, either analytically or numerically, to produce suitable plots. HEL §15 go into quite a bit of detail on this front.
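As a small illustration of letting the computer do the work, here is a Python sketch (rather than Maxima) that integrates the constraint equation (9.32) for a spatially flat universe containing only dust and a cosmological constant. It assumes the conventional present-day density fractions $\Omega_m$ and $\Omega_\Lambda$ measured relative to $3H_0^2/(8\pi G_N)$, with $a_0 = 1$, so that $H = H_0\sqrt{\Omega_m a^{-3} + \Omega_\Lambda}$. These definitions, the values 0.3 and 0.7, and the function names are illustrative assumptions rather than anything fixed by these notes.

```python
import math

def age_in_hubble_times(omega_m, omega_lambda, steps=20000):
    """Age H0 * t0 = integral_0^1 da / (a * sqrt(omega_m / a^3 + omega_lambda)).

    Uses the midpoint rule, which avoids evaluating the integrand at a = 0.
    """
    da = 1.0 / steps
    total = 0.0
    for n in range(steps):
        a = (n + 0.5) * da
        total += da / (a * math.sqrt(omega_m / a**3 + omega_lambda))
    return total

def flat_lcdm_age_closed_form(omega_m, omega_lambda):
    """Closed form of the same integral for omega_m + omega_lambda = 1, omega_lambda > 0."""
    return (2.0 / (3.0 * math.sqrt(omega_lambda))) * math.asinh(
        math.sqrt(omega_lambda / omega_m))

numeric_age = age_in_hubble_times(0.3, 0.7)
analytic_age = flat_lcdm_age_closed_form(0.3, 0.7)
dust_only_age = age_in_hubble_times(1.0, 0.0)    # should approach 2/3
print(numeric_age, analytic_age, dust_only_age)
```

Multiplying the result by $1/H_0$ converts it to years. The dust-only case reproduces the familiar $t_0 = 2/(3H_0)$, and adding a cosmological constant makes the universe older for the same $H_0$.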

Let us make a few brief remarks about the cosmological constant, sometimes known as the dark energy. This has w = −1 and constant $\rho = \Lambda c^2/(8\pi G_N)$. You might wonder whatever possessed Einstein to put in a term like that. After all, suggesting that spacetime has a constant energy density just for existing means that the energy of a hunk of space is proportional to how much space you bite off. If you imagined doubling the volume of space, then the total energy from the cosmological constant would be doubled. The truth is that Einstein put it into his equations by hand because he could not see how to build the kind of universe that appealed to his sense of physical taste without it, given experimental data known at the time. He later backtracked and called Λ his "greatest blunder", but history decided otherwise. Even Einstein was scientifically and intellectually human; sometimes he got important physics wrong (like quantum uncertainty).

Note that having a cosmological constant does not violate some sacrosanct principle ofenergy conservation, because there is generically no such principle in a curved spacetime.Remember, when we discussed Killing vectors, we found that the general relativistic versionof Noether’s Theorem only gave us a conserved quantity whenever we had a Killing vector,which required a symmetry of spacetime. Generic spacetimes do not have timelike Killingvectors, so there is no generic requirement that energy must be conserved.

It may amuse you to know that it was not actually known with believable precisionuntil the very late 1990s that there actually is a cosmological constant, or something likeit known as dark energy. This was obtained by a combination of methods, using mostlycosmic microwave background radiation (CMBR) left over from the Big Bang and usingType 1a supernovae as standard candles. Arno Penzias and Robert Wilson won half ofthe 1978 Physics Nobel for discovering the CMBR. John Mather and George Smoot won in2006 for the COBE satellite measurements of cosmological parameters via the CMBR. SaulPerlmutter, Brian Schmidt and Adam Riess won in 2011 for the supernovae measurementsof cosmological parameters.


9.3 Energy-momentum tensors

Next week we will derive Einstein’s equations from an action principle. As part of that story, we will show that for a matter field coupled to the metric, the energy-momentum tensor is

$$T_{\mu\nu} \equiv \frac{2}{\sqrt{-g}} \frac{\delta S_{\rm matter}}{\delta g^{\mu\nu}} , \tag{9.47}$$

where $(-g) \equiv -\det(g_{\alpha\beta})$ is minus the determinant of the metric. Since $\sqrt{-g}$ often appears in actions, it is worth stating the identity for its variation,

$$\frac{\delta\sqrt{-g}}{\delta g^{\alpha\beta}} = -\frac{1}{2}\, g_{\alpha\beta}\sqrt{-g} . \tag{9.48}$$

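Let us take some common forms of matter and see how their $T_{\alpha\beta}$ can be derived.

For the relativistic point particle, we start from the Einbein action

$$S_{\rm particle}[z^\mu] = \int d\lambda \left[ \frac{1}{2}\, e^{-1}(\lambda)\, g_{\mu\nu}(z)\, \frac{dz^\mu}{d\lambda}\frac{dz^\nu}{d\lambda} + \frac{1}{2}\, e(\lambda)\, m^2 \right] . \tag{9.49}$$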

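Therefore, we have

$$T^{\rm particle}_{\mu\nu}(x) = \int d\lambda\; e^{-1}(\lambda)\, \frac{dz_\mu(\lambda)}{d\lambda}\frac{dz_\nu(\lambda)}{d\lambda}\, \frac{\delta^D(x - z(\lambda))}{\sqrt{-g(x)}} . \tag{9.50}$$

Since this is a bona fide tensor equation, indices can be raised or lowered on both sides at will. When the particle is timelike, we can use $e^{-1} = m$; then $\lambda$ is the proper time $\tau$.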

Remember the fact that the energy-momentum tensor – whatever it is – must be covariantly conserved by virtue of the Einstein equations, because of a Bianchi identity? Well, this principle of covariant conservation of energy-momentum can actually be used to derive the geodesic equation for the point particle providing the source of energy-momentum!

Consider a relativistic point particle of mass $m > 0$ in curved spacetime. Our first step in the algebra is to show that for an arbitrary rank (2,0) tensor $T$

$$\nabla_\mu T^{\mu\nu} = \partial_\mu T^{\mu\nu} + \Gamma^\mu{}_{\sigma\mu} T^{\sigma\nu} + \Gamma^\nu{}_{\sigma\mu} T^{\mu\sigma} = \frac{1}{\sqrt{-g}}\, \partial_\mu\!\left(\sqrt{-g}\, T^{\mu\nu}\right) + \Gamma^\nu{}_{\sigma\mu} T^{\mu\sigma} . \tag{9.51}$$

The next step is to use the Einstein equations, the Bianchi identity, and metric compatibility of the connection to show that any energy-momentum tensor coupled to gravity must be covariantly conserved,

$$\nabla_\mu T^{\mu\nu} = 0 . \tag{9.52}$$

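Then, by using the chain rule for differentiation

$$\dot z^\mu(\lambda)\, \frac{\partial}{\partial z^\mu(\lambda)}\, \delta^D(x - z(\lambda)) = \frac{d}{d\lambda}\, \delta^D(x - z(\lambda)) ,$$

and recruiting integration by parts, we can see that covariant conservation of energy-momentum requires

$$\nabla_\mu T^{\mu\nu}_{\rm particle} = \int d\lambda \left[ \frac{d^2 z^\nu(\lambda)}{d\lambda^2} + \Gamma^\nu{}_{\alpha\beta}(z(\lambda))\, \frac{dz^\alpha(\lambda)}{d\lambda}\frac{dz^\beta(\lambda)}{d\lambda} \right] \delta^D(x - z(\lambda)) = 0 . \tag{9.53}$$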
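The bracket in (9.53) is the geodesic equation, $\ddot z^\nu + \Gamma^\nu{}_{\alpha\beta}\dot z^\alpha \dot z^\beta = 0$. To get a feel for it, here is a small sketch that integrates it numerically. As a stand-in for a curved spacetime it uses the unit 2-sphere with coordinates $(\theta, \phi)$, whose nonzero Christoffels are $\Gamma^\theta{}_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi{}_{\theta\phi} = \cot\theta$; the choice of manifold, the RK4 integrator, and all function names are assumptions made for illustration only.

```python
import math

def geodesic_rhs(state):
    """First-order form of z'' = -Gamma z' z' on the unit 2-sphere."""
    theta, phi, dtheta, dphi = state
    # -Gamma^theta_{phi phi} dphi^2
    ddtheta = math.sin(theta) * math.cos(theta) * dphi**2
    # -2 Gamma^phi_{theta phi} dtheta dphi
    ddphi = -2.0 * math.cos(theta) / math.sin(theta) * dtheta * dphi
    return (dtheta, dphi, ddtheta, ddphi)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h in the affine parameter."""
    def shifted(s, k, factor):
        return tuple(si + factor * ki for si, ki in zip(s, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate_geodesic(state, h=1e-3, steps=2000):
    """Advance the geodesic by steps * h in affine parameter."""
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

def speed_squared(state):
    """g_{ab} z'^a z'^b, which is constant along an affinely parametrized geodesic."""
    theta, _, dtheta, dphi = state
    return dtheta**2 + math.sin(theta)**2 * dphi**2

start = (math.pi / 3, 0.0, 0.2, 1.0)
end = integrate_geodesic(start)
print(speed_squared(start), speed_squared(end))
```

Constancy of $g_{\alpha\beta}\dot z^\alpha \dot z^\beta$ along the curve is a useful diagnostic: it holds for any affinely parametrized geodesic, so drift in that quantity signals integration error.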


Similarly, by starting with the Maxwell action as a functional of $A_\mu(x^\lambda)$,

$$S[A] = -\frac{1}{4} \int d^Dx\, \sqrt{-g}\, F^{\mu\nu}F_{\mu\nu} , \tag{9.54}$$

and expanding it as

$$S[A] = -\frac{1}{4} \int d^Dx\, \sqrt{-g}\, g^{\alpha\mu} g^{\beta\nu} F_{\alpha\beta}F_{\mu\nu} , \tag{9.55}$$

we obtain

$$T^A_{\mu\nu} = -F_{\mu\alpha}F_{\nu}{}^{\alpha} + \frac{1}{4}\, g_{\mu\nu} F^2 . \tag{9.56}$$

Check to make sure that you understand how this works; the algebra is only a few lines. The general relativistic covariance of the Maxwell story relies on the property that

$$F_{\mu\nu} = \nabla_\mu A_\nu - \nabla_\nu A_\mu , \tag{9.57}$$

which you should also check using the definition of covariant derivatives in terms of partial derivatives and contractions with Christoffels.
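For readers who want to compare against their own working, the few lines of algebra can be sketched as follows. The only inputs are the definition (9.47), the identity (9.48), and the assumption (used throughout these notes) that the connection is torsion-free so that the Christoffels are symmetric in their lower indices.

```latex
% Vary (9.55) with respect to the upstairs metric, holding A_\mu fixed:
\delta S[A] = -\frac{1}{4}\int d^Dx \left[ (\delta\sqrt{-g})\, F^2
   + 2\sqrt{-g}\, \delta g^{\mu\nu}\, F_{\mu\alpha} F_{\nu}{}^{\alpha} \right]
 = -\frac{1}{4}\int d^Dx\, \sqrt{-g} \left[ -\frac{1}{2}\, g_{\mu\nu} F^2
   + 2 F_{\mu\alpha} F_{\nu}{}^{\alpha} \right] \delta g^{\mu\nu} ,
% so that, using (9.47),
T^A_{\mu\nu} = \frac{2}{\sqrt{-g}} \frac{\delta S[A]}{\delta g^{\mu\nu}}
 = -F_{\mu\alpha} F_{\nu}{}^{\alpha} + \frac{1}{4}\, g_{\mu\nu} F^2 .
% For (9.57), the Christoffel terms cancel by symmetry:
\nabla_\mu A_\nu - \nabla_\nu A_\mu
 = \partial_\mu A_\nu - \partial_\nu A_\mu
   - \left( \Gamma^\lambda{}_{\mu\nu} - \Gamma^\lambda{}_{\nu\mu} \right) A_\lambda
 = \partial_\mu A_\nu - \partial_\nu A_\mu .
```

The factor of 2 in the first line arises because the two inverse metrics in (9.55) contribute equally once the antisymmetry of $F$ is used.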

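We can also work out $T_{\mu\nu}$ for a scalar field. Using the minimal coupling ansatz, we write the relativistic action by analogy with the Newtonian particle,

$$S[\phi] = \int d^Dx\, \sqrt{-g} \left( \frac{1}{2} \nabla_\mu\phi \nabla^\mu\phi - V(\phi) \right) . \tag{9.58}$$

To proceed, we need to identify where the upstairs metric tensor components arise. This is easily seen by putting the derivatives downstairs,

$$S[\phi] = \int d^Dx\, \sqrt{-g} \left( \frac{1}{2}\, g^{\mu\nu} \nabla_\mu\phi \nabla_\nu\phi - V(\phi) \right) . \tag{9.59}$$

We can see that the upstairs metric appears in the $\sqrt{-g}$ term as well as the term involving derivatives of the dynamical field variable $\phi(x)$. So we need to take

$$T^\phi_{\mu\nu} = \frac{2}{\sqrt{-g}} \left\{ -\frac{1}{2} \sqrt{-g}\, g_{\mu\nu} \left[ \frac{1}{2} (\nabla\phi)^2 - V(\phi) \right] + \sqrt{-g} \left[ \frac{1}{2} \nabla_\mu\phi \nabla_\nu\phi \right] \right\} \tag{9.60}$$

$$= \nabla_\mu\phi \nabla_\nu\phi - g_{\mu\nu} \left[ \frac{1}{2} (\nabla\phi)^2 - V(\phi) \right] . \tag{9.61}$$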

Another important idea occurring in modern cosmology is inflation, a mechanism that led to an astonishing exponentially fast growth spurt when the universe was in its infancy. It is postulated (and by now there is pretty good evidence for it) as a mechanism of solving a number of conundrums, for example the fact that regions on the sky which are widely separated now show correlations in the CMBR, so must have been in causal contact at some point in the past. Another is the scarcity of magnetic monopoles and other exotic objects that may have been made in the big bang. Inflation requires an accelerating scale factor

$$\ddot a > 0 . \tag{9.62}$$


From the Friedmann-Lemaître equations, this requires $\rho + 3p/c^2 < 0$, i.e.,

$$w < -\frac{1}{3} . \tag{9.63}$$

What kind of field might be able to drive inflation? A first guess might be a scalar field $\phi$ known as the inflaton. Comparing the energy-momentum tensor to the perfect fluid ansatz in an inertial Cartesian coordinate system (in which $g_{\alpha\beta} = \eta_{\alpha\beta}$ and the fluid is at rest) gives

$$c^2\rho(\phi) = \frac{1}{2}\dot\phi^2 + V(\phi) + \frac{1}{2}|\vec\nabla\phi|^2 , \qquad p(\phi) = \frac{1}{2}\dot\phi^2 - V(\phi) - \frac{1}{6}|\vec\nabla\phi|^2 . \tag{9.64}$$
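The coefficients in (9.64), in particular the $-1/6$, can be traced as follows. This sketch assumes a mostly-minus signature $\eta = \mathrm{diag}(+,-,-,-)$ in $D = 4$, writes the dot for the derivative along $x^0$, and takes the pressure to be the average of the three diagonal spatial stresses; these conventions are assumptions chosen to match the signs in (9.61).

```latex
% Invariant square of the gradient in flat spacetime:
(\nabla\phi)^2 = \dot\phi^2 - |\vec\nabla\phi|^2 .
% Energy density from the time-time component of (9.61), with \eta_{00} = +1:
c^2\rho = T_{00} = \dot\phi^2 - \left[ \tfrac{1}{2}\left(\dot\phi^2 - |\vec\nabla\phi|^2\right) - V \right]
        = \tfrac{1}{2}\dot\phi^2 + \tfrac{1}{2}|\vec\nabla\phi|^2 + V .
% Spatial components, with \eta_{ij} = -\delta_{ij}:
T_{ij} = \partial_i\phi\, \partial_j\phi + \delta_{ij}\left[ \tfrac{1}{2}\left(\dot\phi^2 - |\vec\nabla\phi|^2\right) - V \right] .
% Isotropic pressure as one third of the spatial trace:
p = \tfrac{1}{3}\sum_i T_{ii}
  = \tfrac{1}{3}|\vec\nabla\phi|^2 + \tfrac{1}{2}\dot\phi^2 - \tfrac{1}{2}|\vec\nabla\phi|^2 - V
  = \tfrac{1}{2}\dot\phi^2 - V - \tfrac{1}{6}|\vec\nabla\phi|^2 .
```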

Let us assume that spatial gradients are negligible compared to temporal ones and to the potential, $|\vec\nabla\phi|^2 \ll \dot\phi^2$ and $V(\phi)$; then we have

$$c^2\rho(\phi) = \frac{1}{2}\dot\phi^2 + V(\phi) , \qquad p(\phi) = \frac{1}{2}\dot\phi^2 - V(\phi) . \tag{9.65}$$

The continuity equation then becomes

$$\ddot\phi + 3\dot\phi\, \frac{\dot a}{a} + \frac{dV}{d\phi} = 0 . \tag{9.66}$$

The second term in this equation is known as the Hubble friction term. Using the constraint Friedmann equation, and assuming that $\rho$ is large enough that the curvature term can be neglected, gives the Hubble parameter,

$$H^2 = \frac{8\pi G_N}{3c^2} \left( \frac{1}{2}\dot\phi^2 + V(\phi) \right) . \tag{9.67}$$

Inflation can occur as long as $p/c^2 < -\rho/3$. Translated into $\phi$ language, this is the condition

$$\dot\phi^2 < V(\phi) . \tag{9.68}$$

The slow roll approximation is the one in which $\dot\phi^2 \ll V(\phi)$. This implies that $\ddot\phi \ll dV/d\phi$, and so the continuity equation becomes

$$3H\dot\phi = -\frac{dV}{d\phi} \equiv -V' , \tag{9.69}$$

and the Friedmann equation simplifies to

$$H^2 = \frac{8\pi G_N}{3c^2}\, V(\phi) . \tag{9.70}$$

When the potential is sufficiently flat for slow roll,

$$\left(\frac{V'}{V}\right)^2 \ll 1 , \qquad \frac{V''}{V} \ll 1 , \tag{9.71}$$


the approximate solution for the scale factor is indeed exponential,

$$a(t) = a_0 \exp\left( \sqrt{\frac{8\pi G_N}{3c^2}\, V(\phi)}\; t \right) . \tag{9.72}$$

To solve our horizon and flatness problems, about 60-70 e-foldings are needed.

The approximation that $\phi$ couples only to gravity is unrealistic. In fact, it can be expected to interact with other fields like matter. Typically, at the end of inflation, the inflaton will execute damped oscillations about the minimum of its potential and it decays into other fields in a process called reheating. Later on, at about $10^{-11}$ s after the big bang, baryogenesis occurs, allowing matter to outnumber antimatter. At about $10^{-10}$ s, electroweak symmetry breaks in a phase transition, and colour confines at about $10^{-4}$ s. Finally, big bang nucleosynthesis, creating lots of hydrogen, a little helium, and a tiny bit of lithium, occurs at about 100 s. The first three minutes of the evolution of our cosmos were busy! It was not until about 70,000 years after the big bang that matter began to dominate the gravitational evolution of the universe over radiation, which dilutes faster as the universe grows. Recombination at 370,000 years was when the universe was finally cool enough for electrically neutral atoms to form, and it was then that photons could fly unimpeded through to our detectors. In practice, we cannot “see” any further back in time than that using the CMBR.

Our last topic for today is to discuss energy conditions so that we can prove for you when gravity is attractive. For this discussion, we need the Raychaudhuri equation for the evolution of a congruence of timelike geodesics, discussed in an Appendix. The result we need here is the one for the directional covariant derivative of the expansion of the congruence,

$$\frac{D\theta}{D\lambda} = +\omega_{\mu\nu}\omega^{\mu\nu} - \sigma_{\mu\nu}\sigma^{\mu\nu} - \frac{1}{3}\theta^2 + R_{\mu\nu}U^\mu U^\nu . \tag{9.73}$$

The rotation part involving $\omega_{\mu\nu}$ vanishes if the 4-velocity vector field is hypersurface-orthogonal. This technical assumption can be checked straightforwardly if tediously. The shear part involving $\sigma_{\mu\nu}$, namely $\sigma_{\mu\nu}\sigma^{\mu\nu}$, is positive semidefinite.

A handy identity is something you proved in a homework assignment,

$$R_{\mu\nu} = -8\pi G_N \left[ T_{\mu\nu} - \frac{1}{(D-2)}\, T g_{\mu\nu} \right] , \tag{9.74}$$

where we have included the cosmological constant term in what we mean by the stress-energy tensor here. In locally inertial coordinates $x^\mu$, the 4-velocity points entirely along the time direction, and

$$R_{\mu\nu}U^\mu U^\nu = R_{tt} \tag{9.75}$$

In the same coordinates, $g_{\mu\nu} = \eta_{\mu\nu}$ and $T = c^2(\rho - \sum_i p_i/c^2)$, which gives with a small amount of algebra

$$\frac{d\theta}{d\lambda} = -4\pi G_N \left( \rho + \sum_i \frac{p_i}{c^2} \right) . \tag{9.76}$$

To know whether or not gravity is attractive, we need to know how the energy density and pressures behave. There are several commonly discussed energy conditions.


WEC: The Weak Energy Condition requires $T_{\mu\nu}t^\mu t^\nu \geq 0$ for all timelike vectors $t^\mu$. In perfect fluid language,

$$\rho \geq 0 , \qquad \rho + \frac{p}{c^2} \geq 0 . \tag{9.77}$$

NEC: The Null Energy Condition requires $T_{\mu\nu}\ell^\mu \ell^\nu \geq 0$ for all null $\ell^\mu$, or

$$\rho + \frac{p}{c^2} \geq 0 . \tag{9.78}$$

The energy density may now be negative, if it is compensated by a positive pressure.

DEC: The Dominant Energy Condition requires the WEC and also that $T^{\mu\nu}t_\nu$ be nonspacelike,

$$\rho \geq \frac{|p|}{c^2} . \tag{9.79}$$

NDEC: The Null Dominant Energy Condition is the DEC for null vectors only. The NDEC excludes all sources excluded by the DEC, except for a negative vacuum energy.

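SEC: The Strong Energy Condition requires $T_{\mu\nu}t^\mu t^\nu \geq \frac{1}{2} T^\lambda{}_\lambda\, t^\sigma t_\sigma$ (in 4D) or

$$\rho + \frac{p}{c^2} \geq 0 , \qquad \rho + 3\frac{p}{c^2} \geq 0 . \tag{9.80}$$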
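The perfect-fluid forms (9.77) to (9.80) are simple inequalities, so they are easy to tabulate for the equations of state met earlier. The following is a minimal sketch; the function name and the choice of units with $c = 1$ by default are assumptions for illustration, and the NDEC is omitted because no perfect-fluid inequality is quoted for it above.

```python
def energy_conditions(rho, p, c=1.0):
    """Evaluate the perfect-fluid forms (9.77)-(9.80) of the energy conditions."""
    q = p / c**2  # pressure expressed in density units
    return {
        "WEC": rho >= 0 and rho + q >= 0,
        "NEC": rho + q >= 0,
        "DEC": rho >= abs(q),
        "SEC": rho + q >= 0 and rho + 3 * q >= 0,
    }

# Dust (w = 0), radiation (w = 1/3), and a positive cosmological constant (w = -1).
for name, w in [("dust", 0.0), ("radiation", 1.0 / 3.0), ("cosmological constant", -1.0)]:
    print(name, energy_conditions(rho=1.0, p=w * 1.0))
```

Running this shows that dust and radiation satisfy all four conditions, while a positive cosmological constant satisfies the WEC, NEC and DEC but fails the SEC.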

If we can require the SEC, then gravity must be attractive. Let us prove this using the Einstein equations. We have

$$R_{\mu\nu}U^\mu U^\nu = -8\pi G_N \left( T_{\mu\nu} - \frac{1}{2}\, T^\lambda{}_\lambda\, g_{\mu\nu} \right) U^\mu U^\nu , \tag{9.81}$$

$$\leq 0 \quad \forall \text{ timelike } U^\mu . \tag{9.82}$$

Therefore, $R_{\mu\nu}U^\mu U^\nu \leq 0$ if we have the SEC. The 4-velocity should be perpendicular to a family of hypersurfaces, so that the rotation of the congruence of geodesics is zero. Then the shear term $-\sigma_{\mu\nu}\sigma^{\mu\nu}$ is negative semidefinite and

$$\frac{d\theta}{d\lambda} \leq -\frac{1}{3}\theta^2 . \tag{9.83}$$

We can integrate this to get

$$\theta^{-1}(\lambda) \geq \theta_0^{-1} + \frac{1}{3}\lambda . \tag{9.84}$$

This implies that the geodesics converge, to what is called a caustic, in finite affine time. So as long as we have the SEC, geodesics focus and gravity is attractive. Note that the cosmological constant disobeys the SEC.
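The integration step and the finite-time statement can be spelled out as below. The sketch assumes an initially converging congruence, $\theta_0 < 0$, which is the case in which the bound forces a caustic.

```latex
% Rewrite (9.83) as a bound on the derivative of 1/\theta:
\frac{d}{d\lambda}\left(\theta^{-1}\right) = -\theta^{-2}\, \frac{d\theta}{d\lambda} \geq \frac{1}{3}
\quad\Longrightarrow\quad
\theta^{-1}(\lambda) \geq \theta_0^{-1} + \frac{\lambda}{3} .
% If \theta_0 < 0, the right-hand side climbs to zero at
\lambda_* = \frac{3}{|\theta_0|} ,
% so \theta^{-1} must reach zero from below, i.e. \theta \to -\infty,
% at some affine parameter \lambda \leq \lambda_* .
```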


10 Deriving Einstein’s equations

10.1 Covariant integration over spacetime and classical fields

In the section on special relativity, we briefly reviewed pertinent facts about classical mechanics. When we graduate to relativistic classical field theory in flat spacetime, our dynamical fields are no longer coordinates as a function of time – they become continuous field variables depending on all $D$ spacetime coordinates

$$q^a(t) \to \phi^a(x^\mu) , \tag{10.1}$$

and we replace time derivatives by partial derivatives,

$$\frac{d}{dt} \to \partial_\mu . \tag{10.2}$$

In the above, $a$ denotes an arbitrary tensor index structure that we will need to specify separately for each different classical field that we study (e.g. Higgs vs photon). The $\phi^a(x^\mu)$ are fields, and in the theory that we are building they must be tensors (scalars, spinors, vectors, etc.). Varying the action

$$S[\phi^a] = \int d^Dx\, \mathcal{L} \tag{10.3}$$

gives the Euler-Lagrange equations,

$$\frac{\partial\mathcal{L}}{\partial\phi^a} - \partial_\mu \left( \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi^a)} \right) = 0 . \tag{10.4}$$

These equations are correctly written in terms of partial derivatives. In order to uncover the covariant versions of these equations, we must spend a bit of time worrying about how to define a covariant measure for integration in curved spacetime.
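As a concrete instance of (10.4), consider a single real scalar field in flat spacetime, with indices raised and lowered by $\eta_{\mu\nu}$. The Lagrangian density below is the flat-space limit of the scalar action (9.58); picking this particular example is our choice for illustration.

```latex
\mathcal{L} = \frac{1}{2}\, \partial_\mu\phi\, \partial^\mu\phi - V(\phi) ,
\qquad
\frac{\partial\mathcal{L}}{\partial\phi} = -V'(\phi) ,
\qquad
\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)} = \partial^\mu\phi ,
% so the Euler-Lagrange equation (10.4) reads
-V'(\phi) - \partial_\mu \partial^\mu \phi = 0
\quad\Longleftrightarrow\quad
\partial^2\phi + V'(\phi) = 0 .
```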

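The epsilon tensor density can be defined for orientable manifolds. $E$ is the permutation symbol that we had already defined for flat Minkowski space. Can we make a covariant version? The answer is yes. To see how, consider the formula for the determinant of some $D \times D$ matrix with components $M^\mu{}_\nu$,

$$E_{01\ldots d}\, (\det M) = M^{\mu_0}{}_0\, M^{\mu_1}{}_1 \cdots M^{\mu_d}{}_d\, E_{\mu_0\mu_1\ldots\mu_d} . \tag{10.5}$$

Choosing the matrix to be

$$M^{\mu'}{}_{\nu} = \left( \frac{\partial x^{\mu'}}{\partial x^{\nu}} \right) , \tag{10.6}$$

we obtain

$$E_{\mu'_0\ldots\mu'_d} = \left| \frac{\partial x'}{\partial x} \right| E_{\nu_0\ldots\nu_d}\, \Lambda^{\nu_0}{}_{\mu'_0} \cdots \Lambda^{\nu_d}{}_{\mu'_d} \tag{10.7}$$

Taking the determinant of the metric tensor component transformation equation gives

$$\det(g_{\mu\nu}(x')) = \left| \frac{\partial x'}{\partial x} \right|^{-2} \det(g_{\mu\nu}(x)) . \tag{10.8}$$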


Renaming

$$\sqrt{-\det(g_{\mu\nu}(x))} \equiv \sqrt{-g} \tag{10.9}$$

we find that

$$\varepsilon_{\mu\ldots\nu} \equiv \sqrt{-g}\, E_{\mu\ldots\nu} \tag{10.10}$$

transforms like a tensor. Similarly,

$$\varepsilon^{\mu\ldots\nu} \equiv \frac{1}{\sqrt{-g}}\, E^{\mu\ldots\nu} \tag{10.11}$$

This $\varepsilon$ tensor is also known as the Levi-Civita tensor.

So how do we write covariant integrals in spacetime? Notice that the naive volume element $d^Dx = dx^0 dx^1 \ldots dx^d$ is not tensorial. We need a factor of $\sqrt{-g}$ to make an integral covariant. So an integral of a scalar field $\phi(x)$ over all spacetime would look like

$$\int d^Dx\, \sqrt{-g}\; \phi(x) . \tag{10.12}$$

For a more detailed set of steps on defining the invariant integration measure as above, see Carroll §2.10.
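To see the measure factor at work in the simplest possible setting, here is a small numerical sketch. It uses a Euclidean-signature example, the unit 2-sphere with metric $ds^2 = d\theta^2 + \sin^2\theta\, d\phi^2$, so the role of $\sqrt{-g}$ is played by $\sqrt{|g|} = \sin\theta$; the example and the helper name `covariant_integral` are assumptions for illustration.

```python
import math

def covariant_integral(f, sqrt_g, theta_range, phi_range, n=400):
    """Midpoint-rule estimate of the integral of f * sqrt(|g|) over a 2D coordinate patch."""
    (t0, t1), (p0, p1) = theta_range, phi_range
    dt, dp = (t1 - t0) / n, (p1 - p0) / n
    total = 0.0
    for i in range(n):
        theta = t0 + (i + 0.5) * dt
        for j in range(n):
            phi = p0 + (j + 0.5) * dp
            total += f(theta, phi) * sqrt_g(theta, phi) * dt * dp
    return total

# Integrating the scalar f = 1 with the measure sqrt(|g|) = sin(theta) gives the area.
area = covariant_integral(
    lambda theta, phi: 1.0,
    lambda theta, phi: math.sin(theta),
    (0.0, math.pi),
    (0.0, 2.0 * math.pi),
)
print(area, 4.0 * math.pi)
```

Dropping the `sqrt_g` factor would instead return the coordinate area $2\pi^2$ of the $(\theta, \phi)$ rectangle, which depends on the coordinates chosen and has no geometric meaning.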

Stokes’s Theorem is quite pretty in curved spacetime. It makes use of a pretty identity for the partial trace of the Christoffel connection coefficients,

$$\Gamma^\lambda{}_{\lambda\nu} = \frac{1}{\sqrt{-g}}\, \partial_\nu\!\left(\sqrt{-g}\right) . \tag{10.13}$$

This expression is useful because it helps us write down the answer for the covariant divergence of a vector,

$$\nabla_\mu V^\mu = \partial_\mu V^\mu + \frac{1}{\sqrt{-g}} \left(\partial_\mu \sqrt{-g}\right) V^\mu \tag{10.14}$$

$$= \frac{1}{\sqrt{-g}}\, \partial_\mu\!\left(\sqrt{-g}\, V^\mu\right) , \tag{10.15}$$

and so

$$\int_\Sigma (\nabla_\mu V^\mu) \left(\sqrt{-g}\, d^Dx\right) = \int_{\partial\Sigma} (n_\mu V^\mu) \left(\sqrt{-\gamma}\, d^dx\right) , \tag{10.16}$$

where $n_\mu$ is the unit normal to the boundary $\partial\Sigma$ and $\gamma$ is the determinant of the induced metric on $\partial\Sigma$. Note that $n_\mu$ should be chosen inward-pointing if the boundary is timelike, and it should be chosen outward-pointing if it is spacelike.

The story of how hypersurfaces work in GR is a lot more complicated than in flat Euclidean space. For some of the details you can consult Appendices of Carroll and Appendices of Wald. For example, Carroll Appendix C and Appendix D discuss submanifolds, Frobenius’s Theorem, Gaussian normal coordinates, induced metric, extrinsic and intrinsic curvature, first and second fundamental forms, and the Gauss-Codazzi equations.


We noted above that in curved spacetime the combination $d^Dx\, \sqrt{-g}$ is invariant under coordinate transformations. If the curved spacetime action can be written as

$$S = \int d^Dx\, \mathcal{L} = \int d^Dx\, \sqrt{-g}\, L \tag{10.17}$$

where $L$ is a scalar, then the Euler-Lagrange equations can be recast in the more covariant-looking form

$$\frac{\partial L}{\partial\phi^a} - \nabla_\mu \left( \frac{\partial L}{\partial(\nabla_\mu\phi^a)} \right) = 0 . \tag{10.18}$$

10.2 Relativistic scalar gravity

The stress-energy tensor or energy-momentum tensor $T^{\mu\nu}$ is defined in a very physical way as the flux of $\mu$ momentum in the $\nu$ direction. For example, $T^{00}$ is the flux of energy in the time direction, otherwise known as the energy density, and $T^{0i}$ is the momentum density, while the $T^{ij}$ are the internal stresses. From the Einbein action, which can handle either massive or massless particles in flat spacetime, we can find this flux,

$$T^{\mu\nu}_{\rm particle}(x) = \int d\lambda\; e^{-1}(\lambda)\, \dot z^\mu(\lambda)\, \dot z^\nu(\lambda)\, \frac{\delta^D(x - z(\lambda))}{\sqrt{-g(x)}} . \tag{10.19}$$

The $D$-dimensional delta function is necessary because the energy-momentum tensor for a point particle is zero unless you are actually on the trajectory. (Remember that for a massive particle in proper time gauge, $e^{-1}(\lambda) = m$.) From the above formula we can see that for a massless particle, which satisfies $\dot z^2(\lambda) = 0$, the trace of the energy-momentum tensor is zero. This will turn out to be important physically a bit later in the story.

Let us now turn to begin building a relativistic version of Newtonian gravity. The first thing we need to do is dimensional analysis on our coupling constant. The Newton constant $G_N$ appears in the force law, which takes Coulomb form in $D$ dimensions,

$$\vec F_{\rm grav} = -\frac{G_N M m}{r^{D-2}}\, \hat r . \tag{10.20}$$

Now

$$[F] = [M][L][T]^{-2} , \tag{10.21}$$

so that

$$[G_N] = [L]^{D-2}[M]^{-2}[M][L][T]^{-2} = [L]^{D-1}[M]^{-1}[T]^{-2} . \tag{10.22}$$

We can get rid of the time dimensions by dividing by $c^2$, giving

$$\left[ \frac{G_N}{c^2} \right] = [L]^{D-3}[M]^{-1} . \tag{10.23}$$

Now notice that

$$\left[ \frac{\hbar}{c} \right] = [M L^2 T^{-1} L^{-1} T] = [M][L] , \tag{10.24}$$


so that we can always convert a mass to an inverse length via

$$L = \frac{1}{M}\, \frac{\hbar}{c} . \tag{10.25}$$

Then we have that

$$\left[ \frac{\hbar G_N}{c^3} \right] = [L]^{D-2} , \tag{10.26}$$

giving rise to the definition of the Planck length $\ell_P$

$$\ell_P = \left( \frac{\hbar G_N}{c^3} \right)^{1/(D-2)} . \tag{10.27}$$

At length scales this short, gravitation actually becomes seriously quantum mechanical, and we can no longer trust the classical picture of spacetime according to Einstein’s GR. Addressing what happens here is the domain of quantum gravity. In 4D, this length scale corresponds in SI units to approximately $10^{-35}$ m. Using our mass-length conversion above, we can convert this to an energy scale of about $10^{19}$ GeV.
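The 4D numbers quoted above are quick to reproduce. This sketch evaluates (10.27) for $D = 4$ and then applies the mass-length conversion (10.25) in the form $E = \hbar c/\ell_P$; the SI values of the constants are standard reference values typed in by hand, and the function names are illustrative.

```python
import math

HBAR = 1.054571817e-34           # reduced Planck constant, J s
G_N = 6.67430e-11                # Newton constant in 4D, m^3 kg^-1 s^-2
C = 2.99792458e8                 # speed of light, m / s
JOULES_PER_GEV = 1.602176634e-10

def planck_length_4d():
    """Planck length from (10.27) with D = 4, in metres."""
    return math.sqrt(HBAR * G_N / C**3)

def planck_energy_gev():
    """Energy scale hbar * c / l_P associated with the Planck length, in GeV."""
    return HBAR * C / planck_length_4d() / JOULES_PER_GEV

print(planck_length_4d())   # of order 1e-35 m
print(planck_energy_gev())  # of order 1e19 GeV
```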

Newtonian physics gave us the gravitational force law arising from the Newtonian potential $\Phi(\vec r)$,

$$m\vec a = -m\vec\nabla\Phi , \tag{10.28}$$

where

$$\vec\nabla^2\Phi = \beta_D\, \frac{G_N}{c^2}\, \rho , \tag{10.29}$$

where the constant is

$$\beta_D = 8\pi\, \frac{(D-3)}{(D-2)} \tag{10.30}$$

For a point particle the mass density $\rho$ is

$$\rho = m\, \delta^{D-1}(\vec x) , \tag{10.31}$$

and this gives the Newtonian potential solution

$$\Phi(\vec x) = -\frac{8\pi G_N}{(D-2)\, \omega_{D-2}}\, \frac{m}{|\vec x|^{D-3}} \tag{10.32}$$

where $\omega_{D-2}$ is the area of the $(D-2)$-sphere. Notice how it is only in 4D that this simplifies to $-G_N m/|\vec x|$.
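The claim about 4D can be checked by evaluating the numerical prefactor in (10.32) for several dimensions. The sketch below uses the standard formula for the area of the unit $n$-sphere, $\omega_n = 2\pi^{(n+1)/2}/\Gamma((n+1)/2)$; the function names are illustrative.

```python
import math

def unit_sphere_area(n):
    """Area of the unit n-sphere S^n (the boundary of the unit ball in n+1 dimensions)."""
    return 2.0 * math.pi ** ((n + 1) / 2) / math.gamma((n + 1) / 2)

def potential_coefficient(D):
    """k_D such that Phi = -k_D * G_N * m / |x|^(D-3), read off from (10.32)."""
    return 8.0 * math.pi / ((D - 2) * unit_sphere_area(D - 2))

for D in range(4, 8):
    print(D, potential_coefficient(D))
```

Only $D = 4$ returns a coefficient of exactly one, because $\omega_2 = 4\pi$ cancels the $8\pi/(D-2)$.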

Now suppose that we want to upgrade this to a relativistic scalar theory of gravity, which for lack of typing energy I will refer to as RSG. What would we write down for the action principle? We would need to have a kinetic energy for the scalar field $\Psi$, as well as an interaction term between the source of energy-momentum and $\Psi$. The only scalar source we generically have lying around in a theory is the trace of the energy-momentum tensor, $T^\mu{}_\mu$, so we write

$$S_{\rm interaction} \propto \int d^Dx\; \Psi\, T^\mu{}_\mu . \tag{10.33}$$


Let us exhibit the couplings for the case of simple relativistic point particles – which we will permit to be either massive or massless. We will use the Einbein Lagrangian for the point particle kinetic action. For the scalar field, we want something as simple as possible but no simpler, so we choose a kinetic term consisting of a relativistic square of the derivative of the field,

$$S_{\rm kin}[\Psi] = \frac{1}{2} \int d^Dx\, (\nabla\Psi)^2 . \tag{10.34}$$

The coefficient is chosen carefully to be canonically normalized, and this requires an interaction term of the form

$$S_{\rm interaction} = \frac{\sqrt{\beta_D G_N}}{c^2} \int \Psi\, T^\mu{}_\mu \tag{10.35}$$

in order to reproduce the Newtonian limit.

Now consider the equation of motion for the relativistic point particle coupled to $\Psi$. Using the Einbein action

$$S_{\rm Einbein} = \int d\lambda\; \frac{1}{2} \left[ e^{-1}(\lambda)\, \dot z^2(\lambda) + e(\lambda)\, m^2 \right] , \tag{10.36}$$

we see that the point particle EOM is

$$\frac{d}{d\lambda} \left[ e^{-1}(\lambda) \left( 1 + \frac{2\sqrt{\beta_D G_N}}{c^2}\, \Psi \right) \dot z^\mu(\lambda) \right] = 0 . \tag{10.37}$$

This has an immediate physics implication – all directions of spacetime are equally affected by the presence of $\Psi$. RSG predicts zero deflection of light, and this is because massless point particles have a traceless energy-momentum tensor. Incidentally, Newtonian gravity did give rise to light deflection, if one used the ‘logic’ of the day of treating massless particles as the limit of massive particles with $m \to 0$ in the non-relativistic formulas. Unfortunately for Newton’s theory, the result is also incommensurate with the observed value. But it gets closer than RSG to the correct answer!

If you work through the details (which are tedious and inessential to our plot development), you find that relativistic scalar gravity (RSG) gets only $-1/6$ of the correct answer for the planetary perihelion precession. So it misses the magnitude by a factor of 6, and gets the sign wrong – perihelion is retarded, not advanced. This is a second good reason to junk the theory – and so it will stay in our trash can for the foreseeable future.

What else could we use to build a messenger field for gravity? Clearly, fermions would not work, because fermions obey the Pauli Exclusion Principle and hence you can never build up a big semiclassical fermion field. Spin one will not work either, because it is repulsive for like charges (which are analogous to masses). This is the reason why Einstein then worked for years on developing a spin-two tensor theory of gravity.

10.3 Building tensor gravity with the right Newtonian limit

There is a long history of confusion about the right way to couple a spin-two tensor theory of gravity to matter, which is basically everything else that isn’t gravity! Let us give a very brief overview of the main advances in thinking on this subject. First, suppose that we have


a Lorentz tensor $h_{\mu\nu}$ in Minkowski spacetime, where indices are raised and lowered using $\eta_{\mu\nu}$ etc, obeying

$$\partial^2 h_{\mu\nu} = 0 . \tag{10.38}$$

This is a much less simple theory than appears at first glance, because it does not even have positive-definite energy unless a consistency condition is imposed,

$$\partial^\mu \left( h_{\mu\nu} - \frac{1}{2}\, \eta_{\mu\nu}\, h^\sigma{}_\sigma \right) = 0 \tag{10.39}$$

which later became known as the de Donder gauge condition. A physically related problem is that there are too many helicity states to describe the graviton, which in $D$ dimensions has $D(D-3)/2$ independent degrees of freedom, not $D(D+1)/2$. The unwanted negative contribution to the energy comes precisely from some of the unwanted helicities. To completely remove all unwanted helicities actually requires a second condition as well,

$$h^\mu{}_\mu = 0 . \tag{10.40}$$

To give you a hint of how this is resolved: in full general relativity, these conditions will actually emerge very naturally from taking care of diffeomorphism invariance.

Now let us consider how to couple our prototype tensor theory of gravity to matter. If we write an interaction Lagrangian, it will need to produce a rank-two tensor on the RHS of the equation of motion for the tensor field. The only available object around to consider is – you guessed it! – the energy-momentum tensor. So we try writing the tentative equation

$$\partial^2 h^{\mu\nu} = g_1\, T^{\mu\nu}_{\rm matter} , \tag{10.41}$$

where $g_1$ is some gravitational coupling constant which we will get straight in a future lecture. For now, just focus on the properties of $T^{\mu\nu}$. An important feature of the energy-momentum tensor is that it obeys a conservation law,

$$\partial_\mu T^{\mu\nu} = 0 . \tag{10.42}$$

Suppose that our field equations can be written as

$$D^{\mu\nu}(h) = g_1\, T^{\mu\nu}_{\rm matter} . \tag{10.43}$$

Then the wave operator acting on the tensor field must also be divergenceless,

$$\partial_\mu D^{\mu\nu}(h) = 0 , \tag{10.44}$$

and this must hold off-shell, i.e., independently of the equation of motion. This is a form of Bianchi identity. This seems quite satisfying, until you realize that it only guarantees conservation of the energy-momentum tensor not counting any contributions that the gravitational field itself might make to the energy-momentum tensor.

We know in our hearts that this cannot be true: it should be the total energy, counting the matter and gravitational contributions, that should be conserved! So how should we take this gravitational contribution into account? Suppose that we add a term in the original Lagrangian to represent gravitational energy-momentum. Then that would require


a correction to the form of the energy-momentum tensor, which would correct the equations of motion, and so forth order by order. Ergo, this coupling to matter of Fierz-Pauli spin-two tensor gravity ends up requiring an infinite number of corrections.

Many famous physicists have contributed to solving this conundrum. Weinberg showed that a quantum theory of a massless spin-two particle can have a Lorentz-invariant S-matrix only if it couples to the total energy-momentum tensor. Deser showed that GR can be seen as the result of adding precisely this infinite set of corrections. GR also possesses a gauge symmetry from diffeomorphism invariance, and this is precisely enough to solve all the issues we have encountered so far.

It is important to check whether or not our proposed tensor gravity equation reduces to the Newtonian gravity equation in the Newtonian limit. We will suppress the explicit details here, as they are very similar in spirit to what we already covered about linearized Einstein gravity when talking about tidal forces and about gravitational waves.

10.4 Deriving Einstein’s equations from an action principle

How about an action principle for tensor gravity, then? The Ricci scalar $R$ is a rank (0,0) tensor, and it is a function of all the spacetime coordinates. So although $R$ is a scalar, it is not invariant under diffeomorphisms unless we integrate it covariantly over all spacetime. Fortunately, we recently learned how to do covariant integration by using the covariant measure $d^Dx\, \sqrt{-g}$. Putting these together gives the Einstein action

$$S_1 = \frac{1}{16\pi G_N} \int d^Dx\, \sqrt{-g}\; R . \tag{10.45}$$

There is another even simpler invariant that is also allowed, which is proportional to the cosmological constant $\Lambda$,

$$S_2 = -\frac{\Lambda}{8\pi G_N} \int d^Dx\, \sqrt{-g} . \tag{10.46}$$

This one does not involve any derivatives of the metric tensor, so it will not be able to provide the gravitational analogue of a kinetic energy functional. But the cosmological constant affects the evolution of the cosmos in very fundamental ways, and is certainly allowed by diffeomorphism invariance, so we should include it in our effective action for later. We write

$$S_{\rm grav} = S_1 + S_2 = \frac{1}{16\pi G_N} \int d^Dx\, \sqrt{-g}\, (R - 2\Lambda) . \tag{10.47}$$

So our candidate action has two terms, one involving the Ricci scalar and the other involving the cosmological constant,

$$S_{\rm EH} = \frac{1}{16\pi G_N} \int d^Dx\, \sqrt{-g}\, (R - 2\Lambda) . \tag{10.48}$$

If you like, you can think of the first term as a bit like a kinetic energy density and the second as a bit like a potential energy density. The above action is known as the Einstein-Hilbert action. (Notice that we did not use $\nabla_\mu\nabla_\nu g^{\mu\nu}$ in trying to build an action principle – this is identically zero for a metric-compatible connection.)


We can now use dimensional analysis to estimate how big correction terms to the Einstein-Hilbert action might be. For instance, consider a term of the form

$$S_3 = \frac{1}{\alpha_3} \int d^Dx\, \sqrt{-g}\; R^2 \tag{10.49}$$

In order to get the dimensionful coupling constant $\alpha_3$, we use the only length scale in the problem: $\ell_P$. Now, our Einstein-Hilbert kinetic term was of order $\ell_P^{2-D}$, and that was needed because of the $D$ powers of length in the integration measure and the two derivatives used in forming Riemann from the spacetime metric. Therefore, the $R^2$ coupling carries two extra powers of $\ell_P$ (a factor of $\ell_P^2$) compared to the Einstein-Hilbert coupling, making it much smaller because typical Riemann curvature length scales must be much larger than $\ell_P$ for spacetime to be classical. This all happens because gravity is a very weak force – it is actually the weakest force in the universe.

From the perspective of QFT, however, GR will turn out to be a sick theory. It is nonrenormalizable, with a dimensionless coupling constant that grows as $G_N E^{D-2}$ where $E$ is the typical energy scale. This means that we cannot trust GR in the UV and would need a theory of quantum gravity to address its physics. We will not venture at all into this territory in this course.

One final note about the Einstein-Hilbert action – the fact that light and gravity move at the same speed is baked into it. In order to obtain a theory of gravity that has a different speed than light, we would have to recruit additional fields and physical complications. The most compelling reason for not doing this is that no experimental measurement has yet been able to distinguish between the two speeds to within experimental uncertainty.

In order for our story to be more interesting, we should also include a coupling to the matter fields in the problem. The minimal coupling recipe consistent with general relativistic coordinate invariance tells us to replace

$$d^Dx \to d^Dx\, \sqrt{-g} , \tag{10.50}$$

$$\partial_\mu \to \nabla_\mu , \tag{10.51}$$

$$\eta_{\mu\nu} \to g_{\mu\nu} . \tag{10.52}$$

Since we do not consider torsion or other complications in this course, we will not be exploring how to couple in more complicated forms of matter than spin zero or spin one matter fields. So for our purposes the above recipe will be enough to handle all our cases of interest. For a more sophisticated discussion, you can refer to e.g. Tomás Ortín’s monograph “Gravity and Strings” (Cambridge, 2004).

How will we find the variation of $S_{\rm EH}$ w.r.t. the dynamical field in the problem, which is $g_{\alpha\beta}(x^\lambda)$? We need to examine where all possible metric tensors might be hiding in the diffeomorphism-invariant action. One insight that makes the algebra a bit easier is to vary w.r.t. the upstairs components of the metric rather than the downstairs ones. We will proceed looking to vary w.r.t. $g^{\alpha\beta}$ rather than $g_{\alpha\beta}$. The physics has to be the same in the end, of course.

A handy identity for metric variations can be obtained directly from the definition of


the upstairs (inverse) metric, as follows.

δ(gαµ) = δ(δαµ) = 0

= δ(gαβgβµ) (10.53)

= (δgαβ)gβµ + gαβ(δgβµ) . (10.54)

Contracting with gαν and using symmetry of the metric gives

δgµν = −gµαgνβ(δgαβ) , (10.55)

which allows you to make a switch between δgµν and δgµν . Notice that the minus sign inthis switcheroo equation is not an error; it is essential. You cannot obtain the downstairsvariations from the upstairs ones merely by naıvely pulling down the indices with the down-stairs metric, because they are not independent variables. The metric and its inverse mustalways obey the definition equation gασgσβ = δαβ .

Let us examine the variation of the cosmological constant term w.r.t. the (upstairs) metric first, because it is structurally simpler. This part of the Lagrangian is just $\sqrt{-g}$ times a constant. So we need to know the variation of the determinant of the metric. To find this, consider the downstairs metric as a matrix $M$ with eigenvalues $\lambda_i$. Then we have

$$\det(M) = \prod_i \lambda_i = \exp\left(\log\prod_i \lambda_i\right) = \exp\left(\sum_i \log\lambda_i\right) = \exp\left(\mathrm{Tr}(\log M)\right) . \tag{10.56}$$

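Taking the variation of this w.r.t. $M$ gives

\begin{align}
\delta(\det M) &= \det M \;\delta\left(\mathrm{Tr}(\log M)\right) \tag{10.57} \\
&= \det M \;\mathrm{Tr}\left(\delta(\log M)\right) \tag{10.58} \\
&= \det M \;\mathrm{Tr}\left(M^{-1}\delta M\right) , \tag{10.59}
\end{align}

so that

$$\frac{\delta(\det M)}{(\det M)} = \mathrm{Tr}\left(M^{-1}\delta M\right) . \tag{10.60}$$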

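Here, $\det M = g$, and $\delta g / g = \delta(-g)/(-g)$, so that

\begin{align}
\frac{\delta(-g)}{(-g)} &= +g^{\alpha\beta}\,\delta g_{\alpha\beta} \tag{10.61} \\
&= -g_{\alpha\beta}\,\delta g^{\alpha\beta} \,, \tag{10.62}
\end{align}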

where in the second step we used the switcheroo formula from earlier. We can see immediately that this implies the formula we need:

$$\delta\sqrt{-g} = -\frac{1}{2}\sqrt{-g}\; g_{\alpha\beta}\,\delta g^{\alpha\beta} . \tag{10.63}$$
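The determinant identity $\delta(\det M) = \det M\,\mathrm{Tr}(M^{-1}\delta M)$ is easy to sanity-check numerically. The sketch below is only an illustration: the $2\times 2$ matrix `M`, the perturbation `dM`, and the helper names are all made up for this check and do not come from the notes.

```python
# Numerical check of delta(det M) = det(M) * Tr(M^{-1} delta M)
# for a symmetric 2x2 matrix, using plain Python only.

def det2(m):
    """Determinant of a 2x2 matrix stored as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of a 2x2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

def trace_of_product(a, b):
    """Tr(a b) for 2x2 matrices."""
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))

# Illustrative Lorentzian-signature matrix and a small symmetric perturbation.
M = [[-1.0, 0.3], [0.3, 2.0]]
eps = 1e-6
dM = [[0.7 * eps, -0.2 * eps], [-0.2 * eps, 0.5 * eps]]
M_new = [[M[i][j] + dM[i][j] for j in range(2)] for i in range(2)]

exact_change = det2(M_new) - det2(M)                     # full change in det
first_order = det2(M) * trace_of_product(inv2(M), dM)    # right-hand side of (10.59)
```

The two numbers agree up to a term of order `eps**2`, which is the second-order piece that the variation discards.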

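So we have that

$$\delta\left(\frac{1}{16\pi G_N}\int d^Dx\,\sqrt{-g}\,(-2\Lambda)\right) = \frac{1}{16\pi G_N}\int d^Dx\,\sqrt{-g}\,(+\Lambda g_{\alpha\beta})\,\delta g^{\alpha\beta} . \tag{10.64}$$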

This takes care of the variation of the cosmological constant term. All the action from this part comes from the variation of $\sqrt{-g}$.

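Now, how about the term involving the complete contraction of the Riemann tensor? We need to know the variation of this part of the action under a variation in the field variable $g^{\alpha\beta}$,

\begin{align}
&\delta\left(\frac{1}{16\pi G_N}\int d^Dx\,\sqrt{-g}\,R\right) \tag{10.65} \\
&= \delta\left(\frac{1}{16\pi G_N}\int d^Dx\,\sqrt{-g}\,g^{\alpha\beta}R_{\alpha\beta}\right) \notag \\
&= \frac{1}{16\pi G_N}\int d^Dx\left[(\delta\sqrt{-g})\,g^{\alpha\beta}R_{\alpha\beta} + \sqrt{-g}\,(\delta g^{\alpha\beta})R_{\alpha\beta} + \sqrt{-g}\,g^{\alpha\beta}(\delta R_{\alpha\beta})\right] . \tag{10.66}
\end{align}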

We know what to do for the first two terms already, but the third one is new.

To make progress with the third term here, we need a new identity, which is known as the Palatini identity,

$$\delta R^{\sigma}{}_{\mu\nu\rho} = -2\nabla_{[\mu}\delta\Gamma^{\sigma}{}_{\nu]\rho} . \tag{10.67}$$

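The most straightforward way to prove this identity is to consider a change from one affine connection to another,

$$\Gamma^{\rho}{}_{\mu\nu} \to \tilde\Gamma^{\rho}{}_{\mu\nu} = \Gamma^{\rho}{}_{\mu\nu} + \tau^{\rho}{}_{\mu\nu} \,, \tag{10.68}$$

and use the definition of Riemann in terms of Christoffels to see that

$$R^{\sigma}{}_{\mu\nu\rho}(\tilde\Gamma) = R^{\sigma}{}_{\mu\nu\rho}(\Gamma) - 2\nabla_{[\mu}\tau^{\sigma}{}_{\nu]\rho} - 2\tau^{\sigma}{}_{[\mu|\lambda}\tau^{\lambda}{}_{|\nu]\rho} \,. \tag{10.69}$$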

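(Check this algebra for yourself if you would like to understand where the result came from. Remember that we take the torsion tensor $T^{\lambda}{}_{\mu\nu}$ to be zero throughout this course.) Therefore, to first order in small variations,

$$-\delta R^{\rho}{}_{\mu\lambda\nu} = \nabla_{\lambda}(\delta\Gamma^{\rho}{}_{\nu\mu}) - \nabla_{\nu}(\delta\Gamma^{\rho}{}_{\lambda\mu}) . \tag{10.70}$$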

Notice that this is a true tensor equation, because we are dealing with the difference of two affine connections. Recall also that the connection is metric-compatible and torsion-free.

We are almost ready to compute the total variation of the gravitational (Einstein-Hilbert) action, but there is a significant subtlety to which we first need to attend, involving surface terms. We have so far that

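\begin{align}
\delta S_{EH} &= \frac{1}{16\pi G_N}\int d^Dx \Big\{ (\delta\sqrt{-g})\,g^{\alpha\beta}R_{\alpha\beta} + \sqrt{-g}\,(\delta g^{\alpha\beta})R_{\alpha\beta} \notag \\
&\qquad\qquad\qquad + \sqrt{-g}\,g^{\alpha\beta}(\delta R_{\alpha\beta}) - 2\Lambda\,\delta(\sqrt{-g}) \Big\} \tag{10.71} \\
&= \frac{1}{16\pi G_N}\int d^Dx \Big\{ \sqrt{-g}\left(-\frac{1}{2}g_{\alpha\beta}R + R_{\alpha\beta} + \Lambda g_{\alpha\beta}\right)\delta g^{\alpha\beta} \notag \\
&\qquad\qquad\qquad + \sqrt{-g}\,g^{\alpha\beta}\left[\nabla_{\lambda}(\delta\Gamma^{\lambda}{}_{\beta\alpha}) - \nabla_{\beta}(\delta\Gamma^{\lambda}{}_{\lambda\alpha})\right] \Big\} \tag{10.72}
\end{align}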

The terms in round parentheses are looking very promising for giving the left hand side of Einstein's equations, which is what we are driving at obtaining.

For now, we have one more piece of physics to tie up with a neat bow, before we can produce the equations of motion. We have to look more carefully at what is going on with the total derivative terms. So let us focus on the last gnarly looking terms in the above action variation, involving covariant derivatives of the variations of the Christoffels. We have

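\begin{align}
\delta S_{\rm gnarly} &= \frac{1}{16\pi G_N}\int d^Dx\,\sqrt{-g}\,g^{\alpha\beta}\left[\nabla_{\lambda}(\delta\Gamma^{\lambda}{}_{\beta\alpha}) - \nabla_{\beta}(\delta\Gamma^{\lambda}{}_{\lambda\alpha})\right] \tag{10.73} \\
&= \frac{1}{16\pi G_N}\int d^Dx\,\sqrt{-g}\,\nabla_{\lambda}\left[g^{\alpha\beta}(\delta\Gamma^{\lambda}{}_{\beta\alpha}) - g^{\alpha\lambda}(\delta\Gamma^{\sigma}{}_{\sigma\alpha})\right] . \tag{10.74}
\end{align}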

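Note that here we used (i) relabelling of two sets of dummy indices and (ii) the fact that the connection is metric-compatible. The above gnarly action variation can be written in the following form

$$\delta S_{\rm gnarly} = \frac{1}{16\pi G_N}\int d^Dx\,\sqrt{-g}\,\nabla_{\lambda}v^{\lambda} \,, \tag{10.75}$$

where

$$v^{\lambda} = g^{\alpha\beta}\,\delta\Gamma^{\lambda}{}_{\alpha\beta} - g^{\alpha\lambda}\,\delta\Gamma^{\sigma}{}_{\alpha\sigma} \tag{10.76}$$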

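where we used the symmetry property of the Christoffel connection coefficients. What is our next step? We need to know how the variations in the Christoffels are related to those for the original dynamical fields in the action, the metric tensor $g_{\alpha\beta}(x^\lambda)$. These variations can be obtained (with some algebra) from our formula for the Christoffels in terms of metric components, and our formula for the upstairs variations in terms of the downstairs variations. The result is

$$\delta\Gamma^{\sigma}{}_{\alpha\beta} = -\frac{1}{2}\left[g_{\lambda\alpha}\nabla_{\beta}(\delta g^{\lambda\sigma}) + g_{\lambda\beta}\nabla_{\alpha}(\delta g^{\lambda\sigma}) - g_{\alpha\lambda}g_{\beta\rho}\nabla^{\sigma}(\delta g^{\lambda\rho})\right] . \tag{10.77}$$

A bit of algebra then yields

$$v^{\lambda} = g_{\alpha\beta}\nabla^{\lambda}(\delta g^{\alpha\beta}) - \nabla_{\sigma}(\delta g^{\lambda\sigma}) \,. \tag{10.78}$$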

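We can now use Stokes' Theorem, working in $D = d+1$ dimensions with Lorentzian 'mostly plus' signature,

$$\int_{M} d^Dx\,\sqrt{-g}\,\nabla_{\lambda}v^{\lambda} = \int_{\partial M} d^d\Sigma_{\lambda}\,v^{\lambda} = \int_{\partial M} d^d\Sigma\; n_{\lambda}v^{\lambda} \,, \tag{10.79}$$

where

$$d^d\Sigma \equiv n^2\, d^d\Sigma_{\lambda}\,n^{\lambda} . \tag{10.80}$$

How do we use these formulæ for Stokes' Theorem properly? Suppose you have a hypersurface $\Sigma$, and that it has a unit normal $n^\mu$, normalized as $n^2 \equiv n_\mu n^\mu = \pm 1$, with $-1$ for spacelike surfaces and $+1$ for timelike surfaces. The induced metric on $\Sigma$ by $g_{\mu\nu}$ is

$$h_{\mu\nu} = g_{\mu\nu} - n^2\, n_\mu n_\nu . \tag{10.81}$$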

Note that $h_{\mu\nu}n^{\nu} = 0$, so that $h$ can be used to project tensors onto $\Sigma$. We can ask how the normal vector $n^\mu$ varies across the manifold. Define

$$K_{\mu\nu} \equiv h_{\mu}{}^{\alpha}\,h_{\nu}{}^{\beta}\,\nabla_{(\alpha}n_{\beta)} \,, \tag{10.82}$$

which is known as the extrinsic curvature or the second fundamental form. If we take $n^\mu$ to define a vector field, then

$$K_{\mu\nu} = \frac{1}{2}\mathcal{L}_n h_{\mu\nu} . \tag{10.83}$$

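But getting back to the task at hand, let us inspect what we have for the variations of the total gravity action. We have

\begin{align}
\delta S_{EH} &= \frac{1}{16\pi G_N}\int_{M} d^Dx\,\sqrt{-g}\left(-\frac{1}{2}g_{\alpha\beta}R + R_{\alpha\beta} + \Lambda g_{\alpha\beta}\right)\delta g^{\alpha\beta} \notag \\
&\quad + \frac{1}{16\pi G_N}\int_{\partial M} d^dx\; n_{\lambda}\left(h_{\alpha\beta}\nabla^{\lambda}(\delta g^{\alpha\beta}) - \nabla_{\sigma}(\delta g^{\lambda\sigma})\right) . \tag{10.84}
\end{align}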

Now we come to the key physics point. We want to produce the Einstein equations by requiring that the total (gravity + matter) action be stationary under arbitrary variations of the metric which vanish on the boundary,

$$\delta g^{\alpha\beta}\big|_{\partial M} = 0 . \tag{10.85}$$

If this holds, then those field variations are constant on the boundary, and so the covariant derivative projected onto the boundary directions must vanish, so the second of the offending terms above in the action variation will drop out. We are however left with one remaining offending term which does not vanish by virtue of any identity for our torsion-free spacetime with a metric-compatible connection.

This physical subtlety might not seem like much, but it is in fact pretty darned important. It requires that we add an additional term to the gravitational action, which is a surface term, designed in exactly such a way as to cancel out this offending boundary term that we have left over. This can be done by observing that the trace of the extrinsic curvature is the relevant object, using

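$$\delta K = \delta h^{\mu\nu}\nabla_{\mu}n_{\nu} + h^{\mu}{}_{\nu}\,\delta\Gamma^{\nu}{}_{\mu\rho}\,n^{\rho} \,, \tag{10.86}$$

and

$$\delta K\big|_{\partial M} = -\frac{1}{2}\,n^{\lambda}\,h^{\mu\sigma}g_{\mu\alpha}g_{\sigma\beta}\,\nabla_{\lambda}(\delta g^{\alpha\beta}) \tag{10.87}$$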

Therefore, what we need to do in order to cancel off the offending piece in the surface terms and make the GR initial value problem well-defined is to add a surface term to Einstein-Hilbert, with the precise coefficient

$$\Delta S_{EH} = \frac{1}{8\pi G_N}\int_{\partial M} d^d\Sigma\; K \tag{10.88}$$

The result of all this sweating carefully over details is that we obtain the full Einstein-Hilbert action, including boundary terms,

$$S_{\rm grav} = \frac{1}{16\pi G_N}\int_{M} d^Dx\,\sqrt{-g}\,(R - 2\Lambda) + \frac{1}{8\pi G_N}\int_{\partial M} d^d\Sigma\; K . \tag{10.89}$$

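Finally, we are now in a position to put together all the ingredients and obtain Einstein's equations. We have

\begin{align}
\delta S_{\rm total} &= \delta S_{\rm grav} + \delta S_{\rm matter} \tag{10.90} \\
&= \frac{1}{16\pi G_N}\int d^Dx\left(-\frac{1}{2}\sqrt{-g}\,g_{\alpha\beta}R + \sqrt{-g}\,R_{\alpha\beta} + \sqrt{-g}\,\Lambda g_{\alpha\beta}\right)\delta g^{\alpha\beta} + \int d^Dx\left(+\frac{1}{2}\sqrt{-g}\,T_{\alpha\beta}\,\delta g^{\alpha\beta}\right) \tag{10.91} \\
&= \frac{1}{16\pi G_N}\int d^Dx\,\sqrt{-g}\left(R_{\alpha\beta} - \frac{1}{2}g_{\alpha\beta}R + \Lambda g_{\alpha\beta} + 8\pi G_N T_{\alpha\beta}\right)\delta g^{\alpha\beta} . \tag{10.92}
\end{align}

Since this must be zero for arbitrary functional variations $\delta g^{\alpha\beta}(x^\lambda)$, it follows that

$$R_{\alpha\beta} - \frac{1}{2}g_{\alpha\beta}R + \Lambda g_{\alpha\beta} = -\frac{8\pi G_N}{c^4}T_{\alpha\beta} . \tag{10.93}$$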

These are Einstein's famous equations of motion for General Relativity.$^{11}$

$^{11}$ I put back the powers of $c$ temporarily. You can find them again lickety split by dimensional analysis.
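A common first manipulation of (10.93) is to take its trace. The short calculation below is a sketch that assumes only $g^{\alpha\beta}g_{\alpha\beta} = D$; the shorthands $T \equiv g^{\alpha\beta}T_{\alpha\beta}$ and $\kappa \equiv 8\pi G_N/c^4$ are introduced here for convenience and are not notation from the notes.

```latex
\begin{align}
g^{\alpha\beta}\left(R_{\alpha\beta} - \tfrac{1}{2}g_{\alpha\beta}R + \Lambda g_{\alpha\beta}\right)
  &= \left(1 - \tfrac{D}{2}\right)R + D\Lambda = -\kappa T
  \quad\Longrightarrow\quad
  R = \frac{2}{D-2}\left(D\Lambda + \kappa T\right) , \\
R_{\alpha\beta} &= \frac{2\Lambda}{D-2}\,g_{\alpha\beta}
  - \kappa\left(T_{\alpha\beta} - \frac{1}{D-2}\,g_{\alpha\beta}T\right) .
\end{align}
```

In vacuum ($T_{\alpha\beta} = 0$) and $D = 4$ this reduces to $R_{\alpha\beta} = \Lambda g_{\alpha\beta}$.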

11 Appendix: advanced topics*

11.1 Spherical polars in D dimensions and electric fields*

Flat, boring Minkowski spacetime $\mathbb{R}^{1,d}$ written in spherical polar coordinates is not a curved spacetime, but it does have tensor transformation laws that depend on spacetime position. The metric is

$$ds^2 = dt^2 - dr^2 - r^2 d\theta_1^2 - r^2\sin^2\theta_1\,d\theta_2^2 + \ldots \,, \tag{11.1}$$

where the spherical polar spatial coordinates $r, \theta_1, \ldots, \theta_{d-1}$ are related to the Cartesian spatial coordinates $x_1, \ldots, x_d$ by

\begin{align}
x_1 &= r\cos\theta_1 \tag{11.2} \\
x_2 &= r\sin\theta_1\cos\theta_2 \tag{11.3} \\
&\;\;\vdots \tag{11.4} \\
x_{d-1} &= r\sin\theta_1\ldots\sin\theta_{d-2}\cos\theta_{d-1} \tag{11.5} \\
x_d &= r\sin\theta_1\ldots\sin\theta_{d-2}\sin\theta_{d-1} \,. \tag{11.6}
\end{align}
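The coordinate map (11.2)-(11.6) can be written as a short loop, which makes the pattern of sines and cosines explicit. This is only an illustrative sketch: the function name `spherical_to_cartesian` and the sample values are made up for the example.

```python
import math

def spherical_to_cartesian(r, angles):
    """Map (r, theta_1, ..., theta_{d-1}) to Cartesian (x_1, ..., x_d).

    Each x_i picks up the product of sines of all earlier angles times the
    cosine of its own angle; the last coordinate gets sines only.
    """
    coords = []
    sin_prod = 1.0
    for theta in angles:
        coords.append(r * sin_prod * math.cos(theta))
        sin_prod *= math.sin(theta)
    coords.append(r * sin_prod)
    return coords

# Example with d = 4 spatial dimensions (three angles).
x = spherical_to_cartesian(2.0, [0.4, 1.1, 2.5])
radius_squared = sum(c * c for c in x)   # should reproduce r**2
```

Summing the squares of the outputs recovers $r^2$, which is a quick consistency check on the parametrization.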

Check also that the volume element on the sphere of radius $r$ is

$$r^{d-1}\sin^{d-2}\theta_1\,\sin^{d-3}\theta_2\ldots\sin\theta_{d-2}\; d\theta_1\,d\theta_2\ldots d\theta_{d-1} . \tag{11.7}$$

As an exercise, try solving the Maxwell equation assuming a purely radial electric field. You should find

$$F^{tr} = \frac{1}{\Omega_{d-1}\,r^{d-1}} \,, \tag{11.8}$$

where $\Omega_{d-1}$ is the volume of a $(d-1)$-sphere. The unit normal vectors to the field strength are $(1, 0, \ldots, 0)$ and $(0, 1, \ldots, 0)$. The induced metric (see section on tensor densities) on the $S^{d-1}$ at spatial infinity is

$$\gamma_{ij}\,dx^i dx^j = r^2 d\theta_1^2 + r^2\sin^2\theta_1\,d\theta_2^2 + \ldots \,. \tag{11.9}$$

Then the conserved electric charge is

$$Q = -\lim_{r\to\infty}\int_{S^{d-1}}\left(d\theta_1\ldots d\theta_{d-1}\; r^{d-1}\sin^{d-2}\theta_1\ldots\sin\theta_{d-2}\right)\left(-\frac{q}{\Omega_{d-1}\,r^{d-1}}\right) = q . \tag{11.10}$$

11.2 Noncoordinate bases and the spin connection*

So far, we have only used the coordinate basis: $\partial_\mu$ for vectors (upstairs components) and $dx^\mu$ for one-forms (downstairs components). But it is also possible to use a noncoordinate basis for tensor analysis. The ability to do this relies on a physical assumption that Einstein baked into GR – that even though spacetime is curved, it is locally Minkowski at any point. The metric tensor can be written as

$$g_{\mu\nu} = e_{\mu}{}^{A}\,e_{\nu}{}^{B}\,\eta_{AB} \,, \tag{11.11}$$

where the $e_{\mu}{}^{A}$ is known as the vielbein. The indices $A, B$ now live in flat Minkowski spacetime. Accordingly, the vielbein is often loosely described as the 'square root of the metric'. The vielbein $e_{\mu}{}^{A}$ is invertible and has $D \times D$ components. It can be thought of as having one leg in the curved spacetime (the $\mu$ index) and one leg in the flat tangent spacetime (the $A$ index). The inverse of the vielbein $e_{\mu}{}^{A}$ is $e^{\mu}{}_{A}$ and it obeys

$$e^{\mu}{}_{A}\,e_{\mu}{}^{B} = \eta_{A}{}^{B} \,, \qquad e^{\mu}{}_{A}\,e_{\nu}{}^{A} = g^{\mu}{}_{\nu} \,. \tag{11.12}$$

The rules for manipulating their indices are that flat indices are raised/lowered with the Minkowski metric $\eta$, while curved indices are raised/lowered with the curved spacetime metric $g$. If you count components, you will see that there is a certain coordinate freedom in picking the vielbein. As you can check yourself explicitly by counting carefully, this corresponds to the option of doing local Lorentz transformations at every point.

For any vector $V$, we can define its components in tangent space via the vielbein

$$V^{A} = e_{\mu}{}^{A}\,V^{\mu} \,. \tag{11.13}$$

For any one-form $\alpha$, its components are

$$\alpha_{A} = e^{\mu}{}_{A}\,\alpha_{\mu} \,, \tag{11.14}$$

which makes use of the inverse vielbein.

The Christoffel connection is not the only available connection allowing us to parallel transport stuff around. In particular, it does not correctly provide for parallel transport of spinor fields. By recruiting the vielbein, we can define a connection that does, known as the spin connection $\omega_{\mu}{}^{A}{}_{B}$. The tetrad postulate states that the covariant derivative of the vielbein is zero,

$$\nabla_{\mu}e_{\nu}{}^{A} = \partial_{\mu}e_{\nu}{}^{A} - \Gamma^{\lambda}{}_{\mu\nu}\,e_{\lambda}{}^{A} + \omega_{\mu}{}^{A}{}_{B}\,e_{\nu}{}^{B} = 0 \,. \tag{11.15}$$

Physically, this restricts the spin connection to be related to the Christoffel connection. In components,

$$\omega_{\mu}{}^{A}{}_{B} = e_{\nu}{}^{A}\,e^{\lambda}{}_{B}\,\Gamma^{\nu}{}_{\mu\lambda} - e^{\lambda}{}_{B}\,\partial_{\mu}e_{\lambda}{}^{A} \,. \tag{11.16}$$

An important feature of the spin connection is that it is antisymmetric when written with lower indices,

$$\omega_{AB} = -\omega_{BA} \,. \tag{11.17}$$

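For a spinor field $\psi$, the spinor covariant derivative is

$$\nabla_{\mu}\psi = \left(\mathbb{1}\,\partial_{\mu} + \frac{1}{4}\,\omega_{\mu AB}\,\Gamma^{[AB]}\right)\psi \tag{11.18}$$

where the $\Gamma^{[AB]} = \frac{1}{2}(\Gamma^{A}\Gamma^{B} - \Gamma^{B}\Gamma^{A})$ are antisymmetric products of two gamma matrices in spinor space and $\mathbb{1}$ is the unit matrix in spinor space.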

11.3 Differential forms*

Here are some neat tricks for rank $(0, p)$ completely antisymmetric tensors, which are also known as differential forms or $p$-forms. These objects turn out to have a surprising amount of utility in describing physical systems of interest, and one prominent application is to electromagnetism. A $p$-form in $D$ dimensions has

$$\binom{D}{p} = \frac{D!}{(D-p)!\,p!} \tag{11.19}$$
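The counting in (11.19) is just a binomial coefficient, so it is easy to tabulate. The helper name `num_form_components` below is invented for this illustration.

```python
from math import comb

def num_form_components(D, p):
    """Number of independent components of a p-form in D dimensions."""
    return comb(D, p)

# In D = 4: scalars, one-forms, two-forms (like F), three-forms, four-forms.
counts_4d = [num_form_components(4, p) for p in range(5)]
```

In four dimensions the two-form count is 6, matching the three electric plus three magnetic components of the field strength.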

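independent components. We can define several operations that are exclusive to $p$-forms. The wedge product for a $p$-form $A$ with a $q$-form $B$ is defined by

$$(A\wedge B)_{\mu_1\ldots\mu_{p+q}} = \frac{(p+q)!}{p!\,q!}\,A_{[\mu_1\ldots\mu_p}B_{\mu_{p+1}\ldots\mu_{p+q}]} \tag{11.20}$$

Note that

$$A\wedge B = (-1)^{pq}\,B\wedge A \,, \tag{11.21}$$

which follows from the total antisymmetry of the indices: moving the $q$ indices of $B$ past the $p$ indices of $A$ costs a factor of $(-1)^{pq}$.

Consider the exterior derivative $d$, expanded in coordinate basis as

$$d = dx^{\mu}\,\partial_{\mu} \,. \tag{11.22}$$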

When acting on a 0-form (a function), the exterior derivative gives simply

$$d\phi = dx^{\mu}\,\partial_{\mu}\phi \,. \tag{11.23}$$

Let us now ask what happens when it acts on a 1-form. Take for example the gauge potential $A$,

$$A = dx^{\mu}A_{\mu} \,. \tag{11.24}$$

Then the wedge product of $d$ with $A$ has components

$$(d\wedge A)_{\mu\nu} = 2\partial_{[\mu}A_{\nu]} = \partial_{\mu}A_{\nu} - \partial_{\nu}A_{\mu} = F_{\mu\nu} \,, \tag{11.25}$$

i.e., the electromagnetic field strength tensor. So the covariant curl equation relating gauge potential and gauge field strength becomes very neat,

$$F = d\wedge A \,. \tag{11.26}$$

Stokes' Theorem is beautifully simple for differential forms,

$$\int_{M} d\wedge\omega = \int_{\partial M}\omega \,. \tag{11.27}$$

The exterior differential allows us to express gauge invariance in electromagnetism very elegantly. We know that $F = d\wedge A$. Then the electromagnetic gauge transformation is

$$A \to A + d\lambda \,, \tag{11.28}$$

where $\lambda$ is a 0-form, i.e. a function. Because $d^2 = 0$, this is guaranteed to keep the gauge field strength $F$ invariant.
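In components, the invariance of the field strength under the gauge transformation can be spelled out in one line. This is a standard check written in the component notation of (11.25):

```latex
\begin{align}
F_{\mu\nu} \;\to\; \partial_{\mu}\left(A_{\nu} + \partial_{\nu}\lambda\right)
  - \partial_{\nu}\left(A_{\mu} + \partial_{\mu}\lambda\right)
  = F_{\mu\nu} + \left(\partial_{\mu}\partial_{\nu} - \partial_{\nu}\partial_{\mu}\right)\lambda
  = F_{\mu\nu} \,.
\end{align}
```

The extra term vanishes because mixed partial derivatives commute, which is the component statement of $d^2 = 0$.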

The really amazing thing about $d$ is that it produces a tensor when acting on a $p$-form. The reason is that the failure of the partial derivative to produce a tensor was proportional to $\partial_{\lambda}(\partial x^{\mu'}/\partial x^{\nu}) = \partial_{\lambda}\partial_{\nu}x^{\mu'}$, which is symmetric in $(\lambda\nu)$. This cute fact that exterior derivatives acting on forms give more forms, which are all tensors, is restricted to $p$-forms. It relies on the fact that they are totally antisymmetric, and does not hold for any other kind of tensor.

An interesting property of $d$ is that $d^2 \equiv 0$ when acting on any $p$-form. This works because second mixed partial derivatives commute. A $p$-form is said to be closed if $dA = 0$, and exact if $A = dB$ for some $B$ which is a $(p-1)$-form. Obviously, all exact forms are closed, but the converse is not true in general. The space of closed forms is a vector space, and so is the space of exact forms. From these spaces we can define a new vector space, which has elements called de Rham cohomology classes, defined as the closed forms modulo the exact forms. The dimension of the vector space of closed $p$-forms modulo exact $p$-forms is a topological quantity known as the $p$th Betti number of the manifold. In topologically trivial Minkowski space, these quantities are all maximally boring. But consider the Aharonov-Bohm effect, wherein electrons impinging on space outside a solenoid pick up a measurable phase. Outside the solenoid the field strength is zero, but the gauge potential is not pure gauge. If we integrate the gauge potential as a one-form around a closed curve, getting the line integral, the answer will be nonzero, by Stokes' Theorem, because there is a magnetic field threading the surface whose boundary is the curve.

Differential forms have a property known as Hodge duality, which maps a $p$-form $A$ into a $(D-p)$-form $*A$, by contracting with the epsilon tensor. In components,

$$(*A)_{\mu_1\ldots\mu_{D-p}} = \frac{1}{p!}\,\varepsilon^{\nu_1\ldots\nu_p}{}_{\mu_1\ldots\mu_{D-p}}\,A_{\nu_1\ldots\nu_p} \,. \tag{11.29}$$

Applying the Hodge star operator twice gives

$$**A = (-1)^{s+p(D-p)}A \,, \tag{11.30}$$
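The sign in (11.30) is simple to tabulate. The function name `double_hodge_sign` is invented for this sketch; it just evaluates $(-1)^{s+p(D-p)}$ with $s$ the number of minus signs in the metric.

```python
def double_hodge_sign(D, p, s=1):
    """Sign picked up when the Hodge star is applied twice to a p-form."""
    return (-1) ** (s + p * (D - p))

# Lorentzian D = 4 (s = 1): signs for p = 0, 1, 2, 3, 4.
signs_lorentzian_4d = [double_hodge_sign(4, p) for p in range(5)]

# Euclidean D = 3 (s = 0): the star squares to +1 on one-forms.
sign_euclidean_3d = double_hodge_sign(3, 1, s=0)
```

For the Maxwell two-form in four Lorentzian dimensions the sign is $-1$, consistent with the duality sending $\vec E \to \vec B$ and $\vec B \to -\vec E$.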

where $s$ is the number of minus signs in the metric, in our case 1. The Hodge star has relevance to electromagnetism. It replaces $\vec{E}$ with $\vec{B}$ and $\vec{B}$ with $-\vec{E}$ – you should do the exercise to check this yourself. The Hodge star operation also helps us to rewrite Maxwell's equations in differential form language,

$$F = d\wedge A \;\Rightarrow\; d\wedge F = 0 \,, \tag{11.31}$$

for the Bianchi identity, and

$$* \, d * F = J \tag{11.32}$$

for the field equation. Pretty neat, huh?

11.4 Cartan structure equations*

Let us write the vielbeins $e_{\mu}{}^{A}$ as one-forms, via

$$e^{A} = e_{\mu}{}^{A}\,dx^{\mu} \,. \tag{11.33}$$

If the torsion tensor is zero, then the equation satisfied by the spin connection

$$\omega^{A}{}_{B} = \omega_{\mu}{}^{A}{}_{B}\,dx^{\mu} \tag{11.34}$$

is

$$T^{A} = de^{A} + \omega^{A}{}_{B}\wedge e^{B} = 0 \,. \tag{11.35}$$

This is known as the first of two Cartan structure equations.

The spin connection can be used to find the Riemann curvature tensor. It appears in the second of the Cartan structure equations,

$$R^{A}{}_{B} = d\omega^{A}{}_{B} + \omega^{A}{}_{C}\wedge\omega^{C}{}_{B} \,. \tag{11.36}$$

Note that the Riemann animal here is a two-form, because the spin connection is a one-form. We can resolve Riemann into its components by writing

$$R^{A}{}_{B} = R_{\mu\nu}{}^{A}{}_{B}\,dx^{\mu}\wedge dx^{\nu} \,. \tag{11.37}$$

The Riemann tensor that we have already discovered from taking a commutator of covariant derivatives has curved space components

$$R^{\lambda}{}_{\sigma\mu\nu} = e^{\lambda}{}_{A}\,e_{\sigma}{}^{B}\,R_{\mu\nu}{}^{A}{}_{B} \,, \tag{11.38}$$

where we used an antisymmetry property of Riemann.

So we see that for our theory of gravity, which is torsion-free, there are two different connections – the Christoffel connection $\Gamma$ and the spin connection $\omega$ – which can both be used to find the Riemann tensor directly.

Let us also illustrate here the power of the language of differential forms by computing the vielbeins and spin connection to get Riemann for a simple example. Computations using Christoffels will generally be a bit more laborious, for simple diagonal spacetime metrics. Consider the metric

$$ds^2 = -dt^2 + a^2(t)\,|d\vec{x}|^2 \,, \tag{11.39}$$

where $a(t)$ is the scale factor. We have some freedom in picking vielbeins. We will choose wisely,

$$e^{\hat{0}}{}_{0} = 1 \,, \qquad e^{\hat{i}}{}_{j} = a(t)\,\delta^{i}{}_{j} \,. \tag{11.40}$$

Note that we are using hats to designate flat indices, as distinct from curved ones without hats – our mnemonic here is 'hat for flat'. Then you can see very straightforwardly that these obey the basic equation $g_{\mu\nu} = e^{A}{}_{\mu}\,e^{B}{}_{\nu}\,\eta_{AB}$.
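The statement that the diagonal vielbein reproduces the metric can be checked by brute-force index contraction. The sketch below is illustrative only: the helper `metric_from_vielbein`, the storage convention `e[A][mu]`, and the sample value of the scale factor are assumptions made for the example.

```python
def metric_from_vielbein(e, eta):
    """Return g[mu][nu] = sum over A, B of e[A][mu] * eta[A][B] * e[B][nu]."""
    n = len(eta)
    return [[sum(e[A][mu] * eta[A][B] * e[B][nu]
                 for A in range(n) for B in range(n))
             for nu in range(n)]
            for mu in range(n)]

a = 2.5  # value of the scale factor a(t) at some fixed time

# Mostly-plus Minkowski metric for the flat (hatted) indices.
eta = [[0.0] * 4 for _ in range(4)]
eta[0][0] = -1.0
for i in range(1, 4):
    eta[i][i] = 1.0

# Vielbein: unit time leg and spatial legs scaled by a.
e = [[0.0] * 4 for _ in range(4)]
e[0][0] = 1.0
for i in range(1, 4):
    e[i][i] = a

g = metric_from_vielbein(e, eta)  # expect diag(-1, a^2, a^2, a^2)
```

The result is the diagonal metric of the line element above, with $g_{tt} = -1$ and $g_{ii} = a^2$.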

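We now want to find the spin connection coefficients by using the first Cartan structure equation $de^{A} + \omega^{A}{}_{B}\wedge e^{B} = 0$. For our first step, we form

$$e^{\hat{0}} = e^{\hat{0}}{}_{\alpha}\,dx^{\alpha} = dt \,, \qquad e^{\hat{i}} = e^{\hat{i}}{}_{\alpha}\,dx^{\alpha} = a(t)\,dx^{i} \,. \tag{11.41}$$

Then

\begin{align}
de^{\hat{0}} &= d\wedge(dt) = 0 \tag{11.42} \\
&= -\omega^{\hat{0}}{}_{\hat{0}}\wedge e^{\hat{0}} - \omega^{\hat{0}}{}_{\hat{j}}\wedge e^{\hat{j}} \tag{11.43}
\end{align}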

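Using the Minkowski metric $\eta$ to raise/lower tangent space indices, we have

$$\omega^{\hat{0}}{}_{\hat{0}} = 0 \tag{11.44}$$

by antisymmetry of $\omega$ and symmetry of $\eta$. Also, using the same logic, we find

$$\omega^{\hat{0}}{}_{\hat{i}} = -\omega_{\hat{0}\hat{i}} \tag{11.45}$$

So $de^{\hat{0}} = 0$ implies that

$$\omega^{\hat{0}}{}_{\hat{i}}\wedge e^{\hat{i}} = 0 \tag{11.46}$$

But $e^{\hat{i}} = a(t)\,dx^{i}$, which implies that

$$\omega^{\hat{0}}{}_{\hat{i}} \propto e^{\hat{i}} \,. \tag{11.47}$$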

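To fix the ambiguity here, we need to recruit the spatial part of the first Cartan structure equations as well. We have

\begin{align}
de^{\hat{i}} &= d\wedge\left(a(t)\,dx^{i}\right) \tag{11.48} \\
&= \dot{a}\,dt\wedge dx^{i} \tag{11.49} \\
&= \frac{\dot{a}}{a}\,e^{\hat{0}}\wedge e^{\hat{i}} \tag{11.50} \\
&= -\omega^{\hat{i}}{}_{\hat{0}}\wedge e^{\hat{0}} - \omega^{\hat{i}}{}_{\hat{k}}\wedge e^{\hat{k}} \tag{11.51}
\end{align}

This implies that

$$\omega^{\hat{0}}{}_{\hat{i}} = \frac{\dot{a}}{a}\,e^{\hat{i}} \,, \qquad \omega^{\hat{i}}{}_{\hat{j}} = 0 \,. \tag{11.52}$$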

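This fixes the ambiguity that we had before.

It is straightforward to compute Riemann from the above, and I recommend it as an exercise. The answers are

\begin{align}
R^{\hat{0}}{}_{\hat{i}\hat{0}\hat{j}} &= \frac{\ddot{a}}{a}\,\delta_{ij} \tag{11.53} \\
R^{\hat{i}}{}_{\hat{0}\hat{0}\hat{j}} &= \frac{\ddot{a}}{a}\,\delta^{i}{}_{j} \tag{11.54} \\
R^{\hat{i}}{}_{\hat{j}\hat{k}\hat{\ell}} &= \left(\frac{\dot{a}}{a}\right)^2\left(\delta^{i}{}_{k}\,\delta_{j\ell} - \delta^{i}{}_{\ell}\,\delta_{jk}\right) . \tag{11.55}
\end{align}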
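For readers attempting the exercise, the curvature two-forms follow in two lines from the second Cartan structure equation. This is a sketch: it uses $\omega^{\hat{i}}{}_{\hat{0}} = \omega^{\hat{0}}{}_{\hat{i}} = \dot{a}\,dx^{i}$, which follows from raising and lowering with the mostly-plus $\eta$, and the normalization used when reading off components from a two-form is a convention that should be checked against (11.37).

```latex
\begin{align}
R^{\hat{0}}{}_{\hat{i}} &= d\omega^{\hat{0}}{}_{\hat{i}} + \omega^{\hat{0}}{}_{\hat{k}}\wedge\omega^{\hat{k}}{}_{\hat{i}}
  = d\left(\dot{a}\,dx^{i}\right) = \ddot{a}\,dt\wedge dx^{i}
  = \frac{\ddot{a}}{a}\,e^{\hat{0}}\wedge e^{\hat{i}} \,, \\
R^{\hat{i}}{}_{\hat{j}} &= d\omega^{\hat{i}}{}_{\hat{j}} + \omega^{\hat{i}}{}_{\hat{0}}\wedge\omega^{\hat{0}}{}_{\hat{j}}
  = \dot{a}^{2}\,dx^{i}\wedge dx^{j}
  = \left(\frac{\dot{a}}{a}\right)^{2} e^{\hat{i}}\wedge e^{\hat{j}} \,.
\end{align}
```

The first line is the origin of the $\ddot{a}/a$ components and the second of the $(\dot{a}/a)^2$ components quoted above.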

See how much easier it was to get the Riemann components this way? Notice how we get logarithmic derivatives of scale factors appearing in Riemann in the hatted noncoordinate basis. This is different to what we saw with the non-hatted components. But of course, the hatted and non-hatted components are related by contracting with a vielbein for each index. Check this yourself so that you are fully confident about why.

Parenthetical note: there is a cute way to represent Torsion and Riemann tensors given what we learned about Lie derivatives. Define

$$\nabla_{X} := X^{\mu}\nabla_{\mu} \,. \tag{11.56}$$

Then you can think of the torsion as a multilinear map from two vector fields to a third vector field, and Riemann in terms of a multilinear map from three vector fields to a fourth. Then their equations are

\begin{align}
T(X, Y) &= \nabla_{X}Y - \nabla_{Y}X - [X, Y] \,, \tag{11.57} \\
R(X, Y)Z &= \nabla_{X}\nabla_{Y}Z - \nabla_{Y}\nabla_{X}Z - \nabla_{[X,Y]}Z \,. \tag{11.58}
\end{align}

11.5 Timelike geodesic congruences and the Raychaudhuri equation*

In the previous section, we studied tidal forces, using the geodesic deviation equation. Now we will dig into this topic of geodesic deviation in a little more depth. This will enable us to flesh out some really interesting properties of geodesics and how they relate to the Riemann tensor. Note that for the duration of this discussion, we will stick to $D = 3+1$ dimensions of spacetime, because writing all the formulæ for the $d+1$ dimensional case can get rather messy, obscuring the physics.

A congruence is a set of curves in an open region of spacetime such that every point in the region lies on precisely one curve. Consider a timelike geodesic congruence with tangent vector

$$U^{\mu} = \frac{dx^{\mu}}{d\lambda} \,, \tag{11.59}$$

like $T$ before. You can think of this physically as the 4-velocity vector of some pressureless fluid, if you like. Recall that for timelike 4-velocities in our mostly-plus signature we have

$$U_{\mu}U^{\mu} = -1 \,, \tag{11.60}$$

while for geodesics we have that

$$U^{\lambda}\nabla_{\lambda}U^{\mu} = 0 \,. \tag{11.61}$$

Consider a separation vector $V^{\mu}$ pointing from one geodesic to a neighbouring one. We have already found that

$$\frac{DV^{\mu}}{D\lambda} = U^{\lambda}\nabla_{\lambda}V^{\mu} \equiv B^{\mu}{}_{\nu}V^{\nu} \tag{11.62}$$

where we define

$$B^{\mu}{}_{\nu} = \nabla_{\nu}U^{\mu} \,. \tag{11.63}$$

Note that we need to be very careful not to make any symmetry assumptions about this (1,1) tensor $B$ yet. What does $B$ do? It measures the failure of $V^{\mu}$ to be parallel transported along the congruence – so it describes the extent to which neighbouring geodesics deviate from being parallel.

Given a $U^{\mu}$, at each point we can study the subspace of the tangent space corresponding to vectors that are normal to $U^{\mu}$. We can project onto this subspace by using the very handy tensor

$$P^{\mu}{}_{\nu} = \delta^{\mu}{}_{\nu} + U^{\mu}U_{\nu} \,. \tag{11.64}$$

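This obeys the main condition for a projector, that '$P^2 = P$',

\begin{align}
P^{\mu}{}_{\nu}P^{\nu}{}_{\sigma} &= \left(\delta^{\mu}{}_{\nu} + U^{\mu}U_{\nu}\right)\left(\delta^{\nu}{}_{\sigma} + U^{\nu}U_{\sigma}\right) \tag{11.65} \\
&= \delta^{\mu}{}_{\sigma} + U^{\mu}U_{\sigma} + U^{\mu}U_{\sigma} + U^{\mu}(U_{\nu}U^{\nu})U_{\sigma} \tag{11.66} \\
&= \delta^{\mu}{}_{\sigma} + (1 + 1 - 1)\,U^{\mu}U_{\sigma} \tag{11.67} \\
&= P^{\mu}{}_{\sigma} \,, \tag{11.68}
\end{align}

where we again used $U_{\mu}U^{\mu} = -1$. Note also that $P^{\mu}{}_{\mu} = (D-1) = d$, which happens because the normal space is $d$ dimensional by design. Further, the tensor $P^{\mu}{}_{\nu}$ works to do what it advertises, because

\begin{align}
U_{\mu}\left(P^{\mu}{}_{\nu}W^{\nu}\right) &= U_{\mu}\left([\delta^{\mu}{}_{\nu} + U^{\mu}U_{\nu}]W^{\nu}\right) \tag{11.69} \\
&= U_{\nu}W^{\nu} + (-1)\,U_{\nu}W^{\nu} \tag{11.70} \\
&= 0 \,, \tag{11.71}
\end{align}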

so that a $P$-projected vector is indeed in the normal space. The dot product of $U$ with a $P$-projected vector is zero. (Recall that for any projector $P$, it provides a unique decomposition of a vector $X$ into $PX$ and $(1-P)X$.)
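These projector properties can be verified numerically for a concrete 4-velocity. The sketch below assumes flat mostly-plus Minkowski space in $D = 4$ and a boost along $x$ with an arbitrary rapidity; all variable names are invented for the illustration.

```python
import math

# Mostly-plus Minkowski metric in D = 4.
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

rapidity = 0.8
U_up = [math.cosh(rapidity), math.sinh(rapidity), 0.0, 0.0]
U_down = [sum(ETA[m][n] * U_up[n] for n in range(4)) for m in range(4)]

# Normalization U_mu U^mu, expected to be -1 for a timelike 4-velocity.
norm_U = sum(U_down[m] * U_up[m] for m in range(4))

# P^mu_nu = delta^mu_nu + U^mu U_nu, stored as P[mu][nu].
P = [[(1.0 if m == n else 0.0) + U_up[m] * U_down[n] for n in range(4)]
     for m in range(4)]

P_squared = [[sum(P[m][k] * P[k][n] for k in range(4)) for n in range(4)]
             for m in range(4)]
trace_P = sum(P[m][m] for m in range(4))            # expected D - 1 = 3
U_dot_P = [sum(U_down[m] * P[m][n] for m in range(4)) for n in range(4)]
```

Here `P_squared` reproduces `P`, the trace is 3, and contracting $U_\mu$ into the first index gives zero, in line with the three properties derived above.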

Notice that $B_{\mu\nu}$ is in the normal subspace too, since contracting with the first index gives zero,

$$U^{\mu}B_{\mu\nu} = U^{\mu}\nabla_{\nu}U_{\mu} = 0 \,. \tag{11.72}$$

This follows from the identity

$$\nabla_{\lambda}\left(U^{\mu}U_{\mu}\right) = \nabla_{\lambda}(-1) = 0 \,, \tag{11.73}$$

and the fact that $\nabla$ obeys the Leibniz rule. We also need to check normal-ness when contracting on the second index of $B$ as well. This also works, because

$$U^{\nu}B_{\mu\nu} = U^{\nu}\nabla_{\nu}U_{\mu} = 0 \,, \tag{11.74}$$

where we used the fact that $U$ is the tangent vector to the geodesic.

Since $B$ is a rank (0,2) tensor, we can decompose it into three parts: (a) the antisymmetric part, (b) the symmetric traceless part, and (c) the trace part. Note that in taking these traces, we use the tensor $P$ rather than the Kronecker delta. The resulting three parts are traditionally written as

$$B_{\mu\nu} = \frac{1}{3}\theta P_{\mu\nu} + \sigma_{\mu\nu} + \omega_{\mu\nu} \,, \tag{11.75}$$

where the expansion of the geodesic congruence is defined as

$$\theta = P^{\mu\nu}B_{\mu\nu} = \nabla_{\mu}U^{\mu} \,. \tag{11.76}$$

For the rotation of the geodesic congruence, we have

$$\omega_{\mu\nu} = B_{[\mu\nu]} = \nabla_{[\nu}U_{\mu]} \,, \tag{11.77}$$

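and for the shear of the geodesic congruence we have

\begin{align}
\sigma_{\mu\nu} &= B_{(\mu\nu)} - \frac{1}{3}\theta P_{\mu\nu} \tag{11.78} \\
&= \nabla_{(\mu}U_{\nu)} - \frac{1}{3}\left(\nabla_{\rho}U^{\rho}\right)\left(g_{\mu\nu} + U_{\mu}U_{\nu}\right) . \tag{11.79}
\end{align}

We now want to derive an equation for the directional covariant derivative

$$\frac{D}{D\lambda} = U^{\sigma}\nabla_{\sigma} \tag{11.80}$$

acting on the tensor $B$. In components,

\begin{align}
\frac{D}{D\lambda}B_{\mu\nu} &= U^{\sigma}\nabla_{\sigma}\left(\nabla_{\nu}U_{\mu}\right) \tag{11.81} \\
&= U^{\sigma}\left(\nabla_{\nu}\nabla_{\sigma}U_{\mu} - R^{\lambda}{}_{\mu\nu\sigma}U_{\lambda}\right) \tag{11.82} \\
&= \nabla_{\nu}\left(U^{\sigma}\nabla_{\sigma}U_{\mu}\right) - (\nabla_{\nu}U^{\sigma})(\nabla_{\sigma}U_{\mu}) + R^{\lambda}{}_{\mu\sigma\nu}U^{\sigma}U_{\lambda} \tag{11.83} \\
&= -(\nabla_{\nu}U^{\sigma})(\nabla_{\sigma}U_{\mu}) + R^{\lambda}{}_{\mu\sigma\nu}U^{\sigma}U_{\lambda} \tag{11.84} \\
&= -B^{\sigma}{}_{\nu}B_{\mu\sigma} + R^{\lambda}{}_{\mu\sigma\nu}U^{\sigma}U_{\lambda} \tag{11.85}
\end{align}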

where in the second to last line we used the geodesic equation. Our next step is to split up this equation into its antisymmetric, symmetric traceless, and trace parts. The resulting algebra is straightforward, tedious, and not especially illuminating, so we suppress the details here. The result yields three nonlinear coupled equations. For the expansion, the evolution equation is known as the Raychaudhuri equation,

$$\frac{D}{D\lambda}\theta = -\frac{1}{3}\theta^2 - \sigma_{\mu\nu}\sigma^{\mu\nu} + \omega_{\mu\nu}\omega^{\mu\nu} + R_{\mu\nu}U^{\mu}U^{\nu} \,. \tag{11.86}$$

Notice that the first two terms on the RHS for the directional covariant derivative of the expansion are negative semidefinite, because they are the negative of a sum of squares. By contrast, the third term involving the rotation is positive semidefinite. Incidentally: by the Frobenius theorem (see Carroll Appendices C, D), if the congruence is hypersurface-orthogonal then the rotation is zero. Accordingly, if an inequality can be put on the contraction of the Ricci tensor with two 4-velocities, then we could prove something about the expansion of the geodesic congruence. Physically, this would correspond to figuring out how our congruence of geodesics gets focused (or not) by the spacetime curvature. We will comment more on this general issue later, near the end of the course, once we have introduced the energy-momentum tensor and mentioned the famous Hawking-Penrose singularity theorems. As yet, we do not have sufficient ingredients to finish the job, because we have not yet derived the Einstein equations connecting the Ricci tensor to the energy-momentum tensor of whatever matter is hanging out in our spacetime.
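To see the focusing effect in the simplest setting, consider the following illustrative special case, which is an assumption made for this example rather than a general result: vanishing shear and rotation, and $R_{\mu\nu}U^{\mu}U^{\nu} = 0$. The Raychaudhuri equation then reduces to an ordinary differential equation that can be solved in closed form, with $\theta_0 \equiv \theta(0)$:

```latex
\begin{align}
\frac{d\theta}{d\lambda} = -\frac{1}{3}\theta^{2}
\quad\Longrightarrow\quad
\frac{d}{d\lambda}\left(\frac{1}{\theta}\right) = \frac{1}{3}
\quad\Longrightarrow\quad
\frac{1}{\theta(\lambda)} = \frac{1}{\theta_{0}} + \frac{\lambda}{3} \,.
\end{align}
```

If the congruence is initially converging, $\theta_0 < 0$, then $\theta \to -\infty$ at the finite parameter value $\lambda = 3/|\theta_0|$.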

One other thing that is clear from the evolution equation for the expansion is that the evolution of the rotation and shear also generally need to be understood – in their own right, and also because they enter nonlinearly into $D\theta/D\lambda$. After some straightforward but tedious algebra, you will find that the rotation obeys

$$\frac{D}{D\lambda}\omega_{\mu\nu} = -\frac{2}{3}\theta\,\omega_{\mu\nu} + \sigma_{\mu}{}^{\alpha}\omega_{\nu\alpha} - \sigma_{\nu}{}^{\alpha}\omega_{\mu\alpha} \,. \tag{11.87}$$

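The shear evolution equation is the most complicated of the three. Chugging through the algebra yields

\begin{align}
\frac{D}{D\lambda}\sigma_{\mu\nu} &= -\frac{2}{3}\theta\,\sigma_{\mu\nu} - \sigma_{\mu\alpha}\sigma^{\alpha}{}_{\nu} - \omega_{\mu\alpha}\omega^{\alpha}{}_{\nu} \notag \\
&\quad + \frac{1}{3}P_{\mu\nu}\left(\sigma_{\alpha\beta}\sigma^{\alpha\beta} - \omega_{\alpha\beta}\omega^{\alpha\beta}\right) + C_{\alpha\nu\mu\beta}U^{\alpha}U^{\beta} - \frac{1}{2}\tilde{R}_{\mu\nu} \,, \tag{11.88}
\end{align}

where the spatially projected trace-free part of the Ricci tensor is

$$\tilde{R}_{\mu\nu} = P^{\alpha}{}_{\mu}P^{\beta}{}_{\nu}R_{\alpha\beta} - \frac{1}{3}P_{\mu\nu}P^{\alpha\beta}R_{\alpha\beta} \,. \tag{11.89}$$

Reminder: all the discussion up until now concerned timelike geodesics. If instead we wanted to study a null geodesic congruence, we would need a somewhat different analysis.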

11.6 Null geodesic congruences*

Suppose that instead of a timelike geodesic congruence we wanted to follow a null geodesic congruence. If we want to derive a similar looking set of equations for our null geodesics, there are a few wrinkles compared to the timelike case. First, note that the tangent vector of a null curve

$$k^{\mu} = \frac{dx^{\mu}}{d\lambda} \tag{11.90}$$

is normal to itself: $k_{\mu}k^{\mu} = 0$ (by the relativistic mass shell relation$^{12}$). The closest analogue we can look for in the case of a null geodesic congruence is to seek the evolution of vectors in a two-dimensional (not 3D) subspace of “spatial” vectors which are normal to the null tangent vector field $k^{\mu}$. Unfortunately, there is no way to define this subspace uniquely, as observers in different Lorentz frames typically have different notions of what constitutes a spatial vector. As Carroll motivates in Appendix F, the low-tech way of proceeding is to define a second auxiliary vector $\ell^{\mu}$, which (in some frame) points in the opposite spatial direction to $k^{\mu}$, normalized such that

$$\ell_{\mu}\ell^{\mu} = 0 \,, \qquad \ell_{\mu}k^{\mu} = -1 \,. \tag{11.91}$$

Demanding that $\ell^{\mu}$ be parallel transported gives

$$k^{\mu}\nabla_{\mu}\ell^{\nu} = 0 \,. \tag{11.92}$$

Then the transverse space $T_{\perp}$ is defined as the set of vectors $V^{\mu}$ such that $k_{\mu}V^{\mu} = 0$ and $\ell_{\mu}V^{\mu} = 0$.

The projection tensor that works this time is

$$Q_{\mu\nu} = g_{\mu\nu} + 2k_{(\mu}\ell_{\nu)} = g_{\mu\nu} + k_{\mu}\ell_{\nu} + \ell_{\mu}k_{\nu} \,. \tag{11.93}$$

$^{12}$ Recall that setting $p^{\mu} = \hbar k^{\mu} = dx^{\mu}/d\lambda$ is a different convention than we used for massive particles, where we had $p^{\mu} = mU^{\mu}$. Carroll makes the same choices.

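This $Q$ obeys several equations when acting on a vector in the transverse subspace, $V^{\mu} \in T_{\perp}$.

\begin{align}
Q^{\mu}{}_{\nu}Q^{\nu}{}_{\sigma} &= Q^{\mu}{}_{\sigma} && \text{(projector)} \,, \tag{11.94} \\
k^{\sigma}\nabla_{\sigma}Q^{\mu}{}_{\nu} &= 0 && \text{(parallel transported)} \,, \tag{11.95} \\
Q_{\mu\nu}k^{\nu} &= 0 && \text{(perpendicular)} \,, \tag{11.96} \\
Q_{\mu\nu}\ell^{\nu} &= 0 && \text{(perpendicular)} \,, \tag{11.97} \\
Q^{\mu}{}_{\nu}V^{\nu} &= V^{\mu} && \text{(yes, it projects)} \,, \tag{11.98} \\
Q_{\mu\nu}V^{\mu}W^{\nu} &= g_{\mu\nu}V^{\mu}W^{\nu} && \text{($Q$ is the inner product in $T_{\perp}$)} \,. \tag{11.99}
\end{align}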
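A concrete flat-space example makes these properties easy to verify. The sketch below assumes mostly-plus Minkowski space in $D = 4$, takes $Q_{\mu\nu} = g_{\mu\nu} + k_{\mu}\ell_{\nu} + \ell_{\mu}k_{\nu}$, and picks $k$ and $\ell$ as null vectors pointing in opposite spatial directions along $x$; the specific vectors and variable names are chosen purely for illustration.

```python
import math

# Mostly-plus Minkowski metric in D = 4 (numerically equal to its inverse).
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

def lower(v):
    """Lower the index of a vector with ETA."""
    return [sum(ETA[m][n] * v[n] for n in range(4)) for m in range(4)]

def dot(u, v):
    """Inner product u_mu v^mu."""
    return sum(a * b for a, b in zip(lower(u), v))

s = 1.0 / math.sqrt(2.0)
k = [s, s, 0.0, 0.0]      # null, moving in +x
ell = [s, -s, 0.0, 0.0]   # null, moving in -x, with ell . k = -1
k_dn, ell_dn = lower(k), lower(ell)

# Q_{mu nu} = eta_{mu nu} + k_mu ell_nu + ell_mu k_nu
Q = [[ETA[m][n] + k_dn[m] * ell_dn[n] + ell_dn[m] * k_dn[n] for n in range(4)]
     for m in range(4)]

Q_k = [sum(Q[m][n] * k[n] for n in range(4)) for m in range(4)]
Q_ell = [sum(Q[m][n] * ell[n] for n in range(4)) for m in range(4)]
trace_Q = sum(ETA[m][n] * Q[m][n] for m in range(4) for n in range(4))
```

In this example $Q_{\mu\nu}$ comes out as $\mathrm{diag}(0, 0, 1, 1)$: it annihilates both $k$ and $\ell$ and has trace 2, the dimension of the transverse space.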

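Now we can define our analogue of the $B$-tensor from the case of timelike geodesic congruences. We write

$$\hat{B}^{\mu}{}_{\nu} = Q^{\mu}{}_{\alpha}Q^{\beta}{}_{\nu}B^{\alpha}{}_{\beta} \,, \qquad \text{where } B^{\alpha}{}_{\beta} = \nabla_{\beta}k^{\alpha} \,. \tag{11.100}$$

Then

\begin{align}
\frac{D}{D\lambda}V^{\mu} &= k^{\nu}\nabla_{\nu}V^{\mu} \,, \tag{11.101} \\
&= k^{\nu}\nabla_{\nu}\left(Q^{\mu}{}_{\rho}V^{\rho}\right) , \tag{11.102} \\
&= Q^{\mu}{}_{\rho}\,k^{\nu}\nabla_{\nu}\left(V^{\rho}\right) , \tag{11.103} \\
&= Q^{\mu}{}_{\rho}B^{\rho}{}_{\nu}V^{\nu} \,, \tag{11.104} \\
&= Q^{\mu}{}_{\rho}B^{\rho}{}_{\nu}Q^{\nu}{}_{\sigma}V^{\sigma} \,, \tag{11.105} \\
&= \hat{B}^{\mu}{}_{\sigma}V^{\sigma} \,, \tag{11.106}
\end{align}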

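where for the second line we used the fact that $Q$ projects onto $T_{\perp}$, for the third line we used the second equation obeyed by $Q$, for the fourth line we used the definition of $B$ in the geodesic deviation equation, for the fifth line we used the fifth equation obeyed by $Q$, and for the sixth we used the definition of $\hat{B}$.

The final step towards writing the deviation equations for a congruence of null geodesics is to break up $\hat{B}$ into its trace part, its trace-free symmetric part, and its antisymmetric part, this time using $Q$ to take traces,

$$\hat{B}_{\mu\nu} = \frac{1}{2}\theta\,Q_{\mu\nu} + \hat{\sigma}_{\mu\nu} + \hat{\omega}_{\mu\nu} \,. \tag{11.107}$$

This yields, through further tedious algebra,

\begin{align}
\frac{D}{D\lambda}\theta &= -\frac{1}{2}\theta^2 - \hat{\sigma}_{\mu\nu}\hat{\sigma}^{\mu\nu} + \hat{\omega}_{\mu\nu}\hat{\omega}^{\mu\nu} + R_{\mu\nu}k^{\mu}k^{\nu} \,, \tag{11.108} \\
\frac{D}{D\lambda}\hat{\omega}_{\mu\nu} &= -\theta\,\hat{\omega}_{\mu\nu} \,, \tag{11.109} \\
\frac{D}{D\lambda}\hat{\sigma}_{\mu\nu} &= -\theta\,\hat{\sigma}_{\mu\nu} - Q^{\alpha}{}_{\mu}Q^{\beta}{}_{\nu}C_{\alpha\lambda\beta\sigma}k^{\lambda}k^{\sigma} \,. \tag{11.110}
\end{align}

Note that the physics of these equations does not depend at all on $\ell^{\mu}$, our auxiliary vector. Whew! The $C_{\alpha\lambda\beta\sigma}$ in the above equation is not a typo. It is known as the Weyl tensor and is the trace-free piece of Riemann.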

Now, if we wanted to constrain the evolution of the expansion of our null geodesic congruence, we would again choose hypersurface-orthogonality to remove the rotation term. Then, by the first of three equations above, it would remain only to constrain $R_{\mu\nu}k^{\mu}k^{\nu}$, where $k^{\mu}$ is null, to say something about the expansion evolution, which we will be able to do after we have derived the Einstein equation. Note how this story is qualitatively different than the analogue for timelike geodesic congruence, which involved constraining $R_{\mu\nu}U^{\mu}U^{\nu}$, where $U^{\mu}$ (the 4-velocity) is timelike.

11.7 Conformal transformations and Carter-Penrose diagrams*

Conformal transformations involve a conformal factor $\omega(x)$, such that the transformed metric $\tilde{g}_{\mu\nu}$ is related to the original by

$$\tilde{g}_{\mu\nu} = \omega^2(x)\, g_{\mu\nu} . \tag{11.111}$$

Null geodesics have $ds^2 = 0$, so they are left invariant under conformal transformations. (If you multiply zero by a function of spacetime, you still get zero.) But even though null geodesics are untouched, pretty much everything else changes.

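Let us write down some formulæ for the Christoffels and Riemann, which are obviously not invariant under conformal transformations. We find for the Christoffels

$$\tilde{\Gamma}^\rho{}_{\mu\nu} = \Gamma^\rho{}_{\mu\nu} + \omega^{-1}\left( \delta^\rho_\mu \nabla_\nu \omega + \delta^\rho_\nu \nabla_\mu \omega - g_{\mu\nu} g^{\rho\lambda} \nabla_\lambda \omega \right) . \tag{11.112}$$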

Note that the difference between the transformed Christoffel and the original one is a tensor,as it should be. For the Riemann tensor, we find

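$$\begin{aligned}
\tilde{R}^\rho{}_{\sigma\mu\nu} = R^\rho{}_{\sigma\mu\nu} &- 2\left( \delta^\rho_{[\mu}\delta^\alpha_{\nu]}\delta^\beta_\sigma - g_{\sigma[\mu}\delta^\alpha_{\nu]} g^{\rho\beta} \right) \omega^{-1} (\nabla_\alpha \nabla_\beta\, \omega) \\
&+ 2\left( 2\delta^\rho_{[\mu}\delta^\alpha_{\nu]}\delta^\beta_\sigma - 2 g_{\sigma[\mu}\delta^\alpha_{\nu]} g^{\rho\beta} + g_{\sigma[\mu}\delta^\rho_{\nu]} g^{\alpha\beta} \right) \omega^{-2} (\nabla_\alpha\, \omega)(\nabla_\beta\, \omega) .
\end{aligned} \tag{11.113}$$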

Tracing this equation for the Riemann tensor gives a relationship for the Ricci tensor,

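$$\begin{aligned}
\tilde{R}_{\sigma\nu} = R_{\sigma\nu} &- \left[ (D-2)\,\delta^\alpha_\sigma \delta^\beta_\nu + g_{\sigma\nu} g^{\alpha\beta} \right] \omega^{-1} (\nabla_\alpha \nabla_\beta\, \omega) \\
&+ \left[ 2(D-2)\,\delta^\alpha_\sigma \delta^\beta_\nu - (D-3)\, g_{\sigma\nu} g^{\alpha\beta} \right] \omega^{-2} (\nabla_\alpha\, \omega)(\nabla_\beta\, \omega) ,
\end{aligned} \tag{11.114}$$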

while tracing further for the Ricci scalar gives

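$$\begin{aligned}
\tilde{R} = \omega^{-2} R &- 2(D-1)\, g^{\alpha\beta} \omega^{-3} (\nabla_\alpha \nabla_\beta\, \omega) \\
&- (D-1)(D-4)\, g^{\alpha\beta} \omega^{-4} (\nabla_\alpha\, \omega)(\nabla_\beta\, \omega) .
\end{aligned} \tag{11.115}$$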

You can find a few more formulæ of interest for conformal transformations in Appendix G of Carroll.

Another arena in which conformal transformations play a role in GR is in constructing conformal diagrams, which are also known as (Carter-)Penrose diagrams. This cool tool is basically a way of representing the causal structure of spacetime on a finite amount of paper. The cardinal rule is that light cones always go at 45 degrees, in the right coordinates. This makes it easy to visualize the behaviour of light rays, and thereby figure out aspects of causality. (Is A in the past light cone of B, able to murder them, or not?)


Of course, in order to actually put a generally infinitely large spacetime on a finite piece of paper, we need to work quite hard to find an appropriate coordinate system. Carroll Appendix H discusses how to do this for Minkowski spacetime in detail. We will just sketch the outline here to give you a taste of the recipe. Later on, when you are more confident with spacetimes, you can attempt the big one by yourself at your leisure: constructing the conformal diagram for a black hole spacetime. (This is quite involved, so do not be downcast if it feels impenetrable right now. Most GR textbooks do not delve into this amount of detail.)

Begin with Minkowski spacetime in spherical polar coordinates (like in HW1), and define null coordinates u and v by

u = t− r , v = t+ r . (11.116)

Then bring (semi-)infinite ranges of familiar coordinates into a finite interval by using the arctan function,

U = arctanu , V = arctan v . (11.117)

Define a new time T and radius R in the arctan variables

T = V + U , R = V − U . (11.118)

Then, as you can check for yourself (by hand or using computer algebra), the metric is transformed into

$$ds^2 = \frac{1}{(\cos T + \cos R)^2} \left[ -dT^2 + dR^2 + \sin^2 R\, d\Omega^2 \right] . \tag{11.119}$$

The part in square brackets is the metric of the Einstein static universe solution. For a general spacetime, if we include the boundary, known as conformal infinity, then we get the conformal compactification, which is a manifold with boundary. In the case of Minkowski spacetime, the resulting diagram is given in Figure H.4 of Carroll.

Radial null geodesics are at $\pm 45^\circ$ on the conformal diagram. All timelike geodesics begin at a place known as $i^-$, past timelike infinity, and they end at $i^+$, future timelike infinity.


Future null infinity is known as $\mathscr{I}^+$, and all null geodesics end on it. Past null infinity is $\mathscr{I}^-$, and all null geodesics originate on it. Spacelike geodesics begin and end at $i^0$, spacelike infinity. Note that these rules only apply to geodesics; other paths typically look gnarlier.
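The coordinate recipe (11.116)-(11.118) is easy to try numerically. The following is a minimal sketch, not part of the original notes: the function names `compactify` and `conformal_factor` are hypothetical, and the locations quoted in the comments for $i^+$ and $i^0$ are the ones that follow from the arctan map for Minkowski spacetime.

```python
import math


def compactify(t, r):
    """Map Minkowski (t, r), with r >= 0, to the compact coordinates (T, R)."""
    u, v = t - r, t + r                # null coordinates, as in (11.116)
    U, V = math.atan(u), math.atan(v)  # finite ranges, as in (11.117)
    return V + U, V - U                # (T, R), as in (11.118)


def conformal_factor(T, R):
    """omega = cos T + cos R, so that ds^2 = omega^(-2) [...] as in (11.119)."""
    return math.cos(T) + math.cos(R)


# The original radius can be recovered as r = sin R / (cos T + cos R).
T, R = compactify(0.3, 1.7)
r_back = math.sin(R) / conformal_factor(T, R)

# Late times at fixed r approach i+ at (T, R) = (pi, 0);
# large radius at fixed t approaches i0 at (T, R) = (0, pi).
T_future, R_future = compactify(1e9, 1.0)
T_far, R_far = compactify(0.0, 1e9)

# Two events on the same outgoing radial null ray (t - r = 0.5) share the
# same value of T - R = 2U: the ray is a 45-degree line in the (T, R) plane.
Ta, Ra = compactify(1.5, 1.0)
Tb, Rb = compactify(10.5, 10.0)
```

The whole of Minkowski spacetime lands in the finite triangle $0 \le R < \pi$, $|T| + R < \pi$, which is what allows it to be drawn on a finite piece of paper.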

Conformal diagrams do not teach us an awful lot about Minkowski spacetime that we did not already know, except that it made us think veeeery carefully about ranges of coordinates and what the boundary looks like in detail. Where conformal diagrams really come into their own is when we are discussing more complicated spacetimes, such as cosmologies or black holes. We do not have time to develop this further here.

Anti de Sitter spacetime has a Penrose diagram as follows, taken from Matthias Blau’s lecture notes on GR:-

de Sitter spacetime has the following Penrose diagram:-


Note how different this is from flat Minkowski spacetime or negatively curved Anti de Sitter. Future infinity $\mathscr{I}^+$ is a spacelike surface, not null or timelike. de Sitter physics is even more weird than that, because on $\mathscr{I}^+$ you cannot separate particles sufficiently well from each other in order to define asymptotic states and an S-matrix.

We can also find the Penrose diagram for Schwarzschild. I do not recommend that you tackle this unless you feel 100% on top of the material: it is pretty technical and not usually covered in an average modern GR textbook. Suffice it to say that some arctan functions are involved. We just quote the result here for your satisfaction. The wavy lines represent singularities, while the diagonals in the interior of the spacetime represent horizons. At all points there is a transverse $S^2$ suppressed.


11.8 Reissner-Nordstrom black holes*

The Reissner-Nordstrom solution is obtained when we assume staticity and spherical symmetry, and allow an energy-momentum tensor coming from the electromagnetic field. Since there are no known magnetic monopoles that could source a magnetic field, we will stick with an electric field (see footnote 13). Then the only nonzero component of $F^{\mu\nu}$ is $F^{tr}$. Let us assume the same metric ansatz as we had for Schwarzschild,

$$ds^2 = e^{2\alpha(r)} dt^2 - e^{2\beta(r)} dr^2 - r^2 d\Omega_2^2 . \tag{11.120}$$

Then our Maxwell equation,

$$\frac{1}{\sqrt{-g}}\, \partial_\mu \left( \sqrt{-g}\, F^{\nu\mu} \right) = 0 , \tag{11.121}$$

will simplify significantly. We have

$$\sqrt{-g} = e^{\alpha+\beta}\, r^2 \sin\theta , \tag{11.122}$$

so that the Maxwell equation implies

$$\partial_r \left( r^2 \sin\theta\, e^{\alpha+\beta} F^{tr} \right) = 0 , \tag{11.123}$$

which we can immediately integrate by eye to

$$F^{tr} = \frac{c_1}{r^2}\, e^{-\alpha-\beta} . \tag{11.124}$$

This may look more familiar in terms of the radial electric field,

$$F^{tr} = g^{tt} g^{rr} F_{tr} = e^{-2\alpha-2\beta} E_r \tag{11.125}$$

so that

$$\vec{E} = \frac{c_1}{r^2}\, e^{\alpha+\beta}\, \hat{r} . \tag{11.126}$$

The constant $c_1$ can be determined by taking the $r \to \infty$ limit and connecting with EM fields for point charges that we know and love.

The next step in solving the Einstein-Maxwell system is to substitute the above electric field into the energy-momentum tensor and apply Einstein’s equations. The details are similar in spirit but longer in practice than what we did before in deriving Schwarzschild, so we will not drag you through the algebra. The nice thing is that, even with the electric field turned on, it turns out that the Einstein equations still furnish the relationship

α = −β + const. , (11.127)

between the time-time component of the metric and the space-space component. Integrating up the $\theta\theta$ Einstein equation like we did for Schwarzschild produces the solution,

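$$ds^2_{RN} = -\left( 1 - \frac{2 G_N M}{c^2 r} + \frac{G_N Q^2}{4\pi\varepsilon_0 c^4 r^2} \right) dt^2 + \left( 1 - \frac{2 G_N M}{c^2 r} + \frac{G_N Q^2}{4\pi\varepsilon_0 c^4 r^2} \right)^{-1} dr^2 + r^2 d\Omega_2^2 , \tag{11.128}$$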

[Footnote 13] If you wanted to do the magnetic case, you would find $F_{\theta\phi} = P \sin\theta$, where $P \propto$ magnetic charge.


where we have temporarily restored physical constants that we would usually set to unity. We can make this look slightly prettier by defining

$$\mu \equiv \frac{G_N M}{c^2} , \quad \text{and} \quad q^2 \equiv \frac{G_N Q^2}{4\pi\varepsilon_0 c^4} ; \tag{11.129}$$

then

$$r_\pm = \mu \pm \sqrt{\mu^2 - q^2} . \tag{11.130}$$

The geometry has two event horizons, an outer horizon and an inner horizon. As you can check by computing the full contraction of the Riemann tensor with itself, the curvature singularity is located at $r = 0$. The “singularities” in the metric at $r = r_\pm$ are just coordinate singularities, like the one we encountered for Schwarzschild.

There are three cases for Reissner-Nordstrom metrics depending on the sign of what is under the square root in the above formula.

1. $\mu^2 < q^2$: This is unphysical. The event horizon walks off into the complex plane and the singularity at the origin is then naked. Oops!

2. $\mu^2 > q^2$: This is physical. It includes the limit of zero charge, which gives back Schwarzschild ($r_+ = 2\mu$, $r_- = 0$). Here there are two horizons, at $r = r_\pm$. The singularity in this case is timelike, as compared to spacelike for Schwarzschild.

3. $\mu^2 = q^2$: This is also physical, and is known as the extremal Reissner-Nordstrom spacetime. You can think of it as having exquisitely balanced gravitational attraction and electric repulsion.
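The three cases can be sketched as a small classifier based on the formula for $r_\pm$ in (11.130). This is only an illustration: the function names `rn_metric_function` and `rn_horizons` and the case labels are hypothetical, and $\mu$ and $q$ are the length-dimension quantities defined in (11.129).

```python
import math


def rn_metric_function(r, mu, q):
    """f(r) = 1 - 2 mu / r + q^2 / r^2, the function multiplying -dt^2."""
    return 1.0 - 2.0 * mu / r + q * q / (r * r)


def rn_horizons(mu, q):
    """Return (case, r_plus, r_minus) using r_pm = mu +/- sqrt(mu^2 - q^2)."""
    disc = mu * mu - q * q
    if disc < 0:
        # No real roots: the singularity at r = 0 is naked.
        return "naked singularity", None, None
    root = math.sqrt(disc)
    # Exact float comparison is fine for a sketch; real code would want a
    # tolerance when deciding whether the black hole is extremal.
    case = "extremal" if disc == 0 else "two horizons"
    return case, mu + root, mu - root


schwarzschild_limit = rn_horizons(1.0, 0.0)  # q = 0 gives r+ = 2 mu, r- = 0
extremal = rn_horizons(1.0, 1.0)             # double horizon at r = mu = |q|
overcharged = rn_horizons(1.0, 2.0)          # mu^2 < q^2
case, r_plus, r_minus = rn_horizons(5.0, 3.0)
```

For $\mu = 5$, $q = 3$ the horizons sit at $r_+ = 9$ and $r_- = 1$, and the metric function vanishes at both.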

The Penrose diagrams for Reissner-Nordstrom black hole spacetimes are available in the Hobson-Efstathiou-Lasenby textbook §12.6, if you wish to peruse them to obtain intuition. Note however one important caveat on the maximal analytic extensions that display an infinite number (!) of asymptotic regions. The inner horizon has the property that probe perturbations coming in from $\mathscr{I}^-$ tend to bunch up there: their magnitude grows out of control. But if the perturbation amplitude were that big, then it would surely backreact on the geometry, from having so much energy-momentum. This would entail changing the solution that we already wrote down. What this teaches us is that the semiclassical perturbation analysis is breaking down. Most likely, the singularity of a real physical charged black hole would become spacelike, covered by only one horizon, not two.

At this point we can make one more advanced comment, concerning the physical realism of charged black hole solutions. Quantum field theory shows you that charged black holes in real astrophysical situations will actually discharge rather quickly, via the Schwinger process, which nucleates charged particle-antiparticle pairs (e.g. electron-positron pairs) in an electric field, about a Compton wavelength apart. To finesse the astrophysicist’s objection, you can imagine that the “electric” charge we are discussing is not carried by light quanta in the theory.

A note about two neat properties of our extremal Reissner-Nordstrom black hole. First, we will be able to see pretty quickly that there are multi black hole solutions in this case.


This spacetime has one double horizon at $r = r_- = |q|$,

$$ds^2_{ERN} = -\left( 1 - \frac{|q|}{r} \right)^2 dt^2 + \left( 1 - \frac{|q|}{r} \right)^{-2} dr^2 + r^2 d\Omega_2^2 . \tag{11.131}$$

We can easily define a shifted radial coordinate

ρ := r − |q| . (11.132)

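Then $d\rho = dr$ and

$$\left( 1 - \frac{|q|}{r} \right) = \left( \frac{r - |q|}{r} \right) = \frac{\rho}{\rho + |q|} = \left( 1 + \frac{|q|}{\rho} \right)^{-1} . \tag{11.133}$$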

Defining

$$H(\rho) = 1 + \frac{|q|}{\rho} , \tag{11.134}$$

we have

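$$\begin{aligned}
ds^2_{ERN} &= -H^{-2} dt^2 + H^2 d\rho^2 + (\rho + |q|)^2 d\Omega_2^2 \\
&= -H^{-2} dt^2 + H^2 \left( d\rho^2 + \rho^2 d\Omega_2^2 \right) .
\end{aligned} \tag{11.135}$$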

This coordinate system is known as isotropic coordinates because the metric in parentheses is the standard Euclidean metric in spherical polar coordinates. You also find that

$$\sqrt{G_N}\, A_t = H^{-1} - 1 . \tag{11.136}$$
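The algebra behind (11.133) and (11.135) can be spot-checked numerically. The sketch below, with hypothetical names and arbitrarily chosen sample values, confirms that $1 - |q|/r = H^{-1}$ and that $H^2 \rho^2 = (\rho + |q|)^2 = r^2$, which is what lets the angular part be absorbed into the flat metric in parentheses.

```python
def H(rho, q):
    """Harmonic function of (11.134): H = 1 + |q| / rho."""
    return 1.0 + abs(q) / rho


q, r = 1.5, 4.0          # arbitrary sample values with r > |q|
rho = r - abs(q)         # shifted radial coordinate, (11.132)

lhs = 1.0 - abs(q) / r   # metric function in the original r coordinate
rhs = 1.0 / H(rho, q)    # the same quantity written in terms of H

# Coefficient of dOmega^2 in the two forms of (11.135).
angular_original = (rho + abs(q)) ** 2
angular_isotropic = H(rho, q) ** 2 * rho ** 2
```

Note that the horizon $r = |q|$ sits at $\rho = 0$ in the shifted coordinate.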

If we substituted this ansatz for the gauge potential and the metric into Maxwell’s equations and the Einstein equations, we would find that they require only one equation between them,

$$\vec{\nabla}^2 H = 0 , \tag{11.137}$$

i.e., that H is a harmonic function. The solution we had above for $H(\rho)$ was one solution. It is actually possible to have multi black hole solutions of this system, because of the exact cancellation between gravitational attraction and electric repulsion between any two of the black hole centres! In equations, the superposition works as

$$H = 1 + \sum_{a=1}^{N} \frac{G M_a}{|\vec{x} - \vec{x}_a|} . \tag{11.138}$$
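That the multi-centre function is harmonic away from the centres can be checked with a finite-difference Laplacian. This is a rough numerical sketch rather than a proof: the centre positions, the values of $G M_a$, the evaluation point, and the function names are all made up for illustration.

```python
import math


def H_multi(x, centres):
    """H = 1 + sum_a G M_a / |x - x_a|; centres is a list of (G*M_a, x_a)."""
    total = 1.0
    for gm, centre in centres:
        total += gm / math.dist(x, centre)
    return total


def laplacian(f, x, h=1e-3):
    """Second-order central finite-difference Laplacian of f at the point x."""
    fx = f(x)
    acc = 0.0
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        acc += f(xp) + f(xm) - 2.0 * fx
    return acc / (h * h)


centres = [(1.0, (1.0, 0.0, 0.0)),
           (0.5, (-1.0, 0.0, 0.0)),
           (2.0, (0.0, 2.0, 1.0))]
point = (3.0, 2.0, 1.0)  # chosen well away from every centre

lap_H = laplacian(lambda x: H_multi(x, centres), point)

# For contrast, 1/|x|^2 is not harmonic in three dimensions:
# its Laplacian is 2/|x|^4.
origin = (0.0, 0.0, 0.0)
lap_other = laplacian(lambda x: 1.0 / math.dist(x, origin) ** 2, point)
```

Here `lap_H` vanishes up to discretization and rounding error, while `lap_other` does not. Linearity of the Laplace equation is what allows the individual $1/|\vec{x} - \vec{x}_a|$ terms to be superposed.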

Another interesting feature of the Reissner-Nordstrom spacetime is what happens when you take the near-horizon limit. This in effect removes the 1 from the harmonic function. If you look carefully at the single-centred black hole metric in this limit, you will find that it produces $AdS_2 \times S^2$, two-dimensional Anti de Sitter spacetime times a two-sphere. This fact is related to the famous AdS/CFT correspondence of string theory.


11.9 Black hole thermodynamics*

In general, finding new solutions of the field equations for Einstein gravity coupled to matter is very difficult, because the equations of motion are nonlinear. The search is aided by classical no-hair theorems, which say that once conserved charges of a system are determined, the spacetime geometry is unique. The essential physics behind this is threefold:-

• Specify the Lagrangian, including matter couplings.

• Gravity falloffs give two conserved quantum numbers: $M$, $J$. Gauge field falloffs give conserved charges $Q_i$.

• Any other matter fields coupled minimally have 2nd order PDEs for their fields (“hair”) in the black hole background. Must have solutions well-behaved both at infinity and at the horizon. Not possible to do both and retain physical regularity, so: no hair.

It is important for applicability of no-hair theorems that any black hole singularity be hidden behind an event horizon; theorems fail in spacetimes with naked singularities. We also assumed asymptotic flatness, i.e. $\Lambda = 0$. When $\Lambda < 0$ there can be very interesting violations of no-hair lore developed from intuition with asymptotically flat spacetimes. This story has been richly developed in the context of the AdS/CFT Correspondence of string theory.

Let us inspect the near-horizon region, of Schwarzschild for simplicity. Our key metric function was

$$\left( 1 - \frac{r_S}{r} \right) = \frac{(r - r_S)}{r} \simeq \frac{(r - r_S)}{r_S} \tag{11.139}$$

and so it is straightforward to change to proper distance $\eta$ defined by $g_{\eta\eta} = 1$:

$$d\eta = dr \sqrt{\frac{r_S}{(r - r_S)}} \tag{11.140}$$

This gives

$$\eta \simeq 2\sqrt{r_S (r - r_S)} . \tag{11.141}$$

Notice that our analysis would also work for the outer horizon of a charged static black hole. It relies on having a single pole in $g_{rr}$, so it would not work for extremal black holes.
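To see how good the near-horizon approximation (11.141) is, one can compare it with the exact proper distance from the Schwarzschild horizon, $\int_{r_S}^{r} dr'/\sqrt{1 - r_S/r'}$. The closed form used below is a standard integral and is not quoted in the notes; the function names are hypothetical.

```python
import math


def proper_distance_exact(r, r_s):
    """Integral of dr / sqrt(1 - r_s / r) from r_s to r, in closed form."""
    return (math.sqrt(r * (r - r_s))
            + r_s * math.acosh(math.sqrt(r / r_s)))


def proper_distance_near(r, r_s):
    """Near-horizon approximation (11.141): eta ~ 2 sqrt(r_s (r - r_s))."""
    return 2.0 * math.sqrt(r_s * (r - r_s))


r_s = 1.0

# Very close to the horizon the two expressions agree well...
near_exact = proper_distance_exact(1.001 * r_s, r_s)
near_approx = proper_distance_near(1.001 * r_s, r_s)
relative_error_near = abs(near_exact - near_approx) / near_exact

# ...but at r = 2 r_s the approximation has visibly degraded.
far_exact = proper_distance_exact(2.0 * r_s, r_s)
far_approx = proper_distance_near(2.0 * r_s, r_s)
relative_error_far = abs(far_exact - far_approx) / far_exact
```

The comparison makes concrete the sense in which the Rindler form of the metric only describes the region $r - r_S \ll r_S$.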

In this η coordinate, we can see by inspection that

$$g_{tt} \simeq -\eta^2 / (4 r_S^2) , \tag{11.142}$$

and so we can easily rescale time according to

$$\omega = \frac{t}{2 r_S} \tag{11.143}$$

to put our black hole metric in the very simple form

$$ds^2 \simeq -\eta^2 d\omega^2 + d\eta^2 + r_S^2\, d\Omega_2^2 . \tag{11.144}$$


The $(\eta, \omega)$ piece is in fact Rindler space. What we have found is that the near-horizon region of a Schwarzschild black hole is approximately two-dimensional Rindler space times a constant two-sphere. This approximation becomes exact in the limit $M \to \infty$.

In QFT applications, the Euclidean Feynman path integral can be formally identified with the statistical mechanical partition function. Periodicity in Euclidean time is identified as inverse temperature. We shall not have time to explore this fascinating connection at all, but we will just use the fact as a simple method of finding the Hawking temperature of our black hole. So let us Wick rotate our Schwarzschild black hole. We see that we avoid a conical singularity if we identify the Euclidean time $i\omega$ with period $2\pi$. This gives the picture of the Euclidean BH as a cigar (see footnote 14) geometry in $(\eta, \omega)$.

[Figure: the Euclidean black hole as a cigar geometry, with axis labels $r$ and $r_H$.]

(Caveat: Wick rotation is unlikely to be a well-defined operation in quantum gravity in general. E.g. some Lorentzian spacetimes have no Euclidean counterpart. Also, smooth Euclidean spaces can turn into singular Lorentzian ones upon Wick rotation.)

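Since $\omega = t/(2 r_H)$, a period of $2\pi$ in Euclidean $\omega$ corresponds to a period of $4\pi r_H$ in Euclidean $t$. Translating back to our asymptotically flat coordinate system gives

$$k_B T_H = \frac{\hbar c^3}{8\pi G_4 M} \tag{11.145}$$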

This is the Hawking temperature of a Schwarzschild black hole. The result can easily be replicated by a multitude of other methods. $T_H$ is the physical temperature felt by an observer at infinity. The Hawking temperature blueshifts as you approach the horizon; Hawking radiated particles have temperature $T_P$ at proper distance $\ell_P$ from the horizon. Some additional calculable physics: the black hole radiates with a thermal spectrum; gravitational backscattering on the way out from the horizon causes wavelength-dependent filtering, and gives rise to what are known as greybody factors.
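To get a feel for the size of (11.145), here is a short numerical sketch in SI units. The physical constants are standard values; the rounded solar mass and the function name `hawking_temperature` are choices made for this illustration.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
C = 2.99792458e8        # speed of light, m / s
G = 6.67430e-11         # Newton constant, m^3 kg^-1 s^-2
K_B = 1.380649e-23      # Boltzmann constant, J / K
M_SUN = 1.989e30        # solar mass in kg (rounded)


def hawking_temperature(mass_kg):
    """T_H = hbar c^3 / (8 pi G M k_B), in kelvin, as in (11.145)."""
    return HBAR * C ** 3 / (8.0 * math.pi * G * mass_kg * K_B)


t_sun = hawking_temperature(M_SUN)           # roughly 6e-8 K
t_double = hawking_temperature(2.0 * M_SUN)  # half of t_sun, since T ~ 1/M
```

A solar-mass black hole comes out far colder than the cosmic microwave background, so astrophysical black holes today absorb more radiation than they emit.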

Notice an interesting fact about these black holes. Since $T_H \sim 1/M$, $T_H$ increases as $M$ decreases. As a result, the specific heat is negative. The physical consequence of this is runaway evaporation of the black hole at low mass. The exact nature of the final state is still very much an open research question, although a lot of progress has been made in the last 25 years since I started my PhD. Can we calculate the black hole lifetime? We can get an order of magnitude very easily. Since the black hole radiates with a thermal spectrum like a blackbody, its luminosity is (in units where $\hbar = c = 1$)

$$-\frac{dM}{dt} \sim (\text{Area})\, T_H^4 \sim (G_4 M)^{2-4=-2} \quad \Rightarrow \quad \Delta t \sim G_4^2 M^3 \tag{11.146}$$
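The scaling $\Delta t \sim G_4^2 M^3$ can be turned into a rough number by restoring $\hbar$ and $c$ on dimensional grounds, giving $G^2 M^3 / (\hbar c^4)$. The sketch below evaluates only this dimensional combination, with the numerical prefactor deliberately omitted, so it is an underestimate-prone order-of-magnitude guide rather than a lifetime formula. The function name and the rounded solar mass are assumptions for illustration.

```python
HBAR = 1.054571817e-34  # reduced Planck constant, J s
C = 2.99792458e8        # speed of light, m / s
G = 6.67430e-11         # Newton constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30        # solar mass in kg (rounded)


def evaporation_timescale(mass_kg):
    """Dimensional estimate G^2 M^3 / (hbar c^4), in seconds.

    The dimensionless prefactor from the detailed emission calculation
    is not included here.
    """
    return G ** 2 * mass_kg ** 3 / (HBAR * C ** 4)


scale_sun = evaporation_timescale(M_SUN)
scale_double = evaporation_timescale(2.0 * M_SUN)
ratio = scale_double / scale_sun  # M^3 scaling: doubling M gives a factor 8
```

Even without the prefactor, the solar-mass value exceeds $10^{70}$ seconds, vastly longer than the age of the universe.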

[Footnote 14] Sometimes a cigar is just a cigar. Here it is just a Euclidean black hole.


No-hair theorems indicate that we know very little about a BH by looking from outside. Only quantum numbers conserved because of a gauge symmetry survive. This suggests that a black hole will possess a degeneracy of states, and hence an entropy, as a function of its conserved quantum numbers:

S(M,J,Q) (11.147)

In the late 1960s and early 1970s, laws of classical black hole mechanics were discovered bearing a striking resemblance to the laws of thermodynamics.

• Zeroth law: surface gravity $\kappa$ is constant over the horizon of a stationary black hole.

• First law:

$$dM = \frac{\kappa}{2\pi} \frac{dA}{4 G_N} + \Omega_H\, dJ + \Phi_e\, dQ \tag{11.148}$$

where $\Omega_H$ is the angular velocity at the horizon and $\Phi_e$ the electrostatic potential.

• Second law: horizon area $A$ must be nondecreasing in any classical process. (Singularity theorems imply that horizons do not bifurcate.)

• Third law: it is impossible to achieve $\kappa = 0$ via a physical process such as emission of photons.

After doing many Gedankenexperiments, Bekenstein proposed that the entropy of a black hole should be proportional to the area of its event horizon. Hawking’s semiclassical calculation of black hole temperature

$$T_H = \frac{\hbar \kappa}{2\pi} \tag{11.149}$$

made the entropy-area identification precise by fixing the coefficient. (In the semiclassical approximation, spacetime is treated classically, while matter fields interacting with it are treated quantum mechanically.)

The Bekenstein-Hawking entropy is

$$\frac{S_{BH}}{k_B} = \frac{c^3}{4 \hbar G_N} \times \text{Area(horizon)} . \tag{11.150}$$

This is a universal result for any black hole, applicable to any theory with Einstein gravity as its classical action. The entropy it codifies is enormous: for an Earth-mass black hole, $r_S \sim 1$ cm and $S_{BH}/k_B \sim 10^{66}$!
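The Earth-mass numbers quoted above follow directly from (11.150) together with $r_S = 2 G_N M / c^2$. A minimal sketch in SI units, with hypothetical function names and a rounded Earth mass:

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
C = 2.99792458e8        # speed of light, m / s
G = 6.67430e-11         # Newton constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24      # Earth mass in kg (rounded)


def schwarzschild_radius(mass_kg):
    """r_S = 2 G M / c^2, in metres."""
    return 2.0 * G * mass_kg / C ** 2


def bh_entropy_over_kb(mass_kg):
    """S_BH / k_B = c^3 * Area / (4 hbar G) for a Schwarzschild black hole."""
    area = 4.0 * math.pi * schwarzschild_radius(mass_kg) ** 2
    return C ** 3 * area / (4.0 * HBAR * G)


r_s_earth = schwarzschild_radius(M_EARTH)    # a little under 1 cm
entropy_earth = bh_entropy_over_kb(M_EARTH)  # of order 1e66
```

Because the area scales as $M^2$, doubling the mass quadruples the entropy.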

Classically, a black hole horizon never gets smaller. Hawking radiation results in loss of mass for the black hole, and therefore violates the classical area theorem. Worse, it appears to violate the second law. There is a way out – thermal Hawking radiation contributes to the entropy too. Define a generalized entropy of the black hole plus other stuff such as Hawking radiation,

$$S_{tot} = S_{BH} + S_{other} , \qquad \Delta S_{tot} \geq 0 \tag{11.151}$$

Bekenstein argued that this fixes up the second law, by doing many Gedankenexperiments involving various things falling into black holes.

The subject of entropy bounds has been significantly developed in the past couple of decades. Bousso formulated a more general, covariant, semiclassical entropy bound, using a new ingredient: light-sheets, which are surfaces generated by light rays leaving an area $A$ which have nonpositive expansion everywhere on the sheet. You can think of the Bousso bound as a proposed semiclassical proxy for a more fundamental law.

Kindergarten thermodynamics tells us that if we have an entropy, then there should be a quantum statistical mechanical partition function and hence a degeneracy of states that explains this entropy. For a black hole, how would we even get started on computing this degeneracy of states from first principles? This was first achieved for very special toy-model black holes only in 1996 – in your lifetime!

An intimately related problem to the problem of computing the degeneracy of black hole states is Hawking’s black hole information paradox.

Semiclassical computation says that the spectrum of Hawking radiation is exactly thermal. This computation is apparently remarkably robust – order one changes in Planck scale physics of quantum fields near the horizon result in only exponentially small corrections to the Hawking spectrum. General suspicion of switched-on gravitational theorists in 2015: full quantum gravity may be necessary to solve it.

Loss of information happens because an infalling book and an infalling vacuum cleaner of the same mass give rise to identical Hawking radiation. This is closely connected to no-hair theorems: observers at asymptotic infinity can see only long-range hair, which is very limited.

In quantum field theory, global symmetries are possible. Black holes gobble these up. This may not be a problem in string theory, since there are no global symmetries, only gauge symmetries. If you have a “global symmetry” coming out of string theory, it has to arise as the zero-coupling limit of a gauge symmetry.

String theory is a fully unified theory, including quantum gravity. Therefore, no degrees of freedom (information) should go missing. Information loss must therefore be an artifact of the semiclassical approximation.

As a result, information must be returned via subtle correlations of outgoing Hawking radiation particles. This point of view was espoused early on by Don Page, Gerard ’t Hooft, Leonard Susskind, and collaborators (including me). Information return requires a quantum gravity theory with very subtle nonlocality, which is apparently impossible to see at the semiclassical level.

In the case of the AdS/CFT correspondence, there is a precise equivalence between a quantum gravity setup and a quantum field theory. In this case, it is clear that information must be returned in principle, because quantum field theories are certainly unitary. The difficulty is translated into understanding properties of field theories in the limit of extremely strong coupling, which is itself a very hard problem. Modifications to Hawking’s semiclassical result for black hole radiation are necessary in order to be able to resolve the information problem. This has really fascinating connections to quantum entanglement but I do not have time to discuss that in this course.

If you are interested in microscopic modelling of black holes with string theory, I recommend either my recent Perimeter Institute public lecture (May 2015)

http://ap.io/archives/talks/pi15/pi15.pdf

or, if more technically advanced, try §11 of my graduate string theory course lecture notes

http://ap.io/archives/courses/2014-2020/2406s/notes/all.pdf
