
Introduction to Relativity - Croucher ConsultThe second exception led to the theory of Special...

Date post: 20-Aug-2020
Transcript
Page 1: Introduction to Relativity - Croucher ConsultThe second exception led to the theory of Special Relativity published by Einstein in 1905 which applies correction factors to Newtonian

Introduction to Relativity

Newtonian mechanics provides a good explanation of the motion of objects, but two major exceptions were known at the start of the twentieth century.

The first was that although Newtonian mechanics predicted the orbits of the planets, Mercury was an exception. Its ellipse rotates around the Sun (perihelion precession) by 43 arc seconds per century more than Newtonian mechanics predicted. It is now known that the orbits of Venus and Earth also have relativistic perihelion precessions of about 8 and 4 arc seconds per century respectively.

Secondly theory and experiments showed that the observed speed of light in a vacuum (or the

aether as it was known at that time) has the same value when measured by observers travelling

at different speeds. Newtonian mechanics (or rather Galilean mechanics which came before

Newton) states that if the speed of light is a constant, its observed speed should be equal to its

speed minus the speed of the observer when measured in the same direction as the light beam.

The second exception led to the theory of Special Relativity published by Einstein in 1905 which

applies correction factors to Newtonian mechanics – it is called Special because it only applies to

objects and observers moving with constant velocity, i.e. no acceleration or rotation. Thus it

cannot explain perihelion precession which involves rotation.

This was solved by Einstein’s theory of General Relativity published in 1915 which takes into

account acceleration and replaces Newton’s law of gravitation (which was not a theory since no

mechanism was provided). It is particularly important when objects have large masses (e.g. that of a star or galaxy).

Both theories simplify to Newtonian mechanics under “normal” conditions. The factors of special relativity only become important at speeds greater than one third of that of light unless

particularly high precision is required.

Both theories combine space and time into a single concept called space-time. In special relativity space-time is flat: straight lines are straight, and parallel lines never meet. The space-time of general relativity is curved so the concept of a straight line is much more complicated.

Both theories involve frames of reference and observers. An observer makes measurements

while in a specific frame of reference which is just a coordinate system.

Special relativity adds correction factors for the measured length, time, and mass of objects observed to be moving close to the speed of light $c$.

Measured distances are reduced by a factor of $\sqrt{1 - \frac{V^2}{c^2}}$, while measured durations and mass are increased by a factor of

$$\frac{1}{\sqrt{1 - \frac{V^2}{c^2}}}$$

where $V$ is the difference between the velocity of the object and that of the observer, the factors

being applied to the proper length, proper time and rest mass as measured when the object and

observer are both moving with the same velocity. The proper values are those used in

Newtonian mechanics. This obviously affects more complex properties such as velocity,

momentum and energy. These results also apply to general relativity, but only if the observer is

local to (in the same place as) the object. The strength of the gravitational field also affects the

results – in fact the concept of a gravitational force is replaced by a curved space-time.

This means that the measured values of length, time, and mass are not absolute, but depend on

the relative speeds of the object and observer, and in general relativity also gravity. However

this is not the same as saying that lengths and times decrease or mass increases with speed.

Frames of Reference

A frame of reference is a set of two or three coordinates used to specify a location, for example on a map an east and north value, or for an aircraft east, north and height. In mathematics the co-ordinate axes are usually labelled x, y, and z and a location is specified as $(x, y, z)$.

An alternative notation is a column vector $\begin{bmatrix} x \\ y \\ z \end{bmatrix}$, the disadvantage being it takes up more room.

There is no absolute origin for a frame of reference so the origin is often chosen to simplify the

solution process. In mechanics it is often located at the position of an object at time zero. Time

also is not absolute so this must also be defined, usually in relation to the location. For example an object at time zero ($t_0 = 0$) is at location $(x_0, y_0, z_0) = (0, 0, 0)$. The object moves in a straight line at constant speed (a requirement for special relativity) and at a later time $t_1$ is at $(x_1, y_1, z_1)$. Its speed is therefore

$$\frac{\sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2 + (z_1 - z_0)^2}}{t_1 - t_0} = \frac{\sqrt{x^2 + y^2 + z^2}}{t}.$$

$\sqrt{x^2 + y^2 + z^2}$ is the distance of the object from the origin, and is sometimes represented by $r$.

In addition to there being no absolute origin for a coordinate system, there is no absolute

orientation of the axes, and these can be aligned to simplify the solution. In the above case the x

axis can be aligned with the path of the object (which must be a straight line otherwise there

would be an acceleration) so that the y and z axes are no longer required.

At $t_0$ the object is at $x_0 = 0$ and at $t_1$ the object is at $x_1$. Its speed is therefore $\frac{x_1}{t_1} = V$.

When defining the origin of a frame of reference it must obviously be done in relation to another

frame. However there is no frame of reference that is stationary since there is no location in the

universe that is stationary. Since all frames of reference move it is possible to define a frame of

reference for the above problem where the origin moves with the object (Frame B). In that case

at $t_0$ the object is at $x_0 = 0$ and at $t_1$ the object is at $x_1 = 0$. Its speed measured in B is therefore $\frac{0}{t_1} = 0$. The speed of the object is therefore relative to the frame of reference and is not absolute.

Special relativity only applies if all frames of reference are without acceleration or rotation.

These are called inertial frames of reference. Each observer uses a specific inertial frame of

reference in order to make measurements (Frame A is normally the observer’s frame).

Taking the above example and comparing the observations in two frames of reference, both

with the x axis aligned with the path of the object:

In the first frame A (the observing frame) the object is at $x_{A0} = 0$ at $t_0$ and at $x_{A1}$ at $t_1$.

In the second frame B (the rest frame) the object is at $x_{B0} = 0$ at $t_0$ and at $x_{B1} = 0$ at $t_1$.

The second frame’s origin remains located in the object, and the second frame is moving at a speed of $V = \frac{x_{A1}}{t_1}$ relative to the first frame. This comparison of frames of reference is very

common in special relativity. Many texts use an un-primed value for frame A and a primed value

for frame B e.g. 𝑥 in A and 𝑥′ in B (the primed value refers to a frame at rest with respect to the

object – known as the rest frame).

It is very useful to be able to convert from one frame of reference to another.

In this case the conversion at time $t$ is $x_B = x_A - Vt$ (so $x_{B1} = x_{A1} - Vt_1$ at time $t_1$), where $V$ is the velocity of B with respect to A. This is called a coordinate transformation and specifically a Galilean transformation.
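The Galilean transformation can be illustrated with a minimal Python sketch. The function name `galilean_transform` and the numbers are assumptions chosen for illustration; they do not come from the original notes. The sketch shows that an object moving at speed V in frame A stays at the origin of its rest frame B.

```python
def galilean_transform(x_a: float, t: float, v: float) -> float:
    """Position in frame B of a point located at x_a in frame A at time t.

    Assumes the two origins coincided at t = 0, that frame B moves along +x
    at speed v relative to frame A, and that time is the same in both frames.
    """
    return x_a - v * t


# Hypothetical numbers: the object (and frame B) moves at 20 m/s in frame A.
V = 20.0        # m/s, speed of frame B relative to frame A
t1 = 5.0        # s, time of the second observation
x_a1 = V * t1   # m, position of the object in frame A at t1

x_b1 = galilean_transform(x_a1, t1, V)
print(x_a1, x_b1)  # position in frame A, position in frame B
```

A point that is not moving with frame B, for example one at 150 m in frame A at the same time, maps to a non-zero position in frame B.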

Galilean/Newtonian Mechanics

If a beam of light is sent by the body along the same x axis and in the direction it is moving, an

observer in frame B would measure its speed as 𝑐 which represents the speed of light in a

vacuum. An observer in frame A however would measure its speed as 𝑐 + 𝑉, 𝑉 being the

velocity of Frame B with respect to (WRT) Frame A. Since this was known to Galileo it is known

as Galilean mechanics which involves relative velocities and accelerations. Newtonian mechanics involves forces, but since most applications involve forces the term Newtonian mechanics is normally used instead of Galilean mechanics.

Galileo introduced the term relativity because all velocities are relative to the speed of the

observer – there is no absolute zero velocity to be measured against.

Thus the speed of light is predicted to differ according to the speed of the frame of reference of

the observer. If the observer is moving with the speed of light, its speed will be observed to be

zero. However experiment proves otherwise – the speed of light is 𝑐, a constant value regardless

of the speed at which the observer moves as shown by the Michelson–Morley experiment of

1887 whose frames of reference were the Earth moving relative to the aether as it orbited the

Sun.

The postulate made by special relativity is that the speed of light is the same in all frames of

reference, and Galilean/Newtonian mechanics must be modified so that it gives this result. This

involves modifications to the concepts of distance and duration - they cannot be absolute.

The Postulates of Relativity

The two postulates which are the foundation of special relativity are

The principle of relativity - the laws of physics can be written in the same form in all

inertial (non-accelerating) frames of reference. This was in fact first stated by Galileo in

1632.

The principle of the constancy of the speed of light – the speed of light in a vacuum is the

same in all inertial frames. This is the result of careful measurement, first made by Michelson and Morley in 1887, but has been confirmed many times since and in 1983 the speed was defined as 299,792,458 m s$^{-1}$ (the length of a metre was redefined such that this was the speed of light with no error), and is given the symbol $c$, normally taken as $3 \times 10^8$ or $2.998 \times 10^8$ m s$^{-1}$.

Constant Speed of Light

The two postulates of special relativity are first that the laws of physics take the same form in all inertial frames of reference, and second that the measured speed of light in a vacuum has the same value $c$ in all inertial frames (assuming the measurement is made with absolute precision). Today the speed of light is exactly 299,792,458 metres per second because the metre is defined in terms of the speed of light (the value $3 \times 10^8$ m s$^{-1}$ is often used). If the value of the speed of light varied with the observer, then so would the length of a metre, which would have a major impact on the rest of physics.

The Galilean coordinate transformation described above cannot be correct, as $c = c \pm V$ can only be true if $V = 0$. It could be the case that one or two false assumptions have been made:

that $\Delta t_{A1} = \Delta t_{B1}$, or more generally that a duration between two instants in frame A ($\Delta t_A$) is the same as in frame B, i.e. that $\Delta t_A = \Delta t_B$

that $\Delta x_{A1} = \Delta x_{B1}$, or more generally that a distance between two points in frame A ($\Delta x_A$) is the same as in frame B, i.e. that $\Delta x_A = \Delta x_B$.

If these assumptions are false, and given that no accelerations are allowed in special relativity, there must be a linear relationship between the values of the form

$$t_B = a t_A + b x_A$$

$$x_B = d x_A + e t_A$$

The dimensions of $a$ and $b$ must differ, as must those of $d$ and $e$: $a$ and $d$ are dimensionless, $b$ has the dimensions of time divided by length, and $e$ has the dimensions of length divided by time (a speed). Here $x_A$ and $x_B$ refer to the distance to an object measured from the origins of the two frames of reference at time $t_A$ in frame A and $t_B$ in frame B, both measured from a time $t_0$ when the two origins coincided. The following process shows that $b$, $d$ and $e$ can be expressed in terms of $a$, and that $a$ is a function of $V$ and $c$.

Each frame contains an observer to make measurements, a clock to measure time difference,

and a measuring rod to measure distances. The observer does not have a specific location, but

can observe from any position and any time within the frame. When both frames are travelling

in the same direction at the same speed the two clocks and two measuring rods always give the

identical results. Measurements in frame A are made by the frame A observer and

measurements in frame B by the frame B observer using the clock and measuring rod in the

observer’s frame.

At time zero the origins of frame A and frame B coincide. The two frames are travelling along

the x axis at different speeds; that of frame B being greater than that of frame A by an amount 𝑉

as measured in frame A.

At time $t_A$ measured by observer A the origin of frame B will be at $x_A = V t_A$ (measured in frame A), but is at $x_B = 0$ at all times as measured in frame B. Substituting into $x_B = d x_A + e t_A$:

$$0 = d V t_A + e t_A$$

giving $e = -dV$ and so

$$x_B = d x_A - d V t_A$$

Dividing by $t_B = a t_A + b x_A$ gives

$$\frac{x_B}{t_B} = \frac{d x_A - d V t_A}{a t_A + b x_A}$$

At time $t_B$ measured by observer B the origin of frame A will be at $-V t_B$ (measured in frame B), but is at 0 at all times as measured in frame A, i.e. $x_A = 0$ and $x_B = -V t_B$, so

$$\frac{-V t_B}{t_B} = \frac{-d V t_A}{a t_A} \quad \text{or} \quad -1 = \frac{-d}{a}$$

This means $d = a$ and

$$\frac{x_B}{t_B} = \frac{a x_A - a V t_A}{a t_A + b x_A}$$

Multiplying the numerator and denominator by $\frac{1}{t_A}$:

$$\frac{x_B}{t_B} = \frac{a \frac{x_A}{t_A} - a V}{a + b \frac{x_A}{t_A}}$$

noting that $\frac{x}{t}$ is a speed.

Applying this equation to a pulse of light travelling at speed $c$ in all frames of reference, $\frac{x_A}{t_A} = \frac{x_B}{t_B} = c$, giving

$$c = \frac{ac - aV}{a + bc}$$

so

$$a + bc = \frac{ac - aV}{c}$$

$$bc = \frac{ac - aV}{c} - a = \frac{ac - aV}{c} - \frac{ac}{c} = \frac{-aV}{c}$$

$$b = \frac{-aV}{c^2}$$

Substituting $b = \frac{-aV}{c^2}$ into $t_B = a t_A + b x_A$, and $d = a$, $e = -dV$ into $x_B = d x_A + e t_A$:

$$t_B = a t_A - \frac{aV}{c^2} x_A$$

$$x_B = a x_A - a V t_A$$

or

$$t_B = a\left(t_A - \frac{V}{c^2} x_A\right) \quad \text{and} \quad x_B = a(x_A - V t_A)$$

Applying the same reasoning converting from frame B into frame A gives a similar result except $V$ is replaced by $-V$ (note that $V$ is a speed, not a velocity so $V \ge 0$):

$$t_A = a\left(t_B + \frac{V}{c^2} x_B\right) \quad \text{and} \quad x_A = a(x_B + V t_B)$$

so replacing $t_A$ and $x_A$ in $x_B = a(x_A - V t_A)$ gives

$$x_B = a\left(a(x_B + V t_B) - V a\left(t_B + \frac{V}{c^2} x_B\right)\right)$$

$$= a^2 x_B + a^2 V t_B - a^2 V t_B - \frac{a^2 V^2}{c^2} x_B$$

$$= a^2 x_B - \frac{a^2 V^2}{c^2} x_B$$

$$1 = a^2\left(1 - \frac{V^2}{c^2}\right) \quad \text{or} \quad a^2 = \frac{1}{1 - \frac{V^2}{c^2}}$$

$$a = \frac{1}{\sqrt{1 - \frac{V^2}{c^2}}}$$

sometimes written as $a = \frac{1}{\sqrt{1 - \beta^2}}$ where $\beta = \frac{V}{c}$.

Thus all the unknowns have been eliminated leaving a function of $V$. This function occurs throughout special relativity and is known as the Lorentz factor or gamma factor

$$\gamma(V) = \frac{1}{\sqrt{1 - \frac{V^2}{c^2}}}.$$
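As a numerical sanity check on the derivation, the following Python sketch evaluates the coefficients $a$, $b$, $d$ and $e$ for one relative speed and confirms the two conditions they were built to satisfy. The function name `coefficients` and the chosen speed of 0.6c are illustrative assumptions, not part of the original notes.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def coefficients(v: float, c: float = C):
    """Return (a, b, d, e) for t_B = a*t_A + b*x_A and x_B = d*x_A + e*t_A."""
    a = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    b = -a * v / c**2
    d = a
    e = -d * v
    return a, b, d, e


v = 0.6 * C  # hypothetical relative speed of frame B
a, b, d, e = coefficients(v)

# A light pulse leaving the common origin satisfies x_A = c * t_A in frame A.
t_a = 2.0
x_a = C * t_a
t_b = a * t_a + b * x_a
x_b = d * x_a + e * t_a
light_speed_in_b = x_b / t_b

# The origin of frame B (x_A = V * t_A in frame A) should map to x_B = 0.
origin_of_b = d * (v * t_a) + e * t_a

print(light_speed_in_b / C, origin_of_b)
```

The first printed value should be 1 to within floating-point rounding, and the second should be zero to within rounding.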

Lorentz Factor or Gamma Factor

The Lorentz or gamma factor

$$\gamma(V) = \frac{1}{\sqrt{1 - \frac{V^2}{c^2}}}$$

occurs frequently in special relativity and acts as a “correction” factor to time and spatial values in Newtonian mechanics. Note that the factor is dimensionless, and depends only on the speed towards or away from the observer.

The value is close to 1 for small values of $V$ (small compared to the speed of light, or less than $1 \times 10^8$ metres per second), but increases rapidly at speeds greater than $2 \times 10^8$ metres per second, tending to infinity at $V = c = 3 \times 10^8$ m s$^{-1}$.

| $V$ (m s$^{-1}$) | $1 \times 10^8$ | $2 \times 10^8$ | $2.5 \times 10^8$ | $2.75 \times 10^8$ | $2.875 \times 10^8$ |
|---|---|---|---|---|---|
| $\gamma(V)$ | 1.06 | 1.34 | 1.81 | 2.50 | 3.50 |
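The values of the Lorentz factor can be recomputed with a few lines of Python. This is a minimal sketch that assumes the rounded value $c = 3 \times 10^8$ m/s used in the text; the function name `gamma` is chosen here for illustration.

```python
import math

C = 3.0e8  # m/s, the rounded speed of light used in the text


def gamma(v: float, c: float = C) -> float:
    """Lorentz factor for a relative speed v with |v| < c."""
    if abs(v) >= c:
        raise ValueError("gamma is only real and finite for |v| < c")
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


for v in (1.0e8, 2.0e8, 2.5e8, 2.75e8, 2.875e8):
    print(f"V = {v:.3e} m/s   gamma = {gamma(v):.2f}")
```

Using the exact value of $c$ instead of the rounded one changes the results slightly, most noticeably for the speeds closest to $c$.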

Applying the factor (replacing $a$ by $\gamma(V)$), where $V$ is the speed of frame B relative to frame A, gives

$$t_B = \gamma(V)\left(t_A - \frac{V x_A}{c^2}\right)$$

$$x_B = \gamma(V)(x_A - V t_A)$$

The inverse transformations are

$$t_A = \gamma(V)\left(t_B + \frac{V x_B}{c^2}\right)$$

$$x_A = \gamma(V)(x_B + V t_B)$$
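A short Python sketch can confirm that the inverse transformation undoes the forward one. The function names `to_frame_b` and `to_frame_a` and the sample event are assumptions made for this illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v: float, c: float = C) -> float:
    """Lorentz factor for a relative speed v with |v| < c."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def to_frame_b(t_a: float, x_a: float, v: float, c: float = C):
    """Transform an event (t_A, x_A) into frame B moving at speed v along x."""
    g = gamma(v, c)
    return g * (t_a - v * x_a / c**2), g * (x_a - v * t_a)


def to_frame_a(t_b: float, x_b: float, v: float, c: float = C):
    """Inverse transformation: the same form with the sign of v reversed."""
    g = gamma(v, c)
    return g * (t_b + v * x_b / c**2), g * (x_b + v * t_b)


v = 0.8 * C                  # hypothetical relative speed
t_a, x_a = 1.0e-6, 150.0     # hypothetical event: 1 microsecond, 150 m
t_b, x_b = to_frame_b(t_a, x_a, v)
t_back, x_back = to_frame_a(t_b, x_b, v)
print(t_b, x_b)
print(t_back, x_back)        # should reproduce the original event
```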

The y and z axes are unaffected by movement along the x axis so $y_A = y_B$ and $z_A = z_B$. If the motion is at an angle to the x axis, similar equations apply to the y and z axes, $V$ being replaced by $V_x$, $V_y$ and $V_z$, i.e. vectors must be used.

Since $\gamma(V) = \frac{1}{\sqrt{1 - \frac{V^2}{c^2}}}$, then if $\gamma(V)$ is known

$$V = c\sqrt{1 - \frac{1}{\gamma^2(V)}}$$

$\gamma(V)$ or just $\gamma$ is often used as a measure of relativistic speeds (over $0.5c$).

$$\frac{V}{c} = \sqrt{1 - \frac{1}{\gamma^2(V)}}$$

but the following approximation is commonly used

$$\frac{V}{c} \cong 1 - \frac{1}{2\gamma^2(V)}$$

which uses $(1 + x)^{\frac{1}{2}} \cong 1 + \frac{x}{2}$ for $x \ll 1$.

| $\gamma$ | $V/c$ (approximate) |
|---|---|
| 1 | 0.5 |
| 10 | 0.995 |
| 100 | 0.99995 |
| 1000 | 0.9999995 |

Substituting 0.5 back into $\frac{1}{\sqrt{1 - \frac{V^2}{c^2}}}$ gives $\gamma = 1.1547$ so the error is about 15%, but then 1/2 is not significantly less than 1. Substituting 0.995 gives $\gamma = 10.013$, an error of 0.13%, since 1/200 is significantly less than 1.
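The quality of the approximation can be checked numerically. The following Python sketch compares the exact and approximate values of $V/c$ and converts the approximate value back into a Lorentz factor; the function names are illustrative assumptions.

```python
import math


def beta_exact(g: float) -> float:
    """Exact V/c for a given Lorentz factor g >= 1."""
    return math.sqrt(1.0 - 1.0 / g**2)


def beta_approx(g: float) -> float:
    """Approximate V/c from the first-order expansion of the square root."""
    return 1.0 - 1.0 / (2.0 * g**2)


def gamma_from_beta(beta: float) -> float:
    """Lorentz factor for a given V/c < 1."""
    return 1.0 / math.sqrt(1.0 - beta**2)


for g in (1, 10, 100, 1000):
    approx = beta_approx(g)
    print(g, beta_exact(g), approx, gamma_from_beta(approx))
```

The last column shows the Lorentz factor recovered from the approximate speed, which approaches the input value as $\gamma$ grows.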

Speeds faster than light.

If $V > c$ then $\gamma(V) = \frac{1}{\sqrt{1 - \beta^2}}$ where $\beta^2 = \frac{V^2}{c^2}$ and so $\beta > 1$.

$\sqrt{1 - \beta^2}$ is the square root of a negative number, i.e. an imaginary number $\mathrm{i}\sqrt{\beta^2 - 1}$. (Note that an italic $i$ is a variable, a normal i is the square root of -1.)

This means that 𝛾(𝑉) is an imaginary number and since the Lorentz factor multiplies scalars

such as time, length and mass these also become imaginary numbers, a result that has no

physical reality.

This imposes a maximum speed of $V < c$ since for $V = c$ the Lorentz factor $\gamma(V) = \infty$, resulting in infinite values for physical properties.

Note that $V$ is the velocity as measured by the observer in the x direction, i.e. is equivalent to $v_x$.

Particles that travel at the speed of light are a special case: time, length and mass must be zero. The only known particles that meet this requirement are photons.

Photons have momentum, but instead of momentum being $p = mv$, it is $p = \frac{hf}{c} = \frac{h}{\lambda}$ where $h$ is Planck’s constant, $f$ is frequency (often given the symbol $\nu$ in astronomy, but that can be confused with $v$ for velocity) and $\lambda$ is the wavelength.

Neutrinos were thought to travel at the speed of light and to have zero mass, but experiments

showed that they can change their flavour (electron, muon or tau) during their existence which

means that they must exist for a finite time and not zero time. This implies they have mass

although it has not been measured, and their speed is extremely close to that of light from

measurements of astronomical sources.

Transformation of Intervals

An interval is the distance between two points in space or the duration between two instants in time. Assuming motion is restricted to the x axis these are $\Delta x$ and $\Delta t$.

The distance interval between two events 1 (at time $t_{A1}$ and distance $x_{A1}$) and 2 (at time $t_{A2}$ and distance $x_{A2}$) as measured in frame A is $\Delta x_A = x_{A2} - x_{A1}$ with $x_{A2} > x_{A1}$.

Converting from frame A to frame B moving at speed $V$ relative to A (positive for increasing $x$) gives $x_{B1} = \gamma(V)(x_{A1} - V t_{A1})$ and $x_{B2} = \gamma(V)(x_{A2} - V t_{A2})$ so

$$\Delta x_B = \gamma(V)(x_{A2} - V t_{A2}) - \gamma(V)(x_{A1} - V t_{A1}) = \gamma(V)(\Delta x_A - V \Delta t_A)$$

Note that there are no absolute distances or times in the result, just differences, so the only factor affecting the result is the relative speed of the two frames.

Similarly for time, $\Delta t_B = t_{B2} - t_{B1}$ with $t_{B1} = \gamma(V)\left(t_{A1} - \frac{V x_{A1}}{c^2}\right)$ and $t_{B2} = \gamma(V)\left(t_{A2} - \frac{V x_{A2}}{c^2}\right)$, so

$$\Delta t_B = \gamma(V)\left(t_{A2} - \frac{V x_{A2}}{c^2}\right) - \gamma(V)\left(t_{A1} - \frac{V x_{A1}}{c^2}\right) = \gamma(V)\left(\Delta t_A - \frac{V \Delta x_A}{c^2}\right)$$

and again there are no absolute distances or times in the result, just differences, so the only factor affecting the result is the relative speed of the two frames.

The reverse transforms are found by reversing the sign of $V$; this has no effect on $\gamma(V)$.

Since both time and space intervals depend on the relative speed of the frames of reference, all

properties that include time and space will also depend on the relative speed of the frames of

reference, and these include velocity, kinetic energy and momentum.
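The statement that intervals transform in the same way as coordinates can be checked with a small Python sketch. It transforms two events individually, takes the differences in frame B, and compares them with the interval formulas. The function names and the sample events are assumptions made for this illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v: float) -> float:
    """Lorentz factor for a relative speed v with |v| < C."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def event_to_b(t_a: float, x_a: float, v: float):
    """Transform a single event from frame A to frame B."""
    g = gamma(v)
    return g * (t_a - v * x_a / C**2), g * (x_a - v * t_a)


def interval_to_b(dt_a: float, dx_a: float, v: float):
    """Transform a time interval and a distance interval from A to B."""
    g = gamma(v)
    return g * (dt_a - v * dx_a / C**2), g * (dx_a - v * dt_a)


v = 0.5 * C                 # hypothetical relative speed
t1, x1 = 1.0e-6, 100.0      # hypothetical event 1 in frame A
t2, x2 = 3.0e-6, 700.0      # hypothetical event 2 in frame A

tb1, xb1 = event_to_b(t1, x1, v)
tb2, xb2 = event_to_b(t2, x2, v)
dt_b, dx_b = interval_to_b(t2 - t1, x2 - x1, v)

print(tb2 - tb1, dt_b)      # the two time intervals should agree
print(xb2 - xb1, dx_b)      # the two distance intervals should agree
```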

These relationships are important because a measuring rod measures the distance between two points and a clock the duration between two instants, i.e. both measure different aspects of intervals and the absolute location or time of an event has no meaning.

Note that frame A always refers to the observer’s frame, frame B to a frame moving at the same

velocity 𝑉 as the object (the rest frame), the x axes of the two frames being parallel to this

velocity.

In most practical problems the observer is located on the Earth and regards the Earth as stationary and the object as moving. However the rest frame B is moving with the object. If the

object is approaching the Earth the x axis is directed towards the object and 𝑉 is negative. If the

object is in orbit about the Earth the x axis is along the Earth’s surface in the direction of the

ground track of the object (special relativity can only approximate the result due to the

presence of gravity and the curvature of the orbit).

Time Dilation

The duration between two events depends on the speed of the frame of reference used for the measurement relative to the rest frame of reference of the object involved. The following assumes that the rest frame of the object is frame B and it is observed from frame A. An example would be the creation at $(t_1, x_1)$ and decay at $(t_2, x_2)$ of an atomic particle.

In frame B the particle is located at $x_B$ at both $t_{B1}$ and $t_{B2}$ and its lifetime is $\Delta t_B = t_{B2} - t_{B1}$. $\Delta x_B = 0$ because this frame is travelling at the same speed as the particle.

The particle is observed by the observer in frame A using the clock and measuring rod in frame A. The particle is created at $(t_{A1}, x_{A1})$ and decays at $(t_{A2}, x_{A2})$ so its observed lifetime is $\Delta t_A = t_{A2} - t_{A1}$.

$$\Delta t_A = \gamma(V)\left(\Delta t_B + \frac{V \Delta x_B}{c^2}\right) = \gamma(V)\Delta t_B \quad \text{since } \Delta x_B = 0$$

($V$ is positive if $x_{A2} > x_{A1}$.)

Since $\gamma(V) \ge 1$ then $\Delta t_A \ge \Delta t_B$.

$\Delta t_A - \Delta t_B$ is called the time dilation $\Delta t$: the clocks tick at the same rate when stationary relative to each other, but as observed from frame A the clock in frame B runs slower (longer period between ticks) than that in frame A.

The minimum value of $\Delta t_A$ is when $\gamma(V)$ has a minimum value. Since $\gamma(V) = \frac{1}{\sqrt{1 - \frac{V^2}{c^2}}}$ and $V \le c$, the minimum value of $\gamma(V)$ is when $V = 0$ and then $\gamma(V) = 1$.

This means the minimum duration is the one measured in the rest frame. This is called the proper time and is given the symbol $\Delta\tau$, i.e. $\Delta\tau$ is the proper time between the two events measured in a frame moving at the speed of the object between the two events, and any measurement made in another frame of reference will be larger than this. Another way of expressing this is that moving clocks run slow (moving relative to the observer).

In all frames of reference $\Delta t = \gamma(V)\Delta\tau$ where $V$ is the velocity difference between the frame of reference and the particle.

For example a muon moving at a speed of $\frac{3c}{5}$ with a proper lifetime of 2.2 $\mu$s has a lifetime of

$$\gamma(V) \times 2.2 = \frac{1}{\sqrt{1 - \frac{9}{25}}} \times 2.2 = 2.8\ \mu\text{s}$$

according to a laboratory observer.

A high energy $10^{20}$ eV cosmic ray consists of protons with a rest mass $m_0$ of 938 MeV c$^{-2}$. (Masses are often expressed in units of eV c$^{-2}$, but this is often abbreviated to eV, where 1 eV c$^{-2}$ = $1.7827 \times 10^{-36}$ kg.) The Milky Way has a diameter of 100,000 lightyears.

The energy of the cosmic ray is given by $E = \gamma(V) m_0 c^2$ so $10^{20} = \gamma(V) \times (938 \times 10^6\ c^{-2}) \times c^2$ and $\gamma(V) = \frac{10^{20}}{938 \times 10^6} \approx 1 \times 10^{11}$, which means the cosmic ray is very close to the speed of light. While it takes approximately 100,000 years to cross the galaxy in the galaxy’s rest frame, in the cosmic ray’s frame it takes $\frac{100{,}000}{\gamma(V)} \approx 10^{-6}$ years or 30 seconds.

The extreme case is that of a photon travelling across the universe. The astronomer measures its lifetime in Gyr (1 giga-year is $10^9$ or 1,000,000,000 years), but its proper lifetime is $\frac{\text{Gyr}}{\gamma(c)}$ and since $\gamma(c) = \infty$ the proper lifetime is zero.
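The muon and cosmic ray examples can be reproduced with a short Python sketch. The variable names are chosen for illustration, and the conversion of years to seconds assumes a Julian year of 365.25 days, which is an assumption not stated in the text.

```python
import math


def gamma_from_beta(beta: float) -> float:
    """Lorentz factor for a speed given as a fraction of c."""
    return 1.0 / math.sqrt(1.0 - beta**2)


# Muon: proper lifetime 2.2 microseconds, speed 3c/5.
proper_lifetime = 2.2e-6                               # s
lab_lifetime = gamma_from_beta(0.6) * proper_lifetime  # s, dilated lifetime

# Cosmic-ray proton: E = gamma * m0 * c^2, with E and m0*c^2 both in eV.
energy_ev = 1.0e20
rest_energy_ev = 938.0e6
g = energy_ev / rest_energy_ev

# Crossing time of the Milky Way in the cosmic ray's own frame.
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, an assumption
crossing_years = 100_000 / g
crossing_seconds = crossing_years * SECONDS_PER_YEAR

print(lab_lifetime)      # about 2.75e-6 s
print(g)                 # about 1.07e11
print(crossing_seconds)  # a little under 30 s
```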

Length Contraction

Taking as an example the length of a rod, this is defined as the distance between the ends of the rod with all measurements being made at the same time in the same frame. It is assumed that the rod is aligned with the x axis, and frame B is moving at the same speed as the rod.

The rod is observed by the observer in frame A using the clock and measuring rod in frame A which is moving relative to the rod. The ends of the rod are at $(t_{A1}, x_{A1})$ and at $(t_{A1}, x_{A2})$ so its length is $\Delta x_A = x_{A2} - x_{A1}$.

$$\Delta x_B = \gamma(V)(\Delta x_A - V \Delta t_A) = \gamma(V)\Delta x_A \quad \text{since } \Delta t_A = 0$$

($V$ is negative if $x_{A2} > x_{A1}$.)

This can be written as $\Delta x_A = \frac{\Delta x_B}{\gamma(V)}$ and since $\gamma(V) \ge 1$, $\Delta x_A \le \Delta x_B$.

The maximum value of $\Delta x_A$ is when $\gamma(V) = 1$, i.e. when $V = 0$ and the frame is travelling at the same speed as the rod. $\Delta x_B$ is written as $\Delta\chi$, the proper length.

This means the maximum length is when the length is measured in the rest frame.

In all other frames $\Delta x = \frac{\Delta\chi}{\gamma(V)}$.

Taking the example of the photon crossing the universe, its path length is measured in Gly (giga lightyears). The length as seen by the photon is $\frac{\text{Gly}}{\gamma(c)}$ which is zero: the photon is created and destroyed at the same location, which is why its proper lifetime is zero.

The combination of time dilation and length contraction gives consistency between observers. If an object is approaching the Earth at high speed an observer in the rest frame of the object measures the proper time to impact $\tau$ but at an observed distance $x$, while an observer on Earth measures the proper distance $\chi$ but a dilated time $t$. $\gamma(V) = \frac{\chi}{x} = \frac{t}{\tau}$ so $\frac{\chi}{t} = \frac{x}{\tau}$, i.e. both agree on the velocity $V$.
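The consistency argument can be illustrated numerically. In the Python sketch below, the proper distance and the speed are hypothetical values chosen for illustration; the point is only that the Earth observer and the observer travelling with the object compute the same closing speed.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma_from_beta(beta: float) -> float:
    """Lorentz factor for a speed given as a fraction of c."""
    return 1.0 / math.sqrt(1.0 - beta**2)


beta = 0.8     # hypothetical speed of the object as a fraction of c
chi = 3.0e9    # m, hypothetical proper distance measured on Earth
g = gamma_from_beta(beta)

t = chi / (beta * C)  # s, dilated time to impact measured on Earth
x = chi / g           # m, contracted distance in the object's rest frame
tau = t / g           # s, proper time to impact in the object's rest frame

speed_seen_from_earth = chi / t
speed_seen_from_object = x / tau
print(speed_seen_from_earth, speed_seen_from_object)
```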

Simultaneous Events

Simultaneous events at different positions along the x axis in one frame of reference are separated by a time duration in a frame of reference moving at a different speed.

Since $\Delta t_B = \gamma(V)\left(\Delta t_A - \frac{V \Delta x_A}{c^2}\right)$, simultaneous events in frame A means that $\Delta t_A = 0$ so

$$\Delta t_B = \gamma(V)\left(-\frac{V \Delta x_A}{c^2}\right).$$

This is known as the relativity of simultaneity.
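A small Python sketch gives a feel for the size of this effect. The function name and the numbers (two events 300 m apart, frame B moving at 0.6c) are hypothetical values chosen for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def time_separation_in_b(dx_a: float, v: float) -> float:
    """Time between two events in frame B when they are simultaneous in A.

    dx_a is the separation of the events along x in frame A and v is the
    speed of frame B relative to frame A.
    """
    g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return -g * v * dx_a / C**2


# Two events 300 m apart that are simultaneous in frame A, seen from a
# frame B moving at 0.6c.
dt_b = time_separation_in_b(300.0, 0.6 * C)
print(dt_b)  # negative: in frame B the event at larger x happens first
```

Events at the same position ($\Delta x_A = 0$) remain simultaneous in every frame, which the function reproduces by returning zero.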

Proper Time, Length and Mass

Traditionally mass has been treated in a similar way to time and length with the difference that mass increases with speed by the factor $\gamma(V)$, i.e. $m = \gamma(V) m_0$ where $m_0$ is the mass measured in the rest frame. However the modern approach is that the mass $m$ is independent of speed and is always the rest mass and this is multiplied by $\gamma(V)$ whenever mass is used in a calculation (relativity tends to use momentum $\mathbf{p} = \gamma(V) m \mathbf{v}$ rather than mass). This means that all modern observers agree on the mass, which is said to be invariant.

Although observers in different frames of reference disagree on the measured values of lengths

and durations, they can all calculate the same proper lengths and durations which are those

measured by an observer in the rest frame. Thus the size and life time of an object are also

invariants, and like mass are properties of the object instead of the results of measurements by

observers.

(Note that vectors are in bold upright, scalars in normal italic.)

Note that although the SI unit of mass is the kilogram, in particle physics mass is often quoted in electron volts (1 electron volt being the energy gained by an electron when it moves across a potential of 1 volt). Strictly the unit should be eV/c$^2$ to convert energy into a mass and that is what is used here. Modern usage is for this to be written as eV c$^{-2}$ but this does not stand out in text so is not used here.

The conversion comes from the famous $E = mc^2$ so 1 eV/c$^2$ = $1.7827 \times 10^{-36}$ kg.
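The conversion factor can be reproduced from $E = mc^2$ in a few lines of Python. The sketch assumes the exact SI values of the speed of light and of the elementary charge (which fixes the number of joules in one electron volt); the function name is chosen for illustration.

```python
C = 299_792_458.0               # m/s, exact by definition
JOULES_PER_EV = 1.602176634e-19  # J, exact by definition in the SI


def ev_per_c2_to_kg(mass_ev: float) -> float:
    """Convert a mass quoted in eV/c^2 into kilograms using m = E / c^2."""
    return mass_ev * JOULES_PER_EV / C**2


one_ev_in_kg = ev_per_c2_to_kg(1.0)
proton_in_kg = ev_per_c2_to_kg(938.0e6)  # proton rest mass of 938 MeV/c^2
print(one_ev_in_kg)   # about 1.7827e-36 kg
print(proton_in_kg)   # about 1.672e-27 kg
```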

Velocities

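Velocity is defined as $\frac{\Delta x}{\Delta t}$. Strictly speed is a magnitude and velocity is a vector (i.e. it also has a direction), but when restricting movement to the x axis the difference is not important.

In frame A velocity is defined as $v_A = \frac{\Delta x_A}{\Delta t_A}$ and in frame B as $v_B = \frac{\Delta x_B}{\Delta t_B}$,

but $\Delta x_B = \gamma(V)(\Delta x_A - V\Delta t_A)$ and $\Delta t_B = \gamma(V)\left(\Delta t_A - \frac{V\Delta x_A}{c^2}\right)$ so

$$\frac{\Delta x_B}{\Delta t_B} = \frac{\gamma(V)(\Delta x_A - V\Delta t_A)}{\gamma(V)\left(\Delta t_A - \frac{V\Delta x_A}{c^2}\right)} = \frac{\Delta x_A - V\Delta t_A}{\Delta t_A - \frac{V\Delta x_A}{c^2}} = \frac{\frac{\Delta x_A}{\Delta t_A} - V}{\frac{\Delta t_A}{\Delta t_A} - \frac{V\Delta x_A}{c^2\Delta t_A}}$$

$$v_B = \frac{v_A - V}{1 - \frac{V v_A}{c^2}}$$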

𝑉 is the speed of frame B along the x axis. If frame B is the rest frame of the object then 𝑣𝐴 = 𝑉

and so 𝑣𝐵 = 0 as expected.

If two objects are moving away from an observer at speeds of 𝑣1 and 𝑣2, both in the same

direction, then the speed of one of the objects as measured from the other can be found directly

by the above equation with 𝑣𝐴 as the speed of one and 𝑉 the speed of the other.

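The velocity transformations for objects moving at an angle to the x axis are

$$v_{Bx} = \frac{v_{Ax} - V}{1 - \frac{V v_{Ax}}{c^2}}$$

$$v_{By} = \frac{v_{Ay}}{\gamma(V)\left(1 - \frac{V v_{Ax}}{c^2}\right)}$$

since $v_{By} = \frac{\Delta y_B}{\Delta t_B}$, $\Delta y_B = \Delta y_A$ and $\Delta t_B = \gamma(V)\left(\Delta t_A - \frac{V\Delta x_A}{c^2}\right)$, and

$$v_{Bz} = \frac{v_{Az}}{\gamma(V)\left(1 - \frac{V v_{Ax}}{c^2}\right)}$$

since $v_{Bz} = \frac{\Delta z_B}{\Delta t_B}$, $\Delta z_B = \Delta z_A$ and $\Delta t_B = \gamma(V)\left(\Delta t_A - \frac{V\Delta x_A}{c^2}\right)$.

Note $V$ is the speed at which frame B is moving along the x axis, $(v_{Ax}, v_{Ay}, v_{Az})$ is the velocity of the object measured in frame A, and $(v_{Bx}, v_{By}, v_{Bz})$ is the velocity measured in frame B.

If both $V$ and $v_{Ax}$ are small compared to $c$ then $\frac{V v_{Ax}}{c^2}$ is approximately zero, $\gamma(V)$ is approximately 1, and this simplifies to $v_{Bx} = v_{Ax} - V$ etc. as per Galilean mechanics.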

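For example if an observer measures the speeds of two spacecraft as $\frac{c}{2}$ and $\frac{3c}{4}$ in the same direction, then taking the observer as frame A and the direction as the x axis, $V = \frac{c}{2}$ and $v_{Ax} = \frac{3c}{4}$. The speed of the faster craft as measured from the slower is

$$v_{Bx} = \frac{v_{Ax} - V}{1 - \frac{V v_{Ax}}{c^2}} = \frac{\frac{3c}{4} - \frac{c}{2}}{1 - \frac{\frac{c}{2}\cdot\frac{3c}{4}}{c^2}} = \frac{2c}{5}$$

while Galileo would give simply $\frac{3c}{4} - \frac{c}{2} = \frac{c}{4}$.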
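The spacecraft example can be reproduced exactly with rational arithmetic. This is a small sketch, assuming speeds are expressed as fractions of $c$ along the x axis; the function name `velocity_in_frame_b` is illustrative only.

```python
from fractions import Fraction


def velocity_in_frame_b(v_a, frame_speed):
    """Relativistic x-velocity seen from frame B; both speeds in units of c."""
    return (v_a - frame_speed) / (1 - frame_speed * v_a)


faster = Fraction(3, 4)   # speed of the faster craft in frame A
slower = Fraction(1, 2)   # speed of the slower craft, used as frame B

relativistic = velocity_in_frame_b(faster, slower)
galilean = faster - slower

print(relativistic, galilean)  # prints: 2/5 1/4
```

Setting the object speed equal to the frame speed returns zero, matching the statement that the object is at rest in its own frame.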


Doppler Effect

The frequency of light from a star moving towards or away from the Earth is changed due to the Doppler effect.

This is usually explained as the change in frequency of sound waves from a moving object such as a police siren as it passes. The observed frequency $f_{ob}$ is

$$f_{ob} = \left(\frac{c + v_{ob}}{c + v_{em}}\right) f_{em}$$

where $f_{em}$ is the emitted frequency, $c$ is the speed of the wave in the medium, $v_{ob}$ is the velocity component of the observer towards the emitter relative to the medium, and $v_{em}$ is the velocity component of the emitter away from the observer relative to the medium.

However the speed of light is the same when measured by all observers including the emitter

and there is no medium so this formula does not apply, and another is required.

The wavelength of the light is given by $\lambda = \frac{c}{f}$ and the time period between peaks (when the wave has maximum amplitude) is $\Delta t = \frac{1}{f}$.

The proper time period is that measured by an observer moving with the star, $\Delta\tau = \frac{1}{f_{em}}$.

The time period measured (observed) on Earth is $\Delta t = \frac{1}{f_{ob}}$.

As measured by an observer moving with the star, during the time period $\Delta\tau$ the Earth will have moved away by a distance $V\Delta\tau$ so the time period on Earth will be $\Delta t = \Delta\tau + \frac{V\Delta\tau}{c}$.

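For an observer on Earth $\Delta\tau$ is replaced by $\gamma(V)\Delta\tau$ so

$$\Delta t = \gamma(V)\Delta\tau + \frac{V\gamma(V)\Delta\tau}{c} = \gamma(V)\Delta\tau\left(1 + \frac{V}{c}\right) = \Delta\tau\,\frac{1}{\sqrt{1 - \frac{V^2}{c^2}}}\left(1 + \frac{V}{c}\right)$$

Note that the factor $\left(1 + \frac{V}{c}\right)$ comes from the extra distance the light has to travel because the source recedes during one period, and has nothing to do with relativity. The factor $\frac{1}{\sqrt{1 - \frac{V^2}{c^2}}}$ comes from time dilation.

$$\Delta t = \Delta\tau\,\frac{1 + \frac{V}{c}}{\sqrt{\left(1 - \frac{V}{c}\right)\left(1 + \frac{V}{c}\right)}} = \Delta\tau\,\frac{\sqrt{1 + \frac{V}{c}}}{\sqrt{1 - \frac{V}{c}}} = \Delta\tau\,\frac{\sqrt{c + V}}{\sqrt{c - V}}$$

Since $\Delta t = \frac{1}{f_{ob}}$ and $\Delta\tau = \frac{1}{f_{em}}$,

$$\frac{1}{f_{ob}} = \frac{1}{f_{em}}\frac{\sqrt{c + V}}{\sqrt{c - V}}$$

$$f_{ob} = f_{em}\frac{\sqrt{c - V}}{\sqrt{c + V}} \quad\text{or}\quad \lambda_{ob} = \lambda_{em}\frac{\sqrt{c + V}}{\sqrt{c - V}} \quad\text{since } \lambda = \frac{c}{f}$$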
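The final frequency formula can be evaluated directly. The sketch below assumes the speed is given as $\beta = V/c$, positive for a receding source; the function name `observed_frequency` and the sample value of $\beta$ are illustrative assumptions.

```python
import math


def observed_frequency(f_em: float, beta: float) -> float:
    """Relativistic Doppler shift along the line of sight.

    beta = V/c, positive when the source is receding, negative when approaching.
    """
    return f_em * math.sqrt((1.0 - beta) / (1.0 + beta))


f_em = 1.0                                     # emitted frequency in arbitrary units
receding = observed_frequency(f_em, 0.6)       # lower frequency (red shift)
approaching = observed_frequency(f_em, -0.6)   # higher frequency (blue shift)
print(receding, approaching)
```

With $\beta = 0.6$ the frequency is roughly halved when receding and roughly doubled when approaching, so the two cases are reciprocals of each other.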


Light from a receding star has a lower observed frequency (called red shift) and that from an approaching star a higher observed frequency (called blue shift). Note that these terms are also used for frequencies other than those of visible light, but blue shifting UV, X-rays and gamma rays moves them away from blue, not towards it, blue meaning an increase in frequency. Likewise red shifting radio, microwaves or infrared waves results in a decrease in frequency, moving them away from red.

In practice the frequencies or wavelengths are known (usually with a high precision) and the velocity is required. Rearranging these gives

$$V = \frac{f_{em}^2 - f_{ob}^2}{f_{em}^2 + f_{ob}^2}\,c = \frac{\lambda_{ob}^2 - \lambda_{em}^2}{\lambda_{ob}^2 + \lambda_{em}^2}\,c$$

but it may be more convenient to calculate using ratios

$$V = \frac{1 - \left(\frac{f_{ob}}{f_{em}}\right)^2}{1 + \left(\frac{f_{ob}}{f_{em}}\right)^2}\,c = \frac{1 - \left(\frac{\lambda_{em}}{\lambda_{ob}}\right)^2}{1 + \left(\frac{\lambda_{em}}{\lambda_{ob}}\right)^2}\,c$$

with $V$ positive indicating receding, negative approaching.

However for stars where $V \ll c$ the simpler approximation

$$V = \frac{c\,(f_{em} - f_{ob})}{f_{ob}} = \frac{c\,(\lambda_{ob} - \lambda_{em})}{\lambda_{em}}$$

is often used, but this should not be applied to galaxies.
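The exact and approximate velocity formulas can be compared numerically. This is a sketch only; the function names and the sample wavelengths are hypothetical values chosen for illustration, not data from the text.

```python
def recession_speed(lambda_em: float, lambda_ob: float) -> float:
    """Exact V/c from emitted and observed wavelengths (positive = receding)."""
    ratio_sq = (lambda_em / lambda_ob) ** 2
    return (1.0 - ratio_sq) / (1.0 + ratio_sq)


def recession_speed_low_v(lambda_em: float, lambda_ob: float) -> float:
    """Low-speed approximation V/c = (lambda_ob - lambda_em) / lambda_em."""
    return (lambda_ob - lambda_em) / lambda_em


# Hypothetical small shift, typical of a star: the two formulas nearly agree.
small_exact = recession_speed(500.0, 500.1)
small_approx = recession_speed_low_v(500.0, 500.1)

# Hypothetical large shift, typical of a distant galaxy: they disagree badly.
large_exact = recession_speed(500.0, 1000.0)
large_approx = recession_speed_low_v(500.0, 1000.0)

print(small_exact, small_approx)
print(large_exact, large_approx)
```

For the large shift the approximation returns a speed equal to $c$, while the exact formula stays below $c$, which illustrates why the approximation should not be applied to galaxies.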


Seeing and Observing

Seeing means that the time recorded is that when the observer sees the event, that is the actual time the event happened plus the time it took that information to reach the observer. For example a supernova was seen yesterday. However the event occurred many hundreds of years earlier. That time is the distance divided by the speed of light, $\frac{d}{c}$, and has nothing to do with relativity.

When an observer makes an observation the time recorded is the actual time of the event, as if

the observer was at the event, but measured in the observer’s frame of reference, not the rest

frame.

An observer is assumed to be able to travel instantaneously to any point in the frame of

reference to make an observation. The critical point is that the tools they use – the measuring

rod and clock - are those that belong to their frame of reference. An observer cannot change

their frame of reference as that would imply acceleration. They cannot observe values using

tools in another frame of reference.

In special relativity observers observe and do not see. In general relativity observers see and do

not observe – a very important distinction.

Alice and Bob are in separate space ships approaching each other at speed $V$, and Alice and Bob synchronise their clocks to $t = 0$ when Bob measures the separation distance as $d_B$.

At $t = 0$ Alice sees Bob’s clock through a telescope – the time seen on Bob’s clock will be $-\frac{d_B}{c}$ since the light will take $\frac{d_B}{c}$ to reach Alice, but the time on Alice’s clock is $t_A = 0$.

Sometime later Alice and Bob pass each other. Alice measures the distance travelled since $t = 0$ as $d_A = \frac{d_B}{\gamma(V)}$ since Bob is approaching Alice at speed $V$ so his distance $d_B$ is contracted.

Bob observes Alice passes at a time $t_B = \frac{d_B}{V}$.

Alice observes Bob passes at $t_A = \frac{d_A}{V} = \frac{d_B}{\gamma(V)V}$ but wants to know the time on Bob’s clock. She must allow for the Doppler shift, which makes Bob’s clock appear to run faster, so her own elapsed time equals the elapsed time $\Delta t_{BA}$ that she sees on Bob’s clock multiplied by the factor $\frac{\sqrt{c - V}}{\sqrt{c + V}}$:

$$\frac{d_A}{V} = \Delta t_{BA}\frac{\sqrt{c - V}}{\sqrt{c + V}}$$

An elapsed time $\Delta t_{BA}$ is used because Alice will be measuring the time from $-\frac{d_B}{c}$, not 0. Rearranging,

$$\Delta t_{BA} = \frac{d_A}{V}\frac{\sqrt{c + V}}{\sqrt{c - V}}$$

This can be reconciled with Bob’s observation of the time on his clock, $t_{BB} = t_B = \frac{d_B}{V}$:

$$\Delta t_{BA} = \frac{d_B}{\gamma(V)V}\frac{\sqrt{c + V}}{\sqrt{c - V}} = \frac{d_B}{V}\sqrt{1 - \frac{V^2}{c^2}}\,\frac{\sqrt{c + V}}{\sqrt{c - V}} = \frac{d_B}{V}\frac{\sqrt{(c + V)(c - V)}}{c}\frac{\sqrt{c + V}}{\sqrt{c - V}} = \frac{d_B}{V}\frac{(c + V)}{c} = \frac{d_B}{V} + \frac{d_B}{c}$$

This elapsed time must be added to the time she saw at $t_A = 0$ which was $-\frac{d_B}{c}$. Therefore

$$t_{BA} = \frac{d_B}{V} + \frac{d_B}{c} - \frac{d_B}{c} = \frac{d_B}{V}$$

the same as Bob, i.e. $t_{BB}$.
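The reconciliation can be followed with numbers. This is a sketch under assumed values: units with $c = 1$, a closing speed of $0.6c$ and a separation $d_B = 3$ are hypothetical choices made only so the arithmetic is easy to follow.

```python
import math

c = 1.0      # speed of light in the chosen units (assumption)
V = 0.6      # closing speed (hypothetical value)
d_B = 3.0    # separation measured by Bob at t = 0 (hypothetical value)

gamma = 1.0 / math.sqrt(1.0 - V ** 2 / c ** 2)

d_A = d_B / gamma                         # contracted distance measured by Alice
t_A = d_A / V                             # time on Alice's clock when they pass
doppler = math.sqrt((c + V) / (c - V))    # factor by which Bob's clock is seen to run fast
elapsed_seen = t_A * doppler              # elapsed time Alice sees on Bob's clock
t_BA = elapsed_seen - d_B / c             # add the reading seen at t_A = 0, which was -d_B/c
t_BB = d_B / V                            # time Bob reads on his own clock

print(t_A, elapsed_seen, t_BA, t_BB)
```

The last two printed values agree, which is the statement that what Alice sees on Bob's clock at the passing matches what Bob observes.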


Terrell Effect

The distinction between seeing and observing is also responsible for the Terrell Effect or Rotation.

The Lorentz length contraction is observed. It is a real effect, but taking as an example a cube of size $L$ travelling such that one face is perpendicular to its direction of travel and one face is facing the observer, the length of this second face when measured by the observer is $L\sqrt{1 - \frac{V^2}{c^2}}$.

This measurement must be obtained using a stationary measuring rod (with respect to the

observer) and noting simultaneously where the leading and trailing edges of the cube coincide

with marks on the measuring rod (clearly impossible at high speeds!).

However a remote observer will see something different. The mathematics is greatly simplified

by assuming the observer is so remote that the light rays from the edges of the cube are parallel

when entering the eye, and considering the instant when the middle of the face is at its shortest

distance to the observer so that the ray of light from the middle of the face to the observer is

perpendicular to it.

The observer will see that the length of the face is reduced to $L\sqrt{1 - \frac{V^2}{c^2}}$, but will in addition see the trailing face (which should be invisible by normal optics). It will have a width of $\frac{LV}{c}$, meaning the total length of the seen object is not less than $L$.

A photon emitted from the far edge of the trailing face of the cube will take longer to reach the observer than one from the near edge of that face. However the eye or a camera sees all photons at the same instant they arrive and so, just considering the two photons seen at the same instant, the one from the far edge must have been emitted a short time $\Delta t$ before that from the near edge where $\Delta t = \frac{L}{c}$. So a photon from the far edge will appear at time $t$. The photon from the near edge that appears at time $t$ will have come from a point that has moved during $\Delta t$ a distance $\frac{VL}{c}$. Since this length $L$ between the front and back edges is not moving along its length relative to the observer there is no length contraction for this face.

The result is that to the observer the cube appears to have rotated through an angle $\theta = \sin^{-1}\frac{V}{c}$.

The total seen length is $\frac{VL}{c} + L\sqrt{1 - \frac{V^2}{c^2}}$, which is approximately $L$ for small and large values of $V$ and about $1.4L$ for $\frac{V}{c}$ in the range 0.7 to 0.8, so the total length increases slightly rather than contracts.

For a value $V = c$ the rotation would appear to be 90 degrees – the observer would only see the trailing face.
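The seen width and apparent rotation can be tabulated for a few speeds. This is a minimal sketch, assuming a cube of unit side and speeds given as $\beta = V/c$; the function names are illustrative.

```python
import math


def seen_width(beta: float, side: float = 1.0) -> float:
    """Trailing-face width plus contracted side-face width, as seen by a remote observer."""
    return side * beta + side * math.sqrt(1.0 - beta ** 2)


def apparent_rotation_degrees(beta: float) -> float:
    """Apparent rotation angle, theta = arcsin(V/c), in degrees."""
    return math.degrees(math.asin(beta))


for beta in (0.0, 0.3, 0.7, 0.8, 1.0):
    print(beta, round(seen_width(beta), 3), round(apparent_rotation_degrees(beta), 1))
```

The width is the same as the projected width of a unit cube rotated by the angle $\theta$, namely $\sin\theta + \cos\theta$, which is why it peaks near $1.4L$.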

This means that if the cube is replaced by a sphere, the sphere does not appear as an ellipsoid as

proposed by Einstein with the minor axis in the direction of travel, but as a rotated sphere as

proposed by Penrose.


Relativistic Outflows

An example of the application of special relativity is relativistic outflows. These are jets of material (typically hydrogen) that originate in neutron stars or black holes where material is falling into a deep potential well (so general relativity applies). However the jets can be analysed using only special relativity.

Examples are found across the electromagnetic spectrum – radio, light, X-rays and gamma rays – so there are several different sources. The jets are emitted in two opposite directions and may appear to have superluminal speeds (faster than light). This is again the result of seeing rather than observing.

The jet originates from a core (assumed to be a neutron star or black hole), and the jet is called a

lobe which ends in a hot spot where the beam broadens as if the jet is impacting a surrounding

fluid.

Being relativistic the Lorentz contraction and time dilation are obviously important, but so are

several other effects.

In the following Frame A refers to an observer on Earth and Frame B to a frame moving at the

same speed 𝑉 as the material in the jet. The relative motion of the core of the jet and the Earth is

ignored, as are effects due to general relativity.

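The time measured by the observer is $t_A = \gamma\left(t_B + \frac{Vx_B}{c^2}\right)$ where the x axis is aligned with the centre line of the lobe. Distances will be differentiated with respect to time so

$$\mathrm{d}t_A = \mathrm{d}\left(\gamma\left(t_B + \frac{Vx_B}{c^2}\right)\right) = \gamma\left(\mathrm{d}t_B + \frac{V\,\mathrm{d}x_B}{c^2}\right)$$

(note that d implies an infinitesimal change), and

$$x_A = \gamma(x_B + Vt_B)$$

$$v_{xA} = \frac{\mathrm{d}x_A}{\mathrm{d}t_A} = \frac{\mathrm{d}(\gamma(x_B + Vt_B))}{\gamma\left(\mathrm{d}t_B + \frac{V\,\mathrm{d}x_B}{c^2}\right)} = \frac{\gamma(\mathrm{d}x_B + V\,\mathrm{d}t_B)}{\gamma\left(\mathrm{d}t_B + \frac{V\,\mathrm{d}x_B}{c^2}\right)} = \frac{\mathrm{d}x_B + V\,\mathrm{d}t_B}{\mathrm{d}t_B + \frac{V\,\mathrm{d}x_B}{c^2}} = \frac{\frac{\mathrm{d}x_B}{\mathrm{d}t_B} + V}{1 + \frac{V}{c^2}\frac{\mathrm{d}x_B}{\mathrm{d}t_B}} = \frac{v_{xB} + V}{1 + \frac{Vv_{xB}}{c^2}}$$

There will also be velocities in the y and z directions since the jets expand slightly with distance. Since $y_A = y_B$,

$$v_{yA} = \frac{\mathrm{d}y_A}{\mathrm{d}t_A} = \frac{\mathrm{d}y_B}{\gamma\left(\mathrm{d}t_B + \frac{V\,\mathrm{d}x_B}{c^2}\right)} = \frac{\frac{\mathrm{d}y_B}{\mathrm{d}t_B}}{\gamma\left(1 + \frac{V}{c^2}\frac{\mathrm{d}x_B}{\mathrm{d}t_B}\right)} = \frac{v_{yB}}{\gamma\left(1 + \frac{Vv_{xB}}{c^2}\right)}$$

and since $z_A = z_B$,

$$v_{zA} = \frac{\mathrm{d}z_A}{\mathrm{d}t_A} = \frac{\mathrm{d}z_B}{\gamma\left(\mathrm{d}t_B + \frac{V\,\mathrm{d}x_B}{c^2}\right)} = \frac{\frac{\mathrm{d}z_B}{\mathrm{d}t_B}}{\gamma\left(1 + \frac{V}{c^2}\frac{\mathrm{d}x_B}{\mathrm{d}t_B}\right)} = \frac{v_{zB}}{\gamma\left(1 + \frac{Vv_{xB}}{c^2}\right)}$$

The expanding jet is assumed to be contained within a small angle $\theta_B$ with the x axis so that the angle is defined by $\tan\theta_B = \frac{v_{yB}}{v_{xB}} = \frac{v_{zB}}{v_{xB}}$, the beam assumed to have a circular cross-section.

Then the observed angle is given by

$$\tan\theta_A = \frac{v_{yA}}{v_{xA}} = \frac{\dfrac{v_{yB}}{\gamma\left(1 + \frac{Vv_{xB}}{c^2}\right)}}{\dfrac{v_{xB} + V}{1 + \frac{Vv_{xB}}{c^2}}} = \frac{v_{yB}}{\gamma(v_{xB} + V)} = \frac{v_B\sin\theta_B}{\gamma(v_B\cos\theta_B + V)}$$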


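The velocity $v_B$ can be approximated by $c$ since the value is very close to $c$

$$\tan\theta_A = \frac{v_B\sin\theta_B}{\gamma(v_B\cos\theta_B + V)} = \frac{c\sin\theta_B}{\gamma(c\cos\theta_B + V)} = \frac{\sin\theta_B}{\gamma\left(\cos\theta_B + \frac{V}{c}\right)}$$

If the jet is directed towards the Earth at velocity $V$, and a photon is emitted from a particle in the beam, the photon can be directed in any direction. If it is assumed that the photon is travelling perpendicular to the beam’s direction (the x axis), $\theta_B = \frac{\pi}{2}$.

This means

$$\tan\theta_A = \frac{\sin\theta_B}{\gamma\left(\cos\theta_B + \frac{V}{c}\right)} = \frac{\sin\frac{\pi}{2}}{\gamma\left(\cos\frac{\pi}{2} + \frac{V}{c}\right)} = \frac{1}{\gamma\frac{V}{c}} = \frac{c}{V\gamma}$$

As $V \to c$, $\tan\theta_A \to \frac{1}{\gamma}$ but $\gamma \to \infty$ so $\theta_A \to 0$.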

This means that all the photons emitted in the hemisphere directed towards the observer are concentrated in a narrow cone of full opening angle $\theta_A = \frac{2}{\gamma}$ (twice the angle found above). If the observer is within the cone the beam will appear to be exceptionally bright since half the total emission is concentrated in this narrow angle – the beam is invisible outside of the cone. This is called radiation beaming.

A similar effect would be seen from a spaceship travelling at relativistic speeds – the whole

visible universe forward of the ship would appear to be compressed into a small circle whose

size decreases with increasing speed.

The increase in received flux due to relativistic beaming is calculated by assuming that the flux that would be received, $F = \frac{L}{4\pi d^2}$, where $L$ is the luminosity and $d$ is the distance, is concentrated into a solid angle $\Omega$, so replacing $4\pi$ by $\Omega$ gives $F = \frac{L}{\Omega d^2}$ and the beaming factor $b$ is given by $b = \frac{4\pi}{\Omega}$.

The solid angle $\Omega$ of a cone with an angle $\theta_A$ can be calculated using spherical co-ordinates:

$$\Omega = \int_0^{2\pi}\mathrm{d}\phi\int_0^{\theta_A/2}\sin\theta\,\mathrm{d}\theta = 2\pi\left[-\cos\theta\right]_0^{\theta_A/2} = 2\pi\left(1 - \cos\frac{\theta_A}{2}\right) \cong \frac{\pi\theta_A^2}{4}$$

for small $\theta_A$, so

$$b = \frac{4\pi \cdot 4}{\pi\theta_A^2} = \frac{16\pi}{\pi\left(\frac{2}{\gamma}\right)^2} = 4\gamma^2$$

The received flux is therefore $F = \frac{4\gamma^2 L}{4\pi d^2} = \frac{\gamma^2 L}{\pi d^2}$.

This means there is a very large increase in flux for large values of $\gamma$ – for $V = 0.96c$ the beaming factor is 51. It also means there is a very large decrease in flux from the receding jet.
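The quoted beaming factor can be reproduced, and the small-angle approximation checked against the exact cone solid angle. This is a sketch, assuming the cone's full opening angle is $2/\gamma$ as in the text; the function names are illustrative.

```python
import math


def lorentz_factor(beta: float) -> float:
    return 1.0 / math.sqrt(1.0 - beta ** 2)


def beaming_factor(beta: float) -> float:
    """Small-angle result b = 4 * gamma^2."""
    return 4.0 * lorentz_factor(beta) ** 2


def beaming_factor_exact_cone(beta: float) -> float:
    """b = 4*pi / Omega using the exact solid angle of a cone of full angle 2/gamma."""
    theta_a = 2.0 / lorentz_factor(beta)
    omega = 2.0 * math.pi * (1.0 - math.cos(theta_a / 2.0))
    return 4.0 * math.pi / omega


b_approx = beaming_factor(0.96)
b_exact = beaming_factor_exact_cone(0.96)
print(round(b_approx, 2), round(b_exact, 2))
```

At $V = 0.96c$ the two values differ by under one per cent, so the small-angle form is adequate at this speed.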


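If two photons are emitted from the jet towards the observer, their source travelling at an angle $\theta$ to the line of sight with speed $V$ and the time between the emissions being $\Delta t_{emA}$, the second photon will be emitted at a distance $\Delta d_B = V\Delta t_{emA}\cos\theta$ closer to the observer than the first photon. The time difference between the two photons arriving is

$$\Delta t_{recA} = \Delta t_{emA} - \frac{\Delta d_B}{c} = \Delta t_{emA}\left(1 - \frac{V\cos\theta}{c}\right).$$

If $\theta$ is very small, as it will be in the case of a jet directed towards the observer, $\cos\theta \cong 1$ and

$$\Delta t_{recA} = \Delta t_{emA}\left(1 - \frac{V}{c}\right) = \Delta t_{emA}\left(1 - \left(1 - \frac{1}{2\gamma^2(V)}\right)\right) = \frac{\Delta t_{emA}}{2\gamma^2(V)}$$

using $\frac{V}{c} \cong 1 - \frac{1}{2\gamma^2(V)}$, which follows from $(1 + x)^{\frac{1}{2}} \cong 1 + \frac{x}{2}$ for $x \ll 1$.

If the observer measures the transverse distance $V\Delta t_{emA}\sin\theta$ between the two emissions and the difference in arrival time $\Delta t_{recA}$ in order to calculate $V_A$, the apparent velocity,

$$V_A = \frac{V\Delta t_{emA}\sin\theta}{\Delta t_{recA}} = \frac{V\Delta t_{emA}\sin\theta}{\Delta t_{emA}\left(1 - \frac{V}{c}\cos\theta\right)} = \frac{V\sin\theta}{1 - \frac{V}{c}\cos\theta} \cong \frac{V\theta}{1 - \frac{V}{c}}$$

for small $\theta$.

Thus if $\theta > \frac{1 - \frac{V}{c}}{\frac{V}{c}}$ then $V_A > c$, i.e. it will appear that the jet has a superluminal speed. For example, using the full expression, if $V = 0.92c$ and $\theta = 70°$ then $V_A = 1.26c$, i.e. superluminal.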

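The maximum apparent speed occurs when $\cos\theta = \frac{V}{c}$ and $\sin\theta = \sqrt{1 - \left(\frac{V}{c}\right)^2} = \frac{1}{\gamma(V)}$ so

$$V_{A\,MAX} = \frac{V\frac{1}{\gamma(V)}}{1 - \frac{V}{c}\frac{V}{c}} = \frac{V}{\gamma(V)\left(1 - \left(\frac{V}{c}\right)^2\right)} = \frac{V\gamma^2(V)}{\gamma(V)} = \gamma(V)V$$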
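The apparent speed formula and its maximum can be checked against the worked example. This is a sketch with speeds in units of $c$; the function names are illustrative.

```python
import math


def apparent_speed(beta: float, theta_deg: float) -> float:
    """Apparent transverse speed V_A/c for true speed beta at angle theta to the line of sight."""
    theta = math.radians(theta_deg)
    return beta * math.sin(theta) / (1.0 - beta * math.cos(theta))


def max_apparent_speed(beta: float) -> float:
    """Maximum apparent speed gamma * beta, reached when cos(theta) = beta."""
    return beta / math.sqrt(1.0 - beta ** 2)


example = apparent_speed(0.92, 70.0)
best_angle_deg = math.degrees(math.acos(0.92))
peak = apparent_speed(0.92, best_angle_deg)

print(round(example, 2))   # 1.26, matching the example in the text
print(round(peak, 4), round(max_apparent_speed(0.92), 4))
```

Evaluating the full formula at the angle where $\cos\theta = V/c$ gives the same value as $\gamma(V)V$.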


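In practice the apparent speed can be measured as $V_A$ but the angle $\theta$ and actual speed $V$ are unknown. However a minimum value for the speed can be calculated (the following calculates the minimum value of $\gamma$ from which the speed can be calculated). Starting from $V_A = \frac{V\sin\theta}{1 - \frac{V}{c}\cos\theta}$,

$$\left(1 - \frac{V}{c}\cos\theta\right)V_A = V\sin\theta$$

$$1 - \frac{V}{c}\cos\theta = \frac{c}{V_A}\frac{V}{c}\sin\theta$$

$$\frac{c}{V_A}\frac{V}{c}\sin\theta + \frac{V}{c}\cos\theta = 1$$

$$\frac{V}{c} = \frac{1}{\frac{c}{V_A}\sin\theta + \cos\theta}$$

$$\frac{\mathrm{d}}{\mathrm{d}\theta}\left(\frac{V}{c}\right) = -\frac{1}{\left(\frac{c}{V_A}\sin\theta + \cos\theta\right)^2}\left(\frac{c}{V_A}\cos\theta - \sin\theta\right)$$

This will be zero for a minimum of $\frac{V}{c}$ which is when $\frac{c}{V_A}\cos\theta - \sin\theta = 0$, i.e. $\tan\theta = \frac{c}{V_A}$.

Using $\cos\theta = \frac{1}{\sqrt{1 + \tan^2\theta}}$ and $\frac{\sin\theta}{\cos\theta} = \tan\theta$,

$$\frac{V}{c} = \frac{1}{\cos\theta\left(\frac{c}{V_A}\frac{\sin\theta}{\cos\theta} + 1\right)} = \frac{\sqrt{1 + \tan^2\theta}}{\frac{c}{V_A}\tan\theta + 1} = \frac{\sqrt{1 + \left(\frac{c}{V_A}\right)^2}}{\frac{c}{V_A}\frac{c}{V_A} + 1} = \frac{1}{\sqrt{1 + \left(\frac{c}{V_A}\right)^2}}$$

Substituting into $\gamma(V) = \frac{1}{\sqrt{1 - \left(\frac{V}{c}\right)^2}}$ gives

$$\gamma(V) = \frac{1}{\sqrt{1 - \frac{1}{1 + \left(\frac{c}{V_A}\right)^2}}} = \sqrt{1 + \left(\frac{V_A}{c}\right)^2}$$

which is the minimum value of $\gamma(V)$. The maximum cannot be determined.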
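The lower bound can be evaluated for a measured apparent speed. This is a sketch using $\gamma_{min} = \sqrt{1 + (V_A/c)^2}$, which follows from substituting the minimum $V/c = 1/\sqrt{1 + (c/V_A)^2}$ into the definition of $\gamma$; the function names are illustrative.

```python
import math


def minimum_lorentz_factor(beta_apparent: float) -> float:
    """Smallest gamma consistent with an apparent speed V_A/c."""
    return math.sqrt(1.0 + beta_apparent ** 2)


def minimum_speed(beta_apparent: float) -> float:
    """Smallest true V/c consistent with an apparent speed V_A/c."""
    return beta_apparent / math.sqrt(1.0 + beta_apparent ** 2)


# Using the apparent speed from the earlier example (V_A = 1.26c).
gamma_min = minimum_lorentz_factor(1.26)
beta_min = minimum_speed(1.26)
print(round(gamma_min, 3), round(beta_min, 3))
```

The bound is consistent with the earlier example: the true speed of $0.92c$ used there is above the minimum speed obtained from $V_A = 1.26c$.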


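There will also be a Doppler effect due to special relativity.

$$\Delta t_{obA} = \gamma\Delta t_{emB}\left(1 - \frac{V\cos\theta}{c}\right)$$

since $\Delta t_{obA} = \Delta t_{emA}\left(1 - \frac{V\cos\theta}{c}\right)$ and $\Delta t_{emA} = \gamma\Delta t_{emB}$, where $\Delta t_{obA}$ is the observed time between the two photons arriving measured in the observer’s frame, $\Delta t_{emA}$ is the time between the two photons being emitted measured in the observer’s frame, and $\Delta t_{emB}$ is the duration measured in the rest frame (moving at the speed of the jet).

The Doppler factor is defined as $D = \frac{1}{\gamma\left(1 - \frac{V\cos\theta}{c}\right)}$ and so $\Delta t_{obA} = \frac{\Delta t_{emB}}{D}$, or in terms of frequency $f_{ob} = Df_{em}$ where $f_{ob}$ is the frequency measured by the observer and $f_{em}$ is the frequency emitted as measured in the rest frame. This definition applies to the source moving in any direction with respect to the observer.

For the source moving perpendicular to the line of sight, $\theta = \frac{\pi}{2}$, $D = \frac{1}{\gamma}$, and $f_{ob} = \frac{1}{\gamma}f_{em}$.

For the source moving towards the observer, $\theta = 0$ and $D = \frac{1}{\gamma\left(1 - \frac{V}{c}\right)} = \gamma\left(1 + \frac{V}{c}\right)$, which approaches $D = 2\gamma$ and $f_{ob} = 2\gamma f_{em}$ as $V \to c$.

The ratio of received frequencies between the two jets can be very large. If → means away from the Earth and ← means towards the Earth

$$\frac{f_{ob\leftarrow}}{f_{ob\rightarrow}} = \frac{D_{\leftarrow}f_{em}}{D_{\rightarrow}f_{em}} = \frac{\gamma\left(1 + \frac{V}{c}\right)}{\gamma\left(1 - \frac{V}{c}\right)} = \frac{1 + \frac{V}{c}}{1 - \frac{V}{c}}$$

The combination of beaming and Doppler increases the apparent luminosity of the source by a factor of $\gamma^{3+\alpha}(V)$ where $\alpha$ is the power law factor for the spectral distribution, but decreases the luminosity of the receding jet by the same factor.

$$\frac{L_{ob\leftarrow}}{L_{ob\rightarrow}} = \left(\frac{D_{\leftarrow}}{D_{\rightarrow}}\right)^{3+\alpha} = \left(\frac{1 + \frac{V}{c}}{1 - \frac{V}{c}}\right)^{3+\alpha}$$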
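The Doppler factor and the special cases above can be evaluated with a short function. This is a sketch, assuming $\beta = V/c$ and the angle $\theta$ in radians measured from the line of sight; the function name and the sample $\beta$ are illustrative.

```python
import math


def doppler_factor(beta: float, theta: float) -> float:
    """D = 1 / (gamma * (1 - beta * cos(theta))) for a source at angle theta to the line of sight."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return 1.0 / (gamma * (1.0 - beta * math.cos(theta)))


beta = 0.96
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)

d_towards = doppler_factor(beta, 0.0)                # approaching jet
d_sideways = doppler_factor(beta, math.pi / 2.0)     # transverse motion
d_away = doppler_factor(beta, math.pi)               # receding jet

print(d_towards, d_sideways, d_away, d_towards / d_away)
```

The ratio of the approaching to the receding factor reduces to $(1 + V/c)/(1 - V/c)$, which is already 49 at this speed.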


Space-time

In Newtonian mechanics there are three spatial dimensions and a time variable.

A point in space is called a location and has a unique set of values in a specified reference frame, i.e. $(x, y, z)$, $(x^1, x^2, x^3)$ or $(e^1, e^2, e^3)$ – note that the superscripts are not powers but are dimension numbers; they would normally be subscripts but superscripts are used in relativity.

A distance is the straight line distance between two locations and is represented by $\Delta x = x_2 - x_1$ in one dimension or $r = \sqrt{(\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2}$ in three. This has the same value in all frames of reference in Newtonian mechanics.

In Newtonian mechanics time is a variable – a point in time is called an instant, and the “length” between two instants in time is called a duration.

Space-time treats time as another dimension. It is often the first of the four dimensions, but is numbered zero giving $(e^0, e^1, e^2, e^3)$ where $e^0$ is time and $e^1, e^2, e^3$ are $x, y, z$. It may also be written as $(t, x^1, x^2, x^3)$ or $(t, x, y, z)$ or in spherical coordinates $(t, r, \theta, \phi)$.

Space has dimensions of L and time dimensions of T. It is more convenient to use the same dimension for all four values, which can be achieved by multiplying time by $c$, the speed of light, which is a constant in a vacuum – $ct$ has dimensions of LT⁻¹T or L.

A point in space-time is called an event – it has an instant and a location, both of which depend on the frame of reference being used, i.e. $(ct, x, y, z)$ or $(ct, x^1, x^2, x^3)$. This is called a four-position since four values identify it.

The “distance” between two events in space-time is called the separation and has the value $s = \sqrt{(c\Delta t)^2 - r^2}$. In special relativity this has the same value in all inertial frames of reference, and is said to be invariant.

An event can also be written as a column vector $\begin{bmatrix} ct \\ x \\ y \\ z \end{bmatrix}$ with the top row having the index number 0 rather than the more normal 1.

This can also be written as $\begin{bmatrix} e^0 \\ e^1 \\ e^2 \\ e^3 \end{bmatrix}$ which can also be written as $\begin{bmatrix} ct \\ x^1 \\ x^2 \\ x^3 \end{bmatrix}$.

This is important because the Lorentz transformations are often written this way.

An event $[e^\nu]$ has a time and location which is called a four-position in space-time, i.e. $[e^\nu] \equiv \begin{bmatrix} e^0 \\ e^1 \\ e^2 \\ e^3 \end{bmatrix}$, i.e. the $\nu$ represents all four values 0, 1, 2 and 3. $[e^\nu]$ is a 4D position vector.

Note that the coordinates $(x, y, z)$ indicate that Cartesian coordinates are being used. $(x^1, x^2, x^3)$ also indicate Cartesian coordinates, but the superscript allows an arbitrary axis to be specified as in $x^i$, a summation such as $\sum_i x^i$, or more than three axes. $(e^1, e^2, e^3)$ or $e^i$ indicates an arbitrary set of axes which may be Cartesian, polar, spherical or some obscure curvilinear set.

Latin superscripts such as $i$ and $j$ in Euclidean space are replaced by Greek superscripts such as $\mu$ and $\nu$ in relativity when $ct$ is also a dimension, i.e. Latin start at 1, Greek at 0.


Position Vectors

A position vector is an imaginary line from the origin of a frame of reference to a point within it. The line can be defined in several different ways depending on which is most convenient.

Two common ways use Cartesian coordinates (𝑥, 𝑦, 𝑧) and spherical coordinates (𝑟, 𝜃, 𝜙). Both

can be converted into a vector by changing the brackets to square ones. Although apparently

minor this allows the vector to be transformed, either to a different form or a different location.

A vector has both magnitude and direction, and this is clearest using spherical coordinates

where the 𝑟 is the magnitude and the 𝜃, 𝜙 the direction noting that 𝜃 is the angle between the

vector and the z axis (the polar angle), and 𝜙 is the angle between the component of the vector

on the x-y plane and the x axis (the azimuthal angle).

Vectors are represented by upright bold characters while scalars (magnitudes) are represented by italic characters. Thus $\mathbf{r}$ is a vector, $r$ is a scalar (magnitude). The magnitude of a vector can be changed by multiplying or dividing it by a scalar, i.e. $2\mathbf{r}$ has the magnitude $2r$. (Adding a scalar to a vector does not have any physical meaning.) This leads to a convenient notation for vectors. A unit vector is a vector with unit magnitude, and is usually represented as $\hat{\mathbf{r}}$. The vector $\mathbf{r}$ can then be written as $|\mathbf{r}|\hat{\mathbf{r}}$ where $|\mathbf{r}|$ is the magnitude of $\mathbf{r}$, i.e. $r$, giving $r\hat{\mathbf{r}}$.

Three particular unit vectors are $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$ along the three Cartesian axes. These are sometimes written as $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$.

The projection of a vector onto an axis or a plane is given by $r\cos\alpha$ where $\alpha$ is the angle between the vector and the axis or plane. The projection is a magnitude, but adding the direction converts it into a component. So $(r\cos\theta)\hat{\mathbf{z}}$ is the component along the z axis.

The projection onto the xy plane is $r\cos\left(\frac{\pi}{2} - \theta\right) = r\sin\theta$ and since this is at an angle $\phi$ to the x axis the x component is $r\sin\theta\cos\phi\,\hat{\mathbf{x}}$ and the y component is $r\sin\theta\cos\left(\frac{\pi}{2} - \phi\right)\hat{\mathbf{y}}$ or $r\sin\theta\sin\phi\,\hat{\mathbf{y}}$.
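The three components derived above can be collected into a small conversion routine. This is a sketch, assuming $\theta$ is the polar angle from the z axis and $\phi$ the azimuthal angle from the x axis, both in radians; the function name and sample values are illustrative.

```python
import math


def spherical_to_cartesian(r: float, theta: float, phi: float):
    """Return (x, y, z) for magnitude r, polar angle theta and azimuthal angle phi."""
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z


# Hypothetical vector: magnitude 2, 60 degrees from the z axis, 30 degrees from the x axis.
x, y, z = spherical_to_cartesian(2.0, math.radians(60.0), math.radians(30.0))
print(x, y, z, math.sqrt(x * x + y * y + z * z))
```

The last printed value recovers the magnitude $r$, as expected for components of the same vector.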

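Vectors are normally written as column matrices, each item being a scalar, and the order being predefined as in $x, y, z$ or $r, \theta, \phi$. The conversion from one to the other can then be written as a matrix equation containing a square transformation matrix. The conversion from spherical to Cartesian can be written as

$$\begin{bmatrix} \sin\theta\cos\phi & 0 & 0 \\ 0 & \frac{r\sin\theta\sin\phi}{\theta} & 0 \\ 0 & 0 & \frac{r\cos\theta}{\phi} \end{bmatrix} \begin{bmatrix} r \\ \theta \\ \phi \end{bmatrix} = \begin{bmatrix} \sin\theta\cos\phi\, r + 0\theta + 0\phi \\ 0r + \frac{r\sin\theta\sin\phi}{\theta}\theta + 0\phi \\ 0r + 0\theta + \frac{r\cos\theta}{\phi}\phi \end{bmatrix} = \begin{bmatrix} r\sin\theta\cos\phi \\ r\sin\theta\sin\phi \\ r\cos\theta \end{bmatrix}$$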

Transformation matrices can be calculated for changing the magnitude and/or the direction of

the vector, or changing the position of the origin or rotating the axes etc. Such matrices are

called tensors. The inverse transformation is found by inverting the matrix (not much help in

this case because the values of 𝜃 and 𝜙 must be found in terms of 𝑥, 𝑦 and 𝑧, but a start can be

made by taking the reciprocal of each term on the diagonal – this is only valid for diagonal

matrices).

An important example in special relativity is the Lorentz matrix $[\Lambda^\mu{}_\nu]$ which transforms the coordinates of an event between two frames moving with a relative velocity $V$ along the x axis.


Lorentz Transformations

The Lorentz transformations between two frames of reference, with B having a speed $V$ relative to A along the x axis, can be summarised in matrix form, noting that $ct$ is a length.

$$\begin{bmatrix} ct_B \\ x_B \\ y_B \\ z_B \end{bmatrix} = \begin{bmatrix} \gamma(V) & -\frac{\gamma(V)V}{c} & 0 & 0 \\ -\frac{\gamma(V)V}{c} & \gamma(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} ct_A \\ x_A \\ y_A \\ z_A \end{bmatrix}$$

This is the matrix form of $\Delta t_B = \gamma(V)\left(\Delta t_A - \frac{V\Delta x_A}{c^2}\right)$ and $\Delta x_B = \gamma(V)(\Delta x_A - V\Delta t_A)$.

The square Lorentz transformation matrix can be written as $[\Lambda^\mu{}_\nu]$ where $\mu$ represents the row number (which starts at 0) and $\nu$ the column number (which also starts at 0), $\mu, \nu = 0, 1, 2, 3$. Note that the positions of $\mu$ and $\nu$ are critical to the interpretation. $\Lambda^0{}_2$ is top row, third column.

The equation can be written as

$$[e_B^\mu] = [\Lambda^\mu{}_\nu][e_A^\nu] \quad\text{or}\quad e_B^\mu = \sum_{\nu=0}^{3}\Lambda^\mu{}_\nu e_A^\nu, \quad \mu = 0, 1, 2, 3$$

This means that $e_B^2 \equiv y_B = \Lambda^2{}_0 e_A^0 + \Lambda^2{}_1 e_A^1 + \Lambda^2{}_2 e_A^2 + \Lambda^2{}_3 e_A^3 = 0ct_A + 0x_A + 1y_A + 0z_A$, or the third element down in the first vector is calculated by multiplying the first element in the third row of the square matrix with the first element in the column matrix, then second element in the third row times second in the column matrix, third times third, and fourth times fourth and summing the four results.

Note that there are in effect four different equations, one for each value of $\mu$. Also $\mu$ occurs once on the LHS and just once on the RHS inside the summation. $\mu$ is called a free index because we are free to choose which value we give it (i.e. which of the four equations we are interested in).

The index $\nu$ is called a dummy index. It does not occur on the LHS, but twice in the term on the RHS ($\Lambda^\mu{}_\nu e_A^\nu$ – once as a subscript and once as a superscript), and is summed over its range of values so also appears in the summation sign $\sum_{\nu=0}^{3}$.

If this was being programmed on a computer, 𝜇 would be a parameter in a function call, and 𝜈

an index variable controlling a loop.
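The programming analogy can be made concrete. In this sketch the free index `mu` is a function parameter and the dummy index `nu` is the loop variable, as described above. The matrix is written with $\beta = V/c$, and the event coordinates are hypothetical values chosen for illustration.

```python
import math


def lorentz_matrix(beta: float):
    """4x4 Lorentz matrix for frame B moving at beta = V/c along the x axis of frame A."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_component(mu: int, lam, event_a) -> float:
    """e_B^mu = sum over nu of Lambda^mu_nu * e_A^nu (mu is free, nu is the dummy index)."""
    total = 0.0
    for nu in range(4):
        total += lam[mu][nu] * event_a[nu]
    return total


lam = lorentz_matrix(0.6)
event_a = [5.0, 3.0, 2.0, 1.0]   # hypothetical event (ct, x, y, z) in frame A
event_b = [transform_component(mu, lam, event_a) for mu in range(4)]
print(event_b)
```

Calling the function once for each value of `mu` corresponds to the four separate equations, and the y and z components pass through unchanged.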

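Since movement is normally restricted to the x axis this can be simplified to

$$\begin{bmatrix} ct_B \\ x_B \end{bmatrix} = \begin{bmatrix} \gamma(V) & -\frac{\gamma(V)V}{c} \\ -\frac{\gamma(V)V}{c} & \gamma(V) \end{bmatrix} \begin{bmatrix} ct_A \\ x_A \end{bmatrix}$$

since the values of y and z are unchanged.

The inverse transform can be found by inverting the square matrix. The inverse of $\begin{bmatrix} A & B \\ C & D \end{bmatrix}$ is $\frac{1}{AD - BC}\begin{bmatrix} D & -B \\ -C & A \end{bmatrix}$ so if

$$[\Lambda] = \begin{bmatrix} \gamma(V) & -\frac{\gamma(V)V}{c} \\ -\frac{\gamma(V)V}{c} & \gamma(V) \end{bmatrix} \quad\text{then}\quad [\Lambda]^{-1} = \begin{bmatrix} \gamma(V) & +\frac{\gamma(V)V}{c} \\ +\frac{\gamma(V)V}{c} & \gamma(V) \end{bmatrix}$$

since

$$AD - BC = \gamma(V)\gamma(V) - \left(-\frac{\gamma(V)V}{c}\right)\left(-\frac{\gamma(V)V}{c}\right) = \gamma^2(V)\left(1 - \frac{V^2}{c^2}\right) = \frac{1}{1 - \frac{V^2}{c^2}}\left(1 - \frac{V^2}{c^2}\right) = 1$$

The only change is in the sign of $V$.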
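That the inverse differs only in the sign of $V$ can be confirmed by multiplying the two matrices. This is a sketch using plain nested lists with $\beta = V/c$; the helper names are illustrative.

```python
import math


def lorentz_2x2(beta: float):
    """2x2 Lorentz matrix acting on (ct, x) for relative speed beta = V/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[gamma, -gamma * beta], [-gamma * beta, gamma]]


def matmul_2x2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


forward = lorentz_2x2(0.6)
inverse = lorentz_2x2(-0.6)      # same matrix with the sign of V reversed
product = matmul_2x2(forward, inverse)
determinant = forward[0][0] * forward[1][1] - forward[0][1] * forward[1][0]
print(product, determinant)
```

The product is the identity matrix to within rounding, and the determinant $AD - BC$ evaluates to 1.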

Note that there is a convention that indices that run 1,2, 3, etc. are given lower case Latin letters such as 𝑖, 𝑗, 𝑘 while indices that run 0, 1, 2 etc. are given lower case Greek letters such as 𝜇, 𝜈.


Minkowski Diagram

A Minkowski diagram is a plot of space-time. There are four axes – $ct$ and the three spatial dimensions, but it is obviously easier to draw with one spatial dimension. $ct$ has the dimension of length so is compatible with the spatial axes.

$$ct_B = \gamma(V)\left(ct_A - \frac{Vx_A}{c}\right)$$

which is $t_B = \gamma(V)\left(t_A - \frac{Vx_A}{c^2}\right)$ multiplied by $c$, and

$$x_B = \gamma(V)(x_A - Vt_A)$$

Setting $ct_B = 0$ in the first gives $ct_A - \frac{Vx_A}{c} = 0$ or $ct_A = \frac{V}{c}x_A$, and setting $x_B = 0$ in the second gives $x_A - Vt_A = 0$ or $x_A = Vt_A$. The latter can be written as $ct_A = \frac{c}{V}x_A$. This gives two straight lines, $ct = \frac{V}{c}x$ which is the $x_B$ axis and $ct = \frac{c}{V}x$ which is the $ct_B$ axis. Both lines pass through the origin. The first has slope $\frac{V}{c}$ and the second $\frac{c}{V}$ so the slope of $x_B$ is less than 1, and of $ct_B$ greater than 1.

The line $ct = \frac{c}{c}x$ (i.e. $V$ is replaced by $c$) can be written as $ct = x$ and this is a straight line through the origin with slope 1.

As $V$ increases the two lines $ct = \frac{V}{c}x$ and $ct = \frac{c}{V}x$ approach $ct = x$. Note that the angle $\theta$ between the $x_B$ and $x_A$ axes is identical to the angle between the $ct_B$ and $ct_A$ axes.
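The equality of the two angles follows from the slopes $V/c$ and $c/V$ and can be checked numerically. This is a sketch, assuming $\beta = V/c$ with $0 < \beta \le 1$; the function name is illustrative.

```python
import math


def axis_angles_degrees(beta: float):
    """Angles of the frame-B axes on a Minkowski diagram drawn in frame A.

    Returns (angle of x_B above the x_A axis, angle of ct_B away from the ct_A axis).
    """
    x_b_from_x_a = math.degrees(math.atan(beta))                    # line ct = (V/c) x
    ct_b_from_ct_a = 90.0 - math.degrees(math.atan(1.0 / beta))     # line ct = (c/V) x
    return x_b_from_x_a, ct_b_from_ct_a


for beta in (0.1, 0.5, 0.9, 1.0):
    print(beta, axis_angles_degrees(beta))
```

Both angles grow together and reach 45 degrees at $V = c$, where the two axes meet the line $ct = x$.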

Four events are marked. Their coordinates can be found from either the $ct_A$, $x_A$ axes which are perpendicular or the $ct_B$, $x_B$ axes which are at an acute angle to each other.

In the A frame of reference the events occur in the order 0, 2, 3, 1 (up the $ct_A$ axis).

In the B frame of reference 0 and 3 occur simultaneously (same value of $ct_B$), and later 2 and 1 occur simultaneously (further along the $ct_B$ axis). In frame A the events occur in different positions (have different values of $x_A$).

In frame B 0 and 2 occur at the same location (same value of $x_B$), and 3 and 1 occur at another location.

[Figure: Minkowski diagram showing the $ct_A$, $x_A$, $ct_B$ and $x_B$ axes, the line $ct = x$, the two angles $\theta$, and events 0, 1, 2 and 3.]


Causality

Causality means that a second event is caused by a first event. Causality is only possible if the information that the first event has happened can reach the second event at a speed not greater than $c$.

If $x_n > x_0$ the two events can only be causally related if a line of slope 1 from the first event does not pass to the left of the second event on the Minkowski diagram.

A line of slope 1 from an event is called a light cone because in two spatial dimensions it is a conical surface. There is another line of slope −1 for events where $x_n < x_0$ in which case causally related events must be to the right of this line.

Event 1 may be causally related to event 0, event 2 can only be causally related to event 0 if information is travelling at the speed of light in a vacuum, and event 3 cannot be causally related to event 0.

This relationship applies to all possible frames of reference – causality is independent of the frame of reference.

Events 0, 1 and 2 can only cause event 3 if the information travels faster than the speed of light. Event 3 cannot cause events 0, 1 or 2 since they are to the left of a line with slope −1 through 3.

[Figure: Minkowski diagram with $ct$ and $x$ axes showing events 0, 1, 2 and 3.]
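The light-cone test described in this section can be written as a simple comparison of $c\Delta t$ with $|\Delta x|$. This is a sketch in one spatial dimension; the event coordinates are hypothetical values chosen to fall inside, on and outside the light cone, and are not read from the figure.

```python
def can_cause(ct_first: float, x_first: float, ct_second: float, x_second: float) -> bool:
    """True if a signal at speed <= c can travel from the first event to the second."""
    delta_ct = ct_second - ct_first
    delta_x = x_second - x_first
    return delta_ct >= 0 and abs(delta_x) <= delta_ct


origin = (0.0, 0.0)        # hypothetical first event (ct, x)
inside = (4.0, 1.0)        # inside the light cone
on_cone = (2.0, 2.0)       # exactly on the line of slope 1
outside = (1.0, 3.0)       # outside the light cone

print(can_cause(*origin, *inside))    # True
print(can_cause(*origin, *on_cone))   # True, but only for a signal travelling at c
print(can_cause(*origin, *outside))   # False
```

The requirement that $c\Delta t$ is not negative expresses that a cause cannot follow its effect.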


Separation

The minimum distance between two points in three dimensional space is given by

$$r = \sqrt{(\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2}$$

𝑟 has the same value in all frames of reference and is said to be invariant. Note that the

displacement r is a vector, and its value (direction) does depend on the frame of reference and

so is not invariant - the vector value expression would be different if spherical coordinates were used (𝑥, 𝑦, 𝑧) vs (𝑟, 𝜃, 𝜙) assuming a position vector, and have other values if not a position

vector.

The equivalent in space-time is called the separation of two events

$$s = \sqrt{(c\Delta t)^2 - r^2}$$

In one spatial dimension the separation in frame B is

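$$s_B = \sqrt{(c\Delta t_B)^2 - (\Delta x_B)^2}$$

Substituting the Lorentz transformations $\Delta t_B = \gamma(V)\left(\Delta t_A - \frac{V\Delta x_A}{c^2}\right)$ and $\Delta x_B = \gamma(V)(\Delta x_A - V\Delta t_A)$ gives

$$= \sqrt{\left(c\gamma(V)\left(\Delta t_A - \frac{V\Delta x_A}{c^2}\right)\right)^2 - \left(\gamma(V)(\Delta x_A - V\Delta t_A)\right)^2}$$

$$= \sqrt{\gamma^2(V)\left((c\Delta t_A)^2 - 2V\Delta x_A\Delta t_A + \left(\frac{V\Delta x_A}{c}\right)^2 - (\Delta x_A)^2 + 2V\Delta x_A\Delta t_A - (V\Delta t_A)^2\right)}$$

$$= \sqrt{\gamma^2(V)\left((c\Delta t_A)^2 + \left(\frac{V\Delta x_A}{c}\right)^2 - (\Delta x_A)^2 - (V\Delta t_A)^2\right)}$$

$$= \sqrt{\gamma^2(V)\left((c\Delta t_A)^2 - (V\Delta t_A)^2 - \left((\Delta x_A)^2 - \left(\frac{V\Delta x_A}{c}\right)^2\right)\right)}$$

$$= \sqrt{\gamma^2(V)c^2\left((\Delta t_A)^2 - \left(\frac{V\Delta t_A}{c}\right)^2\right) - \gamma^2(V)\left((\Delta x_A)^2 - \left(\frac{V\Delta x_A}{c}\right)^2\right)}$$

$$= \sqrt{\gamma^2(V)c^2(\Delta t_A)^2\left(1 - \left(\frac{V}{c}\right)^2\right) - \gamma^2(V)(\Delta x_A)^2\left(1 - \left(\frac{V}{c}\right)^2\right)}$$

$$= \sqrt{(c\Delta t_A)^2 - (\Delta x_A)^2} \qquad \text{using } \gamma^2(V) = \frac{1}{1 - \left(\frac{V}{c}\right)^2}$$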

= 𝑠𝐴 which is the separation in frame A, the result being that 𝑠 is independent of the relative

speed of the frames of reference.

In practice the square root is often avoided - the term separation often means its square 𝑠2.

$$s^2 = (c\Delta t)^2 - r^2$$

However then it is equally valid to write $s^2 = -(c\Delta t)^2 + r^2$

The first form ensures that the separation of events on the world-line of an object moving at a

speed slower than light is positive. The world-line of an object is its path through space-time.
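The invariance of the separation can also be checked numerically. The following is a minimal sketch, not part of the original notes: the function names `gamma`, `boost` and `separation_squared` and the sample interval are illustrative assumptions, and one spatial dimension is assumed as in the derivation above.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v):
    """Lorentz factor for a relative speed v (m/s)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def boost(dt, dx, v):
    """Transform an interval (dt, dx) from frame A to frame B moving at v along x."""
    g = gamma(v)
    return g * (dt - v * dx / C ** 2), g * (dx - v * dt)


def separation_squared(dt, dx):
    """s^2 = (c dt)^2 - dx^2, using the sign convention of the text."""
    return (C * dt) ** 2 - dx ** 2


# Illustrative interval measured in frame A (values are assumptions).
dt_a, dx_a = 2.0, 3.0e8
for v in (0.1 * C, 0.5 * C, 0.9 * C):
    dt_b, dx_b = boost(dt_a, dx_a, v)
    # The two printed separations should agree to rounding error.
    print(v / C, separation_squared(dt_a, dx_a), separation_squared(dt_b, dx_b))
```

The time and distance between the events change with the speed of frame B, but the two values of $s^2$ should agree to within floating-point rounding.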


The Twin Effect

A common paradox is the twin effect, where one twin remains on Earth and another travels to a remote star at a speed close to that of light and returns. The twin that remains on Earth is always older than the twin that travels when both meet again.

For the twin that remains on Earth taking their frame as the rest frame, the start of the second

twin’s journey as (0,0) and their return as (𝑐𝑇,0) gives a proper time of ∆𝜏 = 𝑇, i.e. the twin on

Earth is 𝑇 older.

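The other twin starts their journey at (0,0), reaches the star at $\left(\frac{cT}{2}, \frac{VT}{2}\right)$, and returns at $(cT, 0)$.

The journey time is $T = \frac{2x}{V}$ where $x$ is the distance (assumed constant) to the star, $\frac{VT}{2}$.

The proper time to the star is

$$\frac{\Delta s}{c} = \frac{1}{c}\sqrt{(c\Delta t)^2 - (\Delta x)^2} = \frac{1}{c}\sqrt{\left(\frac{cT}{2}\right)^2 - \left(\frac{VT}{2}\right)^2} = \frac{T}{2}\sqrt{1 - \left(\frac{V}{c}\right)^2} = \frac{T}{2\gamma(V)}$$

The same applies to the return so the total proper time for the twin is $\frac{2T}{2\gamma(V)} = \frac{T}{\gamma(V)}$.

Since the twin's age will be the twin's proper time, the twin will be $\frac{T}{\gamma(V)}$ older when they return and so will be $T - \frac{T}{\gamma(V)} = T\left(1 - \frac{1}{\gamma(V)}\right)$ younger than the twin that remained.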

Note that the travelling twin must have accelerated when reaching the star in order to return,

not to mention the departure from and arrival back to Earth. The changes in velocity are

assumed to be instantaneous so there is no duration to the acceleration, but even so it is strictly

outside special relativity because of this. The twin that accelerates always arrives back younger

than the one that stayed. The general rule is that the time measured by a non-inertial observer

is less than the proper time as calculated by an inertial observer.
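As a rough numerical illustration of the twin effect, the sketch below evaluates $T/\gamma(V)$ for an assumed journey. The function name `traveller_proper_time` and the values of `T` and `beta` are illustrative choices, not taken from the notes, and the turn-around is treated as instantaneous as in the text.

```python
import math


def traveller_proper_time(T, beta):
    """Proper time T / gamma for the travelling twin.

    T is the time elapsed on Earth and beta = V / c is the cruising speed.
    """
    return T * math.sqrt(1.0 - beta ** 2)


T = 10.0     # years elapsed on Earth (assumed value)
beta = 0.8   # V = 4c/5, so gamma = 5/3
tau = traveller_proper_time(T, beta)
# Prints the traveller's ageing and how much younger they are on return.
print(tau, T - tau)
```

With these assumed values the travelling twin ages about 6 years while the twin on Earth ages 10.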


Minkowski Metric

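From $\Delta s = \sqrt{(c\Delta t)^2 - (\Delta r)^2}$

$(\Delta s)^2 = c\Delta t\,c\Delta t - \Delta x\Delta x - \Delta y\Delta y - \Delta z\Delta z$. This is called a line element which defines the geometry; the equivalent in "normal" space is $(\Delta l)^2 = \Delta x\Delta x + \Delta y\Delta y + \Delta z\Delta z$.

This can be written as

$$(\Delta s)^2 = \sum_{\mu,\nu=0}^{3} \eta_{\mu\nu}\, x^\mu x^\nu$$

where $x^0 = c\Delta t$, $x^1 = \Delta x$, $x^2 = \Delta y$ and $x^3 = \Delta z$.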

The product $x^\mu x^\nu$ has 16 terms because it consists of all combinations of the products of $ct$, $x$, $y$ and $z$ such as $ct\,ct$, $ct\,x$, $ct\,y$, $ct\,z$, $xy$, $xz$ etc. However only the square terms are required, and all except $(ct)^2$ must be negative.

The value of $\eta_{\mu\nu}$ must be 1 if $\mu = \nu = 0$, −1 if $\mu = \nu = 1$, 2 or 3, and 0 if $\mu \neq \nu$. These are encoded in the Minkowski metric which consists of 16 terms where $\mu$, $\nu$ refer to the row, column in the following array starting at 0.

$$[\eta_{\mu\nu}] \equiv \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}$$

[𝜂𝜇𝜈] refers to the whole array, 𝜂𝜇𝜈 to a specific term in the array (the array is a metric tensor

field). The Minkowski metric is sometimes written as (+,−,−,−).

There is another convention which reverses all the signs,

$$\begin{bmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

or (−,+,+,+), which is equally valid; it is important to note which convention is being used. Some texts place $ct$ last instead of first giving (−,−,−,+) or (+,+,+,−). This is a convenient notation to define the system being used. "Normal" space has the metric (+,+,+).

This metric is for orthogonal coordinates (hence 𝑥𝜇𝑥𝜈 is used instead of 𝑒𝜇𝑒𝜈 which would imply it is valid for all coordinate systems). In spherical coordinates (𝑐𝑡, 𝑟, 𝜃, 𝜙) it becomes

$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -r^2 & 0 \\ 0 & 0 & 0 & -r^2\sin^2\theta \end{bmatrix}$$

where $\theta$ is the angle between r and the z axis.

The Minkowski metric is not just a matrix since (Δ𝑠)2 ≠ [𝜂𝜇𝜈]𝑥𝜇𝑥𝜈 , i.e. the LHS is a scalar, the

RHS a 4 by 4 matrix and two column vectors. It is actually a four tensor which has rank 2. The

number of components in a four tensor is 4 raised to a power equal to the rank. Rank 2 is like a

4D matrix, rank 1 a 4D vector with 4 components and rank 0 a scalar. It is even more accurate to

state it is a tensor field (in a similar sense to the electric or magnetic fields) which has the same

value at all positions in the field.

The Minkowski metric is an example of a space-time metric – the one for flat (Euclidean) space-

time (the same as in Newtonian mechanics). A space-time metric defines the geometry of space-

time, but since there are different forms of the metric (one for each type of coordinate system)

for a specified space-time geometry, the space-time geometry does not define the metric.

Conventional axes will have a metric whose off-diagonal terms are zero, but there are

curvilinear axis systems where this is not the case.


Types of Separation

If $s^2 > 0$ two events are time-like separated and may be causally related. One event can be placed at the apex of a light-cone, and the other event will be within it. All observers will agree that $s^2 > 0$. There is a frame of reference where the two events happen at the same location but at different times. In that frame the duration between the two events is $\Delta\tau$, the proper time, and $s^2 = (c\Delta\tau)^2$ since $r^2 = 0$. This means

$$(\Delta\tau)^2 = \frac{s^2}{c^2}$$

or the proper time equals the separation divided by the speed of light, all values being invariant.

The proper time between two events can therefore be calculated by an observer in any frame of

reference if they calculate the separation between the two events.

If 𝑠2 < 0 the events are space-like separated. They cannot be causally related. If one event is

placed at the apex of a light-cone, the other event will be outside the light-cone. There is one

frame of reference where the two events occur at the same time, but in different locations.

If 𝑠2 = 0 the two events are light-like separated. They may be causally related, but the information must have flowed at the speed of light. Importantly the separation between any two

points on a light ray (including its start and end) is zero, and if 𝑠2 = 0 then (Δ𝜏)2 = 0. The

proper time between the creation and destruction of a photon is zero, even if it has crossed the universe, so loosely speaking creation and destruction are the only events experienced by the photon and no proper time elapses between them (strictly, a photon has no rest frame).

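If event 1 occurs at $(t, x) = (5,\ 6\times10^8)$ and event 2 occurs at $(7,\ 18\times10^8)$ in SI units as measured in frame A, then in frame B travelling at $4c/5$ relative to frame A, the time between the events in frame B can be calculated either by calculating the time of each event in B or the time duration between the events in B. Calculating the times and then the difference gives

$$\gamma(V) = \frac{1}{\sqrt{1 - \frac{V^2}{c^2}}} = \frac{1}{\sqrt{1 - \left(\frac{4c}{5}\right)^2\frac{1}{c^2}}} = \frac{5}{3}$$

$$t_{B1} = \gamma(V)\left(t_{A1} - \frac{Vx_{A1}}{c^2}\right) = \frac{5}{3}\left(5 - \frac{4c \times 6\times10^8}{5c^2}\right) = 5.667\ \text{s}$$

$$t_{B2} = \gamma(V)\left(t_{A2} - \frac{Vx_{A2}}{c^2}\right) = \frac{5}{3}\left(7 - \frac{4c \times 18\times10^8}{5c^2}\right) = 3.667\ \text{s}$$

$$\Delta t_B = \gamma(V)\left(\Delta t_A - \frac{V\Delta x_A}{c^2}\right) = \frac{5}{3}\left(2 - \frac{4c \times 12\times10^8}{5c^2}\right) = -2.0\ \text{s}$$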

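Calculating the proper time between the events gives an imaginary value

$$\Delta\tau = \frac{\Delta s}{c} = \frac{\sqrt{(c\Delta t_A)^2 - (\Delta x_A)^2}}{c} = \frac{\sqrt{(3\times10^8)^2 \times 2^2 - (12\times10^8)^2}}{c} = \frac{\sqrt{-108}}{3} = 3.46i\ \text{s}$$

Calculating the speed needed to get from one event to the other gives $\frac{18\times10^8 - 6\times10^8}{7 - 5} = \frac{12\times10^8}{2} = 6\times10^8$ m s⁻¹ which is $2c$.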

All show that event 1 cannot have caused event 2 even though event 2 occurred after event 1 as

measured in frame A.
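The three checks in this worked example can be reproduced with a short script. This is a sketch only: the function name `classify` is an illustrative assumption, and the rounded value c = 3×10⁸ m s⁻¹ is used because that is what the worked figures imply.

```python
import math

C = 3.0e8  # m/s, rounded value assumed to match the worked example


def classify(dt, dx):
    """Classify an interval by the sign of s^2 = (c dt)^2 - dx^2."""
    s2 = (C * dt) ** 2 - dx ** 2
    if s2 > 0:
        return "time-like"
    if s2 < 0:
        return "space-like"
    return "light-like"


t1, x1 = 5.0, 6.0e8    # event 1 in frame A (s, m)
t2, x2 = 7.0, 18.0e8   # event 2 in frame A (s, m)
V = 0.8 * C            # speed of frame B relative to frame A

g = 1.0 / math.sqrt(1.0 - (V / C) ** 2)
dt_b = g * ((t2 - t1) - V * (x2 - x1) / C ** 2)   # time between events in B
speed_needed = (x2 - x1) / (t2 - t1)              # signal speed required

print(classify(t2 - t1, x2 - x1), dt_b, speed_needed / C)
```

A negative time difference in frame B, a space-like classification and a required signal speed above c are three views of the same fact.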


Light-cones

Light-cones are used frequently in relativity. The following diagram is for two spatial dimensions and shows the time-like and space-like areas, the red lines having a slope of 1 and being light-like.

The event (at the origin) can only be caused by events in the lower light-cone, and can only

cause events in the upper light-cone.

The upper cone is the absolute future and the lower cone the absolute past.

If the event results in an object with mass then that object will follow a world-line within the

future light-cone, the slope of the line depending on its velocity relative to the x and y axes. A

relative speed of zero results in a vertical world-line.

As the velocity increases the slope decreases until at the speed of light it follows the edge of the

light-cone. This is only possible if it has no mass, and if it has no mass it must move at the speed

of light with a proper lifetime of zero.

The world-line of the object can only go outside the light-cone if its speed is greater than that of

light which is believed to be impossible.

Maths has to be used for three spatial dimensions. 𝑠2 = (cΔ𝑡)2 − 𝑟2 where 𝑠 is the separation,

𝑟 the Euclidean straight line distance and Δ𝑡 the time difference between two events. 𝑠2 is negative and hence 𝑠 is imaginary if 𝑟 > cΔ𝑡 which means that they cannot be causally related. 𝑠

is zero if 𝑟 = cΔ𝑡 which is the case if one event is the creation of a photon and the other is the

absorption of that photon (assuming the photon travels through a vacuum). This corresponds to

the surface of a light-cone. A time-like relationship means 𝑟 < cΔ𝑡.

If the separation were measured along any spatial line other than a straight line the separation

would be smaller, and hence a spatial straight line has the largest separation.

If an object is stationary with respect to its frame of reference, 𝑟 is zero and the separation of two events on its world-line is

𝑠 = cΔ𝑡 which is the maximum value. Since 𝑠 is independent of the relative velocities of two

frames of reference (invariant) this value also applies to all objects moving with a constant

velocity.

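[Figure: light-cone diagram with axes 𝑐𝑡, 𝑥 and 𝑦, showing the event at the origin, the time-like regions inside the upper and lower cones, the space-like region outside them, and a world-line.]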


Invariants

An invariant is a property whose value is the same in all frames of reference. As a general rule in

special relativity all observers agree on the value of certain scalar properties, but they disagree

on vector properties unless they are in the same frame of reference.

The speed of light 𝑐 is an invariant by definition. This results from Maxwell’s equations and is

confirmed by observation.

Two other basic properties that are invariant are mass 𝑚 and charge 𝑞 although some redefine

mass to include the increase due to velocity which is not invariant. The latter raises complications – it may be found in earlier texts (𝑚𝐴 = 𝛾(𝑉)𝑚𝐵 where B is the rest frame moving

at velocity V to frame A so mass increases with speed) but today mass means rest mass 𝑚𝐵 and

is represented by 𝑚 or in some texts by 𝑚0.

The mass energy or rest energy, given by $E_0 = mc^2$, is invariant.

The permittivity $\varepsilon_0$ and permeability $\mu_0$ of free space are invariant, and are linked by $\varepsilon_0\mu_0c^2 = 1$ (from the fourth Maxwell equation).

Space-time separation $s = \sqrt{(c\Delta t)^2 - r^2}$ is invariant.

Proper time $\Delta\tau = \frac{s}{c}$ is invariant, but other durations are not.

Proper lengths $\Delta\chi = \gamma(V)\Delta x$ are invariant, but other lengths are not.


The postulates of Special Relativity

There are two important postulates that form the basis of special relativity.

The speed of light in a vacuum is the same in all inertial frames.

The laws of physics can be written in the same form in all inertial frames.

The second postulate is also known as form invariance or covariance. However this requires

that some of the properties defined in Newtonian mechanics must be redefined in what is

known as Lorentz covariant mechanics.

These new definitions must be compatible with the Newtonian definitions when the speeds are

very small in comparison to the speed of light.

Note that anything that involves acceleration is outside of special relativity – this includes

rotation unless the object is solid and is rotating about an axis of symmetry in which case its

rotation has no external effect, but special relativity does not apply internally.


General 3D Vectors

A general 3D vector is a generalised version of the position vector, the difference being that instead of one end being anchored on the origin, the vector can be anywhere in space, and only its magnitude and direction are relevant in vector algebra. Spherical coordinates are nominally only used for vectors whose direction is away from the origin so Cartesian coordinates are assumed and the vector defined as the components $(\Delta x\,\hat{\mathbf{i}},\ \Delta y\,\hat{\mathbf{j}},\ \Delta z\,\hat{\mathbf{k}})$ which in column vector notation is $\begin{bmatrix} \Delta x \\ \Delta y \\ \Delta z \end{bmatrix}$.

A vector can be multiplied/divided by a scalar, $a\begin{bmatrix} \Delta x \\ \Delta y \\ \Delta z \end{bmatrix} = \begin{bmatrix} a\Delta x \\ a\Delta y \\ a\Delta z \end{bmatrix}$, which multiplies the magnitude by $a$. Multiplying by −1 reverses its direction. A scalar cannot be added to/subtracted from a vector.

A vector can be added to/subtracted from another vector, $\begin{bmatrix} a \\ b \\ c \end{bmatrix} + \begin{bmatrix} \Delta x \\ \Delta y \\ \Delta z \end{bmatrix} = \begin{bmatrix} a + \Delta x \\ b + \Delta y \\ c + \Delta z \end{bmatrix}$, which changes both its magnitude and direction. A vector cannot be multiplied/divided by another vector.

There are two other vector operators which are very important in mechanics and physics.

The first is the dot or scalar product which results in a scalar. If r and s are two vectors at an

angle 𝜃 to each other then 𝐫 ∙ 𝐬 = 𝑟 𝑠 cos 𝜃, a scalar.

$\mathbf{r} \cdot \mathbf{r} = r^2$ and $\mathbf{r} \cdot (-\mathbf{r}) = -r^2$.

If 𝐫 ∙ 𝐬 = 0 the vectors are orthogonal.

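In matrix form $\mathbf{r} \cdot \mathbf{s} = \begin{bmatrix} r_x & r_y & r_z \end{bmatrix}\begin{bmatrix} s_x \\ s_y \\ s_z \end{bmatrix} = r_xs_x + r_ys_y + r_zs_z$.

Note that $\mathbf{r} \cdot \mathbf{s} = \mathbf{s} \cdot \mathbf{r}$ and the angle between two vectors is $\theta = \cos^{-1}\frac{\mathbf{r} \cdot \mathbf{s}}{r\,s}$.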

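The second is the cross or vector product which results in a vector perpendicular to the plane containing the two vectors, and has the value $\mathbf{r} \times \mathbf{s} = r\,s\sin\theta\,\hat{\mathbf{n}}$ where the direction of the unit vector $\hat{\mathbf{n}}$ is away from the observer if the observer is on the side of the rs plane such that $\theta$ is positive clockwise when measured from r to s.

$\mathbf{r} \times \mathbf{r} = \emptyset$ and $\mathbf{r} \times (-\mathbf{r}) = \emptyset$ where $\emptyset$ is the null vector.

$\mathbf{r} \times \mathbf{s} = -\mathbf{s} \times \mathbf{r}$

The cross product can be expressed in simple matrix form as a determinant

$$\mathbf{r} \times \mathbf{s} = \det\begin{bmatrix} \hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} \\ r_x & r_y & r_z \\ s_x & s_y & s_z \end{bmatrix} = (r_ys_z - r_zs_y)\hat{\mathbf{i}} - (r_xs_z - r_zs_x)\hat{\mathbf{j}} + (r_xs_y - r_ys_x)\hat{\mathbf{k}}$$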

Special relativity uses four-vectors – four dimensional versions of the above.
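The dot and cross products for 3D vectors can be written out in a few lines of code. This is a minimal sketch using plain tuples; the function names `dot`, `cross` and `angle` are illustrative assumptions rather than anything defined in the notes.

```python
import math


def dot(r, s):
    """Scalar product r . s = rx*sx + ry*sy + rz*sz."""
    return sum(a * b for a, b in zip(r, s))


def cross(r, s):
    """Vector product r x s from the expansion of the 3 by 3 determinant."""
    rx, ry, rz = r
    sx, sy, sz = s
    return (ry * sz - rz * sy, rz * sx - rx * sz, rx * sy - ry * sx)


def angle(r, s):
    """Angle between r and s in radians, from cos(theta) = r.s / (|r| |s|)."""
    return math.acos(dot(r, s) / math.sqrt(dot(r, r) * dot(s, s)))


r = (1, 2, 3)
s = (4, 5, 6)
n = cross(r, s)
# n is perpendicular to both r and s, so both dot products are zero.
print(n, dot(r, n), dot(s, n))
print(cross(s, r))  # the same components with reversed signs
```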


Momentum

In Newtonian mechanics momentum is defined as $\mathbf{p} = m\mathbf{v}$. Note that momentum and velocity are vectors.

Momentum is conserved in the absence of an external force.

In the case of two bodies A and B suffering an elastic impact the total momentum before the

impact is 𝑚𝐴𝐮𝑨 +𝑚𝐵𝐮𝑩, and after the impact 𝑚𝐴𝐯𝑨 +𝑚𝐵𝐯𝑩 where u is the velocity before and v

the velocity afterwards. The total momentum before and afterwards must be equal giving

𝑚𝐴𝐮𝑨 +𝑚𝐵𝐮𝑩 = 𝑚𝐴𝐯𝑨 +𝑚𝐵𝐯𝑩

Since there are four different velocities involved, this will not be invariant in special relativity, and a new definition of momentum is required (Newtonian momentum is not conserved in every frame).

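Assuming the vectors are three dimensional with $\hat{\mathbf{i}}$, $\hat{\mathbf{j}}$ and $\hat{\mathbf{k}}$ being unit vectors along the x, y and z axes

$\mathbf{v} = v_x\hat{\mathbf{i}} + v_y\hat{\mathbf{j}} + v_z\hat{\mathbf{k}}$ which can be written as $(v_x, v_y, v_z)$ or $\left(\frac{\Delta x}{\Delta t}, \frac{\Delta y}{\Delta t}, \frac{\Delta z}{\Delta t}\right)$ with the unit vectors implied.

In the rest frame $\Delta t$ can be replaced by $\Delta\tau$, the proper time, to give

$$\left(\frac{\Delta x}{\Delta\tau}, \frac{\Delta y}{\Delta\tau}, \frac{\Delta z}{\Delta\tau}\right)$$

and then replacing $\Delta\tau$ by $\frac{\Delta t}{\gamma(V)}$ for any other frame (since $\Delta t = \gamma(V)\Delta\tau$)

$$\left(\frac{\Delta x}{\Delta\tau}, \frac{\Delta y}{\Delta\tau}, \frac{\Delta z}{\Delta\tau}\right) = \gamma(V)\left(\frac{\Delta x}{\Delta t}, \frac{\Delta y}{\Delta t}, \frac{\Delta z}{\Delta t}\right) = \gamma(V)\mathbf{v}$$

This means that the new definition of momentum is

$$\mathbf{p} = \gamma(V)m\mathbf{v} \qquad \mathbf{v} = \left(\frac{\Delta x}{\Delta t}, \frac{\Delta y}{\Delta t}, \frac{\Delta z}{\Delta t}\right)$$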

This is called relativistic momentum to distinguish it from Newtonian momentum.

It can be shown that if 𝑚𝐴𝛾(𝑢𝐴)𝐮𝑨 + 𝛾(𝑢𝐵)𝑚𝐵𝐮𝑩 = 𝑚𝐴𝛾(𝑣𝐴)𝐯𝑨 + 𝛾(𝑣𝐵)𝑚𝐵𝐯𝑩 in one frame then it applies in all other frames since this may be written as 𝐩𝑨𝟏 + 𝐩𝑩𝟏 = 𝐩𝑨𝟐 + 𝐩𝑩𝟐 or

𝐩𝑨𝟏 + 𝐩𝑩𝟏 + (−𝐩𝑨𝟐) + (−𝐩𝑩𝟐) = 𝟎 which means the four vectors make a closed four sided

figure. They can be regarded as displacement vectors and will form a closed figure in all

reference frames. Relativistic momentum is therefore conserved.

If v is small compared to $c$ then $\gamma(V) \approx 1$ and $\mathbf{p} \approx m\mathbf{v}$ as in Newtonian mechanics.

Alternatively 𝐩 = 𝑚𝛾(𝑉)𝐯 where p is the momentum as observed in a frame of reference where

v is the velocity observed in the same frame of reference and 𝑚 is the (rest) mass.

Older texts wrote this as 𝐩 = 𝑚𝐯 where 𝑚 is the relativistic mass 𝛾(𝑉)𝑚0 and 𝑚0 is the rest

mass which retained the Newtonian formulation.

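An electron has a mass of 9.109×10⁻³¹ kg and so if moving at $\frac{4c}{5}$, which means $\gamma = \frac{1}{\sqrt{1 - \left(\frac{4}{5}\right)^2}} = \frac{5}{3}$, it has a momentum of

$$p = \gamma mv = \frac{5}{3} \times \frac{4c}{5} \times 9.109\times10^{-31} = \frac{4}{3} \times 2.998\times10^8 \times 9.109\times10^{-31} = 3.6\times10^{-22}\ \text{kg m s}^{-1}$$

If mass is in units of eV/c² then momentum is in units of eV/c if speed is in the form $ac$, $a < 1$.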
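The electron example can be reproduced numerically, along with the Newtonian value for comparison. This is a sketch under the same rounded constants as the text; the function names are illustrative assumptions.

```python
import math

C = 2.998e8  # speed of light in m/s, rounded as in the worked example


def gamma(v):
    """Lorentz factor for speed v (m/s)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def relativistic_momentum(m, v):
    """p = gamma(v) * m * v for (rest) mass m in kg and speed v in m/s."""
    return gamma(v) * m * v


m_e = 9.109e-31          # electron mass in kg
v = 0.8 * C              # 4c/5
p_rel = relativistic_momentum(m_e, v)
p_newton = m_e * v
# The ratio of the two momenta is gamma, here about 5/3.
print(p_rel, p_newton, p_rel / p_newton)
```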


Kinetic Energy

In Newtonian mechanics kinetic energy is defined as $E_{KE} = \frac{1}{2}mv^2$ where $v$ is the speed.

An alternative definition is that kinetic energy is the energy gained when accelerating from 0 to

𝑣. The energy comes from the work done in causing that acceleration, and that work comes from

a force 𝑓 moving over the distance 𝑥 that the acceleration takes place.

Assuming that the force varies this can be written as

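$$E_{KE} = \int_{u=0}^{u=v} f\,\mathrm{d}x$$

where $u = \frac{\mathrm{d}x}{\mathrm{d}t}$ is the instantaneous speed, and movement is along the x axis.

Force is equal to the rate of change of momentum since $f = ma = m\frac{\mathrm{d}v}{\mathrm{d}t} = \frac{\mathrm{d}p}{\mathrm{d}t}$.

$$E_{KE} = \int_{u=0}^{u=v} \frac{\mathrm{d}p}{\mathrm{d}t}\,\mathrm{d}x = \int_{u=0}^{u=v} \frac{\mathrm{d}p}{\mathrm{d}t}\frac{\mathrm{d}x}{\mathrm{d}p}\,\mathrm{d}p = \int_{u=0}^{u=v} \frac{\mathrm{d}x}{\mathrm{d}t}\,\mathrm{d}p = \int_{u=0}^{u=v} u\,\mathrm{d}p$$

where $p$ is momentum.

Replacing Newtonian momentum with relativistic momentum $p = \gamma(u)mu$ gives

$$E_{KE} = \int_{u=0}^{u=v} u\,\mathrm{d}(\gamma(u)mu)$$

Using integration by parts $\int x\,\mathrm{d}y = xy - \int y\,\mathrm{d}x$ where $x = u$ and $y = \gamma(u)mu$ gives

$$E_{KE} = \left[u\gamma(u)mu\right]_0^v - \int_0^v \gamma(u)mu\,\mathrm{d}u$$

$$= \left[mu^2\gamma(u)\right]_0^v - m\int_0^v u\gamma(u)\,\mathrm{d}u$$

$$= \left[mu^2\gamma(u) + \frac{mc^2}{\gamma(u)}\right]_0^v$$

The last step evaluates $\int_0^v u\gamma(u)\,\mathrm{d}u = \int_0^v \frac{u}{\sqrt{1 - \frac{u^2}{c^2}}}\,\mathrm{d}u$ using the substitution $z = 1 - \frac{u^2}{c^2}$, $\frac{\mathrm{d}z}{\mathrm{d}u} = \frac{-2u}{c^2}$, $\mathrm{d}u = -\frac{c^2}{2u}\,\mathrm{d}z$:

$$\int \frac{u}{\sqrt{1 - \frac{u^2}{c^2}}}\,\mathrm{d}u = \int \frac{u}{\sqrt{z}}\left(-\frac{c^2}{2u}\right)\mathrm{d}z = -\frac{c^2}{2}\int \frac{1}{\sqrt{z}}\,\mathrm{d}z = -\frac{c^2}{2}\,2\sqrt{z} = -c^2\sqrt{z} = -c^2\sqrt{1 - \frac{u^2}{c^2}} = -\frac{c^2}{\gamma(u)}$$

Continuing from the bracketed expression

$$E_{KE} = \left[mu^2\gamma(u) + \frac{mc^2}{\gamma(u)}\right]_0^v$$

$$= \left[\gamma(u)\left(mu^2 + \frac{mc^2}{\gamma^2(u)}\right)\right]_0^v$$

$$= \left[\gamma(u)\left(mu^2 + mc^2\left(1 - \frac{u^2}{c^2}\right)\right)\right]_0^v$$

$$= \left[\gamma(u)\left(mu^2 + mc^2 - \frac{mc^2u^2}{c^2}\right)\right]_0^v$$

$$= \left[\gamma(u)mc^2\right]_0^v$$

$$= \gamma(v)mc^2 - \gamma(0)mc^2$$

$E_{KE} = (\gamma(v) - 1)mc^2$, and since $\gamma(0) = 1$, if $v = 0$ then $E_{KE} = 0$.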

If 𝑣 ≪ 𝑐 then 𝛾(𝑣) ≳ 1 and (𝛾(𝑣) − 1) ≳ 0, i.e. in Newtonian mechanics 𝐸𝐾𝐸 is a very small

fraction of 𝑚𝑐2, but this could be any small value.

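In order to get the Newtonian approximation $\gamma(v) = \left(1 - \frac{v^2}{c^2}\right)^{-\frac{1}{2}}$ must be expanded into a Taylor series using

$$(1 + x)^r = 1 + \frac{rx}{1!} + \frac{r(r-1)x^2}{2!} + \frac{r(r-1)(r-2)x^3}{3!} + \cdots$$

which is valid for $-1 < x < 1$.

Letting $x = -\frac{v^2}{c^2}$ and $r = -\frac{1}{2}$

$$\left(1 - \frac{v^2}{c^2}\right)^{-\frac{1}{2}} = 1 + \left(-\frac{1}{2}\right)\left(-\frac{v^2}{c^2}\right) + \frac{\left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right)}{2}\left(-\frac{v^2}{c^2}\right)^2 + \frac{\left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right)\left(-\frac{5}{2}\right)}{3 \times 2}\left(-\frac{v^2}{c^2}\right)^3 + \cdots$$

$$= 1 + \frac{1}{2}\frac{v^2}{c^2} + \frac{3}{8}\left(\frac{v^2}{c^2}\right)^2 + \frac{5}{16}\left(\frac{v^2}{c^2}\right)^3 + \cdots$$

If $v \ll c$ then terms in $\left(\frac{v^2}{c^2}\right)^n$ with $n > 1$ can be ignored leaving

$$\gamma(v) = 1 + \frac{v^2}{2c^2}$$

Substituting into $E_{KE} = (\gamma(v) - 1)mc^2$

$$E_{KE} = \left(\left(1 + \frac{v^2}{2c^2}\right) - 1\right)mc^2 = \frac{v^2}{2c^2}mc^2 = \frac{1}{2}mv^2$$

which is identical to Newtonian mechanics.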

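A muon has a mass of 1.88×10⁻²⁸ kg and so if moving at $\frac{4c}{5}$, which means $\gamma = \frac{1}{\sqrt{1 - \left(\frac{4}{5}\right)^2}} = \frac{5}{3}$, it has a kinetic energy of

$$E_{KE} = (\gamma - 1)mc^2 = \left(\frac{5}{3} - 1\right) \times 1.88\times10^{-28} \times (2.998\times10^8)^2 = 1.13\times10^{-11}\ \text{J}$$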
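To see how the relativistic and Newtonian kinetic energies diverge as the speed rises, the two expressions can be compared numerically. This is an illustrative sketch: the function names and the sample speeds are assumptions, and the muon mass is the value quoted in the text.

```python
import math

C = 2.998e8  # speed of light in m/s, rounded as in the worked example


def gamma(v):
    """Lorentz factor for speed v (m/s)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def kinetic_energy(m, v):
    """Relativistic kinetic energy (gamma - 1) m c^2 in joules."""
    return (gamma(v) - 1.0) * m * C ** 2


def newtonian_kinetic_energy(m, v):
    """Newtonian kinetic energy m v^2 / 2 in joules."""
    return 0.5 * m * v ** 2


m_mu = 1.88e-28  # muon mass in kg
for beta in (0.01, 0.1, 0.5, 0.8):
    v = beta * C
    ratio = kinetic_energy(m_mu, v) / newtonian_kinetic_energy(m_mu, v)
    # The ratio tends to 1 at low speed and grows as v approaches c.
    print(beta, kinetic_energy(m_mu, v), ratio)
```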


Total Relativistic Energy and Mass Energy

The total relativistic energy is defined as

$$E = \gamma(v)mc^2$$

This is obtained by rearranging $E_{KE} = (\gamma(v) - 1)mc^2$ into

$$E_{KE} = \gamma(v)mc^2 - mc^2$$

$$\gamma(v)mc^2 = E_{KE} + mc^2$$

Mass energy is defined as $E_0 = mc^2$ which is invariant in all frames.

𝐸 = 𝐸𝐾𝐸 + 𝐸0 or the total relativistic energy is defined as the sum of the kinetic energy and

mass energy.

The mass energy is the total internal energy when at rest, and includes internal thermal energy.

The mass energy is not just the sum of the mass energies of all the constituent parts, but also

includes the kinetic and potential energy of those parts – heating a stationary object increases

its temperature, the speed of its atoms, and so its rest mass.

The difference between the rest mass of a hydrogen atom, and the sum of the rest masses of a proton and electron is 2.42×10⁻³⁵ kg. This is the same as the binding energy of the electron in a hydrogen atom, i.e. 13.6 eV or 2.18×10⁻¹⁸ J using $\Delta E_0 = \Delta mc^2$.

In the case of a collision neither kinetic energy nor mass energy are conserved (both are

conserved in Newtonian mechanics), but total relativistic energy is conserved. This means that

particle collisions can create mass in the form of new particles at the expense of kinetic energy

or create energy at the expense of mass, the latter being the theory behind nuclear weapons.

If a uranium 235 atom absorbs a neutron the total mass of the resulting krypton and barium nuclei and three neutrons is 3.08×10⁻²⁸ kg less than that of the original atom plus neutron and using $E_0 = mc^2$ gives 2.77×10⁻¹¹ J or 173 MeV of energy.

A body moving at $\frac{\sqrt{3}}{2}c$ has $\gamma = \frac{1}{\sqrt{1 - \left(\frac{\sqrt{3}}{2}\right)^2}} = 2$ and so its total relativistic energy is twice its mass energy, or its kinetic energy and mass energy are equal.

A proton has a mass of 1.67×10⁻²⁷ kg and so if moving at $\frac{4c}{5}$, which means $\gamma = \frac{1}{\sqrt{1 - \left(\frac{4}{5}\right)^2}} = \frac{5}{3}$, it has a total relativistic energy of

$$E = \gamma mc^2 = \frac{5}{3} \times 1.67\times10^{-27} \times (2.998\times10^8)^2 = 2.5\times10^{-10}\ \text{J}$$

Note that in particle physics energies are often expressed in eV where 1 eV = 1.602×10⁻¹⁹ Joules.
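The conversions between a mass difference, joules and electron volts used in these examples can be sketched as follows. The function names are illustrative assumptions; the constants are the rounded values quoted in the text.

```python
C = 2.998e8     # speed of light in m/s
EV = 1.602e-19  # joules per electron volt


def mass_to_joules(dm):
    """Energy equivalent of a mass difference dm (kg) using E0 = m c^2."""
    return dm * C ** 2


def joules_to_ev(energy):
    """Convert an energy in joules to electron volts."""
    return energy / EV


# Mass defect of a hydrogen atom (binding energy of the electron).
e_h = mass_to_joules(2.42e-35)
# Mass lost in the uranium 235 fission example.
e_u = mass_to_joules(3.08e-28)
print(e_h, joules_to_ev(e_h))          # joules, then eV
print(e_u, joules_to_ev(e_u) / 1e6)    # joules, then MeV
```

The hydrogen result is a few electron volts and the fission result is over a hundred MeV, a ratio of roughly ten million that reflects the difference between chemical and nuclear energy scales.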


Particle Decay

A particle can decay into pure energy according to $E = \gamma(v)mc^2$.

However both the total relativistic energy and the momentum must be conserved so it is not as simple as just using the above expression to calculate the energy of the photon. For example a neutral π-meson with a mass of 135 MeV/c² and moving at 0.99c has an energy of

$$E = \frac{1}{\sqrt{1 - \frac{(0.99c)^2}{c^2}}}\,\frac{135}{c^2}\,c^2 = 7.089 \times 135 = 957\ \text{MeV}$$

However its momentum is $p = \gamma mv = \frac{7.089 \times 135}{c^2} \times 0.99c = 947$ MeV/c.

If a single photon were emitted in the same direction as the π-meson its momentum would be $p = \frac{E}{c} = \frac{957}{c} = 957$ MeV/c and hence 10 MeV/c of momentum would have been gained, which is not allowed.

There must be two photons (indicated by the subscripts A and B) created with a total energy of

957 MeV so

𝐸𝐴 + 𝐸𝐵 = 957 MeV

Their total momentum must be 947 MeV/c so one must be travelling in the reverse direction

(momentum is a vector, energy is not).

$p_A - p_B = 947$ MeV/c so $\frac{E_A}{c} - \frac{E_B}{c} = 947$ MeV/c

𝐸𝐴 − 𝐸𝐵 = 947 MeV

Solving these

𝐸𝐴 − 𝐸𝐵 = 947 MeV

𝐸𝐴 + 𝐸𝐵 = 957 MeV

2𝐸𝐴 = 957 + 947 = 1904 MeV

𝐸𝐴 = 952 MeV

952 + 𝐸𝐵 = 957 so 𝐸𝐵 = 5 MeV

One photon with an energy of 952MeV must be travelling in the same direction as the π-meson

and one of 5 MeV must be travelling in the reverse direction.
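The two simultaneous equations can be solved for any mass and speed. The sketch below works in units where energies are in MeV and momenta in MeV/c, and assumes the two photons are emitted along the line of flight as in the example; the function name `two_photon_energies` is an illustrative assumption.

```python
import math


def two_photon_energies(mass_mev, beta):
    """Energies (MeV) of the forward and backward photons.

    mass_mev is the rest mass in MeV/c^2 and beta = v / c. Energy
    conservation gives E_A + E_B = gamma m c^2 and momentum conservation
    gives E_A - E_B = gamma m v c.
    """
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    total_energy = g * mass_mev          # MeV
    momentum_c = g * mass_mev * beta     # p c in MeV
    e_forward = 0.5 * (total_energy + momentum_c)
    e_backward = 0.5 * (total_energy - momentum_c)
    return e_forward, e_backward


e_a, e_b = two_photon_energies(135.0, 0.99)
print(e_a, e_b)
```

Carrying more digits than the rounded hand calculation gives a backward photon energy slightly below 5 MeV. For a meson at rest the function returns two equal energies of half the rest energy.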


Fireballs

A fireball is an optically thick soup of photons, electrons, positrons and baryons that originates in a burst that accelerates as it expands relativistically and emits gamma rays (Gamma Ray Bursts, GRBs). Very large energies are involved (10⁴⁴ J s⁻¹), and timescales of a few milliseconds indicate a small size.

There are two phases: prompt emission, which lasts from a few milliseconds to a few minutes and during which intense very high energy gamma rays (hundreds of keV) and sometimes an optical flash are emitted, and the afterglow, which may last weeks or months and during which there is a decrease in both the intensity and the frequency of the radiation down to radio frequencies.

The fireball may be spherical or a beam which can be modelled by a cone using the same

techniques as for a spherical fireball.

The source is a small amount of matter and a large amount of energy, $E_0 \gg mc^2$, in a small volume of radius $r_0$. This results in an expanding shell of thickness $\Delta r$ at a radius or distance $r$.

The flow reaches relativistic speeds very quickly and continues to accelerate to a distance called

the saturation radius 𝑟𝑠. The acceleration results from the photons accelerating the particles, a

process called Compton scattering, but results in a decrease in temperature.

The Lorentz factor 𝛾 is proportional to the radius until the saturation radius 𝑟𝑠.

The initial energy is $E_0 + mc^2$ and the acceleration ceases when this has been converted into the total energy of the moving flow $\gamma_smc^2$. Equating these

$$\gamma_smc^2 = E_0 + mc^2$$

or

$$\gamma_s = \frac{E_0}{mc^2} + 1$$

the maximum value of $\gamma$ which occurs at the saturation radius $r_s$.

Assuming that $\gamma_0 = 1$ (it starts from rest)

$$r_s = \gamma_sr_0$$

$r_0$ cannot be bigger than the variation time $\Delta t$ times the speed of light, which is a few light-milliseconds, so an approximation for $r_s$ is

$$r_s \cong \frac{\gamma_s}{3}\,\Delta t \times 10^9\ \text{m}$$

Once the saturation radius has been reached the flow continues at a constant speed, this being

the photosphere as the optical depth 𝜏 ≅ 1. The optical depth is approximated by

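$\tau \cong \sigma_Tn\Delta r$ where $\sigma_T$ is the Thomson cross-section (6.652×10⁻²⁹ m²) and $n$ is the number density of baryons which in turn is approximated by $\frac{m}{m_p4\pi r^2\Delta r}$ where $m$ is the total mass and $m_p$ the mass of a proton (assuming ionised hydrogen). This gives

$$\tau = \frac{\sigma_Tm}{4\pi m_pr^2}$$

Assuming that the radius of the photosphere is given by $\tau = 1$, and using $m \cong \frac{E_0}{\gamma_sc^2}$ (from $\gamma_s = \frac{E_0}{mc^2} + 1$, ignoring the 1 since $\frac{E_0}{mc^2} \gg 1$)

$$r_{ph}^2 = \frac{E_0\sigma_T}{4\pi m_pc^2\gamma_s}$$

$$r_{ph} \cong \sqrt{\frac{E_0}{\gamma_s}} \times 1.9\times10^{-10}\ \text{m}$$

where $E_0$ is in Joules.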


Collisions within the flow create internal shock waves. This can be approximated by considering two blobs of matter of mass $m$, travelling at speeds $v_1$ and $v_2$ with the second emitted $\Delta t$ after the first, with $v_2 > v_1$ so the two collide at a distance $r_{dis}$.

Working in the ejection frame of reference $r_{dis} = v_1t = v_2(t - \Delta t)$ gives

$$r_{dis} = \Delta t\,\frac{v_1v_2}{v_2 - v_1} = \Delta t\,\frac{v_1}{1 - \frac{v_1}{v_2}}$$

Assuming both speeds are very close to the speed of light and $v_2 > v_1$ then $\frac{v_1}{v_2} \cong \frac{v_1}{c}$ so

$$r_{dis} = \Delta t\,\frac{v_1}{1 - \frac{v_1}{c}}$$

and since $v_1$ is very close to the speed of light $1 - \frac{v_1}{c} \cong \frac{1}{2\gamma^2(v)}$ so

$$r_{dis} \cong 2c\Delta t\gamma^2(v).$$

This is called the dissipation radius and for $\Delta t = 10$ ms and $\gamma(v) = 100$ has a value of about 6×10¹⁰ m, and beyond this value the flow decelerates.
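The approximation $r_{dis} \cong 2c\Delta t\gamma^2$ can be compared with the exact collision distance. This is a sketch only: the function names are illustrative assumptions, and the faster blob is given the limiting speed $c$ purely to mirror the approximation $v_1/v_2 \cong v_1/c$ made in the text.

```python
import math

C = 2.998e8  # speed of light in m/s


def speed_from_gamma(g):
    """Speed (m/s) corresponding to a Lorentz factor g."""
    return C * math.sqrt(1.0 - 1.0 / g ** 2)


def collision_radius(dt, v1, v2):
    """Exact distance at which a blob at v2, emitted dt later, catches one at v1."""
    return dt * v1 * v2 / (v2 - v1)


def dissipation_radius(dt, g):
    """Approximate dissipation radius 2 c dt gamma^2."""
    return 2.0 * C * dt * g ** 2


dt = 10e-3   # 10 ms between emissions
g1 = 100.0   # Lorentz factor of the slower blob
exact = collision_radius(dt, speed_from_gamma(g1), C)  # limiting case v2 -> c
approx = dissipation_radius(dt, g1)
print(exact, approx)
```

For these inputs the two values differ by well under one percent, which suggests the approximation is adequate when $\gamma \gg 1$.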

The flow impacts the external medium which in the case of interstellar space has 10⁶ hydrogen atoms per cubic metre and in the halo of a spiral galaxy 10³ hydrogen atoms per cubic metre. This creates forward and reverse shock waves which increase the temperature and form an afterglow which appears $t_{dec}$ after the start of the burst. The reverse shock may be responsible for the optical flash.

The deceleration radius (where deceleration begins) is given by

$$r_{dec} = \left(\frac{3E_0}{4\pi nm_p\gamma_s^2c^2}\right)^{\frac{1}{3}}$$

where $E_0$ is the original energy, $n$ is the number of protons of mass $m_p$ per cubic metre and $\gamma_s = \frac{E_0}{Mc^2} + 1$ is the saturation Lorentz factor ($M$ being the total mass).

The time at which deceleration starts is

$$t_{dec} = \frac{r_{dec}}{2\gamma_s^2c}$$

These are based on an expanding sphere. In the case of a jet the burst energy $E$ is calculated assuming spherical symmetry and then $E_0$ in the above expressions is replaced by

$$E_0 = \left(1 - \cos\frac{\theta}{2}\right)E$$

where $\theta$ is the opening angle of the cone.

Since $\tan\theta = \frac{c}{V\gamma}$, the deceleration results in an increase in the opening angle of the cone and this

results in a decrease in the observed intensity as the beam broadens – this is called the

achromatic break since it affects all frequencies simultaneously.

Long bursts are believed to be caused by hypernovae (collapse of very large population III stars or type II supernovae) and short bursts by compact binary mergers involving white dwarfs and

neutron stars or type Ia supernovae – they all result in stellar mass black holes of a few sun

masses with a torus for long bursts or a fraction of a sun mass for short bursts, neutrino bursts,

gravitational waves and possible cosmic rays. Gamma ray bursts (GRBs) are the most distant

objects detected so far.


Contravariant Four-Vectors

A four-vector is a vector with four components related to the time and three spatial directions. A contravariant vector is one that transforms in a specific way from one frame of reference to another. It is in fact a rank 1 contravariant tensor, but behaves as a vector.

It is assumed that frame B is moving at a speed V relative to frame A along the x axis. Frame B is

moving at the same speed as an object that experiences two (or more) events so the object will

always have the same spatial coordinates in frame B, and invariants such as proper time and

proper length can be observed directly without calculations. The two origins of the frames of reference coincide at time zero in both frames, i.e. (0, 𝑥𝐴, 𝑦𝐴, 𝑧𝐴) = (0, 𝑥𝐵, 𝑦𝐵 , 𝑧𝐵) = (0,0,0,0).

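If a contravariant four-vector $\mathbf{A}$ is measured in frame A at $(t_A, x_A, y_A, z_A)$ it will have components $(A_A^0, A_A^1, A_A^2, A_A^3)$. In frame B it will have components $(A_B^0, A_B^1, A_B^2, A_B^3)$ and these can be obtained from the frame A components using the Lorentz transformation matrix

$$\begin{bmatrix} A_B^0 \\ A_B^1 \\ A_B^2 \\ A_B^3 \end{bmatrix} = \begin{bmatrix} \gamma(V) & -\frac{\gamma(V)V}{c} & 0 & 0 \\ -\frac{\gamma(V)V}{c} & \gamma(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} A_A^0 \\ A_A^1 \\ A_A^2 \\ A_A^3 \end{bmatrix}$$

or $[A_B^\mu] = [\Lambda^\mu{}_\nu][A_A^\nu]$ or

$$A_B^\mu = \sum_{\nu=0}^{3} \Lambda^\mu{}_\nu A_A^\nu \qquad \mu = 0,1,2,3$$

Note that the superscripts are not powers.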

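If another event occurs very close to this event at $(t_A + dt_A, x_A + dx_A, y_A + dy_A, z_A + dz_A)$ then the separation between the two events measured in frame B, d being the infinitesimal operator, is

$$dA_B^0(A_A^0, A_A^1, A_A^2, A_A^3) = \frac{\partial A_B^0}{\partial A_A^0} dA_A^0 + \frac{\partial A_B^0}{\partial A_A^1} dA_A^1 + \frac{\partial A_B^0}{\partial A_A^2} dA_A^2 + \frac{\partial A_B^0}{\partial A_A^3} dA_A^3$$

and three similar equations for $dA_B^1$, $dA_B^2$, $dA_B^3$, all four of which can be written as

$$dA_B^{\mu} = \sum_{\nu=0}^{3} \frac{\partial A_B^{\mu}}{\partial A_A^{\nu}} dA_A^{\nu}, \qquad \mu = 0,1,2,3$$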

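Also

$$A_B^0 = \frac{\partial A_B^0}{\partial A_A^0} A_A^0 + \frac{\partial A_B^0}{\partial A_A^1} A_A^1 + \frac{\partial A_B^0}{\partial A_A^2} A_A^2 + \frac{\partial A_B^0}{\partial A_A^3} A_A^3$$

and three similar equations, all four of which can be written as

$$A_B^{\mu} = \sum_{\nu=0}^{3} \frac{\partial A_B^{\mu}}{\partial A_A^{\nu}} A_A^{\nu}, \qquad \mu = 0,1,2,3$$

which defines a contravariant four-vector.

In special relativity $\dfrac{\partial A_B^{\mu}}{\partial A_A^{\nu}} = \Lambda^{\mu}{}_{\nu}$, so for all contravariant four-vectors there is the general rule that

$$A_B^{\mu} = \sum_{\nu=0}^{3} \Lambda^{\mu}{}_{\nu} A_A^{\nu}, \qquad \mu = 0,1,2,3.$$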
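The general rule above can be tried numerically. The following is a minimal sketch, assuming a boost along the x axis with the matrix given earlier; the function names `gamma`, `boost_x` and `transform` and the sample event are illustrative choices, not part of the original text.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def gamma(V):
    """Lorentz factor for a frame moving at speed V (m/s)."""
    return 1.0 / math.sqrt(1.0 - (V / C) ** 2)

def boost_x(V):
    """4x4 Lorentz matrix for frame B moving at speed V along x relative to frame A."""
    g = gamma(V)
    b = V / C
    return [
        [g, -g * b, 0.0, 0.0],
        [-g * b, g, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform(L, A):
    """A_B^mu = sum over nu of L[mu][nu] * A_A^nu."""
    return [sum(L[mu][nu] * A[nu] for nu in range(4)) for mu in range(4)]

# Hypothetical event in frame A: t = 1 s, x = 0.6 light-seconds, so it lies on
# the path of an object moving at 0.6c that left the origin at t = 0.
X_A = [C * 1.0, 0.6 * C, 0.0, 0.0]
X_B = transform(boost_x(0.6 * C), X_A)
```

Because frame B moves with the object, the spatial component `X_B[1]` comes out as zero (to rounding) and `X_B[0] / C` is the proper time of 0.8 s.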

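There are several specific contravariant four-vectors in special relativity, written with capital bold letters, each consisting of a scalar and a three component spatial vector, $\mathbf{r}$, $\mathbf{v}$, $\mathbf{p}$ and $\mathbf{f}$ being 3D vectors.

Four-displacement $\Delta\mathbf{X}^{\mu} = (c\Delta t, \Delta\mathbf{r})$

Four-position $\mathbf{X}^{\mu} = (ct, \mathbf{r})$ (a four-displacement from the origin)

Four-velocity $\mathbf{U}^{\mu} = (\gamma c, \gamma\mathbf{v})$

Four-momentum $\mathbf{P}^{\mu} = \left(\dfrac{E}{c}, \mathbf{p}\right)$

Four-force $\mathbf{F}^{\mu} = \left(\gamma\dfrac{\mathbf{f}\cdot\mathbf{v}}{c}, \gamma\mathbf{f}\right)$ (note that the dot product $\gamma\dfrac{\mathbf{f}\cdot\mathbf{v}}{c}$ results in a scalar)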


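Displacement Four-Vector

The displacement four-vector $\Delta\mathbf{X}^{\mu}$ (often called four-displacement) is simply the separation between two events, given by

$[\Delta X^{\mu}] = (c\Delta t, \Delta\mathbf{r})$, the time being multiplied by the speed of light so that all the units are those of length. Expanding $\Delta\mathbf{r} = (\Delta x, \Delta y, \Delta z)$ gives

$[\Delta X^{\mu}] = (c\Delta t, \Delta x, \Delta y, \Delta z)$ and

$$\Delta X_B^{\mu} = \sum_{\nu=0}^{3} \Lambda^{\mu}{}_{\nu}\, \Delta X_A^{\nu}, \qquad \mu = 0,1,2,3$$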

Note the convention to use a bold lowercase letter such as 𝐫 to indicate a 3D vector, i.e. with x, y

and z components, but contravariant four-vectors by upper case letters 𝐀𝜇 or [𝐴𝜇] to represent

all the components.

Otherwise four-vectors have the same operations as 3D vectors but obviously these must be

extended to four dimensions.

Four-vectors differ from general 4D vectors in that the last three components form a 3D vector.

Position Four-Vector

This is often called four-position and is effectively the same as the displacement, but one of the events is at the origin to give

$[X^{\mu}] = (ct, \mathbf{r})$ and

$$X_B^{\mu} = \sum_{\nu=0}^{3} \Lambda^{\mu}{}_{\nu}\, X_A^{\nu}, \qquad \mu = 0,1,2,3$$

Note that the superscript is not a power: $x^1$ corresponds to $x$, $x^2$ to $y$ and $x^3$ to $z$.

Einstein Notation

Einstein used a notation that omits the summation sign so the above would be written as

$\Delta X_B^{\mu} = \Lambda^{\mu}{}_{\nu}\,\Delta X_A^{\nu}$ and $X_B^{\mu} = \Lambda^{\mu}{}_{\nu}\,X_A^{\nu}$

This reduces the clutter but is very confusing to those who are not familiar with it.

Many texts on relativity use it, but it is not used here.


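Velocity Four-Vector

Using a clock that travels in the rest frame the time will be the proper time $\tau$.

The velocity is $\mathbf{U}^{\mu} = \dfrac{dx^{\mu}}{d\tau}$ or

$$[U^{\mu}] = \left[\frac{de^{\mu}}{d\tau}\right] = \left(c\frac{dt}{d\tau}, \frac{dx}{d\tau}, \frac{dy}{d\tau}, \frac{dz}{d\tau}\right)$$

Since $\Delta t = \gamma(V)\Delta\tau$, i.e. $\Delta\tau = \dfrac{\Delta t}{\gamma(V)}$,

$$\frac{dt}{d\tau} = \gamma(V) \qquad \text{and} \qquad \left(\frac{dx}{d\tau}, \frac{dy}{d\tau}, \frac{dz}{d\tau}\right) = \gamma(V)\left(\frac{\Delta x}{\Delta t}, \frac{\Delta y}{\Delta t}, \frac{\Delta z}{\Delta t}\right) = \gamma(V)\mathbf{v}$$

so $[U^{\mu}] = (c\gamma(V), \gamma(V)\mathbf{v})$. Here frame B moves with the object, so $V$ equals the object's speed $v$ and $\gamma(V) = \gamma(v)$.

Note that $\mathbf{U}^{\mu}$ is $\dfrac{dx^{\mu}}{d\tau}$, not $\dfrac{dx^{\mu}}{dt}$. Since $\tau$ is invariant the Lorentz transform applies to $[U^{\mu}]$, so it is contravariant and

$$U_B^{\mu} = \sum_{\nu=0}^{3} \Lambda^{\mu}{}_{\nu}\, U_A^{\nu}, \qquad \mu = 0,1,2,3$$

The invariant is $(\Delta s)^2 = \sum_{\mu,\nu=0}^{3} \eta_{\mu\nu} e^{\mu} e^{\nu} = (c\Delta t)^2 - (\Delta r)^2$ where $\eta_{\mu\nu}$ is the Minkowski metric.

Replacing $e$ by $U$ gives

$$\sum_{\mu,\nu=0}^{3} \eta_{\mu\nu} U^{\mu} U^{\nu} = \gamma^2(v)c^2 - \gamma^2(v)(v_x^2 + v_y^2 + v_z^2) = \gamma^2(v)(c^2 - v^2) = \frac{c^2}{c^2 - v^2}(c^2 - v^2) = c^2$$

which is invariant.

It is a general rule that $\sum_{\mu,\nu=0}^{3} \eta_{\mu\nu} A^{\mu} A^{\nu}$ produces an invariant for all contravariant four-vectors.

$\mathbf{A}\cdot\mathbf{B} = \sum_{\mu,\nu=0}^{3} \eta_{\mu\nu} A^{\mu} B^{\nu}$ is the four-vector scalar or dot product.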
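The invariance of the four-velocity's scalar product with itself can be checked numerically. This is a minimal sketch, assuming the (+, -, -, -) Minkowski metric used in this text; `four_velocity`, `minkowski_dot` and the sample velocity are illustrative names and values.

```python
import math

C = 299_792_458.0  # speed of light in m/s
ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of the Minkowski metric (+, -, -, -)

def four_velocity(v):
    """Return (gamma c, gamma vx, gamma vy, gamma vz) for a 3-velocity v in m/s."""
    speed2 = sum(vi * vi for vi in v)
    g = 1.0 / math.sqrt(1.0 - speed2 / C**2)
    return [g * C] + [g * vi for vi in v]

def minkowski_dot(A, B):
    """Four-vector dot product: sum of eta_mu_nu * A^mu * B^nu for a diagonal metric."""
    return sum(ETA[i] * A[i] * B[i] for i in range(4))

# Arbitrary sample velocity with |v|^2 = 0.5 c^2
U = four_velocity((0.3 * C, 0.4 * C, 0.5 * C))
ratio = minkowski_dot(U, U) / C**2  # should be 1 up to rounding, whatever v is
```

Changing the sample velocity changes every component of `U`, but `ratio` stays at 1 to rounding precision.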


Momentum Four-Vector

The four-momentum is defined as $[P^{\mu}] = m[U^{\mu}] = (\gamma(v)mc, \gamma(v)mv_x, \gamma(v)mv_y, \gamma(v)mv_z)$.

Since $\mathbf{p} = \gamma(v)m\mathbf{v}$ and $E = \gamma(v)mc^2$ this becomes

$$[P^{\mu}] = \left(\frac{E}{c}, \mathbf{p}\right)$$

Since $m$ is invariant and $\mathbf{U}^{\mu}$ obeys the Lorentz transform, then so does $\mathbf{P}^{\mu}$, so it is contravariant and

$$P_B^{\mu} = \sum_{\nu=0}^{3} \Lambda^{\mu}{}_{\nu}\, P_A^{\nu}, \qquad \mu = 0,1,2,3$$

Note that it is important to distinguish between $v$ and $V$.

$V$ is the speed of one frame compared to another frame, with the x axes of the two frames being along the same line, and is a scalar.

$\mathbf{v}$ is the velocity (a 3D vector) of an object relative to a frame of reference, where $\mathbf{v} = (v_x, 0, 0)$ for an object moving along the x axis and

$$v_{xB} = \frac{v_{xA} - V}{1 - \dfrac{v_{xA}V}{c^2}}$$

Also note that 𝜈 is the Greek letter nu.

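The transformation of momentum from frame A to frame B travelling at relative speed $V$ along the x axis, where $\mathbf{v}_A$ is the velocity of the object measured in frame A and $E$ is energy, is

$$\begin{bmatrix} E_B(v_B)/c \\ p_{xB}(v_{xB}) \\ p_{yB}(v_{yB}) \\ p_{zB}(v_{zB}) \end{bmatrix} = \begin{bmatrix} \gamma(V) & -\dfrac{\gamma(V)V}{c} & 0 & 0 \\ -\dfrac{\gamma(V)V}{c} & \gamma(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} E_A(v_A)/c \\ p_{xA}(v_{xA}) \\ p_{yA}(v_{yA}) \\ p_{zA}(v_{zA}) \end{bmatrix}$$

where $p_{xB}(v_{xB})$ is based on $v_{xB}$ etc.

So for a particle moving along the x axis

$$\frac{E_B(v_B)}{c} = \gamma(V)\frac{E_A(v_A)}{c} - \frac{\gamma(V)V}{c}p_{xA}(v_{xA}) \quad \text{or} \quad E_B(v_B) = \gamma(V)\left(E_A(v_A) - Vp_{xA}(v_{xA})\right)$$

$$p_{xB}(v_{xB}) = -\frac{\gamma(V)V}{c}\frac{E_A(v_A)}{c} + \gamma(V)p_{xA}(v_{xA}) \quad \text{or} \quad p_{xB}(v_{xB}) = \gamma(V)\left(p_{xA}(v_{xA}) - \frac{VE_A(v_A)}{c^2}\right)$$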

Two observers in two different frames will not agree on the value of either energy or

momentum, but there is a relationship between them.
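The transformation of energy and momentum can be cross-checked against the velocity addition formula. The sketch below assumes motion along the x axis only; the mass is an approximate electron mass and the speeds are arbitrary sample values, none of which come from the original text.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def gamma(u):
    """Lorentz factor for speed u (m/s)."""
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)

def transform_energy_momentum(E_A, px_A, V):
    """Energy and x momentum in frame B, which moves at speed V along x relative to A."""
    g = gamma(V)
    E_B = g * (E_A - V * px_A)
    px_B = g * (px_A - V * E_A / C**2)
    return E_B, px_B

m = 9.109e-31        # kg, approximately the electron mass (sample value)
vx_A = 0.8 * C       # particle velocity in frame A (sample value)
V = 0.5 * C          # speed of frame B relative to frame A (sample value)

E_A = gamma(vx_A) * m * C**2
px_A = gamma(vx_A) * m * vx_A
E_B, px_B = transform_energy_momentum(E_A, px_A, V)

# Independent route: transform the velocity first, then compute E and p in frame B.
vx_B = (vx_A - V) / (1.0 - vx_A * V / C**2)
E_B_direct = gamma(vx_B) * m * C**2
px_B_direct = gamma(vx_B) * m * vx_B
```

Both routes agree to rounding precision, and $E^2 - p^2c^2$ has the same value in the two frames even though $E$ and $p$ individually differ.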


Energy and Momentum

$[P^{\mu}] = \left(\dfrac{E}{c}, \mathbf{p}\right)$ is similar in form to $[\Delta x^{\mu}] = (c\Delta t, \Delta\mathbf{r})$. The latter implies that time and space are connected as a single concept, and different observers of an event observe different values for space and time, but there is a link between the values because the separation is invariant.

The former implies that the same is true of energy and momentum: different observers of the same object will observe different values for its energy and momentum, but there is a link between these values.

Using $[P^{\mu}] = m[U^{\mu}]$ and $\sum_{\mu,\nu=0}^{3} \eta_{\mu\nu} U^{\mu} U^{\nu} = c^2$,

$$\sum_{\mu,\nu=0}^{3} \eta_{\mu\nu} P^{\mu} P^{\nu} = \sum_{\mu,\nu=0}^{3} \eta_{\mu\nu}\, mU^{\mu}\, mU^{\nu} = m^2 \sum_{\mu,\nu=0}^{3} \eta_{\mu\nu} U^{\mu} U^{\nu} = m^2c^2$$

which is an invariant.

But

$$\sum_{\mu,\nu=0}^{3} \eta_{\mu\nu} P^{\mu} P^{\nu} = \left(\frac{E}{c}\right)^2 - (p_x^2 + p_y^2 + p_z^2) = \left(\frac{E}{c}\right)^2 - p^2$$

so

$$\left(\frac{E}{c}\right)^2 - p^2 = m^2c^2$$

or

$$E^2 = p^2c^2 + m^2c^4$$

which is known as the energy-momentum relation.

Given a specific mass, only a combination of energy and momentum that obeys the above

equation is valid.

For example a particle with energy $amc^2$ must have a momentum of $\sqrt{a^2 - 1}\,mc$ so that $a^2m^2c^4 = (a^2 - 1)m^2c^2c^2 + m^2c^4$.

A photon has no rest mass so $E^2 = p^2c^2$ or $p = \dfrac{E}{c}$. In Newtonian mechanics a photon cannot have momentum because it has no mass, but relativity states it has momentum proportional to its energy, which is given by $E = hf$ where $h$ is Planck's constant and $f$ the frequency, i.e.

$$p = \frac{hf}{c} = \frac{hc}{c\lambda} = \frac{h}{\lambda}$$

where $\lambda$ is the wavelength.
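The two examples above can be reproduced with a few lines of arithmetic. This is a small sketch: the function names, the unit mass, the factor $a = 3$ and the 500 nm wavelength are illustrative assumptions.

```python
import math

C = 299_792_458.0    # speed of light in m/s
H = 6.62607015e-34   # Planck's constant in J s

def momentum_from_energy(E, m):
    """Solve E^2 = p^2 c^2 + m^2 c^4 for the magnitude of p."""
    return math.sqrt(E**2 - (m * C**2) ** 2) / C

def photon_momentum(wavelength):
    """Momentum of a photon of the given wavelength (metres): p = h / lambda."""
    return H / wavelength

# A particle of (sample) mass 1 kg with energy a m c^2, where a = 3
m = 1.0
a = 3.0
p = momentum_from_energy(a * m * C**2, m)
p_expected = math.sqrt(a**2 - 1.0) * m * C   # sqrt(a^2 - 1) m c from the text

# A photon with a (sample) wavelength of 500 nm
wavelength = 500e-9
p_photon = photon_momentum(wavelength)
p_photon_from_energy = H * (C / wavelength) / C   # p = h f / c with f = c / lambda
```

Any combination of `E` and `p` that fails the energy-momentum relation for the given mass is not physically valid, which is why `momentum_from_energy` has only one non-negative solution.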

An object that is not moving has 𝐸2 = 𝑚2𝑐4 or 𝐸 = 𝑚𝑐2, this being the mass energy given the

symbol 𝐸0.

𝐸2 = 𝑝2𝑐2 +𝑚2𝑐4 is the original form of Einstein’s famous equation which is more generally

remembered as 𝐸 = 𝑚𝑐2 and is obtained by setting the velocity and hence the momentum to

zero.


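Force Four-Vector

Force is the rate of change of momentum, so differentiating with respect to proper time

$$[F^{\mu}] = \left[\frac{dP^{\mu}}{d\tau}\right] = \left[\frac{d\left(\frac{E}{c}, \mathbf{p}\right)}{d\tau}\right] = \left(\frac{1}{c}\frac{dE}{d\tau}, \frac{dp_x}{d\tau}, \frac{dp_y}{d\tau}, \frac{dp_z}{d\tau}\right)$$

If $\mathbf{f} = (f_x, f_y, f_z)$, $d\tau = \dfrac{dt}{\gamma(V)}$ and $\dfrac{dE}{dt} = \mathbf{f}\cdot\mathbf{v}$, since the rate of change of energy is the dot product of force and velocity, then

$$[F^{\mu}] = \left(\frac{\gamma(V)}{c}\mathbf{f}\cdot\mathbf{v}, \gamma(V)f_x, \gamma(V)f_y, \gamma(V)f_z\right) = \left(\frac{\gamma(V)}{c}\mathbf{f}\cdot\mathbf{v}, \gamma(V)\mathbf{f}\right)$$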

𝐟 ∙ 𝐯 produces a scalar so this is compatible with the contravariant four-vectors.

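The transformation of force from frame A to frame B travelling at relative speed $V$ along the x axis, where $\mathbf{v}_A$ is the velocity of the object measured in frame A, is

$$\begin{bmatrix} \left(\dfrac{\gamma(v)}{c}\mathbf{f}\cdot\mathbf{v}\right)_B \\ \gamma(v_B)f_{xB} \\ \gamma(v_B)f_{yB} \\ \gamma(v_B)f_{zB} \end{bmatrix} = \begin{bmatrix} \gamma(V) & -\dfrac{\gamma(V)V}{c} & 0 & 0 \\ -\dfrac{\gamma(V)V}{c} & \gamma(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \left(\dfrac{\gamma(v)}{c}\mathbf{f}\cdot\mathbf{v}\right)_A \\ \gamma(v_A)f_{xA} \\ \gamma(v_A)f_{yA} \\ \gamma(v_A)f_{zA} \end{bmatrix}$$

where the Lorentz factor of the object depends on its full speed in each frame, $v_A$ or $v_B$.

Thus for the conventional three dimensional force $\mathbf{f}$

$$\gamma(v_B)f_{xB} = -\frac{\gamma(V)V}{c}\left(\frac{\gamma(v_A)}{c}\mathbf{f}\cdot\mathbf{v}\right)_A + \gamma(V)\gamma(v_A)f_{xA}$$

or

$$\gamma(v_B)f_{xB} = \gamma(V)\left(\gamma(v_A)f_{xA} - \frac{\gamma(v_A)V}{c^2}(\mathbf{f}\cdot\mathbf{v})_A\right)$$

$$\gamma(v_B)f_{yB} = \gamma(v_A)f_{yA}$$

$$\gamma(v_B)f_{zB} = \gamma(v_A)f_{zA}$$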

Note that $\mathbf{F}^{\mu}$ is not identical to the Newtonian force because it must transform according to the Lorentz transform to be a contravariant four-vector. This is the case for electromagnetic forces, but not for gravitational forces, which means gravitational forces are outside special relativity and are not forces in the same sense as other forces. Hence general relativity, which deals with gravity, is a different theory.

The superscript 𝜇 on the above four-vectors is important because it indicates they are

contravariant vectors.

This distinguishes them from covariant four-vectors which have a subscripted 𝜇.


Covariant Four-Vectors

Another type of four-vector is the covariant four-vector, which arises as a derivative with respect to the contravariant coordinates.

Let $e_A^0, e_A^1, e_A^2, e_A^3$ be the coordinates along axes 0 to 3 in frame A, and $e_B^0, e_B^1, e_B^2, e_B^3$ the corresponding coordinates in frame B, found using the Lorentz transform.

If a new vector $\mathbf{B}_A$ is found by differentiating a quantity $A$ with respect to the coordinates $e_A$, such that

$B_{A0} = \dfrac{\partial A}{\partial e_A^0}$, $B_{A1} = \dfrac{\partial A}{\partial e_A^1}$ etc.,

then the corresponding component in frame B follows from the chain rule

$$B_{B0} = \frac{\partial A}{\partial e_B^0} = \frac{\partial A}{\partial e_A^0}\frac{\partial e_A^0}{\partial e_B^0} + \frac{\partial A}{\partial e_A^1}\frac{\partial e_A^1}{\partial e_B^0} + \frac{\partial A}{\partial e_A^2}\frac{\partial e_A^2}{\partial e_B^0} + \frac{\partial A}{\partial e_A^3}\frac{\partial e_A^3}{\partial e_B^0}$$

and similarly for $B_{B1}$, $B_{B2}$, $B_{B3}$.

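In the case of special relativity, where $e^0$ is $ct$, $e^1$ is $x$ etc., $\dfrac{\partial e_A^0}{\partial e_B^0}$ is $\dfrac{\partial (ct_A)}{\partial (ct_B)}$:

$$\frac{\partial (ct_A)}{\partial (ct_B)} = \frac{\partial}{\partial (ct_B)}\left(\gamma(V)\left(ct_B + \frac{Vx_B}{c}\right)\right) = \gamma(V) \quad \text{from} \quad ct_A = \gamma(V)\left(ct_B + \frac{Vx_B}{c}\right)$$

$$\frac{\partial e_A^1}{\partial e_B^0} = \frac{\partial x_A}{\partial (ct_B)} = \frac{\partial}{\partial (ct_B)}\left(\gamma(V)(x_B + Vt_B)\right) = \gamma(V)\frac{V}{c} \quad \text{from} \quad x_A = \gamma(V)(x_B + Vt_B)$$

$\dfrac{\partial e_A^2}{\partial e_B^0} = \dfrac{\partial y_A}{\partial (ct_B)} = 0$ and $\dfrac{\partial e_A^3}{\partial e_B^0} = \dfrac{\partial z_A}{\partial (ct_B)} = 0$ so

$B_{B0} = \gamma(V)\left(B_{A0} + \dfrac{VB_{A1}}{c}\right)$ (similar to a contravariant but the sign in the bracket is +) and

$B_{B1} = \gamma(V)\left(B_{A1} + \dfrac{VB_{A0}}{c}\right)$ (similar to a contravariant but the sign in the bracket is +) and

$B_{B2} = B_{A2}$ and $B_{B3} = B_{A3}$. This is called a covariant four-vector and is obtained by differentiating with respect to the contravariant coordinates. $[A_B^{\mu}] = [\Lambda^{\mu}{}_{\nu}][A_A^{\nu}]$ and $[B_{B\mu}] = [(\Lambda^{-1})^{\nu}{}_{\mu}][B_{A\nu}]$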

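$$[\Lambda^{\mu}{}_{\nu}] = \begin{bmatrix} \gamma(V) & -\dfrac{\gamma(V)V}{c} & 0 & 0 \\ -\dfrac{\gamma(V)V}{c} & \gamma(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

and its inverse is

$$[(\Lambda^{-1})^{\mu}{}_{\nu}] = \begin{bmatrix} \gamma(V) & \dfrac{\gamma(V)V}{c} & 0 & 0 \\ \dfrac{\gamma(V)V}{c} & \gamma(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

To distinguish between contravariant and covariant the first uses superscripts and the second subscripts, so we write $[B_{B\mu}] = [(\Lambda^{-1})^{\nu}{}_{\mu}][B_{A\nu}]$ for covariant transforms.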

To clarify the subscripts and superscripts, the convention is that

Contravariant transforms are written as $[e_B^{\mu}] = [\Lambda^{\mu}{}_{\nu}][e_A^{\nu}]$ or $e_B^{\mu} = \sum_{\nu=0}^{3} \Lambda^{\mu}{}_{\nu}\, e_A^{\nu}$, $\mu = 0,1,2,3$. The contravariant vector index is superscripted and is not a power.

Covariant transforms are written as $[e_{B\mu}] = [(\Lambda^{-1})^{\nu}{}_{\mu}][e_{A\nu}]$ or $e_{B\mu} = \sum_{\nu=0}^{3} (\Lambda^{-1})^{\nu}{}_{\mu}\, e_{A\nu}$, $\mu = 0,1,2,3$. The covariant vector index is subscripted.
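The two transformation rules can be compared side by side. The following sketch assumes a boost along x with $V/c = 0.6$ and uses arbitrary component values; `boost_matrices` and the variable names are illustrative. It shows that when one vector transforms with the Lorentz matrix and the other with its inverse, the sum of products of their components is unchanged.

```python
import math

def boost_matrices(beta):
    """Return (lam, lam_inv) for a boost along x, where beta = V/c."""
    g = 1.0 / math.sqrt(1.0 - beta**2)

    def make(sign):
        return [
            [g, sign * g * beta, 0.0, 0.0],
            [sign * g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0],
        ]

    return make(-1.0), make(+1.0)

lam, lam_inv = boost_matrices(0.6)

A_A = [2.0, 1.0, 0.5, -1.0]    # contravariant components in frame A (arbitrary values)
B_A = [1.0, 3.0, -2.0, 0.25]   # covariant components in frame A (arbitrary values)

# Contravariant rule: A_B^mu = sum over nu of lam[mu][nu] * A_A^nu
A_B = [sum(lam[mu][nu] * A_A[nu] for nu in range(4)) for mu in range(4)]
# Covariant rule: B_B,mu = sum over nu of lam_inv[nu][mu] * B_A,nu
B_B = [sum(lam_inv[nu][mu] * B_A[nu] for nu in range(4)) for mu in range(4)]

contraction_A = sum(B_A[i] * A_A[i] for i in range(4))
contraction_B = sum(B_B[i] * A_B[i] for i in range(4))
```

Here `B_B[0]` equals $\gamma(B_{A0} + VB_{A1}/c)$, with the + sign in the bracket, and `contraction_A` equals `contraction_B` to rounding precision.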


Covariant and Contravariant Scalars

The terms covariant and contravariant refer to how an object such as a scalar changes when an axis changes.

The simplest example is where the scale changes. An electric potential gradient can be measured in volts per metre, for example 5 V/m. If the scale is changed from metres to kilometres (multiplied by 1000) the same gradient is 5000 V/km (multiplied by 1000). So if the scale is multiplied by $s$ the value is multiplied by $s$. The term used for this is covariant.

Speed can be measured in metres per second, for example 5 m/s. If the scale is changed from metres to kilometres (multiplied by 1000) the same speed is 0.005 km/s (divided by 1000). So if the scale is multiplied by $s$ the value is multiplied by $\dfrac{1}{s}$. The term used for this is contravariant.

Another way of measuring speed is the time taken to cover a specified distance, i.e. seconds per metre, say 5 s/m. If the scale is changed to kilometres the same speed would be 5000 s/km. In this case both the scale and the value are multiplied by the same factor so it is covariant.

Any contravariant can be converted into covariant and vice versa. The electric potential gradient could have been measured in metres per volt: 5 m/V is the same as 0.005 km/V, which is contravariant.

If the scale changes by $s$ a covariant scalar changes by $s$ and a contravariant scalar by $\dfrac{1}{s}$.

These values are reciprocals of each other.

Many physical quantities are contravariant – for example a mass of 5 grams is 0.005 kg.

Other quantities are naturally covariant, and the two types are not normally distinguished, it

being obvious as to how to change the scale. However the concept is extremely important in

tensor calculus and relativity.

Vectors can also be covariant or contravariant. The scaling is slightly more complicated. The key concept is that it is based on the change in values on one axis when there is a change in values on the other. Given two axes in different coordinate systems $e_A$ and $e_B$, if there is a movement $de_A$ along $e_A$ which is physically the same as the movement $de_B$ along $e_B$, then the scale factor for covariant vectors is $\dfrac{de_A}{de_B}$. The reciprocal $\dfrac{de_B}{de_A}$ applies to contravariant vectors. In many cases these will be partial derivatives because there will be more than one axis.


Contravariant and Covariant Vectors

There are several ways of interpreting contravariant and covariant vectors. One way is to consider the projection of vectors onto axes. Working in two dimensions the following diagram shows the projection of a general 2D vector onto the axes.

The projection onto the x axis can be explained in two ways.

Firstly lines are drawn from the two ends of the vector parallel to the y axis to meet the

x axis, and the contravariant projection is drawn along the x axis between the ends of

these lines.

Secondly lines are drawn from the two ends of the vector perpendicular to the x axis to

meet the x axis, and the covariant projection is drawn along the x axis between the ends

of these lines.

The result is the same (in both cases the x component is 𝐴𝑥), but the method is different.

The same applies to the projection onto the y axis.

The reason the results are the same is that the two axes are orthogonal so perpendicular to one

is the same as parallel to the other.

[Figure: a 2D vector A drawn against orthogonal x and y axes, with its projection $A_x$ onto the x axis.]


Now consider the case where the axes are not orthogonal as in a Minkowski space-time

diagram.

There are two ways of drawing the projection onto the x axis.

Firstly lines are drawn from the two ends of the vector parallel to the ct axis to meet the x axis, and the contravariant projection is drawn along the x axis between the ends of these lines (in blue). The contravariant component is $A^x$.

Secondly lines are drawn from the two ends of the vector perpendicular to the x axis to meet the x axis, and the covariant projection is drawn along the x axis between the ends of these lines (in red). The covariant component is $A_x$.

The results are obviously different in this case: $A^x \neq A_x$.

Likewise the projections onto the ct axis would also differ.

[Figure: a vector A drawn against non-orthogonal ct and x axes, showing its contravariant projection $A^x$ (blue) and its covariant projection $A_x$ (red) on the x axis.]


Another way of looking at the difference is to consider a point and a line.

The diagram shows a point located by a position vector, and a line. There is one scale on the y

axis and two scales on the x axes. There are two unit vectors, one for each scale, on the x axis.

In order to transform from the A to the B scale the length of the unit vector is halved so

$\hat{\mathbf{x}}_B = \tfrac{1}{2}\hat{\mathbf{x}}_A$ or $\hat{\mathbf{x}}_A = 2\hat{\mathbf{x}}_B$

The point is located at (2,2) using scale A, but (4,2) using scale B. In general if there is a point at $(x_A, y)$ using scale A then it is at $(x_B, y)$ using scale B where $x_B = 2x_A$. This is the inverse of $\hat{\mathbf{x}}_B = \tfrac{1}{2}\hat{\mathbf{x}}_A$, and so a position vector to the point is a contravariant vector.

The line has a slope of $-\dfrac{4}{4} = -1$ using scale A, but $-\dfrac{4}{8} = -0.5$ using scale B so $\dfrac{dy}{dx_B} = \dfrac{1}{2}\dfrac{dy}{dx_A}$. This is the same as $\hat{\mathbf{x}}_B = \tfrac{1}{2}\hat{\mathbf{x}}_A$ and so the line is a covariant vector.

If $\hat{\mathbf{x}}_B = s\hat{\mathbf{x}}_A$ where $s$ is a scale factor transforming from one scale to another, then the contratransform is $s^{-1}$ and the cotransform is $s$ (which explains the origin of the terms contra and co).

Making this more general the point could be a point on the curve of $y = f(x)$. All points on that curve would be transformed by the scale factor $s^{-1}$. The line could be a tangent to the curve at that point, and the tangent's slope would scale by $s$, giving $x_B = s^{-1}x_A$ and $\dfrac{dy}{dx_B} = s\dfrac{dy}{dx_A}$.
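The scaling of a point and of a tangent slope can be checked numerically. This is a minimal sketch, assuming the sample curve $y = x^2$ and the scale factor $s = 0.5$ from the example above; the helper names are illustrative.

```python
def f(x_A):
    """Sample curve y = f(x), with x measured on scale A."""
    return x_A ** 2

s = 0.5          # the unit vector on scale B is s times the unit vector on scale A
x_A = 2.0        # coordinate of a point on scale A
x_B = x_A / s    # contravariant rule: the coordinate scales by 1/s

def y_of_xB(xb):
    """The same curve expressed with scale B coordinates, using x_A = s * x_B."""
    return f(s * xb)

def slope(func, x, h=1e-6):
    """Central-difference estimate of d(func)/dx at x."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

slope_A = slope(f, x_A)          # dy/dx_A at the point
slope_B = slope(y_of_xB, x_B)    # dy/dx_B at the same physical point
```

The coordinate doubles (`x_B` is 4.0) while the slope halves (`slope_B` is about `s * slope_A`), which is the contravariant and covariant behaviour described above.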

The scale factor $s$ is called a tensor of rank 0. A tensor is a mathematical object that is used to transform from one set of coordinates to another set of coordinates. This does not mean that all scalars are tensors of rank 0 (only if they are used to transform), but all rank 0 tensors are scalars.

A change in coordinates may mean just a change of scale, but may mean a change in the

orientation of the axes, and even in the shape of the axes (from straight lines to curved lines, or

from lines to angles as in polar and spherical co-ordinates), or any combination. In special

relativity it also includes changes in velocity.

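[Figure: a point and a line plotted against a y axis and an x axis that carries two scales, Scale A (unit vector $\hat{\mathbf{x}}_A$, marked 1 to 4) and Scale B (unit vector $\hat{\mathbf{x}}_B$, marked 2 to 8).]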


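Lowering a Four-Vector Index

A contravariant four-vector can be converted into a covariant four-vector by

$$e_{A\mu} = \sum_{\nu=0}^{3} \eta_{\mu\nu}\, e_A^{\nu}, \qquad \mu = 0,1,2,3$$

where $\eta_{\mu\nu}$ is the row $\mu$ column $\nu$ element in the Minkowski metric. Note that $[\eta_{\mu\nu}]$ indicates it is to be treated as a matrix, $\eta_{\mu\nu}$ as an array, and if $\mu$ and $\nu$ are each assigned a numerical value, as in $\eta_{1,2}$, that component (row 1 column 2).

$$[\eta_{\mu\nu}] \equiv \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}$$

If a four-vector $[A^{\mu}] = (a, b, c, d)$ then $[A_{\mu}] = (a, -b, -c, -d)$.

This gives the covariant vectors

Four-displacement $\Delta\mathbf{X}_{\mu} = (c\Delta t, -\Delta\mathbf{r})$

Four-position $\mathbf{X}_{\mu} = (ct, -\mathbf{r})$

Four-velocity $\mathbf{U}_{\mu} = (\gamma c, -\gamma\mathbf{v})$

Four-momentum $\mathbf{P}_{\mu} = \left(\dfrac{E}{c}, -\mathbf{p}\right)$

Four-force $\mathbf{F}_{\mu} = \left(\gamma\dfrac{\mathbf{f}\cdot\mathbf{v}}{c}, -\gamma\mathbf{f}\right)$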

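Raising a Four-Vector Index

This is the reverse process using the inverse Minkowski metric

$$[\eta^{\mu\nu}] \equiv \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}$$

to give

$$e_A^{\mu} = \sum_{\nu=0}^{3} \eta^{\mu\nu}\, e_{A\nu}, \qquad \mu = 0,1,2,3$$

Note that although $[\eta_{\mu\nu}]$ and $[\eta^{\mu\nu}]$ appear identical, they are different objects but have the same components, and there is an important relationship between them.

$\sum_{\nu=0}^{3} \eta^{\alpha\nu}\eta_{\nu\beta} = \delta^{\alpha}{}_{\beta}$ where $\delta^{\alpha}{}_{\beta} = \delta_{\alpha\beta}$, i.e. 1 if $\alpha = \beta$, otherwise 0, or $[\delta^{\alpha}{}_{\beta}] = [I]$, the identity matrix. $\delta_{ab}$ is known as the Kronecker delta, defined as $\delta_{ab} = 1$ for $a = b$ and 0 for $a \neq b$.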

Note that there is an alternative form of these metrics:

$$\begin{bmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

This results in slightly different definitions such as ∆𝑠 = √(Δ𝑟)2 − (cΔ𝑡)2 which has the

advantage that it is more compatible with ∆𝑟 = √(Δ𝑥)2 + (Δ𝑦)2 + (Δ𝑧)2. Care has to be taken

to note the convention being used.

The minus sign means that this is pseudo Euclidean geometry which differs from a “normal”

four dimensional Euclidean geometry whose metric is (+,+,+, +). Pseudo geometries have

some unusual characteristics. In particular if 𝑠 = √(𝑐𝑡)2 − 𝑟2 then 𝑠 → 0 as 𝑟 → 𝑐𝑡. Also if

𝑟 > 𝑐𝑡 then 𝑠 is an imaginary number. Finally 𝑠 = 0 does not mean 𝑠 is a point or event.


Forming Invariants by Contraction

$$\sum_{\mu,\nu=0}^{3} \eta_{\mu\nu} U^{\mu} U^{\nu} = \gamma^2(v)c^2 - \gamma^2(v)(v_x^2 + v_y^2 + v_z^2) = \gamma^2(v)(c^2 - v^2) = \frac{c^2}{c^2 - v^2}(c^2 - v^2) = c^2$$

which is invariant as is the case for all contravariant four-vectors.

However $\sum_{\mu=0}^{3} \eta_{\mu\nu} U^{\mu} = U_{\nu}$ so

$$\sum_{\mu,\nu=0}^{3} \eta_{\mu\nu} U^{\mu} U^{\nu} = \sum_{\nu=0}^{3} U_{\nu} U^{\nu} = U_0U^0 + U_1U^1 + U_2U^2 + U_3U^3 = c^2$$

Similar examples are $\sum_{\nu=0}^{3} \Delta X_{\nu}\Delta X^{\nu} = (\Delta s)^2$ and $\sum_{\nu=0}^{3} P_{\nu}P^{\nu} = (mc)^2$.

In general $\sum_{\nu=0}^{3} A_{\nu}B^{\nu} = A_0B^0 + A_1B^1 + A_2B^2 + A_3B^3$ which is a scalar (similar to the vector dot product) and invariant if A and B are the same, i.e. $\sum_{\nu=0}^{3} A_{\nu}A^{\nu}$.

$\mathbf{A}\cdot\mathbf{A}$ is the square of the magnitude of a 3D vector. $\sum_{\nu=0}^{3} A_{\nu}A^{\nu}$ is the square of the magnitude of a four-vector.

In matrix terms it is multiplying a row vector and a column vector to give a scalar.

$$\begin{bmatrix} (ct)_1 & -x_1 & -y_1 & -z_1 \end{bmatrix} \begin{bmatrix} (ct)_2 \\ x_2 \\ y_2 \\ z_2 \end{bmatrix} = (ct)_1(ct)_2 - x_1x_2 - y_1y_2 - z_1z_2$$

which is $(\Delta s)^2$ when both are the same four-displacement.

Note that Einstein notation omits the leading $\sum_{\nu=0}^{3}$, for example $\Delta X_{A\nu}\Delta X_A^{\nu}$ is the square of the separation. There are rules for the use of this notation, but it is confusing to those who are not used to using it.

The basic rule is that if an index appears twice and only twice in a term, once subscripted and once superscripted, there is an implied summation, the range of the implied dummy variable being obvious from the context (e.g. 0 to 3 in special relativity).

Thus $A_{\nu}A^{\nu}$ really means $\sum_{\nu=0}^{3} A_{\nu}A^{\nu}$.

This notation is not used here, but is frequently used in other texts.
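Lowering an index and contracting, as described in this section, can be sketched in a few lines. The function names and the sample displacement are illustrative assumptions; the metric is the (+, -, -, -) form used in this text.

```python
# Minkowski metric with signature (+, -, -, -)
ETA = [
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, -1],
]

def lower(A_upper):
    """Covariant components: A_mu = sum over nu of eta[mu][nu] * A^nu."""
    return [sum(ETA[mu][nu] * A_upper[nu] for nu in range(4)) for mu in range(4)]

def contract(A_lower, B_upper):
    """Sum over nu of A_nu * B^nu, written out explicitly rather than implied."""
    return sum(a * b for a, b in zip(A_lower, B_upper))

# Sample four-displacement (c dt, dx, dy, dz) in metres
dX = [5.0, 3.0, 0.0, 0.0]
dX_lower = lower(dX)            # time component unchanged, spatial components negated
ds2 = contract(dX_lower, dX)    # (c dt)^2 - dx^2 - dy^2 - dz^2 = 25 - 9
```

The explicit `sum` in `contract` is exactly the summation sign that Einstein notation leaves out.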


Conservation of Electric Charge

Electric charge is conserved if no charge is brought in or taken out of a system. This is expressed in the continuity equation

$$\frac{\partial\rho}{\partial t} + \frac{\partial J_x}{\partial x} + \frac{\partial J_y}{\partial y} + \frac{\partial J_z}{\partial z} = 0$$

where $\rho$ is the charge density (coulombs per cubic metre) and $\mathbf{J}$ is a vector of the electric current density in amps per square metre.

This can be converted into a current contravariant four-vector $[J^{\mu}] = (c\rho, J_x, J_y, J_z)$ from which

$$\sum_{\nu=0}^{3} \frac{\partial J^{\nu}}{\partial e^{\nu}} = 0$$

the covariant charge continuity equation.

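Electromagnetic Forces

Coulomb's law states that the electrostatic force between two charged particles with charges $q$ and $Q$ separated by a distance $d$ is given by

$$f = \frac{Qq}{4\pi\varepsilon_0 d^2}$$

where $\varepsilon_0$ is the permittivity of free space. Since force is strictly a vector, this can be made into a vector equation by introducing a unit vector $\hat{\mathbf{r}}$ in the direction of $q$ from $Q$ giving

$$\mathbf{f} = \frac{Qq}{4\pi\varepsilon_0 d^2}\hat{\mathbf{r}}$$

This can be converted into an electric field $\boldsymbol{\varepsilon}(\mathbf{r})$ by $\boldsymbol{\varepsilon} = \dfrac{\mathbf{f}}{q}$ where $\boldsymbol{\varepsilon}$ is the electric field around $Q$.

$\mathbf{f} = q\boldsymbol{\varepsilon}(\mathbf{r})$ represents the electrostatic force on $q$ where $\mathbf{r}$ is the position vector of $q$ from $Q$, i.e. the origin is at $Q$.

The magnetic force is dependent on a particle's velocity in addition to its position and charge.

$$\mathbf{f} = q\mathbf{v} \times \frac{\mu_0 I}{2\pi d}\hat{\mathbf{n}}$$

where $\mu_0$ is the permeability of free space, $I$ is the current flowing through a wire, the particle has a velocity vector $\mathbf{v}$ parallel to and in the same direction as the current in the wire, $d$ is the distance of the charge $q$ from the wire and $\hat{\mathbf{n}}$ is a unit vector perpendicular to $\hat{\mathbf{r}}$ and $\mathbf{v}$. This is a cross product so the result is a vector.

This can be converted into a magnetic field $\mathbf{B}(\mathbf{r})$

$\mathbf{f} = q\mathbf{v} \times \mathbf{B}(\mathbf{r})$

Combining these gives

$\mathbf{f} = q(\boldsymbol{\varepsilon} + \mathbf{v} \times \mathbf{B})$ or

$f_x = q(\varepsilon_x + v_yB_z - v_zB_y)$, $f_y = q(\varepsilon_y + v_zB_x - v_xB_z)$, $f_z = q(\varepsilon_z + v_xB_y - v_yB_x)$

Converting this into a matrix equation

$$\begin{bmatrix} f_x \\ f_y \\ f_z \end{bmatrix} = q \begin{bmatrix} \varepsilon_x/c & 0 & B_z & -B_y \\ \varepsilon_y/c & -B_z & 0 & B_x \\ \varepsilon_z/c & B_y & -B_x & 0 \end{bmatrix} \begin{bmatrix} c \\ v_x \\ v_y \\ v_z \end{bmatrix}$$

the $c$ being introduced so the units of the elements of the last matrix are all velocities.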


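Electromagnetic Force Contravariant Four-Tensor

To convert the above into a four-vector an electromagnetic four-tensor is required. This is defined as

$$[\mathrm{F}^{\mu\nu}] = \begin{bmatrix} 0 & -\varepsilon_x/c & -\varepsilon_y/c & -\varepsilon_z/c \\ \varepsilon_x/c & 0 & -B_z & B_y \\ \varepsilon_y/c & B_z & 0 & -B_x \\ \varepsilon_z/c & -B_y & B_x & 0 \end{bmatrix}$$

which is a contravariant four-tensor.

This can be transformed between two special relativity frames of reference by

$$\mathrm{F}_B^{\mu\nu} = \sum_{\alpha,\beta=0}^{3} \Lambda^{\mu}{}_{\alpha}\Lambda^{\nu}{}_{\beta}\, \mathrm{F}_A^{\alpha\beta}$$

where $\Lambda$ is the Lorentz transformation matrix.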

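For example if there is a uniform electric field in the y direction with strength $\varepsilon_y$, the only non-zero terms are $\mathrm{F}_A^{2,0} = -\mathrm{F}_A^{0,2} = \dfrac{\varepsilon_{yA}}{c}$.

If frame B is moving along the x axis with velocity $V$ the only non-zero components of $\mathrm{F}_B^{\mu\nu}$ (apart from their antisymmetric partners) are

$$\mathrm{F}_B^{2,0} = \Lambda^{2}{}_{0}\Lambda^{0}{}_{2}\mathrm{F}_A^{0,2} + \Lambda^{2}{}_{2}\Lambda^{0}{}_{0}\mathrm{F}_A^{2,0} = 0 \times 0 \times \left(-\frac{\varepsilon_{yA}}{c}\right) + 1 \times \gamma(V) \times \frac{\varepsilon_{yA}}{c}$$

$$\mathrm{F}_B^{2,1} = \Lambda^{2}{}_{0}\Lambda^{1}{}_{2}\mathrm{F}_A^{0,2} + \Lambda^{2}{}_{2}\Lambda^{1}{}_{0}\mathrm{F}_A^{2,0} = 0 \times 0 \times \left(-\frac{\varepsilon_{yA}}{c}\right) + 1 \times \left(-\frac{\gamma(V)V}{c}\right) \times \frac{\varepsilon_{yA}}{c}$$

or $\dfrac{\varepsilon_{yB}}{c} = \mathrm{F}_B^{2,0} = \dfrac{\gamma(V)\varepsilon_{yA}}{c}$ and $B_{zB} = \mathrm{F}_B^{2,1} = -\dfrac{\gamma(V)V\varepsilon_{yA}}{c^2}$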

Note that in the first frame of reference there is only an electric field, in the second there is an

electric and a magnetic field. This explains the relationship between electric and magnetic

forces.
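The appearance of a magnetic field in the second frame can be reproduced by transforming the field tensor numerically. The sketch below assumes the tensor layout and the boost matrix given earlier, so that $\Lambda^{1}{}_{0} = -\gamma(V)V/c$; the function names and the sample field strength of 1000 V/m are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def boost_x(V):
    """Lorentz matrix for frame B moving at speed V along x relative to frame A."""
    g = 1.0 / math.sqrt(1.0 - (V / C) ** 2)
    b = V / C
    return [
        [g, -g * b, 0.0, 0.0],
        [-g * b, g, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def field_tensor(E, B):
    """Contravariant electromagnetic tensor F^{mu nu} built from 3D fields E and B."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0, -Ex / C, -Ey / C, -Ez / C],
        [Ex / C, 0.0, -Bz, By],
        [Ey / C, Bz, 0.0, -Bx],
        [Ez / C, -By, Bx, 0.0],
    ]

def transform_tensor(L, F):
    """F_B^{mu nu} = sum over alpha, beta of L[mu][alpha] * L[nu][beta] * F_A^{alpha beta}."""
    return [
        [
            sum(L[mu][a] * L[nu][b] * F[a][b] for a in range(4) for b in range(4))
            for nu in range(4)
        ]
        for mu in range(4)
    ]

V = 0.6 * C        # sample frame speed, gamma = 1.25
Ey_A = 1000.0      # sample uniform electric field in V/m, y direction only
F_B = transform_tensor(boost_x(V), field_tensor((0.0, Ey_A, 0.0), (0.0, 0.0, 0.0)))

Ey_B = F_B[2][0] * C   # electric field seen in frame B
Bz_B = F_B[2][1]       # magnetic field seen in frame B, zero in frame A
```

With these sample values `Ey_B` is $\gamma(V)\varepsilon_{yA}$ and `Bz_B` is $-\gamma(V)V\varepsilon_{yA}/c^2$, so a pure electric field in frame A has a magnetic component in frame B.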

If a long straight wire along the x axis is carrying an electric current, there are moving electrons of velocity $V$ and stationary holes. If a magnet is placed near to the wire it deflects, indicating that the moving electrons are creating a magnetic field. A stationary charged object is not affected because there is no electric field, the number of positive holes equalling the number of electrons.

If the charged object moves parallel to the wire with velocity $V$ (i.e. the same as the electrons), then in its frame the electrons are stationary, so the distance between them is greater than it is in the frame of the wire, while the holes are moving in the opposite direction and the distance between them is contracted. Thus the hole density is greater than the electron density, creating a net charge, and so there is an electric field which creates a force on the charged object. Since the charged object is stationary in this frame, no magnetic force acts on it.

To one observer the force is magnetic, to the other it is electric, but both are observing the same system.

The relationship between the electric and magnetic fields is similar to the relation between duration and distance, and energy and momentum.

It should be noted that the speed of the electrons is only about $10^{-12}c$ so $\gamma(V)$ is only slightly greater than 1, but the number of electrons involved makes the result significant.


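Electromagnetic Force Covariant Four-Tensor

$$[\mathrm{F}^{\mu\nu}] = \begin{bmatrix} 0 & -\varepsilon_x/c & -\varepsilon_y/c & -\varepsilon_z/c \\ \varepsilon_x/c & 0 & -B_z & B_y \\ \varepsilon_y/c & B_z & 0 & -B_x \\ \varepsilon_z/c & -B_y & B_x & 0 \end{bmatrix}$$

which is a contravariant four-tensor, can be converted into a covariant field tensor by applying the Minkowski metric twice

$\mathrm{F}^{\mu}{}_{\beta} = \sum_{\nu=0}^{3} \eta_{\beta\nu}\mathrm{F}^{\mu\nu}$ and $\mathrm{F}_{\alpha\beta} = \sum_{\mu=0}^{3} \eta_{\alpha\mu}\mathrm{F}^{\mu}{}_{\beta}$ to give

$$[\mathrm{F}_{\mu\nu}] = \begin{bmatrix} 0 & \varepsilon_x/c & \varepsilon_y/c & \varepsilon_z/c \\ -\varepsilon_x/c & 0 & -B_z & B_y \\ -\varepsilon_y/c & B_z & 0 & -B_x \\ -\varepsilon_z/c & -B_y & B_x & 0 \end{bmatrix}$$

the covariant four-tensor.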

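Electromagnetic Force Four-Vector

The electromagnetic four-vector is defined by

$\mathbf{F}^{\mu} = q\sum_{\nu=0}^{3} \mathrm{F}^{\mu\nu}\mathbf{U}_{\nu} = q\mathrm{F}^{\mu\nu}\mathbf{U}_{\nu}$ in Einstein notation, the superscripted $\nu$ and subscripted $\nu$ implying the summation sign (so $\nu$ is the dummy index), and the single $\mu$ being the free index.

$\mathbf{F}^{\mu}$ is a contravariant four-vector (note the single superscript and bold upright font).

$\mathrm{F}^{\mu\nu}$ is a four-tensor with two contravariant indices (note the two superscripts and the normal upright font).

$\mathbf{U}_{\nu}$ is the covariant velocity four-vector (note the single subscript and bold font).

$q$ is an invariant.

Both sides of the equation transform with the Lorentz transform and the equation is said to be manifestly covariant (where covariant in this sense means form-invariant, rather than the opposite of contravariant).

Writing this out in full gives

$$\begin{bmatrix} (\gamma(v)/c)\,\mathbf{f}\cdot\mathbf{v} \\ \gamma(v)f_x \\ \gamma(v)f_y \\ \gamma(v)f_z \end{bmatrix} = q \begin{bmatrix} 0 & -\varepsilon_x/c & -\varepsilon_y/c & -\varepsilon_z/c \\ \varepsilon_x/c & 0 & -B_z & B_y \\ \varepsilon_y/c & B_z & 0 & -B_x \\ \varepsilon_z/c & -B_y & B_x & 0 \end{bmatrix} \begin{bmatrix} \gamma(v)c \\ -\gamma(v)v_x \\ -\gamma(v)v_y \\ -\gamma(v)v_z \end{bmatrix}$$

where $\mathbf{v} = (v_x, v_y, v_z)$

The first row is equivalent to $\mathbf{f}\cdot\mathbf{v} = q\boldsymbol{\varepsilon}\cdot\mathbf{v}$ which represents the fact that (only) the electric field is doing work and so changing the energy. The magnetic field $\mathbf{B}$ cannot do work because it acts perpendicular to $\mathbf{v}$ (for a force to do work it must act in the direction of the motion of the body).

Similarly the covariant four-vector is defined by

$\mathbf{F}_{\mu} = q\sum_{\nu=0}^{3} \mathrm{F}_{\mu\nu}\mathbf{U}^{\nu} = q\mathrm{F}_{\mu\nu}\mathbf{U}^{\nu}$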
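The contravariant definition can be checked against the familiar three dimensional force $\mathbf{f} = q(\boldsymbol{\varepsilon} + \mathbf{v} \times \mathbf{B})$. This is a minimal sketch, assuming the contravariant tensor layout defined earlier and the covariant four-velocity $(\gamma c, -\gamma\mathbf{v})$; the function names and all field, charge and velocity values are arbitrary illustrations.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def field_tensor(E, B):
    """Contravariant electromagnetic tensor F^{mu nu} built from 3D fields E and B."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0, -Ex / C, -Ey / C, -Ez / C],
        [Ex / C, 0.0, -Bz, By],
        [Ey / C, Bz, 0.0, -Bx],
        [Ez / C, -By, Bx, 0.0],
    ]

def four_force(q, E, B, v):
    """Return (F^mu, gamma) with F^mu = q * sum over nu of F^{mu nu} * U_nu."""
    g = 1.0 / math.sqrt(1.0 - sum(vi * vi for vi in v) / C**2)
    U_lower = [g * C, -g * v[0], -g * v[1], -g * v[2]]  # covariant four-velocity
    F = field_tensor(E, B)
    F4 = [q * sum(F[mu][nu] * U_lower[nu] for nu in range(4)) for mu in range(4)]
    return F4, g

q = 1.602e-19                      # sample charge in coulombs
E = (10.0, -20.0, 30.0)            # sample electric field in V/m
B = (0.1, 0.2, -0.3)               # sample magnetic field in tesla
v = (1.0e7, -2.0e7, 3.0e7)         # sample particle velocity in m/s

F4, g = four_force(q, E, B, v)
f = [F4[i] / g for i in (1, 2, 3)]     # conventional 3D force
power = F4[0] * C / g                  # rate of doing work, f . v
```

Dividing the spatial components by $\gamma$ recovers $q(\boldsymbol{\varepsilon} + \mathbf{v} \times \mathbf{B})$, and the time component gives $q\boldsymbol{\varepsilon}\cdot\mathbf{v}$ with no contribution from $\mathbf{B}$.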


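Transforming electromagnetic fields

If in frame A

$$[\mathrm{F}_A^{\mu\nu}] = \begin{bmatrix} 0 & -\varepsilon_{xA}/c & -\varepsilon_{yA}/c & -\varepsilon_{zA}/c \\ \varepsilon_{xA}/c & 0 & -B_{zA} & B_{yA} \\ \varepsilon_{yA}/c & B_{zA} & 0 & -B_{xA} \\ \varepsilon_{zA}/c & -B_{yA} & B_{xA} & 0 \end{bmatrix}$$

then in frame B

$$\mathrm{F}_B^{\mu\nu} = \sum_{\alpha,\beta=0}^{3} \Lambda^{\mu}{}_{\alpha}\Lambda^{\nu}{}_{\beta}\, \mathrm{F}_A^{\alpha\beta}$$

For example $\varepsilon_{xB}$ can be calculated from $[\mathrm{F}_A^{\mu\nu}]$ by noting that $\mathrm{F}_B^{10} = \dfrac{\varepsilon_{xB}}{c}$, and

$$\mathrm{F}_B^{10} = \sum_{\alpha,\beta=0}^{3} \Lambda^{1}{}_{\alpha}\Lambda^{0}{}_{\beta}\, \mathrm{F}_A^{\alpha\beta} = \Lambda^{1}{}_{0}\Lambda^{0}{}_{0}\mathrm{F}_A^{00} + \Lambda^{1}{}_{0}\Lambda^{0}{}_{1}\mathrm{F}_A^{01} + \Lambda^{1}{}_{0}\Lambda^{0}{}_{2}\mathrm{F}_A^{02} + \Lambda^{1}{}_{0}\Lambda^{0}{}_{3}\mathrm{F}_A^{03} + \cdots + \Lambda^{1}{}_{3}\Lambda^{0}{}_{3}\mathrm{F}_A^{33}$$

However many of the terms include factors of zero and retaining only the non-zero terms gives

$$\mathrm{F}_B^{10} = \Lambda^{1}{}_{0}\Lambda^{0}{}_{1}\mathrm{F}_A^{01} + \Lambda^{1}{}_{1}\Lambda^{0}{}_{0}\mathrm{F}_A^{10}$$

Since

$\mathrm{F}_A^{01} = -\dfrac{\varepsilon_{xA}}{c}$, $\mathrm{F}_A^{10} = \dfrac{\varepsilon_{xA}}{c}$, $\Lambda^{1}{}_{0} = \Lambda^{0}{}_{1} = -\dfrac{\gamma(V)V}{c}$, $\Lambda^{0}{}_{0} = \Lambda^{1}{}_{1} = \gamma(V)$ and $\gamma^2(V) = \dfrac{1}{1 - \frac{V^2}{c^2}}$,

$$\mathrm{F}_B^{10} = \left(-\frac{\gamma(V)V}{c}\right)\left(-\frac{\gamma(V)V}{c}\right)\left(-\frac{\varepsilon_{xA}}{c}\right) + \gamma(V)\gamma(V)\frac{\varepsilon_{xA}}{c} = \gamma^2(V)\left(1 - \frac{V^2}{c^2}\right)\frac{\varepsilon_{xA}}{c} = \frac{\varepsilon_{xA}}{c}$$

so $\varepsilon_{xB} = \varepsilon_{xA}$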

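Similarly $\varepsilon_{yB} = \gamma(V)(\varepsilon_{yA} - VB_{zA})$ and $\varepsilon_{zB} = \gamma(V)(\varepsilon_{zA} + VB_{yA})$

Likewise $B_{xB} = B_{xA}$, $B_{yB} = \gamma(V)\left(B_{yA} + \dfrac{V}{c^2}\varepsilon_{zA}\right)$, $B_{zB} = \gamma(V)\left(B_{zA} - \dfrac{V}{c^2}\varepsilon_{yA}\right)$

An alternative form, where $\mathbf{V}$ is not necessarily along the x axis and $\parallel$ means parallel to $\mathbf{V}$ and $\perp$ means perpendicular to $\mathbf{V}$, is the Joules-Bernoulli equations

$$\boldsymbol{\varepsilon}_{B\parallel} = \boldsymbol{\varepsilon}_{A\parallel}$$

$$\mathbf{B}_{B\parallel} = \mathbf{B}_{A\parallel}$$

$$\boldsymbol{\varepsilon}_{B\perp} = \gamma(V)\left(\boldsymbol{\varepsilon}_{A\perp} + \mathbf{V} \times \mathbf{B}_{A\perp}\right)$$

$$\mathbf{B}_{B\perp} = \gamma(V)\left(\mathbf{B}_{A\perp} - \frac{\mathbf{V} \times \boldsymbol{\varepsilon}_{A\perp}}{c^2}\right)$$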
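The vector form of the field transformation can be sketched directly. The code below assumes the standard parallel and perpendicular decomposition with $\boldsymbol{\varepsilon}_{B\perp} = \gamma(\boldsymbol{\varepsilon}_{A\perp} + \mathbf{V} \times \mathbf{B}_{A\perp})$ and $\mathbf{B}_{B\perp} = \gamma(\mathbf{B}_{A\perp} - \mathbf{V} \times \boldsymbol{\varepsilon}_{A\perp}/c^2)$; the function names and the sample fields are illustrative, and the frame velocity is assumed to be non-zero.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]

def transform_fields(E, B, V):
    """Fields in frame B, which moves with (non-zero) 3-velocity V relative to frame A."""
    speed = math.sqrt(dot(V, V))
    g = 1.0 / math.sqrt(1.0 - (speed / C) ** 2)
    n = [vi / speed for vi in V]                       # unit vector along V
    E_par = [dot(E, n) * ni for ni in n]
    B_par = [dot(B, n) * ni for ni in n]
    E_perp = [e - p for e, p in zip(E, E_par)]
    B_perp = [b - p for b, p in zip(B, B_par)]
    VxB = cross(V, B_perp)
    VxE = cross(V, E_perp)
    E_B = [E_par[i] + g * (E_perp[i] + VxB[i]) for i in range(3)]
    B_B = [B_par[i] + g * (B_perp[i] - VxE[i] / C**2) for i in range(3)]
    return E_B, B_B

V = [0.6 * C, 0.0, 0.0]          # sample frame velocity along x, gamma = 1.25
E_A = [1.0, 2.0, 3.0]            # sample electric field in V/m
B_A = [1.0e-8, 2.0e-8, 3.0e-8]   # sample magnetic field in tesla
E_B, B_B = transform_fields(E_A, B_A, V)
```

For a velocity along the x axis this reduces to the component formulas, for example `E_B[2]` equals $\gamma(V)(\varepsilon_{zA} + VB_{yA})$, and the scalar product of the electric and magnetic fields is the same in both frames.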


Maxwell’s Equations The strength of the electric and magnetic fields are determined by both the charge and current

densities and their rate of change are expressed by Maxwell’s Equations, here written in

differential vector calculus notation:

𝛁 ∙ 𝛆 =𝜌

0 The electric field leaving a volume is equal to the charge inside or

electric charges are isolated or electric field lines start and stop

𝛁 ∙ 𝐁 = 0 The total magnetic flux piercing a closed surface is zero or there are

no isolated magnetic monopoles and magnetic field lines form

closed loops

𝛁 × 𝛆 = −∂𝐁

∂t The voltage accumulated around a closed circuit is proportional to

the rate of change of the magnetic flux it encloses, or the electric

fields are only conservative if the magnetic field does not vary with

time

𝛁 × 𝐁 = 𝜇0𝐉 +1

𝑐2∂𝛆

∂t Electric currents and changes in electric fields are proportional to

the magnetic field circulating about the area they pierce.

𝛁 × 𝐁 = 𝜇0𝐉 + 𝜇0휀0∂𝛆

∂t An alternative form, both showing that magnetic fields are only

conservative if there is no electric current density and the electric

field does not vary with time

A field is conservative if the work done around all closed loops is zero.

The fourth equation had an important influence on the development of special relativity - firstly

it provided the theoretical evidence for the constant speed of light since if the speed of light

varies then so must the permeability and permittivity of free space, and secondly the unification

of electric and magnetic fields influenced Einstein’s development of space-time.

$\boldsymbol{\varepsilon}$: The electrostatic vector field, normally written as E, but that can be confused with energy

$\rho$: The charge density, a scalar field

$\mathbf{B}$: The magnetic vector field, written H in some texts

$\mathbf{J}$: The current density, a vector field

$\mu_0$: The permeability of free space

$\varepsilon_0$: The permittivity of free space

$\mu_0\varepsilon_0c^2 = 1$: From the fourth equation

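The equations can be expressed in a covariant form as

$$\sum_{\mu=0}^{3}\frac{\partial\mathrm{F}^{\mu\nu}}{\partial x^{\mu}} = \mu_0 J^{\nu}$$

$$\frac{\partial\mathrm{F}_{\lambda\mu}}{\partial x^{\nu}} + \frac{\partial\mathrm{F}_{\nu\lambda}}{\partial x^{\mu}} + \frac{\partial\mathrm{F}_{\mu\nu}}{\partial x^{\lambda}} = 0$$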


Differential Vector Calculus

A field in mathematics is a function defined at every point of a 3D (or 2D) space. A common example is the temperature map in a weather forecast: although only a few temperatures are shown, every point has a temperature. This is an example of a 2D scalar field. Wind speed is another scalar field. However the field of wind velocities is a 2D vector field. These are 2D (a value at each x, y coordinate), but obviously there is also a variation with height so physically the fields are 3D. The mathematics associated with fields is called vector calculus (even for scalar fields).

A scalar field can be represented by $f = f(x, y, z)$ and a vector field by

$$\mathbf{f} = f_x(x, y, z)\,\hat{\mathbf{x}} + f_y(x, y, z)\,\hat{\mathbf{y}} + f_z(x, y, z)\,\hat{\mathbf{z}}$$

A vector field can be converted into a scalar field by taking its magnitude, $f = \sqrt{f_x^2 + f_y^2 + f_z^2}$.

There are three common operations in differential vector calculus. Using the example of the

slope of the ground in a hilly terrain after heavy rain when water is flowing over the surface, the

direction of water flow at each point is that of the steepest gradient at that point. The water flow

lines are similar to the lines of force often used to illustrate force fields. The mathematical

equivalent is the grad operation. The height of the terrain is a 2D scalar field but a gradient has

direction as well as magnitude (positive up) so the grad operation results in a vector field. Near

the top of a hill the gradients tend to converge on one spot – this is (negative) divergence. In fact

the gradient vectors will tend to diverge or converge all over the area. The divergence operation

which only applies to vector fields results in a scalar field. A long horizontal ridge may extend

from the side of the hill, but may curve to the left or right. This is an example of curl, another

operator, which results in a vector perpendicular to the surface. A better example is the swirls

in a fast flowing river.

There are two notations – one uses operators (based on the del (∇) symbol) and the other uses

a notation similar to that used for functions.

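$$\boldsymbol{\nabla}f \equiv \frac{\partial f}{\partial x}\hat{\mathbf{x}} + \frac{\partial f}{\partial y}\hat{\mathbf{y}} + \frac{\partial f}{\partial z}\hat{\mathbf{z}}$$

The gradient of a scalar field, also written as $\mathbf{grad}(f)$, resulting in a vector field, hence the bold $\boldsymbol{\nabla}$ and grad.

$$\boldsymbol{\nabla}\cdot\mathbf{f} \equiv \frac{\partial f_x}{\partial x} + \frac{\partial f_y}{\partial y} + \frac{\partial f_z}{\partial z}$$

Divergence of a vector field, also written as div(f), resulting in a scalar field. The $\boldsymbol{\nabla}$ is bold because it is a vector, but div is not because the result is a scalar field. Note the operator is $\boldsymbol{\nabla}\cdot$.

$$\boldsymbol{\nabla}\times\mathbf{f} \equiv \mathbf{curl}(\mathbf{f}) \equiv \det\begin{bmatrix}\hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}}\\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\ f_x & f_y & f_z\end{bmatrix} \equiv \left(\frac{\partial f_z}{\partial y} - \frac{\partial f_y}{\partial z}\right)\hat{\mathbf{x}} + \left(\frac{\partial f_x}{\partial z} - \frac{\partial f_z}{\partial x}\right)\hat{\mathbf{y}} + \left(\frac{\partial f_y}{\partial x} - \frac{\partial f_x}{\partial y}\right)\hat{\mathbf{z}}$$

Curl is the rotation of a vector field, also written as curl(f), resulting in a vector field. Note the operator is $\boldsymbol{\nabla}\times$.

$$\boldsymbol{\nabla}\cdot\boldsymbol{\nabla}f \equiv \nabla^2 f \equiv \mathrm{div}(\mathbf{grad}(f)) \equiv \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}$$

This is called del squared. It can operate on both scalar and vector fields as in

$$\boldsymbol{\nabla}^2\mathbf{f} \equiv \left(\frac{\partial^2 f_x}{\partial x^2} + \frac{\partial^2 f_x}{\partial y^2} + \frac{\partial^2 f_x}{\partial z^2}\right)\hat{\mathbf{x}} + \left(\frac{\partial^2 f_y}{\partial x^2} + \frac{\partial^2 f_y}{\partial y^2} + \frac{\partial^2 f_y}{\partial z^2}\right)\hat{\mathbf{y}} + \left(\frac{\partial^2 f_z}{\partial x^2} + \frac{\partial^2 f_z}{\partial y^2} + \frac{\partial^2 f_z}{\partial z^2}\right)\hat{\mathbf{z}}$$

A useful theorem is

$$\boldsymbol{\nabla}\times(\boldsymbol{\nabla}\times\mathbf{f}) \equiv \boldsymbol{\nabla}(\boldsymbol{\nabla}\cdot\mathbf{f}) - \boldsymbol{\nabla}^2\mathbf{f}$$

Note $\boldsymbol{\nabla}^2\mathbf{f} \not\equiv \boldsymbol{\nabla}\cdot\boldsymbol{\nabla}\cdot\mathbf{f}$ and $\boldsymbol{\nabla}^2\mathbf{f} \not\equiv \boldsymbol{\nabla}\times\boldsymbol{\nabla}\times\mathbf{f}$.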
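The three operators can be illustrated numerically. The sketch below is only an approximation by central differences, with an assumed step size `H` and hypothetical helper names; the sample fields are chosen for illustration and do not come from the text.

```python
H = 1e-5  # finite-difference step (an assumed value)


def partial(func, point, axis, h=H):
    """Central-difference estimate of the partial derivative along one axis."""
    up, down = list(point), list(point)
    up[axis] += h
    down[axis] -= h
    return (func(*up) - func(*down)) / (2.0 * h)


def grad(f, p):
    """Gradient of a scalar field f(x, y, z): a vector."""
    return tuple(partial(f, p, axis) for axis in range(3))


def div(field, p):
    """Divergence of a vector field returning (fx, fy, fz): a scalar."""
    return sum(partial(lambda *q, i=i: field(*q)[i], p, i) for i in range(3))


def curl(field, p):
    """Curl of a vector field returning (fx, fy, fz): a vector."""
    def d(component, axis):
        return partial(lambda *q: field(*q)[component], p, axis)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))


point = (1.0, 2.0, 3.0)
print(grad(lambda x, y, z: x * x * y + z, point))        # exact value (2xy, x^2, 1)
print(div(lambda x, y, z: (x * x, y * y, z * z), point))  # exact value 2(x + y + z)
print(curl(lambda x, y, z: (-y, x, 0.0), point))          # a uniform swirl, exact value (0, 0, 2)
```

The last field is the "swirl in a river" case: its divergence is zero but its curl points along z everywhere.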


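Speed of Light

Maxwell’s equations are

$$\boldsymbol{\nabla}\cdot\boldsymbol{\varepsilon} = \frac{\rho}{\varepsilon_0} \qquad \boldsymbol{\nabla}\cdot\mathbf{B} = 0 \qquad \boldsymbol{\nabla}\times\boldsymbol{\varepsilon} = -\frac{\partial\mathbf{B}}{\partial t} \qquad \boldsymbol{\nabla}\times\mathbf{B} = \mu_0\mathbf{J} + \mu_0\varepsilon_0\frac{\partial\boldsymbol{\varepsilon}}{\partial t}$$

The charge and current densities ($\rho$ and $\mathbf{J}$) are zero in a vacuum giving

$$\boldsymbol{\nabla}\cdot\boldsymbol{\varepsilon} = 0 \qquad \boldsymbol{\nabla}\cdot\mathbf{B} = 0 \qquad \boldsymbol{\nabla}\times\boldsymbol{\varepsilon} = -\frac{\partial\mathbf{B}}{\partial t} \qquad \boldsymbol{\nabla}\times\mathbf{B} = \mu_0\varepsilon_0\frac{\partial\boldsymbol{\varepsilon}}{\partial t}$$

Taking the curl of $\boldsymbol{\nabla}\times\boldsymbol{\varepsilon} = -\dfrac{\partial\mathbf{B}}{\partial t}$ gives

$$\boldsymbol{\nabla}\times(\boldsymbol{\nabla}\times\boldsymbol{\varepsilon}) = -\frac{\partial(\boldsymbol{\nabla}\times\mathbf{B})}{\partial t}$$

$$= -\frac{\partial}{\partial t}\left(\mu_0\varepsilon_0\frac{\partial\boldsymbol{\varepsilon}}{\partial t}\right) \qquad \text{using } \boldsymbol{\nabla}\times\mathbf{B} = \mu_0\varepsilon_0\frac{\partial\boldsymbol{\varepsilon}}{\partial t}$$

$$= -\mu_0\varepsilon_0\frac{\partial^2\boldsymbol{\varepsilon}}{\partial t^2}$$

$$\boldsymbol{\nabla}(\boldsymbol{\nabla}\cdot\boldsymbol{\varepsilon}) - \boldsymbol{\nabla}^2\boldsymbol{\varepsilon} = -\mu_0\varepsilon_0\frac{\partial^2\boldsymbol{\varepsilon}}{\partial t^2} \qquad \text{using } \boldsymbol{\nabla}\times(\boldsymbol{\nabla}\times\mathbf{f}) \equiv \boldsymbol{\nabla}(\boldsymbol{\nabla}\cdot\mathbf{f}) - \boldsymbol{\nabla}^2\mathbf{f}$$

$$-\boldsymbol{\nabla}^2\boldsymbol{\varepsilon} = -\mu_0\varepsilon_0\frac{\partial^2\boldsymbol{\varepsilon}}{\partial t^2} \qquad \text{using } \boldsymbol{\nabla}\cdot\boldsymbol{\varepsilon} = 0$$

$\boldsymbol{\nabla}^2\boldsymbol{\varepsilon} = \mu_0\varepsilon_0\dfrac{\partial^2\boldsymbol{\varepsilon}}{\partial t^2}$ is a three dimensional wave equation which in one direction x is

$$\frac{\partial^2\varepsilon_x}{\partial x^2} = \mu_0\varepsilon_0\frac{\partial^2\varepsilon_x}{\partial t^2}$$

a solution to which is

$$\varepsilon_x = A\sin\left(2\pi\frac{x - vt}{\lambda}\right)$$

where $A$ is an arbitrary constant and $v$ is the velocity of a wave of wavelength $\lambda$, since differentiating twice wrt $x$ and $t$ gives

$$\frac{\partial^2\varepsilon_x}{\partial x^2} = -A\left(\frac{2\pi}{\lambda}\right)^2\sin\left(2\pi\frac{x - vt}{\lambda}\right) \qquad\text{and}\qquad \frac{\partial^2\varepsilon_x}{\partial t^2} = -A\left(\frac{2\pi v}{\lambda}\right)^2\sin\left(2\pi\frac{x - vt}{\lambda}\right)$$

and substituting back into $\dfrac{\partial^2\varepsilon_x}{\partial x^2} = \mu_0\varepsilon_0\dfrac{\partial^2\varepsilon_x}{\partial t^2}$ gives

$$-A\left(\frac{2\pi}{\lambda}\right)^2\sin\left(2\pi\frac{x - vt}{\lambda}\right) = -\mu_0\varepsilon_0A\left(\frac{2\pi v}{\lambda}\right)^2\sin\left(2\pi\frac{x - vt}{\lambda}\right)$$

This requires that $\mu_0\varepsilon_0v^2 = 1$ which in turn requires $\mu_0\varepsilon_0 = \dfrac{1}{v^2}$.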

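This can be repeated for the y and z directions. Similarly, taking the curl of $\boldsymbol{\nabla}\times\mathbf{B} = \mu_0\varepsilon_0\dfrac{\partial\boldsymbol{\varepsilon}}{\partial t}$ gives

$$-\boldsymbol{\nabla}^2\mathbf{B} + \boldsymbol{\nabla}(\boldsymbol{\nabla}\cdot\mathbf{B}) = \mu_0\varepsilon_0\frac{\partial(\boldsymbol{\nabla}\times\boldsymbol{\varepsilon})}{\partial t}$$

$$-\boldsymbol{\nabla}^2\mathbf{B} = -\mu_0\varepsilon_0\frac{\partial^2\mathbf{B}}{\partial t^2} \qquad \text{using } \boldsymbol{\nabla}\cdot\mathbf{B} = 0 \text{ and } \boldsymbol{\nabla}\times\boldsymbol{\varepsilon} = -\frac{\partial\mathbf{B}}{\partial t}$$

which gives a similar result requiring $\mu_0\varepsilon_0 = \dfrac{1}{v^2}$.

Thus Maxwell’s equations for a vacuum result in wave equations for both the electric and magnetic fields. Both waves have the same speed $\dfrac{1}{\sqrt{\mu_0\varepsilon_0}}$, normally written as $c$, and since $\mu_0$ and $\varepsilon_0$ are both constants the speed of light is a constant. The origin and alignment of the axes are arbitrary, and the speed is relative to the vacuum which does not have a velocity and so the speed $c$ is invariant, thus confirming the first postulate.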
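As a quick numerical illustration of $c = 1/\sqrt{\mu_0\varepsilon_0}$, the sketch below uses the CODATA 2018 recommended values of the two constants; the function name `wave_speed` is hypothetical.

```python
import math

MU_0 = 1.25663706212e-6        # permeability of free space, H/m (CODATA 2018)
EPSILON_0 = 8.8541878128e-12   # permittivity of free space, F/m (CODATA 2018)


def wave_speed(mu, epsilon):
    """Speed v required by the wave equation, from mu * epsilon * v**2 = 1."""
    return 1.0 / math.sqrt(mu * epsilon)


print(wave_speed(MU_0, EPSILON_0))  # close to the defined value 299 792 458 m/s
```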


Four-Tensors

Tensors are mathematical objects, multidimensional arrays, involved in transformations. The rank of a tensor is the number of indices required to identify an element (i.e. the number of dimensions). Tensors that are not scalars or vectors are indicated by upright capital letters such as T or are enclosed in square brackets as in $[T]$.

A scalar can be a tensor, but is of rank 0 (not all scalars are tensors, only those involved in

transformations). Scale factors are a common example.

A vector can be a tensor of rank 1 (not all vectors are tensors, only those involved in

transformations).

A matrix that is involved in transformations is a rank 2 tensor, and has two dimensions.

Tensors can have any number of dimensions, but the special relativity four-tensors have two

dimensions with four components in each dimension, one for time, three spatial, numbered 0 to

3.

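The electromagnetic four-tensors $[\mathrm{F}^{\mu\nu}]$ and $[\mathrm{F}_{\mu\nu}]$ are of rank 2, the first fully contravariant and the second fully covariant. $\mathrm{F}^{\mu}{}_{\nu}$ is also of rank 2, but is a mixed tensor, rank 1 contravariant and rank 1 covariant. The total rank is the sum of the covariant and contravariant ranks, i.e. 2.

There are also four-tensors of higher ranks, but they have the same behaviour under a Lorentz transform, based on $e_B^{\mu} = \sum_{\nu}\Lambda^{\mu}{}_{\nu}\,e_A^{\nu}$. All such tensors are Lorentz covariant, even if the tensor is contravariant; there are two different uses of the term covariant.

A contravariant four-tensor of rank $m$ has $4^m$ components and the transform

$$\mathrm{T}_B^{\mu_1\mu_2\cdots\mu_m} = \sum_{\nu_1,\nu_2,\cdots\nu_m}\Lambda^{\mu_1}{}_{\nu_1}\Lambda^{\mu_2}{}_{\nu_2}\cdots\Lambda^{\mu_m}{}_{\nu_m}\,\mathrm{T}_A^{\nu_1\nu_2\cdots\nu_m}$$

A covariant four-tensor of rank $n$ has $4^n$ components and the transform

$$\mathrm{T}_{B\,\alpha_1\alpha_2\cdots\alpha_n} = \sum_{\beta_1,\beta_2,\cdots\beta_n}(\Lambda^{-1})^{\beta_1}{}_{\alpha_1}(\Lambda^{-1})^{\beta_2}{}_{\alpha_2}\cdots(\Lambda^{-1})^{\beta_n}{}_{\alpha_n}\,\mathrm{T}_{A\,\beta_1\beta_2\cdots\beta_n}$$

A mixed four-tensor of contravariant rank $m$ and covariant rank $n$ consists of $4^{m+n}$ components and the transform

$$\mathrm{T}_B{}^{\mu_1\mu_2\cdots\mu_m}{}_{\alpha_1\alpha_2\cdots\alpha_n} = \sum_{\nu_1,\cdots\nu_m,\beta_1,\cdots\beta_n}\Lambda^{\mu_1}{}_{\nu_1}\cdots\Lambda^{\mu_m}{}_{\nu_m}(\Lambda^{-1})^{\beta_1}{}_{\alpha_1}\cdots(\Lambda^{-1})^{\beta_n}{}_{\alpha_n}\,\mathrm{T}_A{}^{\nu_1\nu_2\cdots\nu_m}{}_{\beta_1\beta_2\cdots\beta_n}$$

The product of a contravariant tensor of rank $m$ and a covariant tensor of rank $n$, summed over each pair of matching indices, has rank $m - n$ where a positive result is contravariant, a negative result covariant, and zero is a scalar as in $\mathbf{X}^{\nu}\mathbf{X}_{\nu}$, $\mathbf{P}^{\nu}\mathbf{P}_{\nu}$, $\mathbf{U}^{\nu}\mathbf{U}_{\nu}$ which all result in scalars.

If the x axes of the two frames of reference are along the same line as is the velocity vector $V$ then $\Lambda$ refers to the standard Lorentz matrix. If this is not the case the terms $\Lambda^{\mu}{}_{\nu}$ and the inverse terms become partial derivatives $\Lambda^{\mu}{}_{\nu} = \dfrac{\partial e_B^{\mu}}{\partial e_A^{\nu}}$.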

Note that special relativity is limited to inertial frames, and objects with a small mass (so they

are not affected by gravity) moving at constant speed.

Objects that accelerate or are affected by gravity require general relativity whose basic equation

contains tensors.
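The transformation rules can be illustrated with plain nested lists. This is a minimal sketch, assuming the standard Lorentz matrix for a boost along x, the metric signature (+, −, −, −), and hypothetical function names; the sample four-vector is arbitrary.

```python
import math

# Minkowski metric, time component first.
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def boost_x(beta):
    """Standard Lorentz matrix for relative speed beta = V/c along the x axis."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def transform_vector(lam, x):
    """Rank 1 contravariant transform: one factor of Lambda."""
    return [sum(lam[m][n] * x[n] for n in range(4)) for m in range(4)]


def transform_rank2(lam, t):
    """Rank 2 contravariant transform: one factor of Lambda per index."""
    return [[sum(lam[m][a] * lam[n][b] * t[a][b]
                 for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]


def contract(x):
    """The scalar X^nu X_nu, lowering the index with the Minkowski metric."""
    return sum(ETA[m][n] * x[m] * x[n] for m in range(4) for n in range(4))


x_a = [5.0, 1.0, 2.0, 3.0]   # illustrative components (ct, x, y, z)
x_b = transform_vector(boost_x(0.6), x_a)
print(contract(x_a), contract(x_b))  # the same scalar in both frames
```

A rank 2 tensor built as the outer product of `x_a` with itself transforms, under `transform_rank2`, into the outer product of `x_b` with itself, which is one way to see why each index needs its own factor of $\Lambda$.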


General Relativity

General relativity can be summarised as “Matter tells space how to curve; space tells matter how to move”. Mass, energy, electromagnetic radiation, gravity, space and time are all linked. However electromagnetic and nuclear forces are not included. The Einstein field equation links space-time and energy-momentum by a constant of proportionality, i.e.

$$\mathrm{G}_{\mu\nu} = -\frac{8\pi G}{c^4}\left(\mathrm{T}_{\mu\nu} + \bar{\mathrm{T}}_{\mu\nu}\right)$$

General relativity is a different theory to special relativity although special relativity can be an

approximation around a specific location in space-time in the same way that a flat map can

approximate the curved surface of the Earth provided the area represented is sufficiently small

that the curvature can be ignored.

General relativity is required to explain acceleration and gravity. Special relativity is restricted

to constant velocities and no gravity.

General relativity requires a curved rather than flat space-time, and this means a non-Euclidean

geometry, one where the coordinates are curved, not straight lines. The simplest example is the

surface of a sphere such as the Earth where the coordinates are latitude and longitude, and

straight lines are replaced by great circles.

In Euclidean geometry the parallel postulate can be expressed in many forms, but essentially

parallel lines remain the same distance apart however far they are extended. In non-Euclidean

geometry this is no longer true.

There are two possibilities. One is that the distance between the lines varies, and that they may eventually meet at a point. This is the geometry of the surface of solids such as the sphere: on Earth lines of longitude meet at the poles. The other possibility is that although the distance between parallel lines can vary the lines never meet, but eventually diverge. An example is the surface of a saddle. Riemannian geometry is based on the former and forms the basis of space-time in general relativity.

However general relativity has requirements that go beyond Riemannian geometry and is based on pseudo-Riemannian geometry. Riemannian geometry states that lines must have a positive length. In pseudo-Riemannian geometry the squared length of a line can also be zero (without the line being a point) or negative. It should be noted that special relativity uses pseudo-Euclidean geometry which has zero curvature. The squared length of a line can be negative ($(\Delta s)^2 < 0$ is space-like) or zero ($(\Delta s)^2 = 0$ is light-like).

General relativity also relies heavily on the uses of tensors and metrics.

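The Minkowski metric is used in special relativity; all the terms are constants.

$$(\mathrm{d}s)^2 = \sum_{\mu,\nu=0}^{3}\eta_{\mu\nu}\,\mathrm{d}x^{\mu}\,\mathrm{d}x^{\nu} \qquad\text{where}\qquad [\eta_{\mu\nu}] \equiv \begin{bmatrix}1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & -1\end{bmatrix}$$

This changes to $(\mathrm{d}s)^2 = \sum_{\mu,\nu=0}^{3}g_{\mu\nu}\,\mathrm{d}x^{\mu}\,\mathrm{d}x^{\nu}$ in general relativity. $[g_{\mu\nu}]$ is similar to $[\eta_{\mu\nu}]$, but functions of the coordinates occur instead of constants. In the case of a flat geometry $[g_{\mu\nu}] = [\eta_{\mu\nu}]$.

$[g_{\mu\nu}]$ is known as the metric tensor. Its inverse $[g^{\mu\nu}]$ is known as the dual metric.

$\sum_{k}g_{\mu k}\,g^{k\nu} = \delta_{\mu}^{\nu}$ where $\delta_{\mu}^{\nu}$ is an array whose components are identical to the components of the identity matrix I.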


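Differential Geometry

In Euclidean geometry parallel lines remain the same distance apart however far they are extended. As a consequence the shortest distance between two points is a straight line, the angles of a triangle add up to 180 degrees, the circumference of a circle is $2\pi r$ and the surface area of a sphere is $4\pi r^2$. Space is described as flat.

The length of a curved line can be calculated by assuming a short length of line extending from $(x, y)$ to $(x + \Delta x, y + \Delta y)$ is straight and has a length $\Delta l$ where $\Delta l = \sqrt{(\Delta x)^2 + (\Delta y)^2}$.

In the limit $\mathrm{d}l = \sqrt{(\mathrm{d}x)^2 + (\mathrm{d}y)^2}$. Adding the elements to get the length of the line gives

$$\int_0^L\mathrm{d}l = \int_{(x_1,y_1)}^{(x_2,y_2)}\sqrt{(\mathrm{d}x)^2 + (\mathrm{d}y)^2}$$

This integral can be solved if the curve can be represented by a function of a parameter $u$, i.e. each point on the curve $(x, y)$ can be represented by $(x(u), y(u))$.

For example a parabola $y = x^2$ can be represented by $(x(u) = u,\ y(u) = u^2)$ and the circle $R^2 = x^2 + y^2$ by $(x(u) = R\cos u,\ y(u) = R\sin u)$.

Then using $\mathrm{d}x = \dfrac{\mathrm{d}x}{\mathrm{d}u}\mathrm{d}u$ and $\mathrm{d}y = \dfrac{\mathrm{d}y}{\mathrm{d}u}\mathrm{d}u$ gives

$$\int_0^L\mathrm{d}l = \int_{u_1}^{u_2}\sqrt{\left(\frac{\mathrm{d}x}{\mathrm{d}u}\mathrm{d}u\right)^2 + \left(\frac{\mathrm{d}y}{\mathrm{d}u}\mathrm{d}u\right)^2} = \int_{u_1}^{u_2}\sqrt{\left(\frac{\mathrm{d}x}{\mathrm{d}u}\right)^2 + \left(\frac{\mathrm{d}y}{\mathrm{d}u}\right)^2}\,\mathrm{d}u$$

For the circumference of a circle $\dfrac{\mathrm{d}x}{\mathrm{d}u} = -R\sin u$, $\dfrac{\mathrm{d}y}{\mathrm{d}u} = R\cos u$, $u_1 = 0$ and $u_2 = 2\pi$ to give

$$C = \int_0^{2\pi}\sqrt{(-R\sin u)^2 + (R\cos u)^2}\,\mathrm{d}u = \int_0^{2\pi}R\sqrt{\sin^2u + \cos^2u}\,\mathrm{d}u = R\int_0^{2\pi}\sqrt{1}\,\mathrm{d}u = R[u]_0^{2\pi} = 2\pi R$$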

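An alternative approach is to change the coordinate system to $(r, \theta)$ where $\theta$ is the angle of $r$ to the x axis (anticlockwise positive). Then

$$\mathrm{d}x = \cos\theta\,\mathrm{d}r - r\sin\theta\,\mathrm{d}\theta$$

$$\mathrm{d}y = \sin\theta\,\mathrm{d}r + r\cos\theta\,\mathrm{d}\theta$$

$$\mathrm{d}l = \sqrt{(\mathrm{d}r)^2 + r^2(\mathrm{d}\theta)^2}$$

For a circle $r = R$, and since $R$ is a constant $\mathrm{d}r = 0$, so

$$C = \int_{\theta=0}^{\theta=2\pi}\sqrt{(\mathrm{d}r)^2 + R^2(\mathrm{d}\theta)^2} = \int_0^{2\pi}\sqrt{(0)^2 + R^2(\mathrm{d}\theta)^2} = R\int_0^{2\pi}\mathrm{d}\theta = R[\theta]_0^{2\pi} = 2\pi R$$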
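The parametrised length integral can also be evaluated numerically when no closed form is convenient. The following is a minimal sketch using the midpoint rule; the function name `arc_length` and the step count are assumptions made for illustration.

```python
import math


def arc_length(dx_du, dy_du, u1, u2, steps=10000):
    """Midpoint-rule estimate of the integral of sqrt((dx/du)^2 + (dy/du)^2) du."""
    h = (u2 - u1) / steps
    total = 0.0
    for i in range(steps):
        u = u1 + (i + 0.5) * h
        total += math.hypot(dx_du(u), dy_du(u)) * h
    return total


R = 2.0
# Circle: x = R cos u, y = R sin u, so dx/du = -R sin u and dy/du = R cos u.
circle = arc_length(lambda u: -R * math.sin(u), lambda u: R * math.cos(u),
                    0.0, 2.0 * math.pi)
# Parabola y = x^2 between x = 0 and x = 1: x = u, y = u^2.
parabola = arc_length(lambda u: 1.0, lambda u: 2.0 * u, 0.0, 1.0)
print(circle, 2.0 * math.pi * R)
print(parabola)
```

For the circle the integrand is the constant $R$, so the numerical result reproduces $2\pi R$; the parabola shows a case where the integrand varies along the curve.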


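Extending this to three dimensions

$\mathrm{d}l = \sqrt{(\mathrm{d}x)^2 + (\mathrm{d}y)^2 + (\mathrm{d}z)^2}$ or in spherical coordinates where

$x = r\sin\theta\cos\phi$ ($\theta$ is now the angle between r and the z axis)

$y = r\sin\theta\sin\phi$ ($\phi$ is the angle between the component of r in the xy plane and the x axis)

$z = r\cos\theta$

so

$$\mathrm{d}x = \sin\theta\cos\phi\,\mathrm{d}r + r\cos\theta\cos\phi\,\mathrm{d}\theta - r\sin\theta\sin\phi\,\mathrm{d}\phi$$

$$\mathrm{d}y = \sin\theta\sin\phi\,\mathrm{d}r + r\cos\theta\sin\phi\,\mathrm{d}\theta + r\sin\theta\cos\phi\,\mathrm{d}\phi$$

$$\mathrm{d}z = \cos\theta\,\mathrm{d}r - r\sin\theta\,\mathrm{d}\theta$$

to give

$$\mathrm{d}l = \sqrt{(\mathrm{d}r)^2 + r^2(\mathrm{d}\theta)^2 + r^2\sin^2\theta\,(\mathrm{d}\phi)^2}$$

If $r = R$ where R is the radius of a sphere (a constant so $\mathrm{d}r = 0$)

$$\mathrm{d}l = \sqrt{R^2(\mathrm{d}\theta)^2 + R^2\sin^2\theta\,(\mathrm{d}\phi)^2}$$

and since only two variables remain this is the line element for a curve drawn on the surface of the sphere.

If $\theta$ is kept constant a circle whose centre is on the z axis is produced (a line of latitude)

$\mathrm{d}l = \sqrt{R^2\sin^2\theta\,(\mathrm{d}\phi)^2} = R\sin\theta\,\mathrm{d}\phi$ and integrating this will give the length of the line.

$$\int_0^L\mathrm{d}l = \int_0^{2\pi}R\sin\theta\,\mathrm{d}\phi$$

$$C = R\sin\theta\,[\phi]_0^{2\pi} = 2\pi R\sin\theta$$

This is the length of a line of latitude where $\theta$ is 0° at the north pole and 180° or $\pi$ at the south.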

However if the radius of the circle was measured on the surface of the sphere (from the z axis along the surface on a line of longitude) it would have the value $R\theta$ and the circumference of the circle should be $2\pi R\theta$ based on Euclidean geometry, which is larger than the actual value $2\pi R\sin\theta$. Thus the geometry on the surface of a sphere is different to that on a plane, and Euclid’s theorems need to be re-evaluated.

The shortest distance between two points (the geodesic) is the shortest arc of a great circle passing through both points, and the angles of a triangle whose sides are minor arcs of great circles add up to more than 180 degrees. For example, in a triangle with one corner at the north pole and two corners on the equator a quarter of the way round from each other, each of the angles is a right angle.
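The difference between the measured and the Euclidean circumference can be tabulated for a few circle sizes. This is a small sketch with hypothetical function names; the radius value is an assumed round figure for the Earth, used only for illustration.

```python
import math


def circumference_on_sphere(radius, theta):
    """Length of the line of latitude at polar angle theta: 2 pi R sin(theta)."""
    return 2.0 * math.pi * radius * math.sin(theta)


def circumference_if_flat(radius, theta):
    """Euclidean prediction from the surface radius R*theta of the same circle."""
    return 2.0 * math.pi * radius * theta


EARTH_RADIUS_KM = 6371.0  # assumed mean radius, for illustration only
for degrees in (1, 30, 60, 90):
    theta = math.radians(degrees)
    actual = circumference_on_sphere(EARTH_RADIUS_KM, theta)
    flat = circumference_if_flat(EARTH_RADIUS_KM, theta)
    print(degrees, round(actual), round(flat), round(actual / flat, 4))
```

The ratio is $\sin\theta/\theta$, which is close to 1 for small circles and falls as the circle grows. This is why the curvature only shows up in surveys covering a large area.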

The above give two tests that determine whether a surface is flat or curved – the length of the

circumference of a circle calculated from its radius and the sum of the angles in a triangle, with

the measurements made on the surface and no knowledge of the third dimension. This means

curvature is an intrinsic property of a surface.

An example of a flat surface that appears curved is the surface of a cylinder. It is a flat surface

rolled up (labels on cylindrical cans and bottles can be printed on flat sheets of paper). Maps of

the Earth cannot be printed on a flat sheet of paper unless some distortion of distances or angles

is introduced.

In theory this could be used to determine the curvature, if any, of the universe, but extremely

large triangles or circles would need to be surveyed very accurately.


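Riemannian Geometry

Riemannian geometry is the geometry on a Riemannian manifold, a multidimensional curved smooth continuous surface where the metric coefficients are not all equal to 1 but are all positive.

Riemannian geometry generalises the concept of line elements to

$$\mathrm{d}l = \sqrt{\sum_{i,j=1}^{n}g_{ij}\,\mathrm{d}e^{i}\,\mathrm{d}e^{j}}$$

where $g_{ij} = g_{ji}$ are metric coefficients. For example if $\mathrm{d}l = \sqrt{(\mathrm{d}r)^2 + r^2(\mathrm{d}\theta)^2}$ then $n = 2$, $g_{11} = 1$, $g_{22} = r^2$, $g_{12} = g_{21} = 0$.

The metric coefficients can be arranged in a rank 2 covariant metric tensor

$$[g_{ij}] \equiv \begin{bmatrix}1 & 0\\ 0 & r^2\end{bmatrix}$$

Note that $g_{22}$ is a function of the r coordinate so these coordinates are not Cartesian, although the plane they describe is still flat.

In Euclidean space with Cartesian coordinates

$$[g_{ij}] \equiv \begin{bmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\end{bmatrix}$$

In Minkowski space-time with Cartesian coordinates

$$[g_{ij}] \equiv [\eta_{\mu\nu}] \equiv \begin{bmatrix}1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & -1\end{bmatrix}$$

the presence of minus signs indicating this is a pseudo-Euclidean geometry.

For the surface of a sphere of radius $R$

$$[g_{ij}] \equiv \begin{bmatrix}R^2 & 0\\ 0 & R^2\sin^2e^{1}\end{bmatrix}$$

where $e^1$ is the $\theta$ coordinate and $e^2$ is the $\phi$ coordinate.

For spherical coordinates where $\mathrm{d}l^2 = \mathrm{d}r^2 + r^2\mathrm{d}\theta^2 + r^2\sin^2\theta\,\mathrm{d}\phi^2$ and $e^1, e^2, e^3$ are the $r, \theta, \phi$ coordinates

$$[g_{ij}] \equiv \begin{bmatrix}1 & 0 & 0\\ 0 & r^2 & 0\\ 0 & 0 & r^2\sin^2\theta\end{bmatrix}$$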

In each case the only non-zero terms are on the leading diagonal. This results from all the above

being based on orthogonal coordinates. Any off-diagonal terms imply that the coordinates are

not orthogonal.

If the orthogonal coordinates are Cartesian the nonzero values are constants. The presence of

functions indicates that the coordinates are not Cartesian.

Note that the metric determines the geometry. However there can be several different metrics

for the same geometry since there are different coordinate systems.

Note that in Minkowski space-time the coefficients do not all have the same sign, and this is also the case in pseudo-Riemannian geometries.

The line element d𝑙 cannot be integrated without a definition of the line.


Parallel Transport and Connection Coefficients

Parallel transport of a vector means moving a vector keeping its length and direction constant, usually with one point on the vector remaining on a line.

Using Cartesian co-ordinates and unit vectors $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$ and $\hat{\mathbf{z}}$, if the vector is given by $\mathbf{v} = v^x\hat{\mathbf{x}} + v^y\hat{\mathbf{y}} + v^z\hat{\mathbf{z}}$ the vector will have exactly the same expression $\mathbf{v} = v^x\hat{\mathbf{x}} + v^y\hat{\mathbf{y}} + v^z\hat{\mathbf{z}}$ wherever it is in space. $(v^x, v^y, v^z)$ are the contravariant projections onto those axes, not powers. This is part of the definition of a vector: its location in space is not a property (unless it is a position vector).

However if spherical coordinates are used this is no longer true since they are curvilinear coordinates: the unit vectors change their direction from place to place.

If it is assumed that the vector is moved such that one end follows a parametrised curve and the coordinates are $e^1, e^2, e^3$ then the curve is given by $e^j(u)$.

At any point on the curve $\mathbf{v}(u) = \sum_j v^j\,\mathbf{e}_j(u)$, where $\mathbf{e}_j$ are the basis vectors.

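The rate of change of the coefficients as the vector moves along the curve is given by

$$\frac{\mathrm{d}\mathbf{v}}{\mathrm{d}u} = \sum_j\left(\frac{\mathrm{d}v^j}{\mathrm{d}u}\mathbf{e}_j(u) + v^j\frac{\partial\mathbf{e}_j}{\partial u}\right) = \sum_j\left(\frac{\mathrm{d}v^j}{\mathrm{d}u}\mathbf{e}_j(u) + \sum_k v^j\frac{\partial\mathbf{e}_j}{\partial x^k}\frac{\mathrm{d}x^k}{\mathrm{d}u}\right)$$

where $x^k$ refers to the Cartesian axes, $x^1$ to x, $x^2$ to y and $x^3$ to z.

$\dfrac{\partial\mathbf{e}_j}{\partial x^k}$ is a vector with components $\Gamma^1$ in direction $\mathbf{e}_1$, $\Gamma^2$ in direction $\mathbf{e}_2$, and $\Gamma^3$ in direction $\mathbf{e}_3$ or generalising

$$\frac{\partial\mathbf{e}_j}{\partial x^k} = \sum_i\Gamma^{i}_{jk}\,\mathbf{e}_i$$

where $\Gamma$ has $3^3$ elements called connection coefficients: three directions $i$, three basis vectors $j$, and three coordinates $k$.

$\Gamma$ is a 3D array, but is not a tensor or a matrix. $\Gamma^{i}_{jk}$ is the component in the direction of the basis (unit) vector $\mathbf{e}_i$ of the rate of change of the basis vector $\mathbf{e}_j$ wrt the coordinate $x^k$.

Then

$$\frac{\mathrm{d}\mathbf{v}}{\mathrm{d}u} = \sum_j\left(\frac{\mathrm{d}v^j}{\mathrm{d}u}\mathbf{e}_j(u) + \sum_{i,k}\Gamma^{i}_{jk}\,\mathbf{e}_i\,v^j\frac{\mathrm{d}x^k}{\mathrm{d}u}\right)$$

which can be rewritten as

$$\frac{\mathrm{d}\mathbf{v}}{\mathrm{d}u} = \sum_i\left(\frac{\mathrm{d}v^i}{\mathrm{d}u} + \sum_{j,k}\Gamma^{i}_{jk}\,v^j\frac{\mathrm{d}x^k}{\mathrm{d}u}\right)\mathbf{e}_i(u)$$

If the vector is transported without any change to its magnitude or direction the rate of change $\dfrac{\mathrm{d}\mathbf{v}}{\mathrm{d}u}$ must be zero so

$$\sum_i\left(\frac{\mathrm{d}v^i}{\mathrm{d}u} + \sum_{j,k}\Gamma^{i}_{jk}\,v^j\frac{\mathrm{d}x^k}{\mathrm{d}u}\right)\mathbf{e}_i(u) = 0$$

or for each component

$$\frac{\mathrm{d}v^i}{\mathrm{d}u} = -\sum_{j,k}\Gamma^{i}_{jk}\,v^j\frac{\mathrm{d}x^k}{\mathrm{d}u}$$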


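Considering two points on the curve $u$ and $u + \mathrm{d}u$

$$v^i(u + \mathrm{d}u) = v^i(u) + \frac{\mathrm{d}v^i}{\mathrm{d}u}\mathrm{d}u$$

or

$$v^i(u + \mathrm{d}u) = v^i(u) - \sum_{j,k}\Gamma^{i}_{jk}\,v^j\frac{\mathrm{d}x^k}{\mathrm{d}u}\mathrm{d}u$$

If the two points are separated by the length $\mathrm{d}\boldsymbol{l}$ then $\mathrm{d}\boldsymbol{l} = \sum_i\mathbf{e}_i\,\mathrm{d}x^i$ and

$$(\mathrm{d}\boldsymbol{l})^2 = \mathrm{d}\boldsymbol{l}\cdot\mathrm{d}\boldsymbol{l} = \sum_i\mathbf{e}_i\,\mathrm{d}x^i\cdot\sum_j\mathbf{e}_j\,\mathrm{d}x^j = \sum_{i,j}(\mathbf{e}_i\cdot\mathbf{e}_j)\,\mathrm{d}x^i\,\mathrm{d}x^j$$

But $(\mathrm{d}\boldsymbol{l})^2 = \sum_{i,j=1}^{n}g_{ij}\,\mathrm{d}e^i\,\mathrm{d}e^j$ from $\mathrm{d}l = \sqrt{\sum_{i,j=1}^{n}g_{ij}\,\mathrm{d}e^i\,\mathrm{d}e^j}$ so

$$g_{ij} = \mathbf{e}_i\cdot\mathbf{e}_j$$

Differentiating wrt $x^k$ (wrt means “with respect to”) gives

$$\frac{\partial g_{ij}}{\partial x^k} = \frac{\partial\mathbf{e}_i}{\partial x^k}\cdot\mathbf{e}_j + \mathbf{e}_i\cdot\frac{\partial\mathbf{e}_j}{\partial x^k}$$

and using $\dfrac{\partial\mathbf{e}_j}{\partial x^k} = \sum_i\Gamma^{i}_{jk}\,\mathbf{e}_i$

$$\frac{\partial g_{ij}}{\partial x^k} = \sum_l\Gamma^{l}_{ik}\,\mathbf{e}_l\cdot\mathbf{e}_j + \mathbf{e}_i\cdot\sum_l\Gamma^{l}_{jk}\,\mathbf{e}_l$$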

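This can be rearranged to give

$$\Gamma^{i}_{jk} = \frac{1}{2}\sum_l g^{il}\left(\frac{\partial g_{lk}}{\partial x^j} + \frac{\partial g_{jl}}{\partial x^k} - \frac{\partial g_{jk}}{\partial x^l}\right)$$

where $g^{il}$ is a component of $[g^{ij}]$ which is the inverse of $[g_{ij}]$, the covariant metric tensor. $[g^{il}]$ is the dual contravariant metric tensor.

$\sum_k g^{ik}g_{kj} = \delta^{i}_{j} = I$, the unit matrix.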
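The formula for the connection coefficients can be evaluated numerically for any metric whose components are known as functions of the coordinates. The sketch below is an illustration only: it assumes a diagonal metric (so the inverse is trivial), uses central differences for the derivatives, and all function names are hypothetical. Indices are zero-based in the code, so `gamma[0][1][1]` corresponds to $\Gamma^{1}_{22}$.

```python
import math


def polar_metric(coords):
    """Plane polar coordinates (r, theta): dl^2 = dr^2 + r^2 dtheta^2."""
    r, _theta = coords
    return [[1.0, 0.0], [0.0, r * r]]


def unit_sphere_metric(coords):
    """Unit sphere (theta, phi): dl^2 = dtheta^2 + sin^2(theta) dphi^2."""
    theta, _phi = coords
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_diagonal(g):
    """Inverse of a diagonal metric (an assumption of this sketch)."""
    n = len(g)
    return [[1.0 / g[i][i] if i == j else 0.0 for j in range(n)]
            for i in range(n)]


def metric_derivative(metric, coords, a, b, k, h=1e-6):
    """Central-difference estimate of d g_ab / d x^k."""
    up, down = list(coords), list(coords)
    up[k] += h
    down[k] -= h
    return (metric(up)[a][b] - metric(down)[a][b]) / (2.0 * h)


def connection(metric, coords):
    """gamma[i][j][k] = (1/2) sum_l g^il (d_j g_lk + d_k g_jl - d_l g_jk)."""
    n = len(coords)
    g_inv = inverse_diagonal(metric(coords))
    return [[[0.5 * sum(g_inv[i][l] * (metric_derivative(metric, coords, l, k, j)
                                       + metric_derivative(metric, coords, j, l, k)
                                       - metric_derivative(metric, coords, j, k, l))
                        for l in range(n))
              for k in range(n)] for j in range(n)] for i in range(n)]


gamma = connection(polar_metric, [2.0, 0.7])
print(gamma[0][1][1])   # analytic value is -r
print(gamma[1][0][1])   # analytic value is 1/r
```

Even though the plane is flat, the polar coefficients are not zero: non-zero connection coefficients show that the basis vectors change from place to place, not that the space is curved.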

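In 2D Euclidean space with Cartesian coordinates $[g_{ij}] = \begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix}$. Therefore $\dfrac{\partial g_{ij}}{\partial x^k} = 0$ for all $i, j, k$ and so $\Gamma^{i}_{jk} = 0$ for all $i, j, k$.

All n-dimensional Euclidean spaces have zero values for the $n^3$ connection coefficients when Cartesian coordinates are used.

Those of a unit sphere are more complex.

$$[g_{ij}] = \begin{bmatrix}1^2 & 0\\ 0 & 1^2\sin^2x^1\end{bmatrix} \qquad\text{so}\qquad [g^{ij}] = \begin{bmatrix}1 & 0\\ 0 & \dfrac{1}{\sin^2x^1}\end{bmatrix}$$

and $\dfrac{\partial g_{22}}{\partial x^1} = 2\sin x^1\cos x^1$ with all other $\dfrac{\partial g_{ij}}{\partial x^k} = 0$, and $g^{11} = 1$, $g^{22} = \dfrac{1}{\sin^2x^1}$ and $g^{12} = g^{21} = 0$.

Eliminating all the zero terms leaves just

$$\Gamma^{1}_{22} = -\frac{1}{2}g^{11}\frac{\partial g_{22}}{\partial x^1} = -\sin x^1\cos x^1$$

$$\Gamma^{2}_{12} = \Gamma^{2}_{21} = \frac{1}{2}g^{22}\frac{\partial g_{22}}{\partial x^1} = \frac{\cos x^1}{\sin x^1} = \cot x^1$$

This means that if the vector is transported around a closed loop it will, in general, point in a different direction to the one it started in. This is another indication that a surface has curvature.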
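The change of direction can be seen by integrating the transport equation numerically around a line of latitude of the unit sphere, where $\theta$ is constant and $\phi$ runs from 0 to $2\pi$. This is a sketch under stated assumptions: the parameter $u$ is taken to be $\phi$, a fourth-order Runge-Kutta step is used as the integrator, and the function names are hypothetical.

```python
import math


def transport_rhs(theta, v):
    """Right-hand side of dv^i/dphi = -sum Gamma^i_jk v^j dx^k/dphi at fixed theta."""
    v_theta, v_phi = v
    d_v_theta = math.sin(theta) * math.cos(theta) * v_phi      # -Gamma^1_22 v^2
    d_v_phi = -(math.cos(theta) / math.sin(theta)) * v_theta   # -Gamma^2_12 v^1
    return (d_v_theta, d_v_phi)


def transport_around_latitude(theta, v0, steps=2000):
    """Carry the components (v^theta, v^phi) once around the circle (RK4)."""
    h = 2.0 * math.pi / steps
    v = tuple(v0)
    for _ in range(steps):
        k1 = transport_rhs(theta, v)
        k2 = transport_rhs(theta, tuple(a + 0.5 * h * b for a, b in zip(v, k1)))
        k3 = transport_rhs(theta, tuple(a + 0.5 * h * b for a, b in zip(v, k2)))
        k4 = transport_rhs(theta, tuple(a + h * b for a, b in zip(v, k3)))
        v = tuple(a + h * (p + 2.0 * q + 2.0 * r + s) / 6.0
                  for a, p, q, r, s in zip(v, k1, k2, k3, k4))
    return v


def length(theta, v):
    """Length of the vector using the unit-sphere metric diag(1, sin^2 theta)."""
    return math.sqrt(v[0] ** 2 + (math.sin(theta) * v[1]) ** 2)


start = (1.0, 0.0)  # initially pointing due south (along increasing theta)
for theta in (math.pi / 2, math.pi / 3):
    end = transport_around_latitude(theta, start)
    print(theta, end, length(theta, end))
```

On the equator ($\theta = \pi/2$), which is a great circle, the vector returns unchanged. On the latitude $\theta = \pi/3$ the components rotate through $2\pi\cos\theta = \pi$, so the vector returns pointing the opposite way while its length is preserved.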


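Geodesics

A geodesic is the generalisation of a straight line to Riemannian geometry. In Euclidean geometry a straight line can be defined as a curve whose tangent always points in the same direction.

If there are n dimensions $e^i$ and a curve parametrised by u then a tangent vector t can be defined by

$$t^i = \frac{\mathrm{d}e^i}{\mathrm{d}u} \qquad i = 1\cdots n$$

If the vector is parallel transported from $u$ to $u + \mathrm{d}u$ then the vector will change to $f(u)\mathbf{t}$ where $f(u)$ is a function of $u$ and, from $\dfrac{\mathrm{d}v^i}{\mathrm{d}u} = -\sum_{j,k}\Gamma^{i}_{jk}\,v^j\dfrac{\mathrm{d}e^k}{\mathrm{d}u}$,

$$\frac{\mathrm{d}t^i}{\mathrm{d}u} + \sum_{j,k}\Gamma^{i}_{jk}\,t^j\frac{\mathrm{d}e^k}{\mathrm{d}u} = f(u)\,t^i$$

Substituting $t^i = \dfrac{\mathrm{d}e^i}{\mathrm{d}u}$ gives

$$\frac{\mathrm{d}^2e^i}{\mathrm{d}u^2} + \sum_{j,k}\Gamma^{i}_{jk}\frac{\mathrm{d}e^j}{\mathrm{d}u}\frac{\mathrm{d}e^k}{\mathrm{d}u} = f(u)\frac{\mathrm{d}e^i}{\mathrm{d}u}$$

If the tangent is to always point in the same direction $f(u) = 0$, and the parameter is then called an affine parameter and is given the symbol $\lambda$, i.e. $\lambda \equiv u$.

Any curve defined by $e^i(\lambda)$ where

$$\frac{\mathrm{d}^2e^i}{\mathrm{d}\lambda^2} + \sum_{j,k}\Gamma^{i}_{jk}\frac{\mathrm{d}e^j}{\mathrm{d}\lambda}\frac{\mathrm{d}e^k}{\mathrm{d}\lambda} = 0$$

is a geodesic in the n-dimensional Riemannian space with metric $[g_{ij}]$ and connection coefficients $\Gamma^{i}_{jk}$. This set of equations is called the geodesic equations.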

In a 2D Euclidean space $\Gamma^{i}_{jk} = 0$ for all $i, j, k$ so the geodesic equations become

$$\frac{\mathrm{d}^2x^i}{\mathrm{d}\lambda^2} = 0$$

This has solutions $x^i = A^i\lambda + B^i$ where $A, B$ are arbitrary constants, and the two equations can be written as $x(\lambda) = A_x\lambda + B_x$, $y(\lambda) = A_y\lambda + B_y$ so the result is a straight line passing through $(B_x, B_y)$ with slope $\dfrac{A_y}{A_x}$, or using vectors $\mathbf{x}(\lambda) = \lambda\mathbf{A} + \mathbf{B}$ (this can be extended to 3D).

An alternative definition of a straight line is that it is the curve with the shortest length between two points. The length is given by

$$L = \int_0^L\mathrm{d}l = \int_{u_1}^{u_2}\sqrt{\left(\frac{\mathrm{d}x}{\mathrm{d}u}\right)^2 + \left(\frac{\mathrm{d}y}{\mathrm{d}u}\right)^2}\,\mathrm{d}u$$

in 2D Cartesian coordinates.

In n-dimensional Riemannian space the length of a line element is

$$\mathrm{d}l = \sqrt{\sum_{i,j=1}^{n}g_{ij}\,\mathrm{d}e^i\,\mathrm{d}e^j}$$

so

$$L = \int_0^L\mathrm{d}l = \int_{u_1}^{u_2}\sqrt{\sum_{i,j=1}^{n}g_{ij}\frac{\mathrm{d}e^i}{\mathrm{d}u}\frac{\mathrm{d}e^j}{\mathrm{d}u}}\,\mathrm{d}u = \int_{u_1}^{u_2}F\,\mathrm{d}u \qquad\text{where}\qquad F = \sqrt{\sum_{i,j=1}^{n}g_{ij}\frac{\mathrm{d}e^i}{\mathrm{d}u}\frac{\mathrm{d}e^j}{\mathrm{d}u}}$$

Note that the line element can only be integrated because the line is defined as the shortest line which is the value obtained from $L = \int_0^L\mathrm{d}l$. The length of any other line depends upon the definition of that line. Some texts mark line elements by an overline $\mathrm{d}\bar{l}$ to show they are non-integrable. However $\mathrm{d}x$ is integrable because the line or its component must be on the x axis.


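When L is a minimum any small change in the curve will not change the length so $\delta L = \delta\int_{u_1}^{u_2}F\,\mathrm{d}u = 0$ which can be shown to be

$$\frac{\mathrm{d}}{\mathrm{d}u}\left(\frac{\partial F}{\partial\left(\dfrac{\mathrm{d}e^m}{\mathrm{d}u}\right)}\right) - \frac{\partial F}{\partial e^m} = 0$$

where m is an index. These are the Euler Lagrange Equations in the calculus of variations (a method of finding the function that minimises a variable instead of a variable value that minimises a function as in differential calculus).

It can be shown that this leads to the geodesic equations

$$\frac{\mathrm{d}^2e^i}{\mathrm{d}\lambda^2} + \sum_{j,k}\Gamma^{i}_{jk}\frac{\mathrm{d}e^j}{\mathrm{d}\lambda}\frac{\mathrm{d}e^k}{\mathrm{d}\lambda} = 0$$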

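Taking for example the equator of a sphere ($\theta = \dfrac{\pi}{2}$, $0 \le \phi < 2\pi$) then $e^1 \equiv \theta$, $e^2 \equiv \phi$, and $\Gamma^{1}_{22} = -\sin e^1\cos e^1 = -\sin\theta\cos\theta$, $\Gamma^{2}_{12} = \Gamma^{2}_{21} = \dfrac{\cos e^1}{\sin e^1} = \dfrac{\cos\theta}{\sin\theta}$ and there are just two geodesic equations

$$\frac{\mathrm{d}^2\theta}{\mathrm{d}\lambda^2} - \sin\theta\cos\theta\left(\frac{\mathrm{d}\phi}{\mathrm{d}\lambda}\right)^2 = 0 \qquad\text{from}\qquad \frac{\mathrm{d}^2e^1}{\mathrm{d}\lambda^2} + \Gamma^{1}_{22}\frac{\mathrm{d}e^2}{\mathrm{d}\lambda}\frac{\mathrm{d}e^2}{\mathrm{d}\lambda} = 0$$

$$\frac{\mathrm{d}^2\phi}{\mathrm{d}\lambda^2} + 2\frac{\cos\theta}{\sin\theta}\frac{\mathrm{d}\theta}{\mathrm{d}\lambda}\frac{\mathrm{d}\phi}{\mathrm{d}\lambda} = 0 \qquad\text{from}\qquad \frac{\mathrm{d}^2e^2}{\mathrm{d}\lambda^2} + \Gamma^{2}_{12}\frac{\mathrm{d}e^1}{\mathrm{d}\lambda}\frac{\mathrm{d}e^2}{\mathrm{d}\lambda} + \Gamma^{2}_{21}\frac{\mathrm{d}e^2}{\mathrm{d}\lambda}\frac{\mathrm{d}e^1}{\mathrm{d}\lambda} = 0$$

Parametrising the equation of the equator $\theta(\lambda) = \dfrac{\pi}{2}$, $\phi(\lambda) = \lambda$, $0 \le \lambda < 2\pi$ and differentiating wrt $\lambda$ gives $\dfrac{\mathrm{d}\theta}{\mathrm{d}\lambda} = 0$, $\dfrac{\mathrm{d}^2\theta}{\mathrm{d}\lambda^2} = 0$, $\dfrac{\mathrm{d}\phi}{\mathrm{d}\lambda} = 1$, $\dfrac{\mathrm{d}^2\phi}{\mathrm{d}\lambda^2} = 0$.

Since $\cos\theta = 0$ on the equator, the LHS of both equations is therefore zero and the geodesic equations are satisfied.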
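The two geodesic equations can also be integrated numerically from a starting point and direction that do not lie along the equator. The sketch below is illustrative only: it assumes a unit sphere, uses a fourth-order Runge-Kutta step, and the function names and starting values are hypothetical. A path on a great circle stays in a plane through the centre of the sphere, so the dot product of each position with the normal to that plane should stay at zero.

```python
import math


def geodesic_rhs(state):
    """First-order form of the two geodesic equations for (theta, phi)."""
    theta, phi, d_theta, d_phi = state
    dd_theta = math.sin(theta) * math.cos(theta) * d_phi ** 2
    dd_phi = -2.0 * (math.cos(theta) / math.sin(theta)) * d_theta * d_phi
    return (d_theta, d_phi, dd_theta, dd_phi)


def rk4_step(state, h):
    """One fourth-order Runge-Kutta step in the affine parameter."""
    def shifted(k, factor):
        return tuple(s + factor * h * ki for s, ki in zip(state, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(k1, 0.5))
    k3 = geodesic_rhs(shifted(k2, 0.5))
    k4 = geodesic_rhs(shifted(k3, 1.0))
    return tuple(s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate_geodesic(state, lam_end, steps=5000):
    """Return the list of states from lambda = 0 to lambda = lam_end."""
    h = lam_end / steps
    path = [tuple(state)]
    for _ in range(steps):
        state = rk4_step(state, h)
        path.append(state)
    return path


def to_cartesian(theta, phi):
    """Position on the unit sphere."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


# Start on the equator at phi = 0, heading partly towards the north pole.
# The Cartesian position is (1, 0, 0) and the velocity is (0, 1, 0.3),
# so the plane of the expected great circle has normal (0, -0.3, 1).
path = integrate_geodesic((math.pi / 2, 0.0, -0.3, 1.0), 5.0)
normal = (0.0, -0.3, 1.0)
worst = max(abs(sum(p * n for p, n in zip(to_cartesian(s[0], s[1]), normal)))
            for s in path)
print(worst)  # stays near zero if the path is a great circle
```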

Relativity is based on pseudo-Riemannian geometry which makes an important change to the

above. Whereas in Riemannian geometry the geodesic has the shortest length since

𝛿𝐿 = 𝛿 ∫ 𝐹𝑢2𝑢1

d𝑢 = 0 finds a minimum, in pseudo-Riemannian geometry it finds a maximum

and so a geodesic has the longest length.

This makes sense in relativity because the separation is given by

Δ𝑠 = √(cΔ𝑡)2 − (Δ𝑟)2

Newton’s first law states that a mass travels in a straight line if not acted on by forces, and a

straight line is the shortest distance between two locations, i.e. Δ𝑟 is a minimum. So given a value of cΔ𝑡, if Δ𝑟 is a minimum, Δ𝑠 , the separation, is a maximum. This path through space-time

is the world-line in special relativity or a time-like geodesic in general relativity.

For massless particles the distance between two events on the path is equal to the distance light travels in the time between them, i.e. $c\Delta t = \Delta r$ and so $\Delta s = 0$. Massless particles follow null geodesics.

The shortest distance results in the maximum separation.
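A short numerical illustration of this point, assuming units in which $c = 1$ and one spatial dimension; the function name `separation` and the two sample paths are hypothetical. Both paths join the same two events, but one goes straight and the other makes a detour.

```python
import math

C = 1.0  # speed of light; natural units are an assumption of this sketch


def separation(events):
    """Sum of sqrt((c dt)^2 - dr^2) over consecutive (t, x) events on a path."""
    total = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        total += math.sqrt((C * (t1 - t0)) ** 2 - (x1 - x0) ** 2)
    return total


straight = [(0.0, 0.0), (10.0, 0.0)]              # stays at x = 0
detour = [(0.0, 0.0), (5.0, 3.0), (10.0, 0.0)]    # out to x = 3 and back
print(separation(straight))  # 10.0
print(separation(detour))    # 8.0
```

The straight path covers the least distance and has the greatest separation, which is the behaviour described above.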


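Curvature

In 2D the curvature of a line can be calculated by approximating a short length of the curve $\delta l$ by an arc of a circle. The angle $\delta\theta$ subtended by the arc $\delta l$ is approximately equal to the difference in the angle of the tangents at the two ends of $\delta l$. If the radius of the circle is R then

$$\delta l = \delta\theta\,R \qquad\text{or}\qquad \delta\theta = \frac{\delta l}{R} \qquad\text{for } \delta l \ll R$$

The greater the value of $\delta\theta$ for a given $\delta l$, the greater the curvature, but the smaller the radius. Therefore $k = \dfrac{1}{R}$ is a measure of the curvature with the dimension $[\mathrm{L}^{-1}]$. For a straight line $k = 0$.

In general if the curve is parametrised as $(x(\lambda), y(\lambda))$ then

$$k = \frac{\left|\dfrac{\mathrm{d}x}{\mathrm{d}\lambda}\dfrac{\mathrm{d}^2y}{\mathrm{d}\lambda^2} - \dfrac{\mathrm{d}y}{\mathrm{d}\lambda}\dfrac{\mathrm{d}^2x}{\mathrm{d}\lambda^2}\right|}{\left(\left(\dfrac{\mathrm{d}x}{\mathrm{d}\lambda}\right)^2 + \left(\dfrac{\mathrm{d}y}{\mathrm{d}\lambda}\right)^2\right)^{3/2}}$$

For an ellipse $x = a\cos\lambda$, $y = b\sin\lambda$

$$k = \frac{ab\sin^2\lambda + ab\cos^2\lambda}{\left(a^2\sin^2\lambda + b^2\cos^2\lambda\right)^{3/2}} = \frac{ab}{\left(a^2\sin^2\lambda + b^2\cos^2\lambda\right)^{3/2}}$$

If $a = b = R$ then a circle of radius R has curvature $k = \dfrac{R^2}{\left(R^2\sin^2\lambda + R^2\cos^2\lambda\right)^{3/2}} = \dfrac{1}{R}$.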
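The general formula is easy to evaluate directly. The following sketch assumes the parametrisation $x = a\cos\lambda$, $y = b\sin\lambda$ for the ellipse and uses hypothetical function names; the derivatives are written out analytically.

```python
import math


def curvature(dx, dy, ddx, ddy):
    """k = |x' y'' - y' x''| / (x'^2 + y'^2)^(3/2) for a parametrised curve."""
    return abs(dx * ddy - dy * ddx) / (dx * dx + dy * dy) ** 1.5


def ellipse_curvature(a, b, lam):
    """Curvature of x = a cos(lam), y = b sin(lam) at parameter lam."""
    dx, dy = -a * math.sin(lam), b * math.cos(lam)
    ddx, ddy = -a * math.cos(lam), -b * math.sin(lam)
    return curvature(dx, dy, ddx, ddy)


print(ellipse_curvature(2.0, 2.0, 0.4))          # circle of radius 2: 1/R
print(ellipse_curvature(3.0, 2.0, 0.0))          # end of the long axis: a / b^2
print(ellipse_curvature(3.0, 2.0, math.pi / 2))  # end of the short axis: b / a^2
```

The ellipse is most sharply curved at the ends of its long axis and flattest at the ends of its short axis, while the circle has the same curvature everywhere.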

This can be extended to a curved surface by drawing a geodesic on the surface at the point, and

measuring the curvature of the geodesic as above. In most cases the value will depend on the

orientation of the geodesic so the Gaussian curvature 𝐾 = 𝑘𝑚𝑎𝑥𝑘𝑚𝑖𝑛 where these are the

maximum and minimum values obtained. In some cases there is curvature in two directions as

on a saddle. The direction of 𝑘𝑚𝑎𝑥 is assumed to be positive and that of 𝑘𝑚𝑖𝑛to be negative in

that case.

The curvature of a sphere is 𝐾 =1

𝑅2 while that of a cylinder is 𝐾 = 0 since the curvature along

the length of the cylinder is zero (since tin can labels are flat pieces of paper).

It can be shown that the curvature of an n dimensional Riemannian space is given by the Riemannian curvature tensor

$$\mathrm{R}^{l}{}_{ijk} = \frac{\partial\Gamma^{l}_{ik}}{\partial e^j} - \frac{\partial\Gamma^{l}_{ij}}{\partial e^k} + \sum_m\Gamma^{m}_{ik}\Gamma^{l}_{mj} - \sum_m\Gamma^{m}_{ij}\Gamma^{l}_{mk}$$

$\mathrm{R}^{l}{}_{ijk}$ is a rank 4 tensor, rank 1 contravariant, rank 3 covariant, and has $n^4$ components.

In a flat space all the component coefficients are zero so $\mathrm{R}^{l}{}_{ijk} = 0$.

In relativity $n = 4$ so there are 256 components. However $\mathrm{R}^{l}{}_{ijk} = -\mathrm{R}^{l}{}_{ikj}$, and other symmetries mean there are only twenty independent components.


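The Gaussian curvature of a 2D surface is given by $K = \dfrac{\mathrm{R}_{1212}}{\det[g_{ij}]}$. Note $\mathrm{R}_{1212}$ is calculated from the Riemann tensor $\mathrm{R}^{1}{}_{212}$ by lowering the first index.

For the surface of a sphere

$$[g_{ij}] = \begin{bmatrix}R^2 & 0\\ 0 & R^2\sin^2e^1\end{bmatrix}$$

where R is the radius, $e^1$ the $\theta$ coordinate and $e^2$ the $\phi$ coordinate.

$$\det[g_{ij}] = R^2R^2\sin^2e^1 = R^4\sin^2e^1$$

The index is lowered in the same way as $e_{A\mu} = \sum_{\nu=0}^{3}\eta_{\mu\nu}\,e_A^{\nu}$, where $\eta$ is replaced by $g$ and $\nu = 1, 2$:

$$\mathrm{R}_{1212} = \sum_{\nu=1}^{2}g_{1\nu}\mathrm{R}^{\nu}{}_{212} = g_{11}\mathrm{R}^{1}{}_{212} + g_{12}\mathrm{R}^{2}{}_{212}$$

$$= g_{11}\mathrm{R}^{1}{}_{212} \qquad\text{since } g_{12} = 0$$

$$= R^2\,\mathrm{R}^{1}{}_{212} \qquad\text{since } g_{11} = R^2$$

Here $R^2$ is the square of the radius and $\mathrm{R}^{1}{}_{212}$ is a component of the Riemannian tensor

$$\mathrm{R}^{l}{}_{ijk} = \frac{\partial\Gamma^{l}_{ik}}{\partial e^j} - \frac{\partial\Gamma^{l}_{ij}}{\partial e^k} + \sum_m\Gamma^{m}_{ik}\Gamma^{l}_{mj} - \sum_m\Gamma^{m}_{ij}\Gamma^{l}_{mk}$$

Replacing $lijk$ by 1212

$$\mathrm{R}^{1}{}_{212} = \frac{\partial\Gamma^{1}_{22}}{\partial e^1} - \frac{\partial\Gamma^{1}_{21}}{\partial e^2} + \sum_m\Gamma^{m}_{22}\Gamma^{1}_{m1} - \sum_m\Gamma^{m}_{21}\Gamma^{1}_{m2}$$

and replacing $m$ by 1 and 2

$$= \frac{\partial\Gamma^{1}_{22}}{\partial e^1} - \frac{\partial\Gamma^{1}_{21}}{\partial e^2} + \Gamma^{1}_{22}\Gamma^{1}_{11} + \Gamma^{2}_{22}\Gamma^{1}_{21} - \Gamma^{1}_{21}\Gamma^{1}_{12} - \Gamma^{2}_{21}\Gamma^{1}_{22}$$

However the only non-zero terms for a sphere are $\Gamma^{1}_{22} = -\sin e^1\cos e^1$ and $\Gamma^{2}_{12} = \Gamma^{2}_{21} = \dfrac{\cos e^1}{\sin e^1}$ leaving $\mathrm{R}^{1}{}_{212} = \dfrac{\partial\Gamma^{1}_{22}}{\partial e^1} - \Gamma^{2}_{21}\Gamma^{1}_{22}$

$$\mathrm{R}^{1}{}_{212} = \frac{\partial}{\partial e^1}\left(-\sin e^1\cos e^1\right) - \frac{\cos e^1}{\sin e^1}\left(-\sin e^1\cos e^1\right)$$

$$= \left(-\cos^2e^1 + \sin^2e^1\right) + \cos^2e^1$$

$$= \sin^2e^1$$

$$\mathrm{R}_{1212} = R^2\,\mathrm{R}^{1}{}_{212} = R^2\sin^2e^1$$

$$K = \frac{\mathrm{R}_{1212}}{\det[g_{ij}]} = \frac{R^2\sin^2e^1}{R^4\sin^2e^1} = \frac{1}{R^2}$$

confirming $K = k_{max}k_{min} = \dfrac{1}{R}\cdot\dfrac{1}{R} = \dfrac{1}{R^2}$ for a sphere.

It should be noted that in 2D if $\mathrm{R}^{1}{}_{212} = 0$, then $\mathrm{R}_{1212} = 0$ and space is flat, regardless of the coordinate system.

$\mathrm{R}^{l}{}_{ijk}$ is zero for all $lijk$ in Cartesian coordinates.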
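The sphere calculation can be repeated numerically. This is a sketch under stated assumptions: the two non-zero connection coefficients are taken from the text, the derivative is estimated by a central difference with an assumed step size, and the function names are hypothetical.

```python
import math


def gamma_1_22(theta):
    """Connection coefficient Gamma^1_22 = -sin(theta) cos(theta)."""
    return -math.sin(theta) * math.cos(theta)


def gamma_2_21(theta):
    """Connection coefficient Gamma^2_21 = cos(theta) / sin(theta)."""
    return math.cos(theta) / math.sin(theta)


def riemann_1_212(theta, h=1e-5):
    """R^1_212 = d(Gamma^1_22)/d(theta) - Gamma^2_21 * Gamma^1_22."""
    d_gamma = (gamma_1_22(theta + h) - gamma_1_22(theta - h)) / (2.0 * h)
    return d_gamma - gamma_2_21(theta) * gamma_1_22(theta)


def gaussian_curvature(radius, theta):
    """K = R_1212 / det[g_ij] for a sphere of the given radius."""
    r_lowered = radius ** 2 * riemann_1_212(theta)      # lower the index with g_11
    det_g = radius ** 4 * math.sin(theta) ** 2
    return r_lowered / det_g


for theta in (0.3, 0.7, 1.5):
    print(theta, gaussian_curvature(3.0, theta))  # 1/R^2 at every latitude
```

The result does not depend on $\theta$, as expected for a sphere, even though the metric and the connection coefficients both do.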

In general relativity it is important to distinguish between the curvature of a surface (such as

the surface of the Earth), the curvature of space, and the curvature of space-time.


Tensors

Tensors are mathematical objects that behave consistently under coordinate transforms. Writing the laws of physics using tensors means that the laws are independent of the coordinate system used. A tensor is described by two values: its rank $r$, the number of dimensions of the tensor, and $n$, the number of dimensions of the space it describes. The rank appears as the number of indices. $n$ appears on top of the summation signs $\sum^{n}$. The number of elements or components is given by $n^r$. In general relativity $n = 4$.

A scalar invariant is a tensor of rank 0 and does not change under a coordinate transform.

Examples include mass, temperature, charge etc. There is only one component since 𝑛0 = 1 for all 𝑛.

A rank 1 tensor is a vector of which the simplest is displacement which is the distance and

direction of one location from another. A vector can be represented as a matrix with a single

dimension. The number of elements or components of a displacement vector depends on the

number of dimensions in the space – typically 2 or 3, or in the case of space-time 4.

Tensors are either contravariant or covariant, or a combination of the two.

A contravariant tensor transforms like a displacement (fractions appear as $\frac{B}{A}$ where $A$ are the original coordinates and $B$ are the new coordinates).

A covariant tensor transforms like a derivative (fractions appear as $\frac{A}{B}$).

Contravariant tensors have their indices as superscripts (not to be interpreted as powers) and

covariant tensors as subscripts (often referred to as up and down).

A displacement is a contravariant tensor of rank 1 and can be represented by a column vector/matrix which transforms by

$$\mathrm{d}e^\mu_B = \sum_{\alpha=0}^{n} \frac{\partial e^\mu_B}{\partial e^\alpha_A}\,\mathrm{d}e^\alpha_A.$$

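A covariant tensor transforms like the derivative or gradient of a scalar function such as $y = f(x)$ whose derivative is $\frac{\partial y}{\partial x}$:

$$\frac{\partial y}{\partial e^\mu_B} = \sum_{\alpha=0}^{n} \frac{\partial y}{\partial e^\alpha_A}\frac{\partial e^\alpha_A}{\partial e^\mu_B} \quad\text{or}\quad A^B_\mu = \sum_{\alpha=0}^{n} \frac{\partial e^\alpha_A}{\partial e^\mu_B} A^A_\alpha.$$

A covariant vector can be represented by a row vector/matrix.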

Note that derivatives of vectors are more complicated.

Contravariant tensors can be converted into covariant tensors and the other way round.

A rank 2 tensor can be represented as a 2D matrix. The classic example is the stress tensor for a

cube of material under tension (which is where the name tensor comes from) where each of the

three faces perpendicular to the x, y and z axes has a stress in the x, y, and z directions giving a

total of nine terms.

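$[g]$ is a metric tensor that is used to convert between contravariant and covariant tensors.

$[g^{\mu\nu}]$ is a contravariant tensor of rank 2:

$$g^{\mu\nu}_B = \sum_{\alpha=0}^{3}\sum_{\beta=0}^{3} \frac{\partial e^\mu_B}{\partial e^\alpha_A}\frac{\partial e^\nu_B}{\partial e^\beta_A}\, g^{\alpha\beta}_A.$$

$[g_{\mu\nu}]$ is a covariant tensor of rank 2:

$$g^B_{\mu\nu} = \sum_{\alpha=0}^{3}\sum_{\beta=0}^{3} \frac{\partial e^\alpha_A}{\partial e^\mu_B}\frac{\partial e^\beta_A}{\partial e^\nu_B}\, g^A_{\alpha\beta}.$$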


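$R^\alpha{}_{\beta\gamma\delta}$, the Riemann curvature tensor, is a rank 4 tensor, rank 1 contravariant and rank 3 covariant, and

$$R^{\alpha}{}_{\beta\gamma\delta}\big|_B = \sum_{\mu=0}^{3}\sum_{\nu=0}^{3}\sum_{\rho=0}^{3}\sum_{\sigma=0}^{3} \frac{\partial e^\alpha_B}{\partial e^\mu_A}\frac{\partial e^\nu_A}{\partial e^\beta_B}\frac{\partial e^\rho_A}{\partial e^\gamma_B}\frac{\partial e^\sigma_A}{\partial e^\delta_B}\, R^{\mu}{}_{\nu\rho\sigma}\big|_A.$$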

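For example, if $A^\mu_A$ is a tensor in polar coordinates and $A^\mu_B$ is the corresponding tensor in 2D Cartesian coordinates, so that $\mu = 1, 2$, $e^1_A \equiv r$, $e^2_A \equiv \theta$, $e^1_B \equiv x$, $e^2_B \equiv y$, $x = r\cos\theta$ and $y = r\sin\theta$, then $A^\mu$ is contravariant of rank 1, so

$$A^\mu_B = \sum_{\alpha=1}^{2} \frac{\partial e^\mu_B}{\partial e^\alpha_A} A^\alpha_A.$$

$$\frac{\partial e^1_B}{\partial e^1_A} = \frac{\partial x}{\partial r} = \cos\theta, \quad \frac{\partial e^1_B}{\partial e^2_A} = \frac{\partial x}{\partial \theta} = -r\sin\theta \quad\text{so}\quad A^1_B = \cos\theta\, A^1_A - r\sin\theta\, A^2_A$$

$$\frac{\partial e^2_B}{\partial e^1_A} = \frac{\partial y}{\partial r} = \sin\theta, \quad \frac{\partial e^2_B}{\partial e^2_A} = \frac{\partial y}{\partial \theta} = r\cos\theta \quad\text{so}\quad A^2_B = \sin\theta\, A^1_A + r\cos\theta\, A^2_A$$

If $A^\mu_B$ is the displacement tensor $\mathrm{d}e^\mu_B$ and $A^\mu_A$ the displacement tensor $\mathrm{d}e^\mu_A$ this means $\mathrm{d}x = \cos\theta\,\mathrm{d}r - r\sin\theta\,\mathrm{d}\theta$ and $\mathrm{d}y = \sin\theta\,\mathrm{d}r + r\cos\theta\,\mathrm{d}\theta$, the same as taking the total differentials of $x = r\cos\theta$ and $y = r\sin\theta$.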
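The polar to Cartesian transformation above can be tried out numerically. This is a small sketch only: the function name and the sample values are assumptions made for illustration. It applies the four partial derivatives to a small displacement and compares the result with the displacement obtained directly from $x = r\cos\theta$ and $y = r\sin\theta$.

```python
import math

def polar_to_cartesian_components(r, theta, a_r, a_theta):
    """Transform contravariant components (A^r, A^theta) to (A^x, A^y)."""
    # Partial derivatives of x = r cos(theta) and y = r sin(theta).
    dx_dr, dx_dtheta = math.cos(theta), -r * math.sin(theta)
    dy_dr, dy_dtheta = math.sin(theta), r * math.cos(theta)
    a_x = dx_dr * a_r + dx_dtheta * a_theta
    a_y = dy_dr * a_r + dy_dtheta * a_theta
    return a_x, a_y

# A small displacement (dr, dtheta) at the point (r, theta).
r, theta = 2.0, 0.5
dr, dtheta = 1e-6, 2e-6
dx, dy = polar_to_cartesian_components(r, theta, dr, dtheta)

# The same displacement computed directly from the coordinate definitions.
exact_dx = (r + dr) * math.cos(theta + dtheta) - r * math.cos(theta)
exact_dy = (r + dr) * math.sin(theta + dtheta) - r * math.sin(theta)
print(dx, exact_dx)
print(dy, exact_dy)
```

The two results agree up to terms of second order in the displacement, which is what a first-derivative transformation rule should give.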

A tensor can have its index raised by $A^\mu = \sum_{\alpha=0}^{3} g^{\mu\alpha} A_\alpha$ and lowered by $A_\mu = \sum_{\alpha=0}^{3} g_{\mu\alpha} A^\alpha$.

Contraction of a tensor reduces the number of indices. If a tensor A has 𝑟 upper and 𝑠 lower

indices, then replacing one upper and one lower index by a dummy index 𝜎 another tensor of

ranks 𝑟 − 1 and 𝑠 − 1 can be produced by

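$$A^{i_1 i_2 \cdots i_{r-1}}{}_{j_1 j_2 \cdots j_{s-1}} = \sum_{\sigma=1}^{n} A^{i_1 i_2 \cdots \sigma \cdots i_r}{}_{j_1 j_2 \cdots \sigma \cdots j_s}$$

The simplest example is a mixed tensor of rank 2 that is a square matrix, for example with $n = 2$

$$[A^i{}_j] = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}.$$

This can be contracted to produce a scalar (tensor of rank 0) by

$$\sum_{\sigma=1}^{n} A^\sigma{}_\sigma = \sum_{\sigma=1}^{2} A^\sigma{}_\sigma = a_{11} + a_{22}$$

which is the same as the trace of the matrix. Note that information has been lost: $a_{21}$ and $a_{12}$.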

Note also that a single vector cannot be contracted because there must be one contravariant and one covariant index. However the dot product of two vectors is a contraction because it produces a scalar.
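As an illustration of contraction, the sketch below sums over a repeated index for the two cases just described: the trace of a 2x2 mixed tensor and the dot product of a covariant and a contravariant vector. The function names and sample numbers are hypothetical.

```python
def contract(matrix):
    """Contract a mixed rank-2 tensor A^i_j over its two indices (the trace)."""
    n = len(matrix)
    return sum(matrix[sigma][sigma] for sigma in range(n))

def inner_product(covector, vector):
    """Contract X_i Y^i: one lower index summed against one upper index."""
    return sum(x * y for x, y in zip(covector, vector))

a = [[1.0, 2.0],
     [3.0, 4.0]]
print(contract(a))                              # 5.0, i.e. a11 + a22
print(inner_product([1.0, 2.0], [3.0, 4.0]))    # 11.0, a scalar
```

The off-diagonal entries 2.0 and 3.0 play no part in the result, which is the loss of information mentioned above.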


Operations between Tensors

A tensor is scaled by multiplying it by a scalar – each component is multiplied by the scalar in the same way that a vector or matrix is scaled. This does not change the rank of a tensor.

Two tensors can be added or subtracted provided they are of exactly the same contravariant

and covariant ranks, resulting in a tensor of the same ranks, the components being added in the

same way as vector or matrix addition.

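$$X^{\alpha\beta} + Y^{\alpha\beta} = Z^{\alpha\beta}$$

Two tensors can be multiplied to produce a tensor product (similar to the outer product of vectors), but this is very time consuming. The product has ranks that are the sum of the ranks of the two tensors and is often written with the ⊗ symbol.

$$X^{\alpha}{}_{\beta} \otimes Y^{\gamma}{}_{\delta} = Z^{\alpha}{}_{\beta}{}^{\gamma}{}_{\delta} \quad\text{or}\quad = Z^{\alpha\gamma}{}_{\beta\delta}$$

Both results are valid but are not the same. In particular

$$X^{\alpha} \otimes X^{\beta} = Z^{\alpha\beta}$$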

Two tensors can also be multiplied to produce an inner product (similar to the multiplication of

two matrices). This is a form of contraction.

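$X_i Y^i$ is a scalar, for example, and

$$X^{\alpha\mu}{}_{\beta}\, Y^{\beta}{}_{\alpha\nu} = Z^{\mu}{}_{\nu}$$

More generally

$$Z^{i_1 i_2 \cdots i_{r-1} k_1 k_2 \cdots k_t}{}_{j_1 j_2 \cdots j_s l_1 l_2 \cdots l_{u-1}} = \sum_{\sigma=1}^{n} X^{i_1 i_2 \cdots \sigma \cdots i_r}{}_{j_1 j_2 \cdots j_s}\, Y^{k_1 k_2 \cdots k_t}{}_{l_1 l_2 \cdots \sigma \cdots l_u}.$$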

Tensor equations are “generally covariant”, “manifestly covariant” or are “covariant equations”

(all have the same meaning) – this has nothing to do with covariant tensors, but means that if

the equation is valid in one set of coordinates it is valid in all sets of coordinates. The word

covariant has two different meanings depending on context.


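Tensor Notation

It is important that the order of the indices is maintained. Since $A^i_j$ (with one index written directly above the other) gives no indication of the order, it must be written as $A^i{}_j$ or $A_j{}^i$. The order of the indices must be maintained unless the values in the tensor are symmetric.

If

$$B^{\mu\nu}{}_{\rho} = \sum_{\alpha,\beta,\gamma} \frac{\partial x^\mu_B}{\partial x^\alpha_A}\frac{\partial x^\nu_B}{\partial x^\beta_A}\frac{\partial x^\gamma_A}{\partial x^\rho_B}\, A^{\alpha\beta}{}_{\gamma}$$

then A and B are both tensors of total rank 3 – rank 2 contravariant and rank 1 covariant.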

Some indices only occur in the up position in an equation (here 𝜇𝜈), and others only in a down

position (here 𝜌) in an equation. These are called the free indices. They occur once and only

once in every term on both sides of the equation.

Some indices occur exactly twice in a term (here 𝛼𝛽𝛾), one up, the other down, and are summed

over (also occur in the ∑ operator). These are called dummy indices and only have meaning

with that summation.

In programming terms a dummy index is the index in a “for loop” as in “for i =1 to n”. A free

index identifies a parameter passed to a function call that does the summing from an array of

such parameters.
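The programming analogy can be made concrete with the index-lowering rule $A_\mu = \sum_\alpha g_{\mu\alpha} A^\alpha$. In this sketch (the function name and the sample vector are illustrative assumptions) the outer loop runs over the free index $\mu$, which labels the components of the result, while the inner loop runs over the dummy index $\alpha$, which disappears once the sum is complete.

```python
def lower_index(metric, vector):
    """Return A_mu = sum over alpha of g_{mu alpha} * A^alpha."""
    n = len(vector)
    lowered = []
    for mu in range(n):            # free index: one result per value of mu
        total = 0.0
        for alpha in range(n):     # dummy index: exists only inside the sum
            total += metric[mu][alpha] * vector[alpha]
        lowered.append(total)
    return lowered

# Flat space-time metric with signature (+, -, -, -).
eta = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]

print(lower_index(eta, [2.0, 1.0, 0.0, 3.0]))   # [2.0, -1.0, 0.0, -3.0]
```

Renaming `alpha` inside the loop changes nothing, just as renaming a dummy index leaves a tensor equation unchanged.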


Tensors in Relativity

The Lorentz transforms are linear so the sixteen first derivatives $\frac{\partial x^\mu_B}{\partial x^\nu_A}$ are constants.

In general relativity the transform from $e^i_A$ to $e^i_B$ is nonlinear, and the sixteen first and second derivatives are functions, not constants.

Relativity is generally covariant and forming derivatives of covariant tensors is important.

However the partial derivatives of tensors are not generally covariant.

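Covariant Tensor Differentiation

Although differentiated tensors are not generally tensors except in special relativity, and connection coefficients $\Gamma^i{}_{jk}$ are not tensors, the combination $\frac{\partial v^\alpha}{\partial e^\beta} + \sum_\lambda \Gamma^\alpha{}_{\lambda\beta} v^\lambda$ does result in a tensor, as does $\frac{\partial v_\alpha}{\partial e^\beta} - \sum_\lambda \Gamma^\lambda{}_{\alpha\beta} v_\lambda$, and these operations occur sufficiently frequently that they are given symbols called the covariant derivative defined by

$$\nabla_\beta v^\alpha \equiv \frac{\partial v^\alpha}{\partial e^\beta} + \sum_\lambda \Gamma^\alpha{}_{\lambda\beta} v^\lambda \quad\text{for contravariant vectors}$$

$$\nabla_\beta v_\alpha \equiv \frac{\partial v_\alpha}{\partial e^\beta} - \sum_\lambda \Gamma^\lambda{}_{\alpha\beta} v_\lambda \quad\text{for covariant vectors}$$

These can be extended to tensors of a higher rank and of mixed type such as

$$\nabla_\lambda T^\mu{}_\nu \equiv \frac{\partial T^\mu{}_\nu}{\partial e^\lambda} + \sum_\rho \Gamma^\mu{}_{\rho\lambda} T^\rho{}_\nu - \sum_\rho \Gamma^\rho{}_{\nu\lambda} T^\mu{}_\rho$$

An important property in general relativity is covariant divergence defined as $\sum_{\mu=0}^{3} \nabla_\mu T^{\mu\nu}$.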
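To see why the connection term is needed, consider a vector field that is constant in Cartesian coordinates but is written in plane polar components. Its partial derivatives are not zero, yet its covariant derivative is. The sketch below checks this numerically; it assumes the standard plane-polar connection coefficients $\Gamma^r{}_{\theta\theta} = -r$ and $\Gamma^\theta{}_{r\theta} = \Gamma^\theta{}_{\theta r} = 1/r$, which are not quoted in the notes above, and all function names are illustrative.

```python
import math

def christoffel(r):
    """Plane-polar connection coefficients, indexed [upper][lower][lower].

    Index 0 is r and index 1 is theta (standard values, assumed here).
    """
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gamma[0][1][1] = -r          # Gamma^r_{theta theta}
    gamma[1][0][1] = 1.0 / r     # Gamma^theta_{r theta}
    gamma[1][1][0] = 1.0 / r     # Gamma^theta_{theta r}
    return gamma

def field(r, theta):
    """Polar components (v^r, v^theta) of the constant Cartesian field (1, 0)."""
    return [math.cos(theta), -math.sin(theta) / r]

def covariant_derivative(r, theta, alpha, beta, step=1e-6):
    """nabla_beta v^alpha = d v^alpha / d e^beta + sum_lam Gamma^alpha_{lam beta} v^lam."""
    plus, minus = [r, theta], [r, theta]
    plus[beta] += step
    minus[beta] -= step
    partial = (field(*plus)[alpha] - field(*minus)[alpha]) / (2 * step)
    gamma = christoffel(r)
    v = field(r, theta)
    return partial + sum(gamma[alpha][lam][beta] * v[lam] for lam in range(2))

for alpha in range(2):
    for beta in range(2):
        print(alpha, beta, covariant_derivative(2.0, 0.6, alpha, beta))
```

All four printed values are zero to within the accuracy of the finite-difference estimate, whereas the partial derivatives alone are not.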


Newtonian Gravity

Newton’s Law of Gravity states that there is an attracting force between two objects of masses $m_1$ and $m_2$ that are a distance $d$ apart given by

$$F_g = G\frac{m_1 m_2}{d^2}$$

where $G$ is the gravitational constant. Although this equation gives results confirmed by observation and experiment (with a few exceptions) it is an empirical law, and is not a theory.

It does not explain why this attraction exists.

It does not explain how this force is carried between the two objects.

The speed at which changes in the force are carried is not defined and is assumed to be infinite.

It only affects objects with mass so there is no interaction with electromagnetic radiation.

It does not explain why the ratio of gravitational mass ($E_{PE} = m_g g h$) to inertial mass ($F = m_i a$) is a constant (taken to be $\frac{m_g}{m_i} = 1$ in most measurement systems so the two types of mass are not distinguished in practice).

It does not explain the orbit of Mercury.

Hence the need for a proper theory, especially when masses and densities are very large, speeds are close to that of light, or there are deviations in the path of light in a vacuum from a straight line, and one which predicts the orbital deviations of Mercury.


The Gravitational Field Theory

This states that an object with mass is surrounded by an invisible gravitational field, similar to the electromagnetic fields.

The presence of the field can be detected by placing a test mass within the field - there will be a force on the test mass $F_t$ as given by Newton’s law $F_t = G\frac{M m_t}{d^2}$ created by a mass $M$ at a distance $d$.

In mathematics a field is a region of space where a property has a value at every point. Examples include the temperature of the atmosphere (a scalar field) or the wind strength and direction (a

vector field). The field can be two or three dimensional.

The branch of mathematics dealing with fields is called vector calculus. This introduces some

new functions and operators.

The grad function applies to scalar fields and is the rate of change or gradient along the axes and

the result is a gradient vector. In geography this can be thought of as the slope of the ground.

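$$\mathbf{grad}\, f \equiv \frac{\partial f}{\partial x}\hat{\mathbf{x}} + \frac{\partial f}{\partial y}\hat{\mathbf{y}} + \frac{\partial f}{\partial z}\hat{\mathbf{z}}$$

There are two functions of a vector field.

$$\mathrm{div}\, \mathbf{f} \equiv \frac{\partial f_x}{\partial x} + \frac{\partial f_y}{\partial y} + \frac{\partial f_z}{\partial z}$$

resulting in a scalar that indicates the divergence in the vector field. This can be thought of as the way the slope or gradient of a hill converges when ascending a hill, or the slopes diverge from the bottom of a closed depression, remembering ascending slopes are positive when treated as a vector.

$$\mathbf{curl}\, \mathbf{f} \equiv \left(\frac{\partial f_z}{\partial y} - \frac{\partial f_y}{\partial z}\right)\hat{\mathbf{x}} + \left(\frac{\partial f_x}{\partial z} - \frac{\partial f_z}{\partial x}\right)\hat{\mathbf{y}} + \left(\frac{\partial f_y}{\partial x} - \frac{\partial f_x}{\partial y}\right)\hat{\mathbf{z}}$$

which is a vector indicating the rotation of a field such as eddies in a river, and cyclones and anticyclones in the atmosphere.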

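The operator $\boldsymbol{\nabla}$ is the equivalent of grad, $\boldsymbol{\nabla}\cdot$ the equivalent of div and $\boldsymbol{\nabla}\times$ the equivalent of curl. In all three cases $\boldsymbol{\nabla} \equiv \hat{\mathbf{x}}\frac{\partial}{\partial x} + \hat{\mathbf{y}}\frac{\partial}{\partial y} + \hat{\mathbf{z}}\frac{\partial}{\partial z}$ is a vector operator.

These functions/operators often appear in pairs such as $\mathrm{div}(\mathbf{grad}\, f)$ or $\boldsymbol{\nabla}\cdot\boldsymbol{\nabla} f$ and this combination is given the symbol

$$\nabla^2 f \equiv \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}.$$

Note that it operates on a scalar field and $\nabla^2$ is a scalar operator. There is a similar operator (in bold) that operates on vector fields

$$\boldsymbol{\nabla}^2 \mathbf{f} \equiv \left(\frac{\partial^2 f_x}{\partial x^2} + \frac{\partial^2 f_x}{\partial y^2} + \frac{\partial^2 f_x}{\partial z^2}\right)\hat{\mathbf{x}} + \left(\frac{\partial^2 f_y}{\partial x^2} + \frac{\partial^2 f_y}{\partial y^2} + \frac{\partial^2 f_y}{\partial z^2}\right)\hat{\mathbf{y}} + \left(\frac{\partial^2 f_z}{\partial x^2} + \frac{\partial^2 f_z}{\partial y^2} + \frac{\partial^2 f_z}{\partial z^2}\right)\hat{\mathbf{z}}$$

The above refers to differential vector calculus – there is also integral vector calculus that covers line integrals $\int_L$, surface integrals $\iint_S$ and volume integrals $\iiint_V$.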


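An isolated mass $m$ is surrounded by a gravitational field $\mathbf{g}(\mathbf{r})$ given by $\mathbf{g}(\mathbf{r}) = -\frac{Gm}{|\mathbf{r}|^2}\mathbf{e}_r$ where the origin is at the centre of the mass, $\mathbf{r}$ is a position vector and $\mathbf{e}_r$ is a unit vector.

If the mass is surrounded by a sphere of radius $R$, the gravitational flux passing through the sphere is given by $\iint_S \mathbf{g}\cdot\mathbf{n}\,\mathrm{d}S$ where $\mathbf{n}$ is a unit vector perpendicular to the surface of the sphere.

The field strength is $\mathbf{g}(\mathbf{r}) = -\frac{Gm}{R^2}\mathbf{e}_r$ and the surface integral of the sphere is $\iint_S \mathrm{d}S = 4\pi R^2$ so

$$\iint_S \mathbf{g}\cdot\mathbf{n}\,\mathrm{d}S = -4\pi G m$$

Gauss’ theorem or divergence theorem of vector calculus states that $\iiint_V \mathrm{div}\,\mathbf{g}\,\mathrm{d}V = \iint_S \mathbf{g}\cdot\mathbf{n}\,\mathrm{d}S$ or

$$\iiint_V \boldsymbol{\nabla}\cdot\mathbf{g}\,\mathrm{d}V = -4\pi G m$$

The mass of the sphere is the volume integral of its density, i.e. $m = \iiint_V \rho\,\mathrm{d}V$, so

$$\iiint_V \boldsymbol{\nabla}\cdot\mathbf{g}\,\mathrm{d}V = -4\pi G \iiint_V \rho\,\mathrm{d}V$$

giving

$$\boldsymbol{\nabla}\cdot\mathbf{g} = -4\pi G\rho$$

This applies to all bodies, not just spheres, with constant density. The integral version is required if the density varies.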
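The flux result $-4\pi G m$ can be reproduced by adding up $\mathbf{g}\cdot\mathbf{n}\,\mathrm{d}S$ over small patches of the sphere. The sketch below uses a midpoint rule in the polar angle; the grid sizes, the sample mass and the rounded SI value of $G$ are assumptions made for illustration.

```python
import math

G = 6.674e-11  # gravitational constant in SI units (approximate value)

def flux_through_sphere(mass, radius, n_theta=200, n_phi=400):
    """Midpoint-rule estimate of the surface integral of g . n over a sphere."""
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    # g points inwards and n outwards, so g . n = -G m / R^2 on every patch.
    g_dot_n = -G * mass / radius**2
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        patch_area = radius**2 * math.sin(theta) * d_theta * d_phi
        # Every patch in this ring of n_phi patches contributes equally.
        total += n_phi * g_dot_n * patch_area
    return total

mass = 5.0
print(flux_through_sphere(mass, 2.0))
print(flux_through_sphere(mass, 7.0))
print(-4 * math.pi * G * mass)
```

The estimate does not depend on the radius of the sphere, because the $1/R^2$ fall-off of the field cancels the $R^2$ growth of the surface area.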

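The gravitational force is said to be conservative so a fixed amount of work is done or energy gained when moving from one location to another regardless of the path taken. This means there is also a potential energy field around the sphere $\Phi(\mathbf{r})$. This is a scalar field since energy is a scalar. Then

$$\mathbf{g}(\mathbf{r}) = -\boldsymbol{\nabla}\Phi(\mathbf{r})$$

Substituting into $\boldsymbol{\nabla}\cdot\mathbf{g} = -4\pi G\rho$ gives $-\boldsymbol{\nabla}\cdot\boldsymbol{\nabla}\Phi(\mathbf{r}) = -4\pi G\rho$, so

$$\boldsymbol{\nabla}\cdot\boldsymbol{\nabla}\Phi(\mathbf{r}) = 4\pi G\rho \quad\text{or}\quad \nabla^2\Phi(\mathbf{r}) = 4\pi G\rho$$

which is known as Poisson’s equation.

Any theory of gravity must be compatible with this equation in a region where there is no extreme mass or density, i.e. flat space-time.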


Constraints on Theories of Gravity

In addition to the constraint that any new theory of gravity must be compatible with the field theory in regions of low mass and density there are other constraints.

Any theory of gravity must also be compatible with special relativity if there are no

accelerations.

It should explain the force of gravity i.e. what causes it, how is it carried and the speed with

which it acts.

It should explain the interaction with electromagnetic radiation, i.e. why light rays are

apparently bent when they pass by massive objects.

It should explain why the ratio of gravitational mass to inertial mass is constant.

It should provide accurate results for the orbit of Mercury, and other planets etc.

The Weak and Strong Equivalence Principles

The weak equivalence principle states that within a sufficiently localised region of space-time adjacent to a concentration of mass, the motion of bodies subject to gravitational effects alone cannot be distinguished by any experiment from the motion of bodies within a region of appropriate uniform acceleration.

It is sometimes called the principle of universality of free fall.

It states that locally (e.g. within a closed vehicle) it is impossible to distinguish between gravity

and acceleration, the classic example being an object in a lift. An object held against the floor of

the lift by a force can be held there by gravity if the lift is stationary in a gravity field, or by

acceleration of the lift if the lift is not in a gravity field. A (heavy) object floating freely in a lift

could be falling with the lift in a gravity field or the lift could be in a gravity free space.

This has been tested to one part in $10^{12}$.

The strong equivalence principle states that within a sufficiently localised region of space-time adjacent to a concentration of mass, the physical behaviour of bodies cannot be distinguished by any experiment from the physical behaviour of bodies within a region of appropriate uniform acceleration.

It is sometimes called the equivalence principle.

It includes electromagnetic and nuclear forces.

A theory that violates it implies that the value of the gravitational constant changes with time – tests show that the rate of change must be less than one part in $10^{13}$ per year.

A key concept of general relativity is that gravity is linked to space-time, and there are two

important tensors, one for mass, energy and momentum, and one for space-time, the latter

being the Einstein tensor.


The Energy-Momentum Tensor

In special relativity mass, energy and momentum are related by $E^2 = p^2c^2 + m^2c^4$ which implies that gravity must be related to energy and momentum as well as mass. In Newtonian mechanics it is only related to mass. The energy-momentum tensor describes the distribution and flow of energy and momentum in a region of space-time and is the source of gravitation.

The energy-momentum tensor is a rank 2 contravariant tensor with 16 components at any event in space-time. The components are values of energy density. The tensor is symmetric so $T^{\mu\nu} = T^{\nu\mu}$.

$T^{00}$ is the local energy density.

The rest of the top row and first column, $T^{0i} = T^{i0}$ with $i \neq 0$, are the rates of flow of energy per unit area perpendicular to the $i$ direction (i.e. x, y and z) divided by $c$, or the density of the $i$ component of momentum multiplied by $c$, which gives the same result.

All the remaining components, $T^{ij} = T^{ji}$ with $i, j \neq 0$, are the rate of flow of the $i$ component of momentum per unit area perpendicular to the $j$ direction.

The components must be calculated by analysing the contents of space-time.

The simplest example is a cloud of non-interacting stationary particles (dust) observed in their

rest frame (which eliminates momentum and kinetic energy) giving

$$[T^{\mu\nu}] = \begin{bmatrix} \rho c^2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

where $\rho$ is the mass density.

This is equivalent to $T^{\mu\nu} = \rho U^\mu U^\nu$ where $U^\mu = U^\nu = [U^\mu] = (c\gamma(v), \gamma(v)\mathbf{v})$ with the spatial velocity being zero because there is no motion and $\gamma(v) = 1$ so $U^\mu = (c, 0, 0, 0)$. The term $T^{00} = \rho c^2$ is just the mass energy of the dust.

If the dust has motion then from special relativity the momentum of each particle is $m\gamma(v)\mathbf{v}$ and the energy is $m\gamma(v)c^2$.

If there are $n$ particles per unit volume the energy density is

$$T^{00} = nm\gamma(v)c^2$$

The rate of flow of energy per unit area perpendicular to the x direction (i.e. parallel to the x axis) is $nm\gamma(v)c^2 v_x$ so

$$T^{01} = T^{10} = nm\gamma(v)c\,v_x.$$

Likewise $T^{02} = T^{20} = nm\gamma(v)c\,v_y$ and $T^{03} = T^{30} = nm\gamma(v)c\,v_z$.

The rate of flow of particles in the x direction is $nv_x$ so the rate of flow of the x component of momentum $m\gamma(v)v_x$ across a unit area perpendicular to the x axis is $nv_x\, m\gamma(v)v_x$ so

$$T^{11} = nm\gamma(v)v_x^2.$$

Also the rate of flow of the y component of momentum $m\gamma(v)v_y$ across a unit area perpendicular to the x axis is $nv_x\, m\gamma(v)v_y$ so $T^{21} = T^{12} = nm\gamma(v)v_xv_y$.

Similarly the rate of flow of the z component of momentum $m\gamma(v)v_z$ across a unit area perpendicular to the x axis is $nv_x\, m\gamma(v)v_z$ so $T^{31} = T^{13} = nm\gamma(v)v_xv_z$.


There are similar expressions for the y component of momentum across a unit area perpendicular to the z axis so $T^{32} = T^{23} = nm\gamma(v)v_yv_z$.

The rate of flow of particles in the z direction is $nv_z$ so the rate of flow of the z component of momentum $m\gamma(v)v_z$ across a unit area perpendicular to the z axis is $nv_z\, m\gamma(v)v_z$ so

$$T^{33} = nm\gamma(v)v_z^2.$$

Finally the rate of flow of the y component of momentum across a unit area perpendicular to the y axis is $T^{22} = nm\gamma(v)v_y^2$.

Hence (the squares being expanded and the order of some velocities reversed to make the pattern clearer)

$$[T^{\mu\nu}] = nm\gamma(v)\begin{bmatrix} c\,c & c\,v_x & c\,v_y & c\,v_z \\ c\,v_x & v_xv_x & v_yv_x & v_zv_x \\ c\,v_y & v_xv_y & v_yv_y & v_zv_y \\ c\,v_z & v_xv_z & v_yv_z & v_zv_z \end{bmatrix}$$

This is also equivalent to

$$T^{\mu\nu} = \rho U^\mu U^\nu$$
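The pattern in the matrix for moving dust can be generated directly from the velocity components. The following is a sketch under stated assumptions: the function name, the sample number density, particle mass and velocity are all hypothetical, and SI units are used.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def dust_tensor(n, m, v):
    """T^{mu nu} for dust with n particles per unit volume, each of mass m,
    moving with velocity v = (vx, vy, vz)."""
    speed_sq = sum(component**2 for component in v)
    gamma = 1.0 / math.sqrt(1.0 - speed_sq / C**2)
    w = (C,) + tuple(v)  # (c, vx, vy, vz)
    # Each component is n * m * gamma multiplied by one pair from w.
    return [[n * m * gamma * w[mu] * w[nu] for nu in range(4)] for mu in range(4)]

tensor = dust_tensor(n=1.0e3, m=1.0e-6, v=(0.1 * C, 0.2 * C, 0.0))
for row in tensor:
    print(row)
```

The result is symmetric because each component is a product of two entries of the same list, and with zero velocity only the $T^{00}$ component survives, as in the rest-frame matrix given earlier.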

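For an ideal fluid with pressure $p$

$$T^{\mu\nu} = \left(\rho + \frac{p}{c^2}\right)U^\mu U^\nu - p\,g^{\mu\nu}$$

This can be simplified in special relativity by choosing a local inertial frame to give

$$T^{\mu\nu} = \left(\rho + \frac{p}{c^2}\right)U^\mu U^\nu - p\,\eta^{\mu\nu}$$

and in the rest frame this simplifies further to give

$$[T^{\mu\nu}] = \begin{bmatrix} \rho c^2 & 0 & 0 & 0 \\ 0 & p & 0 & 0 \\ 0 & 0 & p & 0 \\ 0 & 0 & 0 & p \end{bmatrix}$$

where $p$ is the pressure, which reduces to that for dust if $p \to 0$.

In a region with electromagnetic fields but no matter

$$T^{\mu\nu} = \frac{1}{\mu_0}\left(\sum_\sigma F^{\mu}{}_{\sigma}F^{\nu\sigma} - \frac{1}{4}\sum_{\rho,\sigma} g^{\mu\nu}F_{\rho\sigma}F^{\rho\sigma}\right)$$

where $F^{\mu\nu}$ is the electromagnetic field tensor, indicating that electromagnetic radiation can be a source of gravitation.

The covariant charge continuity equation is $\sum_{\nu=0}^{3}\frac{\partial J^\nu}{\partial e^\nu} = 0$ which means in special relativity

$$\sum_{\mu=0}^{3}\frac{\partial T^{\mu\nu}}{\partial e^\mu} = 0$$

This can be extended to a curved space-time but with no matter

$$\sum_{\mu=0}^{3}\nabla_\mu T^{\mu\nu} = 0 \quad\text{where}\quad \nabla_\mu T^{\mu\nu} \equiv \frac{\partial T^{\mu\nu}}{\partial e^\mu} + \sum_\rho \Gamma^\mu{}_{\rho\mu}T^{\rho\nu} + \sum_\rho \Gamma^\nu{}_{\rho\mu}T^{\mu\rho}$$

or the covariant divergence of $T^{\mu\nu}$ is zero.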


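The Einstein Tensor

The curvature of space-time is defined by the Einstein tensor $G^{\mu\nu}$.

This is derived from the Riemann tensor for four dimensional space-time.

$$R^\delta{}_{\alpha\beta\gamma} = \frac{\partial \Gamma^\delta{}_{\alpha\gamma}}{\partial x^\beta} - \frac{\partial \Gamma^\delta{}_{\alpha\beta}}{\partial x^\gamma} + \sum_\lambda \Gamma^\lambda{}_{\alpha\gamma}\Gamma^\delta{}_{\lambda\beta} - \sum_\lambda \Gamma^\lambda{}_{\alpha\beta}\Gamma^\delta{}_{\lambda\gamma}$$

where

$$\Gamma^\lambda{}_{\alpha\beta} = \frac{1}{2}\sum_\sigma g^{\lambda\sigma}\left(\frac{\partial g_{\sigma\beta}}{\partial x^\alpha} + \frac{\partial g_{\alpha\sigma}}{\partial x^\beta} - \frac{\partial g_{\alpha\beta}}{\partial x^\sigma}\right)$$

The Riemann tensor has 256 components ($4^4$).

Contracting the tensor over the first ($\delta$) and last ($\gamma$) indices results in the rank 2 covariant Ricci tensor $R_{\alpha\beta} = \sum_\gamma R^\gamma{}_{\alpha\beta\gamma}$.

Contracting again, this time with the contravariant metric, gives the Ricci or curvature scalar $R = \sum_{\alpha,\beta} g^{\alpha\beta}R_{\alpha\beta}$.

The Ricci scalar $R$ is the sum of sixteen terms.

The Einstein tensor is $G^{\mu\nu} = R^{\mu\nu} - \frac{1}{2}g^{\mu\nu}R$.

The tensor is symmetric and the covariant divergence is zero so $\sum_{\mu=0}^{3}\nabla_\mu G^{\mu\nu} = 0$.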

It is important to distinguish between

$G^{\mu\nu}$ the Einstein tensor,

$G$ the gravitational constant, and $g^{\mu\nu}$ the dual metric or the contravariant form of the metric $g_{\mu\nu}$,

$R^\delta{}_{\alpha\beta\gamma}$ the Riemann tensor,

$R_{\alpha\beta}$ the Ricci tensor, and

$R$ the Ricci scalar.

The covariant divergence of $G^{\mu\nu}$ is zero, and since this is also true of $T^{\mu\nu}$ the two tensors can be assumed to be proportional to each other.

In fact $G^{\mu\nu} = -\kappa T^{\mu\nu}$ where $\kappa$ is a constant of proportionality, this being the two way relationship between space-time and energy-momentum, and is known as the Einstein field equation.

$\kappa$ is the Einstein constant.

All mathematical models of all or part of the universe must satisfy this equation. However all but a few solutions have no physical reality, and there is no known solution for the universe as a

whole. There are only solutions for the universe at such a very large scale that even galaxy

clusters can be ignored, for the special case of two bodies, one much more massive than the

other (such as a star and planet), and for a black hole.


Einstein’s Field Equations

These can be written as $G_{\mu\nu} = -\kappa T_{\mu\nu}$ where $\kappa$ is Einstein’s constant, $\kappa = \frac{8\pi G}{c^4}$. Note that the covariant forms of $G^{\mu\nu}$ and $T^{\mu\nu}$ are used so both indices must be lowered.

The RHS is the source term – the flow of energy and momentum in a region of space described by the covariant energy-momentum tensor $T_{\mu\nu}$.

The LHS is the resulting geometry of space-time described by the covariant Einstein tensor $G_{\mu\nu}$.

The most common form of the field equation is $R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R = -\kappa T_{\mu\nu}$.

This represents ten different non-linear second-order partial differential equations for the independent components of $g_{\mu\nu}$ which must be solved simultaneously.

The solution process involves finding the metric tensor $g_{\mu\nu}$ that corresponds to the energy-momentum tensor $T_{\mu\nu}$. The metric tensor is the gravitational field.

This solution can be carried out for some specific energy-momentum tensors.

Specifying a metric tensor and finding the energy-momentum tensor usually results in one that

has no physical meaning.

A common approach is to partly define both the metric and the tensor and find a solution that

completes both.

Note that since $g_{\mu\nu}$ is dimensionless, $\Gamma^\lambda{}_{\mu\nu}$ has dimensions of $[L^{-1}]$ and $R^\delta{}_{\alpha\beta\gamma}$, $R_{\mu\nu}$ and $R$ have dimensions of $[L^{-2}]$, $T_{\mu\nu}$ dimensions of $[ML^{-1}T^{-2}]$, and $\kappa$ dimensions of $[M^{-1}L^{-1}T^{2}]$.

If $T_{\mu\nu} = 0$ space-time is empty – no matter or radiation, but it does not mean it is flat. Thus $G_{\mu\nu} = 0$ may not be a trivial equation. There are non-trivial solutions called vacuum field models such as Schwarzschild space-time.

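$R_{\mu\nu}$ can be found as follows.

Starting from $R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R = -\kappa T_{\mu\nu}$ and contracting with $g^{\mu\nu}$:

$$\sum_{\mu,\nu} g^{\mu\nu}R_{\mu\nu} - \frac{1}{2}\sum_{\mu,\nu} g^{\mu\nu}g_{\mu\nu}R = -\kappa\sum_{\mu,\nu} g^{\mu\nu}T_{\mu\nu}$$

Since $\sum_{\mu,\nu} g^{\mu\nu}g_{\mu\nu} = \sum_\nu \delta^\nu{}_\nu = 4$:

$$R - 2R = -\kappa\sum_\mu T^\mu{}_\mu$$

Writing $T = \sum_\mu T^\mu{}_\mu$:

$$R = \kappa T$$

Substituting back into $R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R = -\kappa T_{\mu\nu}$:

$$R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}\kappa T = -\kappa T_{\mu\nu}$$

$$R_{\mu\nu} = -\kappa\left(T_{\mu\nu} - \frac{1}{2}g_{\mu\nu}T\right)$$

so if $T_{\mu\nu} = 0$ then $T = 0$ and $R_{\mu\nu} = 0$.

$R_{\mu\nu} = 0$ does not imply $R^\delta{}_{\mu\nu\gamma} = 0$ (contraction is a one way process since information is lost) and hence $g_{\mu\nu}$ may not describe a flat space-time.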


Geodesic Equations

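The geodesic equations are

$$\frac{\mathrm{d}^2 e^\rho}{\mathrm{d}\lambda^2} + \sum_{\alpha,\beta}\Gamma^\rho{}_{\alpha\beta}\frac{\mathrm{d}e^\alpha}{\mathrm{d}\lambda}\frac{\mathrm{d}e^\beta}{\mathrm{d}\lambda} = 0$$

which define the equivalent of a straight line in a curved space, i.e. the shortest local distance between two points (there are two possible routes along a great circle on a sphere, only the shorter is the shortest distance), or a curve along which all tangents are parallel. (In pseudo-Riemannian geometry shortest is replaced by longest.)

$$\Gamma^\rho{}_{\alpha\beta} = \frac{1}{2}\sum_\sigma g^{\rho\sigma}\left(\frac{\partial g_{\sigma\beta}}{\partial e^\alpha} + \frac{\partial g_{\alpha\sigma}}{\partial e^\beta} - \frac{\partial g_{\alpha\beta}}{\partial e^\sigma}\right)$$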

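The solutions $e^\rho(\lambda)$ are the parameterised curves of the geodesics through space-time, and are important because these are the paths of free moving bodies.

The tangent vector at any point $\lambda$ is $t^\rho(\lambda) = \frac{\mathrm{d}e^\rho(\lambda)}{\mathrm{d}\lambda}$.

The norm or length of the tangent vector is given by $\sqrt{\sum_{\alpha,\beta} g_{\alpha\beta}\,t^\alpha(\lambda)\,t^\beta(\lambda)}$.

In space-time the line element is therefore $\mathrm{d}s = \sqrt{\sum_{\mu,\nu} g_{\mu\nu}\,t^\mu(\lambda)\,t^\nu(\lambda)}\;\mathrm{d}\lambda$.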

In the case of a geodesic the norm is invariant along the length of the curve.

For time-like geodesics the norm is positive.

For space-like geodesics the norm is negative (hence the need for pseudo Riemann geometry).

Additionally there are geodesics with zero norms – these are called null geodesics which are

followed by massless bodies such as photons.

The requirement of zero covariant divergence $\sum_\mu \nabla_\mu T^{\mu\nu} = 0$ leads to the world-lines of massive

particles falling freely under gravity being time-like geodesics, and the world-lines of massless

particles moving solely under gravity being null geodesics. This is the principle of geodesic

motion. Freely moving objects follow the appropriate geodesic (this is similar to Newton’s law

that a freely moving object follows a straight line). Thus the geodesic equation can be thought of

as the geometry of space-time telling matter and energy how to move.
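A two-dimensional analogue makes the geodesic equations concrete. The sketch below integrates them on a unit sphere, using the connection coefficients $\Gamma^1{}_{22} = -\sin e^1\cos e^1$ and $\Gamma^2{}_{12} = \Gamma^2{}_{21} = \cos e^1/\sin e^1$ quoted earlier for a sphere, and checks that the norm of the tangent vector stays constant along the curve. The Runge-Kutta integrator, the step size and the starting values are illustrative choices, and a sphere stands in for space-time here.

```python
import math

def derivatives(state):
    """Right-hand side of the geodesic equations on a unit sphere.

    state = (theta, phi, d_theta/d_lambda, d_phi/d_lambda).
    """
    theta, _, d_theta, d_phi = state
    # theta'' = -Gamma^1_22 * phi'^2 and phi'' = -2 * Gamma^2_12 * theta' * phi'
    dd_theta = math.sin(theta) * math.cos(theta) * d_phi**2
    dd_phi = -2.0 * (math.cos(theta) / math.sin(theta)) * d_theta * d_phi
    return (d_theta, d_phi, dd_theta, dd_phi)

def rk4_step(state, h):
    """Advance the state by one fourth-order Runge-Kutta step of size h."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))
    k1 = derivatives(state)
    k2 = derivatives(shifted(state, k1, h / 2))
    k3 = derivatives(shifted(state, k2, h / 2))
    k4 = derivatives(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def norm_squared(state):
    """g_ab t^a t^b with g = diag(1, sin^2(theta)) on the unit sphere."""
    theta, _, d_theta, d_phi = state
    return d_theta**2 + math.sin(theta)**2 * d_phi**2

def integrate(state, h=1e-3, steps=2000):
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

start = (1.0, 0.0, 0.3, 0.8)
end = integrate(start)
print(norm_squared(start), norm_squared(end))   # the two values should agree
```

Starting on the equator with a purely eastward tangent, the same integrator stays on the equator, which is a great circle.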

An important consequence of curved space-time is that the time and space coordinates do not

measure proper time or space defined by what a local observer would measure, but instead

measure coordinate time and space, and it is important to distinguish between the two. Hence

conversion formulae are required – in special relativity there is a factor which is dependent on

the difference in velocity but in general relativity it depends on the curvature of space-time.

Additionally a remote observer measures a different proper time and space to the local

observer. In general relativity coordinates do not have immediate metrical significance (an

analogy is that house numbers do not measure distance although the house numbers increase

with distance from one end of the road, and there is no conversion factor from house numbers

into metres).

Note that special relativity still applies locally to the events. An observer local to and moving

with an object measures proper time. An observer local to but moving at a fixed velocity relative

to an object must apply the special relativity corrections to measurements of time and distance

etc. However a remote observer must apply a correction for space time curvature and also for

time delay.


The Newtonian Limit

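In a region of nearly flat space-time Einstein’s field equations $R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R = -\kappa T_{\mu\nu}$ must approximate to Poisson’s equation $\nabla^2\Phi(\mathbf{r}) = 4\pi G\rho$.

Starting from $R_{\mu\nu} = -\kappa\left(T_{\mu\nu} - \frac{1}{2}g_{\mu\nu}T\right)$ and applying it to dust, where $T_{\mu\nu} = \rho U_\mu U_\nu$ and $T = \rho c^2$, gives $R_{\mu\nu} = -\kappa\left(\rho U_\mu U_\nu - \frac{1}{2}g_{\mu\nu}\rho c^2\right)$.

At low speeds $U_0 = \gamma c \cong c$ since $\gamma \cong 1$, and assuming space-time is only weakly curved $g_{\mu\nu}$ can be replaced by $\eta_{\mu\nu} + h_{\mu\nu}$ where $|h_{\mu\nu}| \ll 1$ and is not a function of time. The 00 component is then

$$R_{00} = -\kappa\left(\rho c^2 - \frac{1}{2}g_{00}\rho c^2\right) = -\kappa\left(\rho c^2 - \frac{1}{2}(\eta_{00} + h_{00})\rho c^2\right)$$

Since $\eta_{00} = 1$ and $|h_{\mu\nu}| \ll 1$ then $R_{00} \cong -\kappa\left(\rho c^2 - \frac{1}{2}\rho c^2\right) \cong -\frac{\kappa}{2}\rho c^2$.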

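However working from the Ricci tensor $R_{00} \cong -\sum_{i=1}^{3}\frac{\partial \Gamma^i{}_{00}}{\partial x^i}$ where $\Gamma^i{}_{00} \cong -\frac{1}{2}\sum_j \eta^{ij}\frac{\partial h_{00}}{\partial x^j}$.

This means $R_{00} \cong \frac{1}{2}\sum_{i,j}\eta^{ij}\frac{\partial^2 h_{00}}{\partial x^i \partial x^j} = -\frac{1}{2}\nabla^2 h_{00}$.

Combining both expressions for $R_{00}$ gives $-\frac{1}{2}\nabla^2 h_{00} \cong -\frac{\kappa}{2}\rho c^2$ or $\nabla^2 h_{00} \cong \kappa\rho c^2$.

It can be shown that $h_{00} = \frac{2\Phi}{c^2}$ where $\Phi$ is the Newtonian gravitational potential so $\frac{2}{c^2}\nabla^2\Phi \cong \kappa\rho c^2$ or $\nabla^2\Phi \cong \frac{\kappa}{2}\rho c^4$.

Since Poisson’s equation from the gravitational field theory states $\nabla^2\Phi = 4\pi G\rho$ then $\frac{\kappa}{2}\rho c^4 = 4\pi G\rho$ or $\kappa = \frac{8\pi G}{c^4}$ for general relativity to be compatible with Newton’s law, which is how the value of $\kappa$ is obtained.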
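To give a feel for the size of $\kappa$, the short sketch below evaluates $8\pi G/c^4$ in SI units and confirms that it satisfies $\frac{\kappa}{2}c^4 = 4\pi G$. The rounded value of $G$ and the function name are assumptions made for illustration.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (approximate)
C = 299_792_458.0    # speed of light, m/s

def einstein_constant(g=G, c=C):
    """kappa = 8 pi G / c^4, in s^2 m^-1 kg^-1."""
    return 8 * math.pi * g / c**4

kappa = einstein_constant()
print(kappa)                                # roughly 2e-43 in SI units
print(kappa * C**4 / 2, 4 * math.pi * G)    # both sides of the matching condition
```

The extremely small value shows how much energy density is needed to produce a noticeable curvature of space-time.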


The Cosmological Constant

When Einstein applied the field equation to cosmology the resulting model was a collapsing universe. The cosmological constant was introduced to the geometry to balance this and give a static (neither expanding nor contracting) universe. The revised equation was

$$R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R + \Lambda g_{\mu\nu} = -\kappa T_{\mu\nu}$$

Although an appropriate value of Λ results in a static universe, that universe is not in stable

equilibrium – a small change leads to larger changes so it is unstable.

Today the universe is believed to be expanding at an accelerating rate, the result of dark energy.

This can be expressed in the above equation by a positive possibly increasing value of Λ.

Since Λ is now believed to be a form of energy rather than a property of geometry, it is normally

transferred to the other side of the equation to give

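$$R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R = -\kappa\left(T_{\mu\nu} + \frac{\Lambda}{\kappa}g_{\mu\nu}\right)$$

or

$$R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R = -\kappa\left(T_{\mu\nu} + \bar{T}_{\mu\nu}\right) \quad\text{where}\quad \bar{T}_{\mu\nu} = \frac{\Lambda}{\kappa}g_{\mu\nu}$$

If it is assumed that dark energy is an ideal fluid of density $\rho_\Lambda$ and pressure $p_\Lambda$ then

$$\bar{T}^{\mu\nu} = \left(\rho_\Lambda + \frac{p_\Lambda}{c^2}\right)U^\mu U^\nu - p_\Lambda g^{\mu\nu}$$

or

$$\bar{T}_{\mu\nu} = \left(\rho_\Lambda + \frac{p_\Lambda}{c^2}\right)U_\mu U_\nu - p_\Lambda g_{\mu\nu}$$

Since $\bar{T}_{\mu\nu} = \frac{\Lambda}{\kappa}g_{\mu\nu}$ then $p_\Lambda = -\frac{\Lambda}{\kappa}$ and $\rho_\Lambda = -\frac{p_\Lambda}{c^2} = \frac{\Lambda}{\kappa c^2}$ and the field equation is written as

$$R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R = -\kappa\left(T_{\mu\nu} + \rho_\Lambda c^2 g_{\mu\nu}\right)$$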

Note that density and pressure have opposite signs, but a positive density results in expansion

and in an expanding universe the pressure of dark energy must be negative.

The density of the dark energy is assumed to be constant with respect to time and the expansion of the universe. The current value is $7 \times 10^{-30}$ g cm$^{-3}$, but since this value applies to every cubic centimetre in the universe it totals to 69.11% of the total energy. If constant with respect to time then it must be a property of the universe or of the vacuum that comprises most of the universe. An obvious explanation is that it is the vacuum energy of quantum mechanics, but that is $10^{120}$ times larger.
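The relations $\rho_\Lambda = \frac{\Lambda}{\kappa c^2}$ and $p_\Lambda = -\rho_\Lambda c^2$ can be turned round to estimate $\Lambda$ and the pressure from the quoted density. This is a rough sketch only: it takes the density figure of $7 \times 10^{-30}$ g cm$^{-3}$ from the text, converts it to SI units, and uses a rounded value of $G$; the function names are illustrative.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (approximate)
C = 299_792_458.0    # speed of light, m/s
KAPPA = 8 * math.pi * G / C**4

def cosmological_constant(rho_lambda):
    """Lambda = kappa * c^2 * rho_Lambda, in m^-2, for a density in kg m^-3."""
    return KAPPA * C**2 * rho_lambda

def dark_energy_pressure(rho_lambda):
    """p_Lambda = -rho_Lambda * c^2, in pascals."""
    return -rho_lambda * C**2

rho_lambda = 7e-30 * 1000.0   # 1 g cm^-3 equals 1000 kg m^-3
print(cosmological_constant(rho_lambda))   # of order 1e-52 m^-2
print(dark_energy_pressure(rho_lambda))    # negative, of order 1e-10 Pa
```

A positive density gives a negative pressure, in line with the opposite signs noted above.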

There are many other explanations including the concept of a Quintessence scalar field, and

some of these imply the value changes with time, or even that dark energy does not exist but

results from a deficiency in relativity.


The Schwarzschild Metric

The first solution of the Einstein field equations was found by Schwarzschild. He imagined a universe consisting of a single massive non-rotating spherical object of mass $M$ and radius $R_M$ in a vacuum with no electromagnetic radiation and no electrical charge. Although this is a very artificial example, it is important because it results in a constant of proportionality between mass and a length called the Schwarzschild radius $R_s$, which amongst other things limits the size of the observable universe to 13.7 billion light years.

Using spherical coordinates with the origin at the centre of the sphere (with the reference x, y and z orthogonal axes being chosen in an arbitrary direction), a geodesic element (squared to remove the square root sign) is given by

$$(ds)^2 = g_{\mu\nu}\, de^{\mu}\, de^{\nu} = f^{2A}(c\,dt)^2 - f^{2B}(dr)^2 - r^2(d\theta)^2 - r^2 \sin^2\theta\, (d\phi)^2$$

where $\theta$ is the angle between $r$ and the z axis, $\phi$ is the angle that the component of $r$ in the xy plane makes with the x axis, and $f^{2A}$ and $f^{2B}$ are unknown functions of $r$ (the power of 2 ensuring they are positive, and $A$ and $B$ being numeric values).

The fact that these are functions of $r$ only, together with the lack of $dx^i dx^j$ terms, means that the universe is spherically symmetric about the centre of the mass. The lack of a time function and of $dx^i dt$ terms means there is no change with time (it is stationary), and since $t$ can be replaced by $-t$ it is static and not rotating.

The Schwarzschild metric is therefore

$$g_{\mu\nu} = \begin{pmatrix} f^{2A} & 0 & 0 & 0 \\ 0 & -f^{2B} & 0 & 0 \\ 0 & 0 & -r^2 & 0 \\ 0 & 0 & 0 & -r^2 \sin^2\theta \end{pmatrix}$$

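The connection coefficients are given by

$$\Gamma^{i}{}_{jk} = \frac{1}{2} \sum_{l} g^{il} \left( \frac{\partial g_{lk}}{\partial x^{j}} + \frac{\partial g_{jl}}{\partial x^{k}} - \frac{\partial g_{jk}}{\partial x^{l}} \right)$$

which has 4*4*4 or 64 components, with the non-zero components being

$$\Gamma^{0}{}_{01} = \Gamma^{0}{}_{10} = A', \qquad A' = \frac{dA}{dr}$$

$$\Gamma^{1}{}_{00} = A' f^{2(A-B)}$$

$$\Gamma^{1}{}_{11} = B', \qquad B' = \frac{dB}{dr}$$

$$\Gamma^{1}{}_{22} = -r f^{-2B}$$

$$\Gamma^{1}{}_{33} = -r f^{-2B} \sin^2\theta$$

$$\Gamma^{2}{}_{12} = \Gamma^{2}{}_{21} = \frac{1}{r}$$

$$\Gamma^{2}{}_{33} = -\sin\theta \cos\theta$$

$$\Gamma^{3}{}_{13} = \Gamma^{3}{}_{31} = \frac{1}{r}$$

$$\Gamma^{3}{}_{23} = \Gamma^{3}{}_{32} = \cot\theta$$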


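The Riemann tensor is given by

$$R^{\delta}{}_{\alpha\beta\gamma} = \frac{\partial \Gamma^{\delta}{}_{\alpha\gamma}}{\partial e^{\beta}} - \frac{\partial \Gamma^{\delta}{}_{\alpha\beta}}{\partial e^{\gamma}} + \sum_{\lambda} \Gamma^{\lambda}{}_{\alpha\gamma} \Gamma^{\delta}{}_{\lambda\beta} - \sum_{\lambda} \Gamma^{\lambda}{}_{\alpha\beta} \Gamma^{\delta}{}_{\lambda\gamma}$$

which has 4*4*4*4 or 256 components, with the six non-zero independent components being

$$R^{0}{}_{101} = A'B' - A'' - A'^2, \qquad A'' = \frac{d^2A}{dr^2}$$

$$R^{0}{}_{202} = -r f^{-2B} A'$$

$$R^{0}{}_{303} = -r f^{-2B} A' \sin^2\theta$$

$$R^{1}{}_{212} = r f^{-2B} B'$$

$$R^{1}{}_{313} = r f^{-2B} B' \sin^2\theta$$

$$R^{2}{}_{323} = \left(1 - f^{-2B}\right) \sin^2\theta$$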

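The Ricci tensor is given by $R_{\alpha\beta} = \sum_{\gamma} R^{\gamma}{}_{\alpha\beta\gamma}$, which in this case is a 4*4 diagonal matrix

$$R_{00} = -f^{2(A-B)} \left( A'' + A'^2 - A'B' + \frac{2A'}{r} \right)$$

$$R_{11} = A'' + A'^2 - A'B' - \frac{2B'}{r}$$

$$R_{22} = f^{-2B}\left(1 + r(A' - B')\right) - 1$$

$$R_{33} = \sin^2\theta \left( f^{-2B}\left(1 + r(A' - B')\right) - 1 \right)$$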

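The curvature scalar is $R = \sum_{\alpha,\beta} g^{\alpha\beta} R_{\alpha\beta}$, so

$$R = g^{00}R_{00} + g^{11}R_{11} + g^{22}R_{22} + g^{33}R_{33}$$

$$= -2 f^{-2B} \left( A'' + A'^2 - A'B' + \frac{2}{r}(A' - B') + \frac{1}{r^2} \right) + \frac{2}{r^2}$$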

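The Einstein tensor $G_{\mu\nu} = R_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} R$ is a 4*4 diagonal matrix with values

$$G_{00} = -\frac{2 f^{2(A-B)}}{r} B' + \frac{f^{2(A-B)}}{r^2} - \frac{f^{2A}}{r^2}$$

$$G_{11} = -\frac{2A'}{r} + \frac{f^{2B}}{r^2} - \frac{1}{r^2}$$

$$G_{22} = -r^2 f^{-2B} \left( A'' + A'^2 + \frac{A' - B'}{r} - A'B' \right)$$

$$G_{33} = -r^2 f^{-2B} \sin^2\theta \left( A'' + A'^2 + \frac{A' - B'}{r} - A'B' \right)$$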

Outside the mass there is a vacuum so G𝜇𝜈 = 0 for all 𝜇𝜈

To make further progress, multiply $G_{00}$ by $f^{-2A}$ and $G_{11}$ by $f^{-2B}$:

$$f^{-2A} G_{00} = -\frac{2 f^{-2B}}{r} B' + \frac{f^{-2B}}{r^2} - \frac{1}{r^2}$$

$$f^{-2B} G_{11} = -\frac{2 f^{-2B}}{r} A' + \frac{1}{r^2} - \frac{f^{-2B}}{r^2}$$


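Adding these

$$f^{-2A} G_{00} + f^{-2B} G_{11} = -\frac{2 f^{-2B}}{r} B' + \frac{f^{-2B}}{r^2} - \frac{1}{r^2} - \frac{2 f^{-2B}}{r} A' + \frac{1}{r^2} - \frac{f^{-2B}}{r^2}$$

$$= -\frac{2 f^{-2B}}{r} B' - \frac{2 f^{-2B}}{r} A'$$

$$= -\frac{2 f^{-2B}}{r} \left( A' + B' \right)$$

All four components must be zero for a vacuum solution, so either $-\frac{2 f^{-2B}}{r} = 0$ or $A' + B' = 0$. The former is a non-physical solution since it requires either $r = \infty$ or $f^{-2B} = 0$, so for all $r$

$$A' + B' = 0$$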

Integrating this and ignoring constants means $A(r) = -B(r)$, and substituting back into $G_{00} = 0$ results in

$$\frac{1}{r^2} \frac{d\left( r \left(1 - f^{-2B}\right) \right)}{dr} = 0$$

This is only possible if $\frac{d\left( r (1 - f^{-2B}) \right)}{dr} = 0$, which when integrated gives

$$r \left(1 - f^{-2B}\right) = R_s$$

where $R_s$ is a constant of integration called the Schwarzschild radius.

This gives $f^{-2B} = 1 - \frac{R_s}{r}$, and so $f^{2A} = f^{-2B} = 1 - \frac{R_s}{r}$ and $f^{2B} = \dfrac{1}{1 - \frac{R_s}{r}}$.

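The last step is to find the value of $R_s$.

The gravitational potential at a large distance $r$ from the mass is $\Phi = -\frac{GM}{r}$, and at such distances $g_{00}$ is given by

$$g_{00} = 1 + \frac{2\Phi}{c^2} = 1 - \frac{2GM}{rc^2}$$

where $G$ is the gravitational constant from the Newtonian force $\frac{G m_1 m_2}{r^2}$.

But $g_{00} = f^{2A} = 1 - \frac{R_s}{r}$, so $R_s = \frac{2GM}{c^2}$.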

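The Schwarzschild metric is therefore

$$g_{\mu\nu} = \begin{pmatrix} f^{2A} & 0 & 0 & 0 \\ 0 & -f^{2B} & 0 & 0 \\ 0 & 0 & -r^2 & 0 \\ 0 & 0 & 0 & -r^2 \sin^2\theta \end{pmatrix} = \begin{pmatrix} 1 - \frac{2GM}{rc^2} & 0 & 0 & 0 \\ 0 & -\dfrac{1}{1 - \frac{2GM}{rc^2}} & 0 & 0 \\ 0 & 0 & -r^2 & 0 \\ 0 & 0 & 0 & -r^2 \sin^2\theta \end{pmatrix}$$

which is often written as

$$(ds)^2 = g_{\mu\nu}\, de^{\mu}\, de^{\nu} = \left(1 - \frac{2GM}{rc^2}\right)(c\,dt)^2 - \frac{1}{1 - \frac{2GM}{rc^2}}(dr)^2 - r^2 (d\theta)^2 - r^2 \sin^2\theta\, (d\phi)^2$$

and $g_{00} = 1 - \frac{2GM}{rc^2}$, $g_{rr} = -\dfrac{1}{1 - \frac{2GM}{rc^2}}$, $g_{\theta\theta} = -r^2$ and $g_{\phi\phi} = -r^2 \sin^2\theta$.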

The Schwarzschild radius $R_s = \frac{2GM}{c^2}$ occurs very frequently in the discussion of black holes as it is the location of the event horizon. Neither matter nor radiation can escape from inside the event horizon. The gravitational constant $G$ occurs frequently in cosmology, but its value is not known to a very high precision and the official value has been subject to change. The currently accepted value is $6.674 \times 10^{-11}$ N m$^2$ kg$^{-2}$ (equivalently m$^3$ kg$^{-1}$ s$^{-2}$). Using this gives $\kappa = 2.071 \times 10^{-43}$ s$^2$ m$^{-1}$ kg$^{-1}$, and $R_s = 1.485 \times 10^{-27} M$ metres where $M$ is in kg.


If $t$ and $r$ are fixed, $dt$ and $dr$ are both zero and

$$(ds)^2 = -R^2 (d\theta)^2 - R^2 \sin^2\theta\, (d\phi)^2$$

where $R$ is the fixed radius. The presence of $\sin^2\theta$ appears to indicate that there is something special about $\theta$; in fact it is simply measured from the z axis, which is arbitrarily chosen (this would not be true if the mass was rotating).

The factor $1 - \frac{2GM}{rc^2}$ approaches 1 as $r$ increases, since $\frac{2GM}{rc^2}$ approaches 0, so in the limit it is equal to 1 and then

$$g_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -r^2 & 0 \\ 0 & 0 & 0 & -r^2 \sin^2\theta \end{pmatrix}$$

which is the Minkowski metric for flat space in spherical coordinates.

The Schwarzschild metric is said to be asymptotically flat: space is curved near the mass but becomes increasingly flat as the distance increases.

The factor $1 - \frac{R_s}{r}$ approaches 0 as $r$ approaches $R_s$, so $g_{00}$ approaches 0 and $g_{rr}$ approaches infinity, indicating a singularity. However this singularity is due purely to the coordinate system and does not have a physical reality. It is a coordinate singularity. There is a physical singularity at $r = 0$, but the Schwarzschild metric and space-time are only valid outside of the mass, i.e. $r > R_M$.

There is no 𝑡 term in the metric (apart from d𝑡) so the metric is stationary (it does not change with time). The fact that it does not include cross terms (d𝑟 d𝑡, d𝜃 d𝑡, d𝜙 d𝑡) also means it is

static (not rotating).

There is no problem with this model as long as the Schwarzschild radius is smaller than the

radius of the mass. For example in the case of the Earth it is about 9mm and for the sun about

3km. If the Schwarzschild radius is greater than the radius of the mass, the space-time within

the Schwarzschild radius has unusual properties – today this is called a black hole. The

Schwarzschild radius is then also called the event horizon.
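The figures quoted for the Earth and the Sun can be checked with a few lines of code. This is a minimal sketch using $R_s = 2GM/c^2$; the value of $c$ and the function name are assumptions made for illustration, and the masses are the ones quoted elsewhere in these notes.

```python
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s (assumed standard value)

def schwarzschild_radius(mass_kg):
    """Schwarzschild radius R_s = 2GM/c^2 in metres."""
    return 2.0 * G * mass_kg / C**2

M_EARTH = 5.9736e24   # kg
M_SUN = 1.99e30       # kg
M_UNIVERSE = 1.5e53   # kg, estimate for the observable universe

for name, mass in [("Earth", M_EARTH), ("Sun", M_SUN), ("Universe", M_UNIVERSE)]:
    print(f"{name:8s} R_s = {schwarzschild_radius(mass):.3e} m")
```

The Earth comes out at just under 9 mm and the Sun at just under 3 km, both far smaller than the bodies themselves, so neither is a black hole.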

The mass of the observable universe is estimated at $1.5 \times 10^{53}$ kg and so its Schwarzschild radius is $2.2 \times 10^{26}$ m while its radius is $4.4 \times 10^{26}$ m, so there is a factor of two difference, which is more than the uncertainty in the values. It is sometimes stated that there is an event horizon at 13.8 giga light years, but this is the radius of a special relativity light-cone whose apex is at the origin of the universe 13.8 giga years ago. Light from outside this radius cannot reach us because it would take longer than the age of the universe, not because of a mass. Additionally we are within the radius, whereas an event horizon is where light cannot escape from inside the event horizon to the outside but light can enter from the outside.

Assuming the Schwarzschild radius is smaller than the radius of the mass and remains so, any

change in the radius does not affect the Schwarzschild metric which means that gravitational

waves cannot be created.


Frames and Coordinates in General Relativity

There are important differences between frames and coordinates in special and general relativity.

In special relativity an observer can be anywhere within the frame but always observes events

when and where they happen.

In general relativity an observer can be anywhere within the frame but sees events when the

information has reached them (travelling at the speed of light). Thus to observe an event when

it happens the observer must be local to the event, and in general local frames are used.

Secondly in special relativity the t coordinate represents time and the r coordinate distance –

these may need to be multiplied by a factor to get proper time and distance, but the factor

depends only on the relative velocity. There is no such distinct relationship in general relativity.

In general relativity coordinates do not have immediate metrical significance, and care must be

taken when calculating time and distance. Although coordinate time must be calculated, the

coordinate time duration between two events is the same for all observers. On the other hand

proper time differs between observers.

Three frames of reference are commonly used.

The first is a freely falling observer – this is locally inertial and so special relativity applies. A

freely falling observer can observe a local freely falling body and make proper measurements.

An example is an observation made in a satellite orbiting the Earth of the duration or length in

an experiment within the satellite.

The second is a rest frame that is permanently located at a specific location. This frame must be

subject to a force that prevents it falling freely and so is not inertial. An example is an

observation made in a laboratory on Earth. A freely falling frame will appear to be subject to a

fictitious gravitational force, e.g. one which attracts a satellite and forces it to orbit the Earth.

The third is a distant frame remote from any mass and so is in flat space-time. Such a frame is

also freely falling but not moving.

Cosmology uses yet another frame of reference called the fundamental coordinate system

whose orientation and origin are arbitrary but which extends throughout the universe, and

whose fundamental observers measure cosmological time and co-moving distances.


Schwarzschild Time

Using the example of Schwarzschild space-time, assume two emission events occur at the same location $r$ from the origin, but separated by a coordinate time $dt$. In special relativity this would be the time measured by an inertial observer in the local rest frame and would be called the proper time and given the symbol $d\tau$. This term is confusing when used in general relativity.

In Schwarzschild space-time the separation (squared) of the two events is given by

$$(ds)^2 = \left(1 - \frac{2GM}{rc^2}\right)(c\,dt)^2 - \frac{1}{1 - \frac{2GM}{rc^2}}(dr)^2 - r^2 (d\theta)^2 - r^2 \sin^2\theta\, (d\phi)^2$$

but since the two locations are identical $dr = d\theta = d\phi = 0$ and this simplifies to

$$(ds)^2 = \left(1 - \frac{2GM}{rc^2}\right)(c\,dt)^2$$

A local observer located at the events, i.e. at location $r_{lo}$, would measure the proper time as

$$d\tau = \frac{ds}{c} = \sqrt{1 - \frac{2GM}{r_{lo}c^2}}\; dt$$

where the first equality comes from special relativity. This is less than $dt$, although in special relativity they would be the same.

For another observer on the same radial vector (so $\theta$ and $\phi$ are the same as the values for the events) at a radius of $r_{ob}$, the coordinate time difference between the two events will still be $dt$, but the separation will be

$$(ds_{ob})^2 = \left(1 - \frac{2GM}{r_{ob}c^2}\right)(c\,dt)^2$$

and so the measured time is

$$d\tau_{ob} = \frac{ds_{ob}}{c} = \sqrt{1 - \frac{2GM}{r_{ob}c^2}}\; dt$$

which is larger than $d\tau$ but smaller than $dt$.

However as $r_{ob}$ tends to infinity $\sqrt{1 - \frac{2GM}{r_{ob}c^2}}$ tends to 1 and $d\tau_{\infty} = dt$ in the limit.

Thus an observer at infinity measures the coordinate time difference between the events.

Also

$$d\tau_{\infty} = \frac{d\tau}{\sqrt{1 - \frac{2GM}{r_{lo}c^2}}}$$

so the measured time at infinity is greater than the proper time measured locally.

If the two events correspond to the ticks of a clock at radius $r$ then a distant observer (at infinity) will find that the clock is running slow by a factor of $\dfrac{1}{\sqrt{1 - \frac{2GM}{rc^2}}}$ compared to a clock at infinity.

If the clock is moved closer to the mass it will appear to run even slower until the limit $r = R_s = \frac{2GM}{c^2}$ is reached, when the clock will appear to stop since

$$\sqrt{1 - \frac{2GM}{rc^2}} = \sqrt{1 - \frac{c^2}{2GM} \cdot \frac{2GM}{c^2}} = 0$$

This is known as gravitational time dilation. It is different to the time dilation of special

relativity which is due to differences in velocity.

The time intervals measured by a clock at infinity are the same as coordinate time and are also

the same as those in cosmological time. Clocks in gravitational fields run slow.
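The size of the effect can be illustrated with a short sketch of the factor $d\tau/dt = \sqrt{1 - 2GM/(rc^2)}$ for a clock at rest at coordinate radius $r$. The function name and the value of $c$ are assumptions for illustration; the Sun's mass and the astronomical unit are the values quoted elsewhere in these notes.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s (assumed standard value)

def clock_rate(mass_kg, r_m):
    """d(tau)/dt for a clock at rest at coordinate radius r outside mass M."""
    rs = 2.0 * G * mass_kg / C**2
    if r_m <= rs:
        raise ValueError("r must lie outside the Schwarzschild radius")
    return math.sqrt(1.0 - rs / r_m)

M_SUN = 1.99e30   # kg
AU = 1.496e11     # m, Earth's distance from the Sun

# Fraction by which a clock at the Earth's orbit lags coordinate time,
# considering only the Sun's mass.
LAG_AT_EARTH_ORBIT = 1.0 - clock_rate(M_SUN, AU)
print(f"fractional lag at 1 AU: {LAG_AT_EARTH_ORBIT:.3e}")
```

The lag at the Earth's distance from the Sun is about one part in $10^8$, and the rate falls towards zero as $r$ approaches $R_s$.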


At the Earth's distance from the Sun the difference between $d\tau$ (measured time) and $dt$ (coordinate/cosmological time) is about 1 part in $10^{8}$ due to the Sun's mass.

A correction for time dilation is important in the calculation of position from global positioning satellites, which orbit at a height of 20,200 km. The ratio between a time interval on a clock at a height $h$ ($\Delta t_{R+h}$) and the value measured at the Earth's surface ($\Delta t_R$), assuming a radius $R$ for the Earth and due only to the Earth's mass, is given by

$$\frac{\Delta t_{R+h}}{\Delta t_R} = \frac{\sqrt{1 - \frac{2GM}{Rc^2}}}{\sqrt{1 - \frac{2GM}{(R+h)c^2}}}$$

where $M$ is the mass of the Earth. Using the approximation $\frac{\sqrt{1-y}}{\sqrt{1-x}} \cong 1 + \frac{x}{2} - \frac{y}{2}$ for $x, y \ll 1$,

$$\Delta t_{R+h} \cong \left(1 + \frac{GM}{(R+h)c^2} - \frac{GM}{Rc^2}\right) \Delta t_R \cong \Delta t_R - \frac{GMh}{R(R+h)c^2} \Delta t_R$$

Substituting $R$ = 6371 km and $M = 5.9736 \times 10^{24}$ kg gives -45.7 µs per day. Thus the satellite clock runs fast compared to a clock on the surface of the Earth (it is in a weaker gravitational field).

However the satellite is moving with respect to the surface of the Earth and so a special relativity correction is also required:

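$$\Delta t_{R+h} = \frac{1}{\sqrt{1 - \left(\frac{v}{c}\right)^2}} \Delta t_R = \frac{1}{\sqrt{1 - \frac{GM}{(R+h)c^2}}} \Delta t_R, \qquad v = \sqrt{\frac{GM}{R+h}}$$

$$\cong \Delta t_R + \frac{GM}{2(R+h)c^2} \Delta t_R$$

using $\frac{\sqrt{1-y}}{\sqrt{1-x}} \cong 1 + \frac{x}{2} - \frac{y}{2}$ with $x \ll 1$ and $y = 0$.

Substituting the above values gives 7.2 µs per day; this time the satellite clock appears to run slow.

The difference between the two is -38.5 µs per day and multiplying this by $c$ gives a distance error of 11.5 km per day.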
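The arithmetic behind these figures can be reproduced with a short sketch. It uses the two first-order fractions $\frac{GMh}{R(R+h)c^2}$ and $\frac{GM}{2(R+h)c^2}$ with the values given above; the value of $c$ and the function names are assumptions for illustration.

```python
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s (assumed standard value)
M_EARTH = 5.9736e24  # kg
R_EARTH = 6.371e6    # m
H_GPS = 2.02e7       # m, orbital height of 20,200 km
SECONDS_PER_DAY = 86400.0

def gravitational_fraction(h_m):
    """First-order general relativity fraction GMh / (R (R+h) c^2)."""
    return G * M_EARTH * h_m / (R_EARTH * (R_EARTH + h_m) * C**2)

def velocity_fraction(h_m):
    """First-order special relativity fraction GM / (2 (R+h) c^2) for a circular orbit."""
    return G * M_EARTH / (2.0 * (R_EARTH + h_m) * C**2)

GR_US_PER_DAY = gravitational_fraction(H_GPS) * SECONDS_PER_DAY * 1e6
SR_US_PER_DAY = velocity_fraction(H_GPS) * SECONDS_PER_DAY * 1e6
NET_US_PER_DAY = GR_US_PER_DAY - SR_US_PER_DAY
RANGE_ERROR_KM = NET_US_PER_DAY * 1e-6 * C / 1000.0

print(f"general relativity: {GR_US_PER_DAY:.1f} us/day")
print(f"special relativity: {SR_US_PER_DAY:.1f} us/day")
print(f"net: {NET_US_PER_DAY:.1f} us/day, about {RANGE_ERROR_KM:.1f} km/day")
```

The magnitudes are computed here as positive numbers; the signs quoted in the text indicate which clock runs fast.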

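The ratio of time on a space station at a height of 370 km ($\Delta t_S$) and on the Earth ($\Delta t_E$) is

$$\frac{\Delta t_S}{\Delta t_E} = \frac{\sqrt{1 - \frac{2GM}{R_E c^2}}}{\sqrt{1 - \frac{2GM}{R_S c^2}}} \cong 1 + \frac{GM}{R_S c^2} - \frac{GM}{R_E c^2}$$

using $\frac{\sqrt{1-y}}{\sqrt{1-x}} \cong 1 + \frac{x}{2} - \frac{y}{2}$ for $x, y \ll 1$, where $R_E$ is the radius of the Earth and $R_S$ is the orbital radius of the station (not the Schwarzschild radius). Hence

$$\Delta t_S - \Delta t_E = \left(\frac{GM}{R_S c^2} - \frac{GM}{R_E c^2}\right) \Delta t_E = \frac{GM}{c^2} \left(\frac{1}{R_S} - \frac{1}{R_E}\right) \Delta t_E$$

so substituting the above values $\Delta t_S - \Delta t_E = -3.858 \times 10^{-11} \Delta t_E$ or -1.2 ms per year. The special relativity contribution is

$$\Delta t_S - \Delta t_E = \frac{GM}{2(R+h)c^2} \Delta t_E = 3.2837 \times 10^{-10} \Delta t_E$$

or 10.2 ms per year, so this more than cancels out the general relativity contribution, leaving 9 ms per year.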

The two effects (general and special corrections) cancel out when

$$\frac{GMh}{R(R+h)c^2} = \frac{GM}{2(R+h)c^2}$$

or $\frac{h}{R} = \frac{1}{2}$, or $h = \frac{R}{2} = 3.18 \times 10^{6}$ m for the Earth.
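The cancellation height can also be found numerically, which is a useful check on the algebra. The sketch below applies bisection to the difference of the two first-order fractions; the value of $c$, the bracket limits and the function names are assumptions for illustration.

```python
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s (assumed standard value)
M_EARTH = 5.9736e24  # kg
R_EARTH = 6.371e6    # m

def net_fraction(h_m):
    """General relativity fraction minus special relativity fraction at height h."""
    gr = G * M_EARTH * h_m / (R_EARTH * (R_EARTH + h_m) * C**2)
    sr = G * M_EARTH / (2.0 * (R_EARTH + h_m) * C**2)
    return gr - sr

def cancellation_height(lo=0.0, hi=1.0e8, tol=1.0e-3):
    """Bisection for the height where the two corrections cancel.

    net_fraction is negative at lo (velocity effect dominates) and
    positive at hi (gravitational effect dominates).
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if net_fraction(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

H_CANCEL = cancellation_height()
print(f"corrections cancel at h = {H_CANCEL:.4e} m (R/2 = {R_EARTH / 2:.4e} m)")
```

Below this height the velocity effect dominates, as for the space station, and above it the gravitational effect dominates, as for the GPS satellites.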


Gravitational Redshift

Gravitational time dilation means that photons emitted in a gravitational field with a time period of $\Delta\tau$ between wave peaks will have a longer time period $\Delta t$ when measured in zero gravity,

$$\Delta t = \frac{1}{\sqrt{1 - \frac{2GM}{rc^2}}} \Delta\tau$$

This is important in astronomy since it means that the spectral lines in the radiation emitted by stars are redshifted due to gravity when measured on Earth, ignoring any Doppler effect. Since the observed period is longer, the observed frequency is lower and the observed wavelength is longer:

$$f_{ob} = \sqrt{1 - \frac{2GM}{Rc^2}}\; f_{em} \qquad \text{or} \qquad \lambda_{ob} = \frac{1}{\sqrt{1 - \frac{2GM}{Rc^2}}} \lambda_{em}$$

where $M$ is the mass of the star and $R$ is its radius.

This is in addition to any special relativity correction for any Doppler shift due to the star's motion towards or away from the Earth. (The gravity at the surface of the Earth is assumed to be zero, i.e. very small compared to that of a star.)
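To get a feel for the size of the shift, the sketch below evaluates $z = \lambda_{ob}/\lambda_{em} - 1 = 1/\sqrt{1 - 2GM/(Rc^2)} - 1$ for light leaving the surface of a star and observed far away. The solar radius, the value of $c$ and the function name are assumptions for illustration; the white dwarf is taken, as a rough model only, to have the Sun's mass and the Earth's radius.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s (assumed standard value)

def gravitational_redshift(mass_kg, radius_m):
    """z = lambda_ob / lambda_em - 1 for emission at radius R, observed at infinity."""
    rs = 2.0 * G * mass_kg / C**2
    if radius_m <= rs:
        raise ValueError("emission radius must lie outside the Schwarzschild radius")
    return 1.0 / math.sqrt(1.0 - rs / radius_m) - 1.0

M_SUN = 1.99e30    # kg
R_SUN = 6.96e8     # m (assumed standard value)
R_EARTH = 6.371e6  # m

Z_SUN = gravitational_redshift(M_SUN, R_SUN)
Z_WHITE_DWARF = gravitational_redshift(M_SUN, R_EARTH)

print(f"Sun:         z = {Z_SUN:.3e}  (equivalent to {Z_SUN * C / 1000:.2f} km/s)")
print(f"white dwarf: z = {Z_WHITE_DWARF:.3e}  (equivalent to {Z_WHITE_DWARF * C / 1000:.1f} km/s)")
```

The shift for the Sun is equivalent to a recession speed of well under 1 km/s, while the compact white dwarf gives a shift roughly a hundred times larger.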

In practice the correction factor is extremely small for normal stars and is lost in the noise of atmospheric turbulence, but becomes measurable for massive dwarf stars where $R$ is much closer to $R_s$. W S Adams claimed to have measured it for Sirius B (about the same mass as the Sun, but with a radius similar to that of the Earth, so it is one of the more massive white dwarfs) in 1925, although there is doubt regarding this since the value found could have other sources. James W Brault measured the effect in light from the Sun in 1972. In 2011 Radek Wojtak of the Niels Bohr Institute at the University of Copenhagen collected data from 8000 galaxy clusters and found that the light coming from the cluster centres tended to be red-shifted compared to the cluster edges, the effect being due to the mass at the centres of the clusters rather than that of a star.

On 19 May 2018 a star S2 (or S0-2) orbiting around Sagittarius A*, the black hole at the centre of the Milky Way galaxy, came very close to the black hole, and as well as its speed increasing to 0.027c its spectrum was red-shifted due to the intense gravitational field, as predicted.


Schwarzschild Distance

This is treated in a similar way to proper time, but the two events occur at the same coordinate time and at different locations separated only in a radial direction by $dr$.

In Schwarzschild space-time the separation (squared) of the two events is given by

$$(ds)^2 = \left(1 - \frac{2GM}{rc^2}\right)(c\,dt)^2 - \frac{1}{1 - \frac{2GM}{rc^2}}(dr)^2 - r^2 (d\theta)^2 - r^2 \sin^2\theta\, (d\phi)^2$$

but since the two times are identical $dt = 0$ and this simplifies to

$$(ds)^2 = -\frac{1}{1 - \frac{2GM}{rc^2}}(dr)^2 - r^2 (d\theta)^2 - r^2 \sin^2\theta\, (d\phi)^2$$

and since there is no change in angle $d\theta = d\phi = 0$ and

$$(ds)^2 = -\frac{1}{1 - \frac{2GM}{rc^2}}(dr)^2$$

A local observer located at both events would measure the proper distance between them as

$$d\sigma = \sqrt{-(ds)^2} = \frac{1}{\sqrt{1 - \frac{2GM}{rc^2}}}\, dr$$

where the first equality comes from special relativity. This is greater than $dr$, although in special relativity they would be the same.

The difference increases as the events occur closer to the Schwarzschild radius.

This means that if the Schwarzschild space is considered as a set of concentric shells with radii 𝑟

then a coordinate length ∆𝑟 does not represent the true radial distance between shells.

One method of calculating the coordinate radius of an event is to measure the proper

circumference of the shell 𝐶 and calculate 𝑟 = 𝐶/2𝜋.

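Note that $d\sigma = \dfrac{1}{\sqrt{1 - \frac{R_s}{r}}}\, dr$ can be integrated to give

$$\sigma = \int \frac{dr}{\sqrt{1 - \frac{R_s}{r}}} = r\sqrt{1 - \frac{R_s}{r}} + \frac{R_s}{2} \ln\left(1 + \sqrt{1 - \frac{R_s}{r}}\right) - \frac{R_s}{2} \ln\left(1 - \sqrt{1 - \frac{R_s}{r}}\right) + C$$

For example the proper distance from $R_s$ to $nR_s$ is

$$\sigma = \left( \sqrt{n-1}\sqrt{n} + \frac{1}{2}\ln\left(\frac{n + \sqrt{n-1}\sqrt{n}}{n}\right) - \frac{1}{2}\ln\left(\frac{n - \sqrt{n-1}\sqrt{n}}{n}\right) \right) R_s$$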
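The closed form for the proper distance from $R_s$ to $nR_s$ can be checked against a direct numerical integration of $d\sigma = dr/\sqrt{1 - R_s/r}$. The sketch below works in units of $R_s$; the substitution $r/R_s = 1 + s^2$ and the function names are choices made here for illustration, not part of the original derivation.

```python
import math

def proper_distance_closed_form(n):
    """Proper radial distance from R_s to n*R_s, in units of R_s."""
    w = math.sqrt(n - 1.0) * math.sqrt(n)
    return w + 0.5 * math.log((n + w) / n) - 0.5 * math.log((n - w) / n)

def proper_distance_numeric(n, steps=100_000):
    """Midpoint-rule integral of dr / sqrt(1 - R_s/r), in units of R_s.

    Substituting r/R_s = 1 + s^2 removes the integrable singularity at
    r = R_s: the integrand becomes 2*sqrt(1 + s^2) for s in [0, sqrt(n-1)].
    """
    s_max = math.sqrt(n - 1.0)
    ds = s_max / steps
    total = sum(2.0 * math.sqrt(1.0 + ((i + 0.5) * ds) ** 2) for i in range(steps))
    return total * ds

for n in (2.0, 10.0):
    print(f"n = {n:4.1f}: closed form {proper_distance_closed_form(n):.6f}, "
          f"numeric {proper_distance_numeric(n):.6f}, coordinate distance {n - 1.0:.1f}")
```

In both cases the proper distance exceeds the coordinate distance $(n-1)R_s$, with the excess accumulating mostly near $R_s$.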

A remote observer cannot measure distance as to do so requires measuring rods and so the

observer must be local.

The key point is that coordinate 𝑟 increases with distance but is not directly related to distance

– it is said to lack metric significance.


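Schwarzschild Geodesic Motion

A freely moving object with a small mass $m$ (called a test mass, small so as not to affect space-time) will follow the geodesic that passes through the object.

The geodesic is defined by a parameter $\lambda$ and must satisfy the equations

$$\frac{d^2 e^{\mu}}{d\lambda^2} + \sum_{\nu,\rho} \Gamma^{\mu}{}_{\nu\rho} \frac{de^{\nu}}{d\lambda} \frac{de^{\rho}}{d\lambda} = 0$$

where

$$\Gamma^{\mu}{}_{\nu\rho} = \frac{1}{2} \sum_{\sigma} g^{\mu\sigma} \left( \frac{\partial g_{\sigma\rho}}{\partial e^{\nu}} + \frac{\partial g_{\nu\sigma}}{\partial e^{\rho}} - \frac{\partial g_{\nu\rho}}{\partial e^{\sigma}} \right)$$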

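The connection coefficients are (from $f^{2A} = f^{-2B} = 1 - \frac{R_s}{r}$ where $R_s = \frac{2GM}{c^2}$)

$$\Gamma^{0}{}_{01} = \Gamma^{0}{}_{10} = A' = \frac{GM}{r^2 c^2 \left(1 - \frac{2GM}{rc^2}\right)}$$

$$\Gamma^{1}{}_{00} = A' f^{2(A-B)} = \frac{GM \left(1 - \frac{2GM}{rc^2}\right)}{r^2 c^2}$$

$$\Gamma^{1}{}_{11} = B' = -\frac{GM}{r^2 c^2 \left(1 - \frac{2GM}{rc^2}\right)}$$

$$\Gamma^{1}{}_{22} = -r f^{-2B} = -r \left(1 - \frac{2GM}{rc^2}\right)$$

$$\Gamma^{1}{}_{33} = -r f^{-2B} \sin^2\theta = -r \left(1 - \frac{2GM}{rc^2}\right) \sin^2\theta$$

$$\Gamma^{2}{}_{12} = \Gamma^{2}{}_{21} = \frac{1}{r}$$

$$\Gamma^{2}{}_{33} = -\sin\theta \cos\theta$$

$$\Gamma^{3}{}_{13} = \Gamma^{3}{}_{31} = \frac{1}{r}$$

$$\Gamma^{3}{}_{23} = \Gamma^{3}{}_{32} = \cot\theta$$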

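This gives the four geodesic equations as

$$\frac{d^2 t}{d\lambda^2} + \frac{2GM}{r^2 c^2 \left(1 - \frac{2GM}{rc^2}\right)} \frac{dr}{d\lambda} \frac{dt}{d\lambda} = 0$$

$$\frac{d^2 r}{d\lambda^2} + \frac{GM}{r^2} \left(1 - \frac{2GM}{rc^2}\right) \left(\frac{dt}{d\lambda}\right)^2 - \frac{GM}{r^2 c^2 \left(1 - \frac{2GM}{rc^2}\right)} \left(\frac{dr}{d\lambda}\right)^2 - r \left(1 - \frac{2GM}{rc^2}\right) \left( \left(\frac{d\theta}{d\lambda}\right)^2 + \sin^2\theta \left(\frac{d\phi}{d\lambda}\right)^2 \right) = 0$$

$$\frac{d^2 \theta}{d\lambda^2} + \frac{2}{r} \frac{dr}{d\lambda} \frac{d\theta}{d\lambda} - \sin\theta \cos\theta \left(\frac{d\phi}{d\lambda}\right)^2 = 0$$

$$\frac{d^2 \phi}{d\lambda^2} + \frac{2}{r} \frac{dr}{d\lambda} \frac{d\phi}{d\lambda} + 2 \cot\theta \frac{d\theta}{d\lambda} \frac{d\phi}{d\lambda} = 0$$

which must be solved for $t(\lambda)$, $r(\lambda)$, $\theta(\lambda)$, $\phi(\lambda)$, from which the equation of the geodesic as a function of $(t, r, \theta, \phi)$ can be calculated, the equivalent of a trajectory in Newtonian mechanics.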


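The equations can be simplified by noting that the norm of the tangent vector is

$$n^2 = \sum_{\mu,\nu} g_{\mu\nu} \frac{de^{\mu}}{d\lambda} \frac{de^{\nu}}{d\lambda}$$

A suitable affine parameter for an object with mass is proper time $\tau$, i.e. $\lambda = \tau$, so

$$n^2 = \sum_{\mu,\nu} g_{\mu\nu} \frac{de^{\mu}}{d\tau} \frac{de^{\nu}}{d\tau}$$

From special relativity $U^{\mu} = \frac{de^{\mu}}{d\tau}$ is the four-velocity. Also from special relativity $\sum_{\mu,\nu=0}^{3} g_{\mu\nu} U^{\mu} U^{\nu} = c^2$, so the expression for $n^2$ becomes

$$c^2 = c^2 \left(1 - \frac{2GM}{rc^2}\right) \left(\frac{dt}{d\tau}\right)^2 - \frac{1}{1 - \frac{2GM}{rc^2}} \left(\frac{dr}{d\tau}\right)^2 - r^2 \left(\frac{d\theta}{d\tau}\right)^2 - r^2 \sin^2\theta \left(\frac{d\phi}{d\tau}\right)^2$$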

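This can be further simplified by noting symmetries in Schwarzschild space-time which lead to conservation laws.

$$\frac{E}{mc^2} = \left(1 - \frac{2GM}{rc^2}\right) \frac{dt}{d\tau}$$

is conservation of energy, where $m$ is the mass of the small object ($E = E_0 + E_{KE} = E_0 - E_{PE}$), and

$$\frac{J}{m} = r^2 \sin^2\theta \frac{d\phi}{d\tau}$$

is conservation of angular momentum. This assumes that the angular momentum vector is aligned with the polar axis so $\theta = \frac{\pi}{2}$ and $\frac{d\theta}{d\tau} = 0$, which gives

$$c^2 = \left(\frac{E}{mc}\right)^2 \frac{1}{1 - \frac{2GM}{rc^2}} - \frac{1}{1 - \frac{2GM}{rc^2}} \left(\frac{dr}{d\tau}\right)^2 - r^2 \left(\frac{d\theta}{d\tau}\right)^2 - \left(\frac{J}{mr}\right)^2$$

or, since $\frac{d\theta}{d\tau} = 0$,

$$\left(\frac{dr}{d\tau}\right)^2 + \left(\frac{J}{mr}\right)^2 \left(1 - \frac{2GM}{rc^2}\right) - \frac{2GM}{r} = c^2 \left( \left(\frac{E}{mc^2}\right)^2 - 1 \right)$$

for an object that remains in the equatorial plane.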

This applies to both an orbiting mass or a mass falling vertically or anything in between.

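For a mass falling along a radial line $\frac{d\phi}{d\tau} = 0$ so $J = 0$ and

$$\left(\frac{dr}{d\tau}\right)^2 - \frac{2GM}{r} = c^2 \left( \left(\frac{E}{mc^2}\right)^2 - 1 \right)$$

Differentiating with respect to proper time:

$$2 \frac{dr}{d\tau} \frac{d^2 r}{d\tau^2} + \frac{2GM}{r^2} \frac{dr}{d\tau} = 0$$

Dividing by $2 \frac{dr}{d\tau}$:

$$\frac{d^2 r}{d\tau^2} + \frac{GM}{r^2} = 0 \qquad \text{or} \qquad \frac{d^2 r}{d\tau^2} = -\frac{GM}{r^2}$$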

This is similar to the Newtonian equivalent, but here 𝑟 is the coordinate distance, not the proper

distance. At sufficient distance however this does approach the Newtonian equivalent.

However the Newtonian equation is for a mass falling under gravity. In relativity the equivalent

expression is free fall because there is no gravitational force or field.


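Schwarzschild Orbit

A body of mass $m$ orbiting around the Schwarzschild mass $M$ is in free fall. This is a fairly accurate model of a single planet orbiting around a star.

Arranging the coordinates so that the body remains in the x-y plane

$$\left(\frac{dr}{d\tau}\right)^2 + \left(\frac{J}{mr}\right)^2 \left(1 - \frac{2GM}{rc^2}\right) - \frac{2GM}{r} = c^2 \left( \left(\frac{E}{mc^2}\right)^2 - 1 \right)$$

The derivative is changed from being with respect to $\tau$ to being with respect to $\phi$ by $\frac{dr}{d\tau} = \frac{d\phi}{d\tau} \frac{dr}{d\phi}$.

Since $\frac{J}{m} = r^2 \sin^2\theta \frac{d\phi}{d\tau}$, it follows that $\frac{d\phi}{d\tau} = \frac{J}{r^2 m \sin^2\theta}$, but in the equatorial plane $\theta = \frac{\pi}{2}$ so

$$\frac{dr}{d\tau} = \frac{J}{r^2 m} \frac{dr}{d\phi}$$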

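The orbit is therefore given by

$$\left(\frac{J}{r^2 m}\right)^2 \left(\frac{dr}{d\phi}\right)^2 + \left(\frac{J}{mr}\right)^2 \left(1 - \frac{2GM}{rc^2}\right) - \frac{2GM}{r} = c^2 \left( \left(\frac{E}{mc^2}\right)^2 - 1 \right)$$

or, dividing by $\left(\frac{J}{r^2 m}\right)^2$,

$$\left(\frac{dr}{d\phi}\right)^2 + r^2 \left(1 - \frac{2GM}{rc^2}\right) - \frac{2GM m^2 r^3}{J^2} = \left(\frac{r^2 mc}{J}\right)^2 \left( \left(\frac{E}{mc^2}\right)^2 - 1 \right)$$

Substituting $u = \frac{1}{r}$, so that $\frac{du}{d\phi} = -u^2 \frac{dr}{d\phi}$:

$$\left(-\frac{1}{u^2} \frac{du}{d\phi}\right)^2 + \frac{1}{u^2} \left(1 - \frac{2GMu}{c^2}\right) - \frac{2GM m^2}{J^2 u^3} = \left(\frac{mc}{u^2 J}\right)^2 \left( \left(\frac{E}{mc^2}\right)^2 - 1 \right)$$

Multiplying by $u^4$ and rearranging:

$$\left(\frac{du}{d\phi}\right)^2 + u^2 = \left(\frac{mc}{J}\right)^2 \left( \left(\frac{E}{mc^2}\right)^2 - 1 \right) + \frac{2GM m^2 u}{J^2} + \frac{2GM u^3}{c^2}$$

Differentiating with respect to $\phi$:

$$\frac{d}{d\phi} \left(\frac{du}{d\phi}\right)^2 + \frac{d}{d\phi} u^2 = \frac{d}{d\phi} \left[ \left(\frac{mc}{J}\right)^2 \left( \left(\frac{E}{mc^2}\right)^2 - 1 \right) \right] + \frac{d}{d\phi} \frac{2GM m^2 u}{J^2} + \frac{d}{d\phi} \frac{2GM u^3}{c^2}$$

$$2 \frac{du}{d\phi} \frac{d^2 u}{d\phi^2} + 2u \frac{du}{d\phi} = \frac{du}{d\phi} \frac{2GM m^2}{J^2} + \frac{du}{d\phi} \frac{6GM u^2}{c^2}$$

Dividing by $2 \frac{du}{d\phi}$:

$$\frac{d^2 u}{d\phi^2} + u = \frac{GM m^2}{J^2} + \frac{3GM u^2}{c^2}$$

This last equation is the equation of the orbit and can be compared to the Newtonian equation

$$\frac{d^2 u}{d\phi^2} + u = \frac{GM m^2}{J^2}$$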

If $r$ is large, $u = \frac{1}{r}$ becomes small and the final term $\frac{3GM u^2}{c^2}$ approaches zero, so the two are equivalent at large orbital diameters, as would be expected.

However for small diameters the two differ. An object in a Newtonian orbit that does not intersect the surface and where the speed is less than escape velocity will follow exactly the same elliptical orbit for as long as there are no external forces. The $\frac{3GM u^2}{c^2}$ term in relativity has several different consequences, one being that the smallest stable circular orbit is $3R_s = \frac{6GM}{c^2}$, and another that elliptical orbits with a larger diameter than $3R_s$ also deviate from a true ellipse in that they do not close.


Schwarzschild Circular Orbits

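In the case of a circular orbit $r$ and hence $u = \frac{1}{r}$ are constant, so $\frac{du}{d\phi} = \frac{d^2 u}{d\phi^2} = 0$. The orbit equation

$$\frac{d^2 u}{d\phi^2} + u = \frac{GM m^2}{J^2} + \frac{3GM u^2}{c^2}$$

then reduces to

$$u = \frac{GM m^2}{J^2} + \frac{3GM u^2}{c^2} \qquad \text{or} \qquad \frac{3GM u^2}{c^2} - u + \frac{GM m^2}{J^2} = 0$$

Solving this gives

$$u = \frac{1 - \sqrt{1 - 4 \cdot \frac{3GM}{c^2} \cdot \frac{GM m^2}{J^2}}}{2 \cdot \frac{3GM}{c^2}} = \frac{1 - \sqrt{1 - 12 \left(\frac{GMm}{cJ}\right)^2}}{\frac{6GM}{c^2}}$$

If $\frac{J}{m} = \frac{2\sqrt{3}\, GM}{c}$, which is the condition for the minimum stable circular orbit,

$$u = \frac{c^2}{6GM} \qquad \text{and} \qquad r = \frac{6GM}{c^2} = 3R_s, \qquad R_s = \frac{2GM}{c^2}$$

There are no circular orbits for $r < \frac{3}{2} R_s$, and those between $\frac{3}{2} R_s$ and $3R_s$ are unstable. This means that in theory an object can remain in the circular orbit, but the slightest perturbation will cause it to leave the orbit (similar to balancing a pin on its point).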
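The two roots of the quadratic for $u$ can be explored numerically. In the sketch below the root with the minus sign is the stable orbit discussed above, while the root with the plus sign (an addition made here for illustration) is the inner, unstable circular orbit. The function name, the value of $c$ and the example mass are assumptions.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s (assumed standard value)

def circular_orbit_radii(mass_kg, j_over_m):
    """Solve 3GM u^2/c^2 - u + GM m^2/J^2 = 0 for u = 1/r.

    Returns (r_stable, r_unstable) in metres, or None when the angular
    momentum per unit mass is too small for any circular orbit.
    """
    a = 3.0 * G * mass_kg / C**2
    k = G * mass_kg / j_over_m**2
    disc = 1.0 - 4.0 * a * k
    if disc < -1e-12:
        return None
    root = math.sqrt(max(disc, 0.0))
    # (1 - root) / (2a) rewritten as 2k / (1 + root) to avoid cancellation.
    u_stable = 2.0 * k / (1.0 + root)
    u_unstable = (1.0 + root) / (2.0 * a)
    return 1.0 / u_stable, 1.0 / u_unstable

M = 1.99e30                            # one solar mass, kg
RS = 2.0 * G * M / C**2
J_MIN = 2.0 * math.sqrt(3.0) * G * M / C  # condition for the minimum stable orbit

R_ISCO, _ = circular_orbit_radii(M, J_MIN)
R_STABLE, R_UNSTABLE = circular_orbit_radii(M, 2.0 * J_MIN)
print(f"minimum stable orbit: {R_ISCO / RS:.3f} R_s")
print(f"J doubled: stable {R_STABLE / RS:.3f} R_s, unstable {R_UNSTABLE / RS:.3f} R_s")
```

For large angular momentum the stable root tends to the Newtonian value $r = J^2/(GM m^2)$, while the unstable root stays between $\frac{3}{2}R_s$ and $3R_s$.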


Schwarzschild Elliptical Orbits

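$$\left(\frac{dr}{d\tau}\right)^2 + \left(\frac{J}{mr}\right)^2 \left(1 - \frac{2GM}{rc^2}\right) - \frac{2GM}{r} = c^2 \left( \left(\frac{E}{mc^2}\right)^2 - 1 \right)$$

can be rewritten as

$$\frac{c^2}{2} \left( \left(\frac{E}{mc^2}\right)^2 - 1 \right) = \frac{1}{2} \left(\frac{dr}{d\tau}\right)^2 + \frac{1}{2} \left(\frac{J}{mr}\right)^2 \left(1 - \frac{2GM}{rc^2}\right) - \frac{GM}{r}$$

The quantity $\frac{1}{2} \left(\frac{J}{mr}\right)^2 \left(1 - \frac{2GM}{rc^2}\right) - \frac{GM}{r}$ is called the effective potential $V_{eff}$.

In Newtonian mechanics $V_{eff} = \frac{1}{2} \left(\frac{J}{mr}\right)^2 - \frac{GM}{r}$.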

The difference between the two has two important results which become more important at

small 𝑟.

In Newtonian mechanics any orbiting body with non-zero angular momentum can never reach

𝑟 = 0 (provided its orbit does not intersect the surface) because as r decreases the orbiting

speed must increase to conserve angular momentum. This results in the body always following

the same ellipse in the absence of any other external force.

In the case of the Schwarzschild space-time elliptical orbits rotate so the path followed is not a

true ellipse, but the orbiting body spends longer close to the central mass (perihelion) than

predicted by Newtonian mechanics. This is called perihelion precession.

Within 3𝑅𝑠 the orbits are less predictable, at first the orbit follows a complex path, but does not

exceed some minimum and maximum value of 𝑟, then for smaller 𝑟 the small body is ejected,

and for 𝑟 below a critical radius which depends on the angular momentum the body will plunge

into the central mass.

For any general black hole there are two critical orbits IBCO and ISCO.

If the Innermost Bound Circular Orbit (IBCO) is crossed the object will (eventually) plunge into the event horizon, never to return.

If an object does not cross the Innermost Stable Circular Orbit (ISCO) its orbit can be

approximated by Newtonian gravity – it will follow a conic curve, but if it follows an ellipse its

precession will be greater than that predicted (as in the case of Mercury’s orbit).

Between the two an orbit consists of two parts – circular whirls and elliptical leaves. The object

will follow a circular path for a number of revolutions, and then a highly elliptical path, followed

by more circular orbits, then another elliptic orbit. The elliptical orbits may all be in the same

direction or there can be a number of them symmetrically distributed in the orbital plane, like

leaves on a stalk or petals on a flower. Additionally they may also precess around the black hole.


Schwarzschild Perihelion Precession

The rotation of the perihelion (closest point) per orbit, in radians, is given by

$$\Delta\phi = \frac{6\pi G M_T}{a(1 - e^2)c^2}$$

where $M_T$ is the total mass of both bodies, $a$ is the semi-major axis (related to the minimum value of $r$) and $e$ is the eccentricity of the ellipse.

This has a relatively small effect on planets orbiting stars (for Mercury it is only 43 arc seconds per century), but a much greater effect for binary stars.

A common unit is arc seconds per (Earth) year while periods $P$ are often quoted in (Earth) days, in which case

$$\Delta\phi = \frac{6 \times 180 \times 3600 \times 365 \times 6.67 \times 10^{-11}}{(3 \times 10^{8})^2} \cdot \frac{M_T}{P a (1 - e^2)} = 1.05 \times 10^{-18} \frac{M_T}{P a (1 - e^2)} \text{ arc secs year}^{-1}$$

Masses of stars are often quoted in Sun masses ($1.99 \times 10^{30}$ kg), planets in Jupiter ($1.90 \times 10^{27}$ kg) or Earth ($5.97 \times 10^{24}$ kg) masses, and radii in astronomical units ($1.496 \times 10^{11}$ metres), so care has to be taken with the units.

For example the planet HD118203b has a mass of 2.1 Jupiters and takes 6.1 days to orbit a star with a mass of 1.2 Suns at a distance of 0.07 AU and an eccentricity of 0.31, which gives a perihelion precession of 43.4 arc secs per Earth year.

The two neutron stars PSR B1913+16 have masses of 1.44 and 1.39 Suns, orbit in 0.323 days with a semi-major axis of about 0.013 AU and an eccentricity of 0.617, resulting in a perihelion precession of 4.2 degrees per Earth year.
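The unit handling is the main source of error in this calculation, so a short sketch may help. It evaluates $\Delta\phi = \frac{6\pi G M_T}{a(1-e^2)c^2}$ per orbit and converts to arc seconds per year. The HD118203b figures are those given above; the orbital elements used for Mercury, the value of $c$ and the function names are assumptions for illustration.

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8         # speed of light, m/s (assumed standard value)
M_SUN = 1.99e30     # kg
M_JUPITER = 1.90e27 # kg
AU = 1.496e11       # m
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi

def precession_arcsec_per_orbit(total_mass_kg, semi_major_axis_m, eccentricity):
    """Perihelion advance per orbit, 6*pi*G*M / (a (1 - e^2) c^2), in arc seconds."""
    radians = (6.0 * math.pi * G * total_mass_kg
               / (semi_major_axis_m * (1.0 - eccentricity**2) * C**2))
    return radians * ARCSEC_PER_RADIAN

def precession_arcsec_per_year(total_mass_kg, semi_major_axis_m, eccentricity, period_days):
    """Perihelion advance in arc seconds per 365-day year."""
    per_orbit = precession_arcsec_per_orbit(total_mass_kg, semi_major_axis_m, eccentricity)
    return per_orbit * 365.0 / period_days

HD118203B = precession_arcsec_per_year(1.2 * M_SUN + 2.1 * M_JUPITER, 0.07 * AU, 0.31, 6.1)

# Assumed orbital elements for Mercury: a = 5.791e10 m, e = 0.2056, P = 87.969 days.
MERCURY_PER_CENTURY = 100.0 * precession_arcsec_per_year(M_SUN, 5.791e10, 0.2056, 87.969)

print(f"HD118203b: {HD118203B:.1f} arc secs per year")
print(f"Mercury:   {MERCURY_PER_CENTURY:.1f} arc secs per century")
```

The small difference from the figure quoted for HD118203b comes from using slightly different rounded constants.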


Schwarzschild Black Hole

A Schwarzschild black hole is an example of Schwarzschild space-time, but in this case the Schwarzschild radius is larger than the radius of the mass. The Schwarzschild radius is now the event horizon which allows light and matter to enter but not leave.

Using Newtonian mechanics the escape velocity at the surface of a body is given by $v = \sqrt{\frac{2GM}{R}}$ and so if $R < \frac{2GM}{c^2}$ light cannot escape. Relativity gives the same result, and $R_s = \frac{2GM}{c^2}$.

Most black holes are either stellar black holes with a mass of 3 times that of the Sun (the

maximum mass for a neutron star) to around 100 times that of the Sun, or supermassive black

holes which are larger than 10,000 times that of the Sun, but relativity allows black holes to

have any mass so mini black holes and intermediate black holes may exist although none are

known and there is no known process to create them.

Stellar black holes are collapsed stars and supermassive black holes form the centre of most if not all galaxies. The one at the centre of the Milky Way is about 4.1 million solar masses so is supermassive. There may be intermediate black holes at the centre of globular clusters.

The main addition to the Schwarzschild metric is that a falling body can pass through the

Schwarzschild radius, but the metric is not valid inside the event horizon.

For a mass falling along a radial line d𝜙

d𝜏= 0 so 𝐽 = 0 where 𝜏 is the proper time measured by an

observer falling with the mass.

(d𝑟

d𝜏)2−

2𝐺𝑀

𝑟= 𝑐2 ((

𝐸

𝑚𝑐2)2− 1) which can be rewritten as

(d𝑟

d𝜏)2= 𝑐2 ((

𝐸

𝑚𝑐2)2− 1 +

𝑅𝑠

𝑟) where 𝐸 is the total energy (𝐸0 + 𝐸𝐾𝐸).

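Assuming that the falling body starts at rest at a large distance $r_0$, then $\frac{dr}{d\tau} = 0$ at $r = r_0$ and

$$\left(\frac{E}{mc^2}\right)^2 = 1 - \frac{R_s}{r_0}$$

If $r_0$ is infinity this becomes $E = mc^2$, and $E$ decreases as $r_0$ decreases. Substituting for $E$ gives

$$\left(\frac{dr}{d\tau}\right)^2 = c^2\left(1 - \frac{R_s}{r_0} - 1 + \frac{R_s}{r}\right) = c^2 R_s\left(\frac{1}{r} - \frac{1}{r_0}\right)$$

$$\frac{dr}{d\tau} = c\sqrt{R_s}\sqrt{\frac{1}{r} - \frac{1}{r_0}} \quad\text{or}\quad \frac{d\tau}{dr} = \frac{1}{c}\sqrt{\frac{r_0}{R_s}}\sqrt{\frac{r}{r_0 - r}}$$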


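Integrating $\frac{d\tau}{dr} = \frac{1}{c}\sqrt{\frac{r_0}{R_s}}\sqrt{\frac{r}{r_0 - r}}$ between $r_0$ and $r$, noting that $\tau$ increases as $r$ decreases,

$$\tau(r) - \tau(r_0) = \frac{1}{c}\sqrt{\frac{r_0}{R_s}}\int_{r}^{r_0}\sqrt{\frac{r}{r_0 - r}}\,dr$$

$$= \frac{r_0}{c}\sqrt{\frac{r_0}{R_s}}\left[\sqrt{\frac{r}{r_0}\left(1 - \frac{r}{r_0}\right)} + \tan^{-1}\left(-\sqrt{\frac{r}{r_0 - r}}\right)\right]_{r_0}^{r}$$

$$= \frac{r_0}{c}\sqrt{\frac{r_0}{R_s}}\left(\frac{\pi}{2} + \sqrt{\frac{r}{r_0}\left(1 - \frac{r}{r_0}\right)} + \tan^{-1}\left(-\sqrt{\frac{r}{r_0 - r}}\right)\right)$$

$$\cong \frac{r_0}{c}\sqrt{\frac{r_0}{R_s}}\left(\frac{\pi}{2} - \frac{2}{3}\left(\frac{r}{r_0}\right)^{3/2}\right) \quad\text{for } r_0 \gg r$$

expanding the RHS as a power series.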

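The proper time to fall to the centre ($r = 0$) is therefore

$$\tau(0) - \tau(r_0) = \frac{r_0}{c}\sqrt{\frac{r_0}{R_s}}\left(\frac{\pi}{2} - 0\right) = \frac{\pi R_s}{2c}\left(\frac{r_0}{R_s}\right)^{3/2}$$

And the proper time to fall to the event horizon is

$$\tau(R_s) - \tau(r_0) \cong \frac{R_s}{c}\left(\frac{r_0}{R_s}\right)^{3/2}\left(\frac{\pi}{2} - \frac{2}{3}\left(\frac{R_s}{r_0}\right)^{3/2}\right)$$

So the proper time from the event horizon to the centre is

$$\tau(0) - \tau(r_0) - \tau(R_s) + \tau(r_0) = \frac{\pi R_s}{2c}\left(\frac{r_0}{R_s}\right)^{3/2} - \frac{R_s}{c}\left(\frac{r_0}{R_s}\right)^{3/2}\left(\frac{\pi}{2} - \frac{2}{3}\left(\frac{R_s}{r_0}\right)^{3/2}\right)$$

$$\tau(0) - \tau(R_s) = \frac{2}{3}\frac{R_s}{c}$$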
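The result $\tau(0) - \tau(R_s) = \frac{2}{3}\frac{R_s}{c}$ can be checked numerically against the closed-form expression for $\tau(r) - \tau(r_0)$ derived above. The Python sketch below is only an illustration: the function name and the chosen values of $R_s$ and $r_0$ are assumptions, and `asin(sqrt(r/r0))` is used in place of the equivalent $\tan^{-1}\sqrt{r/(r_0 - r)}$ so that the expression stays finite at $r = r_0$.

```python
import math

C = 2.998e8  # speed of light, m/s (rounded)


def proper_time_from_rest(r, r0, rs):
    """Proper time in seconds to fall from rest at r0 down to radius r."""
    x = r / r0
    # pi/2 + sqrt(x(1-x)) + atan(-sqrt(r/(r0-r))), written with asin
    bracket = math.pi / 2 + math.sqrt(x * (1.0 - x)) - math.asin(math.sqrt(x))
    return (r0 / C) * math.sqrt(r0 / rs) * bracket


rs = 1.2e10          # metres, illustrative supermassive black hole
r0 = 1.0e4 * rs      # starting radius, chosen so that r0 >> Rs

to_centre = proper_time_from_rest(0.0, r0, rs)
to_horizon = proper_time_from_rest(rs, r0, rs)
ratio = (to_centre - to_horizon) / (rs / C)

print(f"horizon-to-centre proper time = {to_centre - to_horizon:.2f} s")
print(f"in units of Rs/c: {ratio:.5f} (expected close to 2/3)")
```

The small departure of `ratio` from 2/3 comes from the finite starting radius and shrinks as $r_0$ is increased.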

This is proportional to the Schwarzschild radius which in turn is proportional to the mass 𝑀 so

the larger the mass the greater the time.

Thus the body falls with increasing velocity and passes through the Schwarzschild radius

reaching the centre in a finite time.

Note that 𝜏 is the proper time, that measured by an observer falling with the mass.

However the time as measured by a remote observer is different. This time is the same as

coordinate time 𝑡 if the observer is sufficiently remote.


The remote observer is assumed to be on the same radial line, at the origin $r_0$ of the falling body, so $\frac{d\theta}{dt} = \frac{d\phi}{dt} = 0$.

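Consider a photon leaving the falling body and reaching the observer. The separation is 0 so

$$(ds)^2 = \left(1 - \frac{2GM}{rc^2}\right)(c\,dt)^2 - \frac{1}{1 - \frac{2GM}{rc^2}}(dr)^2 - r^2(d\theta)^2 - r^2\sin^2\theta\,(d\phi)^2$$

becomes

$$0 = \left(1 - \frac{R_s}{r}\right)(c\,dt)^2 - \frac{1}{1 - \frac{R_s}{r}}(dr)^2 \quad\text{or}\quad dt = \frac{1}{c}\,\frac{1}{1 - \frac{R_s}{r}}\,dr$$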

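Integrating from emission at $(t_{em}, r_{em})$ to observation at $(t_{ob}, r_{ob})$

$$t_{ob} - t_{em} = \int_{t_{em}}^{t_{ob}} dt = \int_{r_{em}}^{r_{ob}} \frac{1}{c}\,\frac{1}{1 - \frac{R_s}{r}}\,dr = \frac{r_{ob} - r_{em}}{c} + \frac{R_s}{c}\ln\frac{r_{ob} - R_s}{r_{em} - R_s}$$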

As $r_{em} \to R_s$ then $t_{ob} - t_{em} \to \infty$ so the body appears to slow down to zero velocity as it approaches the event horizon and never reaches it.

As $r_{em} \to \infty$ then $\frac{R_s}{c}\ln\frac{r_{ob} - R_s}{r_{em} - R_s} \to 0$ and $t_{ob} - t_{em} = \frac{r_{ob} - r_{em}}{c}$ since coordinate time and distance now approach proper time and distance giving the Galilean result.
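The divergence of the light travel time can be seen by evaluating $t_{ob} - t_{em}$ for emission points ever closer to $R_s$. This Python sketch uses made-up values for $R_s$ and $r_{ob}$ and an illustrative function name; it is not part of the original derivation.

```python
import math

C = 2.998e8  # speed of light, m/s (rounded)


def light_travel_time(r_em, r_ob, rs):
    """Coordinate time for a radial photon to travel from r_em out to r_ob."""
    flat_part = (r_ob - r_em) / C
    delay_part = (rs / C) * math.log((r_ob - rs) / (r_em - rs)) if rs > 0 else 0.0
    return flat_part + delay_part


rs = 3.0e4       # metres, roughly a 10 solar mass black hole (assumed)
r_ob = 1.0e9     # observer radius in metres (assumed)

# The logarithmic term grows without limit as r_em approaches Rs.
times = []
for factor in (10.0, 1.1, 1.001, 1.000001):
    t = light_travel_time(factor * rs, r_ob, rs)
    times.append(t)
    print(f"r_em = {factor} Rs: t_ob - t_em = {t:.6f} s")
```

Each step towards the horizon adds a further delay proportional to $R_s/c$, so the total increases logarithmically without bound.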


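The observed position $r_*$ of the falling body can be obtained from

$$\frac{E}{mc^2} = \left(1 - \frac{R_s}{r}\right)\frac{dt}{d\tau} \qquad \left[\frac{E}{mc^2} = \left(1 - \frac{2GM}{rc^2}\right)\frac{dt}{d\tau}\right]$$

$$\frac{dt}{d\tau} = \frac{\sqrt{1 - \frac{R_s}{r_0}}}{1 - \frac{R_s}{r}} \qquad \left[\left(\frac{E}{mc^2}\right)^2 = 1 - \frac{R_s}{r_0}\right]$$

Also $\frac{d\tau}{dr} = -\frac{1}{c}\sqrt{\frac{r_0}{R_s}}\sqrt{\frac{r}{r_0 - r}}$ so

$$\frac{d\tau}{dr}\,\frac{dt}{d\tau} = \frac{dt}{dr} = -\frac{1}{c\sqrt{R_s}}\sqrt{\frac{r r_0}{r_0 - r}}\;\frac{\sqrt{1 - \frac{R_s}{r_0}}}{1 - \frac{R_s}{r}}$$

$$\frac{dt}{dr} = -\frac{1}{c\sqrt{R_s}}\,\frac{\sqrt{r}}{1 - \frac{R_s}{r}} \quad\text{for } r \ll r_0$$

$$t(r) - t(r_*) = -\frac{1}{c\sqrt{R_s}}\int_{r_*}^{r}\frac{\sqrt{r}}{1 - \frac{R_s}{r}}\,dr \quad\text{for } R_s \ll r_* \ll r_0$$

$$= -\frac{R_s}{c}\left[\frac{2}{3}\left(\frac{r}{R_s}\right)^{3/2} + 2\left(\frac{r}{R_s}\right)^{1/2} - \ln\left|\frac{\left(\frac{r}{R_s}\right)^{1/2} + 1}{\left(\frac{r}{R_s}\right)^{1/2} - 1}\right|\right]_{r_*}^{r}$$

$$= \frac{R_s}{c}\left(\ln\left|\frac{\left(\frac{r}{R_s}\right)^{1/2} + 1}{\left(\frac{r}{R_s}\right)^{1/2} - 1}\right| - \frac{2}{3}\left(\frac{r}{R_s}\right)^{3/2} - 2\left(\frac{r}{R_s}\right)^{1/2}\right) + C$$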

The value of C is found by adjusting this curve so that at 𝑡(𝑟∗) and 𝜏(𝑟0) the two curves for

coordinate time and proper time match. This curve increases to infinity as 𝑟 approaches 𝑅𝑠.

Another important issue is that the emitted photon will have an increasing gravitational red shift as the body approaches the event horizon, the observed frequency $f_{obg}$ being $f_{obg} = f_{em}\sqrt{1 - \frac{R_s}{r}}$.

The effect for stars is extremely small and is hidden by noise such as the Doppler effect of

turbulence in the star’s atmosphere. It can be detected by observing the spectrum of white

dwarfs. It can be measured in laboratory tests. If the radiation is emitted at the Earth's radius $R$ with frequency $f_R$ and measured at a height $h$ above that with a change in frequency of $\Delta f_R$ then $\frac{\Delta f_R}{f_R} = -\frac{MGh}{c^2R^2} = -\frac{gh}{c^2}$ where $g$ is the acceleration due to gravity at the Earth's surface.

Experiments have been made over a height difference of 22.5 m. Additional tests have been made between a gravity probe at a height of 10,000 km and the Earth's surface.
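To give a feel for the size of the laboratory effect, this short Python sketch evaluates $\Delta f_R / f_R = -gh/c^2$ for the 22.5 m height difference mentioned above. The value of $g$ is a rounded assumption, and the linear formula is only appropriate for heights much smaller than the Earth's radius.

```python
C = 2.998e8         # speed of light, m/s (rounded)
G_SURFACE = 9.81    # acceleration due to gravity at the Earth's surface, m/s^2 (assumed)


def fractional_frequency_shift(height_m):
    """Fractional shift -g*h/c^2 for radiation received a height h above its source."""
    return -G_SURFACE * height_m / C**2


shift = fractional_frequency_shift(22.5)
print(f"fractional shift over 22.5 m: {shift:.3e}")
```

The shift is a few parts in $10^{15}$, which is why very narrow spectral lines are needed to detect it.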

The increasing velocity of the body will also give rise to a Doppler redshift so the received frequency is $f_{obD} = f_{em}\frac{\sqrt{c - V_r}}{\sqrt{c + V_r}}$.

The combination will cause the body to dim rapidly. Assuming light is emitted at a constant rate, $L_0 \cong L_r e^{-ct/R_s}$ as $r \to R_s$.


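The speed of the body as it falls through the event horizon can be calculated from

$$\frac{dt}{dr} = -\frac{1}{c\sqrt{R_s}}\sqrt{\frac{r r_0}{r_0 - r}}\;\frac{\sqrt{1 - \frac{R_s}{r_0}}}{1 - \frac{R_s}{r}}$$

so

$$\frac{dr}{dt} = -c\sqrt{R_s}\sqrt{\frac{r_0 - r}{r r_0}}\;\frac{1 - \frac{R_s}{r}}{\sqrt{1 - \frac{R_s}{r_0}}}$$

This can be converted from coordinate velocity to proper velocity since

$$\frac{d\sigma}{d\tau} = \frac{1}{1 - \frac{R_s}{r}}\,\frac{dr}{dt} \qquad \left[d\sigma = \frac{1}{\sqrt{1 - \frac{R_s}{r}}}\,dr \quad\text{and}\quad d\tau = \sqrt{1 - \frac{R_s}{r}}\,dt\right]$$

$$\frac{d\sigma}{d\tau} = -\frac{1}{1 - \frac{R_s}{r}}\,c\sqrt{R_s}\sqrt{\frac{r_0 - r}{r r_0}}\;\frac{1 - \frac{R_s}{r}}{\sqrt{1 - \frac{R_s}{r_0}}} = -c\sqrt{\frac{R_s(r_0 - r)}{r(r_0 - R_s)}}$$

$$= -c\sqrt{\frac{R_s(r_0 - R_s)}{R_s(r_0 - R_s)}} = -c \quad\text{at } r = R_s$$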

So as the body travels through the event horizon it reaches the speed of light as observed by an

observer at the event horizon.

It should be noted that the Schwarzschild metric becomes increasingly inaccurate as the event

horizon is approached, and is not valid inside it, but it does give an indication of the trends.


Tidal Effects at the Event Horizon

In Newtonian mechanics tidal effects result from the variation in strength of a gravitational

field. For example the Moon’s gravitational field is strongest on the side of the Earth facing the

Moon and weakest on the other side of the Earth resulting in two tidal bulges on the Earth-

Moon line, one towards the Moon and the other away from it so the overall shape is elongated

along the Earth-Moon line and reduced in width perpendicular to it.

In relativity if there are two nearly parallel geodesics A and B which diverge slightly with

increasing value of an affine parameter 𝜆, the separation between two bodies, one moving along

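A and the other along B will be given by $\xi^\mu(\lambda) = e^\mu_B(\lambda) - e^\mu_A(\lambda)$. The change in this separation due to the curvature of space-time is given by the geodesic deviation equation

$$\frac{D^2\xi^\mu}{D\lambda^2} + \sum_{\alpha,\beta,\gamma} R^\mu{}_{\beta\alpha\gamma}\,\xi^\alpha\,\frac{de^\beta}{d\lambda}\,\frac{de^\gamma}{d\lambda} = 0$$

where

$$\frac{D\xi^\mu}{D\lambda} \equiv \frac{d\xi^\mu}{d\lambda} + \sum_{\alpha,\beta}\Gamma^\mu{}_{\alpha\beta}\,\xi^\alpha\,\frac{de^\beta}{d\lambda}$$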

Assuming a finite sized body is falling into the black hole the side of the body facing the hole and the side away from the hole will reach the event horizon at significantly different proper times

(i.e. more than would be accounted for by its speed alone). The body is elongated along the

radial line, and narrowed perpendicular to it (and is called spaghettification in science fiction).

This is in Newtonian terms a tidal force which can be expressed as a field gradient of $\left|\frac{df}{dr}\right| = \frac{2GM}{r^3}$.

In the case of a 40 solar mass stellar black hole this would be lethal for a human at a distance of 1000 km ($R_s$ = 120 km) but for a $10^7$ solar mass supermassive black hole it would not be noticeable at the event horizon ($R_s = 3\times10^7$ km) so the body would pass through the event horizon without being destroyed.
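The contrast between the two cases can be made concrete by multiplying the field gradient $2GM/r^3$ by the length of the falling body. In this Python sketch the 2 m body length, the rounded constants and the function names are illustrative assumptions.

```python
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8         # speed of light, m/s (rounded)
M_SUN = 1.989e30    # solar mass, kg (rounded)
BODY_LENGTH = 2.0   # metres, assumed head-to-foot length of a human


def tidal_gradient(mass_kg, r_m):
    """Newtonian field gradient |df/dr| = 2GM/r^3 in s^-2."""
    return 2.0 * G * mass_kg / r_m**3


def schwarzschild_radius(mass_kg):
    return 2.0 * G * mass_kg / C**2


# 40 solar mass black hole at a distance of 1000 km
stellar = tidal_gradient(40.0 * M_SUN, 1.0e6) * BODY_LENGTH

# 10^7 solar mass black hole at its own event horizon
big_mass = 1.0e7 * M_SUN
supermassive = tidal_gradient(big_mass, schwarzschild_radius(big_mass)) * BODY_LENGTH

print(f"stellar case: {stellar:.3g} m/s^2 difference across the body")
print(f"supermassive case: {supermassive:.3g} m/s^2 difference across the body")
```

The first figure is thousands of times the Earth's surface gravity, while the second is far too small to feel.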


Light Deflection

If a beam of light travelled in a straight line past a Schwarzschild black hole such that its closest approach would be a distance $b$ from the centre of the black hole, the light deviates from the straight line and curves round the black hole. The distance $b$ is called the impact parameter and the amount of deviation depends only on the ratio $\frac{b}{R_s}$.

For $\frac{b}{R_s} > 3.5$ the beam will curve by an amount less than 90 degrees, and then continue.

For $3.5 > \frac{b}{R_s} > 2.6$ the beam will curve by between 90 and 180 degrees but will escape.

At $\frac{b}{R_s} = 2.6$ the beam enters a circular orbit around the black hole. This orbit is called the photon sphere and has a radius of $1.5R_s$. The orbit is unstable since it is less than $3R_s$ and such photons will spiral in towards the centre. However since the photon sphere is larger than the Schwarzschild radius photons and matter originating inside it can escape.

For $\frac{b}{R_s} < 2.6$ the photons pass through the photon sphere and rapidly spiral in towards the centre.

For large values of $b$ the deflection angle is $\frac{2R_s}{b}$ or $\frac{4GM}{bc^2}$ radians. A star seen at the edge of the sun

has a deflection of about 1.75 seconds of arc. This can lead to gravitational lensing whereby the

light from a distant object is diverted as it passes a nearby object (the lens) in the same line of

sight resulting in multiple images which are distorted into arcs.
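The figure of about 1.75 seconds of arc for light grazing the Sun follows directly from the large-$b$ formula $4GM/(bc^2)$. The Python sketch below uses rounded constants and an assumed solar radius; the names are illustrative.

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8         # speed of light, m/s (rounded)
M_SUN = 1.989e30    # solar mass, kg (rounded)
R_SUN = 6.957e8     # solar radius, m (assumed value)


def deflection_angle_rad(mass_kg, impact_parameter_m):
    """Weak-field light deflection 4GM/(b c^2) in radians."""
    return 4.0 * G * mass_kg / (impact_parameter_m * C**2)


angle_arcsec = math.degrees(deflection_angle_rad(M_SUN, R_SUN)) * 3600.0
print(f"deflection at the solar limb: {angle_arcsec:.3f} arc seconds")
```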

The lensing equation is $\beta = \theta - \alpha$ where $\beta$ is the angle between the lensing object and the source, $\theta$ is the angle between the lensing object and the image, and $\alpha$ is the angle between the source and the image, all measured by the observer and all very small.

$\alpha = \frac{d_{LS}}{d_{OS}}\frac{4GM}{bc^2}$ where $d_{LS}$ is the angular diameter distance of the source from the lens and $d_{OS}$ is the angular diameter distance from the observer to the source.

$\theta = \frac{b}{d_{OL}}$ where $d_{OL}$ is the angular diameter distance from the observer to the lens. (Note that these angular diameter distances, which are used by observational cosmologists and are based on the relative diameters of objects, are not additive due to the expansion of the universe: $d_{OS} \neq d_{OL} + d_{LS}$. To be additive they must be converted to proper distances.)

If the lens and source are exactly on the same line of sight $\beta = 0$ and then $\theta_E = \sqrt{\frac{4GMd_{LS}}{d_{OL}d_{OS}c^2}}$ where $\theta_E$ is the Einstein radius and the image becomes a ring surrounding the lens known as the Einstein ring. More often four images are seen, which is known as the Einstein cross.

For $\beta \neq 0$, substituting for $\alpha$ in $\beta = \theta - \alpha$ gives

$$\beta = \theta - \frac{d_{LS}}{d_{OS}}\frac{4GM}{bc^2}$$

and using the definition of $\theta_E$ together with $\theta = \frac{b}{d_{OL}}$ this becomes

$$\beta = \theta - \frac{\theta_E^2}{\theta}$$

This is the quadratic $\theta^2 - \beta\theta - \theta_E^2 = 0$ with solutions

$$\theta = \frac{1}{2}\left(\beta \pm \sqrt{\beta^2 + 4\theta_E^2}\right)$$

so there are two images, one on the opposite side to the source.
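The two image positions follow from rearranging $\beta = \theta - \theta_E^2/\theta$ into the quadratic $\theta^2 - \beta\theta - \theta_E^2 = 0$. The Python sketch below solves it and checks each root against the lensing equation; the function name and the sample angles are illustrative assumptions.

```python
import math


def image_positions(beta, theta_e):
    """Roots of theta^2 - beta*theta - theta_E^2 = 0 (all angles in the same units)."""
    root = math.sqrt(beta**2 + 4.0 * theta_e**2)
    return 0.5 * (beta + root), 0.5 * (beta - root)


theta_e = 1.0   # Einstein radius, arbitrary angular units
beta = 0.5      # source offset from the lens, same units

theta_plus, theta_minus = image_positions(beta, theta_e)
print(f"image on the source side: {theta_plus:+.4f}")
print(f"image on the opposite side: {theta_minus:+.4f}")

# Each root should reproduce beta through beta = theta - theta_E^2 / theta.
for theta in (theta_plus, theta_minus):
    print(f"check: {theta - theta_e**2 / theta:.4f}")
```

One root is positive and lies outside the Einstein radius, the other is negative and lies inside it; for $\beta = 0$ both reduce to $\pm\theta_E$, the Einstein ring.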


Micro-lensing

Light is deflected when it passes any body such as a star or planet, but the effect is very much

smaller because the mass is lower. In particular it can be used to detect planets that are in large

orbits around a star. This is called micro-lensing.

The Einstein radius for a star can be written as $\theta_E = \sqrt{\frac{2R_s}{d_{OL}}\left(\frac{d_{OS} - d_{OL}}{d_{OS}}\right)}$, noting that $d_{OS} = d_{OL} + d_{LS}$ in this case as the effects of expansion can be ignored.

If $d_{OS} \gg d_{OL}$ this simplifies to $\theta_E = \sqrt{\frac{2R_s}{d_{OL}}} = \sqrt{\frac{4GM_*}{c^2d_{OL}}} = \sqrt{\frac{4GM_*}{c^2d_*}}$ where $M_*$ is the mass of a star and $d_*$ is its distance from Earth.

The physical radius of the ring is therefore $r_E = \theta_E d_*$ and so if the star is moving at a velocity component $v_*$ perpendicular to the line of sight, the time for which a more distant star can be within the ring is of order $t_* = \frac{r_E}{v_*}$. This is known as the lensing time, and it is proportional to the square root of the mass, $t_* \propto \sqrt{M_*}$.

This also applies to planets, $t_p \propto \sqrt{M_p}$.

If a planet is orbiting the star then $t_p = t_*\frac{\sqrt{M_p}}{\sqrt{M_*}}$ is the lensing time of the planet. A typical value for a star is measured in weeks, but that of a planet in hours.
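The orders of magnitude quoted above can be reproduced with a rough calculation. In this Python sketch the lens distance (4 kpc), transverse speed (200 km/s) and the Jupiter-mass planet are assumed example values, the lensing time is taken as $r_E / v_*$, and the simplified $d_{OS} \gg d_{OL}$ form of $\theta_E$ is used.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8          # speed of light, m/s (rounded)
M_SUN = 1.989e30     # solar mass, kg (rounded)
M_JUPITER = 1.898e27 # Jupiter mass, kg (rounded)
KPC = 3.0857e19      # one kiloparsec in metres


def einstein_ring_radius_m(mass_kg, d_lens_m):
    """Physical ring radius r_E = theta_E * d = sqrt(4 G M d) / c for d_OS >> d_OL."""
    return math.sqrt(4.0 * G * mass_kg * d_lens_m) / C


def lensing_time_s(mass_kg, d_lens_m, v_transverse):
    """Time to cross one Einstein ring radius at transverse speed v."""
    return einstein_ring_radius_m(mass_kg, d_lens_m) / v_transverse


d_lens = 4.0 * KPC   # assumed lens distance
v = 2.0e5            # assumed transverse speed, m/s

t_star = lensing_time_s(M_SUN, d_lens, v)
t_planet = lensing_time_s(M_JUPITER, d_lens, v)

print(f"star: about {t_star / 86400.0:.0f} days")
print(f"planet: about {t_planet / 3600.0:.0f} hours")
```

With these inputs the star's lensing time comes out at several weeks and the planet's at a day or two, in line with the square-root scaling.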

If a planet orbiting the star has an angular distance from the star as seen from Earth that is

smaller than the Einstein radius and the star and planet pass in front of a more distant star that

is within the Einstein radius of the star the apparent magnitude of the more distant star will

increase and decrease over a period of up to 𝑡∗. However if the distant star also enters the

Einstein ring of the planet there will be an additional increase in magnitude for a period of 𝑡𝑝.

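The increase in brightness (not magnitude) is given by

$$\mu = \frac{\left(\frac{\beta}{\theta_{E*}}\right)^2 + 2}{\left(\frac{\beta}{\theta_{E*}}\right)\sqrt{\left(\frac{\beta}{\theta_{E*}}\right)^2 + 4}}$$

for $\beta \neq 0$, where $\beta$ is the angle between the lensing object and the distant star, and

$$\mu \cong \frac{\theta_{E*}}{\beta} \quad\text{for } \beta \ll \theta_{E*}$$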
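The brightening factor is easy to tabulate as a function of $u = \beta/\theta_{E*}$. This Python sketch, with an illustrative function name, evaluates the full expression and compares it with the small-angle limit $\theta_{E*}/\beta = 1/u$.

```python
import math


def magnification(u):
    """Point-lens brightening (u^2 + 2) / (u * sqrt(u^2 + 4)), with u = beta / theta_E."""
    return (u * u + 2.0) / (u * math.sqrt(u * u + 4.0))


for u in (2.0, 1.0, 0.5, 0.1, 0.001):
    print(f"u = {u}: mu = {magnification(u):.4f}, 1/u = {1.0 / u:.4f}")
```

A source on the edge of the Einstein ring ($u = 1$) is brightened by $3/\sqrt{5}$, and the two columns converge as $u$ becomes small.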

Since $\theta_{E*} \propto \sqrt{M_*}$ and since this will also apply to the planet, $\theta_{Ep} \propto \sqrt{M_p}$, the ratio of the peaks is $\frac{\mu_p}{\mu_*} = \sqrt{\frac{M_p}{M_*}}$.

This is observed as a brief spike in magnitude while the magnitude has already been increased,

and importantly it affects all wavelengths equally which can be used to eliminate other sources

for changes in magnitude. A few planets have been detected in this way, but it is limited to large

planets and orbits (Jupiter sized planets in a Jupiter sized orbit), and is a one-off event – it

cannot be repeated.


Inside the Schwarzschild Radius of a Black Hole

Spherical coordinates are not useful close to the Schwarzschild radius. As 𝑟 decreases towards

𝑅𝑠 coordinate time 𝑡 increases towards infinity, but once within the Schwarzschild radius

coordinate time decreases and becomes equal to proper time at 𝑟 = 0.

An incoming photon experiences a reversal of coordinate time once it is inside the event

horizon.

This problem can be solved by changing to the advanced Eddington-Finkelstein coordinates in which the $t$ coordinate is replaced by the $t'$ coordinate where $ct' = ct + R_s\ln\left|\frac{r}{R_s} - 1\right|$ so

$$(ds)^2 = \left(1 - \frac{R_s}{r}\right)(c\,dt')^2 - 2\frac{R_s}{r}\,c\,dt'\,dr - \left(1 + \frac{R_s}{r}\right)(dr)^2 - r^2(d\theta)^2 - r^2\sin^2\theta\,(d\phi)^2$$

which has removed the singularity at $R_s$.

This has two important consequences.

In Schwarzschild coordinates the axis of the local light cone of a falling body is parallel to the ct

axis outside the Schwarzschild radius and is perpendicular to it inside, the change being at 𝑅𝑠.

Ingoing and outgoing geodesics are both symmetrical about $R_s$.

In advanced Eddington-Finkelstein coordinates the local light cones of a falling body are parallel

to the ct axis remote from the Schwarzschild radius, but tilt towards the centre as the

Schwarzschild radius is approached and continue to tilt more and more as 𝑟 decreases.

Outgoing geodesics are still symmetrical about $R_s$, but incoming geodesics are straight lines

which cross the Schwarzschild radius.

This implies that the Schwarzschild solution is only half of the full solution – beyond the

singularity there is another region of flat spacetime, and the black hole connects the two – hence

the discussions of wormholes in space.

The advanced Eddington-Finkelstein coordinates are much more complicated so are only used

when necessary – close to or inside the event horizon.


Kerr Black Holes

In the real universe a black hole will attract material from around it and it is extremely unlikely

that this material will fall radially inwards so the black hole will acquire angular momentum

even if originally it had none (which is also unlikely if it is a collapsed star).

The presence of angular momentum means that the Schwarzschild model is not appropriate

because spherical symmetry has been lost. There is now a preferred direction given by the

angular momentum vector, i.e. the z axis.

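The Kerr model is an exact solution of Einstein’s equations which takes account of rotation. It has the line element

$$(ds)^2 = \left(1 - \frac{R_s r}{\rho^2}\right)(c\,dt)^2 + 2\frac{R_s r a\sin^2\theta}{\rho^2}\,c\,dt\,d\phi - \frac{\rho^2}{\Delta}(dr)^2 - \rho^2(d\theta)^2 - \left((r^2 + a^2)\sin^2\theta + \frac{R_s r a^2\sin^4\theta}{\rho^2}\right)(d\phi)^2$$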

There is an important difference between this and the other line elements. Although 𝑡 is the

time coordinate and 𝜙 is the same as in spherical coordinates (the angle of the r component in

the x-y plane), 𝑟 is not the radial coordinate and 𝜃 is not the angle with the z axis. Instead they

are defined by

𝑥 = √𝑟2 + 𝑎2 sin𝜃 cos𝜙 and 𝑦 = √𝑟2 + 𝑎2 sin𝜃 sin𝜙

These equations mean that a surface of constant $r$ is an ellipsoid, and in this case $(ct, r, \theta, \phi)$ are known as Boyer-Lindquist coordinates.

$R_s = \frac{2GM}{c^2}$ is the Schwarzschild radius, but is not an event horizon unless there is no rotation.

$a = \frac{J}{Mc}$ where $J$ is the magnitude of the angular momentum, which has the unit vector $\hat{z}$.

The limiting case for a black hole is $a = \frac{R_s}{2}$ so $J_{max} = aMc = \frac{R_s}{2}Mc = \frac{2GM}{2c^2}Mc = \frac{GM^2}{c}$, the maximum possible angular momentum (because the coordinate singularity has a zero width).

$\rho^2 = r^2 + a^2\cos^2\theta$

$\Delta = r^2 + a^2 - R_s r$

For large $r$ both $\rho^2$ and $\Delta$ approximate to $r^2$ and so remote from the black hole

$$(ds)^2 = (c\,dt)^2 - (dr)^2 - r^2(d\theta)^2 - r^2\sin^2\theta\,(d\phi)^2$$

i.e. asymptotic flatness.

There is a physical singularity at 𝜌 = 0. This is a ring of coordinate radius 𝑎. If 𝑎 = 0 the

singularity is a point at 𝑟 = 0 and the black hole is a non-rotating Schwarzschild black hole.

There is a coordinate singularity at $\Delta = 0$. This represents two closed surfaces defined by

$$r_+ = \frac{R_s}{2} + \sqrt{\left(\frac{R_s}{2}\right)^2 - a^2} \quad\text{and}\quad r_- = \frac{R_s}{2} - \sqrt{\left(\frac{R_s}{2}\right)^2 - a^2}$$

for $a < \frac{R_s}{2}$; they do not exist for $a > \frac{R_s}{2}$.

These are two event horizons with the outer one being $r_+$. The two event horizons coincide at $r_+ = r_- = \frac{R_s}{2}$ if $a = \frac{R_s}{2}$ or $J_{max} = \frac{GM^2}{c}$. This is called an extreme Kerr black hole.


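If $a > \frac{R_s}{2}$, which means $J > \frac{GM^2}{c}$, then $r_+$ and $r_-$ become imaginary, which is why it is assumed that the angular momentum is limited.

If $a = 0$ then $r_+ = \frac{R_s}{2} + \sqrt{\left(\frac{R_s}{2}\right)^2} = R_s$ and $r_- = \frac{R_s}{2} - \sqrt{\left(\frac{R_s}{2}\right)^2} = 0$ and it becomes a Schwarzschild black hole.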

There is also a surface defined by $r = \frac{R_s}{2} + \sqrt{\left(\frac{R_s}{2}\right)^2 - a^2\cos^2\theta}$ called the static limit and given the symbol $s_+$. This encloses the outer event horizon but touches it at $\theta = 0$, i.e. on the rotational axis. It is a surface of infinite red shift. $s_+ = R_s$ if $a = 0$.

The volume between 𝑠+ and 𝑟+ is called the ergosphere which vanishes for a Schwarzschild

black hole.
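The relationship between the two horizons and the static limit can be explored numerically. This Python sketch uses the expressions for $r_\pm$ above together with the static limit written as $\frac{R_s}{2} + \sqrt{(R_s/2)^2 - a^2\cos^2\theta}$; the function name and the sample values are illustrative assumptions.

```python
import math


def kerr_surfaces(rs, a, theta):
    """Return (r_plus, r_minus, static_limit) for Schwarzschild radius rs,
    spin parameter a (same length units) and polar angle theta in radians."""
    half = rs / 2.0
    if a > half:
        raise ValueError("a > Rs/2: the horizons do not exist")
    root = math.sqrt(half**2 - a**2)
    r_plus = half + root
    r_minus = half - root
    static_limit = half + math.sqrt(half**2 - (a * math.cos(theta))**2)
    return r_plus, r_minus, static_limit


rs = 2.0  # work in units where Rs = 2, so the maximum spin is a = 1
for a in (0.0, 0.5, 0.9, 1.0):
    r_plus, r_minus, s_pole = kerr_surfaces(rs, a, 0.0)
    _, _, s_equator = kerr_surfaces(rs, a, math.pi / 2)
    print(f"a = {a}: r+ = {r_plus:.3f}, r- = {r_minus:.3f}, "
          f"static limit pole = {s_pole:.3f}, equator = {s_equator:.3f}")
```

The ergosphere is the gap between the static limit and $r_+$: it is widest at the equator, closes at the poles, and disappears entirely for $a = 0$.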

The metric coefficients are not functions of time so the line element is stationary.

There is however a cross term d𝑡d𝜙 which means it is not static – time cannot be reversed. This

is because this term represents the dragging of space-time around the rotating black hole – the

light cones tilt in the direction of increasing $\phi$ as well as towards the centre. Note that this term

vanishes for 𝑎 = 0, a non-rotating black hole.

At the static limit the effect is so strong that even light must travel in the direction of rotation,

even if originally it was travelling in the opposite direction. The only exception is light or matter

that enters along the polar axis (axis of rotation).

If a particle passes through the static limit with a certain amount of energy, and decays into two

particles while in the ergosphere, and one of those particles passes through the outer event

horizon, it is possible for the other particle to escape completely carrying more energy than the

original particle. This will reduce the angular momentum of the black hole. This is known as the

Penrose process. Up to 20.7% additional energy can be carried away, and if the process is continuous up to 29% of the mass of the black hole can be lost, at which point it becomes a Schwarzschild

black hole. This could be a source of high energy cosmic rays and gamma ray bursts.

The outer event horizon is a surface from which even light cannot escape.

The inner event horizon encloses a volume where the physical nature is completely unknown.

The minimum circular orbit for a Kerr black hole spinning at its maximum rate is 0.5𝑅𝑠, and this

increases to 3𝑅𝑠 for zero spin.

Far from the black hole where the dragging of space-time is very small the Schwarzschild black

hole provides a simple approximation, the main difference being the minimum size of a stable

circular orbit.

In the case of zero spin the Kerr black hole has the same properties as a Schwarzschild black

hole.


Hawking Radiation

Quantum mechanics implies that a black hole should radiate energy as if it were a black body with a temperature of

$$T = \frac{\hbar c^3}{8\pi kGM} \text{ kelvin}$$

where $\hbar$ is the reduced Planck constant, $1.055\times10^{-34}$ J s, and $k$ is Boltzmann’s constant, $1.381\times10^{-23}$ J K$^{-1}$.

The peak wavelength is just under $16R_s$ so the smaller the size the higher the frequency and

energy.

This radiation is the result of virtual particles created just outside the event horizon, one of

which passes inside of it and the other remains outside. Thus they cannot recombine and so

some energy has been extracted from the black hole.

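The rate of emission is found by treating the black hole as a black body of temperature $T$ and surface area $A = 4\pi R_s^2$:

$$\frac{dE}{dt} = A\,\frac{\pi^2k^4}{60c^2\hbar^3}\,T^4 \quad\text{(Stefan-Boltzmann Law)}$$

$$= 4\pi\left(\frac{2GM}{c^2}\right)^2\frac{\pi^2k^4}{60c^2\hbar^3}\left(\frac{\hbar c^3}{8\pi kGM}\right)^4$$

$$= \frac{\hbar c^6}{15360\pi G^2M^2}$$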

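Since the loss in energy results in a decrease in the mass, $dM = \frac{dE}{c^2}$ or $\frac{dE}{dt} = -c^2\frac{dM}{dt}$, so

$$-c^2\frac{dM}{dt} = \frac{\hbar c^6}{15360\pi G^2M^2} \quad\text{or}\quad M^2\,dM = -\frac{\hbar c^4}{15360\pi G^2}\,dt$$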

If the initial mass is $M_o$ the time for the total mass to evaporate is given by

$$\int_{M_o}^{0} M^2\,dM = \int_{0}^{t} -\frac{\hbar c^4}{15360\pi G^2}\,dt \quad\text{or}\quad t = \frac{5120\pi G^2M_o^3}{\hbar c^4}$$

A black hole of one solar mass has a temperature of 60 nK and a lifetime of about $2\times10^{67}$ years. This is much longer than the age of the universe, but the above analysis ignores the absorption of radiation from its surroundings. Assuming a cosmic microwave background temperature of 2.725 K this black hole would absorb more energy than it radiated.
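The temperature and lifetime formulas above are straightforward to evaluate. The Python sketch below uses rounded constants; the function names, and the helper that finds the mass whose Hawking temperature equals a given background temperature, are illustrative assumptions rather than part of the original notes.

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8         # speed of light, m/s (rounded)
HBAR = 1.055e-34    # reduced Planck constant, J s
K_B = 1.381e-23     # Boltzmann's constant, J/K
M_SUN = 1.989e30    # solar mass, kg (rounded)
YEAR = 3.156e7      # seconds in a year (rounded)


def hawking_temperature(mass_kg):
    """T = hbar c^3 / (8 pi k G M) in kelvin."""
    return HBAR * C**3 / (8.0 * math.pi * K_B * G * mass_kg)


def evaporation_time_s(initial_mass_kg):
    """t = 5120 pi G^2 M^3 / (hbar c^4), ignoring any absorbed radiation."""
    return 5120.0 * math.pi * G**2 * initial_mass_kg**3 / (HBAR * C**4)


def mass_for_temperature(temperature_k):
    """Mass whose Hawking temperature equals the given temperature."""
    return HBAR * C**3 / (8.0 * math.pi * K_B * G * temperature_k)


t_sun = hawking_temperature(M_SUN)
life_years = evaporation_time_s(M_SUN) / YEAR
m_balance = mass_for_temperature(2.725)

print(f"T for one solar mass: {t_sun * 1e9:.1f} nK")
print(f"evaporation time for one solar mass: {life_years:.2e} years")
print(f"mass in balance with a 2.725 K background: {m_balance:.2e} kg")
```

Because the lifetime scales as $M^3$, halving the mass shortens the evaporation time by a factor of eight.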

The absorbed and emitted radiation balance for a black hole with roughly the mass of the Moon, i.e. a radius of about 67 μm, and only smaller black holes decrease in mass.

Thus very small black holes are required to test the theory of Hawking radiation.


Geodesic Gyroscope Rotation

In flat space-time a rotating gyroscope will maintain its angular momentum and hence direction

of its axis if moved.

In curved space-time the centre of mass of a free falling gyroscope moves along a geodesic and

the angular momentum vector is transported along the same geodesic so its direction will

change.

If the gyroscope orbits a Schwarzschild mass with its angular momentum vector aligned with the radial vector r initially, after one orbit its vector will still be in the plane of the orbit but at an angle $\alpha$ to r where

$$\alpha = 2\pi\left(1 - \sqrt{1 - \frac{3GM}{c^2r}}\right) \cong 2\pi\left(1 - 1 + \frac{3GM}{2c^2r}\right) \cong \frac{3\pi GM}{c^2r}$$

which for an orbit at the Earth’s surface accumulates to 8.44 arc seconds per year.
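The annual figure comes from multiplying the precession per orbit by the number of orbits in a year. This Python sketch does that for a circular orbit at the Earth's surface and at an altitude of 642 km; the Newtonian orbital period, the rounded constants and the function names are assumptions made for illustration.

```python
import math

GM_EARTH = 3.986e14   # Earth's gravitational parameter, m^3 s^-2 (rounded)
R_EARTH = 6.371e6     # mean Earth radius, m (rounded)
C = 2.998e8           # speed of light, m/s (rounded)
YEAR = 3.156e7        # seconds in a year (rounded)


def precession_per_orbit_rad(r):
    """alpha = 2 pi (1 - sqrt(1 - 3GM/(c^2 r))) for a circular orbit of radius r."""
    return 2.0 * math.pi * (1.0 - math.sqrt(1.0 - 3.0 * GM_EARTH / (C**2 * r)))


def orbital_period_s(r):
    """Newtonian period of a circular orbit of radius r."""
    return 2.0 * math.pi * math.sqrt(r**3 / GM_EARTH)


def precession_arcsec_per_year(r):
    orbits_per_year = YEAR / orbital_period_s(r)
    return math.degrees(precession_per_orbit_rad(r) * orbits_per_year) * 3600.0


at_surface = precession_arcsec_per_year(R_EARTH)
at_642_km = precession_arcsec_per_year(R_EARTH + 642.0e3)
print(f"orbit at the surface: {at_surface:.2f} arc seconds per year")
print(f"orbit at 642 km altitude: {at_642_km:.2f} arc seconds per year")
```

The higher orbit gives a smaller value both because each orbit precesses less and because there are fewer orbits per year.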

This however ignores the fact that the rotating Earth is dragging space-time in its direction of

rotation and this will cause a very small eastward precession called the Lense-Thirring effect.

Gravity Probe B was launched in 2004 to measure both effects. It gave −6.6018 arc seconds per year for the geodesic precession and, after several years of analysis because the signal was hidden in noise, −0.0372 arc seconds per year for frame dragging, to be compared with predictions of −6.6061 and −0.0392. The geodesic precession is smaller than the value quoted above because the satellite orbits at a distance of $r = R + 642$ km instead of at the surface, and this decreases the value.


Stellar Black Holes

No stellar black hole has been observed by electromagnetic radiation; the evidence for them is a lack of any other explanation and observations of the merger of two black holes from gravitational waves.

A star that does not explode at the end of its life will collapse. Stars of up to 10 sun masses form white dwarfs of up to 1.4 sun masses; more massive stars, between 10 and 30 sun masses, form neutron stars of up to 3 sun masses. More massive stars (over 30 sun masses) are believed to form

stellar black holes. These massive stars explode in a supernova which ejects much of the mass

leaving just the core. A black hole created by collapse will retain the angular momentum of the

core of the star so will be a Kerr black hole. Stellar black holes also occur in binary stars, but the

mechanism is not understood because the ejection of material and loss of mass should mean

that the companion is ejected from the system. Some other mechanism must be responsible for

a significant loss of mass.

The most likely black hole is Cygnus X-1 which emits intense X-rays that vary over timescales of

milliseconds indicating its size is less than 300 km. The X-rays are believed to come from the

inner edge of an accretion disc, the material coming from a blue supergiant which orbits it with

a 5.6 day period. If the orbital plane is aligned with Earth the mass of the black hole is 4.8 sun

masses, but if not aligned it must be greater than this up to 13 sun masses.

There are many known X-ray emitting binary stars. Those with a mass below 2 sun masses have

X-ray patterns that indicate material that falls in continues to radiate energy. Those with larger

masses have patterns that indicate that the radiation from in-falling material stops radiating

fairly quickly which can be explained by the event horizon. An example is V404 Cygni which has

flashes as short as 1/40th second.

Gravitational waves from merging stellar black holes have been detected. Three detections have high confidence levels: GW150914 has a mass of 62 sun masses at a distance of 1.4 Gly, GW170104 a mass of 49 sun masses at a distance of 3.0 Gly and GW151226 a mass of 21 sun masses at a distance of 1.4 Gly. The angular momentum of the two black holes is conserved in the merged black hole, but if the spins of both and their orbital rotation are all in the same direction the spin of the resulting black hole could exceed the Kerr limit ($J_{max} = \frac{GM^2}{c}$), which means that rotational energy must be lost in the form of gravitational waves. The Kerr limit occurs because if the angular momentum exceeds this value the radius of the event horizon is an imaginary number.

A star N6946-BH1 observed in 2007 and believed to be about 25 sun masses had disappeared in

2015 with just a faint glow remaining - a possibility is that the star collapsed into a stellar black

hole and the glow comes from an accretion disc.


Supermassive Black Holes

These have not been seen, but there is a lot of evidence for them at the centre of most normal

sized and large galaxies.

There is considerable evidence from the movement of stars near the centre of the Milky Way

that there is a black hole with a mass of about 4,100,000 sun masses which is known as Sagittarius A*. This would have a Schwarzschild radius of about 12,000,000 km. An approximation is $R_s \cong 2M_{BH}$ AU where $M_{BH}$ is measured in units of $10^8$ sun masses.

There is also evidence for an object of 4,000,000 sun masses at the centre of NGC4258.

The shape of a spectral line due to ionised iron atoms in MCG-6-30-15 is best explained by the

source being the inner edge of an accretion disc surrounding a black hole. Similar studies on

other objects indicate that the faster the rotation of the central object the smaller the radius of

the smallest circular orbit which corresponds to the theory of Kerr black holes.

Black holes, or the radiation given off by matter falling into them, are also the best explanation for

the huge amounts of energy emitted by quasars. The efficiency with which potential energy is

converted into radiation energy is very high for black holes – 5.7% for a Schwarzschild black

hole and up to 32% for a rapidly rotating Kerr black hole.

Another example of evidence for black holes is gravitational lensing – two quasars Q0957+561

A and B are so close together that they have the same number, and also appear to be identical. It

is believed that they are the same object but an intervening galaxy has bent the light so there

appear to be two objects rather than one. The galaxy has been identified, and it would require a

central supermassive black hole to have this effect. Many other examples are known. As well as

producing multiple images such as the Einstein Cross and curved elongated images which when

connected form an Einstein Ring, the apparent brightness can be significantly increased.


Schwarzschild Density

The average density of a black hole is given by $\frac{M}{\frac{4}{3}\pi r^3} = \frac{3M}{4\pi r^3}$ where $r$ is the radius of the black hole.

It is only the mass within the Schwarzschild radius that is relevant to the black hole’s properties,

but the physical size of the “body” may be much less than that, and the black hole could consist

of multiple bodies.

Making the assumption that the size of the body is the same as a sphere whose radius is the Schwarzschild radius $R_s = \frac{2GM}{c^2}$ gives the Schwarzschild or minimum average density as

$$\frac{3M}{4\pi\left(\frac{2GM}{c^2}\right)^3} = \frac{3c^6}{32\pi G^3M^2} = \frac{7.274\times10^{79}}{M^2} \text{ kg m}^{-3}$$

with $M$ in kg, which can also be written as

$$\frac{3c^2}{8\pi GR_s^2} = \frac{1.608\times10^{26}}{R_s^2} \text{ kg m}^{-3}$$

where $R_s$ is in metres. However, since the mass or radius is not known to any precision in most cases, an approximation is $2\times10^{19}\left(\frac{M_\odot}{M}\right)^2$ kg m$^{-3}$ where $\frac{M_\odot}{M}$ is the ratio of the mass of the Sun to that of the object.
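The inverse-square dependence on mass is easy to see by evaluating $3c^6/(32\pi G^3M^2)$ for a range of masses. In this Python sketch the constants are rounded and the function name and sample masses are illustrative assumptions.

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8         # speed of light, m/s (rounded)
M_SUN = 1.989e30    # solar mass, kg (rounded)


def schwarzschild_density(mass_kg):
    """Mean density 3c^6 / (32 pi G^3 M^2) of mass M spread over its Schwarzschild sphere."""
    return 3.0 * C**6 / (32.0 * math.pi * G**3 * mass_kg**2)


for solar_masses in (1.0, 10.0, 4.1e6, 1.0e10):
    rho = schwarzschild_density(solar_masses * M_SUN)
    print(f"{solar_masses:g} solar masses: {rho:.2e} kg m^-3")
```

Doubling the mass reduces the Schwarzschild density by a factor of four, so the most massive black holes have mean densities comparable to everyday materials.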

Based on this assumption the Schwarzschild density is inversely proportional to the square of

the mass so black holes with small masses have a much larger density than those with large

masses.

For a mass equal to that of the sun the value is about $2\times10^{19}$ kg m$^{-3}$, reducing to below $2\times10^{19}$ kg m$^{-3}$ for a stellar black hole. The value for Sagittarius A* is $10^6$ kg m$^{-3}$ and for a large galaxy less than 1 kg m$^{-3}$, which is about the same as the density of air at sea level. The universe has a value of about $9.5\times10^{-27}$ kg m$^{-3}$. If the universe is flat it has a critical density of $8.5\times10^{-27}$ kg m$^{-3}$ so is not a black hole, but the actual value is unknown because it is proportional to the square of Hubble’s constant whose value is somewhere between 65 and 75 km s$^{-1}$ Mpc$^{-1}$.

The universe is also expanding and its density is decreasing while black holes consist of matter

that is contracting and its density is increasing. The physical size of the “body” in a black hole is

assumed to be much less than the Schwarzschild sphere and so its density is much greater than

the Schwarzschild density and increases towards the centre.

As a body is contracting and its density increasing a black hole will only form if and when its

density inside the Schwarzschild radius exceeds the Schwarzschild density. Small black holes

are unlikely to form due to the very high Schwarzschild density. A neutron star some 10 or 20 km in diameter has a physical density of about 3.7 to $5.9\times10^{17}$ kg m$^{-3}$, varying from $10^9$ kg m$^{-3}$ at the crust to $8\times10^{18}$ kg m$^{-3}$ at the centre, about the same as an atom’s nucleus, but the Schwarzschild radius is about half its radius. The Schwarzschild density is about $10^{19}$ kg m$^{-3}$, which is why it cannot form a black hole. It consists entirely of neutrons held apart by atomic forces so exceeding this physical density seems impossible without some new physics, and hence the minimum mass of a black hole or the maximum mass of a neutron star is about 2.16 sun masses.


Black Hole Accretion

Material falling into a black hole tends to form a disc around it called an accretion disc. Since the smallest stable circular orbit of a Schwarzschild black hole is $3R_s$ this must be the inner edge. The specific angular momentum (per unit rest mass) is $\dfrac{J}{mc} = \sqrt{3}\,R_s$. In practice the Kerr model is more appropriate; in this case the smallest circular orbit has a radius of $(5\pm4)\dfrac{GM}{c^2}$, the + applying to retrograde orbits and the − to prograde orbits, assuming the maximum spin rate.

A particle spiralling in from infinity will lose mass energy which is converted into radiation energy. In the case of a Schwarzschild black hole this has an efficiency of $\eta = 1 - \dfrac{E}{mc^2} = 1 - \sqrt{\dfrac{8}{9}} = 5.72\%$ (compared to 0.1% for uranium fission). For a prograde orbit around a Kerr black hole with maximum spin ($J = \dfrac{GM^2}{c}$) the equivalent is $\eta = 1 - \sqrt{\dfrac{1}{3}} = 42\%$ (retrograde orbits are unlikely).
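As a quick numerical check of the two efficiencies quoted above, the following Python sketch evaluates $1 - E/mc^2$ for both cases. The function name `radiative_efficiency` is illustrative only and does not come from the text.

```python
import math


def radiative_efficiency(energy_fraction: float) -> float:
    """Return eta = 1 - E/(m c^2), where energy_fraction is E/(m c^2)
    for a particle on the innermost circular orbit."""
    return 1.0 - energy_fraction


# Values of E/(m c^2) quoted in the text for the innermost orbit
eta_schwarzschild = radiative_efficiency(math.sqrt(8.0 / 9.0))
eta_kerr_prograde = radiative_efficiency(math.sqrt(1.0 / 3.0))

print(f"Schwarzschild black hole: {eta_schwarzschild:.2%}")
print(f"Maximal Kerr, prograde orbit: {eta_kerr_prograde:.2%}")
```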

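Assuming the accretion disc has a luminosity $L$, the rate of energy output is given by $L = \dfrac{\mathrm{d}E}{\mathrm{d}t}$. The energy is in the form of photons with momenta $p = \dfrac{E}{c}$ so the momentum flux is $\dfrac{\mathrm{d}p}{\mathrm{d}t} = \dfrac{1}{c}\dfrac{\mathrm{d}E}{\mathrm{d}t} = \dfrac{L}{c}$.

This will cause a pressure $P$ at a distance $r$ of $P = \dfrac{L}{4\pi r^2 c}$.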

This creates an outward force on infalling electrons (assuming the infalling material is a plasma) of $F = \sigma_t P$ where $\sigma_t$ is the Thomson cross-section, $\sigma_t = \dfrac{q^4}{6\pi\epsilon_0^2 c^4 m_e^2}$. Because $\sigma_t$ varies as the inverse square of the particle mass, the value is very much smaller for the nuclei (protons and possibly neutrons) and can be ignored.

Assuming the infalling material is ionised hydrogen, the inward force is $\dfrac{GM_{BH}m_p}{r^2}$. The force on the electrons is very much smaller and can be ignored. Since the electrostatic forces will prevent the plasma separating into positive and negative clouds, these two forces will balance at the inner edge of the accretion disc so $\dfrac{\sigma_t L}{4\pi r^2 c} = \dfrac{GM_{BH}m_p}{r^2}$, which on rearranging gives the Eddington luminosity $L_E$ (the maximum luminosity, so also called the Eddington limit) of $L_E = \dfrac{4\pi cGMm_p}{\sigma_t}$ (this assumes a spherical object so is only an approximation for an accretion disc).
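The Eddington limit can be evaluated directly from the two formulas in this passage. The sketch below is a minimal illustration: the rounded physical constants and the function names are assumptions added here, not values taken from the text.

```python
import math

# Rounded SI constants (assumed values, not from the text)
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m s^-1
Q = 1.602e-19        # elementary charge, C
EPS0 = 8.854e-12     # vacuum permittivity, F m^-1
M_E = 9.109e-31      # electron mass, kg
M_P = 1.6726e-27     # proton mass, kg
M_SUN = 1.989e30     # solar mass, kg


def thomson_cross_section() -> float:
    """sigma_t = q^4 / (6 pi eps0^2 c^4 m_e^2), in m^2."""
    return Q**4 / (6.0 * math.pi * EPS0**2 * C**4 * M_E**2)


def eddington_luminosity(mass_kg: float) -> float:
    """L_E = 4 pi c G M m_p / sigma_t, in watts."""
    return 4.0 * math.pi * C * G * mass_kg * M_P / thomson_cross_section()


print(f"sigma_t = {thomson_cross_section():.3e} m^2")
print(f"L_E for one sun mass = {eddington_luminosity(M_SUN):.3e} W")
```

Because $L_E$ is linear in $M$, the value for any other mass follows by simple scaling.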

If the Eddington limit is exceeded the outward radiation pressure exceeds the inwards force and

accretion will cease or in the case of a star the surface layer will be expelled.

If the photons interact with the nuclei to create an electron-positron plasma the 𝑚𝑝 must be

replaced by 𝑚𝑒 , but there will be as many positrons as electrons so the outward force will be

doubled and the inward force significantly reduced and 𝐿𝐸 reduced by a factor of 918.

If the black hole radiated all its mass at the Eddington limit this would take $t_E = \dfrac{Mc^2}{L_E} = \dfrac{\sigma_t c}{4\pi Gm_p} \cong 4\times10^8$ years.


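The luminosity comes from the accretion disc whose mass is $M_{AD}$.

$L = \eta\dfrac{\mathrm{d}E}{\mathrm{d}t} = \eta c^2\dfrac{\mathrm{d}M_{AD}}{\mathrm{d}t}$ where $\eta$ is the efficiency. The remaining mass falls into the black hole, which therefore grows at a rate $\dfrac{\mathrm{d}M_{BH}}{\mathrm{d}t}$ given by $L_E = \dfrac{\eta}{1-\eta}\dfrac{\mathrm{d}M_{BH}}{\mathrm{d}t}c^2$ so

$\dfrac{\mathrm{d}M_{BH}}{\mathrm{d}t} = \dfrac{1-\eta}{\eta}\dfrac{M_{BH}}{t_E}$

$M_{BH} = a\exp\left(\dfrac{(1-\eta)t}{\eta t_E}\right)$

This is restricted to Schwarzschild black holes.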
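To get a feel for the growth law above, the sketch below computes $t_E$ and the time for the mass to grow by a factor $e$ when accreting at the Eddington limit. The rounded constants, the year length and the function names are assumptions made for illustration; $a$ is treated as the mass at $t = 0$.

```python
import math

# Rounded SI constants (assumed values, not from the text)
G = 6.674e-11            # m^3 kg^-1 s^-2
C = 2.998e8              # m s^-1
M_P = 1.6726e-27         # proton mass, kg
SIGMA_T = 6.652e-29      # Thomson cross-section, m^2
SECONDS_PER_YEAR = 3.156e7


def eddington_time_years() -> float:
    """t_E = sigma_t c / (4 pi G m_p), converted to years."""
    return SIGMA_T * C / (4.0 * math.pi * G * M_P) / SECONDS_PER_YEAR


def e_folding_time_years(eta: float) -> float:
    """Time for M_BH to grow by a factor e: eta * t_E / (1 - eta)."""
    return eta * eddington_time_years() / (1.0 - eta)


def black_hole_mass(initial_mass: float, t_years: float, eta: float) -> float:
    """M_BH(t) = a exp((1 - eta) t / (eta t_E)), with a = initial_mass."""
    return initial_mass * math.exp((1.0 - eta) * t_years / (eta * eddington_time_years()))


ETA_SCHWARZSCHILD = 1.0 - math.sqrt(8.0 / 9.0)
print(f"t_E = {eddington_time_years():.2e} years")
print(f"e-folding time = {e_folding_time_years(ETA_SCHWARZSCHILD):.2e} years")
```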

The temperature varies along the radius. If the disc is considered as a series of rings, the innermost rings have the highest temperature with a peak of $T \cong \dfrac{1}{2}\left(\dfrac{3GM\dot{M}}{8\pi\sigma R_I^3}\right)^{1/4}$ kelvin where $M$ is the mass of the black hole, $\dot{M}$ is the accretion rate and $R_I$ is the inner radius, and decreases as $\left(\dfrac{3GM\dot{M}}{8\pi\sigma r^3}\right)^{1/4}$. For a supermassive black hole of $10^8$ sun masses and an accretion rate of 1 sun mass per year the peak is 300,000 K. The overall spectrum from the whole disc is similar to that of a black body, having similar slopes at the low and high frequencies, but instead of a relatively sharp peak there is a central plateau with a gently increasing slope of $f^{1/3}$. The relative width of this plateau gives an indication of the width of the disc. In practice the disc is too narrow for this to be observed.

The actual luminosity can be found by assuming that the lost potential energy is radiated. $\Delta E_{PE} = \dfrac{GMm}{r}$ for a mass $m$, and if the inflow rate is $\dot{M}$, $m = \dot{M}\,\mathrm{d}t$ in the time period $\mathrm{d}t$.

The luminosity is $L = \dfrac{\Delta E_{PE}}{\mathrm{d}t} = \dfrac{GM\dot{M}\,\mathrm{d}t}{r\,\mathrm{d}t} = \dfrac{GM\dot{M}}{r}$

The mass $m$ can be assumed to be in a circular orbit at radius $r$. The centripetal force $\dfrac{mv^2}{r}$ is provided by gravity so $\dfrac{mv^2}{r} = \dfrac{GMm}{r^2}$ and $v = \sqrt{\dfrac{GM}{r}}$. Thus the mass has a kinetic energy $E_{KE} = \dfrac{1}{2}\dfrac{GMm}{r}$.

Since its loss of potential energy was $\dfrac{GMm}{r}$, half the potential energy has been converted into kinetic energy leaving only half to be radiated so $L = \dfrac{1}{2}\dfrac{GM\dot{M}}{r}$ for a black hole (the falling mass will enter the black hole with a high kinetic energy).

For a white dwarf or neutron star all the potential energy must be radiated for the matter to reach the surface so $L = \dfrac{GM\dot{M}}{r}$.

This is a simple example of the virial theorem for a stable system of particles, which relates their time-averaged kinetic energy and potential energy by $E_{KE} = \dfrac{n}{2}E_{PE}$ where $E_{PE} = ar^n$ and $a$ is a constant. In the case of gravity $a = -GMm$ and $n = -1$, the minus sign representing an attractive force, so $E_{KE} = -\dfrac{1}{2}E_{PE}$.

The inner radius $R_I$ will be in the range $0.5R_s \le R_I \le 3R_s$ depending on the black hole's spin rate, so with $R_s = \dfrac{2GM}{c^2}$ the luminosity lies in the range $\dfrac{1}{12}\dot{M}c^2 \le L \le \dfrac{1}{2}\dot{M}c^2$.
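The two limits follow from substituting the extreme inner radii into $L = \frac{1}{2}GM\dot{M}/r$. The sketch below does this numerically for an accretion rate of one sun mass per year; the rounded constants, the example black hole mass and the function names are assumptions for illustration.

```python
# Rounded SI constants (assumed values, not from the text)
G = 6.674e-11              # m^3 kg^-1 s^-2
C = 2.998e8                # m s^-1
M_SUN = 1.989e30           # kg
L_SUN = 3.828e26           # W
SECONDS_PER_YEAR = 3.156e7


def schwarzschild_radius(mass_kg: float) -> float:
    """R_s = 2 G M / c^2, in metres."""
    return 2.0 * G * mass_kg / C**2


def disc_luminosity(mass_kg: float, mdot_kg_s: float, inner_radius_m: float) -> float:
    """L = G M Mdot / (2 R_I): half the released potential energy is radiated."""
    return 0.5 * G * mass_kg * mdot_kg_s / inner_radius_m


mass = 1.0e8 * M_SUN                      # example black hole mass (hypothetical)
mdot = M_SUN / SECONDS_PER_YEAR           # one sun mass per year, in kg/s
r_s = schwarzschild_radius(mass)

for factor in (3.0, 0.5):                 # R_I = 3 R_s and R_I = 0.5 R_s
    lum = disc_luminosity(mass, mdot, factor * r_s)
    print(f"R_I = {factor} R_s: L = {lum:.2e} W = {lum / L_SUN:.1e} L_sun "
          f"= {lum / (mdot * C**2):.4f} Mdot c^2")
```

The result is independent of the black hole mass because $R_I$ scales with $M$.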


A sun mass per year is $6.31\times10^{22}$ kg s$^{-1}$ and generates $10^{11}$ times the luminosity of the Sun. A galaxy may have some $10^{10}$ stars so this would exceed the output from all the stars. A typical quasar must consume about 1.8 sun masses per year while Sgr A* at the centre of the Milky Way consumes about $2.3\times10^{-9}$ sun masses per year. A supermassive black hole does not itself emit any radiation, but its accretion disc can make it one of the most luminous objects in the universe. If 3C 273 were 33 light years away it would be as bright as the Sun.

If the infalling matter is optically dense the trapped radiation is dragged down creating super

Eddington accretion, also called advection. There is then a strong gas outflow along the

rotational axis which may be relativistic and generate additional radiation.

At the other extreme low density infalling gas may have a very low value for 𝜂, also resulting in

advection – this may be the case for Sgr A* at the centre of the Milky Way.

Supermassive black holes are believed to be at the centre of most if not all galaxies. There are

two types – inactive and active black holes, the latter being those undergoing accretion. The

active supermassive black holes are known as Active Galactic Nuclei or AGNs. Seyfert galaxies

have a very luminous centre compared to the stars while quasars are so distant that only a point

of redshifted light can be seen and the galaxy is too dim. The name comes from when they were

thought to be point-like radio sources in our galaxy.

There are two types of AGN – they are believed to be the same type of object, but seen from

different angles. The black hole is surrounded by an active accretion disc and the material falling

into this emits broad spectral lines due to the high angular velocities. This is surrounded by a

torus of gas and dust several parsecs further out, outside of which is orbiting material which

having much lower angular velocities emits narrow spectral lines. If the torus obscures the

accretion disc only the narrow line emission is seen – these are known as Seyfert 2 galaxies or

type 2 quasars. If the accretion disc is visible both narrow and broad emissions are seen and

these are Seyfert 1 galaxies or type 1 quasars. A relativistic jet of material may be emitted from

each rotational pole, and these end in lobes which emit intense radio waves. The rest of the jet

emits radiation whose frequency distribution is very different to that of a black body and ranges

up into gamma rays. The energy output can be very variable. If the jet points towards the Earth

it is called a blazar. If not it is called a radio galaxy.

Quasars whose spectra have been analysed have redshifts ranging from 0.056 to 7. Many have luminosities that vary over very short time periods, indicating that they can only be a few light weeks in size. The size of a quasar refers to the radiating region which can be up to $10^4R_s$. The emission lines are very broad and can be used to estimate the mass from $\dfrac{GM_{vir}}{R_{vir}} \cong (\Delta v)^2$ where $\Delta v$ is the velocity dispersion (1 standard deviation). The approximation is based on the virial theorem, a value of 200 km s$^{-1}$ often being used, and the symbols $M_{200}$ and $R_{200}$ refer to this. $\Delta v$ can be obtained from the width of the emission lines. $R_{vir}$ is estimated from the rate at which the flux varies (the light travel time across the region), although care has to be taken that this is not due to the change in alignment of an emission beam. For example a variation of 30 days and a velocity dispersion of 600 km s$^{-1}$ gives $\dfrac{(600\times10^3)^2\times30\times86400\times3\times10^8}{6.673\times10^{-11}} = 4.2\times10^{36}$ kg, about $2\times10^6$ sun masses.
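The worked example can be reproduced in a few lines. Evaluating the expression numerically gives about $4.2\times10^{36}$ kg, consistent with the quoted figure of about $2\times10^6$ sun masses. The sketch assumes, as the example does, that $R_{vir}$ is the distance light travels during the variability period; the function name and the solar mass value are additions for illustration.

```python
G = 6.673e-11        # m^3 kg^-1 s^-2, the value used in the worked example
C = 3.0e8            # m s^-1, the value used in the worked example
M_SUN = 1.989e30     # kg (assumed value, not from the text)
SECONDS_PER_DAY = 86400.0


def virial_mass(velocity_dispersion_m_s: float, variability_days: float) -> float:
    """M_vir ~ (dv)^2 R_vir / G with R_vir = c * (variability period)."""
    r_vir = C * variability_days * SECONDS_PER_DAY
    return velocity_dispersion_m_s**2 * r_vir / G


mass = virial_mass(600e3, 30.0)
print(f"M_vir = {mass:.2e} kg = {mass / M_SUN:.2e} sun masses")
```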

The radiating region can be divided into three: an accretion disc, an inner broad-line region where the velocity dispersion is 5000 km s$^{-1}$, and an outer narrow-line region where it is 500 km s$^{-1}$. The broad-line and narrow-line regions contain clouds of cosmic plasma. The main radiation comes from the accretion disc, but it and the broad-line region may be obscured by a torus of dust which emits infrared radiation, having been heated by the radiation from the centre.


Accretion Discs

An accretion disc has a large radius but is relatively thin (rather like a long playing record). Cylindrical coordinates $(r, \phi, z)$ are used where $r$ is the distance from the $z$ axis, $\phi$ is the angle between the projection of the position vector onto the $xy$ plane and the $x$ axis, and $z$ is the height above the $xy$ plane, the origin being at the centre of the black hole.

The following assumptions are made:

• The disc consists of particles (typically hydrogen atoms or ions) in the form of a gas or plasma.

• The particles have circular Keplerian orbits, but spiral inwards very slowly.

• The disc as a whole is in a steady state: material lost from the inner edge is exactly balanced by new material accreting on the outer edge, this being $\dot{M}$.

• The vertical density is constant across the thickness $H$ of the disc so the $z$ axis is not used and r lies in the $x, y$ plane.

• The disc consists of thin rings of material which interact (probably due to magnetohydrodynamic turbulence caused by magnetic fields). This manifests itself as kinematic viscosity $\upsilon$ where $\upsilon = \alpha Hc_s$, $0 < \alpha < 1$ and $c_s$ is the speed of sound. This is called $\alpha$ viscosity and the discs are called $\alpha$ discs or Shakura-Sunyaev discs.

• The mass of the disc $M_{disc}$ can normally be ignored and hence also its gravity.

Consider three thin adjacent rings. The rotational speed $\omega = \sqrt{\dfrac{GM}{r^3}}$ is that for a circular Keplerian orbit so is slightly greater for the inner ring and slightly smaller for the outer ring. Thus the middle ring is simultaneously having its speed increased by the inner ring and decreased by the outer ring (differential rotation), but with the net effect of having its speed increased. Since its speed cannot increase this energy is converted into heat (increasing its speed increases its orbital radius so it interacts with the outer ring, resulting in collisions slowing it down).

The force of one ring on the other is the product of the surface area of contact between the two rings (2𝜋𝑟𝐻) and the viscous stress 𝜎 and so the torque (product of force and distance)

𝜏 = −2𝜋𝑟𝐻𝜎𝑟. The outer ring exerts positive torque on the inner ring, the inner ring negative

torque on the outer ring.

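The shear stress of a fluid is given by $\sigma = -\upsilon\rho\dfrac{\mathrm{d}V}{\mathrm{d}r}$ where $\upsilon$ is the kinematic viscosity, $\rho$ the density of the fluid and $\dfrac{\mathrm{d}V}{\mathrm{d}r}$ is the rate of change of velocity (perpendicular to the radius but in the plane of the disc) with distance. For a circular orbit $V = r\omega$ and so $\sigma = -\upsilon\rho r\dfrac{\mathrm{d}\omega}{\mathrm{d}r}$. Note $\upsilon\rho$ is the dynamic viscosity which is normally used when $\rho$ is constant. Here $\rho$ varies with $r$ (and time).

Since $\omega = \sqrt{\dfrac{GM}{r^3}}$,

$\dfrac{\mathrm{d}\omega}{\mathrm{d}r} = \sqrt{GM}\left(-\dfrac{3}{2}r^{-5/2}\right)$

This gives the torque

$\tau = -2\pi rH\upsilon\left(-\rho r\sqrt{GM}\left(-\dfrac{3}{2}r^{-5/2}\right)\right)r = -3\pi H\rho\upsilon\sqrt{GMr}$

$H\rho$ is sometimes called the surface density and given the symbol $\Sigma(r) = \int_{-\infty}^{\infty}\rho(r,z)\,\mathrm{d}z$.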


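This mechanism will convert mechanical energy into heat at the viscous dissipation rate per unit area $D$.

$D(r) = \dfrac{\tau\Delta\omega}{A}$ where $\Delta\omega$ is the difference in angular speed between the inner and outer edges of a ring and $A$ is the external (radiating) area. $\Delta\omega \cong \Delta r\dfrac{\mathrm{d}\omega}{\mathrm{d}r}$ where $\Delta r$ is the width of the ring.

The ring has two external surfaces (top and bottom) each of area $2\pi r\Delta r$.

The dissipation rate per unit area is

$D(r) = \dfrac{\tau\Delta\omega}{2\times2\pi r\Delta r} = \dfrac{\tau}{4\pi r}\dfrac{\mathrm{d}\omega}{\mathrm{d}r} = \dfrac{-3\pi H\rho\upsilon\sqrt{GMr}}{4\pi r}\sqrt{GM}\left(-\dfrac{3}{2}r^{-5/2}\right) = \dfrac{9H\rho\upsilon GM}{8r^3}$ for Keplerian rotation.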

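A small mass is flowing into the ring along its outer circumference and out of the ring along its inner circumference. This is the local mass accretion rate

$\dot{m} = \dfrac{\partial m}{\partial t} = 2\pi r(-\dot{r})H\rho$ where $\dfrac{\partial m}{\partial t}$ is positive for an inward flow rate and $\dot{r} = \dfrac{\mathrm{d}r}{\mathrm{d}t}$ is the outward radial velocity, hence the minus sign for inward flow.

The net change in the mass of the ring is the difference between the inward and outward flows; noting that $\dfrac{\partial m}{\partial t}$ is a function of $r$, the rate of change of mass with time is $-2\pi\Delta r\dfrac{\partial(rH\rho\dot{r})}{\partial r}$.

The mass of the ring is $2\pi r\Delta rH\rho$ so its rate of change is $\dfrac{\partial(2\pi r\Delta rH\rho)}{\partial t} = 2\pi r\Delta r\dfrac{\partial(H\rho)}{\partial t}$ so

$2\pi r\Delta r\dfrac{\partial(H\rho)}{\partial t} = -2\pi\Delta r\dfrac{\partial(rH\rho\dot{r})}{\partial r}$ or $r\dfrac{\partial(H\rho)}{\partial t} + \dfrac{\partial(rH\rho\dot{r})}{\partial r} = 0$

But for steady state $\dfrac{\partial}{\partial t} = 0$, leaving $\dfrac{\mathrm{d}(rH\rho\dot{r})}{\mathrm{d}r} = 0$, or $rH\rho\dot{r}$ is constant at all values of $r$.

The rate at which mass flows from one ring to another is $\dot{m} = -2\pi rH\rho\dot{r}$, which is the same for all $r$ and also equal to the rate at which the mass of the black hole increases, i.e. $\dot{M}$:

$\dot{M} = -2\pi rH\rho\dot{r}$ or $rH\rho\dot{r} = -\dfrac{\dot{M}}{2\pi}$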

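This can be repeated for angular momentum by replacing $H\rho$ with $H\rho r^2\omega$, which is the angular momentum per unit surface area. But because the net torque $\Delta r\dfrac{\partial\tau}{\partial r}$ increases angular momentum this adds an extra term to give, under steady state conditions,

$\dfrac{\mathrm{d}(H\rho\dot{r}r^3\omega)}{\mathrm{d}r} = \dfrac{1}{2\pi}\dfrac{\mathrm{d}\tau}{\mathrm{d}r}$ whose solution is

$H\rho\dot{r}r^3\omega = \dfrac{\tau}{2\pi} + \dfrac{C}{2\pi}$ where $C$ is an arbitrary constant

$-\dfrac{\dot{M}}{2\pi}r^2\omega = \dfrac{\tau}{2\pi} + \dfrac{C}{2\pi}$ (using $rH\rho\dot{r} = -\dfrac{\dot{M}}{2\pi}$)

$-\dot{M}r^2\sqrt{\dfrac{GM}{r^3}} = \tau + C$ (using $\omega = \sqrt{\dfrac{GM}{r^3}}$)

$-\dot{M}\sqrt{GMr} = \tau + C$

Note that this is independent of the kinematic viscosity.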


At the inner edge of the disc, the innermost ring does not have a more inner ring to increase its speed so it will be slowed by the ring next to it. The speed of this ring will therefore be less than expected. The rotation rate increases rapidly going out from the inner ring to a maximum, then decreases as expected. This inner part of the disc is called the boundary layer. If the innermost ring is at $r = R_I$ and the outermost part of the boundary layer is at $r = R_I + R_b$ then the rate of change of angular speed is zero at $R_I + R_b$, and so the torque is zero at $R_I + R_b$.

$-\dot{M}\sqrt{GM(R_I + R_b)} = C$ (from $-\dot{M}\sqrt{GMr} = \tau + C$ with $\tau = 0$)

$-\dot{M}\sqrt{GMR_I} = C$ (since $R_b \ll R_I$)

$\tau = \dot{M}\sqrt{GMR_I} - \dot{M}\sqrt{GMr}$ (substituting $C$ into $-\dot{M}\sqrt{GMr} = \tau + C$)

$-3\pi H\rho\upsilon\sqrt{GMr} = \dot{M}\sqrt{GMR_I} - \dot{M}\sqrt{GMr}$ (using $\tau = -3\pi H\rho\upsilon\sqrt{GMr}$)

$3\pi H\rho\upsilon\sqrt{r} = \dot{M}\sqrt{r} - \dot{M}\sqrt{R_I}$

$H\rho\upsilon = \dfrac{\dot{M}}{3\pi}\left(1 - \left(\dfrac{R_I}{r}\right)^{1/2}\right)$

This means $H\rho\upsilon$ increases with $r$ but from $r \cong 100R_I$ the value is nearly constant and equal to $\dfrac{\dot{M}}{3\pi}$. This means the surface density $H\rho \propto \dfrac{1}{\upsilon}$ and since $\upsilon$ is a function of temperature these values ($H$, $\rho$, $T$) must somehow adjust for the mass flow to remain constant.

The viscous dissipation rate per unit area (the conversion of mechanical energy into heat) is therefore, using $D(r) = \dfrac{9H\rho\upsilon GM}{8r^3}$,

$D(r) = \dfrac{3GM\dot{M}}{8\pi r^3}\left(1 - \left(\dfrac{R_I}{r}\right)^{1/2}\right)$

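Assuming this energy is emitted as radiation at the same rate from both external surfaces of a ring whose inner and outer radii are $r_1$ and $r_2$, the luminosity is

$L_{r_1r_2} = 2\int_{r_1}^{r_2}2\pi rD(r)\,\mathrm{d}r = \dfrac{3GM\dot{M}}{2}\int_{r_1}^{r_2}\left(1 - \left(\dfrac{R_I}{r}\right)^{1/2}\right)\dfrac{1}{r^2}\,\mathrm{d}r$

Substituting $u = \dfrac{R_I}{r}$, so that $\dfrac{\mathrm{d}u}{\mathrm{d}r} = -\dfrac{R_I}{r^2}$, $\dfrac{1}{r^2}\,\mathrm{d}r = -\dfrac{\mathrm{d}u}{R_I}$, $r = \dfrac{R_I}{u}$, $r_1 = \dfrac{R_I}{u_1}$ and $r_2 = \dfrac{R_I}{u_2}$:

$= \dfrac{3GM\dot{M}}{2R_I}\int_{u_1}^{u_2}-\left(1 - u^{1/2}\right)\mathrm{d}u$

$= \dfrac{3GM\dot{M}}{2R_I}\left(\int_{u_1}^{u_2}u^{1/2}\,\mathrm{d}u - \int_{u_1}^{u_2}\mathrm{d}u\right)$

$= \dfrac{3GM\dot{M}}{2R_I}\left(\left[\dfrac{2}{3}u^{3/2}\right]_{u_1}^{u_2} - \left[u\right]_{u_1}^{u_2}\right)$

$= \dfrac{3GM\dot{M}}{2R_I}\left(\left(\dfrac{2}{3}u_2^{3/2} - \dfrac{2}{3}u_1^{3/2}\right) - (u_2 - u_1)\right)$

$= \dfrac{3GM\dot{M}}{2R_I}\left(\left(\dfrac{2}{3}\left(\dfrac{R_I}{r_2}\right)^{3/2} - \dfrac{2}{3}\left(\dfrac{R_I}{r_1}\right)^{3/2}\right) - \left(\dfrac{R_I}{r_2} - \dfrac{R_I}{r_1}\right)\right)$

$L_{r_1r_2} = \dfrac{3GM\dot{M}}{2}\left(\dfrac{1}{r_1}\left(1 - \dfrac{2}{3}\sqrt{\dfrac{R_I}{r_1}}\right) - \dfrac{1}{r_2}\left(1 - \dfrac{2}{3}\sqrt{\dfrac{R_I}{r_2}}\right)\right)$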
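The closed-form ring luminosity can be checked against a direct numerical integration of $2\int 2\pi rD(r)\,\mathrm{d}r$. The sketch below works in units where $GM\dot{M} = 1$ and $R_I = 1$; the function names, the logarithmic midpoint rule and the step count are choices made here for illustration, not part of the text.

```python
import math


def ring_luminosity_closed(r1: float, r2: float, r_inner: float = 1.0) -> float:
    """Closed form: (3/2) [ (1/r)(1 - (2/3) sqrt(R_I/r)) ] evaluated at r1 minus r2,
    in units where G M Mdot = 1."""
    def term(r: float) -> float:
        return (1.0 - (2.0 / 3.0) * math.sqrt(r_inner / r)) / r
    return 1.5 * (term(r1) - term(r2))


def ring_luminosity_numeric(r1: float, r2: float, r_inner: float = 1.0,
                            steps: int = 200_000) -> float:
    """Midpoint rule on a logarithmic grid for
    (3/2) * integral of (1 - sqrt(R_I/r)) / r^2 dr from r1 to r2."""
    log_r1, log_r2 = math.log(r1), math.log(r2)
    dlog = (log_r2 - log_r1) / steps
    total = 0.0
    for i in range(steps):
        r = math.exp(log_r1 + (i + 0.5) * dlog)
        # dr = r dlog, so the integrand in log space gains a factor r
        total += (1.0 - math.sqrt(r_inner / r)) / r**2 * r * dlog
    return 1.5 * total


closed = ring_luminosity_closed(1.0, 1000.0)
numeric = ring_luminosity_numeric(1.0, 1000.0)
print(f"closed form: {closed:.6f}, numerical: {numeric:.6f}")
print(f"whole disc (r2 very large): {ring_luminosity_closed(1.0, 1e12):.6f}")
```

With the outer radius taken very large the result approaches 0.5, i.e. $GM\dot{M}/2R_I$ in these units.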

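For the whole disc $r_1 = R_I$ (strictly $R_I + R_b$, the outer edge of the boundary layer) and $r_2 = \infty$ so

$L_{R\infty} = \dfrac{3GM\dot{M}}{2}\left(\dfrac{1}{R_I}\left(1 - \dfrac{2}{3}\sqrt{\dfrac{R_I}{R_I}}\right) - \dfrac{1}{\infty}\left(1 - \dfrac{2}{3}\sqrt{\dfrac{R_I}{\infty}}\right)\right)$

$= \dfrac{3GM\dot{M}}{2}\left(\dfrac{1}{R_I}\left(1 - \dfrac{2}{3}\right) - 0\right)$

$= \dfrac{GM\dot{M}}{2R_I}$, confirming only half the potential energy $E_{PE} = \dfrac{GMm}{R_I}$ is radiated as energy.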

Note that at large values of $r_1$ (and therefore $r_2$) $\sqrt{\dfrac{R_I}{r_1}}$ and $\sqrt{\dfrac{R_I}{r_2}}$ are close to zero so $L_{r_1r_2} \cong \dfrac{3GM\dot{M}}{2}\left(\dfrac{1}{r_1} - \dfrac{1}{r_2}\right)$ while the available potential energy is $GM\dot{M}\left(\dfrac{1}{r_1} - \dfrac{1}{r_2}\right)$. This means that in the outer part of the disc more energy is being radiated than is available from the potential energy. This is possible because angular momentum is being transferred outwards. Further in, the radiated energy is less than that available from the potential energy so more becomes kinetic energy and the source of that angular momentum. The two balance out so that half the potential energy is radiated over the whole disc. This discussion only applies to those parts of the disc where there are stable Keplerian orbits, i.e. for $r > R_I + R_b$.

The inner edge of the disc is at the innermost stable circular orbit (in the range $0.5R_s$ to $3R_s$). Any particles inside that will have chaotic orbits which will at times go further out into the disc, and a particle will follow that chaotic orbit until it loses sufficient kinetic energy that its orbit intersects the event horizon. This means a significant proportion of its remaining kinetic energy will be radiated away in the boundary layer, an amount nearly equal to the remaining energy $\dfrac{GM\dot{M}}{2R}$. The boundary layer therefore emits almost as much energy as the rest of the disc, and so must have a much higher temperature since its area is very much smaller.

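Assuming the disc is a black body the flux (power per unit area) $F = \sigma T^4$ where $\sigma$ is the Stefan-Boltzmann constant so, using $D(r) = \dfrac{3GM\dot{M}}{8\pi r^3}\left(1 - \left(\dfrac{R_I}{r}\right)^{1/2}\right)$,

$T^4(r) = \dfrac{3GM\dot{M}}{8\pi\sigma r^3}\left(1 - \left(\dfrac{R_I}{r}\right)^{1/2}\right) = T_*^4\left(\dfrac{R_I}{r}\right)^3\left(1 - \left(\dfrac{R_I}{r}\right)^{1/2}\right)$ where $T_* = \left(\dfrac{3GM\dot{M}}{8\pi\sigma R_I^3}\right)^{1/4}$

$\dfrac{T^4}{T_*^4} = \left(\dfrac{R_I}{r}\right)^3\left(1 - \left(\dfrac{R_I}{r}\right)^{1/2}\right)$ This can be written as

$y = x^{-3} - x^{-3.5}$ where $y = \dfrac{T^4}{T_*^4}$ and $x = \dfrac{r}{R_I}$

$\dfrac{\mathrm{d}y}{\mathrm{d}x} = -3x^{-4} - (-3.5)x^{-4.5}$ This is zero at the maximum temperature and solving the resulting equation gives $x = \left(\dfrac{7}{6}\right)^2$, and substituting back gives

$y = \left(\dfrac{7}{6}\right)^{-6} - \left(\dfrac{7}{6}\right)^{-7} = \dfrac{6^6}{7^7}$ at $x = \left(\dfrac{7}{6}\right)^2$

so the maximum temperature in the main part of the disc is given by

$\dfrac{T}{T_*} = \left(\dfrac{6}{7}\right)^{3/2}\left(\dfrac{1}{7}\right)^{1/4} \cong 0.488$ at $r = 1.36R_I$ (this is also independent of the kinematic viscosity).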
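The position and height of the temperature maximum can be confirmed numerically. The sketch below evaluates $T/T_* = (x^{-3} - x^{-3.5})^{1/4}$ with $x = r/R_I$ on a fine grid and compares the result with the analytic values; the function name and grid spacing are illustrative choices, not part of the text.

```python
def temperature_ratio(x: float) -> float:
    """T / T_* at x = r / R_I, from T^4 / T_*^4 = x^-3 - x^-3.5 (valid for x > 1)."""
    return (x**-3 - x**-3.5) ** 0.25


# Analytic results from setting dy/dx = 0
x_peak = (7.0 / 6.0) ** 2
ratio_peak = (6.0 / 7.0) ** 1.5 * (1.0 / 7.0) ** 0.25

# Brute-force search over 1 < x < 3 in steps of 1e-4
grid = (1.0 + i * 1e-4 for i in range(1, 20000))
x_best = max(grid, key=temperature_ratio)

print(f"analytic:  x = {x_peak:.4f}, T/T* = {ratio_peak:.4f}")
print(f"numerical: x = {x_best:.4f}, T/T* = {temperature_ratio(x_best):.4f}")
```

Both approaches give a peak of about $0.488\,T_*$ at about $1.36R_I$.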


The above is only valid for the main part of the disc and ignores the boundary layer where the

orbits are not Keplerian. This is close enough to the black hole that space-time is curved, and the

velocities become relativistic.

Assuming effects near the black hole vary with its mass and inversely with distance, and that the Eddington limited luminosity varies with its mass so the rate of increase in mass also varies with mass, $T$ varies with $\left(\dfrac{MM}{M^3}\right)^{1/4}$ or $M^{-1/4}$; in other words the temperature is lower for more massive black holes. Stellar black holes have accretion disc temperatures of $10^7$ K (X-rays) while supermassive black holes have temperatures of $10^5$ K (optical).

The structure across the height of the disc is symmetric about the z axis and independent of $\phi$. Considering a particle at height $z$ there is a gravitational force towards the centre of the black hole which can be resolved into a component perpendicular to the z axis and a very small component parallel to the z axis pulling it towards the xy plane, which is approximately $\dfrac{GMm}{r^2}\dfrac{z}{r}$.

Assuming the particle is a small mass of ideal gas, its mass is the product of density and volume $\rho A\,\mathrm{d}z$. At the top surface $z = \dfrac{H}{2}$.

This must be balanced by the hydrostatic pressure $P = \rho c_s^2$ for an ideal gas acting over an area $A$, where $c_s$ is the speed of sound and $\dfrac{\mathrm{d}P}{\mathrm{d}z} \cong \dfrac{2P}{H}$, giving

$\dfrac{GMH\rho A}{2r^3} = \dfrac{2\rho c_s^2A}{H}$

$\left(\dfrac{H}{2}\right)^2 = \dfrac{r^3c_s^2}{GM} = \dfrac{c_s^2}{\omega^2}$ (using $\omega = \sqrt{\dfrac{GM}{r^3}}$)

$\dfrac{H}{2r} = \dfrac{c_s}{V}$ (using $V = r\omega$)

This means that for the disc to be thin ($H \ll r$) then $c_s \ll V$, i.e. the orbital speed must be very much greater than the local speed of sound. The speed of sound increases with temperature ($c_s \cong 10^4\sqrt{T}$ m s$^{-1}$ where $T$ is in units of $10^4$ K) and temperature increases with high mass flow rates ($T^4 \propto \dot{M}$) so the thin disc approximation breaks down at high mass flows (those approaching the Eddington limit). The outward radiation interacts with the disc and only numerical solutions are possible.


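The value for $H$ can be estimated for a neutral hydrogen disc by assuming $\dfrac{1}{\rho(z)}\dfrac{\partial P(z)}{\partial z} = -\dfrac{GMz}{r^3}$, derived by balancing the internal pressure and the gravitational component in the z direction.

$\dfrac{1}{\rho(z)}\dfrac{\partial}{\partial z}\left(\dfrac{\rho kT}{\bar{m}}\right) = -\dfrac{GMz}{r^3}$ (using the ideal gas law $\dfrac{P}{\rho} = \dfrac{k}{\bar{m}}T$)

$\dfrac{k}{\rho(z)}\dfrac{\partial}{\partial z}\left(\dfrac{\rho T}{\bar{m}}\right) = -\dfrac{GMz}{r^3}$ ($k$ is Boltzmann's constant)

$\dfrac{k}{m_p\rho(z)}\dfrac{\partial(\rho T)}{\partial z} = -\dfrac{GMz}{r^3}$ ($m_p$ is the mass of a proton)

$\dfrac{k}{m_p\rho(z)}\left(T\dfrac{\partial\rho}{\partial z} + \rho\dfrac{\partial T}{\partial z}\right) = -\dfrac{GMz}{r^3}$ ($\rho$ and $T$ are both functions of z)

$\dfrac{k}{m_p\rho(z)}T\dfrac{\partial\rho}{\partial z} = -\dfrac{GMz}{r^3}$ ($\dfrac{\partial T}{\partial z} = 0$ for isothermal stratification)

A solution is $\rho = \rho_c\exp\left(-\dfrac{z^2}{2H^2}\right)$ where $\rho_c$ is the mid-plane density, so

$\dfrac{kT}{m_p\rho_c\exp\left(-\frac{z^2}{2H^2}\right)}\dfrac{\partial}{\partial z}\left(\rho_c\exp\left(-\dfrac{z^2}{2H^2}\right)\right) = -\dfrac{GMz}{r^3}$

$\dfrac{kT\rho_c}{m_p\rho_c\exp\left(-\frac{z^2}{2H^2}\right)}\dfrac{\partial}{\partial z}\left(\exp\left(-\dfrac{z^2}{2H^2}\right)\right) = -\dfrac{GMz}{r^3}$

$\dfrac{kT\rho_c\exp\left(-\frac{z^2}{2H^2}\right)}{m_p\rho_c\exp\left(-\frac{z^2}{2H^2}\right)}\dfrac{\partial}{\partial z}\left(-\dfrac{z^2}{2H^2}\right) = -\dfrac{GMz}{r^3}$ (using $\dfrac{\partial}{\partial z}\exp\left(-\dfrac{z^2}{2H^2}\right) = \exp\left(-\dfrac{z^2}{2H^2}\right)\dfrac{\partial}{\partial z}\left(-\dfrac{z^2}{2H^2}\right)$)

$\dfrac{kT}{2m_pH^2}\dfrac{\partial(-z^2)}{\partial z} = -\dfrac{GMz}{r^3}$

$\dfrac{kT}{2m_pH^2}(-2z) = -\dfrac{GMz}{r^3}$

$\dfrac{kT}{m_pH^2} = \dfrac{GM}{r^3}$

$H = \left(\dfrac{kTr^3}{m_pGM}\right)^{1/2}$ This only applies to neutral hydrogen, and not ionised hydrogen.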

The surface flux $F$ (power per unit area) is related to the mid-plane temperature $T$ by $F \cong \dfrac{4\sigma T^4}{3\kappa_R\rho H}$ where $\kappa_R$ is the Rosseland mean opacity and equals $5\times10^{20}\rho T^{-3.5}$ in SI units for a plasma. $\kappa_R H\rho$ is the optical depth, a dimensionless value. This assumes the energy transport is by radiation, where the surface density $\propto r^{-0.75}$. If the transport is by convection, as occurs with partially ionised hydrogen at temperatures around 6000 K, numerical methods must be used. This occurs at large radii; there is a step increase in surface density as $r$ increases after which surface density remains nearly constant.


This step divides the disc into two areas. Inside this step the disc has both low temperature and

viscosity, high opacity, radiative energy transfer, a low vertical temperature gradient

(perpendicular to the disc), and hydrogen is neutral. Outside the step the disc has high

temperature and viscosity, low opacity, convective energy transfer, a high vertical temperature

gradient and ionised hydrogen. The position of the step depends on the mass transfer rate for a

given system – the higher the rate the greater the radius of the step. The step may not exist for

low transfer rates.

On both sides of the step $\dfrac{\partial\dot{M}}{\partial(H\rho)} > 0$ which is a stable equilibrium: any increase in flow rate results in an increase in surface density, and this increase will move quickly inwards so the disc adjusts to the new rate. However at the step $\dfrac{\partial\dot{M}}{\partial(H\rho)} < 0$ which is unstable, and an increase in the mass transfer rate decreases the surface density. The result is that the temperature rapidly increases, ionising the hydrogen, and the step moves rapidly inwards until a new stable state emerges. Likewise a decrease in the mass transfer rate will result in the step moving outwards.

This explains the variability in luminosity. Large increases in the transfer rate result in the step

moving to the inner edge increasing the luminosity of the whole disc. The time scales vary from

days or weeks for stellar black holes to perhaps 1000 years for a supermassive black hole.

The quasar OJ287 is believed to have two central black holes. The main one has a mass of $18\times10^9$ sun masses and an accretion disc. The second black hole has a mass of $1\times10^8$ sun masses and a 12 year orbit nearly perpendicular to the accretion disc. This means it passes through the accretion disc twice in close succession each orbit, which results in two bursts of radiation. Its precession has been calculated as 39° per orbit. The main black hole has a spin rate of about 30% of the maximum rate.

Although black holes cannot be observed directly, they together with accretion discs provide the

best current explanations for quasars and other objects which appear to emit huge amounts of

energy.

The first image of an accretion disc was published on 10 April 2019: that of M87*, which is at the centre of the supergiant elliptical galaxy Virgo A, created from observations at 1.3 mm by seven radio telescopes over several days in April 2017. The disc is brighter on the side approaching the Earth and has a dark centre, which is not the black hole but the zone inside the minimum circular orbit, estimated as $2.6R_s$ where $R_s$ is about 120 AU. The black hole mass is estimated at $6.5\times10^9$ sun masses.

The Schwarzschild model of a simple universe is the basis for all the above theories of

cosmological objects.


Mass of a Black Hole

The mass of a black hole $M$ is often estimated by the speed of an object rotating around it relative to the black hole. Assuming an object of mass $m$ at a distance $r$ from the black hole is moving either directly towards or away from the observer at a speed $\Delta v$ relative to the velocity of the black hole (the Line of Sight Radial Velocity, LSRV) and assuming the gravitational force equals the centripetal force

$\dfrac{GMm}{r^2} = \dfrac{m(\Delta v)^2}{r}$

$M = \dfrac{r(\Delta v)^2}{G}$

∆𝑣 is a maximum when the object is physically moving directly towards the observer (the

observer is in the plane of rotation and the object is at its greatest apparent distance from the

black hole) and this formula will give the minimum mass if 𝑟 and ∆𝑣 are observed values.

Although in theory the black hole affects the curvature of space out to infinity, in practice

there is a limited sphere of influence outside of which other effects dominate. This is given by

balancing the gravitational force with the centripetal force.

The gravitational force is given by $\dfrac{GMm}{r^2}$, and the centripetal force by $\dfrac{m\sigma^2}{r}$ so

$r = \dfrac{GM}{\sigma^2}$

$\sigma$ is known as the velocity dispersion $\Delta v$, and is the standard deviation of the velocities measured within a radius $r$, the maximum value of $\sigma$ being used. In the case of the Milky Way this is about 75 km s$^{-1}$, for M31 about 160 km s$^{-1}$, elliptical galaxies about 200 km s$^{-1}$, and the Coma cluster about 1000 km s$^{-1}$. In practice 200 km s$^{-1}$ is used as a standard value.

For supermassive black holes (SMBH) the radius of the sphere of influence is approximated by $r = \dfrac{10M_{10^8}}{\sigma_{200}^2}$ parsecs where $M_{10^8}$ is the mass of the black hole in $10^8$ sun masses and $\sigma_{200}$ is the dispersion velocity in units of 200 km s$^{-1}$, typically around $10^6$ times the Schwarzschild radius.
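The scaling of about 10 parsecs per $10^8$ sun masses, and the ratio to the Schwarzschild radius, both follow from $r = GM/\sigma^2$. The sketch below evaluates them; the rounded constants, the parsec conversion and the function name are assumptions added for illustration.

```python
# Rounded SI constants (assumed values, not from the text)
G = 6.674e-11        # m^3 kg^-1 s^-2
C = 2.998e8          # m s^-1
M_SUN = 1.989e30     # kg
PARSEC = 3.086e16    # m


def influence_radius_m(mass_kg: float, sigma_m_s: float) -> float:
    """Radius of the sphere of influence, r = G M / sigma^2, in metres."""
    return G * mass_kg / sigma_m_s**2


mass = 1.0e8 * M_SUN
sigma = 200e3                                  # 200 km/s in m/s
r_influence = influence_radius_m(mass, sigma)
r_schwarzschild = 2.0 * G * mass / C**2

print(f"sphere of influence: {r_influence / PARSEC:.1f} parsecs")
print(f"ratio to Schwarzschild radius: {r_influence / r_schwarzschild:.2e}")
```

The ratio reduces to $c^2/2\sigma^2$, so it depends only on the velocity dispersion and not on the mass.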

However there is a relationship between the dispersion velocity and the mass of the SMBH given by $M_{10^8} = a\,\sigma_{200}^{\,b}$ where $M_{10^8}$ is in $10^8$ sun masses, $a$ and $b$ are constants, and $\sigma_{200}$ is the dispersion velocity in units of 200 km s$^{-1}$ with a range of about 0.25 to 5.5. There are various values given for $a$ and $b$: (1.66, 4.58), (1.9, 5.1) and (3.1, 4.0). This is known as the Magorrian relation which implies there is a close relationship between the dispersion velocity and mass of the black hole. Here the dispersion velocity is that of the bulge in spirals and the whole in the case of elliptical galaxies. The black hole mass is about 0.6% of the mass of the bulge in spiral galaxies and about 0.6% of the whole mass of elliptical galaxies, but the sphere of influence is about $10^{-9}$ of the volume.

There is also a similar relationship for luminosity called the Faber–Jackson relationship for

elliptical galaxies, and Tully-Fisher relationship for spirals, but these have a large spread.

The above implies there is a very strong but unknown link between black holes and galaxies,

and that all galaxies may have black holes. It is not clear which came first, the black hole or the

galaxy or whether both developed together.


Gravitational Waves

Any acceleration of a mass through space-time causes gravitational waves which carry energy away from the mass at the speed of light. The only exception is pure rotation of a body around its axis of symmetry because the gravitational waves from the components cancel out. For example the rotation of the Earth around the Sun results in a loss of 200 watts, reducing the orbital radius by $2\times10^{-15}$ m per annum. The power lost is given by (assuming circular orbits)

$\dfrac{\mathrm{d}E}{\mathrm{d}t} = \dfrac{32G^4(m_1m_2)^2(m_1+m_2)}{5c^5r^5}$ where $m_1$ and $m_2$ are the masses of two bodies a distance $r$ apart.

For two neutron stars with the mass of the Sun and $1.89\times10^8$ metres apart this comes to $1.38\times10^{28}$ watts, or about 36 times the power radiated by the Sun in electromagnetic waves.
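The power formula is easy to evaluate for both examples in this passage. The sketch below uses rounded constants, which are assumptions added here, so the results agree with the quoted figures only to within a few percent; the function name is illustrative.

```python
# Rounded SI constants (assumed values, not from the text)
G = 6.674e-11        # m^3 kg^-1 s^-2
C = 2.998e8          # m s^-1
M_SUN = 1.989e30     # kg
M_EARTH = 5.972e24   # kg
AU = 1.496e11        # Earth-Sun distance, m
L_SUN = 3.828e26     # solar luminosity, W


def gravitational_wave_power(m1: float, m2: float, r: float) -> float:
    """dE/dt = 32 G^4 (m1 m2)^2 (m1 + m2) / (5 c^5 r^5) for a circular orbit, in watts."""
    return 32.0 * G**4 * (m1 * m2) ** 2 * (m1 + m2) / (5.0 * C**5 * r**5)


earth_sun = gravitational_wave_power(M_SUN, M_EARTH, AU)
neutron_pair = gravitational_wave_power(M_SUN, M_SUN, 1.89e8)

print(f"Earth-Sun: {earth_sun:.0f} W")
print(f"Neutron star pair: {neutron_pair:.2e} W = {neutron_pair / L_SUN:.0f} L_sun")
```

The steep $r^{-5}$ dependence is why the loss is negligible for planetary orbits but large for compact binaries.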

In the case of a binary star with an orbital period of less than 3 hours the energy lost through gravitational waves results in a significant loss in orbital angular momentum and the spiral-in of the two stars. The rate of loss of angular momentum $J$ (dimensions ML$^2$T$^{-1}$) is given by

$\dfrac{\mathrm{d}J}{\mathrm{d}t} = -1.27\times10^{-8}\dfrac{m_1m_2}{(m_1+m_2)^{1/3}}P^{-8/3}J$ year$^{-1}$ where $P$ is the period in hours (maximum 3) and the masses are in units of sun masses. Magnetic braking dominates for periods greater than this.

For example a primary of 1 sun mass and a secondary of 0.23 sun masses with an orbital period of 2 hours has

$\dfrac{\mathrm{d}J}{\mathrm{d}t} = -1.27\times10^{-8}\dfrac{1\times0.23}{1.23^{1/3}}2^{-8/3}J = -4.29\times10^{-10}J$ y$^{-1}$. This is a very small proportion of the angular momentum, giving a lifetime of some $10^9$ years.
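The worked example can be reproduced directly from the formula. In the sketch below the function name is illustrative, and the lifetime is taken simply as the reciprocal of the fractional loss rate, which is an assumption about how the quoted estimate was obtained.

```python
def fractional_angular_momentum_loss(m1: float, m2: float, period_hours: float) -> float:
    """(1/J) dJ/dt in year^-1, with masses in sun masses and the period in hours
    (the relation is stated for periods of at most 3 hours)."""
    return -1.27e-8 * m1 * m2 / (m1 + m2) ** (1.0 / 3.0) * period_hours ** (-8.0 / 3.0)


rate = fractional_angular_momentum_loss(1.0, 0.23, 2.0)
lifetime_years = 1.0 / abs(rate)   # crude timescale J / |dJ/dt|

print(f"(1/J) dJ/dt = {rate:.3e} per year")
print(f"timescale = {lifetime_years:.2e} years")
```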

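In flat space-time $[g_{\mu\nu}] = [\eta_{\mu\nu}]$. Assuming a small amount of curvature this can be modified to $[g_{\mu\nu}] = [\eta_{\mu\nu}] + [h_{\mu\nu}]$ where $[h_{\mu\nu}]$ is a small perturbation. This allows Einstein’s field equations to be written as

$$\partial_\mu\partial_\nu h + \Box h_{\mu\nu} - \sum_\rho \partial_\nu\partial_\rho h^{\rho}{}_{\mu} - \sum_\rho \partial_\mu\partial_\rho h^{\rho}{}_{\nu} - \eta_{\mu\nu}\Big(\Box h - \sum_{\rho,\sigma}\partial_\rho\partial_\sigma h^{\sigma\rho}\Big) = -2\kappa T_{\mu\nu}$$

where $\partial_\mu = \frac{\partial}{\partial e^\mu}$, $h = \sum_\sigma h^{\sigma}{}_{\sigma}$ and $\Box$ is the d’Alembertian operator $\Box \equiv \sum_\sigma \partial^\sigma\partial_\sigma \equiv \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2$. (Note that other texts may use $\Box^2$ or $\partial^2$ for the d’Alembertian operator.)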

This is a linear equation in $h_{\mu\nu}$ whose solutions are wave-like. To solve it the condition $\sum_\mu \partial^\mu \bar{h}_{\mu\nu} = 0$, where $\bar{h}_{\mu\nu} = h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}h$, is imposed, which results in

$$\Box\bar{h}_{\mu\nu} = -2\kappa T_{\mu\nu}$$

which is an inhomogeneous wave equation (inhomogeneous means the RHS is not zero).

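The term $-2\kappa T_{\mu\nu}$ describes how the source generates the waves, which travel with the speed of light.

$$[\bar{h}_{\mu\nu}] = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & h_+ & h_\times \\ 0 & 0 & h_\times & -h_+ \end{bmatrix}$$

where $h_+$ and $h_\times$ are functions of $(t - r, r, \theta, \phi)$.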


For a two body system such as the Earth-Sun

$$h_+(t - r, r, \theta, \phi) = \frac{2G^2 m_1 m_2}{rRc^4}\,(1 + \cos^2\theta)\cos\big(2\omega(t - R)\big)$$

which is the plus polarisation, and

$$h_\times(t - r, r, \theta, \phi) = \frac{4G^2 m_1 m_2}{rRc^4}\,\cos\theta\,\sin\big(2\omega(t - R)\big)$$

which is the cross polarisation, at 45 degrees to the plus polarisation.

Here $\omega$ is the angular velocity of the equivalent circular orbit, and $R$ is the distance of the observer from the centre of mass, where $R$ is greater than one wavelength. For the Earth-Sun system $R$ is greater than one light year since the Earth takes 1 year to orbit the sun, which would generate one full wave.

For an observer in the orbital plane $h_\times = 0$ and $h_+ = \frac{1}{R}\,1.7\times10^{-10}$ metres, which means that the amplitude factor is $10^{-26}$, i.e. the measuring precision must be better than 1 part in $10^{26}$.

A supernova in the Virgo cluster would have an amplitude factor of $10^{-21}$ on Earth, which is just on the limit of detectability. A merger of black holes would result in more energy, and there should be gravitational waves from the Big Bang.

Most detectors are laser interferometers, such as LIGO, in which a laser beam is split, sent in two perpendicular directions, reflected back and recombined to produce an interference pattern. Multiple locations eliminate local noise, and time delays can allow the direction of the source to be estimated. A similar approach is to use satellites, such as LISA, which allows longer legs and has less noise so should be more sensitive.

The first positive result was detected on 14 September 2015 and was designated GW150914. Analysis indicates it was caused by the merger of two black holes of 36 and 29 sun masses orbiting 75 times per second at a distance of 350 km. Some 3 sun masses were converted into gravitational-wave energy, leaving a black hole of 62 sun masses. The event occurred at a luminosity distance of $D_L = 410$ Mpc or $z = 0.09$. The luminosity distance can be converted to the proper distance by $r = \frac{D_L}{1+z}$, so $r = 376$ Mpc. 1 Mpc = $3.262\times10^{6}$ light years so the distance is $1.23\times10^{9}$ ly.

Note that the masses measured are $M(1+z)$; these have been divided by $(1+z)$ to get the actual masses.

Gravitational waves are also a mechanism involved in cataclysmic variable stars. These are binary stars where a white dwarf accretes mass from a low mass star in a close orbit. The orbital period is under three hours and the accretion rate is of the order of $10^{-11}$ to $10^{-10}$ sun masses per year. The gravitational radiation results in a loss of angular momentum given by

$$\frac{1}{J}\frac{dJ}{dt} = -1.27\times10^{-8}\,\frac{M_1 M_2}{(M_1+M_2)^{1/3}}\,P^{-8/3}\ \text{y}^{-1}$$

where $J$ is the angular momentum, $M_1$ and $M_2$ are the two masses in multiples of sun mass, and $P$ is the orbital period in hours (maximum 3). This results in the orbital radius constantly decreasing, which allows a sustained accretion rate, and an accretion disc forms which generates most of the radiation, which is ultraviolet or X-ray. There may be outbursts of energy caused by the disc becoming unstable or the accretion rate onto the white dwarf increasing, or thermonuclear reactions (hydrogen fusion) in the disc.


Two merging stellar black holes, although rare because it is unlikely that two massive stars

would form close together and the collapse of one should eject the other, would give a very

distinctive pattern of increasing frequency and amplitude terminating in a massive output of

energy. The exponential increase in frequency, called chirp, is due to the rapidly decreasing

orbital period in the final stages. The merged black hole is initially asymmetric, but rapidly

becomes spherical so there is an exponential decrease in flux called ringdown. The detectable

signal lasts about 0.2 seconds. The mass of the merged black hole is less than that of the two

orbiting black holes, the difference being lost as the energy in the gravitational waves. If the rotational momenta of the two black holes (the orbital plane components of their spins added to their orbital angular momentum) exceed the Kerr limit of $J_{max} = \frac{GM^2}{c}$ for the final black hole, the excess momentum is believed to be lost in the form of gravitational waves. This means the detected pulse of waves should have a larger duration than if the limit is not exceeded. In the

case of the first three events detected GW170104 had a low duration while GW150914 and

GW151226 had large durations (the format is GWyymmdd). This implies that the latter two

must have developed as a pair while the two black holes in the case of GW170104 may have

developed separately.

The three measurements so far confirmed indicate that the velocity of gravitational waves is the

same as that of electromagnetic waves so the particle equivalent, the graviton, must be

massless. The velocity comes from the time difference between the two observations at

different sites on the Earth and the straight line distance between the sites.

The merger of two neutron stars (compact stellar remnants of about 1.4 to 3 solar masses, less massive than black holes) was detected in 2017 (GW170817), and analysis of the gravitational waves has led to the estimate that the maximum radius of a 1.4 solar mass neutron star is 13.6 km.


Status of General Relativity

General relativity produced much better accuracy for the known orbit of Mercury, and gave predicted corrections for Venus, Earth and Icarus (a near-Earth Apollo asteroid) which have been verified by subsequent observations.

A test of general relativity took place during the 1919 eclipse by Sir Arthur Eddington and

others detecting the apparent change in position of stars near the edge of the Sun. This has been

repeated several times since, but the accuracy is limited. Better results have been obtained

measuring the deviation of radio emissions from quasars when close to the sun, very long

baseline interferometry giving more precise values.

There are several results for gravitational lensing which predict the presence of black holes, and

for micro-lensing predicting planets.

The orbital decay of two orbiting neutron stars called the Hulse-Taylor System or PSR B1913+16 has been measured for three decades and follows very closely that predicted by the loss of energy due to gravitational waves, although other qualitative explanations have been put forward. An even more accurate result has been obtained from PSR J0737-3039A, where measurements agree with the value predicted by general relativity within an uncertainty of 0.05%, the most precise test yet obtained.

Supermassive black holes are the best explanation for quasars to date. The detection of gravitational waves in 2015 has given significant support to the existence of gravitational waves and black holes, and to general relativity. The observations of M87* support the theory of black holes.

There is a specific engineering verification. Both general and special relativity must be taken

into account when designing satellite based navigational or positioning systems.

There are a number of experiments which are designed to verify general relativity.

The Shapiro experiments are based on the time delay for radar signals reflected by a planet on the far side of the sun, which take slightly longer than expected for a flat space-time. This extra time is given by

$$\Delta t = \frac{4GM_S}{c^3}\left(\ln\frac{4R_E R_P}{R_S^2} + 1\right)$$

where $R$ is a radius and the suffixes E, P and S refer to the Earth, Planet and Sun. $\frac{4GM_S}{c^3} \cong 20\,\mu s$ so this factor is very small. More accurate results can be obtained from the time delay of signals from spacecraft on the far side of the sun. In this case $R_P$ is the distance of the spacecraft from the sun. Tests have been made from the surface of Mars, from Voyager 1 and 2, and Cassini, which agreed with theory to about 20 parts in a million.
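To give a feel for the size of the Shapiro delay, the expression can be evaluated numerically. This is only a sketch under stated assumptions: $R_S$ is taken to be the solar radius (a signal grazing the Sun), $R_E$ and $R_P$ are taken as mean orbital radii, Venus is an arbitrary illustrative choice of planet, and all constants are standard reference values rather than values from the text.

```python
import math

# Standard reference values (assumptions, not taken from the text).
G = 6.674e-11             # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8               # speed of light, m s^-1
M_SUN = 1.989e30          # solar mass, kg
R_SUN = 6.957e8           # solar radius, m (assumed meaning of R_S)
R_EARTH_ORBIT = 1.496e11  # mean Earth-Sun distance, m
R_VENUS_ORBIT = 1.082e11  # mean Venus-Sun distance, m (illustrative planet)

# The factor 4 G M_S / c^3 that sets the scale of the delay.
prefactor = 4 * G * M_SUN / C**3


def shapiro_delay(r_earth, r_planet, r_sun=R_SUN):
    """Extra signal travel time in seconds from the expression in the text."""
    return prefactor * (math.log(4 * r_earth * r_planet / r_sun**2) + 1)


delay = shapiro_delay(R_EARTH_ORBIT, R_VENUS_ORBIT)
print(f"4GM/c^3 = {prefactor * 1e6:.1f} microseconds")
print(f"delay   = {delay * 1e6:.0f} microseconds")
```

The logarithm only multiplies the 20 µs factor by a number of order ten, so the delay stays in the range of hundreds of microseconds even for a signal passing very close to the Sun.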

The LAGEOS satellites have measured frame dragging but the results are disputed. Gravity Probe B has measured results for gyroscopic precession but these are at its limits of sensitivity.

Finally the Pound-Rebka experiment measured the gravitational redshift in an Earth based experiment over a height of 22.5 metres by emitting electromagnetic waves of a very precise photon energy of about 14 keV from an overhead source which vibrated up and down, so spreading the frequency due to the Doppler effect, and measuring the radiation through a very precise filter which absorbed the radiation at only the transmitted frequency. If the gravitational redshift was cancelled by the Doppler effect the received radiation would be absorbed. Thus by comparing the received radiation which was not absorbed with the vibration of the source it was possible to verify the theory with an accuracy of 10%. This was later reduced to 1% by Pound and Snider.

A similar experiment using Gravity Probe A and using a hydrogen maser reduced this to 70

parts per million.
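The size of the effect measured by Pound and Rebka can be estimated with the standard weak-field approximation $\frac{\Delta f}{f} \approx \frac{gh}{c^2}$. This relation is not stated in the text and is used here as an assumption, together with a standard value for $g$; the sketch only illustrates why such a precise source and filter were needed.

```python
G_SURFACE = 9.81   # gravitational acceleration at the Earth's surface, m s^-2 (assumed)
C = 2.998e8        # speed of light, m s^-1
HEIGHT = 22.5      # height of the experiment in metres, from the text

# Weak-field approximation for the fractional frequency shift over HEIGHT.
fractional_shift = G_SURFACE * HEIGHT / C**2

# Source speed whose first-order Doppler shift (v / c) would cancel it.
compensating_speed = fractional_shift * C

print(f"fractional shift   = {fractional_shift:.2e}")
print(f"compensating speed = {compensating_speed:.2e} m/s")
```

The shift is a few parts in $10^{15}$, which corresponds to a source speed of less than a micrometre per second.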


Relativity and Cosmology

Einstein’s field equations for the universe as a whole are written today as

$$R_{\mu\nu} - \frac{1}{2}R\,g_{\mu\nu} = -\kappa\left(T_{\mu\nu} - \bar{T}_{\mu\nu}\right)$$

The LHS describes space-time:

$R_{\mu\nu}$ is the Ricci tensor, which is the contracted Riemann tensor describing the geometry of space-time, with dimensions [L⁻²].

$R$ is the Ricci or curvature scalar specifying the curvature, with dimensions [L⁻²].

$g_{\mu\nu}$ is the metric tensor which defines the geometry of space-time, and is dimensionless.

The RHS describes the distribution of energy and momentum in the space-time:

$T_{\mu\nu}$ is the energy-momentum tensor for energy and matter, with dimensions [M L⁻¹ T⁻²]. $T_{00}$ specifies the local energy density including the energy equivalent of mass. The top row specifies the density of the $\nu$th component of momentum. The first column specifies the energy flux in the $\mu$ direction. The remaining elements specify the flux in the $\mu$ direction of the $\nu$th component of momentum.

$\bar{T}_{\mu\nu}$ is the energy-momentum tensor for dark energy, $\bar{T}_{\mu\nu} = \frac{\Lambda}{\kappa}g_{\mu\nu}$.

$\Lambda$ is the cosmological constant, $1.1056\times10^{-52}$ m⁻² (current value).

The two sides are made equal by a constant.

$\kappa$ is Einstein’s constant which has the value $\frac{8\pi G}{c^4}$ with dimensions [M⁻¹ L⁻¹ T²].

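If it is assumed that dark energy is an ideal fluid whose pressure at a point is equal in all directions, the energy-momentum tensor is

$$\bar{T}_{\mu\nu} = \left(\rho_\Lambda + \frac{p_\Lambda}{c^2}\right)U_\mu U_\nu - p_\Lambda g_{\mu\nu}$$

where $\rho$ is density, $p$ is pressure, the suffix $\Lambda$ refers to dark energy, and $U$ is the four-velocity.

If $\bar{T}_{\mu\nu} = \frac{\Lambda}{\kappa}g_{\mu\nu}$ and $U_\mu U_\nu \neq 0$ then $\left(\rho_\Lambda + \frac{p_\Lambda}{c^2}\right)U_\mu U_\nu = 0$, so

$p_\Lambda = -\frac{\Lambda}{\kappa}$, i.e. the pressure is negative since $\Lambda$ and $\kappa$ are positive, and

$\rho_\Lambda + \frac{p_\Lambda}{c^2} = 0$ or $\rho_\Lambda = -\frac{p_\Lambda}{c^2}$, so the density is positive since $p_\Lambda$ is negative. This gives

$$\rho_\Lambda = \frac{\Lambda c^2}{8\pi G}, \qquad p_\Lambda = -\frac{\Lambda c^4}{8\pi G}$$

The negative pressure is forcing the space-time to expand. There are various theories as to what it is. However it cannot be the vacuum energy of quantum mechanics because that is too big by a factor of $10^{120}$.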
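The dark energy density and pressure implied by the quoted cosmological constant can be computed from $\rho_\Lambda = \frac{\Lambda c^2}{8\pi G}$ and $p_\Lambda = -\rho_\Lambda c^2$. The sketch below uses the value of $\Lambda$ from the text and assumes standard reference values for $G$ and $c$.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (standard value)
C = 2.998e8          # speed of light, m s^-1 (standard value)
LAMBDA = 1.1056e-52  # cosmological constant, m^-2 (value quoted in the text)

rho_lambda = LAMBDA * C**2 / (8 * math.pi * G)  # mass density, kg m^-3
p_lambda = -rho_lambda * C**2                   # pressure, Pa (negative)

print(f"rho_Lambda = {rho_lambda:.3e} kg/m^3")
print(f"p_Lambda   = {p_lambda:.3e} Pa")
```

The density comes out at a few times $10^{-27}$ kg m⁻³, equivalent to only a few proton masses per cubic metre.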


The Expanding Universe

The expanding universe refers to the increase in space between two freely falling objects with

time. It does not mean that an object gets bigger with time. A metre measuring rod will always

be 1 metre long from the start to the end of the universe because its length depends on chemical

bonds whose lengths do not change with time (although they will with temperature). However

the wavelength associated with a photon does change with time if that photon exists over a

significant time period.

The distance of the Earth from the Sun also does not change with time due to the expanding

universe.

The universe is not expanding into anything – there is no known higher dimension for it to

expand into.

The universe is not expanding from a single point so there is no unique place such as the centre

of the universe.

There is a special class of observer in cosmology called fundamental observers. These move

with the expanding universe (the distance between them increases at the rate of expansion). All

fundamental observers agree on their cosmological measurements so they can be used as a

frame of reference.

The Cosmological Principle

This states that at any given time and on a sufficiently large scale the universe is both homogeneous (the same everywhere) and isotropic (the same in all directions).

A sufficiently large scale is one significantly greater than any object in the universe, the biggest of which known at the present time are superclusters, which occupy 10%, and voids, which occupy the remaining 90% of the universe. These have a scale of 100 Mly (million light-years) or 30 Mpc (megaparsecs) or about $9\times10^{23}$ metres.

Weyl’s Postulate

This states that in cosmic space-time there exists a set of fundamental observers whose world-lines form a smooth bundle of time-like geodesics which never meet except possibly at an initial and/or a final singularity.

This means that if these fundamental observers all measure a proper time they will get the same

value which is called cosmic time.

Every event in the universe has a single value for cosmic time measured from creation.

These fundamental observers are free falling through space-time, often described as floating in

the Hubble flow.

If they observe the cosmic microwave background it will be isotropic except for minor

variations (about 1 part in 10,000). An observer in the solar system observes a significant

departure from this which indicates that the solar system is moving at about one thousandth of

the speed of light in the direction of Leo and so an observer in the solar system cannot be a

fundamental observer. This also applies to an observer based in any galaxy.


The Friedmann-Robertson-Walker (FRW) Metric

The cosmological principle and Weyl’s postulate form the justification for the Friedmann-Robertson-Walker metric, which separates the time dimension from the spatial dimensions to give

$$(ds)^2 = (c\,dt)^2 - \sum_{i,j=1}^{3} g_{ij}\,dx^i\,dx^j$$

noting that $x^i$ and $x^j$ are the three spatial coordinates and not powers. $x$ is used instead of $e$ because they are spatial and not general, and $i$ and $j$ replace $\mu$ and $\nu$ because their range is 1 to 3 and not 0 to 3.

Regardless of the cosmic time, the spatial angles between events must always be the same for all fundamental observers. This can be imposed by having a common time function $S^2(t)$ and making the metric coefficients independent of time, so $h_{ij} = f(x^1, x^2, x^3)$, to give

$$(ds)^2 = (c\,dt)^2 - S^2(t)\sum_{i,j=1}^{3} h_{ij}\,dx^i\,dx^j$$

Since space is homogeneous and isotropic the curvature $K$ (on a sufficiently large scale) must be constant. The FRW metric does not apply in the vicinity of masses (galaxies, clusters).

The current approach is to use a set of co-moving polar co-ordinates $(\bar{r}, \theta, \phi)$, i.e. the origin moves with the fundamental observer. Note that $\bar{r}$ is dimensionless and $S$ has dimension L. The origin and alignment of the coordinate system is arbitrary.

$$(ds)^2 = (c\,dt)^2 - S^2(t)\left(\frac{(d\bar{r})^2}{1 - K\bar{r}^2} + \bar{r}^2(d\theta)^2 + \bar{r}^2\sin^2\theta\,(d\phi)^2\right)$$

The Riemann tensor components are $R_{ijkl} = K(h_{ik}h_{jl} - h_{il}h_{jk})$, the Ricci tensor components are $R_{ij} = -2Kh_{ij}$, and the Ricci curvature $R = -6K$. Note $R$ is the curvature of space, not space-time, and is dimensionless.

A simpler form of the metric is based on whether the curvature of space is positive, zero or negative, and whether space is expanding, stationary or contracting.

$$\begin{array}{llll}
K < 0: & k = -1, & r = \bar{r}\sqrt{|K|}, & R(t) = \dfrac{S(t)}{\sqrt{|K|}} \\
K = 0: & k = 0, & r = \bar{r}, & R(t) = S(t) \\
K > 0: & k = 1, & r = \bar{r}\sqrt{K}, & R(t) = \dfrac{S(t)}{\sqrt{K}}
\end{array}$$

This gives

$$(ds)^2 = (c\,dt)^2 - R^2(t)\left(\frac{(dr)^2}{1 - kr^2} + r^2(d\theta)^2 + r^2\sin^2\theta\,(d\phi)^2\right)$$

Note that like $\bar{r}$, $r$ is dimensionless and $R(t)$ like $S(t)$ has dimension L. $k$ is dimensionless.

𝑅(𝑡) is an unspecified function of time and should not be confused with Riemann or Ricci

tensors or curvature. 𝑅(𝑡) is the universe’s scale factor (sometimes written as just 𝑅) and the

relationship between proper distance and co-moving distance (the co-moving distance is

constant wrt time since it is calculated at time 𝑡0 i.e. now). In a flat universe the proper distance

𝜎 = 𝑅(𝑡)𝑟. The current value of 𝑅(𝑡) is 1 with the same units of length as 𝜎.

Here $k$ indicates whether the curvature is positive (+1), flat (0) or negative (-1) and $R(t)$ indicates whether space is expanding, staying the same or contracting. These are the only important parameters and there are nine possibilities. $k$ has only one of three values, -1, 0 or 1; any other value such as say $n$ is taken into account by multiplying $r$ by $\sqrt{n}$ and dividing $R(t)$ by $\sqrt{n}$. The curvature of space at a time $t$ is $\frac{k}{R^2(t)}$, and is not the same as the Ricci curvature which is dimensionless.


Proper Time and Proper Distance

The fundamental observers measure proper time $\tau$. This is also called cosmic time.

The proper distance between two fundamental observers at a proper time $\tau$ is

$$d\sigma = R(\tau)\sqrt{\frac{(dr)^2}{1 - kr^2} + r^2(d\theta)^2 + r^2\sin^2\theta\,(d\phi)^2}$$

and is a function of proper time.

This must be integrated along the geodesic. Note that $r$ is dimensionless and $R$ has dimension L. It can be simplified by fixing the origin at one fundamental observer and aligning the $r$ axis with the other fundamental observer so that $d\theta = d\phi = 0$.

The second fundamental observer then has coordinates $(\chi, 0, 0)$ where $\chi$ is the dimensionless distance between the observers and $R(\tau)$ has the dimensions of length, i.e. $\chi = \frac{\sigma}{R}$.

$$\sigma(\tau) = \int_0^{\chi} R(\tau)\,\frac{dr}{\sqrt{1 - kr^2}}$$

This has three possible solutions depending on the value of $k$.

$$\sigma(\tau) = \begin{cases} R(\tau)\sin^{-1}\chi & k = +1 \\ R(\tau)\,\chi & k = 0 \\ R(\tau)\sinh^{-1}\chi & k = -1 \end{cases}$$

All three give essentially the same value for $\chi < 0.5$, but at $\chi = 1$, $\sin^{-1}\chi = \frac{\pi}{2}$ while $\sinh^{-1} 1$ is about 0.9. Note that in the case of $k = 1$, $\chi$ has an absolute maximum value of 1. There is no maximum value of $\chi$ for $k \le 0$.
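The three closed-form results can be checked against a direct numerical evaluation of the integral. The following sketch works with the dimensionless ratio $\frac{\sigma}{R}$; the function names are hypothetical and the midpoint rule is just one simple choice of integration scheme.

```python
import math


def distance_factor(chi, k):
    """Closed-form sigma/R for radial coordinate chi and curvature index k."""
    if k == 1:
        return math.asin(chi)    # closed space, requires chi <= 1
    if k == 0:
        return chi               # flat space
    if k == -1:
        return math.asinh(chi)   # open space
    raise ValueError("k must be -1, 0 or 1")


def integrate_numerically(chi, k, steps=100_000):
    """Midpoint-rule estimate of the integral of dr / sqrt(1 - k r^2) from 0 to chi."""
    h = chi / steps
    return sum(h / math.sqrt(1 - k * ((i + 0.5) * h) ** 2) for i in range(steps))


for k in (1, 0, -1):
    exact = distance_factor(0.5, k)
    numeric = integrate_numerically(0.5, k)
    print(f"k={k:+d}: closed form {exact:.6f}, numerical {numeric:.6f}")
```

At $\chi = 0.5$ the three values differ from each other by only a few percent, which is the sense in which they are essentially the same for small $\chi$.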

The change in proper distance with time (the value of $\chi$ is independent of time) is $\frac{d\sigma}{d\tau} = \frac{dR}{d\tau}f(\chi)$ where $f(\chi)$ is one of the three functions above. In each case $f(\chi) = \frac{\sigma}{R}$ (and for small $\chi$ all three reduce to $\chi$), so

$$\frac{d\sigma}{d\tau} = \frac{1}{R}\frac{dR}{d\tau}\,\sigma$$

This is the proper radial velocity, the velocity with which fundamental observers move relative to each other. It is normally written as Hubble’s Law

$$v_p = H(t)\,d_p$$

where $v_p = \frac{d\sigma}{d\tau}$ is the proper radial velocity, $d_p = \sigma$ is the proper radial distance, and $H(t) = \frac{1}{R}\frac{dR}{d\tau}$ is the Hubble parameter, which has the dimension T⁻¹.

This means that the proper radial velocity is proportional to the proper distance.

A changing value of $R(t)$ means that the universe is either expanding or contracting: the distance between fundamental observers and the time taken by light to travel between them is increasing or decreasing. It does not necessarily mean the universe started or will finish at a point. The value does not need to change at a constant rate; it can be asymptotic to a non-zero value so that the universe either started essentially flat with a finite size or will end essentially flat with a finite size. The value of $R(t)$ is unknown. It always occurs either in a ratio $\frac{R(t_1)}{R(t_2)} = a$, which gives a scale factor calculated from the ratio of the same length measured at two different times, one of which may be the present, or as $\frac{dR}{d\tau}$, its rate of change.


Possible Geometries of Space and Space-Time

The geometry of space-time is given by $R_{\mu\nu} - \frac{1}{2}R\,g_{\mu\nu}$ and this is only flat if $R(t)$ has a constant value.

The Friedmann-Robertson-Walker space-time results in nine possible geometries of space

depending on the values of 𝑘 and 𝑅(𝑡). In practice these can be reduced to four possibilities.

If $k = 0$ and $R(t)$ is constant then

$$(ds)^2 = (c\,dt)^2 - R^2(t)\left(\frac{(dr)^2}{1 - kr^2} + r^2(d\theta)^2 + r^2\sin^2\theta\,(d\phi)^2\right)$$

which can be rewritten as

$$(ds)^2 = (c\,dt)^2 - \left((dr)^2 + r^2(d\theta)^2 + r^2\sin^2\theta\,(d\phi)^2\right)$$

where $R(t)$ is absorbed into $r$ and then $r$ has dimension L.

This is the Minkowski metric – space has zero curvature, and the proper distances between

fundamental observers is constant wrt time. Special relativity applies everywhere. Space-time is

flat and unbounded – there is no start or end of the universe and space is infinite. It has no

physical significance because there is no matter, radiation or energy.

If 𝑘 = 0 but 𝑅(𝑡) is not constant space remains flat, but space-time is curved – proper distances

change with time.

If $k = 1$ space has a three-dimensional curved geometry which is a three-dimensional sphere in four-dimensional space-time. The sphere has a proper volume $2\pi^2R^3(t)$. The space is closed (the volume is finite) but it is unbounded: there is no edge. The volume can increase, stay constant or decrease according to the value of $R(t)$. An analogy is the surface of a sphere. The angles of a triangle add to more than $\pi$. The ratio of the circumference of a circle to its radius is $\frac{2\pi\chi}{\sin^{-1}\chi}$, which is less than $2\pi$ since $\sin^{-1}\chi > \chi$ for $0 < \chi \le 1$. The ratio is 4 for $\chi = 1$. There is also a finite maximum distance between two points of half the circumference, and the surface has a finite area given by $4\pi R^2(t)$. This area will change if $R(t)$ increases or decreases, and the sphere may grow from a point, collapse to a point, grow from an initial size or reduce to a final size.

If $k = -1$ space also has a three-dimensional curved geometry but one which is very difficult to imagine. The two-dimensional equivalent is a saddle with opposite curvatures in two perpendicular directions. The angles of a triangle add to less than $\pi$ and the ratio of the circumference of a circle to its radius is $\frac{2\pi\chi}{\sinh^{-1}\chi}$, which is greater than $2\pi$ since $\sinh^{-1}\chi < \chi$. There is no limit to the value of $\chi$ so the space is unbounded with no finite volume, or open. A saddle does not enclose a volume.

The Friedmann-Robertson–Walker metric describes the LHS of Einstein’s field equation.


Energy-Momentum Tensors

The energy-momentum tensors describe the distribution of energy and momentum, and these depend on physical knowledge or assumptions about their distribution, while keeping them sufficiently simple to allow the equations to be solved.

It is assumed that there are three physical components – matter (including dark matter),

radiation and dark energy, and that each of these can be treated as a fluid.

It is assumed that the fluids have a specific density given by 𝜌(𝑡) and a pressure by 𝑝(𝑡), i.e. both

are functions of time, but are uniform throughout space at a given time.

The energy-momentum tensor for such a fluid is

$$[T_{\mu\nu}] = \begin{bmatrix} \rho c^2 & 0 & 0 & 0 \\ 0 & p & 0 & 0 \\ 0 & 0 & p & 0 \\ 0 & 0 & 0 & p \end{bmatrix}$$

The density of matter $\rho_m$ is simply its mass per unit volume, and its energy density is $\rho_m c^2$. If space expands or contracts the energy density changes inversely with the volume, i.e. as the inverse cube of the scale factor, $\rho_m \propto \frac{1}{R^3(t)}$.

The density of radiation is determined from its energy density $\rho_r c^2$ by dividing it by $c^2$. If space expands or contracts the energy density changes as for matter, but additionally the expansion or contraction increases or decreases the wavelength, which decreases or increases the energy, so the actual change is an inverse function of the fourth power, $\rho_r \propto \frac{1}{R^4(t)}$.

It is assumed that the pressure 𝑝Λ and density 𝜌Λ of dark energy are independent of the

expansion or contraction of space, and that the density could be negative.

The current model for the universe starts with a period of inflation when there was no matter or radiation, so there was only dark energy; then matter and radiation were created, and initially radiation dominated. However radiation density decreased as the universe expanded as $\frac{1}{R^4(t)}$, while matter decreased as $\frac{1}{R^3(t)}$, so at some point matter dominated, but eventually both radiation and matter densities decreased to below that of dark energy, which now dominates.

This implies a period of rapid expansion, slower expansion and today an accelerating expansion.
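The change from radiation to matter domination follows directly from the two scaling laws $\rho_m \propto \frac{1}{R^3}$ and $\rho_r \propto \frac{1}{R^4}$. The sketch below uses purely illustrative present-day densities (the radiation value in particular is a hypothetical placeholder, not a figure from the text) to find the scale ratio $\frac{R}{R_0}$ at which the two densities were equal.

```python
def matter_density(rho_m0, scale_ratio):
    """Matter density when R/R0 = scale_ratio, using rho_m ∝ 1/R^3."""
    return rho_m0 / scale_ratio**3


def radiation_density(rho_r0, scale_ratio):
    """Radiation density when R/R0 = scale_ratio, using rho_r ∝ 1/R^4."""
    return rho_r0 / scale_ratio**4


# Illustrative present-day densities in kg m^-3 (hypothetical values).
RHO_M0 = 3e-27
RHO_R0 = 5e-31

# The densities are equal when rho_m0 / x^3 = rho_r0 / x^4, i.e. x = rho_r0 / rho_m0.
equality_ratio = RHO_R0 / RHO_M0

print(f"R/R0 at equality: {equality_ratio:.2e}")
print(f"matter density then:    {matter_density(RHO_M0, equality_ratio):.2e} kg/m^3")
print(f"radiation density then: {radiation_density(RHO_R0, equality_ratio):.2e} kg/m^3")
```

For any smaller scale ratio radiation has the higher density, and for any larger one matter does, whatever present-day values are chosen.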


Equation of State

The equation of state for a component is $p = w\rho c^2$ where $p$ is the pressure, $\rho$ the density and $w$ a number.

The pressure of matter is assumed to be zero, i.e. it is dust rather than a fluid, so $p_m = 0$ and so $w_m = 0$.

The pressure of radiation is $p_r = \frac{\rho_r c^2}{3}$ so $w_r = \frac{1}{3}$.

The pressure of dark energy is $p_\Lambda = -\rho_\Lambda c^2$ so $w_\Lambda = -1$.

These can be written as $p = w\rho c^2$ where $w = 0$ for matter, $\frac{1}{3}$ for radiation and $-1$ for dark energy.

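It is conventional to use $R_0$ for the current value of $R(t)$ and $R_1$ for the value at some other time $t_1$, usually in the past, and likewise $\rho_{m0}$ and $\rho_{m1}$ etc. for the current value and the value at $t_1$. The ratio $\frac{R_1}{R_0}$ also occurs frequently so this is given the dimensionless symbol $a$, which is a function of time.

$a(t_1) = \frac{R_1}{R_0}$ or more usually just $a = \frac{R_1}{R_0}$. Note $a < 1$ for a time in the past in an expanding universe.

Since $\rho_m \propto \frac{1}{R^3(t)}$ and $\rho_r \propto \frac{1}{R^4(t)}$,

$$\rho_{m1} = \rho_{m0}\left(\frac{R_0}{R_1}\right)^3 = a^{-3}\rho_{m0} \quad\text{and}\quad \rho_{r1} = \rho_{r0}\left(\frac{R_0}{R_1}\right)^4 = a^{-4}\rho_{r0}$$

Since $\rho_\Lambda$ and $p_\Lambda$ are constant wrt time no additional symbols are required.

If the values of $\rho_{m0}$, $\rho_{r0}$ and $\rho_\Lambda$ are known today the values at any other time can be calculated provided the function $R(t)$ and hence $a(t)$ is known.

$$\rho_1 = a^{-3}\rho_{m0} + a^{-4}\rho_{r0} + \rho_\Lambda$$

$$p_1 = \frac{a^{-4}c^2}{3}\rho_{r0} - \rho_\Lambda c^2$$

The current measured value of $w$ for dark energy is -1.028±0.032.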


The Friedmann Equations

Starting with the values of $g_{\mu\nu}$ from the Friedmann-Robertson-Walker metric, determining the metric tensor, connection coefficients, the Riemann curvature, Ricci curvature and Ricci scalar, substituting these into the Einstein field equations together with the energy-momentum tensor, cancelling the many terms that vanish and taking symmetry into account results in the two Friedmann equations, which relate the cosmic scale factor $R(t)$ to the gravitational constant $G$, the curvature of space $k$, the speed of light $c$, the density $\rho$ and the pressure $p$:

$$\left(\frac{1}{R}\frac{dR}{dt}\right)^2 = \frac{8\pi G}{3}\rho - \frac{kc^2}{R^2} \qquad\text{the energy equation}$$

$$\frac{1}{R}\frac{d^2R}{dt^2} = -\frac{4\pi G}{3}\left(\rho + \frac{3p}{c^2}\right) \qquad\text{the acceleration equation}$$

The acceleration equation can be derived from the energy equation so the two are not independent.

These can be combined to form

$$\frac{d\rho}{dt} + \left(\rho + \frac{p}{c^2}\right)\frac{3}{R}\frac{dR}{dt} = 0 \qquad\text{the fluid equation}$$

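The first two equations can be written using current values, as indicated by the suffix 0, as

$$\left(\frac{1}{R}\frac{dR}{dt}\right)^2 = \frac{8\pi G}{3}\left(a^{-3}\rho_{m0} + a^{-4}\rho_{r0} + \rho_\Lambda\right) - \frac{kc^2}{R^2}$$

$$\frac{1}{R}\frac{d^2R}{dt^2} = -\frac{4\pi G}{3}\left(a^{-3}\rho_{m0} + 2a^{-4}\rho_{r0} - 2\rho_\Lambda\right)$$

noting that $\rho_\Lambda$ is constant wrt time, that the matter and radiation densities scale as $\frac{1}{R^3}$ and $\frac{1}{R^4}$, and where $a(t) = \frac{R(t)}{R_0} \le 1$ is the ratio of $R(t)$ at time $t$ to its value now.

These can be used to create various cosmological models, some of which have no physical significance, some approximate to previous/future periods in the development of the known universe, and one of which is assumed to be valid for the universe for the last 4 billion years plus.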


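De Sitter Model

This model is for flat space containing only dark energy, i.e. such as during inflation and in the distant future when space has expanded so much that matter and radiation have negligible densities. Note that $R$ is used in place of $R(t)$, i.e. $R$ is a function of time, being the cosmic scale factor.

With $k = 0$ and negligible matter and radiation densities the energy equation

$$\left(\frac{1}{R}\frac{dR}{dt}\right)^2 = \frac{8\pi G}{3}\left(a^{-3}\rho_{m0} + a^{-4}\rho_{r0} + \rho_\Lambda\right) - \frac{kc^2}{R^2}$$

becomes

$$\left(\frac{1}{R}\frac{dR}{dt}\right)^2 = \frac{8\pi G\rho_\Lambda}{3} \quad\text{or}\quad \frac{dR}{dt} = \sqrt{\frac{8\pi G\rho_\Lambda}{3}}\,R$$

and solving the differential equation gives

$$R(t) = R_0\exp\left(\sqrt{\frac{8\pi G\rho_\Lambda}{3}}\,(t - t_0)\right)$$

where $R_0$ is the current value of $R(t)$, i.e. $R(t_0)$.

$$H(t) = \frac{1}{R}\frac{dR}{dt} = \sqrt{\frac{8\pi G\rho_\Lambda}{3}} \qquad\text{the Hubble parameter}$$

giving

$$R(t) = R_0\exp\big(H_0(t - t_0)\big)$$

where $H_0$ is the Hubble constant, the value of $H(t)$ now. The Hubble constant is constant because it is the value now, and it has the same value everywhere in the universe now. In general the value of $H(t)$ varies with time, although in this model it does not.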

This was the first model to describe an expanding universe.

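Pure Radiation Model

This is applicable to the period after inflation when radiation dominated; matter and dark energy are ignored. It assumes flat space.

$$\frac{dR}{dt} = \sqrt{\frac{8\pi G\rho_{r0}}{3}}\,\frac{R_0^2}{R} \quad\text{and}\quad R(t) = R_0\sqrt{2t\sqrt{\frac{8\pi G\rho_{r0}}{3}}}$$

$$H_0 = \left.\frac{1}{R}\frac{dR}{dt}\right|_{R = R_0} = \sqrt{\frac{8\pi G\rho_{r0}}{3}} \qquad\text{the Hubble parameter evaluated now}$$

giving

$$R(t) = R_0\sqrt{2tH_0}$$

Also since

$$H(t) = \frac{1}{R_0\sqrt{2tH_0}}\,\frac{d}{dt}\left(R_0\sqrt{2tH_0}\right) = \frac{1}{R_0\sqrt{2tH_0}}\cdot\frac{R_0\sqrt{2H_0}}{2\sqrt{t}} = \frac{1}{2t}$$

$$H_0 = \frac{1}{2t_0} \qquad\text{the Hubble constant}$$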
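The result $H(t) = \frac{1}{2t}$ can be checked numerically by differentiating $R(t) = R_0\sqrt{2tH_0}$ with a central difference. This is a minimal sketch; the value used for $H_0$ is an illustrative assumption (roughly 70 km s⁻¹ Mpc⁻¹ expressed in s⁻¹), not a figure from the text, and $R_0$ is set to 1.

```python
import math

H0 = 2.27e-18  # illustrative Hubble constant in s^-1 (assumed value)


def scale_factor_radiation(t, h0=H0, r0=1.0):
    """R(t) = R0 * sqrt(2 t H0) for the pure radiation model."""
    return r0 * math.sqrt(2 * t * h0)


def hubble_numeric(t, h0=H0, dt_fraction=1e-6):
    """Estimate (1/R) dR/dt with a central finite difference."""
    dt = t * dt_fraction
    r_plus = scale_factor_radiation(t + dt, h0)
    r_minus = scale_factor_radiation(t - dt, h0)
    return (r_plus - r_minus) / (2 * dt) / scale_factor_radiation(t, h0)


t0 = 1 / (2 * H0)  # age of the universe in this model, in seconds
print(f"R(t0)/R0 = {scale_factor_radiation(t0):.6f}")
print(f"H(t0) numerically = {hubble_numeric(t0):.4e} s^-1, H0 = {H0:.4e} s^-1")
```

The scale factor returns to $R_0$ at $t_0 = \frac{1}{2H_0}$ and the numerical Hubble parameter there agrees with $H_0$, as the derivation requires.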


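Einstein de Sitter Model

In this model matter dominates and radiation and dark energy are ignored. It assumes flat space. It is also known as the critical model because it is between closed models with $k = 1$ and open models with $k = -1$, and was the standard model until the 1990s.

$$\frac{dR}{dt} = \sqrt{\frac{8\pi G\rho_{m0}}{3}}\,\sqrt{\frac{R_0^3}{R}} \quad\text{and}\quad R(t) = R_0\sqrt[3]{\left(\frac{3}{2}\right)^2\frac{8\pi G\rho_{m0}}{3}\,t^2}$$

$$H_0 = \left.\frac{1}{R}\frac{dR}{dt}\right|_{R = R_0} = \sqrt{\frac{8\pi G\rho_{m0}}{3}} \qquad\text{the Hubble parameter evaluated now}$$

giving

$$R(t) = R_0\left(\frac{3}{2}H_0 t\right)^{2/3}$$

Also since

$$H(t) = \frac{2}{3t}$$

$$H_0 = \frac{2}{3t_0} \qquad\text{the Hubble constant}$$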

Recent observations mean it is now obsolete.
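The Einstein de Sitter solution can be verified by integrating $\frac{dR}{dt} = H_0\sqrt{\frac{R_0^3}{R}}$ numerically and comparing with $R_0\left(\frac{3}{2}H_0t\right)^{2/3}$. The sketch below works in dimensionless units with $R_0 = 1$ and time measured in units of $\frac{1}{H_0}$; the fourth-order Runge-Kutta scheme and the small starting time (used to avoid the singular point $R = 0$) are choices made for this illustration.

```python
def analytic_scale_factor(t):
    """R(t)/R0 = (3 t / 2)^(2/3) with time in units of 1/H0."""
    return (1.5 * t) ** (2 / 3)


def rate(r):
    """dR/dt for the matter-only flat model with H0 = R0 = 1."""
    return (1.0 / r) ** 0.5


def rk4_integrate(t_start, t_end, r_start, steps=10_000):
    """Integrate dR/dt = rate(R) from t_start to t_end with classic RK4."""
    h = (t_end - t_start) / steps
    r = r_start
    for _ in range(steps):
        k1 = rate(r)
        k2 = rate(r + 0.5 * h * k1)
        k3 = rate(r + 0.5 * h * k2)
        k4 = rate(r + h * k3)
        r += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return r


t_start = 1e-3   # start slightly after t = 0, where R = 0 and the rate diverges
t_now = 2 / 3    # present epoch, from H0 = 2 / (3 t0) with H0 = 1
r_now = rk4_integrate(t_start, t_now, analytic_scale_factor(t_start))

print(f"numerical R(t0)/R0 = {r_now:.6f}")
```

The integration reaches $R = R_0$ at $t_0 = \frac{2}{3H_0}$ to within the numerical error, consistent with the closed-form solution.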


Density Parameters

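Since $H^2(t) = \frac{8\pi G}{3}\rho(t)$ for all three models where $k = 0$, the critical density is defined as $\rho_c(t) = \frac{3H^2(t)}{8\pi G}$. The current value $\rho_{c0}$ is $1.06\times10^{-26}$ kg m⁻³. Note it varies with time.

This value is used in the other models, being used to divide the density to give a dimensionless density parameter.

$$\Omega_m(t) = \frac{\rho_m(t)}{\rho_c(t)} = \frac{8\pi G\rho_m(t)}{3H^2(t)}$$

$$\Omega_r(t) = \frac{\rho_r(t)}{\rho_c(t)} = \frac{8\pi G\rho_r(t)}{3H^2(t)}$$

$$\Omega_\Lambda(t) = \frac{\rho_\Lambda}{\rho_c(t)} = \frac{\Lambda c^2}{3H^2(t)}$$

Note $\Omega_\Lambda$ is a function of time even though $\rho_\Lambda$ is constant.

$$\Omega_k(t) = -\frac{kc^2}{R^2(t)H^2(t)}$$

$k$ is not a density; this term simplifies the equation and is called the curvature density. In a flat universe the value is zero since $k = 0$.

$\left(\frac{1}{R}\frac{dR}{dt}\right)^2 = \frac{8\pi G}{3}\rho - \frac{kc^2}{R^2}$ can be rewritten as

$$\Omega_m(t) + \Omega_r(t) + \Omega_\Lambda(t) + \Omega_k(t) = 1$$

or

$$\Omega_t(t) = \Omega_m(t) + \Omega_r(t) + \Omega_\Lambda(t) = 1 - \Omega_k(t)$$

If $\Omega_m(t) + \Omega_r(t) + \Omega_\Lambda(t) = 1$ then $k = 0$ and space is flat.

If $\Omega_m(t) + \Omega_r(t) + \Omega_\Lambda(t) < 1$ then $k < 0$ and space is open.

If $\Omega_m(t) + \Omega_r(t) + \Omega_\Lambda(t) > 1$ then $k > 0$ and space is closed.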

The relative values of Ω(𝑡) and their sum result in very different models.
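As a rough numerical check of the definitions above, the critical density and the flat/open/closed classification can be sketched in Python. The function names, the value of $G$ and the tolerance used to decide "flat" are assumptions made for this illustration. With $H_0 = 67.74$ km s$^{-1}$ Mpc$^{-1}$ the critical density evaluates to roughly $8.6\times10^{-27}$ kg m$^{-3}$; a larger $H_0$ near 75 gives about $1.06\times10^{-26}$ kg m$^{-3}$.

```python
import math

G = 6.674e-11           # gravitational constant, m^3 kg^-1 s^-2 (assumed value)
KM_PER_MPC = 3.0857e19  # kilometres in one megaparsec


def critical_density(h0_km_s_mpc):
    """rho_c = 3 H^2 / (8 pi G) in kg m^-3, for H in km/s/Mpc."""
    h_per_second = h0_km_s_mpc / KM_PER_MPC
    return 3.0 * h_per_second ** 2 / (8.0 * math.pi * G)


def geometry(omega_m, omega_r, omega_lambda, tol=1e-3):
    """Classify space from the sum of the density parameters.

    Omega_k = 1 - (Omega_m + Omega_r + Omega_Lambda); tol is an
    illustrative threshold below which |Omega_k| is treated as zero.
    """
    omega_k = 1.0 - (omega_m + omega_r + omega_lambda)
    if abs(omega_k) <= tol:
        return "flat"
    # Omega_k > 0 means the sum is below 1, i.e. k < 0 (open).
    return "open" if omega_k > 0 else "closed"


if __name__ == "__main__":
    print(critical_density(67.74))          # kg m^-3
    print(geometry(0.3089, 0.0, 0.6911))    # 2015 best-estimate values
```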


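Einstein Model

This is a closed model ($k = 1$) of a static universe containing only matter and dark energy, which Einstein believed to be the case when developing general relativity (dark energy as such was not known but it is the name now given to the cosmological constant).

$$\frac{\mathrm{d}R}{\mathrm{d}t} = \frac{\mathrm{d}^2R}{\mathrm{d}t^2} = 0 \quad\text{so}\quad R(t) = R_0$$

This requires $\Omega_\Lambda = \frac{\Omega_m}{2}$ (that is, $\rho_\Lambda = \frac{\rho_{m0}}{2}$), and this value is given the symbol $\Omega_{\Lambda E} = \frac{\Omega_m}{2}$, since the density of dark energy must be sufficient to balance the mass (the repulsion of dark energy equals the attraction of mass for the universe to be static).

$$H_0 = 0 \quad\text{and}\quad H(t) = 0$$

From

$$\left(\frac{1}{R}\frac{\mathrm{d}R}{\mathrm{d}t}\right)^2 = \frac{8\pi G}{3}\rho - \frac{kc^2}{R^2}$$

$$\frac{8\pi G}{3}\left(\rho_{m0}\left(\frac{R_0}{R(t)}\right)^3 + \rho_\Lambda\right) - \frac{kc^2}{R^2} = 0$$

With $k = 1$, $R(t) = R_0$ and $\rho_\Lambda = \frac{\rho_{m0}}{2}$ (i.e. $\Omega_\Lambda = \frac{\Omega_m}{2}$)

$$\frac{8\pi G}{3}\left(\frac{3\rho_{m0}}{2}\right) = \frac{c^2}{R_0^2}$$

$$R_0 = \frac{c}{\sqrt{4\pi G\rho_{m0}}}$$

$R_0 = 20{,}000$ Mly based on $\rho_{m0} = 3\times10^{-27}$ kg m$^{-3}$ which was the value at that time.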
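The quoted radius can be reproduced from $R_0 = c/\sqrt{4\pi G\rho_{m0}}$. The snippet below is a sketch with assumed values for $c$, $G$ and the length of a light year; only the density comes from the text.

```python
import math

C = 2.998e8             # speed of light, m/s (assumed value)
G = 6.674e-11           # gravitational constant, m^3 kg^-1 s^-2 (assumed value)
METRES_PER_LY = 9.461e15


def einstein_radius_mly(rho_m0):
    """Radius of Einstein's static universe in millions of light years.

    rho_m0 is the matter density in kg m^-3.
    """
    r0_metres = C / math.sqrt(4.0 * math.pi * G * rho_m0)
    return r0_metres / METRES_PER_LY / 1.0e6


if __name__ == "__main__":
    # Density quoted in the text for Einstein's time.
    print(round(einstein_radius_mly(3e-27)), "Mly")
```

Because $R_0 \propto \rho_{m0}^{-1/2}$, using a denser universe shrinks the radius: quadrupling the density halves it.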

A modification of this is the Eddington-Lemaître model which also has 𝑘 = 1 and starts with the

Einstein model but the universe expands from the steady state which is unstable.

This in turn developed into the Lemaître model where ΩΛ > ΩΛ𝐸 which starts from zero

expanding rapidly, decelerating, then accelerating again.

An expanding universe requires ΩΛ > ΩΛ𝐸 .


The Lambda-CDM Model

The model in use today, often referred to as ΛCDM or the Lambda Cold Dark Matter model, is based on the Friedmann-Robertson-Walker metric

$$(\mathrm{d}s)^2 = (c\,\mathrm{d}t)^2 - R^2(t)\left(\frac{(\mathrm{d}r)^2}{1-kr^2} + r^2(\mathrm{d}\theta)^2 + r^2\sin^2\theta\,(\mathrm{d}\phi)^2\right)$$

and the best estimate values (2015) $\Omega_{m0} = 0.3089$, $\Omega_{r0} \cong 0$, $\Omega_{\Lambda0} = 0.6911$ so $k = 0$, $t_0 = 13.799$ Gy, $H_0 = 67.74$ km s$^{-1}$ Mpc$^{-1}$.

Since $\Omega_{\Lambda0} > \frac{\Omega_{m0}}{2}$ this is an accelerating, expanding universe and since $k = 0$ space is flat.

If the universe is expanding there is a cosmological redshift (not to be confused with Doppler and gravitational redshifts) usually given by

$$z = \frac{\lambda_\text{observed} - \lambda_\text{emitted}}{\lambda_\text{emitted}} \quad\text{or}\quad z = \frac{f_\text{emitted} - f_\text{observed}}{f_\text{observed}}$$

Cosmological redshift is a result of space expanding (the wavelength is stretched) and is not due to motion or gravitation.

The emitted values can be determined by Earth based experiments and the observed values can

be measured with good precision. The unknown is the time at which the photons were emitted

and/or the distance – these must be estimated.

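From the Robertson-Walker metric, given $\mathrm{d}\theta = \mathrm{d}\phi = 0$ and $\mathrm{d}s = 0$ for light,

$$c^2(\mathrm{d}t)^2 - \frac{R^2(t)}{1-kr^2}(\mathrm{d}r)^2 = 0 \quad\text{or}\quad \frac{c}{R(t)}\,\mathrm{d}t = \frac{1}{\sqrt{1-kr^2}}\,\mathrm{d}r$$

If the start of the light wave was emitted at $t_\text{em}$ at a distance $\chi$ and observed at $t_\text{ob}$

$$\int_{t_\text{em}}^{t_\text{ob}} \frac{c}{R(t)}\,\mathrm{d}t = \int_0^\chi \frac{1}{\sqrt{1-kr^2}}\,\mathrm{d}r$$

If the end of the wave was emitted at $t_\text{em} + \delta t_\text{em}$ ($\delta t_\text{em}$ being the period) and is observed at $t_\text{ob} + \delta t_\text{ob}$

$$\int_{t_\text{em}+\delta t_\text{em}}^{t_\text{ob}+\delta t_\text{ob}} \frac{c}{R(t)}\,\mathrm{d}t = \int_0^\chi \frac{1}{\sqrt{1-kr^2}}\,\mathrm{d}r$$

the distance being assumed to be unchanged during the short period $\delta t$.

Since the RHSs are equal

$$\int_{t_\text{em}}^{t_\text{ob}} \frac{c}{R(t)}\,\mathrm{d}t = \int_{t_\text{em}+\delta t_\text{em}}^{t_\text{ob}+\delta t_\text{ob}} \frac{c}{R(t)}\,\mathrm{d}t$$

$$\int_{t_\text{em}}^{t_\text{em}+\delta t_\text{em}} \frac{c}{R(t)}\,\mathrm{d}t + \int_{t_\text{em}+\delta t_\text{em}}^{t_\text{ob}} \frac{c}{R(t)}\,\mathrm{d}t = \int_{t_\text{em}+\delta t_\text{em}}^{t_\text{ob}} \frac{c}{R(t)}\,\mathrm{d}t + \int_{t_\text{ob}}^{t_\text{ob}+\delta t_\text{ob}} \frac{c}{R(t)}\,\mathrm{d}t$$

$$\int_{t_\text{em}}^{t_\text{em}+\delta t_\text{em}} \frac{c}{R(t)}\,\mathrm{d}t = \int_{t_\text{ob}}^{t_\text{ob}+\delta t_\text{ob}} \frac{c}{R(t)}\,\mathrm{d}t$$

$R(t)$ can be assumed to be constant during the short period $\delta t$ so

$$\frac{\delta t_\text{em}}{R_\text{em}} = \frac{\delta t_\text{ob}}{R_\text{ob}} \quad\text{or}\quad \frac{\delta t_\text{em}}{\delta t_\text{ob}} = \frac{R_\text{em}}{R_\text{ob}}$$

so

$$\frac{\lambda_\text{em}}{\lambda_\text{ob}} = \frac{R_\text{em}}{R_\text{ob}} \quad\text{or}\quad \frac{f_\text{ob}}{f_\text{em}} = \frac{R_\text{em}}{R_\text{ob}}$$

and so

$$z = \frac{f_\text{em}}{f_\text{ob}} - 1 = \frac{R_\text{ob}}{R_\text{em}} - 1$$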

Note that the value measured will include any gravitational redshift or Doppler redshift, but

these are insignificant for very large distances.
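The two definitions of $z$ and the result $1 + z = R_\text{ob}/R_\text{em}$ can be illustrated with a short Python sketch. The function names are introduced here for illustration, and the example line (hydrogen Lyman-alpha at a rest wavelength of about 121.567 nm, observed at exactly twice that) is a made-up observation rather than a measurement from the text.

```python
C = 2.998e8  # speed of light, m/s (assumed value)


def redshift_from_wavelengths(lambda_observed, lambda_emitted):
    """z = (lambda_ob - lambda_em) / lambda_em."""
    return (lambda_observed - lambda_emitted) / lambda_emitted


def redshift_from_frequencies(f_emitted, f_observed):
    """z = (f_em - f_ob) / f_ob, equivalent to the wavelength form."""
    return (f_emitted - f_observed) / f_observed


def scale_factor_ratio(z):
    """R_em / R_ob = 1 / (1 + z): how much smaller the universe was at emission."""
    return 1.0 / (1.0 + z)


if __name__ == "__main__":
    lam_em = 121.567e-9      # rest wavelength in metres
    lam_ob = 2.0 * lam_em    # hypothetical observed wavelength
    z = redshift_from_wavelengths(lam_ob, lam_em)
    print(z, scale_factor_ratio(z))
```

In this example the wavelength has doubled, so $z = 1$ and the scale factor at emission was half its value at observation.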


Cosmological Redshift

A measured value for the cosmological redshift 𝑧 and its relationship to distance is important to

establish the size, expansion and age of the universe.

The cosmological redshift is due to the expansion of the universe and is not related to Doppler

or gravitational effects, but when redshift is measured the value will include Doppler and

gravitational components. The gravitational component is significant when measuring the

redshift of stars, but the cosmological component is then insignificant. The Doppler component

can be significant for stars and nearby galaxies, but is insignificant for distant galaxies.

Measuring distance to galaxies can only be done indirectly. The most common approach is to

base it on the apparent luminosity (brightness or flux) of objects. The distance will then be a

function of the actual luminosity (which must be estimated) and the measured luminosity.

If a source at a distance $d$ has luminosity $L$ the observed flux is $F = \frac{L}{4\pi d^2}$ in a static flat empty universe.

If the source is at the radial co-moving coordinate $\chi$, then the flux received by an observer at $t_\text{ob}$ is

$$F = \frac{L}{4\pi R^2(t_\text{ob})\chi^2}$$

If the radiation is observed for time $\delta t_\text{ob}$, it will have been emitted for time $\delta t_\text{em}$.

$$\frac{\delta t_\text{em}}{\delta t_\text{ob}} = \frac{R_\text{em}}{R_\text{ob}} = \frac{1}{1+z} \qquad z = \frac{R_\text{ob}}{R_\text{em}} - 1$$

This means that the observed flux will be reduced by a factor $\frac{1}{1+z}$. This is sometimes given the symbol $a$ since it occurs frequently.

$$a = \frac{R_\text{em}}{R_0} = \frac{1}{1+z}$$

This is the same $a$ as in $a = \frac{R_1}{R_0}$, $a$ being common in calculations, but $z$ is often used as a measure of time before the present and/or distance, the two being linked.

Additionally each wavelength will be expanded by the same factor which reduces its energy so

$$F = \frac{L}{4\pi R^2(t_\text{ob})\chi^2(1+z)^2}$$

The next step is to find the relationship between 𝑅(𝑡ob)𝜒 and 𝑧.

For very small values of $z \ll 1$ the function $R(t)$ can be written as a power series in $H_0(t_0 - t)$, remembering that the suffix 0 indicates the current value and that $H$ is the Hubble parameter, which has the same value everywhere in the universe at time $t$ but is a function of time. $t_0 - t$ is called the lookback time: time measured backwards from now.

$$R(t) \cong R(t_0)\left(1 - H_0(t_0 - t) - \frac{1}{2}q_0H_0^2(t_0 - t)^2 + \cdots\right)$$

where $q_0$ is known as the deceleration parameter (from a time when it was assumed that the rate of expansion was decreasing; today $q_0$ is believed to be negative, approximately $-0.6$, i.e. the expansion is accelerating). It is defined as

$$q_0 = -\frac{1}{H^2(t)}\,\frac{1}{R(t)}\,\frac{\mathrm{d}^2R}{\mathrm{d}t^2} \quad\text{evaluated at } t = t_0$$

This means that distance

$$d \cong \frac{c}{H_0}\left(z + \frac{1-q_0}{2}z^2 + \cdots\right)$$

since $d = R(t_\text{ob})\chi(1+z)$, and to a first approximation $d \cong \frac{c}{H_0}z$ for $z < 0.1$ (astronomical distances). Cosmological distances start at $z = 0.1$, equivalent to approximately 1 Gly in distance or 1 Ga in lookback time.
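To see how much the second-order term matters, the series $d \cong \frac{c}{H_0}\left(z + \frac{1-q_0}{2}z^2\right)$ can be evaluated directly. The sketch below uses assumed default values ($H_0 = 67.74$ km s$^{-1}$ Mpc$^{-1}$ and $q_0 = -0.6$, both quoted in these notes) and a function name introduced for illustration; it truncates the series after the quadratic term, so it should only be trusted for small $z$.

```python
C_KM_S = 299792.458  # speed of light in km/s


def distance_low_z_mpc(z, h0=67.74, q0=-0.6, second_order=True):
    """Approximate distance in Mpc from the low-redshift series.

    h0 is in km/s/Mpc. With second_order=False only the linear
    Hubble-law term c z / H0 is kept.
    """
    series = z
    if second_order:
        series += 0.5 * (1.0 - q0) * z ** 2
    return C_KM_S / h0 * series


if __name__ == "__main__":
    for z in (0.001, 0.01, 0.1):
        linear = distance_low_z_mpc(z, second_order=False)
        quadratic = distance_low_z_mpc(z)
        print(z, round(linear, 2), round(quadratic, 2))
```

At $z = 0.1$ the quadratic term changes the answer by about 8%, whereas at $z = 0.01$ the change is below 1%, which is consistent with treating $z < 0.1$ as the range where the linear law is a usable first approximation.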


The original measurements by Hubble which established an expanding universe were for galaxies with $z < 0.004$ so peculiar motions of the galaxies meant that there was potentially a large Doppler component (resulting in blue shifts in some cases), but plotting $z$ against $d$ and calculating the gradient gives a value for $H_0$ of about 67.31 km s$^{-1}$ Mpc$^{-1}$. Note that the two distance dimensions cancel so the true value and dimension is about $2.19\times10^{-18}$ s$^{-1}$, but it is normally expressed in these units, which give the recession velocity at a distance of 1 Mpc. Note that the term recession velocity is in common use although it is not a recession velocity but a rate of expansion.

If expressed as per second, it is the reciprocal of the age of the Universe assuming that $H$ was constant wrt time. The current value for the age is 13.799 Ga, slightly lower than that from taking the reciprocal of the Hubble constant.

The Hubble distance $d_H = \frac{c}{H_0}$, which is 14.44 Gly or 4426 Mpc, is the distance to a galaxy that is receding at the speed of light. This corresponds to a redshift of $z = 1.6$. Galaxies with $z < 1.6$ are receding at less than the speed of light. Those with a redshift $z > 1.6$ are receding at a speed greater than the speed of light. This does not conflict with special relativity because special relativity does not apply since the observer is not in the local frame, no information is travelling faster than light, and it is space that is expanding, not the galaxy moving.

The value $H(t) = \frac{1}{R}\frac{\mathrm{d}R(t)}{\mathrm{d}t} = \frac{\dot{R}}{R}$ is called the Hubble parameter and its value is a function of time.

The current value $H_0 = H(t_0)$ is called the Hubble constant.

Note that there are several different definitions of distance in use.

For small redshifts $z < 0.1$ (astronomical distances)

$$H_0 = \frac{v}{d} \quad\text{and}\quad v = cz \quad\text{so}\quad H_0 = \frac{cz}{d}$$

where $d$ is distance and $v$ is the apparent recession velocity.

For larger redshifts great care has to be taken in the definition of distance as there are several definitions in use.

Note that there are different values for $H_0$ depending on how it has been estimated. 67.31 is used in the following sections unless stated otherwise, but there is a range of values from 65 to 75.
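Because $H_0$ is quoted in km s$^{-1}$ Mpc$^{-1}$ but appears in formulas as an inverse time, a small conversion helper is useful. The sketch below is illustrative only: the function names and conversion constants are assumptions, and 67.74 is used because it reproduces the 4426 Mpc Hubble distance quoted above.

```python
C_KM_S = 299792.458      # speed of light in km/s
KM_PER_MPC = 3.0857e19   # kilometres in one megaparsec
GLY_PER_MPC = 3.2616e-3  # billions of light years in one megaparsec


def hubble_constant_per_second(h0_km_s_mpc):
    """Convert H0 from km/s/Mpc to s^-1 (the two lengths cancel)."""
    return h0_km_s_mpc / KM_PER_MPC


def hubble_distance_mpc(h0_km_s_mpc):
    """Hubble distance d_H = c / H0 in Mpc."""
    return C_KM_S / h0_km_s_mpc


def hubble_distance_gly(h0_km_s_mpc):
    """Hubble distance d_H = c / H0 in Gly."""
    return hubble_distance_mpc(h0_km_s_mpc) * GLY_PER_MPC


if __name__ == "__main__":
    for h0 in (65.0, 67.31, 67.74, 75.0):
        print(h0, hubble_constant_per_second(h0),
              round(hubble_distance_mpc(h0)), round(hubble_distance_gly(h0), 2))
```

Running it over the 65 to 75 range shows that the Hubble distance shifts by roughly 15% across the quoted spread of estimates, so any derived distance inherits that uncertainty.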


Time Dilation

If two photons are emitted at time $t_1$ a short time $\delta t_1$ apart, and are detected at time $t_0$, the initial distance between the photons was $c\,\delta t_1$, but this distance has changed by the factor $\frac{R_0}{R_1}$.

This ratio is the expansion which is given by $1 + z$ where $z$ is the redshift.

$$\frac{R_0}{R_1} = 1 + z = \frac{1}{a}$$

So the distance between the photons now is $\frac{R_0}{R_1}c\,\delta t_1$.

Therefore the second photon will arrive $\frac{R_0}{R_1}\delta t_1$ later, and time has been dilated by the factor $\frac{R_0}{R_1}$ or $\frac{1}{a}$.

Note that $a$ is a function of time. In an expanding universe it has the value 1 now, but a value less than 1 when referring to an event in the past.

Measurements of the decay rates of supernovae for small and large redshifts give the ratio $a = \frac{R_1}{R_0}$ as $(1 + z)^{-0.97\pm0.1}$.

Note that the calculations assume that the emitted wavelengths of quantum jumps have not changed with time. The difference between two hyperfine wavelengths is proportional to $\alpha^2$ where $\alpha \cong \frac{1}{137}$. Its value is $\frac{e^2}{4\pi\varepsilon_0\hbar c}$, all of which are believed to be constant. Any change in the emitted wavelengths implies that $\alpha$ must also change.


Co-moving Distance

In special relativity the proper distance 𝜎 is the distance between two objects measured at a

single time by a local observer, and is the smallest value for the distance.

In cosmology the spatial universe is defined by a three dimensional coordinate system (𝑟, 𝜃, 𝜙)

based on an arbitrary origin, and each fundamental observer has a defined coordinate triplet

value which is constant for the life of the universe. The coordinate distance between two

fundamental observers is the calculated distance between the two coordinate triplets and by

definition this remains constant. Placing the origin on one fundamental observer and aligning

the r axis with the other gives a coordinate distance 𝑟. There is no coordinate velocity between

the fundamental observers although the space between them is increasing (or decreasing). This

can be interpreted as a recession velocity, but here velocity does not mean a change in distance

but a change in space. (Note 𝑟 is dimensionless and so is not a distance in the normal sense of

the word - 𝑑𝑐 is the co-moving distance)

Galaxy clusters, galaxies, stars and planets all move relative to the fundamental observers. The

difference in velocity between an observer and a fundamental observer is called a peculiar

velocity and is ignored in the following.

The co-moving distance 𝑑𝑐 is the proper distance between two objects at time 𝑡0 (i.e. now) as

measured by fundamental observers or the coordinate distance (which is constant wrt time)

multiplied by 𝑅0, the value of 𝑅(𝑡) now, assuming a flat space-time. If curvature is taken into

account

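$$\mathrm{d}d_c = \frac{R_0}{\sqrt{1-kr^2}}\,\mathrm{d}r \qquad (\mathrm{d}\text{ is an operator, } d \text{ is a variable})$$

$$d_c = \int_0^r \frac{R_0}{\sqrt{1-kr^2}}\,\mathrm{d}r$$

which for $k = 0$ gives $R_0r$. Otherwise

$$d_c = \begin{cases} R_0\sin^{-1}r & k = 1 \\ R_0\sinh^{-1}r & k = -1 \end{cases}$$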

However 𝑟 is not an observable quantity (i.e. there is no practical way to measure it). Redshift 𝑧

can be accurately measured. Assuming a photon is sent from the distant object and is observed

on Earth, then from the Robertson-Walker metric

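$$c^2(\mathrm{d}t)^2 - \frac{R^2(t)}{1-kr^2}(\mathrm{d}r)^2 = 0 \quad\text{since } (\mathrm{d}s)^2 = 0 \text{ for photons}$$

$$\frac{c}{R(t)}\,\mathrm{d}t = \frac{1}{\sqrt{1-kr^2}}\,\mathrm{d}r$$

$$\frac{R_0}{\sqrt{1-kr^2}}\,\mathrm{d}r = R_0\frac{c}{R(t)}\,\mathrm{d}t$$

The photon is coming from the distant object, so its coordinate falls from $r$ to 0 as the time rises from $t$ to $t_0$. Integrating along its path and taking the distance as positive

$$R_0\int_0^r \frac{1}{\sqrt{1-kr^2}}\,\mathrm{d}r = c\int_t^{t_0} \frac{R_0}{R(t)}\,\mathrm{d}t$$

$$d_c = c\int_t^{t_0} \frac{R_0}{R(t)}\,\mathrm{d}t = -c\int_z^0 \frac{1}{H(t)}\,\mathrm{d}z$$

The change of variable from $t$ to $z$ uses $R = \frac{R_0}{1+z}$:

$$\frac{R_0}{R}\,\mathrm{d}t = \frac{R_0}{R}\frac{\mathrm{d}t}{\mathrm{d}R}\frac{\mathrm{d}R}{\mathrm{d}z}\,\mathrm{d}z = \frac{R_0}{R}\frac{\mathrm{d}t}{\mathrm{d}R}\frac{\mathrm{d}}{\mathrm{d}z}\left(\frac{R_0}{1+z}\right)\mathrm{d}z = \frac{R_0}{R}\frac{\mathrm{d}t}{\mathrm{d}R}\left(-\frac{R_0}{(1+z)^2}\right)\mathrm{d}z$$

$$= \frac{R_0}{R}\frac{\mathrm{d}t}{\mathrm{d}R}\left(-\frac{R_0R^2}{R_0^2}\right)\mathrm{d}z = -\frac{R_0}{R}\frac{\mathrm{d}t}{\mathrm{d}R}\frac{R^2}{R_0}\,\mathrm{d}z = -\frac{R}{\frac{\mathrm{d}R}{\mathrm{d}t}}\,\mathrm{d}z = -\frac{1}{H}\,\mathrm{d}z$$

$$d_c = c\int_0^z \frac{1}{H(t)}\,\mathrm{d}z$$

This can be written as

$$d_c = \frac{c}{H_0}\int_0^z \frac{1}{H(z)}\,\mathrm{d}z$$

since $H(t) = H_0H(z)$ where $z = z(t)$ (so $H(z)$ here denotes the dimensionless ratio $H(t)/H_0$).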

An expression must be found for calculating 𝐻(𝑧) from known values.

Starting from

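$$1 = \Omega_m(t) + \Omega_r(t) + \Omega_\Lambda(t) + \Omega_k(t)$$

$$= \frac{\Omega_m(t)}{\Omega_{m0}}\Omega_{m0} + \frac{\Omega_r(t)}{\Omega_{r0}}\Omega_{r0} + \frac{\Omega_\Lambda(t)}{\Omega_{\Lambda0}}\Omega_{\Lambda0} + \frac{\Omega_k(t)}{\Omega_{k0}}\Omega_{k0}$$

where the subscript 0 refers to now.

$\Omega_m(t) = \frac{8\pi G\rho_m(t)}{3H^2(t)}$ so

$$\frac{\Omega_m(t)}{\Omega_{m0}} = \frac{\rho_m}{\rho_{m0}}\left(\frac{H_0}{H(t)}\right)^2 = \left(\frac{R_0}{R(t)}\right)^3\left(\frac{H_0}{H(t)}\right)^2 = a^{-3}\left(\frac{H_0}{H(t)}\right)^2$$

using $a = \frac{R(t)}{R_0}$ and $\frac{\rho_m}{\rho_{m0}} = \left(\frac{R_0}{R(t)}\right)^3$ from the conservation of mass.

The $\Omega_r(t)$ term is assumed to be zero.

$\Omega_\Lambda(t) = \frac{\Lambda c^2}{3H^2(t)}$ so

$$\frac{\Omega_\Lambda(t)}{\Omega_{\Lambda0}} = \left(\frac{H_0}{H(t)}\right)^2$$

$\Omega_k(t) = -\frac{kc^2}{R^2(t)H^2(t)}$ so

$$\frac{\Omega_k(t)}{\Omega_{k0}} = \left(\frac{R_0}{R(t)}\right)^2\left(\frac{H_0}{H(t)}\right)^2 = a^{-2}\left(\frac{H_0}{H(t)}\right)^2$$

$$1 = a^{-3}\left(\frac{H_0}{H(t)}\right)^2\Omega_{m0} + 0\cdot\Omega_{r0} + \left(\frac{H_0}{H(t)}\right)^2\Omega_{\Lambda0} + a^{-2}\left(\frac{H_0}{H(t)}\right)^2\Omega_{k0}$$

$$\left(\frac{H(t)}{H_0}\right)^2 = a^{-3}\Omega_{m0} + \Omega_{\Lambda0} + a^{-2}\Omega_{k0}$$

$$H(t) = H_0\sqrt{a^{-3}\Omega_{m0} + \Omega_{\Lambda0} + a^{-2}\Omega_{k0}} = H_0\sqrt{(1+z)^3\Omega_{m0} + \Omega_{\Lambda0} + (1+z)^2\Omega_{k0}} \qquad a = \frac{1}{1+z}$$

$$H(z) = \sqrt{(1+z)^3\Omega_{m0} + \Omega_{\Lambda0} + (1+z)^2\Omega_{k0}} \qquad H(t) = H_0H(z)$$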

So the co-moving distance can be written in terms that can be observed today.

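$$d_c = \frac{c}{H_0}\int_0^z \frac{1}{\sqrt{(1+z)^3\Omega_{m0} + \Omega_{\Lambda0} + (1+z)^2\Omega_{k0}}}\,\mathrm{d}z = d_H\int_0^z \frac{1}{\sqrt{(1+z)^3\Omega_{m0} + \Omega_{\Lambda0} + (1+z)^2\Omega_{k0}}}\,\mathrm{d}z$$

where $d_H = \frac{c}{H_0}$ is the Hubble distance.

The $\Omega_{k0}$ term can be eliminated by replacing it with $1 - \Omega_{m0} - \Omega_{\Lambda0}$ so

$$\left(\frac{H(t)}{H_0}\right)^2 = (1+z)^3\Omega_{m0} + \Omega_{\Lambda0} + (1+z)^2(1 - \Omega_{m0} - \Omega_{\Lambda0})$$

$$= (1+z)^3\Omega_{m0} + \Omega_{\Lambda0} + (1+z)^2 - (1+z)^2\Omega_{m0} - (1+z)^2\Omega_{\Lambda0}$$

$$= (1+z)^2(1 + z\Omega_{m0}) - z(2+z)\Omega_{\Lambda0}$$

$$d_c = d_H\int_0^z \frac{1}{\sqrt{(1+z)^2(1 + z\Omega_{m0}) - z(2+z)\Omega_{\Lambda0}}}\,\mathrm{d}z$$

In general this integral has to be evaluated numerically. For the Einstein de Sitter model ($\Omega_{m0} = 1$, $\Omega_{\Lambda0} = 0$) the integrand reduces to $(1+z)^{-3/2}$ and

$$d_c = \frac{2c}{H_0}\left(1 - \frac{1}{\sqrt{1+z}}\right)$$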
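The co-moving distance integral is straightforward to evaluate numerically. The following is a minimal sketch using composite Simpson's rule in plain Python; the function names, the step count and the unit-conversion constants are assumptions made for illustration, and radiation is neglected as in the derivation above.

```python
import math

C_KM_S = 299792.458      # speed of light in km/s
GLY_PER_MPC = 3.2616e-3  # billions of light years in one megaparsec


def dimensionless_hubble(z, omega_m0=0.3089, omega_lambda0=0.6911):
    """H(t)/H0 as a function of redshift, with Omega_k0 = 1 - Omega_m0 - Omega_Lambda0."""
    omega_k0 = 1.0 - omega_m0 - omega_lambda0
    return math.sqrt(omega_m0 * (1 + z) ** 3 + omega_lambda0
                     + omega_k0 * (1 + z) ** 2)


def comoving_distance_gly(z, h0=67.74, omega_m0=0.3089, omega_lambda0=0.6911,
                          steps=2000):
    """d_c = (c/H0) * integral_0^z dz' / (H(z')/H0), in Gly (Simpson's rule)."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    dz = z / steps

    def f(x):
        return 1.0 / dimensionless_hubble(x, omega_m0, omega_lambda0)

    total = f(0.0) + f(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(i * dz)
    integral = total * dz / 3.0
    return C_KM_S / h0 * integral * GLY_PER_MPC


if __name__ == "__main__":
    for z in (0.01, 0.1, 1.0):
        print(z, round(comoving_distance_gly(z), 4), "Gly")
```

With the 2015 parameter values this gives about 11.08 Gly at $z = 1$. Setting `omega_m0=1.0, omega_lambda0=0.0` recovers the Einstein de Sitter closed form $\frac{2c}{H_0}\left(1 - \frac{1}{\sqrt{1+z}}\right)$, which is a convenient check on the integration.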


The following table gives approximate values for $z$, $d_c$, $t_l$ and $H$. Note that the co-moving distance is the distance between objects now, which can give light-year values that are much greater than those expected given the age of the universe. Thus an object that was very close to our location soon after the creation could be up to 45.4 Gly away due to the expansion of the universe. $t_l$ is called the lookback time and is the time before present, equal to the age of the universe minus cosmic time $t$. In fact at $t = 0$ all “objects” coincided and the first entry in the table is simply a projection back to that time, but is physically meaningless. The first photons that we can observe were emitted at $z = 1090$ or cosmic time of 380,000 years and are known as the cosmic microwave background.

$\Omega_{m0} = 0.3089$, $\Omega_{r0} = 0$, $\Omega_{\Lambda0} = 0.6911$ so $k = 0$, $t_0 = 13.799$ Ga, $H_0 = 67.74$ km s$^{-1}$ Mpc$^{-1}$

z         d_c (Gly)    t_l (Ga)     H(t) (km s-1 Mpc-1)
∞         45.407       13.799       ∞
1090      45.344       13.7974      1356910
1000      45.248       13.796       1192175
100       41.919       13.781       38215
10        13.466       13.325       1374.7
1         11.076       7.9354       120.46
0.1       1.4094       1.3438       71.119
0.01      0.14401      0.14325      68.056
0.001     0.014432     0.014379     67.771
0.0001    0.0014434    0.0014434    67.743

The second column is the co-moving distance – the distance between us and the object now.

Using the same value but as Ga and subtracting it from 45.407 gives the conformal time in Ga.

This is the time it would take for a photon to travel from a point 𝑑𝑐 light years from us to the

most distant object that we can observe now assuming the universe ceases to expand now.

The third column gives the lookback (cosmic) time: the cosmic time measured back from now. Using the same value as Gly gives the distance between us and where the object was when the photon was emitted (note time is measured in giga-annum, the same as giga-years).

The co-moving distance, lookback time and Hubble parameter cannot be measured; they must be calculated.

The redshift (1 + 𝑧) can only be measured for objects for which a spectrum consisting of

spectral lines is available.

In other cases other properties must be measured. These values do not have the same precision

as measuring redshift, and they have to be compared to known or more usually estimated

values for the object. Hence the distance values are much less precise. They are also not additive

unlike the co-moving distances. There are several other measures of distance depending on the

property being measured.


Angular Diameter Distance

If the true diameter $D$ of a distant object is known (or estimated) and the angle $\theta$ between the sides is measured, its angular diameter distance in a static flat universe is

$$d_A = \frac{D}{\theta} \quad\text{since } \tan\theta \cong \theta$$

If the diameter $D$ was its size at $t_1$ when $R(t) = R_1$ then $D = R_1r\theta$ so

$$d_A = \frac{R_1r\theta}{\theta} = R_1r \quad\text{in an expanding universe.}$$

This means there is a maximum value for $d_A$; an object further than this distance appears larger than an object of the same size which is nearer.

The reason for this is that since the universe was smaller at $t_1$, the object was closer to us then.

$$d_A = \frac{d_c}{1+z} \quad\text{or}\quad d_c = (1+z)d_A$$

Proper Motion Distance

If an object is moving across the line of sight so that in time $\Delta t_0$ it has moved through an angle $\theta$, and its known (estimated) proper velocity perpendicular to the line of sight is $v$, then its proper motion distance is

$$d_M = \frac{v\Delta t_0}{\theta}$$

$$= \frac{v\frac{R_0}{R_1}\Delta t_1}{\theta} \quad\text{because of time dilation}$$

$$= \frac{R_1r\theta}{\Delta t_1}\,\frac{R_0}{R_1}\,\frac{\Delta t_1}{\theta} \qquad v = \frac{R_1r\theta}{\Delta t_1} \text{ equivalent to } \frac{D}{\Delta t_1}$$

$$= R_0r$$

If $k = 0$ this is also the co-moving distance, i.e. $d_c = d_M$.


Luminosity Distance

If the luminosity $L$ of a distant object is known (estimated) and the observed flux is $S$, the luminosity distance to the object is

$$d_L = \sqrt{\frac{L}{4\pi S}}$$

If the coordinate distance of the object is $r$ then the current area of the sphere is $4\pi d_c^2$ which if $k = 0$ is $4\pi(R_0r)^2$.

The energy emitted in time $\Delta t_1$ is $L\Delta t_1$ but redshifting reduces this to $aL\Delta t_1$.

The corresponding time required to observe this energy is $\Delta t_0 = \frac{\Delta t_1}{a}$ because of time dilation.

The received flux is therefore

$$S = La^2\frac{1}{4\pi(R_0r)^2} = \frac{LR_1^2}{4\pi r^2R_0^4} = \frac{L}{4\pi\left(\frac{R_0^2r}{R_1}\right)^2}$$

so

$$d_L = \frac{R_0^2}{R_1}r \qquad d_c = \frac{d_L}{1+z}$$

Since the three values are expressions in only $R(t)$ and $r$ they are related by

$$d_L = (1+z)d_M = (1+z)d_c = (1+z)^2d_A$$

Note that for small redshifts $1 + z \cong 1$ so for $z < 0.01$ the four values are approximately equal.
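For a flat universe ($k = 0$) the relation $d_L = (1+z)d_M = (1+z)d_c = (1+z)^2d_A$ means that all four distances follow from the co-moving distance and the redshift. The sketch below shows the conversion; the function name and the dictionary layout are choices made for this illustration, and the input $d_c = 11.076$ Gly at $z = 1$ is taken from the table earlier in these notes.

```python
def distance_measures(d_c, z):
    """Return the distance measures for a flat (k = 0) universe.

    d_c is the co-moving distance in any length unit; the results are
    in the same unit.
    """
    return {
        "co-moving": d_c,
        "proper motion": d_c,              # d_M = d_c when k = 0
        "angular diameter": d_c / (1.0 + z),
        "luminosity": d_c * (1.0 + z),
    }


if __name__ == "__main__":
    for name, value in distance_measures(11.076, 1.0).items():
        print(f"{name:17s} {value:8.3f} Gly")
```

At $z = 1$ the luminosity distance is four times the angular diameter distance, which illustrates how quickly the definitions diverge once $z$ is no longer small.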


Surface Brightness

The apparent brightness of an object is the flux per square degree. The angle subtended is inversely proportional to $d_A = \frac{D}{\theta}$ so the area on the sky is inversely proportional to $d_A^2$.

The flux is inversely proportional to $d_L^2 = \frac{L}{4\pi S}$.

So the surface brightness is proportional to $\frac{d_A^2}{d_L^2} = \frac{1}{(1+z)^4} = a^4$. The actual brightness must therefore be at least $(1+z)^4 = a^{-4}$ times its apparent brightness (since some flux may have been absorbed by intervening gas and/or dust). Note that this applies to the complete spectrum.

The K Correction

In practice a telescope can only observe a narrow range of wavelengths/frequencies since different technologies are required for different parts of the spectrum (radio, optical, infra-red, gamma).

If this range is $\Delta\lambda_\text{OB}$, the emitted range being observed is $\Delta\lambda_\text{EM} = a\Delta\lambda_\text{OB}$.

The amount of energy varies with wavelength so the amount emitted at $\lambda_\text{OB}$ is different to that emitted at $\lambda_\text{EM}$.

If the flux per unit frequency $S_f \propto f^{-\alpha}$ the K Correction is

$$L_f = 10^{26}S_f\left(\frac{d_M}{3241}\right)^2(1+z)^{1+\alpha}$$

where $L_f$ has units of $10^{26}$ W Hz$^{-1}$ sr$^{-1}$, $S_f$ W Hz$^{-1}$ m$^{-2}$ and $d_M$ Mpc.


The Particle Horizon

This is the location in space-time of a signal starting from an initial event at the end of inflation and travelling at the speed of light until now. It forms a volume with a proper radius of 45.407 Gly or about 13,900 Mpc today and represents the most distant object that can ever be observed, i.e. the edge of the observable universe. Its value is the product of the speed of light and conformal time now.

This represents the age of the universe in conformal time which is defined as the time it would

take a photon to travel from Earth to the particle horizon assuming expansion ceased now. The

time 45.407 Gy is obviously much greater than the age of the universe so great care has to be

taken regarding both time and distance.

The largest proper distance is 5.67 Gly (distance from us when the light was emitted) or a co-

moving distance of 16 Gly. This corresponds to a red shift of about 3.53. Objects observed at

larger red shifts were closer to us than 5.67 Gly when the light was emitted.

The cosmic microwave background which was created about 380,000 years after the initial

event has a proper distance of 0.1 Gly (distance when it was emitted), a redshift of z=1090 and

a co-moving distance of 45.344 Gly, just inside the particle horizon.


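Co-moving Volume

The co-moving volume is the volume of space within a sphere with a co-moving radius.

If $k = 0$ the co-moving volume is simply

$$V_c = \frac{4}{3}\pi d_c^3$$

where $d_c$ is the co-moving radius.

If $k \neq 0$ the expression is

$$V_c = \frac{2\pi d_H^3}{\Omega_{k0}}\left(\frac{d_c}{d_H}\sqrt{1 + \Omega_{k0}\left(\frac{d_c}{d_H}\right)^2} - \frac{1}{\sqrt{|\Omega_{k0}|}}\sin^{-1}\left(\frac{d_c}{d_H}\sqrt{|\Omega_{k0}|}\right)\right)$$

for $k = 1$, where $d_H$ is the Hubble distance $\frac{c}{H_0}$ and $\Omega_{k0} = \frac{-kc^2}{R_0^2H_0^2}$.

If $k = -1$ then $\sin^{-1}$ is replaced by $\sinh^{-1}$.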


Cosmological Event Horizon

In the future as the Universe expands the matter density decreases and the dark energy density remains constant so eventually the dark energy will dominate and $\Omega_\Lambda \cong 1$ and $H^2(t) = \frac{\Lambda c^2}{3}$.

$$\frac{1}{R(t)}\frac{\mathrm{d}R(t)}{\mathrm{d}t} = \sqrt{\frac{\Lambda c^2}{3}}$$

$$R(t) = Ae^{ct\sqrt{\Lambda/3}} \quad\text{where } A \text{ is an arbitrary constant}$$

This exponential de Sitter expansion means that objects in the universe will become so far apart that they will cease to have causal contact. $\Omega_m$ will tend to zero and $\Omega_\Lambda$ will tend to 1.

At some time $t_1$ in the distant future

$$\frac{R(t)}{R(t_1)} = \frac{Ae^{ct\sqrt{\Lambda/3}}}{Ae^{ct_1\sqrt{\Lambda/3}}} = e^{c(t-t_1)\sqrt{\Lambda/3}}$$

Taking a co-moving distance $R(t_1)r$ at $t_1$ as the distance to the particle horizon, and considering what will happen from then into the infinite future (compare $d_c = c\int_t^{t_0}\frac{R_0}{R(t)}\,\mathrm{d}t$)

$$R(t_1)r = c\int_{t_1}^{\infty} e^{-c(t-t_1)\sqrt{\Lambda/3}}\,\mathrm{d}t = \sqrt{\frac{3}{\Lambda}}$$

This is a constant value so the particle horizon is at a constant distance. However the universe is expanding so nearby objects will eventually cross this horizon. The effects of redshift and time dilation mean that the object cannot be seen crossing the horizon. This is very similar to the event horizon surrounding a black hole and is called the cosmological event horizon. Thus every object will be outside a black hole containing the rest of the universe.

Based on current estimates for $\Omega_{\Lambda0} = 0.6911$ and $H_0 = 2.19\times10^{-18}$ s$^{-1}$ and using $\Omega_\Lambda(t) = \frac{\Lambda c^2}{3H^2(t)}$ gives $\Lambda = 1.1\times10^{-52}$ m$^{-2}$ so the event horizon will be at $1.6\times10^{26}$ m or 5336 Mpc compared to the current particle horizon at about 13,900 Mpc.

In the meantime star formation will cease in about $10^{12}$ years.
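The numbers for $\Lambda$ and the event horizon distance follow from $\Lambda = 3\Omega_{\Lambda0}H_0^2/c^2$ and $\sqrt{3/\Lambda}$. The following sketch reproduces them; the function names and the values of $c$ and the megaparsec are assumptions, while $\Omega_{\Lambda0}$ and $H_0$ are the figures quoted above.

```python
import math

C = 2.998e8               # speed of light, m/s (assumed value)
METRES_PER_MPC = 3.0857e22


def cosmological_constant(h0_per_second, omega_lambda0):
    """Lambda in m^-2 from Omega_Lambda = Lambda c^2 / (3 H^2)."""
    return 3.0 * omega_lambda0 * h0_per_second ** 2 / C ** 2


def event_horizon_metres(lam):
    """Limiting event horizon distance sqrt(3 / Lambda) in metres."""
    return math.sqrt(3.0 / lam)


if __name__ == "__main__":
    lam = cosmological_constant(2.19e-18, 0.6911)
    horizon = event_horizon_metres(lam)
    print(lam, "m^-2")
    print(horizon, "m =", round(horizon / METRES_PER_MPC), "Mpc")
```

Substituting the first expression into the second shows that the horizon distance is simply $\frac{c}{H_0\sqrt{\Omega_{\Lambda0}}}$, i.e. the present Hubble distance scaled up by $1/\sqrt{\Omega_{\Lambda0}}$.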


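Lookback Time

The lookback time to infinite redshift is the age of the universe, the time being measured back from now until the redshift is infinite. This requires an expression for $\frac{\mathrm{d}z}{\mathrm{d}t}$.

$$H(t) = \frac{1}{R(t)}\frac{\mathrm{d}R(t)}{\mathrm{d}t} = \frac{R_0}{R(t)}\frac{1}{R_0}\frac{\mathrm{d}R(t)}{\mathrm{d}t} = \frac{R_0}{R(t)}\frac{\mathrm{d}}{\mathrm{d}t}\left(\frac{R(t)}{R_0}\right)$$

$$= (1+z)\frac{\mathrm{d}}{\mathrm{d}t}\left(\frac{1}{1+z}\right) \qquad \frac{R_0}{R(t)} = 1 + z$$

$$= (1+z)\frac{\mathrm{d}}{\mathrm{d}z}\left(\frac{1}{1+z}\right)\frac{\mathrm{d}z}{\mathrm{d}t}$$

$$= \frac{-1}{1+z}\frac{\mathrm{d}z}{\mathrm{d}t} \qquad \frac{\mathrm{d}}{\mathrm{d}z}\left(\frac{1}{1+z}\right) = \frac{-1}{(1+z)^2}$$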

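From Friedmann’s energy equation

$$\left(\frac{1}{R}\frac{\mathrm{d}R}{\mathrm{d}t}\right)^2 = \frac{8\pi G}{3}\rho + \frac{\Lambda c^2}{3} - \frac{kc^2}{R^2}$$

$$kc^2 = H^2(t)R^2(t) - \frac{8\pi GR^2(t)}{3}\rho_m - \frac{\Lambda c^2R^2(t)}{3}$$

at any time $t$ in the past, and now

$$kc^2 = H_0^2R_0^2 - \frac{8\pi GR_0^2}{3}\rho_{m0} - \frac{\Lambda c^2R_0^2}{3}$$

so

$$H^2(t)R^2(t) - \frac{8\pi GR^2(t)}{3}\rho_m - \frac{\Lambda c^2R^2(t)}{3} = H_0^2R_0^2 - \frac{8\pi GR_0^2}{3}\rho_{m0} - \frac{\Lambda c^2R_0^2}{3}$$

The density will have changed by expansion of space so $\rho_{m0} = \frac{R^3(t)}{R_0^3}\rho_m$

$$H^2(t)R^2(t) - \frac{8\pi GR^2(t)}{3}\frac{R_0^3}{R^3(t)}\rho_{m0} - \frac{\Lambda c^2R^2(t)}{3} = H_0^2R_0^2 - \frac{8\pi GR_0^2}{3}\rho_{m0} - \frac{\Lambda c^2R_0^2}{3}$$

Using $\Omega_{m0} = \frac{8\pi G\rho_{m0}}{3H_0^2}$ and $\Omega_{\Lambda0} = \frac{\Lambda c^2}{3H_0^2}$

$$H^2(t)R^2(t) - H_0^2\frac{R_0^3}{R(t)}\Omega_{m0} - H_0^2R^2(t)\Omega_{\Lambda0} = H_0^2R_0^2 - H_0^2R_0^2\Omega_{m0} - H_0^2R_0^2\Omega_{\Lambda0}$$

Dividing by $R^2(t)$

$$H^2(t) - H_0^2\frac{R_0^3}{R^3(t)}\Omega_{m0} - H_0^2\frac{R^2(t)}{R^2(t)}\Omega_{\Lambda0} = H_0^2\frac{R_0^2}{R^2(t)} - H_0^2\frac{R_0^2}{R^2(t)}\Omega_{m0} - H_0^2\frac{R_0^2}{R^2(t)}\Omega_{\Lambda0}$$

Dividing by $H_0^2$

$$\frac{H^2(t)}{H_0^2} - \frac{R_0^3}{R^3(t)}\Omega_{m0} - \Omega_{\Lambda0} = \frac{R_0^2}{R^2(t)} - \frac{R_0^2}{R^2(t)}\Omega_{m0} - \frac{R_0^2}{R^2(t)}\Omega_{\Lambda0}$$

$$\frac{H^2(t)}{H_0^2} = \frac{R_0^2}{R^2(t)} - \frac{R_0^2}{R^2(t)}\Omega_{m0} + \frac{R_0^3}{R^3(t)}\Omega_{m0} - \frac{R_0^2}{R^2(t)}\Omega_{\Lambda0} + \Omega_{\Lambda0}$$

$$\left(\frac{H(t)}{H_0}\right)^2 = \frac{R_0^2}{R^2(t)} + \Omega_{m0}\left(\frac{R_0^3}{R^3(t)} - \frac{R_0^2}{R^2(t)}\right) + \Omega_{\Lambda0}\left(1 - \frac{R_0^2}{R^2(t)}\right)$$

$$= (1+z)^2 + \Omega_{m0}\left((1+z)^3 - (1+z)^2\right) + \Omega_{\Lambda0}\left(1 - (1+z)^2\right)$$

$$= (1+z)^2(1 + z\Omega_{m0}) - z(2+z)\Omega_{\Lambda0}$$

$$H^2(t) = H_0^2\left((1+z)^2(1 + z\Omega_{m0}) - z(2+z)\Omega_{\Lambda0}\right)$$

Substituting $H(t) = \frac{-1}{1+z}\frac{\mathrm{d}z}{\mathrm{d}t}$

$$\left(\frac{-1}{1+z}\right)^2\left(\frac{\mathrm{d}z}{\mathrm{d}t}\right)^2 = H_0^2\left((1+z)^2(1 + z\Omega_{m0}) - z(2+z)\Omega_{\Lambda0}\right)$$

$$\left(\frac{\mathrm{d}z}{\mathrm{d}t}\right)^2 = H_0^2(1+z)^2\left((1+z)^2(1 + z\Omega_{m0}) - z(2+z)\Omega_{\Lambda0}\right)$$

Taking the root with $\frac{\mathrm{d}z}{\mathrm{d}t} < 0$, since the redshift falls as cosmic time increases,

$$\frac{\mathrm{d}z}{\mathrm{d}t} = -H_0(1+z)\sqrt{(1+z)^2(1 + z\Omega_{m0}) - z(2+z)\Omega_{\Lambda0}}$$

The age of the universe is found by integrating (numerically) the reciprocal of the magnitude of this expression from $z = 0$ to $z = \infty$

$$t_0 = \int_0^\infty \frac{\mathrm{d}z}{H_0(1+z)\sqrt{(1+z)^2(1 + z\Omega_{m0}) - z(2+z)\Omega_{\Lambda0}}}$$

giving a value of 13.799 Ga for the current (2015) values of $\Omega_{m0}$, $\Omega_{\Lambda0}$ and $H_0$.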

The oldest object currently known is a star SM0313 which is only 6000 lightyears away, but has

an estimated age of 13.6 Ga.
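The age integral over $z$ from 0 to $\infty$ is easier to handle numerically after the substitution $a = \frac{1}{1+z}$, which maps it onto the finite interval $0 < a \le 1$: $t_0 = \frac{1}{H_0}\int_0^1 \frac{\mathrm{d}a}{\sqrt{\Omega_{m0}/a + \Omega_{k0} + \Omega_{\Lambda0}a^2}}$. The sketch below applies the midpoint rule to that form; the substitution, the function name, the step count and the conversion constants are choices made here for illustration, and radiation is neglected.

```python
import math

KM_PER_MPC = 3.0857e19        # kilometres in one megaparsec
SECONDS_PER_GYR = 3.1557e16   # seconds in 10^9 Julian years


def age_of_universe_gyr(h0=67.74, omega_m0=0.3089, omega_lambda0=0.6911,
                        steps=20000):
    """Age in Gyr from t0 = (1/H0) * integral_0^1 da / (a * H(a)/H0).

    Uses the midpoint rule, which never evaluates the integrand at a = 0.
    h0 is in km/s/Mpc.
    """
    hubble_time_gyr = KM_PER_MPC / h0 / SECONDS_PER_GYR
    omega_k0 = 1.0 - omega_m0 - omega_lambda0
    da = 1.0 / steps
    total = 0.0
    for i in range(steps):
        a = (i + 0.5) * da
        # a * H(a)/H0 = sqrt(Omega_m0 / a + Omega_k0 + Omega_Lambda0 * a^2)
        total += da / math.sqrt(omega_m0 / a + omega_k0 + omega_lambda0 * a * a)
    return hubble_time_gyr * total


if __name__ == "__main__":
    print(round(age_of_universe_gyr(), 2), "Gyr")
```

With the 2015 parameter values this sketch returns about 13.8 Gyr, in line with the figure quoted above. Passing `omega_m0=1.0, omega_lambda0=0.0` gives the Einstein de Sitter age $\frac{2}{3H_0}$, which serves as a check on the integration.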


The Flatness Problem

From $\Omega_m(t) + \Omega_r(t) + \Omega_\Lambda(t) + \Omega_k(t) = 1$

$$1 - \Omega(t) - \Omega_\Lambda(t) = \Omega_k(t) \qquad \Omega(t) = \Omega_m(t) + \Omega_r(t)$$

Given values for $\Omega_0$, $\Omega_{\Lambda0}$ and $H_0$ it is possible to calculate the values of $\Omega(z)$ and $\Omega_\Lambda(z)$ at any $z$.

Taking a value of $z = 1000$ (about the age of the cosmic microwave background) gives $\Omega = 0.999999996$ and $\Omega_\Lambda = 3.3\times10^{-9}$.

$$1 - 0.999999996 - 0.0000000033 = \Omega_{k1000}$$

The two numerical values are only estimates, but the resulting value is extremely close to zero; the present-day value is believed to be slightly negative, $\Omega_{k0} = -0.0014 \pm 0.0017$.

Since $\Omega_k(t) = -\frac{kc^2}{R^2(t)H^2(t)}$ where all the values are positive and non-zero, if $\Omega_{k1000} = 0$ then $k = 0$ and $\Omega_k(t) = 0$ for all $t$.

$$\frac{\Omega_{k1000}}{\Omega_{k0}} = \frac{R_0^2H_0^2}{R_{1000}^2H_{1000}^2} = \frac{(1+z)^2H_0^2}{H_{1000}^2} = 0.003$$

So either Ωk has always been zero or it has had a very small value. Either way Ω1000 had a value

very close to but not 1 which seems surprising.

A slightly larger value would cause the universe to have collapsed before now, a slightly smaller

value and the universe would have expanded too fast for stars to have formed.

Going back even further in time Ω(t) increases and Ωk(t) decreases so Ω(t) increases towards 1

from 0.999999996 but does not quite reach it.

These problems can be avoided by assuming Ω was and always is 1, and ΩΛ(𝑡) and Ωk(𝑡) have

always been zero, but that does not reflect the current observations of an accelerating rate of

expansion.
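The size of the curvature and dark energy terms at early times can be estimated from $\left(\frac{H}{H_0}\right)^2 = (1+z)^3\Omega_{m0} + \Omega_{\Lambda0} + (1+z)^2\Omega_{k0}$. The sketch below is a rough illustration under stated assumptions: it neglects radiation, uses the 2015 values $\Omega_{m0} = 0.3089$ and $\Omega_{\Lambda0} = 0.6911$, and the function names are introduced here. Because of these simplifications it reproduces the order of magnitude of the values quoted above (a dark energy fraction of a few times $10^{-9}$ and a curvature ratio of about 0.003 at $z = 1000$) rather than the exact digits.

```python
def hubble_ratio_squared(z, omega_m0=0.3089, omega_lambda0=0.6911):
    """(H(z)/H0)^2 with Omega_k0 = 1 - Omega_m0 - Omega_Lambda0, no radiation."""
    omega_k0 = 1.0 - omega_m0 - omega_lambda0
    return (omega_m0 * (1 + z) ** 3 + omega_lambda0
            + omega_k0 * (1 + z) ** 2)


def omega_matter_at(z, omega_m0=0.3089, omega_lambda0=0.6911):
    """Matter density parameter at redshift z."""
    return omega_m0 * (1 + z) ** 3 / hubble_ratio_squared(z, omega_m0, omega_lambda0)


def omega_lambda_at(z, omega_m0=0.3089, omega_lambda0=0.6911):
    """Dark energy density parameter at redshift z."""
    return omega_lambda0 / hubble_ratio_squared(z, omega_m0, omega_lambda0)


def curvature_ratio(z, omega_m0=0.3089, omega_lambda0=0.6911):
    """Omega_k(z) / Omega_k0 = (1 + z)^2 * H0^2 / H(z)^2."""
    return (1 + z) ** 2 / hubble_ratio_squared(z, omega_m0, omega_lambda0)


if __name__ == "__main__":
    z = 1000
    print("Omega_m      :", omega_matter_at(z))
    print("Omega_Lambda :", omega_lambda_at(z))
    print("Omega_k ratio:", curvature_ratio(z))
```

The ratio below 1 means that any curvature present today was several hundred times smaller, relative to the total, at $z = 1000$, which is the essence of the fine-tuning argument.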


Hubble Constant and Parameter

Edwin Hubble measured the red/blue shift of galaxies to estimate their velocities relative to the

Earth, initially using their brightness as a measure of distance. Nearby galaxies had both blue

and red shifts indicating some were moving towards the Earth, others away from it, but at

larger distances there were only redshifts. It was assumed that these were due to the Doppler

effect. There appeared to be a simple relationship between speed and distance expressed as 𝑣 = 𝐾𝑑 (𝐾 was later changed to 𝐻0).

This is valid for distances of 10 to 600 Mpc with the exception of the Virgo cluster at 16.5 Mpc

because it is a member of the local group of galaxies and has a peculiar velocity relative to it and

its other members.

The value of the Hubble constant can be calculated from observations but does depend on

accurate values for both velocity (obtained from redshift) and distance (obtained by a variety of

methods).

Recent estimates have been about 70 km s$^{-1}$ Mpc$^{-1}$: 69.32 in 2012 from WMAP, 67.80 in 2013 and 67.74 in 2015 from the Planck mission, but a much higher value of 73.2 from NASA/Hubble in 2016 and 73.0 from Gaia, and a value two months later of 67.6. The lower values depend on current cosmological theory and the higher ones on the more accurate estimates for distances. In 2017 measurements of the variations in the length of time delays between flux changes in multiple images of a lensed quasar gave a value of 71.9 km s$^{-1}$ Mpc$^{-1}$. This measurement depends on the theory of general relativity and geometry so should be accurate. In 2019 nine different estimates gave values from 67.78 to 76.8. There is no confirmed explanation for the difference between theory and measurement. In short $H_0$ is the least precise of all the constants in physics, and unfortunately is a factor in many equations.

One approach that reconciles the two is that the Hubble constant does vary because the local density of the Universe varies: the higher values are for our part of the universe while the lower values are those for the universe as a whole, an average value. The Friedmann-Robertson-Walker metric and Lambda-CDM model assume that the universe can be modelled as a mixture of three gas-like substances with dark energy being created as the universe expands. However this may be an oversimplification since the universe is more like a foam with large voids and walls or ribbons of matter. The large voids account for 40% of the volume and are over 150 million light years in diameter, while the Sloan Great Wall of galaxies is 1.38 billion light years long. It is estimated that a clock in the Milky Way runs 35% slower than a clock in the centre of a void, and therefore more time has passed in the voids with consequently more expansion than locally. This implies the universe is like an expanding foam.

Many other values depend on $H_0$ so care has to be taken as to what value is used. For this reason the value is often expressed as $100h$ km s$^{-1}$ Mpc$^{-1}$ or $3.24\times10^{-18}h$ s$^{-1}$ where $h = 0.6774$ or whatever the current value is divided by 100 ($h$ is not to be confused with Planck’s constant $h$). Some other quantities are also given in multiples of $h$ such as $\Omega_{m0}h^2 = 0.1326$, $\Omega_{r0}h^2 = 0.000042$ etc. as they do not change as the value of $H_0$ is adjusted by further observations, and in some cases the $h$ factors cancel out.


The true dimension of the constant $H_0$ is s⁻¹ with a value of around 2.19×10⁻¹⁸ s⁻¹ (the conversion factor between s⁻¹ and km s⁻¹ Mpc⁻¹ is 3.09×10¹⁹).

In a static flat universe this value is constant with respect to time, and the age of the universe is the Hubble time

$t_H = \frac{1}{H_0} = 14.5\times10^{9}$ years.
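The unit conversion above can be checked numerically. The following is a minimal sketch, not part of the original notes; the function names and the choice of a Julian year are assumptions made for illustration.

```python
# Sketch: convert a Hubble constant from km/s/Mpc to SI units and to a Hubble time.
KM_PER_MPC = 3.086e19        # kilometres in one megaparsec (1 pc = 3.086e16 m)
SECONDS_PER_YEAR = 3.156e7   # assumption: Julian year of 365.25 days


def hubble_si(h0_km_s_mpc):
    """Return H0 in s^-1 given H0 in km s^-1 Mpc^-1."""
    return h0_km_s_mpc / KM_PER_MPC


def hubble_time_years(h0_km_s_mpc):
    """Return the Hubble time 1/H0 in years."""
    return 1.0 / hubble_si(h0_km_s_mpc) / SECONDS_PER_YEAR


# Compare the three values of H0 that appear in the notes.
for h0 in (67.74, 70.0, 73.2):
    print(f"H0 = {h0:5.2f} km/s/Mpc -> {hubble_si(h0):.3e} 1/s, "
          f"Hubble time = {hubble_time_years(h0) / 1e9:.2f} Gyr")
```

The spread between the three printed Hubble times gives a feel for how much the uncertainty in $H_0$ matters for any derived age.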

If the universe is not static, but is expanding or contracting, the value $H$ is a function of time and is then referred to as the Hubble parameter $H(t)$ or $H(z)$, depending on whether time or redshift is used. The function depends on the assumptions of the relative densities of matter (including dark matter), radiation and dark energy, and the curvature of space-time. The following all assume flat space.

$H(t) = \frac{1}{R(t)}\frac{dR(t)}{dt}$ where $R(t)$ is the scale factor, which is a measure of the change of proper distances.

In a pure radiation universe $H(t) = \frac{1}{2t}$ where $t$ is cosmic time, and $t_0 = \frac{1}{2H_0}$ (i.e. the age of the universe is halved).

In a pure matter universe $H(t) = \frac{2}{3t}$ and $t_0 = \frac{2}{3H_0}$.

In a pure dark energy universe $H(t) = \sqrt{\frac{8\pi G\rho_\Lambda}{3}}$, i.e. a constant, and $t_0 = \sqrt{\frac{3}{8\pi G\rho_\Lambda}}$.
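As a rough illustration of the radiation-only and matter-only cases, the ages can be evaluated for one value of the Hubble constant. This is only a sketch: the value of 67.74 km s⁻¹ Mpc⁻¹ and the function names are assumptions chosen for the example.

```python
# Sketch: ages of single-component flat universes for an assumed H0.
H0_SI = 67.74 / 3.086e19      # assumed H0 = 67.74 km/s/Mpc converted to s^-1
SECONDS_PER_GYR = 3.156e16    # seconds in 10^9 Julian years


def age_radiation_only(h0):
    """t0 = 1 / (2 H0) for a pure radiation universe, in seconds."""
    return 1.0 / (2.0 * h0)


def age_matter_only(h0):
    """t0 = 2 / (3 H0) for a pure matter universe, in seconds."""
    return 2.0 / (3.0 * h0)


print(f"Hubble time     : {1.0 / H0_SI / SECONDS_PER_GYR:.2f} Gyr")
print(f"radiation only  : {age_radiation_only(H0_SI) / SECONDS_PER_GYR:.2f} Gyr")
print(f"matter only     : {age_matter_only(H0_SI) / SECONDS_PER_GYR:.2f} Gyr")
```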

All three are present in the real universe (although some doubt dark energy), but the

proportions of their densities have changed with time, radiation dominating the early universe,

then matter, and dark energy in the future.

The actual variation of $H$ with time has not been determined, although the formulae give a good estimate. From other observations $t_0 = \frac{0.96}{H_0}$, giving an age of 13.8×10⁹ years from Planck. This implies an average past value for $H(t)$ some 1.04 times the present value.

Using $H(z) = H_0\sqrt{(1+z)^2(1+z\Omega_{m0}) - z(2+z)\Omega_{\Lambda 0}}$, the square root factor is about 16000 for $z = 1000$ (CMB), 500 for $z = 100$, 19 for $z = 10$ (most distant known galaxies), 1.7 for $z = 1$ (objects at a distance of 7×10⁹ light years) and 1.04 for $z = 0.1$ (objects at a distance of 2×10⁹ light years or 600 Mpc). Hence Hubble's law $v = H_0 d$ applies up to 600 Mpc, but is increasingly inaccurate beyond that.
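The square root factor can be evaluated directly. The sketch below is illustrative only: the density parameters are assumptions (a flat universe with $\sqrt{\Omega_{\Lambda 0}} = 0.831$ and radiation neglected), so the printed numbers will only roughly match the rounded figures quoted, which depend on the densities used.

```python
import math

# Assumed density parameters for illustration (flat space, radiation neglected).
OMEGA_LAMBDA_0 = 0.831 ** 2
OMEGA_M_0 = 1.0 - OMEGA_LAMBDA_0


def expansion_factor(z, omega_m=OMEGA_M_0, omega_l=OMEGA_LAMBDA_0):
    """Return H(z)/H0 from the square-root expression given in the text."""
    return math.sqrt((1 + z) ** 2 * (1 + z * omega_m) - z * (2 + z) * omega_l)


# Tabulate the factor at the redshifts discussed in the text.
for z in (0.1, 1, 10, 100, 1000):
    print(f"z = {z:6}: H(z)/H0 = {expansion_factor(z):10.2f}")
```

For a flat universe the same function reduces to $\sqrt{\Omega_{m0}(1+z)^3 + \Omega_{\Lambda 0}}$, which is a convenient cross-check.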

Note that the value of $z$ and hence $H(t)$ decreases extremely rapidly from $z = 1000$ at the time of the cosmic microwave background down to $z = 1.0$ and $H(t) = 120$ km s⁻¹ Mpc⁻¹ at a lookback time of 7 Ga.

In the distant future dark energy will dominate so $H(t) = \sqrt{\frac{8\pi G\rho_\Lambda}{3}} = \sqrt{\frac{\Lambda c^2}{3}}$, a constant.

The value of $\Omega_\Lambda(t)$ will increase asymptotically towards 1 and since $\Omega_\Lambda(t) = \frac{\Lambda c^2}{3H^2(t)}$, $\frac{\Lambda c^2}{3} = H_0^2\,\Omega_{\Lambda 0}$, which gives a value of $H(-\infty) = H_0\sqrt{\Omega_{\Lambda 0}}$ (future time is negative). The current value is close to the minimum since $\sqrt{\Omega_{\Lambda 0}} = 0.831$, giving a future value of 56 km s⁻¹ Mpc⁻¹. The value will be higher if the acceleration of the expansion of the universe is greater than that assumed.
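The limiting value follows from a single multiplication. The sketch below assumes $H_0 = 67.74$ km s⁻¹ Mpc⁻¹ (one of the values quoted earlier); the function name is hypothetical.

```python
import math


def asymptotic_hubble(h0, omega_lambda_0):
    """Return the far-future Hubble parameter H0 * sqrt(Omega_Lambda0)."""
    return h0 * math.sqrt(omega_lambda_0)


H0 = 67.74                    # assumed present value in km/s/Mpc
OMEGA_LAMBDA_0 = 0.831 ** 2   # from sqrt(Omega_Lambda0) = 0.831

print(f"Future Hubble parameter: {asymptotic_hubble(H0, OMEGA_LAMBDA_0):.1f} km/s/Mpc")
```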


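From the Friedmann energy equation $H^2(t) = \frac{8\pi G}{3}\rho - \frac{kc^2}{R^2}$ this can be written as

$H^2(t) = \frac{8\pi G}{3}\sum_{i=1}^{3}\rho_i - \frac{kc^2}{R^2}$ where $i = 1$ represents $\rho_m$, $i = 2$ represents $\rho_r$, $i = 3$ represents $\rho_\Lambda$.

$\left(\frac{dR}{dt}\right)^2 = \frac{8\pi G R^2}{3}\sum_{i=1}^{3}\rho_i - kc^2$

From the Friedmann acceleration equation $\frac{1}{R}\frac{d^2R}{dt^2} = -\frac{4\pi G}{3}\left(\rho + \frac{3p}{c^2}\right)$

$\frac{1}{R}\frac{d^2R}{dt^2} = -\frac{4\pi G}{3}\sum_{i=1}^{3}\left(\rho_i + \frac{3p_i}{c^2}\right)$ using the same notation.

Assuming that matter, radiation and dark energy are all types of gas, the pressure $p = w\rho c^2$.

$\frac{1}{R}\frac{d^2R}{dt^2} = -\frac{4\pi G}{3}\sum_{i=1}^{3}\rho_i(1 + 3w_i)$ so at $t = -\infty$

$-\frac{4\pi G}{3}\rho_\Lambda(1 + 3w_\Lambda) = \frac{\Lambda c^2}{3}$

Since $H(t) = \sqrt{\frac{8\pi G\rho_\Lambda}{3}} = \sqrt{\frac{\Lambda c^2}{3}}$, so $\rho_\Lambda = \frac{\Lambda c^2}{8\pi G}$, this becomes

$-\frac{4\pi G}{3}\,\frac{\Lambda c^2}{8\pi G}(1 + 3w_\Lambda) = \frac{\Lambda c^2}{3}$

$-\frac{1}{2}(1 + 3w_\Lambda) = 1$

$w_\Lambda = -1$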

This means that the cosmological constant $\Lambda$ implies $w_\Lambda = -1$. However it is possible that both $\Lambda$ and $w_\Lambda$ vary with time. Experimental measurements constrain $w_{\Lambda 0}$ to a value close to $-1$ and past/future values to not more than $-0.8$, but they could be as small as $-1.5$. $H(t)$ is proportional to $\left(\frac{R(t)}{R_0}\right)^{-\frac{3}{2}(1+w_\Lambda)}$. If $w_\Lambda = -1.5$ the universe ends in 21×10⁹ years, with galaxies ending 60 million years, solar systems 3 months, planets 30 minutes and atoms 10⁻¹⁹ seconds before the end.


Cosmic Microwave Background

The detection of the cosmic microwave background (CMB) was probably the most important

contribution to cosmology after detection of the redshift. Its analysis constrains the various

cosmological models.

Measurements show that it has the best example of a black body spectrum known and

corresponds to a temperature of 2.725K. Nothing in the universe can remain at a temperature

below this as it would eventually warm up to this temperature.

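The radiation density is $\Omega_{r0}h^2 = 4.2\times10^{-5}$. Using Stefan's law $\rho_r c^2 = \frac{4\sigma T^4}{c}$ where $\sigma = 5.67\times10^{-8}$ W m⁻² K⁻⁴ and $T_0 = 2.725$ K, then $\rho_{r0}c^2 = 4.17\times10^{-14}$ J m⁻³ or $\rho_{r0} = 4.64\times10^{-31}$ kg m⁻³.

$\Omega_{r0} = \frac{8\pi G\rho_{r0}}{3H_0^2} = \frac{8\pi G\rho_{r0}}{3\times(3.24\times10^{-18}h)^2} = 2.5\times10^{-5}h^{-2}$. (Note $h = \frac{H_0}{100}$)

The proportion of radiation density due to the CMB is $\frac{2.5}{4.2} = 0.6$.

This means the CMB is the major contributor (60%) to the radiation density.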
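The arithmetic above can be reproduced in a few lines. This is a sketch using rounded constants; the variable names are chosen for illustration only.

```python
import math

SIGMA = 5.67e-8     # Stefan-Boltzmann constant, W m^-2 K^-4
C = 2.998e8         # speed of light, m s^-1 (rounded)
G = 6.674e-11       # gravitational constant, N m^2 kg^-2
T0 = 2.725          # present CMB temperature, K
H100_SI = 3.24e-18  # 100 km/s/Mpc expressed in s^-1

# Stefan's law for the energy density of black body radiation.
energy_density = 4.0 * SIGMA * T0 ** 4 / C        # J m^-3
mass_density = energy_density / C ** 2            # kg m^-3

# Density parameter of the CMB alone, multiplied by h^2.
omega_cmb_h2 = 8.0 * math.pi * G * mass_density / (3.0 * H100_SI ** 2)

print(f"energy density : {energy_density:.3e} J/m^3")
print(f"mass density   : {mass_density:.3e} kg/m^3")
print(f"Omega_CMB h^2  : {omega_cmb_h2:.2e}")
print(f"CMB fraction   : {omega_cmb_h2 / 4.2e-5:.2f}")
```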

The CMB radiation has been redshifted. It can be shown that if a black body spectrum is redshifted it remains a black body spectrum, but for a lower temperature, $\frac{T_z}{T_0} = 1 + z$, so this does not give any information regarding the original temperature or the value of $z$, although given one the other can be calculated. These must be established from the cosmological theory.

The current theory is that there was a time when matter had formed (protons, neutrons and

electrons). Protons and neutrons are baryons, electrons are leptons. The protons would

combine with electrons to form hydrogen atoms, then photons would collide with the hydrogen

atoms and ionise them (the ionisation energy is only 13.6 eV) and so the universe was opaque

to photons. As the universe expanded the matter density decreased to a degree that allowed

most photons unlimited travel, and the universe became transparent. This was the time of last scattering, or the time of recombination, when the universe was 380,000 years old; the cosmic microwave background dates from this time.

Assuming that this happened when the matter and radiation densities were equal, and using $\frac{\Omega_r}{\Omega_m} = \frac{(1+z)\Omega_{r0}}{\Omega_{m0}}$ since radiation scales as $R^{-4}$ and matter as $R^{-3}$, and using $\Omega_{r0}h^2 = 4.2\times10^{-5}$, gives $(1 + z) = 23800\,\Omega_{m0}h^2$ and $z = 3160$. This value is too large (the current estimate is 1090) though the right order of magnitude.
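The redshift of matter-radiation equality follows from the two quoted density parameters. A minimal sketch, with variable names chosen for illustration:

```python
OMEGA_R0_H2 = 4.2e-5    # radiation density parameter times h^2 (from the text)
OMEGA_M0_H2 = 0.1326    # matter density parameter times h^2 (from the text)

# Densities are equal when (1 + z) * Omega_r0 / Omega_m0 = 1.
one_plus_z_eq = OMEGA_M0_H2 / OMEGA_R0_H2
z_eq = one_plus_z_eq - 1.0

print(f"1 / (Omega_r0 h^2) = {1.0 / OMEGA_R0_H2:.0f}")
print(f"z at equality      = {z_eq:.0f}")
```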

It is possible to calculate the variation in the fraction of ionised gas with redshift, and it can be shown that a critical ratio is $\frac{\sqrt{\Omega_{m0}h^2}}{\Omega_{b0}h^2}$ where $\Omega_{b0}$ is the relative baryonic density 0.047 and $\Omega_{b0}h^2 = 0.022$. Using this gives a redshift of $z = 1090$, and a temperature of 3000 K, some 380,000 years after the creation.

Note that a gas at 3K will not create blackbody radiation because the energy levels are too far

apart and the spectrum will consist of discrete emission lines. Blackbody radiation requires a

very large number of closely spaced energy levels, a condition met in ionised gases at

temperatures of 3000 K and above. The relationship between mean photon energy in eV and temperature in kelvin is approximated by $E_{ph} \cong T/3000$, so 3000 K implies infrared radiation at 1 eV.

The fact that the cosmic microwave background exists is evidence for much higher

temperatures in the past, and so rules out a steady state model for the universe.


The Goldilocks Universe

If the universe had a temperature of 3000K at an age of 380,000 years, and now has a

temperature of 2.725K, at some time its temperature was in the range of 100°C to 0°C, that of

liquid water which is essential to life. This occurred when the Universe was 10 to 17 million years old, when 1 + 𝑧 was between 137 and 100. The first rocky planets are believed to have formed at

𝑧 = 78 but they may have formed a little earlier, and if so there is a period of some 7 million

years during which life could have developed, longer if the planet was warmed by a star.
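The redshift range for liquid water follows from the black body scaling $T_z = T_0(1+z)$ given in the section on the cosmic microwave background. The sketch below assumes that scaling and uses hypothetical function names.

```python
T0 = 2.725  # present CMB temperature in kelvin


def temperature_at(z):
    """Background temperature at redshift z, using T_z = T0 * (1 + z)."""
    return T0 * (1.0 + z)


def redshift_for(temperature):
    """Redshift at which the background had the given temperature in kelvin."""
    return temperature / T0 - 1.0


print(f"T at z = 1090     : {temperature_at(1090):.0f} K")
print(f"z for 100 C water : {redshift_for(373.15):.1f}")
print(f"z for   0 C water : {redshift_for(273.15):.1f}")
```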

Curvature of Space

Observations indicate that the spatial universe is flat to within 0.4%. This does not mean it was always flat, since a small area of a large curved surface may be regarded as flat. To a human observer on its surface the Earth appears to be flat. If the Earth were much smaller the curvature would be more obvious.

The universe may have had a large curvature soon after its creation, but the expansion would have reduced that curvature down to the present level. Therefore present cosmological models have a curvature of 0. This is not to be confused with the curvature of space-time, which can be very large locally.


The Conformal Universe

The conformal universe is a model based on special relativity so space-time is flat and peculiar

motion is ignored (i.e. the Earth is assumed to be a fundamental observer).

The diagram shows the past light-cone based on here and now. Only one spatial dimension is

shown – in reality there are three. The red lines defining the light cone are the light-like world

lines from the most ancient sources of light that are arriving now. The source of these is the

cosmic microwave background (CMB) emitted some 380,000 years after the origin, and the

light-lines are projected back to the origin.

The lookback time for the origin is 13.799 Ga (shown on the left), but converting this to conformal time gives 45.407 Ga, which forms the vertical axis.

The horizontal axis is the co-moving distance (the proper distance now) which for the light-

lines has a value of 45.407 Gly. Only events on the light-lines can be observed using

electromagnetic radiation or gravitational waves, but events within the triangle can cause

events here and now, and events outside the triangle cannot.

The triangle defines our observable universe, which has a mass of 1.45×10⁵³ kg and a volume of 3.58×10⁸⁰ m³, with a density of 9.9×10⁻³⁰ g cm⁻³ or 6 protons per cubic metre.

The value of z increases very rapidly between the CMB and the origin theoretically reaching

infinity while the size of the Universe approaches zero so the proper distances also approach

zero. This means there is a maximum proper distance to the edge of the observable universe of

5.67 Gly which corresponds to a co-moving distance just under 16 Gly and a redshift of 𝑧 = 1.5.

The most distant radiation received, excluding the CMB, is a gamma ray burst from a collapsing

star at 𝑧 = 8.2, a co-moving distance of 30 Gly, but a proper distance (defined as the proper

distance measured at the instant of emission between the object and the point in space which

the Earth now occupies, i.e. its world line) of just over 3 Gly.

Finally the world-lines of fundamental observers are vertical so the world-line for any observer

not on Earth must pass outside of our light-cone but would re-enter the future light-cone.

The world-line of a fundamental observer at a co-moving distance of 16 Gly, where the proper distance is greatest, is shown.

[Figure: past light-cone in conformal coordinates. Horizontal axis: co-moving distance (Gly). Vertical axes: conformal time (Ga), which is also our world line, and lookback time (Ga). Marked: here and now at a conformal time of 45.407 Ga; redshifts 𝑧 = 1, 10, 100 and 1000 (CMB); the maximum proper distance of 5.67 Gly.]


Modified Newtonian Dynamics (MOND)

The theory of general relativity and Newton’s law of gravitation give very similar results except

where speeds are similar to those of light and densities are close to those of black holes. The

lower the speed and the density the closer the results.

Both fail to predict the speed of stars orbiting the cores of galaxies based on starlight and the 21 cm hydrogen radiation. Assuming a concentrated mass ($M$) at the centre of the galaxy, a negligible mass of the star compared to the core of the galaxy, and a circular orbit, they predict that the speed of the star should decrease with its distance $r$ from the centre as $v \cong \sqrt{\frac{GM}{r}}$ where $G$ is the gravitational constant. However observations show that the speed remains approximately constant with distance outside the central region.

The normal solution to this is to add sufficient mass throughout the galaxy so that the velocity

distribution matches observation. This mass must not emit or absorb electromagnetic waves

otherwise it would have been observed, and so is called dark matter. It is possible to match theory to observation by varying the density of the dark matter, but the density of the dark matter at each radius is approximately proportional to the density of observable (usually called baryonic) matter. Another problem is that there is no agreement on what dark matter consists of.

An alternative approach is to modify Newton’s law of gravitation so that below a certain acceleration (to be determined from observation) the law changes from an inverse square law to an inverse linear law, i.e. from $f = \frac{GMm}{r^2}$ to $f = \frac{GMm}{r}$. This requires that $f = ma$ becomes $f \propto ma^2$ for small accelerations $a$.

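MOND writes this as $f = ma\,\mu\left(\frac{a}{a_0}\right)$ where $\mu\left(\frac{a}{a_0}\right) \to 1$ for $\frac{a}{a_0} \gg 1$, and $\mu\left(\frac{a}{a_0}\right) \to \frac{a}{a_0}$ for $\frac{a}{a_0} \ll 1$.

It leaves the detail of the function and the value of $a_0$ to be determined. The region where $\mu\left(\frac{a}{a_0}\right) \cong \frac{a}{a_0}$, or $f = m\frac{a^2}{a_0}$, is known as deep MOND.

Two common functions are $\frac{1}{1+\frac{a_0}{a}}$ and $\sqrt{\frac{1}{1+\left(\frac{a_0}{a}\right)^2}}$.

Common values of $a_0$ range from $1.2\times10^{-10}$ to $2.0\times10^{-10}$ m s⁻².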
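The limiting behaviour of the two interpolating functions can be checked numerically. The sketch below is illustrative: the function names are labels chosen here, and the value of $a_0$ is the lower end of the quoted range.

```python
import math

A0 = 1.2e-10  # assumed MOND acceleration scale in m s^-2 (lower end of the range)


def mu_first(a, a0=A0):
    """First interpolating function: 1 / (1 + a0/a)."""
    return 1.0 / (1.0 + a0 / a)


def mu_second(a, a0=A0):
    """Second interpolating function: sqrt(1 / (1 + (a0/a)^2))."""
    return math.sqrt(1.0 / (1.0 + (a0 / a) ** 2))


# Both tend to a/a0 for small accelerations and to 1 for large ones.
for a in (1e-14, 1e-12, 1e-10, 1e-8, 9.81):
    print(f"a = {a:8.2e}  a/a0 = {a / A0:9.3e}  "
          f"mu_first = {mu_first(a):9.3e}  mu_second = {mu_second(a):9.3e}")
```

The two functions differ most near $a \approx a_0$, which is where the choice of function affects a fit to observations.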

The opposition to MOND results from the freedom to match the observations and the lack of any

theory behind it.

There are two possible interpretations of the theory.

Firstly it is limited to gravity so that only the law of gravitation is modified to $f = \frac{GMm}{\mu\left(\frac{a}{a_0}\right)r^2}$. This goes against the strong equivalence principle of relativity.

Alternatively it affects all accelerations. In this case $f = ma\,\mu\left(\frac{a}{a_0}\right)$, but this allows it to be interpreted as $f = \mu\left(\frac{a}{a_0}\right)m_G \cdot a$ where $m_G$ is the gravitational mass and $\mu\left(\frac{a}{a_0}\right)m_G = m_I$, the inertial mass. The two masses are still related but no longer have the same value. This is the basis of the theory called quantised inertia (QI).


Quantised Inertia

This is a theory rather than a law because it explains the mechanism, does not rely on arbitrary values obtained from the data it seeks to explain, and makes testable predictions (in fact a mechanical device). Not only does it remove the need for dark matter, but also the need for dark energy.

It gives a value for $\mu\left(\frac{a}{a_0}\right) = 1 - \frac{\beta\pi^2c^2}{a\Theta} \cong 1 - \frac{2c^2}{a\Theta}$ where $\beta$ has the value 0.2 from Wien’s displacement law of black body radiation $\beta = \frac{k_BT\lambda}{hc}$, $a$ is the acceleration, and $\Theta$ is the co-moving diameter of the universe $\Theta = \frac{2c}{H_0}$.

This means that $\mu\left(\frac{a}{a_0}\right) = 0$ for $a = a_0 = \frac{2c^2}{\Theta} = (2 \pm 0.2)\times10^{-10}$ m s⁻² and becomes negative for smaller values, so this represents a minimum rate of acceleration (quantised acceleration may be a more accurate name). The uncertainty comes from the uncertainty in the value of Hubble’s constant, which determines the co-moving diameter.

It gives a velocity distribution for the stars in a galaxy that is nearly compatible with observations. There is a need for an increase in mass of around 2% to get a good match, but this missing 2% could be undetected baryonic matter such as ionised hydrogen. It gives a value for the rotational velocity of $v^4 = \frac{2GMc^2}{\Theta}$ where $M$ is the total mass. It is also claimed to predict the mass of electrons, protons and neutrons.

The theory, published about 2007, depends partly on quantum field theory and partly on special relativity.

The quantum field contribution is Unruh radiation which in turn depends on the Unruh effect. The latter states that an accelerating body experiences an increase in black body radiation that is due only to the acceleration. For an accelerating body in a vacuum the effective temperature of the black body radiation is given by $T = \frac{\hbar a}{2\pi ck_B}$, which is identical to the value for Hawking radiation from a black hole provided $a$ is replaced by $g = \frac{GM}{R_s^2}$.

An acceleration of $2.47\times10^{20}$ m s⁻² is required to give a temperature of 1 K.
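The Unruh temperature formula can be inverted to confirm the acceleration needed for 1 K. This is a sketch using rounded constants; the function names are chosen for illustration.

```python
import math

HBAR = 1.055e-34    # reduced Planck constant, J s
C = 299_792_458.0   # speed of light, m s^-1
K_B = 1.381e-23     # Boltzmann constant, J K^-1


def unruh_temperature(acceleration):
    """Effective black body temperature T = hbar a / (2 pi c k_B) in kelvin."""
    return HBAR * acceleration / (2.0 * math.pi * C * K_B)


def acceleration_for_temperature(temperature):
    """Acceleration in m s^-2 that gives the requested Unruh temperature."""
    return 2.0 * math.pi * C * K_B * temperature / HBAR


print(f"a for T = 1 K      : {acceleration_for_temperature(1.0):.3e} m/s^2")
print(f"T for a = 9.81 m/s2: {unruh_temperature(9.81):.3e} K")
```

The second line shows why the effect is not noticed at everyday accelerations.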

Quantum field theory states that a vacuum is not empty space, but the minimum values of the

quantum fields which appear as virtual particles. These virtual particles exert a pressure on an

object, but the pressure on an object moving with constant speed is normally equal on all sides.

The Casimir Effect or Casimir–Polder force occurs when two flat plates are placed very close

together – so close that only those virtual particles whose wavelength can fit between the plates

can exist. This means there are fewer particles between the plates than outside of them and hence a lower pressure between them and so the plates are attracted towards each other. The

effect has been measured in experiments.


Hawking radiation comes from a black hole’s event horizon. Unruh radiation comes from the cosmic event horizon, which is a sphere of radius $\frac{c}{H_0}$ surrounding an observer. Special relativity assumes that the observer is moving at a constant speed. However if the observer is accelerating a new event horizon forms as a cone whose apex is on the line along which the observer is accelerating and away from which the observer is accelerating. This is called the Rindler

horizon. If a measuring rod is accelerating along its length the Lorentz contraction means the

rod becomes shorter and so the trailing end must accelerate more than the front. Eventually the

trailing end will overtake the front and since this is impossible there is an event horizon. This

trailing horizon replaces the cosmic horizon behind the observer and so symmetry is lost.

This means there are fewer virtual particles behind the accelerating observer than in front, and

so there is a pressure reducing the acceleration. The greater the acceleration the greater this

pressure. This is similar to the Casimir effect and the original name of the QI theory was MiHsC

(Modified Inertia from a Hubble-scale Casimir effect).

It is important to remember that this effect results purely from acceleration and has no relationship to speed, and that an object, or part of an object, rotating about a point that is not its centre of mass experiences an acceleration towards that point. The acceleration decreases with distance from the centre of rotation, so at some distance it will approach the limit of $(2 \pm 0.2)\times10^{-10}$ m s⁻², since the acceleration is proportional to $\frac{1}{r^2}$.

For one body orbiting another $a = \frac{v^2}{r}$ assuming a circular orbit. The centripetal force is $f = m\frac{v^2}{r}$. From Newton’s law of gravitation $f = \frac{GMm}{r^2}$. The two forces are equal so $m\frac{v^2}{r} = \frac{GMm}{r^2}$ and hence $v = \sqrt{\frac{GMm}{mr}}$, and the two $m$’s normally cancel.

The orbiting body must have an orbital speed of $v = \sqrt{\frac{GM}{r}}$ to remain in the orbit; a higher speed would result in a higher orbit and a lower speed in a lower orbit (assuming the orbits must be circular). The centripetal force (gravity) is $f = \frac{GMm_G}{r^2}$ where $m_G$ is the gravitational mass (the mass of a non-accelerating body). This is basic Newtonian mechanics.
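As a quick numerical check of the Newtonian result, the circular orbital speed can be evaluated for the Earth's orbit around the Sun. The sketch uses the solar mass and astronomical unit listed in the Units section; the function name is hypothetical.

```python
import math

G = 6.674e-11     # gravitational constant, N m^2 kg^-2
M_SUN = 1.99e30   # mass of the Sun, kg
AU = 1.496e11     # Earth-Sun distance, m


def circular_orbit_speed(central_mass, radius):
    """Newtonian circular orbit speed v = sqrt(G M / r) in m/s."""
    return math.sqrt(G * central_mass / radius)


earth_speed = circular_orbit_speed(M_SUN, AU)
print(f"Earth orbital speed: {earth_speed / 1000:.1f} km/s")

# The speed falls as 1/sqrt(r): four times the radius gives half the speed.
print(f"Speed at 4 AU      : {circular_orbit_speed(M_SUN, 4 * AU) / 1000:.1f} km/s")
```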

However stars orbiting a central mass (such as those in the outer reaches of galaxies or binary

stars with very large separations) have observed speeds that are too high for the radius of the

orbit.

Newtonian mechanics and general relativity assume that $m_I$ is proportional to $m_G$, and in the scientific system of units the constant of proportionality is 1 so they have the same value. However QI states this is only an approximation and that $m_I = m_G\left(1 - \frac{\beta\pi^2c^2}{a\Theta}\right) \cong m_G\left(1 - \frac{2c^2}{a\Theta}\right)$.

In QI $f = m_I\frac{v^2}{r}$ and $f = \frac{GMm_G}{r^2}$ where $m_I$ is the inertial mass (of an accelerating body), and $m_G$ is the gravitational mass (the mass of a non-accelerating body). Making the two forces equal gives $m_I\frac{v^2}{r} = \frac{GMm_G}{r^2}$ and so $v = \sqrt{\frac{GMm_G}{m_Ir}}$, which means that $v$ is much larger than that predicted by Newtonian physics in the case of low accelerations, which occur at large radii. This result is only valid for two bodies such as binary stars separated by distances over 7000 AU.


The application to galaxies is more complicated because the galaxy mass is distributed over a large volume and cannot be approximated by a point mass. At the outer edge of the galaxy the acceleration must be a minimum of $a_0$ and it was shown that the actual acceleration is given by $a = a_B\sqrt{1 + \frac{2c^2}{a_B\Theta_0}}$ where $a_B$ is the acceleration calculated using the baryonic mass. This gives $a = \sqrt{\frac{2GMc^2}{r^2\Theta_0}}$ and replacing $a$ by $\frac{v^2}{r}$ gives $v = \sqrt[4]{\frac{2GMc^2}{\Theta_0}}$. This is only valid for small redshifts $z < 0.1$ (astronomical distances). For cosmological distances the value of $\Theta_0$ must be corrected for the cosmic expansion. In 2017 it was shown that replacing $\Theta_0$ by $\Theta = \frac{\Theta_0}{1+z}$, where $z$ is the redshift, gave results close to observations for $z < 2.2$.
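The rotation speed formula can be tried with illustrative numbers. In the sketch below the combination $2c^2/\Theta_0$ is taken as the minimum acceleration of $2\times10^{-10}$ m s⁻² quoted in the previous section, and the galaxy mass of 10¹¹ solar masses is a purely hypothetical value chosen for the example.

```python
import math

G = 6.674e-11      # gravitational constant, N m^2 kg^-2
M_SUN = 1.99e30    # mass of the Sun, kg
A0 = 2.0e-10       # 2 c^2 / Theta_0 in m s^-2, taken from the quoted minimum acceleration


def qi_acceleration(a_baryonic, a0=A0):
    """Acceleration a = a_B * sqrt(1 + a0 / a_B) from the expression in the text."""
    return a_baryonic * math.sqrt(1.0 + a0 / a_baryonic)


def qi_rotation_speed(total_mass, a0=A0):
    """Rotation speed from v^4 = G M * (2 c^2 / Theta_0), in m/s."""
    return (G * total_mass * a0) ** 0.25


galaxy_mass = 1.0e11 * M_SUN   # hypothetical galaxy mass for illustration
print(f"rotation speed: {qi_rotation_speed(galaxy_mass) / 1000:.0f} km/s")
```

Note that the predicted speed does not depend on radius, and that `qi_acceleration` tends to the baryonic value when $a_B \gg a_0$.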

This theory is not widely accepted and dark matter remains the choice for many. However other

scientists believe that quantised inertia could replace general relativity.


Units

The basic units and their dimensions are:

Mass                 kg    kilogram   [M]
Length               m     metre      [L]
Time                 s     second     [T]
Electric current     A     ampere     [I]
Temperature          K     kelvin     [Θ]
Amount               mol   mole       [N]
Luminous intensity   cd    candela    [J]
Angle                rad   radian     [non-dimensional since L L⁻¹]

Virtually all physical quantities can be specified using combinations of the above.

Frequency          Hz      hertz, cycles per second                  [T⁻¹]
Force              N       newton, 1 N = 1 kg m s⁻²                  [M L T⁻²]
Energy             J       joule, 1 J = 1 N m                        [M L² T⁻²]
Energy             eV      electron volt, 1 eV = 1.6×10⁻¹⁹ J
Momentum           N s     newton second                             [M L T⁻¹]
Angular momentum   J s     joule second                              [M L² T⁻¹]
Power              W       watt, 1 W = 1 J s⁻¹                       [M L² T⁻³]
Potential          V       volt, 1 V = 1 W A⁻¹                       [M L² T⁻³ I⁻¹]
Luminosity         W       energy radiated per unit time             [M L² T⁻³]
Flux               W m⁻²   energy received per unit time and area    [M T⁻³]

Astronomy and cosmology have some non-standard units – the conversion factors are

sometimes included in a constant so great care must be taken.

Mass

1 eV/c2 = 1.7827 x10-36 kg often written as just 1eV, but division by c2 is required

to get units of mass from 𝐸 = 𝑚𝑐2.

Time is often measured in hours, days or years, and time in the past in annum or in

terms of redshift 𝑧, but there is no simple conversion into other units.

There are numerous units for length or distance

1 AU = 1.496×10¹¹ m   astronomical unit, the distance of the Earth from the Sun

1 ly = 9.461×10¹⁵ m   light year, the distance travelled by light in one year

1 pc = 3.086×10¹⁶ m   parsec, used in astronomy in place of the light year; it is the distance of a star that appears to move 1 second of arc due to the Earth’s orbit around the Sun, or equivalently the distance from the Sun at which the radius of the Earth’s orbit subtends an angle of one second of arc, and is about 3.26 light years.

Distances are also expressed in terms of redshift, especially in cosmology, but there is no simple conversion into other units.
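The parsec follows from the astronomical unit and the definition of one second of arc. A minimal sketch, with variable names chosen for illustration:

```python
import math

AU = 1.496e11          # astronomical unit in metres
LIGHT_YEAR = 9.461e15  # light year in metres

# One second of arc in radians: 1/3600 of a degree.
ARCSECOND = math.pi / (180.0 * 3600.0)

# Distance at which 1 AU subtends one second of arc.
parsec_m = AU / math.tan(ARCSECOND)
parsec_ly = parsec_m / LIGHT_YEAR

print(f"1 pc = {parsec_m:.3e} m = {parsec_ly:.2f} light years")
```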


The masses and radii of objects are often expressed as multiples of those of the Earth (E or ⊕), Moon (M), Jupiter (J) or Sun (S or ⨀), with the common symbols as subscripts.

           Mass (kg)    Radius (m)
Sun        1.99×10³⁰    6.96×10⁸
Jupiter    1.90×10²⁷    7.15×10⁷
Earth      5.97×10²⁴    6.37×10⁶
Moon       7.40×10²²    1.73×10⁶

For an equation to make physical sense the units on both sides must be the same, and terms

separated by plus or minus signs must also have the same units. Additionally the arguments to

nonlinear functions such as the trigonometric functions and powers must be dimensionless. In

most cases the simplest approach is to first convert all units to the SI system, but there are cases

where the conversion factors will cancel out.

Constants

$c$ = 299,792,458 m s⁻¹   the speed of light, 3×10⁸ being a first order approximation

$G$ = 6.674×10⁻¹¹ N m² kg⁻²   gravitational constant

$h$ = 6.626×10⁻³⁴ J s   Planck’s constant

$\hbar$ = 1.055×10⁻³⁴ J s   reduced Planck’s constant

$k$ = 1.381×10⁻²³ J K⁻¹   Boltzmann’s constant

$\varepsilon_0$ = 8.854×10⁻¹² C² N⁻¹ m⁻²   the permittivity of free space

$\mu_0$ = 1.257×10⁻⁶ N A⁻²   the permeability of free space

$\varepsilon_0\mu_0c^2 = 1$

$\rho_{c0}$ = 8.62×10⁻²⁷ kg m⁻³   current value of critical density, $\frac{3H_0^2}{8\pi G}$, for $H_0$ = 67.74 km s⁻¹ Mpc⁻¹

$H_0$ = 67.74 km s⁻¹ Mpc⁻¹   Hubble’s constant, cosmological value

$H_0$ = 73.2 km s⁻¹ Mpc⁻¹   Hubble’s constant, experimental value

$H_0$ = 70 km s⁻¹ Mpc⁻¹   Hubble’s constant, average value


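Equations for Special Relativity

$\gamma(V) = \frac{1}{\sqrt{1 - \frac{V^2}{c^2}}}$   the gamma or Lorentz factor; $\gamma(V) \cong 1 + \frac{V^2}{2c^2}$ for $V \ll c$

$V = c\sqrt{1 - \frac{1}{\gamma^2(V)}}$   but the following approximation is commonly used

$\frac{V}{c} \cong 1 - \frac{1}{2\gamma^2(V)}$ for $\gamma \ge 10$

$\Delta t = \gamma(V)\,\Delta\tau$   Time dilation

$\Delta x = \frac{\chi}{\gamma(V)}$   Length contraction

$\gamma(V) = \frac{\chi}{x} = \frac{t}{\tau}$   Consistency of velocity

$\Delta\tau = \frac{\Delta s}{c}$   Proper time

$\chi = \gamma(V)\,x$   Proper length

$\Delta s = \sqrt{(c\Delta t)^2 - (\Delta r)^2} = c\,\Delta\tau$   Separation

$v_{Bx} = \frac{v_{Ax} - V}{1 - \frac{Vv_{Ax}}{c^2}}$, $v_{By} = \frac{v_{Ay}}{\gamma(V)\left(1 - \frac{Vv_{Ax}}{c^2}\right)}$, $v_{Bz} = \frac{v_{Az}}{\gamma(V)\left(1 - \frac{Vv_{Ax}}{c^2}\right)}$   where $V$ is the velocity of frame B with respect to frame A

$\mathbf{p} = m\gamma(V)\mathbf{v}$   Relativistic momentum

$E_0 = mc^2$   Rest energy

$E = \gamma(v)mc^2 = E_{KE} + E_0$   Total relativistic energy

$E_{KE} = (\gamma(v) - 1)mc^2$   Kinetic energy

$E^2 = p^2c^2 + m^2c^4$   Energy-momentum relation

$\lambda_{ob} = \lambda_{em}\frac{\sqrt{c+V}}{\sqrt{c-V}}$ or $f_{ob} = f_{em}\frac{\sqrt{c-V}}{\sqrt{c+V}}$   Doppler effect due to special relativity

$V = \frac{f_{em}^2 - f_{ob}^2}{f_{em}^2 + f_{ob}^2}\,c = \frac{\lambda_{ob}^2 - \lambda_{em}^2}{\lambda_{ob}^2 + \lambda_{em}^2}\,c = \frac{1 - \left(\frac{f_{ob}}{f_{em}}\right)^2}{1 + \left(\frac{f_{ob}}{f_{em}}\right)^2}\,c = \frac{1 - \left(\frac{\lambda_{em}}{\lambda_{ob}}\right)^2}{1 + \left(\frac{\lambda_{em}}{\lambda_{ob}}\right)^2}\,c$   with $V$ positive indicating receding, negative approaching

$f_{ob} = \frac{1}{\sqrt{1 - \frac{2GM}{Rc^2}}}\,f_{em}$ or $\lambda_{ob} = \sqrt{1 - \frac{2GM}{Rc^2}}\,\lambda_{em}$   gravitational redshift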
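Several of the special relativity formulas above are easy to exercise numerically. The following sketch is illustrative only; the function names are chosen here and only motion along the x axis is treated.

```python
import math

C = 299_792_458.0  # speed of light, m s^-1


def gamma(v):
    """Lorentz factor for speed v."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def transform_velocity_x(v_ax, frame_v):
    """x velocity in frame B, where B moves at frame_v relative to frame A."""
    return (v_ax - frame_v) / (1.0 - frame_v * v_ax / C ** 2)


def doppler_wavelength(lambda_em, v):
    """Observed wavelength for a source receding at speed v (negative if approaching)."""
    return lambda_em * math.sqrt((C + v) / (C - v))


def speed_from_wavelengths(lambda_ob, lambda_em):
    """Recession speed recovered from observed and emitted wavelengths."""
    return C * (lambda_ob ** 2 - lambda_em ** 2) / (lambda_ob ** 2 + lambda_em ** 2)


print(f"gamma at 0.6c           : {gamma(0.6 * C):.3f}")
print(f"0.5c seen from -0.5c    : {transform_velocity_x(0.5 * C, -0.5 * C) / C:.3f} c")
shifted = doppler_wavelength(656.3e-9, 0.1 * C)   # H-alpha line receding at 0.1c
print(f"recovered speed         : {speed_from_wavelengths(shifted, 656.3e-9) / C:.3f} c")
```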


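$[\Lambda^{\mu}{}_{\nu}] = \begin{pmatrix} \gamma(V) & -\gamma(V)\frac{V}{c} & 0 & 0 \\ -\gamma(V)\frac{V}{c} & \gamma(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$   Lorentz transformation matrix

$[(\Lambda^{-1})^{\mu}{}_{\nu}] = \begin{pmatrix} \gamma(V) & \gamma(V)\frac{V}{c} & 0 & 0 \\ \gamma(V)\frac{V}{c} & \gamma(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$   Inverse Lorentz transformation matrix

$[\eta_{\mu\nu}] \equiv \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}$   Minkowski metric for Cartesian coordinates

$[\eta_{\mu\nu}] \equiv \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -r^2 & 0 \\ 0 & 0 & 0 & -r^2\sin^2\theta \end{pmatrix}$   Minkowski metric for spherical coordinates

$A_B^{\mu} = \sum_{\nu=0}^{3}\frac{\partial A_B^{\mu}}{\partial A_A^{\nu}}A_A^{\nu} = \sum_{\nu=0}^{3}\Lambda^{\mu}{}_{\nu}A_A^{\nu}$, $\mu = 0,1,2,3$   Definition of a contravariant four-vector

$B_{B\mu} = \sum_{\nu=0}^{3}(\Lambda^{-1})^{\nu}{}_{\mu}B_{A\nu} = \sum_{\nu=0}^{3}\Lambda_{\mu}{}^{\nu}B_{A\nu}$, $\mu = 0,1,2,3$   Definition of a covariant four-vector

$[\Delta x^{\mu}] = (c\Delta t, \Delta\mathbf{r})$   Four-displacement

$\Delta x_B^{\mu} = \sum_{\nu=0}^{3}\Lambda^{\mu}{}_{\nu}\Delta x_A^{\nu}$, $\mu = 0,1,2,3$   Four-displacement transformation

$[U^{\mu}] = (c\gamma(V), \gamma(V)\mathbf{v})$   Four-velocity

$U_B^{\mu} = \sum_{\nu=0}^{3}\Lambda^{\mu}{}_{\nu}U_A^{\nu}$, $\mu = 0,1,2,3$   Four-velocity transformation

$[P^{\mu}] = \left(\frac{E}{c}, \mathbf{p}\right)$   Four-momentum

$P_B^{\mu} = \sum_{\nu=0}^{3}\Lambda^{\mu}{}_{\nu}P_A^{\nu}$, $\mu = 0,1,2,3$   Four-momentum transformation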


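$[F^{\mu\nu}] = \begin{pmatrix} 0 & -\frac{\mathcal{E}_x}{c} & -\frac{\mathcal{E}_y}{c} & -\frac{\mathcal{E}_z}{c} \\ \frac{\mathcal{E}_x}{c} & 0 & -B_z & B_y \\ \frac{\mathcal{E}_y}{c} & B_z & 0 & -B_x \\ \frac{\mathcal{E}_z}{c} & -B_y & B_x & 0 \end{pmatrix}$   Electromagnetic contravariant four-tensor

$[F_{\mu\nu}] = \begin{pmatrix} 0 & \frac{\mathcal{E}_x}{c} & \frac{\mathcal{E}_y}{c} & \frac{\mathcal{E}_z}{c} \\ -\frac{\mathcal{E}_x}{c} & 0 & -B_z & B_y \\ -\frac{\mathcal{E}_y}{c} & B_z & 0 & -B_x \\ -\frac{\mathcal{E}_z}{c} & -B_y & B_x & 0 \end{pmatrix}$   Electromagnetic covariant four-tensor

$[J^{\mu}] = (c\rho, J_x, J_y, J_z)$   Current contravariant four-vector

$F^{\mu} = q\sum_{\nu=0}^{3}F^{\mu\nu}U_{\nu}$   Electromagnetic force contravariant four-vector

$F_B^{\mu\nu} = \sum_{\alpha,\beta=0}^{3}\Lambda^{\mu}{}_{\alpha}\Lambda^{\nu}{}_{\beta}F_A^{\alpha\beta}$   Electromagnetic contravariant four-tensor transformation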


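Equations for General Relativity

$\sum_k g^{ik}g_{kj} = \delta^i{}_j = I$   Dual metric and metric tensor

$\Gamma^i{}_{jk} = \frac{1}{2}\sum_l g^{il}\left(\frac{\partial g_{lk}}{\partial x^j} + \frac{\partial g_{jl}}{\partial x^k} - \frac{\partial g_{jk}}{\partial x^l}\right)$   Connection coefficients

$R^l{}_{ijk} = \frac{\partial\Gamma^l{}_{ik}}{\partial x^j} - \frac{\partial\Gamma^l{}_{ij}}{\partial x^k} + \sum_m\Gamma^m{}_{ik}\Gamma^l{}_{mj} - \sum_m\Gamma^m{}_{ij}\Gamma^l{}_{mk}$   Riemannian curvature tensor

$R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R = -\kappa T_{\mu\nu}$   Einstein’s field equations

$R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R = -\kappa\left(T_{\mu\nu} + \frac{\Lambda}{\kappa}g_{\mu\nu}\right)$   including the cosmological constant

$\kappa = \frac{8\pi G}{c^4}$   Einstein’s constant

$\frac{d^2x^{\rho}}{d\lambda^2} + \sum_{\alpha,\beta}\Gamma^{\rho}{}_{\alpha\beta}\frac{dx^{\alpha}}{d\lambda}\frac{dx^{\beta}}{d\lambda} = 0$   Geodesic equations

Schwarzschild metric

$(ds)^2 = g_{\mu\nu}\,dx^{\mu}\,dx^{\nu} = \left(1 - \frac{2GM}{rc^2}\right)(c\,dt)^2 - \frac{1}{1 - \frac{2GM}{rc^2}}(dr)^2 - r^2(d\theta)^2 - r^2\sin^2\theta\,(d\phi)^2$

$R_s = \frac{2GM}{c^2}$   Schwarzschild radius

$\Delta\tau = \sqrt{1 - \frac{2GM}{rc^2}}\,\Delta t$   Schwarzschild time

$(ds)^2 = \left(1 - \frac{R_s}{r}\right)(c\,dt')^2 - 2\frac{R_s}{r}\,c\,dt'\,dr - \left(1 + \frac{R_s}{r}\right)(dr)^2 - r^2(d\theta)^2 - r^2\sin^2\theta\,(d\phi)^2$

in Eddington-Finkelstein coordinates where $ct' = ct + R_s\ln\left|\frac{r}{R_s} - 1\right|$

Kerr metric

$(ds)^2 = \left(1 - \frac{R_sr}{\rho^2}\right)(c\,dt)^2 + 2\frac{R_sra\sin^2\theta}{\rho^2}\,c\,dt\,d\phi - \frac{\rho^2}{\Delta}(dr)^2 - \rho^2(d\theta)^2 - \left((r^2 + a^2)\sin^2\theta + \frac{R_sra^2\sin^4\theta}{\rho^2}\right)(d\phi)^2$

where

$x = \sqrt{r^2 + a^2}\,\sin\theta\cos\phi$

$y = \sqrt{r^2 + a^2}\,\sin\theta\sin\phi$

$R_s = \frac{2GM}{c^2}$

$a = \frac{J}{Mc}$

$\rho^2 = r^2 + a^2\cos^2\theta$

$\Delta = r^2 + a^2 - R_sr$

$T = \frac{\hbar c^3}{8\pi kGM}$ kelvin   Black hole temperature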
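To give a feel for the magnitudes involved, the Schwarzschild radius, the time dilation factor and the black hole temperature can be evaluated for one solar mass. This is a sketch with rounded constants; the function names are chosen for illustration.

```python
import math

G = 6.674e-11       # gravitational constant, N m^2 kg^-2
C = 299_792_458.0   # speed of light, m s^-1
HBAR = 1.055e-34    # reduced Planck constant, J s
K_B = 1.381e-23     # Boltzmann constant, J K^-1
M_SUN = 1.99e30     # mass of the Sun, kg
R_SUN = 6.96e8      # radius of the Sun, m


def schwarzschild_radius(mass):
    """R_s = 2 G M / c^2 in metres."""
    return 2.0 * G * mass / C ** 2


def time_dilation_factor(mass, r):
    """Ratio of proper time to coordinate time at radius r outside the mass."""
    return math.sqrt(1.0 - schwarzschild_radius(mass) / r)


def black_hole_temperature(mass):
    """T = hbar c^3 / (8 pi k G M) in kelvin."""
    return HBAR * C ** 3 / (8.0 * math.pi * K_B * G * mass)


print(f"R_s for one solar mass       : {schwarzschild_radius(M_SUN):.0f} m")
print(f"1 - dtau/dt at solar surface : {1.0 - time_dilation_factor(M_SUN, R_SUN):.2e}")
print(f"T of a solar mass black hole : {black_hole_temperature(M_SUN):.2e} K")
```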


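Friedmann-Robertson-Walker Metric

$(ds)^2 = (c\,dt)^2 - R^2(t)\left(\frac{1}{1 - kr^2}(dr)^2 + r^2(d\theta)^2 + r^2\sin^2\theta\,(d\phi)^2\right)$

$\sigma = Rr$   Proper distance

$\frac{d\sigma}{d\tau} = \frac{1}{R}\frac{dR}{d\tau}\sigma$ or $H(t) = \frac{1}{R}\frac{dR}{d\tau}$ or $v_p = H(t)\,d_p$   Hubble’s Law

where $v_p = \frac{d\sigma}{d\tau}$ is the proper radial velocity and $d_p = \sigma$ is the proper radial distance.

Equations of state

$\rho = a^{-3}\rho_{m0} + a^{-4}\rho_{r0} + \rho_\Lambda$

$p = \frac{a^{-4}c^2}{3}\rho_{r0} - \rho_\Lambda c^2$

Friedmann equations

$\left(\frac{1}{R}\frac{dR}{dt}\right)^2 = \frac{8\pi G}{3}\rho - \frac{kc^2}{R^2}$   the energy equation

$\frac{1}{R}\frac{d^2R}{dt^2} = -\frac{4\pi G}{3}\left(\rho + \frac{3p}{c^2}\right)$   the acceleration equation

$\frac{d\rho}{dt} + \left(\rho + \frac{p}{c^2}\right)\frac{3}{R}\frac{dR}{dt} = 0$   the fluid equation

The first two equations can be written using current values, as indicated by the suffix 0, as

$\left(\frac{1}{R}\frac{dR}{dt}\right)^2 = \frac{8\pi G}{3}\left(a^{-3}\rho_{m0} + a^{-4}\rho_{r0} + \rho_\Lambda\right) - \frac{kc^2}{R^2}$

$\frac{1}{R}\frac{d^2R}{dt^2} = -\frac{4\pi G}{3}\left(a^{-3}\rho_{m0} + 2a^{-4}\rho_{r0} - 2\rho_\Lambda\right)$

Dimensionless density parameters

$\Omega_m(t) + \Omega_r(t) + \Omega_\Lambda(t) + \Omega_k(t) = 1$

$\Omega_t(t) = \Omega_m(t) + \Omega_r(t) + \Omega_\Lambda(t) = 1 - \Omega_k(t)$

$\Omega_m(t) = \frac{\rho_m(t)}{\rho_c(t)} = \frac{8\pi G\rho_m(t)}{3H^2(t)}$

$\Omega_r(t) = \frac{\rho_r(t)}{\rho_c(t)} = \frac{8\pi G\rho_r(t)}{3H^2(t)}$

$\Omega_\Lambda(t) = \frac{\rho_\Lambda}{\rho_c(t)} = \frac{\Lambda c^2}{3H^2(t)}$

$\Omega_k(t) = -\frac{kc^2}{R^2(t)H^2(t)}$

$a = \frac{R_1}{R_0}$   Scale factor

$z = \frac{1 - a}{a}$   Cosmological red shift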

