
ASTM002 The Galaxy Course notes 2006 (QMUL)


THE GALAXY

Notes for Lecture Courses

ASTM002 and MAS430

Queen Mary University of London

Bryn Jones

Prasenjit Saha

January – March 2006


Chapter 1

Introducing Galaxies

Galaxies are a slightly difficult topic in astronomy at present. Although a great amount is known about them, galaxies are much less well understood than, say, stars. There remain many problems relating to their formation and evolution, and even with aspects of their structure. Fortunately, extragalactic research is currently very active and our understanding is changing, and improving, noticeably each year.

One reason that galaxies are difficult to understand is that they are made of three very different entities: stars, an interstellar medium (gas with some dust, often abbreviated as ISM), and dark matter. There is an interplay between the stars and gas, with stars forming out of the gas and with gas being ejected back into the interstellar medium from evolved stars. The dark matter affects the other material through its strong gravitational potential, but very little is known about it directly. We shall study each of these in this course, and to a small extent how they influence each other.

An additional factor complicating an understanding of galaxies is that their evolution is strongly affected by their environment. The gravitational effects of other galaxies can be important. Galaxies can sometimes interact and even merge. Intergalactic gas, for example that found in clusters of galaxies, can be important. These various factors, and the feedback from one to another, mean that a number of important problems remain to be solved in extragalactic science, but this gives the subject its current vigour.

One important issue of terminology needs to be clarified at the outset. The word “Galaxy” with a capital ‘G’ refers to our own Galaxy, i.e. the Milky Way Galaxy, as does the word “Galactic”. In contrast, “galaxy”, “galaxies” and “galactic” with a lower-case ‘g’ refer to other galaxies and to galaxies in general.

1.1 Galaxy Types

Some galaxies (more of them in earlier epochs) have active nuclei which can vastly outshine the starlight. We shall not go into that here – we shall confine ourselves to normal galaxies and ignore active galaxies.

There are three broad categories of normal galaxies:

• elliptical galaxies (denoted E);

• disc galaxies, i.e. spiral (S) and lenticular (S0) galaxies;

• irregular galaxies (I or Irr).

Classifications often include peculiar galaxies which have unusual shapes. These are mostly the result of interactions and mergers.

These classifications are based on the shapes and structures of galaxies, i.e. on the morphology. They are therefore known as morphological types.


1.2 Disc Galaxies

Disc galaxies have prominent flattened discs. They have masses of $10^{9}\,M_\odot$ to $10^{12}\,M_\odot$. They include spiral galaxies and S0 (or lenticular) galaxies. Spirals are gas rich and this gas takes part in star formation. S0s on the other hand have very little gas and no star formation, while their discs are more diffuse than those of spirals.

1.2.1 Spiral galaxies

Spiral galaxies have much gas within their discs, plus some embedded dust, which amounts to 1-20% of their visible mass (the rest of the visible mass is stars). This gas shows active star formation. The discs contain stars having a range of ages as a result of this continuing star formation. Spiral arms are apparent in the discs, defined by young, luminous stars and by H II regions. Spiral galaxies have a central bulge component containing mostly old stars. These bulges superficially resemble small elliptical galaxies. The disc and bulge are embedded in a fainter halo component composed of stars and globular clusters. The stars of this stellar halo are very old and metal-poor (deficient in chemical elements other than hydrogen and helium). Some spirals have bars within their discs. These are called barred spirals and are designated type SB. Non-barred, or normal, spirals are designated type S or SA. The spectra are dominated by F- and G-type stars but also show prominent emission lines from the gas: the spectra show the absorption lines from the stars with the emission lines superimposed. The spiral disc is highly flattened and the gas is concentrated close to the plane. The spiral arms are associated with regions of enhanced gas density that are usually caused by density waves. Discs rotate, with the stars and gas in near-circular orbits close to the plane, having circular velocities $\sim 200$ to $250\ \mathrm{km\,s^{-1}}$. In contrast, the halo stars have randomly oriented orbits with speeds $\sim 250\ \mathrm{km\,s^{-1}}$. Spiral galaxies are plentiful away from regions of high galaxy density (away from the cores of galaxy clusters). All disc galaxies seem to be embedded in much larger dark haloes; the ratio of total mass to visible stellar mass is $\simeq 5$, but we do not really have a good mass estimate for any disc galaxy.

The disc’s surface brightness $I$ tends to follow a roughly exponential decline with radial distance $R$ from the centre, i.e.,

$$ I(R) = I_0 \exp(-R/R_0) $$

where $I_0 \sim 10^{2}\,L_\odot\,\mathrm{pc}^{-2}$ is the central surface brightness, and $R_0$ is a scale length for the decline in brightness. The scale length $R_0$ is $\simeq 3.5$ kpc for the Milky Way.
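As a rough illustration of the exponential law, the sketch below evaluates $I(R)$ with the values quoted above ($I_0 \sim 10^{2}\,L_\odot\,\mathrm{pc}^{-2}$, $R_0 \simeq 3.5$ kpc). The function names are illustrative only. The enclosed-light fraction uses the standard result of integrating the profile over a face-on disc of infinite extent, $L(<R)/L_{\rm total} = 1 - (1 + R/R_0)\exp(-R/R_0)$, which is an assumption brought in here and is not derived in these notes.

```python
import math

def disc_surface_brightness(R_kpc, I0=100.0, R0_kpc=3.5):
    """Exponential disc: I(R) = I0 * exp(-R / R0), in L_sun pc^-2."""
    return I0 * math.exp(-R_kpc / R0_kpc)

def enclosed_light_fraction(R_kpc, R0_kpc=3.5):
    """Fraction of the total disc light inside radius R.

    Assumes a face-on, infinitely extended exponential disc.
    """
    x = R_kpc / R0_kpc
    return 1.0 - (1.0 + x) * math.exp(-x)

# Tabulate the profile at a few radii (8 kpc is roughly the Sun's radius).
for R in (0.0, 3.5, 8.0, 20.0):
    print(f"R = {R:5.1f} kpc   I = {disc_surface_brightness(R):8.3f} L_sun/pc^2   "
          f"light inside R = {enclosed_light_fraction(R):.3f}")
```

At one scale length the surface brightness has dropped by a factor of e, which is the defining property of $R_0$.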

There are clear trends in the properties of spirals. Those containing the smallest amounts of gas are called subtype Sa and tend to have large, bright bulges compared to the discs, tightly wound spiral arms, and relatively red colours. More gas-rich spirals, such as subtype Sc, have small bulges, open spiral arms and blue colours. There is a gradual variation in these properties from subtype Sa through Sab, Sb, Sbc, to Sc. Some classification schemes include more extreme subtypes Sd and Sm. Subtypes of barred spirals are denoted SBa, SBab, SBb, SBbc, SBc, ...

Because the spiral arms mark regions of recent star formation, the stars in the arms are young and blue in colour. Therefore, spiral arms are most prominent when a galaxy is observed in blue or ultraviolet light, and less prominent when observed in the red or infrared. Figure 1.5 shows three images of a spiral galaxy recorded through blue, red and infrared filters only. The spiral pattern is strong in the blue image, but weak in the infrared.

Figure 1.1: the sequence of morphological types of elliptical galaxies. Elliptical galaxies are classified according to their shape. Panels: E0 (NGC 1407), E2 (NGC 1395), E4 (NGC 584), E6 (NGC 4033). [Created with blue-band data from the SuperCOSMOS Sky Survey.]

Figure 1.2: the sequence of normal (non-barred) spiral types. Spiral galaxies are classified according to how tightly wound their arms are, the prominence of the central bulge, and the quantity of interstellar gas. Panels: Sa (ESO286-G10), Sb (NGC 3223), Sc (M74), Sd (NGC 300), Sm (NGC 4395). [Created with blue-band data from the SuperCOSMOS Sky Survey.]

Figure 1.3: the sequence of barred spiral types. Panels: SBa (NGC 4440), SBb (NGC 1097), SBc (NGC 1073), SBd (NGC 1313), SBm (NGC 4597). [Created with blue-band data from the SuperCOSMOS Sky Survey.]

Figure 1.4: examples of irregular galaxies. Panels: NGC 1569, NGC 4214, NGC 4449, NGC 7292. [Created with blue-band data from the SuperCOSMOS Sky Survey.]

Figure 1.5: the spiral galaxy NGC 2997 in blue, red and infrared light, showing that the prominence of the spiral arms varies with the observed wavelength of light. The blue spiral arms are most evident in the blue image, while the old (red) stellar population is seen more clearly in the infrared image. Panels: Blue, Red, Infrared. [The picture was produced using data from the SuperCOSMOS Sky Survey of the Royal Observatory Edinburgh, based on photography from the United Kingdom Schmidt Telescope.]

Observations show that there is a correlation between luminosity $L$ (the total power output of galaxies due to emitted light, infrared radiation, ultraviolet etc.) and the maximum rotational velocity $v_{\rm rot}$ of the disc for spiral galaxies. The relationship is close to

$$ L \propto v_{\rm rot}^{4} $$

This is known as the Tully-Fisher relationship. It is important because it allows the luminosity $L$ to be calculated from the rotation velocity $v_{\rm rot}$, which can be measured using optical or radio spectroscopy. A comparison of the luminosity and the observed brightness gives the distance to the galaxy.
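The distance argument can be sketched numerically. The example below is only an illustration: the calibrating galaxy, its luminosity `L_REF_W`, and the observed flux are hypothetical numbers chosen for the demonstration, and the exponent 4 is the approximate value quoted above. It scales the luminosity as $v_{\rm rot}^4$ relative to the calibrator and then inverts the inverse-square law, flux $= L/(4\pi d^2)$.

```python
import math

METRES_PER_MPC = 3.0857e22

def luminosity_ratio(v_rot, v_ref):
    """Tully-Fisher scaling L proportional to v_rot^4, relative to a reference galaxy."""
    return (v_rot / v_ref) ** 4

def distance_from_flux(luminosity_W, flux_W_m2):
    """Inverse-square law, flux = L / (4 pi d^2), solved for d in metres."""
    return math.sqrt(luminosity_W / (4.0 * math.pi * flux_W_m2))

# Hypothetical calibrator: v_rot = 200 km/s with an assumed known luminosity.
L_REF_W = 1.0e37      # assumed value, for illustration only
V_REF_KMS = 200.0

# Target galaxy: measured v_rot = 250 km/s and an assumed observed flux.
L_target = L_REF_W * luminosity_ratio(250.0, V_REF_KMS)
d_m = distance_from_flux(L_target, flux_W_m2=1.0e-12)
print(f"L = {L_target:.3e} W, d = {d_m / METRES_PER_MPC:.1f} Mpc")
```

Because of the fourth power, a modest error in $v_{\rm rot}$ produces a much larger error in $L$, and hence in the distance.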

1.2.2 S0 galaxies

S0 galaxies, sometimes also known as lenticular galaxies, are flattened disc systems like spirals but have very little gas or dust. They therefore contain only older stars. They have probably been formed from spirals that have lost or exhausted their gas.

1.3 Elliptical Galaxies

These have masses from $10^{10}\,M_\odot$ to $10^{13}\,M_\odot$ (not including dwarf ellipticals which have lower masses). They have elliptical shapes, but little other structure. They contain very little gas, so almost all of the visible component is in the form of stars (there is very little dust). With so little gas, there is no appreciable star formation, with the result that elliptical galaxies contain almost only old stars. Their colours are therefore red. K-type giant stars dominate the visible light, and their optical spectra are broadly similar to K-type stars with no emission lines from an interstellar medium: the spectra have absorption lines only. Dark matter is important, and there is probably an extensive dark matter halo with a similar proportion of dark to visible matter as spirals.

Ellipticals are classified by their observed shapes. They are given a type En where $n$ is an integer describing the apparent ellipticity, defined as $n = \mathrm{int}[10(1 - b/a)]$ where $b/a$ is the axis ratio (the ratio of the semi-minor to semi-major axes) as seen on the sky. In practice we observe only types E0 (circular) to E7 (most flattened). We never see ellipticals flatter than about E7. The reason (as indicated by simulations and normal mode analyses) seems to be that a stellar system any flatter is unstable to buckling, and will eventually settle into something rounder. Note that these subtypes reflect the observed shapes, not the three-dimensional shapes: a very elongated galaxy seen end-on would be classified as type E0.
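The classification rule is simple enough to write down directly. The following is a minimal sketch of $n = \mathrm{int}[10(1 - b/a)]$; the function name is illustrative, and the small offset added before truncation is an implementation choice made here to guard against floating-point round-off, not part of the definition.

```python
import math

def elliptical_subtype(axis_ratio):
    """Return the subtype 'En' with n = int[10 (1 - b/a)].

    axis_ratio is the apparent b/a on the sky, with 0 < b/a <= 1.
    """
    if not 0.0 < axis_ratio <= 1.0:
        raise ValueError("axis ratio b/a must lie in (0, 1]")
    # The tiny offset protects against round-off, e.g. 1 - 0.8 evaluating
    # to slightly less than 0.2 in binary floating point.
    n = math.floor(10.0 * (1.0 - axis_ratio) + 1e-9)
    return f"E{n}"

for q in (1.0, 0.8, 0.65, 0.3):
    print(f"b/a = {q:4.2f}  ->  {elliptical_subtype(q)}")
```

The input is the projected axis ratio, so the same galaxy viewed from a different direction could receive a different subtype.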

Luminous ellipticals have very little net rotation. The orbits of the stars inside them are randomly oriented. The motions of the stars are characterised by a velocity dispersion $\sigma$ along the line of sight, most commonly the velocity dispersion at the centres $\sigma_0$. These ellipticals are usually triaxial in shape and have different velocity dispersions in the directions of the different axes. Less luminous ellipticals can have some net rotation.

Their surface brightness distributions are more centrally concentrated than those of spirals. There are various functional forms around for fitting the surface brightness, of which the best known is the de Vaucouleurs model,

$$ I(R) = I_0 \exp\left[ -\left( \frac{R}{R_0} \right)^{1/4} \right] , $$

with $I_0 \sim 10^{5}\,L_\odot\,\mathrm{pc}^{-2}$ for giant ellipticals. (To fit to observations, one typically un-squashes the ellipses to circles first. Also, the functional forms are only fitted to observations over the restricted range in which $I(R)$ is measurable. So don’t be surprised to see very different looking functional forms being fit to the same data.)
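To see how differently the de Vaucouleurs law behaves from the exponential disc law of Section 1.2.1, the sketch below evaluates both in units of their own central brightness and scale radius. Using the same $R/R_0$ grid for the two laws is purely for illustration: the scale radii of real ellipticals and discs are different quantities and are not directly comparable.

```python
import math

def de_vaucouleurs(R, I0, R0):
    """de Vaucouleurs model: I(R) = I0 * exp[-(R / R0)^(1/4)]."""
    return I0 * math.exp(-((R / R0) ** 0.25))

def exponential_disc(R, I0, R0):
    """Exponential disc: I(R) = I0 * exp(-R / R0)."""
    return I0 * math.exp(-R / R0)

# Compare I / I0 at a few radii measured in units of the scale radius.
for x in (0.01, 1.0, 10.0, 100.0):
    print(f"R/R0 = {x:6.2f}   R^(1/4) law: {de_vaucouleurs(x, 1.0, 1.0):.3e}   "
          f"exponential: {exponential_disc(x, 1.0, 1.0):.3e}")
```

The $R^{1/4}$ law drops quickly close to the centre but has a far more extended outer tail than the exponential.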

Ellipticals have large numbers of globular clusters. These are visible as faint star-like images superimposed on the galaxies. These globular clusters have masses $10^{4}\,M_\odot$ to a few $\times\,10^{6}\,M_\odot$.

Ellipticals are plentiful in environments where the density of galaxies is high, such as in galaxy clusters. Isolated ellipticals are rare.

Observations show that there is a correlation between three important observational parameters for elliptical galaxies. These quantities are the scale size $R_0$, the central surface brightness $I_0$, and the central velocity dispersion $\sigma_0$. The observations show that

$$ R_0 \, I_0^{0.9} \, \sigma_0^{-1.4} \simeq \mathrm{constant} . $$

This relation is known as the fundamental plane for elliptical galaxies. (The relation is also often expressed in terms of the radius $R_e$ containing half the light of the galaxy and the surface brightness $I_e$ at this radius.)

An older, and cruder, relation is that between luminosity $L$ and the central velocity dispersion:

$$ L \propto \sigma_0^{4} . $$

This is known as the Faber-Jackson relation. The Faber-Jackson relation, and particularly the fundamental plane, are very useful in estimating the distances to elliptical galaxies: the observational parameters $I_0$ and $\sigma_0$ give an estimate of $R_0$, which in turn with $I_0$ gives the total luminosity of the galaxy, which can then be used with the observed brightness to derive a distance.
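The chain of steps just described can be laid out as a short sketch. Everything numerical here is hypothetical: `fp_constant` stands for the calibrated value of the fundamental-plane constant, and `shape_factor` for the profile-dependent number that converts $I_0 R_0^2$ into a total luminosity. Neither is given in these notes, so the example works in arbitrary but mutually consistent units and only demonstrates the scalings.

```python
import math

def scale_radius(I0, sigma0, fp_constant):
    """Solve R0 * I0**0.9 * sigma0**-1.4 = constant for R0."""
    return fp_constant * I0 ** -0.9 * sigma0 ** 1.4

def total_luminosity(I0, R0, shape_factor):
    """L = shape_factor * I0 * R0^2; shape_factor depends on the assumed profile."""
    return shape_factor * I0 * R0 ** 2

def distance(luminosity, observed_flux):
    """Inverse-square law, flux = L / (4 pi d^2), solved for d."""
    return math.sqrt(luminosity / (4.0 * math.pi * observed_flux))

# Hypothetical inputs in arbitrary consistent units.
I0, sigma0, flux = 2.0, 3.0, 1.0e-3
R0 = scale_radius(I0, sigma0, fp_constant=1.0)
L = total_luminosity(I0, R0, shape_factor=1.0)
print(f"R0 = {R0:.3f}, L = {L:.3f}, d = {distance(L, flux):.3f}")
```

The point of the sketch is that only distance-independent observables ($I_0$, $\sigma_0$) enter the estimate of $R_0$, which is what makes the relation usable as a distance indicator.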

1.4 Irregular Galaxies

Irregular galaxies have irregular, patchy morphologies. They are gas-rich, showing strong star formation with many young stars. Ionised gas, particularly H II regions, is prominent around the regions of star formation. They tend to have strong emission lines from the interstellar gas, and their starlight is dominated by B, A and F types. As a result their colours are blue and their spectra show strong emission lines from the interstellar gas superimposed on the stellar absorption-line spectrum. Their internal motions are relatively chaotic. They are denoted type I or Irr.

1.5 Other Types of Galaxy

As already noted, some galaxies have unusual, disturbed morphologies and are called peculiar. These are mostly the result of interactions and mergers between galaxies. They are particularly numerous among distant galaxies.


Figure 1.6: examples of the optical spectra of elliptical, spiral and irregular galaxies. The elliptical spectrum shows only absorption lines produced by the stars in the galaxy. The spiral galaxy has absorption lines from its stars and some emission lines from its interstellar gas. In contrast, the irregular galaxy has very strong emission lines on a weaker stellar continuum. [Produced with data from the 2dF Galaxy Redshift Survey.]


Figure 1.7: The tuning fork diagram of Hubble types. The galaxies on the left are known as early types, and those on the right as late types.

Clusters of galaxies often have a very luminous, dominant elliptical at their cores, of a type called a cD galaxy. These have extensive outer envelopes of stars.

Low-luminosity galaxies are called dwarf galaxies. They have masses $10^{6}$ to $10^{9}\,M_\odot$. Common subclasses are dwarf irregulars having a large fraction of gas and active star formation, and dwarf ellipticals which are gas poor and have no star formation. Dwarf spheroidal galaxies are very low luminosity, very low surface brightness systems, essentially extreme versions of dwarf ellipticals. Our Galaxy has several dwarf spheroidal satellites.

1.6 The Hubble Sequence

On the whole, galaxy classification probably should not be taken as seriously as stellar classification, because there are not (yet) precise physical interpretations of what the gradations mean. But some physical properties do clearly correlate with the so-called Hubble types.

The basic system of classification described above was defined in detail by Edwin Hubble, although a number of extensions to his system are available. Figure 1.7 shows the Hubble tuning fork diagram which places the various types in a sequence based on their shapes. Ellipticals go on the left, arranged in a sequence based on their ellipticities. Then come the lenticulars or disc galaxies without spiral arms: S0 and SB0. Then spirals with increasingly spaced arms, Sa etc. if unbarred, SBa etc. if barred.

The left-hand galaxies are called early types, and the right-hand ones late types. People once thought this represented an evolutionary sequence, but that idea has long been obsolete. (Our current understanding is that, if anything, galaxies tend to evolve towards early types.) But the old names early and late are still used.

Note that for spirals, bulges get smaller as spiral arms get more widely spaced. The theory behind spiral density waves predicts that the spacing between arms is proportional to the disc’s mass density.

Several galaxy properties vary in a sequence from ellipticals to irregulars. However, the precise shapes of ellipticals are not important in this: all ellipticals lie in the same position in the sequence:

Type sequence: E (all Es), S0, Sa, Sb, Sc, Sd, Irr

Left end of sequence (E)      Right end of sequence (Irr)
Early type                    Late type
Old stars                     Young stars
Red colour                    Blue colour
Gas poor                      Gas rich
Absorption-line spectra       Strong emission lines in spectra

Some evolution along this sequence from right to left (late to early) can occur if gas is used up in star formation or gas is taken out of the galaxies.

1.7 A Description of Galaxy Dynamics

Interactions between distributions of matter can be very important over the lifetime of a galaxy, be these the interactions of stars, the interactions between clouds of gas, or the interactions of a galaxy with a near neighbour.

An important distinction between interactions is whether they are collisional or collisionless. Encounters between bodies of matter are:

• collisional if interactions between individual particles substantially affect their motions;

• collisionless if interactions between individual particles do not substantially affect their motions.

Gas is collisional. If two gas clouds collide, even with the low densities found in astronomy, individual atoms/molecules interact. These interactions on the atomic scale strongly influence the motions of the two gas clouds.

Stars are collisionless on the galactic scale. If two stellar systems collide, the interactions between individual stars have little effect on their motions. The ‘particles’ are the stars in this case. The motions of the stars are mostly affected by the gravitational potentials of the two stellar systems. Interactions between individual stars are rare on the scale of galaxies. Stars are therefore so compact on the scale of a galaxy that a stellar system behaves like a collisionless fluid (except in the cores of galaxies and globular clusters), resembling a plasma in some respects.

This distinction between stars and gas leads to two very important differences between stellar and gas dynamics in a galaxy.

1. Gas will tend to settle into rotating discs within galaxies. Stars will not settle in this way.

2. Star orbits can cross each other, but in equilibrium gas must follow closed paths which do not cross (and in the same sense). Two streams of stars can go through each other and hardly notice, but two streams of gas will shock (and probably form stars). You could have a disc of stars with no net rotation (just reverse the directions of motion of some stars), but not so with a disc of gas.

The terms ‘rotational support’ and ‘pressure support’ are used to describe how material in galaxies balances its self-gravity. Stars and gas move in roughly circular orbits in the discs of spiral galaxies, where they achieve a stable equilibrium because they are rotationally supported against gravity. The stars and gas in the spiral discs have rotational velocities of $\simeq 250\ \mathrm{km\,s^{-1}}$, while the dispersion of the gas velocities locally around this net motion is only $\simeq 10\ \mathrm{km\,s^{-1}}$.


In contrast, stars in luminous elliptical galaxies maintain a stable equilibrium because they are moving in randomly oriented orbits. Drawing a parallel with atoms/molecules in a gas, they are said to be pressure supported against gravity. Random velocities of $\simeq 300\ \mathrm{km\,s^{-1}}$ are typical. The velocities for pressure support need not be isotropically distributed. It is also possible to have a mixture of rotational and pressure support for a system of stars, with some appreciable net rotation.

1.8 A Brief Overview of Galaxy Evolution Processes

Galaxies formed early in the history of the Universe, but exactly when is not known with certainty. They were probably formed by the coalescence of a number of separate clumps of dark matter that also contained gas and stars, rather than by the collapse of single large bodies of dark matter and gas. Galaxies have evolved with time to give us the galaxy populations seen today. Some of the processes driving the evolution of galaxies are briefly stated in this section. Exactly how these processes affect galaxies is not understood precisely at present.

Gas in a galaxy will fall into a rotationally supported disc if it has angular momentum. Subsequent star formation in the gaseous disc will form the stellar disc of a spiral galaxy. Stars formed in gas clouds falling radially inwards during the formation of a galaxy can contribute to the stellar halo of a spiral galaxy. Stars in smaller galaxies falling radially inwards in a merging process can also contribute to the stellar halo of a spiral galaxy. Differential rotation in a spiral’s disc will generate spiral density waves in the disc, leading to spiral arms. Spiral discs without a bulge can be unstable, and can buckle and thicken, with mass being redistributed into a bulge, giving the bulge some rotation in the process. Continuing star formation in the disc gives rise to a range of ages for stars in the disc, while bulges will tend to be older. If a spiral uses up most of its gas in star formation, it will have a stellar disc but no spiral arms.

Mergers and interactions between galaxies can be important. In a merger two galaxies fuse together. In an interaction, however, one galaxy affects another through their mutual gravitational effects. One or both galaxies may survive an interaction, but may be altered in the process. If two spiral galaxies merge, or one spiral is disrupted by a close encounter with another galaxy, the immediate result can be an irregular or peculiar galaxy with strong star formation. This can produce an elliptical galaxy if and when the gas is exhausted.

If a merger between two galaxies produces an elliptical with no overall angular momentum, it will be a pressure-supported system. If a merger produces an elliptical with some net angular momentum, the elliptical will have an element of rotational support. Ellipticals, although conventionally gas-poor, can shed gas from their stars through mass loss and supernova remnants, which can settle into gas discs and form stars in turn.

Almost all galaxies have some very old stars. Most galaxies appear to have been formed fairly early on (> 10 Gyr ago) but some have been strongly influenced by mergers and interactions since then.

These processes are not well understood at present. Understanding the evolution of galaxies is currently a subject of much active research, from both a theoretical and observational perspective. Then there is dark matter ...

1.9 The Galaxy: an Overview

1.9.1 The Structure of the Galaxy

We live in a spiral galaxy. It is relatively difficult to measure its morphology from inside, but observations show that it is almost certainly a barred spiral. The best assessment of its morphological type puts it at type SBbc (intermediate between SBb and SBc).

The Sun lies close to the plane of the Galaxy, at a distance of 8.0 ± 0.4 kpc from the Galactic Centre. It is displaced slightly to the north of the plane. The overall diameter of the disc is $\simeq 40$ kpc.

Figure 1.8: A sketch of the Galaxy seen edge on, illustrating the various components.

Figure 1.9: The Galaxy showing its geometry.

The structure of the Galaxy can be broken into various distinct components. These are: the disc, the central bulge, the bar, the stellar halo, and the dark matter halo. These are illustrated in Figure 1.8. The disc consists of stars and of gas and dust. The gas and dust are concentrated more closely on the plane than the stars, while younger stars are more concentrated around the plane than old stars. The disc is rotationally supported against gravity. The bulge and bar are found within the central few kpc. They consist mostly of old stars, some metal-poor (deficient in heavy elements). The Galactic Centre has a compact nucleus, probably with a black hole at its core. The stellar halo contains many isolated stars and about 150 globular clusters. These are very old and very metal-poor. The stellar halo is pressure supported against gravity. The entire visible system is embedded in an extensive dark matter halo which probably extends out to > 100 kpc. Its properties are not well defined and the nature of the dark matter is still uncertain.

We shall return to discuss the structure of the Galaxy and its individual components in detail later in the course.

1.9.2 Stellar Populations

It was realised in the early 20th century that the Galaxy could be divided into the disc and into a spheroid (which consists of the bulge and stellar halo). The concept of stellar populations was introduced by Walter Baade in 1944. In his picture, the stars in the discs of spiral galaxies, which contain many young stars, were called Population I. This included the disc of our own Galaxy. In contrast, elliptical galaxies and the spheroids of spiral galaxies contain many old stars, which he called Population II. Population I systems consisted of young and moderately young stars which had chemical compositions similar to the Sun. Meanwhile, Population II systems were made of old stars which were deficient in heavy elements compared to the Sun. Population I systems were blue in colour, Population II were red. Spiral discs and irregular galaxies contained Population I stars, while ellipticals and the haloes and bulges of disc galaxies were Population II.

This picture was found to be rather simplistic. The population concept was later refined for our Galaxy, with a number of subtypes replacing the original two classes. Today, it is more common to refer to the stars of individual components of the Galaxy separately. For example, we might speak of the halo population or the bulge population. The disc population is often split into the young disc population and the old disc population.

1.9.3 Galactic Coordinates

A galactic coordinate system is frequently used to specify the positions of objects within the Galaxy, and the positions of other galaxies on the sky. In this system, two angles are used to specify the direction of objects as seen from the Earth, called galactic longitude and galactic latitude (denoted $l$ and $b$ respectively). The system works in a way that is very similar to latitude and longitude on the Earth’s surface: galactic longitude measures an angle in the plane of the galactic equator (which is defined to be in the plane of the Galaxy), and galactic latitude measures the angle from the equator.

Note that the galactic coordinate system is centred on the Earth. Galactic longitude is expressed as an angle $l$ between 0° and 360°. Galactic latitude is expressed as an angle $b$ between −90° and +90°. Zero longitude is defined to be the direction of the Galactic Centre. Therefore the Galactic Centre is at $(l, b) = (0°, 0°)$. The direction on the sky opposite to the Galactic Centre is known as the Galactic Anticentre and has coordinates $(l, b) = (180°, 0°)$. The North Galactic Pole is at $b = +90°$, and the South Galactic Pole is at $b = −90°$. Note again that these are positions on the sky relative to the Earth, and not relative to the Galactic Centre.
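A small sketch may help fix the geometry. It converts $(l, b)$ in degrees into a unit direction vector and measures the angle on the sky from the Galactic Centre. The axis convention (x towards the Galactic Centre, z towards the North Galactic Pole, observer at the origin) is a common choice assumed here rather than something specified in these notes, and the function names are illustrative.

```python
import math

def galactic_to_unit_vector(l_deg, b_deg):
    """Direction (x, y, z) for galactic longitude l and latitude b in degrees.

    Assumed axes: x towards the Galactic Centre (l = 0, b = 0),
    z towards the North Galactic Pole (b = +90), observer at the origin.
    """
    l = math.radians(l_deg)
    b = math.radians(b_deg)
    return (math.cos(b) * math.cos(l),
            math.cos(b) * math.sin(l),
            math.sin(b))

def angle_from_galactic_centre(l_deg, b_deg):
    """Angle on the sky, in degrees, between (l, b) and the Galactic Centre."""
    x, _, _ = galactic_to_unit_vector(l_deg, b_deg)
    # Clamp to [-1, 1] to protect acos against round-off.
    return math.degrees(math.acos(max(-1.0, min(1.0, x))))

for name, l, b in (("Galactic Centre", 0, 0),
                   ("Galactic Anticentre", 180, 0),
                   ("North Galactic Pole", 0, 90)):
    print(f"{name:20s} angle from GC = {angle_from_galactic_centre(l, b):6.1f} deg")
```

These are directions as seen by the observer; turning them into positions within the Galaxy also requires a distance along each direction.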


1.10 Density profiles versus surface brightness profiles

The observed surface brightness profiles of galaxies are the result of the projection of three-dimensional density distributions of stars. For example, the observed surface brightness of an elliptical galaxy is well fitted by the de Vaucouleurs $R^{1/4}$ law, as was discussed earlier. This surface brightness is the projection into two dimensions on the sky of the three-dimensional density distribution of stars in space.

Let us consider an example which demonstrates this. An observer on the Earth observes a spherically symmetric galaxy, such as an E0-type elliptical galaxy. The mean density of stars in space at a radial distance $r$ from the centre of the galaxy is $\rho(r)$ (measured in $M_\odot\,\mathrm{pc}^{-3}$ or $\mathrm{kg\,m^{-3}}$, and smoothed out over space). Consider a sight line that passes at a tangential distance $R$ from the galaxy’s centre.

Consider an element of the path length $dl$ at a distance $l$ from the point on the sight line closest to the nucleus. Let the contribution to the surface brightness from the element be

$$ dI(R) = k\,\rho(r)\,dl \tag{1.1} $$

where $k$ is a constant of proportionality.

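$$
\begin{aligned}
\therefore\quad dI(R) &= k\,\rho(r)\,\frac{dr}{\sin\theta} && \text{on substituting } dl = \frac{dr}{\sin\theta} \\
&= k\,\rho(r)\,\frac{r\,dr}{\sqrt{r^2 - R^2}} && \text{on substituting } \sin\theta = \frac{\sqrt{r^2 - R^2}}{r} .
\end{aligned}
$$

(Both substitutions follow from the geometry $r^2 = R^2 + l^2$, so that $\sin\theta = l/r$.)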

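Integrating along the line of sight from B, the point on the sight line closest to the centre ($r = R$), to infinity ($r \to \infty$),

$$ I_{B\,\infty} = \int_R^\infty k\,\rho(r)\,\frac{r}{\sqrt{r^2 - R^2}}\,dr = k \int_R^\infty \frac{r\,\rho(r)}{\sqrt{r^2 - R^2}}\,dr . $$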

This has neglected the near side of the galaxy. From symmetry, the total surface brightness along the line of sight is $I(R) = 2\,I_{B\,\infty}(R)$.

$$ \therefore\quad I(R) = 2k \int_R^\infty \frac{r\,\rho(r)\,dr}{\sqrt{r^2 - R^2}} . \tag{1.2} $$

In general, the density profile of a galaxy will not be spherical and we need to take account of the profile $\rho(\mathbf{r})$ as a function of position vector $\mathbf{r}$.

The calculation of the surface brightness profile from the density profile is straightforward numerically, if not always analytically. The inverse problem, converting from an observed surface brightness profile to a density profile, often has to be done numerically.
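The forward projection can be sketched in a few lines. Rather than integrating Equation 1.2 directly, which has an integrable singularity at $r = R$, the example integrates along the path length, using $I(R) = 2k \int_0^\infty \rho\big(\sqrt{R^2 + l^2}\big)\,dl$; the two forms are equivalent because $l = \sqrt{r^2 - R^2}$. The test density $\rho \propto (1 + r^2/a^2)^{-5/2}$ (a Plummer-type profile) is an assumption chosen only because its projection is known analytically, $I(R) = k\,(4a/3)\,(1 + R^2/a^2)^{-2}$; it is not a model used in these notes. The mapping $l = s \tan u$ and the midpoint rule are likewise just one convenient numerical choice.

```python
import math

def project_density(rho, R, k=1.0, scale=1.0, n=20000):
    """Approximate I(R) = 2k * integral_0^inf rho(sqrt(R^2 + l^2)) dl.

    The infinite range in l is mapped onto 0 <= u < pi/2 with l = scale * tan(u),
    and the integral is evaluated with the midpoint rule in u.
    """
    h = (math.pi / 2.0) / n
    total = 0.0
    for j in range(n):
        u = (j + 0.5) * h                  # midpoints never reach u = pi/2
        l = scale * math.tan(u)
        dl_du = scale / math.cos(u) ** 2
        total += rho(math.hypot(R, l)) * dl_du
    return 2.0 * k * total * h

def plummer_density(r, a=1.0):
    """Test profile with unit central density: (1 + r^2/a^2)^(-5/2)."""
    return (1.0 + (r / a) ** 2) ** -2.5

def plummer_projected(R, a=1.0, k=1.0):
    """Analytic projection of the test profile, for comparison."""
    return k * (4.0 * a / 3.0) * (1.0 + (R / a) ** 2) ** -2.0

for R in (0.0, 0.5, 1.0, 3.0):
    print(f"R = {R:3.1f}   numerical = {project_density(plummer_density, R):.6f}   "
          f"analytic = {plummer_projected(R):.6f}")
```

Any other spherical density law can be passed in as `rho`, provided it falls off fast enough at large radius for the integral to converge.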


Chapter 2

Stellar Dynamics in Galaxies

2.1 Introduction

A system of stars behaves like a fluid, but one with unusual properties. In a normal fluid two-body interactions are crucial in the dynamics, but in contrast star-star encounters are very rare. Instead stellar dynamics is mostly governed by the interaction of individual stars with the mean gravitational field of all the other stars combined. This has profound consequences for how the dynamics of the stars within galaxies are described mathematically, allowing for some considerable simplifications.

This chapter establishes some basic results relating to the motions of stars within galaxies. The virial theorem provides a very simple relation between the total potential and kinetic energies of stars within a galaxy, or other system of stars, that has settled down into a steady state. The virial theorem is derived formally here. The timescale for stars to cross a system of stars, known as the crossing time, is a simple but important measure of the motions of stars. The relaxation time measures how long it takes for two-body encounters to influence the dynamics of a galaxy, or other system of stars. An expression for the relaxation time is derived here, which is then used to show that encounters between stars are so rare within galaxies that they have had little effect over the lifetime of the Universe.

The motions of stars within galaxies can be described by the collisionless Boltzmann equation, which allows the numbers of stars to be calculated as a function of position and velocity in the galaxy. The equation is derived from first principles here. Similarly, the Jeans Equations relate the densities of stars to position, velocity, velocity dispersion and gravitational potential.

2.2 The Virial Theorem

2.2.1 The basic result

Before going into the main material on stellar dynamics, it is worth stating – and deriving – a basic principle known as the virial theorem. It states that for any system of particles bound by an inverse-square force law, the time-averaged kinetic energy $\langle T \rangle$ and the time-averaged potential energy $\langle V \rangle$ satisfy

$$ 2 \langle T \rangle + \langle V \rangle = 0 , \tag{2.1} $$

for a steady equilibrium state. $\langle T \rangle$ will be a very large positive quantity and $\langle V \rangle$ a very large negative quantity. Of course, for a galaxy to hold together, the total energy $\langle T \rangle + \langle V \rangle < 0$; the virial theorem provides a much tighter constraint than this alone. Typically, $\langle T \rangle$ and $-\langle V \rangle \sim 10^{50}$ to $10^{54}$ J for galaxies.

In practice, many systems of stars are not in a perfect final steady state and the virial theorem does not apply exactly. Despite this, it does give important, approximate results for many astronomical systems.

2.2.2 Deriving the virial theorem from first principles

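To prove the virial theorem, consider a system of $N$ stars. Let the $i$th star have a mass $m_i$ and a position vector $\mathbf{x}_i$. The velocity of the $i$th star is $\dot{\mathbf{x}}_i \equiv d\mathbf{x}_i/dt$, where $t$ is the time. Consider a parameter

$$ F \equiv \sum_{i=1}^{N} m_i \, \dot{\mathbf{x}}_i \cdot \mathbf{x}_i . \tag{2.2} $$

($F$ is related to the moment of inertia of the system, as we shall see below.) Differentiating with respect to time $t$,

$$
\begin{aligned}
\frac{dF}{dt} &= \frac{d}{dt} \Big( \sum_i m_i \, \dot{\mathbf{x}}_i \cdot \mathbf{x}_i \Big) \\
&= \sum_i \frac{d}{dt} \big( m_i \, \dot{\mathbf{x}}_i \cdot \mathbf{x}_i \big) \\
&= \sum_i m_i \frac{d}{dt} \big( \dot{\mathbf{x}}_i \cdot \mathbf{x}_i \big) \qquad \text{assuming that the masses } m_i \text{ do not change} \\
&= \sum_i m_i \big( \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i + \dot{\mathbf{x}}_i \cdot \dot{\mathbf{x}}_i \big) \qquad \text{from the product rule} \\
&= \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i + \sum_i m_i \, \dot{\mathbf{x}}_i^2 .
\end{aligned}
\tag{2.3}
$$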

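The kinetic energy of the $i$th particle is $\tfrac{1}{2} m_i \dot{\mathbf{x}}_i^2$. Therefore the total kinetic energy of the entire system of stars is

$$ T = \sum_i \tfrac{1}{2} m_i \, \dot{\mathbf{x}}_i^2 \qquad \therefore \quad \sum_i m_i \, \dot{\mathbf{x}}_i^2 = 2T . $$

Substituting this into Equation 2.3,

$$ \frac{dF}{dt} = 2T + \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i , \tag{2.4} $$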

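at any time $t$.

We now need to remember that the average of any parameter $y(t)$ over time $t = 0$ to $\tau$ is

$$ \langle y \rangle = \frac{1}{\tau} \int_0^{\tau} y(t) \, dt . $$

Consider the average value of $dF/dt$ over a time interval $t = 0$ to $\tau$.

$$
\begin{aligned}
\left\langle \frac{dF}{dt} \right\rangle &= \frac{1}{\tau} \int_0^{\tau} \Big( 2T + \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \Big) \, dt \\
&= \frac{2}{\tau} \int_0^{\tau} T \, dt + \frac{1}{\tau} \int_0^{\tau} \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \, dt \\
&= 2 \langle T \rangle + \sum_i \frac{m_i}{\tau} \int_0^{\tau} \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \, dt \qquad \text{assuming } m_i \text{ is constant over time} \\
&= 2 \langle T \rangle + \sum_i m_i \, \langle \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \rangle .
\end{aligned}
\tag{2.5}
$$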

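We can define a moment of inertia of the system of particles about the origin as

$$ I \equiv \sum_i m_i \, \mathbf{x}_i \cdot \mathbf{x}_i . $$

(Note that this definition of moment of inertia is different from the moment of inertia about a particular axis that is commonly used to study the rotation of bodies.) Differentiating with respect to time,

$$ \frac{dI}{dt} = \frac{d}{dt} \sum_i m_i \, \mathbf{x}_i \cdot \mathbf{x}_i = \sum_i m_i \frac{d}{dt} \big( \mathbf{x}_i \cdot \mathbf{x}_i \big) = \sum_i m_i \big( \dot{\mathbf{x}}_i \cdot \mathbf{x}_i + \mathbf{x}_i \cdot \dot{\mathbf{x}}_i \big) = 2 \sum_i m_i \, \dot{\mathbf{x}}_i \cdot \mathbf{x}_i . $$

Substituting for $F = \sum_i m_i \, \dot{\mathbf{x}}_i \cdot \mathbf{x}_i$ from Equation 2.2,

$$ F = \frac{1}{2} \frac{dI}{dt} . $$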

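When the system of stars eventually reaches equilibrium, the moment of inertia will be constant. Therefore, $F = 0$ at all times after equilibrium has been reached. So, $\langle dF/dt \rangle = 0$. (An alternative way of visualising this is by considering that $F$ will be bounded in any physical system. Therefore the long-time average $\langle dF/dt \rangle$ will vanish as $\tau$ becomes large, i.e.

$$ \lim_{\tau \to \infty} \left\langle \frac{dF}{dt} \right\rangle = \lim_{\tau \to \infty} \left( \frac{1}{\tau} \int_0^{\tau} \frac{dF}{dt} \, dt \right) = \lim_{\tau \to \infty} \frac{F(\tau) - F(0)}{\tau} = 0 , $$

because the numerator stays bounded while the denominator grows without limit.)

Substituting $\langle dF/dt \rangle = 0$ into Equation 2.5,

$$ 2 \langle T \rangle + \sum_i m_i \, \langle \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \rangle = 0 . \tag{2.6} $$

The term $\sum_i m_i \langle \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \rangle$ is related to the gravitational potential. We next need to show how.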

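Newton’s Second Law of Motion gives for the $i$th star,

$$ m_i \, \ddot{\mathbf{x}}_i = \sum_{\substack{j \\ j \neq i}} \mathbf{F}_j , $$

where $\mathbf{F}_j$ is the force exerted on the $i$th star by the $j$th star. Using the law of universal gravitation,

$$ m_i \, \ddot{\mathbf{x}}_i = \sum_{\substack{j \\ j \neq i}} - \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3} \, (\mathbf{x}_i - \mathbf{x}_j) . $$

Taking the scalar product (dot product) with $\mathbf{x}_i$,

$$ m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i = \Big( \sum_{\substack{j \\ j \neq i}} - \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3} \, (\mathbf{x}_i - \mathbf{x}_j) \Big) \cdot \mathbf{x}_i . $$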

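Summing over all $i$,

$$ \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i = - \sum_i \sum_{\substack{j \\ j \neq i}} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3} \, (\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_i = - \sum_{\substack{i,j \\ i \neq j}} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3} \, (\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_i . \tag{2.7} $$

Switching $i$ and $j$, we have

$$ \sum_j m_j \, \ddot{\mathbf{x}}_j \cdot \mathbf{x}_j = - \sum_{\substack{j,i \\ i \neq j}} \frac{G m_j m_i}{|\mathbf{x}_j - \mathbf{x}_i|^3} \, (\mathbf{x}_j - \mathbf{x}_i) \cdot \mathbf{x}_j . \tag{2.8} $$

Adding Equations 2.7 and 2.8,

$$ \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i + \sum_j m_j \, \ddot{\mathbf{x}}_j \cdot \mathbf{x}_j = - \sum_{\substack{i,j \\ i \neq j}} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3} \, (\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_i - \sum_{\substack{i,j \\ i \neq j}} \frac{G m_j m_i}{|\mathbf{x}_j - \mathbf{x}_i|^3} \, (\mathbf{x}_j - \mathbf{x}_i) \cdot \mathbf{x}_j $$

$$ \therefore \quad 2 \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i = - \sum_{\substack{i,j \\ i \neq j}} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3} \Big( (\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_i + (\mathbf{x}_j - \mathbf{x}_i) \cdot \mathbf{x}_j \Big) . $$

But

$$
\begin{aligned}
(\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_i + (\mathbf{x}_j - \mathbf{x}_i) \cdot \mathbf{x}_j &= (\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_i - (\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_j \\
&= (\mathbf{x}_i - \mathbf{x}_j) \cdot (\mathbf{x}_i - \mathbf{x}_j) \qquad \text{(factorising)} \\
&= |\mathbf{x}_i - \mathbf{x}_j|^2
\end{aligned}
$$

$$ \therefore \quad 2 \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i = - \sum_{\substack{i,j \\ i \neq j}} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3} \, |\mathbf{x}_i - \mathbf{x}_j|^2 $$

$$ \therefore \quad \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i = - \frac{1}{2} \sum_{\substack{i,j \\ i \neq j}} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|} . \tag{2.9} $$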

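We now need to find the total potential energy of the system. The gravitational potential at star $i$ due to star $j$ is

$$ \Phi_{ij} = - \frac{G m_j}{|\mathbf{x}_i - \mathbf{x}_j|} . $$

Therefore the gravitational potential at star $i$ due to all other stars is

$$ \Phi_i = \sum_{\substack{j \\ j \neq i}} \Phi_{ij} = \sum_{\substack{j \\ j \neq i}} - \frac{G m_j}{|\mathbf{x}_i - \mathbf{x}_j|} . $$

Therefore the gravitational potential energy of star $i$ due to all the other stars is

$$ V_i = m_i \Phi_i = - m_i \sum_{\substack{j \\ j \neq i}} \frac{G m_j}{|\mathbf{x}_i - \mathbf{x}_j|} . $$

The total potential energy of the system is therefore

$$ V = \frac{1}{2} \sum_i V_i = \frac{1}{2} \sum_i \Big( - m_i \sum_{\substack{j \\ j \neq i}} \frac{G m_j}{|\mathbf{x}_i - \mathbf{x}_j|} \Big) . $$

The factor $\tfrac{1}{2}$ ensures that we only count each pair of stars once (otherwise we would count each pair twice and would get a result twice as large as we should). Therefore,

$$ V = - \frac{1}{2} \sum_{\substack{i,j \\ i \neq j}} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|} . $$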

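Substituting for the total potential energy into Equation 2.9,

$$ \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i = V . $$

Equation 2.6 uses time-averaged quantities. So, averaging over time $t = 0$ to $\tau$,

$$ \frac{1}{\tau} \int_0^{\tau} \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \, dt = \langle V \rangle $$

$$ \therefore \quad \sum_i m_i \, \frac{1}{\tau} \int_0^{\tau} \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \, dt = \langle V \rangle $$

$$ \therefore \quad \sum_i m_i \, \langle \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \rangle = \langle V \rangle . $$

Substituting this into Equation 2.6,

$$ 2 \langle T \rangle + \langle V \rangle = 0 . $$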

This is Equation 2.1, the Virial Theorem.
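The key step of the derivation, that $\sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i$ equals the total potential energy $V$ at every instant, can be checked numerically. The following is a minimal sketch only: the particle number, masses, box size and the function names are illustrative assumptions, not part of the course notes. It computes both sides by direct summation for a random set of point masses.

```python
import math
import random

G = 6.674e-11  # gravitational constant in SI units (assumed value)

def accelerations(masses, positions):
    """Direct-summation gravitational acceleration on each particle."""
    n = len(masses)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            r = math.sqrt(sum(c * c for c in d))
            for k in range(3):
                # Acceleration of star i points from i towards j.
                acc[i][k] -= G * masses[j] * d[k] / r**3
    return acc

def virial_term(masses, positions):
    """Sum over i of m_i (x_i'' . x_i), the left-hand side of Equation 2.9."""
    acc = accelerations(masses, positions)
    return sum(m * sum(a[k] * x[k] for k in range(3))
               for m, a, x in zip(masses, acc, positions))

def potential_energy(masses, positions):
    """Total potential energy V, counting each pair of stars once."""
    n = len(masses)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(positions[i], positions[j])
            total -= G * masses[i] * masses[j] / r
    return total

# Illustrative test system: 20 stars of roughly solar mass in a ~2 pc box.
rng = random.Random(1)
masses = [rng.uniform(0.5, 2.0) * 2.0e30 for _ in range(20)]
positions = [[rng.uniform(-1.0, 1.0) * 3.1e16 for _ in range(3)]
             for _ in range(20)]

lhs = virial_term(masses, positions)
V = potential_energy(masses, positions)
print(lhs, V)
```

The two printed numbers should agree to within floating-point rounding, whatever random configuration is chosen, because the identity does not depend on the system being in equilibrium. Equilibrium is needed only for the step $\langle dF/dt \rangle = 0$.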

2.2.3 Using the Virial Theorem

The virial theorem applies to systems of stars that have reached a steady equilibrium state. It can be used for many galaxies, but can also be used for other systems such as some star clusters. However, we need to be careful that we use the theorem only for equilibrium systems.

The theorem can be applied, for example, to:

• elliptical galaxies

• evolved star clusters, e.g. globular clusters

• evolved clusters of galaxies (with the galaxies acting as the particles, not the individual stars)

Examples of places where the virial theorem cannot be used are:

• merging galaxies

• newly formed star clusters

• clusters of galaxies that are still forming/still have infalling galaxies

The virial theorem provides an easy way to make rough estimates of masses, because velocity measurements can give $\langle T \rangle$. To do this we need to measure the observed velocity dispersion of stars (the dispersion along the line of sight using radial velocities obtained from spectroscopy). The theorem then gives the total gravitational potential energy, which can provide the total mass. This mass, of course, is important because it includes dark matter. Virial masses are particularly important for some galaxy clusters (using galaxies or atoms in X-ray emitting gas as the particles).

But it is prudent to consider virial mass estimates as order-of-magnitude only, because (i) generally one can measure only line-of-sight velocities, and getting $T = \tfrac{1}{2} \sum_i m_i \dot{\mathbf{x}}_i^2$ from these requires more assumptions (e.g. isotropy of the velocity distribution); and (ii) the systems involved may not be in a steady state, in which case of course the virial theorem does not apply; clusters of galaxies are particularly likely to be quite far from a steady state.

Note that for galaxies beyond our own, we cannot measure three-dimensional velocities of stars directly (although some projects are attempting to do this for some Local Group galaxies). We have to use radial velocities (the component of the velocity along the line of sight to the galaxy) only, obtained from spectroscopy through the Doppler shift of spectral lines. Beyond nearby galaxies, radial velocities of individual stars become difficult to obtain. It becomes necessary to measure velocity dispersions along the line of sight from the observed widths of spectral lines in the combined light of millions of stars.

2.2.4 Deriving masses from the Virial Theorem: a naive example

Consider a spherical elliptical galaxy of radius $R$ that has uniform density and which consists of $N$ stars each of mass $m$ having typical velocities $v$. From the virial theorem,

$$ 2 \langle T \rangle + \langle V \rangle = 0 , $$

where $\langle T \rangle$ is the time-averaged total kinetic energy and $\langle V \rangle$ is the average total potential energy. We have

$$ T = \sum_{i=1}^{N} \tfrac{1}{2} m v^2 = \tfrac{1}{2} N m v^2 $$

and averaging over time, $\langle T \rangle = \tfrac{1}{2} N m v^2$ also. The total gravitational potential energy of a uniform sphere of mass $M$ and radius $R$ (a standard result) is

$$ V = - \frac{3}{5} \frac{G M^2}{R} , $$

where $G$ is the universal gravitational constant. So the time-averaged potential energy of the galaxy is

$$ \langle V \rangle = - \frac{3}{5} \frac{G M^2}{R} , $$

where $M$ is the total mass. Substituting this into the virial theorem equation,

$$ 2 \left( \tfrac{1}{2} N m v^2 \right) - \frac{3}{5} \frac{G M^2}{R} = 0 . $$

But the total mass is $M = Nm$.

$$ \therefore \quad v^2 = \frac{3}{5} \frac{N G m}{R} = \frac{3}{5} \frac{G M}{R} . $$

The calculation is only approximate, so we shall use

$$ v^2 \simeq \frac{N G m}{R} \simeq \frac{G M}{R} . \tag{2.10} $$

This gives the mass to be

$$ M \simeq \frac{v^2 R}{G} . \tag{2.11} $$

So an elliptical galaxy having a typical velocity $v = 350$ km s$^{-1}$ $= 3.5 \times 10^{5}$ m s$^{-1}$, and a radius $R = 10$ kpc $= 3.1 \times 10^{20}$ m, will have a mass $M \sim 6 \times 10^{41}$ kg $\sim 3 \times 10^{11} \, M_\odot$.
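The arithmetic of this estimate is easy to reproduce. The short sketch below evaluates Equation 2.11 for the numbers quoted above; the constant values and the function name `virial_mass` are assumptions made for illustration.

```python
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (assumed value)
KPC = 3.086e19     # one kiloparsec in metres (assumed value)
M_SUN = 1.989e30   # solar mass in kg (assumed value)

def virial_mass(v, R):
    """Order-of-magnitude virial mass M ~ v^2 R / G (Equation 2.11), SI units."""
    return v**2 * R / G

M = virial_mass(3.5e5, 10 * KPC)
print(f"M ~ {M:.1e} kg ~ {M / M_SUN:.1e} solar masses")
```

Because Equation 2.11 drops the factor 3/5 and assumes uniform density, the result should be read as an order-of-magnitude figure only.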

2.2.5 Example: the fundamental plane for elliptical galaxies

We can derive a relationship between scale size, central surface brightness and central velocity dispersion for elliptical galaxies that is rather similar to the fundamental plane, using only assumptions about a constant mass-to-light ratio and a constant functional form for the surface brightness profile.

We shall assume here that:

• the mass-to-light ratio is constant for ellipticals (all E galaxies have the same $M/L$ regardless of size or mass), and

• elliptical galaxies have the same functional form for the mass distribution, only scalable.


Let $I_0$ be the central surface brightness and $R_0$ be a scale size of a galaxy (in this case, different galaxies will have different values of $I_0$ and $R_0$). The total luminosity will be

$$ L \propto I_0 R_0^2 , $$

because $I_0$ is the light per unit projected area. Since the mass-to-light ratio is a constant for all galaxies, the mass of the galaxy is $M \propto L$.

$$ \therefore \quad M \propto I_0 R_0^2 . $$

From the virial theorem, if $v$ is a typical velocity of the stars in the galaxy

$$ v^2 \simeq \frac{G M}{R_0} . $$

The observed velocity dispersion along the line of sight, $\sigma_0$, will be related to the typical velocity $v$ by $\sigma_0 \propto v$ (because $v$ is a three-dimensional space velocity). So

$$ \sigma_0^2 \propto \frac{M}{R_0} . \qquad \therefore \quad M \propto \sigma_0^2 R_0 . $$

Equating this with $M \propto I_0 R_0^2$ from above, $\sigma_0^2 R_0 \propto I_0 R_0^2$.

$$ \therefore \quad R_0 \, I_0 \, \sigma_0^{-2} \simeq \text{constant} . $$

This is close to, but not the same as, the observed fundamental plane result $R_0 \, I_0^{0.9} \, \sigma_0^{-1.4} \simeq$ constant. The deviation from this virial prediction presumably has something to do with a varying mass-to-light ratio, but why it is a very good correlation in practice is not understood in detail.

2.3 The Crossing Time, Tcross

The crossing time is a simple, but important, parameter that measures the timescale for stars to move significantly within a system of stars. It is sometimes called the dynamical timescale. It is defined as

$$ T_{\rm cross} \equiv \frac{R}{v} , \tag{2.12} $$

where $R$ is the size of the system and $v$ is a typical velocity of the stars.

As a simple example, consider a stellar system of radius $R$ (and therefore an overall size $2R$), having $N$ stars each of mass $m$; the stars are distributed roughly homogeneously, with $v$ being a typical velocity, and the system is in dynamical equilibrium. Then from the virial theorem,

$$ v^2 \simeq \frac{N G m}{R} . $$

The crossing time is then

$$ T_{\rm cross} \equiv \frac{2R}{v} \simeq 2R \sqrt{\frac{R}{N G m}} \simeq 2 \sqrt{\frac{R^3}{N G m}} . \tag{2.13} $$

But the mass density is

$$ \rho = \frac{N m}{\tfrac{4}{3} \pi R^3} = \frac{3 N m}{4 \pi R^3} . $$

$$ \therefore \quad \frac{R^3}{N m} = \frac{3}{4 \pi \rho} . $$

$$ \therefore \quad T_{\rm cross} = 2 \sqrt{\frac{3}{4 \pi G \rho}} . $$

So approximately,

$$ T_{\rm cross} \sim \frac{1}{\sqrt{G \rho}} . \tag{2.14} $$

Although this equation has been derived for a particular case, that of a homogeneous sphere, it is an important result and can be used for order of magnitude estimates in other situations. (Note that $\rho$ here is the mass density of the system, averaged over a volume of space, and not the density of individual stars.)

Example: an elliptical galaxy of $10^{11}$ stars, radius 10 kpc.

$R \simeq 10$ kpc $\simeq 3.1 \times 10^{20}$ m

$N = 10^{11}$

$m \simeq 1 \, M_\odot \simeq 2 \times 10^{30}$ kg

$T_{\rm cross} \simeq 2 \sqrt{R^3 / (N G m)}$ gives $T_{\rm cross} \simeq 3 \times 10^{15}$ s $\simeq 10^{8}$ yr.

The Universe is 14 Gyr old. So if a galaxy is $\simeq 14$ Gyr old, there have been $\sim 100$ crossing times in a galaxy’s lifetime so far.
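The example can be evaluated directly, and the exact homogeneous-sphere expression (Equation 2.13) compared with the rough form of Equation 2.14. This is a sketch under the same assumptions as the example above; the constant values and function names are illustrative choices.

```python
import math

G = 6.674e-11      # gravitational constant, SI (assumed value)
KPC = 3.086e19     # one kiloparsec in metres (assumed value)
M_SUN = 1.989e30   # solar mass in kg (assumed value)
YEAR = 3.156e7     # one year in seconds (assumed value)

def crossing_time(N, m, R):
    """T_cross = 2 sqrt(R^3 / (N G m)), Equation 2.13."""
    return 2.0 * math.sqrt(R**3 / (N * G * m))

def crossing_time_from_density(rho):
    """T_cross ~ 1 / sqrt(G rho), Equation 2.14."""
    return 1.0 / math.sqrt(G * rho)

N, m, R = 1e11, M_SUN, 10 * KPC
rho = N * m / (4.0 / 3.0 * math.pi * R**3)  # mean mass density of the sphere

t_exact = crossing_time(N, m, R)
t_rough = crossing_time_from_density(rho)
print(f"Equation 2.13: {t_exact:.2e} s = {t_exact / YEAR:.2e} yr")
print(f"Equation 2.14: {t_rough:.2e} s = {t_rough / YEAR:.2e} yr")
```

The two forms differ only by the factor $2\sqrt{3/(4\pi)} \approx 0.98$, which is why the simpler expression is adequate for order-of-magnitude work.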

2.4 The Relaxation Time, Trelax

The relaxation time is the time taken for a star’s velocity $v$ to be changed significantly by two-body interactions. It is defined as the time needed for a change $\Delta v^2$ in $v^2$ to be the same as $v^2$, i.e. the time for

$$ \Delta v^2 = v^2 . \tag{2.15} $$

To estimate the relaxation time we need to consider the nature of encounters between stars in some detail.

2.5 Star-Star Encounters

2.5.1 Types of encounters

We might expect that stars, as they move around inside a galaxy or other system of stars, will experience close encounters with other stars. The gravitational effects of one star on another would change their velocities and these velocity perturbations would have a profound effect on the overall dynamics of the galaxy. The dynamics of the galaxy might evolve with time, as a result only of the internal encounters between stars.

The truth, however, is rather different. Close star-star encounters are extremely rare and even the effects of distant encounters are so slight that it takes an extremely long time for the dynamics of galaxies to change substantially.

We can consider two different types of star-star encounters:

• strong encounters – a close encounter that strongly changes a star’s velocity – these are very rare in practice

• weak encounters – occur at a distance – they produce only very small changes in a star’s velocity, but are much more common

2.5.2 Strong encounters

A strong encounter between two stars is defined as one in which, at the closest approach, the change in the potential energy is larger than or equal to the initial kinetic energy.

For two stars of mass $m$ that approach to a distance $r_0$, if the change in potential energy is larger than or equal to the initial kinetic energy,

$$ \frac{G m^2}{r_0} \geq \frac{1}{2} m v^2 , $$

where $v$ is the initial velocity of one star relative to the other.

$$ \therefore \quad r_0 \leq \frac{2 G m}{v^2} . $$

So we define a strong encounter radius

$$ r_S \equiv \frac{2 G m}{v^2} . \tag{2.16} $$

A strong encounter occurs if two stars approach to within a distance $r_S \equiv 2Gm/v^2$.

For an elliptical galaxy, $v \simeq 300$ km s$^{-1}$. Using $m = 1 \, M_\odot$, we find that $r_S \simeq 3 \times 10^{9}$ m $\simeq 0.02$ AU. This is a very small figure on the scale of a galaxy. The typical separation between stars is $\sim 1$ pc $\simeq 200\,000$ AU.

For stars in the Galactic disc in the solar neighbourhood, we can use a velocity dispersion of $v = 30$ km s$^{-1}$ and $m = 1 \, M_\odot$. This gives $r_S \simeq 3 \times 10^{11}$ m $\simeq 2$ AU. This again is very small on the scale of the Galaxy.

So strong encounters are very rare. The mean time between them in the Galactic disc is $\sim 10^{15}$ yr, while the age of the Galaxy is $\simeq 13 \times 10^{9}$ yr. In practice, we can ignore their effect on the dynamics of stars.
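The two strong encounter radii quoted above follow directly from Equation 2.16. A minimal sketch, with assumed constant values and an illustrative function name:

```python
G = 6.674e-11      # gravitational constant, SI (assumed value)
M_SUN = 1.989e30   # solar mass in kg (assumed value)
AU = 1.496e11      # astronomical unit in metres (assumed value)

def strong_encounter_radius(m, v):
    """r_S = 2 G m / v^2 (Equation 2.16), with m in kg and v in m/s."""
    return 2.0 * G * m / v**2

r_elliptical = strong_encounter_radius(M_SUN, 3.0e5)  # v = 300 km/s
r_disc = strong_encounter_radius(M_SUN, 3.0e4)        # v = 30 km/s
print(f"elliptical galaxy: {r_elliptical:.1e} m = {r_elliptical / AU:.3f} AU")
print(f"Galactic disc:     {r_disc:.1e} m = {r_disc / AU:.2f} AU")
```

Since $r_S \propto v^{-2}$, the factor of 10 in velocity between the two cases gives a factor of 100 in radius.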

2.5.3 Distant weak encounters between stars

A star experiences a weak encounter if it approaches another to a minimum distance $r_0$ when

$$ r_0 > r_S \equiv \frac{2 G m}{v^2} , \tag{2.17} $$

where $v$ is the relative velocity before the encounter and $m$ is the mass of the perturbing star. Weak encounters in general provide only a tiny perturbation to the motions of stars in a stellar system, but they are so much more numerous than strong encounters that they are more important than strong encounters in practice.

We shall now derive a formula that expresses the change $\delta v$ in the velocity $v$ during a weak encounter (Equation 2.19 below). This result will later be used to derive an expression for the square of the velocity change caused by a large number of weak encounters, which will then be used to obtain an estimate of the relaxation time in a system of stars.

Consider a star of mass $m_s$ approaching a perturbing star of mass $m$ with an impact parameter $b$. Because the encounter is weak, the change in the direction of motion will be small and the change in velocity will be perpendicular to the initial direction of motion. At any time $t$ when the separation is $r$, the component of the gravitational force perpendicular to the direction of motion will be

$$ F_{\rm perp} = \frac{G m_s m}{r^2} \cos\phi , $$

where $\phi$ is the angle at the perturbing mass between the point of closest approach and the perturbed star. Let the component of velocity perpendicular to the initial direction of motion be $v_{\rm perp}$ and let the final value be $v_{{\rm perp}\,f}$.


Making the approximation that the speed along the trajectory is constant, $r \simeq \sqrt{b^2 + v^2 t^2}$ at time $t$ if $t = 0$ at the point of closest approach. Using $\cos\phi = b/r \simeq b / \sqrt{b^2 + v^2 t^2}$ and applying $F = ma$ perpendicular to the direction of motion we obtain

$$ \frac{d v_{\rm perp}}{dt} = \frac{G m \, b}{(b^2 + v^2 t^2)^{3/2}} , $$

where $v_{\rm perp}$ is the component at time $t$ of the velocity perpendicular to the initial direction of motion. Integrating from time $t = -\infty$ to $\infty$,

$$ \Big[ v_{\rm perp} \Big]_0^{v_{{\rm perp}\,f}} = G m \, b \int_{-\infty}^{\infty} \frac{dt}{(b^2 + v^2 t^2)^{3/2}} . $$

Writing $s = vt/b$ turns the right-hand side into $(Gm/bv) \int_{-\infty}^{\infty} (1 + s^2)^{-3/2} \, ds$. We have the standard integral $\int_{-\infty}^{\infty} (1 + s^2)^{-3/2} \, ds = 2$ (which can be shown using the substitution $s = \tan x$). Using this standard integral, the final component of the velocity perpendicular to the initial direction of motion is

$$ v_{{\rm perp}\,f} = \frac{2 G m}{b v} . \tag{2.18} $$

Because the deflection is small, the change of velocity is $\delta v \equiv |\delta \mathbf{v}| = v_{{\rm perp}\,f}$. Therefore the change in the velocity $v$ is given by

$$ \delta v = \frac{2 G m}{b v} , \tag{2.19} $$

where $G$ is the constant of gravitation, $b$ is the impact parameter and $m$ is the mass of the perturbing star.
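The result $\delta v = 2Gm/(bv)$ can be checked by integrating the perpendicular acceleration numerically along the undeflected path. The sketch below uses a simple midpoint rule over a finite but long time interval; the encounter parameters (a solar-mass perturber, $b = 1$ pc, $v = 30$ km s$^{-1}$), the integration limits and the function name are all assumptions chosen for illustration.

```python
G = 6.674e-11      # gravitational constant, SI (assumed value)
M_SUN = 1.989e30   # solar mass in kg (assumed value)
PC = 3.086e16      # one parsec in metres (assumed value)

def perpendicular_kick(m, b, v, t_max_factor=1000.0, steps=200_000):
    """Integrate dv_perp/dt = G m b / (b^2 + v^2 t^2)^(3/2) with the midpoint
    rule from -t_max to +t_max, where t_max = t_max_factor * b / v."""
    t_max = t_max_factor * b / v
    dt = 2.0 * t_max / steps
    total = 0.0
    for k in range(steps):
        t = -t_max + (k + 0.5) * dt
        total += G * m * b / (b * b + v * v * t * t) ** 1.5 * dt
    return total

m, b, v = M_SUN, 1.0 * PC, 3.0e4
numerical = perpendicular_kick(m, b, v)
analytic = 2.0 * G * m / (b * v)   # Equation 2.19
print(f"numerical: {numerical:.6e} m/s, analytic: {analytic:.6e} m/s")
```

Truncating the integral at $|t| = 1000 \, b/v$ omits only a tiny part of the total, because the integrand falls off as $|t|^{-3}$; most of the velocity change is acquired within a few $b/v$ of closest approach.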

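As a star moves through space, it will experience a number of perturbations caused by weak encounters. Many of these velocity changes will cancel, but some net change will occur over time. As a result, the sum over all $\delta v$ will remain small, but the sum of the squares $\delta v^2$ will build up with time. It is this change in $v^2$ that we need to consider in the definition of the relaxation time (Equation 2.15). Because the change in velocity $\delta \mathbf{v}$ is perpendicular to the initial velocity $\mathbf{v}$ in a weak encounter, $\mathbf{v} \cdot \delta \mathbf{v} = 0$ and the change in $v^2$ is therefore

$$ \delta v^2 \equiv v_f^2 - v^2 = |\mathbf{v} + \delta \mathbf{v}|^2 - v^2 = (\mathbf{v} + \delta \mathbf{v}) \cdot (\mathbf{v} + \delta \mathbf{v}) - v^2 = \mathbf{v} \cdot \mathbf{v} + 2 \, \mathbf{v} \cdot \delta \mathbf{v} + \delta \mathbf{v} \cdot \delta \mathbf{v} - v^2 = 2 \, \mathbf{v} \cdot \delta \mathbf{v} + (\delta v)^2 = (\delta v)^2 , $$

where $v_f$ is the final velocity of the star. The change in $v^2$ resulting from a single encounter that we need to consider is

$$ \delta v^2 = \left( \frac{2 G m}{b v} \right)^2 . \tag{2.20} $$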


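Consider all weak encounters occurring in a time period $t$ that have impact parameters in the range $b$ to $b + db$ within a uniform spherical system of $N$ stars and radius $R$. The volume swept out by impact parameters $b$ to $b + db$ in time $t$ is $2\pi \, b \, db \, v \, t$. Therefore the number of stars encountered with impact parameters between $b$ and $b + db$ in time $t$ is

$$ (\text{volume swept out}) \times (\text{number density of stars}) = \big( 2\pi \, b \, db \, v \, t \big) \frac{N}{\tfrac{4}{3} \pi R^3} = \frac{3 \, b \, v \, t \, N \, db}{2 R^3} . $$

The total change in $v^2$ caused by all encounters in time $t$ with impact parameters in the range $b$ to $b + db$ will be

$$ \Delta v^2 = \left( \frac{2 G m}{b v} \right)^2 \left( \frac{3 \, b \, v \, t \, N \, db}{2 R^3} \right) . $$

Integrating over $b$, the total change in a time $t$ from all impact parameters from $b_{\rm min}$ to $b_{\rm max}$ is

$$ \Delta v^2(t) = \int_{b_{\rm min}}^{b_{\rm max}} \left( \frac{2 G m}{b v} \right)^2 \left( \frac{3 \, b \, v \, t \, N}{2 R^3} \right) db = \frac{3}{2} \left( \frac{2 G m}{v} \right)^2 \frac{v \, t \, N}{R^3} \int_{b_{\rm min}}^{b_{\rm max}} \frac{db}{b} $$

$$ \therefore \quad \Delta v^2(t) = 6 \left( \frac{G m}{v} \right)^2 \frac{v \, t \, N}{R^3} \ln\left( \frac{b_{\rm max}}{b_{\rm min}} \right) . \tag{2.21} $$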

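It is sometimes useful to have an expression for the change in $v^2$ that occurs in one crossing time. In one crossing time $T_{\rm cross} = 2R/v$, the change in $v^2$ is

$$ \Delta v^2(T_{\rm cross}) = 6 \left( \frac{G m}{v} \right)^2 \frac{v}{R^3} \left( \frac{2R}{v} \right) N \ln\left( \frac{b_{\rm max}}{b_{\rm min}} \right) = 12 N \left( \frac{G m}{R v} \right)^2 \ln\left( \frac{b_{\rm max}}{b_{\rm min}} \right) . \tag{2.22} $$

The maximum scale over which weak encounters will occur corresponds to the size of the system of stars. So we shall use $b_{\rm max} \simeq R$.

$$ \Delta v^2(T_{\rm cross}) = 12 N \left( \frac{G m}{R v} \right)^2 \ln\left( \frac{R}{b_{\rm min}} \right) . \tag{2.23} $$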


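We are more interested here in the relaxation time $T_{\rm relax}$. The relaxation time is defined as the time taken for $\Delta v^2 = v^2$. Substituting for $\Delta v^2$ from Equation 2.21 we get,

$$ 6 \left( \frac{G m}{v} \right)^2 \frac{v \, T_{\rm relax} \, N}{R^3} \ln\left( \frac{b_{\rm max}}{b_{\rm min}} \right) = v^2 . $$

$$ \therefore \quad T_{\rm relax} = \frac{1}{6 N \ln\left( \dfrac{b_{\rm max}}{b_{\rm min}} \right)} \frac{(R v)^3}{(G m)^2} , \tag{2.24} $$

or putting $b_{\rm max} \simeq R$,

$$ T_{\rm relax} = \frac{1}{6 N \ln\left( \dfrac{R}{b_{\rm min}} \right)} \frac{(R v)^3}{(G m)^2} . \tag{2.25} $$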

Equation 2.25 enables us to estimate the relaxation time for a system of stars, such as a galaxy or a globular cluster. Different derivations can have slightly different numerical constants because of the different assumptions made.

In practice, $b_{\rm min}$ is often set to the scale on which strong encounters begin to operate, so $b_{\rm min} \simeq 1$ AU. The precise values of $b_{\rm max}$ and $b_{\rm min}$ have relatively little effect on the estimation of the relaxation time because of the log dependence.

As an example of the calculation of the relaxation time, consider an elliptical galaxy. This has: $v \simeq 300$ km s$^{-1}$ $= 3.0 \times 10^{5}$ m s$^{-1}$, $N \simeq 10^{11}$, $R \simeq 10$ kpc $\simeq 3.1 \times 10^{20}$ m and $m \simeq 1 \, M_\odot \simeq 2.0 \times 10^{30}$ kg. So, $\ln(R/b_{\rm min}) \simeq 21$ and $T_{\rm relax} \sim 10^{24}$ s $\sim 10^{17}$ yr. The Universe is $14 \times 10^{9}$ yr old, which means that the relaxation time is $\sim 10^{7}$ times the age of the Universe. So star-star encounters are of no significance for galaxies.

For a large globular cluster, we have: $v \simeq 10$ km s$^{-1}$ $= 10^{4}$ m s$^{-1}$, $N \simeq 500\,000$, $R \simeq 5$ pc $\simeq 1.6 \times 10^{17}$ m and $m \simeq 1 \, M_\odot \simeq 2.0 \times 10^{30}$ kg. So, $\ln(R/b_{\rm min}) \simeq 14$ and $T_{\rm relax} \sim 5 \times 10^{15}$ s $\sim 10^{8}$ yr. This is a small fraction ($\sim 10^{-2}$) of the age of the Galaxy. Two-body interactions are therefore significant in globular clusters.

The importance of the relaxation time calculation is that it enables us to decide whether we need to allow for star-star interactions when modelling the dynamics of a system of stars. Modelling becomes much easier if two-body encounters can be ignored. A system where these interactions are not important is called a collisionless system. This is why stars on the scale of galaxies were described as being collisionless in Chapter 1. Fortunately, we can ignore these star-star interactions when modelling galaxies and this makes possible the use of a result called the collisionless Boltzmann equation later.
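The two worked examples can be reproduced by evaluating Equation 2.25 directly. The following sketch assumes $b_{\rm min} = 1$ AU, as in the text; the constant values and the function name `relaxation_time` are illustrative assumptions.

```python
import math

G = 6.674e-11      # gravitational constant, SI (assumed value)
M_SUN = 1.989e30   # solar mass in kg (assumed value)
PC = 3.086e16      # one parsec in metres (assumed value)
AU = 1.496e11      # astronomical unit in metres (assumed value)
YEAR = 3.156e7     # one year in seconds (assumed value)

def relaxation_time(N, m, R, v, b_min=AU):
    """T_relax = (R v)^3 / (6 N ln(R / b_min) (G m)^2), Equation 2.25."""
    return (R * v) ** 3 / (6.0 * N * math.log(R / b_min) * (G * m) ** 2)

# Elliptical galaxy: N = 1e11, R = 10 kpc, v = 300 km/s.
t_galaxy = relaxation_time(1e11, M_SUN, 1.0e4 * PC, 3.0e5)
# Large globular cluster: N = 5e5, R = 5 pc, v = 10 km/s.
t_cluster = relaxation_time(5e5, M_SUN, 5.0 * PC, 1.0e4)

for name, t in (("galaxy", t_galaxy), ("globular cluster", t_cluster)):
    print(f"{name}: T_relax = {t:.1e} s = {t / YEAR:.1e} yr")
```

Comparing each result with the age of the Universe (about $1.4 \times 10^{10}$ yr) shows directly which of the two systems can be treated as collisionless.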

2.6 The Ratio of the Relaxation Time to the Crossing Time

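An approximate expression for the ratio of the relaxation time to the crossing time can be calculated easily. Dividing the expression for the relaxation time (Equation 2.25) by the crossing time $T_{\rm cross} = 2R/v$ (Equation 2.13),

$$ \frac{T_{\rm relax}}{T_{\rm cross}} = \frac{1}{12 N \ln\left( \dfrac{R}{b_{\rm min}} \right)} \frac{R^2 v^4}{(G m)^2} . $$

For a uniform sphere, from the virial theorem (Equation 2.10),

$$ v^2 \simeq \frac{N G m}{R} , $$

so that $R v^2 / (G m) \simeq N$. Setting $b_{\rm min}$ equal to the strong encounter radius $r_S = 2Gm/v^2$ (Equation 2.16), we get

$$ \frac{T_{\rm relax}}{T_{\rm cross}} = \frac{1}{12 N \ln\left( \dfrac{R v^2}{2 G m} \right)} \frac{R^2 v^4}{(G m)^2} \simeq \frac{N^2}{12 N \ln(N)} , $$

where $\ln(N/2)$ has been approximated by $\ln N$.

$$ \therefore \quad \frac{T_{\rm relax}}{T_{\rm cross}} \simeq \frac{N}{12 \ln N} . \tag{2.26} $$

For a galaxy, $N \sim 10^{11}$. Therefore $T_{\rm relax}/T_{\rm cross} \sim 10^{9}$. For a globular cluster, $N \sim 10^{5}$ and $T_{\rm relax}/T_{\rm cross} \sim 10^{3}$.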
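Equation 2.26 depends only on the number of stars, so it is simple to tabulate. A minimal sketch (the function name is an illustrative assumption):

```python
import math

def relax_to_cross_ratio(N):
    """T_relax / T_cross ~ N / (12 ln N), Equation 2.26."""
    return N / (12.0 * math.log(N))

ratio_galaxy = relax_to_cross_ratio(1e11)
ratio_cluster = relax_to_cross_ratio(1e5)

for N in (1e3, 1e5, 1e7, 1e9, 1e11):
    print(f"N = {N:.0e}: T_relax / T_cross ~ {relax_to_cross_ratio(N):.1e}")
```

The ratio grows almost linearly with $N$, which is why two-body relaxation matters for star clusters but not for galaxies.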

2.7 The Nature of the Gravitational Potential in a Galaxy

The gravitational potential in a galaxy can be represented as essentially having two components. The first of these is the broad, smooth, underlying potential due to the entire galaxy. This is the sum of the potentials of all the stars, and also of the dark matter and the interstellar medium. The second component is the localised deeper potentials due to individual stars.

We can effectively regard the potential as being made of a smooth component with very localised deep potentials superimposed on it. This is illustrated figuratively in Figure 2.1.

Figure 2.1: A sketch of the gravitational potential of a galaxy, showing the broad potential of the galaxy as a whole, and the deeper, localised potentials of individual stars.

Interactions between individual stars are rare, as we have seen, and therefore it is the broad distribution that determines the motions of stars. Therefore, we can represent the dynamics of a system of stars using only the smooth underlying component of the gravitational potential $\Phi(\mathbf{x}, t)$, where $\mathbf{x}$ is the position vector of a point and $t$ is the time. If the galaxy has reached a steady state, $\Phi$ is $\Phi(\mathbf{x})$ only. We shall neglect the effect of the localised potentials of stars in the following sections, which is an acceptable approximation as we have shown.

2.8 Gravitational potentials, density distributions and masses

2.8.1 General principles

The distribution of mass in a galaxy – including both the visible and dark matter – determines the gravitational potential. The potential $\Phi$ at any point is related to the local density $\rho$ by Poisson’s Equation, $\nabla^2 \Phi \, (\equiv \nabla \cdot \nabla \Phi) = 4\pi G \rho$. This means that if we know the density $\rho(\mathbf{x})$ as a function of position across a galaxy, we can calculate the potential $\Phi$, either analytically or numerically, by integration. Alternatively, if we know $\Phi(\mathbf{x})$, we can calculate the density profile $\rho(\mathbf{x})$ by differentiation. In addition, because the acceleration due to gravity $\mathbf{g}$ is related to the potential by $\mathbf{g} = -\nabla \Phi$, we can compute $\mathbf{g}(\mathbf{x})$ from $\Phi(\mathbf{x})$ and vice-versa. Similarly, substituting for $\mathbf{g} = -\nabla \Phi$ in the Poisson Equation gives $\nabla \cdot \mathbf{g} = -4\pi G \rho$.

These computations are often done for some example theoretical representations of the potential or density. A number of convenient analytical functions are encountered in the literature, depending on the type of galaxy being modelled and particular circumstances.

The issue of determining actual density profiles and potentials from observations of galaxies is much more challenging, however. Observations readily give the projected density distributions of stars on the sky, and we can attempt to derive the three-dimensional distribution of stars from this. However, it is the total density $\rho(\mathbf{x})$, including dark matter $\rho_{\rm DM}(\mathbf{x})$, that is relevant gravitationally, with $\rho(\mathbf{x}) = \rho_{\rm DM}(\mathbf{x}) + \rho_{\rm VIS}(\mathbf{x})$. The dark matter distribution can only be inferred from the dynamics of visible matter (or to a limited extent from gravitational lensing of background objects). In practice, therefore, the three-dimensional density distribution $\rho(\mathbf{x})$ and the gravitational potential $\Phi(\mathbf{x})$ are poorly known.

2.8.2 Spherical symmetry

Calculating the relationship between density and potential is much simpler if we are dealing with spherically symmetric distributions, which are appropriate in some circumstances such as spherical elliptical galaxies. Under spherical symmetry, $\rho$ and $\Phi$ are functions only of the radial distance $r$ from the centre of the distribution. Therefore,

$$ \nabla^2 \Phi = \frac{1}{r^2} \frac{d}{dr} \left( r^2 \frac{d\Phi}{dr} \right) = 4\pi G \rho $$

because $\Phi$ is independent of the angles $\theta$ and $\phi$ in a spherical coordinate system (see Appendix B).

Another useful parameter for spherically symmetric distributions is the mass $M(r)$ that lies inside a radius $r$. We can relate this to the density $\rho(r)$ by considering a thin spherical shell of radius $r$ and thickness $dr$ centred on the distribution. The mass of this shell is $dM(r) = \rho(r) \times \text{surface area} \times \text{thickness} = 4\pi r^2 \rho(r) \, dr$. This gives us the differential equation

$$ \frac{dM}{dr} = 4\pi r^2 \rho , \tag{2.27} $$

often known as the equation of continuity of mass. The total mass is $M_{\rm tot} = \lim_{r \to \infty} M(r)$.

The gravitational acceleration $\mathbf{g}$ in a spherical distribution has an absolute value $|\mathbf{g}|$ of

$$ g = \frac{G M(r)}{r^2} , \tag{2.28} $$

at a distance $r$ from the centre, where $G$ is the constant of gravitation (derived in Appendix B), and is directed towards the centre of the distribution.

2.8.3 Two examples of spherical potentials

The Plummer Potential

A function that is often used for the theoretical modelling of spherically-symmetric galaxies is the Plummer potential. This has a gravitational potential $\Phi$ at a radial distance $r$ from the centre that is given by

$$ \Phi(r) = - \frac{G M_{\rm tot}}{\sqrt{r^2 + a^2}} , \tag{2.29} $$

where $M_{\rm tot}$ is the total mass of the galaxy and $a$ is a constant. The constant $a$ serves to flatten the potential in the core.

For this potential the density $\rho$ at a radial distance $r$ is

$$ \rho(r) = \frac{3 M_{\rm tot}}{4\pi} \frac{a^2}{(r^2 + a^2)^{5/2}} , \tag{2.30} $$

which can be derived from the expression for $\Phi$ using the Poisson equation $\nabla^2 \Phi = 4\pi G \rho$. This density scales with radius as $\rho \sim r^{-5}$ at large radii.

The mass $M(r)$ interior to a radius $r$ can be computed from the density $\rho$ using $dM/dr = 4\pi r^2 \rho$, or from the potential $\Phi$ using Gauss’s Law in the form $\oint_S \nabla \Phi \cdot d\mathbf{S} = 4\pi G M(r)$ for a spherical surface of radius $r$. The result is

$$ M(r) = \frac{M_{\rm tot} \, r^3}{(r^2 + a^2)^{3/2}} . \tag{2.31} $$

The Plummer potential was first used in 1911 by H. C. K. Plummer (1875–1946) to describe globular clusters. Because of the simple functional forms, the Plummer model is sometimes useful for approximate analytical modelling of galaxies, but the $r^{-5}$ density profile is much steeper than elliptical galaxies are observed to have.
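The consistency of the Plummer density and enclosed mass can be checked by integrating $dM/dr = 4\pi r^2 \rho$ numerically and comparing with Equation 2.31. The sketch below assumes the standard normalisation of the density, with a factor $4\pi$ in the denominator, and works in units where $M_{\rm tot} = a = 1$; the function names and the midpoint-rule integration are illustrative choices.

```python
import math

def plummer_density(r, M_tot, a):
    """Plummer density, rho = 3 M_tot a^2 / (4 pi (r^2 + a^2)^(5/2))."""
    return 3.0 * M_tot * a**2 / (4.0 * math.pi * (r**2 + a**2) ** 2.5)

def plummer_enclosed_mass(r, M_tot, a):
    """Analytic enclosed mass, Equation 2.31."""
    return M_tot * r**3 / (r**2 + a**2) ** 1.5

def integrated_mass(r, M_tot, a, steps=20_000):
    """Integrate dM/dr = 4 pi r^2 rho from 0 to r with the midpoint rule."""
    dr = r / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * dr
        total += 4.0 * math.pi * s**2 * plummer_density(s, M_tot, a) * dr
    return total

M_tot, a, r = 1.0, 1.0, 3.0
m_numeric = integrated_mass(r, M_tot, a)
m_analytic = plummer_enclosed_mass(r, M_tot, a)
print(m_numeric, m_analytic)
```

Evaluating `plummer_enclosed_mass` at increasingly large $r$ shows the enclosed mass tending to $M_{\rm tot}$, so that, unlike the isothermal sphere, the Plummer model has a finite total mass.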

The Isothermal Sphere

The density distribution known as the isothermal sphere is a spherical model of a galaxy that is identical to the distribution that would be followed by a stable cloud of gas having the same temperature everywhere. A spherically-symmetric cloud of gas having a single temperature $T$ throughout would have a gas pressure $P(r)$ at a radius $r$ from its centre that is related to $T$ by the ideal gas law as $P(r) = n_p(r) \, k_B T$, where $n_p(r)$ is the number density of gas particles (atoms or molecules) at radius $r$ and $k_B$ is the Boltzmann constant. The cloud will be supported by hydrostatic equilibrium, so therefore

$$ \frac{dP}{dr} = - \frac{G M(r)}{r^2} \, \rho(r) , \tag{2.32} $$

where $M(r)$ is the mass enclosed within a radius $r$. The gradient in the mass is $dM/dr = 4\pi r^2 \rho(r)$.

These equations have a solution

$$ \rho(r) = \frac{\sigma^2}{2\pi G r^2} , \quad \text{and} \quad M(r) = \frac{2 \sigma^2}{G} \, r , \quad \text{where} \quad \sigma^2 \equiv \frac{k_B T}{m_p} , \tag{2.33} $$

where $m_p$ is the mass of each gas particle. The parameter $\sigma$ is the root-mean-square velocity in any direction.

The isothermal sphere model for a system of stars is defined to be a model that has the same density distribution as the isothermal gas cloud. Therefore, an isothermal galaxy would also have a density $\rho(r)$ and mass $M(r)$ interior to a radius $r$ given by

$$ \rho(r) = \frac{\sigma^2}{2\pi G r^2} , \quad \text{and} \quad M(r) = \frac{2 \sigma^2}{G} \, r , \tag{2.34} $$

where $\sigma$ is the root-mean-square velocity of the stars along any direction.

The isothermal sphere model is sometimes used for the analytical modelling of galaxies. While it has some advantages of simplicity, it does suffer from the disadvantage of being unrealistic in some important respects. Most significantly, the model fails totally at large radii: formally the limit of $M(r)$ as $r \to \infty$ is infinite.
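That the isothermal density and mass profiles satisfy the hydrostatic equilibrium equation can be confirmed numerically. The sketch below uses $P = \rho \sigma^2$, which follows from the ideal gas law with $\rho = n_p m_p$ and $\sigma^2 = k_B T / m_p$, and estimates $dP/dr$ by a central finite difference. The values of $\sigma$ and $r$, the step size and the function names are assumptions for illustration.

```python
import math

G = 6.674e-11   # gravitational constant, SI (assumed value)

def iso_density(r, sigma):
    """rho(r) = sigma^2 / (2 pi G r^2), Equation 2.34."""
    return sigma**2 / (2.0 * math.pi * G * r**2)

def iso_mass(r, sigma):
    """M(r) = 2 sigma^2 r / G, Equation 2.34."""
    return 2.0 * sigma**2 * r / G

def hydrostatic_residual(r, sigma, h=1e-4):
    """Relative mismatch between dP/dr and -G M rho / r^2, with P = rho sigma^2.
    dP/dr is estimated by a central difference with step h * r."""
    dr = h * r
    dP_dr = sigma**2 * (iso_density(r + dr, sigma)
                        - iso_density(r - dr, sigma)) / (2.0 * dr)
    rhs = -G * iso_mass(r, sigma) * iso_density(r, sigma) / r**2
    return abs(dP_dr / rhs - 1.0)

sigma = 2.0e5       # 200 km/s, an illustrative velocity dispersion
r = 3.086e19        # 1 kpc in metres
residual = hydrostatic_residual(r, sigma)
print(f"relative residual: {residual:.1e}")
```

The residual is limited only by the finite-difference step. The linear growth of `iso_mass` with radius is the numerical counterpart of the divergent total mass noted above.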

2.9 Phase Space and the Distribution Function f(x, v, t)

To describe the dynamics of a galaxy, we could use:

• the positions of each star, $\mathbf{x}_i$

• the velocities of each star, $\mathbf{v}_i$

where $i = 1$ to $N$, with $N \sim 10^{6}$ to $10^{12}$. However, this would be impractical numerically. If we tried to store these data on a computer as 4-byte numbers for every star in a galaxy having $N \sim 10^{12}$ stars, we would need $6 \times 4 \times 10^{12}$ bytes $\sim 2 \times 10^{13}$ bytes $\sim 20\,000$ Gbyte. This is such a large data size that the storage requirements are prohibitive. If we needed to simulate a galaxy theoretically, we would need to follow the galaxy over time using a large number of time steps. Storing the complete set of data for, say, $10^{3}$ to $10^{6}$ time steps would be impossible. Observationally, meanwhile, it is impossible to determine the positions and motions of every star in any galaxy, even our own.
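The storage estimate is simple arithmetic, sketched below for one snapshot and for a full run. The function name and the choice of $10^{3}$ time steps are illustrative assumptions; one Gbyte is taken as $10^{9}$ bytes.

```python
def phase_space_storage_bytes(n_stars, bytes_per_number=4, coords_per_star=6):
    """Bytes needed to store three position and three velocity
    components for every star in one snapshot."""
    return n_stars * coords_per_star * bytes_per_number

snapshot = phase_space_storage_bytes(10**12)
run = snapshot * 10**3   # a run keeping 1000 time steps

print(f"one snapshot: {snapshot:.1e} bytes = {snapshot / 1e9:.0f} Gbyte")
print(f"1000 time steps: {run:.1e} bytes")
```

This reproduces the figure of roughly $2 \times 10^{13}$ bytes per snapshot quoted above, and makes clear how quickly the total grows once many time steps are kept.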

In practice, therefore, people represent the stars in a galaxy using the distribution function f(x, v, t) over position x and velocity v, at a time t. This is the probability density in the 6-dimensional phase space of position and velocity at a given time. It is also known as the “phase space density”. It requires only modest data resources to store the function numerically for a model of a galaxy, while f can also be modelled analytically.

The number of stars in a rectangular box between x and x + dx, y and y + dy, z and z + dz, with velocity components between v_x and v_x + dv_x, v_y and v_y + dv_y, v_z and v_z + dv_z, is f(x, v, t) dx dy dz dv_x dv_y dv_z ≡ f(x, v, t) d³x d³v. The number density n(x, t) of stars in space can be obtained from the distribution function f by integrating over the velocity components,

$$
n(\mathbf{x}, t) = \int_{-\infty}^{\infty} f(\mathbf{x}, \mathbf{v}, t) \, dv_x \, dv_y \, dv_z = \int_{-\infty}^{\infty} f(\mathbf{x}, \mathbf{v}, t) \, d^3v . \tag{2.35}
$$
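As a numerical illustration of Equation 2.35, the sketch below integrates a distribution function over the three velocity components at a single position. The Maxwellian form of f, the values of `n0` and `sigma`, and the function names are all assumptions made for the example; they are not taken from the notes at this point.

```python
import math

def maxwellian_f(vx, vy, vz, n0, sigma):
    """Phase-space density at one position for an assumed isotropic Maxwellian."""
    norm = n0 / (2.0 * math.pi * sigma**2) ** 1.5
    return norm * math.exp(-(vx * vx + vy * vy + vz * vz) / (2.0 * sigma**2))

def number_density(f, vmax, steps):
    """Midpoint-rule estimate of n = integral of f dvx dvy dvz over [-vmax, vmax]^3."""
    dv = 2.0 * vmax / steps
    centres = [-vmax + (k + 0.5) * dv for k in range(steps)]
    total = 0.0
    for vx in centres:
        for vy in centres:
            for vz in centres:
                total += f(vx, vy, vz)
    return total * dv**3

n0 = 0.1       # stars per cubic parsec (illustrative value)
sigma = 30.0   # km/s (illustrative value)
n = number_density(lambda vx, vy, vz: maxwellian_f(vx, vy, vz, n0, sigma),
                   vmax=6.0 * sigma, steps=60)
print(n, n0)
```

The integral is truncated at six times the dispersion, where the assumed Maxwellian is negligible, so the recovered n should agree closely with the `n0` used to normalise f.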

2.10 The Continuity Equation

We shall assume here that stars are conserved: for the purpose of modelling galaxies we shall assume that the number of stars does not change. This means ignoring star formation and the deaths of stars, but it is acceptable for the present purposes.

The assumption that stars are conserved results in the continuity equation. This relates the rate of change of the distribution function f with time to its rates of change with position and velocity. The equation becomes an important starting point in deriving other equations that relate f to the gravitational potential and to observational quantities.

Consider the x − v_x plane within the 6-dimensional phase space (x, y, z, v_x, v_y, v_z) in Cartesian coordinates. Consider a rectangular box in the plane extending from x to x + Δx and v_x to v_x + Δv_x.

But the velocity v_x means that stars move in x (v_x ≡ dx/dt). So there is a flow of stars through the box in both the x and the v_x directions.


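We can represent the flow of stars by the continuity equation:

$$
\frac{\partial f}{\partial t}
+ \frac{\partial}{\partial x}\left( f \frac{dx}{dt} \right)
+ \frac{\partial}{\partial y}\left( f \frac{dy}{dt} \right)
+ \frac{\partial}{\partial z}\left( f \frac{dz}{dt} \right)
+ \frac{\partial}{\partial v_x}\left( f \frac{dv_x}{dt} \right)
+ \frac{\partial}{\partial v_y}\left( f \frac{dv_y}{dt} \right)
+ \frac{\partial}{\partial v_z}\left( f \frac{dv_z}{dt} \right) = 0 . \tag{2.36}
$$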

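This can be abbreviated as

$$
\frac{\partial f}{\partial t} + \sum_{i=1}^{3} \left( \frac{\partial}{\partial x_i}\left( f \frac{dx_i}{dt} \right) + \frac{\partial}{\partial v_i}\left( f \frac{dv_i}{dt} \right) \right) = 0 , \tag{2.37}
$$

where x_1 ≡ x, x_2 ≡ y, x_3 ≡ z, v_1 ≡ v_x, v_2 ≡ v_y, and v_3 ≡ v_z. It is sometimes also abbreviated as

$$
\frac{\partial f}{\partial t} + \frac{\partial}{\partial \mathbf{x}} \cdot \left( f \frac{d\mathbf{x}}{dt} \right) + \frac{\partial}{\partial \mathbf{v}} \cdot \left( f \frac{d\mathbf{v}}{dt} \right) = 0 , \tag{2.38}
$$

where, in this notation, for any vectors a and b with components (a_1, a_2, a_3) and (b_1, b_2, b_3),

$$
\frac{\partial}{\partial \mathbf{a}} \cdot \mathbf{b} \equiv \sum_{i=1}^{3} \frac{\partial b_i}{\partial a_i} . \tag{2.39}
$$

(Note that it does not mean a direct differentiation by a vector.)

It is also possible to simplify the notation further by introducing a combined phase space coordinate system w = (x, v) with components (w_1, w_2, w_3, w_4, w_5, w_6) = (x, y, z, v_x, v_y, v_z). In this case the continuity equation becomes

$$
\frac{\partial f}{\partial t} + \sum_{i=1}^{6} \frac{\partial}{\partial w_i}\left( f \frac{dw_i}{dt} \right) = 0 . \tag{2.40}
$$

The equation of continuity can also be expressed in terms of the momentum p = mv, where m is the mass of an element of gas, as

$$
\frac{\partial f}{\partial t} + \frac{\partial}{\partial \mathbf{x}} \cdot \left( f \frac{d\mathbf{x}}{dt} \right) + \frac{\partial}{\partial \mathbf{p}} \cdot \left( f \frac{d\mathbf{p}}{dt} \right) = 0 . \tag{2.41}
$$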


2.11 The Collisionless Boltzmann Equation

2.11.1 The importance of the Collisionless Boltzmann Equation

Equation 2.25 showed that the relaxation time for galaxies is very long, significantly longer than the age of the Universe: galaxies are collisionless systems. This, fortunately, simplifies the analysis of the dynamics of stars in galaxies.

It is possible to derive an equation from the continuity equation that more explicitly states the relation between the distribution function f, position x, velocity v and time t. This is the collisionless Boltzmann equation (C.B.E.), which takes its name from a similar equation in statistical physics derived by Boltzmann to describe particles in a gas.

2.11.2 A derivation of the Collisionless Boltzmann Equation

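The continuity equation (2.37) states that

$$
\frac{\partial f}{\partial t} + \sum_{i=1}^{3} \left( \frac{\partial}{\partial x_i}\left( f \frac{dx_i}{dt} \right) + \frac{\partial}{\partial v_i}\left( f \frac{dv_i}{dt} \right) \right) = 0 ,
$$

where f is the distribution function in the Cartesian phase space (x_1, x_2, x_3, v_1, v_2, v_3). But the acceleration of a star is given by the gradient of the gravitational potential Φ:

$$
\frac{dv_i}{dt} = - \frac{\partial \Phi}{\partial x_i}
$$

in each direction (i.e. for each value of i for i = 1, 2, 3). (This is simply dv/dt = g = −∇Φ resolved into each dimension.)

We also have dx_i/dt = v_i, so,

$$
\frac{\partial f}{\partial t} + \sum_{i=1}^{3} \left( \frac{\partial}{\partial x_i}\left( f v_i \right) + \frac{\partial}{\partial v_i}\left( - f \frac{\partial \Phi}{\partial x_i} \right) \right) = 0 .
$$

But v_i is a coordinate, not a value associated with a particular star: we are using the continuous function f rather than considering individual stars. Therefore v_i is independent of x_i. So,

$$
\frac{\partial}{\partial x_i}\left( f v_i \right) = v_i \frac{\partial f}{\partial x_i} .
$$

The potential Φ ≡ Φ(x, t) does not depend on v_i: Φ is independent of velocity.

$$
\therefore \quad \frac{\partial}{\partial v_i}\left( f \frac{\partial \Phi}{\partial x_i} \right) = \frac{\partial \Phi}{\partial x_i} \frac{\partial f}{\partial v_i}
$$

$$
\therefore \quad \frac{\partial f}{\partial t} + \sum_{i=1}^{3} \left( v_i \frac{\partial f}{\partial x_i} - \frac{\partial \Phi}{\partial x_i} \frac{\partial f}{\partial v_i} \right) = 0 .
$$

But dv_i/dt = −∂Φ/∂x_i, so,

$$
\frac{\partial f}{\partial t} + \sum_{i=1}^{3} \left( v_i \frac{\partial f}{\partial x_i} + \frac{dv_i}{dt} \frac{\partial f}{\partial v_i} \right) = 0 . \tag{2.42}
$$

This is the collisionless Boltzmann equation. It can also be written as

$$
\frac{\partial f}{\partial t} + \sum_{i=1}^{3} \left( \frac{dx_i}{dt} \frac{\partial f}{\partial x_i} + \frac{dv_i}{dt} \frac{\partial f}{\partial v_i} \right) = 0 . \tag{2.43}
$$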


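Alternatively it can be expressed as,

$$
\frac{\partial f}{\partial t} + \sum_{i=1}^{6} \frac{dw_i}{dt} \frac{\partial f}{\partial w_i} = 0 , \tag{2.44}
$$

where w = (x, v) is a 6-dimensional coordinate system, and also as

$$
\frac{\partial f}{\partial t} + \frac{d\mathbf{x}}{dt} \cdot \frac{\partial f}{\partial \mathbf{x}} + \frac{d\mathbf{v}}{dt} \cdot \frac{\partial f}{\partial \mathbf{v}} = 0 , \tag{2.45}
$$

and as

$$
\frac{\partial f}{\partial t} + \frac{d\mathbf{x}}{dt} \cdot \frac{\partial f}{\partial \mathbf{x}} + \frac{d\mathbf{p}}{dt} \cdot \frac{\partial f}{\partial \mathbf{p}} = 0 . \tag{2.46}
$$

Note the use here of the notation

$$
\frac{d\mathbf{x}}{dt} \cdot \frac{\partial f}{\partial \mathbf{x}} \equiv \sum_{i=1}^{3} \frac{dx_i}{dt} \frac{\partial f}{\partial x_i} , \quad \text{etc.} \tag{2.47}
$$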
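A quick numerical check can make the content of the collisionless Boltzmann equation more concrete. The sketch below evaluates the steady-state, one-dimensional form v ∂f/∂x − (dΦ/dx) ∂f/∂v by finite differences for two trial functions. The harmonic potential, the two trial functions and all names are assumptions chosen for illustration only.

```python
import math

def phi(x):
    """Assumed 1-D harmonic potential (illustrative choice, not from the notes)."""
    return 0.5 * x * x

def dphi_dx(x):
    """Gradient of the assumed potential."""
    return x

def f_energy(x, v, sigma=1.0):
    """A trial f that depends on (x, v) only through E_m = v^2/2 + Phi(x)."""
    return math.exp(-(0.5 * v * v + phi(x)) / sigma**2)

def f_other(x, v):
    """A trial function of (x, v) that is not a function of the energy alone."""
    return math.exp(-(x * x + 3.0 * v * v))

def cbe_residual(f, x, v, h=1e-5):
    """Steady-state 1-D CBE left-hand side, v df/dx - (dPhi/dx) df/dv, by central differences."""
    df_dx = (f(x + h, v) - f(x - h, v)) / (2.0 * h)
    df_dv = (f(x, v + h) - f(x, v - h)) / (2.0 * h)
    return v * df_dx - dphi_dx(x) * df_dv

res_energy = cbe_residual(f_energy, x=0.7, v=-0.4)
res_other = cbe_residual(f_other, x=0.7, v=-0.4)
print(res_energy, res_other)
```

The first residual should vanish to within the finite-difference error, because a function of the energy alone is a steady-state solution in a static potential. The second trial function is not a steady-state solution, so its residual is clearly non-zero.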

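2.11.3 Deriving the Collisionless Boltzmann Equation using Hamiltonian Mechanics

The collisionless Boltzmann equation can also be derived from the continuity equation using Hamiltonian mechanics. This derivation is given here. It has the advantage of being neat. However, do not worry if you are not familiar with Hamiltonian mechanics: this is given as an alternative to Section 2.11.2.

Hamilton’s Equations relate the differentials of the position vector x and of the (generalised) momentum p to the differential of the Hamiltonian H:

$$
\frac{d\mathbf{x}}{dt} = \frac{\partial H}{\partial \mathbf{p}} , \qquad \frac{d\mathbf{p}}{dt} = - \frac{\partial H}{\partial \mathbf{x}} . \tag{2.48}
$$

(In this notation this means

$$
\frac{dx_i}{dt} = \frac{\partial H}{\partial p_i} \quad \text{and} \quad \frac{dp_i}{dt} = - \frac{\partial H}{\partial x_i} \quad \text{for } i = 1 \text{ to } 3, \tag{2.49}
$$

where x_i and p_i are the components of x and p.)

Substituting for dx/dt and dp/dt into the continuity equation,

$$
\frac{\partial f}{\partial t} + \frac{\partial}{\partial \mathbf{x}} \cdot \left( f \frac{\partial H}{\partial \mathbf{p}} \right) + \frac{\partial}{\partial \mathbf{p}} \cdot \left( - f \frac{\partial H}{\partial \mathbf{x}} \right) = 0 .
$$

For a star moving in a gravitational potential Φ, the Hamiltonian is

$$
H = \frac{p^2}{2m} + m \Phi(\mathbf{x}) = \frac{\mathbf{p} \cdot \mathbf{p}}{2m} + m \Phi(\mathbf{x}) , \tag{2.50}
$$

where p is its momentum and m is its mass. Differentiating,

$$
\begin{aligned}
\frac{\partial H}{\partial \mathbf{p}} &= \frac{\partial}{\partial \mathbf{p}} \left( \frac{\mathbf{p} \cdot \mathbf{p}}{2m} \right) + \frac{\partial}{\partial \mathbf{p}} \left( m \Phi \right) \\
&= \frac{\mathbf{p}}{m} + 0 \quad \text{because } \Phi(\mathbf{x}, t) \text{ is independent of } \mathbf{p} \\
&= \frac{\mathbf{p}}{m}
\end{aligned}
$$

and

$$
\begin{aligned}
\frac{\partial H}{\partial \mathbf{x}} &= \frac{\partial}{\partial \mathbf{x}} \left( \frac{p^2}{2m} \right) + m \frac{\partial \Phi}{\partial \mathbf{x}} \\
&= 0 + m \frac{\partial \Phi}{\partial \mathbf{x}} \quad \text{because } p^2 = \mathbf{p} \cdot \mathbf{p} \text{ is independent of } \mathbf{x} \\
&= m \frac{\partial \Phi}{\partial \mathbf{x}} .
\end{aligned}
$$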


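Substituting for ∂H/∂p and ∂H/∂x,

$$
\frac{\partial f}{\partial t} + \frac{\partial}{\partial \mathbf{x}} \cdot \left( f \frac{\mathbf{p}}{m} \right) - \frac{\partial}{\partial \mathbf{p}} \cdot \left( f m \frac{\partial \Phi}{\partial \mathbf{x}} \right) = 0
$$

$$
\therefore \quad \frac{\partial f}{\partial t} + \frac{\mathbf{p}}{m} \cdot \frac{\partial f}{\partial \mathbf{x}} - m \frac{\partial \Phi}{\partial \mathbf{x}} \cdot \frac{\partial f}{\partial \mathbf{p}} = 0
$$

because p is independent of x, and because ∂Φ/∂x is independent of p since Φ ≡ Φ(x, t). But the momentum is p = m dx/dt and the acceleration is (1/m) dp/dt = −∂Φ/∂x (the gradient of the potential).

$$
\therefore \quad \frac{\partial \Phi}{\partial \mathbf{x}} = - \frac{1}{m} \frac{d\mathbf{p}}{dt} .
$$

So,

$$
\frac{\partial f}{\partial t} + \frac{m}{m} \frac{d\mathbf{x}}{dt} \cdot \frac{\partial f}{\partial \mathbf{x}} - m \left( - \frac{1}{m} \frac{d\mathbf{p}}{dt} \right) \cdot \frac{\partial f}{\partial \mathbf{p}} = 0
$$

$$
\therefore \quad \frac{\partial f}{\partial t} + \frac{d\mathbf{x}}{dt} \cdot \frac{\partial f}{\partial \mathbf{x}} + \frac{d\mathbf{p}}{dt} \cdot \frac{\partial f}{\partial \mathbf{p}} = 0 .
$$

The left-hand side is the differential df/dt. So,

$$
\frac{\partial f}{\partial t} + \frac{d\mathbf{x}}{dt} \cdot \frac{\partial f}{\partial \mathbf{x}} + \frac{d\mathbf{p}}{dt} \cdot \frac{\partial f}{\partial \mathbf{p}} \equiv \frac{df}{dt} = 0 . \tag{2.51}
$$

This is the collisionless Boltzmann equation.

While this equation is called the collisionless Boltzmann equation (or CBE) in stellar dynamics, in Hamiltonian dynamics it is known as Liouville’s theorem.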
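Liouville’s theorem can be read as saying that the flow in phase space preserves phase-space volume. The sketch below illustrates this numerically for a toy one-dimensional Hamiltonian, H = p²/2 − cos x, which is an assumption made for the example and not a galactic potential. It uses a kick-drift-kick leapfrog scheme as a stand-in for the exact flow; each leapfrog substep is a shear in phase space, so the scheme shares the area-preserving property being illustrated. All function names are hypothetical.

```python
import math

def leapfrog_pendulum(x, p, dt, n_steps):
    """Kick-drift-kick leapfrog for the assumed toy Hamiltonian H = p^2/2 - cos(x)."""
    for _ in range(n_steps):
        p -= 0.5 * dt * math.sin(x)   # half kick: dp/dt = -dH/dx
        x += dt * p                   # drift:     dx/dt = +dH/dp
        p -= 0.5 * dt * math.sin(x)   # half kick
    return x, p

def jacobian_determinant(x0, p0, dt=0.01, n_steps=1000, eps=1e-5):
    """Determinant of d(x, p)_final / d(x, p)_initial, estimated by central differences."""
    xa, pa = leapfrog_pendulum(x0 + eps, p0, dt, n_steps)
    xb, pb = leapfrog_pendulum(x0 - eps, p0, dt, n_steps)
    xc, pc = leapfrog_pendulum(x0, p0 + eps, dt, n_steps)
    xd, pd = leapfrog_pendulum(x0, p0 - eps, dt, n_steps)
    dx_dx0 = (xa - xb) / (2.0 * eps)
    dp_dx0 = (pa - pb) / (2.0 * eps)
    dx_dp0 = (xc - xd) / (2.0 * eps)
    dp_dp0 = (pc - pd) / (2.0 * eps)
    return dx_dx0 * dp_dp0 - dx_dp0 * dp_dx0

det = jacobian_determinant(x0=1.0, p0=0.3)
print(det)
```

A determinant close to 1 means that a small patch of phase space keeps its area as it is carried along, even though its shape is distorted. This is the one-dimensional analogue of f staying constant along an orbit.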

2.12 The implications of the Collisionless Boltzmann Equation

The collisionless Boltzmann equation tells us that df/dt = 0. This means that the density in phase space, f, does not change with time for a test particle. Therefore if we follow a star in orbit, the density f in 6-dimensional phase space around the star is constant.

This simple result has important implications. If a star moves inwards in a galaxy as it follows its orbit, the density of stars in space increases (because the density of stars in each of the components of the galaxy is greater closer to the centre). df/dt = 0 then tells us that the spread of stellar velocities around the star will increase to keep f constant. Therefore the velocity dispersion around the star increases as the star moves inwards. The velocity dispersion is larger in regions of the galaxy where the density of stars is greater. Conversely, if a star moves out from the centre, the density of stars around it will decrease and the velocity dispersion will decrease to keep f constant.
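The size of this effect can be estimated with a rough sketch. Assume, purely for illustration, that the local velocity distribution is Maxwellian, so that the characteristic phase-space density is f_0 = n / (2πσ²)^(3/2). Holding f_0 fixed then gives σ ∝ n^(1/3). The Maxwellian form and the numerical values below are assumptions, not results from the notes.

```python
import math

def dispersion_at_fixed_f(n, f0):
    """Velocity dispersion sigma for which n / (2 pi sigma^2)^(3/2) equals f0."""
    return (n / f0) ** (1.0 / 3.0) / math.sqrt(2.0 * math.pi)

f0 = 1.0e-6     # assumed constant phase-space density along the orbit (arbitrary units)
n_outer = 0.1   # number density further out (illustrative value)
n_inner = 0.8   # number density closer in, eight times larger (illustrative value)

ratio = dispersion_at_fixed_f(n_inner, f0) / dispersion_at_fixed_f(n_outer, f0)
print(ratio)
```

Under these assumptions an eightfold increase in the space density of stars corresponds to a doubling of the velocity dispersion.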

The collisionless Boltzmann equation and the Poisson equation (which is the gravitational analogue of Gauss’s law in electrostatics) together constitute the basic equations of stellar dynamics:

$$
\frac{df}{dt} = 0 , \qquad \nabla^2 \Phi(\mathbf{x}) = 4 \pi G \rho(\mathbf{x}) , \tag{2.52}
$$

where f is the distribution function, t is time, Φ(x, t) is the gravitational potential at point x, ρ(x, t) is the mass density at point x, and G is the constant of gravitation.

The collisionless Boltzmann equation applies because star-star encounters do not change the motions of stars significantly over the lifetime of a galaxy, as was shown in Section 2.5. Were this not the case, and the system were collisional, the CBE would have to be modified by adding a “collisional term” on the right-hand side.

Though f is a density in phase space, the full form of the collisionless Boltzmann equation does not necessarily have to be written in terms of x and p. We can express df/dt = 0 in any set of six variables in phase space. You should remember that f is always taken to be a density in six-dimensional phase space, even in situations where it is a function of fewer variables. For example, if f happens to be a function of energy alone, it is not the same as the density in energy space.

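2.13 The Collisionless Boltzmann Equation in Cylindrical Coordinates

So far we have considered Cartesian coordinates (x, y, z, v_x, v_y, v_z). However, the form

$$
\frac{\partial f}{\partial t} + \sum_{i=1}^{3} \left( \frac{dx_i}{dt} \frac{\partial f}{\partial x_i} + \frac{dv_i}{dt} \frac{\partial f}{\partial v_i} \right) = 0 ,
$$

for the collisionless Boltzmann equation of Equation 2.43 applies to any coordinate system.

For a galaxy, it is often more convenient to use cylindrical coordinates with the centre of the galaxy as the origin.

The coordinates of a star are (R, φ, z). A cylindrical system is particularly useful for spiral galaxies like our own where the z = 0 plane is set to the Galactic plane. (Note the use of a lower-case φ as a coordinate angle, whereas elsewhere we have used a capital Φ to denote the gravitational potential.)

The collisionless Boltzmann equation in this system is

$$
\frac{df}{dt} = \frac{\partial f}{\partial t}
+ \frac{dR}{dt} \frac{\partial f}{\partial R}
+ \frac{d\phi}{dt} \frac{\partial f}{\partial \phi}
+ \frac{dz}{dt} \frac{\partial f}{\partial z}
+ \frac{dv_R}{dt} \frac{\partial f}{\partial v_R}
+ \frac{dv_\phi}{dt} \frac{\partial f}{\partial v_\phi}
+ \frac{dv_z}{dt} \frac{\partial f}{\partial v_z} = 0 , \tag{2.53}
$$

where v_R, v_φ, and v_z are the components of the velocity in the R, φ, z directions.

We need to replace the differentials of the velocity components with more convenient terms. dv_R/dt, dv_φ/dt and dv_z/dt are related to the acceleration a (but are not actually the components of the acceleration for the R and φ directions). The velocity and acceleration in terms of these differentials in a cylindrical coordinate system are

$$
\begin{aligned}
\mathbf{v} &= \frac{d\mathbf{r}}{dt} = \frac{dR}{dt} \, \mathbf{e}_R + R \frac{d\phi}{dt} \, \mathbf{e}_\phi + \frac{dz}{dt} \, \mathbf{e}_z \\
\mathbf{a} &= \frac{d\mathbf{v}}{dt} = \left( \frac{d^2R}{dt^2} - R \left( \frac{d\phi}{dt} \right)^2 \right) \mathbf{e}_R + \left( 2 \frac{dR}{dt} \frac{d\phi}{dt} + R \frac{d^2\phi}{dt^2} \right) \mathbf{e}_\phi + \frac{d^2z}{dt^2} \, \mathbf{e}_z
\end{aligned} \tag{2.54}
$$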


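where e_R, e_φ and e_z are unit vectors in the R, φ and z directions (a standard result for any cylindrical coordinate system, and for any velocity, acceleration or force). Representing the velocity as v = v_R e_R + v_φ e_φ + v_z e_z and equating coefficients of the unit vectors,

$$
\frac{dR}{dt} = v_R , \qquad \frac{d\phi}{dt} = \frac{v_\phi}{R} , \qquad \frac{dz}{dt} = v_z . \tag{2.55}
$$

The acceleration can be related to the gravitational potential Φ with a = −∇Φ (because the only forces acting on the star are those of gravity). In a cylindrical coordinate system,

$$
\nabla \equiv \mathbf{e}_R \frac{\partial}{\partial R} + \mathbf{e}_\phi \frac{1}{R} \frac{\partial}{\partial \phi} + \mathbf{e}_z \frac{\partial}{\partial z} . \tag{2.56}
$$

Using this result and equating coefficients, we obtain,

$$
\frac{d^2R}{dt^2} - R \left( \frac{d\phi}{dt} \right)^2 = - \frac{\partial \Phi}{\partial R} , \qquad
2 \frac{dR}{dt} \frac{d\phi}{dt} + R \frac{d^2\phi}{dt^2} = - \frac{1}{R} \frac{\partial \Phi}{\partial \phi} , \qquad
\frac{d^2z}{dt^2} = - \frac{\partial \Phi}{\partial z} .
$$

Rearranging these and substituting for dR/dt, dφ/dt and dz/dt from Equation 2.55, we obtain,

$$
\frac{dv_R}{dt} = - \frac{\partial \Phi}{\partial R} + \frac{v_\phi^2}{R} , \qquad \frac{dv_z}{dt} = - \frac{\partial \Phi}{\partial z} ,
$$

and with some more manipulation,

$$
\begin{aligned}
\frac{dv_\phi}{dt} &= \frac{d}{dt} \left( R \frac{d\phi}{dt} \right) = \frac{dR}{dt} \frac{d\phi}{dt} + R \frac{d^2\phi}{dt^2} = v_R \frac{v_\phi}{R} + \left( - \frac{1}{R} \frac{\partial \Phi}{\partial \phi} - 2 \frac{dR}{dt} \frac{d\phi}{dt} \right) \\
&= \frac{v_R v_\phi}{R} - \frac{1}{R} \frac{\partial \Phi}{\partial \phi} - 2 \frac{v_R v_\phi}{R} = - \frac{1}{R} \frac{\partial \Phi}{\partial \phi} - \frac{v_R v_\phi}{R} .
\end{aligned} \tag{2.57}
$$

Substituting these into Equation 2.53, we obtain,

$$
\frac{df}{dt} = \frac{\partial f}{\partial t} + v_R \frac{\partial f}{\partial R} + \frac{v_\phi}{R} \frac{\partial f}{\partial \phi} + v_z \frac{\partial f}{\partial z}
+ \left( \frac{v_\phi^2}{R} - \frac{\partial \Phi}{\partial R} \right) \frac{\partial f}{\partial v_R}
- \frac{1}{R} \left( v_R v_\phi + \frac{\partial \Phi}{\partial \phi} \right) \frac{\partial f}{\partial v_\phi}
- \frac{\partial \Phi}{\partial z} \frac{\partial f}{\partial v_z} = 0 . \tag{2.58}
$$

This is the collisionless Boltzmann equation in cylindrical coordinates. This form relates f to observable parameters (R, φ, z, v_R, v_φ, v_z) and the potential Φ.

In many practical cases, particularly spiral galaxies, Φ will be independent of φ, so ∂Φ/∂φ = 0 (but not if we include spiral arms where the potential will be slightly deeper).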
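Equation 2.58 can be checked numerically for a simple case. The sketch below takes a steady, axisymmetric potential (so that the ∂/∂t and ∂/∂φ terms vanish) and a trial f that depends only on the energy per unit mass and on R v_φ, then evaluates the remaining terms of Equation 2.58 by finite differences. The potential, the trial function, the evaluation point and the helper names are all assumptions chosen for illustration.

```python
import math

def phi(R, z):
    """Assumed axisymmetric toy potential, flattened in z (illustrative only)."""
    return 0.5 * math.log(1.0 + R * R + (z / 0.7) ** 2)

def f(R, z, vR, vphi, vz):
    """Trial distribution function depending only on energy per unit mass and R * vphi."""
    energy = 0.5 * (vR * vR + vphi * vphi + vz * vz) + phi(R, z)
    Lz = R * vphi
    return math.exp(-2.0 * energy) * (1.0 + Lz * Lz)

def partial(func, args, k, h=1e-5):
    """Central-difference partial derivative of func with respect to argument number k."""
    up = list(args)
    dn = list(args)
    up[k] += h
    dn[k] -= h
    return (func(*up) - func(*dn)) / (2.0 * h)

def cbe_cylindrical_residual(R, z, vR, vphi, vz):
    """Left-hand side of Eq. 2.58 for a steady, axisymmetric f (no t or phi dependence)."""
    args = (R, z, vR, vphi, vz)
    dPhi_dR = partial(phi, (R, z), 0)
    dPhi_dz = partial(phi, (R, z), 1)
    return (vR * partial(f, args, 0)
            + vz * partial(f, args, 1)
            + (vphi**2 / R - dPhi_dR) * partial(f, args, 2)
            - (vR * vphi / R) * partial(f, args, 3)
            - dPhi_dz * partial(f, args, 4))

residual = cbe_cylindrical_residual(R=1.3, z=0.4, vR=0.2, vphi=0.9, vz=-0.3)
scale = f(1.3, 0.4, 0.2, 0.9, -0.3)
print(residual, scale)
```

The residual should be negligible compared with the value of f itself. The individual terms are not small; they cancel because the chosen f depends on the phase-space coordinates only through quantities that are conserved in this potential.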

2.14 Orbits of Stars in Galaxies

2.14.1 The character of orbits

The term orbit is used to describe the trajectories of stars within galaxies, even though they are very different to Keplerian orbits such as those of planets in the Solar System. The orbits of stars in a galaxy are usually not closed paths and in general they are three dimensional (they do not lie in a plane). They are often complex. In general they are highly chaotic, even if the galaxy is in equilibrium.

The orbit of a star in a spherical potential, to consider the simplest example, is confined to a plane perpendicular to the angular momentum vector of the star. It is, however, not a closed path and has an appearance that is usually described as a rosette. In axisymmetric potentials (e.g. an oblate elliptical galaxy) the orbit is confined to a plane that precesses. This plane is inclined to the axis of symmetry and rotates about the axis. The orbit within the plane is similar to that in a spherical potential.

Triaxial potentials can have orbits that are much more complex. Triaxial potentials often have the tendency to tumble about one axis, which leads to chaotic star orbits.


Figure 2.2: An example of the orbit of a star in a spherical potential. An example star has been put into an orbit in the x−y plane. Its orbit follows a “rosette” pattern, but it remains in the x−y plane. [These diagrams were plotted using data generated assuming a Plummer potential: the potential lacks a deep central cusp.]

Figure 2.3: The orbit of a star in a flattened (oblate) potential. An example star has been put into an orbit inclined to the x−y plane. The galaxy is flattened in the z direction with an axis ratio of 0.7. The orbit follows a “rosette” pattern, but the plane of the orbit precesses. This illustrates the trajectory of a star in an oblate elliptical galaxy, for example.

2.14.2 The chaotic nature of many orbits

In chaotic systems, stars that initially move along similar paths will diverge, eventually moving along very different orbits. The divergence in their paths is exponential in time, which is the technical definition of chaos in dynamical systems. Their motion shows a stretching and folding in phase space. This can be so even if there is no collective motion of stars at all (f in equilibrium).


Figure 2.4: The orbit of a star in a triaxial potential. An example star has been put into an orbit inclined to the x−y plane. The galaxy has different dimensions in each of the x, y and z directions. The orbit is complex and it maps out a region of space. This illustrates the trajectory of a star in a triaxial elliptical galaxy, for example. (This simulation extends over a longer time period than those of Figures 2.2 and 2.3.)

This stretching and folding in phase space can be appreciated using an analogy. When making bread, a baker’s dough behaves essentially as a fluid. Dough is incompressible, but that does not prevent the baker stretching it in one direction and shrinking it in others, and then folding it back. So while the dough keeps much the same overall shape, particles initially nearby within it can be dispersed to widely different parts of it, through the repeated stretching and folding. The same stretching and folding operation can take place for stars in phase space. In fact it appears that phase space is typically riddled with regions where f gets stretched in one direction while being shrunk in others. Thus nearby orbits tend to diverge, and the divergence is exponential in time.

Simulations show that the timescale for divergence (the e-folding time) is T_diverge ∼ T_cross, the crossing time, and gets shorter for higher star densities.

However, in some special cases, there is no chaos. These systems are said to be integrable. If the dynamics is confined to one real-space dimension (hence two phase-space dimensions) then no stretching-and-folding can happen, and orbits are regular. So in a spherical system all orbits are regular. In addition, there are certain potentials (usually referred to as Stäckel potentials) where the dynamics decouples into three effectively one-dimensional systems; so if some equilibrium f generates a Stäckel potential, the orbits will stay chaos-free. Also, small perturbations of non-chaotic systems tend to produce only small regions of chaos,¹ and orbits may be well described through perturbation theory.

¹ If you ever come across the ‘KAM theorem’, that’s basically it.

2.14.3 Integrals of the motion

To solve the collisionless Boltzmann equation for stars in a galaxy, we need further constraints on the position and velocity. This can be done using integrals of the motion. These are simply functions of the star’s position x and velocity v that are constant along its orbit. They are useful in potentials Φ(x) that are constant over time. The distribution function f is also constant along the orbit and can be written as a function of integrals of the motion.


Examples of integrals of the motion are:

• The total energy. The energy E of a particular star in a potential is constant over time, so E(x, v) = ½ m v² + m Φ(x). Because this is dependent on the mass of the star, it is more normal to work with the energy per unit mass, which will be written as E_m here. So E_m = ½ v² + Φ is a constant.

• In an axisymmetric potential (e.g. our Galaxy), the z-component of the angular momentum, L_z, is conserved. Therefore L_z is an integral of the motion in such a potential.

• In a spherical potential, the total angular momentum L is constant. Therefore L is an integral of the motion in this potential, and the x, y and z components of L are each integrals of the motion.

An orbit is said to be regular if it has as many isolating integrals that can define the orbit unambiguously as there are spatial dimensions.
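The integrals of the motion listed above can be checked by following a single orbit numerically. The sketch below integrates one star in a Plummer potential, Φ = −G M / √(r² + a²), and compares the energy per unit mass and the angular momentum per unit mass at the start and end of the run. The units (G = M = a = 1), the starting conditions, the step size and the use of a leapfrog integrator are all assumptions made for the example.

```python
import math

G, M, A = 1.0, 1.0, 1.0   # assumed units with G = M_tot = a = 1

def acceleration(pos):
    """a = -grad(Phi) for the Plummer potential Phi = -G M / sqrt(r^2 + a^2)."""
    r2 = sum(c * c for c in pos)
    k = -G * M / (r2 + A * A) ** 1.5
    return [k * c for c in pos]

def energy(pos, vel):
    """Energy per unit mass, E_m = v^2/2 + Phi."""
    r2 = sum(c * c for c in pos)
    return 0.5 * sum(c * c for c in vel) - G * M / math.sqrt(r2 + A * A)

def angular_momentum(pos, vel):
    """Angular momentum per unit mass, L = x cross v."""
    x, y, z = pos
    vx, vy, vz = vel
    return [y * vz - z * vy, z * vx - x * vz, x * vy - y * vx]

def leapfrog(pos, vel, dt, n_steps):
    """Kick-drift-kick leapfrog integration of one star."""
    acc = acceleration(pos)
    for _ in range(n_steps):
        vel = [v + 0.5 * dt * a for v, a in zip(vel, acc)]
        pos = [x + dt * v for x, v in zip(pos, vel)]
        acc = acceleration(pos)
        vel = [v + 0.5 * dt * a for v, a in zip(vel, acc)]
    return pos, vel

pos0, vel0 = [1.0, 0.0, 0.2], [0.0, 0.5, 0.1]   # illustrative bound orbit
pos1, vel1 = leapfrog(pos0, vel0, dt=0.01, n_steps=5000)

dE = abs(energy(pos1, vel1) - energy(pos0, vel0))
dL = max(abs(a - b) for a, b in zip(angular_momentum(pos1, vel1),
                                    angular_momentum(pos0, vel0)))
print(dE, dL)
```

In this spherical potential all three components of L should be unchanged to rounding error, while the energy is conserved only to the accuracy of the integrator, which improves as the time step is reduced.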

2.14.4 Isolating integrals and integrable systems

The collisionless Boltzmann equation tells us that df/dt = 0 (Section 2.12). As was discussed earlier, if we move with a star in its orbit, f is constant locally as the star passes through phase space at that instant in time. But if the system is in a steady state (the potential is constant over time), f is constant along the star’s path at all times. This means that the orbits of stars map out constant values of f.

An integral of the motion for a star (e.g. energy per unit mass, E_m) is constant (by definition). It therefore defines a 5-dimensional hypersurface in 6-dimensional phase space. The motion of a star is confined to that 5-dimensional surface in phase space. Therefore f is constant over that hypersurface.

A different value of the isolating integral (e.g. a different value of E_m) will define a different hypersurface. In turn, f will be different on this surface. So f is a function of the isolating integral, i.e. f(x, y, z, v_x, v_y, v_z) = fn(I_1) where I_1 is an integral of the motion. I_1 here “isolates” a hypersurface. Therefore the integral of the motion is known as an isolating integral.

Integrals that fail to confine orbits are called “non-isolating” integrals. A system is integrable if we can define isolating integrals that enable the orbit to be determined.

In integrable systems there are significant simplifications. Each orbit is (i) confined to a three-dimensional toroidal subspace of six-dimensional phase space, and (ii) fills its torus evenly.² Phase space itself is filled by nested orbit-carrying tori; they have to be nested, since orbits can’t cross in phase space. Therefore the time-average of each orbit is completely specified once we have specified which torus it is on; this takes three numbers for each orbit, and these are called ‘isolating integrals’ (they are constants for each orbit, of course). Think of the isolating integrals as a coordinate system that parameterises orbital tori; a transformation to a different set of isolating integrals is like a coordinate transformation.

² These two statements are important results from Hamiltonian dynamical systems which we won’t try to prove here. But the statements that follow in this section are straightforward consequences of (i) and (ii).

If isolating integrals exist, then any f that depends only on them will automatically satisfy the collisionless Boltzmann equation. Conversely, since orbits fill their tori evenly, any equilibrium f cannot depend on location on the tori; it can only depend on the tori themselves, i.e., on the isolating integrals. This result is known as Jeans’ theorem.

2.14.5 The Jeans Theorem

The Jeans Theorem is an important result in stellar dynamics that states the importance of integrals of the motion in solving the collisionless Boltzmann equation for gravitational potentials that do not change with time. It was named after its discoverer, the English astronomer, physicist and mathematician Sir James Hopwood Jeans (1877–1946).

It states that any steady-state solution of the collisionless Boltzmann equation depends on the phase-space coordinates only through integrals of the motion in the galaxy’s potential, and any function of the integrals yields a steady-state solution of the collisionless Boltzmann equation.

This means that in a potential that does not change with time, we can express the collisionless Boltzmann equation in terms of integrals of motion, and then solve for the distribution function f in terms of those integrals of motion. We can then convert the solution of f in terms of the integrals to a solution for f in terms of the space and velocity coordinates. For example, if the energy per unit mass E_m and total angular momentum components L_x and L_y are constant for each star in some potential, then we can solve for f uniquely as a function of E_m, L_x and L_y. Then we can convert from E_m, L_x and L_y to give f as a function of (x, y, z, v_x, v_y, v_z).

You should be wary of Jeans’ theorem, especially when people tacitly assume it, because as we saw, it assumes that the system is integrable, which is in general not the case.

2.15 Spherical Systems

2.15.1 Solving for f in spherical galaxies

The Jeans Theorem does apply in spherical systems of stars, such as spherical elliptical galaxies. As a consequence, f can depend on (at most) three integrals of motion in a spherical system. The simplest case is for f to be a function of the energy of the stars only. (Since we are considering bound systems, f = 0 for E > 0 always.) To find an equilibrium solution, we only have to satisfy Poisson’s equation.

The total energy of a star of mass m moving with a velocity v is E = ½ m v² + m Φ, where Φ is the gravitational potential at the point where the star is situated. Here it is more convenient to use the energy per unit mass E_m = ½ v² + Φ.

A spherical galaxy can be described very simply by a spherical polar coordinate system (r, θ, φ) with the origin at the centre. Poisson’s equation relates the Laplacian of the gravitational potential Φ at a point to the local mass density ρ as ∇²Φ = 4πGρ. In a spherical polar coordinate system the Laplacian of any scalar function A(r, θ, φ) is

$$
\nabla^2 A \equiv \frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial A}{\partial r} \right) + \frac{1}{r^2 \sin\theta} \frac{\partial}{\partial \theta} \left( \sin\theta \frac{\partial A}{\partial \theta} \right) + \frac{1}{r^2 \sin^2\theta} \frac{\partial^2 A}{\partial \phi^2} \tag{2.59}
$$

(a standard result from vector calculus: see Appendix B).

In a spherically symmetric galaxy that does not change with time, the potential is a function of the radial distance r from the centre only. So ∂Φ/∂θ = 0 and ∂Φ/∂φ = 0. Therefore,

$$
\nabla^2 \Phi = \frac{1}{r^2} \frac{d}{dr} \left( r^2 \frac{d\Phi}{dr} \right) . \tag{2.60}
$$

Substituting this into the Poisson equation,

$$
\frac{1}{r^2} \frac{d}{dr} \left( r^2 \frac{d\Phi}{dr} \right) = 4 \pi G \rho . \tag{2.61}
$$

The distribution function f is related to the number density n of stars by

$$
n = \int f \, d^3v
$$


(from Equation 2.35), and in this case f is a function of energy per unit mass: f = f(E_m). We can relate this to the density ρ using ρ = m n where m is the mean mass of a star, giving,

$$
\rho = m \int f \, d^3v . \tag{2.62}
$$

This integral is over all velocities. We can convert from d³v to dv by considering a thin spherical shell in a space defined by the three velocity components, which gives d³v = 4πv² dv. So

$$
\rho = 4 \pi m \int f v^2 \, dv . \tag{2.63}
$$

Note that this integration can be performed over velocity at each and every point in the galaxy, so this ρ is ρ(r).

We must determine the limits on this integral. For any particular point in the galaxy (i.e. any value of r), the minimum possible velocity is 0, which occurs when a star moving on a radial orbit reaches its maximum distance from the centre at that point. The maximum velocity occurs when a star has the greatest possible energy (E_m = 0, which would allow a star to move out from the point to arbitrary distance). Therefore the maximum velocity is v = √(−2Φ(r)). So the integration is from velocity v = 0 to √(−2Φ(r)). So,

$$
\frac{1}{r^2} \frac{d}{dr} \left( r^2 \frac{d\Phi}{dr} \right) = (4\pi)^2 G m \int_0^{\sqrt{-2\Phi(r)}} f \, v^2 \, dv . \tag{2.64}
$$

We can convert this integral to an integral over energy per unit mass. E_m = ½ v² + Φ gives dE_m = v dv at a fixed position (and hence for a constant Φ). The maximum possible energy per unit mass is 0, while the minimum possible value at a radius r would be given by a star that is stationary at that point: E_m = Φ(r) (which is of course negative). So, at any radius r,

$$
\frac{1}{r^2} \frac{d}{dr} \left( r^2 \frac{d\Phi}{dr} \right) = (4\pi)^2 \sqrt{2} \, G m \int_{\Phi(r)}^{0} \sqrt{E_m - \Phi(r)} \; f(E_m) \, dE_m , \tag{2.65}
$$

on substituting v = √(2(E_m − Φ)).

It is usual in Equation 2.64 to take f(v) as given and to try to solve for Φ(r) and hence ρ(r); this is a nonlinear differential equation. In Equation 2.65 we would normally take Φ as given, and try to solve for f(E_m); this is a linear integral equation.

There are f(E_m) models in the literature, and you can always concoct a new one by picking some ρ(r), computing Φ(r) and then solving Equation 2.65 numerically. Note that the velocity distribution is isotropic for any f(E_m). If f depends on other integrals of motion, say angular momentum L or its z component, or both (thus f(E_m, L², L_z)), then the velocity distribution will be anisotropic, and there are many examples of these around too.

2.15.2 Example of a spherical, isotropic distribution function: the Plummer potential

As discussed earlier, the Plummer model has a gravitational potential Φ and a mass density ρ at a radial distance r from the centre that are given by

$$
\Phi(r) = - \frac{G M_{\rm tot}}{\sqrt{r^2 + a^2}} , \qquad \rho(r) = \frac{3 M_{\rm tot}}{4\pi} \frac{a^2}{(r^2 + a^2)^{5/2}} , \tag{2.29 and 2.30}
$$

where M_tot is the total mass of the galaxy and a is a constant. The distribution function for the Plummer model is related to the density by Equation 2.63. It can be shown that these Φ(r) and ρ(r) forms give a solution,

$$
f(E_m) = \frac{24 \sqrt{2}}{7 \pi^3} \frac{a^2}{G^5 M_{\rm tot}^4 \, m} \left( -E_m \right)^{7/2} . \tag{2.66}
$$


This can be verified by inserting in Equation 2.64, although this is not trivial to do. This result gives the distribution function f as a function only of the energy per unit mass E_m. To calculate f for any point (x, y, z, v_x, v_y, v_z) in phase space, we need only to calculate E_m from these coordinates and then calculate the value of f associated with that E_m.
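The consistency of the Plummer distribution function with the Plummer density can be tested numerically, which is easier than the analytical verification. The sketch below evaluates the velocity integral of Equation 2.63 with Simpson’s rule, using f(E_m) from Equation 2.66, and compares the result with the density of Equation 2.30. It assumes units in which G = M_tot = a = m = 1 and takes the density in the form (3 M_tot / 4π) a² / (r² + a²)^(5/2); the function names and the number of integration steps are illustrative choices.

```python
import math

G, M_TOT, A, M_STAR = 1.0, 1.0, 1.0, 1.0   # assumed units (all set to 1)

def phi(r):
    """Plummer potential, Eq. 2.29."""
    return -G * M_TOT / math.sqrt(r * r + A * A)

def rho_plummer(r):
    """Plummer density, Eq. 2.30."""
    return 3.0 * M_TOT * A * A / (4.0 * math.pi * (r * r + A * A) ** 2.5)

def f_plummer(e_m):
    """Distribution function of Eq. 2.66; zero for unbound energies."""
    if e_m >= 0.0:
        return 0.0
    coeff = (24.0 * math.sqrt(2.0) * A * A
             / (7.0 * math.pi**3 * G**5 * M_TOT**4 * M_STAR))
    return coeff * (-e_m) ** 3.5

def rho_from_f(r, n=2000):
    """Eq. 2.63 by Simpson's rule: rho = 4 pi m * integral of f v^2 dv from 0 to sqrt(-2 Phi)."""
    v_max = math.sqrt(-2.0 * phi(r))
    h = v_max / n
    total = 0.0
    for k in range(n + 1):
        v = k * h
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        total += weight * f_plummer(0.5 * v * v + phi(r)) * v * v
    return 4.0 * math.pi * M_STAR * total * h / 3.0

for r in (0.0, 1.0, 3.0):
    print(r, rho_from_f(r), rho_plummer(r))
```

The two columns of densities should agree closely at every radius, which is the statement that this f(E_m) reproduces the Plummer density in the Plummer potential.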

2.15.3 Example of a spherical, isotropic distribution function: the isothermal sphere

The isothermal sphere was introduced in Section 2.8.3. The density profile was given in Equation 2.34. The isothermal sphere is defined by analogy with a Maxwell-Boltzmann gas, and therefore the distribution function as a function of the energy per unit mass E_m is given by,

$$
f(E_m) = \frac{n_0}{(2\pi\sigma^2)^{3/2}} \exp\left( - \frac{E_m}{\sigma^2} \right) = \frac{n_0}{(2\pi\sigma^2)^{3/2}} \exp\left( - \frac{\tfrac{1}{2} v^2 + \Phi}{\sigma^2} \right) , \tag{2.67}
$$

where σ² is a velocity dispersion and acts in this distribution like a temperature does in a gas. n_0 is a constant. Integrating over velocities gives

$$
\begin{aligned}
n(r) = \int f \, d^3v &= \int_0^\infty f \, 4\pi v^2 \, dv = \frac{4\pi n_0}{(2\pi\sigma^2)^{3/2}} \exp\left( - \frac{\Phi}{\sigma^2} \right) \int_0^\infty v^2 \exp\left( - \frac{v^2}{2\sigma^2} \right) dv \\
&= n_0 \exp\left( - \frac{\Phi(r)}{\sigma^2} \right) ,
\end{aligned} \tag{2.68}
$$

using the standard integral ∫_0^∞ e^(−ax²) dx = √π / (2√a). Converting this to density ρ(r) using ρ = m n, where m is the mean mass of the stars, we get,

$$
\rho(r) = \rho_0 \exp\left( - \frac{\Phi(r)}{\sigma^2} \right) , \quad \text{and equivalently,} \quad \Phi(r) = - \sigma^2 \ln\left( \frac{\rho(r)}{\rho_0} \right) , \tag{2.69}
$$

where ρ_0 is a constant. Using this, Poisson’s equation (∇²Φ = 4πGρ) in a spherically symmetric potential becomes, on substituting for dΦ/dr,

$$
\frac{d}{dr} \left( r^2 \frac{d \ln \rho}{dr} \right) = - \frac{4\pi G}{\sigma^2} r^2 \rho , \tag{2.70}
$$

for which the solution is

$$
\rho(r) = \frac{\sigma^2}{2\pi G r^2} \tag{2.71}
$$

(see Equation 2.34). As already commented in Section 2.8.3, the isothermal sphere has infinite mass! (A side effect of this is that the boundary condition Φ(∞) = 0 cannot be used, which is why we needed the redundant-looking constant n_0 in Equations 2.67 and 2.68, and hence ρ_0 in Equation 2.69.) Nevertheless, it is often used as a model, with some large-r truncation assumed, for the dark haloes of disc galaxies.

The same ρ(r) can be produced by many different f, all having different velocity distributions.
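Two steps of the isothermal sphere argument lend themselves to a quick numerical check. The sketch below (i) evaluates the velocity integral that takes Equation 2.67 to Equation 2.68, which should give a normalisation factor of 1, and (ii) tests by finite differences that the density of Equation 2.71 satisfies Equation 2.70. The units (G = 1), the value of σ, the truncation of the velocity integral at 12σ and the function names are assumptions made for the example.

```python
import math

G = 1.0        # assumed units
SIGMA = 1.5    # illustrative velocity dispersion

def velocity_integral(sigma, n=4000, v_max_in_sigma=12.0):
    """Simpson estimate of 4 pi / (2 pi sigma^2)^(3/2) * integral_0^inf v^2 exp(-v^2 / 2 sigma^2) dv."""
    v_max = v_max_in_sigma * sigma
    h = v_max / n
    total = 0.0
    for k in range(n + 1):
        v = k * h
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        total += weight * v * v * math.exp(-v * v / (2.0 * sigma**2))
    return 4.0 * math.pi * (total * h / 3.0) / (2.0 * math.pi * sigma**2) ** 1.5

def rho_singular(r, sigma=SIGMA):
    """Density of Eq. 2.71."""
    return sigma**2 / (2.0 * math.pi * G * r * r)

def poisson_residual(r, sigma=SIGMA, h=1e-4):
    """Left-hand side minus right-hand side of Eq. 2.70, using central differences."""
    def r2_dlnrho_dr(x):
        dlnrho = (math.log(rho_singular(x + h, sigma))
                  - math.log(rho_singular(x - h, sigma))) / (2.0 * h)
        return x * x * dlnrho
    lhs = (r2_dlnrho_dr(r + h) - r2_dlnrho_dr(r - h)) / (2.0 * h)
    rhs = -4.0 * math.pi * G / sigma**2 * r * r * rho_singular(r, sigma)
    return lhs - rhs

norm = velocity_integral(SIGMA)
residual = poisson_residual(2.0)
print(norm, residual)
```

A normalisation close to 1 confirms that the prefactor in Equation 2.67 is chosen so that n(r) = n_0 exp(−Φ/σ²), and a residual close to zero confirms that the 1/r² density profile solves Equation 2.70.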

2.16 Observable and Measurable Quantities

The phase space distribution f is usually very difficult to measure observationally, because of the challenges of measuring the distribution of stars over space and particularly over velocity. Velocity components along the line of sight can be measured spectroscopically from a Doppler shift. However, transverse velocity components cannot be measured directly for galaxies beyond our own (or at least beyond the Local Group). As a function of seven variables (six of the phase space, plus time), the function f can be awkward to compute theoretically. It is therefore more convenient to use quantities related to f.

The number density n of stars in space can be measured observationally by counting more luminous stars for nearby galaxies, or from the observed intensity of light for more distant galaxies. Star counts combined with estimates of the distances of individual stars can provide n as a function of position within our Galaxy. For a distant galaxy, converting the intensity along the line of sight of the integrated light from large numbers of stars into number densities (a process known as deprojection) requires assumptions about the stellar populations and their three-dimensional distribution. Nevertheless, reasonable attempts can be made in many instances.

Spectroscopy provides mean velocities ⟨v_r⟩ along the line of sight through a galaxy, and the widths of absorption lines provide velocity dispersions σ_r along the line of sight. These mean velocities will be weighted according to the numbers of stars.

It is therefore much more convenient to calculate quantities involving number densities n, mean velocities and velocity dispersions from f. These quantities can then be compared with observations more directly. A series of equations called the Jeans Equations allow this to be done.

2.17 The Jeans Equations

The Jeans Equations relate number densities, mean velocities, velocity dispersions and the gravitational potential. They were first used in stellar dynamics by Sir James Jeans in 1919.

It is useful to derive equations for the quantities

$$
\begin{aligned}
n &= \int f \, d^3v , \\
n \langle v_i \rangle &= \int v_i \, f \, d^3v , \\
n \, \sigma_{ij}^2 &= \int \left( v_i - \langle v_i \rangle \right) \left( v_j - \langle v_j \rangle \right) f \, d^3v ,
\end{aligned} \tag{2.72}
$$

by taking moments of the collisionless Boltzmann equation (expressed in the Cartesian variables x_i and v_i). σ_ij is a velocity dispersion tensor: it is discussed in more detail below.
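For a finite sample of stars, the integrals in Equation 2.72 become sums over the stars found in a small volume. The sketch below computes these discrete analogues for a handful of velocities. The four velocities and the volume are made-up numbers, and the function name is hypothetical.

```python
def velocity_moments(velocities, volume):
    """Discrete analogue of Eq. 2.72 for stars found in a small spatial volume."""
    count = len(velocities)
    n = count / volume
    mean = [sum(v[i] for v in velocities) / count for i in range(3)]
    sigma2 = [[sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in velocities) / count
               for j in range(3)]
              for i in range(3)]
    return n, mean, sigma2

# Four made-up stellar velocities (km/s) in a made-up volume of 2 cubic parsecs.
sample = [(10.0, 0.0, 5.0), (-10.0, 0.0, 5.0), (0.0, 20.0, 5.0), (0.0, -20.0, 5.0)]
n, mean_v, sigma2 = velocity_moments(sample, volume=2.0)
print(n)
print(mean_v)
print(sigma2)
```

In this example the stars share a common drift in the third velocity component, which appears in the mean velocity but not in the dispersion tensor. The diagonal elements of the tensor differ between the first two components, so the made-up velocity distribution is anisotropic.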

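The collisionless Boltzmann equation gives (Equation 2.43)

$$
\frac{\partial f}{\partial t} + \sum_{i=1}^{3} \left( \frac{dx_i}{dt} \frac{\partial f}{\partial x_i} + \frac{dv_i}{dt} \frac{\partial f}{\partial v_i} \right) = 0 ,
$$

or equivalently,

$$
\frac{\partial f}{\partial t} + \sum_{i=1}^{3} v_i \frac{\partial f}{\partial x_i} - \sum_{i=1}^{3} \frac{\partial \Phi}{\partial x_i} \frac{\partial f}{\partial v_i} = 0 ,
$$

on substituting for the components of acceleration from dv/dt = −∇Φ.

To derive the first of the Jeans Equations, we shall consider the zeroth moment by integrating this equation over all velocities.

$$
\int \left( \frac{\partial f}{\partial t} + \sum_{i=1}^{3} v_i \frac{\partial f}{\partial x_i} - \sum_{i=1}^{3} \frac{\partial \Phi}{\partial x_i} \frac{\partial f}{\partial v_i} \right) d^3v = \int 0 \, . \, d^3v . \tag{2.73}
$$

$$
\therefore \quad \int \frac{\partial f}{\partial t} \, d^3v + \sum_{i=1}^{3} \int v_i \frac{\partial f}{\partial x_i} \, d^3v - \sum_{i=1}^{3} \int \frac{\partial \Phi}{\partial x_i} \frac{\partial f}{\partial v_i} \, d^3v = 0 ,
$$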

41

Page 44: ASTM002 The Galaxy Course notes 2006 (QMUL)

(with the right hand being zero because it is a definite integral). Some of these terms canbe simplified, particularly by noting the integration is performed over all velocities at eachposition and time.

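But

$$\begin{aligned}
\int \frac{\partial f}{\partial t} \, d^3v &= \frac{\partial}{\partial t} \int f \, d^3v && \text{because $t$ and the $v_i$ are independent} \\
&= \frac{\partial n}{\partial t} && \text{because } n = \int f \, d^3v ,
\end{aligned}$$

and

$$\begin{aligned}
\int v_i \frac{\partial f}{\partial x_i} \, d^3v &= \int \frac{\partial (v_i f)}{\partial x_i} \, d^3v && \text{because the $v_i$ and the $x_i$ are independent} \\
&= \frac{\partial}{\partial x_i} \int v_i \, f \, d^3v && \text{because the $x_i$ and the $v_i$ are independent} \\
&= \frac{\partial (n \langle v_i \rangle)}{\partial x_i} && \text{on substituting } n \langle v_i \rangle = \int v_i \, f \, d^3v ,
\end{aligned}$$

and

$$\begin{aligned}
\int \frac{\partial \Phi}{\partial x_i} \frac{\partial f}{\partial v_i} \, d^3v &= \frac{\partial \Phi}{\partial x_i} \int \frac{\partial f}{\partial v_i} \, d^3v && \text{because the $x_i$ and $\Phi$ are independent of the $v_i$} \\
&= \frac{\partial \Phi}{\partial x_i} \, (0) && \text{because } f \to 0 \text{ as } |v_i| \to \infty \text{ (by analogy with the divergence theorem)} \\
&= 0 .
\end{aligned}$$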

Substituting for these terms,

$$\frac{\partial n}{\partial t} + \sum_{i=1}^{3} \frac{\partial (n \langle v_i \rangle)}{\partial x_i} = 0 , \tag{2.74}$$

which is a continuity equation. This is the first of the Jeans Equations.

To derive the second of the Jeans Equations, we consider the first moment of the collisionless Boltzmann equation by multiplying by $v_i$ and integrating over all velocities. Multiplying the C.B.E. throughout by $v_i$, we obtain,

$$v_i \frac{\partial f}{\partial t} + v_i \sum_{j=1}^{3} v_j \frac{\partial f}{\partial x_j} - v_i \sum_{j=1}^{3} \frac{\partial \Phi}{\partial x_j} \frac{\partial f}{\partial v_j} = 0 , \tag{2.75}$$

where the summation is performed over an integer j because we have introduced a velocity component $v_i$. Note that the use of $v_i$ means that we are considering one particular velocity component only at this stage, i.e. one value of i from i = 1, 2, 3. Integrating this over all velocities,

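$$\int \left( v_i \frac{\partial f}{\partial t} + \sum_{j=1}^{3} v_i v_j \frac{\partial f}{\partial x_j} - \sum_{j=1}^{3} v_i \frac{\partial \Phi}{\partial x_j} \frac{\partial f}{\partial v_j} \right) d^3v = \int 0 \, d^3v . \tag{2.76}$$

$$\therefore \quad \int v_i \frac{\partial f}{\partial t} \, d^3v + \sum_{j=1}^{3} \int v_i v_j \frac{\partial f}{\partial x_j} \, d^3v - \sum_{j=1}^{3} \int v_i \frac{\partial \Phi}{\partial x_j} \frac{\partial f}{\partial v_j} \, d^3v = 0 .$$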

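But

$$\begin{aligned}
\int v_i \frac{\partial f}{\partial t} \, d^3v &= \int \frac{\partial (v_i f)}{\partial t} \, d^3v && \text{because $v_i$ and $t$ are independent} \\
&= \frac{\partial}{\partial t} \int v_i \, f \, d^3v \\
&= \frac{\partial}{\partial t} (n \langle v_i \rangle) && \text{because } n \langle v_i \rangle = \int v_i \, f \, d^3v ,
\end{aligned}$$

and

$$\begin{aligned}
\int v_i v_j \frac{\partial f}{\partial x_j} \, d^3v &= \int \frac{\partial}{\partial x_j} (v_i v_j f) \, d^3v && \text{because $v_i$ and $v_j$ are independent of $x_j$} \\
&= \frac{\partial}{\partial x_j} \int v_i v_j \, f \, d^3v && \text{because $x_j$ and the $v_i$ are independent} \\
&= \frac{\partial (n \langle v_i v_j \rangle)}{\partial x_j} && \text{on substituting } n \langle v_i v_j \rangle = \int v_i v_j \, f \, d^3v ,
\end{aligned}$$

and

$$\int v_i \frac{\partial \Phi}{\partial x_j} \frac{\partial f}{\partial v_j} \, d^3v = \frac{\partial \Phi}{\partial x_j} \int v_i \frac{\partial f}{\partial v_j} \, d^3v \qquad \text{because the $x_j$ and $\Phi$ are independent of the $v_i$.}$$

But

$$\frac{\partial (v_i f)}{\partial v_j} = v_i \frac{\partial f}{\partial v_j} + f \frac{\partial v_i}{\partial v_j} \qquad \therefore \quad v_i \frac{\partial f}{\partial v_j} = \frac{\partial (v_i f)}{\partial v_j} - f \frac{\partial v_i}{\partial v_j} ,$$

and

$$\frac{\partial v_i}{\partial v_j} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \text{, because $v_i$ and $v_j$ are independent if } i \neq j \end{cases} \qquad \therefore \quad \frac{\partial v_i}{\partial v_j} = \delta_{ij}$$

$$\therefore \quad v_i \frac{\partial f}{\partial v_j} = \frac{\partial (v_i f)}{\partial v_j} - \delta_{ij} \, f .$$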

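So

$$\begin{aligned}
\int v_i \frac{\partial \Phi}{\partial x_j} \frac{\partial f}{\partial v_j} \, d^3v &= \frac{\partial \Phi}{\partial x_j} \int \left( \frac{\partial (v_i f)}{\partial v_j} - \delta_{ij} \, f \right) d^3v \\
&= \frac{\partial \Phi}{\partial x_j} \left( \int \frac{\partial (v_i f)}{\partial v_j} \, d^3v - \delta_{ij} \int f \, d^3v \right) \\
&= \frac{\partial \Phi}{\partial x_j} \left( 0 - \delta_{ij} \, n \right) && \text{because } v_i f \to 0 \text{ as } |v_i| \to \infty \\
&= - \frac{\partial \Phi}{\partial x_j} \, \delta_{ij} \, n .
\end{aligned}$$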

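Substituting for these terms,

$$\frac{\partial (n \langle v_i \rangle)}{\partial t} + \sum_{j=1}^{3} \frac{\partial}{\partial x_j} \left( n \langle v_i v_j \rangle \right) - \sum_{j=1}^{3} \left( - \frac{\partial \Phi}{\partial x_j} \, \delta_{ij} \, n \right) = 0 .$$

The Kronecker delta picks out the j = i term of the last sum. So,

$$\frac{\partial (n \langle v_i \rangle)}{\partial t} + \sum_{j=1}^{3} \frac{\partial}{\partial x_j} \left( n \langle v_i v_j \rangle \right) = - n \frac{\partial \Phi}{\partial x_i} , \tag{2.77}$$

for each of i = 1, 2, 3. This is the second of the Jeans Equations.

We need to introduce a tensor velocity dispersion $\sigma_{ij}$ defined so that

$$n \sigma_{ij}^2 \equiv \int (v_i - \langle v_i \rangle)(v_j - \langle v_j \rangle) \, f \, d^3v \tag{2.78}$$

for i, j = 1, 2, 3 (see Equation 2.72 above). This is used to represent the spread of velocities in each direction. It is a symmetric tensor and we can choose some coordinate system in which it is diagonal (i.e. $\sigma_{11} \neq 0$, $\sigma_{22} \neq 0$, $\sigma_{33} \neq 0$, but all the other elements are zero). This is known as the velocity ellipsoid. For example, in a cylindrical coordinate system, we might use elements such as $\sigma_{RR}$, $\sigma_{\phi\phi}$ and $\sigma_{zz}$. If the velocity dispersion is isotropic, $\sigma_{11} = \sigma_{22} = \sigma_{33}$, which we might simplify by writing as $\sigma$ only.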

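Rearranging Equation 2.78 and multiplying out,

$$\begin{aligned}
\sigma_{ij}^2 &= \frac{1}{n} \int \left( v_i - \langle v_i \rangle \right) \left( v_j - \langle v_j \rangle \right) f \, d^3v \\
&= \frac{1}{n} \int \left( v_i v_j - v_i \langle v_j \rangle - \langle v_i \rangle v_j + \langle v_i \rangle \langle v_j \rangle \right) f \, d^3v \\
&= \frac{1}{n} \int v_i v_j \, f \, d^3v - \frac{1}{n} \int v_i \langle v_j \rangle \, f \, d^3v - \frac{1}{n} \int \langle v_i \rangle v_j \, f \, d^3v + \frac{1}{n} \int \langle v_i \rangle \langle v_j \rangle \, f \, d^3v \\
&= \frac{1}{n} \int v_i v_j \, f \, d^3v - \langle v_j \rangle \frac{1}{n} \int v_i \, f \, d^3v - \langle v_i \rangle \frac{1}{n} \int v_j \, f \, d^3v + \langle v_i \rangle \langle v_j \rangle \frac{1}{n} \int f \, d^3v \\
&= \langle v_i v_j \rangle - \langle v_j \rangle \langle v_i \rangle - \langle v_i \rangle \langle v_j \rangle + \langle v_i \rangle \langle v_j \rangle ,
\end{aligned}$$

where the fourth line follows because $\langle v_i \rangle$ and $\langle v_j \rangle$ do not depend on the velocities being integrated over, and the last line follows from Equation 2.72.

So,

$$\sigma_{ij}^2 = \langle v_i v_j \rangle - \langle v_i \rangle \langle v_j \rangle . \tag{2.79}$$

This can be used to find $\langle v_i v_j \rangle$ using

$$\langle v_i v_j \rangle = \sigma_{ij}^2 + \langle v_i \rangle \langle v_j \rangle .$$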
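The identity in Equation 2.79 can be checked numerically on a sample of stars. The following is a minimal sketch, not part of the original notes: the function name `velocity_moments` and the sample of 20000 stars with Gaussian velocities (and its means and spreads) are assumptions chosen purely for illustration. Averages over the sample play the role of the integrals over f.

```python
import random

def velocity_moments(velocities):
    """Return (mean, second_moment, dispersion) for a list of 3-component velocities.

    mean[i]             = <v_i>
    second_moment[i][j] = <v_i v_j>
    dispersion[i][j]    = <(v_i - <v_i>)(v_j - <v_j>)>, i.e. sigma_ij^2
    """
    n = len(velocities)
    mean = [sum(v[i] for v in velocities) / n for i in range(3)]
    second = [[sum(v[i] * v[j] for v in velocities) / n for j in range(3)]
              for i in range(3)]
    disp = [[sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in velocities) / n
             for j in range(3)] for i in range(3)]
    return mean, second, disp

# Hypothetical sample: a mean drift plus anisotropic Gaussian spreads (km/s).
rng = random.Random(1)
stars = [(rng.gauss(10.0, 35.0), rng.gauss(200.0, 25.0), rng.gauss(0.0, 18.0))
         for _ in range(20000)]

mean, second, disp = velocity_moments(stars)

# Largest difference between the two sides of Equation 2.79 over all i, j.
max_residual = max(abs(disp[i][j] - (second[i][j] - mean[i] * mean[j]))
                   for i in range(3) for j in range(3))
print("sigma_ij^2 from the definition:", [[round(x, 1) for x in row] for row in disp])
print("largest residual in Equation 2.79:", max_residual)
```

The residual should be at the level of floating-point rounding, and the off-diagonal elements should be small compared with the diagonal ones because the three components were drawn independently.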

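Substituting for $\langle v_i v_j \rangle$ into the second of the Jeans Equations (Equation 2.77),

$$\frac{\partial (n \langle v_i \rangle)}{\partial t} + \sum_{j=1}^{3} \left[ \frac{\partial}{\partial x_j} \left( n \sigma_{ij}^2 \right) + \frac{\partial}{\partial x_j} \left( n \langle v_i \rangle \langle v_j \rangle \right) \right] = - n \frac{\partial \Phi}{\partial x_i} ,$$

for each of i = 1, 2 and 3. Therefore,

$$\langle v_i \rangle \frac{\partial n}{\partial t} + n \frac{\partial \langle v_i \rangle}{\partial t} + \sum_{j=1}^{3} \frac{\partial}{\partial x_j} \left( n \sigma_{ij}^2 \right) + \sum_{j=1}^{3} \frac{\partial}{\partial x_j} \left( n \langle v_i \rangle \langle v_j \rangle \right) = - n \frac{\partial \Phi}{\partial x_i} . \tag{2.80}$$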

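We can eliminate the 1st and 4th terms using the first of the Jeans Equations (Equation 2.74). Multiplying that equation throughout by $\langle v_i \rangle$,

$$\langle v_i \rangle \frac{\partial n}{\partial t} + \langle v_i \rangle \sum_{j=1}^{3} \frac{\partial}{\partial x_j} \left( n \langle v_j \rangle \right) = 0$$

$$\therefore \quad \langle v_i \rangle \frac{\partial n}{\partial t} + \sum_{j=1}^{3} \langle v_i \rangle \frac{\partial}{\partial x_j} \left( n \langle v_j \rangle \right) = 0 . \tag{2.81}$$

But

$$\frac{\partial}{\partial x_j} \left( n \langle v_i \rangle \langle v_j \rangle \right) = \langle v_i \rangle \frac{\partial}{\partial x_j} \left( n \langle v_j \rangle \right) + n \langle v_j \rangle \frac{\partial \langle v_i \rangle}{\partial x_j} .$$

Substituting for $\langle v_i \rangle \, \partial (n \langle v_j \rangle) / \partial x_j$,

$$\langle v_i \rangle \frac{\partial n}{\partial t} + \sum_{j=1}^{3} \left( \frac{\partial}{\partial x_j} \left( n \langle v_i \rangle \langle v_j \rangle \right) - n \langle v_j \rangle \frac{\partial \langle v_i \rangle}{\partial x_j} \right) = 0$$

$$\therefore \quad \langle v_i \rangle \frac{\partial n}{\partial t} + \sum_{j=1}^{3} \frac{\partial}{\partial x_j} \left( n \langle v_i \rangle \langle v_j \rangle \right) = n \sum_{j=1}^{3} \langle v_j \rangle \frac{\partial \langle v_i \rangle}{\partial x_j} .$$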

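Substituting this into Equation 2.80, we obtain,

$$n \frac{\partial \langle v_i \rangle}{\partial t} + n \sum_{j=1}^{3} \langle v_j \rangle \frac{\partial \langle v_i \rangle}{\partial x_j} = - n \frac{\partial \Phi}{\partial x_i} - \sum_{j=1}^{3} \frac{\partial}{\partial x_j} \left( n \sigma_{ij}^2 \right) , \tag{2.82}$$

where i can be any of 1, 2 or 3. This is a third Jeans Equation.

This can also be expressed as,

$$\frac{d \langle \mathbf{v} \rangle}{dt} = - \nabla \Phi - \frac{1}{n} \nabla \cdot (n \sigma^2) , \tag{2.83}$$

where $\langle \mathbf{v} \rangle$ is the mean velocity vector, t is the time, $\Phi$ is the potential, n is the number density of stars and $\sigma^2$ represents the tensor $\sigma_{ij}^2$. Note that here d/dt is not $\partial/\partial t$, but

$$\frac{d\mathbf{v}}{dt} \left( \equiv \frac{D\mathbf{v}}{Dt} \right) = \frac{\partial \mathbf{v}}{\partial t} + \mathbf{v} \cdot \nabla \mathbf{v} , \tag{2.84}$$

which is sometimes called the convective derivative; it is also sometimes written as D/Dt to emphasise that it is not simply $\partial/\partial t$.

This is similar to the Euler equation in fluid dynamics. An ordinary fluid has

$$\frac{d \langle \mathbf{v} \rangle}{dt} = - \nabla \Phi - \frac{\nabla p}{\rho} + \text{viscous terms} , \tag{2.85}$$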

where the pressure p arises because of the high rate of molecular encounters, which also leads to the equation of state, and p is isotropic. In stellar dynamics, the stars behave like a fluid in which $n\sigma^2$ plays the role of a pressure (its divergence $\nabla \cdot (n\sigma^2)$ takes the place of the pressure gradient $\nabla p$), but it is anisotropic. Indeed, this anisotropy is the reason that it is represented by a tensor, whereas in an ordinary fluid the pressure is represented by a scalar. A related fact is that in the flow of an ordinary fluid the particle paths and streamlines coincide, whereas stellar orbits and the streamlines $\langle \mathbf{v} \rangle$ do not generally coincide.

The Jeans Equations have been represented here in terms of the number density n of stars. However, it is possible to work with the mean mass density in space $\rho$ instead of n. The Jeans Equations can be used for all stars in a galaxy, but sometimes they are used for subpopulations in our Galaxy (e.g. G dwarfs, K giants). If they are used for subpopulations, $\Phi$ remains the total gravitational potential of all matter (including dark matter), but the velocities and number densities refer to the subpopulations.

2.18 The Jeans Equations in an Axisymmetric System, e.g. the Galaxy

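Using cylindrical coordinates $(R, \phi, z)$ and assuming axisymmetry (so $\partial/\partial\phi = 0$), the second Jeans Equation is

$$\frac{\partial}{\partial t} (n \langle v_R \rangle) + \frac{\partial}{\partial R} (n \langle v_R^2 \rangle) + \frac{\partial}{\partial z} (n \langle v_R v_z \rangle) + \frac{n}{R} \left( \langle v_R^2 \rangle - \langle v_\phi^2 \rangle \right) = - n \frac{\partial \Phi}{\partial R} \qquad \text{for the $R$ direction,}$$

$$\frac{\partial}{\partial t} (n \langle v_\phi \rangle) + \frac{\partial}{\partial R} (n \langle v_R v_\phi \rangle) + \frac{\partial}{\partial z} (n \langle v_\phi v_z \rangle) + \frac{2n}{R} \langle v_R v_\phi \rangle = 0 \qquad \text{for the $\phi$ direction,}$$

$$\frac{\partial}{\partial t} (n \langle v_z \rangle) + \frac{\partial}{\partial R} (n \langle v_R v_z \rangle) + \frac{\partial}{\partial z} (n \langle v_z^2 \rangle) + \frac{n \langle v_R v_z \rangle}{R} = - n \frac{\partial \Phi}{\partial z} \qquad \text{for the $z$ direction.} \tag{2.86}$$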

In a steady state, where the potential does not change with time, we can use $\partial/\partial t = 0$. This axisymmetric form of the second Jeans Equation is useful in spiral galaxies, such as our own Galaxy, provided that we neglect any change in the potential in the $\phi$ direction (although there might be a $\phi$ dependence if the potential is deeper in the spiral arms).

Meanwhile, the first of the Jeans Equations in a cylindrical coordinate system with axial symmetry ($\partial/\partial\phi = 0$) is,

$$\frac{\partial n}{\partial t} + \frac{1}{R} \frac{\partial}{\partial R} (R \, n \langle v_R \rangle) + \frac{\partial}{\partial z} (n \langle v_z \rangle) = 0 . \tag{2.87}$$

2.19 The Jeans Equations in a Spherically Symmetric System

The second Jeans Equation in a steady-state ($\partial/\partial t = 0$), spherically symmetric ($\partial/\partial\theta = \partial/\partial\phi = 0$) galaxy in a spherical polar coordinate system $(r, \theta, \phi)$ is

$$\frac{d}{dr} (n \langle v_r^2 \rangle) + \frac{n}{r} \left[ 2 \langle v_r^2 \rangle - \langle v_\theta^2 \rangle - \langle v_\phi^2 \rangle \right] = - n \frac{d\Phi}{dr} . \tag{2.88}$$

This might be used, for example, for a spherical elliptical galaxy.

We can calculate the gradient in the potential in this spherical case very simply. Using the general result that the acceleration due to gravity is $\mathbf{g} = -\nabla\Phi$, that its radial component is $g = -GM(r)/r^2$ in a spherically symmetric system where $M(r)$ is the mass interior to the radius r, and that $\nabla\Phi$ has only the radial component $d\Phi/dr$ in a spherical system, we get $d\Phi/dr = GM(r)/r^2$.

As a simple test to see whether this really does work, let us make a crude model of our Galaxy's stellar halo. We shall assume that the halo is spherical, assume a logarithmic potential of the form $\Phi(r) = v_0^2 \ln r$ where $v_0$ is a constant, assume there are isotropic velocity components (i.e. $\langle v_r^2 \rangle = \langle v_\theta^2 \rangle = \langle v_\phi^2 \rangle = \sigma^2$, where $\sigma$ is a constant), and assume that the star number density of the halo can be approximated by $n(r) \propto r^{-l}$ where l is a constant. Equation 2.88 becomes

$$\frac{d}{dr} \left( n \sigma^2 \right) + \frac{n}{r} \, (0) = - n \frac{d\Phi}{dr} ,$$

on substituting for the velocity terms. Substituting for $\Phi(r)$ and $n(r)$ gives $-l \, \sigma^2 \, n / r = - n \, v_0^2 / r$, and cancelling $n/r$, we obtain $\sigma = v_0 / \sqrt{l}$. For the Milky Way's halo, observations show that $n \propto r^{-3.5}$ (i.e. l = 3.5), while $v_0$ as measured from gas on circular orbits is 220 km s$^{-1}$, and rotation is negligible (which is a requirement for $\langle v_\theta^2 \rangle = \sigma^2$ etc.). So we expect $\sigma \simeq 220$ km s$^{-1} / \sqrt{3.5} \simeq 120$ km s$^{-1}$. And it is.
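The arithmetic of this halo model is easy to reproduce. The sketch below is illustrative only: the function name `isotropic_halo_dispersion` is hypothetical, and the finite-difference check at radius r = 8 (arbitrary units) is an added sanity test that the quoted $\sigma$ really balances Equation 2.88 for $n \propto r^{-l}$ and $\Phi = v_0^2 \ln r$.

```python
import math

def isotropic_halo_dispersion(v0, l):
    """sigma for n proportional to r^-l in Phi = v0^2 ln r, with isotropic constant dispersion.

    From Equation 2.88: d(n sigma^2)/dr = -n dPhi/dr, which gives l sigma^2 = v0^2.
    """
    return v0 / math.sqrt(l)

V0 = 220.0      # km/s, circular speed quoted in the text
L_INDEX = 3.5   # halo density slope quoted in the text

sigma = isotropic_halo_dispersion(V0, L_INDEX)
print(f"sigma = {sigma:.1f} km/s")

def n(r, l=L_INDEX):
    return r ** (-l)              # the normalisation cancels out of the Jeans equation

# Finite-difference check that both sides of the Jeans equation agree.
r, h = 8.0, 1e-5                  # radius in arbitrary units, and a small step
lhs = (n(r + h) - n(r - h)) / (2 * h) * sigma**2   # d(n sigma^2)/dr with constant sigma
rhs = -n(r) * V0**2 / r                            # -n dPhi/dr, using dPhi/dr = v0^2 / r
print(f"relative mismatch = {abs(lhs - rhs) / abs(rhs):.2e}")
```

The result, close to 118 km s$^{-1}$, is the value rounded to 120 km s$^{-1}$ in the text.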

2.20 Example of the Use of the Jeans Equations: the Surface Mass Density of the Galactic Disc

The Jeans Equations can be applied to our Galaxy to measure the surface mass density of the Galactic disc at the solar distance from the centre using observations of the velocities along the line of sight of stars lying some distance above or below the Galactic plane. The surface mass density is the mass per unit area of the disc when viewed from a great distance. It is expressed in units of kg m$^{-2}$, or more commonly solar masses per square parsec ($M_\odot$ pc$^{-2}$). This analysis is important because it allows the quantity of dark matter in the disc to be estimated. Determining whether there is dark matter in the Galactic disc is a very important constraint on the nature of dark matter.

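The second Jeans Equation in a cylindrical coordinate system $(R, \phi, z)$ centred on the Galaxy, with z = 0 in the plane and R = 0 at the Galactic Centre, states for the z direction that

$$\frac{\partial (n \langle v_z \rangle)}{\partial t} + \frac{\partial (n \langle v_R v_z \rangle)}{\partial R} + \frac{\partial (n \langle v_z^2 \rangle)}{\partial z} + \frac{n \langle v_R v_z \rangle}{R} = - n \frac{\partial \Phi}{\partial z}$$

(Equation 2.86), where n is the star number density, $v_R$ and $v_z$ are the velocity components in the R and z directions, $\Phi(R, z, t)$ is the Galactic gravitational potential and t is time. The Galaxy is in a steady state, so neither n nor $\langle v_z \rangle$ changes with time. Therefore the first term $\partial (n \langle v_z \rangle)/\partial t = 0$.

Observations show that

$$\frac{\partial (n \langle v_R v_z \rangle)}{\partial R} \simeq 0 \qquad \text{and} \qquad \frac{n \langle v_R v_z \rangle}{R} \simeq 0 ,$$

as is to be expected because of the cancelling of positive and negative terms of the z-components of the velocity. Therefore,

$$\frac{\partial (n \langle v_z^2 \rangle)}{\partial z} = - n \frac{\partial \Phi}{\partial z} . \tag{2.89}$$

$\langle v_z^2 \rangle$ is the mean square velocity in the direction perpendicular to the Galactic plane.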

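Poisson's equation gives $\nabla^2 \Phi = 4\pi G \rho$, where $\rho$ is the mass density at a point. In cylindrical coordinates the Laplacian is

$$\nabla^2 \Phi = \frac{1}{R} \frac{\partial}{\partial R} \left( R \frac{\partial \Phi}{\partial R} \right) + \frac{1}{R^2} \frac{\partial^2 \Phi}{\partial \phi^2} + \frac{\partial^2 \Phi}{\partial z^2} .$$

If we observe stars above and below the Galactic plane, all at the same Galactocentric radius R, we can neglect the $\partial\Phi/\partial R$ and $\partial^2\Phi/\partial\phi^2$ terms.

$$\therefore \quad \frac{\partial^2 \Phi}{\partial z^2} = 4\pi G \rho ,$$

and so, using Equation 2.89 for $\partial\Phi/\partial z$,

$$\frac{\partial}{\partial z} \left( - \frac{1}{n} \frac{\partial}{\partial z} \left( n \langle v_z^2 \rangle \right) \right) = 4\pi G \rho .$$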

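Integrating perpendicular to the Galactic plane from $-z$ to $z$, the surface mass density within a distance z of the plane at a Galactocentric radius R is

$$\begin{aligned}
\Sigma(R, z) = \int_{-z}^{z} \rho \, dz' &= \int_{-z}^{z} \frac{1}{4\pi G} \frac{\partial}{\partial z'} \left( - \frac{1}{n} \frac{\partial}{\partial z'} \left( n \langle v_z^2 \rangle \right) \right) dz' \\
&= - \frac{1}{4\pi G} \left[ \frac{1}{n} \frac{\partial}{\partial z'} \left( n \langle v_z^2 \rangle \right) \right]_{z' = -z}^{z} \\
&= - \frac{1}{2\pi G n} \left. \frac{\partial}{\partial z} \left( n \langle v_z^2 \rangle \right) \right|_{z} ,
\end{aligned}$$

assuming symmetry about z = 0. Therefore the surface mass density within a distance z of the plane at the solar Galactocentric radius $R_0$ is

$$\Sigma(R_0, z) = - \frac{1}{2\pi G n} \left. \frac{\partial}{\partial z} \left( n \langle v_z^2 \rangle \right) \right|_{z} . \tag{2.90}$$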

If the star densities n can be measured as a function of height z from the plane and if the z-component of the velocities $v_z$ can be measured as spectroscopic radial velocities, we can solve for $\Sigma(R_0, z)$ as a function of z. This gives, after modelling the contribution from the dark matter halo, the mass density of the Galactic disc.
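To see how Equation 2.90 behaves, it can be evaluated for a simple model tracer population. The sketch below is an illustration under stated assumptions, not a reproduction of any published analysis: the tracer is assumed to have a constant $\langle v_z^2 \rangle = \sigma_z^2$ and a density profile $n \propto \mathrm{sech}^2(z/z_0)$, and the numbers $\sigma_z = 20$ km s$^{-1}$ and $z_0 = 300$ pc are hypothetical. For this choice Equation 2.90 has the closed form $\Sigma = \sigma_z^2 \tanh(z/z_0) / (\pi G z_0)$, which the numerical derivative is compared against.

```python
import math

G = 4.30091e-3        # gravitational constant in pc (km/s)^2 / Msun

def surface_density(z, n, vz2, h=1e-3):
    """Equation 2.90: Sigma(z) = -(1 / (2 pi G n)) d(n <v_z^2>)/dz, evaluated at height z.

    n and vz2 are callables giving the tracer number density and <v_z^2> at height z (pc).
    The derivative is taken by a central finite difference with step h (pc).
    """
    d_pressure = (n(z + h) * vz2(z + h) - n(z - h) * vz2(z - h)) / (2 * h)
    return -d_pressure / (2 * math.pi * G * n(z))

# Hypothetical tracer population (assumed values, for illustration only).
SIGMA_Z = 20.0        # km/s, constant vertical velocity dispersion
Z0 = 300.0            # pc, scale height

def n_tracer(z):
    return 1.0 / math.cosh(z / Z0) ** 2     # arbitrary normalisation

def vz2_tracer(z):
    return SIGMA_Z ** 2

def analytic_sigma(z):
    return SIGMA_Z ** 2 * math.tanh(z / Z0) / (math.pi * G * Z0)

for z in (100.0, 300.0, 1000.0):
    numeric = surface_density(z, n_tracer, vz2_tracer)
    print(f"z = {z:6.0f} pc  Sigma = {numeric:6.1f} Msun/pc^2  (analytic {analytic_sigma(z):6.1f})")
```

In this toy model the surface density rises with z and then levels off once most of the self-gravitating layer is enclosed. With real data n and $\langle v_z^2 \rangle$ come from noisy measurements, so the derivative is far less well behaved than here.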

The analysis can be performed on some subclass of stars, such as G giants or K giants. In this case the number density n of stars in space is that of the subclass. Number counts of stars towards the Galactic poles, combined with estimates of the distances to individual stars, give n. Spectroscopic observations give radial velocities (the velocity components along the line of sight) through the Doppler effect. By observing towards the Galactic poles, the radial velocities are the same as the $v_z$ components.

This analysis gives $\Sigma(R_0, z)$ as a function of z. The value increases with z as a greater proportion of the stars of the disc are included, until all the disc matter is included. $\Sigma(R_0, z)$ will still increase slowly with z beyond this as an increasing amount of mass from the dark matter halo is included. Indeed, it is necessary to determine the contribution $\Sigma_d(R_0)$ from the disc alone to the observed data. An additional complication is that in measuring $\partial (n \langle v_z^2 \rangle)/\partial z$ as a function of z, we are dealing with the differential of observed quantities. This means that the effects of observational errors can be considerable.

The first measurement of the surface density of the Galactic disc was carried out by Oort in 1932. More modern attempts were carried out in the 1980s by Bahcall and by Kuijken and Gilmore. There has been considerable debate about the interpretation of results. Early studies claimed evidence of dark matter in the Galactic disc, but more recently some consensus has developed that there is little dark matter in the disc itself, apart from the contribution from the dark matter halo that extends into the disc. A modern value is $\Sigma_d(R_0) = 50 \pm 10 \, M_\odot$ pc$^{-2}$. The absence of significant dark matter in the disc indicates that dark matter does not follow baryonic matter closely on a small scale.

2.21 N-body Simulations

An alternative approach that can be adopted to study the dynamics of stars in galaxies is to use N-body simulations. In these analyses, the system of stars is represented by a large number of particles and computer modelling is used to trace the dynamics of these particles under their mutual gravitational attractions. These simulations usually determine the positions of the test particles at each of a series of time steps, calculating the changes in their positions between each step. It is possible to add further particles to trace gas and dark matter, although the gas must be made collisional.

The individual particles in a galaxy simulation, however, do not correspond to stars. It is impossible to represent every star in a galaxy in N-body simulations. In practice, the limits on computational power allow only $\sim 10^5$ to $10^8$ particles, whereas there may be as many as $10^{12}$ stars in the galaxies being modelled. The appropriate interpretation of simulation particles is as Monte-Carlo samplers of the distribution function f.

In Section 2.6 we found that the ratio of the relaxation time to the crossing time was $T_{\rm relax}/T_{\rm cross} \sim N/(12 \ln N)$ for a system of N particles. It follows that a system that is modelled computationally by too few particles will have a relaxation time that is too short, and may experience the effects of two-body encounters. As a consequence, the particles in N-body simulations have to be made collisionless artificially. The standard way of doing this is to replace the $1/r$ gravitational potential by $(r^2 + a^2)^{-1/2}$, which amounts to smearing out the mass on the 'softening length' scale a.

Early N-body computer codes performed calculations for each time step that took a time that depended on the number of particles as $N^2$. Modern codes perform faster computations by treating distant particles differently to nearby particles. "Tree codes" combine the effects of a number of distant particles together. This increases their efficiency and the computation times scale as only $N \ln N$.
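The softened force law and the $N^2$ cost of direct summation can be illustrated with a few lines of code. This is a minimal sketch rather than a description of any particular N-body code: the function name `softened_accelerations`, the choice of units with G = 1, and the two-particle test values are all assumptions made for the example.

```python
import math

def softened_accelerations(positions, masses, a, G=1.0):
    """Direct-summation accelerations for the softened potential -G m / sqrt(r^2 + a^2).

    positions: list of (x, y, z) tuples; masses: list of floats; a: softening length.
    The double loop over particle pairs is what gives the N^2 cost per time step.
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = d[0] ** 2 + d[1] ** 2 + d[2] ** 2
            inv = G / (r2 + a * a) ** 1.5      # from the gradient of (r^2 + a^2)^(-1/2)
            for k in range(3):
                acc[i][k] += masses[j] * inv * d[k]   # pull of particle j on particle i
                acc[j][k] -= masses[i] * inv * d[k]   # equal and opposite pull of i on j
    return acc

# Two unit-mass particles: far apart the force is nearly Newtonian, close together it is capped.
for sep in (10.0, 0.01):
    acc = softened_accelerations([(0.0, 0.0, 0.0), (sep, 0.0, 0.0)], [1.0, 1.0], a=0.1)
    print(f"separation {sep:5.2f}: softened {acc[0][0]:.4g}, Newtonian {1 / sep**2:.4g}")
```

At separations much larger than a the softened and Newtonian accelerations agree closely, while at separations much smaller than a the softened value stays finite. This is how softening suppresses the strong two-body encounters that would otherwise shorten the relaxation time.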

N-body simulations are widely used now to study the evolution of galaxies, and an active research area at present is to incorporate gas dynamics in them. In contrast to standard N-body methods, smoothed particle hydrodynamics (SPH) is often used to study the gas in galaxies. Modern simulations include the effects of dark matter alongside stars and gas. They can follow the collapse of clumps of dark matter in the early Universe that lead to the formation of galaxies. Simulations can also follow the growth of structure in the Universe as gravitational attraction produced the clustering of galaxies observed today. N-body simulations can model the effects of large changes in gravitational potentials, whereas analytical methods can find these more challenging.


Chapter 3

The Interstellar Medium

3.1 An Introduction to the Interstellar Medium

The interstellar medium (ISM) of a galaxy consists of the gas and dust distributed between the stars. The mass in the gas is much larger than that in dust, with the mass of dust in the disc of our Galaxy $\simeq 0.1$ of the mass of gas. Generally, the interstellar medium amounts to only a small fraction of a galaxy's luminous mass, but this fraction is strongly correlated with the galaxy's morphological type. This fraction is $\simeq 0$ % for an elliptical galaxy, 1 to 25 % for a spiral (the figure varies smoothly from type Sa to Sd), and 15 to 50 % for an irregular galaxy.

The interstellar gas is very diffuse: in the plane of our Galaxy, where the Galactic gas is at its densest, the particle number density is $\simeq 10^3$ to $10^9$ atomic nuclei m$^{-3}$. Some of this gas is in the form of single neutral atoms, some is in the form of simple molecules, some exists as ions. Whether gas is found as atoms, molecules or ions depends on its temperature, density and the presence of radiation fields, primarily the presence of ultraviolet radiation from nearby stars. Note that the density of the gas is usually expressed as the number of atoms, ions and molecules per unit volume; here they will be expressed in S.I. units of m$^{-3}$, but textbooks, reviews and research papers still often use cm$^{-3}$ (with 1 cm$^{-3}$ $\equiv 10^6$ m$^{-3}$, of course).

The interstellar medium in a galaxy is a mixture of gas remaining from the formation of the galaxy, gas ejected by stars, and gas accreted from outside (such as infalling diffuse gas or the interstellar medium of other galaxies that have been accreted). The ISM is very important to the evolution of a galaxy, primarily because it forms stars in denser regions. It is also important observationally. It enables us, for example, to observe the dynamics of the gas, such as rotation curves, because spectroscopic emission lines from the gas are prominent.

The chemical composition is about 90 % hydrogen, 9 % helium plus a trace of heavy elements (expressed by numbers of nuclei). The heavy elements in the gas can be depleted into dust grains. Dust consists preferentially of particles of heavy elements.

Individual clouds of gas and dust are given the generic term nebulae. However, the interstellar gas is found to have a diverse range of physical conditions, having very different temperatures, densities and ionisation states. There are several distinct types of nebula, as is described below.

3.2 Background to the Spectroscopy of Interstellar Gas

The gas in the interstellar medium readily emits detectable radiation and can be studied relatively easily. The gas is of very low density compared to conditions on the Earth, even compared to many vacuums created in laboratories. Therefore spectral lines are observed from the interstellar gas that are not normally observed in the laboratory. These are referred to as forbidden lines, whereas those that are readily observed under laboratory conditions are called permitted lines. Under laboratory conditions, spectral lines with low transition probabilities are 'forbidden' because the excited states get collisionally de-excited before they can radiate. In the ISM, collisional times are typically much longer than the lifetimes of those excited states that only have forbidden transitions. So forbidden lines are observable from the ISM, and in fact they can dominate the spectrum.

Astronomy uses a particular notation to denote the atoms and ions that is seldom encountered in other sciences. Atoms and ions are written with symbols such as H I, H II, He I and He II. In this notation, I denotes a neutral atom, II denotes a singly ionised positively charged ion, III denotes a doubly ionised positive ion, etc. So, H I is H$^0$, H II is H$^+$, He I is He$^0$, He II is He$^+$, He III is He$^{2+}$, Li I is Li$^0$, etc. A negatively charged ion, such as H$^-$, is indicated only as H$^-$, although few of these are encountered in astrophysics. Square brackets around the species responsible for a spectral line indicate a forbidden line, for example [O II].

3.3 Cold Gas: the 21 cm Line of Neutral Hydrogen

Cold gas emits only in the radio and the microwave region, because collisions between atoms and the radiation field (e.g. from stars) are too weak to excite the electrons to energy levels that can produce optical emission.

The most important ISM line from cold gas is the 21 cm line of atomic hydrogen (H I). It comes from the hyperfine splitting of the ground state of the hydrogen atom (split because of the coupling of the nuclear and electron spins). The energy difference between the two states is $\Delta E = 9.4 \times 10^{-25}$ J $= 5.9 \times 10^{-6}$ eV. This produces emission with a rest wavelength $\lambda_0 = hc/\Delta E = 21.1061$ cm and a rest frequency $\nu_0 = \Delta E/h = 1420.41$ MHz. In this process, hydrogen atoms are excited into the upper state through collisions (collisional excitation), but these collisions are rare in the low densities encountered in the cold interstellar medium. The transition probability is $A = 2.87 \times 10^{-15}$ s$^{-1}$. The lifetime of an excited state is $\simeq 1/A = 11$ million years. This is still much shorter than times between collisions, which allows de-excitation to occur through spontaneous emission rather than through collisional de-excitation.
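The numbers quoted for the 21 cm line are linked by $\Delta E = h\nu_0$ and $\lambda_0 = c/\nu_0$, and can be reproduced as follows. This is a small illustrative sketch: it starts from the rest frequency and transition probability quoted above, and the physical constants and the approximate length of a year are standard values supplied here, not taken from the notes.

```python
H = 6.62607015e-34        # Planck constant, J s
C = 2.99792458e8          # speed of light, m/s
EV = 1.602176634e-19      # joules per electronvolt
YEAR = 3.156e7            # seconds per year (approximate)

nu0 = 1420.41e6           # rest frequency quoted in the text, Hz
A = 2.87e-15              # transition probability quoted in the text, 1/s

delta_E = H * nu0                  # energy gap between the two hyperfine levels, J
wavelength_cm = C / nu0 * 100.0    # rest wavelength, cm
lifetime_yr = 1.0 / A / YEAR       # mean lifetime of the upper level, yr

print(f"Delta E  = {delta_E:.2e} J = {delta_E / EV:.2e} eV")
print(f"lambda_0 = {wavelength_cm:.4f} cm")
print(f"lifetime = {lifetime_yr:.2e} yr")
```

The results agree with the rounded values in the text: an energy gap of about $9.4 \times 10^{-25}$ J, a wavelength just over 21.1 cm and a lifetime of about 11 million years.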

The 21 cm spin flip transition itself cannot be observed in a laboratory because of the extremely low transition probabilities, but the split ground state shows up in the laboratory through the hyperfine splitting of the Lyman lines in the ultraviolet. In the ISM, the 21 cm line is observed primarily in emission, but can also be observed in absorption against a background radio continuum source.

H I observations have many uses. One critically important application is to measure the orbital motions of gas to determine rotation curves in our own Galaxy and in other galaxies. H I observations can map the distribution of gas in and around galaxies.

3.4 Cold Gas: Molecules

Molecular hydrogen (H$_2$) is very difficult to detect directly. It has no radio lines, which is unfortunate, since it prevents the coldest and densest parts of the ISM being observed directly. There is H$_2$ band absorption in the far ultraviolet, but this can only be observed from above the Earth's atmosphere.

What saves the situation somewhat is that other molecules do emit in the radio/microwave region. Molecular energy transitions can be due to changes in the electron energy levels, and also to changes in vibrational and rotational energies of the molecules. All three types of energy are quantised. Transitions between the electron states are in general the most energetic, and can produce spectral lines in the optical, ultraviolet and infrared. Transitions between the vibrational states can produce lines in the infrared. Transitions between the rotational states are in general the least energetic and produce lines in the radio/microwave region.

CO has strong lines at 1.3 mm and 2.6 mm from transitions between rotational states. CO is particularly useful as a tracer of H$_2$ molecules on the assumption that the densities of the two are proportional. Mapping CO density is therefore used to determine the distribution of cold gas in the ISM.

Cold, dense molecular gas is concentrated into clouds. These are relatively small in size ($\simeq 2$ to 40 pc across). Temperatures are $T \sim 10$ K and densities $\sim 10^8$ to $10^{11}$ m$^{-3}$. Molecular clouds fill only a very small volume of the ISM but contain substantial mass. Regions of gas in molecular clouds can collapse under their own gravitation to form stars. The newly formed hot stars in turn irradiate the gas with ultraviolet light, ionising and heating the gas.

Giant molecular clouds are larger regions of cold, molecular gas. Their masses are large enough (up to $10^6 \, M_\odot$) that they can have perturbing gravitational effects on stars in the disc of the Galaxy. Within our Galaxy they are mostly found in the spiral arms.

3.5 Hot Gas: HII Regions

Hot gas is readily observed in the optical region of the spectrum. Gas that is largely ionised produces emission lines from electronic transitions as ions and some atoms revert to lower energy states. The gas is therefore observed as nebulae.

An important kind of object is the H II region: a region of partially ionised hydrogen surrounding very hot young stars of O or B spectral types. Hot stars produce a large flux of ultraviolet photons, and any Lyman continuum photons (i.e., photons with wavelengths $\lambda < 912$ Å) will photoionise hydrogen producing a region of H$^+$, i.e. H II ions. These wavelengths correspond to energies > 13.6 eV, the ionisation energy from the ground state of hydrogen. The ionised hydrogen then recombines with electrons to form neutral atoms. But the hydrogen does not have to recombine into the ground state; it can recombine into an excited state and then radiatively decay after that. Electrons will cascade down energy levels, emitting photons as they do, in a process known as radiative decay.
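The correspondence between the 912 Å wavelength limit and the 13.6 eV ionisation energy follows from $E = hc/\lambda$. The short sketch below checks this; the helper name `photon_energy_ev` is hypothetical and the constants are standard values supplied for the example.

```python
H = 6.62607015e-34        # Planck constant, J s
C = 2.99792458e8          # speed of light, m/s
EV = 1.602176634e-19      # joules per electronvolt

def photon_energy_ev(wavelength_angstrom):
    """Photon energy E = h c / lambda, in eV, for a wavelength given in angstroms."""
    return H * C / (wavelength_angstrom * 1e-10) / EV

# Lyman limit, Lyman alpha and H alpha, for comparison.
for label, wavelength in (("Lyman limit", 912.0), ("Ly alpha", 1216.0), ("H alpha", 6563.0)):
    print(f"{label:12s} {wavelength:7.0f} A  ->  {photon_energy_ev(wavelength):6.2f} eV")
```

Only photons at or below the Lyman limit wavelength carry enough energy to ionise hydrogen from its ground state; optical photons such as Hα carry only about 2 eV.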

This process produces a huge variety of observable emission lines and continua in the ultraviolet, optical, infrared and radio parts of the spectrum. Free-bound transitions (involving free electrons combining with ions to become bound in atoms) produce continuum radiation. Bound-bound transitions (involving electrons inside an atom moving to a different energy level in the same atom) produce spectral lines.

[Figure: The optical spectrum of the Orion Nebula. The spectrum shows very strong emission lines from species such as H I, [O II] and [O III].]

Transitions in hydrogen atoms down to the first excited level (n = 2) produce Balmer lines, which lie in the optical. These are prominent in nebulae. Transitions down to the ground (n = 1) state produce Lyman lines in the ultraviolet. In each series, the individual lines are labelled α, β, γ, δ, ..., in order of decreasing wavelength. The transitions from n to n − 1 levels are the strongest. Therefore the α line of any series is the strongest.

Lyman series lines of hydrogen (all in the ultraviolet):
Lyα   λ = 1216 Å
Lyβ   λ = 1026 Å
Lyγ   λ = 973 Å

Balmer series lines of hydrogen (all in the optical):
Hα   λ = 6563 Å
Hβ   λ = 4861 Å
Hγ   λ = 4340 Å
Hδ   λ = 4102 Å
Hε   λ = 3970 Å
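These wavelengths follow from the Rydberg formula, $1/\lambda = R_H (1/n_{\rm lower}^2 - 1/n_{\rm upper}^2)$, which is not stated in the notes but is the standard result for hydrogen. The sketch below evaluates it; the function name `hydrogen_line_angstrom` is hypothetical and $R_H$ is the standard Rydberg constant for hydrogen. The formula gives vacuum wavelengths, so the Balmer values come out 1 to 2 Å longer than the conventional air wavelengths tabulated above.

```python
R_H = 1.09677576e7        # Rydberg constant for hydrogen, 1/m

def hydrogen_line_angstrom(n_lower, n_upper):
    """Vacuum wavelength (angstroms) of the transition n_upper -> n_lower from the Rydberg formula."""
    inverse_wavelength = R_H * (1.0 / n_lower**2 - 1.0 / n_upper**2)
    return 1e10 / inverse_wavelength

for name, n_lower in (("Lyman", 1), ("Balmer", 2)):
    # First four lines of the series (alpha, beta, gamma, delta).
    lines = [hydrogen_line_angstrom(n_lower, n_lower + k) for k in range(1, 5)]
    print(name, "lines:", [round(w) for w in lines])
    # Series limit: the upper level tends to infinity.
    print(name, "limit:", round(hydrogen_line_angstrom(n_lower, float("inf"))))
```

The Lyman series limit computed this way is the 912 Å threshold for photoionisation from the ground state.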

Atoms in H II regions can also be collisionally excited. Atomic hydrogen has no levels accessible at collision energies characteristic of H II regions ($T \sim 10^4$ K) but N II, O II, S II, O III, Ne III all do. The [O III] lines at 4959 Å and 5007 Å are particularly prominent. Some of the most prominent lines in the optical spectra of H II regions, other than the hydrogen lines listed above, are:

[O II]    3727 Å
[Ne III]  3869 Å
[O III]   4959 Å
[O III]   5007 Å
He I      5876 Å
[N II]    6548 Å
[N II]    6584 Å

Colour optical images of H II regions of the kind used in popular astronomy books show strong red and green colours: the red is produced mainly by the Hα line, while the green is produced by [O III] and Hβ. H II regions are seen prominently in images of spiral and irregular galaxies. Their emission lines dominate the spectra of late-type galaxies and are valuable for use in measuring redshifts.

The photoionisation and recombination process in H II regions and planetary nebulae produces, by a convenient accident, one Balmer photon for each Lyman continuum photon from the hot star, so the ultraviolet flux from the star can be measured by observing an optical spectrum of the H II region surrounding the hot star. The reason is basically that the gas is opaque to Lyman photons and transparent to other photons, since almost all the H atoms are in the ground state. A Lyman continuum photon initially from the star will get absorbed by a hydrogen atom, producing a free electron. This electron will then be captured into some bound state. If it gets captured to the ground state we are back where we started (with a ground state atom and a Lyman continuum photon), so consider the case where the electron is captured to some n > 1 state. Such a capture releases a free-bound continuum photon which then escapes, and leaves an excited state which wants to decay to n = 1. If it decays to n = 1 bypassing n = 2, it will just produce a Lyman photon which will almost certainly get absorbed again. Only if it decays to some n > 1 will a photon escape. In other words, if the decay bypasses n = 2 it almost always gets another chance to decay to n = 2 and produce a Balmer photon that escapes. The Lyα photons produced by the final decay from n = 2 to n = 1 random-walk through the gas as they get absorbed and re-emitted again and again. The total Balmer photon flux thus equals the Lyman continuum photon flux. One can then place the source star in an optical-ultraviolet colour-magnitude diagram, and determine a colour temperature which is called the Zanstra temperature in this context.

H II regions and planetary nebulae also produce thermal continuum radiation. The process that produces this is free-free emission: free electrons in the H II can interact with protons without recombination, and the acceleration of the electrons in this process produces radiation. (Electrons can interact with other electrons in similar fashion as well, but this produces no radiation because the net electric dipole moment does not change.) The resulting spectrum is not blackbody because the gas is transparent to free-free photons: there is no redistribution of the energy of the free-free photons. In fact the spectrum is quite flat at radio frequencies; this is the same thing as saying that the time scale for free-free encounters is $\ll 1/\nu$ for radio frequency $\nu$.

Detailed analyses of the relative strengths of the emission lines from H II regions can provide measurements of the temperatures, densities and chemical composition of the interstellar gas. This is possible for nebulae in our Galaxy and for emission lines in other galaxies. Measuring the chemical compositions of gas in galaxies is important in understanding how the abundances of chemical elements vary from one place in the Universe to another.

3.6 Hot Gas: Planetary Nebulae

A planetary nebula is like a compact H II region, except that it surrounds the exposed core of a hot, highly evolved star rather than a hot young star. The gas is ejected from the star through mass loss over time. Ultraviolet photons from the star ionise the gas in a manner similar to H II regions, and the gas emits photons like an H II region. Emission processes are similar to H II regions, but the density, temperature and ionisation state of the gas in a planetary nebula can be somewhat different to an H II region.

Planetary nebulae are relatively luminous and have prominent emission lines. As such they can be observed in other galaxies and can be detected at greater distances than many individual ordinary stars. They can be used to trace the distribution and kinematics of stars in other galaxies.

3.7 Hot Gas: Supernova Remnants

Supernovae eject material at very high velocities into the interstellar medium. This gas shocks, heats and disrupts the ISM. Low density components of the ISM can be significantly affected, but dense molecular clouds are less strongly affected. Hot gas from supernovae can even be ejected out of the Galactic disc into the halo of the Galaxy. Supernova remnants have strong line emission. They expand into and mix with the ISM.


3.8 Hot Gas: Masers

In the highest density H II regions ($\sim 10^{14}\,{\rm m^{-3}}$), either very near a young star, or in a planetary-nebula-like system near an evolved star, population inversion between certain states becomes possible. The overpopulated excited state then decays by stimulated emission, i.e., it becomes a maser. An artificial maser or laser uses a cavity with reflecting walls to mimic an enormous system, but in an astrophysical maser the enormous system is available for free; so an astrophysical maser is not directed perpendicular to some mirrors but shines in all directions. But as in an artificial maser, the emission is coherent (hence polarised), with very narrow lines and high intensity. Masers from OH and H$_2$O are known. Their high intensity and relatively small size make masers very useful as kinematic tracers.

3.9 Hot Gas: Synchrotron Radiation

Finally, we’ll just briefly mention synchrotron radiation. It is a broad-band non-thermal radiation emitted by electrons gyrating relativistically in a magnetic field, and can be observed in both optical and radio. The photons are emitted in the instantaneous direction of electron motion and polarised perpendicular to the magnetic field. The really spectacular sources of synchrotron emission are systems with jets (young stellar objects with bipolar outflows, or active galactic nuclei). It is synchrotron emission that lights up the great lobes of radio galaxies.

3.10 Absorption-Line Spectra from the Interstellar Gas

If interstellar gas is seen in front of a continuum background light source, light from the source is found to be absorbed at particular wavelengths. A number of interstellar lines and molecular bands are seen in absorption. This process requires a relatively bright background source in practice.

The molecular absorption can be very complex. Electron transitions in the molecules do not produce single strong spectral lines, but a set of bands. This is because of the effect of the various vibrational and rotational energy states of the molecules in the gas, combined with the stronger electronic transitions.

Some of the interstellar absorption features are not well understood. One particular problem is the diffuse interstellar bands in the infrared. These are probably caused by carbon molecules, possibly by polycyclic aromatic hydrocarbons (PAHs).

An interesting example of the importance of interstellar absorption lines concerns measurements of the temperatures of cold interstellar CN molecules. Like most heteronuclear molecules, CN has rotational modes which produce radio lines. The radio lines can be observed directly, but more interesting are the optical lines that have been split because of these rotational modes. Observations of cold CN against background stars reveal, through the relative widths of the split optical lines, the relative populations of the rotational modes, and hence the temperature of the CN. The temperature turns out to be 2.7 K, i.e., these cold clouds are in thermal equilibrium with the microwave background. The temperature of interstellar space was first estimated to be $\simeq 3$ K in 1941, well before the Big Bang predictions of 1948 and later, but nobody made the connection at the time.
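The link between level populations and temperature in the CN measurement can be illustrated with the Boltzmann distribution. The following is a minimal sketch only: the transition frequency (about 113.5 GHz for the lowest CN rotational transition) and the statistical weights $2J+1$ (fine and hyperfine structure ignored) are assumptions that are not given in these notes, and the function names are purely illustrative.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s
K_BOLTZ = 1.380649e-23      # Boltzmann constant, J / K

# Assumed value, not from the notes: approximate frequency of the lowest
# rotational transition of CN (a wavelength of roughly 2.6 mm).
NU_CN_10 = 113.5e9  # Hz


def population_ratio(temperature_k, nu_hz=NU_CN_10, g_upper=3, g_lower=1):
    """Boltzmann ratio n_upper / n_lower for two levels separated by h * nu.

    The default weights 3 and 1 are the (assumed) 2J + 1 values for J = 1, 0.
    """
    x = H_PLANCK * nu_hz / (K_BOLTZ * temperature_k)
    return (g_upper / g_lower) * math.exp(-x)


def excitation_temperature(ratio, nu_hz=NU_CN_10, g_upper=3, g_lower=1):
    """Invert the Boltzmann ratio to recover the temperature in kelvin."""
    return H_PLANCK * nu_hz / (K_BOLTZ * math.log(g_upper / (g_lower * ratio)))


ratio = population_ratio(2.7)
print(f"n(J=1)/n(J=0) at 2.7 K: {ratio:.2f}")
print(f"temperature recovered from that ratio: {excitation_temperature(ratio):.2f} K")
```

In practice the population ratio is what the optical absorption lines measure, and `excitation_temperature` is the step that turns it into a temperature.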

3.11 The Components of the Interstellar Medium

It is sometimes convenient to divide the diffuse gas in the interstellar medium into distinct components, also called phases:


• the cold neutral medium – consisting of neutral hydrogen (H I) and molecules at temperatures $T \sim 10 - 100$ K and relatively high densities;

• the warm neutral medium – consisting of H I but at temperatures $T \sim 10^3 - 10^4$ K and of lower densities;

• the warm ionised medium – consisting of ionised gas (H II) at temperatures $T \sim 10^4$ K and of lower densities;

• the hot ionised medium – consisting of ionised gas (H II) at very high temperatures $T \sim 10^5 - 10^6$ K but very low densities.

These phases are pressure-confined and are stable in the long term. Ionisation by supernova remnants is an important mechanism in producing the hot ionised medium. The cold neutral medium makes up a significant fraction ($\sim 50\%$) of the ISM’s mass, but occupies only a very small fraction by volume. Some other individual structures in the ISM, such as supernova remnants, planetary nebulae, H II regions and giant molecular clouds, are not included in these phases because they are not in pressure equilibrium with these phases.

3.12 Interstellar Dust

Interstellar dust consists of particles of silicates or carbon compounds. They are relatively small, but have a broad range in size. The largest are $\simeq 0.5\,\mu$m with $\sim 10^4$ atoms, but some appear to have $\lesssim 10^2$ atoms and thus are not significantly different from large molecules.

Dust has a profound observational effect – it absorbs and scatters light. Dust diminishes the light of background sources, a process known as interstellar extinction. Examples of this are dark nebulae and the zone of avoidance for galaxies at low galactic latitudes.

Consider light of wavelength $\lambda$ with a specific intensity $I_\lambda$ passing through interstellar space. (Specific intensity of light here means the energy transmitted in a direction per unit time through unit area perpendicular to that direction into unit solid angle per unit interval in wavelength.) If the light passes through an element of length in the interstellar medium, it will experience a change $dI_\lambda$ in the intensity $I_\lambda$ due to absorption and scattering by dust. This is related to the change $d\tau_\lambda$ in the optical depth $\tau_\lambda$ at the wavelength $\lambda$ that the light experiences along its journey by

$$\frac{dI_\lambda}{I_\lambda} = -\,d\tau_\lambda .$$

Integrating over the line of sight from a light source to an observer, the observed intensity is

$$I_\lambda = I_{\lambda 0}\, e^{-\tau_\lambda}$$

where $I_{\lambda 0}$ is the light intensity at the source and $\tau_\lambda$ is the total optical depth along the line of sight.

This can be related to the loss of light in magnitudes. The magnitude $m$ in some photometric band is related to the flux $F$ in that band by $m = C - 2.5 \log_{10}(F)$, where $C$ is a calibration constant. So, the change in magnitude caused by an optical depth $\tau$ in the band is $A = -2.5 \log_{10}(e^{-\tau}) = +1.086\,\tau$. The observed magnitude $m$ is related to the intrinsic magnitude $m_0$ by $m = m_0 + A$, where $A$ is the extinction in magnitudes (the intrinsic magnitude $m_0$ is the magnitude that the star would have if there were no interstellar extinction). $A$ depends on the photometric band. For example, for the $V$ (visual) band (corresponding to yellow-green colours and centred at 5500 Å), $V = V_0 + A_V$, while for the $B$ (blue) band (centred at 4400 Å), $B = B_0 + A_B$. For sight lines at the Galactic poles, $A_V \simeq 0.00$ to 0.05 mag, while at intermediate galactic latitudes, $A_V \simeq 0.05$ to 0.2 mag. However, in the Galactic plane, the extinction can be many magnitudes, and towards the Galactic Centre it is $A_V \gg 20$ mag.
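The conversion between optical depth, extinction in magnitudes and the fraction of light that survives is simple enough to sketch in a few lines. The function names below are illustrative only; the relations are the ones given above, $A = 2.5 \log_{10}(e)\,\tau$ and $I/I_0 = e^{-\tau} = 10^{-0.4A}$.

```python
import math

# 2.5 * log10(e): the factor quoted as 1.086 in the text.
MAG_PER_UNIT_TAU = 2.5 * math.log10(math.e)


def extinction_from_tau(tau):
    """Extinction A in magnitudes produced by an optical depth tau."""
    return MAG_PER_UNIT_TAU * tau


def tau_from_extinction(a_mag):
    """Optical depth corresponding to an extinction of a_mag magnitudes."""
    return a_mag / MAG_PER_UNIT_TAU


def transmitted_fraction(a_mag):
    """Fraction I / I0 of the light that reaches the observer."""
    return 10.0 ** (-0.4 * a_mag)


for a_v in (0.05, 0.2, 20.0):
    print(f"A_V = {a_v:5.2f} mag: tau = {tau_from_extinction(a_v):6.3f}, "
          f"I/I0 = {transmitted_fraction(a_v):.3g}")
```

An extinction of 20 mag corresponds to a factor of $10^{8}$ in flux, which is why the Galactic Centre cannot be studied at optical wavelengths.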

$A_\lambda$ is a strong function of wavelength and it scales as $A_\lambda \sim 1/\lambda$ (but not as strong as the $\sim 1/\lambda^4$ relation of the Rayleigh law). There is therefore much stronger absorption in the blue than in the red. This produces an interstellar reddening by dust: as light consisting of a range of wavelengths passes through the interstellar medium, it is reddened by the selective loss of short wavelength light compared to long wavelengths. Colour indices are reddened, e.g. $B-V$ is reddened so that the observed value is $(B-V) = (B-V)_0 + E_{B-V}$, where $(B-V)_0$ is the intrinsic value (what would be observed in the absence of reddening) and $E_{B-V}$ is known as the colour excess. The colour excess measures how reddened a source is. The colour excess is therefore the difference in the extinctions in the two bands, e.g. $E_{B-V} = A_B - A_V$. If the intrinsic colour can be predicted, i.e. we can predict $(B-V)_0$ (e.g. from a spectrum), it is possible to calculate $E_{B-V}$ from $E_{B-V} = (B-V) - (B-V)_0$. $E_{B-V}$ data can then be used to map the dust distribution in space. It is found from observations that the extinction in the $V$ band is $A_V \simeq 3.1\,E_{B-V}$. Extinction gets less severe for $\lambda \gtrsim 1\,\mu$m as the wavelength gets much longer than the grains. Grains are transparent to X-rays.

The interstellar extinction law. The extinction caused by dust is plotted against wavelength and extends from the ultraviolet through to the near-infrared. [Based on data from Savage & Mathis, Ann. Rev. Astron. Astrophys., 1979.]
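The reddening relations above translate directly into a small dereddening routine. This is a sketch under the stated relation $A_V \simeq 3.1\,E_{B-V}$; the star used in the example (its observed colour, intrinsic colour and magnitude) is hypothetical and chosen only to show the arithmetic.

```python
R_V = 3.1  # ratio A_V / E(B-V) quoted in the text


def colour_excess(observed_bv, intrinsic_bv):
    """E(B-V) = (B-V) - (B-V)_0."""
    return observed_bv - intrinsic_bv


def visual_extinction(e_bv, r_v=R_V):
    """A_V estimated from the colour excess."""
    return r_v * e_bv


def blue_extinction(e_bv, r_v=R_V):
    """A_B = A_V + E(B-V), from the definition E(B-V) = A_B - A_V."""
    return visual_extinction(e_bv, r_v) + e_bv


def dereddened_v(observed_v, e_bv, r_v=R_V):
    """Intrinsic magnitude V_0 = V - A_V."""
    return observed_v - visual_extinction(e_bv, r_v)


# Hypothetical star: observed B-V = 0.95, intrinsic colour 0.65, V = 10.0.
e_bv = colour_excess(0.95, 0.65)
print(f"E(B-V) = {e_bv:.2f} mag")
print(f"A_V    = {visual_extinction(e_bv):.2f} mag")
print(f"A_B    = {blue_extinction(e_bv):.2f} mag")
print(f"V_0    = {dereddened_v(10.0, e_bv):.2f} mag")
```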

The extinction becomes very strong for long sight lines through the disc of the Galaxy. The Galactic Centre is completely opaque to optical observations. There is some patchiness in the distribution of the dust. A few areas of lower dust extinction towards the bulge of our Galaxy, such as Baade’s Window, enable the stars in the bulge to be studied.

We can model the extinction caused by dust in the Galaxy to predict how the extinction in magnitudes towards distant galaxies will depend on their galactic latitude $b$. The optical depth caused by dust extinction when light of wavelength $\lambda$ travels a short distance $ds$ through the interstellar medium is given by $d\tau_\lambda = \kappa_\lambda\,\rho_d\,ds$, where $\rho_d$ is the density of dust at that point in space and $\kappa_\lambda$ is the mass extinction coefficient at that point for the wavelength $\lambda$. Observations of star numbers show that their number density declines exponentially with distance from the Galactic plane. We shall adopt a similar behaviour for the density of dust, and therefore assume that the density of dust varies with distance $z$ above the Galactic plane as $\rho_d(z) = \rho_{d0}\,e^{-|z|/h}$, where $\rho_{d0}$ and $h$ are constants.

The optical depth when travelling a distance $ds$ along the line of sight at galactic latitude $b$ is $d\tau_\lambda = \kappa_\lambda\,\rho_d(z)\,ds = \kappa_\lambda\,\rho_d(z)\,dz / \sin|b|$, where $z$ is the distance north of the Galactic plane. Integrating along the line of sight,

$$\int_0^{\tau_\lambda} d\tau'_\lambda = \int_0^\infty \frac{\kappa_\lambda\,\rho_d(z)}{\sin|b|}\,dz = \int_0^\infty \frac{\kappa_\lambda\,\rho_{d0}\,e^{-|z|/h}}{\sin|b|}\,dz = \frac{\kappa_\lambda\,\rho_{d0}}{\sin|b|} \int_0^\infty e^{-|z|/h}\,dz$$

assuming that the opacity $\kappa_\lambda$ does not vary with distance from the Galactic plane. Therefore,

$$\tau_\lambda = \frac{\kappa_\lambda\,\rho_{d0}\,h}{\sin|b|} = \kappa_\lambda\,\rho_{d0}\,h\,{\rm cosec}|b| .$$

Since the extinction in magnitudes is $A_\lambda = 1.086\,\tau_\lambda$, we get

$$A_\lambda = 1.086\,\kappa_\lambda\,\rho_{d0}\,h\,{\rm cosec}|b| .$$

Therefore the extinction in magnitudes towards an extragalactic object is predicted to vary with galactic latitude $b$ as $A_\lambda \propto {\rm cosec}|b|$ in this particular model. By coincidence, the same ${\rm cosec}|b|$ dependence of $A_\lambda$ on galactic latitude $b$ is obtained using a simplistic model in which the dust is found in a slab of uniform density centred around the Galactic plane.
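The cosec law can be checked numerically by integrating $d\tau_\lambda = \kappa_\lambda\,\rho_d(z)\,ds$ along a sight line through the exponential dust layer. The sketch below does this with a midpoint rule; the values of $\kappa_\lambda \rho_{d0}$ and $h$ are hypothetical (only their product matters for the result), and the function names are illustrative.

```python
import math


def extinction_analytic(b_deg, kappa_rho0, h):
    """A = 1.086 * kappa * rho_d0 * h * cosec|b| for the exponential dust layer."""
    return 1.086 * kappa_rho0 * h / math.sin(math.radians(abs(b_deg)))


def extinction_numeric(b_deg, kappa_rho0, h, z_max_in_h=40.0, n_steps=20_000):
    """Integrate d(tau) = kappa * rho_d(z) * ds along the sight line (midpoint rule)."""
    sin_b = math.sin(math.radians(abs(b_deg)))
    s_max = z_max_in_h * h / sin_b      # path length reaching a height of 40 h
    ds = s_max / n_steps
    tau = 0.0
    for i in range(n_steps):
        s = (i + 0.5) * ds              # midpoint of the path element
        z = s * sin_b                   # height above the plane at that point
        tau += kappa_rho0 * math.exp(-z / h) * ds
    return 1.086 * tau


# Hypothetical dust layer: kappa * rho_d0 * h = 0.1, so about 0.11 mag at the pole.
KAPPA_RHO0 = 1.0e-3   # illustrative value, per parsec
SCALE_HEIGHT = 100.0  # illustrative value, parsecs

for b in (90.0, 30.0, 10.0):
    print(f"b = {b:4.0f} deg: analytic {extinction_analytic(b, KAPPA_RHO0, SCALE_HEIGHT):.4f} mag, "
          f"numeric {extinction_numeric(b, KAPPA_RHO0, SCALE_HEIGHT):.4f} mag")
```

At $b = 30°$ the extinction is twice the polar value, since ${\rm cosec}\,30° = 2$.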

3.13 Interstellar Dust: Polarisation by Dust

However, extinction by dust does one very useful thing for optical astronomers. Dust grains are not spherical and tend to have some elongation. Spinning dust grains tend to align with their long axes perpendicular to the local magnetic field. They thus preferentially block light polarised perpendicular to the magnetic field: extinction produces polarised light. The observed polarisation will tend to be parallel to the magnetic field. Hence polarisation measurements of starlight reveal the direction of the magnetic field (or at least the sky-projection of the direction).

Dust also reflects light, with some polarisation. This is observable as reflection nebulae, where faint diffuse starlight can be seen reflected by dust.


3.14 Interstellar Dust: Radiation by Dust

Light absorbed by dust will be reradiated as a black-body spectrum (or close to a black-body spectrum). The Wien displacement law states that the maximum of the Planck function $B_\lambda$ over wavelength of a black-body at a temperature $T$ is found at a wavelength

$$\lambda_{\rm max} = \frac{2.898 \times 10^{-3}}{T}\ {\rm K\,m} .$$

This predicts that the peak of the black-body spectrum from dust at a temperature of $T = 10$ K will be at a wavelength $\lambda_{\rm max} = 290\,\mu$m, from dust at $T = 100$ K will be at a wavelength $\lambda_{\rm max} = 29\,\mu$m, and for $T = 1000$ K will be at $\lambda_{\rm max} = 2.9\,\mu$m.
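The three peak wavelengths quoted above follow directly from the Wien displacement law, as this short sketch shows (the function name is illustrative).

```python
WIEN_CONSTANT = 2.898e-3  # m K, as quoted in the text


def wien_peak_micron(temperature_k):
    """Wavelength (in micrometres) of the peak of B_lambda for a black body."""
    return WIEN_CONSTANT / temperature_k * 1.0e6


for temperature in (10.0, 100.0, 1000.0):
    print(f"T = {temperature:6.0f} K: peak at {wien_peak_micron(temperature):.3g} micron")
```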

The radiation emitted by dust will be found in the infrared, mostly in the mid-infrared, given the expected temperatures of dust. This can be observed, for example using observations from space (such as those made by the IRAS satellite), as diffuse emission superimposed on a reflected starlight spectrum. However, the associated temperature of the black-body is surprisingly high – $T \sim 10^3$ K – which is much hotter than most of the dust. The interpretation of this is that some dust grains are so small ($< 100$ atoms) that a single ultraviolet photon packs enough energy to heat them to $\sim 10^3$ K, after which these ‘stochastically heated’ grains cool again by radiating, mostly in the infrared. This process may be part of the explanation for the correlation between infrared and radio continuum luminosities of galaxies (e.g., at 0.1 mm and 6 cm), which seems to be independent of galaxy type. The idea is that ultraviolet photons from the formation of massive stars cause stochastic heating of dust grains, which then reradiate the absorbed energy to give the infrared luminosity. The supernovae resulting from the same stellar populations produce relativistic electrons which produce the radio continuum as synchrotron emission.

3.15 Star formation

Stars form by the collapse of dense regions of the interstellar medium under their own gravity. This occurs in the cores of molecular clouds, where the gas is cold ($\sim 10$ K) and densities can exceed $10^{10}$ molecules m$^{-3}$.

A region of cold gas will collapse when its gravitational self-attraction is greater than the hydrostatic pressure support. This gravitational instability is often described by the Jeans length and Jeans mass. For gas of uniform density $\rho$, the Jeans length $\lambda_J$ is the diameter of a region of the gas that is just large enough for the gravitational force to exceed the pressure support. It is given by

$$\lambda_J = c_s \sqrt{\frac{\pi}{G\rho}} , \tag{3.1}$$

where $c_s$ is the speed of sound in the gas and $G$ is the constant of gravitation. The Jeans mass is the mass of a region that has a diameter equal to the Jeans length and is therefore

$$M_J = \frac{\pi}{6}\,\rho\,\lambda_J^{\,3} . \tag{3.2}$$
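To get a feel for the numbers, Equations 3.1 and 3.2 can be evaluated for the molecular cloud core conditions mentioned above ($T \sim 10$ K, $10^{10}$ molecules m$^{-3}$). This is only a sketch: the notes do not specify the sound speed, so an isothermal sound speed $c_s = \sqrt{kT/(\mu m_{\rm H})}$ with a mean mass per particle $\mu = 2.33$ is assumed here, and the same $\mu$ is used to convert the number density into a mass density.

```python
import math

G_NEWTON = 6.674e-11     # m^3 kg^-1 s^-2
K_BOLTZ = 1.380649e-23   # J / K
M_HYDROGEN = 1.6735e-27  # kg
M_SUN = 1.989e30         # kg
PARSEC = 3.086e16        # m
MU = 2.33                # assumed mean mass per particle in units of m_H


def sound_speed(temperature_k, mu=MU):
    """Isothermal sound speed (an assumption; not specified in the notes)."""
    return math.sqrt(K_BOLTZ * temperature_k / (mu * M_HYDROGEN))


def jeans_length(c_s, rho):
    """Equation 3.1: lambda_J = c_s * sqrt(pi / (G * rho))."""
    return c_s * math.sqrt(math.pi / (G_NEWTON * rho))


def jeans_mass(c_s, rho):
    """Equation 3.2: M_J = (pi / 6) * rho * lambda_J**3."""
    return math.pi / 6.0 * rho * jeans_length(c_s, rho) ** 3


temperature = 10.0                       # K
number_density = 1.0e10                  # particles per cubic metre
rho = number_density * MU * M_HYDROGEN   # mass density, kg m^-3
c_s = sound_speed(temperature)

print(f"sound speed  : {c_s:.0f} m/s")
print(f"Jeans length : {jeans_length(c_s, rho) / PARSEC:.2f} pc")
print(f"Jeans mass   : {jeans_mass(c_s, rho) / M_SUN:.1f} solar masses")
```

Under these assumptions the Jeans length comes out at a few tenths of a parsec and the Jeans mass at a few solar masses, so such cores are natural sites for forming individual stars or small groups.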

Star formation can be self-propagating. Stars will form, heat up and ionise the cold molecular gas, with the resulting outwards flow of gas compressing gas ahead of it. This compression causes instabilities that result in local collapse to form new stars. The enhanced density in the spiral arms of our Galaxy and in other spiral galaxies means that star formation occurs preferentially in the arms.


Chapter 4

Galactic Chemical Evolution

4.1 Introduction

Chemical evolution is the term used for the changes in the abundances of the chemical elements in the Universe over time, since the earliest times to the present day. The study of these changes is an important field in astronomy, for both our Galaxy and for other galaxies. This includes the study of the elemental abundances in stars and in the interstellar medium. A fundamental objective of studies of chemical evolution is to develop a complete understanding of how the elemental abundances correlate with parameters like time, location within a galaxy, and stellar velocities.

The term ‘chemical’ here refers to the chemical elements. It does not refer to chemistry in the broader sense: the study of interactions between molecules in the Universe is a different, distinct field called astrochemistry.

4.2 Chemical Abundances

The relative abundances of the chemical elements can be measured in a number of astronomical objects, most importantly using spectroscopic techniques. The observed strengths of spectral lines depend on a variety of factors among which are the chemical abundances of the elements producing the spectral lines. Abundances can be measured in stellar photospheres from the strengths of absorption lines. The observed strengths of lines in stellar spectra depend on the abundance of the element responsible, on the effective temperature of the star, on the acceleration due to gravity at its surface and on small-scale turbulence in the atmosphere of the star. All these parameters can be solved for if there are sufficient spectroscopic observations, while the analysis becomes simpler if the temperature can be determined in advance from photometric observations. Equally, abundances can be determined from the strengths of emission lines from interstellar gas, most notably from H II regions.

It is important to define some terms and parameters that are used for the study of the abundances of chemical elements. In astronomy, for historical reasons, the term metals is used for elements other than H and He; the term heavy elements is also used for these. Under this definition, even elements such as carbon, oxygen, nitrogen and sulphur are called metals. The term metallicity is used for the fraction of heavy elements, usually expressed as a fraction by mass.

It is convenient to define the fractions by mass of hydrogen $X$, of helium $Y$, and of heavy elements $Z$. Therefore, $Z$ = (mass of heavy elements)/(total mass of all nuclei) in some object, objects or region of space. We therefore have $X + Y + Z \equiv 1$.

In some other applications, the abundances by number of nuclei are used. These are usually expressed relative to hydrogen. So $N({\rm He})/N({\rm H})$ is the ratio of the abundance of helium to hydrogen by number, and $N({\rm Fe})/N({\rm H})$ the iron-to-hydrogen ratio.


For convenience, chemical abundances in the Universe are often compared to the values in the Sun. Solar abundances give $X = 0.70$, $Y = 0.28$, $Z = 0.02$ by mass. An object, such as a star, that has a heavy element fraction significantly lower than the Sun is said to be metal poor, while one that has a larger heavy element fraction is said to be metal rich.

Abundance ratios by number are expressed relative to the Sun using a parameter [A/B], where A and B are the chemical symbols of two elements, and is defined as,

$$[{\rm A/B}] = \log_{10}\left(\frac{N({\rm A})}{N({\rm B})}\right) - \log_{10}\left(\frac{N({\rm A})}{N({\rm B})}\right)_\odot , \tag{4.1}$$

where the subscript $\odot$ represents the abundance ratio in the Sun. So the ratio of iron to hydrogen in a star relative to the Sun is written as

$$[{\rm Fe/H}] = \log_{10}\left(\frac{N({\rm Fe})}{N({\rm H})}\right) - \log_{10}\left(\frac{N({\rm Fe})}{N({\rm H})}\right)_\odot . \tag{4.2}$$

The [Fe/H] parameter for the Sun is, by definition, 0. A mildly metal-poor star in the Galaxy might have [Fe/H] $\simeq -0.3$, while a very metal-poor star in the halo of the Galaxy might have [Fe/H] $\simeq -1.5$ to $-2$. A metal-rich star in the Galaxy might have [Fe/H] $\simeq +0.3$. The interstellar gas in the Galaxy has a near-solar metallicity with [Fe/H] $\sim 0$. The iron-to-hydrogen number ratio is encountered more often in research papers than other ratios because iron produces large numbers of absorption lines in the spectra of late-type stars like our Sun, making iron abundances relatively straightforward to measure.
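Because the bracket notation is logarithmic, it helps to convert the example values above into linear ratios relative to the Sun. The sketch below implements Equation 4.1 and its inverse; the function names and the number ratios used in the first example are illustrative, not measured values.

```python
import math


def bracket(ratio_object, ratio_sun):
    """[A/B] as in Equation 4.1, from the number ratios N(A)/N(B)."""
    return math.log10(ratio_object) - math.log10(ratio_sun)


def ratio_relative_to_sun(bracket_value):
    """Linear factor (N(A)/N(B)) / (N(A)/N(B))_sun corresponding to [A/B]."""
    return 10.0 ** bracket_value


# Illustrative number ratios only: an object with one tenth of the solar ratio.
print(f"[A/B] for a ratio of 0.1 solar: {bracket(1.0e-5, 1.0e-4):+.1f}")

for fe_h in (+0.3, 0.0, -0.3, -1.5, -2.0):
    print(f"[Fe/H] = {fe_h:+.1f}  ->  Fe/H is {ratio_relative_to_sun(fe_h):.3g} times solar")
```

So [Fe/H] $= -0.3$ corresponds to about half the solar iron-to-hydrogen ratio, and [Fe/H] $= -2$ to one hundredth of it.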

4.3 The Chemical Enrichment of Galaxies

Current cosmological models show that the Big Bang produced primordial gas having a chemical composition that was 91 % H, 9 % He by number, plus a trace of $^7$Li. This corresponds to a composition by mass that was 77 % H and 23 % He: i.e. $X = 0.77$, $Y = 0.23$, $Z = 0.00$. The baryonic material produced by the Big Bang was therefore almost pure hydrogen and helium, with ten hydrogen nuclei for every one of helium. This production of the chemical elements in the Big Bang is known as primordial nucleosynthesis.

The material we find around us in the Universe today contains significant quantities of heavy elements, although these are still only minor contributors to the total mass of baryonic matter. These heavy elements have been synthesised in nuclear reactions in stars, a process known as nucleosynthesis.

These nuclear reactions in general occur in the cores of stars, producing enriched material in stellar interiors. Enriched material can be ejected into the interstellar medium in the later stages of stellar evolution, through mass loss and in supernovae. Star formation therefore produces stars which, after a time delay, eject heavy elements into the interstellar medium, including heavy elements newly synthesised in the stellar interiors. Star formation from this enriched material in turn results in stars with enhanced abundances of heavy elements. This process occurs repeatedly over time, with the continual recycling of gas, leading to a gradual increase in the metallicity of the interstellar medium with time.

Supernovae are important to chemical enrichment. They can eject large quantities of enriched material into interstellar space and can themselves generate heavy elements in nucleosynthesis. Type II supernovae are produced by massive stars ($M \gtrsim 8\,M_\odot$). They eject enriched material into the interstellar medium $\sim 10^7$ yr after formation. This material is rich in C, N and O. Type Ia supernovae are probably caused by explosive fusion reactions in binary systems and eject enriched material $\gtrsim 10^8$ yr after the initial star formation. This material is rich in iron.

The main sequence lifetime $T_{\rm MS}$ of a star is a very strong function of the star’s mass $M$. Stars with masses $M \lesssim 0.8\,M_\odot$ have $T_{\rm MS} >$ age of the Universe. So mass that goes into low mass stars is lost from the recycling process. Usefully, samples of low mass stars preserve abundances of the interstellar gas from which they formed if enriched material is not dredged up from the stellar interiors. This is true for G dwarfs for example, where the gas in the photosphere is almost unchanged in chemical composition from the gas from which they formed. So a sample of G dwarfs provides samples of chemical abundances through the history of the Galaxy. In contrast, observations of the interstellar gas in a galaxy provide information on present-day abundances and on the current state of chemical evolution.

4.4 The Simple Model of Galactic Chemical Evolution

The Simple Model of chemical evolution simulates the build up of the metallicity $Z$ in a volume of space. The Simple Model makes some simplifying assumptions:

• the volume initially contains only unenriched gas – initially there are no stars and no heavy elements;

• the volume of space where the evolution takes place is a ‘closed box’ – no gas enters or leaves the volume;

• the gas in the volume is well mixed – it has the same chemical composition throughout;

• instantaneous recycling occurs – following star formation, all newly created heavy elements that enter the ISM do so immediately;

• the fraction of newly-synthesised heavy elements ejected into the ISM after material forms stars is constant.

These assumptions, although slightly naive, allow some important predictions about the variation of the metallicity in the interstellar gas with the amount of star formation that has taken place.

Consider a volume within a galaxy, small enough to be fairly homogeneous, but large enough to contain a good sample of stars. The total mass of chemical elements in this volume at time $t$ is $M_{\rm total}$, made up of $M_{\rm stars}$ in stars and $M_{\rm gas}$ in gas. So

$$M_{\rm total} = M_{\rm stars} + M_{\rm gas} . \tag{4.3}$$

Here, $M_{\rm stars} = M_{\rm stars}(t)$ and $M_{\rm gas} = M_{\rm gas}(t)$. The assumption that the volume is a closed box means that $M_{\rm total}$ = constant at all times. Initially, at time $t = 0$, $M_{\rm stars} = 0$ (the volume contains pure gas initially).

Let $M_{\rm metals}$ be the mass of heavy elements in the gas within the volume at time $t$. Therefore the heavy element mass fraction of the gas is

$$Z \equiv \frac{M_{\rm metals}}{M_{\rm gas}} . \tag{4.4}$$

Consider a time interval from $t$ to $t + \delta t$. Star formation will occur in this time, with gas forming stars. Let the change in $M_{\rm stars}$ and $M_{\rm gas}$ in this time be $\delta M_{\rm stars}$ and $\delta M_{\rm gas}$. Some stars will eject enriched gas back into the interstellar medium (through supernovae and mass loss).

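We firstly need to express the change $\delta Z$ in the metallicity of the interstellar gas in terms of $\delta M_{\rm stars}$ and $\delta M_{\rm gas}$. From Equation 4.4, we have $Z = {\rm fn}(M_{\rm metals}, M_{\rm gas})$, so the differential of $Z$ with respect to time is

$$\frac{dZ}{dt} = \frac{\partial Z}{\partial M_{\rm metals}}\,\frac{dM_{\rm metals}}{dt} + \frac{\partial Z}{\partial M_{\rm gas}}\,\frac{dM_{\rm gas}}{dt} .$$

Differentiating Equation 4.4,

$$\frac{\partial Z}{\partial M_{\rm metals}} = \frac{1}{M_{\rm gas}} \qquad {\rm and} \qquad \frac{\partial Z}{\partial M_{\rm gas}} = -\,\frac{M_{\rm metals}}{M_{\rm gas}^2} ,$$

which gives,

$$\frac{dZ}{dt} = \frac{1}{M_{\rm gas}}\,\frac{dM_{\rm metals}}{dt} - \frac{M_{\rm metals}}{M_{\rm gas}^2}\,\frac{dM_{\rm gas}}{dt} .$$

For a small time interval $\delta t$, we have,

$$\delta Z = \frac{\delta M_{\rm metals}}{M_{\rm gas}} - \frac{M_{\rm metals}}{M_{\rm gas}^2}\,\delta M_{\rm gas} ,$$

which gives from the definition of $Z$ in Equation 4.4,

$$\delta Z = \frac{\delta M_{\rm metals}}{M_{\rm gas}} - Z\,\frac{\delta M_{\rm gas}}{M_{\rm gas}} . \tag{4.5}$$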

We need to distinguish between the total mass in stars $M_{\rm stars}$ at time $t$ and the total mass that has taken part in star formation $M_{\rm SF}$ over all periods up to time $t$. When a mass $\delta M_{\rm SF}$ goes into stars during star formation, the total mass in stars will change by an amount less than this, because material from the new stars is ejected back into the interstellar gas. So, $\delta M_{\rm SF} > \delta M_{\rm stars}$, and $M_{\rm SF} > M_{\rm stars}$.

Let $\alpha$ be the fraction of mass participating in star formation that remains locked up in long-lived stars and stellar remnants. So,

$$\delta M_{\rm stars} = \alpha\,\delta M_{\rm SF} . \tag{4.6}$$

The mass of newly synthesised heavy elements ejected back into the ISM is proportional to the mass that goes into stars (from the Simple Model assumptions listed above). Let the mass of newly synthesised heavy elements ejected into the ISM be $p\,\delta M_{\rm stars}$, where $p$ is a parameter known as the yield, with $p$ set to be a constant here.

The change $\delta M_{\rm metals}$ in the mass of heavy elements in the gas in a time $\delta t$ will be caused by the loss of heavy elements in the gas that goes into star formation, by the gain of old heavy elements that have gone into star formation and are then ejected back into the gas, and by the gain of newly synthesised heavy elements from stars that are ejected into the gas. The contribution to $\delta M_{\rm metals}$ from the loss of heavy elements in the gas going into star formation will be $-\,\delta M_{\rm SF}\,M_{\rm metals}/M_{\rm gas} = -\,Z\,\delta M_{\rm SF}$. The contribution from old heavy elements in the gas that have gone into star formation and are then ejected back unchanged into the gas will be $Z\,\delta M_{\rm SF}\,\times$ (fraction of mass going into stars that is ejected back) $= Z\,(1-\alpha)\,\delta M_{\rm SF}$. The contribution from the newly synthesised heavy elements that are ejected into the gas will be $p\,\delta M_{\rm stars}$ from the definition of the yield $p$ above.

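Therefore, in time $\delta t$,

$$\delta M_{\rm metals} = -\,Z\,\delta M_{\rm SF} + Z\,(1-\alpha)\,\delta M_{\rm SF} + p\,\delta M_{\rm stars} ,$$

which gives on expanding and cancelling,

$$\delta M_{\rm metals} = -\,Z\,\alpha\,\delta M_{\rm SF} + p\,\delta M_{\rm stars} = -\,Z\,\delta M_{\rm stars} + p\,\delta M_{\rm stars} , \tag{4.7}$$

on substituting for $\alpha\,\delta M_{\rm SF} = \delta M_{\rm stars}$. Dividing by $M_{\rm gas}$,

$$\frac{\delta M_{\rm metals}}{M_{\rm gas}} = -\,Z\,\frac{\delta M_{\rm stars}}{M_{\rm gas}} + p\,\frac{\delta M_{\rm stars}}{M_{\rm gas}} .$$

But from Equation 4.3, the changes in masses are related by $\delta M_{\rm total} = \delta M_{\rm stars} + \delta M_{\rm gas} = 0$ for a closed box. Therefore,

$$\delta M_{\rm stars} = -\,\delta M_{\rm gas} . \tag{4.8}$$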


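$$\therefore \quad \frac{\delta M_{\rm metals}}{M_{\rm gas}} = Z\,\frac{\delta M_{\rm gas}}{M_{\rm gas}} - p\,\frac{\delta M_{\rm gas}}{M_{\rm gas}} .$$

Substituting this into Equation 4.5,

$$\delta Z = Z\,\frac{\delta M_{\rm gas}}{M_{\rm gas}} - p\,\frac{\delta M_{\rm gas}}{M_{\rm gas}} - Z\,\frac{\delta M_{\rm gas}}{M_{\rm gas}}$$

$$\therefore \quad \delta Z = -\,p\,\frac{\delta M_{\rm gas}}{M_{\rm gas}} . \tag{4.9}$$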

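Converting this to a differential and integrating from time 0 to $t$,

$$\int_0^{Z(t)} dZ' = -\int_{M_{\rm gas}(0)}^{M_{\rm gas}(t)} p\,\frac{dM'_{\rm gas}}{M'_{\rm gas}}$$

$$\therefore \quad Z(t) - 0 = -\,p\,\Big[ \ln M'_{\rm gas} \Big]_{M_{\rm gas}(0)}^{M_{\rm gas}(t)} .$$

This gives,

$$Z(t) = -\,p\,\ln\left(\frac{M_{\rm gas}(t)}{M_{\rm gas}(0)}\right) . \tag{4.10}$$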

Since $M_{\rm gas}(0) = M_{\rm total}(t)$ (a constant) for all $t$ (because we have a closed box that initially contained only gas), we can rewrite this equation using the gas fraction $\mu \equiv M_{\rm gas}(t)/M_{\rm total}(t)$ as

$$Z(t) = -\,p\,\ln\mu . \tag{4.11}$$

Both $Z$ and $\mu$ can in principle be measured with appropriate observations, and this equation does not depend on time $t$ or star formation rate explicitly. This is an important prediction from the theory that is discussed more in Section 4.5, where comparisons with observations are considered.
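The derivation can be checked by stepping the closed box forward numerically using Equations 4.6 to 4.8 and comparing the result with $Z = -p \ln\mu$. The following is a sketch only: the values of $p$ and $\alpha$, and the rule that a fixed small fraction of the gas takes part in star formation at each step, are arbitrary illustrative choices (the analytic result does not depend on the star formation history).

```python
import math


def closed_box(p=0.01, alpha=0.8, sf_fraction_per_step=1.0e-4, n_steps=20_000):
    """Step the Simple Model forward; return a list of (gas fraction, Z) pairs.

    p, alpha and the star formation rule are illustrative assumptions.
    """
    m_gas, m_stars, m_metals = 1.0, 0.0, 0.0   # pure, unenriched gas at t = 0
    history = []
    for _ in range(n_steps):
        z = m_metals / m_gas                   # Equation 4.4
        d_sf = sf_fraction_per_step * m_gas    # gas taking part in star formation
        d_stars = alpha * d_sf                 # Equation 4.6
        d_metals = -z * d_stars + p * d_stars  # Equation 4.7
        m_gas -= d_stars                       # Equation 4.8 (closed box)
        m_stars += d_stars
        m_metals += d_metals
        mu = m_gas / (m_gas + m_stars)
        history.append((mu, m_metals / m_gas))
    return history


yield_p = 0.01
mu_final, z_final = closed_box(p=yield_p)[-1]
print(f"final gas fraction   : {mu_final:.4f}")
print(f"numerical Z          : {z_final:.5f}")
print(f"analytic -p ln(mu)   : {-yield_p * math.log(mu_final):.5f}")
```

Changing `alpha` or `sf_fraction_per_step` alters how quickly the gas is used up, but the points still fall on the same $Z(\mu)$ curve, which is the content of Equation 4.11.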

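We now need to consider how the mass in stars depends on metallicity. Equation 4.9 gives the change in metallicity $Z$ in time $\delta t$. But from Equation 4.3, at any time, $M_{\rm total} = M_{\rm stars} + M_{\rm gas}$, and at time 0, $M_{\rm total} = M_{\rm gas}(0)$. So, $M_{\rm gas}(t) = M_{\rm gas}(0) - M_{\rm stars}(t)$. From Equation 4.8, $\delta M_{\rm stars} = -\,\delta M_{\rm gas}$. Substituting for $M_{\rm gas}(t)$ and $\delta M_{\rm stars}$ into Equation 4.9,

$$\delta Z = p\,\frac{\delta M_{\rm stars}}{M_{\rm gas}(0) - M_{\rm stars}(t)} .$$

Integrating from time 0 to $t$,

$$\int_0^{Z} dZ' = p \int_0^{M_{\rm stars}} \frac{dM'_{\rm stars}}{M_{\rm gas}(0) - M'_{\rm stars}}$$

$$\therefore \quad Z - 0 = -\,p\,\Big[ \ln\big( M_{\rm gas}(0) - M'_{\rm stars} \big) \Big]_0^{M_{\rm stars}}$$

$$\therefore \quad Z = -\,p\,\Big[ \ln\big( M_{\rm gas}(0) - M_{\rm stars}(t) \big) - \ln\big( M_{\rm gas}(0) \big) \Big] = -\,p\,\ln\left(\frac{M_{\rm gas}(0) - M_{\rm stars}(t)}{M_{\rm gas}(0)}\right) ,$$

which rearranges to

$$\frac{M_{\rm stars}(t)}{M_{\rm gas}(0)} = 1 - e^{-Z(t)/p} . \tag{4.12}$$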

This is a prediction of how the fraction of the mass of the volume that is in stars varies with metallicity. Again we have a neat prediction of the Simple Model that involves time only through $Z(t)$ and $M_{\rm stars}(t)$. This equation does not involve the star formation rate as a function of time explicitly, which means that the predictions are much simpler to compare with observations.
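Equations 4.11 and 4.12 are two views of the same result, since the stellar mass fraction is one minus the gas fraction in a closed box. A short sketch makes this explicit; the function names are illustrative, and the values $p = 0.010$ and $Z = 0.017$ are simply those adopted for the Simple Model curves plotted later in this chapter.

```python
import math


def metallicity_from_gas_fraction(mu, p):
    """Equation 4.11: Z = -p ln(mu)."""
    return -p * math.log(mu)


def stellar_fraction_from_metallicity(z, p):
    """Equation 4.12: M_stars / M_gas(0) = 1 - exp(-Z / p)."""
    return 1.0 - math.exp(-z / p)


yield_p = 0.010
z_now = 0.017
stellar_fraction = stellar_fraction_from_metallicity(z_now, yield_p)
gas_fraction = 1.0 - stellar_fraction

print(f"stellar mass fraction implied by Z = {z_now}: {stellar_fraction:.3f}")
print(f"gas fraction                             : {gas_fraction:.3f}")
print(f"Z recovered from that gas fraction       : "
      f"{metallicity_from_gas_fraction(gas_fraction, yield_p):.4f}")
```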

Today, at time $t_1$, we have a metallicity $Z_1$ and a mass in stars $M_{\rm stars\,1}$. Therefore we have

$$\frac{M_{\rm stars\,1}}{M_{\rm gas}(0)} = 1 - e^{-Z_1/p} ,$$

at the present time. Dividing Equation 4.12 by this,

$$\frac{M_{\rm stars}(t)}{M_{\rm stars\,1}} = \frac{1 - e^{-Z(t)/p}}{1 - e^{-Z_1/p}} . \tag{4.13}$$

This is a prediction of how the mass in stars at any time varies with the metallicity.

If we observe some subsample of long-lived stars of similar mass, the number $N(Z)$ of these stars having a metallicity $Z$ and less will be related to the mass by $N(Z) \propto M_{\rm stars}(t)$. So,

$$\frac{N(Z)}{N_1} = \frac{M_{\rm stars}(t)}{M_{\rm stars\,1}} ,$$

where $N_1$ is the value of $N(Z)$ today. This then gives,

$$\frac{N(Z)}{N_1} = \frac{1 - e^{-Z(t)/p}}{1 - e^{-Z_1/p}} . \tag{4.14}$$

This gives a specific prediction of the number of stars as a function of metallicity. In practice it is often easier to work with the differential distribution $dN/dZ$, which expresses the number of stars with metallicity $Z$ against $Z$.
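Equation 4.14 and the differential distribution obtained by differentiating it with respect to $Z$ can be evaluated directly. In this sketch the function names are illustrative, and the values $p = 0.010$ and $Z_1 = 0.017$ are those adopted for the Simple Model curves in the figures of Section 4.6.

```python
import math


def cumulative_fraction(z, p, z1):
    """Equation 4.14: fraction N(Z)/N_1 of long-lived stars with metallicity <= Z."""
    return (1.0 - math.exp(-z / p)) / (1.0 - math.exp(-z1 / p))


def differential_distribution(z, p, z1):
    """d(N/N_1)/dZ, obtained by differentiating Equation 4.14 with respect to Z."""
    return math.exp(-z / p) / (p * (1.0 - math.exp(-z1 / p)))


yield_p, z_today = 0.010, 0.017

for fraction_of_z1 in (0.1, 0.25, 0.5, 1.0):
    z = fraction_of_z1 * z_today
    print(f"Z = {fraction_of_z1:4.2f} Z_1: N(Z)/N_1 = {cumulative_fraction(z, yield_p, z_today):.3f}, "
          f"dN/dZ = {differential_distribution(z, yield_p, z_today):6.1f}")
```

With these parameters the model places roughly 40 per cent of the stars below a quarter of the present-day metallicity, and the differential distribution is largest at $Z = 0$.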

4.5 Comparing the Simple Model with Observations: Abundances in the Interstellar Gas of Galaxies

Equation 4.11 provided a simple expression for the metallicity in the Simple Model:

$$Z = -\,p\,\ln\,({\rm gas\ fraction}) .$$

This was derived by considering the chemical evolution in a single closed box. However, this is a general result for any closed box and it can be compared with the observed metallicities and gas fractions in a number of galaxies, provided that the yield is the same in all places.

Magellanic irregular galaxies are found to fit this relation reasonably well, and $p$ is estimated to be $\simeq 0.0025$ from observations. In spiral galaxies, the gas fraction in the disc increases as we go outwards, and $Z$ is indeed observed to decrease, though perhaps more steeply than this crude model predicts.
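One simple way to estimate the yield from a set of galaxies is to fit a straight line through the origin to $Z$ plotted against $-\ln\mu$; the slope is then $p$. The sketch below shows the arithmetic. The sample is hypothetical and noise-free, generated from the model itself with $p = 0.0025$ purely to demonstrate that the fit recovers the input value; it is not real data, and this least-squares approach is an illustrative choice rather than the method used to obtain the value quoted above.

```python
import math


def fit_yield(gas_fractions, metallicities):
    """Least-squares slope through the origin of Z against -ln(mu)."""
    x = [-math.log(mu) for mu in gas_fractions]
    numerator = sum(xi * zi for xi, zi in zip(x, metallicities))
    denominator = sum(xi * xi for xi in x)
    return numerator / denominator


# Hypothetical sample: gas fractions with metallicities generated from Z = -p ln(mu).
sample_mu = [0.9, 0.7, 0.5, 0.3]
sample_z = [-0.0025 * math.log(mu) for mu in sample_mu]

print(f"fitted yield p = {fit_yield(sample_mu, sample_z):.4f}")
```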


4.6 Comparing the Simple Model with Observations: Stellar Abundances in the Galaxy and the G-Dwarf Problem

Equation 4.14 gives a prediction of the number of stars $N(Z)$ having a metallicity $\leq Z$ as a function of $Z$ for a sample of long-lived stars, based on the Simple Model assumptions in Section 4.4. These predictions can be compared with observations.

These comparisons require metallicity data for relatively large numbers of stars to ensure adequate statistics. Therefore, metallicity estimates are often made using photometric techniques, rather than using more precise spectroscopic measurements. These photometric estimates tend to measure the abundances of the elements that produce strong absorption in the light of the stars, rather than the overall metallicity. Therefore iron abundances [Fe/H] are generally quoted for studies of the metallicity distribution.

G- or K-type main sequence stars can conveniently be used in these studies because their lifetimes are sufficiently long that they will have survived from the earliest times to the present: these are known as G and K dwarfs. G dwarfs have generally been used because of the advantage of their greater luminosity and because the techniques of estimating metallicities have been better calibrated.

The metallicity distribution observed for metal-poor globular clusters (where the abundance of each cluster is used instead of individual stars) gives a tolerably good fit to the Simple Model prediction. However, matters are very different for the stars in the solar neighbourhood, within the disc of the Galaxy.

The figure below gives a comparison of the predicted metallicity distribution of Equation 4.14 with observations of long-lived stars in the solar neighbourhood. The Simple Model prediction is found to be very different to the observed distribution. The Simple Model predicts a far larger proportion of metal-poor stars than are actually found. This has become known as the G-dwarf problem.

The observed cumulative metallicity distribution for stars in the solar neighbourhood, compared with the Simple Model prediction for p = 0.010 and Z_1 = Z_⊙ = 0.017. [The observed distribution uses data from Kotoneva et al., M.N.R.A.S., 336, 879, 2002, for stars in the Hipparcos Catalogue.]
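The size of the discrepancy can be gauged with a few lines of Python. This is only a sketch: it assumes that Equation 4.14 has the standard closed-box form N(≤ Z)/N(≤ Z_1) = (1 − e^(−Z/p))/(1 − e^(−Z_1/p)), and it uses the values p = 0.010 and Z_1 = 0.017 from the caption above. The function name is an illustrative choice.

```python
import math

def cumulative_fraction(Z, p=0.010, Z1=0.017):
    """Predicted fraction of long-lived stars with metallicity <= Z,
    normalised to 1 at Z = Z1 (assumed standard closed-box form)."""
    return (1.0 - math.exp(-Z / p)) / (1.0 - math.exp(-Z1 / p))

# Fractions of the solar metallicity at which to evaluate the prediction
for frac in (0.1, 0.25, 0.5, 1.0):
    Z = frac * 0.017
    print(f"Z <= {frac:.2f} Z_sun: predicted fraction {cumulative_fraction(Z):.2f}")
```

Under these assumptions roughly two-fifths of the stars are predicted to have less than a quarter of the solar metallicity, far more than the observed distribution contains.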


The differential metallicity distribution, representing simply a histogram of star numbers as a function of [Fe/H], also shows the failure of the Simple Model prediction. This is shown in the figure below.

The observed differential metallicity distribution for stars in the solar neighbourhood, compared with the Simple Model prediction for p = 0.010 and Z_1 = Z_⊙ = 0.017. [The observed distribution uses data from Kotoneva et al., M.N.R.A.S., 336, 879, 2002, for stars in the Hipparcos Catalogue.]

Determining the metallicity distribution in galaxies outside our own is very difficult observationally. G dwarf stars are very faint in even the nearest Local Group galaxies.

4.7 Solutions to the G-Dwarf Problem

The G-dwarf problem indicates that the Simple Model is an oversimplification in the solar neighbourhood: one or more of the assumptions in Section 4.4 must be wrong. This is a very important result, but precisely which of the assumptions are wrong is difficult to say. A better fit to the observed data can be had by relaxing any of a number of the Simple Model assumptions. These include:

• the gas was not initially of zero metallicity;

• there has been an inflow of very metal-poor gas (this can help, but the value of the yield p must be adjusted);

• there has been a variable initial mass function (which could result in a change in the fraction α of mass that remains locked up in long-lived stars, or in a change in the yield p);

• the samples of stars used to test the Simple Model are biased against low-metallicity stars (but considerable care is taken by observers to correct for these effects).


(The initial mass function is the number N of stars as a function of star mass M immediately following star formation. There is virtually no evidence that the initial mass function varied with time.)

One possible change to the Simple Model is to allow for a loss of gas from the volume. This is plausible, because supernovae following star formation could drive gas out of the region of the galaxy being studied. One possible prescription would be to set the outflow rate to be proportional to the rate of star formation. Therefore, the loss of gas from the volume in a time δt is c δM_stars, where c is a constant. In this case the total mass M_total in the volume varies with time, unlike in the Simple Model. However, it is found that a loss of enriched gas would make the G-dwarf problem even worse: it reduces the quantity of enriched gas to make stars.

One modification to the Simple Model that can achieve a better fit between theoretical prediction and observations is to allow for the inflow of gas. This gas could be unprocessed, primordial gas. Analytic models of this type often set the inflow rate to be proportional to the star formation rate, so the inflow of gas in a time δt is c δM_stars, where c is a constant. This can produce a better fit between models and observations, provided the yield p is chosen appropriately.

4.8 Nucleosynthesis

The processes by which chemical elements are created are called nucleosynthesis. The elements produced in the hot, dense conditions in the Big Bang (hydrogen, helium and Li⁷) were created in primordial nucleosynthesis. Other isotopes and elements were produced by nucleosynthesis inside stars and in supernova explosions (including some additional helium).

The nuclear reactions responsible are complex and varied. The proton-proton chain is a series of reactions that fuse protons to form helium. In summary these reactions involve 4 H¹ → He⁴ with the additional production of positrons, neutrinos and gamma rays. The carbon-nitrogen cycle and the carbon-nitrogen-oxygen bicycle also fuse protons to form He⁴ using pre-existing C¹² in these reactions, creating N and O as (mostly) temporary byproducts which ultimately return to C¹². However, incomplete reactions in this series can leave behind some N, O and F.

Helium burning occurs in evolved red giant stars. In summary, 3 He⁴ → C¹² through an intermediate stage involving Be⁸. Reactions of this type can continue, with carbon burning and oxygen burning producing elements such as Mg and Si.

Another important process involves the capture of He nuclei by other nuclei, known as α-capture (because the He nucleus is an α particle). For example, O¹⁶ + He⁴ → Ne²⁰. Elements that are mainly in the form of isotopes consisting of multiples of α particles are known as α elements and are relatively abundant.

A number of isotopes are built up by neutron capture. This process can occur slowly inside stars over long periods of time, when it is called the s-process. Under extreme circumstances in supernova explosions, neutron capture can occur rapidly and it is called the r-process.

Some isotopes are particularly stable, while others are fragile in the high temperatures in stellar interiors (and participate readily in reactions with other nuclei, producing other isotopes). This stability/fragility affects the abundances of the elements produced during nucleosynthesis.

When the amount of an element produced by nucleosynthesis in stars does not depend on the abundances of elements in the gas that formed the stars, that element is said to be a primary element. On the other hand, if the amount of an element produced by nucleosynthesis does depend on the abundances in the gas that went into the star, the element is said to be a secondary element. For example, the amount of the isotope N¹⁴ produced in stars depends on the abundance of carbon in the gas that formed the stars: the more carbon, the more N¹⁴ can be formed as a minor byproduct of the CNO bicycle.

4.9 Element Ratios

The sections on chemical evolution above considered the changes in the overall metallicity Z. However, the abundances of individual elements provide very important additional information. The ratios of the abundances of individual elements, for example oxygen to iron or carbon to oxygen, have been determined by the details of nucleosynthesis and the enrichment of the interstellar medium over time, including how the relative quantities of these elements ejected into the interstellar gas have varied with time.

The [O/Fe] element abundance ratio plotted as a function of [Fe/H]. [Based on data from Edvardsson et al., Astron. Astrophys., 275, 101, 1993, supplemented with data from Zhang & Zhao, M.N.R.A.S., 364, 712, 2005.]

The abundances of chemical elements can be measured from the spectra of stars. Correlations are often found between abundance ratios. An important example is how the oxygen-to-iron ratio, [O/Fe], varies with the iron-to-hydrogen ratio, [Fe/H]. Stars with metallicities similar to that of the Sun have, unsurprisingly, [O/Fe] values similar to the Sun. Metal-poor stars have [O/Fe] values that are larger than the Sun, with the [O/Fe] values increasing with decreasing [Fe/H] until a near-constant value is reached. The conventional interpretation of this is that the heavy metal enrichment that produced the material that went into very metal-poor stars was caused predominantly by type II supernovae. Type II supernovae occur soon after a burst of star formation and produce large quantities of oxygen relative to iron. Therefore the material in very metal-poor stars had high values of [O/Fe]. Later, type Ia supernovae produced larger quantities of iron compared with oxygen, reducing the oxygen-to-iron ratio in the interstellar gas. Later stars were therefore less metal-poor and had [O/Fe] values closer to the Sun.


Chapter 5

Rotation Curves

5.1 Circular Velocities and Rotation Curves

The circular velocity v_circ is the velocity that a star in a galaxy must have to maintain a circular orbit at a specified distance from the centre, on the assumption that the gravitational potential is symmetric about the centre of the orbit. In the case of the disc of a spiral galaxy (which has an axisymmetric potential), the circular velocity is the orbital velocity of a star moving in a circular path in the plane of the disc. If the absolute value of the acceleration is g, for circular velocity we have g = v_circ²/R, where R is the radius of the orbit (with R a constant for the circular orbit). Therefore, ∂Φ/∂R = v_circ²/R, assuming symmetry.

The rotation curve is the function v_circ(R) for a galaxy. If v_circ(R) can be measured over a range of R, it will provide very important information about the gravitational potential. This in turn gives fundamental information about the mass distribution in the galaxy, including dark matter.

We can go further in cases of spherical symmetry. Spherical symmetry means that the gravitational acceleration at a distance R from the centre of the galaxy is simply GM(R)/R², where M(R) is the mass interior to the radius R. In this case,

\[
\frac{v_{\rm circ}^2}{R} = \frac{G M(R)}{R^2}
\quad \text{and therefore,} \quad
v_{\rm circ} = \sqrt{\frac{G M(R)}{R}} \, . \tag{5.1}
\]

If we can assume spherical symmetry, we can estimate the mass inside a radial distance R by inverting Equation 5.1 to give

\[
M(R) = \frac{v_{\rm circ}^2 R}{G} \, , \tag{5.2}
\]

and can do so as a function of radius. This is a very powerful result which is capable of telling us important information about mass distribution in galaxies, provided that we have spherical symmetry. However, we must use a more sophisticated analysis for the general case where we do not have spherical symmetry. The more general case of axisymmetry is considered in Section 5.3.

5.2 Observations

Gas and young stars in the disc of a spiral galaxy will move on nearly closed orbits, and if the underlying potential is axisymmetric these will be nearly circular. Therefore if the bulk velocity v of gas or young stars can be measured, it provides v_circ² = R ∂Φ/∂R. Old stars should be avoided: old stars have a greater velocity dispersion around their mean orbital motion and their bulk rotational velocity will be slightly smaller than the circular velocity.


Spectroscopic radial velocities can be used to determine the rotational velocities of spiral galaxies provided that the galaxies are inclined to the line of sight. The analysis is impossible for face-on spiral discs, but inclined spirals can be used readily. The circular velocity v_circ is related to the velocity v_r along the line of sight (corrected for the bulk motion of the galaxy) by v_r = v_circ cos i, where i is the inclination angle of the disc of the galaxy to the line of sight (defined so that i = 90° for a face-on disc). Placing a spectroscopic slit along the major axis of the elongated image of the disc on the sky provides the rotation curve from optical observations. Radio observations of the 21 cm line of neutral hydrogen at a number of positions on the disc of the galaxy can also provide rotation curves.

For example, in our Galaxy the circular velocity at the solar distance from the Galactic Centre is 220 km s⁻¹ (at R_0 = 8.0 kpc from the centre).
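Combined with Equation 5.2, these numbers give an estimate of the mass interior to the Sun's orbit. The Python sketch below does the arithmetic, assuming spherical symmetry (which is only approximate for the Galaxy). The function name and the value of G in astronomical units are supplied here for illustration.

```python
# Gravitational constant in units of kpc (km/s)^2 per solar mass
G = 4.30091e-6

def enclosed_mass(v_circ_kms, R_kpc):
    """Mass inside radius R from Equation 5.2, M = v_circ^2 R / G,
    assuming a spherically symmetric mass distribution."""
    return v_circ_kms**2 * R_kpc / G

M = enclosed_mass(220.0, 8.0)   # solar circle values quoted in the text
print(f"M(< 8 kpc) ~ {M:.1e} solar masses")
```

The result is of order 10¹¹ solar masses inside the solar circle.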

When people first started measuring rotation curves (c. 1970), it quickly became clear that the mass in disc galaxies does not follow the visible disc. It was found that disc galaxies generically have rotation curves that are fairly flat to as far out as they could be measured (out to several scale lengths). This is very different to the behaviour that would be expected were the visible mass (the mass of the stars and gas) the only matter in the galaxies. This is interpreted as strong evidence for the existence of dark matter in galaxies.

The simplest interpretation of a flat rotation curve is based on the assumption that the dark matter is spheroidally distributed in a ‘dark halo’. For a spherical distribution of mass, v_circ = constant implies that the enclosed mass M(r) ∝ r, and so ρ(r) ∝ 1/r².
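The step from a flat rotation curve to ρ ∝ 1/r² can be written out explicitly. The lines below use only Equation 5.2 and the definition of the enclosed mass for a spherical distribution; the symbol v_0 for the constant circular velocity is introduced here for convenience.

```latex
M(r) = \frac{v_0^2\, r}{G} , \qquad
\frac{dM}{dr} = 4\pi r^2 \rho(r)
\quad\Longrightarrow\quad
\rho(r) = \frac{v_0^2}{4\pi G\, r^2} .
```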

Rotation curves determined from optical spectra are generally limited to ≃ a few scale lengths (assuming an exponential density profile). These do provide important evidence of flat rotation curves. However, 21 cm radio observations can be followed out to significantly greater distances from the centres of spiral galaxies, using the emission from the atomic hydrogen gas. These H I observations provide powerful evidence of a constant circular velocity with radius, out to radial distances where the density of stars has declined to very low levels, providing strong evidence for the existence of extensive dark matter haloes.

As yet it is not clear exactly how far dark matter haloes extend. Neither is there a good estimate of the total mass of any disc galaxy. This is what makes disc rotation curves very important.

Figure 5.1: The spiral galaxy NGC 2841 and its H I 21 cm radio rotation curve. The figure on the left presents an optical (blue light) image of the galaxy, while that on the right gives the rotation curve in the form of the circular velocity plotted against radial distance. The optical image covers the same area of the galaxy as the radio observations: the 21 cm radio emission from the atomic hydrogen gas is detected over a much larger area than the galaxy covers in the optical image. [The optical image was created using Digitized Sky Survey II blue data from the Palomar Observatory Sky Survey. The rotation curve was plotted using data by A. Bosma (Astron. J., 86, 1791, 1981) taken from S. M. Kent (AJ, 93, 816, 1987).]

5.3 Theoretical Interpretation

However, one needs to be careful about interpreting flat rotation curves. The existence of dark matter haloes is a very important subject and caution is appropriate before accepting evidence that has profound significance to our understanding of matter in the Universe. For this reason, attempts were made to model observed rotation curves using as little mass in the dark matter haloes as possible. These ‘maximal disc models’ attempted to fit the observed data by assuming that the stars in the galactic discs had as much mass as could still be consistent with our understanding of stellar populations. They still, however, required a contribution from a dark matter halo at large radii when H I observations were taken into account.

Importantly, the maximum contribution to the rotation curve from an e^(−R/R_0) disc (where R_0 here denotes the disc scale length) is not (as we might naively expect) around R_0 but around 2.5R_0. Adding the effect of a bulge can easily give a fairly flat rotation curve to 4R_0 without a dark halo. To be confident about the dark halo, one needs to have the rotation curve for ≳ 5R_0. In practice, that means H I measurements; optical rotation curves do not go out far enough to say anything conclusive about dark haloes.

The rest of this section is a more detailed working out of the previous paragraph. It follows an elegant derivation and explanation due to A. J. Kalnajs.

Consider an axisymmetric disc galaxy. Consider the rotation curve produced by the disc matter only (at this stage we shall not consider the contribution from the bulge or from the dark matter halo). This analysis will use a cylindrical coordinate system (R, φ, z) with the disc at z = 0 and R = 0 at the centre of the galaxy. Let the surface mass density of the disc be Σ(R).

The gravitational potential in the plane of the disc at the point (R, φ, 0) is

\[
\Phi(R) = -\,G \int_0^\infty R'\,\Sigma(R')\, dR'
\int_0^{2\pi} \frac{d\phi}{\sqrt{R^2 + R'^2 - 2RR'\cos\phi}} \, , \tag{5.3}
\]

found by integrating the contribution from surface elements over the whole disc. To make this tractable, let us first define a function L(u) so that

\[
L(u) \equiv \frac{1}{2\pi} \int_0^{2\pi} \frac{d\phi}{\sqrt{1 + u^2 - 2u\cos\phi}} \, . \tag{5.4}
\]

(The function within the integral can be expanded into terms called Laplace coefficients, which are explained in many old celestial mechanics books.) This can be expanded as

\[
L(u) = 1 + \frac{u^2}{4} + \frac{9}{64}u^4 + \frac{25}{256}u^6 + \frac{1225}{16384}u^8 + O(u^{10})
\quad \text{for } u < 1 \, , \tag{5.5}
\]

either using a binomial expansion of the function in u or by expressing it as Laplace coefficients (which uses Legendre polynomials), and then integrating each term in the expansion.

The integration over φ in Equation 5.3 can be expressed in terms of L(u) as

\[
\int_0^{2\pi} \frac{d\phi}{\sqrt{R^2 + R'^2 - 2RR'\cos\phi}} =
\begin{cases}
\dfrac{2\pi}{R}\, L\!\left(\dfrac{R'}{R}\right) & \text{for } R' < R \\[2ex]
\dfrac{2\pi}{R'}\, L\!\left(\dfrac{R}{R'}\right) & \text{for } R' > R \, ,
\end{cases} \tag{5.6}
\]

because the expansion of L(u) assumed that u < 1.

Splitting the integration in Equation 5.3 into two parts (for R′ = 0 to R, and for R′ = R to ∞) and substituting for L(R′/R) and L(R/R′), we obtain,

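\[
\Phi(R) = -\,2\pi G \int_0^R \frac{R'}{R}\,\Sigma(R')\, L\!\left(\frac{R'}{R}\right) dR'
\; - \; 2\pi G \int_R^\infty \Sigma(R')\, L\!\left(\frac{R}{R'}\right) dR' \, . \tag{5.7}
\]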
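The series in Equation 5.5 can be checked numerically against the defining integral in Equation 5.4. The following Python sketch does this with a simple midpoint rule; the function names, the number of integration points and the sample values of u are illustrative choices.

```python
import math

def L_numeric(u, n=2000):
    """Evaluate L(u) = (1/2pi) * integral_0^2pi dphi / sqrt(1 + u^2 - 2u cos(phi))
    with the midpoint rule (intended for 0 <= u < 1)."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        phi = (k + 0.5) * h
        total += 1.0 / math.sqrt(1.0 + u * u - 2.0 * u * math.cos(phi))
    return total * h / (2.0 * math.pi)

def L_series(u):
    """Series of Equation 5.5, truncated after the u^8 term."""
    return (1.0 + u**2 / 4.0 + 9.0 * u**4 / 64.0
            + 25.0 * u**6 / 256.0 + 1225.0 * u**8 / 16384.0)

for u in (0.1, 0.3, 0.5):
    print(f"u = {u}: integral = {L_numeric(u):.7f}, series = {L_series(u):.7f}")
```

The two agree closely for small u, and the difference grows as u approaches 1, where the truncated series converges slowly.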

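Consider a star in a circular orbit in the disc at radius R, having a velocity v. The radial component of the acceleration is

\[
\frac{v^2}{R} = \frac{\partial\Phi}{\partial R} \, ,
\]

and hence

\[
v^2(R) = R\,\frac{\partial\Phi}{\partial R}
= -\,2\pi G R\, \frac{d}{dR} \int_0^R \frac{R'}{R}\,\Sigma(R')\, L\!\left(\frac{R'}{R}\right) dR'
\; - \; 2\pi G R\, \frac{d}{dR} \int_R^\infty \Sigma(R')\, L\!\left(\frac{R}{R'}\right) dR' \, .
\]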

These two differentials of integrals can be simplified by using a result known as Leibniz’s Integral Rule, or Leibniz’s Theorem for the differential of an integral. This states that, for a function f of two variables,

\[
\frac{d}{dc} \int_{a(c)}^{b(c)} f(x, c)\, dx
= \int_{a(c)}^{b(c)} \frac{\partial}{\partial c} f(x, c)\, dx
+ f(b, c)\, \frac{db}{dc} - f(a, c)\, \frac{da}{dc} \, . \tag{5.8}
\]

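This gives

\[
\frac{d}{dR} \int_0^R \frac{R'}{R}\,\Sigma(R')\, L\!\left(\frac{R'}{R}\right) dR'
= \int_0^R \frac{\partial}{\partial R}\!\left[ \frac{R'}{R}\, L\!\left(\frac{R'}{R}\right) \right] \Sigma(R')\, dR'
+ \Sigma(R)\, L(1)
\]

and

\[
\frac{d}{dR} \int_R^\infty \Sigma(R')\, L\!\left(\frac{R}{R'}\right) dR'
= \int_R^\infty \frac{\partial}{\partial R}\!\left[ L\!\left(\frac{R}{R'}\right) \right] \Sigma(R')\, dR'
- \Sigma(R)\, L(1) \, .
\]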

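Therefore, since the two boundary terms in Σ(R) L(1) cancel, we get

\[
v^2(R) = -\,2\pi G R \int_0^R \frac{d}{dR}\!\left( \frac{R'}{R}\, L\!\left(\frac{R'}{R}\right) \right) \Sigma(R')\, dR'
\; - \; 2\pi G R \int_R^\infty \Sigma(R')\, \frac{d}{dR} L\!\left(\frac{R}{R'}\right) dR' \, .
\]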

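But,

\[
\begin{aligned}
\frac{d}{dR}\!\left( \frac{R'}{R}\, L\!\left(\frac{R'}{R}\right) \right)
&= -\,\frac{R'}{R^2}\, L\!\left(\frac{R'}{R}\right) + \frac{R'}{R}\, \frac{d}{dR} L\!\left(\frac{R'}{R}\right) \\
&= -\,\frac{R'}{R^2}\, L\!\left(\frac{R'}{R}\right) + \frac{R'}{R}\, \frac{dL(R'/R)}{d(R'/R)}\, \frac{d(R'/R)}{dR} \\
&= -\,\frac{R'}{R^2}\, L\!\left(\frac{R'}{R}\right) - \frac{R'^2}{R^3}\, L'\!\left(\frac{R'}{R}\right) ,
\end{aligned}
\]

writing L′(u) ≡ dL(u)/du.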

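\[
\begin{aligned}
\therefore \quad v^2 = {} & + 2\pi G \int_0^R \left[ \frac{R'}{R}\, L\!\left(\frac{R'}{R}\right)
+ \left(\frac{R'}{R}\right)^{\!2} L'\!\left(\frac{R'}{R}\right) \right] \Sigma(R')\, dR' \\
& - 2\pi G \int_R^\infty \left(\frac{R}{R'}\right) L'\!\left(\frac{R}{R'}\right) \Sigma(R')\, dR' \, .
\end{aligned} \tag{5.9}
\]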


This can be quite messy and it can be abbreviated as

\[
v^2(R) = 2\pi G \int_0^\infty K\!\left(\frac{R}{R'}\right) \Sigma(R')\, dR' \, , \tag{5.10}
\]

where the function K(R/R′) represents the function over both the R′ = 0 to R and the R′ = R to ∞ domains.

Changing variables to x ≡ ln R, y ≡ ln R′, we can write this as a convolution

\[
v^2(R) = 2\pi G \int_{-\infty}^{\infty} K(e^{x-y})\, R'\,\Sigma(R')\, dy \, . \tag{5.11}
\]

The kernel K(R/R′) is shown in Figure 5.2.

Figure 5.2: The kernel K(R/R′). Observe that the R > R′ part tends to have higher absolute value than the R < R′ part.

Figure 5.3 shows RΣ(R) and v² for an exponential disc, but the general shapes are not very sensitive to whether Σ(R) is precisely exponential. The important qualitative fact is that whatever RΣ(R) does, v² does roughly the same, but expanded by a factor of ≃ e.

The distinctive shape of the v²(ln R) curve for realistic discs makes it very easy to recognise non-disc mass. Figure 5.4, following Kalnajs, shows the rotation curves you get by adding either a bulge or a dark halo. (Actually this figure fakes the bulge/halo contribution by adding a smaller/larger disc; but if you properly add spherical mass distributions for the bulge/halo, the result is very similar.) Kalnajs’s point is that a bulge+disc rotation curve has a similar shape to a disc+halo rotation curve: only the scale is different. So when examining a flat(-ish) rotation curve, you must ask what the disc scale radius is.

5.4 Representing Dark Matter Distributions

The dark matter within spiral galaxies does not appear to be confined to the discs and it is probably distributed approximately spheroidally. A popular density profile that has been adopted for modelling dark matter haloes has the form

\[
\rho(r) = \frac{\rho_0}{1 + (r/a)^2} \, , \tag{5.12}
\]

where r is the radial distance from the centre of the galaxy, ρ_0 is the central dark matter density, and a is a constant. This form does reproduce the observed rotation curves of spiral galaxies adequately: it gives a circular velocity that is v_circ = 0 at R = 0, that rises rapidly with the radial distance in the plane of the disc R, and then becomes flat (v_circ = constant) for R ≫ a. This profile, however, has the problem that its mass is infinite. Therefore a more practical functional form is

\[
\rho(r) = \frac{\rho_0}{1 + (r/a)^n} \, , \tag{5.13}
\]

where a and n are constants, with n > 3 giving a finite mass (the mass integral ∫ ρ r² dr converges at large r only if ρ falls off faster than r⁻³).

Figure 5.3: The dashed curve is RΣ(R) for an exponential disc with Σ ∝ e^(−R) and the solid curve is v²(R). Note that R is measured in disc scale lengths, but the vertical scales are arbitrary.

Some numerical N-body simulations of galaxy formation have predicted that dark matter haloes will have density profiles of the form

\[
\rho(r) = \frac{k}{r\,(a + r)^2} \, , \tag{5.14}
\]

where a and k are constants. This is known as the Navarro-Frenk-White profile after the scientists who first described it. It fits the densities of collections of particles representing dark matter haloes in numerical simulations, and does so adequately over broad ranges in masses and sizes. It is therefore often used to represent the dark matter haloes of galaxies and also of clusters of galaxies.

The profiles above are spherical: the density depends only on the radial distance r from the centre. These functional forms for ρ can be modified to allow for flattened systems.
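As a check on the behaviour claimed for the profile of Equation 5.12, the enclosed mass and circular velocity can be worked out in closed form for the spherical case. The derivation below is a sketch that uses only Equations 5.1 and 5.12; r′ is a dummy integration variable.

```latex
M(r) = 4\pi \int_0^r \frac{\rho_0\, r'^2}{1 + (r'/a)^2}\, dr'
     = 4\pi \rho_0 a^2 \left[\, r - a \arctan\!\left(\frac{r}{a}\right) \right] ,
\qquad
v_{\rm circ}^2(r) = \frac{G M(r)}{r}
     = 4\pi G \rho_0 a^2 \left[\, 1 - \frac{a}{r} \arctan\!\left(\frac{r}{a}\right) \right] .
```

For r ≪ a this gives v_circ² ≃ (4πGρ_0/3) r², so the velocity rises from zero, while for r ≫ a it tends to the constant 4πGρ_0 a². The enclosed mass grows in proportion to r at large radii, which is the infinite total mass noted above.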


Figure 5.4: Plots of v² against ln R (upper panel) or v against R (lower panel). For one curve in each panel, a second exponential disc with mass and scale radius both scaled down by e² ≃ 7.39 has been added (to mimic a bulge); for the other curve a second exponential disc with mass and scale radius both scaled up by e² ≃ 7.39 has been added (to mimic a dark halo).


Chapter 6

Gravitational Lensing and Dark Matter in the Galactic Halo

6.1 Introduction

Gravitational lensing is the process that causes the appearance of distant bright objects to be altered by the gravity of foreground mass. Being a purely gravitational effect makes lensing astrophysically important as a probe of mass, including dark matter as well as visible matter.

Examples of gravitational lensing that have been observed include

• microlensing by stars, brown dwarfs etc. in the Galactic halo;

• deflection of light and radio waves by the Sun;

• lensing by distant galaxies; and

• lensing by galaxy clusters.

We shall begin this Chapter with a detailed review of gravitational lensing. The purpose of this is to explain the background to gravitational lensing before using these principles to understand how the light of distant stars can be lensed by objects within our Galaxy. It is the sections on microlensing in the Milky Way that are really syllabus material. The rest you should consider as relevant background material, plus information of general interest.

6.2 Gravitational Deflection of Light by a Point Mass

Photons are affected by a gravitational field, but not in the same way as massive particles are. For the details we need general relativity, but fortunately, for astrophysical applications we only need to take over a few simple results. The most important is that if a light ray passes by a mass M with impact parameter R (≫ GM/c² and ≫ the size of the mass), it gets deflected by an angular amount

\[
\alpha = \frac{4GM}{c^2 R} \, . \tag{6.1}
\]

Figure 6.1: The deflection of a ray of light by a point mass. The deflection angle is α.

In contrast, a massive body at high speed v gets deflected by α = 2GM/(v²R) (which was proved in the discussion of relaxation time in stellar dynamics).

In most practical applications, the gravitational deflection of light is very small. For example, the deflection of a ray of light skimming the surface of the Sun is only 8.5 × 10⁻⁶ rad = 1.8 arcsec. This was first measured using observations of a solar eclipse in 1919 by a team led by Arthur Eddington.
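The solar value can be reproduced directly from Equation 6.1. The Python sketch below does so; the constants (the solar gravitational parameter GM, the nominal solar radius and the speed of light) are standard values supplied here rather than taken from these notes, and the function name is illustrative.

```python
import math

GM_SUN = 1.32712440018e20   # solar gravitational parameter G*M, m^3 s^-2
C = 299792458.0             # speed of light, m s^-1
R_SUN = 6.957e8             # nominal solar radius, m

def deflection_angle(GM, impact_parameter):
    """Light deflection angle in radians from Equation 6.1: 4GM / (c^2 R)."""
    return 4.0 * GM / (C**2 * impact_parameter)

alpha = deflection_angle(GM_SUN, R_SUN)      # ray grazing the solar limb
arcsec = math.degrees(alpha) * 3600.0
print(f"alpha = {alpha:.2e} rad = {arcsec:.2f} arcsec")
```

Because α scales as 1/R, a ray passing at two solar radii from the centre of the Sun is deflected by half as much.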

If lensing takes place over a short enough distance that the deflection can be taken to be sudden, the lens is said to be geometrically thin. Otherwise it is a thick lens.

6.3 The Lensing Equation

To make Equation 6.1 useful we need two approximations, both very good in almost all astrophysical situations:

(i) The deflector is much smaller than the distances to the observer and the object being viewed (the ‘source’);

(ii) The deflections are always very small, so we can freely use sin α ≈ α, and also we can get the total deflection from a mass distribution by integrating Equation 6.1.

Figure 6.2: The definitions of the quantities D_L, D_S, D_LS, θ, θ_S, and α.

Accordingly, let us consider a situation as in Figure 6.2: the observer is viewing a source at distance D_S, with a lens (a mass screen) intervening at distance D_L; D_LS is the distance from the lens to the source. On galactic scales D_L, D_S, D_LS are ordinary distances, but on cosmological scales they must be understood as angular diameter distances, and D_S ≠ D_L + D_LS. The reason for this complication is that the universe will have expanded substantially over the light travel time. We shall ignore these cosmological effects in this analysis because our objective is to understand lensing within the neighbourhood of our Galaxy.

We can use angular coordinates to describe the transverse positions.¹ Let θ_S be the position of the source, and θ be its observed position after being deflected. Note that these are two-dimensional angular positions, and they are therefore represented as vectors. Let α(θ) be the deflection angle. θ_S, θ and α will be measured in radians. Let Σ(θ) be the lens’s surface mass density (expressed as the mass per unit solid angle, in units such as kg sr⁻¹, solar masses per steradian, or solar masses per square arcsecond).

Then, comparing vectors in the source plane, we get

\[
D_S\, \boldsymbol\theta = D_S\, \boldsymbol\theta_S + D_{LS}\, \boldsymbol\alpha \, . \tag{6.2}
\]

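(By convention,² α is directed outwards from the deflecting mass rather than towards it.) Using Equation 6.1 to get α in terms of Σ, we get

\[
\boldsymbol\theta = \boldsymbol\theta_S + \frac{D_{LS}}{D_S}\, \boldsymbol\alpha(\boldsymbol\theta) \, , \qquad
\boldsymbol\alpha(\boldsymbol\theta) = \frac{4G}{c^2 D_L} \int
\frac{\Sigma(\boldsymbol\theta')\, (\boldsymbol\theta - \boldsymbol\theta')}{|\boldsymbol\theta - \boldsymbol\theta'|^2}\; d^2\theta' \, . \tag{6.3}
\]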

This is known as the lens equation. It gives θ_S as an explicit function of θ, but θ as an implicit function of θ_S. Moreover, θ(θ_S) need not be single-valued, so sources can be multiply imaged.

6.4 Time Delays

The deflected light experiences a time delay because of:

• the increased geometric light travel time T_geom = ½ T_0 (θ − θ_S)², where T_0 ≡ D_L D_S/(c D_LS);

• the delay within the gravitational potential, −Ψ(θ) (the Shapiro time delay).

The total time delay is therefore,

\[
T = \tfrac{1}{2}\, T_0\, (\boldsymbol\theta - \boldsymbol\theta_S)^2 - \Psi(\boldsymbol\theta) \, . \tag{6.4}
\]

The potential time delay, Ψ, is a scalar function with the dimensions of time. We denote it here by the capital letter Psi (a related quantity called the lensing potential will be introduced later and will be denoted by a lower-case psi, ψ: take care not to confuse them). It is given by

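\[
\Psi(\boldsymbol\theta) = \frac{4G}{c^3} \int \Sigma(\boldsymbol\theta')\, \ln|\boldsymbol\theta - \boldsymbol\theta'|\; d^2\theta' \, . \tag{6.5}
\]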

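The total time delay T has the property that

\[
\nabla_\theta T = 0 \, ,
\]

where ∇_θ represents the gradient in the lens plane, i.e.

\[
\nabla_\theta \equiv \frac{\partial}{\partial\theta_x}\, \mathbf{e}_x + \frac{\partial}{\partial\theta_y}\, \mathbf{e}_y \, .
\]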

¹ Later on, we’ll use θ_r, θ_x, θ_y as coordinates rather than r, x, y, to remind us that these are angles on the sky, not distances.

² The astrophysical convention being that you first think how a rational person would do it, and then you change the sign.


This is merely Fermat’s Principle, a standard result in optics. It states that the path taken by the light makes the travel time stationary (a minimum, maximum or saddle point) given the particular lens-observer-source configuration.

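The four equations

\[
\begin{aligned}
\nabla T &= 0 \, , &
T &= \tfrac{1}{2}\, T_0\, (\boldsymbol\theta - \boldsymbol\theta_S)^2 - \Psi(\boldsymbol\theta) \, , \\
\Psi(\boldsymbol\theta) &= \frac{4G}{c^3} \int \Sigma(\boldsymbol\theta')\, \ln|\boldsymbol\theta - \boldsymbol\theta'|\; d^2\theta' \, , &
T_0 &= \frac{D_L D_S}{c\, D_{LS}} \, ,
\end{aligned} \tag{6.6}
\]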

represent a reformulation of the results of Equation 6.3. Although it is possible to work entirely with Equation 6.3, Equations 6.6 are much more intuitive. In the cosmological situation, both terms for T need to be multiplied by (1 + z_L), where z_L is the redshift of the lensing object.

The gravitational time delay Ψ can be derived directly from general relativity, independently of Equation 6.1, and is known as the Shapiro time delay. Radio astronomers can measure it directly within the Solar System.

6.5 The Einstein Radius and Einstein Rings

Figure 6.3: The Einstein ring produced by symmetrical lensing of a point object by a point mass.

Consider a point mass M, which happens to be precisely between us and a point source. In other words θ_S = 0 and Σ(θ) = M δ(θ). From the symmetry, we expect to observe a ring. The lens equation (6.3) is solved by θ = θ_E, with

\[
\theta_E^2 = \frac{4GM}{c^2}\, \frac{D_{LS}}{D_L D_S} \, . \tag{6.7}
\]

The interpretation of this is that this perfectly aligned lens will produce an image that is a ring with an angular radius θ_E given (in radians) by

\[
\theta_E = \sqrt{ \frac{4GM}{c^2}\, \frac{D_{LS}}{D_L D_S} } \, . \tag{6.8}
\]

This circular image is called an Einstein ring.

The physical length R_E corresponding to the angle θ_E is called the Einstein radius and is given by R_E = θ_E D_L. The Einstein radius is therefore,

\[
R_E = \sqrt{ \frac{4GM}{c^2}\, \frac{D_L D_{LS}}{D_S} } \, . \tag{6.9}
\]
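To get a feel for the sizes involved, Equations 6.8 and 6.9 can be evaluated for a stellar-mass lens. The Python sketch below uses a purely hypothetical configuration (a one-solar-mass lens at 10 kpc and a source at 50 kpc), takes D_LS = D_S − D_L as is appropriate on galactic scales, and uses standard physical constants supplied here. The function name is illustrative.

```python
import math

GM_SUN = 1.32712440018e20   # solar gravitational parameter G*M, m^3 s^-2
C = 299792458.0             # speed of light, m s^-1
KPC = 3.0857e19             # one kiloparsec in metres
AU = 1.496e11               # one astronomical unit in metres

def einstein_angle(mass_msun, D_L, D_S):
    """Einstein angle in radians from Equation 6.8, for distances in metres.
    Assumes D_LS = D_S - D_L (ordinary, non-cosmological distances)."""
    D_LS = D_S - D_L
    return math.sqrt(4.0 * GM_SUN * mass_msun / C**2 * D_LS / (D_L * D_S))

# Hypothetical configuration: 1 solar mass lens at 10 kpc, source at 50 kpc
D_L, D_S = 10.0 * KPC, 50.0 * KPC
theta_E = einstein_angle(1.0, D_L, D_S)
R_E = theta_E * D_L                          # Einstein radius, Equation 6.9
mas = math.degrees(theta_E) * 3600.0e3       # radians to milliarcseconds
print(f"theta_E ~ {mas:.2f} mas, R_E ~ {R_E / AU:.1f} au")
```

For these assumed values the Einstein angle comes out a little under a milliarcsecond, and since θ_E ∝ √M a lens of a quarter of the mass gives half the angle.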

By a Gauss’s-law type argument, for any circular mass distribution Σ(θ_r), Ψ(θ_r) and α(θ) will be influenced only by interior mass. So we’ll get the same images for any circular distribution of the mass M, provided it fits within an Einstein radius. Bodies that fit within their own Einstein radius are said to be ‘compact’. But the Einstein radius depends on where the source and observer are:

\[
R_E \sim \left( \text{Schwarzschild radius} \times D_L \right)^{1/2} .
\]

This effectively means that the further away you look, the easier it gets to see examples of gravitational lensing. This is a surprising fact at first, but it’s really just the gravitational analogue of a familiar fact about glass lenses: to get the maximum effect from a lens you have to be near the focal plane; if you’re too near, the lens doesn’t have much effect.

6.6 The Critical Surface Density Σcrit

For given values of D_L and D_S, for a lens to be a compact object you have to pack a mass M (in projection) into a circle of radius θ_E. However, the area of this circle is proportional to the mass (through the dependence of θ_E on the mass). So clearly there has to be a critical density, say Σ_crit, such that if Σ ≥ Σ_crit somewhere then there is a compact object (or sub-object).

Σ_crit corresponds to a mass surface density of M/(πθ_E²). Substituting for θ_E from Equation 6.8, we find that

\[
\Sigma_{\rm crit} = \frac{D_L D_S}{D_{LS}}\, \frac{c^2}{4\pi G} \, . \tag{6.10}
\]

It has units of mass per unit solid angle (e.g. kg sr⁻¹, M_⊙ sr⁻¹ or M_⊙ arcsec⁻²). Note that Σ_crit depends on the distances D_L, D_S and D_LS. If Σ > Σ_crit, the lensing is compact.

The fact that there is a critical density, and that it depends on distances, has important astrophysical consequences. For example, a galaxy as a whole (a smooth distribution of ∼ 10¹² M_⊙ on a scale of ∼ 10⁵ pc) is not compact to lensing for D_L ≲ 10⁹ pc, that is, cosmological distances. But clumps within the galaxy may be compact at much smaller distances. In particular, a star is compact to lensing at distances of even ≲ 1 pc.

6.7 The Lensing Potential ψ

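It is possible to define a two-dimensional function ψ(θ) so that α is related to the gradient of ψ in the lens plane as,

\[
\nabla_\theta\, \psi \equiv \boldsymbol\theta - \boldsymbol\theta_S = \frac{D_{LS}}{D_S}\, \boldsymbol\alpha(\boldsymbol\theta) \, , \tag{6.11}
\]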

from the lensing equation.

ψ is a dimensionless scalar function known as the lensing potential. It is denoted by a lower-case letter ψ and care should be taken to avoid confusing it with the time delay Ψ caused by the gravitational potential. Ψ and ψ are related by

\[
\Psi = T_0\, \psi = \frac{D_L D_S}{c\, D_{LS}}\, \psi \, . \tag{6.12}
\]

Using this, we can write Equation 6.6 more concisely as

∇T = 0, T = T0

[

12(θ − θS)

2 − ψ(θ)]

ψ(θ) = 1π

κ(θ′) ln |θ − θ′| d2θ′ , (6.13)

where κ is the projected mass density in units of the critical density (i.e. κ ≡ Σ/Σcrit).From the second line of Equation 6.13 it should be evident that ψ satisfies a two-dimensionalPoisson equation

∇2ψ = 2κ . (6.14)


6.8 The Arrival Time Surface

The surface T (θ) is known as the time delay surface or the arrival time surface. Wherever the arrival time is stationary (i.e., the surface has a maximum, minimum, or saddle point) there will be constructive interference, and an image. This is Fermat’s principle. Furthermore, the smaller the curvature of the surface at an image, the more magnified that image will be. This is formalised in the next section.

Try to visualise the arrival time surface. The geometrical part is a paraboloid with a minimum at θS. Having mass in the lens pushes up the surface variously. If κ(θ) > 1 anywhere, there will be a maximum somewhere near there, hence another image. There must be a third image too, because to have a minimum and a maximum in a surface you must have a saddle point somewhere. In fact

maxima + minima = saddle points + 1 . (6.15)

This is really a statement about geometry that should be intuitively clear, though a formal proof is difficult.

A good way of gaining some intuition about the arrival time surface is to take a transparency with a blank piece of paper behind it and look at the reflections of a light bulb. Notice how images merge and split, and how you get grotesquely stretched images just as they do. Deep images of rich clusters of galaxies show just these effects!

6.9 Magnification

By magnification we mean: how much does the image move when we move the source? It should be clear that this magnification can’t be a scalar, because an image doesn’t in general move in the same direction as the source. In fact the magnification is a tensor. We’ll denote it by M (the letter A for ‘amplification’ is also used). Formalising our definition, we have

$$M^{-1} = \frac{\partial\theta_S}{\partial\theta} = \frac{\partial^2}{\partial\theta^2}\,T(\theta) . \qquad (6.16)$$

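In Cartesian coordinates

$$M^{-1} = \begin{pmatrix}
1 - \dfrac{\partial^2\psi}{\partial\theta_x^2} & -\dfrac{\partial^2\psi}{\partial\theta_x\,\partial\theta_y} \\[2ex]
-\dfrac{\partial^2\psi}{\partial\theta_y\,\partial\theta_x} & 1 - \dfrac{\partial^2\psi}{\partial\theta_y^2}
\end{pmatrix} .$$

Notice that $M^{-1}$ is basically the curvature of the arrival time surface.

It is helpful to write $M^{-1}$ in terms of its eigenvalues, and the usual form is

$$M^{-1} = (1 - \kappa)\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} - \gamma\begin{pmatrix} \cos 2\phi & \sin 2\phi \\ \sin 2\phi & -\cos 2\phi \end{pmatrix} . \qquad (6.17)$$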

The eigenvalues are of course 1 − κ ± γ. The first term in Equation 6.17 is the trace part (comparing Equations 6.17 and 6.14 shows that the quantity appearing in it must be κ), while the second term is traceless. The term with κ produces an isotropic expansion or contraction, while the γ term produces a stretching in the φ direction and a shrinking in the perpendicular direction; κ is known as ‘convergence’ and γ as ‘shear’.

The determinant of M can be thought of as a scalar magnification. Since the determinant of $M^{-1}$ is the product of its two eigenvalues,

$$|M| = \left[(1 - \kappa)^2 - \gamma^2\right]^{-1} . \qquad (6.18)$$
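The convergence/shear decomposition can be checked numerically. The following is a minimal sketch, not part of the original notes: the values of κ, γ and φ are arbitrary illustrative choices, and the helper names are hypothetical. It builds $M^{-1}$ from Equation 6.17 and confirms that its eigenvalues are 1 − κ ± γ and that its determinant is their product.

```python
import math

def inverse_magnification(kappa, gamma, phi):
    """Return M^{-1} of Equation 6.17 as a nested tuple ((a, b), (b, d))."""
    a = (1.0 - kappa) - gamma * math.cos(2.0 * phi)
    b = -gamma * math.sin(2.0 * phi)
    d = (1.0 - kappa) + gamma * math.cos(2.0 * phi)
    return ((a, b), (b, d))

def eigenvalues_symmetric_2x2(m):
    """Eigenvalues of a real symmetric 2x2 matrix, smallest first."""
    (a, b), (_, d) = m
    mean = 0.5 * (a + d)
    half_gap = math.hypot(0.5 * (a - d), b)
    return (mean - half_gap, mean + half_gap)

def determinant_2x2(m):
    """Determinant of a 2x2 matrix given as nested tuples."""
    (a, b), (c, d) = m
    return a * d - b * c

# Illustrative values only (assumptions, not taken from the notes).
kappa, gamma, phi = 0.3, 0.2, 0.7

m_inv = inverse_magnification(kappa, gamma, phi)
lam_minus, lam_plus = eigenvalues_symmetric_2x2(m_inv)
det_m_inv = determinant_2x2(m_inv)

print(lam_minus, lam_plus)   # compare with 1 - kappa - gamma and 1 - kappa + gamma
print(1.0 / det_m_inv)       # scalar magnification |M|
```

The eigenvalues do not depend on φ, which only sets the direction of the stretching; changing `phi` in the sketch leaves `lam_minus`, `lam_plus` and the determinant unchanged.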


The area of an image on the sky is increased by a factor |M |. The places where one of the eigenvalues of $M^{-1}$ becomes zero (and in consequence |M | is infinite) are in general curves and are known as critical curves. When critical curves are mapped onto the source plane through the lens equation, they give caustics; a source lying on a caustic gets infinitely magnified.

6.10 Examples of the Magnification: Lensing by a Point-Mass and by an Isothermal Sphere

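For a point mass, the lens equation is

$$\theta_{Sx} = \theta_x - \frac{\theta_x}{\theta_r^2}\,\theta_E^2 , \qquad \theta_{Sy} = \theta_y - \frac{\theta_y}{\theta_r^2}\,\theta_E^2 ,$$

where $\theta_r^2 = \theta_x^2 + \theta_y^2$, and this gives

$$M^{-1} = \begin{pmatrix}
1 - \left(\dfrac{1}{\theta_r^2} - \dfrac{2\theta_x^2}{\theta_r^4}\right)\theta_E^2 & \dfrac{2\theta_x\theta_y}{\theta_r^4}\,\theta_E^2 \\[2ex]
\dfrac{2\theta_x\theta_y}{\theta_r^4}\,\theta_E^2 & 1 - \left(\dfrac{1}{\theta_r^2} - \dfrac{2\theta_y^2}{\theta_r^4}\right)\theta_E^2
\end{pmatrix} . \qquad (6.19)$$

Taking the determinant and simplifying, we get

$$|M|^{-1} = 1 - \frac{\theta_E^4}{\theta_r^4} . \qquad (6.20)$$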

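We shall now consider a circular mass distribution $\Sigma \propto \theta_r^{-1}$. This is known as the ‘isothermal lens’, because it is the $\rho \propto 1/r^2$ isothermal sphere in projection. The deflection of the isothermal lens has constant magnitude θE, so its lens equation is

$$\theta_{Sx} = \theta_x - \frac{\theta_x}{\theta_r}\,\theta_E , \qquad \theta_{Sy} = \theta_y - \frac{\theta_y}{\theta_r}\,\theta_E ,$$

and gives

$$M^{-1} = \begin{pmatrix}
1 - \left(\dfrac{1}{\theta_r} - \dfrac{\theta_x^2}{\theta_r^3}\right)\theta_E & \dfrac{\theta_x\theta_y}{\theta_r^3}\,\theta_E \\[2ex]
\dfrac{\theta_x\theta_y}{\theta_r^3}\,\theta_E & 1 - \left(\dfrac{1}{\theta_r} - \dfrac{\theta_y^2}{\theta_r^3}\right)\theta_E
\end{pmatrix} .$$

And from this we get

$$|M|^{-1} = 1 - \frac{\theta_E}{\theta_r} . \qquad (6.21)$$

This is shorter in polar coordinates, but tensor components in polar coordinates can get confusing.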
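The two determinants above can be cross-checked without any algebra by differentiating the lens mapping numerically. The sketch below is an illustration rather than anything from the original notes: it assumes angles measured in units of θE, takes the isothermal deflection to have constant magnitude θE (the form consistent with Equation 6.21), and uses hypothetical helper names and an arbitrary test position.

```python
import math

THETA_E = 1.0  # Einstein radius; all angles are in units of theta_E (assumption)

def point_mass(tx, ty):
    """Lens mapping theta -> theta_S for a point mass."""
    r2 = tx * tx + ty * ty
    return (tx - tx * THETA_E ** 2 / r2, ty - ty * THETA_E ** 2 / r2)

def isothermal(tx, ty):
    """Lens mapping theta -> theta_S for a singular isothermal lens."""
    r = math.hypot(tx, ty)
    return (tx - tx * THETA_E / r, ty - ty * THETA_E / r)

def det_inverse_magnification(lens, tx, ty, h=1e-6):
    """det(d theta_S / d theta) estimated by central finite differences."""
    sx_px, sy_px = lens(tx + h, ty)
    sx_mx, sy_mx = lens(tx - h, ty)
    sx_py, sy_py = lens(tx, ty + h)
    sx_my, sy_my = lens(tx, ty - h)
    dsx_dx = (sx_px - sx_mx) / (2.0 * h)
    dsy_dx = (sy_px - sy_mx) / (2.0 * h)
    dsx_dy = (sx_py - sx_my) / (2.0 * h)
    dsy_dy = (sy_py - sy_my) / (2.0 * h)
    return dsx_dx * dsy_dy - dsx_dy * dsy_dx

tx, ty = 1.3, 0.8                 # arbitrary image position for the check
r = math.hypot(tx, ty)

numeric_point = det_inverse_magnification(point_mass, tx, ty)
numeric_iso = det_inverse_magnification(isothermal, tx, ty)
analytic_point = 1.0 - THETA_E ** 4 / r ** 4   # Equation 6.20
analytic_iso = 1.0 - THETA_E / r               # Equation 6.21

print(numeric_point, analytic_point)
print(numeric_iso, analytic_iso)
```

In both cases the determinant vanishes on the circle θr = θE, which is the critical curve (the Einstein ring) of a circularly symmetric lens.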

6.11 The Conservation of Surface Brightness

Magnification in lensing conserves surface brightness. We can prove this in a rather interesting way. Let us consider the axial direction as a formal time variable t; then light rays can be thought of as trajectories. Now allow observers to be at arbitrary transverse position (say w, which is two-dimensional) and arbitrary t. Then θ as observed at (w, t) is just the local dw/dt for the corresponding light ray, up to a constant factor. This means we can make a formal analogy with the Hamiltonian formulation of stellar dynamics, with θ (up to a constant) playing the role of the momentum, w playing the role of the coordinates, and ψ(w, t) replacing the Newtonian potential. The phase space density f is the density of photons in (w, θ) space, or the number of photons per unit solid angle on the sky per unit telescope area, i.e., the surface brightness. The collisionless Boltzmann equation applies (as it does for any Hamiltonian system) and it tells us that surface brightness is conserved along trajectories! Surface brightness must be conserved by the act of placing the lens there too: think of surface brightness before and after going through the lens. QED. We must be careful, though, to understand ‘along the trajectories’ correctly. It means we must always be looking at photons from the same source, so if the image is moved in the sky by lensing we must follow it when we measure surface brightness.

This means that lensing changes the apparent sizes (and shapes) of objects, but does not alter their surface brightness. Lensing a source will change its apparent brightness because its angular area is changed, not because the surface brightness changes.

6.12 The Preservation of Spectroscopic and Colour Information

The gravitational lensing of light does not depend on the wavelength: the wavelength of light, or equivalently the energy of photons, does not appear in the equations describing lensing. Therefore, the spectrum of a source is not changed if the source is lensed by a foreground object. Equally, the colour of the source is unchanged.

This provides an important test for the effects of lensing. If two images on the sky are suspected to be caused by lensing of a single object, we expect that their spectra will be the same. They should show the same spectral features, and have the same redshift.

One qualification needs to be made. An extended source, such as a galaxy, may emit light from different regions that have different spectra and different colours. This can produce some practical changes in observed spectra on account of lensing in some instances, caused by the different magnification of subregions.

6.13 Multiple-image QSOs

Multiple imaging of a quasi-stellar object (QSO) happens when a foreground galaxy lies within ≲ θE (in projection) of a QSO, and produces two or four images with separations of order an arcsecond. Two-image systems have a minimum and a saddle point, while four-image systems have two minima and two saddle points. In both cases there is a maximum as well, at the bottom of the galaxy’s potential well; but since that is also generally the densest part of the galaxy, κ is very high and |M | nearly vanishes, so these central images are too faint to detect.

Multiple-image QSOs are of great astrophysical interest, and two things make them so.

The first is that since QSOs are often very time-variable and the different images have different arrival times, the images will show the same time-variability, but with offsets. These offsets are simply the differences in T (θ) between different images. (So far they have been explicitly measured for several lenses.) Provided we know (or can model) κ(θ), the measured time offsets tell us T0, and hence H0. Basically it’s this: normally we can only measure dimensionless quantities (image separations, relative magnifications) in lens systems; but if we succeed in measuring a quantity that has a scale (the time delays), it can tell us the scale of the Universe (H0). In practice, there is considerable uncertainty about the distribution of mass in the lensing galaxies, and this translates into an uncertainty in the inferred H0 that is much larger than errors in the time delays. Maybe this problem will be overcome, maybe not.

The second thing has to do with the extremely small size of QSOs in optical continuum. Now the κ(θ) of a galaxy isn’t perfectly smooth: it becomes granular on the scale of individual stars. This produces a very complicated network of critical lines (in the lens plane), and a correspondingly complicated network of caustics in the source plane (like the pattern at the bottom of a swimming pool). The optical continuum emitting regions of QSOs are small enough to fit between the caustics, but the line emitting regions straddle several caustics. As proper motions move the caustic network, the continuum region will sometimes cross a caustic, and show a sudden change in brightness; the time taken for the brightness to change is the time it takes to cross the caustic. This is the phenomenon of QSO microlensing: the continuum shows it but the lines don’t. (It’s just the gravitational version of stars twinkling and planets not twinkling.) This has been observed, and modelling the caustic network and putting in plausible values for the proper motion leads to an estimate of the intrinsic size of the continuum regions of QSOs. It’s very small, ∼ 100 AU.

Figure 6.4: Examples of gravitational lensing of quasars and distant galaxies by foreground galaxies. The pictures show images of candidate lenses recorded with the Hubble Space Telescope. [From NASA HST press release 1999-18, produced by Kavan Ratnatunga (Carnegie Mellon Univ.), NASA and the Space Telescope Science Institute.]

6.14 Galaxy Clusters

Galaxy clusters are generally not in dynamical equilibrium (there haven’t been enough crossing times since they formed). Their mass distributions and ψ potentials are thus warped in more complicated ways than for single galaxies. They are also much bigger on the sky and thus have many more background objects (faint blue galaxies) to lens.

The transparency with a paper behind it and several light bulbs overhead is a good analogy of lensing by a cluster. Rich clusters show many highly stretched images of background galaxies, and these are known as arcs. A deep HST image of Abell 2218 shows over a hundred arcs, including seven multiple image systems.

An arc is close to a zero eigenvalue of $M^{-1}$, and is stretched along the corresponding eigenvector. Thus each arc provides some sort of constraint on the ψ of the cluster.

Clusters also show weak lensing. That’s when the eigenvalues 1 − κ ± γ are too close to unity to show up as arcs, but if many galaxies in the same region are examined then statistically a stretching is measurable. The statistical stretching measures the ratio of the two eigenvalues, and thus γ/(1 − κ).

Several groups have been reconstructing cluster mass profiles from information provided by multiple images, arcs, and weak lensing.


Figure 6.5: Gravitational lensing by a cluster of galaxies. This Hubble Space Telescope image of the cluster Abell 2218 shows many arcs caused by the lensing of distant background galaxies by the mass distribution in the cluster. [NASA image recorded with the Hubble Space Telescope by Andrew Fruchter and the ERO Team and released by the Space Telescope Science Institute (as STScI-2000-07).]

6.15 Microlensing in the Milky Way

The exact nature of dark matter in the Universe is still unclear, despite dedicated research over many years. The available evidence indicates that a majority of the dark matter is dynamically cold and that it is collisionless. It could be in the form of subatomic particles, such as weakly-interacting massive particles (WIMPs), or could be astronomical objects that emit little or no radiation, known as massive astrophysical compact halo objects (MACHOs).

One possibility is that the dark matter in the Milky Way halo is MACHOs in the form of brown dwarfs, compact objects below the hydrogen burning threshold of $0.08\,M_\odot$. Such objects would act as point lenses. Indeed, the gravitational lensing of background stars by dark astronomical objects in the halo would be one way of detecting individual dark matter objects, and therefore of identifying the nature of dark matter.

A point lens has two images, at

$$\theta = \frac12\left(\theta_S \pm \sqrt{\theta_S^2 + 4\theta_E^2}\right) . \qquad (6.22)$$

(There is formally a third image at θ = 0, i.e., at the lens itself, but for a point mass this image has zero magnification.) The image separation for a $\sim M_\odot$ lens at distances of ∼ 10 kpc is < 1 mas, far too small to resolve. What will be observed is a brightening equal to the combined magnification of both images. Using the result Equation 6.20 for |M | for a point lens, and adding the absolute values of |M | at the two image positions, we get

$$M_{\rm tot} = \frac{u^2 + 2}{u\,(u^2 + 4)^{1/2}} , \qquad u = \frac{\theta_S}{\theta_E} . \qquad (6.23)$$

Figure 6.6: Predicted light curves for impact parameters of RE (lowest), 0.5 RE and 0.2 RE (top). The unit of time is how long it takes the source to move a distance RE.

Now because of stellar motions, θS will change by an amount θE over times of order a month, so microlensing in the Milky Way can be observed by monitoring light curves. If the background source star has impact parameter b and velocity v (projected onto the lens plane) with respect to the lens, then

$$u = \frac{(b^2 + v^2 t^2)^{1/2}}{D_L\,\theta_E} . \qquad (6.24)$$

Inserting Equation 6.24 into 6.23 gives us Mtot(t), i.e. the light curve. This is plotted for three different b in Figure 6.6. The height of a measured light curve immediately gives RE/b, and the width gives RE/v.
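As a small illustration of how Equations 6.23 and 6.24 combine, the sketch below tabulates light curves for the three impact parameters used in Figure 6.6. It is not taken from the original notes: the function names and the time grid are hypothetical, b is expressed in units of RE = DL θE, and t in units of RE/v, so that u = (b² + t²)^{1/2}.

```python
import math

def total_magnification(u):
    """Combined magnification of both images of a point lens (Equation 6.23)."""
    return (u * u + 2.0) / (u * math.sqrt(u * u + 4.0))

def light_curve(b_over_re, times):
    """M_tot(t) with b in units of R_E and t in units of R_E / v (Equation 6.24)."""
    return [total_magnification(math.hypot(b_over_re, t)) for t in times]

# Time grid from -3 to +3 Einstein-radius crossing times (illustrative choice).
times = [i * 0.1 for i in range(-30, 31)]

for b in (1.0, 0.5, 0.2):
    curve = light_curve(b, times)
    peak = max(curve)
    print(f"b = {b:.1f} R_E: peak magnification {peak:.3f}")
```

The peak occurs at t = 0, where u = b/RE, so the peak height depends only on RE/b; for b = RE it equals 3/√5 ≈ 1.34, and it grows roughly as RE/b for small impact parameters.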

Though trying to resolve the images in microlensing seems hopeless with foreseeable technology, there are some prospects for tracking the moving double image indirectly. By combining the positions and magnifications of the two images, we have for the centroid

$$\theta_{\rm cen} = \frac{u\,(3 + u^2)}{2 + u^2}\,\theta_E . \qquad (6.25)$$
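The centroid formula can be recovered in a few lines. The sketch below assumes the standard point-lens result that the two images have absolute magnifications $M_\pm = \tfrac12(M_{\rm tot} \pm 1)$, which is not stated explicitly in these notes, and takes the centroid to be the magnification-weighted mean of the signed image positions of Equation 6.22, measured from the lens along the lens-source direction.

```latex
\begin{align*}
\theta_\pm &= \tfrac12\left(u \pm \sqrt{u^2+4}\right)\theta_E , \qquad
M_\pm = \tfrac12\left(M_{\rm tot} \pm 1\right) , \qquad
M_{\rm tot} = \frac{u^2+2}{u\sqrt{u^2+4}} , \\
\theta_{\rm cen} &= \frac{M_+\theta_+ + M_-\theta_-}{M_+ + M_-}
  = \frac{\tfrac12 M_{\rm tot}\,(\theta_+ + \theta_-) + \tfrac12\,(\theta_+ - \theta_-)}{M_{\rm tot}} \\
 &= \left[\frac{u}{2} + \frac{\sqrt{u^2+4}}{2\,M_{\rm tot}}\right]\theta_E
  = \frac{u}{2}\left[1 + \frac{u^2+4}{u^2+2}\right]\theta_E
  = \frac{u\,(3+u^2)}{2+u^2}\,\theta_E .
\end{align*}
```

Relative to the unlensed source position θS = u θE, the centroid is therefore displaced by u θE/(2 + u²), which tends to zero both for perfect alignment (u → 0) and for distant sources (u ≫ 1).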

Such microlensing events are rare, because θS has to be ≲ θE for significant magnification. People speak of an optical depth τ to microlensing in a field. This is the probability of a star being (in projection) within θE of a foreground lens, at any given time. From Equation 6.23 with u ≤ 1, it amounts to the probability of $M_{\rm tot} \ge 3/\sqrt{5} = 1.34$. It’s just the covering factor of discs of radius θE (Einstein rings) from all lenses between us and the stars in the field. The source stars might be bright stars in the Large Magellanic Cloud (LMC) and the lenses very faint stars or brown dwarfs in the Milky Way halo. Note that the term optical depth has a very different meaning here from its use in radiation physics, as for example in the discussion of extinction by dust in the interstellar medium.

Figure 6.7: The observed light curve of microlensing event BUL SC3 91382 from the OGLE survey. The graph shows the observed brightness of a star in the Galactic Bulge, expressed as the infrared magnitude, plotted against time. The star was observed to brighten and fade over a period of several weeks. The curve is a fit to the data points using the expression for Mtot from Equation 6.23 and u from Equation 6.24, after choosing the time of maximum brightness, the magnitude m0 before/after the lensing event, and the quantities b/(DL θE) and v/(DL θE) to achieve the best fit. The magnitude at any time is fitted with m(t) = m0 − 2.5 log Mtot(t). [Plotted with data provided by the OGLE project.]

We can derive an expression giving the optical depth towards a source plane as a function of the density of microlensing objects in space between the source plane and an observer. Consider a field on the sky subtending a solid angle Ω. Microlensing occurs due to objects of mass ML between the observer and the source plane. Consider a thin shell covering this solid angle between distances DL and DL + dDL. The fraction of the field covered by the Einstein discs of lenses in this thin shell is

$$d\tau = \pi\,\theta_E^2\;dN \,/\, \Omega$$

where dN is the number of lenses in this thin shell. If n is the number density of lenses,

$$dN = n\,D_L^2\;dD_L\;\Omega .$$

The mass density is ρ = n ML. Therefore,

$$d\tau = \frac{\pi\,\theta_E^2\,\rho\,D_L^2\;dD_L}{M_L} .$$

Substituting for the angular Einstein radius and integrating over lens distance DL from the observer to the source plane, we obtain

$$\tau = \frac{4\pi G}{c^2 D_S}\int_0^{D_S} D_L\,D_{LS}\,\rho(D_L)\;dD_L . \qquad (6.26)$$

The really nice thing about the formula 6.26 is that it doesn’t depend on how the lensing mass is divided among individual lenses, as long as each mass fits within its own Einstein radius (diffuse gas clouds don’t count, nor does any kind of diffuse dark matter). So τ estimated from light curve monitoring could be used to make inferences about ρ.

How large is τ through the Galactic halo? To estimate that, we need an estimate for ρ. Now the Milky Way rotation curve suggests an isothermal halo, $\rho = \sigma^2/(2\pi G r^2)$, with σ ∼ 200 km s$^{-1}$. If we then say that r will be of order the D factors in Equation 6.26, we get

$$\tau \sim \frac{\sigma^2}{c^2} , \quad\text{or}\quad \tau \sim 10^{-7} \text{ to } 10^{-6} .$$


This is a very low figure, showing that the probability of detecting a microlensing event when monitoring a single star is negligible. However, monitoring very large numbers ($\sim 10^6$ to $10^7$) of stars simultaneously can make this feasible.

With some more care, we can estimate the lensing optical depth more accurately. As an example, consider a study of stars in a (hypothetical) dwarf galaxy companion to our own Galaxy that lies at a distance S from the Sun in the direction of the South Galactic Pole. The survey aims to detect the brightening of stars in the dwarf galaxy caused by microlensing by MACHOs in the Galactic halo along the sight line to the dwarf galaxy, as a test of whether the dark matter halo of our Galaxy is made out of MACHOs. We shall assume that the dark matter halo can be represented by an isothermal sphere model in which the density profile is given by

$$\rho(r) = \frac{\sigma^2}{2\pi G r^2} , \qquad (6.27)$$

where r is the radial distance from the Galactic Centre, σ is the velocity dispersion of particles moving in this potential, and G is the constant of gravitation. This isothermal sphere model is likely to be a reasonable representation of the real density profile because it implies a flat rotation curve, provided we do not use it close to the Galactic Centre (because it implies infinite density as r → 0) or at very large distances (because it implies a flat rotation curve out to infinite distance).

In this example, the source of light is a star in the companion galaxy at a distance DS = S. The lensing object is a MACHO at a distance DL from the Sun. Because the sight line is perpendicular to the Galactic plane, the distance of the lensing object from the Galactic Centre is $r = \sqrt{R_0^2 + D_L^2}$, where R0 is the distance of the Sun from the Galactic Centre. The distance between the lensing object and the source is DLS = DS − DL = S − DL. Using Equation 6.26,

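$$\begin{aligned}
\tau &= \frac{4\pi G}{c^2 D_S}\int_0^{D_S} D_L\,D_{LS}\,\rho(r)\;dD_L
      = \frac{4\pi G}{c^2 D_S}\int_0^{D_S} D_L\,(D_S - D_L)\,\frac{\sigma^2}{2\pi G r^2}\;dD_L \\
     &= \frac{4\pi G}{c^2 S}\int_0^{S} D_L\,(S - D_L)\,\frac{\sigma^2}{2\pi G\,(R_0^2 + D_L^2)}\;dD_L
      = \frac{2\sigma^2}{c^2 S}\int_0^{S} \frac{D_L S - D_L^2}{R_0^2 + D_L^2}\;dD_L \\
     &= \frac{2\sigma^2}{c^2 S}\left( S\int_0^{S} \frac{D_L\;dD_L}{R_0^2 + D_L^2} - \int_0^{S} \frac{D_L^2\;dD_L}{R_0^2 + D_L^2} \right) .
\end{aligned}$$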

This can be solved using the standard integrals

$$\int \frac{x\;dx}{a^2 + x^2} = \tfrac12 \ln\left|a^2 + x^2\right| + \text{constant} ,$$

$$\int \frac{x^2\;dx}{a^2 + x^2} = x - a\,\tan^{-1}\!\left(\frac{x}{a}\right) + \text{constant} .$$

Using these standard integrals, we get

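$$\begin{aligned}
\tau &= \frac{2\sigma^2}{c^2 S}\left( \left[\frac{S}{2}\ln\left(R_0^2 + D_L^2\right)\right]_{D_L=0}^{S} + \left[-D_L + R_0\tan^{-1}\!\left(\frac{D_L}{R_0}\right)\right]_{D_L=0}^{S} \right) \\
     &= \frac{2\sigma^2}{c^2 S}\left( \frac{S}{2}\ln\left(R_0^2 + S^2\right) - \frac{S}{2}\ln\left(R_0^2\right) - S + R_0\tan^{-1}\!\left(\frac{S}{R_0}\right) + 0 - R_0\tan^{-1} 0 \right) \\
     &= \frac{2\sigma^2}{c^2 S}\left( \frac{S}{2}\ln\left(1 + \frac{S^2}{R_0^2}\right) - S + R_0\tan^{-1}\!\left(\frac{S}{R_0}\right) \right)
\end{aligned}$$

which simplifies to

$$\tau = \frac{\sigma^2}{c^2}\left( \ln\left(1 + \frac{S^2}{R_0^2}\right) - 2 + \frac{2R_0}{S}\tan^{-1}\!\left(\frac{S}{R_0}\right) \right) . \qquad (6.28)$$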

The distance of the Sun from the Galactic Centre is R0 = 8.0 kpc. Putting S = 100 kpc as the distance to the companion galaxy (a reasonable figure), we get S/R0 = 12.5. Therefore,

$$\tau = 3.3 \times \frac{\sigma^2}{c^2} .$$

Observations of stars in the stellar halo of the Galaxy find σ = 200 km s$^{-1}$. Using this figure for the hypothetical MACHOs gives $\tau = 1.5\times10^{-6}$. That is to say, were the dark matter halo made of compact objects such as low mass stars, brown dwarfs, planet-size bodies or stellar mass black holes, we would expect microlensing events to occur with a frequency defined by $\tau = 1.5 \times 10^{-6}$.
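The arithmetic above is easy to reproduce. The following sketch is illustrative rather than part of the original notes: the function names and the midpoint-rule integrator are hypothetical choices. It evaluates the closed form of Equation 6.28 and compares it with a direct numerical integration of Equation 6.26 for the isothermal density of Equation 6.27, using S = 100 kpc, R0 = 8.0 kpc and σ = 200 km/s.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def tau_closed_form(s_over_r0, sigma_km_s):
    """Optical depth from the closed-form expression, Equation 6.28."""
    x = s_over_r0
    bracket = math.log(1.0 + x * x) - 2.0 + (2.0 / x) * math.atan(x)
    return (sigma_km_s / C_KM_S) ** 2 * bracket

def tau_numerical(s_kpc, r0_kpc, sigma_km_s, steps=20000):
    """Midpoint-rule integration of Equation 6.26 with the isothermal density.

    G cancels between the prefactor 4*pi*G/c^2 and rho = sigma^2/(2*pi*G*r^2),
    leaving tau = (2 sigma^2 / (c^2 S)) * integral of D_L (S - D_L) / (R0^2 + D_L^2).
    """
    width = s_kpc / steps
    total = 0.0
    for i in range(steps):
        d_l = (i + 0.5) * width
        total += d_l * (s_kpc - d_l) / (r0_kpc ** 2 + d_l ** 2) * width
    return 2.0 * (sigma_km_s / C_KM_S) ** 2 * total / s_kpc

tau_a = tau_closed_form(100.0 / 8.0, 200.0)
tau_n = tau_numerical(100.0, 8.0, 200.0)
print(f"closed form: {tau_a:.2e}, numerical: {tau_n:.2e}")
```

Both routes give τ close to 1.5 × 10⁻⁶. Because the bracket in Equation 6.28 grows only logarithmically with S/R0, the result is not very sensitive to the adopted distance of the companion galaxy.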

So to have any hope of detecting such microlensing events, it is necessary to monitor the light curves of millions of stars. A number of surveys have been undertaken in the last decade, observing fields in the LMC and the Milky Way bulge among others. (The bulge surveys go through the Milky Way disc, of course, but do also probe that part of the dark matter halo that extends through the disc.)³

6.16 Results of Microlensing Surveys

Several surveys of large numbers of stars have been conducted over the past several years to identify microlensing events. These have observed stars in the Galactic Bulge, in the Large Magellanic Cloud and in the Andromeda Galaxy M31. These surveys have monitored many millions of stars regularly for years to search for increases in their brightnesses consistent with microlensing, whether by possible MACHOs or by ordinary stars.

A considerable number of microlensing events have been observed to date. The current measurements of τ from observations are $\sim 10^{-7}$ towards the LMC and $\simeq 3\times10^{-6}$ towards the Bulge. However, this frequency is substantially lower than would be expected were the dark matter halo of the Galaxy made entirely of MACHOs: too few lensing events have been found to explain the dark matter in the halo. Many of the lensing events can be explained as being caused by main-sequence stars or white dwarfs. How much of the lensing mass is in brown dwarfs as distinct from faint stars is not entirely clear, but the available evidence suggests that compact astronomical objects can make up ≲ 20 % of the dark matter halo of the Galaxy. Meanwhile, the huge number of variable stars discovered by these surveys is revolutionising that field of study.

³ An estimate of τ from a survey will include a correction for the detection efficiency. Surveys have to be very wary of spurious detections; hence any light curve possibly contaminated by stellar variability has to be discarded for microlensing purposes. Detection efficiencies are of order 30%.


Chapter 7

The Galaxy: Its Structure and Content

7.1 Introduction

Chapter 1 included a brief overview of our Galaxy, while later chapters discussed important processes affecting the Galaxy and other galaxies. Here we bring these concepts together to develop an understanding of the Galaxy itself. This will include a consideration of each of the components (disc, bulge, halo, etc.) in detail.

The Milky Way Galaxy is, as far as we know, a typical disc galaxy. Figure 1.8 was a cartoon to remind you of its different components. The luminous parts are mostly a disc of population I stars and a bulge of older population II stars. We live in the disc, with the Sun at a distance R0 = 8.0 kpc from the centre. Apart from stars, the disc also has clusters of young stars and H II regions, and gas and dust; the gas is mostly observed as an H I layer which flares at large radii. There is some evidence that there are two or three spiral arms in the disc (the dust makes it hard to tell). The bulge is accompanied by a bar, though its dimensions are unclear. There are some very old stars (and globular clusters of very old stars) in the stellar halo. But the most massive part is the dark matter halo, which is made of dark matter of unknown composition.

That is not all: there are also the small companion galaxies. The best known of these are the Large and Small Magellanic Clouds (LMC and SMC), which are ≃ 50 kpc away; these are associated with a trail of debris, mostly H I gas, known as the Magellanic Stream. Then there is the Sagittarius Dwarf Galaxy, which appears to be merging with the Milky Way now.

7.2 The Mass of the Galaxy

The mass of the Galaxy enclosed within different radii can be determined using a variety of methods. Observations of the rotation curve provide measurements out to ≃ 12 to 15 kpc. Measurements of the dynamics of globular clusters can constrain the enclosed mass to a greater distance.

However, while there are good estimates of the enclosed mass of the Milky Way within different radii, it is not known where the halo of the Milky Way finally fades out (or even if the size of the halo is a very meaningful concept). So the only way to get at the total mass of the Milky Way is to observe its effect on other galaxies. The simplest but most robust of these comes from an analysis of the mutual dynamics of the Milky Way and M31 (the Great Andromeda Galaxy): it is known as the timing argument. This is discussed in the next section.


7.3 The Mass of the Galaxy from Dynamical Timing Arguments

The dynamical timing argument relies on modelling the dynamics of the Galaxy and nearby galaxies. The Local Group contains two substantial spiral galaxies, the Galaxy and M31 (it does also contain one less massive spiral, M33, several irregular galaxies of modest mass, and numerous low mass dwarfs). We shall first consider the mass constraint that can be obtained from the dynamics of M31 and the Galaxy, ignoring the other Local Group galaxies.

The observational inputs are (i) M31 is 750 kpc away, and (ii) the Milky Way and M31 are approaching at 121 km s$^{-1}$. (The transverse velocity of M31 is poorly determined at present.) A simple approximation for their dynamics is to suppose that they started out at the same point moving apart with initial velocities from the Big Bang, and have since turned around because of mutual gravity. This is not strictly true of course, because galaxies had not already formed at the Big Bang; however it is thought that galaxies (at least galaxies like these) formed early in the history of the Universe, so the approximation may be acceptable. Writing l for the distance of M31 from the Galaxy, and M for the combined mass of both systems, the equation of motion for the reduced Keplerian one-body problem is

$$\frac{d^2 l}{dt^2} = -\frac{GM}{l^2} . \qquad (7.1)$$

Here we shall count time from the Big Bang, so that t = 0 refers to the Big Bang. The current time and separation are t0 and l0.

In considering a Keplerian problem without perturbation we are, of course, assuming that the gravity from Local Group dwarfs and the cosmological tidal field is negligible; but as there are no other large galaxies within a few Mpc this seems a fair approximation.

It is not obvious how to solve this nonlinear equation, but fortunately the solutions are well known and easy to verify. There are actually three solutions, depending on the precise circumstances. One solution applies to the case where the combined mass M is too small to halt the expansion and the two galaxies drift further apart for ever: this is not the case we have here. A second solution applies to the limiting case where the mass is just insufficient to stop the motion apart (so dl/dt → 0 and l → ∞ as t → ∞). The third solution applies to the case where the mass is great enough to halt the drift apart and the galaxies fall back toward each other: it is this case that we have here, where the two galaxies are already falling towards each other.

This solution is most conveniently expressed in parametric form, as

$$\begin{aligned}
t &= \tau_0\,(\eta - \sin\eta) , \\
l &= \left(GM\tau_0^2\right)^{1/3}(1 - \cos\eta) .
\end{aligned} \qquad (7.2)$$

Here τ0 is an integration constant. The other integration constant has been eliminated by the boundary condition that the two galaxies (or at least the material from which they formed) were at the same position immediately after the Big Bang: i.e. l = 0 at t = 0.

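It is easy to show that these equations for t and l are a solution to Equation 7.1 by differentiating them to get $d^2l/dt^2$ and substituting the result into Equation 7.1. This can be done using

$$\frac{d^2 l}{dt^2} = \frac{d}{dt}\left(\frac{dl}{dt}\right)
 = \frac{d}{d\eta}\left(\frac{dl}{dt}\right)\cdot\left(\frac{dt}{d\eta}\right)^{-1}
 = \frac{d}{d\eta}\left(\frac{dl}{d\eta}\left(\frac{dt}{d\eta}\right)^{-1}\right)\cdot\left(\frac{dt}{d\eta}\right)^{-1} .$$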
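For completeness, the verification can be written out explicitly. This is a worked sketch of the steps just described, using only Equations 7.1 and 7.2 and the chain rule through the parameter η.

```latex
\begin{align*}
\frac{dt}{d\eta} &= \tau_0\,(1-\cos\eta) , \qquad
\frac{dl}{d\eta} = \left(GM\tau_0^2\right)^{1/3}\sin\eta , \\
\frac{dl}{dt} &= \frac{dl}{d\eta}\left(\frac{dt}{d\eta}\right)^{-1}
  = \left(\frac{GM}{\tau_0}\right)^{1/3}\frac{\sin\eta}{1-\cos\eta} , \\
\frac{d^2l}{dt^2} &= \frac{d}{d\eta}\!\left(\frac{dl}{dt}\right)\left(\frac{dt}{d\eta}\right)^{-1}
  = \left(\frac{GM}{\tau_0}\right)^{1/3}
    \frac{\cos\eta\,(1-\cos\eta)-\sin^2\eta}{(1-\cos\eta)^2}\;
    \frac{1}{\tau_0\,(1-\cos\eta)} \\
 &= -\frac{(GM)^{1/3}\,\tau_0^{-4/3}}{(1-\cos\eta)^2}
  = -\frac{GM}{\left(GM\tau_0^2\right)^{2/3}(1-\cos\eta)^2}
  = -\frac{GM}{l^2} .
\end{align*}
```

The key simplification is $\cos\eta\,(1-\cos\eta) - \sin^2\eta = -(1-\cos\eta)$. Note also that l = 0 at η = 0, which corresponds to t = 0, as the boundary condition requires.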

Figure 7.1: The change in the separation l of the Galaxy and M31 with time t since the Big Bang in the model used for the timing argument.

To determine the total mass M , we first consider the dimensionless quantity

$$\frac{t_0}{l_0}\left(\frac{dl}{dt}\right)_{t_0} = \frac{\sin\eta_0\,(\eta_0 - \sin\eta_0)}{(1 - \cos\eta_0)^2} , \qquad (7.3)$$

where the subscripts in t0 and so on refer to the current time, as is conventional in cosmology. The quantity on the left-hand side can be calculated directly from observational data. Inserting the observed values of l0 = 750 kpc and $(dl/dt)_{t_0} = -121$ km s$^{-1}$ and a plausible value of 14 Gyr for t0 (the age of the Universe), we get

$$\frac{\sin\eta_0\,(\eta_0 - \sin\eta_0)}{(1 - \cos\eta_0)^2} = -2.32 .$$

This can be solved numerically to give η0 = 4.28. Inserting these values into t0 = τ0(η0 − sin η0) from Equation 7.2, we get τ0 = 2.70 Gyr. Then $l_0 = (GM\tau_0^2)^{1/3}(1 - \cos\eta_0)$ gives $(GM\tau_0^2)^{1/3}$, and using the value we found for τ0 gives¹

$$M \simeq 4.4\times10^{12}\,M_\odot . \qquad (7.4)$$

From its luminosity and rotation curve, M31 appears to have approximately twice the mass of the Milky Way, i.e. $M_{\rm M31} \simeq 2\,M_{\rm Galaxy}$. Using $M = M_{\rm M31} + M_{\rm Galaxy}$, this implies that the mass of the Milky Way exceeds $10^{12}\,M_\odot$. Estimates for the mass of the luminous part of the Milky Way range from 0.05 to 0.12 $\times\,10^{12}\,M_\odot$, which confirms that the majority of the mass of the Galaxy is unseen (it is dark matter).

It should be noted that this analysis predicts that the Galaxy and M31 will collide, and consequently merge, about 3.0 Gyr in the future. However, it fails to take account of the component of the velocity of M31 tangential to our line of sight. M31 and the Galaxy may have a tangential velocity component, and therefore may have enough angular momentum that they do not actually come together.
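The numerical steps of the timing argument can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' own calculation: it uses a simple bisection on the infalling branch π < η0 < 2π, the conversion 1 km/s ≈ 1.0227 kpc/Gyr, and G ≈ 4.50 × 10⁻⁶ kpc³ M⊙⁻¹ Gyr⁻²; all function and variable names are hypothetical.

```python
import math

KM_S_TO_KPC_GYR = 1.0227       # 1 km/s expressed in kpc/Gyr (approximate)
G_KPC3_MSUN_GYR2 = 4.50e-6     # G in kpc^3 Msun^-1 Gyr^-2 (approximate)

def timing_function(eta):
    """Right-hand side of Equation 7.3."""
    return math.sin(eta) * (eta - math.sin(eta)) / (1.0 - math.cos(eta)) ** 2

def solve_eta(target, lo=math.pi + 1e-6, hi=2.0 * math.pi - 1e-3, tol=1e-12):
    """Bisection for timing_function(eta) = target with pi < eta < 2*pi."""
    f_lo = timing_function(lo) - target
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = timing_function(mid) - target
        if f_lo * f_mid <= 0.0:
            hi = mid                 # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid    # root lies in [mid, hi]
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Observational inputs quoted in the text.
l0_kpc = 750.0
v0_km_s = -121.0
t0_gyr = 14.0

lhs = (t0_gyr / l0_kpc) * v0_km_s * KM_S_TO_KPC_GYR   # left side of Eq. 7.3
eta0 = solve_eta(lhs)
tau0_gyr = t0_gyr / (eta0 - math.sin(eta0))
scale_kpc = l0_kpc / (1.0 - math.cos(eta0))           # (G M tau0^2)^(1/3)
mass_msun = scale_kpc ** 3 / (G_KPC3_MSUN_GYR2 * tau0_gyr ** 2)
milky_way_msun = mass_msun / 3.0                      # if M31 is twice as massive
time_to_collision_gyr = 2.0 * math.pi * tau0_gyr - t0_gyr   # l returns to 0 at eta = 2*pi

print(f"eta0 = {eta0:.3f}, tau0 = {tau0_gyr:.2f} Gyr")
print(f"M = {mass_msun:.2e} Msun, Milky Way = {milky_way_msun:.2e} Msun")
print(f"collision in {time_to_collision_gyr:.1f} Gyr")
```

With these inputs the sketch gives values close to those quoted in the text: η0 near 4.3, τ0 near 2.7 Gyr, a combined mass of roughly 4.4 × 10¹² M⊙, and a collision in about 3 Gyr. Small differences in the last digit come from the rounding of the unit conversions.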

The timing argument can be applied not only to the Andromeda Galaxy, but also to Local Group dwarf galaxies (which have much less mass and behave just as tracers). Figure 7.2 shows plots of l against dl/dt for some Local Group dwarfs, along with the predictions of the timing argument for different values of GM/τ0. This uses

$$l = \left(GM\tau_0^2\right)^{1/3}(1 - \cos\eta) \quad\text{and}\quad \frac{dl}{dt} = \left(\frac{GM}{\tau_0}\right)^{1/3}\frac{\sin\eta}{1 - \cos\eta} . \qquad (7.5)$$

This model assumes that the dwarfs have been moving on radial trajectories since the Big Bang.

Figure 7.2: Distances and velocities of six Local Group dwarf galaxies, and predictions for different values of GM/τ0 (by Alan Whiting).

¹ It is useful to remember G in astrophysical units as $4.50\times10^{-15}\,M_\odot^{-1}$ pc$^3$ yr$^{-2}$.

7.4 Kinematics in the Solar Neighbourhood

The Milky Way is a differentially rotating system. The local standard of rest (LSR) is asystem located at the Sun and moving with the local circular velocity (which is ' 220km s−1).The Sun has its own peculiar motion of ' 13 km s−1 with respect to the LSR.

The rotation velocity and its derivative at the solar position are traditionally expressedin terms of Oort’s constants:

A ≡ 1

2

(

R− ∂vφ

∂R

)

at R = R0

B ≡ − 1

2

(

R+∂vφ

∂R

)

at R = R0 (7.6)

Observations show that $A = +14.4 \pm 1.2$ km s$^{-1}$ kpc$^{-1}$ and $B = -12.0 \pm 2.8$ km s$^{-1}$ kpc$^{-1}$.

One reason that these parameters are useful is that $A$ vanishes for solid body rotation (i.e. $A = 0$ when the angular velocity $\Omega(R)$ = constant, because then $\partial v_\phi/\partial R = v_\phi/R$). Another useful property is that the gradient of the rotational velocity is $\partial v_\phi/\partial R = -(A + B)$ at $R = R_0$, which means that $A + B = 0$ if the rotation curve is flat (because $v_\phi$ is the same as $v_{\rm circ}$, and therefore we have $\partial v_\phi/\partial R = \partial v_{\rm circ}/\partial R = 0$ for $v_{\rm circ}$ = constant). Therefore calculating $A + B$ is a test of whether the Galaxy has a flat rotation curve close to the Sun's distance from the Galactic Centre. Similarly, the angular velocity in the solar neighbourhood is $\Omega_0 = (v_\phi/R)|_{R = R_0} = A - B$.
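As a quick numerical illustration of these relations, the quoted values of $A$ and $B$ give the local angular velocity and the local slope of the rotation curve directly. The sketch below assumes a solar Galactocentric distance `R0` of 8.5 kpc, which is an assumption made here for illustration and is not stated in this section.

```python
# Observed Oort constants quoted in the text, in km/s/kpc.
A = 14.4
B = -12.0

# Assumed distance of the Sun from the Galactic Centre, in kpc (illustrative).
R0 = 8.5

omega0 = A - B        # angular velocity at R0, km/s/kpc
dv_dR = -(A + B)      # local gradient of the rotation curve, km/s/kpc
v_phi = omega0 * R0   # implied rotation speed at R0, km/s

print(f"Omega_0 = {omega0:.1f} km/s/kpc")
print(f"dv/dR   = {dv_dR:.1f} km/s/kpc")
print(f"v_phi   = {v_phi:.0f} km/s for R0 = {R0} kpc")
```

The small value of $|A + B|$ compared with $A - B$ is what one expects for a rotation curve that is close to flat near the Sun, within the quoted uncertainties.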


The radial and tangential components of the velocity of stars or gas in circular orbit, $v_r$ and $v_t$, can be written as functions of the galactic longitude $l$ as

$$ v_r \simeq A\, d \sin(2l) , \qquad v_t \simeq A\, d \cos(2l) + B\, d \qquad (7.7) $$

locally (within about 1 kpc), where $d$ is the distance from the Sun.

The advantage of the Oort constants $A$ and $B$ is that they describe the motions of stars around the Sun in the Galaxy, and they can be measured from simple velocity and distance data. But now that we have accurate proper motions from the Hipparcos satellite mission, and hence (combining with ground-based line-of-sight velocities) three-dimensional stellar velocities in the solar neighbourhood, $A$ and $B$ are less important.
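Equation (7.7) is simple to evaluate for a given star. The following is a small sketch using the quoted values of $A$ and $B$ as defaults; the function name and the sample distance and longitude are illustrative choices, not part of the course material.

```python
import math


def oort_velocities(d_kpc, l_deg, A=14.4, B=-12.0):
    """Equation (7.7): return (v_r, v_t) in km/s for distance d (kpc), longitude l (deg).

    A and B are the Oort constants in km/s/kpc. Valid only locally (d <~ 1 kpc).
    """
    two_l = math.radians(2.0 * l_deg)
    v_r = A * d_kpc * math.sin(two_l)
    v_t = A * d_kpc * math.cos(two_l) + B * d_kpc
    return v_r, v_t


# Example: a star 0.5 kpc away at longitude 45 degrees.
v_r, v_t = oort_velocities(0.5, 45.0)
print(f"v_r = {v_r:+.1f} km/s, v_t = {v_t:+.1f} km/s")
```

The double-angle dependence means that $v_r$ peaks at $l = 45°$ and $225°$ and vanishes towards the Galactic Centre and anticentre.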

If you take the average (three-dimensional) velocity and dispersions of any class of stars in the solar neighbourhood, then $\langle v_R \rangle$ and $\langle v_z \rangle$ turn out to be nearly zero, while $\langle v_\phi \rangle$ is such that $\langle v_\phi \rangle - v_{\rm LSR}$ is negative and $\propto \sigma_{RR}$. This is known as the asymmetric drift and essentially expresses the degree of rotational support versus pressure support. Young stars are almost entirely supported by $\langle v_\phi \rangle$, like the gas that produced them. The asymmetric drift for young stars is therefore nearly zero, because $\langle v_\phi \rangle \simeq v_{\rm LSR}$. Older stars pick up increasing amounts of pressure support in the form of $\sigma_{RR}$; they then need less $v_\phi$ to support them, and thus tend to lag behind the LSR. The linear relation can be derived from the Jeans equations, but we won't go through that because you've probably had enough of Jeans equations for now...

When examined in detail using proper motions from the Hipparcos astrometry satellite, the velocity structure in the solar neighbourhood is more complicated than anyone expected. Figure 7.3 shows a reconstruction of the stellar $(u, v)$ (i.e., radial and tangential velocity) distribution in the solar neighbourhood for stars in different ranges of the main sequence.$^2$ Notice the clumps in the velocity distribution which appear for stars of all ages. (And these are clumps only in velocity space, not in real space.) The idea that there are groups of stars at similar velocities is itself not new; it actually dates from the early proper motion measurements of nearly a century ago. But these 'streams' have generally been interpreted as groups of stars which formed in the same complex and were later stretched in real space over several galactic orbits. The surprising new finding is that the 'streams' are seen for stars of all ages, which indicates a dynamical origin; they seem to be wanting to tell us something interesting about Milky Way dynamics, but as yet we don't know what.

7.5 Dynamics of the Galactic Disc

The orbits of stars in galaxies are, in general, not closed paths, as we saw in Chapter 2. This is equally true of stars in the disc of our Galaxy. The tangential ($v_\phi$) component dominates for disc stars ($v_\phi$ is much larger than the $v_R$ and $v_z$ components). We can therefore break the motion of disc stars into two parts: a uniform motion about the Galactic Centre, plus the motion relative to this uniform rotation. This second part is called an epicycle. The epicycle is very nearly an ellipse, but the period of the motion around the epicycle is not the same as the period of the uniform component of the motion. (The term epicycle was used historically for the complicated system of cycles that was used to fit the motion of the planets before Kepler explained the elliptical orbits about the Sun.)

7.6 The Disc (or Thin Disc)

The disc of the Galaxy contains mostly stars, with some gas. The stars are distributed with an exponential density profile in both the $R$ and $z$ directions. The density $\rho(R, z)$ is therefore

2 The Schwarzschild ellipsoid and its vertex deviation that you may find in textbooks should now be considered obsolete; they are essentially the result of washing out the structure in Figure 7.3.


Figure 7.3: Distribution of radial ($u$) and tangential ($v$) velocities of main sequence stars in the solar neighbourhood, recently reconstructed from Hipparcos proper motions by Walter Dehnen (1998). The upper left panel is for the youngest (and bluest) stars; these are estimated to be < 0.4 Gyr old. The upper right panel is for stars younger than 2 Gyr, and the lower left panel is for stars younger than 8 Gyr. The lower right panel shows the combined distribution for all main sequence stars. The Sun is at (0, 0) and the LSR is marked by a triangle.

Figure 7.4: Epicyclic orbits. The first diagram shows the orbit of a star in an elliptical orbit in the disc of a spiral galaxy as viewed in an inertial frame. It moves in a "rosette" pattern. The middle diagram shows the same orbit as viewed in a rotating frame, with the frame rotating with uniform motion and with a period equal to the orbital period of the star. The third diagram shows the orbit of the star in the centre diagram in greater detail, showing the epicyclic motion.


described by

$$ \rho(R, z) = \rho_0\, e^{-R/h_R}\, e^{-|z|/h_z} , $$

where $h_R$ and $h_z$ are the scale lengths in the $R$ and $z$ directions, and $\rho_0$ is a constant. Observations show that $h_R \simeq 3.5$ kpc. The vertical scale height $h_z$ differs for stars of different ages (young stars have smaller scale heights), but $h_z = 250$ pc is a typical value.
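The double-exponential profile is easy to explore numerically. The following is a minimal sketch using the scale lengths quoted above as defaults; the normalisation `rho0 = 1` is arbitrary, and the sample position (8.5 kpc, the assumed solar radius) is an illustrative choice.

```python
import math


def disc_density(R_kpc, z_kpc, rho0=1.0, h_R=3.5, h_z=0.25):
    """Double-exponential disc density at cylindrical radius R and height z (kpc).

    rho0 is the (arbitrary) central normalisation; h_R and h_z are in kpc.
    """
    return rho0 * math.exp(-R_kpc / h_R) * math.exp(-abs(z_kpc) / h_z)


# Density in the plane at an assumed solar radius of 8.5 kpc,
# and one vertical scale height above the plane at the same radius.
in_plane = disc_density(8.5, 0.0)
one_scale_height_up = disc_density(8.5, 0.25)
print(f"rho(8.5, 0)/rho0    = {in_plane:.4f}")
print(f"rho(8.5, 0.25)/rho0 = {one_scale_height_up:.4f}")
print(f"ratio               = {one_scale_height_up / in_plane:.4f}")  # equals 1/e
```

Because the profile is separable, moving up by one scale height reduces the density by a factor $e$ at every radius.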

The disc is rotationally supported with a circular velocity $v_{\rm circ} \simeq 220$ km s$^{-1}$ at the Sun's distance from the Galactic Centre. There is a small velocity dispersion around this, of 15 km s$^{-1}$ for young stars and 40 km s$^{-1}$ for old ones. Young stars form in the gas and naturally have a small velocity dispersion about the mean rotation. The greater velocity dispersion of older stars has probably been caused by perturbations of the stars during encounters with giant molecular clouds. We learnt in Chapter 2 that stars are collisionless in galaxies. However, encounters between stars and giant molecular clouds can perturb stellar velocities in galactic discs to a limited degree over the lifetime of a spiral galaxy.

Heavy element abundances in disc stars are close to the solar values. Typical metallicities are [Fe/H] = $-0.4$ to $+0.2$.

The gas and its associated dust are concentrated close to the Galactic plane. The gas moves in circular orbits. The H I gas layer flares and warps at large radius.

The main disc is often called the thin disc to distinguish it from the thick disc, described below.

7.7 The Thick Disc

The term thick disc is usually given to a distribution of stars that is more extended in the vertical direction (perpendicular to the plane) than the main Galactic disc (the thin disc). The term is associated with stars, not gas. It consists of moderately metal-poor, older stars, with [Fe/H] close to $-0.6$. The system is rotationally supported, but its rotation velocity is slightly smaller than that of the thin disc, a consequence of its stars having a larger velocity dispersion than those of the thin disc. The asymmetric drift is 30 to 50 km s$^{-1}$. Only about 2% of the stars in the solar neighbourhood belong to the thick disc. The density distribution, like that of the thin disc, is a double exponential function, with probably a comparable radial scale length to the thin disc, but the vertical scale height is $h_z \simeq 1.3$ kpc. There has been some controversy over whether it is a distinct component of the Galaxy in its own right, or is made merely out of a small number of disc stars with extreme metallicities and kinematics.

7.8 The Bulge

The bulge is a spheroidally distributed, but flattened, system in the central regions of the Galaxy, confined to the inner 2 kpc. Its stars are old. They show a very wide range in metallicity, ranging from [Fe/H] = $-1$ to $+1$. It is largely pressure supported.

7.9 The Bar

There is little doubt now that the distribution of stars in the region of the Milky Way bulge is triaxial: there is a (rotating) bar with the positive $l$ side nearer to us and moving away. The evidence for this was at first indirect, and took the following form. Consider gas in the ring, which must move on closed orbits. If it moved on circular orbits in the disc, and we measured its Galactic longitude $l$ and line of sight velocity $v$, then all the gas at positive $l$ (i.e. on one side of the Galactic Centre) would have one sign for $v$ and similarly all the gas at negative $l$ (on the other side) would have the opposite sign for $v$. In fact gas at positive $l$ is seen with both signs for $v$, and likewise at negative $l$. So the gas orbits must be non-circular,


and hence the gravitational potential must be non-circular in the disc. This suggests a bar and indeed the observed gas kinematics is well fitted by a bar.

Figure 7.5: Schematic of the bar in the Milky Way Bulge, viewed from the North Galactic pole (left), and from the Sun (right). (From Blitz and Spergel, ApJ, 1991. The right panel uses minus the usual convention for $l$.)

The features of a bar can in fact be seen in an infrared map of the bulge, if you know what to look for. Figure 7.5 shows a bar in the plane, and its effect on an $(l, b)$ map.

1. The side nearer to us is brighter. Contours of constant surface brightness are further apart in both $l$ and $b$ on the nearer side.

2. Very near the centre, the further side appears brighter, so the brightest spot is slightly to the further side of $l = 0$. The reason is that on the further side our line of sight passes through a greater depth of bar material, which more than compensates for its being slightly further away.

Feature 1 can be discerned in many different data sets; feature 2 is harder to find, and only just shows up in the COBE maps of the bulge.

7.10 The Galactic Centre

Observing the centre of the Galaxy is extremely difficult in the optical because the extinction caused by dust in the Galactic plane is approximately 30 magnitudes in the V band. Things are not so extreme in the infrared, and in the K band (2.2 µm) the extinction is a more moderate 3–4 mag. However, the available data show that there is a very dense star cluster at the Galactic centre, with a compact radio source, Sagittarius A* (abbreviated Sgr A*) at its centre. There is a ring or disc of gas around the centre, about 5 pc across, detected by its molecular emission.

Orbital velocities immediately around Sgr A* are very high. The available evidence suggests that there is a compact massive object at the centre of Sgr A*. This is probably a central black hole with a mass of $(1$–$3) \times 10^6 M_\odot$.

7.11 The Stellar Halo

The stellar halo is the spheroidally distributed, slightly flattened, system that extends far from the disc. It includes diffuse field stars and globular clusters. The stars are very old


(13 Gyr) and are very metal-poor, having [Fe/H] $\simeq -1$ to $-2.5$. It makes only a very small contribution to the total mass of the Galaxy. It is a pressure-supported system. The velocity dispersion is $\sigma = 200$ km s$^{-1}$. The asymmetric drift is $\simeq 190$ km s$^{-1}$. These two figures mean that the kinematics of halo stars are very different from those of the disc. Only about 1/1000th of the stars in the solar neighbourhood belong to the halo. Examples of halo stars in the solar neighbourhood, which are valuable for studying chemical abundances in very metal-poor stars, have often been found from their high proper motions: relatively nearby halo stars usually have large motions across the sky compared with typical (disc) stars.

7.12 Globular Clusters

The Galaxy contains about 150 globular clusters. They are compact systems of $\sim 10^5$ stars. Many of these are very metal-poor, are distributed spheroidally and have randomly orientated orbits: they appear to be associated with the stellar halo.

However, some globular clusters are only moderately metal-poor. Those globular clusters having [Fe/H] $> -0.8$ form a more flattened system. They may be associated with the thick disc.

7.13 The Dark Matter Halo

The dark matter halo appears to extend out to large radii, to $\gtrsim 100$ kpc, as is shown by studies of the dynamics of companion dwarf galaxies, for example. It dominates the mass of the Galaxy. It appears to be spheroidally distributed: it does not appear to be concentrated in the Galactic disc, as we saw in Section 2.20.

The nature of dark matter is unknown, but it appears not to be in the form of compact objects of roughly stellar mass, such as white dwarfs or brown dwarfs, as microlensing surveys have shown (at least, only a small proportion of the dark matter can be in stellar mass compact objects). Neither is it likely to be in the form of dark compact objects having masses $\gtrsim 1000 M_\odot$, such as massive black holes. Objects of this type would perturb the dynamics of disc stars, thickening the disc, which is not observed on a significant scale.

Constraints from primordial nucleosynthesis imply that baryonic matter only contributes 4% of the closure density of the Universe. Cosmological results indicate that matter contributes 27% of the closure density. Therefore we expect that most of the dark matter in the Universe is not in the form of baryonic matter, if the cosmological models are correct. This is consistent with the results of Galactic microlensing surveys.

The dark matter must be composed of individual particles, be they subatomic particles or astronomical bodies. To support a spheroidal distribution within the Galaxy's potential, these particles must be moving on mostly randomly orientated trajectories with a velocity dispersion of 200 to 400 km s$^{-1}$. The dark matter particles do not dissipate significant energy in interactions with each other (or with the luminous matter), otherwise they would settle down to a rotating disc or a single mass at the centre of the Galaxy's potential, which they have not done.

7.14 The Local Group

The Galaxy lies in a system of more than 40 galaxies called the Local Group, about 1 Mpc across. There are two large spiral galaxies (the Galaxy and the Great Andromeda Galaxy, M31) and one spiral (M33) of slightly lower mass. There are a few irregular galaxies, most notably the Large Magellanic Cloud and the Small Magellanic Cloud, but these are not particularly massive. All the other members are dwarf galaxies, either dwarf irregulars, dwarf ellipticals or dwarf spheroidals.


A large majority of the galaxies are companions of either the Galaxy or M31. For example, the Magellanic Clouds are situated 50 to 60 kpc from the Galaxy. Several of the dwarf spheroidal galaxies lie within 200 kpc.

7.15 The Formation of the Galaxy

A fundamental question relating to the Galaxy is how it was formed. Some stars in the Galaxy, such as those in the thick disc and bulge, and particularly in the halo, are old. The oldest stars were formed relatively early in the history of the Universe, indicating that the Galaxy had a relatively early origin. Star formation has continued in the disc, at least, throughout its history.

There are two main scenarios for the formation of the Galaxy:

• the monolithic collapse model, and

• the merging of subunits.

The monolithic collapse model was developed in the 1960s, particularly by Eggen, Lynden-Bell and Sandage. In this picture, the Galaxy formed by the collapse of a protogalactic cloud of gas that had some net angular momentum. The gas initially had a very low metallicity. The collapse occurred mostly in the radial direction and some modest star formation occurred during this time. This produced very metal-poor stars with randomly-oriented elongated orbits, which today are observed as the stellar halo. The gas settled into a broad rotating disc, which was moderately metal-poor by this time as a result of the enrichment of the gas by the heavy elements created by halo stars. The rotation was the result of the net angular momentum of the protogalactic cloud. Star formation in this cooling, settling disc produced thick disc stars which are rotationally supported but have an appreciable velocity dispersion. The gas continued to settle into a thinner, stable, rotating disc. Residual gas that fell to the central regions formed the bulge stars. Star formation continued in the gas disc at a gradual rate, building up the stars of the thin disc.

This model predicted the main features of the Galaxy, and did so very neatly. It explained the fast rotation of disc stars with their near-solar metallicities, and the randomly orientated orbits of halo stars with their very low metallicities and great ages.

The merging scenario instead maintained that the Galaxy was built up by the merging and accretion of subunits. It was first developed by Searle and Zinn in 1977 for the stellar halo. The merging model is strongly supported by detailed computer simulations of galaxy formation. These simulations predict that the primordial material from the Big Bang clumped into large numbers of dark matter haloes that also contained gas. These small haloes then merged through their mutual gravitational attraction, building up larger haloes in the process. The gas formed some stars in these dark matter clumps. A large number of these subunits produced our Galaxy, with the gas settling into a rotating disc as a natural consequence of the dissipative, collisional nature of the gas. Star formation in the disc then formed the disc stars. The stars of the Galactic stellar halo may have come from the accretion of subunits that had already formed some stars. This process of building galaxies by the merging of clumps to form successively larger and larger units is known as hierarchical galaxy formation.

Mergers certainly played an important role in the formation and subsequent evolution of the Galaxy. Detailed computer modelling of galaxy formation provides powerful evidence in favour of this picture. Indeed the merging process may well be occurring today.

7.16 The Sagittarius Dwarf

We shall end our discussion of our Galaxy with the Sagittarius Dwarf. Although it has been in the past an independent galaxy, it is today plunging into our Galaxy and appears to be


Figure 7.6: A partial map of the Sagittarius dwarf galaxy, from RR Lyrae variables (horizontal axis: X in kpc). We are at (0, 0), the ellipses around (8.5, 0) represent the bulge, and the four circles indicate the four microlensing survey fields where the RR Lyraes were found. (From Minniti et al. 1997.)

in the process of being pulled apart by the gravitational influence of our Galaxy.

It may seem amazing that this fairly substantial companion galaxy of the Milky Way remained undiscovered until 1993; the reason is that it is located behind the bulge, and thus has the densest part of the Milky Way in the foreground as camouflage. We do not know in detail how large the Sagittarius Dwarf is, because its stars are difficult to distinguish from the foreground stars. A lower limit on its size comes indirectly from microlensing surveys, because they detect RR Lyrae variables in their fields. Figure 7.6 shows its rough extent.

The Sagittarius Dwarf is a highly elongated body. It includes some globular clusters, including M54, and is associated with a very faint star stream. It contains mostly moderately old or very old stars. It is almost certainly being tidally stretched as it passes through the Milky Way halo: that would explain its long, thin structure. It will probably be totally disrupted over the next $10^8$–$10^9$ years, with its stars being lost into the Galaxy's stellar halo.

The Sagittarius Dwarf provides evidence that the Galaxy does accrete small companion galaxies. It may well have consumed many such galaxies in the past. Indeed, a recent study of infrared observations has found debris from a dwarf galaxy, which has been called the Canis Major Dwarf, situated within the Galaxy about 13 kpc from the Galactic centre. This provides evidence that merging is a significant process in the evolution of galaxies, and possibly in their formation.


Appendix A

Revision of Astronomical Quantities

A1. Astronomical Units

At a research and academic level in astronomy, large distances are expressed in parsecs (pc), while at a popular level they tend to be given in light years (ly).

1 pc = $3.0857 \times 10^{16}$ m = 3.2616 ly

Distances on the scale of galaxies are often expressed in kiloparsecs (kpc), with 1 kpc ≡ 1000 pc. Distances between galaxies and on the cosmological scale are usually expressed in megaparsecs (Mpc), with 1 Mpc ≡ $10^6$ pc.

Distances on the scale of the Solar System are measured in terms of the semi-major axis of the Earth's orbit, the astronomical unit (AU), with 1 AU = $1.4960 \times 10^{11}$ m.

Masses are often measured in terms of the mass of the Sun, the solar mass $M_\odot$, with

1 $M_\odot$ = $1.989 \times 10^{30}$ kg

Luminosities, defined as the total power output of radiation (in the form of visible light, infrared, ultraviolet etc.), are often expressed in terms of the luminosity of the Sun, the solar luminosity $L_\odot$, with

1 $L_\odot$ = $3.826 \times 10^{26}$ W

Angular separations on the sky are measured in degrees (deg or °), minutes of arc (arcmin or ′) and seconds of arc (arcsec or ′′). The abbreviations arcmin and arcsec are used in preference to min and sec to distinguish them from the minutes and seconds of time that are used when expressing coordinates of right ascension on the sky.

Wavelengths of light are sometimes expressed in Angstrom units (Å), with 1 Å ≡ $10^{-10}$ m, in preference to the nanometre (nm, with 1 nm ≡ $10^{-9}$ m ≡ 10 Å). Wavelengths of infrared radiation are often expressed in micrometres (µm), with 1 µm ≡ $10^{-6}$ m. The micrometre is often called the micron.

Time is often expressed in years, with 1 yr = $3.1557 \times 10^7$ s. Long time spans are often measured in Gigayears, with 1 Gyr ≡ $10^9$ yr = $3.1557 \times 10^{16}$ s.

In all other instances, S.I. units should be used. Unfortunately, some older systems, such as cgs units, still persist in research articles and textbooks.

A2. Astronomical Magnitudes

The brightnesses of astronomical objects in the optical, near infrared and near ultraviolet regions of the spectrum are expressed on a logarithmic scale called magnitudes. A magnitude is the brightness integrated over some range of wavelength, and consequently any particular magnitude applies to a certain region of the spectrum. This region of the spectrum is conventionally selected by passing the light through a coloured filter and that region of the spectrum is called the filter's waveband, passband or the photometric band.

Commonly used wavebands are the U band in the near ultraviolet (around 360 nm wavelength), the B band in the blue (around 440 nm), the V band in the green/yellow (around 550 nm), the R band in the red (around 640 nm) and the I band in the near infrared (around


790 nm). It is always necessary to specify which waveband is being used when magnitudes are quoted (and precisely which definition of passband is being used).

The apparent magnitude $m_F$ of an object in some waveband $F$ is related to the flux of radiation $F_F$ in that band at the top of the Earth's atmosphere by

$$ m_F = C_F - 2.5 \log_{10}(F_F) , $$

where $C_F$ is a calibration constant for that band. The constant 2.5 has been chosen to maintain consistency with historical definitions of magnitudes. A fundamental consequence of this definition is that brighter objects have smaller magnitudes. For example, a magnitude 16.3 star is brighter than a magnitude 19.7 star.

Therefore two objects that have fluxes $F_1$ and $F_2$ in some band will have apparent magnitudes $m_1$ and $m_2$ in that band that are related by

$$ m_1 - m_2 = - 2.5 \log_{10} \left( \frac{F_1}{F_2} \right) $$

and equivalently,

$$ \frac{F_1}{F_2} = 10^{-\frac{2}{5}(m_1 - m_2)} . $$
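The flux-ratio relation can be checked with a couple of lines of code. This is a small sketch; the function name is an illustrative choice, and the sample magnitudes are the two stars mentioned in the example above.

```python
def flux_ratio(m1, m2):
    """Return F1/F2 for two objects with apparent magnitudes m1 and m2 in one band."""
    return 10.0 ** (-0.4 * (m1 - m2))


# A difference of 5 magnitudes corresponds to a factor of 100 in flux.
print(f"{flux_ratio(0.0, 5.0):.1f}")

# The magnitude 16.3 star is brighter than the magnitude 19.7 star by this factor.
print(f"{flux_ratio(16.3, 19.7):.1f}")
```

The second ratio is greater than one, confirming that the smaller magnitude belongs to the brighter object.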

The absolute magnitude is the magnitude that an object would have if it were observed at a distance of precisely 10 pc. The absolute magnitude therefore measures the luminosity, or total power output, in the photometric band. Absolute magnitudes are denoted by a capital M with a subscript indicating the photometric band, such as $M_V$ for the V-band absolute magnitude.

The apparent magnitude $m_F$ and the absolute magnitude $M_F$ of some object through some filter $F$ are related by

$$ m_F - M_F = 5 \log_{10}(D/{\rm pc}) - 5 + A_F , $$

where $D$ is the distance (here expressed in parsecs) and $A_F$ is the loss of light due to extinction by intervening material (usually interstellar dust). This equation has to be modified for distant galaxies, for which cosmological effects are important, by using

$$ m_F - M_F = 5 \log_{10}(D_L/{\rm pc}) - 5 + A_F + k_F , $$

where $D_L$ is the luminosity distance (again expressed here in parsecs), and $k_F$ is known as the k-correction (it expresses the effect of redshift on the passband).

Apparent magnitudes are often denoted simply by the name of the waveband, rather than by a letter $m$ followed by a subscript indicating the band. For example, $V$ as well as $m_V$ denotes the V-band apparent magnitude, and $B$ as well as $m_B$ denotes the B-band apparent magnitude.

The difference between magnitudes in different wavebands is known as a colour index. The colour index is an excellent measure of the colour of an object. For example, the $(B - V)$ colour index measures the relative brightness of an object in the blue and yellow parts of the spectrum.

The calibration constants $C_F$ for different photometric bands are usually defined so that a star of spectral type A0 V (a relatively hot main sequence star) has zero colour indices. So the bright star Vega, which happens to be of type A0 V, has $(B - V) = 0.00$ and $(V - R) = 0.00$.

As an example of the use of magnitudes, if a star is observed to have a B-band apparent magnitude of $B = 17.85$ mag and a V-band apparent magnitude of $V = 17.05$ mag, its $(B - V)$ colour index will be $(B - V) = 17.85 - 17.05 = 0.80$ mag. If it lies at a distance of $D = 2000$ pc and there is negligible interstellar extinction between us and the star (i.e. $A_B = A_V = 0.00$), the absolute magnitudes will be $M_B = B - 5 \log_{10}(D/{\rm pc}) + 5 - A_B = +6.34$ mag and $M_V = V - 5 \log_{10}(D/{\rm pc}) + 5 - A_V = +5.54$ mag.
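The arithmetic of this worked example can be reproduced in a few lines. The sketch below simply rearranges the distance-modulus relation for $M_F$; the function name is an illustrative choice.

```python
import math


def absolute_magnitude(m, distance_pc, extinction=0.0):
    """Absolute magnitude from apparent magnitude m, distance in pc, extinction in mag."""
    return m - 5.0 * math.log10(distance_pc) + 5.0 - extinction


B, V, D = 17.85, 17.05, 2000.0   # values from the worked example in the text

print(f"(B - V) = {B - V:.2f} mag")
print(f"M_B     = {absolute_magnitude(B, D):+.2f} mag")
print(f"M_V     = {absolute_magnitude(V, D):+.2f} mag")
```

Note that the colour index is the same whether it is formed from apparent or absolute magnitudes when the extinction is negligible, because the distance term cancels.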


Appendix B

Revision of Newtonian Gravitation

B1. Summary

This appendix summarises some basic results relating to gravitation from Newtonian Mechanics. This information covers most of the basic principles from physics about gravitation that are needed for the course.

B2. The Gravitational Field from a Point Mass

The attractive force between two particles of mass $m_1$ and $m_2$ a distance $r$ apart is

$$ F_{\rm grav} = \frac{G m_1 m_2}{r^2} , $$

where $G$ is the universal constant of gravitation, with $G = 6.673 \times 10^{-11}$ m$^3$ kg$^{-1}$ s$^{-2}$.

Using Newton's Second Law, the acceleration due to gravity at a distance $r$ from a point mass $m$ is

$$ g = \frac{G m}{r^2} , $$

directed towards the point mass. The acceleration due to gravity is the gravitational field strength.

The gravitational potential a distance $r$ from a point mass $m$ is

$$ \Phi = - \frac{G m}{r} . $$
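For galactic problems it is often more convenient to express $G$ in parsecs, solar masses and years (or km s$^{-1}$) than in S.I. units. The following sketch performs the conversion using the unit values quoted in Appendix A; the variable names are illustrative.

```python
# SI values as quoted in these notes.
G_SI = 6.673e-11   # m^3 kg^-1 s^-2
PC = 3.0857e16     # m
MSUN = 1.989e30    # kg
YR = 3.1557e7      # s
KM = 1.0e3         # m

# G in pc^3 Msun^-1 yr^-2 (convenient when times are in years).
G_PC_MSUN_YR = G_SI * MSUN * YR**2 / PC**3

# G in pc Msun^-1 (km/s)^2 (convenient when velocities are in km/s).
G_PC_MSUN_KMS = G_SI * MSUN / (PC * KM**2)

print(f"G = {G_PC_MSUN_YR:.3e} pc^3 Msun^-1 yr^-2")
print(f"G = {G_PC_MSUN_KMS:.3e} pc Msun^-1 (km/s)^2")
```

With these inputs the two values come out at about $4.50 \times 10^{-15}$ and $4.30 \times 10^{-3}$ respectively.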

B3. General Results about Gravitational Fields

The acceleration due to gravity $\mathbf{g}$ at any point is related to the gradient of the gravitational potential $\Phi$ by

$$ \mathbf{g} = - \nabla\Phi , $$

in any gravitational field. The potential is negative at all times, tending to zero at infinite distance.

The potential energy of a particle of mass $m$ at a point in a gravitational field is

$$ V = m\, \Phi , $$

where $\Phi$ is the potential at the point. This definition means that the potential energy is negative.

Gauss's Law relates the integral of the gravitational acceleration over a closed surface to the mass lying inside that surface. If $\mathbf{g}$ is the acceleration due to gravity and $d\mathbf{S}$ is an element of the surface $S$, then

$$ \oint_S \mathbf{g} \cdot d\mathbf{S} = - 4\pi G M_S $$


for any closed surface $S$, where $M_S$ is the total mass contained within the surface. This is the direct equivalent of Gauss's Law for electrostatics ($\oint_S \mathbf{E} \cdot d\mathbf{S} = Q_S/\varepsilon_0$).

Substituting for $\mathbf{g} = -\nabla\Phi$, we also have

$$ \oint_S \nabla\Phi \cdot d\mathbf{S} = + 4\pi G M_S . $$

The Poisson Equation relates the Laplacian of the potential at a point to the mass density. It states that

$$ \nabla^2\Phi = 4\pi G\, \rho , $$

where $\Phi$ is the potential at the point and $\rho$ is the density.
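The Poisson Equation and Gauss's Law are two forms of the same statement. A brief sketch of the connection, using the divergence theorem and writing the enclosed mass as a volume integral of the density (the volume $\mathcal{V}$ enclosed by $S$ is notation introduced here):

```latex
\begin{align*}
\oint_S \nabla\Phi \cdot d\mathbf{S}
  &= \int_{\mathcal{V}} \nabla \cdot (\nabla\Phi)\, dV
   = \int_{\mathcal{V}} \nabla^2\Phi \, dV
  && \text{(divergence theorem)} \\
4\pi G M_S
  &= 4\pi G \int_{\mathcal{V}} \rho \, dV
  && \text{(mass enclosed by } S\text{)} \\
\Rightarrow \quad
\int_{\mathcal{V}} \left( \nabla^2\Phi - 4\pi G \rho \right) dV &= 0
  && \text{for every volume } \mathcal{V} \\
\Rightarrow \quad
\nabla^2\Phi &= 4\pi G \rho .
\end{align*}
```

The last step holds because the integral vanishes for an arbitrary volume, so the integrand itself must vanish at every point.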

B4. The Gravitational Field within a Spherically Symmetric Distribution of Mass

The acceleration due to gravity at a distance $r$ from the centre of a spherically symmetric distribution of mass is

$$ g = \frac{G M(r)}{r^2} $$

where $M(r)$ is the mass interior to a radius $r$, and is directed towards the centre of the distribution. This result does not depend on how the mass is distributed, other than that it is spherically symmetric. Mass outside the radius $r$ does not affect the gravitational field at $r$ in this spherically symmetric case. This is the same acceleration as would be given by a point mass $M(r)$ at the centre of the distribution.

This result can be derived very easily using Gauss's Law.

Consider a spherical surface $S$ of radius $r$ centred on the distribution. The acceleration due to gravity at a point on the surface is $\mathbf{g}$. The magnitude of the acceleration everywhere on the surface is $|\mathbf{g}| \equiv g$, from symmetry.

From Gauss's Law,

$$ \oint_S \mathbf{g} \cdot d\mathbf{S} = - 4\pi G M(r) , $$

where $M(r)$ is the mass inside the surface and $G$ is the universal constant of gravitation. But an element of the surface area $d\mathbf{S}$ is anti-parallel to the acceleration due to gravity $\mathbf{g}$, so $\mathbf{g} \cdot d\mathbf{S} = - |\mathbf{g}|\, |d\mathbf{S}| = - g\, dS$. But since $g$ is constant over the spherical surface,

$$ - g \oint_S dS = - 4\pi G M(r) $$


$$ \therefore \quad - g\, (4\pi r^2) = - 4\pi G M(r) , $$

which gives

$$ g = \frac{G M(r)}{r^2} . $$

This analysis is possible because of the spherical symmetry.
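As a concrete illustration of $g = GM(r)/r^2$, consider a sphere of uniform density, for which $M(r) = M_{\rm tot}(r/R)^3$ inside the sphere. The sketch below is only an example of the result: the uniform-density model and the Earth-like mass and radius are assumptions chosen for illustration and do not come from the notes.

```python
G = 6.673e-11  # m^3 kg^-1 s^-2


def uniform_sphere_mass(r, total_mass, radius):
    """Mass interior to r for a sphere of uniform density (illustrative model)."""
    if r >= radius:
        return total_mass
    return total_mass * (r / radius) ** 3


def g_spherical(mass_interior, r):
    """Magnitude of the inward acceleration at radius r, given the mass inside r."""
    return G * mass_interior / r**2


# Assumed Earth-like values, for illustration only.
M_TOT = 5.972e24   # kg
R_TOT = 6.371e6    # m

for r in (0.5 * R_TOT, R_TOT, 2.0 * R_TOT):
    g = g_spherical(uniform_sphere_mass(r, M_TOT, R_TOT), r)
    print(f"r = {r / R_TOT:3.1f} R   g = {g:6.3f} m/s^2")
```

Inside the uniform sphere $g$ rises linearly with $r$, while outside it falls as $1/r^2$, exactly as for a point mass at the centre.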

B5. Gradient Operators

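The results presented above use various operators from vector calculus. It may be useful to quote mathematical expressions for these explicitly for some important coordinate systems.

In a Cartesian coordinate system $(x, y, z)$ with unit vectors $\hat{\imath}$, $\hat{\jmath}$ and $\hat{k}$, the gradient of any function $A(x, y, z)$ is

$$ \nabla A \equiv \hat{\imath}\, \frac{\partial A}{\partial x} + \hat{\jmath}\, \frac{\partial A}{\partial y} + \hat{k}\, \frac{\partial A}{\partial z} . $$

The Laplacian operator in the Cartesian system is

$$ \nabla^2 A \equiv \nabla \cdot (\nabla A) \equiv \frac{\partial^2 A}{\partial x^2} + \frac{\partial^2 A}{\partial y^2} + \frac{\partial^2 A}{\partial z^2} . $$

In a spherical polar coordinate system $(r, \theta, \phi)$ with unit vectors $\hat{e}_r$, $\hat{e}_\theta$ and $\hat{e}_\phi$, these operators are

$$ \nabla A \equiv \hat{e}_r\, \frac{\partial A}{\partial r} + \hat{e}_\theta\, \frac{1}{r} \frac{\partial A}{\partial \theta} + \hat{e}_\phi\, \frac{1}{r \sin\theta} \frac{\partial A}{\partial \phi} , $$

$$ \nabla^2 A \equiv \frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial A}{\partial r} \right) + \frac{1}{r^2 \sin\theta} \frac{\partial}{\partial \theta} \left( \sin\theta\, \frac{\partial A}{\partial \theta} \right) + \frac{1}{r^2 \sin^2\theta} \frac{\partial^2 A}{\partial \phi^2} , $$

for any scalar function $A(r, \theta, \phi)$.

In a cylindrical coordinate system $(R, \phi, z)$ with unit vectors $\hat{e}_R$, $\hat{e}_\phi$ and $\hat{e}_z$, these operators are

$$ \nabla A \equiv \hat{e}_R\, \frac{\partial A}{\partial R} + \hat{e}_\phi\, \frac{1}{R} \frac{\partial A}{\partial \phi} + \hat{e}_z\, \frac{\partial A}{\partial z} , $$

$$ \nabla^2 A \equiv \frac{1}{R} \frac{\partial}{\partial R} \left( R\, \frac{\partial A}{\partial R} \right) + \frac{1}{R^2} \frac{\partial^2 A}{\partial \phi^2} + \frac{\partial^2 A}{\partial z^2} , $$

for any scalar function $A(R, \phi, z)$.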

B6. Distributions of point masses

In this section we shall consider the gravitational effects of a series of point masses $m_i$ which are located at positions $\mathbf{r}_i$, for $i = 1, \ldots, N$.

The gravitational potential at some position $\mathbf{r}$ caused by the distribution is

$$ \Phi(\mathbf{r}) = - G \sum_{i=1}^{N} \frac{m_i}{|\mathbf{r} - \mathbf{r}_i|} , $$

where $G$ is the constant of gravitation.

The acceleration due to gravity at the point $\mathbf{r}$ is

$$ \mathbf{g} = - G \sum_{i=1}^{N} \frac{m_i}{|\mathbf{r} - \mathbf{r}_i|^3}\, (\mathbf{r} - \mathbf{r}_i) . $$
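These two sums translate directly into code. The following is a minimal direct-summation sketch in S.I. units; the function names and the test configuration are illustrative, and no softening is applied, so the field point must not coincide with any of the masses.

```python
import math

G = 6.673e-11  # m^3 kg^-1 s^-2


def potential(r, masses, positions):
    """Gravitational potential at point r (3-tuple) from a set of point masses."""
    return -G * sum(m / math.dist(r, ri) for m, ri in zip(masses, positions))


def acceleration(r, masses, positions):
    """Acceleration due to gravity at point r (3-tuple) from a set of point masses."""
    g = [0.0, 0.0, 0.0]
    for m, ri in zip(masses, positions):
        sep = math.dist(r, ri)
        for k in range(3):
            g[k] -= G * m * (r[k] - ri[k]) / sep**3
    return tuple(g)


# Two equal masses on the x-axis: the field vanishes at the midpoint by symmetry.
masses = [1.0e10, 1.0e10]
positions = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(potential((0.0, 0.0, 0.0), masses, positions))
print(acceleration((0.0, 0.0, 0.0), masses, positions))
```

The cost of evaluating the field at all $N$ particle positions in this way scales as $N^2$, which is why large simulations use approximate methods instead.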


The internal gravitational potential energy of the distribution of point masses is

$$ V = - \frac{1}{2}\, G \sum_{\substack{i, j \\ i \neq j}} \frac{m_i\, m_j}{|\mathbf{r}_i - \mathbf{r}_j|} . $$
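The factor of $\frac{1}{2}$ in this expression compensates for the double sum counting every pair twice. An equivalent way to compute the energy is therefore to sum once over unordered pairs, as in the following sketch (S.I. units; the function name and example configuration are illustrative).

```python
import math
from itertools import combinations

G = 6.673e-11  # m^3 kg^-1 s^-2


def potential_energy(masses, positions):
    """Internal gravitational potential energy of a set of point masses.

    Each unordered pair (i, j) is visited once, which is equivalent to
    -(1/2) G times the double sum over all i != j.
    """
    total = 0.0
    for (mi, ri), (mj, rj) in combinations(zip(masses, positions), 2):
        total -= G * mi * mj / math.dist(ri, rj)
    return total


# Three unit masses at the corners of an equilateral triangle of unit side.
masses = [1.0, 1.0, 1.0]
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, math.sqrt(3.0) / 2.0, 0.0)]
print(potential_energy(masses, positions) / G)   # three pairs, each contributing -1
```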

B7. Continuous distributions of mass

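The gravitational potential at a position $\mathbf{r}$ in a continuous distribution of mass enclosed in a volume $V$ is given by

$$ \Phi(\mathbf{r}) = - G \int_V \frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, dV' , $$

where $\mathbf{r}'$ is the position vector of the volume element $dV'$, $\rho(\mathbf{r}')$ is the mass density at the position $\mathbf{r}'$, and $G$ is the constant of gravitation. The gradient of this is

$$ \nabla\Phi(\mathbf{r}) = G \int_V \frac{\rho(\mathbf{r}')\, (\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3}\, dV' . $$

So the acceleration due to gravity at the point $\mathbf{r}$ is

$$ \mathbf{g} = - \nabla\Phi(\mathbf{r}) = - G \int_V \frac{\rho(\mathbf{r}')\, (\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3}\, dV' . $$

The internal gravitational potential energy of some distribution of mass is

$$ V = - \frac{1}{2}\, G \int_V \int_V \frac{\rho(\mathbf{r})\, \rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, dV\, dV' , $$

where $\mathbf{r}$ is the position vector of the volume element $dV$ and where $\mathbf{r}'$ is the position vector of the volume element $dV'$.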


Appendix C: Example Problems

Problem 1: Question

Suppose some category of galaxies has an observed surface brightness profile $I(R) = I_0\,f(R/R_0)$ with all galaxies having the same $I_0$ and function f but different galaxies having different $R_0$. If the mass-to-light ratio is constant everywhere then show that

$$L \propto v^4$$

where L is the total luminosity and v is a characteristic velocity.

Problem 2: Question

The Plummer potential has a gravitational potential Φ(r) at a distance r from the centre of a spherically-symmetric mass distribution that is given by

$$\Phi(r) = -\frac{G M_{\rm tot}}{\sqrt{r^2 + a^2}} ,$$

where $M_{\rm tot}$ is the total mass, G is the constant of gravitation, and a is a constant. Derive from this an expression for the mass M(r) interior to a radius r, and show that the density ρ(r) at a radius r is

$$\rho(r) = \frac{3 M_{\rm tot}}{4\pi}\,\frac{a^2}{(r^2 + a^2)^{5/2}} .$$

Problem 3: Question

A family of radial density profiles ρ(r) that has been popular for the theoretical modelling of spherically symmetric galaxies is the Dehnen models. These are defined so that the density profiles are

$$\rho(r) = \frac{q\,a}{4\pi}\,\frac{r^q}{r^3\,(r + a)^{q+1}}\,M_{\rm tot} ,$$

where r is the radial distance from the centre of the galaxy, q is an adjustable parameter, a is a scaling constant (determining the size of the galaxy), and $M_{\rm tot}$ is the total mass. The special case of q = 1, which is called the Jaffe model, is particularly important because it is found to fit the observed I(R) of ellipticals at least as well as the de Vaucouleurs $R^{1/4}$ profile.

What is the mass M(r) interior to a radius r for any value of q?

What is the gravitational potential of a mass distribution having a Jaffe ρ(r)?

The Dehnen models have an interesting limit as q → 0. What is it?

You may use the standard integral

$$\int \frac{r^{q-1}}{(r + a)^{q+1}}\,dr = \frac{1}{q\,a}\,\frac{r^q}{(r + a)^q} + \text{constant} .$$


Problem 1: Answer

Integrating over the surface brightness gives the luminosity of a galaxy to be $L \propto I_0 R_0^2$. Because $I_0$ is constant for all galaxies of this type, $L \propto R_0^2$ for all. The virial theorem implies $M/R_0 \propto v^2$, where M is the mass of a galaxy and v is a typical velocity of stars in a galaxy. Eliminating $R_0$ gives $L \propto M^2 v^{-4}$. Because the mass-to-light ratio is constant, M/L = constant, so $M \propto L$. Substituting for M in $L \propto M^2 v^{-4}$ gives $L \propto L^2 v^{-4}$, which in turn gives

$$L \propto v^4 ,$$

the required result.

This is the same as the Tully-Fisher relation for spiral galaxies, or the Faber-Jackson relation for elliptical galaxies (and observed samples of both types of galaxies do tend to have only a limited range in $I_0$ and standard I(R) profiles).

Problem 2: Answer

Gauss’s Law gives $\int_S \nabla\Phi \cdot d\mathbf{S} = 4\pi G M_S$ for any closed surface S, where Φ is the potential at a point on the surface and $M_S$ is the total mass enclosed within that surface. Consider the surface S to be a spherical surface of radius r centred on the mass distribution. Therefore Φ = Φ(r) is constant over the surface of radius r.

Using a spherical polar coordinate system (r, θ, φ) centred on the mass distribution with unit vectors $\mathbf{e}_r$, $\mathbf{e}_\theta$ and $\mathbf{e}_\phi$,

$$\nabla\Phi \equiv \mathbf{e}_r\,\frac{\partial\Phi}{\partial r} + \mathbf{e}_\theta\,\frac{1}{r}\frac{\partial\Phi}{\partial\theta} + \mathbf{e}_\phi\,\frac{1}{r\sin\theta}\frac{\partial\Phi}{\partial\phi} = \mathbf{e}_r\,\frac{d\Phi}{dr} ,$$

in this case because the ∂Φ/∂θ and ∂Φ/∂φ terms are zero on account of the spherical symmetry. So, ∇Φ is directed radially outwards and |∇Φ| = dΦ/dr.

So ∇Φ and dS are parallel over the whole surface. Therefore ∇Φ · dS = |∇Φ| |dS| cos 0 = |∇Φ| dS, which gives in Gauss’s Law,

$$|\nabla\Phi| \int_S dS = 4\pi G M(r) ,$$

using the fact that |∇Φ| is constant over the surface. Substituting |∇Φ| = dΦ/dr we get

$$\frac{d\Phi}{dr}\,(4\pi r^2) = 4\pi G M(r) \quad\therefore\quad M(r) = \frac{r^2}{G}\,\frac{d\Phi}{dr} .$$

Differentiating the expression for Φ in the question,

$$\frac{d\Phi}{dr} = \frac{G M_{\rm tot}\,r}{(r^2 + a^2)^{3/2}} , \quad\text{which gives}\quad M(r) = \frac{M_{\rm tot}\,r^3}{(r^2 + a^2)^{3/2}} ,$$

the result given in the lectures.

To determine the density ρ, we can consider a thin spherical shell of radius r and thickness dr centred on the mass distribution. The volume of the shell is $4\pi r^2\,dr$ and its mass is $4\pi r^2 \rho(r)\,dr$, where ρ(r) is the density at a radius r. So,

$$\rho(r) = \frac{1}{4\pi r^2}\,\frac{dM}{dr} .$$


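Differentiating the expression for M(r) derived above using the product rule,

$$\frac{dM}{dr} = \frac{3 M_{\rm tot}\,r^2}{(r^2 + a^2)^{3/2}} - \frac{3 M_{\rm tot}\,r^4}{(r^2 + a^2)^{5/2}} = \frac{3 M_{\rm tot}\,a^2\,r^2}{(r^2 + a^2)^{5/2}} .$$

$$\therefore\quad \rho(r) = \frac{3 M_{\rm tot}}{4\pi}\,\frac{a^2}{(r^2 + a^2)^{5/2}} ,$$

the result we had to prove.

As an alternative method, we could use Poisson’s equation $\nabla^2\Phi = 4\pi G\rho$, which in this case of spherical symmetry gives

$$\rho(r) = \frac{1}{4\pi G}\,\nabla^2\Phi = \frac{1}{4\pi G}\,\frac{1}{r^2}\,\frac{d}{dr}\left(r^2\,\frac{d\Phi}{dr}\right) .$$

Substituting the expression for dΦ/dr from above and differentiating would give the required result.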
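The two Plummer results can be cross-checked numerically. The sketch below (plain Python, arbitrary units; the function names and the step size h are illustrative assumptions) compares the closed-form density, written with the 4π that follows from ρ = (1/4πr²) dM/dr, against a central finite difference of M(r).

```python
import math

def plummer_mass(r, m_tot, a):
    """Mass interior to radius r: M(r) = M_tot r^3 / (r^2 + a^2)^(3/2)."""
    return m_tot * r**3 / (r * r + a * a) ** 1.5

def plummer_density(r, m_tot, a):
    """Closed-form density: rho = 3 M_tot a^2 / (4 pi (r^2 + a^2)^(5/2))."""
    return 3.0 * m_tot * a * a / (4.0 * math.pi * (r * r + a * a) ** 2.5)

def density_from_mass(r, m_tot, a, h=1.0e-5):
    """rho = (1 / 4 pi r^2) dM/dr, with dM/dr from a central finite difference."""
    dm_dr = (plummer_mass(r + h, m_tot, a) - plummer_mass(r - h, m_tot, a)) / (2.0 * h)
    return dm_dr / (4.0 * math.pi * r * r)

for r in (0.3, 1.0, 4.0):
    exact = plummer_density(r, 1.0, 1.0)
    numeric = density_from_mass(r, 1.0, 1.0)
    print(r, exact, numeric)
```

The two columns should agree to within the truncation error of the finite difference, which is one way to catch an algebra slip in the product-rule step.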

Problem 3: Answer

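Consider a thin spherical shell of radius r′ and thickness dr′ concentric with the galaxy. The mass in the shell will be

$$dM' = 4\pi\,r'^2\,dr'\,\rho(r') .$$

Integrating from the centre of the galaxy (r′ = 0) to radial distance r,

$$\int_0^{M(r)} dM' = \int_0^r 4\pi\,r'^2\,\rho(r')\,dr' = \int_0^r 4\pi\,r'^2\,\frac{q\,a}{4\pi}\,\frac{r'^q}{r'^3\,(r' + a)^{q+1}}\,M_{\rm tot}\,dr' .$$

$$\therefore\quad M(r) = q\,a\,M_{\rm tot} \int_0^r \frac{r'^{q-1}}{(r' + a)^{q+1}}\,dr' = M_{\rm tot} \left[\frac{r'^q}{(r' + a)^q}\right]_{r'=0}^{r}$$

on using the standard integral provided. This gives,

$$M(r) = M_{\rm tot} \left(\frac{r^q}{(r + a)^q} - \frac{0^q}{(0 + a)^q}\right) = M_{\rm tot}\,\frac{r^q}{(r + a)^q} ,$$

the required result (for all q ≠ 0).

The gravitational potential Φ is related to the mass $M_S$ inside a surface S by Gauss’s Law, $\int_S \nabla\Phi \cdot d\mathbf{S} = 4\pi G M_S$. If S is a sphere of radius r centred on the galaxy, $\int_S \nabla\Phi \cdot d\mathbf{S} = 4\pi G M(r)$, which in this spherically symmetric case becomes

$$\int_S \frac{d\Phi}{dr}\,dS = 4\pi G M(r) \quad\therefore\quad \frac{d\Phi}{dr} \int_S dS = 4\pi G M_{\rm tot}\,\frac{r^q}{(r + a)^q}$$

$$\therefore\quad 4\pi r^2\,\frac{d\Phi}{dr} = 4\pi G M_{\rm tot}\,\frac{r}{(r + a)} \quad\text{for the Jaffe model } (q = 1)$$

$$\therefore\quad \frac{d\Phi}{dr} = G M_{\rm tot}\,\frac{1}{r}\,\frac{1}{(r + a)} .$$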

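Integrating from a radius r to infinity (with Φ(∞) = 0),

$$\int_{\Phi(r)}^{0} d\Phi' = G M_{\rm tot} \int_r^\infty \frac{1}{r'}\,\frac{1}{(r' + a)}\,dr' .$$

This can be solved using partial fractions:

$$0 - \Phi(r) = G M_{\rm tot} \int_r^\infty \frac{1}{a}\left(\frac{1}{r'} - \frac{1}{(r' + a)}\right) dr' = \frac{G M_{\rm tot}}{a}\,\Big[\ln r' - \ln(r' + a)\Big]_{r'=r}^{\infty} ,$$

$$\Phi(r) = -\frac{G M_{\rm tot}}{a} \left[\ln\left(\frac{1}{1 + a/r'}\right)\right]_{r}^{\infty} = \frac{G M_{\rm tot}}{a}\,\ln\left(\frac{r}{r + a}\right) = -\frac{G M_{\rm tot}}{a}\,\ln\left(\frac{r + a}{r}\right) ,$$

for the potential at a radius r in the Jaffe (q = 1) model. The potential is negative at all finite radii and tends to zero as r → ∞, as it must for Φ(∞) = 0 and dΦ/dr > 0.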

Alternatively, we could approach the problem from a more physical perspective and consider the potential energy released when a particle of mass m is brought from infinity to a radius r in the presence of the gravitational force $F = -G M(r)\,m/r^2$. The potential energy at a distance r from the centre is then $V_p(r) = m\,\Phi(r)$, from which we could calculate Φ(r). This would give the same result as the method above.

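If q → 0, the density profile gives ρ = 0 for r > 0. However,

$$\rho(0) = \frac{a\,M_{\rm tot}}{4\pi} \lim_{q,\,r \to 0} \frac{q\,r^q}{r^3\,(r + a)^{q+1}} ,$$

which does not vanish. At the same time $M(r) = M_{\rm tot}\,r^q/(r + a)^q \to M_{\rm tot}$ for every r > 0 as q → 0. So q → 0 implies that all the mass $M_{\rm tot}$ is concentrated at the centre: it corresponds to a point mass.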
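The closed-form M(r) and the Jaffe potential can be checked against direct numerical integration. This is a rough sketch in plain Python with G = 1 and arbitrary units; the midpoint rule, the step counts and the function names are assumptions made for illustration. The Jaffe potential is written with the sign fixed by Φ(∞) = 0 and dΦ/dr > 0, so it is negative.

```python
import math

def dehnen_density(r, q, a, m_tot):
    """rho(r) = (q a / 4 pi) r^q / (r^3 (r + a)^(q+1)) M_tot."""
    return q * a * m_tot * r**q / (4.0 * math.pi * r**3 * (r + a) ** (q + 1))

def dehnen_mass(r, q, a, m_tot):
    """Closed-form enclosed mass M(r) = M_tot (r / (r + a))^q."""
    return m_tot * (r / (r + a)) ** q

def mass_by_quadrature(r, q, a, m_tot, n=20000):
    """Midpoint-rule estimate of the shell integral of 4 pi r'^2 rho(r') from 0 to r."""
    h = r / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += 4.0 * math.pi * x * x * dehnen_density(x, q, a, m_tot)
    return total * h

def jaffe_potential(r, a, m_tot, G=1.0):
    """Phi(r) = (G M_tot / a) ln(r / (r + a)), negative for all finite r."""
    return (G * m_tot / a) * math.log(r / (r + a))

# Compare quadrature with the closed form for the Jaffe (q = 1) and q = 2 models.
for q in (1, 2):
    print(q, mass_by_quadrature(2.0, q, 1.0, 1.0), dehnen_mass(2.0, q, 1.0, 1.0))

# As q -> 0 the enclosed mass approaches M_tot at any fixed r > 0 (point-mass limit).
print(dehnen_mass(0.01, 1.0e-6, 1.0, 1.0))
```

The quadrature is only run for q ≥ 1 here, because for q < 1 the integrand has an integrable singularity at r′ = 0 that a plain midpoint rule handles poorly.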


Problem 4: Question

The distribution function f in a spherically-symmetric galaxy is related to the mass density ρ(r) at a radial distance r from the centre by

$$\rho(r) = 4\pi\sqrt{2}\;m \int_{\Phi(r)}^{0} \sqrt{E_m - \Phi(r)}\;f(E_m)\,dE_m ,$$

where $E_m$ is the energy per unit mass for a star, Φ(r) is the gravitational potential at a radius r, and m is the mean mass per star. Show that a functional form $f(E_m) = b\,(-E_m)^{7/2}$ is a solution to this equation for a Plummer potential, where b is a constant, using the potential and density given in Question 2. Express b in terms of G, $M_{\rm tot}$ and a using the result of Question 2. The substitution $E_m = \Phi\cos^2\theta$ and the standard result

$$\int_0^{\pi/2} \sin^2\theta\,\cos^8\theta\,d\theta = \frac{7\pi}{512}$$

may prove useful.

Assuming $m = 0.70\,M_\odot$, what is the value of the distribution function f for $(x, y, z, v_x, v_y, v_z) = (10\,{\rm kpc}, 0, 0, 0, 0, 200\,{\rm km\,s^{-1}})$ in a galaxy having a Plummer potential with a softening parameter a = 1.70 kpc and a total mass of $2.0 \times 10^{12}\,M_\odot$? Note that x = y = z = 0 corresponds to the centre of the galaxy in this coordinate system.

[ $1\,M_\odot = 1.989 \times 10^{30}$ kg, $1\,{\rm kpc} = 3.0857 \times 10^{19}$ m, and $G = 6.673 \times 10^{-11}\,{\rm m^3\,kg^{-1}\,s^{-2}}$. ]


Problem 4: Answer

Try $f = b\,(-E_m)^{7/2}$, where $E_m$ is the energy per unit mass and b is a constant. The density becomes

$$\rho(r) = 4\pi\sqrt{2}\;m\,b \int_{\Phi}^{0} \sqrt{E_m - \Phi}\;(-E_m)^{7/2}\,dE_m .$$

Note that $E_m$ and Φ are both negative, so $-E_m$ and −Φ are both positive. Use the substitution $E_m = \Phi\cos^2\theta$. Differentiating, $dE_m = -2\Phi\sin\theta\cos\theta\,d\theta$. The limits of the integral are:

when $E_m = \Phi$, cos²θ = 1, ∴ cos θ = ±1. Take θ = 0.

when $E_m = 0$, cos²θ = 0, ∴ cos θ = 0. Take θ = π/2.

Using this substitution, the density becomes

$$\rho(r) = 4\pi\sqrt{2}\;m\,b \int_0^{\pi/2} \sqrt{\Phi\cos^2\theta - \Phi}\;(-\Phi\cos^2\theta)^{7/2}\,(-2\Phi\sin\theta\cos\theta\,d\theta)$$

$$= 4\pi\sqrt{2}\;m\,b \int_0^{\pi/2} \sqrt{-\Phi}\,\sin\theta\;(-\Phi)^{7/2}\cos^7\theta\;(2)(-\Phi)\sin\theta\cos\theta\,d\theta$$

$$= 8\pi\sqrt{2}\;m\,b\,(-\Phi)^5 \int_0^{\pi/2} \sin^2\theta\,\cos^8\theta\,d\theta .$$

Using the standard integral, we get,

$$\rho(r) = 8\pi\sqrt{2}\;m\,b\,(-\Phi)^5 \left(\frac{7\pi}{512}\right) = \frac{7\sqrt{2}\,\pi^2}{64}\,m\,b\,(-\Phi)^5 .$$

Substituting for the Plummer potential from Question 2, we obtain,

$$\rho(r) = \frac{7\sqrt{2}\,\pi^2}{64}\,m\,b\,\frac{G^5 M_{\rm tot}^5}{(r^2 + a^2)^{5/2}} .$$

This is the same as the expression for the density of the Plummer potential in Question 2 if

$$b = \frac{24\sqrt{2}}{7\pi^3}\,\frac{a^2}{m\,G^5 M_{\rm tot}^4} .$$

So $f(E_m) = b\,(-E_m)^{7/2}$ is a solution to the equation relating the density and the distribution function if the constant b has this value.

The energy per unit mass for the position and velocity in the question is

$$E_m = -\frac{G M_{\rm tot}}{\sqrt{r^2 + a^2}} + \frac{1}{2}v^2 = -8.48 \times 10^{11} + 2.00 \times 10^{10}\ {\rm J\,kg^{-1}} = -8.28 \times 10^{11}\ {\rm J\,kg^{-1}} ,$$

using a radial distance from the centre of the galaxy of $r = \sqrt{x^2 + y^2 + z^2} = 10\,{\rm kpc} = 3.09 \times 10^{20}$ m. But $f = b\,(-E_m)^{7/2}$ with

$$b = \frac{24\sqrt{2}}{7\pi^3}\,\frac{a^2}{m\,G^5 M_{\rm tot}^4} = \frac{0.1564 \times (1.70 \times 3.0857 \times 10^{19})^2}{(0.70 \times 1.989 \times 10^{30})\,(6.673 \times 10^{-11})^5\,(2 \times 10^{12} \times 1.989 \times 10^{30})^4}\ {\rm m^{-13}\,s^{10}}$$

$$= \frac{0.1564 \times (1.70 \times 3.0857)^2 \times 10^{38}}{(0.70 \times 1.989)\,(6.673)^5\,(2 \times 1.989)^4 \times 10^{143}}\ {\rm m^{-13}\,s^{10}}$$

$$= 9.33 \times 10^{-112}\ {\rm m^{-13}\,s^{10}} .$$

So $f = b\,(-E_m)^{7/2}$ gives,

$$f = 9.33 \times 10^{-112}\,(8.28 \times 10^{11})^{7/2}\ {\rm m^{-3}\,(m\,s^{-1})^{-3}} = 4.82 \times 10^{-70}\ {\rm m^{-3}\,(m\,s^{-1})^{-3}} .$$

[Part of this question appeared in the May 2005 examination.]
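The arithmetic for b and f involves very large and very small powers of ten, so it is easy to slip an exponent by hand. The following sketch evaluates the same expressions in plain Python using the constants quoted in the question; the function names are illustrative, and returning f = 0 for unbound stars (non-negative energy) is an added assumption that the question itself does not need.

```python
import math

G = 6.673e-11      # m^3 kg^-1 s^-2
M_SUN = 1.989e30   # kg
KPC = 3.0857e19    # m

def plummer_df_constant(a, m_bar, m_tot):
    """b = (24 sqrt(2) / 7 pi^3) a^2 / (m G^5 M_tot^4), in m^-13 s^10."""
    return 24.0 * math.sqrt(2.0) / (7.0 * math.pi**3) * a**2 / (m_bar * G**5 * m_tot**4)

def plummer_df(r, v, a, m_bar, m_tot):
    """f = b (-E_m)^(7/2) with E_m = -G M_tot / sqrt(r^2 + a^2) + v^2 / 2."""
    e_m = -G * m_tot / math.sqrt(r * r + a * a) + 0.5 * v * v
    if e_m >= 0.0:
        return 0.0  # assumption: no unbound stars in the model
    return plummer_df_constant(a, m_bar, m_tot) * (-e_m) ** 3.5

b = plummer_df_constant(1.70 * KPC, 0.70 * M_SUN, 2.0e12 * M_SUN)
f = plummer_df(10.0 * KPC, 2.0e5, 1.70 * KPC, 0.70 * M_SUN, 2.0e12 * M_SUN)
print(b, f)
```

Only r = 10 kpc and the speed of 200 km s⁻¹ enter, because the distribution function depends on position and velocity solely through the energy per unit mass.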


Problem 5: Question

A spherical elliptical galaxy has a total density distribution

$$\rho_{\rm tot}(r) = \frac{\rho_0}{1 + r^2/a^2} ,$$

as a function of radial distance r from its centre, where $\rho_0$ and a are constants. Show that the mass M(r) interior to a radius r has the form $M(r) \propto r^3$ for r ≪ a and $M(r) \propto r$ for r ≫ a.

Consider a population of massless test particles in the potential of this galaxy. Assume that this population is spherical, non-rotating, isothermal and isotropic, with velocity dispersion σ in each velocity component. What is the radial density distribution $\rho_p(r)$ of this test particle population, expressed in terms of M(r) and r?

Solve for $\rho_p(r)$ in terms of r explicitly for large radii (i.e. for regions where r ≫ a) to show that the density has a power law dependence on radius. What is the index of this power law? Give a physical interpretation of this index. What is the condition for the density distributions of the test particle population and the galaxy itself to have similar forms at large r?

Problem 6: Question

Many of the researchers who perform N-body simulations do so to study the dynamics of galaxies, but some others use N-body techniques to study the dynamics of globular clusters. Naively, we might expect the latter group of people would have an easier job, because they can easily afford as many particles in their simulations as there are stars in the real objects, and they do not need to worry about gas dynamics. We might therefore expect that globular cluster dynamics would be a well-understood subject by now. However, many problems have not been solved fully and plenty of difficult research remains to be done. This problem is to work out why.

Consider a globular cluster and a galaxy, both $\sim 10^{10}$ yr old. The globular cluster has a size ∼ 20 pc across and contains $10^5$ stars moving with a typical velocity 15 km s⁻¹. The galaxy is ∼ 20 kpc across and contains $\sim 10^{11}$ stars with a typical velocity 200 km s⁻¹. Both of these are simulated using $10^5$ particles. Give two reasons why the globular cluster simulation will be more difficult.


Problem 5: Answer

For a thin spherical shell we have $dM(r) = 4\pi r^2 \rho(r)\,dr$. Integrating from the centre (r = 0) to a radial distance r, we obtain,

$$M(r) = 4\pi\rho_0 \int_0^r \frac{r'^2}{1 + r'^2/a^2}\,dr' = 4\pi\rho_0\,a^2\,\left(r - a\tan^{-1}(r/a)\right)$$

(using a substitution r′ = a tan θ to solve the integral).

When r ≪ a, the expansion $\tan^{-1}x = x - x^3/3 + \ldots$ gives

$$M(r) = 4\pi\rho_0\,a^2 \left(r - r + \frac{r^3}{3a^2} - O(r^5)\right) = \frac{4\pi\rho_0\,r^3}{3} - O(r^5) .$$

Therefore, $M(r) \propto r^3$ when r ≪ a.

When r ≫ a, $\tan^{-1}(r/a) \simeq \pi/2$, so $r - a\tan^{-1}(r/a) \simeq r$, which gives $M(r) = 4\pi\rho_0\,a^2\,r$. Therefore, $M(r) \propto r$ when r ≫ a. (This is an important result because $M(r) \propto r$ gives a circular velocity $v_{\rm circ}$ = constant, as is observed in the rotation curves of spiral galaxies.)

Actually the asymptotic forms are obvious from $\rho \sim \rho_0$ (small r) and $\rho \sim \rho_0\,a^2/r^2$ (large r).

For a spherically-symmetric potential, the second Jeans equation gives

$$\frac{\partial}{\partial r}\left(n\,\langle v_r^2\rangle\right) + \frac{n}{r}\left[2\langle v_r^2\rangle - \langle v_\theta^2\rangle - \langle v_\phi^2\rangle\right] = -n\,\frac{\partial\Phi}{\partial r} ,$$

where n is the number density of some system of particles or stars, Φ is the gravitational potential, while $\langle v_r^2\rangle$, $\langle v_\theta^2\rangle$ and $\langle v_\phi^2\rangle$ are the mean values of the squares of the velocity in the r, θ and φ directions (from Section 2.18 of the course notes). For an isotropic distribution with no net rotation, $\langle v_r^2\rangle = \langle v_\theta^2\rangle = \langle v_\phi^2\rangle = \sigma^2$, where σ is a constant. So,

$$\frac{d}{dr}\left(n\,\sigma^2\right) = -n\,\frac{\partial\Phi}{\partial r} .$$

To find dΦ/dr, use the fact that the acceleration due to gravity is $\mathbf{g} = -\nabla\Phi$ and that $|\mathbf{g}| = G M(r)/r^2$ for a spherical distribution of mass. Since the particles are isothermal, σ is independent of r. Therefore,

$$\sigma^2\,\frac{dn}{dr} = -n\,\frac{G M(r)}{r^2}$$

in terms of M(r) and r. Using $\rho_p \propto n$ and integrating,

$$\sigma^2 \int \frac{d\rho_p}{\rho_p} = -G \int \frac{M(r)}{r^2}\,dr ,$$

which gives,

$$\ln\rho_p = -\frac{G}{\sigma^2} \int \frac{M(r)}{r^2}\,dr .$$

This is the density of the test particles as a function of r and M(r).

For large r, we have $M(r) = 4\pi\rho_0\,a^2\,r$, which gives,

$$\ln\rho_p = -\frac{4\pi G\rho_0\,a^2}{\sigma^2} \int \frac{dr}{r} = -\frac{4\pi G\rho_0\,a^2}{\sigma^2}\,\ln r + k_1 ,$$

where $k_1$ is a constant. Rearranging,

$$\rho_p = k_3\,r^{-4\pi G\rho_0 a^2/\sigma^2} ,$$

where $k_3$ is a constant. This is a power law of the form $\rho_p \propto r^l$ where the index is $l = -4\pi G\rho_0\,a^2/\sigma^2$. So the density of the population of test particles will have a power law dependence on distance at large radii.

Writing $Z \equiv 4\pi G\rho_0\,a^2/\sigma^2 = -l$, and noting that the circular speed at large r satisfies $v_{\rm circ}^2 = G M(r)/r = 4\pi G\rho_0\,a^2$, we see that $\sqrt{Z}$ can be interpreted as the ratio of circular speed to dispersion. The tracer population will have the same large-r density law as the massive population if Z = 2 (i.e. $\rho \propto r^{-2}$ and $\rho_p \propto r^{-2}$ at large r if Z = 2).
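The two asymptotic regimes of M(r) can be seen numerically from the logarithmic slope d ln M / d ln r. This is a small illustrative sketch in plain Python with ρ₀ = a = 1; the function names and the finite-difference step eps are assumptions.

```python
import math

def mass_interior(r, rho0, a):
    """M(r) = 4 pi rho0 a^2 (r - a arctan(r/a)) for rho = rho0 / (1 + r^2/a^2)."""
    return 4.0 * math.pi * rho0 * a * a * (r - a * math.atan(r / a))

def log_slope(r, rho0=1.0, a=1.0, eps=1.0e-4):
    """Finite-difference estimate of d ln M / d ln r at radius r."""
    m_lo = mass_interior(r * (1.0 - eps), rho0, a)
    m_hi = mass_interior(r * (1.0 + eps), rho0, a)
    return (math.log(m_hi) - math.log(m_lo)) / (math.log(1.0 + eps) - math.log(1.0 - eps))

for r in (0.01, 0.1, 1.0, 10.0, 1000.0):
    print(r, log_slope(r))
```

The slope should approach 3 for r ≪ a and 1 for r ≫ a, with a smooth transition around r ∼ a.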

Problem 6: Answer

The crossing time of the globular cluster will be $T_{\rm cross} \sim 20\,{\rm pc}/15\,{\rm km\,s^{-1}} \sim 1.3 \times 10^6$ yr, while that of the galaxy will be $\sim 20\,{\rm kpc}/200\,{\rm km\,s^{-1}} \sim 1.0 \times 10^8$ yr. So, the globular cluster will be $\sim 10^4\,T_{\rm cross}$ old, whereas the galaxy will be $\sim 10^2\,T_{\rm cross}$ old. N-body modelling of the dynamics of the two objects will use a series of time steps, with the positions of the particles being computed at each of these steps. However, the globular cluster simulation will need steps $\sim 10^2$ times smaller than for the galaxy to achieve steps representing the same fraction of the crossing time in both systems. Using $T_{\rm relax}/T_{\rm cross} \simeq N/(12\ln N)$, the globular cluster will have $T_{\rm relax} \simeq 700\,T_{\rm cross} \sim 10^9$ yr, so the globular cluster will be $\sim 10\,T_{\rm relax}$ old. In contrast, the galaxy will be $\ll T_{\rm relax}$ old. The globular cluster simulation will therefore have to consider two-body relaxation, while the galaxy simulation can ignore it. Both these considerations make the globular cluster simulation more difficult.
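The order-of-magnitude estimates above can be reproduced in a few lines. The sketch below is plain Python; the conversion factors (kilometres per parsec, seconds per year) are standard values supplied here as assumptions, and the function names are illustrative.

```python
import math

PC_KM = 3.0857e13   # kilometres per parsec
YR_S = 3.156e7      # seconds per year (assumed value, not quoted in the notes)
AGE_YR = 1.0e10     # age of both systems, from the question

def crossing_time_yr(size_pc, speed_km_s):
    """T_cross ~ size / typical speed, converted to years."""
    return size_pc * PC_KM / speed_km_s / YR_S

def relaxation_over_crossing(n_stars):
    """T_relax / T_cross ~ N / (12 ln N), with N the number of real stars."""
    return n_stars / (12.0 * math.log(n_stars))

for name, size_pc, speed, n_stars in (("cluster", 20.0, 15.0, 1.0e5),
                                      ("galaxy", 2.0e4, 200.0, 1.0e11)):
    t_cross = crossing_time_yr(size_pc, speed)
    t_relax = relaxation_over_crossing(n_stars) * t_cross
    print(name, t_cross, AGE_YR / t_cross, AGE_YR / t_relax)
```

The last two printed columns are the age in units of the crossing time and of the relaxation time; the cluster is many relaxation times old while the galaxy is a tiny fraction of one.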


Problem 7: Question

A star lying in the Galactic plane is observed to have a visual magnitude of V = 13.60 mag and a colour index (B − V) = 0.98 mag. Its spectrum shows it to be a dwarf star of spectral type G6 with a solar composition. Stars of this type are known to have an intrinsic colour of $(B - V)_0 = 0.76$ mag and an absolute visual magnitude of $M_V = +5.20$.

What is the extinction by the interstellar medium in the V band between us and the star? What is the distance of the star? What is the mean extinction per unit distance in the direction of the star expressed in mag kpc⁻¹ for the V band? Will this extinction per unit distance be the same for other stars in the sky?

What would you expect the extinction to be towards the star in the I and K photometric bands (which have central wavelengths of 790 nm and 2.2 µm respectively)?

Problem 8: Question

Observations of a part of the interstellar medium of the Galaxy show that a region of hot ionised gas (with a temperature 500 000 K, number density of ions 6000 m⁻³), a region of cold neutral gas (temperature 50 K, number density of molecules $2 \times 10^7$ m⁻³), and a region of warm neutral gas (temperature 10 000 K, number density of atoms $1 \times 10^5$ m⁻³) are in contact with each other. Which, if any, of these are in pressure equilibrium with the others?

Problem 9: Question

The mean density in the form of stars in the disc of the Galaxy is observed to vary with the distance z from the Galactic plane as $\rho_s(z) = \rho_{so}\,e^{-|z|/h_s}$ close to the Sun, where $\rho_{so}$ is the density of stars in space in the plane, and $h_s$ is a scale height ($\rho_{so}$ and $h_s$ are therefore constants at the distance of the Sun from the Galactic Centre). The density of the interstellar gas $\rho_g$ is also found to vary exponentially with height, with $\rho_g(z) = \rho_{go}\,e^{-|z|/h_g}$, where $\rho_{go}$ and $h_g$ are constants. Observations show that $h_s = 250$ pc and $h_g = 150$ pc and $\rho_{so} = 6\,\rho_{go}$.

What is the ratio of the surface density of stars, $\Sigma_s$, to that of gas, $\Sigma_g$, at the Sun’s distance from the Galactic Centre?

How do you expect the surface density of the dust, $\Sigma_d$, to compare with $\Sigma_s$?


Problem 7: Answer

The (B − V) colour excess of the star is $E(B-V) = (B-V) - (B-V)_0 = 0.98 - 0.76 = 0.22$ mag.

The mean interstellar extinction curve has the relation $A_V = 3.3\,E(B-V)$. Therefore, we expect $A_V = 3.3 \times 0.22 = 0.73$ mag. The relation between apparent and absolute magnitude for the V band is $V - M_V = 5\log_{10}(D/{\rm pc}) - 5 + A_V$. Therefore, the distance is $D = 10^{(V - M_V + 5 - A_V)/5}\ {\rm pc} = 342$ pc. The distance to the star is 340 pc.

The mean extinction per unit distance is 0.73 mag / 342 pc = 2.1 mag kpc⁻¹.

[Note the linear relation. This comes from the extinction $A_V = 1.086\,\tau_V$ where $\tau_V$ is the optical depth in the V band. Put $\tau_V = \rho_d\,\kappa_V\,D$, where $\rho_d$ is the density of dust in space, $\kappa_V$ is a coefficient expressing how strongly dust absorbs light in the V band (a constant for the V band unless the type of dust particles varies substantially), and D is the distance to the star. So $A_V = 1.086\,\rho_d\,\kappa_V\,D$ and therefore $A_V$ varies linearly with distance D.]

This figure depends on the density of dust in space. Therefore it will be large (as in this case) in the direction of the Milky Way, and small away from the plane of the Galaxy. It will be large along sight lines that pass through dense gas (such as cold, neutral gas) and smaller along sight lines through lower density gas (such as hot ionised gas). Therefore the mean extinction per unit distance varies strongly across the sky.

[In the $A_V = 1.086\,\rho_d\,\kappa_V\,D$ representation used above, $A_V/D = 1.086\,\rho_d\,\kappa_V$. The parameter $\kappa_V$ will vary only slightly, but $\rho_d$ varies greatly. Therefore, $A_V/D$ varies greatly.]

From the mean interstellar extinction law graph (on page 57 of the course notes), $A_I/E(B-V) = 2.0$, and $A_K/E(B-V) = 0.4$ by extrapolation. So $A_I = 2.0\,E(B-V) = 2.0 \times 0.22 = 0.44$ mag, and $A_K = 0.4\,E(B-V) = 0.4 \times 0.22 = 0.09$ mag. So the extinction in the I and K bands will be 0.44 and 0.09 mag respectively.
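The chain from colour excess to distance can be written as a short function. This is a minimal sketch in plain Python; the function name and the default ratio of 3.3 (taken from the relation quoted above) are the only inputs beyond the data in the question. Because it carries the unrounded extinction through, the distance may differ from the hand calculation in the last digit.

```python
def extinction_and_distance(v_mag, b_minus_v, b_minus_v_0, abs_mag_v, ratio=3.3):
    """Return (E(B-V), A_V, distance in pc) from the observed and intrinsic photometry."""
    colour_excess = b_minus_v - b_minus_v_0
    a_v = ratio * colour_excess
    # Distance modulus with extinction: V - M_V = 5 log10(D/pc) - 5 + A_V
    distance_pc = 10.0 ** ((v_mag - abs_mag_v + 5.0 - a_v) / 5.0)
    return colour_excess, a_v, distance_pc

excess, a_v, d_pc = extinction_and_distance(13.60, 0.98, 0.76, 5.20)
print(excess, a_v, d_pc, a_v / (d_pc / 1000.0))  # last value is in mag per kpc
```

Setting `ratio=0` recovers the extinction-free distance, which shows how much the dust correction matters for this star.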

Problem 8: Answer

The ideal gas law relates the pressure P in a gas to the number density n of particles and the absolute temperature T by $P = n\,k_B\,T$, where $k_B$ is the Boltzmann constant. This gives the pressure in the hot ionised gas as $P_{\rm hot} = 6000\ {\rm m^{-3}} \times k_B \times 500\,000\ {\rm K} = 3 \times 10^9\,k_B$ K m⁻³ (in this problem, it is easier to leave the pressure in terms of the constant $k_B$ than to calculate the result explicitly). For the cold neutral gas we find that the pressure is $P_{\rm cold} = 1 \times 10^9\,k_B$ K m⁻³. For the warm neutral gas the pressure is $P_{\rm warm} = 1 \times 10^9\,k_B$ K m⁻³.

We can see that $P_{\rm cold} \simeq P_{\rm warm} \neq P_{\rm hot}$. Therefore the cold neutral region and the warm neutral region are in pressure equilibrium. The hot ionised region is not in equilibrium with the other two regions.

[Because the hot ionised region has a higher pressure than the others, it will expand by compressing them, although the higher densities in the others mean that their inertias slow the process significantly.]

[In practice, magnetic fields and cosmic rays can provide contributions to the pressure in addition to the gas pressure considered here. They will increase the pressures, particularly for the hot ionised gas which will have been produced by supernovae (supernovae will produce cosmic rays and the neutron stars they leave behind can add to the strength of the magnetic field). These extra effects are ignored in this case.]
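The comparison of the three phases is just a product n T for each region. A tiny sketch in plain Python, with illustrative names, keeping the pressure in units of k_B K m⁻³ as in the text:

```python
def pressure_over_kb(number_density, temperature):
    """P / k_B = n T, in K m^-3, for an ideal gas."""
    return number_density * temperature

# (number density in m^-3, temperature in K) for each phase in the question
phases = {
    "hot ionised": (6000.0, 5.0e5),
    "cold neutral": (2.0e7, 50.0),
    "warm neutral": (1.0e5, 1.0e4),
}

pressures = {name: pressure_over_kb(n, t) for name, (n, t) in phases.items()}
for name, p in pressures.items():
    print(name, p)
```

Multiplying each value by the Boltzmann constant would give the pressures in pascals, but the comparison does not need it.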


Problem 9: Answer

The surface mass density of stars can be obtained by integrating the density over height z:

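$$\Sigma_s = \int_{-\infty}^{\infty} \rho_s(z)\,dz = \int_{-\infty}^{\infty} \rho_{so}\,e^{-|z|/h_s}\,dz = 2 \int_0^{\infty} \rho_{so}\,e^{-z/h_s}\,dz \quad\text{(from the symmetry)}$$

$$= 2\,\rho_{so} \int_0^{\infty} e^{-z/h_s}\,dz = 2\,\rho_{so} \left[-h_s\,e^{-z/h_s}\right]_{z=0}^{\infty} = 2\,\rho_{so}\,h_s .$$

Similarly, for the gas, $\Sigma_g = 2\,\rho_{go}\,h_g$. Therefore,

$$\frac{\Sigma_s}{\Sigma_g} = \frac{2\,\rho_{so}\,h_s}{2\,\rho_{go}\,h_g} = \frac{\rho_{so}}{\rho_{go}}\,\frac{h_s}{h_g} = 6 \times \frac{250}{150} = 10 .$$

So $\Sigma_s = 10\,\Sigma_g$ at the Sun’s distance from the Galactic Centre.

Dust density $\rho_d$ closely follows that of gas and observations show that $\rho_d/\rho_g \simeq 0.1$. Therefore we expect $\Sigma_s \simeq 100\,\Sigma_d$ at the Sun’s distance from the Galactic Centre.

[In reality, the density of the interstellar medium is highly variable from place to place. This exponential law is a mean representation of the decline in density from the Galactic plane.]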


Problem 10: Question

One variant on the Simple Model of galactic chemical evolution is the ‘leaky-box’ model. This simulates the effect of shocks from supernovae and winds from young massive stars by allowing gas to leave the box at a rate proportional to the star formation rate. Therefore the change $\delta M_{\rm total}$ in the total mass $M_{\rm total}$ in the box is

$$\delta M_{\rm total} = -c\,\delta M_{\rm stars} ,$$

where $\delta M_{\rm stars}$ is the change in the mass in stars, and c is a constant of proportionality.

Use this to derive an expression for the mass in gas $M_{\rm gas}(t)$ at time t in terms of $M_{\rm total}(0)$ and $M_{\rm stars}(t)$.

Now modify the closed-box relation between $\delta M_{\rm metals}$ and $\delta M_{\rm stars}$ by adding an appropriate leaking term.

Use these two expressions to derive

$$\delta Z = \frac{p\,\delta M_{\rm stars}}{M_{\rm total}(0) - (1 + c)\,M_{\rm stars}} .$$

This expression shows that the leaky box model won’t solve the G-dwarf problem. Why?

Problem 11: Question

For gravitational lensing, for very distant sources (i.e., $D_S \gg D_L$), we can write the expression for the Einstein angular radius as

$$\theta_E = k\,\sqrt{M/D_L} ,$$

where k is a constant. Find the value of k in arcsec if M is measured in solar masses and $D_L$ in parsecs.


Problem 10: Answer

We can apply the same analysis to this problem as was used for the Simple Model, but the total mass $M_{\rm total}$ is now a function of time t. Since $\delta M_{\rm total} = -c\,\delta M_{\rm stars}$, the mass of gas at time t is

$$M_{\rm gas}(t) = M_{\rm total}(0) - (1 + c)\,M_{\rm stars}(t) .$$

Some results from the Simple Model still apply, such as the expression for the change δZ in the metallicity Z of the gas in time δt, and the relation between the changes in the mass in stars and the total mass $M_{\rm SF}$ that has participated in star formation up to time t:

$$\delta Z = \frac{\delta M_{\rm metals}}{M_{\rm gas}} - Z\,\frac{\delta M_{\rm gas}}{M_{\rm gas}} \quad\text{and}\quad \delta M_{\rm stars} = \alpha\,\delta M_{\rm SF} ,$$

where α is the fraction of the mass participating in star formation that remains in long-lived stars and stellar remnants. With outflow we have

$$\delta M_{\rm metals} = -Z\,\delta M_{\rm SF} + Z\,(1 - \alpha)\,\delta M_{\rm SF} + p\,\delta M_{\rm stars} - c\,Z\,\delta M_{\rm stars}$$

$$= p\,\delta M_{\rm stars} - Z\,\delta M_{\rm stars} - c\,Z\,\delta M_{\rm stars} .$$

Substituting in the expression for δZ gives the required result

$$\delta Z = \frac{p\,\delta M_{\rm stars}}{M_{\rm total}(0) - (1 + c)\,M_{\rm stars}} .$$

But this is just the closed-box result with p replaced by p/(1 + c), and $M_{\rm total} = M_{\rm gas}(0)$ replaced by $M_{\rm total}(0)/(1 + c)$. So the leaky box model just changes the effective value of p and doesn’t change the distribution of stellar metallicities.
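The substitution step can be written out explicitly. The following is a worked sketch of the algebra, assuming the initial condition Z(0) = 0 for the integrated form (an assumption carried over from the Simple Model, not restated in this answer):

```latex
\begin{align}
\delta M_{\rm gas} &= \delta M_{\rm total} - \delta M_{\rm stars}
                    = -(1 + c)\,\delta M_{\rm stars} , \\
\delta Z &= \frac{(p - Z - cZ)\,\delta M_{\rm stars}}{M_{\rm gas}}
          + \frac{Z\,(1 + c)\,\delta M_{\rm stars}}{M_{\rm gas}}
          = \frac{p\,\delta M_{\rm stars}}{M_{\rm gas}} , \\
Z(t) &= -\frac{p}{1 + c}\,
        \ln\!\left[\,1 - \frac{(1 + c)\,M_{\rm stars}(t)}{M_{\rm total}(0)}\right]
        \qquad (\text{taking } Z(0) = 0) .
\end{align}
```

The integrated form has the same logarithmic shape as the closed-box solution, with the yield scaled to p/(1 + c), which is why the shape of the stellar metallicity distribution is unchanged.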

Problem 11: Answer

The angular size corresponding to the Einstein radius is (from the course notes)

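$$\theta_E = \sqrt{\frac{4 G M}{c^2}\,\frac{D_{LS}}{D_L\,D_S}} ,$$

where M is the mass of the lensing object, $D_L$ is the distance to the lensing object, $D_S$ is the distance to the source, $D_{LS}$ is the distance between the lens and source, G is the constant of gravitation and c is the speed of light.

For $D_S \gg D_L$, we have $D_{LS} \simeq D_S$.

$$\therefore\quad \theta_E = \frac{2\sqrt{G}}{c}\,\sqrt{\frac{M}{D_L}}\ {\rm rad} = \frac{2\sqrt{6.673 \times 10^{-11}}}{2.998 \times 10^8}\,\sqrt{\frac{M}{D_L}}\ {\rm rad\,kg^{-1/2}\,m^{1/2}}$$

$$= 5.45 \times 10^{-14}\,\sqrt{\frac{M}{D_L}}\ {\rm rad\,kg^{-1/2}\,m^{1/2}}$$

$$= 5.45 \times 10^{-14}\,\sqrt{\frac{M}{D_L}}\,\left(\frac{180 \times 60 \times 60}{\pi}\right) \left(\frac{1.989 \times 10^{30}}{3.086 \times 10^{16}}\right)^{1/2}\ {\rm arcsec}\,M_\odot^{-1/2}\,{\rm pc}^{1/2}$$

$$= 0.090\,\sqrt{\frac{M}{D_L}}\ {\rm arcsec}\,M_\odot^{-1/2}\,{\rm pc}^{1/2} .$$

So the constant is $k = 0.090\ {\rm arcsec}\,M_\odot^{-1/2}\,{\rm pc}^{1/2}$.

This means that a lens of one solar mass at a distance of 1 pc (about the distance of the nearest star) imaging an extragalactic source will produce an Einstein angular radius of about 0.1 arcsec.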
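The unit conversion for k is a natural place for an arithmetic slip, so it is worth evaluating directly. The sketch below is plain Python using the constants quoted in the answer; the function name is illustrative.

```python
import math

G = 6.673e-11       # m^3 kg^-1 s^-2
C = 2.998e8         # m s^-1
M_SUN = 1.989e30    # kg
PC = 3.086e16       # m
RAD_TO_ARCSEC = 180.0 * 60.0 * 60.0 / math.pi

def einstein_constant_arcsec():
    """k in theta_E = k sqrt(M / D_L), with M in solar masses and D_L in parsecs."""
    k_si = 2.0 * math.sqrt(G) / C                   # rad kg^-1/2 m^1/2
    return k_si * math.sqrt(M_SUN / PC) * RAD_TO_ARCSEC

k = einstein_constant_arcsec()
print(k)
# Einstein radius (arcsec) for a 1 solar-mass lens at 10 kpc, using D_S >> D_L
print(k * math.sqrt(1.0 / 1.0e4))
```

The second line printed gives the familiar scale of roughly a milliarcsecond for stellar-mass lenses at Galactic distances.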


Problem 12: Question

An optical microlensing survey images a star field in the Galactic bulge close to the Galactic centre. Assuming that the dark matter halo is made from compact objects with approximately stellar masses and has a density distribution

$$\rho(r) = \frac{\rho_0\,b^2}{r^2 + b^2} ,$$

where r is the radial distance from the Galactic centre, $\rho_0$ is the central dark matter density and b is a constant, derive an expression for the optical depth τ of microlensing to the field in terms of $\rho_0$, b and $R_0$, where $R_0$ is the distance of the Sun from the Galactic centre. You may assume that the star field is not significantly affected by dust extinction for this calculation.

Calculate τ if $R_0 = 8.0$ kpc, b = 2.0 kpc and $\rho_0 = 2.0 \times 10^{-20}$ kg m⁻³.

Problem 13: Question

A weakly-interacting massive particle (WIMP) with a mass of $1000\,m_p$, where $m_p = 1.6726 \times 10^{-27}$ kg is the mass of the proton, lenses the light of a star in the Large Magellanic Cloud, which is situated 50 kpc from the Earth. Calculate the Einstein angular radius of the WIMP if it lies at a distance 20 kpc from the Earth. How does this figure compare with the angular radius of the star if it has the same radius, $6.96 \times 10^8$ m, as the Sun? Will the microlensing effect of the WIMP be noticeable? Are dark matter microlensing surveys sensitive to the lensing of stars by WIMPs?

What is the Einstein angular radius of a brown dwarf with a mass of $0.05\,M_\odot$ at the same location as the WIMP? Will the lensing effect of the brown dwarf on the background star be noticeable if there is a suitable alignment?

Problem 14: Question

The separation l between the Galaxy and M31 is expected to obey the equation

$$\frac{d^2 l}{dt^2} = -\frac{G M}{l^2} ,$$

over time t, where M is their combined mass, on the assumption that they move only under their mutual attraction. Show that, in addition to the parametric solutions involving sin and cos discussed in the lectures, (i) $l = (G M \tau_0^2)^{1/3}\,(\cosh\eta - 1)$, $t = \tau_0\,(\sinh\eta - \eta)$, where η is a parameter, and (ii) $l = (9 G M/2)^{1/3}\,t^{2/3}$ are also solutions. What do these two cases represent physically? What observational constraints show that these solutions are inappropriate to the real Galaxy–M31 system?


Problem 12: Answer

Use the equation

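$$\tau = \frac{4\pi G}{c^2 D_S} \int_0^{D_S} D_L\,D_{LS}\,\rho(D_L)\,dD_L ,$$

where $D_S$ is the distance from the observer to the source, $D_L$ is the distance from the observer to the lens, and $D_{LS}$ is the distance from the lens to the source. Considering the geometry, $D_S = R_0$, $D_{LS} = R_0 - D_L$ and $D_{LS} = r$. Therefore, $r = R_0 - D_L$ and $dD_L = -dr$. The optical depth is then

$$\tau = \frac{4\pi G}{c^2 R_0} \int_0^{R_0} (R_0 - r)\,r\,\frac{\rho_0\,b^2}{r^2 + b^2}\,dr = \frac{4\pi G \rho_0\,b^2}{c^2 R_0} \int_0^{R_0} \left(\frac{R_0\,r}{r^2 + b^2} - \frac{r^2}{r^2 + b^2}\right) dr$$

$$= \frac{4\pi G \rho_0\,b^2}{c^2 R_0} \left[\frac{R_0}{2}\,\ln(r^2 + b^2) - r + b\tan^{-1}\left(\frac{r}{b}\right)\right]_0^{R_0}$$

using the result $\int r^2/(r^2 + b^2)\,dr = r - b\tan^{-1}(r/b) + \text{constant}$, which is the integral evaluated in the solution to Problem 5. This gives,

$$\tau = \frac{2\pi G \rho_0\,b^2}{c^2} \left(\ln\left(1 + \frac{R_0^2}{b^2}\right) - 2 + \frac{2b}{R_0}\,\tan^{-1}\frac{R_0}{b}\right) .$$

Substituting $\rho_0 = 2.0 \times 10^{-20}$ kg m⁻³, $b = 2.0 \times 10^3\ {\rm pc} = 2.0 \times 10^3 \times 3.086 \times 10^{16}$ m, and $R_0 = 8.0 \times 10^3\ {\rm pc} = 8.0 \times 10^3 \times 3.086 \times 10^{16}$ m gives $\tau = 5.3 \times 10^{-7}$.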
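The closed-form optical depth can be compared against direct quadrature of the defining integral. This is an illustrative sketch in plain Python; the midpoint rule, the number of steps and the function names are assumptions, and the constants are those quoted in the notes.

```python
import math

G = 6.673e-11   # m^3 kg^-1 s^-2
C = 2.998e8     # m s^-1
PC = 3.086e16   # m

def halo_density(r, rho0, b):
    """rho(r) = rho0 b^2 / (r^2 + b^2)."""
    return rho0 * b * b / (r * r + b * b)

def tau_closed_form(rho0, b, r0):
    """tau = (2 pi G rho0 b^2 / c^2) [ln(1 + R0^2/b^2) - 2 + (2b/R0) arctan(R0/b)]."""
    prefactor = 2.0 * math.pi * G * rho0 * b * b / C**2
    return prefactor * (math.log(1.0 + (r0 / b) ** 2) - 2.0
                        + 2.0 * b / r0 * math.atan(r0 / b))

def tau_quadrature(rho0, b, r0, n=20000):
    """Midpoint rule for (4 pi G / c^2 R0) * integral of (R0 - r) r rho(r) dr from 0 to R0."""
    h = r0 / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += (r0 - r) * r * halo_density(r, rho0, b)
    return 4.0 * math.pi * G / (C**2 * r0) * total * h

rho0, b, r0 = 2.0e-20, 2.0e3 * PC, 8.0e3 * PC
print(tau_closed_form(rho0, b, r0), tau_quadrature(rho0, b, r0))
```

Agreement between the two numbers checks both the partial-fraction integration and the final simplification.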

Problem 13: Answer

The expression for the Einstein angular radius is

$$\theta_E = \sqrt{\frac{4 G M}{c^2}\,\frac{D_{LS}}{D_L\,D_S}}\ {\rm rad} ,$$

where the mass of the lens M is $1000\,m_p = 1.673 \times 10^{-24}$ kg, the distance between the observer and lens is $D_L = 20\ {\rm kpc} = 6.17 \times 10^{20}$ m, the distance between the observer and source is $D_S = 50\ {\rm kpc} = 1.54 \times 10^{21}$ m, and the distance between the lens and source is $D_{LS} = 30\ {\rm kpc} = 9.26 \times 10^{20}$ m. For the WIMP we have $\theta_E = 2.2 \times 10^{-36}$ rad.

The star, with a radius $6.96 \times 10^8$ m, lies in the Large Magellanic Cloud at a distance of $50\ {\rm kpc} = 1.54 \times 10^{21}$ m, so it subtends an angular radius of $6.96 \times 10^8 / 1.54 \times 10^{21}\ {\rm rad} = 4.5 \times 10^{-13}$ rad. So the angular radius of the star is $2 \times 10^{23}$ times the Einstein angular radius of the WIMP. The lensing effect of the WIMP will take place on a scale that is $\sim 10^{-23}$ of the scale of the star image. It will be completely undetectable.

[This question assumes that WIMPs do exist. Despite dedicated experiments to search for them, little firm evidence has been found that they exist in any significant numbers.]

A $M = 0.05\,M_\odot$ brown dwarf will have $\theta_E = 5.4 \times 10^{-10}\ {\rm rad} = 1.1 \times 10^{-4}$ arcsec. The Einstein angular radius of the brown dwarf is about 1200 times larger than the angular radius of the background star. Lensing effects will be significant if there is a suitable alignment.
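The three angles being compared can be computed side by side. The sketch below is plain Python with illustrative names; it takes the lens–source distance as D_S − D_L, and places the star at the source distance, since the star is the lensed object in the Large Magellanic Cloud.

```python
import math

G = 6.673e-11       # m^3 kg^-1 s^-2
C = 2.998e8         # m s^-1
KPC = 3.086e19      # m
M_SUN = 1.989e30    # kg
M_PROTON = 1.6726e-27  # kg

def einstein_radius_rad(mass_kg, d_lens, d_source):
    """theta_E = sqrt(4 G M / c^2 * D_LS / (D_L D_S)), with D_LS = D_S - D_L."""
    d_ls = d_source - d_lens
    return math.sqrt(4.0 * G * mass_kg / C**2 * d_ls / (d_lens * d_source))

d_l, d_s = 20.0 * KPC, 50.0 * KPC
theta_wimp = einstein_radius_rad(1000.0 * M_PROTON, d_l, d_s)
theta_bd = einstein_radius_rad(0.05 * M_SUN, d_l, d_s)
theta_star = 6.96e8 / d_s   # angular radius of a Sun-sized star at the source distance

print(theta_wimp, theta_bd, theta_star)
print(theta_star / theta_wimp, theta_bd / theta_star)
```

The last line prints the two ratios that decide detectability: the stellar disc relative to the WIMP's Einstein radius, and the brown dwarf's Einstein radius relative to the stellar disc.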


Problem 14: Answer

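The formulae can be shown to be solutions by differentiation and appropriate substitution into the differential equation.

To prove (i) is a solution, differentiating $l = (G M \tau_0^2)^{1/3}\,(\cosh\eta - 1)$ and $t = \tau_0\,(\sinh\eta - \eta)$, we get

$$\frac{dl}{d\eta} = (G M \tau_0^2)^{1/3}\,\sinh\eta \quad\text{and}\quad \frac{dt}{d\eta} = \tau_0\,(\cosh\eta - 1) .$$

Therefore,

$$\frac{dl}{dt} = \frac{dl}{d\eta} \left(\frac{dt}{d\eta}\right)^{-1} = \frac{(G M \tau_0^2)^{1/3}\,\sinh\eta}{\tau_0\,(\cosh\eta - 1)} .$$

Differentiating again,

$$\frac{d}{d\eta}\left(\frac{dl}{dt}\right) = \frac{(G M \tau_0^2)^{1/3}\cosh\eta\;\tau_0\,(\cosh\eta - 1) - (G M \tau_0^2)^{1/3}\sinh\eta\;\tau_0\sinh\eta}{\tau_0^2\,(\cosh\eta - 1)^2}$$

$$= -\frac{(G M \tau_0^2)^{1/3}}{\tau_0\,(\cosh\eta - 1)} \quad\text{using } \cosh^2 x - \sinh^2 x \equiv 1 .$$

Therefore,

$$\frac{d^2 l}{dt^2} = \frac{d}{d\eta}\left(\frac{dl}{dt}\right) \left(\frac{dt}{d\eta}\right)^{-1} = -\frac{(G M \tau_0^2)^{1/3}}{\tau_0\,(\cosh\eta - 1)}\,\frac{1}{\tau_0\,(\cosh\eta - 1)} = -\frac{(G M \tau_0^2)^{1/3}}{\tau_0^2\,(\cosh\eta - 1)^2} .$$

Therefore,

$$\frac{d^2 l}{dt^2} + \frac{G M}{l^2} = -\frac{(G M \tau_0^2)^{1/3}}{\tau_0^2\,(\cosh\eta - 1)^2} + \frac{G M}{\left[(G M \tau_0^2)^{1/3}\,(\cosh\eta - 1)\right]^2}$$

$$= -\frac{(G M)^{1/3}}{\tau_0^{4/3}\,(\cosh\eta - 1)^2} + \frac{(G M)^{1/3}}{\tau_0^{4/3}\,(\cosh\eta - 1)^2} = 0 .$$

So, $\dfrac{d^2 l}{dt^2} = -\dfrac{G M}{l^2}$, the original equation of motion.

Therefore the parametric equations are solutions to the equation of motion.

To prove (ii) is a solution, differentiating $l = (9 G M/2)^{1/3}\,t^{2/3}$ we get,

$$\frac{dl}{dt} = \frac{2}{3} \left(\frac{9 G M}{2}\right)^{1/3} \frac{1}{t^{1/3}} \quad\text{and}\quad \frac{d^2 l}{dt^2} = -\frac{2}{9} \left(\frac{9 G M}{2}\right)^{1/3} \frac{1}{t^{4/3}} .$$

Therefore,

$$\frac{d^2 l}{dt^2} + \frac{G M}{l^2} = -\frac{2}{9} \left(\frac{9 G M}{2}\right)^{1/3} \frac{1}{t^{4/3}} + \frac{G M}{(9 G M/2)^{2/3}\,t^{4/3}}$$

$$= -\frac{2}{9} \left(\frac{9 G M}{2}\right)^{1/3} \frac{1}{t^{4/3}} + \frac{2}{9} \left(\frac{9 G M}{2}\right)^{1/3} \frac{1}{t^{4/3}} = 0 .$$

So, $\dfrac{d^2 l}{dt^2} = -\dfrac{G M}{l^2}$, the original equation of motion.

Therefore $l = (9 G M/2)^{1/3}\,t^{2/3}$ is a solution to the equation of motion.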

Case (i) represents the situation where the mass of the Galaxy–M31 system is insufficient to reverse their initial movement apart. They continue to move apart at all times, with dl/dt > 0 even in the limit t → ∞. The separation l continues to increase with time at the present day: it predicts that M31 will still be moving away from the Galaxy today, which is inconsistent with radial velocity measurements, which show that M31 is approaching the Galaxy.

Case (ii) represents the situation where the mass of the Galaxy–M31 system is just insufficient to reverse their initial movement apart. The separation l continues to increase with time to the present day, but dl/dt → 0 in the limit t → ∞. It also predicts that M31 will still be moving away from the Galaxy today, which is inconsistent with radial velocity measurements.
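The physical distinction between the two cases can be made concrete through the energy per unit reduced mass, E = (dl/dt)²/2 − GM/l, which is conserved by the equation of motion. The sketch below is plain Python with GM = τ₀ = 1 in arbitrary units; the function names are illustrative. For case (i) the algebra gives E = (GMτ₀²)^{2/3}/(2τ₀²) > 0 (unbound), and for case (ii) it gives E = 0 (marginally bound).

```python
import math

def case_i(eta, gm=1.0, tau0=1.0):
    """Parametric solution (i): returns (t, l, dl/dt) at parameter eta > 0."""
    amp = (gm * tau0**2) ** (1.0 / 3.0)
    l = amp * (math.cosh(eta) - 1.0)
    t = tau0 * (math.sinh(eta) - eta)
    v = amp * math.sinh(eta) / (tau0 * (math.cosh(eta) - 1.0))
    return t, l, v

def case_ii(t, gm=1.0):
    """Solution (ii): returns (l, dl/dt) at time t > 0."""
    amp = (4.5 * gm) ** (1.0 / 3.0)
    return amp * t ** (2.0 / 3.0), (2.0 / 3.0) * amp * t ** (-1.0 / 3.0)

def specific_energy(l, v, gm=1.0):
    """E = v^2 / 2 - GM / l, constant along any solution of d2l/dt2 = -GM/l^2."""
    return 0.5 * v * v - gm / l

for eta in (0.5, 1.0, 3.0):
    t, l, v = case_i(eta)
    print("case (i)", t, specific_energy(l, v))

for t in (0.1, 1.0, 10.0):
    l, v = case_ii(t)
    print("case (ii)", t, specific_energy(l, v))
```

In both cases dl/dt stays positive, so neither can reproduce an approaching M31; a bound orbit (negative E, the sin and cos solution from the lectures) is needed for that.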

