Lecture Notes in Physics

Founding Editors: W. Beiglböck, J. Ehlers, K. Hepp, H. Weidenmüller

Editorial Board
R. Beig, Vienna, Austria
W. Beiglböck, Heidelberg, Germany
W. Domcke, Garching, Germany
B.-G. Englert, Singapore
U. Frisch, Nice, France
F. Guinea, Madrid, Spain
P. Hänggi, Augsburg, Germany
G. Hasinger, Garching, Germany
W. Hillebrandt, Garching, Germany
R. L. Jaffe, Cambridge, MA, USA
W. Janke, Leipzig, Germany
H. v. Löhneysen, Karlsruhe, Germany
M. Mangano, Geneva, Switzerland
J.-M. Raimond, Paris, France
D. Sornette, Zurich, Switzerland
S. Theisen, Potsdam, Germany
D. Vollhardt, Augsburg, Germany
W. Weise, Garching, Germany
J. Zittartz, Köln, Germany
The Lecture Notes in Physics

The series Lecture Notes in Physics (LNP), founded in 1969, reports new developments in physics research and teaching, quickly and informally, but with a high quality and the explicit aim to summarize and communicate current knowledge in an accessible way. Books published in this series are conceived as bridging material between advanced graduate textbooks and the forefront of research and to serve three purposes:

• to be a compact and modern up-to-date source of reference on a well-defined topic
• to serve as an accessible introduction to the field to postgraduate students and nonspecialist researchers from related areas
• to be a source of advanced teaching material for specialized seminars, courses and schools

Both monographs and multi-author volumes will be considered for publication. Edited volumes should, however, consist of a very limited number of contributions only. Proceedings will not be considered for LNP.

Volumes published in LNP are disseminated both in print and in electronic formats, the electronic archive being available at springerlink.com. The series content is indexed, abstracted and referenced by many abstracting and information services, bibliographic networks, subscription agencies, library networks, and consortia.

Proposals should be sent to a member of the Editorial Board, or directly to the managing editor at Springer:

Christian Caron
Springer Heidelberg
Physics Editorial Department I
Tiergartenstrasse 17
69121 Heidelberg, Germany
christian.caron@springer.com
Sverre J. Aarseth
Christopher A. Tout
Rosemary A. Mardling (Eds.)

The Cambridge N-Body Lectures

Sverre J. Aarseth
University of Cambridge
Institute of Astronomy
Madingley Road
Cambridge CB3 0HA
United Kingdom
sverre@ast.cam.ac.uk

Christopher A. Tout
University of Cambridge
Institute of Astronomy
Madingley Road
Cambridge CB3 0HA
United Kingdom
cat@ast.cam.ac.uk

Rosemary A. Mardling
School of Mathematical Sciences
Monash University
Victoria 3800
Australia
mardling@sci.monash.edu.au
Aarseth, S.J. et al. (Eds.), The Cambridge N-Body Lectures, Lect. Notes Phys. 760 (Springer, Berlin Heidelberg 2008), DOI 10.1007/978-1-4020-8431-7

The Royal Astronomical Society Series: A series on Astronomy & Astrophysics, Geophysics, Solar and Solar-terrestrial Physics, and Planetary Sciences

ISBN 978-1-4020-8430-0
e-ISBN 978-1-4020-8431-7
DOI 10.1007/978-1-4020-8431-7
Lecture Notes in Physics ISSN 0075-8450
Library of Congress Control Number: 2008929549
© 2008 Springer-Verlag Berlin Heidelberg

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Cover design: Integra Software Services Pvt. Ltd.
Printed on acid-free paper
springer.com
Preface
This book gives a comprehensive introduction to the tools required for direct N-body simulations. The contributors are all active researchers who write in detail on their own special fields, in which they are leading international experts. It is their previous and current connections with the Cambridge Institute of Astronomy, as staff or visitors, that give rise to the title. The material is generally at a level suitable for a graduate student or postdoctoral worker entering the field.

The book begins with a detailed description of the codes available for N-body simulations. In a second chapter, we find different mathematical formulations for special treatments of close encounters involving binaries or multiple systems which have been implemented. The concept of chaos and stability plays a fundamental role in celestial mechanics and is highlighted here in a presentation of a new formalism for the three-body problem. The emphasis on collisional stellar dynamics enables the scope to be enlarged by including methods relevant for comparison purposes. Modern star cluster simulations include additional astrophysical effects by modelling real stars instead of point-masses. Several contributions cover the basic theory and comprehensive treatments of stellar evolution for single stars as well as binaries. Questions concerning initial conditions are also discussed in depth. Further connections with reality are established by an observational approach to data analysis of actual and simulated star clusters. Finally, important aspects of hardware requirements are described, with special reference to parallel and GRAPE-type computers. The extensive chapters provide an essential framework for a variety of N-body simulations.
During an extensive summer school on astrophysical N-body simulations held in Cambridge (www.cambody.org), the Royal Astronomical Society encouraged us to edit a volume on the topic, to be published in The Royal Astronomical Society Series. Subsequently we collected the tutorial lecture notes assembled in this volume. We would like to take this opportunity to thank the Royal Astronomical Society for sponsoring the school and the Institute of Astronomy for provision of school facilities. We are grateful to all the authors who took time off from their busy schedules to deliver the manuscripts, which were then checked for both style and scientific content by the editors. This collection of topics related to the gravitational N-body problem will prove useful to both students and researchers in years to come.
Cambridge, May 2008
Sverre J. Aarseth
Christopher A. Tout
Rosemary A. Mardling
Contents

1 Direct N-Body Codes
Sverre J. Aarseth
  1.1 Introduction
  1.2 Basic Features
  1.3 Data Structure
  1.4 N-Body Codes
  1.5 Hermite Integration
  1.6 Ahmad–Cohen Neighbour Scheme
  1.7 Time-Step Criteria
  1.8 Two-Body Regularization
  1.9 KS Decision-Making
  1.10 Hierarchical Systems
  1.11 Three-Body Regularization
  1.12 Wheel-Spoke Regularization
  1.13 Post-Newtonian Treatment
  1.14 Chain Regularization
  1.15 Astrophysical Procedures
  1.16 GRAPE Implementations
  1.17 Practical Aspects
  References

2 Regular Algorithms for the Few-Body Problem
Seppo Mikkola
  2.1 Introduction
  2.2 Hamiltonian Manipulations
  2.3 Coordinate Transformations
  2.4 KS-Chain(s)
  2.5 Algorithmic Regularization
  2.6 N-Body Algorithms
  2.7 AR-Chain
  2.8 Basic Algorithms for the Extrapolation Method
  2.9 Accuracy of the AR-Chain
  2.10 Conclusions
  References

3 Resonance, Chaos and Stability: The Three-Body Problem in Astrophysics
Rosemary A. Mardling
  3.1 Introduction
  3.2 Resonance in Nature
  3.3 The Mathematics of Resonance
  3.4 The Three-Body Problem
  References

4 Fokker–Planck Treatment of Collisional Stellar Dynamics
Marc Freitag
  4.1 Introduction
  4.2 Boltzmann Equation
  4.3 Fokker–Planck Equation
  4.4 Orbit-Averaged Fokker–Planck Equation
  4.5 The Fokker–Planck Method in Use
  Acknowledgement
  References

5 Monte-Carlo Models of Collisional Stellar Systems
Marc Freitag
  5.1 Introduction
  5.2 Basic Principles
  5.3 Detailed Implementation
  5.4 Some Results and Possible Future Developments
  Acknowledgement
  References

6 Particle-Mesh Technique and SUPERBOX
Michael Fellhauer
  6.1 Introduction
  6.2 Particle-Mesh Technique
  6.3 Multi-Grid Structure of Superbox
  References

7 Dynamical Friction
Michael Fellhauer
  7.1 What is Dynamical Friction?
  7.2 How to Quantify Dynamical Friction
  7.3 Dynamical Friction in Numerical Simulations
  7.4 Dynamical Friction of an Extended Object
  References

8 Initial Conditions for Star Clusters
Pavel Kroupa
  8.1 Introduction
  8.2 Initial 6D Conditions
  8.3 The Stellar IMF
  8.4 The Initial Binary Population
  8.5 Summary
  Acknowledgement
  References

9 Stellar Evolution
Christopher A. Tout
  9.1 Observable Quantities
  9.2 Structural Equations
  9.3 Equation of State
  9.4 Radiation Transport
  9.5 Convection
  9.6 Energy Generation
  9.7 Boundary Conditions
  9.8 Evolutionary Tracks
  9.9 Stellar Evolution of Many Bodies
  References

10 N-Body Stellar Evolution
Jarrod R. Hurley
  10.1 Motivation
  10.2 Method and Early Approaches
  10.3 The SSE Package
  10.4 N-Body Implementation
  10.5 Some Results
  References

11 Binary Stars
Christopher A. Tout
  11.1 Orbits
  11.2 Tides
  11.3 Mass Transfer
  11.4 Period Evolution
  11.5 Actual Types
  References

12 N-Body Binary Evolution
Jarrod R. Hurley
  12.1 Introduction
  12.2 The BSE Package
  12.3 N-Body Implementation
  12.4 Binary Evolution Results
  References

13 The Workings of a Stellar Evolution Code
Ross Church
  13.1 Introduction
  13.2 Equations
  13.3 Variables and Functions
  13.4 Method of Solution
  13.5 The Structure of stars
  13.6 Problematic Phases of Evolution
  13.7 Robustness of Results
  References

14 Realistic N-Body Simulations of Globular Clusters
A. Dougal Mackey
  14.1 Introduction
  14.2 Realistic N-Body Modelling – Why and How
  14.3 Case Study: Massive Star Clusters in the Magellanic Clouds
  14.4 Summary
  References

15 Parallelization, Special Hardware and Post-Newtonian Dynamics in Direct N-Body Simulations
Rainer Spurzem, Ingo Berentzen, Peter Berczik, David Merritt, Pau Amaro-Seoane, Stefan Harfst, and Alessia Gualandris
  15.1 Introduction
  15.2 Relativistic Dynamics of Black Holes in Galactic Nuclei
  15.3 Example of Application to Galactic Nuclei
  15.4 N-Body Algorithms and Parallelization
  15.5 Special Hardware, GRAPE and GRACE Cluster
  15.6 Performance Tests
  15.7 Outlook and Ahmad–Cohen Neighbour Scheme
  Acknowledgement
  References

A Educational N-Body Websites
Francesco Cancelliere, Vicki Johnson, and Sverre Aarseth
  A.1 Introduction
  A.2 www.NBodyLab.org
  A.3 www.Sverre.com
  A.4 Educational Utility
  References

Index
1

Direct N-Body Codes

Sverre J. Aarseth

University of Cambridge, Institute of Astronomy, Madingley Road, Cambridge CB3 0HA, UK
sverre@ast.cam.ac.uk
1.1 Introduction

The classical formulation of the gravitational N-body problem is deceptively simple. Given initial values of N masses, coordinates and velocities, the task is to calculate the future orbits. Although the motions are in principle completely determined by the underlying differential equations, accurate solutions can only be obtained by numerical methods. Self-gravitating stellar systems experience highly complicated interactions, which require efficient procedures for studying the long-term behaviour. In this chapter, we are concerned with describing aspects relating to direct summation codes that have been remarkably successful. This is the most intuitive approach, and present-day technology allows surprisingly large systems to be considered for a direct attack. Astronomers and mathematicians alike are interested in many aspects of dynamical evolution, ranging from highly idealized systems to star clusters where complex astrophysical processes play an important role. Hence the need for modelling such behaviour poses additional challenges for both the numerical analyst and the code designer.

In the present chapter, we concentrate on describing some relevant procedures for star cluster simulation codes. Such applications are mainly directed towards studying large clusters. However, many techniques dealing with few-body dynamics have turned out to be useful here, and their implementation will therefore be discussed too. At the same time, the GRAPE special-purpose supercomputers are increasingly being used for large-N simulations. Hence a diversity of tools is now employed in modern simulations, and the practitioner needs to be versatile or part of a team. This development has led to complicated codes, which also require an effort in efficient utilization as well as interpretation of the results. It follows that designers of large N-body codes need to pay attention to documentation as well as the programming itself. Finally, bearing in mind the increasing complexity of challenging problems posed by new observations, further progress in software is needed to keep pace with the ongoing hardware developments.
Aarseth, S.J.: Direct N-Body Codes. Lect. Notes Phys. 760, 1–30 (2008)
DOI 10.1007/978-1-4020-8431-7_1 © Springer-Verlag Berlin Heidelberg 2008
1.2 Basic Features

Before delving more deeply into the underlying algorithms, it is desirable to define units and introduce the data structure that forms the backbone of a general N-body code. From dimensional analysis, we first construct fiducial velocity and time units by $V^* = 1 \times 10^{-5}\,(G M_\odot / L^*)^{1/2}$ km s$^{-1}$ and $T^* = (L^{*3} / G M_\odot)^{1/2}$ s, with $G$ the gravitational constant and $L^* = 3 \times 10^{18}$ cm as a convenient length unit. Given the length scale or virial radius $R_V$ in pc and total mass $N M_S$ in $M_\odot$, where $M_S$ is the average mass specified as input, we can now write the corresponding values for a star cluster model as $V^* = 6.557 \times 10^{-2}\,(N M_S / R_V)^{1/2}$ km s$^{-1}$ and $T^* = 14.94\,(R_V^3 / N M_S)^{1/2}$ Myr. Hence scaled (or internal) N-body units of distance, velocity and time are converted to corresponding astrophysical units (pc, km s$^{-1}$, Myr) by $\tilde{r} = R_V\, r$, $\tilde{v} = V^* v$, $\tilde{t} = T^* t$. Finally, individual masses in $M_\odot$ are obtained from $\tilde{m} = M_S\, m$, where $M_S$ is now redefined in terms of the scaled mean mass.
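As a rough illustration of the conversion factors quoted above, the following Python sketch evaluates $V^*$ and $T^*$ for a given cluster model and converts scaled quantities to astrophysical units. The function names `scaling_factors` and `to_physical` and their argument names are hypothetical and do not come from any of the codes discussed here; only the two numerical coefficients are taken from the text.

```python
import math


def scaling_factors(n, mean_mass, r_virial):
    """Return (V*, T*) in km/s and Myr.

    n          number of stars N
    mean_mass  average stellar mass M_S in solar masses
    r_virial   virial radius R_V in pc
    The coefficients 6.557e-2 and 14.94 are the values quoted in the text.
    """
    total_mass = n * mean_mass
    v_star = 6.557e-2 * math.sqrt(total_mass / r_virial)
    t_star = 14.94 * math.sqrt(r_virial**3 / total_mass)
    return v_star, t_star


def to_physical(r, v, t, n, mean_mass, r_virial):
    """Convert scaled distance, speed and time to pc, km/s and Myr."""
    v_star, t_star = scaling_factors(n, mean_mass, r_virial)
    return r_virial * r, v_star * v, t_star * t


# Example model (illustrative numbers): 1000 stars of mean mass 0.5 solar
# masses within a virial radius of 1 pc.
v_star, t_star = scaling_factors(1000, 0.5, 1.0)
```

For the example model the total mass is 500 solar masses, so both factors follow from a single square root; a time of one scaled unit then corresponds to `t_star` Myr.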
As the next logical step on the road to an N-body simulation, we consider matters relating to the initial data. Let us assume that a complete set of initial conditions has been generated in the form $m_i, \mathbf{r}_i, \mathbf{v}_i$ for $N$ particles, where the masses, coordinates and velocities can be in any units. A standard cluster model is essentially defined by $N$, $M_S$, $R_V$, together with a suitable initial mass function (IMF). After assigning the individual data, we evaluate the kinetic and potential energy, $K$ and $U$, taking $U < 0$. The velocities are scaled according to the virial theorem by taking $\tilde{\mathbf{v}}_i = q\,\mathbf{v}_i$, where $q = (Q_V |U| / K)^{1/2}$ and $Q_V$ is an input parameter (0.5 for overall equilibrium). Note that in general the virial energy should be used; however, the additional terms are not known ahead of the scaling. We now introduce so-called standard units by adopting the scaling $G = 1$, $\sum m_i = 1$, $E_0 = -0.25$, where $E_0$ is the new total energy ($< 0$). Here the energy condition is only applied for bound systems ($Q_V < 1$); otherwise the convention $E_0 = 0.25$ is adopted. The final scaling is performed by $\tilde{\mathbf{r}}_i = \mathbf{r}_i / S$, $\tilde{\mathbf{v}}_i = \mathbf{v}_i S^{1/2}$ (applied to the velocities already scaled by $q$), with $S = E_0 / (q^2 K + U)$. These variables define a standard crossing time $T_{\rm cr} = 2\sqrt{2}\, T^*$ Myr.

Many simulations include primordial binary stars for greater realism. Because of their internal binding energies, the above scaling cannot be implemented directly. Instead, the components of each binary are first combined into one object, whereupon the reduced population of $N_s$ single stars and $N_b$ binaries is subject to the standard scaling. It then remains for the internal two-body elements, such as semi-major axis, eccentricity and relevant angles, to be assigned, together with the mass ratio. The choice of distributions is very wide but should be motivated by astrophysical considerations. Of special interest here are the periods and mass ratios, which may well be correlated for luminous stars (e.g. spectroscopic binaries). More complicated ways of providing initial conditions with primordial binaries can readily be incorporated. Thus, for example, a consistent set of initial conditions that does not require scaling may be uploaded. Such a data set might in fact be accepted by a well-written code, but this practice is not recommended.
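The scaling of single stars to standard units described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the routine used in any particular code: the function names are hypothetical, the masses are normalized first so that the energies are evaluated with $G = 1$ and $\sum m_i = 1$, no centre-of-mass correction is made, and the degenerate case $Q_V = 1$ is not handled.

```python
import math


def kinetic_energy(masses, velocities):
    """Total kinetic energy K."""
    return 0.5 * sum(m * sum(c * c for c in v)
                     for m, v in zip(masses, velocities))


def potential_energy(masses, positions):
    """Total potential energy U (negative) with G = 1."""
    u = 0.0
    for i in range(len(masses)):
        for j in range(i + 1, len(masses)):
            u -= masses[i] * masses[j] / math.dist(positions[i], positions[j])
    return u


def scale_to_standard_units(masses, positions, velocities, q_virial=0.5):
    """Return (m, r, v) with G = 1, sum(m) = 1 and total energy E0.

    Assumption: masses are normalized before K and U are evaluated.
    """
    total = sum(masses)
    m = [mi / total for mi in masses]
    k = kinetic_energy(m, velocities)
    u = potential_energy(m, positions)
    q = math.sqrt(q_virial * abs(u) / k)        # virial-theorem factor
    e0 = -0.25 if q_virial < 1.0 else 0.25      # convention from the text
    s = e0 / (q * q * k + u)                    # S = E0 / (q^2 K + U)
    r = [tuple(c / s for c in ri) for ri in positions]
    v = [tuple(c * q * math.sqrt(s) for c in vi) for vi in velocities]
    return m, r, v
```

Since the potential energy scales as $S$ under $\mathbf{r} \to \mathbf{r}/S$ and the kinetic energy as $q^2 S$ under $\mathbf{v} \to q S^{1/2} \mathbf{v}$, the scaled system has total energy $S(q^2 K + U) = E_0$ and virial ratio $K/|U| = Q_V$, which provides a simple consistency check on any implementation.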
1.3 Data Structure

The time has now come to introduce the data structure used in the Cambridge N-body codes. Complications of describing the quantities in a stellar system arise when some objects are no longer single stars. In the first instance, hard binaries are treated by two-body regularization (Kustaanheimo & Stiefel 1965, hereafter KS). Now a convenient description refers to the relative motion as well as that of the centre of mass (c.m.). For the purposes of sequential predictions and force summations, it is natural to place the two KS components first in all relevant arrays, followed by single stars, with the c.m. last. Thus, given $N_p$ pairs, the type of object can be distinguished by its location, $i$, in the array compared to $2N_p$ and $N$. Likewise for long-lived triples, where the inner binary of the hierarchy becomes the first member of the new KS pair and the outer component the second.

The new arrangement necessitates the introduction of so-called ghost stars, which retain the quantities associated with the outer component, except that the mass is temporarily set to zero. In other words, a ghost star is a dormant particle without any gravitational effect, since it now forms part of the triple. Generalization to a quadruple consisting of two binaries forming a new KS follows readily. Note that in this case a ghost binary must be defined, as well as a ghost c.m. particle. Higher-order systems of increasing complexity are defined in an analogous manner. The treatment of hierarchies continues as long as they are defined to be stable, as will be discussed in subsequent sections.

It now remains to introduce the final type of object in the form of a compact subsystem which is treated by chain regularization (Mikkola & Aarseth 1993). Briefly, the idea here is to employ pairwise two-body regularization for the strongest interactions and include the other terms as perturbations. Such systems are invariably short-lived, but the special treatment is most conveniently carried out within the context of the standard data structure. At least two of the chain members are former components of a KS binary, and the initial membership may be three or four. These systems are usually created following a strong interaction between a binary and another single particle or binary. Here one of the members is assigned to the role as the c.m. for the subsystem, while the others become ghosts.
• Single stars: $2N_p < i \le N$; $\mathcal{N}_i = i$
• KS pairs: $1 \le i \le 2N_p$; $i_p = i_{\rm cm} - N$
• C.m. particles: $i > N$; $\mathcal{N}_{\rm cm} = N_0 + \mathcal{N}_k$, $k = 2i_p - 1$
• Stable triples: KS + ghost; $\mathcal{N}_{\rm cm} = -\mathcal{N}_k$
• Ghost particles: $\mathcal{N}_{\rm ghost} = \mathcal{N}_{2i_p - 1}$; $m_{\rm ghost} = 0$
• Stable quadruples: KS + KS ghost; $\mathcal{N}_{\rm cm} = -\mathcal{N}_k$
• Higher orders: T + KS; $\mathcal{N}_{\rm cm} = -(2N_0 + \mathcal{N}_k)$
• Chain members: $2N_p < i_{\rm cm} \le N$; $\mathcal{N}_{\rm cm} = 0$

The table summarizes the key features of the data structure. In order to keep track of the identity of the particles, we also assign a name to each, denoted by $\mathcal{N}_i$. This quantity is useful for distinguishing the type of object, i.e. whether single, binary or even chain c.m. Thus the name of a binary c.m. is defined by $\mathcal{N}_{\rm cm} = N_0 + \mathcal{N}_k$, where $N_0$ is the initial particle number and $\mathcal{N}_k$ is the name of the first KS component. Likewise, the c.m. of hierarchical systems of different levels are identified by $\mathcal{N}_{\rm cm} < 0$, while $\mathcal{N}_i = 0$ for a chain c.m. with $i \le N$. Note that an arbitrary number of binaries can be accommodated but only one chain. Given the location $i_{\rm cm}$ of any c.m., the corresponding KS pair index is obtained from $i_p = i_{\rm cm} - N$, with the components at $2i_p - 1$, $2i_p$.
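The index conventions summarized above lend themselves to a short sketch. The Python functions below are hypothetical helpers written for illustration (they are not routines of the Cambridge codes); locations are kept 1-based to match the text, and ghost particles, which are identified by zero mass rather than by location, are not treated.

```python
def classify(i, n, n_pairs):
    """Classify the 1-based array location i for N particles and N_p pairs."""
    if not 1 <= i <= n + n_pairs:
        raise IndexError("location outside the particle arrays")
    if i <= 2 * n_pairs:
        return "KS component"
    if i <= n:
        return "single"      # a chain c.m. also lives here, with name 0
    return "c.m."


def ks_components(i_cm, n):
    """Locations (2*i_p - 1, 2*i_p) of the KS pair whose c.m. is at i_cm."""
    i_pair = i_cm - n
    return 2 * i_pair - 1, 2 * i_pair


def cm_kind(name):
    """Interpret the name of a c.m. body using the sign convention above."""
    if name < 0:
        return "hierarchy"
    if name == 0:
        return "chain"
    return "binary"
```

With $N = 10$ and $N_p = 2$, for example, locations 1 to 4 hold KS components, 5 to 10 hold single stars, and the c.m. at location 12 refers to the pair stored at locations 3 and 4.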
A new KS pair is created by exchanging the individual particle components with the two first single-particle arrays and introducing the corresponding c.m. at $N + N_p$ after $N_p$ has been updated. Conversely, termination of a KS solution requires the former components to be placed in the first available single-particle array (unless already in the correct location) and the c.m. to be eliminated. The case of terminating a hierarchical system is more complicated and will be considered later.
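The bookkeeping for creating a KS pair can be illustrated on a single array of names. In the sketch below, which is an assumption-laden toy rather than the actual procedure, `names` is a Python list (0-based) holding the names at locations 1 to $N + N_p$, and the function name and arguments are hypothetical. A real code would apply the same exchange to every particle array.

```python
def create_ks_pair(names, n, n_pairs, n0, a, b):
    """Turn the single stars at 0-based positions a and b into a KS pair.

    names    list of names of length n + n_pairs (modified in place)
    n        current particle number N
    n_pairs  current number of KS pairs N_p
    n0       initial particle number N_0, used to form the c.m. name
    Returns the updated number of pairs.
    """
    first = 2 * n_pairs                  # first single-star slot (0-based)
    if not (first <= a < n and first <= b < n and a != b):
        raise ValueError("a and b must be two distinct single stars")
    # Exchange the first component with the first single-star slot.
    names[first], names[a] = names[a], names[first]
    if b == first:                       # b's particle was just moved to a
        b = a
    # Exchange the second component with the next slot.
    names[first + 1], names[b] = names[b], names[first + 1]
    n_pairs += 1
    # The new c.m. goes to location N + N_p (1-based), i.e. the list end.
    names.append(n0 + names[first])
    return n_pairs
```

Starting from five single stars named 1 to 5, pairing the stars named 4 and 2 places them at the first two locations and appends a c.m. named $N_0 + 4 = 9$.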
There are many advantages of having a clearly defined and simple data structure. The analogy with molecules is striking, and this also extends to interactions, since some objects may combine while others are disrupted in response to internal or external effects. On the debit side, all arrays of size $N + N_p$ must be in correct sequential order after each creation or destruction of an object. Neighbour lists, to be discussed later, must also be updated consistently. However, the overheads still form a small fraction of the total CPU time. The same procedure applies when distant particles, known as escapers, are removed from the data set. Again, in the latter case, the name identifies the type of object involved.
1.4 N-Body Codes

A general N-body code consists of three main parts in the form of initial conditions, integration and run-time data analysis of the results. In the preceding sections, we have discussed some relevant aspects dealing with the initial setup and data structure. Before attacking the next stage, it is useful to introduce the various algorithms that are used to advance the solutions. Ideally, different objects require a specially designed integration method in order to exploit the characteristic features. We start by considering single stars, which usually dominate by numbers, and concentrate on the challenge of studying large systems. The first speed-up of such calculations can be obtained by assigning individual time-steps according to the local conditions. Since a Taylor series is used to describe the motion, we are concerned with relative convergence, where smooth orbits in low-density regions may have longer steps.

From the $N^2$ nature of the gravitational problem, the calculation of the accelerations requires an increasing fraction of the total effort. Hence the simple approach of direct summation for each integration step is too expensive and restricts the type of problem for investigation. A second efficiency feature, called a neighbour scheme (Ahmad & Cohen 1973, hereafter AC), enables consistent solutions to be obtained while still employing direct summation. The basic idea here is to introduce two time-scales for each particle, where contributions from close neighbours are evaluated frequently by direct summation, while the more distant forces are included (and recalculated) on a longer time-scale. This two-polynomial scheme speeds up the calculation considerably at the expense of extra programming. Finally, we also mention the modern way to study large $N$ and retain strict summation, namely special-purpose computers known as GRAPE (Makino et al. 1997).
Close encounters present another challenge that must be faced, either in the form of hyperbolic motion or as persistent binaries. Although the time-steps of two interacting bodies can be reduced accordingly, this may lead to significant accumulation of errors. A more elegant way, practised in the Cambridge codes, is to employ two-body regularization, as mentioned above. Now the programming requirements are quite formidable. However, the payoff is that such solutions can be used with confidence, since the equations of motion are linear for weak perturbations.

The next level of complexity arises when a regularized binary experiences a strong interaction with another object. A reliance on the two-body formulation makes for inefficient treatment during resonant interactions. Compact subsystems may instead be studied by three-body (Aarseth & Zare 1974) or chain regularization (Mikkola & Aarseth 1993). At present, the former may be used if the external perturbations are small, while the latter takes account of perturbations and allows for up to six members. Once again, the programming effort is substantial but permits the study of extremely energetic interactions.

One more special procedure remains to be discussed. Although less spectacular, the treatment of long-lived hierarchies requires careful decision-making. A hierarchy is said to be stable if the orbital elements satisfy certain conditions. The main property of a stable system is that the inner semi-major axis should be secularly constant in the presence of an outer bound perturber. Essentially, the outer pericentre needs to exceed the inner semi-major axis by a factor depending on the orbital parameters (Mardling & Aarseth 1999). Once deemed to be stable, the closest perturber is regularized with respect to the inner binary c.m., which is now treated as a point-mass. However, the special configuration is terminated on large external perturbations or if the outer eccentricity increases sufficiently to violate the stability criterion.

The procedures outlined above constitute a veritable tool box for a wide variety of N-body simulations. Efficient use of these tools requires a complex network of decision-making. Moreover, it is desirable that the associated overheads should only represent a small proportion of the total CPU effort. Some of the relevant algorithms will be presented in later sections. Suffice it for now to state that this desirable requirement has been met, as can be ascertained by so-called run-time profiling.
In the following, we shall concentrate on the code nbody6, which combines all of the above features and is suitable for studying realistic star clusters as well as idealized systems on laptops and workstations. However, a section will be devoted to GRAPE procedures. With the above review as background, we now move to the next stage of presenting some of the main integration algorithms. In each case, further details are available elsewhere (Aarseth 2003).
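1.5 Hermite Integration

Let us start by looking at the derivation of the Hermite scheme that has proved so successful in modern simulations. We expand the Taylor series solution for the coordinates and velocities to fourth order in an interval $\Delta t$ by

$$
\begin{aligned}
\mathbf{x}_1 &= \mathbf{x}_0 + \mathbf{v}_0 \Delta t + \frac{\mathbf{a}_0}{2} \Delta t^2 + \frac{\dot{\mathbf{a}}_0}{6} \Delta t^3 + \frac{\mathbf{a}^{(2)}_0}{24} \Delta t^4 + \alpha \frac{\mathbf{a}^{(3)}_0}{120} \Delta t^5 , \\
\mathbf{v}_1 &= \mathbf{v}_0 + \mathbf{a}_0 \Delta t + \frac{\dot{\mathbf{a}}_0}{2} \Delta t^2 + \frac{\mathbf{a}^{(2)}_0}{6} \Delta t^3 + \frac{\mathbf{a}^{(3)}_0}{24} \Delta t^4 .
\end{aligned}
\tag{1.1}
$$

Here $\mathbf{a}$ represents the acceleration, or force per unit mass, which will also be referred to as force for convenience, and $\alpha$ is an adjustable constant. The higher-order Newmark implicit method (Newmark 1959) takes the form

$$
\begin{aligned}
\mathbf{x}_1 &= \mathbf{x}_0 + \frac{1}{2} (\mathbf{v}_0 + \mathbf{v}_1) \Delta t - \frac{\alpha}{10} (\mathbf{a}_1 - \mathbf{a}_0) \Delta t^2 + \frac{6\alpha - 5}{120} (\dot{\mathbf{a}}_1 + \dot{\mathbf{a}}_0) \Delta t^3 , \\
\mathbf{v}_1 &= \mathbf{v}_0 + \frac{1}{2} (\mathbf{a}_1 + \mathbf{a}_0) \Delta t - \frac{1}{12} (\dot{\mathbf{a}}_1 - \dot{\mathbf{a}}_0) \Delta t^2 .
\end{aligned}
\tag{1.2}
$$

As can be verified by substitution for $\mathbf{v}_1$ into the first equation, with $\alpha = 1$, the standard Taylor series is recovered after some simplification, using the expansions

$$
\begin{aligned}
\mathbf{a}_1 &= \mathbf{a}_0 + \dot{\mathbf{a}}_0 \Delta t + \frac{1}{2} \mathbf{a}^{(2)}_0 \Delta t^2 + \frac{1}{6} \mathbf{a}^{(3)}_0 \Delta t^3 , \\
\dot{\mathbf{a}}_1 &= \dot{\mathbf{a}}_0 + \mathbf{a}^{(2)}_0 \Delta t + \frac{1}{2} \mathbf{a}^{(3)}_0 \Delta t^2 .
\end{aligned}
\tag{1.3}
$$

The subscripts 0, 1 can be reversed; hence the formulation is time-symmetric and consistent with the Hermite formulation. It has been shown (Kokubo & Makino 2004) that $\alpha = 7/6$ is the optimal choice for the leading term in the error of the longitude of the periapse. Moreover, secular errors in the orbital elements $a$ and $e$ are removed by using constant time-steps (in the absence of encounters) for small eccentricities, $e \le 0.1$. This makes it an efficient scheme for planetesimal dynamics (see below). It has been found that energy errors are improved by high-order prediction of the particle being advanced.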
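The substitution mentioned above is easily checked with exact rational arithmetic. The following Python sketch is an illustrative verification only, with hypothetical function names: it inserts the Taylor expansions of $\mathbf{a}_1$ and $\dot{\mathbf{a}}_1$ into the Newmark expressions and collects the coefficients term by term. Because each power of $\Delta t$ multiplies exactly one derivative, a plain list of coefficients ordered by power is sufficient.

```python
from fractions import Fraction as Fr


def newmark_velocity_coeffs():
    """Coefficients of a0*dt, adot0*dt^2, a2*dt^3, a3*dt^4 in v1 - v0."""
    sum_a = [Fr(2), Fr(1), Fr(1, 2), Fr(1, 6)]     # (a1 + a0) dt
    diff_adot = [Fr(0), Fr(0), Fr(1), Fr(1, 2)]    # (adot1 - adot0) dt^2
    return [Fr(1, 2) * s - Fr(1, 12) * d for s, d in zip(sum_a, diff_adot)]


def newmark_position_coeffs(alpha):
    """Coefficients of v0*dt, a0*dt^2, adot0*dt^3, a2*dt^4, a3*dt^5
    in x1 - x0 for a given value of the adjustable constant alpha."""
    alpha = Fr(alpha)
    dv = newmark_velocity_coeffs()
    # (v0 + v1)/2 * dt = v0*dt + (v1 - v0)/2 * dt
    mean_v = [Fr(1)] + [Fr(1, 2) * c for c in dv]
    diff_a = [Fr(0), Fr(0), Fr(1), Fr(1, 2), Fr(1, 6)]    # (a1 - a0) dt^2
    sum_adot = [Fr(0), Fr(0), Fr(2), Fr(1), Fr(1, 2)]     # (adot1 + adot0) dt^3
    return [m - alpha / 10 * d + (6 * alpha - 5) / 120 * s
            for m, d, s in zip(mean_v, diff_a, sum_adot)]
```

Evaluating `newmark_position_coeffs(1)` reproduces the Taylor coefficients 1, 1/2, 1/6, 1/24 and 1/120, while for a general value only the last coefficient changes, to $\alpha/120$, in agreement with the first expansion of this section.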
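It is also instructive to present a traditional formulation of standard Hermite integration. We first write a Taylor series for the force per unit mass, $\mathbf{F}$, and its explicit derivative, $\mathbf{F}^{(1)}$, for a given particle $i$ (with index suppressed) to be advanced by a time interval $t$ as

$$
\begin{aligned}
\mathbf{F} &= \mathbf{F}_0 + \mathbf{F}^{(1)}_0 t + \frac{1}{2} \mathbf{F}^{(2)}_0 t^2 + \frac{1}{6} \mathbf{F}^{(3)}_0 t^3 , \\
\mathbf{F}^{(1)} &= \mathbf{F}^{(1)}_0 + \mathbf{F}^{(2)}_0 t + \frac{1}{2} \mathbf{F}^{(3)}_0 t^2 .
\end{aligned}
\tag{1.4}
$$

After obtaining the initial values $\mathbf{F}_0$, $\mathbf{F}^{(1)}_0$ by summation, the coordinates and velocities of all particles are predicted to low order by

$$
\begin{aligned}
\mathbf{r}_j &= \left[ \left( \frac{1}{6} \mathbf{F}^{(1)}_0 \delta t'_j + \frac{1}{2} \mathbf{F}_0 \right) \delta t'_j + \mathbf{v}_0 \right] \delta t'_j + \mathbf{r}_0 , \\
\mathbf{v}_j &= \left( \frac{1}{2} \mathbf{F}^{(1)}_0 \delta t'_j + \mathbf{F}_0 \right) \delta t'_j + \mathbf{v}_0 ,
\end{aligned}
\tag{1.5}
$$

with $\delta t'_j = t - t_j$, where $t_j$ is the time of the last force calculation. New values $\mathbf{F}$, $\mathbf{F}^{(1)}$ are now obtained in the usual way for the particle under consideration. This enables the higher derivatives to be constructed by inversion, which yields

$$
\begin{aligned}
\mathbf{F}^{(3)}_0 &= \left[ 2 (\mathbf{F}_0 - \mathbf{F}) + (\mathbf{F}^{(1)}_0 + \mathbf{F}^{(1)})\, t \right] \frac{6}{t^3} , \\
\mathbf{F}^{(2)}_0 &= \left[ -3 (\mathbf{F}_0 - \mathbf{F}) - (2 \mathbf{F}^{(1)}_0 + \mathbf{F}^{(1)})\, t \right] \frac{2}{t^2} .
\end{aligned}
\tag{1.6}
$$

Consequently, the fourth-order corrector can be applied to the predicted solution of particle $i$ by adding the contributions

$$
\begin{aligned}
\Delta \mathbf{r}_i &= \frac{1}{24} \mathbf{F}^{(2)}_0 \Delta t^4 + \frac{1}{120} \mathbf{F}^{(3)}_0 \Delta t^5 , \\
\Delta \mathbf{v}_i &= \frac{1}{6} \mathbf{F}^{(2)}_0 \Delta t^3 + \frac{1}{24} \mathbf{F}^{(3)}_0 \Delta t^4 .
\end{aligned}
\tag{1.7}
$$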
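The predictor, the inversion for the higher derivatives and the corrector can be combined into a single step, as in the Python sketch below. This is a minimal one-particle illustration with scalar quantities, not the nbody6 implementation: the function names are hypothetical, the force routine is supplied by the caller, and a unit harmonic oscillator (an assumption chosen because its exact solution is known) stands in for the gravitational force summation.

```python
import math


def hermite_step(r0, v0, f0, fd0, dt, force):
    """Advance one particle by dt with a single predict-evaluate-correct pass.

    force(r, v) must return the force per unit mass and its time derivative.
    Returns the corrected (r, v) and the force values at the predicted state.
    """
    # Low-order prediction of coordinate and velocity.
    rp = ((fd0 * dt / 6 + f0 / 2) * dt + v0) * dt + r0
    vp = (fd0 * dt / 2 + f0) * dt + v0
    # New force and first derivative at the predicted state.
    f, fd = force(rp, vp)
    # Second and third force derivatives at the start of the step.
    f3 = (2 * (f0 - f) + (fd0 + fd) * dt) * 6 / dt**3
    f2 = (-3 * (f0 - f) - (2 * fd0 + fd) * dt) * 2 / dt**2
    # Fourth-order corrector added to the predicted solution.
    r1 = rp + f2 * dt**4 / 24 + f3 * dt**5 / 120
    v1 = vp + f2 * dt**3 / 6 + f3 * dt**4 / 24
    return r1, v1, f, fd


def oscillator(r, v):
    """Illustrative force: F = -r, so that dF/dt = -v."""
    return -r, -v


# One step from r = 1, v = 0; the exact solution is cos(t), -sin(t).
r1, v1, f1, fd1 = hermite_step(1.0, 0.0, -1.0, 0.0, 0.1, oscillator)
position_error = abs(r1 - math.cos(0.1))
velocity_error = abs(v1 + math.sin(0.1))
```

For this step of 0.1 both errors are far smaller than those of the predicted values alone, which shows the gain from the corrector. An iterated scheme would re-evaluate the force at the corrected state and apply the corrector again.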
Before proceeding, we introduce so-called quantized time-steps according to the rule

$$
\Delta t_n = \left( \frac{s_{\max}}{2} \right)^{n-1} ,
\tag{1.8}
$$

where $s_{\max}$ defines the maximum permitted value, usually taken as unity with standard scaling. Hence every time-step $\Delta t_i$ should correspond to some value of $n$, which entails a slight reduction from a provisional choice. The reason for this novel procedure is to reduce the overheads involved in the predictions of all coordinates and velocities, namely once per step. Moreover, this prediction is made by hardware when using GRAPE. This procedure is referred to as a block-step scheme. Thus it requires truncation of the natural step to the nearest value of $n$. Moreover, time-steps can only be increased by a factor of 2 every other time to maintain synchronization of all $t_i + \Delta t_i$.
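A short Python sketch may help to make the block-step rules concrete. It assumes $s_{\max} = 1$, so that the permitted steps are the powers $1, 1/2, 1/4, \ldots$; the function names are hypothetical, and the commensurability test used to decide when a step may be doubled is one plausible reading of the synchronization requirement rather than a transcription of any code.

```python
def quantize_step(dt_natural):
    """Largest permitted block step (a power of 1/2) not exceeding dt_natural."""
    if dt_natural <= 0.0:
        raise ValueError("time-step must be positive")
    dt = 1.0                      # assumed maximum step, s_max = 1
    while dt > dt_natural:
        dt *= 0.5
    return dt


def next_block_step(t, dt_old, dt_natural):
    """Choose the next step for a particle that has just reached time t.

    dt_old is the previous (already quantized) step. Reductions are always
    allowed; an increase is limited to a factor of 2 and is only accepted
    when t is a multiple of the doubled step.
    """
    dt_new = quantize_step(dt_natural)
    if dt_new <= dt_old:
        return dt_new
    if (t / (2.0 * dt_old)).is_integer():
        return 2.0 * dt_old
    return dt_old
```

A particle with step 0.25 arriving at $t = 0.25$ must keep its step even if the natural value has grown, whereas at $t = 0.5$ the step can be doubled to 0.5. This is why increases are possible only every other step.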
Here we also discuss a heliocentric formulation which has proved efficient for planetesimal simulations (Kokubo, Yoshinaga & Makino 1998). In heliocentric coordinates, the equation of motion for a mass-point $m_i$ is given by

$$
\ddot{\mathbf{r}}_i = - \sum_{j=1,\, j \ne i}^{N} m_j \left[ \frac{\mathbf{r}_i - \mathbf{r}_j}{|\mathbf{r}_i - \mathbf{r}_j|^3} + \frac{\mathbf{r}_j}{r_j^3} \right] - \frac{M_0 + m_i}{r_i^3}\, \mathbf{r}_i ,
\tag{1.9}
$$

where $M_0$ is the mass of the central star or dominant body. If the total mass in planetesimals is small (e.g. Saturn's ring), the indirect terms may be neglected.
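The heliocentric equation of motion translates directly into a force loop. The Python sketch below is an illustration with $G = 1$ and a hypothetical function name; it evaluates the acceleration only (a Hermite code would also need the first derivative) and keeps the direct and indirect terms separate so that the latter can be identified easily.

```python
import math


def heliocentric_acceleration(i, masses, positions, m_central):
    """Acceleration of body i in heliocentric coordinates with G = 1.

    masses     list of planetesimal masses m_j
    positions  list of 3-tuples relative to the central body
    m_central  mass M_0 of the central star or dominant body
    """
    ri = positions[i]
    ri3 = math.hypot(*ri) ** 3
    # Dominant two-body term.
    acc = [-(m_central + masses[i]) * c / ri3 for c in ri]
    for j, (mj, rj) in enumerate(zip(masses, positions)):
        if j == i:
            continue
        dij3 = math.dist(ri, rj) ** 3
        rj3 = math.hypot(*rj) ** 3
        for k in range(3):
            direct = (ri[k] - rj[k]) / dij3
            indirect = rj[k] / rj3      # acceleration of the origin
            acc[k] -= mj * (direct + indirect)
    return tuple(acc)
```

For a single massless body at unit distance from a central unit mass, the function returns an acceleration of magnitude one directed towards the origin, as expected for the pure two-body term.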
In concise form, the following algorithm describes the essential steps involved in the integration itself for a group of selected particles:

• Determine members due for updating at new time $t$
• Predict all $\mathbf{r}$, $\dot{\mathbf{r}}$ to order $\dot{\mathbf{F}}$
• Improve $\mathbf{r}_i$, $\dot{\mathbf{r}}_i$ to order $\mathbf{F}^{(3)}$ for the first member
• Obtain $\mathbf{F}$, $\dot{\mathbf{F}}$ due to planetesimals
• Add optional gas drag or tidal damping
• Include the dominant force and first derivative
• Apply the Hermite corrector
• Perform a second iteration by the two last steps
• Specify provisional new time-step $\Delta t_i$
• Compare nearest neighbour step $\Delta t_{\rm nb} = 0.1\, R^2 / (\mathbf{R} \cdot \mathbf{V})$
• Check for close encounter: $R < R_{\rm cl}$, $\dot{R} < 0$
• Complete the cycle for any other $t_j + \Delta t_j = t$
• Include optional boundary crossings

Some comments on this scheme are in order. It is known as being time-symmetric Hermite of type P(EC)$^n$ (predict, evaluate, correct, etc.). The number of iterations, $n$, is usually chosen as 2, but $n = 3$ may also be worthwhile. Note that for large $N$, the expensive evaluation of the perturbations is not performed again because the two-body term dominates the errors. On GRAPE, the procedure for identifying close encounters is implemented by using the nearest-neighbour facility, which enables a suitable maximum time-step to be defined. In the alternative case of a standard calculation, the closest particle can readily be determined from the current neighbour list, which would usually be small.¹ Typically, a close encounter is defined by the distance $R_{\rm cl}$, which signals switching the solution method to regularization (if desired).
1.6 Ahmad–Cohen Neighbour Scheme

Most simulations aim for the largest systems that can be studied with a given resource. As already remarked, this invariably means the use of some kind of neighbour (or hybrid) procedure. In the following, we summarize the salient features of the AC scheme, since complete descriptions of the Hermite version are already available (Makino & Aarseth 1992; Aarseth 2003).
The basic idea is to split the total force acting on a particle into two partsformally represented by
F (t) =nsum
j=1
F j + F d(t) (110)
1A full-blown AC scheme might not satisfy the strict time-symmetry condition
where the first term contains the contributions from the $n$ nearest neighbours and $\mathbf{F}_d$ represents the distant members as well as any external effects. Likewise, a similar equation can be written for the force derivative. In practice, direct summation is performed over the neighbours at suitably chosen small steps, and the predicted contributions from the distant particles are added, with fitting coefficients recalculated on a longer time-scale $\Delta t_d$. This leads to a gain in performance provided that $N \gg n$ and $\Delta t_d \gg \Delta t_n$ can be satisfied.
The total force used for the integration is obtained on the time-scale $\Delta t_d$, when the neighbour list is also formed. At intermediate times, or so-called irregular time-steps, the total force and first derivative are evaluated by
\[ \mathbf{F}(t) = \mathbf{F}_n + \dot{\mathbf{F}}_d\,(t - t_0) + \mathbf{F}_d(t_0), \qquad \dot{\mathbf{F}}(t) = \dot{\mathbf{F}}_n + \dot{\mathbf{F}}_d, \tag{1.11} \]
where $t_0$ is the time of the last regular force calculation. For convenience, the two time-steps are commensurate, but this is not a formal requirement provided the total force is evaluated at the nearest irregular time. The determination of time-steps for each force polynomial will be discussed in the next section.
There are several possible strategies for neighbour selection. Essentially, the choice is between aiming for a constant value of $n$ or adopting a more flexible approach depending on local conditions. Given that particles in the halo have smooth orbits, as opposed to those in the core that are affected by strong interactions, it seems appropriate to employ a criterion depending on the density. The neighbour radius itself is updated according to the relation
\[ R_s^{\rm new} = R_s^{\rm old} \left( \frac{n_p}{n} \right)^{1/3}. \tag{1.12} \]
Here the predicted neighbour number $n_p$ is expressed in terms of the density contrast $C \propto n/R_s^3$ as
\[ n_p = n_{\max}\,(0.04\,C)^{1/2}, \tag{1.13} \]
subject to an upper limit. Again the choice of $n_{\max}$ is a matter of taste, but a value near $2N^{1/2}$ has proved itself for large $N$. In fact, there are compensating factors affecting code performance, such that smaller $n$ requires more frequent updating of the neighbours. The neighbour selection is made during the total force calculation using $|\mathbf{r}_i - \mathbf{r}_j| < R_s$ and is essentially free, since all distances are calculated in any case.
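The neighbour-radius update of (1.12) and (1.13) can be sketched in a few lines. The function names are hypothetical, and capping $n_p$ at $n_{\max}$ is only one reading of "subject to an upper limit"; the actual codes may apply further safeguards.

```python
import math

def predicted_neighbour_number(density_contrast, n_max):
    """Eq. (1.13): n_p = n_max * (0.04 C)^(1/2), capped at n_max (assumed limit)."""
    n_p = n_max * math.sqrt(0.04 * density_contrast)
    return min(n_p, n_max)

def updated_neighbour_radius(r_old, n_current, n_predicted):
    """Eq. (1.12): R_s(new) = R_s(old) * (n_p / n)^(1/3)."""
    return r_old * (n_predicted / n_current) ** (1.0 / 3.0)

def suggested_n_max(n_total):
    """Value near 2 N^(1/2) quoted in the text for large N."""
    return 2.0 * math.sqrt(n_total)
```

For example, a particle that currently has eight times the predicted number of neighbours has its neighbour radius halved at the next regular force calculation.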
The combination of two-force polynomials requires some care when there is a change in the neighbour population. In general, there is a flux across the neighbour sphere which must be accounted for in the higher derivatives. To do this, we evaluate the explicit derivatives $\mathbf{F}^{(2)}_{ij}, \mathbf{F}^{(3)}_{ij}$ from the corresponding members $j$ and add or subtract the corrections to the higher derivatives that are kept separately. However, this extra cost may be avoided by performing the energy check and result analysis at times commensurate with smax, since all the solutions are then known to highest order. This is possible because only predictions up to $\mathbf{F}^{(1)}_i$ are used in the general integration.
As regards performance, the neighbour scheme is comparable to a single-force polynomial code for $N \simeq 50$ and speeds up as $N^{1/4}$. Moreover, a comparison with the GRAPE-6A (so-called micro-Grape) with the same host shows the latter being faster by a factor of 11 for $N = 25\,000$. Finally, we emphasize that neighbour lists are also very useful for identifying other close members in connection with regularization and for estimating the density contrast.
1.7 Time-Step Criteria
Any integration method based on individual time-steps tries to employ an appropriate criterion which optimizes the overall solution accuracy. At the simplest level are expressions of the type
\[ \Delta t = \frac{\alpha |\mathbf{r}|}{|\mathbf{v}|}, \qquad \Delta t = \frac{\beta |\mathbf{F}|}{|\mathbf{F}^{(1)}|}, \tag{1.14} \]
where $\alpha$ and $\beta$ are suitable dimensionless constants. However, such simple forms invariably cause numerical problems, mainly because close encounters are not detected in time for step reduction. Since we are dealing with a Taylor series for the force, it is natural to look for a relative criterion involving higher derivatives. The most convenient simple time-step can be constructed from
\[ \Delta t = \left( \frac{\eta |\mathbf{F}|}{|\mathbf{F}^{(2)}|} \right)^{1/2}, \tag{1.15} \]
where $\eta \simeq 0.02$ would give reasonable behaviour. For many years this relation was used with success.
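The idea of relative convergence can be extended to take into account all the force derivatives. Consequently, we write a general expression in the form
\[ \Delta t = \left( \frac{\eta \left( |\mathbf{F}||\mathbf{F}^{(2)}| + |\mathbf{F}^{(1)}|^2 \right)}{|\mathbf{F}^{(1)}||\mathbf{F}^{(3)}| + |\mathbf{F}^{(2)}|^2} \right)^{1/2}. \tag{1.16} \]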
This criterion has several useful properties. Compared to (1.15), it gives a well-defined large value when the force is small, as is the case near a tidal boundary. Moreover, two bodies with different masses will tend to have similar time-steps during close encounters, which facilitates decision-making. In fact, after the truncation according to (1.8) the two steps are often identical, but this cannot be assumed. It is worth emphasizing that a relative time-step criterion of the above type is independent of the (non-zero) mass.
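A direct transcription of (1.15) and (1.16) is given below as a minimal sketch; the function names are hypothetical and the force derivatives are assumed to be supplied as NumPy vectors by the caller.

```python
import numpy as np

def simple_time_step(F, F2, eta=0.02):
    """Eq. (1.15): dt = (eta |F| / |F2|)^(1/2)."""
    return np.sqrt(eta * np.linalg.norm(F) / np.linalg.norm(F2))

def general_time_step(F, F1, F2, F3, eta=0.02):
    """Eq. (1.16): relative criterion using the force and its first three derivatives."""
    f, f1, f2, f3 = (np.linalg.norm(x) for x in (F, F1, F2, F3))
    return np.sqrt(eta * (f * f2 + f1**2) / (f1 * f3 + f2**2))
```

Because every term is a product of two force-like quantities in both numerator and denominator, multiplying all four vectors by a common factor leaves the step unchanged. For a harmonic oscillator of unit frequency ($F = -x$, $F^{(1)} = -v$, $F^{(2)} = x$, $F^{(3)} = v$) the expression reduces to $\Delta t = \eta^{1/2}$ at every phase.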
From past experience, it seems most efficient to assign slightly different values for the dimensionless accuracy factors. Hence in most practical work, regardless of $N$, the respective values $\eta_I = 0.02$, $\eta_R = 0.03$ for the irregular and regular time-steps have been adopted. For $N \simeq 1000$, typical time-step ratios of about 6 are seen; this increases slowly as $N$ is increased.
In the case of planetesimal simulations, special care is needed to ensure detection of close encounters and physical collisions. We therefore employ an additional criterion based on the nearest neighbour,
\[ \Delta t = \frac{\beta R^2}{|\mathbf{R} \cdot \mathbf{V}|}, \tag{1.17} \]
where $\beta = 0.1$ has proved sufficient. The different strategies for GRAPE and conventional computers in this problem were commented on in a previous section.
For completeness, we also include KS regularization in this discussion since it has relevance for the general time-step criterion. Briefly, for the unperturbed case the equation governing the relative motion is given by
\[ \mathbf{F}_u = \tfrac{1}{2} h\, \mathbf{u}, \tag{1.18} \]
where $h$ is the specific two-body energy and $\mathbf{u}$ the generalized coordinates, which have the useful property $\mathbf{u} \cdot \mathbf{u} = R$. Since $h < 0$ for a binary, we define the constant time-step in terms of the frequency as
\[ \Delta\tau = \frac{\eta_u}{(2|h|)^{1/2}}, \tag{1.19} \]
with $\eta_u = 0.2$ for accurate solution (Mikkola & Aarseth 1998). Substitution into (1.16) by carrying out explicit differentiation (with $h' = 0$) simplifies to the adopted form, thereby giving some support for this apparently complicated expression. Note that the basic time-step (1.19) is reduced appropriately in the presence of significant perturbations.
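The substitution can be written out explicitly. The following worked steps are a sketch under the stated assumption of unperturbed motion ($h' = 0$), with primes denoting differentiation with respect to $\tau$ and $\mathbf{F}_u = \mathbf{u}''$:

\[
\begin{aligned}
\mathbf{F}_u &= \tfrac{1}{2} h\,\mathbf{u}, \quad
\mathbf{F}_u^{(1)} = \tfrac{1}{2} h\,\mathbf{u}', \quad
\mathbf{F}_u^{(2)} = \tfrac{1}{4} h^2\,\mathbf{u}, \quad
\mathbf{F}_u^{(3)} = \tfrac{1}{4} h^2\,\mathbf{u}', \\
|\mathbf{F}_u||\mathbf{F}_u^{(2)}| + |\mathbf{F}_u^{(1)}|^2
 &= \tfrac{1}{8} h^2 \left( |h|\,|\mathbf{u}|^2 + 2|\mathbf{u}'|^2 \right), \\
|\mathbf{F}_u^{(1)}||\mathbf{F}_u^{(3)}| + |\mathbf{F}_u^{(2)}|^2
 &= \tfrac{1}{16} |h|^3 \left( 2|\mathbf{u}'|^2 + |h|\,|\mathbf{u}|^2 \right), \\
\Delta\tau &= \left( \frac{2\eta}{|h|} \right)^{1/2}.
\end{aligned}
\]

The brackets cancel, so the step depends only on $h$ and is constant along the orbit. Comparing with (1.19), the two forms coincide if $\eta_u = 2\eta^{1/2}$.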
1.8 Two-Body Regularization
Regularization plays an important part in the codes under discussion. In the following we outline some of the main aspects of the KS method and describe various relevant algorithms. The latter can be divided into a purely local part involved with studying the relative motion and a global part that forms an interface with the whole system. Let us begin with a summary of the well-known classical formulation (Kustaanheimo & Stiefel 1965) for the 3D treatment, which is described in more detail elsewhere (Aarseth 2003).
New coordinates in 4D are introduced by the condition
\[ R = u_1^2 + u_2^2 + u_3^2 + u_4^2. \tag{1.20} \]
As usual in regularization, a time transformation is also needed and we choose the simplest differential relation
\[ dt = R\, d\tau, \tag{1.21} \]
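or $t' = R$. It turns out that the coordinate transformation
\[ \mathbf{R} = \mathcal{L}(\mathbf{u})\,\mathbf{u} \tag{1.22} \]
is satisfied by the Levi-Civita matrix
\[ \mathcal{L}(\mathbf{u}) = \begin{bmatrix} u_1 & -u_2 & -u_3 & u_4 \\ u_2 & u_1 & -u_4 & -u_3 \\ u_3 & u_4 & u_1 & u_2 \end{bmatrix}, \tag{1.23} \]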
as can be verified by substitution into the equation for $R$. For completeness, we also include the appropriate relations for the relative velocity. Thus the regularized velocities are obtained by
\[ \mathbf{u}' = \tfrac{1}{2} \mathcal{L}^T(\mathbf{u})\, \dot{\mathbf{R}}, \tag{1.24} \]
while the physical values are recovered from
\[ \dot{\mathbf{R}} = 2 \mathcal{L}(\mathbf{u})\, \mathbf{u}'/R. \tag{1.25} \]
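The transformations (1.20) to (1.25) are easy to check numerically. The sketch below uses the three-row form of the matrix given in (1.23); the function names are hypothetical.

```python
import numpy as np

def levi_civita(u):
    """Three-row Levi-Civita matrix of Eq. (1.23) for a 4-vector u."""
    u1, u2, u3, u4 = u
    return np.array([[u1, -u2, -u3,  u4],
                     [u2,  u1, -u4, -u3],
                     [u3,  u4,  u1,  u2]])

def ks_to_physical(u, u_prime):
    """Eqs. (1.22) and (1.25): relative coordinates and velocity from u, u'."""
    L = levi_civita(u)
    R = u @ u                      # Eq. (1.20)
    R_vec = L @ u
    V = 2.0 * (L @ u_prime) / R
    return R_vec, V

def physical_velocity_to_ks(u, V):
    """Eq. (1.24): regularized velocity from the physical relative velocity."""
    return 0.5 * levi_civita(u).T @ V
```

The rows of the matrix are mutually orthogonal with squared length $\mathbf{u} \cdot \mathbf{u} = R$, which is why converting a velocity with (1.24) and back with (1.25) returns the original vector.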
Starting from the perturbed two-body problem for $m_k$ and $m_l$,
\[ \ddot{\mathbf{R}} = -\frac{m_k + m_l}{R^3}\, \mathbf{R} + \mathbf{P}, \tag{1.26} \]
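with $\mathbf{P}$ the tidal perturbation, the equations of relative motion can be derived. The complete set is given by
\[ \mathbf{u}'' = \tfrac{1}{2} h\, \mathbf{u} + \tfrac{1}{2} R\, \mathcal{L}^T \mathbf{P}, \qquad h' = 2 \mathbf{u}' \cdot \mathcal{L}^T \mathbf{P}, \qquad t' = \mathbf{u} \cdot \mathbf{u}, \tag{1.27} \]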
where $\mathcal{L}^T$ represents the transpose matrix.

The 10 equations describing the relative motion in the presence of external perturbations are regular in the sense that the solutions are well defined for $R \rightarrow 0$. In order to describe the actual orbit in a stellar system, we introduce the associated c.m. by
\[ \mathbf{r}_{\rm cm} = \frac{m_k \mathbf{r}_k + m_l \mathbf{r}_l}{m_k + m_l}. \tag{1.28} \]
Likewise, the c.m. force is obtained from
\[ \ddot{\mathbf{r}}_{\rm cm} = \frac{m_k \mathbf{P}_k + m_l \mathbf{P}_l}{m_k + m_l}. \tag{1.29} \]
Hence the c.m. is added to the system of $N$ particles as a fictitious member to be advanced in time. Individual coordinates are obtained by combining the two motions, which yields
\[ \mathbf{r}_k = \mathbf{r}_{\rm cm} + \mu \mathbf{R}/m_k, \qquad \mathbf{r}_l = \mathbf{r}_{\rm cm} - \mu \mathbf{R}/m_l, \tag{1.30} \]
where $\mu = m_k m_l/(m_k + m_l)$ is the reduced mass, and similarly for the global velocities.
Given the regularized time-step defined above, the equations for the relative motion are advanced by an efficient Hermite method (Mikkola & Aarseth 1998). Although this formulation is fairly complicated, the KS equations can also be written in standard Hermite form by including the terms $\mathbf{F}'_u$ and $h''$.

Implementation of two-body regularization has many practical benefits. First, the equations of motion take the form of a perturbed harmonic oscillator and are therefore regular. This treatment permits a constant time-step for small perturbations, while for direct integration $\Delta t \propto R^{3/2}$, which can be troublesome when treating very eccentric binaries. Moreover, with linearized equations the accuracy per step is higher and only about 30 steps are needed for an orbit. Integration of relative motion also permits a faster force calculation because $P \propto 1/R^3$ for tidal perturbation. Finally, on the credit side, unperturbed two-body motion is justified in case there are no perturbers within a distance $d = \lambda a(1 + e)$, with $\lambda \simeq 100$. Likewise, if $d > \lambda R$, the c.m. approximation can be used in force calculations with binaries.
The price to pay for all the advantages comes in the form of coordinate and velocity transformations at the interface between relative and global motion. However, these operations are fast and do not involve the square root. As for simulations using GRAPE, there is a further cost due to differential force corrections, since the hardware is based on point-mass interactions.

Several optional features are worth mentioning. For small perturbations, the principle of adiabatic invariance can be used to slow down the motion by scaling the perturbation (Mikkola & Aarseth 1996). So-called energy rectification improves the solutions of $\mathbf{u}, \mathbf{u}'$ by scaling to the explicit value of $h$, which is integrated independently. The availability of completely regular two-body elements like the semi-major axis ($a$) and eccentricity ($e$) can also be beneficial when employing averaged expressions to model secular evolution of stable triples or tidal circularization (Mardling & Aarseth 2001).
1.9 KS Decision-Making
A variety of algorithms are involved in the overall management of the regularization scheme. Broadly speaking, we may distinguish between aspects of initialization, integration and termination, and these will be covered in turn.
The first question which presents itself is when to choose two particles for regularization treatment. A close encounter is traditionally defined by the two main parameters
\[ R_{\rm cl} = \frac{4\, r_h}{N\, C^{1/3}}, \qquad \Delta t_{\rm cl} = \beta \left( \frac{R_{\rm cl}^3}{m} \right)^{1/2}, \tag{1.31} \]
where $r_h$ is the half-mass radius, $C$ is the central density contrast and $\beta$ a dimensionless constant determined by experimentation. Thus a particle with time-step $\Delta t_k < \Delta t_{\rm cl}$ needs to have a close neighbour inside the distance $R_{\rm cl}$. Further conditions of negative radial velocity and dominant two-body motion must also be satisfied. The latter is ensured by comparing the two-body terms due to any other members identified in the close encounter search. In the case of GRAPE, a list of particles with small time-steps is maintained and updated during the force calculation when the host computer is idle.
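The close-encounter parameters of (1.31) and the first three selection conditions can be sketched as follows. The function names are hypothetical, the mass entering $\Delta t_{\rm cl}$ is passed in by the caller because its precise definition is not given here, and the test for dominant two-body motion is deliberately left out.

```python
import math

def close_encounter_parameters(r_half, n_total, density_contrast, mass, beta):
    """Eq. (1.31): close-encounter distance R_cl and time-step dt_cl."""
    r_cl = 4.0 * r_half / (n_total * density_contrast ** (1.0 / 3.0))
    dt_cl = beta * math.sqrt(r_cl**3 / mass)
    return r_cl, dt_cl

def is_regularization_candidate(dt_k, separation, radial_velocity, r_cl, dt_cl):
    """Small step, close neighbour and approaching motion (dominance test omitted)."""
    return dt_k < dt_cl and separation < r_cl and radial_velocity < 0.0
```

Note that $R_{\rm cl}$ shrinks both with increasing $N$ and with increasing central concentration, so that the selection adapts to the evolving cluster.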
The principle of initializing KS polynomials is the same as for single particles, except that time derivatives must also be obtained. By employing explicit differentiation, the latter terms are readily constructed from the available data involving $\mathbf{u}$ and its derivatives. A conversion by Taylor series expansion for $\Delta\tau$ finally gives the time-step in physical units, which is used for the scheduling of regularized solutions. Thus any KS pair which needs to be advanced during the next block-step is treated first.
Initially and during the integration, a consistent perturber list must also be available. The perturber search is carried out after each apocentre passage, $R_{\rm ap} = a(1 + e)$, using the tidal limit approximation. Particles inside a distance
\[ r_p = \left( \frac{2 m_p}{m_b\, \gamma_{\min}} \right)^{1/3} a\,(1 + e) \tag{1.32} \]
are selected from the neighbour list, where $m_b$ is the mass of the binary and $\gamma_{\min}$ is a small dimensionless perturbation, usually taken as $10^{-6}$. An extra procedure is included to increase the neighbour list for c.m. particles if $R_s < \lambda a(1 + e)$.
A useful quantity for many purposes is the dimensionless relative perturbation, defined by
\[ \gamma = \frac{|\mathbf{P}_k - \mathbf{P}_l|\, R^2}{m_k + m_l}. \tag{1.33} \]
If evaluated in the apocentre region, this dimensionless quantity is a measure of dominant two-body motion. In general, it is advantageous to initiate regularization if $\gamma \simeq 0.1$, but larger values are acceptable during the treatment.
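The perturber distance (1.32) and the relative perturbation (1.33) translate directly into code. This is a minimal sketch with hypothetical function names; the perturbing accelerations $\mathbf{P}_k$, $\mathbf{P}_l$ are assumed to be supplied by the force routine.

```python
import numpy as np

def perturber_radius(m_p, m_b, a, e, gamma_min=1.0e-6):
    """Eq. (1.32): distance inside which a body of mass m_p counts as a perturber."""
    return (2.0 * m_p / (m_b * gamma_min)) ** (1.0 / 3.0) * a * (1.0 + e)

def relative_perturbation(P_k, P_l, R, m_k, m_l):
    """Eq. (1.33): gamma = |P_k - P_l| R^2 / (m_k + m_l)."""
    dP = np.asarray(P_k, dtype=float) - np.asarray(P_l, dtype=float)
    return np.linalg.norm(dP) * R**2 / (m_k + m_l)
```

For a perturber of half the binary mass and $\gamma_{\min} = 10^{-6}$, the first function gives $r_p = 100\,a(1 + e)$, in line with the value $\lambda \simeq 100$ quoted for unperturbed motion.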
The KS integration itself begins with the prediction of $\mathbf{u}$ and $\mathbf{u}'$ to highest order, $\mathbf{u}^{(5)}$, while $h$ is predicted to order $h^{(2)}$. As usual in the Hermite scheme, perturbers are predicted to low order. Transformations yield global coordinates and velocities $\mathbf{r}_k, \mathbf{r}_l, \dot{\mathbf{r}}_k, \dot{\mathbf{r}}_l$, which are needed for the force calculation. The physical perturbation $\mathbf{P} = \mathbf{P}_k - \mathbf{P}_l$ and $\dot{\mathbf{P}}$ can now be obtained. By virtue of the time transformation, we have $\mathbf{P}' = R \dot{\mathbf{P}}$. This enables the corrector to be applied with new values $\mathbf{u}, \mathbf{u}'$ to order $\mathbf{u}^{(5)}$ and $h$ to $h^{(4)}$. An iteration without recalculation of the perturbations improves the final solution.
The conversion to physical time must also be carried out to highest order. Taylor series expansion yields the desired terms by successive explicit differentiation, beginning with $t'' = 2 \mathbf{u} \cdot \mathbf{u}'$ and continued up to $t^{(6)}$ using known terms. This permits the corresponding physical time-step to be obtained by
\[ \Delta t = \sum_{k=1}^{6} \frac{1}{k!}\, t_0^{(k)} \Delta\tau^k. \tag{1.34} \]
Time inversion is required when calculating the force on single particles. Given a physical interval $\delta t$, this is achieved by expanding $\dot{\tau} = 1/R$ to sufficient order. Note that division by $R$ is not dangerous here, since the c.m. approximation is used for small values.
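The sum in (1.34) is a plain Taylor series and can be sketched as below. The function name is hypothetical, and the derivatives $t_0^{(k)}$ are assumed to have been formed already by explicit differentiation of $t' = \mathbf{u} \cdot \mathbf{u}$.

```python
from math import factorial

def physical_time_step(t_derivs, dtau):
    """Eq. (1.34): dt = sum_k t0^(k) dtau^k / k!, with t_derivs[k-1] = t0^(k)."""
    return sum(d * dtau**k / factorial(k)
               for k, d in enumerate(t_derivs, start=1))
```

For a circular orbit all derivatives beyond $t' = R$ vanish and the result is simply $\Delta t = R\,\Delta\tau$.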
Conditions for unperturbed motion have been alluded to above. By careful analysis of the velocity distribution of nearby particles, it is possible to extend the analytical solution to many Kepler periods. This is achieved by identifying the particles that provide the maximum force as well as the smallest time of minimum approach. If there are no perturbers, we estimate the minimum time to reach the boundary $\gamma \simeq \gamma_{\min}$ as well as the free-fall time of the nearest particle. Depending on the remaining time, a number of unperturbed orbits may be adopted and the KS motion will remain dormant until the next time for checking. Several extra conditions are also included in order to avoid premature interactions inside the unperturbed boundary.

Following the general exposition, we now comment on the final stage of the KS cycle. Termination of hard binaries is appropriate for strong perturbation, say $\gamma \geq 0.5$, which would most likely result in switching to another dominant pair (temporary capture or so-called resonance) or chain regularization. For softer binaries, a smaller perturbation limit is called for. After termination, standard force polynomials are initialized for the two single particles.

As a technical point, except for collisions, termination is delayed until the end of the block-step, i.e. until the remaining interval $\delta t = T_{\rm block} - t$ falls below the physical step $\Delta t$ converted from $\Delta\tau$. A final iteration to the exact value can then readily be performed, with $\Delta\tau$ obtained from $\dot{\tau}, \ddot{\tau}$ and $\delta t$.
1.10 Hierarchical Systems
Long-lived triples or even quadruples form an important constituent in $N$-body simulations. Typically, a triple is formed through a strong interaction between two hard binaries, where the weakest binary is disrupted and one component is ejected. The other component may then be captured into an orbit around the inner binary because of energy and angular momentum conservation. Such systems may have long life-times, and their treatment by direct integration poses very severe numerical problems (or even a code crash) through loss of accuracy as well as greater effort.
Over the years, there has been a quest for stability criteria which would allow the description of hierarchies to be simplified by assuming the inner semi-major axis to be constant, permitting the c.m. approximation to be used. In the absence of secular changes, the outer component (a single particle or another binary) may then be regularized with respect to the inner binary c.m., thereby speeding up the calculation by a large factor. For this purpose we have employed a stability criterion that has been tested successfully for a limited range of parameters (Mardling & Aarseth 1999, 2001). A sharper stability criterion has been developed recently for the general three-body problem, based on first principles. The underlying theory is discussed in Chap. 3, together with a practical algorithm that has been implemented in nbody46. Given all the elements describing the inner and outer orbit, this algorithm defines stability or otherwise for a hierarchical configuration instead of estimating the distance from the stability boundary. Consequently, the stability test needs to be re-assessed during the subsequent evolution.
The identification of a hierarchical candidate system involves checking many conditions. In the first instance, a search is initiated after each apocentre turning point, provided the c.m. step is sufficiently small; in other words, if $\Delta t_{\rm cm} < \Delta t_{\rm cl}$. This condition implies that the new hierarchy is likely to form a hard outer binary. However, it should be stated that the same test is also performed for a new chain regularization, which again involves strong interactions. After identifying the two most dominant neighbours, the outer two-body elements are constructed for the main perturber. Among further conditions to be checked are the perturbation on the outer orbit as well as the requirement of a new hard binary. Moreover, extra tests are performed if the outer component is another binary, in which case a modified criterion is used, depending on the ratio of semi-major axes.

Acceptance of the stability condition entails a considerable programming effort in order to maintain a consistent data structure, as discussed in an earlier section. The relevant algorithmic steps are set out in the following table and are mostly self-explanatory.
• Increase the control index for decision-making
• Save relevant masses $m_k, m_l$ in a hierarchy table
• Copy c.m. neighbour list for later corrections
• Terminate KS solution and update $N_p$ and arrays
• Evaluate potential energy of components and old neighbours
• Record $\mathbf{R} = \mathbf{r}_k - \mathbf{r}_l$, $\mathbf{V} = \mathbf{v}_k - \mathbf{v}_l$ and $h$ in the special table
• Form binary c.m. in location of the primary, $j = 2N_p + 1$
• Define ghost ($m = 0$, $x = 10^6$) and initialize prediction variables
• Obtain potential energy of inner c.m. body and neighbours
• Remove ghost from neighbour and perturber lists
• Initialize new KS for outer component in $l = k + 1$
• Specify c.m. and ghost names, $N_{\rm cm} = -N_k$, $N_{\rm ghost} = N_l$
• Set pericentre stability limit in $R_0(N_p)$ for termination test
• Update the internal and differential energy, $\Delta E = \mu h_0 + \Delta\Phi$
Integration of hierarchical systems proceeds in the usual way, except that the stability condition needs to be checked. This is done at each apocentre turning point, using the property $N_{\rm cm} < 0$ for identification. One way in which the stability test may no longer apply is when the outer eccentricity increases due to perturbations; otherwise similar termination criteria are used as for hard binaries. For completeness, we also give the algorithm dealing with the main points of termination.
• Locate current position in the hierarchy table, $N_i = N_{\rm cm}$
• Save c.m. neighbours for correction procedure
• Terminate the outer KS solution ($k, l$) and update $N_p$
• Evaluate potential energy of c.m. w.r.t. neighbours & $l$
• Determine location of ghost, $N_j = N_{\rm ghost}$, $j = 1, \ldots, N + N_p$
• Restore inner binary components from saved quantities
• Add $l$ to neighbour lists containing first component $k$
• Initialize force polynomials for outer component
• Copy basic KS variables $h, \mathbf{u}, \mathbf{u}'$ from the table
• Re-activate inner binary as new KS solution
• Obtain potential energy of inner components and perturbers
• Update internal energy for conservation, $\Delta E = \Delta\Phi - \mu h$
• Reduce control index and compress tables (including escapers)
1.11 Three-Body Regularization
More than 30 years ago, a break-through in regularization theory made it possible to study the strong interactions of three particles (Aarseth & Zare 1974). The basic idea is simple, namely to employ two different KS solutions of $m_1$ and $m_2$ separately with respect to the so-called reference body, $m_3$. It is also instructive to review this development because of its connection with the subsequent chain regularization mentioned above.
In the following we summarize the key points of the formulation. The initial conditions are first expressed in the local c.m. frame, with coordinates $\mathbf{r}_i$ and momenta $\mathbf{p}_i$. Given the three respective distances $R_1, R_2, R$, with $R$ the distance between $m_1$ and $m_2$, and $\mathbf{p}_3 = -(\mathbf{p}_1 + \mathbf{p}_2)$ as the momentum of $m_3$, the basic Hamiltonian can be written as
\[ H = \sum_{k=1}^{2} \frac{1}{2\mu_{k3}}\, \mathbf{p}_k^2 + \frac{1}{m_3}\, \mathbf{p}_1^T \cdot \mathbf{p}_2 - \frac{m_1 m_3}{R_1} - \frac{m_2 m_3}{R_2} - \frac{m_1 m_2}{R}, \tag{1.35} \]
with $\mu_{k3} = m_k m_3/(m_k + m_3)$. As can be seen, the kinetic energy is expressed by the momenta of $m_1$ and $m_2$, together with a cross term which represents the mutual interaction of $m_1$ and $m_2$. Likewise, the potential energy is a sum of the three relevant terms. Thus omitting any reference to $m_2$ reduces the expression to the familiar form of the two-body problem.
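That (1.35) is the ordinary three-body energy written in terms of $\mathbf{p}_1$ and $\mathbf{p}_2$ alone can be confirmed numerically. The sketch below (hypothetical function names, units with $G = 1$) compares it with the direct sum of kinetic and potential energies for momenta satisfying $\mathbf{p}_3 = -(\mathbf{p}_1 + \mathbf{p}_2)$.

```python
import numpy as np

def three_body_hamiltonian(m, q1, q2, p1, p2):
    """Eq. (1.35); q1, q2 are the positions of m1, m2 relative to m3."""
    m1, m2, m3 = m
    mu13 = m1 * m3 / (m1 + m3)
    mu23 = m2 * m3 / (m2 + m3)
    R1, R2 = np.linalg.norm(q1), np.linalg.norm(q2)
    R = np.linalg.norm(q1 - q2)
    kinetic = p1 @ p1 / (2 * mu13) + p2 @ p2 / (2 * mu23) + (p1 @ p2) / m3
    potential = -m1 * m3 / R1 - m2 * m3 / R2 - m1 * m2 / R
    return kinetic + potential

def direct_energy(m, r, p):
    """Sum of p^2/2m minus all pairwise m_i m_j / r_ij."""
    n = len(m)
    E = sum(p[i] @ p[i] / (2 * m[i]) for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            E -= m[i] * m[j] / np.linalg.norm(r[i] - r[j])
    return E
```

The agreement follows from $1/(2\mu_{k3}) = 1/(2m_k) + 1/(2m_3)$, so that the kinetic energy of $m_3$ is shared between the two diagonal terms and the cross term.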
In analogy with standard KS, we introduce a coordinate transformation for the distances $R_1$ and $R_2$ by
\[ \mathbf{Q}_k^2 = R_k \quad (k = 1, 2). \tag{1.36} \]
Several alternative time transformations are available. Here we adopt the original choice, which is the most intuitive but not necessarily the best, giving the differential relation between physical and regularized time
\[ dt = R_1 R_2\, d\tau. \tag{1.37} \]
This enables a regularized Hamiltonian to be formed as $\Gamma^* = R_1 R_2 (H - E_0)$, where $E_0$ is the initial energy. By construction, $\Gamma^*$ should be zero along the solution path. Making use of the KS property $\mathbf{p}_k^2 = \mathbf{P}_k^2/4R_k$, where $\mathbf{P}_k$ now is the regularized momentum, the new Hamiltonian becomes
\[ \Gamma^* = \sum_{k=1}^{2} \frac{1}{8\mu_{k3}}\, R_l\, \mathbf{P}_k^2 + \frac{1}{16 m_3}\, \mathbf{P}_1^T \mathsf{A}_1 \cdot \mathsf{A}_2^T \mathbf{P}_2 - m_1 m_3 R_2 - m_2 m_3 R_1 - \frac{m_1 m_2 R_1 R_2}{|\mathbf{R}_1 - \mathbf{R}_2|} - E_0 R_1 R_2, \tag{1.38} \]
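where $l = 3 - k$. For historical reasons, $\mathsf{A}_i$ is taken as twice the transpose Levi-Civita matrix of (1.23). Finally, the equations of motion are given by
\[ \frac{d\mathbf{Q}_k}{d\tau} = \frac{\partial \Gamma^*}{\partial \mathbf{P}_k}, \qquad \frac{d\mathbf{P}_k}{d\tau} = -\frac{\partial \Gamma^*}{\partial \mathbf{Q}_k}. \tag{1.39} \]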
It can be seen from inspection of the Hamiltonian that the solutions are regular for $R_1 \rightarrow 0$ or $R_2 \rightarrow 0$. Moreover, the singular terms are numerically smaller than the regular terms provided $|\mathbf{R}_1 - \mathbf{R}_2| > \max(R_1, R_2)$. Hence a switch to another reference body can be made when $R$ is no longer the largest (or second largest) distance, which usually ensures a regular behaviour. Full details of the transformations can be found in the original publication.

So far, three-body regularization has only been used in unperturbed form within the $N$-body codes when chain regularization is not available, which is quite rare. However, it can be quite efficient as a stand-alone code for scattering experiments. In particular, the simplicity of decision-making as well as the ability to achieve accurate results by a high-order integrator makes it a good choice for such problems (Aarseth & Heggie 1976).
1.12 Wheel-Spoke Regularization
The recent interest in massive objects in the form of black holes has inspired a closer look at alternative regularization methods. The so-called wheel-spoke formulation is a direct generalization of three-body regularization to include more members (Zare 1974). Such a configuration may be appropriate if the reference body dominates the mass, in which case the need for switching is no longer an issue, which leads to further simplification. The scheme is outlined here in the expectation that it will prove a popular tool, since its effectiveness has been demonstrated recently (Aarseth 2007).
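Let us consider a subsystem of $n$ single particles of mass $m_i$ and a dominant body of mass $m_0$, where the initial conditions $\tilde{\mathbf{q}}_i, \tilde{\mathbf{p}}_i$ are expressed in the local c.m. frame. Introducing relative coordinates $\mathbf{q}_i$ with respect to $m_0$, we write the Hamiltonian as
\[ H = \sum_{i=1}^{n} \frac{\mathbf{p}_i^2}{2\mu_i} + \frac{1}{m_0} \sum_{i<j}^{n} \mathbf{p}_i^T \cdot \mathbf{p}_j - m_0 \sum_{i=1}^{n} \frac{m_i}{R_i} - \sum_{i<j}^{n} \frac{m_i m_j}{R_{ij}}, \tag{1.40} \]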
where $\mu_i = m_i m_0/(m_i + m_0)$ and $R_i = |\mathbf{q}_i|$. As can be seen, this is a direct generalization of (1.35) to $n > 2$, where $m_0$ plays the role of reference body. This implies that the technical treatment will also be similar. However, the original time transformation is now replaced by the inverse Lagrangian energy as $t' = 1/L$, since a multiple product would be cumbersome and might not work for critical cases. This choice has many advantages and would also be suitable for three-body regularization.

The use of a fixed reference body, albeit with dominant mass, raises a technical problem of dealing with close encounters between two light bodies. Thus for small separations, the last term of (1.40) may become arbitrarily large if $R_{ij} \rightarrow 0$. At present this difficulty is overcome by introducing a small softening in these terms, while still retaining the conservative nature of the Hamiltonian. It turns out that the powerful integrator (Bulirsch & Stoer 1966) is able to handle quite small values of non-regularized distances so that the essential dynamics is preserved.
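The regularized coordinates and momenta $\mathbf{Q}_i, \mathbf{P}_i$ are obtained in the usual way. Conversely, the physical values are recovered from the inverse transformations by
\[ \mathbf{q}_i = \tfrac{1}{2} \mathsf{A}_i^T \mathbf{Q}_i, \qquad \mathbf{p}_i = \tfrac{1}{4} \mathsf{A}_i^T \mathbf{P}_i / R_i. \tag{1.41} \]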
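For completeness, we also give the full set of transformations to the final values in the local c.m. system, corrected for a sign error:
\[ \tilde{\mathbf{q}}_i = \tilde{\mathbf{q}}_0 + \mathbf{q}_i, \qquad \tilde{\mathbf{q}}_0 = -\sum_{i=1}^{n} m_i \mathbf{q}_i \Big/ \sum_{i=0}^{n} m_i, \]
\[ \tilde{\mathbf{p}}_i = \mathbf{p}_i \quad (i = 1, \ldots, n), \qquad \tilde{\mathbf{p}}_0 = -\sum_{i=1}^{n} \mathbf{p}_i. \tag{1.42} \]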
The method presented here may also be used for more conventional calculations involving comparable masses, without the restriction of a fixed reference body or softening. This would be a simpler alternative to chain regularization but would at most be effective for four or five members.
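The transformation (1.42) from relative variables back to the local c.m. frame is sketched below. The function name and array layout (row 0 holding the reference body) are assumptions made for the example.

```python
import numpy as np

def relative_to_cm_frame(m0, m, q, p):
    """Eq. (1.42): c.m.-frame coordinates and momenta from values relative to m0.

    m, q, p hold the n light bodies; row 0 of each result is the reference body.
    """
    m = np.asarray(m, dtype=float)
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    q0 = -(m[:, None] * q).sum(axis=0) / (m0 + m.sum())
    p0 = -p.sum(axis=0)
    q_cm = np.vstack([q0, q0 + q])
    p_cm = np.vstack([p0, p])
    return q_cm, p_cm
```

A quick consistency check is that the mass-weighted sum of the returned coordinates and the sum of the returned momenta both vanish, as required in the c.m. frame.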
1.13 Post-Newtonian Treatment
The wheel-spoke formulation is particularly suited to studying a compact subsystem containing a massive object inside a star cluster. Especially attractive is the possibility of including relativistic terms in the most dominant two-body motion. The corresponding post-Newtonian equation of motion can be written in the convenient form (Blanchet & Iyer 2003; Mora & Will 2004)
\[ \frac{d^2 \mathbf{r}}{dt^2} = \frac{m_i + m_0}{r^2} \left[ (-1 + A)\, \frac{\mathbf{r}}{r} + B\, \mathbf{v} \right], \tag{1.43} \]
where the dimensionless quantities $A$ and $B$ represent relativistic effects. Here the two-body term is contained in the regularized Hamiltonian, with the remaining contributions added as a perturbation.

The coefficients $A, B$ can be expanded as functions of $v/c$, with $c$ the speed of light. Using the current notation, this gives rise to the perturbing force
\[ \mathbf{P}_{\rm GR} = \frac{m_i m_0}{c^2 r^2} \left[ \left( A_1 + \frac{A_2}{c^2} + \frac{A_{5/2}}{c^3} \right) \frac{\mathbf{r}}{r} + \left( B_1 + \frac{B_2}{c^2} + \frac{B_{5/2}}{c^3} \right) \mathbf{v} \right]. \tag{1.44} \]
Here the first-order precession is described by
\[ A_1 = 2(2 + \eta)\, \frac{m_i + m_0}{r} - (1 + 3\eta)\, v^2 + \tfrac{3}{2} \eta\, \dot{r}^2, \qquad B_1 = 2(2 - \eta)\, \dot{r}, \tag{1.45} \]
with $\eta = m_i m_0/(m_i + m_0)^2$. Next come the second-order precession terms $A_2, B_2$, which are somewhat more complicated. Of most interest is the energy loss by gravitational radiation, represented by $A_{5/2}, B_{5/2}$.
For energy conservation purposes, an extra equation for the relativistic contribution is integrated according to
\[ \Delta E_{\rm GR} = \int \mathbf{P}_{\rm GR} \cdot \mathbf{v}\, dt. \tag{1.46} \]
In order to carry out the treatment in regularized time, the right-hand side is converted into an expression analogous to $h'$ in (1.27). Also note that derivative evaluations of the physical perturbation are not required for solution of first-order equations. The associated time-scale for shrinkage employed in the decision-making is given by (Peters 1964)
\[ \tau_{\rm GR} = \frac{5 a^4 c^5}{64\, m_i m_0^2}\, \frac{(1 - e^2)^{7/2}}{g(e)}, \tag{1.47} \]
where $g(e)$ is a known function and standard $N$-body units apply.

Implementation of the wheel-spoke scheme into a large $N$-body code presents many interesting aspects. To begin with, a suitably compact subsystem is chosen from a binary containing the heavy body if there is at least one close perturber inside $R_{\rm cl}$. The subsystem is initialized in the usual way, including transformations to KS-type variables $\mathbf{Q}, \mathbf{P}$. The perturber list is again constructed according to (1.32), which now yields a smaller mass factor and hence requires less effort in coordinate prediction.
Although the innermost binary is invariably long-lived, the question of membership changes must be considered. Decisions of addition or removal are based on the central distance and radial velocity of perturbers or existing members, respectively. Simple criteria including a combination of an appropriate perturbation (say $\gamma > 0.05$) and distance ($r_p < \sum R_k$) are used in the former case, while removal is controlled by $\dot{R}^2 > 2m_0/R$ and $R_k > R_{\rm cl}$. In analogy with the integration of KS binaries, the c.m. force is obtained by vectorial summation over the components.
The addition of post-Newtonian terms necessitates the introduction of physical units. This is achieved by specifying the total mass and half-mass radius as well as the speed of light. From $N M_S$ and $r_h$ we have $c = 3 \times 10^5/V^*$, with the velocity scaling factor $V^*$ expressed in km s$^{-1}$. This enables the coalescence distance to be defined as three Schwarzschild radii by
\[ r_{\rm coal} = \frac{6 (m_i + m_0)}{c^2}. \tag{1.48} \]
Alternatively, a disruption distance may be defined for white dwarfs. An experimental scheme has been adopted where the different GR terms are activated progressively, depending on the value of the time-scale (1.47). Thus the radiation term is included first, on the supposition that precession does not play an important role during the early stages. However, due care must be exercised if the innermost binary is subject to Kozai cycles (Kozai 1962).
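The quantities used in this decision-making can be sketched as follows. The function names are hypothetical. The text only states that $g(e)$ is a known function; the sketch assumes it is the standard eccentricity enhancement factor of Peters (1964), $g(e) = 1 + \tfrac{73}{24} e^2 + \tfrac{37}{96} e^4$.

```python
def peters_g(e):
    """Assumed form of g(e): Peters (1964) eccentricity enhancement factor."""
    return 1.0 + (73.0 / 24.0) * e**2 + (37.0 / 96.0) * e**4

def gr_time_scale(a, e, m_i, m_0, c):
    """Eq. (1.47): shrinkage time-scale in N-body units."""
    return 5.0 * a**4 * c**5 / (64.0 * m_i * m_0**2) * (1.0 - e**2) ** 3.5 / peters_g(e)

def speed_of_light_nbody(v_star_kms):
    """c = 3e5 / V*, with the velocity scaling factor V* in km/s."""
    return 3.0e5 / v_star_kms

def coalescence_distance(m_i, m_0, c):
    """Eq. (1.48): three Schwarzschild radii of the combined mass."""
    return 6.0 * (m_i + m_0) / c**2
```

The steep factor $(1 - e^2)^{7/2}$ is what makes a Kozai-induced rise in eccentricity so effective in shortening the time-scale at fixed semi-major axis.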
Simulations of centrally concentrated cluster models have been made with a GRAPE code for $m_0 = N^{1/2} M_S$ and $N = 10^5$ equal-mass stars. Here the innermost binary shrank by a significant factor and also developed very high eccentricity by the Kozai resonance. In some cases, the resulting pericentre distance was sufficiently small for stars with white dwarf radii to be affected by further gravitational radiation shrinkage before disruption (Aarseth 2007).
1.14 Chain Regularization
This contribution would not be complete without a discussion of chain regularization, which has proved to be a powerful tool in star cluster simulations. In the following we shall review some of the essential features as well as the main algorithms, since the relevant details can be found elsewhere (Mikkola & Aarseth 1993; Aarseth 2003).
The basic idea takes its cue from three-body regularization. A system is suitable for special treatment if one hard binary has a close perturber in the form of a single particle or another binary. Upon termination of the KS binary, the coordinates and momenta are expressed in the local c.m. frame. Thus $N - 1$ chain vectors connect the particles experiencing the strongest pair-wise forces and are defined in terms of the coordinates $\mathbf{q}_k$ by
\[ \mathbf{R}_k = \mathbf{q}_{k+1} - \mathbf{q}_k, \quad k = 1, \ldots, N - 1. \tag{1.49} \]
In Hamiltonian theory, the generating function
\[ S = \sum_{k=1}^{N-1} \mathbf{W}_k \cdot (\mathbf{q}_{k+1} - \mathbf{q}_k) \tag{1.50} \]
connects the old momenta with the new ones by $\mathbf{p}_k = \partial S/\partial \mathbf{q}_k$. The relative physical momenta $\mathbf{W}_k$ can then be obtained by the recursion
\[ \mathbf{W}_k = \mathbf{W}_{k-1} - \mathbf{p}_k, \quad k = 2, \ldots, N - 2, \tag{1.51} \]
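with $\mathbf{W}_1 = -\mathbf{p}_1$ and $\mathbf{W}_{N-1} = \mathbf{p}_N$ due to the c.m. condition. Substitution into a Hamiltonian of the type (1.40) yields
\[ H = \frac{1}{2} \sum_{k=1}^{N-1} \left( \frac{1}{m_k} + \frac{1}{m_{k+1}} \right) \mathbf{W}_k^2 - \sum_{k=2}^{N-1} \frac{1}{m_k}\, \mathbf{W}_{k-1} \cdot \mathbf{W}_k - \sum_{k=1}^{N-1} \frac{m_k m_{k+1}}{R_k} - \sum_{1 \leq i \leq j-2}^{N} \frac{m_i m_j}{R_{ij}}, \tag{1.52} \]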
where the first momentum term contains the reduced mass In spite of the sim-ilarity with (140) the formalism differs in some important respects mainlybecause there is no reference body
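The relations (1.49) to (1.52) can be checked numerically. The following is a minimal sketch in Python with NumPy, not the chain routines of the Cambridge codes: it assumes $G = 1$, particles already ordered along the chain, and zero total momentum. The function names are illustrative, and the recursion (1.51) is simply run up to $k = N-1$, which is equivalent under the cm condition.

```python
import numpy as np

def chain_hamiltonian(m, q, p):
    """Evaluate the chain Hamiltonian (1.52), with G = 1 and chain order = array order."""
    n = len(m)
    R = q[1:] - q[:-1]                       # chain vectors R_k, eq. (1.49)
    W = np.empty((n - 1, 3))
    W[0] = -p[0]                             # W_1 = -p_1
    for k in range(1, n - 1):                # recursion (1.51), run to the last chain vector
        W[k] = W[k - 1] - p[k]
    kin = 0.5 * sum((1.0 / m[k] + 1.0 / m[k + 1]) * (W[k] @ W[k]) for k in range(n - 1))
    kin -= sum((W[k - 1] @ W[k]) / m[k] for k in range(1, n - 1))
    pot = sum(m[k] * m[k + 1] / np.linalg.norm(R[k]) for k in range(n - 1))
    pot += sum(m[i] * m[j] / np.linalg.norm(q[j] - q[i])   # non-chained pairs, i <= j - 2
               for i in range(n) for j in range(i + 2, n))
    return kin - pot

def direct_energy(m, q, p):
    """Kinetic minus potential energy summed over all particles and pairs."""
    n = len(m)
    kin = sum((p[i] @ p[i]) / (2.0 * m[i]) for i in range(n))
    pot = sum(m[i] * m[j] / np.linalg.norm(q[j] - q[i])
              for i in range(n) for j in range(i + 1, n))
    return kin - pot

rng = np.random.default_rng(1)
n = 5
m = rng.uniform(0.5, 2.0, n)
q = rng.normal(size=(n, 3))
v = rng.normal(size=(n, 3))
v -= (m[:, None] * v).sum(axis=0) / m.sum()  # impose the cm condition (zero total momentum)
p = m[:, None] * v

h_chain = chain_hamiltonian(m, q, p)
h_direct = direct_energy(m, q, p)
print(h_chain, h_direct)
```

The two numbers agree only when the total momentum vanishes, which is why the velocities are shifted to the cm frame first.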
As stated earlier, the inverse Lagrangian energy is a good choice for the time transformation. Multiplication by $t' = 1/L$ gives the regularized Hamiltonian $\Gamma^* = t'(H - E_0)$, which can be differentiated in the usual way to yield the equations of motion. Note that, for technical reasons, the differentiation of the product $t'H$ is done explicitly. This procedure enables the term $H - E_0$ (which should be zero) to be retained for stabilizing the solutions. It can be seen that the two-body solutions are regular for any individual $R_k \to 0$ at separate times. As usual, the KS relations can be used to recover the physical variables via the standard transformations

$$\mathbf{R}_k = \mathcal{L}_k \mathbf{Q}_k, \qquad \mathbf{W}_k = \mathcal{L}_k \mathbf{P}_k / (2\mathbf{Q}_k^2), \tag{1.53}$$

from which the momenta $\mathbf{p}_k$ are readily derived.
The implementation of chain regularization into an N-body code contains many algorithms, some of which will be described briefly. Following initialization in the cm frame and evaluation of the total energy $E_0$, the chain vectors must be constructed. The selection of the corresponding chain indices presents a considerable algorithmic challenge if (as may occur later) there are more than four members (cf. Mikkola & Aarseth 1993). Thus the scheme
may not work efficiently if the chain vectors fail to connect the dominant two-body forces. The canonical variables $\mathbf{Q}, \mathbf{P}$ are introduced as before, and the integration can begin after specifying a suitably small time-step.
Several quantities are useful for the decision-making. Among these are the characteristic external perturbation $\gamma_{\rm ch}$ and the gravitational radius $R_{\rm grav}$, where the latter represents the effective size of the subsystem. Thus a perturber is considered for chain membership if $\gamma_{\rm ch}$ is significant, provided certain other conditions are fulfilled. The perturber list is updated at appropriate times by (1.32), with $R_{\rm grav}$ replacing the apocentre distance. Likewise, an existing member with positive radial velocity is a candidate for removal if we have

$$\dot{R}_k^2 > \frac{2\sum m_k}{R_k}, \qquad R_k > 3R_{\rm grav}. \tag{1.54}$$

Here the former condition requires transformation to the local cm system. The chain integration is continued as long as there are at least three members, with re-initialization after any changes. Note that the membership procedure also allows for a hard binary to be added or removed.
It turns out that the chain structure is a convenient tool for checking the dynamical state. Thus any escaping single particle or binary can readily be identified by considering the distances at the beginning and end of the chain if $N > 3$. As in the case of two-body regularization, the internal integration is continued up to the next block-step time. This entails inverting the integral of $L\,dt$ for an upper limit to ensure that the block-step is not exceeded. Note that here we do not have a Taylor series expansion for the time derivatives.
In general, termination is carried out if $\max\{R_k\} > 3R_{\rm cl}$ for three particles or two hard binaries. Provisions are also included for termination of a stable hierarchy, followed by switching to the more efficient KS treatment. As discussed previously, one way in which this can occur is after a strong interaction of two binaries. Finally, procedures for physical collisions or tidal circularization are also included, albeit with considerable programming effort.
1.15 Astrophysical Procedures
A star cluster simulation code should include a wide range of astrophysical processes for a realistic treatment. In the following, we touch briefly on some of the most relevant aspects of the Cambridge codes. By now, the addition of synthetic stellar evolution has enabled the introduction of many interesting features that pose numerical challenges. The simulation of realistic star clusters requires an IMF containing a significant proportion of heavy stars, as discussed in Sect. 7.4. It has been known for a long time that a few heavy bodies exert an unduly large influence on the dynamics of stellar systems. Such a distribution also leads to mass segregation on a short time-scale, which may be comparable to the main-sequence life-time for typical cluster parameters. Mass loss from evolving stars is therefore important for all but the youngest
clusters, and its inclusion in a simulation code is essential for observational interpretation.
Since the basic ingredients of the stellar evolution scheme are discussed at length in Chaps. 10 and 12, we concentrate on some of the related algorithms here. The primary quantities associated with each star are updated at sufficiently frequent intervals for a smooth representation. For dynamical purposes, only the process of mass loss requires special treatment. It is usually confined to a small fraction of all stars. The main procedures can be summarized under the following headings:
• Mass loss from single stars and binaries
• Roche-lobe mass transfer and common-envelope evolution
• Magnetic braking and spin-orbit coupling
• Inspiralling of compact binaries
• Supernova explosions and neutron star kicks
• Physical collisions (KS or chain regularization)
In the case of significant mass loss, $\Delta m > 0.1\,M_\odot$, force polynomials for the nearest neighbours are re-initialized in order to reduce discontinuity effects. Likewise, appropriate corrections are made to ensure overall energy conservation. This entails knowledge of the potential, since we assume that the ejected mass escapes rapidly from the cluster. When using GRAPE, the cost of a full $N$ summation can be avoided in most cases (except small $\Delta t_i$ and large $\Delta m$) by employing the available potential, corrected for the net force contribution up to the current time,
$$\Delta\phi = -\mathbf{v}_i \cdot (\mathbf{F}_i - \mathbf{F}_{\rm tide})(t - t_i). \tag{1.55}$$

Close binaries undergoing general mass loss on a slow time-scale also require updating of their KS elements. Consequently, the orbital parameters are modified at constant eccentricity, based on the adiabatic approximation $M_b a = {\rm const}$. A corresponding correction for the inner binary elements of a hierarchical triple can be carried out explicitly. Here it is necessary to re-assess the stability condition because the inner orbit expands more than the outer one.
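As a small illustration of the adiabatic relation $M_b a = {\rm const}$, the sketch below rescales the semi-major axes of a hypothetical hierarchical triple. The masses, separations and the function name are invented for the example, and applying the same relation to the outer orbit with the total triple mass is an assumption of this sketch rather than a statement about the code.

```python
def adiabatic_semi_major_axis(a, m_old, m_new):
    """New semi-major axis from M_b * a = const (eccentricity unchanged)."""
    return a * m_old / m_new

# Hypothetical triple in arbitrary units: inner binary (2 + 1), outer single star (1).
a_in, a_out = 1.0, 20.0
dm = 0.5                                   # slow mass loss from one inner component

a_in_new = adiabatic_semi_major_axis(a_in, 3.0, 3.0 - dm)     # inner binary mass 3 -> 2.5
a_out_new = adiabatic_semi_major_axis(a_out, 4.0, 4.0 - dm)   # total mass 4 -> 3.5

print(a_in_new / a_in, a_out_new / a_out)
```

The inner expansion factor exceeds the outer one because the same $\Delta m$ is a larger fraction of the inner binary mass, so the ratio of outer pericentre to inner semi-major axis decreases and the stability condition has to be checked again.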
A realistic period distribution invariably includes binaries that experience Roche-lobe mass transfer after the primary leaves the main sequence. This stage is initiated by tidal circularization or the formation of a circular binary following common-envelope evolution. Since the complicated astrophysical modelling is discussed in Chap. 12, we limit our comments to some computational aspects for completeness. For practical reasons, the continuous process of mass transfer is divided into an active and a coasting phase, where the latter is updated at frequent intervals. The duration of the active phase is restricted to the cm time-step for consistency with the dynamics. After the internal adjustment of the essentially circular orbit has been completed, any system mass loss is corrected for in the same way as for single stars.
Magnetic braking and inspiralling of compact binaries by gravitational radiation are catered for both within the Roche process and for certain non-interacting binaries. In either case, changes in the rotational spin of the components are treated according to the recipes outlined in Chap. 12. We note that these processes themselves do not involve any mass loss.
Stars above about $8\,M_\odot$ undergo supernova explosions and eject a significant amount of mass during the transition to neutron stars. In the absence of a consensus on neutron star kicks, we have adopted a Maxwellian distribution with large dispersion; hence practically all the neutron stars escape from the cluster. Now the correction procedure includes the increased kinetic energy as well as the potential energy contribution of the expelled mass. Since the ejection of high-velocity members is also a feature of stellar systems containing binaries, we have implemented an algorithm for preventing discontinuous changes in the neighbour force for large time-steps.
The determination and implementation of collisions in chain regularization require special care and have been discussed elsewhere in considerable detail (Aarseth 2003). For highly eccentric binaries, the KS solution facilitates a check on the pericentre distance, provisionally identified by a negative product of the old and new radial velocity, $R' = 2\mathbf{u}\cdot\mathbf{u}'$, and $R < a$. The outcome of a collision depends on the stellar types, so that a variety of remnants may be produced (see Chap. 12). Here we note that the device of ghost stars can be used when two stars are replaced by one non-zero mass.
Tidal fields represent another important feature of star cluster simulations. Two different types of external effects are catered for. Most open clusters in the solar neighbourhood move in nearly circular orbits, which admit a linearized tidal force to be included in the equations of motion. This simple representation gives rise to an energy integral and imposes a tidal boundary that is useful for defining escape. The tidal radius is given by
$$r_{\rm tide} = \left(\frac{GM}{4A(A-B)}\right)^{1/3}, \tag{1.56}$$

where $A$ and $B$ are the classical rotation constants. Traditionally, stars outside $2r_{\rm tide}$ are removed from the calculation since their subsequent effect on bound cluster members is negligible.
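For orientation, (1.56) is easy to evaluate in physical units. The sketch below is an illustration only: the values $A = 14.4$ and $B = -12.0$ km/s/kpc for the rotation constants and the cluster mass of 1000 solar masses are assumptions chosen for the example, and `tidal_radius` is a hypothetical helper name.

```python
G = 4.30091e-3   # gravitational constant in pc (km/s)^2 / Msun

def tidal_radius(mass_msun, A=14.4, B=-12.0):
    """Tidal radius (1.56) in pc, with rotation constants A, B given in km/s/kpc."""
    a = A / 1000.0           # convert to km/s/pc
    b = B / 1000.0
    return (G * mass_msun / (4.0 * a * (a - b))) ** (1.0 / 3.0)

r_t = tidal_radius(1000.0)   # assumed cluster mass of 1000 Msun
print(r_t, 2.0 * r_t)        # tidal radius and the traditional removal distance
```

With these assumed constants the tidal radius comes out at roughly 14 pc, and it scales as $M^{1/3}$, so an eightfold increase in mass doubles it.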
The general case of 3D motion requires a full galactic model with explicit expressions for the force and its derivative. The equations of motion are now most conveniently expressed in a non-rotating coordinate system (Aarseth 2003). It is still possible to have an approximate energy integral by monitoring the accumulated work done by the perturbing force $\mathbf{P}_i$ during each (regular) time-step. Expanding the integrated contribution to third order in terms of the initial values and expressing the result at the end of the time-step, we obtain

$$\Delta E_i = m_i\left(\tfrac{1}{2}\dot{W}_i\,\Delta t_i^2 - W_i\,\Delta t_i\right), \tag{1.57}$$
where $W_i = \mathbf{v}_i\cdot\mathbf{P}_i$. Knowledge of $\dot{\mathbf{P}}_i$ enables the second order to be included in the expansion, and the resulting conservation is satisfactory. Although distant stars are usually removed from the active data structure using a nominal value of the tidal radius, their orbits in the galactic potential can still be integrated. Hopefully, these recent code innovations will encourage more comprehensive studies of eccentric globular cluster orbits and associated tidal tails.
1.16 GRAPE Implementations
Since GRAPE-type special-purpose computers are becoming more widely used, it may be of interest to describe some of the procedures in the simulation code nbody4. In particular, it should be emphasized that the internal GRAPE data structure differs from the host in several important respects, which calls for additional software.
We take advantage of the work-sharing facility to speed up the calculation by carrying out some operations on the host while GRAPE is busy. In general, for large $N$, many particles are due to be advanced at the same time, but the number may also be quite small during episodes of strong multiple interactions. After prediction of the first 48 block members, $n_{\rm block}$, the relevant procedures can be summarized as follows:
• Begin force calculation for the first block-step members
• Predict the next 48 members (if any) while GRAPE is busy
• Predict $\mathbf{r}_i, \mathbf{v}_i$ of cm and perturbed KS components (first time)
• Form a list of small time-steps (first time, $n_{\rm block} \le 32$)
• Correct the previous block members and specify new time-steps
• Copy the force and force derivatives from GRAPE
• Correct the last block members after repeating the above
• Send all the corrected $\mathbf{r}_i, \mathbf{v}_i$ and also $\mathbf{F}_i, \dot{\mathbf{F}}_i$ to GRAPE
The scheduling of particles to be advanced is essentially the same as in nbody6. However, coordinate and velocity predictions on the host are now restricted to block-step members, since a fast prediction of all particles is carried out on the GRAPE hardware. When these quantities are copied to the corresponding GRAPE variables for data transfer, an optional prediction to second order in the force derivative may be included for increased accuracy. With regularized binaries present, the data structure on GRAPE consists of single particles and the cm of each KS pair. Consequently, the force acting on a binary is in the first instance obtained by direct summation from $2N_p + 1$ to $N + N_p$, where a cm is treated as a single particle. Differential force corrections are then applied for each binary perturber to be consistent with the cm force, and likewise for any perturber forces. These corrections involve subtracting the cm terms before adding the vectorial contributions due to the two components. Any particles which are not on the block-step must therefore be predicted on the host before these corrections are performed. Note
that the subtraction procedure invariably introduces small errors due to the lower precision of the GRAPE hardware.
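The differential correction described above can be pictured with a toy example. This is a hedged sketch in double precision on the host, not the nbody4 routine: `pair_force` and `corrected_force` are illustrative names, $G = 1$, and the "hardware" sum is imitated by an ordinary summation in which the binary enters as its cm.

```python
import numpy as np

def pair_force(r_i, r_j, m_j):
    """Newtonian force per unit mass on a body at r_i due to mass m_j at r_j (G = 1)."""
    d = r_j - r_i
    return m_j * d / np.linalg.norm(d) ** 3

def corrected_force(f_sum, r_i, r_cm, m_cm, r1, m1, r2, m2):
    """Subtract the cm term, then add the contributions of the two components."""
    return (f_sum - pair_force(r_i, r_cm, m_cm)
            + pair_force(r_i, r1, m1) + pair_force(r_i, r2, m2))

# Toy configuration: one perturber, one KS pair and one distant field star.
r_i = np.array([1.0, 0.5, 0.0])
r1, m1 = np.array([0.00, 0.0, 0.0]), 1.0
r2, m2 = np.array([0.02, 0.0, 0.0]), 0.5
r_f, m_f = np.array([-5.0, 2.0, 1.0]), 1.0
m_cm = m1 + m2
r_cm = (m1 * r1 + m2 * r2) / m_cm

f_sum = pair_force(r_i, r_cm, m_cm) + pair_force(r_i, r_f, m_f)   # sum with cm as one particle
f_fix = corrected_force(f_sum, r_i, r_cm, m_cm, r1, m1, r2, m2)
f_direct = pair_force(r_i, r1, m1) + pair_force(r_i, r2, m2) + pair_force(r_i, r_f, m_f)
print(f_fix - f_direct)
```

In double precision the corrected and direct forces agree to rounding. On reduced-precision hardware the subtracted cm term would not cancel exactly, which is the small error noted in the text.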
Another aspect of the prediction strategy concerns the indirect terms in the heliocentric formulation (1.9). Again, the coordinates and velocities of any significant members for which $t_j + \Delta t_j > t$ need to be predicted first. This can most readily be achieved by maintaining a list of any important planetesimal perturber, which is updated following changes in the data structure. In order to check energy conservation in the heliocentric case, the expression for kinetic energy takes the form
$$K = \frac{1}{2}\sum_{i=1}^{N} m_i v_i^2 - \frac{1}{2}\left(M_0 - \sum m_i\right)v_0^2, \tag{1.58}$$
where $\mathbf{v}_0 = -\sum m_i\mathbf{v}_i/M_0$ is the velocity of the dominant body of mass $M_0$ and the second sum in (1.58) refers to the heavy perturbers.
As mentioned in Sect. 1.5, the determination of a maximum time-step also differs when using a GRAPE in connection with (1.9). We employ a special function that supplies the index of the closest neighbour at no extra cost during the force evaluation. The current relative coordinates and velocity $\mathbf{R}, \mathbf{V}$ define an appropriate time-step $\Delta t_{\rm nb} = 0.1\,R^2/(\mathbf{R}\cdot\mathbf{V})$, which may be smaller than the standard value. Another point to note is that the direct force summation does not include the dominant body, whose effect is added in the iteration. Since provisional values of $\mathbf{F}_i, \dot{\mathbf{F}}_i$ for each member on the block-step are supplied to GRAPE for scaling purposes, it is necessary to subtract the dominant contributions first. On the other hand, decisions on new regularizations or terminations are made during the time-step determination and executed in the usual way at the end of the block-step.
Procedures for wheel-spoke regularization have also been combined with the GRAPE code nbody4, making a separate version, nbody7. A new feature here is how to recognize a compact subsystem suitable for special treatment. Given the presence of a massive binary together with the conditions

$$R < 2R_{\rm cl}, \qquad r_p < \tfrac{1}{4}R_{\rm cl}\,(m_0/m)^{1/2}, \qquad \dot{r}_p < 0, \tag{1.59}$$

with $r_p$ the distance to the closest perturber, this system is initialized and additional perturbers are selected as for chain regularization. A list of neighbours is updated on the local crossing time, from which significant perturbers are selected. Frequent checks are made on membership changes of the subsystem, taking care to avoid near-collisions in the overlap region, although no direct test is made at present.²
The post-Newtonian algorithms discussed above have also been implemented. Again, these procedures are carried out on the host computer. Several
² Interactions between subsystem members and perturbers are not softened; hence the use of an overall perturbation with respect to the cm only acts as a guide.
models where the relativistic terms become important have been studied for centrally concentrated systems with $N = 10^5$ equal-mass particles and one massive black hole of mass $m_0 = 300\,m$ (Aarseth 2007). A typical simulation over 100 time units and including GR coalescence can be done in a few days. Experience shows that the less powerful GRAPE-6A is well suited for this purpose, since for much of the time the host constitutes the computational bottleneck, especially during relativistic episodes. Because the central subsystem is now advanced by the accurate but more expensive Bulirsch–Stoer method, the overall energy conservation is somewhat better than for standard cluster simulations.
When using GRAPE, all regularization procedures are treated in essentially the same way as in nbody6. Depending on the requirements, there is a choice of chain regularization, time-transformed leapfrog (see Chap. 2) or the wheel-spoke method for studying three different types of problems, but only one scheme is chosen for a given calculation. Some of these procedures are distinguished by options, and there are also different directories containing routines of the same name. In conclusion, this GRAPE software package has already yielded some interesting results that open up new avenues for future exploration.
1.17 Practical Aspects
In the preceding sections, we have described the main procedures of the code nbody6 and also nbody4, which is similar. The actual use of these codes involves many additional considerations. Here we attempt a general summary of some practical features that play a key role.
To begin with, the code needs to be installed and tested. This necessitates downloading the software and extracting the relevant files.³ Certain parameters governing maximum array sizes should be checked; otherwise the (generous) defaults will be adopted. It is expected that the code will compile successfully on most conventional computers. Likewise, results of the test input should be examined before any further work is attempted. When trying out a new code, it is of interest to evaluate the performance by so-called profiling, as explained in the manual, which can also be downloaded.
A versatile code requires a number of input parameters, especially if there are many alternative procedures. To facilitate explanation, we distinguish between different types of input. In the first group are the particle number $N$, the maximum neighbour membership $n_{\max}$, as well as the number of primordial binaries $n_{\rm bin}$. The second set of parameters, $\eta_I, \eta_R, \eta_u$, are concerned with the integration itself and are dimensionless, i.e. the same for most problems.
Initial conditions may be generated internally or uploaded from a file. In the former case, there is a choice of IMF distributions with upper and lower
³ See http://www.ast.cam.ac.uk/research/nbody
mass limits. The main scaling parameters are the length unit $R_V$ in pc and mean mass $M_S$ in solar units, as well as the virial theorem ratio $Q_V$ discussed earlier. The network of 40 options is defined in a table and allows a variety of tasks to be considered. However, the choice must be consistent, which requires due care. All the close encounter parameters have been discussed in the KS section. Special input templates are also available for simulations with primordial binaries or cluster orbits in a 3D galactic potential.
An example of typical input parameters is given for illustration purposes, where the main categories are placed together:
• $N = 1000$, $n_{\max} = 70$, $\eta_I = 0.02$, $\eta_R = 0.03$
• $S_0 = 0.3$, $\Delta T = 2$, $T_{\rm crit} = 100$
• $Q_E = 2\times10^{-5}$, $R_V = 2$, $M_S = 0.5$
• 1, 2, 5, 7, 14, 16, 20, 23
• $\Delta t_{\rm cl} = 10^{-4}$, $R_{\rm cl} = 0.001$, $\eta_u = 0.2$, $\gamma_{\min} = 10^{-6}$
• $\alpha = 2.3$, $m_1 = 100$, $m_N = 0.2$
In the second line, $S_0$ is an initial guess for the neighbour sphere, the output interval is $\Delta T$ and $T_{\rm crit}$ gives the termination time. Moreover, the relative energy tolerance $Q_E$ is used for automatic error control. The line of options contains some useful suggestions but is by no means complete. Finally, the IMF is defined by the classical Salpeter exponent $\alpha$ together with the upper and lower mass limits in terms of the average mass. More detailed information on the full set of input parameters can be found in the manual. Thus, for example, there are options for external perturbations or stellar evolution. Taking into account the wide range of available procedures, the complete input file is quite compact in comparison with many other large codes.
Presentation of results constitutes another challenge for code development. It also requires an effort by the practitioner to extract the available data in a suitable form. Here we may distinguish between result summaries and detailed information. To elucidate the possibilities, the table summarizes some of the main optional procedures with a brief explanation.
Procedure            Explanation
Cluster core         $N^2$ algorithm for core radius and density centre
Lagrangian radii     Percentile mass radii and half-mass radius
Error control        Automatic error check and restart from last time
Escape               Removal of distant members and table updates
Time offset          Rescaling of all global times for large values
Event counters       Stellar types and remnant statistics
Binary analysis      Regularized binary histograms and energy budget
Binary data bank     Characteristic parameters for regularized binaries
HR diagram           Evolutionary state of single stars and binaries
General data bank    Detailed snapshots for data analysis
Each of these procedures is activated by specifying a non-zero option, as defined in the manual. There is also a facility for changing any option at later times. Many of the result summaries are self-explanatory and will not be reviewed here. Likewise, the manual illustrates the principle of adding new variables to the code while preserving the total size of the common blocks.
We conclude by commenting on the way in which the total energy is obtained. Thus, rather than evaluating the kinetic and potential energies directly, the different contributions are derived consistently according to the calculation method. For example, the binding energies of KS pairs are given by $\sum \mu_i h_i$, where $h_i$ is predicted to highest order. Monitoring the internal energies of hierarchical systems and collision events enables a conservation scheme to be maintained at high accuracy, because dissipative processes are also accounted for.
References

Aarseth S. J., 2003, Gravitational N-Body Simulations. Cambridge University Press, Cambridge
Aarseth S. J., 2007, MNRAS, 378, 285
Aarseth S. J., Heggie D. C., 1976, A&A, 53, 259
Aarseth S. J., Zare K., 1974, Celes. Mech., 10, 185
Ahmad A., Cohen L., 1973, J. Comput. Phys., 12, 389
Blanchet L., Iyer B., 2003, Class. Quantum Grav., 20, 755
Bulirsch R., Stoer J., 1966, Num. Math., 8, 1
Kokubo E., Makino J., 2004, PASJ, 56, 861
Kokubo E., Yoshinaga K., Makino J., 1998, MNRAS, 297, 1967
Kozai Y., 1962, AJ, 67, 591
Kustaanheimo P., Stiefel E., 1965, J. Reine Angew. Math., 218, 204
Makino J., Aarseth S. J., 1992, PASJ, 44, 141
Makino J., Taiji M., Ebisuzaki T., Sugimoto D., 1997, ApJ, 480, 432
Mardling R. A., Aarseth S. J., 1999, in Steves B. A., Roy A. E., eds, The Dynamics of Small Bodies in the Solar System. Kluwer, Dordrecht, p. 385
Mardling R., Aarseth S., 2001, MNRAS, 321, 398
Mikkola S., Aarseth S. J., 1993, Celes. Mech. Dyn. Ast., 57, 439
Mikkola S., Aarseth S. J., 1996, Celes. Mech. Dyn. Ast., 64, 197
Mikkola S., Aarseth S. J., 1998, New Astron., 3, 309
Mora T., Will C. M., 2004, Phys. Rev. D, 69, 104021 (gr-qc/0312082)
Newmark N. M., 1959, J. Eng. Mech., 85, 67
Peters P. C., 1964, Phys. Rev., 136, B1224
Zare K., 1974, Celes. Mech., 10, 207
2
Regular Algorithms for the Few-Body Problem
Seppo Mikkola
Tuorla Observatory, University of Turku, Finland
mikkola@utu.fi
2.1 Introduction
In N-body simulations, the most common strong interactions are due to close encounters of just two bodies. Most classical numerical integration methods lose precision in such situations due to the $1/r^2$ singularity of the mutual force of the two bodies. In a close encounter, the relative motion of the participating bodies is so fast that for a brief moment the rest of the system can be considered frozen. Consequently, the most important feature of a regularizing algorithm must be that it can reliably handle the perturbed two-body problem. There are two basically different types of methods available: coordinate and time transformations, and algorithms that produce regular results without coordinate transformation.
The first coordinate-transformation method was that of Levi-Civita (1920), but the method works only in two dimensions. Later, Kustaanheimo & Stiefel (1965) generalized this by applying a transformation (KS-transformation) from four dimensions to three dimensions (see also Aarseth 2003). More recently, two versions of algorithmic regularization have been proposed. These are the logarithmic Hamiltonian (LogH), suggested by Mikkola & Tanikawa (1999a, b) and independently by Preto & Tremaine (1999).
A further development, the Time Transformed Leapfrog (TTL), was presented by Mikkola & Aarseth (2002). Finally, Mikkola & Merritt (2006, 2008) combined the LogH and TTL, as well as a generalized midpoint method, to modify the algorithmic regularization such that it can handle the case of velocity-dependent perturbations, which are important in, for example, post-Newtonian dynamics (Soffel 1989).
2.2 Hamiltonian Manipulations
All known regularization methods require the introduction of a new independent variable. Due to the importance of the Hamiltonian formalism, this is
Mikkola, S.: Regular Algorithms for the Few-Body Problem. Lect. Notes Phys. 760, 31–58 (2008)
DOI 10.1007/978-1-4020-8431-7_2 © Springer-Verlag Berlin Heidelberg 2008
often done by transforming the Hamiltonian. Let $\mathbf{q}$ and $\mathbf{p}$ be the coordinates and momenta, $T = T(\mathbf{p})$ the kinetic energy and $U = U(\mathbf{q}, t)$ the potential. Then $H(\mathbf{p}, \mathbf{q}, t) = T(\mathbf{p}) - U(\mathbf{q}, t)$ is the Hamiltonian. If one defines a new independent variable $s$ by the differential equation

$$dt = g(\mathbf{p}, \mathbf{q}, t)\,ds, \tag{2.1}$$

the equations of motion can be derived from the extended phase space Hamiltonian $\Gamma$ (Poincaré's transformation)

$$\Gamma = g(\mathbf{p}, \mathbf{q}, t)\,\bigl(H(\mathbf{p}, \mathbf{q}, t) + B\bigr), \tag{2.2}$$

where $B$ is the momentum of time and initially

$$B(0) = -H(\mathbf{p}(0), \mathbf{q}(0), t_0). \tag{2.3}$$

Time is now a coordinate, and one notes that the Poincaré transformation makes the new Hamiltonian $\Gamma$ conservative, since it does not depend explicitly on the new independent variable. Due to this and the choice of the initial value for $B$, the numerical values are $\Gamma = 0$ and $B = -H$ (binding energy) along the trajectory.
One often uses

$$\Gamma = (H + B)/L \quad \text{or} \quad \Gamma = (H + B)/U. \tag{2.4}$$

Here $U$ is the potential energy and $L = T + U$ the Lagrangian. The equations of motion take the form
$$\begin{aligned}
t' &= \frac{\partial\Gamma}{\partial B} = g, &
\mathbf{q}' &= \frac{\partial\Gamma}{\partial\mathbf{p}} = g\,\frac{\partial H}{\partial\mathbf{p}} + \frac{\partial g}{\partial\mathbf{p}}\,(H + B), \\
B' &= -\frac{\partial\Gamma}{\partial t} = -g\,\frac{\partial H}{\partial t} - \frac{\partial g}{\partial t}\,(H + B), &
\mathbf{p}' &= -\frac{\partial\Gamma}{\partial\mathbf{q}} = -g\,\frac{\partial H}{\partial\mathbf{q}} - \frac{\partial g}{\partial\mathbf{q}}\,(H + B),
\end{aligned} \tag{2.5}$$

which is correct because $H + B = 0$ along the orbit. However, this does not mean that the latter terms can be dropped. The reason for this will become clear in the example in Sect. 2.3.
Another way to manipulate the Hamiltonian is the use of the functional Hamiltonian (Preto & Tremaine 1999)

$$\Lambda = f(T + B) - f(U), \tag{2.6}$$

where $f(z)$ is any function that satisfies $f'(z) \ge 0$. A most interesting function is $f(z) = \log(z)$ (Mikkola & Tanikawa 1999a, b; Preto & Tremaine 1999), which gives $t' = \partial\Lambda/\partial B = 1/(T + B)$. Along the correct trajectory we also have $1/(T + B) = 1/U$, and thus the time transformation is essentially the same as $g = 1/U$. A special feature of the functional Hamiltonian is that it allows the use of the (symplectic) leapfrog algorithm, because the equations of motion
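$$\mathbf{r}' = \frac{\partial\Lambda}{\partial\mathbf{p}} = f'(T + B)\,\frac{\partial T}{\partial\mathbf{p}}, \qquad \mathbf{p}' = -\frac{\partial\Lambda}{\partial\mathbf{r}} = f'(U)\,\frac{\partial U}{\partial\mathbf{r}} \tag{2.7}$$

are such that the right-hand sides do not depend on variables on the left-hand side.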
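The leapfrog structure implied by (2.7) can be sketched as follows. This is a minimal illustration in Python with NumPy, assuming the choice $f(z) = \log z$, a unit reduced mass and a time-independent $U = M/|\mathbf{q}|$ (so that $B$ stays constant). The function name, the step size `h` and the test orbit are choices made for this example, not part of the text.

```python
import numpy as np

def logh_leapfrog(q, p, M, B, h, nsteps):
    """Drift-kick-drift leapfrog for Lambda = log(T + B) - log(U), with U = M/|q|."""
    t = 0.0
    for _ in range(nsteps):
        dt = 0.5 * h / (0.5 * (p @ p) + B)   # half drift: t' = 1/(T + B), q' = p/(T + B)
        q = q + dt * p
        t += dt
        r = np.linalg.norm(q)
        p = p - h * q / r**2                 # kick: p' = (dU/dq)/U = -q/r^2
        dt = 0.5 * h / (0.5 * (p @ p) + B)   # second half drift with the updated momentum
        q = q + dt * p
        t += dt
    return q, p, t

# Eccentric Kepler orbit (a = 1, e = 0.9), started at apocentre.
M, e = 1.0, 0.9
q0 = np.array([1.0 + e, 0.0])
p0 = np.array([0.0, np.sqrt(M * (1.0 - e) / (1.0 + e))])
E0 = 0.5 * (p0 @ p0) - M / np.linalg.norm(q0)
B = -E0                                      # initial value of the momentum of time

q1, p1, t1 = logh_leapfrog(q0, p0, M, B, h=0.02, nsteps=2000)
E1 = 0.5 * (p1 @ p1) - M / np.linalg.norm(q1)
print(t1, E1 - E0)
```

Each sub-step uses only quantities that it does not itself change (the drift needs $\mathbf{p}$ and $B$, the kick needs $\mathbf{q}$), which is the property of (2.7) that makes the explicit leapfrog possible. The energy is evaluated here only as a diagnostic.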
2.3 Coordinate Transformations
2.3.1 One-Dimensional Case
A simple example is provided by the one-dimensional two-body problem. The Keplerian Hamiltonian $H = p^2/2 - M/q$ may be transformed by the point-transformation $q = Q^2$, $p = P/(2Q)$ into the form $H = P^2/(8Q^2) - M/Q^2$. Using $g = q = Q^2$, one obtains

$$\Gamma = Q^2\left(\frac{P^2}{8Q^2} - \frac{M}{Q^2} + B\right) = \frac{1}{8}P^2 + BQ^2 - M, \tag{2.8}$$

and the equations of motion are

$$Q' = \frac{1}{4}P, \qquad P' = -2BQ, \qquad \text{or} \qquad Q'' = -\frac{B}{2}\,Q, \tag{2.9}$$
which is a harmonic oscillator because $B = -H = {\rm constant}$.
Note that, had we dropped the $(H + B)$-factored terms in (2.5), we would have had

$$Q' = \frac{1}{4}P, \qquad P' = 2\left(\frac{1}{8}P^2 - M\right)\Big/Q, \qquad \text{or} \qquad Q'' = \frac{1}{2}\left(\frac{1}{8}P^2 - M\right)\Big/Q, \tag{2.10}$$

which is singular (but still analytically regular due to energy conservation, i.e. because $\tfrac{1}{8}P^2 - M = -BQ^2$).
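The regular behaviour of (2.9) can be seen by integrating a head-on collision. The following sketch uses only the Python standard library and a classical fourth-order Runge-Kutta step, which is a choice made for this example. It starts a body at rest at $q_0 = 1$ with $M = 1$ and integrates in the fictitious time $s$ over a quarter oscillation, i.e. up to the collision $Q = 0$. The accumulated physical time can be compared with the standard free-fall time $(\pi/2)\sqrt{q_0^3/(2M)}$.

```python
import math

M = 1.0
q0, p0 = 1.0, 0.0
B = M / q0 - 0.5 * p0**2                 # B = -H for H = p^2/2 - M/q
Q = math.sqrt(q0)                        # q = Q^2
P = 2.0 * Q * p0                         # p = P/(2Q)
t = 0.0

def rhs(Q, P):
    """Right-hand sides Q', P' from (2.9) and t' = g = Q^2."""
    return 0.25 * P, -2.0 * B * Q, Q * Q

omega = math.sqrt(0.5 * B)               # oscillator frequency from Q'' = -(B/2) Q
n = 1000
h = (0.5 * math.pi / omega) / n          # quarter oscillation: from rest to collision

for _ in range(n):                       # classical RK4 in the fictitious time s
    k1 = rhs(Q, P)
    k2 = rhs(Q + 0.5 * h * k1[0], P + 0.5 * h * k1[1])
    k3 = rhs(Q + 0.5 * h * k2[0], P + 0.5 * h * k2[1])
    k4 = rhs(Q + h * k3[0], P + h * k3[1])
    Q += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
    P += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    t += h * (k1[2] + 2 * k2[2] + 2 * k3[2] + k4[2]) / 6.0

gamma = P * P / 8.0 + B * Q * Q - M      # should remain zero, see (2.8)
t_ff = 0.5 * math.pi * math.sqrt(q0**3 / (2.0 * M))
print(Q, t, t_ff, gamma)
```

Nothing special happens numerically at $Q = 0$; continuing the loop simply carries the solution through the collision, whereas the form (2.10) would require a division by $Q$ there.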
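2.3.2 Three-Dimensional Case: KS-Transformation
The KS-transformations (Kustaanheimo & Stiefel 1965) between the three-dimensional position and momentum, $\mathbf{r}$ and $\mathbf{p}$, and the corresponding four-dimensional KS-variables $\mathbf{Q}$ and $\mathbf{P}$ may be written

$$\mathbf{r} = \widehat{\mathbf{Q}}\,\mathbf{Q}, \qquad \mathbf{p} = \widehat{\mathbf{Q}}\,\mathbf{P}/(2Q^2). \tag{2.11}$$

Here $\widehat{\mathbf{Q}}$ is the KS-matrix (Stiefel & Scheifele 1971, p. 24)

$$\widehat{\mathbf{Q}} = \begin{pmatrix}
Q_1 & -Q_2 & -Q_3 & Q_4 \\
Q_2 & Q_1 & -Q_4 & -Q_3 \\
Q_3 & Q_4 & Q_1 & Q_2 \\
Q_4 & -Q_3 & Q_2 & -Q_1
\end{pmatrix}. \tag{2.12}$$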
Another way to write the transformation is

$$x = Q_1^2 - Q_2^2 - Q_3^2 + Q_4^2, \qquad y = 2(Q_1Q_2 - Q_3Q_4), \qquad z = 2(Q_1Q_3 + Q_2Q_4). \tag{2.13}$$

Note that the fourth components of $\mathbf{r}$ and $\mathbf{p}$ that (2.11) produces are zeros, due to the structure and properties of the transformation.
Due to the increased number of variables, the $Q$'s corresponding to given physical coordinates are not unique. However, one may choose any solution; for example, with $\mathbf{r} = (x, y, z)^t$ and $r = |\mathbf{r}|$ we calculate

$$u_1 = \sqrt{\tfrac{1}{2}(r + |x|)}, \qquad u_2 = y/(2u_1), \qquad u_3 = z/(2u_1), \qquad u_4 = 0, \tag{2.14}$$

and the components of $\mathbf{Q}$ are

$$\mathbf{Q} = \begin{cases}
(u_1, u_2, u_3, u_4)^t, & x \ge 0, \\
(u_2, u_1, u_4, u_3)^t, & x < 0.
\end{cases} \tag{2.15}$$

(This algorithm is used to avoid round-off error.)
Initial values for the KS momenta are given by

$$\mathbf{P} = 2\,\widehat{\mathbf{Q}}^t\,\mathbf{p}. \tag{2.16}$$

For the two-body problem, $H = \tfrac{1}{2}\mathbf{p}^2 - M/r$, the time-transformed Hamiltonian $\Gamma$ in (2.2) takes the form

$$\Gamma = \frac{1}{8}\mathbf{P}^2 - M + B\,\mathbf{Q}^2, \tag{2.17}$$

i.e. a harmonic oscillator, in complete analogy with the one-dimensional case.
When regularized by the KS-transformation, the equations of motion for a perturbed binary,

$$\ddot{\mathbf{r}} + M\,\mathbf{r}/r^3 = \mathbf{F}, \tag{2.18}$$

take the explicit form

$$\begin{aligned}
\mathbf{Q}'' &= -\tfrac{1}{2}B\,\mathbf{Q} + \tfrac{1}{2}\,r\,\widehat{\mathbf{Q}}^t\,\mathbf{F}, \\
B' &= -2\,\mathbf{Q}'\cdot\widehat{\mathbf{Q}}^t\,\mathbf{F}, \\
t' &= r = \mathbf{Q}\cdot\mathbf{Q}.
\end{aligned} \tag{2.19}$$

Here $\mathbf{F}$ is the physical perturbation exerted by other particles (or any other physical effect) and

$$B = \frac{M}{r} - \frac{\mathbf{p}^2}{2}$$

is the two-body binding (Kepler-)energy. Since the equations are regular, they can be solved with any reasonable numerical method.
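The transformation formulas (2.11) to (2.16) translate directly into a few lines of code. The sketch below, in Python with NumPy, is an illustration rather than a routine from any of the codes mentioned; the function names are hypothetical, the physical momentum is padded with a zero fourth component before applying (2.16), and the case $\mathbf{r} = 0$ is not handled.

```python
import numpy as np

def ks_matrix(Q):
    """KS matrix of (2.12) for a four-vector Q."""
    Q1, Q2, Q3, Q4 = Q
    return np.array([[Q1, -Q2, -Q3,  Q4],
                     [Q2,  Q1, -Q4, -Q3],
                     [Q3,  Q4,  Q1,  Q2],
                     [Q4, -Q3,  Q2, -Q1]])

def to_ks(r, p):
    """Physical (r, p) -> KS variables (Q, P) using (2.14)-(2.16)."""
    x, y, z = r
    rad = np.linalg.norm(r)
    u1 = np.sqrt(0.5 * (rad + abs(x)))
    u2, u3, u4 = y / (2.0 * u1), z / (2.0 * u1), 0.0
    Q = np.array([u1, u2, u3, u4]) if x >= 0 else np.array([u2, u1, u4, u3])
    P = 2.0 * ks_matrix(Q).T @ np.append(p, 0.0)   # fourth momentum component set to zero
    return Q, P

def from_ks(Q, P):
    """KS variables (Q, P) -> physical (r, p) using (2.11)."""
    L = ks_matrix(Q)
    r4 = L @ Q
    p4 = L @ P / (2.0 * (Q @ Q))
    return r4[:3], p4[:3]

M = 1.0
r = np.array([-0.3, 0.4, 1.2])
p = np.array([0.1, -0.5, 0.2])
Q, P = to_ks(r, p)
r_back, p_back = from_ks(Q, P)

B = M / np.linalg.norm(r) - 0.5 * (p @ p)          # two-body binding energy
gamma = (P @ P) / 8.0 - M + B * (Q @ Q)            # Hamiltonian (2.17), zero on the orbit
print(r_back - r, p_back - p, gamma)
```

The round trip works because the rows of the KS matrix are mutually orthogonal with squared length $\mathbf{Q}\cdot\mathbf{Q} = r$, so that $\widehat{\mathbf{Q}}\,\widehat{\mathbf{Q}}^t = r\,I$.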
2.4 KS-Chain(s)
When the KS-transformation is applied in N-body systems, one does not obtain a harmonic oscillator, but close approaches can still be regularized. First one forms a chain of particles such that all the small critical distances are included in the chain, and then one applies the KS-transformation to the chain vectors. For details of the chain selection procedure, see Sect. 2.7.1.
Let a time-transformed multiparticle Hamiltonian be

$$\Gamma = (T - U + B)/(T + U),$$

where

$$T = \sum_{\nu} \mathbf{p}_\nu^2/(2m_\nu), \qquad U = \sum_{i<j} m_i m_j/r_{ij}.$$

Let us introduce new coordinates

$$\mathbf{X}_k = \mathbf{r}_{i_k} - \mathbf{r}_{j_k};$$

then we can use the generating function

$$S = \sum_k \mathbf{W}_k\cdot\mathbf{X}_k = \sum_k \mathbf{W}_k\cdot(\mathbf{r}_{i_k} - \mathbf{r}_{j_k}). \tag{2.20}$$

In terms of the new momenta $\mathbf{W}$, the old ones are

$$\mathbf{p}_\nu = \frac{\partial S}{\partial\mathbf{r}_\nu} = \sum_k \mathbf{W}_k\,(\delta_{\nu i_k} - \delta_{\nu j_k}), \tag{2.21}$$
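where the $\delta$'s are the Kronecker symbols. Thus we have

$$T = \frac{1}{2}\sum_{\alpha\beta} T_{\alpha\beta}\,\mathbf{W}_\alpha\cdot\mathbf{W}_\beta, \tag{2.22}$$

$$U = \sum_k \frac{m_{i_k} m_{j_k}}{|\mathbf{X}_k|} + \sum_{i<j,\ (ij)\notin\{i_k j_k\}} \frac{m_i m_j}{r_{ij}}, \tag{2.23}$$

where

$$T_{\alpha\beta} = \sum_\nu \frac{1}{m_\nu}\,(\delta_{\nu i_\alpha} - \delta_{\nu j_\alpha})(\delta_{\nu i_\beta} - \delta_{\nu j_\beta}),$$

and the second potential energy term,

$$\sum_{i<j,\ (ij)\notin\{i_k j_k\}} \frac{m_i m_j}{r_{ij}},$$

contains all the distances $r_{ij} = r_{ij}(\mathbf{X}_1, \mathbf{X}_2, \ldots)$ that are not included among the vectors $\mathbf{X}_k$.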
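The bookkeeping behind (2.21) and (2.22) can be written with a single incidence matrix. The sketch below, in Python with NumPy, is an illustration under stated assumptions: particles are numbered from zero, `pairs` is a hypothetical list of index pairs $(i_k, j_k)$, and the three lists correspond to a simple chain, a Zare-type star around body 0 and the full set of pairs used in the global method.

```python
import numpy as np

def kinetic_matrix(m, pairs):
    """Return T_ab of (2.22) and the incidence matrix D[nu, k] = delta(nu,i_k) - delta(nu,j_k)."""
    D = np.zeros((len(m), len(pairs)))
    for k, (i, j) in enumerate(pairs):
        D[i, k] += 1.0
        D[j, k] -= 1.0
    return D.T @ (D / m[:, None]), D        # T = D^t diag(1/m) D

rng = np.random.default_rng(2)
m = rng.uniform(0.5, 2.0, 4)

schemes = {
    "chain":  [(1, 0), (2, 1), (3, 2)],
    "Zare":   [(1, 0), (2, 0), (3, 0)],
    "Heggie": [(j, i) for i in range(4) for j in range(i + 1, 4)],
}

results = {}
for name, pairs in schemes.items():
    T_ab, D = kinetic_matrix(m, pairs)
    W = rng.normal(size=(len(pairs), 3))            # arbitrary new momenta W_k
    p = D @ W                                       # old momenta from (2.21)
    T_new = 0.5 * np.sum(T_ab * (W @ W.T))          # (2.22)
    T_old = np.sum((p * p).sum(axis=1) / (2.0 * m)) # sum of p^2 / (2m)
    results[name] = (T_new, T_old)
    print(name, T_new, T_old)
```

Because every column of the incidence matrix sums to zero, the momenta obtained from (2.21) automatically have zero total, whichever set of interparticle vectors is chosen.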
After application of the KS transformation (2.11) to every momentum-coordinate pair,

$$\mathbf{W}, \mathbf{X} \to \mathbf{P}, \mathbf{Q},$$

one can obtain the regularized Hamiltonian

$$\Gamma(\mathbf{P}, \mathbf{Q}) = (T - U + B)/(T + U)$$

and form the canonical equations of motion

$$B' = -\frac{\partial\Gamma}{\partial t}, \qquad \mathbf{P}' = -\frac{\partial\Gamma}{\partial\mathbf{Q}}, \tag{2.24}$$

$$t' = \frac{\partial\Gamma}{\partial B}, \qquad \mathbf{Q}' = \frac{\partial\Gamma}{\partial\mathbf{P}}. \tag{2.25}$$

Note that the number of new variables may exceed the number of the old ones. This, however, is not a problem: all the physical results remain correct (Heggie 1974).
The above formulation is completely general, at least to the point that all the well-known methods are included: the Zare (1974) method, in which all particles are regularized with respect to a central body; Heggie's global regularization (Heggie 1974), in which all the interparticle vectors are taken as new variables and collisions are regularized by the KS transformation; and the chain method (Mikkola & Aarseth 1993). The vectors $\mathbf{X}$ of these methods are schematically illustrated in Fig. 2.1.
Fig. 2.1 Regularized interactions (schematically) in the Zare method (Z), the global method of Heggie (H) and the chain method (C)
In fact, one can regularize any interparticle vector. Thus any kind of branching and looping chains can be handled. This could be seen as an intermediate form between the Heggie method and the chain. However, it is not clear if such alternatives are actually more useful than the simple chain. Comprehensive instructions for use of the KS-chain can be found in Mikkola & Aarseth (1993) and Aarseth (2003).

2.5 Algorithmic Regularization

The algorithmic regularization, contrary to KS regularization, does not use a coordinate transformation but only a time transformation and a suitable algorithm that produces regular results despite the singularity in the force. The first such methods were invented in 1999 independently in two places (Mikkola & Tanikawa 1999a, b; Preto & Tremaine 1999).

2.5.1 The Logarithmic Hamiltonian (LogH)

Let $\mathbf{p}$ be the momenta and $\mathbf{q}$ the coordinates, $T(\mathbf{p})$ the kinetic energy and $U(\mathbf{q}, t)$ the force function. Then the Hamiltonian in extended phase space is

$$H = T + B - U. \tag{2.26}$$
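Here $B$ is the momentum of time (which is now a coordinate, with $\dot{t} = \partial H/\partial B = 1$).

If the initial value is chosen as $B(0) = -(T - U)_{t=0}$, i.e. minus the initial value of the physical energy, so that $T + B = U$ initially, then the function

$$\Lambda = \log(T + B) - \log(U) \tag{2.27}$$

can be used as a Hamiltonian in the extended phase space.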
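Demonstration. The equations of motion derivable from $\Lambda$ read

$$\mathbf{p}' = -\frac{\partial \Lambda}{\partial \mathbf{q}} = \frac{\partial U}{\partial \mathbf{q}}\Big/ U, \qquad B' = -\frac{\partial \Lambda}{\partial t} = \frac{\partial U}{\partial t}\Big/ U, \tag{2.28}$$

$$\mathbf{q}' = \frac{\partial \Lambda}{\partial \mathbf{p}} = \frac{\partial T}{\partial \mathbf{p}}\Big/ T_e, \qquad t' = \frac{\partial \Lambda}{\partial B} = 1/T_e, \tag{2.29}$$

where $T_e = T + B$ and a prime denotes differentiation with respect to the (new) independent variable $s$.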
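Since $\Lambda$ does not depend explicitly on $s$, the value of $\Lambda$ is constant. Thus $T + B = U$, due to the choice of initial value for $B$. Using this and dividing the equations of motion by the equation for time (2.29), we get for the time derivatives

$$\dot{\mathbf{p}} = \frac{\partial U}{\partial \mathbf{q}}, \qquad \dot{B} = \frac{\partial U}{\partial t} \qquad \text{and} \qquad \dot{\mathbf{q}} = \frac{\partial T}{\partial \mathbf{p}}, \tag{2.30}$$

i.e. the normal Hamiltonian equations.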
LogH for Two Bodies

To introduce the method, we first consider the simple case of two-body motion, $H = \mathbf{p}^2/2 - M/r$, which gives

$$\Lambda = \log(\mathbf{p}^2/2 + B) + \log(r) \tag{2.31}$$

after dropping $\log(M)$. Thus the time transformation is

$$dt = ds\,\frac{\partial \Lambda}{\partial B} = \frac{ds}{\mathbf{p}^2/2 + B}. \tag{2.32}$$

$B$ remains constant, $B = -(\mathbf{p}^2/2 - M/r)$. The new independent variable $s$ is

$$s = \int^t (\mathbf{p}^2/2 + B)\, dt = \int^t \frac{M}{r}\, dt, \tag{2.33}$$

i.e. a quantity proportional to the eccentric anomaly increment.
With stepsize $h$ and initial values $\mathbf{p}_0$, $\mathbf{r}_0$, $t_0$, the leapfrog algorithm takes the form (illustration in Fig. 2.2)

$$\mathbf{r}_{1/2} = \mathbf{r}_0 + \frac{h}{2}\,\frac{\mathbf{p}_0}{\mathbf{p}_0^2/2 + B}, \tag{2.34}$$

$$\mathbf{p}_1 = \mathbf{p}_0 - h\,\frac{\mathbf{r}_{1/2}}{r_{1/2}^2}, \tag{2.35}$$

$$\mathbf{r}_1 = \mathbf{r}_{1/2} + \frac{h}{2}\,\frac{\mathbf{p}_1}{\mathbf{p}_1^2/2 + B}, \tag{2.36}$$

$$t_1 = t_0 + \frac{h}{2}\left[\frac{1}{\mathbf{p}_0^2/2 + B} + \frac{1}{\mathbf{p}_1^2/2 + B}\right]. \tag{2.37}$$

Fig. 2.2 Illustration of the working of the algorithmic regularization in the case of an elliptic two-body motion. The points on the ellipse are the starting and end points in a leapfrog step, while those outside the ellipse are the $\mathbf{r}_{1/2}$-points
This algorithm produces correct positions and momenta on the associated Keplerian ellipse (Mikkola & Tanikawa 1999a, b; Preto & Tremaine 1999); however, time is not correct and the method thus has phase errors. This result applies even for collision orbits, where the eccentricity $e = 1$.

Although the singularity when $r \rightarrow 0$ is not removed, one expects the algorithm to be applicable for the N-body problem, since the functions are not evaluated precisely at $r = 0$.
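The statement that the LogH leapfrog stays on the Keplerian ellipse can be checked numerically. The following minimal Python sketch applies (2.34)-(2.37) to an eccentric orbit and records the largest deviation of the physical energy $\mathbf{p}^2/2 - M/r$ from its initial value. The function name `logh_step`, the initial conditions and the stepsize are assumptions chosen only for this illustration.

```python
import numpy as np

def logh_step(r, p, t, h, B):
    """One LogH leapfrog step (2.34)-(2.37) for H = p^2/2 - M/r."""
    w0 = 0.5 * (p @ p) + B
    r = r + 0.5 * h * p / w0            # (2.34)
    p = p - h * r / (r @ r)             # (2.35), independent of M
    w1 = 0.5 * (p @ p) + B
    r = r + 0.5 * h * p / w1            # (2.36)
    t = t + 0.5 * h * (1.0 / w0 + 1.0 / w1)   # (2.37)
    return r, p, t

M = 1.0
r = np.array([1.0, 0.0, 0.0])
p = np.array([0.0, 0.3, 0.0])           # gives an eccentricity of about 0.9
E0 = 0.5 * (p @ p) - M / np.linalg.norm(r)
B = -E0                                  # binding energy
t, max_err = 0.0, 0.0
for _ in range(2000):
    r, p, t = logh_step(r, p, t, 0.05, B)
    E = 0.5 * (p @ p) - M / np.linalg.norm(r)
    max_err = max(max_err, abs(E - E0))
```

In this sketch the energy deviation should stay at round-off level for any stepsize, whereas the accumulated time `t` carries the phase error mentioned above.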
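2.5.2 Time-Transformed Leapfrog (TTL)

Consider the general system

$$\dot{\mathbf{r}} = \mathbf{v}, \qquad \dot{\mathbf{v}} = \mathbf{F}(\mathbf{r}), \tag{2.38}$$

where $\mathbf{r}$ and $\mathbf{v}$ are position and velocity vectors of arbitrary dimension. We now introduce a time transformation

$$ds = \Omega(\mathbf{r})\, dt, \tag{2.39}$$

where $\Omega(\mathbf{r}) > 0$ is arbitrary. If $W = \Omega$, then one may write

$$\mathbf{r}' = \mathbf{v}/W, \qquad t' = 1/W, \qquad \mathbf{v}' = \mathbf{F}/\Omega,$$

where a prime means $d/ds$. If $W$ is obtained from the differential equation

$$\dot{W} = \mathbf{v} \cdot \frac{\partial \Omega}{\partial \mathbf{r}}, \qquad \text{or} \qquad W' = \mathbf{v} \cdot \frac{\partial \Omega}{\partial \mathbf{r}}\Big/ \Omega, \tag{2.40}$$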
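instead of $W = \Omega$ directly, we have

$$\begin{pmatrix} \mathbf{r}' \\ t' \\ \mathbf{v}' \\ W' \end{pmatrix} = \begin{pmatrix} \mathbf{v}/W \\ 1/W \\ \mathbf{0} \\ 0 \end{pmatrix} + \begin{pmatrix} \mathbf{0} \\ 0 \\ \mathbf{F}(\mathbf{r})/\Omega(\mathbf{r}) \\ \mathbf{v} \cdot \partial \ln(\Omega)/\partial \mathbf{r} \end{pmatrix}. \tag{2.41}$$

This allows the Time-Transformed Leapfrog (TTL)

$$\mathbf{r}_{1/2} = \mathbf{r}_0 + \frac{h}{2}\,\frac{\mathbf{v}_0}{W_0}, \tag{2.42}$$

$$t_{1/2} = t_0 + \frac{h}{2}\,\frac{1}{W_0}, \tag{2.43}$$

$$\mathbf{v}_1 = \mathbf{v}_0 + h\,\frac{\mathbf{F}(\mathbf{r}_{1/2})}{\Omega(\mathbf{r}_{1/2})}, \tag{2.44}$$

$$W_1 = W_0 + h\,\frac{\mathbf{v}_0 + \mathbf{v}_1}{2\,\Omega(\mathbf{r}_{1/2})} \cdot \frac{\partial \Omega(\mathbf{r}_{1/2})}{\partial \mathbf{r}_{1/2}}, \tag{2.45}$$

$$\mathbf{r}_1 = \mathbf{r}_{1/2} + \frac{h}{2}\,\frac{\mathbf{v}_1}{W_1}, \tag{2.46}$$

$$t_1 = t_{1/2} + \frac{h}{2}\,\frac{1}{W_1}. \tag{2.47}$$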
A Simple Fortran Code for Two Bodies (LogH)

      implicit real*8 (a-h,m,o-z)
      read(5,*)h,tmx,mass          ! read stepsize, maximum time & mass
      read(5,*)x,y,z,vx,vy,vz      ! read initial coords & vels
c     initializations
      t=0
      r=sqrt(x*x+y*y+z*z)          ! distance
      vv=vx*vx+vy*vy+vz*vz         ! v-square
      B=mass/r-vv/2                ! binding-E
c
c     Integration of the two-body motion
    1 continue
      dt=h/(vx*vx+vy*vy+vz*vz+2*B) ! time increment, (h/2)/(T+B)
      x=x+dt*vx
      y=y+dt*vy
      z=z+dt*vz
      t=t+dt
      dtc=h/(x*x+y*y+z*z)
      vx=vx-x*dtc
      vy=vy-y*dtc
      vz=vz-z*dtc
      dt=h/(vx*vx+vy*vy+vz*vz+2*B) ! new time increment
      x=x+dt*vx
      y=y+dt*vy
      z=z+dt*vz
      t=t+dt                       ! time has an O(h^3) error
c     diagnostics: time, coords & error
      write(6,2)t,x,y,z,
     & (B+(vx*vx+vy*vy+vz*vz)/2)-mass/sqrt(x**2+y**2+z**2)
      if(t.lt.tmx)goto 1
    2 format(1x,1p5g12.4)
      end
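If one takes

$$\Omega = 1/r, \tag{2.48}$$

then $\partial \ln\Omega/\partial \mathbf{r} = -\mathbf{r}/r^2$, and by (2.45) the increment of $W$ in one step is

$$\Delta W = -h\,\frac{\mathbf{r}}{r^2} \cdot \frac{\mathbf{v}_1 + \mathbf{v}_0}{2}, \tag{2.49}$$

where $\mathbf{r} = \mathbf{r}_{1/2}$. For unit mass, $\mathbf{F}/\Omega = -\mathbf{r}/r^2$, so that (2.44) gives

$$\Delta \tfrac{1}{2}\mathbf{v}^2 = \tfrac{1}{2}(\mathbf{v}_1^2 - \mathbf{v}_0^2) = \tfrac{1}{2}(\mathbf{v}_1 - \mathbf{v}_0) \cdot (\mathbf{v}_1 + \mathbf{v}_0) = -h\,\frac{\mathbf{r}}{r^2} \cdot \frac{\mathbf{v}_1 + \mathbf{v}_0}{2},$$

which means that for the unperturbed two-body problem this algorithm is mathematically equivalent to the LogH method (more generally, this is the case if $\Omega = U$). Numerically, however, this does not apply. The reason is that in the case of a close approach, $W$ first increases and then decreases fast. This means that the increments are large numbers, and there is considerable cancellation and possible round-off error. Combined with the extrapolation method, this alternative leapfrog can be a powerful integrator for some systems.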
Remark. Especially interesting is the fact that the method can be efficient for potentials that differ from the Newtonian $1/r$ behaviour at small distances. One notes that both the LogH and TTL are useful for the soft potential

$$U \propto 1/\sqrt{r^2 + \epsilon^2},$$

which cannot be regularized with the KS transformation.

Remark. If $\Omega = 1/r$, the (numerical) relation $W = 1/r$ remains valid after every step and, somewhat surprisingly, this is true for any radial force field $\mathbf{F} = f(r)\,\mathbf{r}/r$.
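The last remark can be illustrated with a minimal Python sketch of the TTL step (2.42)-(2.47) for $\Omega = 1/r$, driven by a softened radial force. The function names `ttl_step` and `soft_force`, the softening length and the initial conditions are assumptions for this illustration; the quantity `max_dev` records the largest value of $|W r - 1|$ along the orbit.

```python
import numpy as np

def soft_force(r, eps=0.1):
    """Radial force of a softened potential, F = -r / (r^2 + eps^2)^(3/2)."""
    return -r / (r @ r + eps**2) ** 1.5

def ttl_step(r, v, W, t, h, force):
    """One TTL step (2.42)-(2.47) with Omega = 1/r."""
    r = r + 0.5 * h * v / W                       # (2.42)
    t = t + 0.5 * h / W                           # (2.43)
    rho = np.linalg.norm(r)
    omega = 1.0 / rho
    grad_omega = -r / rho**3                      # dOmega/dr
    v_new = v + h * force(r) / omega              # (2.44)
    W = W + h * (0.5 * (v + v_new)) @ grad_omega / omega   # (2.45)
    v = v_new
    r = r + 0.5 * h * v / W                       # (2.46)
    t = t + 0.5 * h / W                           # (2.47)
    return r, v, W, t

r = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 0.4, 0.0])
W = 1.0 / np.linalg.norm(r)                       # initial value W = Omega
t, max_dev = 0.0, 0.0
for _ in range(1000):
    r, v, W, t = ttl_step(r, v, W, t, 0.02, soft_force)
    max_dev = max(max_dev, abs(W * np.linalg.norm(r) - 1.0))
```

Replacing `soft_force` by any other function of the form $f(r)\,\mathbf{r}/r$ should leave `max_dev` at round-off level, in line with the remark.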
A Simple Fortran Code for Two Bodies (TTL)

      implicit real*8 (a-h,m,o-z)
      read(5,*)h,tincr,tmx,mass    ! read step, tincr, maxtime, mass
      read(5,*)x,y,z,vx,vy,vz      ! read initial coords & vels
      tnext=0
c     initializations
      t=0
      r=sqrt(x*x+y*y+z*z)          ! distance
      vv=vx*vx+vy*vy+vz*vz         ! v-square
      E0=vv/2-mass/r
      W=mass/r
c
c     Integration of two-body motion
    1 continue
      dt=h/W/2                     ! time increment
      t=t+dt
      x=x+dt*vx
      y=y+dt*vy
      z=z+dt*vz
c
      dtc=h/(x*x+y*y+z*z)
      dw= -(x*vx+y*vy+z*vz)*dtc/2  ! half of W-increment, old velocity
      vx=vx-x*dtc
      vy=vy-y*dtc
      vz=vz-z*dtc
      W=W+dw-(x*vx+y*vy+z*vz)*dtc/2
c
      dt=h/W/2                     ! new time increment
      t=t+dt                       ! this has an O(h^3) error
      x=x+dt*vx
      y=y+dt*vy
      z=z+dt*vz
c     diagnostics
      if(t.lt.tnext)goto 1
      tnext=tnext+tincr
      r=sqrt(x*x+y*y+z*z)
      err=-E0+(vx*vx+vy*vy+vz*vz)/2-mass/r
      write(6,2)t,x,y,z,err,r*W-mass ! time, coords & errors
      if(t.lt.tmx)goto 1
    2 format(1x,1p10g12.4)
      end
2.5.3 A Simple LogH Algorithm for the Three-Body Problem

The three-body problem is still one of the most studied problems in few-body dynamics. Therefore it may be of interest to consider in more detail a simple regular three-body algorithm. This also serves as a further illustration of the use of the algorithmic regularization.

Following Heggie (1974), we use the three interparticle vectors (see Fig. 2.3)

$$\mathbf{X}_1 = \mathbf{r}_3 - \mathbf{r}_2, \qquad \mathbf{X}_2 = \mathbf{r}_1 - \mathbf{r}_3, \qquad \mathbf{X}_3 = \mathbf{r}_2 - \mathbf{r}_1 \tag{2.50}$$

as new coordinates. Let the corresponding velocities be $\mathbf{V}_k = \dot{\mathbf{X}}_k$; then the kinetic and potential energies (in the centre-of-mass system) can be written

$$T = \frac{1}{2M} \sum_{i<j} m_i m_j \mathbf{V}_{k_{ij}}^2, \qquad U = \sum_{i<j} \frac{m_i m_j}{|\mathbf{X}_{k_{ij}}|}, \tag{2.51}$$

where $M = \sum_k m_k$ is the total mass and $k_{ij} = 6 - i - j$. The equations of motion are

$$\dot{\mathbf{X}}_k = \mathbf{V}_k, \qquad \dot{\mathbf{V}}_k = -M\,\frac{\mathbf{X}_k}{|\mathbf{X}_k|^3} + m_k \sum_\nu \frac{\mathbf{X}_\nu}{|\mathbf{X}_\nu|^3}, \tag{2.52}$$

and after the application of the logarithmic Hamiltonian modification they read

$$t' = 1/(T + B), \qquad \mathbf{X}'_k = \dot{\mathbf{X}}_k/(T + B), \qquad \mathbf{V}'_k = \dot{\mathbf{V}}_k/U, \tag{2.53}$$

which are suitable for the leapfrog algorithm given in (2.58) and (2.59), as well as for Yoshida's (1990) higher-order leapfrogs.

Fig. 2.3 Labelling of vectors in the three-body regularization
The usage of the relative vectors instead of some inertial coordinates is advantageous in attempting to avoid large round-off effects. One could also integrate only two of the triangle sides, obtaining the remaining one from the conditions

$$\sum_k \mathbf{X}_k = \mathbf{0}, \qquad \sum_k \mathbf{V}_k = \mathbf{0}.$$

However, this hardly reduces the computational effort required by the method. Instead, one may occasionally compute the longest side and the corresponding velocity from the above triangle conditions. Note, however, that the sums of the sides are not only integrals of the exact solution but are also exactly conserved by the leapfrog mapping.
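The three-body algorithm can be sketched in a few lines of Python. The sketch below implements (2.51)-(2.53) with the leapfrog sequence X(h/2) V(h) X(h/2) and then measures how well the triangle conditions are preserved. The function names, the equal masses and the initial conditions are assumptions made for this illustration only; indices are zero-based, so that $k_{ij} = 6 - i - j$ becomes `3 - i - j`.

```python
import numpy as np

def energies(m, X, V):
    """Kinetic and potential energy from (2.51), zero-based indices."""
    T = U = 0.0
    for i in range(3):
        for j in range(i + 1, 3):
            k = 3 - i - j                # zero-based form of k_ij = 6 - i - j
            T += m[i] * m[j] * (V[k] @ V[k])
            U += m[i] * m[j] / np.linalg.norm(X[k])
    return T / (2.0 * m.sum()), U

def accel(m, X):
    """Right-hand side of (2.52) for the relative vectors."""
    g = X / np.linalg.norm(X, axis=1)[:, None] ** 3
    return -m.sum() * g + m[:, None] * g.sum(axis=0)

def logh3_step(m, X, V, t, h, B):
    """One LogH leapfrog step X(h/2) V(h) X(h/2) based on (2.53)."""
    T, _ = energies(m, X, V)
    dt = 0.5 * h / (T + B)
    X, t = X + dt * V, t + dt
    _, U = energies(m, X, V)
    V = V + (h / U) * accel(m, X)
    T, _ = energies(m, X, V)
    dt = 0.5 * h / (T + B)
    X, t = X + dt * V, t + dt
    return X, V, t

m = np.array([1.0, 1.0, 1.0])
r = np.array([[1.0, 0.0, 0.0], [-0.5, 0.8, 0.0], [-0.5, -0.8, 0.0]])
v = np.array([[0.0, 0.4, 0.0], [-0.3, -0.2, 0.0], [0.3, -0.2, 0.0]])
X = np.array([r[2] - r[1], r[0] - r[2], r[1] - r[0]])   # (2.50)
V = np.array([v[2] - v[1], v[0] - v[2], v[1] - v[0]])
T0, U0 = energies(m, X, V)
B = U0 - T0
t = 0.0
for _ in range(500):
    X, V, t = logh3_step(m, X, V, t, 0.01, B)
closure_X = np.abs(X.sum(axis=0)).max()
closure_V = np.abs(V.sum(axis=0)).max()
```

In floating-point arithmetic the two closure measures should stay at round-off level, which is the numerical counterpart of the exact conservation noted above.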
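The transformation from the variables $\mathbf{X}$ to centre-of-mass coordinates $\mathbf{r}$ can be done as

$$\mathbf{r}_1 = \frac{m_3 \mathbf{X}_2 - m_2 \mathbf{X}_3}{M}, \qquad \mathbf{r}_2 = \frac{m_1 \mathbf{X}_3 - m_3 \mathbf{X}_1}{M}, \qquad \mathbf{r}_3 = \frac{m_2 \mathbf{X}_1 - m_1 \mathbf{X}_2}{M}, \tag{2.54}$$

and the velocities obey the same rule.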
2.6 N-Body Algorithms

In an N-body system the Logarithmic Hamiltonian (LogH)

$$\Lambda = \ln(T + B) - \ln(U) \tag{2.55}$$

gives the equations of motion

$$t' = \frac{\partial \Lambda}{\partial B} = 1/(T + B), \qquad \mathbf{r}'_k = \mathbf{v}_k/(T + B), \qquad \mathbf{v}'_k = \mathbf{A}_k/U, \tag{2.56}$$

where $\mathbf{v}_k = \dot{\mathbf{r}}_k$ and $\mathbf{A}_k = (\partial U/\partial \mathbf{r}_k)/m_k$ are the velocity and acceleration, respectively.

It is important to note that the derivatives of coordinates only depend on velocities, and vice versa. This makes a simple leapfrog algorithm possible (see below). The most important feature is that, as discussed in Sect. 2.5.1, the resulting leapfrog is exact for two-body motion, except for a phase error, and thus regularizes close approaches.

The Time-Transformed Leapfrog (TTL) method is a generalization of this idea (Mikkola & Aarseth 2002). In the time transformation one chooses some other function $\Omega(\mathbf{r})$ in place of the potential $U$ and defines an auxiliary quantity $W$ by the differential equation $\dot{W} = \dot{\Omega} = \frac{\partial \Omega}{\partial \mathbf{r}} \cdot \mathbf{v}$. The resulting TTL equations read

$$t' = 1/W, \qquad \mathbf{r}'_k = \frac{1}{W}\,\frac{\partial T}{\partial \mathbf{p}_k}, \qquad \mathbf{v}'_k = \frac{1}{\Omega}\,\mathbf{A}_k, \qquad W' = \sum_k \frac{\partial \Omega}{\partial \mathbf{r}_k} \cdot \mathbf{v}_k / \Omega, \tag{2.57}$$

and these can also be used to construct a leapfrog-like mapping which, for suitable functions $\Omega$, is asymptotically exact for two-body motion near collision. It can be shown that TTL is mathematically equivalent to LogH if one takes $\Omega = U$.
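2.6.1 LogH Leapfrog

First one computes the constant $B = -T + U$ from initial values. The equations of motion can be used to define the basic mappings $\mathbf{X}(s)$ and $\mathbf{V}(s)$ as

$$\mathbf{X}(s): \quad \delta t = s/(T + B); \quad t \rightarrow t + \delta t; \quad \mathbf{r}_k \rightarrow \mathbf{r}_k + \delta t\, \mathbf{v}_k, \tag{2.58}$$

$$\mathbf{V}(s): \quad \delta t = s/U; \quad \mathbf{v}_k \rightarrow \mathbf{v}_k + \delta t\, \mathbf{A}_k,$$

which can be evaluated in a sequence

$$\mathbf{X}(h/2)\,\mathbf{V}(h)\,\mathbf{X}(h/2),$$

using always the most recent results as input for the next operation.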
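2.6.2 TTL

Here one first evaluates the initial value of $W = \Omega$, then uses the leapfrog mappings

$$\mathbf{X}(s): \quad \delta t = s/W; \quad t \rightarrow t + \delta t; \quad \mathbf{r}_k \rightarrow \mathbf{r}_k + \delta t\, \mathbf{v}_k, \tag{2.59}$$

$$\mathbf{V}(s): \quad \delta t = s/\Omega; \quad \delta \mathbf{v}_k = \delta t\, \mathbf{A}_k; \quad W \rightarrow W + \delta t \sum_k \frac{\partial \Omega}{\partial \mathbf{r}_k} \cdot \left(\mathbf{v}_k + \tfrac{1}{2}\delta \mathbf{v}_k\right); \quad \mathbf{v}_k \rightarrow \mathbf{v}_k + \delta \mathbf{v}_k, \tag{2.60}$$

to advance the coordinates and velocities, using the operation sequence

$$\mathbf{X}(h/2)\,\mathbf{V}(h)\,\mathbf{X}(h/2)$$

repeatedly.

For $\Omega$ one may use any suitable function, but usually it is advantageous to take

$$\Omega = \sum_{i<j} \frac{\Omega_{ij}}{r_{ij}},$$

where

$$\Omega_{ij} = 1 \qquad \text{or} \qquad \Omega_{ij} = m_i m_j,$$

the latter choice being recommended if the masses are comparable.

The leapfrog alone is, however, in many cases not accurate enough. The accuracy can be improved, e.g., by using the higher-order leapfrog algorithms of Yoshida (1990). Alternatively, one may use the extrapolation method (Bulirsch & Stoer 1966; Press et al. 1986).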
2.7 AR-Chain

First of all, it is necessary to emphasize the importance of the chain structure, not only in the KS-chain method but also when one uses one of the algorithmic regularizations. The reason is round-off errors. If one uses centre-of-mass coordinates, the relative coordinates of a distant close pair are differences of large numbers, and there is considerable cancellation of significant figures, leading to irrecoverable errors.

This section discusses a new code that uses the chain structure and a mixture of the LogH and TTL methods.

2.7.1 Finding and Updating the Chain
Fig. 2.4 Illustration of the chain and the checking of switching conditions. Distances like $R_{57}$ are compared with the smaller of the two distances $R_{56}$ and $R_{67}$ (marked by ∗). Interparticle distances like $R_{4,10}$ are compared with the smallest of those in contact with the considered distance (marked by ×)

We begin by finding the shortest interparticle vector for the first part of the chain. Next we search for the particle closest to one or the other end of the presently known part of the chain. This particle is added to the closest end of the already existing chain. This is repeated until all particles are included in the chain. The particles are then re-numbered along the chain as 1, 2, ..., N for ease of programming.
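The chain construction just described can be sketched in Python as follows. This is a minimal illustration under simple assumptions (all pair distances are stored in a full matrix, and ties are broken arbitrarily); the function name `build_chain` and the test positions are hypothetical and are not taken from the AR-chain code.

```python
import numpy as np

def build_chain(pos):
    """Return particle indices ordered along the chain."""
    n = len(pos)
    dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    # Start from the closest pair.
    i, j = np.unravel_index(np.argmin(dist), dist.shape)
    chain = [int(i), int(j)]
    free = set(range(n)) - set(chain)
    while free:
        head, tail = chain[0], chain[-1]
        # Closest free particle to either end of the present chain.
        _, end, k = min((dist[end, k], end, k)
                        for end in (head, tail) for k in free)
        if end == head:
            chain.insert(0, k)
        else:
            chain.append(k)
        free.remove(k)
    return chain

# Five particles on a line, used only as a small check.
pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [3.0, 0.0, 0.0],
                [-4.0, 0.0, 0.0], [3.5, 0.0, 0.0]])
chain = build_chain(pos)
```

For this configuration the closest pair is (2, 4); particles 1, 0 and 3 are then attached in turn to the end at which particle 2 started.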
After every integration step we check for the need of updating the chain. Figure 2.4 illustrates the case of a 10-particle chain. To avoid some potential round-off problems, it is advantageous to carry out the transformation from the old chain vectors $\mathbf{X}_k$ to the new ones directly, by expressing the new chain vectors as sums of the old ones.

Let the actual "physical" names of the chain particles 1, ..., N (as defined above) be $I_1, I_2, \ldots, I_N$, and let us use the notation $I^{\rm old}_k$ and $I^{\rm new}_k$ for the names in the old and new chains. Then we may write

$$\mathbf{r}_{I^{\rm old}_k} = \sum_{\nu=1}^{k-1} \mathbf{X}^{\rm old}_\nu, \tag{2.61}$$

$$\mathbf{X}^{\rm new}_\mu = \mathbf{r}_{I^{\rm new}_{\mu+1}} - \mathbf{r}_{I^{\rm new}_\mu}. \tag{2.62}$$

Thus we need to use the correspondence between the old and the new indices to express the new chain vectors $\mathbf{X}$ in terms of the old ones. One finds that if $k_0$ and $k_1$ are two indices such that $I^{\rm old}_{k_0} = I^{\rm new}_\mu$ and $I^{\rm old}_{k_1} = I^{\rm new}_{\mu+1}$, then

$$\mathbf{X}^{\rm new}_\mu = \sum_{\nu=1}^{N-1} B_{\mu\nu} \mathbf{X}^{\rm old}_\nu, \tag{2.63}$$

where $B_{\mu\nu} = +1$ if ($k_1 > \nu$ and $k_0 \le \nu$), $B_{\mu\nu} = -1$ if ($k_1 \le \nu$ and $k_0 > \nu$), and $B_{\mu\nu} = 0$ otherwise.
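A minimal Python sketch of (2.63) is given below. It builds the matrix $B_{\mu\nu}$ from the old and new orderings and compares the result with new chain vectors computed directly from the positions. The function name `switch_matrix`, the random positions and the two orderings are assumptions for this illustration; the indices $k_0$, $k_1$, $\mu$ and $\nu$ are kept 1-based inside the function to match the text.

```python
import numpy as np

def switch_matrix(old_names, new_names):
    """B of (2.63): new chain vectors as signed sums of the old ones."""
    n = len(old_names)
    where_old = {name: k + 1 for k, name in enumerate(old_names)}  # 1-based k
    B = np.zeros((n - 1, n - 1))
    for mu in range(1, n):
        k0 = where_old[new_names[mu - 1]]
        k1 = where_old[new_names[mu]]
        for nu in range(1, n):
            if k1 > nu and k0 <= nu:
                B[mu - 1, nu - 1] = 1.0
            elif k1 <= nu and k0 > nu:
                B[mu - 1, nu - 1] = -1.0
    return B

rng = np.random.default_rng(2)
pos = rng.normal(size=(4, 3))
old = [0, 1, 2, 3]                      # particle names along the old chain
new = [2, 0, 3, 1]                      # particle names along the new chain
X_old = pos[old[1:]] - pos[old[:-1]]
X_new_direct = pos[new[1:]] - pos[new[:-1]]
X_new = switch_matrix(old, new) @ X_old
```

The point of the matrix form is that `X_new` is obtained without forming differences of the (possibly large) position vectors themselves.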
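2.7.2 Transformations

After selecting the chain and renaming the particles as 1, 2, ..., N along the chain, one can evaluate the initial values for the chain vectors and velocities as

$$\mathbf{X}_k = \mathbf{r}_{k+1} - \mathbf{r}_k, \tag{2.64}$$

$$\mathbf{V}_k = \mathbf{v}_{k+1} - \mathbf{v}_k, \tag{2.65}$$

where $\mathbf{v}_k = \dot{\mathbf{r}}_k$. At the same time one may evaluate the centre-of-mass quantities

$$M = \sum_k m_k, \tag{2.66}$$

$$\mathbf{r}_{\rm cm} = \sum_k m_k \mathbf{r}_k / M, \tag{2.67}$$

$$\mathbf{v}_{\rm cm} = \sum_k m_k \mathbf{v}_k / M. \tag{2.68}$$

The transformation back to $\mathbf{r}$, $\mathbf{v}$ can be done by simple summation,

$$\mathbf{r}_1 = \mathbf{0}, \tag{2.69}$$

$$\mathbf{v}_1 = \mathbf{0}, \tag{2.70}$$

$$\mathbf{r}_{k+1} = \mathbf{r}_k + \mathbf{X}_k, \tag{2.71}$$

$$\mathbf{v}_{k+1} = \mathbf{v}_k + \mathbf{V}_k, \tag{2.72}$$

followed by reduction to the centre of mass,

$$\mathbf{r}_{\rm cm} = \sum_k m_k \mathbf{r}_k / M, \tag{2.73}$$

$$\mathbf{v}_{\rm cm} = \sum_k m_k \mathbf{v}_k / M, \tag{2.74}$$

$$\mathbf{r}_k \rightarrow \mathbf{r}_k - \mathbf{r}_{\rm cm}, \tag{2.75}$$

$$\mathbf{v}_k \rightarrow \mathbf{v}_k - \mathbf{v}_{\rm cm}. \tag{2.76}$$

However, it is not always necessary to reduce the coordinates to the centre-of-mass system, since accelerations only depend on the differences.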
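The transformations (2.64)-(2.76) amount to differencing and cumulative summation, as the following minimal Python sketch shows. The function names `to_chain` and `from_chain` and the random test data are assumptions for this illustration; the particles are assumed to be already numbered along the chain.

```python
import numpy as np

def to_chain(r, v):
    """Chain vectors and velocities, (2.64) and (2.65)."""
    return r[1:] - r[:-1], v[1:] - v[:-1]

def from_chain(m, X, V):
    """Back-transformation (2.69)-(2.76), reduced to the centre of mass."""
    zero = np.zeros((1, X.shape[1]))
    r = np.vstack([zero, np.cumsum(X, axis=0)])     # (2.69), (2.71)
    v = np.vstack([zero, np.cumsum(V, axis=0)])     # (2.70), (2.72)
    M = m.sum()
    r = r - (m[:, None] * r).sum(axis=0) / M        # (2.73), (2.75)
    v = v - (m[:, None] * v).sum(axis=0) / M        # (2.74), (2.76)
    return r, v

rng = np.random.default_rng(1)
m = np.array([1.0, 0.1, 2.0, 0.5])
r = rng.normal(size=(4, 3))
v = rng.normal(size=(4, 3))
r -= (m[:, None] * r).sum(axis=0) / m.sum()         # start in the cm frame
v -= (m[:, None] * v).sum(axis=0) / m.sum()
X, V = to_chain(r, v)
r_back, v_back = from_chain(m, X, V)
```

Starting from centre-of-mass coordinates, the round trip should reproduce the original positions and velocities to round-off accuracy.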
2.7.3 Equations of Motion and the Leapfrog

The equations of motion read

$$\dot{\mathbf{X}}_k = \mathbf{V}_k, \tag{2.77}$$

$$\dot{\mathbf{V}}_k = \mathbf{A}_{k+1} - \mathbf{A}_k, \tag{2.78}$$

where the accelerations $\mathbf{A}_k$, with possible external effects $\mathbf{f}_k$, are

$$\mathbf{A}_k = -\sum_{j \ne k} m_j \frac{\mathbf{r}_{jk}}{|\mathbf{r}_{jk}|^3} + \mathbf{f}_k, \tag{2.79}$$

and for $j < k$

$$\mathbf{r}_{jk} = \begin{cases} \mathbf{r}_k - \mathbf{r}_j, & \text{if } k > j + 2, \\ \mathbf{X}_j, & \text{if } k = j + 1, \\ \mathbf{X}_j + \mathbf{X}_{j+1}, & \text{if } k = j + 2. \end{cases} \tag{2.80}$$

For $k < j$ one uses the fact that $\mathbf{r}_{jk} = -\mathbf{r}_{kj}$. The use of $\mathbf{X}_j$ and $\mathbf{X}_j + \mathbf{X}_{j+1}$ reduces the round-off effect significantly. More generally, one could also use

$$\mathbf{r}_{jk} = \sum_{\nu=j}^{k-1} \mathbf{X}_\nu, \tag{2.81}$$

but for many bodies it is faster to use the above recipe (2.80), and the latter alternative seems not to improve the results.

The kinetic energy is

$$T = \frac{1}{2} \sum_k m_k \mathbf{v}_k^2 \tag{2.82}$$

and the potential energy

$$U = \sum_{i<j} \frac{m_i m_j}{|\mathbf{r}_{ij}|}, \tag{2.83}$$

which is evaluated along with the accelerations according to (2.80). We introduce further a time transformation function

$$\Omega = \sum_{i<j} \frac{\Omega_{ij}}{|\mathbf{r}_{ij}|}, \tag{2.84}$$

where $\Omega_{ij}$ are some selected coefficients (to be discussed below).

Now one may define the two time transformations

$$t' = 1/(\alpha(T + B) + \beta\omega + \gamma) = 1/(\alpha U + \beta\Omega + \gamma), \tag{2.85}$$

where $\alpha$, $\beta$ and $\gamma$ are adjustable constants, $B = U - T$ is the N-body binding energy and $\omega$ is defined by the differential equation

$$\dot{\omega} = \sum_k \frac{\partial \Omega}{\partial \mathbf{r}_k} \cdot \mathbf{v}_k \tag{2.86}$$

and the initial value $\omega(0) = \Omega(0)$. The binding energy $B$ changes according to

$$\dot{B} = -\sum_k m_k \mathbf{v}_k \cdot \mathbf{f}_k. \tag{2.87}$$
The equations of motion that can be used to construct the leapfrog, which provides algorithmic regularization, are for time and coordinates, respectively,

$$t' = 1/(\alpha(T + B) + \beta\omega + \gamma), \tag{2.88}$$

$$\mathbf{r}'_k = t'\, \mathbf{v}_k, \tag{2.89}$$

and for velocities, $B$ and $\omega$,

$$\tau' = 1/(\alpha U + \beta\Omega + \gamma), \tag{2.90}$$

$$\mathbf{v}'_k = \tau'\, \mathbf{A}_k, \tag{2.91}$$

$$B' = \tau' \sum_k (-m_k \mathbf{v}_k \cdot \mathbf{f}_k), \tag{2.92}$$

$$\omega' = \tau' \sum_k \frac{\partial \Omega}{\partial \mathbf{r}_k} \cdot \mathbf{v}_k. \tag{2.93}$$

To account for the $\mathbf{v}$-dependence of $B'$ and $\omega'$, one must follow Mikkola & Aarseth (2002), i.e. first the $\mathbf{v}_k$ are advanced and then the average $\langle \mathbf{v}_k \rangle = (\mathbf{v}_k(0) + \mathbf{v}_k(h))/2$ is used to evaluate $B'$ and $\omega'$.
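The leapfrog for the chain vectors $\mathbf{X}_k$ and $\mathbf{V}_k$ can be written most easily in terms of the two mappings

$\mathbf{X}(s)$:

$$\delta t = s/(\alpha(T + B) + \beta\omega + \gamma), \tag{2.94}$$

$$t \rightarrow t + \delta t, \tag{2.95}$$

$$\mathbf{X}_k \rightarrow \mathbf{X}_k + \delta t\, \mathbf{V}_k, \tag{2.96}$$

$\mathbf{V}(s)$:

$$\delta t = s/(\alpha U + \beta\Omega + \gamma), \tag{2.98}$$

$$\mathbf{V}_k \rightarrow \mathbf{V}_k + \delta t\, (\mathbf{A}_{k+1} - \mathbf{A}_k), \tag{2.99}$$

$$B \rightarrow B + \delta t \sum_k (-m_k \langle \mathbf{v}_k \rangle \cdot \mathbf{f}_k), \tag{2.100}$$

$$\omega \rightarrow \omega + \delta t \sum_k \frac{\partial \Omega}{\partial \mathbf{r}_k} \cdot \langle \mathbf{v}_k \rangle, \tag{2.101}$$

where $\langle \mathbf{v}_k \rangle$ is the average of the initial and final $\mathbf{v}$'s here. Note that it is also necessary to evaluate the individual velocities $\mathbf{v}_k$, because the expression for $B'$ and $\omega'$ would otherwise (in terms of the chain vector velocities $\mathbf{V}_k$) become rather cumbersome.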
One leapfrog step can then be written simply as

$$\mathbf{X}(h/2)\,\mathbf{V}(h)\,\mathbf{X}(h/2),$$

and a longer sequence of $n$ steps reads

$$\mathbf{X}(h/2)\left[\prod_{\nu=1}^{n-1} \big(\mathbf{V}(h)\,\mathbf{X}(h)\big)\right]\mathbf{V}(h)\,\mathbf{X}(h/2).$$

This is the formulation to be used with the extrapolation method when proceeding over a total time interval of length $nh$.
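The structure of the two mappings can be seen in the following minimal Python sketch of one step X(h/2) V(h) X(h/2) with the time transformation (2.85). To keep it short, several simplifying assumptions are made that differ from the AR-chain code: the particles are integrated in inertial coordinates rather than chain vectors, there are no external forces (so $B$ stays constant), and all names (`kinetic`, `potential_terms`, `ar_step`) and initial conditions are hypothetical.

```python
import numpy as np

def kinetic(m, v):
    return 0.5 * np.sum(m * np.sum(v * v, axis=1))

def potential_terms(m, r, coef):
    """Return U, Omega, accelerations A_k and the gradients dOmega/dr_k."""
    U = Om = 0.0
    A, dOm = np.zeros_like(r), np.zeros_like(r)
    for i in range(len(m)):
        for j in range(i + 1, len(m)):
            d = r[j] - r[i]
            rij = np.linalg.norm(d)
            U += m[i] * m[j] / rij
            Om += coef[i, j] / rij
            A[i] += m[j] * d / rij**3
            A[j] -= m[i] * d / rij**3
            dOm[i] += coef[i, j] * d / rij**3
            dOm[j] -= coef[i, j] * d / rij**3
    return U, Om, A, dOm

def ar_step(r, v, t, omega, B, m, coef, h, alpha, beta, gamma):
    """One step X(h/2) V(h) X(h/2) of (2.94)-(2.101), without external forces."""
    dt = 0.5 * h / (alpha * (kinetic(m, v) + B) + beta * omega + gamma)
    r, t = r + dt * v, t + dt
    U, Om, A, dOm = potential_terms(m, r, coef)
    dtau = h / (alpha * U + beta * Om + gamma)
    v_new = v + dtau * A
    omega = omega + dtau * np.sum(dOm * 0.5 * (v + v_new))   # uses <v_k>
    v = v_new
    dt = 0.5 * h / (alpha * (kinetic(m, v) + B) + beta * omega + gamma)
    r, t = r + dt * v, t + dt
    return r, v, t, omega, B

m = np.array([1.0, 1.0, 1.0])
r0 = np.array([[1.0, 0.0, 0.0], [-0.5, 0.8, 0.0], [-0.5, -0.8, 0.0]])
v0 = np.array([[0.0, 0.4, 0.0], [-0.3, -0.2, 0.0], [0.3, -0.2, 0.0]])
coef = np.outer(m, m)                       # Omega_ij = m_i m_j
U0, Om0, _, _ = potential_terms(m, r0, coef)
B0 = U0 - kinetic(m, v0)

fwd = ar_step(r0, v0, 0.0, Om0, B0, m, coef, 0.01, 1.0, 0.5, 0.0)
back = ar_step(*fwd, m, coef, -0.01, 1.0, 0.5, 0.0)       # reverse the step
plain = ar_step(r0, v0, 0.0, Om0, B0, m, coef, 0.01, 0.0, 0.0, 1.0)
```

Two properties can be read off from this sketch: a step with $-h$ undoes a step with $+h$ (time symmetry), and with $(\alpha, \beta, \gamma) = (0, 0, 1)$ the physical time advances by exactly $h$, i.e. the mapping reduces to the ordinary leapfrog.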
2.7.4 Alternative Time Transformations

If one takes

$$\Omega_{ij} = m_i m_j, \tag{2.102}$$

then $\alpha = 0$, $\beta = 1$, $\gamma = 0$ is mathematically equivalent to $\alpha = 1$, $\beta = \gamma = 0$, as was shown in Mikkola & Aarseth (2002). However, numerically these are not equivalent and the LogH alternative is much more stable. On the other hand, as noted above, it is desirable to get stepsize shortening (and thus regularization) also for encounters of small bodies, and thus some function $\Omega$ should also be included.

To increase the numerical stability for strong interactions of big bodies and smooth the encounters of small bodies, one may use $\alpha = 1$, $\beta \ne 0$ and

$$\Omega_{ij} = \begin{cases} \tilde{m}^2, & \text{if } m_i m_j < \epsilon\, \tilde{m}^2, \\ 0, & \text{otherwise}, \end{cases} \tag{2.103}$$

where $\tilde{m}^2 = \sum_{i<j} m_i m_j / (N(N-1)/2)$ is the mean mass product and $\epsilon$ an adjustable parameter ($\epsilon \sim 10^{-3}$ may be a good guess). It is sometimes advantageous to integrate (2.86) for $\omega$ even if $\beta = 0$. This is because the integrator (extrapolation method) is forced to use short steps where $\omega$ is large, thus giving higher precision when required.

Remarks

1. If $(\alpha, \beta, \gamma) \propto (1, 0, 0)$, the method is the logarithmic Hamiltonian method (LogH) of Mikkola & Tanikawa (1999a).
2. If $(\alpha, \beta, \gamma) \propto (0, 1, 0)$, the method is the transformed leapfrog (TTL) (Mikkola & Aarseth 2002).
3. If $(\alpha, \beta, \gamma) \propto (0, 0, 1)$, the method is the normal basic leapfrog.
4. Which combination of the numbers $(\alpha, \beta, \gamma)$ is best cannot be answered in general. For N-body systems with very large mass ratios one must have $\beta \ne 0$, but some small value is advantageous. This is because low-mass bodies do not contribute significantly to the energies, and if $\beta = 0$ the stepsize is not reduced sufficiently during a close encounter.
2.8 Basic Algorithms for the Extrapolation Method

2.8.1 Leapfrog

The extrapolation method (Gragg 1964, 1965; Bulirsch & Stoer 1966), which extrapolates results from a simple basic integrator to zero stepsize, is one of the most efficient methods to convert results of low-order basic integrators into highly accurate final outcomes. Often such an integrator can be conveniently chosen to be a composite integrator like the leapfrog. Let the differential equations be

$$\dot{\mathbf{x}} = \mathbf{f}(\mathbf{y}), \qquad \dot{\mathbf{y}} = \mathbf{g}(\mathbf{x}); \tag{2.104}$$

then one can construct the simple leapfrog algorithm

$$\mathbf{x}_{1/2} = \mathbf{x}_0 + \frac{h}{2}\,\mathbf{f}(\mathbf{y}_0), \tag{2.105}$$

$$\mathbf{y}_1 = \mathbf{y}_0 + h\,\mathbf{g}(\mathbf{x}_{1/2}), \tag{2.106}$$

$$\mathbf{x}_1 = \mathbf{x}_{1/2} + \frac{h}{2}\,\mathbf{f}(\mathbf{y}_1). \tag{2.107}$$

One notes that this is a slightly generalized formulation of the very basic leapfrog, which is obtained if $\mathbf{f}(\mathbf{y}) = \mathbf{y}$. In this case, therefore, $\mathbf{x}$ would be the coordinate vector, $\mathbf{y}$ the velocity vector and $\mathbf{g}(\mathbf{x})$ the acceleration.
Let us introduce the two mappings (or "subroutines")

$$\mathbf{X}(s): \quad \mathbf{x} \rightarrow \mathbf{x} + s\,\mathbf{f}(\mathbf{y}) \tag{2.108}$$

and

$$\mathbf{Y}(s): \quad \mathbf{y} \rightarrow \mathbf{y} + s\,\mathbf{g}(\mathbf{x}), \tag{2.109}$$

with which the above leapfrog can be symbolized as $\mathbf{X}(h/2)\,\mathbf{Y}(h)\,\mathbf{X}(h/2)$. When we want to compute $n$ steps of stepsize $= h/n$, we can write

$$\mathbf{X}\!\left(\frac{h}{2n}\right)\left[\mathbf{Y}\!\left(\frac{h}{n}\right)\mathbf{X}\!\left(\frac{h}{n}\right)\right]^{n-1}\mathbf{Y}\!\left(\frac{h}{n}\right)\mathbf{X}\!\left(\frac{h}{2n}\right). \tag{2.110}$$

This advances the system over the time interval $h$.

The final results can now be considered to be a function of $h/n$, and thus it is possible to extrapolate to zero stepsize. Due to the time symmetry of the leapfrog, the error has an (asymptotic) expansion of the form

$$a_2 (h/n)^2 + a_4 (h/n)^4 + \ldots,$$

i.e. the expansion contains only even powers of $h$. This makes the extrapolation process particularly efficient.
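The even-power error expansion can be illustrated with a minimal Python sketch for the harmonic oscillator ($\mathbf{f}(y) = y$, $\mathbf{g}(x) = -x$). The sequence (2.110) is evaluated with $n = 2$ and $n = 4$ substeps, and a single Richardson extrapolation step removes the $(h/n)^2$ term. The function name `leapfrog`, the interval $h = 1$ and the choice of $n$ are assumptions for this illustration; a full extrapolation code would use a longer sequence of $n$ values.

```python
import math

def leapfrog(x, p, h, n):
    """Sequence (2.110) for the harmonic oscillator: x' = p, p' = -x."""
    s = h / n
    x += 0.5 * s * p
    for _ in range(n - 1):
        p -= s * x
        x += s * p
    p -= s * x
    x += 0.5 * s * p
    return x, p

h = 1.0
x_exact = math.cos(h)                    # exact solution for x(0)=1, p(0)=0
x2, _ = leapfrog(1.0, 0.0, h, 2)
x4, _ = leapfrog(1.0, 0.0, h, 4)
x_extrap = (4.0 * x4 - x2) / 3.0         # eliminates the (h/n)^2 term
err2 = x2 - x_exact
err4 = x4 - x_exact
err_extrap = x_extrap - x_exact
```

Halving the substep reduces the error by a factor close to four, and the extrapolated value is considerably more accurate than either of the two leapfrog results.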
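2.8.2 Midpoint Method

In addition to the leapfrog algorithm, commonly used in connection with the extrapolation method, we have the so-called modified midpoint method. This algorithm can also be formally written as a leapfrog. Let the differential equation be

$$\dot{\mathbf{z}} = \mathbf{f}(\mathbf{z}) \tag{2.111}$$

and let us split this into two parts as

$$\dot{\mathbf{x}} = \mathbf{f}(\mathbf{y}), \qquad \dot{\mathbf{y}} = \mathbf{f}(\mathbf{x}). \tag{2.112}$$

If this pair of equations is solved using the initial conditions $\mathbf{x}(0) = \mathbf{y}(0) = \mathbf{z}(0)$, the solution is simply $\mathbf{x}(t) = \mathbf{y}(t) = \mathbf{z}(t)$. On the other hand, (2.112) is of the same form as (2.104), except that $\mathbf{g} = \mathbf{f}$, and it is possible to construct the leapfrog algorithm

$$\mathbf{x}_{1/2} = \mathbf{x}_0 + \frac{h}{2}\,\mathbf{f}(\mathbf{y}_0), \tag{2.113}$$

$$\mathbf{y}_1 = \mathbf{y}_0 + h\,\mathbf{f}(\mathbf{x}_{1/2}), \tag{2.114}$$

$$\mathbf{x}_1 = \mathbf{x}_{1/2} + \frac{h}{2}\,\mathbf{f}(\mathbf{y}_1), \tag{2.115}$$

the results of which can also be used for extrapolation to zero stepsize. Note that it is the vector $\mathbf{x}$ that is extrapolated, while here $\mathbf{y}$ is just an auxiliary quantity. If one defines the mapping

$$\mathbf{A}(\mathbf{y}, \mathbf{x}, s): \quad \mathbf{x} \rightarrow \mathbf{x} + s\,\mathbf{f}(\mathbf{y}), \tag{2.116}$$

then, similar to (2.110), one can write for the results with stepsize $= h/n$

$$\mathbf{A}\!\left(\mathbf{y}, \mathbf{x}, \frac{h}{2n}\right)\left[\mathbf{A}\!\left(\mathbf{x}, \mathbf{y}, \frac{h}{n}\right)\mathbf{A}\!\left(\mathbf{y}, \mathbf{x}, \frac{h}{n}\right)\right]^{n-1}\mathbf{A}\!\left(\mathbf{x}, \mathbf{y}, \frac{h}{n}\right)\mathbf{A}\!\left(\mathbf{y}, \mathbf{x}, \frac{h}{2n}\right), \tag{2.117}$$

where $\mathbf{x} = \mathbf{z}(0)$, $\mathbf{y} = \mathbf{z}(0)$ initially.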
2.8.3 Generalized Midpoint Method

Here we introduce a generalization of the well-known modified midpoint method. In this algorithm the basic approximation to advance the solution is not just the evaluation of the derivative at the midpoints, but any method to approximate the solution. Thus, e.g., the algorithmic regularization by the leapfrog can be used even when there are additional forces depending on velocities. This provides a regular basic algorithm, which is made suitable for the extrapolation method by means of the generalized midpoint method.

The starting point in this algorithm (Mikkola & Merritt 2006, 2008) is the same as in the previous (midpoint method) section, i.e. the problem considered is

$$\dot{\mathbf{z}} = \mathbf{f}(\mathbf{z}), \qquad \mathbf{z}(0) = \mathbf{z}_0, \tag{2.118}$$

and it is split into two as $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{y})$, $\dot{\mathbf{y}} = \mathbf{f}(\mathbf{x})$, and the leapfrog-like algorithm (the modified midpoint method) is

$$\mathbf{x}_{1/2} = \mathbf{x}_0 + \frac{h}{2}\,\mathbf{f}(\mathbf{y}_0), \qquad \mathbf{y}_1 = \mathbf{y}_0 + h\,\mathbf{f}(\mathbf{x}_{1/2}), \qquad \mathbf{x}_1 = \mathbf{x}_{1/2} + \frac{h}{2}\,\mathbf{f}(\mathbf{y}_1).$$

A new interpretation of the above can be obtained by first rewriting it in the form

$$\mathbf{x}_{1/2} = \mathbf{x}_0 + \left[+\frac{h}{2}\,\mathbf{f}(\mathbf{y}_0)\right], \tag{2.119}$$

$$\mathbf{y}_{1/2} = \mathbf{y}_0 - \left[-\frac{h}{2}\,\mathbf{f}(\mathbf{x}_{1/2})\right], \tag{2.120}$$

$$\mathbf{y}_1 = \mathbf{y}_{1/2} + \left[+\frac{h}{2}\,\mathbf{f}(\mathbf{x}_{1/2})\right], \tag{2.121}$$

$$\mathbf{x}_1 = \mathbf{x}_{1/2} - \left[-\frac{h}{2}\,\mathbf{f}(\mathbf{y}_1)\right]. \tag{2.122}$$

In (2.119) the bracketed term is an (Euler-method) approximation to the increment of $\mathbf{x}$ over the time interval $h/2$ with the initial value $\mathbf{y}_0$, while in (2.120) the initial value is $\mathbf{x}_{1/2} \approx \mathbf{x}(h/2)$ and the time interval is $-h/2$. Finally, this increment is added, with a minus sign, to $\mathbf{y}_0$ to obtain an approximation for $\mathbf{y}(h/2)$. In the remaining formulae (2.121) and (2.122) the idea is the same, but the roles of $\mathbf{x}$ and $\mathbf{y}$ have been changed.

A generalization of this follows readily. Let $\mathbf{d}(\mathbf{z}_0, \Delta t)$ be an increment for $\mathbf{z}$ such that

$$\mathbf{z}(\Delta t) \approx \mathbf{z}_0 + \mathbf{d}(\mathbf{z}_0, \Delta t) \tag{2.123}$$

is an approximation to the solution of (2.118) over a time interval $\Delta t$. One step in the generalized midpoint method can now be written

$$\mathbf{x}_{1/2} = \mathbf{x}_0 + \mathbf{d}\!\left(\mathbf{y}_0, +\frac{h}{2}\right), \tag{2.124}$$

$$\mathbf{y}_{1/2} = \mathbf{y}_0 - \mathbf{d}\!\left(\mathbf{x}_{1/2}, -\frac{h}{2}\right), \tag{2.125}$$

$$\mathbf{y}_1 = \mathbf{y}_{1/2} + \mathbf{d}\!\left(\mathbf{x}_{1/2}, +\frac{h}{2}\right), \tag{2.126}$$

$$\mathbf{x}_1 = \mathbf{x}_{1/2} - \mathbf{d}\!\left(\mathbf{y}_1, -\frac{h}{2}\right), \tag{2.127}$$

or, if we define the mapping (or "subroutine")

$$\mathbf{A}(\mathbf{x}, \mathbf{y}, h): \quad \mathbf{x} \rightarrow \mathbf{x} + \mathbf{d}\!\left(\mathbf{y}, +\frac{h}{2}\right), \tag{2.128}$$

$$\qquad\qquad\qquad \mathbf{y} \rightarrow \mathbf{y} - \mathbf{d}\!\left(\mathbf{x}, -\frac{h}{2}\right), \tag{2.129}$$
we can write the algorithm with many ($n$) steps as

1. Initialize $\mathbf{y} = \mathbf{x}$
2. Repeat $\mathbf{A}(\mathbf{x}, \mathbf{y}, h)\,\mathbf{A}(\mathbf{y}, \mathbf{x}, h)$ $n$ times (2.130)
3. Take $\mathbf{x}$ as the final result

Thus one simply calls the subroutine $\mathbf{A}$ alternately with arguments $(\mathbf{x}, \mathbf{y})$ and $(\mathbf{y}, \mathbf{x})$ such that the sequence is time-symmetric (starts and stops with $\mathbf{x}$ in (2.130)).

This basic algorithm has the correct symmetry, because it was derived from a leapfrog-like treatment, and thus the Gragg-Bulirsch-Stoer extrapolation method can be used to obtain high accuracy.

This generalized midpoint algorithm may be especially useful if one employs a special method, well suited to the particular problem at hand, to obtain the increment $\mathbf{d}$. For the few-body problem with velocity-dependent external perturbations, such a method is the algorithmic regularization leapfrog. The external perturbation (with possible dependence on velocities) can be added to the increment as

$$\mathbf{d} \rightarrow \mathbf{d} + \Delta t\, \mathbf{f}(\mathbf{v}, \ldots), \tag{2.131}$$

where $\mathbf{f}$ is the external perturbation and $\mathbf{v}$ is the most recent velocity value available. Further on, the leapfrog can be replaced by any other method that is not necessarily time-symmetric, since the algorithm generates the right kind of symmetry.
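The recipe (2.130) is short enough to be written out in full. The following minimal Python sketch applies it to the harmonic oscillator with two choices of the increment $\mathbf{d}$: the Euler increment, for which the method should reduce to the modified midpoint method (2.113)-(2.115), and a second-order increment that is not itself time-symmetric. All function names and parameter values are assumptions for this illustration; in the few-body application the increment would instead come from the algorithmic regularization leapfrog.

```python
import numpy as np

def f(z):
    """Harmonic oscillator written as z' = f(z), with z = (x, p)."""
    return np.array([z[1], -z[0]])

def euler_increment(z, dt):
    return dt * f(z)

def rk2_increment(z, dt):
    """A second-order (explicit midpoint) increment, not time-symmetric."""
    return dt * f(z + 0.5 * dt * f(z))

def generalized_midpoint(z0, h, n, d):
    """Recipe (2.130): n repetitions of A(x, y, h) A(y, x, h)."""
    x = np.array(z0, dtype=float)
    y = x.copy()
    for _ in range(n):
        x = x + d(y, +0.5 * h)      # A(x, y, h), (2.128)
        y = y - d(x, -0.5 * h)      # (2.129)
        y = y + d(x, +0.5 * h)      # A(y, x, h)
        x = x - d(y, -0.5 * h)
    return x

def modified_midpoint(z0, h, n):
    """Leapfrog form (2.113)-(2.115), repeated n times."""
    x = np.array(z0, dtype=float)
    y = x.copy()
    for _ in range(n):
        x = x + 0.5 * h * f(y)
        y = y + h * f(x)
        x = x + 0.5 * h * f(y)
    return x

z0 = [1.0, 0.0]
exact = np.array([np.cos(1.0), -np.sin(1.0)])      # solution at t = 1
gm_euler = generalized_midpoint(z0, 0.01, 100, euler_increment)
mm = modified_midpoint(z0, 0.01, 100)
gm_rk2 = generalized_midpoint(z0, 0.01, 100, rk2_increment)
```

With the Euler increment the two routines agree to round-off, and with the second-order increment the result remains close to the exact solution, which illustrates that the symmetry comes from the recipe and not from the increment.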
2.8.4 Lyapunov Exponents

When the Lyapunov exponents (usually the largest one is sufficient) are required, the normal practice is that one derives the variational equations and then programs the integration of those equations. In practice, there exists another, simpler way to do the necessary programming:

1. First one writes the code to integrate the basic problem. It is a good idea to use rather simple program statements.
2. One differentiates the resulting (and tested) code line by line, adding the necessary lines for evaluation of the variations.
3. This is the simplest way to write the code for the variations, since there is no reason to consider the variational equations at all. Instead, one mechanically differentiates every program statement, thus getting the exact variations of the algorithm.
4. That is the best one can do.

Perhaps the best way to clarify the above is to give a simple example. Here is a leapfrog algorithm for the harmonic oscillator. First is shown the pure harmonic oscillator code, then the version with variations. The differentiated lines that evaluate the variations are marked as "var".
c     Leapfrog code for a harmonic oscillator
c-----------------------------------------------
      implicit real*8 (a-h,o-z)
      x=1
      p=0
      h=0.01d0
      E0=(p*p+x*x)/2
      t=0
    1 continue
      x=x+h/2*p                    ! this is
      p=p-h*x                      ! a leapfrog
      x=x+h/2*p                    ! step
      t=t+h
c     diagnostics
      E=(p*p+x*x)/2
      write(6,*)t,x,p,E-E0
      if(t.lt.100)goto 1           ! max time=100
      end

c     Differentiated leapfrog for harmonic oscillator
c----------------------------------------------
      implicit real*8 (a-h,o-z)
      x=1
      dx=1                         ! var
      p=0
      dp=0                         ! var
      E0=(p*p+x*x)/2
      dE0=p*dp+x*dx                ! var
      t=0
      h=0.01d0                     ! stepsize
    1 continue
      x=x+h/2*p                    ! this is
      dx=dx+h/2*dp                 ! var
      p=p-h*x                      ! a leapfrog
      dp=dp-h*dx                   ! var
      x=x+h/2*p                    ! step
      dx=dx+h/2*dp                 ! var
      t=t+h
c     diagnostics
      E=(p*p+x*x)/2
      dE=p*dp+x*dx                 ! var (this should be constant)
      write(6,*)t,x,p,E-E0,dE-dE0
      if(t.lt.100)goto 1           ! max time=100
      end
The harmonic oscillator example is almost trivial, but it shows how the variations can be obtained by differentiating the original code mechanically, without any need to consider the variational equations. The same technique is useful for almost any algorithm, however complicated. One easy check to implement for the variations is based on the fact that the differentials of constants of motion are also constants of motion. Above there is only one integral, the total energy. The differential should thus remain (approximately) constant. In the few-body problem this applies to the components of angular momentum also. Finally, in terms of the variations $\delta q$, (approximations for) the Lyapunov exponents can be obtained as

$$\lambda \approx \ln(|\delta q|)/t \tag{2.132}$$

when the time $t$ is sufficiently large.

In time-transformed systems all the variables, including the time $t$, have variations. Often the results are wanted in the "physical" system, where time is the independent variable. One must thus eliminate the time-variation effect. If $f$ is any function of the system variables and time, the physical system variation $\Delta f$ and the time-transformed system variation $\delta f$ are related by

$$\Delta f = \delta f - \delta t\, \dot{f}, \tag{2.133}$$

where $\dot{f}$ is the total time derivative of $f$.
2.9 Accuracy of the AR-Chain

To demonstrate the ability of the AR-chain code to handle large mass ratios, we plot in Fig. 2.5 the energy and angular momentum errors in a system with a wide range of masses (two masses $m_1 = m_2 = 1$, and the rest were assigned values $0.1, 0.01, 0.001, \ldots, 10^{-8}$). Due to the large range of masses, the KS-chain cannot integrate the motions in this system satisfactorily, but AR-chain is fast and accurate.

The system evolves by ejecting most of the small masses in the time interval illustrated. The energy errors in this example are shown in two ways: the uppermost curve gives the relative error in energy computed as $1 - E/E_0$, while the lowermost curve is the value of the logarithmic Hamiltonian (essentially the same as $(E - E_0)/U$). The absolute error of the angular momentum is also illustrated in the figure. Somewhat surprisingly, the relative error of the energy fluctuates considerably, while the value of the logarithmic Hamiltonian evolves much more slowly. The reason for this is that, since the Hamiltonian is $\log((T - E)/U)$, the algorithm attempts to keep this quantity constant (and not the energy $E$). In fact, it is inevitable that integration errors give a small non-zero value for the logarithmic Hamiltonian, $\log((T - E)/U) = \epsilon$, from which we can derive the energy error

$$\delta E = \epsilon U, \tag{2.134}$$

assuming the logarithmic Hamiltonian remains constant. Thus it is essentially the variation of the potential energy $U$ that causes the fluctuation of the energy error in the above figure. We conclude that all the illustrated errors are sufficiently small, of the order of magnitude of round-off error effects.
Fig. 2.5 Errors in a 10-body problem integrated with the AR-chain code (vertical axis: errors, from about $-4 \times 10^{-13}$ to $1.2 \times 10^{-12}$; horizontal axis: time, from 0 to 120). The system consists of a heavy binary (component masses = 1, eccentricity $e = 0.5$) and the other particles have masses $10^{-n}$ for $n = 1, 2, 3, \ldots, 8$. Uppermost curve: relative error of energy ($= 1 - E/E_0$); lowermost curve: $\log((T - E)/U)$, which is the value of the logarithmic Hamiltonian; the thick curve (AM): absolute error in the angular momentum
210 Conclusions
Experience has shown that generally the AR-chain is comparable in accuracywith the KS-chain in most practical problems (the one-dimensional N -bodyproblem being an exception) With the modified midpoint method AR-chain
is efficient also in problems with velocity-dependent external forces A furtheradvantage is the fact that contrary to KS-chain soft potentials can readilybe treated without any problem Also the differentiation of the algorithmsis sufficiently simple especially for the three-body algorithm discussed inSect 253 so that one can evaluate the Lyapunov exponents
In summary:
1. KS-chain is the most efficient KS-regularized code, but is restricted to comparable masses (say, mass ratios of ∼ $10^4$). A possible drawback for some problems is that a soft potential cannot be used.
2. LogH is a good alternative for comparable masses.
3. TTL can handle large mass ratios, but may suffer from round-off errors.
4. AR-chain can handle large mass ratios and soft potentials. With the generalized midpoint method, velocity-dependent external forces can also be included with no problem. Consequently, AR-chain is a good alternative to the KS-chain and, in many problems, the best method.
5. For all the algorithms discussed here, use of the extrapolation method (Bulirsch & Stoer 1966; Press et al. 1986) is necessary to improve the leapfrog results to high accuracy.
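To illustrate item 5, here is a minimal sketch of how extrapolation sharpens a low-order result. It is not the AR-chain implementation: it applies Gragg's modified midpoint rule to a toy harmonic oscillator (an assumed test problem) and performs a single Richardson extrapolation step, whereas a full Bulirsch–Stoer scheme would use a whole sequence of substep counts. All function names are hypothetical.

```python
import math

def modified_midpoint(f, y, H, n):
    """Advance y' = f(y) over an interval H with n midpoint substeps (n even)."""
    h = H / n
    z_prev = list(y)
    z_curr = [yi + h * fi for yi, fi in zip(y, f(y))]
    for _ in range(n - 1):
        z_next = [zp + 2.0 * h * fi for zp, fi in zip(z_prev, f(z_curr))]
        z_prev, z_curr = z_curr, z_next
    # Gragg's final smoothing step
    return [0.5 * (zc + zp + h * fi)
            for zc, zp, fi in zip(z_curr, z_prev, f(z_curr))]

def extrapolated_step(f, y, H, n):
    """One Richardson step: the error series contains only even powers of h."""
    coarse = modified_midpoint(f, y, H, n)
    fine = modified_midpoint(f, y, H, 2 * n)
    return [(4.0 * b - a) / 3.0 for a, b in zip(coarse, fine)]

def oscillator(y):
    """Toy problem (assumption): x'' = -x written as a first-order system."""
    x, v = y
    return [v, -x]

y0 = [1.0, 0.0]
H = 1.0
exact = [math.cos(H), -math.sin(H)]

def max_error(y):
    return max(abs(a - b) for a, b in zip(y, exact))

err_coarse = max_error(modified_midpoint(oscillator, y0, H, 8))
err_fine = max_error(modified_midpoint(oscillator, y0, H, 16))
err_extrap = max_error(extrapolated_step(oscillator, y0, H, 8))
print(err_coarse, err_fine, err_extrap)
```

Combining the two second-order results cancels the leading error term, so the extrapolated value is more accurate than either input at the cost of the two evaluations already made.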
Finally, it is necessary to stress that the codes discussed here are stand-alone few-body codes, requiring additional programming when implementing them for large N-body systems.¹
References
Aarseth S. J., 2003, Gravitational N-Body Simulations. Cambridge University Press, Cambridge
Bulirsch R., Stoer J., 1966, Num. Math., 8, 1
Gragg W. B., 1964, PhD thesis, University of California, Los Angeles
Gragg W. B., 1965, SIAM J. Numer. Anal., 2, 384
Heggie D. C., 1974, Celes. Mech., 10, 217
Kustaanheimo P., Stiefel E., 1965, J. Reine Angew. Math., 218, 204
Levi-Civita T., 1920, Acta Math., 42, 99
Mikkola S., Aarseth S. J., 1993, Celes. Mech. Dyn. Astron., 57, 439
Mikkola S., Aarseth S., 2002, Celes. Mech. Dyn. Astron., 84, 343
Mikkola S., Merritt D., 2006, MNRAS, 372, 219
Mikkola S., Merritt D., 2008, AJ, 135, 2398
Mikkola S., Tanikawa K., 1999a, MNRAS, 310, 745
Mikkola S., Tanikawa K., 1999b, Celes. Mech. Dyn. Astron., 74, 287
Press W. H., Flannery B. P., Teukolsky S. A., Vetterling W. T., 1986, Numerical Recipes. Cambridge University Press, Cambridge
Preto M., Tremaine S., 1999, AJ, 118, 2532
Soffel M. H., 1989, Relativity in Astrometry, Celestial Mechanics and Geodesy. Springer, Berlin, p. 141
Stiefel E. L., Scheifele G., 1971, Linear and Regular Celestial Mechanics. Springer, Berlin
Yoshida H., 1990, Phys. Lett. A, 150, 262
Zare K., 1974, Celes. Mech., 10, 207
¹ Some source codes can be found at http://www.cambody.org/codes.php
3
Resonance, Chaos and Stability: The Three-Body Problem in Astrophysics
Rosemary A. Mardling
School of Mathematical Sciences, Monash University, Victoria 3800, Australia
mardling@sci.monash.edu.au
3.1 Introduction
In his Oppenheimer lecture entitled “Gravity is cool, or, why our universe is as hospitable as it is”, Freeman Dyson discusses how time has two faces: the quick violent face and the slow gentle face, the face of the destroyer and the face of the preserver (Dyson 2000). He entirely attributes these two faces to gravity, and to the ease with which gravitational energy can change irreversibly into other forms of energy. The simplest system exhibiting these two faces is that of three gravitating bodies: for most configurations the slow gentle face is the norm, while for a very important subset violence is the order of the day. In fact, it is this violence, resulting in one of the bodies being ejected from the system, which is responsible for much of the structure we see in the universe, from planets to giant elliptical galaxies.
The simplest example of a quiescent gravitating system is that of two bodies orbiting each other at a distance large enough that their potentials are essentially those of point masses. Their paths about the common centre of mass are simple ellipses, and these paths do not change from orbit to orbit: their shapes (eccentricities) are preserved, as are their sizes (semi-major axes) and orientations in space (inclination and longitudes of periastron and ascending nodes, measured with respect to some reference set of axes; see Fig. 3.1). However, add one more body to the system and this wealth of symmetry is lost, at least to some extent. In the simplest case, if the binary components have equal mass and the third body orbits the binary in the same plane and is “sufficiently distant”, the original binary will simply rotate about its centre of mass; this is apsidal motion. Its eccentricity and semi-major axis will not be affected, and the third body will orbit the centre of mass of the binary as if the latter were a single body with mass equal to the sum of the component masses. No net energy or angular momentum is exchanged between the inner and outer orbits in this simple case. If the inner binary components have different masses, some angular momentum is exchanged between the orbits, with the result that the eccentricities oscillate about some mean values.
Mardling, R.A.: Resonance, Chaos and Stability: The Three-Body Problem in Astrophysics. Lect. Notes Phys. 760, 59–96 (2008)
DOI 10.1007/978-1-4020-8431-7_3 © Springer-Verlag Berlin Heidelberg 2008
[Diagram: orbit relative to axes i, j, k, with labels I, Ω, ω, f, line of nodes and pericentre.]
Fig. 3.1 Orbital elements specifying the orientation and phase of a binary relative to a fixed coordinate system. ω is the argument of periastron, Ω is the longitude of the ascending node, I is the orbital inclination and f is the true anomaly, the latter being one of several ways of specifying the orbital phase.
This is most pronounced when one body is much more massive than the other two, as is the case in a planetary system, because very close stable systems can exist.
If the orbit of the third body is out of the plane of the binary, then in addition to apsidal motion both orbits will rock (nutate) up and down, that is, their relative inclination will oscillate about some mean value, and the planes of their orbits will rotate about the direction defined by the total angular momentum of the system (precession).¹ No energy and very little angular momentum is exchanged between the orbits of such a system,² even though the eccentricity of the inner binary may oscillate substantially about some mean value, a phenomenon called the Kozai effect (Kozai 1962).
These variations of the elements generally occur on time-scales much longer than the component orbital periods and are referred to as secular variations. They are characterized by zero energy exchange between the orbits, which manifests itself in the constancy of the semi-major axes of both the inner and the outer orbits.³ In contrast, unstable systems, defined as those for which one body eventually escapes to infinity, must necessarily exchange energy between the orbits in order for this to occur. If one makes a plot in the parameter space of initial conditions associated with secular and unstable behaviour, one finds a very sharp boundary between the two.
I was led to the study of stability in the three-body problem after discovering that the energy exchange process between the tides and the orbit in a close binary system can be chaotic (Mardling 1995a,b). One day Sverre Aarseth was looking at my stability plots and commented that they reminded him of some plots made by Peter Eggleton and Luda Kiseleva for three-body hierarchies (Eggleton & Kiseleva 1995). He wondered whether or not the two problems might be linked. It turns out that they are: much of the analysis presented in this chapter can equally be applied to the binary-tides problem.
¹ Note that apsidal motion is often mistakenly referred to as precession.
² Again, except if the system is a very close planetary-like system.
³ Except for stable resonant systems; see later.
Throughout this chapter I will refer to five intimately related works, submitted or in progress. M1a (Mardling 2008a) and M1b discuss stability in the three-body problem, the former for coplanar systems and the latter for inclined systems; M2 discusses the resonant structure of eccentric planetary systems; M3 (Mardling 2008b) presents a simple formalism for studying the secular evolution of arbitrary triple configurations;⁴ and M4 presents a new formalism for studying strong three-body interactions.
3.2 Resonance in Nature
The most familiar example of resonance in action is a parent pushing a child on a swing. The only way to increase the amplitude of the swing consistently is to push it at its natural frequency. But if you think about it, the “natural frequency” varies depending on the amplitude of the swing: while it is pretty much constant over the range of amplitudes tolerated by most children, for the intrepid child who prefers heights substantially greater than the parent's, one needs to wait considerably longer for her to complete a full swing before she gets her next push. This amplitude dependence of the frequency is a characteristic of non-linear oscillators, of which the pendulum is one example, and we will see that it is fundamental to understanding stability in the three-body problem.
Resonance is responsible for both structure and destruction in Nature, and not just via gravity. It is Nature's way of moving energy around in bulk. For example, molecular structure depends on resonance between internal electronic states; the formation of carbon in stars via the triple-alpha process relies on a resonant reaction between an alpha particle and a very short-lived beryllium nucleus, leading to the formation of an excited state of the carbon nucleus; even the Archimedes spiral of a sunflower relies on resonance for its formation [see Reichl (1992) for a discussion of the golden mean as the “most irrational number”]. But when gravity is involved, resonance plays a role on every astrophysical scale through the dynamics of three-body instability.
3.2.1 Three-Body Processes in Astrophysics
Three-body processes are at the heart of structure on all astrophysical scales, from planet formation via the accumulation of planetesimals to giant elliptical galaxies through the forced collisions of smaller galaxies. Processes occurring in star clusters include binary–single star scattering in the cores of globular clusters, a process largely responsible for the prevention of total core collapse (Aarseth 1971); the formation of X-ray binaries in globular cluster cores through binary–single and binary–binary collisions (Hills 1976); the formation of massive stars, which almost certainly occasionally (if not exclusively) form through collisions induced in small-N systems; the building of intermediate-mass black holes through the so-called Kozai mechanism (Aarseth 2007); the formation of close binaries through the Kozai mechanism (Eggleton & Kiseleva 2001; Fabrycky & Tremaine 2007); the stability or otherwise of planetary systems in star clusters (Spurzem et al. 2006); and hypervelocity stars originating from the galactic centre (Hills 1976). In addition, many objects thought to be binary stars are revealing themselves to be triple or higher-order configurations (Tokovinin et al. 2006); such systems may well be the remnants of even higher-order systems that have decayed since their birth in the natal star cluster (Reipurth & Clarke 2001).
⁴ Some animations of stable and unstable triples may be found at http://users.monash.edu.au/~ro
To understand all these processes, it is necessary to understand how energy and angular momentum move around inside a triple, and under what circumstances a given configuration is stable. The rest of this chapter is devoted to this question through a study of resonance in the three-body problem.
3.3 The Mathematics of Resonance
3.3.1 The Pendulum
Before we discuss resonance, it is necessary to review the mechanics of a pendulum. As we will show, pendulum-like behaviour is fundamental to an understanding of the three-body problem.
The equation governing the motion of a pendulum of length l in a uniform gravitational field g is
$$\ddot{\phi} + \omega_0^2 \sin\phi = 0, \tag{3.1}$$
where $\omega_0^2 = g/l$. Clearly, for max(φ) ≪ 1, (3.1) reduces to the equation for simple harmonic motion with natural frequency $\omega_0$. We will refer to $\omega_0$ as the small angle frequency, and to the associated libration period as the small angle libration period. Figure 3.2(a) plots φ against time, the latter measured in units of small angle libration periods, for φ(0) = 0 and various values of $\dot{\phi}(0)$, while Fig. 3.2(b) plots solutions in phase space, that is, $\dot{\phi}$ against φ. Solutions that oscillate between fixed values of φ < π are referred to as libratory, and those for which φ is unbounded are called circulatory. These two kinds of motion are separated in phase space by the separatrix, the two branches of which are indicated by the dashed curves in each panel. Clearly, the libration period increases from $2\pi/\omega_0$ for small maximum φ ≡ $\phi_m$ to infinity for $\phi_m = \pi$. Note in particular the so-called hyperbolic fixed points on the separatrix, $(\phi, \dot{\phi}) = (\pm\pi, 0)$ in panel (b); these play a vital role in unstable triples, as we will demonstrate.
Fig. 3.2 Libration versus circulation of a pendulum. Corresponding curves in (a) and (b) have the same colour. The dashed curves correspond to the separatrix: after starting at φ(0) = 0, the system takes an infinite amount of time to reach the unstable equilibrium points $(\phi, \dot{\phi}) = (\pm\pi, 0)$ (also known as hyperbolic fixed points).
Equation (3.1) has an integral of the motion, which we refer to as the pendulum energy:
$$E = \tfrac{1}{2}\dot{\phi}^2 - \omega_0^2(\cos\phi + 1), \tag{3.2}$$
where we have chosen the zero of E to correspond to the separatrix, that is, the curve which passes through $(\phi, \dot{\phi}) = (\pi, 0)$. The equation for the separatrix is therefore
$$\dot{\phi} = \pm 2\omega_0 \cos(\phi/2). \tag{3.3}$$
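For systems with E < 0, the libration period $T_{\rm lib}$ is given by
$$T_{\rm lib} = \int_0^{T_{\rm lib}} dt = 4\int_0^{\phi_m} \frac{d\phi}{\dot{\phi}} = \frac{2\sqrt{2}}{\omega_0}\int_0^{\phi_m} \frac{d\phi}{\sqrt{\cos\phi - \cos\phi_m}}, \tag{3.4}$$
where again $\phi_m$ is the maximum value of φ, therefore corresponding to $\dot{\phi} = 0$. Note that for $\phi_m \ll 1$, $T_{\rm lib} \simeq 2\pi/\omega_0$.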
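The integrand in (3.4) is singular at the upper limit, so direct quadrature is awkward. A minimal sketch of one way around this is given below. It relies on the standard identity $T_{\rm lib} = (4/\omega_0)\,K(\sin(\phi_m/2))$, where K is the complete elliptic integral of the first kind, and evaluates K with the arithmetic-geometric mean. The identity and the function names are additions for illustration and do not appear in the text.

```python
import math

def ellip_k(k):
    """Complete elliptic integral of the first kind K(k) (modulus k), via the AGM."""
    a, b = 1.0, math.sqrt(max(0.0, 1.0 - k * k))
    for _ in range(40):                      # the AGM converges quadratically
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def libration_period(phi_m, omega0=1.0):
    """T_lib of (3.4), evaluated as (4/omega0) * K(sin(phi_m/2))."""
    return 4.0 / omega0 * ellip_k(math.sin(0.5 * phi_m))

small_angle_period = 2.0 * math.pi
for phi_m in (0.001, 1.0, math.pi / 2.0, 3.0, 3.14):
    print(phi_m, libration_period(phi_m) / small_angle_period)
```

The printed ratios start at 1 for small amplitudes and grow without bound as the amplitude approaches π, which is the amplitude dependence described above.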
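For systems with E > 0, the circulation period $T_{\rm circ}$ is given by
$$T_{\rm circ} = 2\int_0^{\pi} \frac{d\phi}{\dot{\phi}} = 2\int_0^{\pi} \frac{d\phi}{\sqrt{\dot{\phi}_0^2 + 2\omega_0^2(\cos\phi - 1)}}, \tag{3.5}$$
where $\dot{\phi}_0$ is the value of $\dot{\phi}$ corresponding to φ = 0. Note that for $\dot{\phi}_0 \gg 2\omega_0$, $T_{\rm circ} \simeq 2\pi/\dot{\phi}_0$.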
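For circulation the integrand of (3.5) is smooth as long as $\dot{\phi}_0 > 2\omega_0$, so a simple quadrature rule is enough. The following sketch uses the midpoint rule; the function name, the number of panels and the sample values are arbitrary choices made for illustration.

```python
import math

def circulation_period(phidot0, omega0=1.0, n=20000):
    """T_circ of (3.5) by the midpoint rule; needs phidot0 > 2*omega0."""
    if phidot0 <= 2.0 * omega0:
        raise ValueError("circulation requires phidot0 > 2*omega0")
    h = math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * h
        total += 1.0 / math.sqrt(phidot0**2 + 2.0 * omega0**2 * (math.cos(phi) - 1.0))
    return 2.0 * h * total

for phidot0 in (2.01, 2.5, 3.0, 10.0, 100.0):
    # ratio to the fast-circulation limit 2*pi/phidot0
    print(phidot0, circulation_period(phidot0) * phidot0 / (2.0 * math.pi))
```

The ratio tends to 1 for fast circulation and rises steeply as $\dot{\phi}_0$ approaches the separatrix value $2\omega_0$.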
The libration and circulation frequencies, $\omega_{\rm lib} \equiv 2\pi/T_{\rm lib}$ and $\omega_{\rm circ} \equiv 2\pi/T_{\rm circ}$ respectively, are plotted in Fig. 3.3. Note the steep dependence of $\omega_{\rm lib}$ on $\phi_m$ near $\phi_m = \pi$, and of $\omega_{\rm circ}$ on $\dot{\phi}_0$ near the separatrix value $\dot{\phi}_0 = 2\omega_0$. As we will now demonstrate, it is this steep dependence which is responsible for chaos in weakly coupled non-linear systems.
Fig. 3.3 Amplitude dependence of pendulum libration and circulation frequencies. Note the extremely steep dependence of $\omega_{\rm lib}$ on $\phi_m$ near π, one of the secrets to understanding chaos in weakly interacting systems. The dashed curves correspond to (a) the small angle frequency and (b) $\dot{\phi}_0 = 2\omega_0$.
3.3.2 Linear Versus Non-Linear Resonance
Consider a simple undamped spring with natural frequency ω which is forced at the frequency Ω. If φ is the displacement away from equilibrium, then given the initial conditions $\phi(0) = \dot{\phi}(0) = 0$, the solution to the equation of motion
$$\ddot{\phi} + \omega^2\phi = A\sin\Omega t \tag{3.6}$$
is
$$\phi(t) = \frac{A}{\Omega^2 - \omega^2}\left[(\Omega/\omega)\sin\omega t - \sin\Omega t\right] \tag{3.7}$$
when Ω ≠ ω, and
$$\phi(t) = \frac{A}{2\omega^2}\left[\sin\omega t - \omega t\cos\omega t\right] \tag{3.8}$$
when Ω = ω. These two types of solution are plotted in Fig. 3.4(a) and (b), respectively. In the first case, a near-resonant value of Ω = 0.9ω produces the phenomenon called beating, where the frequency of the envelope of the solution is |Ω − ω|. The maximum value attained is approximately (A/ω)/|Ω − ω|. However, when Ω = ω the envelope is given by φ(t) = ±At/(2ω) and the solution grows without bound. This is linear resonance.
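The two closed-form solutions are easy to evaluate directly. The sketch below tabulates (3.7) for Ω = 0.9ω and (3.8) for Ω = ω, then compares the beating peak with the estimate (A/ω)/|Ω − ω| and the resonant growth with the envelope At/(2ω). The values A = 0.1 and ω = 1, the time grid and the function name are illustrative choices.

```python
import math

def forced_spring(t, A, omega, Omega):
    """Solution of (3.6) with phi(0) = phidot(0) = 0, from (3.7) and (3.8)."""
    if Omega == omega:
        return A / (2.0 * omega**2) * (math.sin(omega * t) - omega * t * math.cos(omega * t))
    return A / (Omega**2 - omega**2) * (Omega / omega * math.sin(omega * t) - math.sin(Omega * t))

A, omega = 0.1, 1.0
times = [0.01 * i for i in range(20000)]          # t from 0 to 200

# Near resonance: beating with a bounded peak
beat_peak = max(abs(forced_spring(t, A, omega, 0.9 * omega)) for t in times)
beat_estimate = (A / omega) / abs(0.9 * omega - omega)

# Exact resonance: the envelope A*t/(2*omega) grows without bound
resonant_peak = max(abs(forced_spring(t, A, omega, omega)) for t in times)
envelope_at_end = A * times[-1] / (2.0 * omega)

print(beat_peak, beat_estimate)
print(resonant_peak, envelope_at_end)
```

The beating peak stays just below the estimate, while the resonant solution tracks its linearly growing envelope for as long as the integration is continued.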
Unlike a simple spring, whose natural oscillation frequency is independent of the amplitude, the libration frequency of a pendulum is amplitude-dependent, except when the libration angle is small. Consider a pendulum which is forced at a constant frequency Ω, and let its small angle frequency be $\omega_0$. Its equation of motion is almost identical to (3.6), except that φ is replaced by sin φ:
$$\ddot{\phi} + \omega_0^2\sin\phi = A\sin\Omega t. \tag{3.9}$$
Fig. 3.4 Forced linear spring vs forced pendulum. Linear spring: (a) beating with Ω ≲ ω and (b) linear resonance with Ω = ω. Pendulum: (c) and (d). Both solutions exhibit beating, but the system which is forced with a frequency less than the small-angle frequency attains a larger amplitude because, as the amplitude increases, the libration frequency decreases, moving it closer to the forcing frequency. In contrast, system (d) moves away from the forcing frequency from the start and therefore does not attain as large an amplitude. For all four systems A = 0.1 and $\phi(0) = \dot{\phi}(0) = 0$.
Now there is no closed-form solution; in fact, this differential equation admits chaotic solutions. In order to understand how such solutions arise (and ultimately to understand why the three-body problem admits chaotic solutions), consider solutions to (3.9) with the same initial conditions as for the forced spring; these are shown in Fig. 3.4(c) and (d). Both solutions exhibit beating, but the system which is forced with a frequency less than the small angle frequency attains a larger amplitude because, as the amplitude increases, the libration frequency decreases, moving it closer to the forcing frequency (see Fig. 3.3). In contrast, system (d) moves away from the forcing frequency from the beginning.
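The asymmetry between forcing below and above the small angle frequency can be reproduced numerically. The sketch below integrates (3.9) from rest with a classical fourth-order Runge–Kutta scheme and records the largest |φ| reached. It assumes $\omega_0 = 1$, A = 0.1 and forcing frequencies Ω = 0.9 and Ω = 1.1; the value above resonance, the step size and the integration time are illustrative guesses, not parameters quoted from Fig. 3.4.

```python
import math

def max_amplitude(A, Omega, omega0=1.0, t_end=400.0, dt=0.01):
    """Integrate (3.9) from rest with RK4 and return the largest |phi| reached."""
    def deriv(t, phi, v):
        return v, -omega0**2 * math.sin(phi) + A * math.sin(Omega * t)

    t, phi, v = 0.0, 0.0, 0.0
    peak = 0.0
    for _ in range(int(round(t_end / dt))):
        k1p, k1v = deriv(t, phi, v)
        k2p, k2v = deriv(t + 0.5 * dt, phi + 0.5 * dt * k1p, v + 0.5 * dt * k1v)
        k3p, k3v = deriv(t + 0.5 * dt, phi + 0.5 * dt * k2p, v + 0.5 * dt * k2v)
        k4p, k4v = deriv(t + dt, phi + dt * k3p, v + dt * k3v)
        phi += dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += dt
        peak = max(peak, abs(phi))
    return peak

amp_below = max_amplitude(A=0.1, Omega=0.9)   # forced below the small angle frequency
amp_above = max_amplitude(A=0.1, Omega=1.1)   # forced above it (assumed value)
print(amp_below, amp_above)
```

For a linear spring both detunings would give nearly the same peak; for the pendulum the softening of the libration frequency with amplitude favours the case forced from below.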
What happens if A is increased in (3.9)? While doing this merely scales the amplitude for a linear spring, the response is quite different for a forced pendulum because the response frequency actually depends on the amplitude. Figure 3.5 shows solutions for various values of $\bar{A} \equiv A/\omega_0^2$ for $\phi(0) = \dot{\phi}(0) = 0$ and $\Omega = 0.9\,\omega_0$. In (a), $\bar{A} = 0.3$; the motion remains libratory over this time interval (E < 0), but the amplitude comes close to π (maximum 2.6). In (b), $\bar{A} = 1.0$ and the stronger forcing allows the system to be completely circulatory, with E > 0 at all times shown. Panels (c) and (d) exhibit sensitivity to initial conditions, a diagnostic of chaos, even though their values for $\bar{A}$ are only slightly different to those in (a) and (b). This is demonstrated by plotting trajectories with the same initial conditions except for the initial values for φ, which differ by $10^{-6}$. Note that for longer integration times, (a) and (b) also display similar sensitivity to initial conditions, including a mixture of libration and circulation.
Fig. 3.5 Strong forcing of a pendulum. All systems have $\Omega = 0.9\,\omega_0$ and $\phi(0) = \dot{\phi}(0) = 0$, except for the dashed curves, for which $\phi(0) = 10^{-6}$. (a) $\bar{A} \equiv A/\omega_0^2 = 0.3$: libration. Here the pendulum frequency drops further below the forcing frequency and beating is less pronounced. Note especially that the amplitude gets dangerously close to π, that is, the separatrix. (b) $\bar{A} = 1.0$: circulation. Safely past the separatrix, the system is sufficiently forced to simply circulate. (c) $\bar{A} = 0.305$ and (d) $\bar{A} = 1.05$: chaos. The system is forced sufficiently strongly to show a mixture of libration and circulation. The dashed curves illustrate the sensitivity of chaotic systems to initial conditions. In fact, both (a) and (b) are also chaotic, but these systems do not come sufficiently close to the separatrix during this time interval. Note that the values of $\bar{A}$ in (c) and (d) are only slightly different to those in (a) and (b) respectively, suggesting that the time at which obvious divergence of nearby trajectories takes place is statistical. Note also that different scales have been used for each panel.
3.3.3 The Butterfly Effect Explained
When a system is near the separatrix, a small difference in φ can correspond to at least an order of magnitude difference in the pendulum frequency $\omega_{\rm lib}$ or $\omega_{\rm circ}$ (see Fig. 3.3). Since the libration amplitude depends sensitively on the current value of $\omega_{\rm lib}$ relative to the forcing frequency [for example, compare Fig. 3.4(c) and (d)], such differences can eventually lead to a significant divergence of initially nearby solutions, as long as the system is not periodic or quasi-periodic (see below).⁵ A system that is sufficiently strongly forced may even cross the separatrix and begin to circulate; this almost never happens at the same time as a neighbouring trajectory because of the differences in their pendulum frequencies at the time. The situation is indicated by arrows in Fig. 3.5(c) and (d). This behaviour is the essence of chaos in weakly interacting systems.
Let us consider the situation more closely. Given the values of φ and $\dot{\phi}$ at any time t, one can define the instantaneous (or osculating) pendulum frequency ω to be such that
$$\omega(t) = \begin{cases} \omega_{\rm lib}, & E < 0, \\ -\omega_{\rm circ}, & E > 0, \end{cases} \tag{3.10}$$
where again $\omega_{\rm lib} = 2\pi/T_{\rm lib}$ and $\omega_{\rm circ} = 2\pi/T_{\rm circ}$, with $T_{\rm lib}$ and $T_{\rm circ}$ defined in (3.4) and (3.5). These latter quantities depend on knowing $\phi_m$ and $\dot{\phi}_0$, that is, respectively, φ at $\dot{\phi} = 0$ for a librating system and $\dot{\phi}$ at φ = 0 for a circulating system. The instantaneous values of these can be defined via the pendulum energy E (which is now not conserved). Thus, from (3.2),
$$\tfrac{1}{2}\dot{\phi}^2 - \omega_0^2(1 + \cos\phi) = -\omega_0^2(1 + \cos\phi_m) \tag{3.11}$$
and
$$\tfrac{1}{2}\dot{\phi}^2 - \omega_0^2(1 + \cos\phi) = \tfrac{1}{2}\dot{\phi}_0^2 - 2\omega_0^2. \tag{3.12}$$
Note that defining the pendulum frequency to be negative when E > 0 simply ensures that dω/dt is continuous through ω = 0, that is, for the purpose of graphical representation there is a smooth transition from libration to circulation. More importantly, it allows for a meaningful measure of the “distance” between neighbouring trajectories (see discussion below).
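The definition of the osculating frequency can be turned into a short routine. The sketch below computes E from the pendulum energy (3.2), recovers $\sin^2(\phi_m/2) = 1 + E/(2\omega_0^2)$ or $\dot{\phi}_0^2 = 2E + 4\omega_0^2$ as appropriate, and evaluates the periods through the complete elliptic integral K using $T_{\rm lib} = (4/\omega_0)K(\sin(\phi_m/2))$ and $T_{\rm circ} = (4/\dot{\phi}_0)K(2\omega_0/\dot{\phi}_0)$. These closed forms are standard identities introduced here for convenience, and the function names are hypothetical.

```python
import math

def ellip_k(k):
    """Complete elliptic integral of the first kind K(k) (modulus k), via the AGM."""
    a, b = 1.0, math.sqrt(max(0.0, 1.0 - k * k))
    for _ in range(40):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def pendulum_energy(phi, phidot, omega0=1.0):
    """Pendulum energy (3.2); zero on the separatrix."""
    return 0.5 * phidot**2 - omega0**2 * (math.cos(phi) + 1.0)

def osculating_frequency(phi, phidot, omega0=1.0):
    """Signed pendulum frequency (3.10): +omega_lib if E < 0, -omega_circ if E > 0."""
    E = pendulum_energy(phi, phidot, omega0)
    if E < 0.0:
        # libration: k = sin(phi_m/2), with sin^2(phi_m/2) = 1 + E/(2 omega0^2)
        k = math.sqrt(max(0.0, 1.0 + E / (2.0 * omega0**2)))
        return math.pi * omega0 / (2.0 * ellip_k(k))
    # circulation: phidot0^2 = 2E + 4 omega0^2 and k = 2 omega0 / phidot0
    phidot0 = math.sqrt(2.0 * E + 4.0 * omega0**2)
    return -math.pi * phidot0 / (2.0 * ellip_k(2.0 * omega0 / phidot0))

for state in ((0.001, 0.0), (2.0, 0.0), (3.1, 0.0), (0.0, 2.05), (0.0, 3.0)):
    print(state, osculating_frequency(*state))
```

The magnitude of the frequency shrinks towards zero on both sides of the separatrix, so the signed quantity passes smoothly through zero as a trajectory changes from libration to circulation.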
Figure 3.6(b) plots ω(t) for the stable case shown in panel (a) of the same figure, for which A = 0.1, $\Omega = 0.9\,\omega_0$. The pendulum frequency is clearly periodic, with minima corresponding to maximum forcing (notice in (a) how the response “stretches” at maximum amplitude; this is seen in more detail in Fig. 3.4(c)). Panel (c) plots the logarithm of the difference between the pendulum frequencies $\omega_1$ and $\omega_2$ of two initially close systems, for which the difference in φ(0) is again $10^{-6} \equiv \varepsilon$. The difference remains of the order of or less than ε for the time shown here, and for longer times grows linearly before turning over when $|\omega_1 - \omega_2| \simeq 0.01$. This behaviour is common to quasi-periodic (and periodic) systems, for which accumulation of differences in ω is limited to how out of phase the two systems become.
Fig. 3.6 Exponential divergence of chaotic trajectories. Panel (a) shows the evolution of φ (in units of π) for two initially close trajectories ($\delta\phi(0) = 10^{-6}$) for A = 0.1 and $\Omega/\omega_0 = 0.9$. No unstable behaviour is indicated, and this is supported by panel (c), which plots the logarithm of the difference in the pendulum frequencies. Panel (b) shows the evolution of the pendulum frequency ω(t) (see (3.10)) for the system with φ(0) = 0. Points are plotted only when the forcing is zero, that is, when the pendulum is “free”. Since φ is quasi-periodic (in fact, for this example it is actually periodic because $\omega_0$ and Ω are commensurate), the pendulum frequencies come in and out of step over time and their differences therefore never build up. Panels (d), (e) and (f) show the evolution of these quantities for the chaotic system A = 0.2 and $\Omega/\omega_0 = 0.9$. The initially close trajectories diverge strongly around t/2π = 30, even though the system appears to be stable before then. However, it is clearly not even quasi-periodic, and panel (f) reveals that the trajectories are in fact exponentially diverging, because |φ| comes close enough to π for $\omega_1$ to be significantly different to $\omega_2$ at those times. In particular, notice how individual peaks in panel (f) correspond to minimum values of |ω(t)|. The forcing is strong enough to allow the system to cross the separatrix and occasionally circulate. Since φ is not periodic, differences in ω accumulate and remain O(|ω|).
⁵ A system is N-fold quasi-periodic if it can be represented as the product of N Fourier series with associated frequencies $\omega_i$, i = 1, ..., N, such that the $\omega_i$ are not commensurate. If the $\omega_i$ are commensurate, the system is periodic.
In contrast, the right-hand panels (d), (e) and (f) show φ(t), ω(t) and $\log|\omega_1 - \omega_2|$ for the chaotic system A = 0.2 and $\Omega = 0.9\,\omega_0$. Unlike the stable system, this one is not periodic or quasi-periodic, and the consequence is that differences in ω do accumulate. These differences are a maximum when |ω(t)| is a minimum, because of its steep dependence on $\phi_m$ as $\phi_m \to \pi$, and this can be seen if one compares panels (e) and (f). Eventually $|\omega_1 - \omega_2| = O(|\omega|)$ when one of the systems is sufficiently forced to start circulating. Note that system 1 first circulates at t/2π ≃ 84.
The slope of the curve in panel (f) indicates the time-scale τ on which exponential trajectory divergence takes place. This is normally associated with the largest Lyapunov exponent λ, which is related to τ such that λ ∼ 1/τ.
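A rough numerical counterpart of this comparison is sketched below. It integrates (3.9) from rest for two initial values of φ differing by $10^{-6}$ and records the largest separation in φ, once for A = 0.1 and once for A = 0.2, with $\omega_0 = 1$ and Ω = 0.9. Measuring the separation in φ rather than in the pendulum frequency, the RK4 scheme, the step size and the run length are all simplifying assumptions; the contrast between the two cases is the one reported in the text rather than an independent result.

```python
import math

def trajectory(A, phi_start, Omega=0.9, omega0=1.0, periods=120, dt=0.02):
    """RK4 integration of (3.9) starting at rest; returns phi at every step."""
    def deriv(t, phi, v):
        return v, -omega0**2 * math.sin(phi) + A * math.sin(Omega * t)

    steps = int(round(periods * 2.0 * math.pi / dt))
    t, phi, v = 0.0, phi_start, 0.0
    history = [phi]
    for _ in range(steps):
        k1p, k1v = deriv(t, phi, v)
        k2p, k2v = deriv(t + 0.5 * dt, phi + 0.5 * dt * k1p, v + 0.5 * dt * k1v)
        k3p, k3v = deriv(t + 0.5 * dt, phi + 0.5 * dt * k2p, v + 0.5 * dt * k2v)
        k4p, k4v = deriv(t + dt, phi + dt * k3p, v + dt * k3v)
        phi += dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += dt
        history.append(phi)
    return history

def max_separation(A, eps=1.0e-6):
    """Largest |phi_1 - phi_2| for two runs whose initial phi differ by eps."""
    first = trajectory(A, 0.0)
    second = trajectory(A, eps)
    return max(abs(x - y) for x, y in zip(first, second))

sep_regular = max_separation(0.1)   # case described as stable in the text
sep_chaotic = max_separation(0.2)   # case described as chaotic in the text
print(sep_regular, sep_chaotic)
```

Fitting a straight line to the logarithm of the separation over the interval of exponential growth would give an estimate of 1/τ, and hence of λ.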
The following questions arise: how strong does the forcing have to be (how large should A be), and/or how close should the forcing frequency Ω be to $\omega_0$, in order that the system is not exclusively libratory? Are all systems which do not circulate quasi-periodic or periodic (i.e., do all chaotic systems involve circulation)? These and other related questions have been studied extensively in the context of conservative Hamiltonian systems, of which the general three-body problem is an example. In fact, the three-body problem (or simplified versions of it) motivated Poincaré to invent the modern theory of dynamical systems and chaos (Barrow-Green 1997), and led to the famous Kolmogorov–Arnol'd–Moser or KAM theory of weakly interacting Hamiltonian systems (see below).
3.3.4 Pendulums, the Three-Body Problem and Resonance Overlap
The previous examples demonstrate how springs and pendulums respond to fixed forcing. How are these related to the three-body problem? Most three-body configurations can be regarded as being composed of an “inner binary” and an “outer binary”, the latter being composed of the inner binary and the third body; this is referred to as a three-body hierarchy (see Fig. 3.8). When a system is stable (or at least close to stable), these two binaries constitute a weakly interacting conservative system, with each binary forcing the other.
Figure 3.7 shows the evolution of the semi-major axis $a_i$ of the inner binary of (a) a stable triple and (b) an unstable triple. The behaviour of the stable system is very similar to the forced pendulum in Figs. 3.4(c) and 3.6(a); here the forcing is provided by the third body, with outer periastron passage occurring at 0.5 phase. The chaotic system in (b) is reminiscent of Fig. 3.6(d), in this case with a mixture of oscillation between two fixed values (“libration”) and approximately steady increase or decrease (“circulation”) of $a_i$. In fact, the inner and outer orbits exchange energy via an interaction potential or disturbing function, which can be written as an infinite series of resonance angles, each a linear combination of all the angles in the system, and each obeying a forced pendulum equation. The forcing of each individual “pendulum” is provided by all the other “pendulums”, and when the system is stable, the forcing is negligible (in fact exponentially small). For almost all stable systems the pendulum motions are circulatory with exponentially small amplitudes; however, some stable systems exist in a resonant state, in which case one resonance angle librates.⁶ In order for stability to be maintained, the forcing of such an angle must remain small in the sense discussed in the previous section. When the forcing is such that the pendulum libration amplitude (i.e., that of the single resonance angle that is librating) comes close to π, the system is unstable, again in the same sense as discussed in the previous section. However, here the forcing is provided by another “pendulum” with almost the same frequency, i.e., by another resonance angle. In order for the forcing to be sufficiently strong, it turns out that such a resonance angle (in general) must also be librating, and we have the situation where the system exists in two “neighbouring” resonant states; this is referred to as resonance overlap. Thus the diagnostic for instability is simply that two neighbouring resonances be librating; this is the resonance overlap stability criterion.
Fig. 3.7 Evolution of the semi-major axis $a_i$ of the inner binary of a stable triple (a) and an unstable triple (b). The initial conditions are such that for both (a) and (b), the ratio of the outer periastron distance to the inner semi-major axis is 3.6 and the inner binary is circular, while the outer eccentricity is 0.3 and 0.5 for (a) and (b) respectively. In (b) we also show the evolution of an almost identical configuration for which the initial inner eccentricities differ by $10^{-6}$.
⁶ In fact, the stable resonant state actually consists of a superposition of resonance angles (M2), but this is usually only important for extreme mass-ratio systems that have stable low-order resonances.
The reader is referred to the original paper by Walker & Ford (1969), in which this idea is discussed in a clear and straightforward way, while Chirikov (1979) provides a deeper and more extensive analysis. The concept of resonance in weakly interacting conservative systems originates in a theorem proposed and partially proved by Kolmogorov (1954), itself inspired by the work of Poincaré (1993). This theorem was fully proved by Arnol'd (1963) and independently by Moser (1962). The three papers constitute the famous Kolmogorov–Arnol'd–Moser or KAM theorem, which would provide a proof that “stable” triple systems are formally stable for all time, were it not for the fact that one of the assumptions made in the proof of the theorem is violated. The aim of the KAM theorem is to show that if one perturbs an integrable Hamiltonian system sufficiently weakly,⁷ then some of the KAM tori on which solutions were originally quasi-periodic will be only slightly distorted, and quasi-periodicity will be preserved. Although it is not a conservative Hamiltonian system, we see this behaviour in going from the forced spring in Fig. 3.4(a) to the forced pendulum in panel (c) of the same figure; a pendulum can be regarded as a linear spring with a non-linear perturbation. However, if the perturbation is too strong, quasi-periodicity is lost and the motion becomes unpredictable. If the KAM theorem applied to the three-body problem, it would prove that a large subset of configurations exists whose members remain stable for all time (because they are stuck on KAM tori). But the catch is that one requires the characteristic frequencies of the decoupled system to be non-commensurate, and this is not the case because the apsidal motion and precession frequencies are equal (in fact, equal to zero).
So a formal proof of the ultimate stability of general three-body configurations remains elusive, although it can be proved in some restricted cases, for example, when the eccentricities and inclinations are small so that the secular theory of Laplace applies and can be used as the underlying “unperturbed” system; see Arnol'd (1978), p. 414. We must therefore (at least for now) be content with our observation that apparently stable systems seem to mimic quasi-periodic systems for which the KAM theorem does apply, and proceed to use the tools of the theorem (in particular the resonance overlap stability criterion) to predict, albeit approximately, the boundary between stable and unstable behaviour.
⁷ An integrable Hamiltonian system that is a function of N coordinate and N momentum variables is one which has N integrals of the motion. For such systems, one can then find a coordinate transformation such that the new momenta are the integrals themselves and the new coordinates $q_i$, i = 1, ..., N, are linear functions of time, $q_i(t) = \omega_i t + C_i$, where the $\omega_i$ are the characteristic frequencies of the system and the $C_i$ are constants. If the $\omega_i$ are not commensurate, that is, there exist no integers $k_i$ such that $\sum_i k_i\omega_i = 0$, the solutions are restricted to and densely cover so-called KAM tori and the motion is quasi-periodic. If the $\omega_i$ are commensurate, the motion is periodic.
3.4 The Three-Body Problem
The three-body problem is famously easy to formulate and impossible to solve, at least analytically. Newton is said to have suffered from sleeplessness and headaches trying to find closed-form solutions after having had such an easy time with the two-body problem. After many attempts by the best mathematicians of their time, Poincaré noticed that perturbation techniques unavoidably involved singularities associated with resonances, and concluded that the three-body problem has solutions that cannot be represented by convergent series.
In order to appreciate fully the dynamics of the three-body problem, we begin by reviewing some aspects of the two-body problem, in particular its integrals of the motion. These express various symmetries inherent in the equations of motion, one (sometimes more) of which survives when a third body is added and the system is stable (the total energy and linear and angular momenta are still conserved).
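3.4.1 Symmetries in the Two-Body Problem
The equations of motion of two bodies with masses $m_1$ and $m_2$ acting under the influence of each other's gravity are
$$ m_1\ddot{\mathbf r}_1 = \frac{G m_1 m_2}{r_{12}^2}\,\hat{\mathbf r}_{12}, \tag{3.13} $$
$$ m_2\ddot{\mathbf r}_2 = -\frac{G m_1 m_2}{r_{12}^2}\,\hat{\mathbf r}_{12}, \tag{3.14} $$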
where $\mathbf r_{12} = \mathbf r_2 - \mathbf r_1$. Equations (3.13) and (3.14) constitute a twelfth-order system of differential equations. However, it has eight independent integrals of the motion and, as is well known, this restricts the motion to a simple curve in space, as we now show. Three of the integrals of motion are the components of the total linear momentum $\mathbf P$, which one obtains by adding (3.13) and (3.14) together and integrating, that is,
$$ m_1\dot{\mathbf r}_1 + m_2\dot{\mathbf r}_2 \equiv \mathbf P. \tag{3.15} $$
Dividing through by the masses, subtracting (3.13) from (3.14) and defining $\mathbf r$ to be the position vector of $m_2$ relative to $m_1$, that is, $\mathbf r \equiv \mathbf r_{12}$, we reduce the system to sixth order:
$$ \ddot{\mathbf r} = -\frac{G m_{12}}{r^2}\,\hat{\mathbf r}, \tag{3.16} $$
where $r = |\mathbf r|$ and $m_{12} = m_1 + m_2$. Taking the cross product of each side with $\mu\mathbf r$ and integrating, we get another three integrals of the motion; these are the components of the total angular momentum $\mathbf J$:
$$ \mu\,\mathbf r \times \dot{\mathbf r} \equiv \mathbf J, \tag{3.17} $$
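where $\mu = m_1 m_2/m_{12}$ is the reduced mass of the system. A seventh integral of the motion is the total energy; this is obtained by taking the dot product of (3.16) with $\mu\dot{\mathbf r}$ and integrating:
$$ \tfrac{1}{2}\mu\,\dot{\mathbf r}\cdot\dot{\mathbf r} - \frac{G m_1 m_2}{r} \equiv E, \tag{3.18} $$
where we have used the chain rule
$$ \frac{d}{dt} = \frac{\partial}{\partial t} + \dot{\mathbf r}\cdot\frac{\partial}{\partial \mathbf r}, \tag{3.19} $$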
with $\partial/\partial\mathbf r \equiv \nabla$. The seven integrals reflect natural symmetries of isolated conservative mechanical systems: the conservation of energy and linear momentum reflect the fact that the equations of motion are independent of the origin of time and space respectively, while the conservation of angular momentum reflects the fact that the solution is independent of the orientation of the system. For all these symmetries, there is no external landmark which could be used to distinguish one system from another under such transformations.
What symmetry does the eighth integral correspond to? It is well known that solutions to (3.13) and (3.14) are conic sections. In particular, these curves are fixed in space, that is, their orientation is invariant, a fact peculiar to the two-body problem (see Goldstein (1980), p. 104, for a discussion of this). This is normally expressed as the invariance of the Runge–Lenz vector (also called the Laplace vector), a vector which points in the direction of periastron and is defined by
$$ \mathbf e = \frac{\dot{\mathbf r} \times (\mathbf r \times \dot{\mathbf r})}{G m_{12}} - \hat{\mathbf r}, \tag{3.20} $$
and whose magnitude is the orbital eccentricity $e$. But this appears to add three extra integrals; in fact, one can show that only one is independent of the other seven (Goldstein 1980).
The two-body problem has six degrees of freedom, and hence one only needs six integrals of the motion in order that the system be completely integrable (in the sense discussed in the footnote on p. 71). The fact that we have eight restricts the motion to closed curves in the frame of reference of the centre of mass of the system. Solution curves are the conic sections (see Goldstein (1980) for a method of solution).
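The integrals (3.17), (3.18) and (3.20) are easy to evaluate numerically for a given relative position and velocity. The following is a minimal sketch, assuming NumPy and units with G = 1; the function name `two_body_integrals` and the sample orbit are illustrative choices, not part of the text.

```python
import numpy as np

def two_body_integrals(r, v, G=1.0, m1=1.0, m2=1.0):
    """Return (J, E, e_vec) for relative position r and relative velocity v."""
    r = np.asarray(r, dtype=float)
    v = np.asarray(v, dtype=float)
    m12 = m1 + m2
    mu = m1 * m2 / m12                        # reduced mass
    r_norm = np.linalg.norm(r)
    J = mu * np.cross(r, v)                   # angular momentum, eq. (3.17)
    E = 0.5 * mu * (v @ v) - G * m1 * m2 / r_norm           # energy, eq. (3.18)
    e_vec = np.cross(v, np.cross(r, v)) / (G * m12) - r / r_norm  # eq. (3.20)
    return J, E, e_vec

# Illustrative orbit (assumed values): a = 1, e = 0.3, started at periastron.
a, ecc, G, m1, m2 = 1.0, 0.3, 1.0, 1.0, 1.0
r_peri = a * (1.0 - ecc)
v_peri = np.sqrt(G * (m1 + m2) * (1.0 + ecc) / (a * (1.0 - ecc)))
J, E, e_vec = two_body_integrals([r_peri, 0.0, 0.0], [0.0, v_peri, 0.0], G, m1, m2)
```

For this state the Runge–Lenz vector points along the x axis (the direction of periastron), its magnitude is the eccentricity, and the energy equals $-Gm_1m_2/2a$.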
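3.4.2 The Three-Body Problem
The equations of motion of three bodies with masses $m_1$, $m_2$ and $m_3$ acting under the influence of each other's gravity are
$$ m_1\ddot{\mathbf r}_1 = \frac{G m_1 m_2}{r_{12}^2}\,\hat{\mathbf r}_{12} + \frac{G m_1 m_3}{r_{13}^2}\,\hat{\mathbf r}_{13}, \tag{3.21} $$
$$ m_2\ddot{\mathbf r}_2 = -\frac{G m_1 m_2}{r_{12}^2}\,\hat{\mathbf r}_{12} + \frac{G m_2 m_3}{r_{23}^2}\,\hat{\mathbf r}_{23}, \tag{3.22} $$
$$ m_3\ddot{\mathbf r}_3 = -\frac{G m_1 m_3}{r_{13}^2}\,\hat{\mathbf r}_{13} - \frac{G m_2 m_3}{r_{23}^2}\,\hat{\mathbf r}_{23}, \tag{3.23} $$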
where the vectors $\mathbf r_i$, $i = 1, 2, 3$, are referred to the centre of mass of the system (see Fig. 3.8) and $\mathbf r_{ij} = \mathbf r_j - \mathbf r_i$ with $r_{ij} = |\mathbf r_{ij}|$. The differential equations (3.21), (3.22) and (3.23) constitute an 18th-order system. While it again yields the seven integrals of total energy, linear momentum and angular momentum, there is no analogue of the Runge–Lenz integral. Thus we are two integrals short of a totally integrable system. This fact results in the possibility of the system admitting chaotic solutions, that is, solutions that are exquisitely sensitive to the initial conditions and are hence unpredictable. In fact, for some systems with negative total energy, it allows for infinite separation of one body from the other pair. These are systems referred to as Lagrange unstable, which in general do not rely on the close approach of two of the bodies (such systems are referred to as Hill unstable).
We thus ask the general question: given a particular three-body configuration, how can we determine whether or not it is (Lagrange) stable for all time? As discussed in Sect. 3.3.4, there is no rigorous answer to this question. However, there is no doubt that there exists a sharp (albeit fractal-like) boundary in parameter space between unstable systems, which decay on a relatively short time-scale, and those which appear to remain intact (are stable) indefinitely. It is this boundary that is approximately delineated in this chapter using the so-called resonance overlap criterion, which itself involves identifying internal resonances in the system. In order to do this, we begin by introducing Jacobi or hierarchical coordinates $\mathbf r$ and $\mathbf R$, which together with
Fig. 3.8 Centre of mass coordinates $\mathbf r_i$ and Jacobi coordinates $\mathbf r$ and $\mathbf R$. $C_{12}$ is the centre of mass of bodies 1 and 2, while $C_{123}$ is the centre of mass of the whole system
conservation of linear momentum, replace the centre-of-mass coordinates $\mathbf r_1$, $\mathbf r_2$ and $\mathbf r_3$ (see Fig. 3.8).
3.4.3 Equations of Motion in Jacobi Coordinates
Intuitively, it seems reasonable that three-body configurations are more likely to be stable the further one of the bodies (let us take this to be body 3) is separated from the other two. In fact, a very distant third body will orbit the other two as if they were almost a single body. Thus we can conceive of an "inner binary" composed of bodies 1 and 2, and an "outer binary" composed of bodies (1+2) and body 3. Jacobi coordinates conveniently express this arrangement. Just as for the two-body problem, $\mathbf r$ is defined to be the position vector of $m_2$ relative to $m_1$, that is, $\mathbf r = \mathbf r_2 - \mathbf r_1$, while $\mathbf R$ is the position vector of $m_3$ relative to the centre of mass of $m_1$ and $m_2$. In fact, it turns out that $\mathbf R$ passes through the centre of mass of the system and as such is in the same direction as $\mathbf r_3$, with $\mathbf R = (m_{123}/m_{12})\,\mathbf r_3$ (Fig. 3.8), where $m_{123} = m_1 + m_2 + m_3$. Using these definitions we can reduce the 18th-order system (3.21), (3.22) and (3.23) to the 12th-order system
$$ \mu_i\ddot{\mathbf r} + \frac{G m_1 m_2}{r^2}\,\hat{\mathbf r} = \frac{\partial \mathcal R}{\partial \mathbf r}, \tag{3.24} $$
$$ \mu_o\ddot{\mathbf R} + \frac{G m_{12} m_3}{R^2}\,\hat{\mathbf R} = \frac{\partial \mathcal R}{\partial \mathbf R}, \tag{3.25} $$
where $R = |\mathbf R|$, $\mu_i = m_1 m_2/m_{12}$ and $\mu_o = m_{12} m_3/m_{123}$ are the reduced masses associated with the inner and outer orbits respectively, and
$$ \mathcal R = -\frac{G m_{12} m_3}{R} + \frac{G m_2 m_3}{|\mathbf R - \alpha_1 \mathbf r|} + \frac{G m_1 m_3}{|\mathbf R + \alpha_2 \mathbf r|} \tag{3.26} $$
is the disturbing function (footnote 8), with $\alpha_i = m_i/m_{12}$, $i = 1, 2$. As $r/R \to 0$ and/or $m_3/m_{12} \to 0$, $\mathcal R \to 0$ and the inner and outer orbits decouple. In fact, the disturbing function contains all the information about how the inner and outer orbits exchange energy and angular momentum. Since we are interested in determining which configurations are unstable, that is, which allow the escape to infinity of one of the bodies, and this necessarily (generally) involves a substantial exchange of energy between the orbits, our focus for the rest of this chapter will be on the disturbing function: it contains all the secrets of the three-body problem.
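As an illustration of these definitions, the sketch below converts centre-of-mass positions to Jacobi coordinates and evaluates the disturbing function (3.26) directly. It assumes NumPy and units with G = 1; the function names `jacobi_coordinates` and `disturbing_function` are hypothetical helpers introduced here for illustration.

```python
import numpy as np

def jacobi_coordinates(r1, r2, r3, m1, m2):
    """Return the Jacobi vectors (r, R) from the three position vectors."""
    r1, r2, r3 = (np.asarray(x, dtype=float) for x in (r1, r2, r3))
    r = r2 - r1                                # m2 relative to m1
    c12 = (m1 * r1 + m2 * r2) / (m1 + m2)      # centre of mass of bodies 1 and 2
    R = r3 - c12                               # m3 relative to C12
    return r, R

def disturbing_function(r, R, m1, m2, m3, G=1.0):
    """Evaluate eq. (3.26); the result has units of energy."""
    r = np.asarray(r, dtype=float)
    R = np.asarray(R, dtype=float)
    m12 = m1 + m2
    alpha1, alpha2 = m1 / m12, m2 / m12
    return (-G * m12 * m3 / np.linalg.norm(R)
            + G * m2 * m3 / np.linalg.norm(R - alpha1 * r)
            + G * m1 * m3 / np.linalg.norm(R + alpha2 * r))
```

For equal masses and a collinear configuration with r = 1, doubling R from 10 to 20 reduces the value returned by `disturbing_function` by a factor close to 8, consistent with a leading term proportional to $r^2/R^3$.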
Before we proceed, we need to define the orbital elements of the inner and outer binaries, in terms of which the stability boundary will be expressed. Using
8 Note that, as a quantity introduced to study the restricted three-body problem, the disturbing function has historically been defined to have units of energy per unit mass. Here it has units of energy.
subscripts i and o to denote the inner and outer orbits respectively (footnote 9), these are the semi-major axes $a_i$ and $a_o$; the eccentricities $e_i$ and $e_o$; the orientation angles $\omega_i$, $\Omega_i$, $I_i$ and $\omega_o$, $\Omega_o$, $I_o$, which are respectively the arguments of periastron, the longitudes of the ascending node and the inclinations (see Fig. 3.1); and the phase angles $f_i$, $M_i$, $\lambda_i$, $\epsilon_i$ and $f_o$, $M_o$, $\lambda_o$, $\epsilon_o$, which are respectively the true anomaly, the mean anomaly, the mean longitude and the mean longitude at epoch (Murray & Dermott 2000). Note that longitude angles are measured with respect to a fixed direction (which here we take to be the $\hat{\mathbf i}$ direction in Figs. 3.1 and 3.9); we will use longitudes when we construct the resonance angle in the next section. Thus, rather than $\omega_{i,o}$, we will use the longitudes of periastron, defined to be $\varpi_i = \omega_i + \Omega_i$ and similarly for $\varpi_o$. From Fig. 3.1 we see that for inclined orbits this is a dog-leg angle. The phase angles $f_{i,o}$, $M_{i,o}$ and $\lambda_{i,o} \equiv M_{i,o} + \varpi_{i,o}$ are used to express the angular positions of the bodies in the two-body orbit, the choice of which depends on the application (there are at least another two phase angles in use, the true longitude $\equiv f + \varpi$ and the eccentric anomaly, neither of which we will use here). The mean longitude at epoch is the mean longitude at $t = 0$ (see (3.45)). See Murray & Dermott (2000) for a more detailed discussion of the various orbital elements.
3.4.4 Spherical Harmonic Expansions
Since our aim is to determine which configurations are stable, it is useful to write the disturbing function in terms of the orbital elements of the inner and outer binaries. To do this we somehow need to separate information about the inner orbit from that of the outer orbit. The form of the second and third terms in (3.26) suggests using a Legendre expansion:
$$ \frac{1}{|\mathbf b - \mathbf a|} = \sum_{l=0}^{\infty} \left(\frac{a^l}{b^{l+1}}\right) P_l(\cos\gamma), \tag{3.27} $$
where $b = |\mathbf b|$, $a = |\mathbf a|$ with $a < b$, $P_l(\cos\gamma)$ is a Legendre polynomial of degree $l$, and $\cos\gamma = \hat{\mathbf a}\cdot\hat{\mathbf b}$. However, for us this involves the angle between $\mathbf r$ and $\mathbf R$: information about the two orbits is still "tangled". We can go one step further and use the addition theorem (Jackson 1975), which expresses a Legendre polynomial of degree $l$ in terms of spherical harmonics $Y_{lm}$ whose arguments are the spherical polar coordinate angles of the vectors $\mathbf r$ and $\mathbf R$, both referred to a fixed coordinate system (Fig. 3.9):
$$ P_l(\cos\gamma) = \frac{4\pi}{2l+1} \sum_{m=-l}^{l} Y_{lm}(\theta, \varphi)\, Y^*_{lm}(\Theta, \Psi). \tag{3.28} $$
9 When no subscript is used, the elements refer to any (or either) two-body orbit.
Fig. 3.9 Spherical polar angles associated with $\mathbf r$ ($\theta$, $\varphi$) and $\mathbf R$ ($\Theta$, $\Psi$). The origin corresponds to the centre of mass of $m_1$ and $m_2$, $C_{12}$
Spherical harmonics are defined in terms of associated Legendre functions $P_l^m(\cos\theta)$ and trigonometric functions (see Jackson (1975) for an extensive discussion of their properties):
$$ Y_{lm}(\theta, \varphi) = \sqrt{\frac{2l+1}{4\pi}\,\frac{(l-m)!}{(l+m)!}}\; P_l^m(\cos\theta)\, e^{im\varphi}, \tag{3.29} $$
where the numerical coefficient is chosen so that the spherical harmonics have a particularly simple orthogonality relation:
$$ \int_0^{2\pi}\!\!\int_0^{\pi} Y_{lm}(\theta, \varphi)\, Y^*_{l'm'}(\theta, \varphi)\, \sin\theta\, d\theta\, d\varphi = \delta_{ll'}\,\delta_{mm'}. \tag{3.30} $$
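Spherical harmonics are especially important in quantum mechanics. Combining (3.27) and (3.28), the disturbing function (3.26) becomes
$$ \mathcal R = G\mu_i m_3 \sum_{l=2}^{\infty} \sum_{m=-l}^{l} \left(\frac{4\pi}{2l+1}\right) \mathcal M_l \left(\frac{r^l}{R^{l+1}}\right) Y_{lm}(\theta, \varphi)\, Y^*_{lm}(\Theta, \Psi), \tag{3.31} $$
where
$$ \mathcal M_l = \frac{m_1^{\,l-1} + (-1)^l\, m_2^{\,l-1}}{m_{12}^{\,l-1}}. \tag{3.32} $$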
Notice how the sum over $l$ begins at $l = 2$ and not $l = 0$; this is because the $l = 0$ term is cancelled by the first term in (3.26), while the $l = 1$ term (the dipole term) is zero because $\mathcal M_1 = 0$. Thus the leading term is proportional to $r^2/R^3$, so that $\mathcal R$ provides a perturbation to the inner and outer orbits for small $r/R$. The $l = 2$ contribution is called the quadrupole term, while the $l = 3$ contribution is called the octopole term. Notice also that $\mathcal M_2 = 1$ and that when $m_1 = m_2$, $\mathcal M_l = 0$ for $l$ odd.
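The statements about the mass factor (3.32) and the convergence of the multipole series can be checked numerically. The sketch below, which assumes NumPy, sums the Legendre form of the expansion (that is, (3.31) with the addition theorem (3.28) applied in reverse) and compares it with the closed form (3.26). The names `mass_factor`, `disturbing_multipole` and `disturbing_exact`, and the truncation `lmax`, are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial.legendre import legval

def mass_factor(l, m1, m2):
    """Mass factor M_l of eq. (3.32)."""
    m12 = m1 + m2
    return (m1 ** (l - 1) + (-1) ** l * m2 ** (l - 1)) / m12 ** (l - 1)

def disturbing_multipole(r, R, cos_gamma, m1, m2, m3, lmax=12, G=1.0):
    """Truncated multipole series for the disturbing function, l = 2..lmax."""
    mu_i = m1 * m2 / (m1 + m2)
    total = 0.0
    for l in range(2, lmax + 1):
        coeffs = np.zeros(l + 1)
        coeffs[l] = 1.0                        # selects the single polynomial P_l
        total += mass_factor(l, m1, m2) * r ** l / R ** (l + 1) * legval(cos_gamma, coeffs)
    return G * mu_i * m3 * total

def disturbing_exact(r, R, cos_gamma, m1, m2, m3, G=1.0):
    """Closed form (3.26), written with the angle gamma between r and R."""
    m12 = m1 + m2
    a1, a2 = m1 / m12, m2 / m12
    d23 = np.sqrt(R ** 2 + (a1 * r) ** 2 - 2.0 * a1 * r * R * cos_gamma)
    d13 = np.sqrt(R ** 2 + (a2 * r) ** 2 + 2.0 * a2 * r * R * cos_gamma)
    return -G * m12 * m3 / R + G * m2 * m3 / d23 + G * m1 * m3 / d13
```

For $r/R = 0.2$ a truncation at `lmax=12` already agrees with the closed form to many significant figures, which illustrates how quickly the series converges for well-separated orbits.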
Since the focus of classical treatments of the three-body problem has been the Solar System, in which mass ratios, eccentricities and inclinations are generally small, these elements have been used as expansion parameters. The so-called literal expansion (Murray & Dermott 2000) involves Laplace coefficients, which are functions of the ratio of semimajor axes, and is valid for orbits which cross, an example of which is the Neptune–Pluto pair. Apart from being restricted to small eccentricities and inclinations, it also assumes that one of the participating orbits is not affected by the presence of the third body; this is the restricted three-body problem. The formulation presented here is instead restricted by the condition $r/R < 1$ for the spherical harmonic expansion (3.31) to be valid. Note that it is similar to the (rather tedious to follow) formulation of Kaula (1961); however, the latter is also based on the restricted three-body problem.
Our aim here is to identify internal resonances so that we can apply the resonance overlap criterion and determine stability boundaries. The two most fundamental frequencies in the system are the inner and outer orbital frequencies, $\nu_i$ and $\nu_o$ respectively, and these are the only frequencies present when the orbits are not coupled. For example, recall that the orientation of a two-body orbit remains fixed in space, and this is expressed by the constancy of the Runge–Lenz vector. However, when a third body is introduced this symmetry is broken, and the original orbit rotates in space in a manner similar to a spinning top acting under the applied torque of the Earth. As discussed in the Introduction, the presence of a third body introduces four new frequencies (apsidal advance and precession of the inner and outer orbits), which are usually much slower than the orbital frequencies. Resonances will in general involve linear combinations of all six frequencies. Our next task, then, is to express the disturbing function in terms of six angles associated with these frequencies, and as discussed earlier, these are chosen to be longitudes. The mean longitudes $\lambda_{i,o}$ are associated with $\nu_{i,o}$, while the angles associated with apsidal motion and precession are the longitudes of periastron $\varpi_{i,o}$ and the longitudes of the ascending node $\Omega_{i,o}$ respectively.
For clarity and simplicity, the rest of the chapter will assume coplanar motion; see M1a and M3 for the general analysis involving inclined systems. Taking the plane of the orbits to be the x–y plane, the polar angles are then $\theta = \Theta = \pi/2$, so that from (3.29)
$$ Y_{lm}(\pi/2, \varphi) = \sqrt{\frac{2l+1}{4\pi}\,\frac{(l-m)!}{(l+m)!}}\; P_l^m(0)\, e^{im\varphi} \equiv \sqrt{\frac{2l+1}{4\pi}}\; c_{lm}\, e^{im\varphi}, \tag{3.33} $$
and similarly for $Y_{lm}(\pi/2, \Psi)$. Values for $c_{lm}^2$ for some values of $l$ and $m$ are listed in Table 3.1.
Table 3.1 Spherical harmonic constants

l    m    c_lm^2
2    2    3/8
2    0    1/4
3    3    5/16
3    1    3/16
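The entries of Table 3.1 follow from (3.33), which gives $c_{lm}^2 = \frac{(l-m)!}{(l+m)!}[P_l^m(0)]^2$. A short exact-arithmetic sketch is given below. It relies on the standard closed form $|P_l^m(0)| = (l+m-1)!!/(l-m)!!$ for $l + m$ even (and zero for $l + m$ odd), which is not stated in the text; the function names are illustrative.

```python
from fractions import Fraction
from math import factorial

def double_factorial(k):
    """Return k!!, with the conventions (-1)!! = 0!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def c_squared(l, m):
    """c_lm^2 = (l-m)!/(l+m)! * [P_l^m(0)]^2 as an exact fraction."""
    m = abs(m)                        # the square is the same for +m and -m
    if (l + m) % 2:
        return Fraction(0)            # P_l^m(0) vanishes when l + m is odd
    p0 = Fraction(double_factorial(l + m - 1), double_factorial(l - m))
    return Fraction(factorial(l - m), factorial(l + m)) * p0 ** 2

table = {(l, m): c_squared(l, m) for (l, m) in [(2, 2), (2, 0), (3, 3), (3, 1)]}
```

The dictionary `table` reproduces the four values listed in Table 3.1, and the vanishing of $c_{lm}$ for odd $l + m$ is the reason the sum over $m$ proceeds in steps of two for coplanar systems.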
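Referring to Figs. 3.1 and 3.9, and recalling that we are working in the plane ($I = 0$), we have $\varphi = f_i + \omega_i + \Omega_i = f_i + \varpi_i$ and $\Psi = f_o + \varpi_o$. Substituting these together with (3.33) into (3.31) gives
$$ \mathcal R = G\mu_i m_3 \sum_{l=2}^{\infty} \sum_{m=-l}^{l} c_{lm}^2\, \mathcal M_l \left(r^l e^{imf_i}\right) \left(\frac{e^{-imf_o}}{R^{l+1}}\right) e^{im(\varpi_i - \varpi_o)}, \tag{3.34} $$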
where we have collected together plane polar variables associated with each orbit in the two pairs of large brackets. For uncoupled orbits, these are periodic functions with frequencies $\nu_i$ and $\nu_o$. Since we are interested in weak interaction between the orbits, it makes sense to expand these expressions in Fourier series in these frequencies. Using the familiar two-body expressions
$$ r = \frac{a_i(1 - e_i^2)}{1 + e_i\cos f_i} \quad\text{and}\quad R = \frac{a_o(1 - e_o^2)}{1 + e_o\cos f_o}, \tag{3.35} $$
we have
$$ \left(\frac{r}{a_i}\right)^{l} e^{imf_i} = \sum_{n'=-\infty}^{\infty} s_{n'}^{(lm)}(e_i)\, e^{in'M_i} \tag{3.36} $$
and
$$ \frac{e^{-imf_o}}{(R/a_o)^{l+1}} = \sum_{n=-\infty}^{\infty} F_n^{(lm)}(e_o)\, e^{-inM_o}, \tag{3.37} $$
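where
$$ s_{n'}^{(lm)}(e_i) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \left(\frac{r}{a_i}\right)^{l} e^{imf_i}\, e^{-in'M_i}\, dM_i \tag{3.38} $$
and
$$ F_n^{(lm)}(e_o) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{e^{-imf_o}}{(R/a_o)^{l+1}}\, e^{inM_o}\, dM_o. \tag{3.39} $$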
Note that the mean anomalies are related to the orbital frequencies by
$$ M_i(t) = \nu_i t + M_i(0) \quad\text{and}\quad M_o(t) = \nu_o t + M_o(0). \tag{3.40} $$
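The integrals (3.38) and (3.39) have no closed form in general, but they are straightforward to evaluate by quadrature over the mean anomaly. The following is a minimal sketch assuming NumPy; it solves Kepler's equation by Newton iteration (adequate for moderate eccentricities, a stated limitation of this sketch) and uses a uniform grid in $M$, for which the mean of the integrand approximates the integral divided by $2\pi$. The function names are illustrative.

```python
import numpy as np

def solve_kepler(M, e, iterations=50):
    """Solve M = E - e sin E for the eccentric anomaly E by Newton iteration."""
    E = M + e * np.sin(M)                      # starting guess
    for _ in range(iterations):
        E = E - (E - e * np.sin(E) - M) / (1.0 - e * np.cos(E))
    return E

def _orbit(e, n_points):
    """Uniform grid in mean anomaly with r/a and exp(i f) on that grid."""
    M = -np.pi + 2.0 * np.pi * np.arange(n_points) / n_points
    E = solve_kepler(M, e)
    rho = 1.0 - e * np.cos(E)                                       # r/a
    phase = (np.cos(E) - e + 1j * np.sqrt(1.0 - e * e) * np.sin(E)) / rho  # exp(i f)
    return M, rho, phase

def s_coeff(l, m, n_prime, e_i, n_points=4096):
    """Inner-orbit Fourier coefficient, eq. (3.38)."""
    M, rho, phase = _orbit(e_i, n_points)
    return np.mean(rho ** l * phase ** m * np.exp(-1j * n_prime * M)).real

def F_coeff(l, m, n, e_o, n_points=4096):
    """Outer-orbit Fourier coefficient, eq. (3.39)."""
    M, rho, phase = _orbit(e_o, n_points)
    return np.mean(phase ** (-m) / rho ** (l + 1) * np.exp(1j * n * M)).real
```

For circular orbits the true and mean anomalies coincide, so both coefficients reduce to Kronecker deltas ($n' = m$ and $n = m$ respectively), which gives a convenient sanity check. High outer eccentricities and large $n$ would need a finer grid and a more careful Kepler solver than this sketch provides.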
Fig. 3.10 Fourier coefficients $s_{n'}^{(lm)}(e_i)$ for various values of $l$, $m$ and $n' = 1, 2, \ldots, 10$ ($= n$ in figure). Dashed curves correspond to $n' = m$. The most important coefficient for the stability analysis of similar-mass systems is $s_1^{(22)}(e_i)$ (shown in red (grey); note that it is negative for all values of $e_i$)
The real eccentricity-dependent Fourier coefficients $s_{n'}^{(lm)}(e_i)$ and $f_n^{(lm)}(e_o) = (1 - e_o)^{l+1} F_n^{(lm)}(e_o)$ are plotted in Figs. 3.10 and 3.11 for some values of $l$, $m$, $n$ and $n'$. In Sect. 3.4.7 we present approximations to the functions used in our stability analysis. Substituting (3.36) and (3.37) into the disturbing function (3.34) gives
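$$ \mathcal R = G\mu_i m_3 \sum_{l=2}^{\infty} \sum_{m=-l,\,2}^{l} \sum_{n'=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} c_{lm}^2\, \mathcal M_l \left(\frac{a_i^{\,l}}{a_o^{\,l+1}}\right) s_{n'}^{(lm)}(e_i)\, F_n^{(lm)}(e_o)\, e^{i\phi_{mnn'}} $$
$$ \phantom{\mathcal R} = 2G\mu_i m_3 \sum_{L} \zeta_m\, c_{lm}^2\, \mathcal M_l \left(\frac{a_i^{\,l}}{a_o^{\,l+1}}\right) s_{n'}^{(lm)}(e_i)\, F_n^{(lm)}(e_o)\, \cos(\phi_{mnn'}), \tag{3.41} $$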
where
$$ \phi_{mnn'} = n'M_i - nM_o + m(\varpi_i - \varpi_o) = n'\lambda_i - n\lambda_o + (m - n')\varpi_i - (m - n)\varpi_o \tag{3.42} $$
is called a resonance angle. Here $\zeta_m = 1/2$ if $m = 0$ and is 1 otherwise, and
Fig. 3.11 Fourier coefficients $f_n^{(lm)}(e_o) = (1 - e_o)^{l+1} F_n^{(lm)}(e_o)$ for various values of $l$, $m$ and $n = 2, \ldots, 10$. Dashed curves correspond to $n = m$. The most important coefficients for the stability analysis of similar-mass systems are $f_n^{(22)}$
$$ \sum_{L} \equiv \sum_{l=2}^{\infty} \sum_{m=m_{\min},\,2}^{l} \sum_{n'=-\infty}^{\infty} \sum_{n=-\infty}^{\infty}, \tag{3.43} $$
where the sum over $m$ is in steps of two for coplanar systems (M1a), and $m_{\min} = 0$ or 1 if $l$ is even or odd respectively.
We now have the disturbing function expressed in terms of all the relevant orbital elements, including the four angles $\lambda_i$, $\lambda_o$, $\varpi_i$ and $\varpi_o$, which appear in linear combination in the resonance angle (for coplanar systems the ascending node longitudes do not appear explicitly).
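The two forms of the resonance angle in (3.42) are related by $\lambda = M + \varpi$. The small sketch below writes out both and confirms numerically that they agree; the function names and the sample angles are illustrative assumptions.

```python
import math

def resonance_angle_from_anomalies(m, n, n_prime, M_i, M_o, varpi_i, varpi_o):
    """First form of eq. (3.42), in terms of the mean anomalies."""
    return n_prime * M_i - n * M_o + m * (varpi_i - varpi_o)

def resonance_angle_from_longitudes(m, n, n_prime, lam_i, lam_o, varpi_i, varpi_o):
    """Second form of eq. (3.42), in terms of the mean longitudes."""
    return (n_prime * lam_i - n * lam_o
            + (m - n_prime) * varpi_i - (m - n) * varpi_o)

# Sample angles in radians (assumed values, for illustration only).
M_i, M_o, varpi_i, varpi_o = 0.4, -2.1, 1.3, 0.2
phi_a = resonance_angle_from_anomalies(2, 5, 1, M_i, M_o, varpi_i, varpi_o)
phi_b = resonance_angle_from_longitudes(2, 5, 1, M_i + varpi_i, M_o + varpi_o,
                                        varpi_i, varpi_o)
```

Note that the integer coefficients of the four longitudes in the second form sum to zero, $n' - n + (m - n') - (m - n) = 0$, so the angle does not depend on the choice of reference direction.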
3.4.5 Energy Transfer Between Orbits
The defining characteristic of (most) stable hierarchical systems is that (essentially) no net energy is exchanged between the orbits over one outer orbital period. The usual way to show this is via orbit-averaging over the inner orbit. This involves a time-average over one entire orbit, assuming that all the orbital variables except the inner orbital phase remain constant on this short time-scale. The form of (3.41) makes this extremely easy to perform, but first we need an expression for the rate of change of the orbital energy. The simplest way to obtain such an expression is to use Lagrange's planetary equation for the rate of change of the semi-major axis.
Lagrange's Planetary Equations
Lagrange's planetary equations express the rates of change of all the elements of a two-body orbit which is being perturbed by some external potential. No assumption is made about the smallness of mass ratios (or any other parameters), so that they are perfectly well applicable to the general three-body problem, the results of which are meaningful as long as the inner and outer orbits retain their identities. The derivation of these equations can be found in Brouwer & Clements (1961) and is based on the method of variation of parameters. The parameters in this case are the orbital elements which remain constant when the orbit is unperturbed, that is, $e$, $a$, $\varpi$, $\Omega$, $I$ and $\epsilon = M(0) + \varpi$. The Lagrange equation relevant to us here is that for the rate of change of the semi-major axis. For the inner and outer orbits of a triple this is
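$$ \frac{da_i}{dt} = \frac{2}{\mu_i \nu_i a_i}\,\frac{\partial \mathcal R}{\partial \epsilon_i} \quad\text{and}\quad \frac{da_o}{dt} = \frac{2}{\mu_o \nu_o a_o}\,\frac{\partial \mathcal R}{\partial \epsilon_o} \tag{3.44} $$
respectively, where $\mathcal R$ is given by (3.41) (recall that our disturbing function has dimensions of energy).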
Now the usual definition of the mean longitude is
$$ \lambda = M + \varpi = \nu t + M(0) + \varpi = \nu t + \epsilon. \tag{3.45} $$
But this assumes that the orbital frequency (and hence the semi-major axis by Kepler's law, and also the orbital energy) is constant, something we certainly do not wish to assume once we consider unstable systems. A more general definition is
$$ \lambda = \int_0^t \nu(t')\, dt' + \epsilon^*, \tag{3.46} $$
where $\epsilon^*$ is a generalization of $\epsilon$ which takes into account the variation of $\nu$ (Brouwer & Clements (1961), p. 286, and Murray & Dermott (2000), p. 252; we do not need the precise definition here). It turns out that, using this definition of $\lambda$, one can replace $\epsilon_i$ and $\epsilon_o$ with $\lambda_i$ and $\lambda_o$ in (3.44), so that the rates of change of the semi-major axes become
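$$ \frac{da_i}{dt} = \frac{2}{\mu_i \nu_i a_i}\,\frac{\partial \mathcal R}{\partial \lambda_i} \quad\text{and}\quad \frac{da_o}{dt} = \frac{2}{\mu_o \nu_o a_o}\,\frac{\partial \mathcal R}{\partial \lambda_o}. \tag{3.47} $$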
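Writing the inner orbital energy $E_i$ in terms of the inner semimajor axis, $E_i = -G m_1 m_2/2a_i$, the rate of change of $E_i$ is then
$$ \frac{1}{E_i}\frac{dE_i}{dt} = -\frac{1}{a_i}\frac{da_i}{dt} = 4\nu_i \left(\frac{m_3}{m_{12}}\right) \sum_{L} n'\, \zeta_m\, c_{lm}^2\, \mathcal M_l \left(\frac{a_i}{a_o}\right)^{l+1} s_{n'}^{(lm)}(e_i)\, F_n^{(lm)}(e_o)\, \sin(\phi_{mnn'}) \equiv \sum_{L} n'\, C_{lmnn'} \sin(\phi_{mnn'}). \tag{3.48} $$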
Performing a time-average over the inner orbit, assuming all elements except $\lambda_i$ are constant (including $a_i$, i.e. putting $\lambda_i = \nu_i t + \epsilon_i$), gives
$$ \left\langle \frac{1}{E_i}\frac{dE_i}{dt} \right\rangle = \sum_{L} n'\, C_{lmnn'}\, \frac{1}{T_i}\int_0^{T_i} \sin\phi_{mnn'}\, dt = \sum_{L} n'\, C_{lmnn'} \sin(\phi_{mnn'})\, \delta_{n'0} = 0, \tag{3.49} $$
where $T_i = 2\pi/\nu_i$ is the inner orbital period. A simpler way to look at this is to ask for the contributions to (3.48) which are not rapidly varying (i.e. terms which do not depend on $\lambda_i$ and $\lambda_o$), that is, to retain only the "secular" (slowly varying) terms by putting $n' = n = 0$. This automatically gives $\langle \dot E_i/E_i \rangle = 0$ due to the factor $n'$ in (3.48). This simple approach also yields the secular rates of change of the other orbital elements via the Lagrange equations (M3).
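The averaging step in (3.49) can be illustrated numerically. In the sketch below, which assumes NumPy, the resonance angle is taken to depend on time only through $n'\nu_i t$, with every other contribution lumped into a constant `offset` (an illustrative simplification matching the assumption that all elements except $\lambda_i$ are held fixed).

```python
import numpy as np

def averaged_sine(n_prime, nu_i=1.0, offset=0.7, n_points=2000):
    """Time-average of sin(phi) over one inner period T_i = 2 pi / nu_i."""
    T_i = 2.0 * np.pi / nu_i
    t = T_i * np.arange(n_points) / n_points   # uniform samples over one period
    phi = n_prime * nu_i * t + offset          # only lambda_i = nu_i t + eps_i varies
    return np.mean(np.sin(phi))

averages = {n_prime: averaged_sine(n_prime) for n_prime in (0, 1, 2, 3)}
```

Only the $n' = 0$ entry of `averages` is non-zero, and that term carries the factor $n' = 0$ in (3.48), so the orbit-averaged rate of change of the inner energy vanishes.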
Resonance
How do we reconcile (3.49) with the fact that significant energy transfer is needed for escape of one body to occur? It seems that the assumption that elements other than $\lambda_i$ hardly change over an inner orbital period must be wrong in such cases. In fact, it is not so much that the other elements do not change much, but rather that in some circumstances certain combinations of angles vary slowly, and this can result in significant energy transfer. For example, imagine a system for which the outer orbital period is almost exactly two times the inner orbital period, that is,
$$ \nu_i - 2\nu_o \simeq 0. \tag{3.50} $$
Noting from (3.42) and (3.46) that
$$ \dot\phi_{mnn'} = n'\nu_i - n\nu_o + \left[n'\dot\epsilon_i - n\dot\epsilon_o + (m - n')\dot\varpi_i - (m - n)\dot\varpi_o\right] \simeq n'\nu_i - n\nu_o, \tag{3.51} $$
where the frequencies in square brackets are generally much smaller than the orbital frequencies, (3.50) is simply $\dot\phi_{m21} \simeq 0$ for any $m$. In practice, it is terms with $m = 2$ which contribute the most to energy transfer because these involve the quadrupole $l = 2$ terms (note the power of $a_i/a_o$ in (3.48) and recall that the summation over $l$ begins at 2). A system for which (3.50) holds is referred to as resonant, for obvious reasons. In fact, except for systems for which $m_2, m_3 \ll m_1$, e.g. star–planet–planet systems or intermediate/massive black hole–star–star systems, the so-called 2:1 resonance is unstable because adjacent resonances overlap and produce instability. However, there are now several stable 2:1 planetary systems known. One example is GJ 876 (Rivera et al. 2005), whose orbital periods are 30.34 days and 60.935 days, with masses $m_1 = 0.3\,M_\odot$, $m_2 = 0.62\,M_J$ and $m_3 = 1.93\,M_J$, where $M_J$ is the mass of
Fig. 3.12 The 2:1 resonance in the GJ 876 planetary system: (a) the evolution of the inner semi-major axis for max($\nu_i/\nu_o$) = 2.1. The small wiggles correspond to energy exchange during periastron passage of the outer planet (two peaks per passage, corresponding to superior and inferior conjunction); (b) libration and circulation: $\nu_i/\nu_o \equiv \sigma$ vs. the resonance angle $\phi_{221}$ for (from centre) $\sigma = 2.008$, 2.1 and 2.2
Jupiter. This period ratio is such that $\nu_i/\nu_o = 2.008$, that is, the system is very close to exact resonance. In order to demonstrate clearly the resonant variation of $a_i$, Fig. 3.12(a) plots its evolution for a slightly larger value of $\sigma$ ($\sigma = 2.1$), while Fig. 3.12(b) plots $\nu_i/\nu_o \equiv \sigma$ vs. the resonance angle $\phi_{221}$ for $\sigma = 2.008$ (the innermost set of points), $\sigma = 2.1$ (the librating set of points forming a fuzzy circle) and $\sigma = 2.2$ (the circulating set of points). The fact that $a_i$ varies significantly in Fig. 3.12(a) indicates that a substantial amount of energy is exchanged between the orbits (when the inner orbit shrinks, the outer orbit expands due to conservation of energy). Resonant orbits are also associated with libration of one or more resonance angles. The width of a resonance is the "distance" from exact resonance to the separatrix, calculated at $\phi_{mnn'} = 0$; if this separatrix overlaps the separatrix of a neighbouring resonance, we have instability. Thus our task is to determine the width of resonances, and to ask for what orbital parameters these are wide enough to overlap neighbouring resonances.
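The period ratio quoted for GJ 876 is simple to reproduce, and Kepler's third law for the inner and outer orbits ($\nu_i^2 a_i^3 = G m_{12}$ and $\nu_o^2 a_o^3 = G m_{123}$) converts it into a ratio of semi-major axes. The sketch below does both. The Jupiter-to-solar mass ratio used to express all masses in solar units is an assumed value that does not appear in the text, and the function name is illustrative.

```python
# Orbital periods in days and masses as quoted in the text.
P_inner, P_outer = 30.34, 60.935
sigma = P_outer / P_inner                  # sigma = nu_i / nu_o

M_JUPITER_IN_SOLAR = 9.546e-4              # assumed conversion factor, not from the text
m1 = 0.3
m2 = 0.62 * M_JUPITER_IN_SOLAR
m3 = 1.93 * M_JUPITER_IN_SOLAR

def semimajor_axis_ratio(sigma, m1, m2, m3):
    """a_i/a_o from Kepler's third law applied to the inner and outer orbits."""
    m12 = m1 + m2
    return ((m12 / (m12 + m3)) / sigma ** 2) ** (1.0 / 3.0)

distance_from_resonance = sigma - 2.0      # delta sigma for the 2:1 resonance
a_ratio = semimajor_axis_ratio(sigma, m1, m2, m3)
```

The quantity `distance_from_resonance` is the "distance" $\sigma - n$ from exact 2:1 commensurability, which is the natural variable in which resonance widths are expressed later in the chapter.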
Before we leave this section on energy exchange and resonance, we quote a result from M4 which gives approximately the energy exchanged between the inner and outer orbits over one outer orbital period (from apastron to apastron):
$$ \frac{\Delta E_i}{E_i} \simeq I_{22}^2 + 2\, e_i(0)\, I_{22} \sin[\phi(0)], \tag{3.52} $$
where $e_i(0)$ is the inner eccentricity at $t = 0$ and
$$ I_{22} = \frac{9}{4}\left(\frac{m_3}{m_{12}}\right)\left(\frac{a_i}{a_o}\right)^3 E_{22}(e_o, \sigma), \tag{3.53} $$
with an asymptotic expression for the "overlap integral"
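$$ E_{22}(e_o, \sigma) = \nu_i\, e^{-i\sigma\pi} \int_0^{T_o} \frac{e^{-2if_o}}{(R/a_o)^3}\, e^{i\nu_i t}\, dt \tag{3.54} $$
$$ \phantom{E_{22}(e_o, \sigma)} \simeq \frac{4\sqrt{2\pi}}{3}\, \frac{(1 - e_o^2)^{3/4}}{e_o^2}\, \sigma^{5/2}\, e^{-\sigma\xi(e_o)} \tag{3.55} $$
(M1a). Here $T_o$ is the outer orbital period and $\xi(e_o) = \cosh^{-1}(1/e_o) - \sqrt{1 - e_o^2}$. Also
$$ \phi(0) = M_i(0) + \sigma\pi + 2(\varpi_i - \varpi_o) \simeq \phi_{2n1}(0), \tag{3.56} $$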
that is, $\phi(0)$ is approximately the value of the resonance angle $\phi_{2n1}$ when the outer body is at apastron (see (3.42)), exact equality holding when $\sigma = n$. The expression (3.55) includes only quadrupole $l = 2$, $m = 2$ terms and is obtained using an asymptotic method similar to that of Heggie (1975), which gives the energy exchanged during the flyby of a binary by a third body. Note that $\lim_{e_o \to 0} E_{22} = 0$ for $\sigma > 2$, is finite for $\sigma = 2$ and is not defined for $\sigma < 2$, and that $\lim_{e_o \to 1} (1 - e_o)^3 E_{22}$ is finite.
The form of (3.55) shows that the amount of energy transferred during one outer orbit of a bound triple is exponentially small except when $\sigma\xi(e_o)$ is small. This is consistent with the orbit-averaging result $\langle \dot E_i/E_i \rangle = 0$, and it strongly suggests that "stable" systems are stable for all time, although as previously discussed, a proof is not yet available.
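The estimate (3.52) with (3.53) and the asymptotic form (3.55) can be coded in a few lines. The sketch below uses only the standard library; expressing $(a_i/a_o)^3$ through $\sigma$ and the masses via Kepler's third law is an assumption made here for convenience (the text leaves $a_i/a_o$ as an independent input), and the function names are illustrative.

```python
import math

def xi(e_o):
    """xi(e_o) = arccosh(1/e_o) - sqrt(1 - e_o^2), valid for 0 < e_o <= 1."""
    return math.acosh(1.0 / e_o) - math.sqrt(1.0 - e_o ** 2)

def E22_asymptotic(e_o, sigma):
    """Asymptotic overlap integral, eq. (3.55)."""
    prefactor = 4.0 * math.sqrt(2.0 * math.pi) / 3.0
    return (prefactor * (1.0 - e_o ** 2) ** 0.75 / e_o ** 2
            * sigma ** 2.5 * math.exp(-sigma * xi(e_o)))

def I22(e_o, sigma, m1, m2, m3):
    """Eq. (3.53), with (a_i/a_o)^3 = (m12/m123)/sigma^2 (Kepler's third law)."""
    m12 = m1 + m2
    a_ratio_cubed = (m12 / (m12 + m3)) / sigma ** 2
    return 2.25 * (m3 / m12) * a_ratio_cubed * E22_asymptotic(e_o, sigma)

def relative_energy_change(e_i0, phi0, e_o, sigma, m1, m2, m3):
    """Delta E_i / E_i over one outer orbit, eq. (3.52)."""
    i22 = I22(e_o, sigma, m1, m2, m3)
    return i22 ** 2 + 2.0 * e_i0 * i22 * math.sin(phi0)
```

For fixed $e_o$ the exponential factor dominates at large $\sigma$, so `E22_asymptotic(0.5, 20.0)` is much smaller than `E22_asymptotic(0.5, 10.0)`. For an initially circular inner binary the phase-dependent term drops out and the energy change is second order in $I_{22}$.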
3.4.6 A Pendulum Equation for the Resonance Angle
Figure 3.12(b) illustrates how a resonance angle librates when the orbital frequencies are near-commensurate. This suggests that resonance angles should satisfy a pendulum-like equation; the ability to write down such an equation would then give us the full machinery outlined in Sect. 3.3.1 for pendulums. In particular, we could calculate the distance from exact resonance to the separatrix, that is, the resonance width; recall that we need this in order to determine when neighbouring resonances overlap and hence when a system is unstable.
Referring to (3.1), we see that the second time derivative of the resonance angle is required. Starting from (3.51),
$$ \dot\phi_{mnn'} = n'\nu_i - n\nu_o, \tag{3.57} $$
where we have replaced the approximation symbol with equality, we then have
$$ \ddot\phi_{mnn'} = n'\dot\nu_i - n\dot\nu_o. \tag{3.58} $$
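Relating the rates of change of the orbital frequencies to the rates of change of the semi-major axes,
$$ \frac{\dot\nu_i}{\nu_i} = -\frac{3}{2}\frac{\dot a_i}{a_i} \quad\text{and}\quad \frac{\dot\nu_o}{\nu_o} = -\frac{3}{2}\frac{\dot a_o}{a_o}, \tag{3.59} $$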
we can again make use of Lagrange's planetary equation for the rate of change of the semi-major axis (3.47), together with (3.48) and its equivalent for $a_o$. Substituting these into (3.58) and assuming that the resonance is isolated (not forced), that is, that the only significant terms in the summations are those with the same values of $m$, $n$ and $n'$, we get
$$ \ddot\phi_{mnn'} = -n'^2\, \nu_o^2\, A_{mnn'} \sin(\phi_{mnn'}), \tag{3.60} $$
where
$$ A_{mnn'} \equiv -6\,\zeta_m \sum_{l=l_{\min},\,2}^{\infty} c_{lm}^2\, s_{n'}^{(lm)}(e_i)\, F_n^{(lm)}(e_o) \left[ M_i^{(l)}\, \sigma^{-(2l-4)/3} + M_o^{(l)}\, (n/n')^2\, \sigma^{-2l/3} \right] $$
$$ \phantom{A_{mnn'}} \simeq -6\,\zeta_m \sum_{l=l_{\min},\,2}^{\infty} c_{lm}^2\, s_{n'}^{(lm)}(e_i)\, F_n^{(lm)}(e_o)\, (n/n')^{-(2l-4)/3} \left[ M_i^{(l)} + M_o^{(l)}\, (n/n')^{2/3} \right], \tag{3.61} $$
and we have put $\sigma \simeq n/n'$ in the last step. Here $l_{\min} = 2$ if $m$ is even and $l_{\min} = 3$ if $m$ is odd. The dependence on the masses is solely through the functions
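$$ M_i^{(l)} = \mathcal M_l \left(\frac{m_3}{m_{12}}\right) \left(\frac{m_{12}}{m_{123}}\right)^{(l+1)/3} \quad\text{and}\quad M_o^{(l)} = \mathcal M_l \left(\frac{m_1 m_2}{m_{12}^2}\right) \left(\frac{m_{12}}{m_{123}}\right)^{l/3}. \tag{3.62} $$
Except for very low values of $n$, corresponding to planetary-like problems, it is usually adequate to include only the first term in the summation over $l$.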
Comparing (3.60) with (3.1), we have that the "small angle frequency" $\omega_0$ is $n'\nu_o |A_{mnn'}|^{1/2}$. When $A_{mnn'} > 0$ we have libration around zero, and when $A_{mnn'} < 0$ we have libration around $\pi$. It turns out that for systems for which at least two of the masses are reasonably similar (this is quantified in Sect. 3.4.10), the dominant resonances are those with $m = 2$ and $n' = 1$. Using the notation introduced in M1a, these are the $[n{:}1](2)$ resonances. Referring to Figs. 3.10 and 3.11, and recalling that we only need include $l = 2$ when $m = 2$, we see that $s_1^{(22)}(e_i) < 0$ for all $0 \le e_i \le 1$ and that $f_n^{(22)}(e_o) > 0$ for $0 \le e_o \le 1$, so that $A_{2n1} > 0$ for all $n$. Thus libration is around zero for all resonances of interest here. Putting $n' = 1$ and $m = 2$ in (3.61), retaining only the $l = 2$ term and setting $\phi_{2n1} \equiv \phi_n$ and $A_{2n1} \equiv A_n$, the resonances of interest to us are governed by
$$ \ddot\phi_n = -\nu_o^2\, A_n \sin\phi_n, \tag{3.63} $$
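where
$$ A_n = -\frac{9}{2}\, s_1^{(22)}(e_i)\, F_n^{(22)}(e_o) \left[ M_i^{(2)} + M_o^{(2)}\, n^{2/3} \right] \tag{3.64} $$
with
$$ M_i^{(2)} = \frac{m_3}{m_{123}} \quad\text{and}\quad M_o^{(2)} = \left(\frac{m_1 m_2}{m_{12}^2}\right) \left(\frac{m_{12}}{m_{123}}\right)^{2/3}, \tag{3.65} $$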
and we have used $c_{22}^2 = 3/8$ from Table 3.1. In Sect. 3.4.5, p. 84, we defined the width of a resonance to be the distance from exact resonance to the separatrix, calculated at $\phi_{mnn'} = 0$. Equation (3.3) gives an expression for the separatrix, so that the width of a resonance is
$$ \Delta\dot\phi = 2\omega_0 = 2\nu_o\sqrt{A_n} \tag{3.66} $$
for the $[n{:}1](2)$ resonances of interest here. It is usually more convenient to define the width of a resonance in terms of the change in $\sigma$. Since
$$ \dot\phi_n = \nu_i - n\,\nu_o = \nu_o(\sigma - n), \tag{3.67} $$
we can define the width of the $[n{:}1](2)$ resonance to be
$$ \Delta\sigma_n = 2\sqrt{A_n}. \tag{3.68} $$
We can associate an "energy" $E_n$ with the pendulum-like motion of a resonance, such that $E_n < 0$ for libration and $E_n > 0$ for circulation of $\phi_n$. Following (3.2) we then have
$$ E_n = \frac{1}{2}\dot\phi_n^2 - \nu_o^2\, A_n (\cos\phi_n + 1). \tag{3.69} $$
It is useful to define a dimensionless version of this, $\mathcal E_n$, such that $E_n = \nu_o^2\, \mathcal E_n$, that is (using (3.67)),
$$ \mathcal E_n = \frac{1}{2}[\delta\sigma_n]^2 - A_n(\cos\phi_n + 1), \tag{3.70} $$
where $\delta\sigma_n = \sigma - n$ is the "distance" from exact resonance corresponding to $\phi_n$. Note that $\delta\sigma_n$ is a maximum when $\phi_n = 0$ (for libration around $\phi_n = 0$). We will use (3.70) in a simple algorithm to determine the stability of any given configuration (Sect. 3.4.10).
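The dimensionless energy (3.70) gives a direct test of whether a given state librates or circulates with respect to a particular resonance. The following is a minimal sketch, assuming $A_n > 0$ (libration around zero) is supplied by the caller; the function names are illustrative and this is not the algorithm of Sect. 3.4.10.

```python
import math

def resonance_energy(sigma, n, A_n, phi_n):
    """Dimensionless pendulum energy of eq. (3.70)."""
    delta_sigma = sigma - n                    # distance from exact resonance
    return 0.5 * delta_sigma ** 2 - A_n * (math.cos(phi_n) + 1.0)

def resonance_width(A_n):
    """Width of the [n:1](2) resonance, eq. (3.68)."""
    return 2.0 * math.sqrt(A_n)

def classify(sigma, n, A_n, phi_n):
    """Return 'libration' if the energy is negative, otherwise 'circulation'."""
    return "libration" if resonance_energy(sigma, n, A_n, phi_n) < 0.0 else "circulation"
```

For example, with the assumed value $A_n = 0.01$ the width is 0.2, so at $\phi_n = 0$ a state with $\sigma - n = 0.19$ lies just inside the separatrix and one with $\sigma - n = 0.21$ lies just outside it.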
The form of (3.64) makes it relatively easy to see how resonance widths depend on the various parameters. Before we make use of (3.68) to determine the stability boundary, it is necessary to discuss evaluation of the eccentricity functions $s_1^{(22)}(e_i)$ and $F_n^{(22)}(e_o)$.
3.4.7 Eccentricity Functions
Since the eccentricity functions $s_1^{(22)}(e_i)$ and $F_n^{(22)}(e_o)$ are integrals with no closed form expressions (except for $n = 0$; see M3), it is of interest to find approximations. A simple Taylor expansion of the integrand of $s_1^{(22)}(e_i)$ about $e_i = 0$ allows for the integral to be performed, and if one expands up to $O(e_i^7)$, allows for the function to be well represented for all $e_i \le 1$. This procedure gives
$$ s_1^{(22)}(e_i) \simeq -3e_i + \frac{13}{8}e_i^3 + \frac{5}{192}e_i^5 - \frac{227}{3072}e_i^7. \tag{3.71} $$
If $\varepsilon_i$ is the difference between the exact and approximate expression, $|\varepsilon_i| < 0.001$ for $e_i < 0.63$, $|\varepsilon_i| < 0.01$ for $e_i < 0.79$ and $|\varepsilon_i| < 0.1$ for $e_i < 1$.
While it is possible to find Taylor series approximations to $F_n^{(22)}(e_o)$, we would need hundreds of these for a general stability algorithm, since systems with very high outer eccentricity can involve very high values of $n$ (since $\sigma = \nu_i/\nu_o \simeq n/n' = n$). Instead, we make use of the asymptotic expression (3.55) to evaluate (3.39). Making the substitution $M_o = \nu_o t - \pi$ in (3.54) (since the outer orbit starts at $-\pi$, that is, $M_o(0) = -\pi$), so that $\nu_i t = \sigma(M_o + \pi)$ and $\nu_i\, dt = \sigma\, dM_o$, the integral becomes
$$ E_{22}(e_o, \sigma) = \sigma \int_{-\pi}^{\pi} \frac{e^{-2if_o}}{(R/a_o)^3}\, e^{i\sigma M_o}\, dM_o. \tag{3.72} $$
Comparing this with (3.39) we see that
$$ F_n^{(22)}(e_o) \simeq E_{22}(e_o, n)/2\pi n. \tag{3.73} $$
Thus we have the beautiful result that the resonance widths are exponentially small unless $\sigma\xi(e_o)$ is small, consistent with the fact that an exponentially small amount of energy is exchanged between the orbits in such circumstances.
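Putting (3.64), (3.65), (3.68), (3.71) and (3.73) together gives a closed-form estimate of the width of an $[n{:}1](2)$ resonance. The sketch below uses only the standard library and repeats the asymptotic overlap integral (3.55) so that it is self-contained; the function names are illustrative, and the result inherits the accuracy limits of both approximations.

```python
import math

def s22_1(e_i):
    """Series approximation (3.71) to the inner eccentricity function."""
    return (-3.0 * e_i + 13.0 / 8.0 * e_i ** 3
            + 5.0 / 192.0 * e_i ** 5 - 227.0 / 3072.0 * e_i ** 7)

def xi(e_o):
    """xi(e_o) = arccosh(1/e_o) - sqrt(1 - e_o^2)."""
    return math.acosh(1.0 / e_o) - math.sqrt(1.0 - e_o ** 2)

def E22(e_o, sigma):
    """Asymptotic overlap integral, eq. (3.55)."""
    return (4.0 * math.sqrt(2.0 * math.pi) / 3.0 * (1.0 - e_o ** 2) ** 0.75
            / e_o ** 2 * sigma ** 2.5 * math.exp(-sigma * xi(e_o)))

def F22(n, e_o):
    """Outer eccentricity function via eq. (3.73)."""
    return E22(e_o, n) / (2.0 * math.pi * n)

def A_n(n, e_i, e_o, m1, m2, m3):
    """Pendulum coefficient of eq. (3.64) with the mass functions (3.65)."""
    m12 = m1 + m2
    m123 = m12 + m3
    M_i = m3 / m123
    M_o = (m1 * m2 / m12 ** 2) * (m12 / m123) ** (2.0 / 3.0)
    return -4.5 * s22_1(e_i) * F22(n, e_o) * (M_i + M_o * n ** (2.0 / 3.0))

def resonance_width(n, e_i, e_o, m1, m2, m3):
    """Delta sigma_n = 2 sqrt(A_n), eq. (3.68)."""
    return 2.0 * math.sqrt(A_n(n, e_i, e_o, m1, m2, m3))
```

Because the series (3.71) is negative for the eccentricities of interest, `A_n` is positive and the square root is real. The width vanishes for $e_i = 0$, which is the apparent paradox taken up in the next section.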
3.4.8 Induced Eccentricity and Secular Effects
The expression for the resonance width (3.68), together with (3.64) and (3.71), suggests that systems whose inner binary is circular have zero resonance widths (since $s_1^{(22)}(0) = 0$). But this surely is not true. Figure 3.13 plots the evolution of the inner eccentricity for an equal-mass three-body system whose initial eccentricities are $e_i(0) = 0$ and $e_o(0) = 0.5$, and for which (a) $\sigma = 10$ and (b) $\sigma = 8$. Both systems start at outer apastron, and significant eccentricity is induced when they pass through outer periastron. The formalism used to estimate the energy transferred between orbits (see Sect. 3.4.5 and (3.52)) can also be used to estimate the induced inner eccentricity. This is given by
$$ e_i(T_o) = \left[ e_i(0)^2 - 2\, e_i(0)\, I_{22} \sin[\phi(0)] + I_{22}^2 \right]^{1/2}, \tag{3.74} $$
Fig. 3.13. Induced inner eccentricity of a circular binary: (a) $\sigma = 10$ and (b) $\sigma = 8$. In both cases $e_o = 0.5$ and the system is started at outer apastron with $M_i(0) = 0$ and $\varpi_i - \varpi_o = 0$. Both systems are chaotic, but (a) is on the stability boundary while (b) is deep inside the unstable region. The dashed lines correspond to the estimated induced eccentricity (3.74) following the first outer periastron passage.
where $e_i(0)$ and $e_i(T_o)$ are the inner eccentricities at initial and final outer apastron, and $I_{22}$ and $\phi(0)$ are given by (3.53) and (3.56) respectively. The dashed curves in Fig. 3.13 indicate these estimates.

It turns out that using $e_i(T_o)$ instead of $e_i(0)$ in the expression for the resonance width quite accurately predicts the stability boundary when octopole effects are unimportant (see Fig. 3.15).
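The estimate (3.74) is simple to evaluate once $I_{22}$ and $\phi(0)$ are known. The sketch below takes them as inputs, since (3.53) and (3.56) are not reproduced here; the function name is hypothetical.

```python
import math

def induced_eccentricity(e_i0: float, I22: float, phi0: float) -> float:
    """Inner eccentricity after one outer orbit, following (3.74)."""
    # For e_i0 >= 0 the bracket is >= (e_i0 - |I22|)^2 >= 0;
    # max() only guards against round-off.
    bracket = e_i0**2 - 2.0 * e_i0 * I22 * math.sin(phi0) + I22**2
    return math.sqrt(max(0.0, bracket))
```

For an initially circular inner binary the result reduces to $|I_{22}|$, whatever the value of $\phi(0)$.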
Octopole Variations for Coplanar Systems
For systems with $m_1 \neq m_2$, secular octopole contributions to the disturbing function (terms with $n = n' = 0$) can cause the inner eccentricity to vary considerably on time-scales of thousands of inner orbits (Murray & Dermott 2000, M3). This is especially important for close planetary systems. While the outer eccentricity also varies, the main effect on the resonance widths comes from the variation of $s^{(22)}_1(e_i)$, which is a maximum at the maximum of the octopole cycle in $e_i$. Referring to this maximum as $e^{(\rm oct)}_i$, it is given approximately by (Mardling 2007, M1a)
$$e^{(\rm oct)}_i = \begin{cases} (1 + \alpha)\,e^{(\rm eq)}_i, & \alpha \le 1, \\ e_i(0) + 2e^{(\rm eq)}_i, & \alpha > 1, \end{cases} \qquad (3.75)$$

where $\alpha = |1 - e_i(0)/e^{(\rm eq)}_i|$ and $e^{(\rm eq)}_i$ is the "equilibrium" or "fixed point" eccentricity, which is the root of the eighth-order polynomial $\sum_{n=0}^{8} a_n x^n$ in $[0, 1]$, where the $a_n$ are given by
$$\begin{aligned}
a_0 &= -B^2, \\
a_1 &= 2AB, \\
a_2 &= B^2 + C^2 - A^2, \\
a_3 &= -2(AB + 4CD), \\
a_4 &= A^2 + 3C^2 + 16D^2, \\
a_5 &= -18CD, \\
a_6 &= \tfrac{9}{4}C^2 + 24D^2, \\
a_7 &= -9CD, \\
a_8 &= 9D^2,
\end{aligned} \qquad (3.76)$$
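with

$$\begin{aligned}
A &= \frac{3}{4}\left(\frac{m_3}{m_{12}}\right)\left(\frac{a_i}{a_o}\right)^3 \varepsilon_o^{-3}, \\
B &= \frac{15}{64}\left(\frac{m_3}{m_{12}}\right)\left(\frac{m_1 - m_2}{m_{12}}\right)\left(\frac{a_i}{a_o}\right)^4 \varepsilon_o^{-5}, \\
C &= \frac{3}{4}\left(\frac{m_1 m_2}{m_{12}^2}\right)\left(\frac{a_i}{a_o}\right)^2 \varepsilon_o^{-4}, \\
D &= \frac{15}{64}\left(\frac{m_1 m_2}{m_{12}^2}\right)\left(\frac{m_1 - m_2}{m_{12}}\right)\left(\frac{a_i}{a_o}\right)^3 \left(\frac{1 + 4e_o^2}{e_o\,\varepsilon_o^6}\right),
\end{aligned} \qquad (3.77)$$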
and $\varepsilon_o = \sqrt{1 - e_o^2}$. In the limit $e_i \ll 1$ the equilibrium eccentricity reduces to

$$e^{(\rm eq)}_i = \frac{(5/4)\,e_o\,m_3\,(m_1 - m_2)\,(a_i/a_o)^2\,\sigma\,\varepsilon_o^{-1}}{\left|m_1 m_2 - m_{12}\,m_3\,(a_i/a_o)\,\varepsilon_o\,\sigma\right|}. \qquad (3.78)$$

Note that even though (3.78) is not accurate away from the stability boundary where $e_i(T_o)$ is large, it can be used to determine the boundary if $e_i(0)$ is small, because $e_i(T_o)$ tends to be small there in that case (see Fig. 3.13).
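The small-$e_i$ limit (3.78) and the piecewise estimate (3.75) can be combined as in the sketch below. It is hedged in several ways: the function names are hypothetical, $m_{12} = m_1 + m_2$ and $m_1 \ge m_2$ are assumed, the case of a vanishing denominator in (3.78) is not handled, and $e^{(\rm eq)}_i = 0$ is treated as the $\alpha \to \infty$ branch of (3.75).

```python
import math

def equilibrium_eccentricity(m1, m2, m3, a_ratio, sigma, e_o):
    """Small-e_i limit (3.78) of the equilibrium eccentricity; a_ratio = a_i / a_o."""
    m12 = m1 + m2                                   # assumed definition of m_12
    eps_o = math.sqrt(1.0 - e_o**2)
    numerator = 1.25 * e_o * m3 * (m1 - m2) * a_ratio**2 * sigma / eps_o
    denominator = abs(m1 * m2 - m12 * m3 * a_ratio * eps_o * sigma)
    return numerator / denominator

def max_octopole_eccentricity(e_i0, e_eq):
    """Maximum of the octopole cycle in e_i, following (3.75)."""
    if e_eq == 0.0:
        return e_i0                                 # alpha -> infinity: e_i(0) + 2 * 0
    alpha = abs(1.0 - e_i0 / e_eq)
    if alpha <= 1.0:
        return (1.0 + alpha) * e_eq
    return e_i0 + 2.0 * e_eq
```

For equal inner masses the numerator of (3.78) vanishes, so the sketch returns $e^{(\rm oct)}_i = e_i(0)$, that is, no octopole-driven growth.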
3.4.9 Resonance Overlap and the Stability Boundary
The stability of any given coplanar configuration depends on the values of the eight parameters $m_2/m_1$, $m_3/m_{12}$, $\sigma$, $e_i$, $e_o$, $\varpi_i - \varpi_o$, $M_i(0)$ and $M_o(0)$. In order to represent the stability boundary in two dimensions, we need to fix the values of six of these and vary the other two. Here we choose to plot $e_o$ against $\sigma$ for $\varpi_i - \varpi_o = M_i(0) = 0$ and $M_o = -\pi$, and for a selection of mass ratios and $e_i(0)$.

For a given value of $n$ and for fixed values of $e_i(0)$, $m_2/m_1$ and $m_3/m_{12}$, the two boundaries of the $[n{:}1](2)$ resonance are given by

$$\sigma(e_o) = n \pm \Delta\sigma_n(e_o) = n \pm 2\,[A_n(e_o)]^{1/2}, \qquad (3.79)$$
Fig. 3.14. (a) The $[12{:}1](2)$ resonance. (b) Resonance overlap. This example corresponds to $m_2/m_1 = m_3/m_1 = 0.01$ and $e_i(0) = 0.5$ (see Fig. 3.16). See text for discussion.
where $A_n(e_o)$ is given by (3.64). Note that this assumes exact resonance occurs when

$$\dot\phi_n = \nu_i - n\nu_o = 0, \qquad (3.80)$$

that is, when $\sigma = \nu_i/\nu_o = n$; however, if $\dot\varpi_i/\nu_o$ is significant it will shift exact resonance away from this (recall the precise expression (3.51) for $\phi_n$; see also Fig. 3.15). Figure 3.14(a) plots $e_o$ against $\sigma$ for the $[12{:}1](2)$ resonance for a particular set of initial conditions, with the shaded region corresponding to libration of the resonance angle $\phi_{12}$,$^{10}$ while panel (b) shows the overlap of the resonances $[n{:}1](2)$, $n = 9, 10, \ldots, 15$, for the same initial conditions. The lower (green)-shaded regions in panel (b) formally correspond to stable libration of the resonance angles $\phi_n$, while the unshaded regions correspond to stable circulation, for which the inner and outer orbits have constant semi-major axes. The upper (red)-shaded region corresponds to the overlap of neighbouring resonances (as well as more distant resonances), so that a system with initial conditions corresponding to any point in this region is predicted by the resonance overlap stability criterion to be unstable.
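The overlap test described above can be sketched as follows. The amplitude $A_n(e_o)$ of (3.64) is not reproduced here, so it is passed in as a caller-supplied function; the names `resonance_boundaries`, `neighbours_overlap` and `amplitude` are hypothetical.

```python
import math
from typing import Callable, Tuple

def resonance_boundaries(n: int, A_n: float) -> Tuple[float, float]:
    """Left and right edges of the [n:1](2) resonance in sigma, from (3.79)."""
    half_width = 2.0 * math.sqrt(A_n)               # Delta sigma_n
    return n - half_width, n + half_width

def neighbours_overlap(n: int, e_o: float,
                       amplitude: Callable[[int, float], float]) -> bool:
    """True if the [n:1](2) and [n+1:1](2) resonances overlap at this e_o."""
    _, right_edge = resonance_boundaries(n, amplitude(n, e_o))
    left_edge, _ = resonance_boundaries(n + 1, amplitude(n + 1, e_o))
    return right_edge >= left_edge
```

With a toy amplitude such as `lambda n, e_o: 0.1 * e_o` (chosen only for illustration, not taken from (3.64)), neighbouring resonances overlap once $4\sqrt{0.1\,e_o} \ge 1$, that is, for $e_o \ge 0.625$.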
How does this compare with direct numerical experiments? Figure 3.15(a) shows a stability map for equal-mass configurations with initially circular inner binaries for various initial period ratios and outer eccentricities. A dot corresponding to the initial values of $\sigma$ and $e_o$ is plotted if a direct numerical integration of the three-body equations of motion results in an unstable system. Rather than integrating the system until one of the bodies escapes, two almost identical systems (the given system and its "ghost") are integrated in parallel, and the difference in the inner semi-major axes at outer apastron is monitored (because this variable is approximately constant for non-resonant systems). Taking advantage of the sensitivity of a chaotic system to initial conditions, this difference will grow in proportion to the initial difference between the two systems ($10^{-7}$ in the inner eccentricity) for a stable system, but will grow exponentially for an unstable system, as discussed in Sect. 3.3.3. The actual stability boundary fairly accurately follows the points at which neighbouring resonances overlap; however, the stability criterion does not predict the unstable nature of some systems inside single-librating regions (corresponding to the green regions in Fig. 3.14(b)), because it assumes that forcing is negligible there.

$^{10}$ Even though (3.79) gives $\sigma$ as a function of $e_o$, it seems more natural to plot the resonance boundaries with $\sigma$ as the independent variable.

Fig. 3.15. Experimental vs theoretical stability boundary. The position of each red (grey) dot in $\sigma$–$e_o$ space corresponds to the initial conditions of an unstable system for which the masses are equal, $e_i(0) = 0$, $M_i(0) = 0$ and $M_o(0) = -\pi$. The black curves are the resonance boundaries given by (3.79), which terminate at points for which $e_i(T_o) = 1$. Notice the structure of the distribution of dots near these termination points; this reflects the process of exchange of $m_3$ into the inner binary (consistent with $e_i(T_o) > 0$). Systems deemed stable (see text for how this decision is made) are those for which exchange occurs rapidly. While the resonance overlap stability criterion predicts the stability boundary fairly accurately, some of the red dots fall inside single-resonance regions, which ought to be stable according to the criterion. But the criterion assumes that when only one resonance angle is librating, the forcing is negligible; this clearly is not true at these points. Also notice how the red dots trace the separatrix at the left-hand boundaries, and in particular notice the offset, which is prominent for the 5:1 resonance; this is analogous to spectral line splitting by a magnetic field and is a result of the influence of $\dot\varpi_i$, which has been neglected in (3.79).
Figure 3.16 shows stability maps for a variety of initial conditions. Each map has $m_1 = 1$, $M_i(0) = 0$ and $M_o(0) = -\pi$, and aligned periastra except for panel (f). Consider the systems (a), (c) and (e), for which $e_i(0) \neq 0$ and $\eta = \varpi_i - \varpi_o = 0$. The librating regions for which there is no overlap with a neighbouring resonance are relatively free of unstable systems, while those for odd resonances tend to be full down to near the resonance cross-over points. The reason for this is as follows. Referring to (3.42) on p. 80, we see (putting $n' = 1$ and $m = 2$) that for these initial conditions $\phi_n(0) = n\pi$. Since libration is around zero (because $A_n > 0$), a system starting at exact resonance, that is, with $\sigma = n$, will stay there if $n$ is even because it is at the very centre of the resonance (see Fig. 3.12 on p. 84), while if $n$ is odd, the system starts at the hyperbolic fixed point on the separatrix. An odd-$n$ system for which $\sigma \neq n$ (and is indicated on the stability map to be inside a resonance) actually begins outside the librating region; recall that the definition of the resonance boundary uses the value of the separatrix at $\phi_n = 0$. However, it will still be strongly forced and its proximity to the separatrix will cause it to be unstable. A more detailed analysis can be found in M1a.

Fig. 3.16. Stability maps for a variety of initial conditions ($m_1 = 1$). Notice how resonance shapes vary significantly from panel to panel, but the resonance overlap stability criterion is still successful at predicting the stability boundary (except for the single-librating regions). The dashed curve in the top left-hand corner of each panel corresponds to $R_p/a_i = 1$, where $R_p = a_o(1 - e_o)$ is the outer periastron distance (data were not collected beyond this curve). (a) Planetary-like system with significant inner eccentricity; (b) low-mass secondary with zero initial inner eccentricity; (c) Jupiter-like outer body orbiting an equal-mass eccentric binary; (d) "binary" consisting of a heavy body and an equal-mass binary; (e) and (f) equal-mass system with $e_i(0) = 0.2$. Here $\eta = \varpi_i - \varpi_o$, the two plots demonstrating the effect of rotating the orbits relative to each other. Notice that even resonances are more stable than odd in (a), while the opposite is true in (b) (see text for discussion).
We should expect from this discussion that a system for which $\eta = \varpi_i - \varpi_o \neq 0$ will exhibit different behaviour, and this is indeed the case, as panel (f), for which $\eta = \pi/2$, reveals. In this case $\phi_n = (n + 1)\pi$, and we see that it is the even resonances that are now more unstable.

The fact that $e_i(0) \neq 0$ for the examples just discussed means that the inner orbit begins with a definite periastron direction. What about when $e_i(0) = 0$? Figure 3.15, as well as panels (b) and (d) in Fig. 3.16, show that points on the left-hand sides of the resonances tend to be unstable, while points on the right-hand side are stable up to where the resonances overlap. We interpret this as indicating that the induced periastron direction associated with the induced eccentricity tends to be such that $\eta(T_o) \simeq \pi/4$, so that $\phi_n = (2n + 1)\pi/2$.

Another feature of Fig. 3.16 worth noting is the patch of instability at the lower-left corner of panel (a). This is common for low-order resonances in planetary-like systems and actually corresponds to libration around $\pi$ (this is discussed in detail in M2).
3.4.10 A Simple Algorithm for Predicting Stability
For most applications one needs to know the stability characteristics of single systems. Thus, rather than give a formula for the stability boundary, we end this chapter by presenting an algorithm for testing the stability of individual configurations. Note that it only holds for coplanar systems$^{11}$ and is restricted to systems for which the $[n{:}1](2)$ resonances dominate. These are such that either both $m_2/m_1 > 0.01$ and $m_3/m_1 > 0.01$, or at least one of $m_2/m_1 > 0.05$ or $m_3/m_1 > 0.05$. The algorithm is as follows:

1. Identify which $[n{:}1](2)$ resonance the system is near, and calculate the distance $\delta\sigma_n$ from that resonance: $\delta\sigma_n = \sigma - n$, where $n = \lfloor\sigma\rfloor$ (the nearest integer for which $n \le \sigma$).
2. Take the associated resonance angle to be zero rather than the definition (3.42) (see discussion below): $\phi_n = 0$.
3. Calculate the induced eccentricity from (3.74) and (if $m_1 \neq m_2$) the maximum octopole eccentricity from (3.75). Determine $e_i = \max[e_i(T_o), e^{(\rm oct)}_i]$ for use in $s^{(22)}_1(e_i)$.
4. Calculate $A_n$ from (3.64).
5. Calculate $E_n$ and $E_{n+1}$ from (3.70), and deem the system unstable if $E_n < 0$ and $E_{n+1} < 0$.

$^{11}$ A Fortran routine for arbitrarily inclined systems is available from the author.

Fig. 3.17. Comparison of (a) experimental and (b) theoretical data for equal-mass coplanar systems with $e_i(0) = 0$.
Figure 3.17 compares the experimental data shown in Fig. 3.15 with data generated using the algorithm above. A dot is plotted if a system is deemed to be unstable. The boundary structure is reproduced reasonably well, although the boundary itself should be slightly lower, a result of the fact that the resonance overlap criterion does not recognize the unstable nature of points near to but outside the separatrix. This is also the reason for taking $\phi_n = 0$ for all initial conditions (recall the discussion in the previous section on odd and even resonances).
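The five steps of the algorithm can be arranged as in the sketch below. It is only a skeleton under stated assumptions: the expressions (3.64) for $A_n$ and (3.70) for $E_n$ are not reproduced here, so they enter as caller-supplied functions, and every name (`is_unstable`, `amplitude`, `energy`) is hypothetical. The sketch evaluates the amplitude for both $n$ and $n + 1$, since both energies are needed in step 5.

```python
import math
from typing import Callable

def is_unstable(sigma: float,
                e_i_induced: float,
                e_i_octopole: float,
                amplitude: Callable[[int, float], float],
                energy: Callable[[float, float], float]) -> bool:
    """Steps 1-5 for one coplanar configuration.

    amplitude(n, e_i) stands in for A_n of (3.64); energy(delta_sigma, A_n)
    stands in for E_n of (3.70) evaluated at phi_n = 0 (step 2).
    """
    n = math.floor(sigma)                      # step 1: resonance with n <= sigma
    e_i = max(e_i_induced, e_i_octopole)       # step 3: eccentricity used in s_1^(22)
    energies = []
    for k in (n, n + 1):
        delta_sigma = sigma - k                # distance from the [k:1](2) resonance
        A_k = amplitude(k, e_i)                # step 4
        energies.append(energy(delta_sigma, A_k))   # step 5
    return energies[0] < 0.0 and energies[1] < 0.0
```

For a quick trial one may use the stand-in `energy = lambda d, A: d * d - 4.0 * A`, which is negative exactly inside the boundaries (3.79) but is not the expression (3.70), together with a constant toy amplitude.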
References
Aarseth S. J., 1971, Ap&SS, 13, 324
Aarseth S. J., 2007, MNRAS, 378, 285
Arnol'd V. I., 1963, Russian Mathematical Surveys, 18, 9
Arnol'd V. I., 1978, Mathematical Methods of Classical Mechanics. Springer-Verlag, New York
Barrow-Green J., 1997, Poincaré and the Three Body Problem (History of Mathematics, V. 11). American Mathematical Society
Brouwer D., Clemence G. M., 1961, Methods of Celestial Mechanics. Academic Press, New York and London
Chirikov B. V., 1979, Phys. Rep., 52, 263
Dyson F. J., 2000, Oppenheimer Lecture, University of California, Berkeley, http://www.hartford-hwp.com/archives/20/035.html
Eggleton P., Kiseleva L., 1995, ApJ, 455, 640
Eggleton P. P., Kiseleva-Eggleton L., 2001, ApJ, 562, 1012
Fabrycky D., Tremaine S., 2007, ApJ, 669, 1298
Goldstein H., 1980, Classical Mechanics. Addison-Wesley, Philippines
Heggie D. C., 1975, MNRAS, 173, 729
Hills J. G., 1976, MNRAS, 175, 1P
Hills J. G., 1988, Nature, 331, 687
Jackson J. D., 1975, Classical Electrodynamics, 2nd ed. Wiley, New York
Kaula W. M., 1961, Geophys. J. Roy. Astr. Soc., 5, 104
Kolmogorov A. N., 1954, Dokl. Akad. Nauk, 98, 527
Kozai Y., 1962, AJ, 67, 591
Mardling R. A., 1995a, ApJ, 450, 722
Mardling R. A., 1995b, ApJ, 450, 732
Mardling R. A., 2007, MNRAS, 382, 1768
Mardling R. A., 2008a, submitted to MNRAS
Mardling R. A., 2008b, submitted to MNRAS
Moser J., 1962, Nachr. Akad. Wiss. Göttingen II, Math. Phys. KD, 1, 1
Murray C. D., Dermott S. F., 2000, Solar System Dynamics. Cambridge Univ. Press, Cambridge
Poincaré H., 1993, New Methods of Celestial Mechanics (Vol. 1), Goroff D. L., ed. AIP, New York, I23 22
Reichl L. E., 1992, The Transition to Chaos in Conservative Classical Systems: Quantum Manifestations. Springer-Verlag, New York
Reipurth B., Clarke C., 2001, AJ, 122, 432
Rivera E. J. et al., 2005, ApJ, 634, 625
Spurzem R., Giersz M., Heggie D. C., Lin D. N. C., 2006, astro-ph/0612757
Tokovinin A., Thomas S., Sterzik M., Udry S., 2006, A&A, 450, 681
Walker G. H., Ford J., 1969, Physical Review, 188, 416
4 Fokker–Planck Treatment of Collisional Stellar Dynamics

Marc Freitag

University of Cambridge, Institute of Astronomy, Madingley Road, Cambridge CB3 0HA, UK
marcfreitag@gmail.com
4.1 Introduction
In this chapter, I explain how the evolution of an $N$-body system can be described using a formalism explicitly based on the distribution function in phase space. Such an approach can be contrasted with direct $N$-body simulations, in which the trajectories of a large number of particles are integrated. Because trajectories with close initial conditions diverge exponentially in gravitational $N$-body systems (Goodman et al. 1993; Hemsendorf & Merritt 2002, and references therein), most results of $N$-body simulations must be interpreted statistically. It is therefore interesting to consider the simulation methods that treat the gravitational system in an explicitly statistical way.

Since the early 1980s, the numerical solution of the Fokker–Planck (FP) equation has been the technique of choice for a statistical treatment of collisional systems such as globular clusters or dense galactic nuclei. In its basic version, on which I focus, this equation (combined with the Poisson equation) describes the evolution of a stellar system in dynamical equilibrium but evolving slowly through the effects of two-body relaxation. In this chapter, I further restrict myself to spherically symmetric configurations with no net rotation, as most researchers in the field have done to make the problem easier to tackle. As far as relaxation is concerned, the Monte-Carlo numerical scheme presented in Chap. 5 is essentially equivalent to solving the FP equation using a particle-based representation of the distribution function instead of tabulated data. Therefore, the assumptions and limitations inherent in the FP description of relaxation, which are described in detail in this chapter, also apply to Monte-Carlo techniques.
A note of caution is required here. The dynamics of a gravitational $N$-body system is highly non-linear, with the possibility that small differences in the "microscopic" conditions (such as the existence and properties of a binary star) can lead to rather large macroscopic differences in evolution. The FP approach does not provide a statistical description of the various macroscopically distinct possible evolutions. When such divergences are expected to occur, such as in the process of collisional runaway or post-collapse core oscillations (see Sect. 4.5), the only way to capture them in a satisfying way by means of FP simulations is probably by including some explicit stochastic process and repeating the simulation several times with different random sequences (see Takahashi & Inagaki (1991) for an example in the case of core oscillations).

Freitag, M.: Fokker–Planck Treatment of Collisional Stellar Dynamics. Lect. Notes Phys. 760, 97–121 (2008). DOI 10.1007/978-1-4020-8431-7_4. © Springer-Verlag Berlin Heidelberg 2008
In the last decade or so, FP codes have lost some ground to direct $N$-body and Monte-Carlo codes. Indeed, these particle-based methods make it easier to include a variety of physical effects thought to play an important role in real systems, and faster computers enable the use of higher and higher particle numbers. Nevertheless, because FP computations are very fast and produce data that are much smoother, less memory-consuming and easier to manipulate than particle-based simulations, they are an invaluable tool for exploring large volumes of parameter space. They also help in gaining a better understanding of "macroscopic" collisional stellar dynamics by providing a description at a level more suitable than that of "microscopic" point-mass particles attracting each other.

In Sect. 4.2, I present the Boltzmann equation, which is at the heart of the statistical description of an $N$-body system. In Sect. 4.3, I give an outline of the derivation of the main forms of the FP equation used to simulate the effects of relaxation in spherical stellar systems. Finally, Sect. 4.5 is a quick overview of the applications of the FP approach in stellar dynamics, with a focus on the additional physics that can be incorporated into that framework.
4.2 Boltzmann Equation

4.2.1 Notation
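The following notations are in use in this section. Position and velocity in 3D space are denoted by

$$\mathbf{x} = (x, y, z) = (x_1, x_2, x_3)$$

and

$$\mathbf{v} = (v_x, v_y, v_z) = (v_1, v_2, v_3).$$

For a point in the 6D phase space, I use the notation

$$\mathbf{w} = (\mathbf{x}, \mathbf{v}).$$

The gradient of a field $u$ in 3D space is written

$$\nabla u \equiv \frac{\partial u}{\partial \mathbf{x}} = \left(\frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}, \frac{\partial u}{\partial z}\right)$$

and the gradient in the 6D phase space is

$$\nabla u \equiv \frac{\partial u}{\partial \mathbf{w}} = \left(\frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}, \frac{\partial u}{\partial z}, \frac{\partial u}{\partial v_x}, \frac{\partial u}{\partial v_y}, \frac{\partial u}{\partial v_z}\right).$$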
4.2.2 Collisionless System
In this section, I mostly follow the treatment presented in Sect. 4.1 of Binney & Tremaine (1987, hereafter BT87).

We consider a large number $N_*$ of bodies moving under the influence of a smooth gravitational potential $\Phi(\mathbf{x}, t)$. Here, smooth means essentially that $\Phi$ does not change much over distances of the order of (a few times) the average inter-particle distance $n^{-1/3}$, where $n$ is the particle number density. No other forces affect the motion of these objects. The potential $\Phi$ may be the gravitational field created by these bodies themselves or an external field. The system of particles is described through the one-particle phase-space distribution function (DF for short) $f(\mathbf{x}, \mathbf{v}, t)$. A useful interpretation of $f$ is as a probability density, if it is normalised to 1. Then $f(\mathbf{x}, \mathbf{v}, t)\,d^3\mathbf{x}\,d^3\mathbf{v}$ is the probability of finding, at time $t$, any given particle within a volume of phase space $d^3\mathbf{x}\,d^3\mathbf{v}$ around the 6D phase-space point $\mathbf{w} = (\mathbf{x}, \mathbf{v})$. The mean number of particles in this volume is $N_* f(\mathbf{x}, \mathbf{v}, t)\,d^3\mathbf{x}\,d^3\mathbf{v}$.

From the knowledge of the initial conditions $f_0(\mathbf{x}, \mathbf{v}) \equiv f(\mathbf{x}, \mathbf{v}, t_0)$, we want to predict $f(\mathbf{x}, \mathbf{v}, t)$ at some future time $t > t_0$. We define the velocity in the 6D phase-space,

$$\dot{\mathbf{w}} = (\dot{\mathbf{x}}, \dot{\mathbf{v}}) = (\mathbf{v}, -\nabla\Phi). \qquad (4.1)$$

As long as $\Phi$ is sufficiently smooth, the particles evolve in a smooth, continuous way in the phase-space. Therefore $f$ must satisfy a continuity equation,

$$\frac{\partial f}{\partial t} + \nabla\cdot(f\dot{\mathbf{w}}) = \frac{\partial f}{\partial t} + \sum_{i=1}^{3}\frac{\partial (f v_i)}{\partial x_i} - \sum_{i=1}^{3}\frac{\partial (f\,\partial_{x_i}\Phi)}{\partial v_i} = 0. \qquad (4.2)$$
This equation can be simplified using the fact that, in the phase-space representation, the $x_i$ and $v_i$ are independent variables ($\partial v_i/\partial x_j = 0$) and that $\Phi$ does not depend on the velocities, so that $\partial\Phi/\partial v_i = 0$. Therefore we have

$$\frac{\partial f}{\partial t} + \sum_{i=1}^{3} v_i\frac{\partial f}{\partial x_i} - \sum_{i=1}^{3}\partial_{x_i}\Phi\,\frac{\partial f}{\partial v_i} = \frac{\partial f}{\partial t} + \mathbf{v}\cdot\nabla f - \nabla\Phi\cdot\frac{\partial f}{\partial \mathbf{v}} = 0. \qquad (4.3)$$

This is the collisionless Boltzmann equation. It can be written simply as

$$D_t f = 0, \qquad (4.4)$$

where $D_t$ is a notation for the "Lagrangian" or advective rate of change of $f$. This equation means that if we follow the trajectory of a (real or imaginary) particle in the phase-space, the number density around it does not change. In other words, the flow in phase-space is incompressible.
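The incompressibility of the phase-space flow can be illustrated in a one-dimensional toy problem. The sketch below is not from the text: it assumes a harmonic potential $\Phi = \omega^2 x^2/2$, integrates the three corners of a small phase-space triangle with a kick-drift-kick leapfrog scheme, and compares the area of the triangle before and after. All function names are hypothetical.

```python
def leapfrog_step(x, v, dt, omega=1.0):
    """One kick-drift-kick step for Phi = omega^2 x^2 / 2 (assumed test potential)."""
    v = v - 0.5 * dt * omega**2 * x     # half kick: dv/dt = -dPhi/dx
    x = x + dt * v                      # drift:     dx/dt = v
    v = v - 0.5 * dt * omega**2 * x     # half kick
    return x, v

def triangle_area(p, q, r):
    """Area of the phase-space triangle with vertices p, q, r = (x, v)."""
    return 0.5 * abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1]))

def area_ratio(n_steps=1000, dt=0.01):
    """Final over initial area of a small triangle advected by the flow."""
    points = [(1.0, 0.0), (1.01, 0.0), (1.0, 0.01)]
    area0 = triangle_area(*points)
    for _ in range(n_steps):
        points = [leapfrog_step(x, v, dt) for x, v in points]
    return triangle_area(*points) / area0
```

Because each kick and each drift is a shear of the $(x, v)$ plane with unit determinant, the ratio stays equal to 1 up to round-off, which is the discrete counterpart of $D_t f = 0$ for this toy flow.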
We note that there is an equation which is equivalent but more general (and of less practical use) for the distribution function in the $N_*$-particle phase-space, in which a point represents all the positions and velocities of the $N_*$ bodies of the system. It is Liouville's theorem (BT87, Sect. 8.2). The collisionless Boltzmann equation follows from Liouville's theorem and the assumptions that the number of particles is very large and that there are no two-particle correlations. In other words, the probability of finding particle 1 at $\mathbf{w}_1$ and particle 2 at $\mathbf{w}_2$ is simply given by the product $f(\mathbf{w}_1, t)\,f(\mathbf{w}_2, t)\,d^6\mathbf{w}_1\,d^6\mathbf{w}_2$ (BT87, Sect. 8.3). While the first approximation is certainly valid in many astrophysical situations, such as galaxies and globular clusters (but see comments below about multi-component systems), the second is violated by two-body effects such as mutual deflections or the existence of small bound sub-groups, in particular binaries. In fact, as long as they do not interact closely with other objects and are themselves numerous enough, binaries can in principle be treated as just a special component for which a particle is really a binary. Two-particle effects such as deflection due to close encounters are called collisional effects, and the Fokker–Planck treatment described below is an approximate but manageable way to take them into account.

The Boltzmann equation is valid whether $f$ is interpreted as a number, mass, luminosity or probability density. The distribution function $f$ does not need to represent a system of objects with identical physical properties (stellar masses, radii, etc.) but may be used globally for a mixed population. As long as all sub-populations share the same $f_0$, or if we are not interested in distinguishing between them, and the system is collisionless, a unique $f$ is enough to describe the system and its evolution. If there are different sub-populations with initially distinct distribution functions (as would be the case for a globular cluster with primordial mass segregation), each population (index $\alpha$) can be assigned its own DF $f_\alpha$. In the absence of collisional terms, the only coupling between the evolution of the various $f_\alpha$ is through the fact that they move in the same global potential $\Phi$, to which each component contributes unless it is treated as a mass-less tracer. Specifically, $\Phi$ is obtained from the $f_\alpha$'s and a possible external potential $\Phi_{\rm ext}$ through the Poisson equation
$$\Phi(\mathbf{x}) = \Phi_{\rm self} + \Phi_{\rm ext} \quad\text{with}\quad \nabla^2\Phi_{\rm self} = 4\pi G \sum_{\alpha=1}^{N_{\rm comp}} \underbrace{M_\alpha \int d^3\mathbf{v}\, f_\alpha(\mathbf{x}, \mathbf{v})}_{\rho_\alpha}, \qquad (4.5)$$

where $N_{\rm comp}$ is the number of components and $M_\alpha$ the total mass in component $\alpha$ (with the normalisation $\int d^3\mathbf{v}\,d^3\mathbf{x}\,f_\alpha = 1$). In the following we will generally assume a fully self-gravitating system, $\Phi(\mathbf{x}) = \Phi_{\rm self}$.

Because the Boltzmann equation simply states conservation of the phase-space density along physical trajectories, it keeps the same form if another coordinate system is used instead of the Cartesian $(x, y, z)$, as long as $f$ still represents the number density per unit volume of the $(x, y, z, v_x, v_y, v_z)$ phase-space.
4.2.3 Collision Terms
When particles are subject to forces other than those produced by the smooth $\Phi$, the convective derivative of $f$ does not vanish anymore. In particular, in a real self-gravitating $N$-particle system, the potential cannot be smooth on small scales. Instead it exhibits some graininess, i.e. short-term, small-scale fluctuations: $\Phi_{\rm real} = \Phi + \Delta\Phi_{\rm grainy}$. Here I call relaxation the effects of these fluctuations on the evolution of the system described by $f$. Schematically, they are due to the fact that a given particle does not see the rest of the system as a smooth mass distribution but as a collection of point-masses. Relaxational effects, also known (somewhat confusingly) as collisional effects, can therefore be seen as particles influencing each other individually, as opposed to collectively. To allow for these effects, a right-hand collision term $\Gamma$ has to be introduced into the Boltzmann equation:

$$D_t f = \Gamma[f]. \qquad (4.6)$$

We now develop an expression for $\Gamma$. Let $\Psi(\mathbf{w}, \Delta\mathbf{w})\,d^6(\Delta\mathbf{w})\,dt$ be the probability that a particle at the phase-space position $\mathbf{w}$ is perturbed (through forces not derived from $\Phi$) to $\mathbf{w} + \Delta\mathbf{w}$ during $dt$. In general, $\Psi$ is also a function of $t$, but I drop this dependence here to simplify notation. Stars are scattered out of an element of phase space around $\mathbf{w}$ at a rate

$$\Gamma_- = -f(\mathbf{w})\int d^6(\Delta\mathbf{w})\,\Psi(\mathbf{w}, \Delta\mathbf{w}), \qquad (4.7)$$

while stars from other phase-space positions $(\mathbf{w} - \Delta\mathbf{w})$ are scattering into this element at a rate

$$\Gamma_+ = \int d^6(\Delta\mathbf{w})\,f(\mathbf{w} - \Delta\mathbf{w})\,\Psi(\mathbf{w} - \Delta\mathbf{w}, \Delta\mathbf{w}). \qquad (4.8)$$

The collision term is thus $\Gamma = \Gamma_+ + \Gamma_-$, and the Boltzmann equation with such a collision term is called the master equation.
4.3 Fokker–Planck Equation

4.3.1 Fokker–Planck Equation in Position-Velocity Space
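Theoretically, the master equation is of very general applicability because very few simplifying assumptions have been made so far. Unfortunately, it is of little practical use unless some explicit expression for the transition probability $\Psi$ is known. The Fokker–Planck treatment is based on the assumption that $\Psi$ is sufficiently smooth and that typical changes $\Delta\mathbf{w}$ are small. We can then develop $\Psi$ and $f$ around $\mathbf{w}$ in a Taylor series to second order in $\Delta\mathbf{w}$. Specifically, in the term $\Gamma_+$ we write

$$\begin{aligned}
f(\mathbf{w} - \Delta\mathbf{w})\,\Psi(\mathbf{w} - \Delta\mathbf{w}, \Delta\mathbf{w}) ={}& f(\mathbf{w})\,\Psi(\mathbf{w}, \Delta\mathbf{w}) - \sum_{i=1}^{6}\Delta w_i\frac{\partial}{\partial w_i}\left[\Psi(\mathbf{w}, \Delta\mathbf{w})\,f(\mathbf{w})\right] \\
&+ \frac{1}{2}\sum_{i,j=1}^{6}\Delta w_i\,\Delta w_j\frac{\partial^2}{\partial w_i\,\partial w_j}\left[\Psi(\mathbf{w}, \Delta\mathbf{w})\,f(\mathbf{w})\right] + O((\Delta\mathbf{w})^3).
\end{aligned} \qquad (4.9)$$

Defining the diffusion coefficients (DCs)

$$\begin{aligned}
\langle\Delta w_i\rangle &\equiv \int d^6(\Delta\mathbf{w})\,\Delta w_i\,\Psi(\mathbf{w}, \Delta\mathbf{w}), \\
\langle\Delta w_i\,\Delta w_j\rangle &\equiv \int d^6(\Delta\mathbf{w})\,\Delta w_i\,\Delta w_j\,\Psi(\mathbf{w}, \Delta\mathbf{w}),
\end{aligned} \qquad (4.10)$$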
and plugging the expansion (4.9) into the collision term of the master equation, we obtain the general Fokker–Planck (FP) equation

$$D_t f = -\sum_{i=1}^{6}\frac{\partial}{\partial w_i}\left[f(\mathbf{w})\langle\Delta w_i\rangle\right] + \frac{1}{2}\sum_{i,j=1}^{6}\frac{\partial^2}{\partial w_i\,\partial w_j}\left[f(\mathbf{w})\langle\Delta w_i\,\Delta w_j\rangle\right]. \qquad (4.11)$$

Here $\langle\Delta w_i\rangle$ is the mean change in $w_i$ per unit time due to collisional effects. These diffusion coefficients are generally functions of $\mathbf{w}$ and $t$, but I have not written these dependencies explicitly.
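To see how well the second-order expansion reproduces the master equation, one can compare the two collision terms in a one-dimensional toy model. The following sketch is an illustration under assumptions not made in the text: a single phase-space coordinate $w$, a Gaussian $f$, and a transition probability that allows only jumps of $+h$ (rate `r_plus`) and $-h$ (rate `r_minus`), independent of $w$. All names are hypothetical.

```python
import numpy as np

def gaussian(w):
    """Unit-variance Gaussian used as the toy distribution function."""
    return np.exp(-0.5 * w**2) / np.sqrt(2.0 * np.pi)

def master_rhs(w, h, r_plus, r_minus):
    """Gamma_+ + Gamma_- of (4.7)-(4.8) for jumps of +h and -h."""
    gain = r_plus * gaussian(w - h) + r_minus * gaussian(w + h)
    loss = (r_plus + r_minus) * gaussian(w)
    return gain - loss

def fokker_planck_rhs(w, h, r_plus, r_minus):
    """1D analogue of (4.11) with w-independent diffusion coefficients."""
    d1 = (r_plus - r_minus) * h         # <Delta w>
    d2 = (r_plus + r_minus) * h**2      # <(Delta w)^2>
    f = gaussian(w)
    df = -w * f                         # analytic first derivative of f
    d2f = (w**2 - 1.0) * f              # analytic second derivative of f
    return -d1 * df + 0.5 * d2 * d2f

def relative_mismatch(h, r_plus=3.0, r_minus=1.0):
    """Largest difference between the two collision terms, relative to the exact one."""
    w = np.linspace(-5.0, 5.0, 2001)
    exact = master_rhs(w, h, r_plus, r_minus)
    approx = fokker_planck_rhs(w, h, r_plus, r_minus)
    return float(np.max(np.abs(exact - approx)) / np.max(np.abs(exact)))
```

In this toy model the mismatch is controlled by the neglected $O((\Delta w)^3)$ terms, so it shrinks as the jump size $h$ becomes small compared with the scale on which $f$ varies.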
Now, in the case of stellar dynamics, we identify the collisional changes $\Delta\mathbf{w}$ with the effect of Keplerian, hyperbolic, uncorrelated two-body encounters and assume that they occur instantaneously, i.e. on a time-scale much shorter than the dynamical time-scale $t_{\rm dyn} \equiv R_{\rm cl}^{3/2}(GM_{\rm cl})^{-1/2}$, where $M_{\rm cl}$ is the total mass of the system and $R_{\rm cl}$ is some typical length scale such as the half-mass radius. In this local approximation, we neglect the change in position and only consider changes in velocity. This means that $\Psi(\mathbf{w}, \Delta\mathbf{w}) = 0$ if $\Delta\mathbf{x} \neq 0$, and the Fokker–Planck equation reads

$$D_t f = -\sum_{i=1}^{3}\frac{\partial}{\partial v_i}\left[f(\mathbf{x}, \mathbf{v})\langle\Delta v_i\rangle\right] + \frac{1}{2}\sum_{i,j=1}^{3}\frac{\partial^2}{\partial v_i\,\partial v_j}\left[f(\mathbf{x}, \mathbf{v})\langle\Delta v_i\,\Delta v_j\rangle\right]. \qquad (4.12)$$
4.3.2 Diffusion Coefficients and Approximations for Relaxation
Let us sketch the computation of the velocity diffusion coefficients. In practice, we do not need to compute the transition probability $\Psi$. Instead, we use the fact that, for instance, $\langle\Delta v_i\rangle$ is the mean rate of change of the component $i$ of the velocity of a given particle (called the test particle) as it is perturbed by all other particles (the field particles). To carry out the computations, we have to adopt the following set of approximations, usually referred to as the "Chandrasekhar theory of relaxation" (Chandrasekhar 1943, 1960; see, for instance, Hénon 1973; Saslaw 1985; Spitzer 1987; Binney & Tremaine 1987; Heggie & Hut 2003):

1. Local approximation. The collisional perturbations to the motion of the test particle are assumed to take place on a scale much smaller than the size of its orbit. Formally, this holds if perturbations from distant stars with a long time-scale are negligible.
2. Small perturbations approximation. We assume that on time-scales of order $t_{\rm dyn}$ (or shorter), the "collisions" produce only a small change in the orbital parameters of a particle; for the diffusion coefficients, this translates into $t_{\rm dyn}\langle\Delta v_i\rangle \ll v$, $t_{\rm dyn}\langle\Delta v_i\,\Delta v_j\rangle \ll v^2$. This is an extension of the FP approximation, which will make it possible to average the FP equation over the orbit of stars. Most importantly for the time being, it justifies the assumption that perturbations are two-body effects only and that they add linearly. In other words, to this level of approximation, the combined effect of two field particles on a test particle is the same as the sum of the effects of each taken independently. In particular, the interaction between both field particles can be neglected. Hence we are only considering the so-called two-body relaxation. This simplification only holds if perturbations from very close stars (leading to large changes in $\mathbf{v}$) are negligible.
3. Homogeneity approximation. This is sometimes considered part of the local approximation. We assume that the cumulative effects of the perturbations on the test object are as if the properties of the field particles (density, velocity distribution) were the same in the whole system and equal to what they are in the vicinity of the test object. In other words, the local conditions are representative of the global ones. This arguably looks like an unjustified assumption, given how heterogeneous stellar systems are (for instance, the density in a globular cluster or galactic nucleus decreases by many orders of magnitude from the centre to the half-mass radius) and the long-range, unshielded nature of the gravitational force. We will see as we proceed why it may be a reasonable simplification, but we note that it can only work if distant perturbations do not dominate.

To sum up, the standard theory of relaxation is based on the assumptions that relaxation can be reduced to the cumulative effects of a large number of uncorrelated two-body encounters that can be treated like (local) Keplerian small-angle hyperbolic velocity deflections due to objects with a density and velocity distribution identical to the local ones.

All these approximations are shared by other explicitly statistical methods used to follow the long-term evolution of stellar clusters, such as the Monte-Carlo scheme (see Chap. 5) and the gaseous model (Bettwieser & Spurzem 1986; Louis & Spurzem 1991; Giersz & Spurzem 1994; Spurzem & Takahashi 1995; Amaro-Seoane et al. 2004), but some approximations can be improved on. In particular, large velocity changes (due to close encounters) can be included (Goodman 1983a; Freitag et al. 2006a).
To compute the diffusion coefficients, we start by looking at the hyperbolic Keplerian encounter between the test particle, with velocity $\mathbf{v}$ and mass $m$, and a field particle with velocity $\mathbf{v}_{\rm f}$ and mass $m_{\rm f}$. We only consider field particles of a given mass, possibly different from $m$. Standard numerical methods based on the FP equation require that the mass spectrum is discretised. Hence we assume there are $N_{\rm f}$ particles of mass $m_{\rm f}$, described by the distribution function $f_{\rm f}$, now with the normalisation $\int d^3\mathbf{x}\,d^3\mathbf{v}\,f_{\rm f} = N_{\rm f}$.

Using the local approximation, we can assume that the encounter takes place in a vacuum. In other words, the orbits are straight lines at large separation ("infinity"). The relative velocity at infinity is $\mathbf{v}_{\rm rel} = \mathbf{v} - \mathbf{v}_{\rm f}$, and the velocity of the centre-of-mass (CM) of the pair is $\mathbf{v}_{\rm cm} = \mu\mathbf{v} + (1 - \mu)\mathbf{v}_{\rm f}$ with $\mu = m/(m + m_{\rm f})$. If $b$ is the impact parameter, the effect of the encounter is simply to rotate the relative velocity by an angle $\theta$ given by
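\[ \tan\left(\frac{\theta}{2}\right) = \frac{b_0}{b} \quad\text{with}\quad b_0 = \frac{G(m+m_{\rm f})}{v_{\rm rel}^2}. \tag{4.13} \]
The value $b_0$ is the impact parameter leading to a deflection angle $\pi/2$ (in the CM frame). We decompose the change of velocity $\Delta\mathbf{v}$ into components parallel and perpendicular to the initial relative velocity $\mathbf{v}_{\rm rel}$:
\[ \Delta v_\perp = 2(1-\mu)\,v_{\rm rel}\,\frac{b}{b_0}\left(1+\frac{b^2}{b_0^2}\right)^{-1}, \qquad \Delta v_\parallel = 2(1-\mu)\,v_{\rm rel}\left(1+\frac{b^2}{b_0^2}\right)^{-1}. \tag{4.14} \]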
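As an illustration of (4.13) and (4.14), the following minimal sketch evaluates the deflection angle and the two velocity-change components for a single encounter. The function names and the choice $G = 1$ are assumptions made for this example only; the returned values are magnitudes, as in (4.14).

```python
import math

G = 1.0  # gravitational constant in arbitrary units (an assumption of this sketch)

def b90(m, m_f, v_rel):
    """Impact parameter b0 that gives a pi/2 deflection in the CM frame, eq. (4.13)."""
    return G * (m + m_f) / v_rel**2

def deflection_angle(b, m, m_f, v_rel):
    """Rotation angle theta of the relative velocity, from tan(theta/2) = b0/b."""
    return 2.0 * math.atan2(b90(m, m_f, v_rel), b)

def delta_v(b, m, m_f, v_rel):
    """Magnitudes (dv_perp, dv_par) of the test-particle velocity change, eq. (4.14)."""
    mu = m / (m + m_f)
    x = b / b90(m, m_f, v_rel)
    common = 2.0 * (1.0 - mu) * v_rel / (1.0 + x * x)
    return common * x, common
```

A useful consistency check is that the two components combine to $|\Delta\mathbf{v}| = 2(1-\mu)\,v_{\rm rel}\sin(\theta/2)$, as expected for a pure rotation of the relative velocity.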
We then transform from the reference frame aligned with $\mathbf{v}_{\rm rel}$ (dependent on $\mathbf{v}_{\rm f}$) to the external frame to get the $\Delta v_i$'s. The next step is to average over all (equally probable) possible orientations of the impact parameter vector around the direction of $\mathbf{v}_{\rm rel}$. This gives values of $\langle\Delta v_i\rangle$ and $\langle\Delta v_i\Delta v_j\rangle$ for fixed $\mathbf{v}_{\rm f}$ and $b$. Now we sum the effects of all the encounters with field stars having this velocity. The number density of such objects is $f_{\rm f}\,d^3v_{\rm f}$ (considered independent of the position, owing to the homogeneity approximation) and the rate of encounters with an impact parameter between $b$ and $b+db$ is $2\pi b\,db\,v_{\rm rel}\,f_{\rm f}\,d^3v_{\rm f}$. We have to integrate over all possible impact parameters. This involves the integrals
\[
\begin{aligned}
\int_0^{b_{\max}} \Delta v_\parallel\,b\,db &= v_{\rm rel}(1-\mu)\,b_0^2\,\ln(1+\Lambda^2),\\
\int_0^{b_{\max}} (\Delta v_\parallel)^2\,b\,db &= 2v_{\rm rel}^2(1-\mu)^2\,b_0^2\left(1-\frac{1}{1+\Lambda^2}\right),\\
\int_0^{b_{\max}} (\Delta v_\perp)^2\,b\,db &= 2v_{\rm rel}^2(1-\mu)^2\,b_0^2\left(\ln(1+\Lambda^2)-1+\frac{1}{1+\Lambda^2}\right).
\end{aligned}
\tag{4.15}
\]
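In these relations, $\Lambda = b_{\max}/b_0$, where $b_{\max}$ is the ill-defined maximum impact parameter. For a system that is not too centrally concentrated, we can set $b_{\max} = R_{\rm cl}$. In most cases $\Lambda \gg 1$, so the integrals can be approximated by
\[
\begin{aligned}
\int_0^{b_{\max}} \Delta v_\parallel\,b\,db &\simeq 2v_{\rm rel}(1-\mu)\,b_0^2\,\ln\Lambda,\\
\int_0^{b_{\max}} (\Delta v_\parallel)^2\,b\,db &\simeq 0,\\
\int_0^{b_{\max}} (\Delta v_\perp)^2\,b\,db &\simeq 4v_{\rm rel}^2(1-\mu)^2\,b_0^2\,\ln\Lambda.
\end{aligned}
\tag{4.16}
\]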
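The closed forms of the impact-parameter integrals can be checked by direct quadrature. The sketch below is only an illustration: the Simpson integrator, the function names and the default parameter values ($v_{\rm rel} = b_0 = 1$, $\mu = 1/2$) are assumptions of this example, not part of the original treatment.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def encounter_integrals(lam, v_rel=1.0, mu=0.5, b0=1.0):
    """Numerical values of the three b-integrals for Lambda = lam = b_max / b0."""
    pref = 2.0 * (1.0 - mu) * v_rel
    dv_par = lambda b: pref / (1.0 + (b / b0) ** 2)
    dv_perp = lambda b: dv_par(b) * b / b0
    b_max = lam * b0
    return (simpson(lambda b: dv_par(b) * b, 0.0, b_max),
            simpson(lambda b: dv_par(b) ** 2 * b, 0.0, b_max),
            simpson(lambda b: dv_perp(b) ** 2 * b, 0.0, b_max))

def closed_forms(lam, v_rel=1.0, mu=0.5, b0=1.0):
    """Analytic results for the same three integrals, before taking Lambda >> 1."""
    l2 = lam * lam
    c = 2.0 * v_rel**2 * (1.0 - mu) ** 2 * b0**2
    return (v_rel * (1.0 - mu) * b0**2 * math.log(1.0 + l2),
            c * (1.0 - 1.0 / (1.0 + l2)),
            c * (math.log(1.0 + l2) - 1.0 + 1.0 / (1.0 + l2)))
```

Comparing `encounter_integrals(lam)` with `closed_forms(lam)` for increasing `lam` also shows how the second integral saturates while the other two keep growing logarithmically, which is the reason it is dropped in the large-$\Lambda$ limit.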
Hence, the cut-off $b_{\max}$ only enters the computation of the diffusion coefficients through the multiplicative Coulomb logarithm $\ln\Lambda$. Due to the very weak logarithmic dependency, we can replace $m$ and $m_{\rm f}$ in $b_0$ by the mean value $M_{\rm cl}/N_*$, and $v_{\rm rel}$ by the 1D velocity dispersion $\sigma_v$, measured for example at the half-mass radius, unless $\sigma_v$ is a very steep function of the position, such as around a massive black hole. Further, for a self-gravitating system in virial equilibrium, $\sigma_v^2 \approx GM_{\rm cl}/R_{\rm cl}$, so that $\Lambda$ must be of order $N_*$. Putting $\Lambda = \gamma_{\rm c}N_*$, direct $N$-body experiments indicate that $\gamma_{\rm c} \approx 0.1$ for single-mass systems and $\gamma_{\rm c} \approx 0.01$ (with considerable uncertainty) if objects have a realistic mass spectrum. (See Hénon 1975 for theoretical estimates and Giersz & Heggie 1994, 1996, amongst others, for the determinations based on $N$-body simulations.)
Although the above integrals are carried out from $b = 0$, remember that the FP approximation requires small changes in $\mathbf{v}$. This suggests that encounters with $b$ smaller than a few $b_0$ (causing deflection angles not small compared to $\pi/2$) cannot be taken into account. But truncating the integrations at $b_{\min} =$ a few $b_0$ would just bring in terms smaller than those in (4.16) by a factor $\ln\Lambda$. This is reflected by the fact that the typical time-scale for an encounter within $kb_0$, with $k$ some numerical coefficient, is
\[ t_{\rm la} = \left[n\sigma_v\pi(kb_0)^2\left(1+\frac{2Gm}{kb_0\sigma_v^2}\right)\right]^{-1} \approx \left(n\sigma_v\pi(kb_0)^2\right)^{-1} \approx \frac{\sigma_v^3}{k\,G^2m^2n}, \tag{4.17} \]
where $n$ is the number density, $\sigma_v$ the velocity dispersion and $m$ the (mean) mass of a particle. For $k \approx 1$, this large-angle deflection time-scale is of order $\ln\Lambda$ longer than the relaxation time (see (4.24)). However, from these considerations it does not follow that large-angle deflections cannot play an important role in some circumstances: while the standard two-body relaxation, by definition, leads to gradual changes in orbital properties, a single large-angle encounter causes sudden orbit modifications, which may have very different consequences. This may produce ejections or lead to strong interactions between stars and a central massive black hole in a galactic nucleus (Hénon 1960; Lin & Tremaine 1980; Freitag et al. 2006a; see also Chap. 5).
The contribution to the relaxation of encounters with $b$ between $b_1$ and $b_2$, with $b_2 > b_1 \gg b_0$, is proportional to $\ln(b_2/b_1)$. This explains why the structure of the stellar system at large distances from the test particle has little importance in practice. The average inter-particle distance is
\[ d \equiv n^{-1/3} = \left(\frac{m}{\rho}\right)^{1/3} \approx \left(\frac{mR_{\rm cl}^3}{M_{\rm cl}}\right)^{1/3} = N_*^{-1/3}R_{\rm cl}, \tag{4.18} \]
while $b_0 \approx N_*^{-1}R_{\rm cl}$. So, somewhat surprisingly, about two thirds of the contribution to two-body relaxation come from encounters with impact parameters smaller than $d$. This is why the homogeneity approximation is a good one.
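The two-thirds figure follows directly from the logarithmic weighting of impact parameters. A short worked version, assuming $b_{\max} = R_{\rm cl}$ and taking $b_0$ as the effective lower cut-off (both simplifications made for this estimate only):

```latex
\frac{\ln(d/b_0)}{\ln(R_{\rm cl}/b_0)}
  \approx \frac{\ln\left(N_*^{-1/3}R_{\rm cl} \,/\, N_*^{-1}R_{\rm cl}\right)}
               {\ln\left(R_{\rm cl} \,/\, N_*^{-1}R_{\rm cl}\right)}
  = \frac{\ln N_*^{2/3}}{\ln N_*}
  = \frac{2}{3}.
```

The result is independent of $N_*$ at this level of approximation; the remaining third comes from encounters with $d < b < R_{\rm cl}$.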
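Carrying out the computation of the diffusion coefficients using (4.16), we arrive at
\[
\begin{aligned}
\langle\Delta v_i\rangle &= 4\pi\ln\Lambda\,G^2m_{\rm f}(m+m_{\rm f})\,\frac{\partial h(\mathbf{v})}{\partial v_i},\\
\langle\Delta v_i\Delta v_j\rangle &= 4\pi\ln\Lambda\,G^2m_{\rm f}^2\,\frac{\partial^2 g(\mathbf{v})}{\partial v_i\,\partial v_j},
\end{aligned}
\tag{4.19}
\]
where $h(\mathbf{v})$ and $g(\mathbf{v})$ are the Rosenbluth potentials (Rosenbluth et al. 1957),
\[ h(\mathbf{v}) = \int d^3u\,f_{\rm f}(\mathbf{u})\,|\mathbf{v}-\mathbf{u}|^{-1} \quad\text{and}\quad g(\mathbf{v}) = \int d^3u\,f_{\rm f}(\mathbf{u})\,|\mathbf{v}-\mathbf{u}|. \tag{4.20} \]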
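Recall that all these quantities have an implicit $\mathbf{x}$-dependence.
If the velocity distribution is isotropic, we can go further in the computation of the diffusion coefficients for the velocity. We find (e.g. Spitzer 1987)
\[
\begin{aligned}
\langle\Delta v_\parallel\rangle &= -4\pi\lambda m_{\rm f}^2\left(1+\frac{m}{m_{\rm f}}\right)E_2^<(v),\\
\langle\Delta v_\perp\rangle &= 0,\\
\langle(\Delta v_\parallel)^2\rangle &= \frac{8\pi}{3}\lambda m_{\rm f}^2\,v\left(E_4^<(v)+E_1^>(v)\right),\\
\langle(\Delta v_\perp)^2\rangle &= \frac{8\pi}{3}\lambda m_{\rm f}^2\,v\left(3E_2^<(v)-E_4^<(v)+2E_1^>(v)\right),\\
\langle\Delta v_\parallel\Delta v_\perp\rangle &= 0,
\end{aligned}
\tag{4.21}
\]
where $\lambda \equiv 4\pi G^2\ln\Lambda$ and
\[ E_n^<(v) = \int_{u=0}^{v}\left(\frac{u}{v}\right)^n f_{\rm f}(u)\,du \quad\text{and}\quad E_n^>(v) = \int_{u=v}^{\infty}\left(\frac{u}{v}\right)^n f_{\rm f}(u)\,du. \tag{4.22} \]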
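We see that the mass of the test particle $m$ only appears in the coefficient $\langle\Delta v_\parallel\rangle$ for dynamical friction. From this, the diffusion coefficients for the energy can be computed using $\Delta E = v\,\Delta v_\parallel + \frac{1}{2}(\Delta v_\perp)^2 + \frac{1}{2}(\Delta v_\parallel)^2$, which gives
\[
\begin{aligned}
\langle\Delta E\rangle &= 4\pi\lambda m_{\rm f}^2\,v\left(E_1^>(v)-\frac{m}{m_{\rm f}}E_2^<(v)\right),\\
\langle(\Delta E)^2\rangle &= \frac{8\pi}{3}\lambda m_{\rm f}^2\,v^3\left(E_4^<(v)+E_1^>(v)\right).
\end{aligned}
\tag{4.23}
\]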
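To make the isotropic coefficients concrete, the sketch below evaluates $E_n^<$ and $E_n^>$ for a Maxwellian field population and builds the velocity and energy coefficients from them. The Maxwellian choice, the Simpson integrator, the truncation of the upper integral at $12\sigma$ and all function names are assumptions of this example. The perpendicular coefficient is written with the combination $3E_2^< - E_4^< + 2E_1^>$, which is the one that reproduces $\langle\Delta E\rangle$ through $\Delta E = v\,\Delta v_\parallel + \frac12(\Delta v_\perp)^2 + \frac12(\Delta v_\parallel)^2$.

```python
import math

def simpson(func, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def maxwellian(u, n_dens=1.0, sigma=1.0):
    """Isotropic Maxwellian DF with number density n_dens and 1D dispersion sigma."""
    return n_dens * (2.0 * math.pi * sigma**2) ** -1.5 * math.exp(-0.5 * (u / sigma) ** 2)

def e_less(n, v, f=maxwellian):
    """E_n^<(v): integral of (u/v)^n f(u) du from 0 to v."""
    return simpson(lambda u: (u / v) ** n * f(u), 0.0, v)

def e_greater(n, v, f=maxwellian, u_max=12.0):
    """E_n^>(v): same integrand from v upwards, truncated at u_max (assumed >> sigma)."""
    return simpson(lambda u: (u / v) ** n * f(u), v, u_max) if v < u_max else 0.0

def coefficients(v, m, m_f, lam=1.0):
    """Return (<dv_par>, <dv_par^2>, <dv_perp^2>, <dE>); lam stands for 4 pi G^2 ln(Lambda)."""
    e2, e4, g1 = e_less(2, v), e_less(4, v), e_greater(1, v)
    k = math.pi * lam * m_f**2
    dv_par = -4.0 * k * (1.0 + m / m_f) * e2
    dv_par2 = 8.0 / 3.0 * k * v * (e4 + g1)
    dv_perp2 = 8.0 / 3.0 * k * v * (3.0 * e2 - e4 + 2.0 * g1)
    d_e = 4.0 * k * v * (g1 - m / m_f * e2)
    return dv_par, dv_par2, dv_perp2, d_e
```

With these definitions a heavy test particle (`m` much larger than `m_f`) has a negative mean energy change, which is the dynamical friction term, whereas a slow, light particle gains energy on average.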
We can write $E_n^{\gtrless} = \xi_n^{\gtrless}\,n\,\sigma_v^{-2}$, where $\xi_n^{\gtrless}$ are dimensionless, order-of-unity (and position-dependent) numbers, $n$ is the local number density of field stars and $\sigma_v$ their local 1D velocity dispersion. The time-scale $t_{\rm rlx}$ over which the direction of the velocity of a typical star (with $v = \bar{v} \equiv 3^{1/2}\sigma_v$) has changed completely due to relaxation can be estimated using (4.23) and the definition $\langle(\Delta v_\perp)^2\rangle_{\bar{v}}\,t_{\rm rlx} \equiv \sigma_v^2$. We find $t_{\rm rlx}^{-1} \approx \ln\Lambda\,G^2m_{\rm f}^2\,n\,\sigma_v^{-3}$. A conventional definition of the local relaxation time is obtained by assuming that the velocity distribution is isotropic and Maxwellian and using the mean stellar mass $m$ (Spitzer 1987):
\[ t_{\rm rlx} \equiv 0.339\,\frac{\sigma_v^3}{\ln\Lambda\,G^2m^2n}. \tag{4.24} \]
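The conventional relaxation time is easy to evaluate in astrophysical units. The sketch below is a hedged illustration: the unit system (km/s, pc, solar masses, Myr), the rounded conversion constants and the function names are choices made for this example, and the Coulomb logarithm is taken as $\ln(\gamma_{\rm c}N_*)$ with $\gamma_{\rm c}$ supplied by the user.

```python
import math

G_PC_KMS = 4.30091e-3       # G in pc (km/s)^2 / Msun (rounded value)
PC_PER_KMS_IN_MYR = 0.9778  # 1 pc / (1 km/s) expressed in Myr (rounded value)

def coulomb_log(n_stars, gamma_c=0.1):
    """ln(Lambda) with Lambda = gamma_c * N_*; gamma_c ~ 0.1 is the single-mass estimate."""
    return math.log(gamma_c * n_stars)

def t_relax_myr(sigma_kms, mean_mass_msun, n_per_pc3, ln_lambda):
    """Local relaxation time 0.339 sigma^3 / (ln(Lambda) G^2 m^2 n), returned in Myr."""
    t = 0.339 * sigma_kms**3 / (ln_lambda * G_PC_KMS**2 * mean_mass_msun**2 * n_per_pc3)
    return t * PC_PER_KMS_IN_MYR  # convert from pc/(km/s) to Myr
```

For example, `t_relax_myr(10.0, 1.0, 1.0e4, coulomb_log(1.0e6))` gives the local relaxation time for a dispersion of 10 km/s and a density of $10^4$ stars per cubic parsec; the steep $\sigma_v^3$ dependence means that modest changes in the dispersion shift the result by large factors.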
In the case of a system with objects of different masses, the relaxational effect of a species $\alpha$ is proportional to $n_\alpha m_\alpha^2$ rather than to its density (e.g. Perets et al. 2007). On the other hand, dynamical friction, corresponding to the second, negative term for $\langle\Delta E\rangle$ (see (4.23)), acts at a rate proportional to $\rho = mn$, the total mass density of the field, irrespective of the individual masses of the stars (for more on dynamical friction, see Chap. 7).
This is as far as we can go without further restriction on the distribution function $f_{\rm f}$. If there is a single species of particles, $f_{\rm f} = f$, and the FP equation, consisting of (4.12) with the above diffusion coefficients (4.19), together with the Poisson equation, determines the evolution of the DF in a self-contained way. Unfortunately, the FP equation is a very intricate integro-differential equation which, at this point, cannot be solved in full generality.
Furthermore, realistic stellar systems are composed of objects with a range of properties (in particular, masses). We can assume that there is a discrete set of populations orbiting in their common total potential and influencing each other through two-body relaxation. Each component $k$ is described by a DF $f_k$, which follows an FP equation, but the diffusion coefficients are now a sum of contributions from each component:
\[ \langle\Delta v_i\rangle_k = 4\pi\ln\Lambda\,G^2 \sum_{l=1}^{N_{\rm comp}}\left[m_l(m_k+m_l)\,\frac{\partial}{\partial v_i}\left(\int d^3u\,f_l(\mathbf{u})\,|\mathbf{v}-\mathbf{u}|^{-1}\right)\right]. \tag{4.25} \]
4.4 Orbit-Averaged Fokker–Planck Equation
4.4.1 General Considerations
To go further and obtain more easily usable versions of the FP equation, we need to restrict ourselves to stellar systems that are spherically symmetric in all their properties.^1 The use of the FP equation to study the structure and evolution of stellar clusters was pioneered by Hénon (1961), who derived the FP equation for an isotropic (but multi-mass) cluster and found an analytical self-similar solution for the single-mass case, assuming the existence of a central energy source. The first numerical codes producing general time-dependent solutions were written by Cohn (1979, 1980), and to this day most of the work in this field is based on the formalism and numerical methods developed by this author (but see Takahashi 1995, and references therein, for a finite-element scheme to solve the FP equation based on a variational principle).
^1 This does not imply that the velocity distribution is isotropic (meaning that it is spherically symmetric in velocity space), but that the local velocity distribution depends only on the moduli of the components of the velocity parallel and perpendicular to the radius-vector.
The FP equation can also be used for systems with axial symmetry, such as globular clusters or galactic nuclei with global rotation, but we will not treat this approach here (see Goodman 1983b; Einsel 1996; Einsel & Spurzem 1999; Kim et al. 2002, 2004; Fiestas 2006; Fiestas et al. 2006; Kim et al. 2008 for this original line of research under active development).
We also assume that the stellar system is in (quasi-)dynamical equilibrium. In other words, it evolves very little over dynamical timescales:
\[ \left|f/\dot{f}\right| \gg t_{\rm dyn}. \]
If evolution is only due to two-body relaxation and the system is fully self-gravitating, this assumption holds provided $N_*$ is sufficiently large, because $\left|f/\dot{f}\right| \approx t_{\rm rlx} \approx N_*(\ln\Lambda)^{-1}t_{\rm dyn}$ with $\ln\Lambda = \ln(\gamma_{\rm c}N_*) \approx 5$–$15$. For single-mass systems with $N_* \lesssim 10^3$, the distinction between dynamical and relaxational effects (or between the smooth and grainy parts of the potential) becomes blurred. When stars have a broad mass spectrum, a larger number of stars is required for a clear distinction between dynamical and relaxational regimes.
From Jeans' theorem (Jeans 1915; Merritt 1999), for a spherical system in dynamical equilibrium, the DF $f$ can depend on the phase-space coordinates $(\mathbf{x},\mathbf{v})$ only through the (specific) orbital energy $E$ and the modulus of the angular momentum $J$:
\[ f(\mathbf{x},\mathbf{v}) = F(E(\mathbf{x},\mathbf{v}), J(\mathbf{x},\mathbf{v})) \quad\text{with}\quad E = \phi(r)+\tfrac{1}{2}v^2, \quad J = r\,v_{\rm t}, \tag{4.26} \]
where $r = |\mathbf{x}|$ and $v = |\mathbf{v}|$ in a system of reference centred on the cluster centre,^2 $\phi$ is the spherically symmetric smooth gravitational potential, so that $\Phi(\mathbf{x}) = \phi(r)$, and $v_{\rm t}$ is the modulus of the component of the velocity perpendicular to the radius-vector $\mathbf{x}$.
4.4.2 Isotropic Spherical Cluster
We first consider the simpler case of a cluster with isotropic velocity dispersion, where $F$ depends on $E$ only. We also assume only one component. Let $N(E)\,dE$ be the number of stars with energy between $E$ and $E+dE$. The transformation from $F$ to $N$ is found by integrating over the phase-space accessible to orbits with energy between $E$ and $E+\delta E$, and then letting $\delta E$ be an infinitesimal, $\delta E \to dE$:
\[ N(E)\,\delta E = \int_{[E,E+\delta E]} d^3x\,d^3v\,F(E) = 16\pi^2\int_r dr\,r^2\left[\int_v dv\,v^2F(E)\right]. \tag{4.27} \]
^2 I use the word "cluster" to designate all (spherically) symmetric stellar systems, including galactic nuclei.
We bring $F(E)$ out of the integrals because it is nearly constant in the integration domain (by definition). We first carry out the $v$-integration at fixed $r$, which runs from $v = \sqrt{2(E-\phi(r))}$ to $v+\delta v$ with $\delta v \simeq \delta E/v$, giving $\int_v dv\,v^2 \simeq \sqrt{2(E-\phi(r))}\,\delta E$. There remains the integration over $r$, which runs from 0 to $r_{\max}(E)$, defined such that $\phi(r_{\max}) = E$. We neglect the small part of the integration domain with $r$ between $r_{\max}(E)$ and $r_{\max}(E+\delta E)$ because its contribution is of higher order in $\delta E$. Once we replace $\delta E$ by $dE$, we find
\[ N(E) = 16\pi^2\,p(E)\,F(E) \tag{4.28} \]
with
\[ p(E) = \int_0^{r_{\max}} r^2v\,dr = \int_0^{r_{\max}} r^2\sqrt{2(E-\phi(r))}\,dr. \tag{4.29} \]
Note that the quantity $p(E)$ is proportional to the radial orbital period averaged in $J$ space (isotropised orbital period):
\[ p(E) = \frac{1}{4}\int_0^{J_{\rm c}^2(E)} d(J^2)\,P_{\rm orb}(E,J) \quad\text{with}\quad P_{\rm orb}(E,J) = 2\int_{r_{\min}}^{r_{\max}}\frac{dr}{v_r}, \tag{4.30} \]
where $J_{\rm c}(E)$ is the angular momentum of a circular orbit of energy $E$.
We could transform the FP equation in $(\mathbf{x},\mathbf{v})$-space (4.12) into an equation for the rate of change of $N(E)$, but it is much simpler to start over from scratch. The collisional term of an FP equation for $N(E)$ simply reads
\[ \left.\frac{dN}{dt}\right|_{\rm coll} = -\frac{\partial}{\partial E}\left[\overline{\Delta E}\,N(E)\right] + \frac{1}{2}\frac{\partial^2}{\partial E^2}\left[\overline{(\Delta E)^2}\,N(E)\right]. \tag{4.31} \]
Here, the computation of the diffusion coefficients involves averaging over the volume of space accessible to a particle of energy $E$ (written here with an overbar), reflecting the transformation from $F(E)$ to $N(E)$, (4.28) and (4.29):
\[ \overline{\Delta E} = p(E)^{-1}\int_0^{r_{\max}} r^2v\,\langle\Delta E\rangle\,dr, \tag{4.32} \]
where $\langle\Delta E\rangle$ is the local diffusion coefficient for the kinetic energy, in other words the mean rate of change of $\frac{1}{2}v^2$ for a particle at position $r$ with velocity $v = \sqrt{2(E-\phi(r))}$.
The smooth potential $\phi$ may change slowly, as a result of the relaxational evolution of the cluster itself or because of an external influence. In any case, this will induce a change in the energy not accounted for by the collisional term (4.31). So, if we write $D_tN(E)$ for the "Lagrangian" rate of change of density in energy space, following the $\phi$-induced change in $E$, we obtain the left-hand side of the FP equation for $N(E)$:
\[ D_tN(E) = \frac{\partial N}{\partial t} + \frac{\partial N}{\partial E}\left.\frac{dE}{dt}\right|_\phi = \left.\frac{dN}{dt}\right|_{\rm coll}, \tag{4.33} \]
where $dE/dt|_\phi$ is the change in energy due to the evolution of the potential. It can be shown that it is equal to the phase-space averaged value of $\partial\phi/\partial t$:
\[ \left.\frac{dE}{dt}\right|_\phi = p(E)^{-1}\int_0^{r_{\max}}\frac{\partial\phi(r)}{\partial t}\,r^2v\,dr. \tag{4.34} \]
We see that the FP equation for $N(E)$, as well as its generalisation to the anisotropic case (see Sect. 4.4.3), are orbit-averaged. Again, the condition for this averaging to be valid is that the system evolves only very little over one dynamical time, staying close to dynamical equilibrium.
To solve the FP equation numerically, it is usual to write it in a flux-conservation form:
\[ D_tN(E) = -\frac{\partial\mathcal{F}_E}{\partial E} \quad\text{with}\quad \mathcal{F}_E = m\,D_E\,F - D_{EE}\,\frac{\partial F}{\partial E}. \tag{4.35} \]
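Using (4.23), it can be shown that the flux coefficients are
\[
\begin{aligned}
D_E &= 16\pi^3\lambda\,m_{\rm f}\int_{\phi(0)}^{E} dE'\,p(E')\,F_{\rm f}(E'),\\
D_{EE} &= 16\pi^3\lambda\,m_{\rm f}^2\left[q(E)\int_E^0 dE'\,F_{\rm f}(E') + \int_{\phi(0)}^{E} dE'\,q(E')\,F_{\rm f}(E')\right],
\end{aligned}
\tag{4.36}
\]
where
\[ q(E) = \int_{\phi(0)}^{E} dE'\,p(E') = \frac{1}{3}\int_0^{r_{\max}} r^2v^3\,dr. \tag{4.37} \]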
Here $q(E)$ is the volume of phase-space accessible to particles with energies lower than $E$, and $p(E)$ is the area of the hypersurface bounding this volume, that is, $p(E) = \partial q/\partial E$ (Goodman 1983a). $q(E)$ is also proportional to the isotropised radial action:
\[ q(E) = \frac{1}{4}\int_0^{J_{\rm c}^2(E)} d(J^2)\,Q(E,J) \quad\text{with}\quad Q(E,J) = 2\int_{r_{\min}}^{r_{\max}} dr\,v_r. \tag{4.38} \]
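The relation $p(E) = \partial q/\partial E$ can be verified numerically for a concrete potential. The sketch below uses a Plummer potential in units $G = M = a = 1$ and a midpoint rule; the choice of model, the quadrature and the function names are assumptions of this example rather than anything prescribed by the formalism.

```python
import math

G, M, A = 1.0, 1.0, 1.0  # Plummer model with G = M = a = 1 (assumed units)

def phi(r):
    """Plummer potential."""
    return -G * M / math.sqrt(r * r + A * A)

def r_max(energy):
    """Largest radius reachable at this energy, phi(r_max) = E, for phi(0) < E < 0."""
    return math.sqrt((G * M / energy) ** 2 - A * A)

def _shell_integral(energy, power, n=20000):
    """Midpoint-rule value of the integral of r^2 v^power dr from 0 to r_max(E)."""
    h = r_max(energy) / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        v2 = 2.0 * (energy - phi(r))  # v^2 at radius r for this energy
        total += r * r * v2 ** (0.5 * power)
    return total * h

def p_of_e(energy):
    """p(E): integral of r^2 v dr."""
    return _shell_integral(energy, 1)

def q_of_e(energy):
    """q(E): one third of the integral of r^2 v^3 dr."""
    return _shell_integral(energy, 3) / 3.0
```

A centred finite difference, `(q_of_e(E + dE) - q_of_e(E - dE)) / (2 * dE)`, should then agree with `p_of_e(E)` to within the quadrature error.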
We have used an index "f" for "field" to distinguish the mass and DF of the population we follow (test stars) from the "field" objects. This distinction does not apply to a single-component system, but makes it very easy to generalise to a multi-component situation by summing over components to get the total flux coefficients:
\[ D_E = \sum_{l=1}^{N_{\rm comp}} D_{E,l}, \qquad D_{EE} = \sum_{l=1}^{N_{\rm comp}} D_{EE,l}, \tag{4.39} \]
where the flux coefficient for component $l$ can be written by replacing the subscript "f" by "$l$" in (4.36) (e.g. Murphy & Cohn 1988).
We now explain schematically how the FP equation is used numerically to follow the evolution of star clusters. A more detailed description can be found in, for example, Chernoff & Weinberg (1990). In the most common scheme, pioneered by Cohn (1980), two types of steps are carried out in alternation.
1. Diffusion step. The change in the distribution function $F$ for a discrete time step $\Delta t$ is computed by use of the FP equation, assuming the potential $\phi$ is fixed, i.e. setting $D_tN = \partial N/\partial t = \partial N/\partial t|_{\rm coll}$. The FP equation, written as a flux-conserving equation, is discretised on an energy grid. The flux coefficients are computed using the DF(s) of the previous step; this makes the equations linear in the values of $F$ on the grid points. The finite-difference scheme is the implicit Chang & Cooper (1970) algorithm, which is first-order in time and energy.
2. Poisson step. Now the change of potential resulting from the modification in $F$ is computed, and $F$ is modified to account for the term $dE/dt|_\phi$, assuming $D_tN = \partial N/\partial t + (\partial N/\partial E)\,dE/dt|_\phi = 0$. This is done implicitly by using the fact that, as long as the change in $\phi$ over $\Delta t$ is very small, the actions of each orbit are adiabatic invariants. Hence, during the Poisson step, the distribution function, expressed as a function of the actions, does not change. Using the isotropised radial action $q(E)$ defined above, $F(q)\,dq = F(E)\,p(E)\,dE$ with $F(q) = F(E(q))$. In other words, the modified $F(E)$ is obtained by recomputing the relation $q(E)$ in the modified potential. In practice, an iterative scheme is used to compute the modified potential, determined implicitly by the modified DF through the relation
\[ \phi(r) = -4\pi G\left[\frac{1}{r}\int_0^r ds\,s^2\rho(s) + \int_r^\infty ds\,s\,\rho(s)\right] \tag{4.40} \]
with
\[ \rho(r) = 4\pi m\int_{\phi(r)}^{E_{\max}} dE\,\sqrt{2(E-\phi(r))}\,F(E) \tag{4.41} \]
for one component. The iteration is started with the values of $\phi$, $\rho$, etc. computed before the previous diffusion step.
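One ingredient of the Poisson step, the evaluation of $\phi(r)$ from a tabulated density, can be sketched as follows. This is only an illustration of the quadrature in the $\phi$–$\rho$ relation, not the iterative scheme of any particular FP code: the trapezoidal rule, the uniform grid, the neglect of density beyond the last grid point and the Plummer test density are all assumptions of this example.

```python
import math

G = 1.0  # gravitational constant in model units (assumed)

def potential_from_density(r_grid, rho):
    """Potential on an increasing radial grid (r_grid[0] > 0) from the tabulated density.

    Uses cumulative trapezoidal sums for the inner and outer integrals and
    neglects any density beyond r_grid[-1].
    """
    n = len(r_grid)
    inner = [0.0] * n  # integral of s^2 rho ds from 0 to r
    outer = [0.0] * n  # integral of s rho ds from r to r_grid[-1]
    inner[0] = rho[0] * r_grid[0] ** 3 / 3.0  # constant density inside the first point
    for i in range(1, n):
        dr = r_grid[i] - r_grid[i - 1]
        inner[i] = inner[i - 1] + 0.5 * dr * (
            r_grid[i] ** 2 * rho[i] + r_grid[i - 1] ** 2 * rho[i - 1])
    for i in range(n - 2, -1, -1):
        dr = r_grid[i + 1] - r_grid[i]
        outer[i] = outer[i + 1] + 0.5 * dr * (
            r_grid[i + 1] * rho[i + 1] + r_grid[i] * rho[i])
    return [-4.0 * math.pi * G * (inner[i] / r_grid[i] + outer[i]) for i in range(n)]

def plummer_density(r, mass=1.0, a=1.0):
    """Plummer density profile, used here only as a test case."""
    return 3.0 * mass / (4.0 * math.pi * a**3) * (1.0 + (r / a) ** 2) ** -2.5
```

Feeding the Plummer density on a grid extending to a few hundred scale radii should recover $\phi(r) = -GM/\sqrt{r^2+a^2}$ to good accuracy. In an actual Poisson step the density would instead be recomputed from the DF at each iteration until $\phi$ and $\rho$ are mutually consistent.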
4.4.3 Anisotropic Spherical Cluster
The anisotropic FP treatment was already used to study some aspects of the structure of globular clusters by Spitzer & Shapiro (1972). This type of approach was then applied to the distribution of stars around a massive black hole (assuming $\phi = -GM_{\rm BH}/r$, where $M_{\rm BH}$ is the mass of the black hole) by Lightman & Shapiro (1977) and Cohn & Kulsrud (1978). Although the first self-consistent FP simulations by Cohn (1979) made use of an anisotropic code, further work on such models was relatively limited in comparison to the isotropic case, because the Chang & Cooper (1970) discretisation scheme, which proved so useful for getting good energy conservation when the DF depended only on $E$ (and $t$), has no exact equivalent for the case of a 2D $(E,J)$ dependence. Also, in most circumstances, it seems that forcing isotropy does not affect the results much and allows a substantial reduction in the computational burden. Cohn (1985) first presented results of anisotropic FP models based on an extension of the Chang–Cooper scheme. Since then, Takahashi (1995, 1996, 1997) and Drukier et al. (1999) have developed FP codes for spherical clusters with anisotropic velocity distributions.
Let $F(E(\mathbf{x},\mathbf{v}),J(\mathbf{x},\mathbf{v}))\,d^3x\,d^3v$ be the number of stars with position within a volume $d^3x$ around $\mathbf{x}$ and velocity within $d^3v$ around $\mathbf{v}$. Because of spherical symmetry, we can write $d^3x = 4\pi r^2dr$ and $d^3v = 4\pi v_{\rm t}\,dv_{\rm t}\,dv_r$. We note that $F(E,J) = 0$ if $J > J_{\rm c}(E)$. Let $N(E,J)\,dE\,dJ$ be the number of stars with energy between $E$ and $E+dE$ and angular momentum between $J$ and $J+dJ$. To convert from $F(E,J)$ to $N(E,J)$, we follow a star with energy $E$ and angular momentum $J$ on its orbit and integrate the volume of phase-space along the way. We use the distance from the centre, $r$, as integration variable:
\[ N(E,J)\,dE\,dJ = 4\pi\,F(E,J)\int_{r_{\min}(E,J)}^{r_{\max}(E,J)} r^2dr\,V_r(E,J)\,dE\,dJ. \tag{4.42} \]
Here $V_r(E,J)\,dE\,dJ$ denotes the (infinitesimal) volume in $\mathbf{v}$-space with energy between $E$ and $E+dE$ and angular momentum between $J$ and $J+dJ$, for a fixed $r$. We have
\[ V_r(E,J)\,dE\,dJ = 4\pi v_{\rm t}\,dv_{\rm t}\,dv_r = 4\pi v_{\rm t}\begin{Vmatrix}\partial E/\partial v_{\rm t} & \partial E/\partial v_r\\ \partial J/\partial v_{\rm t} & \partial J/\partial v_r\end{Vmatrix}^{-1} dE\,dJ = 4\pi\,\frac{v_{\rm t}}{r\,v_r}\,dE\,dJ, \tag{4.43} \]
which leads to
\[ N(E,J) = 8\pi^2\,P_{\rm orb}(E,J)\,J\,F(E,J). \tag{4.44} \]
In numerical applications, it is convenient to use $R \equiv (J/J_{\rm c}(E))^2$ as a variable instead of $J$. Then the density of particles per unit $E$ and $R$ is
\[ N(E,R) = 4\pi^2\,J_{\rm c}(E)^2\,P_{\rm orb}(E,J)\,F(E,J). \tag{4.45} \]
The FP equation for $N(E,R)$ in its flux-conserving form is a direct extension of the isotropic one:
\[ D_tN(E,R) = -\frac{\partial\mathcal{F}_E}{\partial E} - \frac{\partial\mathcal{F}_R}{\partial R} \tag{4.46} \]
with
\[
\begin{aligned}
\mathcal{F}_E &= m\,D_E\,F - D_{EE}\frac{\partial F}{\partial E} - D_{ER}\frac{\partial F}{\partial R},\\
\mathcal{F}_R &= m\,D_R\,F - D_{RR}\frac{\partial F}{\partial R} - D_{ER}\frac{\partial F}{\partial E}.
\end{aligned}
\tag{4.47}
\]
The expressions for the flux coefficients are significantly longer than in the isotropic case; they are given by Cohn (1979) for single-mass clusters and by Takahashi (1997) for the multi-mass case.^3 To my knowledge, in all numerical solutions of the anisotropic FP equation for stellar systems, an isotropised DF is used in the computation of the diffusion and flux coefficients. For instance, for $D_{EE}$ we use
\[ D_{EE} = \frac{32\pi^3}{3}\lambda\,m_{\rm f}^2\int_{r_{\min}}^{r_{\max}}\frac{dr}{v_r}\left[v^2\int_E^0 dE'\,\bar{F}_{\rm f}(E',r) + v^{-1}\int_{\phi(r)}^{E} dE'\,\bar{F}_{\rm f}(E',r)\,\bigl(2(E'-\phi(r))\bigr)^{3/2}\right]. \tag{4.48} \]
Here $\bar{F}_{\rm f}$ is the isotropised DF,
\[ \bar{F}_{\rm f}(E',r) = \frac{1}{J_{\max}}\int_0^{J_{\max}} dJ\,F_{\rm f}(E',J), \tag{4.49} \]
where $J_{\max}(E,r) = \sqrt{2r^2(E-\phi(r))}$ is the maximum (scaled) angular momentum that an orbit of energy $E$ can have if it goes through radius $r$, and $R_{\max} = (J_{\max}/J_{\rm c})^2$.
4.5 The Fokker–Planck Method in Use
To conclude this chapter, I present a quick and partial overview of the work carried out in cluster and galactic nucleus modelling using the direct solution of the Fokker–Planck equation. My goal here is to provide pointers to the literature that will allow the reader a deeper exploration of this rich field.
4.5.1 Relaxational Evolution
The only physics included in the Fokker–Planck formalism presented here is self-gravity (through use of the Poisson equation) and two-body relaxation. This is enough to study the evolution of stellar clusters (with no or few primordial binaries) up to core collapse. The case of a single-mass cluster was initially computed by Cohn (1979, 1980) for a Plummer model and revisited several times since, to explore a variety of initial cluster structures (Wiyanto et al. 1985; Quinlan 1996) or to investigate the core-collapse physics in greater detail using more sophisticated Fokker–Planck codes (Takahashi 1995; Drukier et al. 1999). Clusters with stars of different masses are much more realistic and have been considered by several authors (e.g. Merritt 1983; Inagaki & Wiyanto 1984; Inagaki & Saslaw 1985; Murphy & Cohn 1988; Chernoff & Weinberg 1990; Lee 1995; Takahashi 1997; Kim et al. 1998).
^3 Beware that, in the work of these authors, $E$ is the binding energy and therefore has the opposite sign to the one used here, with corresponding sign changes to be tracked in the computation of the coefficients and $E$-derivatives.
In a multi-mass cluster with a realistic mass spectrum, the evolution to core collapse is driven by mass segregation. FP simulations are the ideal tool to investigate how this process operates in the limit of a very large number of stars. They are quick, and their results are not affected by any significant numerical noise, in contrast to particle-based methods such as direct $N$-body or Monte-Carlo codes. In Fig. 4.1, I show the evolution of the Lagrangian radii for a cluster with stellar mass spectrum $dN_*/dM_* \propto M_*^{-2.35}$ covering the range 0.2–10 $M_\odot$. The simulation was performed using an FP code provided by H.M. Lee (e.g. Lee et al. 1991), using 12 mass components. The initial structure is a Plummer model. In Fig. 4.2, I plot the evolution of the central "temperature" for several mass components. We see that energy equipartition is approached at the centre only amongst the most massive stars (roughly in the range 3–10 $M_\odot$).
Using an energy grid of 200 elements, such an FP run requires only 1–2 min of CPU time on a laptop computer. For an anisotropic code that solves the FP equation in $(E,J)$ space, the simulation runs for about 4 days on a desktop computer (G. Drukier 2007, personal communication). When the mass spectrum is discretised into a larger number of mass components, the computing time increases approximately linearly with the number of components. The corresponding direct $N$-body simulation with 256 000 particles took about 40 days using special-purpose GRAPE hardware (H. Baumgardt 2005, personal communication), and a Monte-Carlo simulation using $10^6$ particles took about one week on a desktop computer (see Chap. 5).
4.5.2 Models with Additional Physics
In order to simulate more realistic and complex systems, the Fokker–Planck description of two-body relaxation has been complemented by approximate treatments of a large variety of other physical effects. Here I give a list of these effects, with references to some pioneering or otherwise notable FP works where they have been considered.
• Central massive black hole. Assuming a quasi-stationary regime and a fixed Keplerian potential, Lightman & Shapiro (1977) and Cohn & Kulsrud (1978) used the FP formalism to determine the distribution of stars around a massive black hole (MBH) and the rate of stellar disruptions by the MBH. The treatment of the loss cone developed for these works was later introduced in self-consistent FP codes to study the evolution of globular clusters hosting an intermediate-mass black hole or of dense galactic nuclei (Cohn 1985; David et al. 1987a, b; Murphy et al. 1991). Simplified FP codes, assuming in particular a fixed potential, have been used to investigate the segregation of stellar-mass black holes around an MBH (Hopman & Alexander 2006; Alexander 2007; O'Leary et al. 2008) and the formation of a central cusp of dark matter (Merritt et al. 2007a). Very recently, an FP code which includes the gravity of the stars self-consistently was used to study the shrinkage of a binary MBH (Merritt et al. 2007b) and the evolution of small nuclear clusters (Merritt 2008).
Fig. 4.1 Core collapse of a Plummer cluster model with a 0.2–10 $M_\odot$ Salpeter mass function, $dN_*/dM_* \propto M_*^{-2.35}$. Results of an isotropic Fokker–Planck code provided by H. M. Lee (solid lines) are compared to a direct Nbody4 simulation with 256 000 particles (dashes; H. Baumgardt 2005, personal communication). To show mass segregation, the evolution of Lagrangian radii for mass fractions of 1 and 50 per cent is plotted for stars with masses within five different bins (corresponding to 5 of the 12 discrete mass components used for the FP simulation). The length unit is the $N$-body scale (see Chap. 1). The time unit is the initial half-mass relaxation time (Spitzer 1987). To convert the dynamical time units of the $N$-body simulation to a relaxation time, a value of $\gamma_{\rm c} = 0.045$ was used for the Coulomb logarithm. Compare with Fig. 5.4.
• Stellar evolution. Mass loss due to stellar evolution can be included by reducing the stellar mass represented by a mass component as a function of time (e.g. Lee 1987a; Chernoff & Weinberg 1990; Quinlan & Shapiro 1990; Murphy et al. 1991).
• Collisions. Some FP simulations have included the effects of collisions resulting in mergers (Lee 1987a; Quinlan & Shapiro 1989, 1990) or (partial) disruptions (David et al. 1987a, b; Murphy et al. 1991). The FP approach has also been used to follow the evolution of galaxy clusters, taking into account galaxy mergers and mass stripping due to encounters between galaxies (Merritt 1983, 1984, 1985; Takahashi et al. 2002). Collisions can only be treated in an averaged and highly approximate fashion in the FP formalism, because the mass and orbital energy of collision products of any mass have to be transferred to the predefined mass components. Furthermore, the effects of collisions on stellar evolution cannot be included in any detailed way. Finally, in the case of collisional runaway, which is the growth of one or a few stars to very high mass by successive mergers, mass components have to be introduced that contain a very small number of stars (sometimes less than one). Nevertheless, comparisons with the Monte-Carlo algorithm (Chap. 5), where collisions can be treated more accurately, generally show good agreement as far as the overall effects of collisions are concerned (Freitag & Benz 2002; Freitag et al. 2006b).
Fig. 4.2 Evolution of the central temperatures during the core collapse of a multi-mass cluster model. The temperature of component $i$ is defined as $T_i \equiv \frac{3}{2}(m_i/\langle m\rangle)\,\sigma_i^2(0)$, where $m_i$ is the mass of a star of component $i$, $\sigma_i(0)$ the central 1D velocity dispersion of that component in $N$-body units and $\langle m\rangle$ the mean stellar mass. The data come from the same Fokker–Planck simulation as in Fig. 4.1. The solid lines are the temperatures for the same five mass components (highest to lowest mass from top to bottom). The dashed line represents the mass-weighted average central temperature.
• Binary stars. In a cluster containing no binaries initially, some will form near the centre during core collapse, when the density reaches sufficiently high values, either through dissipative two-body effects or through close three-body interactions (e.g. Aarseth 1971; Heggie & Hut 2003). Both kinds of mechanism have been included in FP codes (Statler et al. 1987; Lee et al. 1991; Takahashi & Inagaki 1991; Lee & Ostriker 1993, amongst others). In most cases, the binary population is not followed explicitly. Instead, the formation, hardening and ejection of binaries are simply included as an effective central source of heating, able to stop and reverse core collapse. Binary heating can result in gravothermal core oscillations in the post-collapse evolution (Cohn et al. 1989; Takahashi & Inagaki 1991; Breeden et al. 1994). A more detailed treatment of binaries would require representing them by at least one additional component (Lee 1987b; Gao et al. 1991). Only limited physical realism can be achieved, because it is not practical to extend the phase space to include the internal properties of the binaries, which include mass ratio, semi-major axis and eccentricity. This limitation explains why, to the best of my knowledge, primordial binaries have only been included into the FP framework by Gao et al. (1991). Furthermore, in the case of dynamically formed binaries, only a few are expected to be present in the core at any given time (Goodman 1984; Baumgardt et al. 2002), making a description based on the distribution function inadequate.
• Large-angle scatterings. Goodman (1983a) included the effects of close two-body encounters in FP simulations and concluded that they do not appreciably affect the core collapse process.
• Evaporation. Assuming the cluster is on a circular orbit around a spherical galaxy (or in the equatorial plane of an axially symmetrical galaxy), the evaporation of stars in the steady tidal field can be approximated in a spherical FP code by an outer boundary condition. For an isotropic formulation, the condition is $F(E_{\rm t}) = 0$ with $E_{\rm t} = -GM_{\rm cl}R_{\rm t}^{-1}$, where $R_{\rm t}$ is the tidal truncation radius, which can be identified with the distance between the centre of the cluster and the Lagrange point L1 or L2 (e.g. Chernoff & Weinberg 1990). A more accurate condition can be used in anisotropic models by setting the DF to zero for orbits with an apocentre distance larger than $R_{\rm t}$ (Takahashi et al. 1997). Delayed evaporation can be simulated to account for the fact that a star can spend a significant amount of time in the cluster even when its orbital parameters would allow it to reach the Lagrange points (Lee & Ostriker 1987; Takahashi & Portegies Zwart 2000).
• Gravitational shocking. In general, as it orbits its host galaxy, a globular cluster can experience strongly varying external gravitational stresses. Murali & Weinberg (1997a) and Gnedin et al. (1999) have included so-called disc and bulge shocking in their FP simulations, which allowed them to study the evolution of whole globular cluster systems (Gnedin & Ostriker 1997; Murali & Weinberg 1997b, c). Thanks to a new integration scheme, shocking has been studied in anisotropic FP models (Shin et al. 2008).
• Gas dynamics. David et al. (1987a, b) coupled the FP algorithm with a spherical gas dynamical code to predict what amount of the gas released by stars through evolution and collisions is accreted by a central MBH in AGN models. However, gas motion is likely to be highly non-spherical and to vary on time-scales much shorter than those for evolution of the stellar cluster (e.g. Williams et al. 1999; Cuadra et al. 2005).
FP simulations including several of the above physical processes have been used to interpret observations of a few specific globular clusters: M 15 (Grabhorn et al. 1992; Dull et al. 1997), M 71 (Drukier et al. 1992), NGC 6397 (Drukier 1993, 1995) and NGC 6624 (Grabhorn et al. 1992). In the future, it seems likely that particle-based methods will be used to produce detailed models of observed clusters (see Giersz & Heggie 2003, 2007 and Hurley et al. 2005 for pioneering examples). These codes can deal realistically with stellar populations that are rare or otherwise problematic to simulate with FP methods, such as primordial binaries, blue stragglers or X-ray binaries. However, because they are so much faster, FP codes can be an invaluable tool to carry out extensive parameter-space exploration and determine the initial conditions and physical parameters most likely to fit the observational data. Direct $N$-body or Monte-Carlo simulations can then be run with these input parameters to obtain more detailed models.
Acknowledgement
I am indebted to Gordon Drukier and Hyung Mok Lee, who provided invaluable help in the preparation of my Fokker–Planck lecture and took the time to read and comment on a draft of this chapter. I also thank Hyung Mok Lee for making available his Fokker–Planck code and helping me to use it, and Holger Baumgardt for providing unpublished N-body data. My work is supported by the STFC rolling grant to the IoA.
References
Aarseth S. J., 1971, Ap&SS, 13, 324
Alexander T., 2007, in Livio M., Koekemoer A. M., eds, 2007 STScI Spring Symposium: Black Holes (astro-ph/0708.0688)
Amaro-Seoane P., Freitag M., Spurzem R., 2004, MNRAS, 352, 655
Baumgardt H., Hut P., Heggie D. C., 2002, MNRAS, 336, 1069
Bettwieser E., Spurzem R., 1986, A&A, 161, 102
Binney J., Tremaine S., 1987, Galactic Dynamics. Princeton Univ. Press, Princeton, NJ
Breeden J. L., Cohn H. N., Hut P., 1994, ApJ, 421, 195
Chandrasekhar S., 1943, Rev. Mod. Phys., 15, 1
Chandrasekhar S., 1960, Principles of Stellar Dynamics. Dover, enlarged edition
Chang J. S., Cooper G., 1970, J. Comp. Phys., 6, 1
Chernoff D. F., Weinberg M. D., 1990, ApJ, 351, 121
Cohn H., 1979, ApJ, 234, 1036
Cohn H., 1980, ApJ, 242, 765
Cohn H., 1985, in Goodman J., Hut P., eds, Proc. IAU Symp. 113, Dynamics of Star Clusters. Reidel, Dordrecht, p. 161
Cohn H., Hut P., Wise M., 1989, ApJ, 342, 814
Cohn H., Kulsrud R. M., 1978, ApJ, 226, 1087
Cuadra J., Nayakshin S., Springel V., Di Matteo T., 2005, MNRAS, 360, L55
David L. P., Durisen R. H., Cohn H. N., 1987a, ApJ, 313, 556
David L. P., Durisen R. H., Cohn H. N., 1987b, ApJ, 316, 505
Drukier G. A., 1993, MNRAS, 265, 773
Drukier G. A., 1995, 100, 347
Drukier G. A., Cohn H. N., Lugger P. M., Yong H., 1999, ApJ, 518, 233
Drukier G. A., Fahlman G. G., Richer H. B., 1992, ApJ, 386, 106
Dull J. D., Cohn H. N., Lugger P. M., Murphy B. W., Seitzer P. O., Callanan P. J., Rutten R. G. M., Charles P. A., 1997, ApJ, 481, 267
Einsel C., Spurzem R., 1999, MNRAS, 302, 81
Einsel M., 1996, PhD thesis, Christian-Albrechts-Universität zu Kiel
Fiestas J., 2006, PhD thesis, Heidelberg University
Fiestas J., Spurzem R., Kim E., 2006, MNRAS, 373, 677
Freitag M., Amaro-Seoane P., Kalogera V., 2006a, ApJ, 649, 91
Freitag M., Benz W., 2002, A&A, 394, 345
Freitag M., Rasio F. A., Baumgardt H., 2006b, MNRAS, 368, 121
Gao B., Goodman J., Cohn H., Murphy B., 1991, ApJ, 370, 567
Giersz M., Heggie D. C., 1994, MNRAS, 268, 257
Giersz M., Heggie D. C., 1996, MNRAS, 279, 1037
Giersz M., Heggie D. C., 2003, MNRAS, 339, 486
Giersz M., Heggie D. C., 2007, in Vesperini E., Giersz M., Sills A., eds, Dynamical Evolution of Dense Stellar Systems, Proceedings of IAU Symposium No. 246 (astro-ph/0711.0523)
Giersz M., Spurzem R., 1994, MNRAS, 269, 241
Gnedin O. Y., Lee H. M., Ostriker J. P., 1999, ApJ, 522, 935
Gnedin O. Y., Ostriker J. P., 1997, ApJ, 474, 223
Goodman J., 1983a, ApJ, 270, 700
Goodman J., 1983b, PhD thesis, Princeton University
Goodman J., 1984, ApJ, 280, 298
Goodman J., Heggie D. C., Hut P., 1993, ApJ, 415, 715
Grabhorn R. P., Cohn H. N., Lugger P. M., Murphy B. W., 1992, ApJ, 392, 86
Heggie D., Hut P., 2003, The Gravitational Million-Body Problem. Cambridge Univ. Press, Cambridge
Hemsendorf M., Merritt D., 2002, ApJ, 580, 606
Hénon M., 1960, Annales d'Astrophysique, 23, 668
Hénon M., 1961, Annales d'Astrophysique, 24, 369
Hénon M., 1973, in Martinet L., Mayor M., eds, Lectures of the 3rd Advanced Course of the Swiss Society for Astronomy and Astrophysics. Obs. de Genève, Genève, p. 183
Hénon M., 1975, in Hayli A., ed., Proc. IAU Symp. 69, Dynamics of Stellar Systems. Reidel, Dordrecht, p. 133
Hopman C., Alexander T., 2006, ApJ Lett., 645, L133
Hurley J. R., Pols O. R., Aarseth S. J., Tout C. A., 2005, MNRAS, 363, 293
Inagaki S., Saslaw W. C., 1985, ApJ, 292, 339
Inagaki S., Wiyanto P., 1984, PASJ, 36, 391
Jeans J. H., 1915, MNRAS, 76, 70
Kim E., Einsel C., Lee H. M., Spurzem R., Lee M. G., 2002, MNRAS, 334, 310
Kim E., Lee H. M., Spurzem R., 2004, MNRAS, 351, 220
Kim E., Yoon I., Lee H. M., Spurzem R., 2008, MNRAS, 383, 2
Kim S. S., Lee H. M., Goodman J., 1998, ApJ, 495, 786
Lee H. M., 1987a, ApJ, 319, 801
Lee H. M., 1987b, ApJ, 319, 772
Lee H. M., 1995, MNRAS, 272, 605
Lee H. M., Fahlman G. G., Richer H. B., 1991, ApJ, 366, 455
Lee H. M., Ostriker J. P., 1987, ApJ, 322, 123
Lee H. M., Ostriker J. P., 1993, ApJ, 409, 617
Lightman A. P., Shapiro S. L., 1977, ApJ, 211, 244
Lin D. N. C., Tremaine S., 1980, ApJ, 242, 789
Louis P. D., Spurzem R., 1991, MNRAS, 251, 408
Merritt D., 1983, ApJ, 264, 24
Merritt D., 1984, ApJ, 276, 26
Merritt D., 1985, ApJ, 289, 18
Merritt D., 1999, PASP, 111, 129
Merritt D., 2008, preprint (astro-ph/0802.3186)
Merritt D., Harfst S., Bertone G., 2007a, Phys. Rev. D, 75, 043517
Merritt D., Mikkola S., Szell A., 2007b, ApJ, 671, 53
Murali C., Weinberg M. D., 1997a, MNRAS, 288, 749
Murali C., Weinberg M. D., 1997b, MNRAS, 291, 717
Murali C., Weinberg M. D., 1997c, MNRAS, 288, 767
Murphy B. W., Cohn H. N., 1988, MNRAS, 232, 835
Murphy B. W., Cohn H. N., Durisen R. H., 1991, ApJ, 370, 60
Perets H. B., Hopman C., Alexander T., 2007, ApJ, 656, 709
O'Leary R. M., Kocsis B., Loeb A., 2008, preprint (astro-ph/0807.2638)
Quinlan G. D., 1996, New Astronomy, 1, 255
Quinlan G. D., Shapiro S. L., 1989, ApJ, 343, 725
Quinlan G. D., Shapiro S. L., 1990, ApJ, 356, 483
Rosenbluth M. N., MacDonald W. M., Judd D. L., 1957, Physical Review, 107, 1
Saslaw W. C., 1985, Gravitational Physics of Stellar and Galactic Systems. Cambridge Univ. Press, Cambridge
Shin J., Kim S. S., Takahashi K., 2008, MNRAS, 386, L67
Spitzer L., 1987, Dynamical Evolution of Globular Clusters. Princeton Univ. Press, Princeton, NJ
Spitzer L. J., Shapiro S. L., 1972, ApJ, 173, 529
Spurzem R., Takahashi K., 1995, MNRAS, 272, 772
Statler T. S., Ostriker J. P., Cohn H. N., 1987, ApJ, 316, 626
Takahashi K., 1995, PASJ, 47, 561
Takahashi K., 1996, PASJ, 48, 691
Takahashi K., 1997, PASJ, 49, 547
Takahashi K., Inagaki S., 1991, PASJ, 43, 589
Takahashi K., Lee H. M., Inagaki S., 1997, MNRAS, 292, 331
Takahashi K., Portegies Zwart S. F., 2000, ApJ, 535, 759
Takahashi K., Sensui T., Funato Y., Makino J., 2002, PASJ, 54, 5
Williams R. J. R., Baker A. C., Perry J. J., 1999, MNRAS, 310, 913
Wiyanto P., Kato S., Inagaki S., 1985, PASJ, 37, 715
5
Monte-Carlo Models of Collisional Stellar Systems
Marc Freitag
University of Cambridge, Institute of Astronomy, Madingley Road, Cambridge CB3 0HA, UK
marc.freitag@gmail.com
5.1 Introduction
In this chapter, I describe a fast, approximate, particle-based algorithm to compute the long-term evolution of stellar clusters and galactic nuclei. It relies on the assumptions of spherical symmetry of the stellar system, dynamical equilibrium and local, diffusive two-body relaxation. It allows for velocity anisotropy, an arbitrary stellar mass spectrum, stellar evolution, a central massive object, collisions between stars, binary processes and two-body encounters leading to large deflection angles. Using one to ten million particles, a run extending over several relaxation times takes a few days to a few weeks to compute on a single-CPU personal computer, and the CPU time scales as $t_{\rm CPU} \propto N_{\rm p} \ln N_{\rm p}$, where $N_{\rm p}$ is the number of particles used. Because each physical process is implemented with its explicit scaling, the number of stars simulated can be (much) larger than $N_{\rm p}$, making it possible to simulate galactic nuclei with (in particular) the correct rate of relaxation.
The Monte-Carlo (MC) numerical scheme is intermediate, both in terms of realism and computing time, between Fokker–Planck or gas approaches and direct N-body codes. The former are very fast but based on a significantly idealised description of the stellar system; the latter treat (Newtonian) gravity in an essentially assumption-free way but are extremely demanding in terms of computing time (Binney & Tremaine 1987; Sills et al. 2003). The MC scheme was first introduced by Hénon to follow the relaxational evolution of globular clusters (Hénon 1971a, b, 1973a, 1975). To my knowledge, there exist three independent codes based on Hénon's ideas in active development and use. The first is the one written by M. Giersz (Giersz 1998, 2001, 2006; Giersz et al. 2008), which implements many of the developments first introduced by Stodółkiewicz (1982, 1986). Second is the code written by K. Joshi (Joshi et al. 2000, 2001) and greatly improved and extended by A. Gürkan and J. Fregeau (see, for instance, Fregeau et al. 2003; Gürkan et al. 2004, 2006; Fregeau & Rasio 2007). These codes have been applied to the study of globular and young clusters. Finally, we developed a MC code specifically aimed at
Freitag, M.: Monte-Carlo Models of Collisional Stellar Systems. Lect. Notes Phys. 760, 123–158 (2008)
DOI 10.1007/978-1-4020-8431-7_5 © Springer-Verlag Berlin Heidelberg 2008
the study of galactic nuclei containing a central massive black hole (Freitag & Benz 2001c, 2002; Freitag et al. 2006a, b, c). The description of the method given here is based on this particular implementation.¹
This chapter is organised as follows. In Sect. 5.2, the core principles and assumptions of the method are presented. In Sect. 5.3, I describe the inner workings of the code in detail: the basic algorithm, which treats global self-gravity and two-body relaxation, is the subject of Sect. 5.3.1, while Sect. 5.3.2 covers the additional physical processes (collisions, central object, binaries, stellar evolution, etc.). Finally, in Sect. 5.4, I show a few applications and discuss possible avenues for future developments of the method in the context of research on star clusters (Sect. 5.4.1) and on galactic nuclei (Sect. 5.4.2).
5.2 Basic Principles
The MC code shares most of its underlying assumptions with the Fokker–Planck (FP) approach presented in Chap. 4. Essentially, Hénon's algorithm can be seen as a particle-based method to solve the coupled FP and Poisson equations for a stellar cluster, using Monte-Carlo sampling to determine the long-term effects of two-body relaxation. An advantage of the MC approach over FP integrations is that it can include a continuous stellar mass spectrum and extra physical ingredients, such as stellar evolution, collisions, binaries or a central massive black hole, in a much more straightforward and realistic way. On the downside, MC simulations require considerably more computing time. Furthermore, the MC results show numerical noise, while those obtained with the FP codes are smooth and easier to analyse and manipulate.
The assumptions shared by both methods are the following:

1. Dynamical equilibrium
2. Spherical symmetry
3. Diffusive relaxation
4. Adequacy of representation with a one-particle distribution function
An isolated system is likely to attain dynamical equilibrium after an initial phase of violent relaxation spanning a few dynamical times $t_{\rm dyn} = \sqrt{R_{\rm cl}^3/(G M_{\rm cl})}$, where $R_{\rm cl}$ is a characteristic length (such as the half-mass radius) and $M_{\rm cl}$ the mass of the cluster. The MC code developed by Spitzer and collaborators (Spitzer & Hart 1971a, b; Spitzer & Thuan 1972; Spitzer & Shull 1975; Spitzer & Mathieu 1980) allows for out-of-equilibrium situations at the price of computing speed, but the assumption of spherical symmetry strongly limits the usefulness of this feature.

¹ This code is available at http://www.ast.cam.ac.uk/research/repository/freitag/MC.htm. General information on the MC method and more references can be found on the web pages created for the MODEST consortium ("MOdeling DEnse STellar systems") at http://www.manybody.org/modest (follow the link to the working group on stellar-dynamics methods, WG5).
In practice, the strongest restriction is that of spherical symmetry. Violent relaxation generally leads to an equilibrium configuration with significant triaxiality (e.g. Aguilar & Merritt 1990; Theis & Spurzem 1999; Boily & Athanassoula 2006). Although it is likely that two-body relaxation makes the system more symmetrical, flattening owing to global rotation can persist over many relaxation times (Einsel & Spurzem 1999; Kim et al. 2002, 2004; Fiestas et al. 2006). In galactic nuclei, the interaction between the stars and a binary massive black hole (e.g. Merritt & Milosavljević 2005) or a massive accretion disc (e.g. Šubr et al. 2004) cannot be studied accurately when spherical symmetry is assumed (see Sect. 5.4.2).
The last two assumptions have been discussed in Chap. 4 on FP methods. They imply that correlations between particles beyond random two-body encounters are neglected, but I stress that three- and four-body interactions, in the form of binary processes, can be included in the MC approach with much more realism than permitted by the direct FP formalism (see Sect. 5.3.2).
It should be noted at once that all these assumptions can only be valid if the system under consideration contains a large number of stars. In my experience, the MC approach is suitable if the number of particles $N_{\rm p}$ satisfies

$$N_{\rm p} \gtrsim 3000\,\frac{m_{\rm max}}{\langle m \rangle}, \tag{5.1}$$

where $m_{\rm max}$ and $\langle m \rangle$ are the maximum and mean stellar mass, respectively.

In Hénon's scheme, the numerical realisation of the cluster is a set of spherical shells with zero thickness, each of which is given a mass $M$, a radius $R$, a specific angular momentum $J$ and a specific kinetic energy $T$. These particles can be interpreted as spherical layers of synchronised stars that share the same stellar properties, orbital parameters and orbital phase, and experience the same processes (relaxation, collision, etc.) at the same time.
From the radii and masses of all particles, the potential can be computed at any time or place, and the orbital energies of all particles are straightforwardly deduced from their kinetic energies and positions. Hence, the set of particles can be regarded as a discretised representation of the distribution function (DF) $f(\mathbf{x}, \mathbf{v}) = F(E, J)$. But whereas a functional or tabulated expression of the DF (as implemented in direct FP methods) would require the integration of the Poisson equation to yield the gravitational potential, the Monte-Carlo realisation of the cluster provides it directly. From this point of view, the Monte-Carlo method is closer to the N-body philosophy than to direct FP methods.
The main difference between the MC code and a spherical 1D N-body simulation (e.g. Hénon 1973b) is that the former does not explicitly follow the continuous orbital motion of particles, which preserves $E$ and $J$. However, these orbital constants, as well as other properties of the particles, are modified by collisional processes to be incorporated explicitly: two-body relaxation, stellar collisions, etc. So the MC simulation proceeds through millions to billions of steps, each of them consisting of the selection of particles, the modification of their properties to simulate the effects of these physical processes and the selection of radial positions $R$ on their new orbits.
5.3 Detailed Implementation

5.3.1 Core Algorithm
This subsection is divided into four parts. In the first, I present the treatment of relaxation and the overall structure of the code. In the following parts, I explain in detail some important aspects of the algorithm, which are the selection of a pair of particles to evolve, the representation of the gravitational potential and the determination of a new orbital position for updated particles.
Two-Body Relaxation and General Organisation
The treatment of two-body relaxation is the backbone of Hénon-type Monte-Carlo schemes. It relies on the usual diffusive approximation developed by Chandrasekhar and presented in Chap. 4. I recall that the basic idea behind the concept of relaxation is that the gravitational potential of a stellar system containing a large number of bodies can be described as the sum of a dominating smooth contribution plus a small granular part that fluctuates over small scales and short times. When only the smooth part is taken into account, the DF of the cluster obeys the collisionless Boltzmann equation. However, in the long run, the fluctuating part makes $E$ and $J$ change slowly and the DF evolve. The basic simplifying assumption underlying Chandrasekhar's relaxation theory is to treat the effects of the fluctuating part as the sum of multiple uncorrelated two-body hyperbolic gravitational encounters with small deviation angles. Under these assumptions, if a test star of mass $m$ travels through a field of stars with homogeneous number density $n$, which all have mass $m_{\rm f}$ and the same velocity, then after a time span $\delta t$ its velocity in the reference frame of the encounters will deviate from the initial direction by an angle $\theta$ such that

$$\langle \theta \rangle_{\delta t} = 0 \quad\text{and}\quad \langle \theta^2 \rangle_{\delta t} \simeq 8\pi n \ln\Lambda\, \frac{G^2 (m + m_{\rm f})^2}{v_{\rm rel}^3}\, \delta t, \tag{5.2}$$

where $v_{\rm rel}$ is the relative velocity between the test star and the field stars, and $\ln\Lambda \simeq \ln(\gamma_{\rm c} N_*)$ for a self-gravitating cluster, with the value of $\gamma_{\rm c}$ depending on the mass spectrum (see Chap. 4).
Hénon's method avoids the computational burden and some of the necessary simplifications connected with the numerical evaluation of diffusion coefficients. The repeated application of (5.2) to a given particle implicitly amounts to a Monte-Carlo integration of the orbit-averaged diffusion coefficients, provided the orbital positions and properties of field particles are correctly sampled. Under the usual assumption that encounters are local, this latter constraint is obeyed if we take these properties to be those of the closest neighbouring particle. Furthermore, this allows us to actually modify the velocities of both particles at a time, each acting as a representative from the field for the other. Evolving particles in symmetrical pairs not only speeds up the simulations by a factor of 2 but also, and more critically, ensures strict conservation of energy.
Therefore, at the heart of the MC treatment of relaxation are super-encounters: encounters between two neighbouring particles with a deflection angle $\theta_{\rm SE}$ devised to reproduce statistically the cumulative effects of the numerous physical deflections taking place in the real system over a time span $\delta t$. Using the indices 1 and 2 to designate the particles in a pair, we see that, in order to reproduce the values of (5.2) for deflection angles corresponding to a time step $\delta t$, we must set

$$\theta_{\rm SE} = \frac{\pi}{2} \sqrt{\frac{\delta t}{t_{\rm rlx\,12}}}, \tag{5.3}$$

where

$$t_{\rm rlx\,12} \equiv \frac{\pi}{32}\, \frac{v_{\rm rel}^3}{\ln\Lambda\, G^2 (m_1 + m_2)^2 n} \tag{5.4}$$

is the pair relaxation time.
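As a quick illustration, (5.3) and (5.4) translate directly into a few lines of Python. This is only a sketch: the function names and the default `G = 1` (N-body-like units) are assumptions made for the example, not part of the code described in this chapter.

```python
import math

def pair_relaxation_time(v_rel, m1, m2, n, ln_lambda, G=1.0):
    """Pair relaxation time t_rlx12 of Eq. (5.4)."""
    return (math.pi / 32.0) * v_rel**3 / (ln_lambda * G**2 * (m1 + m2)**2 * n)

def super_encounter_angle(dt, v_rel, m1, m2, n, ln_lambda, G=1.0):
    """Deflection angle theta_SE of Eq. (5.3), in radians."""
    t_rlx12 = pair_relaxation_time(v_rel, m1, m2, n, ln_lambda, G)
    return 0.5 * math.pi * math.sqrt(dt / t_rlx12)
```

Squaring the returned angle gives $8\pi n \ln\Lambda\, G^2 (m_1+m_2)^2 \delta t / v_{\rm rel}^3$, that is, the mean square deflection of (5.2) with $m + m_{\rm f}$ replaced by $m_1 + m_2$.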
With no other physical process than relaxation included, a single step in a MC simulation consists of the following operations:

1. Selection of a pair of adjacent particles to evolve. This procedure also determines the (local) value of the time step $\delta t$, as explained below.
2. Modification of the orbital properties ($E_i$ and $J_i$) of the particles through a super-encounter. This involves
   (a) estimation of the local density $n$ entering $t_{\rm rlx\,12}$ in (5.4);
   (b) random orientation of the velocity vectors $\mathbf{v}_i$ of the particles, respecting their angular momenta $J_i = \|\mathbf{J}_i\|$ and specific kinetic energy $T_i = \frac{1}{2} v_i^2$ (this sets the centre-of-mass [CM] and relative velocities $\mathbf{v}_{\rm CM}$ and $\mathbf{v}_{\rm rel}$; the former defines the encounter CM frame, while the latter allows $\theta_{\rm SE}$ to be determined through (5.3) and (5.4));
   (c) random orientation of the orbital plane in the CM frame around the direction of the relative velocity (the angle $\theta_{\rm SE}$ is known, so computing the post-encounter velocities in the CM frame is trivial); and
   (d) transformation back to the cluster frame to obtain the modified $E'_i$ and $J'_i$.
3. For each particle, selection of a new position on the $(E'_i, J'_i)$-orbit. As a particle is a spherical shell, its position is simply its radius $R_i$. This step comprises the update of the potential to take these new positions into account.
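Steps 2(c) and 2(d) can be sketched as follows, assuming the two velocity vectors have already been oriented in the cluster frame (step 2b) and $\theta_{\rm SE}$ is known. The function name and the use of NumPy are choices made for this illustration only; it is not the implementation of the code described here.

```python
import numpy as np

def super_encounter(v1, v2, m1, m2, theta_se, rng):
    """Rotate the relative velocity by theta_se about a random azimuth.

    v1, v2 are 3-vectors in the cluster frame (they must differ);
    returns the post-encounter velocities in the same frame.
    """
    v_cm = (m1 * v1 + m2 * v2) / (m1 + m2)        # centre-of-mass velocity
    v_rel = v1 - v2
    speed = np.linalg.norm(v_rel)
    e_par = v_rel / speed
    # Two unit vectors spanning the plane perpendicular to v_rel.
    helper = np.array([1.0, 0.0, 0.0]) if abs(e_par[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    e1 = np.cross(e_par, helper)
    e1 /= np.linalg.norm(e1)
    e2 = np.cross(e_par, e1)
    phi = rng.uniform(0.0, 2.0 * np.pi)           # random orientation of the orbital plane
    new_rel = speed * (np.cos(theta_se) * e_par
                       + np.sin(theta_se) * (np.cos(phi) * e1 + np.sin(phi) * e2))
    # Back to the cluster frame; |v_rel| and v_cm are unchanged by the encounter.
    v1_new = v_cm + m2 / (m1 + m2) * new_rel
    v2_new = v_cm - m1 / (m1 + m2) * new_rel
    return v1_new, v2_new
```

Because only the direction of the relative velocity changes, the total momentum and kinetic energy of the pair are conserved to rounding error, which is the property that makes the pairwise treatment attractive.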
To compute the local density required in step 2a, we build and maintain a radial Lagrangian grid, the cells of which typically contain a few tens of particles each. Frequent updates (each time a particle gets a new position $R$) and occasional rebuilds of the mesh introduce only a very slight computational overhead.
Selection of a Pair of Particles and Determination of Time Step
For the sake of efficiency, we wish to use time steps that reflect the large variations of the relaxation time between the central and outer parts of a stellar cluster. The other constraint determining the selection procedure is that particles in an interacting pair must have the same $\delta t$, lest energy not be conserved.² But adjacent particles only form a pair momentarily and separate after their interaction, as each is attributed a new position. This necessitates the use of local time steps, i.e. $\delta t$ should be a function of $R$ alone, instead of being attached to particles.
For the time steps to be sufficiently short, we impose

$$\delta t(R) \le f_{\delta t}\, t_{\rm rlx}(R), \tag{5.5}$$

where $t_{\rm rlx}$ is a locally averaged relaxation time,

$$t_{\rm rlx} \propto \frac{\langle v^2 \rangle^{3/2}}{\ln\Lambda\, G^2 \langle m \rangle^2 n}, \tag{5.6}$$

and $0.005 \le f_{\delta t} \le 0.05$ typically. The time $t_{\rm rlx}$ is evaluated approximately with a sliding averaging procedure and tabulated from time to time to reflect the slow evolution of the cluster.
The members of a pair arrived at their present position at different times but have to leave it at the same time, after a super-encounter. Building on the statistical nature of the scheme, instead of trying to maintain a particle at radius $R$ during exactly $\delta t(R)$, we only require the expectation value for the residence time at $R$ to be $\delta t(R)$. As explained by Hénon (1973a), this constraint can be fulfilled if the probability for a pair at $R$ to be selected is proportional to $1/\delta t(R)$. This is realised in the following way:
• Because it would be difficult to define and use a selection probability $P_{\rm selec}$ that is a function of the continuous variable $R$, we define it to depend on the rank $i$ of the pair (rank 1 designates the two particles that are closest to the centre, rank 2 the second and third particles at increasing $R$, and so on). For a given state of the cluster, local relaxation times $t_{\rm rlx}$ are computed at the radial position of every pair. Rank-dependent time steps are defined to obey inequality (5.5):

$$\delta t(i) \le f_{\delta t}\, t_{\rm rlx}(R(i)). \tag{5.7}$$

• Normalised selection probabilities are computed by

$$P_{\rm selec}(i) = \frac{\overline{\delta t}}{\delta t(i)} \quad\text{with}\quad \overline{\delta t} = \left( \sum_{j=1}^{N_{\rm p}-1} \frac{1}{\delta t(j)} \right)^{-1}, \tag{5.8}$$

from which we derive a cumulative probability

$$Q_{\rm selec}(i) = \sum_{j=1}^{i} P_{\rm selec}(j). \tag{5.9}$$

• At each evolution step, another particle pair is randomly chosen according to $P_{\rm selec}$. To do this, a random number $X_{\rm rand}$ is first generated with uniform probability between 0 and 1. The pair rank is then determined by inversion of $Q_{\rm selec}$:

$$i = Q_{\rm selec}^{-1}(X_{\rm rand}). \tag{5.10}$$

The binary tree (see Sect. 5.3.1) is searched twice to find the id-numbers of the member particles, the (momentary) ranks of which are $i$ and $i+1$.

• The pair is evolved through a super-encounter, as explained above, for a time step $\delta t(i)$.

• After a large number of elementary steps, $\delta t(i)$ and $P_{\rm selec}(i)$ are re-computed to reflect the slight modification of the overall cluster structure.

² When collisions are included, a shared $\delta t$ also ensures that the probability for particle $i$ to collide with particle $j$ equates the symmetrical quantity.
For the sake of efficiency, we must choose for $Q_{\rm selec}^{-1}$ a function that is quickly evaluated, while $P_{\rm selec}(i)$ must approximate $1/t_{\rm rlx}(R(i))$ (up to normalisation) as closely as possible to avoid unnecessarily short time steps. A good compromise is to use a piecewise constant representation, i.e. divide the cluster into some 50 radial slices and use a constant $P_{\rm selec}$ in each. This is illustrated in Fig. 5.1 (with only 20 slices, for clarity). Once the selection probabilities have been determined, the value $\overline{\delta t}$ relating them to the time step through $\delta t(i) = \overline{\delta t}/P_{\rm selec}(i)$ is set to $\overline{\delta t} = f_{\delta t} \min_i \left[ t_{\rm rlx}(R(i))\, P_{\rm selec}(i) \right]$, so as to ensure that the constraint of (5.5) is satisfied everywhere.
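The selection procedure can be sketched in a few lines, under stated assumptions: the relaxation times at the pair positions are given as an array, the slices contain equal numbers of pairs, the constant value in each slice is taken from the shortest relaxation time in that slice, and the normalising time is the largest one for which every rank obeys (5.7). Ranks are 0-based here, and all names are hypothetical.

```python
import numpy as np

def selection_table(t_rlx, f_dt=0.01, n_slices=50):
    """Piecewise-constant P_selec over pair ranks and the matching time steps.

    t_rlx[i] is the local relaxation time at the position of pair i.
    """
    n_pairs = len(t_rlx)
    edges = np.linspace(0, n_pairs, n_slices + 1).astype(int)
    p = np.empty(n_pairs)
    for lo, hi in zip(edges[:-1], edges[1:]):
        if hi > lo:
            p[lo:hi] = 1.0 / np.min(t_rlx[lo:hi])   # constant within a slice
    p /= p.sum()                                    # normalised probabilities
    dt_bar = f_dt * np.min(t_rlx * p)               # largest value obeying Eq. (5.7)
    dt = dt_bar / p                                 # time step attached to each rank
    q = np.cumsum(p)                                # cumulative probability, Eq. (5.9)
    return p, dt, q

def select_pair(q, rng):
    """Invert Q_selec for a uniform random number, Eq. (5.10)."""
    x = rng.random()
    return min(int(np.searchsorted(q, x, side="right")), len(q) - 1)
```

With a table of this kind, the inversion costs a binary search over the ranks; a piecewise-constant $P_{\rm selec}$ would allow an even cheaper lookup per slice, which is not exploited in this sketch.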
Fig. 5.1 Selection probabilities in a King $W_0 = 5$ cluster model consisting of 10 000 particles. The inverse of the locally estimated relaxation time is compared to the piecewise approximation used to set the probabilities in the MC code.

It must be stressed that the probabilities $P_{\rm selec}(i)$ and corresponding time steps are computed in advance and are only updated (to reflect the evolution of the structure) after each particle has been treated several times on average. Once the pair of adjacent particles of rank $i$ has been selected to be subject to a super-encounter, the time step $\delta t(i)$ is imposed, and the encounter relaxation time $t_{\rm rlx\,12}$ is determined by the particles' properties and the local density (5.4). This imposes the value of the deflection angle (5.3). In order to perform a proper orbit averaging and sampling over the field particles, $\theta_{\rm SE}$ should be small, so that a given particle would have experienced a large number of super-encounters by the time its orbit has changed significantly. Unfortunately, this is impossible to enforce strictly, as the $\delta t(i)$ values are based on an estimate of the typical local relaxation time, while $t_{\rm rlx\,12}$ can happen to be much shorter. Using a sufficiently small value of $f_{\delta t}$, we can keep the fraction of encounters leading to large values of $\theta_{\rm SE}$ to a low level.
Representation of the Gravitational Potential
The smooth part of the potential of the cluster is simply approximated as the sum of the contributions of the $N_{\rm p}$ particles, each of which is a spherical, infinitely thin shell. In other terms, compared to the potential in a system of $N_{\rm p}$ point-masses, we (implicitly) perform a complete smoothing over the angular variables. Between particles of rank $i-1$ and $i$, the (smooth) potential felt by a particle at radius $R \in [R_{i-1}, R_i]$ is simply

$$\Phi(R) = -\frac{A_i}{R} - B_i \quad\text{with}\quad A_i = \sum_{j=1}^{i-1} M_j \quad\text{and}\quad B_i = \sum_{j=i}^{N_{\rm p}} \frac{M_j}{R_j}, \tag{5.11}$$
where $M_j$ and $R_j$ are the mass and radius of the particle of rank $j$. Although we do not smooth the density distribution in the radial direction, tests show that in practice this spherically symmetric potential does not introduce significant unwanted relaxation for $N_{\rm p} \gtrsim 10^4$ in simulations extending to an average number of steps per particle of a few thousand (Hénon 1971b; Freitag & Benz 2001c). However, too small a time step parameter $f_{\delta t}$ can yield an artificially accelerated evolution owing to this numerical relaxation.
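For reference, the potential of a set of thin shells can be evaluated directly: shells inside $R$ act as a point mass at the centre, and each shell outside contributes the constant $-G M_j/R_j$. The sketch below does this in $O(N_{\rm p})$ operations with NumPy; the function name is hypothetical and $G$ is an explicit argument, with `G = 1` corresponding to the form used in the text.

```python
import numpy as np

def shell_potential(R, radii, masses, G=1.0):
    """Potential at radius R due to thin spherical shells (radii sorted ascending)."""
    radii = np.asarray(radii, dtype=float)
    masses = np.asarray(masses, dtype=float)
    k = np.searchsorted(radii, R, side="right")   # number of shells with R_j <= R
    A = masses[:k].sum()                          # mass enclosed within R
    B = (masses[k:] / radii[k:]).sum()            # sum of M_j / R_j over outer shells
    return -G * (A / R + B)
```

A shell located exactly at $R$ contributes $-G M_j/R$ whether it is counted as inner or outer, so the potential is continuous across particles.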
At each step in the simulation, two particles are selected, undergo a super-encounter and are given new positions on their slightly modified orbits. To enforce exact energy conservation, the $A_i$ and $B_i$ coefficients are updated after every such orbital displacement. Doing so saves much trouble connected with a potential that lags behind the actual distribution of particles' radii (and masses, when stellar evolution or collisions are included). However, performing potential updates only after a large number of particle moves has advantages of its own, in particular the possibility of algorithm parallelisation (Joshi et al. 2000), but requires special measures to ensure satisfactory energy conservation (Stodółkiewicz 1982; Giersz 1998; Fregeau & Rasio 2007).
The potential information is not represented by linear arrays (for the $A_i$ and $B_i$) but by a binary tree (Sedgewick 1988). This tree also contains ranking information. It allows us to find a particle of a given rank, compute the potential at its position and update the potential data once the particle is moved to another radius in $O(\log N_{\rm p})$ operations, instead of $O(N_{\rm p})$ as would be the case with simple arrays. At any given time, each particle is represented by a node in the tree. Each node is connected to (at most) two sub-trees. All the nodes in the left sub-tree of a given node correspond to particles with smaller radii, and all the nodes in its right sub-tree to particles at larger radii. The spherical potential is represented by (floating-point) $\delta A_k$ and $\delta B_k$ coefficients attached to nodes. A third (integer) value $\delta i_k$ allows the determination of the radial rank of any particle. If we define $LT_k$ and $RT_k$ to be the sets of nodes in the left and right sub-trees of node $k$, these quantities are defined by

$$\delta i_k = 1 + \text{number of nodes in } LT_k,$$
$$\delta A_k = M_k + \sum_{m \in LT_k} M_m \quad\text{and}\quad \delta B_k = \frac{M_k}{R_k} + \sum_{m \in RT_k} \frac{M_m}{R_m}. \tag{5.12}$$
An example of a binary tree is shown in Fig. 5.2. After a specified, large number of steps, the binary tree is rebuilt from scratch to keep it well balanced.
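To make the role of the node quantities in (5.12) concrete, here is a minimal static sketch: it builds a balanced tree from the sorted radii and walks down from the root to find the particle of a given rank together with the potential at its radius (with $G = 1$). The class and attribute names are hypothetical, and the removal and re-insertion of a moved particle, which the real code needs, is deliberately left out.

```python
import numpy as np

class ShellTree:
    """Static binary tree over particles sorted by radius (no moves handled)."""

    def __init__(self, radii, masses):
        self.R = np.asarray(radii, dtype=float)
        self.M = np.asarray(masses, dtype=float)
        n = len(self.R)
        self.left, self.right = [-1] * n, [-1] * n
        self.d_i, self.d_A, self.d_B = [0] * n, [0.0] * n, [0.0] * n
        self.root = self._build(np.argsort(self.R).tolist())[0]

    def _build(self, ids):
        """Return (root, size, sum of M, sum of M/R) of the sub-tree holding ids."""
        if not ids:
            return -1, 0, 0.0, 0.0
        mid = len(ids) // 2
        k = ids[mid]
        self.left[k], n_l, m_l, b_l = self._build(ids[:mid])
        self.right[k], n_r, m_r, b_r = self._build(ids[mid + 1:])
        own_b = self.M[k] / self.R[k]
        self.d_i[k] = 1 + n_l                 # rank of k within its own sub-tree
        self.d_A[k] = self.M[k] + m_l         # Eq. (5.12)
        self.d_B[k] = own_b + b_r             # Eq. (5.12)
        return k, n_l + n_r + 1, m_l + m_r + self.M[k], b_l + b_r + own_b

    def find(self, rank):
        """Return (particle id, potential at its radius) for a 1-based radial rank."""
        k, A, B = self.root, 0.0, 0.0
        while True:
            if rank < self.d_i[k]:            # target is inside node k
                B += self.d_B[k]
                k = self.left[k]
            elif rank > self.d_i[k]:          # target is outside node k
                A += self.d_A[k]
                rank -= self.d_i[k]
                k = self.right[k]
            else:                             # found: exclude own mass from A
                A += self.d_A[k] - self.M[k]
                B += self.d_B[k]
                return k, -A / self.R[k] - B
```

Each lookup visits one node per tree level, hence the $O(\log N_{\rm p})$ cost as long as the tree stays balanced.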
Selection of a New Orbital Position
In a spherical potential $\Phi(R)$, a star of specific orbital energy $E$ and angular momentum $J$ spends, during one complete radial oscillation, a time $dt = v_{\rm rad}^{-1}(R)\, dR$ in an infinitesimal interval of radius $[R, R + dR]$, with
Fig. 5.2 Binary tree for a cluster of 50 particles. The structure of the tree is shown after many particles have been moved around since the tree was built. The lower axis shows the radius of each particle. The tree keeps the particles sorted in radius. The table on the right is the content of the three arrays used in the Fortran-77 code to implement the logical structure of the tree. Arrays l_son(k) and r_son(k) indicate the root nodes for the left and right sub-trees of node k. Array father(k) allows us to climb back to the root.
$$v_{\rm rad}^2 = 2E - 2\Phi(R) - \frac{J^2}{R^2}. \tag{5.13}$$

Without knowledge of orbital phase, the probability density of finding the star at $R$ is thus

$$\frac{dP_{\rm orb}}{dR} = \frac{2}{P_{\rm orb}}\, \frac{1}{v_{\rm rad}(R)}, \tag{5.14}$$

where

$$P_{\rm orb} = 2 \int_{R_{\rm peri}}^{R_{\rm apo}} \frac{dR}{v_{\rm rad}(R)} \tag{5.15}$$

is the radial orbital period.

Since dynamical equilibrium is assumed, the knowledge of the explicit orbital motion $R(t)$ is not necessary. Instead, once a particle is updated, its position $R$ is picked at random, but with the requirement of correct statistical sampling. This means that the fraction of time spent at $R$ must follow (5.14). Let the sought-for probability of placing the particle at $R \in [R_{\rm peri}, R_{\rm apo}]$ be $f_{\rm plac}(R) \equiv dP_{\rm plac}/dR$. We have to compensate for the fact that, if the particle is placed at $R$, it will stay there for an average time $\overline{\delta t}/P_{\rm selec}(R)$. The average ratio of times spent at two different radii $R_1$ and $R_2$ on the orbit is

$$\left\langle \frac{t_{\rm stay}(R_1)}{t_{\rm stay}(R_2)} \right\rangle = \frac{f_{\rm plac}(R_1)\, P_{\rm selec}(R_2)}{f_{\rm plac}(R_2)\, P_{\rm selec}(R_1)} = \frac{v_{\rm rad}(R_2)}{v_{\rm rad}(R_1)}. \tag{5.16}$$

This imposes the relation

$$f_{\rm plac}(R) \propto \frac{P_{\rm selec}(R)}{v_{\rm rad}(R)}. \tag{5.17}$$
The numerical implementation of this probability law is complicated by the fact that $v_{\rm rad}(R)^{-1}$ is not known analytically and becomes infinite at the pericentre and apocentre. However, $v_{\rm rad}(R)^{-1}$ can always be capped by the Keplerian value with the same $J$, $R_{\rm peri}$ and $R_{\rm apo}$, allowing the use of an efficient rejection method (Press et al. 1992, Sect. 7.3) to pick $R$ according to (5.17).³
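The following sketch illustrates rejection sampling of an orbital position, but it is not the Keplerian-cap scheme described above. As a simpler stand-in, it uses the change of variable $R = \frac{1}{2}(R_{\rm apo} + R_{\rm peri}) - \frac{1}{2}(R_{\rm apo} - R_{\rm peri}) \cos\psi$, which makes the density in $\psi$ finite at both turning points, and a heuristic envelope estimated on a grid. It also assumes a uniform $P_{\rm selec}$, so that the target density is simply $\propto 1/v_{\rm rad}$; for the full law (5.17), the weight would be multiplied by $P_{\rm selec}(R)$. All names are hypothetical.

```python
import numpy as np

def sample_radius(phi, E, J, r_peri, r_apo, rng, n_grid=2000):
    """Draw R in [r_peri, r_apo] with probability density proportional to 1/v_rad(R).

    phi must accept NumPy arrays; r_peri and r_apo are assumed to be known.
    """
    mid, half = 0.5 * (r_apo + r_peri), 0.5 * (r_apo - r_peri)

    def weight(psi):
        # Density in psi of a time-uniform sample: (dR/dpsi) / v_rad.
        R = mid - half * np.cos(psi)
        v2 = 2.0 * E - 2.0 * phi(R) - J**2 / R**2          # Eq. (5.13)
        return half * np.sin(psi) / np.sqrt(np.maximum(v2, 1e-300))

    psi_mid = (np.arange(n_grid) + 0.5) * np.pi / n_grid   # midpoints avoid the turning points
    bound = 1.2 * weight(psi_mid).max()                    # heuristic envelope, not a strict cap
    while True:
        psi = rng.uniform(0.0, np.pi)
        if rng.uniform(0.0, bound) < weight(psi):
            return mid - half * np.cos(psi)
```

For a Keplerian potential, $\psi$ is the eccentric anomaly and the weight reduces to a multiple of $1 - e\cos\psi$, so the time-averaged radius of the samples should approach $a(1 + e^2/2)$; this gives a simple check of the sampler.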
5.3.2 Additional Physics

Because it is based on a particle representation, it is relatively easy to add a variety of physical ingredients to the MC algorithm in order to improve the realism of the simulations or the domain of applicability of the method.
Collisions
Direct collisions are likely to occur in very dense stellar systems, from young clusters to core-collapsed globular clusters to nuclei of small galaxies (e.g. the various contributions in Shara 2002).
³ This is the only significant improvement of the relaxation-only MC algorithm over the method described by Hénon. He also used a binary tree in the latest versions of his code, although he did not describe it in his articles.

Let us consider a close approach between two stars with masses and radii $m_1$, $r_1$ and $m_2$, $r_2$, respectively. The relative velocity at infinity is $v_{\rm rel}$ and the impact parameter $b$. Neglecting tidal effects, a collision requires the centres of the stars to come closer than $d_{\rm coll} = r_1 + r_2$. Although neglected in our MC code (because rare in galactic nuclei), tidal captures (Fabian et al. 1975) can be considered using $d_{\rm capt} = \eta (r_1 + r_2)$, with $\eta > 1$ a numerical coefficient dependent on the velocity, masses and structures of the stars (e.g. Lee & Ostriker 1986; Kim & Lee 1999). Treating the approach until physical contact as a point-mass problem (assuming hyperbolic trajectories), we obtain the largest impact parameter leading to contact, $b_{\rm max}$, and the cross section
$$S_{\rm coll\,12} = \pi b_{\rm max}^2 = \pi (r_1 + r_2)^2 \left[ 1 + \left( \frac{v_{*12}}{v_{\rm rel}} \right)^2 \right], \tag{5.18}$$

where

$$v_{*12}^2 = \frac{2G(m_1 + m_2)}{r_1 + r_2} \tag{5.19}$$

is the square of the relative velocity the stars would have at contact on a parabolic orbit. This velocity is of the order of a few 100 km s$^{-1}$ for main-sequence (MS) objects. The second term in the bracket of (5.18) is the gravitational focusing, which greatly enhances the cross section over the geometrical value $\pi (r_1 + r_2)^2$ as long as $v_{\rm rel} < v_{*12}$. So the collision rate for a star 1 travelling through a field of stars 2 with identical masses, sizes and velocities, with number density $n_2$, is simply

$$\left. \frac{dN_{\rm coll}}{dt} \right|_{12} = n_2 v_{\rm rel} S_{\rm coll\,12} \equiv t_{\rm coll\,12}^{-1}, \tag{5.20}$$

which defines the collision time $t_{\rm coll\,12}$. If all stars have the same mass $m$ and size $r$, a number density $n$ and their velocities follow a Maxwellian distribution with 1D dispersion $\sigma_v^2$, the average collision rate is (Binney & Tremaine 1987)

$$t_{\rm coll}^{-1} = 16 \sqrt{\pi}\, n \sigma_v r^2 \left( 1 + \frac{G m}{2 \sigma_v^2 r} \right). \tag{5.21}$$
Adding stellar collisions to the MC algorithm is relatively straightforward, thanks to the use of particles to represent the cluster (as opposed to DFs, as done in FP codes).
First the determination of time steps (and corresponding pair-selectionprobabilities) has to include in addition to (55) the following constraint
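\begin{equation}
\delta t(R) \le f_{\delta t}\, t_{\rm coll}(R), \tag{5.22}
\end{equation}
with
\begin{equation}
t_{\rm coll}^{-1} = 16 \sqrt{\pi}\, n \sigma_v \langle r^2 \rangle \left( 1 + \frac{G \langle m r \rangle}{2 \sigma_v^2 \langle r^2 \rangle} \right), \tag{5.23}
\end{equation}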
where $\sigma_v^2 = \frac{1}{3} \langle v^2 \rangle_m$. The notations $\langle \cdots \rangle$ and $\langle \cdots \rangle_m$ denote number- and mass-weighted averages, respectively.$^4$ The choice of quantities to average is such that we retrieve the correct value for the average collision rate in the limits $\sigma_v^2 \ll G \langle m \rangle \langle r \rangle^{-1}$ and $\sigma_v^2 \gg G \langle m \rangle \langle r \rangle^{-1}$.

$^4$ Note that (15) of Freitag & Benz (2002) is slightly incorrect.
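As an illustration of how the averages entering (5.23) might be formed from a local sample of particles, here is a minimal sketch. The function names, the list-based interface and the default value of `f_dt` are assumptions made for this example; any consistent unit system can be used by passing the matching value of the gravitational constant.

```python
import math


def mean_collision_time(n, masses, radii, speeds, grav_const=6.674e-11):
    """Eq. (5.23) from a local sample of stars.

    n       : total number density of stars
    masses  : stellar masses of the sample
    radii   : stellar radii of the sample
    speeds  : moduli of the stellar velocities of the sample
    """
    n_stars = len(masses)
    # Number-weighted averages <r^2> and <m r>.
    mean_r2 = sum(r * r for r in radii) / n_stars
    mean_mr = sum(m * r for m, r in zip(masses, radii)) / n_stars
    # Mass-weighted average <v^2>_m, giving sigma_v^2 = <v^2>_m / 3.
    total_mass = sum(masses)
    mean_v2_m = sum(m * v * v for m, v in zip(masses, speeds)) / total_mass
    sigma2 = mean_v2_m / 3.0
    focusing = 1.0 + grav_const * mean_mr / (2.0 * sigma2 * mean_r2)
    rate = 16.0 * math.sqrt(math.pi) * n * math.sqrt(sigma2) * mean_r2 * focusing
    return 1.0 / rate


def collision_time_step_limit(t_coll, f_dt=0.01):
    """Eq. (5.22): upper bound on the time step (f_dt is an example value)."""
    return f_dt * t_coll
```

For a sample of identical stars the averages collapse and the expression reduces to (5.21), which gives a simple consistency check.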
Next, when a pair is selected for update and once the local density and relative velocity have been determined, the pair collision time is computed using (5.18), (5.19) and (5.20), but with $n$ instead of $n_2$. Hence the probability of collision between the pair during the time step $\delta t$ is
\begin{equation}
P_{\rm coll}^{(12)} = n v_{\rm rel} S_{\rm coll}^{(12)}\, \delta t. \tag{5.24}
\end{equation}
The use of $n$ rather than $n_2$ is of central importance. This way, the collision probabilities are symmetric, as they should be: $P_{\rm coll}^{(12)} = P_{\rm coll}^{(21)}$. Furthermore, it would be impossible to estimate the local density of each population, particularly because in MC codes, as in $N$-body, each particle can represent a star (or stars) with properties different from any other particle. What makes this simplification possible is that, for a given particle, the (local) probability that the neighbouring particle is of type $x$ (whatever the definition of a type is) is simply $n_x/n$, so the process of selecting the next particle as interaction partner will statistically produce a rate of collisions with objects of type $x$ proportional to $n_x$, because $n$ rather than $n_x$ is used to compute the pair collision time. Including the estimate of the collision time in the determination of the time steps ensures that, in a vast majority of cases, $P_{\rm coll}^{(12)} \lesssim f_{\delta t} \ll 1$, avoiding time steps during which more than one collision should have occurred. In the MC algorithm, a collision between two particles has a statistical weight of $N_\ast/N_{\rm p}$. This means that every star in the first particle collides with a star of the second particle and that all these collisions are identical, so that the outcome can be represented by (at most) two particles corresponding to $N_\ast/N_{\rm p}$ collision products each.

Then a random number $X_{\rm rand}$, uniformly distributed between 0 and 1, is generated, and a collision between the two particles has to be implemented if $X_{\rm rand} < P_{\rm coll}^{(12)}$. In low-velocity environments, it is justified to assume that collisions result in mergers with negligible mass loss (Freitag et al. 2006b), but this simplification breaks down in galactic nuclei, where $\sigma_v > 100$ km s$^{-1}$ (Freitag & Benz 2002). We use prescriptions for the boundary between mergers and fly-bys, and for the amount of mass and energy lost, based on a large set of SPH simulations of collisions between MS stars (Freitag & Benz 2005). The impact parameter is selected at random, with uniform probability in $b^2$ between 0 and $b_{\rm max}^2$. Because evolution on the MS is neglected, a collision is entirely determined by the values of $m_1$, $m_2$, $v_{\rm rel}$ and $b$, and its outcome is determined using 4D interpolation and extrapolation from the SPH results (Freitag & Benz 2002; Freitag et al. 2006c). The properties of the particles are updated from the post-collision values of $m_1$, $m_2$ and $v_{\rm rel}$.
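The collision test and the choice of impact parameter described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, a `random.Random` instance stands in for whatever generator an actual code uses, and the outcome of the collision (the SPH-based interpolation) is not modelled.

```python
import math
import random


def collision_probability(n, v_rel, s_coll, dt):
    """Eq. (5.24): P = n v_rel S_coll dt, using the total density n."""
    return n * v_rel * s_coll * dt


def sample_impact_parameter(b_max, rng):
    """Draw b with uniform probability in b^2 between 0 and b_max^2."""
    return b_max * math.sqrt(rng.random())


def try_collision(n, v_rel, s_coll, dt, rng):
    """Return an impact parameter if the pair collides, otherwise None."""
    p_coll = collision_probability(n, v_rel, s_coll, dt)
    if rng.random() < p_coll:
        # S_coll = pi * b_max^2, so b_max follows from the cross section.
        b_max = math.sqrt(s_coll / math.pi)
        return sample_impact_parameter(b_max, rng)
    return None
```

Sampling $b = b_{\rm max}\sqrt{X}$ with $X$ uniform in $[0, 1)$ gives the required uniform distribution in $b^2$, so that grazing encounters are more frequent than head-on ones.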
The particles are then placed at random radii on their new orbits according to (5.17). This concludes the step, as two-body relaxation is not implemented when a collision is detected. In highly collisional systems, this can lead to an underestimate of relaxation effects, and we have experimented with a modified scheme in which every second step is collisional and the others are reserved for relaxation. This makes the code approximately twice as slow but does not seem to affect the results significantly. In case of a merger, or if one or both stars are completely disrupted (a rare outcome requiring velocities in excess of about $5\,v_\ast^{(12)}$), the number of particles in the simulation is reduced correspondingly.
One major theoretical uncertainty still to be tackled when it comes to the effects of collisions in stellar dynamics is how they affect stellar evolution. In case of mergers, the problem is made particularly difficult by the very high rotation rate of the collision product (e.g. Sills et al. 1997, 2001; Lombardi et al. 2002). In the face of this uncertainty, we adopt a simple approach in which we set the effective age of the collision product based on its mass and the amount of core helium, and assume no collisional mixing at all (see Portegies Zwart et al. 1999 for another prescription).
While the hydrodynamics of collisions between two MS stars is now relatively well understood (Sills et al. 2002; Freitag & Benz 2005; Dale & Davies 2006; Trac et al. 2007, and references therein), our knowledge about encounters featuring other stellar types is still very limited, mostly because the physics involved is more challenging. Collisions between a giant and a more compact object are probably more common than MS–MS encounters, at least in galactic nuclei, where gravitational focusing is weaker, but only a few authors have attempted to model such events (Davies et al. 1991; Rasio & Shapiro 1991; Bailey & Davies 1999; Lombardi et al. 2006). The main question mark concerns the evolution of the common-envelope system resulting from the capture of the more compact star (see e.g. Taam & Ricker 2006 and Chap. 11). Collisions between a compact remnant and a MS (or giant) star have been studied numerically in a larger number of papers (Regev & Shara 1987; Benz et al. 1989; Różyczka et al. 1989; Davies et al. 1992; Ruffert 1993, to mention a few), but clear and comprehensive predictions for their outcome are still missing. This is unfortunate because, in our models for galactic nuclei, collisions between a MS star and a remnant occur at a rate comparable to collisions between two MS stars (a few $10^{-6}$ yr$^{-1}$ in a Milky-Way-like nucleus; see Freitag et al. 2006a). Finally, in young dense clusters, where mergers may contribute to the formation of massive stars ($m > 10\,M_\odot$) or lead to the build-up of very massive stars ($m > 100\,M_\odot$; e.g. Bally & Zinnecker 2005 and Sect. 5.4.1), collisions involving pre-MS objects are likely, a type of event only simulated very recently (Laycock & Sills 2005; Davies et al. 2006).$^5$
Central Massive Object
To study the structure and evolution of galactic nuclei with a central massive black hole (MBH; $M_{\rm BH} \gtrsim 10^4\,M_\odot$), or globular clusters hosting an intermediate-mass black hole (IMBH; $10^4\,M_\odot \gtrsim M_{\rm BH} \gtrsim 10^2\,M_\odot$) or a very massive star ($M_\ast \gtrsim 200\,M_\odot$), the effects of a central massive object have been included in the MC code (Freitag 2000; Freitag & Benz 2002; Freitag et al. 2006a; Freitag et al. 2006b). Here, I concentrate on the case of an (I)MBH (see Ferrarese & Ford 2005 for a review of the observational evidence for MBHs in centres of galaxies, and Miller & Colbert 2004; van der Marel 2004 for reviews on the possible existence of IMBHs).

$^5$ For more pointers to the literature on stellar collisions and tidal disruptions by a massive black hole, see the MODEST web pages at http://www.manybody.org/modest/WG/wg4.html
Recall that the MC approach is only valid for spherical systems in dynamical equilibrium, and useful mostly if collisional effects such as two-body relaxation produce noticeable evolution over the period of interest. Galactic nuclei hosting MBHs less massive than about $10^7\,M_\odot$ are probably relaxed and therefore amenable to MC modelling. Indeed, assuming naively that the Sgr A$^\ast$ cluster at the centre of our Galaxy is typical as far as the total stellar mass and density are concerned (Genzel et al. 2003; Ghez et al. 2005; Schödel et al. 2007), and that we can scale to other galactic nuclei using the observed correlation between the mass of the MBH and the velocity dispersion of the host spheroid, $\sigma$, in the form $\sigma = \sigma_{\rm MW} (M_{\rm BH}/4 \times 10^6\,M_\odot)^{1/\beta}$ with $\beta \approx 4$–$5$ (Ferrarese & Merritt 2000; Tremaine et al. 2002), we can estimate the relaxation time at the radius of influence (the limit of the region where the gravity of the MBH dominates) to be $t_{\rm rlx}(R_{\rm infl}) \approx 10^{10}\,{\rm yr}\,(M_{\rm BH}/4 \times 10^6\,M_\odot)^{(2 - 3/\beta)}$.
All the key aspects of the interaction between the central MBH and its host stellar system (“cluster” in short) are included in the MC code.

Gravity of the MBH. The MBH is treated as a central fixed point mass. Newtonian gravity is assumed, so the only modification in computing the potential $\phi$ is to add $M_{\rm BH}$ to the coefficients $A_i$ in (5.11). The MBH is allowed to grow by accretion of material from the stars or through an ad hoc prescription to account for gas inflow. Care is taken to make the time steps significantly shorter than $\phi\,(d\phi/dt)^{-1}$, so as to ensure that the adiabatic effects of the growth of the MBH on the cluster are accounted for (Young 1980; Quinlan et al. 1995). The MBH imposes very high stellar velocities in its vicinity, causing stellar collisions to be more disruptive. The gas emitted in a collision is assumed to accrete completely and immediately onto the MBH, or to accumulate in an unresolved disc around the MBH if its growth is limited by the Eddington rate.
Tidal disruptions. A star of mass $M_\ast$ and radius $R_\ast$ which comes within a distance $R_{\rm td} = k R_\ast (M_{\rm BH}/M_\ast)^{1/3}$ of the MBH is torn apart by the tidal forces (e.g. Fulbright 1996; Diener et al. 1997; Ayal et al. 2000; Kobayashi et al. 2004). Here $k$ is a constant of order unity, depending on the structure of the star. In the present implementation, we assume that the tidal disruption is always complete and that a fixed fraction of the mass of the disrupted star is accreted immediately, usually 50 per cent, as suggested by most hydrodynamical simulations. The rest is lost from the cluster. These events are predicted to trigger month- to year-long accretion flares in the UV/X-ray domain (Hills 1975; Rees 1988), some of which might have been detected already (see Komossa 2005 for a review, and Gezari et al. 2006; Esquej et al. 2007 for recent observations).
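In a spherical galactic nucleus in dynamical equilibrium, the velocity vector $\mathbf{v}$ of a star at distance $R$ from the MBH has to point inside the loss cone, in the direction towards or away from the centre, for its orbit to pass within $R_{\rm td}$. The aperture angle of the loss cone, $\theta_{\rm LC}$, is given by the relation
\begin{equation}
\sin^2(\theta_{\rm LC}) = 2 \left( \frac{R_{\rm td}}{vR} \right)^2 \left[ \frac{v^2}{2} + \frac{G M_{\rm BH}}{R_{\rm td}} \left( 1 - \frac{R_{\rm td}}{R} \right) + \Phi_\ast(R) - \Phi_\ast(R_{\rm td}) \right] \simeq \frac{2 G M_{\rm BH} R_{\rm td}}{(vR)^2} \approx \frac{R_{\rm td}}{R}, \tag{5.25}
\end{equation}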
where $\Phi_\ast(R) = \Phi(R) + G M_{\rm BH}/R$ is the cluster contribution to the gravitational potential. The first approximation is valid as long as $R \gg R_{\rm td}$, which is nearly always the case; the second is an order-of-magnitude estimate valid within the sphere of influence of the MBH, where $v^2 \approx G M_{\rm BH} R^{-1}$.

Stars on loss-cone orbits are removed on an orbital time-scale. In a spherical potential, it is generally assumed that loss-cone orbits are replenished by two-body relaxation, but orbital perturbations by resonant relaxation (see Sect. 5.4.2) or deflections by massive objects such as molecular clouds (Perets et al. 2007) may play an important role. Barring such non-standard processes, two loss-cone regimes can be distinguished (Frank & Rees 1976; Lightman & Shapiro 1977; Cohn & Kulsrud 1978). (1) The loss cone is kept full, and does not induce any significant anisotropy in the velocity distribution, when relaxation is strong enough to repopulate loss-cone orbits over an orbital time, corresponding to the condition $\theta_{\rm LC}^2 t_{\rm rlx} \ll P_{\rm orb}$. For stars in this regime, which typically occurs at large distances, the average time before tidal disruption is of order $t_{\rm disr,full} \simeq \theta_{\rm LC}^{-2} P_{\rm orb}$ (when averaged over all directions of $\mathbf{v}$). (2) The loss cone is (nearly) empty in the opposite case, $\theta_{\rm LC}^2 t_{\rm rlx} \gg P_{\rm orb}$, and corresponds to an absorbing region of phase space into which the stars diffuse. The density of stars on orbits close to, but outside, the loss cone is reduced. In this regime, it takes on average $t_{\rm disr,empty} \simeq t_{\rm rlx} \ln(\theta_{\rm LC}^{-2})$ for a star to be disrupted.

Plunges through the horizon. The last stable parabolic orbit around a non-spinning massive black hole corresponds to a (Newtonian) pericentre distance $R_{\rm LSPO} = 8 G M_{\rm BH} c^{-2}$. Sufficiently dense stars, such as compact remnants, have a tidal disruption radius $R_{\rm td}$ inside $R_{\rm LSPO}$ (or even inside the horizon), meaning that such objects will be swallowed whole rather than be tidally disrupted, and produce no accretion flare.$^6$ From the point of view of stellar dynamics, this situation is identical to the case of tidal disruptions, with the quantity $R_{\rm td}$ replaced by $R_{\rm LSPO}$.
$^6$ In fact, when $R_{\rm LSPO} > R_{\rm td} > R_{\rm hor} = 2 G M_{\rm BH} c^{-2}$, the star is disrupted before it disappears through the horizon. To my knowledge, the detectability of such events has not been investigated.
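The critical radii and the two loss-cone regimes discussed above lend themselves to a short numerical illustration. The sketch below assumes SI units, rounded constants and $k = 1$ by default; the function names are hypothetical, and the sharp threshold used to separate the full and empty regimes is a simplification of the asymptotic conditions given in the text.

```python
import math

G = 6.674e-11        # gravitational constant [m^3 kg^-1 s^-2]
C_LIGHT = 2.998e8    # speed of light [m s^-1]
M_SUN = 1.989e30     # solar mass [kg]
R_SUN = 6.957e8      # solar radius [m]


def tidal_radius(m_bh, m_star, r_star, k=1.0):
    """R_td = k R_* (M_BH / M_*)^(1/3)."""
    return k * r_star * (m_bh / m_star) ** (1.0 / 3.0)


def lspo_radius(m_bh):
    """Newtonian pericentre of the last stable parabolic orbit, 8 G M_BH / c^2."""
    return 8.0 * G * m_bh / C_LIGHT ** 2


def critical_radius(m_bh, m_star, r_star, k=1.0):
    """Return (radius, label) of the process that removes the star."""
    r_td = tidal_radius(m_bh, m_star, r_star, k)
    r_lspo = lspo_radius(m_bh)
    if r_td > r_lspo:
        return r_td, "tidal disruption"
    return r_lspo, "plunge"


def loss_cone_regime(theta_lc_sq, t_rlx, p_orb):
    """Return (regime, mean time to disruption), order-of-magnitude only."""
    if theta_lc_sq * t_rlx < p_orb:
        return "full", p_orb / theta_lc_sq
    return "empty", t_rlx * math.log(1.0 / theta_lc_sq)
```

With these numbers a Sun-like star around a $4 \times 10^6\,M_\odot$ MBH is tidally disrupted, whereas a white dwarf (taken here as $0.6\,M_\odot$ and $0.012\,R_\odot$, illustrative values) around the same MBH plunges.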
Inspirals by emission of gravitational waves. Significant emission of gravitational waves (GWs) occurs during very close encounters with the MBH (Peters & Mathews 1963). For a compact massive stellar object on a very eccentric orbit, GW emission may dominate orbital evolution over two-body relaxation, leading to progressive circularisation and shrinking of the semi-major axis (Peters 1964) until the object plunges through the horizon of the MBH (or is tidally disrupted). For a 1–10 $M_\odot$ object orbiting a MBH with a mass between $10^4$ and $10^7\,M_\odot$, the final months or years of inspiral should be detectable by the future space-borne GW observatory LISA$^7$ to distances of several Gpc. Such extreme mass ratio inspirals (EMRIs) yield an unprecedented view of the direct vicinity of MBHs. The promise for physics and astrophysics is as exciting as the uncertainties about their physical rates and the challenges for data analysis are high (see Amaro-Seoane et al. 2007 for an extensive review of the various aspects of EMRI research).
I now explain in some detail how the loss-cone physics is implemented in the MC code. This treatment is adequate only for the processes requiring a single passage within a well-defined critical distance of the MBH to be successful, such as tidal disruptions, plunges or non-repeating GW bursts emitted by stars on quasi-parabolic orbits (Hopman et al. 2007). In contrast, an EMRI is a progressive process that will only be successful (as a potential source for LISA) if the stellar object experiences a very large number of successive dissipative close encounters with the MBH (Alexander & Hopman 2003). The ability of the MC approach to deal with this situation is discussed in Amaro-Seoane et al. (2007).
At the end of the step in which two particles have experienced an encounter (to simulate two-body relaxation), each particle is tested for entry into the loss cone, $J < J_{\rm LC}$, where $J_{\rm LC} = R\,v \sin(\theta_{\rm LC}) \simeq \sqrt{2 G M_{\rm BH} R_{\rm td}}$ (5.25). A complication arises because the time step $\delta t$ used in the MC code is a fraction $f_{\delta t} = 10^{-3}$–$10^{-2}$ of the local relaxation time $t_{\rm rlx}(R)$, which is much larger than the critical timescale $\theta_{\rm LC}^2 t_{\rm rlx}$. In other words, the super-encounter deflection angle $\theta_{\rm SE}$ (5.3) is much larger than $\theta_{\rm LC}$. This keeps the loss cone effectively and artificially full. However, in contrast with direct $N$-body simulations, this is not due to the overall relaxation rate being too large when $N_{\rm p} < N_\ast$.
To treat the empty loss-cone regime in the most accurate fashion, we would need to use time steps as short as the orbital period. Unfortunately, it is not possible to give short time steps only to particles with eccentric orbits (and hence at risk of entering the loss cone), because the time step is a function of the position $R$ and cannot be attached to a particle. Hence, at least all particles within the critical radius defined by $t_{\rm disr,full}(R_{\rm crit}) = t_{\rm disr,empty}(R_{\rm crit})$, where the $t$ quantities are some local average, would need to have much shorter time steps, which would slow down the code considerably. Instead, an approximate procedure is used to ensure that entry into the loss cone happens diffusively when $\theta_{\rm LC}^2 t_{\rm rlx} \gg P_{\rm orb}$.

After the super-encounter deflection angle $\theta_{\rm SE}$ has been computed (5.3), and before the particles in the pair are given their new energies, angular momenta and positions, we check each of them for entry into the loss cone in the following manner. First, the orbital period is computed by integrating (5.15) using Chebyshev quadrature (Press et al. 1992). We consider that, during one orbital period $P_{\rm orb}$ (shorter than $\delta t$), the direction of the velocity of the particle would have changed by an rms angle $\theta_{\rm orb} = (P_{\rm orb}/\delta t)^{1/2} \theta_{\rm SE}$. We then assume that the tip of the velocity vector of the particle executes a random walk of $N_{\rm RW} = \delta t/P_{\rm orb}$ sub-steps of length $\theta_{\rm orb}$ during $\delta t$. The modulus of the velocity is kept constant. Entry into the loss cone is tested at each of these sub-steps. This random walk is executed in the reference frame of the super-encounter, but independently for each particle of the pair because they have different $\theta_{\rm orb}$ and $N_{\rm RW}$. If a particle is found on a loss-cone orbit, it is immediately removed and (part of) its mass is added to the MBH. If the random walk never crosses into the loss cone, the particle is kept and, in order to ensure exact energy conservation, the particle is given the velocity computed in the super-encounter, not that reached at the end of the random walk. The random walk is a refinement of the super-encounter from a statistical point of view but, because of its stochastic nature, it cannot produce velocity vectors anti-parallel to each other for the particles in a pair. This means that energy in the reference frame of the cluster (as opposed to that of the pair) would not be conserved. It might be possible to improve this procedure by performing the random walk in the cluster reference frame and leaving the particle with the velocity attained at the end of it. This would permit us to obtain the correct decrease of density on the orbits close to the loss cone.

$^7$ Laser Interferometer Space Antenna; see http://www.lisa-science.org
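The random-walk test can be illustrated with a short sketch. Several simplifications are assumptions of this example rather than features of the actual code: the velocity direction is a unit vector in a frame whose $z$ axis is the radial direction (so the frame transformation to the super-encounter frame is ignored), each sub-step has exactly the angle $\theta_{\rm orb}$ in a random azimuth, and a single step of angle $\theta_{\rm SE}$ is used when $P_{\rm orb} \ge \delta t$. All names are hypothetical.

```python
import math
import random


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _normalize(a):
    norm = math.sqrt(sum(x * x for x in a))
    return tuple(x / norm for x in a)


def random_perpendicular(u, rng):
    """Unit vector perpendicular to the unit vector u, with random azimuth."""
    helper = (1.0, 0.0, 0.0) if abs(u[0]) < 0.9 else (0.0, 1.0, 0.0)
    e1 = _normalize(_cross(u, helper))
    e2 = _cross(u, e1)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    return tuple(math.cos(phi) * a + math.sin(phi) * b for a, b in zip(e1, e2))


def in_loss_cone(u, theta_lc):
    """True if u lies within theta_lc of the radial (z) axis, inward or outward."""
    return abs(u[2]) >= math.cos(theta_lc)


def enters_loss_cone(u0, theta_se, p_orb, dt, theta_lc, rng):
    """Random walk of the velocity direction; True if the loss cone is entered."""
    if p_orb < dt:
        n_rw = max(1, round(dt / p_orb))              # N_RW = dt / P_orb
        theta_orb = math.sqrt(p_orb / dt) * theta_se  # rms deflection per orbit
    else:
        n_rw, theta_orb = 1, theta_se                 # assumed fallback
    u = _normalize(u0)
    for _ in range(n_rw):
        e = random_perpendicular(u, rng)
        u = _normalize(tuple(math.cos(theta_orb) * a + math.sin(theta_orb) * b
                             for a, b in zip(u, e)))
        if in_loss_cone(u, theta_lc):
            return True   # particle would be removed and its mass given to the MBH
    return False          # particle kept, with the super-encounter velocity
```

Because the modulus of the velocity is kept constant, only the direction needs to be followed, which is why a unit vector suffices here.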
In the context of loss-cone physics, I mention another type of Monte-Carlo code, developed by Shapiro and collaborators at Cornell University (see Shapiro 1985 for a review and references). Their approach was essentially a hybrid between that presented here, entirely based on particles and with no explicit computation of diffusion coefficients, and the direct Fokker–Planck integration (Chap. 4). Instead of having particles interacting in pairs, their density in the $(E, J)$ phase space was tabulated in order to compute diffusion coefficients used to modify their orbital parameters during the next global step. Within a global step, each particle could be evolved independently of the others (and on its own time step) until the updated phase-space density (and potential) was recomputed. This made it possible to give the particles in or close to the loss cone time steps as short as their orbital time. Extending this scheme to a multi-mass situation seems feasible without explicit use of an augmented (and sparsely populated) $(E, J, M_\ast)$ phase space. Unfortunately, to my knowledge, such a development was not attempted.
Binary Stars
The MC code presented so far in this chapter only deals with the dynamics and evolution of single stars. This is a reasonable simplification as far as the overall dynamics of galactic nuclei is concerned, because in such environments most binaries are very soft, meaning that their internal orbital velocity is much smaller than the velocity dispersion, at least in the vicinity of a MBH, where the density and interaction probability are the highest. However, binaries play a major role in the evolution of globular clusters, where the hard ones act as an efficient central source of heat by being shrunk and eventually ejected during interactions with other stars (Aarseth 1974; Spitzer & Mathieu 1980; Gao et al. 1991; Hut et al. 1992; Heggie & Hut 2003; Giersz 2006; Fregeau & Rasio 2007, amongst many others). For a given stellar density, binaries also greatly increase the rate of direct collisions between stars (Portegies Zwart et al. 1999; Portegies Zwart & McMillan 2002; Portegies Zwart et al. 2004; Fregeau et al. 2004). Besides their dynamical role, binary interactions in dense clusters are also of high interest as a way to create a whole zoo of “stellar exotica” and phenomena, including blue stragglers, millisecond pulsars, and mergers between compact stars as sources of supernovae, gamma-ray bursts or gravitational waves (e.g. Hurley et al. 2001; Davies 2002; Shara & Hurley 2002; Benacquista 2006; Grindlay et al. 2006; O’Leary et al. 2007). Including binaries in models of galactic nuclei is also important to explain X-ray observations at the Galactic centre (Muno et al. 2005) and hyper-velocity stars (e.g. Hills 1988; Brown et al. 2005), and as a possible channel to create extreme-mass-ratio sources of gravitational waves for LISA (Miller et al. 2005).
Here, I put aside the very thorny question of binary evolution and how it might be affected by dynamics (see Chaps. 11 and 12), and concentrate on the dynamical aspects. Binaries have been included in MC simulations with various levels of sophistication (Spitzer & Mathieu 1980; Stodółkiewicz 1985, 1986; Giersz 1998, 2001, 2006; Giersz & Spurzem 2000, 2003; Fregeau et al. 2003; Gürkan et al. 2006; Fregeau & Rasio 2007; Spurzem et al. 2006). The approach of Fregeau & Rasio (2007) is based on our own treatment of collisions and is the most direct and accurate one, at least when each particle represents a single system (single star or binary). This treatment does not include formation of binaries through three-body interactions (see the works of Stodółkiewicz and Giersz).
To include binaries in a MC code, we first need to allow some of the particles to represent binaries instead of single stars, which requires extra data to keep track of the internal structure, masses and evolutionary phase of the member stars, the semi-major axis $a_{\rm bin}$ and the eccentricity $e_{\rm bin}$. In the absence of interaction with another star or binary, these parameters are updated by the use of some binary evolution prescription. Then, similar to stellar collisions, including binary dynamics amounts to (1) determining the probability of a binary interaction, $P_{\rm bin}$, between two neighbouring particles if at least one of them is a binary; (2) generating a random number $X_{\rm rand}$; and, if $X_{\rm rand} < P_{\rm bin}$, (3) implementing a single–binary or binary–binary encounter.

Steps (1) and (2) are the same as in the implementation of collisions between single stars. Actually, at this level, binary interactions do not need to be distinguished from stellar collisions. We only need to give binaries a radius $\eta a_{\rm bin}$, where $\eta > 1$ is a safety factor to ensure that all interactions that can perturb the binaries significantly are taken into account. Fregeau & Rasio (2007) chose $\eta = 2$ and checked that a value $\eta = 4$ (which could cause the time steps to be about twice as short) does not lead to statistically different results as far as the overall evolution of the cluster and binary population is concerned. More complex forms of the criterion for the most distant encounter to be included have been used by other authors (e.g. Bacon et al. 1996; Giersz & Spurzem 2003). The simple rule described here, based on proximity at the closest approach (when each binary is treated as a point mass), should yield correct results if $\eta$ is made sufficiently large, but in studies of small perturbations to binaries (or planetary systems) it may be less than optimal, in the sense that large $\eta$ values will yield small time steps. Indeed, for binaries we have to substitute $\eta a_{\rm bin}$ for $r$ in (5.23). Roughly speaking, with binaries at the hard–soft boundary ($G m_{\rm bin} a_{\rm bin}^{-1} \simeq \sigma_v^2$), the time step will be limited by binary processes rather than by two-body relaxation if $\eta > \ln\Lambda$.
Between interactions, binaries are treated as unperturbed and their properties are updated using binary evolution prescriptions. Note that this is also the case in $N$-body codes, unless another object comes within a distance $d_{\rm pert} = \gamma_{\rm min}^{-1/3} (2 m_{\rm pert}/m_{\rm bin})^{1/3} (1 + e_{\rm bin})\, a_{\rm bin}$, where $m_{\rm pert}$ is the mass of the perturber and $\gamma_{\rm min}$ is the tidal perturbation parameter (Aarseth 2003 and Chap. 1). In most cases, $\gamma_{\rm min}$ is set to $10^{-6}$. Hence, in a similar-mass situation ($m_{\rm pert} \approx m_{\rm bin}$), the $N$-body prescription corresponds to $\eta \approx 100$–$200$ in the MC collision formalism. Whether this much more conservative condition yields significantly different results in the evolution of the binaries and their host cluster has not been investigated in depth (see Giersz & Spurzem 2003; Spurzem et al. 2006 for some discussion). Incidentally, such research may open the possibility of a more approximate but much faster treatment of binary interactions in direct $N$-body codes.
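The correspondence between the $N$-body perturbation criterion and the safety factor $\eta$ is a one-line computation. The sketch below simply evaluates the expression for $d_{\rm pert}$ given above; the function names are illustrative.

```python
def perturber_distance(a_bin, e_bin, m_pert, m_bin, gamma_min=1e-6):
    """d_pert = gamma_min^(-1/3) (2 m_pert / m_bin)^(1/3) (1 + e_bin) a_bin."""
    return (gamma_min ** (-1.0 / 3.0)
            * (2.0 * m_pert / m_bin) ** (1.0 / 3.0)
            * (1.0 + e_bin) * a_bin)


def equivalent_eta(e_bin, mass_ratio=1.0, gamma_min=1e-6):
    """Safety factor eta = d_pert / a_bin for a given m_pert / m_bin."""
    return perturber_distance(1.0, e_bin, mass_ratio, 1.0, gamma_min)
```

For equal masses and $\gamma_{\rm min} = 10^{-6}$ this gives $\eta \approx 126$ for a circular binary and $\eta \approx 252$ in the limit $e_{\rm bin} \to 1$, of the same order as the range quoted above.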
The most direct and accurate (but also time-consuming) way of implementing step (3), i.e. of determining the outcome of a binary encounter occurring in a MC simulation, is to switch to a direct few-body integrator (see Chap. 2 for algorithms). First, the quantities not specified by the MC particles have to be picked at random. These are the orbital phase(s) and orientation(s), and the impact parameter.$^8$ One difficulty arises with binary–binary encounters, as they often result in the formation of a stable triple system. As mentioned by Giersz & Spurzem (2003) and Fregeau & Rasio (2007), it is in principle possible to have some particles representing triples (or higher-order stable groups) in the MC framework with the appropriate book-keeping, but this has not been implemented so far. Instead, triple systems are forcibly broken apart into a binary and a single star just unbound to the binary. Another type of outcome that may require special treatment is the formation of a very wide, soft binary, with a size not much smaller than the typical size of the cluster. Such pairs cannot be treated accurately in the MC formalism, but they are unlikely to survive the next interaction, so they can be artificially broken up without affecting the results. Finally, as mentioned above, it is probably important to allow for direct collisions during binary interactions. One source of uncertainty is the size of a merged star just after a collision. It is likely to be several times the MS radius, leading to a significant probability of a triple or quadruple collision (Goodman & Hernquist 1991; Lombardi et al. 2003; Fregeau et al. 2004).

$^8$ In principle, we could keep track of the orbital phase of a binary between interactions. However, the MC method relies on the assumptions that strong interactions are rare and that binaries are much smaller than any length scale in the cluster. This effectively randomises the orbital phase between interactions.
Once the outcome of a binary–single or binary–binary interaction has been determined, the products of the interaction are turned back into MC particles representing single or binary stars with the adequate internal and orbital properties, and a position in the cluster is selected for each according to the procedure presented in Sect. 5.3.1.
Integrating the few-body encounters in a cluster with a large fraction of binaries can account for a significant fraction of the computing time. A much faster way to deal with binary dynamics is to use “recipes”, which are fitting formulae for the cross section and outcome of interactions, based on large pre-computed sets of scattering experiments (e.g. Heggie 1975; Hut 1993; Heggie et al. 1996). However, for stars of unequal masses, the parameter space is too vast to be reliably covered by such recipes. Even in the idealised case where all stars have the same mass, for which comprehensive binary-interaction cross sections are available, the use of such recipes rather than explicit few-body integrations seems to yield quantitatively inaccurate results (Fregeau et al. 2003; Fregeau & Rasio 2007).
Other Physical Ingredients
MC codes can include a few other physical processes, which I describe more succinctly.
Stellar evolution – Evolution of stars (single or binaries) can be taken into account with various levels of refinement. In our MC code, a very simple prescription is used, which assumes that a star of initial mass $M_\ast$ spends a time $t_{\rm MS}(M_\ast)$ on the MS without any evolution and abruptly turns into a compact remnant at the end of this period. Thus, the giant phase is neglected. The relation $t_{\rm MS}(M_\ast)$ and the prescriptions for the nature and mass of the remnant are taken from stellar evolution models (Hurley et al. 2000; Belczynski et al. 2002). To ensure that stellar evolution time-scales are resolved, a supplementary constraint on the time step is introduced: $\delta t_i \le f_{\delta t \ast}\, t_{\ast i}$, where $t_{\ast i}$ is an estimate for the stellar evolution time-scale of stars at rank $i$, and $f_{\delta t \ast} = 0.025$ typically. In the present implementation, $t_{\ast i}$ is simply the MS lifetime of the particle which has rank $i$ at the moment the time steps are computed. Because we use a piecewise constant representation of $\delta t$, the time step will generally be shorter than a fraction $f_{\delta t \ast}$ of the smallest local value of $t_{\rm MS}$. Once a pair of particles is selected, it is first checked for stellar evolution, and its masses and radii are updated, if required, before the super-encounter (or collision) is carried out. Natal kicks can be given to newborn neutron stars and black holes (Freitag et al. 2006a).

This simplistic treatment can be improved by the use of detailed stellar evolution packages (Portegies Zwart & Verbunt 1996; Portegies Zwart & Yungelson 1998; Hurley et al. 2000, 2001; see also Chaps. 10 and 13). A difficulty to confront, however, is that this will involve shorter time-scales $t_\ast$, e.g. to resolve the giant phase. In general, stars with short $t_\ast$ can be found anywhere in a cluster, imposing (unlike relaxation or collision) uniformly short time steps. This could be prevented by using a time-stepping scheme for stellar evolution independent of the dynamical one. For instance, using a heap structure (Press et al. 1992), we could keep track of the next particle requiring update of its stellar parameters and realise this update when due, without changing the orbital parameters (except if a natal kick is imparted).
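The heap-based scheduling suggested above is not implemented in the code described here; the following is only a sketch of how it might look, using Python's standard `heapq` module. The function names and the callback interface (`on_update` returning the next due time, or `None` when no further update is needed) are assumptions of this example.

```python
import heapq


def build_schedule(update_times):
    """Heap of (time of next stellar-evolution update, particle index)."""
    heap = [(t_due, idx) for idx, t_due in enumerate(update_times)]
    heapq.heapify(heap)
    return heap


def evolve_until(heap, t_now, on_update):
    """Process every particle whose stellar update is due at or before t_now.

    on_update(idx, t_due) applies the stellar-evolution change to particle idx
    and returns the time of its next update, or None if none is required.
    Returns the indices of the updated particles, in order of due time.
    """
    updated = []
    while heap and heap[0][0] <= t_now:
        t_due, idx = heapq.heappop(heap)
        t_next = on_update(idx, t_due)
        if t_next is not None:
            heapq.heappush(heap, (t_next, idx))
        updated.append(idx)
    return updated
```

In the simple prescription of this section, the initial due times would be the MS lifetimes, and `on_update` would turn the star into a remnant and return `None`. Only the particle at the top of the heap needs to be inspected at each dynamical step, so the dynamical time steps need not be shortened.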
Large-angle scatterings – Gravitational encounters between stars of mass $m_1$ and $m_2$ at a relative velocity $v_{\rm rel}$ with an impact parameter smaller than a few $b_0 \equiv G(m_1 + m_2) v_{\rm rel}^{-2}$ lead to deflection angles too large to be accounted for in the standard diffusive theory of relaxation. On average, a star will experience an encounter with impact parameter smaller than $f_{\rm LA} b_0$ (with $f_{\rm LA}$ of order a few) over a time-scale
\begin{equation}
t_{\rm LA} \simeq \left[ \pi (f_{\rm LA} b_0)^2 n \sigma \right]^{-1} \approx \frac{\ln\Lambda}{f_{\rm LA}^2}\, t_{\rm rlx}. \tag{5.26}
\end{equation}
The effects of large-angle scatterings on the overall evolution of a cluster are negligible in comparison with diffusive relaxation (Hénon 1975; Goodman 1983). However, unlike the latter process, they can produce velocity changes strong enough to eject stars from an isolated cluster (Hénon 1960, 1969; Goodman 1983) or, more importantly, from the region of influence around a MBH (Lin & Tremaine 1980; Baumgardt et al. 2004; O’Leary & Loeb 2008). Large-angle scatterings are easily included in MC simulations as a special case of collision, with a cross section $\pi (f_{\rm LA} b_0)^2$ (Freitag et al. 2006a), but the time steps will be limited by this (rare) process rather than by diffusive relaxation for $f_{\rm LA} \gtrsim 4$.
Tidal evaporation – Stellar clusters are subject to the tidal influence of their host galaxy. Assuming spherical symmetry, the MC code cannot deal with the galactic field accurately, but it is easy to include in an approximate way the most important effect, which is the evaporation of stars from the cluster. A star can escape from a cluster on a circular orbit of radius $R_{\rm G}$ around a spherical host galaxy if its orbit allows it to reach the Lagrange point away from, or in the direction of, the galaxy. These locations are approximately at a distance $R_{\rm L} = R_{\rm G} (M_{\rm cl}/(2 M_{\rm G}))^{1/3}$ from the cluster’s centre, where $M_{\rm cl}$ and $M_{\rm G}$ are the masses of the cluster and a point-mass galaxy, respectively. In the spherical approximation, we assume that a star escapes when its apocentre distance is larger than $R_{\rm L}$. As the total mass of the cluster decreases, the value of $R_{\rm L}$ is adjusted. This can lead to more stars being lost if their apocentre distances happen to lie beyond the new $R_{\rm L}$ value, so we have to iterate until convergence is reached for the bound mass of the cluster. Using such a treatment of tidal evaporation, combined with a prescription for the orbital decay of the cluster owing to dynamical friction, Gürkan & Rasio (2005) have simulated the internal and orbital evolution of clusters at the Galactic centre.
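The iteration on the bound mass can be sketched as follows. This is a minimal illustration of the convergence loop only, assuming that the apocentre distance of each star is already known; the function names and the list-based interface are hypothetical.

```python
def lagrange_radius(r_g, m_cl, m_g):
    """R_L = R_G (M_cl / (2 M_G))^(1/3)."""
    return r_g * (m_cl / (2.0 * m_g)) ** (1.0 / 3.0)


def strip_escapers(masses, apocentres, r_g, m_g):
    """Remove stars with apocentre beyond R_L until the bound mass converges.

    Returns (bound, r_l): a list of booleans flagging the stars that remain
    bound, and the final value of R_L (0.0 if no mass remains bound).
    """
    bound = [True] * len(masses)
    r_l = 0.0
    while True:
        m_cl = sum(m for m, is_bound in zip(masses, bound) if is_bound)
        if m_cl <= 0.0:
            return bound, 0.0
        r_l = lagrange_radius(r_g, m_cl, m_g)
        escapers = [i for i, r_apo in enumerate(apocentres)
                    if bound[i] and r_apo > r_l]
        if not escapers:
            return bound, r_l   # converged: no further star escapes
        for i in escapers:
            bound[i] = False    # the cluster loses mass, so R_L shrinks
```

Each pass can only remove stars, so the loop terminates after at most as many passes as there are stars.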
5.4 Some Results and Possible Future Developments
Monte-Carlo codes have been used in a variety of problems involving the collisional evolution of globular clusters and galactic nuclei. I do not attempt to review this variety of works but invite the reader to sample the references cited in Sect. 5.1. Here, I limit myself to the quick presentation of a few typical results, to give a flavour of the capabilities of the method.
5.4.1 Young Clusters and Globular Clusters
In Figs. 5.3 and 5.4, I show the evolution to core collapse of single-mass and multi-mass Plummer models, computed with the MC code described here with no other physics than two-body relaxation. I compare with direct Nbody4 results (H. Baumgardt 2005, personal communication). Provided the value of $\gamma_{\rm c}$ needed to convert N-body time units (see Chap. 1) to relaxation time is adjusted in an ad hoc fashion, very good agreement between the methods is obtained for these cases. We find $\gamma_{\rm c} \simeq 0.15$ for the single-mass model and $\gamma_{\rm c} \simeq 0.03$ for a Salpeter mass function ($dN_*/dM_* \propto M_*^{-2.35}$) extending from 0.2 to 10 $M_\odot$, in agreement with theoretical expectations and previous numerical determinations (Henon 1975; Giersz & Heggie 1994, 1996; Freitag et al. 2006c). We note that in N-body simulations core collapse is always halted and reversed by the formation and hardening of binaries through close three-body interactions (e.g. Aarseth 1971; Heggie & Hut 2003), a process not included in the MC code. When the mass function is extended to 120 $M_\odot$, the agreement between MC and N-body simulations is poorer, but the time to core collapse is found to be approximately the same in terms of relaxation time, namely a surprising 10–20 per cent of the initial central relaxation time (Spitzer 1987),
$$t_{\rm rc}(0) \equiv 0.339\,\frac{\sigma_v^3}{\ln\Lambda\; G^2 \langle m\rangle^2 n}, \tag{5.27}$$
Fig. 5.3 Core collapse of a single-mass cluster initialised as a Plummer model. The results of the MC code using 250 000 particles, in solid lines, are compared to a direct Nbody4 simulation using 64 000 particles, in dashes (H. Baumgardt 2005, personal communication). Top panel: evolution of radii of the Lagrangian spheres containing the indicated fraction of the mass. Bottom panel: evolution of the anisotropy parameter averaged over Lagrangian shells bounded by the indicated mass fractions. The length unit is the N-body scale (see Chap. 1). The time unit is the initial half-mass relaxation time (Spitzer 1987). To convert the dynamical time units of the N-body simulation to a relaxation time, a value of $\gamma_{\rm c} = 0.15$ was used for the Coulomb logarithm.
Fig. 5.4 Core collapse of a Plummer cluster with a 0.2–10 $M_\odot$ Salpeter mass function. A MC code simulation with $10^6$ particles, in solid lines, is compared to a direct Nbody4 simulation with 256 000 particles, in dashes (H. Baumgardt 2005, personal communication). To show mass segregation, the evolution of Lagrangian radii is plotted for mass fractions of 1 and 50 per cent for stars with masses within five different bins. To convert the dynamical time units of the N-body simulation to a relaxation time, a value of $\gamma_{\rm c} = 0.03$ was used for the Coulomb logarithm. Compare with Fig. 4.1.
where the quantities $\langle m\rangle$, $n$ and $\sigma_v$ are determined at the centre. This is a result of great interest, as it raises the possibility of triggering a phase of runaway collisions in young dense clusters (Quinlan & Shapiro 1990; Portegies Zwart et al. 1999; Portegies Zwart & McMillan 2002; Gurkan et al. 2004; Portegies Zwart et al. 2004; Freitag et al. 2006b,c).
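As a worked illustration of Eq. (5.27), the short sketch below evaluates the central relaxation time in astrophysical units. The function name, the choice of units (km/s, pc$^{-3}$, $M_\odot$, Myr) and the numerical constants for $G$ and the unit conversion are assumptions of this example, not part of the original text; the Coulomb logarithm is passed in directly rather than derived from $\gamma_{\rm c}$.

```python
# Assumed constants for the unit system of this example
G_PC_KMS = 4.30091e-3          # G in pc (km/s)^2 / M_sun
PC_PER_KMS_IN_MYR = 0.977792   # 1 pc/(km/s) expressed in Myr

def central_relaxation_time_myr(sigma_v, n, mean_mass, ln_lambda):
    """Central relaxation time of Eq. (5.27), returned in Myr.

    sigma_v   : central velocity dispersion sigma_v [km/s]
    n         : central stellar number density [pc^-3]
    mean_mass : central mean stellar mass <m> [M_sun]
    ln_lambda : Coulomb logarithm (dimensionless)
    """
    t = 0.339 * sigma_v**3 / (ln_lambda * G_PC_KMS**2 * mean_mass**2 * n)
    return t * PC_PER_KMS_IN_MYR
```

The strong $\sigma_v^3$ dependence and the inverse dependence on density are what make dense, dynamically cold young clusters relax, and hence core-collapse, so quickly.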
A domain where MC simulations are bound to play a unique role in the next few years is the evolution of large clusters with a high fraction of primordial binaries. This is one of the most challenging situations for direct N-body codes because the evolution of regularised binaries cannot be computed on special-purpose GRAPE hardware. At the time of writing, the published N-body simulations tallying the largest number of binaries are those by Hurley et al. (2005), with 12 000 binaries amongst 36 000 stars, and by Portegies Zwart et al. (2007), with 13 107 binaries amongst 144 179 stars. In contrast, Fregeau & Rasio (2007) present tens of MC simulations for $10^5$ particles, some with 100 per cent binaries, and a few $3\times10^5$ particle cases with up to $1.5\times10^5$ binaries (see also Gurkan et al. 2006). Although single and binary stellar evolution were not included in these simulations, they can be incorporated into MC codes in the same way and with the same level of realism as in direct N-body
Fig. 5.5 Evolution of a cluster containing 30 per cent of (hard) primordial binaries (J. Fregeau 2007, personal communication). The cluster is set up as a Plummer model of $10^5$ particles with masses distributed according to a Salpeter IMF between 0.2 and 1.2 $M_\odot$. Stellar evolution is not simulated. Top panel: total cluster mass, $M/M(0)$ (dashed line), and mass in binaries, $M_{\rm b}/M_{\rm b}(0)$ (dot-dashes), normalised to the initial values. Bottom panel: core radius $r_{\rm c}$ (solid line), half-mass radius of single stars $r_{\rm h,s}$ (dashes) and half-mass radius of binaries $r_{\rm h,b}$ (dot-dashes), in N-body units. Time is in units of the initial half-mass relaxation time $t_{\rm rh}$. For more information on this work, see Fregeau & Rasio (2007).
codes. In Fig. 5.5, I show the results from a simulation of a cluster with 30 per cent primordial binaries, i.e. $N_{\rm bin}/(N_{\rm bin} + N_{\rm single}) = 0.3$ (J. Fregeau 2007, personal communication). Binaries stabilise the core against collapse for a duration of tens of half-mass relaxation times, corresponding to more than the Hubble time when applied to real globular clusters. The quasi-equilibrium size of the core maintained during this long phase of binary burning appears to be too small to explain the observed core size of most non-collapsed Galactic clusters. It is not yet clear whether this discrepancy is to be blamed on the neglect of stellar evolution and other well-known physical effects (collisions, non-stationary Galactic tides, etc.) or can only be resolved by assuming some more exotic physics, such as the presence of IMBHs in many clusters (Baumgardt et al. 2005; Miocchi 2007; Trenti et al. 2007), but it seems that MC simulations are the ideal tool to investigate this issue.
Monte-Carlo codes that treat the dynamics and evolution of single and binary stars in great detail should be available very soon, allowing the simulation of clusters containing up to $10^7$ stars on a star-by-star basis with a high level of realism, as long as the assumptions of spherical symmetry and dynamical equilibrium are justified. I now mention a few strong motivations to try and extend the realm of MC cluster simulations beyond these assumptions.
• Galactic tides. The treatment of stellar evaporation from a cluster can be improved significantly. First, stars have to find the narrow funnels around the Lagrange points to exit the cluster (e.g. Fukushige & Heggie 2000; Ross 2004). Hence, it takes a star several dynamical times to find the "exit door", even when some approximate necessary condition for the escape is reached, such as an apocentre distance (in the spherical potential) larger than the distance to the Lagrange point. Therefore, a significant fraction of the stars in a cluster can be potential escapers (Fukushige & Heggie 2000; Baumgardt 2001). Using (semi)analytical prescriptions from the cited studies, one could take this effect into account in MC simulations by giving potential escapers a finite lifetime before they are actually removed from the cluster (see Takahashi & Portegies Zwart 2000 for a similar approach applied to Fokker–Planck simulations). Other important effects of the galactic gravitational field, absent from MC simulations (and most other cluster simulations), come from its non-steadiness. A cluster on an eccentric orbit experiences a stronger tidal stress at pericentre, an effect dubbed bulge shocking, while compressive disc shocking happens when the cluster crosses the plane of the galactic disc (e.g. Spitzer 1987; Gnedin & Ostriker 1997; Baumgardt & Makino 2003; Dehnen et al. 2004). Such effects can be included in MC codes using the same (semi)analytical prescriptions as in some Fokker–Planck integrations (Gnedin & Ostriker 1997; Gnedin et al. 1999). Alternatively, because shocking occurs on a time-scale much shorter than the relaxation time, we could switch back and forth between a fast non-collisional N-body algorithm (such as Superbox, see Chap. 6) to compute the effects of the shocks and a MC code to evolve the cluster between shocks. Another possibility would be a hybrid non-spherical MC/N-body method, suggested in the next point.
• Rotating clusters. Observational evidence and theoretical models indicate that clusters may be born with significant rotation, possibly as a result of the merger of two clusters (see references in Amaro-Seoane & Freitag 2006). The MC approach exposed here is not appropriate to study non-spherical systems but, as already suggested by Henon (1971a), it might be possible to develop a hybrid approach where a collisionless N-body code is used for fast orbit sampling in a non-spherical geometry (by actual orbital integration) and collisional effects are included explicitly, in a MC fashion, by realising super-encounters between neighbouring pairs. A combination of the Self-Consistent Field N-body method with Fokker–Planck relaxation terms was developed by S. Sigurdsson to study the evolution of globular clusters orbiting a galaxy (Johnston et al. 1999) but, to my knowledge, no MC/N-body hybrid has ever been developed. Such a code would also be of great interest in the study of galactic nuclei, as mentioned in Sect. 5.4.2.
• Primordial gas. Observations show that, when a cluster forms, not more than 30 per cent of the gas is eventually turned into stars (Lada 1999). In relatively small clusters, the gas is expelled by the ionising radiation and winds of OB stars within the first 1–2 Myr. In clusters with an escape velocity larger than about 10 km s$^{-1}$, complete expulsion of the gas probably only occurs when the first SN explodes (Kroupa et al. 2001; Boily & Kroupa 2003a,b; Baumgardt & Kroupa 2007, and references therein. See also Sect. 7.4). When still present in the cluster, the gas dominates the gravitational potential. Furthermore, it can strongly affect the orbits and mass of stars as they accrete and slow down to conserve momentum, thus shaping the mass function and producing strong segregation (Bonnell et al. 2001a,b; Bonnell & Bate 2002). Such effects can be included in MC codes if the gas is treated as a smooth, parametrised component. However, to follow the reaction of the cluster to the fast gas expulsion, we would have to switch to a (collisionless) N-body code or Spitzer-type dynamical MC scheme because the Henon algorithm can only treat adiabatic potential evolution.
5.4.2 Galactic Nuclei
In addition to the study of globular and young clusters, the MC code is also a method of choice for the study of small galactic nuclei (Freitag 2001; Freitag & Benz 2001a,b, 2002; Freitag 2003; Freitag et al. 2006a). Massive black holes (MBHs) less massive than about $10^7\,M_\odot$ are probably generally surrounded by a stellar nucleus with a relaxation time shorter than $10^{10}$ yr at the distance where the mass in stars is equal to the mass of the MBH (e.g. Lauer et al. 1998; Genzel et al. 2003; Freitag et al. 2006a; Merritt & Szell 2006). Although direct N-body codes with GRAPE hardware can now be used to study some important aspects of the collisional evolution of galactic nuclei (Preto et al. 2004; Merritt & Szell 2006; Merritt et al. 2007b), they are still limited to $10^6$ particles for this kind of application, which falls short of the number of stars in galactic nuclei.
In Fig. 5.6, I show the evolution of a small galactic nucleus computed with the MC code described in this chapter. In addition to two-body relaxation, the physics include the effects of a (growing) central MBH (tidal disruption, direct mergers for objects too compact to be disrupted) and stellar collisions. Large-angle scatterings were found to be of secondary importance for such systems, and stellar evolution can be taken into account, but this raises the question of how much gas from stellar evolution will be accreted by the MBH (Freitag et al. 2006a). For the model presented, segregation of stellar-mass black holes
Fig. 5.6 Evolution of the model for a small galactic nucleus hosting a MBH with a mass of $3.5\times10^4\,M_\odot$, with $2.1\times10^6$ particles (model GN84 of Freitag et al. 2006a). Top panel: evolution of Lagrangian radii for the various stellar species (MS: main-sequence, WD: white dwarfs, NS: neutron stars, BH: stellar black holes). The stellar population has a fixed age of 10 Gyr. Bottom panel: accretion of stellar material by the MBH. For tidal disruptions, 50 per cent of the mass of the star is accreted. "Mergers" are events in which an object crosses the horizon whole. Collisions between MS stars are also taken into account, with all the released gas being accreted by the MBH.
to the centre occurs within some 50 Myr, after which their swallowing by the MBH drives the expansion of the nucleus. For models with parameters pertaining to the Milky Way nucleus, mass segregation takes about 3–5 Gyr and only little expansion occurs in a Hubble time. The segregation of stellar black holes is of key importance for the formation of EMRI sources for LISA (Hopman & Alexander 2006b; Amaro-Seoane et al. 2007, and references therein).
Simulations of galactic nuclei have not yet reached as high a level of realism as one might wish. Several aspects of the physics are still lacking, including the following elements.
• Binary stars. Binary stars are probably not effective as a source of heat because the ambient velocity dispersion is so high in galactic nuclei. However, this population is of interest in its own right, as mentioned in Sect. 5.3.2.
• Resonant relaxation. Close to the MBH, stars travel on approximately fixed Keplerian orbits, exerting torques on each other and causing the eccentricities to fluctuate randomly on a time-scale shorter than that of standard two-body relaxation (Rauch & Tremaine 1996). This might affect moderately the rate of tidal disruptions (Rauch & Ingalls 1998) and very significantly that of EMRIs (Hopman & Alexander 2006a) but, being an intrinsically non-local effect, it can probably only be included in an approximate fashion in MC models.
• Motion of the central MBH. Direct N-body simulations have established the importance of MBH wandering (e.g. Merritt et al. 2007, and references therein). Because this is a dynamical, non-spherical perturbation to the idealised cluster representation used in the MC approach, it can only be included through ad hoc prescriptions determining, for example, the probability for a star to be tidally disrupted. It is not yet clear whether the wandering would affect the results appreciably and justify such modifications to the MC code.
• Interplay between accretion disc and stars. The orbits of stars repeatedly impacting a dense disc tend to align with it (e.g. Syer et al. 1991; Subr et al. 2004; Miralda-Escude & Kollmeier 2005). Stars may therefore be a major contributor to nuclear activity and the growth of SMBHs. Testing this idea is challenging since what is required is a numerical scheme coupling stellar dynamics for several millions of stars, disc physics and some prescription for the stellar and orbital evolution of the stars embedded in the disc. A non-spherical hybrid MC/N-body code, as suggested above, could form the backbone of this complex scheme.
• Binary massive black hole. Galaxy mergers lead to the formation of massive binaries, the evolution and fate of which is still debated. The key question is whether interactions with stars and gas are efficient at shrinking the binary to the point where it merges by the emission of gravitational waves (Begelman et al. 1980; Merritt & Milosavljevic 2005; Berczik et al. 2006; Merritt 2006; Sesana et al. 2007, amongst others). If the binary instead stalls for a very long time, the next galactic merger can bring about a highly dynamical three-body interaction involving MBHs, likely to lead to a merger and the ejection of a single MBH (Hoffman & Loeb 2007). If the parent galaxies are devoid of gas, once its separation has become smaller than about $4G\mu/\sigma^2$, where $\mu$ is the reduced mass and $\sigma$ the stellar velocity dispersion, the MBH binary can only shrink by ejecting passing stars out of the nucleus. These interactions also determine the evolution of the eccentricity, which might play a key role in bringing the binary to coalescence. While only N-body methods can implement the non-symmetrical geometry of this situation (e.g. Mikkola & Aarseth 2002), they cannot include the $> 10^7$ stars present in even a moderately small nucleus. An axially symmetrical (hybrid) MC code would make it possible to simulate the interaction of a massive binary with its host nucleus, employing a realistic mass ratio between the stars and the MBHs and hence the correct rate of relaxation into the loss cone for interaction with the massive binary.
Acknowledgement
It is a pleasure to thank M. Atakan Gurkan and John Fregeau for discussions and comments on a draft of this chapter. I also thank John Fregeau and Holger Baumgardt for providing unpublished simulation results. My work is supported by the STFC rolling grant to the IoA.
References
Aarseth S J 1971 ApampSS 13 324 145Aarseth S J 1974 AampA 35 237 141Aarseth S J 2003 Gravitational N-body Simulations Cambridge Univ Press
Cambridge 142Aguilar L A Merritt D 1990 ApJ 354 33 125Alexander T Hopman C 2003 ApJ Lett 590 L29 139Amaro-Seoane P Freitag M 2006 ApJ Lett 653 L53 149Amaro-Seoane P Gair J R Freitag M Miller M C Mandel I Cutler C J
Babak S 2007 Classical and Quantum Gravity 24 113 139 152Ayal S Livio M Piran T 2000 ApJ 545 772 137Bacon D Sigurdsson S Davies M B 1996 MNRAS 281 830 142Bailey V C Davies M B 1999 MNRAS 308 257 136Bally J Zinnecker H 2005 AJ 129 2281 136Baumgardt H 2001 MNRAS 325 1323 149Baumgardt H Kroupa P 2007 MNRAS 380 1589 150Baumgardt H Makino J 2003 MNRAS 340 227 149Baumgardt H Makino J Ebisuzaki T 2004 ApJ 613 1133 144Baumgardt H Makino J Hut P 2005 ApJ 620 238 148Begelman M C Blandford R D Rees M J 1980 Nature 287 307 152Belczynski K Kalogera V Bulik T 2002 ApJ 572 407 143
Benacquista M J 2006 Living Reviews in Relativity 9 2 141Benz W Hills J G Thielemann 1989 ApJ 342 986 136Berczik P Merritt D Spurzem R Bischof H-P 2006 ApJ Lett 642 L21 152Binney J Tremaine S 1987 Galactic Dynamics Princeton Univ Press
Princeton NJ 123 134Boily C M Athanassoula E 2006 MNRAS 369 608 125Boily C M Kroupa P 2003a MNRAS 338 665 150Boily C M Kroupa P 2003b MNRAS 338 673 150Bonnell I A Bate M R 2002 MNRAS 336 659 150Bonnell I A Bate M R Clarke C J Pringle J E 2001a MNRAS 323 785 150Bonnell I A Clarke C J Bate M R Pringle J E 2001b MNRAS 324 573 150Brown W R Geller M J Kenyon S J Kurtz M J 2005 ApJ Lett 622 L33 141Cohn H Kulsrud R M 1978 ApJ 226 1087 138Dale J E Davies M B 2006 MNRAS 366 1424 136Davies M B 2002 in van Leeuwen F Hughes J DPiotto G eds ASP Conf Ser
Vol 265 Omega Centauri A Unique Window into Astrophysics Astron SocPac San Francisco p 215 141
Davies M B Benz W Hills J G 1991 ApJ 381 449 136Davies M B Benz W Hills JG 1992 ApJ 401 246 136Davies M B Bate M R Bonnell I A Bailey V C Tout C A 2006 MNRAS
370 2038 136Dehnen W Odenkirchen M Grebel E K Rix H-W 2004 AJ 127 2753 149Diener P Frolov V P Khokhlov A M Novikov I D Pethick C J 1997 ApJ
479 164 137Einsel C Spurzem R 1999 MNRAS 302 81 125Esquej P Saxton R D Freyberg M J Read A M Altieri B Sanchez-Portal M
Hasinger G 2007 AampA 462 L49 138Fabian A C Pringle J E Rees M J 1975 MNRAS 172 15 134Ferrarese L Ford H 2005 Space Science Reviews 116 523 137Ferrarese L Merritt D 2000 ApJ Lett 539 L9 137Fiestas J Spurzem R Kim E 2006 MNRAS 373 677 125Frank J Rees M J 1976 MNRAS 176 633 138Fregeau J M Cheung P Portegies Zwart S F Rasio F A 2004 MNRAS 352 1 141 143Fregeau J M Gurkan M A Joshi K J Rasio F A 2003 ApJ 593 772 123 141 143Fregeau J M Rasio F A 2007 ApJ 658 1047 123 131 141 142 143 147 148Freitag M 2000 PhD thesis Universite de Geneve 137Freitag M 2001 Classical and Quantum Gravity 18 4033 150Freitag M 2003 ApJ Lett 583 L21 150Freitag M Amaro-Seoane P Kalogera V 2006a ApJ 649 91 124 136 137 144 150 151Freitag M Benz W 2001a in Deiters S Fuchs B Just R Spurzem R eds ASP
Conf Ser Vol 228 Dynamics of Star Clusters and the Milky Way Astron SocPac San Francisco p 428 150
Freitag M Benz W 2001b in Kaper L van den Heuvel E P J Woudt P AESO Astrophysics Symposia Black Holes in Binaries andGalactic Nuclei p 269 150
Freitag M Benz W 2001c AampA 375 711 124 131Freitag M Benz W 2002 AampA 394 345 124 134 135 137 150Freitag M Benz W 2005 MNRAS 358 1133 135 136Freitag M Gurkan M A Rasio F A 2006b MNRAS 368 141 124 135 137 147Freitag M Rasio F A Baumgardt H 2006c MNRAS 368 121 124 135 145 147
Fukushige T Heggie D C 2000 MNRAS 318 753 149Fulbright M S 1996 PhD thesis University of Arizona 137Gao B Goodman J Cohn H Murphy B 1991 ApJ 370 567 141Genzel R Schodel R Ott T Eisenhauer F Hofmann R Lehnert M Eckart A
Alexander T Sternberg A Lenzen R Clenet Y Lacombe F Rouan D RenziniA Tacconi-Garman L E 2003 ApJ 594 812 137 150
Gezari S Martin D C Milliard B Basa S Halpern J P Forster K FriedmanP G Morrissey P Neff S G Schiminovich D Seibert M Small T WyderT K 2006 ApJ Lett 653 L25 138
Ghez A M Salim S Hornstein S D Tanner A Lu J R Morris M BecklinE E Duchene G 2005 ApJ 620 744 137
Giersz M 1998 MNRAS 298 1239 123 131 141Giersz M 2001 MNRAS 324 218 123 141Giersz M 2006 MNRAS 371 484 123 141Giersz M Heggie D C 1994 MNRAS 268 257 145Giersz M Heggie D C 1996 MNRAS 279 1037 145Giersz M Heggie D C Hurley J R 2008 MNRAS 388 429Giersz M Spurzem R 2000 MNRAS 317 581 141Giersz M Spurzem R 2003 MNRAS 343 781 141 142 143Gnedin O Y Lee H M Ostriker J P 1999 ApJ 522 935 149Gnedin O Y Ostriker J P 1997 ApJ 474 223 149Goodman J 1983 ApJ 270 700 144Goodman J Hernquist L 1991 ApJ 378 637 143Grindlay J Portegies Zwart S McMillan S 2006 Nature Physics 2 116 141Gurkan M A Fregeau J M Rasio F A 2006 ApJ Lett 640 L39 123 141 147Gurkan M A Freitag M Rasio F A 2004 ApJ 604 123 147Gurkan M A Rasio F A 2005 ApJ 628 236 145Heggie D C 1975 MNRAS 173 729 143Heggie D Hut P 2003 The Gravitational Million-Body Problem A Multidisci-
plinary Approach to Star Cluster Dynamics CambridgeUniv Press Cambridge 141 145Heggie D C Hut P McMillan S L W 1996 ApJ 467 359 143Henon M 1960 Annales drsquoAstrophysique 23 668 144Henon M 1969 AampA 2 151 144Henon M 1971a ApampSS 14 151 123 149Henon M 1971b ApampSS 13 284 123 131Henon M 1973a in Martinet L Mayor M eds Lectures of the 3rd Advanced
Course of the Swiss Society for Astronomy and Astrophysics Obs de GeneveGeneve p 183 123 128
Henon M 1973b AampA 24 229 125Henon M 1975 in Hayli A ed Proc IAU Symp 69 Dynamics of Stellar Systems
Reidel Dordrecht p 133 123 144 145Hills J G 1975 Nature 254 295 137Hills J G 1988 Nature 331 687 141Hoffman L Loeb A 2007 MNRAS 334 153Hopman C Alexander T 2006a ApJ 645 1152 152Hopman C Alexander T 2006b ApJ Lett 645 L133 152Hopman C Freitag M Larson S L 2007 MNRAS 378 129 139Hurley J R Pols O R Aarseth S J Tout C A 2005 MNRAS 363 293 147Hurley J R Pols O R Tout C A 2000 MNRAS 315 543 143 144
Hurley J R Tout C A Aarseth S J Pols O R 2001 MNRAS 323 630 141 144Hut P 1993 ApJ 403 256 143Hut P McMillan S Goodman J Mateo M Phinney E S Pryor C Richer H B
Verbunt F Weinberg M 1992 PASP 104 981 141Johnston K V Sigurdsson S Hernquist L 1999 MNRAS 302 771 150Joshi K J Nave C P Rasio F A 2001 ApJ 550 691 123Joshi K J Rasio F A Portegies Zwart S 2000 ApJ 540 969 123 131Kim E Einsel C Lee H M Spurzem R Lee M G 2002 MNRAS 334 310 125Kim E Lee H M Spurzem R 2004 MNRAS 351 220 125Kim S S Lee H M 1999 AampA 347 123 134Kobayashi S Laguna P Phinney E S Meszaros P 2004 ApJ 615 855 137Komossa S 2005 in Merloni A Nayakshin S Sunyaev R A eds Growing Black
Holes Accretion in a Cosmological Context Springer Berlin p 269 138Kroupa P Aarseth S Hurley J 2001 MNRAS 321 699 150Lada E A 1999 in Lada C J Kylafis N D eds NATO ASIC Proc 540 The
Origin of Stars and Planetary Systems Kluwer Academic Publishers p 441 150Lauer T R Faber S M Ajhar E A Grillmair C J Scowen P A 1998 AJ 116
2263 150Laycock D Sills A 2005 ApJ 627 277 136Lee H M Ostriker J P 1986 ApJ 310 176 134Lightman A P Shapiro S L 1977 ApJ 211 244 138Lin D N C Tremaine S 1980 ApJ 242 789 144Lombardi Jr J C Proulx Z F Dooley K L Theriault E M Ivanova N Rasio
F A 2006 ApJ 640 441 136Lombardi J C Thrall A P Deneva J S Fleming S W Grabowski P E 2003
MNRAS 345 762 143Lombardi J C Warren J S Rasio F A Sills A Warren A R 2002 ApJ
568 939 136Merritt D 2006 ApJ 648 976 152Merritt D Berczik P Laun F 2007 AJ 133 553 152Merritt D Mikkola S Szell A 2007b ApJ 671 53 150Merritt D Milosavljevic M 2005 Living Reviews in Relativity 8 8 125 152Merritt D Szell A 2006 ApJ 648 890 150Mikkola S Aarseth S 2002 Celes Mech Dyn Ast 84 343 153Miller M C Colbert E J M 2004 International J Modern Phys D 13 1 137Miller M C Freitag M Hamilton D P Lauburg V M 2005 ApJ Lett
631 L117 141Miocchi P 2007 MNRAS 381 103 148Miralda-Escude J Kollmeier J A 2005 ApJ 619 30 152Muno M P Pfahl E Baganoff F K Brandt W N Ghez A Lu J Morris M R
2005 ApJ Lett 622 L113 141OrsquoLeary R M Loeb A 2008 MNRAS 383 86 144OrsquoLeary R M OrsquoShaughnessy R Rasio F A 2007 Phys Rev D 76 061504 141Perets H B Hopman C Alexander T 2007 ApJ 656 709 138Peters P C 1964 Phys Rev 136 1224 139Peters P C Mathews J 1963 Phys Rev 131 435 139Portegies Zwart S F Baumgardt H Hut P Makino J McMillan S L W 2004
Nature 428 724 141 147Portegies Zwart S F Makino J McMillan S L W Hut P 1999 AampA 348 117 136 141 147
Portegies Zwart S F McMillan S L W 2002 ApJ 576 899 141 147Portegies Zwart S F McMillan S L W Makino J 2007 MNRAS 374 95 147Portegies Zwart S F Verbunt F 1996 AampA 309 179 144Portegies Zwart S F Yungelson L R 1998 AampA 332 173 144Press W H Teukolsky S A Vetterling W T Flannery B P 1992 Numerical
Recipes in FORTRAN Cambridge Univ Press Cambridge 133 140 144Preto M Merritt D Spurzem R 2004 ApJ Lett 613 L109 150Quinlan G D Hernquist L Sigurdsson S 1995 ApJ 440 554 137Quinlan G D Shapiro S L 1990 ApJ 356 483 147Rasio F A Shapiro S L 1991 ApJ 377 559 136Rauch K P Ingalls B 1998 MNRAS 299 1231 152Rauch K P Tremaine S 1996 New Astronomy 1 149 152Rees M J 1988 Nature 333 523 137Regev O Shara M M 1987 MNRAS 227 967 136Ross S D 2004 PhD thesis Calif Inst Technology 149Rozyczka M Yorke H W Bodenheimer P Muller E Hashimoto M 1989 AampA
208 69 136Ruffert M 1993 AampA 280 141 136Schodel R Eckart A Alexander T Merritt D Genzel R Sternberg A Meyer
L Kul F Moultaka J Ott T Straubmeier C 2007 AampA 469 125 137Sedgewick R 1988 Algorithms Second Edition Addison-Wesley 131Sesana A Haardt F Madau P 2007 ApJ 660 546 152Shapiro S L 1985 in Goodman J Hut P eds Proc IAU Symp 113 Dynamics
of Star Clusters Reidel Dordrecht p 373 140Shara M ed 2002 ASP Conf Ser 263 Stellar Collisions amp Mergers and their
Consequences Astron Soc Pac San Francisco 133Shara M M Hurley J R 2002 ApJ 571 830 141Sills A Adams T Davies M B Bate M R 2002 MNRAS 332 49 136Sills A Deiters S Eggleton P Freitag M Giersz M Heggie D Hurley J Hut
P Ivanova N Klessen R S Kroupa P Lombardi J C McMillan S PortegiesZwart S F Zinnecker H 2003 New Astron 8 605 123
Sills A Faber J A Lombardi J C Rasio F A Warren A R 2001 ApJ 548323 136
Sills A Lombardi J C Bailyn C D Demarque P Rasio F A Shapiro S L1997 ApJ 487 290 136
Spitzer L 1987 Dynamical Evolution of Globular Clusters Princeton Univ PressPrinceton NJ 145 146 149
Spitzer L J Hart M H 1971a ApJ 164 399 124Spitzer L J Hart M H 1971b ApJ 166 483 124Spitzer L J Thuan T X 1972 ApJ 175 31 124Spitzer L Mathieu R D 1980 ApJ 241 618 125 141Spitzer L Shull J M 1975 ApJ 201 773 124Spurzem R Giersz M Heggie D C Lin D N C 2006 preprint (astro-
ph0612757) 141 142Stodolkiewicz J S 1982 Acta Astron 32 63 123 131Stodolkiewicz J S 1985 in Goodman J Hut P eds Proc IAU Symp 113 Dy-
namics of Star Clusters Reidel Dordrecht p 361 141Stodolkiewicz J S 1986 Acta Astron 36 19 123 141Subr L Karas V Hure J-M 2004 MNRAS 354 1177 125 152
Syer D Clarke C J Rees M J 1991 MNRAS 250 505 152Taam R E Ricker P M 2006 preprint (astro-ph0611043) 136Takahashi K Portegies Zwart S F 2000 ApJ 535 759 149Theis C Spurzem R 1999 AampA 341 361 125Trac H Sills A Pen U-L 2007 MNRAS 337 136Tremaine S Gebhardt K Bender R Bower G Dressler A Faber S M Filippenko
A V Green R Grillmair C Ho L C Kormendy J Lauer T R MagorrianJ Pinkney J Richstone D 2002 ApJ 574 740 137
Trenti M Ardi E Mineshige S Hut P 2007 MNRAS 374 857 148van der Marel R P 2004 in Ho L ed Coevolution of Black Holes and Galaxies
from the Carnegie Observatories Centennial Symposia Cambridge Univ PressCambridge p 37 137
Young P J 1980 ApJ 242 1232 137
6
Particle-Mesh Technique and Superbox
Michael Fellhauer
University of Cambridge, Institute of Astronomy, Madingley Road, Cambridge CB3 0HA, UK
madf@ast.cam.ac.uk
6.1 Introduction
Many problems in astronomy, ranging from celestial mechanics via stellar dynamics to cosmology, require the solution of Newton's laws,
$$\mathbf{F} = \mathbf{a}\cdot m = m\,\frac{d\mathbf{v}}{dt}, \tag{6.1}$$
$$\mathbf{v} = \frac{d\mathbf{r}}{dt}, \tag{6.2}$$
where $\mathbf{F}$ is the gravitational force of all other $(N-1)$ masses,
$$\mathbf{F}_j = \sum_{\substack{i=1 \\ i \neq j}}^{N} \frac{G\, m_j m_i}{r_{ij}^3}\, \mathbf{r}_{ij}, \tag{6.3}$$
acting on mass $j$ (index $ij$ denotes the vectors connecting particle $i$ and $j$).
While there is an analytical solution for the two-body system, systems involving three or more masses do not have a general analytical solution. Thus, computer simulations of the time-evolution of multi-body systems are very common in astronomy.
The tools used for these purposes are diverse and range widely, from high-precision integrators for the dynamics of planetary systems to programmes using up to a billion particles to investigate structure formation in the universe. This article focuses on the particle-mesh technique and a programme to simulate galaxies called Superbox.
The particle-mesh (PM) technique is explained in Sect. 6.2. Then the multi-grid structure of Superbox is described in Sect. 6.3.
Fellhauer, M.: Particle-Mesh Technique and SUPERBOX. Lect. Notes Phys. 760, 159–169 (2008)
DOI 10.1007/978-1-4020-8431-7_6 © Springer-Verlag Berlin Heidelberg 2008
6.2 Particle-Mesh Technique
6.2.1 Overview
In the particle-mesh technique, the density of the particles is sampled on a grid covering the simulation area, and then Poisson's equation,
$$\nabla^2 \Phi = 4\pi G \rho, \tag{6.4}$$
is solved on the grid-based density, using a suitable Green's function, to derive the grid-based gravitational potential. Particles are integrated using the forces derived from this grid-based potential.
The first step is to locate the grid-point of each particle according to its position and derive a grid of densities. This density-grid is Fourier-transformed via the Fast Fourier Transform (FFT) algorithm. This requires that the number of grid-cells per dimension is a power of 2. The Fourier-transformed density-grid is multiplied cell-by-cell with a suitable, already Fourier-transformed Green's function. Then these values are back-transformed, which results in a grid of potential values. From these potential values, the forces on each particle are derived via discrete differentiation. Finally, the particle velocities and positions are integrated forward in time.
A flow-chart of a standard PM-code is shown in Fig. 6.1.
read input data
forward FFT of Green's function
start time-step loop
    derive grid-based density array
    forward FFT of density array
    cell-by-cell multiplication with Green's function
    backward FFT to derive potential array
    start particle loop
        differentiate potential to get force
        integrate velocities
        integrate positions
    collect output data
write final data
Fig. 6.1 Flow-chart of a standard PM-code
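The steps of the flow-chart can be sketched in a few lines of NumPy. This is a hedged illustration, not the Superbox implementation: the function names `make_green_fft` and `pm_step`, the nearest-grid-point mass assignment, the central-difference force, the simple kick-then-drift update and the zero-padding of the grid to twice its size (a common way to obtain isolated boundary conditions with FFTs) are all assumptions of this example. Cell length is taken as unity and the grid is centred on the origin.

```python
import numpy as np

def make_green_fft(n, xi=0.75):
    """FFT of the Green's function 1/sqrt(i^2+j^2+k^2) on a zero-padded (2n)^3 grid."""
    a = np.arange(2 * n)
    m = np.minimum(a, 2 * n - a).astype(float)   # mirrored distances 0..n..1
    i, j, k = np.meshgrid(m, m, m, indexing="ij")
    r = np.sqrt(i**2 + j**2 + k**2)
    r[0, 0, 0] = xi                              # so that H_000 = 1/xi
    return np.fft.rfftn(1.0 / r)                 # computed once per simulation

def pm_step(pos, vel, mass, green_fft, n, dt, G=1.0):
    """One time-step: density -> FFT -> multiply -> inverse FFT -> force -> integrate."""
    # Derive the grid-based density array (nearest grid point)
    cell = np.rint(pos).astype(int) + n // 2
    inside = np.all((cell >= 1) & (cell < n - 1), axis=1)
    rho = np.zeros((2 * n,) * 3)
    np.add.at(rho, tuple(cell[inside].T), mass[inside])
    # Forward FFT, cell-by-cell multiplication, backward FFT -> potential
    phi = -G * np.fft.irfftn(np.fft.rfftn(rho) * green_fft, s=rho.shape)
    # Differentiate the potential (central differences) to get accelerations
    acc = np.zeros_like(pos)
    c = cell[inside]
    for d in range(3):
        up, dn = c.copy(), c.copy()
        up[:, d] += 1
        dn[:, d] -= 1
        acc[inside, d] = -(phi[tuple(up.T)] - phi[tuple(dn.T)]) / 2.0
    # Integrate velocities, then positions
    vel = vel + acc * dt
    pos = pos + vel * dt
    return pos, vel
```

Particles outside the active grid simply feel no force in this sketch; a production code has to treat them explicitly.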
6.2.2 Suitable Green's Function
The usual geometry of the grid in a particle-mesh code is Cartesian and cubic. Therefore, the standard Green's function, which describes the distances between cells, looks like
$$H_{ijk} = \frac{1}{\sqrt{i^2 + j^2 + k^2}}, \quad i, j, k = 0, \ldots, n; \qquad H_{000} = \frac{1}{\xi}. \tag{6.5}$$
This formula implies that the length of one grid-cell is unity n is the numberof grid-cells per dimension and has to be a power of 2
The value for $H_{000}$ has to be chosen carefully. It describes the strength of the force between particles in the same cell, including the non-physical 'self-gravity' of the particle acting on itself. In the one-dimensional case, analytical studies by D. Pfenniger showed that a value of ξ = 3/4 gives the best results in terms of energy conservation. Numerical experiments showed that this is also true in the three-dimensional case.
Nevertheless, in the case of very low particle numbers per cell this value could lead to spurious self-accelerations, and a value that excludes the forces of particles from the same cell would be more suitable. In the Superbox differentiation scheme the value to exclude self-gravity is ξ = 1. In a later section we discuss why one should avoid low particle-per-cell ratios if possible.
Finally, the grid-array of the Green's function has to be set up and Fourier-transformed only once, at the beginning of each simulation, and can then be used throughout the whole simulation.
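As an illustration of (6.5), the following minimal sketch builds the Green's function array in plain Python. The function name `greens_function` and the nested-list layout are assumptions made for this example only; they are not taken from the Superbox source.

```python
import math


def greens_function(n, xi=0.75):
    """Build H[i][j][k] = 1/sqrt(i^2 + j^2 + k^2) for i, j, k = 0..n (cell length 1).

    The central value is set to H[0][0][0] = 1/xi, following (6.5).
    Hypothetical helper for illustration, not the Superbox routine.
    """
    size = n + 1
    H = [[[0.0] * size for _ in range(size)] for _ in range(size)]
    for i in range(size):
        for j in range(size):
            for k in range(size):
                if i == 0 and j == 0 and k == 0:
                    H[i][j][k] = 1.0 / xi  # same-cell term, includes self-gravity
                else:
                    H[i][j][k] = 1.0 / math.sqrt(i * i + j * j + k * k)
    return H


H = greens_function(4)               # xi = 3/4, i.e. H_000 = 4/3
H_no_self = greens_function(4, 1.0)  # xi = 1, the value quoted for excluding self-gravity
```

Because the array depends only on the grid geometry, it can be built once and reused for every time-step, as stated above.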
6.2.3 Deriving the Density-Grid
The actual positions and velocities of each particle $(x, y, z, v_x, v_y, v_z)$ are stored in the particle array. From the actual positions the grid-cell in which each particle is located is derived via

$$i_x = \text{nearest integer}(\mathrm{enh} \cdot x) + n/2. \tag{6.6}$$

$i_x$ denotes the grid-cell number in the x-direction; enh is a numerical factor that stretches or compresses the physical extension of the x-direction of the simulation area to allow the grid-cell length to be unity. The grid-cell numbers in the y- and z-direction are derived accordingly.
There are two possibilities to assign the mass of the particle to the density-grid covering the simulation area. One is called the nearest-grid-point (NGP) scheme and assigns the whole mass of the particle to the grid-cell that the particle is in. A second, more advanced procedure is called the cloud-in-cell (CIC) scheme and assigns a radius of half a cell length to each particle. The mass of the particle is then distributed over the cells this extended particle overlaps, according to the actual deviation of the particle position from the centre of the cell. In Fig. 6.2 this assignment is shown for two dimensions.

[Figure 6.2: the array of particles (x, y, z, vx, vy, vz; entries 1 ... N) is mapped onto the grid of densities (cells 1 ... n per dimension) via ix = nint(enh x + n/2), iy = nint(enh y + n/2); the particle mass is added to cell (ix, iy).]

Fig. 6.2 Deriving the density-grid from the particle positions. The z-dimension is omitted for clarity. In the NGP scheme the total mass is placed in one cell; in the CIC scheme contributions of the mass are also distributed to neighbouring cells (denoted by the circle)
The CIC scheme allows for a much smoother distribution of the densities but does not allow for sub-cell-length resolution. This has to be added via direct summation of the forces of neighbouring particles within a certain sphere of influence. A code that employs direct summation in the vicinity of each particle is usually called a P3M-code (particle-particle particle-mesh). The CIC scheme also allows for a smooth and highly accurate derivation of the forces (this will be discussed in a sub-section below).
Superbox still uses the 'old-fashioned' NGP scheme, which results in a much faster assignment of the densities and allows for sub-cell-length resolution if $H_{000} = 1$. To reach high accuracy, we later apply a higher-order differentiation scheme to obtain the forces.
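The two mass-assignment schemes can be sketched in one dimension as follows. This is a minimal illustration, not the Superbox implementation: the function names, the 0-based cell indices and the treatment of particles outside the grid (their mass is simply dropped) are assumptions of this example. `nint` rounds half-integers upwards, which may differ from a Fortran `NINT` for negative half-integers.

```python
import math


def nint(value):
    """Nearest integer (half-integers round up); an assumption for this sketch."""
    return math.floor(value + 0.5)


def assign_ngp(positions, masses, enh, n):
    """Nearest-grid-point: the whole mass goes into the cell from (6.6)."""
    rho = [0.0] * n
    for x, m in zip(positions, masses):
        ix = nint(enh * x) + n // 2
        if 0 <= ix < n:
            rho[ix] += m
    return rho


def assign_cic(positions, masses, enh, n):
    """Cloud-in-cell: a cloud one cell wide is shared between two cells."""
    rho = [0.0] * n
    for x, m in zip(positions, masses):
        s = enh * x + n / 2          # position in cell units
        ix = nint(s)                 # cell containing the particle
        d = s - ix                   # deviation from the cell centre, |d| <= 0.5
        neighbour = ix + 1 if d >= 0 else ix - 1
        for cell, weight in ((ix, 1.0 - abs(d)), (neighbour, abs(d))):
            if 0 <= cell < n:
                rho[cell] += m * weight
    return rho


rho_ngp = assign_ngp([0.3], [2.0], enh=1.0, n=8)
rho_cic = assign_cic([0.3], [2.0], enh=1.0, n=8)
```

For a particle 0.3 cell lengths to the right of a cell centre, the NGP scheme deposits all of its mass in that cell, whereas the CIC scheme gives 70 per cent to that cell and 30 per cent to the right-hand neighbour.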
6.2.4 The FFT-Algorithm
Poisson's equation is solved for the density-grid to get the grid-based potential $\Phi_{ijk}$, which becomes

$$\Phi_{ijk} = G \sum_{a,b,c=0}^{n-1} \rho_{abc} \cdot H_{a-i,\,b-j,\,c-k}, \quad i, j, k = 0, \ldots, n-1, \tag{6.7}$$

where n denotes the number of grid-cells per dimension ($n^3 = N_{\rm gc}$, the total number of grid-cells) and $H_{ijk}$ is the Green's function. To avoid this $N_{\rm gc}^2$ procedure, the discrete Fast Fourier Transform (FFT) is used, for which $n = 2^k$, $k > 0$ being an integer. The stationary Green's function is Fourier-transformed once at the beginning of the calculation, and only the density array is transformed at each time-step:

$$\hat{\rho}_{abc} = \sum_{i,j,k=0}^{n-1} \rho_{ijk} \cdot \exp\left(-\sqrt{-1}\,\frac{2\pi}{n}(ai + bj + ck)\right),$$
$$\hat{H}_{abc} = \sum_{i,j,k=0}^{n-1} H_{ijk} \cdot \exp\left(-\sqrt{-1}\,\frac{2\pi}{n}(ai + bj + ck)\right). \tag{6.8}$$

The two resulting arrays are multiplied cell by cell and transformed back to get the grid-based potential:

$$\Phi_{ijk} = \frac{G}{n^3} \sum_{a,b,c=0}^{n-1} \hat{\rho}_{abc} \cdot \hat{H}_{abc} \cdot \exp\left(\sqrt{-1}\,\frac{2\pi}{n}(ai + bj + ck)\right). \tag{6.9}$$
The FFT-algorithm gives the exact solution of the grid-based potential for a periodic system. For the exact solution of an isolated system, which is what simulators are interested in, the size of the density array has to be doubled (2n), filling all inactive grid cells with zero density and extending the Green's function in the empty regions in the following way (also shown in Fig. 6.3):

$$H_{2n-i,\,j,\,k} = H_{2n-i,\,2n-j,\,k} = H_{2n-i,\,j,\,2n-k} = H_{2n-i,\,2n-j,\,2n-k} = H_{i,\,2n-j,\,k} = H_{i,\,2n-j,\,2n-k} = H_{i,\,j,\,2n-k} = H_{i,\,j,\,k}. \tag{6.10}$$

This provides the isolated solution of the potential in the simulated area between i, j, k = 0 and n − 1. In the inactive part the results are unphysical. To keep the data size as small as possible, only a 2n × 2n × n array is used for transforming the densities and an (n + 1) × (n + 1) × (n + 1) array is used for the Green's function. For a detailed discussion see Eastwood & Brownrigg (1978) and also Hockney & Eastwood (1981).
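The effect of the zero-padding and of the mirrored Green's function in (6.10) can be checked in one dimension. The sketch below is a simplified illustration under stated assumptions: it uses a slow direct discrete Fourier transform written with the standard library instead of an FFT, all function names are hypothetical, and the normalisation is 1/(2n) rather than the 1/n^3 of (6.9) because the padded 1D array has 2n cells. It compares the direct sum (6.7) with the padded Fourier route.

```python
import cmath


def dft(values, sign):
    """Direct discrete Fourier transform; sign=-1 forward, sign=+1 backward (unnormalised)."""
    n = len(values)
    return [
        sum(v * cmath.exp(sign * 2j * cmath.pi * a * i / n) for i, v in enumerate(values))
        for a in range(n)
    ]


def potential_direct(rho, H, G=1.0):
    """1D analogue of (6.7): Phi_i = G * sum_a rho_a * H_|a-i|."""
    n = len(rho)
    return [G * sum(rho[a] * H[abs(a - i)] for a in range(n)) for i in range(n)]


def potential_fourier_isolated(rho, H, G=1.0):
    """Isolated potential via a padded transform; H must hold indices 0..n."""
    n = len(rho)
    rho_pad = list(rho) + [0.0] * n      # inactive cells carry zero density
    H_ext = [0.0] * (2 * n)
    for i in range(n + 1):
        H_ext[i] = H[i]
    for i in range(1, n):
        H_ext[2 * n - i] = H[i]          # 1D analogue of (6.10)
    product = [r * h for r, h in zip(dft(rho_pad, -1), dft(H_ext, -1))]
    phi = dft(product, +1)
    return [G * p.real / (2 * n) for p in phi[:n]]  # keep the active part only


n = 4
H = [4.0 / 3.0] + [1.0 / i for i in range(1, n + 1)]  # xi = 3/4 at i = 0
rho = [1.0, 0.0, 2.0, 0.5]
phi_direct = potential_direct(rho, H)
phi_fourier = potential_fourier_isolated(rho, H)
```

In this example the two results agree to rounding error, which is the sense in which the padded transform reproduces the isolated solution in the active region.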
The FFT-routine incorporated in Superbox is a simple one-dimensional FFT and is taken from Werner & Schabach (1979) and Teukolsky et al. (1992). It is fast and makes the code portable and not machine-specific. The low-storage algorithm for extending the FFT to three dimensions to obtain the 3-D potential is taken from Hohl (1970). The performance of Superbox can be increased by incorporating machine-optimised FFT routines.
A detailed description of the low-storage FFT algorithm used in Superbox can be found in the manual, available directly from the author (Fellhauer 2006).
6.2.5 Derivation of the Forces
After the FFT procedure has been completed, one has a grid-based potential of the simulation area. From this potential the forces acting on each particle are derived via discrete numerical differentiation of the potential.
[Figure 6.3: the active simulation area (grid-array rho, cells 1 ... n) containing the simulated object, surrounded by empty ghost regions (cells n+1 ... 2n) that do not exist as arrays.]

Fig. 6.3 Virtual extension of the simulation area to provide the isolated solution (z-direction omitted)
As with the mass assignment of the density array, the forces are also calculated differently depending on whether an NGP or a CIC scheme is used. An NGP scheme only uses the force calculated for the grid-cell the particle is in, while in a CIC scheme the forces of the neighbouring cells are also used, with the same weights with which the mass was distributed, to interpolate the force to the particle position.
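For simplicity, the force derivation of the different schemes is given for the 1D case:

$$\text{NGP:} \quad a(x_i + dx) = \left.\frac{\partial \Phi}{\partial x}\right|_i \tag{6.11}$$

$$\text{SUPERBOX:} \quad a(x_i + dx) = \left.\frac{\partial \Phi}{\partial x}\right|_i + \left.\frac{\partial^2 \Phi}{\partial x^2}\right|_i \cdot \frac{dx}{\Delta x} \tag{6.12}$$

$$\text{CIC:} \quad a(x_i + dx) = \left.\frac{\partial \Phi}{\partial x}\right|_i \cdot \frac{\Delta x - dx}{\Delta x} + \left.\frac{\partial \Phi}{\partial x}\right|_{i+1} \cdot \frac{dx}{\Delta x} \tag{6.13}$$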
where a denotes the acceleration, $x_i$ is the position of the cell with index i in which the particle is located, and dx is the deviation of the particle from the centre of the cell. As one can see, the standard NGP scheme does not account for the deviation of the particle from the centre of the cell. The acceleration is a step function from cell to cell and is therefore not continuous. The CIC scheme accounts for this deviation, and the acceleration of the particle is a weighted mean of the values for the cell the particle is in and for the neighbouring cell. Superbox has a non-standard force calculation scheme, which is NGP in nature (only the force for the cell i is used) but accounts for the deviation by using the next term of a Taylor series of the acceleration around the cell i. The continuity of the force is not guaranteed when crossing the cell boundaries at an arbitrary angle, but anisotropies of the force are suppressed. The full 3D expression for the acceleration in Superbox is
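
$$a_{ijk,x}(dx, dy, dz) = \left.\frac{\partial \Phi}{\partial x}\right|_{ijk} + \left.\frac{\partial^2 \Phi}{\partial x^2}\right|_{ijk} dx + \left.\frac{\partial^2 \Phi}{\partial x\,\partial y}\right|_{ijk} dy + \left.\frac{\partial^2 \Phi}{\partial x\,\partial z}\right|_{ijk} dz. \tag{6.14}$$
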
The partial derivatives are replaced in the code by second-order central difference quotients, and the 3D expression for the acceleration in the x-direction now reads

$$\begin{aligned}
a_{ijk,x}(dx, dy, dz) ={}& \frac{\Phi_{i+1,j,k} - \Phi_{i-1,j,k}}{2\Delta x} \\
&+ \frac{\Phi_{i+1,j,k} + \Phi_{i-1,j,k} - 2 \cdot \Phi_{i,j,k}}{(\Delta x)^2} \cdot dx \\
&+ \frac{\Phi_{i+1,j+1,k} - \Phi_{i-1,j+1,k} + \Phi_{i-1,j-1,k} - \Phi_{i+1,j-1,k}}{4\Delta x \Delta y} \cdot dy \\
&+ \frac{\Phi_{i+1,j,k+1} - \Phi_{i-1,j,k+1} + \Phi_{i-1,j,k-1} - \Phi_{i+1,j,k-1}}{4\Delta x \Delta z} \cdot dz.
\end{aligned} \tag{6.15}$$
Note that generally Δx = Δy = Δz = 1, i.e. the cell-length is assumed to be equal along the three axes and unity. i, j, k are the cell indices of the particle in the three Cartesian coordinates. The accelerations in the y- and z-direction are calculated analogously.
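The three 1D schemes (6.11) to (6.13) can be compared with a few lines of Python. This is a sketch under assumptions: the cell length is set to unity as in the text, the sign convention a = ∂Φ/∂x of the equations above is kept, the function names are hypothetical, and for the CIC case a negative dx is assumed to use the left-hand neighbour, which (6.13) does not state explicitly.

```python
def first_derivative(phi, i):
    """Central difference for dPhi/dx at cell i (cell length 1)."""
    return 0.5 * (phi[i + 1] - phi[i - 1])


def second_derivative(phi, i):
    """Central difference for d2Phi/dx2 at cell i (cell length 1)."""
    return phi[i + 1] + phi[i - 1] - 2.0 * phi[i]


def acc_ngp(phi, i, dx):
    """(6.11): the deviation dx from the cell centre is ignored."""
    return first_derivative(phi, i)


def acc_superbox(phi, i, dx):
    """(6.12): NGP value plus the next Taylor term."""
    return first_derivative(phi, i) + second_derivative(phi, i) * dx


def acc_cic(phi, i, dx):
    """(6.13): weighted mean of cell i and the neighbour on the side of dx."""
    neighbour = i + 1 if dx >= 0 else i - 1
    w = abs(dx)
    return first_derivative(phi, i) * (1.0 - w) + first_derivative(phi, neighbour) * w


phi = [float(i * i) for i in range(8)]  # test potential Phi = x^2, so dPhi/dx = 2x
a_ngp = acc_ngp(phi, 3, 0.25)
a_sb = acc_superbox(phi, 3, 0.25)
a_cic = acc_cic(phi, 3, 0.25)
```

For this quadratic test potential the exact derivative at x = 3.25 is 6.5. The NGP scheme returns the cell-centre value 6.0, while the Superbox and CIC expressions both recover 6.5.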
6.2.6 Integrating the Particles
The orbits of the particles are integrated forward in time using the leapfrog scheme. For example, for the x-components of the velocity, $v_x$, and position, x, of particle l:

$$v_{x,l}^{\,n+1/2} = v_{x,l}^{\,n-1/2} + a_{x,l}^{\,n} \cdot \Delta t,$$
$$x_l^{\,n+1} = x_l^{\,n} + v_{x,l}^{\,n+1/2} \cdot \Delta t, \tag{6.16}$$

where n denotes the nth time step and Δt is the length of the integration step.
Superbox uses a fixed global time step, i.e. the time step is the same for all particles and does not vary in time.
The leapfrog integrator together with the fixed time step is very fast (no decision-making necessary) and is accurate enough for a grid-based code. It is in principle time-reversible and has very good energy conservation properties considering its simplicity.
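The leapfrog update (6.16) with a fixed time step can be sketched as follows. The harmonic test force a = −x, the start-up half step for the velocity and all names are assumptions chosen for this illustration; in a PM code the acceleration would come from the grid-based potential instead.

```python
import math


def leapfrog_step(x, v_half, acceleration, dt):
    """One step of (6.16): kick the half-step velocity, then drift the position."""
    v_new = v_half + acceleration(x) * dt
    x_new = x + v_new * dt
    return x_new, v_new


def integrate(x0, v0, acceleration, dt, n_steps):
    """Integrate with a fixed global time step, starting from values at the same time."""
    x = x0
    v_half = v0 - 0.5 * acceleration(x0) * dt  # shift the velocity back by half a step
    for _ in range(n_steps):
        x, v_half = leapfrog_step(x, v_half, acceleration, dt)
    return x, v_half


dt = 0.01
n_steps = 628                                   # roughly one oscillation period
x_end, v_end = integrate(1.0, 0.0, lambda x: -x, dt, n_steps)
x_exact = math.cos(n_steps * dt)                # analytic solution for x(0) = 1, v(0) = 0
```

After about one period of the test oscillator the numerical position stays very close to the analytic value, which illustrates the good long-term behaviour mentioned above for a scheme this simple.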
6.3 Multi-Grid Structure of SUPERBOX
A detailed description of the code is also found in Fellhauer et al. (2000). For each galaxy, five grids with three different resolutions are used. This is made possible by invoking the additivity of the potential (Fig. 6.4).
The five grids are as follows:
• Grid 1 is the high-resolution grid that resolves the centre of the galaxy. It has a length of $2 \times R_{\rm core}$ in one dimension. In evaluating the densities, all particles of the galaxy within $r \le R_{\rm core}$ are stored in this grid.
• Grid 2 has an intermediate resolution to resolve the galaxy as a whole. The length is $2 \times R_{\rm out}$, but only particles with $r \le R_{\rm core}$ are stored here, i.e. the same particles as are also stored in grid 1.
• Grid 3 has the same size and resolution as grid 2, but it contains only particles with $R_{\rm core} < r \le R_{\rm out}$.
• Grid 4 has the size of the whole simulation area (i.e. 'local universe' with $2 \times R_{\rm system}$) and has the lowest resolution. It is fixed. Only particles of the galaxy with $r \le R_{\rm out}$ are stored in grid 4.
• Grid 5 has the same size and resolution as grid 4. This grid treats the escaping particles of a galaxy and contains all particles with $r > R_{\rm out}$.

[Figure 6.4: four panels showing grids 1 + 2, grid 3, grid 4 and grid 5, with the radii R_core, R_out and R_system marked.]

Fig. 6.4 The five grids of Superbox. In each panel, solid lines highlight the relevant grid. Particles are counted in the shaded areas of the grids. The lengths of the arrows are (N/2) − 2 grid-cells (see text). In the bottom left panel, the grids of a hypothetical second galaxy are also shown as dotted lines
Grids 1 to 3 are focused on a common centre of the galaxy and move with it through the 'local universe', as detailed below. All grids have the same number of cells per dimension, n, for all galaxies. The boundary condition, requiring two empty cells with ρ = 0 at each boundary, is open and non-periodic, thus providing an isolated system. This, however, means that only n − 4 active cells per dimension are used.
To keep the memory requirement low, all galaxies are treated consecutively in the same grid-arrays, whereby the particles belonging to different galaxies can have different masses. Each of the five grids has its associated potential $\Phi_i$, i = 1, 2, ..., 5, computed by the PM technique from the particles of one galaxy located as described above. The accelerations are obtained additively from the five potentials of each galaxy in turn in the following way:
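
$$\begin{aligned}
\Phi(r) &= \left[\theta(R_{\rm core} - r) \cdot \Phi_1 + \theta(r - R_{\rm core}) \cdot \Phi_2 + \Phi_3\right] \cdot \theta(R_{\rm out} - r) + \theta(r - R_{\rm out}) \cdot \Phi_4 + \Phi_5, \\
\Phi(R_{\rm core}) &= \Phi_1 + \Phi_3 + \Phi_5, \\
\Phi(R_{\rm out}) &= \Phi_2 + \Phi_3 + \Phi_5,
\end{aligned} \tag{6.17}$$
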
where θ(ξ) = 1 for ξ > 0 and θ(ξ) = 0 otherwise. This means:
• For a particle in the range $r \le R_{\rm core}$, the potentials of grids 1, 3 and 5 are used to calculate the acceleration.
• For a particle with $R_{\rm core} < r \le R_{\rm out}$, the potentials of grids 2, 3 and 5 are combined.
• And finally, if $r > R_{\rm out}$, the acceleration is calculated from the potentials of grids 4 and 5.
• Any particle with $r > R_{\rm system}$ is removed from the computation.
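The selection rule in the list above can be written as a small lookup. The function names and the dictionary of per-grid potential values are hypothetical, introduced only to illustrate (6.17); in the code itself the accelerations, not the potential values, are accumulated.

```python
def grids_for_particle(r, r_core, r_out, r_system):
    """Return the grid numbers whose potentials act on a particle at radius r."""
    if r > r_system:
        return ()            # particle is removed from the computation
    if r <= r_core:
        return (1, 3, 5)
    if r <= r_out:
        return (2, 3, 5)
    return (4, 5)


def combined_potential(r, phi_by_grid, r_core, r_out, r_system):
    """Sum the potentials of the selected grids, following (6.17)."""
    return sum(phi_by_grid[g] for g in grids_for_particle(r, r_core, r_out, r_system))


# Illustrative numbers only: potential values of grids 1..5 at the particle position.
phi_by_grid = {1: 10.0, 2: 8.0, 3: 2.0, 4: 9.0, 5: 0.5}
inner = combined_potential(0.5, phi_by_grid, r_core=1.0, r_out=5.0, r_system=50.0)
middle = combined_potential(3.0, phi_by_grid, r_core=1.0, r_out=5.0, r_system=50.0)
outer = combined_potential(20.0, phi_by_grid, r_core=1.0, r_out=5.0, r_system=50.0)
```

Each particle therefore always sees the mass inside the core, the mass between the core and the outer radius, and the escaped mass exactly once, but at a resolution that depends on where the particle itself is located.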
Due to the additivity of the potential (and hence its derivatives, the accelerations), the velocity changes originating from the potentials of each of the galaxies can be separately updated and accumulated in the first of the leapfrog formulae (6.16). The final result does not depend on the order in which the galaxies are taken into account, and it could even be computed in parallel if a final accumulation takes place. After all velocity changes have been applied to all galaxies, the positions of the particles are finally updated.
As long as the galaxies are well separated, they feel only the low-resolution potentials of the outer grids. But as the galaxies approach each other, their high-resolution grids overlap, leading to a high-resolution force calculation during the interaction.
6.3.1 Grid Tracking
Two alternative schemes to position and track the inner and middle grids can be used. The most useful scheme centres the grids on the density maximum of each galaxy at each step. The position of the density maximum is found by constructing a sphere of neighbours centred on the densest region, in which the centre of mass is computed. This is performed iteratively. The other option is to centre the grids during run-time on the position of the centre of mass of each galaxy, using all its particles remaining in the computation.
6.3.2 Edge-Effects
It can be seen in Fig. 6.4 that only spherical regions of the cubic grids contain particles (except for grid 5). Particles with eccentric orbits can cross the border of two grids, thus being subject to forces resolved differently. No interpolation of the forces is done at the grid boundaries. This keeps the code fast and slim, but the grid sizes have to be chosen properly in advance to minimise the boundary discontinuities. It leads to some additional but negligible relaxation effects, because the derived total potential has insignificant discontinuities at the grid boundaries (Wassmer 1992). The best way to avoid these edge-effects is to place the grid boundaries at 'places' where the slope of the potential is not steep.
6.3.3 Choice of Parameters
Finally, we make some comments on the right choice of parameters. In principle Superbox works with all sets of parameters, but the outcome might be unphysical. The user has to check whether the choice makes sense. There are a few rules that help to ensure that the simulation is not unrealistic. First, one should check whether there are enough particles for the given resolution. As a rule of thumb, one can divide the number of particles by the total number of cells of one grid. If the mean number of particles per cell amounts to a few, then one is on the safe side (conservatively, ⟨N⟩ ≈ 10–15). Second, one should check the time-step. Particles should not travel much more than one grid-cell per time step, otherwise one again loses resolution. Another rule of thumb is to take the shortest crossing-time of all objects and divide it by 10 (conservatively, by 50–70). This ensures that this object stays stable. Finally, it is not useful to have large resolution steps between the grid levels; at the least, one should avoid them in all places of interest.
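These rules of thumb are easy to turn into a quick pre-flight check. The helper names and the example numbers below are assumptions for illustration; they are not part of Superbox.

```python
def mean_particles_per_cell(n_particles, n_cells_per_dim):
    """Number of particles divided by the total number of cells of one grid."""
    return n_particles / n_cells_per_dim ** 3


def suggested_time_step(shortest_crossing_time, divisor=10.0):
    """Shortest crossing-time divided by 10 (or by 50 to 70 to be conservative)."""
    return shortest_crossing_time / divisor


def cells_travelled_per_step(speed, dt, cell_length):
    """How many cell lengths a particle with the given speed covers in one step."""
    return speed * dt / cell_length


# Example: one million particles on a grid with 64 cells per dimension.
occupancy = mean_particles_per_cell(1_000_000, 64)
dt_normal = suggested_time_step(5.0)
dt_conservative = suggested_time_step(5.0, divisor=50.0)
travel = cells_travelled_per_step(speed=2.0, dt=dt_conservative, cell_length=0.5)
```

In this example the occupancy is a few particles per cell, which satisfies the basic rule but not the conservative one, and a typical particle moves less than one cell per conservative time step.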
References

Eastwood J. W., Brownrigg D. R. K., 1978, J. Comput. Phys., 32, 24
Fellhauer M., 2006, Superbox manual, madf@ast.cam.ac.uk
Fellhauer M., Kroupa P., Baumgardt H., Bien R., Boily C. M., Spurzem R., Wassmer N., 2000, NewA, 5, 305
Hockney R. W., Eastwood J. W., 1981, Computer Simulations Using Particles. McGraw-Hill
Hohl F., 1970, NASA Technical Report R-343
Teukolsky S. A., Vetterling W. T., Flannery B. P., 1992, Numerical Recipes in Fortran. Cambridge University Press, Cambridge
Wassmer N., 1992, Diploma thesis, University Heidelberg
Werner H., Schabach R., 1979, Praktische Mathematik II. Springer
7
Dynamical Friction
Michael Fellhauer
University of Cambridge, Institute of Astronomy, Madingley Road, Cambridge CB3 0HA, UK
madf@ast.cam.ac.uk
7.1 What is Dynamical Friction?
Dynamical friction is, as the name says, a deceleration of massive objects. It occurs whenever a massive object travels through another extended object. This behaviour makes dynamical friction one of the most important effects in stellar dynamics.
It occurs on all kinds of length-scales and objects: from the sinking of massive stars to the centre of a star cluster, leading to mass segregation, via the sinking of star clusters and dwarf galaxies inside their host galaxy, to collisions of massive galaxies.
Dynamical friction is a purely gravitational interaction between the massive object (M) and the multitude of lighter stars (m) of the extended object it is travelling through (see Fig. 7.1, left panel). In the rest-frame of the moving object M, the lighter stars are oncoming from the front and get deflected behind the object (see Fig. 7.1, middle panel). These many gravitational interactions sum up to an effective deceleration of the object, while some of the deflected lighter particles m build up a wake behind M (see Fig. 7.1, right panel). This wake can be measured and may induce an extra drag on the moving object, but this drag is neglected in the standard description of dynamical friction. It is dynamical friction that causes the wake, and not the wake that is responsible for the dynamical friction.
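[Figure 7.1: three panels showing the massive object M moving with velocity v through the lighter stars, and the wake that forms behind M.]

Fig. 7.1 Dynamical friction as a cartoon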
Fellhauer, M.: Dynamical Friction. Lect. Notes Phys. 760, 171–179 (2008)
DOI 10.1007/978-1-4020-8431-7_7 © Springer-Verlag Berlin Heidelberg 2008
Hence dynamical friction causes a deceleration of the object M. If the object was on a stable orbit before, this deceleration causes the orbit to shrink and the object to sink to the centre. If the object is initially on an eccentric orbit, dynamical friction acts in such a way that the orbit becomes more and more circular.
7.2 How to Quantify Dynamical Friction
Dynamical friction was first quantified by Chandrasekhar (1943). In this section the classical way to derive the dynamical friction formula will be followed (see, for example, Binney & Tremaine 1987, chapter 7.1).
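Before the multitude of encounters can be treated, one has to focus on a single encounter. The geometry of this encounter is shown in the left panel of Fig. 7.2. Defining $\mathbf{r} = \mathbf{x}_m - \mathbf{x}_M$ as the separation vector between m and M, and $\mathbf{V} = \dot{\mathbf{r}}$, one gets the relative velocity change

$$\Delta \mathbf{V} = \Delta \mathbf{v}_m - \Delta \mathbf{v}_M. \tag{7.1}$$

Because this two-body system is conservative, one can apply momentum conservation, which leads to

$$m \Delta \mathbf{v}_m + M \Delta \mathbf{v}_M = 0. \tag{7.2}$$

Combining these two equations and eliminating $\Delta \mathbf{v}_m$ gives $\Delta \mathbf{v}_M$ as a function of $\Delta \mathbf{V}$:

$$\Delta \mathbf{v}_M = -\left(\frac{m}{m + M}\right) \Delta \mathbf{V}. \tag{7.3}$$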
[Figure 7.2: left, the masses m and M at positions x_m and x_M with velocities v_m and v_M and separation vector r; right, the hyperbolic orbit of the reduced particle with initial velocity V_0, impact parameter b, angles Ψ and Ψ_0, and deflection angle θ.]

Fig. 7.2 Left: Geometry of a single encounter. Right: The motion of the reduced particle during a hyperbolic encounter. $\mathbf{V}_0 = \mathbf{V}(t = -\infty)$ is the initial velocity, b is the impact parameter and θ is the deflection angle

In the right panel of Fig. 7.2 we show the hyperbolic geometry of the Kepler problem in the frame of the reduced particle mass travelling in the combined potential due to both particles (m + M). The conserved angular momentum (per unit mass) in this system is $L = bV_0 = r^2 \dot{\Psi}$. From the analytical solution of the Kepler problem we know the equation that relates radius r and azimuthal angle Ψ:

$$\frac{1}{r} = C \cos(\Psi - \Psi_0) + \frac{G(m + M)}{b^2 V_0^2}, \tag{7.4}$$
where C and $\Psi_0$ are constants defined by the initial conditions. If (7.4) is differentiated with respect to time, one gets

$$\frac{dr}{dt} = C r^2 \dot{\Psi} \sin(\Psi - \Psi_0) = C b V_0 \sin(\Psi - \Psi_0). \tag{7.5}$$

Evaluating (7.4) and (7.5) at t = −∞, one obtains

$$0 = C \cos(\Psi_0) + \frac{G(m + M)}{b^2 V_0^2}, \tag{7.6}$$

$$-V_0 = C b V_0 \sin(-\Psi_0). \tag{7.7}$$

Using these two equations to eliminate C leads to

$$\tan(\Psi_0) = -\frac{b V_0^2}{G(m + M)}. \tag{7.8}$$
The point of closest approach is reached when $\Psi = \Psi_0$, and since the orbit is symmetrical about this point, the deflection angle is $\theta = 2\Psi_0 - \pi$. By conservation of energy, the length of the relative velocity vector is the same before and after the encounter and has the value $V_0$. Hence the components $\Delta \mathbf{V}_\parallel$ and $\Delta \mathbf{V}_\perp$ of $\Delta \mathbf{V}$ are given by

$$|\Delta \mathbf{V}_\perp| = V_0 \sin(\theta) = V_0 |\sin(2\Psi_0)| = \frac{2 V_0 |\tan(\Psi_0)|}{1 + \tan^2(\Psi_0)} = \frac{2 b V_0^3}{G(m + M)} \left[1 + \frac{b^2 V_0^4}{G^2 (m + M)^2}\right]^{-1}, \tag{7.9}$$

$$|\Delta \mathbf{V}_\parallel| = V_0 [1 - \cos(\theta)] = V_0 (1 + \cos(2\Psi_0)) = \frac{2 V_0}{1 + \tan^2(\Psi_0)} = 2 V_0 \left[1 + \frac{b^2 V_0^4}{G^2 (m + M)^2}\right]^{-1}. \tag{7.10}$$
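$\Delta \mathbf{V}_\parallel$ always points in the direction opposite to $\mathbf{V}_0$. Using (7.3), one finally gets

$$|\Delta \mathbf{v}_{M\perp}| = \frac{2 m b V_0^3}{G(m + M)^2} \left[1 + \frac{b^2 V_0^4}{G^2 (m + M)^2}\right]^{-1}, \tag{7.11}$$

$$|\Delta \mathbf{v}_{M\parallel}| = \frac{2 m V_0}{m + M} \left[1 + \frac{b^2 V_0^4}{G^2 (m + M)^2}\right]^{-1}. \tag{7.12}$$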
Hence, by (7.3), $\Delta \mathbf{v}_{M\parallel}$ always points in the same direction as $\mathbf{V}_0$.
Let us now imagine that M travels through an infinite homogeneous "sea of particles". Then there are as many deflections from "above" as from "below", or from "right" as from "left", and the changes $\Delta \mathbf{v}_{M\perp}$ sum up to zero. Furthermore, one has to invoke the "Jeans swindle" to neglect the gravitational potential of the "sea of particles", so the motion of each particle is determined only by M. The changes $\Delta \mathbf{v}_{M\parallel}$ are all parallel to $\mathbf{V}_0$ and form a non-zero resultant, i.e. the mass M suffers a steady deceleration, which is called dynamical friction.
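The single-encounter results (7.11) and (7.12) can be evaluated numerically with a short function. The name `encounter_kick` and the choice of units with G = 1 are assumptions of this sketch.

```python
import math


def encounter_kick(m, M, b, V0, G=1.0):
    """Return (|dv_M_perp|, |dv_M_par|) for one encounter, following (7.11) and (7.12)."""
    x = b * V0 ** 2 / (G * (m + M))      # equals |tan(Psi_0)| from (7.8)
    dV_perp = 2.0 * V0 * x / (1.0 + x * x)   # (7.9)
    dV_par = 2.0 * V0 / (1.0 + x * x)        # (7.10)
    factor = m / (m + M)                     # momentum conservation, (7.3)
    return factor * dV_perp, factor * dV_par


dv_perp, dv_par = encounter_kick(m=1.0, M=3.0, b=2.0, V0=1.0)

# Consistency check: the total relative velocity change is 2 V0 sin(theta/2),
# with theta = 2 Psi_0 - pi and Psi_0 = pi - atan(x).
x = 2.0 * 1.0 ** 2 / (1.0 + 3.0)
theta = math.pi - 2.0 * math.atan(x)
total_from_angle = 2.0 * 1.0 * math.sin(theta / 2.0) * (1.0 / (1.0 + 3.0))
total_from_components = math.hypot(dv_perp, dv_par)
```

For these example values the parallel kick is twice the perpendicular one. Distant encounters (large b) mostly deflect sideways, and it is those sideways kicks that cancel in a homogeneous background.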
To determine the deceleration, one now has to integrate over all possible impact parameters b and velocities $\mathbf{v}_m$. For particles m with velocity distribution $f(\mathbf{v})$, the rate at which M encounters particles in the velocity-space element $d^3\mathbf{v}_m$ at impact parameters between b and b + db is

$$2\pi b\,db \times V_0 \times f(\mathbf{v}_m)\,d^3\mathbf{v}_m. \tag{7.13}$$
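Hence the net rate of change of $\mathbf{v}_M$ is

$$\left.\frac{d\mathbf{v}_M}{dt}\right|_{\mathbf{v}_m} = \mathbf{V}_0 f(\mathbf{v}_m)\,d^3\mathbf{v}_m \int_0^{b_{\max}} \frac{2 m V_0}{m + M} \left[1 + \frac{b^2 V_0^4}{G^2 (m + M)^2}\right]^{-1} 2\pi b\,db, \tag{7.14}$$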
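with $b_{\max}$ the largest impact parameter to be considered. Performing the integration over all b, one finds

$$\left.\frac{d\mathbf{v}_M}{dt}\right|_{\mathbf{v}_m} = 2\pi \ln(1 + \Lambda^2)\, G^2 m (m + M) f(\mathbf{v}_m) \frac{\mathbf{v}_m - \mathbf{v}_M}{|\mathbf{v}_m - \mathbf{v}_M|^3}\,d^3\mathbf{v}_m, \tag{7.15}$$

with

$$\Lambda = \frac{b_{\max} V_0^2}{G(m + M)} = \frac{b_{\max}}{b_{\min}}. \tag{7.16}$$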
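The step from (7.14) to (7.15) is a single elementary integral. Written out with the abbreviation $b_{\min} = G(m + M)/V_0^2$ from (7.16) and $\mathbf{V}_0 = \mathbf{v}_m - \mathbf{v}_M$, it reads as follows; this worked step is added for convenience and uses only the quantities defined above.

```latex
\int_0^{b_{\max}} \left[1 + \frac{b^2}{b_{\min}^2}\right]^{-1} 2\pi b\,\mathrm{d}b
  = \pi b_{\min}^2 \ln\!\left(1 + \frac{b_{\max}^2}{b_{\min}^2}\right)
  = \pi \frac{G^2 (m+M)^2}{V_0^4} \ln\!\left(1 + \Lambda^2\right),
\qquad
\mathbf{V}_0 \, \frac{2 m V_0}{m+M} \, \pi \frac{G^2 (m+M)^2}{V_0^4} \ln\!\left(1 + \Lambda^2\right)
  = 2\pi \ln\!\left(1 + \Lambda^2\right) G^2 m (m+M) \, \frac{\mathbf{V}_0}{V_0^3}.
```

Multiplying by $f(\mathbf{v}_m)\,d^3\mathbf{v}_m$ then gives (7.15).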
Usually Λ is very large, so one can assume that $\tfrac{1}{2} \ln(1 + \Lambda^2) \approx \ln(\Lambda)$, which is called the Coulomb logarithm. Furthermore, one replaces $V_0$ by the typical speed $v_{\rm typ}$. Equation (7.15) states that particles that have velocity $\mathbf{v}_m$ exert a force on M that acts parallel to $\mathbf{v}_m - \mathbf{v}_M$ and is inversely proportional to the square of the magnitude of this vector. The problem of integrating over all velocities $\mathbf{v}_m$ is equivalent to finding the gravitational field at the point with position vector $\mathbf{v}_M$ in velocity space, which is generated by the "mass density" $\rho(\mathbf{v}_m) = 4\pi \ln(\Lambda) G m (m + M) f(\mathbf{v}_m)$. If the particles move isotropically, the density distribution is spherical, and according to Newton's first and second theorems the total acceleration of M is equal to $G/v_M^2$ times the total "mass" at $v_m < v_M$. Hence

$$\frac{d\mathbf{v}_M}{dt} = -16\pi^2 \ln(\Lambda)\, G^2 m (m + M) \frac{\int_0^{v_M} f(v_m) v_m^2\,dv_m}{v_M^3}\, \mathbf{v}_M, \tag{7.17}$$

i.e. only particles m with velocities slower than M contribute to the force, which always opposes the motion of M. This equation is called the Chandrasekhar dynamical friction formula.
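If $f(v_m)$ is Maxwellian with dispersion σ, then

$$f = \frac{n_0}{(2\pi\sigma^2)^{3/2}} \exp\left(-\frac{v^2}{2\sigma^2}\right), \tag{7.18}$$

and, introducing $\rho = n_0 m$ as the background density, one can perform the integration, which gives

$$\frac{d\mathbf{v}_M}{dt} = -\frac{4\pi \ln(\Lambda)\, G^2 \rho M}{v_M^3} \left[\mathrm{erf}(X) - \frac{2X}{\sqrt{\pi}} \exp(-X^2)\right] \mathbf{v}_M, \tag{7.19}$$

with $X = v_M/(\sqrt{2}\sigma)$. This formula holds for $M \gg m$.
With this formula one can derive some useful relations. Keeping ln Λ constant, we can determine the time a star cluster or dwarf galaxy needs to spiral into the centre of its host system:

$$t_{\rm fric} = \frac{1.17\, D_0^2\, v_{\rm circ}}{\ln(\Lambda)\, G M} = \frac{2.64 \times 10^{11}}{\ln(\Lambda)} \left(\frac{D_0}{2\,\mathrm{kpc}}\right)^2 \left(\frac{v_{\rm circ}}{250\,\mathrm{km\,s^{-1}}}\right) \left(\frac{10^6\,M_\odot}{M}\right) \mathrm{yr}. \tag{7.20}$$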
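Both (7.19) and (7.20) are straightforward to evaluate. The sketch below assumes units of kpc, km/s and solar masses, with G ≈ 4.30091 × 10⁻⁶ kpc (km/s)² per solar mass and 1 kpc/(km/s) ≈ 9.7779 × 10⁸ yr; the function names are hypothetical.

```python
import math

G_KPC = 4.30091e-6             # gravitational constant in kpc (km/s)^2 / Msun
KPC_PER_KMS_IN_YR = 9.7779e8   # one kpc/(km/s) expressed in years


def chi(X):
    """Velocity factor erf(X) - 2X/sqrt(pi) * exp(-X^2) appearing in (7.19)."""
    return math.erf(X) - 2.0 * X / math.sqrt(math.pi) * math.exp(-X * X)


def friction_deceleration(v_M, M, rho, sigma, ln_lambda, G=G_KPC):
    """Magnitude of dv_M/dt from (7.19); it acts opposite to the velocity."""
    X = v_M / (math.sqrt(2.0) * sigma)
    return 4.0 * math.pi * ln_lambda * G ** 2 * rho * M * chi(X) / v_M ** 2


def friction_time_yr(D0_kpc, v_circ_kms, M_msun, ln_lambda):
    """Sinking time from the first form of (7.20), converted to years."""
    t = 1.17 * D0_kpc ** 2 * v_circ_kms / (ln_lambda * G_KPC * M_msun)
    return t * KPC_PER_KMS_IN_YR


t_reference = friction_time_yr(2.0, 250.0, 1.0e6, 1.0)   # scaling values of (7.20)
t_cluster = friction_time_yr(2.0, 250.0, 1.0e6, 10.0)    # same object with ln(Lambda) = 10
```

With the scaling values of (7.20) and ln Λ = 1, the first form reproduces the quoted prefactor of about 2.64 × 10¹¹ yr to within roughly one per cent; the small difference comes from rounding of the constants.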
Furthermore, McMillan & Portegies Zwart (2003) derived a formula for the sinking rate if the background is a mass distribution following a power law of the form $M(D) = A \cdot D^\alpha$. Then the distance D of an object to the centre of the host system vs. time is given by

$$D(t) = D_0 \left[1 - \frac{\alpha(\alpha + 3)}{\alpha + 1} \sqrt{\frac{G}{A D_0^{\alpha + 3}}}\; \chi M \ln(\Lambda)\, t\right]^{\frac{2}{3 + \alpha}}, \tag{7.21}$$

with

$$\chi = \mathrm{erf}(X) - \frac{2X}{\sqrt{\pi}} \exp(-X^2), \tag{7.22}$$

where $X = v_M/(\sqrt{2}\sigma)$.
Even though one might think that the derivation of Chandrasekhar's formula has too many vague definitions and approximations in it, it has been shown that it is a really powerful tool to describe dynamical friction in all kinds of environments.
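The sinking curve (7.21) can be tabulated with a few lines of code. The function name, the use of units with G = 1 and the convention of returning zero once the bracket has reached zero are assumptions of this sketch.

```python
import math


def sinking_distance(t, D0, M, ln_lambda, A, alpha, chi, G=1.0):
    """Distance to the centre at time t from (7.21), for a background M(D) = A * D**alpha."""
    rate = (
        alpha * (alpha + 3.0) / (alpha + 1.0)
        * math.sqrt(G / (A * D0 ** (alpha + 3.0)))
        * chi * M * ln_lambda
    )
    bracket = 1.0 - rate * t
    if bracket <= 0.0:
        return 0.0               # the object has reached the centre
    return D0 * bracket ** (2.0 / (3.0 + alpha))


# Illustrative values in arbitrary units: the bracket reaches zero at t = 20.
params = dict(D0=1.0, M=0.01, ln_lambda=5.0, A=1.0, alpha=1.0, chi=0.5)
curve = [sinking_distance(t, **params) for t in (0.0, 5.0, 10.0, 15.0, 30.0)]
```

The resulting curve starts at D0, decreases monotonically and steepens towards the end, which is the typical shape of the sinking curves discussed in the following section.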
7.3 Dynamical Friction in Numerical Simulations
Especially in numerical simulations, the validity of Chandrasekhar's formula has been verified throughout the decades. Still, some words of caution have to be added. In the previous section it was shown that $\Lambda = b_{\max}/b_{\min}$, with $b_{\rm min,theo} = G(m + M)/v_M^2$ being a very small quantity in the extreme case of a point mass (e.g. a $10^6\,M_\odot$ black hole with a velocity of 50 km s⁻¹ gives $b_{\min} \approx 2$ pc). For extended objects like a star cluster, $b_{\min}$ is of the order of the size of the cluster.
However, even if one uses a point mass to determine dynamical friction, it is not easy to reach the correct result. All standard N-body codes are resolution-limited. Even if one does not introduce softening and uses a direct summation N-body code, the limitation is introduced through the finite particle number. In a study of how dynamical friction is influenced by the resolution of the simulation code (i.e. the softening length used), Spinnato et al. (2003) showed that, with a given softening length ε (or, in the case of a particle-mesh code, the cell-length l),

$$b_{\rm min,eff} \approx b_{\rm min,theo} + \varepsilon \ (\text{or } l). \tag{7.23}$$
This is shown in Fig. 7.3: the left panel gives the actual sinking curve for two choices of resolution in a particle-mesh code, and the right panel gives the derived ln(Λ) for all choices of ε, comparing a direct summation N-body code, a tree code and a particle-mesh code.
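The numbers quoted above are easy to reproduce. The sketch below assumes G ≈ 4.30091 × 10⁻³ pc (km/s)² per solar mass, and the function names as well as the example resolution and b_max values are hypothetical.

```python
import math

G_PC = 4.30091e-3   # gravitational constant in pc (km/s)^2 / Msun


def b_min_theo(M, v, m=0.0):
    """Theoretical minimum impact parameter G (m + M) / v^2, in pc."""
    return G_PC * (m + M) / v ** 2


def b_min_eff(M, v, resolution, m=0.0):
    """Effective value from (7.23): theoretical value plus softening or cell length."""
    return b_min_theo(M, v, m) + resolution


def coulomb_logarithm(b_max, b_min):
    """ln(Lambda) = ln(b_max / b_min)."""
    return math.log(b_max / b_min)


b_point = b_min_theo(1.0e6, 50.0)                  # 10^6 Msun black hole at 50 km/s
b_coarse = b_min_eff(1.0e6, 50.0, resolution=20.0)
ln_lambda_point = coulomb_logarithm(1000.0, b_point)
ln_lambda_coarse = coulomb_logarithm(1000.0, b_coarse)
```

The point-mass value comes out at about 1.7 pc, in line with the estimate of roughly 2 pc given above. A resolution length much larger than this dominates the effective minimum impact parameter and lowers the Coulomb logarithm, so a poorly resolved simulation underestimates the friction.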
In this study ln(Λ) was assumed to be constant during the whole simulation time, independently of the actual distance D to the centre of the background. Fitting $b_{\max}$ of a constant ln(Λ) to the data resulted in $b_{\max} = k D_0$ with k ≈ 0.5.
In another study, Fellhauer & Lin (2007) used the same approach but fitted ln(Λ) at many small time-slices during the sinking process and determined $b_{\min}$ as a function of the resolution and $b_{\max}$ as a function of the distance D, as shown in Fig. 7.4:

$$\ln \Lambda = \ln(b_{\max}) - \ln(b_{\min}) = \ln(k' \cdot D(t)) - \ln(b_{\rm min,eff}). \tag{7.24}$$

The values for $b_{\rm min,eff}$ were in very good agreement with (7.23) for the different resolutions. Superbox, the particle-mesh code used in this study, has three levels of grid-resolution. While the point-mass starts inside the medium-resolution grid, it crosses the grid-boundary to the high-resolution area when D < 1 in the above simulation. The values for k′ differ from the value k found in the previous study and also seem to depend on the resolution.

[Figure 7.3: left panel, sinking curves R/R0 versus t; right panel, ln Λ versus the softening length or cell length in units of ε0, with data and fits for the direct summation (PP), tree and PM codes.]

Fig. 7.3 Influence of the resolution on the dynamical friction of a point mass

[Figure 7.4: ln Λ versus D.]

Fig. 7.4 ln(Λ) as a function of the distance to the centre of the background. Also visible is the change in resolution for D < 1, which leads to a smaller value of $b_{\min}$ and a larger value of ln(Λ). ln(Λ) decreases with decreasing distance. Fitting curves assume $b_{\max} \propto D$ (7.24)
7.4 Dynamical Friction of an Extended Object
In the previous section the dependence of ln(Λ) on environment was investigated, which was possible because the studies involved the sinking of a point mass with constant mass. In many cases of dynamical friction the sinking object is extended and, due to tidal forces acting on it, the mass is not constant. This section investigates which mass one has to insert into dynamical friction formulae like (7.19) and (7.21).
The initial mass and orbit of the extended object (it could be a star cluster or a dwarf galaxy) are the same as those of the point-mass of the previous section. We again use (7.21), now to fit the combined quantity $M_{\rm cl} \ln(\Lambda)$. For the left panel of Fig. 7.5 this quantity is converted into ln(Λ) in the following two ways:

$$\ln \Lambda(t)_{\rm crosses} = (M_{\rm cl} \ln \Lambda)(t) / M_{\rm bound}(t = 0), \tag{7.25}$$
$$\ln \Lambda(t)_{\rm tri\text{-}pods} = (M_{\rm cl} \ln \Lambda)(t) / M_{\rm bound}(t). \tag{7.26}$$
The curves show that neither way gives the correct answer. If the mass is kept constant and the initial mass is inserted, the data points fall below the reference line of the point-mass case. This disparity is expected, since an extended object should have a larger $b_{\min}$ than that of a point-mass potential. For t < 30 or D > 1, the difference between these two simulations is less than 20 per cent. However, it can also be seen that the deviation from the fitting line grows with time, especially at t > 30 (or, equivalently, as D decreases below 1). This growing difference is due to the loss of mass from the stellar cluster. This divergence shows that a constant $M_{\rm cl}$ approximation does not adequately represent the results of the simulation. If one inserts the bound mass as responsible for the dynamical friction, the measured values are systematically above the fitting line that represents the cluster with a point-mass potential. However, using the above argument that an extended object should have a larger $b_{\min}$ than that of a point-mass potential, the tri-pods measured from this simulation would be systematically below the fitting line if the bound stars adequately accounted for all the mass that contributes to the dynamical friction. This disparity is a first hint that more particles may take part in the dynamical friction than just the bound stars. In the later stages of the evolution these values of ln Λ increase quite dramatically, which is a clear sign that $M_{\rm cl}$ is underestimated.
In the right panel of Fig. 7.5 the bound mass of the object as a function of time (solid line) is plotted. In the same figure, crosses and squares represent the mass of the cluster taking part in the dynamical friction process if the same ln Λ as that derived for a point-mass is assumed. Then one solves for $M_{\rm cl}$ with (7.21). For the crosses the actual values from the point-mass simulation are applied, while the data-points of the squares are derived using the smoothed fitting curve for ln Λ(D) from (7.24). (Since it has already been shown that the magnitude of ln Λ(D) for a cluster with a Plummer potential is smaller than that for a point mass, the actual total mass that contributes to the dynamical friction is slightly larger than both the values represented by the crosses and the squares.) Even though the uncertainties are large, the data points show that the total mass that contributes to the effect of dynamical friction is systematically above the bound-mass curve.

[Figure 7.5: two panels with axes labelled ln Λ, t, D and M_cl.]

Fig. 7.5 Dynamical friction on an extended object. Left: Fitting $M_{\rm cl} \ln(\Lambda)$ to the sinking curve in small time-slices, as in Fig. 7.4, and deriving ln(Λ) according to (7.25) & (7.26). Right: Using the values of ln(Λ) derived from the point-mass case to determine $M_{\rm cl}$, the mass responsible for dynamical friction (yellow squares using the fitting formulae, black crosses with error-bars using the actual values of the point-mass simulation). (Red) solid line shows the bound mass of the object; long dashed (green) line the bound mass plus the unbound mass in a ring around the centre of the background with the size of the object. (Red) short dashed line is the rule of thumb, bound mass plus half of the unbound mass
In addition to the bound mass, the lost mass of the cluster that is located in a ring of the cluster dimension around the galaxy at the same distance is calculated, and only the particles with the same velocity signature as the cluster are counted. Adding this mass to the bound mass is shown as the short dashed line in the right panel of Fig. 7.5. This mass estimate seems to fit the data much better. This value is not easy to access and surely has to be replaced by a more elaborate formulation of dynamical friction, i.e. assigning weights to all unbound particles with respect to their position and velocity relative to the cluster. A simple rule of thumb, adding half of the unbound mass to the bound mass (shown as long dashed line in the right panel of Fig. 7.5), fits the data nicely, taking into account that the "actual" ln Λ of an extended object should be smaller than that of a point mass, i.e. the data points have to be regarded as lower limits. Even though this simple estimate has no physical explanation and breaks down during the very final stages of the dissolution of the cluster, it gives an easily accessible estimate of the dynamical friction of an extended object suffering from mass-loss.
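The rule of thumb and the two conversions (7.25) and (7.26) amount to very little arithmetic, sketched below. The function names and the example numbers are hypothetical and serve only to show how the quantities relate.

```python
def friction_mass_rule_of_thumb(m_bound, m_unbound):
    """Mass entering the friction formula: bound mass plus half of the unbound mass."""
    return m_bound + 0.5 * m_unbound


def ln_lambda_from_fit(mcl_ln_lambda, mass):
    """Convert a fitted product M_cl * ln(Lambda) into ln(Lambda) for an assumed mass."""
    return mcl_ln_lambda / mass


# Illustrative snapshot: a cluster that started with mass 1.0 and has lost 40 per cent.
m_initial, m_bound = 1.0, 0.6
fitted_product = 3.2                                            # hypothetical fit result
ln_lambda_initial_mass = ln_lambda_from_fit(fitted_product, m_initial)   # as in (7.25)
ln_lambda_bound_mass = ln_lambda_from_fit(fitted_product, m_bound)       # as in (7.26)
m_effective = friction_mass_rule_of_thumb(m_bound, m_initial - m_bound)
ln_lambda_effective = ln_lambda_from_fit(fitted_product, m_effective)
```

By construction the rule-of-thumb mass lies between the bound mass and the initial mass, so the Coulomb logarithm derived from it lies between the two estimates (7.25) and (7.26).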
References
Binney J., Tremaine S., 1987, Galactic Dynamics. Princeton Univ. Press, Princeton, NJ
Chandrasekhar S., 1943, ApJ, 97, 255
Fellhauer M., Lin D. N. C., 2006, MNRAS, 375, 604
McMillan S. L., Portegies Zwart S. F., 2003, ApJ, 596, 314
Spinnato P. F., Fellhauer M., Portegies Zwart S. F., 2003, MNRAS, 344, 22
8
Initial Conditions for Star Clusters
Pavel Kroupa
Argelander-Institut für Astronomie, Auf dem Hügel 71, D-53121 Bonn, Germany
pavel@astro.uni-bonn.de
8.1 Introduction
Most stars form in dense star clusters deeply embedded in residual gas. The populations of these objects range from small groups of stars with about a dozen binaries within a volume with a typical radius of $r \approx 0.3$ pc, through to objects formed in extreme star bursts containing $N \approx 10^8$ stars within $r \approx 36$ pc. Star clusters or, more generally, dense stellar systems must therefore be seen as the fundamental building blocks of galaxies. The distinction between a star cluster and a spheroidal dwarf galaxy becomes blurred near $N \approx 10^6$. Both are mostly pressure-supported, that is, random stellar motions dominate any bulk streaming motions such as rotation. The physical processes that drive the formation, evolution and dissolution of star clusters have a deep impact on the appearance of galaxies. This impact has many manifestations, ranging from the properties of stellar populations, such as the binary fraction and the number of type Ia and type II supernovae, through the velocity structure in galactic discs, such as the age–velocity dispersion relation, to the existence of stellar halos around galaxies, tidal streams, and the survival and properties of tidal dwarf galaxies, the existence of which challenges current cosmological perspectives. Apart from this cosmological relevance, dense stellar systems provide unique laboratories in which to test stellar evolution theory, gravitational dynamics, the interplay between stellar evolution and dynamical processes, and the physics of stellar birth and stellar feedback processes during formation.
Star clusters and other pressure-supported stellar systems in the sky merely offer snapshots from which we can glean incomplete information. Because there is no analytical solution to the equations of motion for more than two stars, these differential equations need to be integrated numerically. Thus, in order to gain an understanding of these objects in terms of the above issues, a researcher needs to resort to numerical experiments in order to test various hypotheses as to the possible physical initial conditions (to test star-formation theory) or the outcome (to quantify stellar populations in galaxies, for example). Initialising a pressure-supported stellar system such that the initial object is relevant for the real physical Universe is therefore a problem of some fundamental importance.

Kroupa, P.: Initial Conditions for Star Clusters. Lect. Notes Phys. 760, 181–259 (2008)
DOI 10.1007/978-1-4020-8431-7_8 © Springer-Verlag Berlin Heidelberg 2008
Here, empirical constraints on the initial conditions of star clusters are discussed, and some problems to which star clusters are relevant are raised. Section 8.2 contains information to set up a realistic computer model of a star cluster, including models of embedded clusters. The initial mass distribution of stars is discussed in Sect. 8.3, and Sect. 8.4 delves into the initial distribution functions of multiple stars. A brief summary is provided in Sect. 8.5.
8.1.1 Embedded Clusters
In this section, an outline is given of some astrophysical aspects of dense stellar systems in order to help differentiate probable evolutionary effects from initial conditions. A simple example clarifies what is meant. An observer may see two young populations with comparable ages (to within 1 Myr, say). They have similar observed masses, but different sizes, a somewhat different stellar content and different binary fractions. Do they signify two different initial conditions derived from star-formation, or can both be traced back to the same $t = 0$ configuration?
Preliminaries
Assume we observe a very young population of $N$ stars with an age $\tau_{\rm age}$, and that we have a rough estimate of its half-mass radius, $r_{\rm h}$, and embedded stellar mass, $M_{\rm ecl}$.¹ The average mass is
\[ m = \frac{M_{\rm ecl}}{N}. \tag{8.1} \]
Also assume we can estimate the star-formation efficiency (SFE), $\epsilon$, within a few $r_{\rm h}$. For this object
\[ \epsilon = \frac{M_{\rm ecl}}{M_{\rm ecl} + M_{\rm gas}}, \tag{8.2} \]
where $M_{\rm gas}$ is the gas left over from the star-formation process. The tidal radius of the embedded cluster can be estimated from the Jacobi limit (Eq. (7-84) in Binney & Tremaine 1987) as determined by the host galaxy, when any contributions by surrounding molecular clouds are ignored:
\[ r_{\rm tid} = \left( \frac{M_{\rm ecl} + M_{\rm gas}}{3\,M_{\rm gal}} \right)^{1/3} D, \tag{8.3} \]
¹ Throughout, all masses $m$, $M$, etc. are in units of $M_\odot$ unless noted otherwise. "Embedded stellar mass" refers to the mass in stars at the time before residual gas expulsion and when star-formation has ceased.
where $M_{\rm gal}$ is the mass of the spherically distributed galaxy within the distance $D$ of the cluster from the centre of the galaxy. This radius is a rough estimate of that distance from the cluster at which stellar motions begin to be significantly influenced by the host galaxy.
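As an illustration of (8.3), the following minimal sketch evaluates the tidal radius numerically. The function name and the example values for the enclosed galaxy mass and the galactocentric distance are assumptions chosen purely for illustration; they are not taken from the text.

```python
def tidal_radius(m_ecl, m_gas, m_gal, distance):
    """Eq. (8.3): Jacobi-limit estimate of the tidal radius.

    m_ecl, m_gas, m_gal are in Msun, distance (D) is in pc; returns r_tid in pc.
    """
    return ((m_ecl + m_gas) / (3.0 * m_gal)) ** (1.0 / 3.0) * distance


# Hypothetical example: a 1000 Msun embedded cluster with eps = 0.3.
m_ecl = 1000.0
eps = 0.3
m_gas = m_ecl * (1.0 / eps - 1.0)  # rearranged Eq. (8.2)

# Assumed (illustrative) host-galaxy values: 1e11 Msun enclosed within D = 8500 pc.
r_tid = tidal_radius(m_ecl, m_gas, m_gal=1.0e11, distance=8500.0)
print(f"r_tid = {r_tid:.1f} pc")
```

Because of the cube root, the result depends only weakly on the assumed galaxy mass.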
The following quantities allow us to judge the formal dynamical state of the system. The formal crossing time of the stars through the object can be defined as
\[ t_{\rm cr} \equiv \frac{2\,r_{\rm h}}{\sigma}, \tag{8.4} \]
where²
\[ \sigma = \sqrt{\frac{G\,M_{\rm ecl}}{\epsilon\,r_{\rm h}}} \tag{8.5} \]
is, up to a factor of order unity, the three-dimensional velocity dispersion of the stars in the embedded cluster. Note that these equations serve to estimate the possible amount of mixing of the population. If $\tau_{\rm age} < t_{\rm cr}$, the object cannot be mixed and we are seeing it close to its initial state. It takes a few $t_{\rm cr}$ for a dynamical system out of dynamical equilibrium to return to it. This is not to be mistaken for a relaxation process.
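A minimal sketch of (8.4) and (8.5) is given below, using the value of $G$ in pc, $M_\odot$ and Myr units quoted in footnote 2. The function names and the example cluster are assumptions made for illustration only.

```python
import math

G = 0.0045  # gravitational constant in pc^3 Msun^-1 Myr^-2 (footnote 2)


def velocity_dispersion(m_ecl, eps, r_h):
    """Eq. (8.5): velocity dispersion in pc/Myr (m_ecl in Msun, r_h in pc)."""
    return math.sqrt(G * m_ecl / (eps * r_h))


def crossing_time(m_ecl, eps, r_h):
    """Eq. (8.4): formal crossing time t_cr = 2 r_h / sigma, in Myr."""
    return 2.0 * r_h / velocity_dispersion(m_ecl, eps, r_h)


def is_unmixed(age_myr, m_ecl, eps, r_h):
    """True if the population is younger than one crossing time."""
    return age_myr < crossing_time(m_ecl, eps, r_h)


# Hypothetical example: M_ecl = 100 Msun, eps = 0.3, r_h = 0.5 pc.
sigma = velocity_dispersion(100.0, 0.3, 0.5)
t_cr = crossing_time(100.0, 0.3, 0.5)
print(f"sigma = {sigma:.2f} pc/Myr, t_cr = {t_cr:.2f} Myr")
```

An object with these parameters and an age of, say, 0.3 Myr would be flagged as unmixed by `is_unmixed`, i.e. still close to its initial state.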
Once the stars orbit within the object, they exchange orbital energy through weak gravitational encounters and rare strong encounters. The system evolves towards a state of energy equipartition. The energy equipartition time-scale, $t_{\rm ms}$, between massive and average stars (Spitzer 1987, p. 74), which is an estimate of the time massive stars need to sink to the centre of the system through dynamical friction on the lighter stars, is
\[ t_{\rm ms} = \frac{m}{m_{\rm max}}\, t_{\rm relax}. \tag{8.6} \]
Here $m_{\rm max}$ is the massive-star mass, and the characteristic two-body relaxation time (e.g. Eq. (4-9) in Binney & Tremaine 1987) is
\[ t_{\rm relax} = 0.1\, \frac{N}{\ln N}\, t_{\rm cr}. \tag{8.7} \]
This formula refers to a pure $N$-body system without embedded gas. A rough estimate of $t_{\rm relax,emb}$ for an embedded cluster can be found in Eq. (8) of Adams & Myers (2001). Equation (8.7) is a measure of the time a star needs to change its orbit significantly from its initial trajectory. It is often estimated by calculating the time required to change the velocity $v$ of a star by an amount $\Delta v \approx v$.
Thus if, for example, $\tau_{\rm age} > t_{\rm cr}$ and $\tau_{\rm age} < t_{\rm relax}$, the system is probably mixed and close to dynamical equilibrium, but it is not yet relaxed. That is, it has not had sufficient time for the stars to exchange a significant amount of orbital energy. Such a cluster may have erased its sub-structures.
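The comparison of the age with the two time-scales can be sketched as follows. The function names and the three state labels are assumptions introduced for illustration; only (8.7) and the inequalities are taken from the text.

```python
import math


def relaxation_time(n_stars, t_cr):
    """Eq. (8.7): two-body relaxation time for a gas-free N-body system."""
    return 0.1 * n_stars / math.log(n_stars) * t_cr


def dynamical_state(age, t_cr, t_relax):
    """Classify a population by comparing its age with t_cr and t_relax."""
    if age < t_cr:
        return "unmixed"              # seen close to its initial state
    if age < t_relax:
        return "mixed, not relaxed"   # near equilibrium, little energy exchange
    return "relaxed"


# Hypothetical example: N = 1000 stars, t_cr = 0.5 Myr, age = 2 Myr.
t_cr = 0.5
t_relax = relaxation_time(1000, t_cr)
print(f"t_relax = {t_relax:.1f} Myr ->", dynamical_state(2.0, t_cr, t_relax))
```

The labels are only a coarse guide, since (8.7) ignores embedded gas.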
² As an aside, note that $G = 0.0045$ pc$^3\,M_\odot^{-1}$ Myr$^{-2}$ and that 1 km s$^{-1}$ = 1.02 pc Myr$^{-1}$.
Fragmentation and Size
The very early stages of cluster evolution on a scale of a few parsecs are dominated by gravitational fragmentation of a turbulent, magnetised, contracting molecular cloud core (Clarke, Bonnell & Hillenbrand 2000; Mac Low & Klessen 2004; Tilley & Pudritz 2007). Gas-dynamical simulations show the formation of contracting filaments, which fragment into denser cloud cores that form subclusters of accreting protostars. As soon as the protostars radiate or lose mass with sufficient energy and momentum to affect the cloud core, these computations become expensive because radiative transport and deposition of momentum and mechanical energy by non-isotropic outflows are difficult to handle with present computational means (Stamatellos et al. 2007; Dale, Ercolano & Clarke 2007).
Observations of the very early stages, at times less than a few hundred thousand years, suggest that protoclusters have a hierarchical protostellar distribution: a number of subclusters with radii less than 0.2 pc and separated in velocity space are often seen embedded within a region less than a pc across (Testi et al. 2000). Many of these subclusters may merge to form a more massive embedded cluster (Scally & Clarke 2002; Fellhauer & Kroupa 2005). It is unclear, though, if subclusters typically merge before residual gas blow-out or if the residual gas is removed before the sub-clumps can interact significantly, nor is it clear if there is a systematic mass dependence of any such possible behaviour.
Mass Segregation
Whether or not star clusters or subclusters form mass-segregated remains an open issue. Mass segregation at birth is a natural expectation because protostars near the density maximum of the cluster have more material to accrete. For these, the ambient gas is at a higher pressure, allowing protostars to accrete longer before feedback termination stops further substantial gas inflow, and the coagulation of protostars is more likely there (Zinnecker & Yorke 2007; Bonnell, Larson & Zinnecker 2007). Initially mass-segregated subclusters preserve mass segregation upon merging (McMillan, Vesperini & Portegies Zwart 2007). However, for $m/m_{\rm max} = 0.5/100$ and $N \le 5 \times 10^3$ stars, it follows from (8.6) that
\[ t_{\rm ms} \le t_{\rm cr}. \tag{8.8} \]
That is, a $100\,M_\odot$ star sinks to the cluster centre within roughly a crossing time (see Table 8.1 below for typical values of $t_{\rm cr}$).
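The inequality (8.8) can be checked directly by combining (8.6) and (8.7), which gives $t_{\rm ms}/t_{\rm cr} = (m/m_{\rm max})\,0.1\,N/\ln N$. The sketch below does this for a few values of $N$; the function name and the chosen sample of $N$ are assumptions for illustration.

```python
import math


def segregation_to_crossing_ratio(n_stars, m_mean=0.5, m_max=100.0):
    """t_ms / t_cr from Eqs. (8.6) and (8.7); masses in Msun."""
    return (m_mean / m_max) * 0.1 * n_stars / math.log(n_stars)


# Sample values of N up to the limit N = 5e3 quoted in the text.
for n_stars in (100, 500, 1000, 5000):
    ratio = segregation_to_crossing_ratio(n_stars)
    print(f"N = {n_stars:5d}: t_ms / t_cr = {ratio:.3f}")
```

For the quoted mass ratio the ratio stays below unity over this whole range of $N$, in line with (8.8).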
Currently, we cannot say conclusively if mass segregation is a birth phenomenon (e.g. Gouliermis et al. 2004) or whether the more massive stars form anywhere throughout the protocluster volume. Star clusters that have already blown out their gas at ages of one to a few million years are typically mass-segregated (e.g. R136, Orion Nebula Cluster).
Table 8.1 Notes: the Y in the O stars column indicates that the maximum stellar mass in the cluster surpasses $8\,M_\odot$ (Fig. 8.1). The average stellar mass is taken to be $m = 0.4\,M_\odot$ in all clusters. A star-formation efficiency of $\epsilon = 0.3$ is assumed. The crossing time $t_{\rm cr}$ is (8.4). The pre-supernova gas evacuation time-scale is $\tau_{\rm gas} = r/v_{\rm th}$, where $v_{\rm th} = 10$ km s$^{-1}$ is the approximate sound velocity of the ionised gas, and $\tau_{\rm gas} = 0.05$ Myr for $r = 0.5$ pc while $\tau_{\rm gas} = 0.1$ Myr for $r = 1$ pc.

Mecl/M⊙   N           O stars   tcr/Myr         τgas/tcr        tcr/Myr       τgas/tcr
                                (rh = 0.5 pc)   (rh = 0.5 pc)   (rh = 1 pc)   (rh = 1 pc)
40        100         N         0.9             –               2.6           –
100       250         Y/N       0.6             0.08            1.6           0.2
500       1250        Y         0.3             0.2             0.7           0.1
10^3      2.5 × 10^3  Y         0.2             0.25            0.5           0.2
10^4      2.5 × 10^4  Y         0.06            0.8             0.2           0.5
10^5      2.5 × 10^5  Y         0.02            2.5             0.05          2
10^6      2.5 × 10^6  Y         0.006           8.3             0.02          5
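The entries of Table 8.1 can be approximately recomputed from (8.4), (8.5) and the assumptions listed in the table notes. The sketch below is one way of doing so; the function and constant names are assumptions, and the radius entering $\tau_{\rm gas} = r/v_{\rm th}$ is taken to be $r_{\rm h}$. Small differences from the tabulated values are to be expected from rounding.

```python
import math

G = 0.0045             # pc^3 Msun^-1 Myr^-2 (footnote 2)
KMS_TO_PC_MYR = 1.02   # 1 km/s expressed in pc/Myr (footnote 2)
EPS = 0.3              # star-formation efficiency assumed in Table 8.1
M_MEAN = 0.4           # average stellar mass in Msun assumed in Table 8.1
V_TH = 10.0 * KMS_TO_PC_MYR  # sound speed of the ionised gas in pc/Myr


def table_row(m_ecl, r_h):
    """Return (N, t_cr in Myr, tau_gas / t_cr) for one cluster mass and radius."""
    n_stars = m_ecl / M_MEAN
    sigma = math.sqrt(G * m_ecl / (EPS * r_h))  # Eq. (8.5)
    t_cr = 2.0 * r_h / sigma                    # Eq. (8.4)
    tau_gas = r_h / V_TH                        # gas evacuation time-scale
    return n_stars, t_cr, tau_gas / t_cr


for m_ecl in (40.0, 100.0, 500.0, 1e3, 1e4, 1e5, 1e6):
    n_stars, t_small, q_small = table_row(m_ecl, 0.5)
    _, t_large, q_large = table_row(m_ecl, 1.0)
    print(f"{m_ecl:9.0f} {n_stars:10.0f} "
          f"{t_small:7.3f} {q_small:6.2f} {t_large:7.3f} {q_large:6.2f}")
```

The ratio $\tau_{\rm gas}/t_{\rm cr}$ grows as $M_{\rm ecl}^{1/2}$ at fixed radius, which is why only the most massive clusters reach $\tau_{\rm gas} > t_{\rm cr}$.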
To affirm natal mass segregation would impact positively on the notion that massive stars (more than about $10\,M_\odot$) only form in rich clusters, and negatively on the suggestion that they can also form in isolation. For recent work on this topic see Li, Klessen & Mac Low (2003) and Parker & Goodwin (2007).
Feedback Termination
The observationally estimated SFE (8.2) is (Lada & Lada 2003)
\[ 0.2 \le \epsilon \le 0.4, \tag{8.9} \]
which implies that the physics dominating the star-formation process on scales less than a few parsecs is stellar feedback. Within this volume, the pre-cluster cloud core contracts under self-gravity and so forms stars ever more vigorously, until feedback energy suffices to halt the process (feedback termination).
Dynamical State at Feedback Termination
Each protostar needs about $t_{\rm ps} \approx 10^5$ yr to accumulate about 95% of its mass (Wuchterl & Tscharnuter 2003). The protostars form throughout the pre-cluster volume as the protocluster cloud core contracts. The overall pre-cluster cloud-core contraction until feedback termination takes (8.4, 8.5)
\[ t_{\rm cl,form} \approx {\rm few} \times \frac{2}{\sqrt{G}} \left( \frac{M_{\rm ecl}}{\epsilon} \right)^{-1/2} r_{\rm h}^{3/2} \tag{8.10} \]
(a few times the crossing time), which is about the time over which the cluster forms. Once a protostar condenses out of the hydrodynamical flow, it becomes a ballistic particle moving in the time-evolving cluster potential. Because many generations of protostars can form over the cluster-formation time-scale, and if the crossing time through the cluster is a few times shorter than $t_{\rm cl,form}$, the very young cluster is mostly in virial equilibrium when star-formation stops, i.e. when any residual gas has been lost.³ It is noteworthy that for $r_{\rm h} = 1$ pc
\[ t_{\rm ps} \ge t_{\rm cl,form} \quad {\rm for} \quad \frac{M_{\rm ecl}}{\epsilon} \ge 10^{4.9}\,M_\odot \tag{8.11} \]
(the protostar-formation time formally surpasses the cluster-formation time), which is near the turnover mass in the old-star cluster mass function (e.g. Baumgardt 1998).
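The threshold in (8.11) follows from setting $t_{\rm ps} = t_{\rm cl,form}$ in (8.10) and solving for $M_{\rm ecl}/\epsilon$. The sketch below does this under the assumption that the factor "few" in (8.10) is set to unity; the function name and the `few` parameter are introduced here for illustration only.

```python
import math

G = 0.0045  # pc^3 Msun^-1 Myr^-2 (footnote 2)


def threshold_mass_over_eps(t_ps_myr, r_h, few=1.0):
    """M_ecl / eps (in Msun) at which t_ps equals t_cl,form of Eq. (8.10).

    Solves t_ps = few * (2 / sqrt(G)) * (M_ecl/eps)**-0.5 * r_h**1.5.
    """
    return (few * 2.0 / t_ps_myr) ** 2 * r_h ** 3 / G


# t_ps = 1e5 yr = 0.1 Myr and r_h = 1 pc, as in the text; few = 1 is an assumption.
threshold = threshold_mass_over_eps(0.1, 1.0)
print(f"M_ecl/eps = 10^{math.log10(threshold):.2f} Msun")
```

A larger value of `few` raises the threshold quadratically, so the quoted mass should be read as an order-of-magnitude estimate.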
A critical parameter is thus the ratio
\[ \tau = \frac{t_{\rm cl,form}}{t_{\rm cr}}. \tag{8.12} \]
If it is less than unity, protostars condense from the gas and cannot reach virial equilibrium in the potential before the residual gas is removed. Such embedded clusters may be kinematically cold if the pre-cluster cloud core was contracting, or hot if the pre-cluster cloud core was pressure confined, because the young stars do not feel the gas pressure.
In those cases where $\tau > 1$, the embedded cluster is approximately in virial equilibrium because generations of protostars that drop out of the hydrodynamic flow have time to orbit the potential. The pre-gas-expulsion stellar velocity dispersion in the embedded cluster (8.5) may reach $\sigma = 40$ pc Myr$^{-1}$ if $M_{\rm ecl} = 10^{5.5}\,M_\odot$, which is the case for $\epsilon\,r_{\rm h} < 1$ pc. This is easily achieved because the radius of one-Myr-old clusters is $r_{0.5} \approx 0.8$ pc, with no dependence on mass. Some observationally explored cases are discussed by Kroupa (2005). Notably, using K-band number counts, Gutermuth et al. (2005) appear to find evidence for expansion after gas removal.
Interestingly, recent Spitzer results suggest a scaling of the characteristic projected radius $R$ with mass,⁴
\[ M_{\rm ecl} \propto R^2 \tag{8.13} \]
(Allen et al. 2007), so the question of how compact embedded clusters form and whether there is a mass–radius relation needs further clarification. Note, though, that such a scaling is obtained for a stellar population that expands freely with a velocity given by the velocity dispersion in the embedded cluster (8.5),
³ A brief transition time $t_{\rm tr} \ll t_{\rm cl,form}$ exists during which the star-formation rate decreases in the cluster while the gas is being blown out. However, for the purpose of the present discussion this time may be neglected.
⁴ Throughout this text, projected radii are denoted by $R$, while the 3D radius is $r$.
\[ r(t) \approx r_{\rm o} + \sigma\,t \;\Rightarrow\; M_{\rm ecl} = \frac{1}{G} \left( \frac{r(t) - r_{\rm o}}{t} \right)^2, \tag{8.14} \]
where $r_{\rm o} \le 1$ pc is the birth radius of the cluster. Is the observed scaling then a result of expansion from a compact birth configuration after gas expulsion? If so, it would require a more massive system to be dynamically older, which is at least qualitatively in line with the dynamical time-scales decreasing with mass. Note also that the observed scaling (8.13) cannot carry through to $M_{\rm ecl} \ge 10^4\,M_\odot$ because the resulting objects would not resemble clusters.
There are two broad camps, suggesting on the one hand that molecular clouds and star clusters form on a free-fall time-scale (Elmegreen 2000; Hartmann 2003; Elmegreen 2007), and on the other hand that many free-fall times are needed (Krumholz & Tan 2007). The former implies $\tau \approx 1$, while the latter implies $\tau > 1$.
Thus, currently unclear issues concerning the initialisation of $N$-body models of embedded clusters are the ratio $\tau$ and whether a mass–radius relation exists for embedded clusters before the development of HII regions. To make progress, I assume for now that the embedded clusters are in virial equilibrium at feedback termination ($\tau > 1$) and that they form highly concentrated with $r \le 1$ pc, independently of mass.
The Mass of the Most Massive Star
Young clusters show a well-defined correlation between the mass of the most massive star, $m_{\rm max}$, and the stellar mass of the embedded cluster, $M_{\rm ecl}$. This correlation, shown in Fig. 8.1, appears to saturate at $m_{\rm max*} \approx 150\,M_\odot$ (Weidner & Kroupa 2004, 2006). It may indicate feedback termination of star-formation within the protocluster volume, coupled to the most massive stars forming latest, or turning on at the final stage of cluster formation (Elmegreen 1983).
The evidence for a universal upper mass cutoff near
\[ m_{\rm max*} \approx 150\,M_\odot \tag{8.15} \]
(Weidner & Kroupa 2004; Figer 2005; Oey & Clarke 2005; Koen 2006; Maíz Apellániz et al. 2007; Zinnecker & Yorke 2007) seems to be rather well established in populations with metallicities ranging from the LMC ($Z \approx 0.008$) to the super-solar Galactic centre ($Z \ge 0.02$), so that the stellar mass function (MF) simply stops at that mass. This mass needs to be understood theoretically (see discussion by Kroupa & Weidner 2005; Zinnecker & Yorke 2007). It is probably a result of stellar structure stability, but may be near $80\,M_\odot$, as predicted by theory, if the most massive stars reside in near-equal component-mass binary systems (Kroupa & Weidner 2005). It may also be that the calculated stellar masses are significantly overestimated (Martins, Schaerer & Hillier 2005).
Fig. 8.1 The maximum stellar mass, $m_{\rm max}$, as a function of the stellar mass of the embedded cluster, $M_{\rm ecl}$ (Weidner, private communication; an updated version of the data presented by Weidner & Kroupa 2006). The solid triangle is an SPH model of star-cluster formation by Bonnell, Bate & Vine (2003), while the solid curve stems from stating that there is exactly one most massive star in the cluster, $1 = \int_{m_{\rm max}}^{150} \xi(m)\,dm$, with the condition $M_{\rm ecl} = \int_{0.08}^{m_{\rm max}} m\,\xi(m)\,dm$, where $\xi(m)$ is the stellar IMF. The solution can only be obtained numerically, but an easy-to-use, well-fitting function has been derived by Pflamm-Altenburg, Weidner & Kroupa (2007).
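The numerical solution mentioned in the caption of Fig. 8.1 can be sketched as follows. The two conditions are combined into a function $M_{\rm ecl}(m_{\rm max})$, which is then inverted by bisection. The IMF used here is an assumption of this sketch: a two-part power law with $\xi(m) \propto m^{-1.3}$ for $0.08 \le m < 0.5$ and $\xi(m) \propto m^{-2.3}$ for $0.5 \le m \le 150$ (in $M_\odot$). The function names are hypothetical, and this is not the fitting function of Pflamm-Altenburg, Weidner & Kroupa (2007).

```python
import math

M_LOW, M_BREAK, M_UP = 0.08, 0.5, 150.0  # mass limits in Msun

# (lower edge, upper edge, coefficient, exponent) of each IMF segment.
# The coefficient 0.5 of the second segment makes xi(m) continuous at 0.5 Msun.
SEGMENTS = [
    (M_LOW, M_BREAK, 1.0, -1.3),
    (M_BREAK, M_UP, 0.5, -2.3),
]


def _power_integral(a, b, p):
    """Integral of m**p from a to b."""
    if abs(p + 1.0) < 1e-12:
        return math.log(b / a)
    return (b ** (p + 1.0) - a ** (p + 1.0)) / (p + 1.0)


def imf_moment(lo, hi, k):
    """Integral of m**k * xi(m) from lo to hi for the unnormalised IMF."""
    total = 0.0
    for seg_lo, seg_hi, coeff, expo in SEGMENTS:
        a, b = max(lo, seg_lo), min(hi, seg_hi)
        if b > a:
            total += coeff * _power_integral(a, b, expo + k)
    return total


def cluster_mass(m_max):
    """M_ecl implied by m_max: one star in [m_max, 150], mass summed up to m_max."""
    norm = 1.0 / imf_moment(m_max, M_UP, 0)
    return norm * imf_moment(M_LOW, m_max, 1)


def max_stellar_mass(m_ecl, iterations=200):
    """Invert cluster_mass(m_max) = m_ecl by bisection in log m_max."""
    lo, hi = M_LOW * (1.0 + 1e-9), M_UP * (1.0 - 1e-9)
    if not cluster_mass(lo) < m_ecl < cluster_mass(hi):
        raise ValueError("m_ecl outside the range covered by this sketch")
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)
        if cluster_mass(mid) < m_ecl:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


for m_ecl in (1e2, 1e3, 1e4, 1e5):
    print(f"M_ecl = {m_ecl:8.0f} Msun -> m_max = {max_stellar_mass(m_ecl):6.1f} Msun")
```

Bisection is sufficient here because $M_{\rm ecl}(m_{\rm max})$ increases monotonically: raising $m_{\rm max}$ increases the mass integral and decreases the number integral that sets the normalisation.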
The Cluster Core of Massive Stars
Irrespective of whether the massive stars (more than about $10\,M_\odot$) form at the cluster centre or whether they segregate there owing to energy equipartition (8.6), they ultimately form a compact sub-population that is dynamically highly unstable. Massive stars are ejected from such cores very efficiently on a core-crossing time-scale and, for example, the well-studied Orion Nebula Cluster (ONC) has probably already shot out 70% of its stars more massive than $5\,M_\odot$ (Pflamm-Altenburg & Kroupa 2006). The properties of O and B runaway stars have been used by Clarke & Pringle (1992) to deduce the typical birth configuration of massive stars. They find them to form in binaries with similar-mass components, in compact small-$N$ groups devoid of low-mass stars. Among others, the core of the ONC is just such a system.
The Star-Formation History in a Cluster
The detailed star-formation history in a cluster contains information about the events that build up the cluster. Intriguing is the recent evidence for some clusters that, while the bulk of the stars have ages that differ by less than a few $10^5$ yr, a small fraction of older stars are often encountered (Palla & Stahler 2000 for the ONC; Sacco et al. 2007 for the σ Orionis cluster). This may be interpreted to mean that clusters form over about 10 Myr with a final highly accelerated phase, in support of the notion that turbulence of a magnetised gas determines the early cloud-contraction phase (Krumholz & Tan 2007).
A different interpretation would be that, as a pre-cluster cloud core contracts on a free-fall time-scale, it traps surrounding field stars, which then become formal cluster members. Most clusters form in regions of a galaxy that have seen previous star-formation. The velocity dispersion of the previous stellar generation, such as an expanding OB association, is usually rather low, around a few km s$^{-1}$ to 10 km s$^{-1}$. The deepening potential of a newly contracting pre-cluster cloud core is able to capture some of the preceding generation of stars, so that these older stars become formal cluster members although they did not form in the cluster. Pflamm-Altenburg & Kroupa (2007) study this problem for the ONC and show that the age spread reported by Palla et al. (2007) can be accounted for in this way. This suggests that the star-formation history of the ONC may in fact not have started about 10 Myr ago, supporting the argument by Elmegreen (2000), Elmegreen (2007) and Hartmann (2003) that clusters form on a time-scale comparable to the crossing time of the pre-cluster cloud core. Additionally, the sample of cluster stars may be contaminated by enhanced fore- and background densities of field stars, caused by focussing of stellar orbits during cluster formation (Pflamm-Altenburg & Kroupa 2007).
For very massive clusters such as ω Cen, Fellhauer, Kroupa & Evans (2006) show that the potential is sufficiently deep that the pre-cluster cloud core may capture the field stars of a previously existing dwarf galaxy. Up to 30% or more of the stars in ω Cen may be captured field stars. This would explain an age spread of a few Gyr in the cluster, and is consistent with the notion that ω Cen formed in a dwarf galaxy that was captured by the Milky Way. The attractive aspect of this scenario is that ω Cen need not have been located at the centre of the incoming dwarf galaxy as a nucleus, but within its disc, because it opens a larger range of allowed orbital parameters for the putative dwarf galaxy moving about the Milky Way. The currently preferred scenario, in which ω Cen was the nucleus of the dwarf galaxy, implies that the galaxy was completely stripped while falling into the Milky Way, leaving only its nucleus on its current retrograde orbit (Zhao 2004). The new scenario allows the dwarf galaxy to be absorbed into the bulge of the Milky Way, with ω Cen being stripped from it on its way in.
Another possibility for obtaining an age spread of a few Gyr in a massive cluster such as ω Cen is gas accretion from a co-moving inter-stellar medium (Pflamm-Altenburg & Kroupa 2008). This could only have worked for ω Cen before it became unbound from its mother galaxy, though. That is, the cluster must have spent about 2–3 Gyr in its mother galaxy before it was captured by the Milky Way.
This demonstrates beautifully how an improved understanding of dynamical processes on scales of a few pc impinges on problems related to the formation of galaxies and cosmology (through the sub-structure problem). Finally, the increasingly well-documented evidence for stellar populations in massive clusters with different metallicities and ages, and in some cases even significant He enrichment, may also suggest secondary star-formation occurring from material that has been pre-enriched by a previous generation of stars in the cluster. Different IMFs need to be invoked for the populations of different ages (see Piotto 2008 for a review).
Expulsion of Residual Gas
When the most massive stars are O stars, they destroy the protocluster nebula and quench further star-formation by first ionising most of it (feedback termination). The ionised gas, at a temperature near $10^4$ K and in serious over-pressure, pushes out and escapes the confines of the cluster volume at the sound speed (near 10 km s$^{-1}$), or faster if the winds blowing off the O stars with velocities of thousands of km s$^{-1}$ impart sufficient momentum.
There are two analytically tractable regimes of behaviour: instantaneous gas removal, and slow gas expulsion over many crossing times.
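• First consider instantaneous gas expulsion, $\tau_{\rm gas} = 0$. The binding energy of the object of mass $M$ and radius $r$ is
\[ E_{\rm cl,bind} = -\frac{G\,M^2}{r} + \frac{1}{2}\,M\,\sigma^2 < 0. \tag{8.16} \]
Before gas expulsion, $M_{\rm init} = M_{\rm gas} + M_{\rm ecl} \rightarrow M$ and
\[ \sigma_{\rm init}^2 = \frac{G\,M_{\rm init}}{r_{\rm init}} \longrightarrow \sigma^2. \tag{8.17} \]
After instantaneous gas expulsion, $M_{\rm after} = M_{\rm ecl} \rightarrow M$ but $\sigma_{\rm after} = \sigma_{\rm init} \rightarrow \sigma$, and the new binding energy is
\[ E_{\rm cl,bind,after} = -\frac{G\,M_{\rm after}^2}{r_{\rm init}} + \frac{1}{2}\,M_{\rm after}\,\sigma_{\rm init}^2. \tag{8.18} \]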
But the cluster relaxes into a new equilibrium, so that, by the scalar virial theorem,⁵
⁵ The scalar virial theorem states that $2K + W = 0 \Rightarrow E = K + W = (1/2)\,W$, where $K$ and $W$ are the kinetic and potential energy, and $E$ is the total energy of the system.
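\[ E_{\rm cl,bind,after} = -\frac{1}{2}\,\frac{G\,M_{\rm after}^2}{r_{\rm after}}, \tag{8.19} \]
and on equating these two expressions for the final energy and using (8.17), we find that
\[ \frac{r_{\rm after}}{r_{\rm init}} = \frac{M_{\rm ecl}}{M_{\rm ecl} - M_{\rm gas}}. \tag{8.20} \]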
Thus, as $M_{\rm gas} \rightarrow M_{\rm ecl}$, i.e. as $\epsilon \rightarrow 0.5$ from above, $r_{\rm after} \rightarrow \infty$. This means that as the SFE approaches 50% from above, the cluster unbinds itself. But by (8.9), this result would imply either (see Kroupa, Aarseth & Hurley 2001 and references therein)
  – all clusters with OB stars (and thus $\tau_{\rm gas} \ll t_{\rm cr}$) do not survive gas expulsion, or
  – the clusters expel their gas slowly, $\tau_{\rm gas} \gg t_{\rm cr}$. This may be the case if surviving clusters such as the Pleiades or Hyades formed without OB stars.
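• Now consider slow gas removal, $\tau_{\rm gas} \gg t_{\rm cr}$, $\tau_{\rm gas} \rightarrow \infty$. By (8.20) and the assumption that an infinitesimal mass of gas is removed instantaneously,
\[ \frac{r_{\rm init} - \delta r}{r_{\rm init}} = \frac{M_{\rm init} - \delta M_{\rm gas}}{M_{\rm init} - \delta M_{\rm gas} - \delta M_{\rm gas}}. \tag{8.21} \]
For infinitesimal steps, and for convenience $dM < 0$ but $dr > 0$,
\[ \frac{r - dr}{r} = \frac{M + dM}{M + 2\,dM}. \tag{8.22} \]
Re-arranging this, we find
\[ \frac{dr}{r} = \frac{dM}{M} \left( 1 - \frac{2\,dM}{M} \right), \tag{8.23} \]
so that
\[ \frac{dr}{r} = \frac{dM}{M} \;\Rightarrow\; \ln\frac{r_{\rm after}}{r_{\rm init}} = \ln\frac{M_{\rm init}}{M_{\rm after}} \tag{8.24} \]
upon integration of the differential equation. Thus
\[ \frac{r_{\rm after}}{r_{\rm init}} = \frac{M_{\rm ecl} + M_{\rm gas}}{M_{\rm ecl}} = \frac{1}{\epsilon}, \tag{8.25} \]
and, for example, for a SFE of 20% the cluster expands by a factor of 5, $r_{\rm after} = 5\,r_{\rm init}$, without dissolving.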
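The two limiting cases can be compared numerically. Writing $M_{\rm ecl} = \epsilon\,M_{\rm init}$ and $M_{\rm gas} = (1 - \epsilon)\,M_{\rm init}$, (8.20) becomes $r_{\rm after}/r_{\rm init} = \epsilon/(2\epsilon - 1)$, while (8.25) gives $1/\epsilon$. The sketch below evaluates both; the function names are assumptions for illustration, and the instantaneous case returns infinity for $\epsilon \le 0.5$, where the analytical estimate leaves the cluster unbound.

```python
import math


def expansion_instantaneous(eps):
    """Eq. (8.20) written in terms of the SFE: r_after/r_init = eps / (2 eps - 1)."""
    if eps <= 0.5:
        return math.inf  # formally unbound according to the analytical estimate
    return eps / (2.0 * eps - 1.0)


def expansion_adiabatic(eps):
    """Eq. (8.25): r_after/r_init = 1 / eps for slow gas removal."""
    return 1.0 / eps


for eps in (0.2, 0.3, 0.4, 0.6, 0.8):
    print(f"eps = {eps:.1f}: instantaneous x{expansion_instantaneous(eps):.2f}, "
          f"adiabatic x{expansion_adiabatic(eps):.2f}")
```

For the observed range (8.9), the instantaneous estimate always yields an unbound cluster, whereas slow removal only inflates it by a factor of 2.5 to 5.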
Table 8.1 gives an overview of the type of behaviour one might expect for clusters with increasing number of stars, $N$, and stellar mass, $M_{\rm ecl}$, for two characteristic radii of the embedded stellar distribution, $r_{\rm h}$. It can be seen that the gas-evacuation time-scale becomes longer than the crossing time through the cluster for $M_{\rm ecl} \ge 10^5\,M_\odot$. Such clusters would thus undergo adiabatic expansion as a result of gas blow-out. Less-massive clusters are more likely to undergo an evolution that is highly dynamic and that can be described as an explosion (the cluster pops). For clusters without O and massive B stars, nebula disruption probably occurs on the cluster-formation time-scale of about a million years, and the evolution is again adiabatic. A simple calculation of the amount of energy deposited by an O star into its surrounding cluster-nebula suggests it is larger than the nebula binding energy (Kroupa 2005). This, however, gives at best a rough estimate of the rapidity with which gas can be expelled. An inhomogeneous distribution of gas leads to the gas being removed preferentially along channels and asymmetrically, so that the overall gas-excavation process is highly non-uniform and variable (Dale et al. 2005).
The reaction of clusters to gas expulsion is best studied numerically with $N$-body codes. Pioneering experiments were performed by Tutukov (1978) and then Lada, Margulis & Dearborn (1984). Goodwin (1997a,b, 1998) studied gas expulsion by supernovae from young globular clusters. Figure 8.2 shows the evolution of an ONC-type initial cluster with a stellar mass $M_{\rm ecl} \approx 4000\,M_\odot$, a canonical IMF (8.124) and stellar evolution, a 100% initial binary population (Sect. 8.4.2), in a solar-neighbourhood tidal field, with $\epsilon = 1/3$ and spherical gas blow-out on a thermal time-scale ($v_{\rm th} = 10$ km s$^{-1}$). The figure demonstrates that the evolution is far more complex than the simple analytical estimates above suggest, and in fact a substantial Pleiades-type cluster emerges after losing about two-thirds of its initial stellar population (see also p. 195). Subsequent theoretical work, based on an iterative scheme according to which the mass of unbound stars at each radius is removed successively, shows that the survival of a cluster depends not only on $\epsilon$, $\tau_{\rm gas}/t_{\rm cr}$ and $r_{\rm tid}$, but also on the detailed shape of the stellar distribution function (Boily & Kroupa 2003). For instantaneous gas removal, $\epsilon \approx 0.3$ is a lower limit for the SFE below which clusters cannot survive rapid gas blow-out. This is significantly smaller than the critical value of $\epsilon = 0.5$ below which the stellar system becomes formally unbound (8.20). However, if clusters form as complexes of subclusters, each of which pops in this way, then overall cluster survival is enhanced to even smaller values of $\epsilon \approx 0.2$ (Fellhauer & Kroupa 2005).

Fig. 8.2 The evolution of the 5, 10, 20, 50% Lagrangian radii and the core radius ($R_{\rm c} = r_{\rm c}$, thick lower curve) of the ONC-type cluster discussed in the text. The gas mass is shown as the dashed line. The cluster spends 0.6 Myr in an embedded phase before the gas is blown out on a thermal time-scale. The tidal radius (8.3) is shown by the upper thick solid curve (Kroupa, Aarseth & Hurley 2001).
Whether clusters pop, and what fraction of stars remain in a post-gas-expulsion cluster, depend critically on the ratio between the gas-removal time-scale and the cluster crossing time. This ratio thus mostly defines which clusters succumb to infant mortality and which clusters merely suffer cluster infant weight loss. The well-studied observational cases do indicate that the removal of most of the residual gas does occur within a cluster-dynamical time, $\tau_{\rm gas}/t_{\rm cr} \le 1$. Examples noted (Kroupa 2005) are the ONC and R136 in the LMC, both of which have significant super-virial velocity dispersions. Other examples are the Treasure-Chest cluster and the very young star-bursting clusters in the massively interacting Antennae galaxy, which appear to have HII regions expanding at velocities such that the cluster volume may be evacuated within a cluster dynamical time. However, improved empirical constraints are needed to develop further an understanding of cluster survival. Such observations would best be the velocities of stars in very young star clusters, as they should show a radially expanding stellar population.
Indeed, Bastian & Goodwin (2006) note that many young clusters have the radial-density profile signature expected if they are expanding rapidly. This supports the notion of fast gas blow-out. For example, the 0.5–2 Myr old ONC, which is known to be super-virial with a virial mass about twice the observed mass (Hillenbrand & Hartmann 1998), has already expelled its residual gas and is expanding rapidly. It has therefore probably lost its outer stars (Kroupa, Aarseth & Hurley 2001). The super-virial state of young clusters makes measurements of their mass-to-light ratio a bad estimate of the stellar mass within them (Goodwin & Bastian 2006), and rapid dynamical mass segregation likewise makes naive measurements of the $M/L$ ratio wrong (Boily et al. 2005; Fleck et al. 2006). Goodwin & Bastian (2006) and de Grijs & Parmentier (2007) find the dynamical mass-to-light ratios of young clusters to be too large, strongly implying they are in the process of expanding after gas expulsion.
Weidner et al. (2007) attempted to measure infant weight loss with a sample of young but exposed Galactic clusters. They applied the maximal-star-mass to cluster-mass relation from above to estimate the birth mass of the clusters. The uncertainties are large, but the data firmly suggest that the typical cluster loses at least about 50% of its stars.
Binary Stars
Most stars form as binaries with, as far as can be stated today, universal orbital distribution functions (Sect. 8.4). Once a binary system is born in a dense environment, it is perturbed. This changes its eccentricity and semi-major axis. Or it undergoes a relatively strong encounter that disrupts the binary or hardens it, perhaps with exchanged companions. The initial binary population therefore evolves on a cluster crossing time-scale, and most soft binaries are disrupted. It has been shown that the properties of the Galactic field binary population can be explained in terms of the binary properties observed for very young populations if these go through a dense cluster environment (dynamical population synthesis, Kroupa 1995d). A dense cluster environment hardens existing binaries (p. 240). This increases the SN Ia rate in a galaxy with many dense clusters (Shara & Hurley 2002).
Binaries are significant energy sources (see also Sect. 8.4). A hard binary that interacts via a resonance with a cluster field star occasionally ejects one star with a terminal velocity $v_{\rm ej} \gg \sigma$. The ejected star either leaves the cluster, causing cluster expansion so that $\sigma$ drops, or it shares some of its kinetic energy with the other cluster field stars through gravitational encounters, causing cluster expansion. Binaries in a cluster core can thus halt and reverse core collapse (Meylan & Heggie 1997; Heggie & Hut 2003).
Mass Loss from Evolving Stars
An old globular cluster with a turn-off mass near $0.8\,M_\odot$ has lost 30% of the mass that remained in it after gas expulsion through stellar evolution (Baumgardt & Makino 2003). Because the mass loss is most rapid during the earliest times after the cluster returned to virial equilibrium once the gas was expelled, the cluster expands further during this time. This is nicely seen in the Lagrangian radii of realistic cluster-formation models (Kroupa, Aarseth & Hurley 2001).
8.1.2 Some Implications for the Astrophysics of Galaxies
In general, the above have a multitude of implications for galactic and stellar astrophysics:
1. The heaviest-star–star-cluster-mass correlation constrains feedback models of star-cluster formation (Elmegreen 1983). It also implies that the sum of all IMFs in all young clusters in a galaxy, the integrated galaxy initial mass function (IGIMF), is steeper than the invariant stellar IMF observed in star clusters. This has important effects on the mass–metallicity relation of galaxies (Koeppen, Weidner & Kroupa 2007). Additionally, star-formation rates (SFRs) of dwarf galaxies can be underestimated by up to three orders of magnitude because Hα-dark star formation becomes possible (Pflamm-Altenburg, Weidner & Kroupa 2007). This indeed constitutes an important example of how sub-pc processes influence the physics on cosmological scales.
2. The deduction that type-II clusters probably pop (p. 190) implies that young clusters will appear to an observer to be super-virial, i.e. to have a dynamical mass larger than their luminous mass (Bastian & Goodwin 2006; de Grijs & Parmentier 2007).
3. It further implies that galactic fields can be heated, and may also lead to galactic thick discs and stellar halos around dwarf galaxies (Kroupa 2002b).
4. The variation of the gas-expulsion time-scale among clusters of different type implies that the star-cluster mass function (CMF) is re-shaped rapidly, on a time-scale of a few tens of Myr (Kroupa & Boily 2002).
5. Associated with this re-shaping of the CMF is the natural production of population II stellar halos during cosmologically early star-formation bursts (Kroupa & Boily 2002; Parmentier & Gilmore 2007; Baumgardt, Kroupa & Parmentier 2008).
6. The properties of the binary-star population observed in Galactic fields are shaped by dynamical encounters in star clusters before the stars leave their cluster (Sect. 8.4).
Points 2–5 are considered in more detail in the rest of Sect. 8.1.
Stellar Associations Open Clusters and Moving Groups
As one of the important implications of point 2, a cluster in the age range 1–50 Myr has an unphysical M/L ratio because it is out of dynamical equilibrium rather than because it has an abnormal stellar IMF (Bastian & Goodwin 2006; de Grijs & Parmentier 2007).
Another implication is that a Pleiades-like open cluster would have been born in a very dense ONC-type configuration and that, as it evolves, a moving-group-I is established during the first few dozen Myr. This comprises roughly two-thirds of the initial stellar population, and the cluster is expanding with a velocity dispersion that is a function of the pre-gas-expulsion configuration (Kroupa, Aarseth & Hurley 2001). These computations were among the first to demonstrate, with high-precision N-body modelling, that the redistribution of energy within the cluster during the embedded phase and during the expansion phase leads to the formation of a substantial remnant cluster despite the inclusion of all physical effects that are disadvantageous for this to happen (explosive gas expulsion, low SFE $\epsilon = 0.33$, galactic tidal field, mass loss from stellar evolution and an initial binary-star fraction of 100%; see Fig. 8.2). Thus expanding OB associations may be related to star-cluster birth, and many OB associations ought to have remnant star clusters as nuclei (see also Clark et al. 2005).
As the cluster expands, becoming part of an OB association, the radiation from its massive stars produces expanding HII regions that may trigger further star formation in the vicinity (e.g. Gouliermis, Quanz & Henning 2007).
A moving-group-II establishes later: the classical moving group made up of stars that slowly diffuse or evaporate out of the readjusted cluster remnant with relative kinetic energy close to zero. The velocity dispersion of moving-group-I is thus comparable to the pre-gas-expulsion velocity dispersion of the cluster, while moving-group-II has a velocity dispersion close to zero.
The Velocity Dispersion of Galactic-Field Populations and Galactic Thick Discs
Thus, the moving-group-I would be populated by stars that carry the initial kinematic state of the birth configuration into the field of a galaxy. Each generation of star clusters would, according to this picture, produce overlapping moving-groups-I (and II), and the overall velocity dispersion of the new field population can be estimated by adding the squared velocities for all expanding populations. This involves an integral over the embedded-cluster mass function, $\xi_{\rm ecl}(M_{\rm ecl})$, which describes the distribution of the stellar mass content of clusters when they are born. Because the embedded-cluster mass function is known to be a power law, this integral can be calculated for a first estimate (Kroupa 2002b, 2005). The result is that, for reasonable upper cluster mass limits in the integral, $M_{\rm ecl} \le 10^5\,M_\odot$, the observed age–velocity dispersion relation of Galactic field stars can be reproduced.
This idea can thus explain the much debated energy deficit, namely that the observed kinematic heating of field stars with age could not, until now, be explained by the diffusion of orbits in the Galactic disc as a result of scattering by molecular clouds, spiral arms and the bar (Jenkins 1992). Because the velocity dispersion of Galactic-field stars increases with stellar age, this notion can also be used to map the star-formation history of the Milky Way disc by resorting to the observed correlation between the star-formation rate in a galaxy and the maximum star-cluster mass born in the population of young clusters (Weidner, Kroupa & Larsen 2004).
An interesting possibility emerges concerning the origin of thick discs. If the star-formation rate was sufficiently high about 11 Gyr ago, star clusters in the disc with masses up to $10^{5.5}\,M_\odot$ would have been born. If they popped, a thick disc with a velocity dispersion near 40 km s$^{-1}$ would result naturally (Kroupa 2002b). This notion for the origin of thick discs appears to be qualitatively supported by the observations of Elmegreen, Elmegreen & Sheets (2004), who find galactic discs at a redshift between 0.5 and 2 to show massive star-forming clumps.
Structuring the Initial Cluster Mass Function
Another potentially important implication from this picture of the evolution of young clusters is that, if the ratio of the gas-expulsion time to the crossing time or the SFE varies with initial (embedded) cluster mass, an initially featureless power-law mass function of embedded clusters rapidly evolves to one with peaks, dips and turnovers at cluster masses that characterise changes in the broad physics involved.
As an example, Adams (2000) and Kroupa & Boily (2002) assumed that the function
$$M_{\rm icl} = f_{\rm st}(M_{\rm ecl})\,M_{\rm ecl} \tag{8.26}$$
exists, where $M_{\rm ecl}$ is as above, $M_{\rm icl}$ is the classical initial cluster mass and $f_{\rm st} = f_{\rm st}(M_{\rm ecl})$. According to Kroupa & Boily (2002), the classical initial cluster mass is that mass which is inferred by standard N-body computations without gas expulsion (in effect this assumes $\epsilon = 1$, which is, however, unphysical). Thus, for example, for the Pleiades $M_{\rm cl} \approx 1000\,M_\odot$ at the present time (age about 100 Myr). A classical initial model would place the initial cluster mass near $M_{\rm icl} \approx 1500\,M_\odot$ by standard N-body calculations that quantify the secular evaporation of stars from an initially bound and relaxed cluster (Portegies Zwart et al. 2001). If, however, the SFE was 33% and the gas-expulsion time-scale was comparable to or shorter than the cluster dynamical time, the Pleiades would have been born in a compact configuration resembling the ONC and with a mass of embedded stars of $M_{\rm ecl} \approx 4000\,M_\odot$ (Kroupa, Aarseth & Hurley 2001). Thus $f_{\rm st}(4000\,M_\odot) = 0.38$ (= 1500/4000).
By postulating that there exist three basic types of embedded clusters (Kroupa & Boily 2002), namely
Type I: clusters without O stars ($M_{\rm ecl} \le 10^{2.5}\,M_\odot$, e.g. Taurus-Auriga pre-main sequence stellar groups, ρ Oph),
Type II: clusters with a few O stars ($10^{2.5} \le M_{\rm ecl}/M_\odot \le 10^{5.5}$, e.g. the ONC),
Type III: clusters with many O stars and with a velocity dispersion comparable to or higher than the sound velocity of ionized gas ($M_{\rm ecl} \ge 10^{5.5}\,M_\odot$),
it can be argued that $f_{\rm st} \approx 0.5$ for type I, $f_{\rm st} < 0.5$ for type II and $f_{\rm st} \approx 0.5$ for type III. The reason for the high $f_{\rm st}$ values for types I and III is that gas expulsion from these clusters may last longer than the cluster dynamical time, because there is insufficient ionizing radiation (type I clusters) or because the potential well is too deep for the ionized gas to leave (type III clusters). The evolution is therefore adiabatic (see (8.25) above). Type II clusters undergo a disruptive evolution and witness a high infant mortality rate (Lada & Lada 2003). They are the precursors of OB associations and Galactic clusters. This broad categorisation has easy-to-understand implications for the star-cluster mass function.
Under these conditions and an assumed functional form for $f_{\rm st} = f_{\rm st}(M_{\rm ecl})$, the power-law embedded-cluster mass function transforms into a cluster mass function with a turnover near $10^5\,M_\odot$ and a sharp peak near $10^3\,M_\odot$ (Kroupa & Boily 2002). This form is strongly reminiscent of the initial globular cluster mass function, which is inferred by, for example, Vesperini (1998, 2001), Parmentier & Gilmore (2005) and Baumgardt (1998) to be required for a match with the evolved cluster mass function that is seen to have a universal turnover near $10^5\,M_\odot$. By the reasoning given above, this "initial" CMF is, however, unphysical, the physical initial mass function being a power law instead.
This analytical formulation of the problem has been verified nicely with N-body simulations combined with a realistic treatment of residual gas expulsion by Baumgardt, Kroupa & Parmentier (2008), who show the Milky Way globular cluster mass function to emerge from a power-law embedded-cluster mass function. Parmentier et al. (2008) expand on this by studying the effect that different assumptions on the physics of gas removal have on shaping the star-cluster mass function within about 50 Myr. The general ansatz that residual gas expulsion plays a dominant role in early cluster evolution may thus solve the long-standing problem that the deduced initial cluster mass function needs to have this turnover, while the observed mass functions of young clusters are featureless power-law distributions.
The Origin of Population II Stellar Halos
The above view implies naturally that a major field-star component is generated whenever a population of star clusters forms. About 12 Gyr ago, the Milky Way began its assembly by an initial burst of star formation throughout a volume spanning about 10 kpc in radius. In this volume, the star-formation rate must have reached $10\,M_\odot\,{\rm yr}^{-1}$ so that star clusters with masses up to $\approx 10^6\,M_\odot$ formed (Weidner, Kroupa & Larsen 2004), probably in a chaotic, turbulent early interstellar medium. The vast majority of embedded clusters suffered infant weight loss or mortality. The surviving long-lived clusters evolved to globular clusters. The so-generated field population is the spheroidal population-II halo, which has the same chemical properties as the surviving (globular) star clusters, apart from enrichment effects evident in the most massive clusters. All of these characteristics emerge naturally in the above model, as pointed out by Kroupa & Boily (2002), Parmentier & Gilmore (2007) and, most recently, by Baumgardt, Kroupa & Parmentier (2008).
8.1.3 Long-Term or Classical Cluster Evolution
The long-term evolution of star clusters that survive infant weight loss and the mass loss from evolving stars is characterised by three physical processes: the drive of the self-gravitating system towards energy equipartition, stellar-evolution processes, and the heating or forcing of the system through external tides. One emphasis of star-cluster work in this context is to test the theory of stellar evolution and to investigate the interrelation of stellar astrophysics with stellar dynamics. The stellar-evolution and the dynamical-evolution time-scales are comparable. The reader is directed to Meylan & Heggie (1997) and Heggie & Hut (2003) for further details.
Tidal Tails
Tidal tails contain the stars evaporating from long-lived star clusters (the moving-group-II above). The typical S-shaped structure of tidal tails close to the cluster is easily understood: stars that leave the cluster with a slightly higher galactic velocity than the cluster are on slightly outward-directed galactic orbits and therefore fall behind the cluster, as the angular velocity about the galactic centre decreases with distance. The outward-directed trailing arm develops. Stars that leave the cluster with slower galactic velocities than the cluster fall towards the galaxy and overtake the cluster.
Given that energy equipartition leads to a filtering in energy space of the stars that escape at a particular time, one expects a gradient in the stellar mass function progressing along a tidal tail towards the cluster, so that the mass function becomes flatter, i.e. richer in more massive stars. This effect is difficult to detect but, for example, the long tidal tails found emanating from Pal 5 (Odenkirchen et al. 2003) may show evidence for it.
As emphasised by Odenkirchen et al. (2003), tidal tails have another very interesting use: they probe the gravitational potential of the Milky Way if the differential motions along the tidal tail can be measured. They are thus important future tests of gravitational physics.
Death and Hierarchical Multiple Stellar Systems
Nothing lasts forever, and star clusters that survive initial relaxation to virial equilibrium after residual gas expulsion and mass loss from stellar evolution ultimately cease to exist after all member stars evaporate to leave a binary or a long-lived hierarchical multiple system composed of near-equal-mass components (de la Fuente Marcos 1997, 1998). Note that these need not be single stars. These cluster remnants are interesting because they may account for most of the hierarchical multiple stellar systems in the Galactic field (Goodwin & Kroupa 2005), with the implication that these are not a product of star formation but rather of star-cluster dynamics.
8.1.4 What is a Galaxy?
Star clusters, dwarf-spheroidal (dSph) and dwarf-elliptical (dE) galaxies, as well as galactic bulges and giant elliptical (E) galaxies, are all stellar-dynamical systems that are supported by random stellar motions, i.e. they are pressure-supported. But why is one class of these pressure-supported systems referred to as star clusters, while the others are galaxies? Is there some fundamental physical difference between these two classes of systems?
Considering the radius as a function of mass, we notice that systems with $M \le 10^6\,M_\odot$ do not show a mass–radius relation (MRR) and have $r \approx 4$ pc. More massive objects, however, show a well-defined MRR. In fact, Dabringhausen, Hilker & Kroupa (2008) find that massive compact objects (MCOs), which have $10^6 \le M/M_\odot \le 10^8$, lie on the MRR of giant E galaxies (about $10^{13}\,M_\odot$) down to normal E galaxies ($10^{11}\,M_\odot$), as is evident in Fig. 8.3:
$$\frac{R}{\rm pc} = 10^{-3.15} \left(\frac{M}{M_\odot}\right)^{0.60 \pm 0.02}. \tag{8.27}$$
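As a quick numerical illustration of (8.27), the following minimal sketch evaluates the relation for a few masses. The function name `mrr_radius_pc` and its keyword arguments are hypothetical, only the central value of the slope is used (the quoted uncertainty of 0.02 is ignored), and the relation is only meant to be applied for $M \ge 10^6\,M_\odot$, where the text states that an MRR exists.

```python
def mrr_radius_pc(mass_msun, log10_norm=-3.15, slope=0.60):
    """Radius in pc from the mass-radius relation (8.27).

    mass_msun is the mass in solar masses. The defaults are the central
    values quoted in (8.27); the +/-0.02 uncertainty on the slope is ignored.
    """
    return 10.0 ** log10_norm * mass_msun ** slope


if __name__ == "__main__":
    # Masses spanning MCOs up to giant ellipticals, as quoted in the text.
    for mass in (1.0e6, 1.0e8, 1.0e11, 1.0e13):
        print(f"M = {mass:.0e} Msun -> R = {mrr_radius_pc(mass):.3g} pc")
```

Because the slope is below unity, the mean density implied by this relation decreases with increasing mass.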
Noteworthy is that systems with $M \ge 10^6\,M_\odot$ also exhibit complex stellar populations, while less massive systems have single-age, single-metallicity populations. Remarkably, Pflamm-Altenburg & Kroupa (2008) show that a stellar system with $M \ge 10^6\,M_\odot$ and a radius as observed for globular clusters can accrete gas from a co-moving warm inter-stellar medium and may re-start star formation. The median two-body relaxation time is longer than a Hubble time for $M \ge 3 \times 10^6\,M_\odot$, and only for these systems is there evidence for a slight increase in the dynamical mass-to-light ratio. Intriguingly, $(M/L)_V \approx 2$ for $M < 10^6\,M_\odot$, while $(M/L)_V \approx 5$ for $M > 10^6\,M_\odot$, with a possible decrease for $M > 10^8\,M_\odot$ (Fig. 8.4). Finally, the average stellar density maximises at $M = 10^6\,M_\odot$ with about $3 \times 10^3\,M_\odot\,{\rm pc}^{-3}$ (Dabringhausen, Hilker & Kroupa 2008).
Fig. 8.3 Mass–radius data plotted against the dynamical mass of pressure-supported stellar systems (Dabringhausen, Hilker & Kroupa 2008). MCOs are massive compact objects (also referred to as ultra compact dwarf galaxies). The solid and dashed lines refer to (8.27), while the dash-dotted line is a fit to dSph and dE galaxies.
Fig. 8.4 Dynamical M/L values in dependence of the V-band luminosity of pressure-supported stellar systems (Dabringhausen, Hilker & Kroupa 2008). MCOs are massive compact objects (also referred to as ultra compact dwarf galaxies).
Thus
• the mass $10^6\,M_\odot$ appears to be special,
• stellar populations become complex above this mass,
• evidence for some dark matter only appears in systems that have a median two-body relaxation time longer than a Hubble time,
• dSph galaxies are the only stellar-dynamical systems with $10 < (M/L)_V < 1000$ and as such are total outliers, and
• $10^6\,M_\odot$ is a lower accretion limit for massive star clusters immersed in a warm inter-stellar medium.
$M \approx 10^6\,M_\odot$ therefore appears to be a critical mass scale, so that less massive objects show characteristics of star clusters that are described well by Newtonian dynamics, while more massive objects show behaviour more typical of galaxies. Defining a galaxy as a stellar-dynamical object which has a median two-body relaxation time longer than a Hubble time, i.e. essentially a system with a smooth potential, may be an objective and useful way to define a galaxy (Kroupa 1998). Why only smooth systems show evidence for dark matter remains at best a striking coincidence; at worst it may be symptomatic of a problem in understanding dynamics in such systems.
8.2 Initial 6D Conditions
The previous section gave an outline of some of the issues at stake in the realm of pressure-supported stellar systems. In order to attack these and other problems, we need to know how to set up such systems in the computer. Indeed, as much as analytical solutions may be preferred, the mathematical and physical complexities of dense stellar systems leave no alternative other than to resort to full-scale numerical integration of the 6N coupled first-order differential equations that describe the motion of the system through 6N-dimensional phase space. There are three related questions to ponder. Given a well-developed cluster, how is one to set it up in order to evolve it forward in time? How does a cluster form, and how does the formation process affect its later properties? How do we describe a realistic stellar population (IMF, binaries)? Each of these questions is dealt with in the following sections.
8.2.1 6D Structure of Classical Clusters
Because the state of a star cluster is never known exactly, it is necessary to perform numerical experiments with conditions that are statistically consistent with the cluster snapshot. To ensure meaningful statistical results for systems with few stars, say $N < 5000$, many numerical renditions of the same object are thus necessary. For example, systems with $N = 100$ stars evolve erratically, and numerical experiments are required to map out the range of possible states at a particular time: the range of half-mass radii at an age of 20 Myr in 1000 numerical experiments of a cluster initially with $N = 100$ stars and with an initial half-mass radius $r_{0.5} = 0.5$ pc can be compared with an actually observed object for testing consistency with the initial conditions. Excellent recent examples of this approach can be found in Hurley et al. (2005) and Portegies Zwart, McMillan & Makino (2007), with a recent review available by Hut et al. (2007), and two textbooks have been written dealing with computational and more general aspects of the physics of dense stellar systems (Aarseth 2003; Heggie & Hut 2003).
The six-dimensional structure of a pressure-supported stellar system at time $t$ is conveniently described by the phase-space distribution function $f(\mathbf{r}, \mathbf{v}, t)$, where $\mathbf{r}$ and $\mathbf{v}$ are the phase-space variables and
$$dN = f(\mathbf{r}, \mathbf{v}, t)\, d^3x\, d^3v \tag{8.28}$$
is the number of stars in the 6D phase-space volume element $d^3x\, d^3v$. In the case of a steady state, the Jeans theorem (Binney & Tremaine 1987, their Sect. 4.4) allows us to express $f$ in terms of the integrals of motion, i.e. the energy and angular momentum. The phase-space distribution function can then be written as
$$f = f(\mathbf{r}, \mathbf{v}) = f(\epsilon_e, l), \tag{8.29}$$
where
$$\epsilon_e = \frac{1}{2} v^2 + \Phi(\mathbf{r}) \tag{8.30}$$
is the specific energy of a star and
$$l = |\mathbf{r} \times \mathbf{v}| \tag{8.31}$$
is the specific orbital angular momentum of a star. The Poisson equation is
$$\nabla^2 \Phi(\mathbf{r}) = 4\pi G\, \rho_m(\mathbf{r}) = 4\pi G \int_{\rm all\ space} m f\, d^3v, \tag{8.32}$$
or, in spherical symmetry,
$$\frac{1}{r^2} \frac{d}{dr}\left(r^2 \frac{d\Phi}{dr}\right) = 4\pi G \int_{\rm all\ space} f_m\!\left(\frac{1}{2} v^2 + \Phi,\; |\mathbf{r} \times \mathbf{v}|\right) d^3v, \tag{8.33}$$
where $f_m$ is the phase-space mass density of all matter and is equal to $m f$ for a system with equal-mass stars. Most pressure-supported systems have a near-spherical shape, and so in most numerical work it is convenient to assume spherical symmetry.
For convenience it is useful to introduce the relative potential⁶
$$\Psi \equiv -\Phi + \Phi_0 \tag{8.34}$$
and the relative energy
$$E \equiv -\epsilon_e + \Phi_0 = \Psi - \frac{1}{2} v^2, \tag{8.35}$$
where $\Phi_0$ is a constant chosen so that $f > 0$ for $E > 0$ and $f = 0$ for $E \le 0$. The Poisson equation becomes $\nabla^2 \Psi = -4\pi G \rho_m$, subject to the boundary condition $\Psi \rightarrow \Phi_0$ as $r \rightarrow \infty$.
One important property of stellar systems is the anisotropy of their velocity distribution function. We define the anisotropy parameter
$$\beta(r) \equiv 1 - \frac{\overline{v_\theta^2}}{\overline{v_r^2}}, \tag{8.36}$$
where $\overline{v_\theta^2}$ and $\overline{v_r^2}$ are the mean squared tangential and radial velocities at a particular location $r$, respectively. It follows that systems with $\beta = 0$ everywhere have an isotropic velocity distribution function.
If $f$ only depends on the energy, the mean squared radial and tangential velocities are, respectively,
$$\overline{v_r^2} = \frac{1}{\rho} \int_{\rm all\ vel} v_r^2\, f\!\left[\Psi - \frac{1}{2}\left(v_r^2 + v_\theta^2 + v_\phi^2\right)\right] dv_r\, dv_\theta\, dv_\phi \tag{8.37}$$
and
$$\overline{v_\theta^2} = \frac{1}{\rho} \int_{\rm all\ vel} v_\theta^2\, f\!\left[\Psi - \frac{1}{2}\left(v_r^2 + v_\theta^2 + v_\phi^2\right)\right] dv_r\, dv_\theta\, dv_\phi. \tag{8.38}$$
If the labels $\theta$ and $r$ are exchanged in (8.38), it can be seen that one arrives at (8.37). Equations (8.37) and (8.38) are thus identical apart from the labelling. Thus, if $f = f(E)$, then $\beta = 0$ and the velocity distribution function is isotropic.
⁶ The following discussion is based on Binney & Tremaine (1987).
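If $f$ depends on the energy and the orbital angular momentum of the stars ($|\mathbf{l}| = |\mathbf{r} \times \mathbf{v}|$), then the mean squared radial and tangential velocities are, respectively,
$$\overline{v_r^2} = \frac{1}{\rho} \int_{\rm all\ vel} v_r^2\, f\!\left[\Psi - \frac{1}{2}\left(v_r^2 + v_\theta^2 + v_\phi^2\right),\; r \sqrt{v_\theta^2 + v_\phi^2}\right] dv_r\, dv_\theta\, dv_\phi \tag{8.39}$$
and
$$\overline{v_\theta^2} = \frac{1}{\rho} \int_{\rm all\ vel} v_\theta^2\, f\!\left[\Psi - \frac{1}{2}\left(v_r^2 + v_\theta^2 + v_\phi^2\right),\; r \sqrt{v_\theta^2 + v_\phi^2}\right] dv_r\, dv_\theta\, dv_\phi. \tag{8.40}$$
If the labels $\theta$ and $r$ are exchanged in (8.40), it can be seen that this time one does not arrive at (8.39), because the second argument of $f$ is not symmetric in $v_r$ and $v_\theta$. Thus, if $f = f(E, l)$, then $\beta \ne 0$ and the velocity distribution function is not isotropic. This serves to demonstrate an elementary but useful property of the phase-space distribution function.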
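A very useful series of distribution functions can be arrived at from the simple form
$$f_m(E) = \begin{cases} F\, E^{\,n - \frac{3}{2}}, & E > 0, \\ 0, & E \le 0. \end{cases} \tag{8.41}$$
The mass density is
$$\rho_m(r) = 4\pi F \int_0^{\sqrt{2\Psi}} \left(\Psi - \frac{1}{2} v^2\right)^{n - \frac{3}{2}} v^2\, dv, \tag{8.42}$$
where the upper integration bound is given by the escape condition $E = \Psi - (1/2) v^2 = 0$. Substituting $v^2 = 2\Psi \cos^2\theta$ for some $\theta$ leads to
$$\rho_m(r) = \begin{cases} c_n\, \Psi^n, & \Psi > 0, \\ 0, & \Psi \le 0. \end{cases} \tag{8.43}$$
For $c_n$ to be finite, $n > 1/2$ is required, i.e. homogeneous ($n = 0$) systems are excluded.
The Lane–Emden equation follows from the spherically symmetric Poisson equation after introducing the dimensionless variables $s = r/b$ and $\psi = \Psi/\Psi_0$, where $b = (4\pi G\, \Psi_0^{\,n-1} c_n)^{-1/2}$ and $\Psi_0 = \Psi(0)$:
$$\frac{1}{s^2} \frac{d}{ds}\left(s^2 \frac{d\psi}{ds}\right) = \begin{cases} -\psi^n, & \psi > 0, \\ 0, & \psi \le 0. \end{cases} \tag{8.44}$$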
H. Lane and R. Emden worked with this equation in the context of self-gravitating polytropic gas spheres, which have an equation of state
$$p = K \rho_m^{\gamma}, \tag{8.45}$$
where $K$ is a constant and $p$ the pressure. It can be shown that $\gamma = 1 + 1/n$. That is, the density distribution of a stellar polytrope of index $n$ is the same as that of a polytropic gas sphere with index $\gamma$.
The natural boundary conditions to be imposed on (8.44) are, at $s = 0$,
1. $\psi = 1$, because $\Psi(0) = \Psi_0$, and
2. $d\psi/ds = 0$, because the gravitational force must vanish at the centre.
Analytical solutions to the Lane–Emden equation are possible only for a few values of $n$, and we remember that a homogeneous ($n = 0$) stellar density distribution has already been excluded as a viable solution of the general power-law phase-space distribution function.
The Plummer Model
A particularly useful case is
$$\psi = \frac{1}{\sqrt{1 + \frac{1}{3} s^2}}. \tag{8.46}$$
It follows immediately that this is a solution of the Lane–Emden equation for $n = 5$. It also satisfies the two boundary conditions above and so constitutes a physically sensible potential. By integrating the Poisson equation it can be shown that the total mass of this distribution function is finite,
$$M_\infty = \sqrt{3}\, \Psi_0\, b / G, \tag{8.47}$$
although the density distribution has no boundary. The distribution function is
$$f_m(E) = \begin{cases} F \left(\Psi - \frac{1}{2} v^2\right)^{\frac{7}{2}}, & v^2 < 2\Psi, \\ 0, & v^2 \ge 2\Psi, \end{cases} \tag{8.48}$$
with the relative potential
$$\Psi = \frac{\Psi_0}{\sqrt{1 + \frac{1}{3} \left(\frac{r}{b}\right)^2}} \tag{8.49}$$
and density law
$$\rho_m = \frac{\rho_{m,0}}{\left(1 + \frac{1}{3} \left(\frac{r}{b}\right)^2\right)^{\frac{5}{2}}}, \tag{8.50}$$
with the above total mass. This density distribution is known as the Plummer model, named after Plummer (1911), who showed that the density distribution that results from this model provides a reasonable, and in particular very simple, analytical description of globular clusters. The Plummer model is in fact a work-horse for many applications in stellar dynamics because many of its properties, such as the projected velocity dispersion profile, can be calculated analytically. Such formulae are useful for checking numerical codes used to set up models of stellar systems.
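The statement that the Plummer potential solves the Lane–Emden equation for $n = 5$ can be checked numerically. The sketch below is an illustration only, not the method used in the text: it integrates the Lane–Emden equation with a classical fourth-order Runge–Kutta scheme, starting slightly off-centre from the standard series expansion $\psi \approx 1 - s^2/6 + n s^4/120$, and compares the result with the closed form for $\psi$ given above. All function names and the step size are assumptions made for this example.

```python
import math


def lane_emden_rhs(s, psi, dpsi, n):
    """Lane-Emden equation written as a first-order system (psi', psi'')."""
    source = -(psi ** n) if psi > 0.0 else 0.0
    return dpsi, source - 2.0 * dpsi / s


def integrate_lane_emden(n, s_end, h=1.0e-3):
    """Integrate psi(s) to s_end (> h) with classical RK4.

    The integration starts at s = h from the series expansion about the
    centre, which satisfies psi(0) = 1 and psi'(0) = 0.
    """
    s = h
    psi = 1.0 - s**2 / 6.0 + n * s**4 / 120.0
    dpsi = -s / 3.0 + n * s**3 / 30.0
    while s < s_end - 1.0e-12:
        step = min(h, s_end - s)
        k1 = lane_emden_rhs(s, psi, dpsi, n)
        k2 = lane_emden_rhs(s + 0.5 * step, psi + 0.5 * step * k1[0],
                            dpsi + 0.5 * step * k1[1], n)
        k3 = lane_emden_rhs(s + 0.5 * step, psi + 0.5 * step * k2[0],
                            dpsi + 0.5 * step * k2[1], n)
        k4 = lane_emden_rhs(s + step, psi + step * k3[0],
                            dpsi + step * k3[1], n)
        psi += step * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        dpsi += step * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        s += step
    return psi


def plummer_psi(s):
    """Closed-form dimensionless Plummer potential."""
    return 1.0 / math.sqrt(1.0 + s * s / 3.0)


if __name__ == "__main__":
    for s_end in (1.0, 3.0, 5.0):
        diff = integrate_lane_emden(5, s_end) - plummer_psi(s_end)
        print(f"s = {s_end}: numerical - analytical = {diff:.2e}")
```

The same integrator can be run with other values of `n`, for which no closed-form solution generally exists.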
Properties of the Plummer Model
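Some useful analytical results can be derived for the Plummer density law (see also Heggie & Hut 2003, their p. 73, for another compilation). For the Plummer law of mass $M_{\rm ecl}$, the mass-density profile (8.50) can be written as
$$\rho_m(r) = \frac{3 M_{\rm ecl}}{4\pi r_{\rm pl}^3} \frac{1}{\left[1 + \left(\frac{r}{r_{\rm pl}}\right)^2\right]^{\frac{5}{2}}}, \tag{8.51}$$
where $r_{\rm pl}$ is the Plummer scale length. The central number density is thus
$$\rho_c = \frac{3N}{4\pi r_{\rm pl}^3}. \tag{8.52}$$
The mass within radius $r$ follows from $M(r) = 4\pi \int_0^r \rho_m(r')\, r'^2\, dr'$,
$$M(r) = M_{\rm ecl} \frac{\left(\frac{r}{r_{\rm pl}}\right)^3}{\left[1 + \left(\frac{r}{r_{\rm pl}}\right)^2\right]^{\frac{3}{2}}}. \tag{8.53}$$
Thus $r_{\rm pl}$ contains 35.4% of the mass, $2\,r_{\rm pl}$ contain 71.6%, $5\,r_{\rm pl}$ contain 94.3% and $10\,r_{\rm pl}$ contain 98.5% of the total mass.
For the half-mass radius we have
$$r_h = \left(2^{\frac{2}{3}} - 1\right)^{-\frac{1}{2}} r_{\rm pl} \approx 1.305\, r_{\rm pl}. \tag{8.54}$$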
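The enclosed-mass profile (8.53) can be inverted in closed form, which is convenient when radii for a Plummer model have to be drawn at random. The following sketch reproduces the mass fractions and the half-mass radius quoted above and then draws radii by inverse-transform sampling. The function names are hypothetical, radii are in units of $r_{\rm pl}$, and the sampler is a minimal illustration of one common approach rather than a full initial-conditions generator (it produces neither angles nor velocities).

```python
import math
import random


def plummer_mass_fraction(x):
    """M(r)/M_ecl from (8.53), with x = r / r_pl."""
    return x**3 / (1.0 + x * x) ** 1.5


def plummer_radius_for_fraction(frac):
    """Invert (8.53): radius in units of r_pl enclosing mass fraction frac."""
    return 1.0 / math.sqrt(frac ** (-2.0 / 3.0) - 1.0)


def sample_plummer_radii(n_stars, rng):
    """Draw n_stars radii by inverse-transform sampling of M(r)/M_ecl."""
    radii = []
    while len(radii) < n_stars:
        frac = rng.random()
        if frac <= 0.0:
            continue                       # frac = 0 cannot be inverted
        denom = frac ** (-2.0 / 3.0) - 1.0
        if denom <= 0.0:
            continue                       # frac rounds to 1: radius diverges
        radii.append(1.0 / math.sqrt(denom))
    return radii


if __name__ == "__main__":
    for x in (1.0, 2.0, 5.0, 10.0):
        print(f"{x:4.0f} r_pl encloses {100.0 * plummer_mass_fraction(x):.1f}%")
    print("half-mass radius / r_pl =", plummer_radius_for_fraction(0.5))
    radii = sorted(sample_plummer_radii(20001, random.Random(42)))
    print("median sampled radius / r_pl =", radii[len(radii) // 2])
```

The median of the sampled radii should scatter around the half-mass radius (8.54), which offers a simple sanity check on any set-up code.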
The projected surface mass density, $\Sigma_\rho(R) = 2 \int_0^\infty \rho_m(r)\, dZ$, where $R$ is the projected radial distance from the cluster centre and $Z$ is the integration variable along the line of sight ($r^2 = R^2 + Z^2$), is
$$\Sigma_\rho(R) = \frac{M_{\rm ecl}}{\pi r_{\rm pl}^2} \frac{1}{\left[1 + \left(\frac{R}{r_{\rm pl}}\right)^2\right]^2}. \tag{8.55}$$
We assume there is no mass segregation, so that the mass-to-light ratio, $\Upsilon \equiv (M/L)$, measured in some photometric system, is independent of radius. The integrated light within projected radius $R$ is
$$I(R) = \frac{1}{\Upsilon} \int_0^R \Sigma_\rho(R')\, 2\pi R'\, dR', \tag{8.56}$$
$$I(R) = \frac{M_{\rm ecl}\, r_{\rm pl}^2}{\Upsilon} \left[\frac{1}{r_{\rm pl}^2} - \frac{1}{R^2 + r_{\rm pl}^2}\right]. \tag{8.57}$$
Thus, $r_{\rm pl}$ is the half-light radius of the projected star cluster, $I(r_{\rm pl}) = 0.5\, I(\infty)$.
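The step from (8.56) to (8.57), and the statement that $r_{\rm pl}$ is the projected half-light radius, can be verified with a short numerical integration. The sketch below is illustrative only: the function names, the default units ($M_{\rm ecl} = r_{\rm pl} = \Upsilon = 1$) and the simple midpoint rule are choices made for this example.

```python
import math


def surface_density(R, m_ecl=1.0, r_pl=1.0):
    """Projected surface mass density of the Plummer model, (8.55)."""
    return m_ecl / (math.pi * r_pl**2) / (1.0 + (R / r_pl) ** 2) ** 2


def light_numeric(R, upsilon=1.0, m_ecl=1.0, r_pl=1.0, n=20000):
    """Integrated light within R from (8.56), using the midpoint rule."""
    h = R / n
    total = 0.0
    for i in range(n):
        r_mid = (i + 0.5) * h
        total += surface_density(r_mid, m_ecl, r_pl) * 2.0 * math.pi * r_mid
    return total * h / upsilon


def light_closed_form(R, upsilon=1.0, m_ecl=1.0, r_pl=1.0):
    """Integrated light within R from the closed form (8.57)."""
    return m_ecl * r_pl**2 / upsilon * (1.0 / r_pl**2 - 1.0 / (R**2 + r_pl**2))


if __name__ == "__main__":
    total_light = 1.0  # I(infinity) = M_ecl / Upsilon in these units
    print("numeric    I(r_pl)/I(inf) =", light_numeric(1.0) / total_light)
    print("closed form I(r_pl)/I(inf) =", light_closed_form(1.0) / total_light)
```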
In the above equations, $\rho(r) = \rho_m(r)/\overline{m}$, $N(r) = M(r)/\overline{m}$ and $\Sigma_n = \Sigma_\rho/\overline{m}$ are, respectively, the stellar number density, the number of stars within radius $r$ and the projected surface number density profile, if there is no mass segregation within the cluster, i.e. if the average stellar mass $\overline{m}$ is constant.
The velocity dispersion can be calculated at any radius from the Jeans equation (8.120). For an isotropic velocity distribution ($\sigma_\theta^2 = \sigma_\phi^2 = \sigma_r^2$), such as the Plummer model, the Jeans equation yields
$$\sigma_r^2(r) = \frac{1}{\rho(r)} \int_r^\infty \rho(r')\, \frac{G M(r')}{r'^2}\, dr', \tag{8.58}$$
because $d\phi(r)/dr = G M(r)/r^2$, and the integration bounds have been chosen to make use of the vanishing $\rho_m(r)$ as $r \rightarrow \infty$. Note that the above equation is also valid if $M(r)$ consists of more than one spherical component, such as a distinct core plus an extended halo. Combining (8.51), (8.53) and (8.58), we are led to
$$\sigma^2(r) = \left(\frac{G M_{\rm ecl}}{2\, r_{\rm pl}}\right) \frac{1}{\left[1 + \left(\frac{r}{r_{\rm pl}}\right)^2\right]^{\frac{1}{2}}}, \tag{8.59}$$
where $\sigma(r)$ is the three-dimensional velocity dispersion of the Plummer sphere at radius $r$, $\sigma^2(r) = \sum_{k = r, \theta, \phi} \sigma_k^2(r)$, or $\sigma^2(r) = 3\, \sigma_{\rm 1D}^2(r)$ because isotropy is assumed.
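The combination of (8.51), (8.53) and (8.58) into (8.59) can be cross-checked numerically. The sketch below works in units $G = M_{\rm ecl} = r_{\rm pl} = 1$ and evaluates the Jeans integral with the substitution $u = 1/r'$, which maps the infinite range onto a finite one. The function names and the midpoint rule are assumptions made for this illustration.

```python
import math


def rho_m(r):
    """Plummer mass density (8.51) in units G = M_ecl = r_pl = 1."""
    return 3.0 / (4.0 * math.pi) / (1.0 + r * r) ** 2.5


def mass_within(r):
    """Enclosed mass (8.53) in the same units."""
    return r**3 / (1.0 + r * r) ** 1.5


def sigma_r2_numeric(r, n=20000):
    """Radial velocity dispersion squared from the Jeans integral (8.58).

    With u = 1/r', dr'/r'^2 = -du, so the integral over [r, infinity)
    becomes an integral of rho(1/u) * M(1/u) over (0, 1/r].
    """
    u_max = 1.0 / r
    h = u_max / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h          # midpoints avoid u = 0
        r_prime = 1.0 / u
        total += rho_m(r_prime) * mass_within(r_prime)
    return total * h / rho_m(r)


def sigma2_closed_form(r):
    """Three-dimensional velocity dispersion squared from (8.59)."""
    return 0.5 / math.sqrt(1.0 + r * r)


if __name__ == "__main__":
    for r in (0.5, 1.0, 3.0):
        print(r, 3.0 * sigma_r2_numeric(r), sigma2_closed_form(r))
```

The factor of three in the comparison reflects $\sigma^2 = 3\,\sigma_r^2$ for an isotropic system; the ratio of densities in (8.58) means that mass or number density can be used interchangeably there.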
A star with mass $m$ positioned at $r$ and with speed $v = \left(\sum_{k=1}^{3} v_k^2\right)^{1/2}$ can escape from the cluster if it has a total energy $e_{\rm bind} = e_{\rm kin} + e_{\rm pot} = 0.5\, m v^2 + m \phi(r) \ge 0$, so that $v \ge v_{\rm esc}(r)$. So the escape speed at radius $r$ is $v_{\rm esc}(r) = \sqrt{2\, |\phi(r)|}$. The potential at $r$ is given by the mass within $r$ plus the potential contributed by the surrounding matter. It is calculated by integrating the contributions from each radial mass shell,
$$\phi(r) = -\left[\frac{G M(r)}{r} + \int_r^\infty G\, \frac{1}{r'}\, \rho_m(r')\, 4\pi r'^2\, dr'\right] = -\left(\frac{G M_{\rm ecl}}{r_{\rm pl}}\right) \frac{1}{\left[1 + (r/r_{\rm pl})^2\right]^{1/2}}, \tag{8.60}$$
so that
$$v_{\rm esc}(r) = \left(\frac{2\, G M_{\rm ecl}}{r_{\rm pl}}\right)^{1/2} \frac{1}{\left[1 + (r/r_{\rm pl})^2\right]^{1/4}}. \tag{8.61}$$
The circular speed, $v_c$, of a star moving on a circular orbit at a distance $r$ from the cluster centre is obtained from the centrifugal acceleration, $v_c^2/r = d\phi(r)/dr = G M(r)/r^2$:
$$v_c^2 = \left(\frac{G M_{\rm ecl}}{r_{\rm pl}}\right) \frac{(r/r_{\rm pl})^2}{\left[1 + (r/r_{\rm pl})^2\right]^{3/2}}. \tag{8.62}$$
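To give a feeling for the numbers, the sketch below evaluates (8.60)–(8.62) in astrophysical units. The cluster parameters are purely illustrative assumptions: the mass of $4000\,M_\odot$ echoes the Pleiades-precursor value quoted earlier, while the Plummer radius of 0.5 pc is simply chosen for this example. The value of $G$ in units of pc (km/s)$^2$ $M_\odot^{-1}$ is the standard conversion, and the function names are hypothetical.

```python
import math

G = 4.30091e-3  # gravitational constant in pc (km/s)^2 / Msun


def potential(r, m_ecl, r_pl):
    """Plummer potential (8.60) in (km/s)^2; r and r_pl in pc, m_ecl in Msun."""
    return -G * m_ecl / r_pl / math.sqrt(1.0 + (r / r_pl) ** 2)


def escape_speed(r, m_ecl, r_pl):
    """Escape speed (8.61) in km/s."""
    return math.sqrt(2.0 * G * m_ecl / r_pl) / (1.0 + (r / r_pl) ** 2) ** 0.25


def circular_speed(r, m_ecl, r_pl):
    """Circular speed (8.62) in km/s."""
    x = r / r_pl
    return math.sqrt(G * m_ecl / r_pl * x * x / (1.0 + x * x) ** 1.5)


if __name__ == "__main__":
    m_ecl, r_pl = 4000.0, 0.5   # illustrative values only
    for r in (0.0, 0.5, 1.0, 2.0):
        print(f"r = {r:3.1f} pc: v_esc = {escape_speed(r, m_ecl, r_pl):5.2f} km/s, "
              f"v_c = {circular_speed(r, m_ecl, r_pl):5.2f} km/s")
```

Differentiating (8.62) shows that the circular speed peaks at $r = \sqrt{2}\, r_{\rm pl}$, whereas the escape speed is largest at the centre.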
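In many, but not all, instances of interest the initial cluster model is chosen to be in the state of virial equilibrium. That is, the kinetic and potential energies of each star balance so that the whole cluster is stationary. The scalar virial theorem reads
$$2K + W = 0, \tag{8.63}$$
where $K$ and $W$ are the total kinetic and potential energy of the cluster,⁷
$$K = \frac{1}{2} \int_0^\infty \rho_m(r)\, \sigma^2(r)\, 4\pi r^2\, dr = \frac{3\pi}{64} \frac{G M_{\rm ecl}^2}{r_{\rm pl}} \quad \text{for the Plummer sphere}, \tag{8.64}$$
$$W = \frac{1}{2} \int_0^\infty \phi(r)\, \rho_m(r)\, 4\pi r^2\, dr = -\frac{3\pi}{32} \frac{G M_{\rm ecl}^2}{r_{\rm pl}} \quad \text{for the Plummer sphere}. \tag{8.65}$$
The total, or binding, energy of the cluster, $E_{\rm tot} = W + K$, is
$$E_{\rm tot} = -K = \frac{1}{2} W. \tag{8.66}$$
The characteristic three-dimensional velocity dispersion of a cluster can be defined as $\sigma_{\rm cl}^2 \equiv 2K/M_{\rm ecl}$, so that
$$\sigma_{\rm cl}^2 = \frac{3\pi}{32} \frac{G M_{\rm ecl}}{r_{\rm pl}} \tag{8.67}$$
$$\phantom{\sigma_{\rm cl}^2} \equiv \frac{G M_{\rm ecl}}{r_{\rm grav}} \tag{8.68}$$
$$\phantom{\sigma_{\rm cl}^2} \equiv s^2 \left(\frac{G M_{\rm ecl}}{2\, r_h}\right), \tag{8.69}$$
which introduces the gravitational radius of the cluster, $r_{\rm grav} \equiv G M_{\rm ecl}^2/|W|$. For the Plummer sphere, $r_{\rm grav} = (32/3\pi)\, r_{\rm pl} = 3.4\, r_{\rm pl}$ and the structure factor is
$$s = \left(\frac{6 \times 1.305\, \pi}{32}\right)^{\frac{1}{2}} \approx 0.88. \tag{8.70}$$
⁷ Equation (3.251.4) on p. 295 of Gradshteyn & Ryzhik (1980) is useful to solve the integrals for the Plummer sphere.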
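The closed-form energies (8.64) and (8.65), the virial theorem (8.63) and the derived quantities $r_{\rm grav}$ and $s$ can be confirmed by direct numerical integration. The sketch below is one way to do this, in units $G = M_{\rm ecl} = r_{\rm pl} = 1$; the helper `integrate_to_infinity`, which uses the substitution $r = \tan\theta$ together with a midpoint rule, is an assumption of this example and not a method taken from the text.

```python
import math


def integrate_to_infinity(func, n=4000):
    """Midpoint rule for the integral of func(r) over [0, inf), via r = tan(theta)."""
    h = 0.5 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        r = math.tan(theta)
        total += func(r) / math.cos(theta) ** 2   # dr = d(theta) / cos^2(theta)
    return total * h


def rho_m(r):
    """Plummer mass density (8.51), units G = M_ecl = r_pl = 1."""
    return 3.0 / (4.0 * math.pi) / (1.0 + r * r) ** 2.5


def sigma2(r):
    """Three-dimensional velocity dispersion squared (8.59)."""
    return 0.5 / math.sqrt(1.0 + r * r)


def phi(r):
    """Plummer potential (8.60)."""
    return -1.0 / math.sqrt(1.0 + r * r)


# Total kinetic and potential energy, (8.64) and (8.65).
K = integrate_to_infinity(lambda r: 0.5 * rho_m(r) * sigma2(r) * 4.0 * math.pi * r * r)
W = integrate_to_infinity(lambda r: 0.5 * phi(r) * rho_m(r) * 4.0 * math.pi * r * r)

r_grav = 1.0 / abs(W)                          # G M_ecl^2 / |W|
r_half = 1.0 / math.sqrt(2.0 ** (2.0 / 3.0) - 1.0)
s = math.sqrt(2.0 * K * 2.0 * r_half)          # from (8.69) with sigma_cl^2 = 2K

if __name__ == "__main__":
    print("K      =", K, " expected", 3.0 * math.pi / 64.0)
    print("W      =", W, " expected", -3.0 * math.pi / 32.0)
    print("2K + W =", 2.0 * K + W)
    print("r_grav =", r_grav, " s =", s)
```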
We define the virial ratio by
$$Q = \frac{K}{|W|}, \tag{8.71}$$
so that a cluster can initially be in three possible states,
$$Q \begin{cases} = \frac{1}{2}, & \text{virial equilibrium}, \\ > \frac{1}{2}, & \text{expanding}, \\ < \frac{1}{2}, & \text{collapsing}. \end{cases} \tag{8.72}$$
Note that if initially $Q < 1/2$, the value $Q = 1/2$ will be reached temporarily during collapse, after which $Q$ increases further until the cluster settles in virial equilibrium after this violent relaxation phase (Binney & Tremaine 1987, p. 271).
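In practice, the virial ratio of a discrete N-body realisation is measured from the particles themselves, and a desired initial $Q$ is imposed by rescaling all velocities by a common factor, since $K \propto v^2$ while $W$ depends only on positions. The sketch below illustrates this under simplifying assumptions: only the mutual gravitational energy of the stars is counted (no gas potential or tidal field), the potential is summed directly over all pairs, and the function names and the random test system are invented for this example.

```python
import math
import random


def virial_ratio(masses, positions, velocities, grav_const=1.0):
    """Q = K / |W| as in (8.71), for point masses, by direct pair summation."""
    kinetic = 0.0
    for m, v in zip(masses, velocities):
        kinetic += 0.5 * m * (v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
    potential = 0.0
    n = len(masses)
    for i in range(n):
        for j in range(i + 1, n):
            dist = math.dist(positions[i], positions[j])
            potential -= grav_const * masses[i] * masses[j] / dist
    return kinetic / abs(potential)


def rescale_velocities(masses, positions, velocities, q_target=0.5, grav_const=1.0):
    """Return velocities scaled by a common factor so that Q equals q_target."""
    q_now = virial_ratio(masses, positions, velocities, grav_const)
    factor = math.sqrt(q_target / q_now)
    return [[factor * comp for comp in v] for v in velocities]


# A small random test system (arbitrary units, for illustration only).
_rng = random.Random(1)
masses = [_rng.uniform(0.1, 1.0) for _ in range(30)]
positions = [[_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(30)]
velocities = [[_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(30)]

q_before = virial_ratio(masses, positions, velocities)
velocities_eq = rescale_velocities(masses, positions, velocities, q_target=0.5)
q_after = virial_ratio(masses, positions, velocities_eq)

if __name__ == "__main__":
    print("Q before rescaling:", q_before)
    print("Q after rescaling: ", q_after)
```

Choosing `q_target` below or above 0.5 sets up a collapsing or an expanding model in the sense of (8.72).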
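The characteristic crossing time through the Plummer cluster is
$$t_{\rm cr} \equiv \frac{2\, r_{\rm pl}}{\sigma_{\rm 1D,cl}} \tag{8.73}$$
$$\phantom{t_{\rm cr}} = \left(\frac{128}{\pi G}\right)^{\frac{1}{2}} M_{\rm ecl}^{-\frac{1}{2}}\, r_{\rm pl}^{\frac{3}{2}}, \tag{8.74}$$
with the characteristic one-dimensional velocity dispersion $\sigma_{\rm 1D,cl} = \sigma_{\rm cl}/\sqrt{3}$.
Observationally, the core radius is that radius where the projected surface density falls to half its central value. For a real cluster it is much easier to determine than the other characteristic radii. For the Plummer sphere,
$$R_{\rm core} = \left(\sqrt{2} - 1\right)^{\frac{1}{2}} r_{\rm pl} = 0.64\, r_{\rm pl}, \tag{8.75}$$
from (8.55) with the assumption that the mass-to-light ratio, $\Upsilon$, is independent of radius. For a King model,
$$R_{\rm core}^{\rm King} = \left(\frac{9}{4\pi G} \frac{\sigma^2}{\rho_m(0)}\right)^{\frac{1}{2}} \tag{8.76}$$
is the King radius. From (8.59), $\sigma^2(0) = G M_{\rm ecl}/(2\, r_{\rm pl})$, and from (8.51), $\rho_m(0) = 3 M_{\rm ecl}/(4\pi r_{\rm pl}^3)$, so that
$$r_{\rm pl} = \left(\frac{6}{4\pi G} \frac{\sigma(0)^2}{\rho_m(0)}\right)^{\frac{1}{2}} = 0.82\, R_{\rm core}^{\rm King}. \tag{8.77}$$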
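The crossing time (8.74) and the radius conversions (8.75)–(8.77) are straightforward to evaluate. The sketch below does so in astrophysical units; the cluster parameters ($4000\,M_\odot$, $r_{\rm pl} = 0.5$ pc) are illustrative assumptions, the function names are hypothetical, and the constants are the standard values of $G$ in pc (km/s)$^2$ $M_\odot^{-1}$ and of 1 pc/(km/s) expressed in Myr.

```python
import math

G = 4.30091e-3                # gravitational constant in pc (km/s)^2 / Msun
PC_PER_KMS_IN_MYR = 0.977792  # 1 pc / (1 km/s) expressed in Myr


def sigma_1d_cl(m_ecl, r_pl):
    """Characteristic 1D velocity dispersion, sigma_cl / sqrt(3), from (8.67), in km/s."""
    return math.sqrt(math.pi * G * m_ecl / (32.0 * r_pl))


def crossing_time_myr(m_ecl, r_pl):
    """Crossing time (8.74) in Myr; m_ecl in Msun, r_pl in pc."""
    t_cr = math.sqrt(128.0 / (math.pi * G)) * m_ecl ** -0.5 * r_pl ** 1.5
    return t_cr * PC_PER_KMS_IN_MYR


def plummer_core_radius(r_pl):
    """Projected core radius of the Plummer sphere, (8.75)."""
    return math.sqrt(math.sqrt(2.0) - 1.0) * r_pl


def king_radius(sigma, rho_m0):
    """King radius (8.76); sigma in km/s, rho_m0 in Msun / pc^3, result in pc."""
    return math.sqrt(9.0 * sigma**2 / (4.0 * math.pi * G * rho_m0))


if __name__ == "__main__":
    m_ecl, r_pl = 4000.0, 0.5   # illustrative values only
    sigma0 = math.sqrt(G * m_ecl / (2.0 * r_pl))        # sigma(0) from (8.59)
    rho0 = 3.0 * m_ecl / (4.0 * math.pi * r_pl**3)      # rho_m(0) from (8.51)
    print("t_cr [Myr]         =", crossing_time_myr(m_ecl, r_pl))
    print("R_core / r_pl      =", plummer_core_radius(r_pl) / r_pl)
    print("r_pl / R_core,King =", r_pl / king_radius(sigma0, rho0))
```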
The Singular Isothermal Model
Another useful set of distribution functions can be arrived at by considering $n = \infty$. The Lane–Emden equation is not well defined in this limit, but for a polytropic gas sphere (8.45) implies $\gamma \rightarrow 1$ as $n \rightarrow \infty$. Thus $p = K \rho_{\rm m}$, which is the equation of state of an isothermal ideal gas with $K = k_{\rm B} T / m_{\rm p}$, where $k_{\rm B}$ is Boltzmann's constant, $T$ the temperature and $m_{\rm p}$ the mass of a gas particle. From the equation of hydrostatic support, $dp/dr = -\rho_{\rm m}\,(G\,M(r)/r^2)$, where $M(r)$ is the mass within $r$, the following equation can be derived:

$$ \frac{d}{dr}\left( r^2\, \frac{d \ln \rho_{\rm m}}{dr} \right) = -\frac{G\, m_{\rm p}}{k_{\rm B} T}\, 4\pi\, r^2 \rho_{\rm m} . \tag{8.78} $$
For a distribution function (our ansatz)

$$ f_{\rm m}(E) = \frac{\rho_{\rm m1}}{(2\pi \sigma^2)^{3/2}}\, e^{E/\sigma^2} , \tag{8.79} $$

where $\sigma^2$ is a new quantity related to a velocity dispersion and $E = \Psi - v^2/2$, one obtains from $\rho_{\rm m} = \int f_{\rm m}(E)\, 4\pi v^2\, dv$

$$ \Psi(r) = \ln\!\left( \frac{\rho_{\rm m}(r)}{\rho_{\rm m1}} \right) \sigma^2 . \tag{8.80} $$

From the Poisson equation it then follows that

$$ \sigma^2 = {\rm const} = \frac{k_{\rm B} T}{m_{\rm p}} \tag{8.81} $$

for consistency with (8.78).

Therefore, the structure of an isothermal, self-gravitating sphere of ideal gas is identical to the structure of a collisionless system of stars whose phase-space mass-density distribution function is given by (8.79). Note that $f(E)$ is non-zero at all $E$ (cf. King's models below).
The number-distribution function of velocities is $F(v) = \int_{{\rm all}\ x} f(E)\, d^3x$, i.e.

$$ F(v) = F_0\, e^{-v^2/(2\sigma^2)} . \tag{8.82} $$

This is the Maxwell–Boltzmann distribution, which results from the kinetic theory of atoms in a gas at temperature $T$ that are allowed to bounce off each other elastically. This exact correspondence between a stellar-dynamical system and a gaseous polytrope holds only for an isothermal case ($n = \infty$).

The total number of stars in the system is $N_{\rm tot} = \int_0^\infty F(v)\, 4\pi v^2\, dv$, and the number of stars in the speed interval $v$ to $v + dv$ is

$$ dN = F(v)\, 4\pi v^2\, dv = N_{\rm tot}\, \frac{1}{(2\pi\sigma^2)^{3/2}}\, e^{-v^2/(2\sigma^2)}\, 4\pi v^2\, dv , \tag{8.83} $$

which is the Maxwell–Boltzmann distribution of speeds. The mean-square speed of stars at a point in the isothermal sphere is

$$ \overline{v^2} = \frac{4\pi \int_0^\infty v^2\, F(v)\, v^2\, dv}{4\pi \int_0^\infty F(v)\, v^2\, dv} = 3\,\sigma^2 , \tag{8.84} $$
and the 1D velocity dispersion is $\sigma_{\rm 1D} = \sigma_\alpha = \sigma$, where $\alpha = r, \theta, \phi, x, y, z$. To obtain the radial mass-density of this model, the ansatz $\rho_{\rm m} = C\, r^{-b}$ together with the Poisson equation (8.78) implies

$$ \rho_{\rm m}(r) = \frac{\sigma^2}{2\pi G}\, \frac{1}{r^2} . \tag{8.85} $$

That is, a singular isothermal sphere.
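The step from the power-law ansatz to (8.85) can be written out explicitly. The short derivation below is a sketch that uses only (8.78) and (8.81); it is added here for convenience and is not part of the original text.

```latex
% Insert \rho_m = C r^{-b} into (8.78), with k_B T / m_p = \sigma^2 from (8.81):
\frac{d \ln \rho_{\rm m}}{dr} = -\frac{b}{r}
\quad\Rightarrow\quad
\frac{d}{dr}\left( r^2\, \frac{d \ln \rho_{\rm m}}{dr} \right) = \frac{d}{dr}\left( -b\, r \right) = -b ,
\qquad
-b = -\frac{4\pi G}{\sigma^2}\, C\, r^{2-b} .
% The left-hand side is independent of r, so the right-hand side must be too:
b = 2 , \qquad C = \frac{2\,\sigma^2}{4\pi G} = \frac{\sigma^2}{2\pi G} .
```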
The Isothermal Model
The above model has a singularity at the origin. This is unphysical. In order to remove this problem, it is possible to force the central density to be finite. To this end, new dimensionless variables are introduced: $\tilde{\rho}_{\rm m} \equiv \rho_{\rm m}/\rho_{\rm m0}$, $\tilde{r} \equiv r/r_0$. The density $\rho_{\rm m0}$ is the finite central density, while $r_0 = R_{\rm core}^{\rm King}$ is the King radius (8.76), at which the projected density falls to 0.5013 (i.e. about half) its central value. The radius $r_0$ is also sometimes called the core radius (but see further below for King models on p. 211). The Poisson equation (8.78) then becomes

$$ \frac{d}{d\tilde{r}}\left( \tilde{r}^2\, \frac{d \ln \tilde{\rho}_{\rm m}}{d\tilde{r}} \right) = -9\, \tilde{\rho}_{\rm m}\, \tilde{r}^2 . \tag{8.86} $$

This differential equation must be solved numerically for $\tilde{\rho}_{\rm m}(\tilde{r})$ subject to the boundary conditions (as before)

$$ \tilde{\rho}_{\rm m}(\tilde{r} = 0) = 1 , \qquad \left. \frac{d\tilde{\rho}_{\rm m}}{d\tilde{r}} \right|_{\tilde{r}=0} = 0 . \tag{8.87} $$
The solution is the isothermal sphere.

By imposing physical reality (central non-singularity) on our mathematical ansatz, we end up with a density profile that cannot be arrived at analytically, but only numerically. The isothermal density sphere must be tabulated in the computer with entries such as

$$ \frac{r}{r_0} , \quad \log_{10}\!\left( \frac{\rho}{\rho_0} \right) \quad {\rm and} \quad \log_{10}\!\left( \frac{\Sigma}{r_0\, \rho_0} \right) , \tag{8.88} $$

where $\Sigma$ is the projected density (see, for example, Table 4.1 and Fig. 4.7 of Binney & Tremaine 1987). The circular velocity, $v_c^2(r) = G\,M(r)/r$, of the isothermal sphere is obtained by integrating Poisson's equation (8.78) from $r = 0$ to $r = r'$, with $r^2\,(d \ln \rho_{\rm m}/dr) = -(G/\sigma^2)\, M(r)$, and

$$ v_c^2(r) = -\sigma^2\, \frac{d \ln \rho_{\rm m}(r)}{d \ln r} . \tag{8.89} $$

Numerical solution of the differential equation (8.86) shows that $v_c \rightarrow \sqrt{2}\,\sigma$ (constant) for large $r$.

The isothermal sphere is a useful model for describing elliptical galaxies within a few core radii, and disc galaxies because of the constant rotation curve. However, combining the two equations for $v_c^2$ above, one finds that $M(r) \approx (2\sigma^2/G)\, r$ for large $r$, i.e. the isothermal sphere has an infinite mass, as it is not bounded.
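The numerical solution of (8.86) can be sketched with a hand-written fourth-order Runge–Kutta integrator. This is an illustrative sketch only: the function name `isothermal_sphere`, the starting radius, the series start values and the step-size rule are assumptions, and a production code would normally use an adaptive library integrator.

```python
import math

def isothermal_sphere(x_max=1000.0, rel_step=0.01):
    """Integrate (8.86) outwards in x = r/r0.

    Returns lists of x, ln(rho/rho0) and (v_c/sigma)^2 = -d ln(rho)/d ln(x), cf. (8.89).
    """
    def rhs(x, w, y):
        # w = ln(rho~), y = dw/dx; (8.86) written as two first-order equations
        return y, -9.0 * math.exp(w) - 2.0 * y / x

    x = 1.0e-3
    w, y = -1.5 * x * x, -3.0 * x          # series solution close to the centre
    xs, ws, vc2 = [x], [w], [-x * y]
    while x < x_max:
        h = rel_step * x                   # step grows with radius (logarithmic spacing)
        k1w, k1y = rhs(x, w, y)
        k2w, k2y = rhs(x + 0.5 * h, w + 0.5 * h * k1w, y + 0.5 * h * k1y)
        k3w, k3y = rhs(x + 0.5 * h, w + 0.5 * h * k2w, y + 0.5 * h * k2y)
        k4w, k4y = rhs(x + h, w + h * k3w, y + h * k3y)
        w += h * (k1w + 2.0 * k2w + 2.0 * k3w + k4w) / 6.0
        y += h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        x += h
        xs.append(x)
        ws.append(w)
        vc2.append(-x * y)
    return xs, ws, vc2

xs, ws, vc2 = isothermal_sphere()
print("v_c/sigma at the outer edge:", math.sqrt(vc2[-1]))
```

At large radii the printed ratio should lie close to $\sqrt{2}$, approaching it through slowly damped oscillations. The step is chosen proportional to the radius so that the $2y/x$ term remains numerically stable near the centre.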
The Lowered Isothermal or King Model
We have thus seen that the class of models with $n = \infty$ contains as the simplest case the singular isothermal sphere. By forcing the central density to be finite, we are led to the isothermal sphere, which however has an infinite mass. The final model considered here within this class is the lowered isothermal model, or the King model,⁸ which forces not only a finite central density but also a cutoff in radius. These have a distribution function similar to that of the isothermal model, except for a cutoff in energy:

$$ f_{\rm m}(E) = \begin{cases} \dfrac{\rho_{\rm m1}}{(2\pi\sigma^2)^{3/2}} \left( e^{E/\sigma^2} - 1 \right) , & E > 0 \\ 0 , & E \le 0 . \end{cases} \tag{8.90} $$

The density distribution becomes

$$ \rho_{\rm m} = \rho_{\rm m1} \left[ e^{\Psi/\sigma^2}\, {\rm erf}\!\left( \frac{\sqrt{\Psi}}{\sigma} \right) - \sqrt{\frac{4\Psi}{\pi \sigma^2}} \left( 1 + \frac{2\Psi}{3\sigma^2} \right) \right] , \tag{8.91} $$
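with integration only to $E = 0$, as before. The Poisson equation (8.78) becomes

$$ \frac{d}{dr}\left( r^2\, \frac{d\Psi}{dr} \right) = -4\pi G\, \rho_{\rm m1}\, r^2 \left[ e^{\Psi/\sigma^2}\, {\rm erf}\!\left( \frac{\sqrt{\Psi}}{\sigma} \right) - \sqrt{\frac{4\Psi}{\pi \sigma^2}} \left( 1 + \frac{2\Psi}{3\sigma^2} \right) \right] . \tag{8.92} $$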
Again, this differential equation must be solved numerically for $\Psi(r)$ subject to the boundary conditions

$$ \Psi(0) , \qquad \left. \frac{d\Psi}{dr} \right|_{r=0} = 0 . \tag{8.93} $$

The density vanishes at $r = r_{\rm tid}$ (the tidal radius), where $\Psi(r = r_{\rm tid}) = 0$ also. A King model is thus limited in mass and has a finite central density, but the parameter $\sigma$ is not the velocity dispersion. It is rather related to the depth of the potential via the concentration parameter,

$$ W_o \equiv \frac{\Psi(0)}{\sigma^2} . \tag{8.94} $$

The concentration is defined as

$$ c \equiv \log_{10}\!\left( \frac{r_{\rm tid}}{r_o} \right) . \tag{8.95} $$

Fig. 8.5 The King concentration parameter $W_0$ as a function of $c$ (cf. Fig. 4–10 of Binney & Tremaine 1987). This figure has been produced by Andreas Kupper.

⁸ Note that King (1962) suggested a three-parameter (mass, core radius and cut-off/tidal radius) empirical projected (2D) density law that fits globular clusters very well. These do not have information on the velocity structure of the clusters. The King (non-analytical) 6D models, which are solutions of the Jeans equation ((8.120) below) and discussed here, are published by King (1966).

For globular clusters, $3 < W_o < 9$, $0.75 < c < 1.75$, and the relation between $W_o$ and $c$ is plotted in Fig. 8.5. Note also that the true core radius, defined as $\Sigma(R_c) = (1/2)\,\Sigma(0)$, where $\Sigma(R)$ is the projected density profile and $R$ is the projected radius, is unequal in general to the King radius, $r_0$ (8.76). Finally, it should be emphasised that it is not physical to use an arbitrary $r_{\rm tid}$. The tidal radius must always match the value dictated by the cluster mass and the host galaxy (e.g. (8.3)).
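The relation between $W_o$ and $c$ can be sketched numerically by integrating Poisson's equation for the King model outwards until the potential vanishes. The sketch below assumes the dimensionless form obtained with $w = \Psi/\sigma^2$ and $x = r/r_0$, where $r_0$ is the King radius (8.76) evaluated with the central density; the function names, starting radius and step rule are illustrative choices, not taken from the text.

```python
import math

def king_bracket(w):
    """Square bracket of (8.91) as a function of w = Psi/sigma^2; zero for w <= 0."""
    if w <= 0.0:
        return 0.0
    return (math.exp(w) * math.erf(math.sqrt(w))
            - math.sqrt(4.0 * w / math.pi) * (1.0 + 2.0 * w / 3.0))

def king_concentration(w0, rel_step=0.005):
    """Return c = log10(r_tid / r_0) for a King model with central value w0 = W_o."""
    rho0 = king_bracket(w0)

    def rhs(x, w, y):
        # x = r / r_0, w = Psi / sigma^2, y = dw/dx
        # (1/x^2) d/dx (x^2 dw/dx) = -9 rho(w) / rho(w0)
        return y, -9.0 * king_bracket(w) / rho0 - 2.0 * y / x

    x = 1.0e-3
    w, y = w0 - 1.5 * x * x, -3.0 * x      # series expansion about the centre
    while True:
        h = rel_step * x
        k1w, k1y = rhs(x, w, y)
        k2w, k2y = rhs(x + 0.5 * h, w + 0.5 * h * k1w, y + 0.5 * h * k1y)
        k3w, k3y = rhs(x + 0.5 * h, w + 0.5 * h * k2w, y + 0.5 * h * k2y)
        k4w, k4y = rhs(x + h, w + h * k3w, y + h * k3y)
        w_new = w + h * (k1w + 2.0 * k2w + 2.0 * k3w + k4w) / 6.0
        y_new = y + h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        if w_new <= 0.0:
            # linear interpolation to the radius at which Psi vanishes
            return math.log10(x + h * w / (w - w_new))
        x, w, y = x + h, w_new, y_new

for w0 in (3.0, 5.0, 7.0, 9.0):
    print(w0, king_concentration(w0))
```

The printed pairs can be compared by eye with the curve in Fig. 8.5; the concentration should increase monotonically with $W_o$.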
8.2.2 Comparison: Plummer vs King Models

The above discussion has served to show how various popular models can be followed through from a power-law distribution function (8.41) with different indices $n$. The Plummer model (p. 205) and the King model (p. 212) are particularly useful for describing star clusters. The Plummer model is determined by two parameters: the mass, $M$, and the scale radius, $r_h \approx 1.305\, r_{\rm pl}$. The King model requires three parameters: $M$, a scale radius, $r_h$, and a concentration parameter, $W_0$ or $c$. Which subset of parameters yields models that are similar in terms of the overall density profile?

To answer this, the mass is set to be constant, King models with different $W_0$ and $r_h$ are computed, and Plummer models are sought which minimise the reduced chi-squared value between the two density profiles. Figure 8.6 shows two examples of best-matching density profiles, and Fig. 8.7 reveals the family of Plummer profiles that best match King models with different concentrations. Note that a good match between the two is only obtained for intermediate-concentration King models ($2.5 \le W_0 \le 7.5$).

Fig. 8.6 Comparison of a King model (solid curve) with a Plummer model (dashed curve). Both have the same mass, and that Plummer model is sought which minimises the unweighted reduced chi-squared between the two models. The upper panel shows a high-concentration King model with $c = 2.55$ and $W_0 = 11$, and the best-fit Plummer model has $r_h^{\rm Plummer} = 0.366\, r_h^{\rm King}$ (rh $\equiv r_h$), as stated in the panel. The lower panel compares the two best matching models for the case of an intermediate-concentration King model. This figure was produced by Andreas Kupper.
8.2.3 Discretisation

To set up a computer model of a stellar system with $N$ particles (e.g. stars), the distribution functions need to be sampled $N$ times. The relevant distribution functions are the phase-space distribution function, the stellar initial mass function and the three distribution functions governing the properties of binary stars (periods, mass-ratios and eccentricities).

Fig. 8.7 The ratio $r_h^{\rm Plummer}/r_h^{\rm King}$ (rh $\equiv r_h$) for the best-matching Plummer and King models (Fig. 8.6) is plotted as a function of the King concentration parameter $W_0$. The uncertainties are unweighted reduced chi-squared values between the two density profiles. It is evident that there are no well-matching Plummer models for low- ($W_0 < 2.5$) and high-concentration ($W_0 > 7.5$) King models. This figure was produced by Andreas Kupper.
Assume the distribution function depends on the variable $\zeta_{\rm min} \le \zeta \le \zeta_{\rm max}$ (e.g. stellar mass $m$). There are various ways of sampling from a distribution function (Press et al. 1992), but the most efficient way is to use a generating function, if one exists. Consider the probability, $X(\zeta)$, of encountering a value for the variable in the range $\zeta_{\rm min}$ to $\zeta$,

$$ X(\zeta) = \int_{\zeta_{\rm min}}^{\zeta} p(\zeta')\, d\zeta' , \tag{8.96} $$

with $X(\zeta_{\rm min}) = 0 \le X(\zeta) \le X(\zeta_{\rm max}) = 1$, where $p(\zeta)$ is the distribution function normalised so that the latter equal sign holds ($X = 1$); $p(\zeta)$ is the probability density. The inverse of (8.96), $\zeta(X)$, is the generating function. It is a one-to-one map of the uniform distribution $X \in [0, 1]$ to $\zeta \in [\zeta_{\rm min}, \zeta_{\rm max}]$. If an analytical inverse does not exist, it can be found numerically in a straightforward manner, for example by constructing a table of $X, \zeta$ and then interpolating this table to obtain a $\zeta$ for a given $X$.
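The table-and-interpolate approach described above might look as follows. This is a minimal sketch using only the Python standard library; the function names `build_table` and `sample`, the uniform grid, the trapezoidal rule and the example density $p(\zeta) \propto \zeta$ are assumptions chosen for illustration.

```python
import bisect
import random

def build_table(pdf, zeta_min, zeta_max, n=2000):
    """Tabulate X(zeta) of (8.96) on a uniform grid with the trapezoidal rule."""
    step = (zeta_max - zeta_min) / n
    zetas = [zeta_min + i * step for i in range(n + 1)]
    cum = [0.0]
    for i in range(n):
        cum.append(cum[-1] + 0.5 * step * (pdf(zetas[i]) + pdf(zetas[i + 1])))
    total = cum[-1]
    # normalise so that X(zeta_max) = 1; pdf need not be normalised beforehand
    return zetas, [c / total for c in cum]

def sample(zetas, xs, rng):
    """Map one uniform variate X to zeta by linear interpolation in the table."""
    x = rng.random()
    j = bisect.bisect_right(xs, x)            # xs[j-1] <= x < xs[j]
    j = min(max(j, 1), len(xs) - 1)
    frac = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
    return zetas[j - 1] + frac * (zetas[j] - zetas[j - 1])

# illustrative density p(zeta) proportional to zeta on [0, 1]; its mean is 2/3
rng = random.Random(1)
zetas, xs = build_table(lambda z: z, 0.0, 1.0)
values = [sample(zetas, xs, rng) for _ in range(20000)]
print(sum(values) / len(values))
```

The sketch assumes that the density is positive inside the interval, so that the tabulated $X$ values increase strictly and the interpolation never divides by zero.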
Example: The Power-Law Stellar Mass Function

As an example, consider the distribution function

$$ \xi(m) = k\, m^{-\alpha} , \quad \alpha = 2.35 , \quad 0.5 \le \frac{m}{M_\odot} \le 150 . \tag{8.97} $$

The probability density is $p(m) = k_p\, m^{-\alpha}$, and $\int_{0.5}^{150} p(m)\, dm = 1 \Rightarrow k_p = 0.53$. Thus,

$$ X(m) = \int_{0.5}^{m} p(m')\, dm' = k_p\, \frac{m^{1-\alpha} - 0.5^{1-\alpha}}{1 - \alpha} , \tag{8.98} $$

and the generating function for stellar masses becomes

$$ m(X) = \left[ X\, \frac{1 - \alpha}{k_p} + 0.5^{1-\alpha} \right]^{\frac{1}{1-\alpha}} . \tag{8.99} $$

It is easy to programme this into an algorithm: obtain a random variate $X$ from a random number generator and use the above generating function to get a corresponding mass, $m$. Repeat $N$ times.
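The algorithm just described can be sketched in a few lines. The names `mass_from_variate` and `sample_masses` are illustrative, and the normalisation $k_p$ is computed from the limits rather than hard-coded so that it can be compared with the quoted value of 0.53.

```python
import random

ALPHA = 2.35
M_LO, M_HI = 0.5, 150.0      # mass limits in solar masses, from (8.97)

# normalisation k_p from the condition that p(m) integrates to one over [M_LO, M_HI]
K_P = (1.0 - ALPHA) / (M_HI ** (1.0 - ALPHA) - M_LO ** (1.0 - ALPHA))

def mass_from_variate(x):
    """Generating function (8.99): map a uniform variate x in [0, 1] to a mass in Msun."""
    return (x * (1.0 - ALPHA) / K_P + M_LO ** (1.0 - ALPHA)) ** (1.0 / (1.0 - ALPHA))

def sample_masses(n, seed=None):
    """Draw n stellar masses by applying the generating function to uniform variates."""
    rng = random.Random(seed)
    return [mass_from_variate(rng.random()) for _ in range(n)]

masses = sample_masses(1000, seed=42)
print("k_p =", K_P)
print("smallest and largest sampled mass:", min(masses), max(masses))
```

By construction `mass_from_variate(0)` returns the lower mass limit and `mass_from_variate(1)` the upper one, which gives a quick check that the generating function has been coded correctly.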
Generating a Plummer Model
Perhaps the most useful and simplest model of a bound stellar system is the Plummer model (p. 205). It is worth introducing the discretisation of this model in some detail, because analytical formulae go a long way, which is important for testing codes. A condensed form of this material is available in Aarseth, Henon and Wielen (1974).

The mass within radius $r$ is ($r_{\rm pl} = b$ here)

$$ M(r) = \int_0^r \rho_{\rm m}(r')\, 4\pi\, r'^2\, dr' = M_{\rm cl}\, \frac{(r/r_{\rm pl})^3}{\left[ 1 + (r/r_{\rm pl})^2 \right]^{3/2}} . \tag{8.100} $$

A number uniformly distributed between zero and one can then be defined,

$$ X_1(r) = \frac{M(r)}{M_{\rm cl}} = \frac{\zeta^3}{\left[ 1 + \zeta^2 \right]^{3/2}} , \tag{8.101} $$

where $\zeta \equiv r/r_{\rm pl}$ and $X_1(r = \infty) = 1$. This function can be inverted to yield the generating function for particle distances distributed according to a Plummer density law,

$$ \zeta(X_1) = \left( X_1^{-2/3} - 1 \right)^{-1/2} . \tag{8.102} $$
The coordinates of the particles, $x, y, z$, with $r^2 = (\zeta\, r_{\rm pl})^2 = x^2 + y^2 + z^2$, can be obtained as follows. For a given particle we already have $r$. For all possible $x$ and $y$, $z$ has a uniform distribution, $p(z) = {\rm const} = 1/(2\,r)$, over the range $-r \le z \le +r$. Thus, for a second random variate between zero and one,

$$ X_2(z) = \int_{-r}^{z} p(z')\, dz' = \frac{1}{2\,r}\, (z + r) , \tag{8.103} $$

with $X_2(+r) = 1$. The generating function for $z$ becomes

$$ z(X_2) = 2\, r\, X_2 - r . \tag{8.104} $$

Having obtained $r$ and $z$, $x$ and $y$ can be arrived at as follows, noting the equation for a circle, $r^2 - z^2 = x^2 + y^2$. Choose a random angle $\theta$, which is uniformly distributed over the range $0 \le \theta \le 2\pi$. Thus $p(\theta) = 1/(2\pi)$, and the third random variate becomes

$$ X_3(\theta) = \int_0^{\theta} \frac{1}{2\pi}\, d\theta' = \frac{\theta}{2\pi} . \tag{8.105} $$

The corresponding generating function is

$$ \theta(X_3) = 2\pi\, X_3 . \tag{8.106} $$

Finally,

$$ x = \left( r^2 - z^2 \right)^{1/2} \cos\theta \quad {\rm and} \quad y = \left( r^2 - z^2 \right)^{1/2} \sin\theta . \tag{8.107} $$
The velocity for each particle cannot be obtained as simply as the positions. In order for the initial stellar system to be in virial equilibrium, the potential and kinetic energy need to balance according to the scalar virial theorem. This is ensured by forcing the velocity distribution function to be that of the Plummer model,

$$ f_{\rm m}(\epsilon_e) = \begin{cases} \left( \dfrac{24\sqrt{2}}{7\pi^3}\, \dfrac{r_{\rm pl}^2}{(G\, M_{\rm cl})^5} \right) (-\epsilon_e)^{7/2} , & \epsilon_e \le 0 \\ 0 , & \epsilon_e > 0 , \end{cases} \tag{8.108} $$

where

$$ \epsilon_e(r, v) = \Phi(r) + \frac{1}{2}\, v^2 \tag{8.109} $$

is the specific energy per star and

$$ \Phi(r) = -\frac{G\, M_{\rm cl}}{r_{\rm pl}} \left( 1 + \left( \frac{r}{r_{\rm pl}} \right)^2 \right)^{-1/2} \tag{8.110} $$
is the potential. Now the Plummer distribution function can be expressed in terms of $r$ and $v$,

$$ f(r, v) = f_o \left( -\Phi(r) - \frac{1}{2}\, v^2 \right)^{7/2} , \tag{8.111} $$

for a normalisation constant $f_o$, and dropping the mass subscript because we assume the positions and velocities do not depend on particle mass. With the escape speed at distance $r$ from the Plummer centre, $v_{\rm esc}(r) = \sqrt{-2\,\Phi(r)} \equiv v/\zeta$, it follows that

$$ f(r, v) = f_o \left( \frac{1}{2}\, v_{\rm esc}^2 \right)^{7/2} \left( 1 - \zeta^2 \right)^{7/2} . \tag{8.112} $$

The number of particles with speeds in the interval $v$ to $v + dv$ is

$$ dN = f(r, v)\, 4\pi\, v^2\, dv \equiv g(v)\, dv . \tag{8.113} $$

Thus,

$$ g(v) = 4\pi\, f_o\, v_{\rm esc}^2(r) \left( \frac{1}{2}\, v_{\rm esc}^2(r) \right)^{7/2} \left( 1 - \zeta^2(r) \right)^{7/2} \zeta^2(r) , \tag{8.114} $$

that is,

$$ g(\zeta) = g_o\, \zeta^2(r) \left( 1 - \zeta^2(r) \right)^{7/2} \tag{8.115} $$
for a normalisation constant $g_o$ determined by demanding that

$$ X_4(\zeta = 1) = 1 = \int_0^1 g(\zeta')\, d\zeta' \tag{8.116} $$

for a fourth random number variate, $X_4(\zeta) = \int_0^{\zeta} g(\zeta')\, d\zeta'$. It follows that $g_o = 512/(7\pi)$ and

$$ X_4(\zeta) = \frac{512}{7\pi} \int_0^{\zeta} \zeta'^2 \left( 1 - \zeta'^2 \right)^{7/2} d\zeta' . \tag{8.117} $$

This cannot be inverted to obtain an analytical generating function for $\zeta = \zeta(X_4)$. Therefore, numerical methods need to be used to solve (8.117). For example, one way to obtain $\zeta$ for a given random variate $X_4$ is to find the root of the equation $0 = \frac{512}{7\pi} \int_0^{\zeta} \zeta'^2 (1 - \zeta'^2)^{7/2}\, d\zeta' - X_4$, or one can use the von Neumann rejection method (Press et al. 1992).
The following procedure can be implemented to calculate the velocity vector of a particle for which $r$ and $\zeta$ are already known from above. Compute $v_{\rm esc}(r)$ so that $v = \zeta\, v_{\rm esc}$. Each speed $v$ is then split into its components $v_x, v_y, v_z$, assuming velocity isotropy, using the same algorithm as above for $x, y, z$:

$$ v_z(X_5) = (2\,X_5 - 1)\, v , \qquad \theta(X_6) = 2\pi\, X_6 , \tag{8.118} $$

$$ v_x = \sqrt{v^2 - v_z^2}\, \cos\theta , \qquad v_y = \sqrt{v^2 - v_z^2}\, \sin\theta . \tag{8.119} $$

Note that a rotating Plummer model can be generated by simply switching the signs of $v_x$ and $v_y$ so that all particles have the same direction of motion in the $x, y$ plane.

As an aside, an efficient numerical method to set up triaxial ellipsoids with or without an embedded rotating disc is described by Boily, Kroupa & Penarrubia-Garrido (2001).
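The velocity recipe can be sketched with the rejection method mentioned above, applied directly to the shape $\zeta^2 (1 - \zeta^2)^{7/2}$ of (8.115), so that no normalisation constant is needed. The function names, the bounding value 0.1 and the use of the product $G\,M_{\rm cl}$ as a single argument are illustrative choices.

```python
import math
import random

def plummer_speed_ratio(rng):
    """Draw zeta = v / v_esc from the shape of g(zeta) in (8.115) by rejection."""
    while True:
        zeta = rng.random()
        # 0.1 lies above the maximum of zeta^2 (1 - zeta^2)^(7/2), which is about 0.092
        trial = 0.1 * rng.random()
        if trial < zeta * zeta * (1.0 - zeta * zeta) ** 3.5:
            return zeta

def plummer_velocity(r, r_pl, g_m_cl, rng):
    """Return (vx, vy, vz) for a particle at radius r; g_m_cl is the product G * M_cl."""
    # escape speed from the potential (8.110): v_esc^2 = -2 Phi(r)
    v_esc = math.sqrt(2.0 * g_m_cl / math.sqrt(r * r + r_pl * r_pl))
    v = plummer_speed_ratio(rng) * v_esc
    vz = (2.0 * rng.random() - 1.0) * v              # (8.118)
    theta = 2.0 * math.pi * rng.random()
    vt = math.sqrt(max(v * v - vz * vz, 0.0))
    return vt * math.cos(theta), vt * math.sin(theta), vz   # (8.119)

rng = random.Random(7)
ratios = [plummer_speed_ratio(rng) for _ in range(20000)]
print("mean of zeta^2:", sum(z * z for z in ratios) / len(ratios))
```

For this distribution the mean of $\zeta^2$ is $1/4$, i.e. the local mean-square speed is $v_{\rm esc}^2/4$, so the printed value offers a simple test of the sampler.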
Generating an Arbitrary Spherical Non-Rotating Model
In most cases an analytical density distribution is not known (e.g. the King models above). Such numerical models can nevertheless be discretised straightforwardly, as follows. Assume that the density distribution, $\rho(r)$, is known. Compute $M(r)$ and $M_{\rm cl}$. Define $X(r) = M(r)/M_{\rm cl}$, as above. We thus have a numerical grid of numbers $r, M(r), X(r)$. For a given random variate $X \in [0, 1]$, interpolate $r$ from this grid. Compute $x, y, z$ as above.

If the distribution function of speeds is too complex to yield an analytical generating function, $X(\zeta)$, for the speeds $\zeta$, we can resort to the following procedure. One of the Jeans equations for a spherical system is

$$ \frac{d}{dr}\left( \rho(r)\, \sigma_r(r)^2 \right) + \frac{\rho(r)}{r} \left[ 2\,\sigma_r^2(r) - \left( \sigma_\theta(r)^2 + \sigma_\phi(r)^2 \right) \right] = -\rho(r)\, \frac{d\Phi(r)}{dr} . \tag{8.120} $$

For velocity isotropy, $\sigma_r^2 = \sigma_\theta^2 = \sigma_\phi^2$, this reduces to

$$ \frac{d\left( \rho\, \sigma_r^2 \right)}{dr} = -\rho\, \frac{d\Phi}{dr} . \tag{8.121} $$

Integrating this by making use of $\rho \rightarrow 0$ as $r \rightarrow \infty$, and remembering that $d\Phi/dr = G\,M(r)/r^2$,

$$ \sigma_r^2(r) = \frac{1}{\rho(r)} \int_r^{\infty} \rho(r')\, \frac{G\,M(r')}{r'^2}\, dr' . \tag{8.122} $$

For each particle at distance $r$, a one-dimensional velocity dispersion, $\sigma_r(r)$, is thus obtained. Choosing randomly from a Gaussian distribution with dispersion $\sigma_i$, $i = r, \theta, \phi, x, y, z$, then gives the velocity components (e.g. $v_x, v_y, v_z$) for this particle.
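The integral (8.122) can be evaluated on a radial grid with simple quadrature. The sketch below assumes an isotropic model, a user-supplied density function and a grid that extends far enough out for the density to be negligible at its edge; the function name `jeans_sigma_r`, the trapezoidal rule and the Plummer test density (in units with $G = M_{\rm cl} = r_{\rm pl} = 1$) are illustrative choices.

```python
import math
import random

def jeans_sigma_r(density, r_grid, g_const=1.0):
    """Return sigma_r on r_grid from (8.122) for an isotropic spherical model."""
    n = len(r_grid)
    # enclosed mass M(r): small central sphere, then trapezoidal rule for 4 pi r^2 rho
    mass = [4.0 / 3.0 * math.pi * r_grid[0] ** 3 * density(r_grid[0])]
    for i in range(1, n):
        r0, r1 = r_grid[i - 1], r_grid[i]
        f0 = 4.0 * math.pi * r0 * r0 * density(r0)
        f1 = 4.0 * math.pi * r1 * r1 * density(r1)
        mass.append(mass[-1] + 0.5 * (r1 - r0) * (f0 + f1))
    # integrand rho G M / r^2, accumulated inwards from the outer grid edge
    integrand = [density(r) * g_const * m / (r * r) for r, m in zip(r_grid, mass)]
    tail = [0.0] * n
    for i in range(n - 2, -1, -1):
        dr = r_grid[i + 1] - r_grid[i]
        tail[i] = tail[i + 1] + 0.5 * dr * (integrand[i] + integrand[i + 1])
    return [math.sqrt(t / density(r)) for t, r in zip(tail, r_grid)]

def plummer_density(r):
    """Plummer density with G = M_cl = r_pl = 1, used here only as a test case."""
    return 3.0 / (4.0 * math.pi) * (1.0 + r * r) ** -2.5

grid = [1.0e-3 * 10.0 ** (6.0 * i / 3999) for i in range(4000)]   # logarithmic grid
sigma = jeans_sigma_r(plummer_density, grid)

# velocity components of one particle located at the grid radius closest to r = 1
i = min(range(len(grid)), key=lambda k: abs(grid[k] - 1.0))
rng = random.Random(3)
velocity = [rng.gauss(0.0, sigma[i]) for _ in range(3)]
print(grid[i], sigma[i], velocity)
```

For the Plummer test case the result can be compared with the analytical dispersion $\sigma_r^2 = (1/6)\,(1 + r^2)^{-1/2}$ in the same units. Drawing Gaussian components does not enforce $v < v_{\rm esc}$, so a real setup may want to redraw the occasional unbound velocity.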
Rotating Models
Star clusters are probably born with some rotation, because the pre-cluster cloud core is likely to have contracted from a cloud region with differential motions that do not cancel. How large this initial angular momentum content of an embedded cluster is remains uncertain, because the dominant motions are random and chaotic owing to the turbulent velocity field of the gas. Once the star-formation process is quenched as a result of gas blow-out (Sect. 8.1.1), the cluster expands. This must imply substantial reduction in the rotational velocity. A case in point is ω Cen, which has been found to rotate with a peak velocity of about 7 km s$^{-1}$ (Pancino et al. 2007 and references therein).

A setup for rotating cluster models is easily made, for instance, by increasing the tangential velocities of stars by a certain factor. A systematic study of relaxation-driven angular momentum re-distribution within star clusters has become available through the work of the group of Rainer Spurzem and Hyung-Mok Lee, and the interested reader is directed to that body of work (Kim et al. 2008 and references therein). One important outcome of this work is that core collapse is substantially accelerated in rotating models. The primary reason for this is that increased rotational support reduces the role of support through random velocities for the same cluster dimension. Thus, the relative stellar velocities decrease and the stars exchange momentum and energy more efficiently, enhancing two-body relaxation and thence the approach towards energy equipartition.
8.2.4 Cluster Birth and Young Clusters

Some astrophysical issues related to the initial conditions of star clusters have been raised in Sect. 8.1.1. In order to address most of these issues, numerical experiments are required. The very initial phase, the first 0.5 Myr, can only be treated through gas-dynamical computations that, however, lack the numerical resolution for the high-precision stellar-dynamical integrations, which are the essence of collisional dynamics during the gas-free phase of a cluster's life. This gas-free stage sets in with the blow-out of residual gas at an age of about 0.5–1.5 Myr. The time 0.5–1.5 Myr is dominated by the physics of stellar feedback and radiation transport in the residual gas, as well as energy and momentum transfer to it through stellar outflows. The gas-dynamical computations cannot treat all the physical details of the processes acting during this critical time, which also include early stellar-dynamical processes such as mass segregation and binary–binary encounters.

One successful procedure to investigate the dominant macroscopic physical processes of these stellar-dynamical processes, gas blow-out and the ensuing cluster expansion through to the long-term evolution of the remnant cluster, is to approximate the residual gas component as a time-varying potential in which the young stellar population is trapped. The pioneering work using this approach has been performed by Lada, Margulis & Dearborn (1984), whereby the earlier numerical work by Tutukov (1978) on open clusters and later $N$-body computations by Goodwin (1997a,b, 1998) on globular clusters must also be mentioned in this context.

The physical key quantities that govern the emergence of embedded clusters from their clouds and their subsequent appearance are (Baumgardt, Kroupa & Parmentier 2008; Sect. 8.1.1):

• sub-structuring,
• initial mass segregation,
• the dynamical state at feedback termination (dynamical equilibrium, collapsing or expanding),
• the star-formation efficiency, $\epsilon$,
• the ratio of the gas-expulsion time-scale to the stellar crossing time through the embedded cluster, $\tau_{\rm gas}/t_{\rm cross}$, and
• the ratio of the embedded-cluster half-mass radius to its tidal radius, $r_h/r_t$.
It becomes rather apparent that the physical processes governing the emergence of star clusters from their natal clouds are terribly messy, and the research field is clearly observationally driven. Observations have shown that star clusters suffer substantial infant weight loss and that probably about 90 % of all clusters disperse altogether (infant mortality). This result is consistent with the observational insight that clusters form in a compact configuration with a low star-formation efficiency ($0.2 \le \epsilon \le 0.4$) and that residual-gas blow-out occurs on a time-scale comparable to or even shorter than an embedded-cluster crossing time-scale (Kroupa 2005). Theoretical work can give a reasonable description of these empirical findings by combining some of the above parameters, such as an effective star-formation efficiency as a measure of the amount of gas removed for a cluster of a given stellar mass, if this cluster were in dynamical equilibrium at feedback termination and the gas and stars were distributed according to the same radial density function with the same scaling radius.
Embedded Clusters. One way to parameterise an embedded cluster is to set up a Plummer model in which the stellar positions follow a density law with the parameters $M_{\rm ecl}$ and $r_{\rm pl}$, and the residual gas is a time-varying Plummer potential initially with the parameters $M_{\rm gas}$ and $r_{\rm pl}$, i.e. modelled with the same radial density law. The effective star-formation efficiency is then given by (8.2). Stellar velocities must then be calculated from a Plummer law with total mass $M_{\rm ecl} + M_{\rm gas}$, following the recipes of Sect. 8.2.3. The gas can be removed by evolving $M_{\rm gas}$ or $r_{\rm pl}$. For example, Kroupa, Aarseth & Hurley (2001) and Baumgardt, Kroupa & Parmentier (2008) assumed the gas mass decreases exponentially after an embedded phase lasting about 0.5 Myr, during which the cluster is allowed to evolve in dynamical equilibrium. Bastian & Goodwin (2006), as another example, do not include a gas potential but take the initial velocities of stars to be $1/\sqrt{\epsilon}$ times larger, $v_{\rm embedded} = (1/\sqrt{\epsilon})\, v_{\rm no\ gas}$, to model the effect of instantaneous gas removal. Many variations of these assumptions are possible, and Adams (2000), for example, investigated the fraction of stars left in a cluster remnant if the radial scale length of the gas is different to that of the stars, i.e. for a radially dependent star-formation efficiency, $\epsilon(r)$.
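The ingredients of such an embedded-cluster setup can be sketched as small helper functions. Several points are assumptions made for illustration only: the effective star-formation efficiency is taken to be $\epsilon = M_{\rm ecl}/(M_{\rm ecl} + M_{\rm gas})$, since (8.2) is not reproduced in this section, and the decay time `tau_gas` and all function names are hypothetical.

```python
import math

def star_formation_efficiency(m_ecl, m_gas):
    """Effective SFE, assumed here to be eps = M_ecl / (M_ecl + M_gas)."""
    return m_ecl / (m_ecl + m_gas)

def gas_mass(t, m_gas0, t_delay=0.5, tau_gas=0.1):
    """Gas mass at time t [Myr]: constant while embedded, then exponential decay.

    t_delay is the length of the embedded phase (about 0.5 Myr in the text);
    tau_gas is a hypothetical decay time-scale chosen only for this example.
    """
    if t <= t_delay:
        return m_gas0
    return m_gas0 * math.exp(-(t - t_delay) / tau_gas)

def velocity_boost(eps):
    """Factor 1/sqrt(eps) applied to gas-free equilibrium speeds to mimic
    instantaneous gas removal."""
    return 1.0 / math.sqrt(eps)

eps = star_formation_efficiency(1000.0, 2000.0)
print("eps =", eps, " boost =", velocity_boost(eps))
print("gas mass at 0.7 Myr:", gas_mass(0.7, 2000.0))
```

In an $N$-body run the time-dependent gas mass would enter through the external Plummer potential, while the velocity boost is an alternative that dispenses with the gas potential altogether.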
Subclustering. Initial subclustering has been barely studied. Scally & Clarke (2002) considered the degree of sub-structuring of the ONC allowed by its current morphology, while Fellhauer & Kroupa (2005) computed the evolution of massive star-cluster complexes assuming each member cluster in the complex undergoes its own individual gas-expulsion process. McMillan, Vesperini & Portegies Zwart (2007) showed that initially mass-segregated subclusters retain mass segregation upon merging. This is an interesting mechanism for accelerating dynamical mass segregation, because it occurs faster in smaller-$N$ systems, which have a shorter relaxation time.

The simplest initial conditions for such numerical experiments are to set up the star-cluster complex (or proto-ONC-type cluster, for example) as a Plummer model, where each particle is a smaller subcluster. Each subcluster is also a Plummer model, embedded in a gas potential given as a Plummer model. The gas-expulsion process from each subcluster can be treated as above.

Mass Segregation and Gas Blow-Out. The problem of how initially mass-segregated clusters react to gas blow-out has not been studied at all in the past. This is due partially to the lack of convenient algorithms to set up mass-segregated clusters that are in dynamical equilibrium and which do not go into core collapse as soon as the $N$-body integration begins. An interesting consequence here is that gas blow-out will unbind mostly the low-mass stars, while the massive stars are retained. These, however, evolve rapidly, so that the mass lost from the remnant cluster owing to the evolution of the massive stars can become destructive, enhancing infant mortality.

Ladislav Subr has developed a numerically efficient method to set up initially mass-segregated clusters close to core-collapse, based on a novel concept that uses the potentials of subsets of stars ordered by their mass (Subr, Kroupa & Baumgardt 2008).⁹ An alternative algorithm, based on ordering the stars by increasing mass and increasing total energy, which leads to total mass segregation and also to a model that is not in core collapse and which therefore evolves towards core collapse, has been developed by Baumgardt, Kroupa & de Marchi (2008). An application concerning the effect on the observed stellar mass function in globular clusters shows that gas expulsion leads to bottom-light stellar mass functions in clusters with a low concentration, consistent with observational data (Marks, Kroupa & Baumgardt 2008).
8.3 The Stellar IMF

The stellar initial mass function (IMF), $\xi(m)\, dm$, where $m$ is the stellar mass, is the parent distribution function of the masses of stars formed in one event. Here, the number of stars in the mass interval $m$ to $m + dm$ is

$$ dN = \xi(m)\, dm . \tag{8.123} $$

The IMF is, strictly speaking, an abstract theoretical construct, because any observed system of $N$ stars merely constitutes a particular representation of this universal distribution function, if such a function exists (Elmegreen 1997; Maíz Apellániz & Úbeda 2005). The probable existence of a unique $\xi(m)$ can be inferred from the observations of an ensemble of systems, each consisting of $N$ stars (e.g. Massey 2003). If, after corrections for (a) stellar evolution, (b) unknown multiple stellar systems and (c) stellar-dynamical biases, the individual distributions of stellar masses are similar within the expected statistical scatter, we (the community) deduce that the hypothesis that the stellar mass distributions are not the same can be excluded. That is, we make the case for a universal, standard or canonical stellar IMF within the physical conditions probed by the relevant physical parameters (metallicity, density, mass) of the populations at hand.

Related overviews of the IMF can be found in Kroupa (2002a), Chabrier (2003), Bonnell, Larson & Zinnecker (2007) and Kroupa (2007a), and a review with an emphasis on the metal-rich problem is available in Kroupa (2007b). Zinnecker & Yorke (2007) provide an in-depth review of the formation and distribution of massive stars. Elmegreen (2007) discusses the possibility that star formation occurs in different modes with different IMFs.

⁹ The C-language software package plumix may be downloaded from the website http://www.astro.uni-bonn.de/~webaiub/english/downloads.php
8.3.1 The Canonical or Standard Form of the Stellar IMF

The canonical stellar IMF is a two-part power law (8.128). The only structure found with confidence so far is the change of index from the Salpeter/Massey value to a smaller one near $0.5\,M_\odot$:¹⁰

$$ \begin{aligned} \xi(m) &\propto m^{-\alpha_i} , \quad i = 1, 2 , \\ \alpha_1 &= 1.3 \pm 0.3 , \quad 0.08 \le m/M_\odot \le 0.5 , \\ \alpha_2 &= 2.3 \pm 0.5 , \quad 0.5 \le m/M_\odot \le m_{\rm max} , \end{aligned} \tag{8.124} $$

where $m_{\rm max} \le m_{\rm max*} \approx 150\,M_\odot$ follows from Fig. 8.1. Brown dwarfs have been found to form a separate population with $\alpha_0 \approx 0.3 \pm 0.5$ (8.129) (Thies & Kroupa 2007).
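Masses can be drawn from the two-part power law with a piecewise generating function. The sketch below adopts the central values of the indices, takes $150\,M_\odot$ as the upper limit and joins the two segments continuously at $0.5\,M_\odot$; these choices, and all names in the code, are assumptions made for illustration.

```python
import random

ALPHAS = (1.3, 2.3)
EDGES = (0.08, 0.5, 150.0)    # segment limits in Msun; 150 Msun assumed as m_max

def _segment_integral(alpha, lo, hi):
    """Integral of m^-alpha from lo to hi (valid for alpha != 1)."""
    return (hi ** (1.0 - alpha) - lo ** (1.0 - alpha)) / (1.0 - alpha)

# continuity of xi(m) at 0.5 Msun fixes the coefficient of the second segment
COEFFS = (1.0, EDGES[1] ** (ALPHAS[1] - ALPHAS[0]))
# unnormalised number of stars contained in each segment
WEIGHTS = [c * _segment_integral(a, EDGES[i], EDGES[i + 1])
           for i, (c, a) in enumerate(zip(COEFFS, ALPHAS))]

def canonical_mass(x):
    """Map a uniform variate x in [0, 1] to a stellar mass in Msun."""
    target = x * sum(WEIGHTS)
    last = len(ALPHAS) - 1
    for i, (c, a) in enumerate(zip(COEFFS, ALPHAS)):
        if target <= WEIGHTS[i] or i == last:
            lo = EDGES[i]
            # invert the cumulative number within segment i
            return (target * (1.0 - a) / c + lo ** (1.0 - a)) ** (1.0 / (1.0 - a))
        target -= WEIGHTS[i]

rng = random.Random(4)
masses = [canonical_mass(rng.random()) for _ in range(10000)]
print("fraction of stars below 0.5 Msun:", sum(m < 0.5 for m in masses) / len(masses))
```

With these limits roughly three quarters of the sampled stars fall below $0.5\,M_\odot$, which follows directly from the ratio of the two segment weights.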
It has been corrected for bias through unresolved multiple stellar systems in the low-mass ($m < 1\,M_\odot$) regime (Kroupa, Gilmore & Tout 1991) by a multi-dimensional optimisation technique. The general outline of this technique is as follows (Kroupa, Tout & Gilmore 1993). First, the correct form of the stellar-mass–luminosity relation is extracted using observed stellar binaries and theoretical constraints on the location, amplitude and shape of the minimum of its derivative, $dm/dM_V$, near $m = 0.3\,M_\odot$, $M_V \approx 12$, $M_I \approx 9$, in combination with the observed shape of the nearby and deep Galactic-field stellar luminosity function (LF),

$$ \Psi(M_V) = -\left( \frac{dm}{dM_V} \right)^{-1} \xi(m) , \tag{8.125} $$

where $dN = \Psi(M_V)\, dM_V$ is the number of stars in the magnitude interval $M_V$ to $M_V + dM_V$. Once the semi-empirical mass–luminosity relation of stars, which is an excellent fit to the most recent observational constraints by Delfosse et al. (2000), is established, a model of the Galactic field is calculated assuming a parameterised form for the MF, different values for the scale-height of the Galactic disc and different binary fractions in it. Measurement uncertainties and age and metallicity spreads must also be considered in the theoretical stellar population. Optimisation in this multi-parameter space (MF parameters, scale-height and binary population) against observational data leads to the canonical stellar MF for $m < 1\,M_\odot$.
One important result from this work is the finding that the LF of main-sequence stars has a universal sharp peak near $M_V \approx 12$, $M_I \approx 9$. It results from changes in the internal constitution of stars that drive a non-linearity in the stellar mass–luminosity relation. A consistency check is then performed as follows. The above MF is used to create young populations of binary systems (Sect. 8.4.2) that are born in modest star clusters consisting of a few hundred stars. Their dissolution into the Galactic field is computed with an $N$-body code, and the resulting theoretical field is compared to the observed LFs (Fig. 8.8). Further confirmation of the form of the canonical IMF comes from independent sources, most notably by Reid, Gizis & Hawley (2002) and also Chabrier (2003).

Fig. 8.8 The Galactic field population that results from disrupted star clusters: unification of both the nearby (solid histogram) and deep (filled circles) LFs with one parent MF (8.124). The theoretical nearby LF (dashed line) is the LF of all individual stars, while the solid curve is a theoretical LF with a mixture of about 50 per cent unresolved binaries and single stars from a clustered star-formation mode. According to this model, all stars are formed as binaries in modest clusters, which disperse to the field. The resulting Galactic field population has a binary fraction and a mass-ratio distribution as observed. The dotted curve is the initial system LF (100 % binaries) (Kroupa 1995a,b). Note the peak in both theoretical LFs. It stems from the extremum in the derivative of the stellar-mass–luminosity relation in the mass range 0.2–0.4 $M_\odot$ (Kroupa 2002b).

¹⁰ The uncertainties in $\alpha_i$ are estimated from the alpha-plot (Sect. 8.3.2), as shown in Fig. 5 of Kroupa (2002b), to be about 95 % confidence limits.
In the high-mass regime, Massey (2003) reports the same slope or index, $\alpha_3 = 2.3 \pm 0.1$ for $m \ge 10\,M_\odot$, in many OB associations and star clusters in the Milky Way and the Large and Small Magellanic clouds (LMC and SMC, respectively). It is therefore suggested to refer to $\alpha_2 = \alpha_3 = 2.3$ as the Salpeter/Massey slope or index, given the pioneering work of Salpeter (1955), who derived this value for stars with masses 0.4–10 $M_\odot$.

Multiplicity corrections await publication once we learn more about how the components are distributed in massive stars (cf. Preibisch et al. 1999; Zinnecker 2003). Weidner & Kroupa (private communication) are in the process of performing a very detailed study of the influence of unresolved binary and higher-order multiple stars on determinations of the high-mass IMF.

Contrary to the Salpeter/Massey index ($\alpha = 2.3$), Scalo (1986) found $\alpha_{\rm MWdisc} \approx 2.7$ ($m \ge 1\,M_\odot$) from a very thorough analysis of OB star counts in the Milky Way disc. Similarly, the star-count analysis of Reid, Gizis & Hawley (2002) leads to $2.5 \le \alpha_{\rm MWdisc} \le 2.8$, and Tinsley (1980), Kennicutt (1983) (his extended Miller-Scalo IMF), Portinari, Sommer-Larsen & Tantalo (2004) and Romano et al. (2005) find $2.5 \le \alpha_{\rm MWdisc} \le 2.7$. That $\alpha_{\rm MWdisc} > \alpha_2$ follows naturally is shown in Sect. 8.3.4.

Below the hydrogen-burning limit (see also Sect. 8.3.3) there is substantial evidence that the IMF flattens further to $\alpha_0 \approx 0.3 \pm 0.5$ (Martín et al. 2000; Chabrier 2003; Moraux et al. 2004). Therefore, the canonical IMF most likely has a peak at $0.08\,M_\odot$. Brown dwarfs, however, comprise only a few per cent of the mass of a population and are therefore dynamically irrelevant (Table 8.2). The logarithmic form of the canonical IMF,

$$ \xi_{\rm L}(m) = \ln(10)\; m\; \xi(m) , \tag{8.126} $$

which gives the number of stars in $\log_{10} m$-intervals, also has a peak near $0.08\,M_\odot$. However, the system IMF (of stellar single and multiple systems combined to system masses) has a maximum in the mass range 0.4–0.6 $M_\odot$ (Kroupa et al. 2003).
The above canonical or standard form has been derived from detailed considerations of Galactic field star counts and so represents an average IMF. For low-mass stars it is a mixture of stellar populations spanning a large range of ages (0–10 Gyr) and metallicities ([Fe/H] $\ge -1$). For the massive stars it constitutes a mixture of different metallicities ([Fe/H] $\ge -1.5$) and star-forming conditions (OB associations to very dense star-burst clusters: R136 in the LMC). Therefore it can be taken as a canonical form, and the aim is to test the
IMF universality hypothesis: that the canonical IMF constitutes the parent distribution of all stellar populations.
Negation of this hypothesis would imply a variable IMF. Note that the work of Massey (2003) has already established the IMF to be invariable for $m \ge 10\,M_\odot$ and for densities $\rho \le 10^5$ stars pc$^{-3}$ and metallicity $Z \ge 0.002$.
Finally, Table 8.2 compiles some numbers that are useful for simple insights into stellar populations.
8.3.2 Universality of the IMF: Resolved Populations
The strongest test of the IMF universality hypothesis (p. 225) is obtained by studying populations that can be resolved into individual stars. Because we also seek co-eval populations with stars at the same distance and with the same metallicity to minimise uncertainties, star clusters and stellar associations would seem to be the test objects of choice. But before contemplating such work, some lessons from stellar dynamics are useful.
Table 8.2 The number fraction $\eta_N = 100 \int_{m_1}^{m_2} \xi(m)\,{\rm d}m \,/ \int_{m_l}^{m_u} \xi(m)\,{\rm d}m$ and the mass fraction $\eta_M = 100 \int_{m_1}^{m_2} m\,\xi(m)\,{\rm d}m \,/\, M_{\rm cl}$, with $M_{\rm cl} = \int_{m_l}^{m_u} m\,\xi(m)\,{\rm d}m$, in per cent of BDs or main-sequence stars in the mass interval $m_1$ to $m_2$, and the stellar contribution $\rho^{\rm st}$ to the Oort limit and to the Galactic-disc surface mass-density $\Sigma^{\rm st} = 2\,h\,\rho^{\rm st}$ near to the Sun, with $m_l = 0.01\,M_\odot$, $m_u = 120\,M_\odot$ and the Galactic-disc scale-height $h = 250$ pc ($m < 1\,M_\odot$; Kroupa, Tout & Gilmore 1993) and $h = 90$ pc ($m > 1\,M_\odot$; Scalo 1986). Results are shown for the canonical IMF (8.124), for the high-mass-star IMF approximately corrected for unresolved companions ($\alpha_3 = 2.7$, $m > 1\,M_\odot$) and for the present-day mass function (PDMF, $\alpha_3 = 4.5$; Scalo 1986; Kroupa, Tout & Gilmore 1993), which describes the distribution of stellar masses now populating the Galactic disc. For gas in the disc $\Sigma^{\rm gas} = 13 \pm 3\,M_\odot/{\rm pc}^2$ and for remnants $\Sigma^{\rm rem} \approx 3\,M_\odot/{\rm pc}^2$ (Weidemann 1990). The average stellar mass is $\overline{m} = \int_{m_l}^{m_u} m\,\xi(m)\,{\rm d}m \,/ \int_{m_l}^{m_u} \xi(m)\,{\rm d}m$. $N_{\rm cl}$ is the number of stars that have to form in a star cluster so that the most massive star in the population has the mass $m_{\rm max}$. The mass of this population is $M_{\rm cl}$, and the condition is $\int_{m_{\rm max}}^{\infty} \xi(m)\,{\rm d}m = 1$ with $\int_{0.01}^{m_{\rm max}} \xi(m)\,{\rm d}m = N_{\rm cl} - 1$. $\Delta M_{\rm cl}/M_{\rm cl}$ is the fraction of mass lost from the cluster due to stellar evolution if we assume that for $m \ge 8\,M_\odot$ all neutron stars and black holes are kicked out by asymmetrical supernova explosions, but that white dwarfs are retained (Weidemann et al. 1992) and have masses $m_{\rm WD} = 0.084\,m_{\rm ini} + 0.444$ [$M_\odot$]. This is a linear fit to the data of Weidemann (2000, their Table 3) for progenitor masses $1 \le m/M_\odot \le 7$, and $m_{\rm WD} = 0.5\,M_\odot$ for $0.7 \le m/M_\odot < 1$. The evolution time for a star of mass $m_{\rm to}$ to reach the turn-off age is available in Fig. 20 of Kroupa (2007a).

Mass range     eta_N [%]              eta_M [%]              rho^st            Sigma^st
[Msun]                                                       [Msun/pc^3]       [Msun/pc^2]
alpha_3 =      2.3    2.7    4.5      2.3    2.7    4.5      4.5               4.5
0.01–0.08      37.2   37.7   38.6     4.1    5.4    7.4      3.2 × 10^-3       1.60
0.08–0.5       47.8   48.5   49.7     26.6   35.2   48.2     2.1 × 10^-2       10.5
0.5–1          8.9    9.1    9.3      16.1   21.3   29.2     1.3 × 10^-2       6.4
1–8            5.7    4.6    2.4      32.4   30.3   15.1     6.5 × 10^-3       1.2
8–120          0.4    0.1    0.0      20.8   7.8    0.1      3.6 × 10^-5       6.5 × 10^-3
mean m/Msun =  0.38   0.29   0.22                            rho^st_tot = 0.043   Sigma^st_tot = 19.6

               alpha_3 = 2.3        alpha_3 = 2.7                     Delta M_cl/M_cl [%]
m_max          N_cl     M_cl        N_cl         M_cl          m_to   alpha_3 = 2.3   alpha_3 = 2.7
[Msun]                  [Msun]                   [Msun]        [Msun]
1              16       2.9         21           3.8           80     3.2             0.7
8              245      74          725          195           60     4.9             1.1
20             806      269         3442         967           40     7.5             1.8
40             1984     703         1.1 × 10^4   2302          20     13              4.7
60             3361     1225        2.2 × 10^4   6428          8      22              8.0
80             4885     1812        3.6 × 10^4   1.1 × 10^4    3      32              15
100            6528     2451        5.3 × 10^4   1.5 × 10^4    1      44              29
120            8274     3136        7.2 × 10^4   2.1 × 10^4    0.7    47              33
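The entries of Table 8.2 for the canonical IMF can be approximately reproduced with a few closed-form power-law integrals. The following is a minimal sketch, not the author's code: it assumes a continuous three-segment IMF with $\alpha_0 = 0.3$ (0.01–0.08 $M_\odot$), $\alpha_1 = 1.3$ (0.08–0.5 $M_\odot$; this value is an assumption taken from the usual canonical form, since (8.124) is not reproduced here) and $\alpha_2 = 2.3$ above $0.5\,M_\odot$. All function names are hypothetical.

```python
import math

# (m_lo, m_hi, alpha) per segment; alpha_1 = 1.3 is an assumed canonical value.
SEGMENTS = [(0.01, 0.08, 0.3), (0.08, 0.5, 1.3), (0.5, 120.0, 2.3)]


def _coefficients(segments):
    """Prefactors c_i making xi(m) = c_i * m**(-alpha_i) continuous (c_0 = 1)."""
    coeffs = [1.0]
    for (_, hi, a_prev), (_, _, a_next) in zip(segments, segments[1:]):
        coeffs.append(coeffs[-1] * hi ** (a_next - a_prev))
    return coeffs


def _power_integral(a, b, alpha, moment):
    """Integral of m**(moment - alpha) from a to b (b may be math.inf)."""
    p = moment - alpha + 1.0
    if abs(p) < 1e-12:
        return math.log(b / a)
    return (b ** p - a ** p) / p


def imf_integral(a, b, moment=0, segments=SEGMENTS):
    """Integral of m**moment * xi(m) over [a, b], up to an arbitrary constant.

    If b exceeds the top segment, the last power law is extrapolated.
    """
    total = 0.0
    last = len(segments) - 1
    for i, ((lo, hi, alpha), c) in enumerate(zip(segments, _coefficients(segments))):
        if i == last:
            hi = max(hi, b)
        x0, x1 = max(a, lo), min(b, hi)
        if x1 > x0:
            total += c * _power_integral(x0, x1, alpha, moment)
    return total


def fractions(m1, m2, m_l=0.01, m_u=120.0):
    """Number and mass fractions (per cent) of objects between m1 and m2."""
    eta_n = 100.0 * imf_integral(m1, m2, 0) / imf_integral(m_l, m_u, 0)
    eta_m = 100.0 * imf_integral(m1, m2, 1) / imf_integral(m_l, m_u, 1)
    return eta_n, eta_m


def mean_mass(m_l=0.01, m_u=120.0):
    """Average stellar mass in Msun."""
    return imf_integral(m_l, m_u, 1) / imf_integral(m_l, m_u, 0)


def cluster_for_mmax(m_max, m_l=0.01):
    """N_cl and M_cl such that exactly one star is expected above m_max."""
    norm = imf_integral(m_max, math.inf, 0)
    n_cl = imf_integral(m_l, m_max, 0) / norm + 1.0
    m_cl = imf_integral(m_l, m_max, 1) / norm
    return n_cl, m_cl


if __name__ == "__main__":
    for m1, m2 in [(0.01, 0.08), (0.08, 0.5), (0.5, 1), (1, 8), (8, 120)]:
        print(m1, m2, fractions(m1, m2))
    print("mean mass:", mean_mass())
    print("N_cl, M_cl for m_max = 120:", cluster_for_mmax(120.0))
```

Small differences from the tabulated values are to be expected, because the exact normalisation and rounding used for the table are not stated here.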
Star Clusters and Associations
To access a pristine population one would consider observing star clusters that are younger than a few Myr. However, such objects carry rather serious disadvantages. The pre-main-sequence stellar evolution tracks are unreliable (Baraffe et al. 2002; Wuchterl & Tscharnuter 2003), so that the derived masses are uncertain by at least a factor of about two. Remaining gas and dust lead to patchy obscuration. Very young clusters evolve rapidly. The dynamical crossing time is given by (8.4), where the cluster radii are typically $r_{\rm h} < 1$ pc and, for pre-cluster cloud-core masses $M_{\rm gas+stars} > 10^3\,M_\odot$, the velocity dispersion $\sigma_{\rm cl} > 2$ km s$^{-1}$, so that $t_{\rm cr} < 1$ Myr.
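The numbers quoted above can be checked with a short calculation. This is only a sketch under stated assumptions: (8.4) is not reproduced in this section, so the forms $t_{\rm cr} = 2\,r_{\rm h}/\sigma_{\rm cl}$ and $\sigma_{\rm cl} = \sqrt{G M / r_{\rm h}}$ used below are assumed, and the function names are hypothetical.

```python
import math

G = 4.30091e-3         # gravitational constant in pc (km/s)^2 / Msun
PC_IN_KM = 3.0857e13   # kilometres per parsec
MYR_IN_S = 3.15576e13  # seconds per Myr (Julian years)


def velocity_dispersion(mass_msun, r_h_pc):
    """Assumed form sigma = sqrt(G M / r_h), in km/s."""
    return math.sqrt(G * mass_msun / r_h_pc)


def crossing_time_myr(r_h_pc, sigma_kms):
    """Assumed form t_cr = 2 r_h / sigma, converted to Myr."""
    return 2.0 * r_h_pc * PC_IN_KM / sigma_kms / MYR_IN_S


if __name__ == "__main__":
    sigma = velocity_dispersion(1.0e3, 1.0)
    print("sigma [km/s]:", sigma)
    print("t_cr [Myr]:", crossing_time_myr(1.0, sigma))
```

With these assumed forms, a $10^3\,M_\odot$ core of radius 1 pc sits right at the quoted limits, so smaller radii or larger masses give $\sigma_{\rm cl} > 2$ km s$^{-1}$ and $t_{\rm cr} < 1$ Myr.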
The inner regions of populous clusters have $t_{\rm cr} \approx 0.1$ Myr, and thus significant mixing and relaxation occurs there by the time the residual gas has been expelled by any winds and photo-ionising radiation from massive stars. This is the case in clusters with $N \ge$ few $\times\,100$ stars (Table 8.1).
Massive stars ($m > 8\,M_\odot$) are either formed at the cluster centre or get there through dynamical mass segregation, i.e. energy equipartition (Bonnell et al. 2007). The latter process is very rapid ((8.6), p. 184) and can occur within 1 Myr. A cluster core of massive stars is therefore either primordial or forms rapidly because of energy equipartition in the cluster, and it is dynamically highly unstable, decaying within a few $t_{\rm cr,core}$. The ONC, for example, should not be hosting a Trapezium because it is extremely unstable. The implication for the IMF is that the ONC and other similar clusters, and the OB associations which stem from them, must be very depleted in their massive-star content (Pflamm-Altenburg & Kroupa 2006).
Important for measuring the IMF are corrections for the typically high multiplicity fraction of the very young population. However, these are very uncertain because the binary population is in a state of change (Fig. 8.14 below). The determination of an IMF relies on the assumption that all stars in a very young cluster formed together. However, trapping and focussing of older field or OB association stars by the forming cluster has been found to be possible (Sect. 8.1.1).
Thus, be it at the low-mass end or the high-mass end, the stellar mass function seen in very young clusters cannot be the true IMF. Statistical corrections for the above effects need to be applied, and comprehensive N-body modelling is required.
Old open clusters, in which most stars are on or near the main sequence, are no better stellar samples. They are dynamically highly evolved because they have left their previous concentrated and gas-rich state, and so they contain only a small fraction of the stars originally born in the cluster (Kroupa & Boily 2002; Weidner et al. 2007; Baumgardt & Kroupa 2007). The binary fraction is typically high and comparable to the Galactic field, but does depend on the initial density and the age of the cluster, as does the mass-ratio distribution of companions. So simple corrections cannot be applied equally for all old clusters. The massive stars have died and secular evolution begins
to affect the remaining stellar population (after gas expulsion) through energy equipartition. Baumgardt & Makino (2003) have quantified the changes of the MF for clusters of various masses and on different Galactic orbits. Near the half-mass radius the local MF resembles the global MF in the cluster, but the global MF is already significantly depleted of its lower-mass stars by about 20 % of the cluster disruption time.
Given that we are never likely to learn the exact dynamical history of a particular cluster, it follows that we can never ascertain the IMF for any individual cluster. This can be summarised concisely with the following conjecture.
Cluster IMF Conjecture: The IMF cannot be extracted for any individual star cluster.
Justification: For clusters younger than about 0.5 Myr, star-formation has not ceased and the IMF is therefore not yet assembled, and the cluster cores consisting of massive stars have already dynamically ejected members (Pflamm-Altenburg & Kroupa 2006). For clusters with an age between 0.5 and a few Myr, the expulsion of residual gas has led to loss of stars (Kroupa, Aarseth & Hurley 2001). Older clusters are either still losing stars owing to residual gas expulsion or are evolving secularly through evaporation driven by energy equipartition (Baumgardt & Makino 2003). Furthermore, the birth sample is likely to be contaminated by captured stars (Fellhauer, Kroupa & Evans 2006; Pflamm-Altenburg & Kroupa 2007). There exists no time when all stars are assembled in an observationally accessible volume (i.e. a star cluster).
Note that the Cluster IMF Conjecture implies that individual clusters cannot be used to make deductions on the similarity or otherwise of their IMFs, unless a complete dynamical history of each cluster is available. Notwithstanding this pessimistic conjecture, it is nevertheless necessary to observe and study star clusters of any age. Combined with thorough and realistic N-body modelling, the data do lead to essential statistical constraints on the IMF universality hypothesis. Such an approach is discussed in the next section.
The Alpha Plot
Scalo (1998) conveniently summarised a large part of the available observational constraints on the IMF of resolved stellar populations with the alpha plot, as used by Kroupa (2001, 2002b) for explicit tests of the IMF universality hypothesis given the cluster IMF conjecture. One example is presented in Fig. 8.9, which demonstrates that the observed scatter in $\alpha(m)$ can be readily understood as being due to Poisson uncertainties (see also Elmegreen 1997, 1999) and dynamical effects, as well as arising from biases through unresolved multiple stars. Furthermore, there is no evident systematic change of $\alpha$ at a given $m$ with metallicity or density of the star-forming cloud.
Fig. 8.9 The alpha plot. The power-law index $\alpha$ is measured over stellar mass ranges and plotted at the mid-point of the respective mass range. The power-law indices are measured on the mass function of system masses, where stars not in binaries are counted individually. Open circles are the observations from open clusters and associations of the Milky Way and the Large and Small Magellanic clouds, collated mostly by Scalo (1998). The open stars (crosses) are theoretical star clusters observed in the computer at an age of 3 (0) Myr and within a radius of 3.2 pc from the cluster centre. The 5 clusters have 3000 stars in 1500 binaries initially, and the assumed IMF is the canonical one. The theoretical data show a similar spread to the observational ones; note the binary-star-induced depression of $\alpha_1$ in the mass range 0.1–0.5 $M_\odot$. The IMF universality hypothesis can therefore not be discarded given the observed data. Models are from Kroupa (2001).
More exotic populations such as the Galactic bulge have also been found to have a low-mass MF indistinguishable from the canonical form (e.g. Zoccali et al. 2000). Thus the IMF universality hypothesis cannot be falsified for known resolved stellar populations.
Very Ancient and/or Metal-Poor Resolved Populations
Witnesses of the early formation phase of the Milky Way are its globular clusters. Such $10^4$–$10^6\,M_\odot$ clusters formed with individual star-formation rates of 0.1–1 $M_\odot$ yr$^{-1}$ and densities of about $5 \times 10^3$–$10^5\,M_\odot$ pc$^{-3}$. These are relatively high values when compared with the current star-formation activity in the Milky Way disc. For example, a $5 \times 10^3\,M_\odot$ Galactic cluster forming in 1 Myr corresponds to a star-formation rate of $0.005\,M_\odot$ yr$^{-1}$. The alpha plot, however, does not support any significant systematic difference between the IMF of stars formed in globular clusters and present-day low-mass star-formation. For massive stars it can be argued that the mass in stars more massive than $8\,M_\odot$ cannot have been larger than about half the cluster mass, because otherwise the globular clusters would not be as compact as they are today. This constrains the IMF to have been close to the canonical IMF (Kroupa 2001).
A particularly exotic star-formation mode is thought to have occurred in dwarf-spheroidal (dSph) satellite galaxies. The Milky Way has about 19 such satellites at distances from 50 to 250 kpc (Metz & Kroupa 2007). These objects have stellar masses and ages comparable to those of globular clusters, but are 10–100 times larger and are thought to have 10–1000 times more mass in dark matter than in stars. They also show evidence for complex star-formation activity and metal-enrichment histories and must therefore have formed under rather exotic conditions. Nevertheless, the MFs in two of these satellites are found to be indistinguishable from those of globular clusters in the mass range 0.5–0.9 $M_\odot$. So again there is consistency with the canonical IMF (Grillmair et al. 1998; Feltzing, Gilmore & Wyse 1999).
Yasui et al. (2006) and Yasui et al. (2008) have been pushing studies of the IMF in young star clusters to the outer, metal-poor regions of the Galactic disc. They find the IMF to be indistinguishable, within the uncertainties, from the canonical IMF.
The Galactic Bulge and Centre
For low-mass stars the Galactic bulge has been shown to have a MF indistinguishable from the canonical form (Zoccali et al. 2000). However, abundance patterns of bulge stars suggest the IMF was top-heavy (Ballero, Kroupa & Matteucci 2007). This may be a result of extreme star-burst conditions prevailing in the formation of the bulge (Zoccali et al. 2006).
Even closer to the Galactic centre, models of the Hertzsprung–Russell diagram of the stellar population within 1 pc of Sgr A* suggest the IMF was always top-heavy there (Maness et al. 2007). Perhaps this is the long-sought-after evidence for a variation of the IMF under very extreme conditions, in this case a strong tidal field and higher temperatures (but note Fig. 8.10 below).
Extreme Star Bursts
As noted on p. 199, objects with a mass $M \ge 10^6\,M_\odot$ have an increased $M/L$ ratio. If such objects form in 1 Myr, their star-formation rates are SFR $\ge 1\,M_\odot$ yr$^{-1}$, and they probably contain more than $10^4$ O stars packed within a region spanning at most a few parsecs, given their observed present-day mass–radius relation. Such a star-formation environment is presently outside the reach of theoretical investigation. However, it is conceivable that the higher $M/L$ ratios of such objects may be due to a non-canonical IMF. One possibility is that the IMF is bottom-heavy as a result of intense photo-destruction of accretion envelopes of intermediate to low-mass stars (Mieske & Kroupa 2008). Another possibility is that the IMF becomes top-heavy, leaving many stellar remnants that inflate the $M/L$ ratio (Dabringhausen & Kroupa 2008). Work is in progress to achieve observational constraints on these two possibilities.
Fig. 8.10 The observed mass function of the Arches cluster near the Galactic centre by Kim et al. (2006), shown as the thin histogram, is confronted with the theoretical MF for this object calculated with the SPH technique by Klessen, Spaans & Jappsen (2007), marked as the hatched histogram. The latter has a down-turn (bold steps near $10^{0.7}$) incompatible with the observations. This rules out a theoretical understanding of the stellar mass spectrum, because one counter-example suffices to bring down a theory. One possible reason for the theoretical failure may be the assumed turbulence driving. For details of the figure see Kim et al. (2006).
Population III The Primordial IMF
Most theoretical workers agree that the primordial IMF ought to be top-heavy because the ambient temperatures were much higher and the lack of metals did not allow gas clouds to cool and to fragment into sufficiently small cores (Larson 1998). The existence of extremely metal-poor low-mass stars with chemical peculiarities is interpreted to mean that low-mass stars could form under extremely metal-poor conditions, but that their formation was suppressed in comparison to later star-formation (Tumlinson 2007). Models of the formation of stellar populations during cosmological structure formation suggest that low-mass population-III stars should be found within the Galactic halo if they formed. Their absence to date would imply a primordial IMF depleted in low-mass stars (Brook et al. 2007).
Thus the last three sub-sections hint at physical environments in which the IMF universality hypothesis may be violated.
8.3.3 Very Low-Mass Stars (VLMSs) and Brown Dwarfs (BDs)
The origin of BDs and some VLMSs is being debated fiercely. One camp believes these objects to form as stars, because the star-formation process does not know where the hydrogen-burning mass limit is (e.g. Eislöffel & Steinacker 2008). The other camp believes that BDs cannot form exactly like stars through continued accretion, because the conditions required for this to occur in molecular clouds are far too rare (e.g. Reipurth & Clarke 2001; Goodwin & Whitworth 2007).
If BDs and VLMSs form like stars, they should follow the same pairing rules. In particular, BDs and G dwarfs would pair in the same manner, i.e. according to the same mathematical rules, as M dwarfs and G dwarfs. Kroupa et al. (2003) tested this hypothesis by constructing N-body models of Taurus-Auriga-like groups and Orion-Nebula-like clusters, finding that it leads to far too many star–BD and BD–BD binaries with the wrong semi-major axis distribution. Instead, star–BD binaries are very rare (Grether & Lineweaver 2006), while BD–BD binaries are rarer than stellar binaries (BDs have a 15 % binary fraction as opposed to 50 % for stars), and BDs have a semi-major axis distribution significantly narrower than that of star–star binaries. The hypothesis of a star-like origin of BDs must therefore be discarded. BDs and some VLMSs form a separate population, which is however linked to that of the stars.
Thies & Kroupa (2007) re-addressed this problem with a detailed analysis of the underlying MF of stars and BDs, given observed MFs of four populations: Taurus, Trapezium, IC348 and the Pleiades. By correcting for unresolved binaries in all four populations and taking into account the different pairing rules of stellar and VLMS and BD binaries, they discovered a significant discontinuity of the MF. BDs and VLMSs therefore form a truly separate population from that of the stars. It can be described by a single power-law MF (8.129), which implies that about one BD forms per five stars in all four populations.
This strong correlation between the number of stars and BDs, and the similarity of the BD MF in the four populations, implies that the formation of BDs is closely related to the formation of stars. Indeed, the truncation of the binary binding energy distribution of BDs at a high energy suggests that energetic processes must be operating in the production of BDs, as discussed by Thies & Kroupa (2007). Two such possible mechanisms are embryo ejection (Reipurth & Clarke 2001) and disc fragmentation (Goodwin & Whitworth 2007).
8.3.4 Composite Populations: The IGIMF
The vast majority of all stars form in embedded clusters, and so the correct way to proceed to calculate a galaxy-wide stellar IMF is to add up all the IMFs of all star clusters born in one star-formation epoch. Such epochs may be identified with the Zoccali et al. (2006) star-burst events that create the Galactic
bulge. In disc galaxies they may be related to the time-scale of transforming the interstellar matter to star clusters along spiral arms. Addition of the clusters born in one epoch gives the integrated galactic initial mass function, the IGIMF (Kroupa & Weidner 2003).
IGIMF definition: The IGIMF is the IMF of a composite population, which is the integral over a complete ensemble of simple stellar populations.
Note that a simple population has a mono-metallicity and a mono-age distribution and is therefore a star cluster. Age and metallicity distributions emerge for massive populations with $M_{\rm cl} \ge 10^6\,M_\odot$ (e.g. $\omega$ Cen). This indicates that such objects, which also have relaxation times comparable to or longer than a Hubble time, are not simple (Sect. 8.1.4). A complete ensemble is a statistically complete representation of the initial cluster mass function (ICMF), in the sense that the actual mass function of $N_{\rm cl}$ clusters lies within the expected statistical variation of the ICMF.
IGIMF conjecture: The IGIMF is steeper than the canonical IMF if the IMF universality hypothesis holds.
Justification: Weidner & Kroupa (2006) calculate that the IGIMF is steeper than the canonical IMF for $m \ge 1\,M_\odot$ if the IMF universality hypothesis holds. The steepening becomes negligible if the power-law mass function of embedded star clusters,
$$\xi_{\rm ecl}(M_{\rm ecl}) \propto M_{\rm ecl}^{-\beta}, \tag{8.127}$$
is flatter than $\beta = 1.8$.
It may be argued that IGIMF = IMF (e.g. Elmegreen 2006) because, when a star cluster is born, its stars are randomly sampled from the IMF up to the most massive star possible. On the other hand, the physically motivated ansatz of Weidner & Kroupa (2005, 2006), to take the mass of a cluster as the constraint and to include the observed correlation between the maximal star mass and the cluster mass (Fig. 8.1), yields an IGIMF which is equal to the canonical IMF for $m \le 1.5\,M_\odot$ but which is systematically steeper above this mass. By incorporating the observed maximal-cluster-mass vs star-formation rate of galaxies, $M_{\rm ecl,max} = M_{\rm ecl,max}({\rm SFR})$, for the youngest clusters (Weidner, Kroupa & Larsen 2004), it follows for $m \ge 1.5\,M_\odot$ that low-surface-brightness (LSB) galaxies ought to have very steep IGIMFs, while normal or $L_*$ galaxies have Scalo-type IGIMFs, i.e. $\alpha_{\rm IGIMF} = \alpha_{\rm MWdisc} > \alpha_2$ (Sect. 8.3.1) follows naturally. This systematic shift of $\alpha_{\rm IGIMF}$ ($m \ge 1.5\,M_\odot$) with galaxy type implies that less massive galaxies have a significantly suppressed supernova II rate per low-mass star. They also show a slower chemical enrichment, so that the observed metallicity–galaxy-mass relation can be nicely accounted
for (Koeppen, Weidner & Kroupa 2007). Another very important implication is that the SFR–H$\alpha$-luminosity relation for galaxies flattens, so that the SFR becomes greater by up to three orders of magnitude for dwarf galaxies than the value calculated from the standard (linear) Kennicutt relation (Pflamm-Altenburg, Weidner & Kroupa 2007).
Strikingly, the IGIMF variation has now been directly measured by Hoversten & Glazebrook (2008) using galaxies in the Sloan Digital Sky Survey. Lee et al. (2004) have indeed found LSBs to have bottom-heavy IMFs, while Portinari, Sommer-Larsen & Tantalo (2004) and Romano et al. (2005) find the Milky Way disc to have an IMF steeper than Salpeter's for massive stars, which is, in comparison with Lee et al. (2004), much flatter than the IMF of LSBs, as required by the IGIMF conjecture.
8.3.5 Origin of the IMF: Theory vs Observations
General physical concepts such as coalescence of protostellar cores, mass-dependent focussing of gas accretion on to protostars, stellar feedback and fragmentation of molecular clouds lead to predictions of systematic variations of the IMF with changes of the physical conditions of star-formation (Murray & Lin 1996; Elmegreen 2004). (But see Casuso & Beckman 2007 for a simple cloud coagulation/dispersal model that leads to an invariant mass distribution.) Thus the thermal Jeans mass of a molecular cloud decreases with decreasing temperature and increasing density. This implies that for higher metallicity (stronger cooling) and density the IMF should shift on average to smaller stellar masses (e.g. Larson 1998; Bonnell et al. 2007). The entirely different notion that stars regulate their own masses through a balance between feedback and accretion also implies smaller stellar masses for higher metallicity, due in part to more dust and thus more efficient radiation pressure on the gas through the dust grains. Also, a higher metallicity allows more efficient cooling and thus a lower gas temperature, a lower sound speed and therefore a lower accretion rate (Adams & Fatuzzo 1996; Adams & Laughlin 1996). As discussed above, a systematic IMF variation with physical conditions has not been detected. Thus theoretical reasoning, even at its most elementary level, fails to account for the observations.
A dramatic case in point has emerged recently. Klessen, Spaans & Jappsen (2007) report state-of-the-art calculations of star-formation under physical conditions as found in molecular clouds near the Sun, and they are able to reproduce the canonical IMF. Applying the same computational technology to the conditions near the Galactic centre, they obtain a theoretical IMF in agreement with the previously reported apparent decline of the stellar MF in the Arches cluster below about $6\,M_\odot$. Kim et al. (2006) published their observations of the Arches cluster on the astrophysics preprint archive shortly after Klessen, Spaans & Jappsen (2007) and performed N-body calculations of the dynamical evolution of this young cluster, revising our knowledge significantly. In contradiction to the theoretical prediction, they find that the MF continues
to increase down to their 50 % completeness limit ($1.3\,M_\odot$) with a power-law exponent only slightly shallower than the canonical Massey/Salpeter value, once mass segregation has been corrected for. This situation is demonstrated in Fig. 8.10. It therefore emerges that there does not seem to exist any solid theoretical understanding of the IMF.
Observations of cloud cores appear to suggest that the canonical IMF is already frozen in at the pre-stellar cloud-core level (Motte, André & Neri 1998; Motte et al. 2001). Nutter & Ward-Thompson (2007) and Alves, Lombardi & Lada (2007) find, however, that the pre-stellar cloud cores are distributed according to the same shape as the canonical IMF, but shifted to larger masses by a factor of about three or more. This is taken to perhaps mean a star-formation efficiency per star of 30 % or less, independently of stellar mass. The interpretation of such observations in view of multiple star-formation in each cloud core is being studied by Goodwin et al. (2008), while Krumholz (2008) outlines current theoretical understanding of how massive stars form out of massive pre-stellar cores.
8.3.6 Conclusions: IMF
The IMF universality hypothesis, the cluster IMF conjecture and the IGIMF conjecture have been stated. In addition, we may make the following assertions:
1. The stellar luminosity function has a pronounced maximum at $M_V \approx 12$, $M_I \approx 9$, which is universal and well understood as a result of stellar physics. Thus, by counting stars in the sky we can look into their interiors.
2. Unresolved multiple systems must be accounted for when the MFs of different stellar populations are compared.
3. BDs and some VLMSs form a separate population that correlates with the stellar content. There is a discontinuity in the MF near the star/BD mass transition.
4. The canonical IMF (8.124) fits the star counts in the solar neighbourhood and all resolved stellar populations available to date. Recent data at the Galactic centre suggest a top-heavy IMF, perhaps hinting at a possible variation with conditions (tidal shear, temperature).
5. Simple stellar populations are found in individual star clusters with $M_{\rm cl} \le 10^6\,M_\odot$. These have the canonical IMF.
6. Composite populations describe entire galaxies. They are a result of many epochs of star-cluster formation and are described by the IGIMF Conjecture.
7. The IGIMF above about $1\,M_\odot$ is steep for LSB galaxies and flattens to the Scalo slope ($\alpha_{\rm IGIMF} \approx 2.7$) for $L_*$ disc galaxies. This is nicely consistent with the IMF universality hypothesis in the context of the IGIMF conjecture.
8. Therefore, the IMF universality hypothesis cannot be excluded, despite the cluster IMF conjecture, for conditions $\rho \le 10^5$ stars pc$^{-3}$, $Z \ge 0.002$ and non-extreme tidal fields.
9. Modern star-formation computations and elementary theory give wrong results concerning the variation and shape of the stellar IMF, as well as the stellar multiplicity (Goodwin & Kroupa 2005).
10. The stellar IMF appears to be frozen-in at the pre-stellar cloud-core stage. So it is probably a result of the processes that lead to the formation of self-gravitating molecular clouds.
8.3.7 Discretisation
As discussed above, a theoretically motivated form of the IMF that passes observational tests does not exist. Star-formation theory gets the rough shape of the IMF right: there are fewer massive stars than low-mass stars. However, other than this, it fails to make any reliable predictions whatsoever as to how the IMF should look in detail under different physical conditions. In particular, the overall change of the IMF with metallicity, density or temperature predicted by theory is not evident. An empirical multi-power-law description of the IMF is therefore perfectly adequate and has important advantages over other formulations. A general formulation of the stellar IMF in terms of multiple power-law segments is
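$$\xi(m) = k \begin{cases}
\left(\dfrac{m}{m_{\rm H}}\right)^{-\alpha_0}, & m_{\rm low} \le m \le m_{\rm H}, \\[2ex]
\left(\dfrac{m}{m_{\rm H}}\right)^{-\alpha_1}, & m_{\rm H} \le m \le m_0, \\[2ex]
\left(\dfrac{m_0}{m_{\rm H}}\right)^{-\alpha_1} \left(\dfrac{m}{m_0}\right)^{-\alpha_2}, & m_0 \le m \le m_1, \\[2ex]
\left(\dfrac{m_0}{m_{\rm H}}\right)^{-\alpha_1} \left(\dfrac{m_1}{m_0}\right)^{-\alpha_2} \left(\dfrac{m}{m_1}\right)^{-\alpha_3}, & m_1 \le m \le m_{\rm max},
\end{cases} \tag{8.128}$$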
where $m_{\rm max} \le m_{\rm max*} \approx 150\,M_\odot$ depends on the stellar mass of the embedded cluster (Fig. 8.1). The empirically determined stellar IMF is a two-part form (8.124) with a third power law for BDs, whereby BDs and VLMSs form a separate population from that of the stars (p. 232),
$$\xi_{\rm BD} \propto m^{-\alpha_0}, \qquad \alpha_0 \approx 0.3 \tag{8.129}$$
(Martín et al. 2000; Chabrier 2003; Moraux, Bouvier & Clarke 2004), and
$$\xi_{\rm BD}(0.075\,M_\odot) \approx (0.25 \pm 0.05)\,\xi(0.075\,M_\odot)$$
(Thies & Kroupa 2007), where $\xi$ is the canonical stellar IMF (8.124). This implies that about one BD forms per five stars.
One advantage of the power-law formulation is that analytical generating functions and other quantities can be readily derived. Another important advantage is that, with a multi-power-law form, different parts of the IMF
can be varied in numerical experiments without affecting the other parts. A practical numerical formulation of the IMF is prescribed in Pflamm-Altenburg & Kroupa (2006). Thus, for example, the canonical two-part power-law IMF can be changed by adding a third power law above $1\,M_\odot$ and making the IMF top-heavy ($\alpha_{m > 1\,M_\odot} < \alpha_2$) without affecting the shape of the late-type stellar luminosity function, as evident in Fig. 8.8. The KTG93 (Kroupa, Tout & Gilmore 1993) IMF is such a three-part power-law form relevant to the overall young population in the Milky Way disc. This is top-light ($\alpha_{m > 1\,M_\odot} > \alpha_2$; Kroupa & Weidner 2003).
A log-normal formulation does not offer these advantages and requires power-law tails above about $1\,M_\odot$ and for brown dwarfs, for consistency with the observations discussed above. However, while not as mathematically convenient, the popular Chabrier log-normal plus power-law IMF formulation (Table 1 of Chabrier 2003) leads to a stellar mass distribution indistinguishable from the two-part power-law IMF (Fig. 8.11). Various analytical forms for the IMF are compiled in Table 3 of Kroupa (2007a).
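A generating function for the two-part power-law form of the canonical IMF (8.124) can be written down by following the steps taken in Sect. 8.2.3. The corresponding probability density is
$$\begin{aligned}
p_1 &= k_{{\rm p},1}\, m^{-\alpha_1}, & 0.08 \le m \le 0.5\,M_\odot, \\
p_2 &= k_{{\rm p},2}\, m^{-\alpha_2}, & 0.5 < m \le m_{\rm max},
\end{aligned} \tag{8.130}$$
where $k_{{\rm p},i}$ are normalisation constants ensuring continuity at $0.5\,M_\odot$ and
$$\int_{0.08}^{0.5} p_1\,{\rm d}m + \int_{0.5}^{m_{\rm max}} p_2\,{\rm d}m = 1, \tag{8.131}$$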
Fig. 8.11 Comparison between the popular Chabrier IMF (log-normal plus power-law extension above $1\,M_\odot$, dashed curve; Table 1 in Chabrier 2003) with the canonical two-part power-law IMF (solid line; (8.124)). The figure is from Dabringhausen, Hilker & Kroupa (2008).
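whereby $m_{\rm max}$ follows from Fig. 8.1. Defining
$$X'_1 = \int_{0.08}^{0.5} p_1(m)\,{\rm d}m, \tag{8.132}$$
it follows that
$$X_1(m) = \int_{0.08}^{m} p_1(m')\,{\rm d}m', \quad \text{if } m \le 0.5\,M_\odot, \tag{8.133}$$
or
$$X_2(m) = X'_1 + \int_{0.5}^{m} p_2(m')\,{\rm d}m', \quad \text{if } m > 0.5\,M_\odot. \tag{8.134}$$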
The generating function for stellar masses follows from inversion of the above two equations, $X_i(m)$. The procedure is then to choose a uniform random variate $X \in [0, 1]$ and to select the generating function $m(X_1 = X)$ if $0 \le X \le X'_1$, or $m(X_2 = X)$ if $X'_1 < X \le 1$.
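The inversion described above can be written out explicitly for two power-law segments. The sketch below is illustrative only and is not the implementation used in any N-body code: it assumes $\alpha_1 = 1.3$ and $\alpha_2 = 2.3$, a break at $0.5\,M_\odot$, and an upper limit `m_max` chosen by the caller; `make_sampler` is a hypothetical name.

```python
import random

ALPHA_1, ALPHA_2 = 1.3, 2.3   # assumed canonical indices
M_LOW, M_BREAK = 0.08, 0.5    # segment limits in Msun


def _segment_weight(a, b, alpha):
    """Integral of m**(-alpha) from a to b (alpha != 1)."""
    p = 1.0 - alpha
    return (b ** p - a ** p) / p


def make_sampler(m_max=150.0):
    """Return (mass, x1_prime): the generating function m(X) and X'_1."""
    # Continuity at M_BREAK fixes the relative normalisation of segment 2.
    c2 = M_BREAK ** (ALPHA_2 - ALPHA_1)
    w1 = _segment_weight(M_LOW, M_BREAK, ALPHA_1)
    w2 = c2 * _segment_weight(M_BREAK, m_max, ALPHA_2)
    x1_prime = w1 / (w1 + w2)          # probability of m <= 0.5 Msun

    def mass(x):
        """Invert X_1(m) or X_2(m) for a variate x in [0, 1]."""
        if x <= x1_prime:
            frac, a, b, alpha = x / x1_prime, M_LOW, M_BREAK, ALPHA_1
        else:
            frac = (x - x1_prime) / (1.0 - x1_prime)
            a, b, alpha = M_BREAK, m_max, ALPHA_2
        p = 1.0 - alpha
        # Within a segment, (m**p - a**p) / (b**p - a**p) equals frac.
        return (a ** p + frac * (b ** p - a ** p)) ** (1.0 / p)

    return mass, x1_prime


if __name__ == "__main__":
    random.seed(1)
    mass, x1_prime = make_sampler(150.0)
    stars = [mass(random.random()) for _ in range(10000)]
    print("X'_1 =", x1_prime)
    print("mean mass of sample [Msun]:", sum(stars) / len(stars))
```

Adding a brown-dwarf segment or a discontinuity would only change the list of segment weights and the bookkeeping of which segment a given variate falls into.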
This algorithm is readily generalised to any number of power-law segments (8.128), such as including a third segment for brown dwarfs and allowing the IMF to be discontinuous near $0.08\,M_\odot$ (Thies & Kroupa 2007). Such a form has been incorporated into the Nbody4/6/7 programmes, but hitherto without the discontinuity. However, Jan Pflamm-Altenburg has developed a more powerful and general method of generating stellar masses (or any other quantities) given an arbitrary distribution function (Pflamm-Altenburg & Kroupa 2006).[11]
84 The Initial Binary Population
It has already been demonstrated that corrections for unresolved multiple stars are of much importance to derive correctly the shape of the stellar MF given an observed LF (Fig. 8.8). Binary stars are also of significant importance for the dynamics of star clusters because a binary has intrinsic dynamical degrees of freedom that a single star does not possess. A binary can therefore exchange energy and angular momentum with the cluster. Indeed, binaries are very significant energy sources: for example, a binary composed of two $1\,M_\odot$ main-sequence stars and with a semi-major axis of 0.1 AU has a binding energy comparable to that of a $1000\,M_\odot$ cluster of size 1 pc. Such a binary can interact with cluster-field stars, accelerating them to higher velocities and thereby heating the cluster.
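The comparison of binding energies can be checked with a short order-of-magnitude calculation. The sketch below assumes that the cluster binding energy is of the form $\eta\,G M^2 / R$ with $\eta = 0.25$ (an assumed virial-type coefficient, not a value taken from the text); the function names are hypothetical.

```python
G_PC = 4.30091e-3        # gravitational constant in pc Msun^-1 (km/s)^2
AU_PER_PC = 206264.806   # astronomical units per parsec


def binary_binding_energy(m1, m2, a_au):
    """|E_b| = G m1 m2 / (2a) in Msun (km/s)^2, with a in AU."""
    a_pc = a_au / AU_PER_PC
    return G_PC * m1 * m2 / (2.0 * a_pc)


def cluster_binding_energy(m_cl, r_pc, eta=0.25):
    """Order-of-magnitude |E| = eta G M^2 / R in Msun (km/s)^2.

    eta is an assumed structure-dependent coefficient of order unity.
    """
    return eta * G_PC * m_cl**2 / r_pc


e_binary = binary_binding_energy(1.0, 1.0, 0.1)    # two 1 Msun stars, a = 0.1 AU
e_cluster = cluster_binding_energy(1000.0, 1.0)    # 1000 Msun cluster, R = 1 pc
ratio = e_binary / e_cluster                       # of order unity
```

With these assumptions the two energies agree to within a factor of a few, which is the sense in which they are comparable.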
The dynamical properties describing a multiple system are

• the period $P$ (in days throughout this text) or semi-major axis $a$ (in AU),
• the system mass $m_{\rm sys} = m_1 + m_2$,
• the mass ratio $q \equiv m_2/m_1 \le 1$, where $m_1$, $m_2$ are, respectively, the primary- and secondary-star masses, and
• the eccentricity $e = (r_{\rm apo} - r_{\rm peri})/(r_{\rm apo} + r_{\rm peri})$, where $r_{\rm apo}$, $r_{\rm peri}$ are, respectively, the apocentric and pericentric distances.

¹¹ The C-language software package libimf can be downloaded from the website http://www.astro.uni-bonn.de/~webaiub/english/downloads.php
Given a snapshot of a binary, the above quantities can be computed from the relative position, $\mathbf{r}_{\rm rel}$, and velocity, $\mathbf{v}_{\rm rel}$, vectors and the masses of the two companion stars by first calculating the binding energy,

$$
E_b = \frac{1}{2}\,\mu\, v_{\rm rel}^2 - \frac{G\, m_1 m_2}{r_{\rm rel}} = -\frac{G\, m_1 m_2}{2\,a} \;\Rightarrow\; a, \tag{8.135}
$$
where $\mu = m_1 m_2 / (m_1 + m_2)$ is the reduced mass. From Kepler's third law we have

$$
m_{\rm sys} = \frac{a_{\rm AU}^3}{P_{\rm yr}^2} \;\Rightarrow\; P = P_{\rm yr} \times 365.25\ \text{days}, \tag{8.136}
$$
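where $P_{\rm yr}$ is the period in years and $a_{\rm AU}$ is in AU. Finally, the instantaneous eccentricity can be calculated using

$$
e = \left[ \left( 1 - \frac{r_{\rm rel}}{a} \right)^2 + \frac{\left( \mathbf{r}_{\rm rel} \cdot \mathbf{v}_{\rm rel} \right)^2}{a\, G\, m_{\rm sys}} \right]^{1/2}, \tag{8.137}
$$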
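which can be derived from the orbital angular momentum too,

$$
\mathbf{L} = \mu\, \mathbf{v}_{\rm rel} \times \mathbf{r}_{\rm rel}, \tag{8.138}
$$

with

$$
L = \left[ \frac{G}{m_{\rm sys}}\, a \left( 1 - e^2 \right) \right]^{1/2} m_1 m_2. \tag{8.139}
$$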
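The chain (8.135) to (8.137) translates directly into code. The following sketch assumes units of AU, years and solar masses, so that $G = 4\pi^2$ and Kepler's third law takes the form (8.136); the function name and the tuple-based vector representation are illustrative choices only.

```python
import math

G = 4.0 * math.pi**2  # AU^3 Msun^-1 yr^-2, so that m_sys = a^3 / P^2


def orbital_elements(m1, m2, r_rel, v_rel):
    """Return (a [AU], P [days], e, q) for a bound binary.

    r_rel (AU) and v_rel (AU/yr) are 3-component sequences; masses in Msun.
    """
    m_sys = m1 + m2
    mu = m1 * m2 / m_sys                                  # reduced mass
    r = math.sqrt(sum(x * x for x in r_rel))
    v2 = sum(x * x for x in v_rel)
    e_b = 0.5 * mu * v2 - G * m1 * m2 / r                 # binding energy (8.135)
    if e_b >= 0.0:
        raise ValueError("pair is unbound: no Keplerian elements")
    a = -G * m1 * m2 / (2.0 * e_b)                        # semi-major axis
    p_days = math.sqrt(a**3 / m_sys) * 365.25             # period (8.136)
    r_dot_v = sum(x * y for x, y in zip(r_rel, v_rel))
    ecc = math.sqrt((1.0 - r / a) ** 2
                    + r_dot_v**2 / (a * G * m_sys))       # eccentricity (8.137)
    q = min(m1, m2) / max(m1, m2)                         # mass ratio, q <= 1
    return a, p_days, ecc, q
```

For example, two $0.5\,M_\odot$ stars separated by 1 AU with relative speed $2\pi$ AU/yr perpendicular to the separation are on a circular orbit with a period of one year.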
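The relative equation of motion is

$$
\frac{\mathrm{d}^2 \mathbf{r}_{\rm rel}}{\mathrm{d}t^2} = -\frac{G\, m_{\rm sys}}{r_{\rm rel}^3}\, \mathbf{r}_{\rm rel} + \mathbf{a}_{\rm pert}(t), \tag{8.140}
$$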
where $\mathbf{a}_{\rm pert}(t)$ is the time-dependent perturbation from other cluster members. It follows that the orbital elements of a binary in a cluster are functions of time, $P = P(t)$ and $e = e(t)$. Also, $q = q(t)$ during strong encounters when partners are exchanged. Because most stars form in embedded clusters, the binary-star properties of a given population cannot be taken to represent the initial or primordial values.
The following conjecture can be proposed:

Dynamical population synthesis conjecture: if initial binary populations are invariant, a dynamical birth configuration of a stellar population can be inferred from its observed binary population. This birth configuration is not unique, however, but defines a class of dynamically equivalent solutions.

The proof is simple. Set up initially identical binary populations in clusters with different radii and masses, and calculate the dynamical evolution with an N-body programme. For a given snapshot of a population, there is a scalable starting configuration in terms of size and mass (Kroupa 1995c,d).
Binaries can absorb energy and thus cool a cluster. They can also heat a cluster. There are two extreme regimes that can be understood with a Gedanken experiment. Define

$$
\begin{aligned}
E_{\rm bin} &\equiv -E_b > 0,\\
E_k &\equiv \tfrac{1}{2}\, m\, \sigma^2 \approx (1/N) \times \text{kinetic energy of cluster}.
\end{aligned}
\tag{8.141}
$$
Soft binaries have $E_{\rm bin} \ll E_k$, while hard binaries have $E_{\rm bin} \gg E_k$. A useful equation in this context is the relation between the orbital period and circular velocity of the reduced particle,

$$
\log_{10} P\,[\text{days}] = 6.986 + \log_{10} m_{\rm sys}\,[M_\odot] - 3 \log_{10} v_{\rm orb}\,[\text{km s}^{-1}]. \tag{8.142}
$$
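Relation (8.142) follows from $P = 2\pi G m_{\rm sys} / v_{\rm orb}^3$ for a circular orbit, and the constant can be recovered from standard values of $G M_\odot$. The sketch below does this and uses the period at $v_{\rm orb} = \sigma$ as a simple boundary between hard and soft binaries. The function names are hypothetical, and the classification by period alone is a simplification of the energy criterion.

```python
import math

GM_SUN_KM3_S2 = 1.32712440018e11  # G * Msun in km^3 s^-2
SECONDS_PER_DAY = 86400.0


def log10_period_days(m_sys, v_orb):
    """log10 of the period in days, from P = 2 pi G m_sys / v_orb^3.

    m_sys in Msun, v_orb in km/s (circular orbit of the reduced particle).
    """
    const = math.log10(2.0 * math.pi * GM_SUN_KM3_S2 / SECONDS_PER_DAY)
    return const + math.log10(m_sys) - 3.0 * math.log10(v_orb)


def classify(m_sys, period_days, sigma):
    """Label a binary 'hard' or 'soft' relative to the period at v_orb = sigma."""
    p_th = 10.0 ** log10_period_days(m_sys, sigma)
    return "hard" if period_days < p_th else "soft"
```

The constant computed this way agrees with the quoted 6.986 to within about 0.01 dex. As a usage example, `classify(1.0, 100.0, 2.0)` treats a $1\,M_\odot$ binary with a 100-day period in a cluster with $\sigma = 2$ km s$^{-1}$.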
Consider now the case of a soft binary: a reduced-mass particle with $v_{\rm orb} \ll \sigma$. By the principle of energy equipartition, $v_{\rm orb} \rightarrow \sigma$ (8.5) as time progresses. This implies $a \uparrow$, $P \uparrow$. A hard binary has $v_{\rm orb} \gg \sigma$. Invoking energy equipartition, we see that $v_{\rm orb} \downarrow$, $a \downarrow$, $P \downarrow$. Furthermore, the amount of energy needed to ionise a soft binary is negligible compared to the amount of energy required to ionise a hard binary. And the cross section for suffering an encounter scales with the semi-major axis. This implies that a soft binary becomes ever more likely to suffer an additional encounter as its semi-major axis increases. Therefore, it is much more probable for soft binaries to be disrupted rapidly than for hard binaries to do so. Thus follows (Heggie 1975; Hills 1975) a law:

Heggie–Hills law: soft binaries soften and cool a cluster, while hard binaries harden and heat a cluster.
Numerical scattering experiments by Hills (1975) have shown that hardening of binaries often involves partner exchanges. Heggie (1975) derived the above law analytically. Binaries in the energy range $10^{-2}\,E_k \le E_{\rm bin} \le 10^{2}\,E_k$, $33^{-1}\,\sigma \le v_{\rm orb} \le 33\,\sigma$, cannot be treated analytically owing to the complex resonances that are created between the binary and the incoming star or binary. It is these binaries that may be important for the early cluster evolution, depending on its velocity dispersion, $\sigma = \sigma(M_{\rm ecl})$. Cooling of a cluster is energetically not significant but has been seen for the first time by Kroupa, Petr & McCaughrean (1999).
Fig. 8.12 Illustration of the evolution of the distribution of binary star periods in a cluster ($lP = \log_{10} P$). A binary has orbital period $P_{\rm th}$ when $\sigma_{\rm 3D}$ ($= \sigma$) equals its circular orbital velocity (8.142). The initial or birth distribution (8.164) evolves to the form seen at time $t > t_t$.

Figure 8.12 shows the broad evolution of the initial period distribution in a star cluster. At any time, binaries near the hard/soft boundary, with energies $E_{\rm bin} \approx E_k$ and periods $P \approx P_{\rm th}$ ($v_{\rm orb} = \sigma$) (8.5), the thermal period, are most active in the energy exchange between the cluster field and the binary population. The cluster expands as a result of binary heating and mass segregation, and the hard/soft boundary, $P_{\rm th}$, shifts to longer periods. Meanwhile, binaries with $P > P_{\rm th}$ continue to be disrupted while $P_{\rm th}$ keeps shifting to longer periods. This process ends when
$$
P_{\rm th} \ge P_{\rm cut}, \tag{8.143}
$$
which is the cutoff or maximum period in the surviving period distribution. At this critical time, $t_t$, further cluster expansion is slowed because the population of heating sources, the binaries with $P \approx P_{\rm th}$, is significantly reduced. The details strongly depend on the initial value of $P_{\rm th}$, which determines the amount of binding energy in soft binaries, which can cool the cluster if significant enough.
After the critical time, $t_t$, the expanded cluster reaches a temporary state of thermal equilibrium with the remaining binary population. Further evolution of the binary population occurs with a significantly reduced rate determined by the velocity dispersion in the cluster, the cross section given by the semi-major axis of the binaries, and their number density and that of single stars in the cluster. The evolution of the binary star population during this slow phase usually involves partner exchanges and unstable but also long-lived hierarchical systems. The IMF is critically important for this stage, as the initial number of massive stars determines the cluster density at $t \ge 5$ Myr owing to mass loss from evolving stars. Further binary depletion occurs once the cluster goes into core-collapse and the kinetic energy in the core rises