
Raúl Toral and

Pere Colet

Stochastic Numerical Methods


Related Titles

Dubbeldam, J., Green, K., Lenstra, D. (eds.)

The Complexity of Dynamical Systems: A Multi-disciplinary Perspective

2011
Print ISBN: 978-3-527-40931-0

ISBN: 978-0-470-05480-2

Huynh, H.T., Soumaré, I., Lai, V.S.

Stochastic Simulation and Applications in Finance with MATLAB Programs

2008
ISBN: 978-0-470-72538-2

Also available in digital formats.

Yates, R.D., Goodman, D.J.

WIE Probability and Stochastic Processes: A Friendly Introduction for Electrical and Computer Engineers, 2nd Edition, International Edition

2nd Edition

2005

ISBN: 978-0-471-45259-1

Gilat, A.

MATLAB: An Introduction with Applications, 2nd Edition

2nd Edition

2005
ISBN: 978-0-471-69420-5

Also available in digital formats.

Iosifescu, M., Limnios, N., Oprisan, G.

Introduction to Stochastic Models

2009
ISBN: 978-1-848-21057-8

Also available in digital formats.

Mahnke, R., Kaupuzs, J., Lubashevsky, I.

Physics of Stochastic Processes: How Randomness Acts in Time

2009
ISBN: 978-3-527-40840-5

Also available in digital formats.


Raúl Toral and Pere Colet

Stochastic Numerical Methods

An Introduction for Students and Scientists


Authors

Prof. Raúl Toral
IFISC (Institute for Cross-disciplinary Physics and Complex Systems)
CSIC-Universitat Illes Balears
Palma de Mallorca
Spain

Prof. Pere Colet
IFISC (Institute for Cross-disciplinary Physics and Complex Systems)
CSIC-Universitat Illes Balears
Palma de Mallorca
Spain

Cover
The cover figure aims at exemplifying the random movement of Brownian particles in a potential landscape.

All books published by Wiley-VCH are carefully produced. Nevertheless, authors, editors, and publisher do not warrant the information contained in these books, including this book, to be free of errors. Readers are advised to keep in mind that statements, data, illustrations, procedural details or other items may inadvertently be inaccurate.

Library of Congress Card No.: applied for

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at <http://dnb.d-nb.de>.

© 2014 Wiley-VCH Verlag GmbH & Co. KGaA, Boschstr. 12, 69469 Weinheim, Germany

All rights reserved (including those of translation into other languages). No part of this book may be reproduced in any form – by photoprinting, microfilm, or any other means – nor transmitted or translated into a machine language without written permission from the publishers. Registered names, trademarks, etc. used in this book, even when not specifically marked as such, are not to be considered unprotected by law.

Print ISBN: 978-3-527-41149-8
ePDF ISBN: 978-3-527-68313-0
ePub ISBN: 978-3-527-68312-3
Mobi ISBN: 978-3-527-68311-6
oBook ISBN: 978-3-527-68314-7

Cover Design Adam-Design, Weinheim, Germany
Typesetting Laserwords Private Limited, Chennai, India
Printing and Binding Markono Print Media Pte Ltd, Singapore

Printed on acid-free paper


Dedicated to the memory of my father, from whom I learnt many important things not covered in this book.

Raúl Toral

Dedicated to my parents and to my wife for their support in different stages of my life.
Pere Colet


Contents

Preface XIII

1 Review of probability concepts 1
1.1 Random Variables 1
1.2 Average Values, Moments 6
1.3 Some Important Probability Distributions with a Given Name 6
1.3.1 Bernoulli Distribution 6
1.3.2 Binomial Distribution 7
1.3.3 Geometric Distribution 8
1.3.4 Uniform Distribution 8
1.3.5 Poisson Distribution 10
1.3.6 Exponential Distribution 11
1.3.7 Gaussian Distribution 12
1.3.8 Gamma Distribution 13
1.3.9 Chi and Chi-Square Distributions 14
1.4 Successions of Random Variables 16
1.5 Jointly Gaussian Random Variables 18
1.6 Interpretation of the Variance: Statistical Errors 20
1.7 Sums of Random Variables 22
1.8 Conditional Probabilities 23
1.9 Markov Chains 26
Further Reading and References 28
Exercises 29

2 Monte Carlo Integration 31
2.1 Hit and Miss 31
2.2 Uniform Sampling 34
2.3 General Sampling Methods 36
2.4 Generation of Nonuniform Random Numbers: Basic Concepts 37
2.5 Importance Sampling 50
2.6 Advantages of Monte Carlo Integration 56
2.7 Monte Carlo Importance Sampling for Sums 57
2.8 Efficiency of an Integration Method 60
2.9 Final Remarks 61
Further Reading and References 62
Exercises 62

3 Generation of Nonuniform Random Numbers: Noncorrelated Values 65
3.1 General Method 65
3.2 Change of Variables 67
3.3 Combination of Variables 72
3.3.1 A Rejection Method 74
3.4 Multidimensional Distributions 76
3.5 Gaussian Distribution 81
3.6 Rejection Methods 84
Further Reading and References 94
Exercises 94

4 Dynamical Methods 97
4.1 Rejection with Repetition: a Simple Case 97
4.2 Statistical Errors 100
4.3 Dynamical Methods 103
4.4 Metropolis et al. Algorithm 107
4.4.1 Gaussian Distribution 108
4.4.2 Poisson Distribution 110
4.5 Multidimensional Distributions 112
4.6 Heat-Bath Method 116
4.7 Tuning the Algorithms 117
4.7.1 Parameter Tuning 117
4.7.2 How Often? 118
4.7.3 Thermalization 119
Further Reading and References 121
Exercises 121

5 Applications to Statistical Mechanics 125
5.1 Introduction 125
5.2 Average Acceptance Probability 129
5.3 Interacting Particles 130
5.4 Ising Model 134
5.4.1 Metropolis Algorithm 137
5.4.2 Kawasaki Interpretation of the Ising Model 143
5.4.3 Heat-Bath Algorithm 146
5.5 Heisenberg Model 148
5.6 Lattice $\Phi^4$ Model 149
5.6.1 Monte Carlo Methods 152
5.7 Data Analysis: Problems around the Critical Region 155
5.7.1 Finite-Size Effects 157
5.7.2 Increase of Fluctuations 160
5.7.3 Critical Slowing Down 161
5.7.4 Thermalization 163
Further Reading and References 163
Exercises 163

6 Introduction to Stochastic Processes 167
6.1 Brownian Motion 167
6.2 Stochastic Processes 170
6.3 Stochastic Differential Equations 172
6.4 White Noise 174
6.5 Stochastic Integrals. Ito and Stratonovich Interpretations 177
6.6 The Ornstein–Uhlenbeck Process 180
6.6.1 Colored Noise 181
6.7 The Fokker–Planck Equation 181
6.7.1 Stationary Solution 185
Further Reading and References 186
Exercises 187

7 Numerical Simulation of Stochastic Differential Equations 191
7.1 Numerical Integration of Stochastic Differential Equations with Gaussian White Noise 192
7.1.1 Integration Error 197
7.2 The Ornstein–Uhlenbeck Process: Exact Generation of Trajectories 201
7.3 Numerical Integration of Stochastic Differential Equations with Ornstein–Uhlenbeck Noise 202
7.3.1 Exact Generation of the Process $g_h(t)$ 205
7.4 Runge–Kutta-Type Methods 208
7.5 Numerical Integration of Stochastic Differential Equations with Several Variables 212
7.6 Rare Events: The Linear Equation with Linear Multiplicative Noise 217
7.7 First Passage Time Problems 221
7.8 Higher Order (?) Methods 225
7.8.1 Heun Method 226
7.8.2 Midpoint Runge–Kutta 228
7.8.3 Predictor–Corrector 228
7.8.4 Higher Order? 230
Further Reading and References 230
Exercises 231

8 Introduction to Master Equations 235
8.1 A Two-State System with Constant Rates 235
8.1.1 The Particle Point of View 236
8.1.2 The Occupation Numbers Point of View 239
8.2 The General Case 242
8.3 Examples 244
8.3.1 Radioactive Decay 244
8.3.2 Birth (from a Reservoir) and Death Process 245
8.3.3 A Chemical Reaction 246
8.3.4 Self-Annihilation 248
8.3.5 The Prey–Predator Lotka–Volterra Model 249
8.4 The Generating Function Method for Solving Master Equations 251
8.5 The Mean-Field Theory 254
8.6 The Fokker–Planck Equation 256
Further Reading and References 257
Exercises 257

9 Numerical Simulations of Master Equations 261
9.1 The First Reaction Method 261
9.2 The Residence Time Algorithm 268
Further Reading and References 273
Exercises 273

10 Hybrid Monte Carlo 275
10.1 Molecular Dynamics 275
10.2 Hybrid Steps 279
10.3 Tuning of Parameters 281
10.4 Relation to Langevin Dynamics 283
10.5 Generalized Hybrid Monte Carlo 284
Further Reading and References 285
Exercises 286

11 Stochastic Partial Differential Equations 287
11.1 Stochastic Partial Differential Equations 288
11.1.1 Kardar-Parisi-Zhang Equation 288
11.2 Coarse Graining 289
11.3 Finite Difference Methods for Stochastic Differential Equations 291
11.4 Time Discretization: von Neumann Stability Analysis 293
11.5 Pseudospectral Algorithms for Deterministic Partial Differential Equations 300
11.5.1 Evaluation of the Nonlinear Term 303
11.5.2 Storage of the Fourier Modes 304
11.5.3 Exact Integration of the Linear Terms 305
11.5.4 Change of Variables 306
11.5.5 Heun Method 306
11.5.6 Midpoint Runge–Kutta Method 307
11.5.7 Predictor–Corrector 308
11.5.8 Fourth-Order Runge–Kutta 310
11.6 Pseudospectral Algorithms for Stochastic Differential Equations 311
11.6.1 Heun Method 314
11.6.2 Predictor–Corrector 315
11.7 Errors in the Pseudospectral Methods 316
Further Reading and References 321
Exercises 321

A Generation of Uniform $\hat{U}(0,1)$ Random Numbers 327
A.1 Pseudorandom Numbers 327
A.2 Congruential Generators 329
A.3 A Theorem by Marsaglia 332
A.4 Feedback Shift Register Generators 333
A.5 RCARRY and Lagged Fibonacci Generators 334
A.6 Final Advice 335
Exercises 335

B Generation of n-Dimensional Correlated Gaussian Variables 337
B.1 The Gaussian Free Model 338
B.2 Translational Invariance 340
Exercises 344

C Calculation of the Correlation Function of a Series 347
Exercises 350

D Collective Algorithms for Spin Systems 351

E Histogram Extrapolation 357

F Multicanonical Simulations 361

G Discrete Fourier Transform 367
G.1 Relation Between the Fourier Series and the Discrete Fourier Transform 367
G.2 Evaluation of Spatial Derivatives 373
G.3 The Fast Fourier Transform 373

Further Reading 375

References 377

Index 383


Preface

This book deals with numerical methods that use, in one way or another, concepts and ideas from probability theory and stochastic processes. This does not mean that their range of validity is limited to these fields, as there are many problems of a purely deterministic nature that can be tackled with these methods, the most noticeable among them being the calculation of high-dimensional sums and integrals. The material covered in this book has grown out of a series of master courses taught by the authors in several universities and summer schools including, in particular, the Master in Physics of Complex Systems organized by IFISC (Instituto de Física Interdisciplinar y Sistemas Complejos). It is aimed, then, at postgraduate students of almost any scientific discipline and also at those senior scientists who, not being familiar with the specificities of the numerical methods explained here, would like to get acquainted with the basic stochastic algorithms because they need them for a particular application in their field of research. The methods split naturally into three big blocks: sampling by Monte Carlo (Chapters 2–5 and 10), generation of trajectories of stochastic differential equations (Chapters 6, 7, and 11), and numerical solutions to master equations (Chapters 8 and 9), although they are intertwined on many occasions and we have tried to highlight those connections wherever they appear.

It has been our intention to keep the contents of the book self-contained; hence, no previous knowledge of the subject is assumed of the reader. This is strictly true insofar as the numerical algorithms are concerned, but we have also included some chapters of a more theoretical nature where we summarize what the reader should know before facing the numerical chapters. These are: Chapter 1, a summary of probability concepts including the theory of Markov chains; Chapter 6, with a brief introduction to stochastic processes and stochastic differential equations, the concepts of white and colored noise, etc.; and Chapter 8, where we briefly present master equations and some analytical tools for their analysis. The reader who is not familiar with any of these topics will find here the necessary theoretical concepts to understand the numerical methods, but a reader who wants a deeper knowledge of the theory might find it necessary to delve into the more advanced books mentioned in the bibliography sections. Nevertheless, we have tried to keep the bibliographic references to a minimum, rather than include here and there numerous references to the different authors who have contributed to some aspect of the methods explained in the book. We feel that a book with a clear pedagogical orientation, such as this one, is different from a review, and the reader should not be distracted by an excess of references; we apologize to those authors who, having contributed to the field, do not find their work cited here. The basic bibliographic sources as well as suggestions for further reading have been included at the end of each chapter.

The goal of the book is to teach the reader to program the numerical algorithms to perform different tasks. To this end, we have included more or less complete pieces of computer code. They are written in Fortran, but they are, in the vast majority of cases, simple enough to be understood by anyone with a basic knowledge of programming in any language. The book is not intended to teach programming or code optimization. Sometimes we provide a full program, sometimes just some relevant lines of code. In any event, we do consider the lines of code to be part of the text, an important part that has to be read and analyzed in detail. Our experience indicates that full understanding is achieved only when one knows not only what one wants to do but also how it is implemented in practice, and we do recommend that the reader implement and execute the computer programs alongside the reading of the text. In particular, the sections of the code that we consider absolutely essential have been framed in a box. It is not possible to reach a good understanding of the numerical algorithms without understanding the lines of code in the boxes.

We have included some exercises at the end of each chapter to complement the theory of the algorithms explained there. We have not used the exercises to introduce difficult or more advanced topics; the exercises are, in general, at the level of the course and are given so that readers can test their comprehension of the algorithms and theory of the main text.

Some material is left for the appendices. They cover either standard material (generation of uniform random numbers, calculation of the correlation time of a series, and discrete Fourier transforms), some more specialized topics (generation of Gaussian fields with a given correlation function, extrapolation techniques), or an introduction to more advanced simulation methods (collective algorithms, multicanonical simulations).

As we have said, the book is addressed to scientists of many disciplines, and beyond the general framework we have included only a few specific applications. In particular, in Chapter 5, we explain how to use Monte Carlo sampling to derive properties of physical systems near a phase transition. The reason for the inclusion of this chapter is twofold. Historically, the field of phase transitions has made extensive use of Monte Carlo methods (to the extent that some people might wrongly think that this is their only field of application). Moreover, it is the field of expertise of the authors, and we felt more confident explaining in detail how one can use Monte Carlo sampling to derive precisely the equation of state and the critical exponents of some model systems of interest in statistical mechanics, including the Ising and scalar models, than other examples. Nevertheless, extensions of, for instance, the Ising model are now being used in fields as distant as sociology or economics, and we hope that the reader not particularly interested in the physical applications will still find some useful aspects of this chapter.

Finally, we would like to thank all the colleagues and students who have helped us to improve this book. In particular, Dr. Emilio Hernández-García read and gave us valuable suggestions for Chapters 8 and 9.


1 Review of probability concepts

In this chapter, we give a brief summary of the main concepts and results on probability and statistics which will be needed in the rest of the book. Readers who are familiar with the theory of probability might not need to read this chapter in detail, but we urge them to check that this is indeed the case.

1.1 Random Variables

On most occasions, we cannot predict with absolute certainty the outcome of an experiment (otherwise it might not be necessary to perform the experiment). We understand here the word "experiment" in a broad sense. We can count the number of electrons emitted by a $\beta$-radioactive substance in a given time interval, determine the time at which a projectile hits its target or a bus reaches the station, measure an electron's spin, toss a coin and look at the side that appears, or have a look through the window to observe whether it rains or not. We will denote by $E$ the set of possible results of the experiment. For the $\beta$-radioactive substance, $E = \{0, 1, 2, \dots\}$ is the set of natural numbers $\mathbb{N}$; the hitting times of the projectile or the arrival times of the bus (in some units) both belong to the set of real numbers, $E = \mathbb{R}$; the possible outcomes of a measurement of an electron's spin are $E = \{-\hbar/2, \hbar/2\}$; when tossing a coin, the possible results are $E = \{\text{heads}, \text{tails}\}$; and, finally, for the rain observation the set of results is $E = \{\text{yes}, \text{no}\}$. In all these cases, we have no way (or no effective way) of knowing a priori which one of the possible outcomes will be observed. Hence, we abandon the deterministic point of view and adopt a "probabilistic" description in which subsets of results (which are called "events") are assigned a number measuring their likelihood of appearance. The "theory of probability" is the branch of mathematics that allows us to perform such an assignation in a logically consistent way, compatible with our intuition of how this likelihood of events should behave.

It is useful for the theory to consider that the set of results contains only numbers. In this way, we can use the rules of calculus (add, multiply, differentiate, integrate, etc.). If the results themselves are numbers (the case of counting the number of electrons, determining the time of impact of the projectile, etc.), this requires no special consideration. In other cases (observing whether it rains or not), we need to label each result with a number. This assignation is arbitrary, but usually it responds to some logic of the problem under consideration. For instance, when tossing a coin, it might be that we win €1 every time heads shows up and we lose €1 when tails appears. The "natural" identification is then +1 for heads and −1 for tails. This assignation of a number to the result of an experiment is called a "random variable." Random variables are, hence, an application of the set of results onto the set of real numbers. This application maps each result of the experiment $\xi \in E$ into one, and only one, number. The application need not be one-to-one, but it can be many-to-one. For instance, if the experiment is to extract cards from a shuffled deck, we could assign +2 to all hearts cards, +1 to spades, and 0 to diamonds and clubs. It is customary to denote random variables by using a "hat" on top of the name, say $\hat{x}$, or $\hat{y}$, or whatever name we choose. If we choose the name $\hat{x}$, the number associated with the result $\xi$ is $\hat{x}(\xi) \in \mathbb{R}$. This distinction between the result of the experiment and the real number associated with it is important from the mathematical point of view, but in many cases of interest they both coincide because the result of the experiment is already a real number, $\xi = x$, and it is natural to define $\hat{x}(x) = x$ in a somewhat redundant notation.

In summary, a random variable $\hat{x}$ is a real number that is obtained as a result of an experiment.

The next step in the theory is to assign numbers called "probabilities" to the possible results of the experiment or, equivalently, to the different values of the random variable. The assignation should match our a priori expectations (if any) about the likelihood (expected frequency of appearance) of the different outcomes. For instance, when tossing a coin, it is natural (but not necessarily useful or convenient) to assign a probability equal to $1/2$ to the appearance of heads, such that $P(\text{heads}) = 1/2$ or, equivalently, to the random variable $\hat{x}(\text{heads})$ taking the value +1 (as assigned arbitrarily before), $P(\hat{x} = +1) = 1/2$. The assignation of probabilities to events might follow some physical law (as in the case of the radioactive substance, the Boltzmann law for the distribution of energies, or the quantum mechanical postulates), might come after some lengthy calculation (the probability of rain tomorrow), or might follow other arguments such as symmetry (the probability of heads is equal to $1/2$), Jaynes' principle (based on the extremization of the information function), and so on; however, whatever its origin, we consider the assignation to be known. A typical application of the theory is the calculation of probabilities for more or less complicated events: What is the probability of obtaining five heads in a row if we toss a coin 10 times? What is the probability that the next emission of an electron by the radioactive substance occurs in the next 10 ms? And so on.

In practice, the assignation of probabilities to the values of the random variable is performed differently depending on whether the random variable is continuous (i.e., it can take continuous values in a given interval $\hat{x} \in (\alpha, \beta)$, where $\alpha$ can also be $-\infty$ or $\beta$ can be $+\infty$) or discrete (it can take only a finite or infinite numerable set of values $\hat{x} \in \{x_1, x_2, x_3, \dots\}$). For example, the random variable counting the number of times a coin must be tossed before heads appears can take an infinite numerable set of values $\{1, 2, 3, \dots\}$. The time at which the daily bus reaches the station can take continuous values in the finite interval $(0, 24)$ h.

For a discrete random variable taking values $\hat{x} \in \{x_1, x_2, x_3, \dots\}$, we assign to each value $x_i$ its probability $p_i = P(\hat{x} = x_i)$ such that the following two conditions, namely nonnegativity and normalization, are satisfied:
$$p_i \geq 0, \quad \forall i, \tag{1.1}$$
$$\sum_{\forall i} p_i = 1. \tag{1.2}$$

One can check that the assigned probabilities $p_i$ are consistent with the actual results of the experiment. For instance, quantum mechanics might predict that, in a given experiment with an electron's spin, the random variable $\hat{x}$ takes the value $x_1 = +\hbar/2$ with probability $p_1 = 1/3$ and the value $x_2 = -\hbar/2$ with probability $p_2 = 2/3$. To check whether this prediction is correct, one repeats the experiment $M$ (a large number of) times and computes the frequency $f_i = n_i/M$, $n_i$ being the number of times that result $x_i$ appears, and checks whether $f_1$ is close to $1/3$ and $f_2$ to $2/3$. If they are not, then the predictions of the theory or the implementation of the experiment are wrong.1)
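As a quick illustration of this frequency check, the following short Fortran program (our sketch, not one of the boxed codes of the book) simulates the spin experiment with the postulated probabilities $p_1 = 1/3$ and $p_2 = 2/3$ and prints the observed frequencies; it uses the intrinsic generator random_number as a stand-in for the $\hat{U}(0,1)$ generators discussed in Appendix A.

program frequencies
  ! Simulate M binary experiments with P(x1) = 1/3 and compare the
  ! observed frequencies f1, f2 with the postulated probabilities.
  implicit none
  integer, parameter :: M = 1000000
  integer :: i, n1
  real :: u
  n1 = 0
  do i = 1, M
     call random_number(u)           ! u is uniform in [0,1)
     if (u < 1.0/3.0) n1 = n1 + 1    ! result x1 occurs with probability 1/3
  end do
  print *, 'f1 =', real(n1)/M, ' (expected 1/3)'
  print *, 'f2 =', real(M-n1)/M, ' (expected 2/3)'
end program frequencies

For $M = 10^6$ repetitions, the frequencies reproduce $1/3$ and $2/3$ up to statistical errors of order $M^{-1/2}$, a point developed in Section 1.6.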

For a continuous random variable $\hat{x}$, we assign, instead, a probability to the random variable taking a value in a finite interval $[a,b]$ as
$$P(\hat{x} \in [a,b]) = \int_a^b f_{\hat{x}}(x)\, dx. \tag{1.3}$$

Here, $f_{\hat{x}}(x)$ is known as the probability density function of the random variable $\hat{x}$, or pdf for short. It is one of the most important concepts in the theory of random variables. To be considered a bona fide pdf, $f_{\hat{x}}(x)$ must satisfy the nonnegativity and normalization conditions:
$$f_{\hat{x}}(x) \geq 0, \tag{1.4}$$
$$\int_{-\infty}^{\infty} f_{\hat{x}}(x)\, dx = 1. \tag{1.5}$$

The interpretation of the pdf is that, in the limit $dx \to 0$, $f_{\hat{x}}(x)\, dx$ gives the probability that the random variable $\hat{x}$ takes values between $x$ and $x + dx$, that is,
$$P(x < \hat{x} \leq x + dx) = f_{\hat{x}}(x)\, dx. \tag{1.6}$$

In this way, the probability that the random variable $\hat{x}$ takes a value within an arbitrary region $\Omega \subset \mathbb{R}$ of the real numbers is given by the integral of the pdf over that region:
$$P(\hat{x} \in \Omega) = \int_\Omega f_{\hat{x}}(x)\, dx. \tag{1.7}$$

1) An important issue in probability theory is to be able to conclude whether the observed frequencies $f_i$ are indeed compatible, within unavoidable statistical errors, with the postulated probabilities $p_i$.

Note that $f_{\hat{x}}(x)$ has units of the inverse of the units of $x$, and it is not limited to taking values smaller than or equal to 1. A pdf governing the probability of the next emission of an electron by a $\beta$-radioactive substance has units of inverse time, or $T^{-1}$. A pdf can be estimated from experimental data. We first generate $M$ values of the random variable $\hat{x}$ by repeating the experiment $M$ times and recording the outcomes $\{x_1, x_2, \dots, x_M\}$. We choose an interval $\Delta x$ and count the number of times $n(x, x+\Delta x)$ that the random variable has taken values in the interval $(x, x+\Delta x)$. According to the interpretation of $f_{\hat{x}}(x)$, it is $f_{\hat{x}}(x) \approx n(x, x+\Delta x)/(M\Delta x)$, from which $f_{\hat{x}}(x)$ can be estimated. A good estimate for $f_{\hat{x}}(x)$ requires $M$ to be large and $\Delta x$ to be small. Again, if the estimated $f_{\hat{x}}(x)$ is not equal to the theoretical prediction, then something is wrong with the theory or with the experiment.
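The histogram estimate just described is easy to program. The following sketch (ours, in the spirit of the book's Fortran codes) bins $M$ outcomes of the uniform generator itself, so the printed estimate should fluctuate around $f_{\hat{x}}(x) = 1$ on $(0,1)$; replacing the sampling line with any other generator allows one to inspect its pdf in the same way.

program histogram
  ! Estimate a pdf as f(x) ~ n(x, x+Dx)/(M*Dx) over nbins bins of (a,b).
  implicit none
  integer, parameter :: M = 100000, nbins = 50
  real, parameter :: a = 0.0, b = 1.0
  integer :: i, k, n(nbins)
  real :: x, dx
  dx = (b - a)/nbins
  n = 0
  do i = 1, M
     call random_number(x)           ! one outcome of the experiment
     k = int((x - a)/dx) + 1         ! bin index of this outcome
     if (k >= 1 .and. k <= nbins) n(k) = n(k) + 1
  end do
  do k = 1, nbins                    ! print bin center and pdf estimate
     print *, a + (k-0.5)*dx, real(n(k))/(M*dx)
  end do
end program histogram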

Further calculations can be simplified if one introduces the cumulative distribution function, or cdf, $F_{\hat{x}}(x)$, as
$$F_{\hat{x}}(x) = \int_{-\infty}^{x} f_{\hat{x}}(x')\, dx'. \tag{1.8}$$

From this definition, it follows that the cdf $F_{\hat{x}}(x)$ is the probability that the random variable $\hat{x}$ takes values less than or equal to $x$:
$$P(\hat{x} \leq x) = F_{\hat{x}}(x), \tag{1.9}$$

and that
$$P(x_1 < \hat{x} \leq x_2) = F_{\hat{x}}(x_2) - F_{\hat{x}}(x_1), \tag{1.10}$$
which is a relation that will be useful later. The following general properties arise from the definition, the nonnegativity (1.4), and the normalization condition (1.5) of the pdf $f_{\hat{x}}(x)$:
$$F_{\hat{x}}(x) \geq 0, \tag{1.11}$$
$$\lim_{x\to-\infty} F_{\hat{x}}(x) = 0, \tag{1.12}$$
$$\lim_{x\to+\infty} F_{\hat{x}}(x) = 1, \tag{1.13}$$
$$x_2 > x_1 \Rightarrow F_{\hat{x}}(x_2) \geq F_{\hat{x}}(x_1). \tag{1.14}$$

The last property tells us that $F_{\hat{x}}(x)$ is a nondecreasing function of its argument.

If $f_{\hat{x}}(x)$ is piecewise continuous, then the probability of the random variable $\hat{x}$ taking a particular value $x$ is equal to zero, as it must be understood as the limit $P(\hat{x} = x) = \lim_{\Delta x \to 0} \int_x^{x+\Delta x} f_{\hat{x}}(x')\, dx' = 0$. It is possible to treat discrete variables in the language of pdfs if we use the "Dirac-delta function" $\delta(x)$. This mathematical object is not a proper function, but it can be understood2) as the limit of a succession of functions $\delta_n(x)$ such that $\delta_n(x)$ decays to zero outside a region of width $1/n$ around $x = 0$ and has a height of order $n$ at $x = 0$, such that the integral $\int_{-\infty}^{\infty} dx\, \delta_n(x) = 1$. There are many examples of such functions: for instance, $\delta_n(x) = \frac{n}{\sqrt{2\pi}}\, e^{-n^2 x^2/2}$, or
$$\delta_n(x) = \begin{cases} 0, & x \notin (-1/n, 1/n), \\ n(1 - n|x|), & x \in (-1/n, 1/n); \end{cases}$$
see Figure 1.1. The detailed shape is not important. What matters is that, in the limit $n \to \infty$, these functions tend to yield a nonzero value only at $x = 0$ while keeping their integral over all $\mathbb{R}$ constant. For an arbitrary function $f(x)$, we have
$$\lim_{n\to\infty} \int_{-\infty}^{\infty} dx\, \delta_n(x) f(x) = f(0), \tag{1.15}$$
and so we can (in a nonrigorous way) exchange the limit and the integral and understand the Dirac-delta function as satisfying
$$\delta(x) = 0, \quad \text{if } x \neq 0, \tag{1.16}$$
$$\int_{-\infty}^{\infty} dx\, f(x)\, \delta(x) = f(0). \tag{1.17}$$

[Figure 1.1: The function $\delta_n(x)$, of width $\sim 1/n$ and height $\sim n$. It has the property that $\int_{-\infty}^{\infty} dx\, \delta_n(x) = 1$ and, when $n \to \infty$, it tends to the delta function $\delta(x)$.]

2) Another way, more rigorous from the mathematical point of view, to introduce the Dirac delta is by the use of distribution theory, but this is beyond the scope of this book.

When the random variable takes a discrete (maybe infinite numerable) set of values $\hat{x} \in \{x_1, x_2, x_3, \dots\}$ such that the value $x_i$ has probability $p_i$, then the pdf can be considered as a sum of Dirac-delta functions:
$$f_{\hat{x}}(x) = \sum_{\forall i} p_i\, \delta(x - x_i), \tag{1.18}$$
because now $P(\hat{x} = x_i) = \lim_{\Delta x \to 0} \int_{x_i - \Delta x}^{x_i + \Delta x} f_{\hat{x}}(x)\, dx = p_i$. The corresponding cumulative function is a sum of Heaviside step functions:
$$F_{\hat{x}}(x) = \sum_{\forall i} p_i\, \theta(x - x_i), \tag{1.19}$$
with the usual definition
$$\theta(x) = \begin{cases} 0, & x < 0, \\ 1, & x \geq 0. \end{cases} \tag{1.20}$$


1.2 Average Values, Moments

As a random variable $\hat{x}$ assigns a real number $\hat{x}(\xi)$ to the result of the experiment $\xi$, it is possible to use a given real function $G(x)$ to define a new random variable $\hat{G}$ as $\hat{G}(\xi) = G(\hat{x}(\xi))$. One defines the average or expected value $E[\hat{G}]$ of this random variable as
$$E[\hat{G}] = \int_{-\infty}^{\infty} f_{\hat{x}}(x)\, G(x)\, dx. \tag{1.21}$$

The alternative notations $\langle \hat{G} \rangle$ or simply $E[G]$ and $\langle G \rangle$ are very common and will also be used in this book. In particular, for a discrete random variable with pdf given by (1.18), the average value is
$$E[\hat{G}] = \sum_{\forall i} p_i\, G(x_i). \tag{1.22}$$

Some important expected values are as follows:

• Mean or average value of the random variable: $\mu[\hat{x}] = E[\hat{x}]$;
• Moments of order $n$: $E[\hat{x}^n]$;
• Central moments of order $n$: $E[(\hat{x} - \mu[\hat{x}])^n]$;
• Variance: $\sigma^2[\hat{x}] = E[(\hat{x} - \mu[\hat{x}])^2] = E[\hat{x}^2] - E[\hat{x}]^2$. The value $\sigma[\hat{x}]$ is the standard deviation of the random variable $\hat{x}$.

If two random variables $\hat{x}$ and $\hat{y}$ are related by a known function $\hat{y} = y(\hat{x})$, then their respective pdfs are also related:
$$f_{\hat{y}}(y) = \sum_{\mu} \frac{f_{\hat{x}}(x_\mu)}{\left|\frac{dy}{dx}\right|_{x=x_\mu}}, \tag{1.23}$$
where the $x_\mu$ are the solutions of the equation $y = y(x)$. For instance, if the change is $\hat{y} = \hat{x}^2$, then the equation $y = x^2$ has no solutions for $y < 0$ and two solutions $x_1 = +\sqrt{y}$, $x_2 = -\sqrt{y}$ for $y \geq 0$, and the pdf for $\hat{y}$ is
$$f_{\hat{y}}(y) = \begin{cases} 0, & y < 0, \\ \dfrac{f_{\hat{x}}(\sqrt{y}) + f_{\hat{x}}(-\sqrt{y})}{2\sqrt{y}}, & y \geq 0. \end{cases} \tag{1.24}$$
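Relation (1.24) can be checked numerically. In the sketch below (ours, not from the book), $\hat{x}$ is uniform in $(-1,1)$, so $f_{\hat{x}} = 1/2$ there and $\hat{y} = \hat{x}^2$ must have the pdf $f_{\hat{y}}(y) = 1/(2\sqrt{y})$ on $(0,1)$; the program compares a histogram of $\hat{y}$ with this prediction.

program change_of_variables
  ! Histogram of y = x**2 for x uniform on (-1,1), against 1/(2*sqrt(y)).
  implicit none
  integer, parameter :: M = 1000000, nbins = 40
  integer :: i, k, n(nbins)
  real :: u, x, y, dy
  dy = 1.0/nbins
  n = 0
  do i = 1, M
     call random_number(u)
     x = 2.0*u - 1.0                 ! x is uniform on (-1,1)
     y = x*x
     k = int(y/dy) + 1
     if (k <= nbins) n(k) = n(k) + 1
  end do
  do k = 1, nbins
     y = (k - 0.5)*dy                ! bin center: estimate vs theory
     print *, y, real(n(k))/(M*dy), 1.0/(2.0*sqrt(y))
  end do
end program change_of_variables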

1.3 Some Important Probability Distributions with a Given Name

1.3.1 Bernoulli Distribution

The so-called Bernoulli distribution describes a binary experiment in which only two exclusive options are possible: $A$ or $\bar{A}$ ("heads or tails," "either it rains or not"), with respective probabilities $p$ and $1-p$, with $p \in [0,1]$. We define the discrete Bernoulli random variable $\hat{B}$ as taking the value 1 (respectively 0) if the experiment yields $A$ (respectively $\bar{A}$). The probabilities are
$$P(\hat{B} = 1) = p, \tag{1.25}$$
$$P(\hat{B} = 0) = 1 - p. \tag{1.26}$$

The mean value and variance can be computed as
$$E[\hat{B}] = p, \tag{1.27}$$
$$\sigma^2[\hat{B}] = p(1-p). \tag{1.28}$$

Eventually, and when needed, we will use the notation $\hat{B}(p)$ to denote a random variable that follows a Bernoulli distribution with parameter $p$. According to the general expression, the cdf of this random variable is $F_{\hat{B}}(x) = (1-p)\,\theta(x) + p\,\theta(x-1)$, or
$$F_{\hat{B}}(x) = \begin{cases} 0, & x < 0, \\ 1-p, & 0 \leq x < 1, \\ 1, & x \geq 1, \end{cases} \tag{1.29}$$
which is plotted in Figure 1.2.

[Figure 1.2: Cumulative distribution function (cdf) $F_{\hat{B}}(x)$ of the Bernoulli random variable $\hat{B}(p)$.]

1.3.2 Binomial Distribution

We now repeat the binary experiment of the previous case $M$ times and count how many times $A$ appears (independently of the order of appearance). This defines a random variable which we call $\hat{N}_B$. It is a discrete variable that can take any integer value between 0 and $M$, with probabilities
$$P(\hat{N}_B = n) = \binom{M}{n} p^n (1-p)^{M-n}. \tag{1.30}$$

$\hat{N}_B$ is said to follow a binomial distribution. The mean value and the variance are given by
$$E[\hat{N}_B] = Mp, \tag{1.31}$$
$$\sigma^2[\hat{N}_B] = Mp(1-p). \tag{1.32}$$

We will denote by $\hat{N}_B(p, M)$ a random variable that follows a binomial distribution with probability $p$ and number of repetitions $M$.
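By its very definition, a value of $\hat{N}_B(p, M)$ can be generated by adding $M$ independent Bernoulli $\hat{B}(p)$ outcomes. This is not an efficient procedure for large $M$, but the following sketch (ours) is enough to verify the mean value (1.31).

program binomial
  ! Draw nrep binomial values as sums of M Bernoulli trials and
  ! compare the sample mean with M*p.
  implicit none
  integer, parameter :: M = 500, nrep = 10000
  real, parameter :: p = 0.3
  integer :: i, j, n
  real :: u, mean
  mean = 0.0
  do j = 1, nrep
     n = 0
     do i = 1, M                     ! M Bernoulli trials
        call random_number(u)
        if (u < p) n = n + 1         ! event A occurs with probability p
     end do
     mean = mean + n
  end do
  print *, 'sample mean =', mean/nrep, ' M*p =', M*p
end program binomial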

1.3.3 Geometric Distribution

We consider, again, repetitions of the binary experiment, but now the random variable $\hat{N}_G$ is defined as the number of times we must repeat the experiment before the result $A$ appears (not including the trial in which $A$ does appear). This is a discrete random variable that can take any integer value $0, 1, 2, 3, \dots$. The probability that it takes a value equal to $n$ is
$$P(\hat{N}_G = n) = (1-p)^n p, \quad n = 0, 1, 2, \dots. \tag{1.33}$$

The mean value and variance are
$$E[\hat{N}_G] = \frac{1-p}{p}, \tag{1.34}$$
$$\sigma^2[\hat{N}_G] = \frac{1-p}{p^2}. \tag{1.35}$$

1.3.4 Uniform Distribution

This is our first example of a continuous random variable. We want to describe an experiment in which all possible results are real numbers within the interval $(a,b)$, all occurring with the same probability, while no result can appear outside this interval. We will use the notation $\hat{U}(a,b)$ to indicate a uniform random variable in the interval $(a,b)$. The pdf is then constant within the interval $(a,b)$ and 0 outside it. Applying the normalization condition, it is precisely
$$f_{\hat{x}}(x) = \begin{cases} \dfrac{1}{b-a}, & x \in [a,b], \\ 0, & x \notin [a,b]. \end{cases} \tag{1.36}$$

The cumulative function is
$$F_{\hat{x}}(x) = \begin{cases} 0, & x < a, \\ \dfrac{x-a}{b-a}, & a \leq x < b, \\ 1, & x \geq b. \end{cases} \tag{1.37}$$

These two functions are plotted in Figure 1.3.


[Figure 1.3: Top: probability density function (pdf) $f_{\hat{x}}(x)$ of the $\hat{U}(a,b)$ distribution, uniform in the interval $(a,b)$. Bottom: the corresponding cumulative distribution function (cdf) $F_{\hat{x}}(x)$.]

The mean value and variance are
$$E[\hat{x}] = \frac{a+b}{2}, \tag{1.38}$$
$$\sigma^2[\hat{x}] = \frac{(b-a)^2}{12}. \tag{1.39}$$

The uniform distribution $\hat{U}(0,1)$ appears in a very important result. Let us consider an arbitrary random variable $\hat{x}$ (discrete or continuous) whose cdf is $F_{\hat{x}}(x)$, and let us define the new random variable $\hat{u} = F_{\hat{x}}(\hat{x})$. We will prove that $\hat{u}$ is a $\hat{U}(0,1)$ variable.

The proof is as follows. Let us compute the cdf of $\hat{u}$ starting from its definition:
$$F_{\hat{u}}(u) = P(\hat{u} \leq u) = P(F_{\hat{x}}(\hat{x}) \leq u). \tag{1.40}$$

As $F_{\hat{x}}(x) \in [0,1]$, the condition $F_{\hat{x}}(\hat{x}) \leq u$ necessarily requires $u \geq 0$, so $F_{\hat{u}}(u) = 0$ if $u < 0$. If, on the other hand, $u \geq 1$, then the condition $F_{\hat{x}}(\hat{x}) \leq u$ is always satisfied and its probability is 1, or $F_{\hat{u}}(u) = 1$ if $u \geq 1$. Finally, for $u \in (0,1)$, the condition $F_{\hat{x}}(\hat{x}) \leq u$ is equivalent to $\hat{x} \leq F_{\hat{x}}^{-1}(u)$, as the function $F_{\hat{x}}(x)$ is a nondecreasing function. This gives
$$F_{\hat{u}}(u) = P(\hat{x} \leq F_{\hat{x}}^{-1}(u)) = F_{\hat{x}}(F_{\hat{x}}^{-1}(u)) = u. \tag{1.41}$$

Summing up,
$$F_{\hat{u}}(u) = \begin{cases} 0, & u < 0, \\ u, & 0 \leq u \leq 1, \\ 1, & u > 1, \end{cases} \tag{1.42}$$
which is nothing but the cdf of a uniform random variable $\hat{U}(0,1)$.
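This result can also be checked numerically. In the sketch below (ours; the systematic use of the inverse $F_{\hat{x}}^{-1}$ to generate nonuniform random numbers is the subject of Chapter 3), $\hat{x}$ is drawn from the exponential pdf $a e^{-ax}$ by inverting its cdf by hand, and the first two moments of $\hat{u} = F_{\hat{x}}(\hat{x})$ are compared with the $\hat{U}(0,1)$ values $1/2$ and $1/12$.

program integral_transform
  ! Check that u = F(x) is uniform when x has cdf F (here exponential).
  implicit none
  integer, parameter :: M = 1000000
  double precision, parameter :: a = 2.0d0
  integer :: i
  real :: r
  double precision :: x, u, s1, s2
  s1 = 0.0d0; s2 = 0.0d0
  do i = 1, M
     call random_number(r)
     x = -log(1.0d0 - r)/a           ! exponential sample, x = F**(-1)(r)
     u = 1.0d0 - exp(-a*x)           ! u = F(x); should be U(0,1)
     s1 = s1 + u; s2 = s2 + u*u
  end do
  print *, 'mean     =', s1/M, ' (expected 1/2)'
  print *, 'variance =', s2/M - (s1/M)**2, ' (expected 1/12)'
end program integral_transform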


1.3.5 Poisson Distribution

Let us consider the binomial distribution in the limit of infinitely many repetitions $M$. If we take the double limit $M \to \infty$, $p \to 0$ keeping $Mp \to \lambda$, a finite value, the binomial distribution $\hat{N}_B(p, M)$ tends to the so-called Poisson distribution $\hat{N}(\lambda)$. With the help of the Stirling approximation $m! \approx m^m e^{-m} \sqrt{2\pi m}$, which is valid in the limit $m \to \infty$, we can prove, starting from (1.30), the following expression for the probabilities of the Poisson distribution:
$$P(\hat{N} = n) = e^{-\lambda}\, \frac{\lambda^n}{n!}, \quad n = 0, 1, \dots, \infty. \tag{1.43}$$

The Poisson distribution is one of the most important distributions in nature, probably second only to the Gaussian distribution (to be discussed later). The Poisson distribution has both mean and variance equal to the parameter $\lambda$:
$$E[\hat{N}] = \sigma^2[\hat{N}] = \lambda, \tag{1.44}$$
which is a typical property that characterizes the Poisson distribution.

We can think of the Poisson distribution simply as a convenient limit that simplifies the calculations on many occasions. For instance, the probability that a person was born on a particular day, say 1 January, is $p = 1/365$, approximately.3) Imagine that we now have a large group of $M = 500$ people. What is the probability that exactly three people were born on 1 January? The correct answer is given by the binomial distribution, by considering the events $A$ = "being born on 1 January," with probability $p = \frac{1}{365}$, and $\bar{A}$ = "not being born on 1 January," with probability $1 - p = \frac{364}{365}$:
$$P(\hat{N}_B = 3) = \binom{500}{3} \left(\frac{1}{365}\right)^3 \left(\frac{364}{365}\right)^{497} = 0.108919\ldots \tag{1.45}$$

As $p$ is small and $M$ large, we might find it justified to use the Poisson approximation, $\lambda = pM \approx 500/365 = 1.37$, to obtain
$$P(\hat{N} = 3) = e^{-1.37}\, \frac{1.37^3}{3!} = 0.108900\ldots, \tag{1.46}$$

which is good enough. Let us compute now, using this limit, the probability that at least two persons were born on 11 May:
$$P(\hat{N} \geq 2) = 1 - P(\hat{N} \leq 1) = 1 - P(\hat{N} = 0) - P(\hat{N} = 1) = 1 - e^{-\lambda} - \lambda e^{-\lambda} = 0.3977\ldots, \tag{1.47}$$
which is to be compared with the exact result
$$1 - P(\hat{N}_B = 0) - P(\hat{N}_B = 1) = 1 - \binom{500}{0}\left(\frac{1}{365}\right)^{0}\left(\frac{364}{365}\right)^{500} - \binom{500}{1}\left(\frac{1}{365}\right)^{1}\left(\frac{364}{365}\right)^{499} = 0.397895\ldots,$$
which is again a reasonable approximation.

3) Neglecting leap years and assuming that all birth days are equally probable.
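Both comparisons are easy to reproduce. The following sketch (ours) evaluates the exact binomial probabilities and their Poisson approximations in double precision; it keeps $\lambda = 500/365$ to full precision, so the Poisson numbers agree with (1.46) and (1.47) up to the rounding of $\lambda$.

program birthday
  ! Exact binomial probabilities of the birthday example versus their
  ! Poisson approximation of parameter lambda = M*p.
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp), parameter :: p = 1.0_dp/365.0_dp, lambda = 500.0_dp*p
  real(dp) :: pb3, pp3, pb2, pp2
  ! exactly three: P(NB=3) = C(500,3) p**3 (1-p)**497
  pb3 = (500.0_dp*499.0_dp*498.0_dp/6.0_dp) * p**3 * (1.0_dp-p)**497
  pp3 = exp(-lambda)*lambda**3/6.0_dp     ! Poisson, 3! = 6
  print *, 'P(NB=3)  =', pb3, ' Poisson:', pp3
  ! at least two: 1 - P(0) - P(1)
  pb2 = 1.0_dp - (1.0_dp-p)**500 - 500.0_dp*p*(1.0_dp-p)**499
  pp2 = 1.0_dp - exp(-lambda) - lambda*exp(-lambda)
  print *, 'P(NB>=2) =', pb2, ' Poisson:', pp2
end program birthday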


There are occasions in which the Poisson limit occurs exactly. Imagine we distribute $M$ dots randomly with a distribution $\hat{U}(0,T)$, uniform in the interval $[0,T]$ (we will think immediately of this as events occurring randomly in time with a uniform rate, hence the notation). We call $\omega = M/T$ the "rate" (or "frequency") at which points are distributed. We now ask the question: what is the probability that exactly $k$ of the $M$ dots lie in the interval $[t_1, t_1+t] \subset [0,T]$? The event $A$ = "one given dot lies in the interval $[t_1, t_1+t]$" has probability $p = t/T$, whereas the event $\bar{A}$ = "the given dot does not lie in the interval $[t_1, t_1+t]$" has probability $q = 1 - p$. The required probability is given by the binomial distribution $\hat{N}_B(p, M)$ defined by (1.30). We now take the limit $M \to \infty$, $T \to \infty$ with $\omega = M/T$ finite. This limit corresponds to the distribution in which the events occur uniformly in time with a rate (frequency) $\omega$. As mentioned earlier, it can be proven, using Stirling's approximation, that, in this limit, the binomial distribution $\hat{N}_B(p, M)$ tends to a Poisson distribution $\hat{N}(\lambda)$ of finite parameter $\lambda = pM = \omega t$. Let us give an example. Consider $N$ atoms of a $\beta$-radioactive substance. Each atom emits one electron independently of the others. The probability that a given atom will disintegrate is constant in time, but it is not known which atoms will disintegrate in a given time interval. All we observe is the emission of electrons at a given rate. It is true that, as time advances, the number of atoms that can disintegrate decreases, although for some radioactive elements the decay rate is extremely slow (on the order of billions of years for the radioactive element $^{40}_{19}\mathrm{K}$, for example). We can hence assume a constant rate $\omega$, which can be estimated simply by counting the number of electrons $M$ emitted in a time interval $T$ as $\omega = M/T$. Under these circumstances, the number $k$ of electrons emitted in the time interval $[t_1, t_1+t]$ follows a Poisson distribution of parameter $\lambda = pM = \frac{t}{T} M = \omega t$, or
$$P(k; t) = e^{-\omega t}\, \frac{(\omega t)^k}{k!}. \tag{1.48}$$

1.3.6 Exponential Distribution

A continuous random variable $\hat{x}$ follows an exponential distribution if its pdf is
$$f_{\hat{x}}(x) = \begin{cases} 0, & x < 0, \\ a e^{-ax}, & x \geq 0, \end{cases} \tag{1.49}$$
with $a > 0$ being a parameter. The mean value and variance are
$$E[\hat{x}] = \frac{1}{a}, \tag{1.50}$$
$$\sigma^2[\hat{x}] = \frac{1}{a^2}. \tag{1.51}$$

An interesting example is related to the Poisson distribution. Consider the emission of electrons by a radioactive substance, which we know is governed by the Poisson distribution for those time intervals in which the emission rate can be considered constant. Let us set our clock at $t = 0$ and then measure the time $t$ of the first observed emission of an electron. This time is a random variable $\hat{t}$ (a number associated with the result of an experiment) and has a pdf which we call $f_{\hat{t}}^{\,\mathrm{1st}}(t)$. By definition, $f_{\hat{t}}^{\,\mathrm{1st}}(t)\, dt$ is the probability that the first electron is emitted during the interval $(t, t+dt)$ and, accordingly, the probability that the first electron is emitted after time $t$ is $\int_t^\infty f_{\hat{t}}^{\,\mathrm{1st}}(t')\, dt'$. This is equal to the probability that no electrons have been emitted during $(0,t)$, or $P(0; t) = e^{-\omega t}$, that is,
$$\int_t^\infty f_{\hat{t}}^{\,\mathrm{1st}}(t')\, dt' = e^{-\omega t}, \quad t \geq 0. \tag{1.52}$$

Taking the time derivative on both sides of this equation, we obtain $f_{\hat{t}}^{\,\mathrm{1st}}(t) = \omega e^{-\omega t}$, valid for $t \geq 0$: the exponential distribution. Alternatively, if $\hat{t}$ follows this exponential distribution, then the number of events occurring in a time interval $(0,1)$ follows a Poisson $\hat{N}(\lambda)$ distribution with $\lambda = \omega \times 1 = \omega$.
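The two distributions can be tied together numerically. In the sketch below (ours), successive emission times are built by adding independent waiting times drawn from $\omega e^{-\omega t}$ (inverting its cdf, and using the standard fact that all waiting times between consecutive events, not only the first, follow the same exponential law); the number of events falling in $(0,1)$, averaged over many realizations, should then approach $\lambda = \omega$.

program emissions
  ! Count events in (0,1) for exponential waiting times of rate w;
  ! the mean count should equal w, as for a Poisson variable.
  implicit none
  integer, parameter :: nrep = 100000
  real, parameter :: w = 3.0          ! rate omega
  integer :: j, k
  real :: u, t, mean
  mean = 0.0
  do j = 1, nrep
     t = 0.0; k = 0
     do
        call random_number(u)
        t = t - log(1.0 - u)/w        ! add an exponential waiting time
        if (t > 1.0) exit
        k = k + 1                     ! one more event inside (0,1)
     end do
     mean = mean + k
  end do
  print *, 'mean number of events =', mean/nrep, ' (expected', w, ')'
end program emissions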

1.3.7 Gaussian Distribution

A continuous random variable $\hat{x}$ follows a Gaussian distribution if its pdf is
$$f_{\hat{x}}(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]. \tag{1.53}$$

The average and variance are given by
$$E[\hat{x}] = \mu, \tag{1.54}$$
$$\sigma^2[\hat{x}] = \sigma^2. \tag{1.55}$$

We will use the notation that $\hat{x}$ is a $\hat{G}(\mu, \sigma)$ random variable. The cdf is
$$F_{\hat{x}}(x) = \frac{1}{2} + \frac{1}{2}\, \mathrm{erf}\!\left(\frac{x-\mu}{\sigma\sqrt{2}}\right), \tag{1.56}$$
with $\mathrm{erf}(z)$ being the error function
$$\mathrm{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-y^2}\, dy. \tag{1.57}$$

$f_{\hat{x}}(x)$ and $F_{\hat{x}}(x)$ are plotted in Figure 1.4. Gaussian random variables are very important in practice because they appear in a large number of problems, either as an exact distribution in some limit or, simply, as providing a sufficient approximation to the real distribution. After all, it is not unusual that many distributions have a maximum value, and this can in many cases be approximated by a Gaussian distribution (the so-called bell-shaped curve). One of the reasons for the widespread appearance of Gaussian distributions is the central-limit theorem, which states that the sum of a large number of independent random variables, whatever their distribution, will approach a Gaussian distribution.
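A classic numerical illustration of the central-limit theorem (our sketch, not from the book): the sum of twelve independent $\hat{U}(0,1)$ variables, shifted by 6, has mean 0 and variance 1 and is already remarkably close to a $\hat{G}(0,1)$ variable.

program clt
  ! Sum of 12 uniforms minus 6: mean 0, variance 1, nearly Gaussian.
  implicit none
  integer, parameter :: M = 100000
  integer :: i, j
  real :: u, s, m1, m2
  m1 = 0.0; m2 = 0.0
  do i = 1, M
     s = -6.0
     do j = 1, 12                    ! sum of 12 uniform variables
        call random_number(u)
        s = s + u
     end do
     m1 = m1 + s; m2 = m2 + s*s
  end do
  print *, 'mean =', m1/M, ' variance =', m2/M - (m1/M)**2
end program clt

A histogram of the sums, built as in Section 1.1, can be compared directly with the pdf (1.53) with $\mu = 0$ and $\sigma = 1$.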
