Chapter 3: Birth and Death processes
Thus far, we have ignored the random element of population behaviour.
Of course, this prevents us from finding the relative likelihoods of
various events that might be of interest, for example extinction. In this
chapter, we focus on demographic stochasticity. This component arises
from the intrinsically stochastic nature of birth and death processes.
Even without external noise, one cannot predict future population numbers
with certainty. Stochastic models often predict population behaviour that
is significantly different from that of their deterministic equivalents.
In these cases, the randomness itself is central to the population
dynamics. In the presence of certain nonlinearities, this randomness can
have a systematic influence on the population. Such effects cannot be
captured by a deterministic model.
3.1) Introduction to Birth and Death processes
A birth and death process is defined as any continuous-time Markov chain
whose state space is the set of all non-negative integers and whose
transition rates from state $i$ to state $j$, $r_{i,j}$, are equal to zero
whenever $|i - j| > 1$. That is, a birth and death process that is
currently in state $i$ can only go to either state $i - 1$ or state
$i + 1$. When the state increases by one, we say that a birth has occurred
and when the state decreases by one, we say that a death has occurred. We
thus have:
\[
B(N) = r_{N,N+1}, \qquad D(N) = r_{N,N-1}
\tag{3.1.1}
\]
The Markov assumption states that only the current population size
is of use in
predicting future population behaviour. Thus, by definition, other
possible transition rate
predictors – like the environmental condition – are ignored.
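A population whose size at time $t + \Delta t$ is $N$ could have only had
one of four things occurring in the preceding interval
$[t, t + \Delta t]$: a member of the population could have given birth; a
member of the population could have died; no births or deaths could have
occurred and, finally, there could have been more than one event (whether
birth or death or both) in the preceding interval. This means:
\[
p_N(t + \Delta t) = p_{N-1}(t)\,B(N-1)\,\Delta t
  + p_{N+1}(t)\,D(N+1)\,\Delta t
  + p_N(t)\,[1 - B(N)\,\Delta t - D(N)\,\Delta t]
  + o(\Delta t) \quad \text{for } N > 0
\tag{3.1.2}
\]
where $p_N(t)$ is the probability that the population size at time $t$ is
$N$.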
The probabilities of the first three events are given by the first
three terms on the RHS,
and $o(\Delta t)$ is the (negligible) probability of more than one event
occurring within the interval $[t, t + \Delta t]$, resulting in the final
population size at time $t + \Delta t$ being $N$. We need to consider the
case $N = 0$ separately, as this state can only be entered through a death
when $N = 1$. In this case:
\[
p_0(t + \Delta t) = p_0(t) + p_1(t)\,D(1)\,\Delta t + o(\Delta t)
\tag{3.1.3}
\]
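By subtracting $p_N(t)$ from both sides of equations (3.1.2) and (3.1.3),
and dividing by $\Delta t$, we obtain the so-called Kolmogorov forward
equations. As $\Delta t \to 0$:
\[
\begin{aligned}
\frac{dp_N(t)}{dt} &= p_{N-1}(t)\,B(N-1) + p_{N+1}(t)\,D(N+1)
  - p_N(t)\,[B(N) + D(N)] \quad \text{for } N > 0 \\
\frac{dp_0(t)}{dt} &= p_1(t)\,D(1)
\end{aligned}
\tag{3.1.4}
\]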
The Kolmogorov equations are an infinite system of differential
equations. They can be
written in matrix form as:
(3.1.5)
The Kolmogorov equations are the primary means to define a
time-homogeneous
Markov process. By solving these equations, we can find the
probability distribution of
N as a function of time.
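As a rough illustration of what "solving these equations" can look like in practice, the sketch below steps the forward equations (3.1.4) forward in time with a plain Euler scheme. It is only a sketch under stated assumptions: the rates are the ones used in the worked example of Section 3.5 (with the birth rate set to zero from N = 20 upwards, so that the states 0 to 20 form a closed system), and the names `forward_step` and `solve_forward` are hypothetical helpers, not part of the original work.

```python
# Minimal sketch: Euler integration of the Kolmogorov forward equations.
# Assumption: rates from the Section 3.5 example; states 0..20 are closed
# because the birth rate is taken to be zero once N reaches 20.

def birth(n):
    """B(N) = 0.3 N - 0.015 N^2, set to zero for N >= 20."""
    return 0.3 * n - 0.015 * n * n if n < 20 else 0.0


def death(n):
    """D(N) = 0.02 N + 0.001 N^2."""
    return 0.02 * n + 0.001 * n * n


def forward_step(p, dt):
    """Advance the probability vector p (index = population size) by dt."""
    top = len(p) - 1
    new_p = []
    for n in range(top + 1):
        from_below = birth(n - 1) * p[n - 1] if n >= 1 else 0.0
        from_above = death(n + 1) * p[n + 1] if n < top else 0.0
        leaving = (birth(n) + death(n)) * p[n]
        new_p.append(p[n] + dt * (from_below + from_above - leaving))
    return new_p


def solve_forward(n0, t_end, dt=0.001, n_max=20):
    """Return the vector (p_0(t_end), ..., p_{n_max}(t_end)) for N(0) = n0."""
    p = [0.0] * (n_max + 1)
    p[n0] = 1.0
    for _ in range(round(t_end / dt)):
        p = forward_step(p, dt)
    return p


if __name__ == "__main__":
    p10 = solve_forward(n0=10, t_end=10.0)
    mean = sum(n * q for n, q in enumerate(p10))
    print(f"total probability = {sum(p10):.6f}, mean size = {mean:.3f}")
```

The Euler step is chosen for transparency rather than accuracy; a higher-order integrator would allow a larger step size.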
3.2) Solving the Kolmogorov equations
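The Kolmogorov equations can be solved for a linear “birth-only” model.
Let $N_0$ denote the initial population size. For a “birth-only” model
(i.e. a model that assumes that the members of the population cannot die),
the Kolmogorov equations are as follows:
\[
\frac{dp_N(t)}{dt} = B(N-1)\,p_{N-1}(t) - B(N)\,p_N(t)
  \quad \text{for } N > N_0
\tag{3.2.1}
\]
\[
\frac{dp_{N_0}(t)}{dt} = -B(N_0)\,p_{N_0}(t)
\tag{3.2.2}
\]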
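The above equations can be solved directly. Suppose that
$B(N) = \lambda N$, i.e. the per capita birth rate is not
density-dependent. Upon substituting this transitional form into (3.2.2),
we get:
\[
\frac{dp_{N_0}(t)}{dt} = -\lambda N_0\,p_{N_0}(t)
\quad\Longrightarrow\quad
p_{N_0}(t) = C e^{-\lambda N_0 t}, \quad \text{where } C \text{ is a constant}
\]
Using the boundary condition that $p_{N_0}(0) = 1$, we then have:
\[
p_{N_0}(t) = e^{-\lambda N_0 t}
\]
This expression derived for $p_{N_0}(t)$ can then be substituted into the
Kolmogorov equation (3.2.1) for $N = N_0 + 1$:
\[
\frac{dp_{N_0+1}(t)}{dt} = \lambda N_0\,e^{-\lambda N_0 t}
  - \lambda (N_0 + 1)\,p_{N_0+1}(t)
\quad\Longrightarrow\quad
p_{N_0+1}(t) = N_0\,e^{-\lambda N_0 t} + C e^{-\lambda (N_0+1) t},
\quad \text{where } C \text{ is a constant}
\]
Using the boundary condition that $p_{N_0+1}(0) = 0$, we then have:
\[
p_{N_0+1}(t) = N_0\,e^{-\lambda N_0 t}\,(1 - e^{-\lambda t})
\]
Repeating the procedure with the boundary condition $p_{N_0+2}(0) = 0$, it
can be shown that:
\[
p_{N_0+2}(t) = \frac{N_0 (N_0 + 1)}{2}\,e^{-\lambda N_0 t}\,(1 - e^{-\lambda t})^2
\]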
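The first three terms suggest that the probability mass function for the
population number is:
\[
p_N(t) = \binom{N-1}{N_0-1}\,e^{-\lambda N_0 t}\,(1 - e^{-\lambda t})^{N - N_0}
  \quad \text{for } N \ge N_0
\tag{3.2.3}
\]
This is a negative binomial distribution with parameters $N_0$ and
$e^{-\lambda t}$. If $p_N(t)$ does indeed satisfy equation (3.2.3), then
the differential equation for $p_{N+1}(t)$ would be:
\[
\frac{dp_{N+1}(t)}{dt} = \lambda N \binom{N-1}{N_0-1}\,e^{-\lambda N_0 t}\,
  (1 - e^{-\lambda t})^{N - N_0} - \lambda (N + 1)\,p_{N+1}(t)
\tag{3.2.4}
\]
Equation (3.2.4) is a first order differential equation. We can thus apply
a result given by Jaeger et al. (1974) to equation (3.2.4), with the
boundary condition $p_{N+1}(0) = 0$, to get:
\[
p_{N+1}(t) = \binom{N}{N_0-1}\,e^{-\lambda N_0 t}\,(1 - e^{-\lambda t})^{N + 1 - N_0}
\tag{3.2.5}
\]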
We have thus shown that if $p_N(t)$ satisfies equation (3.2.3) then
$p_{N+1}(t)$ also satisfies equation (3.2.3). We also know that equation
(3.2.3) is true when $N = N_0$. Hence, by the induction principle,
equation (3.2.3) holds for all $N \ge N_0$. We have thus proved that the
probability mass function for a linear “birth-only” process is a negative
binomial distribution with parameters $N_0$ and $e^{-\lambda t}$.
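The induction argument can be cross-checked numerically. The sketch below, under the assumption that B(N) = λN, evaluates the negative binomial mass function and compares its time derivative (estimated by a central difference) with the right-hand side of the birth-only Kolmogorov equations. The function names and the parameter values in the demo are illustrative choices, not taken from the original text.

```python
# Minimal sketch: check that the negative binomial form satisfies the
# birth-only forward equations with B(N) = lam * N.
from math import comb, exp


def birth_only_pmf(n, n0, lam, t):
    """p_N(t) for the linear birth-only process; zero for N < N0."""
    if n < n0:
        return 0.0
    q = exp(-lam * t)
    return comb(n - 1, n0 - 1) * q ** n0 * (1.0 - q) ** (n - n0)


def forward_rhs(n, n0, lam, t):
    """lam (N-1) p_{N-1}(t) - lam N p_N(t); the inflow vanishes at N = N0."""
    inflow = lam * (n - 1) * birth_only_pmf(n - 1, n0, lam, t)
    outflow = lam * n * birth_only_pmf(n, n0, lam, t)
    return inflow - outflow


def time_derivative(n, n0, lam, t, h=1e-5):
    """Central-difference estimate of dp_N/dt."""
    ahead = birth_only_pmf(n, n0, lam, t + h)
    behind = birth_only_pmf(n, n0, lam, t - h)
    return (ahead - behind) / (2.0 * h)


if __name__ == "__main__":
    n0, lam, t = 5, 0.3, 2.0  # illustrative values only
    for n in range(n0, n0 + 4):
        lhs = time_derivative(n, n0, lam, t)
        rhs = forward_rhs(n, n0, lam, t)
        print(n, f"{lhs:.8f}", f"{rhs:.8f}")
```

The mean of this distribution is $N_0 e^{\lambda t}$, which matches the deterministic exponential growth one would expect for a linear birth rate.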
Similarly, one can directly derive the probability distribution function
for a “death-only” process. The Kolmogorov equations for the death process
are:
\[
\frac{dp_N(t)}{dt} = D(N+1)\,p_{N+1}(t) - D(N)\,p_N(t)
  \quad \text{for } N < N_0
\tag{3.2.6}
\]
\[
\frac{dp_{N_0}(t)}{dt} = -D(N_0)\,p_{N_0}(t)
\tag{3.2.7}
\]
As before, we start by solving for $p_{N_0}(t)$ using the boundary
condition that $p_{N_0}(0) = 1$, after which we can solve for
$p_{N_0-1}(t)$, $p_{N_0-2}(t)$, …, $p_0(t)$ using the boundary condition
that $p_N(0) = 0$ for $N \ne N_0$. When $D(N) = bN$ (i.e. the death rate
is proportional to the population size), the resulting probability mass
function for the “death-only” process has the form:
\[
p_N(t) = \binom{N_0}{N}\,e^{-bNt}\,(1 - e^{-bt})^{N_0 - N}
  \quad \text{for } 0 \le N \le N_0
\tag{3.2.8}
\]
Unlike the “birth-only” process, the “death-only” process has a finite set
of outcomes. Thus, for a linear “death-only” process, the population size
is binomially distributed with parameters $N_0$ and $e^{-bt}$.
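A short sketch of the binomial result follows, assuming D(N) = bN. It evaluates the mass function, reads off the extinction probability as the N = 0 term, and compares the time derivative with the right-hand side of the death-only equations. The helper names and the demo parameters are hypothetical.

```python
# Minimal sketch: the linear death-only process as a binomial distribution.
from math import comb, exp


def death_only_pmf(n, n0, b, t):
    """p_N(t): each of the n0 members survives to time t with prob e^{-bt}."""
    if n < 0 or n > n0:
        return 0.0
    survive = exp(-b * t)
    return comb(n0, n) * survive ** n * (1.0 - survive) ** (n0 - n)


def forward_rhs(n, n0, b, t):
    """b (N+1) p_{N+1}(t) - b N p_N(t); the inflow vanishes at N = N0."""
    inflow = b * (n + 1) * death_only_pmf(n + 1, n0, b, t)
    outflow = b * n * death_only_pmf(n, n0, b, t)
    return inflow - outflow


def time_derivative(n, n0, b, t, h=1e-5):
    """Central-difference estimate of dp_N/dt."""
    ahead = death_only_pmf(n, n0, b, t + h)
    behind = death_only_pmf(n, n0, b, t - h)
    return (ahead - behind) / (2.0 * h)


if __name__ == "__main__":
    n0, b, t = 10, 0.2, 3.0  # illustrative values only
    print("P(extinct by t) =", death_only_pmf(0, n0, b, t))
    print("mean size       =", sum(n * death_only_pmf(n, n0, b, t)
                                   for n in range(n0 + 1)))
```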
3.3) Alternatives to the Kolmogorov Equations
The direct method of solving the differential equations is quite laborious
in dealing with “birth-only” and “death-only” equations. This is due to
the necessity of deriving the probabilities for the various possible
population sizes (or at least the first few probabilities) separately
before one can derive the general expression for $p_N(t)$. This direct
method is even less practical when dealing with a model that allows for
both births and deaths. Due to the dependence of $p_N(t)$ on both
$p_{N-1}(t)$ and $p_{N+1}(t)$ in such models, one must solve the
differential equations simultaneously – contrast this with the successive
solution of the equations in the two earlier models. This makes the
Kolmogorov equations unwieldy for large populations as, in such cases,
obtaining even
a numerical solution is often difficult. The following alternatives
to the Kolmogorov
equations have proved useful.
A. The Continuous Approximation
By its very nature the population size, $N$, is a discrete random
variable. By treating $N$ as a continuous random variable and
re-interpreting $p_N(t)$ as the probability density function of $N$ (which
we shall denote by $\tilde{p}_N(t)$ to avoid confusion), one can derive an
approximate probability distribution for the population. Nisbet et al.
(1982) derived a single, approximate differential equation for
$\tilde{p}_N(t)$. By performing a Taylor expansion on $\tilde{p}_N(t)$ and
discarding terms that are of third order and higher, the authors showed
that:
\[
\frac{\partial \tilde{p}_N(t)}{\partial t}
  = -\frac{\partial}{\partial N}\Big\{[B(N) - D(N)]\,\tilde{p}_N(t)\Big\}
  + \frac{1}{2}\,\frac{\partial^2}{\partial N^2}\Big\{[B(N) + D(N)]\,\tilde{p}_N(t)\Big\}
\tag{3.3.1}
\]
A slightly modified version of the proof given by Nisbet and Gurney
(1982) is given in
Appendix A (some of the elements of the proof have been re-ordered
in an attempt to
make the proof more comprehensible).
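The boundary condition for equation (3.3.1) – when the initial population
size is $N_0$ – is:
\[
\tilde{p}_N(0) = \delta(N - N_0), \quad
\text{where } \delta \text{ is the Dirac delta function}
\tag{3.3.2}
\]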
Unfortunately, due to its non-linear form, equation (3.3.1) is
analytically intractable.
Even a numerical solution to the differential equation cannot be
found due to the
discontinuous nature of the boundary condition given in (3.3.2).
This implies that the
continuous approximation cannot be used to derive an approximate
probability
distribution for N . However, equation (3.3.1) can be used to
generate an approximate,
quasi-equilibrium distribution (covered in Section 3.6) which, in
turn, can be used to
derive an approximate analytical expression for the mean time to
extinction.
B. Stochastic Differential Equations
Both the Kolmogorov equations and their continuous approximation
model the
population probability distribution through time. The following
model is based on the
population size itself and can thus be readily compared with the
deterministic models
covered in Chapter 2. Stochastic differential equations are often
also used to model
environmental stochasticity. It can be shown (see Appendix A) that:
\[
E[dN] = [B(N) - D(N)]\,dt, \qquad E[dN^2] = [B(N) + D(N)]\,dt
\]
so that
\[
\begin{aligned}
\mathrm{Var}[dN] &= E[dN^2] - (E[dN])^2 \\
  &= [B(N) + D(N)]\,dt - [B(N) - D(N)]^2\,dt^2 \\
  &\approx [B(N) + D(N)]\,dt \quad \text{provided } dt \text{ is sufficiently small}
\end{aligned}
\]
Thus, provided $dt$ is sufficiently small for terms of order $dt^2$ and
higher to be ignored, we have:
\[
dN = [B(N) - D(N)]\,dt + \eta(t)\,\sqrt{[B(N) + D(N)]\,dt}
\tag{3.3.4}
\]
where $\eta(t)$ is a random variable with zero mean and unit variance.
Equation (3.3.4) is merely a cumbersome restatement of the Kolmogorov
equations: $\eta(t)$ has an unusual, discrete probability distribution to
accommodate the fact that $N$ can only take on integer values. However, a
tractable approximation to this equation is possible.
Nisbet et al. (1982) stated that, for all but the smallest populations,
any change in the population size which is large enough to affect the
transition probabilities must be the result of a large number of
statistically independent births and deaths. Thus, $dN$ can be taken over
a relatively long time increment $dt$; $\eta(t)$ will then have an
approximately normal probability distribution (by the Central Limit
Theorem). By regarding $N$ as a continuous variable (which implies that
$dt$ is small, since $dt$ and $\varepsilon^2$ must remain in a constant
ratio; see the proof of equation (3.3.1) in Appendix A) and $\eta(t)$ as
being normally distributed (implying $dt$ is large), we have:
\[
dN = [B(N) - D(N)]\,dt + \sqrt{B(N) + D(N)}\;d\omega(t)
\tag{3.3.5}
\]
The Wiener process, $\omega(t)$, is a continuous random process with
independent increments which is also time homogeneous (i.e. $\omega(t)$
and $\omega(t+s) - \omega(s)$ have the same distribution for $s \ge 0$,
and $\omega(0) = 0$). In addition, the Wiener process $\omega(t)$ is
normally distributed with mean 0 and variance $\sigma^2 t$ (where
$\sigma > 0$).
Nisbet et al. (1982) did not offer a resolution of the requirement
that dt is both small
and large. However, from empirical evidence in Section 3.5,
equation (3.3.5) does seem
to provide a good approximation to the Kolmogorov equations. The
stochastic
differential equation can alternatively be written as:
\[
\frac{dN}{dt} = B(N) - D(N) + \sqrt{B(N) + D(N)}\;\gamma(t)
\tag{3.3.6}
\]
where $\gamma(t) = d\omega(t)/dt$ is white noise.
Equation (3.3.6) can be interpreted as the sum of ‘deterministic’ and
‘stochastic’ contributions to $dN/dt$. Care must be taken with this
interpretation since white noise is not well behaved (e.g.
$E[\gamma(t)^2] = \infty$). Stochastic differential equations prove to be
especially useful for deriving the gross fluctuation characteristics of a
population.
C. Generating Functions
We now derive a single differential equation for the population’s moment
generating function. The moment generating function of any random variable
characterises that random variable’s probability distribution. Hence such
an equation implicitly models the population’s probability distribution
through time.
Consider the random variables $N(t)$ and $N(t + \Delta t) - N(t)$. The
variables represent the population size at time $t$ and the net change in
the population size over the interval $[t, t + \Delta t]$ respectively. If
we let $f_j(N)$ represent the continuous transition rate from population
size $N$ to size $N + j$, then:
\[
\Pr[\,N(t + \Delta t) - N(t) = j \mid N(t) = N\,] =
\begin{cases}
f_j(N)\,\Delta t + o(\Delta t), & j \ne 0 \\
1 - \Delta t \sum_{j \ne 0} f_j(N) + o(\Delta t), & j = 0
\end{cases}
\tag{3.3.7}
\]
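For the birth and death model, $f_1(N) = B(N)$, $f_{-1}(N) = D(N)$ and
$f_j(N) = 0$ for $j \ne -1, 0, 1$. Also, $f_j(N) = r_{N,N+j}$ using the
notation for the transition rates introduced in Section 3.1. Bailey (1964)
showed that the following differential equation for the moment generating
function, $M(\theta, t)$, corresponds to the set of probability
differential equations (as shown in (3.1.5)):
\[
\frac{\partial M(\theta, t)}{\partial t}
  = \sum_{j \ne 0} (e^{j\theta} - 1)\,
    f_j\!\left(\frac{\partial}{\partial \theta}\right) M(\theta, t)
\tag{3.3.8}
\]
Note that the $\partial/\partial\theta$ operator acts only on
$M(\theta, t)$. So, for example, if $f_1(N) = aN - bN^2$ then
$f_1(\partial/\partial\theta)\,M(\theta, t)
= a\,\partial M(\theta, t)/\partial\theta
- b\,\partial^2 M(\theta, t)/\partial\theta^2$.
A derivation of equation (3.3.8) is given in Appendix B.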
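The birth and death process assumes that the population size cannot change
by more than one unit in the interval $\Delta t$. Hence we have:
\[
\frac{\partial M(\theta, t)}{\partial t}
  = (e^{\theta} - 1)\,B\!\left(\frac{\partial}{\partial \theta}\right) M(\theta, t)
  + (e^{-\theta} - 1)\,D\!\left(\frac{\partial}{\partial \theta}\right) M(\theta, t)
\tag{3.3.9}
\]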
The advantage of the above equation is easy to see: instead of
having a possibly infinite
set of differential equations to solve simultaneously, we only need
to solve a single
differential equation. The moment generating function characterises the
probability distribution of $N$, so the solution of equation (3.3.8) helps
to identify the correct probability distribution of $N$. One can come up
with corresponding differential equations for the probability generating
function, $P(z, t)$, and the cumulant generating function, $K(\theta, t)$.
For $K(\theta, t)$, we use the relationship
$K(\theta, t) = \log M(\theta, t)$ (equation (3.4.3) below gives the
differential equation of $K(\theta, t)$ for a birth and death process). If
we substitute $e^{\theta} = z$ and
$\partial/\partial\theta = z\,\partial/\partial z$ in (3.3.9), we then
have the following differential equation for the probability generating
function, $P(z, t)$:
\[
\frac{\partial P(z, t)}{\partial t}
  = (z - 1)\,B\!\left(z\frac{\partial}{\partial z}\right) P(z, t)
  + (z^{-1} - 1)\,D\!\left(z\frac{\partial}{\partial z}\right) P(z, t)
\tag{3.3.10}
\]
Consider the simple case where the transition rates are
proportional to the population
size:
(3.3.11)
In this case, the differential equation for $M(\theta, t)$ becomes:
(3.3.12)
Equation (3.3.12) is a linear differential equation. Hence an analytical
solution for $M(\theta, t)$ is easily obtainable. Bailey (1964) showed
that the solution of equation (3.3.12) with boundary condition
$M(\theta, 0) = e^{N_0 \theta}$ (i.e. the initial population size is
$N_0$) is:
(3.3.13)
Since the birth and death rates were simply proportional to the population
size, it was easy to derive this analytical solution for $M(\theta, t)$.
If the birth and death rates are nonlinear, differential equations (3.3.9)
and (3.3.10) can become intractable. In such cases, we unfortunately
cannot derive the exact probability distribution of $N$ and thus we need
to look at ways to approximate the true probability distribution or to
solve the equation numerically.
The difficulties encountered in deriving an analytical solution to the
Kolmogorov equations – particularly when the birth and death rates are
nonlinear – force one to look at various, more solvable, models that
approximate a population’s probability distribution. This section is based
on the work of Matis et al. (2000). Consider transition rates with the
following mathematical form:
\[
\begin{aligned}
B(N) &= f_1(N) = a_1 N - b_1 N^2 \quad \text{for } N \le a_1/b_1 \\
D(N) &= f_{-1}(N) = a_2 N + b_2 N^2
\end{aligned}
\tag{3.4.1}
\]
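The transition rates shown above only hold for $N \le a_1/b_1$. This
suggests that $a_1 \gg b_1$. Consequently, we expect the per capita birth
rate to dominate when $N$ is small and the term $b_1 N^2$ to dominate when
$N$ is large. $b_1 N^2$ can thus be interpreted as the effect of crowding
on the population. A similar interpretation also holds for the death
rates. By applying equation (3.3.9) to the above transition rates, we
obtain:
\[
\frac{\partial M}{\partial t}
  = (e^{\theta} - 1)\left[a_1 \frac{\partial M}{\partial \theta}
      - b_1 \frac{\partial^2 M}{\partial \theta^2}\right]
  + (e^{-\theta} - 1)\left[a_2 \frac{\partial M}{\partial \theta}
      + b_2 \frac{\partial^2 M}{\partial \theta^2}\right]
\tag{3.4.2}
\]
This implies that the differential equation for the cumulant generating
function, $K(\theta, t)$, is:
\[
\frac{\partial K}{\partial t}
  = (e^{\theta} - 1)\left[a_1 \frac{\partial K}{\partial \theta}
      - b_1 \left(\frac{\partial^2 K}{\partial \theta^2}
      + \Big(\frac{\partial K}{\partial \theta}\Big)^2\right)\right]
  + (e^{-\theta} - 1)\left[a_2 \frac{\partial K}{\partial \theta}
      + b_2 \left(\frac{\partial^2 K}{\partial \theta^2}
      + \Big(\frac{\partial K}{\partial \theta}\Big)^2\right)\right]
\tag{3.4.3}
\]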
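For readers who want the intermediate step between (3.4.2) and (3.4.3) without turning to the appendix, the following worked lines sketch the substitution. They rely only on the relationship $K = \log M$ stated earlier; the layout is an illustration and not a reproduction of the appendix derivation.

```latex
\begin{align*}
M(\theta, t) &= e^{K(\theta, t)}, \\
\frac{\partial M}{\partial t} &= M\,\frac{\partial K}{\partial t}, \qquad
\frac{\partial M}{\partial \theta} = M\,\frac{\partial K}{\partial \theta}, \\
\frac{\partial^2 M}{\partial \theta^2}
  &= M\left[\frac{\partial^2 K}{\partial \theta^2}
     + \left(\frac{\partial K}{\partial \theta}\right)^2\right].
\end{align*}
```

Substituting these three expressions into (3.4.2) and dividing both sides by $M(\theta, t)$, which is strictly positive, leaves an equation in $K$ alone. The squared first derivative is what makes the equation for $K$ nonlinear even though the equation for $M$ is linear.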
A derivation of equation (3.4.3) is given in Appendix E. Neither of the
above two differential equations is analytically tractable. We are thus
unable to find an exact analytical expression for the probability mass
function, and need to look at ways to derive an approximate expression for
the probability mass function as it evolves through time.
One way to generate an approximate probability mass function is to derive
expressions for the first few cumulants of $N$ from equation (3.4.3). In
deriving such expressions, we make use of the following relationship:
\[
K(\theta, t) = \sum_{i=1}^{\infty} \kappa_i(t)\,\frac{\theta^i}{i!}
\tag{3.4.4}
\]
where $\kappa_i(t)$ is the $i$-th cumulant of $N$ at time $t$.
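Differentiating the series in (3.4.4) term by term gives the derivatives
that appear in equation (3.4.3):
\[
\frac{\partial K}{\partial \theta}
  = \sum_{i=1}^{\infty} \kappa_i(t)\,\frac{\theta^{i-1}}{(i-1)!}, \qquad
\frac{\partial^2 K}{\partial \theta^2}
  = \sum_{i=2}^{\infty} \kappa_i(t)\,\frac{\theta^{i-2}}{(i-2)!}, \qquad
\frac{\partial K}{\partial t}
  = \sum_{i=1}^{\infty} \frac{d\kappa_i(t)}{dt}\,\frac{\theta^i}{i!}
\]
(3.4.5) – (3.4.7)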
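We also need to use the series expansion of $e^{\theta}$:
\[
e^{\theta} = \sum_{i=0}^{\infty} \frac{\theta^i}{i!}
  = 1 + \theta + \frac{\theta^2}{2!} + \frac{\theta^3}{3!} + \cdots
\tag{3.4.8}
\]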
Substituting equations (3.4.5) – (3.4.8) into equation (3.4.3), we
thus have:
(3.4.9)
(3.4.10)
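By equating the coefficients for the various powers of $\theta$ in
equation (3.4.10), we obtain the following system of differential
equations for the cumulants:
\[
\begin{aligned}
\frac{d\kappa_1}{dt} &= (a_1 - a_2)\kappa_1 - (b_1 + b_2)\kappa_1^2
  - (b_1 + b_2)\kappa_2 \\
\frac{d\kappa_2}{dt} &= (a_1 + a_2)\kappa_1 - (b_1 - b_2)\kappa_1^2
  + [2(a_1 - a_2) - (b_1 - b_2)]\kappa_2
  - 4(b_1 + b_2)\kappa_1\kappa_2 - 2(b_1 + b_2)\kappa_3 \\
\frac{d\kappa_3}{dt} &= (a_1 - a_2)\kappa_1 - (b_1 + b_2)\kappa_1^2
  + [3(a_1 + a_2) - (b_1 + b_2)]\kappa_2
  - 6(b_1 - b_2)\kappa_1\kappa_2 - 6(b_1 + b_2)\kappa_2^2 \\
  &\quad + [3(a_1 - a_2) - 3(b_1 - b_2)]\kappa_3
  - 6(b_1 + b_2)\kappa_1\kappa_3 - 3(b_1 + b_2)\kappa_4, \quad \text{etc.}
\end{aligned}
\tag{3.4.11}
\]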
It can be seen that the differential equation for the $i$-th cumulant
function has terms up to the $(i+1)$-th cumulant – this is due to the
non-linear birth and death rates. The presence of these higher-order
cumulants prevents us from solving the differential equations in (3.4.11)
directly. Matis et al. (2000) proposed using a cumulant truncation
procedure. Here, one approximates the first $i$ cumulants by setting all
the cumulants of order $i + 1$ or higher to zero. If we set all the
cumulants of order 4 and above to zero, we can then solve the resulting
three differential equations in (3.4.11) numerically to find values for
$\kappa_1(t)$, $\kappa_2(t)$ and $\kappa_3(t)$ at various values of $t$.
The cumulant values obtained can then be used to create a saddle-point
approximation of the probability distribution.
The probability distribution of $N$ may be approximated by a saddle-point
probability distribution. The saddle-point is a density function that
takes as its parameters the values of the cumulants of $N$ up to a
specified order and forces the values of the density function’s cumulants
to match those of $N$. It is this matching of the cumulants that makes the
saddle-point an approximation of the true distribution of $N$: one would
expect the approximation to be more accurate if more cumulants are matched
(however, in some cases, this is not true). The values for the first three
cumulants (derived from equation (3.4.11)) can be substituted into the
following saddle-point approximation derived by Renshaw (1998):
(3.4.12)
Matis et al. (2000) have stated that the investigations which they
have performed into
the accuracy of the above approximation have yielded results that
were “very
encouraging”.
We again consider the transition rates introduced in Chapter 2:
\[
B(N) =
\begin{cases}
0.3N - 0.015N^2 & \text{for } N \le 20 \\
0 & \text{otherwise}
\end{cases}
\qquad
D(N) = 0.02N + 0.001N^2
\tag{3.5.1}
\]
By solving the equation $B(N) - D(N) = 0$, we can see that the equilibrium
population size is 17.5. Since this is a stable equilibrium state, more
births tend to occur when $N < 17.5$ and more deaths tend to occur when
$N > 17.5$.
Three simulations of the continuous-time Markov model were performed using
the above transition rates so as to get a feel for what a typical
population trajectory might look like. In order to execute the
simulations, we needed to divide the timeline into intervals of
sufficiently small length (in this case, the interval length was set to
0.01) so as to make the probability of more than one birth or death
occurring within an interval negligible. For each interval, we thus know
(using $\Delta t = 0.01$) that the probability that a birth occurs is
$B(N)\,\Delta t = 0.003N - 0.00015N^2$. The probability that a death
occurs is $D(N)\,\Delta t = 0.0002N + 0.00001N^2$ and the probability that
nothing happens is
$1 - [B(N) + D(N)]\,\Delta t = 1 - 0.0032N + 0.00014N^2$. One can then
simulate which of the three possible events occurs in each interval and
hence replicate the population trajectory over the period of interest. In
the simulations performed, the initial population size was set to 10. The
three simulated population trajectories are shown below:
[Figure: “Population Trajectories”, plotting N against Time]
Figure 3.1 Simulated runs of the Population when $N_0 = 10$
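The interval-by-interval procedure described above can be sketched as follows. This is a minimal illustration, not the spreadsheet implementation used for the figure: the function name `simulate_path`, the use of Python's `random.Random`, and the choice to set the birth rate to zero from N = 20 upwards are assumptions made for the example.

```python
# Minimal sketch: one trajectory of the birth and death process, drawing
# a single uniform random number per interval of length dt = 0.01.
import random

DT = 0.01  # interval length quoted in the text


def birth(n):
    """B(N) = 0.3 N - 0.015 N^2, taken as zero for N >= 20."""
    return 0.3 * n - 0.015 * n * n if n < 20 else 0.0


def death(n):
    """D(N) = 0.02 N + 0.001 N^2."""
    return 0.02 * n + 0.001 * n * n


def simulate_path(n0, t_end, rng, dt=DT):
    """Return the population size at times 0, dt, 2 dt, ..., t_end."""
    n = n0
    path = [n]
    for _ in range(round(t_end / dt)):
        u = rng.random()
        p_birth = birth(n) * dt
        p_death = death(n) * dt
        if u < p_birth:
            n += 1          # a birth occurred in this interval
        elif u < p_birth + p_death:
            n -= 1          # a death occurred in this interval
        # otherwise nothing happened
        path.append(n)
    return path


if __name__ == "__main__":
    rng = random.Random(2024)  # arbitrary seed, for repeatability only
    path = simulate_path(n0=10, t_end=10.0, rng=rng)
    print("size at time 10:", path[-1])
```

Because at most one event is allowed per interval, consecutive entries of the returned path never differ by more than one.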
Similarly, one can simulate various trajectories for the stochastic
differential equation (SDE) approximate representation of the birth and
death process.
The SDE for the transitions given in (3.5.1) is:
\[
dN = (0.28N - 0.016N^2)\,dt + \sqrt{0.32N - 0.014N^2}\;d\omega(t)
\tag{3.5.2}
\]
where $\omega(t)$ is a Wiener process.
As with the earlier model, we assume that the initial population size is
10 and we use time increments of size 0.01. Thus, by simulating the values
that the normal random variable $d\omega(t)$ takes over each time
increment, one can derive the population increments through time. By
adding these increments to the initial population size, one can derive the
population trajectory.
[Figure: “Population Trajectories”, plotting N against Time]
Figure 3.2 Simulated SDE runs of the Population when $N_0 = 10$
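A comparable sketch for the SDE uses the Euler–Maruyama scheme, which is one standard way of adding normally distributed increments step by step; the text does not name the scheme it used, so this is an assumption. Two further assumptions are labelled in the code: the Wiener increments are given variance dt (i.e. σ = 1), and the path is held at zero if an increment would take it negative.

```python
# Minimal sketch: Euler-Maruyama simulation of
#   dN = (0.28 N - 0.016 N^2) dt + sqrt(0.32 N - 0.014 N^2) d_omega.
import math
import random


def simulate_sde(n0, t_end, rng, dt=0.01):
    """Return the approximate SDE path at times 0, dt, ..., t_end."""
    n = float(n0)
    path = [n]
    for _ in range(round(t_end / dt)):
        drift = 0.28 * n - 0.016 * n * n             # B(N) - D(N)
        var_rate = max(0.32 * n - 0.014 * n * n, 0)  # B(N) + D(N), floored
        d_omega = rng.gauss(0.0, math.sqrt(dt))      # assumption: sigma = 1
        n = n + drift * dt + math.sqrt(var_rate) * d_omega
        n = max(n, 0.0)  # assumption: hold the path at zero if it dips below
        path.append(n)
    return path


if __name__ == "__main__":
    rng = random.Random(2024)  # arbitrary seed, for repeatability only
    path = simulate_sde(n0=10, t_end=10.0, rng=rng)
    print("size at time 10:", round(path[-1], 3))
```

Unlike the discrete simulation, the values along this path are real numbers, so they would need rounding before being compared with integer population sizes.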
In Figure 3.1, one can clearly see that the population size never
moves by more than
one unit in any instant (since dt is made sufficiently small to
exclude the possibility of
multiple births and deaths within any time increment). This serves
to highlight that the
birth-and-death process is a discrete-state process in continuous
time. However in
Figure 3.2 the population size can change to any value within an
instant. This is to be
expected as the SDE treats the population size as a continuous
variable.
Of course, if one were to repeat the simulations, one would, in all
likelihood, obtain
appreciably different population trajectories from the ones shown
in Figures 3.1 and 3.2
(since the population movements depend on the occurrence of random
events). The
initial population size is 10. From Figure 2.1, we can see that the
birth rates are higher
than the death rates when N = 10 and hence we would expect an
upward trend in the
population numbers initially. Such a trend is clearly evident at
the outset of all three
population simulations in both Figures 3.1 and 3.2. Once the
equilibrium state is
reached, one can see from the above figures that the population
then tends to vacillate
around this point. This is to be expected as this is a stable
equilibrium state.
A million simulations were then run for the birth-and-death process and
these were compared with ten thousand simulations for the SDE. Both sets
of simulations took roughly an hour to run in Microsoft Excel on a Pentium
4, 2800 MHz, 512MB RAM: the SDE simulations are relatively more
time-consuming as random numbers from a Normal distribution must be
generated to carry out these simulations. This takes considerably longer
than the Uniform random number generation required for the birth-and-death
process, as the programming language used to execute the simulations
required one of Microsoft Excel’s statistical functions to generate the
Normal random numbers.
For both types of simulation the initial population size is set to 10. The
population size at time 10 was recorded for each simulation run. These
results were then grouped into a frequency distribution (values for the
SDE were rounded to the nearest integer). Consequently, the probability
distribution at time 10, $p_N(10)$, for both the birth-and-death process
and the stochastic differential equation could be estimated as the number
of times that a particular population value occurred, divided by the
number of simulations undertaken. The estimated probabilities for the two
models are shown below:
[Figure: estimated probabilities (0 to 0.25) against Population Size (1 to
21) for the two processes]
Figure 3.3 Simulated Probability Distributions for the two processes
One can clearly see that the probability distribution for both processes
is negatively skewed at time 10. This is due, in part, to the population
starting below the equilibrium size. The probability distribution derived
using the SDE is a good approximation to the true distribution obtained by
simulating the Birth and Death process directly. This is despite the fact
that the population size is quite small and so the normality assumption
implicit in the SDE model is contentious.
The above transition rates are nonlinear and so we need to use the
cumulant
truncation method in order to obtain approximate values for the
cumulants. If we choose
to truncate all cumulants of order four and higher, we will obtain
the following system of
differential equations (see equations in (3.4.11)):
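\[
\begin{aligned}
\frac{d\kappa_1}{dt} &= 0.28\kappa_1 - 0.016\kappa_1^2 - 0.016\kappa_2 \\
\frac{d\kappa_2}{dt} &= 0.32\kappa_1 - 0.014\kappa_1^2 + 0.546\kappa_2
  - 0.064\kappa_1\kappa_2 - 0.032\kappa_3 \\
\frac{d\kappa_3}{dt} &= 0.28\kappa_1 - 0.016\kappa_1^2 + 0.944\kappa_2
  - 0.084\kappa_1\kappa_2 - 0.096\kappa_2^2 + 0.798\kappa_3
  - 0.096\kappa_1\kappa_3
\end{aligned}
\tag{3.5.3}
\]
with boundary conditions $\kappa_1(0) = N_0 = 10$ and
$\kappa_2(0) = \kappa_3(0) = 0$.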
The solution of the first three cumulants at time 10 was obtained
numerically. The
following values were obtained:
\[
\hat{\kappa}_1(10) = 16.49, \qquad \hat{\kappa}_2(10) = 3.53, \qquad
\hat{\kappa}_3(10) = -3.45
\]
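The numerical solution can be sketched with a classical fourth-order Runge–Kutta integrator. The text does not say which numerical method was used, so the integrator is an assumption, as are the function names; the coefficients are those obtained from (3.4.11) with $a_1 = 0.3$, $b_1 = 0.015$, $a_2 = 0.02$, $b_2 = 0.001$ and the fourth cumulant set to zero.

```python
# Minimal sketch: cumulant truncation at order three, integrated with RK4.

def cumulant_rhs(k):
    """Time derivatives of (kappa_1, kappa_2, kappa_3) with kappa_4 = 0."""
    k1, k2, k3 = k
    d1 = 0.28 * k1 - 0.016 * k1 ** 2 - 0.016 * k2
    d2 = (0.32 * k1 - 0.014 * k1 ** 2 + 0.546 * k2
          - 0.064 * k1 * k2 - 0.032 * k3)
    d3 = (0.28 * k1 - 0.016 * k1 ** 2 + 0.944 * k2 - 0.084 * k1 * k2
          - 0.096 * k2 ** 2 + 0.798 * k3 - 0.096 * k1 * k3)
    return (d1, d2, d3)


def rk4_step(k, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))

    s1 = cumulant_rhs(k)
    s2 = cumulant_rhs(shifted(k, s1, h / 2))
    s3 = cumulant_rhs(shifted(k, s2, h / 2))
    s4 = cumulant_rhs(shifted(k, s3, h))
    return tuple(
        x + h / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(k, s1, s2, s3, s4)
    )


def solve_cumulants(t_end, h=0.01, k0=(10.0, 0.0, 0.0)):
    """Integrate from the boundary conditions k0 up to time t_end."""
    k = k0
    for _ in range(round(t_end / h)):
        k = rk4_step(k, h)
    return k


if __name__ == "__main__":
    k1, k2, k3 = solve_cumulants(10.0)
    print(f"kappa_1 = {k1:.2f}, kappa_2 = {k2:.2f}, kappa_3 = {k3:.2f}")
```

Changing `k0` shows how the truncated cumulants respond to a different starting population, which is a cheap way to explore the system before committing to full simulations.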
These estimates compare favourably with the first three cumulant
values observed for
the one million simulations of the birth-and-death process:
\[
\kappa_1(10) = 16.50, \qquad \kappa_2(10) = 3.52, \qquad
\kappa_3(10) = -3.61
\]
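For reference, the first three cumulants of a set of simulated population sizes are simply the mean, the variance and the third central moment. The sketch below computes them with plain (biased) sample moments, which is an assumption about how the observed values were obtained; with a million runs the difference from unbiased estimators would be negligible. The function name is hypothetical.

```python
# Minimal sketch: first three sample cumulants of a list of observations.

def sample_cumulants(samples):
    """Return (mean, variance, third central moment) of the samples."""
    count = len(samples)
    mean = sum(samples) / count
    second = sum((x - mean) ** 2 for x in samples) / count
    third = sum((x - mean) ** 3 for x in samples) / count
    return mean, second, third


if __name__ == "__main__":
    # Tiny illustrative data set, not simulation output.
    print(sample_cumulants([0, 0, 3]))
```

A negative third cumulant indicates a longer tail towards small population sizes, consistent with the negative skewness noted for the simulated distribution.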
The estimates of the cumulants can then be substituted into the
saddle-point
approximation given in (3.4.12) so as to get an approximate
probability mass function
for N . The diagram below compares the saddle-point probabilities
with the simulated
relative frequencies from the Birth and Death process:
[Figure: “Accuracy of the Saddlepoint approximation”, plotting Probability
against Population Size for the simulated probabilities and the
saddlepoint approximations]
Figure 3.4 Comparison of the Saddle-point and Simulated probabilities
From the above figure, one can see that the saddle-point approximation
deviates substantially from the true Birth and Death probabilities. The
saddle-point approximation does not seem to be valid for the above three
values obtained for the cumulants – the probability density function will
be a complex number when $n > 17.52$, as $\psi$ (as defined in (3.4.12))
will be negative over this range. This means that the saddle-point
approximation, $\hat{p}_N(10)$, is not defined for $N > 17$. Matis et al.
(2000) applied the saddle-point approximation successfully to various
other transitional forms. However, they did not apply the saddle-point
approximation when considering the transition rates given in (2.1.1).
Further research needs to be done to ascertain the reason for the failure
of the saddle-point approximation for the transition rates in (2.1.1).
To see whether the saddle-point approximation worked better after a longer
time interval, the population size at time 20 was also studied. A hundred
thousand simulations were run. The values observed for the first three
cumulants were:
\[
\kappa_1(20) = 17.29, \qquad \kappa_2(20) = 2.58, \qquad
\kappa_3(20) = -2.30
\]
Using the formula for $\psi$ (given in (3.4.12)), we now find that the
density function is complex over the range $n > 18.74$. Compare the
cumulant values at time 20 with those observed at time 50:
\[
\kappa_1(50) = 17.35, \qquad \kappa_2(50) = 2.49, \qquad
\kappa_3(50) = -2.20
\]
Here, the density function becomes complex over the range $n > 18.79$.
There are minimal changes in the values of the cumulants. This seems to
suggest that the population is close to equilibrium at time 20 (the
concept of a population being in equilibrium is considered in more detail
in section 3.6).
A process $X(t)$ is said to be ergodic if all its cumulants (e.g.
$\mu = E[X(t)]$) are equal to the matching time averages of the process
(e.g. $\bar{X} = \frac{1}{T}\int_0^T X(t)\,dt$ as $T \to \infty$). One of
the conditions of ergodicity, which is satisfied by all the
birth-and-death models considered in this research report, is that the
population should ‘forget’ its initial population size after a suitably
long period of time (see Nisbet et al. (1982) and section 3.7). Thus we
expect the values of the cumulants by time 20 to be independent of the
starting population size. Thus, for the transition rates considered in
this example, the saddle-point approximation is inappropriate,
irrespective of the initial value of the population.
3.6) Quasi-Equilibrium Distribution
Nisbet et al. (1982) stated that a population with a true equilibrium
state, $p_N^*$, has:
\[
B(N)\,p_N^* = D(N+1)\,p_{N+1}^* \quad \text{for } N = 0, 1, 2, \ldots
\tag{3.6.1}
\]
Intuitively, equation (3.6.1) signifies that a population at equilibrium
has an equal probability of increasing from size $N$ to size $N + 1$ as it
has of decreasing from size $N + 1$ to size $N$. By repeatedly applying
the above recurrence relationship, one can show that:
\[
p_N^* = \frac{B(0)\,B(1) \cdots B(N-1)}{D(1)\,D(2) \cdots D(N)}\;p_0^*,
  \quad N > 0
\tag{3.6.2}
\]
Since we are ignoring migration, $B(0)$ is zero. This implies that
$p_N^* = 0 \;\; \forall N > 0$. Since the probabilities must sum to one,
$p_0^* = 1$. This is to be expected since extinction is an absorbing state
when migration is ignored. However, this distribution is of limited
interest. Consequently, the concept of the quasi-equilibrium distribution
is considered instead. Quasi-equilibrium is defined as the equilibrium
probability distribution that the population would ultimately be subject
to if it were never to become extinct. We now look at two possible methods
of deriving the quasi-equilibrium distribution.
A. The modified Markov Process
Matis et al. (2000) modified the original birth and death process to
create a new Markov process whose probability distribution did not
degenerate at equilibrium to an extinction probability of one. The
coefficient matrix for this new process, $R^*$, was based on the
coefficient matrix for the birth and death process, $R$ (see equation
(3.1.5)). By deleting the first row and column of $R$ (thus excluding the
state $N = 0$) and by assuming that $D(1) = 0$ (which removes the only
transition to the state $N = 0$), one obtains the coefficient matrix for
the new process, $R^*$.
Let $p_N^m(t)$ be the probability that this modified process is equal to
$N$ at time $t$. Also let
$\mathbf{p}^m(t) = (p_1^m(t), p_2^m(t), \ldots)$. Then:
\[
\dot{\mathbf{p}}^m(t) = \mathbf{p}^m(t)\,R^*
\tag{3.6.3}
\]
At equilibrium, we would expect $\dot{\mathbf{p}}^m(t) = 0$. So if we let
$\Pi = \{p_N^m\}_{N \ge 1}$ denote the equilibrium distribution for the
modified process (and the quasi-equilibrium distribution for the birth and
death process) we then have:
\[
0 = \Pi R^*
\tag{3.6.4}
\]
By solving equation (3.6.4) for $\Pi$, we get the quasi-equilibrium
distribution. However, the algebra may become tedious when the population
is large. Note that $R^*$ must be a singular matrix, otherwise $\Pi = 0$.
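For a birth and death chain the tedious algebra can be sidestepped. The sketch below assumes that the equilibrium vector of the modified process satisfies the same balance relation as (3.6.1), restricted to $N \ge 1$, so that each probability follows from the previous one by the ratio $B(N)/D(N+1)$. It uses the rates of Section 3.5 with the birth rate taken as zero from N = 20 upwards; the function name is hypothetical, and this is an illustration rather than the matrix method of Matis et al.

```python
# Minimal sketch: quasi-equilibrium distribution of the modified process
# (states 1..20, D(1) treated as zero) via the balance recurrence
#   pi_{N+1} = pi_N * B(N) / D(N+1).

def birth(n):
    """B(N) = 0.3 N - 0.015 N^2, taken as zero for N >= 20."""
    return 0.3 * n - 0.015 * n * n if n < 20 else 0.0


def death(n):
    """D(N) = 0.02 N + 0.001 N^2."""
    return 0.02 * n + 0.001 * n * n


def quasi_equilibrium(n_max=20):
    """Return {N: probability} for N = 1..n_max."""
    weights = [1.0]  # unnormalised weight of state N = 1
    for n in range(1, n_max):
        weights.append(weights[-1] * birth(n) / death(n + 1))
    total = sum(weights)
    return {n: w / total for n, w in enumerate(weights, start=1)}


if __name__ == "__main__":
    dist = quasi_equilibrium()
    mean = sum(n * p for n, p in dist.items())
    var = sum((n - mean) ** 2 * p for n, p in dist.items())
    print(f"mean = {mean:.2f}, variance = {var:.2f}")
```

The mean and variance printed by the demo can be set against the cumulants observed by simulation at large times as a consistency check.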
B. Locally Linear Approximations
In Chapter 2, a locally linear approximation in the vicinity of the
population’s equilibrium state was used to derive an approximate
population model. An approximation around the population’s deterministic
equilibrium state, $N^*$, can also be used to derive an approximate
quasi-equilibrium distribution. The deterministic equilibrium state
satisfies the following relationship:
\[
B(N^*) = D(N^*)
\tag{3.6.5}
\]
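Nisbet et al. (1982) defined the three functions, $f(N)$, $g(N)$ and
$n(t)$:
\[
f(N) = B(N) - D(N), \qquad g(N) = B(N) + D(N), \qquad n(t) = N(t) - N^*
\tag{3.6.6}
\]
By performing a Taylor expansion of $f(N)$ and $g(N)$ around $N^*$ and
retaining only the leading term in each expansion, we have:
\[
f(N) \approx \lambda n, \quad \text{where } \lambda =
  \left[\frac{dB}{dN} - \frac{dD}{dN}\right]_{N = N^*}
\tag{3.6.7}
\]
\[
g(N) \approx Q, \quad \text{where } Q = B(N^*) + D(N^*)
\tag{3.6.8}
\]
Equation (3.6.7) approximates $f(N)$ to the first order whilst equation
(3.6.8) approximates $g(N)$ to the zeroth order. These expressions can
then be substituted into the continuous approximation given by equation
(3.3.1) to obtain the following equation for $\tilde{p}_n(t)$, the
probability density of $n$:
\[
\frac{\partial \tilde{p}_n(t)}{\partial t}
  = -\lambda\,\frac{\partial}{\partial n}\big[n\,\tilde{p}_n(t)\big]
  + \frac{Q}{2}\,\frac{\partial^2 \tilde{p}_n(t)}{\partial n^2}
\tag{3.6.9}
\]
By setting the time derivative in (3.6.9) to zero, one can solve the
resulting differential equation to derive an approximate expression for
the quasi-equilibrium distribution:
\[
\tilde{p}_n^* = \sqrt{\frac{-\lambda}{\pi Q}}\;
  \exp\!\left(\frac{\lambda n^2}{Q}\right)
\tag{3.6.10}
\]
The function clearly is Gaussian in form. Nisbet et al. stated that, with
this locally linear approximation, the population size has the following
(approximate) distribution:
\[
N \cong \text{Normal}\!\left(N^*,\; -\frac{Q}{2\lambda}\right)
\tag{3.6.11}
\]
Note that $\lambda < 0$ for a stable equilibrium state (see equation
(2.2.6)) and thus the variance, $-Q/(2\lambda)$, for $N$ is positive.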
3.6.A) Example (continued)
For the transition rates considered in Section 3.5, we have
$N^* = 17.5$. Furthermore, we know that:
\[
Q = B(17.5) + D(17.5) = 1.3125, \qquad
\lambda = \left[\frac{dB}{dN} - \frac{dD}{dN}\right]_{N = 17.5} = -0.28
\]
Thus, we have that $N \cong \text{Normal}(17.5,\, 2.34375)$.
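The arithmetic behind this Normal approximation can be reproduced in a few lines. The sketch assumes the rates of Section 3.5 and estimates the slope of B(N) − D(N) at the equilibrium by a central difference; the function name and the step size are illustrative choices.

```python
# Minimal sketch: locally linear Normal approximation around N* = 17.5.

def birth(n):
    """B(N) = 0.3 N - 0.015 N^2 (valid below the cut-off at N = 20)."""
    return 0.3 * n - 0.015 * n ** 2


def death(n):
    """D(N) = 0.02 N + 0.001 N^2."""
    return 0.02 * n + 0.001 * n ** 2


def locally_linear_normal(n_star, h=1e-6):
    """Return (slope, Q, variance) of the approximation at n_star."""
    net_up = birth(n_star + h) - death(n_star + h)
    net_down = birth(n_star - h) - death(n_star - h)
    slope = (net_up - net_down) / (2.0 * h)   # d(B - D)/dN at n_star
    q = birth(n_star) + death(n_star)         # B(N*) + D(N*)
    return slope, q, -q / (2.0 * slope)


if __name__ == "__main__":
    slope, q, variance = locally_linear_normal(17.5)
    print(f"slope = {slope:.4f}, Q = {q:.4f}, variance = {variance:.5f}")
```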
The corresponding modified transition matrix is obtained using the
transition rates in
(3.5.1):
population, N0 . The diagram below shows the quasi-equilibrium
distribution; the
probability distribution at time 100 (obtained by simulation); and
the locally linear Normal
approximation:
[Plot: "The equilibrium probability distribution"; horizontal axis: Population Size; vertical axis: Probability, from 0 to 0.3]
Figure 3.5 The population at equilibrium
From the diagram, one can see that the locally linear approximation
provides a relatively
good fit to both the population’s probability distribution at time
100 and the quasi-
equilibrium distribution. Thus, after a long enough time interval,
the population assumes
an approximately normal distribution.
3.7) Gross Fluctuation Characteristics of a Population
The population’s probability distribution is usually a means to an
end rather than the end
itself since it is almost impossible to estimate for any natural
population. This is due to
both the unreliability of most ecological population data and the
difficulty involved in
setting up ‘replicate’ populations so that various probabilities
may be estimated.
However, gross fluctuation characteristics of a population, such as
the mean, the
variance and the autocovariance function, can usually be observed
over time for a
population. Thus such characteristics prove to be invaluable in
calibrating any
population model.
The gross fluctuation characteristics should describe the
properties of a population
at equilibrium. However, since extinction is an absorbing state,
the equilibrium state of a
population is extinction. The characteristics of such a state are
not very interesting and
so we would rather base the gross fluctuation characteristics on a
population in quasi-
equilibrium. This section is based on the work of Nisbet et al.
(1982). As before, we let
$p_N^*$ be the probability that a population in quasi-equilibrium has
size $N$. We then have:
$$\mu = \sum_{N \ge 1} N\, p_N^*$$ (3.7.1)
$$\sigma^2 = \sum_{N \ge 1} (N - \mu)^2\, p_N^*$$ (3.7.2)
Unfortunately, the above equations cannot be used to calculate the
mean and the
variance as the quasi-equilibrium distribution of a population is
seldom estimable. A
good way to relate the gross fluctuation characteristics to a
measurable quantity is to
equate the above statistical expectations to the corresponding time
averages of a single
population. That is, we assume the population is ergodic (see
Section 3.5 for a
definition of ergodicity). In order for such a procedure to be
valid, the following
conditions for ergodicity must hold:
i. After a suitably long period of time the population should
‘forget’ its initial value.
ii. A population starting from a particular value should, in
principle, be able to reach
any other value.
Nisbet et al. stated that the above conditions are satisfied by all
birth and death models.
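The time averages of a single population (which we denote by
$\langle \cdot \rangle$) are
defined as:
$$\mu_{tim} \equiv \langle N(t) \rangle = \lim_{T \to \infty} \frac{1}{T} \int_0^T N(t)\, dt$$ (3.7.3)
$$\sigma^2_{tim} \equiv \langle (N(t) - \mu_{tim})^2 \rangle = \lim_{T \to \infty} \frac{1}{T} \int_0^T (N(t) - \mu_{tim})^2\, dt$$ (3.7.4)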
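Time averages of this kind can be estimated from one simulated realisation. The sketch below is an illustration under stated assumptions, not the simulation used in this chapter: it runs a standard Gillespie-type simulation of a birth and death process and accumulates $N$ and $N^2$ weighted by the time spent in each state. The rates, the run length, the starting size and the seed are all hypothetical choices.

```python
import random


def birth(n):
    # Hypothetical birth rate B(N), an assumed stand-in for (3.5.1).
    return 0.015 * n * (20 - n)


def death(n):
    # Hypothetical death rate D(N), an assumed stand-in for (3.5.1).
    return 0.001 * n * (20 + n)


def time_averages(n0, t_end, seed=1):
    """Time-weighted mean and variance of one realisation on [0, t_end]."""
    rng = random.Random(seed)
    t, n = 0.0, n0
    sum_n = 0.0    # integral of N(t) dt
    sum_n2 = 0.0   # integral of N(t)^2 dt
    while n > 0:
        b, d = birth(n), death(n)
        dwell = rng.expovariate(b + d)  # exponential holding time in state n
        if t + dwell >= t_end:
            dwell = t_end - t           # truncate the last holding time
            sum_n += n * dwell
            sum_n2 += n * n * dwell
            t = t_end
            break
        sum_n += n * dwell
        sum_n2 += n * n * dwell
        t += dwell
        # The next event is a birth with probability B / (B + D).
        n += 1 if rng.random() < b / (b + d) else -1
    mu = sum_n / t
    return mu, sum_n2 / t - mu * mu


mu_tim, var_tim = time_averages(n0=10, t_end=5000.0)
print(f"time-average mean = {mu_tim:.3f}, variance = {var_tim:.3f}")
```

Weighting by holding times matters here: averaging over events instead of over time would over-represent states in which events happen quickly.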
The time averages, $\mu_{tim}$ and $\sigma^2_{tim}$, should equal the
mean and the variance of the quasi-equilibrium
distribution, as the population should spend most of its
time in the quasi-equilibrium
state. Thus, from equation (3.6.11), we get:
$$\mu_{tim} \approx N^*, \qquad \sigma^2_{tim} \approx -\frac{Q}{2\lambda}$$ (3.7.5)
One can also define the autocovariance function, $C(\tau)$, using time
averages:
$$C(\tau) \equiv \langle (N(t) - \langle N \rangle)(N(t - \tau) - \langle N \rangle) \rangle$$ (3.7.6)
The autocovariance function gives one an indication of the time it
takes for a population
to 'forget' its initial value.
An alternative method of deriving the gross fluctuation
characteristics is to use the
SDE formulation. The stochastic differential equation used to model
the population (see
equation (3.3.5)) was:
$$dN = [B(N) - D(N)]\, dt + \sqrt{B(N) + D(N)}\; d\omega(t)$$ (3.7.7)
Retaining the usage of the functions $f(N)$ and $g(N)$, as defined
in Section 3.6, we
have:
$$dN = f(N)\, dt + \sqrt{g(N)}\; d\omega(t)$$ (3.7.8)
If one were then to approximate the functions $f(N)$ and
$g(N)$ around $N^*$ by
equations (3.6.7) and (3.6.8), one would obtain the following
linear SDE:
$$\frac{dn}{dt} = \lambda n + Q^{1/2}\, \gamma(t)$$ (3.7.9)
where $\gamma(t) = d\omega(t)/dt$ denotes white noise.
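A quick way to get a feel for the linear SDE is to simulate it. The sketch below uses the Euler-Maruyama scheme, which is our choice for illustration and not a method taken from this chapter, with the values $\lambda = -0.28$ and $Q = 1.3125$ quoted for the Section 3.5 example. The step size, run length and seed are arbitrary assumptions. The sample variance of the path should come out near $-Q/(2\lambda) = 2.34375$.

```python
import math
import random

LAM = -0.28   # lambda quoted for the Section 3.5 example
Q = 1.3125    # Q quoted for the Section 3.5 example


def simulate_linear_sde(t_end, dt, seed=1):
    """Euler-Maruyama path of dn = LAM * n * dt + sqrt(Q) * dW, with n(0) = 0."""
    rng = random.Random(seed)
    noise_scale = math.sqrt(Q * dt)  # standard deviation of sqrt(Q) * dW
    n = 0.0
    path = []
    for _ in range(int(t_end / dt)):
        n += LAM * n * dt + noise_scale * rng.gauss(0.0, 1.0)
        path.append(n)
    return path


path = simulate_linear_sde(t_end=5000.0, dt=0.02)
mean_n = sum(path) / len(path)
var_n = sum((x - mean_n) ** 2 for x in path) / len(path)
print(f"sample mean = {mean_n:.3f}, sample variance = {var_n:.3f}")
print(f"-Q / (2 * lambda) = {-Q / (2 * LAM):.5f}")
```

The discretisation introduces a small bias of order `dt` in the stationary variance, so agreement is only expected to within sampling and step-size error.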
One can easily derive the gross fluctuation characteristics for a
linear SDE using
Fourier methods. Consequently, a brief description of Fourier
analysis is given below. (It
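is also advisable to consult Appendix C, as it gives proofs of some
key Fourier
theorems.) For any function $x(t)$, its Fourier transform,
$\tilde{x}(\omega)$, is defined over the interval
$[-T/2,\ T/2]$ as:
$$\tilde{x}(\omega) = \int_{-T/2}^{T/2} x(t)\, e^{-i\omega t}\, dt$$ (3.7.10)
Provided the boundary terms can be neglected, the Fourier transform
of the derivative of $x(t)$ is:
$$\widetilde{\left( \frac{dx}{dt} \right)}(\omega) = i\omega\, \tilde{x}(\omega)$$ (3.7.11)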
It is this result in particular which makes a linear SDE amenable
to Fourier analysis.
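The spectral density, $S_x(\omega)$, of the function $x(t)$ is defined
as:
$$S_x(\omega) \equiv \lim_{T \to \infty} \frac{1}{T} E\left[ |\tilde{x}(\omega)|^2 \right]$$ (3.7.12)
If the population's equilibrium state is stable, the transient,
initial-condition-dependent
term will decay to zero and the persisting term becomes dominant.
Since equation
(3.7.9) is linear, we know by Fourier transforming equation (3.7.9)
(and applying
equation (3.7.11)) that:
$$i\omega\, \tilde{n}(\omega) = \lambda\, \tilde{n}(\omega) + Q^{1/2}\, \tilde{\gamma}(\omega)$$ (3.7.14)
where $\tilde{\gamma}(\omega)$ is the Fourier transform of the white
noise driving equation (3.7.9).
The spectral density of the population is, upon substituting
equation (3.7.14) in
equation (3.7.12), given by:
$$S_n(\omega) = \frac{Q\, S_\gamma(\omega)}{\omega^2 + \lambda^2}$$ (3.7.15)
The above relationships are useful as we know that white noise has
the following
properties:
$$E[\tilde{\gamma}(\omega)] = 0; \qquad E\left[ |\tilde{\gamma}(\omega)|^2 \right] = T; \qquad S_\gamma(\omega) = 1$$ (3.7.16)
By applying some results proved in Appendix C to the population, we
obtain:
$$\langle n(t) \rangle = \lim_{T \to \infty} \frac{1}{T}\, \tilde{n}(0)$$ (3.7.17)
$$\langle n^2(t) \rangle = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_n(\omega)\, d\omega$$ (3.7.18)
By substituting equation (3.7.14) into (3.7.17) and taking
expectations, we have:
$$\mu_{tim} - N^* = \langle n(t) \rangle = 0, \quad \text{so that} \quad \mu_{tim} = N^*$$ (3.7.19)
In addition, by substituting (3.7.15) into (3.7.18) and applying
the spectral property of
white noise (given in (3.7.16)), we have:
$$\sigma^2_{tim} = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{Q}{\omega^2 + \lambda^2}\, d\omega = \frac{Q}{2|\lambda|} = -\frac{Q}{2\lambda}$$ (3.7.20)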
The time averages given above agree with the gross fluctuation
characteristics derived
via the continuous approximation (see equation (3.7.5)). Unlike the
continuous
approximation however, the stochastic differential equation
formulation allows us to
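easily derive the autocovariance function, $C(\tau)$, for the
population. It can be proved
(see Appendix C) that:
$$C(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_n(\omega)\, e^{i\omega\tau}\, d\omega$$ (3.7.22)
$$\rho(\tau) \equiv \frac{C(\tau)}{C(0)} = e^{\lambda\tau}, \qquad \tau \ge 0$$ (3.7.23)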
The functional form of (3.7.23) implies that the population sizes
at two distinct points in
time can never be negatively correlated.
3.7.A) Example (continued)
Using equation (3.7.23), we find that the autocorrelation function
for the example in
Section 3.5 (where $\lambda = -0.28$) is:
[Plot: "Autocorrelation function"; horizontal axis: Time lag; vertical axis from 0 to 1]
Figure 3.6 Correlation between two points a distance τ apart
One can see that the population size in the near future is closely
correlated with the
population size now. This is to be expected, as the time interval is
too small to allow for
any more than a few births and deaths to occur. One can also see that
the population sizes at
times a distance 20 apart are virtually uncorrelated. Thus we would
expect the initial
population size of 10 to have virtually no impact on the population
size at time 20.
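The figures behind these statements are easy to reproduce. The short sketch below evaluates the exponential autocorrelation $e^{\lambda \tau}$ with $\lambda = -0.28$ at a few lags; the choice of lags and the "half-life" summary are ours, added for illustration.

```python
import math

LAM = -0.28  # lambda quoted for the Section 3.5 example


def autocorrelation(tau, lam=LAM):
    """Exponential autocorrelation at lag tau (symmetric in tau)."""
    return math.exp(lam * abs(tau))


for tau in (1, 5, 10, 20):
    print(f"rho({tau}) = {autocorrelation(tau):.4f}")

# Lag at which the correlation has fallen to one half.
half_life = math.log(2) / -LAM
print(f"half-life = {half_life:.3f} time units")
```

At a lag of 20 the correlation is well below one percent, which is the sense in which the initial size has "virtually no impact" by time 20.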
3.8) Extinction
The extinction of a species is of particular interest in all
population studies. Society is
rarely indifferent to the prospect of extinction of a species
(whether it be the rhino… or
smallpox!). Extinction is an absorbing state. The finality of the
extinction state is, in large
part, the justification for any interest in this state. The
probability of a population being
extinct at time $t$ is $p_0(t)$. Since extinction is an absorbing
state, $p_0(t)$ is always a
non-decreasing function of time.
Let $T_{N_0}$ denote the time to extinction when the current population
size is $N_0$. Also let
$f_{N_0}(t)$ be the density function and $F_{N_0}(t)$ the cumulative
distribution function of $T_{N_0}$.
We thus have:
$$F_{N_0}(t) = \Pr(N(t) = 0 \mid N(0) = N_0) = p_0(t) \quad \text{given } N(0) = N_0$$ (3.8.1)
$$f_{N_0}(t) = \dot{p}_0(t) \quad \text{with } N(0) = N_0$$ (3.8.2)
Let $E_{N_0}$ denote the mean time to extinction. Using the functional
form for $\dot{p}_0(t)$ (shown in equation
(3.1.4)), we then have:
$$E_{N_0} = \int_0^\infty t\, \dot{p}_0(t)\, dt = D(1) \int_0^\infty t\, p_1(t)\, dt$$ (3.8.3)
So for the "deaths-only" process in Section 3.2 (where $D(N) = bN$),
we have:
$$E_{N_0} = b \int_0^\infty t\, N_0\, e^{-bt} (1 - e^{-bt})^{N_0 - 1}\, dt = \frac{1}{b} \sum_{N=1}^{N_0} \frac{1}{N}$$ (3.8.4)
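The deaths-only result can be checked numerically. The sketch below compares the harmonic-sum form of the mean extinction time, $(1/b) \sum_{N=1}^{N_0} 1/N$, with a trapezoidal evaluation of $D(1) \int_0^\infty t\, p_1(t)\, dt$. It assumes the standard pure-death result that each of the $N_0$ individuals independently survives to time $t$ with probability $e^{-bt}$, so that $p_1(t)$ is a binomial probability; the values of `b`, `n0` and the integration grid are arbitrary.

```python
import math


def mean_extinction_time_sum(b, n0):
    """Sum of the mean holding times 1 / (b * N) for N = n0, ..., 1."""
    return sum(1.0 / (b * n) for n in range(1, n0 + 1))


def mean_extinction_time_integral(b, n0, t_max=60.0, steps=60000):
    """Trapezoidal estimate of D(1) * integral of t * p_1(t) dt on [0, t_max].

    t_max must be large compared with 1 / b for the tail to be negligible.
    """
    dt = t_max / steps

    def integrand(t):
        survive = math.exp(-b * t)
        # Binomial probability that exactly one of n0 individuals is alive.
        p1 = n0 * survive * (1.0 - survive) ** (n0 - 1)
        return b * t * p1  # D(1) = b for the deaths-only process

    total = 0.5 * (integrand(0.0) + integrand(t_max))
    total += sum(integrand(k * dt) for k in range(1, steps))
    return total * dt


exact = mean_extinction_time_sum(b=0.5, n0=5)
numeric = mean_extinction_time_integral(b=0.5, n0=5)
print(f"harmonic sum = {exact:.6f}, numerical integral = {numeric:.6f}")
```

The harmonic sum grows only logarithmically in $N_0$, so for a deaths-only process a larger starting population buys relatively little extra time.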
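For the example in Section 3.5, one cannot obtain an analytical
expression for $p_1(t)$,
and hence we will only be able to derive the mean time to
extinction numerically using
the above method. Alternatively, one could try to derive an
approximate analytical
expression, starting from:
$$E_{N_0} = \int_0^\infty S(t)\, dt = \int_0^\infty [1 - p_0(t)]\, dt \quad \text{where } S(t) = P[T_{N_0} > t]$$ (3.8.5)
Nisbet et al. (1982) showed that when a population is close to
its quasi-equilibrium
state, then:
$$p_1(t) \approx p_1^*\, [1 - p_0(t)]$$ (3.8.6)
where $p_1^*$ is the quasi-equilibrium probability of a population of
size one.
The Kolmogorov equation for $p_0(t)$ (see equation (3.1.4)) is:
$$\frac{dp_0(t)}{dt} = D(1)\, p_1(t)$$ (3.8.7)
Substituting equation (3.8.6) into equation (3.8.7) gives:
$$\frac{dp_0(t)}{dt} \approx D(1)\, p_1^*\, \{1 - p_0(t)\}$$ (3.8.8)
whose solution, with $p_0(0) = 0$, is:
$$p_0(t) \approx 1 - \exp\{-D(1)\, p_1^*\, t\}$$ (3.8.9)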
The mean time to extinction is therefore approximately:
$$E_{N_0} \approx \int_0^\infty \exp\{-D(1)\, p_1^*\, t\}\, dt = \{D(1)\, p_1^*\}^{-1}$$ (3.8.10)
This expression is independent of the initial population size, $N_0$.
This is because we are
assuming that the population has reached the quasi-equilibrium
state, which is
independent of the initial population size. For the example in
Section 3.5, $D(1) = 0.021$,
and $p_1^*$ was calculated in Section 3.6.A by solving
equation
(3.6.4). Substituting these numbers into equation (3.8.10), the
model estimates the
mean time to extinction to be $6.1 \times 10^{12}$ time units, for any
initial population size.
One can obtain an exact result for the example in Section 3.5 using
the fact that we
are modelling the population as a Markov process. Let $R^+$ be a
modified coefficient
matrix of $R$ (see equation (3.1.5)), obtained by deleting the first
row and the first column
of $R$. ($R^+$ is not quite the same as $R^*$, which was used in
equation (3.6.3), as we do
not make $D(1) = 0$. As such, $R^+$ is the coefficient matrix of the
Kolmogorov equations
amongst the transient states.) Let $M = (m_{ij})$ be the matrix of
so-called mean residence
times. The element $m_{ij}$ is defined as the expected value of the
total elapsed time that a
population, which starts at size $N(0) = i$, will be of size $j$
prior to the population
becoming extinct. Matis et al. (2000) stated that:
$$M = -(R^+)^{-1}$$ (3.8.11)
The mean time to extinction given that $N(0) = N_0$, denoted by
$E_{N_0}$, is:
$$E_{N_0} = \sum_{j} m_{N_0 j}$$ (3.8.12)
For the example in Section 3.5, $R$ is a $21 \times 21$ matrix, as the
population size cannot
increase above 20. The matrix $R^+$ is thus invertible and
consequently, one can derive
the expected time to extinction using equation (3.8.12) for any
initial population size.
The matrix $R^+$ is the same as the matrix shown in Section 3.6.A,
except that instead of
having the top left entry of the matrix, $r_{1,1} = -0.285$, we have
$r_{1,1} = -0.306$. By applying
equations (3.8.11) and (3.8.12) in turn, we thus find, for
$N_0 = 1, 2, \ldots, 20$:
$$E_{N_0} = 10^{12} \times (6.124,\ 6.575,\ 6.612,\ 6.615,\ 6.616,\ 6.616,\ \ldots,\ 6.616)$$
where the value 6.616 applies to every initial size from five to 20.
The mean time to extinction increases as the initial population size
increases. This is fairly
intuitive, as the population has to suffer the loss of an additional
member in order to become extinct. However, the mean time to
extinction, $E_{N_0}$, is
effectively the same for initial population sizes of four and
above. It thus seems that the
fact that the approximate expression for $E_{N_0}$ (equation (3.8.10))
is independent of the
initial population size is not all that unreasonable. Indeed, the
estimated time to
extinction using equation (3.8.10) is not far from any of the true
mean times to
extinction. Unfortunately, equations (3.8.11) and (3.8.12) cannot
be used when the
population sizes are very large, as the resulting matrices are also
large and hence
difficult to manipulate. This is when the approximate analytical
expression derived by
Nisbet et al. (1982) becomes especially useful.
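The comparison between the exact and approximate mean times to extinction can be sketched without forming or inverting any matrix. The code below is an illustration under stated assumptions: it solves the first-step equations $B(i)(E_{i+1} - E_i) + D(i)(E_{i-1} - E_i) = -1$ with $E_0 = 0$ by a backward recursion on the differences $E_i - E_{i-1}$, which is a swapped-in technique and not the matrix method of equations (3.8.11) and (3.8.12). The rates are hypothetical stand-ins for equation (3.5.1), chosen to agree with the values quoted in this chapter ($D(1) = 0.021$, $r_{1,1} = -0.306$, a maximum size of 20).

```python
N_MAX = 20  # assumed maximum population size


def birth(n):
    # Hypothetical birth rate B(N); it vanishes at N = 20.
    return 0.015 * n * (20 - n)


def death(n):
    # Hypothetical death rate D(N); gives D(1) = 0.021.
    return 0.001 * n * (20 + n)


def mean_extinction_times(n_max=N_MAX):
    """Exact mean time to extinction for each initial size 1..n_max."""
    # delta[i] = E_i - E_{i-1}; delta[n_max + 1] is never needed
    # because birth(n_max) = 0.
    delta = [0.0] * (n_max + 2)
    for i in range(n_max, 0, -1):
        delta[i] = (1.0 + birth(i) * delta[i + 1]) / death(i)
    times, running = {}, 0.0
    for i in range(1, n_max + 1):
        running += delta[i]
        times[i] = running
    return times


def quasi_equilibrium_p1(n_max=N_MAX):
    """Quasi-equilibrium probability of size one (modified process, D(1) = 0)."""
    weight, total = 1.0, 1.0
    for n in range(1, n_max):
        weight *= birth(n) / death(n + 1)
        total += weight
    return 1.0 / total


exact = mean_extinction_times()
approx = 1.0 / (death(1) * quasi_equilibrium_p1())  # form of equation (3.8.10)

for n0 in (1, 2, 3, 4, 5, 20):
    print(f"N0 = {n0:2d}: exact mean time = {exact[n0]:.4e}")
print(f"approximation     = {approx:.4e}")
```

Under these assumed rates the recursion gives values close to those listed above, and the approximation coincides with the exact result for an initial size of one. The recursion needs only a single pass over the states, so it remains practical when the matrices would be too large to invert comfortably.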
The Birth-and-Death model is one of the most widely-used stochastic
representations of
a population. In addition to its flexibility, the Birth-and-Death
model is also appealing
due to the simplicity of its underlying principle: the population
number can only change
when a member of the population gives birth or dies. In the
following Chapter, we look
at an alternative method of modelling demographic
stochasticity.