
Guide to SABL 2014a

John Geweke∗†

August, 2014

Abstract

This Guide is an overview of the sequentially adaptive Bayesian learning (SABL)algorithm and a point of departure for users new to the SABL Matlab toolbox.Detailed aspects of the software, for example function and variable descriptions,are self-documenting; analytic treatments, including statements of conditions andtheorems, are provided in working papers and publications. This Guide is linkedto those sources at appropriate points.

∗University of Technology Sydney, Economics Discipline Group; [email protected]. A por-tion of the work was undertaken while the author also held the Theil Chair in Econometrics, ErasmusUniversity. Huaxin Xu and Bin Peng of UTS provided substantial assistance in the development ofSABL, and long discussions with Garland Durham and William J. McCausland are gratefully acknowl-edged. None of these individuals are responsible for errors or other limitations of SABL. The workhere was made possible by generous funding from the Australian Research Council, through grantDP130103356 to UTS, and through grant CE140100049, ARC Centre of Excellence for Mathematicaland Statistical Frontiers (ACEMS), administered through the University of Melbourne.†Copyright (C) 2014 John Geweke. Permission is granted to copy, distribute and/or modify this

document under the terms of the Creative Commons Attribution-ShareAlike 4.0 International PublicLicense. This Licencse is in the file Copyright_license/Copyright_documentation.


Contents

1 Introduction 4

2 The SABL algorithm 4
   2.1 C phase 6
   2.2 S phase 9
   2.3 M phase 10
   2.4 Convergence, the two-pass variant of SABL and accuracy 11
      2.4.1 Convergence and the two-pass variant 12
      2.4.2 Numerical accuracy 13
   2.5 Marginal likelihood 14
   2.6 Optimization 15

3 The SABL toolbox 16
   3.1 Platforms 16
      3.1.1 Software 17
      3.1.2 Hardware 17
   3.2 Layering, stages and passing control 18
   3.3 Global structures 20
   3.4 Using layering and global structures to customize SABL 22
   3.5 Updating with new data 24
   3.6 Two-pass variant of the SABL algorithm 25
   3.7 Using multiple workers and GPUs 26
      3.7.1 Using multiple workers 28
      3.7.2 Using graphics processing units 31
   3.8 Utilities 33

4 Variants of the algorithm in the SABL toolbox 35
   4.1 C phase 35
      4.1.1 Data tempering 35
      4.1.2 Power tempering 35
      4.1.3 Optimization 36
      4.1.4 Effective sample size stopping criterion 36
      4.1.5 Unconditional stopping criterion 36
      4.1.6 Optimization stopping criterion 36
   4.2 S phase 36
      4.2.1 Multinomial resampling 37
      4.2.2 Residual resampling 37
      4.2.3 Systematic resampling 37
      4.2.4 Stratified resampling 37
   4.3 M phase 37
      4.3.1 Metropolis random walk 38
      4.3.2 Blocked Metropolis random walk 38
      4.3.3 Steps stopping rule 38
      4.3.4 RNE stopping rule 39
      4.3.5 Steps and RNE stopping rule 39
      4.3.6 Hybrid stopping rule 39
   4.4 Adding a new variant of the algorithm to the SABL toolbox 39

5 Models in the SABL toolbox 40
   5.1 Prior distributions 40
      5.1.1 General considerations for prior distributions 40
      5.1.2 SABL standard prior distributions 41
      5.1.3 Direct specification of prior distributions 42
   5.2 Models in this edition of the SABL toolbox 43
      5.2.1 Toy normal model 1 ('toynorm1') 43
      5.2.2 Toy normal model 2 ('toynorm2') 43
      5.2.3 The normal model ('normal') 44
      5.2.4 The Poisson model ('poisson') 46
      5.2.5 The negative binomial model ('negative_binomial') 47
      5.2.6 The EGARCH model ('egarch') 49
   5.3 Adding new models to the SABL toolbox 51


1 Introduction

This Guide is an introduction to the Matlab toolbox SABL, an implementation of the sequentially adaptive Bayesian learning algorithm. For brevity we'll call these the SABL toolbox and the SABL algorithm.

The SABL algorithm is a generalization of adaptive posterior simulators described in Durham and Geweke (2015). That work is motivated by the pleasingly parallel structure of sequential Monte Carlo algorithms, explained at the start of Section 2, in conjunction with the power of graphics processing unit (GPU) hardware and software that together provide inexpensive, massively parallel desktop scientific computing; Section 3.1.2 provides more detail. The SABL algorithm builds on a substantial literature in particle filtering, as discussed in Durham and Geweke (2015).

The generalizations incorporated in the SABL toolbox include quite a few variants of the algorithm, and the toolbox readily accommodates the incorporation of more. The variants include the extension of sequential Monte Carlo to optimization problems (Sections 2.6 and 4.1.2), producing algorithms that can also be viewed as extensions of simulated annealing algorithms; see Geweke and Frischknecht (2014) and references there.

The SABL toolbox augments core Matlab functions, as do all Matlab toolboxes, for example the Matlab Statistics and Matlab Parallel Computing Toolboxes. More important, the SABL toolbox exploits the modular structure of the algorithm. Incorporating new variants of the algorithm amounts to providing Matlab (or C) code that respects a simple interface. The same is true of new models, which amount to code for prior densities and likelihood functions in the case of Bayesian inference, and code for objective functions in the case of optimization. SABL is specifically designed to facilitate incorporation of new models by third parties, referred to as modelers in this Guide. SABL is also designed as a vehicle for applied scientific work drawing on models already incorporated in SABL. Going forward we refer to such applications as projects and to those who do this work as users.

SABL source code is open. It is freely available and may be used subject to the terms of the BSD license of the Open Software Initiative that protects it. The terms of this license are provided in the file /Copyright_license/Copyright_software.

2 The SABL algorithm

The SABL algorithm is a procedure for the controlled introduction of new information. It pertains to situations in which information can be represented as the probability distribution of a finite-dimensional vector. SABL approximates this distribution by means of many (typically on the order of 10^4 to 10^6) alternative versions of the vector. These versions are called particles, reflecting some of SABL's roots in the particle filtering literature. In the SABL algorithm particles undergo a sequence of transformations as information is introduced. With minor exceptions accounting for a negligible fraction of computing time in typical research applications, these transformations amount to identical instructions that operate on each particle in isolation. SABL is therefore a pleasingly parallel algorithm. This property is responsible for dramatic decreases in computing time for many research applications with GPU execution of SABL.

At its highest level the SABL algorithm looks like this:

• Represent initial information

• While information not entirely incorporated

—Determine the information increment and incorporate it by weighting particles

—Remove the weights by resampling

—Modify the particles to represent the information more efficiently

• End
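At this level the loop can be sketched in a few lines. The following Python sketch is purely schematic: `sample_prior`, `log_kernel_increment`, and `mutate` are hypothetical placeholders standing in for the three phases described in the remainder of this section, not SABL toolbox functions (the toolbox itself is Matlab).

```python
import numpy as np

def sabl_sketch(sample_prior, log_kernel_increment, mutate,
                n_particles=2**12, rng=None):
    """Schematic SABL loop: C (weight), S (resample), M (mutate).
    All function arguments are hypothetical placeholders, not toolbox APIs."""
    rng = np.random.default_rng() if rng is None else rng
    theta = sample_prior(n_particles, rng)      # represent initial information
    cycle = 0
    while True:
        cycle += 1
        # C phase: determine an information increment; weight the particles
        log_w, done = log_kernel_increment(theta, cycle)
        w = np.exp(log_w - log_w.max())         # stabilized weights
        # S phase: remove the weights by resampling (multinomial, for brevity)
        idx = rng.choice(n_particles, size=n_particles, p=w / w.sum())
        theta = theta[idx]
        # M phase: diversify the resampled particles, leaving the target invariant
        theta = mutate(theta, cycle, rng)
        if done:                                # all information incorporated
            return theta
```

With an identity `mutate` and a single increment this degenerates to one importance-sampling/resampling pass, which is a useful way to test the plumbing of each phase in isolation.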

In the sequential Monte Carlo literature each pass through the loop is known as a cycle, and we will use ℓ to index cycles. The three steps in each cycle are phases. The first step is the correction phase, the second the selection phase, and the third the mutation phase; for short, the C phase, S phase and M phase.

Let θ ∈ Θ ⊆ R^d denote the vector whose probability distribution represents information. The notation reflects the context of Bayesian inference about a parameter vector. We develop the main ideas in this context and subsequently treat optimization as a variant, in Section 2.6. Denote the particles by θ_jn, the double subscripts indicating J groups of N particles each. Initially θ has probability density p^(0)(θ); extension beyond absolutely continuous distributions is easy, and this streamlines the notation. In SABL the particles initially are

θ^(0)_jn ~ iid p^(0)(θ)  (j = 1, . . . , J; n = 1, . . . , N).  (1)

In Bayesian inference p^(0)(θ) is a proper prior density and in optimization it is the probability density of an instrumental distribution (see Section 2.6). It must be practical to sample from the initial distribution (1) and to evaluate p^(0)(θ).

Denote the density incorporating all the information by p*(θ). SABL requires that it be possible to evaluate a kernel k(θ) with the properties

k(θ) ≥ 0 ∀ θ ∈ Θ,  ∫_Θ k(θ) dθ < ∞,  p*(θ) ∝ k*(θ) = p^(0)(θ) k(θ).  (2)

In Bayesian inference the kernel k(θ) is the likelihood function,

k(θ) = p(y_{1:T} | θ),  (3)

where T denotes sample size and y_{1:T} = {y_1, . . . , y_T} denotes the data. In the optimization problem max_{θ∈Θ} h(θ),

k(θ) = exp[r · h(θ)],  (4)

where r > 0 and typically r is large.

Cycle ℓ begins with the kernel k^(ℓ−1) and ends with the kernel k^(ℓ). In the first and last cycles,

k^(0) = 1 and k^(L)(θ) = k(θ),

respectively. Correspondingly define k*^(ℓ)(θ) = p^(0)(θ) k^(ℓ)(θ), implying

k*^(0)(θ) = p^(0)(θ) and k*^(L)(θ) = k*(θ).  (5)

The particles change in each cycle, and reflecting this let θ^(ℓ)_jn denote the particles at the end of cycle ℓ. The initial particles θ^(0)_jn have the common distribution (1) and are independent. In succeeding cycles the particles θ^(ℓ)_jn continue to be identically distributed but they are not independent. The theory underlying SABL, discussed further in this section and developed in detail by Durham and Geweke (2015) drawing on sequential Monte Carlo theory, assures that the final particles θ_jn = θ^(L)_jn converge in distribution to p*(θ). This convergence in distribution takes place in N, the number of particles per group. The result is actually stronger: the particles are ergodic in N, meaning that for any function g for which E[g(θ)] = ∫_Θ g(θ) p*(θ) dθ exists,

lim_{N→∞} N^{−1} ∑_{n=1}^{N} g(θ_jn) = E[g(θ)]  (6)

with probability 1 in all groups j = 1, . . . , J.

A leading technical challenge in practical sequential Monte Carlo algorithms, which of course work with finite N, is to limit the dependence amongst particles, and in particular to keep dependence from increasing from one cycle to the next to the point that the final distribution of particles is an unreliable representation of any distribution. A further technical challenge is to provide a measure of the accuracy of the approximation implicit in the left side of (6) for finite N that is itself reliable. The SABL algorithm and toolbox do both in a way that makes minimal demands on users. The remainder of this section, and Section 3 that follows, provide the details.

2.1 C phase

For each cycle ℓ define the weight function

w^(ℓ)(θ) = k^(ℓ)(θ) / k^(ℓ−1)(θ).

The theory underlying the SABL algorithm requires that there exist an upper bound w̄^(ℓ), that is,

w^(ℓ)(θ) < w̄^(ℓ) < ∞ ∀ θ ∈ Θ.

The C phase determines w^(ℓ)(θ) explicitly and thereby defines

k^(ℓ)(θ) = w^(ℓ)(θ) · k^(ℓ−1)(θ)  (7)


and

p*^(ℓ)(θ) = k*^(ℓ)(θ) / ∫_Θ k*^(ℓ)(θ) dθ.

Correspondingly, define

k*^(ℓ)(θ) = p^(0)(θ) k^(ℓ)(θ)  (8)

and note that (7) implies k*^(ℓ)(θ) = w^(ℓ)(θ) · k*^(ℓ−1)(θ) as well. The weight functions w^(ℓ)(θ) are designed so that there exists L < ∞ for which k^(L)(θ) = k(θ), although the value of L is in general not known at the outset.

One approach is to use the functional form w^(ℓ)(θ) = k(θ)^{Δ_ℓ} and determine a suitable choice of Δ_ℓ > 0. Thus at the end of cycle ℓ, k^(ℓ)(θ) = k(θ)^{r_ℓ} where r_ℓ = ∑_{s=1}^{ℓ} Δ_s. This variant of the C phase is known as power tempering or simply tempering. The term originates in the simulated annealing literature, in which T_ℓ = r_ℓ^{−1} is known as temperature and {T_ℓ} as the cooling schedule. Another approach originates in particle filtering and Bayesian inference: k^(ℓ)(θ) = p(y_{1:t_ℓ} | θ), where 0 < t_1 < . . . < t_L = T (sample size). The increments are therefore w^(ℓ)(θ) = p(y_{t_{ℓ−1}+1:t_ℓ} | y_{1:t_{ℓ−1}}, θ). This variant of the C phase is known as data tempering.

The C phase can be motivated informally by analogy to importance sampling, a Monte Carlo simulation method with at least a 60-year history, interpreting k*^(ℓ−1)(θ) as the kernel of the source density and k*^(ℓ)(θ) as the kernel of the target density. (Recall the definition of k*^(ℓ)(θ) in (8).) If it were the case that the particles θ^(ℓ−1)_jn were independent and had common distribution indicated by the kernel density k*^(ℓ−1)(θ), then

[∑_{j=1}^{J} ∑_{n=1}^{N} w^(ℓ)(θ^(ℓ−1)_jn) g(θ^(ℓ−1)_jn)] / [∑_{j=1}^{J} ∑_{n=1}^{N} w^(ℓ)(θ^(ℓ−1)_jn)]
  → (a.s.)  [∫_Θ k*^(ℓ)(θ) g(θ) dθ] / [∫_Θ k*^(ℓ)(θ) dθ] = ∫_Θ p*^(ℓ)(θ) g(θ) dθ = E^(ℓ)[g(θ)]  (9)

so long as E^(ℓ)[g(θ)] exists. The convergence is in N, the number of particles per group. The core of the argument for importance sampling is

∫_Θ p*^(ℓ)(θ) g(θ) dθ = [∫_Θ w^(ℓ)(θ) k*^(ℓ−1)(θ) g(θ) dθ] / [∫_Θ w^(ℓ)(θ) k*^(ℓ−1)(θ) dθ]
  = [∫_Θ w^(ℓ)(θ) p*^(ℓ−1)(θ) g(θ) dθ] / [∫_Θ w^(ℓ)(θ) p*^(ℓ−1)(θ) dθ].
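The self-normalized estimator on the left side of (9) is straightforward to compute from the particles and their weights; a minimal Python illustration (not toolbox code):

```python
import numpy as np

def weighted_moment(g_vals, w):
    """Self-normalized importance-sampling estimate of E[g(theta)] as in (9):
    the weighted sum of g over all particles divided by the sum of weights."""
    return np.sum(w * g_vals) / np.sum(w)
```

With equal weights this reduces to the plain particle average; as the weights concentrate on a few particles, the estimate is driven by those particles alone, which is exactly the degeneracy the RESS criterion below guards against.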

This result does not apply strictly here, because while the particles θ^(ℓ−1)_jn are identically distributed, they are not independent, and k*^(ℓ−1)(θ) is at best an approximation of the kernel density of the true common distribution of the particles θ^(ℓ−1)_jn so long as N < ∞ (as it must be in practice). But many of the practical concerns in importance sampling carry over. In particular, success lies in w^(ℓ)(θ) being "well-conditioned": loosely speaking, variation in w^(ℓ)(θ_jn) must not be too great. For example, difficulties arise when just a few weights w^(ℓ)(θ_jn) account for most of the sum. In this case the target density kernel k*^(ℓ)(θ) is represented almost entirely by a small number of particles and the approximation of E^(ℓ)[g(θ)] implicit in the left side of (9) is poor.

The C phase directly confronts the key question of how much information to introduce in cycle ℓ: too little, and L will be larger than it need be; too much, and it becomes difficult for the other phases to convert ill-weighted particles from cycle ℓ−1 into particles from cycle ℓ sufficiently independent that the representation of the distribution does not deteriorate from one cycle to the next into a state of gross unreliability. A conventional and effective way to monitor the quality of the weight function is by means of the relative effective sample size

RESS^(ℓ) = ESS^(ℓ) / (JN) = [∑_{j=1}^{J} ∑_{n=1}^{N} w^(ℓ)(θ^(ℓ−1)_jn)]^2 / [JN ∑_{j=1}^{J} ∑_{n=1}^{N} w^(ℓ)(θ^(ℓ−1)_jn)^2].  (10)

The effective sample size ESS^(ℓ) is an adjustment to the sample size (the number of particles, JN) that accounts for lack of balance in the weights, and relative effective sample size is its ratio to sample size. Notice that if all weights are the same then ESS^(ℓ) = JN and RESS^(ℓ) = 1, whereas if only one weight is positive then ESS^(ℓ) = 1 and RESS^(ℓ) = 1/JN.

In general RESS^(ℓ) is lower the more information is introduced in the C phase. This is always true for power tempering and as a practical matter is nearly always the case for data tempering. It suggests a strategy of introducing information only up to the point where RESS^(ℓ) has just dropped, or would drop, below a target value. The target RESS* = 0.5 is usually reasonable, and it is the default value in the SABL toolbox. Practical experience shows that somewhat higher RESS* leads to more cycles but faster execution in the M phase, and lower RESS* to fewer cycles but slower M phase execution; as a result there is not much difference in execution time for RESS* over the interval (0.1, 0.9).

For data tempering this suggests initializing w^(ℓ)(θ) = 1, followed by iterations s = 1, 2, . . . . Iteration s introduces y_{t_{ℓ−1}+s}, updates

w^(ℓ)(θ^(ℓ−1)_jn) = w^(ℓ)(θ^(ℓ−1)_jn) · p(y_{t_{ℓ−1}+s} | y_{1:t_{ℓ−1}+s−1}, θ^(ℓ−1)_jn),

and computes the corresponding RESS^(ℓ). Iterations terminate the first time RESS^(ℓ) < RESS*. This procedure has been well established in the sequential Monte Carlo particle filtering literature for years.

Such strategies have not been employed previously for power tempering. The first instance appears to be Geweke and Frischknecht (2014). Substituting w^(ℓ)(θ) = k(θ)^{Δ_ℓ} in (10),

RESS^(ℓ) = [∑_{j=1}^{J} ∑_{n=1}^{N} k(θ^(ℓ−1)_jn)^{Δ_ℓ}]^2 / [JN ∑_{j=1}^{J} ∑_{n=1}^{N} k(θ^(ℓ−1)_jn)^{2Δ_ℓ}].  (11)


Setting RESS^(ℓ) = RESS* produces a nonlinear equation in the single variable Δ_ℓ that has a unique and easily computed solution so long as RESS* ∈ (0, 1). If the solution implies r_ℓ > 1 then Δ_ℓ = 1 − r_{ℓ−1} instead, and the cycle ℓ = L is the last one.
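Sketched in Python (illustrative only, not the Matlab toolbox code), the RESS computation of (11) and the solution of RESS(Δ_ℓ) = RESS* look like this; bisection suffices because RESS falls as Δ_ℓ rises:

```python
import numpy as np

def ress_power(log_k, delta):
    """RESS of eq. (11) for the power-tempering increment w = k(theta)**delta.
    log_k: array of log k(theta_jn) values, shape (J, N)."""
    jn = log_k.size
    lw = delta * (log_k - log_k.max())      # stabilize before exponentiating
    w = np.exp(lw)
    return w.sum() ** 2 / (jn * (w ** 2).sum())

def solve_delta(log_k, ress_star=0.5, tol=1e-10):
    """Bisect for the Delta in (0, 1] with RESS(Delta) = RESS*, RESS* in (0, 1).
    If RESS stays above the target even at Delta = 1, return 1; the caller then
    caps r_l at 1 as described in the text (final cycle)."""
    if ress_power(log_k, 1.0) >= ress_star:
        return 1.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ress_power(log_k, mid) > ress_star:
            lo = mid                        # too little information: raise Delta
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Working with log k(θ) and subtracting the maximum before exponentiating avoids overflow when the kernel values span many orders of magnitude, which is the usual situation in practice.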

2.2 S phase

The rest of cycle ℓ starts with the weighted particles θ^(ℓ−1)_jn and produces unweighted particles θ^(ℓ)_jn that meet or exceed a mixing condition, a measure of lack of dependence described in the next section. The S phase begins this process, removing weights by means of resampling. The principle behind resampling is to regard the weight function as proportional to a discrete probability function defined over the particles and to draw from this distribution with replacement; hence the name selection phase. SABL performs this operation on each group of particles separately: particles are always selected within groups and never across groups. This independence between the groups j = 1, . . . , J is essential in (1) proving the convergence of the algorithm, (2) assessing the mixing condition in the M phase, and (3) providing a numerical standard error for the approximation as discussed in Section 2.4.2. Resampling produces unweighted particles denoted θ^(ℓ,0)_jn.

The most elementary resampling method is to make N independent and identically distributed draws from the multinomial distribution with argument N and probabilities

p_jn = w^(ℓ)(θ^(ℓ−1)_jn) / ∑_{i=1}^{N} w^(ℓ)(θ^(ℓ−1)_ji)  (n = 1, . . . , N).

This method is known as multinomial resampling. An alternative method, known as residual resampling, is to compute the same probabilities and collect an initial subsample of size N* ≤ N consisting of [N · p_jn] copies of each particle θ_jn, where [·] is standard notation for what is variously known as the greatest whole integer, greatest integer not greater than, or floor function. Then draw the remaining N − N* particles by means of multinomial resampling with probabilities p*_jn ∝ N p_jn − [N · p_jn]. Residual resampling results in lower dependence amongst the particles θ^(ℓ,0)_jn (n = 1, . . . , N) than does multinomial resampling. For both methods there are central limit theorems that are essential to demonstrating convergence and interpreting numerical standard errors. There are other resampling methods that lead to even less dependence amongst the particles, but for these methods central limit theorems do not apply. These methods are all described in Douc et al. (2005).

The S phase is a simple but key part of the SABL algorithm. Resampling is also a key part of evolutionary (or genetic) algorithms, where it plays much the same role. The particles θ^(ℓ,0)_jn are for this reason sometimes called the children of the parent particles θ^(ℓ−1)_jn; the terminology also emphasizes that for each child θ^(ℓ,0)_jn there is a parent θ^(ℓ−1)_jn′. Parents with larger weights are likely to have more children; it is not hard to work out the exact distribution for any one parent for multinomial resampling, and then again for residual resampling. With both, the expected number of children, or fertility, of the parent θ^(ℓ−1)_jn is proportional to w^(ℓ)(θ^(ℓ−1)_jn), a measure of the parent's "success" in the environment of the information introduced in cycle ℓ.
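The two resampling schemes can be sketched as follows (illustrative Python operating on one group's weight vector, not toolbox code):

```python
import numpy as np

def multinomial_resample(w, rng):
    """N iid draws of particle indices with probabilities p_n proportional to w_n."""
    n = w.size
    return rng.choice(n, size=n, p=w / w.sum())

def residual_resample(w, rng):
    """Deterministically copy floor(N * p_n) of each particle, then fill the
    remaining N - N* slots by multinomial draws on the residual probabilities."""
    n = w.size
    p = w / w.sum()
    copies = np.floor(n * p).astype(int)     # [N * p_n] guaranteed copies
    idx = np.repeat(np.arange(n), copies)
    resid = n * p - copies                   # N * p_n - [N * p_n]
    n_rest = n - copies.sum()
    if n_rest > 0:
        idx = np.concatenate([idx, rng.choice(n, size=n_rest, p=resid / resid.sum())])
    return idx
```

The deterministic copies in residual resampling remove most of the selection noise, which is the source of its lower dependence amongst children relative to multinomial resampling.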

2.3 M phase

If the algorithm were to continue in this way, the number of unique children would never increase and in general would decrease from cycle to cycle. Indeed, in the context of Bayesian inference it can be shown under mild regularity conditions that the number of unique children converges almost surely to 1 as the number of observations increases. The same can be demonstrated in the context of optimization for a sufficiently large value of r in (4).

The M phase addresses this problem by creating diversity amongst sibling particles in a way that is faithful to the information kernel k*^(ℓ)(θ). It does so using the same principle of invariance that is central to Markov chain Monte Carlo (MCMC) algorithms, drawing particles from a transition density dQ^(ℓ)(θ | θ*) with the property

∫_Θ k*^(ℓ)(θ*) dQ^(ℓ)(θ | θ*) dθ* = k*^(ℓ)(θ) ∀ θ ∈ Θ.  (12)

The transition density dQ^(ℓ) is invariant with respect to the kernel k*^(ℓ)(θ), which preserves the original distribution of the children but introduces the prospect that they will be different. Notice that (12) implies that the successive application, or convolution, of a series of invariant transitions defines a transition that is itself invariant. The universe of invariant transition densities is large and manifest in the MCMC literature. Many of these transitions are model-specific, for example Gibbs sampling variants of MCMC. On the other hand, a number of families of Metropolis-Hastings transitions apply quite generally and, with problem-specific tuning of parameters, can be computationally efficient.

The current edition of the SABL toolbox incorporates one of these variants, the Metropolis Gaussian random walk, and the structure of SABL accommodates incorporation of others in the future. The M phase applies the Metropolis random walk repeatedly in steps s = 1, 2, . . ., each step generating a new set of particles θ^(ℓ,s)_jn from the previous set θ^(ℓ,s−1)_jn. Following the familiar arithmetic, candidate new particles are generated

θ*^(ℓ,s)_jn ~ N(θ^(ℓ,s−1)_jn, Σ^(ℓ,s−1))

and accepted with probability

min{ k*^(ℓ)(θ*^(ℓ,s)_jn) / k*^(ℓ)(θ^(ℓ,s−1)_jn), 1 }.

In SABL Σ^(ℓ,s) is proportional to the sample variance of the θ^(ℓ,s−1)_jn computed using all the particles. The factor of proportionality increases when the rate of candidate acceptance in the previous step exceeds a specified threshold and is decreased otherwise. This draws on established practice in MCMC and works well in this context. Section 4.3 provides more detail about this threshold, as well as the initial value and increments of the scaling factor.

In some applications, especially those with a long parameter vector θ, the multivariate normal distribution is a sufficiently poor approximation of the local behavior of k*^(ℓ)(θ) that the Metropolis Gaussian random walk can be quite inefficient. A straightforward way to address this contingency is the blocked Metropolis Gaussian random walk variant of the M phase, included in the current edition of SABL. In this variant θ is partitioned into subvectors and the Gaussian random walk Metropolis algorithm is applied to the subvectors in turn. Section 4.3.2 provides more detail.

The objective of the M phase is to attain a degree of independence of the particles θ^(ℓ)_jn at the end of each cycle sufficient to render the final set of particles θ_jn = θ^(L)_jn a reliable representation of the distribution implied by the probability density function p*(θ). The idea behind M phase termination in SABL is to measure the degree of mixing (lack of dependence) amongst the particles at the end of each Metropolis step s of cycle ℓ, and to terminate when this measure meets or exceeds a certain threshold.

In SABL mixing is measured by the average relative numerical efficiency (RNE) of a group of functions chosen specifically for this purpose in each model. The RNE of the SABL approximation of a posterior moment E[g(θ)] = ∫_Θ g(θ) p*(θ) dθ is a measure of its numerical accuracy relative to that achieved by a hypothetical simulation θ_i ~ iid p*(θ). Section 2.4.2 explains how this measure is constructed.

A simple stopping rule for the M phase is to terminate the iterations of the Metropolis random walk when the average RNE of a group of functions first exceeds a stated threshold. In any application there are practical limits to the average RNE that can be achieved through these iterations, and so it is also desirable to impose a limit on their number. Achieving greater independence of particles is especially important in the last cycle, because at the end of the M phase in that cycle the particles constitute the representation of p*(θ). There are quite a few options for M phase termination, detailed in Section 4.3. The SABL toolbox core default criterion is average RNE 0.4 with a maximum of 100 iterations in cycles 1, . . . , L − 1, and average RNE 0.9 with a maximum of 300 iterations in the final cycle L.

Mixing thoroughly is not the objective of the M phase. In MCMC that is essential in providing a workable representation of the distribution with kernel k*(θ). In SABL the C and S phases take on this important task, whereas the function of the M phase is to place a lower bound on the dependence amongst particles. Section 4.3 introduces some elaborations of this stopping criterion as options in the SABL toolbox.
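A single step of the M phase Metropolis Gaussian random walk can be sketched as follows (illustrative Python; the blocked variant, the toolbox's scale adaptation rule, and RNE-based termination are omitted):

```python
import numpy as np

def m_phase_step(theta, log_kstar, scale, rng):
    """One Metropolis Gaussian random walk step applied to all particles at once.
    theta: (JN, d) particle array; log_kstar: vectorized log k*(theta);
    scale: factor multiplying the sample variance of the particles."""
    cov = scale * np.cov(theta, rowvar=False)       # Sigma from all particles
    prop = theta + rng.multivariate_normal(
        np.zeros(theta.shape[1]), cov, theta.shape[0])
    log_alpha = log_kstar(prop) - log_kstar(theta)  # log acceptance ratio
    accept = np.log(rng.random(theta.shape[0])) < log_alpha
    theta = np.where(accept[:, None], prop, theta)
    return theta, accept.mean()                     # rate guides scale adaptation
```

The returned acceptance rate is what the adaptation described above acts on: raise `scale` when the rate exceeds the threshold, lower it otherwise, and repeat the step until the mixing criterion is met.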

2.4 Convergence, the two-pass variant of SABL and accuracy

Durham and Geweke (2015) shows that bounded likelihood,

max_{θ∈Θ} p(y_{1:T} | θ) < ∞,  (13)

and existence of the prior moment,

∫_Θ |g(θ)| p^(0)(θ) dθ < ∞,  (14)

respectively are sufficient for the essential condition (6). (Weaker conditions exist but are more difficult to verify: see Durham and Geweke (2015) and references cited there.) In all posterior simulators the assessment of numerical accuracy is based on a central limit theorem, which in this context takes the form

N^{1/2} (ḡ^(J,N) − ḡ) → (d) N(0, σ²_g)  (15)

where

ḡ = ∫_Θ g(θ) p*(θ) dθ and ḡ^(J,N) = N^{−1} ∑_{n=1}^{N} g(θ_jn).

By itself (15) is not enough: it is essential to compute or approximate σ²_g as well. Section 2.4.2 explains how SABL does this.
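Section 2.4.2 gives SABL's construction. Purely as an illustration of how the J independent groups of Section 2.2 can yield both a numerical standard error and an RNE, the following Python sketch assumes the cross-group estimator (an assumption for exposition, not necessarily the toolbox's exact formula): estimate the numerical variance of the grand mean from the dispersion of the J independent group means, and compare it with the variance an iid sample of the same size would imply.

```python
import numpy as np

def nse_and_rne(g):
    """g: (J, N) array of values g(theta_jn). Returns a numerical standard
    error for the overall mean, formed from the J independent group means,
    and the implied RNE relative to hypothetical iid sampling.
    A sketch assuming the cross-group estimator; see Section 2.4.2."""
    J, N = g.shape
    group_means = g.mean(axis=1)
    nse2 = group_means.var(ddof=1) / J       # variance of the grand mean
    iid_var = g.var(ddof=1) / g.size         # variance an iid sample implies
    return np.sqrt(nse2), iid_var / nse2     # RNE = iid variance / actual
```

For genuinely iid particles the RNE is close to 1; dependence within groups inflates the cross-group variance and pushes the RNE below 1, which is the quantity the M phase stopping rules monitor.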

2.4.1 Convergence and the two-pass variant

The theory developed in the sequential Monte Carlo literature provides a start. It posits a fixed pre-specified sequence of kernels k^{(1)}, ..., k^{(L)} (see (7)) and a fixed pre-specified sequence of M phase transition densities dQ^{(ℓ)} (see (12)), together with side conditions (implied by conditions (13) and (14)), and proves (15). But in any practical application the kernels k^{(ℓ)} and transition densities dQ^{(ℓ)} are adaptive, relying on information in the particles θ^{(ℓ−1)}_{jn} or θ^{(ℓ,s−1)}_{jn}, rather than fixed. The theory does not apply then, because the kernels and transitions depend on the random particles, and the structure of this dependence is so complex as to preclude extension of the existing theory to this case, especially for the transition kernels dQ^{(ℓ)}. Thus this literature provides a theory of sequential Bayesian learning but not a theory of sequentially adaptive Bayesian learning. It is universally recognized that some form of adaptation is required, for it is impossible to pre-specify kernels k^{(ℓ)} and transition densities dQ^{(ℓ)} that provide reliable approximations in tolerable time without knowing a great deal about the posterior distribution, which, of course, is the goal and not the starting point.

Durham and Geweke (2015) deals with this issue by creating the two-pass variant of the algorithm. The first pass is exactly as described in this section, with the addition that the kernels k^{(ℓ)} and transitions dQ^{(ℓ)} are saved. For the specific variants described in Sections 2.1 and 2.3, this amounts to saving the sequence {r_ℓ} or {t_ℓ} from the C phase and the doubly-indexed sequence of variance matrices Σ^{(ℓ,s−1)} from the M phase, but the idea generalizes to other variants of the C and M phases. The second pass re-executes the algorithm (with different seeds for the random number generator) and uses the kernels k^{(ℓ)} and transitions dQ^{(ℓ)} computed in the first pass, skipping the work required to compute these objects from the particles. The theory developed in the sequential Monte Carlo literature then applies directly to the second pass, because the kernels k^{(ℓ)} and transitions dQ^{(ℓ)} are in fact fixed in the second pass. The role of the first pass is to provide the knowledge of the posterior distribution required for sensible pre-specification of these objects.

Experience thus far is that substantial differences between the first and second passes do not arise, and can only be made to do so by specifying imprudently small values of N. Thus in practice it suffices to use the two-pass algorithm only occasionally: perhaps at the inception of a research project, when the general character of the model(s), data and sample size are known, and then again prior to communicating findings.
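To make the two-pass idea concrete, the following Python fragment is a heavily simplified sketch (the toolbox itself is Matlab): the conjugate toy model (ten hypothetical observations averaging 1 with unit variance), the RESS target of 0.5, the particle count, and the omission of the M phase are all assumptions made for this example. The first pass adapts the power increments from the particles and records them; the second pass re-runs with the recorded schedule held fixed and a different seed:

```python
import math
import random

def likelihood(theta):
    # Unnormalized likelihood: 10 hypothetical obs with mean 1, variance 1.
    return math.exp(-5.0 * (1.0 - theta) ** 2)

def ress(w):
    # Relative effective sample size of a weight vector.
    s = sum(w)
    return s * s / (len(w) * sum(x * x for x in w))

def run_pass(n, schedule=None, seed=0):
    """One pass of power tempering. If schedule is None, adapt the power
    increments to hit RESS* = 0.5 (first pass) and return them; otherwise
    replay the fixed, pre-recorded schedule (second pass)."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n)]   # prior N(0,1)
    r, used = 0.0, []
    while r < 1.0:
        lv = [likelihood(th) for th in particles]
        if schedule is None:
            lo, hi = 0.0, 1.0 - r          # bisect for RESS(dr) ~= 0.5
            for _ in range(40):
                mid = 0.5 * (lo + hi)
                if ress([v ** mid for v in lv]) > 0.5:
                    lo = mid
                else:
                    hi = mid
            dr = min(hi, 1.0 - r)
        else:
            dr = schedule[len(used)]       # fixed increment from pass one
        used.append(dr)
        w = [v ** dr for v in lv]
        particles = rng.choices(particles, weights=w, k=n)  # S phase
        r += dr
    return used, sum(particles) / n

schedule, mean1 = run_pass(20_000, seed=1)               # adaptive first pass
_, mean2 = run_pass(20_000, schedule=schedule, seed=2)   # fixed second pass
```

In this conjugate setting the exact posterior mean is 10/11, so the agreement of the two passes can be checked directly; the real first pass saves the M phase transition settings as well.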

2.4.2 Numerical accuracy

The sequential Monte Carlo literature provides abstract expressions for σ²_g in (15) but no means of evaluating or approximating σ²_g. SABL provides the approximation using the particle groups. Consider the second pass of the two-pass algorithm, where the convergence theory fully applies. In this setting there is no dependence of particles across groups. The M phase and the C phase are perfectly parallel: exactly the same operations are applied to all the particles, with no communication between particles. Resampling in the S phase, which introduces dependence amongst particles, takes place entirely within groups so as not to compromise independence across groups. Therefore the approximations \bar{g}_{jN} = N^{-1} \sum_{n=1}^{N} g(\theta_{jn}) of \bar{g} = E[g(\theta)] are independent across the groups j = 1, ..., J. A central limit theorem (15) applies within each group. Computing the cross-group mean \bar{g}_{J,N} = J^{-1} \sum_{j=1}^{J} \bar{g}_{jN}, a conventional estimate of σ²_g in (15) is

\hat{\sigma}_g^2 = N (J-1)^{-1} \sum_{j=1}^{J} \left( \bar{g}_{jN} - \bar{g}_{J,N} \right)^2 \qquad (16)

and

(J-1) \, \hat{\sigma}_g^2 / \sigma_g^2 \xrightarrow{d} \chi^2(J-1), \qquad (17)
the convergence in (17) being in the number of particles per group N. In the limit N → ∞, \bar{g}_{J,N} and \hat{\sigma}_g^2 are independent.

The corresponding numerical variance estimate for \bar{g}_{J,N} is

\hat{\sigma}_{g,JN}^2 = (JN)^{-1} \hat{\sigma}_g^2 \qquad (18)

and the numerical standard error is \hat{\sigma}_{g,JN} = \left( \hat{\sigma}_{g,JN}^2 \right)^{1/2}; this is the measure of accuracy used in SABL. It should not be confused with the approximation of the posterior variance,

\widehat{\mathrm{var}}(g) = (JN)^{-1} \sum_{j=1}^{J} \sum_{n=1}^{N} \left[ g(\theta_{jn}) - \bar{g}_{J,N} \right]^2.

From (17) the formal interpretation of the numerical standard error is

\left( \bar{g}_{J,N} - \bar{g} \right) / \hat{\sigma}_{g,JN} \xrightarrow{d} t(J-1).

If particles within groups are independent then \hat{\sigma}_g^2 ≈ \widehat{\mathrm{var}}(g), whereas if they are not then usually \hat{\sigma}_g^2 > \widehat{\mathrm{var}}(g), although \hat{\sigma}_g^2 < \widehat{\mathrm{var}}(g) may occur and is more likely with smaller numbers of particle groups J. The relative numerical efficiency of the approximation \bar{g}_{J,N} is

\mathrm{RNE}_g = \widehat{\mathrm{var}}(g) / \hat{\sigma}_g^2. \qquad (19)

A useful interpretation of (19) is that a hypothetical simulator with θ_{jn} iid ∼ p∗(θ) would achieve the same accuracy with RNE_g · JN particles.

This argument does not apply directly in the first pass because of the adaptation. In particular, recall that RNE is used in the M phase to assess mixing and determine the end of the sequence of iterations of the Metropolis random walk. This is an example of the complex feedback between particles and adaptation in the algorithm that has frustrated central limit theorems. This shortfall in theory is likely to persist. The two-pass procedure overcomes the problem and, moreover, provides the foundation for future variants of the algorithm without the overhead of establishing convergence for each variant.

2.5 Marginal likelihood

The SABL algorithm is particularly well suited to providing a numerical approximation of the marginal likelihood

\mathrm{ML} = \int_{\Theta} p^{(0)}(\theta) \, p(y_{1:T} \mid \theta) \, d\theta = \int_{\Theta} k^{*}(\theta) \, d\theta. \qquad (20)

The marginal likelihood, also called the marginal data density, is central in the theory and practice of Bayesian model comparison, as well as in Bayesian model averaging for combining models and decision-making. It has posed a particularly difficult technical problem that has seen checkered resolution in the posterior simulation literature as well as in practice: depending on the combination of posterior simulation method and model, approximation of ML can range from easy to impossible, and reliably assessing the accuracy of the approximation poses further issues that are again specific to the situation.

The SABL algorithm produces approximations of ML (more precisely, of log ML, as is standard) as a by-product of the C phase. Here we present the ideas behind the method without going into full detail, which requires considerable additional notation. Details are in Section 4 of Durham and Geweke (2015) and are reflected in the SABL
toolbox code. From (5) and (7),

\int_{\Theta} k^{*}(\theta) \, d\theta = \frac{\int_{\Theta} k^{*(L)}(\theta) \, d\theta}{\int_{\Theta} k^{*(0)}(\theta) \, d\theta} = \prod_{\ell=1}^{L} \frac{\int_{\Theta} k^{*(\ell)}(\theta) \, d\theta}{\int_{\Theta} k^{*(\ell-1)}(\theta) \, d\theta} = \prod_{\ell=1}^{L} \frac{\int_{\Theta} w^{(\ell)}(\theta) \, k^{*(\ell-1)}(\theta) \, d\theta}{\int_{\Theta} k^{*(\ell-1)}(\theta) \, d\theta} = \prod_{\ell=1}^{L} \int_{\Theta} w^{(\ell)}(\theta) \, p^{(\ell-1)}(\theta) \, d\theta. \qquad (21)

In the C phase of cycle ℓ, as N → ∞,

\bar{w}_{\ell,J,N} = (JN)^{-1} \sum_{j=1}^{J} \sum_{n=1}^{N} w^{(\ell)}\left(\theta_{jn}^{(\ell-1)}\right) \xrightarrow{a.s.} \int_{\Theta} w^{(\ell)}(\theta) \, p^{(\ell-1)}(\theta) \, d\theta.

Hence from (21),

\prod_{\ell=1}^{L} \bar{w}_{\ell,J,N} \xrightarrow{a.s.} \int_{\Theta} k^{*}(\theta) \, d\theta,

where the convergence is again in the number of particles per group N. This is the marginal likelihood (20) in a Bayesian inference context. Durham and Geweke (2015) discusses the approximation of log(ML) and computing the numerical standard error for that approximation.
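To illustrate (21), the following Python sketch (illustrative assumptions throughout: a one-observation conjugate normal model, a hypothetical power schedule {r_ℓ}, multinomial resampling, and no M phase) accumulates log ML as the sum of the logs of the mean C phase weights, which can be checked against the closed-form marginal likelihood of the toy model:

```python
import math
import random

random.seed(1)

def likelihood(theta, y=1.0):
    # p(y | theta) with y ~ N(theta, 1); a single observation y = 1.
    return math.exp(-0.5 * (y - theta) ** 2) / math.sqrt(2.0 * math.pi)

# Prior p0 = N(0,1); the exact marginal likelihood is then the N(0,2)
# density evaluated at y = 1.
true_log_ml = -0.5 * math.log(4.0 * math.pi) - 0.25

N = 100_000
particles = [random.gauss(0.0, 1.0) for _ in range(N)]   # draws from p0
powers = [0.0, 0.3, 0.7, 1.0]                            # hypothetical {r_l}

log_ml = 0.0
for l in range(1, len(powers)):
    dr = powers[l] - powers[l - 1]
    w = [likelihood(th) ** dr for th in particles]       # C phase weights
    log_ml += math.log(sum(w) / N)                       # log of w_bar_l
    particles = random.choices(particles, weights=w, k=N)  # S phase resample
```

The product of mean weights telescopes exactly as in (21), so log_ml converges to the true log marginal likelihood as N grows.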

2.6 Optimization

Return now to the optimization problem, determining θ∗ = arg max_{θ∈Θ} h(θ). As discussed in Section 2, SABL approaches this problem using kernels of the form (4), in a manner analogous to the likelihood function p(y_{1:T} | θ) in Bayesian inference. There continues to be an initial density p^{(0)}(θ), and the corresponding distribution is sometimes called the instrumental distribution in this context. This might or might not be intended or interpreted as the expression of prior beliefs about the solution of the optimization problem. The density p^{(0)}(θ) is a technical device providing an initial condition for the algorithm and requires only that p^{(0)}(θ) > 0 ∀ θ ∈ Θ.

If h(θ) is bounded above on Θ (the analogue of (13)) then SABL produces particles θ_{ij} whose distribution has kernel density p^{(0)}(θ) exp[r · h(θ)]. If h has a unique global mode θ∗ then, under weak side conditions stated in Geweke and Frischknecht (2014), θ →p θ∗ as r → ∞. In practice one does not know the values of r required for satisfactory results. Recall that in the power tempering variant of the C phase the sequence {r_ℓ} is derived from relative effective sample size (11), a function of the particles θ^{(ℓ−1)}_{ij}. Replace the termination criterion r^{(L)} = 1 used for Bayesian inference with one that defines a suitably close approximation of θ∗.

An example of such a termination criterion is an upper bound for the range of values of h(θ_{ij}) over all JN particles. (As discussed in Section 4.1, there are other convergence criteria as well, and it is easy for users to provide a customized convergence criterion.) What constitutes a reasonable threshold is problem-specific but usually clear. If h is denominated in dollars then a range of one dollar is likely more than adequate. For maximum likelihood the kernels are p(y_{1:T} | θ), equivalently h(θ) = log p(y_{1:T} | θ), and a range of h(θ_{ij}) somewhere between 10^{−2} and 10^{−8} is likely adequate.

It is typical to see a steady increase in power with each successive cycle. In fact, for twice continuously differentiable objective functions h(θ) it can be shown that the rate of change (r_ℓ − r_{ℓ−1})/r_{ℓ−1} comes to depend only on the number of elements in θ. If the C phase power tempering criterion is RESS∗ = 0.5, then for θ with 3 elements it is 1.153, with 10 elements 0.56, with 20 elements 0.349, and with 50 elements 0.198. Values are lower for higher RESS∗ and vice versa. Observed rates are often very close to the theoretical values in most cycles. Eventually, however, this breaks down, because the particles θ_{ij} are very close and differences amongst the h(θ_{ij}), which are critical in evaluating RESS, become dominated by rounding error arising from the finite number of bits in floating point arithmetic. At this point the implementation of the algorithm is no longer faithful to its analytical properties, and to continue is to produce noise. As with any other convergence algorithm, it is productive to select the convergence criterion explicitly and thoughtfully.
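The optimization variant just described can be caricatured in Python (again illustrative: the quadratic objective, the instrumental N(0, 3²), RESS∗ = 0.5, the range tolerance, and the simple Metropolis M phase are all invented for this sketch and do not reproduce the toolbox's phases):

```python
import math
import random

random.seed(3)

def h(theta):
    # Toy objective with unique global mode at theta* = 2.
    return -(theta - 2.0) ** 2

def ress(w):
    # Relative effective sample size of a weight vector.
    s = sum(w)
    return s * s / (len(w) * sum(x * x for x in w))

def log_p0(theta):
    # Log instrumental density N(0, 3^2), up to an additive constant.
    return -0.5 * (theta / 3.0) ** 2

N, RESS_STAR, H_RANGE_TOL = 1000, 0.5, 1e-6
particles = [random.gauss(0.0, 3.0) for _ in range(N)]
r = 0.0
for _ in range(100):
    hv = [h(th) for th in particles]
    if max(hv) - min(hv) < H_RANGE_TOL:      # range-of-h termination criterion
        break
    hmax = max(hv)

    def ress_at(dr):
        return ress([math.exp(dr * (v - hmax)) for v in hv])

    # C phase: find the power increment dr with RESS(dr) close to RESS*.
    lo, hi = 0.0, 1.0
    while ress_at(hi) > RESS_STAR and hi < 1e12:
        hi *= 2.0
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if ress_at(mid) > RESS_STAR:
            lo = mid
        else:
            hi = mid
    dr = hi
    r += dr

    # S phase: multinomial resampling.
    w = [math.exp(dr * (v - hmax)) for v in hv]
    particles = random.choices(particles, weights=w, k=N)

    # M phase: a few Metropolis random-walk steps targeting p0 * exp(r h).
    m1 = sum(particles) / N
    m2 = sum(th * th for th in particles) / N
    scale = math.sqrt(max(m2 - m1 * m1, 1e-24))
    for _ in range(3):
        for i, th in enumerate(particles):
            prop = th + scale * random.gauss(0.0, 1.0)
            delta = (log_p0(prop) + r * h(prop)) - (log_p0(th) + r * h(th))
            if delta >= 0 or random.random() < math.exp(delta):
                particles[i] = prop

theta_star = sum(particles) / N
```

The power r grows geometrically across cycles, as the text describes, until the range of h over the particles falls below the threshold; the particle mean then approximates the maximizer.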

3 The SABL toolbox

This section documents the SABL toolbox at its highest level. It describes the organization and principles behind SABL that make it easy to work in the SABL toolbox, either as a model developer or as a user carrying out applied work using one of the models included in the toolbox. There are three additional kinds of documentation available at more detailed and technical levels: (1) examples in the projects directory of the SABL toolbox; (2) the usual Matlab help command, which applies to functions in the SABL toolbox just as it applies to all Matlab functions; (3) customized help-like commands that document all of the constants and arrays in SABL that can be changed to invoke variants of the SABL algorithm included in the toolbox, specify details of models, or provide auxiliary information from the SABL algorithm that may be interesting and useful depending on context and objectives. Directions for accessing the examples are included with the toolbox instructions that are part of the toolbox download. Section 3.3 below describes the customized help commands. Section 3.8 provides an overview of utilities that streamline coding.

One invokes SABL with the command SABL([model], [project]). The second or both input arguments can be omitted. The file readme.txt included in the SABL toolbox download provides more details.

3.1 Platforms

Getting things done always requires both software and hardware. The SABL toolbox functions in a very particular but widely available proprietary software environment. It operates on two different kinds of hardware, one of which (traditional single or multicore CPU) is available to anyone who is able to access Matlab. In addition, and key to the power of SABL for complex inference and optimization problems, it also operates using one or more graphics processing units (GPUs) that can be installed easily and relatively inexpensively. The full combination of hardware and software is available through cluster computing facilities in many universities and is increasingly available in academic and cloud computing environments.

3.1.1 Software

The SABL toolbox is written entirely in Matlab, and Matlab, version 2012b or later, is required. A few functions, limited to likelihood function evaluation in more complex models in the current edition of SABL, have two versions, one written in Matlab and the other in C/CUDA. Using the C/CUDA version requires a C/C++ compiler (e.g. GCC, Intel C++ Compiler or Microsoft Visual C++) as well as the CUDA Toolkit from Nvidia. The CUDA Toolkit is freely available at https://developer.nvidia.com/cuda-toolkit. It drives the GPUs on the system and operates on top of the C/C++ compiler in compiling C/CUDA code. If Matlab runs successfully on a given combination of hardware and operating system, then so will the SABL code. By implication, the Parallel Computing Toolbox must be installed to exercise the GPU option (E.gpu = true) in SABL.

3.1.2 Hardware

All of the options and models in SABL are available for both CPU and GPU hardware; CPU is the default, and this is trivial to modify as described below. SABL exploits hardware in a CPU environment in exactly the same way as Matlab. By default, Matlab attempts to utilize all available CPU cores efficiently, and the extent to which this has been done can be gauged by the ratio of CPU time to elapsed time reported at the end of every SABL run. Experience suggests that this works to greater effect in SABL the more floating-point operations the application demands; ratios between 3 and 4 are not unusual in these circumstances in conventional quadcore environments.

Through the Parallel Computing Toolbox, Matlab provides the facility to access GPU(s) and dispatch execution explicitly to individual GPUs, if more than one is present. A physical GPU being used and execution local to it is referred to as a worker or lab in Matlab commands and documentation. SABL provides the options of using one or several GPUs, as described in Section 3.7.2.

The Matlab Parallel Computing Toolbox enables users to access GPU memory conveniently and implements a subset of its instructions for the GPU. With a bit of effort one can also execute Matlab on more than one GPU simultaneously. SABL code is written respecting the limitations of the Matlab instruction set for the GPU, making it a trivial matter to execute SABL exploiting a GPU-enhanced platform. Thus the modeler and user can continue to code exclusively in Matlab, and SABL will exploit the GPU. However, it does so relying only on the Matlab instruction subset for the GPU. In the Matlab Parallel Computing Toolbox the organization of GPU computing is then function specific: it takes advantage of the GPU function by function, in vector arithmetic and to some extent in linear algebra. It does not fully exploit the pleasingly parallel nature of the SABL algorithm, which is especially important in likelihood function evaluation. In this environment GPU execution can be faster than multicore CPU execution, but at best by a small factor, rarely more than 5 and certainly less than 10 compared with a quadcore CPU.

The SABL toolbox organizes execution to exploit the pleasingly parallel nature of the SABL algorithm. This organization involves the entire algorithm and extends across the many functions that implement both current and prospective variants of the algorithm, models, and objective functions. Because of this organization, GPU execution in SABL can be faster than multicore CPU execution by much higher factors.

To fully exploit these advantages SABL offers the option of providing code written using C/CUDA to evaluate likelihood functions. This provides more explicit control of threads, and in particular the ability to specify that each thread evaluates the entire likelihood or objective function for a particular particle. As the number of floating point operations per particle required to evaluate these functions increases, the fraction of total SABL execution time devoted to the evaluation of these functions increases toward one. This enhances the reduction in execution time afforded by this organization of computations on the GPU. The number of floating point operations in the entire algorithm is close to proportional to the number of particles, especially in applications dominated by likelihood function evaluation. But, because there is also fixed overhead in GPU execution, returns to this organization of computations on the GPU increase noticeably up to several tens of thousands of particles. In the most favorable cases we have observed execution 40 times faster in this environment using a GPU with about 500 cores, compared with a quadcore CPU.

3.2 Layering, stages and passing control

Exhibits¹ 1 and 2 reproduce the heart of the executable Matlab code from its two highest-level functions, SABL in Exhibit 1 and c_sabl in Exhibit 2. (Housekeeping functions, like creating and restoring path names, as well as print statements, have been removed.) To fully exploit the power and flexibility of SABL it is important to understand this high-level logic.

¹ All exhibits are at the end of this document.

SABL employs three layers: core, model and project. The core layer is the algorithm itself, including all of the options for the C, S and M phases. (SABL is designed to accommodate additional phase options in the future, including options provided by third parties.) Each layer consists of a set of functions. All function names in the core layer have the leading characters c_, except for the highest-level function SABL. All function names in the model layer have the leading characters m_. These functions are model specific. Each model has its own directory, and from the declaration of the model name at the outset, SABL builds the path that includes the right model directory.

A particular application of a model, including the data, specification of a prior distribution, and parameters controlling the algorithm (for example, the number of particles), is called a project. All projects include the function p_monitor, which defines and manages the application. Users may find it advantageous to write additional functions invoked directly or indirectly by p_monitor. As always, it is important that these functions not have the same names as functions in Matlab or any Matlab toolbox being used, unless the intention is to provide a replacement for the function in question. For the SABL toolbox this simply means not having c_, m_, or u_ as the two leading characters of a function name (or writing a function SABL). SABL uses u_ as the leading characters of utility functions, discussed in Section 3.8.

The function c_sabl executes the SABL algorithm as such, as will be clear from a quick overview of Exhibit 2 in the context of the description of the SABL algorithm in Section 2. Each iteration in the while C.moreinfo loop corresponds to a cycle of the algorithm. The function SABL embeds c_sabl and in addition does some housekeeping, mostly suppressed in Exhibit 1, as well as implementing the two-pass variant of the algorithm (Section 3.6) and incremental updating of the posterior distribution with additional data (Section 3.5), explicit in Exhibit 1.

At many points in c_sabl control passes to the function c_monitor in the core layer (Exhibit 2). This function executes certain commands and then passes control to the function m_monitor in the model layer. This model-specific function executes certain commands and then passes control to p_monitor, the project-specific interface provided by the user. On return from p_monitor control passes back to m_monitor, which can execute certain further commands, then to c_monitor, which can also perform additional tasks before control returns to c_sabl. This also happens at four points in SABL (Exhibit 1).

The hierarchy has several purposes, one of which is simply to accommodate many models and projects in the same structure. Somewhat more subtle but quite important is that the code in the c_monitor and m_monitor functions is written so as to enable the modeler (through m_monitor) and the user (through p_monitor) to exercise a great deal of control over the details of the SABL algorithm. The pattern is that c_monitor provides all of the details required for the SABL algorithm, using pre-defined default choices, before passing control to m_monitor. (For example, the default choice for the number of particles is J = 16 groups with N = 1024 particles apiece.) The function m_monitor then has the opportunity to modify these defaults before invoking p_monitor, as well as to make default choices for options specific to the model (e.g. the size and content of the data set). In p_monitor the user has the opportunity to override defaults set in either c_monitor or m_monitor before control returns to m_monitor, and similarly m_monitor provides all of the model-specific details and defaults required before control returns to c_monitor.

Since control passes downward from c_monitor to m_monitor to p_monitor and
back again at many points in the algorithm, SABL must communicate to those functions the point at which control has been passed. To this end the entire algorithm is partitioned into stages (a concept), and at the end of each stage there is a round trip through the *_monitor functions (the implementation of the concept). The single input parameter of c_monitor, m_monitor and p_monitor is the stage name. Each invocation of c_monitor or p_monitor in SABL or c_sabl in Exhibit 1 indicates the name of the stage and the point at which that stage ends. The SABL code, including sample projects, uses stage as the variable name of this input string.

Thus the structure of p_monitor code must be

if strcmp(stage, 'stagename_1')
    execute this ...
elseif strcmp(stage, 'stagename_2')
    execute that ...
...
elseif strcmp(stage, 'stagename_n')
    execute yet something different ...
end

While the ordering of the if and elseif blocks makes no difference, it is useful to keep them in the same order in which they occur in the algorithm. It is important to bear in mind that SABL passes through all of the stages in a cycle, that is, all of the stages named inside the while moreinfo loop in Exhibit 2, many times. Thus it is good practice that all execution in m_monitor and p_monitor be written within if / elseif branches of the kind just described. The best way to communicate constants and arrays across stages is by assigning them to fields of the P global structure described in the next section.

3.3 Global structures

Most constants and arrays in SABL are fields of structures, and these structures are global. SABL employs 8 structures, as follows:

• Core constants and arrays: C and Cpar. Examples are C.J, the number of particle groups, and Cpar.theta, the matrix of particles.

• Environment constants and arrays: E and Epar. Examples are E.gpu, the indicator for execution using one or more GPUs, and E.pus, the number of Matlab workers. Epar currently has no fields.

• Model constants and arrays: M and Mpar. Examples are M.data, the data array in many models, and M.x_pointer, which defines the covariates by pointing to columns of M.data. Mpar has no fields in most models.

• Project constants and arrays: P and Ppar. These global structures are reserved for users who find it convenient to use global structures to communicate between user-defined functions or across SABL stages. The functions in the SABL toolbox do not access these structures. Using these fields in the default environment mode (E.gpu is false and E.pus = 1) is straightforward. In other modes it is essential to have a solid understanding of the way that Matlab and SABL handle storage and access, and what would be straightforward in the default mode can lead to terminal errors or, worse, quite inefficient execution. Most important, improper use of these fields while executing with one or more GPUs (E.gpu = true) can completely compromise the efficiency of the entire SABL algorithm, to the point where execution is slower than it would have been on a single CPU.

With a single exception, all communication between users and SABL, and between modelers and SABL, is managed through the fields of these global structures. The one exception is the parameter stage, which is the single input argument of p_monitor (for users) and of m_monitor (for modelers).

Every field of the C, E and M structures that can have content (as opposed to, for example, C.Cphase, which is a structure with its own fields) is either a control field or a monitor field. Control fields and subfields with content in the C structure define the specific variant of the SABL algorithm; those in the E structure do the same for the computing environment; and those in the M structure define the specific model. Control fields all have default values, and any of them may be changed by users, to good effect when done at the appropriate stage of the algorithm. Monitor fields provide access to intermediate aspects of the SABL algorithm, and are very useful for specific purposes, for example, having access to predictive log likelihoods in the 'data_whole' variant of the C phase. In these applications it can be efficient to retrieve the contents of these fields, or functions of their contents, a procedure known as harvesting. It may prove efficient to copy these contents or functions into a field of the P structure and then save the P structure at the termination of SABL.

Example 1 Suppose that p_monitor invokes additional functions provided by the user. The user could pass stage as an input through a hierarchy of functions as needed. Alternatively, the user can set stage into a global constant (consistency with SABL syntax suggests P.stage = stage) in p_monitor.

All control fields of global structures in SABL have default values. For example, the default number of particle groups is C.J = 16, the default number of particles in each group is C.N = 1024, and execution uses CPU cores (E.gpu = false) with one Matlab worker (E.pus = 1). Some models may elect to change these default values.

There is an important distinction between C and Cpar. Cpar is reserved for arrays whose row dimension is the number of particles per Matlab worker (C.JNwork = C.JN/E.pus). These arrays are distinguished by placement in a different structure because, when there are multiple Matlab workers or multiple GPUs, the C.JN particles are partitioned into E.pus sets of equal size and each set is allocated to a worker. Thus each worker 'sees' only part of any array whose rows correspond to particles, and core SABL code handles functions of all particles in a computationally efficient way. (SABL has specific functions devoted to this task that are also available to users; see Section 3.8.) In the same situation the entire C structure is copied to each worker. The structures Epar, Mpar and Ppar exist to handle the contingency that environment globals, model globals, or project globals have arrays with the same size and allocation properties as the arrays that are the fields of Cpar.

On-line documentation for fields of global structures is available through a set of commands that work like the Matlab help command for functions. (Matlab help works for all SABL functions.) The command Chelp documents the C and Cpar structures, Mhelp documents the M structures, and Ehelp documents the E structures. The commands differ by the leading character of the global structure because the definition of structures in M is model-dependent. (SABL knows which model has been invoked and provides the correct description.) The documentation always indicates whether the field is a control field or a monitor field, and in the case of a control field provides the default value.

Example 2 Chelp C.Jwork produces the lines

    The number of particle groups assigned to each worker, C.Jwork = C.J/E.pus.
    If C.J/E.pus is not an integer then a terminal error results.
    MONITOR FIELD

The user can assign values to any of the C, M or E control fields. These fields all have default values, so while it is permissible for users to assign values to these fields it is not necessary. The modeler can assign values to any of the C or E control fields, and must assign default values to the control fields of the M structure for the model in question. With rare exceptions these assignments are made in stage 'startrun'. The default values are provided by the Chelp, Ehelp or Mhelp query.

Example 3 Chelp C.J results in the lines

    The number of groups J of particles. Default value may be different for
    some models. Users can modify C.J in p_monitor at stage startrun.
    CONTROL FIELD  Core default value C.J = 16

3.4 Using layering and global structures to customize SABL

SABL has defaults for all its options. This extends to all of the models in SABL. Even if p_monitor doesn't provide data for the model, the model generates artificial data and proceeds. In the most extreme case, if SABL is assigned a model but not a project when it is invoked, then it isn't even necessary to provide a p_monitor function at all. This case is useful for teaching purposes and quick introductions to SABL, but not much else. This section provides a series of examples illustrating how SABL can be customized.

SABL algorithm options are governed by the control fields of the C structure, execution options by the fields of the E structure, and model options by the fields of the M structure. Any C or E field not modified by the user or modeler assumes its core default value, and any M field not modified by the user assumes its model default value. In fact, invoking SABL([model]) executes the specified model entirely by default, using data specified by the modeler.

Example 4 The project uses model toynorm2 and its own data set, but otherwise it uses SABL defaults, including the default prior distribution, all described in the documentation for the model toynorm2. The data set is in the array data in a file named datafile. A user would provide the following function:

function p_monitor(stage)
global M
if strcmp(stage, 'startrun')
    load datafile
    M.y = data;
end

If datafile and p_monitor are in the same directory and this is also the current working directory for Matlab, then the screen command SABL('toynorm2', pwd) will execute model toynorm2 using the data from the array data in file datafile, but otherwise using entirely the default options of that model.

Example 5 Now suppose that this project uses J = 32 groups of N = 4048 particles each (default: J = 16, N = 1024). It uses the prior distribution N(−0.5, 1²) for log(σ²) (default: N(0, 1²)). The project also tracks the number of observations and the relative effective sample size at the end of the C phase, and the number of unique particles at the end of the S phase, in each cycle.

    function p_monitor(stage)
    global C Cpar M P
    if strcmp(stage, 'startrun')
        load datafile
        M.y = data;
        C.J = 32;
        C.N = 4048;
        M.prior.mean = [0; -0.5];
        M.prior.std = [1; 1];
        P.t = zeros(20,1);
        P.ress = zeros(20,1);
        P.nunique = zeros(20,1);
    elseif strcmp(stage, 'endSphase')
        P.t(C.cycle) = C.Cphase.t;
        P.ress(C.cycle) = C.ress;
        P.nunique(C.cycle) = C.Sphase.nunique;
    elseif strcmp(stage, 'finish')
        disp('  Cycle      t   Frac unique')
        disp([(1:C.cycle)', P.t(1:C.cycle), P.nunique(1:C.cycle)])
    end

Since C.Cphase.t for the current cycle is first available in stage ’endCphase’, it could have been harvested then. But C.Cphase.t remains unchanged during the rest of the cycle, so it could be harvested in any of the following stages up to and including ’endMphase’. The number of unique particles changes in the S phase and in each iteration of the M phase, so it must be counted in stage ’endSphase’. The code is written to accomplish both harvests in a single elseif branch.

3.5 Updating with new data

Given an ordering of the observations t = 1, ..., T, the posterior distribution for all T observations may be expressed as a sequence of recursive updates of posterior distributions at times t_s, with 0 < t_1 < ... < t_S = T:

    p(θ | y_{1:T}) ∝ p(θ) ∏_{s=1}^{S} p(y_{t_{s−1}+1 : t_s} | y_{1:t_{s−1}}, θ).

SABL facilitates this updating by making it convenient to use the distribution p(θ | y_{1:t}) in place of the prior distribution p(θ), and the likelihood function p(y_{t+1:T} | θ) in place of p(y_{1:T} | θ). If particles from p(θ | y_{1:t}) are available in a file, it is faster to update than it is to execute the entire algorithm, often much faster. This is especially advantageous in real-time, on-line settings in which data arrive regularly and quickly.

The project specification C.simulation_record = filename, declared by p_monitor in stage ’open’, causes SABL to record all 8 global structures in file filename. (This happens at the end of the second pass if there are two passes.) All of this information, rather than just Cpar.theta, is recorded because updating requires quite a bit more. For example, in the M phase SABL evaluates the entire posterior distribution using new particles, and this requires access to the hyperparameters of the original prior distribution.

In an updating run of SABL the project specifies C.simulation_get = fname (the same name assigned to C.simulation_record in the earlier run of SABL). This declaration must be made by p_monitor in stage ’open’. SABL then begins using these particles rather than particles drawn from the prior distribution in the updating run. It also uses all of the data, and for this reason SABL requires that p_monitor provide the full, updated data set in stage ’startrun’. (SABL identifies the part that is new from the value of C.Cphase.tlast read from C.simulation_get.) Not all of the details of the SABL algorithm need be the same in an updating run as they were in the earlier run. For example, criteria for stopping the M phase can be changed based on experience in the earlier run, as can the effective sample size criterion C.Cstop.ress for stopping the C phase. But certain other details cannot, for example the number and organization of particles specified by C.J and C.N. SABL checks for this particular error, but the current edition does not check all such possible errors, which could cause SABL to terminate with an error message that is not particularly illuminating. The SABL updating



feature is tailored for ready implementation in environments in which new data arrive regularly but the specific variant of the SABL algorithm employed, as specified in the control fields of the C structure, does not evolve in any significant way.

Heuristically, updating in SABL is more efficient than re-execution from the beginning to the extent that the increment to information in the update, relative to the previous sample, is small compared to the increment in information in the entire sample relative to the prior distribution. If the information in the prior distribution is small compared to that in a typical observation (corresponding to the colloquial “diffuse prior”), then this advantage is considerable. In this situation SABL bypasses the early cycles in which observations enter the sample slowly (in the ’data_whole’ variant of the C phase) or the power of the likelihood function increases slowly from zero (in the ’anneal_Bayes’ variant). Each iteration of the M phase takes longer in updating than it typically did in the previous run, because the likelihood function incorporates more observations; but this cost would also be incurred if the entire algorithm, rather than just the update, were executed. In regular updating, for example the addition of the next quarter’s macroeconomic data or the next day of financial data, the update often entails just one cycle. Moreover, this cycle in turn involves a very small number of iterations, sometimes just one, in the M phase. The reason is that by default the final cycle of SABL uses a stronger termination criterion (C.Mstop.rne_end) than do the earlier cycles (C.Mstop.rne), implying that the particles at the start of an update are more thoroughly mixed than they are at the start of a typical cycle.

It is important to distinguish updating with additional data from revision of data employed in a previous run of SABL. The SABL algorithm cannot “undo” information once it is incorporated; this is fundamental. If data are revised, it is necessary to begin with the prior distribution, or with a posterior distribution that uses only a subset of the unrevised portions of the data.
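As a concrete sketch of the record-then-update workflow just described: the earlier run would set C.simulation_record = 'results_run1' in stage ’open’, and the updating run’s p_monitor might then look as follows. The file name results_run1 and the data file and array names are hypothetical; the control fields and stages are those documented above.

```matlab
% Sketch of an updating run's p_monitor.  Assumes the earlier run recorded
% its state via C.simulation_record = 'results_run1' in stage 'open'.
function p_monitor(stage)
global C M
if strcmp(stage, 'open')
    C.simulation_get = 'results_run1';  % particles etc. from the earlier run
elseif strcmp(stage, 'startrun')
    load datafile_updated               % the FULL updated data set;
    M.y = data;                         % SABL finds the new part from
end                                     % C.Cphase.tlast in the recorded file
```

Because the recorded file contains all 8 global structures, nothing else need be re-declared for the update to proceed, though criteria such as C.Cstop.ress may be changed as noted above.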

3.6 Two-pass variant of the SABL algorithm

The SABL algorithm builds the kernels k^(ℓ) in the C phase of successive cycles using information in the particles θ_jn^(ℓ−1), and the transition kernels dQ^(ℓ,s) in the M phase using information in the succession of particles θ_jn^(ℓ,s−1) (s = 1, 2, ...). These critical adaptations, especially in the M phase, are precluded by the theorems for sequential Monte Carlo, a key underpinning of SABL.

SABL resolves this problem and achieves analytical integrity by utilizing two passes. As explained in Section 2.4.1, the first pass is the adaptive variant in which the algorithm is self-modifying. The second pass fixes the design emerging from the first pass and then executes the fixed Bayesian learning algorithm to which the results in the sequential Monte Carlo literature apply. Experience now strongly suggests that the second pass leads to qualitatively the same results as the first pass, in particular to similarity of simulated posterior moments and their numerical standard errors. Differences in posterior moment approximations between the two passes are typically consistent with



the numerical standard errors. Yet this is simply an observation, and it is important to be able to check this condition at least occasionally in the course of a research project.

All that is required for a project to execute the two-pass variant of the algorithm in SABL is for p_monitor to set C.twopass = true in stage ’startrun’. The impact of this change is easy to trace in Exhibit 1. SABL executes both passes in sequence without pausing, displays to the screen in similar fashion in both passes, and records the results of the second pass to file if C.simulation_record is not empty. The logical field C.passone indicates whether SABL is in the first (true) or second (false) pass.

Example 6 Suppose that a research project makes a formal comparison of results in the first and second passes, and to this end requires access to the particles of both passes. This can be accomplished by setting C.twopass = true in stage ’startrun’, and then:

    elseif strcmp(stage, 'finish')
        if C.passone
            P.pass1.mean = u_mean(Cpar.theta);
            P.pass1.std = u_std(Cpar.theta);
            P.pass1.nse = u_nse(Cpar.theta);
            P.pass1.rne = u_rne(Cpar.theta);
        else
            P.pass2.mean = u_mean(Cpar.theta);
            P.pass2.std = u_std(Cpar.theta);
            P.pass2.nse = u_nse(Cpar.theta);
            P.pass2.rne = u_rne(Cpar.theta);
            p_passcompare(P.pass1, P.pass2);  % Formal comparison
        end
    end

Users contemplating this kind of work with multiple workers can test their preparation by understanding why this code would not run at stage ’endrun’. Would it work if E.gpu = true, and if not, what changes should be made? If the p_passcompare code led to a terminal error, the results could still be saved with the line global P followed by the line save filename P from the console. Then p_passcompare could be modified, the file loaded, and p_passcompare executed. Either way, p_passcompare has access to all SABL functions. A more conservative strategy would simply save P at the end of execution without attempting p_passcompare in stage ’finish’.

3.7 Using multiple workers and GPUs

As detailed in Section 2, the SABL algorithm is pleasingly parallel, meaning that almost all steps execute the same instructions on different particles and there is very little communication between particles. The SABL toolbox exploits this property by organizing execution in one of four ways, corresponding to the Cartesian product of (a) one (E.pus = 1) or more (E.pus > 1) workers, and (b) execution using CPU cores (E.gpu = false) or GPU cores (E.gpu = true). Alternatives (a) and (b) interact as follows.



1. With one worker and CPU cores (E.pus = 1, E.gpu = false) SABL exploits the pleasingly parallel structure of the algorithm only through array arithmetic in which one dimension often corresponds to particles. This is merely a convenience for coding in Matlab: at the machine execution level all computation is serial, with one exception. The exception is that the Matlab compiler takes advantage of multiple CPU cores in translating Matlab code to C code, on a function-by-function basis. This does, indeed, accelerate computing, but not by means of exploiting the pleasingly parallel structure of the SABL algorithm.

2. With several workers and CPU cores (E.pus > 1, E.gpu = false) SABL partitions particles amongst cores. (E.pus cannot exceed the number of CPU cores available.) Each worker (in fact, a thread) utilizes a single CPU core, and threads are synchronized using Matlab's single program multiple data (spmd) protocol. This often, but not necessarily, leads to faster execution than when E.pus = 1: the reason is that Matlab was already exploiting multiple CPU cores, although with a different strategy, with E.pus = 1.

3. With one worker and one GPU (E.pus = 1, E.gpu = true) SABL performs all but a few housekeeping functions on the GPU. GPU code exploits the hundreds (or thousands) of GPU cores available, defining threads to take advantage of the parallel structure of the algorithm and the single-instruction multiple-data nature of the GPU processors. This is accomplished in part through Matlab GPU kernels, but most importantly through C/CUDA code for likelihood function evaluation, which in turn dominates floating point operations in all but the simplest models and applications.

4. With several workers and GPUs (E.pus > 1, E.gpu = true) particles are partitioned across GPUs, and then on each GPU execution proceeds as in the single-GPU case. Because of the pleasingly parallel structure of the SABL algorithm there is little need for communication between GPUs; that which takes place utilizes Matlab utilities for this kind of transfer, and in all but the simplest models and applications it accounts for a very small fraction of the time required to complete the computations.

User code executed from p_monitor in stage ’startrun’ can ignore issues related to multiple workers and GPUs in assigning constants or arrays to fields of the P and M structures: SABL allocates the fields of these structures to CPU or GPU, and across workers, in an appropriate way. Thus users who have experience with Matlab but are unfamiliar with the extensions in the Matlab Parallel Computing Toolbox can simply write code to set control fields at this stage in their usual manner. In other stages more care is required, especially when harvesting monitor fields into fields of the P or Ppar structures. The remainder of this section goes into some of the details and considerations that are important for these operations to proceed successfully.
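Selecting among the four execution modes above is done entirely through the E structure in p_monitor. A minimal sketch, in which the worker count 4 is an arbitrary illustration:

```matlab
function p_monitor(stage)
% Sketch: selecting the execution mode described above.
global E
if strcmp(stage, 'open')
    E.gpu = false;   % set true to execute on GPUs rather than CPU cores
elseif strcmp(stage, 'startrun')
    E.pus = 4;       % number of workers; when E.pus > 1 a Matlab pool of
end                  % at least this many workers must already be open
```

The pool itself is opened and closed by the user, either outside SABL or in stages ’open’ and ’close’, as discussed in Section 3.7.1 below.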



3.7.1 Using multiple workers

By default SABL uses one worker (E.pus = 1). Using multiple workers requires that the Matlab Parallel Computing Toolbox be installed. The user manages the opening of the Matlab pool, including specification of the number of workers, as well as its closing. The user can do this in p_monitor, opening a Matlab pool in stage ’open’ and closing it in stage ’close’. However, this leads to a complication: the Matlab pool remains open if SABL never reaches stage ’endrun’ (for example, if SABL issues a terminal error or if the user interrupts SABL), and then the user must manually close the pool, or else a Matlab error arises on the next execution of the code when it attempts to open a pool that is already open. For this reason SABL does not attempt to open or close pools directly, and leaves it up to the user to handle this in p_monitor or outside of SABL. If the user sets E.pus to an integer greater than 1 in stage ’startrun’ and there is no pool open, or if there are fewer than E.pus workers in the pool, then SABL terminates with an error message to that effect.

SABL executes in parallel on multiple workers by means of a Matlab spmd block, which contains all stages between (but not including) ’startrun’ and ’endrun’. This is shown explicitly in Exhibits 1 and 2. Once inside the spmd block there are E.pus workers (more technically, threads) that execute the same instructions. These instructions create and modify arrays and constants that have the same names but are in fact different. Thus once inside the spmd block there are E.pus versions of everything, including both global structures and local constants and arrays. This means that p_monitor code (and the functions that it invokes) also applies to these multiple copies.

Notice that SABL executes the same function c_sabl outside the spmd block if E.pus = 1 and inside the spmd block if E.pus > 1. This obviates writing two different versions of the code, one for a single thread and one for multiple threads, for all of the functions invoked directly or indirectly by c_sabl, including p_monitor and functions written by the user that are invoked by p_monitor in the stages between ’startrun’ and ’endrun’. It is important to bear in mind that the results of computations inside an spmd block are available only to the workers and not to local CPU memory. Communication between local CPU memory and the workers (threads) must be handled explicitly, and SABL does this by means of the function c_sabl_spmd, shown in its entirety in Exhibit 3.

When E.pus > 1, control passes to c_sabl_spmd rather than to c_sabl from SABL (Exhibit 1). Exhibit 3 shows all of the executable code and most of the comments from c_sabl_spmd. This function contains the spmd block that manages the multiple workers (threads). While inside an spmd block, identical instructions are executed on all of the threads, each thread modifying arrays and constants in different ways despite the fact that they have the same name. Notice that the spmd block invokes the same function c_sabl that is invoked directly by SABL when E.pus = 1. The function c_sabl, in turn, invokes (directly and indirectly) the functions that do almost all the SABL computations. In this way the SABL toolbox exploits the pleasingly parallel nature of the algorithm.

The contents of constants and arrays inside the spmd block are strictly local to each thread (worker): this is necessary since these constants and arrays have the same



name but in fact map to different CPU cores (if E.gpu is false) or different GPUs (if E.gpu is true). To get computations started in an spmd block, information has to be transferred from bytes allocated to local CPU storage to bytes allocated to threads (workers). Similarly, to make use of all the computations on the threads, information has to be transferred from the respective cores to local CPU memory when thread computation is complete. These transfers have to take place inside the spmd block, for this is the only place that the workers are accessible. Notice in Exhibit 3 that this is exactly what c_sabl_spmd does: there are only three command lines, one to transfer a single thread to multiple threads (u_copyintospmd), one to do the work (c_sabl), and one to transfer from multiple threads to a single thread (u_copyfromspmd). Threads correspond either to CPU cores (E.gpu = false) or to GPUs (E.gpu = true).

Most important, note that only the 8 structures C, E, M, P, Cpar, Epar, Mpar and Ppar are transferred. (Facilitating transfer of diverse groups of arrays and constants is one of the reasons these structures are central to SABL.) For transparency, the technical details of these transfers are captured in u_copyintospmd and u_copyfromspmd, and these functions are in turn shown in their entirety in Exhibit 4. These transfers include all fields of the P and Ppar structures created by project code. So long as project code exploits the fields of these structures, code can be written that ignores the technical complications of threads; SABL handles all of the overhead.
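In outline, and paraphrasing Exhibit 3 rather than reproducing it, the three-command-line structure of c_sabl_spmd amounts to the following sketch (the comments are explanatory, not the original's):

```matlab
function c_sabl_spmd
% Sketch of the pattern described above: copy the 8 global structures
% into the workers, run the algorithm, and copy the results back out.
spmd
    u_copyintospmd;   % single thread -> E.pus threads
    c_sabl;           % the work: the SABL algorithm proper
    u_copyfromspmd;   % E.pus threads -> single thread
end
```

Everything a project needs to move in or out of the workers therefore travels as a field of one of the 8 structures.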

Example 7 Suppose that a project works with quarterly macroeconomic data and one of its objectives is to illustrate how the posterior mean and standard deviation of several interesting functions of the parameters evolved over time. (For example, how they changed before, during and after the “great moderation” might be a particular focus.) To accomplish this in most approaches would require that inference be repeated for all quarters, a time-consuming task. The SABL toolbox option C.Cphase_method = ’data_whole’ introduces data one observation at a time in the C phase and in doing so provides access to the required posterior distributions. This access is transient: it exists in each step of the C phase within each cycle, but once that step is completed the posterior distribution is modified with the introduction of the next observation, in the next step of the C phase in the current or next cycle. When the algorithm is complete the user would like to store the posterior means and standard deviations in a file, where they can be accessed to create tables or figures. Code of the following form will accomplish this for (say) 5 functions of interest, regardless of whether execution takes place with E.pus = 1 or E.pus > 1:

    elseif strcmp(stage, 'startrun')
        P.funcmeans = zeros(C.tlast, 5);
        P.funcstds = zeros(C.tlast, 5);
        ...
    elseif strcmp(stage, 'whileCphase')
        % The function func evaluates the functions of interest:
        g = func(Cpar.theta);
        P.funcmeans(C.Cphase.t, :) = u_mean(g, 1, E.pus>1, Cpar.logw);
        P.funcstds(C.Cphase.t, :) = u_std(g, 1, E.pus>1, Cpar.logw);
        ...
    elseif strcmp(stage, 'endrun')
        funcmeans = P.funcmeans;
        funcstds = P.funcstds;
        save funcfile funcmeans funcstds

The first and last of the three blocks execute as a single thread whereas the second block executes as E.pus threads. (This is explicit in Exhibit 1.) If E.pus > 1 then in the middle block there are identical copies of the global C, E, M and P structures on each thread. The last block references the P structure that SABL retrieved from one of the workers just before leaving the spmd block. (Use Chelp C.Cphase.t and Chelp Cpar.logw for explicit descriptions.)

The functions u_mean and u_std deal with a number of complications that arise

regularly and would otherwise be inconvenient. One is that the particles are weighted in the C phase and this affects the computations, hence the input Cpar.logw. A second is that the computations require communication across the workers. The functions u_mean and u_std handle this in an efficient fashion, minimizing the information transferred between threads.

Observe that if any of the arrays created as fields of the P structure had instead been local, or persistent, or had been a field of any structure other than the eight defined by SABL, the computations would have failed if E.pus > 1. The only local constant or array is g in the middle block, which exists only to avoid executing func twice. This is satisfactory because g is transient: the E.pus versions of g are used only on this occasion. Some care must be exercised in writing the code in func if it is to execute successfully when E.pus > 1 and/or when E.gpu is true. Since Cpar.theta can be quite large and exists on the GPU if E.gpu is true, there is the potential for the code to move particles between GPU and CPU inadvertently, thereby adding great overhead in each iteration of the C phase. SABL provides utility functions that assist with this (Section 3.8), but they do not entirely preclude the possibility of mistakes of this kind.

The efficiency of using multiple CPU cores explicitly in this way increases as problems become larger and more complex: the dominant relevant dimensions are the number of particles along with the number of floating point operations in the evaluation of the likelihood function (in Bayesian inference) or objective function (in optimization). For Bayesian inference in models that do not exploit sufficient statistics, the number of observations is a third relevant dimension. This is complicated by the fact that under the basis for comparison, E.pus = 1, Matlab already exploits the multiple CPU cores found in most hardware. The implications can be seen directly in SABL, which indicates both elapsed time and CPU seconds at the end: the latter is typically a multiple of the former and can exceed 3 on a conventional quadcore machine. Thus executing on a quadcore machine with E.pus = 2 could well be counterproductive. In practice significant gains (roughly from 50% to 150%) are seen only on machines with 8 or more cores and with E.pus equal to the number of cores.



Given these modest gains, SABL might not include the option for multiple workers if only CPU time were at stake. The option is present because precisely the same logic handles multiple GPUs, for which elapsed (i.e., wall clock) time can be nearly inversely proportional to the number of GPUs in large problems.

3.7.2 Using graphics processing units

The Matlab Parallel Computing Toolbox also integrates computation on graphics processing units (GPUs) into Matlab code. By default, SABL executes on the CPU (E.gpu = false). To execute with one or more GPUs, the user declares E.gpu = true in stage ’open’ of p_monitor. SABL will then use E.pus GPUs so long as there are at least E.pus physical GPUs installed and available; if not, SABL terminates with an error message. If E.pus > 1 then the user must also manage a Matlab pool in the same way described at the start of Section 3.7.1.

Just as Matlab uses the compiled language C to execute code on CPU cores, with the Parallel Computing Toolbox it uses the extension C/CUDA to execute code on GPUs. It enables the Matlab programmer to create constants, arrays and certain structures in GPU memory and to transfer data from CPU to GPU and vice versa. Many Matlab commands have been extended to GPUs with the same options and variants (e.g., svd and kron); some other commands have been extended to GPUs but with particular options only (e.g., rand, and log with complex output), and others have not been extended (e.g., pinv and poissrnd). These extensions are still actively underway and grow with each new edition of the Matlab Parallel Computing Toolbox, so documentation for the installed version of the Parallel Computing Toolbox should be consulted. As with all Matlab documentation the contingencies are not exhaustive and some learning by trial and error may be required.

GPU arrays have precedence over all other types, and within GPU types precedence is the same as within CPU types (e.g., double takes precedence over single). Thus if a type double CPU array A and a type double GPU array GA are conformable, then B = A + GA is a type double GPU array. If GA had been a type single GPU array then B would be a type single GPU array. The primary tool for moving data from the CPU into the GPU is gpuArray, for example GA = gpuArray(A), and the primary tool for moving data from the GPU to the CPU is gather, for example A = gather(GA). Other data types supported on the GPU include the int and uint types as well as logical. The elements of cell arrays and the field names of structures are pointers that exist in CPU memory. Different fields in a structure can contain different data types.
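These precedence rules can be seen in a short session. This is a sketch requiring the Parallel Computing Toolbox and an available GPU device; the query function classUnderlying is the toolbox's own, circa this edition:

```matlab
% Type precedence between CPU and GPU arrays.
A  = ones(3);                     % type double, CPU
GA = gpuArray(single(ones(3)));   % type single, GPU
B  = A + GA;                      % GPU wins over CPU, single over double:
                                  % B is a type single GPU array
classUnderlying(B)                % reports the underlying type of B
A2 = gather(B);                   % move the result back to the CPU
```

No explicit conversion of A is needed; Matlab promotes it according to the precedence rules just described.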

ingly parallel. This includes many operations in linear algebra. Because there is sub-stantial overhead in execution on the GPU, the problem must be suffi ciently large andamenable to parallel execution for net gains to occur. When these conditions are notmet then execution on the GPU can take even longer than it does on CPU cores. TheSABL algorithm is inherently pleasingly parallel, and it is in large and complex problemsthat effi ciency gains matter: the gain from reducing computing time from 12 hours to



36 minutes (95%) is more valuable than reducing it from 2 minutes to 6 seconds (also 95%). In these large problems gains are nearly proportional to the number of GPU cores as well as to the number of GPUs, and gains of a factor of 50 to 100 compared with conventional “single CPU” implementation can be achieved with a 1,000-processor GPU. In considering the value of this increase in speed, keep in mind that the basis of comparison is a CPU environment in which 4 or more cores are already being used efficiently, and that the cost of GPU hardware, now well under one US dollar per core, is small relative to CPU installation; this relative cost has been falling.

Whether or not gains of this order are realized is strongly affected by the way the

log likelihood or objective function is evaluated. In large problems the vast majority of floating point operations are devoted to repeated evaluations of this function, which are in turn heavily concentrated in the M phase. It is because of this concentration that the parts of the SABL algorithm that are not perfectly parallel, chiefly resampling in the S phase and the assessment of acceptance rates in the M phase, are unimportant in large problems, in the relevant sense that reducing their execution time to nothing would have a proportionately negligible effect on total execution time. For likelihood or objective function evaluation to realize anything like its potential efficiency in SABL running on the GPU threads, it is important that the threads be organized to carry out function evaluation at a particle from start to finish. At least in its current incarnation, Matlab code running on the GPU cannot be instructed to do this. However, if likelihood function evaluation is carried out using a function coded in C/CUDA (more properly termed a kernel) this barrier can be overcome. More precisely, C/CUDA code organizes the threads and defines a single thread as a complete evaluation of the likelihood or objective function.

Communication between Matlab and kernels running on GPUs (for example, Matlab

*.cu and *.ptx files) is straightforward, just as it always has been between conventional Matlab and functions written in C (Matlab *.mex files). The current edition of SABL contains a full implementation of a C/CUDA kernel for likelihood functions in several models, complementing functions written in Matlab. This code, especially for egarch, is designed in part as a template for the implementation of C/CUDA kernels for other models.

In this environment, one of the greatest inefficiencies that can be introduced is frequent movement of all particles, or any array with size C.JN in some dimension, from GPU to CPU or vice versa. This never happens in the core SABL code: in particular all fields of the Cpar global structure are GPU arrays when E.gpu is true. This convention should be respected by the user in creating any Ppar fields if the efficiency of SABL running on GPUs is to be maintained. Particles are born (S phase), mutate (M phase) and die (S phase) in GPU memory. The precedence of the gpuArray data type implies that natural ways of writing project code avoid transfer from GPU to CPU.

Example 8 x = gpuArray(randn(10,1)) or x = gpuArray.randn([10,1]) creates x on the GPU. The subsequent command y = x.^2 places y on the GPU as well and entails no intervening movement between CPU and GPU. This is reinforced by the fact that



conversion from gpuArray to double is prohibited in assignment statements: continuing the example, the command z = zeros(10,1) followed by z(1) = x(1) generates a terminal error. This still places some burden on the veteran Matlab programmer who is new to GPUs.

Example 9 Return to Example 7, executing E.gpu = true in stage ‘open’. ThenCpar.theta is a GPU array, and so is the vector g created in stage ’whileCphase’.However, P.funcmeans and P.funcstds are CPU arrays, created in stage ’startrun’.Direct conversion of a GPU array to a CPU array is prohibited, and so the linesP.funcmeans... and P.funcstds... in stage ’whileCphase’ cause a terminal er-ror. Matlab specifically provides the command gather in its parallel toolbox to cope withthis situation, and since code using E.gpu = true successfully must have the Matlabtoolbox installed, the commandP.funcmeans(C.Cphase.t,:) = gather(u_mean(g, 1, E.pus>1, Cpar.logw))

and similarly for P.funcstds will successfully addres this problem.But now suppose the user provides the code to a colleague using Matlab without the

Parallel Computing Toolbox installed. Matlab will not recognize gather, leading to anerror the colleague may not recognize. Multiply this by many such occurrences in morecomplex code, and it’s clear a major portability problem to Matlab without the ParallelComputing Toolbox installed has arisen. For this reason, all SABL code is designed toexecute without the Parallel Computing Toolbox installed (and, for that matter, withoutthe Statistics Toolbox installed) so long as E.gpu = false and E.pus = 1. The substi-tute for gather is u_setdatatype:P.funcmeans(C.Cphase.t,:) = ...u_setdatatype(u_mean(g, 1, E.pus>1, Cpar.logw), ’cpu’)

These commands cause the transfer of 16*length(g) bytes from GPU to CPU per observation. In any problem large enough that GPU computation is more efficient than CPU computation, the overhead of this transfer is so negligible as to be difficult to measure.
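The portability pattern behind u_setdatatype can be sketched outside Matlab as well. In the following Python illustration (not SABL code), the optional package cupy stands in for the Parallel Computing Toolbox: the code probes for the optional GPU library and falls back to a no-op when it is absent, so the same source runs in both environments.

```python
import importlib.util

def gather_compat(x):
    """Sketch of the u_setdatatype idea: move data to host memory only when
    the optional GPU library is installed; otherwise return x unchanged.

    'cupy' is a stand-in here for Matlab's Parallel Computing Toolbox."""
    if importlib.util.find_spec("cupy") is not None:
        import cupy  # only reached when the optional dependency exists
        if isinstance(x, cupy.ndarray):
            return cupy.asnumpy(x)  # GPU -> CPU transfer
    return x  # CPU data (or no GPU support): no transfer needed
```

The design point is the same as SABL's: the caller never references the optional library directly, so code written this way remains portable.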

3.8 Utilities

SABL employs many utility functions that facilitate code writing and testing. Many of them provide a single command for array creation and common mathematical operations regardless of whether the computing environment is one or multiple workers, and CPU or GPU. For example, when execution takes place on one or multiple GPUs, these functions minimize transfer of data across GPUs and between CPU and GPU.

This section organizes and lists these functions, providing very brief descriptions. The Matlab help command provides full technical detail. The actual code can be found in the utilities directory of the SABL toolbox, along with some other functions that are important to SABL but less likely to be relevant for modelers and users.


• Array manipulation

— u_aggrecursive Globally concatenate (gcat) all the fields of a structure

— u_allocate Allocate an array on CPU or GPU memory

— u_distrirecursive Distribute all fields of a structure to workers

— u_gcat Aggregate data from multiple workers to one worker

— u_setdatatype Move data to CPU or GPU memory

• Matrix arithmetic

— u_logmeanlog Replicate log(mean(exp(x))) avoiding overflow

— u_logmomlog Also replicate log(std(exp(x))) avoiding overflow

— u_logsumlog Replicate log(sum(exp(x))) avoiding overflow

— u_matprod2 Efficient multiplication of large sparse matrices

— u_max Replicate max(x) with efficient handling of multiple workers

— u_min Replicate min(x) with efficient handling of multiple workers

— u_mvmom Replicate mean(x) and cov(x) with efficient handling of multiple workers

— u_nse Compute numerical standard error

— u_rne Compute relative numerical efficiency

— u_std Replicate std(x) with efficient handling of multiple workers

— u_sum Replicate sum(x) with efficient handling of multiple workers

• SABL standard priors

— u_prior_gauss Map Gaussian prior specification to standard

— u_prior_wishart Map Wishart prior specification to standard

• Management

— u_islab1 Replicate labindex == 1 independent of Parallel Toolbox

— u_numlabs Replicate numlabs independent of Parallel Toolbox

• Code testing

— u_pfcheck Test probability function and random sample

— u_pdfcheck Test probability density function and random sample
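The log-domain utilities u_logsumlog and u_logmeanlog guard against overflow when exponentiating large log weights. A minimal Python sketch of the standard trick they rely on (subtract the maximum before exponentiating; the function names here are illustrative, not SABL's):

```python
import math

def log_sum_exp(xs):
    """log(sum(exp(x))) computed stably: with m = max(xs), every term
    exp(x - m) is at most 1, so no overflow can occur."""
    m = max(xs)
    if m == -math.inf:  # all weights are zero in the log domain
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))

def log_mean_exp(xs):
    """log(mean(exp(x))), the quantity u_logmeanlog replicates."""
    return log_sum_exp(xs) - math.log(len(xs))
```

For example, log_sum_exp([1000, 1000]) returns 1000 + log 2 exactly, where a naive implementation would overflow at exp(1000).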


4 Variants of the algorithm in the SABL toolbox

The SABL code incorporates variants of tempering in the C phase, resampling in the S phase, and mutation employing an invariant transition kernel in the M phase. Similarly there are variants on procedures for determining the degree of incremental information in the C phase and the degree of mixing in the M phase, referred to collectively as stopping criteria.

For each of these variants this section provides a short description, often by reference to other sections of this Guide; the protocol for invoking the option; a list of the control fields or subfields of the global structure C that define the variant precisely and may be set by the user; and a list of the monitor fields or subfields of the global structure C that may be useful in accessing the state of the algorithm and should not be changed by the user. The Chelp command, in turn, documents the fields, and in the case of control fields indicates the default setting employed by SABL if the user or modeler does not set the field explicitly.

4.1 C phase

The current edition of SABL has three variants of the C phase: data tempering, power tempering, and optimization. The default method is data tempering.

The current edition of SABL has two variants of the C phase stopping criterion: effective sample size and optimization. Both terminate the C phase if relative effective sample size (10) reaches or falls below threshold (RESS(ℓ) ≤ RESS∗). In the case of power tempering this is guaranteed at the end of the first iteration of the C phase by solving (11) for ∆ℓ, but checking takes place regardless because it is almost costless and simplifies flow control in the code.

The following fields are common to all variants.
Control fields: C.Cphase_method, C.Cstop_method
Monitor fields: C.Cphase.count, C.Cphase.total

4.1.1 Data tempering

Description: See data tempering in Section 2.1. This is the core default.
Invocation: C.Cphase_method = ’data_whole’
Control fields: None
Monitor fields: C.Cphase.t, C.logpl

4.1.2 Power tempering

Description: See power tempering in Section 2.1.
Invocation: C.Cphase_method = ’anneal_Bayes’
Control fields: C.Cphase.crit
Monitor fields: C.Cphase.power


4.1.3 Optimization

Description: See power tempering in Section 2.1 and optimization in Section 2.6.
Invocation: C.Cphase_method = ’optimize’
Control fields: C.Cphase.crit
Monitor fields: C.Cphase.fracmax, C.Cphase.logarguments, C.Cphase.logpowers, C.Cphase.logobjectives, C.Cphase.max, C.Cphase.power, C.Cphase.spread

4.1.4 Effective sample size stopping criterion

Description: See Section 2.1. This is the core default.
Invocation: C.Cstop_method = ’ESS’
Control fields: C.Cstop.ress
Monitor fields: C.ress
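As a concrete illustration of the quantity this criterion monitors, here is a short Python sketch (not SABL code) of relative effective sample size computed from log weights, stabilized by subtracting the maximum before exponentiating:

```python
import math

def relative_ess(logw):
    """RESS = (sum w)^2 / (N * sum w^2) for importance weights given in logs.

    Equals 1 for uniform weights and approaches 1/N as a single particle
    dominates; the C phase stops adding information once this falls to
    the threshold."""
    m = max(logw)
    w = [math.exp(x - m) for x in logw]  # largest weight rescaled to 1
    s1 = sum(w)
    s2 = sum(v * v for v in w)
    return (s1 * s1) / (len(w) * s2)
```

With eight equal log weights the function returns 1; with one particle carrying essentially all the weight among four, it returns approximately 1/4.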

4.1.5 Unconditional stopping criterion

Description: Causes the C phase to terminate after a single iteration; used when C.Cphase_method = ’optimize’; can be used with custom single-iteration C phase methods.
Invocation: C.Cstop_method = ’unconditional_stop’
Control fields: None
Monitor fields: None

4.1.6 Optimization stopping criterion

Description: Evaluates convergence criteria to determine whether the current cycle is the last cycle.
Invocation: C.Cphase_method = ’optimize’ =⇒ C.Cstop_method = ’optimize’
Control fields: C.Cstop.fracmax, C.Cstop.logarguments, C.Cstop.logpowers, C.Cstop.logobjectives, C.Cstop.max, C.Cstop.ress, C.Cstop.spread
Monitor fields: None

4.2 S phase

SABL has four variants of the S phase: multinomial, residual, systematic and stratified resampling. The default method is residual resampling. In the S phase there is no analogue of the stopping rule in the C phase or M phase. Control fields and monitor fields are common to all variants of the S phase.
Control fields: C.Sphase_method
Monitor fields: C.Sphase.nunique, Cpar.select


4.2.1 Multinomial resampling

Description: See Section 2.2.
Invocation: C.Sphase_method = ’multinomial’

4.2.2 Residual resampling

Description: See Section 2.2. This is the core default.
Invocation: C.Sphase_method = ’residual’
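The default can be sketched as follows (a Python illustration of residual resampling in general, not the SABL implementation): each particle index is copied floor(N·w) times deterministically, and only the remaining slots are filled by multinomial draws on the fractional residual weights, which removes most of the resampling noise.

```python
import random

def residual_resample(weights, rng=random.random):
    """Return N resampled particle indices for normalized weights."""
    n = len(weights)
    counts = [int(n * w) for w in weights]            # deterministic copies
    residuals = [n * w - c for w, c in zip(weights, counts)]
    total = sum(residuals)
    for _ in range(n - sum(counts)):                  # slots left for random draws
        u = rng() * total
        acc = 0.0
        for i, r in enumerate(residuals):
            acc += r
            if u <= acc:
                counts[i] += 1
                break
        else:                                         # guard against rounding at the edge
            counts[-1] += 1
    return [i for i, c in enumerate(counts) for _ in range(c)]
```

With exactly uniform weights the residuals vanish and the result is fully deterministic, which is the sense in which residual resampling dominates multinomial resampling.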

4.2.3 Systematic resampling

Description: This variant is more efficient than residual resampling, but there is no central limit theorem for numerical accuracy. This option is intended for research use only and should not be used in applications.
Invocation: C.Sphase_method = ’systematic’

4.2.4 Stratified resampling

Description: This variant is more efficient than residual resampling, but there is no central limit theorem for numerical accuracy. This option is intended for research use only and should not be used in applications.
Invocation: C.Sphase_method = ’stratified’

4.3 M phase

The current edition of SABL has two variants of the M phase: Metropolis random walk and blocked Metropolis random walk. The default method is Metropolis random walk. Both variants are described in Section 2.3.

The current edition of SABL has four variants of the M phase stopping rule. Each rule is a function of two criteria: the number of iterations elapsed (iterations) and the average relative numerical efficiency of several functions of the parameters (RNE). In the first variant (C.Mstop_method = ’steps’) the stopping rule is a fixed number of iterations; in the second variant (’RNE’) the stopping rule is a minimum value of RNE; in the third variant (’RNE&steps’) the stopping rule is a maximum number of iterations or a minimum value of RNE, whichever comes first; in the fourth variant (’hybrid’) the stopping rule is a minimum value of RNE that declines with each iteration and eventually reaches zero. In all cases the fixed or maximum number of iterations and the minimum value of RNE can be (and by default are) larger in the final cycle than in previous cycles. The variant ’RNE&steps’ is the default.

The following fields are common to all variants of the M phase and the M phase stopping rule.
Control fields: C.Mphase_method, C.Mstop_method, C.Mphase.step_inc, C.Mphase.step_initial, C.Mphase.step_lower, C.Mphase.step_upper, C.Mphase.acceptgoal, C.Mphase.total
Monitor fields: C.Mphase.accept_rate, C.Mphase.count, C.Mphase.step_inc, C.Mphase.rne, C.Mphase.total

4.3.1 Metropolis random walk

Description: See Section 2.3. The stopping criteria are computed and the stopping rule is imposed accordingly at the end of each iteration.
Invocation: C.Mphase_method = ’MGRW’
Control fields: None additional
Monitor fields: None additional

4.3.2 Blocked Metropolis random walk

Description: See Section 2.3. There are two variants: fixed blocks and random blocks. The default is random blocks unless the user or modeler creates the field C.Mphase.blocks, a cell array with one entry per block, each entry pointing to the entries of the parameter vector θ defining that block. If the user or modeler does not create this field but creates the field C.Mphase.nblocks, each iteration of the M phase randomly allocates θ to blocks, making the block lengths as nearly equal as possible. If the field C.Mphase.nblocks is also absent then it is taken to be one-sixth the number of parameters, rounded to the nearest integer, with a minimum of 2. An attempt to use the blocked variant of the Metropolis random walk terminates with error if the model has only one parameter.

The stopping criteria are computed and the stopping rule is imposed accordingly at the end of each iteration. The iteration stopping criterion is the total number of blocks executed in the M phase of the current cycle: thus, for example, if C.Mstop.steps = 100 and C.Mphase.nblocks = 4, the M phase will execute exactly 25 complete iterations with 4 blocks in each iteration.
Invocation: C.Mphase_method = ’MRGW_blocked’
Control fields: C.Mphase.blocks, C.Mphase.nblocks
Monitor fields: C.Mphase.blocks, C.Mphase.iblock, C.Mphase.ranblock

4.3.3 Steps stopping rule

Description: The stopping rule is a fixed number of iterations of the Metropolis random walk.
Invocation: C.Mstop_method = ’steps’
Control fields: None additional
Monitor fields: None additional


4.3.4 RNE stopping rule

Description: The stopping rule is a minimum value of RNE computed for each of a set of specified functions. Under this stopping rule it is possible that the M phase will never terminate.
Invocation: C.Mstop_method = ’RNE’
Control fields: None additional
Monitor fields: None additional

4.3.5 Steps and RNE stopping rule

Description: The steps stopping rule or the RNE stopping rule is imposed, whichever is triggered first. This is the default stopping rule unless specified otherwise for a particular model.
Invocation: C.Mstop_method = ’RNE&steps’
Control fields: None additional
Monitor fields: None additional

4.3.6 Hybrid stopping rule

Description: This is like the steps and RNE stopping rule, except that the RNE criterion RNE(s) is a function of the number of iterations s,

RNE(s) = RNE∗ · (Steps∗ − s) / Steps∗,

where RNE∗ is the criterion of the RNE stopping rule and Steps∗ is the maximum number of steps in the steps stopping rule.
Invocation: C.Mstop_method = ’hybrid’
Control fields: None additional
Monitor fields: None additional
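The declining threshold is easy to state in code. A Python sketch of the rule (RNE∗ and Steps∗ are the control values; this is an illustration, not SABL code):

```python
def hybrid_rne_threshold(s, rne_star, steps_star):
    """RNE threshold after s iterations under the hybrid rule.

    Declines linearly from rne_star at s = 0 to 0 at s = steps_star, which
    guarantees the M phase terminates within steps_star iterations."""
    return rne_star * (steps_star - s) / steps_star

def hybrid_stop(avg_rne, s, rne_star, steps_star):
    """Stop once the averaged RNE reaches the declining threshold."""
    return avg_rne >= hybrid_rne_threshold(s, rne_star, steps_star)
```

Because the threshold reaches zero at s = Steps∗, the hybrid rule combines the guaranteed termination of the steps rule with the adaptivity of the RNE rule.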

4.4 Adding a new variant of the algorithm to the SABL toolbox

Algorithm variants are manifest in the code as if / elseif blocks in the functions c_Cphase, c_Sphase and c_Mphase. In each case there is a field with a string indicating the variant (e.g., C.Cphase_method = ’anneal_Bayes’) and a function that invokes the variant (e.g., c_Cphase_anneal_Bayes(stage)). The functions c_Cphase, c_Sphase and c_Mphase recognize the string ’custom’ for the field, and control then passes to a function for that variant, e.g. p_Cphase_custom(stage) for the C phase and similarly for the S and M phases. The p_ prefix recognizes the fact that a customized variant might be project specific. It also facilitates initial work on any variant that is a candidate for future inclusion in the SABL algorithm and toolbox, at which point an additional elseif block would be added at the appropriate place in c_Cphase, c_Sphase or c_Mphase.

Stopping rule variants are manifest in the code as if / elseif blocks in the functions c_Cphase and c_Mphase. The string ’custom’ is recognized for the C and M phase stopping rules, and control then passes to p_Cstop_custom and p_Mstop_custom respectively.

5 Models in the SABL toolbox

The SABL algorithm provides for Bayesian inference with any likelihood function and maximization of any objective function, in each case subject to some weak conditions described early in Section 2. The SABL toolbox exploits this structure to make it simple to use SABL for Bayesian inference in different models or maximization of different objective functions.

5.1 Prior distributions

There are two ways to specify the prior distribution in a model. Most models have one or more options for SABL standard prior distributions (Section 5.1.2). A particular SABL standard prior distribution is assigned using a field of the form M.prior.type; the structure M.prior may have a different name or names, depending on the model (Section 5.2), but the field type is the same across models. As explained in Section 5.2 there are additional fields in M.prior. SABL standard prior distributions include some of the conjugate and conditionally conjugate forms often seen, for example normal and Wishart. Future editions of SABL will include more standard prior distributions.

5.1.1 General considerations for prior distributions

Prior distributions are essential in SABL. This is clear from the mechanics outlined at the start of Section 2: the first step in the algorithm is to draw particles, independently, from a prior distribution. More fundamentally, SABL models Bayesian learning, which is founded on the probabilistic expression of uncertainty in all circumstances.

The theory of Bayesian learning works in the same way for prior distributions that are diffuse as for those that are more specific, so long as they are all proper; i.e., the degree of entropy is irrelevant so long as it is finite. But this theory is asymptotic in N, the number of particles per group, and this has significant implications for practice.

For any given number of particles J · N, successively more diffuse priors will lead to slower convergence in the form of more cycles early in the algorithm, and eventually will lead to failure, usually manifest as RNE that declines as cycles proceed, well below the M phase stopping criterion. Worse, it can produce an incorrect result if k∗(θ) is multimodal. The same is true for a given prior distribution with successive reduction in the number of particles. It is essential to bear in mind that convergence of the SABL algorithm is asymptotic in N, and its applicability depends on the structure of information, including the prior distribution. Before making significant decisions about a research project, including public reports, it is prudent to increase the number of particles by (say) a factor of 8 and assess the effect. The ability to do this can be a very significant advantage of GPU hardware platforms in the case of large and complex problems.

5.1.2 SABL standard prior distributions

These are the standard prior distributions currently offered in SABL. Coming editions will include substantially more.

Multivariate normal distribution The multivariate normal prior distribution for the model parameter vector or a subvector of the parameters β is invoked by a generic field M.prior.type = ’normal’. The term “generic field” means that the string ’normal’ always denotes this specification, but the field prior can assume different names in different models. Some models may have several such fields corresponding to different subvectors of θ; the specifics are given in the model descriptions in Section 5.2.

If M.prior.type = ’normal’ then there must be two additional fields in M.prior: the form of the prior specification in the string M.prior.spec, and the parameters of the prior distribution, sometimes called the hyperparameters of the prior specification, in a cell array M.prior.hp. The names of these subfields, spec and hp, are the same regardless of the model-specific version of the parent structure name prior. The contents of M.prior.hp depend on the string M.prior.spec.

• M.prior.spec = ’mean-variance’: β ∼ N(µ, Σ); M.prior.hp{1} = µ, M.prior.hp{2} = Σ

• M.prior.spec = ’mean-precision’: β ∼ N(µ, H⁻¹); M.prior.hp{1} = µ, M.prior.hp{2} = H

• M.prior.spec = ’mean-std’: β ∼ N(µ, D²); M.prior.hp{1} = µ, M.prior.hp{2} = diag(D)

• M.prior.spec = ’linear_combinations’: Rβ ∼ N(r, D²); M.prior.hp{1} = r, M.prior.hp{2} = R, M.prior.hp{3} = diag(D)

• M.prior.spec = ’g-prior’: β ∼ N(µ, g(X′X)⁻¹); M.prior.hp{1} = µ, M.prior.hp{2} = g, M.prior.hp{3} = X

The function u_prior_gauss provides some further technical detail.
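A sketch of what such a mapping must do, in Python with NumPy (illustrative only; the argument conventions of the real u_prior_gauss are documented by Matlab's help): every specification in the list above reduces to a common mean and variance matrix for the Gaussian prior.

```python
import numpy as np

def gauss_prior_to_standard(spec, hp):
    """Reduce the 'mean-variance', 'mean-precision', 'mean-std' and 'g-prior'
    specifications listed above to a common (mean, variance-matrix) pair."""
    if spec == 'mean-variance':
        mu, sigma = hp
        return np.asarray(mu, float), np.asarray(sigma, float)
    if spec == 'mean-precision':
        mu, h = hp                      # variance is the inverse precision
        return np.asarray(mu, float), np.linalg.inv(np.asarray(h, float))
    if spec == 'mean-std':
        mu, d = hp                      # d holds diag(D), the standard deviations
        return np.asarray(mu, float), np.diag(np.asarray(d, float) ** 2)
    if spec == 'g-prior':
        mu, g, x = hp
        x = np.asarray(x, float)
        return np.asarray(mu, float), g * np.linalg.inv(x.T @ x)
    raise ValueError('unknown prior specification: ' + spec)
```

For instance, the 'mean-std' specification with standard deviations (1, 2) reduces to the diagonal variance matrix diag(1, 4).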

Wishart distribution The Wishart prior distribution for a positive definite matrix X (p × p) is invoked by a generic field M.prior.type = ’Wishart’. The term “generic field” means that the string ’Wishart’ always denotes this specification, but the field prior can assume different names in different models. Some models may have several such fields corresponding to different subvectors of θ; the specifics are given in the model descriptions in Section 5.2.


If M.prior.type = ’Wishart’ then there must be two additional fields in M.prior: the form of the prior specification in the string M.prior.spec, and the parameters of the prior distribution, sometimes called the hyperparameters of the prior specification, in a cell array M.prior.hp. The names of these subfields, spec and hp, are the same regardless of the model-specific version of the parent structure name prior. The contents of M.prior.hp depend on the string M.prior.spec.

• M.prior.spec = ’variance’: X ∼ Wp(V, n); M.prior.hp{1} = V, M.prior.hp{2} = n.

• M.prior.spec = ’precision’: X ∼ Wp(V⁻¹, n); M.prior.hp{1} = V, M.prior.hp{2} = n.

The function u_prior_wishart provides some further technical detail.

5.1.3 Direct specification of prior distributions

Most SABL models also permit fully customized prior distributions, by means of the field M.customprior = true, which is available to every model. This option requires that the user provide two functions, p_initialsim and p_initial. The function p_initialsim has no input or output arguments. All inputs are by means of globals, and for reasons given elsewhere in this Guide it is highly recommended that fields of the P structure be employed for this purpose. The output is the field Cpar.theta, a C.JNwork × C.parameters array of particles, drawn independently and identically (down the rows) from the prior distribution. The function p_initial has a single input argument theta, whose structure is exactly the same as that of Cpar.theta. It has a single output argument logp, a C.JNwork × 1 array. Each row of logp is the log prior density evaluated at the corresponding row of theta.

The second option requires due diligence of two kinds. The first is the obvious point that the code should work correctly. SABL provides two utility functions, u_pfcheck and u_pdfcheck, to facilitate checking consistency between p_initialsim and p_initial. The second caution is that both functions must respect SABL’s placement of particles on threads (i.e., Matlab workers; relevant if E.pus > 1) as well as on the GPU(s) (relevant if E.gpu = true). Any user exercising this option is strongly encouraged to use the SABL utilities designed for this kind of management, in particular u_setdatatype and u_allocate. Failure to do this can cause a terminal error; worse, it can completely cripple the efficiency of GPU execution and can substantially reduce the efficiency of execution when E.pus > 1. The fastest learning curve is to study the functions m_initialsim and m_initial in some of the model folders. In most cases, employing C/CUDA code when E.gpu = true makes little difference, unless assessment of the prior is computationally intensive (as could happen, for example, in computing the equilibrium of an economic model).


5.2 Models in this edition of the SABL toolbox

These are the models currently offered in SABL. Coming editions will include substantially more.

5.2.1 Toy normal model 1 (’toynorm1’)

To invoke the model, use SABL(’toynorm1’) from the project directory, or SABL(’toynorm1’, [project_directory]) from any directory.

Observables distribution The outcomes yt (t = 1, . . . , T ) are independent:

yt ∼ i.i.d. N(µ, 1).

The parameter vector in the SABL algorithm is θ = µ.

• Model specification control field:

— M.y: T × 1 vector

• Default handling:

— If M.y is empty then it is set to a 100 × 1 vector with yt ∼ i.i.d. N(0, 1)

Prior distributions

• SABL standard prior distribution normal: M.prior_mu.type = ’normal’

Default handling:

If M.prior_mu.type is not a field or if M.prior_mu.type is empty, then SABL sets M.prior_mu.type = ’normal’. Then if the field M.prior_mu.spec is absent or empty, or if M.prior_mu.type = ’normal’ and M.prior_mu.spec is empty, SABL sets M.prior_mu.spec = ’mean-std’. The prior mean is 0 and the prior standard deviation is 1.

• SABL custom prior distribution

See Section 5.1.3.

M phase RNE functions The M phase RNE functions are µ, µ² and µ³.

5.2.2 Toy normal model 2 (’toynorm2’)

To invoke the model, use SABL(’toynorm2’) from the project directory, or SABL(’toynorm2’, [project_directory]) from any directory.


Observables distribution The outcomes yt (t = 1, . . . , T) are independent:

yt ∼ i.i.d. N(µ, σ²).

The parameter vector in the SABL algorithm is θ′ = (µ, log σ²).

• Model specification control field:

— M.y: T × 1 vector

• Default handling:

— If M.y is empty then it is set to a 100 × 1 vector with yt ∼ i.i.d. N(0, 1)

Prior distributions

• SABL standard prior distribution normal: M.prior_theta.type = ’normal’. With this prior type only M.prior_theta.spec = ’mean-std’ is permitted.

Default handling:

If M.prior_theta.type is not a field or if M.prior_theta.type is empty, then SABL sets M.prior_theta.type = ’normal’. Then if the field M.prior_theta.spec is absent or empty, or if M.prior_theta.type = ’normal’ and M.prior_theta.spec is empty, SABL sets M.prior_theta.spec = ’mean-std’. The means and standard deviations correspond to θ ∼ N(0, I2).

• SABL custom prior distribution

See Section 5.1.3.

M phase RNE functions The M phase RNE functions are θ1, θ2, θ1² and θ2².

5.2.3 The normal model (’normal’)

To invoke the normal model, use SABL(’normal’) from the project directory, or SABL(’normal’, [project_directory]) from any directory.

Observables distribution Let xt and zt be vectors of covariates and yt the corresponding outcome. The outcomes yt (t = 1, . . . , T) are independent conditional on xt (t = 1, . . . , T):

yt ∼ N(β′xt, exp(γ′zt)).

The parameter vector θ in the SABL algorithm is θ′ = (β′, γ′).
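For concreteness, the log likelihood implied by this observables distribution can be sketched in Python (an illustration, not the toolbox's implementation; column indices are 0-based here, whereas the pointers below are 1-based Matlab indices):

```python
import math

def normal_loglik(beta, gamma, rows, x_cols, z_cols, y_col):
    """Log likelihood of y_t ~ N(beta'x_t, exp(gamma'z_t)) over data rows."""
    ll = 0.0
    for row in rows:
        mean = sum(b * row[j] for b, j in zip(beta, x_cols))
        logvar = sum(g * row[j] for g, j in zip(gamma, z_cols))
        resid = row[y_col] - mean
        ll -= 0.5 * (math.log(2 * math.pi) + logvar
                     + resid * resid * math.exp(-logvar))
    return ll
```

Note that working with γ′zt as the log variance keeps the variance positive for any parameter value, which is why θ′ = (β′, γ′) is unrestricted.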

• Model specification control fields:

— M.data: T × d data matrix

— M.x_pointer: A length kx vector pointing to columns of M.data (defines xt)

— M.z_pointer: A length kz vector pointing to columns of M.data (defines zt)

— M.y_pointer: A scalar pointing to a column of M.data (defines yt)

• Default handling:

— If M.data is empty then it is set to a 500 × 4 matrix: column 1 units, columns 2, 3 and 4 independent and each i.i.d. N(0, 1); then M.x_pointer = [1, 2], M.z_pointer = [1, 3] and M.y_pointer = 4.

— There are no other defaults in the current implementation of the normal model. Thus, for example, if M.x_pointer, M.z_pointer or M.y_pointer is empty but M.data is not, SABL terminates with error.

Prior distributions

• SABL standard prior distribution normal: M.prior_beta.type = ’normal’, which pertains to the entire vector β, and M.prior_gamma.type = ’normal’, which pertains to the entire vector γ. The other fields of M.prior_beta and M.prior_gamma are those described in Section 5.1.2.

Default handling:

If M.prior_beta.type is not a field or if M.prior_beta.type is empty, then SABL sets M.prior_beta.type = ’normal’. Then if the field M.prior_beta.spec is absent or empty, or if M.prior_beta.type = ’normal’ and M.prior_beta.spec is empty, SABL sets M.prior_beta.spec = ’mean-std’. The mean of the intercept (coefficient corresponding to a column of units) is then log(ȳ), where ȳ is the sample mean of yt, and all other elements of the prior mean of β are zero. Except for the intercept the prior standard deviation of each coefficient is std(βj) = log(2)/[2 std(xj)]. The prior standard deviation of the intercept is {var(y)/ȳ² + Σj var(βj) x̄j² / (k − 1)}^(1/2), where the summation in j is over all k terms except the intercept.

If M.prior_gamma.type is not a field or if M.prior_gamma.type is empty, then SABL sets M.prior_gamma.type = ’normal’. Then if the field M.prior_gamma.spec is absent or empty, or if M.prior_gamma.type = ’normal’ and M.prior_gamma.spec is empty, SABL sets M.prior_gamma.spec = ’mean-std’. The mean of the intercept (coefficient corresponding to a column of units) is then log[var(y)], where var(y) is the sample variance of yt, and all other elements of the prior mean of γ are zero. Except for the intercept the prior standard deviation of each coefficient is std(γj) = log(2)/[2 std(zj)]. The prior standard deviation of the intercept is {var(y)/ȳ² + Σj var(γj) z̄j² / (k − 1)}^(1/2), where the summation in j is over all k terms except the intercept.

• SABL custom prior distribution

See Section 5.1.3.

M phase RNE functions There are two M phase RNE functions, β′x and γ′z.

5.2.4 The Poisson model (’poisson’)

To invoke the Poisson model, use SABL(’poisson’) from the project directory, or SABL(’poisson’, [project_directory]) from any directory.

Observables distribution Let xt be a vector of covariates and yt the corresponding outcome. The outcomes yt (t = 1, . . . , T) are independent conditional on xt (t = 1, . . . , T):

yt ∼ Poisson(λt), log(λt) = β′xt, where xt is k × 1.

The parameter vector θ in the SABL algorithm is θ = β.

• Model specification control fields:

— M.data: T × d data matrix

— M.x_pointer: A length k vector pointing to columns of M.data (defines xt)

— M.y_pointer: A scalar pointing to a column of M.data (defines yt)

• Default handling:

— If M.data is empty then it is set to a 500 × 3 matrix: column 1 units, column 2 i.i.d. N(0, 1) and column 3 i.i.d. Poisson(5); then M.x_pointer = [1, 2] and M.y_pointer = 3.

— There are no other defaults in the current implementation of the Poisson model. Thus, for example, if M.x_pointer or M.y_pointer is empty but M.data is not, SABL terminates with error.

Prior distributions

• SABL standard prior distribution normal: M.prior_beta.type = ’normal’, which pertains to the entire vector β. The other fields of M.prior_beta are those described in Section 5.1.2.

Default handling: If M.prior_beta.type is not a field or if M.prior_beta.type is empty, then SABL sets M.prior_beta.type = ’normal’. Then if the field M.prior_beta.spec is absent or empty, or if M.prior_beta.type = ’normal’ and M.prior_beta.spec is empty, SABL sets M.prior_beta.spec = ’mean-std’. The mean of the intercept (coefficient corresponding to a column of units) is then log(ȳ), where ȳ is the sample mean of yt, and all other elements of the prior mean of β are zero. Except for the intercept the prior standard deviation of each coefficient is std(βj) = log(2)/[2 std(xj)]. The prior standard deviation of the intercept is {var(y)/ȳ² + Σj var(βj) x̄j² / (k − 1)}^(1/2), where the summation in j is over all k terms except the intercept.

• SABL custom prior distribution

See Section 5.1.3.
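The default prior construction described above can be sketched as follows (Python, illustrative only; the helper name is not SABL's). It assumes the first covariate column is the column of units and that there are at least two covariates.

```python
import math
import statistics

def default_beta_prior(x_cols, y):
    """Default 'mean-std' prior for the Poisson coefficients, per the rule above.

    x_cols: list of covariate columns, x_cols[0] the column of units.
    Returns (prior means, prior standard deviations) for beta."""
    k = len(x_cols)
    ybar = statistics.fmean(y)
    means = [math.log(ybar)] + [0.0] * (k - 1)   # intercept mean log(ybar)
    stds = [0.0] * k
    for j in range(1, k):
        stds[j] = math.log(2.0) / (2.0 * statistics.stdev(x_cols[j]))
    # intercept std: {var(y)/ybar^2 + sum_j var(beta_j) xbar_j^2 / (k-1)}^(1/2)
    acc = sum(stds[j] ** 2 * statistics.fmean(x_cols[j]) ** 2
              for j in range(1, k)) / (k - 1)
    stds[0] = math.sqrt(statistics.variance(y) / ybar ** 2 + acc)
    return means, stds
```

The scale log(2)/[2 std(xj)] says that moving xj by two standard deviations changes the conditional mean by at most a factor of two, a priori, with high probability.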

M phase RNE functions There is a single M phase RNE function, β′x.

5.2.5 The negative binomial model (’negative_binomial’)

To invoke the negative binomial model, use SABL(’negative_binomial’) from the project directory, or SABL(’negative_binomial’, [project_directory]) from any directory.

Observables distribution Let xt be a vector of covariates and yt the corresponding outcome. The outcomes yt (t = 1, . . . , T) are independent conditional on xt (t = 1, . . . , T):

yt ∼ NB(rt, pt),

log E(yt) = log[rt pt / (1 − pt)] = β′xt, where xt is kx × 1,

log[var(yt)/E(yt) − 1] = log[pt / (1 − pt)] = γ′zt, where zt is kz × 1.

From the second equation, log(pt⁻¹ − 1) = −γ′zt, so pt⁻¹ = exp(−γ′zt) + 1 and pt = [1 + exp(−γ′zt)]⁻¹. The function of moments var(yt)/E(yt) − 1 is the overdispersion coefficient, which has support (0, ∞) in the negative binomial distribution. (In the Poisson distribution the overdispersion coefficient is 0.) The inverse mapping is therefore rt = exp(β′xt − γ′zt), pt = [1 + exp(−γ′zt)]⁻¹.

The parameter vector θ in the SABL algorithm is θ′ = (β′, γ′).
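The mapping between the linear indices and the NB(r, p) parameters, and back to the first two moments, can be sketched in Python (illustrative only, using the parameterization of the equations above):

```python
import math

def nb_params(beta_x, gamma_z):
    """Map beta'x_t (log mean) and gamma'z_t (log overdispersion) to (r_t, p_t),
    following the inverse mapping in the text."""
    p = 1.0 / (1.0 + math.exp(-gamma_z))
    r = math.exp(beta_x - gamma_z)
    return r, p

def nb_moments(r, p):
    """Mean r p/(1-p) and variance mean/(1-p) of NB(r, p) in this parameterization."""
    mean = r * p / (1.0 - p)
    return mean, mean / (1.0 - p)
```

With gamma_z = 0 the overdispersion p/(1 − p) equals 1, so the variance is exactly twice the mean, matching the moment equations above.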

• Model specification control fields:

— M.data: T × d data matrix.— M.x_pointer: A length kx vector pointing to columns of M.data (defines xt)

— M.z_pointer: A length kz vector pointing to columns of M.data (defines zt)

47

Page 48: Guide to SABL 2014a · The SABL algorithm is a generalization of adaptive posterior simulators described in Durham and Geweke (2015). That work is motivated by the pleasingly parallel

— M.y_pointer: A scalar pointing to a column of M.data (defines y_t)

• Default handling:

— If M.data is empty then it is set to a 500 × 4 matrix: column 1 units, columns 2 and 3 independent and each i.i.d. N(0, 1), column 4 i.i.d. NB(5, 0.5) (mean 10, variance 20, overdispersion 1); then M.x_pointer = [1, 2], M.z_pointer = [1, 3] and M.y_pointer = 4.

— There are no other defaults in the current implementation of the negative binomial model. Thus, for example, if M.x_pointer or M.y_pointer is empty but M.data is not, SABL terminates with an error.
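As a rough illustration of the layout of the default simulated data matrix, the sketch below builds a 500 × 4 matrix with the same structure. This is not SABL's own code: the Guide does not specify the random number generator, and NumPy's negative_binomial uses the (n, p) count parameterization, so the draws here only approximate the stated default.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500

data = np.column_stack([
    np.ones(T),                        # column 1: units (intercept)
    rng.standard_normal(T),            # column 2: i.i.d. N(0,1)
    rng.standard_normal(T),            # column 3: i.i.d. N(0,1)
    rng.negative_binomial(5, 0.5, T).astype(float),  # column 4: NB outcome
])

# Pointers as in the default handling above (1-based, as in Matlab)
x_pointer = [1, 2]   # columns defining x_t
z_pointer = [1, 3]   # columns defining z_t
y_pointer = 4        # column defining y_t

assert data.shape == (500, 4)
assert np.all(data[:, 0] == 1.0)
```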

Prior distributions

• SABL standard prior distribution: normal

M.prior_beta.type = ’normal’, which pertains to the entire vector β, and M.prior_gamma.type = ’normal’, which pertains to the entire vector γ. The other fields of M.prior_beta and M.prior_gamma are those described in Section 5.1.2.

Default handling:

If M.prior_beta.type is not a field or if M.prior_beta.type is empty, then SABL sets M.prior_beta.type = ’normal’. Then if the field M.prior_beta.spec is absent or empty, SABL sets M.prior_beta.spec = ’mean-std’. The mean of the intercept (coefficient corresponding to a column of units) is then log(ȳ), where ȳ is the sample mean of y_t, and all other elements of the prior mean of β are zero. Except for the intercept, the prior standard deviation of each coefficient is std(β_j) = log(2) / [2 std(x_j)]. The prior standard deviation of the intercept is {var(y)/ȳ² + Σ_j var(β_j) x̄_j² / (k − 1)}^{1/2}, where the summation in j is over all k terms except the intercept.

If M.prior_gamma.type is not a field or if M.prior_gamma.type is empty, then SABL sets M.prior_gamma.type = ’normal’. Then if the field M.prior_gamma.spec is absent or empty, SABL sets M.prior_gamma.spec = ’mean-std’. The mean of the intercept (coefficient corresponding to a column of units) is then log[var(y)], where var(y) is the sample variance of y_t, and all other elements of the prior mean of γ are zero. Except for the intercept, the prior standard deviation of each coefficient is std(γ_j) = log(2) / [2 std(z_j)]. The prior standard deviation of the intercept is {var(y)/ȳ² + Σ_j var(γ_j) z̄_j² / (k − 1)}^{1/2}, where the summation in j is over all k terms except the intercept.
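Taking the formulas above at face value, the default ’mean-std’ prior for β can be assembled mechanically from the data. The sketch below (Python, for illustration only; the function name and the simulated data are hypothetical, and column 0 is assumed to be the intercept) shows the construction:

```python
import numpy as np

def default_beta_prior(X, y):
    """Default 'mean-std' prior for beta as described above.
    Column 0 of X is assumed to be the intercept (units)."""
    T, k = X.shape
    ybar = y.mean()
    mean = np.zeros(k)
    mean[0] = np.log(ybar)            # intercept prior mean: log(ybar)
    std = np.zeros(k)
    # Non-intercept coefficients: std(beta_j) = log(2) / [2 std(x_j)]
    std[1:] = np.log(2.0) / (2.0 * X[:, 1:].std(axis=0))
    # Intercept: {var(y)/ybar^2 + sum_j var(beta_j) xbar_j^2 / (k-1)}^{1/2}
    xbar = X[:, 1:].mean(axis=0)
    std[0] = np.sqrt(y.var() / ybar**2
                     + np.sum(std[1:]**2 * xbar**2) / (k - 1))
    return mean, std

# Hypothetical data: intercept column plus one standard normal covariate
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.standard_normal(200)])
y = rng.poisson(5.0, 200).astype(float)
m, s = default_beta_prior(X, y)
assert np.isclose(m[0], np.log(y.mean()))
assert np.all(s > 0)
```

The same recipe, with z_j in place of x_j, gives the default prior for γ.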


• SABL custom prior distribution

See Section 5.1.3.

M phase RNE functions There are two M phase RNE functions, β′x and γ′z.

5.2.6 The EGARCH model (’egarch’)

To invoke the EGARCH model, use SABL(’egarch’) from the project directory, or SABL(’egarch’, [project_directory]) from any directory.

Observables distribution Let y_t (t = 1, . . . , T) denote an observed sequence of asset returns. The EGARCH model is

y_t = µ_Y + σ_Y exp( Σ_{k=1}^K v_{kt} / 2 ) ε_t,

where µ_Y = E(y_t) and σ_Y² = var(y_t). The K volatility factors are

v_{kt} = α_k v_{k,t−1} + β_k ( |ε_{t−1}| − (2/π)^{1/2} ) + γ_k ε_{t−1}   (k = 1, . . . , K).

The disturbance terms ε_t are independently and identically distributed as a full mixture of I normal distributions,

p(ε_t) = Σ_{i=1}^I p_i φ(ε_t; µ_i, σ_i²);   E(ε_t) = 0,  var(ε_t) = 1.

The algorithm parameter vector is θ′ = (θ^(1), θ^(2), θ^(3)′, . . . , θ^(8)′). The mapping from algorithm parameters to model parameters begins with

µ_Y = θ^(1)/100,  σ_Y = exp(θ^(2)).

Then for k = 1, . . . , K,

α_k = tanh(θ^(3)_k),  β_k = exp(θ^(4)_k),  γ_k = θ^(5)_k.

For i = 1, . . . , I,

p*_i = tanh(θ^(6)_i),  µ*_i = θ^(7)_i,  σ*_i = exp(θ^(8)_i),

and then

p_i = p*_i / Σ_{j=1}^I p*_j,   µ**_i = µ*_i − Σ_{j=1}^I p_j µ*_j,

c = { Σ_{j=1}^I p_j [ (µ**_j)² + (σ*_j)² ] }^{−1/2},

µ_i = c µ**_i,  σ_i = c σ*_i.
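The last three displays are the normalization that enforces E(ε_t) = 0 and var(ε_t) = 1: the raw means are centered, and the scale constant c is computed from the centered means µ**_j (as the unit-variance constraint requires). A Python sketch of this step, with hypothetical raw parameter values chosen so the tanh weights are positive (SABL itself is Matlab code):

```python
import math

def mixture_params(theta6, theta7, theta8):
    """Map raw mixture parameters to (p_i, mu_i, sigma_i) with the
    normalization E(eps) = 0, var(eps) = 1 described above."""
    p_star = [math.tanh(t) for t in theta6]
    mu_star = list(theta7)
    sig_star = [math.exp(t) for t in theta8]
    s = sum(p_star)
    p = [ps / s for ps in p_star]                 # weights sum to one
    mbar = sum(pi * mi for pi, mi in zip(p, mu_star))
    mu_cc = [mi - mbar for mi in mu_star]         # centered means mu**
    c = sum(pi * (mcc**2 + sg**2)
            for pi, mcc, sg in zip(p, mu_cc, sig_star)) ** -0.5
    mu = [c * mcc for mcc in mu_cc]
    sig = [c * sg for sg in sig_star]
    return p, mu, sig

p, mu, sig = mixture_params([0.5, 0.8, 1.2], [-0.3, 0.1, 0.4],
                            [-0.2, 0.0, 0.3])
mean = sum(pi * mi for pi, mi in zip(p, mu))
var = sum(pi * (mi**2 + si**2) for pi, mi, si in zip(p, mu, sig))
assert abs(mean) < 1e-12       # E(eps) = 0
assert abs(var - 1.0) < 1e-12  # var(eps) = 1
```

Because the means are centered before scaling, the mixture mean is zero by construction, and dividing both means and standard deviations by the second-moment norm fixes the variance at one.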


• Model specification control fields

— M.y T × 1 vector of observed returns

— M.K Number of volatility states K

— M.I Number of mixture components I for εt

• Default handling

If the field M.K is missing then M.K = 2, and if the field M.I is missing then M.I = 3. There is no other default handling for model specification control fields.

Prior distributions In the prior distribution the algorithm parameters θ are mutually independent and normally distributed.

• Prior specification control fields: all are 1 × 2 vectors

— M.prior.muY Prior mean and standard deviation of θ(1) (1× 2)

— M.prior.sigmaY Prior mean and standard deviation of θ(2) (1× 2)

— M.prior.alpha Common mean and standard deviation of elements of θ(3)

— M.prior.beta Common mean and standard deviation of elements of θ(4)

— M.prior.gamma Common mean and standard deviation of elements of θ(5)

— M.prior.mixp Common mean and standard deviation of elements of θ(6)

— M.prior.mixmu Common mean and standard deviation of elements of θ(7)

— M.prior.mixsigma Common mean and standard deviation of elements of θ(8)

• Default handling

Missing fields of M.prior are replaced as follows:

— M.prior.muY = [0, 1]

— M.prior.sigmaY = [log(0.01), 1]

— M.prior.alpha = [atanh(0.95), 1]

— M.prior.beta = [log(0.1), 1]

— M.prior.gamma = [0, 0.2]

— M.prior.mixp = [0, 1]

— M.prior.mixmu = [0, 1]

— M.prior.mixsigma = [0, 1]

• SABL custom prior distribution

See Section 5.1.3.


M phase RNE functions There are two M phase RNE functions, θ(1) and θ(2).

5.3 Adding new models to the SABL toolbox

The SABL toolbox accommodates the incorporation of new models. It is straightforward to develop a local research library of models that fully interface with the SABL toolbox and can take advantage of all of its features. The SABL toolbox interface requires five functions for each model:

• m_message Log likelihood or objective function

• m_initial Log prior probability density or log instrumental distribution density

• m_initialsim Random draw from density specified by m_initial

• m_monitor Interface between c_monitor and m_monitor (see Section 3.2)

• m_rnefunctions RNE test functions used in the M phase

New model developers should consult the model directories in the SABL toolbox. Files in these directories provide a great deal of insight into the management of execution, especially the development of code that efficiently uses GPUs or multiple workers. As with all code development projects, there are substantial advantages to modularity, and all but the simplest models will develop other functions that are invoked directly or indirectly by these five functions. These functions are best placed in their own directory, one directory for each model.

As in any Matlab execution, it is necessary to communicate the location to Matlab, for example by means of an addpath command, unless the directory for the model is the current working directory. Execution also requires a p_monitor function. Given these functions and correct modification of the Matlab path, SABL can be executed by the command SABL with no input arguments.

Beyond these few accommodations, the development of private research code in the context of the SABL toolbox can proceed as it would in any project using Matlab. It is easy to develop libraries of research model code that execute using the SABL toolbox.

Developing code to the standards of the SABL Matlab toolbox entails quite a bit more, including any modification of core default fields in the C structure, documentation of the M structure in a corresponding Mhelp.txt file, testing with code provided in a testcode subdirectory, and at least one example project. The developers of the SABL toolbox actively encourage such contributions and are happy to consider contributions that meet these standards.


References

Douc R, Cappé O, Moulines E. 2005. Comparison of resampling schemes for particle filtering. 4th International Symposium on Image and Signal Processing and Analysis (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.126.9118)

Durham G, Geweke J. 2015. Adaptive sequential posterior simulators for massively parallel computing environments. Advances in Econometrics, forthcoming.

Geweke J, Frischknecht B. 2014. Exact optimization by means of sequentially adaptive Bayesian learning. Working paper.


EXHIBIT 1: Essential SABL function

c_monitor('open');
if isfield(C, 'simulation_get')
    load(C.simulation_get);
    C.tfirst = C.tlast + 1;
else
    C.tfirst = 1;
end
C.passone = true;
while ~C.runcomplete % For pass one and pass two
    c_monitor('startrun');
    if E.pus > 1
        c_sabl_spmd; % spmd block
    else
        c_sabl;
    end
    c_monitor('endrun');
    if C.twopass && C.passone
        C.passone = false;
    else
        C.runcomplete = true;
    end
end
c_monitor('close')
if isfield(C, 'simulation_record')
    fname = C.simulation_record;
    C = rmfield(C, 'simulation_record');
    save(fname, 'C', 'E', 'M', 'P', 'Cpar', 'Epar', 'Mpar', 'Ppar');
end
if nargout > 0
    allstructures = struct('C', C, 'E', E, 'M', M, 'P', P, ...
        'Cpar', Cpar, 'Epar', Epar, 'Mpar', Mpar, 'Ppar', Ppar);
end
path(savedpath);
end


EXHIBIT 2: Essential c_sabl function

function c_sabl
global C Cpar
if ~C.simget
    m_initialsim;
end
c_monitor('initialize');
c_mpl('initialize');
C.cycle = 0;
C.Cphase.total = 0; C.Mphase.total = 0; C.moreinfo = true;
% Cycles of the SABL algorithm ---------------------
while C.moreinfo
    C.cycle = C.cycle + 1;
    c_monitor('startcycle');
    Cpar.logw = u_allocate([C.JNwork,1], 'default', 'zeros');
    c_monitor('startCphase');
    c_Cphase('startCphase');
    c_mpl('startCphase');
    C.Cphase.count = 0; moreCsteps = true;
    while moreCsteps && C.moreinfo
        C.Cphase.total = C.Cphase.total + 1;
        C.Cphase.count = C.Cphase.count + 1;
        moreCsteps = c_Cphase('whileCphase');
        c_monitor('whileCphase');
        c_mpl('whileCphase');
    end
    c_monitor('endCphase');
    c_monitor('startSphase');
    c_Sphase('startSphase');
    c_monitor('endSphase');
    c_monitor('startMphase');
    C.Mphase.count = 0; moreMsteps = true;
    while moreMsteps % Iterations of the M phase
        C.Mphase.total = C.Mphase.total + 1;
        C.Mphase.count = C.Mphase.count + 1;
        moreMsteps = c_Mphase('whileMphase');
        c_monitor('whileMphase')
    end
    c_monitor('endMphase');
end
c_mpl('finish');
c_monitor('finish')


EXHIBIT 3: c_sabl_spmd function

function c_sabl_spmd
% This function is intermediate between the SABL shell function and
% the function c_sabl that executes the SABL algorithm. This enables
% code to be written without regard to whether execution is single
% thread or multiple thread. This function is invoked if E.pus > 1.
global C Cpar E Epar M Mpar P Ppar
spmd
    % Transfer an identical copy of C, E, M and P to each worker,
    % and restore Cpar, Epar, Mpar and Ppar on workers if C.simget = true
    u_copyintospmd(C, E, M, P, Cpar, Epar, Mpar, Ppar);

    % Execute the SABL algorithm
    c_sabl

    % Transfer C, E, M and P from one worker to local CPU memory,
    % and transfer Cpar, Epar, Mpar and Ppar from all workers
    % to local CPU memory.
    [Coutspmd, Eoutspmd, Moutspmd, Poutspmd, ...
        Cparoutspmd, Eparoutspmd, Mparoutspmd, Pparoutspmd] = u_copyfromspmd;
end
% Convert identical copies of C, Cpar, E, Epar, M, Mpar, P, and Ppar
% global structures from the workers to single global structures
% on the client.
C = Coutspmd{1};
E = Eoutspmd{1};
M = Moutspmd{1};
P = Poutspmd{1};
Cpar = Cparoutspmd{1};
Epar = Eparoutspmd{1};
Mpar = Mparoutspmd{1};
Ppar = Pparoutspmd{1};


EXHIBIT 4: u_copyintospmd and u_copyfromspmd functions

function u_copyintospmd(Cin, Ein, Min, Pin, ...
    Cparin, Eparin, Mparin, Pparin)
global C Cpar E Epar M Mpar P Ppar
C = Cin;
E = Ein;
M = Min;
P = Pin;
if C.simget
    Cpar = Cparin;
    Epar = Eparin;
    Mpar = Mparin;
    Ppar = Pparin;
end

function [Cout, Eout, Mout, Pout, ...
    Cparout, Eparout, Mparout, Pparout] = u_copyfromspmd
global C Cpar E Epar M Mpar P Ppar
Cout = C;
Eout = E;
Mout = M;
Pout = P;
Cparout = Cpar;
Eparout = Epar;
Mparout = Mpar;
Pparout = Ppar;
end


