
© 2012 Royal Statistical Society 0035–9254/12/61000

Appl. Statist. (2012) 61, Part 5, pp.

Fast linked analyses for scenario-based hierarchies

Daniel Williamson and Michael Goldstein

Durham University, UK

and Adam Blaker

National Oceanography Centre, Southampton, UK

[Received March 2011. Final revision December 2011]

Summary. When using computer models to provide policy support it is normal to encounter ensembles that test only a handful of feasible or idealized decision scenarios. We present a new methodology for performing multilevel emulation of a complex model as a function of any decision within a predefined class that makes specific use of a scenario ensemble of opportunity on a fast or early version of a simulator and a small, well-chosen, design on our current simulator of interest. The method exploits a geometrical approach to Bayesian inference and is designed to be fast, to facilitate detailed diagnostic checking of our emulators by allowing us to carry out many analyses very quickly. Our motivating application involved constructing an emulator for the UK Met Office Hadley Centre coupled climate model HadCM3 as a function of carbon dioxide forcing, which was part of a 'RAPID' programme deliverable to the UK Met Office funded by the Natural Environment Research Council. Our application involved severe time pressure as well as limited access to runs of HadCM3 and a scenario ensemble of opportunity on a lower resolution version of the model.

Keywords: Bayesian analysis; Computer models; Emulation; Policy support; Restricted inner product space; Scenario analysis

1. Introduction

Complex physical systems such as the Earth's climate are often studied by using computer models. In cases where policy makers can influence future states of complex systems, of key scientific interest is the provision of decision support by quantifying the uncertainties in future projections of the complex system for different policy decisions. For example, the Earth's climate may be influenced by the emissions policy for carbon dioxide (CO2), and to provide decision support we must make model-based uncertainty statements about future climates under different policy (see, for example, Linkov and Burmistrov (2003)). Throughout this paper we consider decisions that exist in the real world and are represented by potentially controllable input forcings to the computer models that are used to study the complex systems in which we are interested.

For each choice of input parameters, it takes a long time (typically from several weeks to months) to run computer models of the complex physical systems in which we are interested. When providing policy support using such simulators, the usual approach is to run the model at a handful of prechosen decision scenarios. For example, to date, model-based uncertainty analysis for future climate is based on scenario analyses. A small collection of idealized forcing scenarios is considered and an uncertainty analysis conditioned on each scenario is carried out.

Address for correspondence: Daniel Williamson, Department of Mathematical Sciences, Durham University, Durham, DH1 3LE, UK. E-mail: [email protected]


Perhaps the most high profile example of a scenario analysis using climate models is the 'CMIP3' project (Meehl et al., 2007), which collected scenario ensembles from the world's most advanced climate models. This data set has been used by many others (e.g. Hawkins and Sutton (2009)) and many of the analyses can be seen throughout the latest report from the Intergovernmental Panel on Climate Change (Solomon et al., 2007).

Studying any complex system under only a handful of decision scenarios is not ideal and is open to the criticism of not being policy relevant. A better form of policy support involves estimating the response of the complex system to any feasible decision. Previously, the long run times for state of the art computer models have made such analyses infeasible. However, recent statistical methods for analysing computer model output with emulators (see, for example, Craig et al. (1996) and Santner et al. (2003)) allow us to make belief statements about a computer model for any untried setting of its inputs by using a small, well-chosen, ensemble of model evaluations. An emulator for an expensive simulator of a complex system is a crucial first ingredient in new methods of providing policy support for any feasible decision (see Williamson and Goldstein (2011) and Williamson (2010) for details). In this paper we shall focus on the construction of this emulator under a particular set of common circumstances. We refer the readers who are interested in its use in policy support to Williamson and Goldstein (2011) and the references therein.

To use standard techniques to build an emulator that reflects our uncertainty about the computer model output at any setting of the inputs and decisions, we require a parametric representation of the decisions and an ensemble of model runs designed in such a way as to cover the subspace that is defined by the representation as well as possible. Ideally the parameterization will capture any feasible decision. However, realistically this parametric representation will limit the possible decisions under consideration to infinitely many possible decisions within a particular class. For the size of problems that we address in this paper, obtaining an ensemble of any reasonable size may not be possible, owing to either the time that it takes the model to complete a run, or the cost of obtaining each run. We shall assume that we may design and run a very small ensemble of runs on our computer simulator but that, taken on their own, these runs would be too few in number to build an emulator with.

When our most powerful simulator is too slow or expensive to obtain an appropriate ensemble for emulation, a common solution is to run a large ensemble on a relatively fast, but related, model and to emulate the fast version. An emulator can then be constructed for the full version by using the 'fast' emulator as a prior for the 'full' and performing a Bayesian update by using the small ensemble. This idea is known in the literature as multilevel emulation or multifidelity modelling. See Kennedy and O'Hagan (2000) and Cumming and Goldstein (2009) for details of this approach.

This method is ideal if we have access to a large, well-designed, ensemble on our fast simulator. For many applications, including our application, our fast simulator may represent an older version of our current model and may still take a long time to run. Although we may not be able to design a large ensemble on such a simulator, we may have access to an existing ensemble (an ensemble of opportunity). If the fast model is an older version of our current model, it would be quite natural to expect old ensembles to exist. However, it is likely that such an ensemble exists for only a handful of decision scenarios.

We define a 'scenario ensemble' to be a collection of model runs made at a very small number of specific decision scenarios. The nature of the scenario ensemble makes it an unusual data set in the field of multilevel emulation. Typically our fast ensemble would contain a large amount of information about large-scale behaviour across the whole parameter space (henceforth referred to as global behaviour) on the full simulator, but not much concerning small-scale variability in specific regions of parameter space (henceforth referred to as local variability). For the decision parameters at least, the scenario ensemble gives a large amount of information about local variability at a handful of locations in decision space, but no knowledge of the global behaviour away from those locations.

These features motivate a general need for techniques that are aimed at identifying a small number of meaningful qualitative types of relationships across collections of outputs, which lead to simple forms that can be extrapolated into decision space (that part of model parameter space relative to the decisions). We also require simple representations of the variability across decision space, reflecting the wealth of our local knowledge at each scenario point and allowing for the constraint that we shall not obtain much more variance information from the full simulator.

As important to us will be the need for our methods to be fast. Our scenario ensembles and emulators will be used to construct a prior for our current expensive simulator that will be updated by only a handful of runs. We must be able to perform extensive diagnostic checking, performing dozens, if not hundreds, of separate analyses designed to assess the effect of the forms of our prior specification and the robustness of our posterior. Our method exploits a geometrical interpretation of uncertainty specification and Bayesian inference, based on using Bayes linear methods to learn about complex objects in general inner product spaces, to achieve these goals. The resulting emulators may be fully probabilistic or second order (see Section 2), although the emulator that we construct in our application is of the latter type.

Our application was a time-pressured Natural Environment Research Council (NERC) funded 'RAPID'–Risk Analysis, Probability and Impacts Team (RAPIT) deliverable to the UK Met Office. The deliverable was an emulator for one of the Met Office climate simulators that modelled the output as a continuous function of CO2 forcing. To build this emulator we combine a pre-existing scenario ensemble on a low resolution version of the climate model with a very small ensemble on the main model. We introduce the application formally in Section 3.

In this paper we introduce a general methodology for tackling this type of problem, which we call 'fast linked analyses for scenario-based hierarchies' (FLASH). The method allows construction of an emulator for the output of a computer model that is a continuous function of both model input parameters and decision parameters, in the case where we have a large ensemble of runs at only a very small number of specific settings of the decision parameters, and access to a handful of further runs. In Section 2 we introduce emulators and the role of scenario analyses in FLASH. In Section 3 we introduce the application and conduct the required scenario analysis. In Section 4 we discuss using scenario emulators to extrapolate global relationships across scenario space. In Section 5 we present a fast geometric method for representing local variability across scenario space using the information in our scenario emulators. Section 6 discusses linking the models and concludes the application. Section 7 offers discussion.

The data that are analysed in the paper can be obtained from http://www.maths.dur.ac.uk/users/daniel.williamson/FLASHcode.html.

2. Emulators and scenario analyses

An emulator is a statistical model for a complex function. Typically, an emulator models a complex function, which is often slow and expensive to evaluate, as a random field that expresses our uncertainty about the function for any setting of the parameters. The complex functions that we consider here are the outputs of computer simulators, with model parameters x and decision parameters θ. Suppose that we denote our simulator f(x, θ); then an emulator for f(x, θ), which we denote f(x, θ), describes our uncertainty about the value of f(x, θ) for any x and θ, whether or not we have observed the value of the simulator there.


Suppose that f(x, θ) is a vector-valued function with p outputs. A typical form for an emulator of f(x, θ), for component f_i of f, is

$$ f_i(x, \theta) = \sum_j \beta_{ij}\, g_j(x, \theta) + \varepsilon_i(x, \theta), \qquad (1) $$

where g(x, θ) is a vector of known regressors in x and θ, β is a matrix of unknown coefficients and ε(x, θ) is a multivariate Gaussian process with known covariance function. Examples implementing such a model include Craig et al. (1997) and Kennedy and O'Hagan (2001). Given appropriate beliefs on {β, ε(x, θ)}, an emulator can be trained by Bayesian updating using an ensemble of simulator evaluations. Emulators offer a natural way to link hierarchies of models to each other. We elaborate on particular methods of linking models with emulators in Section 6, where we link the hierarchical climate models in our application.

2.1. Emulation techniques

Given the form in equation (1), an emulator is constructed by choosing the elements in g(x, θ) and a covariance function for ε(x, θ), then initializing the random field {β, ε(x, θ)} by performing a Bayesian update of our model using an ensemble of runs. Although construction of g(x, θ) and the covariance function of ε(x, θ) are common requirements for most emulation methods, the way in which the required random field should be induced and updated by data depends on the willingness and the ability of the analyst to make or obtain informed prior judgements.

Suppose that a Gaussian correlation function is chosen for ε(x, θ) and its covariance function c(·, ·) is characterized by parameters Ψ. Suppose that full prior probability distributions are available on {β, Ψ}. Then the random field π{f(x, θ)} with

$$ f(x, \theta)\,|\,\beta, \Psi \sim \mathrm{GP}\Big\{\sum_j \beta_j\, g_j(x, \theta),\; c(\cdot, \cdot)\Big\} $$

can be updated by an ensemble of runs F on f(x, θ) by using Bayes theorem and the decomposition

$$ \pi\{f(x, \theta), \beta, \Psi \,|\, F\} = \pi(\beta, \Psi \,|\, F)\, \pi\{f(x, \theta) \,|\, \beta, \Psi, F\}, $$

with

$$ f(x, \theta)\,|\,\beta, \Psi, F \sim \mathrm{GP}\{m^*(x, \theta),\, c^*(\cdot, \cdot)\} $$

and m*(·), c*(·, ·) depending on g(x, θ), the choice of c(·, ·) and the distribution of {β, Ψ}. π{f(x, θ)|F} may be obtained in closed form through conjugate analysis under certain strong prior assumptions that fix certain elements of Ψ and allow its remaining elements and β to be integrated out (see, for example, Haylock and O'Hagan (1996)). However, in most realistic applications, samples from π{f(x, θ)|F} must be obtained by sampling from the full posterior joint distribution by using a complex Markov chain Monte Carlo algorithm.

There are many situations in which we may be unwilling or unable to make a fully probabilistic prior specification in inducing the required random field. We may be too limited by time, resources or access to the required expertise to make meaningful subjective judgements. If this is so, any posterior calculations will not reflect either our own beliefs, as analysts, or those of the scientists and modellers with whom we are working. Further, given practical constraints, the type of analysis that we suggest below may turn out to be prohibitively expensive for a full probabilistic specification.


In such cases Bayes linear methods may be appropriate. In treating expectation (and not probability) as the primitive quantity, we can use Bayes linear methods to learn about the second-order properties of quantities of interest on the basis of partial prior specifications consisting of means, variances and covariances. Given prior means, variances and covariances on a collection of quantities B and data D, the Bayes linear adjusted expectation of B given D is

$$ \mathrm{E}_D[B] = \mathrm{E}[B] + \mathrm{cov}(B, D)\, \mathrm{var}(D)^{-1}\, (D - \mathrm{E}[D]), \qquad (2) $$

and the adjusted variance is

$$ \mathrm{var}_D(B) = \mathrm{var}(B) - \mathrm{cov}(B, D)\, \mathrm{var}(D)^{-1}\, \mathrm{cov}(D, B). \qquad (3) $$

A full development of the Bayes linear approach can be found in Goldstein and Wooff (2007). For a recent example of the use of Bayes linear methods for computer simulators, see Vernon et al. (2011) and the accompanying discussion, or Cumming and Goldstein (2010). Emulators that can be updated by using Bayes linear methods involve the same model, equation (1), but require only expectations and variances on {β, ε(x, θ)} to induce the required random field by Bayes linear adjustment. As we shall explain, there are particular practical advantages in using the Bayes linear approach in developing fast emulators for scenario-based hierarchies.
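A minimal numerical sketch of the adjustment in equations (2) and (3) is given below, assuming that the prior means and the joint second-order specification for B and D have already been made; the function and variable names are illustrative and do not come from the paper or its accompanying code.

```python
import numpy as np

def bayes_linear_adjust(E_B, E_D, cov_BD, var_D, var_B, d):
    """Bayes linear adjusted expectation and variance of B given observed data d.

    Implements equations (2) and (3): an update of prior second-order beliefs
    by the data, with no distributional assumptions required.
    """
    # Solve var(D) w = (d - E[D]) rather than forming an explicit inverse
    resid = d - E_D
    weights = np.linalg.solve(var_D, resid)
    adjusted_mean = E_B + cov_BD @ weights                             # equation (2)
    adjusted_var = var_B - cov_BD @ np.linalg.solve(var_D, cov_BD.T)   # equation (3)
    return adjusted_mean, adjusted_var
```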

In both the fully Bayesian and the Bayes linear approaches to emulation, the choice of covariance function and parameters has a major influence on the properties of the induced random field.

A popular form for the covariance function is the separable form

$$ \mathrm{cov}\{\varepsilon_i(x', \theta'), \varepsilon_j(x'', \theta'')\} = \Sigma_{ij}\, R\{|(x', \theta') - (x'', \theta'')|\}, \qquad (4) $$

where Σ is a p × p variance matrix capturing correlations between the model outputs, and R(·) is a correlation function across the model inputs (see, for example, Craig et al. (2001) and Rougier (2008)). This form has computationally attractive properties, and we shall revisit it throughout this paper.

2.2. Scenario analysis

Suppose that the ensemble of runs of f(x, θ) that is available to us is a pre-existing ensemble of opportunity which explores only a handful of decision scenarios θ[1], ..., θ[m]. Let n_i be the number of runs for scenario θ[i]; then denote this scenario ensemble F, where

$$ F = \{f(X_{11}, \theta_{[1]}), \ldots, f(X_{1n_1}, \theta_{[1]}), f(X_{21}, \theta_{[2]}), \ldots, f(X_{2n_2}, \theta_{[2]}), \ldots, f(X_{mn_m}, \theta_{[m]})\}. $$

Unless m is large and θ[1], ..., θ[m] are chosen to give optimal coverage of the space of possible θ, it may not be appropriate (or possible) to emulate f(x, θ) as a function of both x and θ by using standard methods. In particular, if m is very small, or if n_1, ..., n_m are large compared with m, our ensemble will contain a large amount of information about the model output at each scenario, but very little information in the rest of decision space.

Our goal is to use F to help to construct an emulator for a more powerful and expensive version of f, f*, that is also a function of both x and θ. Our approach is first to emulate f(x, θ), by emulating each f(x, θ[i]) (see Sections 3.2 and 3.3), and using these emulators to construct a representation of the mean (see Section 4.1) and variance (see Section 5) across decision space. A model is then created that links f to f*, and finally the prior for f* is updated by a small, well-designed, ensemble of runs f*(x_1, θ_1), ..., f*(x_n, θ_n) (see Section 6). The nature of the scenario ensemble F means that a standard emulator for f(x, θ) that fits into this methodology is unavailable. The novel steps in our methodology allow for m separate scenario emulators for f(x, θ[1]), ..., f(x, θ[m]), that capture the wealth of local information that is available in our scenario ensemble, to be combined and extended across decision space to assist in building a prior for f*.

We define the building of separate emulators for each scenario to be a scenario analysis. Each scenario emulator f(x, θ[k]) can be constructed by using the analyst's favourite standard method. We believe that, in most cases with relatively small m, a scenario analysis is the fastest and simplest first stage when looking to emulate f(x, θ). When building the scenario emulators, it is helpful if only a handful of key quantities are considered. So, for example, if the model output is a large vector, we must attempt to reduce this output to a handful of key quantities that capture the global behaviour in the output. Such dimension reduction is often important more widely, so that the matrix inversions that are required when building emulators are computationally feasible. However, in FLASH it is particularly important as we must be able to extract simple qualitative relationships between the scenario emulators to link them effectively. Intuitive and low dimensional representations of the model output will be helpful if we are to achieve this.

We now introduce our motivating example and apply the first step of our FLASH methodology to it, by creating scenario emulators for our fast simulator.

3. Emulating HadCM3 by using fast linked analyses for scenario-based hierarchies

One of the main aims of the Natural Environment Research Council funded RAPID–RAPIT programme is to assess the risk of shutdown of the Atlantic meridional overturning circulation (AMOC), which is defined as the zonal and vertical integral of v(x, y, z), the meridional velocity, i.e.

$$ \Psi(y, z) = \int_{-H_{\max}(y)}^{z} \int_{x_w}^{x_e} v(x, y, z')\, \mathrm{d}x\, \mathrm{d}z', $$

where x_w and x_e are the western and eastern boundaries of the ocean basin respectively, −H_max(y) is the maximum depth of the ocean basin and z is the vertical co-ordinate. The AMOC has been identified as a key ocean mechanism which transports heat from the tropics into northern Europe and contributes to the comparatively mild European climate (Rhines and Häkkinen, 2003; Broeker, 1987). The stability of the AMOC is of considerable interest to European governments and policy makers. Palaeoclimate records suggest that in the past the AMOC is likely to have undergone major rapid fluctuations (McManus, 2004). A shutdown or significant decrease in the strength of the AMOC would result in a reduction of northward heat transport, and consequently a change in the climate over northern Europe. The UK Met Office are partners in this project and the results will be used to help to meet their own deliverables to the Department of Energy and Climate Change and the Department for Environment, Food and Rural Affairs. RAPID–RAPIT intends to use large ensembles of the UK Met Office Hadley Centre coupled climate model HadCM3 (Gordon et al., 2000) and the results of other general circulation models to build emulators that will then be combined with data and expert beliefs to produce the risk assessments required.
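The streamfunction above can be approximated directly from gridded model output; the following is a small sketch under simplifying assumptions (a regular grid with constant spacings dx and dz, velocities ordered from the ocean floor upwards), whereas the real model grids are curvilinear with variable layer thicknesses. The array layout and names are illustrative only.

```python
import numpy as np

def amoc_streamfunction(v, dx, dz):
    """Zonal and vertical integral of meridional velocity v(x, y, z).

    v  : array of shape (nx, ny, nz), vertical index running upwards from -Hmax(y)
    dx : zonal grid spacing (m); dz : vertical layer thickness (m)
    Returns Psi(y, z) on the (ny, nz) grid.
    """
    zonal_integral = np.sum(v, axis=0) * dx        # integrate across the basin
    psi = np.cumsum(zonal_integral, axis=1) * dz   # integrate upwards from the floor to z
    return psi
```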

3.1. Data

HadCM3 comprises a 1.25° × 1.25° horizontal resolution ocean component with 20 constant depth vertical levels, coupled to a 3.75° × 2.5° horizontal resolution atmosphere with 19 hybrid vertical layers (a combination of constant height and pressure-dependent layers). Importantly, HadCM3 does not require flux adjustments to prevent drift in the climate, making it suitable for centennial scale climate studies.


To give the UK Met Office confidence that RAPID–RAPIT would be able to deliver something interesting in time for their deliverables to be met, they asked that we build an emulator for HadCM3 as a function of specific CO2 forcing experiments that involved ramping the level of CO2 up for a number of years and then ramping down. Delivering a proof of concept for this methodology was crucial for ensuring continued collaboration between the UK Met Office and RAPID–RAPIT in this area of research. As the deadline approached, it turned out that we would have the resources for only 24 HadCM3 runs. The last eight of these would not be ready until a couple of days before the deadline. To help, the team at climateprediction.net designed an ensemble of a lower resolution model, called 'FAMOUS' (Smith et al., 2008), which uses near identical code to HadCM3 but with 3.75° × 2.5° horizontal resolution in the ocean and 7.5° × 5° horizontal resolution in the atmosphere. The vertical co-ordinate system in the atmosphere is also reduced to 11 hybrid vertical layers. Despite such large changes in the resolution, FAMOUS can reproduce a climate that is comparable with that of HadCM3 for around 10% of the computational cost.

The FAMOUS ensemble that we received from climateprediction.net was a scenario ensemble and, by the time that it was ready, the deadline for our analysis was 25 days away, meaning that, whatever methodology we developed, it had to be fast. In using a lower resolution generalized circulation model to link to HadCM3 as part of this deliverable, we were also providing a proof of concept to the UK Met Office that generalized circulation models could indeed be linked through emulators, an idea underpinning our strategy for contributing to the risk assessments later in the project, and therefore important to the Met Office.

The FAMOUS scenario ensemble contained between 40 and 80 perturbed parameter ensemble members for each of six different CO2 forcing scenarios. In preparation, we had designed a 24-member ensemble of HadCM3 runs which we were running on HECTOR, the UK national supercomputing service. Each batch of eight runs took more than 2 weeks to complete, and our final batch of eight was due to finish 2 days before the deadline. Our design was to devote eight runs to exact copies of runs that were being done on FAMOUS at climateprediction.net (to aid our linking of the two models) and to have a Latin hypercube in the other 16. These 16 runs would be used to update our prior for HadCM3 (constructed using FAMOUS and the eight repeats), to produce our final analysis.

For each scenario in our FAMOUS ensemble, only three parameters were varied by the climateprediction.net team. These are ENTCOEF, the entrainment coefficient in the model atmosphere, KAPPA0 SI, a vertical mixing parameter in the ocean, and the solar constant, which is considered as a switch with three levels: 1363, 1365 and 1367. Each parameter configuration was integrated for a period of 200 model years (200 model years being the duration of a FAMOUS work unit run through climateprediction.net), starting from a restart file several thousand years into an integration of the default parameter configuration (see Smith et al. (2008)), to allow the model to adjust to the new parameters. This 200-year spin-up period is sufficient to allow the surface dynamics and fluxes in the model to adjust to near steady state. Although there will continue to be gradual drift in the deep ocean for several centuries, it will be small and on a much longer timescale compared with the model response to the CO2 scenarios. The next 200-year work unit to be released through climateprediction.net included the CO2 scenarios (Fig. 1). They included a control and scenarios that involved ramping up for 70 years at a constant annual rate followed by either a ramping down over 70 years back to preindustrial levels or holding the CO2 level fixed at the ramped-up level. Ramp-up by both 1% and 2% per year was followed by both ramp-down and CO2 holding, with the sixth scenario being a ramp-up by 4% per year followed by ramp-down back to preindustrial levels. The use of 70 years in the scenario design arises because increasing at a rate of 1% per year for 70 years results in a doubling of atmospheric CO2 levels, 2% per year a quadrupling of atmospheric CO2 levels and so forth.


[Fig. 1. The six CO2 scenarios that were chosen for the FAMOUS experiment, plotted as log(CO2 concentration) (p.p.m.v.) against year: control; up 1% down; up 1% flat; up 2% down; up 2% flat; up 4% down]

The HadCM3 runs were started by using restart files from a long integration with standard physics. Owing to the significantly slower integration speed compared with FAMOUS it was only possible to complete 50 years of spin-up with each of the different parameter configurations before starting the 140 years of transient atmospheric CO2. Although not ideal, this is still sufficient time to allow the models to reach near equilibrium in the surface layers and fluxes, and drift is again small compared with the changes arising from the transient atmospheric CO2.

By the time that the deadline for the deliverable to the Met Office had arrived, some of the 99 designed ensemble members per scenario had not completed, including three of our eight repeats. Such cases are typical when waiting for the output of very slow functions and with tight deadlines, motivating the need for general analyses that are fast.

The model AMOC data that we emulate here are treated as noisy 170-year time series (Fig. 2). The location and direction of 'spikes' in the data are artefacts of the initial conditions and are referred to by the modellers as 'internal variability'. Of interest, in this paper, are global properties such as the value and location of the minimum of a smoothed version of the data as the AMOC responds to forcing, as well as the amount by which the AMOC recovers after forcing. Other interesting features include the 'mean' of the AMOC before forcing begins and the amount of variation around a smoothed version.

At the time of designing the HadCM3 runs, we did not have the ability to change the HadCM3 solar constant, so this was kept at the default value of 1365. We required a continuous CO2 scenario space in which each of the six FAMOUS scenarios lived, but which allowed for policy relevant flexibility in the CO2 forcing trajectory. We parameterize a CO2 ramp-up and ramp-down experiment in the following way. Let CO2 concentration at time t be written c(t); then we write


[Fig. 2. Selection of raw FAMOUS AMOC output, plotted against time in days, with one choice of input parameters for each scenario: control; up 1% down; up 1% flat; up 2% down; up 2% flat; up 4% down]

$$ c(t) = \begin{cases} c(0)\,(1+\Delta x)^{t}, & t \le t_1, \\ c(0)\,(1+\Delta x)^{t_1}(1-\Delta y)^{t-t_1}, & t_1 < t \le t_2, \end{cases} $$

where c(0) is the preindustrial CO2 concentration. The parameterization describes an experiment where preindustrial CO2 is increased annually by rate Δx for t_1 years and then decreased annually by rate Δy until the end of the experiment at time t_2. We fix t_1 = 70; then our CO2 forcing experiments are described by specifying the values of two parameters θ = (Δx, Δy)^T. Note that each of the FAMOUS scenarios corresponds to one choice of this parameter vector. The control is actually the hyperplane at Δx = Δy = 0, and, for each of the remaining scenarios, the Δx-values are (0.01, 0.01, 0.02, 0.02, 0.04) with corresponding Δy-values (0, 0.0099, 0, 0.0196, 0.0385). Our design for the HadCM3 runs corresponded to a maximin Latin hypercube (Morris and Mitchell, 1995) in ENTCOEF, KAPPA0 SI and Δx, with Δy chosen so that Δy ≤ 1.5Δx to prevent the CO2 experiment ramping down below 0.5c(0), as our experts feared that this might break the model. We also limit Δx to be in [0, 0.02], as the Met Office felt that scenarios outside this range represent very unrealistic increases in CO2 levels and would be a waste of very limited resources.
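The parameterization above is straightforward to evaluate; the following short sketch computes a trajectory at annual resolution, with the preindustrial concentration supplied by the user. Names and defaults are illustrative.

```python
import numpy as np

def co2_trajectory(delta_x, delta_y, c0, t1=70, t2=140):
    """CO2 concentration c(t) under the ramp-up/ramp-down parameterization.

    Increases c(0) = c0 annually at rate delta_x for t1 years, then decreases
    annually at rate delta_y until the end of the experiment at t2.
    Returns the concentrations at integer years t = 0, ..., t2.
    """
    years = np.arange(t2 + 1)
    c = np.where(
        years <= t1,
        c0 * (1.0 + delta_x) ** years,                                # ramp-up phase
        c0 * (1.0 + delta_x) ** t1 * (1.0 - delta_y) ** (years - t1)  # ramp-down phase
    )
    return years, c

# For example, the FAMOUS 'up 2% down' scenario corresponds to
# delta_x = 0.02 and delta_y = 0.0196.
```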

3.2. Carbon dioxide forcing scenario emulators for FAMOUS

To reduce the dimension of the output and to facilitate the extraction of simple global relationships across scenario space, we introduce a basis representation based on P-splines that is designed to extract the interesting global behaviour in the AMOC time series whilst retaining a broadly physical interpretation for the basis coefficients.


For any given scenario, model the raw simulator output at time t, s_R(x, t), as

$$ s_R(x, t) = s(x, t) + \eta(x, t), \qquad (5) $$

where s(x, t) is a smooth response in t for any x and η(x, t) has mean 0, has variance depending on x and t, and is uncorrelated with s(x, t) for any x. We have suppressed the dependence on the scenario in this description for notational convenience.

Although it may be more conventional to view s(x, t) as a 'true but unknown' function to be estimated from each model output, we treat s(x, t) as a known deterministic function of the outputs and thus it acts as our simulator. This will be made clear in our derivation of s(x, t). We are interested in capturing the global behaviour of the simulator, s(x, t) over all t, as a function of x. We make the assumption that η(x, t) models the internal variability of the climate model so that any emulator that we build for the smooth response s(x, t) must interpolate the smoothed ensemble members.

We use a basis representation for s(x, t) = Σ_j c_j(x) B_j(t), where the B_j(t) are basis functions over the domain of t and the c_j(·) are chosen to give the best smooth fit to the data. There are many possible choices for the basis functions B(t), with perhaps the most common being the polynomial basis functions and the Fourier basis functions (see, for example, Ramsay and Silverman (1996)). Basis representations have been used to emulate high dimensional computer model output before. Challenor et al. (2009) used principal components to force the basis vectors to be orthogonal, thus allowing separate univariate emulators to be built for each coefficient. Orthogonal decomposition was also used by Furrer et al. (2007) for analysing climate model output. Bayarri et al. (2007) used a wavelet basis decomposition to capture local behaviour in the time series more accurately. In all these examples, emulation proceeds by selecting an appropriate basis representation and then emulating the coefficients as a function of the model inputs.

We use a B-spline basis representation of degree 3 (see, for example, Fahrmeir and Tutz (2001)) to preserve some physical interpretability in the spline coefficients. This interpretability is available because, having divided the time window by k knots, B_j(t) is only non-zero on the domain that is spanned by the five nearest knots in the neighbourhood of t. For example, suppose that knots are placed at T_0, T_1, ..., T_n; then B_1(t) is only non-zero on [T_0, T_4] and c_1(x) represents behaviour of the AMOC at the beginning of the time series at parameter values x. We discuss our choice of splines further in Section 7.

We fit the coefficients of this representation for any x in the design by using P-splines. This allows us to choose relatively few knots (in this example six) and to control the smoothness of the fit by using a smoothing parameter. P-spline coefficients are fitted, for any ensemble member, by using a penalized weighted sum of squares. At any x at which we have a model run, let y^T(x) = (s_R(x, t_1), ..., s_R(x, t_n)), B = (b_ij), with b_ij = B_j(t_i), W = diag(w_1, ..., w_n), a diagonal matrix of weights, c^T(x) = (c_1(x), ..., c_{k+4}(x)) and D_B = (d_ij), where d_ij = ∫ B_i''(t) B_j''(t) dt. Then P-splines are fitted by choosing c(x) to minimize

$$ \{y(x) - B c(x)\}^{\mathrm T} W \{y(x) - B c(x)\} + \lambda\, c^{\mathrm T}(x)\, D_B\, c(x), \qquad (6) $$

where λ is a smoothing parameter. The minimum of expression (6) can be written

$$ c(x) = (B^{\mathrm T} W B + \lambda D_B)^{-1} B^{\mathrm T} W\, y(x); \qquad (7) $$

see Fahrmeir and Tutz (2001). In what follows we treat s(x, t) = Σ_j c_j(x) B_j(t) as the best smooth fit to the data, with c(x) a known function of the simulator output.
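A minimal numerical sketch of the penalized fit in expression (7) follows, assuming a clamped degree-3 knot vector is supplied (the basis matrix is built with scipy's BSpline). Note that, for brevity, the curvature penalty D_B is approximated here by the standard second-order difference penalty on the coefficients rather than by the exact integral of products of second derivatives used in the text; all names are illustrative.

```python
import numpy as np
from scipy.interpolate import BSpline

def pspline_coefficients(t_obs, y, knots, lam, degree=3, weights=None):
    """Penalized B-spline coefficient fit, a sketch of expression (7).

    t_obs : observation times; y : raw output s_R(x, t) at those times
    knots : full clamped knot vector for the degree-3 basis
    lam   : smoothing parameter lambda
    """
    # Basis matrix B with b_ij = B_j(t_i)
    B = BSpline.design_matrix(t_obs, knots, degree).toarray()
    n_coef = B.shape[1]
    W = np.diag(weights) if weights is not None else np.eye(len(t_obs))
    # Second-order difference penalty approximating D_B
    D = np.diff(np.eye(n_coef), n=2, axis=0)
    P = D.T @ D
    # c(x) = (B'WB + lambda D_B)^{-1} B'W y(x)
    return np.linalg.solve(B.T @ W @ B + lam * P, B.T @ W @ y)
```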


3.3. Emulating spline coefficients

For scenario ensemble k, by choosing the same knots for any x, we can emulate c(x) as a function of x using data c(X_{k1}), ..., c(X_{kn_k}) calculated from the ensemble members. Our emulator f(x, t) for any scenario is then expressed using our emulator for the coefficients. So, for example,

$$ \mathrm{E}[s(x, t)] = \sum_j \mathrm{E}[c(x)]_j\, B_j(t) $$

and

$$ \mathrm{var}\{s(x, t)\} = \sum_i \sum_j B_i(t)\, \mathrm{cov}\{c_i(x), c_j(x)\}\, B_j(t). $$

To construct these emulators we must first choose appropriate λ and number of knots. We may then post-process the data into their P-spline coefficient form and then emulate the c_j(x) jointly as a function of x.
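The conversion from coefficient-scale beliefs back to the time series scale is a direct application of the two formulas above; the sketch below assumes the adjusted mean vector and covariance matrix of the coefficients at some x, and the basis matrix evaluated at the prediction times, are already available. All names are illustrative.

```python
import numpy as np

def timeseries_moments(mean_c, cov_c, B_t):
    """Mean and variance of the smooth response s(x, t) at the prediction times.

    mean_c : (q,)   adjusted expectation of the spline coefficients at x
    cov_c  : (q, q) adjusted covariance of the spline coefficients at x
    B_t    : (n_times, q) basis functions B_j(t) evaluated at the prediction times
    """
    mean_s = B_t @ mean_c                               # E[s(x, t)] = sum_j E[c(x)]_j B_j(t)
    var_s = np.einsum('ti,ij,tj->t', B_t, cov_c, B_t)   # var{s(x, t)} at each t
    return mean_s, var_s
```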

The raw AMOC time series are depicted in Fig. 2. An exploratory analysis that involved starting with a large number of knots, observing a P-spline fit, then reducing the number of knots gradually and repeating until we had the smallest number of knots that led to a good looking fit, showed that six equally spaced knots (and hence eight basis functions) were adequate to capture the global features of the AMOC time series for any scenario. For some scenarios, such as the control, fewer knots would have been appropriate, but we wished to use the same basis for all scenarios and remained with six. We fix W to be the identity and allow λ to vary across scenarios, so that each set of fitted coefficients for a given scenario captures only the global behaviour that we aim to emulate, and smooths out any local variability or behaviours. Such variability would be emulated separately if the science suggested that it was worthy of investigation.

The chosen value of λ for each scenario was obtained by fitting a number of splines 'by eye', and selecting a value that consistently met the requirements of our fit over each ensemble member in the scenario. Our requirements were that the spline followed the shape of the curve while looking as smooth as possible. The less aggressive forcing scenarios require more smoothing. This reflects an observation that the standard deviation of our noise term increases as we tend towards the control. The six values of λ were 0.5, 0.3, 0.225, 0.225, 0.1 and 0.1 for the control, up 1% flat, up 1% down, up 2% flat, up 2% down and up 4% down respectively.

The fitting of the spline coefficients produces six scenario ensembles each containing the values of eight spline coefficients for each choice of the vector (ENT, KAPPA0 SI, SC). We note here that both the choice of the number and location of the knots as well as the choice of penalty term and selection of the values of the smoothing parameter are very important to the quality of our emulators, and that the choices that were made here were for convenience under severe time pressure. A combination of more careful choices for the knot locations and a more detailed penalty term might lead to spline coefficients with more physical interpretability than those we have obtained here. This is very much an area for further investigation.

To construct each emulator, we begin by fitting linear models in our input variables to each of the eight coefficients univariately. We treat SC as a factor with three levels, and fit models by ordinary least squares that contain a level contribution from the factor SC, quadratic terms in ENT and linear terms in KAPPA0 SI. We also allow an interaction between ENT and KAPPA0 SI.

We choose to fit a second-order emulator and use Bayes linear methods of updating as described below. For each scenario we fit a model of the same form as equation (1),

$$ c_i(x) = \sum_j \beta_{ij}\, g_j(x) + \varepsilon_i(x), $$

where g(x) contains the terms fitted by using ordinary least squares, β has expectation and variance derived from the ordinary least squares fits and ε(x) is a mean zero separable Gaussian process, uncorrelated with β, with covariance function cov{ε_i(x), ε_j(x′)} = Σ_ij R(|x − x′|). The matrix Σ is an 8 × 8 variance matrix which serves to give order-of-magnitude variances and correlations across the outputs. We obtain the correlation part of Σ by using the empirical residuals from our linear model fitting, and we inflate the empirical variance by 0.3 to insure against overconfidence. The value of 0.3 is arrived at by using leave-one-out diagnostics (see below). We use a Gaussian correlation function R(·) with correlation lengths derived by using a heuristic based on Taylor series expansions as a starting point and tuning using leave-one-out diagnostics (described below) such as those in Fig. 3 based on emulators trained by using half of the ensembles. The heuristic is described on page 667 of Vernon et al. (2011).
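A minimal sketch of the separable residual covariance used here follows, assuming a Gaussian correlation with one correlation length per input; the exact parametric form of the correlation (for example, whether a factor of 2 appears in the exponent) is a convention choice that the text does not fix, and the names are illustrative.

```python
import numpy as np

def gaussian_corr(x1, x2, corr_lengths):
    """Gaussian correlation R(|x - x'|) with a correlation length per input."""
    d = (np.asarray(x1) - np.asarray(x2)) / np.asarray(corr_lengths)
    return np.exp(-np.sum(d ** 2))

def residual_cov(x1, x2, Sigma, corr_lengths):
    """Separable residual covariance cov{eps_i(x), eps_j(x')} = Sigma_ij R(|x - x'|)."""
    return Sigma * gaussian_corr(x1, x2, corr_lengths)
```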

Having chosen Σ and the correlation lengths, we perform a Bayes linear adjustment of the multivariate Gaussian process ε(x) using all of the residual data. The result is an emulator for each FAMOUS scenario that interpolates the smoothed ensemble members.

[Fig. 3. Leave-one-out diagnostics for nine members of the FAMOUS 'up 2% down' ensemble, AMOC against years, showing the left-out data and the emulator mean ±2 standard deviations: panels (a)-(i) correspond to data points 1-9]


Fig. 3 shows a panel of nine leave-one-out diagnostic plots for the fitted 'up 2% down' scenario emulator. Leave-one-out diagnostics are obtained by removing one of the ensemble members, training the emulator on each of the other members and comparing the emulator's performance at the parameter settings of the missing data to the data themselves. Tuning our prior specification by using leave-one-out diagnostics, as we did when specifying correlation lengths and the variance inflation by 0.3, requires building dozens of emulators each time that we look at the effect of a chosen specification, meaning that, by the end of tuning, thousands of separate emulators had been built. We then observe leave-one-out plots for each ensemble member of each scenario emulator trained on the full ensemble to check the quality of our modelling and to be satisfied that our emulators captured and predicted the global behaviour of FAMOUS well in the context of our posterior uncertainty. This requires hundreds of separate emulators to be built. The need to build so many emulators reinforces the need for fast emulation techniques and is the reason that we prefer scenario emulators to be built by using standard methods at this stage in the FLASH method.
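The leave-one-out loop itself is simple to automate; the schematic sketch below assumes hypothetical helpers `build_emulator` (the regression fit and Bayes linear residual adjustment for one scenario) and a returned object with a `predict` method giving an adjusted mean and variance. Neither name comes from the paper or its accompanying code.

```python
import numpy as np

def leave_one_out(X, C, build_emulator):
    """Leave-one-out diagnostics for one scenario emulator.

    X : (n, d) input settings of the ensemble members
    C : (n, q) fitted spline coefficients for each member
    build_emulator : callable taking (X_train, C_train) and returning an object
        with a predict(x) method yielding (mean, variance) -- a hypothetical
        interface standing in for the scenario emulator.
    Returns standardized prediction errors for each left-out member.
    """
    errors = []
    for i in range(len(X)):
        keep = np.arange(len(X)) != i
        emulator = build_emulator(X[keep], C[keep])
        mean, var = emulator.predict(X[i])
        # Standardized distance of the left-out member from the prediction;
        # values routinely outside +/- 2 or 3 suggest an overconfident prior.
        errors.append((C[i] - mean) / np.sqrt(var))
    return np.array(errors)
```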

4. Extrapolating global relationships across scenario space

Having constructed scenario emulators for f(x, θ[1]), ..., f(x, θ[m]) based on intuitive low dimensional representations of the output, the next step in our FLASH methodology is to identify qualitative types of relationships between these representations in each emulator. The goal is to write down a more general mean function for a global emulator of f(x, θ) that can be extrapolated across scenario space. The most natural way of identifying such relationships is via the comparison of the regression coefficients β for each output across each of our scenarios. It may be that simple θ-dependent relationships exist between like regression coefficients across emulators, allowing a more general regression surface to be constructed in scenario space.

The simplest example of such a relationship would be the existence of separability in x and θ. If a comparison of our scenario emulators suggests such separability, we may extrapolate our mean function via

$$ f(x, \theta) = a(\theta)\, b(x) + d(\theta)\, e(x), \qquad (8) $$

where a(θ) is some vector of polynomials in θ, d(θ) is a stochastic process in θ and b(x) and e(x) are taken directly from our scenario emulators.

Although full separability is unlikely to hold in many serious applications, we must take care not to be too ambitious when looking for relationships as we risk overfitting, particularly when m is small. If full separability is unavailable, we can look to the notion of 'near separability' to obtain a more general mean response surface for f(x, θ).

By imposing what we have termed 'near separability', we are recognizing that, although simple relationships do exist between like coefficients across emulators, the nature of these relationships can be completely different for each coefficient of each low dimensional representation of the output. For example, suppose that we have p regressors for each scenario emulator; then we might find k_i relationships (k_i ≤ p) for output i, so that our extrapolated emulator could be written

$$ f_i(x, \theta) = \sum_{l=1}^{k_i} a_i^l(\theta)\, b_i^l(x) + e_i(x, \theta), \qquad (9) $$

for polynomials a_i^l(θ) and b_i^l(x), l = 1, ..., k_i. To avoid overfitting we want k_i as small as possible for each output i. This can be achieved by grouping terms with similar types of relationships in θ and allowing the residual e(x, θ) to soak up more of the variation.

4.1. Near separability in the FAMOUS emulators

Comparing the scenario emulator mean functions for each c_i, it was clear that full separability would not hold in our application. Fig. 4 illustrates that the intercept of our mean function for c_6 has a strong negative linear relationship in θ_1 and that the coefficient of ENT in the same mean function has a strong positive linear relationship in θ_1. We found that we could impose near separability in the mean functions by clustering the coefficients into groups that had positive linear, negative linear or no relationship in θ.

[Fig. 4. Coefficients in our linear models for each scenario, plotted against θ_1: intercept; ENT coefficient]

Let a(θ) = (1, θ_1, θ_2)^T, and let b(x), b′(x) and b″(x) be matrices whose ith row contains the vector of basis functions from our FAMOUS emulator mean function on the ith coefficient that have no relationship, a positive linear relationship and a negative linear relationship with θ respectively. Let the operator '∗' denote the Hadamard (elementwise) product of two arrays so that A_ij ∗ B_kj is the ikth element of the matrix whose columns are the elementwise product of the columns of A and B. For conformable arrays V and W, let the repeated indices in the expression V_ij W_jk (when the arrays are not separated by the '∗' symbol) denote Σ_j V_ij ∗ W_jk. Then, we can capture the prior mean response of the ith FAMOUS spline coefficient, c_i(x), approximately but effectively by using the model

$$ b_i(x) + \alpha^{+}_{ij}\, a_j(\theta) \ast \sum_k b'_{ik}(x) + \alpha^{-}_{ij}\, a_j(\theta) \ast \sum_k b''_{ik}(x), \qquad (10) $$

with


$$ b(x) = \begin{pmatrix}
19.26 & -1.3x_1 & 2.8x_2 & -4.5x_2^2 & 0 \\
18.84 & 0 & 4.5x_2 & -5.84x_2^2 & 0 \\
19.12 & 0 & 0 & 0 & 0 \\
0 & 0 & 4.3x_2 & -4.5x_2^2 & 0 \\
0 & 0 & 4.86x_2 & -4.5x_2^2 & 1.91x_1x_2 \\
0 & 0 & 0 & 0 & 1.1x_1x_2 \\
0 & 0.6x_1 & 0 & -8x_2^2 & 0 \\
0 & 0 & 7.85x_2 & -4.4x_2^2 & 1.4x_1x_2
\end{pmatrix}, $$

$$ b'(x) = \begin{pmatrix}
0 & 0 \\
0 & 0 \\
0.5x_1 & x_2 \\
x_1 & 0 \\
x_1 & 0 \\
0 & x_2 \\
0 & x_2 \\
x_1 & 0
\end{pmatrix}, \qquad
b''(x) = \begin{pmatrix}
0 & 0 & 0 \\
0 & 0 & 0 \\
0 & x_2^2 & x_1x_2 \\
1 & 0 & 0 \\
1 & 0 & 0 \\
1 & -\tfrac{1}{3}x_2^2 & 0 \\
1 & 0 & 0 \\
1 & 0 & 0
\end{pmatrix}. $$

For example (without explicitly writing zero terms), the extended mean function for c_4(x) is

$$ \begin{aligned} c_4(x) &= b_4(x) + \alpha^{+}_{41} b'(x)_{41} + \alpha^{+}_{42}\theta_1 b'(x)_{41} + \alpha^{+}_{43}\theta_2 b'(x)_{41} + \alpha^{-}_{41} b''(x)_{41} + \alpha^{-}_{42}\theta_1 b''(x)_{41} + \alpha^{-}_{43}\theta_2 b''(x)_{41} \\ &= 4.3x_2 - 4.5x_2^2 + \alpha^{+}_{41}x_1 + \alpha^{+}_{42}\theta_1 x_1 + \alpha^{+}_{43}\theta_2 x_1 + \alpha^{-}_{41} + \alpha^{-}_{42}\theta_1 + \alpha^{-}_{43}\theta_2. \end{aligned} $$

5. Representing variability across scenario space

Having constructed a mean function for f(x, θ) across scenario space, we require a residual process that models our remaining uncertainty for the fast simulator. The nature of the scenario ensemble means that our residuals contain a large amount of information about local variability at θ[1], ..., θ[m], but little information elsewhere. Our modelling must reflect this, as it will be important to transfer as much of our information up to our emulator for the full simulator as possible.

We suppose that a separable residual covariance function was used in construction of each scenario emulator so that, with R_x(·) a correlation function in x,

$$ \mathrm{cov}\{\varepsilon_i(x', \theta_{[k]}), \varepsilon_j(x'', \theta_{[k]})\} = \Sigma(\theta_{[k]})_{ij}\, R_x(|x' - x''|), \qquad k = 1, \ldots, m. $$

This is a very common structure to impose on our emulator residuals so it is natural, when looking for fast emulation methods as we are here, to rely on these simple, well-studied forms. We use R_x(·) instead of R(·) here to make it clear that we refit the covariance function to the residuals of each scenario emulator by using our new extended global mean function, so that Σ(θ[k]) is the variance matrix across model outputs for the residuals of the extended mean function at scenario θ[k].

To extend the residual covariance function into scenario space while retaining the wealth of local information at each scenario, we let

$$ \mathrm{cov}\{\varepsilon_i(x', \theta'), \varepsilon_j(x'', \theta'')\} = A(\theta', \theta'')_{ij}\, R_x(|x' - x''|)\, R_\theta(|\theta' - \theta''|), \qquad (11) $$

where R_θ(·, ·) is a correlation function in θ and A(θ', θ'') is an output covariance matrix with

$$ A(\theta', \theta'') = \Sigma^{1/2}(\theta')\, \Sigma^{1/2}(\theta''), \qquad (12) $$

so that A(θ[k], θ[k]) ≡ Σ(θ[k]), for k = 1, ..., m. How much information we have for θ ∉ {θ[1], ..., θ[m]} is controlled by our choices for R_θ(·, ·) and Σ(·). The covariance matrix function Σ(·) requires construction. This is dealt with in Section 5.1.

Defining A(·, ·) by equation (12) implies that ε(x, θ) is a valid random field because A(·, ·) is a valid covariance matrix. To see this, consider the unit-variance independent random field in x- and θ-space, ε̄(x, θ). Apply the correlation function R_x(|x′ − x″|) R_θ(|θ′ − θ″|) to each member of the vector ε̄(x, θ) separately so that corr{ε̄_i(x′, θ′), ε̄_j(x″, θ″)} = 0 for i ≠ j, but corr{ε̄_i(x′, θ′), ε̄_i(x″, θ″)} is consistent with equation (11); ε̄(x, θ) is a valid random field. We may scale ε̄(x, θ) by any matrix function Ψ(θ) and the result will still be a valid random field. Let ε(x, θ) = Ψ(θ) ε̄(x, θ), where

$$ \mathrm{cov}\{\varepsilon(x, \theta), \varepsilon(x', \theta')\} = \mathrm{cov}\{\Psi(\theta)\,\bar\varepsilon(x, \theta), \Psi(\theta')\,\bar\varepsilon(x', \theta')\} = \Psi(\theta)\,\Psi(\theta')^{\mathrm T}\, R_x(|x - x'|)\, R_\theta(|\theta - \theta'|). $$

We require A(θ[k], θ[k]) = Σ(θ[k]), which implies choosing Ψ(θ[k]) ≡ Σ^{1/2}(θ[k]) by the uniqueness of the symmetric matrix square root.
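A small sketch of building the cross-scenario output covariance of equation (12) from two covariance matrices, using the symmetric (eigendecomposition) matrix square root, is given below; the function names are illustrative.

```python
import numpy as np

def sym_sqrt(S):
    """Unique symmetric square root of a symmetric non-negative definite matrix."""
    vals, vecs = np.linalg.eigh(S)
    vals = np.clip(vals, 0.0, None)   # guard against tiny negative eigenvalues
    return vecs @ np.diag(np.sqrt(vals)) @ vecs.T

def output_covariance(S1, S2):
    """A(theta', theta'') = Sigma^{1/2}(theta') Sigma^{1/2}(theta''), equation (12)."""
    return sym_sqrt(S1) @ sym_sqrt(S2)
```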

The above extension of the residual random field from model input space into decision space is natural and flexible. It does, however, require the construction of the covariance matrix function Σ(θ). The scenario ensemble contains detailed information on the variance and correlation structure between the model outputs at a handful of points. We want our Σ(·) to be a smooth interpolant of this information that avoids many detailed and difficult assumptions, but that has a firm mathematical justification.

We can view Σ(θ[1]), ..., Σ(θ[m]) as realizations of a complex function Σ(·), which we can then model as a stochastic process with correlations driven by the distance between covariance matrices in scenario space. To describe the distance between covariance matrices we require an inner product space. We can then use a geometric interpretation of the Bayes linear approach to construct the required interpolant.

The method that we present for constructing Σ(·) is simple for the user to perform. What follows is the mathematical development of the method, with certain basic concepts of functional analysis, such as orthogonal projection in inner product spaces, assumed known. The mathematical development is contained in Section 5.1 and in Section 5.2 we provide a clear practical description that is independent of the material that is developed in Section 5.1, for the benefit of users wishing to implement the methodology. Section 5.3 illustrates the application of this methodology in extending the FAMOUS residual information into scenario space.

5.1. Covariance matrix learning in restricted inner product spaces

Geometrically, Bayes linear adjusted expectation (2) is the orthogonal projection of the vector B onto the vector space spanned by the linear combinations of the components of the data vector D in the inner product space defined by our prior covariances (Goldstein and Wooff, 2007). Goldstein and Wilkinson (2001) extended the methodology to learn about more complex random objects in more general inner product spaces. These include learning about covariance matrices within an inner product space of random matrices. A foundational justification for this formalism was given in Goldstein (1997).

Suppose that the (i, j)th element of matrix M is Z_ij. M can be identified as the r²-vector (Z_11, Z_12, ..., Z_rr) living in the corresponding vector space. We create a subspace of this space by specifying a collection of matrices of interest C = {M_1, ..., M_p} and constructing the linear space ⟨C⟩ containing all finite linear combinations of elements of C. Define the inner product on this space as

$$ (R, S) = \mathrm{E}[\mathrm{tr}(RS)], \qquad \|R - S\|^2 = (R - S, R - S). \qquad (13) $$

A basis for the symmetric constant matrices is the collection I_ij with I_ij = 1 in positions (i, j), (j, i) and I_ij = 0 elsewhere. Let I* be the collection of such matrices and let [C] be the space ⟨C⟩ augmented by I*. Then, for X ∈ [C], E[X] is the orthogonal projection of X into the space [I*].

Let Σ* = {Σ(θ[1]), ..., Σ(θ[m])}; then the Bayes linear adjusted expectation of Σ(θ) given Σ* is

$$ \mathrm{E}_{\Sigma^*}[\Sigma(\theta)] = \mathrm{E}[\Sigma(\theta)] + \mathrm{cov}\{\Sigma(\theta), \Sigma^*\}\, \mathrm{var}(\Sigma^*)^{-1}\, (\Sigma^* - \mathrm{E}[\Sigma^*]), \qquad (14) $$

with var(Σ*)_ij = (Σ(θ[i]), Σ(θ[j])) and cov{Σ(θ), Σ*}_i = (Σ(θ), Σ(θ[i])). Note that, with appropriate prior specification of any quantities required to calculate E[Σ(θ)] and (Σ(θ), Σ(θ′)) for any θ, θ′, E_{Σ*}[Σ(θ)] interpolates the matrices Σ(θ[1]), ..., Σ(θ[m]). By exploiting this geometrical approach then, we can specify a valid random field representing variability across scenario space that makes use of the local information that is captured at each scenario choice.

By controlling the way that matrices are entered into C, we can control how much we can learn about the structure of our covariance matrix function away from design points. Suppose that we take a matrix M and break it into q non-overlapping submatrices of the same size, M[1], ..., M[q], so that M = Σ_{i=1}^{q} M[i]. Then M[1], ..., M[q] are mutually orthogonal. By carefully choosing the way that each matrix in C is broken into orthogonal submatrices, we can learn about interesting structures in our covariance matrix function within a restricted inner product space.

A restricted inner product space on random matrices essentially breaks the subspace ⟨C⟩ down by reducing it to the orthogonal direct sum of smaller subspaces. For example, we might want to learn about the diagonal and off-diagonal elements of a matrix separately. Each M could then be entered into C as (M[1], M[2]) with M[1] = diag(M) and M[2] = M − M[1]. Suppose that we enter each matrix of interest M into C as (M[1], ..., M[q]); then the inner product on this restricted space is defined as

$$ (R, S)^* = \sum_{j=1}^{q} (R_{[j]}, S_{[j]}). \qquad (15) $$

Using the restricted inner product we can learn about the whole matrix in one go by entering whole covariance matrices unbroken into C and by letting (R, S)* = (R, S), or we can learn about each element of the (symmetric) matrix individually by entering M into C as r(r + 1)/2 matrices and letting (R, S)* be the sum of the r(r + 1)/2 inner products. This formalism allows us to learn about any special structures in Σ(θ) separately if we deem it appropriate.

Note that the matrix function (14) need not be non-negative definite. Suppose that E[Σ(θ)] = Σ′ for all θ. If W(θ) = cov{Σ(θ), Σ*} var(Σ*)^{-1}, then a sufficient condition for function (14) to be non-negative definite is Σ_j W(θ)_j ≤ 1 and W(θ)_j ≥ 0 for all j ∈ {1, …, m}. This condition, however, is not necessary for the matrix to be non-negative definite. We consider the existence of negative eigenvalues in equation (14) to be a diagnostic warning that there may be an issue with either the prior specification or with the defined restricted inner product space. It may be that the matrices should be entered into C in a way that is more appropriate to the problem under consideration, or that a prior–data conflict exists and should be explored. The problem of negative eigenvalues here is very similar to the problem of negative variance estimates in the wider statistical literature. In that literature, the usual way to deal with negative variances in component estimation algorithms is to set them to 0 (see, for example, Swallow and Monahan (1984)). In the same spirit, one way of dealing with small negative eigenvalues would be to set them to 0 to make the covariance matrix non-negative definite. These issues are discussed further in Wilkinson (1995).
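A minimal sketch of this repair (illustrative code with hypothetical names; the tolerance and the decision to warn rather than stop are our own choices) diagonalizes the adjusted matrix and sets any small negative eigenvalues to zero:

```python
import numpy as np

def clip_negative_eigenvalues(S, tol=1e-8):
    """Make an adjusted covariance matrix (14) non-negative definite by setting
    small negative eigenvalues to zero, in the spirit of setting negative variance
    estimates to zero. Large negative eigenvalues are better treated as a warning
    of prior-data conflict or of a poorly chosen restricted inner product space."""
    S = 0.5 * (S + S.T)                       # symmetrise against rounding error
    eigvals, eigvecs = np.linalg.eigh(S)
    if eigvals.min() < -tol:
        print("warning: substantial negative eigenvalues:", eigvals[eigvals < -tol])
    eigvals = np.clip(eigvals, 0.0, None)     # negative eigenvalues set to 0
    return eigvecs @ np.diag(eigvals) @ eigvecs.T
```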

5.2. Constructing the covariance matrix function
To construct the covariance matrix function, we first decide which parts of the matrix will behave similarly in decision space. For example, we might decide that the diagonal elements and the off-diagonal elements behave differently from each other, but that the elements within each group behave similarly. The simplest specification is not to break the matrix up at all and to treat every element as having the same behaviour in parameter space. We describe the algorithm for this simple case, and state how it generalizes at the end of this subsection.

We require elementwise expectations and covariances to construct the covariance matrix interpolator. Specifically, we require E[Σ(θ)]_{jk} and cov{Σ(θ)_{jk}, Σ(θ′)_{jk}} for any j and k. Note that we do not require cov{Σ(θ)_{ij}, Σ(θ′)_{kl}} for i ≠ k or j ≠ l. For the covariances, a useful form is

\[ \mathrm{cov}\{\Sigma(\theta)_{jk}, \Sigma(\theta')_{jk}\} = \sigma_{jk}\,\zeta(|\theta - \theta'|), \]

where ζ(|θ − θ′|) is a correlation function. This reduces the specification to elementwise expectations, a form of correlation function ζ(·), correlation lengths for θ, and σ_{jk} for all j, k. How one specifies these will depend on the problem and on the experience of the analyst and of any experts he or she is working with. We can further reduce the burden of prior specification by making exchangeability judgements for some of the expectations and σs. For example, we might judge that Σ(θ)_{lk} has the same mean and variance for any k, reducing considerably the number of means and variances that we must specify.

Once the elementwise prior specification is complete, we compute E_{Σ*}[Σ(θ)] by using equation (14), with Σ* = {Σ(θ_[1]), …, Σ(θ_[m])} and with the required covariances between matrices computed via

\[ \mathrm{cov}\{\Sigma(\theta), \Sigma(\theta')\} = \sum_{j}\sum_{k}\bigl[\mathrm{cov}\{\Sigma(\theta)_{jk}, \Sigma(\theta')_{jk}\} + E[\Sigma(\theta)]_{jk}\,E[\Sigma(\theta')]_{jk}\bigr] \tag{16} \]

(derived from inner product (13)). The quantity E_{Σ*}[Σ(θ)] is the required covariance matrix function and interpolates the scenario covariance matrices.
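The whole construction for this simple case is short enough to sketch in code. The fragment below is our own illustration rather than the authors' implementation; all names are hypothetical, the prior expectation matrices are taken to be numpy arrays, and the prior mean is constant in θ as in the simple specification above. It builds the matrix-level inner products from the elementwise specification, inverts the m × m matrix var(Σ*), and returns the adjusted expectation (14).

```python
import numpy as np

def matrix_ip(mean1, mean2, sigma, zeta_value):
    """Inner product (16) between Sigma(theta) and Sigma(theta'): the sum over
    elements of sigma_jk * zeta(|theta - theta'|) + E[.]_jk * E[.]_jk."""
    return float(np.sum(sigma * zeta_value + mean1 * mean2))

def adjusted_covariance(theta, design, Sigma_star, prior_mean, sigma, zeta):
    """Bayes linear adjusted expectation (14) of Sigma(theta) given the scenario
    covariance matrices Sigma_star observed at the decisions in `design`.
    zeta(theta, theta') is the correlation function in decision space."""
    m = len(design)
    V = np.empty((m, m))                      # var(Sigma*): matrix of inner products
    for i in range(m):
        for j in range(m):
            V[i, j] = matrix_ip(prior_mean, prior_mean, sigma,
                                zeta(design[i], design[j]))
    c = np.array([matrix_ip(prior_mean, prior_mean, sigma, zeta(theta, d))
                  for d in design])           # cov{Sigma(theta), Sigma*}
    w = np.linalg.solve(V, c)                 # weights W(theta) on the residuals
    adjusted = prior_mean.copy()
    for wi, Si in zip(w, Sigma_star):
        adjusted = adjusted + wi * (Si - prior_mean)
    return adjusted
```

At a scenario point θ_[i] the vector c coincides with the ith column of V, so the weights reduce to the ith unit vector and the function reproduces Σ(θ_[i]), giving the interpolation property claimed in the text.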

If we have broken the matrix up, as mentioned at the beginning of this subsection, the method for constructing E_{Σ*}[Σ(θ)] requires only a slight augmentation. For example, we could break any covariance matrix of interest into separate matrices containing its diagonal and off-diagonal elements, so that M = M_[1] + M_[2], with M a matrix and M_[1] = diag(M). We must then perform a separate elementwise prior specification of means and variances, as we did above, for each submatrix Σ(θ)_[i]. This allows us to specify different correlation structures throughout decision space for each of the parts of Σ(θ) that we are treating separately. Once this prior specification is complete, E_{Σ*}[Σ(θ)] is computed via equation (14) with E[Σ(θ)] = Σ_i E[Σ(θ)_[i]] and

\[ \mathrm{cov}\{\Sigma(\theta), \Sigma(\theta')\} = \sum_{i}\sum_{j}\sum_{k}\bigl[\mathrm{cov}\{\Sigma(\theta)_{[i]jk}, \Sigma(\theta')_{[i]jk}\} + E[\Sigma(\theta)_{[i]}]_{jk}\,E[\Sigma(\theta')_{[i]}]_{jk}\bigr]. \]

This follows from equation (15).

5.3. The FAMOUS covariance matrix function
We have six output scenario covariance matrices Σ* = {Σ(θ_[1]), …, Σ(θ_[6])}. Let Σ(θ) be the output covariance matrix as a function of θ. Let M_Σ be the collection of 8 × 8 random symmetric matrices, with each matrix entered as one single object for simplicity. We construct the space of all finite linear combinations of these matrices augmented by I*_8, the basis of 8 × 8 symmetric constant matrices, as described above, and impose inner product (13). Under this inner product we have

\[ (\Sigma(\theta), \Sigma(\theta')) = \sum_{j=1}^{8}\sum_{k=1}^{8}\bigl[\mathrm{cov}\{\Sigma(\theta)_{jk}, \Sigma(\theta')_{jk}\} + E[\Sigma(\theta)]_{jk}\,E[\Sigma(\theta')]_{jk}\bigr]. \]

We let E[Σ(θ)] be a priori diagonal, with entries equal to the mean of diag(Σ*), and model the elementwise covariance via

\[ \mathrm{cov}\{\Sigma(\theta)_{ij}, \Sigma(\theta')_{ij}\} = \sigma_{ij}\,\zeta(|\theta - \theta'|), \]

with

\[ \zeta(|\theta - \theta'|) = \exp\left\{-\left(\frac{|\theta - \theta'|}{\rho}\right)^{2}\right\}. \]

This reduces the remainder of our prior specification to a correlation length ρ and 36 numbers that specify the 8 × 8 matrix (σ)_{ij} (as σ_{ij} = σ_{ji}). We further reduce the burden by making exchangeability judgements, based on the elementwise standard deviations of the data, to reduce the number of σs specified to 7. This involved calculating the variances of each Σ_{ij} across our six scenario covariance matrices and grouping them by approximate magnitude. These magnitudes were 0.01, 0.04, 0.25, 2.25, 11, 23 and 44. σ_{ij} whose estimates from the data were classified in the same group were judged exchangeable, with the variance set to the appropriate order of magnitude. The values of ρ and σ specified were ρ = (0.007, 0.007)^T and

\[ \sigma = \begin{pmatrix}
0.04 & 0.04 & 0.25 & 0.04 & 0.01 & 0.25 & 0.25 & 0.04 \\
0.04 & 0.25 & 0.25 & 0.04 & 0.01 & 0.04 & 0.01 & 0.01 \\
0.25 & 0.25 & 0.25 & 0.04 & 0.25 & 2.25 & 2.25 & 0.25 \\
0.04 & 0.04 & 0.04 & 0.04 & 0.04 & 0.04 & 0.04 & 0.04 \\
0.01 & 0.01 & 0.25 & 0.04 & 0.04 & 2.25 & 0.25 & 0.25 \\
0.25 & 0.04 & 2.25 & 0.04 & 2.25 & 44.00 & 23.00 & 0.25 \\
0.25 & 0.01 & 2.25 & 0.04 & 0.25 & 23.00 & 11.00 & 0.25 \\
0.04 & 0.01 & 0.25 & 0.04 & 0.25 & 0.25 & 0.25 & 0.25
\end{pmatrix}. \]
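A small sketch of this grouping step is given below. It is illustrative only: the rounding rule and names are ours, whereas the candidate magnitudes are those quoted above.

```python
import numpy as np

def group_sigmas(Sigma_star, magnitudes=(0.01, 0.04, 0.25, 2.25, 11.0, 23.0, 44.0)):
    """Estimate the variance of each element of Sigma(theta) across the scenario
    covariance matrices and replace it by the nearest of a small set of magnitudes,
    so that elements falling in the same group are judged exchangeable and only a
    handful of sigma values need to be specified."""
    stack = np.stack(Sigma_star)          # shape (6, 8, 8) for the FAMOUS matrices
    est = stack.var(axis=0)               # elementwise sample variances
    mags = np.asarray(magnitudes)
    nearest = np.abs(est[..., None] - mags).argmin(axis=-1)
    return mags[nearest]                  # the sigma matrix, one of 7 values per entry
```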

Computation of E_{Σ*}[Σ(θ)] for any θ is extremely fast, with the most expensive calculation being the inversion of a 6 × 6 matrix. The speed of the calculation allows us to perform many diagnostic checks on our prior specification. These diagnostic checks can be done on the posterior of the emulator for the full simulator, to assess the sensitivity of our conclusions to our different modelling choices, and by looking at plots such as that in Fig. 5.

Fig. 5 plots the Frobenius norm of E_{Σ*}[Σ(θ)] as a function of θ. As we have opted to use a constant diagonal prior expectation for Σ(θ), we would expect the Frobenius norm to increase at each of the scenario points, as covariances in Σ* are weighted and added to E[Σ(θ)]. We may see a decrease if the diagonal entries in our prior are very much bigger than those in one of the data covariance matrices, relative to the sum of the squares of the covariances in that matrix. However, this is unlikely unless our extended mean function captures most of the variability in a particular scenario, or we have specified too large an expectation.

Fig. 5. Surface ‖E_{Σ*}[Σ(θ)]‖_F as a function of θ_1 and θ_2, where ‖·‖_F is the Frobenius norm, ‖A‖_F = √(Σ_i Σ_j |A_{ij}|²)

What this picture can show us is how far into parameter space the local information from our six scenario ensembles is shared. If the spikes are very sharp, there is little point in adding this step to our analysis, as almost no scenario-specific local information is applicable away from

the scenario point. Conversely, if the surface moves far from the mean everywhere, we may feel that we have overused the local information from the scenario ensemble (or otherwise we are confident that the local behaviour that we have seen so far will be repeated throughout decision space). From this picture we can deduce that the local variance information contained in Σ* only alters our covariance matrix function locally around the scenario points. When considering decisions that are close to one of our scenarios, we therefore gain from this geometric approach by exploiting the local variance information around that scenario. However, when far from our scenario points, our covariance function reflects our uncertainty before the scenario analysis was performed. Note, in addition, that the very large value of the norm at the most extreme ramp-up–ramp-down scenario has no effect on ‖E_{Σ*}[Σ(θ)]‖ within the range of our HadCM3 design. The plot suggests that, in our final emulator for HadCM3, the local information contained in the scenarios with θ_1 = 0.02 will have the most influence on our model.
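A diagnostic surface like Fig. 5 is cheap to reproduce once the interpolator is in place. The helper below is again our own illustration: it assumes an adjusted_covariance function with the signature sketched after equation (16), and it reads the vector correlation length (0.007, 0.007)^T elementwise in a Gaussian ζ, which is one possible interpretation of the specification above.

```python
import numpy as np

def gaussian_zeta(rho):
    """Gaussian correlation in decision space, with rho applied per dimension."""
    rho = np.asarray(rho, dtype=float)
    def zeta(t1, t2):
        d = (np.asarray(t1, dtype=float) - np.asarray(t2, dtype=float)) / rho
        return float(np.exp(-np.sum(d ** 2)))
    return zeta

def frobenius_surface(grid1, grid2, design, Sigma_star, prior_mean, sigma, zeta,
                      adjusted_covariance):
    """||E_{Sigma*}[Sigma(theta)]||_F over a grid of (theta_1, theta_2), as in Fig. 5.
    Sharp, isolated spikes at the scenario points indicate that the local variance
    information is not shared far into decision space."""
    return np.array([[np.linalg.norm(
        adjusted_covariance((t1, t2), design, Sigma_star, prior_mean, sigma, zeta),
        ord='fro')
        for t2 in grid2] for t1 in grid1])
```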

Note that the specification of a more elaborate restricted inner product space, one that would allow us to learn about different parts of the covariance matrix function separately, would not increase the computation time for E_{Σ*}[Σ(θ)]. Under the severe time pressure that we encountered in our application, we did not have the time to construct a more elaborate prior specification, but our diagnostic analysis of the final HadCM3 emulator suggests that we have extracted the large-scale features of the uncertainty specification.


6. Emulating the full simulator

Having constructed a global emulator for the fast simulator that carries as much of the local information gained from the scenario ensemble into scenario space as we feel is appropriate, the final stage in our FLASH methodology is to model the full simulator f*(x, θ), using the emulator for the fast simulator and any other judgements that we are willing to make, and then to adjust this model by a small ensemble of well-chosen runs of f*(x, θ), which we denote F*.

The methods for constructing emulators for higher resolution models using emulators for lower resolution models typically take the form

\[ \hat{f}^*(x, \theta) = h\{\beta, g(x, \theta), \varepsilon(x, \theta), \phi\} + \xi(x, \theta), \tag{17} \]

where \hat{f}^*(x, θ) is an emulator for f*(x, θ) and h(·) is a function of the components of the emulator for the lower resolution model and of uncertain parameters φ. For example, Kennedy and O'Hagan (2000) chose h{f(x, θ), φ} = φ f(x, θ).
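For this multiplicative choice of h, a second-order prior for the full simulator can be written down directly. The sketch below is a hedged illustration of that structure rather than the model used in Section 6.1; it assumes that φ, the fast-emulator output and the discrepancy ξ are mutually independent, and all names are our own.

```python
def full_simulator_prior(Ef, Vf, Ephi, Vphi, Exi=0.0, Vxi=0.0):
    """Second-order prior for f*(x, theta) under h{f(x, theta), phi} = phi * f(x, theta)
    plus a discrepancy xi(x, theta). Ef, Vf are the fast-emulator mean and variance
    at (x, theta); Ephi, Vphi, Exi, Vxi are prior moments for phi and xi.
    The variance uses var(phi * f) = Vphi*Vf + Vphi*Ef**2 + Ephi**2*Vf,
    which holds when phi and f are independent."""
    mean = Ephi * Ef + Exi
    var = Vphi * Vf + Vphi * Ef ** 2 + Ephi ** 2 * Vf + Vxi
    return mean, var
```

Adjusting these prior moments by the small ensemble F*, or using them to design that ensemble, then gives the final emulator.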

By choosing h(·) appropriate to our own judgement, the extended emulator for f(x, θ), plus judgements about φ and ξ(·), is used to construct an emulator for f*(x, θ). If we already have a small ensemble of runs on f*(x, θ), our final emulator is obtained by Bayesian updating of this prior by using F*. If not, the prior emulator may be used to design an ensemble in such a way as to reduce our model uncertainty as efficiently as possible. Gramacy and Lee (2011) and Cumming and Goldstein (2009) used emulators to aid in choosing optimal designs.

A Bayes linear inference, such as that demonstrated in our application, will often be the right type of inference to perform. The methods are fast, allow a wealth of diagnostic checking and are relatively simple compared with the incredibly complex fully probabilistic alternative. However, even if the ultimate aim is to provide a fully probabilistic analysis, requiring a Markov chain Monte Carlo algorithm to implement, we feel that this step is best left until after a faster analysis has been used to perform diagnostic checks on the emulator. In such a case, a Bayes linear analysis is useful as it can be repeated many times to perform careful diagnostic checking, with a larger, more ambitious, analysis possibly taking place at the end. Note that we can combine fully conjugate probabilistic methods for emulating the scenarios (see, for example, Kennedy and O'Hagan (2001)) with our geometric approach to learning about Σ(θ) if we deem it appropriate. This would simply involve constructing probabilistic scenario emulators, extending their posterior mean functions in the same way that we extend the second-order emulator mean function, and obtaining Σ* from there. However, combining this information in a fast, probabilistic analysis of the full simulator may only be possible if it is conditioned on fixed values of complicated quantities such as correlation lengths and Σ(θ). Such an analysis is unlikely to be competitive in terms of speed and interpretability with a Bayes linear analysis.

6.1. Relating FAMOUS to HadCM3
We emulate HadCM3 by using our extended scenario emulator for FAMOUS, combining the extended mean function (10) with the extended residual, as a prior model. Let the ith spline coefficient for HadCM3's AMOC be H_i(x, θ); then our prior model is

\[ H_i(x, \theta) = \beta^{0}_i\, b_i(x) + \beta^{+}_{ij}\, a_j(\theta) \sum_k b'_{ik}(x) + \beta^{-}_{ij}\, a_j(\theta) \sum_k b''_{ik}(x) + \alpha_i\, \varepsilon(x, \theta) + \xi(x, \theta). \tag{18} \]

The quantities {β0, β+, β−, α, ε(x, θ), ξ(x, θ)} are uncorrelated a priori. ξ(x, θ) is a Gaussian process representing behaviour only on HadCM3, with the parameters of its covariance function chosen by using leave-one-out diagnostics. Prior variances are assigned to β+ and β− by using ordinary least squares estimates for α+ and α−, to avoid overusing the paired data.

Fig. 6 shows the paired data. As noted in Section 3, although we devoted eight HadCM3 runs to runs in the FAMOUS design, not all of the FAMOUS ensemble had completed in time to perform the analysis. This included three of our paired runs. We note that the FAMOUS AMOC seems to begin more strongly than HadCM3 but is more sensitive to CO2 forcing and weakens more quickly. A stronger response to CO2 forcing by the lower resolution model is not surprising. Roberts et al. (2004) showed that HadCEM, a climate model which comprises a 1/3° × 1/3° ocean coupled to the HadCM3 atmosphere, has a weaker response than HadCM3 to a 2% per year CO2 forcing. There are physical arguments to support such results; for example a process such as convective adjustment, whereby an unstable column of water is mixed vertically until it becomes stable, may have a large effect on the AMOC in a low resolution model, where the water columns are several degrees in latitude and longitude and can make up a significant fraction of the area of the northern North Atlantic, but a comparatively minor effect on a higher resolution model. We also note that the HadCM3 AMOC recovers more quickly when forcing is relaxed and that the recovery is stronger than in FAMOUS, as expected by Met Office scientists a priori. Although there are differences, the paired data have similarities that give us confidence, when added to the beliefs of our experts, that FAMOUS is informative for HadCM3 and that attempting to link the two models is worthwhile.

We allow the paired runs to influence our modelling by using the difference between the runs to aid us in our judgements about {β0, α, ξ(x, θ)}. To account for the systematic differences between the models that were anticipated by our experts, we set E[ξ(x, θ)] to be the mean of the differences between spline coefficients on paired runs. Subtracting the mean from these differences, we then set the variances of β0 and ξ(x, θ) so that the largest difference on each spline coefficient corresponded to a 2-standard-deviation event. For example, variability in the first three spline coefficients is dominated by the variance of β_i^0 b_i(x). We therefore choose var(β_i^0), i = 1, 2, 3, so that the variance of β_i^0 b_i(x), and hence of the first three spline coefficients, matches the variance calculated from the paired runs as described above.

Fig. 6. The five paired runs obtained from both the HadCM3 and FAMOUS smoothed AMOCs

We allow the variability arising from the different way that HadCM3 responds to forcing, compared with FAMOUS, to be modelled through var{ξ(x, θ)} for the last five coefficients, and we assign these variances in the same way as for β_i^0 on the first three coefficients. As each of the paired runs crosses somewhere in the middle, this method gives small variances on the fourth, fifth and sixth spline coefficients in particular. As we have such a small sample, we judged that this was overconfident and did not reflect the extent of our uncertainty for these coefficients. To adjust for this overconfidence, we fix our prior moments for α and adjust the variances of ξ_i(x, θ), i = 4, 5, 6, so that the width of our error bounds on the spline was roughly the same as for the part of the spline controlled by the first three coefficients. The width of these bounds increases towards the end of the spline, which is natural given the information from the paired runs. As we are confident that there is a substantial bias between the models after ramp-down, it is natural that we are more uncertain about the final AMOC behaviour than about its earlier response to forcing, because we are basing our emulator of HadCM3 on FAMOUS. We see this in the error bars in Fig. 7.
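A sketch of this variance-setting rule is given below. It is illustrative code with hypothetical names; the 2-standard-deviation convention is as described above, and the mean difference is what informs E[ξ(x, θ)].

```python
import numpy as np

def moments_from_paired_runs(hadcm3_coeffs, famous_coeffs):
    """Rows are paired runs, columns are spline coefficients. Returns the mean
    coefficient differences (used for E[xi]) and prior variances chosen so that
    the largest mean-corrected difference on each coefficient corresponds to a
    2-standard-deviation event."""
    diffs = np.asarray(hadcm3_coeffs) - np.asarray(famous_coeffs)
    mean_diff = diffs.mean(axis=0)                 # informs E[xi(x, theta)]
    sd = np.abs(diffs - mean_diff).max(axis=0) / 2.0
    return mean_diff, sd ** 2
```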

We use the HadCM3 data to update our prior model in two stages. In the first stage we adjust the coefficients {β0, β+, β−, α} by the HadCM3 data to adjust our mean function. We then calculate (ξ(x_1, θ_1), …, ξ(x_16, θ_16)) after this adjustment and adjust ξ(x, θ) by these residuals to learn about the difference between the models and to create an emulator that interpolates the data points. This two-stage belief adjustment avoids borrowing strength from the HadCM3 data to learn about FAMOUS through ε(x, θ) and is similar in spirit to the two-stage adjustment that was applied by Craig et al. (2001).
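In outline, the two-stage adjustment can be sketched as follows; this is our own naming and a generic Bayes linear update, with the two stages indicated in comments, and is not the authors' code.

```python
import numpy as np

def bayes_linear_adjust(E_B, var_B, cov_B_D, var_D, E_D, D):
    """Generic Bayes linear adjustment of a collection B by data D:
    E_D[B] = E[B] + cov(B, D) var(D)^{-1} (D - E[D]), together with the adjusted
    variance var_D[B] = var(B) - cov(B, D) var(D)^{-1} cov(D, B)."""
    K = cov_B_D @ np.linalg.inv(var_D)
    return E_B + K @ (D - E_D), var_B - K @ cov_B_D.T

# Stage 1: B = (beta0, beta+, beta-, alpha), D = the 16 HadCM3 spline coefficients;
#          this adjusts the mean function only.
# Stage 2: form residuals r_i = H(x_i, theta_i) - adjusted mean at (x_i, theta_i),
#          then adjust the HadCM3-only process xi(x, theta) by (r_1, ..., r_16),
#          so that the final emulator interpolates the runs without letting the
#          HadCM3 data inform the FAMOUS residual epsilon(x, theta).
```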

Fig. 7 shows similar leave-one-out diagnostics to those in Fig. 3 for the 16 HadCM3 runs. In each of these plots we see that the realization of HadCM3 is within the bounds of uncertainty for the emulator trained on the other 15 ensemble members. This demonstrates the two uses of leave-one-out plots and the reason that our methods must be fast enough to build many emulators. We used leave-one-out plots trained on half of the data to tune our model, which even using only eight runs can mean building hundreds of emulators, and leave-one-out plots trained on all of the data to test it.
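The leave-one-out loop itself is simple; what makes it practical is that each emulator is cheap to rebuild. A generic sketch is given below (illustrative only; it assumes a build_emulator function returning an object with a predict method giving a mean and a variance, neither of which is defined in the paper).

```python
import numpy as np

def leave_one_out(build_emulator, inputs, outputs):
    """For each run, rebuild the emulator from the remaining runs and predict the
    held-out output, returning the data, the mean and the +/- 2 standard deviation
    bounds used in plots such as Figs 3 and 7."""
    results = []
    n = len(inputs)
    for i in range(n):
        keep = [j for j in range(n) if j != i]
        em = build_emulator([inputs[j] for j in keep], [outputs[j] for j in keep])
        mean, var = em.predict(inputs[i])
        sd = np.sqrt(var)
        results.append((outputs[i], mean, mean - 2 * sd, mean + 2 * sd))
    return results
```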

The successful building of the emulator that is illustrated in Fig. 3 satisfied our remit to the Met Office and provides a convincing argument to show that generalized circulation models of different resolution can be linked through emulation, and that a statistical model of complex generalized circulation model output, such as AMOC time series, can be constructed in such a way as to provide uncertainties for any decision in our parameterized decision space.

Fig. 7. Leave-one-out diagnostics for nine members of the HadCM3 ensemble (left-out data shown with the emulator mean and ±2 standard deviation bounds): (a)–(i) data points 1–9

7. Discussion

We have introduced the FLASH methodology for performing multilevel emulation in situations where only an ensemble of opportunity is available on the fast simulator and where that ensemble is a scenario ensemble. The method exploits the speed and flexibility of Bayes linear methods for learning about complex objects in abstract inner product spaces to obtain a residual process for the fast simulator that is appropriately extended into scenario space. The speed of this and the other steps in our method allows us to build hundreds of emulators for diagnostic purposes, which is extremely important in applications such as our deliverable to the UK Met Office, where time is short and very little data on the full simulator are available. Some of the more labour-intensive processes that were illustrated in our specific application, such as fitting the smoothing parameters in our P-splines, can be automated somewhat by constructing emulators for λ. This is a topic of forthcoming work.

FLASH calls for high level summaries of the collection of outputs, of which the spline representation that is used in our application is an example. Although the idea of using P-splines to achieve these summaries is new and received only a basic treatment here, we feel that further research in this area could be important. We see the spline representation as being particularly useful in more general emulation situations where model runs are only partially complete but where we may want to use the partial information that is available as part of our ensemble. In these circumstances, basis representations that include linear combinations of all data points, such as principal components, will not be usable, and the local properties of a spline basis may be very attractive.

As with any analysis that requires an emulator to be built by using a very small ensemble of runs, the design of the ensemble on the full simulator is crucial. Although Latin hypercubes are popular, the nuances of our scenario ensemble on the fast simulator may lead to a more complicated and well-thought-out design on the full simulator that takes into account the location and wealth of information that is available from the fast simulator at the scenario points.

Scenario ensembles offer a large amount of potentially useful local information about a model at the chosen decisions. If a policy maker had narrowed his or her options to a handful of distinct scenarios, then a scenario ensemble would be ideal. When looking to provide policy support for a very large number, or even a continuum, of policies, an emulator for the computer simulator is an ideal way to explore every policy simultaneously, and a scenario ensemble is not ideal for building emulators. As scenario ensembles are so popular, our methodology may be widely applicable in cases where a large number of policies are possible, yet only a few scenarios are explored.

Our ideal analysis would not use a scenario ensemble at all. Ideally, we would run a large ensemble of our expensive simulator that fully explores decision space as well as the space containing the other model parameters. If, as in our application, we cannot do this, the ideal analysis would have a large ensemble on the fast simulator that fully explores decision space. We would then design a small batch of runs on the expensive simulator using, for example, the method of Cumming and Goldstein (2009) to guide our choices.

Emulators offer a natural way of exploring our uncertainty over all decisions within a continuum. This development in the statistical analysis of computer models should have implications for the design of ensembles and of scenarios, particularly where scenario analysis is currently prevalent, such as in climate science. If it is important to consider certain scenarios, we would recommend choosing these scenarios in such a way that they can be defined by a small set of parameters and varied across a continuum. Using emulators such as that constructed in this paper, one can, if desirable, perform the same scenario analysis at the important scenarios, yet obtain an analysis for all feasible decisions at the same time.

Although it is easy to see how an emulator such as that built in this paper allows us to explore our uncertainty regarding the model output for any decision, this is, in itself, a very limited form of policy support. For example, the computer model itself may be uninformative for the reality that it attempts to describe. The construction of the emulator that is a function of any feasible decision is the first crucial step in providing the sort of policy support that goes beyond a scenario analysis and can address real policy questions. We refer readers who are interested in subsequent steps that make use of such an emulator to Williamson and Goldstein (2011) and Williamson (2010).

Acknowledgements

This work was funded by the Natural Environment Research Council RAPID–RAPIT project (NE/G015368/1). We thank the team at climateprediction.net for providing the FAMOUS ensemble. We also thank Michael Vellinga of the UK Met Office for many useful discussions regarding HadCM3 and the CO2 parameterization. The work developing spline emulators was done at the Isaac Newton Institute programme 'Mathematical and statistical approaches to climate modelling and prediction', and we are grateful to the Isaac Newton Institute for their invitation and support. This work made use of the facilities of HECToR, the UK's national high-performance computing service, which is funded by the Office of Science and Technology through the Engineering and Physical Sciences Research Council's 'High end computing' programme.

References

Bayarri, M. J., Berger, J. O., Cafeo, J., Garcia-Donato, G., Liu, F., Palomo, J., Parthasarathy, R. J., Paulo, R., Sacks, J. and Walsh, D. (2007) Computer model validation with functional output. Ann. Statist., 35, 1874–1906.
Broeker, W. S. (1987) The biggest chill. Nat. Hist. Mag., 97, 74–82.
Challenor, P., McNeall, D. and Gattiker, J. (2009) Assessing the probability of rare climate events. In The Handbook of Applied Bayesian Analysis (eds A. O'Hagan and M. West), ch. 10. Oxford: Oxford University Press.
Craig, P. S., Goldstein, M., Rougier, J. C. and Seheult, A. H. (2001) Bayesian forecasting for complex systems using computer simulators. J. Am. Statist. Ass., 96, 717–729.
Craig, P. S., Goldstein, M., Seheult, A. H. and Smith, J. A. (1996) Bayes linear strategies for matching hydrocarbon reservoir history. In Bayesian Statistics 5 (eds J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith), pp. 69–95. Oxford: Oxford University Press.
Craig, P. S., Goldstein, M., Seheult, A. H. and Smith, J. A. (1997) Pressure matching for hydrocarbon reservoirs: a case study in the use of Bayes linear strategies for large computer experiments. In Case Studies in Bayesian Statistics, vol. III (eds C. Gatsonis, J. S. Hodges, R. E. Kass, R. McCulloch, P. Rossi and N. D. Singpurwalla), pp. 36–93. New York: Springer.
Cumming, J. A. and Goldstein, M. (2009) Small sample designs for complex high-dimensional models based on fast approximations. Technometrics, 51, 377–388.

Cumming, J. A. and Goldstein, M. (2010) Bayes linear uncertainty analysis for oil reservoirs based on multiscale computer experiments. In The Oxford Handbook of Applied Bayesian Analysis (eds A. O'Hagan and M. West), pp. 241–270. Oxford: Oxford University Press.

Fahrmeir, L. and Tutz, G. (2001) Multivariate Statistical Modelling based on Generalized Linear Models. New York: Springer.
Furrer, R., Sain, S. R., Nychka, D. and Meehl, G. A. (2007) Multivariate Bayesian analysis of atmosphere–ocean general circulation models. Environ. Ecol. Statist., 14, 249–266.
Goldstein, M. (1997) Prior inferences for posterior judgements. In Structures and Norms in Science, vol. 2 (eds M. L. D. Chiara, K. Doets, D. Mundici and J. van Benthem), pp. 55–71. Dordrecht: Kluwer.
Goldstein, M. and Wilkinson, D. J. (2001) Restricted prior inference for complex uncertainty structures. Ann. Math. Artif. Intell., 32, 315–334.
Goldstein, M. and Wooff, D. (2007) Bayes Linear Statistics: Theory and Methods. Chichester: Wiley.
Gordon, C., Cooper, C., Senior, C. A., Banks, H., Gregory, J. M., Johns, T. C., Mitchell, J. F. B. and Wood, R. A. (2000) The simulation of SST, sea ice extents and ocean heat transports in a version of the Hadley Centre coupled model without flux adjustments. Clim. Dynam., 16, 147–168.
Gramacy, R. B. and Lee, H. K. H. (2011) Optimization under unknown constraints. In Bayesian Statistics 9 (eds J. M. Bernardo, M. J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. M. Smith and M. West), pp. 229–257. Oxford: Oxford University Press.
Hawkins, E. and Sutton, R. (2009) The potential to narrow uncertainty in regional climate predictions. Bull. Am. Meteorol. Soc., 90, 1095–1107.
Haylock, R. and O'Hagan, A. (1996) On inference for outputs of computationally expensive algorithms with uncertainty on the inputs. In Bayesian Statistics 5 (eds J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith), pp. 629–637. Oxford: Oxford University Press.
Kennedy, M. C. and O'Hagan, A. (2000) Predicting the output from a complex computer code when fast approximations are available. Biometrika, 87, 1–13.
Kennedy, M. C. and O'Hagan, A. (2001) Bayesian calibration of computer models (with discussion). J. R. Statist. Soc. B, 63, 425–464.
Linkov, I. and Burmistrov, D. (2003) Model uncertainty and choices made by modelers: lessons learned from the International Atomic Energy Agency model intercomparisons. Risk Anal., 23, 1297–1308.
McManus, J. F., Francois, R., Gherardi, J.-M., Keigwin, L. D. and Brown-Leger, S. (2004) Collapse and rapid resumption of Atlantic meridional circulation linked to deglacial climate changes. Nature, 428, 834–837.
Meehl, G. A., Covey, C., Delworth, T., Latif, M., McAvaney, B., Mitchell, J. F. B., Stouffer, R. J. and Taylor, K. E. (2007) The WCRP CMIP3 multi-model dataset: a new era in climate change research. Bull. Am. Meteorol. Soc., 88, 1383–1394.
Morris, M. D. and Mitchell, T. J. (1995) Exploratory designs for computational experiments. J. Statist. Planng Inf., 43, 381–402.
Ramsay, J. O. and Silverman, B. W. (1996) Functional Data Analysis. New York: Springer.
Rhines, P. B. and Häkkinen, S. (2003) Is the oceanic heat transport in the North Atlantic irrelevant to the climate in Europe? Arc. Subarc. Oc. Fluxes Newslett., 1, 13–17.
Roberts, M. J., Banks, H., Gedney, N., Gregory, J., Hill, R., Mullerworth, S., Pardaens, A., Rickard, G., Thorpe, R. and Wood, R. (2004) Impact of an eddy-permitting ocean resolution on control and climate change simulations with a global coupled GCM. J. Clim., 17, 3–20.
Rougier, J. C. (2008) Efficient emulators for multivariate deterministic functions. J. Computnl Graph. Statist., 17, 827–843.
Santner, T. J., Williams, B. J. and Notz, W. I. (2003) The Design and Analysis of Computer Experiments. New York: Springer.
Smith, R. S., Gregory, J. M. and Osprey, A. (2008) A description of the FAMOUS (version XDBUA) climate model and control run. Geoscient. Modl Develpmnt, 1, 53–68.
Solomon, S., Qin, D., Manning, M., Chen, Z., Marquis, M., Averyt, K. B., Tignor, M. and Miller, H. L. (eds)


(2007) Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change, 2007. Cambridge: Cambridge University Press.
Swallow, W. H. and Monahan, J. F. (1984) Monte Carlo comparison of ANOVA, MIVQUE, REML and ML estimators of variance. Technometrics, 26, 47–57.
Vernon, I., Goldstein, M. and Bower, R. G. (2011) Galaxy formation: a Bayesian uncertainty analysis (with discussion). Baysn Anal., 5, 619–670.
Wilkinson, D. J. (1995) Bayes linear covariance matrix adjustment. PhD Thesis. Durham University, Durham.
Williamson, D. (2010) Policy making using computer simulators for complex physical systems: Bayesian decision support for the development of adaptive strategies. PhD Thesis. Durham University, Durham.
Williamson, D. and Goldstein, M. (2011) Bayesian policy support for adaptive strategies using computer models for complex physical systems. J. Oper. Res. Soc., to be published, doi: 10.1057/jors.2011.110.

