Post on 02-Aug-2020
Accelerating Bayesian inference with computationally intensive models, with application to Pine Island Glacier
Patrick R. Conrad (MIT), Patrick Heimbach (MIT), Youssef Marzouk (MIT), Natesh Pillai (Harvard), and Aaron Smith (Univ. of Ottawa)
Antarctica and climate change
The Western Antarctic Ice Sheet has recently shown growing mass loss along the Amundsen coast
Figure: Western Antarctic Ice Sheet [Rignot et al. 2011]
Figure: Pine Island Glacier [NASA]
Vast uncertainty in ice-ocean dynamics
Figure: Temperature profile under Pine Island Glacier, Antarctica [Jacobs et al.]
- How readily is heat absorbed by the ice?
- How much mixing occurs near the ice-ocean interface?
- Ultimately, can we predict melt rates and the stability of the glacier?
Forward model of ice-ocean coupling
- MIT General Circulation Model, configured for Pine Island
- Realistic geometry on coarse-scale (4 km × 4 km × 20 m) or fine-scale (1 km × 1 km × 20 m) models
- Several input parameters are unknown
Constructing an inference problem
Figure: Satellite image; bathymetry and sample locations (map spanning roughly 102°W to 100°W near 75°S, with a color scale from 0 to 1000).
- Representative locations for temperature and salinity observations
Bayesian inference illustration
- Bayesian inference expresses our prior beliefs over parameters $\theta \in \mathbb{R}^n$ with a probability density,
$$p(\theta),$$
and constructs a posterior probability density,
$$p(\theta|d) \propto L(\theta|d, f(\theta))\, p(\theta),$$
expressing our beliefs after comparing the data $d \in \mathbb{R}^d$ to the computationally expensive forward model $f(\theta)$.
- Well suited to limited data and complex models
Figure: Prior contours and posterior contours, with $\theta_{MAP}$ marked.
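A minimal sketch of the unnormalized log-posterior described on this slide. The forward model, the Gaussian prior, and the Gaussian noise model are all assumptions chosen for illustration; the names `forward_model`, `log_prior`, and `log_likelihood` are hypothetical and do not come from the slides.

```python
import numpy as np

def forward_model(theta):
    # Hypothetical cheap stand-in for the expensive forward model f(theta).
    return np.array([theta[0] + theta[1], theta[0] * theta[1]])

def log_prior(theta, prior_mean, prior_std):
    # Independent Gaussian prior (an assumption made for this sketch).
    return -0.5 * np.sum(((theta - prior_mean) / prior_std) ** 2)

def log_likelihood(theta, data, noise_std):
    # i.i.d. Gaussian misfit between the data d and f(theta).
    residual = data - forward_model(theta)
    return -0.5 * np.sum((residual / noise_std) ** 2)

def log_posterior(theta, data, noise_std, prior_mean, prior_std):
    # log p(theta|d) up to an additive constant: log-likelihood plus log-prior.
    return (log_likelihood(theta, data, noise_std)
            + log_prior(theta, prior_mean, prior_std))

# Noise-free synthetic data generated at a known parameter value.
theta_true = np.array([1.0, 2.0])
data = forward_model(theta_true)
prior_mean, prior_std = np.zeros(2), np.full(2, 5.0)

lp_true = log_posterior(theta_true, data, 0.1, prior_mean, prior_std)
lp_far = log_posterior(np.array([3.0, -1.0]), data, 0.1, prior_mean, prior_std)
print(lp_true, lp_far)
```

Every call to `log_posterior` costs one forward model run, which is why the cost of $f$ matters so much for sampling.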
Markov chain Monte Carlo (MCMC)
Figure: Posterior contours, proposal contours, and MCMC samples.
- Significant literature discusses proposals that “mix” quickly, i.e., that generate nearly independent samples
- Evaluates forward model $N$ times
- Run-time can be dominated by cost of $f$
- Standard MCMC links cost of understanding $p(\theta|d)$ and $f(\theta)$
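A minimal sketch of a standard random-walk Metropolis sampler that counts posterior evaluations, to make the cost point concrete. The target here is a toy standard Gaussian, an assumption for illustration; in the setting of these slides each evaluation would require one run of the forward model.

```python
import numpy as np

def random_walk_metropolis(log_post, x0, n_steps, step_size, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    lp_x = log_post(x)
    n_evals = 1  # one evaluation for the initial state
    samples = np.empty((n_steps, x.size))
    for t in range(n_steps):
        # Symmetric Gaussian proposal centered at the current state.
        proposal = x + step_size * rng.standard_normal(x.size)
        lp_prop = log_post(proposal)
        n_evals += 1  # every step costs one new evaluation
        # Accept with probability min(1, p(proposal) / p(x)).
        if np.log(rng.uniform()) < lp_prop - lp_x:
            x, lp_x = proposal, lp_prop
        samples[t] = x
    return samples, n_evals

def log_post(theta):
    # Toy target: standard Gaussian log-density up to a constant.
    return -0.5 * np.sum(theta ** 2)

samples, n_evals = random_walk_metropolis(log_post, np.zeros(2), 5000, 1.0)
print(n_evals)
```

The evaluation count grows one-for-one with the chain length, which is the link between sampling cost and model cost noted above.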
MCMC with Local Approximations
Given $X_0$, initialize $S_0$, then simulate the chain $\{X_t\}_{t \le N}$ with kernel:
MH Kernel $K_t(x, \cdot)$
1. Given $X_t$, draw $q_t \sim Q(X_t, \cdot)$ from kernel $Q$ with (symmetric) translation-invariant density $q(x, \cdot)$.
2. Compute the acceptance ratio
$$\alpha = \min\left(1, \frac{L(\theta|d, \tilde{f}_t(q_t))\, p(q_t)}{L(\theta|d, \tilde{f}_t(X_t))\, p(X_t)}\right)$$
3. As needed, select new samples near $q_t$ or $X_t$, yielding $S_t \subseteq S_{t+1}$. Refine $\tilde{f}_t \to \tilde{f}_{t+1}$.
4. Draw $u \sim U(0, 1)$. If $u < \alpha$, let $X_{t+1} = q_t$, otherwise $X_{t+1} = X_t$.
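The four steps of the kernel can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the surrogate is a local linear regression on the nearest stored runs, the refinement trigger (proposal farther than `radius` from every stored run, or a random refinement with probability `beta`) is a hypothetical rule, refinement is applied before the ratio is computed, and the prior and likelihood are Gaussian. All function and parameter names are made up for this sketch.

```python
import numpy as np

def true_forward_model(theta):
    # Hypothetical stand-in for the expensive model f(theta); scalar output.
    return np.sin(theta[0]) + theta[1] ** 2

class LocalSurrogate:
    """Local linear regression on the k nearest stored model runs (the set S_t)."""

    def __init__(self, f, initial_points, k=8):
        self.f, self.k = f, k
        self.points = [np.asarray(p, dtype=float) for p in initial_points]
        self.values = [f(p) for p in self.points]

    def n_runs(self):
        return len(self.points)

    def refine(self, theta):
        # Run the true model at theta and add the result to the sample set.
        self.points.append(np.array(theta, dtype=float))
        self.values.append(self.f(theta))

    def nearest_distance(self, theta):
        return np.linalg.norm(np.array(self.points) - theta, axis=1).min()

    def __call__(self, theta):
        pts, vals = np.array(self.points), np.array(self.values)
        idx = np.argsort(np.linalg.norm(pts - theta, axis=1))[: self.k]
        design = np.hstack([np.ones((len(idx), 1)), pts[idx] - theta])
        coef, *_ = np.linalg.lstsq(design, vals[idx], rcond=None)
        return coef[0]  # intercept = fitted value at theta

def local_approx_mcmc(surrogate, data, noise_std, x0, n_steps,
                      step, radius, beta, seed=0):
    rng = np.random.default_rng(seed)

    def log_post(theta):
        # Gaussian likelihood on the surrogate output, standard normal prior.
        misfit = (data - surrogate(theta)) / noise_std
        return -0.5 * misfit ** 2 - 0.5 * np.sum(theta ** 2)

    x = np.asarray(x0, dtype=float)
    chain = np.empty((n_steps, x.size))
    for t in range(n_steps):
        q = x + step * rng.standard_normal(x.size)            # step 1
        # Step 3, applied early: refine near q when the surrogate looks poor.
        if surrogate.nearest_distance(q) > radius or rng.uniform() < beta:
            surrogate.refine(q)
        alpha = np.exp(min(0.0, log_post(q) - log_post(x)))   # step 2
        if rng.uniform() < alpha:                             # step 4
            x = q
        chain[t] = x
    return chain

rng = np.random.default_rng(1)
surrogate = LocalSurrogate(true_forward_model, rng.standard_normal((10, 2)))
data = true_forward_model(np.array([0.5, 1.0]))
chain = local_approx_mcmc(surrogate, data, noise_std=0.2, x0=np.zeros(2),
                          n_steps=2000, step=0.5, radius=0.3, beta=0.01)
print(surrogate.n_runs(), "model runs for", len(chain), "MCMC steps")
```

Because stored runs are reused, the number of true model runs grows more slowly than the number of MCMC steps once the chain has explored the region of high posterior probability.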
Local approximations
- To compute $\tilde{f}(\theta)$, construct a model over the ball $B_R(\theta)$
- Use samples $\theta_i \in S$ at distance $r = \|\theta - \theta_i\| < R$
- Approximation converges locally under loose conditions [Cleveland]
- For example, quadratic approximations over $B_R(\theta)$ [Conn et al.]:
$$\|f - Q_R f\| \le \|f\| \kappa \lambda R^3$$
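A small numerical check of how a local quadratic fit improves as the ball shrinks. This is a sketch under assumptions: the test function, the center, and the 12-point sample set are arbitrary choices, and the fit is ordinary least squares. The same unit-disk sample geometry is rescaled for each radius so that only $R$ changes.

```python
import numpy as np

def quadratic_features(offsets):
    # Full quadratic basis in two dimensions: 1, x, y, x^2, xy, y^2.
    x, y = offsets[:, 0], offsets[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])

def fit_local_quadratic(f, center, sample_points):
    # Least-squares quadratic model of f built from samples near the center.
    coef, *_ = np.linalg.lstsq(quadratic_features(sample_points - center),
                               f(sample_points), rcond=None)
    return lambda pts: quadratic_features(pts - center) @ coef

def f(points):
    # Hypothetical smooth test function standing in for the forward model.
    return np.exp(points[:, 0]) * np.sin(points[:, 1])

def unit_disk_points(n, rng):
    # Uniform points in the unit disk: random angle, radius sqrt(u).
    angles = rng.uniform(0.0, 2.0 * np.pi, n)
    radii = np.sqrt(rng.uniform(0.0, 1.0, n))
    return np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])

rng = np.random.default_rng(0)
center = np.array([0.3, -0.2])
unit_samples = unit_disk_points(12, rng)   # fitting points, rescaled by R
unit_tests = unit_disk_points(200, rng)    # held-out points, rescaled by R

def max_error(R):
    model = fit_local_quadratic(f, center, center + R * unit_samples)
    test_points = center + R * unit_tests
    return np.max(np.abs(model(test_points) - f(test_points)))

errors = {R: max_error(R) for R in (0.4, 0.2, 0.1)}
for R, err in errors.items():
    print(f"R = {R}: max error over the ball = {err:.2e}")
```

With a cubic bound in $R$, halving the radius should cut the error by a factor of about eight, which the printed values can be compared against.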
Local approximation illustration
Figure: Sample sets at early times and at late times.
Models are refined using new points chosen when model quality appears poor
Ergodicity and exactness of approximate samplers
Assume the log-posterior is approximated with local quadratic models and $\theta \in \mathcal{X} \subseteq \mathbb{R}^n$ for compact $\mathcal{X}$, or $p(\theta|d)$ obeys a Gaussian envelope:
$$\lim_{r \to \infty} \sup_{|\theta| = r} \left| \log(p(\theta|d)) - \log(p_\infty(\theta)) \right| = 0$$
for some quadratic form $\log(p_\infty)$ with negative definite coefficient matrix.
Then, under standard regularity assumptions for the geometrically ergodic kernel $K_\infty$ and posterior $p(\theta|d)$, the chain $X_t$ is ergodic and asymptotically samples from the exact posterior:
$$\lim_{t \to \infty} \|P(X_t) - p(\theta|d)\|_{TV} = 0$$
Example: Elliptic permeability inversion
Infer parameters of k given observations of u in the PDE:
$$\nabla_s \cdot \left(k(s, \theta)\, \nabla_s u(s, \theta)\right) = 0,$$
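A minimal sketch of a forward model of this type, reduced to one spatial dimension. Everything beyond the PDE itself is an assumption: the domain $[0, 1]$, the Dirichlet boundary values, the finite-difference discretization, and the log-linear form of `permeability` are hypothetical choices for illustration, not the configuration used in the example.

```python
import numpy as np

def solve_elliptic_1d(k_func, n_cells=100, u_left=1.0, u_right=0.0):
    # Solve d/ds (k(s) du/ds) = 0 on [0, 1] with Dirichlet boundary values.
    s = np.linspace(0.0, 1.0, n_cells + 1)
    k_half = k_func(0.5 * (s[:-1] + s[1:]))  # permeability at cell midpoints
    n_int = n_cells - 1                      # number of interior nodes
    A = np.zeros((n_int, n_int))
    b = np.zeros(n_int)
    for i in range(n_int):
        # Flux balance at interior node i + 1.
        A[i, i] = -(k_half[i] + k_half[i + 1])
        if i > 0:
            A[i, i - 1] = k_half[i]
        else:
            b[i] -= k_half[i] * u_left       # known left boundary value
        if i < n_int - 1:
            A[i, i + 1] = k_half[i + 1]
        else:
            b[i] -= k_half[i + 1] * u_right  # known right boundary value
    u = np.empty(n_cells + 1)
    u[0], u[-1] = u_left, u_right
    u[1:-1] = np.linalg.solve(A, b)
    return s, u

def permeability(s, theta):
    # Hypothetical log-linear parameterization of k(s, theta).
    return np.exp(theta[0] + theta[1] * s)

# Constant permeability: the solution is the straight line u = 1 - s.
s, u_const = solve_elliptic_1d(lambda x: permeability(x, (0.5, 0.0)))
# Permeability increasing in s: the solution drops faster near s = 0.
s, u_var = solve_elliptic_1d(lambda x: permeability(x, (0.0, 2.0)))
print(u_const[50], u_var[50])
```

In an inversion, observations of `u` at a few locations would be compared with this output for candidate values of `theta`.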
Accuracy of chains
Figure: Relative covariance error versus MCMC step (log-log axes) for the true model and for linear, quadratic, and GP approximations.
Cost of chains
Figure: Total number of evaluations versus MCMC step (log-log axes) for the true model and for linear, quadratic, and GP approximations.
Prior and likelihood selection
- Priors are log-normal with expert-chosen mean and width
- Likelihoods are i.i.d. Gaussian with variance suggested by in situ experimental data
Parameter              Nominal value, µ′   Prior “width” σ′
Drag coefficients      1.5E-3              1.5E-3
Heat & Salt transfer   1.0E-4              0.5E-4
Prandtl Number         13.8                1.
Schmidt Number         2432.               200.
Horizontal Diffusion   5.0E-5              5.0E-5
ZetaN                  5.2E-2              0.5E-3
Temperature            –                   0.04
Salinity               –                   0.1
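The slides do not say how µ′ and σ′ map onto the parameters of the log-normal priors. One common convention, assumed here purely for illustration, treats them as the mean and standard deviation of the parameter itself and matches moments to obtain the mean and standard deviation of its logarithm. The function names are hypothetical.

```python
import math

def lognormal_params_from_moments(mean, std):
    # Moment matching: find (mu, sigma) of log(X) such that X has the
    # requested mean and standard deviation.
    sigma2 = math.log(1.0 + (std / mean) ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)

def lognormal_moments(mu, sigma):
    # Mean and standard deviation of X when log(X) ~ N(mu, sigma^2).
    mean = math.exp(mu + 0.5 * sigma ** 2)
    std = mean * math.sqrt(math.exp(sigma ** 2) - 1.0)
    return mean, std

# Prandtl Number row of the table: nominal value 13.8, width 1.
mu, sigma = lognormal_params_from_moments(13.8, 1.0)
mean, std = lognormal_moments(mu, sigma)
print(mu, sigma, mean, std)
```

Other readings are possible, for example taking µ′ as the median, so the mapping should be checked against the actual prior definition before reuse.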
Computational details and results
- Compute synthetic data using the fine-scale model, then try to infer the parameters using the coarse-scale model
- Constructed 30 parallel chains with shared evaluations
- Chains run for approximately two weeks
- Results shown after burn-in is removed
Inference cost summary
                    Samples    Model runs   Savings
Drill and surface   225,000    53,000       ≥ 4.2x
Surface only        450,000    52,000       ≥ 8.6x
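One reading of the Savings column, assumed here, is the ratio of MCMC samples to forward model runs, since a standard sampler needs at least one model run per sample. The arithmetic below reproduces the tabulated factors when the ratio is rounded down to one decimal place.

```python
import math

# (samples, model runs) for each data configuration, taken from the table.
runs = {
    "Drill and surface": (225_000, 53_000),
    "Surface only": (450_000, 52_000),
}

def savings_factor(samples, model_runs):
    # Samples per model run, rounded down to one decimal place.
    return math.floor(10 * samples / model_runs) / 10

savings = {name: savings_factor(*counts) for name, counts in runs.items()}
for name, factor in savings.items():
    print(f"{name}: >= {factor}x")
```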
Prior and posterior marginals
Figure: Prior and posterior marginals of the Drag, Transfer, Prandtl, Schmidt, Diff, and Zeta parameters, comparing the Drill and Surface posterior, the Surface Only posterior, and the Prior.
Contributions
- Introduce a novel framework for using local approximations within MCMC; prove that the framework produces asymptotically exact samples.
- Demonstrate strong numerical performance on canonical inference problems.
- Construct a realistic, synthetic inference problem for ice-ocean coupling near Pine Island Glacier.
- Apply local approximation methods to reduce computational cost of inference in the Pine Island Glacier setting.
This work is supported by the US Department of Energy, Office of Science, Office of Advanced Scientific Computing Research under Award Number DE-SC0007099, part of the SciDAC Institute for the Quantification of Uncertainty in Extreme-Scale Computations (QUEST). prconrad@mit.edu, ymarz@mit.edu