Post on 27-Mar-2015

Combining Observations and Models: A Bayesian View

Mark Berliner, OSU Stat Dept

• Bayesian Hierarchical Models

• Selected Approaches

• Geophysical Examples

• Discussion

Main Themes

1) Goal: Develop probability distributions for unknowns of interest by combining information sources: observations, theory, computer model output, past experience, etc.

2) Approaches: Bayesian Hierarchical Models. Incorporate various information sources by modeling

1. priors

2. data model or likelihoods

Bayesian Hierarchical Models

• Skeleton:

1. Data Model: [ Y | X , θ ]

2. Process Model Prior: [ X | θ ]

3. Prior on parameters: [ θ ]

• Bayes’ Theorem: posterior distribution: [ X , θ | Y ]

• Compare to

“Statistics”: [ Y | θ ] [ θ ]

“Physics”: [ X | θ(Y) ]
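
To make the three-level skeleton concrete, here is a minimal sketch of a toy hierarchy in which every level is Gaussian with known variances, so the joint posterior of the process and the parameter is available in closed form. The scalar setup and the names `obs_var`, `proc_var`, `prior_mean`, and `prior_var` are assumptions made for illustration; they are not from the talk.

```python
import numpy as np

def bhm_posterior(y, obs_var, proc_var, prior_mean, prior_var):
    """Posterior mean and covariance of z = (X, theta) given data y.

    Toy hierarchy (all Gaussian, all variances assumed known):
      data model      y_i | X    ~ N(X, obs_var)
      process model   X | theta  ~ N(theta, proc_var)
      parameter prior theta      ~ N(prior_mean, prior_var)
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    # Precision matrix of the joint Gaussian posterior of (X, theta).
    prec = np.array([
        [n / obs_var + 1.0 / proc_var, -1.0 / proc_var],
        [-1.0 / proc_var, 1.0 / proc_var + 1.0 / prior_var],
    ])
    # Linear term: the data pull on X, the prior pulls on theta.
    lin = np.array([y.sum() / obs_var, prior_mean / prior_var])
    cov = np.linalg.inv(prec)
    return cov @ lin, cov
```

Calling `bhm_posterior([0.9, 1.2, 1.1], 0.5, 1.0, 0.0, 4.0)` returns the posterior mean vector and 2x2 covariance for (X, theta). In realistic problems the levels are not conjugate and the posterior is usually explored by simulation instead.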

Approaches

A. Stochastic models incorporating science

• Physical-statistical modeling (Berliner 2003 JGR): from ``F=ma'' to [ X | θ ]

• Qualitative use of theory (e.g., Pacific SST model; Berliner et al. 2000 J. Climate)

B. Incorporating large-scale computer models

1) From model output to priors [ θ ]

2) Model output as samples from process model prior [ X | θ ] (almost!)

3) Model output as ``observations'' (Y)

C. Combinations

Glacial Dynamics (Berliner et al. 2008 J. Glaciol)

Steady Flow of Glaciers and Ice Sheets

• Flow: gravity moderated by drag (base & sides) & ….stuff….

• Simple models: flow from geometry

Data: Program for Arctic Climate Regional Assessments & Radarsat Antarctic Mapping Project

• surface topography (laser altimetry)

• basal topography (radar altimetry)

• velocity data (interferometry)

Modeling: surface – s, thickness – H, velocity – u

Physical Model

• Basal Stress: τ = − ρ g H ds/dx (+ “stuff”)

• Velocities: u = u_b + β_0 H τ^n

where u_b = k τ^p + ( ρ g H )^(−q)

Our Model

• Basal Stress: τ = − ρ g H ds/dx + η, where η is a ``corrector process''; H, s unknown

• Velocities: u = u_b + β H τ^n + e

where u_b = k τ^p + ( ρ g H )^(−q) or a constant; β is unknown, e is a noise process
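
As a rough illustration of "flow from geometry", the deterministic part of the physical model can be evaluated directly from a surface profile and a thickness profile. This is a minimal sketch under stated assumptions: the surface slope is taken by finite differences, the ice density value, the exponent default `n=3`, and the flow coefficient passed as `beta` are illustrative choices rather than values from the talk, and the corrector and noise processes are left out.

```python
import numpy as np

RHO_ICE = 917.0  # kg/m^3, a commonly used density for glacier ice (assumed)
G = 9.81         # m/s^2

def basal_stress(x, s, H):
    """Stress from geometry: -rho * g * H * ds/dx.

    x: along-flow positions (m), s: surface elevation (m), H: thickness (m).
    The slope ds/dx is approximated with finite differences.
    """
    dsdx = np.gradient(s, x)
    return -RHO_ICE * G * H * dsdx

def surface_velocity(tau, H, beta, n=3.0, u_b=0.0):
    """Velocity u = u_b + beta * H * tau**n, with u_b treated as a constant."""
    return u_b + beta * H * tau**n

# Illustrative geometry: uniform slab, surface dropping 1 m per 100 m.
x = np.linspace(0.0, 10_000.0, 11)
s = 1000.0 - 0.01 * x
H = np.full_like(x, 1000.0)
tau = basal_stress(x, s, H)
u = surface_velocity(tau, H, beta=1e-24)  # beta is a made-up value
```

In the statistical version, the thickness, surface, flow coefficient, and corrector are unknowns with priors, so this forward calculation would sit inside the process model instead of being run once.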

Wavelet Smoothing of Base

Results: Velocity

Results: Stress and Corrector

Paleoclimate (Brynjarsdóttir & Berliner 2009)

Climate proxies: Tree rings, ice cores, corals, pollen, underground rock provide indirect information on climate

• Inverse problem: proxy = f(climate)

Boreholes: Earth stores info on surface temp’s

• Model: Heat equation

Borehole data = f(surface temp’s)

• Infer boundary condition (initial cond. is nuisance)

Modeling

• Data Model:

Y | T_r , θ ~ N( T_r + T_0 1 + q R(k) , σ_Y² I )

(T_r: true temp; other mean terms: adjustments for rock types, etc.)

• Process Model: heat equation applied to T_r with b.cond. surface temp history T_h

T_r | T_h , θ ~ N( B T_h , σ_r² I )

T_h | θ ~ N( 0 , σ_h² I )
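
Because every level of the borehole model is linear and Gaussian, the posterior for the surface temperature history has a closed form once the variances and adjustment terms are fixed. The sketch below is an illustration under several assumptions: the history is piecewise constant, the forward operator `B` is built from the half-space step response of the heat equation, the diffusivity value is a typical order of magnitude, and `y` is the borehole profile with the adjustment terms already removed. The function names and variance arguments are hypothetical.

```python
import math
import numpy as np

def step_response_matrix(depths, edges, kappa):
    """Forward operator B for a piecewise-constant surface history.

    depths: depths in m; edges: interval bounds in years before present,
    decreasing and ending at 0; kappa: thermal diffusivity in m^2/yr.
    B[i, j] is the present-day anomaly at depths[i] per unit surface
    anomaly held during the interval (edges[j], edges[j + 1]).
    """
    B = np.zeros((len(depths), len(edges) - 1))
    for i, z in enumerate(depths):
        for j in range(len(edges) - 1):
            older, younger = edges[j], edges[j + 1]
            on = math.erfc(z / (2.0 * math.sqrt(kappa * older)))
            off = (math.erfc(z / (2.0 * math.sqrt(kappa * younger)))
                   if younger > 0 else 0.0)
            B[i, j] = on - off
    return B

def posterior_history(y, B, var_y, var_r, var_h):
    """Posterior mean and covariance of the history, variances held fixed."""
    noise = var_y + var_r  # the true-temperature level is integrated out
    prec = B.T @ B / noise + np.eye(B.shape[1]) / var_h
    cov = np.linalg.inv(prec)
    return cov @ (B.T @ y) / noise, cov

depths = np.arange(0.0, 501.0, 10.0)
edges = [500.0, 200.0, 100.0, 50.0, 0.0]
B = step_response_matrix(depths, edges, kappa=31.5)  # about 1e-6 m^2/s
```

In the full hierarchical model the variances and adjustment terms are themselves unknown, so a calculation like this would be one conditional step inside a larger sampler.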

In progress:

• Combining boreholes (parameters and b.cond as samples from a distribution)

• Combining with other sources and proxies

Bayesian Hierarchical Models to Augment the Mediterranean Forecast System (MFS)

Ralph Milliff, CoRA

Chris Wikle, Univ. Missouri

Mark Berliner, Ohio State Univ.

Nadia Pinardi, INGV (Istituto Nazionale di Geofisica e Vulcanologia), Univ. Bologna (MFS Director)

Alessandro Bonazzi, Srdjan Dobricic, INGV, Univ. Bologna

Bayesian Modeling in Support of Massive Forecast Models

1. MFS is an Ocean Model

2. A Boundary Condition/Forcing: Surface Winds

3. Approach: produce surface vector winds (SVW), for ensemble data assimilation

• Exploit abundant, “good” satellite wind data (QuikSCAT)

• Samples from our winds-posterior ensemble for MFS

(Before us: coarse wind field (ECMWF))

“Rayleigh Friction Model” for winds (Linear Planetary Boundary Layer Equations)

Theory

(neglect second order time derivative)

discretize:

Our model

[Figure: BHM Ensemble Winds. 10 members selected from the posterior distribution (blue); scale arrow: 10 m/s]

Part B) Information from Models

1) Develop prior from model output

• Think of model output runs O_1, … , O_n as samples from some distribution

• Do data analysis on O’s to estimate distribution

• Use result (perhaps with modifications) as a prior for X

• Example: O’s are spatial fields: estimate spatial covariance function of X based on O’s.

• Example: Berliner et al. (2003) J. Climate
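
The spatial-field example can be sketched as follows: treat each model run as one draw of the field and estimate a prior mean and covariance for X from the ensemble. This is only an illustration. The `inflate` factor is a hypothetical stand-in for the "perhaps with modifications" step, and a raw sample covariance from a handful of runs would normally need smoothing or a parametric covariance function before use.

```python
import numpy as np

def prior_from_runs(runs, inflate=1.0):
    """Estimate a Gaussian prior for a field X from model output.

    runs: array of shape (n_runs, n_sites), one row per model run.
    Returns the ensemble mean and the (optionally inflated) sample covariance.
    """
    runs = np.asarray(runs, dtype=float)
    mean = runs.mean(axis=0)
    cov = np.cov(runs, rowvar=False)  # unbiased covariance across runs
    return mean, inflate * cov
```

For example, `prior_from_runs(output_array, inflate=1.5)` would widen the prior to reflect doubt that the runs capture the full variability of the real process.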

2) Model output as realizations of prior “trends”

• Process Model Prior

X = O + δ

where δ is “model error”, “bias”, “offset”

• [ Y | X , θ ] is measurement error model:

Y = X + e

• Substitution yields [ Y | O , δ , θ ]

Y = O + δ + e

• Modeling δ is crucial (I have seen δ set to 0)
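
A scalar sketch shows why the model-error term matters when model output centers the prior. It assumes a Gaussian model error (called `delta` in the code) with known mean and variance and a Gaussian measurement error with known variance; all names are hypothetical.

```python
def posterior_x(y, o, delta_mean, delta_var, obs_var):
    """Posterior mean and variance of X when X = O + delta and Y = X + e.

    delta ~ N(delta_mean, delta_var) is the model error, e ~ N(0, obs_var)
    is the measurement error, and both variances are assumed known.
    """
    prior_mean = o + delta_mean
    if delta_var == 0.0:
        # No model error allowed: X is pinned to the model output.
        return prior_mean, 0.0
    weight = delta_var / (delta_var + obs_var)
    mean = prior_mean + weight * (y - prior_mean)
    var = delta_var * obs_var / (delta_var + obs_var)
    return mean, var
```

With `delta_var` set to zero the observation has no influence at all, which is the danger of setting the model error to 0: the posterior simply reproduces the computer model.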

3) Model output as “observations”

• Data Model: [ Y, O | X , θ ] ( = [ Y | X , θ ] [ O | X , θ ])

• [ O | X , θ ] to include “bias, offset, ..”

• Previous approach: start by constructing [ X | O , θ ]

This approach: construct [ O | X , θ ]

• Model for “bias” a challenge in both cases

• This is not uncommon, though not always made clear
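
When model output is treated as a second kind of observation, the factorized data model leads to a precision-weighted combination. The scalar sketch below assumes Gaussian errors, a Gaussian prior on X, and a Gaussian bias in the model output with known mean and variance; the function and argument names are illustrative only.

```python
def posterior_x_two_sources(y, o, prior_mean, prior_var, y_var, o_var,
                            bias_mean=0.0, bias_var=0.0):
    """Posterior of X given an observation y and a model output o.

    y | X ~ N(X, y_var)
    o | X ~ N(X + bias, o_var), with bias ~ N(bias_mean, bias_var)
    X     ~ N(prior_mean, prior_var)
    """
    # Uncertainty about the bias adds to the model-output variance.
    o_eff_var = o_var + bias_var
    prec = 1.0 / prior_var + 1.0 / y_var + 1.0 / o_eff_var
    mean = (prior_mean / prior_var + y / y_var
            + (o - bias_mean) / o_eff_var) / prec
    return mean, 1.0 / prec
```

Here the model output behaves like another measuring device: a large `bias_var` automatically down-weights it relative to the real observation, and a prior on X is still required.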

A Bayesian Approach to Multi-model Analysis and Climate Projection

(Berliner and Kim 2008, J Climate)

Climate Projection:

– Future climate depends on future, but unknown, inputs.

– IPCC: construct plausible future inputs, “SRES Scenarios” (CO2 etc.)

– Assume a scenario and get corresponding projection

Hemispheric Monthly Surface Temperatures

• Observations (Y) for 1882-2001.

Data Model: Gaussian with mean = true temp. & unknown variance (with a change-point)

• Two models (O): PCM (n=4), CCSM (n=1) for 2002-2197, and 3 SRES scenarios (B1, A1B, A2).

Data Model: assumes O’s are Gaussian with mean = μ_t + model bias_t (different for the two models) and unknown, time-varying variances (different for the two models)

• All are assumed conditionally independent

Notes (Freeze time)

• Data model for kth ensemble member from Model j:

O_jk = μ + b_j + e_jk

– μ is common to both Models

– b_j is Model j bias

– E( e_jk ) = 0 and variances of e’s depend on j

• Computer model model:

μ = X + e, where E(e) = 0

• Priors for biases, variances, and X

• Extensions to different model classes (more μ’s) and richer models are feasible.
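
At a single frozen time, and with all variances held fixed, the structure in these notes is a linear Gaussian model, so the joint posterior of the truth, the common model term, and the biases can be computed in one step. The sketch below is a simplification for illustration: the talk places priors on the variances as well, while here they are treated as known. It writes the common term as `mu`, and all argument names and default values are hypothetical.

```python
import numpy as np

def multimodel_posterior(y, runs, y_var, run_vars,
                         x_prior_mean=0.0, x_prior_var=100.0,
                         link_var=1.0, bias_var=1.0):
    """Posterior mean and covariance of z = (X, mu, b_1, ..., b_J).

    y: one observation of X; runs: list of lists, runs[j] holds the
    ensemble members of model j; run_vars[j]: member variance for model j.
    """
    dim = 2 + len(runs)
    rows, vals, variances = [], [], []

    def relation(coeffs, value, var):
        # Record one Gaussian relation: sum_i coeffs[i] * z[i] ~ N(value, var).
        a = np.zeros(dim)
        for idx, c in coeffs.items():
            a[idx] = c
        rows.append(a)
        vals.append(value)
        variances.append(var)

    relation({0: 1.0}, x_prior_mean, x_prior_var)    # prior on X
    relation({0: 1.0}, y, y_var)                     # Y = X + noise
    relation({1: 1.0, 0: -1.0}, 0.0, link_var)       # mu = X + e
    for j, members in enumerate(runs):
        relation({2 + j: 1.0}, 0.0, bias_var)        # prior on bias b_j
        for o in members:
            relation({1: 1.0, 2 + j: 1.0}, o, run_vars[j])  # O_jk

    A = np.array(rows)
    w = 1.0 / np.array(variances)
    prec = A.T @ (A * w[:, None])
    cov = np.linalg.inv(prec)
    mean = cov @ (A.T @ (w * np.array(vals)))
    return mean, cov
```

A call such as `multimodel_posterior(0.4, [[0.6, 0.5, 0.7, 0.6], [0.9]], 0.05, [0.1, 0.1])` mirrors the four-member and one-member ensembles. Adding further model classes would mean adding more relations of the same form.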

[Figure: IPCC (global) vs. us (NH); Figure 10.4]

Discussion: Which approach is best?

• Depends on form and quality of observations and models, and on practicality

• Developing a prior for X from a scientific model (Part A) offers strong incorporation of theory, but practical limits on the richness of [ X | θ ] may arise

• Model output as “observations”

– Combining models: just like different measuring devices

– Nice for analysis & mixed (obs’ & comp.) design

– Need a prior [ X | θ ]

• Model output as realizations of prior “trends”

– Most common among Bayesian statisticians

– Combining models: like combining experts

Discussion: Models versus Reality

Need for modeling differences between X’s and O’s.

Model “assessment” (“validation”, “verification”) helps, but is difficult in complicated settings:

– Global climate models. Virtually no observations at the scales of the models.

– Tuning. Modify model based on observations.

– Observations are imperfect, and are often output of other physical models.

– Massive data. Comparing space-time fields

Discussion, Cont’d

• Part C) Combining approaches

– Example: Wikle et al. 2001, JASA. Combined observations and large-scale model output as data with a prior based on some physics

• Usually, many physical models. No best one, so it’s nice to be flexible in incorporating their information

Thank You!