Introduction to Data Assimilation
Peter Jan van Leeuwen
IMAU
Basic estimation theory
$T_0 = T + e_0$
$T_m = T + e_m$
$E\{e_0\} = 0$, $E\{e_m\} = 0$
$E\{e_0^2\} = s_0^2$, $E\{e_m^2\} = s_m^2$
$E\{e_0 e_m\} = 0$
Assume a linear best estimate: $T_n = a T_0 + b T_m$, with $T_n = T + e_n$
Find a and b such that:
$E\{e_n\} = 0$, which gives $b = 1 - a$
$E\{e_n^2\}$ minimal, which gives $a = \frac{s_m^2}{s_0^2 + s_m^2}$
Solution: $T_n = \frac{s_m^2}{s_0^2 + s_m^2} T_0 + \frac{s_0^2}{s_0^2 + s_m^2} T_m$
and
$\frac{1}{s_n^2} = \frac{1}{s_0^2} + \frac{1}{s_m^2}$
Note: $s_n$ is smaller than both $s_0$ and $s_m$!
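The weights and the combined variance above can be checked numerically. A minimal sketch, assuming made-up values for the two estimates and their error variances (the function name `blue` and all numbers are illustrative, not from the slides):

```python
def blue(t0, s0_sq, tm, sm_sq):
    """Combine two unbiased estimates of T whose errors are uncorrelated.

    t0, tm       : the two estimates (e.g. observation and model)
    s0_sq, sm_sq : their error variances s0^2 and sm^2
    Returns the combined estimate Tn and its error variance sn^2.
    """
    a = sm_sq / (s0_sq + sm_sq)   # weight on T0, from minimising E{en^2}
    b = 1.0 - a                   # weight on Tm, from E{en} = 0
    tn = a * t0 + b * tm
    sn_sq = 1.0 / (1.0 / s0_sq + 1.0 / sm_sq)
    return tn, sn_sq


# Hypothetical numbers: T0 = 10.0 with variance 1.0, Tm = 12.0 with variance 4.0
tn, sn_sq = blue(10.0, 1.0, 12.0, 4.0)
```

The more accurate estimate receives the larger weight, and the combined variance comes out below both input variances.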
Basic estimation theory
Best Linear Unbiased Estimate BLUE
Just least squares!!!
Can we generalize this?
• More dimensions
• Nonlinear estimates (why linear?)
• Observations that are not directly modeled
• Biases
The basics: probability density functions
[Figure: probability density P(u) as a function of velocity u (m/s)]
The model pdf: P[u(x1), u(x2), T(x3), ..]
[Figure: joint pdf with axes u(x1), u(x2) and T(x3)]
Observations
• In situ observations: e.g. sparse hydrographic observations, irregular in space and time
• Satellite observations: e.g. of the sea-surface
NO INVERSION !!!
Data assimilation: general formulation
Bayes’ Theorem
Conditional pdf:
Similarly:
Combine:
Even better:
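The four steps listed above follow the standard derivation of Bayes' theorem. A sketch, assuming $\psi$ denotes the model state and $d$ the observations (the symbols are assumptions made here for illustration):

```latex
\begin{align}
\text{Conditional pdf:} \quad & p(\psi \mid d) = \frac{p(\psi, d)}{p(d)} \\
\text{Similarly:} \quad & p(d \mid \psi) = \frac{p(\psi, d)}{p(\psi)} \\
\text{Combine:} \quad & p(\psi \mid d) = \frac{p(d \mid \psi)\, p(\psi)}{p(d)} \\
\text{Even better:} \quad & p(\psi \mid d) = \frac{p(d \mid \psi)\, p(\psi)}{\int p(d \mid \psi)\, p(\psi)\, d\psi}
\end{align}
```

The last form makes explicit that the denominator is only a normalisation constant: the posterior follows from multiplying the prior by the likelihood, with no inversion of the observations.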
Filters and smoothers
[Figure: time axis]
Filter: solve 3D problem several times
Smoother: solve 4D problem once
Note: the model is (highly) nonlinear!
Model equation:
Pdf evolution: Kolmogorov’s equation (Fokker-Planck equation)
Pdf evolution in time
Only consider mean and covariance
At observation times:
- Assume $p(d|\psi)$ and $p(\psi)$ are Gaussian, and use Bayes
- The mean of the product of 2 Gaussians is equal to a linear combination of the 2 means: $E\{\psi|d\} = a E\{\psi\} + b E\{d|\psi\}$
- But we have seen this before in the first example!
(Ensemble) Kalman Filter
Old solution: $T_n = \frac{s_m^2}{s_0^2 + s_m^2} T_0 + \frac{s_0^2}{s_0^2 + s_m^2} T_m$
But now for covariance matrices:
$m_{new} = R (P + R)^{-1} m_{old} + P (P + R)^{-1} d$
Kalman Filter notation: $m_{new} = m_{old} + K (d - H m_{old})$
with Kalman gain $K = P H^T (H P H^T + R)^{-1}$
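The two forms of the update above agree when every state variable is observed directly (H equal to the identity). A minimal NumPy sketch, assuming a hypothetical two-variable state; the covariance update (I - K H) P is the standard Kalman form and does not appear on the slide:

```python
import numpy as np


def kalman_update(m_old, P, d, H, R):
    """One Kalman analysis step: returns the updated mean and covariance."""
    S = H @ P @ H.T + R                  # innovation covariance H P H^T + R
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    m_new = m_old + K @ (d - H @ m_old)  # mean update from the slide
    P_new = (np.eye(len(m_old)) - K @ H) @ P  # standard covariance update (assumption)
    return m_new, P_new


# Hypothetical numbers, with both variables observed directly (H = I)
m_old = np.array([12.0, 3.0])
P = np.array([[4.0, 1.0], [1.0, 2.0]])
d = np.array([10.0, 2.5])
H = np.eye(2)
R = np.array([[1.0, 0.0], [0.0, 0.5]])

m_new, P_new = kalman_update(m_old, P, d, H, R)

# Weighted-mean form, valid for H = I: R (P+R)^-1 m_old + P (P+R)^-1 d
PR_inv = np.linalg.inv(P + R)
m_alt = R @ PR_inv @ m_old + P @ PR_inv @ d
```

Because P has off-diagonal entries, an observation of one variable also corrects the other.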
(Ensemble) Kalman Filter II
The error covariance tells us how model variables co-vary
For example SSH at point x with SSH at point y:
$P_{SSH\,SSH}(x,y) = E\{ (SSH(x) - E\{SSH(x)\}) (SSH(y) - E\{SSH(y)\}) \}$
Or SSH at point x and SST at point y:
$P_{SSH\,SST}(x,y) = E\{ (SSH(x) - E\{SSH(x)\}) (SST(y) - E\{SST(y)\}) \}$
[Figure: spatial correlation of SSH and SST in the Indian Ocean (Haugen and Evensen, 2002)]
[Figure: covariances between model variables (Haugen and Evensen, 2002)]
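The covariances defined above are expectations; in an ensemble method they would typically be estimated as sample averages over the members. A minimal sketch with a synthetic ensemble (the variable names, coefficients and ensemble size are illustrative assumptions, not values from Haugen and Evensen):

```python
import numpy as np

rng = np.random.default_rng(0)
n_members = 500

# Synthetic ensemble (assumption): all three variables share a common signal
common = rng.standard_normal(n_members)
ssh_x = common + 0.5 * rng.standard_normal(n_members)
ssh_y = 0.8 * common + 0.5 * rng.standard_normal(n_members)
sst_y = -0.6 * common + 0.5 * rng.standard_normal(n_members)


def sample_cov(a, b):
    """Sample estimate of E{(a - E{a})(b - E{b})} over the ensemble."""
    return np.mean((a - a.mean()) * (b - b.mean()))


P_ssh_ssh = sample_cov(ssh_x, ssh_y)  # population value is 0.8 by construction
P_ssh_sst = sample_cov(ssh_x, sst_y)  # population value is -0.6 by construction
```

The sample values scatter around the population values, and the scatter grows as the ensemble shrinks.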
Summary on Kalman filters:
• Gaussian pdf’s for model and observations
• Propagation of error covariance P
If N operations for state vector evolution, then N^2 operations for P evolution…
Problems:
• Nonlinear dynamics, so non-Gaussian statistics
• Evolution equation for P not closed
• Size of P (> 1,000,000,000,000) ….
Propagation of pdf in time: ensemble or particle methods
Example of Ensemble Kalman Filter (EnKF)
MICOM model with 1.3 million model variables
Observations: altimetry, infra-red
Validated with hydrographic observations
[Figure panels: SST (-2 K to +2 K); SSH (-10 cm to +10 cm)]
[Figure: RMS difference with XBT data]
Spurious covariances
Local updating: restrict update using only local covariances:
EnKF:
with Kalman gain
Schur product, or direct cut-off
Localization in EnKF-like methods
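With a small ensemble, sample covariances between distant points are mostly noise. A minimal sketch of both options named above, assuming a 1-D grid, a Gaussian taper and hypothetical sizes (none of these choices come from the slides):

```python
import numpy as np


def localization_matrix(n, length_scale):
    """Taper rho[i, j] that decays with grid distance |i - j|.

    A Gaussian taper is used for brevity (assumption); compactly
    supported tapers are a common alternative.
    """
    idx = np.arange(n)
    dist = np.abs(idx[:, None] - idx[None, :])
    return np.exp(-0.5 * (dist / length_scale) ** 2)


rng = np.random.default_rng(1)
n_state, n_members = 40, 10

# Ensemble of independent noise: the true covariance is diagonal,
# so every off-diagonal sample covariance is spurious.
ensemble = rng.standard_normal((n_state, n_members))
anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
P_sample = anomalies @ anomalies.T / (n_members - 1)

# Option 1: Schur (element-wise) product with the taper
rho = localization_matrix(n_state, length_scale=3.0)
P_local = rho * P_sample

# Option 2: direct cut-off, keeping covariances within 5 grid points only
idx = np.arange(n_state)
mask = np.abs(idx[:, None] - idx[None, :]) <= 5
P_cutoff = np.where(mask, P_sample, 0.0)
```

Both options leave the variances on the diagonal untouched and suppress the long-range entries.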
Ensemble Kalman Smoother (EnKS)
Basic idea: use covariances over time.
Efficient implementation:
1) run EnKF, store ensemble at observation times
2) add influence of data back in time using covariances at different times
Nonlinear filters
[Figure: probability density function of layer thickness of first layer at day 41 during data assimilation; axis ticks 0 to 10 and 408 to 456]
No Kalman filter
No variational methods
The particle filter (Sequential Importance Resampling, SIR)
Ensemble
with
Particle filter
SIR-results for a quasi-geostrophic ocean model around South Africa with 512 members
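The SIR idea can be illustrated on a scalar problem: weight each ensemble member by the likelihood of the observation, then resample in proportion to the weights. A minimal sketch, assuming a directly observed scalar state with Gaussian observation error (the function name `sir_step` and all numbers are hypothetical and unrelated to the quasi-geostrophic experiment):

```python
import numpy as np


def sir_step(particles, d, obs_std, rng):
    """One SIR analysis step for a directly observed scalar state.

    Weights are the Gaussian likelihood p(d | particle); resampling draws
    particles with probability proportional to their weights.
    """
    log_w = -0.5 * ((d - particles) / obs_std) ** 2
    w = np.exp(log_w - log_w.max())  # subtract the maximum for numerical safety
    w /= w.sum()
    chosen = rng.choice(len(particles), size=len(particles), p=w)
    return particles[chosen], w


rng = np.random.default_rng(42)
prior = rng.normal(loc=0.0, scale=2.0, size=5000)  # hypothetical prior ensemble
posterior, weights = sir_step(prior, d=1.5, obs_std=1.0, rng=rng)
```

For this Gaussian test case the exact posterior mean is 1.2 and the variance 0.8, so the resampled ensemble can be compared against the Kalman answer; no Gaussian assumption on the prior is needed by the method itself.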
Smoothers: formulation
Model error
Initial error
Observation error
Boundary errors etc. etc.
Smoothers: prior pdf
Smoothers: posterior pdf
Assume all errors are Gaussian:
(terms: model, initial, observation)
Assume Gaussian pdf for model errors and observations:
in which
Find min J from variational derivative:
J is cost function or penalty function
(terms of J: model dynamics, initial condition, model-obs misfit)
Smoothers in practice: Variational methods
Gradient descent methods
[Figure: cost function J versus model variable, with successive iterates numbered 1 to 6 and 1’]
Forward integrations
Backward integrations
Nonlinear two-point boundary value problem, solved by linearization and iteration
The Euler-Lagrange equations
4D-VAR strong constraint
Assume model errors negligible:
In practice only a few linear and one or two nonlinear iterations are done….
No error estimate (Hessian too expensive and unwanted…)
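The forward and backward integrations can be illustrated with a toy strong-constraint problem in which the only control variable is the initial condition. A minimal sketch, assuming a scalar linear model and made-up observations (the model, variances, step size and data are all hypothetical; a real system would use the adjoint of a nonlinear model):

```python
import numpy as np

M = 0.9                                # toy linear model: x[k+1] = M * x[k]
B, R = 1.0, 0.25                       # background and observation error variances
x_b = 2.0                              # background (first-guess) initial condition
obs = np.array([2.6, 2.4, 2.1, 1.9])   # hypothetical observations at k = 1..4


def forward(x0):
    """Forward integration: model states at k = 1..len(obs)."""
    states = np.empty(len(obs))
    x = x0
    for k in range(len(obs)):
        x = M * x
        states[k] = x
    return states


def cost(x0):
    """J = initial-condition term + model-obs misfit (no model-error term)."""
    states = forward(x0)
    return 0.5 * (x0 - x_b) ** 2 / B + 0.5 * np.sum((states - obs) ** 2) / R


def gradient(x0):
    """Backward (adjoint) integration giving dJ/dx0."""
    states = forward(x0)
    lam = 0.0
    for k in reversed(range(len(obs))):
        lam = M * (lam + (states[k] - obs[k]) / R)
    return (x0 - x_b) / B + lam


# Plain gradient descent on the initial condition
x0 = x_b
for _ in range(200):
    x0 -= 0.05 * gradient(x0)
```

Each iteration costs one forward and one backward integration, which is why only a handful of iterations are affordable for a full ocean model.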
Example 4D-VAR: GECCO
• 1952 through 2001 on a 1° global grid with 23 layers in the vertical, using the ECCO/MIT adjoint technology.
• Model started from Levitus and NCEP forcing and uses state-of-the-art physics modules (GM, KPP).
• Control parameters: initial temperature and salinity fields
and the time varying surface forcing,
The Mean Ocean Circulation, global
Residual values can reveal inconsistencies in data sets (here geoid).
MOC at 25N
Bryden et al. (2005)
Error estimates
[Figure: cost function J plotted against X]
Local curvature from second derivative of J, the Hessian
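For a quadratic cost function with Gaussian errors, the inverse of the second derivative at the minimum equals the error variance of the estimate. A minimal sketch for the scalar two-estimate problem, assuming hypothetical values and a finite-difference second derivative (real systems cannot afford this for the full state):

```python
def cost(t, t0=10.0, s0_sq=1.0, tm=12.0, sm_sq=4.0):
    """Quadratic cost for two Gaussian estimates of the same scalar (toy values)."""
    return 0.5 * (t - t0) ** 2 / s0_sq + 0.5 * (t - tm) ** 2 / sm_sq


def second_derivative(f, x, h=1e-3):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2


t_min = 10.4                          # minimiser of the cost for the values above
hessian = second_derivative(cost, t_min)
error_variance = 1.0 / hessian        # inverse Hessian as the error estimate
```

Here the Hessian is 1/1.0 + 1/4.0 = 1.25, so the error variance is 0.8, the same combined variance that follows from adding the inverse variances of the two estimates.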
Other smoothers
Representers, PSAS, Ensemble Kalman smoother, ….
Simulated annealing (Metropolis Hastings), …
Relations between model variables
• Covariance gives linear correlations between variables
• Adjoint gives linear correlation between variables along a nonlinear model run (linear sensitivity)
• Pdf gives full nonlinear relation between variables (nonlinear sensitivity)
Parameter estimation
Bayes:
Looks simple, but we don’t observe model parameters….
We observe model fields, so:
in which H has to be found from model integrations
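Because the parameters are seen only through the model fields, every candidate parameter value needs its own model integration before the likelihood can be evaluated. A minimal sketch, assuming a one-parameter toy model, a flat prior and synthetic observations (the decay model, the symbol `alpha` and all numbers are hypothetical):

```python
import numpy as np


def model_fields(alpha, times):
    """Toy 'model integration': field decays as exp(-alpha * t) (assumption)."""
    return np.exp(-alpha * times)


times = np.array([0.5, 1.0, 1.5, 2.0])
alpha_true, obs_std = 0.7, 0.05
rng = np.random.default_rng(3)
d = model_fields(alpha_true, times) + obs_std * rng.standard_normal(times.size)

alphas = np.linspace(0.0, 2.0, 401)   # candidate parameter values
prior = np.ones_like(alphas)          # flat prior p(alpha) on the grid

# One model integration per candidate value, then the Gaussian likelihood
misfit = np.array([np.sum((d - model_fields(a, times)) ** 2) for a in alphas])
likelihood = np.exp(-0.5 * (misfit - misfit.min()) / obs_std ** 2)

posterior = likelihood * prior
posterior /= posterior.sum()          # normalise over the grid
alpha_mean = np.sum(alphas * posterior)
```

A grid search like this is only feasible for one or two parameters; with many parameters the same posterior has to be sampled, for example with a particle filter.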
Example: ecosystem modeling
29 parameters, of which 15 were estimated and 14 were kept fixed.
Estimated parameters from particle filter (SIR)
All other methods that were tried, including 4D-VAR and EnKF, failed.
Losa et al., 2001
Estimate size of model error
Brasseur et al, 2006
Why data assimilation?
• Forecasts
• Process studies
• Model improvements
  - model parameters
  - parameterizations
• ‘Intelligent monitoring’
Conclusions
• Evolution of pdf with time is essential ingredient
• Filters: dominated by Kalman-like methods, but moving towards nonlinear methods (SIR etc.)
• Smoothers: dominated by 4D-VAR,
New ideas needed!