Data Assimilation Theory
CTCD Data Assimilation Workshop, Nov 2005
Sarah Dance
(apologies to Rube Goldberg)
Data assimilation is often treated as a black box algorithm
IN: Observations and a priori information
OUT: Analysis
BUT understanding and developing what goes on inside the box is crucial!
Some DARC DA Theory Group Projects
Nonlinear assimilation techniques
Convergence of 4D-Var
Reduced order modelling
Stochastic Processes in DA
Treatment of observation error correlations
Background error modelling and balance
Multiscale DA
Phase errors and rearrangement theory
Model errors and bias correction
EnKF and bias
Information content
Observation targeting
Formulations of the Ensemble Kalman Filter and Bias
MSc thesis by David Livings, supervised by Sarah Dance and Nancy Nichols
Outline
• Bayesian state estimation and the Kalman Filter
• The EnKF
• Bias and the EnKF
• Conclusions
Prediction (between observations)
e.g. Suppose $x_k = M x_{k-1} + \eta$, where
M is linear, and the prior and the model noise $\eta$ are Gaussian:
$x_{k-1} \sim N(x_b, P)$, $\eta \sim N(0, Q)$
Then $p(x_k) = N(M x_b, M P M^T + Q)$
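The prediction step for the linear Gaussian example can be sketched in a few lines. This is a minimal illustration, not code from the talk: the function name `predict` and the numerical values of M, the background state, P and Q are assumptions chosen only to make the example concrete.

```python
import numpy as np

def predict(xb, P, M, Q):
    """Propagate N(xb, P) through x_k = M x_{k-1} + noise, with noise ~ N(0, Q)."""
    x_pred = M @ xb              # predicted mean: M xb
    P_pred = M @ P @ M.T + Q     # predicted covariance: M P M^T + Q
    return x_pred, P_pred

# Hypothetical two-variable example (values are illustrative only)
M = np.array([[1.0, 0.1],
              [0.0, 1.0]])
xb = np.array([1.0, 2.0])
P = np.eye(2)
Q = 0.01 * np.eye(2)

x_pred, P_pred = predict(xb, P, M, Q)
print(x_pred)
print(P_pred)
```

Note that the covariance grows by Q at every step, so uncertainty accumulates between observations.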
Bayes rule
At an observation we use Bayes rule
$p(x|y) \propto p(x)\,p(y|x)$
• $p(x)$: prior (background error distribution)
• $p(y|x)$: likelihood of observations (observation error pdf)
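For a scalar state with a Gaussian prior and a direct observation with Gaussian error, the product of prior and likelihood is again Gaussian, and its mean and variance have a closed form. The sketch below assumes that setting; the function name `gaussian_bayes_update` and the numbers are hypothetical.

```python
def gaussian_bayes_update(xb, var_b, y, var_o):
    """Posterior mean and variance for prior N(xb, var_b) and observation y = x + error,
    with error ~ N(0, var_o)."""
    gain = var_b / (var_b + var_o)   # weight given to the observation
    xa = xb + gain * (y - xb)        # posterior mean lies between xb and y
    var_a = (1.0 - gain) * var_b     # posterior variance is smaller than the prior variance
    return xa, var_a

# Illustrative values: prior and observation are equally uncertain
xa, var_a = gaussian_bayes_update(xb=1.0, var_b=1.0, y=2.0, var_o=1.0)
print(xa, var_a)
```

With equal prior and observation variances the posterior mean is the midpoint of the two and the variance is halved.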
Bayes rule illustrated
[Figure: prior $p(x)$ and likelihood $p(y|x)$]
Bayes rule illustrated (cont)
[Figure: prior $p(x)$, likelihood $p(y|x)$ and posterior $p(x|y) \propto p(x)\,p(y|x)$]
The Kalman Filter
• Use prediction equation and Bayes rule
• Assume linear models (forecast and observation)
• Assume Gaussian statistics
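Under these assumptions, applying Bayes rule at an observation time gives the standard Kalman filter analysis step. The following is a minimal sketch of that textbook update, not the author's code; the observation operator H, the covariance R and all numerical values are assumptions for illustration.

```python
import numpy as np

def kalman_analysis(xf, Pf, y, H, R):
    """Kalman filter analysis step for a linear observation operator H."""
    S = H @ Pf @ H.T + R                   # innovation covariance
    K = np.linalg.solve(S, H @ Pf).T       # gain K = Pf H^T S^{-1} (Pf and S are symmetric)
    xa = xf + K @ (y - H @ xf)             # analysis mean
    Pa = (np.eye(len(xf)) - K @ H) @ Pf    # analysis covariance
    return xa, Pa

# Hypothetical example: two state variables, only the first is observed
xf = np.array([1.0, 2.0])
Pf = np.eye(2)
H = np.array([[1.0, 0.0]])
R = np.array([[1.0]])
y = np.array([2.0])

xa, Pa = kalman_analysis(xf, Pf, y, H, R)
print(xa)
print(Pa)
```

Only the observed variable changes here because the illustrative forecast covariance has no cross-correlations.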
Kalman filter BUT
• Models are nonlinear
• Evolving large covariance matrices is expensive ($10^6 \times 10^6$ in meteorology)
• So use an ensemble (Monte Carlo idea)
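The Monte Carlo idea can be sketched as follows: each ensemble member is evolved with the full nonlinear model, and the mean and covariance are estimated from the sample instead of being evolved explicitly. The model `toy_model` below is a hypothetical stand-in, and the ensemble size and state dimension are chosen only for illustration.

```python
import numpy as np

def ensemble_statistics(ensemble):
    """Sample mean and covariance of an ensemble of shape (n, N), one member per column."""
    N = ensemble.shape[1]
    mean = ensemble.mean(axis=1)
    perturbations = ensemble - mean[:, None]
    cov = perturbations @ perturbations.T / (N - 1)
    return mean, cov

def toy_model(x):
    """Hypothetical nonlinear forecast model, used only for illustration."""
    return x + 0.1 * np.sin(x)

rng = np.random.default_rng(0)
ensemble = rng.normal(loc=1.0, scale=0.5, size=(3, 10))   # N = 10 members, 3 state variables
forecast = toy_model(ensemble)                            # each member is evolved independently
mean_f, cov_f = ensemble_statistics(forecast)
print(mean_f.shape, cov_f.shape)
```

In practice the covariance matrix is never formed for large systems; only the N members are stored.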
Results with ETKF (old formulation) and Peter Lynch’s swinging spring model
[Figure: N = 10, perfect observations. Red: ensemble mean. Blue: ensemble std. Error bars indicate obs std.]
Ensemble statistics not consistent with the truth!
Bias and the EnKF
• Many EnKF algorithms can be put into a “square root” framework.
• Define an ensemble perturbation matrix:
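$X = (x_1 - \bar{x}, \dots, x_N - \bar{x})$, up to a normalisation factor
[Figure: ensemble members $x_1, \dots, x_5$ scattered about the ensemble mean $\bar{x}$]
So, by definition of the ensemble mean,
$X \mathbf{1} = 0$, where $\mathbf{1} = (1, \dots, 1)^T$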
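The zero-sum property of the perturbation matrix is easy to check numerically. The sketch below assumes the common $1/\sqrt{N-1}$ scaling of the perturbations, which cannot be recovered from the slide itself; the function name `perturbation_matrix` and the random ensemble are illustrative.

```python
import numpy as np

def perturbation_matrix(ensemble):
    """Scaled deviations from the ensemble mean, for an ensemble of shape (n, N).
    The 1/sqrt(N-1) scaling is an assumed convention, chosen so that X X^T is the sample covariance."""
    N = ensemble.shape[1]
    mean = ensemble.mean(axis=1, keepdims=True)
    return (ensemble - mean) / np.sqrt(N - 1)

rng = np.random.default_rng(1)
ensemble = rng.normal(size=(3, 5))     # 3 state variables, 5 members
X = perturbation_matrix(ensemble)

ones = np.ones(5)
row_sums = X @ ones                    # X 1, which should vanish up to round-off
print(row_sums)
```

The property holds for any scaling, because each row of X sums the deviations of one state variable from its own ensemble mean.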
Square-root ensemble updates
The mean of the ensemble is updated separately.
Ensemble perturbations are updated as
$X^a = X^f T$
where T is a (non-unique) square root of an update equation.
Thus, for consistency,
$X^a \mathbf{1} = X^f T \mathbf{1} = 0$
David discovered that not all implementations preserve this property.
We have now found necessary and sufficient conditions for consistency.
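The consistency condition can be tested numerically for two different square roots of the same update. The sketch below uses the ETKF form, in which $T T^T = (I + Y^T R^{-1} Y)^{-1}$ with $Y = H X^f$, and compares a one-sided root $U D$ with the symmetric root $U D U^T$. Whether the one-sided root matches the "old formulation" mentioned in the talk is an assumption; H, R and the random ensemble are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n, N, p = 3, 5, 2                       # state size, ensemble size, number of observations
ensemble = rng.normal(size=(n, N))
Xf = (ensemble - ensemble.mean(axis=1, keepdims=True)) / np.sqrt(N - 1)

H = rng.normal(size=(p, n))             # hypothetical linear observation operator
R = 0.5 * np.eye(p)                     # hypothetical observation error covariance
Y = H @ Xf

# Eigendecomposition Y^T R^{-1} Y = U diag(gamma) U^T
gamma, U = np.linalg.eigh(Y.T @ np.linalg.solve(R, Y))
gamma = np.clip(gamma, 0.0, None)       # remove tiny negative round-off
d = 1.0 / np.sqrt(1.0 + gamma)

T_one_sided = U * d                     # U diag(d): a valid, non-symmetric square root
T_symmetric = (U * d) @ U.T             # U diag(d) U^T: the symmetric square root

ones = np.ones(N)
bias_one_sided = np.linalg.norm(Xf @ T_one_sided @ ones)
bias_symmetric = np.linalg.norm(Xf @ T_symmetric @ ones)
print(bias_one_sided, bias_symmetric)
```

Both choices give the same analysis covariance $X^a (X^a)^T$, but only the symmetric root keeps $X^a \mathbf{1} = 0$ here, since $\mathbf{1}$ is an eigenvector of $I + Y^T R^{-1} Y$ with eigenvalue 1.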
Consequences of an inconsistent update
• The ensemble will be biased
• The size of the ensemble spread will be too small
• Filter divergence is more likely to occur!
• Care must be taken in algorithm choice and implementation