Dusanka Zupanski and Scott Denning
Colorado State University, Fort Collins, CO 80523-1375
CMDL Workshop on Modeling and Data Analysis of Atmospheric CO2 Observations in North America, 29-30 September 2004
ftp://ftp.cira.colostate.edu/Zupanski/presentations
ftp://ftp.cira.colostate.edu/Zupanski/manuscripts
Critical issues of ensemble data assimilation in application to carbon cycle studies
OUTLINE:
- Introduction: EnsDA approaches
- Non-linear processes
- Model error and parameter estimation
- Uncertainty estimates
- Correlated observations
- Non-Gaussian PDFs
- Conclusions and future work
Dusanka Zupanski, CIRA/[email protected]
Probabilistic approach to data assimilation and forecasting, or Ensemble Data Assimilation (EnsDA)
Provides the following:
(1) Optimal solution or state estimate (e.g., optimal CO2 analysis)
(2) Optimal estimates of model error and empirical parameters
(3) Uncertainty of the analysis (a component of the analysis error covariance Pa )
(4) Uncertainty of the estimated model error and parameters (components of the analysis error covariance Pa )
(5) Estimate of forecast uncertainty (the forecast error covariance Pf )
DATA ASSIMILATION (ESTIMATION THEORY)
Discrete stochastic-dynamic model
$\mathcal{M}: \; x_k = M(x_{k-1}) + G(x_{k-1})\, w_{k-1}$
$w_{k-1}$ – model error (stochastic forcing)
$M$ – non-linear dynamic (NWP) model
$G$ – model (matrix) reflecting the state dependence of model error
Discrete stochastic observation model
$\mathcal{D}: \; y_k = H(x_k) + \varepsilon_k$
$\varepsilon_k$ – measurement + representativeness error
$H$ – non-linear observation operator ($\mathcal{M} \rightarrow \mathcal{D}$)
DATA ASSIMILATION EQUATIONS:
MAXIMUM LIKELIHOOD ESTIMATE (VARIATIONAL APPROACH):
$J(x) = \frac{1}{2}[x - x_b]^T P_f^{-1} [x - x_b] + \frac{1}{2}[H(x) - y_{obs}]^T R^{-1} [H(x) - y_{obs}] = \min$
MINIMUM VARIANCE ESTIMATE (KALMAN FILTER APPROACH):
(1) State estimate (optimal solution):
$x_a = x_b + P_f H^T (H P_f H^T + R)^{-1} (y_{obs} - H(x_b))$
(2) Estimate of the uncertainty of the solution:
KALMAN FILTER APPROACH
$P_f = M P_a M^T + G Q G^T$
ENSEMBLE KALMAN FILTER or EnsDA APPROACH
$(P_f)_{i,j} = [M(x + p^i) - M(x)]\,[M(x + p^j) - M(x)]^T$
In EnsDA the solution is defined in the ensemble subspace (reduced rank problem)!
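The minimum variance state estimate above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: a linear observation operator H, a forecast error covariance supplied in square-root form (P_f = S_f S_f^T with N_ens columns), and made-up sizes and values. The function name `ensemble_analysis` is hypothetical and not taken from the presentation.

```python
import numpy as np

def ensemble_analysis(x_b, S_f, H, R, y_obs):
    """Minimum variance update x_a = x_b + P_f H^T (H P_f H^T + R)^-1 (y_obs - H x_b),
    with P_f = S_f S_f^T never formed explicitly (S_f is N_state x N_ens)."""
    HS = H @ S_f                                   # ensemble perturbations in observation space
    innovation_cov = HS @ HS.T + R                 # H P_f H^T + R
    weights = np.linalg.solve(innovation_cov, y_obs - H @ x_b)
    return x_b + S_f @ (HS.T @ weights)            # increment is a combination of S_f columns

# Illustrative (assumed) problem: 6 state variables, 3 ensemble members, 2 observations.
rng = np.random.default_rng(0)
n_state, n_ens, n_obs = 6, 3, 2
S_f = rng.standard_normal((n_state, n_ens)) / np.sqrt(n_ens)
H = np.zeros((n_obs, n_state))
H[0, 0] = 1.0                                      # observe state variable 0
H[1, 3] = 1.0                                      # observe state variable 3
R = 0.1 * np.eye(n_obs)
x_b = np.zeros(n_state)
y_obs = np.array([1.0, -0.5])

x_a = ensemble_analysis(x_b, S_f, H, R, y_obs)
increment = x_a - x_b

# Project the increment back onto the ensemble subspace to check it lies inside it.
coeffs, *_ = np.linalg.lstsq(S_f, increment, rcond=None)
```

Because the increment is built from the columns of S_f, it stays inside the ensemble subspace, which is the reduced rank property noted on the slide.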
Ensemble Data Assimilation (EnsDA)
(1) Maximum likelihood approach (involves an iterative minimization of a functional): x_mode (MLEF, Zupanski 2004)
(2) Minimum variance approach (calculates the ensemble mean): x_mean
[Figure: two sketches of PDF(x) versus x. Non-Gaussian panel: x_mode and x_mean are marked at separate locations. Gaussian panel: x_mode = x_mean.]
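The difference between the two estimates can be illustrated with a skewed distribution. The sketch below uses a lognormal PDF purely as an assumed example of a non-Gaussian case (the presentation does not name a specific distribution); its mean and mode have closed forms, so the gap between them is easy to see.

```python
import numpy as np

# Assumed example of a non-Gaussian PDF: lognormal with parameters mu, sigma.
mu, sigma = 0.0, 0.75
rng = np.random.default_rng(1)
samples = rng.lognormal(mean=mu, sigma=sigma, size=200_000)

x_mean_sample = samples.mean()                 # what a minimum variance (ensemble mean) estimate targets
x_mean_exact = np.exp(mu + 0.5 * sigma**2)     # closed-form lognormal mean
x_mode_exact = np.exp(mu - sigma**2)           # closed-form lognormal mode (maximum of the PDF)

print(f"sample mean  : {x_mean_sample:.3f}")
print(f"exact mean   : {x_mean_exact:.3f}")
print(f"exact mode   : {x_mode_exact:.3f}")
```

For a Gaussian PDF the two values coincide, so the choice between the maximum likelihood and minimum variance approaches only matters once the PDF is skewed or otherwise non-Gaussian.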
Critical issues: Non-linear processes
- Use only non-linear models (tangent-linear, adjoint models are not needed)
- Iterative minimization is beneficial for non-linear processes
[Figure: IMPACT OF MINIMIZATION (quadratic observation operator, 10 obs). RMS error versus analysis cycle.]
Example: KdVB model (M. Zupanski, 2004)
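Why iteration helps with a quadratic observation operator can be seen in a scalar toy problem. The sketch below is not the KdVB experiment: it assumes a single state variable, H(x) = x^2, and made-up values for the background, variances, and observation, and it minimizes the cost function J with Gauss-Newton steps. All names are hypothetical.

```python
def cost(x, x_b, p_f, y_obs, r):
    """Scalar J(x) with the quadratic observation operator H(x) = x**2."""
    return 0.5 * (x - x_b) ** 2 / p_f + 0.5 * (x**2 - y_obs) ** 2 / r

def gradient(x, x_b, p_f, y_obs, r):
    """dJ/dx for the cost above."""
    return (x - x_b) / p_f + 2.0 * x * (x**2 - y_obs) / r

def minimize(x_b, p_f, y_obs, r, n_iter):
    """Gauss-Newton iterations started from the background x_b."""
    x = x_b
    history = [x]
    for _ in range(n_iter):
        h_lin = 2.0 * x                           # dH/dx, re-evaluated at the current iterate
        gn_hessian = 1.0 / p_f + h_lin**2 / r     # Gauss-Newton approximation of d2J/dx2
        x = x - gradient(x, x_b, p_f, y_obs, r) / gn_hessian
        history.append(x)
    return history

# Assumed values: background 1.5, observation of x**2 equal to 4 (i.e. x near 2).
x_b, p_f, y_obs, r = 1.5, 1.0, 4.0, 0.01
history = minimize(x_b, p_f, y_obs, r, n_iter=10)

cost_one_step = cost(history[1], x_b, p_f, y_obs, r)     # a single linearized update
cost_converged = cost(history[-1], x_b, p_f, y_obs, r)   # after iterating
print(cost_one_step, cost_converged)
```

A single update linearizes H about the background and overshoots; re-linearizing about each new iterate brings the cost down further, which is the effect the slide attributes to iterative minimization.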
Critical issues: Model error and parameter estimation
- Estimate and correct all major sources of uncertainty: initial conditions, model error, boundary conditions, empirical parameters
- Unified algorithm: EnsDA+state augmentation approach (Zupanski and Zupanski, 2004)
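The idea of state augmentation can be sketched on a toy problem. To keep the example exact and short, it swaps in a plain Kalman filter for the ensemble algorithm and assumes a hypothetical scalar model x_k = a x_{k-1} + b with an unknown constant bias b. The bias is appended to the state vector and estimated from observations of x alone. None of the names or numbers come from the presentation.

```python
import numpy as np

# Toy "truth" (assumed): x_k = a * x_{k-1} + b_true, with b_true unknown to the filter.
a, b_true = 0.9, 0.5

# Augmented state z = [x, b]; the bias is carried forward by persistence.
M_aug = np.array([[a, 1.0],     # x_k = a * x_{k-1} + b_{k-1}
                  [0.0, 1.0]])  # b_k = b_{k-1}
H = np.array([[1.0, 0.0]])      # only x is observed, never b
R = np.array([[0.01]])

z = np.array([0.0, 0.0])        # first guess: x = 0 and no bias
P = np.eye(2)                   # first-guess error covariance of the augmented state
x_true = 1.0

for _ in range(30):
    x_true = a * x_true + b_true
    y = np.array([x_true])                      # noise-free observation, for simplicity
    z = M_aug @ z                               # forecast of state and bias together
    P = M_aug @ P @ M_aug.T
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    z = z + K @ (y - H @ z)                     # cross-covariance in P updates the bias
    P = (np.eye(2) - K @ H) @ P

b_estimate = z[1]
print(f"estimated bias: {b_estimate:.4f} (true value {b_true})")
```

The unobserved bias is corrected only through its covariance with the observed variable, which is the mechanism an ensemble would supply through sample cross-covariances of the augmented state.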
[Figure: IMPACT OF MODEL BIAS (10 ens, 10 obs). RMS error versus cycle number. Curves: correct_model, neglect_err, bias_estim (dim = 101), bias_estim (dim = 10).]
[Figure: ESTIMATION OF DIFFUSION COEFFICIENT (102 ens, 101 obs). Diffusion coefficient value versus cycle number. Curves: estim value (0.07), true value (0.07), estim value (0.20), true value (0.20).]
Example: KdVB model
[Figure: six innovation histograms, PDF versus category bins (-5 to 5), arranged in a 10 obs column and a 101 obs column:
- Correct diffusion, 10 ens, 10 obs
- Correct diffusion, 10 ens, 101 obs
- Incorrect diffusion, 10 ens, 10 obs
- Incorrect diffusion, 10 ens, 101 obs
- Parameter estimation, 10 ens, 10 obs
- Parameter estimation, 10 ens, 101 obs]
EnsDA experiments with KdVB model (PARAMETER estimation impact)
[Figure: four panels of the INNOVATION χ² TEST versus analysis cycle:
- biased model (neglect_err, 10 ens, 10 obs)
- biased model (bias_estim, 10 ens, 10 obs, bias dim = 101)
- biased model (bias_estim, 10 ens, 10 obs, bias dim = 10)
- non-biased model (correct_model, 10 ens, 10 obs)]
EnsDA experiments with KdVB model (BIAS estimation impact)
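An innovation χ² diagnostic of the kind plotted in these experiments can be sketched as follows. The normalization by the number of observations (so that a statistically consistent system gives values near 1) is an assumption here, as are the function name and the synthetic innovations. The presentation does not give the formula.

```python
import numpy as np

def innovation_chi2(innovation, HPfHt, R):
    """Normalized chi-square: d^T (H P_f H^T + R)^-1 d / N_obs (assumed normalization)."""
    d = np.asarray(innovation, dtype=float)
    return float(d @ np.linalg.solve(HPfHt + R, d)) / d.size

rng = np.random.default_rng(2)
n_obs, n_cycles = 10, 2000
HPfHt = 0.5 * np.eye(n_obs)       # assumed forecast error covariance in observation space
R = 0.5 * np.eye(n_obs)           # assumed observation error covariance
L = np.linalg.cholesky(HPfHt + R)

# Innovations drawn with exactly the assumed covariance: a consistent system.
chi2_consistent = np.mean(
    [innovation_chi2(L @ rng.standard_normal(n_obs), HPfHt, R) for _ in range(n_cycles)]
)

# Same innovations plus a constant offset: a crude stand-in for a neglected model bias.
chi2_biased = np.mean(
    [innovation_chi2(L @ rng.standard_normal(n_obs) + 1.0, HPfHt, R) for _ in range(n_cycles)]
)

print(f"consistent: {chi2_consistent:.2f}, biased: {chi2_biased:.2f}")
```

Values well above 1 indicate that the innovations are larger than the prescribed covariances allow, which is the signature expected when a model bias is neglected.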
Critical issues: Uncertainty estimates
- Analysis error covariance Pa (analysis uncertainty)
- Forecast error covariance Pf (forecast uncertainty)
- Both defined in ensemble sub-space
KdVB model example:
Critical issues: Correlated observations
Problem: Numerous observations (~10^8 to 10^9) are being projected onto a small ensemble sub-space (~10^1 to 10^3), which leads to a loss of observed information!
Remedies: Process observations one by one (Anderson 2001; Bishop et al. 2001; Hamill et al. 2001), or process observations successively over relatively small local areas (LEKF, Ott et al. 2004).
Assumption in both approaches: Observations being processed separately are uncorrelated (independent)! This may not be justified for dense satellite observations.
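The role of the independence assumption can be checked on a small example. The sketch below compares a batch Kalman update with a one-by-one (serial) update, using a full covariance matrix rather than an ensemble for brevity. All matrices and values are assumed for illustration, and the function names are hypothetical.

```python
import numpy as np

def batch_update(x, P, H, R, y):
    """Assimilate all observations at once with the full matrix R."""
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    return x + K @ (y - H @ x), (np.eye(len(x)) - K @ H) @ P

def serial_update(x, P, H, r_diag, y):
    """Assimilate observations one by one, using only the error variances r_diag."""
    for h, r, y_i in zip(H, r_diag, y):
        k = P @ h / (h @ P @ h + r)         # gain for a single scalar observation
        x = x + k * (y_i - h @ x)
        P = P - np.outer(k, h @ P)          # (I - k h^T) P
    return x, P

x0 = np.zeros(3)
P0 = np.array([[1.0, 0.5, 0.2],
               [0.5, 1.0, 0.5],
               [0.2, 0.5, 1.0]])
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
y = np.array([1.0, 0.0])

R_uncorrelated = np.diag([1.0, 1.0])
R_correlated = np.array([[1.0, 0.8],
                         [0.8, 1.0]])       # same variances, correlated errors

x_serial, P_serial = serial_update(x0, P0, H, np.diag(R_uncorrelated), y)
x_batch_uncorr, P_batch_uncorr = batch_update(x0, P0, H, R_uncorrelated, y)
x_batch_corr, _ = batch_update(x0, P0, H, R_correlated, y)

print("serial            :", x_serial)
print("batch, diagonal R :", x_batch_uncorr)
print("batch, correlated :", x_batch_corr)
```

With uncorrelated errors the serial and batch analyses agree. Once the observation errors are correlated, the serial result (which sees only the variances) no longer matches the batch analysis, which is the concern raised for dense satellite observations.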
Critical issues: Correlated observations
How does the observed information impact the uncertainty estimate of the optimal solution (analysis error covariance Pa ) ?
$P_a^{1/2} = P_f^{1/2}\,[I + A(x_{opt})]^{-T/2}$
$P_a^{1/2}$ – square root of analysis error covariance (N_state x N_ens)
$P_f^{1/2}$ – square root of forecast error covariance (N_state x N_ens)
$A(x_{opt})$ – impact of observations on the optimal solution (N_ens x N_ens)
The eigenvalue spectrum of $(I+A)^{-1/2}$ may help understand the impact of observations, and perhaps find a better solution for correlated observations.
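A small numerical sketch of this square-root update follows. It assumes a linear observation operator and takes A = Z^T Z with Z = R^{-1/2} H P_f^{1/2}, which is one common way to define the observation impact matrix. The presentation does not spell out A, so treat this definition, the function name, and the sizes as assumptions.

```python
import numpy as np

def analysis_sqrt(S_f, H, R):
    """Return S_a = S_f (I + A)^(-1/2) and the eigenvalues of (I + A)^(-1/2),
    with A = Z^T Z and Z = R^(-1/2) H S_f (assumed definition of A)."""
    R_inv_sqrt = np.linalg.inv(np.linalg.cholesky(R))  # L^-1, where R = L L^T
    Z = R_inv_sqrt @ H @ S_f                           # N_obs x N_ens
    A = Z.T @ Z                                        # N_ens x N_ens, symmetric
    lam, V = np.linalg.eigh(A)                         # A = V diag(lam) V^T
    eig = 1.0 / np.sqrt(1.0 + lam)                     # eigenvalues of (I + A)^(-1/2)
    S_a = S_f @ (V * eig) @ V.T                        # symmetric inverse square root applied to S_f
    return S_a, eig

# Assumed sizes: 8 state variables, 4 ensemble members, 3 observations.
rng = np.random.default_rng(3)
n_state, n_ens, n_obs = 8, 4, 3
S_f = rng.standard_normal((n_state, n_ens)) / np.sqrt(n_ens)
H = np.eye(n_obs, n_state)                             # observe the first 3 state variables
R = 0.25 * np.eye(n_obs)

S_a, eig = analysis_sqrt(S_f, H, R)
print("eigenvalues of (I + A)^(-1/2):", np.sort(eig)[::-1])
```

Eigenvalues close to 1 mark ensemble directions left essentially unchanged by the observations, while eigenvalues close to 0 mark directions strongly constrained by them. In this linear setting S_a S_a^T reproduces the Kalman filter analysis covariance restricted to the ensemble subspace.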
RAMS model example
[Figure: Eigenvalues of (I+A)^{-1/2}, RAMS, 144 obs, 10 ens. Eigenvalue (0 to 1) versus eigenvalue rank (1 to 10). Curves: cycle 1, cycle 3, cycle 6, cycle 8, cycle 10.]
A safe approach to prevent loss of observed information, assuming independent observations: Nobs Nens.
If the eigenvalues of (I+A)^{-1/2} spread over the entire interval [0,1], the ensemble size (N_ens) is appropriate for the given number of observations (N_obs).
CSU shallow-water model on geodesic grid
[Figure: (I+A)^{-1/2} spectrum (1025 obs, 1000 ensembles). Eigenvalue (0 to 1) versus eigenvalue rank. Curves: cycle 1, cycle 5, cycle 10, cycle 15, cycle 20. Annotation: 12800 obs.]
When the system can learn from its past, less information from observations is needed!
[Figure: U-wind analysis RMS error (m/s) versus cycle (1 to 15). Curves: random init ptrb, KPZ init ptrb, obs error.]
Smooth start (in cycle 1) can improve the performance of EnsDA
Milija Zupanski, CIRA/[email protected]
Analysis error smaller than obs error
(Results from M. Zupanski et al.)
Non-Gaussian PDFs
Non-linear atmospheric, hydrology, and carbon state variables and observations are likely to have non-Gaussian PDFs.
MLEF, as a maximum likelihood estimate, is a suitable tool for examining the impact of different PDFs.
Develop a non-Gaussian PDF framework (M. Zupanski):
- allow for non-Gaussian observation errors
- apply the Bayes theorem for multiple events
The CSU EnsDA algorithm is currently being examined in application to NASA’s GEOS column model in collaboration with:
- A. Hou and S. Zhang (NASA/GMAO)
- C. Kummerow (CSU/Atmos. Sci.)
NASA’s GEOS column model example
[Figure: two innovation histograms for NASA's GEOS model experiment (parameter estimation, 10 ens, 110 "REAL" obs). PDF versus category bins (-5 to 5). The panels correspond to two prescribed values of R^{1/2}; the panel labels as extracted read "R1/2 =" and "R1/2 = 2".]
Prescribed observation errors directly impact innovation statistics. Since the observation error covariance R is the only input required by the system, it could be tuned!
Even if all critical issues were solved, one problem would remain: how to define the observation error covariance matrix R if it is not known?
CONCLUSIONS
- EnsDA approaches are very promising since they can provide not only the optimal estimate of the state, but also the uncertainty of the optimal estimate.
- The experience gained so far indicates that the EnsDA approach is suitable for addressing critical issues of data assimilation in carbon cycle studies.
- Model error and parameter estimation are necessary ingredients of a data assimilation algorithm.
- Problems involved in carbon data assimilation require a state-of-the-art approach. We anticipate findings from different scientific disciplines (e.g., atmospheric science, ecology, hydrology) to be of mutual benefit.
- It is especially important to gain experience with complex coupled models (e.g., RAMS-SiB-CASA), correlated (satellite) observations, and non-Gaussian PDFs in the future.