Dusanka Zupanski and Scott Denning

Colorado State University, Fort Collins, CO 80523-1375

CMDL Workshop on Modeling and Data Analysis of Atmospheric CO2 Observations in North America, 29-30 September 2004

ftp://ftp.cira.colostate.edu/Zupanski/presentations

ftp://ftp.cira.colostate.edu/Zupanski/manuscripts

Critical issues of ensemble data assimilation in application to carbon cycle studies

OUTLINE:

- Introduction: EnsDA approaches

- Non-linear processes

- Model error and parameter estimation

- Uncertainty estimates

- Correlated observations

- Non-Gaussian PDFs

- Conclusions and future work

Dusanka Zupanski, CIRA/[email protected]

Probabilistic approach to data assimilation and forecasting, or Ensemble Data Assimilation (EnsDA)


Provides the following:

(1) Optimal solution or state estimate (e.g., optimal CO2 analysis)

(2) Optimal estimates of model error and empirical parameters

(3) Uncertainty of the analysis (a component of the analysis error covariance Pa)

(4) Uncertainty of the estimated model error and parameters (components of the analysis error covariance Pa)

(5) Estimate of forecast uncertainty (the forecast error covariance Pf)

DATA ASSIMILATION (ESTIMATION THEORY)

Discrete stochastic-dynamic model

$$\mathcal{M}: \quad x_k = M(x_{k-1}) + G(x_{k-1})\, w_{k-1}$$

$w_{k-1}$ – model error (stochastic forcing)

M – non-linear dynamic (NWP) model

G – model (matrix) reflecting the state dependence of model error

Discrete stochastic observation model

$$\mathcal{D}: \quad y_k = H(x_k) + \varepsilon_k$$

$\varepsilon_k$ – measurement + representativeness error

H – non-linear observation operator ($\mathcal{M} \rightarrow \mathcal{D}$)

DATA ASSIMILATION EQUATIONS:

MAXIMUM LIKELIHOOD ESTIMATE (VARIATIONAL APPROACH):

$$J = \frac{1}{2}[x - x_b]^T P_f^{-1} [x - x_b] + \frac{1}{2}[H(x) - y_{obs}]^T R^{-1} [H(x) - y_{obs}] = \min$$

MINIMUM VARIANCE ESTIMATE (KALMAN FILTER APPROACH):

(1) State estimate (optimal solution):

$$x_a = x_b + P_f H^T (H P_f H^T + R)^{-1} (y_{obs} - H(x_b))$$

(2) Estimate of the uncertainty of the solution:

KALMAN FILTER APPROACH

$$P_f = M P_a M^T + G Q G^T$$

ENSEMBLE KALMAN FILTER or EnsDA APPROACH

$$(P_f)_{i,j} = [M(x + p_i) - M(x)]\,[M(x + p_j) - M(x)]^T$$

In EnsDA the solution is defined in ensemble subspace (reduced rank problem)!
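A small numerical check can make the link between the two estimates concrete: for a linear observation operator and Gaussian errors, the minimum variance (Kalman filter) state estimate also minimizes the cost function J. The sketch below is illustrative only. The problem sizes, the random test values, and the names `cost_gradient`, `P_f`, `x_b`, and `y_obs` are assumptions, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n_state, n_obs = 4, 3

# Hypothetical test problem: random SPD forecast covariance, diagonal R, linear H.
B = rng.standard_normal((n_state, n_state))
P_f = B @ B.T + n_state * np.eye(n_state)
R = np.diag(rng.uniform(0.5, 1.5, n_obs))
H = rng.standard_normal((n_obs, n_state))
x_b = rng.standard_normal(n_state)
y_obs = rng.standard_normal(n_obs)


def cost_gradient(x):
    """Gradient of J = 1/2 (x-x_b)^T P_f^-1 (x-x_b) + 1/2 (Hx-y)^T R^-1 (Hx-y)."""
    return np.linalg.solve(P_f, x - x_b) + H.T @ np.linalg.solve(R, H @ x - y_obs)


# Minimum variance (Kalman filter) state estimate.
S = H @ P_f @ H.T + R
x_a = x_b + P_f @ H.T @ np.linalg.solve(S, y_obs - H @ x_b)

# The gradient of J vanishes at x_a (up to round-off), but not at the background.
print(np.linalg.norm(cost_gradient(x_a)), np.linalg.norm(cost_gradient(x_b)))
```

With a non-linear H the two estimates generally differ, which is why the deck distinguishes between them.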

Ensemble Data Assimilation (EnsDA)

(1) Maximum likelihood approach (involves an iterative minimization of a functional): x_mode (MLEF, Zupanski 2004)

(2) Minimum variance approach (calculates ensemble mean): x_mean

[Figure: two sketches of PDF(x) versus x. Non-Gaussian: x_mode and x_mean are distinct. Gaussian: x_mode = x_mean.]
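The difference between the mode and the mean can be illustrated numerically. The sketch below evaluates a Gaussian and a skewed (log-normal) density on a grid and compares the two statistics. The choice of distributions, the grid, and the helper name `mode_and_mean` are assumptions made for illustration only.

```python
import numpy as np

x = np.linspace(1e-4, 40.0, 400_001)
dx = x[1] - x[0]


def mode_and_mean(pdf):
    """Return (mode, mean) of a density sampled on the grid x."""
    pdf = pdf / (pdf.sum() * dx)          # normalize on the grid
    return x[np.argmax(pdf)], (x * pdf).sum() * dx


gaussian = np.exp(-0.5 * ((x - 5.0) / 1.0) ** 2)          # mean 5, std 1
lognormal = np.exp(-0.5 * (np.log(x) / 0.5) ** 2) / x     # mu = 0, sigma = 0.5

g_mode, g_mean = mode_and_mean(gaussian)
l_mode, l_mean = mode_and_mean(lognormal)

print("Gaussian   mode, mean:", g_mode, g_mean)   # coincide
print("Log-normal mode, mean:", l_mode, l_mean)   # mean lies to the right of the mode
```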

Critical issues: Non-linear processes


- Use only non-linear models (tangent-linear, adjoint models are not needed)

- Iterative minimization is beneficial for non-linear processes
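The benefit of iterating can be seen in a scalar toy problem with a quadratic observation operator. The sketch below compares a single update from the background with a converged iteration on the cost function J. It is not the MLEF algorithm: for brevity it uses the analytic derivative of H in a Gauss-Newton step, whereas the EnsDA approach needs only the non-linear operator. All numerical values and function names are assumptions.

```python
def H(x):
    return x ** 2              # quadratic observation operator


def dH(x):
    return 2.0 * x             # analytic derivative (used only in this toy sketch)


x_b, p_f = 1.0, 1.0            # hypothetical background and its error variance
y_obs, r = 4.0, 0.1            # hypothetical observation and its error variance


def cost(x):
    return 0.5 * (x - x_b) ** 2 / p_f + 0.5 * (H(x) - y_obs) ** 2 / r


def gradient(x):
    return (x - x_b) / p_f + dH(x) * (H(x) - y_obs) / r


def gauss_newton_step(x):
    """One update of J, linearizing H about the current iterate x."""
    h = dH(x)
    gain = p_f * h / (h * p_f * h + r)
    return x_b + gain * (y_obs - H(x) + h * (x - x_b))


x_one_step = gauss_newton_step(x_b)    # single update from the background
x_iter = x_b
for _ in range(30):                    # iterate to convergence
    x_iter = gauss_newton_step(x_iter)

print("one step :", x_one_step, cost(x_one_step))
print("iterated :", x_iter, cost(x_iter))
```

In this setting the single update overshoots, and the iterated solution reaches a much lower value of J.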

[Figure: IMPACT OF MINIMIZATION (quadratic observation operator, 10 obs). RMS error versus analysis cycle (1 to 91).]

Example: KdVB model (M. Zupanski, 2004)

Critical issues: Model error and parameter estimation


- Estimate and correct all major sources of uncertainty: initial conditions, model error, boundary conditions, empirical parameters

- Unified algorithm: EnsDA+state augmentation approach (Zupanski and Zupanski, 2004)
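State augmentation can be sketched with a scalar model that has an unknown constant bias: the bias is appended to the state vector and estimated together with it from observations of the state alone. For determinism and brevity, the sketch below uses a plain Kalman filter in place of an ensemble scheme. The model, the numerical values, and all names are assumptions, not the KdVB experiments from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
a, true_bias, obs_std = 0.9, 0.5, 0.1    # hypothetical model and error settings

# Augmented state z = (x, b): x <- a*x + b, b <- b (bias assumed constant).
F = np.array([[a, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])               # only x is observed
R = np.array([[obs_std ** 2]])

z = np.array([0.0, 0.0])                 # first guess: x = 0, bias = 0
P = np.eye(2)                            # first-guess error covariance
x_true = 0.0

for _ in range(100):
    # Truth run and synthetic observation.
    x_true = a * x_true + true_bias
    y = x_true + obs_std * rng.standard_normal()

    # Forecast step for the augmented state.
    z = F @ z
    P = F @ P @ F.T

    # Analysis step: the cross-covariance between x and b updates the bias.
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    z = z + K @ (y - H @ z)
    P = (np.eye(2) - K @ H) @ P

bias_estimate = z[1]
print("estimated bias:", bias_estimate, "true bias:", true_bias)
```

The bias is never observed directly; it is corrected through its covariance with the observed state component.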

[Figure: IMPACT OF MODEL BIAS (10 ens, 10 obs). RMS error versus cycle number for four experiments: correct_model, neglect_err, bias_estim (dim = 101), bias_estim (dim = 10).]

[Figure: ESTIMATION OF DIFFUSION COEFFICIENT (102 ens, 101 obs). Diffusion coefficient value versus cycle number; curves: estim value (0.07), true value (0.07), estim value (0.20), true value (0.20).]

Example: KdVB model

[Figure: six innovation histograms (PDF versus category bins, -5 to 5), all with 10 ens, for 10 obs and for 101 obs. Panels for each observation count: Correct diffusion, Incorrect diffusion, Parameter estimation.]

EnsDA experiments with KdVB model (PARAMETER estimation impact)

[Figure: four panels of the INNOVATION χ² TEST versus analysis cycle (1 to 91), all with 10 ens, 10 obs: biased model (neglect_err); biased model (bias_estim, bias dim = 101); biased model (bias_estim, bias dim = 10); non-biased model (correct_model).]

EnsDA experiments with KdVB model (BIAS estimation impact)
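For readers unfamiliar with the innovation χ² test, the sketch below computes one commonly used form of the statistic: the innovation d = y_obs - H(x_b), normalized by the innovation covariance H P_f H^T + R and divided by the number of observations. Its expected value is 1 when the prescribed covariances are consistent with the actual errors. The exact definition used for the plots is not given on the slides, so this form, the synthetic data, and the name `innovation_chi2` are assumptions.

```python
import numpy as np


def innovation_chi2(d, HPfHt, R):
    """Normalized innovation chi-square: d^T (H P_f H^T + R)^-1 d / N_obs."""
    S = HPfHt + R
    return float(d @ np.linalg.solve(S, d)) / d.size


rng = np.random.default_rng(2)
n_obs, n_cycles = 10, 2000

# Hypothetical "true" innovation covariance.
A = rng.standard_normal((n_obs, n_obs))
HPfHt = A @ A.T / n_obs
R = 0.5 * np.eye(n_obs)
L = np.linalg.cholesky(HPfHt + R)        # used to draw consistent innovations

# Consistent covariances: the statistic averages to about 1.
chi2_consistent = np.mean(
    [innovation_chi2(L @ rng.standard_normal(n_obs), HPfHt, R) for _ in range(n_cycles)]
)

# Underestimated observation error: the statistic rises above 1.
chi2_underestimated = np.mean(
    [innovation_chi2(L @ rng.standard_normal(n_obs), HPfHt, 0.1 * R) for _ in range(n_cycles)]
)

print(chi2_consistent, chi2_underestimated)
```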


Critical issues: Uncertainty estimates


- Analysis error covariance Pa (analysis uncertainty)

- Forecast error covariance Pf (forecast uncertainty)

- Both defined in ensemble sub-space

KdVB model example:

Critical issues: Correlated observations


Problem: Numerous observations (~10^8-10^9) are being projected onto a small ensemble sub-space (~10^1-10^3)! Loss of observed information!

Remedies: Process observations one by one (Anderson 2001; Bishop et al. 2001; Hamill et al. 2001), or process observations successively over relatively small local areas (LEKF, Ott et al. 2004).

Assumption in both approaches: Observations being processed separately are uncorrelated (independent)! This may not be justified for dense satellite observations.
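The role of the independence assumption can be checked with a small linear example. The sketch below compares a batch Kalman update with a one-by-one (serial) update that uses only the diagonal of R. The two agree when R is diagonal and differ when the observation errors are correlated. It uses a full-rank Kalman filter, not any of the cited ensemble schemes, and all sizes, values, and names are assumptions.

```python
import numpy as np


def batch_update(x, P, y, H, R):
    """Standard Kalman analysis using all observations and the full R."""
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    return x + K @ (y - H @ x), P - K @ H @ P


def serial_update(x, P, y, H, R):
    """Assimilate observations one at a time, using only the diagonal of R."""
    for i in range(len(y)):
        h = H[i]                        # row of H for observation i
        Ph = P @ h
        k = Ph / (h @ Ph + R[i, i])     # gain for a single scalar observation
        x = x + k * (y[i] - h @ x)
        P = P - np.outer(k, Ph)
    return x, P


rng = np.random.default_rng(3)
n_state, n_obs = 4, 3
B = rng.standard_normal((n_state, n_state))
P0 = B @ B.T + np.eye(n_state)
H = rng.standard_normal((n_obs, n_state))
x0 = rng.standard_normal(n_state)
y = rng.standard_normal(n_obs)

R_diag = np.diag([1.0, 0.5, 2.0])                                        # uncorrelated errors
R_corr = np.array([[1.0, 0.8, 0.0], [0.8, 1.0, 0.0], [0.0, 0.0, 1.0]])   # correlated errors

xb_d, Pb_d = batch_update(x0, P0, y, H, R_diag)
xs_d, Ps_d = serial_update(x0, P0, y, H, R_diag)
xb_c, _ = batch_update(x0, P0, y, H, R_corr)
xs_c, _ = serial_update(x0, P0, y, H, R_corr)

print("diagonal R, same analysis:  ", np.allclose(xb_d, xs_d))
print("correlated R, same analysis:", np.allclose(xb_c, xs_c))
```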

Critical issues: Correlated observations


How does the observed information impact the uncertainty estimate of the optimal solution (analysis error covariance Pa)?

$$P_a^{1/2} = P_f^{1/2}\,[I + A(x_{opt})]^{-T/2}$$

$P_a^{1/2}$ - square root of analysis error covariance (Nstate x Nens)

$P_f^{1/2}$ - square root of forecast error covariance (Nstate x Nens)

$A(x_{opt})$ - impact of observations on the optimal solution (Nens x Nens)

The eigenvalue spectrum of $(I+A)^{-1/2}$ may help understand the impact of observations, and perhaps find a better solution for correlated observations.
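The behaviour of this spectrum can be explored with a linear toy problem. The slides do not define A explicitly, so the sketch below assumes the form A = Z^T Z with Z = R^{-1/2} H P_f^{1/2} for a linear H, which is an assumption of this example. It computes the eigenvalues of (I+A)^{-1/2}, confirms they lie in (0, 1], and cross-checks the resulting analysis covariance against the standard Kalman filter expression. Sizes, values, and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
n_state, n_ens, n_obs = 50, 10, 30

sqrt_Pf = 0.5 * rng.standard_normal((n_state, n_ens))   # P_f^{1/2}, Nstate x Nens
H = rng.standard_normal((n_obs, n_state))                # linear observation operator (assumed)
r_std = 1.0                                              # R = r_std^2 * I (assumed)

# Assumed form of the observation-impact matrix (Nens x Nens).
Z = (H @ sqrt_Pf) / r_std
A = Z.T @ Z

# A is symmetric positive semi-definite: A = V diag(lam) V^T.
lam, V = np.linalg.eigh(A)
eig_inv_sqrt = 1.0 / np.sqrt(1.0 + lam)                  # eigenvalues of (I + A)^{-1/2}
inv_sqrt = V @ np.diag(eig_inv_sqrt) @ V.T               # symmetric, so the transpose is the same
sqrt_Pa = sqrt_Pf @ inv_sqrt                             # P_a^{1/2}

# Cross-check against the Kalman filter analysis covariance.
Pf = sqrt_Pf @ sqrt_Pf.T
R = r_std ** 2 * np.eye(n_obs)
K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)
Pa_kalman = Pf - K @ H @ Pf

print("eigenvalues of (I+A)^(-1/2):", np.sort(eig_inv_sqrt)[::-1])
print("matches Kalman P_a:", np.allclose(sqrt_Pa @ sqrt_Pa.T, Pa_kalman))
```

An eigenvalue near 1 indicates an ensemble direction that the observations barely constrain; an eigenvalue near 0 indicates a direction in which the uncertainty is strongly reduced.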

RAMS model example

[Figure: Eigenvalues of $(I+A)^{-1/2}$ (RAMS, 144 obs, 10 ens). Eigenvalue versus eigenvalue rank (1 to 10) for cycles 1, 3, 6, 8, and 10.]

A safe approach to prevent loss of observed information, assuming independent observations: Nobs Nens.

If the eigenvalues of $(I+A)^{-1/2}$ spread over the entire interval [0,1], the ensemble size (Nens) is appropriate for the given number of observations (Nobs).

CSU shallow-water model on geodesic grid

[Figure: (I+A)**(-1/2) Spectrum (1025 obs, 1000 ensembles). Eigenvalue versus eigenvalue rank for cycles 1, 5, 10, 15, and 20; annotation: 12800 obs.]

When the system can learn from its past, less information from observations is needed!

[Figure: U-wind analysis RMS error (m/s) versus cycle (1 to 15); curves: random init ptrb, KPZ init ptrb, obs error.]

Smooth start (in cycle 1) can improve the performance of EnsDA

Milija Zupanski, CIRA/[email protected]

Analysis error smaller than obs error

(Results from M. Zupanski et al.)

Non-Gaussian PDFs

Non-linear Atmospheric-Hydrology-Carbon state variables and observations are likely to have non-Gaussian PDFs.

MLEF, as a maximum likelihood estimate, is a suitable tool for examining the impact of different PDFs.

Develop a non-Gaussian PDF framework (M. Zupanski)

- allow for non-Gaussian observation errors

- apply the Bayes theorem for multiple events


CSU EnsDA algorithm is currently being examined in application to NASA’s GEOS column model in collaboration with:

- A. Hou and S. Zhang (NASA/GMAO)

- C. Kummerow (CSU/Atmos. Sci.)

NASA’s GEOS column model example

[Figure: two innovation histograms for NASA's GEOS model experiment (Parameter estimation, 10 ens, 110 "REAL" obs). PDF versus category bins (-5 to 5).]

R1/2 = R1/2 = 2

Prescribed observation errors directly impact innovation statistics. Since the observation error covariance R is the only input required by the system, it could be tuned!

In case we solved all critical issues, one problem remains: How to define observation error covariance matrix R, if it is not known?
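The sensitivity of innovation statistics to the prescribed R can be reproduced with synthetic scalar innovations. In the sketch below the innovations are normalized by the assumed total error standard deviation. With the correct observation error the normalized innovations have unit spread, and with a doubled observation error standard deviation the histogram becomes too narrow. The error values and the name `normalized_innovation_std` are assumptions; this is not the GEOS experiment.

```python
import numpy as np

rng = np.random.default_rng(5)
sigma_f, sigma_o_true = 1.0, 1.0     # hypothetical forecast and true observation error std

# Synthetic innovations d = y_obs - H(x_b), with variance sigma_f^2 + sigma_o_true^2.
d = rng.normal(0.0, np.hypot(sigma_f, sigma_o_true), size=20000)


def normalized_innovation_std(d, sigma_o_assumed):
    """Spread of innovations normalized by the assumed total error std."""
    return np.std(d / np.hypot(sigma_f, sigma_o_assumed))


std_correct = normalized_innovation_std(d, sigma_o_true)         # close to 1
std_doubled = normalized_innovation_std(d, 2.0 * sigma_o_true)   # close to sqrt(2/5)

print(std_correct, std_doubled)
```

Tuning R then amounts to adjusting the assumed observation error until the normalized innovation spread is close to 1.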

CONCLUSIONS

EnsDA approaches are very promising since they can provide not only the optimal estimate of the state, but also the uncertainty of the optimal estimate.

The experience gained so far indicates that the EnsDA approach is suitable for addressing critical issues of data assimilation in carbon cycle studies.

Model error and parameter estimation are necessary ingredients of a data assimilation algorithm.

Problems involved in carbon data assimilation require a state-of-the-art approach. We anticipate findings from different scientific disciplines (e.g., atmospheric science, ecology, hydrology) to be of mutual benefit.

It is especially important to gain experience with complex coupled models (e.g., RAMS-SiB-CASA), correlated (satellite) observations, and non-Gaussian PDFs in the future.