A revised 4D-Var algorithm
Yannick Trémolet, ECMWF
IMA W/S 29 April, 2002
Mike Fisher, Lars Isaksen and Erik Andersson
For increased efficiency and improved accuracy.
Outline
Introduction
Current 4D-Var with 12-hourly cycling
• Main characteristics, used data types.
• The incremental formulation, and associated approximations and limitations.
The revised algorithm
• Quadratic inner iterations, using Conjugate Gradient.
• Hessian eigenvector preconditioning.
• Trajectory interpolation from T511 to T42/T95/T159.
• Evaluation.
Prospects and Conclusions
Used Data
Conventional
SYNOP: Surf.Press, Wind-10m, RH-2m
AIREP: Wind, Temperature
SATOB: Cloud drift winds
DRIBU: Surf.Press, Wind-10m
TEMP: Wind, Temp, Humidity profiles
DROPSONDE: Wind and Temp profiles
PILOT/Am+Eu Profilers: Wind profiles
PAOB: Surface pressure proxy
Satellite
ATOVS: HIRS, MSU and AMSU-A radiances
SSM/I: TCWV, Wind speed
METEOSAT: Water Vapour channel
QuikScat: Ambiguous winds
SBUV: Layer ozone
GOME: Total ozone
ECMWF forecast model geometry
Vertical resolution: 60 levels
12 levels below 850 hPa
[Figure: distribution of the 60 model levels; axes: pressure (hPa, 1000 to 0.1) and model level number (60 to 1)]
Horizontal resolution: TL511 ~ 40 km
Observations are compared against a short-range 3-15 hour forecast
The current operational 4D-Var system
Forecast model at T511 (40 km) resolution
Observation minus Background departures are computed using the full model at full resolution at the observed time.
Analysis increments are computed at coarser T159 resolution (125 km), using a tangent linear forecast model and its adjoint.
All observations are analysed simultaneously.
12 hours' worth of global observations are used in one go.
Around 1 000 000 observations are used in total per 12-hour cycle.
Satellite radiances are the most numerous data source.
A few 4D-Var Characteristics
All data within a 12-hour period are used simultaneously, in one global (iterative) estimation problem.
4D-Var finds the 12-hour forecast evolution that best fits the available observations.
It does so by adjusting 1) surface pressure, and the upper-air fields of 2) temperature, 3) wind, 4) specific humidity and 5) ozone.
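The incremental formulation of 4D-Var
In the incremental formulation (Courtier et al. 1994) the cost function is expressed in terms of increments $\delta x$ with respect to the background state, $\delta x = x - x^b$, with the observation operators $H_i$ and the model $M_i$ linearized around $x^b$ (giving $\mathbf{H}_i$ and $\mathbf{M}_i$):
$$J(\delta x) = \tfrac{1}{2}\,\delta x^T \mathbf{B}^{-1} \delta x + \tfrac{1}{2} \sum_i (\mathbf{H}_i \mathbf{M}_i \delta x - d_i)^T \mathbf{R}_i^{-1} (\mathbf{H}_i \mathbf{M}_i \delta x - d_i)$$
Here $\mathbf{B}$ and $\mathbf{R}_i$ are the background and observation error covariance matrices.
The i-summation is over 1h or ½h-long sub-divisions (or time slots) of the 12-hour assimilation period.
The innovations $d_i$ are calculated from the observations $y_i^o$ using the non-linear operators, $H_i$ and $M_i$:
$$d_i = y_i^o - H_i(M_i(x^b))$$
This ensures the highest possible accuracy for the calculation of the innovations, which are the primary input to the assimilation!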
Approximations at inner iterations
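1) The tangent-linear approximation:
$H_i(x^b + \delta x) \approx H_i(x^b) + \mathbf{H}_i\,\delta x$ and $M_i(x^b + \delta x) \approx M_i(x^b) + \mathbf{M}_i\,\delta x$,
where $\mathbf{H}_i$ and $\mathbf{M}_i$ are the tangent-linear observation operator and model.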
2) Approximations to reduce the cost: this involves degrading the tangent-linear (and its adjoint) with respect to the full model.
Lower resolution (T159 instead of T511),
Simplified physics (some processes ignored),
Simpler dynamics (e.g. spectral instead of grid-point humidity).
This results in a shorter control vector, and cheaper TL and AD models during the minimisation, i.e. the inner iterations.
The outer iterations
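After each minimisation at inner level:
The state $x$ is updated: $x^{n+1} = x^n + \delta x^n$,
the observation operators $H_i$ and the model $M_i$ are re-linearized around $x^{n+1}$.
Innovations are re-calculated using the full non-linear model $M_i$: $d_i^{n+1} = y_i^o - H_i(M_i(x^{n+1}))$
Superscript $n$ represents the outer iterations.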
The full model remains at T511 throughout.
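The inner/outer structure can be sketched on a toy problem. This is a minimal illustration and not the IFS implementation: the operator `h`, its Jacobian `h_tl`, the dimensions and the error variances are all hypothetical, there is no time dimension or resolution change, the quadratic inner problem is solved directly instead of iteratively, and the background term is measured from the original background at every outer iteration (an assumption).

```python
import numpy as np

def h(x):
    # Hypothetical non-linear operator standing in for H(M(x)).
    return x + 0.1 * x**3

def h_tl(x):
    # Tangent-linear (Jacobian) of h, linearized around x.
    return np.diag(1.0 + 0.3 * x**2)

def nonlinear_cost(x, xb, y, B_inv, R_inv):
    # Full (non-incremental) cost, used here only to monitor progress.
    db = x - xb
    do = y - h(x)
    return 0.5 * db @ B_inv @ db + 0.5 * do @ R_inv @ do

def incremental_var(xb, y, B_inv, R_inv, n_outer=3):
    x = xb.copy()
    costs = [nonlinear_cost(x, xb, y, B_inv, R_inv)]
    for _ in range(n_outer):
        d = y - h(x)          # innovations from the non-linear operator
        H = h_tl(x)           # re-linearize around the current state
        # The inner cost is quadratic in dx; setting its gradient to zero
        # gives a linear system (solved directly in this toy example).
        A = B_inv + H.T @ R_inv @ H
        rhs = H.T @ R_inv @ d - B_inv @ (x - xb)
        dx = np.linalg.solve(A, rhs)
        x = x + dx            # outer-loop update
        costs.append(nonlinear_cost(x, xb, y, B_inv, R_inv))
    return x, costs

rng = np.random.default_rng(0)
n = 5
x_true = rng.normal(size=n)
xb = x_true + 0.5 * rng.normal(size=n)        # background, error std 0.5
y = h(x_true) + 0.05 * rng.normal(size=n)     # observations, error std 0.05
B_inv = np.eye(n) / 0.5**2
R_inv = np.eye(n) / 0.05**2
xa, costs = incremental_var(xb, y, B_inv, R_inv)
print(costs)
```

Each pass through the loop plays the role of one outer iteration: the innovations always come from the non-linear operator, while the minimisation only ever sees its linearization.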
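TL testing
Test of linear model based on Taylor series:
$$\lim_{\epsilon \to 0} \frac{\| M(x + \epsilon\,\delta x) - M(x) \|}{\| \epsilon\,\mathbf{M}\,\delta x \|} = 1$$
where $M$ is the non-linear model and $\mathbf{M}$ its tangent-linear.
Valid for any perturbation $\delta x$ (in practice a set of random vectors).
TL and NL run with the same setup:
• Resolution
• Physics
• Time step
• Simpler dynamics
• Configuration (IFS)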
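A Taylor-series test of this kind can be illustrated on a very small model. The sketch below is only an assumption-laden stand-in: it uses one forward-Euler step of the Lorenz-63 equations as the "non-linear model" and its hand-coded Jacobian as the "tangent-linear model"; the function names, state and perturbation are hypothetical and have nothing to do with the IFS.

```python
import numpy as np

SIGMA, RHO, BETA, DT = 10.0, 28.0, 8.0 / 3.0, 0.01

def model(x):
    # One forward-Euler step of Lorenz-63 (toy non-linear model).
    tendency = np.array([SIGMA * (x[1] - x[0]),
                         x[0] * (RHO - x[2]) - x[1],
                         x[0] * x[1] - BETA * x[2]])
    return x + DT * tendency

def tangent_linear(x, dx):
    # Tangent-linear of `model`, linearized around x and applied to dx.
    jac = np.array([[-SIGMA, SIGMA, 0.0],
                    [RHO - x[2], -1.0, -x[0]],
                    [x[1], x[0], -BETA]])
    return dx + DT * (jac @ dx)

def taylor_ratios(x, dx, epsilons):
    # Ratio of the finite difference to the TL prediction; it should
    # approach 1 as epsilon decreases if the TL model is correct.
    base = model(x)
    tl = tangent_linear(x, dx)
    return [np.linalg.norm(model(x + eps * dx) - base) / np.linalg.norm(eps * tl)
            for eps in epsilons]

x = np.array([1.0, 2.0, 20.0])
dx = np.array([1.0, 1.0, 1.0])
epsilons = [1.0, 1e-1, 1e-2, 1e-3, 1e-4]
ratios = taylor_ratios(x, dx, epsilons)
for eps, r in zip(epsilons, ratios):
    print(f"eps={eps:g}  ratio={r:.10f}")
```

Both models are run with identical settings here, which is exactly the condition that the incremental 4D-Var setup does not satisfy.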
Test of incremental approximation
In 4D-VAR the perturbation is not an arbitrary vector: it is an analysis increment. It is not random, and it is the result of an algorithm which involves the linear model.
The linear and non-linear models are used at different resolutions (T511/T159).
The non-linear model uses more physics.
Humidity is represented in spectral space in the linear model, in grid point space in the non-linear model.
Relative error: vs.
Test of incremental approximation
Compare TL output with finite difference in 4D-VAR setup (resolution, physics, …).
All the necessary information is present during the minimisation:
All the components are used exactly in their 4D-VAR configuration.
[Diagram boxes: High resolution non-linear update, Low resolution non-linear trajectory, Low resolution TL (cost function), Minimisation (TL), Low resolution TL (diagnostic), High resolution non-linear update]
Evolution of TL model error
Operational configuration.
The error is large.
It grows very rapidly in the first hours.
This is not the case in the adiabatic test.
Impact of TL model resolution
T511 outer loop.
Varying inner loop resolution.
The resolution of the inner loop may have reached a limit.
Small scales
• TL at T255
• 12h forecast
• Spectral norms
Impact of TL model resolution (adiab.)
Adiabatic test.
Better linear physics is needed.
It is expensive both in development work and CPU.
Hessian eigenvector preconditioning
The optimal pre-conditioner for the 4D-Var minimisation problem is the Hessian of the cost function, $J''$.
The full 4D-Var Hessian is not known.
So far $\mathbf{B}$ (the background-error covariance matrix) has been used as an approximate preconditioner, neglecting the observation term.
The consequence is that patches of very dense or particularly accurate observations may deteriorate the conditioning and slow down the rate-of-convergence.
Trajectory (in)consistency
[Figure panels: T319-T63 and T511-T159]
The discrepancies between the high resolution non-linear update and the low resolution non-linear trajectory runs can be large.
The revised 4D-Var algorithm: Motivation
Improved efficiency
To offset some of the cost of planned 1) higher resolution, 2) improved TL physics and 3) increased numbers of satellite data.
Increased TL accuracy
Discrepancies between the tangent-linear model $\mathbf{M}$ and the non-linear model $M$ can introduce errors which grow quickly over the 12-hour assimilation window, especially affecting the analysis of small-scale phenomena and humidity.
Preparation for new high density satellite data
Coping with large numbers of observational data without deterioration of the rate-of-convergence.
Preparation for cloud and rain assimilation
Requires more extensive use of TL physics, and a good agreement between the tangent-linear model $\mathbf{M}$ at inner iterations and the non-linear model $M$ at outer.
The revised 4D-Var algorithm: Specification
Quadratic inner iterations. Variational quality control and SCAT ambiguity removal moved to outer level.
Conjugate Gradient minimisation. With objective stopping-criterion based on the gradient-norm reduction.
Hessian eigenvector pre-conditioning. Updated after each inner minimisation.
Multi-Incremental, T42/T95/T159. With some tests at T255.
Interpolation of the trajectory. From T511 to T42/T95/T159.
TL physics. Used during all inner iterations.
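Conjugate Gradients and Lanczos Algorithms
The conjugate gradient algorithm minimizes a quadratic function with a symmetric positive-definite Hessian $J''$:
$$J(x) = \tfrac{1}{2}\, x^T J'' x + b^T x + c$$
The algorithm is:
step to the line minimum: $x_{k+1} = x_k + \alpha_k d_k$
recalculate the gradient: $g_{k+1} = g_k + \alpha_k J'' d_k$
calculate a new direction: $d_{k+1} = -g_{k+1} + \beta_k d_k$
where: $\alpha_k = \dfrac{g_k^T g_k}{d_k^T J'' d_k}$ and $\beta_k = \dfrac{g_{k+1}^T g_{k+1}}{g_k^T g_k}$
Eliminate $d_k$ to get the 3-term recurrence (Lanczos):
$$J'' g_k = -\frac{1}{\alpha_k}\, g_{k+1} + \left( \frac{1}{\alpha_k} + \frac{\beta_{k-1}}{\alpha_{k-1}} \right) g_k - \frac{\beta_{k-1}}{\alpha_{k-1}}\, g_{k-1}$$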
Conjugate Gradients and Lanczos Algorithms
The gradient vectors in conjugate gradients are orthogonal.
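Let $G_k$ be the matrix whose columns are the gradients $g_0, \ldots, g_{k-1}$. Then
$$J'' G_k = G_k T_k - \frac{1}{\alpha_{k-1}}\, g_k e_k^T$$
where $J''$ is the Hessian, $T_k$ is tri-diagonal, $\alpha_{k-1}$ is the step length of iteration $k-1$ and $e_k = (0, \ldots, 0, 1)^T$.
The residual term becomes small during the minimization as the gradient decreases.
After $N$ iterations, we get $J'' G_N = G_N T_N$, i.e. $T_N$ has the same eigenvalues as $J''$.
Intermediate matrices have interleaving eigenvalues:
$$\lambda_i(T_{k+1}) \ge \lambda_i(T_k) \ge \lambda_{i+1}(T_{k+1})$$
Even for $k \ll N$, some eigenvalues are well approximated.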
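The link between conjugate gradients and Lanczos can be tried out numerically. The following is a hedged sketch on a small random symmetric positive-definite matrix, not the operational code: the function name `cg_lanczos`, the test matrix and its spectrum are assumptions, and the tridiagonal matrix is built in its symmetric form from the CG step lengths and direction coefficients (the textbook relation between the two algorithms). The `grad_ratio` argument mimics a stopping criterion on the gradient-norm reduction.

```python
import numpy as np

def cg_lanczos(A, b, max_iter=15, grad_ratio=0.05):
    """Minimise 0.5 x^T A x + b^T x by conjugate gradients and return the
    symmetric tridiagonal Lanczos matrix built from the CG coefficients."""
    x = np.zeros_like(b)
    g = A @ x + b                 # gradient at the starting point
    d = -g                        # first direction: steepest descent
    g0_norm = np.linalg.norm(g)
    alphas, betas = [], []
    for _ in range(max_iter):
        Ad = A @ d
        alpha = (g @ g) / (d @ Ad)            # step to the line minimum
        x = x + alpha * d
        g_new = g + alpha * Ad                # recalculate the gradient
        beta = (g_new @ g_new) / (g @ g)
        d = -g_new + beta * d                 # calculate a new direction
        g = g_new
        alphas.append(alpha)
        betas.append(beta)
        if np.linalg.norm(g) <= grad_ratio * g0_norm:
            break
    k = len(alphas)
    T = np.zeros((k, k))
    for j in range(k):
        T[j, j] = 1.0 / alphas[j]
        if j > 0:
            T[j, j] += betas[j - 1] / alphas[j - 1]
            off = np.sqrt(betas[j - 1]) / alphas[j - 1]
            T[j, j - 1] = T[j - 1, j] = off
    return x, T, np.linalg.norm(g) / g0_norm

rng = np.random.default_rng(1)
# Hypothetical spectrum: three large, well-separated eigenvalues plus a cluster.
eigs = np.concatenate(([1000.0, 500.0, 200.0], np.linspace(1.0, 10.0, 47)))
Q, _ = np.linalg.qr(rng.normal(size=(50, 50)))
A = Q @ np.diag(eigs) @ Q.T
b = rng.normal(size=50)

# grad_ratio=0.0 forces all 15 iterations so the Ritz values can be inspected.
x, T, ratio = cg_lanczos(A, b, max_iter=15, grad_ratio=0.0)
ritz = np.sort(np.linalg.eigvalsh(T))[::-1]
print("leading Ritz values:", ritz[:3])
print("gradient-norm ratio:", ratio)
```

After only 15 iterations on a 50-dimensional problem, the largest eigenvalue of `T` should already be very close to the largest eigenvalue of `A`, at no extra matrix-vector products beyond those needed for the minimisation.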
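Preconditioning
Write the analysis cost function as:
$$J(\chi) = \tfrac{1}{2}\, \chi^T J'' \chi + b^T \chi + c$$
Preconditioning replaces $\chi$ by:
$$\tilde{\chi} = L^{-1} \chi$$
The Hessian of this new function is $L^T J'' L$.
The trick is to choose $L$ so that $L^T J'' L$ has a small condition number.
Eigenvector preconditioning sets:
$$L = I + \sum_{i=1}^{K} (\mu_i - 1)\, v_i v_i^T$$
Writing $J'' = \sum_{i=1}^{N} \lambda_i v_i v_i^T$, with $\lambda_1 \ge \ldots \ge \lambda_N$, gives:
$$L^T J'' L = \sum_{i=1}^{K} \mu_i^2 \lambda_i\, v_i v_i^T + \sum_{i=K+1}^{N} \lambda_i\, v_i v_i^T$$
If we choose $\mu_i$ so that $\lambda_N \le \mu_i^2 \lambda_i \le \lambda_{K+1}$, then the condition number of $L^T J'' L$ is $\lambda_{K+1} / \lambda_N$.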
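The effect of eigenvector preconditioning on the condition number can be checked on a small synthetic Hessian. This is a sketch under stated assumptions: the matrix, its spectrum, the number of eigenpairs `K`, the helper name `eigenvector_preconditioner` and the particular scaling (each leading eigenvalue mapped to 1) are all illustrative choices, and the exact eigenpairs are used where in practice only approximations would be available.

```python
import numpy as np

def eigenvector_preconditioner(vecs, vals):
    """Return L = I + sum_i (mu_i - 1) v_i v_i^T for the given eigenpairs.

    Assumed choice: mu_i = vals_i ** -0.5, which maps each leading
    eigenvalue of the Hessian to 1 in the preconditioned problem."""
    n = vecs.shape[0]
    mu = 1.0 / np.sqrt(vals)
    # Broadcasting scales column i of vecs by (mu_i - 1).
    return np.eye(n) + (vecs * (mu - 1.0)) @ vecs.T

rng = np.random.default_rng(2)
n, K = 40, 5
# Hypothetical spectrum, sorted in decreasing order; smallest eigenvalue is 1.
lam = np.sort(np.concatenate(([3000.0, 1500.0, 800.0, 400.0, 200.0],
                              np.linspace(1.0, 50.0, n - 5))))[::-1]
V, _ = np.linalg.qr(rng.normal(size=(n, n)))
hessian = V @ np.diag(lam) @ V.T

L = eigenvector_preconditioner(V[:, :K], lam[:K])
preconditioned = L.T @ hessian @ L

cond_before = np.linalg.cond(hessian)
cond_after = np.linalg.cond(preconditioned)
print(f"condition number before: {cond_before:.1f}, after: {cond_after:.1f}")
```

With the `K` leading eigenvalues removed from the top of the spectrum, the largest remaining eigenvalue is the one of index `K+1`, so the condition number drops from the ratio of the first to the last eigenvalue to the ratio of the `K+1`-th to the last.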
Preconditioning
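[Figure: eigenvalue spectrum of the Hessian, with $\lambda_1 = 3105.4$, $\lambda_{26} = 492.75$ and $\lambda_N = 1$]
Preconditioning reduces the condition number $\kappa = \lambda_1 / \lambda_N$ from 3105.4 to 492.75.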
Preconditioning
Variational Quality Control
Preconditioning
Convergence is roughly twice as fast with Hessian preconditioning.
Preconditioning: Spectrum of Hessian
[Figure: leading eigenvalues of the Hessian (vertical axis 0 to 1200) against eigenvalue number (1 to 23), for Min_42, Min_95 and Min_255]
• The leading eigenvectors of the Hessian are large-scale.
• It is very effective and cost-efficient to calculate them at low resolution (T42/T95).
• They can be used as pre-conditioner to reduce the number of iterations at higher resolutions (T95/T159 or T255).
• This naturally leads to a multi-incremental setup.
Multi-incremental: RMS of T analysis increments
Most of the total analysis increment is formed at T42. There is a clear scale-separation between successive minimisations. The rapid decrease beyond ~T100 is due to the filtering properties of Jb, and the lack of observational information on the smallest scales.
Conjugate-gradient: Reduction of Norm of gradient
With C.G. minimisation the gradient norm reduces nearly monotonically with iteration. It is therefore possible to introduce an objective stopping-criterion based on the ratio of the current gradient norm to the initial one. We have chosen a value of 0.05 for this ratio.
C.G. and Lanczos Summary
The close connection between conjugate gradients and the Lanczos algorithm allows us to simultaneously:
Minimize the cost function.
Calculate the eigenvectors and eigenvalues of the Hessian.
The extra computational effort required to calculate the eigenpairs is negligible.
The connection can be exploited to improve the minimization.
The consequence is a 4D-Var minimisation that is more efficient and more robust with respect to observation amounts and distribution.
Interpolated Trajectory
The increments which are appropriate for the low resolution situation are not always suitable for the high resolution situation.
The trajectory can be interpolated:
This algorithm could not be tested with the traditional tangent linear test because of the two resolutions involved.
[Diagram boxes: High resolution non-linear update, Low resolution non-linear trajectory, Low resolution TL (cost function), Minimisation (TL), Low resolution TL (diagnostic), High resolution non-linear update, with an Interpolation step]
Interpolated trajectory
Performance: Jo cost function
[Figure: Jo and Jb cost function values (vertical axis 0.E+00 to 7.E+05) at successive trajectory (Traj0 to Traj3) and minimisation (Min_42, Min_95, Min_159, Min_255) steps; series: Traj_Jo, Min_Jo, Min_Jb; panels: Current 4D-Var and Revised 4D-Var]
Scores
Performance: CPU cost
Operational setup: 1h20min. Revised algorithm: 1h13min. On Fujitsu VPP5000, 16 processors. Elapsed time, I/O not fully optimised.
[Figure: CPU cost (vertical axis 0 to 2500) of each step: Traj0, Min_42, Traj1, Min_95, Traj2, Min_255, Traj3; series: Traj CPU, Min CPU]
Conclusions
ECMWF’s 4D-Var has been improved:
• Conjugate Gradient minimisation
• Hessian pre-conditioning
• Inner/outer iteration algorithm
• Improved TL approximations
• Multi-incremental T42/T95/T159
These developments will help facilitate:
• Use of higher density data
• Higher resolution
• Enhanced use of (relatively costly) TL physics
• Cloud and rain assimilation
Prospects
• The minimisation of the cost function has been improved.
• More work is needed to improve the representation of the small scales in the inner loop.
• Efficiency gains will pay for improved inner loop physics and resolution.