
cdo - Data Processing (and Production) · CDO is a collection of tools to process and analyze data...

Date post: 19-Jun-2020
Preliminaries CDO Parallel Output cdo Data Processing (and Production) Luis Kornblueh, Uwe Schulzweida, Deike Kleberg, Thomas Jahns, Irina Fast Max-Planck-Institut für Meteorologie, DKRZ September 24, 2014 MAX-PLANCK-GESELLSCHAFT
Transcript
Page 1: cdo - Data Processing (and Production) · CDO is a collection of tools to process and analyze data from climate and NWP models. (File) format conversion: GRIB ,netCDF Interpolation

Preliminaries CDO Parallel Output

cdo - Data Processing (and Production)

Luis Kornblueh, Uwe Schulzweida, Deike Kleberg, Thomas Jahns, Irina Fast

Max-Planck-Institut für Meteorologie, DKRZ

September 24, 2014

MAX-PLANCK-GESELLSCHAFT

Page 2

Data format standards

Pushing forward ...

• World Meteorological Organization (WMO): GRIB and BUFR (since 1980, data converted back to 1900)

• (NetCDF) Climate and Forecast (CF) Metadata Convention (since 1999)

• World Climate Research Programme (WCRP): CMOR (for CMIP5 and onwards)

Missing common semantics and vocabulary!

Page 3

What is CDO ?

CDO is a collection of tools to process and analyze data from climate and NWP models.

• (File) format conversion: GRIB ⇔ netCDF

• Interpolation between different grid types and resolutions

• Portability (ANSI C99 with some POSIX extensions)

• Performance (fast processing of large datasets, multi-threaded)

• Modular design, easily extendable with new operators

• UNIX command line interface, tested on many UNIX/Linux systems, Cygwin, and MacOS-X
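The first two items in the list above map onto short command lines. The sketch below only assembles and prints them, so it runs without CDO installed. The helper name `cdo_command` and all file names are hypothetical; `-f nc`, `copy`, and `remapbil` are standard CDO options and operators, and `r360x180` is CDO's shorthand for a regular 360x180 lon-lat grid.

```python
import shlex


def cdo_command(operator, infile, outfile, options=()):
    """Return a CDO call as an argument list: cdo [options] operator infile outfile."""
    return ["cdo", *options, operator, infile, outfile]


# GRIB -> netCDF conversion; "-f nc" selects netCDF as the output format.
convert = cdo_command("copy", "input.grb", "output.nc", options=("-f", "nc"))

# Bilinear interpolation onto a regular 1-degree lon-lat grid (360x180 points).
regrid = cdo_command("remapbil,r360x180", "input.nc", "regridded.nc")

for cmd in (convert, regrid):
    # shlex.join renders the list as a copy-pasteable shell command line.
    print(shlex.join(cmd))
```

Passing either list to `subprocess.run(cmd, check=True)` would execute it on a system where `cdo` is on the PATH.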

And what is CDI ?

Page 4

Data I/O Interface

CDI, used by CDO, is the I/O interface shared by all major MPI-M models. GRIB support includes highly efficient, fast compression algorithms.

• GRIB1 via CGRIBEX (MPI-M)

• GRIB2 via GRIB API (ECMWF)

• netCDF, CF convention (UNIDATA)

• SERVICE, EXTRA, IEG (MPI-M legacy binary formats)

[Figure: CDI architecture diagram. Labels: ECHAM, MPIOM, ICON, CDO; C and Fortran API; CDI Core; I/O layer (buffering); GRIB 1; SERVICE, EXTRA, IEG; binary; C or POSIX I/O; external libraries (optional): GRIB 2, netCDF 3, netCDF 4 (HDF5)]

Page 5

Available Operators

CDO provides more than 400 operators, which can be pipelined at thread level. CPU-intensive operators are parallelized with OpenMP.

Main categories      Description
File information     Print information about datasets
File operations      Copy, split and merge datasets
Selection            Select parts of a dataset
Comparison           Compare datasets
Modification         Modify data and metadata
Arithmetic           Arithmetic operations on datasets
Statistical values   Ensemble, field, vertical and time statistics
Interpolation        Horizontal, vertical and time interpolation
Import/Export        HDF5, binary, ASCII
Climate indices      ECA indices
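Operator pipelining can be illustrated with a chained command line. This is a minimal sketch that builds the call without running it. The helper `cdo_chain`, the variable name `tas`, and the file names are hypothetical; `selname`, `timmean`, and the `-P` option (number of OpenMP threads) are standard CDO features.

```python
import shlex


def cdo_chain(operators, infile, outfile, threads=None):
    """Build one CDO call that pipes several operators together.

    `operators` is given in order of application. On the command line CDO
    expects the operator applied last to come first, each prefixed with "-".
    """
    cmd = ["cdo"]
    if threads is not None:
        cmd += ["-P", str(threads)]  # number of OpenMP threads
    cmd += ["-" + op for op in reversed(operators)]
    cmd += [infile, outfile]
    return cmd


# Select one variable, then average it over time, without an intermediate file.
cmd = cdo_chain(["selname,tas", "timmean"], "model_output.nc", "tas_mean.nc", threads=4)
print(shlex.join(cmd))
```

Chaining avoids writing the selected variable to disk before averaging, which matters when I/O dominates the run time.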

Page 6

Supported Grids

A large set of grids is supported, including spectral and Fourier coefficients, Gaussian grids, regular and rotated lat-lon grids, conformally mapped quadrilateral grids, and general unstructured grids.

[Figure panels: Gaussian grid (ECHAM), curvilinear grid (MPIOM), hexagonal grid (GME), triangular grid (ICON)]

Many models worldwide are supported: COSMOS, CLM/COSMO, ECHAM, GME, HIRLAM, ICON, IFS, MPIOM, NEMO, REMO, to mention only a few.

Page 7

Satellite-data Support

EUMETSAT's Climate Monitoring Satellite Application Facility provides satellite-derived geophysical parameters for climate monitoring. Data sets contain several cloud parameters, surface albedo, radiation fluxes, and temperature and humidity profiles. These products are stored in HDF5. DWD has funded a CDO import operator, import_cmsaf.

[Figure panels: TOA radiation, cloud cover, surface radiation, humidity]

Page 8

Community Support

The rapidly increasing number of CDO installations and users creates a very high demand for support. A fully featured development platform is available to support the community. The CDO community page was funded by the European Commission infrastructure project IS-ENES.

• User wiki

• Documentation

• Bug tracking system

• User forums

• Download area

• Repository access

http://code.zmaw.de/projects/cdo

Page 9

Parallel Output Design

Page 10

Parallel Output Implementation

Page 11

Parallel Output scaling

Page 12

Upcoming features and future development

cdo/cdi needs to be able to handle 10^8 grid points per level (800GB, double). A few developments are necessary:

• very fast add-on compression for GRIB2, to be validated with WMO members (interface for libaec now in GRIB API)

• change from ANSI-C to C++ (mac create minor portability problems)

• change to a master-slave scheduling model to achieve maximum parallelization inside nodes

• add cmorizing capabilities

• add full single precision data flow (only GRIB API is missing)

• do not develop an MPI-parallelized version, as I/O is the major bottleneck.
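The memory figures behind these goals can be checked with a few lines of arithmetic. The sketch assumes the target is 10^8 grid points per level and uses decimal units (1 MB = 10^6 bytes). Reading the quoted 800 GB as a total over many level-sized fields is an assumption, since the slide does not say what it aggregates.

```python
BYTES_PER_DOUBLE = 8
BYTES_PER_FLOAT = 4


def field_bytes(grid_points, bytes_per_value):
    """Memory needed for one horizontal field (one level of one variable)."""
    return grid_points * bytes_per_value


grid_points = 10**8  # assumed reading of the grid size quoted above

double_level = field_bytes(grid_points, BYTES_PER_DOUBLE)
single_level = field_bytes(grid_points, BYTES_PER_FLOAT)

print(f"one level, double precision: {double_level / 1e6:.0f} MB")
print(f"one level, single precision: {single_level / 1e6:.0f} MB")

# Number of double-precision level fields that add up to 800 GB
# (for example levels x variables x time steps; this split is an assumption).
fields_in_800_gb = 800e9 / double_level
print(f"fields in 800 GB: {fields_in_800_gb:.0f}")
```

A full single precision data flow halves the size of every field, which is why it appears in the list alongside compression.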

