H. Thiemann (M&D) / 26.06.22 / 1 CERA (Climate and Environmental Retrieval and Archive) Hannes Thiemann (M&D/MPIMET, Hamburg) Kiel, 17.3.2004
Transcript
Page 1

CERA (Climate and Environmental Retrieval and Archive)

Hannes Thiemann

(M&D/MPIMET, Hamburg)

Kiel, 17.3.2004

Page 2

Data Group maintaining the WDCC

Michael Kurtz

Hans Luthardt

Michael Lautenschlager

Heinke Höck

Hannes Thiemann

Hermann Winter

Jörg Wegner

Frank Toussaint

Peter Lenzen

Page 3

Content:

• General remarks

• DKRZ archive development

• CERA 1) concept

• CERA data model and structure

• Automatic fill process (not presented)

• CERA user interface

1) Climate and Environmental data Retrieval and Archiving

Page 4

Semantic data management

• Data consist of numbers and metadata.

• Metadata construct the semantic data context.

• Metadata form a data catalogue which makes data searchable.

• Data are produced, archived and extracted within their semantic context.

Data without explanation are only numbers.

Problems:

• Metadata are of different complexity for different data types.

• Consistency between numbers and metadata has to be ensured.
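The idea that metadata form a searchable catalogue, and that numbers and metadata must stay consistent, can be illustrated with a minimal sketch. The class, field and function names below, as well as the sample values, are hypothetical and not taken from CERA.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One catalogue record: metadata plus the numbers it describes."""
    acronym: str
    variable: str
    unit: str
    values: list


def search(catalogue, **criteria):
    """Return all entries whose metadata match every given criterion."""
    return [
        entry for entry in catalogue
        if all(getattr(entry, key) == wanted for key, wanted in criteria.items())
    ]


def is_consistent(entry, expected_length):
    """Very simple consistency check between metadata and numbers."""
    return len(entry.values) == expected_length


# Hypothetical sample catalogue with made-up numbers.
catalogue = [
    Entry("DEMO_TEMP2", "2m temperature", "Kelvin", [271.3, 272.1, 270.8]),
    Entry("DEMO_WIND10M", "10m wind speed", "m/s", [3.2, 4.1, 2.7]),
]

hits = search(catalogue, variable="2m temperature")
```

Without the metadata fields, the two value lists would be indistinguishable, which is the point of the slide: data without explanation are only numbers.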

Page 5

DKRZ Architecture

Proc.: 24 nodes, 192 CPUs
Memory: 1.5 TeraByte
Perform.: 1.5 TeraFLOPS (peak), 500 GigaFLOPS (sust.)

Tape Archive: > 3.4 PetaByte
Disk Cache: 60 TeraByte
Bandwidth Comp.S. – Data S.: 450 MByte/sec

155 Mbs

Page 6

DKRZ Archive Development

Basic observations and assumptions:

1. Unix-File archive content at the end of 2002: 600 TB, including backups

2. Observed archive rate (Jan. - May 2003): 40 TB/month

3. System changes: 50% compute power increase in August 2003

4. CERA DB size end of 2002: 12 TB

5. Observed Increase (Jan. - May 2003): 1 TB/month

6. The automatic fill process into the CERA DB is going to become operational with 4 TB/month this year and should increase from 10% of the archiving rate to approx. 30% by the end of 2004

Page 7

DKRZ Archive Development

DKRZ's Archive Increase (Estim. 09.03)

Data amount [TB] by year:

Year                2002   2003   2004   2005   2006   2007
Unix-File Archive    600   1200   1920   2640   3360   4080
CERA DB               12     40    184    424    664    904
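The Unix-file archive estimates for 2004 to 2007 (1920, 2640, 3360 and 4080 TB) can be reproduced with simple arithmetic. The sketch below assumes that the archive rate scales with the 50% compute power increase, giving 60 TB/month, and that growth is linear from the 1200 TB estimated for the end of 2003. Both assumptions are inferred from the numbers and are not stated on the slides. The CERA DB series is not modelled here.

```python
# Assumption: the observed 40 TB/month grows by 50% along with compute power.
OBSERVED_RATE_TB_PER_MONTH = 40
MONTHLY_RATE_TB = OBSERVED_RATE_TB_PER_MONTH * 1.5  # 60 TB/month


def project(start_tb, monthly_rate_tb, years):
    """Linear projection: add twelve months of archiving for each year."""
    sizes = {}
    size = start_tb
    for year in years:
        size += 12 * monthly_rate_tb
        sizes[year] = size
    return sizes


# Start from the 1200 TB estimated for the end of 2003.
projection = project(1200, MONTHLY_RATE_TB, range(2004, 2008))
```

Under these assumptions each year adds 720 TB, which matches the yearly steps of the estimate.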

Page 8

Problems in file archive access:

• Missing data catalogue
• Data are not stored application-oriented
• Lack of experience with climate model data
• Lack of computing facilities at client site

Estimated file archive size:

Year   2003     2004     2005     2006     2007
Size   1.2 PB   1.9 PB   2.6 PB   3.4 PB   4.1 PB

Page 9

Limits of model resolution

ECHAM4 (T42)
Grid resolution: 2.8°
Time step: 40 min

ECHAM4 (T106)
Grid resolution: 1.1°
Time step: 20 min

Noreiks (MPIM), 2001
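A rough feeling for what the finer resolution costs can be obtained from the two parameter sets. The sketch below is a back-of-the-envelope estimate, not a statement from the slides: it assumes that cost scales with the number of horizontal grid points times the number of time steps, and it ignores vertical levels and output frequency.

```python
def relative_cost(dx_coarse_deg, dt_coarse_min, dx_fine_deg, dt_fine_min):
    """Cost ratio fine/coarse under the simple scaling assumption above."""
    point_ratio = (dx_coarse_deg / dx_fine_deg) ** 2  # two horizontal dimensions
    step_ratio = dt_coarse_min / dt_fine_min          # more steps per simulated day
    return point_ratio * step_ratio


# T42 (2.8 degrees, 40 min) versus T106 (1.1 degrees, 20 min)
ratio = relative_cost(2.8, 40, 1.1, 20)
```

With the rounded values from the slide this gives a factor of roughly 13 between T42 and T106.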

Page 10

CERA Concept: Semantic Data Management

• (I) Data catalogue and Unix files (pointer or BLOB table entry)
  – Enable search and identification of data
  – Allow for data access as they are

• (II) Application-oriented data storage
  – Time series of individual variables are stored as BLOB entries in DB tables
    • Allow for fast and selective data access
  – Storage in standard file format (GRIB, NetCDF)
    • Allow for application of standard data processing routines (PINGOs)
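Storing the time series of an individual variable as BLOB entries in a database table, as described in (II), might look like the following minimal sketch. SQLite stands in for the database system, which the slides do not name. The table layout, the column names and the packed-float payload are assumptions for illustration. In CERA the BLOBs hold data in standard formats such as GRIB.

```python
import sqlite3
import struct

conn = sqlite3.connect(":memory:")
# Hypothetical layout: one row per time step of one dataset (variable).
conn.execute(
    "CREATE TABLE blob_data ("
    " dataset TEXT, step INTEGER, field BLOB,"
    " PRIMARY KEY (dataset, step))"
)


def store(dataset, step, values):
    """Pack one field as little-endian 32-bit floats and store it as a BLOB."""
    payload = struct.pack(f"<{len(values)}f", *values)
    conn.execute("INSERT INTO blob_data VALUES (?, ?, ?)", (dataset, step, payload))


def load(dataset, first_step, last_step):
    """Selective access: fetch only the requested steps of one variable."""
    rows = conn.execute(
        "SELECT step, field FROM blob_data"
        " WHERE dataset = ? AND step BETWEEN ? AND ? ORDER BY step",
        (dataset, first_step, last_step),
    )
    return [
        (step, list(struct.unpack(f"<{len(blob) // 4}f", blob)))
        for step, blob in rows
    ]


# Made-up two-point fields for two hypothetical datasets.
for step in range(4):
    store("DEMO_TEMP2", step, [270.0 + step, 271.0 + step])
    store("DEMO_WIND10M", step, [3.0 + step, 4.0 + step])

selected = load("DEMO_TEMP2", 1, 2)
```

Because each variable is its own series, a request for two time steps of one variable touches two rows instead of whole model output files.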

Page 11

Parts of CERA

• Web-Based User Interface
  – Catalogue Inspection
  – Climate Data Retrieval

• CERA Database, 30 TB (12/2003)
  – Data Catalogue
  – Processed Climate Data
  – Pointer to Raw Data

• Mass Storage Archive, 1 PB (12/2003)

Page 12

CERA Data: Jan. Temp.

Page 13

CERA Data: Jan. Wind

(2 x 250 MB)

Page 14

CERA-2 Data Model

• Complete with respect to IEEE’s Reference Model for Metadata (Bretherton, 1994)
  – Browse, Search and Retrieval
  – Ingest, Quality Assurance, Reprocessing
  – Application to Application Transfer
  – Storage and Archive

• Reference
  – “The CERA-2 Data Model” (DKRZ-Report No. 15, 1998)
  – URL: http://www.pik-potsdam.de/dept/dc/e/sdm/cera/

Page 15

Interoperability

• Supports interoperability due to inclusion of international standards
  – Directory Interchange Format (NASA, 1998)
  – FGDC Metadata Content Standard (FGDC, 1996)
  – ISO Metadata Standard for Geographic Information (ISO 19115)

Page 16

CERA-2 Data Model Blocks

Metadata Entry
This is the central CERA block, providing information on
• the entry's title
• type and relation to other entries
• the project the data belong to
• a summary of the entry
• a list of general keywords related to the data
• creation and review dates of the metadata

Coverage
Information on the volume of space-time covered by the data

Reference
Any publication related to the data, together with the publication form

Status
Status information like data quality, processing steps, etc.

Distribution
Distribution information including access restrictions, data format and fees if necessary

Contact
Data related to contact persons and institutes like distributor, investigator, and owner of copyright

Parameter
Block describes data topic, variable and unit

Spatial Reference
Information on the coordinate system used

Additionally: Modules and Local Extensions
Module DATA_ORGANIZATION (grid structure)
Module DATA_ACCESS (physical storage)
Local extension for specific information on (e.g.)
• data usage
• data access and data administration
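How the central metadata entry relates to the surrounding blocks can be sketched as a small set of record types. This is an illustrative reading of the block diagram only: the class and attribute names are hypothetical, most blocks are reduced to plain strings, and the real CERA-2 model is defined in the DKRZ report cited in the deck. The sample values are taken from the WDCC example shown in the same talk.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Parameter:
    """Parameter block: data topic, variable and unit."""
    topic: str
    variable: str
    unit: str


@dataclass
class Contact:
    """Contact block: a person or institute and its role."""
    name: str
    role: str  # e.g. "distributor", "investigator", "owner of copyright"


@dataclass
class MetadataEntry:
    """Central block; the other blocks hang off it."""
    title: str
    entry_type: str
    project: str
    summary: str
    keywords: List[str] = field(default_factory=list)
    parameters: List[Parameter] = field(default_factory=list)
    contacts: List[Contact] = field(default_factory=list)
    coverage: Optional[str] = None           # space-time volume covered
    status: Optional[str] = None             # data quality, processing steps
    distribution: Optional[str] = None       # access restrictions, format, fees
    spatial_reference: Optional[str] = None  # coordinate system used


entry = MetadataEntry(
    title="EH5_T63L19_R365_TEMP2",
    entry_type="dataset",
    project="Climate Model Simulations at MPI",
    summary="This dataset contains 6H values.",
    keywords=["AMIP2"],
    parameters=[Parameter("atmosphere", "2m temperature", "Kelvin")],
)
```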

Page 17

CERA Structure

Level 1 interface: Metadata entries (XML, ASCII) + data files

Level 2 interface: Separate files containing BLOB table data in application-adapted structure (time series of single variables)

Diagram elements:
• Experiment Description
• Unix-Files Table / Pointer
• Dataset 1 Description
• Dataset n Description
• BLOB Data Table
• BLOB Data Table

Page 18

Climate Model Raw Data

Application-oriented Data Storage (Interface level 2)

Primary Data Processing

Page 19

Start: Approved in January 2003

Maintenance: Model and Data (M&D/MPIMET) and German Climate Computing Centre (DKRZ)

Mission: Data for climate research are collected, stored and disseminated

ICSU Policy: long-term archiving and unrestricted data access for scientists

Restriction: Only climate data products in CERA DB, no raw data storage.

Content: Emphasis is on climate modelling and related data products.

Co-operation: with thematically corresponding data centres like WDC-MARE (Bremen) and WDC-RSAT (Oberpfaffenhofen)

URL: http://www.mad.zmaw.de/wdcc/

Page 20

WDC Verbund Erdsystemforschung (WDC cluster for Earth system research)

Founded on 25.04.03 in Oberpfaffenhofen by the three German ICSU WDCs.

• WDC for Climate: M&D / DKRZ, Hamburg
  http://www.mad.zmaw.de/wdcc/
• WDC MARE (Marine Environmental Sciences): Marum, Bremen and Bremerhaven
  http://www.wdc-mare.org/
• WDC RSAT (Remote Sensing for the Atmosphere): DFD/DLR, Oberpfaffenhofen
  http://wdc.dlr.de/

Commitment: long-term data archiving and free, unrestricted data access for all scientists (ICSU Rules for WDCs and the rules of good scientific practice)

Page 21

WDC Verbund Erdsystemforschung (WDC cluster for Earth system research)

Declaration of principles

• Data publication
  - The data themselves should be uniquely identifiable, referenceable and universally accessible, independent of the archiving system (e.g. by assigning DOIs or URNs).
  - DFG project "Publikation und Zitierfähigkeit wissenschaftlicher Primärdaten" (publication and citability of scientific primary data; 12 months, starting 01.10.03)

• Service of the data centres
  - Qualified thematic data centres take on the role of archiving and publishing scientific data.
  - The centres guarantee long-term and free availability of archived data within the guidelines of the ICSU World Data Centres.
  - The data centres make their expertise available in an advisory capacity to funding agencies, reviewers and the scientific community.

Page 22

WDC-CLIMATE Data Content

• Climate Model Data (continuous stream of new data)

• IPCC DDC (Data Distribution Centre)
  – Will be continued for the Fourth Assessment Report

• CEOP (Coordinated Enhanced Observing Period) Model output retention and handling Centre
  – Part of WCRP that was motivated by GEWEX with focus on water and energy cycles within the climate system (01.10.2002 – 31.12.2004)

• Observational Data
  – Model related observations: ERA15/40 (ECMWF), NCEP 40 Y. Reanal.
  – Instrumental data: WOCE (World Ocean Circulation Experiment)
  – Earth observations: Access to SSTs from NOAA AVHRR in cooperation with WDC RSAT (distributed archive)

• Project Support (encourage Good Scientific Practice)
  – HOAPS (Hamburg Ocean Atmosphere Parameters and Fluxes from Satellite Data)
  – CARIBIC (Civil Aircraft for Regular Investigation of the Atmosphere Based on an Instrumentation Container), MPI Mainz
  – Different model applications

Page 23

Experiment

Exp.-Acronym: EH5_T63L19_AMIP_6H

Exp.-Name: ECHAM5_T63L19_AMIP Control Run 6H values

Exp.-Description:

Simulation of current climate using ECHAM5.2 forced with observed monthly sea surface temperatures and sea-ice concentrations (AMIP-2).

The simulation was run on a NEC-SX6 (hurrikan). Atmospheric data is stored every 6 hours. Monthly means are available, too.

Related experiments:

- ECHAM5_TTTLLL_AMIP, where TTTLLL is one of: T21L19, T31L19, T42L19, T85L19, T106L19, T42L31, T63L31, T85L31 and T106L31

The output from the model run: schauer.dkrz.de:/pf/m/m214002/NEWEXP/EXP300/run365

Project: Climate Model Simulations at MPI

Keyword: AMIP2

WDCC Example

Page 24

Experiment

Exp.-Acronym: EH5_T63L19_AMIP_6H

WDCC Example

Dataset (BLOB-Table)

DS-Acronym: EH5_T63L19_R365_TEMP2

Variable: 2m temperature

Dataset (BLOB-Table)

DS-Acronym: EH5_T63L19_R365_WIND10M

Variable: 10m wind speed

Number of datasets: 350 time series of 2D global fields

Total amount of GRIB data: 350 * 1.6 GB = 560 GB

schauer.dkrz.de:/pf/m/m214002/NEWEXP/EXP300/run365

Page 25

Dataset

DS-Acronym: EH5_T63L19_R365_TEMP2

DS-Name: EH5_T63L19_R365_TEMP2

DS-Summary: See summary of corresponding experiment. This dataset contains 6H values.

Creation Date: 25-MAI-2003

Format: GRIB

Size (Bytes): 1659519420

Storage: Model and Data: DB Internal Storage; Nearline

Download Permission: No

Topic / Parameter / Variable / Unit: atmosphere / atmospheric temperature / 2m temperature / Kelvin

Code Type / Code # / Code Acronym: Echam5 / 167 / TEMP2

Temporal Structure: length of time series and storage intervals

Spatial Structure: precise definition of 3D grid points

WDCC Example

Page 27

Inclusion of other Data Sources

Client applet receives foreign data URI from CERA-2 DB

Foreign server provides DB data by http: German Aerospace Centre

Page 28

Download Statistics

Month             Number of downloads   Volume (GB)
MARCH 2004                        950            81
FEBRUARY 2004                    4018           911
JANUARY 2004                     1583          1154
DECEMBER 2003                    1077           366
NOVEMBER 2003                    1959           923
OCTOBER 2003                     2844            86
SEPTEMBER 2003                   3168           241
AUGUST 2003                      1576           208
JULY 2003                        3347           213
JUNE 2003                        3426            78
MAY 2003                         5803           117
APRIL 2003                       5343            66
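The monthly figures can be summarised with a few lines of arithmetic. The sketch below only restates the numbers from the slide. The conversion of 1 GB to 1000 MB is an assumption, and the entry for March 2004 presumably covers only part of the month, since the talk was given on 17.3.2004.

```python
# (month, number of downloads, volume in GB), as listed on the slide
stats = [
    ("MARCH 2004", 950, 81),
    ("FEBRUARY 2004", 4018, 911),
    ("JANUARY 2004", 1583, 1154),
    ("DECEMBER 2003", 1077, 366),
    ("NOVEMBER 2003", 1959, 923),
    ("OCTOBER 2003", 2844, 86),
    ("SEPTEMBER 2003", 3168, 241),
    ("AUGUST 2003", 1576, 208),
    ("JULY 2003", 3347, 213),
    ("JUNE 2003", 3426, 78),
    ("MAY 2003", 5803, 117),
    ("APRIL 2003", 5343, 66),
]

total_downloads = sum(count for _, count, _ in stats)
total_volume_gb = sum(volume for _, _, volume in stats)

# Assumption: 1 GB = 1000 MB
mean_mb_per_download = 1000 * total_volume_gb / total_downloads

# Month with the largest downloaded volume
largest_volume_month = max(stats, key=lambda row: row[2])[0]
```

The number of downloads and the downloaded volume do not move together: the month with the most downloads (May 2003) is among those with the smallest volume.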

Page 29

Countries using the CERA DB

Page 30

Contact

• Email: [email protected]

• Web: http://cera-www.dkrz.de/CERA

