
Unidata Infrastructure for Data Services



Russ Rew

GO-ESSP Workshop, LLNL

2006-06-19

2

Some Current Unidata Infrastructure Projects

LDM for distributing and processing near real-time data

Integrated Data Viewer (IDV) for testing infrastructure in platform-independent data visualization and analysis

NetCDF C-based interfaces for data access

CFIOlib for a CF conventions API (tomorrow)

NetCDF Java for advanced data access infrastructure

Common Data Model for improving interoperability

NcML for metadata annotation and data aggregation

THREDDS Data Server (TDS) for remote access to archives

GALEON for serving netCDF data through OGC Web Coverage Services (WCS)

3

LDM-6 for Internet Data Distribution

Implements a peer-to-peer system for reliable, event-driven data distribution

Supports subscriptions to many near real-time data feeds; no data center needed

Data product abstraction is general: model output, observations, text products, satellite data, radar, …

Protocols use persistent connections to achieve low latency

Highly configurable: inject, distribute, capture, filter, and process arbitrary data products

In continuous use by over 160 universities, NOAA, USGS, NASA, internationally, THORPEX global ensembles (TIGGE), …

Candidate for use in new WMO weather information system

[Diagram: data sources feeding a network of LDM nodes that relay products to one another across the Internet]
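The event-driven, subscription-based distribution described on this slide can be illustrated with a small sketch. It is not LDM code: the `Product`, `Subscription`, and `Relay` classes, the feed names, and the product identifiers are all hypothetical, and duplicate detection by identifier is a simplification chosen only for the example. The sketch assumes a subscription is a feed name plus a regular expression matched against product identifiers.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Product:
    """A generic data product: opaque bytes plus identifying metadata."""
    feed: str    # feed name (illustrative, e.g. "RADAR")
    ident: str   # product identifier string
    data: bytes


@dataclass
class Subscription:
    feed: str
    pattern: re.Pattern
    action: Callable[[Product], None]


class Relay:
    """Hypothetical relay node: runs local actions, then forwards downstream."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.subscriptions: List[Subscription] = []
        self.downstream: List["Relay"] = []
        self.seen = set()

    def subscribe(self, feed: str, pattern: str,
                  action: Callable[[Product], None]) -> None:
        self.subscriptions.append(Subscription(feed, re.compile(pattern), action))

    def inject(self, product: Product) -> None:
        key = (product.feed, product.ident)
        if key in self.seen:      # drop duplicates arriving via another path
            return
        self.seen.add(key)
        # Event-driven: arrival of a product triggers every matching action.
        for sub in self.subscriptions:
            if sub.feed == product.feed and sub.pattern.search(product.ident):
                sub.action(product)
        # Peer-to-peer: each node passes the product on to its downstream peers.
        for peer in self.downstream:
            peer.inject(product)


captured = []
upstream = Relay("upstream")
campus = Relay("campus")
upstream.downstream.append(campus)
campus.subscribe("RADAR", r"^KFTG/", captured.append)

upstream.inject(Product("RADAR", "KFTG/N0R/20060619_1200", b"..."))
upstream.inject(Product("MODEL", "GFS/20060619_1200", b"..."))
upstream.inject(Product("RADAR", "KFTG/N0R/20060619_1200", b"..."))  # duplicate
```

In this sketch only the first radar product reaches the `captured` list: the model product does not match the subscription, and the repeated radar product is dropped as a duplicate.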

4

IDV (Integrated Data Viewer)

Freely available 100% Java reference application and framework for visualization and analysis of geoscience data

Provides integrated, time-synchronized 2-D and 3-D visualizations of model output, observations, and remotely sensed data, using the University of Wisconsin's VisAD

Handles diverse formats and protocols for local and remote access: GRIB, netCDF, OPeNDAP, ADDE, HTTP, GIS, …

Serves as end-to-end test for many Unidata technologies: THREDDS services, Java netCDF, XML bundles, plug-in architecture, interactive collaboration, …

5

NetCDF’s Niche

Simple data model for scientific datasets

Portable, self-describing data

Appendable, sharable, archivable

Direct access for efficient subsetting

Metadata via attribute conventions such as CF

Flexible remote access via OPeNDAP, HTTP, WCS

Lots of applications: NCO, ncbrowse, ncview, IDV, IDL, MATLAB, ArcGIS, ...

Language interfaces include C, Java, Fortran, C++, Perl, Python, Ruby, ...

6

NetCDF-3 Data Model

Attribute
    name: String
    type: DataType
    values: 1D array

Variable
    name: String
    shape: Dimension[ ]
    type: DataType
    array: read( ), …

File
    location: Filename
    create( ), open( ), …

Dimension
    name: String
    length: int
    isUnlimited( )

DataType
    char, byte, short, int, float, double

A file has named variables, dimensions, and attributes. Variables also have attributes. Variables may share dimensions, indicating a common grid. One dimension may be of unlimited length.

Variables and attributes have one of six primitive data types.
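The netCDF-3 data model described above can be written down as a few plain Python classes. This is only a sketch for illustration, not the netCDF library API: the `ClassicFile` class, its method names, and the example file and variable names are assumptions made for this example. It enforces the two constraints stated on the slide: six primitive types and at most one unlimited dimension.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# The six primitive types listed on the slide.
PRIMITIVE_TYPES = ("char", "byte", "short", "int", "float", "double")


@dataclass
class Dimension:
    name: str
    length: Optional[int]  # None marks the unlimited dimension

    def is_unlimited(self) -> bool:
        return self.length is None


@dataclass
class Attribute:
    name: str
    type: str
    values: list  # always a 1-D array of values


@dataclass
class Variable:
    name: str
    type: str
    shape: List[Dimension]
    attributes: Dict[str, Attribute] = field(default_factory=dict)


class ClassicFile:
    """Hypothetical in-memory stand-in for a netCDF-3 file."""

    def __init__(self, location: str) -> None:
        self.location = location
        self.dimensions: Dict[str, Dimension] = {}
        self.variables: Dict[str, Variable] = {}
        self.attributes: Dict[str, Attribute] = {}

    def add_dimension(self, name: str, length: Optional[int] = None) -> Dimension:
        already_unlimited = any(d.is_unlimited() for d in self.dimensions.values())
        if length is None and already_unlimited:
            raise ValueError("netCDF-3 allows only one unlimited dimension")
        dim = Dimension(name, length)
        self.dimensions[name] = dim
        return dim

    def add_variable(self, name: str, type_: str, dim_names: List[str]) -> Variable:
        if type_ not in PRIMITIVE_TYPES:
            raise ValueError(f"{type_!r} is not a netCDF-3 primitive type")
        # Variables refer to the file's Dimension objects, so dimensions are shared.
        var = Variable(name, type_, [self.dimensions[d] for d in dim_names])
        self.variables[name] = var
        return var


f = ClassicFile("example.nc")
f.add_dimension("time")          # the single unlimited dimension
f.add_dimension("lat", 73)
f.add_dimension("lon", 144)
temp = f.add_variable("temperature", "float", ["time", "lat", "lon"])
pres = f.add_variable("pressure", "float", ["time", "lat", "lon"])
temp.attributes["units"] = Attribute("units", "char", ["K"])
```

Because `temperature` and `pressure` hold references to the same `Dimension` objects, the shared grid is explicit in the model rather than implied by matching lengths.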

7

Some NetCDF-3 Limitations

Only one shared unlimited dimension

No structures, just scalars and multidimensional arrays

No strings, just arrays of characters

Limited numeric types

No ragged arrays or nested structures

Only ASCII characters in names

Changes to file schema can be expensive

Efficient access requires reads in same order as writes

No built-in compression

Only serial I/O

Flat name space limits scalability

8

NetCDF-4 Features to Address Limitations

Multiple unlimited dimensions

Portable structured types

String type

Additional numeric types

Variable-length types for ragged arrays

Unicode names

Efficient dynamic schema changes

Multidimensional tiling (chunking)

Per variable compression

Parallel I/O

Nested scopes using Groups
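To see why multidimensional tiling helps when data is read in a different order than it was written, a simplified cost model can count how many separate pieces of storage a read touches. This is an illustrative sketch only, not how netCDF or HDF5 perform I/O; the function names, the 1000 x 1000 array, and the 100 x 100 chunk shape are assumptions chosen for the example.

```python
def contiguous_runs(shape, count):
    """Separate contiguous runs read from a row-major 2-D array.

    shape is (rows, cols) of the whole array; count is (nrows, ncols)
    of the requested hyperslab.
    """
    rows, cols = shape
    nrows, ncols = count
    if ncols == cols:
        return 1       # whole rows are adjacent in storage: one run
    return nrows       # otherwise one separate run per row touched


def chunks_touched(chunk_shape, start, count):
    """Number of chunks that overlap a hyperslab, for any number of dimensions."""
    total = 1
    for chunk, first_index, n in zip(chunk_shape, start, count):
        first_chunk = first_index // chunk
        last_chunk = (first_index + n - 1) // chunk
        total *= last_chunk - first_chunk + 1
    return total


shape = (1000, 1000)
chunk_shape = (100, 100)

# Reading one full row versus one full column.
row_runs = contiguous_runs(shape, (1, 1000))
col_runs = contiguous_runs(shape, (1000, 1))
row_chunks = chunks_touched(chunk_shape, (0, 0), (1, 1000))
col_chunks = chunks_touched(chunk_shape, (0, 0), (1000, 1))
```

Under these assumptions the contiguous layout reads a row in 1 run but a column in 1000 runs, while the chunked layout touches 10 chunks in either direction. The chunked layout gives up some speed in the favored direction in exchange for balanced access in both.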

9

NetCDF-4 Data Model (Common Data Access Model)

Dimension
    name: String
    length: int
    isUnlimited( )

Attribute
    name: String
    type: DataType
    values: 1D array

Variable
    name: String
    shape: Dimension[ ]
    type: DataType
    array: read( ), …

Group
    name: String

File
    location: Filename
    create( ), open( ), …

DataType
    PrimitiveType
        char, byte, short, int, int64, float, double, unsigned byte, unsigned short, unsigned int, unsigned int64, string
    UserDefinedType
        typename: String
        Compound, VariableLength, Enum, Opaque

A file has a top-level unnamed group. Each group may contain one or more named subgroups, variables, dimensions, and attributes. Variables also have attributes. Variables may share dimensions, indicating a common grid. One or more dimensions may be of unlimited length.

Variables and attributes have one of twelve primitive data types or one of four user-defined types.
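The group hierarchy described above can be sketched as a small tree in which each group owns its own dimensions, variables, and attributes. This is an illustration rather than the netCDF-4 API: the `Group` class, its method names, and the group and dimension names are hypothetical. The sketch assumes that a dimension defined in a group is also visible from that group's descendants, which is one way to read "nested scopes".

```python
class Group:
    """Hypothetical in-memory group: a named scope inside a file."""

    def __init__(self, name="", parent=None):
        self.name = name          # the top-level group is unnamed
        self.parent = parent
        self.groups = {}          # name -> Group
        self.dimensions = {}      # name -> length (None = unlimited)
        self.variables = {}       # name -> (type, tuple of dimension names)
        self.attributes = {}      # name -> list of values

    def create_group(self, name):
        child = Group(name, parent=self)
        self.groups[name] = child
        return child

    def find_dimension(self, name):
        """Look in this group first, then in each enclosing group."""
        group = self
        while group is not None:
            if name in group.dimensions:
                return group, group.dimensions[name]
            group = group.parent
        raise KeyError(name)

    def path(self):
        if self.parent is None:
            return "/"
        prefix = self.parent.path()
        return prefix + self.name if prefix == "/" else prefix + "/" + self.name


root = Group()
root.dimensions["time"] = None       # unlimited
root.dimensions["station"] = None    # a second unlimited dimension is allowed
model = root.create_group("model")
run1 = model.create_group("run1")
run1.dimensions["level"] = 26
run1.variables["temperature"] = ("float", ("time", "level"))

owner, length = run1.find_dimension("time")
```

Here `run1` defines only `level` itself, and the lookup for `time` walks up to the top-level group, so variables in nested groups can still share dimensions declared higher in the tree.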

10

NetCDF-4 Architecture

NetCDF-4 uses HDF5 for storage, high performance

Parallel I/O

Chunking for efficient access in different orders, efficient use of compression

Conversion using “reader makes right” approach

Provides simple netCDF interface to subset of HDF5

Also supports netCDF classic and 64-bit formats

[Diagram: layered architecture. Labels: NetCDF Java, NetCDF-3, NetCDF-4, and HDF5 applications; netCDF Java, netCDF-3, netCDF-4, HDF5, …; Java VM; POSIX I/O, MPI I/O]
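The “reader makes right” approach mentioned on this slide means the writer stores data in its own native representation and the reader converts only when its representation differs. The sketch below shows the idea for byte order using Python's `struct` module. The one-byte order tag is a made-up convention for this example and is not the HDF5 or netCDF-4 file format.

```python
import struct
import sys


def write_values(values, byte_order=sys.byteorder):
    """Writer side: store doubles in the writer's byte order, no conversion.

    A one-byte tag ("<" little-endian, ">" big-endian) records the order used.
    The tag is a hypothetical convention for this sketch.
    """
    tag = "<" if byte_order == "little" else ">"
    return tag.encode("ascii") + struct.pack(f"{tag}{len(values)}d", *values)


def read_values(blob):
    """Reader side: inspect the tag and decode using the recorded byte order."""
    tag = blob[:1].decode("ascii")
    count = (len(blob) - 1) // 8      # 8 bytes per double in standard size mode
    return list(struct.unpack(f"{tag}{count}d", blob[1:]))


data = [1.5, -2.25, 3.0]
little_blob = write_values(data, "little")
big_blob = write_values(data, "big")
from_little = read_values(little_blob)
from_big = read_values(big_blob)
```

Both blobs decode to the same values even though their bytes differ, and a writer on either kind of machine never pays a conversion cost. The alternative, converting everything to one canonical order on write, would make both sides convert whenever writer and reader share a non-canonical byte order.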

11

Status of NetCDF-4

NetCDF-4.0-alpha14 currently available for testing

Files created with alpha release use unsupported artifacts

We’re seeking feedback on performance and functionality

NetCDF-4.0-beta waiting for HDF5 1.8-beta

Will finalize file format, eliminate necessity for artifacts

Expected within a few weeks of HDF5 1.8-beta release, maybe by August 2006

HDF5 1.8 currently expected by November 2006

Has enhancements specifically for netCDF-4: variable creation order, Unicode names, dimension scales, on-the-fly numeric conversions

Plans for netCDF-4.1 and beyond on netCDF-4 web site

12

Summary

Unidata’s LDM-6 implements an event-driven architecture for low-latency data distribution

Unidata’s IDV provides a platform-independent visualization and analysis framework and reference application for integrating data from diverse sources

Unidata’s netCDF-4 software preserves backward compatibility and eliminates many limitations of netCDF-3 with only a modest increase in complexity

