Reading HDF family of formats via NetCDF-Java / CDM John Caron UCAR/Unidata.

Date post: 14-Dec-2015
Page 1:

Reading HDF family of formats via NetCDF-Java / CDM

John Caron

UCAR/Unidata

Page 2:

NetCDF-Java library

• 100% Java
• Open Source (LGPL, MIT)
• Independent implementation
• Used as a component in other software (partial list)
  – Integrated Data Viewer, THREDDS Data Server (Unidata)
  – Panoply (NASA)
  – ncBrowse (EPIC/NOAA)
  – Java NEXRAD Viewer (NCDC/NOAA)
  – MyWorld GIS (Northwestern)
  – EDC for ArcGIS, ERDDAP (SWFSC/NOAA)
  – Live Access Server (PMEL/NOAA)
  – ncWMS (Reading)
  – Matlab plug-in (USGS)

Page 3:

NetCDF-Java/CDM architecture

[Figure: architecture diagram. Labels: Application; Scientific Feature Types; Datatype Adapter; CoordSystem Builder; NetcdfDataset; NcML; NetcdfFile; I/O service provider; OPeNDAP; THREDDS; Catalog.xml; NetCDF-3; NetCDF-4; HDF5; GRIB; GINI; NIDS; Nexrad; DMSP; …]

Page 4:

Format Readers (IOSP)

• General: NetCDF, HDF5, HDF4, OPeNDAP
• Gridded: GRIB-1, GRIB-2, GEMPAK
• Radar: NEXRAD 2&3, DORADE, CINRAD, Universal Format
• Point: BUFR, ASCII
• Satellite: DMSP, GINI, McIDAS AREA
• Misc: GTOPO, Lightning, etc
• Others in development (partial list):
  – AVHRR, GPCP, GACP, SRB, SSMI, HIRS (NCDC)

Page 5:

Lines of Code (est)

          LOC     semicolons  ratio LOC  ratio semi
netcdf3   1977    846         1          1
hdf4      3151    1405        1.6        1.7
hdf-eos   3737    1695        1.9        2.0
hdf5      5735    2672        2.9        3.2
common    28121   9267
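
The two ratio columns appear to be each reader's count divided by the netcdf3 count. A minimal sketch of that arithmetic, using only the figures on this slide (the dictionary and function names are illustrative, not from the talk):

```python
# Line and semicolon counts per reader, copied from the slide.
COUNTS = {
    "netcdf3": (1977, 846),
    "hdf4": (3151, 1405),
    "hdf-eos": (3737, 1695),
    "hdf5": (5735, 2672),
}


def ratios(counts, baseline="netcdf3"):
    """Return {reader: (LOC ratio, semicolon ratio)} relative to the baseline, to one decimal."""
    base_loc, base_semi = counts[baseline]
    return {
        name: (round(loc / base_loc, 1), round(semi / base_semi, 1))
        for name, (loc, semi) in counts.items()
    }


if __name__ == "__main__":
    for name, (loc_ratio, semi_ratio) in ratios(COUNTS).items():
        print(f"{name:8s} {loc_ratio:4.1f} {semi_ratio:4.1f}")
```

Rounded to one decimal, the results agree with the ratios shown on the slide.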

Page 6:

Why all the trouble?

• ~20-40% of C/C++ time spent on portability issues
• Platform independence
  – Linux, Solaris, Windows (Sun)
  – Mac OS X (Apple)
  – AIX, Linux, Windows, z/OS (IBM)
  – HP-UX (Hewlett-Packard)
• Programmer productivity
  – Object-oriented
  – Garbage collected (no memory leaks)
  – Rich libraries
  – Open source
• Faster than C for some applications

Page 7:

Independent implementation

• Written entirely from reading HDF4, HDF5 file specifications

• Helped debug (HDF5), validate file specs

• File format spec is what will be needed in 100 years to read legacy data
  – OTOH, semantics not always obvious

• Don’t confuse reference implementation with the file/protocol specification

Page 8:

HDF family of formats

• HDF5/NetCDF-4

• HDF4

• HDF-EOS

• Note: read-only, no parallel I/O, etc

Page 9:

HDF5/NetCDF4

• Goal is to read all HDF5
  – Can read all HDF5 files that we have examples of
  – Including references, soft links
  – Complete coverage difficult to guarantee: combinatoric explosion
• Some esoteric features we are skipping
  – File drivers, external files, slib compression
• Working on a comprehensive test harness
  – JNI interface to Netcdf4/HDF5 library
  – Read every byte and compare
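
The "read every byte and compare" idea can be sketched independently of any particular library. In the sketch below the two readers are stand-ins: hypothetical callables that return the raw bytes of a named variable. The talk does not describe the harness's actual interface, so every name here is an assumption.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def first_mismatch(a: bytes, b: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if the buffers are equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        return min(len(a), len(b))  # one buffer is a strict prefix of the other
    return None


def compare_readers(
    names: Iterable[str],
    read_reference: Callable[[str], bytes],  # hypothetical: e.g. a JNI-backed reader
    read_candidate: Callable[[str], bytes],  # hypothetical: the independent reader
) -> List[Tuple[str, int]]:
    """Read each variable with both readers and report (name, offset) for every mismatch."""
    failures = []
    for name in names:
        offset = first_mismatch(read_reference(name), read_candidate(name))
        if offset is not None:
            failures.append((name, offset))
    return failures
```

Reporting the first differing offset, rather than a plain pass/fail, makes it easier to map a failure back to a chunk or a filter in the file.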

Page 10:

HDF4 / HDF-EOS

• Complete, works against all examples

• Tested against 400 sample files (27 GB)
  – Thanks to Ruth Duerr (NSIDC)

• Spot checked against HDFView

• Need systematic test to compare reading against the HDF4 C Library

Page 11:

Geolocation Primer

Page 12:

Swath

Float lat(245, 33477);

Float lon(245, 33477);

Float time(33477);

Float data(245, 33477);

Just know that it’s swath data
• 245 points cross track
• 33477 along the track
• Each scan has a time coordinate

Page 13:

Swath

Float lat(33477, 245);

Float lon(33477, 245);

Float time(33477);

Float data(245, 33477);

Page 14:

Swath

Float lat(999,999);

Float lon(999,999);

Float time(999);

Float data(999,999);

Page 15:

Swath

Float v1(999, 999);

Float v2(999, 999);

Float v3(999);

Float v4(999,999);
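
The progression across the four Swath slides can be made concrete. The sketch below asks which axis of a data array a 1-D variable could describe, using array lengths alone, which is all a reader has when dimensions are anonymous. The function name is illustrative, and this is a sketch of the guessing problem, not of how NetCDF-Java works.

```python
from typing import List, Tuple

Shape = Tuple[int, ...]


def axes_matching(data_shape: Shape, coord_shape: Shape) -> List[int]:
    """Return the axes of data_shape that a 1-D variable of coord_shape could describe,
    judging by length alone."""
    if len(coord_shape) != 1:
        raise ValueError("only 1-D candidate coordinates are handled in this sketch")
    return [axis for axis, size in enumerate(data_shape) if size == coord_shape[0]]


# First Swath slide: the sizes differ, so length alone pins time to axis 1 of data.
print(axes_matching((245, 33477), (33477,)))  # [1]

# Last two Swath slides: both axes have length 999, so either axis is a candidate.
print(axes_matching((999, 999), (999,)))  # [0, 1]
```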

Page 16:

If you write data

• Don’t rely on variable name conventions

• Don’t rely on index ordering

• Don’t rely on matching index sizes

• Minimize “you just have to know that…”

Page 17:

Dimensions

Dimensions:

d1=999;

d2=999;

Variables:

float v1(d1=999, d2=999);

float v2(d1=999, d2=999);

float v3(d2=999);

float v4(d2=999,d1=999);
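
With named, shared dimensions, the question of which variable describes which axis has a definite answer even when every size is 999. A minimal sketch, assuming each variable is described only by its ordered tuple of dimension names: a variable can serve as a coordinate for a data variable when all of its dimensions are among the data variable's dimensions. The dictionary mirrors the slide; the helper names are illustrative.

```python
from typing import Dict, List, Tuple

# Variables from the slide, each mapped to its ordered dimension names.
VARIABLES: Dict[str, Tuple[str, ...]] = {
    "v1": ("d1", "d2"),
    "v2": ("d1", "d2"),
    "v3": ("d2",),
    "v4": ("d2", "d1"),
}


def candidate_coordinates(data_var: str, variables: Dict[str, Tuple[str, ...]]) -> List[str]:
    """List the other variables whose dimensions are all dimensions of data_var."""
    data_dims = set(variables[data_var])
    return [
        name
        for name, dims in variables.items()
        if name != data_var and set(dims) <= data_dims
    ]


def axis_of(data_var: str, dim: str, variables: Dict[str, Tuple[str, ...]]) -> int:
    """Return which axis of data_var uses dimension dim, by name rather than by size."""
    return variables[data_var].index(dim)


print(candidate_coordinates("v4", VARIABLES))  # ['v1', 'v2', 'v3']
print(axis_of("v4", "d2", VARIABLES))          # 0: v3 runs along v4's first axis
```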

Page 18:

Good

Variables:
  float v1(d1=999, d2=999);
    v1:standard_name = “Latitude”;
  float v2(d1=999, d2=999);
    v2:standard_name = “Longitude”;
  float v3(d2=999);
    v3:standard_name = “Time”;
  float v4(d2=999,d1=999);

Data_type = “Swath”;
Conventions = “My unique name”;

Page 19:

If you write data

• Unique signature

• Specify dimensions

• Identify georeferencing coordinates

• Identify data type

• Units are not optional
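
The checklist can be turned into a small self-check that a writer runs before publishing a file. The sketch below works on a plain dictionary standing in for a file's metadata. The layout of that dictionary, the function name, and the `units` attribute key and unit strings are assumptions for illustration; `standard_name`, `Data_type` and `Conventions` follow the earlier "Good" slide.

```python
from typing import Any, Dict, List

REQUIRED_GLOBALS = ("Conventions", "Data_type")     # unique signature, data type
REQUIRED_ROLES = ("Latitude", "Longitude", "Time")  # georeferencing coordinates


def check_metadata(meta: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the checklist passes."""
    problems: List[str] = []
    for key in REQUIRED_GLOBALS:
        if not meta.get("globals", {}).get(key):
            problems.append(f"missing global attribute {key}")
    roles_found = set()
    for name, var in meta.get("variables", {}).items():
        attrs = var.get("attrs", {})
        if not var.get("dims"):
            problems.append(f"{name}: dimensions not specified")
        if not attrs.get("units"):
            problems.append(f"{name}: no units")
        if attrs.get("standard_name"):
            roles_found.add(attrs["standard_name"])
    for role in REQUIRED_ROLES:
        if role not in roles_found:
            problems.append(f"no variable identified as {role}")
    return problems


# Hypothetical metadata for the swath on the "Good" slide; the unit strings are made up.
EXAMPLE = {
    "globals": {"Conventions": "My unique name", "Data_type": "Swath"},
    "variables": {
        "v1": {"dims": ("d1", "d2"),
               "attrs": {"standard_name": "Latitude", "units": "degrees_north"}},
        "v2": {"dims": ("d1", "d2"),
               "attrs": {"standard_name": "Longitude", "units": "degrees_east"}},
        "v3": {"dims": ("d2",),
               "attrs": {"standard_name": "Time", "units": "seconds since 1970-01-01"}},
        "v4": {"dims": ("d2", "d1"), "attrs": {"units": "K"}},
    },
}

print(check_metadata(EXAMPLE))  # []
```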

Page 20:

HDF-EOS, HDF-EOS2

• Read “structural metadata” field to obtain more semantics
• Parse text in “ODL”
  – Data type: Swath, Grid, Point
  – Dimensions
  – Geolocation coordinate variable types: Latitude, Longitude, Time
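
To give a feel for what parsing the ODL text involves, here is a minimal sketch of a parser for nested GROUP/OBJECT blocks with KEY=VALUE lines. The sample text is invented for illustration and only modeled on the general layout of HDF-EOS structural metadata; real files carry more fields, and this is not the parser NetCDF-Java uses. Values containing commas inside quotes are not handled.

```python
from typing import Any, Dict, List


def parse_value(text: str) -> Any:
    """Convert an ODL value: ("a","b") -> list, "a" -> str, 12 -> int, else raw text."""
    text = text.strip()
    if text.startswith("(") and text.endswith(")"):
        inner = text[1:-1].strip()
        return [parse_value(item) for item in inner.split(",")] if inner else []
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return text[1:-1]
    try:
        return int(text)
    except ValueError:
        return text


def parse_odl(text: str) -> Dict[str, Any]:
    """Parse GROUP/OBJECT blocks and KEY=VALUE lines into nested dictionaries."""
    root: Dict[str, Any] = {}
    stack: List[Dict[str, Any]] = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "END":
            break
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key in ("GROUP", "OBJECT"):
            child: Dict[str, Any] = {}
            stack[-1][value] = child  # the block name becomes the dictionary key
            stack.append(child)
        elif key in ("END_GROUP", "END_OBJECT"):
            stack.pop()
        else:
            stack[-1][key] = parse_value(value)
    return root


# Invented sample, for illustration only.
SAMPLE = """
GROUP=SwathStructure
  GROUP=SWATH_1
    SwathName="ExampleSwath"
    GROUP=Dimension
      OBJECT=Dimension_1
        DimensionName="d1"
        Size=999
      END_OBJECT=Dimension_1
    END_GROUP=Dimension
    GROUP=GeoField
      OBJECT=GeoField_1
        GeoFieldName="Latitude"
        DimList=("d1","d2")
      END_OBJECT=GeoField_1
    END_GROUP=GeoField
  END_GROUP=SWATH_1
END_GROUP=SwathStructure
END
"""

swath = parse_odl(SAMPLE)["SwathStructure"]["SWATH_1"]
print(swath["Dimension"]["Dimension_1"])           # {'DimensionName': 'd1', 'Size': 999}
print(swath["GeoField"]["GeoField_1"]["DimList"])  # ['d1', 'd2']
```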

Page 21:

HDF-EOS, HDF-EOS2

• Good
  – Unique signature, identify coordinates and data type
• Not so good
  – ODL
  – Not using hdf4/5 constructs
• Bad
  – No data units
  – No time coordinate units!

Page 22:

Better EOS

Variables:
  float v1(999, 999);
    v1:standard_name = “Latitude”;
    v1:dims = “d1 d2”;
  float v2(999, 999);
    v2:standard_name = “Longitude”;
    v2:dims = “d1 d2”;
  float v3(999);
    v3:standard_name = “Time”;
    v3:dims = “d2”;
  float v4(999,999);
    v4:dims = “d2 d1”;

Page 23:

NPP (i1.4.0.3_NPP_QUAL)

• Good
  – XML better than ODL
• Not so good
  – Not using hdf4/5 constructs
• Bad
  – No data units
  – No time coordinate units!
• Fatal Error: please reboot
  – Metadata not in the same file

Page 24:

Summary

• NetCDF-Java reads entire HDFx family
• Good for Java-philes
• Needs more testing
  – Send example files, $
• Dimensions are not optional
• Keep structural and georeferencing metadata in the same file as the data
  – Can also have specialized external files

Page 25:

Contact

[email protected]

Google “netcdf java”

Page 26:

NetCDF-4 and Common Data Model (Data Access Layer)

Page 27:

Dimension primer

Float lat(180);

Float lon(360);

Float alt(20);

Float time(1200);

Float data(1200,20,180,360);

Page 28:

Unique Name!

Float lfip(lfip=180);

Float lflop(lflop=180);

Float zorg(zorg=20);

Float skdf(skdf=1200);

Float dglot(skdf=1200, zorg=20, lfip=180, lflop=180);
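
One reading of this slide is that names need not be meaningful, only shared. In NetCDF, a one-dimensional variable with the same name as its dimension is a coordinate variable for that dimension, so a reader can line up each axis of dglot with its coordinate even though two axes have length 180. A minimal sketch, with the slide's declarations held in a plain dictionary (the helper names are illustrative):

```python
from typing import Dict, List, Optional, Tuple

Dims = Tuple[Tuple[str, int], ...]

# Declarations from the slide: variable name -> ordered (dimension name, size) pairs.
VARIABLES: Dict[str, Dims] = {
    "lfip": (("lfip", 180),),
    "lflop": (("lflop", 180),),
    "zorg": (("zorg", 20),),
    "skdf": (("skdf", 1200),),
    "dglot": (("skdf", 1200), ("zorg", 20), ("lfip", 180), ("lflop", 180)),
}


def is_coordinate_variable(name: str, variables: Dict[str, Dims]) -> bool:
    """True for a 1-D variable whose single dimension has the variable's own name."""
    dims = variables[name]
    return len(dims) == 1 and dims[0][0] == name


def coordinates_for(name: str, variables: Dict[str, Dims]) -> List[Optional[str]]:
    """For each axis of the variable, return its coordinate variable, or None if there is none."""
    return [
        dim if dim in variables and is_coordinate_variable(dim, variables) else None
        for dim, _size in variables[name]
    ]


print(coordinates_for("dglot", VARIABLES))  # ['skdf', 'zorg', 'lfip', 'lflop']
```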

Page 29:

Float lfip(180);

Float lflop(180);

Float zorg(20);

Float freebish(1200);

Float dglot(1200,20,180,180);

Page 30:

Float lat(180);

Float lon(180);

Float alt(20);

Float time(1200);

Float data(1200,20,180,180);

