Input and Output II: netCDF and HDF files
netCDF files
• Data format developed by UCAR (the umbrella institution for NCAR)
• The data model is simple and flexible
• The basic building blocks of netCDF files are variables, attributes, and dimensions:
  Variables: scalars or multidimensional arrays
  Global attributes: contain information about a file (e.g. creation date, author, …)
  Variable attributes: contain information about a variable (e.g. unit, valid range, scaling factor, maximum value, minimum value, …)
  Dimensions: long scalars that record the size of one or more variables
• The above information is stored in the header of a netCDF file, and can be checked using 'ncdump -h filename'
Example of a header (output of ncdump -h):

netcdf gpcp_precip {
dimensions:
	lat = 72 ;
	lon = 144 ;
	time = UNLIMITED ; // (342 currently)
variables:
	float lat(lat) ;
		lat:units = "degrees_north" ;
		lat:actual_range = 88.75f, -88.75f ;
		lat:long_name = "Latitude" ;
	float lon(lon) ;
		lon:units = "degrees_east" ;
		lon:long_name = "Longitude" ;
		lon:actual_range = 1.25f, 358.75f ;
	float precip(time, lat, lon) ;
		precip:long_name = "Average Monthly Rate of Precipitation" ;
		precip:valid_range = 0.f, 50.f ;
		precip:units = "mm/day" ;
		precip:add_offset = 0.f ;
		precip:scale_factor = 1.f ;
		precip:actual_range = 0.f, 36.44588f ;
		precip:missing_value = -9.96921e+36f ;
		precip:precision = 32767s ;
		precip:least_significant_digit = 2s ;
		precip:var_desc = "Precipitation" ;
		precip:dataset = " GPCP Version 2x79 Experimental Combined Precipitation" ;
		precip:level_desc = "Surface" ;
		precip:statistic = "Mean" ;
		precip:parent_stat = "Mean" ;
	double time(time) ;
		time:units = "days since 1700-1-1 00:00:0.0" ;
		time:long_name = "Time" ;
		time:delta_t = "0000-01-00 00:00:00" ;
		time:actual_range = 101902., 112280. ;
		time:avg_period = "0000-01-00 00:00:00" ;

// global attributes:
		:Conventions = "COARDS" ;
		:title = "GPCP Version 2x79 Experimental Combined Precipitation" ;
		:history = "created oct 2005 by CAS at NOAA PSD" ;
		:platform = "Observation" ;
		:source = "GPCP Polar Satellite Precipitation Data Centre - Emission (SSM/I emission estimates).\n",
			"GPCP Polar Satellite Precipitation Data Centre - Scattering (SSM/I scattering estimates).\n",
			…… "NASA ftp://precip.gsfc.nasa.gov/pub/gpcp-v2/psg/" ;
		:documentation = "http://www.cdc.noaa.gov/cdc/data.gpcp.html" ;
}
Reading netCDF files
• Open and close a netCDF file:
file_id=ncdf_open(filename)   (returns a file identifier)
ncdf_close, file_id
• Discover the contents of a file (when ncdump is not available):
file_info=ncdf_inquire(file_id)
nvars=file_info.nvars
print, 'number of variables ', nvars
var_names=strarr(nvars)
for var_id=0,nvars-1 do begin
  varinfo=ncdf_varinq(file_id, var_id)
  var_names(var_id)=varinfo.name
  print, var_id, ' ', varinfo.name
endfor
Reading netCDF files (cont.)
• Reading a variable:
ncdf_varget, file_id, var_id, data
or ncdf_varget, file_id, var_name, data
• Reading a subset of a variable:
ncdf_varget, file_id, var_id, data, offset=[…], count=[…], $
  stride=[…]
where
offset is the first element in each dimension to be read
count is the number of elements in each dimension to be read
stride is the sampling interval along each dimension
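To make the three keywords concrete, here is a plain-Python sketch of the index arithmetic (Python is used only for illustration with made-up values; the course examples themselves are IDL):

```python
# Which elements does offset/count/stride select along one dimension?
# Illustrated with a hypothetical 1-D array of size 10.
data = list(range(10))           # [0, 1, ..., 9]

offset, count, stride = 2, 3, 2  # start at index 2, read 3 elements, every 2nd
indices = [offset + k * stride for k in range(count)]
subset = [data[i] for i in indices]

print(indices)  # [2, 4, 6]
print(subset)   # [2, 4, 6]
```

The same request in IDL would be offset=[2], count=[3], stride=[2] on that dimension.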
Reading netCDF files (cont.)
• Reading attributes of a variable (Table 4.9)
Sometimes the variable has missing values:
ncdf_attget, file_id, var_id, '_FillValue', missing_value
Sometimes the variable was scaled before being written to the file. Then you need to read scale_factor and add_offset:
ncdf_attget, file_id, var_id, 'scale_factor', scale_factor
ncdf_attget, file_id, var_id, 'add_offset', add_offset
data=data*scale_factor + add_offset
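The unpacking step can be sketched in plain Python (the scale, offset, and fill values below are made up for illustration). Note that missing values should be left untouched, otherwise the fill value is scaled into a misleading "real" number:

```python
# Unpack stored values: physical = stored*scale_factor + add_offset,
# skipping missing values. Hypothetical attribute values for illustration.
scale_factor = 0.01
add_offset = 273.15
missing_value = -9999.0

stored = [1200.0, -9999.0, 1500.0]
physical = [round(v * scale_factor + add_offset, 2) if v != missing_value
            else missing_value
            for v in stored]

print(physical)  # [285.15, -9999.0, 288.15]
```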
HDF files
• The data format was developed by the National Center for Supercomputing Applications (NCSA) at UIUC
• Offers a variety of data models, including multidimensional arrays, tables, images, annotations, and color palettes
• We will introduce the HDF Scientific Data Set (SDS) data model, which is the most flexible data model in HDF and shares similar features with netCDF: the basic building blocks of HDF SDS files are variables, attributes, and dimensions.
Reading HDF files
• Open and close an HDF file:
file_id=hdf_sd_start(filename)
hdf_sd_end, file_id
• Discover the contents of a file (when metadata not available):
hdf_sd_fileinfo, file_id, nvars, ngatts
varnames=strarr(nvars)
for i=0,nvars-1 do begin
  var_id=hdf_sd_select(file_id, i)
  hdf_sd_getinfo, var_id, name=name
  hdf_sd_endaccess, var_id
  varnames(i)=name
  print, i, ' ', name
endfor
• Find var_id from a variable name:
index=hdf_sd_nametoindex(file_id, name)
var_id=hdf_sd_select(file_id, index)
Reading HDF files (cont.)• Reading a variable:
hdf_sd_getdata, var_id, data
hdf_sd_endaccess, var_id
(please remember to close a variable after reading it)
• Reading a subset of a variable:
hdf_sd_getdata, var_id, data, start=[…], count=[…], $
  stride=[…]
where
start is the first element in each dimension to be read
count is the number of elements in each dimension to be read
stride is the sampling interval along each dimension
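A quick sanity check on a subset request, sketched in plain Python with hypothetical numbers (the course code itself is IDL): the last element touched along a dimension is start + (count-1)*stride, and it must stay inside the dimension.

```python
# Verify that a start/count/stride request fits inside a dimension of size n.
# The last index read is start + (count - 1) * stride.
def request_fits(n, start, count, stride):
    last = start + (count - 1) * stride
    return 0 <= start and last < n

# e.g. pulling every 12th time step (all the Januaries) out of 342 monthly records:
print(request_fits(342, 0, 29, 12))  # True:  last index read is 336
print(request_fits(342, 0, 30, 12))  # False: index 348 is out of range
```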
Reading HDF files (cont.)
• Obtain a list of all attributes associated w/ a variable
hdf_sd_getinfo, var_id, natts=natts
attnames=strarr(natts)
for i=0,natts-1 do begin
  hdf_sd_attrinfo, var_id, i, name=name, data=value
  attnames(i)=name
  print, name, ' ******* ', value
endfor
Reading multiple data files
• Put the file names in a string array:
filenames=['9701.nc', '9702.nc', '9703.nc']
• Or put the file names in a text file, one name per line:
9701.nc
9702.nc
9703.nc
9704.nc
9705.nc
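Reading a list of file names from such a text file can be sketched like this (Python is used for illustration; the file names are the hypothetical ones above):

```python
import tempfile, os

# Write a list file with one data-file name per line, then read it back
# into a string list -- the same idea as looping over readf in IDL.
names = ["9701.nc", "9702.nc", "9703.nc"]

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("\n".join(names) + "\n")
    list_path = f.name

with open(list_path) as f:
    filenames = [line.strip() for line in f if line.strip()]
os.remove(list_path)

print(filenames)  # ['9701.nc', '9702.nc', '9703.nc']
```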
Filling contours with colors
• Syntax:
contour, d, x, y, levels=levels, c_colors=c_colors, /fill
contour, d, x, y, levels=levels, c_colors=c_colors, /cell_fill
Notes:
• When there are sharp jumps in the data (e.g. coastlines or missing data), /cell_fill may work better than /fill
• levels should be set with min(levels)=min(d) and max(levels)=max(d)
• number of colors = number of levels - 1
• After plotting the filled contours, we generally need to overplot the contour lines and labels
• Example: see color.pro at http://lightning.sbs.ohio-state.edu/geo820/data/
• Color bar: see color.pro
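The relation between levels and fill colors noted above can be sketched as follows (plain Python, with illustrative numbers; the data range is made up to resemble the precip actual_range from the header example):

```python
# Build contour levels spanning [min(d), max(d)] and one color per band.
# With n_levels level boundaries there are n_levels - 1 bands to fill.
dmin, dmax = 0.0, 36.0          # hypothetical data range
n_levels = 10
step = (dmax - dmin) / (n_levels - 1)
levels = [dmin + i * step for i in range(n_levels)]
c_colors = list(range(1, n_levels))   # one color index per fill band

print(levels[0], levels[-1])    # 0.0 36.0 -> levels span the full data range
print(len(c_colors))            # 9
```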
In-class assignment VI
Data files are stored at:
http://lightning.sbs.ohio-state.edu/geo820/data/
1. Read the netCDF file ncep_skt.mon.ltm.nc for the NCEP reanalysis surface skin temperature (skt) climatology. Make a contour plot with colors and a color bar for the skt map of January (time index 0). Then make a 4-panel plot of the skt maps for January, April, July, and October.
2. Read the HDF file ssmi.hdf. Read only the variable v10m. Convert the data back to its original physical values. Use different methods to check whether you have read the data correctly.