WEB SERVICES FOR UNIFIED ACCESS TO NATIONAL HYDROLOGIC DATA
REPOSITORIES AND REAL TIME OBSERVATION DATA:
CUAHSI HIS EXPERIENCE
Ilya Zaslavsky, San Diego Supercomputer Center, UCSD
CUAHSI = Consortium of Universities for the Advancement of Hydrologic Sciences, Inc.; HIS = Hydrologic Information System
Collaborative Project: UT Austin + SDSC + Drexel + Duke + Utah State
www.cuahsi.org/his/
SAN DIEGO SUPERCOMPUTER CENTER, UCSDSciR&D
SDSC Spatial Information Systems Lab
Research and system development
• Services-based spatial information integration infrastructure
• Mediation services for spatial data, query processing, map assembly services
• Long-term spatial data preservation
• Spatial data standards and technologies for online mapping (SVG, WMS/WFS)
• Support of spatial data projects at SDSC and beyond
[Figure: GRID SERVICES FOR MAP INTEGRATION (Mediator, LegendGenerator, MapAssembler, Ontology, …) built on spatial web services (WSDL/WS) from federal agencies, CA state, ESRI ("Figure 1.26 The Geography Network."), county spatial data and toxicant information, Telesis and other local non-profits, student projects, and the CHI ME Model. Services are used in geosciences (GEON, CUAHSI, CBEO, …), in regional development (NIEHS SBRP, Katrina), and in neurosciences (BIRN, CCDB).]
The Grid is becoming the backbone for collaborative science and data sharing
CI is about RE-USING data and research resources !!
Cyberinfrastructure for hydrology
• Hydrologic observations:
  • Reliance on federally-organized data collection (NWIS, STORET, NCDC, etc.) with huge and complex nomenclatures; hence: simplifying access to federal repositories, relatively lower emphasis on data ownership
  • Handling time in both UTC and local
  • Various spatial offsets
  • Multiple data types: time series, fields, spatial data
• Integrative discipline:
  • Interoperation with atmospheric, ocean, soils, geomorphology, social datasets and services…
• Community:
  • Organized by “natural boundaries”; hence: networks of relatively autonomous self-managed data nodes
  • Partnership with public sector water management
  • 96% use Windows for research; Excel, ArcGIS, Matlab – most popular
The CUAHSI Community, HIS and WATERS
• CUAHSI: 116 Universities (Nov. 2006)
• HIS Team: Texas, SDSC, Utah, Drexel, Duke
• Supercomputer Centers: NCSA, TACC
• Domain Sciences: Unidata, NCAR, LTER, GEON
• Government: USGS, EPA, NCDC, USDA
• Industry: ESRI, Kisters, OpenMI
[Diagram labels: HIS Team; WATERS Testbed; WATERS Network Information System; CUAHSI HIS]
WaterOneFlow Web Services
• Data access through web services (downloads)
• Data storage through web services (uploads)
• Servers: Observatory servers; Workgroup HIS; SDSC HIS servers; 3rd party servers (e.g. USGS, NCDC)
• Web services interface: GIS; Matlab; IDL; Splus, R; D2K, I2K; Programming (Fortran, C, VB)
• Web portal Interface (HDAS): information input, display, query and output services. Preliminary data exploration and discovery. See what is available and perform exploratory analyses
• Protocols: HTML - XML; WSDL - SOAP
Hydrologic Information System Service Oriented Architecture
Main Components
• Web services for accessing hydrologic repositories
• Hydrologic Observations Data Model
• Hydrologic Data Access System + Time Series Viewer + desktop clients
• Collection of CUAHSI nodes
[Diagram: CUAHSI Web Services linking data sources (NWIS, NCAR, Unidata, NASA, Storet, NCDC, Ameriflux) with client applications (ArcGIS, Excel, Matlab, Access, SAS, Fortran, Visual Basic, C/C++)]
Point Observations Information Model
• Data Source: USGS
• Network: Streamflow gages
• Sites: Neuse River near Clayton, NC
• ObservationSeries: Discharge, stage, start, end (Daily or instantaneous)
• Values {Value, Time, Qualifier}: 206 cfs, 13 August 2006

• A data source operates an observation network
• A network is a set of observation sites
• A site is a point location where one or more variables are measured
• A variable is a property describing the flow or quality of water
• An observation series is an array of observations at a given site, for a given variable, with start time and end time
• A value is an observation of a variable at a particular time
• A qualifier is a symbol that provides additional information about the value
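The hierarchy defined above can be sketched as a handful of nested record types. This is only an illustration in Python: the class and field names (including `units`) are hypothetical and are not taken from the Observations Data Model itself.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Value:
    """One observation of a variable at a particular time."""
    value: float
    time: datetime
    qualifier: Optional[str] = None  # symbol with extra information about the value


@dataclass
class ObservationSeries:
    """Observations at one site for one variable."""
    variable: str
    units: str  # hypothetical field, added for the example
    values: List[Value] = field(default_factory=list)

    @property
    def start(self) -> Optional[datetime]:
        return min((v.time for v in self.values), default=None)

    @property
    def end(self) -> Optional[datetime]:
        return max((v.time for v in self.values), default=None)


@dataclass
class Site:
    name: str
    series: List[ObservationSeries] = field(default_factory=list)


@dataclass
class Network:
    name: str
    sites: List[Site] = field(default_factory=list)


@dataclass
class DataSource:
    name: str
    networks: List[Network] = field(default_factory=list)


# The slide's example expressed with these classes.
discharge = ObservationSeries("Discharge", "cfs", [Value(206.0, datetime(2006, 8, 13))])
clayton = Site("Neuse River near Clayton, NC", [discharge])
usgs = DataSource("USGS", [Network("Streamflow gages", [clayton])])
```

Keeping start and end as derived properties of the series, rather than stored fields, is a choice made for this sketch; a catalog service would more likely store them.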
Observations Data Model Schema (version 4.0)
Schema groups: Data Source and Network; Sites; Variables; Values; Metadata
Diagram labels: Depth of snow pack; Streamflow; Landuse, Vegetation; Windspeed, Precipitation
Controlled Vocabulary Tables: e.g. mg/kg, cfs; e.g. depth; e.g. Non-detect, Estimated
• A data source operates an observation network
• A network is a set of observation sites
• A site is a point location where one or more variables are measured
• A variable is a property describing the flow or quality of water
• A value is an observation of a variable at a particular time
• Metadata provide information about the context of the observation.
From Ernest To, David Maidment, CRWR
Water Data Web Sites
NWISWeb site output
# agency_cd Agency Code
# site_no USGS station number
# dv_dt date of daily mean streamflow
# dv_va daily mean streamflow value, in cubic-feet per-second
# dv_cd daily mean streamflow value qualification code
#
# Sites in this file include:
# USGS 02087500 NEUSE RIVER NEAR CLAYTON, NC
#
agency_cd site_no dv_dt dv_va dv_cd
USGS 02087500 2003-09-01 1190
USGS 02087500 2003-09-02 649
USGS 02087500 2003-09-03 525
USGS 02087500 2003-09-04 486
USGS 02087500 2003-09-05 733
USGS 02087500 2003-09-06 585
USGS 02087500 2003-09-07 485
USGS 02087500 2003-09-08 463
USGS 02087500 2003-09-09 673
USGS 02087500 2003-09-10 517
USGS 02087500 2003-09-11 454
Time series of streamflow at a gaging station
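Output like the NWISWeb listing above has to be parsed by every client that reads the web site directly, which is part of the motivation for a uniform service. A minimal parsing sketch follows, assuming whitespace-separated columns as they appear on the slide, `#` comment lines, and an optional fifth qualifier column; the `DailyValue` type and the function name are hypothetical.

```python
from datetime import date
from typing import List, NamedTuple, Optional


class DailyValue(NamedTuple):
    agency_cd: str
    site_no: str
    dv_dt: date
    dv_va: float
    dv_cd: Optional[str]  # qualification code, absent in the slide's rows


def parse_daily_values(text: str) -> List[DailyValue]:
    """Parse daily-value rows, skipping comments and the column-header row."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if fields[0] == "agency_cd":  # column names, not data
            continue
        agency, site, day, value = fields[:4]
        qualifier = fields[4] if len(fields) > 4 else None
        rows.append(DailyValue(agency, site, date.fromisoformat(day), float(value), qualifier))
    return rows


SAMPLE = """# agency_cd Agency Code
# site_no USGS station number
#
agency_cd site_no dv_dt dv_va dv_cd
USGS 02087500 2003-09-01 1190
USGS 02087500 2003-09-02 649
USGS 02087500 2003-09-03 525
"""

for row in parse_daily_values(SAMPLE):
    print(row.dv_dt.isoformat(), row.dv_va)
```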
Challenges… (1/2)
• Sites
  • STORET has stations, and measurement points, at various offsets…
  • Site metadata lacking and inconsistent (e.g. 2/3 no HUC info, 1/3 no state/county info); agency site files need to be upgraded to ODM…
  • A groundwater site is different than a stream gauge…
• Censored values
  • Values have qualifiers, such as “less than”, “censored”, etc. – per value. Sometimes mixed data types..
• Units
  • There are multiple renditions of the same units, even within one repository
  • There may be several units for the same parameter code (STORET)
  • If no value recorded – there are no units??
  • Unit multipliers
    • E.g. NCDC ASOS keeps measurements as integers, and provides a multiplier for each variable
• Sources
  • STORET requires organization IDs (which collected data for STORET) in addition to site IDs
• Time stamps: ISO 8601
  • Data types problem (conversion to PST???)
  • A service to determine UTC offsets given lat/lon and date??
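Three of the issues above (unit renditions, unit multipliers, and time stamps) can be illustrated with small normalization helpers. This is a sketch under stated assumptions: the alias table is invented for the example, and the UTC offset is passed in by the caller, since the slide leaves open how an offset would be determined from lat/lon and date.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical alias table: several renditions of one unit map to one code.
UNIT_ALIASES = {
    "cfs": "cfs",
    "ft3/s": "cfs",
    "cubic feet per second": "cfs",
}


def canonical_unit(label: str) -> str:
    """Return one canonical code for any known rendition of a unit."""
    key = label.strip().lower()
    return UNIT_ALIASES.get(key, key)


def apply_multiplier(raw: int, multiplier: float) -> float:
    """Recover a measurement stored as an integer plus a per-variable multiplier."""
    return raw * multiplier


def to_utc_iso(local: datetime, utc_offset_hours: float) -> str:
    """Attach a fixed UTC offset to a naive local time and format it as ISO 8601 in UTC."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return local.replace(tzinfo=tz).astimezone(timezone.utc).isoformat()


print(canonical_unit("Cubic Feet per Second"))
print(apply_multiplier(123, 0.1))
print(to_utc_iso(datetime(2005, 8, 1, 0, 0), -8))
```

A fixed offset ignores daylight saving time, so a production service would need a time zone database keyed by location and date.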
Challenges… (2/2)
• Values retrieval
  • USGS: by site, variable, time range
  • EPA: by organization-site, variable, medium, units, time range
  • NCDC: fewer variables, period of record applies to site, not to seriesCatalog
• Variable semantics
  • Variable names and measurement methods don’t match
    • E.g. NWIS parameter # 625 is labeled ‘ammonia + organic nitrogen’; the Kjeldahl method is used for determination but is not mentioned in the parameter description. In STORET this parameter is referred to as Kjeldahl Nitrogen.
  • One-to-one mapping not always possible
    • E.g. NWIS: ‘bed sediment’ and ‘suspended sediment’ medium types vs. STORET’s ‘sediment’.
Ontology tagging, semantic mediation
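One simple reading of ontology tagging is a lookup that tags each repository-specific term with a shared concept, so that a query on the concept can be translated back into each repository's own terms. The sketch below uses the two examples from the slide; the concept labels and function names are hypothetical, and a real mediator would draw on a maintained ontology rather than a hard-coded table.

```python
from typing import Dict, List, Optional, Tuple

# (vocabulary, term) -> shared concept. Terms follow the slide's examples;
# the concept labels on the right are invented for this sketch.
CONCEPTS: Dict[Tuple[str, str], str] = {
    ("NWIS", "625"): "nitrogen_kjeldahl",
    ("STORET", "Kjeldahl Nitrogen"): "nitrogen_kjeldahl",
    ("NWIS", "bed sediment"): "sediment",
    ("NWIS", "suspended sediment"): "sediment",
    ("STORET", "sediment"): "sediment",
}


def concept_for(vocabulary: str, term: str) -> Optional[str]:
    """Tag a repository-specific term with its shared concept, if known."""
    return CONCEPTS.get((vocabulary, term))


def terms_for(concept: str, vocabulary: str) -> List[str]:
    """Translate a shared concept back into one repository's terms (may be several)."""
    return sorted(t for (v, t), c in CONCEPTS.items() if v == vocabulary and c == concept)


print(concept_for("NWIS", "625") == concept_for("STORET", "Kjeldahl Nitrogen"))
print(terms_for("sediment", "NWIS"))
print(terms_for("sediment", "STORET"))
```

The sediment case shows why the mapping has to be many-to-one: one STORET term corresponds to two NWIS terms, so a round trip through the concept cannot always recover the original term.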
- From different database structures, data collection procedures, quality control, access mechanisms to uniform signatures … Water Markup Language
- Tested in different environments
- Standards-based
- Can support advanced interfaces via harvested catalogs
- Accessible to community
- Templates for development of new services
- Optimized, error handling, memory management, versioning, run from fast servers
And: working with agencies on setting up services!
WaterOneFlow API
• GetValues – Returns a TimeSeries
• GetSiteInfo – Station information, including a period of record
• GetVariableInfo – Returns variable/parameter information
-- developed to have a low barrier to entry
-- terminology same as Observations Database
-- reuse of common elements
GetVariableInfo
• Input
  – Vocabulary:VariableCode
• Output
  – VariableResponse

<variablesResponse xmlns:gml="http://www.opengis.net/gml"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:wtr="http://www.cuahsi.org/waterML/"
    xmlns="http://www.cuahsi.org/waterML/1.0/">
  <variables>
    <variable>
      <variableCode vocabulary="NWIS" default="true" variableID="12578">00060</variableCode>
      <variableName>Discharge, cubic feet per second</variableName>
      <units unitsAbbreviation="cfs" unitsCode="35">cubic feet per second</units>
    </variable>
  </variables>
</variablesResponse>
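A client can read a response like the one above with any namespace-aware XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on a trimmed copy of the slide's example; the function name and the shape of the returned dictionaries are hypothetical, and only the elements shown on the slide are assumed to exist.

```python
import xml.etree.ElementTree as ET

# Default namespace of the response, as shown on the slide.
NS = {"w": "http://www.cuahsi.org/waterML/1.0/"}

SAMPLE = """<variablesResponse xmlns="http://www.cuahsi.org/waterML/1.0/">
  <variables>
    <variable>
      <variableCode vocabulary="NWIS" default="true" variableID="12578">00060</variableCode>
      <variableName>Discharge, cubic feet per second</variableName>
      <units unitsAbbreviation="cfs" unitsCode="35">cubic feet per second</units>
    </variable>
  </variables>
</variablesResponse>"""


def parse_variables(xml_text: str) -> list:
    """Return one dict per <variable> element in a variablesResponse document."""
    root = ET.fromstring(xml_text)
    result = []
    for var in root.findall("w:variables/w:variable", NS):
        code = var.find("w:variableCode", NS)
        units = var.find("w:units", NS)
        result.append({
            "vocabulary": code.get("vocabulary"),
            "code": code.text,
            "name": var.findtext("w:variableName", namespaces=NS),
            "units": units.get("unitsAbbreviation"),
        })
    return result


for v in parse_variables(SAMPLE):
    print(v["vocabulary"], v["code"], v["units"])
```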
GetSiteInfo

<sitesResponse xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns="http://www.cuahsi.org/waterML/1.0/">
  <site>
    <siteInfo>
      <siteName>BIG ROCK C NR VALYERMO CA</siteName>
      <siteCode network="NWIS" siteID="4622637">10263500</siteCode>
      <geoLocation>
        <geogLocation xsi:type="LatLonPointType" srs="EPSG:4269">
          <latitude>34.42083115</latitude>
          <longitude>-117.8395072</longitude>
        </geogLocation>
      </geoLocation>
    </siteInfo>
    <seriesCatalog menuGroupName="USGS Daily Values" serviceWsdl="http://localhost/WaterOneFlowDev/DailyValues.asmx">
      <note type="sourceUrl">http://waterdata.usgs.gov/nwis/dv?referred_module=sw&amp;format=rdb&amp;date_format=YYYY-MM-DD&amp;begin_date=2006-11-17&amp;site_no=10263500</note>
      <series><!-- [snip] --></series>
    </seriesCatalog>
    <seriesCatalog menuGroupName="USGS Unit Values" serviceWsdl="http://localhost/WaterOneFlowDev/UnitValues.asmx">
      <!-- [snip] -->
    </seriesCatalog>
  </site>
</sitesResponse>

Expanded <series> element:

<series>
  <variable>
    <variableCode vocabulary="NWIS" default="true" variableID="12578">00060</variableCode>
    <variableName>Discharge, cubic feet per second</variableName>
    <units unitsAbbreviation="cfs" unitsCode="35">cubic feet per second</units>
  </variable>
  <valueCount countIsEstimated="true">30563</valueCount>
  <variableTimeInterval xsi:type="TimeIntervalType">
    <beginDateTime>1923-02-01T00:00:00</beginDateTime>
    <endDateTime>2006-10-07T00:00:00</endDateTime>
  </variableTimeInterval>
</series>
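The site description and the period of record of each series can be pulled out of a sitesResponse document in a few lines. The sketch below works on a trimmed copy of the slide's example with the series element filled in; the function name and the returned dictionary layout are hypothetical, and the code assumes every element it reads is present.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

NS = {"w": "http://www.cuahsi.org/waterML/1.0/"}

SAMPLE = """<sitesResponse xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://www.cuahsi.org/waterML/1.0/">
  <site>
    <siteInfo>
      <siteName>BIG ROCK C NR VALYERMO CA</siteName>
      <siteCode network="NWIS" siteID="4622637">10263500</siteCode>
      <geoLocation>
        <geogLocation xsi:type="LatLonPointType" srs="EPSG:4269">
          <latitude>34.42083115</latitude>
          <longitude>-117.8395072</longitude>
        </geogLocation>
      </geoLocation>
    </siteInfo>
    <seriesCatalog menuGroupName="USGS Daily Values">
      <series>
        <variable>
          <variableCode vocabulary="NWIS">00060</variableCode>
        </variable>
        <valueCount countIsEstimated="true">30563</valueCount>
        <variableTimeInterval xsi:type="TimeIntervalType">
          <beginDateTime>1923-02-01T00:00:00</beginDateTime>
          <endDateTime>2006-10-07T00:00:00</endDateTime>
        </variableTimeInterval>
      </series>
    </seriesCatalog>
  </site>
</sitesResponse>"""


def parse_site(xml_text: str) -> dict:
    """Extract site identity, location, and per-series period of record."""
    site = ET.fromstring(xml_text).find("w:site", NS)
    info = site.find("w:siteInfo", NS)
    code = info.find("w:siteCode", NS)
    loc = info.find("w:geoLocation/w:geogLocation", NS)
    series = []
    for s in site.findall("w:seriesCatalog/w:series", NS):
        interval = s.find("w:variableTimeInterval", NS)
        series.append({
            "variable": s.findtext("w:variable/w:variableCode", namespaces=NS),
            "count": int(s.findtext("w:valueCount", namespaces=NS)),
            "begin": datetime.fromisoformat(interval.findtext("w:beginDateTime", namespaces=NS)),
            "end": datetime.fromisoformat(interval.findtext("w:endDateTime", namespaces=NS)),
        })
    return {
        "name": info.findtext("w:siteName", namespaces=NS),
        "network": code.get("network"),
        "code": code.text,
        "lat": float(loc.findtext("w:latitude", namespaces=NS)),
        "lon": float(loc.findtext("w:longitude", namespaces=NS)),
        "series": series,
    }


site = parse_site(SAMPLE)
print(site["network"], site["code"], site["series"][0]["begin"].date())
```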
GetValues
• NWIS, STORET, etc.
  – Location: NWIS:10263500
  – Variable: NWIS:00060
  – Time Range: 2005-08-01 to 2005-08-03
• MODIS, etc.
  – Location: GEOM:BOX(-180 -90,180 90)
  – Variable: MODIS:11/plotarea=landocean
  – Time Range: 2000-10-01 to 2001-03-01
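The request parameters above share a `prefix:rest` shape, which a service or client could split before dispatching to a repository. The sketch below is one possible reading of the four example strings, not the WaterOneFlow grammar: the names are hypothetical, the box is assumed to list two corner points, and the `/key=value` suffix is assumed to carry options.

```python
import re
from typing import Dict, NamedTuple, Optional, Tuple

_NUM = r"(-?\d+(?:\.\d+)?)"
_BOX = re.compile(rf"^BOX\(\s*{_NUM}\s+{_NUM}\s*,\s*{_NUM}\s+{_NUM}\s*\)$")


class Location(NamedTuple):
    network: str
    site_code: Optional[str]
    bbox: Optional[Tuple[float, ...]]  # four numbers in the order given in the request


def parse_location(param: str) -> Location:
    """Split 'NWIS:10263500' or 'GEOM:BOX(x1 y1,x2 y2)' into its parts."""
    prefix, _, rest = param.partition(":")
    if prefix.upper() == "GEOM":
        match = _BOX.match(rest)
        if match is None:
            raise ValueError(f"unsupported geometry: {rest!r}")
        return Location(prefix, None, tuple(float(g) for g in match.groups()))
    return Location(prefix, rest, None)


def parse_variable(param: str) -> Tuple[str, str, Dict[str, str]]:
    """Split 'VOCAB:code/key=value/...' into vocabulary, code, and options."""
    vocabulary, _, rest = param.partition(":")
    code, *option_parts = rest.split("/")
    options = dict(part.split("=", 1) for part in option_parts)
    return vocabulary, code, options


print(parse_location("NWIS:10263500"))
print(parse_location("GEOM:BOX(-180 -90,180 90)"))
print(parse_variable("MODIS:11/plotarea=landocean"))
```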
<timeSeriesResponse xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns="http://www.cuahsi.org/waterML/1.0/">
  <queryInfo>
    <queryURL>http://waterdata.usgs.gov/nwis/dv?&amp;site_no=10263500&amp;</queryURL>
    <criteria>
      <locationParam>NWIS:10263500</locationParam>
      <variableParam>nwis:00060</variableParam>
      <timeParam>
        <beginDateTime>2005-08-01</beginDateTime>
        <endDateTime>2005-08-03</endDateTime>
      </timeParam>
    </criteria>
  </queryInfo>
  <timeSeries>
    <sourceInfo xsi:type="SiteInfoType">
      <siteName>BIG ROCK C NR VALYERMO CA</siteName>
      <siteCode siteID="4622637">10263500</siteCode>
      <geoLocation>
        <geogLocation xsi:type="LatLonPointType" srs="EPSG:4269">
          <latitude>34.42083115</latitude>
          <longitude>-117.8395072</longitude>
        </geogLocation>
      </geoLocation>
    </sourceInfo>
    <variable>
      <variableCode vocabulary="NWIS" default="true" variableID="12578">00060</variableCode>
      <variableName>Discharge, cubic feet per second</variableName>
      <units unitsAbbreviation="cfs" unitsCode="35">cubic feet per second</units>
    </variable>
    <values count="3">
      <value qualifiers="Ae" dateTime="2005-08-01T00:00:00">25</value>
      <value qualifiers="Ae" dateTime="2005-08-02T00:00:00">26</value>
      <value qualifiers="Ae" dateTime="2005-08-03T00:00:00">24</value>
    </values>
  </timeSeries>
</timeSeriesResponse>
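The time series itself can be turned into (time, value, qualifier) tuples for analysis. The sketch below reads a trimmed copy of the response above with Python's standard library; the function name is hypothetical, and the code assumes numeric values and ISO 8601 `dateTime` attributes as in the example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from typing import List, Optional, Tuple

NS = {"w": "http://www.cuahsi.org/waterML/1.0/"}

SAMPLE = """<timeSeriesResponse xmlns="http://www.cuahsi.org/waterML/1.0/">
  <timeSeries>
    <variable>
      <variableCode vocabulary="NWIS">00060</variableCode>
      <units unitsAbbreviation="cfs">cubic feet per second</units>
    </variable>
    <values count="3">
      <value qualifiers="Ae" dateTime="2005-08-01T00:00:00">25</value>
      <value qualifiers="Ae" dateTime="2005-08-02T00:00:00">26</value>
      <value qualifiers="Ae" dateTime="2005-08-03T00:00:00">24</value>
    </values>
  </timeSeries>
</timeSeriesResponse>"""


def parse_values(xml_text: str) -> List[Tuple[datetime, float, Optional[str]]]:
    """Return (time, value, qualifiers) for every <value> in the response."""
    root = ET.fromstring(xml_text)
    observations = []
    for v in root.findall("w:timeSeries/w:values/w:value", NS):
        observations.append(
            (datetime.fromisoformat(v.get("dateTime")), float(v.text), v.get("qualifiers"))
        )
    return observations


observations = parse_values(SAMPLE)
mean_flow = sum(value for _, value, _ in observations) / len(observations)
print(len(observations), mean_flow)
```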
Hydrologic Data Access System
HIS nodes: cross-platform design
• Central CUAHSI HIS Node (Windows): IIS Web Server, ASP.NET, SQL Server, ArcGIS Technologies, HDAS, HODM, Web Services, Web Service proxies, Data
• GEON Data Node (Linux): Apache Tomcat, GEON Software Stack, Proxy, Web Service, Data
• Remote CUAHSI HIS Nodes (Windows), each with: IIS Web Server, ASP.NET, SQL Server, ArcGIS Technologies, HDAS, HODM, Web Services, Web Service proxies, Data
• Application Services, handling of spatial data types, etc.
• Security management, distributed data management, integration with other CI projects
Resource registration
• Shapefiles
• TIFF images, GMT rasters
• Web Services, WMS services
• Relational databases, ASCII
• PDFs, URLs
• “CUAHSI data”
• NetCDF
• Coming: Geodatabases and ODM
Possible Connections
• Review of ODM
  • Dealing with observations/measurements rather than with sensor data?
• Review of WaterOneFlow services schema
  • Aligning WaterOneFlow output schemas with GML/SensorML
  • Carrying WaterOneFlow requests/responses over WFS
• Long term preservation of observation data
• Water Data Interoperability Testbed?
Survey of Observing Systems
• NEON: http://www.neoninc.org
• ORION: http://www.orionprogram.org/
• WATERS
• CUAHSI: http://www.cuahsi.org, http://river.sdsc.edu/hdas
• CLEANER: http://cleaner.ncsa.uiuc.edu/home/
• GLEON: http://www.gleon.org/, http://lakemetabolism.org/
• CREON: http://www.coralreefeon.org/
• MoveBank: http://www.princeton.edu/~wikelski/research/index.htm
• Civil Infrastructure: http://healthmonitoring.ucsd.edu/index.jsp
• IRIS/USArray: http://www.iris.edu/USArray/