Catalyzing carbon cycle science through synergies among research networks
NACP Breakout Session, January 28, 10:15
Goal: Enable scientist-led efforts that benefit from, and provide benefit to, AmeriFlux, ICOS, NEON, and other carbon-focused networks through synergies and collaboration
1. And some (very quick) examples
2. But what is needed?
3. What can we each contribute?
4. Therefore, how can this be achieved?
5. Reporting back
Niwot Ridge Subalpine Forest AmeriFlux Site (US-NR1)
• An “old network”; site started 1999
• Several data synthesis projects with AmeriFlux sites (e.g. drought impacts, precipitation interception, WUE, understory LAI…)
• Data often used for remote sensing calibration purposes, and model development/verification
• Site used for instrument intercomparison studies
• Several new projects co-located near the tower due to long-term data set and data availability
Great Lakes Evaporation Network (GLEN)
• New network started by 3 individuals and 3 sites
• Data since 2009
• Already gaining some traction; several requests for data each month by forecasters, water level modelers, and others
• Facing issues with database management, uniform data quality control, etc.
Hierarchy of Environmental Observations at Harvard Forest
EC Fluxes Remote Sensing
Harvard Forest EMS Tower
Purdue ALAR
NEON AOP: Elevation & Land Classification
NASA AirMOSS: Soil Moisture
Open-Source Data Assimilation for Land Surface models
CLM + DART + Clever people
European Carbon Research Infr.
Ecosystem network: 40 to 60 ecosystem sites measuring fluxes.
Atmospheric network: 20 to 30 towers measuring concentrations.
Ocean network: ships and fixed stations measuring concentrations.
The ICOS Thematic Centers coordinate the networks, perform the centralized processing, and test and develop new methods and sensors.
The Carbon Portal is the data distribution entry point.
ICOS is:
1. A network of sites measuring GHGs in the ecosystem, atmosphere and ocean compartments
2. Four thematic centres that coordinate the activity of the sites
3. One EU-level head office and web portal
Standardization and harmonization across ecosystem networks
Same instruments and sensors
Same protocol for measurements
Centralized data processing
Standardized data products
Completely open data policy
Yes
Yes
Yes
YesYesYes
Yes Yes
Yes
YesYes Yes
No
No (Yes)
The COOPEUS project helps bring together scientists and users involved in Europe’s major environmental infrastructures (EISCAT, EPOS, LifeWATCH, EMSO, and ICOS) and in US NSF-funded projects (AMISR, EARTHSCOPE, DataONE, OOI, and NEON).
Main objectives are:
• Interlink similar activities globally and establish new synergies
• Move toward a truly global integration of existing infrastructures
• Promote and define efficient, openly shared access to the measurements
• Stimulate curiosity around the measurements in order to increase the number of users
• Propose harmonization between networks and develop new standards and methods
COOPEUS is not only EU and US. It is an open platform for discussion, willing to be as inclusive as possible. Summer schools on data use will be organized next year and are open to all interested people. Announcements will be posted on the COOPEUS website:
http://www.coopeus.eu
What Interoperability means to me (Ankur)
• Share your data, openly, freely! Stop worrying about attribution (we can solve this). Don’t have asshole data use policies. Get DOIs per site.
• Use common formats and conventions (e.g., Unidata, CF, NetCDF, XML, CSV, UTC). Never ever share data in XLS (ahem, old BADM)!
  – At least: please please put lat, lon, elevation/altitude, timezone, variable names, units, missing data indicator in file header. One file per site/year is good for tower time series
• Make downloading, extracting, subsetting your data something that can be scripted (as simple as FTP, complex as SSH keys, OpenDAP or R Package) & automated so that machine-based analysis is possible
  – Not clickable websites with complex logins (see NCEP/NWS for good examples)
• Separate indicators for missing data vs. removed/flagged vs. gap-filled data, include uncertainty estimate for all observations. Gap-filling drivers >> gap-filled fluxes
  – Don’t remove flagged data. Someone might use it!
• Don’t assume you know how everyone will use your data. Be flexible, share raw and processed fields, native time units (20 Hz?) and aggregated
  – Life is easier if you use a fixed time interval, equal number of points per time period (leap years be damned), disk space is cheap
• Provide guides for naming/unit conversion among popular similar networks
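Several of the points above (metadata in the file header, an explicit missing-data indicator, separate flags for measured vs. flagged vs. gap-filled values, and a fixed time interval) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "# key: value" header layout, the site code US-XX1, the column names, the flag codes, and the sample values are all hypothetical and do not come from any network's actual format.

```python
import csv
from datetime import datetime, timedelta

# Hypothetical layout: "# key: value" header lines, then a plain CSV table.
# Site code, coordinates, and flux values are invented for illustration.
SAMPLE = """\
# site: US-XX1
# lat: 40.0
# lon: -105.5
# elevation_m: 3050
# timezone: UTC
# missing_value: -9999
# qc_flags: 0=measured 1=flagged 2=gap-filled
timestamp,NEE_umol_m2_s,NEE_qc
2014-01-01T00:00,-0.52,0
2014-01-01T00:30,-9999,-9999
2014-01-01T01:00,1.87,1
2014-01-01T01:30,0.40,2
"""


def read_tower_file(text):
    """Return (metadata dict, list of row dicts) from a self-describing file."""
    meta, body = {}, []
    for line in text.splitlines():
        if line.startswith("#"):
            # Header line: split "key: value" at the first colon.
            key, _, value = line[1:].partition(":")
            meta[key.strip()] = value.strip()
        elif line.strip():
            body.append(line)
    missing = float(meta["missing_value"])
    rows = []
    for rec in csv.DictReader(body):
        value = float(rec["NEE_umol_m2_s"])
        qc = int(rec["NEE_qc"])
        rows.append({
            "time": datetime.strptime(rec["timestamp"], "%Y-%m-%dT%H:%M"),
            # The missing-data indicator becomes None; flagged values are kept.
            "nee": None if value == missing else value,
            "qc": None if qc == missing else qc,
        })
    return meta, rows


def has_fixed_interval(rows, step=timedelta(minutes=30)):
    """True when every consecutive pair of timestamps is exactly `step` apart."""
    times = [r["time"] for r in rows]
    return all(b - a == step for a, b in zip(times, times[1:]))


meta, rows = read_tower_file(SAMPLE)
measured = [r["nee"] for r in rows if r["qc"] == 0]
gap_filled = [r["nee"] for r in rows if r["qc"] == 2]
print(meta["site"], meta["lat"], meta["lon"], meta["timezone"])
print(len(rows), measured, gap_filled, has_fixed_interval(rows))
```

Because the flag lives in its own column, a user can decide whether to keep flagged or gap-filled values rather than having that choice made for them.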
What the future (now!) looks like
pecanproject.org
The National Ecological Observatory Network is a project sponsored by the National Science Foundation and managed under cooperative agreement by NEON, Inc.. This material is based in part upon work supported by the National Science Foundation under Grant No. DBI-0752017. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
NEON, Inc. 1685 38th Street | Boulder, Colorado 80301
www.neoninc.org
INTEROPERABILITY FRAMEWORK
Contact: Hank Loescher ([email protected]), Brian Wee ([email protected])
Interoperability is Focused on 3 types of Infrastructure
Information Infrastructure
End-to-end data flows, including how the measurement was made, its metadata, traceability, data formats, research questions, and archival and retrieval processes.
Physical Infrastructure
All the physical components and design elements that contribute towards a measurement, e.g., hardware physical integration, site design, and associated uncertainties.
Support Infrastructure
Defined as: i) all the support systems to manage the construction and operation of a research infrastructure (budget, risk, schedule, scope integration), ii) structures to disseminate data (web portals), and iii) education and engagement.
Why Interoperability?
• The rapid pace of large-scale environmental global changes underscores the value of accessible long-term data sets.
• Natural, managed, and socioeconomic systems are subject to complex interacting stresses that play out over extended periods of time and space.
• An era of large-scale, interdisciplinary science fueled by large data sets.
• Data Interoperability enhances the value of current scientific efforts and investment.
• Interoperability is needed to forecast future conditions for basic understanding, and for future planning, policy, and societal benefit.
• Currently, there is no accepted approach to making large datasets interoperable.
• Provides new leadership opportunities for scientists globally.
Example: Interoperability for Information Infrastructure
The degree to which Observatories are truly interoperable is the degree to which these four elements are adopted by collaborative facilities
Signal:noise and uncertainty estimates must also be known in order for data to have broader, global utility and prognostic capability (ecological forecasting)
1. Distilling Science Questions and Hypotheses into Requirements
• Mapping Questions to ‘what must be done’
• Defining Joint Science Scope
• Requirements can define interfaces among respective datasets
2. Traceability of Measurements
• Use of Recognized Standards
• Traceability to Recognized Standards, or First Principles
• Known and managed signal:noise
• Managing QA/QC
• Uncertainty budgets
3. Algorithms/Procedures
• What is the algorithm or procedural process to create a data product?
• Provides “consistent and compatible” data
• Managed through intercomparisons
• What are their relative uncertainties?
4. Informatics
• Standards - Data Formats
• Standards - Metadata formats
• Persistent Identifiers / Open-source
• Discovery tools
• And in the case of Biodiversity: Ontologies, semantics and controlled vocabularies
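As one illustration of what an “uncertainty budget” can look like, the standard root-sum-of-squares combination is sketched below. It assumes the individual error components are independent; the symbols u_c (combined standard uncertainty), u_i (the i-th component) and n (number of components) are introduced here for illustration and do not appear in the slides.

```latex
u_c = \sqrt{\sum_{i=1}^{n} u_i^{2}}
```

For example, two independent components of 3% and 4% would combine to 5% rather than 7%.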
This Interoperability Framework is currently being implemented as part of a joint EU FP7 and US NSF Project called CoopEUS (www.coopeus.eu)
example recipe for community algorithm development
deployable @ master branch
fork @ working group member
processing @ user branch
improvement @ developer branch
code base @ NEON
improvement @ merge branch
pull request
algorithm lead @NEON
algorithm integration team @NEON
improvement @ community resource