Data Life Cycle
GeoData 2011 Workshop
March 2, 2011, Broomfield, CO
Peter Fox (RPI), Tetherless World Constellation, http://tw.rpi.edu
Motivation, temptation
• A world of challenges – as if Tim did not motivate you enough
• Data and people at the heart of it
• Researchers and their data are valuable (as ever)
• But not enough attention, focus
Working premise
Scientists – actually ANYONE – should be able to access and use a global, distributed knowledge base of scientific data that:
• appears to be integrated
• appears to be locally available
But… data and information are obtained by multiple means (instruments, models, analysis) using various (often opaque) protocols, in differing vocabularies, using (sometimes unstated) assumptions, with inconsistent (or non-existent) metadata. They may be inconsistent, incomplete, evolving, and distributed AND created in a form that facilitates generation, not use (except by accident)
And … significant levels of semantic heterogeneity, large-scale data, complex data types, legacy systems, inflexible and unsustainable implementation technology…
Uh-oh
Definitions
• Data - are encodings that represent the qualitative or quantitative attributes of a variable or set of variables.
• Data (plural of "datum", which is seldom used) - are typically the results of measurements and can be the basis of graphs, images, or observations of a set of variables, but now also include model output, etc.
• Data - are often viewed as the lowest level of abstraction from which information and knowledge are derived***
Definitions ctd.
• Information – Representations (of facts? data?) in a form that lends itself to human use
• Knowledge – Check out Wikipedia…. meaning
• Metadata – data about data
• Metainformation – information about information
• Data documentation – integrated collection of information and metadata intended to support all aspects of data (find, access, use…)
Examples
• Rock sample:
– Data – weight, composition, shape, size
– Information – images of the rock as collected
– Knowledge – evidence of geologic activity
– Metadata – location and time of collection
– Documentation – published lab report …
• Weather
– Data – wind speed and direction, temperature, ..
– Information – weather map with contours and features
– Knowledge – high pressure system, stable weather
– Metadata – type of radar, sensor, use of model
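One way to keep these layers apart in practice is to store them in separate fields. The sketch below is illustrative only: the class name `DescribedSample`, its field names, and the sample values are all made up for this example and come from no particular standard.

```python
from dataclasses import dataclass, field


@dataclass
class DescribedSample:
    """Hypothetical container that keeps the five layers from the examples apart."""
    data: dict            # measured attributes (weight, composition, ...)
    information: list     # human-oriented representations, e.g. image file names
    knowledge: str        # interpretation drawn from the data and information
    metadata: dict        # data about the data (where, when, how collected)
    documentation: list = field(default_factory=list)  # reports, notes

    def missing_layers(self):
        """Return the names of the layers that were left empty."""
        return [name for name, value in vars(self).items() if not value]


# Made-up rock sample: measurements exist, but nobody recorded
# the location and time of collection or attached a lab report.
rock = DescribedSample(
    data={"weight_g": 412.0, "shape": "angular"},
    information=["rock_as_collected.jpg"],
    knowledge="evidence of geologic activity",
    metadata={},
)
print(rock.missing_layers())  # ['metadata', 'documentation']
```

A check like `missing_layers` makes the gap visible at the point of creation, which is when the collector still knows the answers.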
Cox/2005 AGU Spring
Fields vs. objects
classic geology: “Feature” viewpoint
• complex data
• database insertion
• complete feature interpretations
• XML documents
classic geophysics: “Coverage” viewpoint
• simple data structures
• collated/gridded, ready for analysis
• netCDF, HDF-EOS
Definitions ctd.
• Data life-cycle elements (simple 3-level)
– Acquisition: Process of recording or generating a concrete artefact from the concept (see transduction)
– Curation: The activity of managing the use of data from its point of creation to ensure it is available for discovery and re-use in the future (http://www.dcc.ac.uk/FAQs/data-curator)
– Preservation: Process of retaining usability of data in some source form for intended and unintended use
– Stewardship: Process of maintaining integrity across acquisition, curation and preservation
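The relationship between the three stages and stewardship can be sketched in code. This is a minimal illustration, not a prescribed design: the names `Stage` and `TrackedDataset` are hypothetical, and it assumes "integrity" can be approximated by an unchanged SHA-256 checksum. That is a simplification, since real curation may legitimately transform data (for example, a format migration).

```python
import hashlib
from enum import Enum


class Stage(Enum):
    ACQUISITION = 1
    CURATION = 2
    PRESERVATION = 3


def checksum(payload: bytes) -> str:
    """SHA-256 digest of the raw bytes, used here as a stand-in for integrity."""
    return hashlib.sha256(payload).hexdigest()


class TrackedDataset:
    """Hypothetical dataset that records a checksum at every stage hand-off."""

    def __init__(self, payload: bytes):
        self.payload = payload
        self.stage = Stage.ACQUISITION
        self.history = [(Stage.ACQUISITION, checksum(payload))]

    def advance(self):
        """Move to the next stage and record the checksum at that point."""
        if self.stage is Stage.PRESERVATION:
            raise ValueError("already in the final stage")
        self.stage = Stage(self.stage.value + 1)
        self.history.append((self.stage, checksum(self.payload)))

    def integrity_ok(self) -> bool:
        """Stewardship check: the checksum is identical at every recorded stage."""
        return len({digest for _, digest in self.history}) == 1


clean = TrackedDataset(b"station readings")
clean.advance()
clean.advance()
print(clean.integrity_ok())    # True

damaged = TrackedDataset(b"station readings")
damaged.payload = b"station readingz"   # silent corruption after acquisition
damaged.advance()
print(damaged.integrity_ok())  # False
```

Stewardship appears here as a check that spans the whole history rather than as a fourth stage, which matches its definition as maintaining integrity across the other three.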
Definitions ctd.
• Stewardship -> Management: Process of arranging for discovery, access and use of data, information and all related elements.
• Also oversees or effects control of processes for acquisition, curation, preservation and stewardship. Involves fiscal and intellectual responsibility.
• Not explicitly the focus of this workshop..
.. Data has Lots of Audiences
From “Why EPO?”, a NASA internal report on science education, 2005
More Strategic
Less Strategic
Science too!
On to Life Cycle…
• Life Cycle, lifecycle, life-cycle …
• By now I hope you know I know it’s about a mix of factors
• Research data and researchers
Physical quantity versus measured as quantity
Value and units?
Reference frame?
Reference units?
Value and units?
Courtesy Krishna Sinha (VT)
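The questions on this slide (value, units, reference frame) can be made concrete with a small sketch. The `Measurement` class, the `comparable` helper, and the example values are assumptions made for illustration; they do not come from the original figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """Hypothetical record: a number is only usable with its units and frame."""
    value: float
    units: str
    reference_frame: str


def comparable(a: Measurement, b: Measurement) -> bool:
    """Two measurements can be compared directly only if both units and
    reference frame agree; otherwise a conversion is needed first."""
    return a.units == b.units and a.reference_frame == b.reference_frame


# Made-up heights: same value, same units, different reference frames.
h1 = Measurement(120.0, "m", "mean sea level")
h2 = Measurement(120.0, "m", "WGS84 ellipsoid")
print(comparable(h1, h2))  # False
```

The two values look identical as bare numbers, which is exactly the trap when the reference frame is left unstated.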
19 April 2023 © GEO Secretariat
Local in-situ Networks and Systems
Air pollution measurement station, Emden, Germany
Local and national air pollution networks: Venice, Italy, and Indonesia
Global in situ Networks and Systems
Global Seismic Network: signal from the Indian Ocean Earthquake, 26 December 2004
Global Argo Float Array: measuring ocean temperature and salinity
Satellite Observation Systems
ENVISAT RA-2 observing the Gulf Stream current velocity
Modeling the Climate as a System
Transformative Science, Data Infrastructures and the IPCC Experience
Lawrence Buja
National Center for Atmospheric Research
Boulder, Colorado
CAM T341 - Jim Hack
Briefing on Results: USGS Science Strategy to Support U.S. Fish & Wildlife Service Polar Bear Listing Decision: a 6 month effort
U.S. Department of the Interior
U.S. Geological Survey
Temptation
• To run screaming from the room?
– Wait – there are cookies (and a reception)!
• To really focus on what you are DOING (less on what you WANT to do) and NOT DOING, but need to – near term (next week)
• Talk about it… argue it… listen to others
• To focus on value – the real and immediate value to you and the people you work with and institution/communities you work for/with!