Post on 23-Dec-2015
SWEET: Upper-Level Ontologies for
Earth System Science
OPeNDAP Meeting, Feb 2007
Rob Raskin, PO.DAAC
Jet Propulsion Laboratory
Data to Knowledge
Data Information Knowledge
Basic Elements: Bytes, Numbers, Models, Facts
Services: Ingest, Archive, Visualize, Infer, Understand, Predict
Storage: File, Database, HDF-EOS, GIS/MIS, Ontology, Mind
Interoperability: Syntactic, OPeNDAP, WMS/WCS, Semantic
Volume/Density: High/Low, Low/High
Statistics: Checksum, Moments, Descriptive, Inferential
Analysis: Fourier, Wavelet, EOF, SSA
Methodology: Exploratory-analysis, Model-based-mining
Syntax Semantics
Semantics: Shared Understanding of Concepts
Provides a namespace for scientific terms…plus
Provides descriptions of how terms relate to one another
Example tags in markup language: subclass, subproperty, part of, same as, transitive property, cardinality, etc.
Enables object in “data space” to be associated formally with object in “science concept space”
“Shared understanding” enables software tools to find “meaning” in resources
Ontology Representation
W3C has adopted four XML-based standard ontology languages: RDF, OWL-Lite, OWL-DL, OWL Full
Basic building blocks: Class, subclass, property, subproperty, sameAs
Standard language enables anyone to extend an ontology
Knowledge built up incrementally
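The building blocks above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the triples and the helper names (`superclasses`, `aliases`) are hypothetical and are not taken from SWEET or from any OWL library. It shows how subclass links are followed transitively and how an ontology is extended by adding statements rather than changing existing ones.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) statements, for illustration only.
TRIPLES = [
    ("SeaWater", "subClassOf", "Water"),
    ("Water", "subClassOf", "Substance"),
    ("CO2", "sameAs", "CarbonDioxide"),
]


def superclasses(term, triples):
    """Return every class reachable from `term` by following subClassOf."""
    parents = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "subClassOf":
            parents[subject].add(obj)
    found, stack = set(), [term]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def aliases(term, triples):
    """Return terms linked to `term` by sameAs, in either direction."""
    linked = set()
    for subject, predicate, obj in triples:
        if predicate == "sameAs":
            if subject == term:
                linked.add(obj)
            elif obj == term:
                linked.add(subject)
    return linked


# Incremental knowledge: a later author only appends statements.
EXTENDED = TRIPLES + [("Substance", "subClassOf", "Thing")]
```

Calling `superclasses("SeaWater", EXTENDED)` picks up the appended statement without any change to the original list.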
Why an Upper-Level Ontology for Earth System Science?
Many common concepts used across Earth Science disciplines (such as properties of the Earth)
Provides common definitions for terms used in multiple disciplines or communities
Provides common language in support of community and multidisciplinary activities
Reduced burden (and barrier to entry) on creators of specialized domain ontologies
Only need to create ontologies for incremental knowledge
Semantic Web for Earth & Environmental Terminology
(SWEET)
Ontology of Earth system science and data concepts
Provides a common semantic framework (or namespace) for describing Earth science information and knowledge
Emphasis on improving search for NASA Earth science data resources
Represented in OWL-DL
SWEET Ontologies
[Figure: diagram of the SWEET ontologies. Labels: Non-LivingSubstances, LivingSubstances, PhysicalProcesses, Earth Realm, PhysicalProperties, Time, NaturalPhenomena, Human Activities, Space, Data, Units, Numerics. Group labels: Integrative Ontologies, Faceted Ontologies.]
SWEET Supports Knowledge Reuse
SWEET is a concept space
Enables scalable classification of Earth science and data-related concepts
Enables object in data space to be mapped to science concept space
Concept space is translatable into other languages/cultures using “sameAs” notions
SWEET Science Ontologies
Earth Realms: Atmosphere, SolidEarth, Ocean, LandSurface, …
Physical Properties: temperature, composition, area, albedo, …
Substances: CO2, water, lava, salt, hydrogen, pollutants, …
Living Substances: Humans, fish, …
SWEET Conceptual Ontologies
Phenomena: ElNino, Volcano, Thunderstorm, Deforestation, Terrorism, physical processes (e.g., convection)
Each has associated EarthRealms, PhysicalProperties, spatial/temporal extent, etc.
Specific instances included, e.g., 1997-98 ElNino
Human Activities: Fisheries, IndustrialProcessing, Economics, …
SWEET Numerical Ontologies
SpatialEntities
Extents: country, Antarctica, equator, inlet, …
Relations: above, northOf, …
TemporalEntities
Extents: duration, century, season, …
Relations: after, before, …
Numerics
Extents: interval, point, 0, positiveIntegers, …
Relations: lessThan, greaterThan, …
Units
Extracted from Unidata’s UDUnits
Added SI prefixes
Multiplication of two quantities carries units
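The point that multiplying two quantities carries their units can be illustrated with a small sketch. It does not use UDUnits or the SWEET Units ontology. The `Quantity` class, the `from_prefixed` helper, and the prefix table are hypothetical names introduced here, and units are assumed to be simple products of named base units with integer exponents.

```python
from dataclasses import dataclass

# Small, assumed subset of SI prefixes for illustration.
SI_PREFIXES = {"k": 1e3, "c": 1e-2, "m": 1e-3}


@dataclass(frozen=True)
class Quantity:
    value: float
    units: tuple  # sorted (unit_name, exponent) pairs, zero exponents dropped

    @staticmethod
    def of(value, **exponents):
        """Build a quantity, e.g. Quantity.of(3.0, m=1, s=-1) for 3 m/s."""
        pairs = sorted((u, e) for u, e in exponents.items() if e != 0)
        return Quantity(value, tuple(pairs))

    def __mul__(self, other):
        # Multiply the values and add the exponents of matching units.
        combined = dict(self.units)
        for unit, exponent in other.units:
            combined[unit] = combined.get(unit, 0) + exponent
        return Quantity.of(self.value * other.value, **combined)


def from_prefixed(value, prefix, unit):
    """Convert a prefixed value such as 2 'k' 'm' into the base unit."""
    return Quantity.of(value * SI_PREFIXES[prefix], **{unit: 1})


speed = Quantity.of(3.0, m=1, s=-1)
elapsed = Quantity.of(2.0, s=1)
distance = speed * elapsed  # the seconds cancel, leaving metres
```

Here `distance.units` contains only the metre, because the exponents of the second sum to zero and are dropped.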
Numerical Ontologies
Numeric concepts defined in OWL only through standard XML XSD spec
Intervals defined as restrictions on real line
Added in SWEET
Numerical relations (lessThan, max, …)
Cartesian product (multidimensional spaces)
Numeric ontologies used to define spatial and temporal concepts
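A rough sketch of the ideas on this slide: an interval as a restriction on the real line, a lessThan relation between intervals, and a Cartesian product of intervals forming a multidimensional region. The class names `Interval` and `Box`, the closed-interval convention, and the latitude/longitude example are assumptions made for illustration, not SWEET definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [low, high] on the real line (closure is an assumption)."""
    low: float
    high: float

    def contains(self, x):
        return self.low <= x <= self.high

    def less_than(self, other):
        # Every point of this interval lies strictly below the other one.
        return self.high < other.low


@dataclass(frozen=True)
class Box:
    """Cartesian product of intervals, one interval per dimension."""
    axes: tuple

    def contains(self, point):
        return len(point) == len(self.axes) and all(
            axis.contains(x) for axis, x in zip(self.axes, point)
        )


# Hypothetical spatial extent built from two numeric intervals (lat, lon).
latitude_band = Interval(-30.0, 30.0)
longitude_band = Interval(120.0, 180.0)
region = Box((latitude_band, longitude_band))

# Hypothetical temporal extents expressed as intervals of years.
century_19 = Interval(1801, 1900)
century_20 = Interval(1901, 2000)
```

The same `Interval` type serves both the spatial and the temporal examples, which is the sense in which numeric concepts underlie spatial and temporal ones.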
XSD: Datatypes
Numeric: boolean, decimal, float, double, integer, nonNegativeInteger, positiveInteger, nonPositiveInteger, negativeInteger, long, int, short, unsignedLong, unsignedInt, unsignedShort, unsignedByte, hexBinary, base64Binary
String: string, normalizedString, anyURI, token, language, NMTOKEN, Name, NCName
Date: dateTime, time, date, gYearMonth, gYear, gMonthDay, gDay, gMonth
Data and Services Ontology
Formats
Data models
Data Structures
Special values: Missing, land, sea, ice, etc.
Parameters: Scale factors, offsets, algorithms
Data Services: Subset, reproject
Example: AIRS Level 2 Dataset
Subset of Dataset where
DataModel= Level 2
Instrument= AIRS
HorizontalDimension= 2
VerticalDimension= 1
Format= HDF-EOS
Property= Temperature
Substance= Air
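Describing a dataset as a subset defined by facet values amounts to filtering on attribute/value pairs. The sketch below assumes a hypothetical in-memory catalog. The facet names follow the slide, but the record names, the second record, and the `subset` helper are invented for illustration.

```python
# Hypothetical catalog records; only the facet names come from the slide.
DATASETS = [
    {
        "name": "example_airs_l2_temperature",
        "DataModel": "Level 2",
        "Instrument": "AIRS",
        "HorizontalDimension": 2,
        "VerticalDimension": 1,
        "Format": "HDF-EOS",
        "Property": "Temperature",
        "Substance": "Air",
    },
    {
        "name": "example_other_dataset",
        "DataModel": "Level 3",
        "Instrument": "OtherInstrument",
        "HorizontalDimension": 2,
        "VerticalDimension": 0,
        "Format": "HDF-EOS",
        "Property": "Temperature",
        "Substance": "Water",
    },
]


def subset(datasets, **facets):
    """Keep the datasets whose facets match every requested value."""
    return [
        d for d in datasets
        if all(d.get(key) == value for key, value in facets.items())
    ]


airs_l2 = subset(
    DATASETS,
    DataModel="Level 2",
    Instrument="AIRS",
    Property="Temperature",
    Substance="Air",
)
```

Loosening the query, for example to `subset(DATASETS, Format="HDF-EOS")`, widens the result, which is how a faceted description supports search.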
Fragment of SWEET
[Figure: class diagram. Classes: Atmosphere, AtmosphereLayer, Troposphere, Tropopause, Stratosphere, PlanetaryLayer, 3DLayer. Relations: subClassOf, partOf, isUpperBoundaryOf, isLowerBoundaryOf. Annotations: upperBoundary=50 km, lowerBoundary=15 km, primarySubstance=“air”, sameAs=“LowerAtmosphere”.]
How SWEET was Initially Populated
Initial sources
GCMD
Over 10,000 datasets
Over 1000 keywords
Data providers submit additional terms for “free-text” search
CF
Over 700 keywords
Very long term names, e.g. surface_downwelling_photon_spherical_irradiance_in_sea_water
Decomposed into facets
Property= spherical_irradiance
Substance= sea_water
Space= surface
Direction= down
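The decomposition of a long CF name into facets can be sketched as a token lookup. This is not the procedure used to populate SWEET. The vocabularies below are tiny, hypothetical tables chosen so that the slide's example comes out as shown, and the `Unmatched` key is an invented place to keep tokens the tables do not cover.

```python
# Hypothetical lookup tables; a real vocabulary would be far larger.
SPACES = {"surface"}
DIRECTIONS = {"downwelling": "down", "upwelling": "up"}
PROPERTIES = {"spherical_irradiance", "temperature"}


def decompose(name):
    """Split a CF-style name into Space, Direction, Property, Substance."""
    facets = {}
    # Assumption: the text after "_in_" names the substance.
    head, separator, substance = name.partition("_in_")
    if separator:
        facets["Substance"] = substance
    tokens = head.split("_")
    if tokens[0] in SPACES:
        facets["Space"] = tokens.pop(0)
    if tokens and tokens[0] in DIRECTIONS:
        facets["Direction"] = DIRECTIONS[tokens.pop(0)]
    # Take the longest trailing run of tokens that is a known property.
    for start in range(len(tokens)):
        candidate = "_".join(tokens[start:])
        if candidate in PROPERTIES:
            facets["Property"] = candidate
            tokens = tokens[:start]
            break
    if tokens:
        facets["Unmatched"] = tokens  # tokens not covered by the tables
    return facets


facets = decompose(
    "surface_downwelling_photon_spherical_irradiance_in_sea_water"
)
```

With these tables the token `photon` is not assigned to any facet and is returned under `Unmatched`, which mirrors the fact that the slide's four facets do not mention it.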
Collaboration Web Site
Discussion tools: Blog, wiki, moderated discussion board
Version Control/Configuration Management
Trace dependencies on external ontologies
Tools to search for existing concepts in registered ontologies
Ontology Validation Procedure
W3C note is formal submission method
Registry/discovery of ontologies
Support workflows/services for ontology development
Community Issues
Content
Maintain alignment given expansion of classes and properties
Standards and Conventions
Agreement on standards for use of OWL
Fuzzy representation conventions
Submit as standard to NASA Standards & Processes Working Group Review Board
Who will oversee and maintain in perpetuity (or at least through the next funding cycle)?
ESIP Federation? A new consortium?
Global Support
Provide tools to visualize and appreciate the big picture
Update/Matching Issues
No removal of terms except for spelling or factual errors
Subscription service to notify affected ontologies when changes made
Must avoid contradictions
Additions can create redundancy if sameAs not used
Humans must oversee “matching”
CF has established moderator to carry out analogous additions
OWL “import” imports entire file
Associate community with ontology terms
Community tagging
Best Practices
Keep ontologies small, modular
Be careful that “owl:imports” imports everything
Use higher level ontologies where possible
Identify hierarchy of concept spaces
Model schemas
Try to keep dependencies unidirectional
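Two of the practices above lend themselves to a simple check: an import pulls in everything the imported file itself imports, and dependencies should run in one direction only. The sketch below treats imports as a directed graph. The module names and the `IMPORTS` table are hypothetical and do not describe the actual SWEET file layout.

```python
# Hypothetical import graph: each ontology maps to the ontologies it imports.
IMPORTS = {
    "phenomena": ["realm", "property"],
    "realm": ["space"],
    "property": ["units"],
    "space": ["numerics"],
    "units": ["numerics"],
    "numerics": [],
}


def import_closure(name, imports):
    """Everything loaded, directly or indirectly, when `name` is imported."""
    seen, stack = set(), list(imports.get(name, []))
    while stack:
        module = stack.pop()
        if module not in seen:
            seen.add(module)
            stack.extend(imports.get(module, []))
    return seen


def has_cycle(imports):
    """True if some ontology ends up importing itself."""
    return any(name in import_closure(name, imports) for name in imports)


loaded = import_closure("phenomena", IMPORTS)

# Adding a back-edge from a low-level module breaks unidirectionality.
BROKEN = dict(IMPORTS, numerics=["phenomena"])
```

The size of `loaded` shows why small, modular files matter: importing one high-level ontology here loads five others.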
Web Sites
http://sweet.jpl.nasa.gov
http://PlanetOnt.org