Dictionaries, Vocabularies, Namespaces, Thesauri, Ontologies, and all that Rob Raskin NASA/Jet Propulsion Laboratory Raskin@jpl.nasa.gov June 21, 2011
Transcript
Page 1: Dictionaries, Vocabularies, Namespaces, Thesauri, Ontologies, and all that Rob Raskin NASA/Jet Propulsion Laboratory Raskin@jpl.nasa.gov June 21, 2011.

Dictionaries, Vocabularies, Namespaces, Thesauri, Ontologies, and all that

Rob Raskin
NASA/Jet Propulsion Laboratory
Raskin@jpl.nasa.gov

June 21, 2011

Page 2

Why care about data semantics?

Current data may need to be archived for decades or centuries
Global change analysis requires consistent comparisons across decades or centuries
Synonyms: multiple words, same meaning
Homonyms: same word, multiple meanings
Measurement ambiguities: Sea “surface” temperature - at what “height”?

Page 3

Semantic Understanding is Difficult!

Let’s eat, Grandma.
Let’s eat Grandma.

Time flies like an arrow.
Fruit flies like a pie.

LA Times headline

“Mission accomplished. Major combat operations in Iraq have ended”
- Pres. Bush, 2003

Variable t: temperature
Variable t: time

Data quality = 5
Data quality = 3

Surface wind: measured 3 m above surface
Surface wind: measured at surface

Page 4

Semantic Spectrum

From Vocabulary (human-readable) to Ontology (machine-readable), in order of increasing semantics:

Catalog: list of controlled words
Informal Hierarchy: terms classified by categories (e.g. GCMD)
Formal Hierarchy: terms inherit properties/meaning of parent
Formal Hierarchy w/ Relations: relations between children defined

Page 5

Scope of Representation

Parameter names
Scientific units
Spatial/temporal extent/resolution
Data quality
Data provenance
Data type
Data services

CF

Page 6

What is an Ontology?

An approach to storing knowledge
Machine-readable and human-readable
Provides definitions of words or phrases, expressed relative to other terms
Offers shared understanding of concepts and knowledge reuse
Provides semantics for machine-to-machine (or human-to-human) communications

Page 7

Practically, an ontology is a…

Framework for classifying knowledge
Ensures there is a “place” to store components of knowledge

Page 8

Ontology Languages: RDF and OWL

W3C has adopted languages that specialize XML
  Resource Description Framework (RDF)
  Web Ontology Language (OWL)
Languages predefine specific tags
  RDF: Class, subclass, property, subproperty
Class-property is similar to the Entity-Relationship model of DBMS theory

Page 9

RDF Class and Subclass

Class
  The basic element or “thing” or “noun”
Subclass
  Inherits all attributes of parent class
  Typically, adds Properties to distinguish subclass from its parent
  Can have multiple parent classes

Example:
  Cat is a Animal
  Cat has Legs 4
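
A minimal Python sketch of the subclass idea above, assuming a toy hierarchy stored as plain dictionaries. Only Cat, Animal, and the four legs come from the slide; the `Pet` class and the `isAlive` and `hasOwner` properties are made up for illustration.

```python
# Toy hierarchy: each class maps to its parent classes (multiple parents allowed).
# "Pet" is a hypothetical second parent, added only to show multiple inheritance.
SUBCLASS_OF = {
    "Cat": ["Animal", "Pet"],
    "Animal": [],
    "Pet": [],
}

# Properties attached directly to each class (isAlive and hasOwner are illustrative).
PROPERTIES = {
    "Animal": {"isAlive": True},
    "Pet": {"hasOwner": True},
    "Cat": {"hasLegs": 4},
}


def inherited_properties(cls):
    """Collect properties from all ancestors, then add the class's own."""
    merged = {}
    for parent in SUBCLASS_OF.get(cls, []):
        merged.update(inherited_properties(parent))
    merged.update(PROPERTIES.get(cls, {}))
    return merged


print(inherited_properties("Cat"))
```

The subclass adds `hasLegs` to distinguish itself, while everything declared on its parents still applies to it.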

Page 10

RDF Property & Subproperty

Property
  A “verb”
  Examples: measures, hasLocation, hasArea, northOf
  Properties can have attributes: domain, range, transitive, …
Subproperty
  Inherits parent attributes
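
To make the "transitive" attribute concrete, here is a small Python sketch that infers the extra facts a transitive property such as northOf implies. The place names are illustrative assumptions, not part of the original slide.

```python
def transitive_closure(pairs):
    """Return all (a, c) pairs implied by chaining (a, b) and (b, c)."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Stated facts: (subject, object) pairs for the property northOf.
north_of = {("Canada", "USA"), ("USA", "Mexico")}

inferred = transitive_closure(north_of)
print(("Canada", "Mexico") in inferred)  # True: never stated, but implied
```

Declaring the property transitive once lets a reasoner derive such facts instead of requiring each one to be entered by hand.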

Page 11

OWL Language

Extends RDF to predefine further tags
  cardinality
  transitive relations
  inverse relations
  same as, different from
  union, intersection
  domain, range
  import (from one ontology to another, to enable sharing and reuse of the work of others)
  …

Page 12

OWL Ontology Example

<Class "WaterPollution">
  <SubClassOf "Pollution">
    <Restriction>
      <OnProperty "hasSubstance">
      <AllValuesFrom "Water">
    </Restriction>
  </SubClassOf>
</Class>
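
The example above is schematic rather than literal OWL syntax. As a hedged sketch, the same idea might be written in RDF/XML roughly as follows; the Python wrapper only parses it with the standard library to show that the document is well formed and to read the restriction back. The local `#Pollution`, `#hasSubstance`, and `#Water` references assume all terms live in the same file.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# One possible RDF/XML rendering of the WaterPollution example (an assumption,
# not the author's file): a subclass of Pollution whose hasSubstance values
# must all come from Water.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="WaterPollution">
    <rdfs:subClassOf rdf:resource="#Pollution"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="#hasSubstance"/>
        <owl:allValuesFrom rdf:resource="#Water"/>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>"""

root = ET.fromstring(DOC)
cls = root.find(f"{{{OWL}}}Class")
restriction = cls.find(f"{{{RDFS}}}subClassOf/{{{OWL}}}Restriction")
on_property = restriction.find(f"{{{OWL}}}onProperty").get(f"{{{RDF}}}resource")
all_values = restriction.find(f"{{{OWL}}}allValuesFrom").get(f"{{{RDF}}}resource")

print(cls.get(f"{{{RDF}}}ID"), on_property, all_values)
```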

Page 13

Statements about Statements

OWL allows us to make statements about statements
  Degree of belief
  Timestamps
  Provenance / Lineage
  Probability / Uncertainty
  Security issues
  Author / Source / Community
  Community dialect
  …

Example: the statement “ObservedFeature is a Corn Crop” has Probability 0.75 and has Source Landsat
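
One way to picture a statement about a statement is to wrap a triple in a record that carries its own annotations. The sketch below is a plain-Python illustration under that assumption, not an OWL or RDF API; the class names, the `believable` helper, and the threshold are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Statement:
    subject: str
    predicate: str
    obj: str


@dataclass
class AnnotatedStatement:
    # The statement being described, plus statements made about it.
    statement: Statement
    annotations: dict = field(default_factory=dict)


claim = AnnotatedStatement(
    Statement("ObservedFeature", "is a", "Corn Crop"),
    {"hasProbability": 0.75, "hasSource": "Landsat"},
)


def believable(item, threshold):
    """Accept a statement only if its stated probability meets the threshold."""
    return item.annotations.get("hasProbability", 0.0) >= threshold


print(believable(claim, 0.5))  # True
```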

Page 14

Why are Ontologies Useful? (1)

Ontologies provide a common namespace
  Documents, web pages, data, people, and other resources can be mapped/categorized to this namespace
  Anybody can create or extend the namespace

Page 15

Why are Ontologies Useful? (2)

Dictionary
  Concepts in the namespace are not just “listed” (a taxonomy), but “defined” (in terms of others)
  Concepts are defined via specializations of broader concepts, with properties that distinguish each child from the broader parent concept
    Reductionist approach of science
  Arbitrary levels of specialization are possible
    As with Library of Congress and Dewey Decimal numbering systems

Page 16

Why are Ontologies Useful? (3)

Disambiguation
  Reduces semantic mismatch
  Synonym support (multiple terms with same meaning)
    label available to indicate preferred term for each community
  Homonym support (multiple meanings of same term)
    separate namespaces (President:Bush vs Plant:Bush)
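
A small Python sketch of the disambiguation idea: concepts are keyed by (namespace, term), so a homonym such as Bush can exist in two namespaces, and a synonym table points alternative terms at a preferred one. The definitions and the "Shrub" synonym are made up for illustration.

```python
# Concepts are keyed by (namespace, term), so one term can live in several namespaces.
CONCEPTS = {
    ("President", "Bush"): "a person who held the office of president",
    ("Plant", "Bush"): "a low woody plant",
}

# Synonyms point at a preferred (namespace, term) key. "Shrub" is a made-up entry.
SYNONYMS = {"Shrub": ("Plant", "Bush")}


def resolve(term, namespace=None):
    """Return every concept key the term could refer to, optionally within one namespace."""
    if term in SYNONYMS:
        candidates = [SYNONYMS[term]]
    else:
        candidates = [key for key in CONCEPTS if key[1] == term]
    return sorted(key for key in candidates if namespace in (None, key[0]))


print(resolve("Bush"))           # ambiguous: two namespaces match
print(resolve("Bush", "Plant"))  # the namespace removes the ambiguity
print(resolve("Shrub"))          # the synonym maps to the preferred term
```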

Page 17

Why are Ontologies Useful? (4)

Machine readable
  Ontologies are generally stored in a format (XML) that is readable by both humans and computers
  Computer accessibility enables automated reasoning

Page 18

Why are Ontologies Useful? (5)

Knowledge retention
  Corporations use knowledge management to ensure institutional memory over time, as personnel come and go
  Climate disciplines can do the same!
  Facts/data can be represented and related in a consistent manner
  Common sense knowledge is captured
    Instrument characteristics

Page 19

Ontology Representation (1): Knowledge Base of Triples

Noun-Verb-Noun representation

Parent-child relations:
  Flood is a Weather Phenomenon
  GeoTIFF is a File Format
  Soil Type is a Physical Property
  Pacific Ocean is a Ocean

Or create your own relations:
  Ocean has substance Water
  Sensor measures Temperature
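
The triples above can be held in a very small knowledge base and queried by pattern. This is a minimal Python sketch, assuming a plain list of tuples rather than any particular triple-store product; the `match` helper is hypothetical.

```python
# Each fact is a (subject, predicate, object) triple taken from the slide.
TRIPLES = [
    ("Flood", "is a", "Weather Phenomenon"),
    ("GeoTIFF", "is a", "File Format"),
    ("Soil Type", "is a", "Physical Property"),
    ("Pacific Ocean", "is a", "Ocean"),
    ("Ocean", "has substance", "Water"),
    ("Sensor", "measures", "Temperature"),
]


def match(subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for s, p, o in TRIPLES
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


print(match(predicate="is a"))   # all parent-child relations
print(match(subject="Ocean"))    # everything known about Ocean
```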

Page 20

Ontology Representation (2): Visual

Page 21

Ontology Representation (3): XML, RDF, and OWL

W3C has adopted XML-based standard ontology languages
  Resource Description Framework (RDF)
  Web Ontology Language (OWL)
Languages predefine specific tags
  RDF: Class, subclass, property, subproperty, …
  OWL: Extends RDF to predefine further tags such as cardinality
Three flavors of OWL (Lite, DL, and Full)
Use of standard languages makes it easy to extend (specialize) work of others

Page 22

Global Warming Query in the Semantic Web

Find data which demonstrates global warming at high latitudes during summertime and plot warming rate.

Extract information from the use-case - encode knowledge
Translate this into a complete query for data - inference and integration of data from instruments, indices and models

“Global warming” = Trend of increasing temperature over large spatial scales
“High latitude” = |Latitude| > 60 degrees
“Summertime” = June-Aug (NH) and Jan-Mar (SH)
“Find data” = Locate datasets using catalogs, then access and read them
“Plot warming rate” = Display temperature vs time
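
The definitions above translate fairly directly into code. The Python sketch below applies the high-latitude and summertime rules exactly as stated on the slide and estimates a warming rate as a least-squares slope of yearly mean temperature against year. The records are synthetic values invented purely to exercise the functions, and the record layout and function names are assumptions.

```python
def is_high_latitude(lat):
    # "High latitude" = |Latitude| > 60 degrees
    return abs(lat) > 60


def is_summertime(lat, month):
    # "Summertime" = June-Aug (NH) and Jan-Mar (SH), as defined on the slide
    return month in ((6, 7, 8) if lat >= 0 else (1, 2, 3))


def warming_rate(records):
    """Least-squares slope (degrees per year) of yearly mean temperature.

    records: iterable of (latitude, year, month, temperature) tuples.
    """
    by_year = {}
    for lat, year, month, temp in records:
        if is_high_latitude(lat) and is_summertime(lat, month):
            by_year.setdefault(year, []).append(temp)
    years = sorted(by_year)
    means = [sum(by_year[y]) / len(by_year[y]) for y in years]
    n = len(years)
    x_bar = sum(years) / n
    y_bar = sum(means) / n
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, means))
    denominator = sum((x - x_bar) ** 2 for x in years)
    return numerator / denominator


# Synthetic records, not real observations.
records = [
    (70, 2000, 7, 5.0), (70, 2001, 7, 5.5), (70, 2002, 7, 6.0),
    (-75, 2000, 2, -3.0), (-75, 2001, 2, -2.5), (-75, 2002, 2, -2.0),
    (70, 2001, 12, -20.0),  # high latitude but not summertime: ignored
    (10, 2000, 7, 28.0),    # summertime but low latitude: ignored
]

rate = warming_rate(records)
print(rate)
```

Plotting temperature against time would then be a matter of passing the yearly means to any charting tool.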

Page 23

Semantic Web for Earth and Environmental Terminology (SWEET)

Concept space written in OWL
Initial focus to assist search for data resources
  Funded by NASA
Later focus to serve as community standard (upper-level Earth system science ontology)
Enables scalable classification of Earth system science and associated data concepts
Specialists can further refine SWEET concepts
SWEET 2.2 has 6600 concepts in 200 modular ontologies
http://sweet.jpl.nasa.gov

Page 24

SWEET Top-Level View

Page 25

CF vs SWEET Representation

CF (traditional single-attribute parameter name):
  tendency_of_mole_concentration_of_dissolved_inorganic_phosphorus_in_sea_water_due_to_biological_processes

SWEET (multi-attribute parameter name):
  Quantity = mole_concentration
  Transformation = tendency
  State = dissolved, inorganic
  Substance = phosphorus
  Medium = sea_water
  Process = biological_processes
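
To illustrate the difference, the sketch below stores the parameter as separate attributes and rebuilds the single CF-style string from them. The `Parameter` class and the name template are assumptions fitted to this one example; they are not SWEET's actual schema or a general grammar for CF standard names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Parameter:
    quantity: str
    transformation: str
    state: tuple
    substance: str
    medium: str
    process: str

    def cf_style_name(self):
        # Template fitted to this one example only; not a general CF grammar.
        return (
            f"{self.transformation}_of_{self.quantity}_of_"
            f"{'_'.join(self.state)}_{self.substance}_in_{self.medium}"
            f"_due_to_{self.process}"
        )


param = Parameter(
    quantity="mole_concentration",
    transformation="tendency",
    state=("dissolved", "inorganic"),
    substance="phosphorus",
    medium="sea_water",
    process="biological_processes",
)

print(param.cf_style_name())

# With separate attributes, a search can target one facet directly.
in_sea_water = [p for p in [param] if p.medium == "sea_water"]
print(len(in_sea_water))
```

Keeping the facets separate means a query on medium or substance does not depend on parsing one long string.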

Page 26

SWEET Data Ontology

Dataset characteristics: Format, data model, dimensions, …
Provenance: Source, processing history, …
Parameters: Scale factors, offsets, …
Data services: Subsetting, reprojection, …
Quality measures
Special values: Missing, land, sea, ice, ...
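
As a rough illustration of why scale factors, offsets, and special values need to be described alongside the data, the sketch below unpacks stored integers into physical values. It assumes the common linear packing convention (value = raw * scale_factor + add_offset); the special-value codes, the numbers, and the `unpack` function are hypothetical.

```python
# Hypothetical special-value codes; real datasets define their own.
SPECIAL_VALUES = {-9999: "missing", -9998: "land"}


def unpack(raw_values, scale_factor, add_offset):
    """Convert stored integers to physical values, labelling special codes."""
    unpacked = []
    for raw in raw_values:
        if raw in SPECIAL_VALUES:
            unpacked.append(SPECIAL_VALUES[raw])
        else:
            unpacked.append(raw * scale_factor + add_offset)
    return unpacked


print(unpack([100, -9999, 250, -9998], scale_factor=0.5, add_offset=10.0))
# [60.0, 'missing', 135.0, 'land']
```

Without this metadata a reader could mistake -9999 for a measurement or read 100 as the physical value.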

Page 27

Best Practices

Keep ontologies small, modular
Use higher level ontologies where possible
Identify hierarchy of concept spaces
  Try to keep dependencies unidirectional
Gain community buy-in
  Involve respected leaders
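
One way to check that dependencies between modular ontologies stay unidirectional is to treat imports as a directed graph and look for cycles. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); the module names are hypothetical and are not taken from SWEET.

```python
from graphlib import CycleError, TopologicalSorter


def import_order(imports):
    """Return a load order in which every ontology follows the ones it imports.

    imports: dict mapping each ontology to the set of ontologies it imports.
    Raises ValueError if the imports form a cycle.
    """
    try:
        return list(TopologicalSorter(imports).static_order())
    except CycleError as err:
        raise ValueError(f"circular imports: {err.args[1]}") from err


# Hypothetical modules: ocean imports quantity, which imports units.
MODULES = {"ocean": {"quantity"}, "quantity": {"units"}, "units": set()}
print(import_order(MODULES))  # ['units', 'quantity', 'ocean']
```

If two modules import each other, `import_order` raises an error instead of returning an order, which flags the dependency as bidirectional.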

Page 28

Ontology Development Tools: CMAP

Free, downloadable tool for knowledge representation and ontology development
Visual language with input/export to OWL
Supports subset of OWL language
http://cmap.ihmc.us/coe

Page 29

Resources

ESIP Semantic Web Cluster
  Monthly telecons
  Tutorials
  Ontology development
    Datatypes
    Data services
SWEET
  http://sweet.jpl.nasa.gov
Rob Raskin
  Raskin@jpl.nasa.gov

