
Knowledge Evolution in Distributed Geoscience Datasets and the Role of Semantic Technologies

Date post: 14-Jul-2015
Upload: xiaogang-marshall-ma

Xiaogang (Marshall) Ma
Tetherless World Constellation
Rensselaer Polytechnic Institute

@MarshallXMa
x.marshall.ma
rpi.edu/~max7
0000-0002-9110-7369
MarshallXMa

William Smith's 1815 geologic map of England and Wales with part of Scotland

William Smith (1769-1839)
(Image source: Geological Society of London)

Evolution of the Geological Map of British Islands / UK

1874, 1906, 1939, 1969, 2007, 2013
(Image source: British Geological Survey)

Definition of “Quaternary” in several versions of the International Stratigraphic Chart

2004, 2005, 2008, 2009

Distributed datasets: Regional geologic time scales

(Haq, 2007)

Distributed datasets: Mismatches of geological units across political boundaries

Italy/France near Cuneo/Colmar: Cambrian / Carboniferous (Asch et al., 2012)
Wyoming/Colorado: Felsic and hornblendic gneisses / Granitic rocks (Ma et al., 2014)
(Base map courtesy: OneGeology-Europe and USGS)

• Data and models, vocabularies, and ontologies
– Have we ever had model-independent datasets?
• Ontology dynamics and a data life cycle

CONCEPT
*Initial concepts
*Questions and answers
*Grant info

COLLECTION
*Questionnaire
*Coded instrument
*CAI metadata
*Paradata

PROCESSING
*Data specs
*Recodes
*Summary descriptive info

DISTRIBUTION
*Terms of use
*Citation
*Packaging info

DISCOVERY
*Catalog record
*Indexing
*Related publications

ANALYSIS
*Replication code
*Publications

ARCHIVING
*Preservation metadata
*Confidentiality
*Additional processing

REPURPOSING
*Post-hoc harmonization
*Data transformations

Diagram reproduced from (Spencer, 2012)

Ontology dynamics

• Ontology Mapping
• Ontology Morphism
• Ontology Matching
• Ontology Articulation
• Ontology Translation
• Ontology Evolution
• Ontology Debugging
• Ontology Versioning
• Ontology Integration
• Ontology Merging
(Flouris et al., 2008)

Potential challenges

• Reworking of the extant data in a data center
– e.g. caused by ontology/vocabulary versioning
• Semantic mismatch among data sources
– e.g. heterogeneity in ontologies of the same topic
• Different understandings of the same dataset between data providers and data users
– e.g. a data provider understands Quaternary as 1.806 Ma-present, and a data user understands it as 2.588 Ma-present
• Error propagation in cross-discipline data re-use
– e.g. heterogeneous datasets may cause misconceptions in subsequent work
(Ma et al., 2014)
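
The provider/user mismatch over "Quaternary" can be made concrete with a small sketch. Only the two lower boundaries (1.806 Ma and 2.588 Ma) come from the slide; the function names and the sample ages are hypothetical and purely illustrative.

```python
# Lower boundary of the Quaternary in Ma (million years ago), as quoted on the slide.
QUATERNARY_BASE_MA = {
    "provider": 1.806,
    "user": 2.588,
}


def is_quaternary(age_ma: float, base_ma: float) -> bool:
    """Return True if an age in Ma falls between the present (0) and base_ma."""
    return 0.0 <= age_ma <= base_ma


def disputed(age_ma: float) -> bool:
    """True when the provider and user definitions classify the age differently."""
    answers = {is_quaternary(age_ma, base) for base in QUATERNARY_BASE_MA.values()}
    return len(answers) > 1


# Hypothetical sample ages in Ma (not taken from any real dataset).
samples = {"s1": 0.5, "s2": 2.0, "s3": 3.1}
disputed_ids = [sid for sid, age in samples.items() if disputed(age)]
print(disputed_ids)  # ['s2']
```

Any record dated between the two boundaries is labeled differently by the two parties, which is the kind of silent disagreement the slide warns about.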

A few recent works of interest

OneGeology-Europe

• 20 European nations providing national geologic maps at scale ~1:1M
• Harmonized geological terms and map legends
• Multilingual labels in 18 languages
• Central portal for data browsing/query among distributed data sources
A contribution to INSPIRE
http://www.onegeology-europe.org

Federated query: Result of geologic units with age ‘Cenozoic - from 66 million years to today’
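
As a rough illustration of what such a federated query does, the sketch below sends the same age filter to several sources and merges the results. The Cenozoic interval (66 Ma to today) is from the slide; the source names, unit names, age values, and the `within` rule are assumptions made for the example and do not describe the portal's actual implementation.

```python
from typing import NamedTuple


class Unit(NamedTuple):
    source: str
    name: str
    older_ma: float    # start of the unit's age range, in Ma
    younger_ma: float  # end of the unit's age range, in Ma (0 = today)


CENOZOIC = (66.0, 0.0)  # (older bound, younger bound) in Ma, from the slide

# Hypothetical records held by two independent data sources.
SOURCES = {
    "source_a": [
        Unit("source_a", "unit A1", 23.0, 5.3),
        Unit("source_a", "unit A2", 145.0, 100.5),
    ],
    "source_b": [
        Unit("source_b", "unit B1", 2.6, 0.0),
        Unit("source_b", "unit B2", 70.0, 60.0),
    ],
}


def within(unit: Unit, interval: tuple) -> bool:
    """True if the unit's whole age range lies inside the interval."""
    older, younger = interval
    return unit.older_ma <= older and unit.younger_ma >= younger


def federated_query(sources: dict, interval: tuple) -> list:
    """Apply the same age filter to every source and merge the hits."""
    results = []
    for records in sources.values():
        results.extend(u for u in records if within(u, interval))
    return results


hits = federated_query(SOURCES, CENOZOIC)
print([u.name for u in hits])  # ['unit A1', 'unit B1']
```

Here a unit must fall entirely inside the interval; a unit straddling the 66 Ma boundary (like the hypothetical "unit B2") is excluded. An overlap test would be an equally defensible choice, and the filter only works at all if every source expresses ages against a shared time scale.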

Recently finished CGI vocabularies

Earth Resource Form
Environmental Impact Value
Exploration Activity Type
Exploration Result
UNFC Value
Earth Resource Expression
Earth Resource Shape
Enduse Potential
Mineral Occurrence Type
Mining Activity Type
Processing Activity Type
Mining Waste Type Value
Commodity Code
Mineral Deposit Group
Mineral Deposit Type
Product Value

• Construct a collection of vocabularies for populating information interchange documents and enabling interoperability
• Provide labels for concepts, scoped to various communities defined by language, science domain, or application domain

CGI Geoscience Terminology Workgroup
http://cgi-iugs.org/tech_collaboration/geoscience_terminology_working_group.html
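
One way to picture "labels for concepts, scoped by language" is a concept with a stable identifier and one label per language. This is a minimal sketch, not the CGI vocabularies' actual encoding; the `Concept` class and the `ex:granite` identifier are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    identifier: str                             # stable, language-neutral ID
    labels: dict = field(default_factory=dict)  # language code -> label

    def label(self, lang: str, fallback: str = "en") -> str:
        """Return the label for lang, else the fallback label, else the ID."""
        return self.labels.get(lang, self.labels.get(fallback, self.identifier))


# Hypothetical concept; the identifier is illustrative only.
granite = Concept("ex:granite", {"en": "granite", "de": "Granit", "it": "granito"})

print(granite.label("de"))  # Granit
print(granite.label("fr"))  # granite (falls back to English)
```

Because data records refer to the identifier rather than to any one label, each community can read the same record in its own language.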

USGS Online Geologic Maps

• Standardized vocabulary with detailed annotation
• Forward and backward queries between spatial data and attribute data
• Links to further data sources, e.g. aeromagnetic survey, mineral resources data, soils, geochemical samples, etc.
http://mrdata.usgs.gov/geology/state/map.html

Records of a point in the San Francisco area

Recommendations

• Communities of practice on ontology and vocabulary
– Bottom-up, self-organized, and loose top-down control
• Formalize the ‘Concept’ step in a data life cycle
– Top-down, and adopt outputs from the bottom-up approach
• Make it a virtuous circle between the bottom-up and top-down approaches

Thanks for listening.

@MarshallXMa

