Date post: 14-May-2018
A SKOS-based thesaurus of the Geological Survey of Austria exposed through an Open Linked Data Web-Service

Marcus Ebner*, Martin Schiegl, Werner Stöckl, Ralf Schuster & Christoph Janda

* Corresponding author: [email protected]
Geological Survey of Austria, Neulinggasse 38, 1030 Vienna, Austria.

Summary & Conclusions

The user will be able to access multilingual thematic thesauri for lithology, geologic time scale, geologic units and tectonic units via an open service interface that provides:

(1) an RDF/HTML representation of the concepts/resources, accessible via their http URIs
(2) a wiki page for simple browsing and navigation within a thesaurus
(3) a full-fledged SPARQL endpoint to query the thesaurus

The service conforms to the W3C standards SKOS and RDF and to the linked open data principles, and is offered under a simple licence scheme (Creative Commons "share alike").

Introduction

The main motivation for developing thematic thesauri is the INSPIRE Directive 2007/2/EC, which demands that spatial datasets be provided in a harmonized and technically interoperable way. The foundation of such a semantic harmonization is a controlled vocabulary that covers the main thematic aspects of the datasets. The controlled vocabularies will be provided as specialized thesauri in the information-science sense (i.e. lightweight ontologies for information retrieval). To encode the thesauri we build upon the W3C standard SKOS (Simple Knowledge Organization System), a thesaurus specification for the Semantic Web, which is itself based on the Resource Description Framework (RDF) and RDF Schema. To develop the thesauri we use the commercial software PoolParty, a SKOS editor built specifically to manage multilingual thesaurus systems. Preferred labels and definitions of the concepts will be provided in both German and English. In addition to the SKOS and RDF(S) encoding standards, PoolParty uses Dublin Core (DC) for resource description.

http://www.w3.org/2004/02/skos/

http://poolparty.punkt.at/

Background - Semantic Web

• If a statement can be made about something, it is a resource in the Semantic Web.
• Every resource has a uniform and persistent ID, its Uniform Resource Identifier (URI); predicates and classes also have URIs.
• Schemes and ontologies (e.g. SKOS) can be used for inferencing about resources (i.e. general statements can be made about resources).
• For semantic interoperability, data in the Semantic Web are encoded in the Resource Description Framework (RDF), which is commonly serialized as XML (RDF/XML).
• Information about resources in the Semantic Web is distributed and can be connected using the linked data principles.
• Facts are expressed as directed graphs built from so-called triples; every triple consists of a subject, a predicate and an object.

http://www.w3.org/RDF/

http://linkeddata.org/

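[Figure: two example triples (subject, predicate, object) sharing the subject "conceptID": predicate "hasLabel" with the literal object "Java", and predicate "isClass" with the object "Island".]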
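The triple model described above can be illustrated with a few lines of Python. This is only a minimal sketch: the `Triple` type, the `GRAPH` list and the `match` helper are hypothetical names introduced for illustration, and "conceptID" stands in for a real concept URI as in the figure.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """One fact: a subject linked to an object by a predicate."""
    subject: str
    predicate: str
    obj: str


# Illustrative data modelled on the figure; "conceptID" is a placeholder
# for a concept URI, not a real identifier.
GRAPH: List[Triple] = [
    Triple("conceptID", "hasLabel", "Java"),
    Triple("conceptID", "isClass", "Island"),
]


def match(graph: List[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return all triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


if __name__ == "__main__":
    # All statements made about the concept
    for triple in match(GRAPH, subject="conceptID"):
        print(triple)
```

Matching a pattern with wildcards against a set of triples is, in very reduced form, what a SPARQL query does against an RDF store.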

Software Backend

Web-Service Frontend

- a simplified representation of the data -

A user-friendly wiki version of your project provides an alternative way to browse (and edit) thesauri (i.e. the concept schemes in a project). You can find the wiki frontend in the wiki tab of the Frontend. One can use the search function with auto-complete support and switch between the different languages of the project. The wiki version shows selected triples of individual concepts (all triples can be accessed via the Linked-Data Frontend).

SPARQL-Endpoint - a programmatic access to the data -

The SPARQL tab offers a full-fledged SPARQL endpoint to query the thesauri in a project. This is particularly useful for distributed queries, which can incorporate data from various sources. SPARQL is the query language for RDF.

One can select the format for the results of a query, add namespace definitions to a query by selecting one of the checkboxes in the Add Namespace section, or start with one of the sample queries provided. The following result formats are available:

• Table (default)
• xml/text
• xml/application
• json/text
• json/application

http://www.w3.org/TR/rdf-sparql-query/
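Programmatic access to such an endpoint could look roughly like the sketch below, which only builds the request URL and does not contact any server. The endpoint address is a placeholder (the poster does not give the real one), and the `format` parameter name is an assumption: the SPARQL protocol defines the `query` parameter for GET requests, while result-format selection varies between implementations.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint address used for illustration only.
ENDPOINT = "http://example.org/sparql"

QUERY = """PREFIX skos:<http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?Concept ?prefLabel
WHERE { ?Concept skos:prefLabel ?prefLabel . }
LIMIT 10"""


def build_request_url(endpoint: str, query: str,
                      result_format: Optional[str] = None) -> str:
    """Encode a SPARQL query as a GET request URL."""
    params = {"query": query}  # "query" is the standard protocol parameter
    if result_format is not None:
        # Assumed, non-standard parameter name; check the endpoint's docs.
        params["format"] = result_format
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    url = build_request_url(ENDPOINT, QUERY, "json/application")
    print(url)
    # Decoding the URL again recovers the original query text
    print(parse_qs(urlsplit(url).query)["query"][0] == QUERY)
```

The resulting URL could then be fetched with any HTTP client; where an endpoint supports it, an `Accept` header is a common alternative to a format parameter.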

Linked-Data Frontend - HTML & RDF representation of the data -

The Linked-Data Frontend can be accessed via the http URI of a concept. The server answers the client's GET request with a 303 response and redirects to either the HTML or the RDF representation, depending on the formats the client accepts. The html tab of the Frontend provides a human-readable HTML version of a concept scheme or concept, including RDFa; the rdf/xml tab provides a machine-processable RDF version of the concept scheme or concept. In contrast to the Wiki version, all triples for a selected concept are given in this representation (irrespective of the selected language).
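The redirect behaviour described above can be sketched as a small decision function. This is an illustration under stated assumptions, not the service's actual implementation: the `.rdf` and `.html` suffixes of the redirect targets are hypothetical, and quality values (`q=`) in the Accept header are ignored for simplicity.

```python
from typing import Tuple


def negotiate(concept_uri: str, accept_header: str) -> Tuple[int, str]:
    """Return (status code, redirect location) for a GET on a concept URI."""
    # Media types in the order the client listed them; parameters such as
    # ";q=0.9" are stripped and not evaluated in this simplified sketch.
    accepted = [part.split(";")[0].strip().lower()
                for part in accept_header.split(",")]
    for media_type in accepted:
        if media_type == "application/rdf+xml":
            # Hypothetical location of the machine-processable version
            return 303, concept_uri + ".rdf"
        if media_type in ("text/html", "application/xhtml+xml"):
            # Hypothetical location of the human-readable version
            return 303, concept_uri + ".html"
    # Fall back to HTML when nothing more specific was requested
    return 303, concept_uri + ".html"


if __name__ == "__main__":
    uri = "http://resource.geolba.ac.at/lithology/48"
    print(negotiate(uri, "application/rdf+xml"))
    print(negotiate(uri, "text/html,application/xhtml+xml;q=0.9"))
```

A browser, which sends `text/html` first, would thus end up on the HTML page, while an RDF client asking for `application/rdf+xml` would receive the RDF document for the same concept URI.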

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF>
  <rdf:Description rdf:about="http://resource.geolba.ac.at/lithology/48">
    <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
    <skos:definition xml:lang="en">A coarse-grained plutonic rock composed essentially of calcic plagioclase, pyroxene and iron oxides. If olivine is an essential constituent it is an olivine gabbro. Defined modally in QAPF-field 10. (Le Maitre et al. 2005)</skos:definition>
    <skos:definition xml:lang="de">Grobkörniges plutonisches Gestein welches sich aus Kalzium-Plagioklas, Pyroxen und Eisenoxid zusammensetzt. Man spricht von Olivin Gabbro bei erheblicher Anwesenheit von Olivin. Modal definiert in QAPF-Feld 10. (Le Maitre et al. 2005)</skos:definition>
    <skos:prefLabel xml:lang="de">Gabbro</skos:prefLabel>
    <skos:prefLabel xml:lang="en">Gabbro</skos:prefLabel>
    <skos:broader rdf:resource="http://resource.geolba.ac.at/lithology/21"/>
    <skos:related rdf:resource="http://resource.geolba.ac.at/lithology/96"/>
    <skos:related rdf:resource="http://resource.geolba.ac.at/lithology/142"/>
    <skos:exactMatch rdf:resource="http://dbpedia.org/resource/Gabbro"/>
    <skos:exactMatch rdf:resource="http://rdf.freebase.com/ns/m/03fsc"/>
    <dc:contributor>ebnmar</dc:contributor>
    <dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-03-17T15:38:53Z</dcterms:modified>
    <dcterms:hasVersion rdf:datatype="http://www.w3.org/2001/XMLSchema#int">5</dcterms:hasVersion>
    <dc:creator>ebnmar</dc:creator>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-03-15T15:22:10Z</dc:date>
  </rdf:Description>
</rdf:RDF>

[Figure: HTML representation and RDF representation of a concept]

www.geologie.ac.at

Triples of linked data from external sources are not given in the RDF representation!
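Since the RDF representation is delivered as RDF/XML, a client can read it with an ordinary XML parser. The sketch below uses Python's standard library on a shortened copy of the Gabbro record. The `xmlns` declarations are added here as an assumption so that the snippet parses (they are the standard RDF and SKOS namespace URIs), and `pref_labels` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional, Tuple

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"
XML_NS = "http://www.w3.org/XML/1998/namespace"  # namespace of xml:lang

# Shortened record; namespace declarations added for this sketch.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:skos="{SKOS_NS}">
  <rdf:Description rdf:about="http://resource.geolba.ac.at/lithology/48">
    <skos:prefLabel xml:lang="de">Gabbro</skos:prefLabel>
    <skos:prefLabel xml:lang="en">Gabbro</skos:prefLabel>
    <skos:broader rdf:resource="http://resource.geolba.ac.at/lithology/21"/>
  </rdf:Description>
</rdf:RDF>"""


def pref_labels(xml_text: str) -> Dict[Tuple[Optional[str], Optional[str]], Optional[str]]:
    """Map (concept URI, language) to the preferred label found in the XML."""
    root = ET.fromstring(xml_text)
    labels = {}
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        uri = description.get(f"{{{RDF_NS}}}about")
        for label in description.findall(f"{{{SKOS_NS}}}prefLabel"):
            language = label.get(f"{{{XML_NS}}}lang")
            labels[(uri, language)] = label.text
    return labels


if __name__ == "__main__":
    for (uri, language), text in pref_labels(DOC).items():
        print(uri, language, text)
```

A plain XML parser only handles this flat `rdf:Description` layout; for arbitrary RDF/XML a dedicated RDF library would be the more robust choice.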

[Figure callouts: hierarchy tree; search bar; SKOS relations; SKOS notes; SKOS labels; alphabetical listing; query results formatted as an HTML table]

Query: display all concept IDs and preferred labels whose preferred label starts with a B:

PREFIX skos:<http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?Concept ?prefLabel
WHERE {
  ?Concept ?x skos:Concept .
  {
    ?Concept skos:prefLabel ?prefLabel .
    FILTER (regex(str(?prefLabel), '^b.*', 'i'))
  }
}
ORDER BY ?prefLabel LIMIT 50 OFFSET 0
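If one of the JSON result formats is selected, the answer to a query like the one above can be turned into simple rows as sketched below. This assumes the endpoint returns the W3C SPARQL Query Results JSON Format; the sample response is made up for illustration (the concept URI and the label are not taken from the thesaurus), and `rows` is a hypothetical helper name.

```python
import json
from typing import Dict, List

# Illustrative response in the SPARQL JSON results layout; not real data.
SAMPLE = """{
  "head": {"vars": ["Concept", "prefLabel"]},
  "results": {"bindings": [
    {"Concept": {"type": "uri", "value": "http://example.org/concept/1"},
     "prefLabel": {"type": "literal", "xml:lang": "en", "value": "Basalt"}}
  ]}
}"""


def rows(result_json: str) -> List[Dict[str, str]]:
    """Flatten the bindings into one dict per result row."""
    data = json.loads(result_json)
    variables = data["head"]["vars"]
    return [
        # A variable may be unbound in a row, so check before reading it
        {var: binding[var]["value"] for var in variables if var in binding}
        for binding in data["results"]["bindings"]
    ]


if __name__ == "__main__":
    for row in rows(SAMPLE):
        print(row["Concept"], row["prefLabel"])
```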

[Figure callouts: search bar with auto-complete; the http URI gives access to the Linked-Data Frontend; language switch; micro-thesauri (SKOS concept schemes) of the selected project; SKOS labels & notes; SPARQL tab used to post queries]