Ksim keizer 2010-10-19

Post on 27-Jan-2015


Presentation on Linked Data and Thesauri given to the KSIM meeting of the UN


The role of Thesauri and Standard Vocabularies in linking data - AGROVOC - UNBIS - EUROVOC

A proposal for collaboration between agencies

Dr. Johannes Keizer

FAO of the United Nations

Office of Knowledge Exchange, Research and Extension

Knowledge and Capacity for Development

dr johannes keizer - FAO of the United Nations - knowledge and capacity for development

UN, KSIM meeting, New York, 2010-10-19

The Development of the Internet


Open World Mindset

“Closed” (“normal”) IT environments:

• Data sources carefully controlled.

• Data formats “custom-defined” for an application.

Linked data based on an “open world mindset”:

• Integrating data from the open Web

• Systems designed to incorporate new information incrementally

• By design, tolerance of incomplete information

The Linked Data Universe: http://www.linkeddata.org (July 2009)

The Linked Data Universe: http://www.linkeddata.org (July 2010)


Example: BBC Wildlife Finder

Humboldt Squid page, pulled together from a diversity of Linked Data sources:

• Animal Diversity Web: nocturnal way of life

• BBC TV Documentary

• BBC News item

• Wikipedia


RDF: a grammar for the language of data

Resource A -- relatedTo --> Resource B

Resource A -- describedBy --> "Some text"

1. Describe resources using interrelated “statements” (“triples”).

2. Use URIs (unique, globally managed identifiers) as the “words” of statements.
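The two rules above can be sketched in a few lines of Python. This is only an illustration: the http://example.org/ URIs are made-up placeholders, and a plain set of tuples stands in for a real RDF store.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# All example.org URIs below are hypothetical placeholders.
EX = "http://example.org/"

triples = {
    (EX + "ResourceA", EX + "relatedTo", EX + "ResourceB"),  # object is a URI
    (EX + "ResourceA", EX + "describedBy", "Some text"),      # object is a literal
}

def objects(graph, subject, predicate):
    """Return all objects stated for a given subject and predicate."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}

print(objects(triples, EX + "ResourceA", EX + "relatedTo"))
```

Because every "word" is a URI, a second dataset that mentions the same URI is automatically talking about the same resource.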


RDF as a common format for merging data

• http://www.w3.org/2007/Talks/0221-Bangalore-IH/
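Why a common triple format makes merging simple can be sketched as follows. The two datasets and all their URIs and values are hypothetical; the point is only that when both sources use the same URI for the same thing, merging reduces to a set union.

```python
# Two independently published datasets, each a set of triples.
# All URIs and values are hypothetical placeholders.
BOOK = "http://example.org/book/123"
TITLE = "http://example.org/title"
AUTHOR = "http://example.org/author"
TRANSLATOR = "http://example.org/translator"

dataset_a = {
    (BOOK, TITLE, "An Example Title"),
    (BOOK, AUTHOR, "http://example.org/person/1"),
}
dataset_b = {
    (BOOK, TRANSLATOR, "http://example.org/person/2"),
    (BOOK, TITLE, "An Example Title"),  # same statement as in dataset A
}

# Merging is a set union: identical triples collapse, and all
# statements about the shared URI end up in one graph.
merged = dataset_a | dataset_b

def describe(graph, subject):
    """Collect predicate -> set of objects for one subject."""
    out = {}
    for s, p, o in graph:
        if s == subject:
            out.setdefault(p, set()).add(o)
    return out

print(describe(merged, BOOK))
```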


Role of thesauri/concept schemes

Born as tools to ensure consistency in the indexing of library collections

Thesauri were based on “terms”, but those terms already represented concepts implicitly

Hierarchical and associative relationships represented generic ontological domain knowledge

Candidate building blocks for the semantic web


...from thesaurus to ontologies...


AGROVOC today

Around 30,000 concepts

600,000 labels in around 20 languages

A one-stop shop for terminological knowledge related to agriculture in general

A knowledge base of related concepts organized in ontological relationships (hierarchical, associative, equivalence)

A concept/term/string-based system

Concepts may be organized in multiple categories.

Semantic Relationships

Concept to Concept

isA (hierarchy), isPestOf, hasPest

Concept to Term

has_lexicalization (links concepts to their lexical realizations)

Term to Term

isSynonymOf, isTranslationOf, hasAcronym, hasAbbreviation

Term to String

hasSpellingVariant, hasSingular
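A minimal Python sketch of the concept and term levels listed above. The relationship names come from the slide; the class layout, the tuple representation and the direction chosen for isTranslationOf are assumptions made for illustration. The string level (hasSpellingVariant, hasSingular) would be modelled the same way and is left out for brevity.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Term:
    """A lexical realization of a concept in one language."""
    text: str
    lang: str

@dataclass
class Concept:
    uri: str
    lexicalizations: list = field(default_factory=list)  # has_lexicalization

# The maize concept URI and its labels appear elsewhere in the presentation.
maize = Concept("http://aims.fao.org/aos/agrovoc/c_12332")
t_maize, t_corn, t_mais = Term("maize", "en"), Term("corn", "en"), Term("maïs", "fr")
maize.lexicalizations += [t_maize, t_corn, t_mais]

# Term-to-term relationships, stored as (term, relation, term) tuples.
term_relations = {
    (t_corn, "isSynonymOf", t_maize),
    (t_mais, "isTranslationOf", t_maize),
}

def labels(concept, lang):
    """All lexicalizations of a concept in the given language."""
    return [t.text for t in concept.lexicalizations if t.lang == lang]

print(labels(maize, "en"))
```

Keeping terms as objects of their own, rather than as plain strings on the concept, is what allows relationships such as synonymy to be stated between them.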


AGROVOC conceptual model in SKOS-XL

[Diagram: concepts (node identifiers shown: 12332, 6211, 8171, 1474) are each of rdf:type SKOS Concept and are linked to one another by skos:broader. They belong to the AGROVOC ConceptScheme through skos:inScheme and skos:topConceptOf; further schemes in FAO can be attached through skos:inScheme in the same way. Each concept points through skosxl:prefLabel and skosxl:altLabel to SKOS Label resources (:foo, :bar), which carry a skosxl:literalForm such as “maize”, “corn” or “maïs” (fr). Labels are related to each other by has_synonym and has_translation.]
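The key idea of the SKOS-XL model, that a label is a resource of its own rather than a plain string, can be sketched in Python as below. The concept and scheme URIs appear in the presentation; the label URIs under http://example.org/ are hypothetical placeholders, and the helper names are invented for this sketch.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
SKOSXL = "http://www.w3.org/2008/05/skos-xl#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

CONCEPT = "http://aims.fao.org/aos/agrovoc/c_12332"
SCHEME = "http://aims.fao.org/aos/agrovoc/agrovocScheme"

def xl_label(graph, concept, prop, label_uri, text, lang):
    """Attach a SKOS-XL label: concept -> label resource -> literal form."""
    graph.add((label_uri, RDF_TYPE, SKOSXL + "Label"))
    graph.add((label_uri, SKOSXL + "literalForm", (text, lang)))
    graph.add((concept, SKOSXL + prop, label_uri))

graph = {
    (CONCEPT, RDF_TYPE, SKOS + "Concept"),
    (CONCEPT, SKOS + "inScheme", SCHEME),
}
# Label URIs are hypothetical placeholders, not real AGROVOC identifiers.
xl_label(graph, CONCEPT, "prefLabel", "http://example.org/label/maize-en", "maize", "en")
xl_label(graph, CONCEPT, "altLabel", "http://example.org/label/corn-en", "corn", "en")
xl_label(graph, CONCEPT, "prefLabel", "http://example.org/label/mais-fr", "maïs", "fr")

def literal_forms(graph, concept, prop):
    """Follow concept -> label resource -> literalForm for one label property."""
    label_uris = {o for s, p, o in graph if s == concept and p == SKOSXL + prop}
    return sorted(o for s, p, o in graph
                  if s in label_uris and p == SKOSXL + "literalForm")

print(literal_forms(graph, CONCEPT, "prefLabel"))
```

Because each label has its own URI, statements such as has_synonym or has_translation can be attached to the label itself.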


http://www.w3.org/2004/02/skos/

SKOS-XL output

<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/agrovocScheme">
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#ConceptScheme"/>
</rdf:Description>
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/c_330829">
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
  <skos:inScheme rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
  <skos:topConceptOf rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
</rdf:Description>
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/xl_en_1278479064610">
  <literalForm xmlns="http://www.w3.org/2008/05/skos-xl#" xml:lang="en">subjects</literalForm>
  <rdf:type rdf:resource="http://www.w3.org/2008/05/skos-xl#Label"/>
</rdf:Description>

URI of AGROVOC concept
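As a rough illustration of how such output can be consumed, the sketch below flattens the fragment into triples using only Python's standard library. The enclosing rdf:RDF element and its namespace declarations are an assumption, since the slide shows only the inner rdf:Description elements; a dedicated RDF parser would be the more robust choice in practice.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
SKOSXL = "http://www.w3.org/2008/05/skos-xl#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
AGROVOC = "http://aims.fao.org/aos/agrovoc/"

# Assumption: the rdf:RDF wrapper and xmlns declarations are added here
# so that the fragment from the slide parses as a complete document.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
  <rdf:Description rdf:about="{AGROVOC}agrovocScheme">
    <rdf:type rdf:resource="{SKOS}ConceptScheme"/>
  </rdf:Description>
  <rdf:Description rdf:about="{AGROVOC}c_330829">
    <rdf:type rdf:resource="{SKOS}Concept"/>
    <skos:inScheme rdf:resource="{AGROVOC}agrovocScheme"/>
    <skos:topConceptOf rdf:resource="{AGROVOC}agrovocScheme"/>
  </rdf:Description>
  <rdf:Description rdf:about="{AGROVOC}xl_en_1278479064610">
    <literalForm xmlns="{SKOSXL}" xml:lang="en">subjects</literalForm>
    <rdf:type rdf:resource="{SKOSXL}Label"/>
  </rdf:Description>
</rdf:RDF>"""

def to_triples(xml_text):
    """Flatten rdf:Description elements into (subject, predicate, object) triples."""
    triples = []
    for desc in ET.fromstring(xml_text).findall(f"{{{RDF}}}Description"):
        subject = desc.get(f"{{{RDF}}}about")
        for child in desc:
            # ElementTree tags look like "{namespace}local"; join them into one URI.
            predicate = child.tag.replace("{", "").replace("}", "")
            resource = child.get(f"{{{RDF}}}resource")
            # A child without rdf:resource is treated as a (text, language) literal.
            obj = resource if resource is not None else (child.text, child.get(XML_LANG))
            triples.append((subject, predicate, obj))
    return triples

for triple in to_triples(DOC):
    print(triple)
```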


The concept scheme workbench


Linking vocabularies

AGROVOC | EUROVOC | UNBIS | Relationship

http://aims.fao.org/aos/agrovoc/c_207 | http://eurovoc.europa.eu/219055 | agroforestry | skos:exactMatch / owl:sameAs

http://aims.fao.org/aos/agrovoc/c_4826 | http://eurovoc.europa.eu/220018 | MILK | skos:exactMatch / owl:sameAs

http://aims.fao.org/aos/agrovoc/c_12332 | http://eurovoc.europa.eu/219871 | MAIZE | skos:exactMatch / owl:sameAs
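The mappings on this slide can be held as plain data and queried in either direction. The sketch below is an assumption about representation only: the URIs are those from the slide, and each link is emitted both ways because skos:exactMatch is a symmetric property.

```python
EXACT_MATCH_URI = "http://www.w3.org/2004/02/skos/core#exactMatch"

# Mappings taken from the slide: AGROVOC URI -> Eurovoc URI.
EXACT_MATCH = {
    "http://aims.fao.org/aos/agrovoc/c_207": "http://eurovoc.europa.eu/219055",    # agroforestry
    "http://aims.fao.org/aos/agrovoc/c_4826": "http://eurovoc.europa.eu/220018",   # milk
    "http://aims.fao.org/aos/agrovoc/c_12332": "http://eurovoc.europa.eu/219871",  # maize
}

def as_triples(mapping, predicate=EXACT_MATCH_URI):
    """Emit each mapping in both directions, since skos:exactMatch is symmetric."""
    out = set()
    for a, b in mapping.items():
        out.add((a, predicate, b))
        out.add((b, predicate, a))
    return out

def matches(uri, triples):
    """All concepts stated to match the given concept URI."""
    return sorted(o for s, _, o in triples if s == uri)

links = as_triples(EXACT_MATCH)
print(matches("http://eurovoc.europa.eu/219871", links))
```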


http://agris.fao.org/agris-search/search/display.do?f=2004/ZA/ZA04002.xml;ZA2004000049


http://aims.fao.org/aos/agrovoc/c_7825

http://eurovoc.europa.eu/218754


Linking data through common URIs

[Diagram: the concept “Maize” is identified in AGROVOC as http://aims.fao.org/aos/agrovoc/c_12332 and in Eurovoc as http://eurovoc.europa.eu/219871, each with the skosxl:literalForm “Maize”; UNBIS carries the same term. The concepts are connected by owl:sameAs/exactMatch, and each points to a document indexed with it.]

http://aims.fao.org/aos/agrovoc/c_12332 owl:sameAs http://eurovoc.europa.eu/219871

Documents shown:

• AGROVOC (AGRIS): http://agris.fao.org/agris-search/search/display.do?f=1996/TR/TR96001.xml;TR9600026

• Eurovoc (EUR-Lex): http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2010:202:0011:0015:EN:PDF

• UNBIS (UNBISnet): http://unbisnet.un.org:8080/ipac20/ipac.jsp?session=128F308557F34.283092&profile=bib&uri=full=3100001~!685149~!1&ri=1&aspect=subtab124&menu=search&source=~!horizon
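How a sameAs link lets one query reach documents held by different agencies can be sketched as follows. The owl:sameAs pair is the one shown on the slide; the document keys are shortened, hypothetical stand-ins for the AGRIS and EUR-Lex records, and the assumption that each is indexed with the maize concept is made for illustration.

```python
# owl:sameAs link from the slide (AGROVOC maize <-> Eurovoc maize).
SAME_AS = [
    ("http://aims.fao.org/aos/agrovoc/c_12332", "http://eurovoc.europa.eu/219871"),
]

# Document -> concept URI it is indexed with. The keys are hypothetical
# short names standing in for the catalogue records on the slide.
INDEXED_WITH = {
    "agris:TR9600026": "http://aims.fao.org/aos/agrovoc/c_12332",
    "eur-lex:OJ-L-2010-202": "http://eurovoc.europa.eu/219871",
}

def equivalents(uri, links):
    """All URIs reachable from `uri` by following sameAs links in either direction."""
    seen, todo = {uri}, [uri]
    while todo:
        current = todo.pop()
        for a, b in links:
            if current in (a, b):
                other = b if a == current else a
                if other not in seen:
                    seen.add(other)
                    todo.append(other)
    return seen

def documents_about(uri):
    """Documents indexed with the concept or with any equivalent concept."""
    same = equivalents(uri, SAME_AS)
    return sorted(doc for doc, concept in INDEXED_WITH.items() if concept in same)

print(documents_about("http://eurovoc.europa.eu/219871"))
```

A search that starts from the Eurovoc URI also finds the AGRIS record, even though that record was indexed with AGROVOC.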


What are we doing with unstructured data?

• We have enormous amounts of unstructured material

• Most of the documents that we produce are still semantically unstructured

• Human cataloguing and indexing is becoming increasingly rare

• We need machines to do automatic semantic markup of text

• If machines are trained on and based on concept schemes, they are able to do so


AgroTagger

• Identifies concepts in unstructured texts

• Uses AGROVOC as a controlled vocabulary

• Prototype under testing with excellent results (entire repository of ICARDA indexed)

• In future, will produce structured RDF files that can be used to link data, similar to “OpenCalais”
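To make the idea of concept identification against a controlled vocabulary concrete, here is a deliberately naive Python sketch. It is not AgroTagger's actual method, which the slides do not describe: it simply looks up whole words in a small label table. The concept URIs are those shown in the presentation; the label table, the function names and the use of the Dublin Core subject property for the output triples are assumptions.

```python
import re

# A tiny label -> concept URI lookup (illustrative subset only).
LABELS = {
    "maize": "http://aims.fao.org/aos/agrovoc/c_12332",
    "corn": "http://aims.fao.org/aos/agrovoc/c_12332",
    "milk": "http://aims.fao.org/aos/agrovoc/c_4826",
    "agroforestry": "http://aims.fao.org/aos/agrovoc/c_207",
}

def tag(text):
    """Return the concept URIs whose labels occur as whole words in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return sorted({LABELS[w] for w in words if w in LABELS})

def subject_triples(doc_uri, text, predicate="http://purl.org/dc/terms/subject"):
    """Express the tagging result as triples that link a document to concepts."""
    return [(doc_uri, predicate, concept) for concept in tag(text)]

# The document URI is a hypothetical placeholder.
print(subject_triples("http://example.org/doc/1",
                      "Corn and maize yields rose; milk prices fell."))
```

Synonyms such as "corn" and "maize" resolve to the same concept URI, which is what makes the resulting triples linkable across collections.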


Live demo: semantic markup

http://viewer.opencalais.com/

http://agropedialabs.iitk.ac.in/Tagger/Agrotagger_text.php

Collaboration

Some points about what we need to do and what we could do together


01: Open Archives + Linked Open Data

Our agencies have a wealth of important information

We should publish it as fast as possible as “Linked Open Data” and create links among the published sources

Metadata from databases and vocabularies can be published without major investment and with little delay.

Our data need to become reference points in the linked data environment.


02: Concept schemes!

A SKOS-XL model to transform complex multilingual thesauri into concept schemes and publish them as LOD

A cutting-edge workbench to enrich and maintain the concept schemes/vocabularies

Semantic interoperability! Mapping!


03: Semantic Technologies!

Development of production-level machine indexing to replace human indexing of agency publications

Adapting AgroTagger for UNBIS

Methodologies to adapt the system to any agency thesaurus and document corpus

Web services to access the semantic markup engines

Customization of search engines


Possible Steps

Working group with interested colleagues from different agencies

Discussion forum to elaborate a project proposal (can be hosted on aims.fao.org)

Workshop in spring to discuss and decide details


Thank You!


Giving the workbench a try

A demo version of the AWB, with all functionalities, available to users for testing purposes: http://202.73.13.50:55234/agrovocdevv10d/

Latest stable release, version 1.0 (read/write): http://202.73.13.50:55381/agrovocv10i/

Latest stable release, version 1.0 (read only, for visitors with view privileges only): http://202.73.13.50:55481/agrovocv10i/


…and more: http://aims.fao.org