
Ksim keizer 2010-10-19

Date post: 27-Jan-2015
Upload: johannes-keizer
Description:
Presentation on Linked Data and Thesauri given to the KSIM meeting of the UN
The role of Thesauri and Standard Vocabularies in linking data- AGROVOC-UNBIS- EUROVOC A proposal for collaboration between agencies Dr. Johannes Keizer FAO of the United Nations Office of Knowledge Exchange, Research and Extension Knowledge and Capacity for Development
Transcript
Page 1: Ksim keizer 2010-10-19

The role of Thesauri and Standard Vocabularies in linking data - AGROVOC - UNBIS - EUROVOC
A proposal for collaboration between agencies

Dr. Johannes Keizer
FAO of the United Nations
Office of Knowledge Exchange, Research and Extension
Knowledge and Capacity for Development

Page 2: Ksim keizer 2010-10-19

dr johannes keizer - FAO of the United Nations - knowledge and capacity for development

UN, KSIM meeting, New York, 2010-10-19

The Development of the Internet

Page 3: Ksim keizer 2010-10-19

Open World Mindset

“Closed” (“normal”) IT environments
• Data sources carefully controlled.
• Data formats “custom-defined” for an application.

Linked data based on an “open world mindset”
• Integrating data from the open Web
• Systems designed to incorporate new information incrementally
• By design, tolerance of incomplete information
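
The contrast between a closed and an open world can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `OpenWorldStore` class and the `ex:` identifiers are hypothetical and do not come from the presentation. The point it tries to show is that a missing statement is treated as "unknown" rather than "false", and that new statements can arrive at any time.

```python
from typing import Optional

Triple = tuple[str, str, str]


class OpenWorldStore:
    """A tiny statement store that accepts new information at any time."""

    def __init__(self) -> None:
        self.triples: set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        # New information is incorporated incrementally.
        self.triples.add((subject, predicate, obj))

    def ask(self, subject: str, predicate: str, obj: str) -> Optional[bool]:
        # True if the statement is known, None (unknown) otherwise.
        # A closed-world system would answer False in the second case.
        return True if (subject, predicate, obj) in self.triples else None


store = OpenWorldStore()
store.add("ex:maize", "ex:isA", "ex:cereal")
print(store.ask("ex:maize", "ex:isA", "ex:cereal"))
print(store.ask("ex:maize", "ex:hasPest", "ex:borer"))  # not known yet
store.add("ex:maize", "ex:hasPest", "ex:borer")         # arrives later
print(store.ask("ex:maize", "ex:hasPest", "ex:borer"))
```

Returning `None` instead of `False` is the design choice that models tolerance of incomplete information.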

Page 4: Ksim keizer 2010-10-19

The Linked Data Universe: http://www.linkeddata.org (July 2009)

Page 5: Ksim keizer 2010-10-19

The Linked Data Universe: http://www.linkeddata.org (July 2010)

Page 6: Ksim keizer 2010-10-19

Example: BBC Wildlife Finder

Page 7: Ksim keizer 2010-10-19

Humboldt Squid page, pulled together from a diversity of Linked Data sources:

• Animal Diversity Web: nocturnal way of life
• BBC TV documentary
• BBC News item
• Wikipedia

Page 8: Ksim keizer 2010-10-19

RDF: a grammar for the language of data

Resource A -> relatedTo -> Resource B
Resource A -> describedBy -> Some text

1. Describe resources using interrelated “statements” (“triples”).
2. Use URIs (unique, globally managed identifiers) as the “words” of statements.
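
The two rules can be illustrated with a minimal Python sketch. The `http://example.org/` namespace and the `describe` helper are assumptions made for this example; only the resource and property names (ResourceA, ResourceB, relatedTo, describedBy) are taken from the slide.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # a URI
    predicate: str  # a URI
    obj: str        # a URI or a literal text value


EX = "http://example.org/"  # hypothetical namespace

statements = [
    Triple(EX + "ResourceA", EX + "relatedTo", EX + "ResourceB"),
    Triple(EX + "ResourceA", EX + "describedBy", "Some text"),
]


def describe(resource: str, triples: list[Triple]) -> dict[str, list[str]]:
    """Collect everything stated about one resource, grouped by predicate."""
    result: dict[str, list[str]] = {}
    for t in triples:
        if t.subject == resource:
            result.setdefault(t.predicate, []).append(t.obj)
    return result


print(describe(EX + "ResourceA", statements))
```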

Page 9: Ksim keizer 2010-10-19

RDF as a common format for merging data

• http://www.w3.org/2007/Talks/0221-Bangalore-IH/
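
Why a shared statement format makes merging simple can be sketched as follows. If two sources use the same URI for the same thing, combining them amounts to a set union of their statements. The book URI, the property URIs and the two sources below are hypothetical; only the AGROVOC concept URI appears elsewhere in this presentation.

```python
# Two hypothetical sources describing the same resource with the same URI.
BOOK = "http://example.org/book/1"

source_a = {
    (BOOK, "http://example.org/title", "Maize farming"),
    (BOOK, "http://example.org/author", "http://example.org/person/7"),
}
source_b = {
    (BOOK, "http://example.org/title", "Maize farming"),  # duplicate statement
    (BOOK, "http://example.org/subject", "http://aims.fao.org/aos/agrovoc/c_12332"),
}

# Merging is a set union: identical statements collapse, new ones are added.
merged = source_a | source_b

properties = sorted(p for s, p, o in merged if s == BOOK)
print(len(merged))
print(properties)
```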

Page 10: Ksim keizer 2010-10-19

Role of thesauri/concept schemes

• Born as tools to assure consistency in the indexing of library collections
• Thesauri were based on “terms”, but terms already represented concepts, though not explicitly
• Hierarchical and associative relationships represented generic ontological domain knowledge
• Candidate building blocks for the semantic web

Page 11: Ksim keizer 2010-10-19

… from thesaurus to ontologies …

Page 12: Ksim keizer 2010-10-19

AGROVOC today

• Around 30,000 concepts
• 600,000 labels in around 20 languages
• One-stop shop for terminological knowledge related to agriculture in general
• A knowledge base of related concepts organized in ontological relationships (hierarchical, associative, equivalence)
• A concept/term/string-based system
• Concepts may be organized in multiple categories

Page 13: Ksim keizer 2010-10-19

Semantic Relationships

Concept to Concept

isA (hierarchy), isPestOf, hasPest

Concept to Term

has_lexicalization (links concepts to their lexical realizations)

Term to Term

isSynonymOf, isTranslationOf, hasAcronym, hasAbbreviation

Term to String

hasSpellingVariant, hasSingular
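
The three levels (concept, term, string) could be modelled roughly as below. This is a sketch under assumptions, not the AGROVOC data model itself: the class names, the concept identifiers and the rule that same-language terms of one concept count as synonyms while other-language terms count as translations are all chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    text: str
    lang: str
    spelling_variants: list[str] = field(default_factory=list)  # Term to String


@dataclass
class Concept:
    concept_id: str
    lexicalizations: list[Term] = field(default_factory=list)   # Concept to Term
    is_a: list["Concept"] = field(default_factory=list)         # Concept to Concept


# Hypothetical identifiers, used only for this example.
cereals = Concept("c_cereals", [Term("cereals", "en")])
maize = Concept(
    "c_maize",
    [Term("maize", "en"), Term("corn", "en"), Term("maïs", "fr")],
    is_a=[cereals],
)


def synonyms(concept: Concept, term: Term) -> list[str]:
    """Term to Term: other terms of the same concept in the same language."""
    return [t.text for t in concept.lexicalizations
            if t.lang == term.lang and t.text != term.text]


def translations(concept: Concept, term: Term) -> list[str]:
    """Term to Term: terms of the same concept in other languages."""
    return [t.text for t in concept.lexicalizations if t.lang != term.lang]


first = maize.lexicalizations[0]
print(synonyms(maize, first), translations(maize, first))
```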

Page 14: Ksim keizer 2010-10-19

AGROVOC conceptual model, in SKOS-XL

[Diagram. Node labels: concepts 12332, 6211, 8171 and 1474; SKOS Concept; SKOS Label; :foo; :bar; AGROVOC ConceptScheme; further schemes in FAO. Edge labels: rdf:type, skos:broader, skos:inScheme, skos:topConceptOf, skosxl:prefLabel, skosxl:altLabel, skos:literalForm, has_synonym, has_translation. Literal forms shown: “maize”, “corn”, “maïs” (fr).]

Page 15: Ksim keizer 2010-10-19

http://www.w3.org/2004/02/skos/

Page 16: Ksim keizer 2010-10-19

SKOS-XL output

<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/agrovocScheme">
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#ConceptScheme"/>
</rdf:Description>
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/c_330829">
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
  <skos:inScheme rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
  <skos:topConceptOf rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
</rdf:Description>
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/xl_en_1278479064610">
  <literalForm xmlns="http://www.w3.org/2008/05/skos-xl#" xml:lang="en">subjects</literalForm>
  <rdf:type rdf:resource="http://www.w3.org/2008/05/skos-xl#Label"/>
</rdf:Description>
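
To show how such output breaks down into statements, the fragment can be flattened into (subject, predicate, object) triples with Python's standard `xml.etree.ElementTree`. This is a sketch, not a general RDF/XML parser: it only handles flat `rdf:Description` blocks like the ones on the slide. The enclosing `rdf:RDF` root element with its namespace declarations is an assumption added here so that the fragment parses, and `to_triples` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
SKOSXL = "http://www.w3.org/2008/05/skos-xl#"
AGROVOC = "http://aims.fao.org/aos/agrovoc/"

# The slide's fragment, wrapped in an rdf:RDF root (added for this example).
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
<rdf:Description rdf:about="{AGROVOC}agrovocScheme">
  <rdf:type rdf:resource="{SKOS}ConceptScheme"/>
</rdf:Description>
<rdf:Description rdf:about="{AGROVOC}c_330829">
  <rdf:type rdf:resource="{SKOS}Concept"/>
  <skos:inScheme rdf:resource="{AGROVOC}agrovocScheme"/>
  <skos:topConceptOf rdf:resource="{AGROVOC}agrovocScheme"/>
</rdf:Description>
<rdf:Description rdf:about="{AGROVOC}xl_en_1278479064610">
  <literalForm xmlns="{SKOSXL}" xml:lang="en">subjects</literalForm>
  <rdf:type rdf:resource="{SKOSXL}Label"/>
</rdf:Description>
</rdf:RDF>"""


def to_triples(xml_text: str) -> list[tuple[str, str, str]]:
    """Flatten rdf:Description blocks into (subject, predicate, object) tuples."""
    triples = []
    root = ET.fromstring(xml_text)
    for desc in root.findall(f"{{{RDF}}}Description"):
        subject = desc.get(f"{{{RDF}}}about")
        for prop in desc:
            # ElementTree tags look like "{namespace}local"; join them into a URI.
            predicate = prop.tag[1:].replace("}", "")
            resource = prop.get(f"{{{RDF}}}resource")
            obj = resource if resource is not None else (prop.text or "")
            triples.append((subject, predicate, obj))
    return triples


triples = to_triples(DOC)
for triple in triples:
    print(triple)
```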

URI of AGROVOC concept

Page 17: Ksim keizer 2010-10-19

The concept scheme workbench

Page 18: Ksim keizer 2010-10-19

Linking vocabularies

AGROVOC | EUROVOC | UNBIS | Relationship
http://aims.fao.org/aos/agrovoc/c_207 | http://eurovoc.europa.eu/219055 | agroforestry | skos:exactMatch / owl:sameAs
http://aims.fao.org/aos/agrovoc/c_4826 | http://eurovoc.europa.eu/220018 | MILK | skos:exactMatch / owl:sameAs
http://aims.fao.org/aos/agrovoc/c_12332 | http://eurovoc.europa.eu/219871 | MAIZE | skos:exactMatch / owl:sameAs
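
The mappings above can be held in a small lookup structure. The sketch below uses the three AGROVOC/EUROVOC pairs from the slide; the `build_index` helper is hypothetical. Since skos:exactMatch and owl:sameAs are both symmetric, each pair is indexed in both directions.

```python
EXACT_MATCH = [
    # (AGROVOC URI, EUROVOC URI), as listed on the slide
    ("http://aims.fao.org/aos/agrovoc/c_207", "http://eurovoc.europa.eu/219055"),    # agroforestry
    ("http://aims.fao.org/aos/agrovoc/c_4826", "http://eurovoc.europa.eu/220018"),   # milk
    ("http://aims.fao.org/aos/agrovoc/c_12332", "http://eurovoc.europa.eu/219871"),  # maize
]


def build_index(pairs: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Index each mapping in both directions, since the match is symmetric."""
    index: dict[str, set[str]] = {}
    for left, right in pairs:
        index.setdefault(left, set()).add(right)
        index.setdefault(right, set()).add(left)
    return index


index = build_index(EXACT_MATCH)
print(index["http://eurovoc.europa.eu/219871"])
```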

Page 19: Ksim keizer 2010-10-19

http://agris.fao.org/agris-search/search/display.do?f=2004/ZA/ZA04002.xml;ZA2004000049

Page 20: Ksim keizer 2010-10-19

http://aims.fao.org/aos/agrovoc/c_7825

http://eurovoc.europa.eu/218754

Page 21: Ksim keizer 2010-10-19

Linking data through common URIs

[Diagram: the concept “Maize” in AGROVOC, Eurovoc and UNBIS, each with a skosxl:literalForm “Maize”, connected by owl:sameAs/exactMatch.]

http://aims.fao.org/aos/agrovoc/c_12332 owl:sameAs http://eurovoc.europa.eu/219871

Documents shown in the diagram:
http://agris.fao.org/agris-search/search/display.do?f=1996/TR/TR96001.xml;TR9600026
http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2010:202:0011:0015:EN:PDF
http://unbisnet.un.org:8080/ipac20/ipac.jsp?session=128F308557F34.283092&profile=bib&uri=full=3100001~!685149~!1&ri=1&aspect=subtab124&menu=search&source=~!horizon
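
How equivalence links let one query reach documents in several repositories can be sketched as follows. Everything except the AGROVOC and Eurovoc maize URIs (and the AGROVOC milk URI from the earlier mapping slide) is an assumption: the UNBIS identifier is a placeholder, because the slides give no UNBIS URI, and the record names stand in for real catalogue entries.

```python
AGROVOC_MAIZE = "http://aims.fao.org/aos/agrovoc/c_12332"
EUROVOC_MAIZE = "http://eurovoc.europa.eu/219871"
UNBIS_MAIZE = "urn:example:unbis:maize"  # hypothetical placeholder

same_as = [(AGROVOC_MAIZE, EUROVOC_MAIZE), (EUROVOC_MAIZE, UNBIS_MAIZE)]

# Hypothetical index: the concept URI each repository used to tag a record.
indexed = {
    "agris-record": AGROVOC_MAIZE,
    "eur-lex-document": EUROVOC_MAIZE,
    "unbisnet-record": UNBIS_MAIZE,
    "unrelated-record": "http://aims.fao.org/aos/agrovoc/c_4826",
}


def equivalents(uri: str, links: list[tuple[str, str]]) -> set[str]:
    """Follow sameAs links in both directions until no new URI turns up."""
    found, frontier = {uri}, [uri]
    while frontier:
        current = frontier.pop()
        for a, b in links:
            for src, dst in ((a, b), (b, a)):
                if src == current and dst not in found:
                    found.add(dst)
                    frontier.append(dst)
    return found


def documents_about(uri: str) -> list[str]:
    """Return all records tagged with the URI or any equivalent URI."""
    targets = equivalents(uri, same_as)
    return sorted(doc for doc, concept in indexed.items() if concept in targets)


print(documents_about(AGROVOC_MAIZE))
```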

Page 22: Ksim keizer 2010-10-19

What are we doing with unstructured data?

• We have enormous amounts of unstructured material
• Most of the documents that we produce are still semantically unstructured
• Human cataloguing and indexing is becoming ever rarer
• We need machines to do automatic semantic markup of text
• Machines that are trained on, and based on, concept schemes are able to do so

Page 23: Ksim keizer 2010-10-19

Page 24: Ksim keizer 2010-10-19

AgroTagger

• Identifies concepts in unstructured texts
• Uses AGROVOC as a controlled vocabulary
• Prototype under testing with excellent results (entire repository of ICARDA indexed)
• In future, will produce structured RDF files that can be used to link data, like “OpenCalais”
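
The general idea of tagging text against a controlled vocabulary can be sketched with a simple dictionary lookup. This is not AgroTagger's actual method, which the slides do not describe; it is a deliberately naive stand-in. The label-to-URI table is a hypothetical excerpt that reuses concept URIs from earlier slides, and treating “corn” as an alternative label for maize is an assumption.

```python
import re

# Hypothetical excerpt of a controlled vocabulary: label -> concept URI.
VOCABULARY = {
    "maize": "http://aims.fao.org/aos/agrovoc/c_12332",
    "corn": "http://aims.fao.org/aos/agrovoc/c_12332",   # assumed alternative label
    "milk": "http://aims.fao.org/aos/agrovoc/c_4826",
    "agroforestry": "http://aims.fao.org/aos/agrovoc/c_207",
}


def tag(text: str) -> dict[str, int]:
    """Count how often each concept is mentioned, matching whole words only."""
    counts: dict[str, int] = {}
    for word in re.findall(r"[a-z]+", text.lower()):
        uri = VOCABULARY.get(word)
        if uri is not None:
            counts[uri] = counts.get(uri, 0) + 1
    return counts


sample = "Maize yields rose; corn prices and milk prices fell."
print(tag(sample))
```

Because different labels resolve to the same concept URI, the output counts concepts rather than strings, which is what makes the result usable for linking.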

Page 25: Ksim keizer 2010-10-19

Page 26: Ksim keizer 2010-10-19

Page 27: Ksim keizer 2010-10-19

Page 28: Ksim keizer 2010-10-19

Live demo: semantic markup

http://viewer.opencalais.com/
http://agropedialabs.iitk.ac.in/Tagger/Agrotagger_text.php

Page 29: Ksim keizer 2010-10-19

Collaboration

Some points about what we need to do and what we could do together

Page 30: Ksim keizer 2010-10-19

01: Open Archives + Linked Open Data

• Our agencies have a wealth of important information
• We should publish it as fast as possible as “Linked Open Data” and create links among the datasets
• Metadata from databases and vocabularies can be published without major investment and with little delay
• Our data need to become reference points in the linked data environment

Page 31: Ksim keizer 2010-10-19

02: Concept schemes!

• A SKOS-XL model to transform complex multilingual thesauri into concept schemes and publish them as LOD
• A cutting-edge workbench to enrich and maintain the concept schemes/vocabularies
• Semantic interoperability! Mapping!

Page 32: Ksim keizer 2010-10-19

03: Semantic Technologies!

• Development of production-level machine indexing to replace human indexing of agency publications
• Adapting AgroTagger for UNBIS
• Methodologies to adapt the system to any agency thesaurus and document corpus
• Web services to access the semantic markup engines
• Customization of search engines

Page 33: Ksim keizer 2010-10-19

Possible Steps

• Working group with interested colleagues from different agencies
• Discussion forum to elaborate a project proposal (can be hosted on aims.fao.org)
• Workshop in spring to discuss and decide details

Page 34: Ksim keizer 2010-10-19

Thank You!

Page 35: Ksim keizer 2010-10-19

Trying out the workbench

• A demo version of the AWB, with all functionalities, available to users for testing purposes: http://202.73.13.50:55234/agrovocdevv10d/
• Latest stable release, version 1.0 (read/write): http://202.73.13.50:55381/agrovocv10i/
• Latest stable release, version 1.0 (read only, for visitors with view-only privileges): http://202.73.13.50:55481/agrovocv10i/

Page 36: Ksim keizer 2010-10-19

…and more: http://aims.fao.org

