LOD2 Webinar Series FOX

Post on 12-Jan-2015

LOD2 Webinar . 29.11.2011 . Page 1 http://lod2.eu

Creating Knowledge out of Interlinked Data

http://lod2.eu

LOD2 is a large-scale integrating project co-funded by the European Commission within the FP7 Information and Communication Technologies Work Programme. This 4-year project comprises leading Linked Open Data technology researchers, companies, and service providers. The partners come from 12 countries and are coordinated by the Agile Knowledge Engineering and Semantic Web Research Group at the University of Leipzig, Germany.

LOD2 will integrate and syndicate Linked Data with existing large-scale applications. The project shows the benefits in the scenarios of Media and Publishing, Corporate Data intranets and eGovernment.

Once per month, the LOD2 webinar series offers a free webinar about tools and services along the Linked Open Data Life Cycle.

Stay with us and learn more about acquisition, editing, composing, connected applications – and finally publishing Linked Open Data.

Federated Knowledge Extraction Framework

Axel Ngonga

Axel Ngonga – Federated Knowledge Extraction 30.01.2014 Page 5 http://lod2.eu

Motivation

• Steady growth but incomplete
• Structured data
  • Triplify, Sparqlify
• Semi-structured data
  • DBpedia
• Unstructured data
  • Make up 80% of the Web
• Diverse solutions, yet low F-score even on non-noisy data
• Solution: FOX

Insight

• Dictionary-based approaches
• Pattern-based approaches
• Conditional Random Fields
• Support Vector Machines

Insight

• Diversity of solutions to one problem
  • NER, KE, RE
• Each solution has its strengths and weaknesses
• Apply ensemble learning to
  • Combine the tools at hand
  • Compute better results
• In our case, decision trees (v2)
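The ensemble idea above can be illustrated with a toy example. The following is a minimal sketch, not FOX's actual implementation: it assumes each NER tool emits one label per token and trains a small ID3-style decision tree on those labels. The three tools, the training rows, and all function names are invented for illustration.

```python
from collections import Counter
from math import log2

# Invented toy data: each row holds the labels that three hypothetical NER
# tools assigned to one token, followed by the gold label for that token.
TRAIN = [
    (("ORG", "ORG", "O"), "ORG"),
    (("ORG", "O", "O"), "O"),
    (("PER", "PER", "PER"), "PER"),
    (("O", "PER", "PER"), "PER"),
    (("LOC", "ORG", "LOC"), "LOC"),
    (("O", "O", "O"), "O"),
    (("O", "ORG", "LOC"), "LOC"),
    (("ORG", "ORG", "LOC"), "ORG"),
]

def entropy(rows):
    """Shannon entropy of the gold labels in rows."""
    counts = Counter(label for _, label in rows)
    return -sum(c / len(rows) * log2(c / len(rows)) for c in counts.values())

def majority(rows):
    """Most frequent gold label in rows."""
    return Counter(label for _, label in rows).most_common(1)[0][0]

def split(rows, feature):
    """Group rows by the label that tool number `feature` predicted."""
    groups = {}
    for row in rows:
        groups.setdefault(row[0][feature], []).append(row)
    return groups

def build_tree(rows, features):
    """ID3: pick the tool whose output leaves the least label entropy."""
    if len({label for _, label in rows}) == 1 or not features:
        return majority(rows)
    def remainder(feature):
        groups = split(rows, feature).values()
        return sum(len(g) / len(rows) * entropy(g) for g in groups)
    best = min(features, key=remainder)
    rest = [f for f in features if f != best]
    return {
        "feature": best,
        "default": majority(rows),  # used for tool outputs never seen in training
        "children": {v: build_tree(g, rest) for v, g in split(rows, best).items()},
    }

def predict(tree, votes):
    """Walk the tree using the tools' votes for one token."""
    while isinstance(tree, dict):
        tree = tree["children"].get(votes[tree["feature"]], tree["default"])
    return tree

tree = build_tree(TRAIN, [0, 1, 2])
print(predict(tree, ("ORG", "ORG", "LOC")))
```

The tree learns which tool to trust for which label, which a plain majority vote cannot do. How FOX builds its features and trains its trees is not stated on the slides.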

Architecture

[Figure: architecture diagram. Labels: Learning, Prediction, Orchestration; NER, KE, RE, NED]

Named Entity Disambiguation

• Uses the AGDISTIS framework: http://aksw.org/projects/AGDISTIS

Implementation

• Input
  • Text
  • HTML
  • URL
• Output
  • JSON-LD
  • RDF/XML
  • N3
  • …
• Execution
  • Single tools (light)
  • FOX Full
• Access
  • REST

Evaluation (FOX)

MUC-7 Corpus
• 6013 locations
• 11093 organizations
• 5882 persons

Evaluation (AGDISTIS)

Demo

http://fox.aksw.org

FOX API Parameters

input : text or a URL
type : { text | url }
task : { NER }
output : { JSONLD | N3 | N-TRIPLE | RDF/{ JSON | XML | XML-ABBREV } | TURTLE }
returnHtml : { true | false }
foxlight : an implemented INER class name (e.g. `org.aksw.fox.nertools.NEROpenNLP`) or `OFF`.

FOX API Parameters

curl -d type=text -d task=NER -d output=JSONLD --data-urlencode "input=The foundation of the University of Leipzig in 1409 initiated the city's development into a centre of German law and the publishing industry, and towards being a location of the Reichsgericht (High Court), and the German National Library (founded in 1912). The philosopher and mathematician Gottfried Leibniz was born in Leipzig in 1646, and attended the university from 1661-1666." -H "Content-Type: application/x-www-form-urlencoded" <SERVICE_URI>
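The same request can be prepared from Python. This is a minimal sketch using only the standard library, assuming the service accepts the form-encoded POST shown in the curl call. The function name `build_fox_request` and the address `http://example.org/fox` are placeholders, since the slides give the endpoint only as `<SERVICE_URI>`. The optional `returnHtml` and `foxlight` parameters are left out, as in the curl call.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

# Values taken from the parameter list on the slides.
ALLOWED_TYPES = {"text", "url"}
ALLOWED_TASKS = {"NER"}
ALLOWED_OUTPUTS = {
    "JSONLD", "N3", "N-TRIPLE", "RDF/JSON", "RDF/XML", "RDF/XML-ABBREV", "TURTLE",
}

def build_fox_request(service_uri, text, input_type="text", task="NER", output="JSONLD"):
    """Build (but do not send) a form-encoded POST request for FOX."""
    if input_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported type: {input_type}")
    if task not in ALLOWED_TASKS:
        raise ValueError(f"unsupported task: {task}")
    if output not in ALLOWED_OUTPUTS:
        raise ValueError(f"unsupported output: {output}")
    body = urlencode(
        {"type": input_type, "task": task, "output": output, "input": text}
    ).encode("utf-8")
    return Request(
        service_uri,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

# Placeholder endpoint: replace with the real service URI before sending.
req = build_fox_request("http://example.org/fox", "Leibniz was born in Leipzig in 1646.")
print(req.get_method(), parse_qs(req.data.decode("utf-8"))["output"])
# To send the request: urllib.request.urlopen(req).read()
```

Keeping request construction separate from sending makes it possible to check the parameters without a running service.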

FOX Response

{
  "@id" : "_:t1",
  "http://www.w3.org/2000/10/annotation-ns#body" : [ { "@value" : "University of Leipzig" } ],
  "http://ns.aksw.org/scms/source" : [ { "@id" : "http://ns.aksw.org/scms/tools/fox" } ],
  "http://ns.aksw.org/scms/means" : [ { "@id" : "http://dbpedia.org/resource/Leipzig_University" } ],
  "http://ns.aksw.org/scms/endIndex" : [ { "@value" : "43", "@type" : "http://www.w3.org/2001/XMLSchema#int" } ],
  "http://ns.aksw.org/scms/beginIndex" : [ { "@value" : "22", "@type" : "http://www.w3.org/2001/XMLSchema#int" } ],
  "@type" : [
    "http://ns.aksw.org/scms/annotations/ORGANIZATION",
    "http://www.w3.org/2000/10/annotation-ns#Annotation"
  ]
}
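A client typically needs only the mention, its type, the linked resource, and the character offsets. The sketch below shows one way to pull these out of the annotation above with the standard `json` module. The helper name `summarize` is an assumption, and the code handles a single annotation node rather than a complete FOX response.

```python
import json

SCMS = "http://ns.aksw.org/scms/"
ANN = "http://www.w3.org/2000/10/annotation-ns#"

# The annotation from the slide, reduced to the properties used below.
RESPONSE = """
{
  "@id": "_:t1",
  "http://www.w3.org/2000/10/annotation-ns#body": [{"@value": "University of Leipzig"}],
  "http://ns.aksw.org/scms/means": [{"@id": "http://dbpedia.org/resource/Leipzig_University"}],
  "http://ns.aksw.org/scms/endIndex": [{"@value": "43"}],
  "http://ns.aksw.org/scms/beginIndex": [{"@value": "22"}],
  "@type": [
    "http://ns.aksw.org/scms/annotations/ORGANIZATION",
    "http://www.w3.org/2000/10/annotation-ns#Annotation"
  ]
}
"""

def summarize(node):
    """Flatten one annotation node into a plain dictionary."""
    entity_types = [t for t in node["@type"] if t.startswith(SCMS + "annotations/")]
    return {
        "mention": node[ANN + "body"][0]["@value"],
        "type": entity_types[0].rsplit("/", 1)[1] if entity_types else None,
        "uri": node[SCMS + "means"][0]["@id"],
        "begin": int(node[SCMS + "beginIndex"][0]["@value"]),
        "end": int(node[SCMS + "endIndex"][0]["@value"]),
    }

summary = summarize(json.loads(RESPONSE))

# The indices are character positions in the submitted input text.
text = "The foundation of the University of Leipzig in 1409"
assert text[summary["begin"]:summary["end"]] == summary["mention"]
print(summary["type"], summary["uri"])
```

The final check confirms that `beginIndex` and `endIndex` delimit the mention in the input sentence, with the end index exclusive.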

FOX Response

[
  a scmsann:ORGANIZATION , ann:Annotation ;
  scms:beginIndex "22"^^xsd:int ;
  scms:endIndex "43"^^xsd:int ;
  scms:means <http://dbpedia.org/resource/Leipzig_University> ;
  scms:source <http://ns.aksw.org/scms/tools/fox> ;
  ann:body "University of Leipzig"^^xsd:string
] .

AGDISTIS API

curl --data-urlencode "text='The <entity>University of Leipzig</entity> was visited by <entity>Barack Obama</entity>.'" -d type='agdistis' <SERVICEURL>

[{"namedEntity":"Barack Obama","start":42,"disambiguatedURL":"http://dbpedia.org/resource/Barack_Obama","offset":12},
 {"namedEntity":"University of Leipzig","start":5,"disambiguatedURL":"http://dbpedia.org/resource/Leipzig_University","offset":21}]
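The response can be mapped back onto the submitted text. The sketch below rests on an interpretation inferred from this one example, not on documented behaviour: `start` appears to index into the text with the `<entity>` tags removed (the leading quote character included), and `offset` appears to be the length of the mention. The function name `link_mentions` is an assumption.

```python
import json
import re

# Query value and response copied from the slide.
QUERY = ("'The <entity>University of Leipzig</entity> was visited by "
         "<entity>Barack Obama</entity>.'")
RESPONSE = (
    '[{"namedEntity":"Barack Obama","start":42,'
    '"disambiguatedURL":"http://dbpedia.org/resource/Barack_Obama","offset":12},'
    '{"namedEntity":"University of Leipzig","start":5,'
    '"disambiguatedURL":"http://dbpedia.org/resource/Leipzig_University","offset":21}]'
)

def link_mentions(query, response_json):
    """Return (mention, url) pairs in text order, checking each reported span."""
    plain = re.sub(r"</?entity>", "", query)  # drop the markup, keep the text
    links = []
    for entity in sorted(json.loads(response_json), key=lambda e: e["start"]):
        span = plain[entity["start"]:entity["start"] + entity["offset"]]
        if span != entity["namedEntity"]:
            raise ValueError(f"span mismatch: {span!r} vs {entity['namedEntity']!r}")
        links.append((span, entity["disambiguatedURL"]))
    return links

for mention, url in link_mentions(QUERY, RESPONSE):
    print(mention, url)
```

Both spans in the example pass the check, which supports this reading of `start` and `offset` for this response.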

Conclusion and Future Work

• > 90% F-score
• Can be extended to cover other KE tasks (RE, POS, …)
• Easy integration into semantic applications
• More info at http://fox.aksw.org and http://aksw.org/projects/agdistis

Thank you for your attention!

Axel Ngonga
http://aksw.org/AxelNgonga | http://fox.aksw.org | http://lod2.org
ngonga@informatik.uni-leipzig.de