
Using the Web of Data for Information Extraction

Date post: 10-May-2015
Upload: benjamin-adrian
Description:
Talk at Insiders Technologies, 21.01.2010. It covers publishing RDF data with D2R Server, linking the data to obtain Linked Data, querying the data with SPARQL via SQUIN, and finally annotating text with this data using RDFa in Epiphany.
Transcript
Page 1: Using the Web of Data for Information Extraction

Benjamin Adrian
http://www.dfki.uni-kl.de/~adrian

Insiders, January 2010

Using the Web of Data for Information Extraction

Keywords: SPARQL, RDF, RDFa, D2R Server, SCOOBIE, Epiphany, SQUIN, Linked Data, OBIE

Page 2: Using the Web of Data for Information Extraction

Are you still surfing ...

Page 3: Using the Web of Data for Information Extraction

… or overloaded?

Page 4: Using the Web of Data for Information Extraction

A simple question ...

What are the cities of the universities in Rhineland-Palatinate, and what is the unemployment rate of these cities?

Page 5: Using the Web of Data for Information Extraction

A simple question ...

What are the cities of the universities in Rhineland-Palatinate, and what is the unemployment rate of these cities?

As a SPARQL query (http://www.w3.org/TR/rdf-sparql-query/):

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX eurostat: <http://www4.wiwiss.fu-berlin.de/eurostat/resource/eurostat/>
PREFIX dbpedia: <http://dbpedia.org/ontology/>
PREFIX dbpedia_cat: <http://dbpedia.org/resource/Category:>

SELECT ?dbpcity ?cityName ?ur WHERE {
    ?uni skos:subject dbpedia_cat:Universities_and_colleges_in_Rhineland-Palatinate ;
         dbpedia:city ?dbpcity .
    ?dbpcity owl:sameAs ?statcity .
    ?statcity rdfs:label ?cityName ;
              eurostat:unemployment_rate_total ?ur .
}

Page 6: Using the Web of Data for Information Extraction

… and its answer.

dbpcity                               cityName   ur
http://dbpedia.org/resource/Koblenz   Koblenz    8.8
http://dbpedia.org/resource/Trier     Trier      7.3

Data Sources:
http://wiki.dbpedia.org
http://epp.eurostat.ec.europa.eu
http://www4.wiwiss.fu-berlin.de/eurostat/

Query Engine: SQUIN - Query the Web of Linked Data
http://squin.sourceforge.net/

Page 7: Using the Web of Data for Information Extraction


So much data out there, too much?

Page 8: Using the Web of Data for Information Extraction

What data do you have?

Page 9: Using the Web of Data for Information Extraction

Are you still surfing ...

Page 10: Using the Web of Data for Information Extraction

Agenda

To use the Web of Data for information extraction, you have to understand its basics:

● RDF on one slide
● Publish data in RDF with D2R Server
● Publish RDF as Linked Data
● Query Linked Data with SPARQL and SQUIN
● Use RDF for information extraction
● Bring Linked Data to text via RDFa

Page 11: Using the Web of Data for Information Extraction

Wouldn't this be nice.

Data

Page 12: Using the Web of Data for Information Extraction

Wouldn't this be nice.

[Diagram labels: Data, Text, Extraction Pipeline, Extraction Results, User-defined Filter; arrow: enrich]

Page 13: Using the Web of Data for Information Extraction

Wouldn't this be nice.

[Diagram labels: Data, Text, Extraction Pipeline, Extraction Results, User-defined Filter, annotated text; arrows: enrich, annotate]

Page 14: Using the Web of Data for Information Extraction

Wouldn't this be nice.

[Diagram labels: Data, Text, Extraction Pipeline, Extraction Results, User-defined Filter, annotated text; arrows: populate, enrich, annotate]

Page 15: Using the Web of Data for Information Extraction

RDF on one slide

Found at: http://sig.ma/entity/ddcb76b935e91940e5508a460619a2ac.rdf

@prefix dblp_author: <http://dblp.l3s.de/d2r/page/authors/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix acm: <http://acm.rkbexplorer.com/description/> .

dblp_author:Michael_Gillmann
    foaf:name "Michael Gillmann" ;
    rdfs:seeAlso <http://www.bibsonomy.org/uri/author/Michael+Gillmann> ;
    rdf:type foaf:Agent ;
    owl:sameAs acm:person-197117-81d3fccbfd0249fc33c0d00f03a30af4 ;
    foaf:isMakerOf <http://dblp.l3s.de/d2r/resource/publications//icdar/SchulzEGAAD09> .

<http://dblp.l3s.de/d2r/resource/publications/conf/icdar/SchulzEGAAD09>
    dc:creator dblp_author:Michael_Gillmann ;
    dc:creator dblp_author:Markus_Ebbecke ;
    dc:title "Seizing the Treasure: Transferring Knowledge in Invoice Analysis" .

Page 16: Using the Web of Data for Information Extraction

RDF on one slide

Highlighted: Vocabularies

Found at: http://sig.ma/entity/ddcb76b935e91940e5508a460619a2ac.rdf

@prefix dblp_author: <http://dblp.l3s.de/d2r/page/authors/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix acm: <http://acm.rkbexplorer.com/description/> .

dblp_author:Michael_Gillmann
    foaf:name "Michael Gillmann" ;
    rdfs:seeAlso <http://www.bibsonomy.org/uri/author/Michael+Gillmann> ;
    rdf:type foaf:Agent ;
    owl:sameAs acm:person-197117-81d3fccbfd0249fc33c0d00f03a30af4 ;
    foaf:isMakerOf <http://dblp.l3s.de/d2r/resource/publications/dblp_pub:conf/icdar/SchulzEGAAD09> .

<http://dblp.l3s.de/d2r/resource/publications/dblp_pub:conf/icdar/SchulzEGAAD09>
    dc:creator dblp_author:Michael_Gillmann ;
    dc:creator dblp_author:Markus_Ebbecke ;
    dc:title "Seizing the Treasure: Transferring Knowledge in Invoice Analysis" .

Page 17: Using the Web of Data for Information Extraction

RDF on one slide

(Same RDF example as on page 16.)

Highlighted: URLs / URIs

Page 18: Using the Web of Data for Information Extraction

RDF on one slide

(Same RDF example as on page 16.)

Highlighted: Subjects

Page 19: Using the Web of Data for Information Extraction

RDF on one slide

(Same RDF example as on page 16.)

Highlighted: Predicates

Page 20: Using the Web of Data for Information Extraction

RDF on one slide

(Same RDF example as on page 16.)

Highlighted: Objects

Page 21: Using the Web of Data for Information Extraction

RDF data is graph data.
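To make the graph view concrete, here is a minimal Python sketch that stores a few triples from the "RDF on one slide" example as tuples and turns them into an adjacency structure. The `pub:` abbreviation and the function name `build_graph` are assumptions made for this illustration; no RDF library is involved and nothing is validated.

```python
from collections import defaultdict

# Triples taken from the slide example, written as (subject, predicate, object).
# "pub:SchulzEGAAD09" is a hypothetical short form of the publication URI.
TRIPLES = [
    ("dblp_author:Michael_Gillmann", "foaf:name", "Michael Gillmann"),
    ("dblp_author:Michael_Gillmann", "rdf:type", "foaf:Agent"),
    ("dblp_author:Michael_Gillmann", "foaf:isMakerOf", "pub:SchulzEGAAD09"),
    ("pub:SchulzEGAAD09", "dc:creator", "dblp_author:Michael_Gillmann"),
    ("pub:SchulzEGAAD09", "dc:creator", "dblp_author:Markus_Ebbecke"),
]


def build_graph(triples):
    """Map every subject node to its outgoing (predicate, object) edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


graph = build_graph(TRIPLES)
for predicate, obj in graph["pub:SchulzEGAAD09"]:
    print("pub:SchulzEGAAD09", "--", predicate, "->", obj)
```

Subjects and objects become nodes, predicates become labelled edges; the author and the publication point at each other, which a table row could not express as directly.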

Page 22: Using the Web of Data for Information Extraction


Publishing relational data in RDF

Page 23: Using the Web of Data for Information Extraction

Publishing relational data in RDF

D2R Server - Publishing Relational Databases on the Semantic Web
http://www4.wiwiss.fu-berlin.de/bizer/d2r-server/

Two small command line calls:

./generate-mapping -o mydatabase.n3 -b http://projects.dfki.uni-kl.de/mydatabase/ jdbc:mysql://localhost:3306/mydatabase

./d2r-server -p 80 -b http://projects.dfki.uni-kl.de/mydatabase/ mydatabase.n3
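What such a mapping does conceptually can be sketched in a few lines of Python: every row becomes a resource, every column a property. This is not D2R Server's implementation or its mapping language; the table `customer`, the URI patterns `resource/...` and `vocab/...`, and the function `rows_to_triples` are assumptions for illustration only. Only the base URI is taken from the slide.

```python
import sqlite3

BASE = "http://projects.dfki.uni-kl.de/mydatabase/"  # base URI from the slide


def rows_to_triples(conn, table, key_column):
    """Turn every row of `table` into (subject, predicate, object) triples.

    `table` is interpolated into the SQL text, so it must be a trusted name.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(columns, row))
        # Hypothetical URI pattern: one resource per primary key value.
        subject = f"{BASE}resource/{table}/{record[key_column]}"
        for column, value in record.items():
            if column == key_column or value is None:
                continue
            # Hypothetical URI pattern: one property per table column.
            predicate = f"{BASE}vocab/{table}_{column}"
            triples.append((subject, predicate, value))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'ACME', 'Kaiserslautern')")
triples = rows_to_triples(conn, "customer", "id")
for triple in triples:
    print(triple)
```

An in-memory SQLite database stands in for the MySQL database of the slide so that the sketch runs without a server.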

Page 24: Using the Web of Data for Information Extraction

Linked Data: Linking RDF data from different sources

[Diagram labels: Customer DB, Employees DB, Project DB, DBpedia]

How to interlink these datasets?

Page 25: Using the Web of Data for Information Extraction

Linked Data: Linking RDF data from different sources

Linked Data Principles (TimBL, 2006)

1. Use URIs as names for things (e.g., http://dbpedia.org/resource/Berlin)
2. Use HTTP URIs so that people can look up those names
3. Provide useful information in RDF when someone looks up a URI
4. Include links to other URIs to enable discovery of more information

Example:

<http://dbpedia.org/resource/Berlin>
    owl:sameAs opencyc:en/CityOfBerlinGermany ;
    owl:sameAs opencyc:en/Berlin_StateGermany ;
    owl:sameAs <http://sws.geonames.org/2950159/> ;
    owl:sameAs <http://www4.wiwiss.fu-berlin.de/eurostat/resource/regions/Berlin> ;
    owl:sameAs freebase:http://dbpedia.org/resource/Berlin .
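One way a consumer can use owl:sameAs links is to group all URIs that name the same thing. The following Python sketch does this with a small union-find structure, using only the full URIs from the Berlin example above; the function name `same_as_classes` is an assumption, and real systems may treat owl:sameAs more carefully than this.

```python
# owl:sameAs links from the Berlin example (only the fully written URIs).
SAME_AS = [
    ("http://dbpedia.org/resource/Berlin",
     "http://sws.geonames.org/2950159/"),
    ("http://dbpedia.org/resource/Berlin",
     "http://www4.wiwiss.fu-berlin.de/eurostat/resource/regions/Berlin"),
]


def same_as_classes(links):
    """Group URIs into sets of names that are linked by owl:sameAs."""
    parent = {}

    def find(uri):
        # Follow parent pointers to the representative, shortening the path.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]
            uri = parent[uri]
        return uri

    for left, right in links:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    classes = {}
    for uri in list(parent):
        classes.setdefault(find(uri), set()).add(uri)
    return list(classes.values())


classes = same_as_classes(SAME_AS)
print(classes)
```

With the two links above, all three URIs end up in one class, so a fact published about the Eurostat URI can be attached to the DBpedia resource.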

Page 26: Using the Web of Data for Information Extraction

SPARQL: Querying RDF data

SPARQL - the RDF query language.

In contrast to SQL, its data model is not set-oriented but graph-oriented.

Some examples:

Resulting in tuples:

SELECT ?interest ?friend WHERE {
    <http://www.w3.org/People/Berners-Lee/card#i> foaf:knows ?friend .
    ?friend foaf:interest ?interest .
}

Resulting in a graph:

CONSTRUCT { ?friend foaf:interest ?interest } WHERE {
    <http://www.w3.org/People/Berners-Lee/card#i> foaf:knows ?friend .
    ?friend foaf:interest ?interest .
}
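The difference between the two result forms can be sketched with a toy triple-pattern matcher in Python. This is a simplified illustration, not a SPARQL engine: the data (`ex:alice`, `ex:bob`, `ex:carol` and their interests) is invented, and the helper names `match`, `solve`, `select` and `construct` are assumptions.

```python
TIMBL = "http://www.w3.org/People/Berners-Lee/card#i"

# Invented example data; only the TIMBL URI comes from the slide.
DATA = [
    (TIMBL, "foaf:knows", "ex:alice"),
    (TIMBL, "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:interest", "ex:SemanticWeb"),
    ("ex:bob", "foaf:interest", "ex:Cycling"),
    ("ex:carol", "foaf:interest", "ex:Chess"),
]

# The WHERE clause of both queries; terms starting with "?" are variables.
WHERE = [
    (TIMBL, "foaf:knows", "?friend"),
    ("?friend", "foaf:interest", "?interest"),
]


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def solve(patterns, triples):
    """Return every variable binding that satisfies all patterns."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                candidate = match(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        bindings = extended
    return bindings


def select(variables, patterns, triples):
    """SELECT: one tuple of values per solution."""
    return [tuple(b[v] for v in variables) for b in solve(patterns, triples)]


def construct(template, patterns, triples):
    """CONSTRUCT: fill the template with each solution, yielding new triples."""
    graph = set()
    for binding in solve(patterns, triples):
        for pattern in template:
            graph.add(tuple(binding.get(term, term) for term in pattern))
    return graph


rows = select(["?interest", "?friend"], WHERE, DATA)
graph = construct([("?friend", "foaf:interest", "?interest")], WHERE, DATA)
print(rows)
print(graph)
```

`select` returns a list of tuples, `construct` returns a set of triples, that is, again a graph. The interest of `ex:carol` appears in neither result because nothing links her to the start URI.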

Page 27: Using the Web of Data for Information Extraction

SPARQL: Query Linked Data from different sources

[Diagram labels: Customer DB, Employees DB, Project DB, DBpedia]

How to access these datasets with a single SPARQL query?

Page 28: Using the Web of Data for Information Extraction

SPARQL: Query Linked Data from different sources

[Diagram labels: SQUIN; D2R Server (four times); Customer DB, Employees DB, Project DB, DBpedia]

SQUIN: Query the Web of Linked Data
http://squin.sourceforge.net/

SQUIN follows a link traversal approach over HTTP URIs.

Remember:

SELECT DISTINCT ?c ?cityName ?ur WHERE {
    ?u skos:subject dbpedia_cat:Universities_and_colleges_in_Rhineland-Palatinate ;
       dbpedia:city ?c .
    ?c owl:sameAs [ rdfs:label ?cityName ;
                    eurostat:unemployment_rate_total ?ur ]
}
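The idea of link traversal can be sketched in Python with a simulated Web: looking up a URI returns the triples published there, and URIs found in those triples are looked up in turn. This is a naive crawl from a seed URI, not SQUIN's actual algorithm, which interleaves traversal with query evaluation. The dictionary `WEB`, the URIs `UNI` and `STAT`, and the function `traverse` are assumptions; only the Koblenz URI and the value 8.8 come from the slides.

```python
from collections import deque

KOBLENZ = "http://dbpedia.org/resource/Koblenz"   # from the answer slide
UNI = "http://example.org/some-university"        # hypothetical URI
STAT = "http://example.org/eurostat/Koblenz"      # hypothetical URI

# Simulated Web: "dereferencing" a URI yields the triples published at it.
WEB = {
    UNI: [(UNI, "dbpedia:city", KOBLENZ)],
    KOBLENZ: [(KOBLENZ, "owl:sameAs", STAT)],
    STAT: [
        (STAT, "rdfs:label", "Koblenz"),
        (STAT, "eurostat:unemployment_rate_total", "8.8"),
    ],
}


def traverse(seeds, lookup, max_lookups=50):
    """Collect triples by following subject and object URIs breadth-first."""
    seen, queue, triples = set(), deque(seeds), set()
    while queue and len(seen) < max_lookups:
        uri = queue.popleft()
        if uri in seen:
            continue
        seen.add(uri)
        # Literals and unknown URIs simply yield no triples.
        for triple in lookup(uri):
            triples.add(triple)
            for term in (triple[0], triple[2]):
                if term not in seen:
                    queue.append(term)
    return triples


triples = traverse([UNI], lambda uri: WEB.get(uri, []))
for triple in sorted(triples):
    print(triple)
```

Starting from the university alone, the unemployment rate is reached by following dbpedia:city and then owl:sameAs, without knowing in advance which data source holds it.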

Page 29: Using the Web of Data for Information Extraction

Using RDF and Linked Data for Information Extraction

[Diagram labels: User, Linked Data, Text, Extraction Pipeline, Query, Result Graph; arrows: asks question, about, answers to]

Page 30: Using the Web of Data for Information Extraction

Using RDF and Linked Data for Information Extraction

What data do we have?

Example RDF data:

<http://dblp.l3s.de/d2r/resource/publications/dblp_pub:conf/icdar/SchulzEGAAD09>
    rdf:type foaf:Document ;
    dc:creator dblp_author:Markus_Ebbecke ;
    dc:title "Seizing the Treasure: Transferring Knowledge in Invoice Analysis" .

Classes: foaf:Document, foaf:Person
Instances: .../SchulzEGAAD09, .../Markus_Ebbecke
Datatype Properties: dc:title, foaf:name, foaf:firstName, foaf:surName
Object Properties: dc:creator, foaf:knows
Literals: "Markus", "Ebbecke", "Seizing the Treasure: Transferring Knowledge in Invoice Analysis"
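How literals of datatype properties can serve extraction is sketched below in Python: labels from the RDF data become a gazetteer that maps surface strings to instances, which is then matched against text. This is a simplified illustration, not SCOOBIE's implementation; the `pub:` abbreviation, the set `LABEL_PROPERTIES` and both function names are assumptions.

```python
import re

# Triples with literal values, taken from the slide examples.
TRIPLES = [
    ("pub:SchulzEGAAD09", "dc:title",
     "Seizing the Treasure: Transferring Knowledge in Invoice Analysis"),
    ("dblp_author:Michael_Gillmann", "foaf:name", "Michael Gillmann"),
]

# Assumption: these datatype properties carry names usable as labels.
LABEL_PROPERTIES = {"foaf:name", "dc:title", "rdfs:label"}


def build_gazetteer(triples):
    """Map each lower-cased label to the set of instances it may denote."""
    gazetteer = {}
    for subject, predicate, literal in triples:
        if predicate in LABEL_PROPERTIES:
            gazetteer.setdefault(literal.lower(), set()).add(subject)
    return gazetteer


def find_instances(text, gazetteer):
    """Return sorted (start, end, instances) for every label found in text."""
    lowered = text.lower()
    hits = []
    for label, instances in gazetteer.items():
        pattern = r"\b" + re.escape(label) + r"\b"
        for found in re.finditer(pattern, lowered):
            hits.append((found.start(), found.end(), sorted(instances)))
    return sorted(hits)


text = "The paper by Michael Gillmann was presented at ICDAR."
hits = find_instances(text, build_gazetteer(TRIPLES))
print(hits)
```

A label that belongs to several instances yields several candidates, which is where a disambiguation step would be needed.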

Page 31: Using the Web of Data for Information Extraction

SCOOBIE: Domain Adaption

[Diagram labels: Structured Data (Vocabulary Data, Instance Data); Text Corpus Data; Data Preprocessing & Learning (offline); Patterns and Gazetteers; Information Extraction (online); Data]

Page 32: Using the Web of Data for Information Extraction

SCOOBIE: Eco System

[Diagram labels: Training Corpus, Patterns + Gazetteers, Text Corpus, Ontology, Instances, Domain Knowledge Models, Session Data, Tasks, Index, Models; steps: Pre-process, Train, Extract]

Page 33: Using the Web of Data for Information Extraction

SCOOBIE: OBIE Pipeline

Normalization: Text Extraction, Language Detection
Segmentation: Tokenization, Sentence Extraction, POS-Tagging
Symbolization: Named Entity Recognition, Structured Entity Recognition, Noun Phrase Chunking, Symbol Recognition
Instantiation: Instance Recognition, Instance Disambiguation, Chunk Classification
Contextualization: Fact Extraction, Fact Selection
Population: Query Answering

Page 34: Using the Web of Data for Information Extraction

Used Machine Learning Models

● Regex matching statistics (Structured Entity Recognition)
● Gazetteer matching statistics (Named Entity Recognition)
● CRF-based Noun Phrase Chunker
● K-Nearest-Neighbor chunk classifier (Chunk Classification)
● Spreading Activation-based fact ranking (Fact Selection)
● TF/IDF-based instance re-ranking (Instance Disambiguation)

Learning categories shown on the slide: Supervised Learning; Semi-Supervised Learning; Unsupervised or Instance-based Learning

Page 35: Using the Web of Data for Information Extraction

Used Machine Learning: Conditional Random Field

CRFs are sequence taggers:

Train it with:
Bill     CAPITALIZED  noun
slept    LOWERCASE    non-noun
here     LOWERCASE    non-noun

Test it with:
He       CAPITALIZED
visited  LOWERCASE
London   CAPITALIZED

CRF results:
noun
non-noun
non-noun

MALLET - MAchine Learning for LanguagE Toolkit
http://mallet.cs.umass.edu/
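The input lines of the slide (token, features, optional label) can be produced with a few lines of Python. The sketch only covers feature extraction and formatting as shown on the slide; it does not train or run a CRF and does not call MALLET. The function names are assumptions, and the single capitalization feature is the one used in the slide's example.

```python
def token_features(token):
    """Return the slide's single feature: CAPITALIZED or LOWERCASE."""
    return ["CAPITALIZED" if token[:1].isupper() else "LOWERCASE"]


def to_training_lines(tokens, labels):
    """One line per token: token, its features, then the gold label."""
    return [
        " ".join([token, *token_features(token), label])
        for token, label in zip(tokens, labels)
    ]


def to_test_lines(tokens):
    """One line per token: token and its features, without a label."""
    return [" ".join([token, *token_features(token)]) for token in tokens]


train = to_training_lines(["Bill", "slept", "here"],
                          ["noun", "non-noun", "non-noun"])
test = to_test_lines(["He", "visited", "London"])
print("\n".join(train))
print("\n".join(test))
```

A sequence tagger then predicts one label per test line, taking the neighbouring lines into account rather than classifying each token in isolation.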

Page 36: Using the Web of Data for Information Extraction

Bringing Linked Data to Text

Annotate plain text or HTML with RDF data.

I'm working at DFKI.

RDFa offers an HTML extension:

I'm working at <span about="dbpedia:DFKI" property="rdfs:label">DFKI</span>

Now let's generate RDFa automatically ...
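A minimal Python sketch of such a generator: given a mapping from labels to resources, it wraps every occurrence of a label in a span like the one on the slide. This is an illustration under simple assumptions (exact string matching, one resource per label), not Epiphany's implementation; the function name `annotate` and the mapping are hypothetical.

```python
import html
import re


def annotate(text, gazetteer):
    """Wrap each known label in an RDFa span; gazetteer maps label -> resource."""
    if not gazetteer:
        return html.escape(text, quote=False)
    # Try longer labels first so that they win over labels they contain.
    labels = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(l) for l in labels) + r")\b")
    parts, position = [], 0
    for found in pattern.finditer(text):
        label = found.group(1)
        # Escape the untouched text between the previous match and this one.
        parts.append(html.escape(text[position:found.start()], quote=False))
        resource = html.escape(gazetteer[label], quote=True)
        parts.append(
            f'<span about="{resource}" property="rdfs:label">'
            f"{html.escape(label, quote=False)}</span>"
        )
        position = found.end()
    parts.append(html.escape(text[position:], quote=False))
    return "".join(parts)


annotated = annotate("I'm working at DFKI.", {"DFKI": "dbpedia:DFKI"})
print(annotated)
```

The text outside the matches is HTML-escaped, so the sketch expects plain text as input; annotating existing HTML would require parsing it first.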

Page 37: Using the Web of Data for Information Extraction

Do you remember?

[Diagram labels: Data, Text, Extraction Pipeline, Extraction Results, User-defined Filter, annotated text; arrows: populate, enrich, annotate]

Page 38: Using the Web of Data for Information Extraction

RDF Epiphany

Epiphany takes the original webpage …

Page 39: Using the Web of Data for Information Extraction

RDF Epiphany

Epiphany takes the original webpage …
and SCOOBIE initialized with an RDF data set …

Page 40: Using the Web of Data for Information Extraction

RDF Epiphany

Epiphany takes the original webpage …
and SCOOBIE initialized with an RDF data set …
It extracts RDF information from text and annotates it as RDFa …

Page 41: Using the Web of Data for Information Extraction

RDF Epiphany

Epiphany takes the original webpage …
and SCOOBIE initialized with an RDF Linked Data set …
It extracts RDF information from text and annotates it as RDFa …
Clicking on RDFa annotations opens further information from the Linked Data set.

Page 42: Using the Web of Data for Information Extraction

RDF Epiphany: At a glance

● Epiphany is a free web service.
● Epiphany uses SCOOBIE.
● Epiphany can be initialized with any RDF Linked Data set.
● Epiphany generates an RDF document about a web page.
● Epiphany annotates RDF as RDFa in the web page.

http://projects.dfki.uni-kl.de/epiphany/

Page 43: Using the Web of Data for Information Extraction

Summary

[Diagram labels: Text, Extraction Pipeline, Extraction Results, User-defined Filter, annotated text; arrows: populate, enrich, annotate; SQUIN; D2R Server (four times); Customer DB, Employees DB, Project DB, DBpedia]

Page 44: Using the Web of Data for Information Extraction

Outlook

[Diagram labels: E-Mail, Extraction Pipeline, Extraction Results, User-defined Filter, annotated E-Mail; arrows: populate, enrich, annotate; SQUIN; D2R Server (four times); Customer DB, Employees DB, Project DB, DBpedia]

Page 45: Using the Web of Data for Information Extraction

Thank you!

