Fact Extraction from Wikipedia

Date post: 04-Aug-2015
Upload: marco-fossati
Page 1: Fact Extraction from Wikipedia

Cutting Long Stories Short

Fact Extraction from Wikipedia

Marco Fossati fossati@spaziodati.eu

Poznan, 25th June 2015

Page 2: Fact Extraction from Wikipedia

What?

A Google Summer of Code Project for DBpedia

Page 3: Fact Extraction from Wikipedia

What?

Teaching Machines to Read

Natural Language

Page 4: Fact Extraction from Wikipedia

Why?

Text Contains a Huge Amount of Knowledge

Page 5: Fact Extraction from Wikipedia

Why?

DBpedia Focuses on Semi-structured Data

Discovery of New Relations

Automatic Knowledge Base Population

Page 6: Fact Extraction from Wikipedia

How?

Machine Learning +

Lexical Semantics

Page 7: Fact Extraction from Wikipedia

How?

Poland victory World Cup 2014

“Poland won the World Cup in 2014”
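The example above pairs a sentence with its labelled parts. A minimal sketch of that pairing as a data structure follows. The frame name `Victory` and the role names `Winner`, `Competition` and `Time` are assumptions made for illustration; the slide shows only the words themselves.

```python
from dataclasses import dataclass, field


@dataclass
class FrameInstance:
    """One extracted fact: a frame, the word that evokes it, and its roles."""
    frame: str                  # hypothetical frame label
    lexical_unit: str           # the word in the sentence that triggers the frame
    elements: dict = field(default_factory=dict)  # role name -> text span


sentence = "Poland won the World Cup in 2014"

fact = FrameInstance(
    frame="Victory",            # assumed label, not taken from the slides
    lexical_unit="won",
    elements={
        "Winner": "Poland",
        "Competition": "World Cup",
        "Time": "2014",
    },
)

# Sanity check: every role filler is a literal span of the sentence.
all_spans_found = all(span in sentence for span in fact.elements.values())
```

Keeping the role fillers as plain text spans leaves entity linking as a separate, later step.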

Page 8: Fact Extraction from Wikipedia

Approach

1. Lexical Units

1.1. Extraction via POS Tagging

1.2. Statistical Ranking

2. Frame Database (FrameNet, Kicktionary)

The Data-driven Way
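Step 1 can be sketched as follows, assuming the corpus has already been POS-tagged and lemmatized. The slides do not say which ranking statistic is used, so the TF-IDF-style score below is an assumption, and the tiny `tagged_docs` corpus is made up for illustration.

```python
import math
from collections import Counter

# Hypothetical pre-tagged corpus: one list of (lemma, POS) pairs per document.
tagged_docs = [
    [("Poland", "NOUN"), ("win", "VERB"), ("cup", "NOUN")],
    [("team", "NOUN"), ("win", "VERB"), ("match", "NOUN"), ("play", "VERB")],
    [("striker", "NOUN"), ("score", "VERB"), ("goal", "NOUN"), ("score", "VERB")],
]


def candidate_lexical_units(doc, pos="VERB"):
    """Step 1.1: keep the lemmas whose POS tag matches (verbs by default)."""
    return [lemma for lemma, tag in doc if tag == pos]


def rank_lexical_units(docs):
    """Step 1.2: rank candidates with an assumed TF-IDF-style score."""
    per_doc = [candidate_lexical_units(doc) for doc in docs]
    term_freq = Counter(lemma for lemmas in per_doc for lemma in lemmas)
    doc_freq = Counter(lemma for lemmas in per_doc for lemma in set(lemmas))
    n_docs = len(docs)
    scores = {
        lemma: term_freq[lemma] * math.log(1 + n_docs / doc_freq[lemma])
        for lemma in term_freq
    }
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


ranked = rank_lexical_units(tagged_docs)
top_lemmas = [lemma for lemma, _ in ranked]
```

The top-ranked lemmas would then be looked up in the frame database (step 2) to find the frames they can evoke.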

Page 9: Fact Extraction from Wikipedia

Approach

3. Frame + Frame Elements Classification

Unsupervised, Rule-based

Supervised

4. Crowdsourced Training Set Construction

5. RDF Serialization

The Data-driven Way
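A rough sketch of the rule-based variant of step 3, followed by step 5, is given below. The gazetteers, the trigger table, the role names and the `http://example.org/` namespace are all placeholders assumed for illustration; the output follows the N-Triples syntax but is not the vocabulary used by the project.

```python
import re

# Hypothetical lookup tables; a real system would derive these from data.
FRAME_TRIGGERS = {"won": "Victory"}
TEAMS = {"Italy", "Poland"}
COMPETITIONS = {"World Cup"}
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def classify(sentence):
    """Assign a frame and frame elements using simple lookup rules."""
    frame = next(
        (name for trigger, name in FRAME_TRIGGERS.items()
         if re.search(rf"\b{re.escape(trigger)}\b", sentence)),
        None,
    )
    if frame is None:
        return None, {}
    elements = {}
    team = next((t for t in sorted(TEAMS) if t in sentence), None)
    if team:
        elements["Winner"] = team
    competition = next((c for c in sorted(COMPETITIONS) if c in sentence), None)
    if competition:
        elements["Competition"] = competition
    year = re.search(r"\b(?:19|20)\d{2}\b", sentence)
    if year:
        elements["Time"] = year.group(0)
    return frame, elements


def to_ntriples(event_id, frame, elements, base="http://example.org/"):
    """Serialize one classified sentence as N-Triples lines."""
    subject = f"<{base}event/{event_id}>"
    lines = [f"{subject} <{RDF_TYPE}> <{base}frame/{frame}> ."]
    for role, value in sorted(elements.items()):
        literal = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{subject} <{base}fe/{role}> "{literal}" .')
    return "\n".join(lines)


frame, elements = classify("Poland won the World Cup in 2014")
triples = to_ntriples("1", frame, elements)
```

Modelling each sentence as an event node with one property per frame element keeps n-ary facts expressible as plain triples. A supervised classifier would replace `classify` while leaving the serializer unchanged.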

Page 10: Fact Extraction from Wikipedia

Crowdsourcing the Annotation

Label words with Frame Elements
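One way to turn several workers' labels into a single training label is a majority vote with an agreement threshold. The slides do not describe how judgments are aggregated, so this is only an assumed scheme, and the sample judgments are invented.

```python
from collections import Counter


def aggregate(judgments, min_agreement=0.5):
    """Keep a label only when more than `min_agreement` of workers chose it.

    `judgments` maps each text span to the list of labels workers gave it.
    """
    accepted = {}
    for span, labels in judgments.items():
        if not labels:
            continue  # nobody annotated this span
        label, votes = Counter(labels).most_common(1)[0]
        if votes / len(labels) > min_agreement:
            accepted[span] = label
    return accepted


# Invented judgments from three workers (two for the last span).
judgments = {
    "Poland": ["Winner", "Winner", "Competition"],
    "World Cup": ["Competition", "Competition", "Competition"],
    "2014": ["Time", "Winner"],
}

training_labels = aggregate(judgments)
```

Spans without a clear majority, such as `2014` here, are dropped rather than guessed, which trades training set size for label quality.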

Page 11: Fact Extraction from Wikipedia

Use Case

Soccer Domain

Widely Represented (223,000 articles)

Lots of Semi-structured Data

Italian Wikipedia

Page 12: Fact Extraction from Wikipedia

Wanna contribute?

https://github.com/dbpedia/fact-extractor

Page 13: Fact Extraction from Wikipedia

That’s all Folks!

Marco Fossati fossati@spaziodati.eu

