YAGO - A Core of Semantic Knowledge 1Fabian M. Suchanek
YAGO – A Core of Semantic Knowledge
Fabian M. Suchanek, Gjergji Kasneci, Gerhard Weikum
(Max-Planck Institute for Computer Science Saarbrücken/Germany)
Overview
- Motivation
- The Yago ontology
- Content
- Model
- Extension
- Conclusion
The Truth about Elvis
Elvis is alive!
He works as an astronaut in NASA's special security program.
Usual solution: ask Google
"Which NASA astronaut was born when Elvis was born?"
Yields only rubbish.
Reasons:
1. Google participates in the conspiracy
2. Google does not search knowledge, but Web sites
Solution: An ontology
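[Figure: a graph. Elvis and an unknown individual "?" are both linked by "born" to 1935; "?" is linked by "is a" to the class astronaut. Classes (astronaut, person, entity) are linked by "subclass". The words "Elvis Presley" and "The King" are linked by "means" to Elvis. The graph consists of words, individuals, classes, and relations.]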
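The four kinds of elements on this slide can be sketched over a plain list of triples. This is a minimal illustration, not YAGO's actual storage format: the triples are a toy reading of the figure (the edge from Elvis to person is an assumption), and `partition` is a hypothetical helper.

```python
# Toy triples read off the slide's graph; the exact edges are an assumption.
TRIPLES = [
    ('"Elvis Presley"', "means", "Elvis"),
    ('"The King"', "means", "Elvis"),
    ("Elvis", "born", "1935"),
    ("Elvis", "is a", "person"),          # assumed edge
    ("astronaut", "subclass", "person"),
    ("person", "subclass", "entity"),
]

def partition(triples):
    """Split the graph's vocabulary into words, individuals, classes, relations."""
    relations = {p for _, p, _ in triples}
    # Words are whatever "means" something.
    words = {s for s, p, _ in triples if p == "means"}
    # Classes are targets of "is a" and both ends of "subclass".
    classes = {o for _, p, o in triples if p == "is a"}
    classes |= {x for s, p, o in triples if p == "subclass" for x in (s, o)}
    # Everything else (including the literal 1935) counts as an individual here.
    nodes = {x for s, _, o in triples for x in (s, o)}
    individuals = nodes - words - classes
    return {"words": words, "individuals": individuals,
            "classes": classes, "relations": relations}

parts = partition(TRIPLES)
for kind, members in parts.items():
    print(kind, sorted(members))
```

Deriving the four sets from the edge labels keeps the data itself uniform: everything is a triple, and the roles follow from how a node is used.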
Where do we get the ontology from?
Previous approaches:
- Assemble the ontology manually
  (WordNet, SUMO, GeneOntology)
  Problem: Usually low coverage (MPI is in none of these)
- Extract the ontology from corpora (e.g. the Web)
  (KnowItAll, Espresso, Snowball, LEILA)
  Problem: Usually low accuracy (50%-92%)
YAGO approach:
Assemble the ontology from Wikipedia (=> good coverage)
Use the category system of Wikipedia (=> good accuracy)
Exploiting the Wikipedia category system
[Figure: mock Wikipedia article titled "Elvis Pr" with placeholder body text and a category list]
Categories:
1935_births => (Elvis, born, 1935)
Exploit relational categories
Categories:
American_singers => (Elvis, is a, American_singer)
Exploit conceptual categories
Categories:
Disputed_articles => (Elvis, is a, Disputed_article), which is not a useful fact
Avoid administrative categories
Categories:
Rock'n_Roll_Music => (Elvis, is a, Rock'n_Roll_Music), which is wrong
Avoid thematic categories
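The four kinds of categories can be told apart with a few rules. The talk does not say how this is done, so the following is only a crude sketch under stated assumptions: the stoplist, the regular expression, and the "last word ends in s" plural test are all hypothetical stand-ins for a real linguistic analysis.

```python
import re

# Hypothetical stoplist of administrative category names (assumption).
ADMINISTRATIVE = {"Disputed_articles"}
# Hypothetical relational patterns: regex -> relation name (assumption).
RELATIONAL = [(re.compile(r"^(\d{4})_births$"), "born")]

def classify(category):
    """Return (kind, relation, argument) for a Wikipedia category name."""
    for pattern, relation in RELATIONAL:
        m = pattern.match(category)
        if m:
            return ("relational", relation, m.group(1))
    if category in ADMINISTRATIVE:
        return ("administrative", None, None)
    # Crude plural test on the last word: plural names are read as classes.
    if category.split("_")[-1].endswith("s"):
        return ("conceptual", "is a", category[:-1])  # naive singular form
    return ("thematic", None, None)

for name in ["1935_births", "American_singers",
             "Disputed_articles", "Rock'n_Roll_Music"]:
    print(name, "->", classify(name))
```

Only relational and conceptual categories produce a fact about the article's entity; the other two kinds are recognised so that they can be skipped.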
The Upper Model
[Figure: Elvis (born 1935) is an American_singer. How does American_singer connect ("?") to upper classes such as person and entity?]
The Upper Model: From Wikipedia?
[Figure: Elvis (born 1935) is an American_singer. Above it, the Wikipedia category hierarchy shows People_by_occupation, Business, and Social_group, marked with "?".]
The Upper Model: From WordNet?
[Figure: Elvis (born 1935) is an American_singer. Candidate WordNet classes: Singer#1, Singer#17, ..., Person#3.]
[Figure: Elvis (born 1935) is an American_singers_of_Jewish_origin. Candidate WordNet classes: Singer#1, Singer#17, ..., Person#3, Origin#7.]
The YAGO ontology
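[Figure: Elvis (born 1935) is an American_singer; American_singer is a subclass of Singer#1; Singer#1 is a subclass of Person#3. The word "singer" means Singer#1; the word "Elvis Presley" means Elvis.]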
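One plausible way to attach a Wikipedia category to a WordNet class is to find the head word of the category name and take its first listed sense. The slides do not spell out the procedure, so this is a sketch under assumptions: the preposition list, the naive singular form, and the tiny `SENSES` table (built only from the sense labels shown on the slides) are hypothetical.

```python
# Hypothetical preposition list used to cut off post-modifiers (assumption).
PREPOSITIONS = {"of", "in", "by", "from", "with"}
# Toy sense inventory using the labels from the slides; the first entry is
# treated as the preferred sense (assumption).
SENSES = {"singer": ["Singer#1", "Singer#17"], "origin": ["Origin#7"]}

def head_word(category):
    """Head word of a category name, e.g. 'singer' for American_singers_of_..."""
    words = category.split("_")
    for i, word in enumerate(words):
        if i > 0 and word in PREPOSITIONS:
            words = words[:i]      # drop "of_Jewish_origin" and the like
            break
    head = words[-1].lower()
    return head[:-1] if head.endswith("s") else head  # naive singular form

def superclass(category):
    """Preferred WordNet class for a category, or None if the head is unknown."""
    senses = SENSES.get(head_word(category))
    return senses[0] if senses else None

print(superclass("American_singer"))
print(superclass("American_singers_of_Jewish_origin"))
```

Cutting at the preposition is what keeps American_singers_of_Jewish_origin under Singer#1 rather than Origin#7.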
The YAGO ontology: Accuracy
Relation        Accuracy
subclass        97.70% +/- 1.59%
is a            94.54% +/- 2.36%
familyName      97.81% +/- 1.75%
givenName       97.62% +/- 2.08%
establishedIn   90.84% +/- 4.28%
bornInYear      93.14% +/- 3.71%
diedInYear      98.72% +/- 1.30%
locatedIn       98.41% +/- 1.52%
politicianOf    92.43% +/- 3.93%
writtenInYear   94.35% +/- 3.33%
hasWonPrize     98.47% +/- 1.57%
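Figures of the form "accuracy +/- margin" typically come from judging a random sample of facts by hand and computing a confidence interval. The talk does not say which interval was used; the sketch below assumes a Wilson score interval at the 95% level, and the sample counts are hypothetical, not data from the talk.

```python
import math

def wilson_interval(correct, total, z=1.96):
    """Return (center, half_width) of the Wilson score interval.

    z=1.96 corresponds to a 95% confidence level.
    """
    p = correct / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total
                         + z * z / (4 * total * total)) / denom
    return center, half

# Hypothetical sample: 95 of 100 judged facts were correct.
center, half = wilson_interval(95, 100)
print(f"{center:.2%} +/- {half:.2%}")
```

Unlike the plain normal approximation, this interval never extends beyond 100%, which matters for relations whose sampled accuracy is close to perfect.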
The YAGO ontology: Number of Facts
Ontology     Number of facts
KnowItAll    30,000
SUMO         60,000
WordNet      200,000
OpenCyc      300,000
Cyc          2,000,000
Yago         6,000,000
Ontologies should not be judged purely by the number of facts! This is just an informational overview.
The Yago Model: Why binary is not enough
(Elvis, is_a, singer)
(But only from 1953 to 1977)
(We know this from Wikipedia)
#1 (Elvis, is_a, singer)
#2 (#1, time, 1953-1977)
#3 (#1, source, Wikipedia)
The Yago model formally
A YAGO ontology over
- a set of relations R
- a set of common entities C
- a set of fact identifiers I
is a function
I → (R ∪ C ∪ I) × R × (R ∪ I ∪ C)
Example:
#1 (Elvis, is_a, singer)
#2 (#1, time, 1953-1977)
#3 (#1, source, Wikipedia)
We can talk about
- facts: (#1, source, Wikipedia)
- additional arguments: (#1, time, 1953-1977)
- relations: (time, hasRange, time_interval)
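The definition above can be mirrored by a dictionary from fact identifiers to triples. This is a minimal sketch, not YAGO's implementation: the function names `is_valid_ontology` and `about` are hypothetical, and identifier #4 is added here only so that the relation example fits into the same structure.

```python
def is_valid_ontology(facts, relations, entities):
    """Check that facts maps I -> (R u C u I) x R x (R u I u C).

    facts: dict from fact identifier to (subject, relation, object).
    """
    identifiers = set(facts)
    universe = set(relations) | set(entities) | identifiers
    return all(
        s in universe and r in relations and o in universe
        for (s, r, o) in facts.values()
    )

def about(facts, fact_id):
    """All facts whose subject is the given fact identifier."""
    return {i: f for i, f in facts.items() if f[0] == fact_id}

R = {"is_a", "time", "source", "hasRange"}
C = {"Elvis", "singer", "1953-1977", "Wikipedia", "time_interval"}
FACTS = {
    "#1": ("Elvis", "is_a", "singer"),
    "#2": ("#1", "time", "1953-1977"),      # additional argument of #1
    "#3": ("#1", "source", "Wikipedia"),    # provenance of #1
    "#4": ("time", "hasRange", "time_interval"),  # a fact about a relation
}

print(is_valid_ontology(FACTS, R, C))
print(about(FACTS, "#1"))
```

Because identifiers and relations are allowed as subjects and objects, time spans, sources, and relation properties all stay binary facts.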
The Yago model: Logical aspects
Axioms:
(x, is_a, y)
(y, subclass, z)
=> (x, is_a, z)
...
[Figure: Elvis is a singer; singer is a subclass of person; hence Elvis is a person.]
[Figure: starting from the facts f1, ..., f5, the axioms derive the larger set f1, ..., f10 (finite, unique); eliminating derivable facts yields the smaller set f1, f2, f3 (finite, unique).]
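Deriving and eliminating facts can be sketched for the single axiom shown on the slide. This is an illustration under assumptions: the other axioms hidden behind "..." are not modelled, and the greedy elimination below is a simple stand-in whose result is not claimed to be unique in general.

```python
def derive(facts):
    """Close a set of triples under: (x is_a y), (y subclass z) => (x is_a z)."""
    closed = set(facts)
    changed = True
    while changed:
        changed = False
        for (x, r1, y) in list(closed):
            for (y2, r2, z) in list(closed):
                if r1 == "is_a" and r2 == "subclass" and y == y2:
                    new = (x, "is_a", z)
                    if new not in closed:
                        closed.add(new)
                        changed = True
    return closed

def eliminate(facts):
    """Greedily drop every fact that the remaining facts can still derive."""
    base = set(facts)
    for fact in sorted(facts):
        if fact in derive(base - {fact}):
            base.discard(fact)
    return base

START = {("Elvis", "is_a", "singer"), ("singer", "subclass", "person")}
closure = derive(START)
print(sorted(closure))
print(sorted(eliminate(closure)))
```

On this toy input, eliminating from the closure returns exactly the two starting facts, which is the round trip the figure depicts.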
Extending the Ontology
Whom did Elvis marry?
Text: "Elvis married Priscilla"
Pattern: X married Y
Answer: Priscilla
Text: "Elvis, the great rock star, married Priscilla"
Pattern with grammatical roles: X (subj) married Y (obj)
Answer: Priscilla
with LEILA
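The idea of matching "X married Y" even when extra words intervene can be sketched with a regular expression. This is explicitly not how LEILA works: the slide's subj/obj labels point to grammatical structure, whereas the sketch swaps in a surface regex that merely tolerates one comma-delimited insertion. The relation name `marriedTo` and the function name are hypothetical.

```python
import re

# Surface pattern standing in for a grammatical subj/obj pattern (assumption).
PATTERN = re.compile(
    r"(?P<subj>[A-Z]\w+)"      # subject: one capitalised token
    r"(?:,[^,]*,)?"            # optional insertion between two commas
    r"\s+married\s+"
    r"(?P<obj>[A-Z]\w+)"       # object: one capitalised token
)

def extract_married(sentence):
    """Return a (subject, relation, object) triple, or None if nothing matches."""
    m = PATTERN.search(sentence)
    if m is None:
        return None
    return (m.group("subj"), "marriedTo", m.group("obj"))

print(extract_married("Elvis married Priscilla"))
print(extract_married("Elvis, the great rock star, married Priscilla"))
```

A surface pattern like this breaks as soon as the sentence structure varies further, which is the motivation for working on grammatical roles instead.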
Ontology (YAGO)
Information Extraction (LEILA)
The Truth about Elvis
http://www.mpi-inf.mpg.de/~suchanek/downloads/yago/
Which astronaut was born in the same year as Elvis?
Enter your Yago Query:
"Elvis Presley" bornInYear $year
$astro bornInYear $year
$astro isa astronaut
20 results
Which astronaut codenamed "Roger" was born in the same year as Elvis?
Enter your Yago Query:
"Elvis Presley" bornInYear $year
$astro bornInYear $year
"Roger" givenNameOf $astro
$astro isa astronaut
$astro = Roger_Chaffee
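Queries like the ones above are lists of triple patterns with shared variables. The sketch below shows one simple way to evaluate them by extending variable bindings pattern by pattern. It is not the actual Yago query engine, and `FACTS` is a tiny hand-made fact list, so it returns a single answer where the real system reported 20 for the first query.

```python
# Tiny hand-made fact list for illustration only (not the YAGO data).
FACTS = [
    ("Elvis Presley", "bornInYear", "1935"),
    ("Roger_Chaffee", "bornInYear", "1935"),
    ("Roger_Chaffee", "isa", "astronaut"),
    ("Roger", "givenNameOf", "Roger_Chaffee"),
    ("Neil_Armstrong", "bornInYear", "1930"),
    ("Neil_Armstrong", "isa", "astronaut"),
]

def match(pattern, fact, binding):
    """Extend binding so that pattern equals fact, or return None."""
    new = dict(binding)
    for p, f in zip(pattern, fact):
        if p.startswith("$"):              # variable
            if new.setdefault(p, f) != f:  # already bound to something else
                return None
        elif p != f:                       # constant that does not match
            return None
    return new

def query(patterns, facts):
    """Return all variable bindings that satisfy every pattern."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for fact in facts:
                result = match(pattern, fact, binding)
                if result is not None:
                    extended.append(result)
        bindings = extended
    return bindings

results = query([
    ("Elvis Presley", "bornInYear", "$year"),
    ("$astro", "bornInYear", "$year"),
    ("Roger", "givenNameOf", "$astro"),
    ("$astro", "isa", "astronaut"),
], FACTS)
print(results)
```

Each pattern filters and extends the bindings produced by the previous ones, so the shared variable `$year` is what ties the astronaut to Elvis.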
Conclusions
- Yago is based on a logically clean model
- Yago has an accuracy of around 95%
- Yago is 3 times larger than the largest competitor
- Elvis is alive
Reference
For all details, please refer to our technical report
"Yago – A Core of Semantic Knowledge"
(Fabian M. Suchanek, Gjergji Kasneci, Gerhard Weikum)
available at http://www.mpii.mpg.de/~suchanek
BibTeX:
@TECHREPORT{yagotr,
  AUTHOR = {Suchanek, Fabian and Kasneci, Gjergji and Weikum, Gerhard},
  TITLE = {Yago: A Core of Semantic Knowledge},
  TYPE = {Research Report},
  INSTITUTION = {Max-Planck-Institut f{\"u}r Informatik},
  ADDRESS = {Stuhlsatzenhausweg 85, 66123 Saarbr{\"u}cken, Germany},
  NUMBER = {MPI-I-2006-5-006},
  YEAR = {2006}
}