Semantic Parsing and Beyond to Create a
Commonsense Knowledge Base
Valerio Basile
Institute for Language and Information Science
University of Düsseldorf
19/4/2018
Valerio Basile
University of Groningen, Inria (Sophia Antipolis), Sapienza (Roma), Università di Torino
Computational Semantics, Semantic Web, Natural Language Generation, Information Extraction, Linguistic Annotation,
Distributional Semantics, General Knowledge Bases, Gamification, Social Media, Sentiment Analysis, Legal
Informatics, Argument Mining, Math, Pasta, Videogames, ...
whoami
Robotics and Artificial Intelligence
Objects
Linguistics and Semantics
Machine Learning and Clustering
Today
● I Motivation: The Semantics of Objects
● II Objects, Knowledge and The Web
● III Objects, Words and Vectors
● IV Frames and Prototypical Knowledge
● V Default Knowledge about Objects
Part I
Motivation: The Semantics of Objects
5-year CHIST-ERA funded project (2014-2018)
4 EU partners
Deploy robots in human-inhabited environments.
The robots autonomously collect real-world data.
We use information available on the Semantic Web to identify the semantics of objects.
● Object classification
● Room detection
● Frame detection
● Inference
● ...
Frame Semantics
Bob, I want some pane!
Frame Semantics
Frame name
Frame type
Frame element
role
Part II
Objects, Knowledge and The Web
Classification
What is (not) an object?
What type is an object?
What is a room?
...
Relations
How are objects related?
Where is an object?
What can I do with an object?
...
Object Knowledge
http://lod-cloud.net/
Linked Open Data
             Taxonomy  Function  Location  Linked Data
DBpedia      ✔         ✘         ✘         ✔
ConceptNet   ✔         ✔         ✔         partly
KnowRob      ✔         ✔         partly    ✘
BabelNet     ✔         ✘         ✘         ✔
SUN          ✘         ✘         ✔         ✘
Keyword Linking Methods
DBpedia Lookup: “official” search API of DBpedia
String Match (+redirect): try http://dbpedia.org/resource/{KEYWORD}
Babelfy: state-of-the-art algorithm for Word Sense Disambiguation/Entity Linking
Keyword Linking Methods
Vector-based contextual disambiguation
● Run String Match on the keywords
● Split the missed keywords into tokens
● Run String Match on the tokens
● Compute the semantic similarity of each token-entity with all the previously recognized entities
● Select the highest scoring token-entity
e.g., basket_of_banana → dbr:Basket
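The steps above can be sketched roughly as follows. This is only an illustration: the vectors are made-up toy values, `string_match` is a hypothetical stand-in for the DBpedia lookup, and summing the cosine scores over the context entities is an assumed way of aggregating the similarities.

```python
import math

# Toy entity vectors; the values are illustrative assumptions, not real data.
VECTORS = {
    "dbr:Basket": [0.9, 0.1, 0.3],
    "dbr:Banana": [0.1, 0.9, 0.2],
    "dbr:Kitchen": [0.8, 0.2, 0.4],
    "dbr:Table_(furniture)": [0.7, 0.1, 0.5],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def string_match(keyword):
    """Hypothetical String Match step: return an entity if one exists, else None."""
    candidate = "dbr:" + keyword.capitalize()
    return candidate if candidate in VECTORS else None

def link_missed_keyword(keyword, context_entities):
    """Split a missed keyword into tokens and keep the best-scoring token-entity."""
    candidates = [e for e in map(string_match, keyword.split("_")) if e]
    if not candidates:
        return None

    def score(entity):
        # Assumed aggregation: sum of similarities to the recognized entities.
        return sum(cosine(VECTORS[entity], VECTORS[c]) for c in context_entities)

    return max(candidates, key=score)

context = ["dbr:Kitchen", "dbr:Table_(furniture)"]
linked = link_missed_keyword("basket_of_banana", context)
print(linked)
```

With these toy vectors the token-entity for "basket" is closer to the context than the one for "banana", so it is the one selected.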
The SUN database
https://groups.csail.mit.edu/vision/SUN/
131,067 Images
908 Scene categories
313,884 Segmented objects
4,479 Object categories
The SUN database
Results
2,493 objects in DBpedia
679 locations in DBpedia
2,935 object-location relations
The SUN database
Classification
Relations
Part III
Objects, Words and Vectors
Problem
Classification is good, but relations are sparse
Object Knowledge
Distributional Hypothesis
Word 1
(entity mention)
Word 2
(entity mention)
Entity 1 Entity 2
Semantic Relatedness
Co-occurrence
Distributional Relational Hypothesis
Entity 1 Entity 2
Type A Type B
Semantic Relation
Semantic Relatedness
Entity 1 Entity 2
Object Room
isLocatedAt
Semantic Relatedness
Distributional Relational Hypothesis
Semantic Relatedness
             Washing_machine  Ashtray
Bathroom     5                2
Bedroom      0                1
Living_room  1                6
Co-occurrence matrix
Singular value decomposition
$M = U \Sigma V^{*}$
$M_k = U_k \Sigma_k V_k^{*}$
Low-rank approximation
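A minimal sketch of the low-rank approximation, applied to the toy co-occurrence counts from the table above. The choice of NumPy and the helper name `low_rank` are assumptions made for illustration.

```python
import numpy as np

# Rows: Bathroom, Bedroom, Living_room; columns: Washing_machine, Ashtray.
M = np.array([[5.0, 2.0],
              [0.0, 1.0],
              [1.0, 6.0]])

# Thin SVD: M = U @ diag(s) @ Vh, singular values in descending order.
U, s, Vh = np.linalg.svd(M, full_matrices=False)

def low_rank(k):
    """Rank-k approximation M_k = U_k Sigma_k V_k^*."""
    return U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :]

M1 = low_rank(1)  # keeps only the largest singular value
M2 = low_rank(2)  # full rank for this 3x2 matrix, so it reproduces M

print(np.round(M1, 2))
print(np.linalg.norm(M - M1))  # Frobenius error of the rank-1 approximation
```

The Frobenius error of the rank-1 approximation equals the discarded singular value `s[1]`, which is what makes truncation a principled way to compress the matrix.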
NASARI: A Novel Approach to a Semantically-Aware Representation of Items
(Camacho-Collados, Pilehvar and Navigli, 2015)
Semantic Similarity
bn:00008995n Bathroom 0.03750793 0.06731935 0.02334246 0.02009913 0.02251291 0.07689607 0.01527985 0.10780967 0.18232885 0.1234034 0.0520944 0.25805958 0.12200121 0.04875973 0.03544397 0.03841146 0.00970973 …
bn:00007365n Washing_machine 0.00911299 0.11549547 0.04274256 0.03672424 0.06627292 0.13761881 0.01171631 0.08721243 0.08270955 0.13095092 0.00137408 0.16226186 0.0422162 0.0545828 0.01007292 0.10094466 0.05663372 0.09864459 0.10167608 7.534e-05 0.08067719 0.05527394
Cosine similarity:
$\frac{A \cdot B}{\|A\| \|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \sqrt{\sum_{i=1}^{n} B_i^2}}$
http://lcl.uniroma1.it/nasari/
Semantic Similarity
Evaluation: locatedAt
Gold standard: SUN database linked to DBpedia
Extract the top k object-location pairs
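Extracting the top k pairs can be sketched as ranking every object-location pair by cosine similarity. The vectors below are toy values, the gold set is a made-up two-pair example, and precision at k is only one possible way to compare against a gold standard; none of this is taken from the actual evaluation.

```python
import numpy as np

# Toy vectors (assumed values) standing in for the entity vectors.
objects = {
    "Washing_machine": np.array([0.9, 0.1, 0.1]),
    "Ashtray": np.array([0.1, 0.2, 0.9]),
}
locations = {
    "Bathroom": np.array([0.8, 0.2, 0.1]),
    "Living_room": np.array([0.2, 0.3, 0.8]),
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k_pairs(objects, locations, k):
    """Score all object-location pairs and return the k most similar ones."""
    scored = [(cosine(ov, lv), o, l)
              for o, ov in objects.items()
              for l, lv in locations.items()]
    scored.sort(reverse=True)
    return [(o, l) for _, o, l in scored[:k]]

def precision_at_k(predicted, gold):
    """Fraction of predicted pairs that appear in the gold standard."""
    return sum(pair in gold for pair in predicted) / len(predicted)

# Hypothetical gold standard for the toy example.
gold = {("Washing_machine", "Bathroom"), ("Ashtray", "Living_room")}
best = top_k_pairs(objects, locations, k=2)
print(best, precision_at_k(best, gold))
```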
Evaluation: usedFor
Gold standard: ConceptNet linked to DBpedia
Extract the top k object-action pairs
931 high confidence location relations
Only 52 were in the gold standard set
E.g.:
Trivet → Kitchen
Flight_bag → Airport_lounge
Soap_dispenser → Unisex_public_toilet
+ many related datasets: https://project.inria.fr/aloof/data/
Results
Object-action relation (usedFor)
Extracting common sense knowledge via triple ranking using supervised and unsupervised distributional models
S Jebbara, V Basile, E Cabrio, P Cimiano, Semantic Web 2018
Improving object detection
Semantic web-mining and deep vision for lifelong object discovery
J Young, L Kunze, V Basile, E Cabrio, N Hawes, B Caputo, Robotics and Automation, ICRA 2017
Object-location relation (locatedAt)
Populating a knowledge base with object-location relations using distributional semantics
V Basile, S Jebbara, E Cabrio, P Cimiano, EKAW 2016
Distributional Relational Hypothesis
Part IV
Frames and Prototypical Knowledge
Problem
The distributional relational hypothesis is limited to specific relations
Frame Semantics
FrameNet (1997), Framester (2016), Framebase (2015)
Frame name
Frame type
Frame element
role
Frame Instance
Instance id: <fi12345>
Frame type: fbframe:Cooking
Frame elements:
● fbfe:Instrument, dbr:Knife
● fbfe:Agent, dbr:Person
● …
Default Knowledge → Prototypical Frame Instances
= F.I. extraction + F.I. clustering
Knowledge Extraction
Semantic Parsing
+ Word Sense Disambiguation & linking
+ Alignment
V. Basile, E. Cabrio, C. Schon
KNEWS: Using Logical and Lexical Semantics to Extract Knowledge from Natural Language
ECAI 2016
Knowledge Extraction
https://github.com/valeriobasile/learningbyreading
Pipeline: Text (Natural Language)
→ Semantic Parsing, Word Sense Disambiguation, Entity Linking
→ Discourse Representation Structure, Semantic Roles, WordNet Synsets, DBpedia Entities, FrameNet Frames
→ Alignment
→ RDF Triples
Knowledge Extraction
https://github.com/valeriobasile/learningbyreading
Semantic Parsing: Boxer, Semafor
Word Sense Disambiguation: Babelfy, UKB
Entity Linking: Babelfy, Spotlight
Deeper Natural Language Processing
C&C Tools, Boxer (Curran, Clark and Bos 2007)
http://valeriobasile.github.io/candcapi/
Discourse Representation Structure
Deeper Natural Language Processing
Same entity!
Deeper Natural Language Processing
event
semantic roles
Deeper Natural Language Processing
event → Frame
semantic roles → Frame elements
Word Sense Disambiguation
http://babelfy.org (Navigli and Ponzetto, 2012)
Deeper Natural Language Processing
Framebase mapping
Event → Word sense → FrameNet Frame
SemLink mapping
VerbNet role + FrameNet frame → FrameNet role
Example:
serve(e2) → serve.v.01 → Offering
agent(x1, e2) + Offering → Offerer*
empty(e3) → empty.a.01 → Fullness
cup(x3) + Fullness → Container*
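The two mappings in the example can be pictured as chained lookups. The tables below hold only the two cases shown on the slide, and the function names are hypothetical; the real Framebase and SemLink resources are far larger and are not reproduced here.

```python
# Lookup tables containing only the examples from the slide.
EVENT_TO_SENSE = {"serve": "serve.v.01", "empty": "empty.a.01"}
SENSE_TO_FRAME = {"serve.v.01": "Offering", "empty.a.01": "Fullness"}
# Keys follow the slide: the left-hand predicate plus the frame it occurs in.
ROLE_TO_FRAME_ELEMENT = {
    ("agent", "Offering"): "Offerer",
    ("cup", "Fullness"): "Container",
}

def map_event(event):
    """Event -> word sense -> FrameNet frame; None if either step is unknown."""
    sense = EVENT_TO_SENSE.get(event)
    return SENSE_TO_FRAME.get(sense)

def map_role(role, frame):
    """Role + FrameNet frame -> FrameNet frame element; None if unknown."""
    return ROLE_TO_FRAME_ELEMENT.get((role, frame))

frame = map_event("serve")
print(frame, map_role("agent", frame))
```

Returning `None` for unmapped events or roles keeps the gaps in the resources visible instead of guessing a frame.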
Frame Instance Extraction
# Namespace prefixes
@prefix fbfi: <http://framebase.org/ns/fi-> .
@prefix fbframe: <http://framebase.org/ns/frame-> .
@prefix fbfe: <http://framebase.org/ns/fe-> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix bn: <http://babelnet.org/rdf/> .
# One frame instance per evoked frame, each typed and linked to its fillers
fbfi:People_01b52400 rdf:type fbframe:People .
fbfi:People_01b52400 fbfe:Person bn:s00001533n .
fbfi:Cardinal_numbers_3faa6c9c rdf:type fbframe:Cardinal_numbers .
fbfi:Cardinal_numbers_3faa6c9c fbfe:Entity bn:s00001533n .
fbfi:Being_located_079aed4d rdf:type fbframe:Being_located .
fbfi:Being_located_079aed4d fbfe:Theme bn:s00001533n .
fbfi:Being_located_079aed4d fbfe:Location bn:s00009850n .
"Two men are sleeping on a bench"
Frame Instance Extraction
Frame Similarity
Instance id: <fi12345>
Frame type: fbframe:Cooking
Frame elements:
● fbfe:Instrument, dbr:Knife
● fbfe:Agent, dbr:Person
● …
Instance id: <fi67890>
Frame type: fbframe:Eating
Frame elements:
● fbfe:Instrument, dbr:Fork
● fbfe:Agent, dbr:Person
● …
frame types
Frame Similarity
Instance id: <fi12345>
Frame type: fbframe:Cooking
Frame elements:
● fbfe:Instrument, dbr:Knife
● fbfe:Agent, dbr:Person
● …
Instance id: <fi67890>
Frame type: fbframe:Eating
Frame elements:
● fbfe:Instrument, dbr:Fork
● fbfe:Agent, dbr:Person
● …
frame elements
Frame Similarity
Measuring Frame Instance Relatedness
V. Basile, R. Lopez Condori, E. Cabrio
*SEM 2018 (accepted)
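The slides compare two frame instances along two dimensions: their frame types and their frame elements. A rough sketch of how the two could be combined is given below. It is not the relatedness measure of the paper: exact type match, Jaccard overlap of (role, filler) pairs, and the weight `alpha` are all stand-ins chosen for illustration.

```python
def element_pairs(instance):
    """Set of (role, filler) pairs of a frame instance."""
    return set(instance["elements"])

def frame_instance_similarity(a, b, alpha=0.5):
    """Weighted mix of frame-type and frame-element similarity (illustrative)."""
    # Stand-in type similarity: 1 for identical frame types, 0 otherwise.
    type_sim = 1.0 if a["type"] == b["type"] else 0.0
    # Stand-in element similarity: Jaccard overlap of (role, filler) pairs.
    pairs_a, pairs_b = element_pairs(a), element_pairs(b)
    union = pairs_a | pairs_b
    element_sim = len(pairs_a & pairs_b) / len(union) if union else 0.0
    return alpha * type_sim + (1 - alpha) * element_sim

cooking = {"id": "fi12345", "type": "fbframe:Cooking",
           "elements": [("fbfe:Instrument", "dbr:Knife"),
                        ("fbfe:Agent", "dbr:Person")]}
eating = {"id": "fi67890", "type": "fbframe:Eating",
          "elements": [("fbfe:Instrument", "dbr:Fork"),
                       ("fbfe:Agent", "dbr:Person")]}

print(frame_instance_similarity(cooking, eating))
```

A graded similarity between frame types and between fillers (for example from entity vectors) would slot into the same structure in place of the exact-match stand-ins.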
Frame Similarity
Semantic Textual Similarity shared task dataset
250 sentence pairs
1,650 frame instances with KNEWS
178 frame types, ~1.2 frame elements each
457 concepts
Clustering Frame Instances
Pilot Study
Text for language learners (1,653 short stories)
114,536 frame instances, 154,422 frame elements, 686 frame types, 222 roles filled by 3,398 types of concepts.
Hierarchical clustering with our distance metric: complete-linkage agglomerative (SciPy)
Frame Instance Extraction and Clustering for Default Knowledge Building
A. Shah, V. Basile, E. Cabrio, S. Kamath S.
Applications of Semantic Web technologies in Robotics - ANSWER 17
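Complete-linkage agglomerative clustering with SciPy can be sketched as follows. The distance matrix is a toy example over four hypothetical frame instances, and the cut-off threshold of 0.5 is an assumption; the pilot study's own distance metric and settings are not reproduced here.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Toy symmetric distance matrix over four frame instances (assumed values).
D = np.array([[0.0, 0.1, 0.8, 0.9],
              [0.1, 0.0, 0.7, 0.8],
              [0.8, 0.7, 0.0, 0.2],
              [0.9, 0.8, 0.2, 0.0]])

# linkage expects the condensed (upper-triangle) form of the matrix.
condensed = squareform(D)

# Complete linkage: cluster distance is the maximum pairwise distance.
Z = linkage(condensed, method="complete")

# Cut the dendrogram at an assumed distance threshold to get flat clusters.
labels = fcluster(Z, t=0.5, criterion="distance")
print(labels)
```

Passing a precomputed distance matrix keeps the clustering independent of how the frame instance distance itself is defined.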
Pilot Study
Pilot Study
Most frequent frame type, role and element from each cluster
<http://framebase.org/fbframe/Ride_vehicle>
<http://framebase.org/fbfe/Vehicle>
<http://wordnet-rdf.princeton.edu/wn31/02837983-n>
300 triples, available at http://project.inria.fr/aloof/data/
Bicycle
Part V
Default Knowledge about Objects
Default Knowledge about Objects
RDF dataset of common sense knowledge about objects.
Object classification, prototypical location, actions, frames...
Knowledge extracted from parsing, crowdsourcing, distributional semantics, keyword linking
http://deko.inria.fr/
Default Knowledge about Objects
10,990 nquads (named graphs)
603 from crowdsourcing
1,221 from distributional relational hypothesis
8,046 from keyword linking
1,120 from KNEWS/frame instance clustering
+ DeKO ontology
http://deko.inria.fr/
The End
(Q/A)