Natural Language Knowledge Graphs
Open-IE meets Knowledge Representation
Ido Dagan, Bar-Ilan University, Israel
Knowledge Representation (KR)
Two complementary frameworks:
• Knowledge graphs
  • Formal pre-specified schema & predicates
  • Require (supervised) IE to populate from text
  • Targeting established knowledge
• Open IE
  • Arbitrary propositions found in text (anything said)
  • Represented in natural language terms
Our research line: extend Open IE towards a richer KR framework
From Berant et al., 2014
Appeal: complex aggregation queries via semantic parsing (beyond the scope of text QA)
E.g. politicians’ spouses who lived in Chicago
Open IE
• Extracts propositions as they appear in text
“…which makes aspirin relieve headaches.”
• No supervision (but may be incorporated?)
• No pre-defined schema
• Scalability
What’s missing in Open IE?
Structure!
• Intra-proposition structure
  • NL propositions are more than SVO tuples
• Inter-proposition structure
  • Globally consolidating and structuring the extracted information
  • E.g. aspirin relieve headache ⇔ aspirin treat headache
… This is also the structure of my talk
Enriching Proposition Structure and Coverage
Gabriel Stanovsky Jessica Ficler Ido Dagan Yoav Goldberg
Based on paper at Semantic Parsing Workshop @ ACL 2014
(work in progress)
(Curiosity, landed on, Mars)
(Curiosity, is a, rover)
(Curiosity, is a, science lab)
(Curiosity, explores, Mars)
Limitations of Current Proposition Structure
Open IE produces tuples of predicate and arguments
(NASA, launched, Curiosity)
(Curiosity, surveys, Mars’ surface)
(Curiosity, collects, rock samples)
• Falls short of capturing all information conveyed
• Falls short of representing internal structure of information
→ Enrich proposition representation and extraction
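A minimal sketch of the flat tuple representation, using extractions from the slides. The `Extraction` type and the `mentioning` helper are illustrative names chosen here, not part of any Open IE system.

```python
from typing import List, NamedTuple

class Extraction(NamedTuple):
    """A flat Open IE tuple: (argument, predicate, argument)."""
    arg1: str
    predicate: str
    arg2: str

extractions = [
    Extraction("NASA", "launched", "Curiosity"),
    Extraction("Curiosity", "landed on", "Mars"),
    Extraction("Curiosity", "surveys", "Mars' surface"),
    Extraction("Curiosity", "collects", "rock samples"),
]

def mentioning(entity: str, tuples: List[Extraction]) -> List[Extraction]:
    """Return the tuples whose argument string is exactly `entity`."""
    return [t for t in tuples if entity in (t.arg1, t.arg2)]
```

A lookup for "Mars" returns only the "landed on" tuple and misses (Curiosity, surveys, Mars' surface), because each argument is an opaque string. This is one way to see the missing internal structure that the slide points to.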
Extracting Implied Propositions
• Propositions can be implied from syntax
• Also implied by adjectives, nominalizations, conjunctions, etc.
Possessives: “Curiosity’s robotic arm is used to collect samples” → “Curiosity has a robotic arm”
Apposition: “Curiosity, the Mars rover, landed on Mars” → “Curiosity is the Mars rover”
• Propositions can be embedded
  NASA utilizes Curiosity to survey Mars
• Arguments and predicates may have internal structure
  Curiosity examines rock samples from Mars
Enriching Structure
NASA utilizes the Mars rover, Curiosity, to examine rock samples from Mars
Proposition Structures
• Predicate: is; Subject: Curiosity; Object: the Mars rover
• Predicate: utilize; Subject: NASA; Object: the Mars rover; Comp: examine
• Predicate: examine; Subject: NASA; Object: rock samples (Modifier: from Mars)
Q: “What is Curiosity?”
Q: “Who utilizes the Mars rover?”
Q: “What did NASA examine?”
NASA utilizes the Mars rover, Curiosity, to examine rock samples from Mars
Proposition Structures
• Predicate: is; Subject: Curiosity; Object: the Mars rover
• Predicate: utilize; Subject: NASA; Object: the Mars rover; Comp: examine
• Predicate: examine; Subject: the Mars rover; Object: rock samples (Modifier: from Mars)
• Explicitly represent implied propositions and embedded structure
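The proposition structures for the example sentence can be sketched as nested records. This is only an illustration: the `Node` and `Proposition` classes and the `subjects_of` helper are hypothetical names, and the subject of "examine" follows the structure that lists "the Mars rover".

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional

@dataclass
class Node:
    """An argument with optional internal structure (modifiers)."""
    head: str
    modifiers: List[str] = field(default_factory=list)

@dataclass
class Proposition:
    predicate: str
    subject: Optional[Node] = None
    obj: Optional[Node] = None
    comp: Optional["Proposition"] = None  # embedded proposition

examine = Proposition("examine", Node("the Mars rover"),
                      Node("rock samples", ["from Mars"]))
utilize = Proposition("utilize", Node("NASA"), Node("the Mars rover"),
                      comp=examine)
is_a = Proposition("is", Node("Curiosity"), Node("the Mars rover"))  # implied by apposition
roots = [is_a, utilize]

def flatten(prop: Optional[Proposition]) -> Iterator[Proposition]:
    """Yield a proposition and every proposition embedded under it."""
    while prop is not None:
        yield prop
        prop = prop.comp

def subjects_of(predicate: str, obj_head: str,
                structures: List[Proposition]) -> List[str]:
    """Answer 'Who <predicate> <object>?' over top-level and embedded propositions."""
    return [p.subject.head
            for root in structures
            for p in flatten(root)
            if p.predicate == predicate
            and p.subject is not None
            and p.obj is not None
            and p.obj.head == obj_head]
```

With these records, `subjects_of("utilize", "the Mars rover", roots)` finds NASA, and the embedded "examine" proposition is reachable through the `comp` link rather than being lost inside an argument string.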
Further Steps
• Soon: a tool which produces proposition structures
  • Generically – a better “syntax wrapper” for semantic processing (vs. dependency trees)
• Add sub-proposition factuality (truth assertion)
  • TruthTeller (Lotan et al., NAACL 2012)
  • OLLIE (Mausam et al., EMNLP 2012)
• Extract implied arguments
  • From discourse rather than syntax (Stern and Dagan, ACL 2014)
TruthTeller: Predicate Truth Value Annotation
Add features to nodes: pt+ (positive), pt- (negative), pt? (unknown)
Algorithm Overview
1. Main phenomena addressed
   1. Presupposition
   2. Factives, Implicatives
   3. Implication signature changes
   4. Negation (verbal, nominal, double…)
   5. Conjunctions, apposition…
2. Annotation rules identifying above phenomena
3. Large lexicon of factives, implicatives
4. Recursive algorithm for Natural Logic style calculus
(Kiparsky & Kiparsky, 1970)
(Karttunen, 1971; 2012)
(Lakoff, 1970)
(Nairn et al., 2006)
(MacCartney & Manning, 2009)
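To make the recursive step concrete, here is a toy sketch of truth-value propagation through embedded clauses. It is not TruthTeller's implementation: the three-entry `SIGNATURES` lexicon, the `Clause` type, and the `annotate` function are assumptions made for illustration, covering only negation, two implicatives, and one factive.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Toy implication signatures (assumed for illustration): truth value of the
# embedding predicate -> truth value of its complement.
SIGNATURES = {
    "manage": {"+": "+", "-": "-"},            # implicative
    "fail":   {"+": "-", "-": "+"},            # implicative, polarity-reversing
    "know":   {"+": "+", "-": "+", "?": "+"},  # factive: complement is presupposed
}

@dataclass
class Clause:
    predicate: str
    negated: bool = False
    complement: Optional["Clause"] = None

def flip(value: str) -> str:
    """Negation swaps + and - and leaves unknown as unknown."""
    return {"+": "-", "-": "+", "?": "?"}[value]

def annotate(clause: Clause, context: str = "+",
             out: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Recursively assign pt+/pt-/pt? to each predicate, outermost first."""
    if out is None:
        out = {}
    truth = flip(context) if clause.negated else context
    out[clause.predicate] = "pt" + truth
    if clause.complement is not None:
        # Predicates outside the toy lexicon leave their complement unknown.
        child = SIGNATURES.get(clause.predicate, {}).get(truth, "?")
        annotate(clause.complement, child, out)
    return out
```

For "did not fail to collect", `annotate(Clause("fail", True, Clause("collect")))` marks "fail" as pt- and "collect" as pt+, while a predicate missing from the lexicon, such as "want", leaves its complement at pt?.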
Focused Entailment Graphs
for Open IE Propositions
Omer Levy Ido Dagan Jacob Goldberger
CoNLL 2014
Adding Inter-proposition Structure to Open IE
Which structure?
• Build a graph of Open IE propositions and their entailment relations
Why entailment?
• Merges paraphrases into mutual entailment cliques
  aspirin relieves headache ⇔ aspirin treats headache
• Organizes information hierarchically from specific to general
  aspirin relieves headache ⇒ painkiller relieves headache
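One way to read "mutual entailment cliques" is as groups of propositions that are reachable from each other along entailment edges. The sketch below takes that reading and uses a simple reachability check suited to small graphs; the function names and the three example edges are assumptions for illustration, not the construction algorithm from the paper.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Prop = Tuple[str, str, str]

def reachable(graph: Dict[Prop, Set[Prop]], start: Prop) -> Set[Prop]:
    """All propositions entailed by `start`, directly or transitively (plus itself)."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def paraphrase_cliques(edges: List[Tuple[Prop, Prop]]) -> List[Set[Prop]]:
    """Group propositions that entail each other in both directions."""
    graph: Dict[Prop, Set[Prop]] = defaultdict(set)
    for premise, hypothesis in edges:
        graph[premise].add(hypothesis)
        graph[hypothesis]  # make sure every node has an entry
    reach = {node: reachable(graph, node) for node in list(graph)}
    cliques, assigned = [], set()
    for node in sorted(reach):
        if node in assigned:
            continue
        clique = {other for other in reach[node] if node in reach[other]}
        cliques.append(clique)
        assigned |= clique
    return cliques

a_relieve = ("aspirin", "relieve", "headache")
a_treat = ("aspirin", "treat", "headache")
p_relieve = ("painkiller", "relieve", "headache")
edges = [(a_relieve, a_treat), (a_treat, a_relieve), (a_relieve, p_relieve)]
cliques = paraphrase_cliques(edges)
```

Here the two aspirin propositions collapse into one clique, and the more general painkiller proposition stays a separate node above them.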
Original Open IE Output
aspirin, eliminate, headache
aspirin, cure, headache
headache, control with, aspirin
drug, relieve, headache
drug, treat, headache
analgesic, banish, headache
headache, respond to, painkiller
headache, treat with, caffeine
coffee, help, headache
tea, soothe, headache
Consolidated Open IE Output
aspirin, eliminate, headache
aspirin, cure, headache
headache, control with, aspirin
drug, relieve, headache
drug, treat, headache
analgesic, banish, headache
headache, respond to, painkiller
headache, treat with, caffeine
coffee, help, headache
tea, soothe, headache
Semantic Applications
• Example: Structured Queries
• “What relieves headaches?”
aspirin, eliminate, headache
aspirin, cure, headache
headache, control with, aspirin
drug, relieve, headache
drug, treat, headache
analgesic, banish, headache
headache, respond to, painkiller
headache, treat with, caffeine
coffee, help, headache
tea, soothe, headache
Structured Query:
aspirin
drug
analgesic
painkiller
caffeine
coffee
tea
Next step – graph aggregation: “Which drinks relieve headache?”
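A structured query such as “What relieves headaches?” can be sketched as a backward walk over entailment edges: start from the propositions that match the query and collect everything that entails them. The `answers` function and the edge list are hypothetical; in particular, the edges are hand-picked for illustration and are not taken from the gold-standard graphs.

```python
from collections import defaultdict
from typing import List, Tuple

Prop = Tuple[str, str, str]

def answers(query_predicate: str, topic: str,
            edges: List[Tuple[Prop, Prop]]) -> List[str]:
    """Arguments (other than `topic`) of every proposition that entails the query."""
    entailed_by = defaultdict(set)  # hypothesis -> premises that entail it
    nodes = set()
    for premise, hypothesis in edges:
        entailed_by[hypothesis].add(premise)
        nodes.update((premise, hypothesis))
    # Seed with propositions that match the query pattern directly.
    frontier = [p for p in nodes
                if p[1] == query_predicate and topic in (p[0], p[2])]
    seen = set(frontier)
    while frontier:
        for premise in entailed_by.get(frontier.pop(), ()):
            if premise not in seen:
                seen.add(premise)
                frontier.append(premise)
    return sorted({arg for p in seen for arg in (p[0], p[2]) if arg != topic})

drug_relieve = ("drug", "relieve", "headache")
# Illustrative edges only (premise entails hypothesis).
edges = [
    (("aspirin", "eliminate", "headache"), drug_relieve),
    (("headache", "control with", "aspirin"), drug_relieve),
    (("analgesic", "banish", "headache"), drug_relieve),
    (("drug", "treat", "headache"), drug_relieve),
    (drug_relieve, ("drug", "treat", "headache")),
]
result = answers("relieve", "headache", edges)
```

Because the walk ignores argument order, the passive-style proposition (headache, control with, aspirin) still contributes "aspirin" as an answer.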
Our Contributions
• Structuring Open IE with Proposition Entailment Graphs
• Dataset: 30 gold-standard graphs, 1.5 million entailment annotations
• Algorithm for constructing Focused Proposition Entailment Graphs
• Analysis: Predicate entailment is not quite what we thought
Algorithm
How do we recognize proposition entailment?
Observation: propositions entail ⇔ their lexical components entail
Proposition entailment is reduced to lexical entailment in context
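The reduction can be sketched as a conjunction over aligned components. This is a simplification under stated assumptions: the two propositions are taken to be already aligned slot by slot, and the `LEXICAL_ENTAILMENTS` table is a fixed toy lookup, whereas the talk stresses that lexical entailment must be judged in context.

```python
from typing import Tuple

Prop = Tuple[str, str, str]

# Toy lexical entailment pairs (premise term, hypothesis term); assumed for
# illustration and valid only in the headache-treatment context.
LEXICAL_ENTAILMENTS = {
    ("aspirin", "painkiller"),
    ("aspirin", "drug"),
    ("eliminate", "relieve"),
    ("relieve", "treat"),
    ("treat", "relieve"),
}

def lexical_entails(premise_term: str, hypothesis_term: str) -> bool:
    """A term entails itself and anything listed for it in the toy table."""
    return (premise_term == hypothesis_term
            or (premise_term, hypothesis_term) in LEXICAL_ENTAILMENTS)

def proposition_entails(premise: Prop, hypothesis: Prop) -> bool:
    """Entailment holds when every aligned component entails its counterpart."""
    return all(lexical_entails(p, h) for p, h in zip(premise, hypothesis))
```

Under this table, (aspirin, relieve, headache) entails (painkiller, relieve, headache) but not the reverse, and the "relieve" and "treat" variants entail each other.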
Lexical Entailment (Logistic)
$e = \sigma(w \cdot f)$
Lexical entailment features $f_1, f_2, f_3$ feed the lexical entailment variable $e$
Features
• WordNet Relations
• UMLS
• Distributional Similarity
• String Edit Distance
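The logistic model $e = \sigma(w \cdot f)$ can be written out directly. The feature values and weights below are made up for illustration; in the talk's setting the weights would be learned from supervision.

```python
import math
from typing import Sequence

def lexical_entailment_score(weights: Sequence[float],
                             features: Sequence[float]) -> float:
    """Logistic score e = sigma(w . f) for one pair of lexical items."""
    z = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical feature vector: [WordNet hypernym indicator,
# distributional similarity, string similarity].
features = [1.0, 0.8, 0.1]
weights = [2.0, 1.0, 0.5]  # hypothetical weights, not learned values
score = lexical_entailment_score(weights, features)
```

With all-zero weights the score is exactly 0.5, so any evidence the features carry shows up as movement away from that midpoint.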
Supervision
Are WordNet relations capturing real-world predicate entailments?
Why isn’t WordNet capturing predicate entailment?
Predicate Entailment vs WordNet Relations
Over a predicate inference subset, how many predicate entailments are covered by WordNet?
• Positive indicators: synonyms, hypernyms, entailment
• Negative indicators: antonyms, hyponyms, cohyponyms
Positive: 12%
Negative: 15%
None: 74%
Predicate Entailment is Context-Sensitive
The words do not necessarily entail, but the situations do.
Appeal of NL KR
• Scalable – in principle unlimited coverage
• Easy to communicate with people
  • Understand
  • Supervise – add knowledge (vs. in logic representation)
• May add additional links between propositions
  • Causality, temporal, argumentative
• Support at least some useful inferences
Integration with Logic-based Approaches
• Integrate with logical/formal representations for concrete phenomena
  • E.g. temporal, arithmetic, spatial
• Borrow ideas/methods from logic to apply over NL KR
  • Which are relevant and applicable?
Text Exploration via NL Knowledge Graphs
Customer interactions
Exploratory search
Example: Service issues
not happy with the catering
coffee is awful
coffee in economy is awful
no refreshments
food on train is too expensive
you charge too much for sandwiches
food quality is disappointing
bad food in premier
not enough food selection
provide veggie meals
not happy with the service
journey is too slow
no clear information
not happy with the staff
staff is unfriendly
no vegetarian food
expand meal options
sandwiches are overpriced
sandwiches are too expensive
disgusting coffee is served
they have horrible coffee
food is bad
not happy with the catering
coffee is awful / they have horrible coffee / disgusting coffee is served
coffee in economy is awful
no refreshments
food on train is too expensive
sandwiches are too expensive / sandwiches are overpriced / you charge too much for sandwiches
food is bad / food quality is disappointing
bad food in premier
not enough food selection / expand meal options
no vegetarian food / provide veggie meals
not happy with the service
journey is too slow
no clear information
not happy with the staff
staff is unfriendly
Customer Interactions Entailment Graph
not happy with the catering
coffee is awful
coffee in economy is awful
no refreshments
food on train is too expensive
sandwiches are too expensive
food is bad
bad food in premier
not enough food selection
no vegetarian food
not happy with the service
journey is too slow
no clear information
not happy with the staff
staff is unfriendly
not happy with the toilets
toilets are dirty
toilets are smelly
missing hygienic supplies
no soap in toilets
no toilet paper
not happy with train facilities
seats are uncomfortable
missing facilities
no children section
no WIFI
no AC in cars
facilities are bad
no AC
no AC in station
station is too crowded
Marshfield station is too crowded
cars are congested
pathway is too narrow
lack of personal space
lack of storage space
improve the website
online booking can be better
webpage shows old timetables
no website for Android
can’t find FAQ page
Conclusion: exciting research area
• Extend Open IE to become NL-based knowledge graph
• NL proposition structure
• Graph of inter-proposition relations
  • Entailment – consolidation and hierarchy for propositions
  • Other relations desired – causal, temporal, argumentative, …
• How does it integrate with formal/artificial language KR?
Thank You!