Post on 16-Dec-2015

Natural Language Knowledge Graphs

Open-IE meets Knowledge Representation

Ido Dagan, Bar-Ilan University, Israel

Knowledge Representation (KR)

Two complementary frameworks:

• Knowledge graphs
  • Formal pre-specified schema & predicates
  • Require (supervised) IE to populate from text
  • Targeting established knowledge

• Open IE
  • Arbitrary propositions found in text (anything said)
  • Represented in natural language terms

Our research line: extend Open IE towards a richer KR framework

From Berant et al., 2014

Appeal: complex aggregation queries via semantic parsing (beyond text-QA scope)

E.g. politicians’ spouses who lived in Chicago

Open IE

• Extracts propositions as they appear in text

“…which makes aspirin relieve headaches.”

• No supervision (but may be incorporated?)
• No pre-defined schema
• Scalability

What’s missing in Open IE?

Structure!

• Intra-proposition structure
  • NL propositions are more than SVO tuples

• Inter-proposition structure
  • Globally consolidating and structuring the extracted information

E.g. aspirin relieve headache ⇔ aspirin treat headache

… This is also the structure of my talk

Enriching Proposition Structure and Coverage

Gabriel Stanovsky Jessica Ficler Ido Dagan Yoav Goldberg

Based on paper at Semantic Parsing Workshop @ ACL 2014

(work in progress)

(Curiosity, Landed on, Mars)

(Curiosity, is a, rover)

(Curiosity, is a, science lab)

(Curiosity, landed on, Mars)

(Curiosity, explores, Mars)

Open IE produces tuples of predicate and arguments

(NASA, launched, Curiosity)

(Curiosity, surveys, Mars’ surface)

(Curiosity, collects, rock samples)

• Falls short of capturing all information conveyed

• Falls short of representing internal structure of information

→ Enrich proposition representation and extraction

Limitations of Current Proposition Structure

Extracting Implied Propositions

• Propositions can be implied from syntax

• Also implied by adjectives, nominalizations, conjunctions, etc.

Possessives: Curiosity’s robotic arm is used to collect samples → Curiosity has a robotic arm

Apposition: Curiosity, the Mars rover, landed on Mars → Curiosity is the Mars rover

• Propositions can be embedded

NASA utilizes Curiosity to survey Mars

• Arguments and predicates may have internal structure

Curiosity examines rock samples from Mars

Enriching Structure

NASA utilizes the Mars rover, Curiosity, to examine rock samples from Mars

Proposition Structures

Predicate: is
Subject: Curiosity
Object: the Mars rover

Predicate: utilize
Subject: NASA
Object: the Mars rover
Comp: examine

Predicate: examine
Subject: NASA
Object: rock samples
  Modifier: from Mars

Q: “What is Curiosity?”
Q: “Who utilizes the Mars rover?”
Q: “What did NASA examine?”

NASA utilizes the Mars rover, Curiosity, to examine rock samples from Mars

Proposition Structures

Predicate: is
Subject: Curiosity
Object: the Mars rover

Predicate: utilize
Subject: NASA
Object: the Mars rover
Comp: examine

Predicate: examine
Subject: the Mars rover
Object: rock samples
  Modifier: from Mars

• Explicitly represent implied propositions and embedded structure
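A minimal Python sketch of how such proposition structures might be held in memory and queried. The class and function names (`Argument`, `Proposition`, `walk`, `subjects_of`, `objects_of`) are hypothetical and are not taken from the authors' tool; the subject of "examine" follows the second version of the slide.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Argument:
    head: str                                            # e.g. "rock samples"
    modifiers: List[str] = field(default_factory=list)   # e.g. ["from Mars"]


@dataclass
class Proposition:
    predicate: str
    subject: str
    obj: Optional[Argument] = None
    comp: Optional["Proposition"] = None                  # embedded proposition


# "NASA utilizes the Mars rover, Curiosity, to examine rock samples from Mars"
examine = Proposition("examine", "the Mars rover",
                      Argument("rock samples", ["from Mars"]))
utilize = Proposition("utilize", "NASA", Argument("the Mars rover"), comp=examine)
is_a = Proposition("is", "Curiosity", Argument("the Mars rover"))  # from apposition
PROPOSITIONS = [is_a, utilize]


def walk(props: List[Proposition]) -> Iterator[Proposition]:
    """Yield every proposition, descending into embedded ones."""
    for p in props:
        yield p
        if p.comp is not None:
            yield from walk([p.comp])


def subjects_of(predicate: str, obj_head: str) -> List[str]:
    """Answer 'Who <predicate> <object>?'."""
    return [p.subject for p in walk(PROPOSITIONS)
            if p.predicate == predicate and p.obj and p.obj.head == obj_head]


def objects_of(predicate: str, subject: str) -> List[str]:
    """Answer 'What does <subject> <predicate>?'."""
    return [p.obj.head for p in walk(PROPOSITIONS)
            if p.predicate == predicate and p.subject == subject and p.obj]
```

With this layout, "What is Curiosity?" becomes `objects_of("is", "Curiosity")` and "Who utilizes the Mars rover?" becomes `subjects_of("utilize", "the Mars rover")`. The embedded "examine" proposition is reachable only because `walk` follows the `comp` link.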

Further Steps

• Soon: A tool which produces proposition structures
  • Generically – a better “syntax wrapper” for semantic processing (vs. dependency trees)

• Add sub-proposition factuality (truth assertion)
  • TruthTeller (Lotan et al., NAACL 2012)
  • OLLIE (Mausam et al., EMNLP 2012)

• Extract implied arguments
  • From discourse rather than syntax (Stern and Dagan, ACL 2014)

TruthTeller: Predicate Truth Value Annotation

Add features to nodes: pt+, pt-, pt?

Algorithm Overview

1. Main phenomena addressed
   1. Presupposition
   2. Factives, Implicatives
   3. Implication signature changes
   4. Negation (verbal, nominal, double…)
   5. Conjunctions, apposition…

2. Annotation rules identifying above phenomena

3. Large lexicon of factives, implicatives

4. Recursive algorithm for Natural Logic style calculus

(Kiparsky & Kiparsky 1970)
(Karttunen 1971; 2012)
(Lakoff, 1970)
(Nairn et al., 2006)
(MacCartney & Manning, 2009)
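A toy Python sketch of a recursive truth-value calculus of this kind. It is not the TruthTeller implementation: the three-entry lexicon `SIGNATURES`, the `Node` class and the `annotate` function are hypothetical, and only verbal negation, two implicatives and one factive are covered.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical mini-lexicon: for each predicate, the truth value its complement
# receives when the predicate itself is true ("+") or false ("-").
SIGNATURES = {
    "manage": {"+": "+", "-": "-"},   # implicative
    "fail":   {"+": "-", "-": "+"},   # implicative, polarity-reversing
    "know":   {"+": "+", "-": "+"},   # factive: complement is presupposed
}
NO_IMPLICATION = {"+": "?", "-": "?"}   # any predicate missing from the lexicon
FLIP = {"+": "-", "-": "+", "?": "?"}


@dataclass
class Node:
    predicate: str
    negated: bool = False
    complement: Optional["Node"] = None


def annotate(node: Node, context: str = "+") -> List[Tuple[str, str]]:
    """Return (predicate, truth value) pairs, top-down, with values pt+/pt-/pt?."""
    pt = FLIP[context] if node.negated else context
    result = [(node.predicate, pt)]
    if node.complement is not None:
        sig = SIGNATURES.get(node.predicate, NO_IMPLICATION)
        if pt in sig:
            child_context = sig[pt]
        else:
            # Unknown truth value: the complement is determined only if both
            # polarities agree (as they do for factives).
            child_context = sig["+"] if sig["+"] == sig["-"] else "?"
        result += annotate(node.complement, child_context)
    return result
```

For "John did not fail to open the door", `annotate(Node("fail", negated=True, complement=Node("open")))` marks "fail" as pt- and "open" as pt+. A predicate absent from the lexicon, such as "want", leaves its complement at pt?.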

Focused Entailment Graphs for Open IE Propositions

Omer Levy Ido Dagan Jacob Goldberger

CoNLL 2014

Adding Inter-proposition Structure to Open IE

Which structure?
• Build a graph of Open IE propositions and their entailment relations

Why entailment?
• Merges paraphrases into mutual entailment cliques
  aspirin relieves headache ⇔ aspirin treats headache
• Organizes information hierarchically from specific to general
  aspirin relieves headache ⇒ painkiller relieves headache

Original Open IE Output

aspirin, eliminate, headache
aspirin, cure, headache
headache, control with, aspirin
drug, relieve, headache
drug, treat, headache
analgesic, banish, headache
headache, respond to, painkiller
headache, treat with, caffeine
coffee, help, headache
tea, soothe, headache

Consolidated Open IE Output

[Figure: the same ten propositions arranged as an entailment graph]
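A small Python sketch of the consolidation step: given directed entailment edges between propositions, mutually entailing propositions are merged into cliques and the remaining edges form the specific-to-general hierarchy. The edges in `EDGES` are hypothetical illustrations, not the annotated graph from the talk, and the brute-force reachability is only meant for tiny graphs.

```python
from collections import defaultdict

# Hypothetical entailment edges (premise -> hypothesis), for illustration only.
EDGES = [
    (("aspirin", "eliminate", "headache"), ("aspirin", "cure", "headache")),
    (("aspirin", "cure", "headache"), ("aspirin", "eliminate", "headache")),
    (("aspirin", "cure", "headache"), ("drug", "relieve", "headache")),
    (("drug", "relieve", "headache"), ("drug", "treat", "headache")),
    (("drug", "treat", "headache"), ("drug", "relieve", "headache")),
    (("analgesic", "banish", "headache"), ("drug", "relieve", "headache")),
]


def reachable(start, graph):
    """All nodes reachable from start by following entailment edges."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def consolidate(edges):
    """Merge mutually entailing propositions; return (cliques, clique-level edges)."""
    graph, nodes = defaultdict(set), set()
    for premise, hypothesis in edges:
        graph[premise].add(hypothesis)
        nodes.update((premise, hypothesis))
    reach = {n: reachable(n, graph) for n in nodes}
    # Two propositions share a clique when each entails the other.
    clique_of = {n: frozenset(m for m in nodes if m in reach[n] and n in reach[m])
                 for n in nodes}
    clique_edges = {(clique_of[p], clique_of[h])
                    for p, h in edges if clique_of[p] != clique_of[h]}
    return set(clique_of.values()), clique_edges
```

On these sample edges, `consolidate(EDGES)` yields three cliques (the two aspirin paraphrases, the two drug paraphrases, and the analgesic proposition on its own), with the aspirin and analgesic cliques both pointing to the more general drug clique.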

Semantic Applications

• Example: Structured Queries

• “What relieves headaches?”

Structured Query:

[Figure: the query shown against the ten propositions of the Open IE output]

Matching arguments:

aspirin
drug
analgesic
painkiller
caffeine
coffee
tea
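A hedged Python sketch of answering "What relieves headaches?" over the ten propositions. It assumes the entailment graph has already told us which predicate templates entail "X relieve Y"; that knowledge is hard-coded in the hypothetical table `ENTAILS_RELIEVE`, which also records which slot holds the remedy, since templates such as "control with" put it in the last position.

```python
PROPOSITIONS = [
    ("aspirin", "eliminate", "headache"),
    ("aspirin", "cure", "headache"),
    ("headache", "control with", "aspirin"),
    ("drug", "relieve", "headache"),
    ("drug", "treat", "headache"),
    ("analgesic", "banish", "headache"),
    ("headache", "respond to", "painkiller"),
    ("headache", "treat with", "caffeine"),
    ("coffee", "help", "headache"),
    ("tea", "soothe", "headache"),
]

# Hypothetical: predicates assumed to entail "X relieve Y", mapped to the index
# of the slot that holds X (0 = first argument, 2 = second argument).
ENTAILS_RELIEVE = {
    "eliminate": 0, "cure": 0, "relieve": 0, "treat": 0, "banish": 0,
    "help": 0, "soothe": 0,
    "control with": 2, "respond to": 2, "treat with": 2,
}


def what_relieves(target):
    """Collect distinct remedies for target, in order of first appearance."""
    answers = []
    for prop in PROPOSITIONS:
        predicate = prop[1]
        if predicate not in ENTAILS_RELIEVE:
            continue
        slot = ENTAILS_RELIEVE[predicate]
        remedy, ailment = prop[slot], prop[2 - slot]
        if ailment == target and remedy not in answers:
            answers.append(remedy)
    return answers
```

`what_relieves("headache")` returns the seven arguments listed on the slide. In a full system the table would be replaced by reachability in the entailment graph rather than a fixed dictionary.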

Next step – graph aggregation: “Which drinks relieve headache?”

Our Contributions

• Structuring Open IE with Proposition Entailment Graphs

• Dataset: 30 gold-standard graphs, 1.5 million entailment annotations

• Algorithm for constructing Focused Proposition Entailment Graphs

• Analysis: Predicate entailment is not quite what we thought

Algorithm

How do we recognize proposition entailment?

Observation: propositions entail if their lexical components entail

Proposition entailment is reduced to lexical entailment in context
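The reduction can be sketched in a few lines of Python, under two simplifying assumptions that are not from the talk: the arguments of both propositions are already aligned in the same order, and lexical entailment is a context-free lookup in the hypothetical table `LEXICAL_ENTAILS` (the talk stresses that it is in fact context-sensitive).

```python
# Hypothetical lexical entailment pairs (specific term, general term).
LEXICAL_ENTAILS = {
    ("aspirin", "painkiller"),
    ("aspirin", "drug"),
    ("painkiller", "drug"),
    ("cure", "relieve"),
    ("relieve", "treat"),
    ("treat", "relieve"),
}


def lex_entails(a, b):
    """A term entails itself and anything listed as more general."""
    return a == b or (a, b) in LEXICAL_ENTAILS


def prop_entails(premise, hypothesis):
    """Proposition entailment as the conjunction of component entailments."""
    return len(premise) == len(hypothesis) and all(
        lex_entails(a, b) for a, b in zip(premise, hypothesis)
    )
```

For instance, `prop_entails(("aspirin", "relieve", "headache"), ("painkiller", "relieve", "headache"))` holds, while the reverse direction does not, which gives the specific-to-general ordering of the graph.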

Lexical Entailment (Logistic)

$e = \sigma(w \cdot f)$

Lexical Entailment Features: $f_1, f_2, f_3$

Features
• WordNet Relations
• UMLS
• Distributional Similarity
• String Edit Distance
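A minimal Python sketch of the logistic score, the sigmoid of the dot product of weights w and features f. The feature names and weight values below are made up for illustration; in the talk's setting the weights would be learned from the entailment annotations.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def lexical_entailment_score(w, f):
    """e = sigmoid(w . f) for a weight vector w and a feature vector f."""
    if len(w) != len(f):
        raise ValueError("weights and features must have the same length")
    return sigmoid(sum(wi * fi for wi, fi in zip(w, f)))


# Hypothetical features for one word pair:
# [WordNet hypernym indicator, distributional similarity, normalized edit distance]
features = [1.0, 0.8, 0.2]
# Hypothetical weights, not learned values.
weights = [2.0, 1.5, -1.0]

score = lexical_entailment_score(weights, features)
```

With all features at zero the score is exactly 0.5, so positive evidence pushes the decision toward "entails" and negative evidence pushes it away.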

Supervision

Are WordNet relations capturing real-world predicate entailments?

Why isn’t WordNet capturing predicate entailment?

Predicate Entailment vs WordNet Relations

Over a predicate inference subset, how many predicate entailments are covered by WordNet?

• Positive indicators
  • synonyms, hypernyms, entailment

• Negative indicators
  • antonyms, hyponyms, cohyponyms

Positive: 12%
Negative: 15%
None: 74%

Predicate Entailment is Context-Sensitive

The words do not necessarily entail, but the situations do.

Appeal of NL KR

• Scalable – in principle unlimited coverage
• Easy to communicate with people
  • Understand
  • Supervise – add knowledge (vs. in logic representation)

• May add additional links between propositions
  • Causality, temporal, argumentative

• Support at least some useful inferences

Integration with Logic-based Approaches

• Integrate with logical/formal representations for concrete phenomena
  • E.g. temporal, arithmetic, spatial

• Borrow ideas/methods from logic to apply over NL KR
  • Which are relevant and applicable?

Text Exploration via NL Knowledge Graphs

Customer interactions
Exploratory search

Example: Service issues

not happy with the catering
coffee is awful
coffee in economy is awful
no refreshments
food on train is too expensive
you charge too much for sandwiches
food quality is disappointing
bad food in premier
not enough food selection
provide veggie meals
not happy with the service
journey is too slow
no clear information
not happy with the staff
staff is unfriendly
no vegetarian food
expand meal options
sandwiches are overpriced
sandwiches are too expensive
disgusting coffee is served
they have horrible coffee
food is bad

not happy with the catering
coffee is awful / they have horrible coffee / disgusting coffee is served
coffee in economy is awful
no refreshments
food on train is too expensive
sandwiches are too expensive / sandwiches are overpriced / you charge too much for sandwiches
food is bad / food quality is disappointing
bad food in premier
not enough food selection / expand meal options
no vegetarian food / provide veggie meals
not happy with the service
journey is too slow
no clear information
not happy with the staff
staff is unfriendly

not happy with the catering

coffee is awful

coffee in economy is awful

no refreshments

food on train is too expensive

sandwiches are too expensive

food is bad

bad food in premier

not enough food selection

no vegetarian food

not happy with the service

journey is too slow

no clear information

not happy with the staff

staff is unfriendly

not happy with the toilets

toilets are dirty
toilets are smelly
missing hygienic supplies
no soap in toilets
no toilet paper
not happy with train facilities
seats are uncomfortable
missing facilities
no children section
no WIFI
no AC in cars
facilities are bad
no AC
no AC in station
station is too crowded
Marshfield station is too crowded
cars are congested
pathway is too narrow
lack of personal space
lack of storage space
improve the website
online booking can be better
webpage shows old timetables
no website for android
can’t find FAQ page

Customer Interactions Entailment Graph

Conclusion: exciting research area

• Extend Open IE to become an NL-based knowledge graph

• NL proposition structure

• Graph of inter-proposition relations
  • Entailment – consolidation and hierarchy for propositions
  • Other relations desired – causal, temporal, argumentative, …

• How does it integrate with formal/artificial language KR?

Thank You!