
On the Proper Treatment of Quantifiers in Probabilistic Logic Semantics

Islam Beltagy and Katrin Erk

The University of Texas at Austin

IWCS 2015

Logic-based Semantics

• First-order logic and theorem proving

• Deep semantic representation:

– Negation, Quantifiers, Conjunction, Disjunction ….


Probabilistic Logic Semantics

• Logic

+

• Reasoning with Uncertainty

– Confidence rating of Word Sense Disambiguation

– Weight of Paraphrase rules

– Distributional similarity values [Beltagy et al., 2013]

• baby → toddler | w1

• eating doll → playing with a toy | w2

– ...


Probabilistic Logic Semantics

• Quantifiers and Negations do not work as expected

• Domain Closure Assumption: finite domain

– Problems with quantifiers

– “Tweety is a bird and it flies” ⇒ “All birds fly”

• Closed-World Assumption: low prior probabilities

– Problems with negations

– “All birds fly” ⇒ “The sky is not blue”


Outline

• Probabilistic Logic Semantics (overview of previous work)

– Markov Logic Networks (MLNs)

– Recognizing Textual Entailment (RTE)

• Domain Closure Assumption

– Definition

– Inference problems with Quantifiers

• Closed-World Assumption

• Evaluation

• Future work and Conclusion



Probabilistic Logic

• Frameworks that combine logical and statistical knowledge [Nilsson, 1986], [Getoor and Taskar, 2007]

• Use weighted first-order logic rules

– Weighted rules are soft rules (compared to hard logical constraints)

• Provide a mechanism for probabilistic inference: P(Q | E, KB)

• Bayesian Logic Programs (BLP) [Kersting & De Raedt, 2001]

• Markov Logic Networks (MLN) [Richardson and Domingos, 2006]

• Probabilistic Soft Logic (PSL) [Kimmig et al., NIPS 2012]

Markov Logic Networks [Richardson and Domingos, 2006]

∀x. smoke(x) → cancer(x) | 1.5
∀x,y. friend(x,y) → (smoke(x) ↔ smoke(y)) | 1.1

• Two constants: Anna (A) and Bob (B)

• P(Cancer(Anna) | Friends(Anna,Bob), Smokes(Bob))

[Figure: the ground Markov network over the atoms Cancer(A), Cancer(B), Smokes(A), Smokes(B), Friends(A,A), Friends(A,B), Friends(B,A), Friends(B,B)]

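As an illustration of how the query above is answered under a finite domain, here is a minimal brute-force sketch: it grounds the two weighted formulas over the constants Anna and Bob and sums world weights. This is for intuition only, not the inference engine used in the paper (real MLN systems do not enumerate all possible worlds).

```python
# Minimal brute-force MLN inference for the friends/smokers example above.
from itertools import product
from math import exp

CONSTANTS = ["A", "B"]  # Anna, Bob (Domain Closure: no other objects exist)

# All ground atoms under the finite domain.
ATOMS = ([f"smokes({x})" for x in CONSTANTS] +
         [f"cancer({x})" for x in CONSTANTS] +
         [f"friends({x},{y})" for x in CONSTANTS for y in CONSTANTS])

def world_weight(w):
    """exp(sum of weights of ground formulas satisfied in world w)."""
    total = 0.0
    for x in CONSTANTS:                      # 1.5: smokes(x) -> cancer(x)
        if (not w[f"smokes({x})"]) or w[f"cancer({x})"]:
            total += 1.5
    for x in CONSTANTS:                      # 1.1: friends(x,y) -> (smokes(x) <-> smokes(y))
        for y in CONSTANTS:
            if (not w[f"friends({x},{y})"]) or (w[f"smokes({x})"] == w[f"smokes({y})"]):
                total += 1.1
    return exp(total)

def prob(query, evidence):
    """P(query | evidence), summing world weights over all 2^8 possible worlds."""
    num = den = 0.0
    for values in product([False, True], repeat=len(ATOMS)):
        w = dict(zip(ATOMS, values))
        if any(w[a] != v for a, v in evidence.items()):
            continue                         # inconsistent with the evidence
        weight = world_weight(w)
        den += weight
        if w[query]:
            num += weight
    return num / den

print(prob("cancer(A)", {"friends(A,B)": True, "smokes(B)": True}))
```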

Outline

• Probabilistic Logic Semantics (overview of previous work)

– Markov Logic Networks

– Recognizing Textual Entailment

• Domain Closure Assumption

– Definition

– Inference problems with Quantifiers

• Closed-World Assumption

• Evaluation

• Future work and Conclusion


Recognizing Textual Entailment (RTE)

• RTE requires deep semantic understanding [Dagan et al., 2013]

• Given two sentences, Text (T) and Hypothesis (H), determine whether T Entails, Contradicts, or is unrelated (Neutral) to H


Recognizing Textual Entailment (RTE)

• Examples (from the SICK dataset) [Marelli et al., 2014]

– Entailment: T: “A man is walking through the woods.”

H: “A man is walking through a wooded area.”

– Contradiction: T: “A man is jumping into an empty pool.”

H: “A man is jumping into a full pool.”

– Neutral: T: “A young girl is dancing.”

H: “A young girl is standing on one leg.”


Recognizing Textual Entailment (RTE)

• Translate sentences to logic using Boxer [Bos, 2008]

• T: John is driving a car

∃x,y,z. john(x) ∧ agent(y, x) ∧ drive(y) ∧ patient(y, z) ∧ car(z)

• H: John is driving a vehicle

∃x,y,z. john(x) ∧ agent(y, x) ∧ drive(y) ∧ patient(y, z) ∧ vehicle(z)

• KB: (collected from different sources)

∀x. car(x) → vehicle(x) | w

• P(H | T, KB)

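To make the query concrete, here is a hand-checkable sketch (same brute-force spirit as before, not the system's actual pipeline). It assumes T's atoms are given as evidence over Skolem constants J, D and C, that every other ground atom except vehicle(C) is fixed to false, and a hypothetical rule weight w = 1.5; under these assumptions only two worlds remain and P(H | T, KB) reduces to a one-line computation.

```python
# Worked version of P(H | T, KB) for the example above, assuming Skolem
# constants J (John), D (driving event), C (the thing driven) and a
# hypothetical rule weight w = 1.5. Evidence from T fixes every ground atom
# except vehicle(C), so only two worlds remain. Groundings of the rule over
# J and D are satisfied either way (car(J), car(D) are false) and cancel out.
from math import exp

w = 1.5                       # weight of: forall x. car(x) -> vehicle(x)

# World 1: vehicle(C) = True  -> the ground rule car(C) -> vehicle(C) is satisfied.
weight_true = exp(w)
# World 2: vehicle(C) = False -> the ground rule is violated, contributes exp(0).
weight_false = exp(0)

# H (exists ... vehicle(z)) holds exactly in the worlds where vehicle(C) is true.
p_h = weight_true / (weight_true + weight_false)
print(p_h)                    # ~0.82: the entailment holds with high confidence
```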

Outline

• Probabilistic Logic Semantics (overview of previous work)

– Markov Logic Networks

– Recognizing Textual Entailment

• Domain Closure Assumption

– Definition

– Inference problems with Quantifiers

• Closed-World Assumption

• Evaluation

• Future work and Conclusion


Domain Closure Assumption (DCA)

• There are no objects in the world other than the named constants (Finite Domain)

• e.g.

∀x. smoke(x) → cancer(x) | 1.5
∀x,y. friend(x,y) → (smoke(x) ↔ smoke(y)) | 1.1

Two constants: Anna (A) and Bob (B)


[Figure: the ground atoms under DCA — Cancer(A), Cancer(B), Smokes(A), Smokes(B), Friends(A,A), Friends(A,B), Friends(B,A), Friends(B,B)]

Domain Closure Assumption (DCA)

• There are no objects in the universe other than the named constants (Finite Domain)

– Constants need to be explicitly added

– Universal quantifiers do not behave as expected because of finite domain

– e.g. “Tweety is a bird and it flies” ⇒ “All birds fly”

P(H|T, KB)        T                  H
∃                 Skolemization      No problems
∀                 Existence          Universals in H


Outline

• Probabilistic Logic Semantics (overview of previous work)

– Markov Logic Networks

– Recognizing Textual Entailment

• Domain Closure Assumption

– Definition

– Inference problems with Quantifiers

• Skolemization: ∃ in T

• Existence: ∀ in T

• Universals in Hypothesis: ∀ in H

• Closed-World Assumption

• Evaluation

• Future work and Conclusion


Skolemization (∃ in T)

• Explicitly introducing constants

• T: ∃x,y. john(x) ∧ agent(y, x) ∧ eat(y)

• Skolemized T: john(J) ∧ agent(T, J) ∧ eat(T)

• Embedded existentials

– T: ∀x. bird(x) → ∃y. agent(y, x) ∧ fly(y)

– Skolemized T: ∀x. bird(x) → agent(f(x), x) ∧ fly(f(x))

– Simulate Skolem functions:

∀x. bird(x) → ∃y. skolemf(x,y) ∧ agent(y, x) ∧ fly(y)

– skolemf(B1, C1), skolemf(B2, C2), …
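A minimal sketch of the two steps above, assuming a tiny ad-hoc representation of formulas as (predicate, arguments) conjuncts; this is illustrative code, not the authors' implementation.

```python
# Sketch of Skolemization as described above (ad-hoc formula representation).
import itertools

_fresh = itertools.count(1)

def fresh_constant():
    """Invent a new constant name: C1, C2, ..."""
    return f"C{next(_fresh)}"

def skolemize_outer(ex_vars, conjuncts):
    """Replace top-level existential variables with fresh constants.
    E.g.  exists x,y. john(x) & agent(y,x) & eat(y)
          ->         john(C1) & agent(C2,C1) & eat(C2)"""
    mapping = {v: fresh_constant() for v in ex_vars}
    return [(p, tuple(mapping.get(a, a) for a in args)) for p, args in conjuncts]

def simulate_skolem_function(restrictor_constants):
    """Simulate the Skolem function f of an embedded existential
    (forall x. bird(x) -> exists y. ...) with evidence for a new predicate
    skolemf, pairing every restrictor constant with a fresh constant:
    skolemf(B1, C1), skolemf(B2, C2), ..."""
    return [("skolemf", (b, fresh_constant())) for b in restrictor_constants]

print(skolemize_outer(["x", "y"],
                      [("john", ("x",)), ("agent", ("y", "x")), ("eat", ("y",))]))
print(simulate_skolem_function(["B1", "B2"]))
```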

Outline

• Probabilistic Logic Semantics (overview of previous work)

– Markov Logic Networks

– Recognizing Textual Entailment

• Domain Closure Assumption

– Definition

– Inference problems with Quantifiers

• Skolemization: ∃ in T

• Existence: ∀ in T

• Universals in Hypothesis: ∀ in H

• Closed-World Assumption

• Evaluation

• Future work and Conclusion


Existence (∀ in T)

• T: All birds fly

• H: Some birds fly

• Logically, T ⇏ H, but pragmatically it does

– “All birds fly” presupposes that “there exist birds”

• Solution: simulate this existential presupposition

– From parse tree, Q(restrictor, body)

– “All birds fly” becomes: all(bird, fly)

– Introduce additional evidence for the restrictor: bird(B)

Existence (∀ in T)

• Negated Existential

– T: No bird flies = no(bird, fly)

¬∃x,y. bird(x) ∧ agent(y, x) ∧ fly(y) ≡ ∀x. bird(x) → ¬∃y. agent(y, x) ∧ fly(y)

– Additional evidence: bird(B)

• Exception

– T: There are no birds: ¬∃x. bird(x)

– No additional evidence, because the existence presupposition is explicitly negated

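A minimal sketch of the evidence-generation rule on the last two slides, assuming quantified statements are already available as Q(restrictor, body) and that "There are no birds" is represented with an empty body; the representation and the fresh constant names are illustrative assumptions, not the paper's code.

```python
# Sketch of the existential-presupposition rule above.
import itertools

_fresh = itertools.count(1)

def existence_evidence(quantifier, restrictor, body):
    """Return an extra evidence atom for the restrictor, or None.

    all(bird, fly)   ("All birds fly")      -> ('bird', 'D1')
    no(bird, fly)    ("No bird flies")      -> ('bird', 'D2')
    no(bird, None)   ("There are no birds") -> None, because the existence
                                               presupposition is explicitly negated."""
    if quantifier == "no" and body is None:
        return None
    return (restrictor, f"D{next(_fresh)}")   # fresh constant, like bird(B) above

print(existence_evidence("all", "bird", "fly"))   # ('bird', 'D1')
print(existence_evidence("no", "bird", "fly"))    # ('bird', 'D2')
print(existence_evidence("no", "bird", None))     # None
```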

Outline

• Probabilistic Logic Semantics (overview of previous work)

– Markov Logic Networks

– Recognizing Textual Entailment

• Domain Closure Assumption

– Definition

– Inference problems with Quantifiers

• Skolemization: ∃ in T

• Existence: ∀ in T

• Universals in Hypothesis: ∀ in H

• Closed-World Assumption

• Evaluation

• Future work and Conclusion


Universals in Hypothesis (∀ in H)

• T: Tweety is a bird, and Tweety flies

bird(Tweety) ∧ agent(F, Tweety) ∧ fly(F)

• H: All birds fly

∀x. bird(x) → ∃y. agent(y, x) ∧ fly(y)

• T ⇒ H, because universal quantifiers range only over the constants of the given finite domain

• Solution:

– As in Existence, add evidence for the restrictor: bird(Woody)

– If the new bird can be shown to fly, then there is an explicit universal quantification in T

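A hand-checkable sketch of why the extra restrictor constant fixes this: it evaluates H's universal over the finite domain, treating unlisted ground atoms as false (an assumption in the spirit of the Closed-World Assumption discussed next); not the paper's implementation.

```python
# Evaluating H over the finite domain, with and without the extra constant.
def holds(atom, evidence):
    return atom in evidence

def h_all_birds_fly(domain, evidence):
    """H: forall x. bird(x) -> exists y. agent(y, x) & fly(y),
    evaluated over the finite domain only (this is what DCA forces)."""
    return all(
        (not holds(("bird", x), evidence)) or
        any(holds(("agent", y, x), evidence) and holds(("fly", y), evidence)
            for y in domain)
        for x in domain)

T = {("bird", "Tweety"), ("agent", "F", "Tweety"), ("fly", "F")}

# Without the extra constant, DCA makes T wrongly "entail" H:
print(h_all_birds_fly({"Tweety", "F"}, T))                                  # True

# With the extra evidence bird(Woody), H holds only if Woody can be shown
# to fly, which T gives no reason to believe:
print(h_all_birds_fly({"Tweety", "F", "Woody"}, T | {("bird", "Woody")}))   # False
```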

Outline

• Probabilistic Logic Semantics (overview of previous work)

– Markov Logic Networks

– Recognizing Textual Entailment

• Domain Closure Assumption

– Definition

– Inference problems with Quantifiers

• Closed-World Assumption

• Evaluation

• Future work and Conclusion


Closed-World Assumption (CWA)

• The assumption that all ground atoms have a very low prior probability

• CWA fits the RTE task because:

– In the world, most things are false

– Inference results are less sensitive to the domain size

– Enables inference optimization [Beltagy and Mooney, 2014]


Closed-World Assumption (CWA)

• Because of the CWA, a negated H comes out true regardless of T

• H: ¬∃x,y. bird(x) ∧ agent(y, x) ∧ fly(y)

• Solution

– Add positive evidence that contradicts the negated parts of H

– A set of ground atoms with high prior probability (in contrast with the low prior probability on all other ground atoms)

– R: bird(B) ∧ agent(F, B) ∧ fly(F) | w=1.5

– P(H | CWA) ≈ 1

– P(H | R, CWA) ≈ 0

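A minimal sketch of building R: take the conjunction under the negated existential in H, replace each variable with a fresh constant, and attach a high prior weight. The variable-to-constant names and the weight 1.5 follow the slide; the formula representation itself is an illustrative assumption.

```python
# Sketch of constructing the rule R above from the negated part of H.
def build_R(negated_conjuncts, var_to_const, weight=1.5):
    """negated_conjuncts: the atoms under the negation, e.g. from
    H = not exists x,y. bird(x) & agent(y, x) & fly(y).
    Returns the ground atoms plus the high prior weight attached to them."""
    ground = [(pred, tuple(var_to_const.get(a, a) for a in args))
              for pred, args in negated_conjuncts]
    return ground, weight

H_body = [("bird", ("x",)), ("agent", ("y", "x")), ("fly", ("y",))]
print(build_R(H_body, {"x": "B", "y": "F"}))
# ([('bird', ('B',)), ('agent', ('F', 'B')), ('fly', ('F',))], 1.5)
```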

Closed-World Assumption (CWA)

Entailing example:

• T: No bird flies: ¬∃x,y. bird(x) ∧ agent(y, x) ∧ fly(y)

• H: No penguin flies: ¬∃x,y. penguin(x) ∧ agent(y, x) ∧ fly(y)

• R: penguin(P) ∧ agent(F, P) ∧ fly(F) | w=1.5

• KB: ∀x. penguin(x) → bird(x)

• P(H | T, R, KB) = 1

• T ∧ KB contradicts R, which lets H be true


Outline

• Probabilistic Logic Semantics (overview of previous work)

– Markov Logic Networks

– Recognizing Textual Entailment

• Domain Closure Assumption

– Definition

– Inference problems with Quantifiers

• Closed-World Assumption

• Evaluation

• Future work and Conclusion



Evaluation

• Probabilistic Logic Framework: Markov Logic Network

– Proposed handling of DCA and CWA applies to other Probabilistic Logic frameworks that make similar assumptions, e.g., PSL (Probabilistic Soft Logic)

• Evaluation Task: RTE

– Proposed handling of DCA and CWA applies to other tasks where the logical formulas have existential and universal quantifiers, e.g., STS (Semantic Textual Similarity) and Question Answering


Evaluation

1) Synthetic Dataset

• Template: Q1 NP1 V Q2 NP2 = Q1(NP1, Q2(NP2, V)) (see the sketch after the example below)

• Example

– T: No man eats all food

– H: Some hungry men eat not all delicious food
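Below is a small sketch of how a template instance unfolds from Q1(NP1, Q2(NP2, V)) into first-order logic. It only covers the four quantifiers appearing in this example, ignores the adjectives (hungry, delicious), and builds formulas as plain strings; this is an illustration, not the dataset-generation code.

```python
# Unfolding the template Q1 NP1 V Q2 NP2 = Q1(NP1, Q2(NP2, V)) into logic.
def quant(q, restrictor, var, body):
    """Build a first-order formula string for quantifier q over variable var."""
    if q == "some":
        return f"∃{var}. {restrictor}({var}) ∧ ({body})"
    if q == "all":
        return f"∀{var}. {restrictor}({var}) → ({body})"
    if q == "no":
        return f"¬∃{var}. {restrictor}({var}) ∧ ({body})"
    if q == "not all":
        return f"¬∀{var}. {restrictor}({var}) → ({body})"
    raise ValueError(f"unsupported quantifier: {q}")

# T: "No man eats all food" = no(man, all(food, eat))
t = quant("no", "man", "x", quant("all", "food", "y", "eat(x, y)"))
print(t)   # ¬∃x. man(x) ∧ (∀y. food(y) → (eat(x, y)))

# H (simplified, dropping the adjectives): "Some men eat not all food"
h = quant("some", "man", "x", quant("not all", "food", "y", "eat(x, y)"))
print(h)   # ∃x. man(x) ∧ (¬∀y. food(y) → (eat(x, y)))
```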

Evaluation

1) Synthetic Dataset

• Dataset size: 952 Neutral + 72 Entail = 1024

[Bar chart: accuracy on the synthetic dataset for the configurations Baseline, Skolem, Skolem + Existence, Skolem + Univ in H, Skolem + Univ in H + CWA, and All (y-axis: Accuracy, 0–1)]

Detection of Contradiction

• Entailment: P(H | T, KB, Wt,h)

• Contradiction: P(¬H | T, KB, Wt,h)

World configuration:

• Domain size

• Prior probabilities

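A minimal sketch of turning the two queries above into a three-way RTE decision. The fixed thresholds are a hypothetical simplification; the paper bases the label on both probabilities (computed under the same world configuration), not necessarily on hard cut-offs.

```python
# Sketch of a three-way RTE decision from the two probabilities above.
def rte_label(p_h_given_t, p_not_h_given_t, threshold=0.9):
    if p_h_given_t >= threshold:
        return "Entailment"        # T makes H (almost) certainly true
    if p_not_h_given_t >= threshold:
        return "Contradiction"     # T makes ¬H (almost) certainly true
    return "Neutral"

print(rte_label(0.97, 0.01))   # Entailment
print(rte_label(0.02, 0.95))   # Contradiction
print(rte_label(0.40, 0.35))   # Neutral
```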

Evaluation

2) Sentences Involving Compositional Knowledge (SICK) [Marelli et al., SemEval 2014]

– 10,000 pairs of sentences annotated as Entail, Contradict or Neutral

[Bar chart: accuracy on SICK for the configurations Baseline, Skolem, Skolem + Existence, Skolem + Univ in H, Skolem + Univ in H + CWA, and All (y-axis: Accuracy, 0–1)]

Evaluation

3) FraCaS [Cooper et al., 1996]: hand-built entailment pairs

– We evaluate on the first section (out of 9 sections)

– Unsupported quantifiers (few, most, many, at least): 28/74 pairs

[Bar chart: accuracy on FraCaS section 1 for the configurations Baseline, Skolem, Skolem + Existence, Skolem + Univ in H, Skolem + Univ in H + CWA, and All, with gold parses vs. standard parses (y-axis: Accuracy, 0–1)]

Outline

• Probabilistic Logic Semantics (overview of previous work)

– Markov Logic Networks

– Recognizing Textual Entailment

• Domain Closure Assumption

– Definition

– Inference problems with Quantifiers

• Closed-World Assumption

• Evaluation

• Future work and Conclusion


Future Work

Generalized Quantifiers:

• How to extend this work to generalized quantifiers like Few and Most



Conclusion

• The Domain Closure Assumption, its implications for probabilistic logic inference, and how to formulate the RTE problem so that we get the expected inferences

• The Closed-World Assumption, why we make it, its effect on negation, and how to formulate the RTE problem to get correct inferences

Thank You
