Working with Discourse Representation Theory
Patrick Blackburn & Johan Bos
Lecture 3: DRT and Inference
This lecture
Now that we know how to build DRSs for English sentences, what do we do with them?
Well, we can use DRSs to draw inferences.
In this lecture we show how to do that, both in theory and in practice.
Overview
Inference tasks
Why FOL?
From model theory to proof theory
Inference engines
From DRT to FOL
Adding world knowledge
Doing it locally
The inference tasks
The consistency checking task
The informativity checking task
Why First-Order Logic?
Why not use higher-order logic? Better match with formal semantics. But: undecidable, and no fast provers available.
Why not use weaker logics? Modal and description logics offer decidable fragments. But: they can't cope with all of natural language.
Why use first-order logic? Undecidable, but good inference tools are available; there is a DRS translation to first-order logic; and it is easy to add world knowledge.
Axioms encode world knowledge
We can write down axioms capturing the information we regard as fundamental
For example, lexical knowledge, world knowledge, information about the structure of time, events, etc.
By the Deduction Theorem: φ1, …, φn |= ψ iff |= (φ1 & … & φn) → ψ
That is, inference reduces to validity of formulas.
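The reduction can be checked directly in the decidable propositional case. Below is a toy sketch (all names are ours, not from any library): brute-force truth tables confirm that premises entail a conclusion exactly when the corresponding conditional is valid.

```python
# A toy check of the reduction in the decidable propositional case:
# premises entail a conclusion iff the corresponding conditional is
# valid. All names here are ours, not from any library.
from itertools import product

def entails(premises, conclusion, atoms):
    """phi_1, ..., phi_n |= psi : no valuation makes all premises
    true and the conclusion false."""
    for vals in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, vals))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True

def valid(formula, atoms):
    """|= phi : true under every valuation."""
    return all(formula(dict(zip(atoms, vals)))
               for vals in product([True, False], repeat=len(atoms)))

# p -> q, p |= q   iff   |= ((p -> q) & p) -> q   (modus ponens)
p_imp_q = lambda v: not v['p'] or v['q']
p = lambda v: v['p']
q = lambda v: v['q']
conditional = lambda v: not (p_imp_q(v) and p(v)) or q(v)
```

Of course, this brute-force strategy is only available because propositional validity is decidable; for FOL the equivalence still holds, but neither side of it is decidable.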
From model theory to proof theory
The inference tasks were defined semantically
For computational purposes, we need symbolic definitions
We need to move from the semantic relation |= to the syntactic relation |--
In other words, from validity to provability
Soundness
If provable then valid:
If |-- φ then |= φ
Soundness is a 'no garbage' condition
Completeness
If valid then provable
If |= φ then |-- φ
Completeness means that proof theory has captured model theory
Decidability
A problem is decidable if a computer program is guaranteed to halt in finite time on any input and give a correct answer
A problem that is not decidable is undecidable
First-order logic is undecidable
What does this mean?
It is not possible to write a program that is guaranteed to halt when given any first-order formula and correctly tell you whether or not that formula is valid.
Sounds pretty bad!
Good news
FOL is semi-decidable
What does that mean?
If a formula is in fact valid, it is always possible to verify this fact symbolically in finite time
That is, things only go wrong for FOL when it is asked to tackle something that is not valid
On some non-valid inputs, any algorithm is bound not to terminate
Put differently
Half the task, namely determining validity, is fairly reasonable.
The other half of the task, showing non-validity, or equivalently, satisfiability, is harder.
This duality is reflected in the fact that there are two fundamental computational inference tools for FOL: theorem provers and model builders
Theorem provers
The basic thing theorem provers do is show that a formula is provable/valid.
There are many efficient off-the-shelf provers available for FOL
Theorem proving technology is now nearly 40 years old and extremely sophisticated
Examples: Vampire, Spass, Bliksem, Otter
Theorem provers and informativity
Given a formula φ, a theorem prover will try to prove φ, that is, to show that it is valid/uninformative
If φ is valid/uninformative, then, in theory, the theorem prover will always succeed
So theorem provers are a negative test for informativity
If the formula is not valid/uninformative, all bets are off.
Theorem provers and consistency
Given a formula φ, a theorem prover will try to prove ¬φ, that is, to show that φ is inconsistent
If φ is inconsistent, then, in theory, the theorem prover will always succeed
So theorem provers are also a negative test for consistency
If the formula is in fact consistent, all bets are off.
Model builders
The basic thing model builders do is try to generate a (usually finite) model for a formula. They do so by iterating over model size.
Model building for FOL is a rather new field, and there are not many model builders available.
It is also an intrinsically hard task, harder than theorem proving.
Examples: Mace, Paradox, Sem.
Model builders and consistency
Given a formula φ, a model builder will try to build a model for φ, that is, to show that φ is consistent
If φ is consistent, and satisfiable in a finite model, then, in theory, the model builder will always succeed
So model builders are a partial positive test for consistency
If the formula is not consistent, or not satisfiable in any finite model, all bets are off.
Finite model property
A logic has the finite model property if every satisfiable formula is satisfiable in a finite model.
Many decidable logics have this property.
But it is easy to see that FOL lacks it: for example, the conjunction of axioms saying that < is transitive, irreflexive, and serial (∀x∃y x<y) is satisfiable, but only in infinite models.
Model builders and informativity
Given a formula φ, a model builder will try to build a model for ¬φ, that is, to show that φ is informative
If ¬φ is satisfiable in a finite model, then, in theory, the model builder will always succeed
So model builders are a partial positive test for informativity
If ¬φ is not satisfiable in any finite model, all bets are off.
Yin and Yang of Inference
Theorem Proving and Model Building function as opposite forces
Doing it in parallel
We have general negative tests [theorem provers], and partial positive tests [model builders]
Why not get the best of both worlds by running these tests in parallel?
That is, given a formula we wish to test for informativity/consistency, we hand it to both a theorem prover and model builder at once
When one succeeds, we halt the other
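This strategy can be sketched in a few lines. The stand-ins below are dummies (a real system would call an actual prover and model builder here, e.g. Vampire and Mace); only the control flow is the point.

```python
# A sketch of the parallel strategy: hand the formula to a theorem
# prover and its negation to a model builder at once, and act on
# whichever tool answers first. try_prove and try_build_model are
# dummy stand-ins, not interfaces of any real tool.
from concurrent.futures import ThreadPoolExecutor, as_completed

KNOWN_VALID = {'p or not p'}          # toy oracle driving the dummies

def try_prove(formula):
    # dummy prover: succeeds exactly on the formulas marked valid
    return formula in KNOWN_VALID

def try_build_model(formula):
    # dummy model builder for the negated input: a countermodel of
    # phi exists exactly when phi is not valid
    op, phi = formula
    return op == 'not' and phi not in KNOWN_VALID

def check_validity(formula):
    """True (proof found) or False (countermodel found). With real
    FOL tools, neither side may ever answer on some inputs."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        prover = pool.submit(try_prove, formula)
        builder = pool.submit(try_build_model, ('not', formula))
        for fut in as_completed([prover, builder]):
            if fut is prover and fut.result():
                return True
            if fut is builder and fut.result():
                return False
```

With the dummy tools both futures terminate, so the loop always resolves; with real tools, semi-decidability means one of the two may run forever, which is exactly why the other is raced against it.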
Parallel Consistency Checking
Suppose we want to test the formula φ (representing the latest sentence) for consistency with respect to the previous discourse PREV
Then:
If a theorem prover succeeds in finding a proof of ¬(PREV & φ), then φ is inconsistent with the discourse
If a model builder succeeds in constructing a model for PREV & φ, then φ is consistent with the discourse
Why is this relevant to natural language?
Testing a discourse for consistency:

Discourse                  | Theorem prover | Model builder
Vincent is a man.          | ??             | Model
Mia loves every man.       | ??             | Model
Mia does not love Vincent. | Proof          | ??
Parallel informativity checking
Suppose we want to test the formula φ (representing the latest sentence) for informativity with respect to the previous discourse PREV
Then:
If a theorem prover succeeds in finding a proof of PREV → φ, then φ is not informative
If a model builder succeeds in constructing a model for PREV & ¬φ, then φ is informative
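The two checks just defined can be written out as small functions. The `prove` and `build_model` parameters are abstract stand-ins for a real prover and model builder (run sequentially here for brevity; in practice they run in parallel), and the tuple encoding of formulas is our own toy convention.

```python
# The consistency and informativity checks against a discourse,
# over abstract prove / build_model callables (dummy parameters
# standing in for a real prover and model builder).

def consistency_test(prev, phi, prove, build_model):
    """Is phi consistent with the previous discourse?"""
    if prove(('not', ('and', prev, phi))):
        return 'inconsistent'          # proof of not(PREV & phi)
    if build_model(('and', prev, phi)):
        return 'consistent'            # model of PREV & phi
    return 'unknown'                   # neither tool answered

def informativity_test(prev, phi, prove, build_model):
    """Is phi informative with respect to the previous discourse?"""
    if prove(('imp', prev, phi)):
        return 'uninformative'         # proof of PREV -> phi
    if build_model(('and', prev, ('not', phi))):
        return 'informative'           # model of PREV & not(phi)
    return 'unknown'
```

Note the asymmetry baked into the return values: the prover branches are complete-in-principle negative tests, while the model-builder branches are only partial positive tests, so 'unknown' is a genuinely possible outcome.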
Why is this relevant to natural language?
Testing a discourse for informativity:

Discourse            | Theorem prover | Model builder
Vincent is a man.    | ??             | Model
Mia loves every man. | ??             | Model
Mia loves Vincent.   | Proof          | ??
Let's apply this to DRT
Pretty clear what we need to do: find efficient theorem provers for DRT, find efficient model builders for DRT, run them in parallel, and Bob's your uncle!
Recall that theorem provers are more established technology than model builders
So let's start by finding an efficient theorem prover for DRT…
Googling theorem provers for DRT
Theorem proving in DRT
Oh no! Nothing there, efficient or otherwise.
Let's build our own.
One phone call to Voronkov later: oops --- does it take that long to build one from scratch? Oh dear.
Googling theorem provers for FOL
Use FOL inference technology for DRT
There are a lot of FOL provers available, and they are extremely efficient
There are also some interesting freely available model builders for FOL
We have said several times that DRT is FOL in disguise, so let's make this precise and put the observation to work
From DRT to FOL
Compile DRSs into standard FOL syntax
Use off-the-shelf inference engines for FOL
Okay --- how do we do this?
Translation function (…)fo
Translating DRT to FOL: DRSs
For a DRS with discourse referents x1,…,xn and conditions C1,…,Cn:

([x1,…,xn | C1,…,Cn])fo = ∃x1…∃xn((C1)fo & … & (Cn)fo)
Translating DRT to FOL: Conditions
(R(x1,…,xn))fo = R(x1,…,xn)
(x1 = x2)fo = x1 = x2
(¬B)fo = ¬(B)fo
(B1 ∨ B2)fo = (B1)fo ∨ (B2)fo
Translating DRT to FOL: Implicative DRS-conditions
For an implicative condition whose antecedent DRS has discourse referents x1,…,xm and conditions C1,…,Cn:

([x1,…,xm | C1,…,Cn] ⇒ B)fo = ∀x1…∀xm(((C1)fo & … & (Cn)fo) → (B)fo)
Two example translations
Example 1: the DRS [x | man(x), walk(x)]
Example 2: the DRS [y | woman(y), [x | man(x)] ⇒ [e | adore(e), agent(e,x), theme(e,y)]]

Example 1
([x | man(x), walk(x)])fo
= ∃x((man(x))fo & (walk(x))fo)
= ∃x(man(x) & (walk(x))fo)
= ∃x(man(x) & walk(x))
Example 2
([y | woman(y), [x | man(x)] ⇒ [e | adore(e), agent(e,x), theme(e,y)]])fo
= ∃y((woman(y))fo & ([x | man(x)] ⇒ [e | adore(e), agent(e,x), theme(e,y)])fo)
= ∃y(woman(y) & ∀x((man(x))fo → ([e | adore(e), agent(e,x), theme(e,y)])fo))
= ∃y(woman(y) & ∀x(man(x) → ∃e((adore(e))fo & (agent(e,x))fo & (theme(e,y))fo)))
= ∃y(woman(y) & ∀x(man(x) → ∃e(adore(e) & agent(e,x) & theme(e,y))))
Basic setup
DRS: [x, y | vincent(x), mia(y), love(x,y)]
FOL: ∃x∃y(vincent(x) & mia(y) & love(x,y))
Model: D = {d1}, F(vincent)={d1}, F(mia)={d1}, F(love)={(d1,d1)}
Background Knowledge (BK)
We need to incorporate BK
Formulate BK in terms of first-order axioms
Rather than just giving φ to the theorem prover (or model builder), we give it: BK → φ or BK & φ
Basic setup
DRS: [x, y | vincent(x), mia(y), love(x,y)]
FOL: ∃x∃y(vincent(x) & mia(y) & love(x,y))
BK: ∀x(vincent(x) → man(x)), ∀x(mia(x) → woman(x)), ∀x(man(x) → ¬woman(x))
Model: D = {d1,d2}, F(vincent)={d1}, F(mia)={d2}, F(love)={(d1,d2)}
Note that the third axiom forces Vincent and Mia apart: the earlier one-element model is ruled out, so the smallest model now has two elements.
Local informativity
Example:
Mia is the wife of Marsellus. If Mia is the wife of Marsellus, Vincent will be disappointed.
The second sentence is informative with respect to the first. But…
DRS for the first sentence: [x, y | mia(x), marsellus(y), wife-of(x,y)]
DRS after processing the second sentence: [x, y, z | mia(x), marsellus(y), wife-of(x,y), vincent(z), [ | wife-of(x,y)] ⇒ [ | disappointed(z)]]
The antecedent of the conditional is locally uninformative: it merely repeats a condition already established in the enclosing DRS.
Local consistency
Example:
Jules likes big kahuna burgers. If Jules does not like big kahuna burgers, Vincent will order a whopper.
The second sentence is consistent with respect to the first. But…
DRS for the first sentence: [x, y | jules(x), big-kahuna-burgers(y), like(x,y)]
DRS after processing the second sentence: [x, y, z | jules(x), big-kahuna-burgers(y), like(x,y), vincent(z), [ | ¬like(x,y)] ⇒ [u | order(z,u), whopper(u)]]
The antecedent of the conditional is locally inconsistent: together with the enclosing DRS it asserts both like(x,y) and ¬like(x,y).
DRT and local inference
Because DRT groups information into contexts (DRSs), we now have a natural means to check not only global, but also local, consistency and informativity.
Important for dealing with presupposition.
Presupposition is not about strange logic, but about using classical logic in new ways.
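The local checks behind the last two examples can be sketched syntactically: walk a DRS, and for each implicative condition compare its antecedent with the literals already established in the enclosing DRSs. This toy version only spots literal repetitions and literal clashes (a real system hands these sub-problems to a prover and model builder), and the encoding is ours: a DRS is a (referents, conditions) pair, an implicative condition is ('imp', drs, drs).

```python
# A purely syntactic sketch of local informativity / consistency
# checks over a toy DRS encoding: (referents, conditions) pairs,
# with ('imp', antecedent_drs, consequent_drs) for conditionals.

def local_checks(drs, context=()):
    """Collect (established, antecedent_literals) per implication."""
    refs, conds = drs
    established = list(context) + [c for c in conds if isinstance(c, str)]
    checks = []
    for c in conds:
        if isinstance(c, tuple) and c[0] == 'imp':
            _, ante, cons = c
            ante_lits = [a for a in ante[1] if isinstance(a, str)]
            checks.append((established, ante_lits))
            checks += local_checks(cons, established + ante_lits)
    return checks

def negate(lit):
    return lit[1:] if lit.startswith('-') else '-' + lit

def diagnose(drs):
    """Flag locally uninformative / locally inconsistent antecedents."""
    out = []
    for established, ante in local_checks(drs):
        if all(a in established for a in ante):
            out.append('locally uninformative')   # antecedent repeats context
        elif any(negate(a) in established for a in ante):
            out.append('locally inconsistent')    # antecedent clashes
        else:
            out.append('ok')
    return out

# The two examples from this lecture:
wife = (['x', 'y', 'z'],
        ['mia(x)', 'marsellus(y)', 'wife-of(x,y)', 'vincent(z)',
         ('imp', ([], ['wife-of(x,y)']), ([], ['disappointed(z)']))])
burger = (['x', 'y', 'z'],
          ['jules(x)', 'big-kahuna-burgers(y)', 'like(x,y)', 'vincent(z)',
           ('imp', ([], ['-like(x,y)']), (['u'], ['order(z,u)', 'whopper(u)']))])
```

On these inputs, `diagnose(wife)` flags the antecedent as locally uninformative and `diagnose(burger)` flags it as locally inconsistent, matching the analysis on the slides.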
Tomorrow
Presupposition and Anaphora in DRT