Formal Semantics
Slides by Julia Hockenmaier, Laura McGarrity, Bill McCartney, Chris
Manning, and Dan Klein
Formal Semantics
Formal semantics comes in two flavors:
• Lexical semantics: the meaning of words
• Compositional semantics: how the meanings of individual units combine to form the meaning of larger units
What is meaning?
• Meaning ≠ Dictionary entries
  Dictionaries define words using words. Circularity!
Reference
• Referent: the thing/idea in the world that a word refers to
• Reference: the relationship between a word and its referent
Reference
[Diagram: the expressions "Barack Obama" and "the president" both point to the same referent.]
The president is the commander-in-chief.
= Barack Obama is the commander-in-chief.
Reference
[Diagram: "Barack Obama" and "the president" share a referent, yet the sentences below differ in meaning.]
I want to be the president.
≠ I want to be Barack Obama.
Reference
• Tooth fairy?
• Phoenix?
• Winner of the 2016 presidential election?
What is meaning?
• Meaning ≠ Dictionary entries
• Meaning ≠ Reference
Sense
• Sense: The mental representation of a word or phrase, independent of its referent.
Sense ≠ Mental Image
• A word may have different mental images for different people.
  – E.g., "mother"
• A word may conjure a typical mental image (a prototype), but can signify atypical examples as well.
Sense v. Reference
• A word/phrase may have sense, but no reference:
  – King of the world
  – The camel in CIS 8538
  – The greatest integer
  – The
• A word may have reference, but no sense:
  – Proper names: Dan McCloy, Kristi Krein (who are they?!)
Sense v. Reference
• A word may have the same referent, but more than one sense:– The morning star / the evening star (Venus)
• A word may have one sense, but multiple referents:– Dog, bird
Some semantic relations between words
• Hyponymy (subclass):
  – Poodle < dog
  – Crimson < red
  – Red < color
  – Dance < move
• Hypernymy (superclass): the inverse relation
• Synonymy:
  – Couch / sofa
  – Manatee / sea cow
• Antonymy:
  – Dead / alive
  – Married / single
Lexical Decomposition
• Word sense can be represented with a bundle of semantic features, e.g. bachelor = [+HUMAN, +ADULT, +MALE, −MARRIED]
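The feature-bundle idea can be sketched in Python. The feature names and values below are illustrative, not a standard inventory; a clash on a shared feature predicts a semantic anomaly like "That bachelor is pregnant."

```python
# Sketch: word senses as bundles of binary semantic features.
# The feature inventory here is invented for illustration.
FEATURES = {
    "bachelor": {"human": True, "adult": True, "male": True, "married": False},
    "wife":     {"human": True, "adult": True, "male": False, "married": True},
    "pregnant": {"human": True, "male": False},
}

def compatible(a, b):
    """Two senses clash if they assign opposite values to any shared feature."""
    fa, fb = FEATURES[a], FEATURES[b]
    return all(fa[f] == fb[f] for f in fa.keys() & fb.keys())

print(compatible("bachelor", "pregnant"))  # False: 'male' clashes -> anomaly
print(compatible("wife", "pregnant"))      # True
```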
Compositional Semantics
Compositional Semantics
• The study of how meanings of small units combine to form the meaning of larger units
The dog chased the cat ≠ The cat chased the dog.
i.e., the whole does not equal the sum of the parts.

The dog chased the cat = The cat was chased by the dog.
i.e., syntax matters in determining meaning.
Principle of Compositionality
The meaning of a sentence is determined by the meaning of its words in conjunction with the way they are syntactically combined.
Exceptions to Compositionality
• Anomaly: when phrases are well-formed syntactically, but not semantically
  – Colorless green ideas sleep furiously. (Chomsky)
  – That bachelor is pregnant.
Exceptions to Compositionality
• Metaphor: the use of an expression to refer to something that it does not literally denote, in order to suggest a similarity
  – Time is money.
  – The walls have ears.
Exceptions to Compositionality
• Idioms: phrases with fixed meanings not composed of the literal meanings of the words
  – Kick the bucket = 'die'
    (*The bucket was kicked by John.)
  – When pigs fly = 'it will never happen'
    (*She suspected pigs might fly tomorrow.)
  – Bite off more than you can chew = 'to take on too much'
    (*He chewed just as much as he bit off.)
Idioms in other languages
Logical Foundations for Compositional Semantics
• We need a language for expressing the meaning of words, phrases, and sentences
• Many possible choices; we will focus on– First-order predicate logic (FOPL) with types– Lambda calculus
Truth-conditional Semantics
• Linguistic expressions:
  – "Bob sings."
• Logical translations:
  – sings(bob)
  – but could be p_5789023(a_257890)
• Denotation:
  – [[bob]] = some specific person (in some context)
  – [[sings(bob)]] = true in situations where Bob is singing; false otherwise
• Types on translations:
  – bob : e (entity)
  – sings(bob) : t (true or false, a boolean type)
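The denotation idea (evaluating a translation against a model) can be sketched as follows; the entities and extensions are invented for illustration:

```python
# A toy model: [[sings(bob)]] is true iff bob is in the extension of 'sings'.
# The entities and extensions here are invented for illustration.
entities = {"bob", "alice"}
extensions = {"sings": {"bob"}, "dances": {"alice", "bob"}}

def denotation(pred, arg):
    """A predicate of type e -> t: maps an entity to a truth value via the model."""
    return arg in extensions[pred]

print(denotation("sings", "bob"))    # True
print(denotation("sings", "alice"))  # False
```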
Truth-conditional Semantics
Some more complicated logical descriptions of language:
• "All girls like a video game."
  – ∀x:e . ∃y:e . girl(x) → [video-game(y) ∧ likes(x,y)]
• "Alice is a former teacher."
  – (former(teacher))(Alice)
• "Alice saw the cat before Bob did."
  – ∃x:e, y:e, z:e, t1:e, t2:e .
    cat(x) ∧ see(y) ∧ see(z) ∧ agent(y, Alice) ∧ patient(y, x) ∧ agent(z, Bob) ∧ patient(z, x) ∧ time(y, t1) ∧ time(z, t2) ∧ t1 < t2
FOPL Syntax Summary
• A set of types T = {t1, … }
• A set of constants C = {c1, …}, each associated with a type from T
• A set of relations R = {r1, …}, where each ri is a subset of C^n for some n
• A set of variables X = {x1, …}
• , , , , , , ., :
Truth-conditional Semantics
• Proper names:
  – Refer directly to some entity in the world
  – Bob : bob
• Sentences:
  – Are either t or f
  – Bob sings : sings(bob)
• So what about verbs and VPs?
  – sings must combine with bob to produce sings(bob)
  – The λ-calculus is a notation for functions whose arguments are not yet filled.
  – sings : λx.sings(x)
  – This is a predicate, a function that returns a truth value. In this case, it takes a single entity as an argument, so we can write its type as e → t
• Adjectives?
Lambda Calculus
• FOPL + λ (a new quantifier) will be our lambda calculus
• Intuitively, λ is just a way of creating a function
  – E.g., girl(·) is a relation symbol, but λx . girl(x) is a function that takes one argument.
• New inference rule: function application
  (λx . L1(x)) (L2) → L1(L2)
  E.g., (λx . x²) (3) → 3² = 9
  E.g., (λx . sings(x)) (bob) → sings(bob)
• Lambda calculus lets us describe the meaning of words individually.
  – Function application (and a few other rules) then lets us combine those meanings to come up with the meaning of larger phrases or sentences.
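Since Python lambdas are exactly functions with unfilled arguments, the function-application rule above can be demonstrated directly; the toy extension SINGERS is invented for illustration:

```python
# Function application in the lambda calculus, using Python lambdas:
# (λx . x²)(3) -> 9, and (λx . sings(x))(bob) -> sings(bob), evaluated in a toy model.
square = lambda x: x ** 2        # λx . x²
print(square(3))                 # 9

SINGERS = {"bob"}                # toy extension, invented for illustration
sings = lambda x: x in SINGERS   # λx . sings(x), a predicate of type e -> t
print(sings("bob"))              # True
print(sings("amy"))              # False
```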
Compositional Semantics with the λ-calculus
• So now we have meanings for the words
• How do we know how to combine the words?
• Associate a combination rule with each grammar rule:
  – S : β(α) → NP : α  VP : β   (function application)
  – VP : λx. α(x) ∧ β(x) → VP : α  and : ∅  VP : β   (intersection)
• Example:
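A minimal sketch of the two combination rules, with invented toy extensions:

```python
# Toy extensions, invented for illustration.
SINGERS, DANCERS = {"bob"}, {"bob", "amy"}
sings  = lambda x: x in SINGERS    # VP : α
dances = lambda x: x in DANCERS    # VP : β

def s_rule(np, vp):
    """S -> NP VP with semantics β(α): apply the VP function to the NP's entity."""
    return vp(np)

def and_rule(vp1, vp2):
    """VP -> VP and VP with semantics λx. α(x) ∧ β(x) (intersection)."""
    return lambda x: vp1(x) and vp2(x)

sings_and_dances = and_rule(sings, dances)
print(s_rule("bob", sings_and_dances))  # True
print(s_rule("amy", sings_and_dances))  # False: amy is not in SINGERS
```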
Composition: Some more examples
• Transitive verbs:
  – likes : λx.λy.likes(y,x)
  – Two-place predicates, type e → (e → t)
  – VP "likes Amy" : λy.likes(y,amy) is just a one-place predicate
• Quantifiers:
  – What does "everyone" mean?
  – Everyone : λf.∀x.f(x)
  – Some problems:
    • Have to change our NP/VP rule
    • Won't work for "Amy likes everyone"
  – What about "Everyone likes someone"?
  – Gets tricky quickly!
Composition: Some more examples
• Indefinites
  – The wrong way:
    • "Bob ate a waffle" : ate(bob, waffle)
    • "Amy ate a waffle" : ate(amy, waffle)
  – Better translation:
    • ∃x . waffle(x) ∧ ate(bob, x)
    • What does the translation of "a" have to be?
    • What about "the"?
    • What about "every"?
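The better translation can be checked against a toy model (the domain and facts below are invented). One standard answer, assumed here, is that "a" translates to λP.λQ.∃x.P(x) ∧ Q(x) and "every" to λP.λQ.∀x.P(x) → Q(x):

```python
# 'Bob ate a waffle' as ∃x . waffle(x) ∧ ate(bob, x), checked in a toy model.
domain = {"w1", "w2", "p1"}      # invented entities
waffle = {"w1", "w2"}
ate = {("bob", "w1")}            # invented facts

# 'a'     : λP.λQ.∃x . P(x) ∧ Q(x)
# 'every' : λP.λQ.∀x . P(x) → Q(x)
a     = lambda P: lambda Q: any(P(x) and Q(x) for x in domain)
every = lambda P: lambda Q: all(Q(x) for x in domain if P(x))

ate_bob = lambda x: ("bob", x) in ate
print(a(lambda x: x in waffle)(ate_bob))      # True: Bob ate waffle w1
print(every(lambda x: x in waffle)(ate_bob))  # False: Bob did not eat w2
```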
Denotation
• What do we do with the logical form?
  – It has fewer (no?) ambiguities
  – Can check the truth-value against a database
  – More usefully: can add new facts, expressed in language, to an existing relational database
  – Question-answering: can check whether a statement in a corpus entails a question-answer pair:
    "Bob sings and dances"
    Q: "Who sings?" has answer A: "Bob"
  – Can chain together facts for story comprehension
Grounding
• What does the translation likes : λx.λy.likes(y,x) have to do with actual liking?
• Nothing! (unless the denotation model says it does)
• Grounding: relating linguistic symbols to perceptual referents
  – Sometimes a connection to a database entry is enough
  – Other times, you might insist on connecting "blue" to the appropriate portion of the visual EM spectrum
  – Or connect "likes" to an emotional sensation
• Alternative to grounding: meaning postulates
  – You could insist, e.g., that likes(y,x) ⇒ knows(y,x)
More representation issues
• Tense and events
  – In general, you don't get far with verbs as predicates
  – Better to have event variables e
    • "Alice danced" : danced(alice) vs.
    • "Alice danced" : ∃e . dance(e) ∧ agent(e, alice) ∧ time(e) < now
  – Event variables let you talk about non-trivial tense/aspect structures:
    "Alice had been dancing when Bob sneezed"
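The event-based translation can be checked against a toy event record; the record and time values below are invented for illustration:

```python
# '∃e . dance(e) ∧ agent(e, alice) ∧ time(e) < now' over a toy event database.
# The event record and times are invented for illustration.
events = [{"pred": "dance", "agent": "alice", "time": 3}]
now = 10

alice_danced = any(
    e["pred"] == "dance" and e["agent"] == "alice" and e["time"] < now
    for e in events
)
print(alice_danced)  # True
```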
More representation issues
• Propositional attitudes (modal logic)
  – "Bob thinks that I am a gummi bear"
    • thinks(bob, gummi(me))?
    • thinks(bob, "He is a gummi bear")?
  – Usually, the solution involves intensions (^p), which are, roughly, the set of possible worlds in which predicate p is true.
    • thinks(bob, ^gummi(me))
  – Computationally challenging
    • Each agent has to model every other agent's mental state
    • This comes up all the time in language
      – E.g., if you want to talk about what your bill claims that you bought, vs. what you think you bought, vs. what you actually bought.
More representation issues
• Multiple quantifiers:
  "In this country, a woman gives birth every 15 minutes. Our job is to find her, and stop her."
  -- Groucho Marx
• Deciding between readings:
  – "Bob bought a pumpkin every Halloween."
  – "Bob put a warning in every window."
More representation issues
• Other tricky stuff
  – Adverbs
  – Non-intersective adjectives
  – Generalized quantifiers
  – Generics
    • "Cats like naps."
    • "The players scored a goal."
  – Pronouns and anaphora
    • "If you have a dime, put it in the meter."
  – … etc., etc.
Mapping Sentences to Logical Forms
CCG Parsing
• Combinatory Categorial Grammar
  – Lexicalized PCFG
  – Categories encode argument sequences
• A/B means a category that can combine with a B to the right to form an A
• A\B means a category that can combine with a B to the left to form an A
• A syntactic parallel to the lambda calculus
Learning to map sentences to logical form
• Zettlemoyer and Collins (IJCAI 05, EMNLP 07)
Some Training Examples
CCG Lexicon
Parsing Rules (Combinators)
Application:
• Right (>):  X/Y : f   Y : a   ⇒   X : f(a)
• Left (<):   Y : a   X\Y : f   ⇒   X : f(a)
Additional rules:
• Composition
• Type-raising
CCG Parsing Example
Parsing a Question
Lexical Generation
Input Training Example
Sentence: Texas borders Kansas.
Logical form: borders(Texas, Kansas)
GENLEX
• Input: a training example (Si, Li)
• Computation:– Create all substrings of consecutive words in Si
– Create categories from Li
– Create lexical entries that are the cross products of these two sets
• Output: Lexicon Λ
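The cross product step can be sketched as follows; here the candidate categories are listed by hand rather than derived from the logical form, which is the hard part GENLEX templates handle:

```python
# GENLEX sketch: lexical entries = all consecutive-word substrings of the
# sentence, crossed with candidate categories from the logical form.
from itertools import product

def substrings(words):
    """All substrings of consecutive words."""
    return [" ".join(words[i:j]) for i in range(len(words))
            for j in range(i + 1, len(words) + 1)]

# Candidate categories, hand-derived here from borders(Texas, Kansas):
categories = ["NP : texas", "NP : kansas",
              "(S\\NP)/NP : λx.λy.borders(y,x)"]

lexicon = set(product(substrings("Texas borders Kansas".split()), categories))
print(len(lexicon))  # 6 substrings x 3 categories = 18 entries
```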
GENLEX Cross Product
Input Training Example
Sentence: Texas borders Kansas.
Logical form: borders(Texas, Kansas)

Output Substrings:
– Texas
– borders
– Kansas
– Texas borders
– borders Kansas
– Texas borders Kansas

× (cross product)

Output Categories:
– NP : texas
– NP : kansas
– (S\NP)/NP : λx.λy.borders(y,x)
GENLEX Output Lexicon
Words                   Category
Texas                   NP : texas
Texas                   NP : kansas
Texas                   (S\NP)/NP : λx.λy.borders(y,x)
borders                 NP : texas
borders                 NP : kansas
borders                 (S\NP)/NP : λx.λy.borders(y,x)
…                       …
Texas borders Kansas    NP : texas
Texas borders Kansas    NP : kansas
Texas borders Kansas    (S\NP)/NP : λx.λy.borders(y,x)
Weighted CCG
Given a log-linear model with a CCG lexicon Λ, a feature vector f, and weights w, the best parse is:

  y* = argmax_y  w · f(x, y)

where we consider all possible parses y for the sentence x given the lexicon Λ.
Parameter Estimation for Weighted CCG Parsing
Inputs: Training set {(Si, Li) | i = 1, …, n}; initial lexicon Λ; initial weights w; number of iterations T

Computation: For t = 1 … T, i = 1 … n:

Step 1: Check correctness
  Let y* = argmax_y w · f(Si, y). If L(y*) = Li, skip to the next i.

Step 2: Lexical generation
  Set λ = Λ ∪ GENLEX(Si, Li)
  Let y' = argmax_{y s.t. L(y) = Li} w · f(Si, y), parsing with λ
  Define λi to be the lexical entries in y'
  Set Λ = Λ ∪ λi

Step 3: Update parameters
  Let y'' = argmax_y w · f(Si, y)
  If L(y'') ≠ Li, set w = w + f(Si, y') − f(Si, y'')

Output: Lexicon Λ and parameters w
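The Step-3 perceptron update can be sketched as follows; the feature representation (counts of lexical entries used in a parse) and the example parses are simplifications invented for illustration:

```python
# Perceptron update w = w + f(Si, y') - f(Si, y'') over lexical-entry features.
# The parses and feature scheme are invented for illustration.
from collections import Counter

def feature_vector(parse):
    return Counter(parse)  # one count feature per lexical entry used

def score(w, parse):
    return sum(w.get(f, 0.0) * v for f, v in feature_vector(parse).items())

def perceptron_update(w, good_parse, bad_parse):
    """w = w + f(Si, y') - f(Si, y'')"""
    for f, v in feature_vector(good_parse).items():
        w[f] = w.get(f, 0.0) + v
    for f, v in feature_vector(bad_parse).items():
        w[f] = w.get(f, 0.0) - v
    return w

w = {}
good = ["Texas:=NP:texas", "borders:=(S\\NP)/NP", "Kansas:=NP:kansas"]
bad  = ["Texas borders:=NP:texas", "Kansas:=NP:kansas"]
w = perceptron_update(w, good, bad)
print(score(w, good) > score(w, bad))  # True after the update
```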
Example Learned Lexical Entries
Challenge Revisited
Disharmonic Application
Missing Content Words
Missing content-free words
A complete parse
Geo880 Test Set
                               Precision   Recall   F1
Zettlemoyer & Collins 2007     95.49       83.20    88.93
Zettlemoyer & Collins 2005     96.25       79.29    86.95
Wong & Mooney 2007             93.72       80.00    86.31
Summing Up
• Hypothesis: Principle of Compositionality
  – Semantics of NL sentences and phrases can be composed from the semantics of their subparts
• Rules can be derived which map syntactic analyses to semantic representations (Rule-to-Rule Hypothesis)
  – Lambda notation provides a way to extend FOPL to this end
  – But coming up with rule-to-rule mappings is hard
• Idioms, metaphors, and other non-compositional aspects of language make things tricky (e.g., fake gun)