Post on 24-May-2020
Phrase Structures and Syntax
ANLP: Lecture 11
Shay Cohen
School of Informatics, University of Edinburgh
8 October 2019
Until now...
- Focused mostly on regular languages
  - Finite state machines and transducers
  - n-gram models
  - Hidden Markov Models
  - Viterbi search and friends
- ... Next: going up one level in the Chomsky hierarchy
Recap: The Chomsky hierarchy
[Figure: nested sets, innermost to outermost: Regular ⊂ Context-free ⊂ Context-sensitive ⊂ Recursively enumerable]
Side note: Is English Regular?
Centre-embedding:
[The cat_1 likes tuna fish_1].
[The cat_1 [the dog_2 chased_2] likes tuna fish_1].
[The cat_1 [the dog_2 [the rat_3 bit_3] chased_2] likes tuna fish_1].
Consider L = {(the N)^n TV^m likes tuna fish | n, m ≥ 0}
where N = { cat, dog, rat, elephant, kangaroo, ... }
TV = { chased, bit, admired, ate, befriended, ... }
Clearly L is regular. However, L ∩ English is the language
{(the N)^n TV^(n−1) likes tuna fish | n ≥ 1}
The pumping lemma can be used to show that L ∩ English is not regular. Since regular languages are closed under intersection and L is regular, English cannot be regular either.
Assumption 1. “(the N)^n TV^m likes tuna fish” is ungrammatical for m ≠ n − 1.
Assumption 2. “(the N)^n TV^(n−1) likes tuna fish” is grammatical for all n ≥ 1. (Is this reasonable? You decide!)
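The contrast between L and L ∩ English can be made concrete with a small sketch. The vocabularies below are hypothetical finite stand-ins for the open-ended sets N and TV, and the function names are illustrative only. Membership in L needs nothing more than a regular expression, while membership in L ∩ English (under Assumptions 1 and 2) additionally requires comparing two unbounded counts, which is exactly what a finite-state device cannot do.

```python
import re

# Hypothetical finite stand-ins for the sets N and TV from the slide.
NOUNS = {"cat", "dog", "rat", "elephant", "kangaroo"}
TVERBS = {"chased", "bit", "admired", "ate", "befriended"}

noun_alt = "|".join(sorted(NOUNS))
verb_alt = "|".join(sorted(TVERBS))

# L = (the N)^n TV^m likes tuna fish, with n, m >= 0: a regular expression suffices.
L_PATTERN = re.compile(
    rf"(?:the (?:{noun_alt}) )*(?:(?:{verb_alt}) )*likes tuna fish"
)


def in_L(sentence: str) -> bool:
    """Membership in the regular language L."""
    return L_PATTERN.fullmatch(sentence) is not None


def in_L_and_english(sentence: str) -> bool:
    """Membership in L ∩ English, assuming Assumptions 1 and 2 hold."""
    if not in_L(sentence):
        return False
    words = sentence.split()
    n = sum(w in NOUNS for w in words)   # number of "the N" groups
    m = sum(w in TVERBS for w in words)  # number of transitive verbs
    return n >= 1 and m == n - 1         # the counts must match up


print(in_L("the cat the dog likes tuna fish"))              # in L
print(in_L_and_english("the cat the dog likes tuna fish"))  # but not in L ∩ English
```

The second function counts with Python integers, so it is not a finite-state recognizer; that is the point of the comparison.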
The NLP Pipeline
Grammar Writing Exercise
Date: October 25 (Friday during lecture time)
You will write a grammar for the English language
There will be a competition between the grammars for “precision” and “recall”
You should be able to start working on your grammar by the end of this class
More details here: http://www.inf.ed.ac.uk/teaching/courses/anlp/cgw
There will be prizes!
Computing meaning
A well-studied, difficult, and unsolved problem.
Fortunately, we know enough to have made partial progress (Watson won).
Over the next few weeks, we will work up to the study of systems that can assign logical forms that mathematically state the meaning of a sentence, so that they can be processed by machines.
Our first stop will be natural language syntax.
Natural language syntax
Syntax provides the scaffolding for semantic composition.
The brown dog on the mat saw the striped cat through the window.
The brown cat saw the striped dog through the window on the mat.
Do the two sentences above mean the same thing? What is the process by which you computed their meanings?
Constituents
Words in a sentence often form groupings that can combine with other units to produce meaning. These groupings, called constituents, can often be identified by substitution tests (much like parts of speech!).
Kim [read a book], [gave it to Sandy], and [left]
You said I should read the book and [read it] I did.
Kim read [a very interesting book about grammar].
Heads and Phrases
Noun (N): Noun Phrase (NP)
Adjective (A): Adjective Phrase (AP)
Verb (V): Verb Phrase (VP)
Preposition (P): Prepositional Phrase (PP)
- So far we have looked at terminals (words or POS tags).
- Today, we’ll look at non-terminals, which correspond to phrases.
- The part of speech that a word belongs to is closely linked to the type of constituent that it is associated with.
- In an X-phrase (e.g. NP), the key occurrence of X (e.g. N) is called the head, and controls how the phrase interacts (both syntactically and semantically) with the rest of the sentence.
- In English, the head tends to appear in the middle of a phrase.
Constituents have structure
English NPs are commonly of the form:
(Det) Adj* Noun (PP | RelClause)*
NP: the angry duck that tried to bite me, head: duck.
VPs are commonly of the form:
(Aux) Adv* Verb Arg* Adjunct*
Arg → NP | PP
Adjunct → PP | AdvP | ...
VP: usually eats artichokes for dinner, head: eat.
In Japanese, Korean, Hindi, Urdu, and other head-final languages, the head is at the end of its associated phrase.
In Irish, Welsh, Scots Gaelic and other head-initial languages, the head is at the beginning of its associated phrase.
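The NP template above is itself a regular expression over category labels, so it can be checked mechanically. The following is a minimal sketch under a strong assumption: the PP and relative clause have already been grouped into single units labelled PP and RelClause, and the tag names and the function name are illustrative choices, not part of the lecture.

```python
import re

# (Det) Adj* Noun (PP | RelClause)* written over space-separated category labels.
NP_PATTERN = re.compile(r"(Det )?(Adj )*Noun( (PP|RelClause))*")


def looks_like_np(tags):
    """True if the sequence of category labels fits the common English NP template."""
    return NP_PATTERN.fullmatch(" ".join(tags)) is not None


# "the angry duck [that tried to bite me]": Det Adj Noun RelClause
print(looks_like_np(["Det", "Adj", "Noun", "RelClause"]))
# An adjective before the determiner does not fit the template.
print(looks_like_np(["Adj", "Det", "Noun"]))
```

The template only describes the top level of the phrase. The PP inside an NP contains another NP, and that nesting is what the context-free rules later in the lecture capture.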
WALS - Subject Verb Object order
Taken from https://wals.info/feature/81A#2/5.6/172.8
Desirable Properties of a Grammar
Chomsky specified two properties that make a grammar “interesting and satisfying”:
- It should be a finite specification of the strings of the language, rather than a list of its sentences.
- It should be revealing, in allowing strings to be associated with meaning (semantics) in a systematic way.
We can add another desirable property:
- It should capture structural and distributional properties of the language. (E.g. where heads of phrases are located; how a sentence transforms into a question; which phrases can float around the sentence.)
Desirable Properties of a Grammar
- Context-free grammars (CFGs) provide a pretty good approximation.
- Some features of NLs are more easily captured using mildly context-sensitive grammars, as we’ll see later in the course.
- There are also more modern grammar formalisms that better capture structural and distributional properties of human languages. (E.g. combinatory categorial grammar.)
- Programming language grammars (such as the ones used with compilers, like LL(1)) aren’t enough for NLs.
A Tiny Fragment of English
Let’s say we want to capture in a grammar the structural and distributional properties that give rise to sentences like:
A duck walked in the park.          NP, V, PP
The man walked with a duck.         NP, V, PP
You made a duck.                    Pro, V, NP
You made her duck.                  ? Pro, V, NP
A man with a telescope saw you.     NP, PP, V, Pro
A man saw you with a telescope.     NP, V, Pro, PP
You saw a man with a telescope.     Pro, V, NP, PP
We want to write grammatical rules that generate these phrase structures, and lexical rules that generate the words appearing in them.
Grammar for the Tiny Fragment of English
Grammar G1 generates the sentences on the previous slide:
Grammatical rules       Lexical rules
S → NP VP               Det → a | the | her (determiners)
NP → Det N              N → man | park | duck | telescope (nouns)
NP → Det N PP           Pro → you (pronoun)
NP → Pro                V → saw | walked | made (verbs)
VP → V NP PP            Prep → in | with | for (prepositions)
VP → V NP
VP → V
PP → Prep NP
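One way to get a feel for G1 is to write it down as data and let it generate. The sketch below encodes the rules of G1 as a Python dictionary and expands symbols at random; the dictionary layout, the function name and the depth cap are illustrative choices, not something prescribed by the lecture.

```python
import random

# G1 as a mapping from each non-terminal to its list of right-hand sides.
G1 = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Det", "N", "PP"], ["Pro"]],
    "VP": [["V", "NP", "PP"], ["V", "NP"], ["V"]],
    "PP": [["Prep", "NP"]],
    "Det": [["a"], ["the"], ["her"]],
    "N": [["man"], ["park"], ["duck"], ["telescope"]],
    "Pro": [["you"]],
    "V": [["saw"], ["walked"], ["made"]],
    "Prep": [["in"], ["with"], ["for"]],
}


def generate(symbol="S", rng=random, depth=0, max_depth=6):
    """Expand `symbol` into a list of words by choosing productions at random."""
    if symbol not in G1:           # anything without a rule is a terminal (a word)
        return [symbol]
    options = G1[symbol]
    if depth >= max_depth:         # illustrative safeguard: force the shortest rule
        options = [min(options, key=len)]
    words = []
    for child in rng.choice(options):
        words.extend(generate(child, rng, depth + 1, max_depth))
    return words


rng = random.Random(0)             # seeded so that runs are repeatable
for _ in range(3):
    print(" ".join(generate(rng=rng)))
```

Many of the generated strings are grammatical according to G1 but odd as English (for example a bare "made" with no object), which is a useful reminder that a grammar can over-generate.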
Context-free grammars: formal definition
A context-free grammar (CFG) G consists of
- a finite set N of non-terminals,
- a finite set Σ of terminals, disjoint from N,
- a finite set P of productions of the form X → α, where X ∈ N, α ∈ (N ∪ Σ)*,
- a choice of start symbol S ∈ N.
A sentential form is any sequence of terminals and non-terminals that can appear in a derivation starting from the start symbol.
Formal definition: The set of sentential forms derivable from G is the smallest set S(G) ⊆ (N ∪ Σ)* such that
- S ∈ S(G)
- if αXβ ∈ S(G) and X → γ ∈ P, then αγβ ∈ S(G).
The language associated with a grammar is the set of sentential forms that contain only terminals.
Formal definition: The language associated with G is defined by L(G) = S(G) ∩ Σ*
A language L ⊆ Σ* is defined to be context-free if there exists some CFG G such that L = L(G).
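The "smallest set" definition can be read as a closure computation: start from {S} and keep applying productions until nothing new appears. Here is a minimal sketch of that reading, using a hypothetical toy grammar (a non-recursive cut-down of the fragment grammar) so that S(G) is finite. The length bound is an added safeguard and is only sound when no production has an empty right-hand side.

```python
def sentential_forms(productions, start, max_len):
    """Close {start} under rewriting, keeping forms of length <= max_len.

    Assumes no production has an empty right-hand side, so forms never shrink.
    """
    forms = {(start,)}
    frontier = [(start,)]
    while frontier:
        form = frontier.pop()
        for i, symbol in enumerate(form):
            for rhs in productions.get(symbol, []):
                # Rewrite alpha X beta into alpha gamma beta.
                new = form[:i] + tuple(rhs) + form[i + 1:]
                if len(new) <= max_len and new not in forms:
                    forms.add(new)
                    frontier.append(new)
    return forms


# Hypothetical toy grammar: N = {S, NP, VP, Pro, V}, Σ = {you, saw}.
TOY = {
    "S": [["NP", "VP"]],
    "NP": [["Pro"]],
    "VP": [["V"], ["V", "NP"]],
    "Pro": [["you"]],
    "V": [["saw"]],
}
TERMINALS = {"you", "saw"}

forms = sentential_forms(TOY, "S", max_len=3)
# L(G) = S(G) ∩ Σ*: keep only the forms made entirely of terminals.
language = {f for f in forms if all(s in TERMINALS for s in f)}
print(sorted(" ".join(f) for f in language))
```

For a recursive grammar such as G1 extended with conjunction, S(G) is infinite, and the same function then returns only the forms up to the chosen length.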
Assorted remarks
- X → α1 | α2 | · · · | αn is simply an abbreviation for a bunch of productions X → α1, X → α2, ..., X → αn.
- These grammars are called context-free because a rule X → α says that an X can always be expanded to α, no matter where the X occurs. This contrasts with context-sensitive rules, which might allow us to expand X only in certain contexts, e.g. bXc → bαc.
- Broad intuition: context-free languages allow nesting of structures to arbitrary depth. E.g. brackets, begin-end blocks, if-then-else statements, subordinate clauses in English, ...
Grammar for the Tiny Fragment of English
Does G1 produce a finite or an infinite number of sentences?
Recursion
Recursion in a grammar makes it possible to generate an infinite number of sentences.
In direct recursion, a non-terminal on the LHS of a rule also appears on its RHS. The following rules add direct recursion to G1:
VP → VP Conj VP
Conj → and | or
In indirect recursion, some non-terminal can be expanded (via several steps) to a sequence of symbols containing that non-terminal:
NP → Det N PP
PP → Prep NP
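Whether a grammar is recursive can be checked mechanically by treating the rules as a graph: draw an edge from each non-terminal to every non-terminal on one of its right-hand sides, and look for symbols that can reach themselves. The sketch below does this for the phrase-level rules of G1; the function name and dictionary layout are illustrative, and lexical categories are left without rules so they are treated as leaves.

```python
def recursive_nonterminals(grammar):
    """Return the non-terminals that can derive a form containing themselves."""

    def reachable(start):
        # Non-terminals reachable from `start` in one or more rewriting steps.
        seen, stack = set(), [start]
        while stack:
            for rhs in grammar.get(stack.pop(), []):
                for sym in rhs:
                    if sym in grammar and sym not in seen:
                        seen.add(sym)
                        stack.append(sym)
        return seen

    return {nt for nt in grammar if nt in reachable(nt)}


# Phrase-level rules of G1 (Det, N, Pro, V, Prep have no rules here).
G1_RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Det", "N", "PP"], ["Pro"]],
    "VP": [["V", "NP", "PP"], ["V", "NP"], ["V"]],
    "PP": [["Prep", "NP"]],
}

print(sorted(recursive_nonterminals(G1_RULES)))      # indirect recursion via NP and PP

# Adding VP → VP Conj VP introduces direct recursion on VP as well.
WITH_CONJ = dict(G1_RULES, VP=G1_RULES["VP"] + [["VP", "Conj", "VP"]])
print(sorted(recursive_nonterminals(WITH_CONJ)))
```

So G1 is already recursive through NP and PP, and it generates infinitely many sentences, such as "a man with a duck in the park with a telescope ...". In general recursion only yields an infinite language when the recursive symbol is reachable from S and every symbol involved can eventually be rewritten into words, which holds for G1.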
Structural Ambiguity
You saw a man with a telescope.
PP attached to the VP:
[S [NP [Pro You]] [VP [V saw] [NP [Det a] [N man]] [PP [Prep with] [NP [Det a] [N telescope]]]]]
PP attached to the NP:
[S [NP [Pro You]] [VP [V saw] [NP [Det a] [N man] [PP [Prep with] [NP [Det a] [N telescope]]]]]]
This illustrates attachment ambiguity: the PP can be a part of the VP or of the NP. Note that there’s no POS ambiguity here.
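The two attachments can be recovered mechanically by enumerating every way the rules of G1 can cover the sentence. The following is a minimal sketch of an exhaustive, memoised top-down parser, not one of the parsing algorithms named in the lecture summary. The function names and the nested-tuple tree format are illustrative, and the approach assumes a grammar with no empty right-hand sides and no cycles of unit rules, which is true of G1.

```python
from functools import lru_cache

G1 = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Det", "N", "PP"], ["Pro"]],
    "VP": [["V", "NP", "PP"], ["V", "NP"], ["V"]],
    "PP": [["Prep", "NP"]],
    "Det": [["a"], ["the"], ["her"]],
    "N": [["man"], ["park"], ["duck"], ["telescope"]],
    "Pro": [["you"]],
    "V": [["saw"], ["walked"], ["made"]],
    "Prep": [["in"], ["with"], ["for"]],
}


def parse_trees(words, grammar=G1, start="S"):
    """Return every parse tree of `words` as nested tuples (label, child, ...)."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def parse(symbol, i, j):
        # All trees rooted in `symbol` that cover exactly words[i:j].
        if symbol not in grammar:
            # A terminal covers one matching word; its "tree" is the word itself.
            return (symbol,) if j == i + 1 and words[i] == symbol else ()
        trees = []
        for rhs in grammar[symbol]:
            for children in expand(tuple(rhs), i, j):
                trees.append((symbol,) + children)
        return tuple(trees)

    def expand(rhs, i, j):
        # All ways to divide words[i:j] among the symbols of `rhs`.
        if len(rhs) == 1:
            return [(tree,) for tree in parse(rhs[0], i, j)]
        results = []
        for k in range(i + 1, j - len(rhs) + 2):   # leave one word per later symbol
            for left in parse(rhs[0], i, k):
                for rest in expand(rhs[1:], k, j):
                    results.append((left,) + rest)
        return results

    return parse(start, 0, len(words))


def bracket(tree):
    """Render a tree in labelled-bracket notation."""
    if isinstance(tree, str):
        return tree
    return "[" + tree[0] + " " + " ".join(bracket(c) for c in tree[1:]) + "]"


for tree in parse_trees("you saw a man with a telescope".split()):
    print(bracket(tree))
```

Because every split point is tried, the cost grows quickly with sentence length. The sketch is meant to make the ambiguity visible, not to be efficient.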
Structural Ambiguity
Grammar G1 only gives us one analysis of “you made her duck”:
[S [NP [Pro You]] [VP [V made] [NP [Det her] [N duck]]]]
There is another, ditransitive (i.e., two-object) analysis of this sentence, one that underlies the pair:
What did you make for her?
You made her duck.
Structural Ambiguity
For this alternative, G1 also needs rules like:
NP → N
VP → V NP NP
Pro → her
[S [NP [Pro You]] [VP [V made] [NP [Det her] [N duck]]]]
[S [NP [Pro You]] [VP [V made] [NP [Pro her]] [NP [N duck]]]]
In this case, the structural ambiguity is rooted in POS ambiguity.
Structural Ambiguity
There is a third analysis as well, one that underlies the pair:
What did you make her do?
You made her duck. (move head or body quickly downwards)
Here, the small clause (her duck) is the direct object of a verb.
Similar small clauses are possible with verbs like see, hear and notice, but not ask, want, persuade, etc.
G1 needs a rule that requires accusative case-marking on the subject of a small clause and no tense on its verb:
VP → V S1
S1 → NP(acc) VP(untensed)
NP(acc) → her | him | them
Structural Ambiguity
Now we have three analyses for “you made her duck”:
[S [NP [Pro You]] [VP [V made] [NP [Det her] [N duck]]]]
[S [NP [Pro You]] [VP [V made] [NP [Pro her]] [NP [N duck]]]]
[S [NP [Pro You]] [VP [V made] [S [NP(acc) her] [VP [V duck]]]]]
How can we compute these analyses automatically?
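As a first, naive answer to that question, the number of analyses can be computed by summing over every rule and every split point. The sketch below counts parses under G1 and under an extended grammar G2 that adds the rules from the preceding slides. Two rules in G2 are assumptions made purely for illustration, since the slides do not spell them out: VP(untensed) → V(untensed) and V(untensed) → duck. The function name is also illustrative.

```python
from functools import lru_cache

G1 = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Det", "N", "PP"], ["Pro"]],
    "VP": [["V", "NP", "PP"], ["V", "NP"], ["V"]],
    "PP": [["Prep", "NP"]],
    "Det": [["a"], ["the"], ["her"]],
    "N": [["man"], ["park"], ["duck"], ["telescope"]],
    "Pro": [["you"]],
    "V": [["saw"], ["walked"], ["made"]],
    "Prep": [["in"], ["with"], ["for"]],
}

# G2 = G1 plus the extra rules for the ditransitive and small-clause analyses.
G2 = dict(G1)
G2["NP"] = G1["NP"] + [["N"]]
G2["VP"] = G1["VP"] + [["V", "NP", "NP"], ["V", "S1"]]
G2["Pro"] = G1["Pro"] + [["her"]]
G2["S1"] = [["NP(acc)", "VP(untensed)"]]
G2["NP(acc)"] = [["her"], ["him"], ["them"]]
G2["VP(untensed)"] = [["V(untensed)"]]   # assumption: not given on the slides
G2["V(untensed)"] = [["duck"]]           # assumption: not given on the slides


def count_parses(words, grammar, start="S"):
    """Count the parse trees of `words`; assumes no empty rules or unit-rule cycles."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        # Number of trees rooted in `symbol` covering exactly words[i:j].
        if symbol not in grammar:
            return int(j == i + 1 and words[i] == symbol)
        return sum(count_seq(tuple(rhs), i, j) for rhs in grammar[symbol])

    def count_seq(rhs, i, j):
        # Number of ways the symbols of `rhs` can jointly cover words[i:j].
        if len(rhs) == 1:
            return count(rhs[0], i, j)
        return sum(
            count(rhs[0], i, k) * count_seq(rhs[1:], k, j)
            for k in range(i + 1, j - len(rhs) + 2)
        )

    return count(start, 0, len(words))


sentence = "you made her duck".split()
print(count_parses(sentence, G1))   # analyses available under G1
print(count_parses(sentence, G2))   # analyses available once the extra rules are added
```

Counting and enumerating share the same recursion. Replacing the sums and products by lists of trees gives the analyses themselves, and doing this efficiently is the job of the parsing algorithms the course turns to next.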
A Fun Exercise - Which is the VP?
saw the car from my house window with my telescope
[Figure: candidate tree structures (a) to (f) for this phrase, plus the option “A new one?”]
Questions to Ask Yourself
- Can this context-free grammar formalism tackle all syntactic phenomena?
- Where do we get the grammar from? How big would it be?
- How do we take a grammar and a sentence, and get a tree for the sentence from the grammar efficiently?
- How would we introduce probabilities into the use of a context-free grammar?
Summary
- We use CFGs to represent NL grammars
- Grammars need recursion to produce infinitely many sentences
- Most NL grammars have structural ambiguity
- A parser computes structure for an input automatically
- Recursive descent and shift-reduce parsing