CPE 480 Natural Language Processing

Lecture 4: Syntax

Adapted from Owen Rambow’s slides for CSc 84010 Fall 2006

What is Syntax?

• Study of the structure of language

• Roughly, the goal is to relate surface form (what we perceive when someone says something) to semantics (what that utterance means)

What is Syntax Not?

• Phonology: study of sound systems and how sounds combine

• Morphology: study of how words are formed from smaller parts (morphemes)

• Semantics: study of meaning of language

What is Syntax? (2)

• Study of the structure of language

• Specifically, the goal is to relate an interface to the morphological component to an interface to the semantic component

• Note: interface to morphological component may look like written text

• Representational device is tree structure

Simplified View of Linguistics

Phonology: /waddyasai/

Morphology: /waddyasai/ → what did you say

Syntax: what did you say → say (subj: you, obj: what)

Semantics: say (subj: you, obj: what) → P[ λx. say(you, x) ]

Structure in Strings

• Some words: the a small nice big very boy girl sees likes

• Some good sentences:
  o the boy likes a girl
  o the small girl likes the big girl
  o a very small nice boy sees a very nice boy

• Some bad sentences:
  o *the boy the girl
  o *small boy likes nice girl

• Can we find subsequences of words (constituents) which in some way behave alike?

Structure in Strings: Proposal 1

• Some words: the a small nice big very boy girl sees likes

• Some good sentences:
  o (the) boy (likes a girl)
  o (the small) girl (likes the big girl)
  o (a very small nice) boy (sees a very nice boy)

• Some bad sentences:
  o *(the) boy (the girl)
  o *(small) boy (likes the nice girl)

Structure in Strings: Proposal 2

• Some words: the a small nice big very boy girl sees likes

• Some good sentences:
  o (the boy) likes (a girl)
  o (the small girl) likes (the big girl)
  o (a very small nice boy) sees (a very nice boy)

• Some bad sentences:
  o *(the boy) (the girl)
  o *(small boy) likes (the nice girl)

• This is a better proposal: fewer types of constituents (blue and red are of the same type)

More Structure in Strings: Proposal 2 (ctd)

• Some words: the a small nice big very boy girl sees likes

• Some good sentences:
  o ((the) boy) likes ((a) girl)
  o ((the) (small) girl) likes ((the) (big) girl)
  o ((a) ((very) small) (nice) boy) sees ((a) ((very) nice) girl)

• Some bad sentences:
  o *((the) boy) ((the) girl)
  o *((small) boy) likes ((the) (nice) girl)

From Substrings to Trees

• (((the) boy) likes ((a) girl))

[Figure: the bracketing above drawn as a tree over the words the, boy, likes, a, girl]
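The step from bracketed substrings to a tree can be sketched in a few lines of Python. This is only an illustration and not part of the original slides: the function names parse_brackets and head_of, and the choice of nested lists as the tree representation, are assumptions made for this example.

```python
def parse_brackets(text):
    """Turn a bracketed string into nested lists (one list per constituent).

    Returns the list of top-level constituents found in `text`.
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    stack = [[]]  # stack[-1] is the constituent currently being filled
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


def head_of(constituent):
    """Return the single non-bracketed word of a constituent (hypothetical helper)."""
    words = [item for item in constituent if isinstance(item, str)]
    if len(words) != 1:
        raise ValueError("expected exactly one non-bracketed word")
    return words[0]


tree = parse_brackets("(((the) boy) likes ((a) girl))")[0]
print(tree)            # nested lists mirroring the brackets
print(head_of(tree))   # the word directly inside the outermost brackets
```

Each pair of brackets becomes one node, and the word that sits directly inside a pair of brackets (not inside any deeper pair) is the word that node is built around.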

Node Labels?

• ( ((the) boy) likes ((a) girl) )

• Choose constituents so each one has one non-bracketed word: the head

• Group words by distribution of constituents they head (part-of-speech, POS):
  o Noun (N), verb (V), adjective (Adj), adverb (Adv), determiner (Det)

• Category of constituent: XP, where X is POS
  o NP, S, AdjP, AdvP, DetP

Node Labels

• (((the/Det) boy/N) likes/V ((a/Det) girl/N))

Tree: [S [NP [DetP the] boy] likes [NP [DetP a] girl]]

Types of Nodes

• (((the/Det) boy/N) likes/V ((a/Det) girl/N))

Tree: [S [NP [DetP the] boy] likes [NP [DetP a] girl]]

• This is a phrase-structure tree

• Nonterminal symbols = constituents

• Terminal symbols = words

Determining Part-of-Speech

o noun or adjective?

  a blue seat            a child seat
  a very blue seat       *a very child seat
  this seat is blue      *this seat is child

blue and child are not the same POS

blue is Adj, child is Noun

Determining Part-of-Speech (2)

o preposition or particle?

  A  he threw out the garbage
  B  he threw the garbage out the door

  A  he threw the garbage out
  B  *he threw the garbage the door out

The two occurrences of out are not the same POS: in A it is a particle, in B a preposition

Word Classes (=POS)

• Heads of constituents fall into distributionally defined classes

• Additional support for the definition of word classes comes from morphology

Some Points on POS Tag Sets

• Possible basic set: N, V, Adj, Adv, P, Det, Aux, Comp, Conj

• 2 supertypes: open- and closed-class
  o Open: N, V, Adj, Adv
  o Closed: P, Det, Aux, Comp, Conj

• Many subtypes:
  o eat/V → eat/VB, eat/VBP, eats/VBZ, ate/VBD, eaten/VBN, eating/VBG
  o Reflect morphological form & syntactic function
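The relation between the basic set and its subtypes can be written down as a small lookup table. The sketch below is illustrative only: the verb subtype tags are the ones listed above, while the names COARSE_TAG, coarse_pos and is_open_class are assumptions made for this example.

```python
# Verb subtypes from the list above, mapped back to the basic class V.
COARSE_TAG = {
    "VB": "V",   # base form: eat
    "VBP": "V",  # non-3rd-person singular present: eat
    "VBZ": "V",  # 3rd-person singular present: eats
    "VBD": "V",  # past tense: ate
    "VBN": "V",  # past participle: eaten
    "VBG": "V",  # gerund / present participle: eating
}

OPEN_CLASS = {"N", "V", "Adj", "Adv"}
CLOSED_CLASS = {"P", "Det", "Aux", "Comp", "Conj"}


def coarse_pos(tagged_word):
    """Split 'word/TAG' and map a subtype tag to its basic class if known."""
    word, tag = tagged_word.rsplit("/", 1)
    return word, COARSE_TAG.get(tag, tag)


def is_open_class(tag):
    """True if the tag (basic or subtype) belongs to an open class."""
    return COARSE_TAG.get(tag, tag) in OPEN_CLASS


for item in ["eat/VB", "eats/VBZ", "ate/VBD", "the/Det"]:
    print(item, "->", coarse_pos(item))
```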

Phrase Structure and Dependency Structure

Dependency structure:

  likes/V
    boy/N
      the/Det
    girl/N
      a/Det

All nodes are labeled with words!

Phrase structure:

  [S [NP [DetP the] boy] likes [NP [DetP a] girl]]

Only leaf nodes labeled with words!

Phrase Structure and Dependency Structure (ctd)

Dependency structure:

  likes/V
    boy/N
      the/Det
    girl/N
      a/Det

Phrase structure:

  [S [NP [DetP the] boy] likes [NP [DetP a] girl]]

Representationally equivalent if each nonterminal node has one lexical daughter (its head)
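The equivalence can be made concrete with a short conversion routine. This is a minimal sketch under the stated condition (every nonterminal has exactly one lexical daughter); the tuple encoding of trees and the name to_dependency are assumptions for this example, not something the slides prescribe.

```python
# A phrase-structure node is (label, children); a child is either a word
# (str, the lexical daughter) or another node.
PHRASE_TREE = (
    "S",
    [
        ("NP", [("DetP", ["the"]), "boy"]),
        "likes",
        ("NP", [("DetP", ["a"]), "girl"]),
    ],
)


def to_dependency(node):
    """Convert a phrase-structure node to (head_word, [dependent subtrees])."""
    label, children = node
    words = [c for c in children if isinstance(c, str)]
    if len(words) != 1:
        raise ValueError(f"{label} must have exactly one lexical daughter")
    dependents = [to_dependency(c) for c in children if not isinstance(c, str)]
    return (words[0], dependents)


print(to_dependency(PHRASE_TREE))
```

The conversion drops the category labels; going back requires the XP convention (a node headed by a word of POS X is labeled XP, with S for the verb-headed node) plus the word order, which this sketch does not record.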

Types of Dependency

[Figure: dependency tree headed by likes/V over the words the/Det, a/Det, very/Adv, small/Adj, boy/N, girl/N, sometimes/Adv, with arcs labeled Subj, Obj, Adj(unct), and Fw]

Grammatical Relations

• Types of relations between words
  o Arguments: subject, object, indirect object, prepositional object
  o Adjuncts: temporal, locative, causal, manner, …
  o Function Words

Subcategorization

• List of arguments of a word (typically, a verb), with features about their realization (POS, perhaps case, verb form, etc.)

• In canonical order Subject-Object-IndObj

• Example:
  o like: N-N, N-V(to-inf)
  o see: N, N-N, N-N-V(inf)

• Note: J&M talk about subcategorization only within VP
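A subcategorization lexicon of this kind can be stored as a plain table from words to argument frames. The following is a sketch only, using the two example entries above; the names SUBCAT and licenses are hypothetical.

```python
# Frames are tuples in canonical order (subject first), copied from the
# like/see examples above.
SUBCAT = {
    "like": [("N", "N"), ("N", "V(to-inf)")],
    "see": [("N",), ("N", "N"), ("N", "N", "V(inf)")],
}


def licenses(verb, arguments):
    """True if `verb` has a frame matching the given argument realizations."""
    return tuple(arguments) in SUBCAT.get(verb, [])


print(licenses("like", ["N", "N"]))   # frame listed for like
print(licenses("like", ["N"]))        # no one-argument frame listed for like
print(licenses("see", ["N"]))         # frame listed for see
```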

What About the VP?

Without VP: [S [NP [DetP the] boy] likes [NP [DetP a] girl]]

With VP: [S [NP [DetP the] boy] [VP likes [NP [DetP a] girl]]]

Context-Free Grammars

• Defined in formal language theory

• Terminals, nonterminals, start symbol, rules

• String-rewriting system

• Start with start symbol, rewrite using rules, done when only terminals left

• NOT A LINGUISTIC THEORY, just a formal device

CFG: Example

• Many possible CFGs for English, here is an example (fragment):
  o S → NP VP
  o VP → V NP
  o NP → DetP N | AdjP NP
  o AdjP → Adj | Adv AdjP
  o N → boy | girl
  o V → sees | likes
  o Adj → big | small
  o Adv → very
  o DetP → a | the

the very small boy likes a girl
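To experiment with this fragment, the rules can be written as a Python dictionary and checked with a small top-down recognizer. This is a minimal sketch, not a method from the slides: the names GRAMMAR, ends and accepts are assumptions, and the simple recursion only works because this fragment has no left-recursive rules.

```python
# The grammar fragment above: nonterminal -> list of right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"]],
    "NP": [["DetP", "N"], ["AdjP", "NP"]],
    "AdjP": [["Adj"], ["Adv", "AdjP"]],
    "N": [["boy"], ["girl"]],
    "V": [["sees"], ["likes"]],
    "Adj": [["big"], ["small"]],
    "Adv": [["very"]],
    "DetP": [["a"], ["the"]],
}


def ends(symbol, tokens, start):
    """All positions where `symbol` can end when it starts at `start`."""
    if symbol not in GRAMMAR:  # terminal symbol: must match the next word
        if start < len(tokens) and tokens[start] == symbol:
            return {start + 1}
        return set()
    result = set()
    for rhs in GRAMMAR[symbol]:
        positions = {start}
        for part in rhs:  # thread the possible positions through the rule
            positions = {e for p in positions for e in ends(part, tokens, p)}
        result |= positions
    return result


def accepts(sentence):
    """True if the start symbol S derives exactly the given word string."""
    tokens = sentence.split()
    return len(tokens) in ends("S", tokens, 0)


for s in ["the boy likes a girl",
          "the boy the girl",
          "very small the boy likes a girl",
          "the very small boy likes a girl"]:
    print(s, "->", accepts(s))
```

One thing the experiment makes visible: with the rules exactly as transcribed, AdjP attaches outside the whole NP (NP → AdjP NP), so the recognizer accepts "very small the boy likes a girl" and rejects "the very small boy likes a girl". Deriving the latter would need a rule that places AdjP between DetP and N, which is not part of this fragment.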

Derivations in a CFG

S → NP VP
VP → V NP
NP → DetP N | AdjP NP
AdjP → Adj | Adv AdjP
N → boy | girl
V → sees | likes
Adj → big | small
Adv → very
DetP → a | the

String: S

Tree: S

Derivations in a CFG

S → NP VP
VP → V NP
NP → DetP N | AdjP NP
AdjP → Adj | Adv AdjP
N → boy | girl
V → sees | likes
Adj → big | small
Adv → very
DetP → a | the

String: NP VP

Tree: [S NP VP]

Derivations in a CFG

S → NP VP
VP → V NP
NP → DetP N | AdjP NP
AdjP → Adj | Adv AdjP
N → boy | girl
V → sees | likes
Adj → big | small
Adv → very
DetP → a | the

String: DetP N VP

Tree: [S [NP DetP N] VP]

Derivations in a CFG

S → NP VP
VP → V NP
NP → DetP N | AdjP NP
AdjP → Adj | Adv AdjP
N → boy | girl
V → sees | likes
Adj → big | small
Adv → very
DetP → a | the

String: the boy VP

Tree: [S [NP [DetP the] [N boy]] VP]

Derivations in a CFG

S → NP VP
VP → V NP
NP → DetP N | AdjP NP
AdjP → Adj | Adv AdjP
N → boy | girl
V → sees | likes
Adj → big | small
Adv → very
DetP → a | the

String: the boy likes NP

Tree: [S [NP [DetP the] [N boy]] [VP [V likes] NP]]

Derivations in a CFG

S → NP VP
VP → V NP
NP → DetP N | AdjP NP
AdjP → Adj | Adv AdjP
N → boy | girl
V → sees | likes
Adj → big | small
Adv → very
DetP → a | the

String: the boy likes a girl

Tree: [S [NP [DetP the] [N boy]] [VP [V likes] [NP [DetP a] [N girl]]]]

Derivations in a CFG; Order of Derivation Irrelevant

S → NP VP
VP → V NP
NP → DetP N | AdjP NP
AdjP → Adj | Adv AdjP
N → boy | girl
V → sees | likes
Adj → big | small
Adv → very
DetP → a | the

String: NP likes DetP girl

Tree: [S NP [VP [V likes] [NP DetP [N girl]]]]

Derivations of CFGs

• String rewriting system: we derive a string (=derived structure)

• But derivation history represented by phrase-structure tree (=derivation structure)!
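The link between the rewritten string and the tree can be illustrated by reading a derivation off a finished tree. The sketch below is one possible way to do this and rests on assumptions of its own: the tuple encoding of trees and the names TREE, render and leftmost_derivation are hypothetical, and it always expands the leftmost nonterminal, which is only one of the possible orders.

```python
# Phrase-structure tree for "the boy likes a girl": (label, children),
# where a child is either a word (str) or another node.
TREE = (
    "S",
    [
        ("NP", [("DetP", ["the"]), ("N", ["boy"])]),
        ("VP", [("V", ["likes"]),
                ("NP", [("DetP", ["a"]), ("N", ["girl"])])]),
    ],
)


def render(form):
    """Show a sentential form: words as they are, nodes by their label."""
    return " ".join(n if isinstance(n, str) else n[0] for n in form)


def leftmost_derivation(tree):
    """List the strings obtained by always rewriting the leftmost nonterminal."""
    form = [tree]
    steps = [render(form)]
    while True:
        idx = next((i for i, n in enumerate(form) if not isinstance(n, str)), None)
        if idx is None:  # only terminals left: derivation is complete
            return steps
        _, children = form[idx]
        form[idx:idx + 1] = children  # one rule application
        steps.append(render(form))


for step in leftmost_derivation(TREE):
    print(step)
```

Expanding nodes in a different order would produce different intermediate strings but the same final string and the same tree, which is the sense in which the order of derivation does not matter.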

Nobody Uses Simple CFGs (Except Intro NLP Courses)

• All major syntactic theories (Chomsky, LFG, HPSG, TAG-based theories) represent both phrase structure and dependency, in one way or another

• All successful parsers currently use statistics about phrase structure and about dependency

• Derive dependency through “head percolation”: for each rule, say which daughter is head
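Head percolation can be sketched as a table that names the head daughter for each nonterminal, plus a recursive walk over the tree. The head choices in HEADS below (S is headed by VP, VP by V, NP by N or NP, AdjP by Adj or AdjP) are assumptions made for this illustration; the slides do not give a head table, and real parsers use much larger ones.

```python
# Hypothetical head table: for each nonterminal, the daughter labels that
# may serve as its head.
HEADS = {
    "S": ["VP"],
    "VP": ["V"],
    "NP": ["N", "NP"],
    "AdjP": ["Adj", "AdjP"],
}

TREE = (
    "S",
    [
        ("NP", [("DetP", ["the"]), ("N", ["boy"])]),
        ("VP", [("V", ["likes"]),
                ("NP", [("DetP", ["a"]), ("N", ["girl"])])]),
    ],
)


def percolate(node, deps):
    """Return the head word of `node`; append (head, dependent) pairs to deps."""
    label, children = node
    if isinstance(children[0], str):  # preterminal: its word is its head
        return children[0]
    child_heads = [percolate(child, deps) for child in children]
    head_idx = next(i for i, c in enumerate(children) if c[0] in HEADS[label])
    for i, word in enumerate(child_heads):
        if i != head_idx:  # every non-head daughter depends on the head
            deps.append((child_heads[head_idx], word))
    return child_heads[head_idx]


dependencies = []
root = percolate(TREE, dependencies)
print("root:", root)
print("dependencies:", dependencies)
```

The resulting pairs match the dependency tree shown earlier for this sentence: likes governs boy and girl, which in turn govern their determiners.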

Ambiguity of Syntax

• Example:
  o I saw a man with a telescope.
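The ambiguity can be made countable with a tiny chart parser. Everything in this sketch is an assumption for illustration: the lexicon, the binary rules (in particular the two attachment rules VP → VP PP and NP → NP PP), and the function name count_parses are not taken from the slides. The chart is filled CYK-style, storing for each span and category the number of distinct trees.

```python
from collections import defaultdict

# Hypothetical toy lexicon and binary rules (parent, left, right).
LEXICON = {
    "I": {"NP"}, "saw": {"V"}, "a": {"Det"},
    "man": {"N"}, "telescope": {"N"}, "with": {"P"},
}
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP modifies the seeing
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # PP modifies the man
    ("PP", "P", "NP"),
]


def count_parses(sentence, goal="S"):
    """Count the distinct trees of category `goal` spanning the sentence."""
    words = sentence.rstrip(".").split()
    n = len(words)
    chart = defaultdict(int)  # (start, end, category) -> number of trees
    for i, word in enumerate(words):
        for cat in LEXICON.get(word, ()):
            chart[i, i + 1, cat] += 1
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for parent, left, right in BINARY:
                    chart[start, end, parent] += (
                        chart[start, mid, left] * chart[mid, end, right]
                    )
    return chart[0, n, goal]


print(count_parses("I saw a man with a telescope."))
print(count_parses("I saw a man"))
```

Under these assumed rules the example sentence gets two trees, one per attachment site of the prepositional phrase, while the shorter sentence gets one.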

Types of syntactic constructions

• Is this the same construction?
  o An elf decided to clean the kitchen
  o An elf seemed to clean the kitchen
  An elf cleaned the kitchen

• Is this the same construction?
  o An elf decided to be in the kitchen
  o An elf seemed to be in the kitchen
  An elf was in the kitchen

Types of syntactic constructions (ctd)

• Is this the same construction?
  There is an elf in the kitchen
  o *There decided to be an elf in the kitchen
  o There seemed to be an elf in the kitchen

• Is this the same construction?
  It is raining/it rains
  o ??It decided to rain/be raining
  o It seemed to rain/be raining

Types of syntactic constructions (ctd)

Conclusion:

• to seem: whatever is the embedded surface subject can appear in the upper clause

• to decide: only full nouns that are referential can appear in the upper clause

• Two types of verbs

