
NLP - Lecture1


Transcript
  • Slide 1/21

    CS 598 JH: Advanced NLP (Spring 09)

    Julia Hockenmaier, [email protected]

    3324 Siebel Center. Office Hours: Fri, 2:00-3:00pm

    http://www.cs.uiuc.edu/~juliahmr/cs598

    Review

    http://www.cs.uiuc.edu/class/fa08/cs498jh
  • Slide 2/21


    What is the structure of a sentence?

    Sentence structure is hierarchical:

    A sentence consists of words (I, eat, sushi, with, tuna)

    ...which form phrases or constituents: sushi with tuna

    Sentence structure defines dependencies between words or phrases:

    I eat sushi with tuna


  • Slide 3/21


    Strong vs. weak generative capacity

    Formal language theory:
    - defines language as string sets
    - is only concerned with generating these strings
    (weak generative capacity)

    Formal/theoretical syntax (in linguistics):
    - defines language as sets of strings with (hidden) structure
    - is also concerned with generating the right structures
    (strong generative capacity)

  • Slide 4/21


    Context-free grammars (CFGs) capture recursion

    Language has complex constituents
    (the garden behind the house)

    Syntactically, these constituents behave just like simple ones.
    (behind the house can always be omitted)

    CFGs define nonterminal categories to capture equivalent constituents.

  • Slide 5/21


    Context-free grammars

    A CFG is a 4-tuple ⟨N, Σ, R, S⟩:

    A set of nonterminals N
    (e.g. N = {S, NP, VP, PP, Noun, Verb, ...})

    A set of terminals Σ
    (e.g. Σ = {I, you, he, eat, drink, sushi, ball, ...})

    A set of rules R ⊆ {A → β}, with left-hand side (LHS) A ∈ N
    and right-hand side (RHS) β ∈ (N ∪ Σ)*

    A start symbol S ∈ N (sentence)
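    To make the 4-tuple concrete, here is a minimal Python sketch (my own illustration, not code from the course); the particular rule set, built around the "I eat sushi with tuna" example, is an assumption for illustration only.

    # A CFG as a 4-tuple (N, Sigma, R, S): sets of nonterminals and terminals,
    # a set of rules (LHS, RHS) with RHS a tuple of symbols, and a start symbol.
    N = {"S", "NP", "VP", "PP", "N", "V", "P"}           # nonterminals
    Sigma = {"I", "eat", "sushi", "tuna", "with"}        # terminals
    R = {
        ("S", ("NP", "VP")),
        ("NP", ("N",)), ("NP", ("NP", "PP")),
        ("VP", ("V", "NP")),
        ("PP", ("P", "NP")),
        ("N", ("I",)), ("N", ("sushi",)), ("N", ("tuna",)),
        ("V", ("eat",)), ("P", ("with",)),
    }
    S = "S"                                              # start symbol

    # Sanity check: every LHS is in N, every RHS symbol is in N or Sigma.
    assert all(lhs in N and all(x in N | Sigma for x in rhs) for lhs, rhs in R)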

  • Slide 6/21

    An example

    N → {ball, garden, house, sushi}
    P → {in, behind, with}
    NP → N
    NP → NP PP
    PP → P NP

    N: noun, P: preposition, NP: noun phrase, PP: prepositional phrase

  • Slide 7/21

    CFGs define parse trees

    [Parse tree for "eat sushi with tuna":
     (VP (V eat) (NP (NP (N sushi)) (PP (P with) (NP (N tuna)))))]

    N → {sushi, tuna}
    P → {with}
    V → {eat}
    NP → N
    NP → NP PP
    PP → P NP
    VP → V NP
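    As a small sketch (my own illustration, not from the slides), the tree above can be written as nested tuples, with a helper that reads off its yield; this anticipates the yield(τ) notion used on a later slide for P(τ | S).

    # A parse tree as (category, children...); leaves are plain word strings.
    tree = ("VP",
            ("V", "eat"),
            ("NP",
             ("NP", ("N", "sushi")),
             ("PP", ("P", "with"), ("NP", ("N", "tuna")))))

    def tree_yield(t):
        """Return the terminal symbols read off the leaf nodes, left to right."""
        if isinstance(t, str):
            return [t]
        _category, *children = t
        return [w for child in children for w in tree_yield(child)]

    print(" ".join(tree_yield(tree)))   # eat sushi with tuna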

  • Slide 8/21

    CFGs are equivalent to pushdown automata (PDAs)

    PDAs are FSAs with an additional stack: emit a symbol and push/pop a symbol from the stack.

    This is equivalent to the following CFG:
    S → a X b
    X → a X b
    X → a b

    [PDA: push x on the stack and emit a; pop x from the stack and emit b; accept if the stack is empty.]
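    A minimal sketch (my own, not course code) of the automaton described above: push a marker for each a, pop one for each b, and accept if the stack empties exactly at the end. The grammar S → a X b, X → a X b, X → a b generates a^n b^n for n ≥ 2, hence the length check below.

    def accepts(s):
        """Stack-based recognizer for a^n b^n, n >= 2 (the language above)."""
        stack = []
        seen_b = False
        for ch in s:
            if ch == "a":
                if seen_b:           # no 'a' may follow a 'b'
                    return False
                stack.append("x")    # push x, "emit" a
            elif ch == "b":
                seen_b = True
                if not stack:        # nothing left to pop
                    return False
                stack.pop()          # pop x, "emit" b
            else:
                return False
        return not stack and len(s) >= 4   # stack empty; shortest string is aabb

    print([w for w in ["aabb", "aaabbb", "ab", "abab"] if accepts(w)])
    # ['aabb', 'aaabbb']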

  • Slide 9/21

    The Chomsky Hierarchy

    Language                          Automaton        Parsing complexity   Dependencies
    Type 3: Regular                   Finite-state     linear               adjacent words
    Type 2: Context-free              Pushdown         cubic                nested
    Type 1: Context-sensitive         Linear bounded   exponential
    Type 0: Recursively enumerable    Turing machine

  • Slide 10/21


    Constituents: heads and dependents

    There are different kinds of constituents:
    Noun phrases: the man, a girl with glasses, Illinois
    Prepositional phrases: with glasses, in the garden
    Verb phrases: eat sushi, sleep, sleep soundly

    Every phrase has a head (here the noun, the preposition, and the verb, respectively):
    Noun phrases: the man, a girl with glasses, Illinois
    Prepositional phrases: with glasses, in the garden
    Verb phrases: eat sushi, sleep, sleep soundly

    The other parts are its dependents. Dependents are either arguments or adjuncts.

  • Slide 11/21

    Two ways to represent structure

    Phrase structure trees:
    (VP (V eat) (NP (NP sushi) (PP (P with) (NP tuna))))        for "eat sushi with tuna"
    (VP (VP (V eat) (NP sushi)) (PP (P with) (NP chopsticks)))  for "eat sushi with chopsticks"

    Dependency trees:
    eat sushi with tuna        (the PP attaches to sushi)
    eat sushi with chopsticks  (the PP attaches to eat)

  • Slide 12/21

    Structure (syntax) corresponds to meaning (semantics)

    Correct analysis:
    eat [sushi with tuna]        (VP (V eat) (NP (NP sushi) (PP with tuna)))
    [eat sushi] with chopsticks  (VP (VP (V eat) (NP sushi)) (PP with chopsticks))

    Incorrect analysis:
    [eat sushi] with tuna        (the PP wrongly attached to the VP)
    eat [sushi with chopsticks]  (the PP wrongly attached to the NP)

  • Slide 13/21


    Dependency grammar

    DGs describe the structure of sentences as a graph:
    the nodes of the graph are the words,
    the edges of the graph are the dependencies.

    The relationship between DGs and CFGs:
    if a CFG phrase structure tree is translated into a dependency graph,
    the resulting dependency graph has no crossing edges.
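    The "no crossing edges" property (projectivity) can be checked directly; here is a minimal sketch (my own illustration, not from the slides) that represents a dependency graph as (head, dependent) pairs over word positions.

    def has_crossing_edges(edges):
        """Return True if any two dependency edges cross each other."""
        spans = [tuple(sorted(e)) for e in edges]
        for a, b in spans:
            for c, d in spans:
                # (a,b) and (c,d) cross if exactly one endpoint of (c,d)
                # lies strictly inside the span (a,b).
                if a < c < b < d:
                    return True
        return False

    # "I eat sushi with tuna" (positions 0..4): eat->I, eat->sushi, sushi->with, with->tuna
    edges = [(1, 0), (1, 2), (2, 3), (3, 4)]
    print(has_crossing_edges(edges))   # False: this analysis is projective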

  • Slide 14/21


    CKY chart parsing algorithm

    Bottom-up parsing: start with the words.

    Dynamic programming: save the results in a table/chart
    and re-use these results in finding larger constituents.

    Complexity: O(n³ |G|)  (n: length of the string, |G|: size of the grammar)

    Presumes a CFG in Chomsky Normal Form:
    rules are all either A → B C or A → a
    (with A, B, C nonterminals and a a terminal)
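    Here is a minimal CKY recognizer sketch in Python (my own illustration, not the course code), with the small grammar from the next slide (S → NP VP, VP → V NP, V → eat, NP → we, NP → sushi) hard-coded for brevity.

    from collections import defaultdict

    binary = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}}        # A -> B C
    lexical = {"we": {"NP"}, "eat": {"V"}, "sushi": {"NP"}}    # A -> a

    def cky(words):
        """Fill the chart bottom-up; chart[(i, j)] holds categories for words[i:j]."""
        n = len(words)
        chart = defaultdict(set)
        for i, w in enumerate(words):                  # width-1 spans: apply A -> a
            chart[(i, i + 1)] |= lexical.get(w, set())
        for width in range(2, n + 1):                  # wider spans, shortest first
            for i in range(n - width + 1):
                j = i + width
                for k in range(i + 1, j):              # split into words[i:k] + words[k:j]
                    for B in chart[(i, k)]:
                        for C in chart[(k, j)]:
                            chart[(i, j)] |= binary.get((B, C), set())
        return chart

    chart = cky("we eat sushi".split())
    print(chart[(0, 3)])   # {'S'}: the whole string can be parsed as a sentence

    The three nested span loops combined with the rule lookups give the O(n³ |G|) behaviour mentioned above.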

  • Slide 15/21

    The CKY parsing algorithm

    Grammar:
    S → NP VP
    VP → V NP
    V → eat
    NP → we
    NP → sushi

    Sentence: We eat sushi

    [CKY chart: "we" = NP, "eat" = V, "sushi" = NP, "eat sushi" = VP, "we eat" = (empty), "we eat sushi" = S]

  • Slide 16/21

    Exercise: CKY parser

    S → NP VP
    NP → NP PP
    NP → Noun
    VP → VP PP
    VP → Verb NP

    I eat sushi with chopsticks

  • Slide 17/21


    Dealing with Ambiguity

    A grammar might generate multiple trees for a sentence:

    What is the most likely parse for sentence S?

    We need a model of P(τ | S)

    [The possible trees: the PP attached to the NP or to the VP, for both "eat sushi with tuna" and "eat sushi with chopsticks".]

  • Slide 18/21


    Computing P(τ | S)

    Using Bayes' Rule:

    argmax_τ P(τ | S) = argmax_τ P(τ, S) / P(S)
                      = argmax_τ P(τ, S)
                      = argmax_τ P(τ)    if S = yield(τ)

    The yield of a tree is the string of terminal symbols that can be read off the leaf nodes:
    yield(τ) = eat sushi with tuna
    for the tree (VP (V eat) (NP (NP sushi) (PP (P with) (NP tuna))))
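    A tiny sketch of the argmax above (my own illustration, with made-up numbers): among candidate trees, keep only those whose yield equals the sentence S and take the most probable one.

    # Candidate trees represented only by (P(tree), yield) for brevity;
    # the probabilities here are hypothetical.
    candidates = [
        (0.0006, "eat sushi with tuna"),        # PP attached to the NP
        (0.0002, "eat sushi with tuna"),        # PP attached to the VP
        (0.0004, "eat sushi with chopsticks"),
    ]

    S = "eat sushi with tuna"
    best = max((p for p, y in candidates if y == S), default=None)
    print(best)   # 0.0006: the argmax over trees whose yield is S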

  • Slide 19/21


    Computing P(τ)

    We need to define P(τ) such that:
    for all τ ∈ T:  0 ≤ P(τ) ≤ 1
    Σ_{τ ∈ T} P(τ) = 1

    T is the (infinite) set of all trees in the language:
    L = {s | there is a τ ∈ T with yield(τ) = s}

    The set T is generated by a context-free grammar:
    S → NP VP       VP → Verb NP      NP → Det Noun
    S → S conj S    VP → VP PP        NP → NP PP
    S → ...         VP → ...          NP → ...

  • Slide 20/21


    Probabilistic Context-Free Grammars

    For every nonterminal X, define a probability distribution P(X → α | X)
    over all rules with the same LHS symbol X:

    S  → NP VP           0.8
    S  → S conj S        0.2
    NP → Noun            0.2
    NP → Det Noun        0.4
    NP → NP PP           0.2
    NP → NP conj NP      0.2
    VP → Verb            0.4
    VP → Verb NP         0.3
    VP → Verb NP NP      0.1
    VP → VP PP           0.2
    PP → P NP            1.0
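    As a quick sketch (my own, not course code), the grammar above can be stored as a rule → probability map, and the defining property, that the probabilities of all rules with the same LHS sum to 1, can be checked directly.

    from collections import defaultdict

    pcfg = {
        ("S", ("NP", "VP")): 0.8,   ("S", ("S", "conj", "S")): 0.2,
        ("NP", ("Noun",)): 0.2,     ("NP", ("Det", "Noun")): 0.4,
        ("NP", ("NP", "PP")): 0.2,  ("NP", ("NP", "conj", "NP")): 0.2,
        ("VP", ("Verb",)): 0.4,     ("VP", ("Verb", "NP")): 0.3,
        ("VP", ("Verb", "NP", "NP")): 0.1,  ("VP", ("VP", "PP")): 0.2,
        ("PP", ("P", "NP")): 1.0,
    }

    totals = defaultdict(float)
    for (lhs, _rhs), p in pcfg.items():
        totals[lhs] += p
    print({lhs: round(t, 10) for lhs, t in totals.items()})
    # {'S': 1.0, 'NP': 1.0, 'VP': 1.0, 'PP': 1.0}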

  • Slide 21/21


    Computing P(τ) with a PCFG

    The probability of a tree is the product of the probabilities of all its rules:

    P(τ) = 0.8 · 0.3 · 0.2 · 1.0 · 0.2³ = 0.000384

    [Parse tree for "John eats pie with cream":
     (S (NP (Noun John))
        (VP (VP (Verb eats) (NP (Noun pie)))
            (PP (P with) (NP (Noun cream)))))]

    S  → NP VP           0.8
    S  → S conj S        0.2
    NP → Noun            0.2
    NP → Det Noun        0.4
    NP → NP PP           0.2
    NP → NP conj NP      0.2
    VP → Verb            0.4
    VP → Verb NP         0.3
    VP → Verb NP NP      0.1
    VP → VP PP           0.2
    PP → P NP            1.0
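    A minimal sketch of the product above (my own illustration): list the rules used in the tree for "John eats pie with cream" and multiply their probabilities.

    rules_used = [
        ("S -> NP VP", 0.8),
        ("NP -> Noun", 0.2),      # John
        ("VP -> VP PP", 0.2),
        ("VP -> Verb NP", 0.3),
        ("NP -> Noun", 0.2),      # pie
        ("PP -> P NP", 1.0),
        ("NP -> Noun", 0.2),      # cream
    ]

    p = 1.0
    for _rule, prob in rules_used:
        p *= prob
    print(round(p, 6))   # 0.000384 = 0.8 * 0.3 * 0.2 * 1.0 * 0.2**3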

