Basic Parsing with Context-Free Grammars

Date post: 22-Feb-2016
Description:
Basic Parsing with Context-Free Grammars. Some slides adapted from Julia Hirschberg and Dan Jurafsky. Announcements. To view past videos: http://globe.cvn.columbia.edu:8080/oncampus.php?c=133ae14752e27fde909fdbd64c06b337
Transcript
Page 1: Basic Parsing with Context-Free Grammars

1

Basic Parsing with Context-Free Grammars

Some slides adapted from Julia Hirschberg and Dan Jurafsky

2

Announcements

  • To view past videos: http://globe.cvn.columbia.edu:8080/oncampus.php?c=133ae14752e27fde909fdbd64c06b337
  • Usually available only for 1 week. Right now, available for all previous lectures.

3

Homework Questions

4

Evaluation

5

Syntactic Parsing

6

Syntactic Parsing

  • Declarative formalisms like CFGs and FSAs define the legal strings of a language, but only tell you 'this is a legal string of the language X'
  • Parsing algorithms specify how to recognize the strings of a language and assign each string one (or more) syntactic analyses

CFG Example

  • Many possible CFGs for English; here is an example (fragment):
      S -> NP VP
      VP -> V NP
      NP -> Det N | Adj NP
      N -> boy | girl
      V -> sees | likes
      Adj -> big | small
      DetP -> a | the
  • Example sentences:
      big the small girl sees a boy
      John likes a girl
      I like a girl
      I sleep
      The old dog the footsteps of the young
      the small boy likes a girl

Modified CFG

  • Rules:
      S -> NP VP
      S -> Aux NP VP
      S -> VP
      NP -> Det Nom
      NP -> PropN
      NP -> Pronoun
      Nom -> Adj Nom
      Nom -> N
      Nom -> N Nom
      Nom -> Nom PP
      VP -> V
      VP -> V NP
      VP -> V PP
      PP -> Prep NP
  • Lexicon:
      N -> old | dog | footsteps | young | flight
      V -> dog | include | prefer | book
      Aux -> does
      Prep -> from | to | on | of
      PropN -> Bush | McCain | Obama
      Det -> that | this | a | the
      Adj -> old | green | red

Parse Tree for 'The old dog the footsteps of the young' for Prior CFG

[Figure: parse tree rooted at S, with children NP and VP, over the words "The old dog the footsteps of the young"]

10

Parsing as a Form of Search

  • Searching FSAs
      • Finding the right path through the automaton
      • Search space defined by structure of FSA
  • Searching CFGs
      • Finding the right parse tree among all possible parse trees
      • Search space defined by the grammar
  • Constraints provided by the input sentence and the automaton or grammar

11

Top-Down Parser

  • Builds from the root S node to the leaves
  • Expectation-based
  • Common search strategy
      • Top-down, left-to-right, backtracking
      • Try first rule with LHS = S
      • Next expand all constituents in these trees/rules
      • Continue until leaves are POS
      • Backtrack when candidate POS does not match input string
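The top-down, left-to-right, backtracking strategy can be sketched as a small recognizer. This is an illustrative sketch only, not the lecture's implementation: it uses the CFG fragment from the earlier example, writes the determiner category as Det throughout (an assumption, since the fragment lists both Det and DetP), and the names GRAMMAR, LEXICON, expand and recognize are hypothetical. It assumes the grammar has no left recursion.

```python
# Toy grammar: non-terminal -> list of right-hand sides (no left recursion).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Adj", "NP"]],
    "VP": [["V", "NP"]],
}
# Part-of-speech categories and the words they cover.
LEXICON = {
    "Det": {"a", "the"},
    "N": {"boy", "girl"},
    "V": {"sees", "likes"},
    "Adj": {"big", "small"},
}


def expand(symbols, words, pos):
    """Yield every input position reachable after deriving `symbols` from words[pos:]."""
    if not symbols:
        yield pos
        return
    first, rest = symbols[0], symbols[1:]
    if first in LEXICON:
        # Leaf is a POS: it must match the next input word, otherwise backtrack.
        if pos < len(words) and words[pos] in LEXICON[first]:
            yield from expand(rest, words, pos + 1)
        return
    # Non-terminal: try each rule in order; failed rules simply yield nothing.
    for rhs in GRAMMAR[first]:
        yield from expand(rhs + rest, words, pos)


def recognize(sentence):
    """True if some top-down derivation from S consumes the whole sentence."""
    words = sentence.lower().split()
    return any(end == len(words) for end in expand(["S"], words, 0))
```

For instance, recognize("the boy likes a girl") succeeds, while recognize("the small boy likes a girl") fails because this fragment has no rule that places an adjective after a determiner.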

12

Rule Expansion

  • "The old dog the footsteps of the young."
  • Where does backtracking happen?
  • What are the computational disadvantages?
  • What are the advantages?

13

Bottom-Up Parsing

  • Parser begins with words of input and builds up trees, applying grammar rules whose RHS matches

      Det  N    V    Det  N          Prep  Det  N
      The  old  dog  the  footsteps  of    the  young

      Det  Adj  N    Det  N          Prep  Det  N
      The  old  dog  the  footsteps  of    the  young

  • Parse continues until an S root node is reached or no further node expansion is possible
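A naive version of the bottom-up idea can be sketched as follows: try every POS tagging of the words, then repeatedly replace any substring that matches a rule's RHS with its LHS until only S remains or nothing more can be reduced. This is a sketch under assumptions: the rules are those of the Modified CFG, the word-to-tag table is restricted to the words used here, and RULES, TAGS, reduces_to_s and bottom_up_parses are hypothetical names. It is an exhaustive search, not an efficient parser.

```python
from itertools import product

# (LHS, RHS) pairs taken from the Modified CFG.
RULES = [
    ("S", ("NP", "VP")), ("S", ("Aux", "NP", "VP")), ("S", ("VP",)),
    ("NP", ("Det", "Nom")), ("NP", ("PropN",)), ("NP", ("Pronoun",)),
    ("Nom", ("Adj", "Nom")), ("Nom", ("N",)), ("Nom", ("N", "Nom")),
    ("Nom", ("Nom", "PP")),
    ("VP", ("V", "NP")), ("VP", ("V",)), ("VP", ("V", "PP")),
    ("PP", ("Prep", "NP")),
]
# Word -> possible POS tags (only the words needed for the examples).
TAGS = {
    "the": {"Det"}, "that": {"Det"}, "this": {"Det"}, "a": {"Det"},
    "old": {"N", "Adj"}, "dog": {"N", "V"}, "footsteps": {"N"},
    "young": {"N"}, "flight": {"N"}, "book": {"V"}, "of": {"Prep"},
}


def reduces_to_s(symbols, seen):
    """Depth-first search over reductions; `seen` avoids revisiting a symbol string."""
    if symbols == ("S",):
        return True
    if symbols in seen:
        return False
    seen.add(symbols)
    for lhs, rhs in RULES:
        n = len(rhs)
        for i in range(len(symbols) - n + 1):
            if symbols[i:i + n] == rhs:
                reduced = symbols[:i] + (lhs,) + symbols[i + n:]
                if reduces_to_s(reduced, seen):
                    return True
    return False


def bottom_up_parses(sentence):
    """Return the POS tag sequences of the sentence that can be reduced to S."""
    words = sentence.lower().split()
    options = [sorted(TAGS.get(w, ())) for w in words]
    return [tags for tags in product(*options) if reduces_to_s(tags, set())]
```

On "The old dog the footsteps of the young", the tagging with "old" as N and "dog" as V reduces to S, whereas the Det Adj N reading is explored and then abandoned because it never reaches an S root.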

14

      Det  N    V    Det  N          Prep  Det  N
      The  old  dog  the  footsteps  of    the  young
      Det  Adj  N    Det  N          Prep  Det  N

15

Bottom-up parsing

  • When does disambiguation occur?
  • What are the computational advantages and disadvantages?

16

What's right/wrong with...

  • Top-Down parsers: they never explore illegal parses (e.g. ones which can't form an S), but waste time on trees that can never match the input
  • Bottom-Up parsers: they never explore trees inconsistent with the input, but waste time exploring illegal parses (with no S root)
  • For both: find a control strategy, i.e. how to explore the search space efficiently
      • Pursue all parses in parallel, or backtrack, or ...?
      • Which rule to apply next?
      • Which node to expand next?

17

Some Solutions

  • Dynamic Programming Approaches: use a chart to represent partial results
  • CKY Parsing Algorithm
      • Bottom-up
      • Grammar must be in (Chomsky) Normal Form
      • The parse tree might not be consistent with linguistic theory
  • Earley Parsing Algorithm
      • Top-down
      • Expectations about constituents are confirmed by input
      • A POS tag for a word that is not predicted is never added
  • Chart Parser
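To make the chart idea concrete, here is a minimal sketch of a CKY recognizer. It assumes a tiny grammar that is already in Chomsky Normal Form (binary rules plus a word-to-tag lexicon); the grammar and the names BINARY, LEXICON and cky_recognize are illustrative choices, not taken from the slides.

```python
from itertools import product

# Binary rules, indexed by their right-hand side: (B, C) -> {A : A -> B C}.
BINARY = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
# Lexical rules: word -> set of POS categories.
LEXICON = {
    "the": {"Det"}, "a": {"Det"},
    "boy": {"N"}, "girl": {"N"},
    "sees": {"V"}, "likes": {"V"},
}


def cky_recognize(words):
    """Fill table[i][j] with every category that spans words[i:j]."""
    n = len(words)
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        table[i][i + 1] = set(LEXICON.get(word, ()))
    for span in range(2, n + 1):              # widths 2..n, shortest first
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):         # split point between i and j
                for b, c in product(table[i][k], table[k][j]):
                    table[i][j] |= BINARY.get((b, c), set())
    return "S" in table[0][n]
```

Each cell is computed once from smaller cells, which is what lets the chart avoid re-deriving the same constituent along different search paths.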

18

Earley Parsing

  • Allows arbitrary CFGs
  • Fills a table in a single sweep over the input words
      • Table is length N+1; N is number of words
      • Table entries represent
          • Completed constituents and their locations
          • In-progress constituents
          • Predicted constituents

19

States

  • The table entries are called states and are represented with dotted rules
      S -> • VP               A VP is predicted
      NP -> Det • Nominal     An NP is in progress
      VP -> V NP •            A VP has been found

20

States/Locations

  • It would be nice to know where these things are in the input, so...
      S -> • VP [0,0]              A VP is predicted at the start of the sentence
      NP -> Det • Nominal [1,2]    An NP is in progress; the Det goes from 1 to 2
      VP -> V NP • [0,3]           A VP has been found starting at 0 and ending at 3
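One possible way to represent such a state in code is a small immutable record holding the rule, the dot position, and the span. The class and field names below (State, lhs, rhs, dot, start, end) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    lhs: str      # left-hand side of the rule
    rhs: tuple    # right-hand side symbols
    dot: int      # number of RHS symbols already recognized
    start: int    # input position where the constituent begins
    end: int      # input position reached so far

    def next_symbol(self):
        """Symbol immediately right of the dot, or None if the state is complete."""
        return self.rhs[self.dot] if self.dot < len(self.rhs) else None

    def is_complete(self):
        return self.dot == len(self.rhs)

    def __str__(self):
        symbols = list(self.rhs)
        symbols.insert(self.dot, "•")
        return f"{self.lhs} -> {' '.join(symbols)} [{self.start},{self.end}]"


predicted = State("S", ("VP",), 0, 0, 0)
in_progress = State("NP", ("Det", "Nominal"), 1, 1, 2)
found = State("VP", ("V", "NP"), 2, 0, 3)
```

Making the record frozen (hashable) is a design choice that allows duplicate states to be detected cheaply when they are added to a chart entry.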

21

Graphically

22

Earley

  • As with most dynamic programming approaches, the answer is found by looking in the table in the right place
  • In this case, there should be an S state in the final column (entry N+1 of the table) that spans from 0 to N and is complete
  • If that's the case, you're done: S -> α • [0,N]

23

Earley Algorithm

  • March through chart left-to-right
  • At each step, apply 1 of 3 operators
      • Predictor: create new states representing top-down expectations
      • Scanner: match word predictions (rule with word after dot) to words
      • Completer: when a state is complete, see what rules were looking for that completed constituent

24

Predictor

  • Given a state
      • With a non-terminal to right of dot (not a part-of-speech category)
      • Create a new state for each expansion of the non-terminal
      • Place these new states into same chart entry as generating state, beginning and ending where generating state ends
  • So predictor looking at
      S -> • VP [0,0]
  • results in
      VP -> • Verb [0,0]
      VP -> • Verb NP [0,0]

25

Scanner

  • Given a state
      • With a non-terminal to right of dot that is a part-of-speech category
      • If the next word in the input matches this POS
      • Create a new state with dot moved over the non-terminal
  • So scanner looking at VP -> • Verb NP [0,0]
      • If the next word, "book", can be a verb, add new state: VP -> Verb • NP [0,1]
      • Add this state to chart entry following current one
  • Note: Earley algorithm uses top-down input to disambiguate POS. Only POS predicted by some state can get added to chart.

26

Completer

  • Applied to a state when its dot has reached right end of rule
  • Parser has discovered a category over some span of input
  • Find and advance all previous states that were looking for this category
      • Copy state, move dot, insert in current chart entry
  • Given
      NP -> Det Nominal • [1,3]
      VP -> Verb • NP [0,1]
  • Add
      VP -> Verb NP • [0,3]

27

How do we know we are done?

  • Find an S state in the final column (entry N+1 of the table) that spans from 0 to N and is complete
  • If that's the case, you're done: S -> α • [0,N]

28

Earley

More specifically...

  1. Predict all the states you can upfront
  2. Read a word
      1. Extend states based on matches
      2. Add new predictions
      3. Go to 2
  3. Look at chart entry N+1 (the final column) to see if you have a winner
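Putting the three operators together, an Earley recognizer might look roughly like the sketch below. It is an illustration under assumptions, not the lecture's code: the grammar and lexicon are the Modified CFG from the earlier slide, a dummy start rule GAMMA -> S is added (a common convention), and the names State, GRAMMAR, LEXICON, POS and earley_recognize are hypothetical. The grammar is assumed to have no empty rules.

```python
from collections import namedtuple

# A state is a dotted rule plus the position where it started;
# its end position is the index of the chart entry that holds it.
State = namedtuple("State", "lhs rhs dot start")

GRAMMAR = {
    "S": [("NP", "VP"), ("Aux", "NP", "VP"), ("VP",)],
    "NP": [("Det", "Nom"), ("PropN",), ("Pronoun",)],
    "Nom": [("Adj", "Nom"), ("N",), ("N", "Nom"), ("Nom", "PP")],
    "VP": [("V", "NP"), ("V",), ("V", "PP")],
    "PP": [("Prep", "NP")],
}
POS = {"Det", "PropN", "Pronoun", "Adj", "N", "V", "Aux", "Prep"}
LEXICON = {
    "old": {"N", "Adj"}, "dog": {"N", "V"}, "footsteps": {"N"},
    "young": {"N"}, "flight": {"N"}, "include": {"V"}, "prefer": {"V"},
    "book": {"V"}, "does": {"Aux"}, "from": {"Prep"}, "to": {"Prep"},
    "on": {"Prep"}, "of": {"Prep"}, "that": {"Det"}, "this": {"Det"},
    "a": {"Det"}, "the": {"Det"}, "green": {"Adj"}, "red": {"Adj"},
}


def earley_recognize(words, start_symbol="S"):
    n = len(words)
    chart = [[] for _ in range(n + 1)]          # N+1 chart entries

    def add(state, k):
        if state not in chart[k]:               # never add a duplicate state
            chart[k].append(state)

    add(State("GAMMA", (start_symbol,), 0, 0), 0)
    for k in range(n + 1):
        i = 0
        while i < len(chart[k]):                # chart[k] may grow as we go
            st = chart[k][i]
            i += 1
            if st.dot == len(st.rhs):           # Completer
                for prev in list(chart[st.start]):
                    if prev.dot < len(prev.rhs) and prev.rhs[prev.dot] == st.lhs:
                        add(prev._replace(dot=prev.dot + 1), k)
            elif st.rhs[st.dot] in POS:         # Scanner
                if k < n and st.rhs[st.dot] in LEXICON.get(words[k], ()):
                    add(st._replace(dot=st.dot + 1), k + 1)
            else:                               # Predictor
                for rhs in GRAMMAR[st.rhs[st.dot]]:
                    add(State(st.rhs[st.dot], rhs, 0, k), k)
    # Done if the dummy rule is complete in the final entry, spanning 0..N.
    return State("GAMMA", (start_symbol,), 1, 0) in chart[n]
```

The duplicate check in add is what keeps the left-recursive rule Nom -> Nom PP from being predicted forever, so the same grammar that would loop a depth-first top-down parser is handled here.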

29

Example

  • Book that flight
  • We should find... an S from 0 to 3 that is a completed state...

CFG for Fragment of English

  • Rules:
      S -> NP VP
      S -> Aux NP VP
      NP -> Det Nom
      NP -> PropN
      Nom -> Adj Nom
      Nom -> N
      Nom -> N Nom
      Nom -> Nom PP
      VP -> V
      VP -> V NP
      PP -> Prep NP
  • Lexicon:
      N -> old | dog | footsteps | young
      V -> dog | include | prefer
      Aux -> does
      Prep -> from | to | on | of
      PropN -> Bush | McCain | Obama
      Det -> that | this | a | the
      Adj -> old | green | red

31

Example

32

Example

33

Example

34

Details

  • What kind of algorithms did we just describe?
      • Not parsers: recognizers
      • The presence of an S state with the right attributes in the right place indicates a successful recognition
      • But no parse tree... no parser
      • That's how we solve (not) an exponential problem in polynomial time

35

Converting Earley from Recognizer to Parser

  • With the addition of a few pointers we have a parser
  • Augment the "Completer" to point to where we came from
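As a rough sketch of that augmentation, the recognizer loop can carry structural information along with each state. The version below makes simplifying assumptions that are not from the slides: instead of pointers to earlier states it stores the child subtrees directly, it keeps only the first analysis found for each state (so it returns one tree rather than all of them), and it uses a deliberately small grammar. All names are hypothetical.

```python
GRAMMAR = {
    "S": [("NP", "VP"), ("VP",)],
    "NP": [("Det", "Nom")],
    "Nom": [("N",)],
    "VP": [("V", "NP"), ("V",)],
}
POS = {"Det", "N", "V"}
LEXICON = {"book": {"V"}, "that": {"Det"}, "flight": {"N"}}


def earley_parse(words):
    """Return one parse tree as nested tuples, or None if the input is rejected."""
    n = len(words)
    # chart[k] maps (lhs, rhs, dot, start) -> tuple of child subtrees built so far
    chart = [dict() for _ in range(n + 1)]
    chart[0][("GAMMA", ("S",), 0, 0)] = ()
    for k in range(n + 1):
        queue = list(chart[k].items())

        def add(col, key, kids):
            if key not in chart[col]:           # keep the first analysis only
                chart[col][key] = kids
                if col == k:
                    queue.append((key, kids))

        while queue:
            (lhs, rhs, dot, start), kids = queue.pop(0)
            if dot == len(rhs):                 # Completer: record what was built
                tree = (lhs,) + kids
                for (l2, r2, d2, s2), kids2 in list(chart[start].items()):
                    if d2 < len(r2) and r2[d2] == lhs:
                        add(k, (l2, r2, d2 + 1, s2), kids2 + (tree,))
            elif rhs[dot] in POS:               # Scanner: attach the word as a leaf
                if k < n and rhs[dot] in LEXICON.get(words[k], ()):
                    leaf = (rhs[dot], words[k])
                    add(k + 1, (lhs, rhs, dot + 1, start), kids + (leaf,))
            else:                               # Predictor
                for expansion in GRAMMAR[rhs[dot]]:
                    add(k, (rhs[dot], expansion, 0, k), ())
    final = chart[n].get(("GAMMA", ("S",), 1, 0))
    return final[0] if final else None
```

For "book that flight" this yields an S over a VP whose children are the verb "book" and the NP "that flight". Recovering every parse would require storing all alternatives per state rather than only the first.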

Augmenting the chart with structural information

[Figure: chart entries annotated with state labels S8 to S13]

37

Retrieving Parse Trees from Chart

  • All the possible parses for an input are in the table
  • We just need to read off all the backpointers from every complete S in the last column of the table
      • Find all the S -> X • [0,N]
      • Follow the structural traces from the Completer
  • Of course, this won't be polynomial time, since there could be an exponential number of trees
  • We can at least represent ambiguity efficiently

38

Left Recursion vs Right Recursion

  • Depth-first search will never terminate if grammar is left recursive (e.g. NP -> NP PP)

Solutions

  • Rewrite the grammar (automatically?) to a weakly equivalent one which is not left-recursive
    e.g. The man on the hill with the telescope...
      NP -> NP PP     (wanted: Nom plus a sequence of PPs)
      NP -> Nom PP
      NP -> Nom
      Nom -> Det N
    ...becomes...
      NP -> Nom NP'
      Nom -> Det N
      NP' -> PP NP'   (wanted: a sequence of PPs)
      NP' -> ε
  • Not so obvious what these rules mean...
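The rewrite for immediate left recursion follows the standard pattern: rules A -> A α | β become A -> β A' and A' -> α A' | ε. A minimal sketch, assuming rules are given as tuples of symbols and the empty tuple stands for ε; the function name is hypothetical and the example uses only the two rules NP -> NP PP and NP -> Nom so that the output matches the rewritten grammar shown above:

```python
def eliminate_immediate_left_recursion(nt, productions):
    """Rewrite A -> A alpha | beta as A -> beta A' and A' -> alpha A' | epsilon.

    `productions` is a list of RHS tuples for the non-terminal `nt`.
    Returns a dict of the rewritten rules; () stands for the empty string.
    """
    recursive = [rhs[1:] for rhs in productions if rhs and rhs[0] == nt]
    base = [rhs for rhs in productions if not rhs or rhs[0] != nt]
    if not recursive:
        return {nt: list(productions)}          # nothing to rewrite
    new_nt = nt + "'"
    return {
        nt: [rhs + (new_nt,) for rhs in base],
        new_nt: [alpha + (new_nt,) for alpha in recursive] + [()],
    }


rewritten = eliminate_immediate_left_recursion("NP", [("NP", "PP"), ("Nom",)])
```

Here rewritten holds NP -> Nom NP' together with NP' -> PP NP' and NP' -> ε. The sketch handles immediate left recursion only; indirect cases need a more general procedure.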

40

  • Harder to detect and eliminate non-immediate left recursion
      NP -> Nom PP
      Nom -> NP
  • Fix depth of search explicitly
  • Rule ordering: non-recursive rules first
      NP -> Det Nom
      NP -> NP PP

41

Another Problem: Structural ambiguity

  • Multiple legal structures
      • Attachment (e.g. I saw a man on a hill with a telescope)
      • Coordination (e.g. younger cats and dogs)
      • NP bracketing (e.g. Spanish language teachers)
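To get a feel for how quickly such ambiguity grows, one can count the binary bracketings of a sequence of n items, which follows the Catalan numbers. This is a purely combinatorial illustration (an upper bound on bracketing ambiguity, not a count of the parses any particular grammar licenses), and the function name is hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_bracketings(n):
    """Number of distinct binary bracketings of a sequence of n items."""
    if n <= 1:
        return 1
    # Choose where the top-level split falls, then bracket each side independently.
    return sum(count_bracketings(k) * count_bracketings(n - k) for k in range(1, n))


three_words = count_bracketings(3)   # e.g. "Spanish language teachers"
ten_words = count_bracketings(10)
```

A three-word compound has 2 bracketings ([[Spanish language] teachers] vs. [Spanish [language teachers]]), while a ten-item sequence already has 4862.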

42

NP vs VP Attachment

43

Solution?

  • Return all possible parses and disambiguate using "other methods"

44

Summing Up

  • Parsing is a search problem which may be implemented with many control strategies
      • Top-Down or Bottom-Up approaches each have problems
      • Combining the two solves some but not all issues
          • Left recursion
          • Syntactic ambiguity
  • Next time: making use of statistical information about syntactic constituents
      • Read Ch. 14

  • Slide 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing (2)
  • CFG Example
  • Modified CFG
  • Slide 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide 14
  • Bottom-up parsing
  • What's right/wrong with...
  • Some Solutions
  • Earley Parsing
  • States
  • States/Locations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley (2)
  • Example
  • CFG for Fragment of English
  • Example (2)
  • Example (3)
  • Example (4)
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide 39
  • Slide 40
  • Another Problem: Structural ambiguity
  • Slide 42
  • Slide 43
  • Summing Up
Page 2: Basic Parsing with Context-Free Grammars

2

To view past videos http

globecvncolumbiaedu8080oncampusphpc=133ae14752e27fde909fdbd64c06b337

Usually available only for 1 week Right now available for all previous lectures

Announcements

3

Homework Questions

4

Evaluation

5

Syntactic Parsing

6

Declarative formalisms like CFGs FSAs define the legal strings of a language -- but only tell you lsquothis is a legal string of the language Xrsquo

Parsing algorithms specify how to recognize the strings of a language and assign each string one (or more) syntactic analyses

Syntactic Parsing

CFG Example Many possible CFGs for English here is an example

(fragment) S NP VP VP V NP NP Det N | Adj NP N boy | girl V sees | likes Adj big | small DetP a | the

big the small girl sees a boy John likes a girl I like a girl I sleep The old dog the footsteps of the young

the small boy likes a girl

Modified CFGS NP VP VP VS Aux NP VP VP -gt V PPS -gt VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young | flight

NP PropN V dog | include | prefer | book

NP -gt PronounNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

Parse Tree for lsquoThe old dog the footsteps of the youngrsquo for Prior CFG

S

NP VP

NPV

DETNOM

N PP

DET NOM

N

The old dog the

footstepsof the young

10

Searching FSAs Finding the right path through the automaton Search space defined by structure of FSA

Searching CFGs Finding the right parse tree among all possible

parse trees Search space defined by the grammar

Constraints provided by the input sentence and the automaton or grammar

Parsing as a Form of Search

11

Builds from the root S node to the leaves Expectation-based Common search strategy

Top-down left-to-right backtracking Try first rule with LHS = S Next expand all constituents in these treesrules Continue until leaves are POS Backtrack when candidate POS does not match input

string

Top-Down Parser

12

ldquoThe old dog the footsteps of the youngrdquo Where does backtracking happen

What are the computational disadvantages

What are the advantages

Rule Expansion

13

Parser begins with words of input and builds up trees applying grammar rules whose RHS matches

Det N V Det N Prep Det NThe old dog the footsteps of the young

Det Adj N Det N Prep Det NThe old dog the footsteps of the young

Parse continues until an S root node reached or no further node expansion possible

Bottom-Up Parsing

14

Det N V Det N Prep Det NThe old dog the footsteps of the youngDet Adj N Det N Prep Det N

15

When does disambiguation occur

What are the computational advantages and disadvantages

Bottom-up parsing

16

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the input

Bottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)

For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

Whatrsquos rightwrong withhellip

17

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theory Early Parsing Algorithm

Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never added

Chart Parser

Some Solutions

18

Allows arbitrary CFGs Fills a table in a single sweep over the input

words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locations In-progress constituents Predicted constituents

Earley Parsing

19

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in

progressVP -gt V NP A VP has been found

States

20

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12]An NP is in progress the

Det goes from 1 to 2VP -gt V NP [03] A VP has been found

starting at 0 and ending at 3

StatesLocations

21

Graphically

22

As with most dynamic programming approaches the answer is found by looking in the table in the right place

In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

Earley

23

March through chart left-to-right At each step apply 1 of 3 operators

Predictor Create new states representing top-down

expectations Scanner

Match word predictions (rule with word after dot) to words

Completer When a state is complete see what rules were

looking for that completed constituent

Earley Algorithm

24

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends

So predictor looking at S -gt VP [00]

results in VP -gt Verb [00] VP -gt Verb NP [00]

Predictor

25

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new

state VP -gt Verb NP [01]

Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

Scanner

26

Applied to a state when its dot has reached right end of role

Parser has discovered a category over some span of input

Find and advance all previous states that were looking for this category copy state move dot insert in current chart entry

Given NP -gt Det Nominal [13] VP -gt Verb NP [01]

Add VP -gt Verb NP [03]

Completer

27

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

How do we know we are done

28

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

Earley

29

Book that flight We should findhellip an S from 0 to 3 that is a

completed statehellip

Example

CFG for Fragment of EnglishS NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

Example

32

Example

33

Example

34

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parser Thatrsquos how we solve (not) an exponential problem in

polynomial time

Details

35

With the addition of a few pointers we have a parser

Augment the ldquoCompleterrdquo to point to where we came from

Converting Earley from Recognizer to Parser

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9S8

37

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1] Follow the structural traces from the Completer Of course this wonrsquot be polynomial time since

there could be an exponential number of trees We can at least represent ambiguity efficiently

Retrieving Parse Trees from Chart

38

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

Left Recursion vs Right Recursion

)(

Solutions Rewrite the grammar (automatically) to a

weakly equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e Not so obvious what these rules meanhellip

40

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PP Nom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules first NP --gt Det Nom NP --gt NP PP

41

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

Another Problem Structural ambiguity

42

NP vs VP Attachment

43

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

44

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problems Combining the two solves some but not all issues

Left recursion Syntactic ambiguity

Next time Making use of statistical information about syntactic constituents Read Ch 14

Summing Up

  • Slide 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing (2)
  • CFG Example
  • Modified CFG
  • Slide 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley (2)
  • Example
  • CFG for Fragment of English
  • Example (2)
  • Example (3)
  • Example (4)
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide 39
  • Slide 40
  • Another Problem Structural ambiguity
  • Slide 42
  • Slide 43
  • Summing Up
Page 3: Basic Parsing with Context-Free Grammars

3

Homework Questions

4

Evaluation

5

Syntactic Parsing

6

Declarative formalisms like CFGs FSAs define the legal strings of a language -- but only tell you lsquothis is a legal string of the language Xrsquo

Parsing algorithms specify how to recognize the strings of a language and assign each string one (or more) syntactic analyses

Syntactic Parsing

CFG Example Many possible CFGs for English here is an example

(fragment) S NP VP VP V NP NP Det N | Adj NP N boy | girl V sees | likes Adj big | small DetP a | the

big the small girl sees a boy John likes a girl I like a girl I sleep The old dog the footsteps of the young

the small boy likes a girl

Modified CFGS NP VP VP VS Aux NP VP VP -gt V PPS -gt VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young | flight

NP PropN V dog | include | prefer | book

NP -gt PronounNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

Parse Tree for lsquoThe old dog the footsteps of the youngrsquo for Prior CFG

S

NP VP

NPV

DETNOM

N PP

DET NOM

N

The old dog the

footstepsof the young

10

Searching FSAs Finding the right path through the automaton Search space defined by structure of FSA

Searching CFGs Finding the right parse tree among all possible

parse trees Search space defined by the grammar

Constraints provided by the input sentence and the automaton or grammar

Parsing as a Form of Search

11

Builds from the root S node to the leaves Expectation-based Common search strategy

Top-down left-to-right backtracking Try first rule with LHS = S Next expand all constituents in these treesrules Continue until leaves are POS Backtrack when candidate POS does not match input

string

Top-Down Parser

12

ldquoThe old dog the footsteps of the youngrdquo Where does backtracking happen

What are the computational disadvantages

What are the advantages

Rule Expansion

13

Parser begins with words of input and builds up trees applying grammar rules whose RHS matches

Det N V Det N Prep Det NThe old dog the footsteps of the young

Det Adj N Det N Prep Det NThe old dog the footsteps of the young

Parse continues until an S root node reached or no further node expansion possible

Bottom-Up Parsing

14

Det N V Det N Prep Det NThe old dog the footsteps of the youngDet Adj N Det N Prep Det N

15

When does disambiguation occur

What are the computational advantages and disadvantages

Bottom-up parsing

16

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the input

Bottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)

For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

Whatrsquos rightwrong withhellip

17

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theory Early Parsing Algorithm

Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never added

Chart Parser

Some Solutions

18

Allows arbitrary CFGs Fills a table in a single sweep over the input

words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locations In-progress constituents Predicted constituents

Earley Parsing

19

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in

progressVP -gt V NP A VP has been found

States

20

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12]An NP is in progress the

Det goes from 1 to 2VP -gt V NP [03] A VP has been found

starting at 0 and ending at 3

StatesLocations

21

Graphically

22

As with most dynamic programming approaches the answer is found by looking in the table in the right place

In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

Earley

23

March through chart left-to-right At each step apply 1 of 3 operators

Predictor Create new states representing top-down

expectations Scanner

Match word predictions (rule with word after dot) to words

Completer When a state is complete see what rules were

looking for that completed constituent

Earley Algorithm

24

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends

So predictor looking at S -gt VP [00]

results in VP -gt Verb [00] VP -gt Verb NP [00]

Predictor

25

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new

state VP -gt Verb NP [01]

Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

Scanner

26

Applied to a state when its dot has reached right end of role

Parser has discovered a category over some span of input

Find and advance all previous states that were looking for this category copy state move dot insert in current chart entry

Given NP -gt Det Nominal [13] VP -gt Verb NP [01]

Add VP -gt Verb NP [03]

Completer

27

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

How do we know we are done

28

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

Earley

29

Book that flight We should findhellip an S from 0 to 3 that is a

completed statehellip

Example

CFG for Fragment of EnglishS NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

Example

32

Example

33

Example

34

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parser Thatrsquos how we solve (not) an exponential problem in

polynomial time

Details

35

With the addition of a few pointers we have a parser

Augment the ldquoCompleterrdquo to point to where we came from

Converting Earley from Recognizer to Parser

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9S8

37

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1] Follow the structural traces from the Completer Of course this wonrsquot be polynomial time since

there could be an exponential number of trees We can at least represent ambiguity efficiently

Retrieving Parse Trees from Chart

38

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

Left Recursion vs Right Recursion

)(

Solutions Rewrite the grammar (automatically) to a

weakly equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e Not so obvious what these rules meanhellip

40

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PP Nom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules first NP --gt Det Nom NP --gt NP PP

41

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

Another Problem Structural ambiguity

42

NP vs VP Attachment

43

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

44

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problems Combining the two solves some but not all issues

Left recursion Syntactic ambiguity

Next time Making use of statistical information about syntactic constituents Read Ch 14

Summing Up

  • Slide 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing (2)
  • CFG Example
  • Modified CFG
  • Slide 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley (2)
  • Example
  • CFG for Fragment of English
  • Example (2)
  • Example (3)
  • Example (4)
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide 39
  • Slide 40
  • Another Problem Structural ambiguity
  • Slide 42
  • Slide 43
  • Summing Up
Page 4: Basic Parsing with Context-Free Grammars

4

Evaluation

5

Syntactic Parsing

6

Declarative formalisms like CFGs FSAs define the legal strings of a language -- but only tell you lsquothis is a legal string of the language Xrsquo

Parsing algorithms specify how to recognize the strings of a language and assign each string one (or more) syntactic analyses

Syntactic Parsing

CFG Example Many possible CFGs for English here is an example

(fragment) S NP VP VP V NP NP Det N | Adj NP N boy | girl V sees | likes Adj big | small DetP a | the

big the small girl sees a boy John likes a girl I like a girl I sleep The old dog the footsteps of the young

the small boy likes a girl

Modified CFG

  Rules:

      S -> NP VP
      S -> Aux NP VP
      S -> VP
      NP -> Det Nom
      NP -> PropN
      NP -> Pronoun
      Nom -> Adj Nom
      Nom -> N
      Nom -> N Nom
      Nom -> Nom PP
      VP -> V
      VP -> V NP
      VP -> V PP
      PP -> Prep NP

  Lexicon:

      N -> old | dog | footsteps | young | flight
      V -> dog | include | prefer | book
      Aux -> does
      Prep -> from | to | on | of
      PropN -> Bush | McCain | Obama
      Det -> that | this | a | the
      Adj -> old | green | red

Parse Tree for 'The old dog the footsteps of the young' for Prior CFG

[Figure: parse tree rooted in S -> NP VP, with node labels S, NP, VP, V, DET, NOM, N and PP over the words "The old dog the footsteps of the young"]

10

Parsing as a Form of Search

  • Searching FSAs
      • Finding the right path through the automaton
      • Search space defined by structure of FSA
  • Searching CFGs
      • Finding the right parse tree among all possible parse trees
      • Search space defined by the grammar
  • Constraints provided by the input sentence and the automaton or grammar

11

Top-Down Parser

  • Builds from the root S node to the leaves
  • Expectation-based
  • Common search strategy
      • Top-down, left-to-right, backtracking
      • Try first rule with LHS = S
      • Next expand all constituents in these trees/rules
      • Continue until leaves are POS
      • Backtrack when candidate POS does not match input string
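The top-down, left-to-right, backtracking strategy can be sketched as a small recognizer. This is only an illustration under stated assumptions: it uses the toy fragment from the CFG Example slide, and the names `GRAMMAR`, `LEXICON`, `expand` and `recognize` are hypothetical, not from the lecture.

```python
from typing import Dict, Iterator, List, Set, Tuple

# Toy fragment from the CFG Example slide; the dict layout is an assumption.
GRAMMAR: Dict[str, List[List[str]]] = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"]],
    "NP": [["Det", "N"], ["Adj", "NP"]],
}
LEXICON: Dict[str, Set[str]] = {
    "N": {"boy", "girl"},
    "V": {"sees", "likes"},
    "Adj": {"big", "small"},
    "Det": {"a", "the"},
}


def expand(symbols: Tuple[str, ...], words: List[str], pos: int) -> Iterator[int]:
    """Yield every input position reachable after deriving `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    first, rest = symbols[0], symbols[1:]
    if first in LEXICON:
        # The leaf is a POS: it must match the next word, otherwise this
        # branch yields nothing and the caller tries its next alternative.
        if pos < len(words) and words[pos] in LEXICON[first]:
            yield from expand(rest, words, pos + 1)
    else:
        # Try each rule for the non-terminal in order (leftmost symbol first).
        for rhs in GRAMMAR[first]:
            yield from expand(tuple(rhs) + rest, words, pos)


def recognize(sentence: str) -> bool:
    """True if some top-down expansion of S consumes the whole input."""
    words = sentence.lower().split()
    return any(end == len(words) for end in expand(("S",), words, 0))
```

Because every recursive rule in this fragment consumes a word before recursing, the search terminates; a left-recursive rule such as NP -> NP PP would make this sketch loop forever.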

12

Rule Expansion

  • "The old dog the footsteps of the young."
  • Where does backtracking happen?
  • What are the computational disadvantages?
  • What are the advantages?

13

Bottom-Up Parsing

  • Parser begins with words of input and builds up trees, applying grammar rules whose RHS matches

      Det  N    V    Det  N          Prep  Det  N
      The  old  dog  the  footsteps  of    the  young

      Det  Adj  N    Det  N          Prep  Det  N
      The  old  dog  the  footsteps  of    the  young

  • Parse continues until an S root node is reached or no further node expansion is possible

14

      Det  N    V    Det  N          Prep  Det  N
      The  old  dog  the  footsteps  of    the  young
      Det  Adj  N    Det  N          Prep  Det  N

15

Bottom-up parsing

  • When does disambiguation occur?
  • What are the computational advantages and disadvantages?

16

What's right/wrong with…

  • Top-Down parsers: they never explore illegal parses (e.g. which can't form an S), but waste time on trees that can never match the input
  • Bottom-Up parsers: they never explore trees inconsistent with input, but waste time exploring illegal parses (with no S root)
  • For both: find a control strategy, i.e. how to explore the search space efficiently
      • Pursue all parses in parallel, or backtrack, or …?
      • Which rule to apply next?
      • Which node to expand next?

17

Some Solutions

  • Dynamic Programming Approaches: use a chart to represent partial results
  • CKY Parsing Algorithm
      • Bottom-up
      • Grammar must be in Chomsky Normal Form
      • The parse tree might not be consistent with linguistic theory
  • Earley Parsing Algorithm
      • Top-down
      • Expectations about constituents are confirmed by input
      • A POS tag for a word that is not predicted is never added
  • Chart Parser
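As a rough illustration of the chart idea behind CKY, here is a minimal bottom-up recognizer. It assumes a hypothetical grammar already in Chomsky Normal Form (a cut-down version of the earlier toy fragment); `BINARY`, `LEXICAL` and `cky_recognize` are illustrative names, not part of the lecture.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical CNF grammar: binary rules keyed by their right-hand side.
BINARY: Dict[Tuple[str, str], Set[str]] = {
    ("NP", "VP"): {"S"},
    ("V", "NP"): {"VP"},
    ("Det", "N"): {"NP"},
}
# Lexical rules: word -> set of POS categories.
LEXICAL: Dict[str, Set[str]] = {
    "a": {"Det"}, "the": {"Det"},
    "boy": {"N"}, "girl": {"N"},
    "sees": {"V"}, "likes": {"V"},
}


def cky_recognize(words: List[str], start: str = "S") -> bool:
    """Return True if `start` spans the whole input in the CKY table."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds every category that can span words[i:j].
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        table[i][i + 1] = set(LEXICAL.get(word, set()))
    for span in range(2, n + 1):             # width of the span
        for i in range(n - span + 1):        # left edge
            j = i + span                     # right edge
            for k in range(i + 1, j):        # split point
                for left in table[i][k]:
                    for right in table[k][j]:
                        table[i][j] |= BINARY.get((left, right), set())
    return start in table[0][n]
```

Each cell is filled once from smaller cells, so partial results are reused rather than re-derived, which is the point of keeping a chart.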

18

Earley Parsing

  • Allows arbitrary CFGs
  • Fills a table in a single sweep over the input words
      • Table is length N+1; N is number of words
      • Table entries represent
          • Completed constituents and their locations
          • In-progress constituents
          • Predicted constituents

19

States

  • The table entries are called states and are represented with dotted rules

      S -> • VP              A VP is predicted
      NP -> Det • Nominal    An NP is in progress
      VP -> V NP •           A VP has been found

20

States/Locations

  • It would be nice to know where these things are in the input, so…

      S -> • VP [0,0]              A VP is predicted at the start of the sentence
      NP -> Det • Nominal [1,2]    An NP is in progress; the Det goes from 1 to 2
      VP -> V NP • [0,3]           A VP has been found starting at 0 and ending at 3
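One possible way to hold a dotted rule together with its span is a small immutable record. This is a sketch only; the class name `State` and its fields are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class State:
    lhs: str                # left-hand side of the rule
    rhs: Tuple[str, ...]    # right-hand side symbols
    dot: int                # how many rhs symbols have been recognized
    start: int              # where the constituent begins in the input
    end: int                # how far the recognized part extends

    def is_complete(self) -> bool:
        """The dot has reached the right end of the rule."""
        return self.dot == len(self.rhs)

    def next_symbol(self) -> Optional[str]:
        """Symbol immediately to the right of the dot, if any."""
        return None if self.is_complete() else self.rhs[self.dot]

    def __str__(self) -> str:
        symbols = list(self.rhs)
        symbols.insert(self.dot, "•")
        return f"{self.lhs} -> {' '.join(symbols)} [{self.start},{self.end}]"
```

For instance, `State("NP", ("Det", "Nominal"), 1, 1, 2)` stands for an NP in progress whose Det goes from 1 to 2.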

21

Graphically

22

Earley

  • As with most dynamic programming approaches, the answer is found by looking in the table in the right place
  • In this case, there should be an S state in the final column that spans from 0 to N and is complete
  • If that's the case, you're done: S -> α • [0,N]

23

Earley Algorithm

  • March through chart left-to-right
  • At each step, apply 1 of 3 operators
      • Predictor: create new states representing top-down expectations
      • Scanner: match word predictions (rule with word after dot) to words
      • Completer: when a state is complete, see what rules were looking for that completed constituent

24

Predictor

  • Given a state
      • With a non-terminal to right of dot (not a part-of-speech category)
      • Create a new state for each expansion of the non-terminal
      • Place these new states into the same chart entry as the generating state, beginning and ending where the generating state ends
      • So predictor looking at
          S -> • VP [0,0]
      • results in
          VP -> • Verb [0,0]
          VP -> • Verb NP [0,0]

25

Scanner

  • Given a state
      • With a non-terminal to right of dot that is a part-of-speech category
      • If the next word in the input matches this POS
      • Create a new state with dot moved over the non-terminal
      • So scanner looking at VP -> • Verb NP [0,0]
      • If the next word, "book", can be a verb, add new state:
          VP -> Verb • NP [0,1]
      • Add this state to chart entry following current one
      • Note: Earley algorithm uses top-down input to disambiguate POS. Only POS predicted by some state can get added to chart.

26

Completer

  • Applied to a state when its dot has reached right end of rule
  • Parser has discovered a category over some span of input
  • Find and advance all previous states that were looking for this category
      • Copy state, move dot, insert in current chart entry
  • Given:
      NP -> Det Nominal • [1,3]
      VP -> Verb • NP [0,1]
  • Add:
      VP -> Verb NP • [0,3]

27

How do we know we are done?

  • Find an S state in the final column that spans from 0 to N and is complete
  • If that's the case, you're done: S -> α • [0,N]

28

Earley

More specifically…

  1. Predict all the states you can upfront
  2. Read a word
      1. Extend states based on matches
      2. Add new predictions
      3. Go to 2
  3. Look at the last chart entry (the (N+1)th) to see if you have a winner
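Putting Predictor, Scanner and Completer together, a compact recognizer might look like the sketch below. It is an illustration under assumptions, not the lecture's implementation: it uses the Modified CFG rules with a lowercased lexicon, represents a state as a plain tuple, and adds a hypothetical dummy start rule `GAMMA -> S`.

```python
from typing import Dict, List, Set, Tuple

# A state is (lhs, rhs, dot, start); its end is the index of the chart
# entry that holds it. This tuple layout is an assumption of the sketch.
State = Tuple[str, Tuple[str, ...], int, int]

GRAMMAR: Dict[str, List[Tuple[str, ...]]] = {
    "S": [("NP", "VP"), ("Aux", "NP", "VP"), ("VP",)],
    "NP": [("Det", "Nom"), ("PropN",), ("Pronoun",)],
    "Nom": [("Adj", "Nom"), ("N",), ("N", "Nom"), ("Nom", "PP")],
    "VP": [("V", "NP"), ("V",), ("V", "PP")],
    "PP": [("Prep", "NP")],
}
# Part-of-speech categories; words are lowercased, so input must be too.
LEXICON: Dict[str, Set[str]] = {
    "N": {"old", "dog", "footsteps", "young", "flight"},
    "V": {"dog", "include", "prefer", "book"},
    "Aux": {"does"},
    "Prep": {"from", "to", "on", "of"},
    "PropN": {"bush", "mccain", "obama"},
    "Det": {"that", "this", "a", "the"},
    "Adj": {"old", "green", "red"},
}


def earley_recognize(words: List[str], start_symbol: str = "S") -> bool:
    n = len(words)
    chart: List[List[State]] = [[] for _ in range(n + 1)]  # N+1 entries

    def add(state: State, k: int) -> None:
        if state not in chart[k]:          # never add a duplicate state
            chart[k].append(state)

    add(("GAMMA", (start_symbol,), 0, 0), 0)   # dummy start state
    for k in range(n + 1):
        i = 0
        while i < len(chart[k]):           # chart[k] grows as we process it
            lhs, rhs, dot, start = chart[k][i]
            i += 1
            if dot < len(rhs):
                symbol = rhs[dot]
                if symbol in GRAMMAR:
                    # Predictor: expand the non-terminal right of the dot.
                    for expansion in GRAMMAR[symbol]:
                        add((symbol, expansion, 0, k), k)
                elif k < n and words[k] in LEXICON.get(symbol, set()):
                    # Scanner: the next word can have the predicted POS.
                    add((lhs, rhs, dot + 1, start), k + 1)
            else:
                # Completer: advance states that were waiting for `lhs`.
                for l2, r2, d2, s2 in list(chart[start]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        add((l2, r2, d2 + 1, s2), k)
    return ("GAMMA", (start_symbol,), 1, 0) in chart[n]
```

The duplicate check in `add` is what keeps the left-recursive rule Nom -> Nom PP from looping: predicting Nom a second time in the same chart entry adds nothing new. The sketch assumes a grammar without empty (ε) rules.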

29

Example

  • Book that flight
  • We should find… an S from 0 to 3 that is a completed state…

CFG for Fragment of English

  Rules:

      S -> NP VP
      S -> Aux NP VP
      NP -> Det Nom
      NP -> PropN
      Nom -> Adj Nom
      Nom -> N
      Nom -> N Nom
      Nom -> Nom PP
      VP -> V
      VP -> V NP
      PP -> Prep NP

  Lexicon:

      N -> old | dog | footsteps | young
      V -> dog | include | prefer
      Aux -> does
      Prep -> from | to | on | of
      PropN -> Bush | McCain | Obama
      Det -> that | this | a | the
      Adj -> old | green | red

31

Example

32

Example

33

Example

34

Details

  • What kind of algorithms did we just describe?
      • Not parsers: recognizers
      • The presence of an S state with the right attributes in the right place indicates a successful recognition
      • But no parse tree… no parser
      • That's how we solve (not) an exponential problem in polynomial time

35

Converting Earley from Recognizer to Parser

  • With the addition of a few pointers we have a parser
  • Augment the "Completer" to point to where we came from

Augmenting the chart with structural information

[Figure: chart entries annotated with pointers to states S8 through S13]

37

Retrieving Parse Trees from Chart

  • All the possible parses for an input are in the table
  • We just need to read off all the backpointers from every complete S in the last column of the table
  • Find all the S -> X • [0,N]
  • Follow the structural traces from the Completer
  • Of course, this won't be polynomial time, since there could be an exponential number of trees
  • We can at least represent ambiguity efficiently
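To get a feel for how quickly the number of trees can grow, one standard illustration (not taken from the slides) counts the distinct binary bracketings of a sequence of n items, which is the Catalan number C(n-1). The function name below is hypothetical.

```python
from math import comb


def count_binary_bracketings(n_items: int) -> int:
    """Number of distinct binary trees over n_items leaves: Catalan(n_items - 1)."""
    if n_items < 1:
        raise ValueError("need at least one item")
    n = n_items - 1
    return comb(2 * n, n) // (n + 1)


if __name__ == "__main__":
    for n_items in range(1, 11):
        print(n_items, count_binary_bracketings(n_items))
```

The count roughly quadruples with each additional item, which is why enumerating every tree cannot be polynomial even though the chart that encodes them can be built in polynomial time.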

38

Left Recursion vs Right Recursion

  • Depth-first search will never terminate if grammar is left recursive (e.g. NP -> NP PP)

  • Solutions: rewrite the grammar (automatically) to a weakly equivalent one which is not left-recursive
      e.g. The man on the hill with the telescope…

          NP -> NP PP      (wanted: Nom plus a sequence of PPs)
          NP -> Nom PP
          NP -> Nom
          Nom -> Det N

      …becomes…

          NP -> Nom NP'
          Nom -> Det N
          NP' -> PP NP'    (wanted: a sequence of PPs)
          NP' -> ε

  • Not so obvious what these rules mean…
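The automatic rewrite can be sketched with the textbook transformation for immediate left recursion: A -> A α | β becomes A -> β A' and A' -> α A' | ε. This is a minimal sketch with hypothetical names (`Rules`, `remove_immediate_left_recursion`); it handles only immediate left recursion for a single non-terminal, and its output for the NP rules is slightly more literal than the simplified grammar on the slide.

```python
from typing import Dict, List, Tuple

# Each non-terminal maps to a list of right-hand sides; () stands for epsilon.
Rules = Dict[str, List[Tuple[str, ...]]]


def remove_immediate_left_recursion(rules: Rules, nt: str) -> Rules:
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | epsilon."""
    # alpha parts: what follows the leading A in each left-recursive rule.
    alphas = [rhs[1:] for rhs in rules[nt] if rhs and rhs[0] == nt]
    # beta parts: the rules for A that do not start with A.
    betas = [rhs for rhs in rules[nt] if not rhs or rhs[0] != nt]
    result = dict(rules)
    if not alphas:
        return result                       # nothing to rewrite
    new_nt = nt + "'"                       # fresh non-terminal, e.g. NP'
    result[nt] = [beta + (new_nt,) for beta in betas]
    result[new_nt] = [alpha + (new_nt,) for alpha in alphas] + [()]
    return result


if __name__ == "__main__":
    grammar: Rules = {
        "NP": [("NP", "PP"), ("Nom", "PP"), ("Nom",)],
        "Nom": [("Det", "N")],
    }
    for lhs, alternatives in remove_immediate_left_recursion(grammar, "NP").items():
        for rhs in alternatives:
            print(lhs, "->", " ".join(rhs) or "ε")
```

The rewritten grammar accepts the same strings but assigns them different trees, which is what "weakly equivalent" means here.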

40

  • Harder to detect and eliminate non-immediate left recursion

      NP -> Nom PP
      Nom -> NP

  • Fix depth of search explicitly
  • Rule ordering: non-recursive rules first

      NP -> Det Nom
      NP -> NP PP

41

Another Problem: Structural ambiguity

  • Multiple legal structures
      • Attachment (e.g. I saw a man on a hill with a telescope)
      • Coordination (e.g. younger cats and dogs)
      • NP bracketing (e.g. Spanish language teachers)

42

NP vs VP Attachment

43

  • Solution: return all possible parses and disambiguate using "other methods"

44

Summing Up

  • Parsing is a search problem which may be implemented with many control strategies
      • Top-Down or Bottom-Up approaches each have problems
      • Combining the two solves some but not all issues
          • Left recursion
          • Syntactic ambiguity
  • Next time: Making use of statistical information about syntactic constituents
      • Read Ch 14

Page 7: Basic Parsing with Context-Free Grammars

CFG Example Many possible CFGs for English here is an example

(fragment) S NP VP VP V NP NP Det N | Adj NP N boy | girl V sees | likes Adj big | small DetP a | the

big the small girl sees a boy John likes a girl I like a girl I sleep The old dog the footsteps of the young

the small boy likes a girl

Modified CFGS NP VP VP VS Aux NP VP VP -gt V PPS -gt VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young | flight

NP PropN V dog | include | prefer | book

NP -gt PronounNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

Parse Tree for lsquoThe old dog the footsteps of the youngrsquo for Prior CFG

S

NP VP

NPV

DETNOM

N PP

DET NOM

N

The old dog the

footstepsof the young

10

Searching FSAs Finding the right path through the automaton Search space defined by structure of FSA

Searching CFGs Finding the right parse tree among all possible

parse trees Search space defined by the grammar

Constraints provided by the input sentence and the automaton or grammar

Parsing as a Form of Search

11

Builds from the root S node to the leaves Expectation-based Common search strategy

Top-down left-to-right backtracking Try first rule with LHS = S Next expand all constituents in these treesrules Continue until leaves are POS Backtrack when candidate POS does not match input

string

Top-Down Parser

12

ldquoThe old dog the footsteps of the youngrdquo Where does backtracking happen

What are the computational disadvantages

What are the advantages

Rule Expansion

13

Parser begins with words of input and builds up trees applying grammar rules whose RHS matches

Det N V Det N Prep Det NThe old dog the footsteps of the young

Det Adj N Det N Prep Det NThe old dog the footsteps of the young

Parse continues until an S root node reached or no further node expansion possible

Bottom-Up Parsing

14

Det N V Det N Prep Det NThe old dog the footsteps of the youngDet Adj N Det N Prep Det N

15

When does disambiguation occur

What are the computational advantages and disadvantages

Bottom-up parsing

16

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the input

Bottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)

For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

Whatrsquos rightwrong withhellip

17

Dynamic Programming Approaches: use a chart to represent partial results

CKY Parsing Algorithm
  Bottom-up
  Grammar must be in (Chomsky) Normal Form
  The parse tree might not be consistent with linguistic theory
Earley Parsing Algorithm
  Top-down
  Expectations about constituents are confirmed by input
  A POS tag for a word that is not predicted is never added
Chart Parser

Some Solutions
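To make the CKY bullet concrete, here is a small recognizer sketch. The grammar tables `BINARY` and `LEXICAL` and the function name `cky_recognize` are assumptions made for this example. The grammar is already in Chomsky Normal Form; the entry mapping `V NP` to both VP and S stands in for the unit rule S -> VP, which is one way the resulting trees can drift from the linguistically motivated grammar.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Toy CNF grammar, assumed for illustration only.
BINARY: Dict[Tuple[str, str], Set[str]] = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP", "S"},  # "S" here replaces the unit rule S -> VP
}
LEXICAL: Dict[str, Set[str]] = {
    "book": {"V", "N"}, "that": {"Det"}, "flight": {"N"}, "i": {"NP"},
}


def cky_recognize(words: List[str], start: str = "S") -> bool:
    n = len(words)
    table = defaultdict(set)  # table[i, k] = categories spanning words[i:k]
    for i, word in enumerate(words):
        table[i, i + 1] |= LEXICAL.get(word, set())
    for span in range(2, n + 1):            # shorter spans are filled first
        for i in range(0, n - span + 1):
            k = i + span
            for j in range(i + 1, k):       # every split point of the span
                for left in table[i, j]:
                    for right in table[j, k]:
                        table[i, k] |= BINARY.get((left, right), set())
    return start in table[0, n]
```

Each cell is computed once from smaller cells, so the work is cubic in the sentence length rather than exponential.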

18

Allows arbitrary CFGs
Fills a table in a single sweep over the input words
  Table is length N+1; N is number of words
  Table entries represent
    Completed constituents and their locations
    In-progress constituents
    Predicted constituents

Earley Parsing

19

The table entries are called states and are represented with dotted rules
  S -> • VP             A VP is predicted
  NP -> Det • Nominal   An NP is in progress
  VP -> V NP •          A VP has been found

States

20

It would be nice to know where these things are in the input, so...
  S -> • VP [0,0]             A VP is predicted at the start of the sentence
  NP -> Det • Nominal [1,2]   An NP is in progress; the Det goes from 1 to 2
  VP -> V NP • [0,3]          A VP has been found starting at 0 and ending at 3

States/Locations
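One possible way to represent such a state in code is sketched below. The class name `State` and its field and method names are assumptions for illustration, not something the slides define.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class State:
    """A dotted rule plus the input span it covers (hypothetical layout)."""
    lhs: str
    rhs: Tuple[str, ...]
    dot: int    # number of RHS symbols already recognized
    start: int  # where the constituent begins
    end: int    # how far the recognized part extends

    def is_complete(self) -> bool:
        return self.dot == len(self.rhs)

    def next_symbol(self) -> Optional[str]:
        """Symbol immediately to the right of the dot, if any."""
        return None if self.is_complete() else self.rhs[self.dot]

    def __str__(self) -> str:
        symbols = list(self.rhs)
        symbols.insert(self.dot, "•")
        return f"{self.lhs} -> {' '.join(symbols)} [{self.start},{self.end}]"
```

For instance, `State("NP", ("Det", "Nominal"), 1, 1, 2)` prints as `NP -> Det • Nominal [1,2]`, matching the notation above.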

21

Graphically

22

As with most dynamic programming approaches, the answer is found by looking in the table in the right place

In this case, there should be an S state in the final column (the last of the N+1 chart entries) that spans from 0 to N and is complete

If that's the case, you're done: S -> α • [0,N]

Earley

23

March through chart left-to-right
At each step, apply 1 of 3 operators
  Predictor
    Create new states representing top-down expectations
  Scanner
    Match word predictions (rule with word after dot) to words
  Completer
    When a state is complete, see what rules were looking for that completed constituent

Earley Algorithm

24

Given a state
  With a non-terminal to right of dot (not a part-of-speech category)
  Create a new state for each expansion of the non-terminal
  Place these new states into the same chart entry as the generating state, beginning and ending where the generating state ends
So predictor looking at
  S -> • VP [0,0]
results in
  VP -> • Verb [0,0]
  VP -> • Verb NP [0,0]

Predictor

25

Given a state
  With a non-terminal to right of dot that is a part-of-speech category
  If the next word in the input matches this POS
  Create a new state with dot moved over the non-terminal
So scanner looking at VP -> • Verb NP [0,0]
  If the next word, "book", can be a verb, add new state: VP -> Verb • NP [0,1]
  Add this state to chart entry following current one
Note: Earley algorithm uses top-down input to disambiguate POS. Only POS predicted by some state can get added to chart.

Scanner

26

Applied to a state when its dot has reached the right end of the rule

Parser has discovered a category over some span of input

Find and advance all previous states that were looking for this category: copy state, move dot, insert in current chart entry

Given
  NP -> Det Nominal • [1,3]
  VP -> Verb • NP [0,1]
Add
  VP -> Verb NP • [0,3]

Completer

27

Find an S state in the final column (the last of the N+1 chart entries) that spans from 0 to N and is complete

If that's the case, you're done: S -> α • [0,N]

How do we know we are done

28

More specifically...

1. Predict all the states you can upfront
2. Read a word
   1. Extend states based on matches
   2. Add new predictions
   3. Go to 2
3. Look at N+1 to see if you have a winner

Earley
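Putting Predictor, Scanner and Completer together gives a compact recognizer. This is a sketch under stated assumptions: the toy `GRAMMAR` and `LEXICON`, the dummy start symbol `GAMMA`, and the tuple layout of a state are choices made for this example. The scanner follows the variant described on these slides (it moves the dot over the POS category directly) and the grammar is assumed to have no empty rules.

```python
from typing import Dict, List, Set, Tuple

# A state is (lhs, rhs, dot, origin); the chart index supplies the end point.
EarleyState = Tuple[str, Tuple[str, ...], int, int]

# Toy grammar and lexicon, assumed for illustration only.
GRAMMAR: Dict[str, List[Tuple[str, ...]]] = {
    "S": [("NP", "VP"), ("VP",)],
    "NP": [("Det", "Nom")],
    "Nom": [("N",), ("Nom", "PP")],   # left-recursive on purpose
    "VP": [("V",), ("V", "NP")],
    "PP": [("Prep", "NP")],
}
LEXICON: Dict[str, Set[str]] = {
    "book": {"V", "N"}, "that": {"Det"}, "flight": {"N"}, "from": {"Prep"},
}


def earley_recognize(words: List[str], start: str = "S") -> bool:
    n = len(words)
    chart: List[List[EarleyState]] = [[] for _ in range(n + 1)]

    def add(state: EarleyState, k: int) -> None:
        if state not in chart[k]:      # never enter the same state twice
            chart[k].append(state)

    add(("GAMMA", (start,), 0, 0), 0)  # dummy state: GAMMA -> • S [0,0]
    for k in range(n + 1):
        i = 0
        while i < len(chart[k]):       # chart[k] may grow while we walk it
            lhs, rhs, dot, origin = chart[k][i]
            i += 1
            if dot == len(rhs):
                # Completer: advance states that were waiting for `lhs`.
                for l2, r2, d2, o2 in list(chart[origin]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        add((l2, r2, d2 + 1, o2), k)
            elif rhs[dot] in GRAMMAR:
                # Predictor: one new state per expansion of the non-terminal.
                for expansion in GRAMMAR[rhs[dot]]:
                    add((rhs[dot], expansion, 0, k), k)
            elif k < n and rhs[dot] in LEXICON.get(words[k], set()):
                # Scanner: the predicted POS matches the next input word.
                add((lhs, rhs, dot + 1, origin), k + 1)
    return ("GAMMA", (start,), 1, 0) in chart[n]
```

Because a state is entered into a chart column at most once, the left-recursive rule Nom -> Nom PP is predicted a single time per column and does not cause a loop.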

29

Book that flight
We should find... an S from 0 to 3 that is a completed state...

Example

CFG for Fragment of English
  S -> NP VP            VP -> V
  S -> Aux NP VP        PP -> Prep NP
  NP -> Det Nom         N -> old | dog | footsteps | young
  NP -> PropN           V -> dog | include | prefer
  Nom -> Adj Nom        Aux -> does
  Nom -> N              Prep -> from | to | on | of
  Nom -> N Nom          PropN -> Bush | McCain | Obama
  Nom -> Nom PP         Det -> that | this | a | the
  VP -> V NP            Adj -> old | green | red

31

Example

32

Example

33

Example

34

What kind of algorithms did we just describe?
  Not parsers: recognizers
  The presence of an S state with the right attributes in the right place indicates a successful recognition
  But no parse tree... no parser
  That's how we solve (not) an exponential problem in polynomial time

Details

35

With the addition of a few pointers we have a parser

Augment the "Completer" to point to where we came from

Converting Earley from Recognizer to Parser

Augmenting the chart with structural information

[Figure: chart entries with state labels S8 through S13]

37

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table
  Find all the S -> X • [0,N]
  Follow the structural traces from the Completer
Of course, this won't be polynomial time, since there could be an exponential number of trees
We can at least represent ambiguity efficiently

Retrieving Parse Trees from Chart
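A simplified sketch of the recognizer-to-parser step follows. Instead of true backpointers between states, each state here carries a copy of the subtrees it has collected so far; that is an assumption made to keep the example short, and it does not pack ambiguity the way a pointer-based chart would. The grammar, lexicon and the name `earley_parse` are likewise invented for the example.

```python
from typing import Dict, List, Set, Tuple

# Toy grammar and lexicon, assumed for illustration only (no empty rules).
GRAMMAR: Dict[str, List[Tuple[str, ...]]] = {
    "S": [("VP",)],
    "VP": [("V", "NP"), ("VP", "PP")],
    "NP": [("Det", "N"), ("NP", "PP")],
    "PP": [("Prep", "NP")],
}
LEXICON: Dict[str, Set[str]] = {
    "book": {"V"}, "the": {"Det"}, "flight": {"N"},
    "plane": {"N"}, "on": {"Prep"},
}


def earley_parse(words: List[str], start: str = "S") -> List[tuple]:
    """Return every parse tree as nested tuples (label, child, ...)."""
    n = len(words)
    chart: List[list] = [[] for _ in range(n + 1)]

    def add(state: tuple, k: int) -> None:
        if state not in chart[k]:
            chart[k].append(state)

    # State layout: (lhs, rhs, dot, origin, subtrees found so far).
    add(("GAMMA", (start,), 0, 0, ()), 0)
    for k in range(n + 1):
        i = 0
        while i < len(chart[k]):
            lhs, rhs, dot, origin, kids = chart[k][i]
            i += 1
            if dot == len(rhs):
                # Completer, augmented: hand the finished subtree upward.
                tree = (lhs,) + kids
                for l2, r2, d2, o2, kids2 in list(chart[origin]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        add((l2, r2, d2 + 1, o2, kids2 + (tree,)), k)
            elif rhs[dot] in GRAMMAR:
                for expansion in GRAMMAR[rhs[dot]]:   # Predictor
                    add((rhs[dot], expansion, 0, k, ()), k)
            elif k < n and rhs[dot] in LEXICON.get(words[k], set()):
                leaf = (rhs[dot], words[k])           # Scanner
                add((lhs, rhs, dot + 1, origin, kids + (leaf,)), k + 1)
    # Read the trees off the complete start states in the last column.
    return [kids[0] for lhs, rhs, dot, origin, kids in chart[n]
            if lhs == "GAMMA" and dot == len(rhs)]
```

For "book the flight on the plane" this returns two trees, one attaching the PP to the NP and one attaching it to the VP.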

38

Depth-first search will never terminate if grammar is left recursive (e.g. NP -> NP PP)

Left Recursion vs Right Recursion

Solutions: Rewrite the grammar (automatically) to a weakly equivalent one which is not left-recursive
  e.g. The man on the hill with the telescope...
    NP -> NP PP      (wanted: Nom plus a sequence of PPs)
    NP -> Nom PP
    NP -> Nom
    Nom -> Det N
  ...becomes...
    NP -> Nom NP'
    Nom -> Det N
    NP' -> PP NP'    (wanted: a sequence of PPs)
    NP' -> e         (the empty string)
  Not so obvious what these rules mean...
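The rewrite can be automated with the textbook transformation for immediate left recursion (A -> A α | β becomes A -> β A', A' -> α A' | empty). The sketch below assumes a grammar stored as a dict of tuples; the function name and the primed naming scheme are illustrative choices.

```python
from typing import Dict, List, Tuple

Grammar = Dict[str, List[Tuple[str, ...]]]


def remove_immediate_left_recursion(grammar: Grammar) -> Grammar:
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | ()."""
    rewritten: Grammar = {}
    for lhs, rules in grammar.items():
        # alpha parts: what follows the left-recursive occurrence of lhs
        recursive = [rhs[1:] for rhs in rules if rhs and rhs[0] == lhs]
        # beta parts: the rules that do not start with lhs
        others = [rhs for rhs in rules if not rhs or rhs[0] != lhs]
        if not recursive:
            rewritten[lhs] = list(rules)
            continue
        tail = lhs + "'"  # fresh non-terminal; assumed not to be in use already
        rewritten[lhs] = [rhs + (tail,) for rhs in others]
        rewritten[tail] = [rest + (tail,) for rest in recursive] + [()]
    return rewritten


slide_grammar: Grammar = {
    "NP": [("NP", "PP"), ("Nom", "PP"), ("Nom",)],
    "Nom": [("Det", "N")],
}
print(remove_immediate_left_recursion(slide_grammar))
```

Applied mechanically to the slide's rules, this also yields NP -> Nom PP NP', which is redundant next to NP -> Nom NP'; the slide's hand-written version omits it. The empty tuple plays the role of the empty string.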

40

Harder to detect and eliminate non-immediate left recursion
  NP -> Nom PP
  Nom -> NP

Fix depth of search explicitly

Rule ordering: non-recursive rules first
  NP -> Det Nom
  NP -> NP PP
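Detecting non-immediate left recursion amounts to following the first symbol of each rule until a non-terminal leads back to itself. A small sketch, assuming a grammar without empty rules and with hypothetical names throughout:

```python
from typing import Dict, List, Set, Tuple

Grammar = Dict[str, List[Tuple[str, ...]]]


def left_recursive_symbols(grammar: Grammar) -> Set[str]:
    """Non-terminals that can reach themselves through leftmost symbols."""
    # Leftmost symbol of every rule, per non-terminal (no empty rules assumed).
    first = {lhs: {rhs[0] for rhs in rules if rhs} for lhs, rules in grammar.items()}
    found: Set[str] = set()
    for symbol in grammar:
        seen: Set[str] = set()
        stack = list(first[symbol])
        while stack:
            current = stack.pop()
            if current == symbol:       # came back to where we started
                found.add(symbol)
                break
            if current in seen or current not in first:
                continue                # already visited, or a POS category
            seen.add(current)
            stack.extend(first[current])
    return found


indirect: Grammar = {
    "NP": [("Nom", "PP"), ("Det", "N")],
    "Nom": [("NP",)],
    "PP": [("Prep", "NP")],
}
print(left_recursive_symbols(indirect))
```

Here NP and Nom are both reported, even though neither rule is left-recursive on its own.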

41

Multiple legal structures
  Attachment (e.g. I saw a man on a hill with a telescope)
  Coordination (e.g. younger cats and dogs)
  NP bracketing (e.g. Spanish language teachers)

Another Problem: Structural ambiguity

42

NP vs VP Attachment

43

Solution? Return all possible parses and disambiguate using "other methods"

44

Parsing is a search problem which may be implemented with many control strategies
  Top-Down or Bottom-Up approaches each have problems
  Combining the two solves some but not all issues
    Left recursion
    Syntactic ambiguity
Next time: Making use of statistical information about syntactic constituents
  Read Ch 14

Summing Up

Page 10: Basic Parsing with Context-Free Grammars

10

Searching FSAs Finding the right path through the automaton Search space defined by structure of FSA

Searching CFGs Finding the right parse tree among all possible

parse trees Search space defined by the grammar

Constraints provided by the input sentence and the automaton or grammar

Parsing as a Form of Search

11

Builds from the root S node to the leaves Expectation-based Common search strategy

Top-down left-to-right backtracking Try first rule with LHS = S Next expand all constituents in these treesrules Continue until leaves are POS Backtrack when candidate POS does not match input

string

Top-Down Parser

12

ldquoThe old dog the footsteps of the youngrdquo Where does backtracking happen

What are the computational disadvantages

What are the advantages

Rule Expansion

13

Parser begins with words of input and builds up trees applying grammar rules whose RHS matches

Det N V Det N Prep Det NThe old dog the footsteps of the young

Det Adj N Det N Prep Det NThe old dog the footsteps of the young

Parse continues until an S root node reached or no further node expansion possible

Bottom-Up Parsing

14

Det N V Det N Prep Det NThe old dog the footsteps of the youngDet Adj N Det N Prep Det N

15

When does disambiguation occur

What are the computational advantages and disadvantages

Bottom-up parsing

16

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the input

Bottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)

For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

Whatrsquos rightwrong withhellip

17

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theory Early Parsing Algorithm

Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never added

Chart Parser

Some Solutions

18

Allows arbitrary CFGs Fills a table in a single sweep over the input

words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locations In-progress constituents Predicted constituents

Earley Parsing

19

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in

progressVP -gt V NP A VP has been found

States

20

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12]An NP is in progress the

Det goes from 1 to 2VP -gt V NP [03] A VP has been found

starting at 0 and ending at 3

StatesLocations

21

Graphically

22

As with most dynamic programming approaches the answer is found by looking in the table in the right place

In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

Earley

23

March through chart left-to-right At each step apply 1 of 3 operators

Predictor Create new states representing top-down

expectations Scanner

Match word predictions (rule with word after dot) to words

Completer When a state is complete see what rules were

looking for that completed constituent

Earley Algorithm

24

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends

So predictor looking at S -gt VP [00]

results in VP -gt Verb [00] VP -gt Verb NP [00]

Predictor

25

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new

state VP -gt Verb NP [01]

Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

Scanner

26

Applied to a state when its dot has reached right end of role

Parser has discovered a category over some span of input

Find and advance all previous states that were looking for this category copy state move dot insert in current chart entry

Given NP -gt Det Nominal [13] VP -gt Verb NP [01]

Add VP -gt Verb NP [03]

Completer

27

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

How do we know we are done

28

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

Earley

29

Book that flight We should findhellip an S from 0 to 3 that is a

completed statehellip

Example

CFG for Fragment of EnglishS NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

Example

32

Example

33

Example

34

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parser Thatrsquos how we solve (not) an exponential problem in

polynomial time

Details

35

With the addition of a few pointers we have a parser

Augment the ldquoCompleterrdquo to point to where we came from

Converting Earley from Recognizer to Parser

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9S8

37

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1] Follow the structural traces from the Completer Of course this wonrsquot be polynomial time since

there could be an exponential number of trees We can at least represent ambiguity efficiently

Retrieving Parse Trees from Chart

38

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

Left Recursion vs Right Recursion

)(

Solutions Rewrite the grammar (automatically) to a

weakly equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e Not so obvious what these rules meanhellip

  • Harder to detect and eliminate non-immediate left recursion:
        NP -> Nom PP
        Nom -> NP
  • Fix depth of search explicitly.
  • Rule ordering: non-recursive rules first:
        NP -> Det Nom
        NP -> NP PP
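Detecting non-immediate left recursion amounts to asking whether a non-terminal can reach itself by repeatedly following the first symbol of a rule. A minimal sketch, assuming a grammar with no empty right-hand sides; the dictionary encoding and the function name are hypothetical, not from the lecture:

```python
def left_recursive_nonterminals(grammar):
    """Return the non-terminals that can derive themselves as a leftmost symbol.

    `grammar` maps each non-terminal to a list of right-hand-side tuples.
    Assumes no rule has an empty right-hand side.
    """
    # first[A] = symbols that can appear first in some expansion of A.
    first = {lhs: {rhs[0] for rhs in alts if rhs} for lhs, alts in grammar.items()}
    result = set()
    for start in grammar:
        seen, stack = set(), list(first[start])
        while stack:  # depth-first walk over the "first symbol" relation
            symbol = stack.pop()
            if symbol in seen:
                continue
            seen.add(symbol)
            stack.extend(first.get(symbol, ()))
        if start in seen:
            result.add(start)
    return result


if __name__ == "__main__":
    indirect = {
        "NP": [("Nom", "PP")],
        "Nom": [("NP",), ("Det", "N")],
        "PP": [("Prep", "NP")],
    }
    print(sorted(left_recursive_nonterminals(indirect)))
```

On the NP -> Nom PP, Nom -> NP example this flags both NP and Nom, although neither rule is left-recursive on its own.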

Another Problem: Structural ambiguity
  • Multiple legal structures:
      • Attachment (e.g. I saw a man on a hill with a telescope)
      • Coordination (e.g. younger cats and dogs)
      • NP bracketing (e.g. Spanish language teachers)

NP vs VP Attachment
[Figure not recovered.]

  • Solution? Return all possible parses and disambiguate using "other methods".

Summing Up
  • Parsing is a search problem which may be implemented with many control strategies.
      • Top-Down or Bottom-Up approaches each have problems.
      • Combining the two solves some but not all issues:
          • Left recursion
          • Syntactic ambiguity
  • Next time: making use of statistical information about syntactic constituents.
      • Read Ch 14.


Top-Down Parser
  • Builds from the root S node to the leaves
  • Expectation-based
  • Common search strategy: top-down, left-to-right, backtracking
      • Try first rule with LHS = S
      • Next expand all constituents in these trees/rules
      • Continue until leaves are POS
      • Backtrack when candidate POS does not match input string
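The search strategy above can be sketched as a small backtracking recognizer. The grammar and lexicon below are a cut-down toy fragment chosen for this illustration (not the lecture's full grammar), and the function names are assumptions. Python generators supply the backtracking: when one rule fails to match the input, the loop simply moves on to the next alternative. The grammar must not be left-recursive, or the recursion never terminates.

```python
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "Nom"]],
    "Nom": [["Adj", "Nom"], ["N"]],
    "VP": [["V", "NP"], ["V"]],
}
LEXICON = {
    "the": {"Det"}, "a": {"Det"},
    "old": {"Adj", "N"}, "dog": {"N", "V"},
    "footsteps": {"N"}, "sees": {"V"},
}


def expand(symbol, words, pos):
    """Yield every input position reachable after deriving `symbol` from `pos`."""
    if symbol not in GRAMMAR:
        # A part-of-speech leaf: check it against the next input word.
        if pos < len(words) and symbol in LEXICON.get(words[pos], set()):
            yield pos + 1
        return
    for rhs in GRAMMAR[symbol]:
        # Try rules in order; running out of results is the backtrack step.
        yield from expand_sequence(rhs, words, pos)


def expand_sequence(symbols, words, pos):
    """Yield end positions after deriving each symbol in `symbols` in turn."""
    if not symbols:
        yield pos
        return
    for middle in expand(symbols[0], words, pos):
        yield from expand_sequence(symbols[1:], words, middle)


def recognize(sentence):
    words = sentence.lower().split()
    return any(end == len(words) for end in expand("S", words, 0))


if __name__ == "__main__":
    print(recognize("The old dog the footsteps"))
```

For "The old dog the footsteps" the first attempt reads "old" as Adj and "dog" as N, fails to find a verb, and backtracks to the reading where "old" is the noun and "dog" the verb.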

Rule Expansion
  • "The old dog the footsteps of the young."
  • Where does backtracking happen?
  • What are the computational disadvantages?
  • What are the advantages?

Bottom-Up Parsing
  • Parser begins with words of input and builds up trees, applying grammar rules whose RHS matches

        Det  N    V    Det  N          Prep  Det  N
        The  old  dog  the  footsteps  of    the  young

        Det  Adj  N    Det  N          Prep  Det  N
        The  old  dog  the  footsteps  of    the  young

  • Parse continues until an S root node is reached or no further node expansion is possible

        Det  N    V    Det  N          Prep  Det  N
        The  old  dog  the  footsteps  of    the  young
        Det  Adj  N    Det  N          Prep  Det  N

Bottom-up parsing
  • When does disambiguation occur?
  • What are the computational advantages and disadvantages?

What's right/wrong with…
  • Top-Down parsers: they never explore illegal parses (e.g. ones which can't form an S), but waste time on trees that can never match the input
  • Bottom-Up parsers: they never explore trees inconsistent with the input, but waste time exploring illegal parses (with no S root)
  • For both: find a control strategy, i.e. how to explore the search space efficiently?
      • Pursuing all parses in parallel, or backtrack, or …?
      • Which rule to apply next?
      • Which node to expand next?

Some Solutions
  • Dynamic Programming Approaches: use a chart to represent partial results
  • CKY Parsing Algorithm
      • Bottom-up
      • Grammar must be in Chomsky Normal Form
      • The parse tree might not be consistent with linguistic theory
  • Earley Parsing Algorithm
      • Top-down
      • Expectations about constituents are confirmed by input
      • A POS tag for a word that is not predicted is never added
  • Chart Parser
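As a rough illustration of the chart idea behind CKY, here is a minimal recognizer sketch. The grammar is a tiny hypothetical fragment already in Chomsky Normal Form (binary rules plus a word-to-category lexicon); none of the names come from the lecture. Each table cell stores the categories that span a stretch of the input, so a partial result is computed once and reused.

```python
from itertools import product

# Binary rules, indexed by their right-hand side: (B, C) -> {A} for A -> B C.
BINARY = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
# Lexical rules: word -> possible parts of speech.
LEXICAL = {
    "the": {"Det"}, "a": {"Det"},
    "old": {"N"}, "dog": {"N", "V"},
    "footsteps": {"N"}, "boy": {"N"}, "sees": {"V"},
}


def cky_recognize(words, start="S"):
    n = len(words)
    # table[i][j] holds the categories that span words[i:j].
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        table[i][i + 1] = set(LEXICAL.get(word, set()))
    for width in range(2, n + 1):          # build longer spans from shorter ones
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):      # every split point inside the span
                for left, right in product(table[i][k], table[k][j]):
                    table[i][j] |= BINARY.get((left, right), set())
    return n > 0 and start in table[0][n]


if __name__ == "__main__":
    print(cky_recognize("the old dog the footsteps".split()))
```

Because both readings of "dog" sit in the same cell, the chart keeps the noun and verb analyses side by side rather than backtracking between them.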

Earley Parsing
  • Allows arbitrary CFGs
  • Fills a table in a single sweep over the input words
      • Table is length N+1; N is number of words
      • Table entries represent:
          • Completed constituents and their locations
          • In-progress constituents
          • Predicted constituents

States
  • The table entries are called states and are represented with dotted rules:
        S -> • VP             A VP is predicted
        NP -> Det • Nominal   An NP is in progress
        VP -> V NP •          A VP has been found

States/Locations
  • It would be nice to know where these things are in the input, so…
        S -> • VP [0,0]             A VP is predicted at the start of the sentence
        NP -> Det • Nominal [1,2]   An NP is in progress; the Det goes from 1 to 2
        VP -> V NP • [0,3]          A VP has been found starting at 0 and ending at 3
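One possible way to represent such a state in code is a small immutable record holding the rule, the dot position, and the span. This is only a sketch; the class and field names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class State:
    lhs: str                 # left-hand side of the rule
    rhs: Tuple[str, ...]     # right-hand side symbols
    dot: int                 # how many rhs symbols have been found so far
    start: int               # input position where the constituent begins
    end: int                 # input position the dot has reached

    def is_complete(self) -> bool:
        return self.dot == len(self.rhs)

    def next_symbol(self) -> Optional[str]:
        """Symbol immediately to the right of the dot, if any."""
        return None if self.is_complete() else self.rhs[self.dot]

    def __str__(self) -> str:
        symbols = list(self.rhs)
        symbols.insert(self.dot, "•")
        return f"{self.lhs} -> {' '.join(symbols)} [{self.start},{self.end}]"


if __name__ == "__main__":
    print(State("S", ("VP",), 0, 0, 0))
    print(State("NP", ("Det", "Nominal"), 1, 1, 2))
    print(State("VP", ("V", "NP"), 2, 0, 3))
```

Making the record frozen keeps states hashable, which makes it easy to avoid adding the same state to a chart entry twice.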

Graphically
[Figure not recovered.]

Earley
  • As with most dynamic programming approaches, the answer is found by looking in the table in the right place
  • In this case, there should be an S state in the final column that spans from 0 to N (the table's N+1 columns are numbered 0 to N) and is complete
  • If that's the case, you're done: S -> α • [0,N]

Earley Algorithm
  • March through chart left-to-right
  • At each step, apply 1 of 3 operators:
      • Predictor: create new states representing top-down expectations
      • Scanner: match word predictions (rule with word after dot) to words
      • Completer: when a state is complete, see what rules were looking for that completed constituent

Predictor
  • Given a state
      • With a non-terminal to right of dot (not a part-of-speech category)
      • Create a new state for each expansion of the non-terminal
      • Place these new states into the same chart entry as the generating state, beginning and ending where the generating state ends
      • So predictor looking at
            S -> • VP [0,0]
        results in
            VP -> • Verb [0,0]
            VP -> • Verb NP [0,0]

Scanner
  • Given a state
      • With a non-terminal to right of dot that is a part-of-speech category
      • If the next word in the input matches this POS
      • Create a new state with dot moved over the non-terminal
      • So scanner looking at VP -> • Verb NP [0,0]
      • If the next word, "book", can be a verb, add new state:
            VP -> Verb • NP [0,1]
      • Add this state to chart entry following current one
      • Note: Earley algorithm uses top-down input to disambiguate POS. Only POS predicted by some state can get added to chart.

Completer
  • Applied to a state when its dot has reached the right end of the rule
  • Parser has discovered a category over some span of input
  • Find and advance all previous states that were looking for this category
      • Copy state, move dot, insert in current chart entry
  • Given:
        NP -> Det Nominal • [1,3]
        VP -> Verb • NP [0,1]
  • Add:
        VP -> Verb NP • [0,3]

How do we know we are done?
  • Find an S state in the final column that spans from 0 to N (the table's N+1 columns are numbered 0 to N) and is complete
  • If that's the case, you're done: S -> α • [0,N]
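Putting the three operators and the termination test together, a recognizer might look like the sketch below. Several things here are assumptions for illustration: the toy grammar and lexicon (which include S -> VP so that "book that flight" is covered), the function and variable names, and the dummy GAMMA -> S start rule used to seed the chart. The scanner advances the dotted rule directly, as in the Scanner example above. The grammar has no empty rules.

```python
from collections import namedtuple

State = namedtuple("State", "lhs rhs dot start end")

GRAMMAR = {
    "S": [("NP", "VP"), ("VP",)],
    "NP": [("Det", "Nominal")],
    "Nominal": [("Noun",), ("Nominal", "PP")],   # left-recursive on purpose
    "VP": [("Verb",), ("Verb", "NP")],
    "PP": [("Prep", "NP")],
}
LEXICON = {"book": {"Noun", "Verb"}, "that": {"Det"},
           "flight": {"Noun"}, "from": {"Prep"}}
POS = {"Noun", "Verb", "Det", "Prep"}


def earley_recognize(words, start="S"):
    n = len(words)
    chart = [[] for _ in range(n + 1)]   # N+1 entries, numbered 0..N

    def add(state):
        if state not in chart[state.end]:
            chart[state.end].append(state)

    add(State("GAMMA", (start,), 0, 0, 0))
    for i in range(n + 1):
        for state in chart[i]:           # the entry may grow while we scan it
            if state.dot < len(state.rhs):
                nxt = state.rhs[state.dot]
                if nxt in POS:
                    # Scanner: move the dot over a POS that matches word i.
                    if i < n and nxt in LEXICON.get(words[i], set()):
                        add(state._replace(dot=state.dot + 1, end=i + 1))
                else:
                    # Predictor: expect every expansion of the non-terminal.
                    for rhs in GRAMMAR[nxt]:
                        add(State(nxt, rhs, 0, i, i))
            else:
                # Completer: advance states that were waiting for this category.
                for waiting in chart[state.start]:
                    if (waiting.dot < len(waiting.rhs)
                            and waiting.rhs[waiting.dot] == state.lhs):
                        add(waiting._replace(dot=waiting.dot + 1, end=i))
    # Done if the start symbol is complete over the whole input, [0, N].
    return State("GAMMA", (start,), 1, 0, n) in chart[n]


if __name__ == "__main__":
    print(earley_recognize("book that flight".split()))
```

The left-recursive rule Nominal -> Nominal PP causes no trouble here, because a prediction that is already in the chart entry is not added a second time.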

Page 14: Basic Parsing with Context-Free Grammars

14

Det N V Det N Prep Det NThe old dog the footsteps of the youngDet Adj N Det N Prep Det N

15

When does disambiguation occur

What are the computational advantages and disadvantages

Bottom-up parsing

16

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the input

Bottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)

For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

Whatrsquos rightwrong withhellip

17

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theory Early Parsing Algorithm

Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never added

Chart Parser

Some Solutions

18

Allows arbitrary CFGs Fills a table in a single sweep over the input

words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locations In-progress constituents Predicted constituents

Earley Parsing

19

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in

progressVP -gt V NP A VP has been found

States

20

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12]An NP is in progress the

Det goes from 1 to 2VP -gt V NP [03] A VP has been found

starting at 0 and ending at 3

StatesLocations

21

Graphically

22

As with most dynamic programming approaches the answer is found by looking in the table in the right place

In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

Earley

23

March through chart left-to-right At each step apply 1 of 3 operators

Predictor Create new states representing top-down

expectations Scanner

Match word predictions (rule with word after dot) to words

Completer When a state is complete see what rules were

looking for that completed constituent

Earley Algorithm

24

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends

So predictor looking at S -gt VP [00]

results in VP -gt Verb [00] VP -gt Verb NP [00]

Predictor

25

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new

state VP -gt Verb NP [01]

Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

Scanner

26

Applied to a state when its dot has reached right end of role

Parser has discovered a category over some span of input

Find and advance all previous states that were looking for this category copy state move dot insert in current chart entry

Given NP -gt Det Nominal [13] VP -gt Verb NP [01]

Add VP -gt Verb NP [03]

Completer

27

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

How do we know we are done

28

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

Earley

29

Book that flight We should findhellip an S from 0 to 3 that is a

completed statehellip

Example

CFG for Fragment of EnglishS NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

Example

32

Example

33

Example

34

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parser Thatrsquos how we solve (not) an exponential problem in

polynomial time

Details

35

With the addition of a few pointers we have a parser

Augment the ldquoCompleterrdquo to point to where we came from

Converting Earley from Recognizer to Parser

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9S8

37

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1] Follow the structural traces from the Completer Of course this wonrsquot be polynomial time since

there could be an exponential number of trees We can at least represent ambiguity efficiently

Retrieving Parse Trees from Chart
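
A hedged Python sketch of the recognizer-to-parser step: the same three operators, plus a `back` table recording, for each state, which earlier state and which child it was built from. The names `parse`, `back`, `GAMMA`, and the toy grammar and lexicon are all assumptions for illustration. The grammar is assumed to have no empty rules and no cycles such as A -> A.

```python
from collections import namedtuple

State = namedtuple("State", "lhs rhs dot start end")

# Toy grammar with both NP and VP attachment of PPs (an assumption).
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Pronoun",), ("Det", "Noun"), ("NP", "PP")],
    "VP": [("Verb", "NP"), ("VP", "PP")],
    "PP": [("Prep", "NP")],
}
LEXICON = {"I": {"Pronoun"}, "saw": {"Verb"}, "a": {"Det"}, "man": {"Noun"},
           "hill": {"Noun"}, "telescope": {"Noun"}, "on": {"Prep"}, "with": {"Prep"}}

def parse(words, grammar, lexicon, start="S"):
    chart = [[] for _ in range(len(words) + 1)]
    back = {}   # state -> list of (earlier state, child) ways of reaching it

    def add(state, origin=None):
        if state not in back:
            back[state] = []
            chart[state.end].append(state)
        if origin is not None and origin not in back[state]:
            back[state].append(origin)     # a second origin records an ambiguity

    add(State("GAMMA", (start,), 0, 0, 0))
    for i in range(len(words) + 1):
        j = 0
        while j < len(chart[i]):
            st = chart[i][j]
            j += 1
            if st.dot == len(st.rhs):                       # Completer + pointer
                for w in list(chart[st.start]):
                    if w.dot < len(w.rhs) and w.rhs[w.dot] == st.lhs:
                        add(w._replace(dot=w.dot + 1, end=i), (w, st))
            elif st.rhs[st.dot] in grammar:                 # Predictor
                for rhs in grammar[st.rhs[st.dot]]:
                    add(State(st.rhs[st.dot], rhs, 0, i, i))
            elif i < len(words) and st.rhs[st.dot] in lexicon.get(words[i], ()):
                add(st._replace(dot=st.dot + 1, end=i + 1), (st, words[i]))

    def children(state):
        """All ways of filling the symbols to the left of the dot."""
        if state.dot == 0:
            return [[]]
        result = []
        for prev, child in back[state]:
            if isinstance(child, str):                      # scanned word
                last = [(state.rhs[state.dot - 1], child)]
            else:                                           # completed constituent
                last = trees(child)
            result += [kids + [t] for kids in children(prev) for t in last]
        return result

    def trees(state):
        return [(state.lhs, *kids) for kids in children(state)]

    finals = [s for s in chart[len(words)]
              if s.lhs == start and s.dot == len(s.rhs) and s.start == 0]
    return [t for s in finals for t in trees(s)]

parses = parse("I saw a man with a telescope".split(), GRAMMAR, LEXICON)
print(len(parses))   # one tree per attachment of the PP
```

The chart itself stays polynomial in size because each state is stored once with a list of origins. Only the enumeration in `trees` can blow up.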

38

Depth-first search will never terminate if grammar is left recursive (e.g. NP --> NP PP)

Left Recursion vs Right Recursion


Solutions:

Rewrite the grammar (automatically?) to a weakly equivalent one which is not left-recursive

e.g. The man on the hill with the telescope…

   NP -> NP PP    (wanted: Nom plus a sequence of PPs)
   NP -> Nom PP
   NP -> Nom
   Nom -> Det N

…becomes…

   NP -> Nom NP'
   Nom -> Det N
   NP' -> PP NP'  (wanted: a sequence of PPs)
   NP' -> ε

Not so obvious what these rules mean…
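
As a sketch of how such a rewrite can be automated, the Python function below applies the textbook transformation for immediate left recursion (A -> A α | β becomes A -> β A', A' -> α A' | ε). The function name and the tuple encoding of rules are assumptions for illustration. On the NP rules above it produces a slightly more redundant grammar than the hand-simplified one on the slide, but one that generates the same strings.

```python
def remove_immediate_left_recursion(lhs, expansions):
    """Rewrite A -> A a | b as A -> b A', A' -> a A' | () for one non-terminal."""
    # Tails of the left-recursive expansions (the "a" parts).
    recursive = [rhs[1:] for rhs in expansions if rhs and rhs[0] == lhs]
    # Expansions that do not start with the non-terminal itself (the "b" parts).
    others = [rhs for rhs in expansions if not rhs or rhs[0] != lhs]
    if not recursive:
        return {lhs: list(expansions)}
    new = lhs + "'"
    return {
        lhs: [rhs + (new,) for rhs in others],
        new: [alpha + (new,) for alpha in recursive] + [()],   # () is the empty expansion
    }

np_rules = [("NP", "PP"), ("Nom", "PP"), ("Nom",)]
rewritten = remove_immediate_left_recursion("NP", np_rules)
for left, rules in rewritten.items():
    for rhs in rules:
        print(left, "->", " ".join(rhs) or "ε")
```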

40

Harder to detect and eliminate non-immediate left recursion

   NP --> Nom PP
   Nom --> NP

Fix depth of search explicitly

Rule ordering: non-recursive rules first

   NP --> Det Nom
   NP --> NP PP
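
Detecting non-immediate left recursion amounts to asking whether a non-terminal can reach itself through a chain of leftmost symbols. The Python sketch below does this with a simple closure over the "first symbol" relation. The function name and grammar encoding are assumptions, and the sketch ignores empty rules, which would need extra handling.

```python
def left_recursive_symbols(grammar):
    """Return the non-terminals that can derive a string starting with themselves."""
    # reach[A] = non-terminals that can appear leftmost in something derived from A
    reach = {a: {rhs[0] for rhs in rules if rhs and rhs[0] in grammar}
             for a, rules in grammar.items()}
    changed = True
    while changed:                       # transitive closure by repeated expansion
        changed = False
        for a in reach:
            for b in list(reach[a]):
                extra = reach[b] - reach[a]
                if extra:
                    reach[a] |= extra
                    changed = True
    return {a for a in reach if a in reach[a]}

grammar = {
    "NP": [("Nom", "PP")],
    "Nom": [("NP",), ("Det", "N")],
    "PP": [("Prep", "NP")],
}
print(sorted(left_recursive_symbols(grammar)))
```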

41

Multiple legal structures

   Attachment (e.g. I saw a man on a hill with a telescope)
   Coordination (e.g. younger cats and dogs)
   NP bracketing (e.g. Spanish language teachers)

Another Problem: Structural ambiguity

42

NP vs VP Attachment

43

Solution? Return all possible parses and disambiguate using "other methods"

44

Parsing is a search problem which may be implemented with many control strategies

   Top-Down or Bottom-Up approaches each have problems
   Combining the two solves some but not all issues
      Left recursion
      Syntactic ambiguity

Next time: Making use of statistical information about syntactic constituents

   Read Ch 14

Summing Up


15

When does disambiguation occur?

What are the computational advantages and disadvantages?

Bottom-up parsing

16

Top-Down parsers: they never explore illegal parses (e.g. which can't form an S) -- but waste time on trees that can never match the input

Bottom-Up parsers: they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)

For both: find a control strategy -- how to explore the search space efficiently?

   Pursuing all parses in parallel, or backtrack, or …?
   Which rule to apply next?
   Which node to expand next?

What's right/wrong with…

17

Dynamic Programming Approaches: use a chart to represent partial results

CKY Parsing Algorithm

   Bottom-up
   Grammar must be in Chomsky Normal Form
   The parse tree might not be consistent with linguistic theory

Earley Parsing Algorithm

   Top-down
   Expectations about constituents are confirmed by input
   A POS tag for a word that is not predicted is never added

Chart Parser

Some Solutions

18

Allows arbitrary CFGs

Fills a table in a single sweep over the input words

   Table is length N+1; N is number of words
   Table entries represent
      Completed constituents and their locations
      In-progress constituents
      Predicted constituents

Earley Parsing

19

The table entries are called states and are represented with dotted rules

   S -> • VP               A VP is predicted
   NP -> Det • Nominal     An NP is in progress
   VP -> V NP •            A VP has been found

States

20

It would be nice to know where these things are in the input, so…

   S -> • VP [0,0]             A VP is predicted at the start of the sentence
   NP -> Det • Nominal [1,2]   An NP is in progress; the Det goes from 1 to 2
   VP -> V NP • [0,3]          A VP has been found starting at 0 and ending at 3

States/Locations
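
One possible way to hold such a state in code, sketched in Python. The class and field names are assumptions made for illustration. The three objects at the end are the three states listed above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class State:
    lhs: str                 # left-hand side of the rule
    rhs: Tuple[str, ...]     # right-hand side symbols
    dot: int                 # how many rhs symbols have been recognised so far
    start: int               # input position where the constituent begins
    end: int                 # input position the dot has reached

    def is_complete(self) -> bool:
        return self.dot == len(self.rhs)

    def next_symbol(self) -> Optional[str]:
        """Symbol just right of the dot, or None for a complete state."""
        return None if self.is_complete() else self.rhs[self.dot]

    def __str__(self) -> str:
        symbols = list(self.rhs)
        symbols.insert(self.dot, "•")
        return f"{self.lhs} -> {' '.join(symbols)} [{self.start},{self.end}]"

predicted = State("S", ("VP",), 0, 0, 0)
in_progress = State("NP", ("Det", "Nominal"), 1, 1, 2)
found = State("VP", ("V", "NP"), 2, 0, 3)
for state in (predicted, in_progress, found):
    print(state)
```

Making the class frozen keeps states hashable, so a chart can cheaply check whether a state is already present.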

21

Graphically

22

As with most dynamic programming approaches, the answer is found by looking in the table in the right place

In this case, there should be an S state in the final column that spans the whole input, from 0 to N, and is complete

If that's the case, you're done: S -> α • [0,N]

Earley

23

March through the chart left-to-right

At each step, apply 1 of 3 operators

   Predictor: create new states representing top-down expectations
   Scanner: match word predictions (rule with word after dot) to words
   Completer: when a state is complete, see what rules were looking for that completed constituent

Earley Algorithm

24

Given a state

   With a non-terminal to right of dot (not a part-of-speech category)
   Create a new state for each expansion of the non-terminal
   Place these new states into the same chart entry as the generating state, beginning and ending where the generating state ends

So predictor looking at S -> • VP [0,0]

results in VP -> • Verb [0,0] and VP -> • Verb NP [0,0]

Predictor
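
A short Python sketch of the Predictor, reproducing the example above. The `State` tuple, the two-rule `GRAMMAR`, and the list-of-lists `chart` are assumptions for illustration.

```python
from collections import namedtuple

State = namedtuple("State", "lhs rhs dot start end")

# Just enough grammar for the slide's example (an assumption).
GRAMMAR = {"S": [("VP",)], "VP": [("Verb",), ("Verb", "NP")]}

def predictor(state, grammar, chart):
    """Add one fresh state per expansion of the non-terminal right of the dot."""
    wanted = state.rhs[state.dot]
    for rhs in grammar[wanted]:
        # New states begin and end where the generating state ends.
        new = State(wanted, rhs, 0, state.end, state.end)
        if new not in chart[state.end]:
            chart[state.end].append(new)

chart = [[State("S", ("VP",), 0, 0, 0)]]    # S -> • VP [0,0]
predictor(chart[0][0], GRAMMAR, chart)
for state in chart[0]:
    print(state)
```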

25

Given a state

   With a non-terminal to right of dot that is a part-of-speech category
   If the next word in the input matches this POS
   Create a new state with dot moved over the non-terminal

So scanner looking at VP -> • Verb NP [0,0]

   If the next word "book" can be a verb, add new state VP -> Verb • NP [0,1]
   Add this state to chart entry following current one

Note: Earley algorithm uses top-down input to disambiguate POS. Only POS predicted by some state can get added to chart.

Scanner
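
A Python sketch of the Scanner as described on this slide, where the dot moves directly over the part-of-speech symbol. The `State` tuple, the `LEXICON` mapping words to their possible POS tags, and the `chart` layout are assumptions for illustration.

```python
from collections import namedtuple

State = namedtuple("State", "lhs rhs dot start end")

# Hypothetical lexicon: each word maps to the POS tags it can take.
LEXICON = {"book": {"Verb", "Noun"}, "that": {"Det"}, "flight": {"Noun"}}

def scanner(state, words, lexicon, chart):
    """If the next word can have the predicted POS, move the dot over it."""
    pos = state.rhs[state.dot]
    if state.end < len(words) and pos in lexicon.get(words[state.end], set()):
        new = state._replace(dot=state.dot + 1, end=state.end + 1)
        # The new state goes into the chart entry following the current one.
        if new not in chart[state.end + 1]:
            chart[state.end + 1].append(new)

words = ["book", "that", "flight"]
chart = [[] for _ in range(len(words) + 1)]
vp = State("VP", ("Verb", "NP"), 0, 0, 0)          # VP -> • Verb NP [0,0]
np = State("NP", ("Det", "Nominal"), 0, 0, 0)      # NP -> • Det Nominal [0,0]
chart[0] += [vp, np]
scanner(vp, words, LEXICON, chart)   # "book" can be a Verb: state is advanced
scanner(np, words, LEXICON, chart)   # "book" cannot be a Det: nothing is added
print(chart[1])
```

Note that the Noun reading of "book" never enters the chart here, because no state predicted a Noun at position 0.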

Page 19: Basic Parsing with Context-Free Grammars

19

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in

progressVP -gt V NP A VP has been found

States

20

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12]An NP is in progress the

Det goes from 1 to 2VP -gt V NP [03] A VP has been found

starting at 0 and ending at 3

StatesLocations

21

Graphically

22

As with most dynamic programming approaches the answer is found by looking in the table in the right place

In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

Earley

23

March through the chart left to right. At each step, apply one of three operators:

Predictor: create new states representing top-down expectations.
Scanner: match word predictions (rules with a word after the dot) to words in the input.
Completer: when a state is complete, see what rules were looking for that completed constituent.

Earley Algorithm

24

Given a state with a non-terminal to the right of the dot (not a part-of-speech category):
Create a new state for each expansion of the non-terminal.
Place these new states into the same chart entry as the generating state, beginning and ending where the generating state ends.

So the predictor looking at
S -> • VP [0,0]
results in
VP -> • Verb [0,0]
VP -> • Verb NP [0,0]

Predictor
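The Predictor step can be sketched in a few lines. This is a minimal illustration, not the lecture's code: the `State` tuple, the two-rule `GRAMMAR`, and the `POS` set are assumptions chosen to reproduce the slide's example.

```python
from collections import namedtuple

# Hypothetical state layout: rule, dot position, and span.
State = namedtuple("State", "lhs rhs dot start end")

# Just enough grammar to reproduce the slide's example (an assumption).
GRAMMAR = {
    "S": [("VP",)],
    "VP": [("Verb",), ("Verb", "NP")],
}
# Part-of-speech categories are left to the Scanner, not the Predictor.
POS = {"Verb", "Det", "Noun"}


def predictor(state, grammar=GRAMMAR):
    """Return one new state per expansion of the non-terminal after the dot."""
    if state.dot == len(state.rhs):
        return []                      # complete state: nothing to predict
    symbol = state.rhs[state.dot]
    if symbol in POS or symbol not in grammar:
        return []                      # POS category: handled by the Scanner
    # New states begin and end where the generating state ends.
    return [State(symbol, rhs, 0, state.end, state.end)
            for rhs in grammar[symbol]]


new_states = predictor(State("S", ("VP",), 0, 0, 0))
```

Here `new_states` holds the two VP states from the slide, both spanning [0,0].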

25

Given a state with a non-terminal to the right of the dot that is a part-of-speech category:
If the next word in the input matches this POS, create a new state with the dot moved over the non-terminal.

So the scanner looking at
VP -> • Verb NP [0,0]
If the next word, "book", can be a verb, add the new state
VP -> Verb • NP [0,1]
Add this state to the chart entry following the current one.

Note: the Earley algorithm uses top-down input to disambiguate POS. Only a POS predicted by some state can get added to the chart.

Scanner
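A Scanner along these lines might look like the sketch below. The `State` tuple and the tiny `LEXICON` are assumptions for illustration; in particular, listing "book" as both Verb and Noun is a hypothetical entry used to show that only the predicted POS is considered.

```python
from collections import namedtuple

# Hypothetical state layout: rule, dot position, and span.
State = namedtuple("State", "lhs rhs dot start end")

# Hypothetical lexicon mapping each word to its possible POS categories.
LEXICON = {
    "book": {"Verb", "Noun"},
    "that": {"Det"},
    "flight": {"Noun"},
}


def scanner(state, words, lexicon=LEXICON):
    """Advance the dot over a POS category if the next word can have that POS.

    Returns the new state (which belongs in chart entry state.end + 1),
    or None when the word does not match.
    """
    if state.dot == len(state.rhs) or state.end >= len(words):
        return None
    pos = state.rhs[state.dot]
    word = words[state.end]
    if pos in lexicon.get(word, set()):
        return State(state.lhs, state.rhs, state.dot + 1,
                     state.start, state.end + 1)
    return None


words = ["book", "that", "flight"]
scanned = scanner(State("VP", ("Verb", "NP"), 0, 0, 0), words)
```

Because the scanner is driven by a state that already expects a Verb, the Noun reading of "book" never enters the chart unless some other state predicts a Noun at that position.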

26

Applied to a state when its dot has reached the right end of the rule.

The parser has discovered a category over some span of input.

Find and advance all previous states that were looking for this category: copy the state, move the dot, and insert it in the current chart entry.

Given
NP -> Det Nominal • [1,3]
VP -> Verb • NP [0,1]
add
VP -> Verb NP • [0,3]

Completer
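The Completer can be sketched as a lookup in the chart entry where the finished constituent began. The `State` tuple and the dictionary-based `chart` below are assumptions made for this illustration.

```python
from collections import namedtuple

# Hypothetical state layout: rule, dot position, and span.
State = namedtuple("State", "lhs rhs dot start end")


def completer(completed, chart):
    """Advance every state that was waiting for the completed category.

    `chart` maps a column index to the list of states ending there.
    The waiting states live in the column where `completed` starts.
    """
    advanced = []
    for waiting in chart.get(completed.start, []):
        wants_more = waiting.dot < len(waiting.rhs)
        if wants_more and waiting.rhs[waiting.dot] == completed.lhs:
            # Copy the state, move the dot, and extend the span.
            advanced.append(State(waiting.lhs, waiting.rhs, waiting.dot + 1,
                                  waiting.start, completed.end))
    return advanced


# The slide's example: a finished NP over [1,3] and a VP waiting for an NP.
chart = {1: [State("VP", ("Verb", "NP"), 1, 0, 1)]}
result = completer(State("NP", ("Det", "Nominal"), 2, 1, 3), chart)
```

The returned state is the slide's VP over [0,3]; in a full implementation it would be inserted into chart entry 3, the current column.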

27

Find an S state in the final column that spans from 0 to n and is complete.

If that's the case, you're done: S -> α • [0,n]

How do we know we are done?

28

More specifically…

1. Predict all the states you can up front.
2. Read a word.
   a. Extend states based on matches.
   b. Add new predictions.
   c. Go to step 2.
3. Look at the last chart entry (the (N+1)th, i.e. column N) to see if you have a winner.

Earley

29

Book that flight

We should find… an S from 0 to 3 that is a completed state…

Example
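Putting the three operators together gives a compact recognizer that can be run on "book that flight". This is a sketch under stated assumptions: the rules come from the lecture's grammar fragment with S -> VP added, the lexicon contains only the three words of the example, and the dummy start symbol `GAMMA` and all function names are choices made here.

```python
from collections import namedtuple

State = namedtuple("State", "lhs rhs dot start end")

# Phrase-structure rules (assumed subset of the lecture's fragment).
GRAMMAR = {
    "S": [("NP", "VP"), ("Aux", "NP", "VP"), ("VP",)],
    "NP": [("Det", "Nom"), ("PropN",)],
    "Nom": [("N",), ("N", "Nom"), ("Nom", "PP")],
    "VP": [("V",), ("V", "NP")],
    "PP": [("Prep", "NP")],
}
# Minimal lexicon for the example sentence (an assumption).
LEXICON = {"book": {"V"}, "that": {"Det"}, "flight": {"N"}}


def recognize(words):
    """Return True if the words form a sentence according to GRAMMAR."""
    n = len(words)
    chart = [[] for _ in range(n + 1)]   # n + 1 columns, numbered 0..n

    def add(state, col):
        if state not in chart[col]:      # no duplicates, so left recursion is safe
            chart[col].append(state)

    add(State("GAMMA", ("S",), 0, 0, 0), 0)
    for col in range(n + 1):
        i = 0
        while i < len(chart[col]):       # the column grows while we walk it
            st = chart[col][i]
            i += 1
            if st.dot == len(st.rhs):                      # Completer
                for w in list(chart[st.start]):
                    if w.dot < len(w.rhs) and w.rhs[w.dot] == st.lhs:
                        add(State(w.lhs, w.rhs, w.dot + 1, w.start, col), col)
            elif st.rhs[st.dot] in GRAMMAR:                # Predictor
                for rhs in GRAMMAR[st.rhs[st.dot]]:
                    add(State(st.rhs[st.dot], rhs, 0, col, col), col)
            elif col < n and st.rhs[st.dot] in LEXICON.get(words[col], ()):
                # Scanner: the next word can have the predicted POS.
                add(State(st.lhs, st.rhs, st.dot + 1, st.start, col + 1), col + 1)
    # Success: a complete S spanning the whole input in the last column.
    return any(s.lhs == "S" and s.dot == len(s.rhs) and s.start == 0
               for s in chart[n])


accepted = recognize("book that flight".split())
```

Because `add` refuses duplicates, the left-recursive rule Nom -> Nom PP is predicted once per column and the loop still terminates, which is one reason chart parsing avoids the left-recursion trap of naive top-down search.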

CFG for Fragment of English

S -> NP VP            VP -> V
S -> Aux NP VP        PP -> Prep NP
NP -> Det Nom         N -> old | dog | footsteps | young
NP -> PropN           V -> dog | include | prefer
Nom -> Adj Nom        Aux -> does
Nom -> N              Prep -> from | to | on | of
Nom -> N Nom          PropN -> Bush | McCain | Obama
Nom -> Nom PP         Det -> that | this | a | the
VP -> V NP            Adj -> old | green | red

31

Example

32

Example

33

Example

34

What kind of algorithms did we just describe? Not parsers: recognizers.

The presence of an S state with the right attributes in the right place indicates a successful recognition.

But no parse tree… no parser.

That's how we solve (not) an exponential problem in polynomial time.

Details

35

With the addition of a few pointers we have a parser.

Augment the "Completer" to point to where we came from.

Converting Earley from Recognizer to Parser
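A small sketch of the idea: each state carries a record of the constituents that advanced its dot. For brevity this version stores the finished subtrees directly in a `kids` field instead of pointers to other chart states, which is a simplification assumed here and not the slide's exact scheme; it also gives up the compact sharing of ambiguous analyses. The grammar and lexicon are a cut-down, assumed subset sufficient for "book that flight".

```python
from collections import namedtuple

# `kids` holds the subtrees found so far for the symbols left of the dot.
State = namedtuple("State", "lhs rhs dot start end kids")

GRAMMAR = {
    "S": [("VP",)],
    "VP": [("V", "NP")],
    "NP": [("Det", "Nom")],
    "Nom": [("N",)],
}
LEXICON = {"book": {"V"}, "that": {"Det"}, "flight": {"N"}}


def parse(words):
    """Return every S tree covering the whole input, as nested tuples."""
    n = len(words)
    chart = [[] for _ in range(n + 1)]

    def add(state, col):
        if state not in chart[col]:
            chart[col].append(state)

    add(State("GAMMA", ("S",), 0, 0, 0, ()), 0)
    for col in range(n + 1):
        i = 0
        while i < len(chart[col]):
            st = chart[col][i]
            i += 1
            if st.dot == len(st.rhs):                      # Completer
                tree = (st.lhs,) + st.kids
                for w in list(chart[st.start]):
                    if w.dot < len(w.rhs) and w.rhs[w.dot] == st.lhs:
                        # Record where the advance came from.
                        add(w._replace(dot=w.dot + 1, end=col,
                                       kids=w.kids + (tree,)), col)
            elif st.rhs[st.dot] in GRAMMAR:                # Predictor
                for rhs in GRAMMAR[st.rhs[st.dot]]:
                    add(State(st.rhs[st.dot], rhs, 0, col, col, ()), col)
            elif col < n and st.rhs[st.dot] in LEXICON.get(words[col], ()):
                leaf = (st.rhs[st.dot], words[col])        # Scanner
                add(st._replace(dot=st.dot + 1, end=col + 1,
                                kids=st.kids + (leaf,)), col + 1)
    return [s.kids[0] for s in chart[n]
            if s.lhs == "GAMMA" and s.dot == len(s.rhs)]


trees = parse("book that flight".split())
```

The single tree returned for the example nests V, Det and N leaves under VP, NP and Nom, so the recognizer's yes/no answer has become a structure that can be read off the final column.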

Augmenting the chart with structural information

[Figure: chart states with backpointers; state labels S8 through S13]

37

All the possible parses for an input are in the table.

We just need to read off all the backpointers from every complete S in the last column of the table:
Find all the S -> X • [0,N].
Follow the structural traces from the Completer.

Of course, this won't be polynomial time, since there could be an exponential number of trees.
We can at least represent ambiguity efficiently.

Retrieving Parse Trees from Chart
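To get a feel for how fast the number of trees can grow, consider a purely illustrative case: if a grammar allowed every binary bracketing of a sequence of n+1 constituents, the number of distinct trees would be the n-th Catalan number. The function below is a worked calculation under that assumption, not a property of the lecture's grammar.

```python
from math import comb


def catalan(n: int) -> int:
    """Number of distinct binary bracketings of n + 1 items."""
    return comb(2 * n, n) // (n + 1)


# Tree counts for sequences of 1 to 11 constituents.
counts = [catalan(n) for n in range(11)]
```

The counts start 1, 1, 2, 5, 14, 42 and reach 16796 by n = 10, which is why enumerating every tree is expensive even though the chart itself stays polynomial in size.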

38

A top-down, depth-first search will never terminate if the grammar is left recursive (e.g. NP -> NP PP).

Left Recursion vs Right Recursion

Solutions:

Rewrite the grammar (automatically) to a weakly equivalent one which is not left-recursive,
e.g. The man on the hill with the telescope…

NP -> NP PP      (wanted: Nom plus a sequence of PPs)
NP -> Nom PP
NP -> Nom
Nom -> Det N

…becomes…

NP -> Nom NP'
Nom -> Det N
NP' -> PP NP'    (wanted: a sequence of PPs)
NP' -> ε         (the empty string)

Not so obvious what these rules mean…
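The rewrite above follows the standard recipe for removing immediate left recursion: A -> A α | β becomes A -> β A' and A' -> α A' | ε. A minimal sketch of that recipe follows; the function name and the tuple-based rule format are assumptions made here. The input uses NP -> NP PP | Nom, since the extra rule NP -> Nom PP is already covered once NP' can produce a sequence of PPs.

```python
def eliminate_immediate_left_recursion(nt, productions):
    """Rewrite A -> A a | b as A -> b A', A' -> a A' | (empty).

    `productions` is a list of right-hand sides, each a tuple of symbols.
    The empty tuple () stands for the empty string.
    """
    new_nt = nt + "'"
    # Tails of the left-recursive rules (everything after the leading A).
    recursive = [rhs[1:] for rhs in productions if rhs and rhs[0] == nt]
    others = [rhs for rhs in productions if not rhs or rhs[0] != nt]
    if not recursive:
        return {nt: list(productions)}   # nothing to do
    return {
        nt: [rhs + (new_nt,) for rhs in others],
        new_nt: [tail + (new_nt,) for tail in recursive] + [()],
    }


rewritten = eliminate_immediate_left_recursion(
    "NP", [("NP", "PP"), ("Nom",)]
)
```

The result maps NP to Nom NP' and NP' to PP NP' or the empty string, matching the rewritten rules above. The new grammar accepts the same strings but assigns them different trees, which is what "weakly equivalent" means.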

40

Harder to detect and eliminate non-immediate left recursion:
NP -> Nom PP
Nom -> NP

Fix depth of search explicitly.

Rule ordering: non-recursive rules first:
NP -> Det Nom
NP -> NP PP
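Detecting non-immediate left recursion amounts to asking whether a non-terminal can reach itself by repeatedly following the first symbol of its rules. The sketch below does that with a simple graph search; the function name and grammar format are assumptions, and it assumes the grammar has no empty rules (otherwise a later symbol could also end up leftmost).

```python
def left_recursive_nonterminals(grammar):
    """Return the non-terminals that can derive themselves as leftmost symbol.

    `grammar` maps each non-terminal to a list of right-hand-side tuples.
    Assumes there are no empty right-hand sides.
    """
    # first[A] = non-terminals that can appear leftmost in one step from A.
    first = {
        a: {rhs[0] for rhs in rules if rhs and rhs[0] in grammar}
        for a, rules in grammar.items()
    }
    found = set()
    for a in grammar:
        seen, stack = set(), list(first[a])
        while stack:
            b = stack.pop()
            if b == a:
                found.add(a)         # a reaches itself: left recursive
                break
            if b not in seen:
                seen.add(b)
                stack.extend(first[b])
    return found


# The slide's indirect case: NP -> Nom PP, Nom -> NP.
grammar = {
    "NP": [("Nom", "PP")],
    "Nom": [("NP",), ("Det", "N")],
    "PP": [("Prep", "NP")],
}
cyclic = left_recursive_nonterminals(grammar)
```

For the slide's rules the search reports both NP and Nom, even though neither rule is left recursive on its own.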

41

Multiple legal structures:
Attachment (e.g. I saw a man on a hill with a telescope)
Coordination (e.g. younger cats and dogs)
NP bracketing (e.g. Spanish language teachers)

Another Problem: Structural ambiguity

42

NP vs VP Attachment

43

Solution: return all possible parses and disambiguate using "other methods".

44

Parsing is a search problem which may be implemented with many control strategies.
Top-Down or Bottom-Up approaches each have problems.
Combining the two solves some but not all issues:
Left recursion
Syntactic ambiguity

Next time: Making use of statistical information about syntactic constituents. Read Ch. 14.

Summing Up

Page 24: Basic Parsing with Context-Free Grammars

24

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends

So predictor looking at S -gt VP [00]

results in VP -gt Verb [00] VP -gt Verb NP [00]

Predictor

25

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new

state VP -gt Verb NP [01]

Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

Scanner

26

Applied to a state when its dot has reached right end of role

Parser has discovered a category over some span of input

Find and advance all previous states that were looking for this category copy state move dot insert in current chart entry

Given NP -gt Det Nominal [13] VP -gt Verb NP [01]

Add VP -gt Verb NP [03]

Completer

27

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

How do we know we are done

28

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

Earley

29

Book that flight We should findhellip an S from 0 to 3 that is a

completed statehellip

Example

CFG for Fragment of EnglishS NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

Example

32

Example

33

Example

34

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parser Thatrsquos how we solve (not) an exponential problem in

polynomial time

Details

35

With the addition of a few pointers we have a parser

Augment the ldquoCompleterrdquo to point to where we came from

Converting Earley from Recognizer to Parser

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9S8

Retrieving Parse Trees from Chart

All the possible parses for an input are in the table.

We just need to read off all the backpointers from every complete S in the last column of the table.

Find all the S -> X • [0,N] (N words give N+1 chart columns, numbered 0 to N).
Follow the structural traces from the Completer.
Of course, this won't be polynomial time, since there could be an exponential number of trees.
We can at least represent ambiguity efficiently.
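To get a feel for how quickly the number of trees can grow, here is a small Python sketch. It rests on a simplifying assumption that is not in the slides: every binary bracketing of the input is licensed by the grammar, so the count is the number of binary trees over n leaves (a Catalan number). The function name `bracketings` is hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def bracketings(n):
    """Number of distinct binary trees over n leaves."""
    if n <= 1:
        return 1
    # Split the n leaves into a left part of k leaves and a right part of n - k.
    return sum(bracketings(k) * bracketings(n - k) for k in range(1, n))


for n in (2, 3, 4, 5, 10):
    print(n, bracketings(n))
```

Under this assumption, 5 leaves already allow 14 trees and 10 leaves allow 4862, while the chart that encodes all of them stays polynomial in size.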

Left Recursion vs Right Recursion

Depth-first search will never terminate if the grammar is left-recursive (e.g., NP -> NP PP).

Solutions:

Rewrite the grammar (automatically) to a weakly equivalent one which is not left-recursive.

e.g. The man on the hill with the telescope...

NP -> NP PP (wanted: Nom plus a sequence of PPs)
NP -> Nom PP
NP -> Nom
Nom -> Det N

...becomes...

NP -> Nom NP'
Nom -> Det N
NP' -> PP NP' (wanted: a sequence of PPs)
NP' -> ε

Not so obvious what these rules mean...
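The rewrite can be mechanized. The sketch below applies the standard textbook transformation for immediate left recursion (A -> A a | b becomes A -> b A', A' -> a A' | ε); the function name and the list-of-lists rule encoding are assumptions made for illustration. The input leaves out NP -> Nom PP, since that string is already derivable from NP -> NP PP and NP -> Nom.

```python
def remove_immediate_left_recursion(lhs, alternatives):
    """Rewrite A -> A a | b as A -> b A', A' -> a A' | ε."""
    # Tails of the left-recursive alternatives (the "a" parts).
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == lhs]
    # Alternatives that do not start with the left-hand side (the "b" parts).
    others = [alt for alt in alternatives if not alt or alt[0] != lhs]
    if not recursive:
        return {lhs: alternatives}
    new = lhs + "'"
    return {
        lhs: [alt + [new] for alt in others],
        new: [tail + [new] for tail in recursive] + [[]],  # [] stands for ε
    }


rules = remove_immediate_left_recursion("NP", [["NP", "PP"], ["Nom"]])
for left, alts in rules.items():
    for alt in alts:
        print(left, "->", " ".join(alt) if alt else "ε")
```

The new grammar accepts the same strings, but the trees it assigns are right-branching chains of NP' nodes, which is why the rewritten rules are harder to interpret linguistically.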

Harder to detect and eliminate non-immediate left recursion:

NP -> Nom PP
Nom -> NP

Fix depth of search explicitly.

Rule ordering: non-recursive rules first.

NP -> Det Nom
NP -> NP PP
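Detecting non-immediate left recursion amounts to a reachability check over "can appear leftmost" links. The following Python sketch is one way to do it, assuming a grammar without ε-rules; the function name is hypothetical, and the rules NP -> Det Nom, Nom -> N and PP -> Prep NP are added only to make the toy grammar complete.

```python
def left_recursive_symbols(grammar):
    """Return the nonterminals A that can derive a string starting with A."""
    # first_step[A]: nonterminals that can be leftmost after one rewrite of A.
    first_step = {a: {alt[0] for alt in alts if alt and alt[0] in grammar}
                  for a, alts in grammar.items()}
    result = set()
    for a in grammar:
        seen, stack = set(), list(first_step[a])
        while stack:                      # depth-first walk over leftmost links
            b = stack.pop()
            if b == a:                    # we came back to where we started
                result.add(a)
                break
            if b not in seen:
                seen.add(b)
                stack.extend(first_step[b])
    return result


grammar = {
    "NP": [["Nom", "PP"], ["Det", "Nom"]],
    "Nom": [["NP"], ["N"]],
    "PP": [["Prep", "NP"]],
}
print(sorted(left_recursive_symbols(grammar)))
```

Neither NP nor Nom has a rule that starts with its own name, yet both are reported, because NP reaches Nom in leftmost position and Nom reaches NP.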

Another Problem: Structural ambiguity

Multiple legal structures:
Attachment (e.g., I saw a man on a hill with a telescope)
Coordination (e.g., younger cats and dogs)
NP bracketing (e.g., Spanish language teachers)

NP vs VP Attachment

Solution: Return all possible parses and disambiguate using "other methods".

Summing Up

Parsing is a search problem which may be implemented with many control strategies.
Top-Down or Bottom-Up approaches each have problems.
Combining the two solves some but not all issues:
Left recursion
Syntactic ambiguity

Next time: Making use of statistical information about syntactic constituents.
Read Ch 14.

Page 25: Basic Parsing with Context-Free Grammars

Scanner

Given a state
With a non-terminal to the right of the dot that is a part-of-speech category
If the next word in the input matches this POS
Create a new state with the dot moved over the non-terminal

So the scanner is looking at VP -> • Verb NP [0,0]
If the next word, "book", can be a verb, add the new state VP -> Verb • NP [0,1]
Add this state to the chart entry following the current one

Note: the Earley algorithm uses top-down input to disambiguate POS. Only a POS predicted by some state can get added to the chart.
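The scanner step can be sketched in a few lines of Python. This is a minimal illustration of the formulation on this slide (the dot moves over the POS symbol in the rule itself), not a reference implementation; the `State` class, the `LEXICON` table and the function signature are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    lhs: str     # left-hand side of the rule, e.g. "VP"
    rhs: tuple   # right-hand side symbols, e.g. ("Verb", "NP")
    dot: int     # number of rhs symbols already recognized
    start: int   # input position where the constituent begins
    end: int     # input position the dot has reached

    def next_symbol(self):
        return self.rhs[self.dot] if self.dot < len(self.rhs) else None


# Hypothetical toy lexicon: word -> possible part-of-speech tags.
LEXICON = {"book": {"Verb", "Noun"}, "that": {"Det"}, "flight": {"Noun"}}


def scanner(state, words, chart):
    """If the next word can be the POS after the dot, advance the dot."""
    pos = state.next_symbol()
    if state.end < len(words) and pos in LEXICON.get(words[state.end], set()):
        advanced = State(state.lhs, state.rhs, state.dot + 1,
                         state.start, state.end + 1)
        if advanced not in chart[state.end + 1]:
            chart[state.end + 1].append(advanced)   # goes in the next chart entry


words = ["book", "that", "flight"]
chart = [[] for _ in range(len(words) + 1)]
scanner(State("VP", ("Verb", "NP"), 0, 0, 0), words, chart)
print(chart[1])
```

Although "book" is also listed as a noun, no Noun reading enters the chart here, because no state at position 0 predicted a Noun. That is the top-down filtering the slide points out.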

Page 26: Basic Parsing with Context-Free Grammars

Completer

Applied to a state when its dot has reached the right end of the rule.

The parser has discovered a category over some span of input.

Find and advance all previous states that were looking for this category: copy the state, move the dot, insert in the current chart entry.

Given:
NP -> Det Nominal • [1,3]
VP -> Verb • NP [0,1]

Add:
VP -> Verb NP • [0,3]
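The same example can be replayed in code. The Python sketch below is illustrative only: the `State` class and the list-of-lists `chart` are assumptions, and the chart is filled by hand with just the two states from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    lhs: str
    rhs: tuple
    dot: int
    start: int
    end: int

    def is_complete(self):
        return self.dot == len(self.rhs)

    def next_symbol(self):
        return None if self.is_complete() else self.rhs[self.dot]


def completer(done, chart):
    """Advance every earlier state that was waiting for done.lhs."""
    assert done.is_complete()
    # States waiting for this category live in the column where it started.
    for waiting in list(chart[done.start]):
        if waiting.next_symbol() == done.lhs:
            advanced = State(waiting.lhs, waiting.rhs, waiting.dot + 1,
                             waiting.start, done.end)
            if advanced not in chart[done.end]:
                chart[done.end].append(advanced)


chart = [[] for _ in range(4)]
chart[1].append(State("VP", ("Verb", "NP"), 1, 0, 1))   # VP -> Verb • NP [0,1]
np_done = State("NP", ("Det", "Nominal"), 2, 1, 3)      # NP -> Det Nominal • [1,3]
chart[3].append(np_done)
completer(np_done, chart)
print(chart[3][-1])
```

The new state keeps the start position of the waiting state (0) and takes the end position of the completed one (3), which is how the span [0,3] on the slide arises.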

Page 27: Basic Parsing with Context-Free Grammars

How do we know we are done?

Find an S state in the final column that spans from 0 to n and is complete (n words give n+1 chart columns, numbered 0 to n).

If that's the case, you're done: S -> α • [0,n]
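As a small sketch, the acceptance test is a single scan of the last chart column. The encoding of a state as a plain tuple `(lhs, rhs, dot, start, end)` and the function name `is_recognized` are assumptions made for illustration.

```python
def is_recognized(chart, start_symbol="S"):
    """True if the last column holds a complete start-symbol state from 0."""
    n = len(chart) - 1   # n words -> n + 1 chart columns
    return any(lhs == start_symbol and dot == len(rhs) and start == 0 and end == n
               for (lhs, rhs, dot, start, end) in chart[n])


# A hand-filled final column for a three-word input.
chart = [[], [], [], [("VP", ("Verb", "NP"), 2, 0, 3), ("S", ("VP",), 1, 0, 3)]]
print(is_recognized(chart))
```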

Page 28: Basic Parsing with Context-Free Grammars

Earley

More specifically...

1. Predict all the states you can upfront
2. Read a word
   1. Extend states based on matches
   2. Add new predictions
   3. Go to 2
3. Look at the last (N+1th) chart entry to see if you have a winner
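Putting the three operations together gives a compact recognizer. The Python sketch below is one possible arrangement under stated assumptions: the toy `GRAMMAR`, `LEXICON` and `POS` tables are invented for illustration, the scanner moves the dot over the POS symbol as on the Scanner slide, and ε-rules are not handled.

```python
from collections import namedtuple

State = namedtuple("State", "lhs rhs dot start end")

# Toy grammar and lexicon (assumptions for illustration).
GRAMMAR = {
    "S": [("VP",), ("NP", "VP")],
    "VP": [("Verb", "NP"), ("Verb",)],
    "NP": [("Det", "Nominal")],
    "Nominal": [("Noun",)],
}
LEXICON = {"book": {"Verb", "Noun"}, "that": {"Det"}, "flight": {"Noun"}}
POS = {"Verb", "Noun", "Det"}


def add(state, column):
    if state not in column:
        column.append(state)


def earley_recognize(words, start="S"):
    n = len(words)
    chart = [[] for _ in range(n + 1)]
    for rhs in GRAMMAR[start]:
        add(State(start, rhs, 0, 0, 0), chart[0])
    for i in range(n + 1):
        j = 0
        while j < len(chart[i]):              # the column grows while we walk it
            st = chart[i][j]
            j += 1
            if st.dot == len(st.rhs):         # Completer
                for w in list(chart[st.start]):
                    if w.dot < len(w.rhs) and w.rhs[w.dot] == st.lhs:
                        add(State(w.lhs, w.rhs, w.dot + 1, w.start, i), chart[i])
            elif st.rhs[st.dot] in POS:       # Scanner
                if i < n and st.rhs[st.dot] in LEXICON.get(words[i], ()):
                    add(State(st.lhs, st.rhs, st.dot + 1, st.start, i + 1),
                        chart[i + 1])
            else:                             # Predictor
                for rhs in GRAMMAR[st.rhs[st.dot]]:
                    add(State(st.rhs[st.dot], rhs, 0, i, i), chart[i])
    # Done if a complete S spanning the whole input sits in the last column.
    return any(s.lhs == start and s.dot == len(s.rhs) and s.start == 0
               for s in chart[n])


print(earley_recognize("book that flight".split()))
print(earley_recognize("that flight".split()))
```

This returns only a yes/no answer, so it is a recognizer in the sense of the Details slide; no tree is built.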

Page 29: Basic Parsing with Context-Free Grammars

Example

Book that flight

We should find... an S from 0 to 3 that is a completed state...

Page 30: Basic Parsing with Context-Free Grammars

CFG for Fragment of English

S -> NP VP              VP -> V
S -> Aux NP VP          PP -> Prep NP
NP -> Det Nom           N -> old | dog | footsteps | young
NP -> PropN             V -> dog | include | prefer
Nom -> Adj Nom          Aux -> does
Nom -> N                Prep -> from | to | on | of
Nom -> N Nom            PropN -> Bush | McCain | Obama
Nom -> Nom PP           Det -> that | this | a | the
VP -> V NP              Adj -> old | green | red
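For experimenting, the fragment can be written down as plain data. The Python sketch below transcribes the rules above; the dictionary layout and the helper name `tags_for` are assumptions chosen for illustration.

```python
# Phrase-structure rules: left-hand side -> list of right-hand sides.
RULES = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"]],
    "NP": [["Det", "Nom"], ["PropN"]],
    "Nom": [["Adj", "Nom"], ["N"], ["N", "Nom"], ["Nom", "PP"]],
    "VP": [["V"], ["V", "NP"]],
    "PP": [["Prep", "NP"]],
}

# Lexical rules: part of speech -> words.
LEXICON = {
    "N": ["old", "dog", "footsteps", "young"],
    "V": ["dog", "include", "prefer"],
    "Aux": ["does"],
    "Prep": ["from", "to", "on", "of"],
    "PropN": ["Bush", "McCain", "Obama"],
    "Det": ["that", "this", "a", "the"],
    "Adj": ["old", "green", "red"],
}


def tags_for(word):
    """Return every part of speech the lexicon allows for a word."""
    return sorted(pos for pos, words in LEXICON.items() if word in words)


# Categories with an immediately left-recursive rule.
left_recursive = [lhs for lhs, alts in RULES.items()
                  if any(alt[0] == lhs for alt in alts)]

print(tags_for("dog"))
print(tags_for("old"))
print(left_recursive)
```

The lookups show where this grammar is ambiguous at the word level ("dog" is N or V, "old" is Adj or N), and the last line flags Nom -> Nom PP as the rule a naive depth-first top-down parser would loop on.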

Page 31: Basic Parsing with Context-Free Grammars

Example

Page 32: Basic Parsing with Context-Free Grammars

Example

33

Example

34

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parser Thatrsquos how we solve (not) an exponential problem in

polynomial time

Details

35

With the addition of a few pointers we have a parser

Augment the ldquoCompleterrdquo to point to where we came from

Converting Earley from Recognizer to Parser

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9S8

37

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1] Follow the structural traces from the Completer Of course this wonrsquot be polynomial time since

there could be an exponential number of trees We can at least represent ambiguity efficiently

Retrieving Parse Trees from Chart

38

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

Left Recursion vs Right Recursion

)(

Solutions Rewrite the grammar (automatically) to a

weakly equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e Not so obvious what these rules meanhellip

40

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PP Nom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules first NP --gt Det Nom NP --gt NP PP

41

Multiple legal structures

Attachment (e.g. I saw a man on a hill with a telescope)

Coordination (e.g. younger cats and dogs)

NP bracketing (e.g. Spanish language teachers)

Another Problem: Structural Ambiguity

42

NP vs VP Attachment

43

Solution: Return all possible parses and disambiguate using "other methods"

44

Parsing is a search problem which may be implemented with many control strategies

Top-Down or Bottom-Up approaches each have problems

Combining the two solves some but not all issues

Left recursion

Syntactic ambiguity

Next time: Making use of statistical information about syntactic constituents

Read Ch 14

Summing Up
