SP07 cs294 lecture 12 -- phrase decoding.ppt [Read-Only]klein/cs294-7/SP07 cs294 lecture 12 --...

Date post: 14-Nov-2020

Statistical NLP
Spring 2007

Lecture 12: Phrase Decoding
Dan Klein – UC Berkeley

Overview: Extracting Phrases

Sentence-aligned corpus

cat ||| chat ||| 0.9
the cat ||| le chat ||| 0.8
dog ||| chien ||| 0.8
house ||| maison ||| 0.6
my house ||| ma maison ||| 0.9
language ||| langue ||| 0.9
…

Phrase table (translation model)

Intersected and grown word alignments

Directional word alignments
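
A minimal sketch of how the phrase table entries on this slide could be loaded, assuming the three fields are source phrase, target phrase, and score, separated by `|||`. The function name `parse_phrase_table` and the dictionary layout are illustrative choices, not part of the lecture.

```python
from typing import Dict, List, Tuple


def parse_phrase_table(lines: List[str]) -> Dict[str, List[Tuple[str, float]]]:
    """Map each source phrase to its list of (target phrase, score) options."""
    table: Dict[str, List[Tuple[str, float]]] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumed field order: source ||| target ||| score
        source, target, score = (field.strip() for field in line.split("|||"))
        table.setdefault(source, []).append((target, float(score)))
    return table


# The six entries shown on the slide.
PHRASE_TABLE_LINES = [
    "cat ||| chat ||| 0.9",
    "the cat ||| le chat ||| 0.8",
    "dog ||| chien ||| 0.8",
    "house ||| maison ||| 0.6",
    "my house ||| ma maison ||| 0.9",
    "language ||| langue ||| 0.9",
]

phrase_table = parse_phrase_table(PHRASE_TABLE_LINES)
print(phrase_table["the cat"])
```

Keying by source phrase keeps all translation options for one phrase together, which is the lookup a decoder needs when it expands a hypothesis.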

Pharaoh’s Model

[Koehn et al, 2003]

Segmentation Translation Distortion

Pharaoh’s Model

Where do we get these counts?
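
One common answer, following the relative-frequency estimate described in Koehn et al. (2003), is to count the phrase pairs extracted from the word-aligned corpus and normalize per English phrase. The sketch below assumes that estimate; the toy list of extracted pairs is made up for illustration.

```python
from collections import Counter, defaultdict


def relative_frequency(pairs):
    """Estimate phi(f | e) = count(f, e) / sum over f' of count(f', e).

    `pairs` is a list of (foreign_phrase, english_phrase) tuples, one per
    extracted phrase-pair occurrence.
    """
    pair_counts = Counter(pairs)
    english_totals = Counter(e for _, e in pairs)
    phi = defaultdict(dict)
    for (f, e), count in pair_counts.items():
        phi[e][f] = count / english_totals[e]
    return dict(phi)


# Hypothetical extraction output: "cat" was paired with "chat" three times
# and with "le chat" once.
extracted = [
    ("chat", "cat"),
    ("chat", "cat"),
    ("chat", "cat"),
    ("le chat", "cat"),
    ("maison", "house"),
]

phi = relative_frequency(extracted)
print(phi["cat"])
```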

Phrase-Based Decoding

这 7人 中包括 来自 法国 和 俄罗斯 的 宇航 员 .
(Gloss: These 7 people include astronauts coming from France and Russia.)

Decoder design is important: [Koehn et al. 03]

Phrase Scoring

les chats aiment le poisson frais .

cats like fresh fish .

Learning weights has been tried, several times:

[Marcu and Wong, 02] [DeNero et al, 06] … and others

Seems not to work, for a variety of only partially understood reasons

Main issue: big chunks get all the weight, obvious priors don’t help

Extracting Phrases
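
The slide's figure is not recoverable, so the following is only a sketch of the usual consistency criterion for phrase extraction: a source span and a target span form a phrase pair when no alignment link leaves the box they define. It is a simplified version (it does not extend phrases over unaligned words), and the function name, the `max_len` limit, and the toy alignment are assumptions.

```python
def extract_phrases(src, tgt, alignment, max_len=3):
    """Return (source phrase, target phrase) pairs consistent with the alignment.

    `alignment` is a collection of (src_index, tgt_index) links.
    """
    pairs = []
    for s_start in range(len(src)):
        for s_end in range(s_start, min(s_start + max_len, len(src))):
            # Target positions linked to any word in the source span.
            linked = [t for (s, t) in alignment if s_start <= s <= s_end]
            if not linked:
                continue
            t_start, t_end = min(linked), max(linked)
            if t_end - t_start + 1 > max_len:
                continue
            # Consistency: every link into the target span must come from
            # inside the source span.
            consistent = all(
                s_start <= s <= s_end
                for (s, t) in alignment
                if t_start <= t <= t_end
            )
            if consistent:
                pairs.append(
                    (" ".join(src[s_start:s_end + 1]),
                     " ".join(tgt[t_start:t_end + 1]))
                )
    return pairs


# Toy example with a reordering: "poisson frais" vs. "fresh fish".
pairs = extract_phrases(
    ["poisson", "frais"],
    ["fresh", "fish"],
    [(0, 1), (1, 0)],
)
print(pairs)
```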

Phrase Size

Phrases do help

But they don’t need to be long

Why should this be?

Bidirectional Alignment

Alignment Heuristics
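
The two directional alignments are typically combined by intersecting them and then growing the result toward their union. The slides do not say which growing heuristic is meant, so the sketch below uses a simplified rule as an assumption: add a union link that neighbors an existing link if its source word or its target word is still unaligned.

```python
# Horizontal, vertical, and diagonal neighbors of an alignment point.
NEIGHBORS = [(-1, 0), (1, 0), (0, -1), (0, 1),
             (-1, -1), (-1, 1), (1, -1), (1, 1)]


def symmetrize(src_to_tgt, tgt_to_src):
    """Both inputs are sets of (src_index, tgt_index) links."""
    return src_to_tgt & tgt_to_src, src_to_tgt | tgt_to_src


def grow(intersection, union):
    """Grow the intersection with neighboring union links (simplified rule)."""
    alignment = set(intersection)
    added = True
    while added:
        added = False
        for (s, t) in sorted(alignment):  # sorted() copies, so adding is safe
            for ds, dt in NEIGHBORS:
                cand = (s + ds, t + dt)
                if cand not in union or cand in alignment:
                    continue
                src_free = all(a != cand[0] for a, _ in alignment)
                tgt_free = all(b != cand[1] for _, b in alignment)
                if src_free or tgt_free:
                    alignment.add(cand)
                    added = True
    return alignment


# Hypothetical directional alignments that disagree on two links.
e2f = {(0, 0), (1, 1), (2, 2)}
f2e = {(0, 0), (1, 1), (0, 3)}

intersection, union = symmetrize(e2f, f2e)
grown = grow(intersection, union)
print(sorted(grown))
```

In this toy case the link (2, 2) is adopted because it sits next to (1, 1), while the isolated link (0, 3) is left out.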

Sources of Alignments

Lexical Weighting
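
The slide's formula did not survive extraction. As a sketch, assuming the lexical weighting of Koehn et al. (2003), each foreign word contributes the average word translation probability over the English words it is aligned to, with unaligned foreign words paired with a NULL token. The probability values below are hypothetical.

```python
def lexical_weight(f_words, e_words, alignment, w):
    """Product over foreign words of the mean w(f | e) over aligned English words.

    `alignment` holds (foreign_index, english_index) links and `w` maps
    (foreign_word, english_word) to a word translation probability.
    """
    score = 1.0
    for i, f in enumerate(f_words):
        linked = [e_words[j] for (fi, j) in alignment if fi == i]
        if not linked:
            linked = ["NULL"]  # unaligned foreign words align to NULL
        score *= sum(w.get((f, e), 0.0) for e in linked) / len(linked)
    return score


# Hypothetical word translation probabilities.
w = {("le", "NULL"): 0.5, ("chat", "cat"): 0.5}

score = lexical_weight(["le", "chat"], ["cat"], [(1, 0)], w)
print(score)
```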

The Pharaoh Decoder

Probabilities at each step include LM and TM

Hypothesis Lattices

Pruning

Problem: easy partial analyses are cheaper
Solution 1: use beams per foreign subset
Solution 2: estimate forward costs (A*-like)

WSD?

Remember when we discussed WSD?

Word-based MT systems rarely have a WSD step
Why not?
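
A small sketch of Solution 2, under the assumption that each hypothesis carries a log-probability and the set of covered foreign positions. The per-word cost table is a hypothetical stand-in; a real decoder would estimate future costs over spans from the phrase table and language model.

```python
import heapq

# Hypothetical log-probability estimates for translating each foreign word.
WORD_COST = [-0.5, -1.0, -3.0]


def future_cost(coverage):
    """Estimated log-probability of translating the words not yet covered."""
    return sum(c for i, c in enumerate(WORD_COST) if i not in coverage)


def prune(hypotheses, beam_size, estimate=future_cost):
    """Keep the beam_size hypotheses with the best score plus estimate.

    Each hypothesis is a (label, log_prob, coverage) tuple.
    """
    return heapq.nlargest(
        beam_size, hypotheses, key=lambda h: h[1] + estimate(h[2])
    )


hypotheses = [
    ("easy word first", -0.5, frozenset({0})),
    ("hard word first", -2.5, frozenset({2})),
]

# Without an estimate, the hypothesis that translated the easy word wins.
without_estimate = prune(hypotheses, 1, estimate=lambda coverage: 0.0)
# With the estimate, the cost of the remaining hard word is accounted for.
with_estimate = prune(hypotheses, 1)
print(without_estimate[0][0], "|", with_estimate[0][0])
```

Solution 1 would instead keep a separate beam for each set (or count) of covered foreign words, so that hypotheses are only compared with others that have done the same amount of work.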

What’s Next?

Modeling syntax

PCFGs and phrase structure
Syntactic parsing
Grammar induction
Syntactic language and translation models

Phrase Structure Parsing

Phrase structure parsing organizes syntax into constituents or brackets
In general, this involves nested trees
Linguists can, and do, argue about details
Lots of ambiguity

Not the only kind of syntax…

new art critics write reviews with computers

[Parse tree figure; node labels: S, VP, NP, N’, PP]

Constituency Tests

How do we know what nodes go in the tree?

Classic constituency tests:

Substitution by proform

Question answers

Semantic reference

Dislocation

Cross-linguistic arguments, too

Conflicting Tests

Constituency isn’t always clear

Units of transfer:
think about ~ penser à
talk about ~ hablar de

Phonological reduction:
I will go → I’ll go
I want to go → I wanna go
à le centre → au centre

La vélocité des ondes sismiques

Non-Local Phenomena

Dislocation / gapping
Why did the postman think that the neighbors were home?
A debate arose which continued until the election.

Binding
Reference
The IRS audits itself
Control
I want to go
I want you to go

Regularity of Rules

Argumentation
Adjunction
Coordination
X’ Theory

PP Attachment

PP Attachment

Attachment is a Simplification

I cleaned the dishes from dinner

I cleaned the dishes with detergent

I cleaned the dishes in the sink

Syntactic Ambiguities I

Prepositional phrases:
They cooked the beans in the pot on the stove with handles.

Particle vs. preposition:
A good pharmacist dispenses with accuracy.
The puppy tore up the staircase.

Complement structures:
The tourists objected to the guide that they couldn’t hear.
She knows you like the back of her hand.

Gerund vs. participial adjective:
Visiting relatives can be boring.
Changing schedules frequently confused passengers.
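
Ambiguities like these multiply as sentences get longer. As a rough illustration only (the counting model is an assumption of this sketch, not a claim from the slides), the number of distinct binary bracketings of a sequence of n items follows the Catalan numbers, which can be computed by summing over every possible split point:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_binary_bracketings(n):
    """Number of distinct binary trees over a sequence of n leaves."""
    if n <= 1:
        return 1
    # Choose where the top-level split falls, then bracket each side.
    return sum(
        count_binary_bracketings(k) * count_binary_bracketings(n - k)
        for k in range(1, n)
    )


counts = [count_binary_bracketings(n) for n in range(1, 7)]
print(counts)
```

Real grammars rule out many of these bracketings, but the growth rate suggests why a parser cannot simply enumerate every analysis.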

Syntactic Ambiguities II

Modifier scope within NPs:
impractical design requirements
plastic cup holder

Multiple gap constructions:
The chicken is ready to eat.
The contractors are rich enough to sue.

Coordination scope:
Small rats and mice can squeeze into holes or cracks in the wall.

Treebank Sentences

Human Processing

Garden pathing:

Ambiguity maintenance

