Date post: | 25-Jul-2015 |
Category: | Technology |
Upload: | alopezfoo |
Phrase-Based Translation
The IBM Models
•Fertility probabilities.
•Word translation probabilities.
•Distortion probabilities.
•Some problems:
•Weak reordering model -- output is not fluent.
•Many decisions -- many things can go wrong.
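IBM Model 4
Source: 虽然 北 风 呼啸 , 但 天空 依然 十分 清澈 。
Gloss: Although north wind howls , but sky still very clear .
Translation: However , the sky remained clear under the strong north wind .
[Figure: IBM Model 4 derivation of the translation. Intermediate source string shown: 虽然 北 风 呼啸 , 天空 天空 依然 清澈 。 English words shown: However, north, wind, strong, ",", the, sky, remained, clear, ".", under, the (their layout in the figure is not recoverable).]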
Tradeoffs: Modeling v. Learning
Model | Lexical translation | Local ordering dependency | Fertility | Convex | Tractable exact inference
IBM Model 1 | ✔ | ✘ | ✘ | ✔ | ✔
HMM | ✔ | ✔ | ✘ | ✘ | ✔
IBM Model 4 | ✔ | ✔ | ✔ | ✘ | ✘
Lesson: Trade exactness for expressivity
What are some things this model doesn’t account for?
Phrase-based Models
Source: 虽然 北 风 呼啸 , 但 天空 依然 十分 清澈 。
Gloss: Although north wind howls , but sky still very clear .
Phrase translations in source order: However the strong north wind , the sky remained clear under .
After reordering: However , the sky remained clear under the strong north wind .
p(English, alignment | Chinese) = p(segmentation) · p(translations) · p(reorderings)
distortion = 6
Phrase-based Models
•Segmentation probabilities.
•Phrase translation probabilities.
•Distortion probabilities.
•Some problems:
•Weak reordering model -- output is not fluent.
•Many decisions -- many things can go wrong.
Phrase-based Models
•Segmentation probabilities: fixed (uniform)
•Phrase translation probabilities.
•Distortion probabilities: fixed (decaying)
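The three factors above can be sketched as a scoring function for one derivation. This is only an illustration under stated assumptions: the uniform segmentation term is a constant and is dropped, the decaying distortion term is assumed to have the common form alpha^d with d the number of source words jumped over, and the names `distortion`, `derivation_log_prob`, `alpha`, and the toy probabilities are hypothetical, not taken from the slides.

```python
import math

def distortion(prev_end, next_start):
    """Number of source words skipped when jumping to the next phrase."""
    return abs(next_start - prev_end - 1)

def derivation_log_prob(derivation, phrase_table, alpha=0.5):
    """Score a derivation given as (src_start, src_end, f_phrase, e_phrase)
    tuples listed in target-language order.

    Assumption: p(reorderings) is a product of alpha ** distortion terms,
    and the uniform p(segmentation) is omitted because it is constant.
    """
    log_p = 0.0
    prev_end = -1  # position just before the first source word
    for start, end, f_phrase, e_phrase in derivation:
        log_p += math.log(phrase_table[(f_phrase, e_phrase)])
        log_p += distortion(prev_end, start) * math.log(alpha)
        prev_end = end
    return log_p

# Toy probabilities, invented for illustration only.
phrase_table = {("北 风", "north wind"): 0.5, ("呼啸", "howls"): 0.25}
monotone = [(0, 1, "北 风", "north wind"), (2, 2, "呼啸", "howls")]
swapped = [(2, 2, "呼啸", "howls"), (0, 1, "北 风", "north wind")]

print(derivation_log_prob(monotone, phrase_table))
print(derivation_log_prob(swapped, phrase_table))
```

With these toy numbers the monotone derivation pays no distortion penalty, while the swapped one is penalized for both jumps, so it scores lower even though the phrase translations are the same.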
Learning p(Chinese|English)
•Reminder: (nearly) every problem comes down to computing either:
•Sums: MLE or EM (learning)
•Maximum: most probable (decoding)
Recap: Expectation Maximization
•Arbitrarily select a set of parameters (say, uniform).
•Calculate expected counts of the unseen events.
•Choose new parameters to maximize likelihood, using expected counts as proxy for observed counts.
•Iterate.
•Guaranteed that likelihood is monotonically nondecreasing.
[Figure: several candidate word alignments between 虽然 北 风 呼啸 , 但 天空 依然 十分 清澈 。 and "However , the sky remained clear under the strong north wind .", with their probabilities p(...) added together.]
Marginalize: sum over all alignments containing the link.
Divide by the sum over all possible alignments.
We have to sum over exponentially many alignments!
EM for Model 1
Probability of an alignment:
\[ p(F, A \mid E) = p(I \mid J) \prod_{a_i} p(a_i = j)\, p(f_i \mid e_j) \]
Here p(I | J) is observed and p(a_i = j) is uniform. The probability factors across words.
Posterior probability of a single link:
\[ p(a_i = j \mid F, E) = \frac{p(a_i = j, F \mid E)}{p(F \mid E)} \]
Numerator, the marginal probability of alignments containing the link 北-north:
\[ \sum_{a \in A:\ \text{北} \leftrightarrow \text{north}} p(\text{north} \mid \text{北}) \cdot p(\text{rest of } a) = p(\text{north} \mid \text{北}) \cdot \sum_{a \in A:\ \text{北} \leftrightarrow \text{north}} p(\text{rest of } a) \]
Denominator, the marginal probability of all alignments:
\[ \sum_{c \in \text{Chinese words}} p(\text{north} \mid c) \cdot \sum_{a \in A:\ c \leftrightarrow \text{north}} p(\text{rest of } a) \]
The inner sums over the rest of the alignment are identical, so they cancel:
\[ p(a_i = j \mid F, E) = \frac{p(\text{north} \mid \text{北})}{\sum_{c \in \text{Chinese words}} p(\text{north} \mid c)} \]
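Because the link posterior reduces to a ratio of word translation probabilities, EM for Model 1 needs no explicit enumeration of alignments. The following is a minimal sketch, assuming the p(f_i | e_j) parameterization from the alignment formula, no NULL word, and a uniform start. The function name `model1_em` and the toy German-English bitext are illustrative assumptions, not part of the slides.

```python
from collections import defaultdict

def model1_em(bitext, iterations=10):
    """Estimate t[(f, e)] = p(f | e) from a list of (f_words, e_words) pairs."""
    f_vocab = {f for f_words, _ in bitext for f in f_words}
    t = defaultdict(lambda: 1.0 / len(f_vocab))  # arbitrary (uniform) start
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of each (f, e) link
        total = defaultdict(float)  # expected count of links leaving each e
        for f_words, e_words in bitext:
            for f in f_words:
                # Denominator of the link posterior: sum over candidate e.
                z = sum(t[(f, e)] for e in e_words)
                for e in e_words:
                    posterior = t[(f, e)] / z
                    count[(f, e)] += posterior
                    total[e] += posterior
        # M-step: expected counts stand in for observed counts.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t

# Toy bitext, chosen only to make the behaviour easy to inspect.
bitext = [
    ("das Haus".split(), "the house".split()),
    ("das Buch".split(), "the book".split()),
    ("ein Buch".split(), "a book".split()),
]
t = model1_em(bitext)
print(t[("das", "the")], t[("Haus", "the")])
```

Each E-step costs time proportional to the product of the two sentence lengths, which is what makes Model 1 tractable despite the exponential number of alignments.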
EM for Phrase-Based
•Model parameters: p(E phrase|F phrase)
•All we need to do is compute expectations:
\[ p(a_{i,i'} = \langle j, j' \rangle \mid F, E) = \frac{p(a_{i,i'} = \langle j, j' \rangle, F \mid E)}{p(F \mid E)} \]
p(F | E) sums over all possible phrase alignments, which are one-to-one by definition.
Can we compute this quantity?
Source: 虽然 北 风 呼啸 , 但 天空 依然 十分 清澈 。
Gloss: Although north wind howls , but sky still very clear .
Translation: However , the sky remained clear under the strong north wind .
How many 1-to-1 alignments are there of the remaining 8 Chinese and 8 English words?
Recap: Expectation Maximization
•Arbitrarily select a set of parameters (say, uniform).
•Calculate expected counts of the unseen events.
•Choose new parameters to maximize likelihood, using expected counts as proxy for observed counts.
•Iterate.
•Guaranteed that likelihood is monotonically nondecreasing.
Computing expectations from a phrase-based model, given a sentence pair, is #P-complete (by reduction from counting perfect matchings; DeNero & Klein, 2008).
argmax_a p(a | f, e) is also hard.
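To get a feel for the counting problem behind this result, the sketch below counts one-to-one matchings by brute force over permutations. It is a simplification under stated assumptions: it only handles word-level one-to-one links (phrase alignments also vary the segmentation), and the function name `count_perfect_matchings` and the `allowed` matrix are hypothetical.

```python
from itertools import permutations
from math import factorial

def count_perfect_matchings(allowed):
    """Count one-to-one matchings; allowed[i][j] says whether item i may link to item j."""
    n = len(allowed)
    return sum(
        all(allowed[i][p[i]] for i in range(n))
        for p in permutations(range(n))
    )

# With no restrictions, 8 source and 8 target words admit 8! matchings.
n = 8
complete = [[True] * n for _ in range(n)]
total = count_perfect_matchings(complete)
print(total, factorial(n))
```

The loop visits every permutation, so its running time grows factorially with the sentence length, and no polynomial-time exact counting algorithm is known for the general problem.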
Now What?
•Option #1: approximate expectations
•Restrict computation to some tractable subset of the alignment space (arbitrarily biased).
•Markov chain Monte Carlo (slow).
Now What?
•Change the problem definition
•We already know how to learn word-to-word translation models efficiently.
•Idea: learn word-to-word alignments, extract most probable alignment, then treat it as observed.
•Learn phrase translations consistent with word alignments.
•Decouples alignment from model learning -- is this a good thing?
Phrase Extraction
English: I open the box
Japanese: watashi wa hako wo akemasu
[Figure: word alignment grid between the two sentences, with candidate phrase pairs highlighted one at a time.]
•akemasu / open
•watashi wa / I
•watashi / I ✘
•hako wo / box
•hako wo / the box
•hako wo / open the box ✘
•hako wo akemasu / open the box
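The accepted and rejected pairs above follow from a consistency check against the word alignment: a phrase pair is kept only if no word inside it is linked to a word outside it. Below is a minimal sketch of that check. The alignment links are an assumption inferred from the examples (watashi and wa both link to "I", hako and wo both link to "box", akemasu links to "open", and "the" is unaligned), and `extract_phrases` is a hypothetical name.

```python
def extract_phrases(src, tgt, links, max_len=7):
    """Return phrase pairs consistent with `links`, a set of (src_idx, tgt_idx)."""
    aligned_tgt = {t for _, t in links}
    pairs = set()
    for s1 in range(len(src)):
        for s2 in range(s1, min(s1 + max_len, len(src))):
            # Target positions linked to the source span [s1, s2].
            tgt_pos = [t for s, t in links if s1 <= s <= s2]
            if not tgt_pos:
                continue
            t_lo, t_hi = min(tgt_pos), max(tgt_pos)
            # Reject if a target word in the span links outside the source span.
            if any(t_lo <= t <= t_hi and not s1 <= s <= s2 for s, t in links):
                continue
            # Also emit variants that absorb adjacent unaligned target words.
            lo = t_lo
            while lo >= 0 and (lo == t_lo or lo not in aligned_tgt):
                hi = t_hi
                while hi < len(tgt) and (hi == t_hi or hi not in aligned_tgt):
                    pairs.add((" ".join(src[s1:s2 + 1]), " ".join(tgt[lo:hi + 1])))
                    hi += 1
                lo -= 1
    return pairs

src = "watashi wa hako wo akemasu".split()
tgt = "I open the box".split()
# Assumed alignment; the slides show it only as a grid.
links = {(0, 0), (1, 0), (2, 3), (3, 3), (4, 1)}
pairs = extract_phrases(src, tgt, links)
for pair in sorted(pairs):
    print(" / ".join(pair))
```

Under the assumed links, "watashi / I" is rejected because "I" is also linked to wa, and "hako wo / open the box" is rejected because "open" is linked to akemasu, which lies outside the source phrase.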
Phrasal Translation Estimation
•Option #1 (EM over restricted space)
•Align with a word-based model.
•Compute expectations only over alignments consistent with the alignment grid.
•Option #2 (Non-global estimation)
•View phrase pairs as observed, irrespective of context or overlap.
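Option #2 amounts to relative-frequency estimation over the extracted phrase pairs. A minimal sketch, assuming the parameters are p(E phrase | F phrase) and that every extraction event counts once; the function name `estimate_phrase_table` and the toy list of events are hypothetical.

```python
from collections import Counter

def estimate_phrase_table(extracted_pairs):
    """Relative frequencies p(e_phrase | f_phrase) from (f_phrase, e_phrase) events."""
    pair_counts = Counter(extracted_pairs)
    f_counts = Counter()
    for (f, _), c in pair_counts.items():
        f_counts[f] += c
    return {(f, e): c / f_counts[f] for (f, e), c in pair_counts.items()}

# Toy extraction events, invented for illustration.
events = [
    ("hako wo", "box"),
    ("hako wo", "the box"),
    ("hako wo", "the box"),
    ("akemasu", "open"),
]
table = estimate_phrase_table(events)
print(table[("hako wo", "the box")])
```

Because overlapping phrase pairs from the same sentence are all counted as if they were independent observations, this is a heuristic estimate and not a maximum-likelihood fit of the generative model.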
Decoding
We want to solve this problem:
\[ e^* = \arg\max_e p(e \mid f) \]
Q: how many English sentences are there?
北 风 呼啸 。
•O(2^n) segmentations
•O(5^n) substitutions
•O(n!) permutations
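The three growth rates above can be made concrete for the four-token example. The sketch below assumes five candidate translations per word (matching the base 5) and counts word-level permutations as an upper bound on reorderings; `search_space` is a hypothetical helper.

```python
from math import factorial

def search_space(n, translations_per_word=5):
    """Rough sizes of the three choices a decoder makes for an n-word input."""
    return {
        "segmentations": 2 ** (n - 1),  # each gap between words is a boundary or not
        "substitutions": translations_per_word ** n,
        "permutations": factorial(n),
    }

sizes = search_space(4)  # 北 风 呼啸 。
print(sizes)
```

The product of these numbers bounds the number of derivations, which is why exhaustive enumeration is hopeless for realistic sentence lengths.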
Key Idea
Dynamic Programming
[Figure: search graph built left to right over 虽然 北 风 呼啸 , 但 天空 依然 十分 清澈 。 State labels shown: START Although; START However; crystal clear; north wind; wind shrieked; wind screamed; the sky; shrieked ","; ", yet"; sky ","; still quite; clear "."; blue "."]
Output: Although the northern wind shrieked across the sky, but was still very clear.
Approximation: Pruning
Idea: prune states by accumulated path length
Problem: paths that cover more source words accumulate more cost, so they are not directly comparable with paths that cover fewer.
Solution: Group states by number of covered words.
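Grouping states by the number of covered words is often implemented as stack decoding. The sketch below is a deliberately reduced version under stated assumptions: it has no language model and no future-cost estimate, it reuses an assumed alpha^d distortion penalty, and the names `stack_decode`, `Hyp`, `stack_size`, and the toy phrase table are all hypothetical.

```python
import math
from collections import namedtuple

Hyp = namedtuple("Hyp", "logp covered last_end words")

def stack_decode(src, phrase_table, max_phrase_len=3, stack_size=10, alpha=0.5):
    """Beam search with one stack per number of covered source words."""
    n = len(src)
    stacks = [dict() for _ in range(n + 1)]
    stacks[0][(frozenset(), -1)] = Hyp(0.0, frozenset(), -1, ())
    for k in range(n):
        # Histogram pruning: expand only the best few states in this stack.
        best = sorted(stacks[k].values(), key=lambda h: h.logp, reverse=True)
        for h in best[:stack_size]:
            for i in range(n):
                for j in range(i + 1, min(i + max_phrase_len, n) + 1):
                    span = frozenset(range(i, j))
                    if span & h.covered:
                        break  # longer spans starting at i overlap too
                    f = " ".join(src[i:j])
                    for e, p in phrase_table.get(f, {}).items():
                        jump = abs(i - h.last_end - 1)
                        logp = h.logp + math.log(p) + jump * math.log(alpha)
                        covered = h.covered | span
                        key = (covered, j - 1)
                        stack = stacks[len(covered)]
                        # Recombination: keep the best path into each state.
                        if key not in stack or logp > stack[key].logp:
                            stack[key] = Hyp(logp, covered, j - 1, h.words + (e,))
    finals = list(stacks[n].values())
    return max(finals, key=lambda h: h.logp) if finals else None

# Toy phrase table with invented probabilities.
phrase_table = {
    "北": {"north": 0.8, "northern": 0.2},
    "风": {"wind": 1.0},
    "北 风": {"north wind": 0.9},
    "呼啸": {"howls": 0.6, "shrieked": 0.4},
}
best = stack_decode("北 风 呼啸".split(), phrase_table)
print(" ".join(best.words), best.logp)
```

Hypotheses in the same stack have translated the same number of source words, so comparing their scores during pruning is fairer than comparing hypotheses of different lengths. A realistic decoder would also add a language model score and an estimate of the cost of the untranslated words.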
•Some (not all) key ingredients in Google Translate:
•Phrase-based translation models
•... Learned heuristically from word alignments
•... Coupled with a huge language model
•... And decoding w/ severe pruning heuristics