

Date post: 18-Jan-2018
Getting the structure right for word alignment: LEAF Alexander Fraser and Daniel Marcu Presenter Qin Gao
Transcript
Page 1

Getting the structure right for word alignment: LEAF

Alexander Fraser and Daniel Marcu

Presenter Qin Gao

Page 2

Quick summary

Problem: IBM Models have a 1-N assumption

Solutions: A sophisticated generative story; generative estimation of parameters

Additional solution: Decompose the model components; semi-supervised training

Result: Significant improvement on BLEU (AR-EN)

Page 3

The generative story

Source words
Head words: link to zero or more non-head words (same side)
Non-head words: linked from one head word (same side)
Deleted words: no link on the source side

Target words
Head words: link to zero or more non-head words (same side)
Non-head words: linked from one head word (same side)
Spurious words: no link on the target side
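The word roles on this slide can be sketched as a small data structure. This is only an illustration, not the authors' implementation: the names `Role`, `Side`, `heads`, and `head_of` are hypothetical, and deleted and spurious words are both modelled as "unlinked" on their own side.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    HEAD = "head"
    NON_HEAD = "non-head"
    UNLINKED = "unlinked"  # deleted (source side) or spurious (target side)


@dataclass
class Side:
    """One side (source or target) of a sentence pair. Hypothetical layout."""
    words: list[str]
    heads: set[int] = field(default_factory=set)           # indices of head words
    head_of: dict[int, int] = field(default_factory=dict)  # non-head index -> its head index

    def role(self, i: int) -> Role:
        if i in self.heads:
            return Role.HEAD
        if i in self.head_of:
            return Role.NON_HEAD
        return Role.UNLINKED

    def check(self) -> None:
        """Each non-head word must be linked from exactly one head word on the same side."""
        for word, head in self.head_of.items():
            if word in self.heads:
                raise ValueError(f"word {word} is both a head and a non-head")
            if head not in self.heads:
                raise ValueError(f"word {word} is linked from {head}, which is not a head")


# Target side of the running example: X is a head, Y hangs off X, Z is spurious.
target = Side(words=["X", "Y", "Z"], heads={0}, head_of={1: 0})
target.check()
print([target.role(i).value for i in range(3)])  # ['head', 'non-head', 'unlinked']
```

Because `head_of` is a dictionary, a non-head word can only ever point at one head, which matches "linked from one head word".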

Page 4

Minimal translational correspondence

Page 5

Page 6

The generative story

A B C

Page 7

1a. Conditioned on the source word

A B C

Page 8

1b. Determine the source word class

A B C

Page 9

2a. Conditioned on the source classes

C(A) C(B) C(C)

Page 10

2b. Determine the links between head words and non-head words

C(A) C(B) C(C)

Page 11

3a. Depends on the source head word

A B C

Page 12

3b. Determine the target head word

A B C

X

Page 13

4a. Conditioned on the source head word and cept size

A B C

X

2

Page 14

4b. Determine the target cept size

A B C

X

2

?

Page 15

5a. Depends on the existing sentence length

A B C

X

2

?

Page 16

5b. Determine the number of spurious target words

A B C

X

2

? ?

Page 17

6a. Depends on the target word

A B C

X ? ?

XYZ

Page 18

6b. Determine the spurious word

A B C

X ? Z

XYZ

Page 19

7a. Depends on the target head word's class and the source word

A B C

C(X) ? Z

Page 20

7b. Determine the non-head word it is linked to

A B C

C(X) Y Z

Page 21

8a. Depends on the classes of the source/target head words

C(A) B C

C(X) Y Z

1 2 3

Page 22

8b. Determine the position of the target head word

2

C(A) B C

C(X)

Y Z

1 3

Page 23

8c. Depends on the target word class

2

C(A) B C

C(X)

Y Z

1 3

Page 24

8d. Determine the position of the non-head words

3 2

C(A) B C

C(X) Y

Z

1

Page 25

9. Fill the vacant positions uniformly

1 3 2

C(A) B C

C(X) Y Z

Page 26

10. The real alignment

1 3 2

C(A) B C

C(X) Y Z

Page 27

Unsupervised parameter estimation

Bootstrap using HMM alignments in two directions
Using the intersection to determine head words
Using the 1-N alignment to determine target cepts
Using the M-1 alignment to determine source cepts
Could be infeasible
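A minimal sketch of this bootstrap, assuming each directional HMM alignment is given as a set of (source index, target index) links. The function name `bootstrap_cepts` and the `conflicts` list are hypothetical, and reporting conflicting links is only one guess at what "could be infeasible" refers to; the slides do not say how such cases are handled.

```python
def bootstrap_cepts(one_to_n, m_to_one):
    """Derive head links and cepts from two directional alignments.

    one_to_n: links where each target word has at most one source word.
    m_to_one: links where each source word has at most one target word.
    Both are sets of (source_index, target_index) pairs (assumed format).
    """
    # Links present in both directions become head-to-head links.
    head_links = one_to_n & m_to_one
    tgt_head_for_src = dict(head_links)                # source head -> target head
    src_head_for_tgt = {t: s for s, t in head_links}   # target head -> source head

    tgt_head_of, src_head_of, conflicts = {}, {}, []

    # 1-N links: extra target words join the target cept of their source head.
    for s, t in sorted(one_to_n - head_links):
        head = tgt_head_for_src.get(s)
        if head is None:
            conflicts.append((s, t))   # source word s is not a head word
        else:
            tgt_head_of[t] = head

    # M-1 links: extra source words join the source cept of their target head.
    for s, t in sorted(m_to_one - head_links):
        head = src_head_for_tgt.get(t)
        if head is None:
            conflicts.append((s, t))   # target word t is not a head word
        else:
            src_head_of[s] = head

    return head_links, src_head_of, tgt_head_of, conflicts
```

With `one_to_n = {(0, 0), (0, 1)}` and `m_to_one = {(0, 0), (1, 0)}`, the intersection makes source 0 and target 0 the heads, target 1 a non-head of target 0, and source 1 a non-head of source 0.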

Page 28

Training: Similar to Model 3/4/5

From some alignment (not sure how they get it), apply one of the seven operators to get new alignments:

Move French non-head word to new head
Move English non-head word to new head
Swap heads of two French non-head words
Swap heads of two English non-head words
Swap English head word links of two French head words
Link English word to French word, making new head words
Unlink English and French head words

All the alignments that can be generated by applying one of the operators above are called neighbors of the alignment
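To make the neighborhood idea concrete, here is a sketch of two of the seven operators (moving an English non-head word to a new head, and swapping the English head links of two French heads). The `Alignment` layout and function names are hypothetical, chosen only for this illustration; the remaining five operators would follow the same pattern.

```python
from itertools import combinations
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical representation; every field is a frozenset of index pairs."""
    head_links: frozenset  # {(f_head, e_head)}
    f_head_of: frozenset   # {(f_non_head, f_head)}
    e_head_of: frozenset   # {(e_non_head, e_head)}


def move_e_non_head(a):
    """Move one English non-head word to a different English head."""
    e_heads = {e for _, e in a.head_links}
    for word, head in a.e_head_of:
        for new_head in e_heads - {head}:
            yield a._replace(e_head_of=(a.e_head_of - {(word, head)}) | {(word, new_head)})


def swap_e_head_links(a):
    """Swap the English head words linked to two French head words."""
    for (f1, e1), (f2, e2) in combinations(sorted(a.head_links), 2):
        swapped = (a.head_links - {(f1, e1), (f2, e2)}) | {(f1, e2), (f2, e1)}
        yield a._replace(head_links=swapped)


def neighbors(a):
    """All alignments reachable by one application of the operators sketched here."""
    yield from move_e_non_head(a)
    yield from swap_e_head_links(a)
```

Each neighbor differs from the current alignment by exactly one operator application, which keeps the neighborhood small enough to enumerate.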

Page 29

Training

If there is a better alignment in the neighborhood, update the current alignment
Continue until no better alignment can be found
Collect counts from the last neighborhood
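The search on this slide is a greedy hill climb. A generic sketch follows, assuming the caller supplies a `neighbors` function and a `score` function (both hypothetical names); whether the final alignment itself is also counted alongside its neighborhood is not stated on the slide.

```python
def hill_climb(start, neighbors, score):
    """Greedy local search over alignments.

    Returns the locally best alignment and its neighborhood, from which
    counts would then be collected.
    """
    current, current_score = start, score(start)
    while True:
        neighborhood = list(neighbors(current))
        best = max(neighborhood, key=score, default=None)
        # Stop when no neighbor is strictly better than the current alignment.
        if best is None or score(best) <= current_score:
            return current, neighborhood
        current, current_score = best, score(best)


# Toy usage with integers standing in for alignments.
best, last_neighborhood = hill_climb(
    start=0,
    neighbors=lambda x: [x - 1, x + 1],
    score=lambda x: -(x - 3) ** 2,
)
print(best, last_neighborhood)  # 3 [2, 4]
```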

Page 30

Semi-supervised training

Decompose the components in the large formula and treat them as features in a log-linear model
And other features
Used the EMD (EM-Discriminative) algorithm
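A sketch of the decomposition idea, assuming each sub-model of the generative formula contributes its log probability as one feature with its own weight. The feature names and numbers below are made up for illustration, and the additional non-LEAF features are not modelled.

```python
import math


def log_linear_score(sub_model_probs, weights):
    """Weighted sum of log sub-model probabilities for one alignment.

    sub_model_probs: feature name -> probability that sub-model assigns.
    weights: feature name -> weight (to be tuned discriminatively).
    """
    return sum(weights[name] * math.log(p) for name, p in sub_model_probs.items())


# Hypothetical feature names and values.
probs = {"head_translation": 0.5, "cept_size": 0.25}
uniform = {"head_translation": 1.0, "cept_size": 1.0}
print(math.isclose(log_linear_score(probs, uniform), math.log(0.5 * 0.25)))  # True
```

With all weights equal to 1 the score is the log of the product of the components, so the purely generative model is the special case that discriminative training starts from.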

Page 31

Experiment

First, a very weird operation: they fully link alignments from ALL systems and then compare the performance
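One plausible reading of "fully link", offered only as an assumption since the slide does not define it: within each connected group of linked words, every source word is linked to every target word. A sketch under that assumption, with the hypothetical name `fully_link`:

```python
def fully_link(links):
    """Return the links with every connected component fully interlinked.

    links: set of (source_index, target_index) pairs (assumed format).
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Source and target indices are tagged so they never collide.
    for s, t in links:
        parent[find(("s", s))] = find(("t", t))

    # Gather the source and target words of each component.
    groups = {}
    for s, t in links:
        src, tgt = groups.setdefault(find(("s", s)), (set(), set()))
        src.add(s)
        tgt.add(t)

    return {(s, t) for src, tgt in groups.values() for s in src for t in tgt}


print(sorted(fully_link({(0, 0), (0, 1), (1, 1)})))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```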

Page 32

Training/Test Set

Page 33

Experiments

French/English: Phrase based
Arabic/English: Hierarchical (Chiang 2005)
Baseline: GIZA++ Model 4, Union
Baseline Discriminative: Only using Model 4 components as features

Page 34

Conclusion (Mine)

The new structural features are useful in discriminative training
No evidence that the generative model is superior to Model 4

Page 35

Unclear points

Are F scores "biased"?
No BLEU score given for LEAF unsupervised
They used features in addition to LEAF features; where does the contribution come from?

