Experiments in English↔Japanese Tree-to-String Machine Translation

Graham Neubig
Nara Institute of Science and Technology

10/20/2012

Introduction/Motivation

Translation Models

English: he visited the white house
Japanese: 彼 は ホワイト ハウス を 訪問 した

[Figure: each side of the sentence pair is shown in three representations: string, tree (phrase structure), and dependency. A translation model maps one source representation "to" one target representation. The English tree uses the tags PRP VBD DT NNP NNP under NP, VP, and S nodes; the Japanese tree uses N P N N P N V under NP, PP, VP, and S nodes.]

Recent Usage in English↔Japanese

● Phrase-based translation [Koehn+ 03] is still popular

● Moses used in 25 papers at NLP2012

● Also, hierarchical phrase-based translation [Chiang 07] ([Feng+ 11] is one of the few examples)

English:   he   visited  the white house

Japanese: 彼 は ホワイト ハウス を 訪問 した

Recent Usage in English↔Japanese

● Pre-ordering [Xia+ 04] is another popular technique

● First used for Japanese by [Komachi+ 06]?

● Used by Google [Xu+ 09], NTT [Isozaki+ 11], others [Nguyen+ 08, Neubig+ 12]

Source dependencies: he visited the white house (labels: subj, obj, det, adj)

Pre-ordering: he the white house visited (subj v obj → subj obj v)

Translation: 彼 は ホワイト ハウス を 訪問 した
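The subj v obj → subj obj v rule can be sketched in a few lines of Python. This is only an illustration of the idea, not the method of any of the cited systems: the `Token` type, the `preorder` function, the label names, and the hand-written head indices are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    word: str
    head: int   # index of the head token, -1 for the root (assumed encoding)
    label: str  # dependency label (hypothetical label set)


def subtree(tokens: List[Token], i: int) -> List[int]:
    """Indices of token i and all of its descendants, in source order."""
    out = [i]
    for j, tok in enumerate(tokens):
        if tok.head == i:
            out.extend(subtree(tokens, j))
    return sorted(out)


def preorder(tokens: List[Token]) -> List[str]:
    """Apply the single rule subj v obj -> subj obj v at the root verb.

    Other dependents of the verb are ignored; a real pre-orderer would
    need rules for them as well.
    """
    root = next(i for i, tok in enumerate(tokens) if tok.head == -1)
    subj: List[int] = []
    obj: List[int] = []
    for j, tok in enumerate(tokens):
        if tok.head == root and tok.label == "subj":
            subj += subtree(tokens, j)
        elif tok.head == root and tok.label == "obj":
            obj += subtree(tokens, j)
    return [tokens[i].word for i in subj + obj + [root]]


sentence = [
    Token("he", 1, "subj"),
    Token("visited", -1, "root"),
    Token("the", 4, "det"),
    Token("white", 4, "adj"),
    Token("house", 1, "obj"),
]
print(" ".join(preorder(sentence)))  # he the white house visited
```

The reordered English is then handed to an ordinary phrase-based system, which only has to translate it roughly monotonically.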

Recent Usage in English↔Japanese

● Dependency-to-dependency used by Kyoto U [Nakazawa+ 06] and rule based systems

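[Figure: dependency trees for "he visited the white house" (labels include nsubj, dobj, det) and 彼 は ホワイト ハウス を 訪問 した (labels include subj, dobj), with a dependency-to-dependency rule pairing "X1 visited X2" (nsubj X1, dobj X2) with a Japanese treelet headed by 訪問 した with dependents X1 and X2]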

Recent Usage in English↔Japanese

● String-to-tree models [Yamada+ 01] used by NTT in NTCIR task [Sudoh+ 11]

Recent Usage in English↔Japanese

[Figure: the string / tree (phrase structure) / dependency diagram for "he visited the white house" and 彼 は ホワイト ハウス を 訪問 した, annotated with the approaches covered so far: (H)PBMT, Pre-ordering, D2D, S2T]

What about Tree-driven Models?!

[Figure: the string / tree (phrase structure) / dependency diagram for "he visited the white house" and 彼 は ホワイト ハウス を 訪問 した, annotated with T2S (tree-to-string) and D2S (dependency-to-string)]

Tree-to-String Models [Liu+ 06]

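[Figure: Japanese parse tree for 友達 と ご飯 を 食べ た with span-labelled nodes VP0-5, PP0-1 (N0 友達, P1 と), VP2-5, PP2-3 (N2 ご飯, P3 を), and VP4-5 (V4 食べ, SUF5 た). Rules attached to the nodes have target sides such as "x1 with x0", "x1 x0", "a friend", "a meal", and "ate", and together produce the translation below.]

ate a meal with a friend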
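The derivation of "ate a meal with a friend" can be reproduced with a small recursive rule matcher. The sketch below is an illustration under stated assumptions, not the decoder used in the talk: the nested-tuple tree encoding, the `match` and `translate` helpers, and the exact five-rule set (one plausible reading of the figure, with the particles と and を lexicalized inside the reordering rules) are all hypothetical. It also takes the first matching rule, whereas a real decoder scores competing derivations.

```python
# A tree is (label, word) for a leaf or (label, [children]) otherwise.
# In a rule pattern, a string such as "x0:N" is a variable that matches
# any subtree labelled N.
RULES = [
    (("VP", [("PP", ["x0:N", ("P", "と")]), "x1:VP"]), "x1 with x0"),
    (("VP", [("PP", ["x0:N", ("P", "を")]), "x1:VP"]), "x1 x0"),
    (("N", "友達"), "a friend"),
    (("N", "ご飯"), "a meal"),
    (("VP", [("V", "食べ"), ("SUF", "た")]), "ate"),
]

TREE = ("VP", [
    ("PP", [("N", "友達"), ("P", "と")]),
    ("VP", [
        ("PP", [("N", "ご飯"), ("P", "を")]),
        ("VP", [("V", "食べ"), ("SUF", "た")]),
    ]),
])


def match(pattern, tree, bindings):
    """Match a rule's source side against a subtree, filling `bindings`."""
    if isinstance(pattern, str):  # variable such as "x0:N"
        var, label = pattern.split(":")
        if tree[0] != label:
            return False
        bindings[var] = tree
        return True
    p_label, p_body = pattern
    t_label, t_body = tree
    if p_label != t_label:
        return False
    if isinstance(p_body, str) or isinstance(t_body, str):
        return p_body == t_body  # leaves must carry the same word
    if len(p_body) != len(t_body):
        return False
    return all(match(p, t, bindings) for p, t in zip(p_body, t_body))


def translate(tree):
    """Rewrite the tree top-down with the first rule that matches."""
    for pattern, target in RULES:
        bindings = {}
        if match(pattern, tree, bindings):
            return " ".join(
                translate(bindings[tok]) if tok in bindings else tok
                for tok in target.split()
            )
    raise ValueError(f"no rule matches a node labelled {tree[0]}")


print(translate(TREE))  # ate a meal with a friend
```

Because the reordering is written into the target side of each rule ("x1 with x0", "x1 x0"), reordering and lexical translation are decided in the same step.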

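Dependency-to-String Models [Quirk+ 05]

[Figure: dependency tree for "he visited the white house" (labels include nsubj, dobj, det) aligned to 彼 は ホワイト ハウス を 訪問 した, with a rule pairing "X1 visited X2" (nsubj X1, dobj X2) with 訪問 した and the variables X1 and X2]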

T2S/D2S vs Phrase Based

● + Better reordering through use of syntactic structure

● + Very fast! (especially compared to HPBMT)

● + Better lexical choice because long-range context considered (especially D2S)

● - Requires a parser

● - Sensitive to parse errors

T2S/D2S vs Pre-ordering

● + T2S/D2S jointly searches for reordering and translation

● + T2S/D2S can easily handle lexicalized reordering

● - Pre-ordering can find translation rules that overlap constituent boundaries

Example rules (both with a PP under VP on the Japanese side):

X が 好き → likes X
X が 高い → X is high

T2S vs. D2S

● T2S: Can handle de-lexicalized rules = more general?

S( X1:NP VP( X2:VBD X3:NP ) ) → X1 X3 X2
(SVO → SOV)

● D2S: Dependent words are close → good for lexical choice?

run a program (dobj: run → program)
run a marathon (dobj: run → marathon)

Experiments and Summary

Question:

How well do modern statistical tree-to-string methods work for English↔Japanese translation?

Previous Research

● Three examples for En→Ja?
  ● [Quirk+ 06] Uses dependency treelet translation and shows improvement over PBMT
  ● [Wu+ 10] Uses HPSG input and shows improvement over Joshua (HPBMT)
  ● [DeNero+ 11] Shows forest-to-string does slightly better than syntactic pre-ordering in terms of BLEU

● One example for Ja→En?
  ● [Menezes+ 05] Uses dependency treelet translation, no direct comparison to other methods

Experimental Setup

● System: In-house forest-to-string decoder “travatar”
  ● Forest-to-string translation [Mi+ 08] with tree transducers
  ● Alignment GIZA++, extraction GHKM, tuning MERT

● Data: Kyoto Free Translation Task (KFTT [Neubig 11]), ~350k sentences of Wikipedia data for training

● Baseline: Moses PBMT, PBMT + Preordering [Neubig+ 12]

● Evaluation: BLEU, RIBES, Acceptability (0-5)

Tree-to-String Settings (Explained in Detail Later)

● Language Analysis:
  ● En Parser: Stanford, Berkeley, Egret (Tree, Forest)
  ● Ja: Juman+KNP, MeCab+CaboCha, KyTea+EDA

● Composed Rules: 1, 2, 3, 4

● Non-terminals: 1, 2, 3

● Binarization: Left, Right

● Null Attachment: Top, Exhaustive (1, 2)

● Tuning: BLEU, RIBES, (BLEU+RIBES)/2

Summary (En-Ja)

[Figure: three bar charts comparing PBMT, PBMT+Pre, T2S, and F2S on BLEU, RIBES, and Acceptability]

Summary (Ja-En)

[Figure: three bar charts comparing PBMT, PBMT+Pre, and T2S on BLEU, RIBES, and Acceptability]

En-Ja F2S vs. PBMT+Pre

Input: Department of Sociology in Faculty of Letters opened .

PBMT+Pre: 開業 年 文学 部 社会 学科 。

F2S: 文学 部 社会 学 科 を 開設 。

Properly interprets noun phrase + verb

En-Ja F2S vs. PBMT+Pre

Input: Afterwards it was reconstructed but its influence declined .

PBMT+Pre: その 後 衰退 し た が 、 その 影響 を 受け て 再建 さ れ た もの で あ る 。

F2S: その 後 再建 さ れ て い た が 、 影響 力 は 衰え た 。

Properly reconstructs relationship between two verb phrases

En-Ja F2S vs. PBMT+Pre

Input: Introduction of KANSAI THRU PASS Miyako Card

PBMT+Pre: スルッと kansai 都 カード の 導入

F2S: 伝来 スルッと KANSAI 都 カード

Parsing error: (NP (NP Introduction) (PP of KANSAI THRU PASS) (NP Miyako) (NP Card))

Ja-En T2S vs. PBMT+Pre

Input: 史実 に は 直接 の 関係 は な い 。

PBMT+Pre: in the historical fact is not directly related to it .

T2S: is not directly related to the historical facts .

Properly translates “… に は 関係 が” as “related to”

Ja-En T2S vs. PBMT+Pre

Input: 九条 道家 は 嫡男 ・ 九条 教実 に 先立 た れ 、 次男 ・ 二条 良実 は 事実 上 の 勘当 状態 に あ っ た 。

PBMT+Pre: michiie kujo was his eldest son and heir , norizane kujo , and his second son , yoshizane nijo was disinherited .

T2S: michiie kujo to his legitimate son kujo norizane died before him , and the second son , nijo yoshizane was virtually disowned .

Much better division between clauses

Ja-En T2S vs. PBMT+Pre

Input:
日本 語 日本 文学 科
1474 年 ~ 1478 年 - 山名 政 豊

PBMT+Pre:
the department of japanese language and literature
in 1474 to 1478 - masatoyo yamana

T2S:
japanese language and literature
masatoyo yamana 1474 shokoku-ji in -

Errors due to more restrictive rule extraction (first example), parse errors (second example, “Yamana” is a single noun phrase)

Effect of Language Analysis

Question:

How much do the language analysis tools used affect translation?

Language Analysis (En-Ja):

● Which parser provides better translations?

● Stanford Parser, Berkeley Parser, Egret (a clone of the Berkeley parser that can output forests)

[Figure: bar charts of RIBES and BLEU for PBMT, PBMT+Pre, Stanford, Berkeley, Egret, and Egret+F2S]

Language Analysis (Ja-En):

● 3 morphological/dependency analysis combinations

● Use head rules to change dependency into CFG
  ● For bunsetsu-based, last content word is head
  ● Punctuation dependencies reversed

              Juman+KNP   MeCab+CaboCha      KyTea+EDA
Segmentation  Long        Medium             Short
OOV           Simple      Simple             Model
Parsing Unit  Bunsetsu    Bunsetsu           Word
Algorithm     CKY-Style   Cascaded Chunking  MST

Language Analysis (Ja-En):

[Figure: bar charts of RIBES, BLEU, and Acceptability for PBMT, PBMT+Pre, Juman+KNP, MeCab+CaboCha, and KyTea+EDA]

EDA vs. KNP/CaboCha

Input:
向嶽寺派
祇園女御妹-後に平忠盛妻

MeCab+CaboCha:
向嶽寺 school
祇園女御 younger sister : later became the wife of taira no tadamori

KyTea+EDA:
kogaku-ji temple school
gion no nyogo younger sister - , later taira no tadamori 's wife

Smaller, more accurate segmentation provides better translations (EDA)

EDA vs. CaboCha/KNP

Input:
大宮学舎旧守衛所
文学部社会学科を設置

MeCab+CaboCha:
former omiya campus . office
department of faculty of letters society was established .

KyTea+EDA:
omiya campus former guard office
department of sociology , faculty of letters was established .

Word-based noun-phrase parsing helps translation (EDA)

EDA vs. CaboCha/KNP

Input: 芳崖と雅邦はともに地方の狩野派系絵師の家の出身であった。

MeCab+CaboCha: hogai and gaho both was from a family of local painters of the kano school .

KyTea+EDA: hogai and gaho from the family of the region of the kano together school series painter .

CaboCha/KNP wins followed no clear pattern. In this case the attachment of ともに differs:
CaboCha: “ともに → 出身”
EDA: “ともに → 地方”

CaboCha vs. KNP

Input:
谷万太郎
1391年-山名氏清
1392年~1394年-畠山基国

JUMAN/KNP:
taro million tani
in 1391 , - the yamana clan
- in 1392 - 1394 hatakeyama ) province

MeCab+CaboCha:
mantaro tani
1391 , : ujikiyo yamana
1392 1394 : motokuni hatakeyama

Most prominent wins for CaboCha were segmentation

Conclusion

● Egret is best for English, and forests are important.

● KyTea+EDA is best for Japanese
  ● At the moment, morphological analysis is more important than parsing?

● Future directions:
  ● Forest-based parser!
  ● Better bunsetsu→word dependency conversion rules

Other Settings

Question:

What other settings have a significant effect on translation results?

Composed Rules

● Combine two minimal rules into larger rules:

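[Figure: the subtree VP2-5 over ご飯 を 食べ た, with PP2-3 (N2, P3) and VP4-5 (V4 食べ, SUF5 た). Left: two minimal rules, "x1 x0" at VP2-5 and "ate" at VP4-5. Right: the single composed rule "ate x0" covering both nodes.]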
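Composition amounts to plugging one rule into a variable of another. The sketch below shows this for the two rules in the figure; it is an assumption-laden illustration, not the extraction code used in the experiments. The nested-tuple rule encoding and the `substitute` and `compose` helpers are hypothetical, and the child rule is assumed to contain no variables of its own, so no renumbering is needed.

```python
# A rule is (source_pattern, target_string). In a pattern, a string such
# as "x1:VP" is a variable; (label, word) is a leaf; (label, [children])
# is an internal node.

def substitute(pattern, var, replacement):
    """Copy `pattern`, replacing the variable named `var` by `replacement`."""
    if isinstance(pattern, str):
        return replacement if pattern.split(":")[0] == var else pattern
    label, body = pattern
    if isinstance(body, str):  # leaf: nothing to replace
        return pattern
    return (label, [substitute(child, var, replacement) for child in body])


def compose(parent, var, child):
    """Plug a variable-free `child` rule into variable `var` of `parent`."""
    (p_src, p_tgt), (c_src, c_tgt) = parent, child
    src = substitute(p_src, var, c_src)
    tgt = " ".join(c_tgt if tok == var else tok for tok in p_tgt.split())
    return src, tgt


# Minimal rules read off the figure (the lexicalized を is an assumption).
parent = (("VP", [("PP", ["x0:N", ("P", "を")]), "x1:VP"]), "x1 x0")
child = (("VP", [("V", "食べ"), ("SUF", "た")]), "ate")

composed = compose(parent, "x1", child)
print(composed[1])  # ate x0
```

The composed rule covers a larger tree fragment, so it carries more context at the cost of matching less often.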

Composed Rules (En-Ja)

● Composed rules are very important

[Figure: bar charts of BLEU and RIBES for PBMT, PBMT+Pre, and composition levels Comp 1, Comp 2, Comp 3, and Comp 4]

Number of Non-Terminals

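[Figure: three panels labelled 0 NT, 1 NT, and 2 NT, showing rules over the subtree VP2-5 (PP2-3 with N2 and P3; VP4-5 with V4 食べ and SUF5 た) that contain increasing numbers of non-terminals; the target sides shown include "ate x0" and "ate x1 x0"]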

Number of Non-Terminals (En-Ja)

● 2 Non-terminals are necessary, but more are harmful

● Why? Are rules with more non-terminals noisier?

[Figure: bar charts of BLEU and RIBES for PBMT, PBMT+Pre, and non-terminal limits NT 1, NT 2, NT 3, and NT 4]

Binarization (En-Ja)

Three bracketings of "the White House" (each aligned to ホワイト ハウス):

None:  NP( the White House )
Right: NP( the NP'( White House ) )
Left:  NP( NP'( the White ) House )

● Right or left much better than none

● In general right > left for En-Ja, left > right for Ja-En
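The two binarization directions can be written as one small function. This is a minimal sketch under assumptions: the `binarize` name, the tuple encoding of nodes, and the convention of labelling intermediate nodes with a trailing apostrophe (as in NP') are choices made for the example, not a description of the tooling used in the experiments.

```python
def binarize(label, children, direction):
    """Binarize an n-ary node; intermediate nodes are labelled label + "'"."""
    if len(children) <= 2:
        return (label, list(children))
    bar = label + "'"
    if direction == "right":
        # Group from the right: NP(a NP'(b NP'(c d)))
        node = (bar, list(children[-2:]))
        for child in reversed(children[1:-2]):
            node = (bar, [child, node])
        return (label, [children[0], node])
    if direction == "left":
        # Group from the left: NP(NP'(NP'(a b) c) d)
        node = (bar, list(children[:2]))
        for child in children[2:-1]:
            node = (bar, [node, child])
        return (label, [node, children[-1]])
    raise ValueError(f"unknown direction: {direction}")


words = ["the", "White", "House"]
print(binarize("NP", words, "right"))  # ('NP', ['the', ("NP'", ['White', 'House'])])
print(binarize("NP", words, "left"))   # ('NP', [("NP'", ['the', 'White']), 'House'])
```

With right binarization, "White House" becomes its own node, so a rule can translate it to ホワイト ハウス as a unit; without binarization no node covers just those two words.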

Tuning

● Two evaluation measures:
  ● BLEU correlated with fluency
  ● RIBES correlated with adequacy

● Tune towards each of these measures with MERT

● It may also be worth optimizing both together [Duh+ 12], so we also tune towards the linear combination (BLEU+RIBES)/2
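The combined objective itself is simple to state in code. The sketch below only shows how the choice of objective can change which candidate weight setting wins; it does not implement MERT's line search. The `combined_score` and `pick_best` names are hypothetical, and the scores are invented placeholders, not results from these experiments.

```python
def combined_score(bleu, ribes):
    """Equal-weight linear combination of the two metrics."""
    return (bleu + ribes) / 2.0


def pick_best(candidates, objective):
    """Return the name of the candidate that maximizes `objective`."""
    return max(candidates, key=lambda name: objective(*candidates[name]))


# Invented (BLEU, RIBES) scores on a tuning set for three weight settings.
candidates = {
    "weights_a": (0.210, 0.640),
    "weights_b": (0.200, 0.680),
    "weights_c": (0.208, 0.675),
}

print(pick_best(candidates, lambda bleu, ribes: bleu))   # weights_a
print(pick_best(candidates, lambda bleu, ribes: ribes))  # weights_b
print(pick_best(candidates, combined_score))             # weights_c
```

In this made-up case the combined objective prefers a setting that is near the top on both metrics rather than the best on either one alone.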

Tuning

[Figure: four bar charts, BLEU and RIBES for En-Ja and BLEU and RIBES for Ja-En, each comparing systems tuned towards BLEU, RIBES, and BLEU+RIBES]

Conclusion

Insights

● How well does tree-to-string work for En-Ja, Ja-En?
  ● As well as phrase-based with pre-ordering [Neubig+ 12]
  ● Forest-to-string translation works better for En-Ja

● Egret worked best for English-Japanese; KyTea+EDA worked best for Japanese-English

● For Ja-En we need:
  ● Better morphological analysis!
  ● Pass multiple morphological analysis results to parsing!
  ● n-best or forest based parser!

Thank You!

