Syntax-Based Models

Philipp Koehn

7 November 2017

Philipp Koehn Machine Translation: Syntax-Based Models 7 November 2017

what is syntax?

Tree-Based Models

• Traditional statistical models operate on sequences of words

• Many translation problems can be best explained by pointing to syntax
  – reordering, e.g., verb movement in German–English translation
  – long distance agreement (e.g., subject-verb) in output

⇒ Translation models based on tree representation of language
  – significant ongoing research
  – state-of-the-art for some language pairs

Dependency Structure

I      like   the       interesting   lecture
PRO    VB     DET       JJ            NN
↓             ↓         ↓             ↓
like          lecture   lecture       like

• Center of a sentence is the verb

• Its dependents are its arguments (e.g., subject noun)

• These may have further dependents (adjective of noun)

Phrase Structure Grammar

• Phrase structure
  – noun phrases: the big man, a house, ...
  – prepositional phrases: at 5 o'clock, in Edinburgh, ...
  – verb phrases: going out of business, eat chicken, ...
  – adjective phrases, ...

• Context-free Grammars (CFG)
  – non-terminal symbols: phrase structure labels, part-of-speech tags
  – terminal symbols: words
  – production rules: NT → [NT,T]+
    example: NP → DET NN

Phrase Structure Grammar

(S (PRP I)
   (VP-A (MD shall)
         (VP-A (VB be)
               (VP-A (VBG passing) (RP on)
                     (PP (TO to) (PRP you))
                     (NP-A (DT some) (NNS comments))))))

Phrase structure grammar tree for an English sentence (as produced by Collins' parser)

syntactic transfer

Synchronous Phrase Structure Grammar

• English rule

  NP → DET JJ NN

• French rule

  NP → DET NN JJ

• Synchronous rule (indices indicate alignment):

  NP → DET₁ NN₂ JJ₃ | DET₁ JJ₃ NN₂

Synchronous Grammar Rules

• Nonterminal rules

  NP → DET₁ NN₂ JJ₃ | DET₁ JJ₃ NN₂

• Terminal rules

  N → maison | house

  NP → la maison bleue | the blue house

• Mixed rules

  NP → la maison JJ₁ | the JJ₁ house
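A minimal sketch of how the synchronous rules above might be represented and applied on the target side. The class names `NT` and `SyncRule` and the function `apply_rule` are hypothetical illustrations, not part of the lecture or of any toolkit; fillers for the nonterminal slots are assumed to be already translated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NT:
    """A nonterminal slot; `index` links its source and target occurrences."""
    label: str
    index: int


@dataclass(frozen=True)
class SyncRule:
    """A synchronous rule: one left-hand side, a source side and a target side."""
    lhs: str
    source: tuple  # mix of terminal strings and NT slots
    target: tuple


def apply_rule(rule, fillers):
    """Build the target string of a rule.

    `fillers` maps a slot index to the list of target words that the
    sub-derivation for that slot produced.
    """
    out = []
    for sym in rule.target:
        if isinstance(sym, NT):
            out.extend(fillers[sym.index])
        else:
            out.append(sym)
    return out


# Mixed rule from the slide: NP -> la maison JJ1 | the JJ1 house
mixed = SyncRule("NP", ("la", "maison", NT("JJ", 1)), ("the", NT("JJ", 1), "house"))

# Nonterminal rule from the slide: NP -> DET1 NN2 JJ3 | DET1 JJ3 NN2
reorder = SyncRule(
    "NP",
    (NT("DET", 1), NT("NN", 2), NT("JJ", 3)),
    (NT("DET", 1), NT("JJ", 3), NT("NN", 2)),
)

print(apply_rule(mixed, {1: ["blue"]}))
print(apply_rule(reorder, {1: ["the"], 2: ["house"], 3: ["blue"]}))
```

Both calls yield the same English word order; the reordering is carried entirely by the position of the indexed slots on the target side.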

Tree-Based Translation Model

• Translation by parsing
  – synchronous grammar has to parse entire input sentence
  – output tree is generated at the same time
  – process is broken up into a number of rule applications

• Translation probability

$$\text{SCORE}(\text{TREE}, E, F) = \prod_i \text{RULE}_i$$

• Many ways to assign probabilities to rules
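As a small illustration of the product over rule applications, the sketch below scores one derivation in log space, which is a common way to avoid numerical underflow for long derivations. The function name and the three probabilities are made up for the example.

```python
import math


def derivation_log_score(rule_probs):
    """Log of the product of the rule probabilities used in one derivation."""
    return sum(math.log(p) for p in rule_probs)


# Hypothetical probabilities of three rule applications in a single derivation.
probs = [0.5, 0.2, 0.1]
log_score = derivation_log_score(probs)
score = math.exp(log_score)  # back to the product form used on the slide
print(log_score, score)
```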

Aligned Tree Pair

English: (S (PRP I) (VP-A (MD shall) (VP-A (VB be) (VP-A (VBG passing) (RP on) (PP (TO to) (PRP you)) (NP-A (DT some) (NNS comments))))))

German: (S (PPER Ich) (VP (VAFIN werde) (VP (PPER Ihnen) (NP (ART die) (ADJ entsprechenden) (NN Anmerkungen)) (VVFIN aushändigen))))

[Figure: word alignment links between the leaves of the two trees]

Phrase structure grammar trees with word alignment (German–English sentence pair)

Reordering Rule

• Subtree alignment

  (VP (PPER ...) (NP ...) (VVFIN aushändigen)) ↔ (VP (VBG passing) (RP on) (PP ...) (NP ...))

• Synchronous grammar rule

  VP → PPER₁ NP₂ aushändigen | passing on PP₁ NP₂

• Note:
  – one word aushändigen mapped to two words passing on: ok
  – but: fully non-terminal rule not possible (one-to-one mapping constraint for nonterminals)

Another Rule

• Subtree alignment

  (PRO Ihnen) ↔ (PP (TO to) (PRP you))

• Synchronous grammar rule (stripping out English internal structure)

  PRO/PP → Ihnen | to you

• Rule with internal structure

  PRO/PP → Ihnen | (TO to) (PRP you)

Another Rule

• Translation of German werde to English shall be

  (VP (VAFIN werde) (VP ...)) ↔ (VP (MD shall) (VP (VB be) (VP ...)))

• Translation rule needs to include mapping of VP

⇒ Complex rule

  VP → (VAFIN werde) VP₁ | (MD shall) (VP (VB be) VP₁)

Internal Structure

• Stripping out internal structure

  VP → werde VP₁ | shall be VP₁

  ⇒ synchronous context free grammar

• Maintaining internal structure

  VP → (VAFIN werde) VP₁ | (MD shall) (VP (VB be) VP₁)

  ⇒ synchronous tree substitution grammar

learning

Learning Synchronous Grammars

• Extracting rules from a word-aligned parallel corpus

• First: Hierarchical phrase-based model
  – only one non-terminal symbol X
  – no linguistic syntax, just a formally syntactic model

• Then: Synchronous phrase structure model
  – non-terminals for words and phrases: NP, VP, PP, ADJ, ...
  – corpus must also be parsed with syntactic parser

Extracting Phrase Translation Rules

[Figure: word alignment matrix for "I shall be passing on to you some comments" and "Ich werde Ihnen die entsprechenden Anmerkungen aushändigen"]

shall be = werde

Extracting Phrase Translation Rules

[Figure: word alignment matrix for "I shall be passing on to you some comments" and "Ich werde Ihnen die entsprechenden Anmerkungen aushändigen"]

some comments = die entsprechenden Anmerkungen

Extracting Phrase Translation Rules

[Figure: word alignment matrix for "I shall be passing on to you some comments" and "Ich werde Ihnen die entsprechenden Anmerkungen aushändigen"]

werde Ihnen die entsprechenden Anmerkungen aushändigen = shall be passing on to you some comments

Extracting Hierarchical Phrase Translation Rules

[Figure: word alignment matrix for "I shall be passing on to you some comments" and "Ich werde Ihnen die entsprechenden Anmerkungen aushändigen", with a sub-phrase marked for subtraction]

werde X aushändigen = shall be passing on X

(subtracting subphrase)

Formal Definition

• Recall: consistent phrase pairs

$$
\begin{aligned}
(e, f) \text{ consistent with } A \Leftrightarrow\;
& \forall e_i \in e : (e_i, f_j) \in A \rightarrow f_j \in f \\
\text{AND } & \forall f_j \in f : (e_i, f_j) \in A \rightarrow e_i \in e \\
\text{AND } & \exists e_i \in e, f_j \in f : (e_i, f_j) \in A
\end{aligned}
$$

• Let P be the set of all extracted phrase pairs (e, f)
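The three consistency conditions can be checked with a single pass over the alignment points. The sketch below is one possible reading, with phrases given as inclusive word-index spans; the function name `is_consistent` is hypothetical, and the alignment set is an assumption read off the phrase pairs shown on the slides (entsprechenden is taken to be unaligned).

```python
def is_consistent(e_span, f_span, alignment):
    """Check whether a phrase pair is consistent with a word alignment.

    e_span, f_span: inclusive (first, last) word indices of the two phrases.
    alignment: set of (i, j) pairs linking English word i to foreign word j.
    """
    e_lo, e_hi = e_span
    f_lo, f_hi = f_span
    has_link = False
    for i, j in alignment:
        e_in = e_lo <= i <= e_hi
        f_in = f_lo <= j <= f_hi
        if e_in != f_in:
            # A link leaves the phrase pair on one side only: violates
            # the first or the second condition.
            return False
        if e_in:
            has_link = True
    # Third condition: at least one alignment point inside the pair.
    return has_link


# English: I(0) shall(1) be(2) passing(3) on(4) to(5) you(6) some(7) comments(8)
# German:  Ich(0) werde(1) Ihnen(2) die(3) entsprechenden(4) Anmerkungen(5) aushändigen(6)
# Assumed alignment, inferred from the phrase pairs on the slides.
A = {(0, 0), (1, 1), (2, 1), (3, 6), (4, 6), (5, 2), (6, 2), (7, 3), (8, 5)}

print(is_consistent((1, 2), (1, 1), A))  # shall be = werde
print(is_consistent((1, 1), (1, 1), A))  # shall = werde (be is also linked to werde)
```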

Formal Definition

• Extend recursively:

$$
\begin{aligned}
\text{if } & (e, f) \in P \text{ AND } (e_{\text{SUB}}, f_{\text{SUB}}) \in P \\
\text{AND } & e = e_{\text{PRE}} + e_{\text{SUB}} + e_{\text{POST}} \\
\text{AND } & f = f_{\text{PRE}} + f_{\text{SUB}} + f_{\text{POST}} \\
\text{AND } & e \neq e_{\text{SUB}} \text{ AND } f \neq f_{\text{SUB}} \\
\text{add } & (e_{\text{PRE}} + X + e_{\text{POST}},\; f_{\text{PRE}} + X + f_{\text{POST}}) \text{ to } P
\end{aligned}
$$

(note: any of $e_{\text{PRE}}$, $e_{\text{POST}}$, $f_{\text{PRE}}$, or $f_{\text{POST}}$ may be empty)

• Set of hierarchical phrase pairs is the closure under this extension mechanism
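One extension step can be sketched as a list operation: cut the sub-phrase out of both sides and put X in its place. The helper `subtract_subphrase` is a hypothetical name; it assumes the caller has already verified that both the outer pair and the sub-phrase pair are in P, and it only enforces the two inequality conditions.

```python
def subtract_subphrase(e, f, e_sub, f_sub):
    """Replace an aligned sub-phrase by the nonterminal X on both sides.

    e, f: token lists of a phrase pair assumed to be in P.
    e_sub, f_sub: (start, end) spans with exclusive end, locating a
    sub-phrase pair that is also assumed to be in P.
    """
    (es, ee), (fs, fe) = e_sub, f_sub
    if (es, ee) == (0, len(e)) or (fs, fe) == (0, len(f)):
        # Corresponds to the conditions e != e_SUB and f != f_SUB.
        raise ValueError("sub-phrase must be a proper part of the phrase")
    new_e = e[:es] + ["X"] + e[ee:]
    new_f = f[:fs] + ["X"] + f[fe:]
    return new_e, new_f


e = "shall be passing on to you some comments".split()
f = "werde Ihnen die entsprechenden Anmerkungen aushändigen".split()

# Subtract "to you some comments" = "Ihnen die entsprechenden Anmerkungen".
new_e, new_f = subtract_subphrase(e, f, (4, 8), (1, 5))
print(" ".join(new_f), "=", " ".join(new_e))
```

Applying the step repeatedly to its own output, with a fresh nonterminal index each time, gives rules with more than one gap.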

Comments

• Removal of multiple sub-phrases leads to rules with multiple non-terminals, such as:

  Y → X₁ X₂ | X₂ of X₁

• Typical restrictions to limit complexity (Chiang, 2005)
  – at most 2 nonterminal symbols
  – no neighboring non-terminals on the source side
  – at least 1 but at most 5 words per language
  – span at most 15 words (counting gaps)
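The four restrictions can be expressed as a simple filter over candidate rules. This is a sketch under stated assumptions: nonterminals are written as tokens such as `X1`, the span limit is applied to the source-side span of the training example (the slide does not say which side), and the function and parameter names are made up.

```python
import re

# Assumed convention for this sketch: nonterminals look like X1, X2, ...
NONTERMINAL = re.compile(r"X\d+$")


def passes_restrictions(source, target, source_span_len,
                        max_nt=2, max_words=5, max_span=15):
    """Return True if a hierarchical rule satisfies the listed restrictions."""
    src_is_nt = [bool(NONTERMINAL.match(tok)) for tok in source]
    # At most 2 nonterminal symbols.
    if sum(src_is_nt) > max_nt:
        return False
    # No neighboring nonterminals on the source side.
    if any(a and b for a, b in zip(src_is_nt, src_is_nt[1:])):
        return False
    # At least 1 but at most 5 words (terminals) per language.
    for side in (source, target):
        words = sum(1 for tok in side if not NONTERMINAL.match(tok))
        if not 1 <= words <= max_words:
            return False
    # Span of at most 15 words, counting the words covered by gaps.
    return source_span_len <= max_span


ok = passes_restrictions(
    ["werde", "X1", "aushändigen"],
    ["shall", "be", "passing", "on", "X1"],
    source_span_len=6,
)
adjacent = passes_restrictions(["X1", "X2"], ["X2", "of", "X1"], source_span_len=4)
print(ok, adjacent)
```

Under this reading, a rule whose source side is only X1 X2 is rejected twice over: the nonterminals are adjacent and the source side has no words.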

Learning Syntactic Translation Rules

[Figure: word alignment matrix for "I shall be passing on to you some comments" and "Ich werde Ihnen die entsprechenden Anmerkungen aushändigen", annotated with the English and German parse trees]

(PRO Ihnen) = (PP (TO to) (PRP you))

Constraints on Syntactic Rules

• Same word alignment constraints as hierarchical models

• Hierarchical: rule can cover any span ⇔ syntactic rules must cover constituents in the tree

• Hierarchical: gaps may cover any span ⇔ gaps must cover constituents in the tree

• Far fewer rules are extracted (all things being equal)

Impossible Rules

[Figure: word alignment matrix for "I shall be passing on to you some comments" and "Ich werde Ihnen die entsprechenden Anmerkungen aushändigen", annotated with the English and German parse trees]

English span not a constituent: no rule extracted

Rules with Context

[Figure: word alignment matrix for "I shall be passing on to you some comments" and "Ich werde Ihnen die entsprechenden Anmerkungen aushändigen", annotated with the English and German parse trees]

Rule with this phrase pair requires syntactic context

(VP (VAFIN werde) (VP ...)) = (VP (MD shall) (VP (VB be) (VP ...)))

Too Many Rules Extractable

• Huge number of rules can be extracted (every alignable node may or may not be part of a rule → exponential number of rules)

• Need to limit which rules to extract

• Option 1: similar restriction as for hierarchical model (maximum span size, maximum number of terminals and non-terminals, etc.)

• Option 2: only extract minimal rules ("GHKM" rules)

refinements

Minimal Rules

[Figure: English parse tree (constituents NP, PP, VP, VP, VP, S) with word alignment to the German sentence]

I shall be passing on to you some comments
PRP MD VB VBG RP TO PRP DT NNS
Ich werde Ihnen die entsprechenden Anmerkungen aushändigen

Extract: set of smallest rules required to explain the sentence pair

Lexical Rule

[Figure: English parse tree (constituents NP, PP, VP, VP, VP, S) with word alignment to the German sentence]

I shall be passing on to you some comments
PRP MD VB VBG RP TO PRP DT NNS
Ich werde Ihnen die entsprechenden Anmerkungen aushändigen

Extracted rule: PRP → Ich | I

Lexical Rule

[Figure: English parse tree (constituents NP, PP, VP, VP, VP, S) with word alignment to the German sentence]

I shall be passing on to you some comments
PRP MD VB VBG RP TO PRP DT NNS
Ich werde Ihnen die entsprechenden Anmerkungen aushändigen

Extracted rule: PRP → Ihnen | you

Lexical Rule

[Figure: English parse tree (constituents NP, PP, VP, VP, VP, S) with word alignment to the German sentence]

I shall be passing on to you some comments
PRP MD VB VBG RP TO PRP DT NNS
Ich werde Ihnen die entsprechenden Anmerkungen aushändigen

Extracted rule: DT → die | some

Lexical Rule

[Figure: English parse tree (constituents NP, PP, VP, VP, VP, S) with word alignment to the German sentence]

I shall be passing on to you some comments
PRP MD VB VBG RP TO PRP DT NNS
Ich werde Ihnen die entsprechenden Anmerkungen aushändigen

Extracted rule: NNS → Anmerkungen | comments

Insertion Rule

[Figure: English parse tree (constituents NP, PP, VP, VP, VP, S) with word alignment to the German sentence]

I shall be passing on to you some comments
PRP MD VB VBG RP TO PRP DT NNS
Ich werde Ihnen die entsprechenden Anmerkungen aushändigen

Extracted rule: PP → X | to PRP

Non-Lexical Rule

[Figure: English parse tree (constituents NP, PP, VP, VP, VP, S) with word alignment to the German sentence]

I shall be passing on to you some comments
PRP MD VB VBG RP TO PRP DT NNS
Ich werde Ihnen die entsprechenden Anmerkungen aushändigen

Extracted rule: NP → X₁ X₂ | DT₁ NNS₂

Lexical Rule with Syntactic Context

[Figure: English parse tree (constituents NP, PP, VP, VP, VP, S) with word alignment to the German sentence]

I shall be passing on to you some comments
PRP MD VB VBG RP TO PRP DT NNS
Ich werde Ihnen die entsprechenden Anmerkungen aushändigen

Extracted rule: VP → X₁ X₂ aushändigen | passing on PP₁ NP₂

Lexical Rule with Syntactic Context

[Figure: English parse tree (constituents NP, PP, VP, VP, VP, S) with word alignment to the German sentence]

I shall be passing on to you some comments
PRP MD VB VBG RP TO PRP DT NNS
Ich werde Ihnen die entsprechenden Anmerkungen aushändigen

Extracted rule: VP → werde X | shall be VP (ignoring internal structure)

Non-Lexical Rule

[Figure: English parse tree (constituents NP, PP, VP, VP, VP, S) with word alignment to the German sentence]

I shall be passing on to you some comments
PRP MD VB VBG RP TO PRP DT NNS
Ich werde Ihnen die entsprechenden Anmerkungen aushändigen

Extracted rule: S → X₁ X₂ | PRP₁ VP₂

DONE. Note: one rule per alignable constituent

Unaligned Source Words

[Figure: English parse tree (constituents NP, PP, VP, VP, VP, S) with word alignment to the German sentence]

I shall be passing on to you some comments
PRP MD VB VBG RP TO PRP DT NNS
Ich werde Ihnen die entsprechenden Anmerkungen aushändigen

Attach to neighboring words or higher nodes → additional rules

Too Few Phrasal Rules?

• Lexical rules will be 1-to-1 mappings (unless word alignment requires otherwise)

• But: phrasal rules very beneficial in phrase-based models

• Solutions
  – combine rules that contain a maximum number of symbols (as in hierarchical models, recall: "Option 1")
  – compose minimal rules to cover a maximum number of non-leaf nodes

Composed Rules

• Current rules

  X₁ X₂ = (NP DT₁ NNS₂)

  die = (DT some)

  entsprechenden Anmerkungen = (NNS comments)

• Composed rule

  die entsprechenden Anmerkungen = (NP (DT some) (NNS comments))

  (1 non-leaf node: NP)

Composed Rules

• Minimal rule:

  X₁ X₂ aushändigen = (VP (VBG passing) (RP on) PP₁ NP₂)

  3 non-leaf nodes: VP, PP, NP

• Composed rule:

  Ihnen X₁ aushändigen = (VP (VBG passing) (RP on) (PP (TO to) (PRP you)) NP₁)

  3 non-leaf nodes: VP, PP and NP

Relaxing Tree Constraints

• Impossible rule

  (X werde) = (MD shall) (VB be)

• Create new non-terminal label: MD+VB

⇒ New rule

  (X werde) = (MD+VB (MD shall) (VB be))

Zollmann Venugopal Relaxation

• If span consists of two constituents, join them: X+Y

• If span consists of three constituents, join them: X+Y+Z

• If span covers constituents with the same parent X and includes
  – every child but the first child Y, label as X\Y
  – every child but the last child Y, label as X/Y

• For all other cases, label as FAIL

⇒ More rules can be extracted, but number of non-terminals blows up
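A sketch of the labeling scheme on tuple-encoded trees, where a node is `(label, child, ...)` and a word is a plain string. The function names are hypothetical. Two points are assumptions of this sketch rather than statements from the slide: a span that is already a constituent keeps its own label, and the cases are tried in the order in which they are listed.

```python
def index_tree(tree):
    """Return (spans, kids) for a tuple tree.

    spans maps a (start, end) word span to the label of the constituent
    covering it; kids lists, per node, its label and its children's spans.
    """
    spans, kids = {}, []

    def visit(node, start):
        if isinstance(node, str):  # a word covers one position
            return start + 1
        label, *children = node
        pos, child_spans = start, []
        for child in children:
            end = visit(child, pos)
            if not isinstance(child, str):
                child_spans.append((pos, end))
            pos = end
        spans[(start, pos)] = label
        kids.append((label, child_spans))
        return pos

    visit(tree, 0)
    return spans, kids


def label_span(span, spans, kids):
    """Assign a (possibly composite) label to an arbitrary word span."""
    if span in spans:  # assumed: constituents keep their label
        return spans[span]
    start, end = span
    for mid in range(start + 1, end):  # two constituents: X+Y
        if (start, mid) in spans and (mid, end) in spans:
            return spans[(start, mid)] + "+" + spans[(mid, end)]
    for m1 in range(start + 1, end):  # three constituents: X+Y+Z
        for m2 in range(m1 + 1, end):
            parts = [(start, m1), (m1, m2), (m2, end)]
            if all(p in spans for p in parts):
                return "+".join(spans[p] for p in parts)
    for parent, cs in kids:  # all children but the first or the last
        if len(cs) < 3:
            continue
        if cs[1][0] == start and cs[-1][1] == end:
            return parent + "\\" + spans[cs[0]]
        if cs[0][0] == start and cs[-2][1] == end:
            return parent + "/" + spans[cs[-1]]
    return "FAIL"


flat_np = ("NP", ("DT", "the"), ("NNP", "Israeli"), ("NNP", "Prime"),
           ("NNP", "Minister"), ("NNP", "Sharon"))
spans, kids = index_tree(flat_np)
print(label_span((1, 3), spans, kids))  # Israeli Prime
print(label_span((1, 5), spans, kids))  # Israeli Prime Minister Sharon
print(label_span((0, 4), spans, kids))  # the Israeli Prime Minister
```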

Special Problem: Flat Structures

• Flat structures severely limit rule extraction

  (NP (DT the) (NNP Israeli) (NNP Prime) (NNP Minister) (NNP Sharon))

• Can only extract rules for individual words or entire phrase

Relaxation by Tree Binarization

(NP (DT the)
    (NP (NNP Israeli)
        (NP (NNP Prime)
            (NP (NNP Minister) (NNP Sharon)))))

More rules can be extracted

Left-binarization or right-binarization?
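The binarized tree shown for the flat NP can be produced by repeatedly folding the two rightmost children under a new node. The sketch below does this for tuple-encoded trees; reusing the parent label for the new nodes follows the figure, while the function name is hypothetical. Folding the two leftmost children instead would give the left-binarized variant.

```python
def binarize_right(tree):
    """Right-binarize a (label, child, ...) tuple tree; words are strings."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    children = [binarize_right(child) for child in children]
    while len(children) > 2:
        # Fold the last two children under a new node with the parent label.
        children = children[:-2] + [(label, children[-2], children[-1])]
    return (label, *children)


flat_np = ("NP", ("DT", "the"), ("NNP", "Israeli"), ("NNP", "Prime"),
           ("NNP", "Minister"), ("NNP", "Sharon"))
binary_np = binarize_right(flat_np)
print(binary_np)
```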

Scoring Translation Rules

• Extract all rules from corpus

• Score based on counts
  – joint rule probability: $p(\text{LHS}, \text{RHS}_f, \text{RHS}_e)$
  – rule application probability: $p(\text{RHS}_f, \text{RHS}_e \mid \text{LHS})$
  – direct translation probability: $p(\text{RHS}_e \mid \text{RHS}_f, \text{LHS})$
  – noisy channel translation probability: $p(\text{RHS}_f \mid \text{RHS}_e, \text{LHS})$
  – lexical translation probability: $\prod_{e_i \in \text{RHS}_e} p(e_i \mid \text{RHS}_f, a)$
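One plausible way to turn counts into the first four scores is relative frequency estimation, sketched below. That estimator, the function name `score_rules`, and the toy rule instances are all assumptions for illustration; the lexical translation probability is left out because it additionally needs the word alignment and a lexical translation table.

```python
from collections import Counter


def score_rules(rule_instances):
    """Relative-frequency scores from extracted (lhs, rhs_f, rhs_e) instances."""
    rules = list(rule_instances)
    joint = Counter(rules)
    total = len(rules)
    by_lhs = Counter(lhs for lhs, _, _ in rules)
    by_lhs_f = Counter((lhs, f) for lhs, f, _ in rules)
    by_lhs_e = Counter((lhs, e) for lhs, _, e in rules)
    scores = {}
    for (lhs, f, e), count in joint.items():
        scores[(lhs, f, e)] = {
            "joint": count / total,                       # p(LHS, RHS_f, RHS_e)
            "application": count / by_lhs[lhs],           # p(RHS_f, RHS_e | LHS)
            "direct": count / by_lhs_f[(lhs, f)],         # p(RHS_e | RHS_f, LHS)
            "noisy_channel": count / by_lhs_e[(lhs, e)],  # p(RHS_f | RHS_e, LHS)
        }
    return scores


# Hypothetical extraction result: each tuple is one extracted rule instance.
instances = (
    [("NP", "la maison", "the house")] * 3
    + [("NP", "la maison", "the home")]
    + [("N", "maison", "house")] * 2
)
scores = score_rules(instances)
print(scores[("NP", "la maison", "the house")])
```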

next lecture: decoding

