Machine Translation Quality - Are We There Yet? - Olga Beregovaya (Welocalize)

MT Quality – LSP Perspective

Olga Beregovaya

VP, Technology Solutions

• What do translators appreciate?

• What do translators struggle most with?

• Engineering – impact on quality?

• Final output quality?

In Translators’ Own Words

THE POST-EDITOR PRODUCES: Publishable quality

The post-editor is responsible for ensuring that client quality requirements and the style guide are met

The post-editor is expected to adhere to the client's style guide preferences with regard to:

Infinitive / Imperative

Passive / Active

Formal / Informal

Different Styles for Headers, Lists, Tables

Special Handling of UI Options (Bilingual, English, Target?)

Converting All the Measurements Based On the Local Conventions

+ Disambiguate Terminology

+ Correct all the grammatical errors

But does the machine produce sufficient output?

THE POST-EDITOR RECEIVES:

ERROR TYPE                        GERMAN    FRENCH  JAPANESE   RUSSIAN   CHINESE   SPANISH   ITALIAN BRAZILIAN
WRONG TERMINOLOGY                   6.46      4.93     13.63      5.00      6.20      9.63      3.78      1.13
WRONG SPELLING                      2.00      0.86      0.88      0.13      0.30      1.13      0.56      1.27
SOURCE NOT TRANSLATED               6.38      5.36      3.88      5.13      3.60      2.50      1.22      1.73
COMPLIANCE WITH CLIENT SPECS        2.46      0.86      3.00      2.13      0.70      0.63      0.44      2.60
LITERAL TRANSLATION                 7.85      8.64      5.00      4.00      9.40      5.38      7.67      7.93
TEXT/INFO ADDED                     2.69      1.36      2.13      1.25      0.80      1.88      0.44      0.80
CAPITALIZATION                      2.69      3.43      0.00      2.63      0.50      1.75      3.33      2.60
WRONG WORD FORM                     6.77      7.79      0.13      9.88      0.60      6.75      3.67      6.75
WRONG PART OF SPEECH                2.62      3.21      2.00      1.88      0.60      2.13      3.67      1.33
PUNCTUATION                         4.46      3.00      0.75      3.38      4.10      2.13      1.22      3.53
SENTENCE STRUCTURE                 12.54     10.00     14.25      8.00     13.00      5.38      6.11      3.67
TAGS + MARK-UP                      1.23      0.14      0.13      0.50      0.20      0.38      0.44      0.20
LOCALE ADAPTATION                   0.46      0.29      0.75      0.63      0.20      0.75      0.44      0.13
SPACING                             0.92      0.36      2.25      1.25      4.00      0.50      0.33      0.40
OTHER                               1.92      1.50      1.88      0.13      0.50      0.13      1.44      0.27
TOTAL ERRORS                       61.46     51.71     50.63     45.88     44.70     41.00     34.78     32.53

The most time-consuming issues that translators need to fix are:

• Sentence structure (word order)

• MT output too literal

• Wrong terminology

• Word form disagreements

• Source term left untranslated
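The slides do not say how this shortlist was derived, nor what unit the table figures are in. As a rough cross-check, the sketch below assumes an unweighted mean across the eight languages and ranks the error categories by it. The figures are copied from the error table; the function name `rank_by_mean` is illustrative only. Under that assumption, the five highest averages are the same five issues listed above.

```python
# Error figures per category, copied from the error table, in the order:
# German, French, Japanese, Russian, Chinese, Spanish, Italian, Brazilian.
ERRORS = {
    "Wrong terminology":     [6.46, 4.93, 13.63, 5.00, 6.20, 9.63, 3.78, 1.13],
    "Wrong spelling":        [2.00, 0.86, 0.88, 0.13, 0.30, 1.13, 0.56, 1.27],
    "Source not translated": [6.38, 5.36, 3.88, 5.13, 3.60, 2.50, 1.22, 1.73],
    "Compliance with specs": [2.46, 0.86, 3.00, 2.13, 0.70, 0.63, 0.44, 2.60],
    "Literal translation":   [7.85, 8.64, 5.00, 4.00, 9.40, 5.38, 7.67, 7.93],
    "Text/info added":       [2.69, 1.36, 2.13, 1.25, 0.80, 1.88, 0.44, 0.80],
    "Capitalization":        [2.69, 3.43, 0.00, 2.63, 0.50, 1.75, 3.33, 2.60],
    "Wrong word form":       [6.77, 7.79, 0.13, 9.88, 0.60, 6.75, 3.67, 6.75],
    "Wrong part of speech":  [2.62, 3.21, 2.00, 1.88, 0.60, 2.13, 3.67, 1.33],
    "Punctuation":           [4.46, 3.00, 0.75, 3.38, 4.10, 2.13, 1.22, 3.53],
    "Sentence structure":    [12.54, 10.00, 14.25, 8.00, 13.00, 5.38, 6.11, 3.67],
    "Tags + mark-up":        [1.23, 0.14, 0.13, 0.50, 0.20, 0.38, 0.44, 0.20],
    "Locale adaptation":     [0.46, 0.29, 0.75, 0.63, 0.20, 0.75, 0.44, 0.13],
    "Spacing":               [0.92, 0.36, 2.25, 1.25, 4.00, 0.50, 0.33, 0.40],
    "Other":                 [1.92, 1.50, 1.88, 0.13, 0.50, 0.13, 1.44, 0.27],
}


def rank_by_mean(errors):
    """Return (category, mean) pairs sorted from highest to lowest mean."""
    means = {name: sum(values) / len(values) for name, values in errors.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Print the five categories with the highest cross-language average.
for name, mean in rank_by_mean(ERRORS)[:5]:
    print(f"{name:<22} {mean:5.2f}")
```

An unweighted mean treats every language as equally important; a program with uneven volumes per language would probably want to weight by word count instead.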

OR, IN A NUTSHELL…

The Translator Gains…

Productivity gains ranging from 56% to negative

- Content type

- Engine output quality

- How fast is HT (and how much MT helps)

- Correlation?
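The slides do not define how the productivity gain is measured. A common convention, assumed here, is the relative change in throughput between post-editing (PE) and human translation (HT) from scratch. The function name and the words-per-hour figures below are hypothetical and chosen only to illustrate both ends of the quoted range.

```python
def productivity_gain(ht_words_per_hour, pe_words_per_hour):
    """Relative throughput change of post-editing versus human translation.

    Positive values mean post-editing is faster; negative values mean the
    translator would have been faster translating from scratch.
    """
    if ht_words_per_hour <= 0:
        raise ValueError("HT throughput must be positive")
    return (pe_words_per_hour - ht_words_per_hour) / ht_words_per_hour


# Hypothetical throughputs, not measurements from the presentation.
print(f"{productivity_gain(500, 780):+.0%}")  # good MT output speeds the translator up
print(f"{productivity_gain(500, 450):+.0%}")  # poor MT output slows the translator down
```

Because the gain is relative to the HT baseline, a translator who is already very fast without MT has less room for improvement, which is one reason the same engine can yield very different gains.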

TOP 6 ON THE TRANSLATORS’ LOVE-IT LIST

1. Source of inspiration: reduces thinking and translation choice time

2. Provides reference - very useful to translators new to a specific domain

3. Reduces typing & lookup time by handling repetitive terminology and structures well

4. …thereby takes away the more monotonous efforts of translation

5. Post-editors notice improvements over time and appreciate MT more if they ‘co-own’ the engine

6. MT output can be funny

LOL!

TOP 3 ON THE TRANSLATORS’ S*#!T-LIST

1. Wrong sentence structure

• Major impact on the post-editing effort (Spanish and Portuguese produce fewest errors)

• Japanese has the highest error rate and the lowest productivity gains (supported by the cognitive effort error ranking research)

2. Wrong and inconsistent terminology

• Very time-consuming to check and fix terminology, and fuzzy matches already cause enough issues

• A major problem for new products where the terminology is not settled yet

• Inconsistent output for UI references

3. Correcting MT to an agreed standard (= quality expectations)

• A challenging concept for post-editors in the beginning: they think they should edit less if the quality is bad

FEEDBACK LOOP – Essential!

SOURCE TEXT: Single-phase options range from 1.4kW to 7.7kW while three-phase PDUs, packed with output receptacles, range from 8.6kW to 21.6kW.
MT OUTPUT: Single-fase 7.7kW Opties variëren van 1.4kW om en driefasige PDU's, boordevol Output-aansluitingen, variëren van 8,6 kW tot 21.6kW.
POST-EDITED OUTPUT: 1,4 kW ... 7,7 kW ... 21,6 kW
SPECIFIC ERRORS/CHANGES MADE: Numbers and measurement units are not converted properly and no spaces inserted by MT engine (3 out of 4 occurrences, 1 is correct however, strange...)

SOURCE TEXT: Single-phase options range from 1.4kW to 7.7kW while three-phase PDUs, packed with output receptacles, range from 8.6kW to 21.6kW.
MT OUTPUT: • Biedt maximaal 24 TB <fmt id="1" tooltip="SUPERSCRIPT" endtooltip="SUPERSCRIPT"> 2 </fmt> maximale capaciteit per-uitbreidingsbehuizing toe te voegen.
POST-EDITED OUTPUT: • Biedt een maximale capaciteit van 24 TB<fmt id="1" tooltip="SUPERSCRIPT" endtooltip="SUPERSCRIPT">2</fmt> per uitbreidingsbehuizing.
SPECIFIC ERRORS/CHANGES MADE: No space should be inserted in front of and behind a number in superscript (in this case a "2"). ...>2<... and not: > 2 <

SOURCE TEXT: <fmt id="1" tooltip="b" endtooltip="b">Interface Speed:</fmt> 6 Gb/s SAS
MT OUTPUT: <fmt id="1" tooltip="b" endtooltip="b"> Interfacesnelheid: 6 </fmt> Gb/s SAS
POST-EDITED OUTPUT: • Biedt een maximale capaciteit van 24 TB<fmt id="1" tooltip="SUPERSCRIPT" endtooltip="SUPERSCRIPT">2</fmt> per uitbreidingsbehuizing.
SPECIFIC ERRORS/CHANGES MADE: The number is inserted before the tag and should be after the tag

SOURCE TEXT: <fmt id="1" tooltip="b" endtooltip="b">Intermixed Drive Capacities:</fmt> Yes
MT OUTPUT: <fmt id="1" tooltip="b" endtooltip="b"> Intermixed Capaciteit van de schijven: Ja </fmt>
POST-EDITED OUTPUT: ...</fmt> Ja
SPECIFIC ERRORS/CHANGES MADE: The string is inserted before the tag and should be after the tag (and again spacing before and after tags inserted)

SOURCE TEXT: A new feature — DR Rapid Data Access — adds tighter integration with backup software applications, starting with Symantec OpenStorage-enabled backup applications.
MT OUTPUT: Een nieuwe functie - DR-Rapid Data Access - voegt strakkere integratie met back-uptoepassingen, beginnend met Symantec OpenStorage geschikte back-uptoepassingen.
POST-EDITED OUTPUT: ... — DR Rapid Data Access — ...
SPECIFIC ERRORS/CHANGES MADE: Please ensure any special characters like — (ChrW(151)) are preserved when inserting a TM proposal, and not replaced by a normal hyphen (ChrW(45)).

Can these errors be learned and corrected automatically? Can we simplify or omit the “feedback loop”?
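Some of the surface errors in the feedback examples look mechanical enough for rule-based clean-up. The sketch below is one possible illustration, not a method described in the presentation: the function names, the restriction to the `kW` unit and the `fmt` tag, and the dash heuristic are all assumptions made for the example. It covers Dutch decimal commas and unit spacing, stray spaces just inside `fmt` tags, and em dashes that were flattened to hyphens.

```python
import re

EM_DASH = "\u2014"


def localize_kw_nl(text):
    """Use a decimal comma and a space before 'kW' (Dutch convention)."""
    return re.sub(r"(\d+)[.,](\d+)\s*kW\b", r"\1,\2 kW", text)


def tighten_fmt_spacing(text):
    """Remove whitespace directly after an opening and before a closing fmt tag."""
    text = re.sub(r"(<fmt\b[^>]*>)\s+", r"\1", text)
    return re.sub(r"\s+(</fmt>)", r"\1", text)


def restore_em_dashes(source, target):
    """Put em dashes back when the target has the same number of spaced hyphens.

    Heuristic: only acts when the counts match exactly and the target has no
    em dash yet, so ordinary hyphenated words are left alone.
    """
    expected = source.count(EM_DASH)
    if expected and EM_DASH not in target and target.count(" - ") == expected:
        return target.replace(" - ", f" {EM_DASH} ")
    return target


# Fragments taken from the MT output column of the feedback examples.
print(localize_kw_nl("variëren van 8,6 kW tot 21.6kW"))
print(tighten_fmt_spacing('24 TB <fmt id="1"> 2 </fmt> maximale capaciteit'))
print(restore_em_dashes(
    f"A new feature {EM_DASH} DR Rapid Data Access {EM_DASH} adds",
    "Een nieuwe functie - DR-Rapid Data Access - voegt",
))
```

Rules like these only address formatting. They would not fix the scrambled word order in the first example or text placed on the wrong side of a tag, so they complement the feedback loop and do not replace it.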

POST-EDITING QUALITY RESULTS

No fails on one of our 28-language PE programs, thanks to correct terminology choices and few, consistent errors.

LOCALIZATION TAG PLACEMENT

This is what a plain-text engine will do:

To become verified and lift your sending limit, please confirm your email

address, then add a credit or prepaid card to your account and {30} {31}

{32} {33} {34} {35}confirm{36} {37} {38} it.{39}.

{30}Para hacerse verificado y levantar su límite de envío, por favor

confirme su dirección de correo electrónico, luego añada un crédito o

tarjeta de prepago a su cuenta de y

confírmelo.{31}{32}{33}{34}{35}{36}{37}{38}{39}

This is a<ph id="1" x="<b>">{1}</ph>test<ph id="1" x="</b>">{2}</ph>

Dies ist ein <ph id="1" x="<b>">{1}</ph>Test<ph id="1"x="</b>">{2}</ph>.
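In the Spanish example, every placeholder survives, but the placeholders no longer enclose the words they enclosed in the source. A simple heuristic check for that symptom is sketched below. It assumes placeholders of the form `{number}`; the function names are hypothetical, and this is a quality check on the output, not a description of how any particular engine places tags.

```python
import re

PLACEHOLDER = re.compile(r"\{\d+\}")


def enclosing_pairs(segment):
    """Return adjacent placeholder pairs that have non-blank text between them."""
    matches = list(PLACEHOLDER.finditer(segment))
    pairs = set()
    for left, right in zip(matches, matches[1:]):
        if segment[left.end():right.start()].strip():
            pairs.add((left.group(), right.group()))
    return pairs


def check_placeholders(source, target):
    """List the placeholder problems found when comparing target to source."""
    issues = []
    if sorted(PLACEHOLDER.findall(source)) != sorted(PLACEHOLDER.findall(target)):
        issues.append("placeholder set differs")
    if enclosing_pairs(source) != enclosing_pairs(target):
        issues.append("placeholders enclose different text")
    return issues


# Segments from the tag placement slide.
SOURCE = ("To become verified and lift your sending limit, please confirm your "
          "email address, then add a credit or prepaid card to your account and "
          "{30} {31} {32} {33} {34} {35}confirm{36} {37} {38} it.{39}.")
TARGET = ("{30}Para hacerse verificado y levantar su límite de envío, por favor "
          "confirme su dirección de correo electrónico, luego añada un crédito o "
          "tarjeta de prepago a su cuenta de y "
          "confírmelo.{31}{32}{33}{34}{35}{36}{37}{38}{39}")

print(check_placeholders(SOURCE, TARGET))
```

The check cannot tell which target words a tag pair should wrap, since that needs word alignment, but it does flag segments where the wrapping pattern changed so a post-editor can look at them first.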

AND THIS IS WHAT’S NEEDED

• More transparency in the workings of the engine and its training

• Faster systems, shorter turnaround on large systems

• More “wizards” for training and deployment

• Easier testing methodologies without full deployments

• More standardized scoring and comparison metrics

• Predictive analysis of quality – confidence and utility scores

• Normalization integrated into workflow and standardized

• Industry-wide proper name and title library

• Better transliteration standards

• Morphologically aware terminology choices

• More research on post-editing environments

1. How to display source/target

2. How to display multiple suggestions

3. Autocomplete

4. Better ways to calculate the productivity improvements with post-editing

• More interoperability, so translators can stay in the CAT tool they prefer

• Simplified workflows connecting MT engines and other tools

Translator Wishlist

Date post:	16-Jan-2017
