Automating Translation in the Localisation Factory
An Investigation of Post-Editing Effort
Sharon O’Brien
Dublin City University
Assumptions about MT
T (MT + PE) < T (Trans)
Do we have proof?
Dated studies: Pan-American Health Organisation, General Motors, European Union
3-4 times faster than translation
But: no details given
More recently:
Average daily throughput for PE: 5,250 words per day
Krings (2001): only thorough, published empirical data on PE rates
MT + CL
CL: Relatively young field of research/implementation
Consequently: little empirical data
CL improves “translatability”
The notion of translatability is based on so-called "translatability indicators" where the occurrence of such an indicator in the text is considered to have a negative effect on the quality of machine translation. The fewer translatability indicators, the better suited the text is to translation using MT.
(Underwood and Jongejan, 2001: 363)
Can we prove it - empirically?
By using CL rules to eliminate negative “translatability indicators”, post-editing effort of MT output will be lower than for output where negative translatability indicators have not been removed.
Experimental Set-Up
Validity!
Professional, experienced subjects, native speakers (German)
Homogenous backgrounds and level of experience
Familiar text (user guide)
Familiar working environment
Payment for time
However: limited number of subjects
Framework of Analysis
How do you measure post-editing “effort”?
Temporal
Technical
Cognitive
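Two sentence types: “Snti” (sentences containing negative translatability indicators, NTIs) and “Smin-nti” (sentences in which NTIs have been minimised)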
Framework of Analysis
Temporal Effort: How much time, in seconds, did it take to post-edit each sentence?
Technical Effort: How many deletions, insertions, cut & pastes were made for each sentence?
Cognitive Effort: Combined Temporal & Technical
Additional measurement: Choice Network Analysis
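A minimal sketch of how the temporal and technical measures above could be computed from a keystroke log. The Event record, its field names and the event kinds are hypothetical and are not Translog's actual log format. Time per sentence is assumed here to be the span between the first and last logged event for that sentence. The sample events are invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    # Hypothetical log record, not Translog's real format.
    segment_id: int   # which sentence the action belongs to
    time_s: float     # seconds since the start of the session
    kind: str         # "insert", "delete", "cut", "paste" or "navigate"


def effort_per_segment(events):
    """Summarise temporal and technical effort for each segment."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event.segment_id].append(event)

    summary = {}
    for segment_id, seg_events in grouped.items():
        times = [e.time_s for e in seg_events]
        kinds = [e.kind for e in seg_events]
        summary[segment_id] = {
            # Assumption: time spent = last event time minus first event time.
            "temporal_s": max(times) - min(times),
            "deletions": kinds.count("delete"),
            "insertions": kinds.count("insert"),
            "cut_pastes": kinds.count("cut") + kinds.count("paste"),
        }
    return summary


# Invented sample data, for illustration only.
sample_events = [
    Event(1, 0.0, "navigate"),
    Event(1, 2.5, "delete"),
    Event(1, 3.0, "insert"),
    Event(1, 4.0, "insert"),
    Event(2, 5.0, "navigate"),
    Event(2, 9.0, "delete"),
]
report = effort_per_segment(sample_events)
```

Keeping the counts separate per action type, rather than summing them into one score, mirrors the way deletions, insertions and cut & pastes are reported separately in the results.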
Analysis Tools
IBM Websphere
Translog
Excel
Translog User Interface
Translog Log File
Results: General Temporal Effort
[Chart: median words per minute, Post-Editor vs. Translator]
Temporal Effort: Individual Variation
[Chart: median words per minute for Translator 1, Translator 2 and Translator 3]
[Chart: median words per minute, Fastest Post-Editor vs. Slowest Post-Editor]
Temporal Effort by Sentence Type
Processing Speed: the total number of source words in each segment divided by the total processing time for that segment
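The processing speed definition above can be sketched as follows, together with a median per sentence type. The function names and the (type, words, seconds) tuple layout are assumptions made for illustration, time is assumed to be measured in seconds, and the sample segments are invented rather than taken from the study.

```python
from statistics import median


def processing_speed(source_words, processing_time_s):
    """Source words in a segment divided by its processing time in seconds."""
    if processing_time_s <= 0:
        raise ValueError("processing time must be positive")
    return source_words / processing_time_s


def median_speed_by_type(segments):
    """Group segments by sentence type and take the median processing speed."""
    speeds = {}
    for sentence_type, words, seconds in segments:
        speeds.setdefault(sentence_type, []).append(
            processing_speed(words, seconds)
        )
    return {t: median(values) for t, values in speeds.items()}


# Invented sample data: (sentence type, source words, processing time in s).
segments = [
    ("Snti", 10, 50.0),
    ("Snti", 12, 40.0),
    ("Snti", 8, 20.0),
    ("Smin-nti", 10, 25.0),
    ("Smin-nti", 9, 30.0),
    ("Smin-nti", 15, 30.0),
]
medians = median_speed_by_type(segments)
```

The median is used here because the results are reported as medians, which are less sensitive than a mean to a few very slow segments.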
Processing Speed by Sentence Type
[Chart: median processing speed, Snti vs. Smin-nti]
Technical Effort by Sentence Type
[Chart: median deletions, Snti vs. Smin-nti]
[Chart: median insertions, Snti vs. Smin-nti]
Technical Effort: Cut & Paste
Very little activity!
Retyping of entire phrases rather than cutting & pasting
Less effort to re-type?
Need for training?
Cognitive Effort
On average, eliminating NTIs appears to reduce PE effort.
However, CNA shows:
More edits to some NTIs than to others
Even though NTIs have been removed from a sentence, this does not guarantee zero post-editing
High PE Effort
Gerund (“ing” form of verb)
Ungrammatical Phrase
Putting an adjective after the noun
Non-finite verb (no tense marked)
Slang
Misspelling
Long Noun Phrase
Ellipsis
Long Sentence (more than 25 words)
Verbs with particles
Use of Footnotes
Multiple Prepositions
Short Segment (fewer than 4 words)
Medium PE Effort
Multiple Coordinators
Problematic Punctuation
Passive Voice
Phrase not syntactically complete
Use of Personal Pronouns
Use of Slash as a separator
Ambiguous coordination
Use of brackets
Proper Nouns
Missing “that” in a relative clause
Low PE Effort
Abbreviations
Demonstrative Pronouns
Missing “in order to”
Contractions (“Let’s”)
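Two of the high-effort indicators above are defined by word count and are simple to flag automatically: Long Sentence (more than 25 words) and Short Segment (fewer than 4 words). The sketch below is an illustration only and is not the CL checker used in the study. The function name is hypothetical, and counting words by splitting on whitespace is an assumption.

```python
def length_ntis(segment):
    """Flag the two length-based indicators from the high-effort list."""
    # Assumption: a "word" is any whitespace-separated token.
    word_count = len(segment.split())
    flags = []
    if word_count > 25:
        flags.append("Long Sentence")
    if word_count < 4:
        flags.append("Short Segment")
    return flags


short_flags = length_ntis("Click OK.")
normal_flags = length_ntis("Select the file you want to open.")
long_flags = length_ntis(" ".join(["word"] * 26))
```

Most of the other indicators, such as gerunds or ambiguous coordination, would need linguistic analysis rather than a word count, so they are left out of this sketch.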
Conclusions
Taking into account that no QA was performed on the final texts:
On average, post-editing can be faster than translation
High degree of individual variation
On average, removing NTIs reduces PE Effort
But some NTIs demand more effort than others
Conclusions
Even if all known NTIs are removed, sentences may still require PE effort.
Conclusions
Not all CL rules will have equal impact
Even if CL is applied, PE effort will not be removed completely
Post-editors are still human and still translators…