Machine Translation & Quality
Aidan Collins – User Experience Manager (Duration: 10 minutes)
What we aim to cover today
- The MT Quality Challenge
  - Measuring quality
  - Predicting schedule & costs
- Possible ways of measuring quality
  - Automated evaluation methods
  - Who do they really help?
- The MT Challenge
  - Improve current measurements to drive new business models
- Conclusion
What is KantanMT.com?
- Statistical MT system
  - Cloud-based
  - Highly scalable
  - Inexpensive to operate
  - Quick to deploy
- Our Vision: to put Machine Translation customization, improvement, and deployment into your hands
- Active KantanMT Engines: 6,632
- Training Words Uploaded: 23,653,605,925
- Member Words Translated: 362,291,925
- Fully Operational: 7 months
The Challenge of Measuring MT Quality
How can we measure this today?
What attributes can we measure?
- Language Attributes
  - Adequacy: accuracy of generated texts, based on word recall & precision
  - Fluency: comprehensibility of texts (readability, understandability), based on phrase reuse and assembly
- Task-oriented Attributes
  - Productivity: post-editing speed
  - Acceptability: fit-for-purpose measurement; usable translations within the context of the end user's quality demands
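The recall/precision idea behind adequacy can be sketched as a word-level F-measure against a reference translation. This is a minimal illustration of the concept only, not any of the production metrics discussed in this deck:

```python
from collections import Counter

def word_f_measure(candidate: str, reference: str) -> float:
    """Word-level F-measure: harmonic mean of precision and recall."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: a candidate word only counts as often as it
    # appears in the reference
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(word_f_measure("the cat sat on the mat",
                     "the cat is on the mat"))  # 5 of 6 words match
```

Adequacy-oriented metrics weight recall more heavily than this balanced F-measure does, since missing reference content hurts adequacy most.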
Automated MT Evaluations
- Many different techniques currently available
- All compute the similarity of generated texts to reference texts
- The smaller the difference => the better the quality!

Metrics and the attributes they target:
- Language Attributes: Fluency, Adequacy; measured by F-Measure, TER, NIST, GTM, BLEU, METEOR
- Task Attributes: Productivity, Acceptability
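BLEU, the most widely used of these metrics, compares candidate and reference n-grams. A minimal single-sentence, single-reference sketch (with naive +1 smoothing; real BLEU implementations are corpus-level and handle smoothing differently) might look like:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Simplified sentence-level BLEU with +1 smoothing."""
    cand, ref = candidate.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        overlap = sum((cand_counts & ref_counts).values())  # clipped matches
        total = sum(cand_counts.values())
        # +1 smoothing so one empty n-gram order doesn't zero the score
        log_prec += math.log((overlap + 1) / (total + 1))
    # Brevity penalty: punish candidates shorter than the reference
    bp = min(1.0, math.exp(1 - len(ref) / max(len(cand), 1)))
    return bp * math.exp(log_prec / max_n)
```

Note that BLEU only measures n-gram overlap with the reference, which is why it tracks fluency and adequacy but says nothing about post-editing productivity or acceptability.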
Who needs to measure Quality?
- Developers of MT Engines
  - Automated BLEU, METEOR, F-Measure, and TER scores are ideal and practical
  - No individual measurement has absolute meaning, but each points the quality curve in the right direction within a domain
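TER takes a different angle from the overlap metrics: it counts the edits needed to turn the MT output into the reference, divided by reference length. A simplified sketch is below; real TER also counts block shifts as single edits, which this version omits:

```python
def ter_no_shifts(candidate: str, reference: str) -> float:
    """Word-level edit distance (insert/delete/substitute) over
    reference length -- a shift-free simplification of TER."""
    c, r = candidate.split(), reference.split()
    # Classic dynamic-programming edit-distance table
    dp = [[0] * (len(r) + 1) for _ in range(len(c) + 1)]
    for i in range(len(c) + 1):
        dp[i][0] = i
    for j in range(len(r) + 1):
        dp[0][j] = j
    for i in range(1, len(c) + 1):
        for j in range(1, len(r) + 1):
            cost = 0 if c[i - 1] == r[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,      # delete
                           dp[i][j - 1] + 1,      # insert
                           dp[i - 1][j - 1] + cost)  # substitute
    return dp[len(c)][len(r)] / max(len(r), 1)
```

Unlike BLEU, lower is better here: a TER of 0.0 means the output already matches the reference, which is why TER is often read as a rough proxy for post-editing effort.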
Who needs to measure Quality?
- The Localisation Stakeholder Dilemma
  - Production Teams (PMs, LEs, and QEs)
    - Need segment-level measurements of quality and post-editing effort
    - To determine tiered segment post-edit rates and distribute post-editing tasks based on segment quality
  - Localisation Managers
    - Need productivity measurements (aka Project Segment Reports) to predict budget and schedule
    - MT measurements need to 'fit' business planning and charge models
  - Translators
    - Unfortunately, don't get a fair deal: no segment-level information, just top-level project scores
The Quality & MT Relationship

[Diagram: NIST, GTM, BLEU, F-Measure, TER, and METEOR mapped to MT Developers and Production teams, annotated "None of them measure this!"]
Conclusions
- There are many automated MT quality measurements
  - Mostly suitable for MT developers
  - Not optimal for production teams
  - Of no use to translators
  - All rely on reference texts to compute measurements
- What the industry wants is ...
  - Segment-level quality measurements for MT texts
  - Measurements that help Project Managers predict project cost & schedule
  - Measurements that do not rely on reference texts
  - A high level of granularity, removing the guesswork
KantanMT Analytics™
- Segment Quality Scoring for MT texts (think FuzzyMatch for Machine Translation systems)
- In BETA; final release October 2013
- KantanMT Analytics reports are XML based
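The FuzzyMatch analogy above can be illustrated with a character-level similarity ratio from Python's standard library. This is a hypothetical stand-in for the idea of a TM-style match percentage, not KantanMT's actual segment-scoring algorithm:

```python
import difflib

def fuzzy_match(segment: str, tm_segment: str) -> int:
    """Return a TM-style match percentage between two segments."""
    ratio = difflib.SequenceMatcher(
        None, segment.lower(), tm_segment.lower()).ratio()
    return round(ratio * 100)

# Identical segments score 100, as a 100% translation-memory match would
print(fuzzy_match("Click the Save button.", "Click the Save button."))
```

In translation-memory tooling, matches above a threshold (often around 75%) are typically routed to post-editing rather than translated from scratch; the promise of segment-level MT quality scoring is the same kind of routing decision for MT output.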
KantanMT Analytics™
- KantanMT Analytics Report created
- XML based, for consumption by TMS/GMS platforms