Machine Translation & Quality
Aidan Collins – User Experience Manager (Duration: 10 minutes)
What we aim to cover today
- The MT Quality Challenge
  - Measuring quality
  - Predicting schedule & costs
- Possible ways of measuring quality
  - Automated evaluation methods
  - Who do they really help?
- The MT Challenge
  - Improve current measurements to drive new business models
- Conclusion
What is KantanMT.com?
- Statistical MT system
  - Cloud-based
  - Highly scalable
  - Inexpensive to operate
  - Quick to deploy
- Our Vision: to put Machine Translation customization, improvement, and deployment into your hands
- Active KantanMT Engines: 6,632
- Training Words Uploaded: 23,653,605,925
- Member Words Translated: 362,291,925
- Fully Operational: 7 months
The Challenge of Measuring MT Quality
How can we measure this today?
What attributes can we measure?
- Language Attributes
  - Adequacy: accuracy of generated texts, based on word recall & precision
  - Fluency: comprehensibility of texts (readability, understandability), based on phrase reuse and assembly
- Task-oriented Attributes
  - Productivity: post-editing speed
  - Acceptability: fit-for-purpose measurement; usable translations within the context of the end user's quality demands
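The recall/precision idea behind adequacy can be sketched as a word-level F-measure against a reference translation. This is a minimal illustration of the concept only, not any of the production metrics discussed in this deck:

```python
from collections import Counter

def word_f_measure(candidate: str, reference: str) -> float:
    """Word-level F-measure: harmonic mean of precision and recall."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: a candidate word only counts as often as it
    # appears in the reference
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(word_f_measure("the cat sat on the mat",
                     "the cat is on the mat"))  # 5 of 6 words match
```

Adequacy-oriented metrics weight recall more heavily than this balanced F-measure does, since missing reference content hurts adequacy most.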
Automated MT Evaluations
- Many different techniques currently available
- All compute the similarity of generated texts to reference texts
- The smaller the difference => the better the quality!

Metrics and the attributes they target:
- Language Attributes: Fluency, Adequacy; measured by F-Measure, TER, NIST, GTM, BLEU, METEOR
- Task Attributes: Productivity, Acceptability
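BLEU, the most widely used of these metrics, compares candidate and reference n-grams. A minimal single-sentence, single-reference sketch (with naive +1 smoothing; real BLEU implementations are corpus-level and handle smoothing differently) might look like:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Simplified sentence-level BLEU with +1 smoothing."""
    cand, ref = candidate.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        overlap = sum((cand_counts & ref_counts).values())  # clipped matches
        total = sum(cand_counts.values())
        # +1 smoothing so one empty n-gram order doesn't zero the score
        log_prec += math.log((overlap + 1) / (total + 1))
    # Brevity penalty: punish candidates shorter than the reference
    bp = min(1.0, math.exp(1 - len(ref) / max(len(cand), 1)))
    return bp * math.exp(log_prec / max_n)
```

Note that BLEU only measures n-gram overlap with the reference, which is why it tracks fluency and adequacy but says nothing about post-editing productivity or acceptability.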
Who needs to measure Quality?
- Developers of MT Engines
  - Automated BLEU, METEOR, F-Measure, and TER scores are ideal and practical
  - No individual measurement has absolute meaning, but each points the quality curve in the right direction within a domain
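TER takes a different angle from the overlap metrics: it counts the edits needed to turn the MT output into the reference, divided by reference length. A simplified sketch is below; real TER also counts block shifts as single edits, which this version omits:

```python
def ter_no_shifts(candidate: str, reference: str) -> float:
    """Word-level edit distance (insert/delete/substitute) over
    reference length -- a shift-free simplification of TER."""
    c, r = candidate.split(), reference.split()
    # Classic dynamic-programming edit-distance table
    dp = [[0] * (len(r) + 1) for _ in range(len(c) + 1)]
    for i in range(len(c) + 1):
        dp[i][0] = i
    for j in range(len(r) + 1):
        dp[0][j] = j
    for i in range(1, len(c) + 1):
        for j in range(1, len(r) + 1):
            cost = 0 if c[i - 1] == r[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,      # delete
                           dp[i][j - 1] + 1,      # insert
                           dp[i - 1][j - 1] + cost)  # substitute
    return dp[len(c)][len(r)] / max(len(r), 1)
```

Unlike BLEU, lower is better here: a TER of 0.0 means the output already matches the reference, which is why TER is often read as a rough proxy for post-editing effort.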
Who needs to measure Quality?
- The Localisation Stakeholder Dilemma
  - Production Teams (PMs, LEs, and QEs)
    - Need segment-level measurements of quality and post-editing effort
    - To determine tiered segment post-edit rates and distribute post-editing tasks based on segment quality
  - Localisation Managers
    - Need productivity measurements (aka Project Segment Reports) to predict budget and schedule
    - MT measurements need to 'fit' business planning and charge models
  - Translators
    - Unfortunately, don't get a fair deal: no segment-level information, just top-level project scores
The Quality & MT Relationship

[Diagram: NIST, GTM, BLEU, F-Measure, TER, and METEOR mapped to MT Developers and Production teams, annotated "None of them measure this!"]
Conclusions
- There are many automated MT quality measurements
  - Mostly suitable for MT developers
  - Not optimal for production teams
  - Of no use to translators
  - All rely on reference texts to compute measurements
- What the industry wants is ...
  - Segment-level quality measurements for MT texts
  - Measurements that help Project Managers predict project cost & schedule
  - Measurements that do not rely on reference texts
  - A high level of granularity, removing the guesswork
KantanMT Analytics™
- Segment Quality Scoring for MT texts (think FuzzyMatch for Machine Translation systems)
- In BETA; final release October 2013
- KantanMT Analytics reports are XML based
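The FuzzyMatch analogy above can be illustrated with a character-level similarity ratio from Python's standard library. This is a hypothetical stand-in for the idea of a TM-style match percentage, not KantanMT's actual segment-scoring algorithm:

```python
import difflib

def fuzzy_match(segment: str, tm_segment: str) -> int:
    """Return a TM-style match percentage between two segments."""
    ratio = difflib.SequenceMatcher(
        None, segment.lower(), tm_segment.lower()).ratio()
    return round(ratio * 100)

# Identical segments score 100, as a 100% translation-memory match would
print(fuzzy_match("Click the Save button.", "Click the Save button."))
```

In translation-memory tooling, matches above a threshold (often around 75%) are typically routed to post-editing rather than translated from scratch; the promise of segment-level MT quality scoring is the same kind of routing decision for MT output.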
KantanMT Analytics™
- KantanMT Analytics Report created
- XML based, for consumption by TMS/GMS platforms