Date posted: 17-Jan-2015
Category: Technology
Industry-Scale Crowdsourcing of Data & Terminology
Rahzeb Choudhury, TAUS
TAUS Mission
Our mission is to increase the size and significance of the translation industry to help the world communicate better.
Sharing Data & Knowledge
…on an industry level in an open and transparent landscape brings us all to a higher level of competence.
Where We Stand
Together We Know More
We Know Better
Four Focus Areas
This slide may not be used or copied without permission from TAUS
Translation as a Utility
Data Technology
Interoperability
Metrics
Members
Global Members
Academic, NGO & Government Members
Large Corporate Members
Small Corporate Members
Agency Members
Terminology
Importance of Terminology Work
Very important: 43.5%
Quite important: 39.9%
Less important: 14.8%
Not important: 1.8%
Source: TaaS User Needs Survey, 2012. 1735 responses (approx. 40% technical writers, 30% translators, plus others)
Information Sources
Main Problems
20.6%
12.2%
11.5%
10.3%
36.0%
9.4%
Lack of resources/Insufficient terminology management
Poor quality/Up-to-dateness
Lack of information
Lack of convincing verification/Misleading information online
Rest
Too many sources.
Takes too much time.
Effort is duplicated.
Results questionable.
…Centralization…
Owned
Shared
Web
Machine Translation
Data and Quality
Amount of Data
MT Quality
More data
Algorithms
In-domain Data
Owned
Shared
Web
Lack of access.
Copyright.
Takes too much time.Effort is duplicated.
Quality questionable.
…Centralization…
Central Source of In-domain Data
Owned
Shared
Web – to come in 2014
Terminology and Machine Translation
Data and Quality
Amount of Data
MT Quality
More data
Algorithms
In-domain Data
Usage/Feedback Data… Terminology!
…Centralization…
TAUS Mission
Our mission is to increase the size and significance of the translation industry to help the world communicate better.
Sharing Data & Knowledge
…on an industry level in an open and transparent landscape brings us all to a higher level of competence.
Central Sources of Data and Terminology
Own Data – Private Vault
Shared Data – In-domain data
Web Data – Data Collector
Own Terms – Build Own Collections
Shared Terms – In-domain terms
Web Terms – Term Collector
But what about the crowd?
For language workers, CAT Tools & MT Systems
Main Problems
20.6%
12.2%
11.5%
10.3%
36.0%
9.4%
Lack of resources/Insufficient terminology management
Poor quality/Up-to-dateness
Lack of information
Lack of convincing verification/Misleading information online
Rest
Central Sourcing of Data and Terminology
The crowd must verify!
Web Data – Data Collector
Web Terms – Term Collector
But what about the crowd?
The crowd must source!
Unless the crowd helps to source and verify…
Too many sources.
Takes time.
Effort is duplicated.
Results questionable.
We maintain the status quo.
Register and engage: demo.taas-project.eu
Thank you.
Contact: [email protected]