Linguistic Resources for Localisation
Reinhard Schäler (LRC)
Deirdre Farrell (VeriTest)
Annette Lee (VU Games)
Andreas Papadakis (Archetypon)
Florian Sachse (PASS)
A project co-funded by the European Union’s eContent Programme
Tenth Localisation Conference organised by the LRC
LRC-X The Development Localisation Event
University of Limerick +++ 13–14 September 2005
Agenda
The IGNITE project
Partners
Context – Rationale – Overview
Phase I: Linguistic Resources
Summary
Discussion
Partners
Coordinator:
University of Limerick, Localisation Research Centre (Ireland); contact: Reinhard Schäler
Contractors:
PASS GmbH (Germany); contact: Florian Sachse
Lionbridge – VeriTest (Ireland); contact: Deirdre Farrell
Vivendi Universal Games (Ireland); contact: Annette Lee
Archetypon (Greece); contact: George Boukis
Each partner represents one of the stakeholders in localisation
Independent Research Organisation
The Localisation Research Centre (LRC) is the information, educational, and research centre
for the localisation community.
IGNITE will enable the LRC to offer an independent testing and certification service to publishers of digital
content, tools and technologies, as well as to the developers of localisation standards
Localisation Research Centre
www.localisation.ie
Digital Content Developer
Vivendi Universal Games
Global leader in multi-platform interactive entertainment.
Products for all major platforms, including PCs, consoles, internet
700 title library includes multi-million unit selling
Rationale for participation:
Reduce costs and time to market, increase potential market, and improve efficiency
Participate along with key industry players in developing an infrastructure of resources, tools, technologies and standards
Automation of localisation processes for digital content, incorporating findings from the IGNITE project
Localisation Service Provider
Archetypon
SME, founded in 1987
90% of revenue from the international market
One of the 500 leading high-growth companies in Europe, for the second time in 2002 (GrowthPlus)
Rationale for participation:
Evolve in a dynamic market
Automate localisation processes
Collaborate with partners in the industry
Localisation Technology Provider
PASS Engineering
Worldwide leading provider of high-quality localization tools; developers of PASSOLO
Cutting-edge technology across a wide variety of platforms, powerful interfaces, highly competitive, scalable pricing
Rationale for participation:
PASS can make a very significant contribution
Expect a formalized set of test cases to run against PASSOLO or its competitors
Expect better understanding of data-centric or programmatic approaches based on a large set of scenarios
IT Testing & Certification Expert
VeriTest
Established, public company (Nasdaq: LIOX)
Global IT outsourcing solutions
19 Solution Centers in 10 countries
The industry's most trusted brand for high-quality, cost-effective outsourced testing, competitive analysis and certification services
Rationale for participation:
To be involved in developing supportable standards within the industry
To develop, implement and audit the proposed test harness and certification process
VeriTest is a division of Lionbridge
The localisation factory: a case study (2003)
Current throughput: 100,000 language check-ins per month
2 million files per month
98% of words leveraged
Average time to process a file: 45 seconds
Fully scalable “add-a-box” model
Simship of all 30 languages
International version testing before US release
Reduced no. of release engineers (20->2), resulting in US$20m saving per year
Positive ROI within 1 year
Project constraints:
4m wordcount software strings
30 languages simultaneous release
13k localisable files
Localisation group in Dublin; 5,000 people world-wide distributed development team
Objectives:
24/7, 100% automated process – no exceptions
Translation in parallel with development
Translation begins at code check-in
Translation “on demand” – no more “big project” model
The Setting
Objectives and deliverables
Linguistic infrastructure resources:
Language data
Tools and technologies
Standards
Access:
Content developers
Service providers
Technology developers
Performance scenarios:
Digital content – standards
Tools and technologies – standards
Standards – coverage
[Figure: IGNITE overview. Linguistic Resources, with examples: language data (digital content source/target, terminologies, translation memories); tools (terminology DBs, TM systems, UI editors); standards (OASIS, ISO, Unicode). Performance analysis. IGNITE Consortium and IGNITE Contact Group. Phase I, Phase II, Phase III.]
[Figure: IGNITE overview, complete. Linguistic Resources, with examples: language data (digital content source/target, terminologies, translation memories); tools (terminology DBs, TM systems, UI editors); standards (OASIS, ISO, Unicode, W3C). Localisation Process Environment: state-of-the-art technologies and process environment. Performance analysis; standard verification and enhancement. IGNITE Consortium and IGNITE Contact Group. Phase I, Phase II, Phase III. Linguistic Resources Support Network.]
Phase I: Linguistic resources
Language data
Tools
Standards
Language data
Set of speech or language data and descriptions in machine-readable form
Used e.g.:
for building, improving or evaluating natural language and speech algorithms or systems
as core resources for the software localisation and language services industries
for language studies, electronic publishing, international transactions, subject-area specialists, end users
Language data
Types of language data:
Multimodal digital content in source and target languages
Monolingual and multilingual terminology
Translation memories
Languages covered:
Primarily those represented in the consortium
Also those represented by the contact group
Linguistic tools
Linguistic tools and technologies answer some of the central questions around terminology handling and update processing.
Terminology handling:
Access to standard terminology in multiple languages
Maintenance of multilingual terminologies
Integration of late terminology changes
Consistency checker
Update processing:
Version comparison, change control
Analysis and alignment of source and target
Use of exact or fuzzy matches
Beyond Translation Memory Systems
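As an illustration of the "exact or fuzzy matches" item, the following is a minimal sketch of a translation memory lookup. It is not taken from the IGNITE project or from any particular TM system: the sample memory, the function name best_match and the 0.75 similarity threshold are all assumptions made for this example, and the similarity score comes from Python's standard difflib module rather than from a dedicated matching engine.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: source segment -> stored translation.
TRANSLATION_MEMORY = {
    "Save the file": "Datei speichern",
    "Open the file": "Datei öffnen",
    "Close the window": "Fenster schließen",
}


def best_match(segment, memory, threshold=0.75):
    """Return (kind, source, target, score) for the closest entry, or None.

    An exact match is reported with score 1.0. Otherwise the entry with the
    highest character-level similarity is returned as a fuzzy match, provided
    its score reaches the (assumed) threshold.
    """
    if segment in memory:
        return ("exact", segment, memory[segment], 1.0)

    best = None
    for source, target in memory.items():
        # ratio() is a similarity in [0, 1]; comparison is case-insensitive.
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if best is None or score > best[3]:
            best = ("fuzzy", source, target, score)

    if best is not None and best[3] >= threshold:
        return best
    return None  # nothing similar enough: the segment needs fresh translation


if __name__ == "__main__":
    for text in ("Save the file", "Save the files", "Print the document"):
        print(text, "->", best_match(text, TRANSLATION_MEMORY))
```

In an update scenario, a changed source string that still yields a fuzzy match can be offered to the translator together with the stored translation, while strings with no match are routed for new translation.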
Linguistic tools
Linguistic tools in localisation:
Terminology management systems
Translation memories
Machine translation
User interface and user assistance visual translation environments
Language data analysis tools
Sophisticated matching tools
Natural language parsers
Extract-and-Insert tools
Parsers for natural language digital content in compiled sources
Linguistic tools
Direct and online access
Cooperation with leading industry associations (e.g. GALA and TILP)
Tools review and categorisation
Dissemination
Greater confidence when selecting tools
Development of market for tools
Standards
A large number of standards relevant to linguistic resources in the context of localisation have been published by a number of organisations:
International Organization for Standardization (ISO)
Localisation Industry Standards Association (LISA)
OASIS
The Free Standards Group Open Internationalization Initiative (OpenI18N.org)
TermNet
Unicode
W3C
Standards
Central repository
Review of standard development process
Uptake and support
Demonstration
Effectiveness
Phase I: Linguistic resources – how?
IGNITE Contact Group
Digital content publishers, coordinated by Vivendi Universal Games
Service providers, coordinated by Archetypon
Tools developers, coordinated by PASS
Standards organisations, coordinated by VeriTest
How to contribute and benefit
IGNITE contact group – sign up!
Organised by type of enterprise and interest (content developer, service provider, tools developer, certification expert)
Contributors to the content, tools/technologies and standards repositories
Early access to localisation resources, infrastructure and test harness
Exposure through project literature and publications
Preferential invitation to Contact Group meetings and workshops
Next steps
Phase II:
Review approaches to standard process descriptions (in other industries, e.g. manufacturing)
Distillation of standard localisation process
Localisation Factory (automated localisation process environment)
Phase III:
Develop process and standard evaluation strategy
Build test harnesses
Report on performance evaluations
Discussion
Benefits for your company?
Are all angles covered?
Interested in contributing?
How to?
Is it feasible?