FLAVIUS Meeting – WP4
June 7, 2010
Giurgiu Bogdan
Wong William
Agenda
• About Language Weaver
• R&D work
• Customer experience
• LW core mission
• LW architectural contribution
• Deliverables
• Roadmap
• Internal milestone rollup
• Questions & Answers
Language Weaver at a Glance
Founded 2002. The first commercial science breakthrough in high-speed statistical human language translation
Offices Los Angeles (HQ), Washington DC, Boston, San Francisco, Paris, London, Brussels, Tokyo, & Cluj
Employees 105
Management
• Mark Tapling, President & CEO
• Daniel Marcu, Founder & CTO
• William Wong, Founder & VP Engineering
• Adrian Gocan, Country Manager – Cluj Office
Markets Served
• Digital Content / Social Media
• Customer Support
• Government Intelligence
Language Weaver delivers human communication solutions through trusted automated language translation
Language Weaver Romania
Established 2008
Employees ~50, with 5 open positions
Key Areas of Expertise
Development, engineering, telemarketing/marketing, linguists
2 active contracts with the European Union are driven out of this location – FAUST and FLAVIUS
Partnerships Language Weaver Srl has a partnership with Cambridge University to deliver research solutions
FLAVIUS Contributors
• LW SRL, Romania
– Daniel Marcu, CTO
– Ionel Condor, Engineering Manager
– Bogdan Giurgiu, Project Manager
– Ana Totea, Engineer
– Bogdan Faraga, Engineer
– Daniel Sarbe, Engineer
– Matei Nicolae, Engineer
R&D Projects
• Research Projects
– Improve syntax-based SMT (DARPA funded)
– Small footprint systems for SMT
– Domain customization techniques for SMT
• R&D Projects
– GALE Operational Engines
– Broadcast Monitoring Solutions (speech2text translation)
– FAUST – FP7 EC Project
Currently Available Language Pairs
Western European
Danish to/from English
Dutch to/from English
French to/from English
French to/from Spanish
French to/from German
Italian to/from English
Italian to/from Spanish
German to/from English
German to/from Spanish
Greek to/from English
Norwegian to/from English
Portuguese to/from English
Spanish to/from English
Swedish to/from English

Middle Eastern & African
Arabic to/from English
Arabic to/from French
Arabic to/from Spanish
Dari to/from English
Hebrew to/from English
Hausa to/from English
Pashto to/from English
Persian to/from English
Somali to/from English
Turkish to/from English
Urdu to/from English

Eastern European
Bulgarian to/from English
Czech to/from English
Hungarian to/from English
Polish to/from English
Romanian to/from English
Russian to/from English
Serbian to/from English

Asian
Simplified Chinese to/from English
Traditional Chinese to/from English
Hindi to/from English
Japanese to/from English
Korean to/from English
Thai to/from English
Bengali to/from English

*Latest product release enables LW to translate to and from any language that is available with limited quality
Our Customer Deployments
[Chart: QUALITY axis (Baseline, Trained, Post Edit) against a FACT to INFLUENCE axis; segments shown: Governments, Customer Care, Digital Content, Legal, Marketing, Publications]
Our Customer Experience
• Baselines are inadequate for FAUT (fully automated useful translation)
• Lacks utility of translation (usefulness)
• Basic translation (gisting) does not convey publisher needs such as terminology
Customer Experience
• Human post-edit for preservation of publisher voice
• Human productivity limited to 2,500 words per day
• High cost prevents time-critical, high-volume publication and user-generated content
Customer Experience
• Convergence of utility vs. ROI
• Proven trust in actionable content over baseline engines
• Significant cost reduction from influence-oriented communications
• Liberates publisher and user-generated content
Core Mission for FLAVIUS
Accelerate the adoption of FAUT on a broad scale by leveraging easy customization of domain verticals for content publishers.
FLAVIUS Content Management System
Language Weaver’s Contribution
Keys to a Successful Partner Integration
1. Ability to integrate with Language Weaver Machine Translation for development and testing
2. Ability to customize baseline engines with dictionaries
3. Ability to customize baseline engines by training a domain- or customer-specific vertical system
Accomplishments To Date (M3)
• No pre-financing is expected
• Negotiated purchase agreement between LW SRL and Dell Computers
• Purchased 14 Dell servers
• Purchased Cisco network switch
• Entered into colocation agreement between LW SRL and Latisys (hosting location in Irvine, CA)
• All hardware delivered to Latisys
• LW Inc. IT staff installed and deployed to TOD (Translations on Demand) at no cost to LW SRL
• Available languages: English to French, Spanish, Italian, German, Polish, Romanian, Swedish and vice-versa
Current Activities (M3)
[Diagram: REST API, TOD]
• LW sets up integration partner accounts
• Partners start development using the TOD REST API:
– HTTP-based communication protocol
  • Web 2.0 approach used by Amazon, Twitter, etc.
– Supported text formats: TXT, HTML, TMX, XLIFF
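
To make the HTTP-based integration concrete, here is a minimal Python sketch of how a partner might assemble a translation request. The slides do not specify the TOD endpoint, so the host, path, query parameter names, and authorization header below are all hypothetical placeholders; only the list of supported formats comes from the slide. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values: the real TOD host, path, and parameter names
# are not given in these slides.
BASE_URL = "https://tod.example.com/api"
SUPPORTED_FORMATS = {"txt", "html", "tmx", "xliff"}  # formats listed on the slide


def build_translation_request(api_key, source_lang, target_lang, text, fmt="txt"):
    """Build (but do not send) an HTTP POST for a translation job."""
    fmt = fmt.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    query = urlencode({"source": source_lang, "target": target_lang, "format": fmt})
    return Request(
        url=f"{BASE_URL}/translations?{query}",
        data=text.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "text/plain; charset=utf-8",
        },
        method="POST",
    )


req = build_translation_request("demo-key", "eng", "fra", "Hello, world.")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that step is omitted because the endpoint is a placeholder.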
Upgrade TOD Framework (LW Milestone)
• Internal milestone for LW to migrate partner accounts to the upgraded TOD framework in month 9
• Provides new functionality that is outside the FLAVIUS project but materially benefits the teams
• Extends current REST API
• Trustscore™-enabled baseline engines
• Utility-based, not quality-based, assessment
• Deployed for TripAdvisor and Dell
• Reporting of basic statistics
[Diagram: REST API, TOD, Reporting, Trustscore™]
Customization via Dictionary (M12)
• REST API-enabled dictionary support
• Dictionary upload through API
• A dictionary will be specific to an account, per language pair
• e.g. Dell (account), Eng-Spa (LP), Servers Terminology (dictionary, 1+)
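
The scoping rule above (one or more dictionaries per account and language pair) can be sketched as a small in-memory data structure. This is only an illustration in Python; the class name, method names, and the lookup behaviour are assumptions, not the TOD implementation or its API.

```python
from collections import defaultdict


class DictionaryStore:
    """Hypothetical store of terminology dictionaries keyed by (account, language pair)."""

    def __init__(self):
        # (account, language_pair) -> {dictionary name -> {source term -> target term}}
        self._store = defaultdict(dict)

    def upload(self, account, language_pair, name, entries):
        """Add or replace one named dictionary for an account and language pair."""
        self._store[(account, language_pair)][name] = dict(entries)

    def lookup(self, account, language_pair, term):
        """Return the first matching target term in that scope, or None."""
        for entries in self._store.get((account, language_pair), {}).values():
            if term in entries:
                return entries[term]
        return None


store = DictionaryStore()
store.upload("Dell", "eng-spa", "Servers Terminology", {"server": "servidor"})
print(store.lookup("Dell", "eng-spa", "server"))
print(store.lookup("Dell", "eng-fra", "server"))
```

The second lookup finds nothing because the dictionary was uploaded for a different language pair, which is the isolation the slide describes.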
[Diagram: REST API, TOD, Reporting, Trustscore™, Dictionary]
Customization via Training (M21)
Parallel Aligned Text
Optional: Regression Text
Optional: Test Text
Evaluation
Data:
• Fix noisy text
• More text
• Text alignment
• Text segmentation
Product Delivery via TOD
LW Training Compute Cloud
[Diagram: REST API, TOD, Reporting, Trustscore™, Dictionary, Training]
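
As an illustration of the data preparation named on this slide, the Python sketch below drops obviously noisy pairs from parallel aligned text and bundles the required and optional inputs. The function names, the length-ratio heuristic, and its threshold are assumptions chosen for the example; the slides do not describe how LW actually cleans or packages training data.

```python
def clean_parallel_text(pairs, max_length_ratio=3.0):
    """Drop obviously noisy segment pairs (hypothetical heuristic)."""
    cleaned = []
    for source, target in pairs:
        # Normalize whitespace on both sides.
        source, target = " ".join(source.split()), " ".join(target.split())
        if not source or not target:
            continue  # one side is empty: likely an alignment gap
        src_len, tgt_len = len(source.split()), len(target.split())
        if max(src_len, tgt_len) / min(src_len, tgt_len) > max_length_ratio:
            continue  # word counts differ too much: likely misaligned
        cleaned.append((source, target))
    return cleaned


def build_training_package(parallel, regression=None, test=None):
    """Bundle the required parallel text with the optional regression and test sets."""
    cleaned = clean_parallel_text(parallel)
    if not cleaned:
        raise ValueError("parallel aligned text is required")
    return {"parallel": cleaned, "regression": regression or [], "test": test or []}


pairs = [
    ("Hello   world", "Bonjour le monde"),
    ("", "vide"),
    ("Yes", "Oui bien sûr je suis tout à fait d'accord avec vous"),
]
package = build_training_package(pairs)
print(package["parallel"])
```

Only the first pair survives: the second has an empty source side and the third fails the length-ratio check.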
Complete Picture
[Diagram: REST API, TOD, Reporting, Trustscore™, Dictionary, Training]
FLAVIUS Language Weaver Roadmap
Internal milestones
Who   What                                                                         When
LW    Translation Engines Up and Running                                           June 30th
LW    REST API to be used to access the SMT                                        June 30th
SFT   Architecture document should include the details of the translation API     TBD
TBD   Project presentation                                                         June 30th
TBD   Project website                                                              June 30th
      Project logo                                                                 June 30th
Questions & Answers
Thank you!
Accelerating the way the world communicates