Post on 17-Dec-2015

FLAVIUS Meeting – WP4

June 7, 2010

Giurgiu Bogdan

Wong William

Agenda

• About Language Weaver
• R&D work
• Customer experience
• LW core mission
• LW architectural contribution
• Deliverables
• Roadmap
• Internal milestone rollup
• Questions & Answers

Language Weaver at a Glance

Founded 2002. The first commercial science breakthrough in high speed statistical human language translation

Offices Los Angeles (HQ), Washington DC, Boston, San Francisco, Paris, London, Brussels, Tokyo, & Cluj

Employees 105

Management

• Mark Tapling, President & CEO
• Daniel Marcu, Founder & CTO
• William Wong, Founder & VP Engineering
• Adrian Gocan, Country Manager – Cluj Office

Markets Served

• Digital Content / Social Media
• Customer Support
• Government Intelligence

Language Weaver delivers human communication solutions through trusted automated language translation

Language Weaver Romania

Established 2008

Employees ~50, 5 open positions

Key Areas of Expertise

Development, engineering, telemarketing/marketing, linguists

2 active contracts with the European Union are driven out of this location – FAUST and FLAVIUS

Partnerships Language Weaver Srl has a partnership with Cambridge University to deliver research solutions

FLAVIUS Contributors

• LW SRL, Romania
– Daniel Marcu, CTO
– Ionel Condor, Engineering Manager
– Bogdan Giurgiu, Project Manager
– Ana Totea, Engineer
– Bogdan Faraga, Engineer
– Daniel Sarbe, Engineer
– Matei Nicolae, Engineer

R&D Projects

• Research Projects
– Improve syntax-based SMT (DARPA funded)
– Small footprint systems for SMT
– Domain customization techniques for SMT

• R&D Projects
– GALE Operational Engines
– Broadcast Monitoring Solutions (speech2text translation)
– FAUST – FP7 EC Project

Currently Available Language Pairs

Western European
Danish to/from English
Dutch to/from English
French to/from English
French to/from Spanish
French to/from German
Italian to/from English
Italian to/from Spanish
German to/from English
German to/from Spanish
Greek to/from English
Norwegian to/from English
Portuguese to/from English
Spanish to/from English
Swedish to/from English

Middle Eastern & African
Arabic to/from English
Arabic to/from French
Arabic to/from Spanish
Dari to/from English
Hebrew to/from English
Hausa to/from English
Pashto to/from English
Persian to/from English
Somali to/from English
Turkish to/from English
Urdu to/from English

Eastern European
Bulgarian to/from English
Czech to/from English
Hungarian to/from English
Polish to/from English
Romanian to/from English
Russian to/from English
Serbian to/from English

Asian
Simplified Chinese to/from English
Traditional Chinese to/from English
Hindi to/from English
Japanese to/from English
Korean to/from English
Thai to/from English
Bengali to/from English

*Latest product release enables LW to translate to and from any language that is available with limited quality

Our Customer Deployments

[Chart: QUALITY levels (Baseline, Trained, Post Edit) against a FACT to INFLUENCE axis; segments shown: Governments, Customer Care, Digital Content, Legal, Marketing, Publications]

Our Customer Experience

• Baselines are inadequate for FAUT (fully automated useful translation)
• Lacks utility of translation (usefulness)
• Basic translation (gisting) does not convey publisher needs such as terminology

Customer Experience

• Human post edit for preservation of publisher voice
• Human productivity limited to 2,500 words per day
• High cost prevents time-critical, high-volume publication and user generated content
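To make the throughput limit concrete, here is a minimal sketch of the arithmetic. It assumes the 2,500 words per day figure from the slide; the workload size and team size are made-up illustrative inputs, not numbers from the deck.

```python
import math

WORDS_PER_EDITOR_PER_DAY = 2500  # figure quoted on the slide


def post_edit_days(total_words: int, editors: int) -> int:
    """Working days needed for `editors` people to post-edit `total_words`."""
    if editors <= 0:
        raise ValueError("editors must be positive")
    daily_capacity = WORDS_PER_EDITOR_PER_DAY * editors
    # Round up: a partially used day still counts as a working day.
    return math.ceil(total_words / daily_capacity)


# Hypothetical workload: 1,000,000 words with a team of 4 post-editors.
print(post_edit_days(1_000_000, 4))  # 100
```

Under these assumed inputs the job takes 100 working days, which illustrates why human post-editing does not fit time-critical, high-volume content.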

Customer Experience

• Convergence of utility vs. ROI

• Proven trust in actionable content over baseline engines

• Significant cost reduction from influence oriented communications

• Liberates publisher & user generated content

Core Mission for FLAVIUS

Accelerate the adoption of FAUT on a broad scale by leveraging easy customization of domain verticals for content publishers.

FLAVIUS Content Management System

Language Weaver’s Contribution

Keys to a Successful Partner Integration

1. Ability to integrate with Language Weaver Machine Translation for development and testing

2. Ability to customize baseline engines with dictionaries

3. Ability to customize baseline engines with training of domain/customer specific vertical system

Accomplishments To Date (M3)

• No pre-financing is expected
• Negotiated purchase agreement between LW SRL and Dell Computers
• Purchased 14 Dell servers
• Purchased Cisco network switch
• Entered into collocation agreement between LW SRL and Latisys (hosting location in Irvine, CA)
• All hardware delivered to Latisys
• LW Inc. IT staff installed and deployed to TOD (Translations on Demand) at 0 cost to LW SRL.
• Available languages: English to French, Spanish, Italian, German, Polish, Romanian, Swedish and vice-versa

Current Activities (M3)

[Diagram: REST API, TOD]

• LW sets up integration partner accounts
• Partners start development using the TOD REST API:
– HTTP-based communication protocol
• Web 2.0 style, as used by Amazon, Twitter, etc.
– Supported text formats: TXT, HTML, TMX, XLIFF
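As an illustration of what a partner-side call over HTTP might look like, here is a minimal sketch that builds (but does not send) a translation request. The endpoint URL, parameter names and language codes are hypothetical, since the deck does not give the real TOD API details. Authentication is omitted for the same reason. Only the list of supported formats comes from the slide.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint: the real TOD URL is not given in the deck.
BASE_URL = "https://tod.example.invalid/translations"
# Text formats listed on the slide.
SUPPORTED_FORMATS = {"txt", "html", "tmx", "xliff"}


def build_translation_request(text, source_lang, target_lang, fmt="txt"):
    """Build an HTTP POST request for a translation job (parameter names are assumed)."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    body = urlencode({
        "source": source_lang,
        "target": target_lang,
        "format": fmt,
        "text": text,
    }).encode("utf-8")
    return Request(
        BASE_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


req = build_translation_request("Hello world", "eng", "fra")
print(req.get_method(), req.full_url)
```

Sending the request would be a matter of passing `req` to `urllib.request.urlopen`, once a real endpoint and credentials are available.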

Upgrade TOD Framework (LW Milestone)

• Internal milestone for LW: migrate partner accounts to the upgraded TOD framework in month 9
• Provides new functionality that falls outside the FLAVIUS project but materially benefits the teams
• Extends current REST API
• Trustscore™ enabled baseline engines
• Utility-based, not quality-based, assessment
• Deployed for TripAdvisor and Dell
• Reporting of basic statistics

[Diagram: REST API, TOD, Reporting, Trustscore™]

Customization via Dictionary (M12)

• REST API enabled dictionary support
• Dictionary upload through API
• A dictionary will be specific to an account, per language pair
• e.g. Dell (account), Eng-Spa (LP), Servers Terminology (dictionary – 1+)
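The scoping rule above (one or more dictionaries per account and language pair) can be sketched as a small in-memory structure. The class name, method names and the sample entry are hypothetical. This only illustrates the keying scheme and is not a description of how TOD stores dictionaries.

```python
from collections import defaultdict


class DictionaryStore:
    """Toy store: (account, language_pair) -> {dictionary_name: {source_term: target_term}}."""

    def __init__(self):
        self._store = defaultdict(dict)

    def upload(self, account, language_pair, name, entries):
        # Uploading under an existing name replaces that dictionary.
        self._store[(account, language_pair)][name] = dict(entries)

    def lookup(self, account, language_pair, term):
        # Search every dictionary attached to this account and language pair.
        for entries in self._store.get((account, language_pair), {}).values():
            if term in entries:
                return entries[term]
        return None


store = DictionaryStore()
store.upload("Dell", "eng-spa", "Servers Terminology", {"server": "servidor"})
print(store.lookup("Dell", "eng-spa", "server"))  # servidor
print(store.lookup("Dell", "eng-fra", "server"))  # None
```

The second lookup returns nothing because the dictionary was uploaded for a different language pair, which is the isolation the slide describes.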

[Diagram: REST API, TOD, Reporting, Trustscore™, Dictionary]

Customization via Training (M21)

Parallel Aligned Text

Optional: Regression Text

Optional: Test Text

Evaluation

Data:
• Fix noisy text
• More text
• Text alignment
• Text segmentation
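As one illustration of the "fix noisy text" step, the sketch below filters a list of parallel segment pairs. The heuristics (whitespace normalization, dropping empty sides, a word-count ratio cutoff, de-duplication) are common corpus-cleaning practices chosen for this example, not Language Weaver's actual pipeline. The function name, the threshold and the sample sentences are assumptions.

```python
def clean_parallel(pairs, max_ratio=3.0):
    """Return (source, target) pairs that pass simple noise filters."""
    seen = set()
    cleaned = []
    for src, tgt in pairs:
        # Normalize runs of whitespace on both sides.
        src, tgt = " ".join(src.split()), " ".join(tgt.split())
        # Drop pairs with an empty side.
        if not src or not tgt:
            continue
        # Drop pairs whose lengths differ too much (likely misaligned).
        n_src, n_tgt = len(src.split()), len(tgt.split())
        if max(n_src, n_tgt) / min(n_src, n_tgt) > max_ratio:
            continue
        # Drop exact duplicates.
        if (src, tgt) in seen:
            continue
        seen.add((src, tgt))
        cleaned.append((src, tgt))
    return cleaned


sample = [
    ("The server is  down.", "Le serveur est en panne."),
    ("The server is down.", "Le serveur est en panne."),  # duplicate after normalization
    ("OK", ""),                                           # missing target
    ("Yes", "Veuillez redémarrer le serveur avant de continuer ."),  # likely misaligned
]
print(clean_parallel(sample))
```

Only the first pair survives here. In practice the ratio cutoff would need tuning per language pair.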

Product Delivery via TOD

LW Training Compute Cloud

[Diagram: REST API, TOD, Reporting, Trustscore™, Dictionary, Training]

Complete Picture

[Diagram: REST API, TOD, Reporting, Trustscore™, Dictionary, Training]

FLAVIUS Language Weaver Roadmap

Internal milestones

Who | What | When
LW | Translation Engines Up and Running | June 30th
LW | REST API to be used to access the SMT | June 30th
SFT | Architecture document should include the details of the translation API | TBD
TBD | Project presentation | June 30th
TBD | Project website | June 30th
 | Project logo | June 30th

Questions & Answers

Thank you!

Accelerating the way the world communicates