EVALITA 2009: Loquendo Spoken Dialog System

Paolo Baggia
Director of International Standards
Speech Luminary at SpeechTEK 2009

Evalita Workshop
December 12th, 2009

Company Profile

• Privately held company (fully owned by Telecom Italia), founded in 2001 as a spin-off from Telecom Italia Labs, capitalizing on 30 years of experience and expertise in voice processing.
• Global company, leader in Europe and South America for award-winning, high-quality voice technologies (synthesis, recognition, authentication and identification), available in 26 languages and 62 voices.
• Multilingual, proprietary technologies protected by over 100 patents worldwide.
• Financially robust: break-even reached in 2004, revenues and earnings growing year on year.
• Growth-plan investment approved for the evolution of products and services.
• Headquarters in Torino; offices in New York; local representative sales offices in Rome, Madrid, Paris, London, Munich.
• Flexible: about 100 employees, plus a vibrant ecosystem of local freelancers.

[Map of office locations: Torino, Rome, Madrid, Paris, London, New York, Munich]

Overview

• A Bit of Context
• Loquendo Spoken Dialog System
    • Loquendo VoxNauta VoiceXML/CCXML platform
    • SDS – Ingredients
    • Grammar Handling
    • Evaluation

A Bit of Context

A Brief History of Speech Technologies (by Roberto Pieraccini)

[Timeline figure. Year markers: 1769, 1920, 1936, 1952, 1971, 1975, 1982, 1985, 1988, 1990, 1995, 2000]

Milestones shown on the timeline:
• Von Kempelen's Talking Machine
• Radio Rex
• BL's Voder
• BL's Audrey
• DARPA SUR program starts
• DARPA Resource Management
• DARPA ATIS
• DARPA COMMUNICATOR
• Dictation systems
• Dictation Industry
• Spoken dialog Industry

Techniques and trends shown on the timeline:
• FFT, cepstrum, DTW, TTS
• Hidden Markov Models
• Widespread use of HMMs
• Isolated Words
• Continuous Speech Recognition
• Speech Understanding
• Dialog
• MultiModal
• NLU systems
• Conversational systems

The (r)evolution of VoiceXML, 1998 - 2004

[Timeline figure. Year markers: 1998, 1999, 2000, 2002, 2004, 2007, 2008, 2009]

Events shown on the timeline:
• W3C Voice Browser Workshop
• VoiceXML Forum Birth (by AT&T, IBM, Lucent, Motorola)
• W3C charters Voice Browser WG
• VoiceXML 1.0 Released
• W3C charters Multimodal Interaction WG
• SALT Forum Birth (by Cisco, Comverse, Intel, Microsoft, Philips, SpeechWorks)
• VoiceXML 2.0 W3C Rec
• SRGS 1.0 W3C Rec
• SSML 1.0 W3C Rec
• SISR 1.0 W3C Rec
• VoiceXML 2.1 W3C Rec
• PLS 1.0 W3C Rec
• EMMA 1.0 W3C Rec

[Photo: Preparing to announce VoiceXML 1.0, Friday Feb. 25th, 2000, Lucent, Naperville, Illinois. Left to right: Gerald Karam (AT&T), Linda Boyer (IBM), Ken Rehor (Lucent), Bruce Lucas (IBM), Pete Danielsen (Lucent), Jim Ferrans (Motorola), Dave Ladd (Motorola).]

Speech Interface Framework in 2000 (by Jim Larson)

[Diagram. Components: User, Telephone System, ASR, DTMF Tone Recognizer, Language Understanding, Context Interpretation, Dialog Manager, Media Planning, Language Generation, TTS, Pre-recorded Audio Player, World Wide Web]

Languages and specifications shown in the diagram:
• VoiceXML 2.0
• VoiceXML 2.1
• Speech Recognition Grammar Spec. (SRGS)
• N-gram Grammar ML
• Semantic Interpretation for Speech Recognition (SISR)
• Natural Language Semantics ML
• EMMA
• Speech Synthesis Markup Language (SSML)
• Pronunciation Lexicon Specification (PLS)
• Call Control XML (CCXML)
• Reusable Components

Speech Interface Framework - End of 2009 (by Jim Larson)

[Diagram. Components: User, Telephone System, ASR, DTMF Tone Recognizer, Language Understanding, Context Interpretation, Dialog Manager, Media Planning, Language Generation, TTS, Pre-recorded Audio Player, World Wide Web]

Languages and specifications shown in the diagram:
• VoiceXML 2.0
• VoiceXML 2.1
• Speech Recognition Grammar Spec. (SRGS)
• N-gram Grammar ML
• Semantic Interpretation for Speech Recognition (SISR)
• Natural Language Semantics ML
• EMMA 1.0
• Speech Synthesis Markup Language (SSML)
• Pronunciation Lexicon Specification (PLS)
• Call Control XML (CCXML)
• Reusable Components

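Architectural Changes

Traditional (proprietary) architecture:
• The User interacts with a Speech Applic. through ASR / DTMF (input) and TTS / Audio (output)
• The application is built with a proprietary SCE and runs on a proprietary platform

VoiceXML architecture:
• The User interacts with a VoiceXML Browser through ASR / DTMF (input) and TTS / Audio (output)
• The VoiceXML Browser runs on a VoiceXML platform and talks to a Web Applic. over HTTP
• Resources fetched from the Web Applic.: .vxml (dialogs); .grxml/.gram, .pls (for ASR); .ssml, .wav/.mp3, .pls (for TTS / Audio)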
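The web side of the VoiceXML architecture can be illustrated with a minimal sketch, assuming a Python WSGI application in place of whatever server technology a real deployment uses. The page content, the `city.grxml` grammar name and the `/next` URL are hypothetical and not taken from the slides; the browser side (ASR, TTS, telephony) is not modelled.

```python
# A VoiceXML 2.1 page with one field; names and URLs are illustrative only.
VOICEXML_PAGE = """<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="departure">
    <field name="city">
      <prompt>Where are you leaving from?</prompt>
      <grammar src="city.grxml" type="application/srgs+xml"/>
      <filled>
        <submit next="/next" namelist="city"/>
      </filled>
    </field>
  </form>
</vxml>
"""


def voicexml_app(environ, start_response):
    """Minimal WSGI application: answer every request with the VoiceXML page."""
    body = VOICEXML_PAGE.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "application/voicexml+xml"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


# Exercise the app without a server, calling it the way a WSGI server would.
captured = {}


def start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)


body = b"".join(voicexml_app({"REQUEST_METHOD": "GET", "PATH_INFO": "/"}, start_response))
print(captured["status"], captured["headers"]["Content-Type"])
```

The point of the sketch is that the dialog is just a document fetched over HTTP, so any web stack can produce it.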

The Landscape Changed!

VoiceXML changed the landscape of speech application development: from proprietary to standards-based speech applications.

Before:
• Proprietary platforms (HW & SW)
• Proprietary applications (by proprietary SCE)
• Mainly DTMF and pre-recorded prompts
• First attempts to add speech into IVR

After:
• Standard VoiceXML platforms
• Standards for Speech Technologies
• Standard tools for VoiceXML applications
• Integration of DTMF and ASR
• Still predominance of DTMF, but more and more speech applications

VoiceXML Key Features:
• It takes the web paradigm to the core of speech applications development
• It is a powerful abstraction: easy to author

Loquendo Spoken Dialog System

VoxNauta – Internal Architecture

Loquendo SDS – Ingredients

Dialog Implementation:
• VoiceXML, mostly static application (JSP for dynamic pages)
• VoiceXML mixed initiative for multi-slot input
• Barge-in always present (to interrupt prompts and shift focus)
• Data in MySQL DB

Speech Grammar Development:
• SRGS grammars, no SLM
• Mostly dynamic grammars (JSP generated)
• Wild exploitation of Garbage Rule (of SRGS)

Prompting:
• Pure TTS, untuned

Development / Tuning: 1 week
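To make the idea of a dynamically generated grammar concrete, here is a minimal sketch in Python rather than JSP. It assumes the phrases come from a database query (replaced here by a hard-coded, hypothetical list) and that the recognition language is Italian; the function name `build_srgs_grammar` and the rule id `brand` are illustrative and not part of the Loquendo system.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"
XML_NS = "http://www.w3.org/XML/1998/namespace"
ET.register_namespace("", SRGS_NS)  # serialize SRGS as the default namespace


def build_srgs_grammar(rule_id, phrases, lang="it-IT"):
    """Build an SRGS 1.0 XML grammar whose root rule accepts any one phrase."""
    grammar = ET.Element(
        f"{{{SRGS_NS}}}grammar",
        {"version": "1.0", "mode": "voice", "root": rule_id,
         f"{{{XML_NS}}}lang": lang},
    )
    rule = ET.SubElement(grammar, f"{{{SRGS_NS}}}rule",
                         {"id": rule_id, "scope": "public"})
    one_of = ET.SubElement(rule, f"{{{SRGS_NS}}}one-of")
    for phrase in phrases:
        item = ET.SubElement(one_of, f"{{{SRGS_NS}}}item")
        item.text = phrase
    return ET.tostring(grammar, encoding="unicode")


# Hypothetical values standing in for the rows returned by a database query.
brands = ["alfa", "beta", "gamma"]
grammar_xml = build_srgs_grammar("brand", brands)
print(grammar_xml)
```

Regenerating the grammar on each request keeps the recognizer vocabulary in step with the database, at the cost of a fetch and compile per dialog turn unless the platform caches it.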

Speech Grammar with Garbage Rules

Typical Garbage Topologies (the garbage part of each example is in parentheses):
• Prefix: Garbage, then Content part. Example: “(Well…I’m leaving…er…sorry) from Boston”
• Postfix: Content part, then Garbage. Example: “at 5pm (please)”
• Garbage, then Content part, then Garbage. Example: “(I’d like to travel) from Rome to Venice (please)”

• An attempt to explore limits/advantages in conversational dialogs
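The three topologies can be sketched with the special GARBAGE rule that SRGS defines (`<ruleref special="GARBAGE"/>` in the XML form). The sketch below is an illustration only, not the grammars used in the evaluation; the helper `garbage_rule` and the content rule id `route` are hypothetical.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"
ET.register_namespace("", SRGS_NS)  # serialize SRGS as the default namespace


def garbage_rule(rule_id, content_rule_id, prefix=False, postfix=False):
    """Return an SRGS <rule> referencing a content rule, optionally
    preceded and/or followed by the special GARBAGE rule."""
    rule = ET.Element(f"{{{SRGS_NS}}}rule", {"id": rule_id})
    if prefix:
        ET.SubElement(rule, f"{{{SRGS_NS}}}ruleref", {"special": "GARBAGE"})
    # Local reference to the rule that carries the meaningful content.
    ET.SubElement(rule, f"{{{SRGS_NS}}}ruleref", {"uri": f"#{content_rule_id}"})
    if postfix:
        ET.SubElement(rule, f"{{{SRGS_NS}}}ruleref", {"special": "GARBAGE"})
    return rule


prefix_rule = garbage_rule("route_prefix", "route", prefix=True)
postfix_rule = garbage_rule("route_postfix", "route", postfix=True)
both_rule = garbage_rule("route_both", "route", prefix=True, postfix=True)

for rule in (prefix_rule, postfix_rule, both_rule):
    print(ET.tostring(rule, encoding="unicode"))
```

How much speech GARBAGE actually absorbs is left to the recognizer by the specification, so behaviour may differ between platforms.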

Loquendo SDS – Grammar Encapsulation

Robust Domain Concept Grammars:
• For Brands, Categories, Codes, Names

Component Grammars:
• First level of composition of domain concepts

Combinatorial Grammars:
• Combination of Components in different orders
• Insertion of Garbage Rules
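A combinatorial grammar of this kind can be sketched by enumerating the orderings of the component rules. The example below emits a rule in the ABNF form of SRGS, where `$name` references a rule and `$GARBAGE` is the special garbage rule. Placing garbage before, between and after the components is an assumption made for illustration, as are the component names `brand` and `category` and the helper `combinatorial_rule`.

```python
from itertools import permutations


def combinatorial_rule(name, components):
    """Return an ABNF-form SRGS rule accepting the components in any order,
    with the special $GARBAGE rule around and between them."""
    alternatives = []
    for order in permutations(components):
        refs = [f"${component}" for component in order]
        alternatives.append("$GARBAGE " + " $GARBAGE ".join(refs) + " $GARBAGE")
    body = "\n    | ".join(alternatives)
    return f"${name} =\n      {body};"


rule_text = combinatorial_rule("request", ["brand", "category"])
print(rule_text)
```

The number of alternatives grows factorially with the number of components, so this approach suits the small component sets typical of short, focused dialogs.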

Loquendo SDS – Evaluation

Short Dialogs:
• Mixed initiative and flexible grammars were effective

Task Success Rate:
• Very high for implemented tasks

Some Tasks were not Implemented:
• Initial requirements a bit vague
• Very precise testing scenarios
• No last-minute tuning of Loquendo SDS

Final Evaluation Table

Task                    | UniNA Duration (turns) | UniNA Tsr (corr/req) | Loquendo Duration (turns) | Loquendo Tsr (corr/req) | UniTN Duration (turns) | UniTN Tsr (corr/req)
Identify representative | 1.9 ± 0.4              | 100.0% (19/19)       | 2.4 ± 0.8                 | 95.0% (19/20)           | 3.1 ± 0.5              | 90.5% (19/21)
Ask customer detail     | 2.0 ± 0.0              | 83.3% (5/6)          | 2.3 ± 0.5                 | 88.9% (8/9)             | 3.4 ± 1.6              | 54.6% (12/22)
List orders             | 2.5 ± 1.5              | 0.0% (0/8)           | 2.0 ± 0.0                 | 80.0% (4/5)             | 3.0 ± 0.0              | 75.0% (3/4)
List customers          | 2.0 ± 0.0              | 50.0% (2/4)          | 2.0 ± 0.0                 | 0.0% (0/8)              | 3.0 ± 0.0              | 66.7% (2/3)
New order               | 4.6 ± 1.5              | 36.4% (4/11)         | 4.3 ± 1.8                 | 42.9% (9/21)            | 7.5 ± 2.8              | 63.2% (12/19)
List products - other   | 2.0 ± 0.0              | 0.0% (0/4)           | 3.0 ± 0.8                 | 25.0% (2/8)             | 3.8 ± 1.6              | 44.4% (4/9)
Search single product   | 2.3 ± 0.4              | 55.6% (5/9)          | 2.8 ± 1.6                 | 77.8% (14/18)           | 3.5 ± 2.5              | 78.6% (11/14)
Overall (corr/req)      | -                      | 58.4% (45/77)        | -                         | 62.2% (56/90)           | -                      | 63.5% (73/115)
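The Tsr figures read as the ratio of correctly completed tasks to requested tasks, as the (corr/req) header suggests. Under that assumption, the overall percentages in the table can be reproduced from the raw counts; the function name below is illustrative.

```python
def task_success_rate(correct, requested):
    """Task success rate as a percentage, rounded to one decimal place."""
    return round(100.0 * correct / requested, 1)


# Overall (corr, req) counts as they appear in the table.
overall_counts = [(45, 77), (56, 90), (73, 115)]
overall_rates = [task_success_rate(corr, req) for corr, req in overall_counts]
print(overall_rates)  # [58.4, 62.2, 63.5]
```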

THANK YOU

for clarifications or questions:

[email protected]

