
CSETalk: A Spoken Dialogue System for CSE Course Information

Date post: 01-Jan-2016

Preethi Jyothi, Rohit Prabhavalkar, Thomas Lynch, Deepak Bal, Prateeti Mohapatra

Outline

• Introduction
• Automatic Speech Recognition
• Language Understanding
• Galaxy Overview
• Backend
• RavenClaw Architecture
• Speech Synthesis
• Evaluation Metrics
• Challenges Faced
• Conclusions and Future Work

Introduction

• Dialog systems seek to provide a natural conversational interaction between the user and the computer system

• Flow: spoken input from the human user -> meaning of the utterance -> results of the operation returned to the user

• Two-way flow of information
  – User-to-system
  – System-to-user

Introduction

• Types of Dialogue Systems
  – System-initiative
  – User-initiative
  – Mixed-initiative

Motivation

• All the components of a speech and language processing system are integrated

• Building a spoken dialogue system (SDS) is a challenging task due to the interaction of the various components that are involved

• Goal: To implement an end-to-end spoken dialog system for course information using the RavenClaw dialog management framework

Other Dialogue Systems

• Saplen system: food ordering system [R. López-Cózar et al., Eurospeech 1997]

• Let’s Go! Bus Information System [Raux et al., Eurospeech 2000]

• ITSpoke: Intelligent Tutoring SDS [Litman and Silliman, HLT-NAACL 2004]

Overall Architecture

• Recognition: SPHINX
• Lang. Understanding: PHOENIX/HELIOS
• Dialog Management: RAVENCLAW
• Back-end: (various)
• Lang. Generation: ROSETTA
• Synthesis: THETA

CSETalk: Course Information System

• Information-based
• Mixed-initiative dialogue system
• The user can seek information about course numbers, credits, instructors, call numbers, etc.
• Typical queries
  – What course is Prof. X offering?
  – What is the call number for CSE-333?

ASR – Sphinx II

• Using the Sphinx II decoding engine
• Uses semi-continuous hidden Markov models
• Used an off-the-shelf acoustic model (male and female voice)
• Pronunciation model and language model were built using the Sphinx Knowledge Base Tool
• Used an n-gram language model, built from a corpus of sentences generated randomly from the system grammar
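
The last bullet, building an n-gram model from randomly generated grammar sentences, can be sketched as follows. This is only an illustration: the toy grammar, the symbol names and the choice of an unsmoothed bigram model are assumptions, and the project itself used the Sphinx Knowledge Base Tool rather than code like this.

```python
import random
from collections import Counter

# Hypothetical toy grammar; the real CSETalk grammar is not shown in the slides.
# Upper-case keys are non-terminals, everything else is a word.
GRAMMAR = {
    "QUERY": [["what", "is", "the", "FIELD", "for", "COURSE"],
              ["who", "is", "teaching", "COURSE"]],
    "FIELD": [["call", "number"], ["instructor"]],
    "COURSE": [["seven", "thirty"], ["three", "thirty", "three"]],
}

def generate(symbol, rng):
    """Expand a symbol into a list of words by picking random rules."""
    if symbol not in GRAMMAR:          # terminal word
        return [symbol]
    words = []
    for part in rng.choice(GRAMMAR[symbol]):
        words.extend(generate(part, rng))
    return words

def bigram_model(sentences):
    """Maximum-likelihood bigram probabilities P(w2 | w1) with sentence markers."""
    pair_counts, context_counts = Counter(), Counter()
    for words in sentences:
        padded = ["<s>"] + words + ["</s>"]
        for w1, w2 in zip(padded, padded[1:]):
            pair_counts[(w1, w2)] += 1
            context_counts[w1] += 1
    return {pair: n / context_counts[pair[0]] for pair, n in pair_counts.items()}

rng = random.Random(0)                 # fixed seed so the corpus is reproducible
corpus = [generate("QUERY", rng) for _ in range(200)]
model = bigram_model(corpus)
print(len(corpus), "sentences,", len(model), "distinct bigrams")
```

Because the corpus comes from the grammar, word pairs the grammar never produces get zero probability unless the model is smoothed.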

Helios – Confidence Annotation

• Generates a confidence score for the utterance based on information from the ASR, parser and dialog manager
• ASR
  – Number of words, 'unconfident' words
• Parser
  – Uncovered words, transitions between parsed fragments, unparsed fragments, etc.
• Dialog manager
  – State of dialog, concepts expected at current state, number of turns at current state, etc.
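
As a rough illustration of combining these features into one score, the sketch below assumes a logistic combination. The feature names and all weights are made up for the example; they are not taken from Helios.

```python
import math

# Hypothetical weights; a real annotator would learn them from labeled data.
WEIGHTS = {
    "bias": 2.0,
    "num_words": -0.05,               # ASR
    "unconfident_word_ratio": -3.0,   # ASR
    "uncovered_word_ratio": -2.5,     # parser
    "num_fragments": -0.4,            # parser
    "expected_concept": 1.5,          # dialog manager
    "turns_in_state": -0.3,           # dialog manager
}

def confidence(features):
    """Map ASR, parser and dialog-manager features to a score in (0, 1)."""
    z = WEIGHTS["bias"]
    for name, value in features.items():
        z += WEIGHTS[name] * value
    return 1.0 / (1.0 + math.exp(-z))

clean = {"num_words": 5, "unconfident_word_ratio": 0.0, "uncovered_word_ratio": 0.0,
         "num_fragments": 1, "expected_concept": 1, "turns_in_state": 1}
noisy = {"num_words": 9, "unconfident_word_ratio": 0.6, "uncovered_word_ratio": 0.5,
         "num_fragments": 3, "expected_concept": 0, "turns_in_state": 4}
print(round(confidence(clean), 3), round(confidence(noisy), 3))
```

A dialog manager could compare such a score against a threshold to decide whether to accept, confirm or reject what it heard.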

Helios – Confidence Annotation

• Allows detection of, and recovery from, misunderstandings

• Unfortunately, due to lack of time, we were not able to explore it fully

Phoenix - Parser

• The parser parses the word sequence into a set of semantic frames

• A frame is a set of named 'slots'

• Frame: [CSETalk]
  Nets: [QueryCallNum]
        [QueryInstructor]
        ...

• Nets are compiled into RTNs (recursive transition networks)
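
To make the frame/net/slot idea concrete, here is a minimal sketch that fills a frame from an utterance. Regular expressions stand in for the compiled RTNs, and the patterns, the number-word list and the `courseNumber` slot layout are assumptions for illustration, not the project's actual Phoenix grammar.

```python
import re

# Hypothetical patterns standing in for Phoenix nets; real Phoenix compiles
# grammar files into recursive transition networks rather than regexes.
NUMBER_WORDS = r"(?:one|two|three|four|five|six|seven|eight|nine|ten|twenty|thirty|forty|fifty)"
COURSE = rf"(?P<courseNumber>{NUMBER_WORDS}(?: {NUMBER_WORDS})*)"
NETS = {
    "QueryCallNum": re.compile(rf"call number (?:for|of) (?:cse )?{COURSE}"),
    "QueryInstructor": re.compile(rf"who(?: is|'s) teaching (?:cse )?{COURSE}"),
}

def parse(utterance):
    """Return a frame dict with the first matching net and its filled slots."""
    text = utterance.lower().strip(" ?.!")
    for net, pattern in NETS.items():
        match = pattern.search(text)
        if match:
            return {"frame": "CSETalk", "net": net, "slots": match.groupdict()}
    return {"frame": "CSETalk", "net": None, "slots": {}}

print(parse("Who's teaching seven thirty?"))
```

Words outside the matched pattern are simply skipped, which loosely mirrors how a robust parser tolerates uncovered words.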

Phoenix - Parser

[Figure: RTN for [QueryCallNum]]


Galaxy Communicator

http://communicator.sourceforge.net/sites/MITRE/distributions/GalaxyCommunicator/docs/manual/index.html

What is Galaxy Communicator

Per the Galaxy website:

• Distributed
• Message based
• Hub-and-spoke infrastructure
• Optimized for constructing spoken dialogue systems

Origins:

• Based on the MIT Galaxy system
• Maintained by the MITRE Corporation for the DARPA Communicator program (now concluded)
• Available at SourceForge

http://communicator.sourceforge.net/sites/MITRE/distributions/GalaxyCommunicator/docs/manual/index.html

Galaxy Overview

• Modular, with Galaxy as the communications controller.
• Everything is sent using sockets and frames.
• Servers/modules can be located on the network or internet.
• Galaxy transforms the information from text strings to sockets and back again.
• Easy to interface with C++ or any other language with a sockets library.
• Handles storage and forwarding for each independent module.
• Monitors the status of the other modules.
• Logs information pertaining to the other servers specified. (Very flexible.)
• No need to negotiate communications parameters with the other modules.
• Simple to add a different package to handle some aspect of the system.
  – Write an interface from the new system to Galaxy.
  – For MySQL, changed the original backend server call to load and call MySQL.
  – The same strategy would work for other modules.
• Downside: Galaxy is a single point of failure. If Galaxy goes down, everything is down and no one is notified.
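
The hub-and-spoke routing described above can be sketched in a few lines. This is an in-process toy under stated assumptions: the real hub talks to servers over sockets, and the class, method and server names here are invented for illustration, not part of the Galaxy API.

```python
class Hub:
    """Toy hub: every message goes through it; servers never talk directly."""

    def __init__(self):
        self.servers = {}   # server name -> handler function
        self.log = []       # (sender, receiver, frame) tuples

    def register(self, name, handler):
        self.servers[name] = handler

    def send(self, sender, receiver, frame):
        """Route a frame (a plain dict here) and return the receiver's reply."""
        if receiver not in self.servers:
            raise KeyError(f"no server registered as {receiver!r}")
        self.log.append((sender, receiver, frame))
        return self.servers[receiver](frame)

def backend(frame):
    # Hypothetical handler: pretend to look up a course.
    return {"courseId": frame["courseId"], "instructor": "Eric Fosler-Lussier"}

hub = Hub()
hub.register("backend", backend)
reply = hub.send("dialog_manager", "backend", {"courseId": "730"})
print(reply)
print(hub.log)
```

Swapping in a different backend only means registering a different handler, which is the "simple to add a different package" point; the single `Hub` object is also the single point of failure the last bullet warns about.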

The Details

What it's not:

• Not an end-to-end dialogue system
• No run-time semantic standards
• No configuration-time semantic standards
• But compatible with many standards, like the W3C Voice Browser Working Group's proposed specifications

Knowledge requirements:

• You need to know C programming
• Some background in distributed processing (RPC, CORBA, Java RMI)
• Reasonable command of English, to read the documentation

Platforms: Sparc Solaris, Intel Linux and Win32

Compiler: GNU gcc and make.

The Exchange

Select *
From courses
Where courseId = "730"
  and instructor like "fosler-lussier"

{ query course_query courseId seven thirty three credits callNum instructor fosler-lussier)

courseId  callNum  creditHrs  starttime  endtime  dow  instructor
730       04581-7  3          1130       1248     T R  Eric Fosler-Lussier
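
A minimal sketch of the frame-to-SQL step follows. It assumes the query frame has already been reduced to column/value constraints, and it uses Python's built-in SQLite module in place of MySQL so that it runs on its own; `build_query` and its operator table are hypothetical names, and the inserted row is the one shown on the slide.

```python
import sqlite3

def build_query(frame):
    """Turn {column: value} constraints into a parameterized SELECT."""
    allowed = {"courseId": "=", "instructor": "LIKE"}   # hypothetical whitelist
    clauses, params = [], []
    for column, value in frame.items():
        op = allowed[column]                 # KeyError on unexpected columns
        clauses.append(f"{column} {op} ?")
        params.append(f"%{value}%" if op == "LIKE" else value)
    return "SELECT * FROM courses WHERE " + " AND ".join(clauses), params

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE courses (courseId TEXT, callNum TEXT, creditHrs INTEGER,
                starttime TEXT, endtime TEXT, dow TEXT, instructor TEXT)""")
conn.execute("INSERT INTO courses VALUES (?, ?, ?, ?, ?, ?, ?)",
             ("730", "04581-7", 3, "1130", "1248", "T R", "Eric Fosler-Lussier"))

sql, params = build_query({"courseId": "730", "instructor": "fosler-lussier"})
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

Passing values as parameters rather than pasting them into the SQL string keeps misrecognized or unexpected words from altering the query.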

The Exchange

{ query refine_results results : 1 { { courseId "730" credits "3" callNum "04581-7" callNumM "04581-7" dow "T R" starttime "1130" endtime "1248" room "DL 0305" instructor "Eric Fosler-Lussier" title "Survey of Artificial Intelligence II: Advanced Topics U G 3" description "A survey of advanced concepts, techniques, and applications of artificial intelligence, including knowledge-based systems, learning, natural language understanding, and vision." ) }}

// Pull the incoming query string out of the Galaxy frame f
aStr = Gal_GetObject(f, ":inframe");
inframe = Gal_StringValue(aStr);

// Our stuff: look up the course and build the reply string
GetCourseInfoC( inframe, &outframe );

// Wrap the reply as a Galaxy string object and attach it to the frame
aStr = Gal_StringObject(outframe);
Gal_SetProp(f, ":outframe", aStr);

Backend Server

• MySQL is the database. A different choice may be made next time, perhaps something simpler like SQLite.

• The original RoomLine system calls Perl scripts for data retrieval.

• Many of the components also use Perl scripts for various functions.

Galaxy Summary

• Makes Life Easy
• It Glues Everything Together (Seamlessly?)
• Do Not Start From Scratch! You’ll be sorry.
• Modify an Existing System.
• Great Piece of Software to Allow Experimenting with various components.
• However, too many single points of failure!

RavenClaw Architecture

• Dialog Task Specification (Domain Specific)
  – Domain-specific dialog control logic specified
  – Most of the effort goes into creating this specification

• Dialog Engine (Domain Independent)
  – Executes the Dialog Task Specification to control the dialog at runtime
  – Provides a large number of domain-independent conversational strategies

Dialog Task Specification

• Hierarchical plan for the dialog
• Tree of dialog agents
  – Non-terminals - Dialog Agencies
  – Terminals - Fundamental Dialog Agents
    • Inform
    • Request
    • Expect and Execute

Dialog Agents

• Concepts – associated with agents
  – Concepts have pre-defined types
  – Set of value/confidence pairs

• Structure of an agent
  – Execute routine – dependent on the agent type
  – Preconditions
  – Success/Failure criteria
  – Trigger conditions/ Trigger commands
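
The two ideas on these slides, a tree of agents and concepts stored as value/confidence pairs, can be sketched as plain data structures. RavenClaw itself is a C++ framework with its own specification macros; the Python classes below, the `LookupCourse` agent and the confidence numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    """A typed concept holding (value, confidence) hypotheses."""
    name: str
    type_name: str
    hypotheses: list = field(default_factory=list)

    def update(self, value, confidence):
        self.hypotheses.append((value, confidence))

    def top(self):
        """Return the most confident value, or None if nothing is bound."""
        return max(self.hypotheses, key=lambda h: h[1])[0] if self.hypotheses else None

@dataclass
class Agent:
    """A node in the dialog task tree; agencies have children, terminals do not."""
    name: str
    kind: str                      # "Agency", "Inform", "Request", "Expect" or "Execute"
    children: list = field(default_factory=list)

    def terminals(self):
        """List the names of the fundamental (leaf) agents, left to right."""
        if not self.children:
            return [self.name]
        return [t for child in self.children for t in child.terminals()]

# Hypothetical tree; LookupCourse is an invented agent name.
tree = Agent("CSETalk", "Agency", [
    Agent("Welcome", "Inform"),
    Agent("Task", "Agency", [
        Agent("HowMayIHelpYou", "Request"),
        Agent("LookupCourse", "Execute"),
    ]),
])
course_id = Concept("Course_Id", "string")
course_id.update("730", 0.62)
course_id.update("733", 0.21)
print(tree.terminals(), course_id.top())
```

Keeping several hypotheses per concept is what lets the system later confirm or reject a low-confidence value instead of committing to it.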

RavenClaw Dialog Engine

• Functions using two data structures
  – Dialog Stack
  – Expectation Agenda

• Functions in two phases
  – Execution Phase – Dialog agents executed from Dialog Stack
  – Input Phase – Uses Expectation Agenda to map user inputs to concept values
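
A heavily simplified sketch of the two phases is given below. It assumes each agent is popped as soon as it runs and that the agenda is a flat slot-to-concept map; the real engine keeps agents on the stack until they complete and builds a multi-level agenda, so treat the class and method names as hypothetical.

```python
class Engine:
    """Toy two-phase loop: execute agents from a stack, then bind user input."""

    def __init__(self):
        self.stack = []        # Dialog Stack: top of stack is the last element
        self.agenda = {}       # Expectation Agenda: grammar slot -> concept name
        self.concepts = {}     # concept name -> bound value
        self.prompts = []      # what the system has said so far

    def push(self, agent, prompt=None, expects=None):
        self.stack.append((agent, prompt, expects or {}))

    def execution_phase(self):
        """Pop and run the agent on top of the stack."""
        agent, prompt, expects = self.stack.pop()
        if prompt:
            self.prompts.append(prompt)
        self.agenda.update(expects)     # open expectations for the next input
        return agent

    def input_phase(self, parsed_slots):
        """Bind parsed slots to concepts through the Expectation Agenda."""
        for slot, value in parsed_slots.items():
            if slot in self.agenda:
                self.concepts[self.agenda[slot]] = value

engine = Engine()
engine.push("HowMayIHelpYou", "How can I help you?",
            {"courseNumber": "Course_Id", "professor": "Course_Inst"})
engine.push("Welcome", "Welcome to CSE Talk ...")
engine.execution_phase()                      # runs Welcome
engine.execution_phase()                      # runs HowMayIHelpYou
engine.input_phase({"courseNumber": "seven thirty"})
print(engine.prompts, engine.concepts)
```

Slots that no agent currently expects are ignored in this sketch, which is the role the agenda plays in filtering user input.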

Dialog Task Specification

[Figure, built up over eight slides: the RavenClaw Dialog Engine running the CSETalk dialog task. Panels: Dialog Task Specification (captures domain-specific dialog control logic); Dialog Engine (domain-independent reusable component); Dialog Stack; Expectation Agenda.]

• Execution Phase: labels appear in this order as the figure builds
  – CSETalk
  – Welcome – S: Welcome to CSE Talk ...
  – Task
  – How May I Help You – S: How can I help you?
  – Course_Id: [courseNumber]
  – Course_Inst: [professor]

• Input Phase
  – U: Who’s teaching seven thirty?
  – [courseNumber](seven thirty)

Language Generation - Rosetta

• Template-Based
• Attribute-Value pairs -> Text for synthesis
• Input from Dialog Manager, Output to TTS
  – Input Frame
    {
      act inform
      object welcome
    }
  – Output
    Welcome to CSETalk Automated Course Information….
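
Template-based generation of this kind can be sketched as a lookup keyed on the frame's act and object. The template table, the extra `instructor` template and the `generate` function are hypothetical; Rosetta's own template format is not shown in the slides.

```python
# Hypothetical template table keyed by (act, object); this only illustrates
# the lookup-and-fill idea, not Rosetta's actual template syntax.
TEMPLATES = {
    ("inform", "welcome"): "Welcome to CSETalk Automated Course Information.",
    ("request", "how_may_i_help_you"): "How can I help you?",
    ("inform", "instructor"): "{instructor} is teaching CSE {courseId}.",
}

def generate(frame):
    """Pick the template for the frame's act/object and fill in the other pairs."""
    template = TEMPLATES[(frame["act"], frame["object"])]
    values = {k: v for k, v in frame.items() if k not in ("act", "object")}
    return template.format(**values)

print(generate({"act": "inform", "object": "welcome"}))
print(generate({"act": "inform", "object": "instructor",
                "instructor": "Eric Fosler-Lussier", "courseId": "730"}))
```

The generated string is what would be handed to the synthesis component.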

Speech Synthesis

• Within Galaxy, Kalliope manages the speech synthesis
  – Is the link between Festival and the Galaxy Hub
  – Submits synthesis requests to Festival, holds a synthesis queue, etc.
  – Designed to work with multiple synthesis systems (Festival, Swift, Theta)

Speech Synthesis

• Actual speech synthesized by Festival
  – Open Source speech synthesis software
  – Developed at University of Edinburgh and worked on extensively at CMU
  – Currently includes many types of voices
    • Diphone (what we use)
    • HMM based (referred to as HTS)
    • Unit Selection

Speech Synthesis

• CMU’s Festvox project hopes to make the building of new voices more systematic and better documented
  – Extensive instructions and tools on the web to help with the creation of new voices
  – Limited-domain voices could be useful for our project
    • Uses unit selection methods

Performance Evaluation

• Many facets to measuring performance
  – Efficiency
    • Processing time
    • How much time is wasted with corrections, etc.
  – Quality
    • How many times did the system misinterpret
    • How many times did the user have to correct the system
  – User Satisfaction
    • Does the user feel happy with their interaction
    • Have user fill out survey
  – Task Success
    • Did the user leave with the information that they came for

Performance Evaluation

• One framework that tries to incorporate all of these facets is PARADISE (PARAdigm for DIalogue System Evaluation)
  – PARADISE proposes to compute a performance measure as a function of both task success and dialogue costs
  – If design changes were made, PARADISE could be used to evaluate the effectiveness of the changes.
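
The PARADISE performance measure has the form: performance = alpha * N(task success) minus the sum over costs of w_i * N(cost_i), where N is z-score normalization. The sketch below computes it for three invented dialogues; the data, alpha and the weights are assumptions (PARADISE estimates the weights by regressing on user satisfaction, and its original formulation measures task success with the kappa statistic).

```python
from statistics import mean, pstdev

def z_scores(values):
    """Normalize to zero mean and unit standard deviation (the N(.) above)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]

def paradise(task_success, costs, alpha, weights):
    """Performance per dialogue: alpha * N(success) - sum_i w_i * N(cost_i)."""
    n_success = z_scores(task_success)
    n_costs = {name: z_scores(vals) for name, vals in costs.items()}
    return [
        alpha * n_success[d] - sum(weights[name] * n_costs[name][d] for name in costs)
        for d in range(len(task_success))
    ]

# Made-up numbers for three dialogues; alpha and the weights are assumptions.
success = [1.0, 0.5, 0.0]
costs = {"turns": [6, 10, 14], "corrections": [0, 2, 4]}
scores = paradise(success, costs, alpha=0.5,
                  weights={"turns": 0.2, "corrections": 0.3})
print([round(s, 3) for s in scores])
```

Normalizing each factor first is what allows task success and costs measured in different units (turns, corrections, seconds) to be combined in one number.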


Challenges Faced

• NO DOCUMENTATION!!

• Hard to localize faults since ASR/DM/NLG are tightly coupled

• Festival runs best on UNIX whereas the rest of the system runs on Windows

• Limited Time

Conclusions and Future Work

• Using more classes in the language model

• Using other corpora to obtain the acoustic model and the language model information

• Creating or using a better TTS voice

• Actually evaluating performance as discussed

