Galatea: Open-Source Software for Developing Anthropomorphic Spoken Dialog Agents

Date post: 11-Jan-2016
Page 1: Galatea: Open-Source Software for Developing Anthropomorphic Spoken Dialog Agents

S. Kawamoto, et al.

October 27, 2004

Page 2: Agenda

• Introduction
• Toolkit Design and Outline
  – Speech recognition module
  – Speech synthesis module
  – Facial image synthesis module
  – Agent manager
  – Virtual machine model
  – Task manager
  – Prototyping tools
• Prototype Systems
• Conclusions

Page 3: Introduction

• An anthropomorphic spoken dialog agent (ASDA) is one of the next-generation human-computer interfaces
• Many ASDA systems have been developed, but developing a high-quality ASDA system is still challenging
  – Such a system should offer an unlimited number of life-like agent characters with different faces and voices, just like humans
• For this reason, Galatea has been developed to provide a platform for building next-generation ASDA systems

Page 4: Features of the Toolkit

• Easy customization
  – Model-based approaches: once the model parameters are trained, facial expressions and voice quality can be controlled easily
• Key techniques for natural spoken dialog
  – Incremental speech recognition, synchronization between speech and facial animation, etc.
• Modularity of functional units
  – Simple architecture to manage each functional unit
  – Users can develop, improve, debug, etc. each unit
• Open-source free software

Page 5: Toolkit Design and Outline

[Architecture diagram; annotations:]
• Agent manager: works as an inter-module communication manager
• Devices: directly managed by the modules that use them
• New functions: added by creating a new module for the function and connecting it to the agent manager

Page 6: Speech Recognition Module (SRM)

• Major interfaces of SRM are as follows:
  – Outputs: recognition result (XML format); engine status (“busy”, “waiting”, ...)
  – Control commands: reload grammar, change the settings of the speech recognition engine
  – Grammar representation: transforms the XML grammar into a format that is accepted by the speech recognition engine

[Diagram labels: Command Interpreter, Grammar Transformer, Speech Recognition Engine; Speech input, Grammar, Request, Response]
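The grammar-transformation step listed for the SRM can be sketched roughly as follows. This is only an illustration: the XML element names (`grammar`, `rule`, `item`) and the BNF-like output are assumptions made for the example, since the slides do not specify Galatea's actual grammar schema or the format expected by the recognition engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML grammar; element and attribute names are illustrative
# assumptions, not Galatea's actual schema.
XML_GRAMMAR = """
<grammar root="greeting">
  <rule id="greeting">
    <item>hello</item>
    <item>good morning</item>
  </rule>
  <rule id="farewell">
    <item>goodbye</item>
  </rule>
</grammar>
"""


def transform_grammar(xml_text: str) -> str:
    """Convert the XML grammar into a BNF-like text, one rule per line."""
    root = ET.fromstring(xml_text)
    lines = []
    for rule in root.findall("rule"):
        # Each <item> is one alternative phrase for the rule.
        alternatives = [item.text.strip() for item in rule.findall("item")]
        lines.append(f"{rule.get('id')} : {' | '.join(alternatives)}")
    return "\n".join(lines)


print(transform_grammar(XML_GRAMMAR))
```

A real transformer would target whatever native grammar format the chosen engine requires; the point here is only that the XML stays engine-independent while the conversion is isolated in one component.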

Page 7: Speech Synthesis Module (SSM)

• Accepts arbitrary Japanese text
• Synthesizes speech with a human voice
  – An HMM-based speech synthesis method is employed
• Synchronizes lip movement with speech
• Can interrupt speech output to cope with any interruption by the user

[Diagram labels: Command Interpreter, Text Analyzer, Dictionary, Waveform Generation Engine, Acoustic Models, Speech Output]

Page 8: Facial Image Synthesis Module (FSM)

• Supports high-quality facial image synthesis, animation control, and precise lip-sync with voice
• A GUI is provided to fit a generic face wireframe model onto a full-face snapshot image
• Facial action control
  – Mouth shape
  – Facial expression

Page 9: Agent Manager (AM)

• Integrates all the modules of the ASDA system
• Plays a central role in inter-module communication
• Manages synchronization between SSM and FSM to achieve precise lip-sync

[Diagram labels: Dispatcher, Macro-command interpreter]
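The dispatcher and macro-command interpreter roles of the AM might be sketched as below. This is a toy model under stated assumptions, not Galatea's implementation: the `to @MODULE command` addressing syntax, the `greet` macro, and the `LipSync` slot are hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, List

Handler = Callable[[str], None]


class AgentManager:
    """Toy inter-module message router (not Galatea's actual implementation)."""

    def __init__(self) -> None:
        self.modules: Dict[str, Handler] = {}
        self.macros: Dict[str, List[str]] = {}

    def connect(self, name: str, handler: Handler) -> None:
        """Register a module under a name such as 'SSM' or 'FSM'."""
        self.modules[name] = handler

    def define_macro(self, name: str, commands: List[str]) -> None:
        """A macro expands to a list of 'to @MODULE command' lines."""
        self.macros[name] = commands

    def dispatch(self, line: str) -> None:
        """Expand a known macro, otherwise route the line to its module."""
        if line in self.macros:
            for command in self.macros[line]:
                self.dispatch(command)
            return
        prefix, target, command = line.split(" ", 2)
        if prefix != "to" or not target.startswith("@"):
            raise ValueError(f"malformed message: {line!r}")
        self.modules[target[1:]](command)


# Usage: two stand-in modules that simply record what they receive.
log = []
am = AgentManager()
am.connect("SSM", lambda cmd: log.append(("SSM", cmd)))
am.connect("FSM", lambda cmd: log.append(("FSM", cmd)))
am.define_macro("greet", ["to @SSM set Speak = now", "to @FSM set LipSync = now"])
am.dispatch("greet")
print(log)
```

Because modules only talk to the manager, adding a new function amounts to one more `connect` call, which matches the modularity argument made in the slides.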

Page 10: Virtual Machine Model

• Each module interface is modeled as a machine with slots
  – Each slot indicates a machine status
• Slot values are changed by a common command set
  – “set Speak = now” means starting voice synthesis of a given text immediately
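A minimal sketch of the slot-based machine model is given below. Only the command `set Speak = now` comes from the slides; the `Text` slot, the `idle` initial value, and the `spoken` list standing in for real synthesis are assumptions made for the example.

```python
import re


class SlotMachine:
    """A module modeled as a machine whose state is a set of named slots."""

    COMMAND = re.compile(r"^set\s+(\w+)\s*=\s*(.*)$")

    def __init__(self, slots):
        self.slots = dict(slots)
        self.spoken = []  # stands in for actual speech output

    def execute(self, command: str) -> None:
        """Apply one 'set <Slot> = <value>' command to the machine."""
        match = self.COMMAND.match(command.strip())
        if match is None:
            raise ValueError(f"unsupported command: {command!r}")
        name, value = match.group(1), match.group(2)
        if name not in self.slots:
            raise KeyError(f"unknown slot: {name}")
        self.slots[name] = value
        # Setting Speak to "now" triggers synthesis of the current text.
        if name == "Speak" and value == "now":
            self.spoken.append(self.slots["Text"])


# Usage: the Text slot is hypothetical; only Speak appears in the slides.
ssm = SlotMachine({"Text": "", "Speak": "idle"})
ssm.execute("set Text = konnichiwa")
ssm.execute("set Speak = now")
print(ssm.spoken)
```

Since every module exposes the same `set` style of command, a controller does not need module-specific APIs; it only needs to know which slots a module offers.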

Page 11: Task Manager (TM)

• Defines the dialog as a set of interactions that can be represented by a dialog description language
• The goal in developing the TM is for the system to support several types of dialog description languages
  – VoiceXML: high-level language for task-oriented information and the intentions of the participants
  – PDOC (primitive dialog operation commands): low-level language for device events and sequence control

Page 12: Prototyping Tools

• “Galatea Interaction Builder (IB)”

[Diagram labels: Application Developer, Interaction Builder, Galatea MMI System, XISL File, web site; actions: Design Scenario, Create XISL Document, Download and Execute XISL, Check]

Page 13: Prototype Systems

Page 14: Echo-back task

Page 15: Conclusions

• A human-like spoken dialog agent is one of the promising man-machine interfaces for the next generation
• Galatea is a software toolkit for developing human-like spoken dialog agents
• Because of its high modularity and simple communication architecture, it will speed up research and application development based on ASDA

