
A Framework For Developing Conversational User Interfaces

James Glass, Eugene Weinstein, Scott Cyphers, Joseph Polifroni

MIT Computer Science and Artificial Intelligence Laboratory Cambridge, MA USA

Grace Chung
Corporation for National Research Initiatives
Reston, VA USA

Mikio Nakano
NTT Corporation
Atsugi, Japan

Computer Aided Design on User Interfaces – Jan 16th, 2004
Eugene Weinstein, MIT Computer Science and Artificial Intelligence Laboratory

Conversational User Interfaces

[Diagram: Human → Speech → Recognition → Text → Understanding → Meaning → Computer; Computer → Meaning → Generation → Text → Synthesis → Speech → Human]

Types of Conversational Interfaces

• Conversational systems differ in the degree to which the human or the computer controls the conversation (initiative)

[Diagram: initiative spectrum running from Computer to Human]

Directed Dialogue (computer initiative)
• Computer maintains tight control
• Human is highly restricted
C: Please say the departure city.

Mixed Initiative Dialogue (between the two extremes)

Free Form Dialogue (human initiative)
• Human takes complete control
• Computer is totally passive
H: I want to visit my grandmother.

Conversational Interfaces

• Can understand verbal input
  – Speech recognition
  – Language understanding (in context)
• Can verbalize response
  – Language generation
  – Speech synthesis
• Can engage in dialogue with a user during the interaction

[Diagram: Audio → Speech Recognition → Language Understanding → Context Resolution → Dialogue Management (connected to the Back End) → Language Generation → Speech Synthesis → Audio]

The Problem With Conversational Interfaces

• Advanced conversational systems are out there
  – Both user and computer can take initiative
  – Goal: conversational skill of system should approach that of human operator
• But…
  – These systems are built by experts
  – Huge learning curve for novices, and
  – Tremendous iterative effort required even from experts
• For this reason
  – Most advanced conversational systems remain in research labs
    * e.g. Jupiter weather info system (+1-888-573-TALK): Zue et al, IEEE Trans. SAP, 8(1), 2000
  – However, we have seen limited commercial deployment
    * e.g. AT&T’s “How May I Help You”, Gorin et al, Speech Communication, 23, 1997

Simplifying Conversational System Creation

• Goal: make it easier for both expert and novice developers to create conversational interfaces
  – But still use advanced human language technologies
• Strategy: simplify configuration process
  – Automatically configure technology components based on examples
  – Allow specification through web interface or unified configuration file

[Diagram: the Web Interface and the Configuration File feed the SpeechBuilder Configuration Engine, which configures Recognition, Understanding, Context Resolution, Dialogue Management, Generation, and Synthesis]

Configuring a Conversational Interface: Knowledge Representation

• First, define example sentences for in-domain actions

Action     Examples
identify   I would like to know today’s weather in Denver
           What will the temperature be on Tuesday
set        Turn on the radio in the kitchen please
           Can you turn the dining room lights off

• Then, define the important concepts present in the actions (attributes):
  – Concept values make up recognizer vocabulary!
  – Examples of attributes automatically matched to attribute classes

Attribute  Values
city       Boston, Denver, San Francisco, …
room       living room, dining room, kitchen, …
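
One way to picture how attribute values in an example sentence could be matched to their attribute classes is the minimal Python sketch below. The attribute names and values come from the slide; the function name `tag_attributes` and the simple longest-value-first string matching are assumptions made for illustration, not SpeechBuilder's actual mechanism.

```python
import re

# Attribute classes and values taken from the slide.
ATTRIBUTES = {
    "city": ["Boston", "Denver", "San Francisco"],
    "room": ["living room", "dining room", "kitchen"],
}


def tag_attributes(sentence, attributes=ATTRIBUTES):
    """Return (template, pairs) for a sentence.

    template: the sentence with each known value replaced by <class>.
    pairs: the attribute-value pairs that were found.
    Hypothetical helper; plain string matching stands in for real parsing.
    """
    pairs = {}
    template = sentence
    # Try longer values first so multi-word values are matched as a unit.
    candidates = sorted(
        ((value, name) for name, values in attributes.items() for value in values),
        key=lambda item: len(item[0]),
        reverse=True,
    )
    for value, name in candidates:
        pattern = re.compile(r"\b" + re.escape(value) + r"\b", re.IGNORECASE)
        if pattern.search(template):
            pairs[name] = value
            template = pattern.sub("<" + name + ">", template)
    return template, pairs


template, pairs = tag_attributes("Can you turn the dining room lights off")
# template == "Can you turn the <room> lights off"
# pairs == {"room": "dining room"}
```

In this sketch a sentence that mentions two values of the same class keeps only one of them in `pairs`; a fuller version would collect a list per class.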

Starting with a Database Table

• Provide database table to configure speech interface:

Name              Phone     Email              Office
Jim Glass         x3-1640   [email protected]  601
Scott Cyphers     x3-0248   [email protected]  604
Eugene Weinstein  x3-8569   [email protected]  633

• Only some columns are used to access entries (e.g., Name)
  – Values of those columns become values for domain concepts
  – Default action sentences are automatically generated
• But, every table cell can potentially be an answer to a question
  – All names of columns become one concept: “property”

Attribute  Values
name       Jim Glass, Scott Cyphers…
property   Name, Phone, Email, Office

Action            Example
request_property  What is the email for Jim Glass?
request_office    Where can I find Jim Glass?
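
The table-to-domain step described above can be sketched roughly as follows. This is an illustration under stated assumptions: the function name `configure_from_table`, the sentence template, and the placeholder used for the email cells (which are obscured in the source) are all hypothetical, and the real system's default sentences may differ.

```python
# Email cells are obscured in the source, so a placeholder is used here.
EMAIL_OBSCURED = "[obscured in source]"

TABLE = [
    {"Name": "Jim Glass", "Phone": "x3-1640", "Email": EMAIL_OBSCURED, "Office": "601"},
    {"Name": "Scott Cyphers", "Phone": "x3-0248", "Email": EMAIL_OBSCURED, "Office": "604"},
    {"Name": "Eugene Weinstein", "Phone": "x3-8569", "Email": EMAIL_OBSCURED, "Office": "633"},
]


def configure_from_table(rows, access_columns=("Name",)):
    """Derive attribute classes and default request_property sentences.

    Hypothetical helper: values of access columns become attribute values,
    and all column names become values of one "property" attribute.
    """
    columns = list(rows[0])
    attributes = {col.lower(): [row[col] for row in rows] for col in access_columns}
    attributes["property"] = columns
    # Assumed sentence template, modeled on the slide's example sentence.
    examples = [
        f"What is the {prop.lower()} for {row[col]}?"
        for row in rows
        for col in access_columns
        for prop in columns
        if prop not in access_columns
    ]
    return attributes, examples


attributes, examples = configure_from_table(TABLE)
# attributes["property"] == ["Name", "Phone", "Email", "Office"]
# examples[1] == "What is the email for Jim Glass?"
```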

Dialogue Management

• Generic Dialogue Manager (Polifroni & Chung, ICSLP 2002)

Generic Dialogue Manager
• Plan system responses
• Regularize common concepts
• Summarize database results

[Diagram: Audio → Speech Recognition → Language Understanding → Context Resolution → Dialogue Management (connected to the Back End) → Language Generation → Speech Synthesis → Audio; domains shown: Hotels, Air Travel, Sports, Weather]

Context Resolution

[Diagram: an Input Query passes through the context resolution steps below and emerges as a Query Interpreted in Context]

Step                     Example utterance
Input Query              “Show me restaurants in Cambridge.”
Resolve Deixis           “What does this one serve?”
Resolve Pronouns         “What is their phone number?”
Inherit Predicates       “Are there any on Main Street?”
Incorporate Fragments    “What about Massachusetts Ave?”
Fill in Default Values   “Give me directions from MIT.”

Human Language Technology Details

• Approach: Use same technologies as deployed in our mainstream, more complex systems
• Speech Recognizer (Glass, Computer Speech and Language, 2003)
  – Trained on 100+ hours of mostly telephone speech
  – Word pronunciations supplied by large dictionary, generated by rule, or provided by developer
• Natural Language Understanding (Seneff, Computational Linguistics, 1992)
  – Hierarchical sentence grammar used to parse sentence hypothesis
  – Back off to concept spotting when no full parse is made
• Language Generation (Baptist & Seneff, ICSLP 2000)
  – Used in: SQL (DB query) generation, paraphrasing & URL-encoding meaning representation, responses

Web-based Interface

[Figure: Defining Actions and Concepts (Attributes)]

Web-based Interface: Viewing Sentences

[Figure: Examining how sentences are reduced to an action and a set of attribute-value pairs]

Web-based Interface: Response Generation

[Figure: Customizing system responses, with domain-independent system prompts and domain-specific system prompts]

Web-based Interface: Editing Pronunciations

[Figure: Modifying system-generated pronunciations for the vocabulary]

Web-based Interface: Context Resolution

[Figure: Context Resolution configured through Masking and Inheritance of concepts]

Voice Configuration File: An Alternative to the Web Interface

• Entire domain can be specified in single configuration file
  – Allows for automated generation of conversational systems

<actions>
  <request_name> = i would like a restaurant | can you (show|give) me a Chinese restaurant in Arlington;
</actions>

<attributes>
  <cuisine> = Chinese|Taiwanese;
  <city> = Washington | Boston | Arlington;
</attributes>

<discourse>
  name masks(city cuisine neighborhood);
</discourse>

<constraints>
  <request_name> (city|neighborhood) {prompt_for_city};
</constraints>
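
To make the action pattern in the configuration file concrete, the sketch below expands one pattern into its example sentences. It assumes that a top-level "|" separates alternative sentences and that a parenthesized group such as (show|give) lists interchangeable words; this reading of the syntax, and the helper names `split_top_level` and `expand`, are assumptions for illustration rather than the documented VCFG semantics.

```python
import itertools
import re


def split_top_level(pattern):
    """Split on '|' characters that are not inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in pattern:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "|" and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


def expand(pattern):
    """Expand non-nested (a|b) groups in each top-level alternative."""
    sentences = []
    for alternative in split_top_level(pattern):
        # With a capturing group, re.split keeps group contents at odd indices.
        pieces = re.split(r"\(([^()]*)\)", alternative)
        choices = [
            piece.split("|") if index % 2 else [piece]
            for index, piece in enumerate(pieces)
        ]
        for combo in itertools.product(*choices):
            # Join the pieces and normalize whitespace.
            sentences.append(" ".join("".join(combo).split()))
    return sentences


sentences = expand(
    "i would like a restaurant | "
    "can you (show|give) me a Chinese restaurant in Arlington"
)
# sentences == [
#     "i would like a restaurant",
#     "can you show me a Chinese restaurant in Arlington",
#     "can you give me a Chinese restaurant in Arlington",
# ]
```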

Deployment

• SpeechBuilder functional for the past three years
• Some example domains:
  – Office appliance control
  – Laboratory directory (auto-attendant)
  – Restaurant query system
• Has been used by MIT researchers (experts) as well as novice developers at our sponsor companies
  – Used in technology transfer workshop for pervasive computing project (Oxygen)
• SpeechBuilder has been used as an educational tool
  – Computational linguistics class at Georgetown University
  – Summer class at Johns Hopkins University
  – Youngest SpeechBuilder developer: 9 years old


Japanese SpeechBuilder

• Created in collaboration with NTT

• Challenge: Segmentation (no spaces between words)

Example Domain

• A hotel application using the generic dialogue manager
  – Compiled via SpeechBuilder using constraints shown previously
  – Other generic functionality is automatically included
• Illustrated technical issues:
  – Canonicalizing relative dates
  – Interpreting fragments correctly in context
  – Soliciting necessary information from user
  – Ordering and summarizing results of query to content provider
  – Resolving superlatives/updating discourse context
  – Interpreting pronouns in context
  – Returning and speaking specific properties
  – Repeating previous replies


Another Example Domain: Object Manipulation System

• Stock SpeechBuilder domain for spoken dialogue

• Custom back-end connected to stereo camera and person tracking algorithm (Demirdjian, WOMOT 2003)

Ongoing and Future Work

• Incorporate speech synthesis
  – Allow use of concatenative speech synthesizer (Yi et al, ICSLP 2000) in SpeechBuilder
• Allow use of multiple modalities
  – Provide functionality to incorporate multimodal input into systems
• Improve dialogue management tools and modules
  – Improve ability of SpeechBuilder systems to use more sophisticated dialogue strategies
  – Provide additional generic semantic concepts for use in domains
• Allow system refinement by unsupervised learning
  – Use confidence scores to improve domain language model (Nakano & Hazen, Eurospeech 2003)
• Allow system modification in real-time
  – Need ability to re-train recognizer during runtime (Schalkwyk et al, Eurospeech 2003)

Thank You! For more information:

• http://www.sls.csail.mit.edu/
• Email us! [email protected]
• Jupiter weather information system:
  – +1-617-258-0300 (outside USA)
  – 1-888-573-TALK (USA toll-free)
• Mercury flight information system:
  – +1-617-258-6040 (outside USA)
  – 1-877-MIT-TALK (USA toll-free)
• Pegasus flight status system:
  – +1-617-258-0301 (outside USA)
  – 1-877-LCS-TALK (USA toll-free)


THE END


Utility for rapid prototyping of speech-based interfaces

– Used to create demonstrations for NTT CS Labs open house

– Prototypes were developed with a few days of effort

– Three papers submitted for publishing

Human Language Technologies

• Only some columns are used to access entries (e.g., Name)
  – Values of those columns become values for domain concepts
  – Default action sentences are automatically generated
• But, every table cell can potentially be an answer to a question
  – Names of non-access columns become a concept

To Configure Response Generation…

• For each concept present in the domain, define how queries about that concept should be answered

<telephone> = “The telephone for :name is :phone”

• Define some prompts for generic events, e.g. welcome and goodbye

<welcome> = “Welcome to the auto-attendant”

<no_data> = “Sorry, there was no data matching your request.”
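
The response templates above use :name-style placeholders. A minimal sketch of how such a template might be filled from a result record is shown here; the function name `fill_template` and the choice to leave unknown placeholders untouched are assumptions for illustration, not the behavior of the actual generation component.

```python
import re


def fill_template(template, values):
    """Replace each :key placeholder with values[key].

    Hypothetical helper: placeholders with no matching key are left as-is.
    """
    def substitute(match):
        key = match.group(1)
        return str(values.get(key, match.group(0)))

    return re.sub(r":([A-Za-z_]\w*)", substitute, template)


reply = fill_template(
    "The telephone for :name is :phone",
    {"name": "Jim Glass", "phone": "x3-1640"},
)
# reply == "The telephone for Jim Glass is x3-1640"
```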

Conversational User Interfaces: Input Side

[Diagram: Speech → Recognition → Text → Understanding → Meaning → Action → DB. Recognition and Understanding are the Human Language Technologies; Action and DB are the “Back-end” Technologies.]

Text:     “Find me a flight to Boston on Tuesday”
Meaning:  action=flights
          to_city=Boston
          day=Tuesday

Conversational User Interfaces: Output Side

[Diagram: DB → Action → Meaning → Generation → Text → Synthesis → Speech. Generation and Synthesis are the Human Language Technologies.]

Meaning:  flight_num=55
          airline=Delta
          origin=LGA
          dest=BOS
Text:     Delta flight, number fifty five from La Guardia to Boston…

Conversational User Interfaces: The Whole Picture

[Diagram: Speech → Recognition → Text → Understanding → Meaning → Action → Meaning → Generation → Text → Synthesis → Speech]

Or Is It?

The Missing Pieces: Context and Dialogue

• Context Resolution:

  action=flights, to_city=Boston, day=Tuesday
  + Last time, the user asked for a flight from LGA
  = action=flights, origin=LGA, dest=BOS, day=Tuesday

• Dialogue Management:

  action=flights, to_city=Boston, day=Tuesday
  + no departure city in context
  = “Which city would you like to fly from?”
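
The dialogue management step on this slide, asking for a missing piece of information, can be sketched as a check of a meaning frame against a list of required concepts. Everything named here is hypothetical: the `CONSTRAINTS` table, the `from_city` key, the first prompt string, and the function `next_prompt` are illustrative assumptions, not the generic dialogue manager's real interface.

```python
# Hypothetical constraint table: action -> list of (acceptable keys, prompt).
# A constraint is satisfied when at least one of its keys is in the frame.
CONSTRAINTS = {
    "flights": [
        (("to_city",), "Which city would you like to fly to?"),
        (("from_city",), "Which city would you like to fly from?"),
    ],
}


def next_prompt(frame, constraints=CONSTRAINTS):
    """Return the prompt for the first unsatisfied constraint, else None."""
    for keys, prompt in constraints.get(frame.get("action"), []):
        if not any(key in frame for key in keys):
            return prompt
    return None


frame = {"action": "flights", "to_city": "Boston", "day": "Tuesday"}
prompt = next_prompt(frame)
# prompt == "Which city would you like to fly from?"
```

Once the user supplies a departure city, the same check returns None and the query can go to the back end.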

Conversational User Interfaces: The Whole Picture

[Diagram: Speech → Recognition → Text → Understanding → Meaning → Action → Meaning → Generation → Text → Synthesis → Speech, with Context Resolution and Dialogue Management added between Understanding and Generation]

The Problem With Conversational Interfaces…

• Complex conversational systems are out there
  – Both user and computer can take initiative
  – Goal: conversational skill of system should approach that of human operator
• But…
  – These systems are built by experts
  – Huge learning curve for novices, and
  – Tremendous iterative effort required even from experts
• For this reason
  – Most advanced conversational systems remain in research labs
    * e.g. Jupiter weather info system (+1-888-573-TALK): Zue et al, IEEE Trans. SAP, 8(1), 2000
  – However, we have seen limited commercial deployment
    * e.g. AT&T’s “How May I Help You”, Gorin et al, Speech Communication, 23, 1997

Configuring Response Generation…

• For each concept present in the domain, define how queries about that concept should be answered
• Configure some generic prompts for summarizing long results
• Define some prompts for generic events, e.g. welcome

Property/Condition                      Response
phone                                   The phone number for :restaurant_name is :phone
cuisine                                 :restaurant_name serves :cuisine cuisine
Welcome                                 Welcome to the restaurants domain
No matches                              I’m sorry, I couldn’t find any restaurants matching your request
Many matches                            I found five restaurants :items
item (what to return when summarizing)  :restaurant_name

Configuring Context Resolution

• Context Resolution (discourse) configured through Masking and Inheritance of concepts
• Inheritance configures how actions remember concepts, e.g.:
  – User: “What is the phone number for Jim Glass”
  – System: “Jim Glass’ phone number is 3-1640”
  – User: “What about his email address?”
  – System: “Jim Glass’ email address is [email protected]”
  – Name concept is inherited
• Masking configures how certain concepts block other concepts, even in the presence of inheritance, e.g.:
  – User: “Do you have any restaurants in Boston?”
  – System: “In Boston, I have the following…”
  – User: “What about in Times Square?”
  – System: “In Times Square, New York, I have…”
  – City concept is masked by Neighborhood concept
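
The inheritance and masking behavior in the two dialogues above can be sketched as a merge of the previous and current meaning frames. The function `resolve_context` and its `inherited` and `masks` parameters are hypothetical names chosen for this illustration; the sketch only mirrors the behavior described on the slide and is not SpeechBuilder's implementation.

```python
def resolve_context(previous, current, inherited, masks):
    """Merge the current frame with concepts remembered from the previous one.

    inherited: concept names carried over when the new frame omits them.
    masks: maps a concept to the concepts it blocks from being inherited.
    """
    # Collect every concept blocked by something present in the new frame.
    blocked = set()
    for concept in current:
        blocked.update(masks.get(concept, ()))
    resolved = dict(current)
    for concept in inherited:
        if concept in previous and concept not in resolved and concept not in blocked:
            resolved[concept] = previous[concept]
    return resolved


# Inheritance: "What about his email address?" keeps the earlier name.
directory = resolve_context(
    previous={"action": "request_property", "name": "Jim Glass", "property": "phone"},
    current={"action": "request_property", "property": "email"},
    inherited=("name",),
    masks={},
)
# directory["name"] == "Jim Glass"

# Masking: a new neighborhood blocks the earlier city.
restaurants = resolve_context(
    previous={"action": "request_name", "city": "Boston"},
    current={"action": "request_name", "neighborhood": "Times Square"},
    inherited=("city", "cuisine"),
    masks={"neighborhood": ("city",)},
)
# "city" not in restaurants
```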

Voice Configuration File

• Developers can also use Voice Configuration (VCFG) file format to configure SpeechBuilder domains:

<actions>
  <request_name> = i would like a restaurant | can you (show|give) me a Chinese restaurant in Arlington;
</actions>

<attributes>
  <cuisine> = Chinese|Taiwanese;
  <city> = Washington | Boston | Arlington;
</attributes>

<discourse>
  name masks(city cuisine neighborhood);
</discourse>

<constraints>
  <request_name> (city|neighborhood) {prompt_for_city};
</constraints>

Dialogue Management

• Generic Dialogue Manager (Polifroni & Chung, ICSLP 2002)

Generic Dialogue Manager
• Plan system responses
• Regularize common concepts
• Summarize database results

[Diagram: Audio → Speech Recognition → Language Understanding → Context Resolution → Dialogue Management (connected to the Database) → Language Generation → Speech Synthesis → Audio; domains shown: Hotels, Air Travel, Sports, Weather]

Deployment

• SpeechBuilder functional for the past three years
• Some example domains:
  – Office appliance control
  – Laboratory directory (auto-attendant)
  – Restaurant query system
• Has been used by MIT researchers (experts) as well as novice developers at our partner companies
• SpeechBuilder has been used by students in
  – Computational linguistics class at Georgetown University
  – Summer class at Johns Hopkins University
  – Technology transfer workshop for pervasive computing project (Oxygen)
• In collaboration with NTT, we have developed a Japanese version of SpeechBuilder. Japanese domains:
  – Bus timetable system
  – Weather information system

Configuring a Speech Interface with SpeechBuilder: Knowledge Representation

• First define some concepts present in the domain (attributes):
  – Concept values make up recognizer vocabulary!

Attribute  Values
city       Boston, Denver, San Francisco, …
room       living room, dining room, kitchen, …

• Then, define examples of things to do with the concepts (actions)
  – Examples of attributes automatically matched to attribute classes

Action     Examples
identify   I would like to know today’s weather in Denver
           What will the temperature be on Tuesday
set        Turn on the radio in the kitchen please
           Can you turn the dining room lights off

