
Special Lecture (406) Spoken Language Dialog Systems...

Date post: 26-Jul-2020
Page 1: Special Lecture (406) Spoken Language Dialog Systems Reviewweb.science.mq.edu.au/~rolfs/slds-2004/slds-2004-lecture... · 2004-06-25 · 2.22..2. See if you can decompose the task

© Macquarie University 2004 1

Special Lecture (406) Spoken Language Dialog Systems

Review

Rolf Schwitter

{schwitt}@ics.mq.edu.au

Why Voice?

• Wireless devices have small screens and limited input capabilities.

• Telephone keypad can give users only a limited number of choices.

• Speech technology is evolving.

• The exchange of information between a person and a computer is becoming more like a real conversation.

• Users want hands-free or eyes-free use.

• Allows interaction with display-based Web content in cases where the mouse and keyboard may be missing or inconvenient.

What is a Spoken Language Dialog System?

• An SLDS is a computer system that you can talk to in order to carry out some task.

• SLDSs are typically of two kinds:

– Information-provision systems provide information in response to a query, such as a request for timetable information or weather information.

– Transaction-based systems allow you to undertake some transaction, such as buying or selling stocks, or reserving a seat on a plane.

The Architecture of an SLDS

Figure: SLDS architecture, with the following components:

• Speech Recognition

• Language Understanding

• Dialog Management

• Database

• Language Generation

• Speech Synthesis
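One way to picture how these components fit together is as a chain of stages for a single user turn. The sketch below assumes the conventional data flow (recognition, understanding, dialog management with database access, generation, synthesis). Every function name and the timetable data are hypothetical stand-ins, and plain strings take the place of audio.

```python
# Toy SLDS pipeline: every stage is a stub standing in for a real component.
# All names and the timetable data are hypothetical.

TIMETABLE = {"sydney": "9:15", "melbourne": "10:40"}  # stand-in database


def recognize_speech(audio: str) -> str:
    """Speech recognition stub: the 'audio' is already text here."""
    return audio.lower().strip()


def understand(text: str) -> dict:
    """Language understanding stub: spot a known destination among the words."""
    for word in text.replace("?", "").split():
        if word in TIMETABLE:
            return {"intent": "timetable", "destination": word}
    return {"intent": "unknown"}


def manage_dialog(meaning: dict) -> dict:
    """Dialog management stub: query the database or decide to ask again."""
    if meaning["intent"] == "timetable":
        dest = meaning["destination"]
        return {"act": "inform", "destination": dest, "time": TIMETABLE[dest]}
    return {"act": "reprompt"}


def generate_language(act: dict) -> str:
    """Language generation stub: turn a dialog act into a sentence."""
    if act["act"] == "inform":
        city = act["destination"].title()
        return f"The next train to {city} leaves at {act['time']}."
    return "Sorry, where would you like to go?"


def synthesize_speech(text: str) -> str:
    """Speech synthesis stub: a real system would return audio."""
    return text


def respond(audio: str) -> str:
    """Run one user turn through all stages in order."""
    text = recognize_speech(audio)
    meaning = understand(text)
    act = manage_dialog(meaning)
    return synthesize_speech(generate_language(act))


print(respond("When is the next train to Sydney?"))
```

Keeping each stage behind its own function mirrors the component boundaries, so a stub could later be swapped for a real recognizer or synthesizer without touching the others.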

What's Involved in Building an SLDS?

• Dialog Design

– The process of working out how the interaction between human and machine will move from stage to stage.

– Also referred to as script writing or call flow layout.

• Prompt Design

– What the system says to get the caller to say something we are able to handle.

• Grammar Writing

– Specifying what the caller is permitted to say at any given state.

CSLU Speech Toolkit

• Developed at the Center for Spoken Language Understanding at the Oregon Graduate Institute in the USA.

• The CSLU toolkit has many features that are similar to commercial tools for building SLDSs.

• The toolkit includes a rapid application development environment.

• A tutorial is available at:

http://www.cslu.ogi.edu/toolkit/docs/2.0/apps/rad/index.html

Food Ordering Service


Building a Pizza Ordering Service

• Specification

– Pizzas are three sizes: small, medium, and large.

– Pizzas may have one or more of these toppings:

cheese, pepperoni, sausage, peppers, pineapple, and tomatoes

– Drinks are three sizes: small, medium, and large.

– Drinks may be Coke, Pepsi, Diet Pepsi, lemonade, or water.

– An order may contain zero or more pizzas and zero or more drinks.
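The specification above can be written down as a small data model, which makes the constraints explicit before any dialog is designed. This is a minimal sketch; the class and field names are illustrative choices, not part of the lecture or of the CSLU toolkit.

```python
from dataclasses import dataclass, field

SIZES = {"small", "medium", "large"}
TOPPINGS = {"cheese", "pepperoni", "sausage", "peppers", "pineapple", "tomatoes"}
DRINKS = {"Coke", "Pepsi", "Diet Pepsi", "lemonade", "water"}


@dataclass
class Pizza:
    size: str
    toppings: list

    def __post_init__(self):
        if self.size not in SIZES:
            raise ValueError(f"unknown pizza size: {self.size}")
        if not self.toppings:
            # The specification says "one or more" toppings.
            raise ValueError("a pizza needs at least one topping")
        unknown = set(self.toppings) - TOPPINGS
        if unknown:
            raise ValueError(f"unknown toppings: {sorted(unknown)}")


@dataclass
class Drink:
    size: str
    kind: str

    def __post_init__(self):
        if self.size not in SIZES:
            raise ValueError(f"unknown drink size: {self.size}")
        if self.kind not in DRINKS:
            raise ValueError(f"unknown drink: {self.kind}")


@dataclass
class Order:
    # Zero or more pizzas and zero or more drinks, so both default to empty.
    pizzas: list = field(default_factory=list)
    drinks: list = field(default_factory=list)


order = Order(
    pizzas=[Pizza("large", ["cheese", "pineapple"])],
    drinks=[Drink("small", "water")],
)
print(order)
```

Each field of the model corresponds to an information token the dialog has to collect from the caller.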

Steps in Dialog Design

1. Make sure you understand what you are trying to achieve (use scenarios and build first a conceptual model).

2. See if you can decompose the task into smaller meaningful subtasks.

3. Identify the information tokens you need for each task or subtask.

4. Decide how you will obtain this information from the caller.

5. Sketch a call flow diagram with appropriate prompts that captures this information.

6. Test your call flow diagram in a Wizard of Oz simulation.

7. Revise your call flow diagram and repeat Step 6 …
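Steps 2 to 5 can be made concrete by writing the call flow as data: one state per information token, each with its prompt, the answers it accepts, and the state that follows. The sketch below does this for a hypothetical pizza subtask. The state names, prompts, and the `run_call_flow` helper are all illustrative assumptions, and scripted caller turns stand in for a live caller.

```python
# Hypothetical call flow for a pizza subtask: one state per information token.
CALL_FLOW = {
    "ask_size": {
        "prompt": "What size pizza would you like: small, medium, or large?",
        "token": "size",
        "accepts": {"small", "medium", "large"},
        "next": "ask_topping",
    },
    "ask_topping": {
        "prompt": "Which topping would you like?",
        "token": "topping",
        "accepts": {"cheese", "pepperoni", "sausage",
                    "peppers", "pineapple", "tomatoes"},
        "next": "done",
    },
}


def run_call_flow(flow, start, caller_turns):
    """Walk the flow with scripted caller turns; return (tokens, transcript)."""
    tokens, transcript = {}, []
    state = start
    turns = iter(caller_turns)
    while state != "done":
        node = flow[state]
        transcript.append(("system", node["prompt"]))
        reply = next(turns, None)
        if reply is None:  # the script ran out: stop the walk-through
            break
        transcript.append(("caller", reply))
        answer = reply.lower().strip()
        if answer in node["accepts"]:
            tokens[node["token"]] = answer
            state = node["next"]
        # Otherwise stay in the same state, so the prompt is asked again.
    return tokens, transcript


tokens, transcript = run_call_flow(
    CALL_FLOW, "ask_size", ["Large", "anchovies", "pepperoni"]
)
for speaker, utterance in transcript:
    print(f"{speaker}: {utterance}")
print(tokens)
```

Because the flow is plain data, revising it between test rounds means editing the table rather than the code that walks it.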

Wizard of Oz Simulations

• A human experimenter (the Wizard) simulates an automated system.

• The dialog specifications (for example, flow charts with associated prompts) are spread out in front of the experimenter.

• The experimenter reads the appropriate prompts from the specs, waits for a response from the subject (or no response), checks the specs on how to proceed, and then speaks the next prompt.

• Very effective in uncovering problems with logic, navigation, awkward sequences of prompts, omissions, and so on.

Types of Spoken Output

Type          Function
Prompt        Indicates it is time for user input, and thus serves as a turn-taking cue
Feedback      Presents the app state that results from user input, allowing the user to compare original intent with result
Instructions  Provide information about operating the user interface or understanding the task
Help          Help instructions often adopt a separate mode or state aimed at coaching the user
App Data      Information presented to the user as part of the task, e.g. weather, stock information, flight times

Prompts

• Prompts are the turn-taking cues within spoken dialogs.

• Prompts have two purposes:

– cause the user to speak,

– convey to the user what may be spoken (optionally).

• Prompts should be as distinguishable as possible from

– instructions,

– feedback, and

– help.

• Prompts fall along a continuum from implicit to explicit.

Implicit versus Explicit Prompts

• Computer 1: Welcome to ABC Bank. What would you like to do?

• Computer 2: Welcome to ABC Bank. You can check an account balance, transfer funds, or pay a bill. What would you like to do?

• Computer 3: Welcome to ABC Bank. You can check an account balance, transfer funds, or pay a bill. Say one of the following choices: check balance, transfer funds, or pay bills.
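The three prompts above differ only in how much they tell the caller, so they can be generated from the same list of actions. The sketch below treats them as levels 1 to 3 on the implicit-to-explicit continuum. The level numbering and the `build_prompt` and `join_options` helpers are illustrative assumptions, not an established API.

```python
def join_options(items):
    """Join items as 'a, b, or c' (or 'a or b' for two items)."""
    if len(items) <= 2:
        return " or ".join(items)
    return ", ".join(items[:-1]) + ", or " + items[-1]


def build_prompt(greeting, actions, level):
    """Return a prompt at one of three points on the implicit-explicit continuum.

    level 1: open question only (most implicit)
    level 2: say what can be done, then ask the open question
    level 3: say what can be done, then spell out the exact phrases to say
    """
    question = "What would you like to do?"
    if level == 1:
        return f"{greeting} {question}"
    options = join_options([a["description"] for a in actions])
    if level == 2:
        return f"{greeting} You can {options}. {question}"
    if level == 3:
        phrases = join_options([a["phrase"] for a in actions])
        return (f"{greeting} You can {options}. "
                f"Say one of the following choices: {phrases}.")
    raise ValueError("level must be 1, 2, or 3")


BANK_ACTIONS = [
    {"description": "check an account balance", "phrase": "check balance"},
    {"description": "transfer funds", "phrase": "transfer funds"},
    {"description": "pay a bill", "phrase": "pay bills"},
]

for level in (1, 2, 3):
    print(build_prompt("Welcome to ABC Bank.", BANK_ACTIONS, level))
```

At level 3 the spoken phrases are exactly the ones the grammar would need to accept, which is what makes the most explicit prompt the easiest one to recognize against.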

Synthesized versus Digitized Spoken Output

• Use digitized speech whenever possible.

• Use synthesized speech for unbounded information:

– yellow and white pages

– encyclopedia read-back

– rapidly changing information

– electronic mail.

• Avoid synthesis for single isolated words.
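These guidelines amount to a simple selection rule: play a recording when one exists, and fall back to synthesis only for text that cannot be recorded in advance. The sketch below is one possible reading of that rule. The recordings table, file names, and both helper functions are hypothetical.

```python
def choose_output(text, recordings):
    """Pick digitized (recorded) audio when a recording exists, else synthesis.

    `recordings` maps exact prompt text to an audio file name (hypothetical).
    Returns a (kind, payload) pair: ("audio", file_name) or ("tts", text).
    """
    if text in recordings:
        return ("audio", recordings[text])
    return ("tts", text)


def single_word_tts(plan):
    """Return the synthesized segments that consist of one isolated word."""
    return [payload for kind, payload in plan
            if kind == "tts" and len(payload.split()) == 1]


RECORDINGS = {
    "You have": "you_have.wav",
    "new messages.": "new_messages.wav",
}

plan = [choose_output(s, RECORDINGS) for s in ["You have", "three", "new messages."]]
print(plan)
print(single_word_tts(plan))
```

Here the isolated word "three" ends up synthesized between two recordings, which is the case the last guideline warns against; recording the small set of number words would avoid it.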

Timing

• Minimise time between end of prompt and beginning of recognition.

• Trim prompts aggressively if barge-in is not in use.

• Communicate errors quickly.

Prompt Design

• Precede prompts with instructions.

Computer: Your plan requires that you select a PIN to use the system. The PIN must be between 5 and 9 digits in length. At the tone, please say your PIN.

• Repeat only the prompt.

Computer: Sorry, I didn’t understand that. At the tone, please say your PIN.
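The two rules above can be sketched as a function of the attempt number: the first system turn carries the instructions followed by the prompt, and every retry carries only a short apology and the prompt. The retry limit and the function names below are assumptions added for illustration; scripted caller turns stand in for recognition results.

```python
INSTRUCTIONS = ("Your plan requires that you select a PIN to use the system. "
                "The PIN must be between 5 and 9 digits in length.")
PROMPT = "At the tone, please say your PIN."
APOLOGY = "Sorry, I didn't understand that."


def system_turn(attempt):
    """First attempt: instructions, then the prompt. Retries: apology and prompt only."""
    if attempt == 0:
        return f"{INSTRUCTIONS} {PROMPT}"
    return f"{APOLOGY} {PROMPT}"


def is_valid_pin(spoken_digits):
    """Check the 5-to-9-digit rule stated in the instructions."""
    return (spoken_digits.isascii() and spoken_digits.isdigit()
            and 5 <= len(spoken_digits) <= 9)


def collect_pin(caller_turns, max_attempts=3):
    """Run the exchange over scripted caller turns.

    Returns (pin or None, list of system utterances). `max_attempts` is a
    hypothetical limit, not something the guidelines specify.
    """
    said = []
    for attempt, reply in zip(range(max_attempts), caller_turns):
        said.append(system_turn(attempt))
        if is_valid_pin(reply):
            return reply, said
    return None, said


pin, said = collect_pin(["12", "123456"])
for utterance in said:
    print("Computer:", utterance)
print("PIN accepted:", pin)
```

Keeping the instructions and the prompt as separate strings is what makes it possible to repeat one without the other.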

Prompt Design

• Put key information immediately before expected user speech:

– if using barge-in, put key information at a phrase boundary,

– if not using barge-in, put key information before the tone.

Help

• Empower the user with help availability.

• Let the user know how to get help.

• Once help is declared available, keep it available.

• Return to a logical starting point after help.

• Use examples for help.
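A small state table shows how these guidelines might look in a dialog. In the sketch below, the word "help" is accepted in every state, each help message contains an example, and the dialog resumes at the prompt of the state the caller was in. Treating the current state's prompt as the "logical starting point" is an assumption, as are the state names and wording.

```python
HELP_WORD = "help"  # hypothetical universal command

STATES = {
    "ask_size": {
        "prompt": "What size would you like?",
        "help": "You can say a size. For example, say: large.",
        "accepts": {"small", "medium", "large"},
        "next": "ask_drink",
    },
    "ask_drink": {
        "prompt": "Which drink would you like?",
        "help": "You can name a drink. For example, say: lemonade.",
        "accepts": {"coke", "pepsi", "diet pepsi", "lemonade", "water"},
        "next": "done",
    },
}


def step(state, caller_input):
    """Return (next_state, system_output) for one caller turn."""
    node = STATES[state]
    said = caller_input.lower().strip()
    if said == HELP_WORD:
        # Help works in every state; afterwards the dialog resumes at the
        # prompt of the state the caller was in.
        return state, f"{node['help']} {node['prompt']}"
    if said in node["accepts"]:
        nxt = node["next"]
        follow_up = STATES[nxt]["prompt"] if nxt in STATES else "Thank you."
        return nxt, follow_up
    # On a failed turn, remind the caller how to get help.
    return state, ("Sorry, I didn't understand that. "
                   f"You can say help at any time. {node['prompt']}")


state = "ask_size"
for turn in ["help", "large", "tea", "help", "water"]:
    state, output = step(state, turn)
    print(f"caller: {turn}\nsystem: {output}")
```

Handling the help word before the state-specific answers is what keeps help available everywhere without repeating it in each state's list of accepted answers.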

Developing Speech Interfaces

• Speech interfaces can be developed using

– general-purpose programming languages

– special-purpose (programming) languages.

• A special-purpose language such as VoiceXML can

– simplify application development

– reduce network traffic

– separate interaction code from application logic code

– provide portability and simplicity

– support prototyping and refinement.

A VoiceXML Example

<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <!-- bargein="false": the caller cannot interrupt this prompt -->
      <prompt bargein="false"> Welcome to Ajax Travel.
        <audio src="http://www.prerecorded.audiofile ..."/>
      </prompt>
    </block>
  </form>
</vxml>

VoiceXML Architecture

Figure: a phone connects over the PSTN or the Internet (SIP) to the gateway and voice server, which fetches content from the Web server over the Internet (HTTP/VoiceXML).

Phone
• regular phone
• wireless phone
• soft phone

Gateway & Voice Server
• telephony interface
• voice browser
• automated speech recognition
• text-to-speech synthesis
• touchtone
• audio play/record

Web Server
• VoiceXML documents
• audio files
• service logic (CGI)
• transaction processing
• database interface

A VoiceXML Scenario

• A customer dials the phone number of a travel agent.

• The VoiceXML gateway receives the call along with information about the dialed number.

• The VoiceXML gateway searches a database.

• If successful, it maps the dialed number to a URL.

• This URL is the location of the agent’s main page (ajax.vxml).

• The gateway retrieves the ajax.vxml page together with associated files such as grammars and recorded audio from the HTTP server.

• These associated files may be cached on the VoiceXML gateway.
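
The lookup step in this scenario can be pictured as a small table keyed by the dialed number. The following is only a sketch: the table contents, the phone number, the host name www.example.com and the function name resolve_start_url are assumptions made for illustration, not part of any gateway product.

```python
from typing import Optional

# Hypothetical provisioning table: dialed number -> URL of the start document.
# The number and the host name are made up for illustration.
DIALED_NUMBER_TO_URL = {
    "+15550100": "http://www.example.com/ajax.vxml",
}


def resolve_start_url(dialed_number: str) -> Optional[str]:
    """Return the URL of the first VoiceXML page for a call, or None."""
    # Drop spaces and punctuation so "+1 555 0100" still matches the table key.
    key = "".join(ch for ch in dialed_number if ch.isdigit() or ch == "+")
    return DIALED_NUMBER_TO_URL.get(key)


if __name__ == "__main__":
    print(resolve_start_url("+1 555 0100"))
```

A gateway would fetch the returned URL over HTTP; when the lookup yields None, the call cannot be mapped to an application.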

A VoiceXML Scenario

• The VoiceXML interpreter parses and executes the VoiceXML document.

• The interpreter steps through ajax.vxml, playing prompts, hearing responses and passing them on to a speech recognition engine.

• If necessary, additional VoiceXML documents and associated files are retrieved from the HTTP server.

• Recorded audio is served by specifying the URL of the WAV file.

• Communications between the voice gateway and the HTTP server follow standard HTTP protocols.

VoiceXML Documents

• A VoiceXML document forms a conversational finite state machine.

• The caller is always in one conversational state, or dialog, at a time.

• Each dialog determines the next dialog to transition to.

• Transitions are specified using URIs, which define the next document and dialog to use.

• Execution is terminated

– when a dialog does not specify a successor, or

– if it has an element that explicitly exits the conversation.
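
The finite state machine view can be sketched in a few lines. This is a simplified model under stated assumptions: the dialog names are hypothetical, each dialog has a fixed successor, and only the first termination rule (no successor) is modelled.

```python
# Each dialog names its successor; None means "no successor", which ends
# the conversation. The dialog names are hypothetical.
DIALOGS = {
    "welcome": "ask_destination",
    "ask_destination": "confirm",
    "confirm": None,
}


def run(start):
    """Visit dialogs one at a time until a dialog has no successor."""
    visited = []
    current = start
    while current is not None:
        visited.append(current)      # the caller is in exactly one dialog
        current = DIALOGS[current]   # the dialog decides where to go next
    return visited


if __name__ == "__main__":
    print(run("welcome"))
```

In a real application the successor depends on what the caller says, so the table value would be computed rather than fixed.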

Forms

• Forms collect values for a set of field item variables.

• Grammars define the allowable inputs for fields.

• Platform throws events if the input is out-of-grammar.

• Actions are performed when field items are filled.

Forms

<form id = Identifier>
  <block> Message </block>
  <field name = VariableName>
    <prompt> Question </prompt>
    <grammar src = URI type = MediaType/>
    <catch event = EventType> HandlerMessage </catch>
    <filled> Actions </filled>
  </field>
</form>
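
One way to see the template filled in is to generate a form from service logic. The sketch below uses Python's standard xml.etree.ElementTree; the function name build_form, the form id "payment" and the grammar file name card_type.grxml are hypothetical, and the <block> and the actions inside <filled> are left out for brevity.

```python
import xml.etree.ElementTree as ET


def build_form(form_id, field_name, question, grammar_uri, handler_message):
    """Build a one-field VoiceXML form as an ElementTree element."""
    form = ET.Element("form", id=form_id)
    field = ET.SubElement(form, "field", name=field_name)
    ET.SubElement(field, "prompt").text = question
    ET.SubElement(field, "grammar", src=grammar_uri,
                  type="application/srgs+xml")
    # Message played when the caller's input is out-of-grammar.
    ET.SubElement(field, "catch", event="nomatch").text = handler_message
    ET.SubElement(field, "filled")  # actions would go here
    return form


form = build_form("payment", "card_type",
                  "What kind of credit card do you have?",
                  "card_type.grxml",
                  "Please say Visa, Mastercard, or American Express.")
print(ET.tostring(form, encoding="unicode"))
```

Building the tree with a library rather than by string concatenation keeps the attribute values quoted and escaped correctly.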

Menus

• Menus present the caller with a set of options.

• Transitions to another dialog are based on a choice.

• The <menu> element is a shortcut for a form with only one field.

• It is a convenient way to ask the user to pick one option from a list.

Menus

<menu id = Identifier>
  <prompt> Question <enumerate/> </prompt>
  <choice next = URI-1> Phrase-1 </choice>
  <choice next = URI-2> Phrase-2 </choice>
  <choice next = URI-3> Phrase-3 </choice>
  <noinput> Message <enumerate/> </noinput>
</menu>
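
The behaviour of such a menu can be imitated with a dictionary from phrases to URIs. This is a rough sketch, not how an interpreter is implemented: the phrases, the URIs and the helper names are hypothetical, and exact string matching stands in for speech recognition.

```python
# Hypothetical menu: spoken phrase -> URI of the next dialog.
CHOICES = {
    "flights": "flights.vxml",
    "hotels": "hotels.vxml",
    "car rental": "cars.vxml",
}


def enumerate_choices(choices):
    """List the choice phrases, roughly what <enumerate/> would speak."""
    return "; ".join(choices)


def select(choices, utterance):
    """Return ("next", uri) for a recognised phrase, else ("event", name)."""
    phrase = utterance.strip().lower()
    if not phrase:
        return ("event", "noinput")   # the caller said nothing
    if phrase in choices:
        return ("next", choices[phrase])
    return ("event", "nomatch")       # input was not one of the choices


if __name__ == "__main__":
    print("Please choose: " + enumerate_choices(CHOICES))
    print(select(CHOICES, "Hotels"))
```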

Transitions

• Transitions are specified using URIs.

• URIs define the next document and dialog to use.

• If a URI does not refer to a document, the current document is assumed.

• If it does not refer to a dialog, the first dialog in the document is assumed.

• Transitions can be requested, for example, by:

<choice next = URI>

<goto next = URI>

<link next = URI>
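
The two defaulting rules for transition URIs can be sketched with Python's urllib.parse. The document URLs, the dialog ids and the FIRST_DIALOG table are hypothetical; a real interpreter would find the first dialog by parsing the fetched document.

```python
from urllib.parse import urldefrag, urljoin

# Hypothetical data: the current document and the first dialog of each document.
CURRENT_DOC = "http://www.example.com/travel/ajax.vxml"
FIRST_DIALOG = {
    "http://www.example.com/travel/ajax.vxml": "welcome",
    "http://www.example.com/travel/hotels.vxml": "search",
}


def resolve_transition(current_doc, first_dialog, uri):
    """Return (document URL, dialog id) for a transition target."""
    doc_part, dialog = urldefrag(uri)
    # No document part (e.g. "#confirm"): stay in the current document.
    document = urljoin(current_doc, doc_part) if doc_part else current_doc
    # No fragment (e.g. "hotels.vxml"): start at the document's first dialog.
    if not dialog:
        dialog = first_dialog[document]
    return document, dialog


if __name__ == "__main__":
    for target in ("#confirm", "hotels.vxml", "hotels.vxml#pay"):
        print(target, "->", resolve_transition(CURRENT_DOC, FIRST_DIALOG, target))
```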

Prompt Element

• The <prompt> element controls the output of synthesized speech and prerecorded audio.

• Important attributes are:

– bargein: controls whether the caller can interrupt a prompt

– cond: an expression that determines whether the prompt should be spoken

– count: a number that selects among different prompts on repeated attempts

– timeout: the timeout for the following caller input

Tapered Prompts

• Prompts can be used to vary a message given to the human.

<field name = "card_type">
  <prompt count = "1">
    What kind of credit card do you have?
  </prompt>
  <prompt count = "2">
    Type of card?
  </prompt>
  …
</field>

• Prompts may be tapered to be:

– more terse with use (field prompting)

– more explicit (help prompts).
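
How the count attribute picks a prompt can be sketched as follows, assuming the usual VoiceXML rule that the prompt with the highest count not exceeding the current prompt counter is used. The function name select_prompt and the list representation are illustrative only.

```python
def select_prompt(prompts, counter):
    """Pick the prompt with the highest count that is <= counter.

    prompts is a list of (count, text) pairs; counter starts at 1 and
    grows each time the field is prompted again.
    """
    eligible = [(count, text) for count, text in prompts if count <= counter]
    if not eligible:
        return None
    return max(eligible, key=lambda pair: pair[0])[1]


CARD_TYPE_PROMPTS = [
    (1, "What kind of credit card do you have?"),
    (2, "Type of card?"),
]

if __name__ == "__main__":
    for attempt in (1, 2, 3):
        print(attempt, select_prompt(CARD_TYPE_PROMPTS, attempt))
```

From the second attempt onwards the terse prompt is chosen, because no prompt with a higher count exists.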

Catch and Help Element

• The <help> element is an abbreviation for

<catch event = "help">
</catch>

• For example

<help>
  Please say Visa, Mastercard, or American Express.
</help>

• Additional attributes are: "count" and "cond".

Conditions

• The <if> element is used for conditional logic.

• It has optional <else> and <elseif> elements.

• The expression language used is JavaScript.

<if cond = "(card_type == 'amex' ||
            card_type == 'american express') &amp;&amp;
            card_num.length != 15">
  <elseif …/>
</if>

• In the "cond" attribute, the operator "&&" needs to be escaped as "&amp;&amp;".
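
When VoiceXML is generated by a script, the escaping can be left to a library. The sketch below uses quoteattr from Python's standard xml.sax.saxutils; the variable names are illustrative and the condition is the one from the slide.

```python
from xml.sax.saxutils import quoteattr

# The condition as one would write it in plain JavaScript.
condition = ("(card_type == 'amex' || card_type == 'american express') "
             "&& card_num.length != 15")

# quoteattr escapes &, < and > and wraps the value in quotes, so the
# result can be placed directly after "cond=".
if_tag = "<if cond=" + quoteattr(condition) + ">"
print(if_tag)
```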

Variables

• Variables are declared by <var> elements:

<var name = "mm"/>
<var name = "i" expr = "expiry_date.length"/>

• Variables are also declared by form items:

<field name = "card_type"> … </field>

• Attributes are:

– name: the name of the variable that will hold the result

– expr: the initial value of the variable (optional)

Assign Element

• The <assign> element assigns a value to a variable:

<assign name = "mm" expr = "expiry_date.substring(0, 1)"/>

• Variables need to be declared before making an assignment.

• Attributes are:

– name: the name of the variable being assigned to

– expr: the new value of the variable

Clear Element

• The <clear> element resets one or more variables.

• For example:

<clear namelist = "card_num"/>

• The attribute "namelist" contains the variables to be reset.

Throw Element

• The <throw> element throws an event.

• This can be a pre-defined one:

<throw event = "nomatch"/>

or an application-defined one:

<throw event = "com.att.portal.machine"/>
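
Dotted event names such as com.att.portal.machine allow a handler to catch a whole family of events. The sketch below assumes token-prefix matching on the dot-separated parts of the name, as described in the VoiceXML 2.0 specification; the function name catches is hypothetical, and handler scoping and ordering are ignored.

```python
def catches(handler_event, thrown_event):
    """True if the handler name equals, or is a dotted prefix of, the event."""
    handler_tokens = [token for token in handler_event.split(".") if token]
    thrown_tokens = thrown_event.split(".")
    # An empty handler name has no tokens and therefore matches every event.
    return thrown_tokens[:len(handler_tokens)] == handler_tokens


if __name__ == "__main__":
    print(catches("com.att", "com.att.portal.machine"))
    print(catches("nomatch", "com.att.portal.machine"))
```

Matching is done on whole tokens, so a handler for "com.at" does not catch "com.att.portal.machine".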

Submit Element

• The <submit> element is used to submit information to a server:

<submit next = "place_order.asp"
        namelist = "card_type card_num expiry_date"/>

• It lets you submit a list of variables to the document server via an HTTP GET or POST request:

<submit next = "place_order.asp" method = "post"
        namelist = "card_type card_num expiry_date"/>
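
The difference between the two request methods is where the variables travel. The sketch below encodes a namelist with Python's urllib.parse.urlencode; the function name build_request and the variable values (a dummy card number and expiry date) are made up for illustration, and no request is actually sent.

```python
from urllib.parse import urlencode


def build_request(next_uri, namelist, variables, method="get"):
    """Return (method, url, body) for submitting the named variables."""
    pairs = [(name, variables[name]) for name in namelist.split()]
    encoded = urlencode(pairs)
    if method.lower() == "post":
        return ("POST", next_uri, encoded)            # variables in the body
    return ("GET", next_uri + "?" + encoded, None)    # variables in the URL


# Dummy values for illustration only.
variables = {"card_type": "visa", "card_num": "4111111111111111",
             "expiry_date": "0406"}

if __name__ == "__main__":
    names = "card_type card_num expiry_date"
    print(build_request("place_order.asp", names, variables))
    print(build_request("place_order.asp", names, variables, method="post"))
```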

What is JavaScript?

• JavaScript is an object-oriented scripting language.

• Client-side JavaScript is

– an implementation of ECMAScript

– usually embedded directly in HTML pages

– interpreted.

• Server-side JavaScript is

– used with Web servers

– proprietary and vendor-specific

– interpreted or compiled.

What is JavaScript good for?

• Basically JavaScript is a scripting language for HTML designers.

• The JavaScript language has a very simple syntax.

• JavaScript can

– put dynamic text into an HTML page,

– react to events,

– read and write HTML elements,

– be used to validate data.

Are JavaScript and Java identical?

• No!

• JavaScript and Java are two completely different things!

• JavaScript is a lightweight scripting language.

• Java is a powerful programming language.

• Java belongs to the same category as C and C++.

Dynamic Scripting (CGI: script.py)

#!/usr/local/bin/python
# Python 3 syntax; the cgi module was removed from the standard library in Python 3.13.
import cgi

form = cgi.FieldStorage()  # parse the submitted parameters

print("Content-type: text/xml\n")  # header plus the blank line that ends it

if form["answer"].value == 'true':
    print("<vxml version=\"2.0\"><form>"
          "<block>You just said yes</block></form></vxml>")
else:
    print("<vxml version=\"2.0\"><form>"
          "<block>You just said no</block></form></vxml>")
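
The same service logic can be written as a plain function, which makes it easy to test without a Web server. This is a sketch under assumptions: the function name vxml_reply is hypothetical, the query string is parsed with urllib.parse instead of the cgi module, and a missing "answer" parameter is treated as "no".

```python
from urllib.parse import parse_qs
from xml.sax.saxutils import escape


def vxml_reply(query_string):
    """Return a VoiceXML document that echoes the 'answer' parameter."""
    answer = parse_qs(query_string).get("answer", ["false"])[0]
    message = "You just said yes" if answer == "true" else "You just said no"
    return ('<vxml version="2.0"><form><block>'
            + escape(message)
            + '</block></form></vxml>')


if __name__ == "__main__":
    print(vxml_reply("answer=true"))
```

In a CGI setting the query string would typically come from the QUERY_STRING environment variable, and the Content-type header would be printed before the document.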

VoiceXML Implementations

• Web-based VoiceXML development tools:

– Tellme at http://studio.tellme.com

– BeVocal at http://cafe.bevocal.com

– HeyAnita at http://www.heyanita.com

• VoiceXML platforms (and graphical development tools):

– Nuance at http://www.nuance.com

– OptimTalk at http://www.optimtalk.cz

– VocomoVoice Studio at http://www.vocomosoft.com/vvs.htm

Tellme Studio

• Tellme Studio is a suite of Web-based VoiceXML development tools.

• Tellme Studio enables you

– to build, test, and publish VoiceXML applications

– without buying or installing any hardware or software.

• By registering, you can develop your application for free.

• But first check which VoiceXML elements are supported by the Tellme voice interpreter.

MyStudio

• VoiceXML scratchpad

– You can write a phone application using the VoiceXML scratchpad.

• Application URL

– Alternatively, you can write a phone application using a text editor and store the result on a Web server.

– The application URL points to the initial VoiceXML document.

• VoiceXML terminal

– You can test the application logic and flow using the VoiceXML terminal.

MyStudio

Application URL

VoiceXML Terminal

Grammar Scratchpad

• The Tellme platform provides two choices when writing grammars:

– use a built-in grammar

– define your own grammar

• Supported grammar languages are:

– Nuance Grammar Specification Language (GSL)

– Speech Recognition Grammar Specification (SRGS)

• You can execute both GSL and SRGS grammars in the VoiceXML scratchpad.

• But the Tellme grammar tools support GSL grammars only.
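
What a grammar does, and what phrase checking and phrase generation mean, can be illustrated with a toy model. The representation below is neither GSL nor SRGS syntax and is not the Tellme tooling; the grammar contents and the function names are assumptions made for this sketch.

```python
import itertools

# Toy grammar: a sequence of slots, each slot a list of alternatives.
# An empty string marks an optional word. This is not GSL or SRGS syntax.
CARD_GRAMMAR = [
    ["", "my"],
    ["visa", "mastercard", "american express"],
    ["", "card"],
]


def generate_phrases(grammar):
    """Yield every phrase the toy grammar accepts."""
    for words in itertools.product(*grammar):
        yield " ".join(word for word in words if word)


def accepts(grammar, phrase):
    """Check whether a phrase is in-grammar (case-insensitive)."""
    return phrase.strip().lower() in set(generate_phrases(grammar))


if __name__ == "__main__":
    print(accepts(CARD_GRAMMAR, "my Visa card"))
    print(sorted(generate_phrases(CARD_GRAMMAR)))
```

Enumerating all phrases only works for small, finite grammars; real grammar formats also allow repetition and rule references.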

Grammar Scratchpad: GSL

Grammar Phrase Checker

Grammar Phrase Checker: Returned Value

Grammar Phrase Generator

Grammar Phrase Generator: Generated Phrase

Connecting to Tellme Studio

• To preview your application, you can use a phone and call

– (408)-678-4465

or you can use a soft phone and call

– sip:[email protected]

GSL Grammar at Work

<?xml version = "1.0"?>
<vxml version = "2.0">
  <form>
    <field name = "destination">
      <prompt>Do you want to fly to New York or Washington?</prompt>
      <!-- Inline GSL grammar: [...] lists alternatives, (...) is a word
           sequence, {<slot "value">} assigns the field value. -->
      <grammar type = "application/x-gsl" mode = "voice">
        <![CDATA[
          [[(new york) (big apple)] {<destination "new york">}
           [washington (the capital)] {<destination "washington">}]
        ]]>
      </grammar>
      <!-- Play the prompt again on unrecognised input or silence. -->
      <catch event = "nomatch noinput">
        <reprompt/>
      </catch>

GSL Grammar at Work

      <!-- Once the field is filled, echo the recognised slot value. -->
      <filled>
        <prompt>You said <value expr = "destination"/></prompt>
      </filled>
    </field>
  </form>
</vxml>
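
The grammar on this slide accepts four phrases and maps them to two values of the destination slot. A minimal Python sketch of that mapping follows, in the spirit of the phrase checker and phrase generator tools. The table GRAMMAR and the functions check_phrase and generate_phrases are hypothetical names chosen for illustration; they are not part of GSL or of the Tellme tools.

```python
# Hypothetical stand-in for the GSL grammar on the slide:
# each accepted phrase maps to the value of the "destination" slot.
GRAMMAR = {
    "new york": "new york",
    "big apple": "new york",
    "washington": "washington",
    "the capital": "washington",
}


def check_phrase(phrase):
    """Return the slot assignment for an accepted phrase, or None (nomatch)."""
    key = " ".join(phrase.lower().split())  # normalise case and whitespace
    value = GRAMMAR.get(key)
    return None if value is None else {"destination": value}


def generate_phrases():
    """List every phrase the grammar accepts, in grammar order."""
    return list(GRAMMAR)


print(check_phrase("Big Apple"))
print(check_phrase("boston"))
print(generate_phrases())
```

A phrase outside the table yields None, which corresponds to the nomatch event that the form catches with a reprompt.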

Speech Recognition

• Speech produces a sound pressure wave which forms an acoustic signal.

• The microphone

– receives the acoustic signal and

– converts it to an analogue signal.

• To store the analogue signal, it must be converted to a digital signal.

• A speech recognizer tries to transform a digitally-encoded acoustic signal in a natural language into text in that language.
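
The analogue-to-digital step can be sketched as sampling followed by quantisation. The Python example below is an illustration under stated assumptions: the 8-bit depth, the 440 Hz test tone and the names digitize and tone are choices made for this sketch, not values from the lecture.

```python
import math

SAMPLE_RATE = 8000   # samples per second; 8 kHz is typical for telephone speech
BITS = 8             # assumed bit depth for this sketch


def digitize(signal, duration, sample_rate=SAMPLE_RATE, bits=BITS):
    """Sample a continuous signal (a function of time in seconds with values
    in [-1, 1]) and quantise each sample to a signed integer."""
    max_level = 2 ** (bits - 1) - 1          # 127 for 8 bits
    n_samples = int(duration * sample_rate)
    return [round(signal(n / sample_rate) * max_level) for n in range(n_samples)]


def tone(t):
    """A 440 Hz sine wave standing in for the analogue microphone signal."""
    return math.sin(2 * math.pi * 440 * t)


samples = digitize(tone, duration=0.01)
print(len(samples), samples[:5])
```

The list of integers is the digitally encoded signal that a recognizer would take as input.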

Problems of Describing Speech

• Sentences consist of sequences of words; in text these are delimited by spaces.

• When we produce speech, there are no markers for word boundaries.

• Speech can be described as a sequence of phonemes.

• To identify words, we need to search for an interpretation within the phoneme sequence.
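
Because the phoneme sequence carries no word boundaries, the search can return more than one interpretation. The Python sketch below enumerates every split of a phoneme sequence into known words. The lexicon, the phoneme symbols and the function name segmentations are assumptions made for illustration only.

```python
# Hypothetical pronunciation lexicon: phoneme tuple -> word.
LEXICON = {
    ("ay",): "I",
    ("ay", "s"): "ice",
    ("k", "r", "iy", "m"): "cream",
    ("s", "k", "r", "iy", "m"): "scream",
}


def segmentations(phonemes):
    """Return every way to split the phoneme sequence into lexicon words."""
    if not phonemes:
        return [[]]                      # one way to segment nothing
    results = []
    for end in range(1, len(phonemes) + 1):
        word = LEXICON.get(tuple(phonemes[:end]))
        if word is not None:
            # Keep this word and segment whatever follows it.
            for rest in segmentations(phonemes[end:]):
                results.append([word] + rest)
    return results


print(segmentations(["ay", "s", "k", "r", "iy", "m"]))
```

The same six phonemes can be read as two different word sequences, so a recognizer needs further knowledge, such as a language model, to choose between them.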

Phases of ASR

[Figure: Audio Input → Feature Extraction → Phoneme Identification → Word Identification → Words & Phrases, supported by the Acoustic Model and the Language Model]

The Predictive Power of N-Grams

• Example input:

– Yesterday I went to the …

• Bigrams:

– Next word is something that can or is likely to follow 'the'

• Trigrams:

– Next word is something that can or is likely to follow 'to the'
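
The difference between conditioning on 'the' and on 'to the' can be sketched with simple counts. The corpus below is made up for illustration, and train and predict are hypothetical helper names; probabilities are plain relative frequencies.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus, only for illustration.
CORPUS = [
    "yesterday i went to the beach",
    "i went to the shop",
    "she went to the beach",
    "he listened to the radio",
    "we saw the movie",
]


def train(sentences, n):
    """Count which word follows each (n-1)-word history."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for i in range(len(words) - n + 1):
            history = tuple(words[i:i + n - 1])
            counts[history][words[i + n - 1]] += 1
    return counts


def predict(counts, history):
    """Return candidate next words with relative-frequency estimates."""
    followers = counts[tuple(history)]
    total = sum(followers.values())
    return {word: count / total for word, count in followers.items()}


bigrams = train(CORPUS, 2)
trigrams = train(CORPUS, 3)
prefix = "yesterday i went to the".split()
print(predict(bigrams, prefix[-1:]))   # conditioned on 'the'
print(predict(trigrams, prefix[-2:]))  # conditioned on 'to the'
```

In this corpus the bigram model still proposes 'movie' after 'the', while the trigram model, which also sees 'to', does not.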

Why use N-Gram Language Modelling?

• N-gram language models allow for large-vocabulary applications.

• A context-free grammar of reasonable complexity can never foresee all the different utterance patterns that callers may use in spontaneous speech input.
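
The coverage argument can be illustrated by comparing a fixed phrase list with a smoothed bigram model. The sketch below uses add-one (Laplace) smoothing, which is a choice made for this example and is not named in the lecture. The training utterances, the phrase list and the function names are likewise hypothetical.

```python
from collections import Counter

# Hypothetical training utterances and a hand-written phrase list.
TRAINING = [
    "i want to fly to new york",
    "i would like to go to washington",
    "fly me to new york please",
]
GRAMMAR_PHRASES = {"new york", "washington"}

tokens = [s.split() for s in TRAINING]
vocabulary = {w for words in tokens for w in words}
unigram_counts = Counter(w for words in tokens for w in words)
bigram_counts = Counter(
    (a, b) for words in tokens for a, b in zip(words, words[1:])
)


def grammar_accepts(utterance):
    """A fixed grammar either matches the whole utterance or rejects it."""
    return utterance in GRAMMAR_PHRASES


def bigram_probability(utterance):
    """Product of add-one smoothed bigram estimates of P(b | a).

    The unigram count of a is used as the history count, which is a
    simplification that is good enough for this sketch.
    """
    words = utterance.split()
    probability = 1.0
    for a, b in zip(words, words[1:]):
        probability *= (bigram_counts[(a, b)] + 1) / (
            unigram_counts[a] + len(vocabulary)
        )
    return probability


utterance = "i want to go to new york"
print(grammar_accepts(utterance))         # no such entry in the phrase list
print(bigram_probability(utterance) > 0)  # unseen word order still gets a score
```

The phrase list rejects the full sentence outright, whereas the statistical model assigns it a non-zero probability even though this exact sentence never occurred in training.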

Development Life Cycle

• The development life cycle of speech applications consists of the following stages:

– investigation: identify application

– design: specify business and conceptual model plus technology

– development: develop application

– testing: test application

– sustaining: deploy application

Dialog Styles

[Figure labels: Mixed Initiative; Application Directed; Forms; DTMF Menus; User Directed; Dictation; Query; Command and Control]

SALT

• SALT (= Speech Application Language Tags)

– is an extension of HTML

– consists of a small set of XML elements (tags)

– adds a powerful speech interface to Web pages.

• SALT can be used for both

– voice-only browsers

– multimodal browsers.

Multimodal Dialogs

• Today, spoken interfaces are possible that prompt users with speech and understand simple words or phrases.

• As the technology improves we can expect richer conversations.

• Speech can be combined with other modes of interaction.

• Multimodal interaction will enable users to

– speak

– write and type

– hear and see

using a more natural interface than today's single-mode browsers.

Multimodal Browser

VoiceXML versus SALT

• VoiceXML and SALT are both

– markup languages

– that describe speech interfaces.

• VoiceXML is designed for telephony applications:

– interactive voice response applications are the focus.

• SALT targets speech applications across a whole spectrum:

– multimodal interactions are the focus.

VXML Solution

<?xml version = "1.0"?>
<vxml version = "2.0">
  <form id = "TravelForm">
    <field name = "OriginCity">
      <grammar src = "city.xml"/>
      <prompt> Where would you like to leave from? </prompt>
      <nomatch> I didn't understand </nomatch>
    </field>
    <field name = "DestCity">
      <grammar src = "city.xml"/>
      <prompt> Where would you like to go to? </prompt>
      <nomatch> I didn't understand </nomatch>
    </field>
    <filled> <submit ... /> </filled>
  </form>
</vxml>
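
In the VoiceXML version the interpreter drives the dialog: it visits the first unfilled field, plays its prompt, and moves on once the field is filled. The Python sketch below is a simplified simulation of that loop, not the full form interpretation algorithm. FIELDS, CITIES and run_form are hypothetical names, and the city list stands in for the contents of city.xml, which the slides do not show.

```python
# Hypothetical stand-ins: FIELDS mirrors the two <field> elements on the
# slide, CITIES stands in for the grammar in city.xml.
FIELDS = [
    ("OriginCity", "Where would you like to leave from?"),
    ("DestCity", "Where would you like to go to?"),
]
CITIES = {"sydney", "melbourne"}


def run_form(utterances):
    """Visit the first unfilled field, prompt, and consume one utterance.

    `utterances` simulates what the recogniser hears, in order.
    Returns the filled fields and the list of prompts that were played.
    """
    heard = iter(utterances)
    values, played = {}, []
    queue_prompt = True
    while len(values) < len(FIELDS):
        name, prompt = next((n, p) for n, p in FIELDS if n not in values)
        if queue_prompt:
            played.append(prompt)
        answer = next(heard, None)
        if answer is None:
            break                                  # no more input in this simulation
        if answer in CITIES:
            values[name] = answer                  # field is filled
            queue_prompt = True
        else:
            played.append("I didn't understand")   # <nomatch> handler
            queue_prompt = False                   # no <reprompt/>, so the field
                                                   # prompt is not queued again
    return values, played


values, played = run_form(["perth", "sydney", "melbourne"])
print(values)
print(played)
```

Once both fields hold a value the loop ends, which is the point at which the form-level filled element would submit the data.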

SALT Solution

<!-- HTML -->
<html xmlns:salt = "urn:saltforum.org/schemas/020124">
  <body onload = "RunAsk()">
    <form id = "travelForm">
      <input name = "txtBoxOriginCity" type = "text" />
      <input name = "txtBoxDestCity" type = "text" />
    </form>

SALT Solution

    <!-- Speech Application Language Tags -->
    <salt:prompt id = "askOriginCity"> Where would you like to leave from?
    </salt:prompt>
    <salt:prompt id = "askDestCity"> Where would you like to go to?
    </salt:prompt>
    <salt:prompt id = "sayDidntUnderstand" onComplete = "RunAsk()">Sorry, I didn't understand.
    </salt:prompt>

SALT Solution

    <salt:reco id = "recoOriginCity" onReco = "procOriginCity()"
               onNoReco = "sayDidntUnderstand.Start()">
      <salt:grammar src = "city.xml" />
    </salt:reco>
    <salt:reco id = "recoDestCity" onReco = "procDestCity()"
               onNoReco = "sayDidntUnderstand.Start()">
      <salt:grammar src = "city.xml" />
    </salt:reco>

SALT Solution

    <!-- script -->
    <script>
      function RunAsk() {
        if (travelForm.txtBoxOriginCity.value == "") {
          askOriginCity.Start();
          recoOriginCity.Start();
        } else if (travelForm.txtBoxDestCity.value == "") {
          askDestCity.Start();
          recoDestCity.Start();
        }
      }

SALT Solution

      function procOriginCity() {
        travelForm.txtBoxOriginCity.value = recoOriginCity.text;
        RunAsk();
      }
      function procDestCity() {
        travelForm.txtBoxDestCity.value = recoDestCity.text;
        travelForm.submit();
      }
    </script>
  </body>
</html>
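
In the SALT version the page script, not the interpreter, decides what to ask next, and recognition results arrive through event handlers. The Python sketch below imitates that callback flow with a queue of simulated recognition results. The class TravelPage and its method names are hypothetical, and None stands for a failed recognition.

```python
class TravelPage:
    """Toy model of the page: two text boxes plus a queue of simulated
    recognition results (None stands for a failed recognition)."""

    def __init__(self, results):
        self.origin = ""
        self.dest = ""
        self.results = list(results)
        self.log = []
        self.submitted = False

    def run_ask(self):
        # Same decision as RunAsk(): ask for the first empty text box.
        if self.origin == "":
            self.log.append("askOriginCity")
            self.listen(self.proc_origin_city)
        elif self.dest == "":
            self.log.append("askDestCity")
            self.listen(self.proc_dest_city)

    def listen(self, on_reco):
        # Stand-in for starting a reco element: deliver the next result.
        if not self.results:
            return
        text = self.results.pop(0)
        if text is None:
            self.log.append("sayDidntUnderstand")   # onNoReco handler
            self.run_ask()                          # onComplete of that prompt
        else:
            on_reco(text)                           # onReco handler

    def proc_origin_city(self, text):
        self.origin = text
        self.run_ask()

    def proc_dest_city(self, text):
        self.dest = text
        self.submitted = True


page = TravelPage([None, "Sydney", "Melbourne"])
page.run_ask()
print(page.log, page.origin, page.dest, page.submitted)
```

The log shows the origin question being asked twice, because the first simulated recognition fails and the apology prompt hands control back to the asking function.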

