UWSP Web Speech Research Group Joe Frost Mark Stenerson Professor Dave Gibbs Presentation to AITP...

UWSP Web Speech Research Group

Joe Frost, Mark Stenerson, Professor Dave Gibbs

Presentation to AITP, Monday, October 17, 2005

Before beginning…

Thanks for the opportunity to share

Feel free to interrupt with Questions/Comments

This will hopefully be of interest to you!

AITP needed a presentation! We’re happy to be here…

Overview

UWSP Web Speech Research Group
• Purpose
• Research Interest
• Origins

Previous Work
• Shaker – PowerPoint conversion
• SIDE – Speaking Integrated Development Environment

Current Work
• Interactions with Web Pages
• Voice-controlled database interaction
• Telephony Applications

Questions

UWSP Web Speech Research Group Purpose

Research the functionality and usefulness of voice input and output on the web through the creation of meaningful projects and prototypes

• Voice input: speech recognition (SR, or ASR)
• Voice output: speech synthesis, or text-to-speech (TTS)

UWSP Web Speech Research Group Completed Research

Developed IDE to assist in the preparation of speaking online course materials
• Development tool to create speaking web pages
• Useful for instructors, instructional designers, and training materials

Current Research
• Interactive browsing capability (forms, etc.)
• Investigating speaking web pages with broader applicability – including speech recognition
• Interactive database prototype – “hands-free” database updating
• Telephony-based systems

Origins of Web Speech Research Group

Online Course: WDMD 170, Spring 2004
• Audio over PowerPoint, or saved as HTML
• Large files – inaccessible to dial-up users
• Clumsy to edit, maintain

Investigated Speech Recognition
• XP wished to train my profile

Opera introduced its “speaking browser”
• March 2004 Press Release
• Currently Opera 8.5 (load same page in Opera and speak)

Investigated Text-To-Speech (TTS)
• Microsoft Speech Application Language Tags (SALT)
• VoiceXML

Initial Inquiry

Investigated Speech Recognition
• XP prompted me to train my profile
• Demonstrate training speech profile (Quick Launch Speech button)

Initial Inquiry

Investigated Text-To-Speech (TTS)
• Built into XP
• Demonstrate TTS Speech

XP Supplied voices
• LH Michael, Michelle
• MS Mary, Mike, Sam

Purchased Voices
• NeoSpeech Kate, Paul

Speaking Integrated Development Environment: SIDE

Uses TTS (Text-To-Speech) technology
• TTS in a web page: essentially a markup language

SALT (Speech Application Language Tags)
• Developed by Microsoft and the SALT Forum

Voice XML
• Roots in a research project called PhoneWeb at AT&T Bell Laboratories
• Eventually picked up by the VoiceXML Forum

Web Page TTS: SALT

SALT:

“The Speech Application Language Tags (SALT) specification enables multimodal and telephony-enabled access to information, applications, and Web services from PCs, telephones, tablet PCs, and wireless personal digital assistants (PDAs). The Speech Application Language Tags extend existing mark-up languages such as HTML, XHTML, and XML.” -SALTForum.com

Web Page TTS: VoiceXML Voice XML

A language for creating voice-user interfaces, particularly for the telephone. It uses speech recognition and touchtone (DTMF keypad) for input, and pre-recorded audio and text-to-speech synthesis (TTS) for output. It is based on the Worldwide Web Consortium's (W3C's) Extensible Markup Language (XML), and leverages the web paradigm for application development and deployment. By having a common language, application developers, platform vendors, and tool providers all can benefit from code portability and reuse. -VoiceXML Forum

Comparison

SALT                               VXML
Newer technology                   Older technology
Support from Microsoft             Support from VXML community
Designed for internet age          Designed for telephony
Controllable                       Many functions, not interactive
Voice purchase availability        Single voice available currently
Large download to enable speech    Very small add-in download

SALT and HTML

<html>
  <head>
    SALT tags
  </head>
  <body>
  </body>
</html>

SALT tags are entered into the head of the document.

Upon rendering the document IE recognizes the embedded SALT using a special plug-in

SALT example: Hello World

How does this work?

Use of tags within the HTML document to invoke the Windows voice

Example of simple tags:

<html xmlns:salt="http://www.saltforum.org/2002/SALT">
  <head>
    <salt:prompt id="hello">Hello World</salt:prompt>
  </head>
  <body onload="hello.Start()">
  </body>
</html>

Hello World Example

(Note: this example only “speaks” if you have the I.E. Web Speech Add-In installed. You can download the add-in from our web page)

VoiceXML and HTML

<html>
  <head>
    <form>
      <block>
      </block>
    </form>
  </head>
  <body>
  </body>
</html>

• VXML speech text within <form> and <block>
• VXML tags before </head>
• Insert to <body>: ev:event="load" ev:handler="#objID"

VoiceXML example: Hello World

How does this work?

Use of tags within the HTML document to invoke the voice within the Opera 8 web browser

Example of simple tags:

<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:ev="http://www.w3.org/2001/xml-events">
  <head>
    <form xmlns="http://www.w3.org/2001/vxml" id="lecture">
      <block> Hello World </block>
    </form>
  </head>
  <body ev:event="load" ev:handler="#lecture">
  </body>
</html>

NOTE: Hello World Example must be manually loaded into Opera. Filename is Opera-HelloWorld.xml

Web Speech Research Group: First Iteration

Fall 2004: Independent Study – 2 students
• Conversion “wizard” – PPT saved as HTML was input
• Used notes section of PPT file as the text to be spoken by the page
• Added SALT Tags (worked only with SALT)
• Added “controls” via JavaScript
• Tabbed “Shaker”, as in SALT Shaker
• Presented December 2004 (requires I.E. Speech Add-in)
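The step of turning slide notes into spoken text could be sketched roughly as follows. This is only an illustration, not the Shaker code: the function names `escapeXml` and `notesToSaltPrompts` and the `slideN` id scheme are hypothetical, and the notes are assumed to be available already as plain strings.

```javascript
// Hypothetical helper: escape characters that would break the markup.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: emit one <salt:prompt> per slide, using the slide's
// notes as the text to be spoken. Ids follow an assumed "slideN" pattern.
function notesToSaltPrompts(notes) {
  return notes
    .map((note, i) => `<salt:prompt id="slide${i + 1}">${escapeXml(note)}</salt:prompt>`)
    .join("\n");
}

// Sample input only; real notes would come from the saved PPT pages.
console.log(notesToSaltPrompts(["Hello World", "Questions & Comments"]));
```

Each generated prompt could then be started from script with the `Start()` method, in the same way the Hello World page starts its prompt.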

Web Speech Research Group: Second Iteration

Spring 2005: Senior Projects Team (CIS 480) – 4 students create an application: Speech Integrated Development Environment (SIDE)
• To enable a web author to add speech to pages
• Useful in online courses
• Create SALT or VXML pages

CIS 499 Independent Study – 2 students
• Interactive database prototype

Spring 2005: Speech IDE

SIDE – Speech Integrated Development Environment

Take SALT Shaker Wizard to an integrated development environment, or Speech IDE

Allow the modification of any existing web page into one with speech

Web Page Conversion: SIDE Project

SIDE Function: permits a web author to easily add speech to his/her web pages.

Conversion:
• HTML with SALT tags and Control Panel (IE)
• HTML with Voice XML tags (Opera)

[Diagram: HTML and text (entered by keyboard, text file, or voice) go into SIDE, which outputs HTML with SALT or HTML with VXML]
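For the Opera target, a conversion of this kind might look something like the sketch below. It is not the SIDE implementation: the function name `addVxmlSpeech` is hypothetical, the page is handled as a plain string with simple pattern replacement, and the namespaces and `ev:` attributes are copied from the VoiceXML Hello World slide.

```javascript
// Hypothetical sketch: add VoiceXML speech to an existing HTML page (as a string).
// Assumes the page has exactly one <html>, </head> and <body> tag and that
// spokenText contains no markup characters.
function addVxmlSpeech(html, spokenText, formId = "lecture") {
  const form =
    `<form xmlns="http://www.w3.org/2001/vxml" id="${formId}">` +
    `<block>${spokenText}</block></form>`;
  return html
    // Declare the XHTML and XML Events namespaces on <html>.
    .replace(/<html\b/i, () =>
      '<html xmlns="http://www.w3.org/1999/xhtml" xmlns:ev="http://www.w3.org/2001/xml-events"')
    // VXML tags go just before </head>.
    .replace(/<\/head>/i, () => form + "</head>")
    // The body's load event triggers the form.
    .replace(/<body\b/i, () => `<body ev:event="load" ev:handler="#${formId}"`);
}

console.log(addVxmlSpeech("<html><head></head><body></body></html>", "Hello World"));
```

Function replacers are used in `replace` so that a `$` in the spoken text is not interpreted as a replacement pattern.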

Where do the TTS tags go (SALT)?

<html>
  <head>
    JavaScript
    SALT tags
  </head>
  <body>
    Control Panel
  </body>
</html>

• JavaScript and SALT tags before </head>
• Control Panel before </body>
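The placement rules above can be illustrated with a small string-based sketch. Everything here is an assumption made for illustration: the function name `addSaltSpeech`, the `speak()`/`hush()` helper scripts, and the two-button control panel are hypothetical and are not the markup SIDE actually generates. Only the SALT namespace and the `Start()`/`Stop()` prompt methods come from the slides.

```javascript
// Hypothetical sketch: insert SALT speech into an existing HTML page (as a string).
// Assumes spokenText contains no markup characters.
function addSaltSpeech(html, spokenText, promptId = "lecture") {
  // JavaScript and SALT tags, to be placed before </head>.
  const headInsert =
    `<salt:prompt id="${promptId}">${spokenText}</salt:prompt>\n` +
    `<script type="text/javascript">\n` +
    `function speak() { ${promptId}.Start(); }\n` +
    `function hush() { ${promptId}.Stop(); }\n` +
    `</script>\n`;
  // Control panel, to be placed before </body>.
  const controlPanel =
    `<div id="speechControls">` +
    `<input type="button" value="Speak" onclick="speak()" />` +
    `<input type="button" value="Stop" onclick="hush()" />` +
    `</div>\n`;
  return html
    .replace(/<html\b/i, () => '<html xmlns:salt="http://www.saltforum.org/2002/SALT"')
    .replace(/<\/head>/i, () => headInsert + "</head>")
    .replace(/<\/body>/i, () => controlPanel + "</body>");
}

console.log(addSaltSpeech("<html><head></head><body><p>Hi</p></body></html>", "Hello World"));
```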

SIDE Conversion Example

Can add speaking text to any HTML page

Convert the AITP Portal Page to a speaking page using SIDE. (close PPT to avoid invoking “Train Profile”)

Will create SALT tags for I.E.

Speaking Pages: Uses?
• Online course page “lectures”
• Low overhead / low fidelity applications
• Training Situations

Looking for prototypical application

SALT Examples
• Recognition only
• Recognition and response combined: demonstrate internal simple processing
• Interactive form: potentially linked to db and submission
• Web page navigation

JF

Main SALT Tags

<salt:prompt>
• The speaking (output) tag; TTS
• Methods: Start() – begin speaking, Stop(), Pause(), Resume()

<salt:listen>
• The listening (input) tag; recognition
• Contains one or more grammars
• Grammars define what words are listened for

<salt:bind>
• Binds the recognized value to a form element
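To make the relationship between listen, grammar, and bind more concrete, here is a hedged sketch that generates a listen element with an inline grammar in the W3C SRGS XML form. The function name `buildListenTag` is hypothetical, and the `resultPath` argument (an XPath into the recognition result) is a placeholder: its real value depends on the recognizer and on how the grammar tags its output, which the slides do not show.

```javascript
// Hypothetical sketch: build a <salt:listen> with an inline word-list grammar
// and a <salt:bind> that copies the recognized value into a form element.
function buildListenTag(id, words, targetElement, resultPath) {
  const items = words.map((w) => `          <item>${w}</item>`).join("\n");
  return [
    `<salt:listen id="${id}">`,
    `  <salt:grammar>`,
    `    <grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"`,
    `             xml:lang="en-US" root="${id}Rule">`,
    `      <rule id="${id}Rule">`,
    `        <one-of>`,
    items,
    `        </one-of>`,
    `      </rule>`,
    `    </grammar>`,
    `  </salt:grammar>`,
    // resultPath is an assumption; adjust it to the recognizer's result format.
    `  <salt:bind targetelement="${targetElement}" value="${resultPath}" />`,
    `</salt:listen>`,
  ].join("\n");
}

console.log(buildListenTag("RecogEU", ["Austria", "Belgium"], "txtCountry", "//country"));
```

The grammar lists the only words the recognizer will listen for; anything else would be expected to end in a no-recognition event rather than a bind.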

JF

Example: recognition

Recognize a country of the European Union
(requires I.E. Speech Add-In and microphone)

Courtesy of Mark Huckvale, University College London
www.phon.ucl.ac.uk/home/mark/salt/

Key code snippets

<salt:listen id="RecogEU">

<input name="txtCountry" type="text" onclick="RecogEU.Start()" />

JF

Example: recognition and response

Recognize a country and supply its capital city (Huckvale)
(requires I.E. Speech Add-In and microphone)

Key code snippets

function LookupCapital(country) {
  if (country == "Austria") return "Vienna";
  else if (country == "Belgium") return "Brussels";
  else if (country == "Cyprus") return "Nicosia";
  else if (country == "Czech Republic") return "Prague";
  // etc.
}
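The same lookup can also be written as a table, which tends to be easier to extend than a chain of `if` statements. This is an alternative sketch, not Huckvale's code: the name `lookupCapital`, the `CAPITALS` table, and the choice to return an empty string for an unknown country are assumptions. Only the four pairs shown on the slide are included.

```javascript
// Table of the country/capital pairs shown on the slide.
const CAPITALS = new Map([
  ["Austria", "Vienna"],
  ["Belgium", "Brussels"],
  ["Cyprus", "Nicosia"],
  ["Czech Republic", "Prague"],
]);

// Return the capital for a recognized country, or "" if it is not in the table
// (the fallback value is an assumption).
function lookupCapital(country) {
  return CAPITALS.has(country) ? CAPITALS.get(country) : "";
}

console.log(lookupCapital("Belgium"));
```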

JF

Example: interactive form

Interactively order a pizza, using a Pizza order form (Hill, WSRG)
(requires I.E. Speech Add-In and microphone)

JF

Example: web page navigation

Navigate pages with links already established

Page Control using speech recognition (Gibbs)
(requires I.E. Speech Add-In and microphone)

Allows speaking interruptions:

<salt:prompt id='Prompt' oncomplete='pageFinished();'
             bargein='true' bargeintype='speech'>

Allows continuous listening:

<salt:listen id="testreco"
             onsilence="testreco.start();"
             onnoreco="testreco.start();"
             onreco="CheckCommand();">
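The slide does not show the body of `CheckCommand()`. One plausible core for it, sketched here under assumptions, is a function that matches the recognized phrase against the text of the page's links and returns the address to navigate to. The name `resolveCommand` and the `{ text, href }` link objects are hypothetical, and how the recognized text is read from the listen object is left out.

```javascript
// Hypothetical sketch: map a recognized phrase to a link target.
// `links` is assumed to be an array of { text, href } objects built from the page.
function resolveCommand(spokenText, links) {
  const heard = spokenText.trim().toLowerCase();
  const match = links.find((link) => link.text.trim().toLowerCase() === heard);
  // null means "no matching link"; the caller would simply keep listening.
  return match ? match.href : null;
}

// Sample data for illustration only.
const sampleLinks = [
  { text: "Home", href: "index.html" },
  { text: "Contact Information", href: "contact.html" },
];
console.log(resolveCommand("contact information", sampleLinks));
```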

JF

What It Can Provide

Accessibility
• No keyboard needed

Hands Free Use
• Automobiles!

Telephony
• Menu driven pages

Transparent Web Applications
• Same pages serving both www and listen-only devices?

JF

Managing Data With Voice Recognition
• Navigate Records
• Create New
• Update
• Delete
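The four operations above can be pictured as a small command dispatcher. The sketch below works on an in-memory array rather than a real database, and the class name `RecordSet`, the spoken command words ("next", "previous", "new", "update", "delete"), and the record values are all assumptions made for illustration. The prototype's actual vocabulary and storage are not described on the slide.

```javascript
// Hypothetical sketch: apply recognized voice commands to a list of records.
class RecordSet {
  constructor(records) {
    this.records = records.slice(); // work on a copy
    this.index = 0;                 // position of the current record
  }

  current() {
    return this.records[this.index];
  }

  // Returns true if the command was understood, false otherwise.
  handle(command, value) {
    switch (command.toLowerCase()) {
      case "next":
        if (this.index < this.records.length - 1) this.index++;
        break;
      case "previous":
        if (this.index > 0) this.index--;
        break;
      case "new":
        this.records.push(value);
        this.index = this.records.length - 1;
        break;
      case "update":
        if (this.records.length > 0) this.records[this.index] = value;
        break;
      case "delete":
        if (this.records.length > 0) {
          this.records.splice(this.index, 1);
          // Keep the index inside the remaining records.
          this.index = Math.max(0, Math.min(this.index, this.records.length - 1));
        }
        break;
      default:
        return false;
    }
    return true;
  }
}

const demo = new RecordSet(["first", "second"]);
demo.handle("next");
console.log(demo.current());
```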

MS

Example: interactive database

Hands-free data management

MS

Where Now?
• Pre-Built Grammars
• Telephony
• Transparent Web Applications

Contact Information

UWSP Web Speech Group
http://www.uwsp.edu/cis/dgibbs/speechgroup

Questions? Comments?
