Stochastic Language Generation for Spoken Dialog Systems
Alice Oh
School of Computer Science, Language Technologies Institute
Carnegie Mellon University
April 19, 2023 Speech Group 2
CarnegieMellon
Big Question
How can we design a good NLG system for spoken dialog systems?
What is NLG?
Natural Language Understanding (NLU): Text => Semantic (Syntactic) Representation
Natural Language Generation (NLG): Semantic (Syntactic) Representation => Text
Example of NLG
{act query content name} => What is your full name?
(Carnegie Mellon Communicator)
Example of NLG
cat clause
process
  type material
  effect-type creative
  lex ‘score’
tense past
participants
  agent
    cat proper
    head
      cat person-name
      first-name [lex ‘Michael’]
      last-name [lex ‘Jordan’]
  created
    cat np
    cardinal [value 36]
    definite no
    head [lex ‘point’]
=> Michael Jordan scored 36 points.
(FUF/SURGE Elhadad and Robin, 1996)
What is a good NLG system?
High quality output?
  Write like Shakespeare?
  Talk like … the presidential candidates?
  Or… just produce grammatical sentences?
Reusable? Portable? What about development & maintenance?
What is a spoken dialog system?
A task-oriented human-computer interaction via natural spoken dialog
  CMU Communicator: complex travel planning system
  Jupiter: worldwide weather report
What is not a spoken dialog system?
  C-STAR speech-to-speech translation system
  American Airlines flight information system (IVR)
NLG in spoken dialog systems
Language is different from text-based applications
  Shorter in length
  Simpler in structure
  Less strict in following grammatical rules
Lexicon is domain-specific
Communicator Project
A spoken dialog system in which users engage in a telephone conversation with the system using natural language to solve a complex travel reservation task
Components
  Sphinx-II speech recognizer
  Phoenix semantic parser
  Domain agents
  Agenda-based dialog manager
  Stochastic natural language generator
  Festival domain-dependent Text-to-Speech (being integrated)
Want to know more? Call toll-free at 1-877-CMU-PLAN
Problem Statement
Problem: build a generation engine for a dialog system that can combine the advantages, as well as overcome the difficulties, of the two dominant approaches (template-based generation, and grammar rule-based NLG)
Our Approach: design a corpus-driven stochastic generation engine that takes advantage of the characteristics of task-oriented conversational systems. Some of those characteristics are that
  Spoken utterances are much shorter in length
  There are well-defined subtopics within the task, so the language can be selectively modeled
Stochastic NLG: overview
Language Model: an n-gram language model of the domain expert’s language, built from a corpus of travel reservation dialogs
Generation: given an utterance class, randomly generates a set of candidate utterances based on the LM distributions
Scoring: based on a set of heuristics, scores the candidates and picks the best one
Slot filling: substitute slots in the utterance with the appropriate values in the input frame
Stochastic NLG: overview
Dialog Manager => Input Frame: {act query content depart_time depart_date 20000501}
Tagged Corpora => Language Models
Generation (Input Frame + Language Models) => Candidate Utterances:
  What time on {depart_date}?
  At what time would you be leaving {depart_city}?
Scoring => Best Utterance:
  What time on {depart_date}?
Slot Filling => Complete Utterance:
  What time on Mon, May 8th?
Complete Utterance => TTS
[Figure: system architecture. Components: Sphinx ASR, Phoenix Parser, Dialog Manager, Backend Modules, Stochastic NLG, Festival Synthesis. Data passed between components: speech signal, words, semantic frames, queries, data.]
Stochastic NLG: Corpora
Human-Human dialogs in travel reservations
(CMU-Leah, SRI-ATIS/American Express dialogs)
                   CMU (Agent)   CMU (User)   SRI (Agent)   SRI (User)
# of Dialogs             39 (CMU total)             68 (SRI total)
# of Utterances         970          946         2245         2060
# of Words            12852         7848        27695        17995
Example
Utterances in Corpus:
  What time do you want to depart {depart_city}?
  What time on {depart_date} would you like to depart?
  What time would you like to leave?
  What time do you want to depart on {depart_date}?
Output (different from corpus):
  What time would you like to depart?
  What time on {depart_date} would you like to depart {depart_city}?
  *What time on {depart_date} would you like to depart on {depart_date}?
Evaluation
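[Figure: evaluation setup. Dialog transcriptions are passed through the Stochastic NLG and the Template NLG in batch-mode generation, producing dialogs with output S and dialogs with output T, which then go to a comparative evaluation.]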
Preliminary Evaluation
Batch-mode generation using two systems, comparative evaluation of output by human subjects
User Preferences (49 utterances total)
Weak preference for Stochastic NLG (p = 0.18)
subject   stochastic   templates   difference
1         41           8           33
2         34           15          19
3         17           32          -15
4         32           17          15
5         30           17          13
6         27           19          8
7         8            41          -33
average   27           21.29       5.71
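The averages in the table can be rechecked with a few lines of arithmetic. The sketch below only recomputes the three means from the per-subject counts; it does not attempt to reproduce the reported p-value, since the slide does not say which test was used. The names `counts` and `summarize` are illustrative, not from the original system.

```python
# Per-subject preference counts transcribed from the table:
# (stochastic, templates). Not every subject's counts sum to 49.
counts = [(41, 8), (34, 15), (17, 32), (32, 17), (30, 17), (27, 19), (8, 41)]


def summarize(rows):
    """Return mean stochastic count, mean template count, mean difference."""
    n = len(rows)
    mean_s = sum(s for s, _ in rows) / n
    mean_t = sum(t for _, t in rows) / n
    mean_d = sum(s - t for s, t in rows) / n
    return mean_s, mean_t, mean_d


mean_s, mean_t, mean_d = summarize(counts)
print(f"{mean_s:.2f} {mean_t:.2f} {mean_d:.2f}")  # prints 27.00 21.29 5.71
```

Five of the seven subjects preferred the stochastic output more often, while subjects 3 and 7 preferred the templates.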
Stochastic NLG: Advantages
corpus-driven
easy to build (minimal knowledge engineering)
fast prototyping
minimal input (speech act, slot values)
natural output
leverages data-collecting/tagging effort
Open Issues
How big of a corpus do we need?
How much of it needs manual tagging?
How does the n in n-gram affect the output?
What happens to output when two different human speakers are modeled in one model?
Can we replace “scoring” with a search algorithm?
Current Approaches
Traditional (rule-based) NLG
  hand-crafted generation grammar rules and other knowledge
  input: a very richly specified set of semantic and syntactic features
  Example*
(h / |possible<latent|
:domain (h2 / |obligatory<necessary|
:domain (e / |eat,take in|
:agent you
:patient (c / |poulet|))))
  => You may have to eat chicken
Template-based NLG
  simple to build
  input: a dialog act, and/or a set of slot-value pairs
* from a Nitrogen demo website, http://www.isi.edu/natural-language/projects/nitrogen/
Stochastic NLG can also be thought of as a way to automatically build templates from a corpus
If you set n equal to a large enough number, most utterances generated by LM-NLG will be exact duplicates of the utterances in the corpus.
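The effect of n can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not the Communicator implementation: it uses four tagged utterances from this deck as the whole corpus, splits "?" into its own token, ignores probabilities, and simply enumerates which utterances an n-gram model could produce. The names `train` and `generable` are hypothetical.

```python
from collections import defaultdict

# Tagged utterances from this deck, with "?" split off as a separate token.
CORPUS = [
    "What time do you want to depart {depart_city} ?",
    "What time on {depart_date} would you like to depart ?",
    "What time would you like to leave ?",
    "What time do you want to depart on {depart_date} ?",
]


def train(utterances, n):
    """Map each (n-1)-token context to the set of tokens seen after it."""
    model = defaultdict(set)
    for utterance in utterances:
        tokens = ["<s>"] * (n - 1) + utterance.split() + ["</s>"]
        for i in range(n - 1, len(tokens)):
            model[tuple(tokens[i - n + 1:i])].add(tokens[i])
    return model


def generable(model, n, max_tokens=14):
    """Enumerate every complete utterance of at most max_tokens tokens."""
    complete = set()
    stack = [[]]
    while stack:
        prefix = stack.pop()
        padded = ["<s>"] * (n - 1) + prefix
        context = tuple(padded[len(prefix):])  # the last n-1 tokens
        for token in model.get(context, ()):
            if token == "</s>":
                complete.add(" ".join(prefix))
            elif len(prefix) < max_tokens:
                stack.append(prefix + [token])
    return complete


bigram_output = generable(train(CORPUS, 2), 2)
large_n_output = generable(train(CORPUS, 11), 11)

novel = "What time would you like to depart ?"
print(novel in bigram_output, novel in large_n_output)
print(large_n_output == set(CORPUS))
```

With n = 2 the model can recombine fragments into utterances that never occur in the corpus, including ones with a repeated slot. With n larger than the longest utterance, every context encodes the full prefix, so only corpus utterances can be produced, which is the sense in which the model behaves like a set of templates.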
Tagging
CMU corpus tagged manually
SRI corpus tagged semi-automatically using trigram language models built from CMU corpus
Tags
Utterance classes (29)
query_arrive_city inform_airport
query_arrive_time inform_confirm_utterance
query_arrive_time inform_epilogue
query_confirm inform_flight
query_depart_date inform_flight_another
query_depart_time inform_flight_earlier
query_pay_by_card inform_flight_earliest
query_preferred_airport inform_flight_later
query_return_date inform_flight_latest
query_return_time inform_not_avail
hotel_car_info inform_num_flights
hotel_hotel_chain inform_price
hotel_hotel_info other
hotel_need_car
hotel_need_hotel
hotel_where
Attributes (24)
airline flight_num
am hotel
arrive_airport hotel_city
arrive_city hotel_price
arrive_date name
arrive_time num_flights
car_company pm
car_price price
connect_airline
connect_airport
connect_city
depart_airport
depart_city
depart_date
depart_time
depart_tod
Stochastic NLG: Generation
Given an utterance class, randomly generates a set of candidate utterances based on the LM distributions
Generation stops when an utterance has a penalty score of 0 or the maximum number of iterations (50) has been reached
Average generation time: 75 msec for Communicator dialogs
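The stopping rule can be sketched as a short loop. This is a minimal illustration under stated assumptions: `generate_best`, `toy_sample`, and `toy_penalty` are hypothetical names, the sampler cycles through two fixed candidates instead of sampling from an n-gram model, and the penalty is a stand-in for the scoring heuristics.

```python
from itertools import cycle
from typing import Callable, Tuple

MAX_ITERATIONS = 50  # iteration limit stated on the slide


def generate_best(sample: Callable[[], str],
                  penalty: Callable[[str], int],
                  max_iterations: int = MAX_ITERATIONS) -> Tuple[str, int]:
    """Draw candidates until one scores 0 or the iteration limit is hit.

    Returns the lowest-penalty candidate seen and its penalty.
    """
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    best, best_score = "", None
    for _ in range(max_iterations):
        candidate = sample()
        score = penalty(candidate)
        if best_score is None or score < best_score:
            best, best_score = candidate, score
        if score == 0:  # a penalty-free utterance ends generation early
            break
    return best, best_score


# Toy stand-ins (assumptions, not the real sampler or scorer).
_candidates = cycle([
    "At what time would you be leaving {depart_city}?",
    "What time on {depart_date}?",
])
toy_sample = lambda: next(_candidates)
toy_penalty = lambda utt: 1 if "{depart_city}" in utt else 0

best, score = generate_best(toy_sample, toy_penalty)
print(best, score)
```

Here the second candidate has no penalty, so the loop stops after two draws. If no candidate ever reaches 0, the loop runs all 50 iterations and returns the lowest-penalty utterance seen.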
Stochastic NLG: Scoring
Assign various penalty scores for
  unusual length of utterance (thresholds for too-long and too-short)
  slot in the generated utterance with an invalid (or no) value in the input frame
  a “new” and “required” attribute in the input frame that’s missing from the generated utterance
  repeated slots in the generated utterance
Pick the utterance with the lowest penalty (or stop generating at an utterance with 0 penalty)
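The four heuristics can be sketched as a single penalty function. The weights (one point per violation), the length thresholds, and the way required attributes are passed in are all assumptions made for illustration; the slides do not give the actual values. The sketch also treats only a missing or empty frame value as invalid.

```python
import re

SLOT = re.compile(r"\{(\w+)\}")  # matches slots such as {depart_date}


def penalty(utterance, frame, required=(), min_words=3, max_words=20):
    """Sum one penalty point per violation of the four heuristics."""
    slots = SLOT.findall(utterance)
    n_words = len(utterance.split())
    score = 0
    # 1. Unusual length (these thresholds are made-up defaults).
    if n_words < min_words or n_words > max_words:
        score += 1
    # 2. Slots whose value is missing or empty in the input frame.
    score += sum(1 for s in set(slots) if not frame.get(s))
    # 3. Required attributes that the utterance never mentions.
    score += sum(1 for a in required if a not in slots)
    # 4. Repeated slots.
    score += len(slots) - len(set(slots))
    return score


frame = {"act": "query", "content": "depart_time", "depart_date": "20000501"}
candidates = [
    "At what time would you be leaving {depart_city}?",
    "What time on {depart_date} would you like to depart on {depart_date}?",
    "What time on {depart_date}?",
]
scores = [penalty(c, frame, required=("depart_date",)) for c in candidates]
best = min(candidates,
           key=lambda c: penalty(c, frame, required=("depart_date",)))
print(scores, best)  # prints [2, 1, 0] What time on {depart_date}?
```

The first candidate is penalized twice (a slot with no value, and the required `depart_date` missing), the second once (a repeated slot), and the third not at all, so it is selected.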
Stochastic NLG: Slot Filling
Substitute slots in the utterance with the appropriate values in the input frame
Example:
  What time do you need to arrive in {arrive_city}?
  => What time do you need to arrive in New York?
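Slot filling amounts to a pattern substitution. The sketch below assumes slots are written as `{name}` and that the input frame is available as a dictionary; `fill_slots` is a hypothetical helper name.

```python
import re

SLOT = re.compile(r"\{(\w+)\}")  # matches slots such as {arrive_city}


def fill_slots(utterance: str, frame: dict) -> str:
    """Replace each {slot} with the matching value from the input frame.

    Raises KeyError if a slot has no entry in the frame.
    """
    return SLOT.sub(lambda m: str(frame[m.group(1)]), utterance)


print(fill_slots("What time do you need to arrive in {arrive_city}?",
                 {"arrive_city": "New York"}))
# prints: What time do you need to arrive in New York?
```

Raising an error on a missing slot value is a design choice for this sketch; in the pipeline described here, such utterances would already have been penalized during scoring.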
Stochastic NLG: Shortcomings
What might sound natural (imperfect grammar, intentional omission of words, etc.) for a human speaker may sound awkward (or wrong) for the system.
It is difficult to define utterance boundaries and utterance classes. Some utterances in the corpus may be a conjunction of more than one utterance class.
Factors other than the utterance class may affect the words (e.g., discourse history).
Some sophistication built into traditional NLG engines is not available (e.g., aggregation, anaphorization).
Evaluation
Must be able to evaluate generation independent of the rest of the dialog system
Comparative evaluation using dialog transcripts
  need more subjects
  8-10 dialogs; system output generated batch-mode by two different engines
Evaluation of human travel agent utterances
  Do users rate them well?
  Is it good enough to model human utterances?