
Chapter 19: Dialogue and Conversational Agents

Nadia Hamrouni and Ahmed Abbasi

12/5/2006


Applications of Dialogue Agents

• Conversational agents are useful for:

– Booking airline flights

– Answering questions

– Electronic Customer Relationship Management (e-CRM) systems


Characteristics of Dialogue

• Turns and Utterances
  – Dialogue is characterized by turn-taking.
  – Overlapping speech is rare (less than 5%).
  – Speaker transitions occur at utterance boundaries.
  – Boundaries are identified using cue words (e.g., “well”, “and”, “so”).

• Grounding
  – Speaker and hearer must establish common ground (the set of things mutually believed).
  – Done via attention, acknowledgement, contribution, demonstration, and display.

• Conversational Implicature
  – Utterance interpretation relies on more than sentence meaning.
  – Requires drawing inferences.
  – Example (the agent must infer the travel dates from the meeting dates):
    A: “What day in May did you want to travel?”
    C: “I need to be there for a meeting from the 12th to the 15th.”


Dialogue Acts

• Speech Acts:
  – Locutionary act: uttering a sentence with a particular meaning
  – Illocutionary act: the act performed in uttering it (asking, answering, promising, etc.)
  – Perlocutionary act: the effect the utterance produces on the hearer

• Dialogue Acts / Conversational Moves
  – Include various types of conversational functions.

• Dialogue Act Markup in Several Layers (DAMSL) architecture
  – Dialogue act tagging scheme
  – Hierarchical tag set
  – Codes several levels of dialogue information, e.g., forward-looking function and backward-looking function
  – Focused on task-oriented dialogue
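To make the layered tagging idea concrete, here is a minimal Python sketch of one way a DAMSL-style annotation could be stored. The class name `DialogueActTag`, its fields, and the helper `responses` are hypothetical; the tag labels are only a small illustrative subset of the scheme, applied to the travel exchange from the implicature example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DialogueActTag:
    """One utterance annotated on two DAMSL-style layers (illustrative only)."""
    speaker: str
    utterance: str
    forward: Optional[str]   # forward-looking function: how the utterance shapes what follows
    backward: Optional[str]  # backward-looking function: how it relates to earlier talk


# Hand-assigned labels; a real annotation effort would follow the full coding manual.
dialogue = [
    DialogueActTag("A", "What day in May did you want to travel?",
                   forward="INFO-REQUEST", backward=None),
    DialogueActTag("C", "I need to be there for a meeting from the 12th to the 15th.",
                   forward="ASSERT", backward="ANSWER"),
]


def responses(tags: List[DialogueActTag]) -> List[DialogueActTag]:
    """Return the utterances that carry a backward-looking function."""
    return [t for t in tags if t.backward is not None]


for tag in responses(dialogue):
    print(tag.speaker, tag.backward)
```

Keeping the two layers as separate fields reflects the point that a single utterance can both respond to earlier talk and set up what comes next.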


Automatic Interpretation of Dialogue Acts

• Two types of models:

– Plan Inference Models

– Cue-Based Models


Plan Inference Rules

• Rule-based techniques consisting of manually crafted rule sets.

• Rules designed for “AI Planning”
  – How the hearer will handle speaker requests
  – Also called action schemas

• Each schema includes constraints, preconditions, effects, and a body.

• Based on BDI models (Allen, 1995)
  – Belief, Desire, Intention
    • Belief modeled using KNOWs and KNOWIFs
    • Desire modeled using WANTs
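The four parts of an action schema can be sketched as a small data structure. This is a minimal illustration under stated assumptions: the `ActionSchema` class, its methods, and the flight-booking predicates are hypothetical examples, and beliefs are modeled simply as a set of ground strings rather than a real logic.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class ActionSchema:
    """An action schema with the four parts named on the slide."""
    name: str
    constraints: FrozenSet[str]    # type restrictions on the arguments
    preconditions: FrozenSet[str]  # must hold before the action can be performed
    effects: FrozenSet[str]        # become true once the action succeeds
    body: Tuple[str, ...]          # sub-actions that carry out the action

    def applicable(self, beliefs: FrozenSet[str]) -> bool:
        """True when every constraint and precondition is among the beliefs."""
        return self.constraints <= beliefs and self.preconditions <= beliefs

    def apply(self, beliefs: FrozenSet[str]) -> FrozenSet[str]:
        """Return the beliefs extended with the effects of the action."""
        if not self.applicable(beliefs):
            raise ValueError(f"{self.name} is not applicable")
        return beliefs | self.effects


# Hypothetical schema for booking a flight (agent A, client C, flight F).
book_flight = ActionSchema(
    name="BOOK-FLIGHT(A,C,F)",
    constraints=frozenset({"Agent(A)", "Client(C)", "Flight(F)"}),
    preconditions=frozenset({"Know(A,departure-date(F))", "Want(C,BOOK(A,C,F))"}),
    effects=frozenset({"Flight-Booked(A,C,F)"}),
    body=("Make-Reservation(A,F,C)",),
)

beliefs = book_flight.constraints | book_flight.preconditions
print(sorted(book_flight.apply(beliefs)))
```

A plan-inference system reasons over schemas like this in both directions: forward from preconditions to effects, and backward from an observed action to the goal it probably serves.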

Example utterance: “Can you give me a list of flights from Atlanta?”

(S = speaker, H = hearer, B = believes, W = wants)

Step 1: Decompose request: S.REQUEST(S,H,InformIf(H,S,CanDo(H,Give(H,S,LIST))))

Step 2: B(H,W(S,InformIf(H,S,CanDo(H,Give(H,S,LIST)))))

Step 3: B(H,W(S,KnowIf(H,S,CanDo(H,Give(H,S,LIST)))))

Step 4: B(H,W(S,CanDo(H,Give(H,S,LIST))))

Step 5: B(H,W(S,Give(H,S,LIST)))

Step 6: REQUEST(H,S,Give(H,S,LIST))
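The chain of steps above can be reproduced mechanically by rewriting nested terms. The sketch below is only an illustration of that idea, not the original plan-inference system: the function names and the tuple encoding are assumptions, each rule is hard-coded for this one example, and the surface form is written as `S.REQUEST`.

```python
# Terms are nested tuples (functor, arg1, arg2, ...); atoms are plain strings.

def show(term):
    """Render a term in the functional notation used in the steps."""
    if isinstance(term, str):
        return term
    functor, *args = term
    return f"{functor}({','.join(show(a) for a in args)})"


def surface_to_want(term):
    """S.REQUEST(S,H,X) -> B(H,W(S,X)): the hearer believes the speaker wants X."""
    functor, s, h, x = term
    assert functor == "S.REQUEST"
    return ("B", h, ("W", s, x))


def rewrite_wanted(term, rule):
    """Apply `rule` to the content X inside B(H,W(S,X))."""
    _, h, (_, s, x) = term
    return ("B", h, ("W", s, rule(x)))


def inform_to_know(x):
    """InformIf(H,S,P) -> KnowIf(H,S,P): being informed whether P means knowing whether P."""
    functor, h, s, p = x
    assert functor == "InformIf"
    return ("KnowIf", h, s, p)


def know_to_prop(x):
    """KnowIf(H,S,P) -> P: wanting to know whether P suggests wanting P."""
    assert x[0] == "KnowIf"
    return x[3]


def cando_to_action(x):
    """CanDo(H,A) -> A: wanting H to be able to do A suggests wanting A done."""
    assert x[0] == "CanDo"
    return x[2]


def want_to_request(term):
    """B(H,W(S,A)) -> REQUEST(H,S,A): read the whole utterance as a request for A."""
    _, h, (_, s, a) = term
    return ("REQUEST", h, s, a)


def infer(surface):
    """Return the six steps of the inference chain as strings."""
    steps = [surface]
    term = surface_to_want(surface)
    steps.append(term)
    for rule in (inform_to_know, know_to_prop, cando_to_action):
        term = rewrite_wanted(term, rule)
        steps.append(term)
    steps.append(want_to_request(term))
    return [show(s) for s in steps]


give = ("Give", "H", "S", "LIST")
surface = ("S.REQUEST", "S", "H", ("InformIf", "H", "S", ("CanDo", "H", give)))
for number, step in enumerate(infer(surface), start=1):
    print(f"Step {number}: {step}")
```

A real plan-inference system would search over many competing rule applications; this sketch follows the single path shown in the example.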

• Advantages of plan inference models
  – Extremely powerful
  – Combine rich knowledge structures and planning techniques
    • Can capture direct and indirect uses of dialogue

• Disadvantages
  – Time-consuming and labor-intensive
  – Accounting for all possible reasoning makes this approach AI-complete.


Cue-based Interpretation

• Supervised machine learning techniques

• Trained on hand-labeled dialogue corpora
  – Use cues (linguistic features) to identify dialogue act types.
  – Word features
    • “please”, “would you” → REQUEST
  – Conversational structure
    • “yeah” after a proposal → AGREEMENT
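The cues listed above can be turned into features for a classifier. The following Python sketch shows one possible feature extractor; the function names, the feature names, and the `PROPOSAL` label for the previous act are hypothetical, and `classify` is a hand-written stand-in for a model that would normally be learned from a labeled corpus.

```python
from typing import Dict, Optional


def cue_features(utterance: str, previous_act: Optional[str]) -> Dict[str, bool]:
    """Extract a few word and conversational-structure cues from one utterance."""
    text = utterance.lower()
    words = text.replace(",", " ").replace(".", " ").replace("?", " ").split()
    return {
        "has_please": "please" in words,
        "has_would_you": "would you" in text,
        "starts_with_yeah": bool(words) and words[0] == "yeah",
        "after_proposal": previous_act == "PROPOSAL",
    }


def classify(features: Dict[str, bool]) -> str:
    """Toy decision rule standing in for a trained classifier."""
    if features["starts_with_yeah"] and features["after_proposal"]:
        return "AGREEMENT"
    if features["has_please"] or features["has_would_you"]:
        return "REQUEST"
    return "OTHER"


print(classify(cue_features("Would you book that flight, please?", None)))
print(classify(cue_features("Yeah, that works.", "PROPOSAL")))
```

In a trained system the same feature dictionary would be passed to a learned model, so the weight of each cue comes from the data instead of from hand-written rules.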

• Decision Tree Models
  – Shriberg et al. (1998)
  – Used decision tree models trained to differentiate statements, yes-no questions, wh-questions, and declarative questions.

• HMM Models
  – Woszczyna and Waibel (1994)
  – Build Markov models of speech act probabilities.
    • Similar to n-gram models, use Bayes’ Rule
    • $D^* = \arg\max_{D} P(D \mid C)$
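By Bayes' Rule, maximizing P(D|C) over D is the same as maximizing P(C|D) P(D), because P(C) does not depend on D. The sketch below assumes D is a sequence of dialogue acts, C is one observed cue per utterance, P(D) is a first-order Markov chain, and P(C|D) factors into per-utterance emission probabilities. Under those assumptions the best sequence can be found with Viterbi decoding. All probabilities and labels are invented toy values.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence under a first-order HMM."""
    # Each column maps a state to (log-probability of the best path ending there, backpointer).
    best = [{
        s: (math.log(start_p[s]) + math.log(emit_p[s][observations[0]]), None)
        for s in states
    }]
    for obs in observations[1:]:
        column = {}
        for s in states:
            score, prev = max(
                (best[-1][p][0] + math.log(trans_p[p][s]), p) for p in states
            )
            column[s] = (score + math.log(emit_p[s][obs]), prev)
        best.append(column)

    # Follow backpointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    return list(reversed(path))


# Toy model: two dialogue acts, two cue words (all numbers are made up).
states = ["REQUEST", "AGREEMENT"]
start_p = {"REQUEST": 0.7, "AGREEMENT": 0.3}
trans_p = {
    "REQUEST": {"REQUEST": 0.2, "AGREEMENT": 0.8},
    "AGREEMENT": {"REQUEST": 0.6, "AGREEMENT": 0.4},
}
emit_p = {
    "REQUEST": {"please": 0.9, "yeah": 0.1},
    "AGREEMENT": {"please": 0.1, "yeah": 0.9},
}

print(viterbi(["please", "yeah"], states, start_p, trans_p, emit_p))
```

Log-probabilities are used to avoid numerical underflow on longer dialogues; the toy tables contain no zero entries, so taking logs is safe here.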

• Advantages of cue-based models
  – The data-driven approach is less time-consuming.
  – Machine learning, together with the availability of large corpora and modern computing power, makes such methods highly efficient.

• Disadvantages
  – Not as sophisticated and accurate as the plan inference approach.


Evolution of Conversation Agents

• ELIZA
  – Weizenbaum (1966)
  – Simple dialogue manager
  – Matches the previous sentence against a set of conditions

• PARRY
  – Colby et al. (1971)
  – Paranoid agent with emotional states and delusions
    • Emotions included anger, fear, etc.

• BDI Model
  – Cohen and Perrault (1979)
  – Still prevalent due to high accuracy

• Machine Learning
  – 1990s to present
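The ELIZA approach of matching the previous sentence against a set of conditions can be illustrated with a few regular-expression rules. This is a minimal sketch in that spirit, not Weizenbaum's script: the patterns, response templates, and function name are invented for illustration, and the pronoun swapping used by the original program is omitted.

```python
import re

# Each rule pairs a condition (a pattern) with a response template.
RULES = [
    (re.compile(r"\bI need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]
DEFAULT = "Please go on."


def respond(sentence: str) -> str:
    """Answer using the first rule whose pattern matches the user's sentence."""
    text = sentence.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(*match.groups())
    return DEFAULT


print(respond("I need a vacation."))
print(respond("The weather is nice."))
```

The dialogue manager keeps no state beyond the last user sentence, which is what makes this style of agent simple.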


Multimodal Agents

• REA
  – (Bickmore & Cassell, 2004)
  – Developed at the MIT Media Lab
  – Embodied agent
    • “Human” agents considered more trustworthy (Kiesler & Sproull, 1997).
  – Designed to be a real estate agent
  – Rule-based system


Multimodal Agents

• COMIC
  – (Foster & Oberlander, 2006)
  – Animated embodied agents
  – Use machine learning algorithms to build agent models
  – Models trained on a corpus of video recordings of conversations.
  – Models consider speech, facial expressions, body language, and discussion context.


References

• Allen, J. (1995). Natural Language Understanding. Benjamin Cummings, Menlo Park, CA.

• Bickmore, T. & Cassell, J. (2004). Social Dialogue with Embodied Conversational Agents. In J. van Kuppevelt, L. Dybkjaer & N. Bernsen (Eds.), Natural, Intelligent and Effective Interaction with Multimodal Dialogue Systems. New York: Kluwer Academic.

• Colby, K. M., Weber, S., & Hilf, F. D. (1971). Artificial Paranoia. Artificial Intelligence, 2(1), 1-25.

• Foster, M. E. & Oberlander, J. (2006). Data-driven Generation of Emphatic Facial Displays. Proceedings of EACL 2006.

• Kiesler, S., & Sproull, L. (1997). 'Social' Human-Computer Interaction. In B. Friedman (Ed.), Human Values and the Design of Computer Technology (pp. 191-199). Stanford, CA: CSLI Publications.

• Shriberg, E., Bates, R., et al. (1998). Can Prosody Aid the Automatic Classification of Dialog Acts in Conversational Speech? Language and Speech, 41(3-4), 439-487.

• Weizenbaum, J. (1966). ELIZA – A Computer Program for the Study of Natural Language Communication Between Man and Machine. Communications of the ACM, 9(1), 36-45.

• Woszczyna, M. & Waibel, A. (1994). Inferring Linguistic Structure in Spoken Language. ICSLP-94, 847-850.