
Understanding the User in Socialbot Conversations

Mari Ostendorf & the Sounding Board Team

University of Washington

The Sounding Board Team

Students: Hao Fang (EE), Hao Cheng (EE), Elizabeth Clark (CSE), Ari Holtzman (CSE), Maarten Sap (CSE)

Faculty advisors: Mari Ostendorf (EE), Yejin Choi (CSE), Noah Smith (CSE)

Teams of university students try to build a socialbot that
• converses coherently and engagingly with people
• on popular topics and current events.

10M conversations with real users + new type of conversational AI → many new research problems

This talk: understanding the user includes user modeling

Types of Conversational AI Systems

[Figure: systems placed on two axes, "Accomplish Tasks" (– to +) and "Social Conversation" (– to +)]

o Chat Bot: chitchat; limited content to talk about
o Virtual Assistant: execute commands, answer questions; limited social back and forth
o Socialbot: 2-way social & information exchange

Roadmap
o The socialbot as a conversational gateway
o Sounding Board system overview
o Characteristics of real users
o User modeling – first steps
o Take-aways & open issues

The Socialbot as a Conversational Gateway

A Perspective on Socialbots
o A socialbot facilitates evolving user goals & interests
o Users (should) know they are talking to a bot
o Broad applications
  o Education: language learning, tutoring systems
  o Information exploration, interactive help & recommendations
  o Exercise/therapy coach, companion

Sounding Board: A Conversational Gateway to Online Content

Sounding Board

Issues Vary for Different Paradigms

Conversational AI system component | Assistant | Socialbot
Speech/language understanding | Task intents, form filling | Social & info intents
Dialog management | Narrow options & execute tasks; reward = timely task completion | Learn about interests & make suggestions; reward = user satisfaction
Back-end application | Structured database | Unstructured information
Response generation | Constrained domains | Open domain

Sounding Board – System Overview

o Design philosophy
o Brief system overview
o Evaluation

Early Stage Challenges
o Software:
  o No experience with Alexa skill kits; built-in tools are more for speech-enabling an existing app
  o No existing dialog system to build on
o Data:
  o Task is open domain & users want current content → there was no good existing data for end-to-end training
  o Our initial system was sufficiently bad that we didn’t want to learn from early user conversations with it

What Makes Someone a Good Conversationalist?

o Have something interesting to say

o Show interest in what your partner says

These principles apply to a socialbot

Have something interesting to say
o Users react positively to learning something new
o ... and negatively to old or unpleasant news

SpaceX sends beer ingredients to International Space Station just in time for Christmas

Man Given 'Options' Before Cutting Dog's Head Off, Ga. Sheriff Says

Babies as young as 10 months can assess how much someone values a particular goal by observing how hard they are willing to work to achieve it …

Show interest in what the user says
o Users lose interest when they get too much content that they don’t care about
o Users like acknowledgment of their reactions & requests
o Some users need encouragement to express opinions

…but it can be annoying: “This article mentioned Google. Have you heard of Google?”

Design Philosophy
o Content-driven
  o Daily content mining, large & dynamic content collection
  o Knowledge graph
  o Dialog management (DM) that promotes popular content, diverse sources (styles)
o User-centric
  o Language understanding that detects user sentiment
  o DM that tries to learn user personality, handles rapid topic changes, tracks engagement, ….
  o Language generation with prosody-appropriate grounding

Multi-dimensional NLU Representation

o Commands: “Tell me a joke.”
o Questions: “What is your favorite color?”
o User Reactions: “That’s really interesting!”
o Topics: “Let’s talk about technology.”
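One way to hold these dimensions together is a single frame with one slot per dimension, so that one utterance can carry, for example, both a reaction and a topic request. The sketch below is illustrative only: `NLUFrame`, `analyze`, and the keyword rules are hypothetical names and heuristics, not Sounding Board's actual language understanding, and real ASR output would not carry the punctuation used here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NLUFrame:
    """Hypothetical container with one slot per NLU dimension."""
    command: Optional[str] = None
    question: Optional[str] = None
    reaction: Optional[str] = None
    topics: List[str] = field(default_factory=list)


def analyze(utterance: str) -> NLUFrame:
    """Fill the frame with toy keyword rules (assumed, for illustration)."""
    text = utterance.lower().strip().replace("\u2019", "'")
    frame = NLUFrame()
    if text.startswith("tell me a joke"):
        frame.command = "tell_joke"
    if text.endswith("?"):
        frame.question = text
    if "interesting" in text or "cool" in text:
        frame.reaction = "positive"
    prefix = "let's talk about "
    if prefix in text:
        frame.topics.append(text.split(prefix, 1)[1].rstrip(".!?"))
    return frame
```

For instance, `analyze("Cool, let's talk about cats.")` fills both the reaction slot and the topic slot, which a single-label intent classifier could not express.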

Hierarchical Dialog Management
o Master (Global)
  o Rank topics, miniskills, content
  o Consider: topic coherence, user engagement, content availability
o Miniskills (Local)
  o greeting / goodbye / menu
  o probe user personality
  o discuss a news article / movie
  o tell a fact / thought / advice / joke

[Figure labels: Negotiation, Thought, Fact, Movie]
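The master's ranking step can be pictured as scoring each candidate miniskill on the listed criteria and picking the best one that has content available. The following is a minimal sketch under stated assumptions: the `Candidate` fields, the weights, and the fallback to a "menu" miniskill are all hypothetical choices for illustration, not the system's actual policy.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    miniskill: str      # e.g. "fact", "movie", "news" (illustrative labels)
    topic: str
    coherence: float    # relatedness to the current topic, in [0, 1]
    engagement: float   # estimated user engagement, in [0, 1]
    has_content: bool   # whether content is available for this topic


def master_select(candidates: List[Candidate],
                  weights: Tuple[float, float] = (0.4, 0.6)) -> str:
    """Pick the miniskill with the highest weighted score (assumed weights)."""
    available = [c for c in candidates if c.has_content]
    if not available:
        return "menu"  # assumed fallback when nothing can be proposed
    w_coherence, w_engagement = weights
    best = max(available,
               key=lambda c: w_coherence * c.coherence + w_engagement * c.engagement)
    return best.miniskill
```

Treating content availability as a hard filter rather than a weighted term is a design choice of this sketch; a soft penalty would be an equally plausible reading of the slide.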

From Speech Acts to Natural Language

Speech Acts → Phrase Generation → Prosody Adjustment → Response

Speech act | Generated phrase
GROUNDING | I’m glad you like it!
INFORMNEWSTITLE | I read this article from yesterday. UT Austin and Google AI use machine learning ….
REQUESTINPUT | Have you read this news?
INSTRUCTSKIP | You can say “next” to talk about other news.

News title: UT Austin and Google AI use machine learning on data from NASA's Kepler Space Telescope to discover an eighth planet circling a distant star.
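The phrase-generation step can be sketched as a lookup from speech acts to templates, with the phrases concatenated into one response. This is only an illustration: the `TEMPLATES` table, the slot names `day` and `title`, and `generate_response` are assumptions, and the prosody-adjustment step is left out because the slides do not specify how it is done.

```python
# Hypothetical templates keyed by the speech-act labels shown on the slide.
TEMPLATES = {
    "GROUNDING": "I'm glad you like it!",
    "INFORMNEWSTITLE": "I read this article from {day}. {title}",
    "REQUESTINPUT": "Have you read this news?",
    "INSTRUCTSKIP": 'You can say "next" to talk about other news.',
}


def generate_response(speech_acts, **slots):
    """Render each speech act with its template and join the phrases."""
    phrases = [TEMPLATES[act].format(**slots) for act in speech_acts]
    return " ".join(phrases)
```

Calling `generate_response(["GROUNDING", "INFORMNEWSTITLE", "REQUESTINPUT"], day="yesterday", title="...")` yields a grounding phrase, the news introduction, and a question in sequence.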

Content Management
o Crawl online content
o Filter inappropriate & depressing content
o Index interesting & uplifting content
  o noun phrases, entities, meta-info
o Knowledge graph
  o daily updated
  o 80K entries, 300K topics

Knowledge Graph

[Figure: content items linked through shared topic and entity nodes: science, astronomy, NASA, AI, Google, UT Austin]
o UT Austin and Google AI use machine learning on data from NASA's Kepler Space Telescope … planet … distant star.
o How does NASA organize a party? They plan-et!
o Artificial intelligence in 2017 still can't truly understand humans
o NASA … android device ... Google … Android device manager …
o Janis Joplin was … fraternity brothers at UT Austin …
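Linking content through shared entities can be approximated with an inverted index from entity to content items, so that a topic mentioned by the user retrieves every item that touches it. This is a simplified sketch, not the actual knowledge graph: `build_index` and the item fields `text` and `entities` are hypothetical.

```python
from collections import defaultdict


def build_index(items):
    """Map each lower-cased entity to the texts of the items that mention it."""
    index = defaultdict(list)
    for item in items:
        for entity in item["entities"]:
            index[entity.lower()].append(item["text"])
    return index


# Toy items loosely based on the examples above (entity lists are assumed).
ITEMS = [
    {"text": "UT Austin and Google AI use machine learning on Kepler data",
     "entities": ["UT Austin", "Google", "AI", "NASA"]},
    {"text": "How does NASA organize a party? They plan-et!",
     "entities": ["NASA"]},
]
INDEX = build_index(ITEMS)
```

Looking up `INDEX["nasa"]` returns both the news item and the joke, which is what lets the dialog manager move between content types while staying on topic.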

Evaluation – Alexa Prize Protocol
o Users (judges) rate conversations on a 1-5 scale
o Duration of conversation (tie breaker)

Period | Rated by | Rating | Duration
Finals (mid Nov 2017) | Judges | 3.17 | 10:22
Post-finals (Nov 24 – Dec 24) | Alexa owners | 3.65 | 4:30

Diagnostic Evaluation
o User ratings are expensive, noisy & sparse
  o user ratings have a very high variance (3.65 ± 1.40)
  o users may decline to rate the system (43% are rated)
  o conversations can have good and bad sections
o Users provide more information than is available from the final conversation rating & length
  o topic proposal, topic acceptance & rejection
  o reaction to content

Local Rewards: Combine Ratings & Responses


Caveats & Constraints
o ASR is imperfect
  User: cause does that you’re gonna state that’s cool
  Bot: I’m happy you liked that.
o No sentence segmentation or pause information
  User: How about robots what’s up with robots?
  Bot: Sorry. I'm actually not very familiar with robots what's up with robots.
o Audio information is not available (i.e. no prosodic cues)
  Bot: Did you know that ….?
  User: Yeah I did not.
  Bot: Wow! You must be very smart! ….

Characteristics of Real Users (from 10M Sounding Board Conversations)

o Observations
o Personality
o Implications

Observations: Users Vary
o Different interests, opinions on issues, sense of humor
o Interaction styles: terse vs. verbose, politeness, …
o Different goals: information seeking, opinion sharing, getting to know each other, adversarial
o Different ages

Content Preferences Vary

Bot: Did you know that Malaysian vampires are tiny monsters that burrow into people's heads and force them to talk about cats?

User reactions:
o Amused: “That’s creepy.” / “Oh you are so funny.” / “Oh my god that’s funny.”
o Not amused: “That’s not true.” / “Oh gods are you have to hear this.” / “What the heck.”
o Cat lover: “Cats are my favorite animals.” / “Let’s talk about cats.”
o Not really listening?: “Cool.” / “Wow that’s interesting.” / “That’s awesome.”

Interaction Styles

Terse User
➟ Cool ➟ One excuses enough ➟ No thanks ➟ No ➟ No ➟ Sure ➟ Yeah ➟ Yeah ➟ No I didn’t know that ➟ No ➟ Yeah ➟ No
➟ No ➟ Yes ➟ Cool ➟ Yeah that’s cool ➟ No I didn’t ➟ No ➟ Yes ➟ No I didn’t know that ➟ No ➟ Yes ➟ Cool ➟ ...

Talkative User
➟ So you’re saying mean is a type 2 diabetic art is junk food
➟ No I’d rather go back to talking about vampires
➟ I love vampires the something that I really loved do you know and rice the author
➟ it is weird i don't understand
➟ Yes I would
➟ i don't know that's an interesting question and is it really true that garlic keeps vampires the wedding and what i
➟ what are they have their long fingernails for
➟ i think that that's probably true but i think it vampires are evil and they don't care about sustaining things for human be-...

Different User Goals
o Information seeking
  “Tell me more about that. What is that? Why?”
  “i don't know that's an interesting question and is it really true that garlic keeps vampires the wedding and what i”
o Opinion sharing
  “i think that that's probably true but i think it vampires are evil and they don't care about sustaining things for human be-”
o Getting to know each other
  “What’s your favorite video game?”
  “My favorite video game is minecraft”

Adversarial User Examples
o Taboo topics
  User: Alexa, are you naked?
  Bot: I’m a chatbot.
o Offensive language
  User: @#%*#*!!.
  Bot: I heard you mention unicorn. I'll do some research on that. How about we talk about some news about artificial intelligence?
o Venting
  User: You suck.
  Bot: Unfortunately, I’m not sure I understood what you said.

ASR is more challenging with children ….
  User: yeah can i get my butt hampshire suspense are there was a cough sure stop
  Bot: No problem. Let me think. How about we chat about…

Often “can you repeat that?” isn’t going to yield a much better result.

And more content filtering is needed…
  User: Let’s talk about Santa Claus!
  Bot: You know what I realized the other day? Santa Claus is the most elaborate lie ever told.

User Personality
o User-centric topic suggestions
o Five-factor model (Costa & McCrae, 1992)
  o E.g., “Do you talk a lot?”
o Helps us understand how users interact with Sounding Board

https://www.verywellmind.com/the-big-five-personality-dimensions-2795422

Trends for Personality Types
o Personality correlates with user ratings
  o Extroverted, agreeable, open -> higher ratings (p<<.001)
o Topics brought up by users
  o Introverts (AI, food, cats), extroverts (news, fashion)
  o Open & imaginative (AI, time travel, aliens)
  o Low conscientiousness (pokemon, video games, minecraft)

Implications
o Spoken Language Understanding: age & dialect impact ASR; verbosity impacts NLP
o Dialogue Management: content ranking (sources, topic, entities); error handling, follow-up strategy
o Content Management: flag rather than filter controversial content; multi-dimensional content index (e.g. ratings of user types)
o Language Generation: politeness, repetition rephrasing

User Modeling – First Steps

o Content ranking problem
o User embedding model

Content Ranking Problem
o Predict user engagement with proposed content
o Content can be characterized based on:
  o Information source, broad topic, entities
  o Sentence embeddings
o User engagement (subdialog reward) characterized based on
  o User suggested topics
  o User accepted/rejected topics
  o User pos/neg reactions to the content
  o User reaction to the bot

Types of Features to Use
o User-independent info
  o Relatedness to current topic (depending on engagement)
  o General popularity in dialogs with other users
o User-specific features
  o Engagement with related topics/sources earlier in the dialog
  o Age/personality factors reflected in language use

User-topic engagement data is sparse. User embeddings enable learning from similar users.
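To make the combination of user-independent and user-specific signals concrete, the sketch below scores each candidate by mixing a general popularity value with the affinity between a user embedding and a content embedding. Everything here is an assumption for illustration: the dot-product affinity, the mixing `weight`, and the names `engagement_score` and `rank_content` do not come from the talk.

```python
def engagement_score(user_vec, content_vec, popularity, weight=0.5):
    """Blend general popularity with user-content affinity (dot product)."""
    affinity = sum(u * c for u, c in zip(user_vec, content_vec))
    return weight * popularity + (1.0 - weight) * affinity


def rank_content(user_vec, candidates, weight=0.5):
    """Sort candidate content from most to least promising for this user."""
    return sorted(
        candidates,
        key=lambda c: engagement_score(user_vec, c["vec"], c["popularity"], weight),
        reverse=True,
    )


# Toy candidates with made-up embeddings and popularity values.
CANDIDATES = [
    {"name": "cats", "vec": [1.0, 0.0], "popularity": 0.2},
    {"name": "news", "vec": [0.0, 1.0], "popularity": 0.9},
]
```

With a zero user vector, as for a brand-new user, the ranking falls back to popularity alone; once the embedding carries some signal, less popular content such as "cats" can move to the top for the right user.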

Predicting User Ratings of Conversations
o Task: predict the conversation-level user rating using linear regression with features that characterize
  o Topic-initiation strategy and topic coherence
  o Agent dialog acts & language use
  o User characteristics (verbosity) & language use
o Finding: the best performance is obtained with user characteristics alone

Features | Score
Topic | .198
Agent | .256
User | .301
All | .295
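As a reminder of what the regression setup involves, here is a minimal least-squares fit with a single feature. It is a toy illustration only: the feature (mean words per user turn as a stand-in for verbosity), the function names, and any data passed in are assumptions, and the actual experiments used many features per group.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_rating(verbosity, slope, intercept):
    """Predict a rating, clipped to the 1-5 scale used by the judges."""
    return min(5.0, max(1.0, intercept + slope * verbosity))
```

Clipping the prediction to the 1-5 range is a convenience of this sketch; plain linear regression does not constrain its outputs.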

Conversational Style – User Vectors

User bag of words* → LDA → vector of “topic” probabilities

10 LDA clusters – frequent words reflect:
• People interacting in specific modes [jokes, music, quiz]
• Politeness (would_like, can_I)
• Interest in Alexa (what_is, your, favorite)
• Positive engagement (cool, funny, interesting)
• Self-oriented user (I_think, I_like, I_am)
• Interest in video games

* And frequent bigrams
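The input to LDA here is one bag of words per user, built from everything that user said. A minimal sketch of that step follows, with bigrams joined by an underscore as in the examples above. The function name and the whitespace tokenization are assumptions, and pruning to frequent bigrams only is omitted for brevity.

```python
from collections import Counter


def user_bag_of_words(utterances):
    """Count unigrams and underscore-joined bigrams over a user's turns."""
    counts = Counter()
    for utterance in utterances:
        tokens = utterance.lower().split()
        counts.update(tokens)
        counts.update("_".join(pair) for pair in zip(tokens, tokens[1:]))
    return counts
```

For a user who said "I think cats are cool" and "I think so", the bigram `i_think` is counted twice, which is the kind of signal behind the "self-oriented user" cluster.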

Towards a better unsupervised BOW model

o Is perplexity the right objective for learning user vectors?
  o Need tricks to make it work, e.g. drop frequent words (yes, no, yeah, ….)
o A better objective: user re-identification

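\[
\sum_{i,j} \Big[\, 1 + \underbrace{d\big(u_i^{(1)}, u_i^{(2)}\big)}_{\text{distance to self}} - \underbrace{d\big(u_i^{(1)}, u_j^{(1)}\big)}_{\text{distance to others}} \,\Big]_+
\]

Here $u_i^{(1)}$ and $u_i^{(2)}$ are embeddings of two samples of text from user $i$, $u_j^{(1)}$ is an embedding for a different user $j$, $d$ is a distance, and $[x]_+ = \max(0, x)$.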



Cats are my favorite animals.

Let’s talk about cats.

Alexa, what’s your favorite singer?
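The re-identification idea, that two samples from the same user should be closer to each other than to a sample from any other user, can be written as a margin-based hinge loss. The sketch below is an illustration under stated assumptions: cosine distance is chosen here for concreteness, and `reid_loss`, its arguments, and the margin of 1 are hypothetical rather than the authors' implementation.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def reid_loss(first, second, margin=1.0):
    """Hinge loss over user pairs.

    first[i] and second[i] are embeddings of two text samples from user i.
    """
    total = 0.0
    for i in range(len(first)):
        d_self = cosine_distance(first[i], second[i])
        for j in range(len(first)):
            if j == i:
                continue
            d_other = cosine_distance(first[i], first[j])
            total += max(0.0, margin + d_self - d_other)
    return total
```

The loss is zero when every user's two samples coincide and different users are at least the margin apart, and it grows when a user's own samples drift apart or collide with another user's.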

Experiments on Twitter Users
o Task: given a small set of example users, find other users with similar interests
o Learn embeddings from user tweets
o Experiments with 16 groups, find match out of 43k

Results (Jaech et al., NAACL 2018)

Model | MRR
word2vec | 846
LDA | 501
Re-ID | 24
W2V init → Re-ID | 12

Evaluating User Embeddings
o Can we do a preliminary assessment of the user embeddings without full system implementation & user testing?
o Plans for using existing data:
  o Content engagement prediction accuracy
  o Conversation-level rating prediction
  o User response generation with context-aware language model
o Work in progress…

Sounding Board – Summary
o The socialbot as a conversational gateway:
  o Facilitate evolving user goals & interests
  o Learn new facts, explore information, share opinions
o Critical system components
  o Tracking user intent and engagement
  o Managing dynamic content (social chat knowledge)
o 10M conversations with real users + new type of conversational AI → many new research problems

User-Related Socialbot Take Aways
o User-driven information exploration brings out user variation
o Understanding the user includes:
  o what they just said (intent, sentiment)
  o who they are (interests, interaction style)
o User modeling has implications for all dialog system components (& evaluation)

Some Open Issues
o User-dependent reward function
o Dialog policy learning with user embeddings
o User response generation with context-aware language model for a user simulator for reinforcement learning

Thank You

Prosody – What’s that?
o It’s not what you say, but how you say it
o Intonation, pausing, duration lengthening… (attributes of the acoustic signal)
o Which communicate
  o User intent, sentiment, sarcasm, …
  o Socialbot empathy, enthusiasm, topic change,…

Date post:	03-Jul-2020