Embodied Conversational Agents: A Case Study of Freudbot Bob Heller, PhD Athabasca University...


Embodied Conversational Agents: A Case Study of Freudbot

Bob Heller, PhD
Athabasca University
November 3, 2004

Acknowledgements

Mike Proctor – AIML programmer

Dean Mah – Web implementation

Billy Cheung – Graphics, test chatter

Lisa Jewell – Chat log analysis, content developer, test chatter

Julianna Charchun – Chat log analysis

Jude Onuh – AIML programmer

Embodied Conversational Agents

Definitions

• Embodiment in Conversational Interfaces: REA (Cassell et al., 1999)

• Embodied Conversational Agents (Cassell, Sullivan, Prevost, & Churchill, 2000)
– FMTB model

Vos (2002) offers 5 features of an ECA
– Human-like appearance
– Body used for communication purposes
– Natural communication protocols
– Multimodality
– Social role

Anthropomorphic Agents

Animated Interface Agents

Animated Pedagogical agents

Pedagogical Agent Persona

Intelligent Tutoring Systems - AutoTutor (Graesser et al.)
http://www.autotutor.org/index.htm

Chatterbots or Chatbots

- Weizenbaum’s (1966) Eliza

- primacy of conversation

- Constructivist theory

- The Media Equation

- Persona effect

- cognitive load

Richard Wallace and A.L.I.C.E.

• Artificial Linguistic Internet Computer Entity
http://alicebot.org/

• 3-time winner of the Loebner Contest (the holy grail for chatbots)
http://www.loebner.net/

• AIML – Artificial Intelligence Markup Language
http://www.aimlbots.com/

• PandoraBots
http://www.pandorabots.com

‘Theory’ behind ALICE
- pattern matching
- Zipf distribution
- Iterative
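The pattern-matching idea behind AIML can be illustrated with a minimal Python sketch. This is not ALICE's actual implementation (real AIML interpreters store patterns in a tree and apply their own priority rules); the category list, the `respond` function, and the first-match ordering are all assumptions made for illustration. Each category pairs an upper-case pattern, where `*` stands for one or more words, with a reply template.

```python
import re

# Hypothetical categories in the spirit of AIML pattern/template pairs.
# "*" matches one or more words; the last entry is a catch-all default.
CATEGORIES = [
    ("WHAT IS THE *", "Let me tell you about the {star}."),
    ("WHO ARE YOU", "I am a small demonstration bot."),
    ("*", "Please tell me more."),
]

def normalize(text):
    """Upper-case the input and replace punctuation with spaces."""
    return " ".join(re.sub(r"[^A-Za-z0-9 ]", " ", text).upper().split())

def to_regex(pattern):
    """Turn a wildcard pattern into an anchored regular expression."""
    parts = [r"(.+)" if word == "*" else re.escape(word)
             for word in pattern.split()]
    return re.compile("^" + " ".join(parts) + "$")

COMPILED = [(to_regex(pattern), template) for pattern, template in CATEGORIES]

def respond(user_input):
    """Return the template of the first category whose pattern matches."""
    text = normalize(user_input)
    for regex, template in COMPILED:
        match = regex.match(text)
        if match:
            # Text captured by the wildcard, if the pattern has one.
            star = match.group(1).lower() if match.groups() else ""
            return template.format(star=star)
    return ""  # only reached for empty input

print(respond("What is the unconscious?"))
print(respond("hello"))
```

Because a small number of inputs tends to account for a large share of what users type, a botmaster working this way can add categories for the most frequent unmatched inputs first and repeat the process as new chat logs come in.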

Freudbot 1

Why Freud?

• Initial plan of deployment

• The famous personality application
– Emile http://www.hud.ac.uk/hhs/research/emile/emileframeset.htm
– Shakespeare http://www.pandorabots.com/pandora/talk?botid=c6937cfb3e354738
– Hans Christian Andersen http://www.niceproject.com/about/
– John Lennon

Freudbot 1

Developing the AIML

• Narrative structure

• Test chatters

• How much ALICE?

Research Questions

• Is it worth it?

• Is ‘chattiness’ related to the subjective evaluation of chat experience?

• Are there individual difference variables that are related to measures of chat performance/experience?

Freudbot 1

Freudbot 1: Methodology

• Online Recruitment
– restricted to psychology students
– Incentive (1/30 chance at $300)

• Random assignment to bot type

• Controlled Chat
– automatically directed to questionnaire after 10 mins of chat

Freudbot 1: Participants (N=67)

                                        n     Percent
Gender                    Men           12    18%
                          Women         55    82%

Age Distribution          18-22         6     9%
                          23-27         15    22%
                          28-32         11    16%
                          33-37         7     10%
                          38-42         15    22%
                          42+           13    19%

Student Status            Full-time     27    40%
                          Part-time     35    52%
                          Non-student   5     8%

Self-rated academic       Below avg     0     0%
ability                   Average       13    19%
                          Above avg     39    58%
                          Excellent     15    22%

Is it worth it?

• self-report data*

                          Would you chat again?
              Mean        Yes (n=30)    No (n=35)
Useful        2.2         2.7           1.8
Recommend     2.4         3.4           1.6
Overall       2.4         3.2           1.8

Enjoyable     2.6         3.4           1.9
Engaging      2.7         3.4           2.1
Memorable     2.8         3.6           2.2

Expansion     3.4         4.1           2.8

* 5-point scale

Best Features
Interactivity                             16
Able to ask questions with answers        16
Learning about Freud & theories           13
Simplicity/ease of use                    5
Entertaining/humorous                     5
Thought provoking                         5
No good features                          5
Technological features of Freudbot        4
Potential to Freudbot                     4
Alternative learning style                3
Novelty/uniqueness of Freudbot            3
Tricking Freudbot                         2
Unpredictable                             2

Worst Features
Repetition                                33
Unable to answer questions                23
Conversation did not flow                 12
Limited knowledge base                    10
User needed prior knowledge               3
User was uncertain about what to do       3
Not an effective learning tool            3
Conversation was too short                1
No sound                                  1

Is it worth it?

• Chat logs

                                                      Mean    Range
Number of Exchanges                                   31.0    5-82

Proportion of on-task responses by participant*       .60
    questions                                         .37
    comments*                                         .23

Proportion of repetitions by Freudbot                 .25
Proportion of nonsensical responses by Freudbot       .39

* correlated with a composite measure of self-rated chat experience
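As a rough illustration of how proportions like these can be derived from coded chat logs, the Python sketch below tallies hand-assigned codes per speaker. The code labels (`question`, `comment`, `off_task`, `appropriate`, `repetition`, `nonsensical`) and the sample log are hypothetical; the coding scheme actually used in the study is not described here. The on-task proportion is taken as questions plus comments, which matches the slide (.37 + .23 = .60).

```python
from collections import Counter

# Hypothetical hand-coded log: one (speaker, code) pair per turn.
coded_turns = [
    ("participant", "question"),
    ("bot", "appropriate"),
    ("participant", "comment"),
    ("bot", "repetition"),
    ("participant", "off_task"),
    ("bot", "nonsensical"),
    ("participant", "question"),
    ("bot", "appropriate"),
]

def proportions(turns, speaker):
    """Share of each code among the turns of one speaker."""
    codes = [code for who, code in turns if who == speaker]
    total = len(codes)
    if total == 0:
        return {}
    return {code: n / total for code, n in Counter(codes).items()}

participant = proportions(coded_turns, "participant")
bot = proportions(coded_turns, "bot")

# On-task responses are the participant's questions plus comments.
on_task = participant.get("question", 0) + participant.get("comment", 0)

print(f"on-task: {on_task:.2f}")
print(f"bot repetitions: {bot.get('repetition', 0):.2f}")
```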

Chattiness?

                      FreudAlice    JustFreud
                      n=35          n=32
Useful                2.2           2.3
Recommend             2.5           2.4
Overall               2.5           2.4
Enjoyable             2.7           2.6
Memorable             3.0           2.7
Engaging              2.8           2.7
Expansion             3.3           3.5

# of Exchanges        32.2          29.7
On task Response*     .56           .64

* significant difference between groups

Individual difference variables?

• demographic
– Gender
– Age
– Student status*
– Self-rated academic ability

• computer experience & self-rated skill

• academic background
– # of university courses
– # of distance ed courses*
– # of psychology courses
– Rated importance of Freud*

Individual difference variables?

• attitudes towards technology and education
– Positive aspects of on-line activities
– Independent Learner
– negative aspects of on-line activities*

Freudbot 1 Summary

• Is it worth it?
– worth another look

• Is ‘chattiness’ related to the subjective evaluation of chat experience?
– ‘Chattiness’ is not the right level
– Nass and Reeves (1998)

• Are there individual difference variables that are related to measures of chat performance/experience?
– some relations that make sense and others that don’t

Freudbot 2

Research Goals

1. Improve Performance

• Fix repetition problem

• Topic tags

• More content

2. Replication

3. Instructional Set

4. Future Development
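One common way to reduce chatbot repetition is to keep several alternative replies per topic and avoid reusing one until all have been given. The Python sketch below illustrates that idea only; it is not the fix the Freudbot team applied, and the class name, topic key, and sample replies are hypothetical. AIML itself offers a `<random>` element for choosing among alternative replies, which does not by itself prevent repeats.

```python
import random

class NonRepeatingResponder:
    """Cycle through alternative replies so none repeats within a pass."""

    def __init__(self, alternatives, seed=None):
        self.alternatives = alternatives  # topic -> list of replies
        self.unused = {}                  # topic -> replies not yet given
        self.rng = random.Random(seed)

    def reply(self, topic):
        pool = self.unused.get(topic)
        if not pool:
            # First use, or every reply was given: start a new shuffled pass.
            pool = list(self.alternatives[topic])
            self.rng.shuffle(pool)
            self.unused[topic] = pool
        return pool.pop()

# Hypothetical content for illustration.
bot = NonRepeatingResponder({
    "dreams": [
        "Dreams are worth discussing at length.",
        "What did you dream about last night?",
        "Shall I say more about dream interpretation?",
    ],
})
for _ in range(3):
    print(bot.reply("dreams"))
```

A reply can still recur at the boundary between two passes; tracking the most recent reply per topic would close that gap at the cost of a little more state.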

Freudbot 2: Methodology

• online recruitment, incentive, & controlled chat identical to Freudbot 1

• random assignment to instructional set

• similar questionnaire with additional questions on applications and improvements

http://psych.athabascau.ca/html/Freudbot/test.html

Participants (N=55)

                                        n     Percent
Gender                    Men           10    18%
                          Women         45    82%

Age Distribution          18-22         7     13%
                          23-27         17    31%
                          28-32         7     13%
                          33-37         11    20%
                          38-42         6     11%
                          42+           7     13%

Student Status            Full-time     26    47%
                          Part-time     28    51%
                          Non-student   1     2%

Self-rated academic       0-50          0     4%
ability                   50-65         2     4%
                          66-79         11    20%
                          80-89         30    55%
                          90+           10    18%

Improvement?

• self-report data (5-point scale)

              Freudbot 1    Freudbot 2
Useful**      2.2           3.0
Recommend**   2.4           2.9
Overall**     2.4           3.0

Enjoyable     2.6           3.0
Engaging**    2.7           3.1
Memorable     2.8           3.1

Expansion**   3.4           4.1

** statistically significant

Would you chat again?

              Yes (n=37)    No (n=18)
Useful        3.3           2.4
Recommend     3.4           1.7
Overall       3.4           2.2

Enjoyable     3.3           2.3
Engaging      3.5           2.2
Memorable     3.6           2.1

Expansion     4.4           3.3

Improvement?

• Chat logs

                                                      Mean    Range
Number of Exchanges                                   28.4    3-115

Proportion of on-task responses by participant*       .90
    questions                                         .36
    comments                                          .48

Proportion of appropriate responses by Freud          .60

* correlated with a composite measure of self-rated chat experience

Replication?

• Demographic
– Gender*
– Age
– Student status*
– Self-rated academic ability

• computer experience

• academic background
– # of university courses
– # of distance ed courses
– # of psychology courses
– Rated importance of Freud*

Replication?

• attitudes towards technology and education
– Positive aspects of on-line activities
– Independent Learner
– negative aspects of on-line activities*

Instructional Set?

                      Brief Set     Elaborate Set
                      n=27          n=28
Useful                3.1           2.9
Recommend             2.8           2.9
Overall               2.9           3.1
Enjoyable             2.9           3.0
Memorable             3.2           3.0
Engaging              3.0           3.3
Expansion             3.9           4.2

# of Exchanges        25.3          31.3
On task Response      .90           .90

Future Development?

Freudbot Improvements
Chat behaviour        4.2
Audio Response        3.1
Voice Recognition     2.6
Synchronization       2.5
Animation/movement    2.3

* 5-point scale

Other Applications
Practice quizbot      4.1
Famous personality    4.1
Course content        3.4
Chatroom              3.3
Course Admin          3.2

Freudbot 2: Summary

1. Improvement
- yes, but clearly room for more

2. Replication
- some

3. Instructional Set
- no effects

4. Development

Future Direction

• Haptek Freud – Animacy/agency hypothesis
http://psych.athabascau.ca/html/Freudbot/haptek.html

• Piagetbot (Support from MCR) – learning outcomes

• Skinnerbot (Lyle Grant)

• Coursebot

• Quizbot

Questions?
