Human Learning in Dynamic Environments. Coty Gonzalez 2008 MEGR...

Human Learning in Dynamic Environments

Cleotilde (Coty) Gonzalez
Dynamic Decision Making Laboratory
www.cmu.edu/DDMLab
Social and Decision Sciences Department
Carnegie Mellon University

Research supported by the National Science Foundation:

Human and Social Dynamics: Decision, Risk, and Uncertainty

Dynamic Environments

• Combat missions, Production scheduling, Fire fighting, Emergency dispatch, Air-traffic control
• Complex
o Number of components: alternatives, events, courses of action, outcomes
o Uncertainty: All possible states of the world and outcomes are unavailable, incomplete, and difficult to imagine
o Constraints: limited time, knowledge, resources, human capacity
• Dynamic Complexity
o Arises from the interactions of components over time
o Environment is autonomous. Everything changes at many different time scales
o Learning from our actions: feedback delays

Dynamic Decision Making: A Closed-Loop view

[Figure: closed-loop view of medical diagnosis. Labels: Hypothesize illnesses and Symptoms, run tests, Test results, Diagnosis, Treatment, Health, External event. The links between steps are marked "delay".]

Learning in dynamic systems is hard

• People remain suboptimal in these systems even with repeated trials, unlimited time and performance incentives (Sterman, 1994; Diehl & Sterman, 1995).

• We have difficulty processing feedback. Feedback delay is a problem for learning (Brehmer, 1992; Sterman, 1989).
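As a rough illustration of why delayed feedback makes control hard, the sketch below has a hypothetical decision maker steer a stock toward a target with a simple proportional rule while seeing the stock only as it was some steps ago. It is not taken from the cited studies; the function names, the gain of 0.6, and the target of 100 are assumptions chosen for illustration.

```python
def simulate_control(delay, steps=40, target=100.0, gain=0.6):
    """Steer a stock toward `target` using observations that are `delay` steps old."""
    history = [0.0]  # stock level at each time step
    for t in range(steps):
        # The decision maker only sees the stock as it was `delay` steps ago.
        observed = history[max(0, t - delay)]
        adjustment = gain * (target - observed)
        history.append(history[-1] + adjustment)
    return history


def max_overshoot(history, target=100.0):
    """Largest amount by which the stock exceeded the target (0.0 if it never did)."""
    return max(0.0, max(history) - target)


for d in (0, 2, 4):
    run = simulate_control(delay=d)
    print(f"delay={d}: max overshoot={max_overshoot(run):.1f}")
```

With no delay the rule approaches the target without overshooting; with the same rule and a delay, the decision maker keeps correcting for a state that no longer exists and overshoots, more so as the delay grows.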

But… how do we learn in dynamic environments?

• Decision makers recognize typical situations and typical responses. Decision makers use their past knowledge and adapt their strategies “on the fly”.
o Chess studies, Expertise: Chase & Simon, 1973
o Adaptive Decision Making: Payne, Bettman, & Johnson, 1993
o Decision making under uncertainty: “Case-Based Decision Theory”, Gilboa and Schmeidler, 1995
o Theory of automaticity: Logan, 1988
o “Recognition-Primed Decision Making” (RPDM): Intuition, Mental simulations, Klein et al., 1993; Klein, 1998

Pattern recognition is easier if you have experience

Instance Based Learning Theory (Gonzalez, Lerch, & Lebiere, 2003)

• RECOGNITION OF FAMILIAR PATTERNS
o Determining the similarity between a situation and past experience
o Identifying ‘typical’ situations and responses
• ACQUIRING CAUSE-EFFECT KNOWLEDGE
o Accumulation of instances with practice in a task
o Improvement of decision making by bootstrapping on previous knowledge

Implemented in ACT-R (Anderson and Lebiere, 1998)

IBLT: WHAT do we learn?

[Figure: Situation-Decision-Outcome (S D O) instances accumulating along a time axis. Labels: Situation-Decision Cycle, Action-Outcome Cycle, Future Decisions, Similarity, Blending of past Outcomes, Outcomes, Feedback, Environment.]
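The idea of stored Situation-Decision-Outcome instances, similarity, and blending of past outcomes can be sketched roughly as follows. This is a simplified illustration, not the ACT-R implementation of IBLT: it assumes a situation is a single numeric cue, uses an exponential similarity function, and weights outcomes by similarity alone. The names `Instance`, `similarity`, `blended_outcome`, and `choose` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    situation: float  # assumed: a single numeric cue describing the situation
    decision: str
    outcome: float


def similarity(a: float, b: float, scale: float = 1.0) -> float:
    """Assumed similarity measure: 1.0 for identical cues, decaying with distance."""
    return math.exp(-abs(a - b) / scale)


def blended_outcome(memory, situation, decision):
    """Similarity-weighted average outcome of past instances of `decision`."""
    matches = [i for i in memory if i.decision == decision]
    weights = [similarity(i.situation, situation) for i in matches]
    total = sum(weights)
    if total == 0:
        return None  # no experience with this decision yet
    return sum(w * i.outcome for w, i in zip(weights, matches)) / total


def choose(memory, situation, decisions):
    """Pick the decision with the highest blended outcome; untried decisions score 0."""
    return max(decisions, key=lambda d: blended_outcome(memory, situation, d) or 0.0)


memory = [
    Instance(situation=1.0, decision="treat", outcome=10.0),
    Instance(situation=5.0, decision="treat", outcome=-4.0),
    Instance(situation=1.2, decision="wait", outcome=2.0),
    Instance(situation=5.1, decision="wait", outcome=6.0),
]
print(choose(memory, situation=1.1, decisions=["treat", "wait"]))
print(choose(memory, situation=5.0, decisions=["treat", "wait"]))
```

Because instances close to the current situation dominate the blend, the same memory recommends different decisions in different situations, and each new instance added to `memory` shifts later choices.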

IBLT: HOW do we learn?

ACT-R (Anderson & Lebiere, 1998)

The 2x2 levels of ACT-R

              Declarative Memory                               Procedural Memory
Symbolic      Chunks: declarative facts                        Productions: If (cond) Then (action)
Subsymbolic   Activation of chunks (likelihood of retrieval)   Conflict Resolution (likelihood of use)
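The subsymbolic cell "Activation of chunks (likelihood of retrieval)" can be illustrated with the standard ACT-R base-level learning and retrieval probability equations. The sketch below is a minimal illustration rather than the ACT-R software itself: the decay of 0.5 is the conventional default, while the threshold of 0.0 and noise of 0.25 are illustrative assumptions, and the function names are hypothetical.

```python
import math


def base_level_activation(ages, decay=0.5):
    """Base-level learning: ln of the summed, power-law-decayed past uses.

    `ages` are the times elapsed since each past use of the chunk (all > 0).
    """
    return math.log(sum(age ** -decay for age in ages))


def retrieval_probability(activation, threshold=0.0, noise=0.25):
    """Logistic chance that a chunk with this activation is retrieved."""
    return 1.0 / (1.0 + math.exp((threshold - activation) / noise))


practiced = base_level_activation([1, 5, 10, 20])  # used often and recently
stale = base_level_activation([200])               # used once, long ago
print(retrieval_probability(practiced), retrieval_probability(stale))
```

A chunk that has been used often and recently has higher activation and is far more likely to be retrieved than one used once long ago, which is how practice makes instances more available.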

IBLT models compare to human decision making:

• In dynamic resource allocation tasks (Gonzalez et al., 2003)

• In supply chain management control (Martin, Gonzalez & Lebiere, 2004)

• In repeated choice tasks (Lebiere, Gonzalez & Martin, 2007)

• But there is a long way to go to demonstrate the generalizability and utility of IBLT

Decision Making Games (DMGames) used for experimentation

• DMGames embody the essential characteristics of real-world decision environments
o Interactive
o Repeated and interrelated decisions
o External events and team interactions
• Help compress time and space – speed up learning
• Help manipulate experience - learn from simulated cases and on-demand repeated practice
• No risk to individuals and they are FUN.

DMGames used in behavioral research in the DDMLab

• Military Command and Control
• Real-time resource allocation
• Medical Diagnosis
• Supply-Chain Management
• Fire Fighting

MEDIC: Learning tools that represent the dynamics of medical diagnosis (Gonzalez & Vrbin, 2007)

• Concepts adapted from Kleinmuntz (1985):
o Task complexity (numerous diseases and symptoms)
o Disease base rates
o Time pressure
o Test diagnosticity
o Treatment effectiveness
o Treatment risk

• Additions:

o Feedback delays (e.g. receiving test results)

• With the potential for:

o Dynamic diagnostic cues

o Dynamic symptoms

MEDIC demo

Factors that influence Learning in dynamic systems

• Time constraints (Gonzalez, 2004)

• Workload (Gonzalez, 2005)

• The similarity and diversity of experiences (Gonzalez and Quesada, 2004; Gonzalez and Madhavan, in preparation)

• Our inherent cognitive abilities (Gonzalez, Thomas and Vanyukov, 2004)

• The type of feedback (Gonzalez, 2005)

• Our difficulty in understanding simple stock and flow structures (Cronin and Gonzalez, 2005; Cronin, Gonzalez and Sterman, 2006; Gonzalez, Sterman and Cronin, in preparation)

Experiment 1: probabilities

• MEDIC incorporated:

o Symptoms-disease associations from 0.1 to 0.9

o Delay in test results

o Time pressure due to patient’s declining health in real-time

o Deterministic treatment needed to be provided

• N=12, students, paid flat rate

• Each student resolved 56 cases

Results

[Figures: treatment; test diagnosticity; disease base rates; diagnosticity per disease.]

Experiment 1: Conclusions

• Students did learn – not perfectly

• Showed knowledge of probabilities, tested for the more diagnostic cues, and diagnosed very closely to the real state of the diseases.

• What is the role of feedback and how would that interact with the symptom-probability matrix?

Experiment 2: Probabilities and feedback

• MEDIC:
o Symptomology table: Probability or Certainty
o Either detailed feedback or no feedback
• Participants were assigned to one of four conditions:
o probabilities, full feedback (P1): 26
o certainty, full feedback (P2): 30
o certainty, no feedback (P3): 25
o probabilities, no feedback (P4): 29
• N = 110. Participants were paid a flat dollar amount

[Figure: symptomology tables for the Probability and Certainty conditions. Only the values below survive in the transcript.]

Certainty

             Disease 1   Disease 2   Disease 3   Disease 4
Base Rates   0.25        0.25        0.25        0.25
Symptom 1    0.0         0.0         0.0         0.0
Symptom 2    1.0         0.0         0.0         0.0
Symptom 3    1.0         1.0         0.0         0.0
Symptom 4    0.0         0.0         1.0         0.0
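To show how a symptomology table and base rates combine into a diagnosis, here is a minimal sketch using Bayes' rule. It assumes each table entry means P(symptom present | disease), which the slides do not state explicitly, and it uses the certainty-condition values as read from the transcript. The function name `posterior` and the probabilistic row `hypothetical_row` are illustrative, not from MEDIC.

```python
def posterior(base_rates, likelihoods):
    """Bayes' rule: P(disease | symptom present) for each disease.

    `likelihoods[i]` is assumed to mean P(symptom present | disease i).
    Returns None when no disease can produce the symptom.
    """
    joint = [prior * like for prior, like in zip(base_rates, likelihoods)]
    total = sum(joint)
    if total == 0:
        return None
    return [j / total for j in joint]


base_rates = [0.25, 0.25, 0.25, 0.25]
certainty_table = {
    "Symptom 1": [0.0, 0.0, 0.0, 0.0],
    "Symptom 2": [1.0, 0.0, 0.0, 0.0],
    "Symptom 3": [1.0, 1.0, 0.0, 0.0],
    "Symptom 4": [0.0, 0.0, 1.0, 0.0],
}

for symptom, row in certainty_table.items():
    print(symptom, posterior(base_rates, row))

# Hypothetical row for a probabilistic condition (values are an assumption).
hypothetical_row = [0.9, 0.1, 0.1, 0.1]
print("Hypothetical symptom", posterior(base_rates, hypothetical_row))
```

Under this reading, Symptom 2 identifies Disease 1 outright, while Symptom 3 only narrows the diagnosis to Disease 1 or Disease 2, so a test for Symptom 2 is the more diagnostic of the two.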

Test diagnosticity - probability condition

Test diagnosticity – Certainty condition

Diagnosticity per disease

Experiment 2: Conclusions

• Full feedback was helpful in the probabilistic environment and did not make a difference in the certain environment

• We now know that: with repeated trials, students learn in probabilistic environments with time constraints and feedback delays

• Feedback helps in probabilistic environments

• Probabilistic environments are not the main reason for poor learning in dynamic tasks

Basic Building Blocks of Dynamic Decision Making Tasks

• Stocks (accumulations)

• Flows that increase (Inflow) or decrease (Outflow) the stock

• Feedback Delays & multiple relationships

• Environmental or external effects

• Multiple decisions about flows

These problems of dynamic control over time are important to human life: keeping a healthy weight, bank accounts, company inventory, stress levels, climate change etc.
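The basic building block, a stock that accumulates the difference between its inflow and outflow, can be sketched in a few lines. The bank account figures and the function name `simulate_stock` are hypothetical, chosen only to illustrate the accumulation.

```python
def simulate_stock(initial, inflows, outflows):
    """Accumulate a stock: each step adds the inflow and subtracts the outflow."""
    levels = [initial]
    for inflow, outflow in zip(inflows, outflows):
        levels.append(levels[-1] + inflow - outflow)
    return levels


# Hypothetical bank account: deposits (inflow) and spending (outflow) per month.
deposits = [500, 500, 500, 500, 500, 500]
spending = [300, 400, 500, 600, 700, 400]
balance = simulate_stock(1000, deposits, spending)
print(balance)  # [1000, 1200, 1300, 1300, 1200, 1000, 1100]
```

Note that the balance keeps rising during the first two months even though spending is rising, because deposits still exceed spending; it falls only once spending exceeds deposits.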

Humans suffer from poor understanding of accumulation: Stock-Flow failure

Cronin, Gonzalez & Sterman, 2008; Cronin & Gonzalez, 2007; Cronin, Gonzalez and Sterman, 2006; Sweeney & Sterman, 2000; Sterman, 2002

Weight as balance between consumed and expended energy

1. When eaten most?

2. When exercised most?

3. When weight highest?

4. When weight lowest?

Blood glucose level as balance between glucagon and insulin production

1. When most glucagon?

2. When most insulin?

3. When glucose level highest?

4. When glucose level lowest?

Why? (Cronin & Gonzalez, 2007; Cronin, Gonzalez & Sterman, 2008)

• Not an artifact of the graph

• Not due to the form of graphical presentation

• Not due to motivation

• Not due to familiarity with the context

• Stock-Flow failure is one important reason for learning problems in dynamic systems

• Use of heuristics that are intuitively appealing but erroneous
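One way to see how an appealing heuristic goes wrong is to compare the answer "the stock peaks when the inflow peaks" with the actual accumulation. The sketch below uses the weight example with hypothetical daily energy values in arbitrary units; reading the heuristic as "the stock's trajectory mirrors the flow" is our assumption about what the slides mean, and days are counted from 0.

```python
from itertools import accumulate

# Hypothetical energy consumed and expended per day (arbitrary units).
consumed = [20, 24, 28, 26, 22, 18, 16]
expended = [20, 20, 20, 20, 20, 20, 20]

# Net flow per day, and weight change relative to the start, at the end of each day.
net = [c - e for c, e in zip(consumed, expended)]
weight = list(accumulate(net))


def peak_day(series):
    """Index of the first maximum in `series`."""
    return max(range(len(series)), key=lambda d: series[d])


heuristic_day = peak_day(consumed)  # "weight is highest when most is eaten"
actual_day = peak_day(weight)       # where the accumulated stock really peaks
print(heuristic_day, actual_day)    # 2 4
```

Weight keeps increasing for as long as consumption exceeds expenditure, so it peaks on the last day with a positive net flow, two days after eating peaks.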

Future work

• Further investigate the correlation heuristic and the Stock-Flow failure

• Use DMGames of Dynamic Stocks and Flows to understand the reasoning and learning problems in dynamic tasks

• Further develop the Instance-Based learning theory to other dynamic problems, like the Stock-Flow

• Further investigate ways to identify and overcome the problems in learning in dynamic systems

DDMLab – February,