
Time-constrained reasoning
Learn to be a split-second expert

Niels Taatgen and John Anderson

Carnegie Mellon University

Funded by grants from ONR (ACT-R), NWO (Set) and NASA (CMU-ASP)

Overview

Time-constrained reasoning
ACT-R essentials
Set (a game)
CMU-ASP (complex dynamic task)

Goals of the ACT-R project

Explain as much as possible of cognition using a single theory
The theory has to be explicated in a cognitive architecture, a simulation environment that mimics actual human behavior
Single mechanisms should explain many phenomena

Skill Acquisition through Production Compilation

People become faster at performing a certain task through learning
The general theory is that they transform declarative into procedural knowledge
Speed-up is explained by the fact that declarative knowledge no longer has to be retrieved
A second aspect of becoming faster is that some things that had to be done serially can now be done in parallel, leading to qualitative changes in behavior

Time-constrained reasoning

Initial knowledge
Speed-up by production compilation
Qualitative improvements in performance
Embedded learning
“Being fast” means: explaining why people can learn to be very fast at certain complex tasks (so nothing fast computers solve)

Goals of the ACT-R project

Model human behavior in order to:
Better understand human reasoning
Use models of human reasoning in:
• HCI
• Intelligent agents
• Tutoring
The simulation’s predictions are as fine-grained as possible:
• reaction times
• errors
• choices
• learning curves
• eye movements
• fMRI data

ACT-R’s foundations

Rational Analysis: Knowledge is treated as having a potential benefit for the agent (so ‘truth’ is not fundamental)
Procedural/Declarative distinction
Hybrid: both symbolic and subsymbolic
Strong focus on learning
Strong focus on interaction with the environment

Summary of ACT-R’s memory

Declarative memory
• contains facts and past goals (called chunks)
• activation determines which chunk is selected
• chunks that are retrieved or recreated often receive high activations

Procedural memory
• contains rules
• utility determines which rule is selected
• High utility = high probability of success and low costs
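The utility bullet can be illustrated with a minimal sketch. It assumes the expected-gain form commonly given for ACT-R versions of this period, utility = P * G - C, where P is the probability of success, G the value of the goal and C the cost. The rule names, the numbers and the helper functions are hypothetical, and the noise that ACT-R adds to utilities is left out.

```python
def utility(p_success, goal_value, cost):
    """Expected utility of a rule, assuming the form P * G - C."""
    return p_success * goal_value - cost


def select_rule(rules, goal_value=20.0):
    """Return the name of the rule with the highest utility.

    `rules` maps a rule name to a (p_success, cost) pair.
    Noise is omitted, so selection here is deterministic.
    """
    return max(rules, key=lambda name: utility(rules[name][0], goal_value, rules[name][1]))


# Hypothetical rules: a cheap, slightly less reliable one and a costly, reliable one.
rules = {"retrieve-answer": (0.95, 1.0), "count-up": (0.99, 4.0)}
best = select_rule(rules)
```

With these made-up values the cheaper rule wins, because its small loss in success probability is outweighed by its lower cost.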

Overview: ACT-R in a Diagram

[Diagram: ACT-R components and associated brain regions]
Productions (Basal Ganglia): Matching (Striatum), Selection (Pallidum), Execution (Thalamus)
Buffers: Retrieval Buffer (VLPFC), Goal Buffer (DLPFC), Manual Buffer (Motor), Visual Buffer (Parietal)
Modules: Declarative Module (Temporal/Hippocampus), Intentional Module (not identified), Visual Module (Occipital/Parietal), Manual Module (Motor/Cerebellum)
External World

ACT-R Cycle:

Matching
Selection
Execution




Matching:

Production rules that match the current contents of the buffers are determined




Selection:

Select the production rule with the highest utility




Execution:

The selected production rule modifies the contents of the buffers
Modules operate asynchronously from central cognition
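The three phases of the cycle can be sketched as a small loop. This is a toy illustration and not the ACT-R implementation: buffers are plain dictionaries, utilities are fixed numbers, and the asynchronous modules are not modeled. The `Production` class and the counting example are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Buffers = Dict[str, dict]


@dataclass
class Production:
    name: str
    utility: float
    condition: Callable[[Buffers], bool]  # tested during matching
    action: Callable[[Buffers], None]     # applied during execution


def run_cycle(productions: List[Production], buffers: Buffers) -> Optional[str]:
    """Run one match-select-execute cycle; return the fired rule's name, or None."""
    matching = [p for p in productions if p.condition(buffers)]  # Matching
    if not matching:
        return None
    chosen = max(matching, key=lambda p: p.utility)              # Selection
    chosen.action(buffers)                                       # Execution
    return chosen.name


def run(productions: List[Production], buffers: Buffers, max_cycles: int = 10) -> List[str]:
    """Repeat the cycle until no rule matches or max_cycles is reached."""
    trace = []
    for _ in range(max_cycles):
        fired = run_cycle(productions, buffers)
        if fired is None:
            break
        trace.append(fired)
    return trace


# Toy task: count to 2 in the goal buffer, then mark the goal as done.
productions = [
    Production("increment", 1.0,
               lambda b: b["goal"]["count"] < 2,
               lambda b: b["goal"].update(count=b["goal"]["count"] + 1)),
    Production("stop", 2.0,
               lambda b: b["goal"]["count"] == 2 and not b["goal"].get("done"),
               lambda b: b["goal"].update(done=True)),
]
buffers = {"goal": {"count": 0}}
trace = run(productions, buffers)
```

Only one rule fires per cycle, which is why the serial bottleneck sits in the production system while the modules can work in parallel.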

Production Compilation: Learning new production rules

Combine two existing rules that are used in sequence into a new rule, while substituting a fact that is retrieved from memory
Solidify recurring reasoning patterns

[Diagram: Rule 1 and Rule 2 (procedural) are used in sequence, with a fact from declarative memory between them; the result is a single rule, Rule1 & Rule2]
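A minimal sketch of the idea, using the color attribute from the Set example that appears later in the deck. Rule 1 requests a fact, Rule 2 uses the retrieved fact, and the compiled rule has the fact substituted in so that no retrieval is needed. The function names and the dictionary standing in for declarative memory are assumptions, not ACT-R code.

```python
# Declarative memory (assumed representation): for two different colors,
# the fact stores the remaining third color.
COLORS = {"red", "green", "blue"}
FACTS = {frozenset({a, b}): (COLORS - {a, b}).pop()
         for a in COLORS for b in COLORS if a != b}


def rule1_request(goal_value, visual_value):
    """Rule 1: build a retrieval request for a value unlike both inputs."""
    return frozenset({goal_value, visual_value})


def rule2_harvest(retrieved_value, goal):
    """Rule 2: put the retrieved value in the goal."""
    goal["predicted"] = retrieved_value


def compile_rules(goal_value, visual_value):
    """Combine rule 1 and rule 2 into one rule with the fact substituted in.

    The retrieval happens once, here, at learning time. The returned rule is
    specific to this pair of values and never touches declarative memory.
    """
    fact = FACTS[rule1_request(goal_value, visual_value)]

    def compiled(goal, g, v):
        if (g, v) == (goal_value, visual_value):
            goal["predicted"] = fact
            return True
        return False

    return compiled


# Uncompiled route: two rule firings with a retrieval in between.
slow_goal = {}
rule2_harvest(FACTS[rule1_request("red", "blue")], slow_goal)

# Compiled route: a single rule firing.
red_blue = compile_rules("red", "blue")
fast_goal = {}
fired = red_blue(fast_goal, "red", "blue")
```

Both routes produce the same prediction; the compiled rule is narrower, since it only applies to the value pair it was learned from.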

Set

Game
Predictions
Experiment
Model
Application
Evaluation

With Marcia van Oploo, Jos Braaksma and Jelle Niemantsverdriet

Computer Games

Computer chess: design a program that plays chess as well as possible
But chess players complain that computers play boring chess
Different goal: opponents that play as humanly as possible.

The Game of Set!

Game consists of 81 cards
Each card has four attributes: color, shape, filling and number

Goal of the Game

Twelve cards are put on the table
Find a Set: three cards in which, for each attribute, the attribute values for each of the cards are all different, or all the same
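The definition of a Set translates directly into a short check. In the sketch below the `Card` type and the concrete attribute values (such as "oval" or "solid") are assumptions for illustration; the slides name only the four attributes.

```python
from typing import NamedTuple


class Card(NamedTuple):
    color: str
    shape: str
    filling: str
    number: int


def is_set(a: Card, b: Card, c: Card) -> bool:
    """True if, for every attribute, the three values are all the same or all different."""
    return all(len({x, y, z}) in (1, 3) for x, y, z in zip(a, b, c))


# Three cards that differ only in number form a Set.
one = Card("red", "oval", "solid", 1)
two = Card("red", "oval", "solid", 2)
three = Card("red", "oval", "solid", 3)

# Swapping in a green card breaks it: the colors are now red, red, green.
green_three = Card("green", "oval", "solid", 3)
```

A triple fails as soon as one attribute has exactly two distinct values.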

Goal of the Game

[Figure: example of cards that are not a Set, labeled “Not a Set!”]

Some Sets are more difficult than others

One attribute different
Two attributes different
Three attributes different
Four attributes different

[Figure: twelve cards laid out in three rows of four, numbered 1 to 12]

Set! as a game to play against the computer

For a computer, the game is trivial (as opposed to chess, etc.)
The challenge is to program an opponent that acts as a human player
So the computer opponent has to be fast at sets that people are fast at, and slow at sets that people are slow at
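To see why the game is trivial for a computer, here is a brute-force sketch. Twelve cards give only 220 triples, so every one can be checked. The attribute values are assumed for illustration, and `difficulty` follows the deck's own measure (number of attributes that differ).

```python
from itertools import combinations, product

# Attribute values are assumptions; the slides name only the attributes.
ATTRIBUTES = {
    "color": ("red", "green", "blue"),
    "shape": ("oval", "diamond", "squiggle"),
    "filling": ("solid", "open", "striped"),
    "number": (1, 2, 3),
}
DECK = list(product(*ATTRIBUTES.values()))  # 3**4 = 81 cards


def is_set(cards):
    """Each attribute must be all the same or all different across the three cards."""
    return all(len(set(values)) in (1, 3) for values in zip(*cards))


def difficulty(cards):
    """Number of attributes on which all three cards differ (1 to 4 for a Set)."""
    return sum(len(set(values)) == 3 for values in zip(*cards))


def find_sets(table):
    """Exhaustively check every triple of cards on the table."""
    return [trio for trio in combinations(table, 3) if is_set(trio)]


table = DECK[:12]
sets_with_difficulty = [(trio, difficulty(trio)) for trio in find_sets(table)]
```

A human-like opponent cannot use this search as is, because it finds easy and hard Sets equally fast.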

The Predictions

1. The “easy” sets will be found faster than the “hard” sets.
2. Experts on the game will mainly excel in finding the hard sets, and will be approximately as good as beginners on the easy sets

The Experiment

8 subjects, 4 beginners, 4 experts
20 set-problems (12 cards, find the set as fast as possible)
5 problems of each of the four levels of difficulty

Experimental results

Experiment confirms both hypotheses:
1. Difficult problems take longer
2. Beginners and Experts are equally good at easy problems, but Experts excel on hard problems

[Figure: Set! Time (sec), axis from 0 to 100, against difficulty (number of attributes different, 1 to 4); series: Beginner data, Expert data]

The model

Why do hard sets take longer?
• Checking for unequal attributes takes longer
Why are experts better at hard problems?
• They are better at multi-tasking, which pays off in hard problems

How the model works

First, pick a random card and stick it in the goal buffer

[Figure: Goal: 3; Visual: 3]


Second, search for a card of the same color. If this fails, search for an arbitrary different card
We don’t put this card in the goal, but leave it in the visual buffer

[Figure: Goal: 3; Visual: 2]


Now the model is going to do two things in parallel:
• Check in declarative memory whether or not we tried this combination of two cards before
• Make a prediction of what the third card has to look like

[Figure: Goal: 3; Visual: 2; Retrieval buffer: Have we tried these two cards before?; Declarative Memory busy retrieving the fact]


Predicting the third card
• For each attribute, we determine what it has to be like in the third card, and put this back into the goal
• When the attribute for goal and visual are equal, this attribute is also the desired attribute for the new card

[Figure: Goal: 3; Visual: 2; declarative memory still retrieving the fact]


Predicting the third card
• When the attributes are different, we have to determine the third value

[Figure: Goal: 3; Visual: 2; 11; declarative memory still retrieving the fact]

How the model works

After predicting the third card
• Now that the third card has been predicted, the model tries to find it on the screen.
• If this fails, it starts all over again
• If this succeeds, it announces it has found a Set!

[Figure: Goal: 31]
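The strategy described on the preceding slides can be sketched as a loop: pick a card, pick a second card (preferring the same color), skip pairs already tried, predict the third card, and look for it on the table. This sketch is sequential and ignores timing, so it does not capture the parallelism between retrieval and prediction. The attribute values, the card-as-tuple representation and the `max_attempts` limit are assumptions.

```python
import random

# Assumed attribute values, in the order color, shape, filling, number.
VALUES = [
    ("red", "green", "blue"),
    ("oval", "diamond", "squiggle"),
    ("solid", "open", "striped"),
    (1, 2, 3),
]


def predict_third(goal_card, visual_card):
    """Per attribute: equal values carry over, different values give the remaining one."""
    third = []
    for values, g, v in zip(VALUES, goal_card, visual_card):
        if g == v:
            third.append(g)
        else:
            third.append(next(x for x in values if x not in (g, v)))
    return tuple(third)


def find_set(table, rng, max_attempts=1000):
    """Return a Set found with the model's strategy, or None if attempts run out."""
    tried = set()  # stands in for declarative memory of earlier attempts
    for _ in range(max_attempts):
        goal_card = rng.choice(table)                       # first card goes in the goal
        others = [c for c in table if c != goal_card]
        same_color = [c for c in others if c[0] == goal_card[0]]
        visual_card = rng.choice(same_color or others)      # second card stays in visual
        pair = frozenset({goal_card, visual_card})
        if pair in tried:                                   # tried this combination before
            continue
        tried.add(pair)
        third = predict_third(goal_card, visual_card)
        if third in table:                                  # search the screen for it
            return goal_card, visual_card, third
    return None


table = [
    ("red", "oval", "solid", 1),
    ("blue", "oval", "open", 1),
    ("green", "oval", "striped", 1),
]
found = find_set(table, random.Random(0))
```

Because the third card is predicted rather than searched for by trial, each attempt costs one visual search instead of a comparison against every remaining card.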

Beginners vs. Experts

Beginner:
IF value in the goal is val1 and the value in the visual buffer is val2
THEN send a retrieval request for a value that is different from val1 and val2

IF the retrieval buffer contains val3, different from val1 and val2
THEN put val3 in the goal

Expert:
IF value in the goal is red and the value in the visual buffer is blue
THEN put green in the goal

How the model works

Beginners:
• At the moment that (in this example) the filling has to be determined, a declarative retrieval is needed
• This is, however, impossible, because declarative memory is still engaged in another retrieval!
• So the beginner has to wait until the first declarative retrieval is done
Experts:
• Have proceduralized the retrieval for the third attribute value

[Figure: Goal: 3; Visual: 2; Retrieval buffer: What is the third value?; declarative memory still retrieving the fact]

[Figure: timelines for Novice and Expert, annotated “Nothing going on here except wait for retrieval”, “Third card is predicted” and “Third card is sought”]
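The difference between waiting and not waiting can be made concrete with a rough timing sketch. It makes strong simplifying assumptions: 50 ms per rule firing (ACT-R's default), a made-up 0.5 s per declarative retrieval, and a beginner whose attribute work is all queued behind the "tried before?" retrieval. These are not the model's fitted parameters, and the function is hypothetical.

```python
RULE_TIME = 0.05       # seconds per production firing (ACT-R default)
RETRIEVAL_TIME = 0.5   # hypothetical duration of one declarative retrieval


def prediction_time(n_different, expert, n_attributes=4):
    """Rough time to predict the third card while the history check is running.

    n_different is the number of attributes on which the two cards differ.
    """
    history_check = RETRIEVAL_TIME  # declarative memory is busy for this long
    if expert:
        # Compiled rules: one firing per attribute, in parallel with the retrieval.
        return max(n_attributes * RULE_TIME, history_check)
    # Beginner: each differing attribute needs a request rule, a retrieval and
    # a harvest rule, and retrievals cannot start until the history check ends.
    same = (n_attributes - n_different) * RULE_TIME
    different = n_different * (2 * RULE_TIME + RETRIEVAL_TIME)
    return history_check + same + different


beginner_times = [prediction_time(n, expert=False) for n in (1, 2, 3, 4)]
expert_times = [prediction_time(n, expert=True) for n in (1, 2, 3, 4)]
```

Under these assumptions the beginner's time grows with the number of differing attributes while the expert's stays flat, which is the qualitative pattern the slides describe.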

Results of the Model

[Figure: Set! Time (sec), axis from 0 to 120, against difficulty (number of attributes different, 1 to 4); series: Beginner model, Expert model, Beginner data, Expert data]

Conclusion (intermediate)

Experts are not just faster than beginners
They can do certain reasoning steps effortlessly that are effortful for novices
In this task the beginners can still do it, but one can imagine tasks where you have to be a (partial) expert to be able to do it at all

The Application

CMU-ASP task

Subjects have to classify planes (tracks) on a radar screen
They have to do three things to classify a track:
• Select one by clicking on it
• Use one of two classification methods, each of which is sometimes successful and sometimes not
• Enter the classification into the system

[Screenshot annotations:]
Clicking on a track selects it
One classification method is to look at altitude and speed
Another is to ask for a radar signature by pressing some keys and waiting for the information to come up
Finally, a series of key presses is used to enter the information

General modeling approach

Represent instructions as declarative knowledge
Have task-general production rules that interpret these instructions
Production compilation produces task-specific rules
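A rough sketch of this approach: instructions are stored as data, a task-general interpreter retrieves and carries out one step at a time, and a compiled version has the steps built in. The step names paraphrase the "Look at a track" and "Hook a track" instructions that appear later in the deck; the functions and the retrieval count are hypothetical stand-ins, not ACT-R code.

```python
# Instructions as declarative knowledge (assumed representation).
INSTRUCTIONS = {
    "look-at-track": ["find-track-location", "move-attention", "store-location"],
    "hook-track": ["hand-to-mouse", "move-mouse-to-track", "click-mouse"],
}


def interpret(task, instructions, trace):
    """Task-general interpreter: retrieve each instruction, then carry it out.

    Returns the number of declarative retrievals that were needed.
    """
    retrievals = 0
    for step in instructions[task]:  # each step costs one retrieval
        retrievals += 1
        trace.append(step)
    return retrievals


def compile_task(task, instructions):
    """Task-specific rule: the steps are baked in, so no retrievals are needed."""
    steps = tuple(instructions[task])

    def compiled(trace):
        trace.extend(steps)
        return 0

    return compiled


novice_trace, expert_trace = [], []
novice_retrievals = interpret("hook-track", INSTRUCTIONS, novice_trace)
expert_retrievals = compile_task("hook-track", INSTRUCTIONS)(expert_trace)
```

Both versions perform the same steps; what compilation removes is the retrieval between them.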

Representation of instructions (Anderson)

[Diagram: instruction hierarchy with nodes Identify Tracks, Look for a track, Hook track, Id it, Repeat, Do Radar, Classify, Hand to F-keys, Select “EWS”, Select “Query”, Encode, Look at alt & speed]

This model misses many interesting phenomena

Moving hands from the mouse to the keyboard and vice versa ahead of time
Deciding which classification strategy to use first
Comparing tracks before selecting one without using too much time
To summarize: optimize parallelization of behavior

From a strong to a weak hierarchy

[Diagram: task hierarchy with nodes Select Track, Look for Track, Hook Track, Use Radar, Id, Find approp. key, Read Air ID, Recall classification, Enter Classification, Find approp. key, Check alt/speed, Look for alt, Check range, Look for speed]

[Diagram: instruction hierarchy with nodes Identify Tracks, Look for a track, Hook track, Id it, Repeat, Do Radar, Classify, Hand to F-keys, Select “EWS”, Select “Query”, Encode, Look at alt & speed]

Instructions involve multiple steps

Look at a track
1. Find a visual-location of type track
2. Move attention to it
3. Store the location in the goal

Hook a track
1. Move the hand to the mouse
2. Move the mouse to the location of the track
3. Click the mouse

Carrying out these steps in order is inefficient!

Example productions

    (p retrieve-next-clause
       =goal>
          isa task
          rule =id
          step done
       -retrieval>
    ==>
       +retrieval>
          isa clause
          rule =id
    )

IF the goal is a task
THEN retrieve the next instruction

Note: spreading activation will ensure the right instruction is retrieved

Example productions

    (p look-for-visual-track
       =goal>
          isa task
          step done
       =retrieval>
          isa clause
          relation look-for
          arg2 =type
       =visual-location>
          isa visual-location
          kind =type
       -visual>
    ==>
       +visual>
          isa visual-object
          screen-pos =visual-location
       -retrieval>)

IF the goal is a task
AND an instruction is retrieved to look for something of a certain type
AND we have found a location on the screen with something of that type
THEN move the eyes to the location and attend it

Example productions

    (p hook-new-mouse-not-at-destination
       =goal>
          isa task
          var1 =object
          step done
       =retrieval>
          isa clause
          relation hook-new
          arg1 var1
       =manual-state>
          isa module-state
          modality free
       !eval! (not (cursor-at =object))
    ==>
       -retrieval>
       +manual>
          isa move-cursor
          object =object)

IF the goal is a task
AND an instruction has been retrieved that specifies that an object has to be clicked
AND the motor module is available
AND the cursor is not on the object
THEN move the mouse to the object

Additional rule

If the visual system finds another track while one is already in the goal, then it compares the new track to the old track and retains the best

[Figure: Novice timeline with steps Move hand to mouse, Move mouse to track, Click mouse, Attend a track]

Learned rules

    (p Production1013
       =goal>
          isa TASK
          rule Find-Track
          step Done
       =manual-state>
          isa MODULE-STATE
          modality Free
       !eval! (hand-not-on-mouse)
    ==>
       +manual>
          isa HAND-TO-MOUSE)

IF the goal is to find a track
AND the motor module is free
AND the hand is not on the mouse
THEN move the hand to the mouse

Learned rules

    (p Production544
       =goal>
          isa TASK
          rule Find-Track
          step Done
       =visual-location>
          isa VISUAL-LOCATION
          kind Square-Track
       =visual-state>
          isa MODULE-STATE
          modality Free
       -visual>
    ==>
       +visual>
          isa VISUAL-OBJECT
          screen-pos =visual-location)

IF the goal is to find a track
AND a visual-location of type track has been found
AND it is not attended
THEN attend the track

[Figure: Expert timeline]

Other changes in the model’s behavior

The model sometimes opts for doing a Radar sweep first and then looking at the altitude and speed (subjects do too)
The model sometimes prefers using the left hand for keying and the right hand for mousing

From strong to weak task hierarchy

Production compilation not only speeds up performance, but also leads to qualitative changes in behavior
Bottom-up behavior next to Top-down behavior
Is this real reasoning? Maybe reasoning with the small “r”, but nevertheless crucial in understanding the flexibility of human learning

Related work

Mike Freed: APEX (but no learning)
Ron Chong: EPIC-Soar
Kieras and Meyer are working on it (EPIC), but they will have to incorporate learning into EPIC
Rick Lewis uses an APEX-like approach in ACT-R

Three models

1. Wait 35 sec in expert setting and 45 sec in beginner setting
2. Based on the model just discussed
3. The model plus two additional strategies

Usability test

[Figure: bar chart for Model 1, Model 2 and Model 3, vertical axis from 1 to 5, on three questions: Acts like human? Challenging opponent? Difference Beginner and Expert?]
