
Striatal Dopamine (DA) and Learning: Do Category Learning (CL) data constrain computational models?

Alan Pickering
Department of Psychology
a.pickering@gold.ac.uk

Overview
• Classic CL findings and questions
• DA, the striatum and learning
• Generate a simple hypothesis about CL deficits in Parkinson's Disease
• Generate a simple biologically-constrained neural net to test the hypothesis
• Simulate CL data on 2 types of matched CL tasks
• Conclusions: why the model fails

Classic Findings and Questions
• Parkinson's Disease (PD) patients are impaired at CL tasks.
• Why?
  - What psychological processes are impaired?
  - What brain regions and neurotransmitters are involved?

Category Learning in Parkinson’s Disease

Weather task: Knowlton et al, 1996

Category Learning in Parkinson’s Disease

Main Findings: Knowlton et al, 1996

Key Facts
• PD involves prominent damage to the striatum
• CL may (sometimes) involve procedural/habit learning
• Striatal structures are part of cortico-striato-pallido-thalamic loops possibly implicated in procedural learning
• The striatum is strongly innervated by ascending DA projections

Simple Interpretation
• CL deficits in PD may arise because the loss of ascending DA signals compromises the functioning of (parts of) the striatum

Three Learning Processes Which Might Be DA-Related

1. Appetitive reinforcement and motivation
Increases/decreases in DA cell firing provide a positive/negative reinforcement signal which is required for synaptic strengthening/weakening: the "3-factor learning rule" (e.g., Wickens; Brown et al.). A minimal code sketch follows below.
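To make the 3-factor idea concrete, here is a minimal sketch (not the presenter's code): a weight changes only when presynaptic activity, postsynaptic activity and a DA reinforcement signal coincide. The function name, layer sizes and learning rate are illustrative assumptions.

```python
import numpy as np

def three_factor_update(w, pre, post, da_signal, lr=0.05):
    """Three-factor corticostriatal update (illustrative sketch):
    the weight change requires coincident presynaptic activity,
    postsynaptic activity and a DA reinforcement signal."""
    return w + lr * da_signal * np.outer(post, pre)

# Toy usage (all values illustrative).
pre = np.array([1.0, 0.0, 1.0])    # cortical (presynaptic) activity
post = np.array([0.8, 0.1])        # striatal (postsynaptic) activity
w = np.zeros((2, 3))               # striatal-by-cortical weight matrix
w = three_factor_update(w, pre, post, da_signal=+1.0)   # DA increase: strengthen
w = three_factor_update(w, pre, post, da_signal=-0.5)   # DA decrease: weaken
```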

Corticostriatal (Medium Spiny Cell) Synapse
DA Receptors in Striatum (after Schultz, 1998)
[Figure: DA receptors shown as unfilled rectangles, GLU receptors as filled rectangles]

DA-Related Processes (cont)

2. Reward prediction error
Midbrain DA neurons increase firing in response to unexpected rewards and decrease firing to the non-occurrence of expected rewards.
Firing change = reward prediction error
(Schultz, Suri, Dickinson, Dayan, etc.) A minimal sketch follows below.
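As a hedged illustration of the idea (not the presenter's implementation), the sketch below computes a reward prediction error as actual minus predicted reward and uses it to update the prediction; the learning rate and trial sequence are arbitrary assumptions.

```python
def reward_prediction_error(reward, prediction):
    """E > 0 for an unexpected reward, E < 0 for an omitted expected reward."""
    return reward - prediction

# Toy trial sequence: the cue is sometimes followed by reward.
prediction, alpha = 0.0, 0.2        # current reward prediction, update rate
for reward in [1, 1, 1, 0, 1, 0]:
    E = reward_prediction_error(reward, prediction)
    prediction += alpha * E         # prediction tracks expected reward
    print(f"reward={reward}  E={E:+.2f}  prediction={prediction:.2f}")
```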

DA Cell Recordings: Evidence for Reward Prediction Error
[Figure: DA cell firing aligned to CUE and REWARD events]

DA-Related Processes (cont)

3. Modulation of neural signals
Floresco et al. (2001): "DA receptor activity serves to strengthen salient inputs while inhibiting weaker ones"
Also: Nicola & Malenka; J.D. Cohen; Ashby & Casale; Salum et al.; Nakahara; Schultz

Evidence for Modulation
Nicola & Malenka (1997) recorded the effect of DA on the response of striatal cells to strong and weak inputs.
[Figure: striatal cell responses to strong vs. weak inputs]

Linking 3-Factor Learning & Reward Prediction Error
[Diagram: cue and reward inputs to a striatal cell and a DA cell; the DA cell carries the reward prediction error (reward minus reward prediction) and supplies the reinforcement signal; excitatory, inhibitory and reinforcement connections indicated]

Simple Working Hypothesis

• CL is impaired in PD patients (and other DA-compromised groups) due to “reduced DA function” in striatum (tail of caudate)

• The loss of ascending DA input reduces the reinforcing function of the reward prediction error signal innervating the striatum

Modelling
• Biologically-constrained neural net
• Data to be simulated taken from Ashby et al. (2003)
• Data from young and old controls (YC, OC) and PD patients
• Study used matched CL tasks: rule-based (RB) and information-integration (II)
• Ashby and colleagues believe these tasks are handled by distinct CL systems

Ashby et al.: II Task
• 3 of the 4 dimensions determine the categories
• Not readily verbalisable
[Figure: example Cat A and Cat B stimuli]

Ashby et al.: RB Task
• 1 dimension (background colour) determines the category
• Readily verbalisable rule
[Figure: example Cat A and Cat B stimuli]
One possible formalization of the two category rules is sketched below.
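The sketch below is one plausible formalization of the two tasks, assuming four binary stimulus dimensions, a majority rule over three relevant dimensions for the II task, and a single relevant dimension for the RB task; the exact stimulus coding and category assignments used by Ashby et al. (2003) may differ.

```python
import random

def ii_category(stim):
    """Information-integration: depends on 3 of the 4 binary dimensions.
    Assumed majority rule over dimensions 0-2 (illustrative only; the
    exact assignment in Ashby et al., 2003 may differ)."""
    return "A" if sum(stim[:3]) >= 2 else "B"

def rb_category(stim):
    """Rule-based: one verbalisable dimension (assumed to be dimension 3,
    standing in for background colour)."""
    return "A" if stim[3] == 1 else "B"

stim = [random.randint(0, 1) for _ in range(4)]   # a random 4-dimensional stimulus
print(stim, "II:", ii_category(stim), "RB:", rb_category(stim))
```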

Ashby et al.: Results
• Proportion failing to learn to criterion in 200 trials

[Figure: proportion of non-learners for YC, OC and PD; left panel II task, right panel RB task]

RB Task: Results
• Trials to criterion for learners
[Figure: trials to criterion for YC, OC and PD learners; left panel II task, right panel RB task]

Modelling

• Constrained by input and output connections of striatum (caudate)

• Learning rule based on known 3-factor form of synaptic plasticity in striatum

• Learning rule consistent with reward prediction error properties of DA neurons

Connections of Striatum
[Diagram: Neocortex, Prefrontal Cortex, Striatum, SNc, VTA, Sth, GPe, GPi, Thalamus]

Schematic Model
[Diagram: stimulus pattern feeds input units; an S-R representation maps input to output units and the response decision; a DA unit receives the reward and a reward prediction and broadcasts the DA signal to the S-R weights]
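As a minimal sketch of the kind of architecture the schematic implies (not the presenter's code): a stimulus pattern activates input units, S-R weights drive output units, and the response is a winner-take-all decision; the layer sizes, weight initialisation and decision rule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 4, 2                        # stimulus dimensions, response units
W = rng.normal(0.0, 0.01, (n_out, n_in))  # S-R (cortico-striatal) weights

def forward(stimulus):
    """Propagate a stimulus pattern and choose a response (winner-take-all)."""
    y = W @ stimulus                      # striatal output activations
    response = int(np.argmax(y))          # response decision
    return y, response

y_out, response = forward(np.array([1.0, 0.0, 1.0, 1.0]))
print("outputs:", y_out, "chosen response:", response)
```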

Model Learning Rule

When reward is present, E > 0: $\Delta w_{JK} = k_R \, E \, y_K^{out} \, x_J^{out}$

When reward is absent, E < 0: $\Delta w_{JK} = k_N \, E \, y_K^{out} \, x_J^{out}$

where $x_J^{out}$ is the output of input unit J, $y_K^{out}$ is the output of striatal unit K, $w_{JK}$ is the weight connecting them, and E is the reward prediction error. A code sketch of this rule follows below.
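A hedged transcription of the rule above into code: separate gains k_R and k_N apply on rewarded (E > 0) and unrewarded (E < 0) trials, and the weight change is proportional to E and to the pre- and postsynaptic outputs. The particular gain values and array shapes are illustrative assumptions.

```python
import numpy as np

def model_learning_rule(W, x_out, y_out, E, k_R=0.1, k_N=0.05):
    """Delta w_JK = k * E * y_K^out * x_J^out, with k = k_R when E > 0
    (reward present) and k = k_N when E < 0 (reward absent)."""
    k = k_R if E > 0 else k_N
    return W + k * E * np.outer(y_out, x_out)

# Toy usage on a 2-output x 4-input weight matrix (values illustrative).
W = np.zeros((2, 4))
x_out = np.array([1.0, 0.0, 1.0, 1.0])   # input-unit outputs
y_out = np.array([0.9, 0.1])             # striatal-unit outputs
W = model_learning_rule(W, x_out, y_out, E=+0.8)   # unexpected reward
W = model_learning_rule(W, x_out, y_out, E=-0.4)   # omitted expected reward
```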

Modelling of Reduced DA Function

Loss of DA input to the striatum (tail of caudate) was modelled in 2 ways (with the same results):
a) loss of modifiability of cortico-striatal weights
b) proportional reduction of reward prediction error strength

Mean proportion of weights modifiable: YC 0.8, OC 0.5, PD 0.2 (with s.d. = 0.15). Both manipulations are sketched below.
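A minimal sketch of the two manipulations, using the group means and s.d. quoted on the slide; the sampling and masking details are illustrative assumptions rather than the presenter's implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
MEAN_MODIFIABLE = {"YC": 0.8, "OC": 0.5, "PD": 0.2}   # from the slide; s.d. = 0.15

def sample_modifiable_mask(shape, group):
    """(a) Only a sampled proportion of cortico-striatal weights stays modifiable."""
    p = float(np.clip(rng.normal(MEAN_MODIFIABLE[group], 0.15), 0.0, 1.0))
    return rng.random(shape) < p          # True where the weight may change

def scaled_prediction_error(E, group):
    """(b) Proportional reduction of the reward prediction error strength."""
    return MEAN_MODIFIABLE[group] * E

# Toy usage: apply a candidate weight update only where the mask allows it.
mask = sample_modifiable_mask((2, 4), "PD")
dW = 0.1 * 0.8 * np.outer([0.9, 0.1], [1.0, 0.0, 1.0, 1.0])
dW_applied = np.where(mask, dW, 0.0)
print(mask, scaled_prediction_error(0.8, "PD"))
```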

Modelling Process
• Found parameters which gave a good fit to YC performance on the II task, and set DA parameters for PD to produce the appropriate level of non-learners on the same task
• Varied OC DA values between YC and PD
• Looked at the fit (with these parameters) to all other data cells, especially the RB task
A sketch of the simulation loop implied by this procedure is given below.
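The sketch below shows the general shape of such a simulation loop: run many simulated participants per group, count those failing to reach a learning criterion within 200 trials, and average trials-to-criterion over the learners. The toy "participant" model, the criterion of 8 consecutive correct responses and the group size are placeholder assumptions, not the actual network or the criterion used by Ashby et al.

```python
import random

def run_one_participant(gain, max_trials=200, criterion=8):
    """Placeholder learner (not the actual network): accuracy drifts upward
    each trial at a group-dependent rate; returns trials to a run of
    `criterion` consecutive correct responses, or None if never reached."""
    p_correct, run = 0.5, 0
    for t in range(1, max_trials + 1):
        correct = random.random() < p_correct
        run = run + 1 if correct else 0
        if run >= criterion:
            return t
        p_correct = min(0.95, p_correct + gain)
    return None

def simulate_group(gain, n_participants=100):
    results = [run_one_participant(gain) for _ in range(n_participants)]
    learners = [t for t in results if t is not None]
    prop_nonlearners = 1.0 - len(learners) / n_participants
    mean_ttc = sum(learners) / len(learners) if learners else None
    return prop_nonlearners, mean_ttc

# e.g. a weaker learning gain as a stand-in for reduced DA function:
print("YC-like:", simulate_group(0.02), "PD-like:", simulate_group(0.002))
```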

Modelling II Task Results
[Figure: data vs. model for YC and PD; proportion of non-learners and trials to criterion (learners)]

Modelling II Task Results
[Figure: data vs. model for YC, OC and PD; proportion of non-learners and trials to criterion (learners)]

Modelling II Task Results*
[Figure: data vs. model for YC, OC and PD; proportion of non-learners and trials to criterion (learners)]

Model Results: II Task
[Figure: performance of learners in blocks of 16 trials]

Modelling RB Task Results
[Figure: data vs. model for YC, OC and PD; proportion of non-learners and trials to criterion (learners)]

Modelling RB Task Results*
[Figure: data vs. model for YC, OC and PD; proportion of non-learners and trials to criterion (learners)]

Conclusions & Future Work 1

• Simplest realistic model of cortico-striatal learning captures only limited aspects of the CL data

• "Bimodal" nature of the data (learn normally vs. fail) was simulated only under some parameter settings

• No intermediate DA parameter setting for old controls is both PD-like on the II task and YC-like on the RB task

Conclusions & Future Work 2
• The model challenges the hypothesis under test: PD (and OC) deficits in some CL tasks seem unlikely to be solely due to reduced DA-related reinforcement in the striatum
• Findings are consistent with >1 CL system
• A future model should add a rule system (cf. Ashby's COVIS)
