Post on 21-Jan-2016

Instrumental Conditioning II

Delay of Reinforcement

[Figure: Grice (1948) apparatus, with labels Start, Delay, Choice, Correct, Incorrect, Goal, and Reward or No Reward]

Grice (1948) Results

[Figure: Percent Correct (y-axis, 20 to 100) against Trials (x-axis, 25 to 700), with one curve per delay of reinforcement: 0 s, 0.5 s, 1.2 s, 2 s, 5 s, 10 s]

Overcoming the effects of delay

• Secondary reinforcers

• “Marking” procedure

Lieberman, McIntosh & Thomas (1979)

Effect on Rate of Behavior

                        Reinforcement          Punishment
Positive contingency    Chocolate Bar          Electric Shock
Negative contingency    Excused from Chores    No TV privileges
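The two-by-two layout above can be read as a lookup: the sign of the contingency and the direction of the change in behavior together name the procedure. The sketch below is only an illustration; the function name `classify` and the string labels are assumptions, not terms taken from the slides.

```python
def classify(contingency: str, effect: str) -> str:
    """Name the procedure from the contingency and its effect on behavior.

    contingency: "positive" (the response produces the event) or
                 "negative" (the response removes or prevents the event).
    effect:      "increase" or "decrease" in the rate of behavior.
    """
    sign = {"positive": "Positive", "negative": "Negative"}[contingency]
    kind = {"increase": "reinforcement", "decrease": "punishment"}[effect]
    return f"{sign} {kind}"


# The four examples from the table, encoded with the hypothetical labels above.
EXAMPLES = {
    "Chocolate Bar": ("positive", "increase"),
    "Electric Shock": ("positive", "decrease"),
    "Excused from Chores": ("negative", "increase"),
    "No TV privileges": ("negative", "decrease"),
}

for name, (contingency, effect) in EXAMPLES.items():
    print(f"{name}: {classify(contingency, effect)}")
```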

Professor Drew

Anticipatory Contrast - Crespi (1942)

[Figure: Running Speed (ft/sec, y-axis, 0 to 4.5) against Trials (x-axis, ticks 2 to 20 followed by 2 to 8) for three groups: 256-16 Pellets, 16-16 Pellets, 1-16 Pellets]

Rats run down a maze to find food pellets in the goal arm.

What is a reinforcer?

Operational Definition (behaviorists): That which increases the probability of the response that preceded it.

Thorndike: A stimulus that produces a “satisfying state of affairs”

Drive Reduction Theory

[Diagram: the amount of H2O in the body is compared with a set point; the result drives the animal to seek water or not seek water]
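The set-point comparison can be sketched as a simple comparator. This is a toy illustration only: the function names, the linear form of the drive, and the numeric units are all assumptions made for the example, not part of the theory as stated on the slide.

```python
def drive(water_level: float, set_point: float) -> float:
    """Drive strength: how far body water has fallen below the set point.

    Returns 0.0 when the level is at or above the set point (no deficit).
    The linear form is an assumption made for illustration.
    """
    return max(0.0, set_point - water_level)


def seek_water(water_level: float, set_point: float) -> bool:
    """The animal seeks water only while a deficit (a drive) exists."""
    return drive(water_level, set_point) > 0


# Hypothetical units: a set point of 40 and a current level of 38.
print(drive(38.0, 40.0), seek_water(38.0, 40.0))
print(drive(40.0, 40.0), seek_water(40.0, 40.0))
```

On this view drinking is reinforcing because it reduces the deficit back toward zero.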

Drive Reduction Considered: Are reinforcers necessary for survival?

– Eating to excess

– Drugs of Abuse

– “Pleasure centers” of the brain

Behavioral Regulation View: The Premack Principle

• Behaviors are reinforcing, not stimuli

• To predict what will be reinforcing, observe the baseline frequency of different behaviors

• Highly probable behaviors will reinforce less probable behaviors
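The prediction rule in the bullets above can be sketched as a comparison of baseline frequencies. The behaviors and the baseline numbers below are hypothetical, chosen only to show the rule, and `premack_reinforcers` is an illustrative name.

```python
def premack_reinforcers(baseline, target):
    """Return behaviors predicted to reinforce `target` under the Premack
    principle: every behavior with a higher baseline frequency, listed
    from most to least probable."""
    higher = [b for b, freq in baseline.items() if freq > baseline[target]]
    return sorted(higher, key=lambda b: -baseline[b])


# Hypothetical baseline: minutes per hour spent on each behavior
# when all are freely available.
baseline = {"drink": 25, "run in wheel": 10, "press lever": 2}

print(premack_reinforcers(baseline, "press lever"))
print(premack_reinforcers(baseline, "drink"))
```

Under this rule the most probable behavior in the baseline has no predicted reinforcer, which is the limitation the response deprivation hypothesis addresses.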

Premack Revised: The Response Deprivation Hypothesis

• Low frequency behaviors can reinforce high frequency behaviors (and vice versa)

• All behaviors have a preferred frequency = the behavioral bliss point

• Deprivation below that frequency is aversive, and organisms will work to remedy this
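One common way to state the deprivation condition is as an inequality: a schedule that demands I units of the instrumental response for C units of the contingent response deprives the organism of the contingent response when I/C exceeds the baseline ratio of the two behaviors. The slides do not give this formula, so treat it, the function name, and the numbers below as assumptions.

```python
def is_response_deprived(i_required, c_allowed, baseline_i, baseline_c):
    """True if the schedule holds the contingent response below baseline.

    The condition I/C > baseline_i/baseline_c is cross-multiplied here so
    that no division is needed. All arguments are assumed to be positive.
    """
    return i_required * baseline_c > baseline_i * c_allowed


# Hypothetical baseline: 25 min of drinking and 10 min of running per hour.
# Schedule A: 5 min of drinking buys 1 min of running. Running is the
# low-frequency behavior, yet it is deprived, so it can act as a reinforcer.
print(is_response_deprived(5, 1, baseline_i=25, baseline_c=10))

# Schedule B: 1 min of drinking buys 1 min of running. No deprivation,
# so no reinforcement effect is predicted.
print(is_response_deprived(1, 1, baseline_i=25, baseline_c=10))
```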

Timberlake & Allison (1974)

Response deprivation hypothesis

[Figure: the ice cream scale (in pints), marked from .25 to 2.5, with the bliss point at 1.0 pints/night. Below the bliss point: will work to obtain ice cream. Above the bliss point: will work to avoid ice cream.]

Contiguity versus Contingency in operant conditioning

Degraded Contingency Effect

[Figure legend: one symbol marks a bar press, another marks food]

Perfect contingency: strong responding

Degraded contingency: weak responding
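Contingency is often quantified as the difference between the probability of food after a bar press and the probability of food without one. The slides do not name this measure, so the ΔP formula, the function name `delta_p`, and the made-up records below are assumptions used only to illustrate how free food degrades the contingency.

```python
def delta_p(records):
    """Contingency as P(food | press) - P(food | no press).

    records: list of (pressed, food) booleans, one per time bin.
    Assumes at least one bin with a press and one without.
    """
    food_after_press = [food for pressed, food in records if pressed]
    food_without_press = [food for pressed, food in records if not pressed]
    p_press = sum(food_after_press) / len(food_after_press)
    p_no_press = sum(food_without_press) / len(food_without_press)
    return p_press - p_no_press


# Hypothetical records. Perfect contingency: food follows every press
# and never arrives otherwise.
perfect = [(True, True)] * 5 + [(False, False)] * 5

# Degraded contingency: the same presses, plus free food in bins
# without a press.
degraded = perfect + [(False, True)] * 5

print(delta_p(perfect), delta_p(degraded))
```

In both record sets every press is followed by food, so contiguity is unchanged; only the contingency differs.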

G.V. Thomas (1983)

Contiguity pitted against contingency

“Free” reinforcers given every 20s

Lever press advances delivery of pellet, but cancels pellet for next 20-s interval

So if you press at second 2, you get a pellet immediately, but you get no pellet during seconds 3-20 and 21-40.

[Figure: timeline marked at 20 s, 40 s, 60 s]
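The schedule described above can be sketched as a short simulation that lists when pellets arrive for a given set of lever presses. The function name, the 120-s session length, and the rule that presses during a cancelled interval have no effect are assumptions made for this illustration.

```python
def pellet_times(press_times, session_s=120, interval_s=20):
    """Return delivery times (in seconds) under the schedule sketched above.

    A free pellet arrives at the end of each interval. The first press in
    an interval delivers that pellet immediately and cancels the pellet of
    the next interval. Presses in a cancelled interval are ignored
    (an assumption of this sketch).
    """
    cancelled = set()
    deliveries = []
    for k in range(session_s // interval_s):
        start, end = k * interval_s, (k + 1) * interval_s
        if k in cancelled:
            continue
        presses = sorted(t for t in press_times if start < t <= end)
        if presses:
            deliveries.append(presses[0])  # pellet advanced to the press
            cancelled.add(k + 1)           # next interval's pellet is lost
        else:
            deliveries.append(end)         # free pellet at the usual time
    return deliveries


print(pellet_times([]))   # never pressing
print(pellet_times([2]))  # pressing at second 2
```

Pressing brings one pellet closer in time (better contiguity) but lowers the total number of pellets earned (negative contingency).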

[Figure: timeline marked at 20 s, 40 s, 60 s, annotated “Lever press here” and “Lose this pellet”]

“Superstitious Behavior”

• Suggested that temporal contiguity is more important than contingency

• Fixed-time (FT) 15-s schedule, no response requirement

• “adventitious reinforcement”

“In 6 out of 8 cases the resulting responses were so clearly defined that two observers could agree perfectly in counting instances. One bird was conditioned to turn counter-clockwise about the cage, making 2 or 3 turns between reinforcements. Another repeatedly thrust its head into one of the upper corners of the cage….”

[Figure: labeled behaviors: orienting toward feeder, pecking near feeder, moving along wall, ¼ turn]
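Adventitious reinforcement on a fixed-time schedule can be illustrated with a toy simulation: food arrives every 15 s regardless of behavior, and whatever the bird happens to be doing at that moment becomes more likely. The strengthening rule (adding a fixed boost to a weight), the behavior list, and all parameter values are assumptions of this sketch, not a model taken from the slides.

```python
import random


def simulate_ft(behaviors, session_s=600, interval_s=15, boost=1.0, seed=0):
    """Toy model of a fixed-time schedule with no response requirement.

    Each second one behavior is sampled in proportion to its weight.
    Every `interval_s` seconds food is delivered, and the behavior that
    happens to be occurring is strengthened by `boost`.
    Returns the final weights and the number of food deliveries.
    """
    rng = random.Random(seed)
    weights = {b: 1.0 for b in behaviors}
    deliveries = 0
    for t in range(1, session_s + 1):
        current = rng.choices(list(weights), weights=list(weights.values()))[0]
        if t % interval_s == 0:
            weights[current] += boost  # accidental pairing with food
            deliveries += 1
    return weights, deliveries


weights, deliveries = simulate_ft(["turn", "head thrust", "peck"])
print(deliveries, weights)
```

Because early accidental pairings make a behavior more likely to be occurring at the next delivery, one behavior tends to pull ahead even though none of them produces the food.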

“Misbehavior” and the limits of operant conditioning

Limits of Operant Conditioning

• Some behaviors can’t be conditioned
– Yawning
– Scratching

• Belongingness
– Presentation of a female won’t reinforce biting

• “Misbehavior”

Marian Breland Bailey – How to train a chicken

The famous dancing chicken

What is learned in operant conditioning?

[Diagram: S-R association]

What is learned?

Edwin Guthrie: mere contiguity of a stimulus and a behavior stamps in that S-R; reinforcement is not necessary

[Diagram: S-R association]

What is learned?

Thorndike: Reinforcement “stamps in” this connection

[Diagram: S-R association, with outcome O]

What is learned?

[Diagram: S, R, and O, with “?” marking the associations in question]

2-Process Theory

[Diagram: S, R, and CR, with the associations labeled operant and Pavlovian]

Evidence for 2-process theory: Pavlovian-Instrumental Transfer

Phase 1          Phase 2          Test
Lever → Food     Light → Food     Light: # presses?
                                  No light: # presses?

[Figure: # Presses during Light versus No CS]

The presence of the CS intensifies operant responding

[Diagram: S, R, and O, with “?” marking the associations in question]

What is learned?

Does the Pavlovian S-O association activate a vague emotional state or a specific mental representation of the outcome?

Specific Outcome Representations: Trapold

Phase 1 (operant)     Phase 2 (classical)    Test
R Lever → Pellet      Tone → Pellet          Tone: Left? Right?
L Lever → Sucrose     Light → Sucrose        Light: Left? Right?

[Figure: # Presses on the Left and Right levers during Light and Noise]
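The design above separates two accounts of what the Pavlovian cue activates. The sketch below encodes the two training phases and lists which lever each account predicts the cue should boost at test. The dictionaries, the function name, and the labels "general" and "specific" are illustrative assumptions; no results are implied.

```python
# Phase 1 (operant): which outcome each lever earned.
OPERANT = {"right": "pellet", "left": "sucrose"}

# Phase 2 (classical): which outcome each cue signaled.
PAVLOVIAN = {"tone": "pellet", "light": "sucrose"}


def predicted_levers(cue, hypothesis):
    """Return the levers whose pressing the cue should increase at test.

    "general":  the cue evokes a vague emotional state, so pressing on
                every lever should rise.
    "specific": the cue evokes a representation of its own outcome, so
                only the lever that earned that outcome should rise.
    """
    if hypothesis == "general":
        return sorted(OPERANT)
    if hypothesis == "specific":
        outcome = PAVLOVIAN[cue]
        return sorted(lever for lever, o in OPERANT.items() if o == outcome)
    raise ValueError(f"unknown hypothesis: {hypothesis}")


for cue in PAVLOVIAN:
    print(cue, predicted_levers(cue, "general"), predicted_levers(cue, "specific"))
```

A selective increase on the lever that shares the cue's outcome would favor the specific-representation account.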