Chapter 6 – Schedules of Reinforcement and Choice Behavior
• Outline
  – Simple Schedules of Intermittent Reinforcement
    • Ratio Schedules
    • Interval Schedules
    • Comparison of Ratio and Interval Schedules
  – Choice Behavior: Concurrent Schedules
    • Measures of Choice Behavior
    • The Matching Law
  – Complex Choice
    • Concurrent-Chain Schedules
    • Studies of “Self Control”
• Simple Schedules of Intermittent Reinforcement
  • Ratio Schedules
    – RF depends only on the number of responses performed
  • Continuous reinforcement (CRF)
    – each response is reinforced
      • bar press = food
      • key peck = food
  • CRF is rare outside the lab.
    – Partial or intermittent RF
• Partial or intermittent Schedules of Reinforcement
• FR (Fixed Ratio)
  – a fixed number of operants (responses) is required
    • CRF is FR 1
  – FR 10 = every 10th response is RF
• Originally recorded using a cumulative record
  – Now computers
    • can be graphed similarly
• The cumulative record represents responding as a function of time
  – the slope of the line represents the rate of responding.
    • Steeper = faster
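The slope idea can be made concrete with a small sketch (the timestamps and the function name are illustrative, not from the text): the rate over any window of a cumulative record is just responses emitted divided by time elapsed.

```python
# Hypothetical response timestamps (seconds) from one session.
timestamps = [1, 2, 3, 5, 6, 8, 10, 15, 20, 30]

def rate_between(timestamps, t_start, t_end):
    """Slope of the cumulative record over (t_start, t_end]:
    responses emitted in the window divided by elapsed time."""
    count = sum(1 for t in timestamps if t_start < t <= t_end)
    return count / (t_end - t_start)

# A steeper early segment means a faster response rate.
early = rate_between(timestamps, 0, 10)   # 7 responses / 10 s = 0.7
late = rate_between(timestamps, 10, 30)   # 3 responses / 20 s = 0.15
```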
• Responding on FR schedules
  – Faster responding = sooner RF
    • So responding tends to be pretty rapid
  – Postreinforcement pause
• The postreinforcement pause is directly related to FR size.
  – Small FR = shorter pauses
    • FR 5
  – Large FR = longer pauses
    • FR 100 – animals wait a while before they start working.
  – Domjan points out this may have more to do with the upcoming work than the recent RF
    • Pre-ratio pause?
• How would you respond if you received $1 on an FR 5 schedule?
  – FR 500?
    • Post-RF pauses?
• RF-history explanation of the post-RF pause
  – Contiguity of the 1st response and RF
    • FR 5 – 1st response is close to RF – only 4 more
    • FR 100 – 1st response is a long way from RF – 99 more
• VR (Variable Ratio) schedules
  – The number of responses is still critical
  – It varies from trial to trial
    • VR 10 – reinforced on average for every 10th response
      – sometimes only 1 or 2 responses are required
      – other times 15 or 19 responses are required
• Example (# = response requirement):
    VR 10: 19 RF, 2 RF, 8 RF, 18 RF, 5 RF, 15 RF, 12 RF, 1 RF
    FR 10: 10 RF, 10 RF, 10 RF, 10 RF, 10 RF, 10 RF, 10 RF, 10 RF
• VR 10 – (19+2+8+18+5+15+12+1)/8 = 10
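A minimal simulation of the averaging idea (the uniform sampling range is an illustrative assumption; real VR schedules draw requirements from various distributions):

```python
import random

def vr_requirements(mean, n, rng):
    """Draw n response requirements that average `mean`.
    Uniform over 1..(2*mean - 1) is one simple illustrative choice."""
    return [rng.randint(1, 2 * mean - 1) for _ in range(n)]

rng = random.Random(0)
reqs = vr_requirements(10, 10000, rng)
avg = sum(reqs) / len(reqs)  # close to 10 over many trials

# The slide's example sequence averages exactly 10:
example = [19, 2, 8, 18, 5, 15, 12, 1]
assert sum(example) / len(example) == 10
```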
• VR = very little postreinforcement pause
  – why would this be?
• Slot machines
  – very lean schedule of RF
  – But – the next lever pull could result in a payoff.
• FI (Fixed Interval) schedule
  – 1st response after a given time period has elapsed is reinforced.
    • FI 10 s – 1st response after 10 s is RF.
      – RF waits for the animal to respond
      – responses prior to 10 s are not RF.
  – Scalloped responding pattern – the FI scallop
• Similarity of the FI scallop and the post-RF pause?
  – FI 10 s?
  – FI 120 s?
• The FI scallop has been used to assess animals’ ability to time.
• VI (Variable Interval) schedule
  – Time is still the important variable
  – However, the elapsed-time requirement varies around a set average
    • VI 120 s – time to RF can vary from a few seconds to a few minutes
• $1 on a VI 10-minute schedule for button presses?
  – Could be RF in seconds
  – Could be 20 minutes
  – Postreinforcement pause?
• Produces stable responding at a constant rate
  – peck..peck..peck..peck..peck
  – sampling whether enough time has passed
• The rate on a VI schedule is not as fast as on FR and VR schedules
  – why?
  – ratio schedules are based on responses.
    • faster responding gets you to the response requirement sooner, whatever it is.
  – On a VI schedule the number of responses doesn’t matter,
    • so a steady, even pace makes sense.
• Interval Schedules and Limited Hold
  – Limited-hold restriction
    • Must respond within a certain amount of time after the RF sets up
  – Like lunch at school
    • Too late and you miss it
• Comparison of Ratio and Interval Schedules
  – What if you hold RF constant?
    • Rat 1 = VR
    • Rat 2 = yoked control rat on VI
      – RF is set up when Rat 1 reaches his response requirement
        • If Rat 1 responds faster, RF will set up sooner for Rat 2
        • If Rat 1 is slower, RF will be delayed
• Why is responding faster on ratio schedules?
  – Molecular view
    • Based on moment-by-moment RF
    • Inter-response times (IRTs)
      – R1……………R2 RF » reinforces a long IRT
      – R1..R2 RF » reinforces a short IRT
    • More likely to be RF for short IRTs on VR than on VI
• Molar view
  – Feedback functions
    • The average RF rate during the session is the result of the average response rate
  – How can the animal increase reinforcement in the long run (across the whole session)?
    • Ratio – respond faster = more RF for that day
      – FR 30
      – Responding 1 per second = RF at 30 s
      – Responding 2 per second = RF at 15 s
• Molar view continued
  – Interval – no real benefit to responding faster
    • FI 30 s
    • Responding 1 per second = RF at 30 or 31 s (average 30.5 s)
    • Responding 2 per second = RF at 30 or 30.5 s (average 30.25 s)
  – Pay
    • Salary?
    • Clients?
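The molar feedback functions can be sketched in a few lines (the function names are illustrative, and the interval function is a simplification that caps RF rate at one per interval):

```python
# Molar feedback functions for ratio vs. interval schedules.

def ratio_rf_rate(response_rate, ratio):
    """FR/VR: reinforcers per second = responses per second / ratio."""
    return response_rate / ratio

def interval_rf_rate(response_rate, interval_s):
    """FI/VI (approximate): at most one RF per interval, and you still
    need to respond at least that often to collect each one."""
    return min(response_rate, 1 / interval_s)

# FR 30: doubling the response rate doubles the RF rate.
assert ratio_rf_rate(2.0, 30) == 2 * ratio_rf_rate(1.0, 30)
# FI 30 s: doubling the response rate leaves the RF rate unchanged.
assert interval_rf_rate(2.0, 30) == interval_rf_rate(1.0, 30)
```

This is why fast responding pays off on ratio schedules but not on interval schedules: only the ratio feedback function keeps climbing with response rate.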
• Choice Behavior: Concurrent Schedules
  – The responding that we have discussed so far has involved schedules where there is only one thing to do.
  – In real life we tend to have choices among various activities
  – Concurrent schedules
    • examine how an animal allocates its responding between two schedules of reinforcement
    • The animals are free to switch back and forth
• Measures of choice behavior
  – Relative rate of responding for the left key:
      BL / (BL + BR)
    • BL = behavior (responses) on the left key
    • BR = behavior (responses) on the right key
  – We are just dividing left-key responding by total responding.
• This computation is very similar to the computation for the suppression ratio.
  – If the animals are responding equally to each key, what should our ratio be?
      20 / (20 + 20) = .50
  – If they respond more to the left key?
      40 / (40 + 20) = .67
  – If they respond more to the right key?
      20 / (20 + 40) = .33
• Relative rate of responding for the right key
  – Will be the complement of the left-key value (not its reciprocal), but can also be calculated with the same formula:
      BR / (BR + BL)
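A minimal computation of the relative-rate measure (the function name is illustrative):

```python
def relative_rate(this_key, other_key):
    """B / (B_this + B_other): the fraction of all responses on one key."""
    return this_key / (this_key + other_key)

assert relative_rate(20, 20) == 0.5             # equal responding
assert round(relative_rate(40, 20), 2) == 0.67  # more to the left key
assert round(relative_rate(20, 40), 2) == 0.33  # more to the right key
# The two keys' relative rates are complements that sum to 1:
assert abs(relative_rate(20, 40) + relative_rate(40, 20) - 1.0) < 1e-9
```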
• Concurrent schedules
  – If VI 60 VI 60
  – The relative rate of responding for either key will be .5
    • Responding is split equally between the two keys
• What about the relative rate of reinforcement?
  – Left key?
    • Simply divide the rate of reinforcement on the left key by the total reinforcement:
        rL / (rL + rR)
• VI 60 VI 60?
  – If the animals are dividing responding equally?
  – .50 again
• The Matching Law
  – The relative rate of responding matches the relative rate of RF when the same VI schedule is used on each key
    • .50 and .50
  – What if different schedules of RF are used on each key?
    • Left key = VI 6 min (10 RF per hour)
    • Right key = VI 2 min (30 RF per hour)
  – Left-key relative rate of responding:
      BL / (BL + BR) = rL / (rL + rR) = 10/40 = .25 left
  – Right key? Simply the complement: .75
    • It can also be calculated:
        BR / (BR + BL) = rR / (rR + rL) = 30/40 = .75 right
  – Thus – three times as much responding on the right key: .25 × 3 = .75
• Matching law continued: a simpler computation.
      BL / BR = rL / rR = 10/30
  – again – three times as much responding on the right key
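The worked example can be checked in a few lines (names are illustrative):

```python
# Matching-law prediction for concurrent VI 6 min / VI 2 min.
def matched_share(r_this, r_other):
    """Predicted relative rate of responding: r / (r_this + r_other)."""
    return r_this / (r_this + r_other)

r_left, r_right = 10, 30  # reinforcers per hour (VI 6 min vs. VI 2 min)
assert matched_share(r_left, r_right) == 0.25
assert matched_share(r_right, r_left) == 0.75
# Ratio form BL / BR = rL / rR: three times as much responding on the right.
assert r_right / r_left == 3.0
```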
• Herrnstein (1961) compared various VI schedules – Matching Law.
• Figure 6.5 in your book
• Application of the matching law
  – The matching law indicates that we match our behaviors to the available RF in the environment.
  – Law, Bulow, and Meller (1998)
    • Predicted that adolescent girls who live in RF-barren environments would be more likely to engage in sexual behaviors
    • Girls who have a greater array of RF opportunities should allocate their behaviors toward those other activities
    • Surveyed girls about the activities they found rewarding and about their sexual activity
    • The matching law did a pretty good job of predicting sexual activity
  – Many kids today have a lot of RF opportunities.
    • May make it more difficult to motivate behaviors you want them to do – like homework
      » X-box
      » Texting friends
      » TV
• Complex Choice
  – Many of the choices we make require us to live with those choices
    • We can’t always just switch back and forth
      – Go to college?
      – Get a full-time job?
  – Sometimes the short-term and long-term consequences (RF) of those choices are very different
    • Go to college
      » Poor now; make more later
    • Get a full-time job
      » Money now; less earning in the long run
• Concurrent-Chain Schedules
  – Allow us to examine these complex choice behaviors in the lab
  – Example
    • Do animals prefer a VR or an FR?
      – Variety is the spice of life?
    • Choice A – 10 minutes on VR 10
    • Choice B – 10 minutes on FR 10
    • Subjects prefer the VR 10 over the FR 10
      – How do we know?
    • Subjects will even prefer VR schedules that require somewhat more responding than the FR
      – Why do you think that happens?
• Studies of Self-Control
  – Often a matter of delaying immediate gratification (RF) in order to obtain a greater reward (RF) later.
    • Study or go to a party?
    • Work in the summer to pay for school, or enjoy the time off?
• Self-control in pigeons?
  – Rachlin and Green (1972)
    • Choice A = immediate small reward
    • Choice B = 4-s delay, large reward
  – Direct-choice procedure
    • Pigeons choose the immediate, small reward
  – Concurrent-chain procedure
    • Pigeons could learn to choose the larger reward
      – Only if there is a long enough delay between the initial choice and the next link.
• This idea – that imposing a delay between a choice and the eventual outcomes helps organisms make “better” (higher-RF) choices – works for people too.
• Value-discounting function:
      V = M / (1 + KD)
  – V = value of the RF
  – M = magnitude of the RF
  – D = delay of the reward
  – K = a correction factor for how much the animal is influenced by the delay
  – All this equation is saying is that the value of a reward is inversely affected by how long you have to wait to receive it.
  – If there is no delay, D = 0
    • Then the value is simply the magnitude divided by 1
• If I offer you
  – $50 now or $100 now?
      50 / (1 + 1×0) = 50        100 / (1 + 1×0) = 100
  – $50 now or $100 next year (D = 12 months, K = 1)?
      50 / (1 + 1×0) = 50        100 / (1 + 1×12) = 7.7
• As noted above, K is a factor that allows us to correct these delay functions for individual differences in delay discounting
• People with steep delay-discounting functions will have a more difficult time delaying immediate gratification to meet long-term goals
  – Young children
  – Drug abusers
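A sketch of the value-discounting computation, using the slides' illustrative K = 1 (with delay in months):

```python
def discounted_value(magnitude, delay, k=1.0):
    """V = M / (1 + K*D): the value of a reward of size `magnitude`
    after `delay` time units. K is fitted per individual; K = 1 here
    is just the illustrative value from the worked example."""
    return magnitude / (1 + k * delay)

# $50 now vs. $100 now: no delay, so value = magnitude.
assert discounted_value(50, 0) == 50
assert discounted_value(100, 0) == 100
# $50 now vs. $100 in a year (D = 12 months): the delayed $100 is worth
# only about $7.7, so the immediate $50 wins.
assert discounted_value(50, 0) > discounted_value(100, 12)
assert round(discounted_value(100, 12), 1) == 7.7
# A steeper discounter (larger K) devalues delayed rewards even more.
assert discounted_value(100, 12, k=2.0) < discounted_value(100, 12, k=1.0)
```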
• Madden, Petry, Badger, and Bickel (1997)
  – Two groups
    • Heroin-dependent patients
    • Controls
  – Offered hypothetical choices
    • smaller $ – now
    • more $ – later
  – Amounts varied
    • $1,000, $990, $960, $920, $850, $800, $750, $700, $650, $600, $550, $500, $450, $400, $350, $300, $250, $200, $150, $100, $80, $60, $40, $20, $10, $5, and $1
  – Delays varied
    • 1 week, 2 weeks, 2 months, 6 months, 1 year, 5 years, and 25 years.
• The matching relation has been described mathematically in the following way (Baum, 1974):
      RA / RB = b (rA / rB)^a
  – RA and RB refer to the rates of responding on keys A and B (i.e., left and right)
  – rA and rB refer to the rates of reinforcement on those keys
  – When the value of the exponent a is equal to 1.0, a simple matching relationship occurs, in which the ratio of responses perfectly matches the ratio of reinforcers obtained.
  – The variable b adjusts for response-effort differences between A and B when they are unequal, or for unequal reinforcers on A and B.
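Baum's generalized matching law can be sketched as follows (the function name is illustrative; a and b default to strict matching):

```python
def predicted_response_ratio(r_a, r_b, a=1.0, b=1.0):
    """Generalized matching law (Baum, 1974): R_A / R_B = b * (r_A / r_B) ** a.
    a is the sensitivity exponent, b the bias; a = b = 1 gives strict matching."""
    return b * (r_a / r_b) ** a

# Strict matching: the response ratio equals the reinforcement ratio.
assert predicted_response_ratio(30, 10) == 3.0
# Undermatching (a < 1): the response ratio is less extreme than the RF ratio.
assert predicted_response_ratio(30, 10, a=0.8) < 3.0
# Bias toward key A (b > 1) shifts responding even when RF rates are equal.
assert predicted_response_ratio(10, 10, b=1.5) == 1.5
```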