JOURNAL OF THE EXPERIMENTAL ANALYSIS OF BEHAVIOR 2003, 80, 1–27 NUMBER 1 (JULY)
PUNISHMENT IN HUMAN CHOICE: DIRECT OR COMPETITIVE SUPPRESSION?
THOMAS S. CRITCHFIELD, ELLIOTT M. PALETZ, KENNETH R. MACALEESE, AND M. CHRISTOPHER NEWLAND
ILLINOIS STATE UNIVERSITY, AUBURN UNIVERSITY, AND UNIVERSITY OF NEVADA–RENO
This investigation compared the predictions of two models describing the integration of reinforcement and punishment effects in operant choice. Deluty's (1976) competitive-suppression model (conceptually related to two-factor punishment theories) and de Villiers' (1980) direct-suppression model (conceptually related to one-factor punishment theories) have been tested previously in nonhumans but not at the individual level in humans. Mouse clicking by college students was maintained in a two-alternative concurrent schedule of variable-interval money reinforcement. Punishment consisted of variable-interval money losses. Experiment 1 verified that money loss was an effective punisher in this context. Experiment 2 consisted of qualitative model comparisons similar to those used in previous studies involving nonhumans. Following a no-punishment baseline, punishment was superimposed upon both response alternatives. Under schedule values for which the direct-suppression model, but not the competitive-suppression model, predicted distinct shifts from baseline performance, or vice versa, 12 of 14 individual-subject functions, generated by 7 subjects, supported the direct-suppression model. When the punishment models were converted to the form of the generalized matching law, least-squares linear regression fits for a direct-suppression model were superior to those of a competitive-suppression model for 6 of 7 subjects. In Experiment 3, a more thorough quantitative test of the modified models, fits for a direct-suppression model were superior in 11 of 13 cases. These results correspond well to those of investigations conducted with nonhumans and provide the first individual-subject evidence that a direct-suppression model, evaluated both qualitatively and quantitatively, describes human punishment better than a competitive-suppression model. We discuss implications for developing better punishment models and future investigations of punishment in human choice.
Key words: punishment, concurrent schedules, one factor theory, two factor theory, mouse click, humans
There is widespread agreement in the social sciences that behavior is influenced by its benefits and costs (e.g., Akers, 1994; Davison, 1991; Ehrlich, 1996; Eysenck, 1967; Gray, Stafford, & Tallman, 1991; Kahneman & Tversky, 1979; Leung, 1995; Lohman, 1997; Neilson, 1998), but the specifics of the interaction are rarely addressed with much precision. Within operant psychology, it is axiomatic to assume that behavior is jointly determined by reinforcement, which increases behavior frequency, and punishment, which decreases behavior frequency, but exactly how the two combine to influence behavior output often remains unstated (e.g., Skinner, 1953; see also Lerman & Vorndran, 2002).

We are indebted to Mei Sho Jang for computer programming, Alejandro Lazarte for statistical advice, and Scott Lane for helpful comments on a draft of the manuscript. The research was supported at Illinois State University by a College of Arts and Sciences Small Grant for Research and a Pre-Tenure Faculty Initiative Grant; and at Auburn University by two Discretionary Grants-in-Aid of Research, with additional funds from the College of Liberal Arts and the Department of Psychology (Bill Hopkins, Chair). For data collection assistance at various stages of the investigation, we thank Beth Burnett, Stacy Dawson, Christina Deutsch, Karen Green, Kelly Grisham, Carmen Lee, Karen Mowrey, Bethany Munge, and Kelly Parrish. Portions of the data were presented at the meetings of the Psychonomic Society (1996), the International Conference on Comparative Cognition (1997), the Association for Behavior Analysis (1998, 2002), and the Mid-American Association for Behavior Analysis (2001).

M. Christopher Newland and Elliott M. Paletz are at Auburn University; Kenneth R. MacAleese is at the University of Nevada–Reno.

Address correspondence to T. Critchfield, Department of Psychology, Illinois State University, Normal, Illinois 61790-4620 (e-mail: [email protected]).
Explicit attention to the reinforcement-punishment interaction can be found in two variations on Herrnstein's (1970) matching law

B_x/(B_x + B_y) = R_x/(R_x + R_y)    (1)
in which the B terms refer to response rates for concurrently available response options x and y, and the R terms refer to rates of reinforcement for those alternatives. In the tradition of single-factor theories of punishment
(e.g., Rachlin & Herrnstein, 1969; Thorndike, 1911), de Villiers (1980) proposed that punishers (P terms below) directly reduce the strength of reinforced responding:
B_x/(B_x + B_y) = (R_x - P_x)/((R_x - P_x) + (R_y - P_y)).    (2)
The de Villiers model, thus, can be termed a direct-suppression model.
By contrast, Deluty (1976) proposed a model that, in the tradition of two-factor theories of punishment (e.g., Bolles, 1967; Dinsmoor, 1954; Estes, 1944; Mowrer, 1947; Rescorla & Solomon, 1967), assumes that punishment of one behavior increases the relative value of reinforcement for other behaviors. Deluty's model, which can be termed a competitive-suppression model, assumes that punishers for alternative x supplement the reinforcement rate for alternative y, and vice versa:
B_x/(B_x + B_y) = (R_x + P_y)/((R_x + P_y) + (R_y + P_x)).    (3)
Only a few experiments have compared the predictions of these models. Before examining those experiments, it is important to note that neither Equation 2 nor Equation 3 is based on the contemporary standard for describing concurrent schedule performance, the generalized matching law (Baum, 1974, 1979), in which relative behavior and consequence rates are expressed as logarithmically transformed ratios rather than as proportions and are modulated by two fitted parameters. The generalized matching law is superior to Equation 1 in accounting for systematic deviations from strict matching (predominantly undermatching; e.g., Baum, 1979; Kollins, Newland, & Critchfield, 1997), but it creates certain ambiguities for punishment models that we will address later in this report.
Because models based on Equation 1 do not accommodate the systematic deviations from matching that characterize most concurrent schedule performances, Equations 2 and 3 can be compared in terms of qualitative (directional), but not quantitative (point), predictions. de Villiers (1980, 1982) described an approach in which the logic of mathematical inequalities is applied to yield qualitative predictions. Consider a case in which the same rate of punishment is applied to two response alternatives, x and y, with unequal reinforcement rates (i.e., R_x > R_y). According to Equation 2, R_y is discounted proportionally more than R_x, leading to increased preference for x. According to Equation 3, R_x and R_y are augmented equally in absolute terms, thus becoming more similar than during baseline, which leads to reduced preference for x. In this fashion, therefore, the models can be compared without reference to point predictions.
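This directional logic can be checked numerically. The sketch below (Python; the schedule values and function names are our own illustrative choices, not taken from the studies discussed) evaluates Equations 1 through 3 for a hypothetical pair of alternatives earning 240 and 120 reinforcers per hour, with an equal punishment rate of 60 per hour superimposed on both:

```python
def matching(rx, ry):
    """Equation 1: baseline proportion of behavior allocated to alternative x."""
    return rx / (rx + ry)

def direct(rx, ry, px, py):
    """Equation 2: punishers subtract directly from reinforcement value."""
    return (rx - px) / ((rx - px) + (ry - py))

def competitive(rx, ry, px, py):
    """Equation 3: punishers for one alternative add to the other's value."""
    return (rx + py) / ((rx + py) + (ry + px))

# Hypothetical rates (per hour): Rx = 240, Ry = 120, equal punishment P = 60.
baseline = matching(240, 120)        # about 0.667: baseline preference for x
eq2 = direct(240, 120, 60, 60)       # 0.750: preference for x increases
eq3 = competitive(240, 120, 60, 60)  # 0.625: preference for x decreases
```

Equal punishment discounts the leaner alternative proportionally more under Equation 2 (preference shifts toward x) but makes the alternatives more similar under Equation 3 (preference shifts away from x), which is exactly the directional disagreement that the qualitative tests exploit.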
Only two investigations (de Villiers, 1980; Farley, 1980) have unambiguously compared the models represented by Equations 2 and 3. The limited evidence from these studies, evaluated using de Villiers' (1980, 1982) qualitative approach, strongly favors the direct-suppression model of Equation 2. Only Deluty (1976, 1982; see also Deluty & Church, 1978) has claimed empirical support for the competitive-suppression model, but without directly comparing models. In fact, Equations 2 and 3 make nearly identical predictions under the schedule values employed by Deluty (e.g., de Villiers, 1980). Thus the Deluty studies do not provide a meaningful test of the models.
All of the aforementioned data were obtained from nonhumans, and it is important to determine the extent to which animal-based principles apply to human behavior. For example, it is interesting to note that Skinner's (e.g., 1953) influential writings about punishment in human affairs appear to endorse a competitive-suppression perspective. Another reason to be interested in human outcomes concerns a limitation of studies with nonhumans. Equations 2 and 3 assign equal weights to reinforcers and punishers, implying that reinforcers and punishers have equal impact upon behavior, but this is a dubious assumption when qualitatively different events are employed as reinforcers (e.g., food) and punishers (e.g., electric shock). In studies with nonhumans, therefore, the predictions of the two models are blurred by unknown functional magnitudes of the consequences (Farley & Fantino, 1978). In human studies, however, it is possible to program money-gain reinforcers and money-loss punishers at values nominally consistent with the equal-magnitude assumptions of Equations 2 and 3.
Apparently only one published study
Table 1

Session duration, changeover delay (COD), and details of compensation in Experiments 2 and 3; see text for additional information about money contingencies. Values varied across subjects in Experiment 1 (see Appendix A). Base pay refers to payment for attending experimental sessions.

                                                              Consequences (cents)
Experiment  Part  Session (min)  COD (s)  Base pay per hour  Reinforcers  Punishers
2           A     10             0.5      $1.50              3            3
            B     10             0.5      $1.50              3            3
            C     8              2.0      —                  7            7
3           A     8              2.0      —                  8            8
            B     15             0.5      $1.50              2            2
(Bradshaw, Szabadi, & Bevan, 1979) has examined the effects of punishment in human free-operant choice. Variable-ratio money-loss punishment was superimposed upon one response option in a concurrent variable-interval (VI) VI schedule of money-gain reinforcement, but models like Equations 2 and 3 were not applied to the data, and the published report lacks information (e.g., obtained rates of consequences) necessary to reanalyze the results in these terms. An additional study employed punishment in a group-comparison design involving discrete-trial choice procedures (Gray et al., 1991). Subjects participated for 50 trials in one of several conditions (N = 5 per group) across which the probability and magnitude of reinforcement and punishment were varied. Because each individual participated in only one condition, models similar to Equations 2 and 3 were fitted to group-aggregate functions. A variant of the direct-suppression model provided a better fit to the group functions than a variant of the competitive-suppression model. Whether the same would be true for individual functions is not known.
It remains to be determined, therefore, whether animal-based models of punishment in choice adequately describe individual human behavior. The present investigation sought to generate new human data relevant to direct-suppression and competitive-suppression punishment models, and was designed to incorporate parallels with the procedures of studies conducted with nonhumans. In particular, the present investigation employed free-operant procedures (unlike Gray et al., 1991), a changeover delay that applied to reinforcement and punishment schedules alike (unlike Bradshaw et al., 1979, and Gray et al.), and VI punishment (unlike Bradshaw et al.). The present study retained one important feature of previous human studies by using money-based reinforcers and punishers of equal magnitude.
The general experimental strategy was as follows. In the first experiment, a brief manipulation check was conducted to verify that money loss functioned as punishment. Experiment 2 compared Equations 2 and 3 using the qualitative approach of de Villiers (1980, 1982), and incorporated a first attempt to compare direct-suppression and competitive-suppression models based on the generalized matching law. Experiment 3 generated data sets better suited to evaluating models based on the generalized matching law.
GENERAL METHOD

This research was conducted over approximately a 7-year period at two institutions, resulting in two types of procedural differences across studies that are summarized in Table 1. First, session durations became shorter across experiments (the studies were not conducted in the order reported here) as we learned that workable data could be obtained in briefer observation periods. Second, details of subject compensation varied according to the dictates of local institutional review boards and local economies. In particular, the value (in cents) of reinforcers and punishers varied across studies to assure that total earnings approximated the federal minimum wage.
Subjects and Apparatus

Undergraduate students (subjects numbered 500 and above at Illinois State University, the rest at Auburn University) volunteered for research on "Choice and Problem-Solving." Potential volunteers were initially contacted by telephone after responding to recruitment flyers. They provided informed consent after visiting the laboratory and completing a brief sample session with the experimental task. Subjects agreed to participate for a minimum of 20 hr and a maximum of 40 hr. The actual duration of each subject's participation was influenced by how quickly stable data were obtained during the experimental sessions and by the vagaries of participant and university schedules.
Subjects were asked to leave personal belongings such as watches, calculators, and backpacks outside the workroom during experimental sessions. Each subject worked alone in an office-sized room containing a chair, a table supporting a 14-in. (35.56-cm) computer monitor on which stimuli were presented, and a computer mouse that was used to register responses. An IBM-compatible computer in an adjacent room controlled experimental events and collected the data via a custom program written in QuickBasic.
Procedure
Typically, subjects visited the laboratory 3 to 5 days per week and completed between 6 and 12 sessions per visit. For session durations see Table 1 and Appendix A. Subjects could take breaks between sessions.
Experimental task. The concurrent-schedules task was based on that of Madden and Perone (1999). Sessions began with the display of a message stating, "Click here to begin." Clicking the message caused it to be replaced by two white rectangles separated by a thin black line. Each rectangle occupied approximately one half of the screen except for the top 2.5 cm, which remained black. An arrow-shaped cursor indicated the virtual position of the mouse at all times during the session. In the center of each white rectangle was a colored target approximately 0.5 cm square. Clicking either target once set both targets into motion. Targets moved about 0.5 cm per second in a randomly determined direction. Clicks within the borders of a target registered responses upon which the reinforcement and punishment schedules were based. Clicks elsewhere were ineffective and were not counted.
Normally, a money counter, located in the center of the top, black region of the screen, displayed total session earnings in numerals about 1 cm high. To the left and right of this central counter were additional counters, in numerals about 0.5 cm high, displaying money outcomes specific to schedules operating in the two white regions. On each side, counters were labeled, "Money earned this side = ____" and "Money lost this side = ____." All counters registered zero at the start of each session, and the appropriate counters incremented or decremented with the occurrence of each reinforcer or punisher. The omission of the money counters served as the experimental manipulation in a portion of Experiment 2.
Schedules and consequences. Reinforcers (money gains) and punishers (money losses) occurred according to independent VI schedules arranged using constant-probability distributions (Fleshler & Hoffman, 1962). Within a condition, reinforcement schedules made available approximately 360 total reinforcers per hour of session time, based on programmed schedule values. The aggregate punishment rate depended on the design of the individual experiments.
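Constant-probability intervals of this kind can be generated from the Fleshler and Hoffman (1962) progression, which divides an exponential distribution with the desired mean into N equal-probability slices and takes the expected interval within each slice. The sketch below is our own implementation of that published formula, not code from the present study; the function name and parameters are illustrative:

```python
import math

def fleshler_hoffman(mean_s, n_intervals):
    """Return the interval durations (s) of a Fleshler-Hoffman VI progression
    with the given mean; in practice, intervals are drawn from this list at random."""
    def xlnx(x):                      # convention: 0 * ln(0) = 0
        return x * math.log(x) if x > 0 else 0.0
    n = n_intervals
    return [mean_s * (1 + math.log(n) + xlnx(n - k) - (n - k + 1) * math.log(n - k + 1))
            for k in range(1, n + 1)]

intervals = fleshler_hoffman(40.0, 12)   # e.g., a VI 40-s punishment schedule
```

The mean of the list equals the programmed VI value exactly, and the momentary probability of an arranged consequence stays roughly constant over time, which is what makes the schedule "constant probability."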
Within a screen location (side), reinforcement and punishment VI schedules operated conjointly. Across screen locations, schedules operated concurrently. A changeover delay (COD) precluded the adventitious reinforcement or punishment of switching sides. During the COD, the VI schedules and the session clock were suspended, and responses, although recorded, were ineffective. The COD was as specified in Table 1, with one exception. For S272 (Experiment 2, Parts A and B), preference for the richer of two VI schedules was not evident during the early stages of the initial experimental condition. Consequently, the COD was gradually increased to 6 s until preference became apparent, and thereafter the COD remained constant throughout this subject's participation.
If a reinforcer or punisher became available on one side while responding took place on the other, the relevant VI timer was suspended until a changeover occurred and the COD was completed. Thereafter, the first response produced the consequence and restarted the VI timer. If a reinforcer and punisher both became available on one side while responding took place on the other side, the consequence that became available first was delivered contingent upon the first response after a changeover and COD. The second consequence was delivered contingent upon the next response on the same side. If a reinforcer and punisher for one response option became available simultaneously, the order of delivery was determined randomly.
Reinforcers and punishers were signaled by a 1-s flashing alternation (0.25 s per flash) of the most recently clicked target with a message indicating the amount of money gain (e.g., "+3¢," printed in black) or loss (e.g., "-3¢," printed in red). During feedback messages, the cursor disappeared from the screen, the VI and session clocks were suspended, and mouse clicks were ineffective.
Experimental designs. All experiments incorporated no-punishment baselines in which VI reinforcement schedules operated for both response alternatives. Associated with each baseline were at least two additional conditions in which punishment was superimposed upon both of the response alternatives.
Discriminative stimuli. The schedules assigned to each response alternative operated in a single screen location (left or right), as described above, and the moving target associated with each location was distinctly colored. Target colors remained constant within a condition. Across conditions, target colors were assigned, without replacement, from a pool incorporating 16 different hues, then were recycled as necessary to produce a unique pair of colors in all conditions.
Stability criteria. A condition was terminated when one of three criteria was met: (a) Visual inspection of graphed data revealed no systematic trends in either response or time allocation proportions, and, for both proportions, over four consecutive sessions, the difference in means between the first and second pair of sessions differed by no more than 10% of the four-session mean; (b) all response and time proportions in four consecutive sessions were less than 0.1 or greater than 0.9, suggesting floor or ceiling effects; or (c) stability was not achieved according to the above criteria within 15 sessions. In this last eventuality, the condition already in progress continued through the end of the day's visit to the laboratory, and a new condition began at the start of the next visit.
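Criteria (a) and (b) lend themselves to a direct computational check. The sketch below (Python; the function names are ours) implements the 10% two-pair rule and the floor/ceiling rule for the last four sessions; the visual trend inspection in criterion (a) is not modeled:

```python
def meets_ten_percent_rule(last_four):
    """Criterion (a), numeric part: the means of the first and second pairs of the
    last four sessions differ by no more than 10% of the four-session mean."""
    assert len(last_four) == 4
    overall = sum(last_four) / 4
    first_pair = sum(last_four[:2]) / 2
    second_pair = sum(last_four[2:]) / 2
    return abs(first_pair - second_pair) <= 0.10 * overall

def floor_or_ceiling(last_four):
    """Criterion (b): all four proportions below 0.1 or above 0.9."""
    return all(p < 0.1 for p in last_four) or all(p > 0.9 for p in last_four)

meets_ten_percent_rule([0.50, 0.52, 0.51, 0.49])   # True: stable allocation
meets_ten_percent_rule([0.20, 0.40, 0.60, 0.80])   # False: trending allocation
floor_or_ceiling([0.95, 0.96, 0.97, 0.99])         # True: ceiling effect
```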
Instructions. The informed consent agreement stated that:

The researchers hope to learn about how individuals make choices based on their experience in ambiguous situations. You will view information on a computer screen and make decisions by pressing buttons on a mouse. The decisions you make sometimes will result in money rewards or penalties. You will not be given extensive instructions but rather will be asked to learn from experience as you work.
At the start of the first session, subjects were told which mouse button to press and were given the following instructions:

You will see that the screen is divided into two separate sections, one on the left and one on the right. Two different colored squares move about on the screen, but each will stay within its respective section. With the mouse, you may click either square as much or as little as you like. Money from both squares counts toward your overall earnings. The squares pay off differently. It is up to you to decide when, and how often, to click each square. Try to earn as much money as you can.
Data Reduction and Presentation
For all experiments, we present terminal data, defined as the mean of the final four sessions in a condition. Because response-allocation and time-allocation outcomes were quite similar, for economy of presentation, graphic displays and model tests focus on response allocation. Time allocation data are presented in the appendixes.
Model predictions were based on obtained rates of reinforcement and punishment for individual subjects. Occasionally, obtained punishment rates in a condition equaled or exceeded obtained reinforcement rates for a response option. Such outcomes create calculation problems for direct-suppression models (e.g., see Davison & McCarthy, 1988; Gray et al., 1991), a matter that we will address in the General Discussion. For analyses based on Equation 2 (see Experiment 1), high punishment rates may lead to predictions of preference greater than 1.0 or less than zero. In such cases, for purposes of graphic display, the prediction was considered to be the ceiling (1) or floor (0) of the measurement scale. In Experiments 2 and 3, punishment models were modified to take the form of the generalized matching law, which utilizes the natural logarithm of consequence and behavior ratios. In direct-suppression model analyses, high punishment rates may yield undefined predictions because only positive ratio values can be so transformed. Conditions in which punishment rate equaled or exceeded reinforcement rate were excluded from analyses based on the generalized matching law.

Fig. 1. Experiment 1: Response rates during a no-punishment baseline (NP) and with punishment (P) superimposed on one of the response options. Punishment rate was either 50% or 80% of reinforcement rate, based on programmed schedule values. The left and right bar of each adjacent pair represent the left and right response alternative, respectively. For punishment conditions, the punished alternative is shaded. Note that ordinate scaling differs across subjects.
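Analyses of this form reduce to ordinary least-squares regression in log-log coordinates: ln(B_x/B_y) = a ln(X) + ln b, where X is a composite consequence ratio. A minimal sketch follows (Python; the function names and synthetic schedule values are ours, and the exact modified models fitted in the paper may differ). It compares a direct-suppression ratio, (R_x - P_x)/(R_y - P_y), with a competitive-suppression ratio, (R_x + P_y)/(R_y + P_x):

```python
import math

def fit_gml(behavior_ratios, consequence_ratios):
    """Least-squares fit of ln(Bx/By) = a * ln(X) + ln(b); returns (a, ln_b, r_squared).
    All ratios must be positive, mirroring the exclusion rule described in the text."""
    xs = [math.log(x) for x in consequence_ratios]
    ys = [math.log(b) for b in behavior_ratios]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    ln_b = my - a * mx
    ss_res = sum((y - (a * x + ln_b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return a, ln_b, 1 - ss_res / ss_tot

def direct_ratio(rx, ry, px, py):        # defined only when Rx > Px and Ry > Py
    return (rx - px) / (ry - py)

def competitive_ratio(rx, ry, px, py):
    return (rx + py) / (ry + px)

# Synthetic conditions (Rx, Ry, Px, Py in events per hour), with behavior ratios
# generated from the direct-suppression form (sensitivity a = 0.8, bias b = 1):
conds = [(240, 120, 30, 60), (240, 120, 120, 60), (300, 60, 75, 15), (180, 180, 30, 90)]
b_ratios = [direct_ratio(*c) ** 0.8 for c in conds]
a_d, _, r2_d = fit_gml(b_ratios, [direct_ratio(*c) for c in conds])
a_c, _, r2_c = fit_gml(b_ratios, [competitive_ratio(*c) for c in conds])
# Here r2_d is 1.0 by construction and exceeds r2_c, the pattern the model
# comparisons look for when the direct-suppression form describes the data better.
```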
EXPERIMENT 1
This brief study served as a manipulation check for subsequent experiments. In particular, in Experiment 2 programmed rates of reinforcement and punishment were employed for which, according to de Villiers' (1980) qualitative approach to model comparison, only one punishment model predicted a distinct shift from baseline choice patterns. Thus the absence of a preference shift could reflect either a failure of the model under consideration or simply the use of ineffective punishers. Although the loss of points worth money has proven to be an effective consequence in other studies of aversive control, most such studies have involved negative reinforcement (Crosbie, 1998). Thus this study was undertaken to verify that money loss would serve as a punisher in the context of the present procedures.
METHOD
Eight individuals who also participated in Experiment 2 or 3 served as subjects. Session duration, changeover delay, and details of money contingencies are shown for each subject in Appendix A. The 6 subjects in Part A completed a no-punishment baseline followed by punishment of one response option. The baseline involved a 1:1 reinforcement ratio (VI 20 s VI 20 s), and VI 40-s punishment (50% of reinforcement rate) was applied to the response option that generated the higher response rate during baseline. The purpose of Part A was to verify that money loss reduced rates of the behavior on which it was contingent. The 2 subjects in Part B completed a VI 20-s VI 20-s baseline plus conditions in which VI 40-s punishment and VI 25-s punishment (50% and 80% of reinforcement rate, respectively) were applied to the preferred response option. The purpose of Part B was to verify that effects of punishment were frequency dependent.
RESULTS AND DISCUSSION
Appendix A lists obtained rates of reinforcement and punishment. The top two rows of Figure 1 show response rates for the left and right screen locations (shown as adjacent bars) in the baseline and 50% punishment conditions of Part A. In all cases, money loss decreased response rate for the alternative on which it was contingent. The bottom row of Figure 1 shows analogous data for Part B. Again, punishment decreased response rates in all instances, and the magnitude of effect depended on punishment frequency. Across Parts A and B, in all but one instance (S272, 50% punishment), contingent money loss led to response rate increases for the unpunished alternative, replicating a common finding (e.g., Azrin & Holz, 1966; Bradshaw et al., 1979). Overall, the results of the manipulation check indicated that response-contingent money loss functioned as punishment in the context of the present procedures.
EXPERIMENT 2
This experiment promoted qualitative comparisons of the predictions of Equations 2 and 3. Part A was designed as a test of the direct-suppression model of Equation 2. Following a no-punishment baseline, punishment was superimposed on both response alternatives at a rate equal to 50% and 100% of the reinforcement rate of the leaner schedule. Under these conditions, Equation 2 predicts a preference shift toward the rich alternative, and the competitive-suppression model of Equation 3 predicts relatively little preference shift. Part B was designed as a test of the competitive-suppression model of Equation 3. Following a no-punishment baseline, punishment was superimposed on both response alternatives at a rate proportional to the programmed reinforcement rate. Under the programmed contingencies, Equation 3 predicts decreased preference for the reinforcement-rich alternative, while the direct-suppression model of Equation 2 predicts no change in preference.
Part C was conducted in recognition of the fact that experimental procedures for humans are never identical to those for nonhumans (e.g., Baron, Perone, & Galizio, 1991). Unlike in nonhuman experiments, in Parts A and B money counters were displayed during experimental sessions, including a central counter showing net session earnings and pairs of counters showing money gained and money lost for each of the two response alternatives. We were concerned that by making aggregate gains and losses explicit, these counters would impose a subtractive logic that, although consistent with predictions of the direct-suppression model, would be idiosyncratic to this particular computer work environment (that is, discriminative control exerted by the counters might overwhelm control by the consequences). To determine whether money counters were integral to effects in Parts A and B, Part C replicated the schedule values of Part A in the presence and absence of money counters.
METHOD
Eight students volunteered to participate. One was dropped from the study after failing to show consistent preference for the richer source of reinforcement during 13 hr of exposure to a no-punishment baseline. No data are presented for this subject. Five subjects completed both Part A and Part B. For 2 of these subjects (S271 and S274), Part A came first, and for the other 3 (S269, S272, and S273), Part B came first. Two other subjects (S500 and S501) completed Part C. Table 2 shows the schedule values. The screen location to which the richer schedule was assigned was randomly determined for each condition.
Part A consisted of a no-punishment baseline (2:1 reinforcement ratio) and two conditions in which punishment was applied to both response options. Across conditions, punishment for both response options was programmed at a rate equal to 50% and 100% of the programmed reinforcement rate for the leaner reinforcement schedule. For example, given baseline reinforcement schedules of VI 15 s and VI 30 s, the 50% condition would yield punishment schedules of VI 60 s for both alternatives.
Part B consisted of a no-punishment baseline (5:1 reinforcement ratio) and three punishment conditions in which punishment was applied to both response options. Across conditions, punishment was programmed at a rate equal to 25%, 50%, and 75% of the reinforcement rate for each response option. For example, given baseline reinforcement schedules of VI 12 s and VI 60 s, the 50% condition would yield punishment schedules of VI 24 s and VI 120 s, respectively.
In Part C, the design and procedures of Part A were replicated; once with money counters present as during Part A, and once with the money counters absent. S500 completed the counter conditions first; S501 completed the no-counter conditions first. When money counters were absent, the top portion of the screen where counters normally would be displayed remained black. Prior to the first session of the no-counters phase, subjects were told, "The computer will indicate each time you gain or lose money. Your total earnings during the session will not be shown on your screen, although the computer will keep track of this. You may always ask how much money you have made after any session." Neither subject inquired about session totals. Prior to the first session of the counters phase, subjects were told, "The computer will indicate each time you gain or lose money and display your total earnings during the session on your screen."

Table 2

Experiments 2 and 3: Variable-interval schedule values (s). Note that in Experiment 3 each punishment condition was preceded by a baseline condition using the same reinforcement schedule values. Within experiments, the sequence of conditions varied across subjects (see Appendixes B and C).

                                          Reinforcement      Punishment
Experiment  Part  Condition               Rich    Lean       Rich    Lean
2           A, C  2:1 baseline            15      30         —       —
                  2:1 50% punishment      15      30         60      60
                  2:1 100% punishment     15      30         30      30
            B     5:1 baseline            12      60         —       —
                  5:1 25% punishment      12      60         48      240
                  5:1 50% punishment      12      60         24      120
                  5:1 75% punishment      12      60         16      80
3           A     2:1                     15      30         85      85
                  3:1                     13      40         85      85
                  4:1                     12      50         85      85
                  5:1                     12      60         85      85
                  7:1                     11      80         85      85
            B     3:2                     17      25         34      50
                  2:1                     15      30         30      60
                  3:1                     13      40         26      80
                  4:1                     12      50         24      100
                  5:1                     12      60         24      120
                  9:1                     11      100        22      200
                  17:1                    10      180        20      360
RESULTS
Following the lead of de Villiers (1980, 1982), emphasis was placed on evaluating the qualitative predictions of the models. For all subjects, response proportions during baseline were lower than predicted by Equation 1, suggesting the commonly reported human tendency toward undermatching (Kollins et al., 1997). To facilitate visual comparison of predicted versus obtained response patterns, in Figures 2 and 3 the predictions of Equations 2 and 3 are plotted against the left ordinate of each panel, and obtained response proportions are plotted against the right ordinate, which is offset vertically to bring obtained baseline proportions into correspondence with model predictions. Appendix B shows the obtained reinforcement, punishment, and response rates upon which the analyses shown in Figures 2 and 3 were based.
For Part A, the test of the direct-suppression model, Figure 2 (top, right panel) shows the model predictions based on programmed schedule values. Visual inspection of the top five subject panels suggests that, for all subjects, as predicted by the direct-suppression model of Equation 2, rich-side preference increased as punishment rose from 50% to 100% of the lean-side reinforcement rate.
For Part B, the test of the competitive-suppression model, Figure 3 (top, right panel) shows the model predictions based on programmed schedule values. Visual inspection of the remainder of Figure 3 suggests that the performance of 3 of 5 subjects mirrored the predictions of the direct-suppression model. For S271, performance was roughly intermediate to the predictions of the two models. For S269, based on programmed rates of consequences, both models predicted a punishment-related decrease in rich-side preference. These subjects may be viewed as uninformative for model-comparison purposes.
Fig. 2. Experiment 2, Parts A and C: Response proportions (right ordinate) and predictions of Equations 2 and 3 based on obtained rates of reinforcement and punishment (left ordinate) for individual subjects. Punishment rate for both response alternatives was a percentage of the lean-side reinforcement rate. Top right panel: predictions based on programmed rates of reinforcement and punishment.

Figure 2 (bottom four panels) summarizes the results from Part C, the counter test. Visual inspection suggests that, for both subjects, as predicted by the direct-suppression model, rich-side preference increased as punishment rose from 50% to 100% of the lean-side reinforcement rate. This was true regardless of the presence or absence of money counters. Although Part C involved only 2 subjects and a limited (A-B) experimental design, the data suggest that results of previous experiments were not an artifact of the screen display.
DISCUSSION
Punishment was superimposed upon human concurrent-schedule performance to produce 14 individual-subject functions potentially relevant to the predictions of the direct-suppression (Equation 2) and competitive-suppression (Equation 3) punishment models. In each of these cases, based on programmed rates of reinforcement and punishment, one of the punishment models predicted a distinct shift in preference while the other predicted little or no change from baseline. Visual inspection revealed 12 outcomes that were consistent with Equation 2 predictions and no outcomes that were clearly consistent with Equation 3 predictions. Two cases were ambiguous with respect to model predictions.

Fig. 3. Experiment 2, Part B: Response proportions (right ordinate) and predictions of Equations 2 and 3 based on obtained rates of reinforcement and punishment (left ordinate) for individual subjects. Punishment rate for each response alternative was a percentage of the reinforcement rate for that alternative. The top right panel shows model predictions based on programmed rates of reinforcement and punishment.
These findings join with those of studies conducted with individual nonhumans in supporting the qualitative predictions of Equation 2 over those of Equation 3 (de Villiers, 1980; Farley, 1980). Some previous studies have applied punishment to human choice, but the punishment models under consideration here were tested with group-aggregate data (Gray et al., 1991) or were not evaluated at all (Bradshaw et al., 1979). The present investigation compared punishment models at the level of individual subjects and improved on previous studies by incorporating features that made the comparisons easier to interpret (free-operant procedures, COD, interval-schedule punishment). Overall, these results point to human punishment as a process involving direct suppression, as implied by one-factor punishment theories.
Like those from earlier studies of nonhumans, the present results must be interpreted cautiously because neither of the punishment models under consideration allowed formal consideration of deviations from perfect matching that occur routinely in all species (Baum, 1979; Kollins et al., 1997). The problem is evident in Figures 2 and 3, in which
Table 3

Experiment 2: Model parameters and percentage of variance accounted for (%VAC) in fitting Equations 4, 5, and 6 to data from punishment conditions. Data from Experiment 1 were included in the analysis when available.

                Equation 5                Equation 6                Equation 4
Subject      a    log b   %VAC         a    log b   %VAC         a    log b   %VAC
269         .76    -.15   90.8       1.36     .06   82.5        .67    -.14   79.6
271        1.15    -.40   79.4       1.59     .12   24.2       1.18    -.14   83.3
272         .43    -.01   95.3       1.02     .06   92.2        .71    -.14   98.9
273         .91    -.21   85.7       1.08     .12   29.0        .27     .31   15.7
274        1.41    -.49   52.8       1.33     .14   53.3        .20     .37    1.5
500         .39    -.11   83.2       1.21    -.01   87.8       1.13    -.21   92.1
501         .75     .02   98.4       2.55     .07   82.6        .76     .22   90.5
ordinates were offset to adjust for baseline performances not precisely in accord with the predictions of perfect matching. Without this affordance, the putative direct-suppression effects of punishment would be difficult to discern through visual inspection.
The generalized matching law (Baum, 1974)

\log\left(\frac{B_x}{B_y}\right) = a \log\left(\frac{R_x}{R_y}\right) + \log b \qquad (4)

accounts for deviations from perfect matching through two fitted parameters: a (slope) serves as an estimate of sensitivity to different frequencies of reinforcement for the two response alternatives, and log b (intercept) serves as an estimate of bias for one response alternative. It is a simple matter to convert the punishment models to this form, yielding a direct-suppression model

\log\left(\frac{B_x}{B_y}\right) = a \log\left(\frac{R_x - P_x}{R_y - P_y}\right) + \log b \qquad (5)

and a competitive-suppression model

\log\left(\frac{B_x}{B_y}\right) = a \log\left(\frac{R_x + P_y}{R_y + P_x}\right) + \log b. \qquad (6)
Although models similar to Equation 5 have been proposed to account for travel costs in foraging analogs involving concurrent schedules (under the assumption that travel results in lost reinforcers; Baum, 1982; Davison, 1991), punishment effects on individual behavior apparently have not been evaluated using models based on the generalized matching law. At least three approaches to model comparisons can be imagined.
One method of evaluating punishment models would be to compare the percentage of variance accounted for (%VAC) when competing models are fit to data from punishment conditions (excluding baseline conditions). Unfortunately, as long as the functions from punishment conditions are linear when plotted on logarithmic coordinates, the generalized matching law (Equation 4) will describe the data well without reference to punishment, and it makes no sense to render punishment irrelevant in studies of punishment. Thus both Equations 5 and 6, which contain the same fitted parameters, would be expected to provide good fits to data from punishment conditions. Table 3 summarizes the least-squares linear-regression fits of Equations 4, 5, and 6 to the data from punishment conditions. As expected, all three models provided an acceptable account in the majority of cases, and the model providing the best fit varied across subjects.
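As a concrete sketch of this fitting procedure, the snippet below computes least-squares estimates of a, log b, and %VAC for Equations 4, 5, and 6 from a single set of condition means. The schedule values, response ratios, and the helper name `fit_matching` are hypothetical illustrations, not data or code from the original study.

```python
import numpy as np

def fit_matching(log_ratio, log_behavior):
    """Least-squares fit of log(Bx/By) = a * log(consequence ratio) + log b.

    Returns sensitivity (a), bias (log b), and the percentage of
    variance accounted for (%VAC)."""
    a, log_b = np.polyfit(log_ratio, log_behavior, 1)
    residuals = log_behavior - (a * log_ratio + log_b)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((log_behavior - log_behavior.mean()) ** 2)
    return a, log_b, 100 * (1 - ss_res / ss_tot)

# Hypothetical condition means: reinforcement (R) and punishment (P)
# rates per hour for response options x and y, plus obtained log
# response ratios (B).
Rx = np.array([60.0, 40.0, 30.0, 20.0])
Ry = np.array([20.0, 30.0, 40.0, 60.0])
Px = np.array([10.0, 10.0, 10.0, 10.0])
Py = np.array([10.0, 10.0, 10.0, 10.0])
B = np.array([0.9, 0.2, -0.3, -1.0])

# The three models differ only in the consequence ratio entered
# into the regression:
eq4 = fit_matching(np.log10(Rx / Ry), B)                # reinforcement only
eq5 = fit_matching(np.log10((Rx - Px) / (Ry - Py)), B)  # direct suppression
eq6 = fit_matching(np.log10((Rx + Py) / (Ry + Px)), B)  # competitive suppression
```

With a constant punishment rate on both options, the Equation 5 slope falls below, and the Equation 6 slope rises above, the Equation 4 slope for the same behavior ratios, anticipating the range expansion and compression discussed later under Parameter invariance.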
A second approach is suggested by previous work on travel costs in foraging. Baum (1982) and Davison (1991; Davison & McCarthy, 1988) assumed that travel costs (explicitly equated with punishment by Baum) leave the fitted parameters of the generalized matching law unaltered. If the same can be assumed of punishment, and if the purpose of any punishment model is to account for the effects of reinforcement and punishment within a single mathematical expression, then Equations 5 and 6 may be compared in terms of their capacity to integrate the data from punishment and no-punishment conditions. To create Figure 4, baseline and punishment data for each of the 7 participants in both parts of Experiment 2 were pooled with data from Experiment 1 (if available). Figure 4 shows the least-squares linear-regression fits of Equations 5 and 6 to these data. In six of seven cases, the direct-suppression model of Equation 5 accounted for more variance than the competitive-suppression model of Equation 6, although the %VAC was modest in some cases. Overall, analyses based on the generalized matching law concurred with those based on the more qualitative approach.

Fig. 4. Experiment 2: Relationship between log consequence ratio, based on Equations 5 and 6, and log response ratio for individual subjects. Note: a and log b are fitted parameters of the models. VAC = percentage of variance accounted for.
A third, and more stringent, model test suggested by an anonymous reviewer and by M. Davison (personal communication, May 26, 2002) also assumes that punishment is inert with respect to the fitted parameters of the generalized matching law: Equation 4 may be fit to baseline (no-punishment) data and the resulting slope and bias parameters (a and log b, respectively) held constant as Equations 5 and 6 are fit to punishment data. For the present data set, this third approach was rejected on practical grounds, as it requires more baseline conditions than were included in the experimental design.
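The constrained comparison can be sketched as follows; the baseline and punishment values are hypothetical, and `vac_fixed_params` is an illustrative helper rather than the study's actual analysis code. Note that when the slope and intercept are fixed in advance, the variance accounted for can come out negative, meaning the constrained model fits worse than the mean of the data.

```python
import numpy as np

def vac_fixed_params(log_ratio, log_behavior, a, log_b):
    """%VAC by a line whose slope (a) and intercept (log b) were fixed
    in advance; with fixed parameters the result can be negative."""
    residuals = log_behavior - (a * log_ratio + log_b)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((log_behavior - log_behavior.mean()) ** 2)
    return 100 * (1 - ss_res / ss_tot)

# Step 1: fit Equation 4 to hypothetical baseline (no-punishment) data.
base_x = np.log10(np.array([3.0, 1.5, 1 / 1.5, 1 / 3.0]))  # Rx/Ry ratios
base_y = np.array([0.5, 0.1, -0.1, -0.5])                  # log response ratios
a, log_b = np.polyfit(base_x, base_y, 1)

# Step 2: hold a and log b constant while scoring Equations 5 and 6
# against hypothetical punishment-condition data.
Rx, Ry = np.array([60.0, 20.0]), np.array([20.0, 60.0])
Px, Py = np.array([10.0, 10.0]), np.array([10.0, 10.0])
pun_y = np.array([0.8, -0.8])
vac5 = vac_fixed_params(np.log10((Rx - Px) / (Ry - Py)), pun_y, a, log_b)
vac6 = vac_fixed_params(np.log10((Rx + Py) / (Ry + Px)), pun_y, a, log_b)
```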
EXPERIMENT 3
Experiment 3 was designed as a more thorough test of the models: subjects typically completed more conditions than in Experiment 2, including equal numbers of baseline and punishment conditions, thereby supporting a strategy of model comparison that could not be applied in the earlier experiments.
METHOD
Seventeen subjects participated, 6 in Part A and 11 in Part B (more subjects were included in Part B because the results were more variable across subjects). Two subjects withdrew from each part of the study before adequate data could be collected, citing boredom or schedule conflicts as the reason for withdrawing. Data are reported for the remaining 4 subjects in Part A and 9 subjects in Part B.
Subjects completed at least 4 two-condition phases, each consisting of a baseline (reinforcement-only) condition plus a punishment condition with identical reinforcement rates. The punishment contingencies of Part A were similar to those of Part A in Experiment 2 in that a constant rate of punishment was applied to both response alternatives across a range of relative reinforcement rates. The punishment contingencies of Part B were similar to those of Part B in Experiment 2 in
Table 4

Experiment 3: Model parameters and percentage of variance accounted for (%VAC) in fitting Equations 4, 5, 6, and 7 to data from punishment conditions. Analyses included subjects who completed at least four punishment conditions in which obtained reinforcement rate exceeded obtained punishment rate.

               Equation 5               Equation 6               Equation 4               Equation 7
Subject     a    log b   %VAC        a    log b   %VAC        a    log b   %VAC        a     log b   %VAC
209       -.21     .04   44.1      -.90     .04   48.2      -.20     .03   42.7       .20      .01   41.4
243       -.19     .13   35.9      -.85     .14   36.6      -.23     .11   48.8       .25      .09   63.6
252        .24    -.06   84.0      1.06    -.04   90.0       .25    -.06   86.2      -.20      .02   67.8
253       1.44    -.11   97.8      6.23    -.04   96.8      1.25    -.07   98.7     -1.21     -.19   98.7
254        .40     .05   83.5      2.86     .04   97.8       .60     .07   95.1      -.70      .07   97.0
265        .64     .08   98.3      3.45     .01   95.1       .68     .05   97.8      -.71      .03   97.3
267        .99     .08   97.7      4.66     .12   94.1       .93     .13   99.0      -.71     -.05   96.3
512        .66     .06   95.2      1.62     .61   96.8       .87     .01   94.9     -1.70    -1.72   95.1
513        .66     .10   92.8       .32     .14   94.5       .74     .05   94.0     -5.87      .05   97.1
514        .50     .13   64.2      1.09     .10   91.8       .85     .05   85.4     -8.60     -.11   90.9
515        .33    -.14   96.7       .75    -.18   98.6       .44     .14   95.9      -.73     -.27   50.1
that punishment proportional to the reinforcement rate was applied to both alternatives.
For each subject in both parts of the experiment, an attempt was made to complete approximately the same number of phases in which the left and right response options were more frequently reinforced. The number of phases completed varied across subjects depending on individual availability and limitations of academic schedules. Reinforcement and punishment schedule values are shown in Table 2, and the sequence of conditions completed by each individual is shown in Appendix C.
RESULTS AND DISCUSSION
Appendix C shows the data on which the three types of model comparisons discussed previously were based. Table 4 (leftmost three sections) shows the fitted parameter values of, and %VAC by, Equations 4, 5, and 6 when fit to data from punishment conditions for individuals who completed at least four such conditions in which obtained reinforcement rate exceeded obtained punishment rate for both response options. The two punishment models and the generalized matching law all provided good accounts of the data for most subjects. Thus, as anticipated (see Experiment 2, Discussion), this type of analysis provides no clear basis for distinguishing between punishment models.
Figure 5 shows the least-squares linear-regression fits of Equations 5 (direct-suppression model) and 6 (competitive-suppression model) to the baseline and punishment conditions combined. Recall that this approach is one means of evaluating the capacity of punishment models to integrate reinforcement and punishment conditions in a single expression. In all cases in Part A, and in seven of nine cases in Part B, the direct-suppression model accounted for more variance than the competitive-suppression model. These results generally corroborate those of Experiment 2.
Table 5 summarizes the third type of model test, in which Equation 4 was fit to baseline (no-punishment) data, and the resulting slope and bias parameters (a and log b, respectively) were held constant as Equations 5 and 6 were fit to punishment data. The table shows outcomes for subjects with at least four punishment conditions in which obtained reinforcement rate exceeded obtained punishment rate for both response options. In 3 of 11 cases (S209, S243, and S265), neither the direct-suppression model of Equation 5 nor the competitive-suppression model of Equation 6 accounted well for the punishment data. In all of the remaining eight cases, the direct-suppression model provided a better account than the competitive-suppression model, although the %VAC by the better-fitting model was fairly low (median = 68.5%).
Fig. 5. Experiment 3: Relationship between log consequence ratio, based on Equations 5 and 6, and log response ratio for individual subjects. Note: a and log b are fitted parameters of the models. VAC = percentage of variance accounted for.

Overall, the data from Experiment 3 may be said to support direct suppression as the mechanism underlying punishment effects, but, unlike in Experiment 2, the competitive-suppression model of Equation 6 provided an
Table 5

Experiment 3: Model parameters and percentage of variance accounted for (%VAC) in fitting the generalized matching law to baseline (no-punishment) data, and variance in punishment data accounted for by Equations 5 and 6 when parameters were forced to baseline values. Analyses included subjects who completed at least four punishment conditions in which obtained reinforcement rate exceeded obtained punishment rate.

            Equation 4 fit to baseline      Punishment conditions:
                                            variance accounted for by
Subject       a      log b    %VAC          Equation 5    Equation 6
S209        .624      .009    97.0          —a            —a
S243        .488     -.002    98.4          —a            —a
S252        .361     -.014    96.8          58.1%         27.0%
S253       1.123     -.178    98.1          97.4%         31.7%
S254        .921     -.008    92.1          59.4%         33.3%
S265       1.021     -.389    96.8          26.3%         11.8%
S267        .928     -.180    93.7          91.2%         16.6%
S512        .598     -.021    93.0          94.0%         57.3%
S513        .471      .001    91.6          70.7%         62.9%
S514        .512      .073    97.7          47.3%         —a
S515        .470      .092    91.9          68.5%         12.9%

a Undefined (negative sum of squares).
obviously superior fit to the data of 2 subjects in one analysis (S209 and S243 in Figure 5). Visual inspection of Figure 5 suggests one possible source of this discrepancy. Whereas the baseline and punishment data from most subjects were readily integrated into a single, positively sloped linear function, 2 subjects apparently produced negatively sloped punishment functions. This outcome is reminiscent of a finding reported by Deluty and Church (1978), who exposed rats to unequal, independent, concurrent schedules of response-independent shock. When the rats could select which of these schedules operated, time allocation was an inverse function of shock rate, a pattern well described by a model that can be expressed as
\log\left(\frac{B_x}{B_y}\right) = a \log\left(\frac{P_y}{P_x}\right) + \log b. \qquad (7)
Table 4 (rightmost section) shows the results of fitting Equation 7 to punishment-condition data for Experiment 3 subjects. Although Equation 7 tended to account for about the same %VAC as Equations 4 through 6 for most subjects, the fitted parameter values verify that S209 and S243 were qualitatively different from other subjects. Because Equation 7 inverts the punishment terms (Py/Px rather than Px/Py), inverse matching to punishment yields a positive value of the slope parameter, a, as obtained for S209 and S243. Thus these subjects tended to allocate the bulk of their responding to the option with the lower rate of money loss, even though this was also the option with the lower reinforcement rate and, thus, the lower net money gain. Note that this outcome is consistent with a competitive-suppression view (Equation 6 reduces to Equation 7 when the reinforcement terms are omitted). Equation 7 can be rejected on logical grounds as a simple account of the negative slopes generated by other subjects. These subjects allocated the bulk of their responding to the option with the higher rate of money loss (and also the higher rate of reinforcement). Because Experiment 1 showed money loss to function as punishment, it seems likely that for these subjects responding was controlled by net money gain (as assumed in direct-suppression accounts) rather than by punishment rate alone. Taken together, these results suggest the possibility of pronounced individual differences in human punishment effects that merit consideration in future studies.
GENERAL DISCUSSION
The purpose of this investigation was to compare direct-suppression and competitive-suppression models of punishment in choice, and to apply these models for the first time to individual human behavior. Experiment 2 compared models based on Herrnstein's (1970) proportional matching law using a qualitative evaluation procedure that has been employed in all previous model comparisons. A direct-suppression punishment model was superior to a competitive-suppression punishment model in describing 12 of 14 individual-subject functions.
Apparently, no previous report has attempted to update operant punishment models to the form of Baum's (1974) generalized matching law or to compare such models quantitatively (although see Gray et al., 1991). In both Experiments 2 and 3, models based on the generalized matching law were compared by determining how well they integrated data from punishment and no-punishment conditions. A direct-suppression model proved superior to a competitive-suppression model for 17 of 20 individual functions (Figures 4 and 5). In Experiment 3, a more rigorous method of model comparison supported a direct-suppression model in eight of eight interpretable cases (Table 5).
Limited data from previous investigations have suggested the superiority of a direct-suppression punishment model (de Villiers, 1980; Farley, 1980), thereby lending support to the one-factor view of punishment on which Equations 2 and 5 are based (e.g., see de Villiers, 1980, 1982; Farley & Fantino, 1978; Mazur, 1994). This conclusion, heretofore based primarily on studies of nonhumans, can now be provisionally extended to individual human behavior. One reason to regard the present results as provisional lies in the nature of the instructions that, although minimal by most standards, exhorted our subjects to "earn as much money as you can." By possibly focusing attention on net earnings, this phrase might have predisposed subjects toward performances that were consistent with the direct-suppression model. For instance, a subject attending closely to net session earnings might be relatively insensitive to momentary influences such as the transient, punishment-elicited emotional responses that two-factor theories (e.g., Dinsmoor, 1954) hold as the basis of the alternative-reinforcement effects described in competitive-suppression models. Whether brief instructions heard once at the start of participation can exert such powerful effects across many hours of exposure to changing contingencies can only be revealed by replicating our procedures using different instructions.
The one-factor view of human punishment suggested by the present findings raises interesting questions about prominent interpretive writings that incorporate a two-factor perspective. Skinner (1953), for example, proposed that, consistent with two-factor assumptions, "the most important effect of punishment . . . is to establish aversive conditions which are avoided by any behavior of 'doing something else'" (p. 189). Events that function as punishers often do generate emotional responses (e.g., Axelrod & Apsche, 1983; Taylor, 1991), and casual observation indicated that our subjects sometimes reacted emotionally to point loss. Yet the present findings lend no systematic support to the notion that punishment makes alternative behavior more reinforcing in absolute terms. We suggest, therefore, that although emotional by-products may contribute to troublesome side effects often associated with therapeutic, social, and legal applications of punishment (e.g., Axelrod & Apsche; Gershoff, 2002; Skinner), they bear no necessary relation to the operant response-rate changes that define the operation of punishment.
Although the present findings are broadly consistent with direct suppression of behavior by punishment, several unresolved issues will loom large in the continued evaluation of this view.
Functional Consequence Scaling
Cognitive decision research suggests that equal-sized money gains and losses can have different degrees of impact on choice (e.g., Kahneman & Tversky, 1979), a finding for which, so far, no clear operant parallel apparently exists (e.g., see Lerman & Vorndran, 2002). The present investigation used reinforcers and punishers of equal monetary value to avoid scaling ambiguities that plagued previous investigations with nonhumans in which food and shock, respectively, served as reinforcers and punishers. Nevertheless, the assumption that reinforcers and punishers of nominally equal value exert equal degrees of control over behavior bears formal scrutiny.
The present investigation provided clues that reinforcers and punishers did not always have equal impact upon behavior. First, 2 subjects in Experiment 3 appeared to show inverse matching to punishment rates. Thus, under the contingencies employed in Part B of Experiment 3, they preferred the option with the lower rate of reinforcement and, therefore, the lower rate of net money earnings. This outcome makes sense only if, for these individuals, one punisher was more efficacious than one reinforcer. Second, as noted previously, in 14 instances in Experiments 2 and 3, the condition-mean punishment rate equaled or exceeded the reinforcement rate for a response option (see Appendixes B and C), resulting in a net gain of zero cents (or less) for that response option according to the subtractive logic of direct-suppression models. In no case, however, did this result in exclusive preference for the other response option, suggesting that punishers may sometimes have had a lower functional value than reinforcers despite their nominally equal magnitude.
Future studies should assess the functional magnitudes of money gain and money loss in the context of human operant experiments because no quantitative choice model can be fully evaluated without knowing the functional magnitudes of the consequences involved (e.g., Herrnstein, 1970). Statistical scaling procedures such as those described by Farley and Fantino (1978) provide one means of accomplishing this. We note, however, that such procedures can be employed only after a general form of punishment model (e.g., direct suppression versus competitive suppression) has been adopted (see Baum, 1982; Farley & Fantino). By lending support to direct-suppression models, therefore, the present investigation helps to pave the way for studies of functional punishment value.
It would be surprising if the functional value of money consequences did not vary across individuals. In scaling the functional impact of food reinforcers and shock punishers in pigeons, for example, Farley and Fantino (1978) found different relative values for different subjects. Intersubject differences might be especially pronounced for conditioned consequences (such as money), which acquire their capacity to influence behavior through experience that, in the world outside the laboratory, is unique for each individual (e.g., Lerman & Vorndran, 2002). Unusual preexperimental histories may well have led to the aberrant performances of S209 and S243 in Experiment 3. For this reason, experimentally created conditioned consequences (e.g., Jackson & Hackenberg, 1996) might provide a better foundation for future investigations.
Model Limitations and Characteristics
Limitations of existing direct-suppression models. Although in the present investigation direct-suppression models (Equations 2 and 5) outperformed competitive-suppression models (Equations 3 and 6), it is unclear whether Equations 2 and 5 form the basis of a good punishment model. The present investigation highlights two limitations of existing direct-suppression models. The first limitation is illustrated by several instances in which the response rate for a response option remained greater than zero despite the fact that punishment rate equaled or exceeded reinforcement rate (Appendixes B and C). As noted previously, because of high punishment rates many of these cases are incompatible with qualitative model evaluations based on the proportional matching law (Herrnstein, 1970), and all are incompatible with quantitative model evaluations based on the generalized matching law (Baum, 1974). Our practice was to drop these cases from consideration, but a good punishment model should accommodate them. Pending further model development, these cases can be avoided through scheduling conventions such as the Stubbs-Pliskoff technique for arranging nonindependent concurrent schedules, which enforces programmed relative consequence rates (Stubbs & Pliskoff, 1969). Ultimately, however, a general-purpose direct-suppression model is required.
A second limitation of existing direct-suppression models is demonstrated empirically. Even when troublesome cases of high-rate punishment were excluded from analysis, direct-suppression models, although superior to their competitive-suppression counterparts, often accounted for only a modest percentage of variance in individual-subject functions (Tables 3 and 4; Figures 4 and 5). Whether the unexplained variance can be attributed to features of the models, features of the present investigation, or both remains to be determined. One obvious hypothesis can be immediately ruled out. It might be proposed that emotional responses, unmeasured here but thought to be elicited by aversive events (e.g., Axelrod & Apsche, 1983; Skinner, 1953; Taylor, 1991), competed with operant processes to create unsystematic noise in the data from punishment conditions. Tables 3 and 4 argue against this possibility by showing that data from punishment conditions, in which these emotional responses should have been elicited, were reasonably orderly. The difficulty seems to lie instead in integrating data from punishment and no-punishment conditions, the very goal that a successful punishment model should achieve.
An additional, and possibly related, concern is whether Equations 2 and 5 promote the most appropriate level of analysis. For example, these models make no reference to the discriminability of the consequences or the stimuli associated with them, factors that are important in concurrent-schedule performance involving only reinforcement (Davison & Jenkins, 1985; Davison & McCarthy, 1988; Davison & Nevin, 1999; Madden & Perone, 1999). Additionally, Equations 2 and 5 are molar models, and perhaps punishment effects are better understood at a molecular level of analysis (e.g., Vaughan, 1987). Orderly effects of reinforcement in concurrent schedules have been detected at both levels (e.g., Landon, Davison, & Elliffe, 2002). It makes sense to anticipate parallels in the effects of punishment.
Parameter invariance. Key model tests of the present investigation were predicated on the assumption that punishment leaves the fitted parameters of the generalized matching law unchanged, an assumption for which we know of no direct empirical support. If punishment were found to alter the fitted parameters of the generalized matching law, then the conceptual status of these parameters in punishment models would have to be reconsidered. Ambiguities already exist. Note, for example, that in the present investigation the sensitivity estimates (slopes) derived from the direct-suppression model of Equation 5 nearly always were lower than those derived from the competitive-suppression model of Equation 6 (see Tables 3 and 4; Figures 4 and 5). This is because, compared to reinforcement-only models (e.g., Equation 4), direct-suppression models tend to expand the range of consequence ratios (e.g., in Part A of Experiments 2 and 3, subtracting a constant from unequal reinforcement values shifts their ratio away from unity). By contrast, competitive-suppression models tend to compress the range of consequence ratios (e.g., in Part A of Experiments 2 and 3, adding a constant to unequal reinforcement values shifts their ratio toward unity). In matching terms, plotting the same set of behavior ratios against different ranges of consequence ratios necessarily yields functions of unequal slope. It is not clear, however, whether it is justified in such cases to conclude that sensitivity to consequence differentials is lower for direct-suppression models, or whether it is even permissible to compare slopes generated by qualitatively different models. Does sensitivity mean the same thing in different models?
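The arithmetic behind this range expansion and compression is easy to verify with hypothetical rates:

```python
# Hypothetical rates: 30 and 10 reinforcers per hour on the two options,
# plus a constant punishment rate of 5 per hour on both.
Rx, Ry, P = 30.0, 10.0, 5.0

baseline_ratio = Rx / Ry                  # 3.0
direct_ratio = (Rx - P) / (Ry - P)        # 25/5 = 5.0: farther from unity
competitive_ratio = (Rx + P) / (Ry + P)   # 35/15 ~ 2.33: closer to unity
```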
Apparently at odds with the parameter-invariance assumption are reports of punishment-related changes in both the sensitivity (Bradshaw et al., 1979) and bias (Bradshaw et al.; McAdie, Foster, & Temple, 1996; McAdie, Foster, Temple, & Matthews, 1993) parameters of the generalized matching law. For present purposes, these reports admittedly are ambiguous. Bradshaw et al. employed ratio punishment schedules, which confound response and consequence rates, and omitted a changeover delay that could have assured independence of the concurrent repertoires. McAdie et al. (1993) examined only one baseline and one punishment condition per subject, precluding conclusions about the matching relation. Additionally, punishment in the McAdie et al. investigations consisted of loud noise presented continuously in association with residence at one response option, an arrangement that appears to punish changing over to one alternative rather than discrete responses at that alternative.
If punishment does alter the free parameters of the generalized matching law, then many complications arise. Although it may be tempting, in the name of parsimony, to simply employ Equation 4 to describe punishment effects, doing so without reference to punishment leaves the model as merely descriptive. To create testable predictions, punishment would have to be integrated directly into the sensitivity or bias parameter of Equation 4 (that is, these parameters would no longer be entirely free). It is not clear how this might be accomplished or what the implications would be for the one-factor and two-factor theories that have guided interpretations of punishment for most of the past century.
Because parameter invariance may fail, the specific schedule values employed in punishment-model tests may be important in ways not considered when the present investigation was designed. Assume, for instance, that punishing one behavior creates a bias against engaging in that behavior (as suggested by Bradshaw et al., 1979, and McAdie et al., 1993). Applying a constant rate of punishment to two concurrent response options (as in Part A of the present Experiments 2 and 3) would promote competing biases that cancel each other out, leaving models based on the generalized matching law easy to interpret. By contrast, applying unequal punishment schedules to the two response options (as in Part B of the present experiments) would generate competing biases of unequal strength. If raw punishment rates varied not only across response options but also across reinforcement ratios (as in Part B of the present Experiments 2 and 3), unexplained variance would be introduced into the linear functions, and matching models might appear to perform badly. Thus asymmetrical punishment effects on sensitivity and/or bias may help to explain why, in both Experiments 2 and 3 of the present investigation, Part B (in which punishment rate varied) produced less consistent outcomes than Part A (in which punishment rate was constant).
Obviously, new data are needed to shed light on the status of free parameters in punishment models based on the generalized matching law. Straightforward information could be obtained by simply punishing one of two concurrently available response options across a range of relative reinforcer ratios. To date, however, no study has done so with human subjects (for whom equal-sized reinforcers and punishers presumably can be arranged) while applying standard procedural controls (e.g., COD) associated with concurrent-schedule performances. In studies involving simultaneous punishment of concurrent response options, it makes sense to emphasize cases in which a constant rate of punishment is employed for all response options in all conditions, thereby presumably minimizing problems associated with parameter invariance.
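The least-squares model comparisons described above can be illustrated with a minimal sketch. The schedule values, sensitivity value, and helper names below are hypothetical (they are not the article's obtained data or exact fitting procedure); idealized choice data are generated from the direct-suppression form, and both models' log consequence ratios are then fit by linear regression, showing how the generating model leaves the smaller residual error:

```python
import math

# Hypothetical schedule values (consequences per hour), not the
# experiment's obtained rates.
P = 30  # constant punishment rate on both response options
PAIRS = [(240, 60), (200, 100), (150, 150), (100, 200), (60, 240)]  # (R1, R2)

def direct_ratio(r1, r2, p):
    """Direct suppression: punishers subtract from their own option's reinforcement."""
    return (r1 - p) / (r2 - p)

def competitive_ratio(r1, r2, p):
    """Competitive suppression: punishers add to the other option's reinforcement."""
    return (r1 + p) / (r2 + p)

def least_squares(xs, ys):
    """Return slope, intercept, and sum of squared errors of a linear fit."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse

# Idealized log behavior ratios generated from the direct-suppression
# form with sensitivity a = 0.9 and no bias.
ys = [0.9 * math.log10(direct_ratio(r1, r2, P)) for r1, r2 in PAIRS]

x_direct = [math.log10(direct_ratio(r1, r2, P)) for r1, r2 in PAIRS]
x_comp = [math.log10(competitive_ratio(r1, r2, P)) for r1, r2 in PAIRS]

a_d, b_d, sse_d = least_squares(x_direct, ys)  # generating model: near-zero SSE
a_c, b_c, sse_c = least_squares(x_comp, ys)    # rival model: residual error remains
```

With real data, of course, neither model fits perfectly, and the comparison turns on relative goodness of fit (e.g., SSE or variance accounted for) across subjects and conditions.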
Conclusions
The present investigation provides the clearest and most extensive evidence available to date that operant punishment directly suppresses the behavior on which it is contingent. In supporting a direct-suppression account of human punishment, the present findings agree with those of studies involving nonhumans (de Villiers, 1980; Farley, 1980), thus bolstering confidence in the interspecies generality of punishment effects in choice. Aside from the possibility of parameter invariance, the unresolved theoretical and technical issues discussed above do not detract from these contributions. Rather, by improving on and extending previous investigations, the present one helps to bring these issues into focus for future investigations. In apparently supporting a direct-suppression view, the present results raise questions about interpretations of everyday human punishment that stress competitive-suppression mechanisms inspired by two-factor punishment theories (e.g., Skinner, 1953). Replication and extension of the present investigation will prove informative, therefore, in evaluating the validity of these interpretive accounts. Finally, in highlighting some limitations of existing quantitative models of punishment, the present investigation sets the stage for further punishment-model development.
In these ways, the present report demonstrates the value of continuing research on fundamental processes of punishment. Precious little operant punishment research has been published in recent years, especially research involving human subjects (e.g., see Axelrod & Apshe, 1983; Crosbie, 1998; Lerman & Vorndran, 2002). The present results are important, therefore, in adding to this meager data base. Ironically, behavior analysis appears to have largely abandoned research on punishment and other forms of aversive control just as the world outside of behavior analysis has become fascinated by it. This may help to explain the recent proliferation of nonbehavioral theories of aversive control, often guided by nonoperant data (e.g., Carlson & Tamm, 2000; Gehring & Willoughby, 2002; Gershoff, 2002; Taylor, 1991). Some encouragement may be drawn, however, from the fact that the direct-suppression view of punishment supported in the present investigation corresponds to the assumption, made in many fields and psychology subdisciplines, that benefits and costs combine directly to influence behavior (e.g., Ehrlich, 1996; Gray et al., 1991; Kahneman & Tversky, 1979; Leung, 1995; Lohman, 1997; Neilson, 1998). Operant punishment research capitalizing on this common ground thus has the capacity both to advance operant theory and to stimulate interdisciplinary discourse.
REFERENCES
Akers, R. L. (1994). Criminological theories: Introduction and evaluation. Los Angeles: Roxbury.
Axelrod, S., & Apshe, J. (1983). The effects of punishment on human behavior. San Diego: Academic Press.
Azrin, N. H., & Holz, W. C. (1966). Punishment. In W. K. Honig (Ed.), Operant behavior: Areas of research and application (pp. 380–447). Englewood Cliffs, NJ: Prentice-Hall.
Baron, A., Perone, M., & Galizio, M. (1991). Analyzing the reinforcement process at the human level: Can application and behavioristic interpretation replace laboratory research? The Behavior Analyst, 14, 95–105.
Baum, W. M. (1974). On two types of deviation from the matching law: Bias and undermatching. Journal of the Experimental Analysis of Behavior, 22, 231–242.
Baum, W. M. (1979). Matching, undermatching, and overmatching in studies of choice. Journal of the Experimental Analysis of Behavior, 32, 269–281.
Baum, W. M. (1982). Choice, changeover, and travel. Journal of the Experimental Analysis of Behavior, 38, 35–49.
Bolles, R. C. (1967). Theory of motivation. New York: Harper & Row.
Bradshaw, C. M., Szabadi, E., & Bevan, P. (1979). The effect of punishment on free-operant choice behavior in humans. Journal of the Experimental Analysis of Behavior, 31, 71–81.
Carlson, C. L., & Tamm, L. (2000). Responsiveness of children with attention deficit-hyperactivity disorder to reward and response cost: Differential impact on performance and motivation. Journal of Consulting and Clinical Psychology, 68, 73–83.
Crosbie, J. (1998). Negative reinforcement and punishment. In K. A. Lattal & M. Perone (Eds.), Handbook of research methods in human operant behavior (pp. 163–189). New York: Plenum.
Davison, M. (1991). Choice, changeover, and travel: A quantitative model. Journal of the Experimental Analysis of Behavior, 55, 47–61.
Davison, M., & Jenkins, P. E. (1985). Stimulus discriminability, contingency discriminability, and schedule performance. Animal Learning & Behavior, 13, 77–84.
Davison, M., & McCarthy, D. (1988). The matching law: A research review. Hillsdale, NJ: Erlbaum.
Davison, M., & Nevin, J. A. (1999). Stimuli, reinforcers, and behavior: An integration. Journal of the Experimental Analysis of Behavior, 71, 439–482.
Deluty, M. Z. (1976). Choice and the rate of punishment in concurrent schedules. Journal of the Experimental Analysis of Behavior, 25, 75–80.
Deluty, M. Z. (1982). Maximizing, minimizing, and matching between reinforcing and punishing situations. In M. L. Commons, R. J. Herrnstein, & H. Rachlin (Eds.), Quantitative analyses of behavior. Volume II: Matching and maximizing accounts (pp. 305–325). Cambridge, MA: Ballinger.
Deluty, M. Z., & Church, R. M. (1978). Time-allocation matching between punishing situations. Journal of the Experimental Analysis of Behavior, 29, 191–198.
de Villiers, P. A. (1980). Toward a quantitative theory of punishment. Journal of the Experimental Analysis of Behavior, 33, 15–25.
de Villiers, P. A. (1982). Toward a quantitative theory of punishment. In M. L. Commons, R. J. Herrnstein, & H. Rachlin (Eds.), Quantitative analyses of behavior. Volume II: Matching and maximizing accounts (pp. 327–344). Cambridge, MA: Ballinger.
Dinsmoor, J. A. (1954). Punishment I: The avoidance hypothesis. Psychological Review, 61, 34–46.
Ehrlich, I. (1996). Crime, punishment, and the market for offenses. Journal of Economic Perspectives, 10, 43–67.
Estes, W. K. (1944). An experimental study of punishment. Psychological Monographs, 57(3), (Whole No. 263).
Eysenck, H. J. (1967). The biological basis of personality. Springfield, IL: Charles C. Thomas.
Farley, J. (1980). Reinforcement and punishment effects in concurrent schedules: A test of two models. Journal of the Experimental Analysis of Behavior, 33, 311–315.
Farley, J., & Fantino, E. (1978). The symmetrical law of effect and the matching relation in choice behavior. Journal of the Experimental Analysis of Behavior, 29, 37–60.
Fleshler, M., & Hoffman, H. S. (1962). A progression for generating variable-interval schedules. Journal of the Experimental Analysis of Behavior, 5, 529–530.
Gehring, W. J., & Willoughby, A. R. (2002). The medial frontal cortex and the rapid processing of monetary gains and losses. Science, 295, 2279–2282.
Gershoff, E. T. (2002). Corporal punishment by parents and associated child behaviors and experiences: A meta-analytic and theoretical review. Psychological Bulletin, 128, 539–579.
Gray, L. N., Stafford, M. C., & Tallman, I. (1991). Rewards and punishment in complex human choices. Social Psychology Quarterly, 54, 318–329.
Herrnstein, R. J. (1970). On the law of effect. Journal of the Experimental Analysis of Behavior, 13, 243–266.
Jackson, K., & Hackenberg, T. (1996). Token reinforcement, choice, and self-control in pigeons. Journal of the Experimental Analysis of Behavior, 66, 29–49.
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decisions under risk. Econometrica, 47, 263–291.
Kollins, S. H., Newland, M. C., & Critchfield, T. S. (1997). Human sensitivity to reinforcement in operant choice: How much do consequences matter? Psychonomic Bulletin & Review, 4, 208–220. Erratum: Psychonomic Bulletin & Review, 4, 431.
Landon, J., Davison, M., & Elliffe, D. (2002). Concurrent schedules: Short-term and long-term effects of reinforcers. Journal of the Experimental Analysis of Behavior, 77, 257–271.
APPENDIX A
Experiment 1: Mean obtained rates of responding, reinforcement, and punishment during the final four sessions per condition.
[Tabular data not reliably recoverable from the source scan. Columns: Subject; Session duration (min); Changeover delay (s); Value of consequences in cents; Condition (BL or P); Sessions; Reinforcers per hour (Left, Right); Punishers per hour (Left, Right); Responses per minute (Left, Right); Time in seconds (Left, Right). Part A covers Subjects 253, 268, 269, 271, 274, and 512; Part B covers Subjects 272 and 273.]
Note. BL = baseline (no punishment); P = punishment; % = punishment rate as a percentage of reinforcement rate based on programmed schedule values.
Lerman, D. C., & Vorndran, C. M. (2002). On the status of knowledge for using punishment: Implications for treating behavior disorders. Journal of Applied Behavior Analysis, 35, 431–464.
Leung, S. F. (1995). Dynamic deterrence theory. Economica, 62, 65–87.
Lohman, S. (1997). Linkage politics. Journal of Conflict Resolution, 41, 38–67.
Madden, G. J., & Perone, M. (1999). Human sensitivity to concurrent schedules of reinforcement: Effects of observing schedule-correlated stimuli. Journal of the Experimental Analysis of Behavior, 71, 303–318.
Mazur, J. E. (1994). Learning and behavior (3rd ed.). Englewood Cliffs, NJ: Prentice-Hall.
Mowrer, O. H. (1947). On the dual nature of learning: A re-interpretation of "conditioning" and "problem solving." Harvard Educational Review, 17, 102–148.
McAdie, T. M., Foster, M., & Temple, W. (1996). Concurrent schedules: Quantifying the aversiveness of noise. Journal of the Experimental Analysis of Behavior, 65, 37–55.
McAdie, T. M., Foster, T. M., Temple, W., & Matthews, L. R. (1993). A method for measuring the aversiveness of sounds to domestic hens. Applied Animal Behaviour Science, 37, 223–238.
Neilson, W. S. (1998). Optimal punishment schemes with rate-dependent preferences. Economic Inquiry, 36, 266–271.
Rachlin, H., & Herrnstein, R. J. (1969). Hedonism revisited: On the negative law of effect. In B. A. Campbell & R. M. Church (Eds.), Punishment and aversive behavior (pp. 83–109). New York: Appleton-Century-Crofts.
Rescorla, R. A., & Solomon, R. L. (1967). Two-process learning theory: Relationships between Pavlovian conditioning and instrumental learning. Psychological Review, 74, 151–182.
Skinner, B. F. (1953). Science and human behavior. New York: Free Press.
Stubbs, D. A., & Pliskoff, S. S. (1969). Concurrent responding with fixed relative rate of reinforcement. Journal of the Experimental Analysis of Behavior, 12, 887–895.
Taylor, S. E. (1991). Asymmetrical effects of positive and negative events: The mobilization-minimization hypothesis. Psychological Bulletin, 110, 67–85.
Thorndike, E. L. (1911). Animal intelligence. New York: Macmillan.
Vaughan, W. (1987). Choice and punishment: A local analysis. In M. L. Commons, J. E. Mazur, J. A. Nevin, & H. Rachlin (Eds.), Quantitative analyses of behavior: Vol. 5. The effects of delay and of intervening events on reinforcement value (pp. 159–186). Hillsdale, NJ: Erlbaum.
Received May 17, 1999
Final acceptance May 5, 2003
APPENDIX B
Experiment 2: Mean obtained rates of responding, reinforcement, and punishment during the final four sessions per condition. "Rich" and "Lean" refer to programmed reinforcement rates. BL = baseline (no punishment). In the Condition column of Part A, percentages refer to punishment rate (applied to both response options) as a percentage of the lean-side reinforcement rate. In the Condition column of Part B, percentages refer to punishment rate as a percentage of the reinforcement rate of each response option. See text and Table 2 for details.
[Tabular data not reliably recoverable from the source scan. Columns: Subject; Money counter? (Yes/No); Order/Condition; Sessions; Reinforcers per hour (Rich, Lean); Punishers per hour (Rich, Lean); Responses per minute (Rich, Lean); Time in seconds (Rich, Lean). Part A covers Subjects 269, 271, 272, 273, and 274; Part B covers Subjects 269, 271, 272, 273, and 274; Part C covers Subjects 500 and 501.]
a Conditions in which obtained punishment rate equaled or exceeded obtained reinforcement rate for a response option were excluded from model evaluations summarized in Figure 5.
APPENDIX C
Experiment 3: Mean obtained rates of responding, reinforcement, and punishment during the final four sessions per condition.
[Tabular data not reliably recoverable from the source scan. For each subject and programmed reinforcement ratio (L:R), the table reports, separately for the no-punishment baseline and the punishment-superimposed phase: Sessions; Reinforcers per hour (Left, Right); Punishers per hour (Left, Right; punishment phase only); Responses per minute (Left, Right); Time in seconds (Left, Right). Part A covers Subjects 512, 513, 514, and 515; Part B covers Subjects 209, 210, 243, 252, 253, 254, 265, 267, and 268.]
a When, during the terminal sessions of a baseline condition, a response option generated nonzero response rates but insufficient residence time to allow reinforcers to accrue, a reinforcement rate of 0.1 per minute, or 6 per hour, was used during model fits.
b Conditions in which obtained punishment rate equaled or exceeded obtained reinforcement rate for a response option were excluded from model evaluations summarized in Figure 6.
[Remaining Appendix C tabular data (continued punishment-superimposed columns) not reliably recoverable from the source scan.]