
Journal of Experimental Psychology: Human Learning and Memory, 1981, Vol. 7, No. 4, 241-253

Copyright 1981 by the American Psychological Association, Inc. 0096-1515/81/0704-0241$00.75

Strategies and Classification Learning

Douglas L. Medin
University of Illinois at Urbana-Champaign

Edward E. Smith
Stanford University

How do strategies affect the learning of categories that lack necessary and sufficient attributes? The usual answer is that different strategies correspond to different models. In this article we provide evidence for an alternative view—Strategy variations induced by instructions affect only the amount of information represented about attributes, not the process operating on these representations. The experiment required subjects to classify schematic faces into two categories. Three groups of subjects worked with different sets of instructions: roughly, form a prototype of each category, learn each category as a rule-plus-exception, or standard neutral instructions. In addition to learning the faces (Phase 1), subjects were given transfer tests on learned and novel faces (Phase 2) and speeded categorization tests on learned faces (Phase 3). There were performance differences in all three phases due to instructions, but these results were readily accounted for by specific changes in the representations posited by the context model of Medin and Schaffer; that is, strategies seemed to affect only the amount of information stored about each exemplar's attributes.

The recent upsurge of interest in natural categories such as bird, tree, and fruit has been accompanied by parallel investigations of representations and processing of artificial categories. In this article we are primarily concerned with the role of strategies in learning the attribute structure of artificial categories. But since the effects of strategies can best be understood in terms of specific categorization models, we first provide a brief overview of models and then take up the strategy issue.

Category Learning Models

One idea growing out of research with artificial categories is that based on experience with exemplars, people abstract some measure of the central tendency of a category and base their categorical judgments on this central tendency, or prototype (e.g., Posner & Keele, 1968).

This research was supported in part by National Institute of Mental Health Grants MH 32370 and MH 19705 and by the National Institute of Education Contract HEW-NIE-C-400-76-0116.

Edward E. Smith is now at Bolt Beranek & Newman Inc., Cambridge, Massachusetts.

Requests for reprints should be sent to Douglas L. Medin, Department of Psychology, University of Illinois, Champaign, Illinois 61820.

A contrasting view posits that when an item is presented to be classified, it acts as a retrieval cue to access information associated with similar stored exemplars, and this specific exemplar information is the basis for category judgments (Medin & Schaffer, 1978). According to the latter idea, some animal might be categorized as a rodent not on the basis of a comparison to a rodent prototype, but because that animal has similar attributes to a rabbit and the categorizer thinks that rabbits are rodents. A specific proposal embodying this idea, known as the context model, will be considered in detail shortly.

Since both prototype and exemplar-based models can account for many phenomena, it is difficult to generate differential predictions (see, e.g., Hintzman & Ludlam, 1980). In one attempt to do so, Medin and Schaffer (1978) contrasted the predictions of what they called independent- and interactive-cue theories, with prototype models being one type of independent-cue theory. Independent-cue theories assume that the information entering into category judgments (overall similarity, distance, or validity) can be derived from an additive combination of the information from component attributes (Franks & Bransford, 1971; Reed, 1972). In other words, the more characteristic attributes an exemplar has, the easier it should be to learn and classify.


Interactive-cue theories reject such additivity. Thus in Medin and Schaffer's (1978) model, which is an interactive-cue model, the various attribute values comprising exemplars are combined in a multiplicative manner to determine the overall similarity of two exemplars.

The multiplicative rule has the implication that an exemplar may be classified more efficiently if it is highly similar to one instance and dissimilar to a second than if it has medium similarity to two instances of a category. Hence the context model predicts that categorization performance will vary with the number of stored exemplars similar to the test item. Independent-cue models are insensitive to such density effects. In a series of four experiments, Medin and Schaffer (1978) obtained clear support for the context model. Data from original learning, transfer, and speeded classification were in each case more in line with the context model than with a generalized independent-cue model. In addition, a mathematical version of the context model gave an excellent quantitative account of classification performance on transfer tests involving new and old instances.

The Strategy Issue

One response to the above results is to question their generality, particularly with respect to the issue of strategies. One could argue that there was something about the Medin and Schaffer (1978) items, or some detail of the experimental situation, that discouraged people from developing the type of category representation appropriate to independent-cue theories, like a prototype model. If people had been instructed, say, to form a prototype, the results might well have been different.

The importance of strategies in classification learning seems undeniable, and there are important issues concerning how strategies ought to be treated by a theory. One point of view is that strategies modify both representations and processes. According to this idea, to understand categorization one needs (a) a list of the possible strategies that might be used in the task, (b) a separate theory mapping each strategy on to performance, and (c) a higher level theory specifying the factors governing strategy selection. Under this view current theories of categorization are essentially alternative procedures, each of which can be evidenced when their eliciting factors are operative. In this sense all models are correct (and all incorrect) at least some of the time.

An alternative view is that strategies induced by instructions alter the underlying representation in particular ways while leaving unchanged the processes operating on these representations. For example, a key assumption of independent-cue models is that judgments are based on a weighted, additive combination of information from component attributes. Strategies might alter the weights attached to different attributes, but judgments could still be based on an additive and independent combination of information. In other words, strategies would influence the parameters in the model but leave the basic model intact. The same possibility holds for interactive-cue models, such as the context model. One could hold that strategy variations modify category representations in essentially quantitative ways while leaving the retrieval process implied by the context model unchanged. The multiplicative rule for similarity relationships would still hold, but the similarity parameters associated with component attributes might well differ for different strategies. The goal of the present investigation was to provide evidence bearing on this alternative view of strategies.

Overview of the Experiments

In the present experiments we attempted to induce strategy variations by means of instructions. The task involved two fuzzy categories; that is, no individual cue was perfectly valid and associated with members of one category but not the other. In one condition people were asked to use a rule-plus-exception strategy; in a second people were asked to learn the central tendency, or prototypes, of the categories; in the third people were given no special instructions. Our aim was to see if the model could handle learning and transfer data from these three distinct conditions solely in terms of differences in the similarity parameters associated with the attributes.



By imposing some instructional control over the strategies that people employed, we hoped to obtain a clearer picture of the effects of strategies on performance as well as an indication of what aspects of the data were invariant over strategies. The instructional variations were also designed to put particular models, for example, prototype models, in the best light by asking people to do what the models imply that they normally do.

The basic design of the experiments is shown in Table 1. The items were Brunswik faces varying in eye height (EH), eye separation (ES), nose length (NL), and mouth height (MH). There were two possible values for each attribute, which are represented in the table in terms of a binary notation. For example, the value 1 on the attribute of nose length might correspond to a long nose, the value 0 to a short nose. Categories A and B differ with respect to what is generally true. That is, on each attribute Category A exemplars tend to have the value 1 and Category B exemplars the value 0, although there is at least one exception for each attribute in each category. This distribution of attributes corresponds to that used by Medin and Schaffer (1978) in their Experiments 2 and 3; as they note, the two categories are separable by a linear discriminant function as required by independent-cue models (see, e.g., Reed, 1972).

Although the models will be evaluated in terms of their ability to account for the entire pattern of data, a good guidepost for distinguishing the context model from independent-cue models is the comparison of Face 4 and Face 7 (see Table 1). Since the central tendency, or modal prototype, for Category A is 1111, Face 4 must be at least as close to the prototype as Face 7 regardless of how the attributes are weighted.1 Thus all independent-cue models predict that Face 4 will be easier to learn and more accurately classified than Face 7 because for the only dimension where the two differ, Face 4 has the typical or characteristic value and Face 7 the atypical value. In contrast, interactive-cue models in general and the context model in particular predict Face 7 should be easier because the number of highly similar patterns is the most important factor in performance.

Table 1
Attribute Structure of Categories Used in the Experiment

                  Attribute value
Face no.      EH    ES    NL    MH

Training items

A exemplars
 4             1     1     1     0
 7             1     0     1     0
15             1     0     1     1
13             1     1     0     1
 5             0     1     1     1

B exemplars
12             1     1     0     0
 2             0     1     1     0
14             0     0     0     1
10             0     0     0     0

New transfer items
 1             1     0     0     1
 3             1     0     0     0
 6             1     1     1     1
 8             0     0     1     0
 9             0     1     0     1
11             0     0     1     1
16             0     1     0     0

Note. EH = eye height; ES = eye separation; NL = nose length; MH = mouth height. See the text for explanation of the binary notation.

Although one would not want to assume all dimensions are equally salient, for convenience we shall call two faces highly similar if they differ in value along only one dimension. Face 7 is highly similar (differs on only one attribute) to two other faces in Category A (4 and 15) but is not highly similar to any face in Category B. Face 4, on the other hand, is highly similar to one face in Category A (7) and to two in Category B (2 and 12); it should be more difficult to classify. This prediction is not entirely parameter free, but it holds over a very large range of values, and values that would alter that prediction would produce other testable distinctions between the context and independent-cue models.

1 Although we describe the prototype in terms of modal values, a prototype could as readily be based on mean values. This distinction would not lead to differential predictions in our experiments, as both a modal prototype and a mean prototype represent special cases of the general independent-cue model.


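To make the density argument concrete, the short Python sketch below (our illustration, not part of the original article) transcribes the Table 1 training items and counts, for Faces 4 and 7, the training faces that differ from them on exactly one attribute, which is the "highly similar" criterion adopted in the text.

# Training exemplars from Table 1, coded as (EH, ES, NL, MH) tuples; illustrative only.
A_EXEMPLARS = {4: (1, 1, 1, 0), 7: (1, 0, 1, 0), 15: (1, 0, 1, 1),
               13: (1, 1, 0, 1), 5: (0, 1, 1, 1)}
B_EXEMPLARS = {12: (1, 1, 0, 0), 2: (0, 1, 1, 0), 14: (0, 0, 0, 1),
               10: (0, 0, 0, 0)}

def n_mismatches(x, y):
    # Number of attributes on which two faces differ.
    return sum(a != b for a, b in zip(x, y))

def close_neighbors(face_no):
    # Training faces (other than the probe itself) that differ on exactly one attribute.
    probe = A_EXEMPLARS[face_no]
    in_a = [n for n, v in A_EXEMPLARS.items() if n != face_no and n_mismatches(probe, v) == 1]
    in_b = [n for n, v in B_EXEMPLARS.items() if n_mismatches(probe, v) == 1]
    return in_a, in_b

for face in (4, 7):
    print(face, close_neighbors(face))
# 4 ([7], [12, 2])   -> one close neighbor in Category A, two in Category B
# 7 ([4, 15], [])    -> two close neighbors in Category A, none in Category B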

A Note on Stimuli and Category Structure in Relation to the Models

The generality of any results from the particular stimuli and category structure shown in Table 1 is obviously limited. In what follows we provide some rationale for our particular choice of structure and stimuli to clear up some common misconceptions about the models.

Category structure. The categories were set up to allow for a contrast between the context model and independent-cue models without violating any major constraint on structure implied by these models. The only constraint associated with the context model is that categorization will be easier to the extent that within-category similarity is maximized and between-category similarity is minimized. Independent-cue models require that categories be linearly separable if perfect categorization is to occur. That is, there must be some weighted additive combination of attribute values that puts all A exemplars into Category A and all B exemplars into Category B. One way to see that the stimuli in Table 1 satisfy that constraint is to note that all exemplars could be correctly classified by looking at eye height, nose length, and mouth height, determining whether the values were typical of Category A or B, and then assigning the face to Category A if two or more of the values were typical for Category A and to Category B if two or more of the values were typical for B. When categories are linearly separable, it is more difficult to distinguish the predictions of the contending models, but as we have seen, the structure in Table 1 does permit some contrast.
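The two-out-of-three check just described is easy to verify mechanically. The sketch below is ours; it assumes the Table 1 coding in which 1 is the A-typical value on every attribute and applies that rule to the nine training faces.

# Training faces from Table 1: (EH, ES, NL, MH) -> category; value 1 is A-typical.
TRAINING = {
    (1, 1, 1, 0): "A", (1, 0, 1, 0): "A", (1, 0, 1, 1): "A",
    (1, 1, 0, 1): "A", (0, 1, 1, 1): "A",
    (1, 1, 0, 0): "B", (0, 1, 1, 0): "B", (0, 0, 0, 1): "B", (0, 0, 0, 0): "B",
}

def two_of_three_rule(face):
    # Count A-typical values on eye height, nose length, and mouth height only.
    eh, _es, nl, mh = face
    return "A" if eh + nl + mh >= 2 else "B"

assert all(two_of_three_rule(face) == label for face, label in TRAINING.items())
print("All nine training faces are classified correctly by the 2-of-3 rule.")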

Stimuli. One reason for using Brunswik faces was that Reed (1972) used them in studies that were taken as providing support for a specific independent-cue model (prototype theory). These experiments predated the context theory, so it is difficult to judge how it would have fared.

A second major reason for using Brunswik faces is that they are highly confusable. Though the individual values are distinctive, individual faces differ from each other only in the particular combination of attribute values they possess. This might appear to bias the experiment against the context model inasmuch as it assumes that performance is based on the retrieval of specific item information; however, the context model does not specifically assume that a distinct or distinctly accessible representation is developed for each individual stimulus. In the next paragraphs we amplify this point. (For additional details, see Medin & Schaffer, 1978, pp. 210-212.)

In the context model the parameter reflecting the similarity of two values on an attribute is assumed to be less when that attribute is attended to, or forms part of a hypothesis, than when it is not so salient. If instructions encourage the belief that there is only one critical attribute, many attributes of the item will not be encoded in any detail at all, and the resulting representations will not be sufficient to distinguish the individual exemplars (e.g., Bourne & O'Banion, 1969; Calfee, 1969). In other words, it is not necessarily assumed that a distinct representation is set up for each individual exemplar.

Consider a highly simplified classification task involving two Category A patterns (A1 = 1110, A2 = 1010) and two Category B patterns (B1 = 0001, B2 = 0010). Suppose that a person in the experiment has selectively attended (perhaps tested hypotheses about) the second and third dimensions, so less information has been stored about the first and fourth dimensions. The subject's representation of exemplar information might be something like this:

?11?-A (A1)    000?-B (B1)
?010-A (A2)    ?010-B (B2)

where the question marks indicate that information that would differentiate Value 1 and Value 0 on that dimension was not stored (or cannot be accessed). Note that this representation is not sufficient to produce perfect performance because the representations associated with A2 and B2 cannot be distinguished. When B2 is presented for a test, the representation associated with A2 should be as likely to be accessed as the representation associated with B2.


If a new pattern B3 = 0000 is presented, it should be correctly classified because it would most likely access the representation associated with B1. Note further that on a new-old recognition test, B3 would very likely be recognized falsely as old for the same reason. Thus depending on the completeness of the exemplar information and the nature of the probes, new-old recognition could actually be at chance and classification could be based on exemplar representations and be relatively accurate.2
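One way to make this scenario concrete is the sketch below. It is only an illustration of the idea in the text, not the model's formal statement: unstored values ("?") contribute no evidence either way, a stored value that mismatches the probe multiplies in a small hypothetical similarity (here 0.1), and category evidence is the ratio of summed similarities.

# Stored traces from the example above; "?" marks a value that was not stored.
TRACES = {"A1": ("?11?", "A"), "A2": ("?010", "A"),
          "B1": ("000?", "B"), "B2": ("?010", "B")}
S_MISMATCH = 0.1  # hypothetical similarity contributed by a stored, mismatching value

def similarity(probe, trace):
    # Multiplicative rule; an unstored value contributes no evidence either way.
    sim = 1.0
    for p, t in zip(probe, trace):
        if t != "?" and p != t:
            sim *= S_MISMATCH
    return sim

def p_category_a(probe):
    sims = {name: similarity(probe, trace) for name, (trace, _) in TRACES.items()}
    total_a = sum(s for name, s in sims.items() if TRACES[name][1] == "A")
    total_b = sum(s for name, s in sims.items() if TRACES[name][1] == "B")
    return total_a / (total_a + total_b)

print(p_category_a("0010"))  # old pattern B2: its trace equals A2's, so P(A) = 0.5 (chance)
print(p_category_a("0000"))  # new pattern B3: dominated by B1's trace, so P(A) is low (~0.09),
                             # and B3 would likewise tend to be falsely recognized as old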

By using highly confusable stimuli, we aimed to assure that when a training stimulus was presented, it would not automatically retrieve its own representation. As we have seen, the context model does not assume that exemplars get perfectly coded nor that a probe invariably accesses the corresponding representation in memory. Even when the correct category assignment has been attached to each of the individual training stimuli, classification performance may not be perfect because the presentation of a training stimulus would not invariably lead to accessing its associated representation in memory.

A secondary aim of using confusable stimuli was to make clearer the nature of the contrast between independent-cue models and the context model. Although the context model assumes that performance is based on retrieval of exemplar information and certain versions of the general independent-cue model assume that performance is based on an abstract prototype, the fundamental contrast is not between exemplar and prototype models but, rather, between independent-cue and interactive-cue models. For example, one independent-cue model, the average distance model (e.g., Reed, 1972), is an exemplar model that assumes that representations associated with every training stimulus are retrieved when a probe is presented and that the probe is assigned to the category whose members have the greater average similarity to (lesser average distance from) the probe item. As long as binary-valued dimensions are used, as in the present experiments, the predictions of an average distance model cannot be differentiated from predictions of a prototype model. It is also true that the context model and some (but not all) other interactive-cue models assume that categorization can involve a considerable amount of abstraction—the key difference is that interactive-cue models do not assume that this abstracted information is confined to an additive and independent sum of components.

Finally, one should note that more general forms of independent-cue models allow for training stimuli to be classified on the basis of specific information concerning that particular item. To the extent that this occurs, it would be more difficult to distinguish predictions of the contending theories, at least with respect to classification of training stimuli. Therefore, to minimize information specific to individual faces, the stimuli differed from each other only in their combination of values.

Method

Subjects

Ninety-six volunteers were solicited through ads in local newspapers. The subjects, men and women ranging in age from 17 to 30 years, were paid $2.50 for the experimental session. Thirty-two people were assigned to each of the three instructional conditions.

Stimuli

The stimuli were Brunswik faces displayed on an approximately 27 cm × 34 cm visual display screen (Digital Equipment Corp. VR-17 cathode ray tube screen) linked to a PDP-11 computer. The face outlines were 13.5 cm × 11.5 cm and centered on the screen.

2 The above offers one interpretation of the context model that makes it qualitatively consistent with new-old recognition being poor yet classification being exemplar-based and relatively accurate. Other interpretations of the context model are also possible here. Shiffrin (Note 1) has suggested that there is some probability on any training trial that one or more attribute values of the pattern may be encoded erroneously. Thus though the most likely outcome of a training trial is that all encoded values are correct, the second most likely outcome is that one encoded value is in error, the third most likely outcome is that two encoded values are in error, and so forth. The upshot is that training results in many exemplars of various sorts being stored in memory, with the likely majority being the exemplars actually presented or single-value perturbations of them. This majority underlies relatively accurate exemplar-based classification, and the sheer number of stored exemplars would be responsible for poor new-old recognition.


The faces differed in nose length, mouth height, eye separation, and eye height—the same attributes that Reed (1972) varied in his studies of category learning. The nose was either a 1.5-cm or a 3.0-cm vertical line centered within the face outline. The mouth was a 4.0-cm horizontal line, which was either 1.5 cm or 3.0 cm from the chin line. The eyes were 1 cm × 2.5 cm and were separated by either 1.5 cm or 3.5 cm (measuring from inner edges). Finally, the eyes were either 2.5 cm or 5 cm from the top of the face outline (measured to the top edge of the eye). The two possible values on each of the four attributes were combined to produce 16 distinct faces. The categories were constructed in accordance with the design shown in Table 1.

All subjects were presented the same faces, but the particular assignment of individual faces to the abstract notation in Table 1 varied across subjects. For example, for one subject 1001 might refer to a face with eyes up and far apart, a long nose, and a low mouth; for another subject 1001 might refer to a face with eyes down and close together, a long nose, and a low mouth; and so on. Overall, each face was assigned to a given abstract notation exactly twice for each instructional condition, once when Faces 4, 5, 7, 13, and 15 were associated with Category A and once when they were associated with Category B. Hence the assignment of faces to conditions, instructions, and category labels was completely counterbalanced.

General Procedures

The procedure had three phases: original learning, transfer, and speeded classification. Each phase is described below.

Initial learning. This phase consisted of up to 32 runs through the set of nine training faces (see Table 1), with a learning criterion of one errorless run. The trial sequence went as follows: (a) A face appeared on the screen. (b) To indicate their categorization, subjects pressed either the button marked A or the button marked B, which occupied the lower left and right corners, respectively, of a 4 × 4 button response box. (c) The face remained on the screen for an additional 2 sec while feedback was displayed below the face. (d) A 1-sec intertrial interval ensued.

The first part of the instructions was the same for all three instructional conditions. Subjects were told that they would see faces differing only in nose length, mouth height, distance between the eyes, and height of the eyes, and that their task was to learn to correctly classify the faces into Category A or Category B. They were further told that each facial feature had some information value for category membership, but that none was a perfectly reliable indicator of category membership.

The general procedure was then described. Subjects were told that they would be given immediate feedback about the correctness of their categorization responses. Subjects were then given the specific instructions designed to induce strategy differences. One group was given standard instructions not specifying any particular strategy3; a second group was told to use a rule-plus-exception strategy; the third group was asked to learn the central tendency (or prototype) for each category. Details are provided in Instructions for Initial Learning. Note that none of the groups were given instructions to use exemplars as a basis of performance, though this is the process specified by the context model.

Transfer. Transfer tests immediately followed initial training. Subjects were instructed they would see the old faces mixed in with some new faces that were very similar to the old. As each face appeared they were asked to check to see whether it was a new face and, if so, to press the button marked N for new. Then regardless of whether the face was old or new, they were to decide whether the face belonged in Category A or Category B, based on what they had learned before. Finally, they were asked to indicate how confident they were concerning which category the face belonged to by pressing the Guess button, the Think so button, or the Sure button. Subjects were then given two runs through all 16 faces, using different randomizations of the faces in the two runs. Each face remained on the screen until the confidence judgment was given. The intertrial interval was as before, but no feedback concerning either recognition or classification was given.

Speeded classification. After the transfer tests subjects were given an additional 16 runs through the nine training faces. Presentation and feedback were exactly as in initial learning; the only difference was that subjects were now told that their latencies were being measured and they were to respond as fast as they could without making errors. Subjects in the various instructional conditions were asked to use the strategy they had employed before.

Instructions for Initial Learning

Standard instructions. These instructions were not designed to focus subjects on any particular strategy. They read as follows:

At first you will have to guess because I haven't given you any information about which category a particular face falls into. Each face will be presented repeatedly during the learning phase of the experiment and by paying attention to the feedback, eventually you will be able to correctly assign each face to its appropriate category.

Rule-plus-exception instructions. In addition to the above standard instructions, subjects were told:

We want you to use a particular strategy to learn to classify the faces. You might call it a "rule-plus-exception strategy". First pay attention to nose length and learn which category most short-nosed faces fall into and which category is correct for most long-nosed faces. You will find that one short-nosed face and one long-nosed face are exceptions to the rule.

3 This instruction group is actually Experiment 3 of Medin and Schaffer (1978). The full description is reported here to facilitate cross-referencing and because previously unreported data from this group are included in the present article.


Memorize these faces. When you have mastered the task, you will be doing something like looking to see if the face is one of the exceptions; if so, make the correct response; if not, apply the rule for short and long noses. Usually the task would be quite difficult and fewer than half the people who try it would be able to learn it, but by using the rule-plus-exception strategy and focusing on nose length, you should be able to readily learn which category is correct for each face.

Prototype instructions. Since a prototype might be based on either mean or modal values, the instructions were designed to be neutral on this point. In addition to the standard instructions, subjects were told:

There are probably many strategies you could use to learn which category each face belongs to but we want you to focus on one particular one. As the faces appear, we want you to form a general impression of what "A" faces on the average look like and what "B" faces on the average look like. At the end of the learning phase of this experiment, I'll ask you whether the faces in one category generally had short or long noses, low or high mouths, up or down eyes, or close or far-apart eyes. As you develop a general impression of what "A" faces on the average look like and what "B" faces on the average look like, we want you to use these general impressions to help you classify the faces.

General Results

Initial Learning

As anticipated, the initial learning task proved difficult, with fewer than half of the subjects meeting the criterion of one errorless run. Overall, 14 of 32 people met the criterion in the standard and rule-plus-exception conditions, but only 8 of 32 reached criterion in the prototype condition. All subjects, however, improved with practice. If performance were completely at chance, subjects should have averaged 16 errors per face during learning (each face was presented 32 times); instead average errors per face were 8.0 under standard instructions, 6.4 under rule-plus-exception instructions, and 9.2 under prototype instructions.

Table 2 shows the distribution of errors across faces for the three instructional conditions. The rule-plus-exception instructions led to the best overall performance; the prototype instructions resulted in the poorest overall performance. There were also considerable differences across instructional conditions in the relative difficulty of individual faces. Face 12, for example, was associated with an average of 17.4 errors under prototype instructions but only 6.3 errors under rule-plus-exception instructions. Statistical tests confirm the reliability of these differences. An analysis of variance was conducted on errors during learning using the factors of instructional conditions, randomizations, and faces. Significant effects were obtained for the main effects of instructions, F(2, 48) = 4.44, MSe = .93, p < .02; faces, F(8, 384) = 55.96, MSe = .69, p < .0001; and the Instructions × Faces interaction, F(16, 384) = 3.96, MSe = .69, p < .001. In short, different instructions produced differences in learning that interacted with particular faces.

There were also some constants across instructions. In particular, the theoretically important comparison of Faces 4 and 7 showed a small but reliable advantage for Face 7 in all conditions, t(95) = 2.87, p < .01, consistent with the context model but not with independent-cue models. Also the faces that served as exceptions in the rule-plus-exception condition (Faces 2 and 13) were among the most difficult to master in all conditions.

Transfer Tests

Recognition. Recognition performance was very poor and there were only minor differences across instruction conditions.

Table 2
Mean Number of Errors for Each Face During Initial Learning as a Function of Instructions

                              Instruction
Face number    Standard    Rule-plus-exception    Prototype
 4               4.5              3.9                 7.7
 5               8.2              5.9                 9.2
 7               4.2              3.3                 6.7
13              11.9             10.7a               13.7
15               2.8              2.8                 4.9
 2              12.9             13.8a               10.3
10               4.4              3.8                 4.2
12              15.2              6.3                17.4
14               6.6              6.8                 8.7
M                8.0              6.4                 9.2

a Face was an exception in the rule-plus-exception condition.


The probability of saying new to an old face was .19, .22, and .13 in the respective standard, exception, and prototype instructional conditions, but the probability of saying new to a new face was only .23, .26, and .19 in the respective conditions. Overall, 59 subjects were more likely to say new to a new face than to an old face, 9 showed no difference, and 28 showed the opposite trend. Across the 96 subjects new-old recognition was barely above chance, χ²(1) = 6.44, p < .05. As noted in the introduction, this result is consistent with both classes of models under consideration.

Categorization during transfer. Categorization accuracy for the nine old and seven new faces for the various instructional conditions is shown in the first columns of Tables 3, 4, and 5. (The remaining columns of these tables contain theoretical predictions that will shortly be discussed.) Again prototype instructions yielded the poorest overall performance. Of course this may merely reflect the fact that this group mastered less of the categorical structure during the initial learning phase. Of somewhat greater interest are the differences between the instructional conditions in the relative difficulty of individual new faces. For example, with rule-plus-exception instructions, Face 1 was categorized as an A 45% of the time and Face 3 as a B 80% of the time; in contrast, with prototype instructions Face 1 was called an A 73% of the time and Face 3 a B 35% of the time. But importantly, although the differences were small, Face 4 was never categorized more accurately than Face 7 under any instructions. Moreover, the largest difference favoring Face 7 occurred under the prototype instructions, which should have been maximally favorable to the independent-cue model. Overall, Face 7 was classified significantly more accurately than Face 4, t(95) = 2.89, p < .02.

An analysis of variance of responses to the old faces revealed significant effects of instructions, F(2, 48) = 5.32, MSe = .93, p < .01, and faces, F(8, 384) = 17.56, MSe = .84, p < .001. A similar analysis for the new faces produced significant effects for faces, F(6, 288) = 38.22, MSe = 1.00, p < .0001, and for the Instructions × Faces interaction, F(90, 288) = 8.37, MSe = 1.00, p < .0001. (Detailed accounts of these results will be offered in the Theoretical Analysis section.)

Table 3
Observed and Predicted Proportions of Correct Categorizations for Each Face During Transfer: Standard Instructions

                                  Proportions
Face and                     Predicted        Predicted
category label   Observed    context model    independent-cue model

Old faces
 4A                .97          .94              .95
 7A                .97          .99              .92
15A                .92         1.00              .96
13A                .81          .72              .79
 5A                .72          .71              .71
12B                .67          .68              .67
 2B                .72          .71              .76
14B                .97         1.00              .95
10B                .95         1.00             1.00
M                  .89

New faces
 1A                .72          .78              .59
 6A                .98          .95             1.00
 9A                .27          .30              .14
11A                .39          .47              .43
 3B                .44          .45              .49
 8B                .77          .78              .65
16B                .91          .88              .94


Speeded Classification

Table 6 summarizes the data from the speeded-classification phase of the experiment. Average correct reaction times are given for each face for each instructional condition, with corresponding error rates. The pattern of results is by now a familiar one. There are instructional differences in the relative difficulty of faces—for example, Faces 2 and 13, the two exceptions, are the most difficult only in the rule-plus-exception condition. And consistent with the context model, once more Face 7 resulted in better performance—faster reaction times and fewer errors—than Face 4 in all instructional conditions. Again, this effect was modest but consistent.


Table 4
Observed and Predicted Proportions of Correct Categorizations for Each Face During Transfer: Rule-Plus-Exception Instructions

                                  Proportions
Face number and              Predicted        Predicted
category label   Observed    context model    independent-cue model

Old faces
 4A                .89          .91              .92
 7A                .94          .97              .91
15A                .94          .99              .98
13A                .72          .67              .68
 5A                .78          .74              .84
12B                .73          .72              .82
 2B                .70          .61              .66
14B                .91          .96              .92
10B                .95          .98             1.00
M                  .84

New faces
 1A                .45          .50              .41
 6A                .88          .94             1.00
 9A                .08          .20              .16
11A                .75          .79              .69
 3B                .80          .80              .72
 8B                .42          .48              .44
16B                .88          .92              .97

Statistical tests confirm the above impressions. Separate analyses of variance were conducted on reaction times and errors. For the reaction times, there were significant effects of instructions, F(2, 48) = 13.72, MSe = 2.21, p < .001; faces, F(8, 384) = 17.93, MSe = .20, p < .0001; and the Instructions × Faces interaction, F(16, 384) = 3.09, MSe = .20, p < .0001. For errors, there were significant effects of faces, F(8, 384) = 38.06, MSe = .79, p < .0001, and of the Faces × Instructions interaction, F(16, 384) = 2.63, MSe = .79, p < .001. Also, the three-way Instructions × Faces × Randomizations interaction was marginally significant, F(120, 384) = 1.30, MSe = .79, p < .05. Planned t tests on Faces 4 and 7 indicated that the latter produced fewer errors, t(95) = 3.18, p < .01, and faster responding, t(95) = 1.91, p < .06.

Theoretical Analysis

Although the guideline comparison of Face 4 with Face 7 uniformly favored the context model, it is also important to see how the contending models fit the transfer data quantitatively.

Table 5
Observed and Predicted Proportions of Correct Categorizations for Each Face During Transfer: Prototype Instructions

                                  Proportions
Face number and              Predicted        Predicted
category label   Observed    context model    independent-cue model

Old faces
 4A                .77          .83              .84
 7A                .97          .89              .92
15A                .98          .93              .98
13A                .70          .74              .77
 5A                .60          .57              .52
12B                .45          .47              .51
 2B                .72          .70              .75
14B                .83          .85              .84
10B                .87          .91             1.00
M                  .79

New faces
 1A                .73          .77              .72
 6A                .87          .89             1.00
 9A                .28          .28              .20
11A                .52          .46              .44
 3B                .35          .40              .46
 8B                .78          .74              .74
16B                .88          .85              .98


Table 6
Mean Correct Reaction Times (RT; in sec) for Each Old Face During Speeded Classification as a Function of Instructions

              Standard        Rule-plus-exception       Prototype
Face
number       RT      ER         RT       ER            RT      ER
 4          1.11    .05        1.27     .03           1.92    .07
 5          1.34    .14        1.61     .11           2.13    .18
 7          1.08    .03        1.21     .01           1.69    .04
13          1.27    .09        1.87a    .15           2.12    .14
15          1.07    .02        1.31     .01           1.54    .04
 2          1.30    .12        1.97a    .20           1.91    .12
10          1.08    .03        1.42     .02           1.64    .03
12          1.37    .19        1.58     .10           2.29    .16
14          1.13    .06        1.34     .04           1.85    .06
M           1.19    .08        1.51     .07           1.90    .09

Note. ER = error rate.
a Stimuli that were exceptions to the rule.


For the context model, predictions concerning transfer can be analyzed in terms of which exemplars are likely to be retrieved when any particular face is presented as a probe. It is assumed that the probability of assigning a particular probe face to Category A(B) is equal to the sum of the similarities of that face to each of the stored exemplars of A(B), divided by the sum of the similarities of that face to each of the stored exemplars of both categories (Medin & Schaffer, 1978).4 The similarity between a probe face and an exemplar is determined by a multiplicative combination rule. More precisely, the model has four parameters (each ranging between 0 and 1, with 1 designating maximum similarity) that correspond to the similarity parameters for the values of the four attributes used in this experiment; for example, the parameter for eye height specifies the similarity between the two different values of this attribute. And similarity between two items is determined by multiplying the relevant parameters.

Quantitative predictions of the general independent-cue model (including the prototype model as a special case) require some additional assumptions. Following Medin and Schaffer's (1978) treatment of independent-cue theory, we assume that associated with each attribute is some weight parameter reflecting the importance of that attribute in categorization. If Weh, Wes, Wnl, and Wmh are the weight parameters associated with eye height, eye separation, nose length, and mouth height, respectively, then the probability that a particular face will be classified as an A is equal to the relative weight of values consistent with Category A. For example, the probability that a face with values 1010 would be called an A is (Weh + Wnl)/(Weh + Wes + Wnl + Wmh). Since we are working with ratios, there are really only three independent parameters, assuming the parameters sum to one. In addition, in the case of old faces, we will assume that transfer performance may be based on specific exemplar information. This probability is represented by a parameter S. Since 1010 is a face used in training (Face 7), the probability of its being classified as an A would be equal to S + (1.0 - S) × (Weh + Wnl)/(Weh + Wes + Wnl + Wmh).
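The two response rules just described can be written down compactly. The sketch below is our paraphrase of the equations, using the Table 1 training set and, purely for illustration, the rounded standard-condition parameter values that appear later in Table 8; because those values are rounded, the outputs only approximately reproduce the predictions listed in Table 3.

# Training exemplars from Table 1, coded (EH, ES, NL, MH).
A_ITEMS = [(1, 1, 1, 0), (1, 0, 1, 0), (1, 0, 1, 1), (1, 1, 0, 1), (0, 1, 1, 1)]
B_ITEMS = [(1, 1, 0, 0), (0, 1, 1, 0), (0, 0, 0, 1), (0, 0, 0, 0)]

def context_p_a(probe, s):
    # Context model: P(A) = summed similarity to A exemplars over summed similarity to all.
    # s[i] is the similarity of the two values on attribute i; a mismatch multiplies it in.
    def sim(x, y):
        prod = 1.0
        for i in range(4):
            if x[i] != y[i]:
                prod *= s[i]
        return prod
    total_a = sum(sim(probe, e) for e in A_ITEMS)
    total_b = sum(sim(probe, e) for e in B_ITEMS)
    return total_a / (total_a + total_b)

def indcue_p_a(probe, w):
    # Independent-cue model: relative weight of the attributes on which the probe
    # carries the A-typical value (1).
    return sum(wi for wi, v in zip(w, probe) if v == 1) / sum(w)

# Rounded best-fitting values for the standard condition (Table 8); illustrative only.
s_context = (0.01, 0.11, 0.07, 0.40)
w_indcue, S = (0.50, 0.06, 0.36, 0.08), 0.42

face7 = (1, 0, 1, 0)  # old training Face 7
print(round(context_p_a(face7, s_context), 2))       # ~.99, cf. Table 3
p_old = S + (1 - S) * indcue_p_a(face7, w_indcue)    # old faces mix in specific item information
print(round(p_old, 2))                               # ~.92, cf. Table 3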

Both the general independent-cue model and the context model thus have four free parameters to be used in fitting the data. These parameters were separately estimated (by minimizing least squares) for each of the three instructional conditions. The goal of both models is to describe the transfer data from the different instructional conditions in terms of only parameter changes. The predictions from the two models are shown in Tables 3, 4, and 5. Both models do a fair job of capturing the main trends in the data, but the context model seems to provide better quantitative predictions. Some evidence for the latter's superiority is presented in Table 7. The table gives three measures of each model's goodness of fit—average absolute deviation, sum of squared deviations, and rank order correlation between predicted and observed classification accuracy—and all favor the context model.
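The least-squares estimation can likewise be sketched. The code below is not the authors' procedure; it simply minimizes the summed squared deviation between the context model's predictions and the observed standard-condition proportions from Table 3, with each similarity parameter bounded to [0, 1], using scipy's L-BFGS-B routine. Because the search is local and starts from an arbitrary point, the recovered parameters need only roughly resemble Table 8, but the residual should come out near the .032 reported for this condition in Table 7.

import numpy as np
from scipy.optimize import minimize

# Training exemplars from Table 1, coded (EH, ES, NL, MH).
A_ITEMS = [(1, 1, 1, 0), (1, 0, 1, 0), (1, 0, 1, 1), (1, 1, 0, 1), (0, 1, 1, 1)]
B_ITEMS = [(1, 1, 0, 0), (0, 1, 1, 0), (0, 0, 0, 1), (0, 0, 0, 0)]
# Transfer faces in the row order of Table 3 (old faces, then new), with the category
# label used for scoring and the observed proportion correct in the standard condition.
TRANSFER = [
    ((1, 1, 1, 0), "A", .97), ((1, 0, 1, 0), "A", .97), ((1, 0, 1, 1), "A", .92),
    ((1, 1, 0, 1), "A", .81), ((0, 1, 1, 1), "A", .72), ((1, 1, 0, 0), "B", .67),
    ((0, 1, 1, 0), "B", .72), ((0, 0, 0, 1), "B", .97), ((0, 0, 0, 0), "B", .95),
    ((1, 0, 0, 1), "A", .72), ((1, 1, 1, 1), "A", .98), ((0, 1, 0, 1), "A", .27),
    ((0, 0, 1, 1), "A", .39), ((1, 0, 0, 0), "B", .44), ((0, 0, 1, 0), "B", .77),
    ((0, 1, 0, 0), "B", .91),
]

def predicted_correct(probe, label, s):
    # Context-model proportion correct for one transfer face.
    sim = lambda x, y: float(np.prod([s[i] for i in range(4) if x[i] != y[i]]))
    total_a = sum(sim(probe, e) for e in A_ITEMS)
    total_b = sum(sim(probe, e) for e in B_ITEMS)
    p_a = total_a / (total_a + total_b)
    return p_a if label == "A" else 1.0 - p_a

def sse(s):
    # Sum of squared deviations between predicted and observed proportions.
    return sum((predicted_correct(f, lab, s) - obs) ** 2 for f, lab, obs in TRANSFER)

fit = minimize(sse, x0=[0.5] * 4, bounds=[(0.0, 1.0)] * 4, method="L-BFGS-B")
print(fit.x)    # similarity parameters for EH, ES, NL, MH
print(fit.fun)  # residual sum of squared deviations (about .03 for this condition)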

To obtain a more precise comparison of how well the two models fit the data, chi-square tests were computed for the three experiments. In these tests all cases in which the expected number of responses was less than five were lumped into a single cell. For the independent-cue model, χ²(31) = 194.4, p < .001, which is highly significant; for the context model, χ²(33) = 47.3, p > .05, which is not quite statistically significant. The difference in chi-square values provides an index of the relative accuracy of the two models, and this difference, χ²(2) = 147.1, is highly significant (p < .001). (Technically this test requires that the component chi-squares have the same degrees of freedom, but the fact that there were fewer degrees of freedom associated with the independent-cue model should, if anything, favor this model.)

The parameter values associated with these predictions are shown in Table 8. The parameter constraints are fairly tight in that values more than a few percentage points away yield substantially poorer fits for both models.

4 As Medin and Schaffer (1978) noted, the idea is not that all stored patterns are accessed by each probe but, rather, that the similarity parameters determine which patterns are likely to be accessed by the probe. The particular ratio rule is already an approximation, since similarity parameters would be expected to differ for individual subjects. The best defense of the response rule is that it is a fair approximation and that it seems to work.


Table 7
Statistics for Evaluating Goodness of Fit of Models Applied to Transfer Categorization Data

                       Average absolute deviation    Sum of squared deviations    Rank-order correlations
Instructional
condition              Context   Independent cue     Context   Independent cue    Context   Independent cue
Standard                .036          .048             .032         .063            +.90          +.90
Rule-plus-exception     .048          .055             .049         .062            +.98          +.87
Prototype               .038          .072             .028         .114            +.96          +.88

Consider first the parameters for the context model. Here the smaller the similarity parameter, the more salient the dimension. Note that in the rule-plus-exception condition, the similarity parameter for the dimension of nose length is 0, which is consistent with the instructions, making this attribute salient. (There is no parameter for specific item information, since performance is assumed to be exclusively based on exemplar retrieval.) Most important, note that the parameters for standard and rule-plus-exception instructions are generally smaller than those with prototype instructions. This suggests that subjects who had either standard or rule-plus-exception instructions were more likely to store distinctive information about the attributes of exemplars than subjects who had prototype instructions. And this accounts for why the former two instructional conditions led to better overall performance than did prototype instructions. Similarly, the changes in parameter values with instructions account for many of the interactions we obtained between faces and instructions. As one example, Face 12 was categorized more efficiently with rule-plus-exception than prototype instructions because (a) under the former instructions nose length is more salient and (b) correct categorization of this face hinges critically on the nose-length attribute (see Table 1).
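The Face 12 example can be checked directly from the rounded Table 8 values. The sketch below is ours; it applies the context model's ratio rule to Face 12 under the rule-plus-exception and prototype parameter sets, and because of rounding the results only approximately match the corresponding predictions in Tables 4 and 5.

# Training exemplars from Table 1, coded (EH, ES, NL, MH).
A_ITEMS = [(1, 1, 1, 0), (1, 0, 1, 0), (1, 0, 1, 1), (1, 1, 0, 1), (0, 1, 1, 1)]
B_ITEMS = [(1, 1, 0, 0), (0, 1, 1, 0), (0, 0, 0, 1), (0, 0, 0, 0)]

def p_correct(probe, label, s):
    # Context-model probability of a correct categorization for a face in category `label`.
    def sim(x, y):
        prod = 1.0
        for i in range(4):
            if x[i] != y[i]:
                prod *= s[i]
        return prod
    total_a = sum(sim(probe, e) for e in A_ITEMS)
    total_b = sum(sim(probe, e) for e in B_ITEMS)
    return (total_a if label == "A" else total_b) / (total_a + total_b)

# Rounded context-model parameters (EH, ES, NL, MH) from Table 8.
RULE_PLUS_EXCEPTION = (0.15, 0.37, 0.00, 0.41)
PROTOTYPE = (0.05, 0.62, 0.37, 0.48)

face12 = (1, 1, 0, 0)  # Category B; its membership hinges on the nose-length value
print(round(p_correct(face12, "B", RULE_PLUS_EXCEPTION), 2))  # ~.75 (Table 4 lists .72)
print(round(p_correct(face12, "B", PROTOTYPE), 2))            # ~.47 (Table 5 lists .47)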

For the independent-cue model, the greater the parameter value associated with an attribute, the greater its weight or salience. Therefore these values generally should be and are negatively correlated with the similarity parameters of the context model. The parameter values for specific exemplar information raise a problem for the independent-cue model. In two of the three instructional conditions this value is high (.42). This leads one to ask why old-new recognition was so poor, and why it was not poorer with prototype instructions than with the other two conditions. That is, if there is information that identifies specific exemplars during classification, then that same information should have mediated old-new recognition. The fact that it did not casts further doubt on the independent-cue model, at least the versions that include specific item information. We also tried fitting the data with a variation of the independent-cue model that did not include specific item information but assumed that there was some probability, p, that the summary representation (e.g., prototype) had not been developed, in which case it was assumed that subjects were forced to guess. This variation of the model produced markedly worse fits to the data, mainly because performance on old face patterns was too good relative to performance on new faces, a result that the modified model cannot predict.

A Comment About Degree of Learning

The question arises about whether there were any systematic differences between learners and nonlearners, or differences as a function of stage of learning. For example, it is conceivable that the independent-cue model describes performance accurately only early in learning, whereas the context model might be more accurate at later stages of learning. The data give little support to this idea. The relative difficulty of Faces 4 and 7 did not show any noticeable interaction with practice, and Face 7 was easier than Face 4 for both learners (those who met the criterion of one errorless run in initial learning) and nonlearners.


Table 8
Best Fitting Parameter Values for the Two Models

                                                 Dimension
                                     Eye        Eye                          Specific
Model                               height   separation    Nose    Mouth    item (S)

Standard instructions
  Context                            .01        .11         .07     .40
  Independent cue                    .50        .06         .36     .08       .42
Rule-plus-exception instructions
  Context                            .15        .37         .00     .41
  Independent cue                    .28        .03         .55     .13       .42
Prototype instructions
  Context                            .05        .62         .37     .48
  Independent cue                    .53        .02         .26     .18       .11

Overall, nonlearners averaged .77 more errors on Face 4 than Face 7 during learning, t(59) = 1.72, p < .10, and learners averaged 1.22 more errors on Face 4, t(35) = 2.69, p < .05. The larger difference for learners is to be expected, since differences should not begin to appear until at least a modest amount of learning has taken place. Finally, informal attempts to fit the transfer performance of learners and nonlearners separately suggest that differences can be fairly accurately described simply in terms of differences in the parameters of the context model (the similarity parameters for nonlearners are higher).

Although some learning is needed before differences between Faces 4 and 7 are detectable, the relative difficulty of the two faces did not interact with stage of practice in any way that suggests that the context model describes performance at one stage of learning and independent-cue models are correct for a different stage. Breaking the 32 original-learning trials into four blocks (of eight trials each) yielded an average number of errors of 2.06, 1.72, 1.08, and .84 for Face 4 and an average of 2.02, 1.24, .90, and .66 for Face 7. Thus Face 7 led to better performance than Face 4 on each block.

Discussion

The main results are easy to describe. The instructional variations produced large differences in the pattern of errors, reaction time, and transfer performance. There were strong interactions of instructional conditions with particular faces. Yet certain relationships in the data held across all conditions, relationships that were accurately described by the context model.

Furthermore, this model provided a good account of many of the instructionally induced differences simply in terms of variations in the similarity parameters of the attributes. No new or special processes were required for the different conditions, and the context model fit the data better than an independent-cue model in each condition. At least for the present studies, instructional manipulations influence the representations but not the basic processes operating on them. Despite the variations in performance for each of the instructional conditions, performance was more in line with interactive-cue models than with models assuming that information is combined in an additive and independent manner. This raises the possibility that it may not be necessary for a new process model to be developed for each alternative strategy a person might employ. One might opt for formulating process models on a level at which they can capture the relations in the data that are invariant over strategy. There is little point in speculating about the viability of extending the context model to still other strategy variations, but we know from the present study that at least some degree of generality can be achieved even when experimental manipulations dramatically alter many details of performance.

As noted earlier, one should also be cautious about generalizing the present results to different stimuli and different category structures, and at best the present results should be taken as suggestive. Still, there is evidence that the advantage of interactive models over independent-cue models could have some generality.


The Medin and Schaffer (1978) results held across both face and geometric stimuli and across at least modest variations in category structure. In none of these experiments, however, has the number of alternative training stimuli been very large. In a recent series of studies reported by Medin (Note 2), category size was varied in a major way. These experiments compared the difficulty of learning categories that either were or were not linearly separable. Independent-cue models predict that with other factors held constant, the linearly separable task should be easier. In different experiments the stimulus set was either small or unlimited (no stimulus was ever repeated). In neither case was there any evidence that the linearly separable task was easier, contrary to independent-cue models.

The present approach to the role of strategies in learning departs from usual practices. That is, typically attention is focused directly on strategies rather than on the by-products associated with the use of strategies. And usually one is confronted with, and takes as the appropriate task, addressing the diversity and flexibility of strategies. By concentrating on the representations that result from the use of strategies, attention is called to the commonalities underlying performance. The present data are consistent with the idea that a basic property of categorization is that probes act as retrieval cues to access representations similar to the probe. This sensitivity to similarity acts as an efficient mechanism for categorization by analogy.

Brooks (1978) has argued that analogical reasoning has been given short shrift in analyses of categorization learning in favor of attention to the more analytical thinking associated with strategies and hypothesis testing. Indeed, there is some evidence that for complex category structures, mastery of the category is accomplished better in the absence of strategies than in their presence (Kossan, 1978; Reber, 1969, 1976). The context model suggests that analytical and analogical processes should not be viewed as mutually exclusive alternatives. Rather, access to stored representations may always be by means of an essentially analogical process, but the character of the representations may be modified by analytical strategies employed during learning.

Reference Notes

1. Shiffrin, R. M. Personal communication, January 26, 1981.
2. Medin, D. L. Levels of categorization and the integration of new experiences. Paper presented at the meeting of the American Psychological Association, Montreal, Canada, September 1980.

References

Brooks, L. Nonanalytic concept formation and memory for instances. In E. Rosch & B. B. Lloyd (Eds.), Cognition and categorization. Hillsdale, N.J.: Erlbaum, 1978.
Bourne, L. E., & O'Banion, K. Memory of individual events in concept identification. Psychonomic Science, 1969, 16, 101-103.
Calfee, R. C. Recall and recognition memory in concept identification. Journal of Experimental Psychology, 1969, 81, 436-440.
Franks, J. J., & Bransford, J. D. Abstraction of visual patterns. Journal of Experimental Psychology, 1971, 90, 65-74.
Hintzman, D. L., & Ludlam, G. Differential forgetting of prototypes and old instances: Simulation by an exemplar-based classification model. Memory & Cognition, 1980, 8, 378-382.
Kossan, N. E. Structure and strategy in concept acquisition. Unpublished doctoral dissertation, Stanford University, 1978.
Medin, D. L., & Schaffer, M. M. Context theory of classification learning. Psychological Review, 1978, 85, 207-238.
Posner, M. I., & Keele, S. W. On the genesis of abstract ideas. Journal of Experimental Psychology, 1968, 77, 353-363.
Reber, A. S. Transfer of syntactic structure in synthetic languages. Journal of Experimental Psychology, 1969, 81, 115-119.
Reber, A. S. Implicit learning of synthetic languages: The role of instructional set. Journal of Experimental Psychology: Human Learning and Memory, 1976, 2, 88-94.
Reed, S. K. Pattern recognition and categorization. Cognitive Psychology, 1972, 3, 382-407.

Received December 10, 1980

