
Elimination by aspects: A theory of choice.


PSYCHOLOGICAL REVIEW

VOL. 79, No. 4 JULY 1972

ELIMINATION BY ASPECTS: A THEORY OF CHOICE 1

AMOS TVERSKY 2

Hebrew University of Jerusalem

Most probabilistic analyses of choice are based on the assumption of simple scalability, which is an ordinal formulation of the principle of independence from irrelevant alternatives. This assumption, however, is shown to be inadequate on both theoretical and experimental grounds. To resolve this problem, a more general theory of choice based on a covert elimination process is developed. In this theory, each alternative is viewed as a set of aspects. At each stage in the process, an aspect is selected (with probability proportional to its weight), and all the alternatives that do not include the selected aspect are eliminated. The process continues until all alternatives but one are eliminated. It is shown (a) that this model is expressible purely in terms of the choice alternatives without any reference to specific aspects, (b) that it can be tested using observable choice probabilities, and (c) that it generalizes the choice models of R. D. Luce and of F. Restle. Empirical support from a study of psychophysical and preferential judgments is presented. The strategic implications of the present development are sketched, and the logic of elimination by aspects is discussed from both psychological and decision-theoretical viewpoints.

When faced with a choice among several alternatives, people often experience uncertainty and exhibit inconsistency. That is, people are often not sure which alternative they should select, nor do they always make the same choice under seemingly identical conditions. In order to account for the observed inconsistency and the reported uncertainty, choice behavior has been viewed as a probabilistic process.

Probabilistic theories of preference differ with respect to the nature of the mechanism that is assumed to govern choice. Some theories (e.g., Thurstone, 1927, 1959) attribute a random element to the determination of subjective value, while others (e.g., Luce, 1959) attribute a random element to the decision rule. Most theoretical work on probabilistic preferences has been based on the notion of independence among alternatives. This notion, however, is incompatible with some observed patterns of preferences which exhibit systematic dependencies among alternatives.

This paper develops a probabilistic theory of choice, based on a covert elimination process, which accounts for observed dependencies among alternatives. The first section analyzes the independence assumption; the second section formulates a theory of choice and discusses its consequences; some experimental tests of the theory are reported in the third section; and its psychological implications are explored in the fourth and final section.

1 The research was supported, in part, by National Science Foundation Grant GB-6782. Much of the work reported in this paper was accomplished while the author was a Fellow at the Center for Advanced Study in the Behavioral Sciences, Stanford, California, during 1970-1971. I wish to thank the Center for the generous hospitality. I am grateful to David H. Krantz for many invaluable discussions throughout the years, to Maya Bar-Hillel for her assistance in both theoretical and experimental phases of the investigation, and to Edward N. Pugh for his help in the analysis of the data. I have also benefited from discussions with Clyde H. Coombs, Robyn M. Dawes, R. Duncan Luce, Jacob Marschak, J. E. Russo, and Paul Slovic.

2 Requests for reprints should be sent to Amos Tversky, who is now at the Oregon Research Institute, P. O. Box 3196, Eugene, Oregon 97403.

© 1972 by the American Psychological Association, Inc.

We begin by introducing some notation. Let $T = \{x, y, z, \ldots\}$ be a finite set, interpreted as the total set of alternatives under consideration. We use $A, B, C, \ldots$, to denote specific nonempty subsets of $T$, and $A_i, B_j, C_k, \ldots$, to denote variables ranging over nonempty subsets of $T$. Thus, $\{A_i \mid A_i \supseteq B\}$ is the set of all subsets of $T$ which include $B$. The number of elements in $A$ is denoted by $a$. Proper and nonproper set inclusion are denoted, respectively, by $\supset$ and $\supseteq$. The empty set is denoted by $\phi$. The probability of choosing an alternative $x$ from an offered set $A \subseteq T$ is denoted $P(x,A)$. Naturally, we assume $P(x,A) \geq 0$, $\sum_{x \in A} P(x,A) = 1$ for any $A$, and $P(x,A) = 0$ for any $x \notin A$. For brevity, we write $P(x;y)$ for $P(x,\{x,y\})$, $P(x;y,z)$ for $P(x,\{x,y,z\})$, etc. A real-valued, nonnegative function in one argument is called a scale. Choice probability is typically estimated by relative frequency in repeated choices. It should be kept in mind, however, that other empirical interpretations of choice probability, such as confidence judgments (which are applicable to unique choice situations), might also be adopted.

Perhaps the most general formulation of the notion of independence from irrelevant alternatives is the assumption that the alternatives can be scaled so that each choice probability is expressible as a monotone function of the scale values of the respective alternatives. This assumption, called simple scalability, was first investigated by Krantz (1964, Appendix A). Formally, simple scalability holds if and only if there exists a scale $u$ defined on the alternatives of $T$ and functions $F_n$ in $n$ arguments, $2 \leq n \leq t$, such that for any $A = \{x, \ldots, z\} \subseteq T$,

$$P(x,A) = F_a[u(x), \ldots, u(z)] \tag{1}$$

where each $F_a$ is strictly increasing in the first argument and strictly decreasing in the remaining $a - 1$ arguments provided $P(x,A) \neq 0, 1$. This assumption underlies most theoretical work in the field. The theory of Luce (1959), for example, is a special case of this assumption where

$$P(x,A) = \frac{u(x)}{\sum_{y \in A} u(y)}. \tag{2}$$

Despite its generality, simple scalability (Equation 1) has strong testable consequences. In particular, it implies that for all $x, y \in A$,

$$P(x;y) \geq 1/2 \quad \text{iff} \quad P(x,A) \geq P(y,A), \quad \text{provided } P(y,A) \neq 0. \tag{3}$$

Equation 3 asserts that the ordering of $x$ and $y$, by choice probability, is independent of the offered set.3 Thus, if $x$ is preferred to $y$ in one context (e.g., $P(x;y) \geq 1/2$), then $x$ is preferred to $y$ in any context. Furthermore, if $P(x;y) = 1/2$ then $P(x,A) = P(y,A)$ for any $A$ which contains both $x$ and $y$. Thus, if an individual is indifferent between $x$ and $y$, then he should choose them with equal probability from any set which contains them.

This assumption, however, is not valid in general, as suggested by several counterexamples and demonstrated in many experiments (see Becker, DeGroot, & Marschak, 1963b; Chipman, 1960; Coombs, 1958; Krantz, 1967; Tversky & Russo, 1969). To motivate the present development, let us examine the arguments against simple scalability starting with an example proposed by Debreu (1960).

3 Simple scalability is, in fact, equivalent (see Tversky, 1972) to the following order independence assumption. For $x, y \in A - B$, and $z \in B$, $P(x,A) \geq P(y,A)$ iff $P(z, B \cup \{x\}) \leq P(z, B \cup \{y\})$, provided the terms on the two sides of either inequality are not both 0 or 1.

Suppose you are offered a choice among the following three records: a suite by Debussy, denoted $D$, and two different recordings of the same Beethoven symphony, denoted $B_1$ and $B_2$. Assume that the two Beethoven recordings are of equal quality, and that you are undecided between adding a Debussy or a Beethoven to your record collection. Hence, $P(B_1;B_2) = P(D;B_1) = P(D;B_2) = 1/2$. It follows readily from Equation 3 that $P(D;B_1,B_2) = 1/3$. This conclusion, however, is unacceptable on intuitive grounds because the basic conflict between Debussy and Beethoven is not likely to be affected by the addition of another Beethoven recording. Instead, it is suggested that in choosing among the three records, $B_1$ and $B_2$ are treated as one alternative to be compared with $D$. Consequently, one would expect that $P(D;B_1,B_2)$ will be close to one-half, while $P(B_1;B_2,D) = P(B_2;B_1,D)$ will be close to one-fourth, contrary to simple scalability (Equation 1). Empirical support for Debreu's hypothesis was presented by Becker et al. (1963b) in a study of choice among gambles. Although Debreu's example was offered as a criticism of Luce's model (Equation 2), it applies to any model based on simple scalability.
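The 1/3 prediction criticized above can be checked numerically for the special case of Luce's model (Equation 2). The sketch below is only an illustration: the function name `luce_choice` and the scale values are assumptions introduced here, chosen so that all binary probabilities equal 1/2.

```python
from fractions import Fraction

def luce_choice(x, offered, u):
    """Luce's model (Equation 2): P(x, A) = u(x) / sum of u(y) over y in A."""
    return Fraction(u[x]) / sum(Fraction(u[y]) for y in offered)

# Hypothetical scale values: equal binary probabilities force equal values.
u = {"D": 1, "B1": 1, "B2": 1}

binary = luce_choice("D", ["D", "B1"], u)          # Debussy vs. one Beethoven
trinary = luce_choice("D", ["D", "B1", "B2"], u)   # Debussy vs. both Beethovens
print(binary, trinary)
```

Adding the second Beethoven recording lowers the Debussy probability from 1/2 to 1/3 under this model, which is the consequence the example argues against.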

Previous efforts to resolve this problem (e.g., Estes, 1960) attempted to redefine the alternatives so that $B_1$ and $B_2$ are no longer viewed as different alternatives. Although this idea has some appeal, it does not provide a satisfactory account of our problem. First, $B_1$ and $B_2$ are not only physically distinct, but they can also be perfectly discriminable. Hence, there is no independent basis for treating them as indistinguishable. Second, the process of redefining choice alternatives itself requires an adequate theoretical analysis. Finally, data show that the principle of independence from irrelevant alternatives is violated in a manner that cannot be readily accounted for by grouping choice alternatives. More specifically, it appears that the addition of an alternative to an offered set "hurts" alternatives that are similar to the added alternative more than those that are dissimilar to it. Such an effect (of which Debreu's example is a special case) requires a more drastic revision of the principles underlying our models of choice.

The following example provides another illustration of the inadequacy of simple scalability. Suppose each of two travel agencies, denoted 1 and 2, offers tours of Europe ($E$) and of the Far East ($F$). Let $T = \{E_1, F_1, E_2, F_2\}$ where letters denote the destination of the tours, and the subscripts denote the respective agencies. Let us assume, for simplicity, that the decision maker is equally attracted by Europe and by the Far East, and that he has no reason to prefer one travel agency over the other. Consequently, all binary choice probabilities equal one-half, and the probability of choosing each tour from the total set equals one-fourth. It follows from Equation 3, in this case, that all trinary probabilities must equal one-third. However, an examination of the problem suggests that in fact none of the trinary probabilities equals one-third; instead, some of them equal one-half while the others equal one-fourth.

Consider, for example, the set $\{E_1, F_1, F_2\}$. Since the distinction between the agencies is treated as irrelevant, the problem reduces to the choice between a tour of Europe and a tour of the Far East. If the latter is chosen, then either one of the agencies can be selected. Consequently, $P(E_1;F_1,F_2) = 1/2$, and $P(F_1;F_2,E_1) = P(F_2;F_1,E_1) = 1/4$. An identical argument applies to all other triples. Besides violating simple scalability, this example demonstrates that the same set of binary (or quaternary) probabilities can give rise to different trinary probabilities and hence the latter cannot be determined by the former. Put differently, this example shows that the probabilities of choosing alternatives from a given set, $A$, cannot be computed, in general, from the probabilities of choosing these alternatives from the subsets and the supersets of $A$. This observation imposes a high lower bound on the complexity of any adequate theory of choice.

A minor modification of an example due to L. J. Savage (see Luce & Suppes, 1965, pp. 334-335), which is based on binary comparisons only, illustrates yet another difficulty encountered by simple scalability. Imagine an individual who has to choose between a trip to Paris and a trip to Rome. Suppose he is indifferent between the two trips so that P(Paris; Rome) = 1/2. When the individual is offered a new alternative which consists of the trip to Paris plus a $1 bonus, denoted Paris+, he will undoubtedly prefer it over the original trip to Paris with certainty so that P(Paris+; Paris) = 1. It follows from Equation 3, then, that P(Paris+; Rome) = 1, which is counterintuitive. For if our individual cannot decide between Paris and Rome, it is unlikely that a relatively small bonus would resolve the conflict completely and change the choice probability from 1/2 to 1. Rather, we expect P(Paris+; Rome) to be closer to 1/2 than to 1. Experimental data (e.g., Tversky & Russo, 1969) support this intuition. Choice probabilities, therefore, reflect not only the utilities of the alternatives in question, but also the difficulty of comparing them. Thus, an extreme choice probability (i.e., close to 0 or 1) can result from either a large discrepancy in value or from an easy comparison, as in the case of the added bonus. The comparability of the alternatives, however, cannot be captured by their scale values, and hence simple scalability must be rejected. The above examples demonstrate that the substitution of one alternative for another, which is equivalent to it in some contexts, does not necessarily preserve choice probability in any context. The substitution affects the comparability among the alternatives, which in turn influences choice probability.

An alternative approach to the development of probabilistic theories of choice treats the utility of each alternative as a random variable rather than a constant. Specifically, it is assumed that there exists a random vector $U = (U_x, \ldots, U_z)$ on $T = \{x, \ldots, z\}$ (i.e., for any $y \in T$, $U_y$ is a random variable) such that

$$P(x,A) = P(U_x \geq U_y \text{ for all } y \in A). \tag{4}$$

Models of this type are called random utility models. The only random utility models which have been seriously investigated assume that the random variables are independent. However, an extension of the last example (see Luce & Raiffa, 1957, p. 375) is shown to violate any independent random utility model. To demonstrate, consider the trips to Paris and Rome with and without the added bonus. The expected binary choice probabilities in this case are P(Paris+; Paris) = 1, P(Rome+; Rome) = 1 but P(Paris+; Rome) < 1 and P(Rome+; Paris) < 1.

Assuming an independent random utility model, the first two equations above imply that there is no overlap between the distributions representing Paris and Paris+, nor is there an overlap between the distributions representing Rome and Rome+. The last two inequalities above imply that there must be some overlap between the distributions representing Rome and Paris+, as well as between the distributions representing Paris and Rome+. It is easy to verify that these conclusions are mutually inconsistent, and hence the above choice probabilities are incompatible with any independent random utility model. The representation of choice alternatives by independent random variables, therefore, appears too restrictive in general since, like simple scalability, it is incompatible with some eminently reasonable patterns of preference. In discussing the difficulties encountered by probabilistic theories of choice, Luce and Suppes (1965) wrote:

It appears that such criticisms, although usually directed toward specific models, are really much more sweeping objections to all our current preference theories. They suggest that we cannot hope to be completely successful in dealing with preferences until we include some mathematical structure over the set of outcomes that are simply substitutable for one another and those that are special cases of others. Such functional and logical relations among the outcomes seem to have a sharp control over the preference probabilities, and they cannot long be ignored [p. 337].
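The inconsistency asserted above for independent random utility models can be spelled out in a few lines. The following is a sketch rather than the author's own derivation; the symbols $\underline{s}(w)$ and $\overline{s}(w)$, denoting the essential infimum and supremum of the utility $U_w$, are notation introduced here. For independent $U_v$ and $U_w$, $P(U_v \geq U_w) = 1$ requires $\underline{s}(v) \geq \overline{s}(w)$, while $P(U_v \geq U_w) < 1$ requires $\overline{s}(w) > \underline{s}(v)$. Applying this to the four binary probabilities (with binary choice read from Equation 4) gives

$$P(\text{Paris+};\text{Paris}) = 1 \;\Rightarrow\; \underline{s}(\text{Paris+}) \geq \overline{s}(\text{Paris}), \qquad P(\text{Rome+};\text{Rome}) = 1 \;\Rightarrow\; \underline{s}(\text{Rome+}) \geq \overline{s}(\text{Rome}),$$

$$P(\text{Paris+};\text{Rome}) < 1 \;\Rightarrow\; \overline{s}(\text{Rome}) > \underline{s}(\text{Paris+}), \qquad P(\text{Rome+};\text{Paris}) < 1 \;\Rightarrow\; \overline{s}(\text{Paris}) > \underline{s}(\text{Rome+}).$$

Chaining the four relations yields

$$\overline{s}(\text{Rome}) > \underline{s}(\text{Paris+}) \geq \overline{s}(\text{Paris}) > \underline{s}(\text{Rome+}) \geq \overline{s}(\text{Rome}),$$

which is impossible, so no independent random utility model reproduces this pattern.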

THEORY

The present development describes choice as a covert sequential elimination process. Suppose that each alternative consists of a set of aspects or characteristics,4 and that at every stage of the process, an aspect is selected (from those included in the available alternatives) with probability that is proportional to its weight. The selection of an aspect eliminates all the alternatives that do not include the selected aspect, and the process continues until a single alternative remains. If a selected aspect is included in all the available alternatives, no alternative is eliminated and a new aspect is selected. Consequently, aspects that are common to all the alternatives under consideration do not affect choice probabilities. Since the present theory describes choice as an elimination process governed by successive selection of aspects, it is called the elimination-by-aspects (EBA) model.

4 The representation of choice alternatives as collections of measurable aspects was developed by Restle (1961) who formulated a binary choice model based on this representation. As will be shown later, the present theory reduces to Restle's in the two-alternative case. A related representation of choice alternatives was developed by Lancaster (1966) who assumed that economic goods possess, or give rise to, multiple characteristics (or aspects) in fixed proportion, and that these characteristics determine the consumer's choice. Lancaster's theory, however, is nonprobabilistic.

In contemplating the purchase of a new car, for example, the first aspect selected may be automatic transmission: this will eliminate all cars that do not have this feature. Given the remaining alternatives, another aspect, say a $3000 price limit, is selected and all cars whose price exceeds this limit are excluded. The process continues until all cars but one are eliminated. This decision rule is closely related to the lexicographic model (see Coombs, 1964; Fishburn, 1968), where an ordering of the relevant attributes is specified a priori. One chooses, then, the alternative that is best relative to the first attribute; if some alternatives are equivalent with respect to the first attribute, one chooses from them the alternative that is best relative to the second attribute, and so on. The present model differs from the lexicographic model in that here no fixed prior ordering of aspects (or attributes) is assumed, and the choice process is inherently probabilistic.

FIG. 1. A graphical representation of aspects in the three-alternative case.

More formally, consider a mapping that associates with each $x \in T$ a nonempty set $x' = \{\alpha, \beta, \ldots\}$ of elements which are interpreted as the aspects of $x$. An alternative $x$ is said to include an aspect $\alpha$ whenever $\alpha \in x'$. The aspects could represent values along some fixed quantitative or qualitative dimensions (e.g., price, quality, comfort), or they could be arbitrary features of the alternatives that do not fit into any simple dimensional structure. The characterization of alternatives in terms of aspects is not necessarily unique. Furthermore, we generally do not know what aspects are considered by an individual in any particular choice problem. Nevertheless, as is demonstrated later, this knowledge is not required in order to apply the present model, and its descriptive validity can be determined independently of any particular characterization of the alternatives.
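The sequential process described above (select an aspect with probability proportional to its weight, drop every alternative lacking it, repeat until one alternative remains) can be simulated directly. The sketch below is a minimal illustration, not part of the original paper: the function name `eba_choose`, the three cars, their aspects, and the weights are all hypothetical. Aspects shared by every remaining alternative are left out of the draw, which is equivalent to drawing them and then redrawing.

```python
import random

def eba_choose(alternatives, weights, rng):
    """Simulate one run of the elimination process.

    alternatives: dict mapping a name to its set of aspects (hypothetical data).
    weights: dict mapping each aspect to a positive weight.
    """
    remaining = dict(alternatives)
    while len(remaining) > 1:
        # Aspects common to all remaining alternatives cannot eliminate anything.
        common = set.intersection(*remaining.values())
        pool = sorted(set.union(*remaining.values()) - common)
        if not pool:
            # Remaining alternatives share all their aspects: choose uniformly.
            return rng.choice(sorted(remaining))
        aspect = rng.choices(pool, weights=[weights[a] for a in pool])[0]
        remaining = {n: asp for n, asp in remaining.items() if aspect in asp}
    return next(iter(remaining))

# Hypothetical cars and aspect weights.
cars = {
    "car_a": {"automatic", "under_3000"},
    "car_b": {"automatic", "sunroof"},
    "car_c": {"under_3000", "sunroof"},
}
weights = {"automatic": 3.0, "under_3000": 2.0, "sunroof": 1.0}

rng = random.Random(0)
counts = {name: 0 for name in cars}
for _ in range(10_000):
    counts[eba_choose(cars, weights, rng)] += 1
print(counts)
```

Relative frequencies from repeated runs estimate the choice probabilities that the theory goes on to express in closed form.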

To clarify the formalization of the model, let us first examine a simple example. Consider a three-alternative set $T = \{x, y, z\}$, where the collections of aspects associated with the respective alternatives are

$$x' = \{\alpha_1, \alpha_2, \theta_1, \theta_2, \rho_1, \rho_2, \omega\},$$

$$y' = \{\beta_1, \beta_2, \theta_1, \theta_2, \sigma_1, \sigma_2, \omega\},$$

and

$$z' = \{\gamma_1, \gamma_2, \rho_1, \rho_2, \sigma_1, \sigma_2, \omega\}.$$

A graphical representation of the structure of the alternatives and their aspects is presented in Figure 1. It is readily seen that $\alpha_i$, $\beta_i$, and $\gamma_i$ ($i = 1, 2$) are, respectively, the unique aspects of $x$, $y$, and $z$; that $\theta_i$, $\sigma_i$, and $\rho_i$ are, respectively, the aspects shared by $x$ and $y$, by $y$ and $z$, and by $x$ and $z$; and that $\omega$ is shared by all three alternatives. Since the selection of $\omega$ does not eliminate any alternative, it can be discarded from further considerations. Let $u$ be a scale which assigns to each aspect a positive number representing its utility or value, and let $K$ be the sum of the scale values of all the aspects under consideration, that is, $K = \sum_{\alpha} u(\alpha)$ where the summation ranges over all the aspects except $\omega$. Using these notations we now compute $P(x,T)$.

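Note first that $x$ can be chosen directly from $T$ if either $\alpha_1$ or $\alpha_2$ is selected in the first stage (in which case both $y$ and $z$ are eliminated). This occurs with probability $[u(\alpha_1) + u(\alpha_2)]/K$. Alternatively, $x$ can be chosen via $\{x,y\}$ if either $\theta_1$ or $\theta_2$ is selected in the first stage (in which case $z$ is eliminated), and then $x$ is chosen over $y$. This occurs with probability $[u(\theta_1) + u(\theta_2)]P(x;y)/K$. Finally, $x$ can be chosen via $\{x,z\}$ if either $\rho_1$ or $\rho_2$ is selected in the first stage (in which case $y$ is eliminated), and then $x$ is chosen over $z$. This occurs with probability $[u(\rho_1) + u(\rho_2)]P(x;z)/K$. Since the above paths leading to the choice of $x$ from $T$ are all disjoint,

$$P(x,T) = \frac{1}{K}\Big(u(\alpha_1) + u(\alpha_2) + [u(\theta_1) + u(\theta_2)]P(x;y) + [u(\rho_1) + u(\rho_2)]P(x;z)\Big) \tag{5}$$

where

$$P(x;y) = \frac{u(\alpha_1) + u(\alpha_2) + u(\rho_1) + u(\rho_2)}{u(\alpha_1) + u(\alpha_2) + u(\rho_1) + u(\rho_2) + u(\beta_1) + u(\beta_2) + u(\sigma_1) + u(\sigma_2)},$$

etc.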
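The three disjoint paths behind Equation 5 can be evaluated numerically. The sketch below assumes hypothetical weights; each entry stands for a pair sum such as $u(\alpha_1) + u(\alpha_2)$, and the helper names `binary` and `trinary` are introduced here for illustration only. Exact fractions are used so that the probabilities of the three alternatives can be checked to sum to one.

```python
from fractions import Fraction

# Hypothetical pair sums of aspect weights (omega is discarded).
unique = {"x": 3, "y": 2, "z": 1}     # u(a1)+u(a2), u(b1)+u(b2), u(g1)+u(g2)
shared = {
    frozenset("xy"): 2,               # u(theta1)+u(theta2)
    frozenset("yz"): 1,               # u(sigma1)+u(sigma2)
    frozenset("xz"): 3,               # u(rho1)+u(rho2)
}
K = sum(unique.values()) + sum(shared.values())

def binary(p, q):
    """P(p; q): aspects of p not shared with q against aspects of q not shared with p."""
    r = ({"x", "y", "z"} - {p, q}).pop()
    p_only = unique[p] + shared[frozenset((p, r))]
    q_only = unique[q] + shared[frozenset((q, r))]
    return Fraction(p_only, p_only + q_only)

def trinary(p):
    """Equation 5 written for an arbitrary alternative p of T = {x, y, z}."""
    q, r = sorted({"x", "y", "z"} - {p})
    return (unique[p]
            + shared[frozenset((p, q))] * binary(p, q)
            + shared[frozenset((p, r))] * binary(p, r)) / K

probs = {p: trinary(p) for p in "xyz"}
print(probs, sum(probs.values()))
```

With these weights the direct path contributes 3/12 to the choice of x, and the two indirect paths contribute the remainder through the binary probabilities.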

More generally, let $T$ be any finite set of alternatives. For any $A \subseteq T$ let $A' = \{\alpha \mid \alpha \in x' \text{ for some } x \in A\}$, and $A^0 = \{\alpha \mid \alpha \in x' \text{ for all } x \in A\}$. Thus, $A'$ is the set of aspects that belongs to at least one alternative in $A$, and $A^0$ is the set of aspects that belongs to all the alternatives in $A$. In particular, $T'$ is the set of all aspects under consideration, while $T^0$ is the set of aspects shared by all the alternatives under study. Given any aspect $\alpha \in T'$, let $A_\alpha$ denote those alternatives of $A$ which include $\alpha$, that is, $A_\alpha = \{x \mid x \in A \ \&\ \alpha \in x'\}$.

The elimination-by-aspects model asserts that there exists a positive scale $u$ defined on the aspects (or more specifically on $T' - T^0$) such that for all $x \in A \subseteq T$

$$P(x,A) = \frac{\sum_{\alpha \in x' - A^0} u(\alpha) P(x,A_\alpha)}{\sum_{\beta \in A' - A^0} u(\beta)} \tag{6}$$

provided the denominator does not vanish. Note that the summations in the numerator and the denominator of Equation 6 range, respectively, over all aspects of $x$ and $A$ except those that are shared by all elements of $A$. Hence, the denominator of Equation 6 vanishes only if all elements of $A$ share the same aspects, in which case it is assumed that $P(x,A) = 1/a$.

Equation 6 is a recursive formula. It expresses the probability of choosing $x$ from $A$ as a weighted sum of the probabilities of choosing $x$ from the various subsets of $A$ (i.e., $A_\alpha$ for $\alpha \in x'$), where the weights (i.e., $u(\alpha)/\sum u(\beta)$) correspond to the probabilities of selecting the respective aspects of $x$.

Consider a special case of the elimination-by-aspects model where all pairs of alternatives share the same aspects, that is, $x' \cap y' = z' \cap w'$ for all $x, y, z, w \in T$. Since aspects that are common to all the alternatives of $T$ do not affect the choice process, the alternatives can be treated as (pairwise) disjoint, that is, $x' \cap y' = \phi$ for all $x, y \in T$. In this case, Equation 6 reduces to

$$P(x,A) = \frac{\sum_{\alpha \in x'} u(\alpha)}{\sum_{\beta \in A'} u(\beta)}$$

since $\alpha \in x'$ implies $A_\alpha = \{x\}$, and $P(x,\{x\}) = 1$. Letting

$$u(x) = \sum_{\alpha \in x'} u(\alpha)$$

yields

$$P(x,A) = \frac{u(x)}{\sum_{y \in A} u(y)}.$$

Hence, in the present theory, Luce's model (Equation 2) holds whenever the alternatives can be regarded as composed of disjoint aspects.
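The recursion in Equation 6 translates almost line by line into code. The sketch below is one possible rendering under stated assumptions: the function name `eba`, the dictionary layout, and the example aspects and weights are all hypothetical. The example uses pairwise disjoint aspects so that the result can be compared with Luce's ratio rule.

```python
from fractions import Fraction

def eba(x, A, aspects, u):
    """P(x, A) under Equation 6.

    aspects: dict mapping each alternative to its set of aspects (x' in the text).
    u: dict mapping each aspect to a positive weight.
    """
    A = frozenset(A)
    if x not in A:
        return Fraction(0)
    if len(A) == 1:
        return Fraction(1)
    common = frozenset.intersection(*(frozenset(aspects[y]) for y in A))  # A^0
    union = frozenset.union(*(frozenset(aspects[y]) for y in A))          # A'
    denom = sum(Fraction(u[b]) for b in union - common)
    if denom == 0:
        # All alternatives in A share the same aspects: P(x, A) = 1/a.
        return Fraction(1, len(A))
    total = Fraction(0)
    for a in frozenset(aspects[x]) - common:
        A_a = frozenset(y for y in A if a in aspects[y])  # alternatives that include a
        total += Fraction(u[a]) * eba(x, A_a, aspects, u)
    return total / denom

# Hypothetical disjoint-aspect example: Equation 6 should match Luce's model.
aspects = {"x": {"a1", "a2"}, "y": {"b1"}, "z": {"c1"}}
u = {"a1": 1, "a2": 2, "b1": 2, "c1": 1}
p_x = eba("x", "xyz", aspects, u)
print(p_x)
```

The recursion terminates because every aspect outside $A^0$ is missing from at least one alternative, so each $A_\alpha$ is strictly smaller than $A$. In the disjoint example the value equals $u(x)/\sum_y u(y)$ with $u(x) = 3$ and a total of 6.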

Next, examine another special case of the model where only binary choice probabilities are considered. In this case, we obtain

$$P(x;y) = \frac{\sum_{\alpha \in x' - y'} u(\alpha)}{\sum_{\alpha \in x' - y'} u(\alpha) + \sum_{\beta \in y' - x'} u(\beta)} = \frac{u(x' - y')}{u(x' - y') + u(y' - x')} \tag{7}$$

where $x' - y' = \{\alpha \mid \alpha \in x' \ \&\ \alpha \notin y'\}$ is the set of aspects that belongs to $x$ but not to $y$; $y' - x' = \{\beta \mid \beta \in y' \ \&\ \beta \notin x'\}$ is the set of aspects that belongs to $y$ but not to $x$; and $u(x' - y') = \sum_{\alpha \in x' - y'} u(\alpha)$. Equation 7 coincides with Restle's (1961) model. The EBA model, therefore, generalizes the choice models of Luce and of Restle.
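A small worked instance of Equation 7 may help; the aspects and weights are hypothetical and chosen only for illustration. Let $x' = \{\alpha, \theta\}$ and $y' = \{\beta, \theta\}$ with $u(\alpha) = 2$, $u(\beta) = 1$, and $u(\theta) = 5$. Then $x' - y' = \{\alpha\}$ and $y' - x' = \{\beta\}$, so

$$P(x;y) = \frac{u(\alpha)}{u(\alpha) + u(\beta)} = \frac{2}{2 + 1} = \frac{2}{3}.$$

The weight of the shared aspect $\theta$ does not enter the result, however large it is.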

The elimination-by-aspects model has been formulated above in terms of a scale $u$ defined over the set of relevant aspects. It appears that the application of the model presupposes prior characterization of the alternatives in terms of their aspects. However, it turns out that this is not necessary because the EBA model can be formulated purely in terms of the alternatives, or more specifically, in terms of the subsets of $T$.

To illustrate the basic idea, consider the example presented in Figure 1. There we assume that $\alpha_i, \beta_i, \ldots$ ($i = 1, 2$) are all distinct aspects. According to the elimination-by-aspects model, however, there is no need to distinguish between aspects that lead to the same outcome. For example, the selection of either $\alpha_1$ or $\alpha_2$ eliminates both $y$ and $z$; the selection of either $\theta_1$ or $\theta_2$ eliminates $z$; and the selection of either $\rho_1$ or $\rho_2$ eliminates $y$. From the standpoint of the elimination-by-aspects model, therefore, there is no need to differentiate between $\alpha_1$ and $\alpha_2$, between $\theta_1$ and $\theta_2$, or between $\rho_1$ and $\rho_2$. Thus we can group all the aspects that belong to $x$ alone, all the aspects that belong to $x$ and $y$ but not to $z$, etc. Let $\overline{\{x\}}$ denote the aspects that belong to $x$ alone (i.e., $\alpha_1$ and $\alpha_2$), $\overline{\{x,y\}}$ the aspects that belong only to $x$ and $y$ (i.e., $\theta_1$ and $\theta_2$), $\overline{\{x,z\}}$ the aspects that belong only to $x$ and $z$ (i.e., $\rho_1$ and $\rho_2$), etc.5 The representation of the grouped aspects in the three-alternative case is displayed in Figure 2.

The scale value of a collection of aspects is defined as the sum of the scale values of its members, that is, $U(\overline{\{x\}}) = u(\alpha_1) + u(\alpha_2)$, $U(\overline{\{x,y\}}) = u(\theta_1) + u(\theta_2)$, etc. For simplicity of notation we write $U(\bar{x})$ for $U(\overline{\{x\}})$, $U(\overline{x,y})$ for $U(\overline{\{x,y\}})$, etc. Thus, Equation 5 is expressible as

$$P(x,T) = \frac{U(\bar{x}) + U(\overline{x,y})P(x;y) + U(\overline{x,z})P(x;z)}{U(\bar{x}) + U(\bar{y}) + U(\bar{z}) + U(\overline{x,y}) + U(\overline{x,z}) + U(\overline{y,z})} \tag{8}$$

where

$$P(x;y) = \frac{U(\bar{x}) + U(\overline{x,z})}{U(\bar{x}) + U(\overline{x,z}) + U(\bar{y}) + U(\overline{y,z})},$$

etc.

The essential difference between Equations 5 and 8 lies in the domain of the scales: in Equation 5, $u$ is defined over individual aspects, whereas in Equation 8 $U$ is defined over collections of aspects which are associated, respectively, with the subsets of $T$. The method by which Equation 5 is translated into Equation 8 can be applied in general.

5 In this paper, the superbar is used exclusively to denote collections of aspects. It should not be confused with a common use of this symbol to denote set complement.

FIG. 2. A graphical representation of the grouped aspects in the three-alternative case.

Each proper subset $A$ of $T$ is associated with the set $\bar{A}$ of all aspects that are included in all the alternatives of $A$ and are not included in any of the alternatives that do not belong to $A$. That is, $\bar{A} = \{\alpha \in T' \mid \alpha \in x' \text{ for all } x \in A \ \&\ \alpha \notin y' \text{ for any } y \notin A\}$. The scale $U$ is defined by $U(\bar{A}) = \sum_{\alpha \in \bar{A}} u(\alpha)$. It is shown in the appendix that the elimination-by-aspects model, defined in Equation 6, holds if and only if there exists a scale $U$ defined on $\{\bar{A}_i \mid A_i \subset T\}$ such that for all $x \in A \subseteq T$

$$P(x,A) = \frac{\sum_{B_i \in \mathcal{A}} U(\bar{B}_i) P(x, A \cap B_i)}{\sum_{A_j \in \mathcal{A}} U(\bar{A}_j)} \tag{9}$$

where $\mathcal{A} = \{A_j \mid A_j \cap A \neq A, \phi\}$, provided the denominator does not vanish. (According to the present theory, the denominator can vanish only if $P(x,A) = 1/a$.) The significance of this result lies in showing how the elimination-by-aspects model can be formulated in terms of the subsets of $T$ without reference to specific aspects. Note that for $A \subset T$, $U(\bar{A})$ is not a measure of the value of the alternatives of $A$; rather it is a measure of all the evaluative aspects that are shared by all the alternatives of $A$ and by them only. Thus, $U(\bar{A})$ can be viewed as a measure of the unique advantage of the alternatives of $A$. The reader is invited to verify that in the three-alternative case, Equation 9 reduces to Equation 8.
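The subset formulation in Equation 9 can also be sketched in code, with the scale stored per subset of $T$ rather than per aspect. Everything named here is an assumption made for illustration: the function `eba_subsets`, the dictionary keyed by frozensets, and the numerical values of $U$. Terms with $x \notin B_i$ are skipped because $P(x, A \cap B_i) = 0$ for them.

```python
from fractions import Fraction

def eba_subsets(x, A, U):
    """P(x, A) under Equation 9; U maps frozenset subsets of T to U(A-bar)."""
    A = frozenset(A)
    if x not in A:
        return Fraction(0)
    if len(A) == 1:
        return Fraction(1)
    # Subsets whose aspects eliminate some, but not all, alternatives of A.
    family = [B for B in U if B & A and B & A != A]
    denom = sum(Fraction(U[B]) for B in family)
    if denom == 0:
        return Fraction(1, len(A))
    num = sum(Fraction(U[B]) * eba_subsets(x, A & B, U) for B in family if x in B)
    return num / denom

# Hypothetical grouped scale values for T = {x, y, z}.
U = {
    frozenset("x"): 3, frozenset("y"): 2, frozenset("z"): 1,
    frozenset("xy"): 2, frozenset("yz"): 1, frozenset("xz"): 3,
}
probs = {v: eba_subsets(v, "xyz", U) for v in "xyz"}
print(probs)
```

For the three-alternative case the numerator expands to $U(\bar{x}) + U(\overline{x,y})P(x;y) + U(\overline{x,z})P(x;z)$, which is the form of Equation 8.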

Before discussing the consequences of the EBA model, let us examine how it resolves the counterexamples described in the previous section. First, consider Debreu's record selection problem where $T = \{D, B_1, B_2\}$. Naturally, the two Beethoven recordings have much more in common with each other than either of them has with the Debussy record. Assume, for simplicity, that any aspect shared by $D$ and one of the $B$ records is also shared by the other $B$ record, hence $D$ can be treated as (aspectwise) disjoint of both $B_1$ and $B_2$. Suppose $U(\bar{B}_1) = U(\bar{B}_2) = a$, $U(\overline{B_1,B_2}) = b$, and $U(\bar{D}) = a + b$. A graphical illustration of this representation is shown in Figure 3.

It follows readily, under these assumptions, that all the binary choice probabilities are equal, since

$$P(B_1;B_2) = \frac{a}{2a} = \frac{1}{2} = \frac{a + b}{2a + 2b} = P(D;B_1) = P(D;B_2),$$

yet the trinary choice probabilities are unequal, since

$$P(D;B_1,B_2) = \frac{a + b}{3a + 2b} > \frac{a + b/2}{3a + 2b} = P(B_1;B_2,D) = P(B_2;B_1,D).$$

In fact, as $a$ (or $a/b$) approaches 0, the left-hand side approaches 1/2 while the right-hand side approaches 1/4. Hence, according to the elimination-by-aspects model, all three records can be pairwise equivalent, and yet the probability of choosing $D$ from the entire set can be as high as 1/2 whenever $B_1$ and $B_2$ include the same aspects.
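The limiting behavior described above can be tabulated for shrinking values of $a$. The sketch below assumes the parameterization $U(\bar{B}_1) = U(\bar{B}_2) = a$, $U(\overline{B_1,B_2}) = b$, $U(\bar{D}) = a + b$ from the text and applies the three-alternative formula by hand; the function name `debreu_probabilities` and the sample values of $a$ and $b$ are hypothetical.

```python
from fractions import Fraction

def debreu_probabilities(a, b):
    """Return (P(D; B1, B2), P(B1; B2, D)) for the record selection problem."""
    a, b = Fraction(a), Fraction(b)
    total = 3 * a + 2 * b                 # U(D) + U(B1) + U(B2) + U(B1,B2)
    p_d = (a + b) / total                 # D is chosen only via its own aspects
    # B1 wins directly (weight a) or after a shared Beethoven aspect is
    # selected (weight b) and B1 then beats B2 with probability 1/2.
    p_b1 = (a + b * Fraction(1, 2)) / total
    return p_d, p_b1

for a in (1, Fraction(1, 10), Fraction(1, 1000)):
    p_d, p_b1 = debreu_probabilities(a, 1)
    print(float(a), float(p_d), float(p_b1))
```

As $a$ shrinks relative to $b$, the Debussy probability moves toward 1/2 and each Beethoven probability toward 1/4, while the three probabilities always sum to one.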

FIG. 3. A graphical illustration of the analysis of the record selection problem. (Labeled regions: Beethoven 1, Beethoven 2, Debussy.)

Second, consider Savage's problem of choosing between trips, and let $T = \{P, R, P+, R+\}$, where P and R denote, respectively, trips to Paris and Rome, while + denotes a small monetary bonus. Here it is natural to suppose that Paris + includes Paris (in the sense that all aspects of the latter trip are included in the former). On the other hand, Paris + does not include Rome because each of these trips has some aspects that are not shared by the other. Similarly, Rome + includes Rome but not Paris. The relations among the four alternatives are illustrated in Figure 4.

Letting $U(\overline{P+}) = U(\overline{R+}) = a$, and $U(\overline{P, P+}) = U(\overline{R, R+}) = b$, yields

$$P(P;R) = \frac{b}{2b} = \frac{1}{2} = \frac{a+b}{2(a+b)} = P(P+;R+),$$

$$P(P+;P) = P(R+;R) = \frac{a}{a} = 1, \quad \text{and}$$

$$P(P+;R) = P(R+;P) = \frac{a+b}{a+2b},$$

which can take any value between 1/2 and 1, depending on the relative weight of the bonus. Thus, the above pattern of binary choice probabilities, which violates simple scalability (Equation 1) and any independent random utility model (Equation 4), arises naturally in the present model. Essentially the same solution to this problem (which involves only binary probabilities) has been proposed by Restle (1961).
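The binary probabilities for the trips can be reproduced with a short sketch. It assumes the binary form of the model, in which only unshared aspects matter; the aspect-group names (`paris`, `bonus_p`, and so on) and the weights a = 1, b = 4 are hypothetical.

```python
from fractions import Fraction

def binary_prob(x_aspects, y_aspects, weight):
    """P(x; y): weight of aspects unique to x over the weight of all unshared aspects."""
    only_x = sum(weight[g] for g in x_aspects - y_aspects)
    only_y = sum(weight[g] for g in y_aspects - x_aspects)
    return Fraction(only_x, only_x + only_y)

# Assumed weights: a for each bonus, b for each destination.
a, b = 1, 4
weight = {"paris": b, "rome": b, "bonus_p": a, "bonus_r": a}
trips = {
    "P": {"paris"},
    "R": {"rome"},
    "P+": {"paris", "bonus_p"},
    "R+": {"rome", "bonus_r"},
}

print(binary_prob(trips["P"], trips["R"], weight))    # 1/2
print(binary_prob(trips["P+"], trips["P"], weight))   # 1
print(binary_prob(trips["P+"], trips["R"], weight))   # 5/9, i.e. (a+b)/(a+2b)
```

Shrinking a toward 0 moves the last value toward 1/2, while the second value stays at 1 for any positive bonus.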

The reader is invited to show how the elimination-by-aspects model can accommodate the example described earlier of choice among tours of Europe or the Far East with each of two travel agencies.

Consequences

In the following discussion we assume that the elimination-by-aspects model is valid, and list some of its testable consequences. The derivations of these properties are presented in Tversky (1972).

Regularity: For all $x \in A \subseteq B$,

$$P(x,A) \geq P(x,B). \tag{10}$$

Regularity asserts that the probability of choosing an alternative from a given set cannot be increased by enlarging the offered set. This is probably the weakest form of noninteraction among alternatives. Although regularity seems innocuous, it is worth noting that the replacement of $\geq$ by $>$ in Equation 10 violates the expected preference pattern in the record selection problem.

FIG. 4. A graphical illustration of the analysis of the choice between trips. (Labeled regions: Paris +, Paris, Rome, Rome +.)

The following consequence of the elimination-by-aspects model involves binary probabilities only. Since it generalizes the algebraic notion of transitivity, it is called moderate stochastic transitivity.

Moderate stochastic transitivity:

$$P(x;y) \geq 1/2 \text{ and } P(y;z) \geq 1/2 \text{ imply } P(x;z) \geq \min[P(x;y), P(y;z)]. \tag{11}$$

If we replace min by max in the conclusion of Equation 11, we obtain a stronger condition called strong stochastic transitivity. This latter property (which is not a consequence of the present model) is essentially equivalent to simple scalability in the binary case. If we replace the conclusion of Equation 11 by $P(x;z) \geq 1/2$, we obtain a weaker condition called weak stochastic transitivity, which is a consequence of the existence of an ordinal utility scale satisfying $u(x) \geq u(y)$ iff $P(x;y) \geq 1/2$.

The next consequence of the EBA model has not been investigated previously to the best of my knowledge. It relates binary and trinary choice probabilities by a property called the multiplicative inequality.

Multiplicative inequality:

$$P(x;y,z) \geq P(x;y)\,P(x;z). \tag{12}$$

The multiplicative inequality asserts that the probability of choosing x from {x,y,z} is at least as large as the probability of choosing x from both {x,y} and {x,z} in two independent choices. It is conjectured that the elimination-by-aspects model implies a much stronger form of the multiplicative inequality, namely, $P(x, A \cup B) \geq P(x,A)\,P(x,B)$ for all $A, B \subseteq T$.

Equations 10 and 12 can be combined to yield

$$\min[P(x;y), P(x;z)] \geq P(x;y,z) \geq P(x;y)\,P(x;z). \tag{13}$$

Thus, trinary choice probabilities are bounded from above by regularity, and from below by the multiplicative inequality. A geometric representation of Equation 13 which displays the admissible range of P(x;y,z) given the values of P(x;y) and P(x;z) is given in Figure 5. It shows that the trinary probability must lie between the lower and upper surfaces generated, respectively, by the multiplicative inequality (Equation 12) and regularity (Equation 10).
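The two-sided bound on the trinary probability is easy to compute for any pair of binary probabilities. The sketch below is illustrative only; the function names and the optional tolerance argument are assumptions introduced here.

```python
def trinary_bounds(p_xy, p_xz):
    """Admissible range of P(x;y,z): product below, minimum above."""
    lower = p_xy * p_xz          # multiplicative inequality
    upper = min(p_xy, p_xz)      # regularity
    return lower, upper

def satisfies_bounds(p_xy, p_xz, p_xyz, tol=0.0):
    """True if an observed trinary proportion lies within the admissible range."""
    lower, upper = trinary_bounds(p_xy, p_xz)
    return lower - tol <= p_xyz <= upper + tol

print(trinary_bounds(0.5, 0.5))          # (0.25, 0.5)
print(satisfies_bounds(0.5, 0.5, 0.4))   # True
print(satisfies_bounds(0.5, 0.5, 0.6))   # False: exceeds the regularity bound
```

When both binary probabilities equal 1/2, as in the record selection problem, the admissible range is [1/4, 1/2], which matches the two limiting trinary values derived for that example.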

The significance of the above consequences stems from the fact that they provide measurement-free tests of the elimination-by-aspects model, that is, tests which do not require estimation of parameters.

For a given set of alternatives T, the elimination-by-aspects model has $2^t - 3$ free parameters, or U values (the number of proper nonempty subsets of T minus an arbitrary unit of measurement), while the number of independent data points of the form P(x,A), $x \in A \subseteq T$, is

$$\sum_{a=2}^{t} \binom{t}{a}(a-1) = 2^{t-1}(t-2) + 1.$$

Hence, there are always at least as many data points as parameters in the present model; the former exceeds the latter whenever t > 3. In general, therefore, the scale values are uniquely determined by the choice probabilities except in some particular situations, for example, when $P(x,A) = 1/a$ for all $x \in A \subseteq T$.

FIG. 5. A geometric representation of the admissible values (shaded region) of the trinary probability P(x;y,z) given the binary probabilities P(x;y) and P(x;z), under Equation 13.
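The comparison between the number of parameters and the number of data points can be tabulated directly. This sketch assumes that an offered set with a alternatives contributes a - 1 independent probabilities (they sum to 1); the function names are hypothetical.

```python
from math import comb

def n_parameters(t):
    """Proper nonempty subsets of T, minus one for the unit of measurement."""
    return 2**t - 3

def n_data_points(t):
    """Each offered set with a >= 2 alternatives yields a - 1 free probabilities."""
    return sum(comb(t, a) * (a - 1) for a in range(2, t + 1))

for t in range(2, 7):
    print(t, n_parameters(t), n_data_points(t))
```

For t = 3 both counts equal five, and for t = 4 the sketch gives 13 parameters against 17 data points.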

Even in the case where t = 3, in which the number of parameters (five) equals the number of data points, the choice probabilities are severely constrained. The volume of the subspace generated by the present model is less than 1/2% of the volume of the entire parameter space which is a five-dimensional unit hypercube. The probability that a point sampled at random, from a uniform distribution over the parameter space, satisfies the present model, therefore, is less than .005 in this case.

Additional consequences and further developments of the elimination-by-aspects model are presented in Tversky (1972). They include a generalization of the present model, an extension to ranking, and a proof that the EBA model is a random utility model, though not an independent one.

FIG. 6. Typical stimulus slides from each of the three tasks. (Panels: a dot pattern, a gamble, a score profile.)

TESTS

In contrast to the many theoretical studies of probabilistic models of preference (see, e.g., Becker et al., 1963a; Luce & Suppes, 1965; Marschak, 1960; Morrison, 1963), there have been relatively few empirical studies in which these models were tested. Moreover, much of the available data are limited to binary choices, and most studies report and analyze only group data (see, e.g., Rumelhart & Greeno, 1971). Unfortunately, group data usually do not permit adequate testing of theories of individual choice behavior because, in general, the compatibility of such data with the theory is neither a necessary nor a sufficient condition for its validity. (For an instructive illustration of this point, see Luce, 1959, p. 8.) The scarcity of appropriate data in an area of considerable theoretical interest is undoubtedly due to the difficulties involved in obtaining adequate estimates of choice probabilities for an individual subject, particularly outside the domain of psychophysics.

Two consequences of the present model were tested in previous studies. In an experiment involving choice among gambles, Becker et al. (1963b) showed that although simple scalability (Equation 1) is systematically violated, the regularity condition (Equation 10) is generally satisfied. Similarly, although strong stochastic transitivity was violated in several studies (e.g., Coombs, 1958; Krantz, 1967; Tversky & Russo, 1969), moderate stochastic transitivity was usually supported. (For some specified conditions under which moderate stochastic transitivity, as well as weak stochastic transitivity, is violated, see Tversky, 1969.) The fact that simple scalability and strong stochastic transitivity are often violated while regularity and moderate stochastic transitivity are typically satisfied provides some support, albeit nonspecific, for the present theory. The following experimental work was designed to obtain a more direct test of the EBA model.

Method

To test the model, three different tasks were selected. The stimuli in Task A were random dot patterns, in a square frame, varying in size (of square) and density (of dots). Subjects were presented with pairs and triples of frames and instructed to choose, in each case, the frame which contained the largest number of dots. The stimuli in Task B were profiles of college applicants with different intelligence (I) and motivation (M) scores. The scores were expressed in percentiles (relative to the population of college applicants), and displayed as bar graphs. Subjects were presented with pairs and triples of such profiles and asked to select, in each case, the applicant they considered the most promising. The stimuli in Task C were two-outcome gambles of the form (p,x), in which one wins $x with probability p and nothing otherwise. Each gamble was displayed as a pie diagram, where the probabilities of winning and not winning were represented, respectively, by the black and white sectors of the pie. Subjects were presented with pairs and triples of gambles and were asked to choose the gamble they would prefer to play. (At the end of the study, each subject actually played for money five of the gambles chosen by him in the course of the study. The gambles were played by spinning an arrow on a wheel of fortune and the subjects won the indicated amount if the arrow landed on the black sector of the wheel.) Examples of the three types of stimuli are shown in Figure 6.

The same eight subjects participated in all three tasks. They were students in a Jerusalem high school, ages 16-18. Subjects were run in a single group. The stimuli were projected on slides and each subject indicated his choices by checking an appropriate box on his response sheet. The study consisted of 12 one-hour sessions, three times a week, for four weeks. The first two sessions were practice sessions in which the problems and the procedure were introduced and the subjects familiarized themselves with the stimuli of the task.

Each experimental session included all three tasks, and the ordering of the tasks was randomized across sessions. Within each task, subjects were presented with various pairs and triples formed from a basic set of 4 × 4 = 16 two-dimensional stimuli. One set of three stimuli of each type was isolated and replicated more than other sets. The entire triple was replicated 30 times (three per session) while each of the pairs within this triple was replicated 20 times (two per session). The following discussion is concerned with the analysis of these triples. Each triple was constructed so that no alternative dominates another one with respect to both dimensions, and so that two of the elements, called x and y, are very similar to each other, while the third element, z, is relatively dissimilar to each of them (see Footnote 6).

The subjects were paid a flat fee for the completion of all the sessions. In addition, each subject received a bonus proportional to the number of correct numerosity judgments made by him, and was allowed to play, for money, five gambles selected randomly from those chosen by him during the study.

Results

The analysis of the results begins by testing the constant-ratio rule which is essentially equivalent to Luce's (1959) model. According to this rule,

$$\frac{P(x;y)}{P(y;x)} = \frac{P(x,A)}{P(y,A)}, \quad x, y \in A, \tag{14}$$

provided the denominators do not vanish. The constant-ratio rule is a strong version of the principle of independence from irrelevant alternatives. It requires that the ratio of P(x,A) and P(y,A) (not merely their order as required by simple scalability) be independent of the offered set A.

6 The following stimuli were employed in the study. Task A: x = (13 × 13, 4/5), y = (14 × 14, 3/4), and z = (28 × 28, 1/5) where the first component of each stimulus is the size of the underlying matrix used to generate the pattern, and the second component is the proportion of cells of the matrix that contain dots. Task B: x = (78, 25), y = (75, 35), and z = (60, 90) where the first and second components of each pair denote, respectively, intelligence and motivation scores of the applicants. Task C: x = (1/5, 4.00), y = (1/4, 3.50), and z = (2/3, 1.00), where the first and second component of each pair are, respectively, the probability of winning and the amount to be won in each of the gambles in Israeli pounds.

Let T = {x,y,z}, and define

$$P_y(x;z) = \frac{P(x,T)}{P(x,T) + P(z,T)} \quad \text{and} \quad P_x(y;z) = \frac{P(y,T)}{P(y,T) + P(z,T)}.$$

Hence, by the constant-ratio rule,

$$P(x;z) = P_y(x;z) \quad \text{and} \quad P(y;z) = P_x(y;z). \tag{15}$$

Put differently, the binary probability P(x;z) should equal $P_y(x;z)$, computed from the trinary probabilities, since under Equation 14 the presence of y is "irrelevant" to the choice between x and z.
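The computation of the implied binary probability from the trinary ones can be illustrated as follows. This is a sketch under the assumption of a Luce-type model with made-up scale values (2, 3, 5), for which the constant-ratio rule holds exactly; `implied_binary` is a hypothetical helper name.

```python
from fractions import Fraction

def implied_binary(p_first_T, p_second_T):
    """Binary probability implied by two trinary probabilities under the constant-ratio rule."""
    return p_first_T / (p_first_T + p_second_T)

# Assumed scale values for a Luce-type model, where P(x, A) = v(x) / sum of v over A.
v = {"x": Fraction(2), "y": Fraction(3), "z": Fraction(5)}
total = sum(v.values())
p_T = {alt: val / total for alt, val in v.items()}

p_y_xz = implied_binary(p_T["x"], p_T["z"])   # computed from trinary data
p_xz = v["x"] / (v["x"] + v["z"])             # direct binary probability

print(p_y_xz, p_xz)   # 2/7 2/7
```

Under such a model the two quantities coincide for any scale values, so a systematic gap between them in data speaks against the constant-ratio rule.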

In the present study, the alternatives were designed so that x and y are much more similar to each other than either of them is to z. Hence, the similarity hypothesis that is incorporated into the elimination-by-aspects model predicts that the addition of alternative y to the set {x,z} will reduce P(x,T) proportionally more than P(z,T). That is, the similar alternative, x, will lose relatively more than the dissimilar alternative, z, by the addition of y. Likewise, y is expected to lose relatively more than z by the introduction of x. Contrary to the constant-ratio rule, therefore, the similarity hypothesis implies

$$P(x;z) > P_y(x;z) \quad \text{and} \quad P(y;z) > P_x(y;z). \tag{16}$$

To test the constant-ratio rule, the observed (binary) relative frequencies P(x;z) and P(y;z) were compared, respectively, with $P_y(x;z)$ and $P_x(y;z)$ computed from the trinary relative frequencies, separately for each one of the subjects. The observed and the computed values for all subjects are shown in Table 1 for each of the three tasks.

TABLE 1
OBSERVED AND PREDICTED PROPORTIONS (UNDER THE CONSTANT-RATIO MODEL) FOR EACH TASK

                    Task A (dots)                           Task B (applicants)                     Task C (gambles)
Subject             P(x;z)    P_y(x;z)  P(y;z)    P_x(y;z)  P(x;z)    P_y(x;z)  P(y;z)    P_x(y;z)  P(x;z)    P_y(x;z)  P(y;z)    P_x(y;z)
1                   .50       .43       .45       .43       .65       .44       .30       .26       .35       .12       .50       .46
2                   .60       .27       .35       .33       .55       .37       .75       .58       .60       .53       .70       .68
3                   .25       .38       .40       .41       .55       .38       .60       .41       .25       .26       .50       .29
4                   .70       .75       .30       .67       .40       .46       .40       .32       .60       .43       .70       .35
5                   .65       .52       .35       .39       .65       .45       .55       .40       .20       .16       .50       .41
6                   .40       .39       .45       .52       .35       .20       .40       .38       .65       .54       .60       .44
7                   .15       .26       .45       .44       .75       .77       .35       .40       .55       .42       .65       .50
8                   .15       .14       .45       .57       .55       .52       .40       .29       .55       .35       .70       .43
Overall proportion  .425      .405      .400      .466      .556      .463      .469      .388      .469      .354      .606      .466
p                             ns                  ns                  < .05               < .10               < .01               < .01

It seems that the constant-ratio model (Equation 14) holds in the psychophysical task (A), and that it fails in the two preference tasks (B and C) in the manner predicted by the similarity hypothesis (Equation 16). Out of 16 individual comparisons in each task (two per subject), Equation 16 was satisfied in 13 and 15 cases, respectively, in Tasks B and C (p < .05 in each case; see Footnote 7), and only in 7 cases in Task A. Essentially the same result was found in additional analyses.
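The individual comparisons for the two preference tasks can be recounted from the Table 1 entries. The sketch below is illustrative: the lists are transcribed from Table 1, and the one-sided sign test is an assumption made here, since the text does not name the test behind the reported significance level.

```python
from math import comb

# Observed binary proportions and values computed from trinary data (Table 1).
task_b = {
    "P(x;z)":   [.65, .55, .55, .40, .65, .35, .75, .55],
    "P_y(x;z)": [.44, .37, .38, .46, .45, .20, .77, .52],
    "P(y;z)":   [.30, .75, .60, .40, .55, .40, .35, .40],
    "P_x(y;z)": [.26, .58, .41, .32, .40, .38, .40, .29],
}
task_c = {
    "P(x;z)":   [.35, .60, .25, .60, .20, .65, .55, .55],
    "P_y(x;z)": [.12, .53, .26, .43, .16, .54, .42, .35],
    "P(y;z)":   [.50, .70, .50, .70, .50, .60, .65, .70],
    "P_x(y;z)": [.46, .68, .29, .35, .41, .44, .50, .43],
}

def count_supported(task):
    """Number of comparisons in which the observed value exceeds the computed one."""
    pairs = list(zip(task["P(x;z)"], task["P_y(x;z)"])) + \
            list(zip(task["P(y;z)"], task["P_x(y;z)"]))
    return sum(obs > comp for obs, comp in pairs)

def sign_test_p(k, n):
    """One-sided probability of k or more successes in n fair coin flips."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

for name, task in (("B", task_b), ("C", task_c)):
    k = count_supported(task)
    print(name, k, round(sign_test_p(k, 16), 4))
```

The counts come out as 13 for Task B and 15 for Task C, and under the assumed sign test both fall below the .05 level, subject to the dependency caveat noted in Footnote 7.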

The relatively small number of observations does not permit an adequate test of individual comparisons. Hence, the observed and the computed choice frequencies were pooled over subjects. The results of a chi-square test of Equation 15 against Equation 16, based on these data, are shown in the last row of Table 1 for each comparison in each of the tasks. The same pattern emerges from the analysis of the pooled data: the observed proportions are significantly higher than the computed ones in Tasks B and C, but not in Task A.

Since the constant-ratio model is not acceptable, in general, the simplest version of the elimination-by-aspects model, which is compatible with the similarity hypothesis, was selected next. Recall that the test stimuli were designed so that x and y are very similar to each other while z is relatively dissimilar to either of them (see Footnote 6). Thus, we assume that neither x nor y share with z any aspect that they do not share with each other. Consequently, aside from the aspects shared by all three stimuli, z can be regarded as (aspectwise) disjoint from both x and y. That is, we assume that, to a reasonable degree of approximation, $U(\overline{x,z}) = U(\overline{y,z}) = 0$. This assumption reduces the number of free parameters (from five to three) at the cost of some loss in generality.

7 This significance level should be interpreted with caution because of the potential dependency between the observations of each subject.

Let $U(\bar{x}) = a$, $U(\bar{y}) = b$, $U(\bar{z}) = c$, and $U(\overline{x,y}) = d$ (see Figure 7). Under this special case of the model, there exist nonnegative a, b, c, and d such that

$$P(x;y) = \frac{a}{a+b}, \quad P(y;z) = \frac{b+d}{b+d+c}, \quad P(x;z) = \frac{a+d}{a+d+c},$$

$$P(x;y,z) = \frac{a + d\,\frac{a}{a+b}}{a+b+c+d}, \quad \text{and} \quad P(z;x,y) = \frac{c}{a+b+c+d}. \tag{17}$$
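The tested version of the model can be evaluated for any nonnegative parameter values. The following sketch assumes c = 1 as a default unit and uses made-up values of a, b, and d; `eq17` and `similarity_gap` are hypothetical names, and the gap is the difference between the binary probability of x over z and the value computed from the trinary probabilities.

```python
from fractions import Fraction

def eq17(a, b, d, c=Fraction(1)):
    """Choice probabilities of the restricted three-alternative model."""
    k = a + b + c + d
    share = a / (a + b)                      # P(x;y)
    return {
        "P(x;y)": share,
        "P(y;z)": (b + d) / (b + d + c),
        "P(x;z)": (a + d) / (a + d + c),
        "P(x;y,z)": (a + d * share) / k,
        "P(z;x,y)": c / k,
    }

def similarity_gap(a, b, d, c=Fraction(1)):
    """Binary P(x;z) minus the value implied by the trinary probabilities."""
    p = eq17(a, b, d, c)
    implied = p["P(x;y,z)"] / (p["P(x;y,z)"] + p["P(z;x,y)"])
    return p["P(x;z)"] - implied

one = Fraction(1)
print(similarity_gap(one, one, Fraction(0)))   # 0: constant-ratio rule holds
print(similarity_gap(one, one, Fraction(2)))   # 1/12: similarity effect
```

With these illustrative values the gap vanishes when d = 0 and is positive when d > 0, in line with the direction of the similarity hypothesis.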

FIG. 7. A graphical illustration of the tested version of the EBA model (Equation 17).

For three alternatives, there are five independent data points (three binary and two trinary). In the absence of any restrictions on the parameters, the likelihood function of the data is maximized by using the observed relative frequencies as estimates of the parameters, in which case the dimensionality of the parameter space, denoted $d(\Omega)$, equals five. In the above version (Equation 17) of the elimination-by-aspects model, we can set c, say, arbitrarily, whence the observed proportions are all expressible in terms of three parameters (a, b, and d), and the dimensionality of the restricted parameter space, denoted $d(\omega)$, equals three. Let $\lambda$ be the likelihood ratio $L(\omega)/L(\Omega)$, where L denotes the maximum value of the likelihood function under the respective model. If Equation 17 holds, then the statistic $-2 \ln \lambda$ has an approximate chi-square distribution with $d(\Omega) - d(\omega) = 2$ degrees of freedom.
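A test statistic of this kind can be converted into a significance level without special software, because a chi-square variable with two degrees of freedom has the closed-form upper tail exp(-x/2). The sketch below relies on that standard fact; the function names and the example log-likelihood values are hypothetical.

```python
import math

def lr_statistic(loglik_restricted, loglik_unrestricted):
    """-2 ln(lambda), from the two maximized log-likelihoods."""
    return -2.0 * (loglik_restricted - loglik_unrestricted)

def chi2_tail_df2(stat):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-stat / 2.0)

def chi2_critical_df2(alpha):
    """Value exceeded with probability alpha under 2 degrees of freedom."""
    return -2.0 * math.log(alpha)

print(round(chi2_critical_df2(0.10), 3))      # 4.605
print(chi2_tail_df2(6.864) < 0.10)            # True: rejected at the .1 level
print(chi2_tail_df2(4.112) < 0.10)            # False: not rejected
print(lr_statistic(-52.0, -50.5))             # 3.0 for these made-up log-likelihoods
```

The cutoff of about 4.605 is the threshold a statistic must exceed for the restricted model to be rejected at the .1 level.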

Chandler's (1969) STEPIT program was employed to obtain maximum likelihood estimates of the parameters under Equation 17 with c = 1. The values of the test statistics are reported in Table 2, along with the estimates of d, for each subject in all tasks.

TABLE 2
VALUES OF THE TEST STATISTIC AND THE ESTIMATED VALUES OF d FOR EACH SUBJECT IN EACH OF THE TASKS

            Task A (dots)       Task B (applicants) Task C (gambles)
Subject     χ²        d         χ²        d         χ²        d
1           .133      .29       2.179     .14       .040      .46
2           3.025     .89       1.634     .92       .001      .58
3           .849      0         .159      1.18      2.022     .14
4           5.551*    0         6.864*    .51       1.053     1.56
5           .951      0         .428      1.23      .887      0
6           .401      0         .405      .42       .157      1.18
7           3.740     0         .083      0         .304      1.00
8           4.112     0         .038      .37       1.241     1.44

Note. df = 2.
* p < .1.

Table 2 exhibits a very good correspondence between the observed proportions and the tested version (Equation 17) of the EBA model: only 2 out of 24 tests permit rejecting the model at the conservative .1 level. It should perhaps be noted that a correspondence between observed choice probabilities and the elimination-by-aspects model does not necessarily imply that the subjects are actually following a strategy of elimination by aspects. They might, in fact, employ a different strategy that is well approximated by the elimination-by-aspects model. The study of the actual strategies employed by subjects in choice experiments may perhaps be advanced by investigating choice probabilities in conjunction with other data such as reaction time, eye movements, or verbal protocols.

The relation between the predictions of the constant-ratio model (Equation 15) and the similarity hypothesis (Equation 16) can be further investigated using the obtained estimates of the parameter d, reported in Table 2. It is easy to verify that the constant-ratio model is compatible with Equation 17 if and only if d = 0, while the similarity hypothesis implies d > 0. Hence, if the former holds, the estimates of d should be close to 0, whereas if the latter holds, the estimates should be substantially positive. (The magnitude of d should be interpreted in the light of the facts that all parameters are nonnegative and c = 1, see Equation 17 and Figure 7.) Inspection of Table 2 reveals that the majority of the d estimates in Task A are zero, while the majority of the d estimates in Tasks B and C are substantially positive. This agrees with the results of previous analyses (summarized in Table 1) according to which the constant-ratio model is satisfied in Task A, but not in Tasks B and C.

Taken together, the experimental findings suggest the hypothesis that the constant-ratio model is valid for choice among unitary alternatives (e.g., dots, colors, sounds) that are usually evaluated as wholes, but not for composite alternatives (e.g., gambles, applicants) that tend to be evaluated in terms of their attributes or components. This hypothesis is closely related to a suggestion made by Luce (1959):

If we call a decision that is not subdivided into simpler decisions an elementary choice, then possibly we can hope to find Axiom 1 [Luce's choice axiom] directly confirmed for elementary choices but probably not for more complex ones [p. 133].

Research on multidimensional scaling based on similarity, or proximity, data (e.g., Shepard, 1964a; Torgerson, 1965) has also shown that judgments of unitary and composite stimuli (sometimes referred to as analyzable and unanalyzable) are governed by different rules. Much additional research, however, is required in order to assess the validity and the generality of the proposed hypothesis.

Finally, the distinction between unitary and composite stimuli is logically independent of whether the inconsistency reflected in choice probabilities is attributable to imperfect discrimination or to a conflict among incompatible criteria. (For a discussion of this last distinction, see Block & Marschak, 1960.) Although choice experiments in psychophysics typically involve imperfect discrimination with unitary stimuli while preference experiments are usually concerned with conflict among composite alternatives, the other two combinations also exist.

DISCUSSION

Strategic Implications

A major feature of the elimination-by-aspects model is that the probability of selecting an alternative depends not only on its overall value, but also on its relations to the other available alternatives. This gives rise to the study of strategic factors in the design and the presentation of choice alternatives. Specifically, the present model provides a method for investigating questions concerning optimal design or location of alternatives in order to maximize (or minimize) choice probability under specified constraints. The following examples are intended to illustrate the scope and the nature of such a study.

First, consider a problem of binary comparisons. Suppose y and z are given, and we search for x such that P(x;y) is maximized under the constraints that z has no aspects in common with any other alternative, and that P(y;z) and P(x;z) are fixed. By the former constraint, z can be viewed as a standard of comparison. Hence, the latter constraint can be interpreted as meaning that the overall values of y and of x (evaluated relative to z) are held fixed. Thus, only the position of x relative to y can be varied to maximize P(x;y). Under these conditions, the present model implies that if P(x;z) > P(y;z), x' should include as much of y' as possible. If, on the other hand, P(x;z) < P(y;z), x' should include as little of y' as possible. The degree of overlap between x' and y' can be regarded as an index of the difficulty of comparing them. If x' includes y', the comparison is trivial, and P(x;y) is maximal. If x' and y' are disjoint, the comparison is much more difficult, and P(x;y) is less extreme.
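The effect of overlap on the binary probability can be traced with a small sketch. It assumes that the total weights of x' and y' (here called `vx` and `vy`, hypothetical names) are held fixed, which fixes their probabilities against the disjoint standard z, and that only the shared weight d varies.

```python
from fractions import Fraction

def p_xy_given_overlap(vx, vy, d):
    """P(x;y) when x' and y' have total weights vx, vy and share weight d."""
    if not 0 <= d <= min(vx, vy):
        raise ValueError("shared weight must lie between 0 and min(vx, vy)")
    only_x, only_y = vx - d, vy - d          # shared aspects drop out
    if only_x + only_y == 0:
        raise ValueError("x and y are aspectwise identical")
    return Fraction(only_x, only_x + only_y)

# x favored (vx > vy): more overlap raises P(x;y).
print([str(p_xy_given_overlap(3, 2, d)) for d in (0, 1, 2)])   # ['3/5', '2/3', '1']

# x not favored (vx < vy): more overlap lowers P(x;y).
print([str(p_xy_given_overlap(2, 3, d)) for d in (0, 1, 2)])   # ['2/5', '1/3', '0']
```

In both cases the probability is closest to 1/2 when the two alternatives are disjoint and most extreme when one includes the other.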

In the light of this interpretation, the above result asserts that it is in the best interest of the favored alternative to make the comparison as easy as possible, while it is in the best interest of the nonfavored alternative to make the comparison as difficult as possible. This certainly makes sense: any increase in the difficulty of comparing the alternatives adds "error" to the judgment process and makes P(x;y) closer to 1/2. According to this logic, drastically different policies are prescribed depending on whether x is the favored or the nonfavored alternative. Advertising campaigns based on slogans such as "All aspirins are the same - why pay more?" and "This car is completely different from any other car in its class," illustrate, respectively, the policies recommended to the favored and the nonfavored alternatives. Note that these policies could be employed in the design of products as well as in their advertisements.

Second, let $T = \{x, y, \ldots, z\}$ and suppose that all pairwise choice probabilities are fixed and that we wish to select a set $A \subseteq T$ which includes both x and y so that the ratio P(x,A)/P(y,A) is maximized. According to the elimination-by-aspects model, the above ratio is maximized when A consists of alternatives (which are not dominated by y) that "cover" as much of y as possible without "covering" much of x. If x and y are products in some market A, for example, then the present model predicts that the relative advantage of x over y is maximized when the other available products are as similar to y and dissimilar to x as possible. The example of choice among records discussed in the introduction and the similarity effect demonstrated in Table 1 illustrate the point. Note that this maximization problem cannot be investigated in Luce's model (Equation 2), for example, since by the constant-ratio rule P(x,A)/P(y,A) = P(x;y)/P(y;x), $x, y \in A$, and hence is independent of A. According to the EBA model, in contrast, the above ratio can, in principle, be arbitrarily large, provided $P(x;y) \neq 0$.

Thus, if the present theory is valid, one can take advantage of the so-called "irrelevant alternatives" to influence choice probabilities. This result is based on the idea that the introduction of an additional alternative "hurts" similar alternatives more than dissimilar ones. This is a familiar notion in the context of group choice. The present development suggests that it is an important determinant of individual choice behavior as well. In practice, problems such as the design of a product or a political campaign involve many specific constraints concerning the nature of the product or the candidate. To the extent that these constraints can be translated into the present framework, the elimination-by-aspects model can be used (or abused) to determine the optimal design, or location, of choice alternatives.

Psychological Interpretation

The EBA model accounts for choice in terms of a covert elimination process based on sequential selection of aspects. Any such sequence of aspects can be regarded as a particular state of mind which leads to a unique choice. In light of this interpretation, the choice mechanism at any given moment in time is entirely deterministic; the probabilities merely reflect the fact that at different moments in time different states of mind (leading to different choices) may prevail. According to the present theory, choice probability is an increasing function of the values of the relevant aspects. Indeed, the elimination-by-aspects model is compensatory in nature despite the fact that at any given instant in time, the choice is assumed to follow a conjunctive (or a lexicographic) strategy. Thus, the present model is compensatory "globally" with respect to choice probability but not "locally" with regard to any particular state of mind.

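The idea that each choice follows one deterministic sequence of aspects, while probabilities arise across repetitions, can be illustrated by simulation. This is a rough sketch under stated assumptions: the aspect names and weights are invented (they mimic the record example with a = .01 and b = 1), aspects shared by all remaining alternatives are skipped because they eliminate nothing, and ties among aspectwise identical alternatives are broken at random.

```python
import random

def simulate_eba(offered, has_aspect, weight, rng):
    """One simulated choice: repeatedly pick an aspect and eliminate alternatives lacking it."""
    remaining = sorted(offered)
    while len(remaining) > 1:
        present = {asp for alt in remaining for asp in has_aspect[alt]}
        shared = set.intersection(*(has_aspect[alt] for alt in remaining))
        candidates = sorted(present - shared)      # aspects that discriminate
        if not candidates:                          # identical alternatives
            return rng.choice(remaining)
        chosen = rng.choices(candidates,
                             weights=[weight[c] for c in candidates])[0]
        remaining = [alt for alt in remaining if chosen in has_aspect[alt]]
    return remaining[0]

# Hypothetical aspects for the three records.
weight = {"debussy": 1.01, "rec1": 0.01, "rec2": 0.01, "beethoven": 1.0}
has_aspect = {
    "D": {"debussy"},
    "B1": {"rec1", "beethoven"},
    "B2": {"rec2", "beethoven"},
}

rng = random.Random(0)        # fixed seed so the run is repeatable
runs = 20_000
counts = {alt: 0 for alt in has_aspect}
for _ in range(runs):
    counts[simulate_eba(has_aspect, has_aspect, weight, rng)] += 1
freq = {alt: n / runs for alt, n in counts.items()}
print(freq)
```

Each simulated run is a single state of mind ending in one choice; across runs the relative frequencies should settle near 101/203 for D and 51/203 for each Beethoven recording, the exact values under these assumed weights.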
In the proposed model, aspects are interpreted as desirable features; the selection of any particular aspect leads to elimination of all alternatives that do not contain the selected aspect. Following the present development, one can formulate a dual model where aspects are interpreted as disadvantages, or regrets, associated with the alternatives. According to such a model, the selection of a particular aspect leads to the elimination of all alternatives that contain the selected aspect. This model is also based on the notion of elimination by aspects, except that here an alternative is chosen if and only if none of its aspects is selected, whereas in the model developed in this paper an alternative is selected if and only if it includes all the selected aspects. The former model may be more appropriate when the defining features of the alternatives are naturally viewed as undesirable. In choosing among various insurance policies, for example, it may be more natural to apply the strategy of elimination by aspects to the various risks and premiums, treated as disadvantages or regrets, than to interpret them as relative advantages with respect to some reference points (see Footnote 8).

Although the present model has been introduced and discussed in terms of aspects, we have shown that it requires no specific assumptions concerning the structure of these aspects. In the course of the investigation, however, assumptions concerning the structure and/or the relative weights of aspects were sometimes introduced. In discussing the Paris-Rome problem, for example, we assumed that Paris + (i.e., a trip to Paris plus an added bonus) includes Paris in the sense that all aspects of the latter are included in the former. Similarly, in analyzing Debreu's example, we assumed that the two recordings $B_1$ and $B_2$ of the Beethoven symphony are very similar to each other, whereas the suite by Debussy is relatively dissimilar to either of them. Essentially the same assumption was employed in the analysis of the experimental data. In all these instances, specific assumptions about the structure or the relative weights of aspects were added to the model on the basis of some prior analysis of the alternatives. The addition of such assumptions strengthens the predictions of the model and tightens its empirical interpretation. These assumptions, however, must be carefully examined because the inadequacy of an added assumption can erroneously be interpreted as a failure of the model.

8 George Miller remarked that people seem to be better at finding what is wrong with an alternative than what is good about it. This certainly is true of some people, who might then find the "negative" version of the model less objectionable or more compatible with their way of thinking.

To illustrate this point, consider the following example of choice between articles of clothing. Let J denote a jacket, S a pair of matching slacks, and C a coat. Suppose that the coat is more valuable than the jacket, so P(C;J) > 1/2. But since the slacks and the jacket are well matched, P(JS;CS) > 1/2, where JS and CS denote the options consisting of the combined respective articles. Both JS and CS share the same article, S; hence one might be tempted to interpret S as a collection of aspects shared by the two alternatives. According to the elimination-by-aspects model, such aspects could be deleted without affecting the choice process. Consequently, under the proposed interpretation of S, P(JS;CS) = P(J;C), contrary to the assumptions. Further reflection, however, reveals that the interpretation of S as a collection of aspects common to both options is inappropriate. The fact that the jacket and the slacks form an attractive outfit implies that this alternative has some gestaltlike properties, or that the option JS includes some aspects that are not included in either J or S alone. Hence, the fact that the option JS includes both J and S as components does not, by itself, justify the conclusion that the aspects of JS can be partitioned into those associated with J and S alone.

Rational Choice and the Logic of Elimination by Aspects

The following television commercial serves to introduce the problem. "There are more than two dozen companies in the San Francisco area which offer training in computer programming." The announcer puts some two dozen eggs and one walnut on the table to represent the alternatives, and continues: "Let us examine the facts. How many of these schools have on-line computer facilities for training?" The announcer removes several eggs. "How many of these schools have placement services that would help find you a job?" The announcer removes some more eggs. "How many of these schools are approved for veterans' benefits?" This continues until the walnut alone remains. The announcer cracks the nutshell, which reveals the name of the company and concludes: "This is all you need to know in a nutshell."

This commercial illustrates the logic of elimination by aspects; it also suggests that this logic has some normative appeal as a method of choosing among many complex alternatives. The appeal of this logic stems primarily from the fact that it is easy to state, defend, and apply. In choosing among many complex alternatives such as new cars or job offers, one typically faces an overwhelming amount of relevant information. Optimal policies for choosing among such alternatives usually require involved computations based on the weights assigned to the various relevant factors, or on the compensation rates associated with the critical variables. Since man's intuitive computational facilities are quite limited (Shepard, 1964b; Slovic & Lichtenstein, 1971), such optimal policies are difficult to apply.

Moreover, it seems that people are reluctant to accept the principle that (even very important) decisions should depend on computations based on subjective estimates of likelihoods or values in which the decision maker himself has only limited confidence. When faced with an important decision, people appear to search for an analysis of the situation and a compelling principle of choice which will resolve the decision problem by offering a clear-cut choice without relying on estimation of relative weights, or on numerical computations. (Altogether people seem to have more confidence in the rationality of their decisions than in the validity of their intuitive estimates, and the fact that the former depends on the latter is often met with a mixture of resistance and unhappiness.)

The strategy of elimination by aspects (illustrated by the above commercial) provides an example of such a principle: It is relatively easy to apply, it involves no numerical computations, and it is easy to explain and justify in terms of a priority ordering defined on the aspects. Inasmuch as people look for a decision rule that not only looks sensible, but which also seems easy to defend to oneself as well as to others, the principle of elimination by aspects appears attractive. Its uncritical application, however, may lead to very poor decisions. For virtually any available alternative, no matter how inadequate it might be, one can devise a sequence of selected aspects or, equivalently, describe a particular state of mind that leads to the choice of that alternative.
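The dependence of the outcome on the order in which aspects are considered can be made concrete with a small sketch. The code below is an illustration only: it applies a fixed priority ordering rather than the probabilistic selection of aspects assumed by the model, and the school names, aspect labels, and the helper `eliminate_by_priority` are all hypothetical.

```python
# Hypothetical training programs, each described by the aspects it offers.
SCHOOLS = {
    "A": {"online_facilities", "placement"},
    "B": {"placement", "veterans_benefits"},
    "C": {"online_facilities", "veterans_benefits", "low_cost"},
}


def eliminate_by_priority(alternatives, priority):
    """Apply the aspects in order, dropping alternatives that lack each one."""
    remaining = dict(alternatives)
    for aspect in priority:
        if len(remaining) == 1:
            break
        survivors = {name: a for name, a in remaining.items() if aspect in a}
        # An aspect that no remaining alternative has cannot be selected.
        if survivors:
            remaining = survivors
    return sorted(remaining)


choice_1 = eliminate_by_priority(SCHOOLS, ["online_facilities", "placement"])
choice_2 = eliminate_by_priority(SCHOOLS, ["veterans_benefits", "placement"])
choice_3 = eliminate_by_priority(SCHOOLS, ["low_cost"])
print(choice_1, choice_2, choice_3)
```

In this toy setting each of the three programs can be made the sole survivor by a suitable ordering of aspects, which is the sense in which a sequence can be devised for virtually any alternative.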

Indeed, the purpose of advertisement is to induce a state of mind in the decision maker which will result in the purchase of the advertised product. This is typically accomplished by increasing the salience and the availability of the desired state of mind. Being influenced by such factors, people are often lured into adopting a state of mind which, upon further reflection, appears atypical or inadequate. Shepard (1964b) tells of a person who is induced to purchase the Encyclopedia Britannica by imagining how he would read it in his free time and impress his friends with his newly acquired knowledge. Only after failing to consult the Encyclopedia Britannica for a long period of time does the person realize how inappropriate the state of mind was that had led him to purchase those many dusty volumes.

From a normative standpoint, the major flaw in the principle of elimination by aspects lies in its failure to ensure that the alternatives retained are, in fact, superior to those which are eliminated. In the problem addressed by the above commercial, for instance, the existence of placement services that would help the trainee to find a job is certainly a desirable aspect of the advertised program. Its use as a criterion for elimination, however, may lead to the rejection of programs whose overall quality exceeds that of the advertised one despite the fact that they do not offer placement services.
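A brief numerical sketch may help to show how elimination on a single desirable aspect can conflict with a compensatory evaluation. Everything in it is assumed for illustration: the two programs, their attribute ratings, the importance weights, and the use of a simple weighted sum as a stand-in for overall quality.

```python
# Hypothetical ratings of two programs on three attributes.
PROGRAMS = {
    "advertised": {"placement": 1, "instruction": 4, "facilities": 3},
    "rival": {"placement": 0, "instruction": 9, "facilities": 8},
}

# Hypothetical importance weights for a compensatory (weighted-sum) rule.
IMPORTANCE = {"placement": 3, "instruction": 5, "facilities": 2}


def weighted_score(ratings):
    """Overall quality as a weighted sum of attribute ratings."""
    return sum(IMPORTANCE[name] * value for name, value in ratings.items())


def survivors_with(aspect):
    """Programs that remain after eliminating those lacking the aspect."""
    return [name for name, ratings in PROGRAMS.items() if ratings[aspect] > 0]


scores = {name: weighted_score(r) for name, r in PROGRAMS.items()}
best_by_score = max(scores, key=scores.get)
kept_by_elimination = survivors_with("placement")
print(scores, best_by_score, kept_by_elimination)
```

Under these assumed numbers the rival program has the higher weighted score (61 against 29), yet it is the one removed when placement services are used as the criterion for elimination.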

In general, therefore, the strategy of elimination by aspects cannot be defended as a rational procedure of choice. On the other hand, there may be many contexts in which it provides a good approximation to much more complicated compensatory models and could thus serve as a useful simplification procedure. The conditions under which the approximation is adequate, and the manner in which this principle could be utilized to facilitate and improve decision making, are subjects for future investigations.

REFERENCES

BECKER, G. M., DEGROOT, M. H., & MARSCHAK, J. Stochastic models of choice behavior. Behavioral Science, 1963, 8, 41-55. (a)

BECKER, G. M., DEGROOT, M. H., & MARSCHAK, J. Probabilities of choices among very similar objects. Behavioral Science, 1963, 8, 306-311. (b)

BLOCK, H. D., & MARSCHAK, J. Random orderings and stochastic theories of responses. In I. Olkin, S. Ghurye, W. Hoeffding, W. Madow, & H. Mann (Eds.), Contributions to probability and statistics. Stanford: Stanford University Press, 1960.

CHANDLER, J. P. STEPIT: Finds local minima of a smooth function of several parameters. Behavioral Science, 1969, 14, 81-82.

CHIPMAN, J. S. Stochastic choice and subjective probability. In D. Willner (Ed.), Decisions, values, and groups. Vol. 1. New York: Pergamon Press, 1960.

COOMBS, C. H. On the use of inconsistency of preferences in psychological measurement. Journal of Experimental Psychology, 1958, 55, 1-7.

COOMBS, C. H. A theory of data. New York: Wiley, 1964.

DEBREU, G. Review of R. D. Luce, Individual choice behavior: A theoretical analysis. American Economic Review, 1960, 50, 186-188.

ESTES, W. K. A random-walk model for choice behavior. In K. J. Arrow, S. Karlin, & P. Suppes (Eds.), Mathematical methods in the social sciences, 1959. Stanford: Stanford University Press, 1960.

FISHBURN, P. C. Utility theory. Management Science, 1968, 13, 435-453.

KRANTZ, D. H. The scaling of small and large color differences. (Doctoral dissertation, University of Pennsylvania) Ann Arbor, Mich.: University Microfilms, 1964. No. 65-5777.

KRANTZ, D. H. Rational distance function for multidimensional scaling. Journal of Mathematical Psychology, 1967, 4, 226-245.

LANCASTER, K. J. A new approach to consumer theory. Journal of Political Economy, 1966, 74, 132-157.

LUCE, R. D. Individual choice behavior: A theoretical analysis. New York: Wiley, 1959.

LUCE, R. D., & RAIFFA, H. Games and decisions. New York: Wiley, 1957.

LUCE, R. D., & SUPPES, P. Preference, utility, and subjective probability. In R. D. Luce, R. R. Bush, & E. Galanter (Eds.), Handbook of mathematical psychology, III. New York: Wiley, 1965.

MARSCHAK, J. Binary-choice constraints and random utility indicators. In K. J. Arrow, S. Karlin, & P. Suppes (Eds.), Mathematical methods in the social sciences, 1959. Stanford: Stanford University Press, 1960.

MORRISON, H. W. Testable conditions for triads of paired comparison choices. Psychometrika, 1963, 28, 369-390.

RESTLE, F. Psychology of judgment and choice. New York: Wiley, 1961.

RUMELHART, D. L., & GREENO, J. G. Similarity between stimuli: An experimental test of the Luce and Restle choice models. Journal of Mathematical Psychology, 1971, 8, 370-381.

SHEPARD, R. N. Attention and the metric structure of the stimulus space. Journal of Mathematical Psychology, 1964, 1, 54-87. (a)

SHEPARD, R. N. On the subjectively optimum selection among multiattribute alternatives. In M. W. Shelly & G. L. Bryan (Eds.), Human judgments and optimality. New York: Wiley, 1964. (b)

SLOVIC, P., & LICHTENSTEIN, S. C. Comparison of Bayesian and regression approaches to the study of information processing in judgment. Organizational Behavior and Human Performance, 1971, 6, 649-744.

THURSTONE, L. L. A law of comparative judgment. Psychological Review, 1927, 34, 273-286.

THURSTONE, L. L. The measurement of values. Chicago: University of Chicago Press, 1959.

TORGERSON, W. S. Multidimensional scaling of similarity. Psychometrika, 1965, 30, 379-393.

TVERSKY, A. Intransitivity of preferences. Psychological Review, 1969, 76, 31-48.

TVERSKY, A. Choice by elimination. Journal of Mathematical Psychology, 1972, in press.

TVERSKY, A., & RUSSO, J. E. Similarity and substitutability in binary choices. Journal of Mathematical Psychology, 1969, 6, 1-12.

APPENDIX

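This appendix establishes the equivalence of the two formulations (Equations 6 and 9) of the elimination-by-aspects model. Let $T$ be a finite set of alternatives. For each $x \in T$, let $x'$ denote the set of aspects associated with $x$. For any $A \subseteq T$, define $A' = \{\alpha \mid \alpha \in x' \text{ for some } x \in A\}$, $A^0 = \{\alpha \mid \alpha \in x' \text{ for all } x \in A\}$, and $\bar{A} = \{\alpha \mid \alpha \in x' \text{ for all } x \in A \text{ and } \alpha \notin y' \text{ for any } y \notin A\}$. We wish to show that there exists a positive scale $u$ on $T' - T^0$ satisfying Equation 6 if and only if there exists a scale $U$ on $\{\bar{A}_i \mid A_i \subset T\}$ satisfying Equation 9.

It follows at once from the above definitions that $\{\bar{A}_i \mid A_i \subset T\}$ forms a partition of $T' - T^0$, since any $\alpha \in T' - T^0$ belongs to exactly one $\bar{A}_i$. Suppose Equation 6 holds. For any $A \subset T$, define

$$U(\bar{A}) = \sum_{\alpha \in \bar{A}} u(\alpha).$$

By the positivity of $u$, $U$ is nonnegative and $U(\bar{A}) = 0$ iff $\bar{A} = \emptyset$. Note that if $\alpha, \beta \in \bar{B}$ then for all $A \subseteq T$, $A_\alpha = A_\beta = A \cap B$. Furthermore, since for any $x \in T$, $\{\bar{B}_i \mid x \in B_i\}$ forms a partition of $x'$, the numerator in Equation 6 can be expressed as

$$\sum_{\alpha \in x' - A^0} P(x, A_\alpha)\,u(\alpha) = \sum_{x \in B_i \not\supseteq A} \; \sum_{\alpha \in \bar{B}_i} P(x, A_\alpha)\,u(\alpha) = \sum_{B_i \not\supseteq A} P(x, A \cap B_i)\,U(\bar{B}_i).$$

(The condition $x \in B_i$ under the summation sign is deleted because for any $x \notin B_i$, $P(x, A \cap B_i) = 0$.) Similarly, since $\{\beta \mid \beta \in A' - A^0\} = \{\beta \mid \beta \in \bar{A}_j \text{ for some } A_j \text{ such that } A_j \not\supseteq A \text{ and } A_j \cap A \neq \emptyset\}$, the denominator in Equation 6 can be expressed as

$$\sum_{\beta \in A' - A^0} u(\beta) = \sum_{A_j \in \mathcal{A}} U(\bar{A}_j),$$

where $\mathcal{A} = \{A_j \mid A_j \cap A \neq A, \emptyset\}$.

Thus, Equation 6 reduces to Equation 9, since

$$\frac{\sum_{\alpha \in x' - A^0} u(\alpha)\,P(x, A_\alpha)}{\sum_{\beta \in A' - A^0} u(\beta)} = \frac{\sum_{B_i \not\supseteq A} U(\bar{B}_i)\,P(x, A \cap B_i)}{\sum_{A_j \in \mathcal{A}} U(\bar{A}_j)}.$$

Conversely, suppose Equation 9 holds. That is, there exists a scale $U$ such that $P(x, A)$ is given by the right-hand side of the above equation. For any $x \in T$, let $x' = \{\bar{A}_i \mid A_i \subset T,\ x \in A_i\}$, and $u = U$; hence Equation 9 reduces to Equation 6. Finally, if either of the above denominators vanishes, so does the other.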
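The equivalence argued above can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the three alternatives, their aspects, and the weights are hypothetical; `eq6` computes choice probabilities recursively from aspect-level weights (the form of Equation 6), while `eq9` first pools the weights of all aspects that belong to exactly the same subset of alternatives and then applies the same recursion to those pooled weights (the form of Equation 9). Returning 1/n when a denominator vanishes is an assumption made here to keep the functions total.

```python
from fractions import Fraction

# Hypothetical alternatives, aspects, and weights (not taken from the paper).
ASPECTS = {
    "x": frozenset({"a", "b", "d", "t"}),
    "y": frozenset({"b", "c", "t"}),
    "z": frozenset({"c", "d", "e", "t"}),
}
WEIGHT = {k: Fraction(v) for k, v in
          {"a": 3, "b": 2, "c": 1, "d": 4, "e": 2, "t": 5}.items()}
T = frozenset(ASPECTS)


def eq6(x, A):
    """P(x, A) from aspect-level weights u."""
    if x not in A:
        return Fraction(0)
    if len(A) == 1:
        return Fraction(1)
    shared = frozenset.intersection(*(ASPECTS[y] for y in A))  # A^0
    union = frozenset.union(*(ASPECTS[y] for y in A))          # A'
    denom = sum(WEIGHT[b] for b in union - shared)
    if denom == 0:  # assumption: identical alternatives are chosen uniformly
        return Fraction(1, len(A))
    num = sum(
        WEIGHT[a] * eq6(x, frozenset(y for y in A if a in ASPECTS[y]))
        for a in ASPECTS[x] - shared
    )
    return num / denom


# U maps each subset B of T to the pooled weight of the aspects that belong
# to exactly the alternatives in B (the set written B-bar in the text).
U = {}
for aspect, w in WEIGHT.items():
    owners = frozenset(y for y in T if aspect in ASPECTS[y])
    U[owners] = U.get(owners, Fraction(0)) + w


def eq9(x, A):
    """P(x, A) from the pooled weights U, with no reference to aspects."""
    if x not in A:
        return Fraction(0)
    if len(A) == 1:
        return Fraction(1)
    num = sum(w * eq9(x, A & B) for B, w in U.items() if not A <= B)
    denom = sum(w for B, w in U.items() if not A <= B and A & B)
    if denom == 0:  # assumption: same convention as in eq6
        return Fraction(1, len(A))
    return num / denom


for x in sorted(T):
    print(x, eq6(x, T), eq9(x, T))
```

Exact rational arithmetic is used so that the two computations can be compared for equality rather than up to rounding error.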

(Received February 7, 1972)

