ARI Research Note 90-100

Evaluation of a Method of Verbally Expressing Degree of Belief by Selecting Phrases From a List

Robert M. Hamm
University of Colorado

for

Contracting Officer's Representative
Michael Drillings

Basic Research
Michael Kaplan, Director

August 1990

United States Army Research Institute for the Behavioral and Social Sciences

Approved for public release; distribution is unlimited.


U.S. ARMY RESEARCH INSTITUTE
FOR THE BEHAVIORAL AND SOCIAL SCIENCES

A Field Operating Agency Under the Jurisdiction
of the Deputy Chief of Staff for Personnel

EDGAR M. JOHNSON, Technical Director
JON W. BLADES, COL, IN, Commanding

Research accomplished under contract for the Department of the Army

Institute of Cognitive Science, University of Colorado

NOTICES

DISTRIBUTION: This report has been cleared for release to the Defense Technical Information Center (DTIC) to comply with regulatory requirements. It has been given no primary distribution other than to DTIC and will be available only through DTIC or the National Technical Information Service (NTIS).

FINAL DISPOSITION: This report may be destroyed when it is no longer needed. Please do not return it to the U.S. Army Research Institute for the Behavioral and Social Sciences.

NOTE: The views, opinions, and findings in this report are those of the author(s) and should not be construed as an official Department of the Army position, policy, or decision, unless so designated by other authorized documents.


UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE

REPORT DOCUMENTATION PAGE (Form Approved, OMB No. 0704-0[...])

1a. Report Security Classification: Unclassified
1b. Restrictive Markings:
2a. Security Classification Authority:
2b. Declassification/Downgrading Schedule:
3. Distribution/Availability of Report: Approved for public release; distribution is unlimited.
4. Performing Organization Report Number(s):
5. Monitoring Organization Report Number(s): ARI Research Note 90-100
6a. Name of Performing Organization: Institute of Cognitive Science
6b. Office Symbol (if applicable):
6c. Address (City, State, and ZIP Code): University of Colorado, Box 345, Boulder, CO 80309-0345
7a. Name of Monitoring Organization: U.S. Army Research Institute
7b. Address (City, State, and ZIP Code): 5001 Eisenhower Avenue, Alexandria, VA 22333-5600
8a. Name of Funding/Sponsoring Organization: U.S. Army Research Institute for the Behavioral and Social Sciences
8b. Office Symbol (if applicable): PERI-BR
8c. Address (City, State, and ZIP Code): 5001 Eisenhower Avenue, Alexandria, VA 22333-5600
9. Procurement Instrument Identification Number: MDA903-86-K-0265
10. Source of Funding Numbers: Program Element No. 61102B; Project No. 74F; Task No. N/A; Work Unit Accession No. N/A
11. Title (Include Security Classification): Evaluation of a Method of Verbally Expressing Degree of Belief by Selecting Phrases From a List
12. Personal Author(s): Hamm, Robert M.
13a. Type of Report: Interim
13b. Time Covered: (dates illegible)
14. Date of Report (Year, Month, Day): 1990, August
15. Page Count: 49
16. Supplementary Notation: Contracting Officer's Representative, Michael Drillings
17. COSATI Codes (Field, Group, Sub-Group):
18. Subject Terms: Probability; Probabilistic inference; Verbal probabilities; Bias

19. Abstract:

This report describes a method for verbal expression of degree of uncertainty. The method requires the subject to select a phrase from a list that spans the full range of probabilities. In a second, optional step, the subject indicates the numerical meaning of each phrase. Alternative list orders were compared to determine the effects of presenting the phrases in ordered sequence or randomly. When the verbal expressions were arranged in random order, ordinal position had a significant effect on the selection of expressions, and the preference for phrases with broader ranges of meaning was stronger in the second half of the list. However, these effects did not occur when the phrases were listed in ascending or descending order. Considerations of accuracy and interpersonal agreement also support the use of ordered phrase lists.

20. Distribution/Availability of Abstract: Unclassified/Unlimited
21. Abstract Security Classification: Unclassified
22a. Name of Responsible Individual: Michael Drillings
22b. Telephone (Include Area Code): (202) 274-8722
22c. Office Symbol: PERI-BR

DD Form 1473, JUN 86. Previous editions are obsolete.
SECURITY CLASSIFICATION OF THIS PAGE: UNCLASSIFIED


EVALUATION OF A METHOD OF VERBALLY EXPRESSING DEGREE OF BELIEF BY SELECTING PHRASES FROM A LIST

CONTENTS

INTRODUCTION
    Description of the Method for Verbal Expression of Degree of Uncertainty

METHOD

RESULTS
    Effect of Phrase List Order on Problem Answers
    Effect of List Order on Values Assigned to the Verbal Expressions of Probability
    Effect of Phrase List Order on Accuracy of Problem Answers

DISCUSSION

BIBLIOGRAPHY

APPENDIX 1. THE "DOCTOR" PROBLEM, ONE OF FOUR PROBABILISTIC INFERENCE WORD PROBLEMS USED IN THE STUDY

APPENDIX 2. INSTRUCTIONS FOR QUESTIONNAIRE ELICITING LOWER AND UPPER BOUNDS ON THE NUMERICAL MEANINGS OF EACH PHRASE


Evaluation of a Method of Verbally Expressing Degree of Belief by Selecting Phrases from a List.

1. Introduction.

The issue of whether people think and communicate better using numerical or verbal expressions of probability has received recent attention (Beyth-Marom, 1982; Kong, Barnett, Mosteller, and Youtz, 1986; Wallsten, Budescu, Rapoport, Zwick, and Forsyth, 1986; Zimmer, 1983). In a number of contexts, communication using verbal expressions of probability is preferable, even though it may be less precise than numerical communication (Zwick, 1987). Reasons for this include people's preference for verbal probabilities and the possibility that linguistic terms may facilitate thinking about uncertainty in complex problems (Zimmer, 1983). Accuracy on probabilistic inference word problems is not generally better in either mode (Hamm, 1988).

This paper describes a method designed to avoid several problems that may limit the usefulness of verbal expressions of probability, and reports a study that evaluates possible confounding factors.

Though verbal expression of probability is justified in some situations, it presents problems that must be solved if it is to be broadly useful. First, the meanings of phrases differ between people, although they seem stable for individuals over time (Budescu and Wallsten, 1985), and Kong, Barnett, Mosteller, and Youtz (1986) found no systematic differences between occupational groups. Second, there is an indefinitely large number of words and phrases that could be used to express degree of belief. This makes it difficult to develop a lexicon of the numerical meanings of all verbal expressions of probability. Any new phrase would pose a problem of interpretation, in contrast with a new number, which can be easily understood because it can be placed unambiguously on the [0, 1] number line. The method to be described below solves the problem of individual differences by having the subject assign a numerical value to each phrase (either before or after the phrases have been used in problem solving or communication). It addresses the problem of the indefinitely large lexicon by confining the subject's responses to a limited set of verbal phrases, selected to cover the full range of degrees of belief (though it risks using phrases that subjects may not understand as precisely as they understand their own words).

A third problem with verbal expressions of probability is that the meaning of a phrase may depend on contextual factors. For example, it may depend on the object whose probability is being discussed (Wallsten, Fillenbaum, and Cox, 1986; Mapes, 1979). Thus, "highly likely" may have a different numerical interpretation if applied to the possible failure of a Broadway play than if applied to the possible meltdown of a nuclear reactor. Although this issue is not addressed in this study, the subject's assignment of numbers to phrases could be done on a context-specific basis. Next, the meaning of a phrase may depend on the other phrases available in the choice set. For example, the meaning of "probable" may depend on whether "not probable" is present in the list. A term and its negation may mutually influence their meanings to be equidistant from the midpoint of 50%. Two similar terms such as "fairly unlikely" and "somewhat unlikely" may be assigned the same broad meaning if only one of them is in a list, but may be assigned adjacent but non-overlapping meanings if both are present. Finally, a phrase's immediate neighbors in a list may affect its meaning. Thus, "rarely" may mean something different if positioned between "very unlikely" and "absolutely impossible" than if its neighbors are "good chance" and "slightly less than half the time".

A fourth problem is that when subjects read a list of candidate phrases they must do so sequentially. Phrases early in the list may be more likely to be chosen if subjects stop reading after finding one that is good enough. Or phrases late in the list may be favored if subjects read through the whole list and choose from among phrases that are still in short-term memory when they finish. Fifth, while the meanings of all verbal expressions of probability may be inherently vague, some phrases may be more vague than others (Wallsten, Budescu, Rapoport, Zwick, and Forsyth, 1986). There are a number of possible mechanisms (detailed below) by which these differences in vagueness might affect the selection of a term from a list.


The problems of context, primacy/recency, and differences in vagueness are addressed experimentally in the present study. Another issue explored is the effect of presenting the phrases in sequential (ascending or descending) or random order. A sequential list would allow a subject to more rapidly find the phrase he or she wants, but it might also constrain the subject's interpretation of the meanings of the phrases. Whether such a constraint is an advantage or disadvantage will be discussed below.

1.1. Description of the method for verbal expression of degree of uncertainty.

In order for subjects to use verbal rather than numerical expressions of probability, while avoiding the requirement of an ever-expanding lexicon, subjects can be asked to select verbal expressions of probability from a pre-defined list. To decrease miscommunication due to individual differences in interpretation of words or phrases, they can be asked in a separate procedure to supply numerical interpretations for the terms in the list.

In the version of the method used in the present study, nineteen verbal expressions covered the range from 0% to 100%, with symmetry about an easily identifiable midpoint ("tossup"). The list was structured so that there was a term for each 5% mark, except that there was only one term between 25% and 40%, and only one between 60% and 75%. Other researchers may wish to use shorter (or longer) lists, lists without a sharply defined midpoint, lists that are not balanced around 50% (see Kong, Barnett, Mosteller, and Youtz, 1986), or different phrases. These distinctions, while important, are not pertinent to the present investigation of factors affecting the selection of phrases from the phrase list. The results of this study are applicable to lists composed of any set of verbal expressions of probability.
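As a concrete illustration of the list structure described above, the following Python sketch encodes the nineteen phrases with the values adopted for this study (copied from Table 1) and reports where the spacing between adjacent terms exceeds the nominal 5% step. The names `ADOPTED_VALUES`, `nominal_probability`, and `wide_gaps` are illustrative choices made for this sketch and are not part of the original method.

```python
# Adopted numerical values for the 19 phrases, copied from Table 1.
ADOPTED_VALUES = {
    "absolutely impossible": 0.00,
    "rarely": 0.05,
    "very unlikely": 0.10,
    "seldom": 0.15,
    "not very probable": 0.20,
    "fairly unlikely": 0.25,
    "somewhat unlikely": 0.33,
    "uncertain": 0.40,
    "slightly less than half the time": 0.45,
    "toss-up": 0.50,
    "slightly more than half the time": 0.55,
    "better than even": 0.60,
    "rather likely": 0.70,
    "good chance": 0.75,
    "quite likely": 0.80,
    "very probable": 0.85,
    "highly probable": 0.90,
    "almost certain": 0.95,
    "absolutely certain": 1.00,
}


def nominal_probability(phrase):
    """Return the adopted value for a phrase (hypothetical helper)."""
    return ADOPTED_VALUES[phrase.strip().lower()]


def wide_gaps(values, step=0.05, tol=1e-9):
    """Return adjacent pairs whose spacing exceeds the nominal step."""
    ordered = sorted(values)
    return [(lo, hi) for lo, hi in zip(ordered, ordered[1:]) if hi - lo > step + tol]


if __name__ == "__main__":
    print(len(ADOPTED_VALUES))                 # number of phrases in the list
    print(nominal_probability("Toss-up"))      # value adopted for the midpoint
    print(wide_gaps(ADOPTED_VALUES.values()))  # pairs spaced more than 5% apart
```

The wide gaps fall around "somewhat unlikely" (.33) and "rather likely" (.70), the two single terms that the text places between 25% and 40% and between 60% and 75%.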

The present list was produced by reviewing previous studies that elicited numerical values for verbal expressions of probability (Budescu and Wallsten, 1985; Lichtenstein and Newman, 1967; Simpson, 1944; Shanteau, 1974; Wallsten, Budescu, Rapoport, Zwick, and Forsyth, 1986), in order to identify a set of words and phrases that (a) have interpretations that cover the entire probability range, in about evenly spaced steps, and (b) have relatively narrow interpretations, as indicated by small standard deviations, compared to other candidates with nearby means (see Table 1). To cover the ends, "absolutely impossible" and "absolutely certain" were chosen. "Almost certain" was used to cover the 95% range; however, it was subsequently learned that Kong, Barnett, Mosteller, and Youtz (1986) had found that people assign this phrase a mean value of .78 (median .90).

Insert Table 1 about here.

The list of verbal expressions of probability may optionally be presented in sequential order (ascending (Table 1) or descending) or in random order (e.g., Table 2). The subject's instructions are as follows:

In this study you will be asked to select verbal phrases that represent your estimates of the probability or likelihood that statements are true or that events have happened. Please look over the following list of phrases.

[The list of verbal expressions of probability was presented.]

[Random order conditions:] The verbal phrases in this list are arranged in random order.

[Ascending or descending order conditions:] The verbal phrases in this list are arranged in order. The top ones in the list express a very [high/low] degree of probability, and the bottom ones express a very [low/high] degree. These meanings were determined in surveys of a large number of people.


Table 1

Means and standard deviations of numerical interpretations of verbal expressions of probability measured in previous studies.

Verbal phrase                       Mean        Standard     Source      Value adopted
                                    (Median)    Deviation                for this study

Absolutely impossible               -           -            Author      .00
Rarely                              .05         .07 (a)      S 1944      .05
                                    .08         .06          B&W 1985
Very unlikely                       .09 (.10)   .07          L&N 1967    .10
Seldom                              .16         .09          B&W 1985    .15
                                    .10         .12 (a)      S 1944
                                    .16 (.15)   .08          L&N 1967
Not very probable                   .20 (.20)   .12          L&N 1967    .20
Fairly unlikely                     .25 (.25)   .11          L&N 1967    .25
Somewhat unlikely                   .31 (.33)   .12          L&N 1967    .33
                                    .27                      Sh 1974
Uncertain                           .41         .13          B&W 1985    .40
                                    .40 (.50)   .14          L&N 1967
Slightly less than half the time    .45 (.45)   .04          L&N 1967    .45
Toss-up                             .50 (.50)   .00          L&N 1967    .50
                                    .47         .11          B&W 1985
                                    .54                      Sh 1974
Slightly more than half the time    .55 (.55)   .06          L&N 1967    .55
Better than even                    .58 (.60)   .06          L&N 1967    .60
                                    .66                      Sh 1974
Rather likely                       .69 (.70)   .09          L&N 1967    .70
Good chance                         .74 (.75)   .12          L&N 1967    .75
Quite likely                        .79 (.80)   .10          L&N 1967    .80
Very probable                       .87 (.89)   .07          L&N 1967    .85
Highly probable                     .89 (.90)   .04          L&N 1967    .90
                                    .84                      Sh 1974
Almost certain                      -           -            Author      .95
Absolutely certain                  -           -            Author      1.00

(a) Interquartile range.

Note: Sources are: B&W = Budescu and Wallsten, 1985; L&N = Lichtenstein and Newman, 1967; Sh = Shanteau, 1974; S = Simpson, 1944; Author = author's judgment.


Table 2

Random Phrase List Order, "Random Order A".

Uncertain
Rather likely
Somewhat unlikely
Rarely
Slightly less than half the time
Good chance
Fairly unlikely
Absolutely impossible
Toss-up
Quite likely
Not very probable
Absolutely certain
Slightly more than half the time
Very probable
Seldom
Almost certain
Better than even
Highly probable
Very unlikely


Please use one of these phrases to answer every question in the problems that follow. It will help the people who will be reading your answers if you will write neatly and write the whole phrase. Do not leave any answers blank!

Please be careful to consider all the possible phrases and select the best one for each answer. To help you consider the phrases for each word problem, you should separate this page from the questionnaire booklet and set it beside the booklet for easy reference as you work on the problems.

At a different time from when the phrase lists are used in the problem solving or communication task (immediately after, in Hamm, 1988), the subjects are asked to refer to the lists and say "the numerical probability that most closely represents what each of these verbal phrases means."

The study's purpose was to investigate whether the following factors influence subjects' use of verbal expressions of probability that are presented in a list: (a) the context, i.e., a phrase's neighbors in the list; (b) the phrase's position in the first or second half of the list; and (c) differences in the vagueness of the phrases. In addition, the effects of appearing in a sequentially ordered versus random list will be investigated. The influence of these factors on (1) subjects' tendency to select a phrase, (2) subjects' assignment of numbers to phrases, and (3) the accuracy of subjects' reasoning (on word problems) using the phrases will be determined.

2. Method.

One hundred and forty-seven subjects from the Introductory Psychology subject pool participated. Each did 4 probabilistic inference word problems (Appendix 1; see Hamm, 1988, for details). Half responded with verbal expressions of probability (the others used numbers). All subsequently assigned numerical values to the phrases. Total response time on the questionnaire was recorded.

One subject was dropped from the analysis for using the wrong response mode. A number of individual responses were dropped because subjects did not follow directions (e.g., used a phrase that was not on the list, assigned a range of values to a phrase, or assigned the same value to every phrase).

The phrases were presented in one of four possible orders. Two of the lists were ordered sequentially, either ascending (Table 1) or descending. The other two lists were arranged in a random order (Table 2), which was produced by folding the ordered list (splitting the list at a phrase, reversing one half, and interleaving the two halves) repeatedly. In answering the problems, 25 subjects used the ascending phrase list, 14 the descending, 16 the random list in Table 2, and 16 the reversed random list. The numbers of subjects assigning values to phrases in the 4 list orders were 48, 27, 31, and 32, respectively. Those subjects who used the verbal response mode subsequently assigned values to phrases that were presented in the same order.
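The folding procedure is described only briefly, so the following Python sketch shows one possible reading of it. The split points, the number of folds, and the choice of which half is reversed are not reported in the text; the values used here are assumptions, and the sketch is not expected to reproduce Random Order A. The function names `fold` and `fold_repeatedly` are hypothetical.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel so that falsy items (e.g. 0 or "") are kept


def fold(items, split):
    """One fold: split the list, reverse the second part, interleave the parts."""
    first = list(items[:split])
    second = list(items[split:])[::-1]
    folded = []
    for a, b in zip_longest(first, second, fillvalue=_MISSING):
        if a is not _MISSING:
            folded.append(a)
        if b is not _MISSING:
            folded.append(b)
    return folded


def fold_repeatedly(items, splits):
    """Apply successive folds, one per split point in `splits`."""
    result = list(items)
    for split in splits:
        result = fold(result, split)
    return result


if __name__ == "__main__":
    positions = list(range(1, 20))  # ascending positions 1..19
    # The split points below are arbitrary choices for illustration only.
    print(fold_repeatedly(positions, [9, 6, 12]))
```

Whatever split points are used, each fold only rearranges the list, so every one of the 19 phrases still appears exactly once in the result.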

Insert Table 2 about here.

3. Results.

The presentation of the results will be organized around three topics: the effect of phrase list order on the selection of phrases as answers to the problems, its effect on the numerical values subjects assign to the phrases, and finally its effect on the accuracy of the subjects' answers to the problems.


3.1. Effect of phrase list order on problem answers.

Subjects selected phrases from a list to express their estimates of the probability of a hypothesis sixteen times: four times (after 0, 1, 2, and 3 pieces of key information had been provided; see Appendix 1) in each of 4 problems (concerning Cabs, Doctors, Insurance, and Twins; see Hamm, 1988).

3.1.1. Preference for phrases in particular ordinal positions.

To reveal preferences for phrases presented in particular positions in the phrase list, consider the answers after all 3 pieces of information were provided (Table 3). The phrases in the first and last positions were rarely used by either the 39 subjects presented with ascending or descending lists (where the extremes were "absolutely impossible" and "absolutely certain"), or the 32 subjects presented with random lists (whose extremes were "uncertain" and "very unlikely"). However, the phrases next to the extreme positions were chosen frequently in the Cab and Doctor problems. There is thus evidence that phrases that appear both early and late in the list are used. Further, in the Cab and Insurance problems the phrase in the middle position is used frequently. This may be due to its meaning ("tossup", in the middle of the ordered lists, expresses "I don't know"), rather than its location (10th in a list of 19 phrases).

Insert Table 3 about here.

3.1.2. Preference for phrases in the first or second half of the list.

Because the identity of the phrase occupying a particular position varies across phrase lists, we must consider the lists separately. The sequentially ordered and the random phrase lists were each presented in two orders that are reverses of one another. Comparison of the reversed lists can reveal the overall tendency to pick answers that are early or late, separate from the identities of the phrases. The average ordinal position of the phrases subjects selected from each list is given for all 16 problem answers in Table 4. If there were no effect of ordinal position, the unweighted mean ordinal position of the selected phrase for the ordered lists (or random lists) would be 10. (The unweighted mean is taken to control for different numbers of subjects using the ascending and descending lists.) Looking over all four answers for all problems, the mean ordinal position of the chosen phrases is 9.74 for the ordered lists, an average of one quarter position (out of 19) in front of the midpoint. For the random lists, the mean ordinal position is almost exactly the middle position, 10.

Insert Table 4 about here.

When there are 0, 1, or 2 pieces of information in the word problems, the answers frequently are strongly constrained (see Hamm, 1987), and so little effect of list reversal would be expected. Looking therefore at only the answers after all three key pieces of information had been presented, there is a slightly larger effect of position in the list. The mean ordinal position of the answers is 1/3 of a position in front of the midpoint for the ordered lists, and 4/5 of a position after the midpoint for the random lists. The small magnitude of this effect suggests that position in the list has little effect on the probability that a phrase will be used.
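The arithmetic behind these figures can be retraced with a short Python sketch. It takes the mean ordinal positions of the final answers (3 pieces of information) from Table 4, averages each list with its reverse without weighting by the number of subjects, and expresses the result relative to the midpoint position of 10. The variable and function names are illustrative and not taken from the report.

```python
# Mean ordinal positions of final answers (3 pieces of information), from Table 4.
FINAL_ANSWER_MEANS = {
    # problem: (ascending, descending, random A, random B)
    "cab": (15.04, 5.79, 11.81, 11.06),
    "doc": (15.00, 4.00, 9.60, 10.50),
    "ins": (10.68, 8.29, 12.13, 10.00),
    "twn": (9.88, 8.64, 12.19, 9.40),
}
MIDPOINT = 10  # middle position in a list of 19 phrases


def unweighted_mean(mean_a, mean_b):
    """Average a list with its reverse, ignoring the number of subjects in each."""
    return (mean_a + mean_b) / 2


def overall_offset(pairs):
    """Mean of the per-problem unweighted means, relative to the midpoint."""
    per_problem = [unweighted_mean(a, b) for a, b in pairs]
    return sum(per_problem) / len(per_problem) - MIDPOINT


ordered_offset = overall_offset([(v[0], v[1]) for v in FINAL_ANSWER_MEANS.values()])
random_offset = overall_offset([(v[2], v[3]) for v in FINAL_ANSWER_MEANS.values()])

if __name__ == "__main__":
    print(f"ordered lists: {ordered_offset:+.2f} positions from the midpoint")
    print(f"random lists:  {random_offset:+.2f} positions from the midpoint")
```

A negative offset means the chosen phrases sit in front of the midpoint on average. Because position k in one list is position 20 - k in its reverse, a preference driven purely by phrase meaning cancels in the unweighted mean, leaving only the effect of position.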

3.1.3. Comparison of ordinal position effect on phrase selection in random and ordered lists.

Though the overall effect of ordinal position is small, there may be differences between the ordered and random lists in the magnitude of the effect, which would have implications for the design of the optimal method for selecting verbal expressions of probability. In order to measure the effect of ordinal position on the tendency of subjects to select individual phrases, so that the ordered and random lists may be compared, an index was computed for each phrase, measuring its tendency to be used more when it appears in the first half than the second half of the list. First, a measure $D_{i,a,b,p}$ is computed separately for each phrase in each problem, separately for the


Table 3

Number of subjects who chose the phrase occupying each ordinal position in the list for their final answer on each problem.

Position    Cab   Doctor   Insurance   Twins   Total

 1            0        0           0       4       4
 2            6       10           2       2      20
 3            3        5           4       4      16
 4            4        5           3       1      13
 5            2        4           4       6      16
 6            3        3           5       3      14
 7            2        1           4       4      11
 8            1        1           5       3      10
 9            1        1           3       3       8
10            6        2          10       4      22
11            1        2           3       6      12
12            0        0           4       2       6
13            6        4           0       9      19
14            9        4           6       6      25
15            6        5           3       3      17
16            5        5           4       0      14
17            6        4           8       6      24
18           10       13           3       1      27
19            0        1           0       3       4

Total        71       70          71      70     282


Table 4

Mean ordinal position of phrase selected (from list of 19) for each phrase list.

                        Ordered Lists               Random Lists
Problem   Amt of     Asc      Desc     Unwtd     Ran A    Ran B    Unwtd
          info                         Mean                        Mean

N                    25       14                 16       16

cab       0           9.38     9.64     9.51      7.94    11.94     9.94
cab       1          10.92     9.71    10.32     10.25    10.50    10.38
cab       2          14.64     5.07     9.86     10.81     6.44     8.63
cab       3          15.04     5.79    10.41     11.81    11.06    11.44

doc       0           9.20     9.57     9.39      7.63    12.19     9.91
doc       1           5.36    15.29    10.32     12.88     8.27    10.57
doc       2          17.44     2.21     9.83     13.87     5.38     9.62
doc       3          15.00     4.00     9.50      9.60    10.50    10.05

ins       0           7.80    12.93    10.36      6.63    11.81     9.22
ins       1           8.16    10.93     9.54      7.75    11.38     9.56
ins       2           9.60     9.36     9.48      8.75    11.38    10.06
ins       3          10.68     8.29     9.48     12.13    10.00    11.06

twn       0           9.32    10.00     9.66      7.31    11.06     9.19
twn       1           9.16     9.64     9.40      6.38    13.67    10.02
twn       2          11.92     7.21     9.57      9.06     9.75     9.41
twn       3           9.88     8.64     9.26     12.19     9.40    10.79

Mean of all amounts of information:     9.74                        9.99
Mean of 3-inf problems:                 9.66                       10.84


ordered and random lists:

$$D_{i,a,b,p} = 100 \left( H_{i,a} \frac{C_{i,a,p}}{N_a} + H_{i,b} \frac{C_{i,b,p}}{N_b} \right)$$

where i indexes the 19 phrases; a is a particular phrase order (ascending or random A) and b is its reverse (descending or random B); p signifies the particular problem; $H_{i,j}$ is 1 if phrase i appeared in the first half of list j, 0 if in the middle, and -1 if in the second half of the list; $C_{i,j,p}$ is the count of subjects using list j on problem p who chose phrase i as their answer; and $N_j$ is the total number of subjects using list j. The $D_{i,a,b,p}$ indices for each phrase, each problem, are presented in Table 5. In the sequentially ordered lists, there are approximately the same number of phrases that have negative and positive indices. However, in the random lists, there are more phrases with negative indices. This suggests that in random lists subjects tend to select phrases that are in the second half of the list.
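The per-phrase index defined above can be sketched in Python as follows. The function names `half_indicator` and `d_index` are hypothetical, the sketch assumes that list b is the exact reverse of list a, and the counts in the usage example are invented for illustration (only the group sizes of 25 and 14 come from the study).

```python
def half_indicator(position, list_length=19):
    """+1 for the first half of the list, 0 for the middle, -1 for the second half."""
    middle = (list_length + 1) / 2
    if position < middle:
        return 1
    if position > middle:
        return -1
    return 0


def d_index(pos_a, count_a, n_a, count_b, n_b, list_length=19):
    """Percent choosing the phrase where it is in the first half of a list,
    minus percent choosing it where it is in the second half."""
    pos_b = list_length + 1 - pos_a  # assumption: list b is the reverse of list a
    h_a = half_indicator(pos_a, list_length)
    h_b = half_indicator(pos_b, list_length)
    return 100 * (h_a * count_a / n_a + h_b * count_b / n_b)


if __name__ == "__main__":
    # Hypothetical counts: a phrase in position 3 of list a (position 17 of list b),
    # chosen by 3 of 25 subjects using list a and by 1 of 14 subjects using list b.
    print(d_index(pos_a=3, count_a=3, n_a=25, count_b=1, n_b=14))
```

A positive value indicates that the phrase was chosen more often when it appeared early in the list. The middle phrase always yields 0 because its indicator is 0 in both lists.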

Insert Table 5 about here.

The mean of the index, across the 18 phrases that are not in the middle of the list (the $D_{i,a,b,p}$ for the middle phrase is 0), is given by:

$$D_{a,b,p} = \frac{1}{18} \sum_{i=1}^{19} D_{i,a,b,p}$$

This mean is produced separately for the ordered and random lists, for each problem. In addition, an overall index is produced for the ordered and the random lists, by averaging over the 4 problems:

$$D_{a,b} = \frac{1}{4} \sum_{p=1}^{4} D_{a,b,p}$$

The two elements of the $D_{i,a,b,p}$ index are the percents of subjects who chose the phrase when it appeared in two lists that have reversed orders. Every subject chose one phrase on each problem, and there were 19 phrases, so the percent of subjects expected to choose each phrase is 5.263%. The index subtracts the percent choosing the phrase when it is in the second half of the list from the percent choosing the phrase when it is in the first half of the list. The expected difference is 0% if ordinal position has no effect. A positive index would signify that subjects chose the phrase more often when it appeared in the first half of the list. Table 6 shows the mean $D_{a,b,p}$ and $D_{a,b}$ indices for the Ordered (a = ascending, b = descending) and Random (a = random order A, b = random order B) phrase list orders, for the subjects' final answers on each problem. These represent the average difference in the percent of subjects choosing a phrase when it is in the first compared to the second half of the list. Dividing the mean index by 5.263% expresses the ordinal-position effect as a proportion of the percent of subjects expected to use the average phrase (Columns 2 and 6 of Table 6). Table 6 also shows the standard deviation of the index across the 18 phrases, and the t-test for whether the mean is different from 0%.¹

Insert Table 6 about here.

The $D_{a,b,p}$ index is positive (indicating a tendency to choose phrases early in the list) for 3 of the 4 problems when the lists were ordered, but negative (indicating preference for phrases in the second half of the list) for all problems when the lists were random. In the ordered lists the $D_{a,b}$ index (averaged over problems) indicates a .064% preference for phrases in the first half of the list, which is a .012 proportion of the expected 5.26%. Thus, a phrase from the first half of the list would be used 5.327% of the time on average, compared with 5.199% for a phrase from the second half of the list. In the random lists the $D_{a,b}$ index is -.844% (a .16 proportion of the expected 5.26%), reflecting a preference for the phrases in the second half of the list. A phrase from the second half of a random list would be used 5.685% of the time, while a phrase from the first half would be used 4.841% of the time. These small effects are not significant for the individual problems, although when averaged across problems the tendency of subjects faced with randomly ordered lists to select phrases in the second half of the list is statistically significant (t = -2.861, p = .02).

The $D_{a,b,p}$ index allows a statistical test of whether there is a difference between the ordered and random lists in the direction and extent of the ordinal position effect. With the random lists, the phrase chosen for the final answers in all problems tended to come from the second half of the list, but with the ordered lists there was a slight preference for the first half. The difference in ordinal position effect between the ordered and random lists is shown in Table 7. The effect is very small: the mean difference is .908% (a .173 proportion of the 5.263% of the subjects expected to select a given phrase), although the difference is statistically significant for the overall indices (and marginally so for the Insurance problem).
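The conversions between the overall indices and the proportions quoted above can be retraced with a short Python sketch. The two index values are those reported in the text; the symmetric split of the index around the expected share in `half_shares` is an assumption of this sketch, applied here only to the random lists, where it reproduces the figures given above. All names are illustrative.

```python
N_PHRASES = 19
EXPECTED_SHARE = 100 / N_PHRASES  # percent of subjects expected per phrase, about 5.263

D_ORDERED = 0.064   # overall index for the ordered lists, as reported in the text
D_RANDOM = -0.844   # overall index for the random lists, as reported in the text


def as_proportion(index):
    """Express an index as a proportion of the expected share per phrase."""
    return abs(index) / EXPECTED_SHARE


def half_shares(index):
    """Split the index symmetrically around the expected share (assumption)."""
    first_half = EXPECTED_SHARE + index / 2
    second_half = EXPECTED_SHARE - index / 2
    return first_half, second_half


# Difference in ordinal position effect between ordered and random lists.
difference = D_ORDERED - D_RANDOM

if __name__ == "__main__":
    first, second = half_shares(D_RANDOM)
    print(f"random lists: first half {first:.3f}%, second half {second:.3f}%")
    print(f"random index as proportion: {as_proportion(D_RANDOM):.2f}")
    print(f"ordered index as proportion: {as_proportion(D_ORDERED):.3f}")
    print(f"difference: {difference:.3f}% ({as_proportion(difference):.3f} of expected)")
```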

Insert Table 7 about here.

3.1.4. Effect of list reversal on the selection of phrases with broad and narrow membership functions.

Verbal expressions of probability differ in the range of numerical probabilities to which they refer. Some phrases, such as "absolutely certain" and "tossup," would be expected to refer to narrow ranges of probabilities (see also Kong, Barnett, Mosteller, and Youtz, 1986), while other phrases, particularly those with meanings near 25% or 75%, would refer to broader ranges. The tendency to use a phrase with a broad "membership function" (Wallsten, Budescu, Rapoport, Zwick, and Forsyth, 1986) may be more strongly affected by its ordinal position in a list than the tendency to use a phrase with a narrow range. Broad phrases may be strongly affected even though the ordinal position effects are very small when all phrases are considered, as in the above analysis. In order to measure the breadth of the membership functions of the 19 verbal expressions of probability used in this study, an auxiliary study was carried out.

Method. Sixty-five subjects, primarily from the Introductory Psychology subject pool, filled out a questionnaire (Appendix 2) which asked them to state the lower and upper bounds of the numerical probabilities that each phrase refers to. Half of the subjects named the lower bound for each phrase before the upper bound, and half did the reverse. Crossed with this factor, half of the subjects named the phrases in Random Order A (Table 2), and half in its reverse, Random Order B.

Results. The mean lower and upper limits, across all conditions, are presented in Columns 1 and 2 of Table 8. The midpoint between these bounds is an estimate of the meaning the individual assigns to the verbal expression of probability. The mean and median midpoints of these ranges and their standard deviation (Columns 3, 4, and 5) can be compared with the values in Table 1. The 6th column shows the standard deviations of the differences between the upper and lower bounds, which reveal that there is an exceptionally high variation across subjects (s.d. = .336) in the range of meaning attributed to "uncertain".

Insert Table 8 about here.

The difference between a phrase's upper and lower bounds is a measure of the range of meaning the individual assigns to the phrase, and can be used as an estimate of the breadth of the


Table 5
The ordinal position effect indices for each phrase: Di,a,b,p for each problem and Di,a,b for all problems.

                                                     Di,a,b,p                                  Di,a,b
                                    Cab          Doctor       Insurance       Twins             All
Phrase                           Ord    Ran    Ord    Ran    Ord    Ran    Ord    Ran    Ord    Ran

Absolutely imp.                  0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0
Rarely                           0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0
Very unlikely                    0.0    0.0    0.0    0.0    0.5    0.0    0.0    0.3    0.1    0.1
Seldom                          -3.6    0.0    0.0    0.0    0.0    0.0    2.1    0.0   -0.4    0.0
Not very prob.                   2.1    0.0    0.0    0.0    0.0    0.0    8.5   -3.2    2.7   -0.8
Fairly unlikely                  0.0    0.1    0.0    0.0    0.0    0.0   -1.6   -3.0   -0.4   -0.7
Somewhat unlik.                 -3.7   -6.3    0.0    6.3    4.2   -6.3  -10.6   -3.1   -2.5   -2.3
Uncertain                        0.0    0.0    0.0    0.0    0.5    0.0    2.1    3.3    0.6    0.8
Slightly less than 1/2 time      0.0    0.0    0.0   -3.1    0.0    0.0   -1.6   -6.3   -0.4   -2.3
Toss-up                            -    0.0      -    3.1      -    0.0      -    0.0      -    0.8
Slightly more than 1/2 time      1.6   -3.1   -2.1    0.0    0.0    0.0   -4.2    0.0   -1.2   -0.8
Better than even                 3.7   -3.0    0.0    3.1   -2.1   -3.0    3.2   -5.9    1.2   -2.2
Rather likely                   -2.6   -0.2    3.7   -6.6    3.7    3.1   -2.6    3.1    0.6   -0.1
Good chance                     -6.3   -6.2   -2.1    0.0    5.3   -3.1   -0.5   -3.1   -0.9   -3.1
Quite likely                    -8.8      -    3.2      -   -4.2      -    5.3      -   -1.1      -
Very probable                    0.9   -3.1    5.1   -6.3    1.5    3.1    0.0   -3.1    1.9   -2.3
Highly prob.                     4.5   -6.0   -0.6    6.3   -2.1   -2.9    3.6    0.0    1.3   -0.7
Almost certain                   1.1    3.2   -2.4   -2.9    0.0   -6.1    1.8    0.0    0.1   -1.4
Absolutely cert.                 0.0    0.0   -2.1    0.0    0.0    0.0    0.0    0.0   -0.5    0.0

# phrases with positive index      6      2      3      4      6      2      7      3      8      3
# phrases with zero index          7      9     10     10      9     11      5      8      2      4
# phrases with negative index      5      7      5      4      3      5      6      7      8     11

Note: "-" indicates that the value was not calculated for a phrase because it appeared in the central position in the list.


Table 6
Mean ordinal position effect indices, for the 18 phrases that are not in the middle position.

                        Ordered Lists                           Random Lists
                 (a=Ascending, b=Descending)             (a=Random A, b=Random B)
                Mean   Proportion   St Dev       t      Mean   Proportion   St Dev       t

Index Da,b,p
  Cab          -.608      -.116      3.304    -.737   -1.363      -.259      2.637   -1.860
  Doctor        .153       .029      2.008     .305    -.006      -.001      3.413   -0.007
  Insurance     .404       .076      2.238     .723    -.842      -.160      2.545   -1.326
  Twin          .306       .058      4.035     .304   -1.167      -.222      2.644   -1.769

Index Da,b      .064       .012      1.213     .211    -.844      -.160      1.182   -2.861*

* p = .02, 2-tailed


Table 7
Difference between the ordinal position effect index scores of the ordered and random lists.

                     Difference   Proportion   t value   2-tail prob

Da,b,p difference
  Cab                   .755         .143        1.55       .141
  Doctor                .159         .030        0.14       .889
  Insurance            1.246         .237        1.92       .073
  Twin                 1.473         .280        1.08       .296

Da,b difference
  All problems          .908         .173        2.86       .011*

Note: Because a different phrase was dropped (for being in the middle location) from the ordered list than from the random list, N = 19 - 2 = 17 and df = 16.


Table 8
The mean lower limit, upper limit, and midpoint between the limits, for each phrase.

                                    Lower   Upper       Midpoint (average)       S.D. of
Phrase                              Limit   Limit    Mean   Median   St Dev       Range

Absolutely imposs.                   .007    .041    .024    .000     .080        .133
Rarely                               .064    .183    .117    .100     .085        .079
Very unlikely                        .046    .145    .096    .075     .071        .061
Seldom                               .117    .243    .180    .150     .126        .090
Not very probable                    .113    .235    .174    .150     .116        .078
Fairly unlikely                      .176    .287    .231    .225     .111        .061
Somewhat unlikely                    .217    .349    .283    .250     .129        .065
Uncertain                            .294    .534    .414    .500     .153        .336
Slightly less than half the time     .390    .470    .430    .440     .050        .04
Toss-up                              .484    .526    .505    .500     .045        .087
Slightly more than half the time     .524    .604    .564    .555     .065        .053
Better than even                     .540    .703    .621    .600     .088        .123
Rather likely                        .585    .735    .660    .700     .222        .087
Good chance                          .652    .799    .726    .750     .137        .077
Quite likely                         .686    .824    .755    .800     .146        .087
Very probable                        .733    .872    .803    .850     .122        .093
Highly probable                      .757    .899    .828    .850     .127        .084
Almost certain                       .840    .950    .895    .925     .088        .090
Absolutely certain                   .928    .980    .954    .100     .105        .142


phrase's membership function. The mean and median range for each phrase are presented in Table 9. "Absolutely impossible" (.034), "tossup" (.042), and "absolutely certain" (.052) have the narrowest ranges, and "uncertain" (.240) and "better than even" (.163) have the widest ranges. The median range measure does not discriminate well among the phrases, for its value for many of the phrases was .10. For comparison with previous work, Column 3 shows the difference between the median upper bound and the median lower bound for the three phrases studied by Wallsten, Budescu, Rapoport, Zwick, and Forsyth (1986) that were used in the present study. These ranges are generally larger than the individual ranges measured in our study (Columns 1 and 2), which may reflect different elicitation procedures. The standard deviation of the values assigned to a verbal expression of probability can be considered an alternative measure of the breadth of the phrase's membership function, although it is confounded with individual differences in the meaning of the phrase. Column 4 of Table 9 shows the mean of the standard deviations of the values given to phrases when presented in the four list orders in the main study. Column 5 shows the standard deviation of the midpoints of the ranges, from the auxiliary study.
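As a worked check on the range measure, the mean range and mean midpoint quoted for a phrase can be recovered from its mean lower and upper limits in Table 8, because the mean of a difference equals the difference of the means. The following is a minimal Python sketch; the dictionary and function names are illustrative and are not part of the original study.

```python
# Mean lower and upper limits for three phrases, copied from Table 8.
mean_bounds = {
    "Absolutely impossible": (0.007, 0.041),
    "Toss-up": (0.484, 0.526),
    "Uncertain": (0.294, 0.534),
}


def mean_range(lower: float, upper: float) -> float:
    """Breadth estimate: upper bound minus lower bound."""
    return round(upper - lower, 3)


def mean_midpoint(lower: float, upper: float) -> float:
    """Meaning estimate: average of the two bounds."""
    return round((lower + upper) / 2, 3)


breadth = {p: mean_range(lo, hi) for p, (lo, hi) in mean_bounds.items()}
meaning = {p: mean_midpoint(lo, hi) for p, (lo, hi) in mean_bounds.items()}

for phrase in mean_bounds:
    print(phrase, breadth[phrase], meaning[phrase])
```

The median range cannot be recovered this way, since medians of differences are not differences of medians; that measure requires the per-subject bounds.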

Insert Table 9 about here.

The intercorrelations among these five measures of breadth of membership function are all fairly high, ranging from .55 to .86 (Table 10). This indicates that when a direct measure of the breadth of membership function is lacking, the standard deviation might serve as a useful proxy.
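To illustrate how one of these intercorrelations is obtained, the sketch below computes a Pearson correlation over the 19 phrases between the Mean Range and SD of Midpoint columns of Table 9. It is a minimal Python example that assumes an ordinary product-moment correlation was used; the function name `pearson_r` is illustrative.

```python
import math

# Mean Range and SD of midpoint for the 19 phrases, in the order of Table 9.
mean_range = [0.034, 0.106, 0.100, 0.126, 0.122, 0.112, 0.132, 0.240, 0.080, 0.042,
              0.080, 0.163, 0.150, 0.147, 0.137, 0.139, 0.142, 0.109, 0.052]
sd_midpoint = [0.080, 0.085, 0.071, 0.126, 0.116, 0.111, 0.129, 0.153, 0.050, 0.045,
               0.065, 0.088, 0.222, 0.137, 0.146, 0.122, 0.127, 0.088, 0.105]


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(mean_range, sd_midpoint)
print(round(r, 2))
```

The result should come out close to the value of .65 shown for this pair of measures in Table 10.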

Insert Table 10 about here.

The question whether subjects prefer to use phrases with broader meanings is addressed in Table 11, which shows the correlations between the indices of breadth of membership function and measures of the number of subjects who used each phrase for each problem, separately for the ordered and random lists. The relations are generally positive, especially for the Cab and Insurance problems. While this suggests people prefer to use phrases with broad, even vague, meanings, it may be due to preferences to answer these problems with particular degrees of probability, e.g., answers between .10 and .40 or between .60 and .90. Further study is needed to clarify this issue.

Insert Table 11 about here.

To test whether the range of a phrase's meaning influences the impact of ordinal position on subjects' tendency to select it from the list when answering a word problem, Table 12 shows correlations between measures of the breadth of membership function (from Table 9) and the Di,a,b,p and Di,a,b measures of the effect of ordinal position on the probability of selecting phrases (from Table 5). The Di,a,b,p measures are positive for a phrase if it is more likely to be used when in the first rather than the second half of a list. Therefore, a positive correlation in Table 12 means that the broader the membership function of the phrase, the more likely it is that the phrase will be used more when it is in the first half of the list, or equivalently, the narrower the phrase's meaning the more likely that it will be used more when in the second half of the list. There were no significant effects for the mean range measure, which is our best measure of breadth of membership function. The median range measure correlated negatively with the Di,a,b index for the random lists, for all problems, suggesting that subjects' tendency to select phrases from the second half of random lists (noted above) is stronger for phrases with broad than with narrow ranges of meaning. (Note that 14 of the phrases are defined as "broad", 12 of them with median ranges of .10.) The SD of range midpoints measure showed a number of significant positive and negative correlations which are hard to interpret because the measure possibly confounds the breadth of the phrases' interpretations with individual differences in their interpretations.


Insert Table 12 about here.

In conclusion, even considering variations in the breadth of the phrases' meanings, there is only very weak evidence that there is any effect of ordinal position in the phrase list on the tendency to select particular phrases as the answers to word problems.

3.1.5. Difficulty finding the desired phrase in ordered and random lists.

A possible advantage of an ordered as opposed to a random phrase list is that subjects can find the verbal expression they want more easily. There are a number of possible reasons for this advantage. Subjects who know the phrase they want may be better able to verify its presence in a list that is structured in an ascending or descending order. Subjects who know the probability they want to express (either as a number, a range of numbers, a verbal phrase not in the list, or an idea that is not modally specific) may be able to find an appropriate expression more easily, presumably by evaluating the available phrases, when those phrases are ordered. Another possibility is that people may not know what degree of probability they want to express until they have considered candidate phrases. If so, it may be easier to check whether the meanings of the phrases apply to the situation when using an ordered list, in which phrases' meanings can be quickly understood because they are implied by the meanings of their neighbors in the list.

Any of these ordered list advantages might result in faster response time. The time to complete the whole questionnaire includes time reading and responding to all four problems, as well as time assigning values to all 19 phrases. Analysis of response time as a function of whether the list was ordered or random and of presentation mode shows that the subjects took only 10 seconds longer on the average (out of 18 minutes) on the random lists, which is not significant in a 2 (list order) X 2 (presentation mode) ANOVA. Therefore the admittedly rough measure of total questionnaire response time gives no indication that responding using random phrase lists is more difficult than responding using ordered lists.

A second measure of whether the ordered phrase lists are easier to use than the random lists is the variability of the meanings of the phrases subjects select as answers for the word problems. If we assume that subjects know the probability they want to express and have more trouble finding a verbal expression that fits it well when they are searching a random list, then we would expect that the numerical values of the phrases selected will be more variable with the random lists. If we assume that at the outset subjects don't know the probability that they want to express, and discover it by looking at phrases and seeing which one "seems right", then we would expect that the context variability in the random lists will cause a wider variation in the subjects' interpretations of the phrases when deciding which one to select. Either way, we expect that the random phrase lists will produce higher variability in the meaning of the answers than the ordered lists.

To measure variability in the meanings of the phrases subjects selected, it is necessary to use the a priori values (Column 4 of Table 1). (Use of the subjects' own assigned values would confound list differences in variation in meaning of the selected phrase with list differences in variation of the values subjects subsequently assigned to the phrase.) Table 13 shows the means and standard deviations of the a priori values of the selected phrases. While the phrases chosen from the random lists had numerical interpretations with higher average standard deviations (.195) than those chosen from the ordered lists (.181), and this was true for 11 of the 16 subproblems, the difference is not statistically significant (Chi-square = 1.25). Therefore the random lists have only a slight tendency to produce answers with higher variability.
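The two summary figures in this comparison (the mean standard deviations and the count of subproblems in which the random list was more variable) can be recomputed from the SD columns of Table 13. The following is a minimal Python sketch; it does not attempt to reproduce the chi-square statistic, whose exact computation the text does not specify, and the variable names are illustrative.

```python
# Standard deviations of the a priori values of the selected phrases, from Table 13,
# in row order: cab 0-3, doc 0-3, ins 0-3, twn 0-3.
ordered_sd = [0.125, 0.286, 0.106, 0.195,
              0.112, 0.159, 0.089, 0.162,
              0.187, 0.251, 0.242, 0.258,
              0.084, 0.232, 0.175, 0.229]
random_sd = [0.154, 0.276, 0.144, 0.196,
             0.161, 0.203, 0.074, 0.201,
             0.201, 0.263, 0.278, 0.239,
             0.112, 0.209, 0.174, 0.240]

# Average variability under each list structure.
mean_ordered = round(sum(ordered_sd) / len(ordered_sd), 3)
mean_random = round(sum(random_sd) / len(random_sd), 3)

# Number of subproblems in which the random list produced the larger SD.
n_random_larger = sum(r > o for o, r in zip(ordered_sd, random_sd))

print(mean_ordered, mean_random, n_random_larger, "of", len(ordered_sd))
```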


Insert Table 13 about here.

If it is more difficult to make fast, accurate use of a random list of verbal expressions of probability than an ordered list, we have not been able to measure it.


Table 9
Measures of breadth of membership function.

                                     Measures derived from            Measures derived from
                                     subjects' estimates of           standard deviations of
                                     upper and lower bounds           estimates of meanings

                                     Mean      Median    U-L          SD of       SD of U-L
Phrase                               Range a   Range a   diff. b      meaning c   midpoint d

Absolutely impossible                 .034      .00                     .074        .080
Rarely                                .106      .10                     .078        .085
Very unlikely                         .100      .10                     .132        .071
Seldom                                .126      .10                     .089        .126
Not very probable                     .122      .10                     .090        .116
Fairly unlikely                       .112      .10                     .083        .111
Somewhat unlikely                     .132      .10                     .092        .129
Uncertain                             .240      .10                     .122        .153
Slightly less than half the time      .080      .07                     .046        .050
Toss-up                               .042      .00       .13           .021        .045
Slightly more than half the time      .080      .09                     .061        .065
Better than even                      .163      .10                     .074        .088
Rather likely                         .150      .15                     .121        .222
Good chance                           .147      .15       .46           .106        .137
Quite likely                          .137      .10                     .087        .146
Very probable                         .139      .10                     .084        .122
Highly probable                       .142      .10                     .071        .127
Almost certain                        .109      .10       .11           .087        .088
Absolutely certain                    .052      .00                     .008        .105

a Difference between upper and lower bounds, auxiliary study, N = 65.

b Difference between median upper bound and median lower bound, from Figure 4 of Wallsten, Budescu, Rapoport, Zwick, and Forsyth (1986).

c Mean of standard deviations of values assigned to the phrases, from four lists with different phrase orders, main study, N = 138.

d Standard deviation of midpoint (average of upper and lower bounds), auxiliary study.


Table 10
Intercorrelations among five measures of breadth of membership function.

                    Mean      Median    U-L       SD of
                    Range     Range     Diff.     Value

Median Range        .73**
U-L Difference      .74       .72
SD of Value         .67**     .76**     .63
SD of Midpoint      .65**     .59**     .86       .55**

Note: Indices are defined in notes to Table 9. N = 19 for every correlation except those involving the U-L Difference index, for which N = 3.

** p < .01.


Table 11
Correlations between indices of breadth of membership function and measures of the number of subjects who selected each phrase for each problem, separately for ordered and random phrase lists.

                        Measures of Breadth of Membership Function

Problem                 Mean      Median    SD of     SD of
  List structure        Range     Range     Values    Midpoints

Cab
  ordered               .26       .40*      .19       .37+
  random                .35+      .51*      .27       .13
  total                 .31+      .47*      .23       .17

Doctor
  ordered               .07       .12      -.06       .18
  random                .17       .29       .14       .38+
  total                 .14       .24       .04       .32+

Insurance
  ordered               .48*      .28       .46*      .26
  random                .16       .12      -.05       .15
  total                 .40*      .25       .27       .26

Twins
  ordered               .19       .26       .05       .22
  random                .06       .02       .12      -.28
  total                 .15       .17       .11      -.06

All problems
  ordered               .35+      .43*      .22       .41*
  random                .29       .38+      .22       .29
  total                 .35+      .43*      .24       .38+

+ p < .10; * p < .05; ** p < .01.


Table 12
Correlations between measures of breadth of membership function and indices of the effect of ordinal position on phrase selection, for ordered and random phrase lists.

                        Measures of Breadth of Membership Function

                        Mean      Median    SD of     SD of
                        Range     Range     Values    Midpoints

Index Da,b,p

Cab
  Ordered              -.10      -.25      -.26      -.43*
  Random               -.27      -.31      -.07      -.21

Doctor
  Ordered               .29       .25       .32       .49*
  Random               -.02      -.22      -.21      -.25

Insurance
  Ordered               .10       .35+      .33       .33+
  Random               -.10      -.11      -.01       .17

Twins
  Ordered               .19      -.01       .03       .03
  Random                .11      -.05       .25       .38+

Index Da,b

All problems
  Ordered               .25       .09       .14       .09
  Random               -.16      -.41*     -.06       .00

Note: For ordered lists, a = ascending and b = descending; for random lists, a = random order A and b = random order B.

+ p < .10; * p < .05.


Table 13
Means and standard deviations of the a priori values of the phrases subjects selected to answer the problems.

                Ordered Lists       Random Lists       List with
Problem         Mean     SD         Mean     SD        bigger SD

cab  0          .474    .125        .489    .154           r
     1          .542    .286        .606    .276           o
     2          .781    .106        .801    .144           r
     3          .776    .195        .738    .196           r

doc  0          .482    .112        .516    .161           r
     1          .216    .159        .247    .203           r
     2          .927    .089        .923    .074           o
     3          .813    .162        .737    .201           r

ins  0          .356    .187        .368    .201           r
     1          .413    .251        .437    .263           r
     2          .500    .242        .578    .278           r
     3          .563    .258        .678    .239           o

twn  0          .477    .084        .525    .112           r
     1          .478    .232        .437    .209           o
     2          .632    .175        .566    .174           o
     3          .525    .229        .434    .240           r

Mean SD:                .181                .195

Note: N = 39 for the ordered lists and 32 for the random list.


Table 14
Mean numerical values assigned to each phrase, each list order.

                                                Ordered Lists                       Random Lists
                                    Ascending   Descending     Both       List A      List B       Both        Total
N:                                      48          28          76          32          32          64          140
Phrase                              Mn    SD    Mn    SD    Mn    SD    Mn    SD    Mn    SD    Mn    SD    Mn    SD

Absolutely imp.                   .021  .130  .000  .002  .011  .066  .002  .005  .035  .160  .019  .083  .015  .074
Rarely                            .104  .096  .084  .042  .094  .069   .1:  .077  .161  .098  .144  .088  .119  .078
Very unlikely                     .128  .053  .135  .041  .132  .047  .162  .175  .209  .260  .186  .218  .159  .132
Seldom                            .180  .057  .171  .049  .176  .053  .202  .127  .229  .121  .216  .124  .196  .089
Not very probable                 .235  .094  .225  .048  .230  .071  .197  .095  .199  .124  .198  .110  .214  .090
Fairly unlikely                   .283  .073  .281  .039  .282  .056  .255  .105  .266  .113  .261  .109  .271  .083
Somewhat unlikely                 .334  .075  .332  .052  .333  .064  .326  .147  .278  .092  .302  .120  .318  .092
Uncertain                         .404  .073  .407  .055  .406  .064  .405  .193  .399  .167  .402  .180  .404  .122
Slightly less than half the time  .448  .045  .445  .036  .447  .041  .435  .030  .428  .072  .432  .051  .439  .046
Toss-up                           .496  .040  .493  .038  .495  .039  .500  .000  .501  .004  .501  .002  .498  .021
Slightly more than half the time  .546  .053  .561  .053  .554  .053  .580  .046  .552  .090  .566  .068  .560  .061
Better than even                  .598  .069  .608  .063  .603  .066  .609  .069  .627  .095  .618  .082  .611  .074
Rather likely                     .671  .080  .680  .070  .676  .075  .738  .121  .619  .211  .679  .166  .677  .121
Good chance                       .719  .081  .735  .071  .727  .076  .775  .112  .663  .158  .719  .135  .723  .106
Quite likely                      .776  .083  .783  .063  .780  .073  .796  .084  .720  .117  .758  .101  .769  .087
Very probable                     .827  .084  .844  .062  .836  .073  .865  .075  .803  .115  .834  .095  .835  .084
Highly probable                   .873  .081  .890  .059  .882  .070  .888  .063  .867  .082  .878  .073  .880  .071
Almost certain                    .930  .074  .931  .058  .931  .066  .922  .067  .870  .149  .896  .108  .913  .087
Absolutely cert.                  .999  .007 1.000  .000 1.000  .004  .998  .010  .996  .013  .997  .012  .998  .008

Mean                              .504  .071  .506  .047  .505  .059  .515  .084  .496  .118  .505  .101  .505  .080
Standard dev.                     .301  .026  .309  .019  .305  .018  .310  .054  .281  .060  .295  .052  .299  .031


3.2. Effect of list order on values assigned to the verbal expressions of probability.

The second procedure of the proposed method, which is the subject's individual assignment of numerical values to verbal expressions of probability, is important because it potentially increases the accuracy of the proposed method by allowing adjustments for (a) individual differences in the interpretation of phrases, and (b) phrase interpretation differences due to context. In order to evaluate the reliability of the values elicited in this procedure, it is necessary to determine whether the order in which phrases are presented affects the numerical values that subjects assign to the phrases. Table 14 shows the mean value subjects assigned to the phrases when they were presented in each of the 4 orders. It also shows aggregate means for ordered lists, random lists, and all lists. These values may be compared with those in Table 1 and Table 8.

Insert Table 14 about here.

3.2.1. Effect of list structure on accuracy and variability of assigned values.

Accuracy of assigned values may be measured by subtracting the values the researcher assigned to the phrases a priori (based on previous studies; see Table 1) from the values the subjects assigned to them. Table 15 shows the mean accuracy (deviation) scores for the ordered and random phrase lists, and their variability (standard deviations). In both lists the deviations tend to be positive in the first half of the list, and negative in the second half. That is, subjects' numbers were too high when the a priori value was low, and too low when the a priori value was high. Thus these subjects have shifted toward .5 in 1987, in comparison with the interpretations of these phrases found in previous studies (primarily Lichtenstein and Newman, 1967).

Insert Table 15 about here.

The hypothesis that an ordered presentation of the verbal expressions of probability allows someone to more readily recognize their meanings predicts that subjects will assign more accurate numerical values (closer to the a priori values) when the lists are presented in ascending or descending order than in random order. For 13 of the 19 phrases, the absolute value of the mean deviation was larger when the lists were presented randomly. Four of these comparisons (for "rarely", "very unlikely", "seldom", and "almost certain") were statistically significant (in one-way ANOVAs), and two more ("somewhat unlikely" and "slightly less than half the time") were at p < .10. Only one of the phrases ("not very probable") had a significantly greater absolute deviation in the ordered list than in the random list.

The hypothesis also predicts that subjects will be less variable in assigning numerical meanings to phrases in the ordered lists than in the random lists. In contrast with the previous analysis, this prediction does not depend on a priori assumptions about the "true" meanings of the phrases. Column 8 of Table 15 shows that for 17 of the 19 phrases (all save "tossup" and "high probability"), there was higher variability in the numerical values assigned to the phrases when they were presented in the randomly ordered lists (Chi-square = 7.5, df = 1, p < .005, one-tailed). The mean standard deviations of the values assigned to the phrases in the 4 lists are shown at the bottom of Table 14. The values assigned to phrases in the list with Random Order B had the highest standard deviation (.118), more than that for Random Order A (.084), Ascending order (.071), or Descending order (.047). The mean random list standard deviation (M = .101) was significantly higher than the mean ordered list standard deviation (M = .054; t = 3.92, df = 18, p < .01). T-tests between the standard deviations of the values assigned in the individual lists are shown in Table 16. All 4 comparisons between random and ordered lists have the predicted order (3 of them significant). Overall there was, as predicted, less between-subject variation in the means of the numerical values assigned to the phrases when they were presented in an ordered list. In addition, there were significant differences between the two ordered lists (subjects assigned more varying values in the ascending list) and between the two random lists (subjects assigned more varying values in list B).
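The two tallies reported for Table 15 (13 of 19 phrases with a larger absolute mean deviation in the random lists, and 17 of 19 with a larger standard deviation) can be recounted directly from the table's columns. The following Python sketch does so and also shows the general form of a paired t statistic across phrases. The `paired_t` helper is an illustration under the assumption of an ordinary paired test; because it is applied here to the Table 15 standard deviations rather than to the list-level figures the report used, its value is not expected to equal the reported t = 3.92.

```python
import math
import statistics

# Columns of Table 15, in the table's phrase order (19 phrases).
ordered_mean = [0.013, 0.047, 0.031, 0.026, 0.031, 0.032, 0.004, 0.005, -0.003, -0.005,
                0.002, 0.002, -0.026, -0.025, -0.022, -0.017, -0.021, -0.020, -0.001]
random_mean = [0.018, 0.094, 0.085, 0.065, -0.002, 0.010, -0.028, 0.002, -0.018, 0.000,
               0.016, 0.018, -0.020, -0.032, -0.043, -0.016, -0.022, -0.053, -0.003]
ordered_sd = [0.103, 0.081, 0.049, 0.054, 0.080, 0.063, 0.067, 0.067, 0.042, 0.039,
              0.053, 0.066, 0.076, 0.077, 0.076, 0.077, 0.074, 0.068, 0.006]
random_sd = [0.114, 0.089, 0.221, 0.124, 0.110, 0.108, 0.124, 0.179, 0.054, 0.003,
             0.072, 0.082, 0.180, 0.148, 0.208, 0.101, 0.073, 0.117, 0.011]

# Phrases whose mean deviation is farther from zero in the random lists.
larger_abs_dev = sum(abs(r) > abs(o) for o, r in zip(ordered_mean, random_mean))

# Phrases whose assigned values vary more in the random lists.
larger_sd = sum(r > o for o, r in zip(ordered_sd, random_sd))


def paired_t(a, b):
    """Paired t statistic for a minus b, with len(a) - 1 degrees of freedom."""
    diffs = [x - y for x, y in zip(a, b)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


t_sd = paired_t(random_sd, ordered_sd)
print(larger_abs_dev, larger_sd, round(t_sd, 2))
```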


Insert Table 16 about here.

If subjects assign to randomly arranged verbal expressions of probability numerical values that are more variable and farther from the conventional meanings of these terms than those they assign in ordered lists, and if this reflects their understanding of the meanings of the phrases when they are using them to answer the questions, then it is preferable to use the ordered lists.

3.2.2. Effect of breadth of phrase meaning and list structure on variability and accuracy of value assignment.

It can be expected that subjects will assign more variable numerical values to verbal probability expressions that have broader meanings. This would occur particularly when the phrase lists are randomly ordered, for the context supplies fewer constraints on the meaning of each term. The correlations between the four measures of breadth of membership function of the 19 phrases (from Table 9) and the standard deviations of those phrases when presented in each list order (from Table 14) are shown in the top half of Table 17. (It should be noted that two of the measures of breadth of membership function are in fact standard deviations, and so would have high correlations by definition.) There is a strong positive correlation between the indices of breadth of phrase meaning (Mean and Median Range) and the standard deviations of the values assigned to the phrases, for every list order except the ascending list. This relation was expected and is the reason the standard deviation was proposed as a proxy measure for the breadth of membership function. However, there is a difference between ordered and random lists in the strength of this relationship.


Insert Table 17 about here.

Analogous arguments lead us to expect that the breadth of a phrase's membership function may influence the accuracy of the values subjects assign to the phrase, and that this effect may be moderated by whether the list is ordered or random. The correlations between the measures of breadth of meaning and the accuracy scores (absolute deviations, defined above), presented in the bottom half of Table 17, show that there is no significant relation between accuracy of the value assignment and Mean Range, our best measure of breadth of membership function, although Median Range and standard deviation are significantly positively correlated with accuracy. These latter relations may be attributed to the fact that these two measures distinguish the phrases identified with 0, .5, and 1 from the others, and people know the value of these probability phrases. The structure of the list (ordered versus random) has no effect on the size of these relations.

3.2.3. Effect of nearness of anchor on variability and accuracy of value assignment.

Three of the verbal expressions of probability used here have quite specific meanings: "absolutely certain" (1.0), "tossup" (.50), and "absolutely impossible" (0). It is possible that subjects use these phrases as anchors when assigning values to other phrases. If so, we may expect less variability in the values assigned to phrases that are near to these anchors, than in the values assigned to more distant phrases. The distance of a phrase from an anchor will be more salient in an ordered list than in a random list. Therefore, we may expect the effect of distance from an anchor phrase on the variability of the values assigned to other phrases to be smaller in random lists. However, people already know the meanings of these phrases, and so even in the random lists a phrase whose meaning is near an anchor may have a narrower range of interpretations.

In the context of our list of 19 phrases, the distance of a phrase from the nearest anchor is simply measured by counting the number of steps in the list to the nearest anchor (see Column 9 of Table 15). The hypothesis predicts that this measure will be positively correlated (over the 16 non-anchor phrases) with the standard deviation of the values the subjects assigned to the phrase, and that this correlation will be larger for the ordered lists than for the random lists. Column 1 of Table 18 shows these correlations for each list and for the combined lists (ordered, random, and total). The correlations of phrase value standard deviations with distances from anchors are significantly


Table 15
Means and standard deviations of accuracy scores (deviations) for values assigned to phrases, Ordered and Random lists.

                                    Ordered        Random       List with    Test of        List with   Distance of
                                     Lists          Lists        greater     Mean dif        greater    phrase from
                                                                deviation                    variab-      nearest
Phrase                             Mean    SD     Mean    SD      score       F     sig       ility       anchor

Absolutely imposs.                 .013  .103     .018  .114       r         .06   .800         r            0
Rarely                             .047  .081     .094  .089       r*      10.54   .002         r            1
Very unlikely                      .031  .049     .085  .221       r*       4.32   .040         r            2
Seldom                             .026  .054     .065  .124       r*       6.07   .015         r            3
Not very probable                  .031  .080    -.002  .110       o*       4.10   .045         r            4
Fairly unlikely                    .032  .063     .010  .108       o        2.14   .146         r            4
Somewhat unlikely                  .004  .067    -.028  .124       r+       3.64   .058         r            3
Uncertain                          .005  .067     .002  .179       o         .02   .887         r            2
Slightly less than half the time  -.003  .042    -.018  .054       r+       3.54   .062         r            1
Toss-up                           -.005  .039     .000  .003       o        1.19   .277         o            0
Slightly more than half the time   .002  .053     .016  .072       r        1.85   .176         r            1
Better than even                   .002  .066     .018  .082       r        1.55   .215         r            2
Rather likely                     -.026  .076    -.020  .180       o         .05   .819         r            3
Good chance                       -.025  .077    -.032  .148       r         .12   .734         r            4
Quite likely                      -.022  .076    -.043  .208       r        1.79   .184         r            4
Very probable                     -.017  .077    -.016  .101       o         .00   .947         r            3
Highly probable                   -.021  .074    -.022  .073       r         .02   .886         o            2
Almost certain                    -.020  .068    -.053  .117       r*       4.45   .037         r            1
Absolutely certain                -.001  .006    -.003  .011       r        2.68   .104         r            0

Number of phrases for which random list has larger statistic:     13                           17
Number of phrases for which ordered list has larger statistic:     6                            2

Note: N = 76 for Ordered lists, and N = 64 for Random lists.

+ p < .10; * p < .05.


Table 16
T-tests of differences between mean standard deviations of values assigned to phrases, in different phrase list orders.

                      SD Asc    SD Des    SD Ran A
---------------------------------------------------
            Mean       .071      .047      .084
---------------------------------------------------
SD Des      .047      -3.54*      -
SD Ran A    .094       1.02      3.37*       -
SD Ran B    .118       3.80*     5.36*     3.15*
---------------------------------------------------

pc.1


Table 17
Correlations of measures of phrases' breadth of membership function and measures of phrases' standard deviation and accuracy.

                                    Measures of Phrases' Breadth of Membership Function
                                    Mean       Median     SD of       SD of
                                    Range      Range      Values      Midpoints
-----------------------------------------------------------------------------------
Measures of Phrases' Variability
  SD ascending list                 .22        .28        .51* a      .26
  SD descending list                .72**      .86*       .56** a     .48*
  SD ordered lists                  .56*       .68**      .68* a      .46*
  SD random list A                  .75*       .69**      .86** a     .53**
  SD random list B                  .39*       .56**      .91** a     .40*
  SD random lists                   .61*       .68**      .97** a     .50*
  SD all lists                      .67**      .76**      1.0 b       .55*

Measures of Phrases' Accuracy
  Acc ascending list                .09        .44*       .45*        .27
  Acc descending list               .12        .50*       .49*        .08
  Acc ordered lists                 .11        .49*       .49*        .20
  Acc random list A                 .02        .40*       .38+       -.02
  Acc random list B                 .13        .50*       .54**       .24
  Acc random lists                  .09        .49*       .51*        .15
  Acc all lists                     .10        .52**      .55**       .18
-----------------------------------------------------------------------------------

a These row variables are all components of the column index.

b The row variable is identical with the column variable.

+ p < .10; * p < .05; ** p < .01.


Table 18
Correlations between standard deviations and accuracies of values assigned to phrases, and distance of phrase from nearest anchor.

                       Correlation of Phrase      Correlation of Phrase Value
                       Value SD with              Accuracy (abs. deviation)
Phrase List Order      distance from anchor       with distance from anchor
------------------------------------------------------------------------------
Ascending                    .41+                        .30
Descending                   .30                         .13
Ordered (all)                .44*                        .24

Random A                     .32                        -.40+
Random B                     .14                         .00
Random (all)                 .25                        -.16

Total                        .34+                       -.05
------------------------------------------------------------------------------

Note: N = 16 for every correlation.

+ p < .10; * p < .05.


positive, as predicted, for the ascending list, and for the ordered lists overall. Though positive, the correlations for the random lists are smaller, as expected, and nonsignificant.
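
The distance measure and the correlation described above can be sketched in a few lines. This is only an illustration, not the analysis actually run for the report: the function names are hypothetical, the anchors are assumed to sit at the first, middle, and last positions of the 19-phrase ordered list, and the standard deviations are the Ordered-list values transcribed from Table 15. Because those table entries are rounded and pooled, the result need not match the corresponding entry in Table 18 exactly.

```python
from math import sqrt

def anchor_distances(n_phrases, anchor_positions):
    """Number of list steps from each position to its nearest anchor."""
    return [min(abs(i - a) for a in anchor_positions) for i in range(n_phrases)]

def pearson_r(xs, ys):
    """Plain Pearson product-moment correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Assumption: anchors at positions 0 ("absolutely impossible"),
# 9 ("toss-up"), and 18 ("absolutely certain") of the ordered list.
distances = anchor_distances(19, anchor_positions=(0, 9, 18))

# Ordered-list SDs transcribed from Table 15, in list order.
ordered_sd = [.103, .081, .049, .054, .080, .063, .067, .067, .042, .039,
              .053, .066, .076, .077, .076, .077, .074, .068, .006]

# The hypothesis is tested over the 16 non-anchor phrases only.
non_anchor = [(d, s) for d, s in zip(distances, ordered_sd) if d > 0]
r = pearson_r([d for d, _ in non_anchor], [s for _, s in non_anchor])

print(distances)
print(round(r, 2))
```

The computed distances should agree with the last column of Table 15, which is a quick check that the anchor positions were assumed correctly.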

Insert Table 18 about here.

If subjects do indeed use these three phrases as anchors, does this contribute to the accuracy of the values they assign to other phrases? Are the phrases near anchors assigned more accurate values, and if this effect occurs, is it stronger in ordered phrase lists? Column 2 of Table 18 shows that, for the ordered lists, there is a nonsignificant correlation of .24 between absolute error of assigned value and distance from anchor, which is in the predicted direction. The correlation in one of the random lists was -.40, df = 15, p < .10, in the opposite direction. These results provide weak evidence that when lists are ordered, subjects use an anchoring strategy that both narrows the range of the values they assign to verbal expressions of probability and makes those values more accurate.

3.2.4. Effect of list structure on amount of duplication in assigned values.

If, in the value assignment procedure of the proposed method, subjects assign the same numerical value to more than one phrase, this would degrade the precision of the method. People can be expected to do this more often when the lists are random than when they are ordered. The extent of such duplication can be measured by counting the number of pairs of phrases to which a subject assigns the same value. For example, if "almost certain" and "highly probable" are both assigned the value .90, that is one duplicated pair. If in addition "quite likely" were to be called .90, this would produce 3 pairs. If someone assigned the same value to all 19 phrases, there would be (19 × 18)/2 = 171 duplicate pairs. Table 19 shows the number of pairs of phrases that were assigned duplicate values for each list. The number of duplications is very small in comparison with the maximum possible count of 171. Significantly more duplicate values were assigned to phrases in the random lists (M = 4.6) than in the ordered lists (M = 1.5), as predicted (F(3,140) = 6.13, p = .0006).
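
The duplicate-pair count can be illustrated with a short sketch. The function name is hypothetical, and rounding values to two decimals before comparing them is an assumption made here to avoid spurious floating-point mismatches; the report does not say how equality was determined. A group of k phrases sharing one value contributes k(k - 1)/2 pairs.

```python
from collections import Counter
from math import comb

def duplicate_pairs(values):
    """Count pairs of phrases that one subject assigned the same value.

    Assumption: values are compared after rounding to two decimals.
    """
    counts = Counter(round(v, 2) for v in values)
    return sum(comb(k, 2) for k in counts.values())

print(duplicate_pairs([.90, .90]))        # two phrases share a value
print(duplicate_pairs([.90, .90, .90]))   # three phrases share a value
print(duplicate_pairs([.50] * 19))        # all 19 phrases share a value
```

These three calls correspond to the three cases worked through in the text (one pair, three pairs, and the maximum of 171).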

Insert Table 19 about here.

3.3. Effect of phrase list order on accuracy of problem answers.

A third criterion for evaluating the proposed method of expressing degree of belief by selecting verbal expressions of probability is the accuracy of its use. This accuracy is a joint product of (a) the phrase the subject selects, (b) the meaning assigned to the phrase, and (c) the right answer to the problem. Hamm (1988) has compared the accuracy of the verbal and numerical response modes in this study, and found that verbal responses were more accurate in some probabilistic inference word problems but less accurate in others. Here we ask whether the accuracy of subjects' responses is affected by the order in which the phrases are presented.

3.3.1. Effects of list structure on accuracy of problem answers.

Accuracy of answers using the response mode of selecting answers from a list of verbal expressions of probability can be measured by translating all phrases (those the experimenter included in the word problem, and those the subject selected as response) into numbers, and comparing the response number with the correct answer (produced by applying Bayes' Theorem to the numbers in the word problem; see Hamm, 1988). Translation from phrases to numbers can be done in two ways: using the a priori values (Table 1) or the values each individual subject assigned to the phrases. Accuracy using both translations will be studied here, to separate those effects of list order which are due to selection from those due to value assignment. If phrase list order affects accuracy using the a priori translations, this can only be due to its effects on selection of a phrase as a response. If list order affects accuracy using the subjects' individual translations but not using the a priori translation, this must be due to its effects on subjects' assignment of values to phrases.
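
As a rough sketch of this scoring procedure, the following uses the numerical version of the Doctor problem (Appendix 1) after three pieces of information. The function names are hypothetical; the a priori phrase values are taken from the scale in Figure 1 and are assumed here to match Table 1 for these phrases; the subject's own values are invented purely for illustration.

```python
def posterior(prior, hit_rate, false_alarm_rate):
    """P(hypothesis | positive test) by Bayes' Theorem."""
    numerator = prior * hit_rate
    return numerator / (numerator + (1 - prior) * false_alarm_rate)

def absolute_error(response_phrase, correct, phrase_values):
    """Translate the selected phrase to a number and score it."""
    return abs(phrase_values[response_phrase] - correct)

# Doctor problem, numerical presentation: 10% base rate of toxic uremia,
# test indicates toxic uremia 70% of the time when it is present and
# about 30% of the time when the patient has hepatitis.
correct = posterior(prior=0.10, hit_rate=0.70, false_alarm_rate=0.30)

# A priori translations (from the Figure 1 scale; assumed to match Table 1).
a_priori = {"very unlikely": .10, "seldom": .15, "not very probable": .20}

# Hypothetical values one subject might assign to the same phrases.
own = {"very unlikely": .15, "seldom": .20, "not very probable": .30}

response = "not very probable"   # hypothetical selected phrase
error_a_priori = absolute_error(response, correct, a_priori)
error_own = absolute_error(response, correct, own)

print(round(correct, 3), round(error_a_priori, 3), round(error_own, 3))
```

Scoring the same response under both dictionaries is what separates the two sources of list-order effects: the selected phrase is held fixed, and only the translation differs.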


Results are presented separately for subjects for whom the word problems were presented with verbal and numerical expressions of probability (Table 20). If the probabilities were presented as phrases, the numerical value of the right answer depends on an assignment of a numerical value to one or more phrases. Because the subject was in the verbal response mode condition, the numerical value of his or her answer also depends on the assignment of a numerical value to a phrase. Answers when no information had been presented are not analyzed here, because there was little variation in response. There was no single correct answer for the Doctor and Insurance problems when only two pieces of information had been presented (see Hamm, 1987; 1988), and so these subproblems too are excluded from the analysis.

Insert Table 20 about here.

Table 20 shows the accuracy scores (absolute errors) for subjects using the verbal response mode, computed using the a priori translations from phrases to numbers, for ordered and random phrase list orders, separately for each subproblem and for the numerical and verbal presentation modes. The ordered list produced more accurate answers on 8 of 20 comparisons between the ordered and random lists. Only three of these 20 comparisons were statistically significant. In all three, ordered lists produced more accurate responses. (When the deviation score, rather than the absolute deviation score, was used, the results were similar, which shows that the advantage of ordered lists is not simply their smaller variability.) In conclusion, there is weak evidence that the order in which phrases are presented influences the accuracy of the subjects' performance on probabilistic inference word problems. A parallel analysis, using subjects' individually assigned values to translate the meaning of the phrases and calculate accuracy, had similar results.

3.3.2. Effects of use of subject's own assigned values versus a priori values on accuracy of response.

A motivation for the proposed method is to enable subjects to express their degrees of belief in a way that is more natural for them than using numerical probabilities. It might seem that asking the subjects afterwards for their numerical interpretations of the phrases defeats this purpose. However, the virtue of the method is its isolation of the numerical thinking, for it allows subjects to use only the linguistic mode when thinking about the problems. Translating the phrases into numbers is done separately and does not interfere with the all-important problem solving. Nonetheless, the assignment of numbers to phrases places a burden on the subjects, and so it is worth considering whether it is possible to do without this part of the procedure by using a priori numerical interpretations of the phrases. What effect does the use of the subjects' own translations of the phrases have on their accuracy on the word problems?

Table 21 shows the mean accuracy score (absolute error) on each problem using both the a priori numerical values and the subjects' own values for the phrases. The comparison includes subjects whose presentation mode and response mode were numerical/verbal, verbal/numerical, or verbal/verbal. The answers using the subjects' own values were more accurate on 8 of the 10 problems (using both absolute deviation scores (Table 21) and simple deviation scores), and significantly so after three pieces of information for the Doctor and Twins problems. However, there is significantly higher accuracy using the a priori values after two pieces of information for the Cab problem. Therefore when accuracy is very important, it is probably preferable to use subjects' individual interpretations of the phrases, rather than relying on a universal a priori interpretation. However, the evidence is mixed, and the difference in even the significant comparisons is small. In conditions where it is difficult to get subjects to assign values to the phrases, a priori interpretations could be used with only a small probable loss of accuracy.

Insert Table 21 about here.


Table 19
Mean number of pairs of phrases to which subjects gave duplicate values, for each phrase list order.

Phrase          Mean #               N of
List Order      of pairs     SD      subjects
----------------------------------------------
Ascending         1.73      5.41       49
Descending        1.03      2.08       30
(Ordered)         1.47      4.44       79

Random A          4.36      4.17       33
Random B          4.84      4.66       32
(Random)          4.60      4.39       65
----------------------------------------------


Table 20
Comparison of word problem accuracy (absolute deviations) between subjects with ordered and random phrase lists, for subjects with numerical and verbal presentation of probabilities.

Presentation mode:           Numerical                        Verbal
Response mode:               Verbal                           Verbal

Prob-   Amount       Mean dev                         Mean dev
lem     of info      Ord     Ran      F     sig       Ord     Ran      F     sig
----------------------------------------------------------------------------------
Cab     1            .13     .24     2.4    .14       .13     .21     2.1    .17
        N            (9)     (8)                      (10)    (8)
Cab     2            .04     .13    19.0    .00**     .09     .06     1.2    .30
        N            (9)     (8)                      (11)    (8)
Cab     3            .35     .33      .4    .51       .42     .38      .9    .35
        N            (18)    (16)                     (21)    (16)
----------------------------------------------------------------------------------
Doc     1            .06     .20     5.2    .03*      .17     .11     1.3    .26
        N            (18)    (15)                     (21)    (16)
Doc     3            .63     .54     1.3    .27       .62     .47     2.6    .13
        N            (9)     (8)                      (10)    (8)
----------------------------------------------------------------------------------
Ins     1            .23     .25      .1    .74       .15     .16      .0    .87
        N            (18)    (16)                     (21)    (16)
Ins     3            .45     .44      .0    .99       .23     .47     7.7    .01*
        N            (9)     (8)                      (11)    (8)
----------------------------------------------------------------------------------
Twn     1            .08     .13     1.2    .23       .22     .12     1.0    .34
        N            (9)     (7)                      (11)    (8)
Twn     2            .09     .09      .0    .91       .12     .11      .0    .97
        N            (9)     (8)                      (10)    (8)
Twn     3            .25     .22      .2    .64       .30     .26      .5    .49
        N            (18)    (16)                     (21)    (15)
----------------------------------------------------------------------------------
Number of problems (out of 10)
where ordered list is
more accurate:                5                                3

* p < .05; ** p < .01.


Table 21
Comparison of mean accuracies (absolute deviations) for each subproblem, using a priori values versus subjects' individual values.

Prob-    Amount      A priori    Own
lem      of info     values      values       t       sig        N
---------------------------------------------------------------------
Cab      1            .19         .19        .09     .925        53
Cab      2            .10         .13      -2.67     .011*       53
Cab      3            .38         .36       1.65     .101       105

Doc      1            .17         .17        .18     .857       105
Doc      3            .61         .55       4.06     .000**      53

Ins      1            .20         .19        .69     .492       107
Ins      3            .39         .40       -.92     .361        53

Twin     1            .17         .17        .79     .434        52
Twin     2            .11         .12        .98     .333        52
Twin     3            .28         .24       2.93     .004**     104
---------------------------------------------------------------------

* p < .05; ** p < .01.


4. Discussion.

Expressing uncertainty by selecting verbal probabilities from a list has the advantages detailed by Zwick (1987; e.g., that people prefer to use verbal probabilities) without the disadvantages of unconstrained verbal expression. Because there are only a limited number of phrases in the offered list, it is possible to agree on their meanings, and so communication of uncertainty is feasible with this method. Because the individual gives numerical interpretations for the verbal expressions in the value assignment procedure, it is possible to compensate for individual differences in the meanings of terms.

The present study tested whether the arbitrary features of the method, specifically the sequential order of the list of verbal expressions, and the positions of particular phrases in the list, affect the results. Investigation of the influence of list order on the selection of phrases, the assignment of numbers to represent phrase meanings, and the accuracy of the responses produced using the method showed that sequentially ordered lists are less vulnerable than random lists to ordinal position effects and to the effects of variations in the phrases' breadth of meaning.

The effect of ordinal position on the selection of a phrase was ascertained by comparing the phrases selected from reversed lists. List order reversal made little difference. If there were no ordinal position effect, the mean ordinal position of the selected phrases, averaged across reversed lists, would be the 10th position out of 19. The mean selected position for the final answers on the problems was 9.66 for the ordered lists, and 10.84 for the random lists. For the ordered lists, this is not statistically different from the 10th position. For the random lists, the tendency to pick terms in the second half of the list is significant only when the effect is measured over all four problems.
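
The reason the 10th position serves as the no-effect baseline can be shown with a small sketch. The function name is hypothetical, and the sketch assumes that the two reversed versions of a list are weighted equally and that, absent any position effect, a subject picks the same phrase whichever version is shown.

```python
def mean_position_across_reversal(position, n_phrases=19):
    """Average ordinal position of one phrase over a list and its reversal."""
    reversed_position = n_phrases + 1 - position
    return (position + reversed_position) / 2

# Whatever phrase is chosen, its positions in the two versions sum to 20,
# so the average over the reversed pair is always the middle position.
means = [mean_position_across_reversal(p) for p in range(1, 20)]
print(set(means))
```

Any systematic departure of the observed mean from this midpoint therefore reflects a preference for one end of the list rather than for particular phrases.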

People seem to prefer phrases with relatively broad meanings, such as "somewhat unlikely" or "good chance". This preference, however, may be due to the particular word problems used in the study. The answers to these problems tended to be in the .60 to .90 range (see Hamm, 1988). The verbal expressions covering this range (as well as the .10 to .40 range) have broader ranges than the phrases covering other ranges. Therefore subjects are likely to use a phrase with a relatively broad meaning on these problems. An additional effect that is independent of problem content was demonstrated: in random lists, the preference for broad over narrow phrases was greater in the second half of the list. There were two performance measures on which random lists were not significantly different from ordered lists: the time to complete the questionnaire, and the variance of the a priori meaning of the phrases selected as word problem answers.

The method's ability to compensate for individual differences and context effects in the interpretation of the verbal expressions of probability depends on the second step, a separate procedure in which subjects assign numerical values to the phrases. Subjects gave more variable and less accurate values to phrases which were displayed in random order. Similar effects probably occur when people interpret the verbal expressions prior to selecting a phrase to answer a word problem.

The data suggest that subjects produce values for the terms by anchoring on the meanings of known phrases ("absolutely certain" for 1.0, "tossup" for .5, and "absolutely impossible" for 0), and then adjusting. The evidence for this strategy is that the values given to phrases near these anchors were less variable than the values given to phrases farther away. The correlation between phrase value variance and distance from an anchor was statistically significant in the ordered lists, but not significant (though positive) in the random lists. Thus the sequential arrangement of the list seems to facilitate the use of the anchor and adjust strategy in assigning values to phrases. This may be why more accurate values were assigned to phrases in the ordered lists, and is another reason to prefer ordered lists. Additionally, the ordered list promotes more discrimination among the phrases, for fewer duplicate values were assigned to phrases when they were presented in sequence.

The accuracy of the word problem answers depends on the accuracy of the two procedures we have already discussed, selection and value assignment. Although the advantages of ordered phrase lists have been demonstrated for both of these procedures, measurement of their effect on


word problem accuracy gives perspective on the importance of the distinction between ordered and random lists. The overall difference was very small, but the ordered lists produced significantly more accurate responses than the random lists for 3 of the 20 answers tested.

All these comparisons indicate either that the phrase selection method is better using ordered phrase lists than random lists, or that there is no difference. Before recommending ordered phrase lists, however, we must consider a potential criticism. The constraints that an ordered list places on the subject's interpretation of the phrases may distort, rather than clarify, the meanings of the phrases, thus preventing people from using the phrases as they normally would. Consider, for example, the meaning of "almost certain". Kong, Barnett, Mosteller, and Youtz (1986) found subjects assign it a mean value of .78 (median .90), but the author used it to mean .95 in the present study. When subjects assigned values to "almost certain" in the random phrase lists (where there was nothing to indicate that the phrase meant .95), the mean value was .90 (see Table 9). However, in ordered lists (where it appeared in the 18th or 2nd of 19 positions, between "highly probable" and "absolutely certain"), its mean value was .93. This suggests that placing a phrase in an ordered list may change its meaning.

Another example is the verbal expression for .40, "uncertain." The range of meaning people assign to this phrase is both very wide (an average of .24 between the lower and upper bounds; see Table 9) and very variable (Table 8 shows an average standard deviation of .34; some subjects gave it a range of 0 and others a range of 100). Although the mean value assigned to "uncertain" was .40 or .41 in both the ordered and random lists (Table 14), agreeing with the a priori value, these values were much more variable in the random list (sd = .18) than the ordered list (sd = .06). Thus placing a phrase in an ordered list can change the breadth of its meaning. Because of the exceptional variability of the meaning of "uncertain", an alternative phrase for .40 should be substituted. A candidate is "worse than even", which Shanteau (1974) found to have a mean value of .38 using two different procedures, and which is symmetric with "better than even" whose meaning is .60.

Although it is possible to find replacement phrases for particular inappropriate verbal expressions, still the ordered list will change some phrases' meanings and breadths of meaning, for many individuals. This can be viewed, however, as a necessary cost of adopting a common set of interpretations of verbal expressions of uncertainty. Kong, Barnett, Mosteller, and Youtz (1986) advocate improving the use of verbal probabilities through codifying the meaning of probabilistic expressions. They suggest measuring what people usually mean by phrases, publicizing this, and training people to use the terms with these agreed-upon meanings. Such publicity and training would (a) reduce the differences between people, (b) narrow the individual membership functions for each phrase, and (c) get people to use the phrases to mean the same probability in different contexts. Such a program would require changing people's interpretations of many phrases, in the process of establishing a new convention. The proposed method of selecting verbal probability expressions from a list could be a tool in such a program. The changes that the use of an ordered phrase list in this method would induce in the meaning of its phrases are costs worth incurring in order to improve communication about uncertainty.

Beyth-Marom (1982) proposed an alternative framework for codifying the meaning of verbal probability expressions. It divides the probability scale into ranges .10 or .20 wide, and associates each range with from 2 to 6 verbal expressions. For example, the terms "small chance" and "doubtful" would refer to the .10 to .30 range. Although this reflects the fact that verbal expressions apply to ranges of probability, it has disadvantages. It does not distinguish between probabilities within a range. It requires people to learn a number of sharp boundaries (e.g., at .10 and .30) that are somewhat arbitrary. If establishing a convention requires people to relearn the meanings of phrases, it seems more useful to associate phrases with points and allow for fuzzy boundaries, than to associate phrases with specific ranges.

The method proposed here optionally elicits subjects' own numerical meaning for each phrase. Use of subject supplied values rather than a priori values to interpret the phrases used in the word problems, in order to evaluate the accuracy of the subjects' reasoning, resulted in improved


Probability Response Scale

Verbal Expressions                     Numerical Expressions
-------------------------------------------------------------
Absolutely impossible                          .00
Rarely                                         .05
Very unlikely                                  .10
Seldom                                         .15
Not very probable                              .20
Fairly unlikely                                .25
Somewhat unlikely                              .33
Worse than even                                .40
Slightly less than half the time               .45
Toss-up                                        .50
Slightly more than half the time               .55
Better than even                               .60
Rather likely                                  .70
Good chance                                    .75
Quite likely                                   .80
Very probable                                  .85
Highly probable                                .90
Almost certain                                 .95
Absolutely certain                            1.00
-------------------------------------------------------------

Figure 1. Probability Response Scale


accuracy in 8 of 10 problems, but this cost additional subject time. The need for such a procedure would presumably fade if a set of conventional meanings became accepted.

A list that displayed both the verbal expressions of probability and their numerical interpretations (as in Figure 1) could be useful in this context. People would be free to use the mode they found more fitting to the problem and to their cognitive style. The two modes of expression would mutually define each other, so that people's interpretation of each would be more constrained. Finally, use of the scale would train people to associate the verbal and numerical expressions, promoting the acceptance and use of the new convention.
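
One way to picture such a two-way scale is as a simple lookup table, sketched below with the entries of Figure 1. The function names are hypothetical, the .25 entry is written as "Fairly unlikely" (the form used in Table 15), and mapping a number to the phrase with the nearest point value is an assumption of this sketch rather than a rule stated in the report.

```python
# Scale entries transcribed from Figure 1 (verbal expression, point value).
SCALE = [
    ("Absolutely impossible", 0.00), ("Rarely", 0.05),
    ("Very unlikely", 0.10), ("Seldom", 0.15),
    ("Not very probable", 0.20), ("Fairly unlikely", 0.25),
    ("Somewhat unlikely", 0.33), ("Worse than even", 0.40),
    ("Slightly less than half the time", 0.45), ("Toss-up", 0.50),
    ("Slightly more than half the time", 0.55), ("Better than even", 0.60),
    ("Rather likely", 0.70), ("Good chance", 0.75),
    ("Quite likely", 0.80), ("Very probable", 0.85),
    ("Highly probable", 0.90), ("Almost certain", 0.95),
    ("Absolutely certain", 1.00),
]

def phrase_to_number(phrase):
    """Look up the conventional point value of a phrase (case-insensitive)."""
    lookup = {name.lower(): value for name, value in SCALE}
    return lookup[phrase.lower()]

def number_to_phrase(probability):
    """Return the phrase whose point value lies nearest the probability."""
    return min(SCALE, key=lambda entry: abs(entry[1] - probability))[0]

print(phrase_to_number("toss-up"))
print(number_to_phrase(0.92))
print(number_to_phrase(0.34))
```

Treating each phrase as a point with nearest-value matching is in keeping with the preference, expressed above, for points with fuzzy boundaries over fixed ranges with sharp cutoffs.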

Insert Figure 1 about here.

Alternative lists of verbal expressions of very low or very high probabilities would be useful for making distinctions among degrees of near impossibility or near certainty. These lists should be based on research discovering the phrases people already use in contexts where these ranges of probability are pertinent, such as medicine (cf. Meyer and Pauker, 1987) or technological systems. A recent example highlights this need. To assess the overall risk of space shuttle failure, NASA engineers were asked to make verbal assessments of the reliability of space shuttle components. These were then translated into numbers, using an arbitrary code ("frequent" = .01; "reasonably probable" = .001; "occasional" = .0001; and "remote" = .00001) that was not used by the engineers in making their original assessments (Marshall, 1986). This poor risk assessment practice has given subjective judgment a bad name in the aerospace community: "the government is relying too much on subjective judgment and too little on statistical analysis in deciding which of thousands of safety problems on the space shuttle should get attention" (Marshall, 1988, p. 1233). Codification of verbal expressions of probability would impose consistent interpretations on the phrases and allow experts' subjective judgment to make its potentially crucial contribution.


5. Bibliography.

Beyth-Marom, R. (1982). How probable is probable? A numerical translation of verbal probability expressions. Journal of Forecasting, 1, 257-269.

Budescu, D.V., and Wallsten, T.S. (1985). Consistency in interpretation of probabilistic phrases. Organizational Behavior and Human Decision Processes, 36, 391-405.

Hamm, R.M. (1987). Diagnostic inference: People's use of information in incomplete Bayesian word problems. (Publication #87-11.) Institute of Cognitive Science, University of Colorado, Boulder.

Hamm, R.M. (1988). Accuracy of probabilistic inference using verbal and numerical probabilities. Institute of Cognitive Science, University of Colorado, Boulder.

Kong, A., Barnett, G.O., Mosteller, F., and Youtz, C. (1986). How medical professionals evaluate expressions of probability. New England Journal of Medicine, 315, 740-745.

Lichtenstein, S., and Newman, J.R. (1967). Empirical scaling of common verbal phrases associated with numerical probabilities. Psychonomic Science, 9, 563-564.

Mapes, R.E.A. (1979). Verbal and numerical estimates of probability in therapeutic contexts. Social Science and Medicine, 13A, 277-282.

Marshall, E. (1986). Feynman issues his own shuttle report, attacking NASA's risk estimates. Science, 232, 1596.

Marshall, E. (1988). Academy panel faults NASA's safety analysis. Science, 239, 1233.

Meyer, K.B., and Pauker, S.G. (1987). Screening for HIV: Can we afford the false positive rate? New England Journal of Medicine, 317, 238-241.

Shanteau, J. (1974). Component processes in risky decision making. J. Experimental Psychology, 103, 680-691.

Simpson, R.H. (1944). The specific meanings of certain terms indicating differing degrees of frequency. Quarterly J. of Speech, 30, 328-330.

Wallsten, T.S., Budescu, D.V., Rapoport, A., Zwick, R., and Forsyth, B. (1986). Measuring the vague meanings of probability terms. J. Experimental Psychology: General, 115, 348-365.

Wallsten, T.S., Fillenbaum, S., and Cox, J.A. (1986). Base rate effects on the interpretations of probability and frequency expressions. J. of Memory and Language, 25, 571-587.

Zimmer, A.C. (1983). Verbal vs. numerical processing of subjective probabilities. In R.W. Scholz (Ed.), Decision Making under Uncertainty. North-Holland: Elsevier Science Publishers, pp. 159-182.

Zwick, R. (1987). Combining stochastic uncertainty and linguistic inexactness: Theory and experimental evaluation. Ph.D. Dissertation, Psychology Department, University of North Carolina, Chapel Hill.


6. Appendix 1.

The "Doctor" problem, one of four probabilistic inference word problems used in the study.

In alternative versions of the problem, the probabilistic information was presented in either verbal or numerical form.

[0 pieces of key information.] The next word problem is about a doctor trying to figure out what disease a patient has. The patient is, undeniably, ill, but it is difficult to know what disease he has. You will be asked to estimate how likely it is that the patient has one of two diseases.

The patient comes in to the emergency room at night with a very unusual symptom - his eyes are bright yellow. The doctor knows that there are only two diseases that can produce this particular symptom - hepatitis and toxic uremia. People never contract both illnesses at the same time.

With what you know now, what is the probability that the patient has toxic uremia?

[1 piece of key information.] A discussion with a colleague reminds the doctor that toxic uremia is a less common disease than hepatitis. He checks a textbook and finds that [it is highly probable that people] [90% of people] who present to their doctors with the symptom of yellow eyes have hepatitis, therefore, [it is very unlikely that they] [only 10% of people with this symptom] have toxic uremia.

With what you now know, what is the probability that the patient has toxic uremia?

[2 pieces of key information.] The doctor orders the lab to do a Spock test on the patient's blood. In two hours the results are back - the Spock test indicates that the patient has toxic uremia.

With what you know now, what is the probability that the patient has toxic uremia?

[3 pieces of key information.] The doctor consults his diagnostic manual and discovers that the Spock test is the best way to find out whether a patient with yellow eyes has hepatitis or toxic uremia. However, the Spock test is not foolproof. When the patient has toxic uremia, [it is rather likely that the Spock test will indicate that the patient has this illness. It is somewhat unlikely that the Spock test will indicate that the patient has hepatitis] [the Spock test correctly indicates this 70% of the time, but 30% of the time it falsely indicates that the patient has hepatitis]. Similarly, when the patient actually has hepatitis, [it is somewhat unlikely that the Spock test will indicate that the patient has toxic uremia] [the Spock test will indicate that the disease is toxic uremia approximately 30% of the time].

With what you know now, what is the probability that the patient has toxic uremia?
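For reference, the numerical version of the problem can be worked through with Bayes' theorem. The sketch below assumes that Bayes' theorem is the intended benchmark and uses only the figures stated in the problem (10% base rate, 70% correct indication, 30% false indication); the function and variable names are illustrative and do not come from the report.

```python
def posterior_toxic_uremia(prior, hit_rate, false_alarm_rate):
    """P(toxic uremia | Spock test indicates toxic uremia), by Bayes' theorem.

    prior            -- P(toxic uremia) before the test result is known
    hit_rate         -- P(test indicates toxic uremia | toxic uremia)
    false_alarm_rate -- P(test indicates toxic uremia | hepatitis)
    """
    true_positive = prior * hit_rate
    false_positive = (1.0 - prior) * false_alarm_rate
    return true_positive / (true_positive + false_positive)


# Figures from the numerical wording of the problem.
prior = 0.10             # 10% of yellow-eyed patients have toxic uremia
hit_rate = 0.70          # test correct 70% of the time given toxic uremia
false_alarm_rate = 0.30  # test indicates toxic uremia 30% of the time given hepatitis

posterior = posterior_toxic_uremia(prior, hit_rate, false_alarm_rate)
print(f"{posterior:.3f}")  # 0.206
```

Under these assumptions the probability after all three pieces of information is 0.07 / (0.07 + 0.27), or roughly 21%, which is higher than the 10% base rate but well below the 70% accuracy of the test.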


7. Appendix 2.

Instructions for questionnaire eliciting lower and upper bounds on the numerical meanings of each phrase.

[Two versions were prepared. One asked for upper bounds first, and the other asked for lower bounds first.]

People often use words or phrases such as "impossible" or "very likely" to express a degree of uncertainty or certainty. We are interested in the range of uncertainties for which you think it appropriate to use each of a number of words or phrases.

Think of a cafeteria tray that has 100 ping pong balls on it. Some of them are white and the rest are yellow. You can see every one of them clearly. You must convey to a friend how many of the balls are white. You want to tell him how likely it is that a white ball would be picked if they were thoroughly mixed up and someone were to draw one without looking. However, you are not allowed to tell the person the actual proportion of white ping pong balls. Rather, you are forced to use a non-numerical descriptive phrase.

We want to know the range of proportions of white ping pong balls, in the tray described above, for which you would consider each term to be appropriate. We will ask you to tell us this for each of 20 terms.

The first term is "about even". What is the highest [lowest] proportion of white balls (out of 100) for which you think it would be appropriate to use the term "about even", in trying to tell your friend the proportion of white and yellow ping pong balls? Write that number here:

Now what is the lowest [highest] proportion of white balls for which you think it would be appropriate to use the term "about even"?

Look at your answers. You should have named two numbers somewhere between 0 and 100 (inclusive). The second number should have been lower [higher] than (or equal to) the first. Any number in between the two numbers would be a reasonable interpretation for your friend to make when you tell him that the chance of drawing a white ping pong ball is "about even". Any number higher [lower] than your first answer would not be a reasonable interpretation of "about even"; nor would any number lower [higher] than your second answer be reasonable. If these statements are not all true, you may wish to go back and change one or both of your answers.
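The consistency conditions described above can be sketched as a small check. This is only an illustration: the function names and the example bounds of 40 and 60 are hypothetical and are not taken from the questionnaire or from any respondent's answers.

```python
def check_bounds(lower, upper):
    """Return a list of problems with a (lower, upper) pair; empty if consistent."""
    problems = []
    # Both answers must be between 0 and 100, inclusive.
    for name, value in (("lower", lower), ("upper", upper)):
        if not 0 <= value <= 100:
            problems.append(f"{name} bound {value} is outside 0-100")
    # The lower bound may equal the upper bound but must not exceed it.
    if lower > upper:
        problems.append(f"lower bound {lower} exceeds upper bound {upper}")
    return problems


def is_reasonable(proportion, lower, upper):
    """True if a proportion falls inside the range given for a phrase."""
    return lower <= proportion <= upper


# Hypothetical answers for "about even": lower bound 40, upper bound 60.
print(check_bounds(40, 60))       # []
print(is_reasonable(50, 40, 60))  # True
print(is_reasonable(65, 40, 60))  # False
```

An empty list from `check_bounds` corresponds to the case where all of the statements in the instructions hold, so the respondent would have no reason to revise either answer.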

On the next page is a list of words or phrases expressing degree of uncertainty. Assume that you are using each phrase to describe the chance of drawing a white ping pong ball from the tray of 100 balls. For each phrase, please express the upper and lower [lower and upper] numerical limits that you would expect your friend to use in interpreting it.


Please focus on each word or phrase by itself, rather than trying to compare it with your answers for other words or phrases.

Upper [Lower] Limit        Lower [Upper] Limit

Uncertain

Rather likely

Somewhat unlikely

Rarely

Slightly less than half the time

Good chance

Fairly unlikely

Absolutely impossible

Toss-up

Quite likely

Not very probable

Absolutely certain

Slightly more than half the time

Very probable

Seldom

Almost certain

Better than even

Highly probable

Very unlikely


Notes

1. The applicable t-test is:

[Equation not recoverable from the scanned source; the legible fragments include N, sd, and the value 4.0069.]

where "s" is the unbiased estimate of the standard deviation and "sd" is the measured standard deviation.
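The distinction between "s" and "sd" can be illustrated with a short sketch. It rests on assumptions that the note does not confirm: that "sd" divides by N while "s" divides by N - 1, and that the test is a one-sample t-test against a hypothesized mean (here 0). The data values are hypothetical and are not results from the report.

```python
import math
import statistics


def one_sample_t(values, mu0=0.0):
    """One-sample t statistic, returning (t, s, sd).

    sd -- standard deviation computed with divisor N (assumed "measured" SD)
    s  -- standard deviation computed with divisor N - 1 (assumed "s")
    """
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    s = sd * math.sqrt(n / (n - 1))  # convert divisor-N SD to divisor-(N - 1) SD
    t = (mean - mu0) / (s / math.sqrt(n))
    return t, s, sd


# Hypothetical difference scores, for illustration only.
values = [0.5, -0.2, 1.1, 0.3, 0.8, -0.1]
t, s, sd = one_sample_t(values)
print(f"t = {t:.3f}, s = {s:.3f}, sd = {sd:.3f}")
```

Under these assumptions the statistic can equivalently be written as the mean difference multiplied by the square root of N - 1 and divided by sd, which is one way a formula stated in terms of "sd" rather than "s" could arise.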
