  • 8/3/2019 Significance Levels-0.05, 0.01?????


    Agricultural & Applied Economics Association

    Significance Levels. 0.05, 0.01, or ?
    Author(s): Lester V. Manderscheid
    Source: Journal of Farm Economics, Vol. 47, No. 5, Proceedings Number (Dec., 1965), pp. 1381-1385
    Published by: Oxford University Press on behalf of the Agricultural & Applied Economics Association
    Stable URL: http://www.jstor.org/stable/1236396


    CONTRIBUTED PAPERS: MARKETING, PRICES, AND CONSUMPTION
    CHAIRMAN: MARGUERITE BURK, UNIVERSITY OF MINNESOTA

    Significance Levels: 0.05, 0.01, or ?
    LESTER V. MANDERSCHEID

    Most statistically oriented research published in the JOURNAL OF FARM ECONOMICS includes tests of statistical hypotheses. In most cases a significance level of either 5 or 1 percent is cited. But a few use 10 or even 20 percent. Why the difference? Is a 1-percent level "better" than a 5-percent level?

    I will argue that choice of statistical-significance levels is not arbitrary but rather is, or at least should be, a deliberate choice. The basic purpose of this paper is to integrate decision theory (management, if you prefer) with statistical hypothesis testing. The discussion will be restricted to relatively simple cases so as to minimize mathematical confusion. Once the concepts are clarified, the mathematically sophisticated reader may pursue the more realistic cases. Let me begin with a review of some elementary ideas to insure that we are all thinking in the same terms.

    The basic problem in hypothesis testing involves choosing which of two hypotheses to use as a basis for action. Hypotheses may be simple or very complex. In a simple case, we might hypothesize that two populations have equal means ($H$) or, alternatively, that the mean of one population is two units larger than the mean of the second ($H_A$). More formally:

        $H$: $\mu_1 = \mu_2$
        $H_A$: $\mu_1 - \mu_2 = 2$.

    A t test can be applied to a set of data to determine whether we accept $H$, or reject $H$ and accept $H_A$. What significance level should we use? Is a 1-percent significance level better than a 5-percent significance level?

    Why would we consider the 1-percent significance level better? Because the probability of rejecting $H$ when it is true is reduced to 1 percent. And this is obviously better than using a test which permits a 5-percent probability of rejecting $H$ when it is true. Or is it? The probability of accepting $H$ when it is false must also be considered. Thus, we have two types of errors:

        Type I: Rejecting $H$ when it is true.
        Type II: Accepting $H$ when it is false.

    LESTER V. MANDERSCHEID is associate professor of agricultural economics, Michigan State University.
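    The two error probabilities just defined can be computed directly for the simple case $H$: $\mu_1 = \mu_2$ against $H_A$: $\mu_1 - \mu_2 = 2$. The following is a minimal sketch, not the paper's own calculation: it swaps the t test for a one-sided normal (z) approximation, and the standard error `se = 1.0` and the function name `type_ii_error` are assumptions made for illustration.

```python
from statistics import NormalDist

def type_ii_error(alpha, delta=2.0, se=1.0):
    """Probability of accepting H when H_A (true difference = delta) holds.

    Assumes the observed difference in means is normal with a known
    standard error `se` (a hypothetical value, not from the paper).
    """
    std = NormalDist()
    # Reject H when the observed difference exceeds this critical value.
    critical = se * std.inv_cdf(1.0 - alpha)
    # Under H_A the observed difference is centered on delta.
    return std.cdf((critical - delta) / se)

for alpha in (0.20, 0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}  beta = {type_ii_error(alpha):.3f}")
```

    Under these assumptions, each reduction in the Type I probability raises the Type II probability.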


    In a statistician's Utopia one can simultaneously minimize the probability of Type I error ($\alpha$) and the probability of Type II error ($\beta$). A statistician in Utopia would obviously set $\alpha = 0 = \beta$. But as $\alpha$ is decreased (the significance level moved from 5 percent to 1 percent) $\beta$ is increased. The exact relationship between $\alpha$ and $\beta$ depends on the underlying probability distributions for the test statistic and on the hypothesis and alternative hypothesis. This relationship is illustrated in most introductory statistics textbooks.

    For any particular test, we know the test statistic and the hypotheses. From this information we can calculate the $\beta$ associated with any particular $\alpha$.¹ The statistical tests recommended in standard textbooks or reference books are suggested because $\beta$ is minimized for given $\alpha$ by these tests. For example, the t test is recommended for testing $H$: $\mu_1 = \mu_2$ against $H_A$: $\mu_1 - \mu_2 = 2$ under rather general conditions because for any value of $\alpha$ (any significance level) the probability of a Type II error, $\beta$, is as small as possible. In some cases power functions or operating-characteristic curves are exhibited to illustrate this fact.

    Unfortunately, very few standard statistics books go much further in helping us select a significance level. Two quotations from the more helpful books will suffice to make the point:

        The choice of a level of significance $\alpha$ will usually be somewhat arbitrary since in most situations there is no precise limit to the probability of an error of the first kind that can be tolerated. It has become customary to choose for $\alpha$ one of a number of standard values such as .005, .01 or .05. There is some convenience in such standardization since it permits a reduction in certain tables needed for carrying out various tests. Otherwise there appears to be no particular reason for selecting these values. In fact, when choosing a level of significance one should also consider the power that the test will achieve against various alternatives.²

        In practice, the final choice of the value for the critical probability represents some compromise between these two risks. It must be arrived at by balancing the consequences of a Type I error against the possible consequences of a Type II error.³

    Both quotations emphasize balancing the two types of error. Can we formalize this balancing by use of economic- and/or decision-theory criteria? One might consider minimizing a weighted average of the

    ¹ This statement is literally true for simple hypotheses. For complex hypotheses, we can calculate the $\beta$ for given $\alpha$ for various alternative values of the parameters.
    ² E. L. Lehmann, Testing Statistical Hypotheses, New York, John Wiley & Sons, Inc., 1959, p. 61.
    ³ W. A. Spurr, L. S. Kellogg, and J. H. Smith, Business and Economic Statistics, Homewood, Ill., Richard D. Irwin, 1961, p. 253.


    costs, using $\alpha$ and $\beta$ as weights. Defining the cost of a Type I error as $C_I$ and the cost of a Type II error as $C_{II}$, this might be stated as:

        Minimize $L' = \alpha C_I + \beta C_{II}$.

    We thus consider $L'$ as a "loss function" and minimize it by choosing appropriate $\alpha$ and $\beta$. More properly $L'$ should be labeled an expected-loss function, but simplicity suggests the term "loss function."

    Unfortunately, the loss function involves mathematical difficulties: $\alpha$ is calculated on the basis that $H$ is true while $\beta$ is calculated on the basis that $H_A$ is true. We are thus adding together "unlike" items. But there is also another consideration. Hodges and Lehmann phrase it thus:

        . . . the reasonable compromise in choosing the critical value will depend on the consequences of the two errors. However, it also depends on the circumstances of the problem in another way. If the null hypothesis is very firmly believed, on the basis of much past experience or of a well-verified theory, one would not lightly reject it and hence would tend to use a very small $\alpha$. On the other hand, a larger $\alpha$ would be appropriate for testing a null hypothesis about which one is highly doubtful prior to the experiment.⁴

    Fortunately this suggestion provides a solution to some of the mathematical difficulties, at least to the person willing to accept some of the "Bayesian" approach to statistics. Define as follows:

        $P_I$: Prior probability that $H$ is true,
        $P_{II}$: Prior probability that $H_A$ is true.

    These prior probabilities reflect the investigator's beliefs prior to looking at the data. If we accept the idea that prior probabilities exist, they provide a link for putting $\alpha$ and $\beta$ probabilities on a common basis. The resulting loss function is as follows:

        $L = P_I \alpha C_I + P_{II} \beta C_{II}$.

    Choosing $\alpha$ and $\beta$ so as to minimize $L$, given the values of $P_I$, $P_{II}$, $C_I$, and $C_{II}$, leads to an "optimum" or "best" significance level for the person whose decision rule is to minimize expected loss.

    Note that this discussion assumes a fixed sample size. Permitting sample size to vary allows calculation of the sample size needed to achieve given levels of the loss function rather than minimizing it for given sample size.

    A similar analysis can be pursued by the person who prefers a minimax or some other decision rule. The significance level will depend on the decision rule but the conceptual arguments are the same.
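    The minimization of $L = P_I \alpha C_I + P_{II} \beta C_{II}$ can be sketched numerically. This is an illustration under stated assumptions rather than the paper's procedure: it uses a one-sided normal approximation with a known standard error, a simple grid search over $\alpha$, and hypothetical names (`beta_for`, `expected_loss`, `best_alpha`) and input values.

```python
from statistics import NormalDist

STD = NormalDist()

def beta_for(alpha, delta, se):
    """Type II probability for a one-sided z test of H against a shift of delta."""
    critical = se * STD.inv_cdf(1.0 - alpha)
    return STD.cdf((critical - delta) / se)

def expected_loss(alpha, p1, p2, c1, c2, delta=2.0, se=1.0):
    """L = P_I * alpha * C_I + P_II * beta * C_II, with beta implied by alpha."""
    return p1 * alpha * c1 + p2 * beta_for(alpha, delta, se) * c2

def best_alpha(p1, p2, c1, c2, delta=2.0, se=1.0, steps=999):
    """Grid search for the alpha in (0, 1) that minimizes expected loss."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]  # 0.001 ... 0.999
    return min(grid, key=lambda a: expected_loss(a, p1, p2, c1, c2, delta, se))

# Hypothetical inputs: equal priors, then a costlier Type I error,
# then a firmly believed null hypothesis.
print(best_alpha(0.5, 0.5, c1=1.0, c2=1.0))
print(best_alpha(0.5, 0.5, c1=10.0, c2=1.0))
print(best_alpha(0.9, 0.1, c1=1.0, c2=1.0))
```

    In this sketch a larger Type I cost or a stronger prior belief in $H$ pushes the minimizing $\alpha$ down, which matches the direction of the Hodges and Lehmann remark quoted earlier.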

    ⁴ J. L. Hodges, Jr., and E. L. Lehmann, Basic Concepts of Probability and Statistics, San Francisco, Holden-Day, Inc., 1964, p. 326.


    What are the implications of the loss-function approach for the simple case suggested on page 1, where

        $H$: $\mu_1 = \mu_2$
    and
        $H_A$: $\mu_1 - \mu_2 = 2$?

    Suppose that these refer to yields for two varieties of wheat. Suppose all data other than average yield indicate no difference in the two varieties. Then a farmer might well say $P_I = P_{II} = 1/2$ and also say that $C_I = 0$, since if the yields are equal he loses no profit by choosing either variety. However, $C_{II}$ is 2 bushels per acre and this can be translated into a dollar amount by using price and acreage. Obviously, the farmer wants to minimize $\beta$ and doesn't care about $\alpha$; a significance level of 5 percent is obviously wrong! In fact, he should always choose $H_A$ and plant variety 1.
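    The farmer's conclusion can be checked by substituting his values into the loss function $L = P_I \alpha C_I + P_{II} \beta C_{II}$. A short worked substitution, reading the two equal prior probabilities as one half each:

```latex
\begin{align*}
L &= P_I \,\alpha\, C_I + P_{II} \,\beta\, C_{II} \\
  &= \tfrac{1}{2}\,\alpha \cdot 0 + \tfrac{1}{2}\,\beta\, C_{II} \\
  &= \tfrac{1}{2}\,\beta\, C_{II}.
\end{align*}
```

    Since $\alpha$ drops out, $L$ is smallest when $\beta = 0$, which is reached by always rejecting $H$ (that is, $\alpha = 1$).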

    Suppose that the decision maker is the head of a seed company and that development costs for variety 1, a new variety, would be high. The manager then needs some idea of how much he will lose if he develops variety 1 and it is no better ($C_I$) compared to the loss if he fails to develop it and variety 1 is better ($C_{II}$). Further, he will want to consult geneticists and other agronomists to evaluate the prior probabilities rather than assuming $P_I = P_{II} = 1/2$. In spite of the extra complications, the manager may still find the loss function a useful device for selecting an appropriate significance level.

    Others involved (the plant breeder, a rival seed company, etc.) might arrive at still different $\alpha$ levels either because they begin with different prior probabilities or because their estimated costs are different. But this should not worry us. Don't we argue that decision makers need to evaluate their environment, talents, etc. to arrive at a "best" decision?

    Some will argue that this approach is interesting in theory but impossible in practice because we cannot estimate $C_I$ and $C_{II}$. But if we cannot estimate the costs of an error, should we be testing? A basic purpose of testing is to choose between two acts. If we are choosing acts, we should be able to specify the costs, which can be measured in either monetary or nonmonetary units.

    There remains, however, real difficulty in actually calculating the minimum for the loss function in most cases of practical importance. For example, if we test

        $H$: $\mu_1 = \mu_2$,
        $H_A$: $\mu_1 \neq \mu_2$,


    then the value of $\beta$ depends on the difference between $\mu_1$ and $\mu_2$; $C_{II}$ also depends on this difference.⁵ Thus, $L$ is a complicated mathematical function. Approximate results can be obtained by using "representative values" for $\mu_1 - \mu_2$. The basic conceptual framework is still a valid reasoning device whether one actually carries out the minimization or only approximates it.
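    The "representative values" approximation can be sketched as follows. Everything specific here is an assumption for illustration, not taken from the paper: a one-sided normal approximation with known standard error, a Type II cost proportional to the true difference (`cost_per_unit * d`), and hypothetical representative differences with weights that sum to one.

```python
from statistics import NormalDist

STD = NormalDist()

def beta_for(alpha, delta, se=1.0):
    """Type II probability of a one-sided z test when the true difference is delta."""
    critical = se * STD.inv_cdf(1.0 - alpha)
    return STD.cdf((critical - delta) / se)

def approx_loss(alpha, p1, p2, c1, cost_per_unit, deltas, weights, se=1.0):
    """Expected loss with beta and C_II averaged over representative differences."""
    type_ii = sum(w * beta_for(alpha, d, se) * cost_per_unit * d
                  for d, w in zip(deltas, weights))
    return p1 * alpha * c1 + p2 * type_ii

deltas = [1.0, 2.0, 3.0]      # hypothetical representative values of mu_1 - mu_2
weights = [0.25, 0.5, 0.25]   # hypothetical weights, summing to 1
grid = [i / 1000 for i in range(1, 1000)]
alpha_star = min(grid, key=lambda a: approx_loss(a, 0.5, 0.5, 2.0, 1.0, deltas, weights))
print(f"approximate best alpha: {alpha_star:.3f}")
```

    With a single representative difference the function reduces to the simple two-point loss, so the sketch can be checked against that case before richer sets of values are tried.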

    Birnbaum⁶ has argued that researchers test too many hypotheses and fail to specify the likelihood of various parameter values often enough. Short of publishing the likelihood function, one could publish the maximum value of $\alpha$ that would permit rejection of a relevant hypothesis (or maximum $\alpha$ for several hypotheses). One could go further and publish the $\beta$ associated with several possible values of $\alpha$. This would permit the decision maker to test, or approximate a test, using his optimum values for $\alpha$ and $\beta$.

    Summary

    Choosing a significance level is not an arbitrary choice between a 5-percent and a 1-percent level. Rather, a conscious choice can be made, a choice grounded in the principles of management and statistical theory. One must consider

        (1) the costs associated with each type of error,
        (2) the prior probabilities of the hypothesis and the alternative, and
        (3) the size of the Type II error associated with each significance level.

    Incorporating these facts into a decision model yields a "best" significance level. This approach clarifies the relation between testing hypotheses and following actions and helps explain why several decision makers faced with exactly the same observations may reach different decisions.

    ⁵ $P_I$ must be interpreted carefully, since the probability of exact equality is undoubtedly near 0. We usually have in mind equality up to some small difference.
    ⁶ A. Birnbaum, "On the Foundations of Statistical Inference," J. Am. Stat. Assn. 57:269ff, June 1962.

