STATISTICAL CONCEPTS IN MICROBIOLOGY¹

ROEBERT L. STEARMAN²

Department of Biostatistics, School of Hygiene and Public Health, The Johns Hopkins University, Baltimore, Maryland

¹ Paper number 300.
² Milbank Memorial Fund Fellow.

Downloaded from http://mmbr.asm.org/ on March 6, 2020 by guest

CONTENTS

I. Introduction
II. Observations, Samples and Populations
    Describing a Population
        Frequency tables
        Bar graphs
        Histograms
        Frequency polygons
        Frequency curves
    Parameters and Statistics
        Parameters
        Statistics
III. Precision and Accuracy
    Precision
        Variance as an index of precision
        Methods of increasing precision
        Coefficient of variation as an index of precision
    Accuracy
IV. The Normal Distribution
    Utility
    Parameters and statistics
    Significance Tests
        Basic principles
        Test of a single sample mean
        Test of the difference between two treatments: paired samples
        Test of the difference between two treatments: independent samples
        Test of two sample estimates of the variance
        Significant versus practical differences
        Interpretation of results of significance tests
        Test of the difference among more than two treatments
    Problems of Estimation
        Confidence intervals: basic principles
        Confidence interval for a population mean
        Confidence interval for a population variance
        Confidence interval for the difference between two means
        Components of variance technique
V. The Binomial Distribution
    The parent population
    Probability
    Distribution of samples
    Parameters and statistics of the binomial distribution
    Significance Tests
        Binomial test of a single sample proportion
        Normal approximation to the binomial distribution
        Normal approximation test of a single sample proportion
        Normal approximation test for two sample proportions
        Normal approximation test for more than two sample proportions
        Graphical methods for tests of sample proportions
    Problems of Estimation
        Confidence interval for a population proportion
        Sample size and power in significance tests involving proportions
        Other methods for a confidence interval for a population proportion
        Confidence interval for the difference between two population proportions
VI. The Poisson Distribution
    Derivation of the Poisson Distribution
        Limit of the binomial distribution
        Items or events randomly distributed in time or space
        General form of probability for Poisson distribution
        Mean and variance of the Poisson distribution
    Applications of the Poisson Distribution
        Bacterial counts by chamber method
        Bacterial counts by plate method
        Bacterial counts from dilution series
VII. Acknowledgments
VIII. Appendix
    Analysis of Variance Computing Table
    Tests to Supplement the Analysis of Variance
    Notes on the Application of the Chi-square Test
References

I. INTRODUCTION

Statistical methods are being used to an increasing extent in the field of microbiology. They are employed in many studies which range from the estimation of bacterial densities with dilution series to the determination of better designs for vitamin B12 assays. This increased application of statistics in microbiology is part of a more general trend which is being noted in most biological sciences. One reason for this trend lies in the fact that present biological problems are of a statistical nature. Whether we like it or not, once a science advances beyond the descriptive stage, its problems become statistical, even though we don't use formal statistical techniques.

In 1943, Eisenhart and Wilson (1), in a review of statistical methods in bacteriology, concerned themselves primarily with the methodology of statistical tests. The present review will deal mainly with the basic concepts which underlie statistical tests while the methods will be used to illustrate these principles. Due to limitations of space the review will be restricted to the concepts underlying the elementary methods involving the three basic distributions in statistics, namely, the normal, binomial, and Poisson distributions.

In addition to discussing the basic concepts, an attempt will be made to point out some of the pitfalls which should be avoided. Statements of theorems and methods will be given without the benefit of proofs. However, in some instances an attempt will be made to show the logic underlying a procedure without resorting to involved mathematics or theory.

This paper is neither intended to be a textbook nor is it designed to make statisticians of the readers. The training of a statistician is a long and involved program, just as is the training of a microbiologist. It is hoped, however, that the reader will become acquainted with some of the terminology of the field of statistics and some of the concepts underlying the various methods presented.

Before proceeding, it is appropriate to see what statistical methods can do. One of the more important advantages of statistics is their power to get the most information out of a given set of data. This ability makes it possible to obtain a given amount of information from a smaller experiment than would be needed if cruder methods of analysis were used. For example, in studies involving the effects of different factors (such as pH and temperature), statistical methods make it possible to study the effects of all of the factors in a single experiment and still obtain the same amount of information as would be given by several experiments involving more work. Another advantage of statistical methods is that they offer a standard method of judging experimental data. When different people examine a

given set of experimental data, each person usually forms his own opinion as to what the data mean, whereas statistical methods offer a common basis for evaluating the results. In some types of experiments, statistical methods offer the only satisfactory solution obtainable. Statistical methods are available for a large variety of problems ranging from the testing of differences between two laboratory procedures to the fitting of curves to data.

Some words of caution about the use of statistical methods are also appropriate at this point. Statistical procedures, like dynamite, are beneficial if used properly, but may be dangerous when used improperly. It has been stated, by critics of statistical methods, that anything can be proved using statistics. This statement is true, if (and this is a big if) two conditions are met: (a) statistics are used improperly, and (b) the person to whom a fallacy is being "proved" does not understand statistics. In actuality, statistical methods are nothing more than the application of logic to experimental data, formalized by applied mathematics. If statistical methods led to illogical results, the field would have ceased to exist long ago. Statistical procedures, if used properly, provide a potent, very helpful and many times necessary tool, but anything can happen when the use is improper. Huff (2) has written an excellent, amusing and worth-while book on the subject of the improper use of statistics.

A point which should always be kept in mind is that statistical methods are not a substitute for good experimental technique. The statistical results obtained from a body of data are no better than the technique involved in the experiment. Further, statistical methods are not a substitute for sound professional judgement in interpretation of the results of an experiment. They are primarily an aid to interpretation. One final caution before proceeding: Be careful that the statistical analysis fits the experimental procedure. Many experimental designs which are quite similar require different methods of analysis. Statisticians have learned that a thorough knowledge of the details of the experimental procedure must be obtained before the method of statistical analysis can be decided upon.

II. OBSERVATIONS, SAMPLES AND POPULATIONS

Statistics, like any other science, has its own vocabulary; some terms have been borrowed from other fields while other terms have originated within the field. In making the transition to statistics, the definitions of the borrowed terms have undergone changes, so that the statistical definitions do not necessarily jibe with the definitions to be found in a dictionary. This section and the next will present the basic terminology to be used in later sections.

The basic building block of the statistical method is the observation. An observation is a measurement. It may be qualitative, such as the classification of dead or alive in an experiment on the effect of botulinus toxin on an animal, or the serologic type in the classification of pneumococci. On the other hand, it may be quantitative, such as the optical density or per cent light transmission of a bacterial culture in liquid medium, the number of colonies on a plate, or the number of animals that die in a cage of rats given a fixed dose of botulinus toxin. No matter whether the unit being measured is a single entity or a group, as long as a single measurement represents the unit, the measurement is an observation.

A sample is a group of observations drawn from a population (called a universe by some). The membership of a population or sample is determined by what is being studied. To illustrate this, consider a problem in clinical bacteriology. If a patient has a septicemia, the primary interest will be to determine the etiologic agent to start therapy. Then, the population will be the organism or organisms (in mixed infections) that are in the blood of the patient. The sample will consist of the specific type or types of organisms found in the blood drawn from the patient and sent to the laboratory. On the other hand, the study might consist of finding what organisms are found in septicemia cases in a hospital during a given year. Then, the type or types of organisms of a patient will become a sample in the population of types found in all of the septicemia cases in the hospital during the year. Again, the study may be broader than this: the problem might be to determine what organisms are found in septicemia cases in general. If so, the organisms found in cases in a hospital now become a sample in this larger population. Thus, populations and samples will change with the primary interests of the study. As interests broaden, what used to be populations become samples from still larger populations.

A point to be derived from this discussion is that a population is automatically defined as soon as the problem to be studied is defined. Samples are obtained to shed light on the characteristics of the population being sampled. For example, when a physician treats a septicemia patient with one of the standard antibiotics, he is primarily interested in the well-being of his patient; he is interested in the treatment only insofar as it will improve the condition of the patient. If, however, he tests a new experimental antibiotic on the patient, he is interested not only in the well-being of this particular patient but also in the effect of the new treatment on the countless other patients who may some day have the same condition. In this way, the effect of the new antibiotic on this patient becomes an observation taken from the population made up of the effect on all patients who may suffer from the same infection. When he and other physicians testing the same new treatment gather these observations together, they have a sample which may help in shedding light on the efficacy of the new antibiotic.

The data obtained from a laboratory experiment are, in the vast majority of cases, nothing more than a sample drawn from a population defined by the materials and methods used in the experiment. If the investigator who ran the experiment (or some other investigator) attempts to reproduce the results, the data obtained represent a second sample from the same population, as long as the materials and methods remain the same.

Most of the populations which are sampled in laboratory experiments are infinite populations. For example, in studying the metabolism of some particular strain of Escherichia coli, the number of possible organisms of this type would indeed be infinite. The laboratory studies are done on samples from this infinite population. Metabolic studies on E. coli are aimed at determining the characteristic metabolic processes of the entire infinite population. In other words, we must work with samples, but our ultimate aim is the characterization of an infinite population which we cannot study in its entirety. This is the basis of experimental science. It is at the point of generalization from sample results to population characteristics that statistical methods offer a standard approach as well as a potent tool. There are, of course, problems where the population defined is not infinite. For example, a study to determine the etiologic agent for a septicemia case without a mixed infection would define a population of one member, namely the type of organism involved. Studies to determine what organisms are found in septicemia cases in a year would also define a finite population with more than one member.

The size of a sample is limited only by the size of the parent population (the population from which a sample is drawn is called the parent population). A sample may contain only one observation, or, if the population is of finite size, the sample may contain all of the members of the parent population; this is referred to as complete sampling. It is, of course, impossible to have complete sampling on an infinite population.

One concept which will be needed later on is that of a population of all possible samples of a given size. To illustrate a population of this type, consider a parent population of Streptococcus pyogenes (the cocci themselves). Now, the population of all possible samples of size five will contain all of the possible combinations of five cocci which may be drawn from this parent population. It must be noted that any one particular coccus may appear in a great many different samples constructed in this manner, since the manner of construction is equivalent to returning each sample to the population before the next sample is made up. In a population of this type, the members of the population are samples of five cocci. Any sample of five cocci which could be drawn from the parent population will be a member of this population of samples.

A type of sample which is used to a great extent is the random sample. A random sample is a sample taken in any manner which gives each member of the parent population an equal chance of appearing in the sample, with the additional condition that once any particular member is chosen it does not affect the chance of any other of the members appearing in the sample. There are many ways in which this can be done. One method which could be used is to assign a number to each member of the parent population, write these numbers on separate slips of paper, place the slips in a hat, mix them thoroughly, and then draw several slips from the hat. The members whose numbers were drawn would then appear in the sample. A sample of this type would give each of the members an equal chance of appearing in the sample and the appearance of any particular member in the sample does not affect the chance of the other members appearing in the sample.
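The two ideas above, the population of all possible samples of a given size and the drawing of slips from a hat, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ten-member parent population, the member labels, and the helper name `draw_random_sample` are hypothetical and do not come from the paper, and a real parent population of cocci would of course be far larger.

```python
import math
import random
from itertools import combinations


def draw_random_sample(population, size, seed=None):
    """Draw `size` distinct members, like slips pulled from a well-mixed hat."""
    rng = random.Random(seed)  # a seed is accepted only to make the draw repeatable
    return rng.sample(population, size)


# Hypothetical finite parent population of ten numbered cocci (an assumption).
parent = [f"coccus_{i}" for i in range(1, 11)]

# Population of all possible samples of size five: every combination of five members.
all_samples = list(combinations(parent, 5))
n_samples = len(all_samples)  # equals math.comb(10, 5)

# Any one coccus appears in many of these samples.
n_with_first = sum(1 for s in all_samples if "coccus_1" in s)

# One random sample of size five from the parent population.
sample = draw_random_sample(parent, 5, seed=0)

print(n_samples, n_with_first, sample)
```

With ten members there are 252 possible samples of size five, and any one particular coccus appears in 126 of them, which is the sense in which a single coccus "may appear in a great many different samples".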

Describing a Population

If all of the members of a population were alike, it would be a simple matter to describe the population by giving the number of observations in the population and the common measurement possessed by all of the members. Most populations met in research are not of this sort, however. For example, the size of streptococci will vary with species, strain and even with the individuals within a strain. Description of populations of this type presents a problem. In small populations, it is not hard to list each of the observations, but in large populations such a thing would be a hopeless task. Even though all of the members of a large population might be listed, it would be virtually impossible to determine, from the listing, just how the observations were distributed, i.e., whether the observations are clustered about some central value, what amount of variation exists and so on. The following aids may be used to help in the description of a population; they may also be used in describing samples. In fact, they are used more in the description of samples than in the description of populations since populations are seldom completely known and since data are usually only a sample from a population. Thus, although the following descriptions will be made in reference to populations, the reader must understand that the methods may also be applied to samples.

Frequency tables. One of the aids to the description of the distribution of the observations in a population is the frequency table. The frequency table is a listing, in tabular form, of the frequency (number of observations or fraction of the total number of observations) with which observations occur for each of the possible different measurements contained in the population. For example, if the population were the pneumococcal pneumonia cases occurring in a given year, and the measurement were the serologic type, the frequency table would contain a listing of the number of cases for each of the different serologic types. It is often advantageous to collect the measurements into intervals or groups. For example, if the population were a stock colony of rats, and the measurement were the weight, the number of animals could be listed for each of several intervals of say five grams each. With the cases of pneumococcal pneumonia, the rare serologic types might be grouped together.

Bar graphs. Since it is often helpful to "see" a population, graphs are useful. Although innumerable types of graphs might be mentioned, only the ones that are commonly used will be described. One type is the bar graph shown in figure 1. This particular type portrays populations in which the measurements are qualitative, such as the serologic types of pneumonia, or discrete quantitative measurements, such as the number of rats in groups of a given size that die with a standard dose of botulinus toxin. Each of the possible different measurements that occurs in the population is plotted along the abscissa (horizontal axis of the graph), and their relative frequencies are represented by heights of bars (or lines) drawn above each of the measurements. Each of the members of the population is represented by equal units of height along the ordinate (vertical axis of the graph). Thus, the height of each of the bars is proportional to the relative frequency for the measurement involved.

Figure 1. Bar graph (see text for detailed explanation). Axes: measurement (abscissa), frequency (ordinate).

Histograms. A type of graph which is used to portray continuous measurements, such as the diameters of streptococci, is the histogram (see figure 2). The different possible measurements are again plotted along the abscissa. Each member of the population could again be represented by a line above its particular value of the measurement. However, in the histogram, the abscissa is divided into intervals, and the relative frequencies with which the members of the population fall in each interval are represented by rectangles constructed above the intervals. Here, each member of the population is represented by equal units of area under the rectangles. Thus, the area under each rectangle is proportional to the relative frequency for the interval of measurements involved. This latter point is extremely important in the construction of histograms where the intervals along the abscissa are unequal. If the intervals along the abscissa of the graph are of equal length, the relative frequency will be proportional to the height of the rectangle; however, if the intervals are unequal in length, the heights of the rectangles must be proportional to the average number of members of the population for each unit of length of the interval. For example, consider two adjacent intervals, both containing five members of the population. Let the first interval be five units in length and the second be one unit in length. Now, if the heights of the rectangles are made proportional to the relative frequencies, both rectangles will be five units high. Thus, the viewer will be led to believe that the relative frequencies per unit of length of the intervals are the same (see figure 3). This is obvious nonsense, since there is, on the average, only one member per unit length in the first interval and five in the second interval. Therefore, to present a true picture of the relative frequencies in the two intervals, the rectangle above the first interval must be one unit in height and the rectangle above the second interval will be five units in height (see figure 4). This will make the areas of the two rectangles the same, as they should be, since both rectangles represent the same number of members of the population.

Figure 2. Histogram. Axes: measurement (abscissa), frequency per unit of measurement (ordinate).

Figure 3. Axes: measurement (abscissa), frequency (ordinate).

Figure 4. Properly plotted histogram. Axes: measurement (abscissa), frequency per unit of measurement (ordinate).

Frequency polygons. An alternative method of presentation of the information in a histogram is the frequency polygon shown in figure 5. The frequency polygon is made by connecting the centers of the tops of successive rectangles in a histogram by straight lines. The frequency polygon has the advantage of giving the viewer the impression of continuity which is inherent in the measurements being portrayed. It is for this reason that a frequency polygon should not replace the bar graph, since the measurements represented by the bar graph are not continuous. The frequency polygon does not maintain the relationship between area and the relative frequency given by the histogram.

Figure 5. Frequency polygon for histogram of figure 2. Axes: measurement (abscissa), frequency per unit of measurement (ordinate).

Frequency curves. A specialization of the histogram which will be of use is the frequency curve. If a small sample were drawn from an infinite population, a histogram of the sample could be made with fairly wide intervals along the abscissa. If the number of observations in the sample is increased, a smoother picture of the distribution of the sample will be obtained by decreasing the size of the intervals. Thus, by increasing the size of the sample and decreasing the width of the intervals, the top of the histogram can be made to approach a smooth curve. The curve which is approached by a histogram by letting the size of the sample increase to infinity, that is by approaching complete sampling, and by letting the width of the intervals become infinitesimal is called the frequency curve of the distribution of the population (see figure 6).
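The rule for unequal intervals in the Histograms paragraph can be checked with a small calculation. The following Python sketch tabulates the frequency in each interval and sets the rectangle height to the frequency per unit of interval length. The function name `histogram_heights` and the ten data values are hypothetical and were chosen only to reproduce the worked example of two intervals, five and one units long, holding five members each.

```python
def histogram_heights(values, edges):
    """Return one (count, width, height) row per interval [edges[k], edges[k+1]).

    The height is the count divided by the interval width, so that the area
    of each rectangle (width * height) equals the count in that interval.
    """
    rows = []
    for lo, hi in zip(edges, edges[1:]):
        count = sum(1 for v in values if lo <= v < hi)  # frequency-table entry
        width = hi - lo
        rows.append((count, width, count / width))
    return rows


# Hypothetical measurements: five fall in [0, 5) and five fall in [5, 6).
values = [0.5, 1.5, 2.5, 3.5, 4.5, 5.1, 5.3, 5.5, 5.7, 5.9]
edges = [0, 5, 6]

rows = histogram_heights(values, edges)
areas = [width * height for _, width, height in rows]

for (count, width, height), area in zip(rows, areas):
    print(f"count={count} width={width} height={height} area={area}")
```

The first rectangle comes out one unit high and the second five units high, and both have an area of five, matching the properly plotted histogram described in the text.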


Figure 6. Frequency curve for normal distribution. (Vertical axis: frequency; horizontal axis: measurement, marked at $\mu - 2\sigma$, $\mu$, and $\mu + 2\sigma$.)

Here, again, the area under the curve within a given interval of the abscissa is proportional to the relative frequency with which members of the population fall within the interval.

Parameters and Statistics

Parameters. We have discussed how tables and graphs can be used to describe a population. Although these methods are frequently the most useful, distributions can also be described by the use of certain constants called parameters (the number of parameters that are required to describe a population completely will vary with the distribution). A parameter is a characteristic of a population. Actually, a parameter may be defined as any function (in the mathematical sense) of the measurements of the members of a population. For example, if a population consists of the measured diameters of the streptococci on a slide (this would, indeed, be a trivial population), the average diameter of the cocci would be a parameter, as would the smallest and largest diameters, the total of the diameters, or even [...]. Although the number of possible parameters is thus unlimited, two parameters are used to a greater extent than the others. These are the mean and the variance.

The mean of a population is nothing more than the arithmetic average of the measurements of the members of the population. Thus, to obtain the mean, add the measurements for each of the members of the population and divide by the number of members. The mean of a population is denoted by the lower case Greek letter mu ($\mu$).

To save space later on, it is advisable to pick up some mathematical notations at this time. For the sake of convenience, let us number the members of the population and let the number of members in the population be represented by $N$. Thus, if there are 1000 members in the population, $N$ will be 1000 and the members will be numbered from 1 to 1000. $N$ does not have to be finite, as it is here, but may be infinite for infinite populations. Now, let $X$ be the measurement which is taken for each of the members. The measurement for the first member may be denoted by $X_1$, the measurement for the second member as $X_2$, and so on up to the measurement for the last member of the population, which will be $X_N$. Matters can be simplified even further by letting $X_i$ be the general term for any one of the $N$ measurements; when $i$ is equal to 1 we obtain $X_1$, the measurement for the first member. With this general term, we can denote the measurements of the population as being³

$$X_i \quad (i = 1, 2, \ldots, N) \tag{2.1}$$

Symbol 2.1 tells us that the measurement of the $i$th member of the population is denoted by $X_i$ and that $i$ can be anything from 1 to $N$ (the dots stand for all of the intervening terms between 2 and $N$). Now, the only thing necessary to define the population mean is a symbol which will tell us to add up all of these measurements. The capitalized Greek letter sigma ($\Sigma$) is used as a summation sign. Thus, we can define the population mean by

$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} \tag{2.2}$$

The indices $i = 1$ and $N$ tell us to add all of the $X_i$'s for $i$ equal to 1 up to $i$ equal to $N$.

The population mean is a parameter [...]. The variance, on the other hand, is a parameter which is designed to give a measure of the spread or variability of the population. The variance, denoted by the lower case Greek letter sigma square ($\sigma^2$), is the average of the squared deviations, from the population mean, of the measurements which make up the population, i.e.,

$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} \tag{2.3}$$

³ The following notation is used for numbering equations and symbols: the number preceding the decimal point is the section number and the number following the decimal point is the number of the equation within the section. Thus, symbol 2.1 is the first numbered symbol in the second section.
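As a small worked instance of equations 2.2 and 2.3, the sketch below computes the mean and variance of a tiny finite population. The five diameters are invented for illustration (they are not data from the review), and the function names are hypothetical. The standard-library functions `statistics.mean` and `statistics.pvariance` are used only as a cross-check, since `pvariance` also divides by $N$.

```python
import statistics


def population_mean(measurements):
    """Equation 2.2: add the N measurements and divide by N."""
    N = len(measurements)
    return sum(measurements) / N


def population_variance(measurements):
    """Equation 2.3: average squared deviation from the population mean."""
    N = len(measurements)
    mu = population_mean(measurements)
    return sum((x - mu) ** 2 for x in measurements) / N


# Hypothetical population: diameters (in microns) of the five cocci on a slide.
diameters = [0.6, 0.8, 0.7, 0.9, 1.0]

mu = population_mean(diameters)
sigma_sq = population_variance(diameters)
print(f"mu = {mu:.3f}, sigma^2 = {sigma_sq:.3f}")

# Cross-check against the standard library (both divide by N).
print(statistics.mean(diameters), statistics.pvariance(diameters))
```

For these five values the mean is 0.8 and the variance is 0.02, up to floating-point rounding.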


To determine the variance of a population, subtract the population mean from each of the measurements in the population, square the resulting numbers, add them, and divide by the number of members in the population.

There are, of course, other possible parameters which would give a measure of the spread of the population, and the reader may wonder why such a complex parameter should be chosen as the one to use. To give an adequate answer to this question would require a jaunt into theoretical statistics, but one reason is that the variance has several advantages not enjoyed by other parameters of spread. One, which will be put to use at a later point in the paper, is that the variance can be split into component parts, i.e., if the variation in a population arises from different sources, the variance of the population may be split into terms representing the amount of variation arising from each source. Similarly, variance terms can be combined, i.e., if we have a laboratory procedure which requires several steps, and the variance introduced by each step is known, the variances for the individual steps can be combined to obtain a variance for the complete procedure. Another reason for the choice of the variance lies in its position of importance in the normal distribution, which will be discussed in later sections. Since the normal distribution is of great significance in many types of statistical problems, and since a normally distributed population can be completely defined in terms of its population mean and variance, we have another good reason for the choice of the variance as a parameter of spread or dispersion.

The square root of the variance, called the standard deviation and denoted by the lower case Greek letter sigma ($\sigma$), will be used to a large extent in later sections of this paper; the equation of its definition is

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} \tag{2.4}$$

The standard deviation has an advantage over the variance in that taking the square root brings the standard deviation into the same units of measurement as that for the members of the population. Thus, if population members are expressed in grams, the population mean and the population standard deviation will also be expressed in grams.

Statistics. Populations may be characterized by their parameters. A similar characteristic for a sample is called a statistic. Thus, a statistic is a characteristic of a sample and may be defined as any function (in the mathematical sense) of the observations in a sample. There are an unlimited number of possible statistics just as there are an unlimited number of possible parameters. An example of a statistic would be the average diameter of a sample of the streptococci on our slide.

The usefulness of any particular statistic will depend upon its ability to estimate a parameter of the parent population. To discuss the ability of a statistic to estimate a parameter, we will need to define and draw a distinction between the terms estimator and estimate, as applied to a statistic. An estimator is the mathematical procedure used to determine a statistic and an estimate is the actual number obtained when this procedure is applied to a particular sample. An estimator will remain the same from sample to sample (for a given statistic), but estimates will vary from sample to sample. As an illustration, consider samples from the slide of streptococci: the average diameter of the cocci will vary from sample to sample, but the method of determining the average diameter will remain the same.

One other concept which will be needed before proceeding is the concept of an expected value. The expected value of a function of the measurements in a population is the average value of the function for the entire population. The expected value of a function is denoted by placing the capital letter $E$ in front of the function, as in equation 2.5. To illustrate the expected value, consider two examples of its use. First, consider the expected value of the measurements of a population. The expected value is the average value of all of the measurements in the population, thus,

$$E(X) = \frac{\sum_{i=1}^{N} X_i}{N} \tag{2.5}$$

But the right hand side of equation 2.5 is the definition of the population mean, therefore,

$$E(X) = \mu \tag{2.6}$$

Next, consider the expected value of the squared deviations of the measurements from the population mean. This would be the average value of these squared deviations for the entire population, or,

$$E[(X - \mu)^2] = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} \tag{2.7}$$

The right hand side of equation 2.7 is the definition of the variance for the population, therefore,

$$E[(X - \mu)^2] = \sigma^2 \tag{2.8}$$

Thus, the two commonly used population parameters might have been defined by the use of expected values.

The concept of an expected value can also be applied to a statistic. The application stems from the population of all possible samples of a given size. It has been pointed out previously that there is a population of all of the possible samples of a given size which could be drawn from a parent population. The estimates obtained by applying a given estimator to each of the samples in this population of samples would also constitute a population. Having thus obtained the population of estimates, the expected value of these estimates can be determined.

There are several criteria for judging how well a statistic estimates a parameter. One criterion that will be important in later discussions is bias. Let the population parameter be denoted by the lower case Greek letter theta ($\theta$) and a member of the population of estimates obtained by using the estimator by theta hat ($\hat{\theta}$). Now, an estimator is said to be unbiased, that is, the method of estimation is said to be unbiased, if

$$E(\hat{\theta}) = \theta \tag{2.9}$$

In words, the estimator is unbiased if the mean of the population of estimates is equal to the parameter of the parent population. If the mean of the population of estimates is not equal to the parameter of the parent population, then the estimator is said to be biased, and the difference is called the bias, i.e.,

$$\text{bias} = E(\hat{\theta}) - \theta \tag{2.10}$$

If the bias is negative, that is, if the parameter is greater than the expected value, the estimator is said to be negatively biased. Similarly, if the bias is positive, that is, if the parameter is less than the expected value of the estimate, the estimator is said to be positively biased. The terms concerning bias may also be used in other conditions, as will be seen later on.

The two statistics most commonly used for estimating the population mean and population variance are the sample mean and the sample estimate of the variance. To distinguish samples from populations in equations for defining statistics, new notations will be set up for samples. We denoted the size of a population by $N$ and we will denote sample size by $n$. We denoted the measurement for a member of a population by $X_i$ and we will denote the same measurement for an observation in a sample as $x_i$. Thus, the observations in a sample of size $n$ may be denoted by

$$x_i \quad (i = 1, 2, \ldots, n) \tag{2.11}$$

The sample mean is denoted by the lower case letter x bar ($\bar{x}$), and may be defined by

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \tag{2.12}$$

Thus, the sample mean is the arithmetic average of the observations in the sample. The sample mean is an unbiased estimate of the population mean.

The sample estimate of the variance is denoted by the lower case letter s square ($s^2$) and is defined by

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \tag{2.13}$$

The reader will note that the divisor is $n - 1$, while the divisor of the population variance was $N$ as shown in equation 2.3. There are several reasons for using the number $n - 1$ instead of $n$, one of them being that $s^2$, as defined, is an unbiased estimate of $\sigma^2$ for the infinite populations which are so important in laboratory experimentation, that is,

$$E(s^2) = \sigma^2 \tag{2.14}$$

The number $n - 1$ is the number of degrees of freedom of the sample estimate of the variance. In mathematical parlance, the degrees of freedom may be defined as the number of "independent variables" which go to make up $s^2$. One method of determining the number of "independent variables" is to answer the question: If we wish to know the value of each of the observations in the sample, and we are given the mean of the observations, how many of the original observations must we have? The answer is one less than the total number of observations in the sample (if we know all but one of the observations, we may determine the value of the missing observation by subtraction). Another way of looking at the number of degrees of freedom is to say that when the sample was first taken, there were $n$ degrees of freedom. One of these was used in determining the mean, so there are $n - 1$ degrees of freedom left for the sample estimate of the variance.

The numerator of the right hand side of equation 2.13 for $s^2$ is called the sum of squares of deviations from the mean. This term may be abbreviated as S.S. Thus,

$$\text{S.S.} = \sum_{i=1}^{n} (x_i - \bar{x})^2 \tag{2.15}$$

The denominator is the degrees of freedom, abbreviated d.f. (some authors use only f); therefore, the sample estimate of the variance may be defined as being the S.S. divided by the degrees of freedom. Thus,

$$s^2 = \frac{\text{S.S.}}{\text{d.f.}} \tag{2.16}$$

The abbreviations S.S. and d.f., as well as the definition given by equation 2.16, will be used to a great extent in the later sections of this review.

There is also a sample estimate of the population standard deviation, namely, the square root of the sample estimate of the variance, denoted by the lower case letter $s$. The equation of its definition would be

$$s = \sqrt{\frac{\text{S.S.}}{\text{d.f.}}} \tag{2.17}$$

The calculation of the sample mean and the sample estimate of the variance, as well as the sample estimate of the standard deviation, will be illustrated many times in later sections, so no illustrations will be given at this point.

III. PRECISION AND ACCURACY

Precision

Measurements are subject to variation. For example, if repeated Kjeldahl determinations are run on a sample of protein, or repeated determinations are done on the radioactivity of a specimen, it would indeed be rare to get the same results on every try. If there were no variation in these measurements, the measuring device used, or the procedure used, would be said to be precise, under the usual dictionary definition of the word. However, in biological work, precise devices or procedures seldom if ever exist under this definition. Instead, the measuring devices have varying precision. The smaller the variation, the greater the precision. A more useful definition of the word precise allows the devices or procedures of great precision to be called precise while those devices whose measurements are subject to an amount of variation which exceeds some allowable amount are spoken of as being imprecise (under the dictionary definition of these terms, all devices and procedures are imprecise). These definitions will be used in this review. Precise and imprecise are relative; their definition will vary with the unit being measured.

Variance as an index of precision. The precision of measuring devices or laboratory procedures is of the utmost importance to the worker who uses them. Precision and reliability are inseparable in laboratory work. Of even greater importance is that the worker know the precision of his tools. There is no direct measure of precision, but an index of precision is available since there is a measure of the amount of variation, namely the variance. The variance is an inverse function of the precision: the greater the precision, the less the variance.

Consider the problem of determining the average diameter of one of the species of streptococci, say, Streptococcus pyogenes. It is an easy matter to obtain a sample of the population of the cocci themselves; simply inoculate a tube of broth, let it incubate for a sufficient time at the proper temperature, withdraw a loopful and make a smear on a glass microscope slide. The cocci on the slide would be a sample of the population of the cocci. The problem of obtaining a sample of the diameters is another matter. Each of the cocci on the slide has a certain "true" diameter. The true diameters of the cocci on the slide cannot be measured, but the diameters may be estimated with measurements obtained by using an ocular micrometer on a fixed and stained preparation. Even apart from errors in measurement because of distortion of size from fixing and staining, these measurements by the ocular micrometer will be subject to error and will not be the same as the true diameters of the cocci. A measurement obtained from a measuring device is called an estimate.

The reader will note that apparently there are two definitions for the word estimate. The word estimate was used in the last paragraph to denote the measurement given by some measuring device; previously it was used to denote a statistic derived from a sample. Actually, both of these fall under the same broad definition of an estimate. With both we are trying to estimate a true value; in the first a measuring device is being used as the estimator to estimate the true value of a member of a population and in the second a mathematical procedure is used as the estimator to estimate the true value of a parameter of a population. Thus, no matter whether the estimator is a measuring device or a mathematical procedure, the numerical value obtained by its application is called an estimate.

If one coccus from our slide of cocci were taken, a population of the estimates of its true diameter could be obtained by repeated measurements of its diameter using an ocular micrometer or some other similar device. The variance of this population of estimates would be an index of the precision of the measuring device. Another way of stating this is that the variance of this population of estimates is an index of the precision of an estimate, that is, the variance gives us an index of the precision of a single measurement.

Methods of increasing precision. Many times, a measuring device does not give the needed precision. In this event, another measuring device which does have sufficient precision to meet our needs may be available. Often, however, a device of sufficient precision is not obtainable. To bypass this obstacle, advantage may be taken of the fact that the mean of several estimates has greater precision than a single estimate. The relationship of the variance of a mean of several measurements of the same object, denoted by $\sigma_{\bar{x}}^2$, and the variance of a single measurement, $\sigma^2$, may be stated mathematically. The relationship between the variance of the mean of $n$ measurements and the variance of a single measurement is

$$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} \tag{3.1}$$

Thus the variance of the mean of several estimates is equal to the variance of a single measurement divided by the number of estimates which go to make up the mean.

One point which must be brought out in this connection is that equation 3.1 holds only if the $n$ measurements are taken on a single object or if a single measurement is taken on each of $n$ individuals, and only if the $n$ measurements thus taken are independent. By independent we mean that the size of a given measurement does not depend in any way on the size of any of the preceding measurements.

In general, if we are going to take measurements on more than one individual it is better to take more than one measurement on each, since variation among individuals introduces a new source of variation above and beyond that of the measuring device. For example, the size of streptococci will vary with species, strain, and even with individuals within a particular strain. Let us confine our attention to measurements of individuals within a particular strain. Variation exists among these cocci, in that they will not all have the same diameter. A measuring device has a certain amount of variation, and the cocci also have a certain amount of variation. Thus, the variation in means of estimates obtained by taking several measurements on each of several cocci will be due not only to the variation from the measuring device, but also to the variation among the cocci. This means that the variance of the mean taken in this manner depends not only on the variance of the measuring device but also on the variance of the true diameters of the cocci. Here again, the relationship among these variances may be stated mathematically. Denote the variance among the true diameters of the cocci within the chosen strain by $\sigma_c^2$ and the variance due to the measuring device as $\sigma_d^2$. Now, take the mean of the estimates for $n_c$ of these cocci with $n_d$ determinations per coccus (take the same number of determinations, $n_d$, on each coccus). Then the relationship among the variances may be stated by the following equation.

$$\sigma_{\bar{x}}^2 = \frac{\sigma_c^2}{n_c} + \frac{\sigma_d^2}{n_c n_d} \tag{3.2}$$

Equation 3.2 reduces to a special case of equation 3.1 if a single measurement is taken on each coccus ($n_d = 1$). The relationship among the variances becomes more complex when the number of measurements varies among the different cocci; this relationship will not be discussed.
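Equations 3.1 and 3.2 can be put to work with a few lines of arithmetic. The sketch below is illustrative only: the two component variances are invented values, the function names are hypothetical, and the reading of the "special case" (equation 3.1 applied with a single-measurement variance of $\sigma_c^2 + \sigma_d^2$) is an interpretation of the text rather than something the review spells out.

```python
def variance_of_mean(var_single, n):
    """Equation 3.1: variance of the mean of n independent measurements."""
    return var_single / n


def variance_of_nested_mean(var_cocci, var_device, n_cocci, n_per_coccus):
    """Equation 3.2: mean over n_cocci cocci with n_per_coccus determinations each."""
    return var_cocci / n_cocci + var_device / (n_cocci * n_per_coccus)


# Hypothetical component variances (square microns), chosen only for illustration.
var_cocci = 0.04   # variance among the true diameters of the cocci
var_device = 0.01  # variance due to the measuring device

# Five cocci, one determination each, versus five cocci, four determinations each.
one_reading = variance_of_nested_mean(var_cocci, var_device, 5, 1)
four_readings = variance_of_nested_mean(var_cocci, var_device, 5, 4)

# With one determination per coccus, equation 3.2 matches equation 3.1 applied to
# a single-measurement variance of var_cocci + var_device (assumed interpretation).
special_case = variance_of_mean(var_cocci + var_device, 5)

print(f"1 determination per coccus:  {one_reading:.4f}")
print(f"4 determinations per coccus: {four_readings:.4f}")
print(f"equation 3.1 special case:   {special_case:.4f}")
```

Under these assumed values, repeating the determinations shrinks only the device term, so the variance of the mean falls from 0.0100 to 0.0085; the term for variation among cocci can be reduced only by measuring more cocci.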


The relationship shown in equation 3.2 is in- If the variance remains the same regardless ofadequate if streptococci from different species the size of the mean, then the coefficient of vari-and strains are measured, since the variance ation will not remain the same. The oppositeamong streptococci will depend upon the species, statement is also true, namely, if the coefficientstrains and individuals present. That is, the vari- of variation remains the same regardless of theance among the estimates would be a function of size of the mean, then the variance will not re-the variance among species, the variance among main the same. Sometimes neither the variancestrains within species, the variance among indi- nor the coefficient of variation will remain con-viduals within the strains and the variance due stant with changes in the mean. In this event, itto the measuring device. Equation 3.2 can be will be necessary to specify the variance or theextended to take care of the variance for all of coefficient of variation for each level of the meanthese sources. The main point to be gained from to discuss the precision of the measurements.the preceding discussion is that the precision of a However, if either the variance or the coefficientmethod can be increased by taking the mean of of variation remains the same with each level ofseveral measurements, and that the precision of the mean, the one which remains constant may bethis mean will depend upon the origin of the used as an index of precision for the measuringmeasurements which go to make up the mean. device or procedure. It is, of course, important to

Coefficient of variation as an index of precision. specify which of these two indices is being usedMany times a laboratory worker, discussing the since there is a definite difference between them.precision of one of his procedures, will state thathis procedure gives answers that agree within, Accuracysay, 5 per cent. This type of statement implies According to the dictionary a measuring de-that the amount of variation is 5 per cent ir- vice is accurate if the estimate obtained by therespective of the size of the true value, in contrast device is equal to the true value being estimated.to that of a worker who states that weighings on This type of definition places a great restrictiona rough balance are within 2 grams. The latter on the use of the term accurate, since even anstatement implies that the variance is the same imprecise measuring device will give the rightregardless of the size of the mean, whereas the answer part of the time; thus, it would be said tofirst statement implies that the coefficient of vari- be accurate at times and inaccurate at other timesation is the same regardless of the size of the mean. (actually it would be inaccurate more times thanThe coefficient of variation is a measure of the it would be accurate). For a device to be accurate

amount of variation in terms of per cent of the all of the time, it would have to be subject to nomean. The population coefficient of variation will variation and give the right answer every time. Asbe denoted here by the capital letters C.V. and pointed out before, only devices which are sub-the sample estimate of the coefficient of variation by ject to variation are available. This type of athe lower case letters c.v. The population coeffi- definition is too limited to be of much use.cient of variation is defined by-equation 3.3: A more practical approach is to apply the terms

100 used in discussing bias to the problem of measur-C.V. (3.3) ing devices. Their use makes it unnecessary to

define a group of new terms. Thus, a device is saidand sample estimate of the coefficient of vari- to be unbiased if the mean of the population ofation by: estimates is equal to the true value and it is

biased if the mean of the population of estimatesCAv. =100 (3.4) is not equal to the true value. The direction of its

bias is defined by stating that the device is nega-The standard deviation is used in the coefficient tively biased or positively biased, and the amountof variation rather than the variance because the can be stated.standard deviation is expressed in the same units The bias or lack of bias of a measuring deviceof measurement as the mean (and the original or procedure is of great importance to the labora-measurements). The coefficient of variation will tory worker, but it is something which can't bebe in per cent, since the numerator and denomi- determined directly by statistical methods. Fornator are in the same units. example, if the diameters of the cocci on the slide

on March 6, 2020 by guest

http://mm

br.asm.org/

Dow

nloaded from

Page 13: mmbr.asm.org · STATISTICAL CONCEPTS IN MICROBIOLOGY' ROEBERTL. STEARMAN2 DepartmentofBiostatistics, SchoolofHygieneandPublicHealth, TheJohnsHopkins University, Baltimore, Maryland

172 ROEBERT L. STEARMAN [VOL. 19

were measured, and then 10 millimeters added to proach a normal distribution with smalleach, no statistical procedure would pick up this samples while other distributions mayobvious bias in the results if the results alone were require quite large sample size.)considered. However, statistical methods will (c) Many naturally occurring populations maydetermine bias if additional information is be made to follow an approximatelyavailable. normal distribution by the use of a

Manymlethods may be used to determine the simple transformation of the measure-Many methods may be used to determine the ments of the members of the population.

bias of a device or method, for example, the com- For example, the logarithms of countsparison of the device with a standard. Thus, per minute in radioactivity measure-weights are tested against standard weights to ments of a specimen using an Autoscalerdetermine the bias of the weights. When a stand- are approximately normally distributed.ard is used, statistical comparison tests, to be (d) Statistical tests involving populationsconsidered later, will be of use. which are not normally distributed may

Another method of determimng the bias of a be simplified by use of normal approxi-method is by the recovery of added measurable mations. Examples of this will be givenmethodis by the recovery of added measurable in later sections.material. For example, in microbiological assays,known amounts of the metabolite being assayed There is nothing particularly "normal" aboutare added to unknowns and the difference, as the normal distribution. Although, at one time,measured by the assay, between the level of the some statisticians thought that all biologicalmetabolite in the unknown and the level in the populations would be normally distributed, thisunknown plus known added amount, is compared was soon shown to be a fallacy. The namewith the amount of the metabolite added. Here normal distribution has remained because ofagain, statistical comparison tests will be useful. usage and not because it represents the normalThe difference between bias and precision must state of affairs for biological populations.

be noted. Precision is concerned with the var- Parameters and statietics. A circle may beability of the estimates about their central value, defined by the use of its center and its radius.whereas bias is concerned with the difference If we wish to graph a circle we may do so if webetween the central value of the estimates and thetrue value being estimated. Thus, procedures parameter) and its radius (equivalent to amybe unbiased and imprecise, unbiased and paaetr an t ais(qlaettmay be unbiased and imprecise, unbiased and parameter of spread). In a somewhat similar

precise, biased and imprecise, or biased and pre- way, we may define a normal distribution by itscise. Any of these combinations may be met in mean and variance. The frequency curve for apractice. The best combination would be a pro- normally distributed population, a symmetrical,cedure both unbiased and precise. This would be a bell-shaped curve (see figure 6) is completelyprocedure that could be called accurate. defined in terms of the population mean and

variance. The only parameters which appear inIV. THE NORMAL DISTRIBUTION the mathematical equation for the curve are

Utility. The normal distribution is one of the these two (the reader may find the equation ofmost used and most important distributions in the curve in most elementary statistics books orstatistics. The usefulness of the normal distribu- reference 1).tion lies in four facts. Changes in the mean and variance of the

population result in changes in the position and(a) Many of the naturally occurring popula- spread of the curve. Changes in the mean result

tions are approximately normally di(- in shifting the curve to left or right along thetributed. abscissa while changes in the variance result in

(b) The means of large random samples from increasing or decreasing the spread of the curve.naturally occurring populations are An excellent discussion of the effect on the curvenormally distributed even if their parentpopulations are not normally distributed oftchanging the mean and variance, complete(The meaning of the word large will with graphs, isgiven by Bross (3) startig Onvary with the distribution of the parent page 202 of his highly recommended book.population. The means of random An important relationship which is used to asamples from some distributions ap- great extent is that 95 per cent of the area under

on March 6, 2020 by guest

http://mm

br.asm.org/

Dow

nloaded from

Page 14: mmbr.asm.org · STATISTICAL CONCEPTS IN MICROBIOLOGY' ROEBERTL. STEARMAN2 DepartmentofBiostatistics, SchoolofHygieneandPublicHealth, TheJohnsHopkins University, Baltimore, Maryland

19551 STATISTICAL CONCEPTS IN MICROBIOLOGY 173

the curve lies between the mean minus 2 standard deviations and the mean plus 2 standard deviations (to be more precise, 1.96 should be used instead of 2; however, for all practical purposes, 2 is close enough). Since the area under the curve is proportional to the relative frequency with which the measurements of the members fall within the interval, this means that 95 per cent of the members of the population will have measurements which lie between the mean minus 2 standard deviations and the mean plus 2 standard deviations.

As an illustration, if the distribution of weights (estimates) obtained by weighing a 1000-gram object on a rough balance was a normal distribution with mean 1000 and variance 9, then the standard deviation of the distribution would be the square root of 9, or 3 grams. Thus,

$$\mu - 2\sigma = 1000 - 6 = 994 \text{ grams} \tag{4.1}$$

and

$$\mu + 2\sigma = 1000 + 6 = 1006 \text{ grams} \tag{4.2}$$

therefore, 95 per cent of the weights obtained in this manner would lie between 994 grams and 1006 grams. One other relationship follows from the fact that the frequency curve for a normal distribution is symmetrical (one half the curve is a mirror image of the other half). Hence, the remaining 5 per cent of the population is divided equally between the two "tails" of the curve, that is, 2½ per cent of the members will have measurements which are less than the mean minus 2 standard deviations and 2½ per cent of the members will have measurements which exceed the mean plus 2 standard deviations. In the example, 2½ per cent of the weights obtained from the balance will be less than 994 grams and 2½ per cent of the weights will exceed 1006 grams.

One method for denoting the relationship of the area under the curve and a particular value of the measurements along the abscissa is by use of a subscript which tells what proportion of the population will have measurements which are less than the given value. For example, $X_{.025}$ is the value of a measurement such that 2½ per cent (a proportion of 0.025) of the population members will have measurements which are less than this value. Thus, for the normal curve

$$X_{.025} = \mu - 2\sigma \tag{4.3}$$

$$X_{.975} = \mu + 2\sigma \tag{4.4}$$

or in the example of the rough balance

$$X_{.025} = 994 \text{ grams} \tag{4.5}$$

$$X_{.975} = 1006 \text{ grams} \tag{4.6}$$

We shall use this sort of notation to a great extent in the discussion of statistical tests.

The two statistics which are used to estimate the population mean and variance are the sample mean and the sample estimate of the variance. If a random sample is taken from a normal population, the expected value of the sample mean is the population mean and the expected value of the sample estimate of the variance is the population variance; thus, the two estimators are unbiased for a parent population with a normal distribution.

Significance Tests

Basic principles. Certain bacteria which occur normally in throat cultures taken from healthy individuals are called the "normal flora." When a clinical bacteriologist finds only these organisms in a throat culture and if none of the organisms are there in abnormal quantity, the culture may be reported to the physician as a normal throat culture. However, if beta-hemolytic streptococci occur in the culture in large numbers, this finding would be rare in a normal throat, so the bacteriologist would report the finding of the beta-hemolytic streptococci in the culture. Other findings, also rare in a normal throat culture, would be reported to the physician as abnormal findings. Thus, throat cultures fall into two categories: the first, those cultures showing organisms which could be expected in normal throats, and the second, cultures that exhibit organisms of types or numbers rare in normal throat cultures. On receiving the laboratory findings, the physician may return a diagnosis of either a normal or an infected throat. If the diagnosis is no infection, and nothing is wrong with the patient's throat, then no error will be made. If the diagnosis is a throat infection and the patient has an infection of the throat, again, no error will be made. However, one type of error will be made if the diagnosis is an infection of the throat when none exists, and a second type of error will be made if the diagnosis is made that there is nothing wrong with the patient's throat when in truth there is an infection.

The basic principles of statistical tests are much the same as those for the diagnosis of the presence or lack of throat infection based on
laboratory findings on throat cultures. In a statistical test, a theory or proposition (called a hypothesis) is to be tested. We then determine what we would expect to find if the hypothesis is true and what we would expect to be a rare occurrence if the hypothesis is true. Having determined what would be normal and what would be rare if the hypothesis were true, the experimental data are examined to determine into which of these classes the data fall. If the data fall into the normal class, the hypothesis is not rejected; if the data fall into the rare class, the hypothesis is rejected. Here again, there are two types of error. An error is made if the hypothesis is rejected when it is true. This is called a type I error. An error is also made if the hypothesis is not rejected when it is false. This is called a type II error. The types of errors which can be made are summarized in table 1.

TABLE 1
Types of errors which can be made in a test of a hypothesis

Decision concerning          Status of the hypothesis
the hypothesis               True             False
Do not reject the hypothesis No error         Type II error
Reject the hypothesis        Type I error     No error

One of the things to be done for a test of a hypothesis (called a significance test) is to set up some sort of criterion for judging what is a rare occurrence and what is a normal occurrence. The method used in setting up this criterion is to place a limit on the rate at which type I errors will be made if the procedure is used on all possible samples drawn from the theoretical population in which the hypothesis is true. The most commonly used test criterion is set up in such a way that if the hypothesis is true, type I errors will be made only 5 per cent of the time. This is equivalent to saying that if the hypothesis is true, a discrepancy between the observed event and the hypothesis as large as or larger than the one obtained could only happen 5 per cent of the time by chance. The per cent of type I errors which will occur is called the significance level of the test. Thus, the most commonly used test criterion has a significance level of 5 per cent. Other tests may be set up in which the significance level is something other than 5 per cent; one other level which is used at times is the 1 per cent significance level.

To illustrate a significance test, consider an artificial example with the rough balance. Assume that the problem is to test whether the balance is unbiased in weighing a 1000-gram object. The object will be weighed only once. Now, if the population of estimates which could be obtained is normally distributed and if the variance for this population is known, a test may be set up for the hypothesis that the balance is unbiased, that is, that the expected value of the estimates on a standard weight weighing 1000 grams will be 1000 grams. Let us assume that although we haven't tested for bias before, we know from past experience with the balance that the variance for estimates is 9 (this is indeed an artificial example). Thus, the standard deviation will be the square root of 9, or 3 grams (this is not part of our hypothesis).

Before setting up the test, the alternatives to the hypothesis must be considered, that is, what can happen if the hypothesis is not true? There are two possible alternatives, namely, the balance may be negatively biased so that the estimates will, on the average, be less than 1000 grams or the balance may be positively biased so that the estimates will, on the average, be greater than 1000 grams. Now, the test must be set up so as to pick up a bias in either direction and still have a total of only 5 per cent of type I errors if the hypothesis is indeed true. To accomplish this we can use the fact that if the hypothesis is true, we have a normal distribution with mean 1000 and a standard deviation of 3. Thus, since 5 per cent of the members of such a population will either be less than 994 grams or more than 1006 grams (see equations 4.1 and 4.2), the significance level of the test will be 5 per cent if the hypothesis is rejected when the weight obtained is less than 994 grams and if the hypothesis is also rejected when the weight obtained exceeds 1006 grams. The hypothesis is rejected for these values since values as large as or larger than 1006 grams or as small as or smaller than 994 grams can only occur by chance 5 per cent of the time if the hypothesis is true.

Values which lead to the rejection of the hypothesis are said to lie in the critical region for the test. In the example of the rough balance, the critical region consists of two parts, namely, those values which are less than 994 grams and those values which exceed 1006 grams. The critical region in the example of the rough
balance consists of the two tails of the population curve; this type is called a two-tailed test. That it is a two-tailed test arises ultimately from the fact that there were two alternatives to the hypothesis. It is also possible to have a one-tailed test, which will arise if there is only one alternative to the hypothesis. For example, if for some reason it was known that the rough balance could only be positively biased, then a one-tailed test would be used. Thus the alternatives to the hypothesis are important in deciding whether a two-tailed or a one-tailed test will be used.

The test criterion is now ready. Note that this has been done without once seeing the data which will be used to decide whether the hypothesis will be rejected or not, since if a test is to be objective it must not be swayed by a prior knowledge of the data. Having the criterion for the test, we now proceed to obtain the data. Let us say that we weigh a 1000-gram weight using our rough balance and obtain an estimate from the balance of 1005 grams. Now, this estimate does not fall into the critical region for the test; therefore, the hypothesis that the balance is unbiased is not rejected.

If we continue to use a test of the sort taken for our rough balance, the critical region will change each time that a new hypothesis is chosen. It will also change each time that the variance of the population changes, that is, the critical region for a test of this type will not always be the same for every test. Each time a significance test of this sort is made, the critical region must be computed. Matters can be somewhat simplified by the use of another test (which will be called the u-test in this paper). This test will have the same critical region for each problem even if the hypothesis changes or if the variance of the population changes.

The u-test makes use of a simple transformation for normally distributed measurements. If y is an observation drawn from a normally distributed population with mean $\mu_y$ and variance $\sigma_y^2$, then the quantity u where

$$u = \frac{y - \mu_y}{\sigma_y} \tag{4.7}$$

is normally distributed with mean zero and variance 1. Thus any normally distributed population can be transformed to a standard normal distribution by means of this relationship. The advantage of this transformation is that even if the variance of the original population changes from problem to problem or if the hypothetical mean of the population changes due to a change of hypothesis, u will still have the same distribution. Thus, the critical region for a u-test will remain the same each time if the significance level and the number of alternatives to the hypothesis remain the same. Since the variance is 1, the standard deviation will be the square root of 1, which is 1. Thus

$$u_{.025} = \mu - 2\sigma = 0 - 2 = -2 \tag{4.8}$$

and

$$u_{.975} = \mu + 2\sigma = 0 + 2 = 2 \tag{4.9}$$

Therefore, the critical region for a two-tailed u-test, with a significance level of 5 per cent, will consist of values of u which are less than minus 2 and values of u which exceed plus 2. To illustrate the test, let us apply it to the problem of the rough balance. In the example,

$$u = \frac{1005 - 1000}{3} = \frac{5}{3} = 1.67 \tag{4.10}$$

Here again, the value obtained does not fall into the critical region; therefore the hypothesis that the balance is unbiased is not rejected. This is as it should be, since the two tests are identical.

When the hypothesis is not rejected, the statement is made that the observed value is not significantly different, statistically, from the mean given in the hypothesis and the significance level of the test is specified. Thus in the test of the rough balance, it is said that the estimate obtained is not significantly different, statistically, from 1000 grams at the 5 per cent significance level. When the hypothesis is rejected, the statement is made that the value obtained is significantly different, statistically, from the value given by the hypothesis and again the significance level is specified. The word significant as used in this connection is strictly a statistical term and has no connection with the dictionary definition of the term. Thus, if the statement is made that results obtained in an experiment are significantly different, statistically, from the results expected if the hypothesis were true, what is meant is only that the hypothesis is rejected at the given significance level and not that the results are world-shaking
or even that the observed difference is of any practical value. The practical aspects of the results of the test are given by a consideration of the original problem and by a consideration of the size of the difference in relation to the problem. A further discussion of significant versus practical differences will be given at a later point in this section.

Summarizing the steps which are taken in a significance test:
(a) Set up the hypothesis.
(b) Determine what the alternatives to the hypothesis are (this tells whether to use a two-tailed or a one-tailed test).
(c) Set up the significance level for the test (this tells the rate at which type I errors will occur).
(d) Using the significance level and the alternatives to the hypothesis, determine what the critical region is from the population defined by the hypothesis.
(e) Check the data to see whether or not they fall into the critical region.
(f) If the data fall into the critical region, reject the hypothesis; otherwise, do not reject the hypothesis.
These are the basic steps which will be taken in testing hypotheses with the statistical tests to be discussed in this paper.

In setting up a significance test, the per cent of type I errors (the significance level) was used to determine the critical region for the test. Nothing was said about the type II errors for the test. The level of type I errors could be used to set up the critical region for the test only after the hypothesis to be tested had been specified. To discuss type II errors, a particular alternative hypothesis must be specified. The rate at which type II errors will occur depends upon the difference between the alternative hypothesis and the hypothesis to be tested (called the null hypothesis) as well as on the significance level for the test. The rate at which type II errors occur increases as the mean specified by the alternative hypothesis and the mean specified by the null hypothesis get closer to each other. For example, if the null hypothesis states that the mean is 1000 grams, as in our previous example of the rough balance, an alternative hypothesis of 1005 grams will have a greater rate of type II errors than an alternative hypothesis of 1010 grams. The rate of type II errors will also increase by decreasing the significance level. For example, for a given alternative hypothesis, a test with a significance level of 5 per cent will have fewer type II errors than will a test with a 1 per cent significance level.

A term used in discussing type II errors is power. The power of a test is 100 minus the per cent of type II errors. Thus, if the level of type II errors was, say, 27 per cent, the power would be 73 per cent (100 - 27 = 73). The power of a test is, in a way, a measure of its ability to differentiate between the null hypothesis and the specified alternative hypothesis. Thus, as the power of the test increases, it is less likely that the null hypothesis will be accepted if the alternative hypothesis is the true state of affairs.

The power of a test also depends upon the variance of the population specified for the test. For example, if the variance in the example of the rough balance were larger than 9, then the power of the test against a particular alternative hypothesis would be less than it would be if the variance were 9. The variance of the parent population is something which controls the power of the test. The power of the test can be increased in the same way that precision is increased, that is, by using the mean of a sample for the test rather than a single observation.

Test of a single sample mean. The population of sample means of all possible samples of size n drawn from a parent population which is normally distributed with mean $\mu$ and variance $\sigma^2$ is also normally distributed with mean $\mu$ but with variance $\sigma^2/n$. Since these means have a normal distribution, the u-test (see previous section) applies. Letting y (in equation 4.7) be the sample mean from a random sample:

$$u = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \tag{4.11}$$

Here again, u has a normal distribution with mean zero and unit variance so the test is the same as before.

In all of the significance tests so far the variance of the parent population had to be known to set up the test, but seldom will that be known. Usually, we must rely on the sample estimate of the variance derived from a sample to estimate the variance of the parent population. If the sample estimate of the variance is substituted for the population variance into the equation for u (equation 4.7), the resulting quantity no longer follows a normal distribution.
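
The u-test on the rough balance, and the way power responds to the alternative hypothesis, the significance level, the variance, and the sample size, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names are hypothetical, the critical value of 2 is the text's approximation to 1.96, and the power calculation assumes a normal population with known standard deviation, as in the artificial example.

```python
from math import sqrt
from statistics import NormalDist  # standard library, Python 3.8+


def u_statistic(observed_mean, hypothesized_mean, sigma, n=1):
    """Equation 4.11; with n = 1 it reduces to equation 4.7."""
    return (observed_mean - hypothesized_mean) / (sigma / sqrt(n))


def in_critical_region(u, critical=2.0):
    """Two-tailed test; 2.0 is the text's approximation to 1.96."""
    return abs(u) > critical


def power(alternative_mean, hypothesized_mean, sigma, n=1, critical=2.0):
    """Proportion of tests that reject the null hypothesis when the
    population mean is really alternative_mean (assumed normal, known sigma)."""
    # Under the alternative, u is normal with this mean and unit variance.
    shift = (alternative_mean - hypothesized_mean) / (sigma / sqrt(n))
    standard_normal = NormalDist()
    lower_tail = standard_normal.cdf(-critical - shift)
    upper_tail = 1.0 - standard_normal.cdf(critical - shift)
    return lower_tail + upper_tail


# Rough balance: one weighing of 1005 grams, hypothesis mean 1000, sigma 3.
u = u_statistic(1005, 1000, 3)
print(round(u, 2), in_critical_region(u))  # 1.67 False

# Power (and type II error rate) against two alternative hypotheses.
for alternative in (1005, 1010):
    p = power(alternative, 1000, 3)
    print(alternative, "power:", round(100 * p), "per cent;",
          "type II errors:", round(100 * (1 - p)), "per cent")
```

Calling `power` with `critical=2.576` (roughly the 1 per cent level), with a larger `sigma`, or with `n` greater than 1 reproduces the qualitative statements above: the first two lower the power, and the last raises it.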

However, the distribution of the resulting ratio (called t), in which the sample estimate replaces the population variance, is known. That is, if y is an observation drawn from a normally distributed population with mean $\mu_y$ and if $s_y^2$ is a sample estimate of the variance of the population from which y is drawn, then the quantity t where

$$t = \frac{y - \mu_y}{s_y} \tag{4.12}$$

follows the t-distribution. The frequency curve of the t-distribution is similar in shape to that for the normal distribution and is also a symmetrical curve with its center at zero.

When we were dealing with the quantity u, the value of $\sigma^2$ was exact; therefore, the distribution of u did not vary with the size of the sample. With t, however, $s^2$ is an estimate and the precision of this estimate will vary with the size of the sample from which it is obtained. The distribution of t will therefore depend upon the size of the sample from which the sample estimate of the variance of y was obtained. When the distribution of t is computed, however, the degrees of freedom of $s^2$ are used rather than the size of the sample from which it was obtained. The advantage of this method will be seen when the t-distribution is applied to more complex problems. The distribution of t has been worked out for various degrees of freedom and the values of t for the critical region have been tabulated.⁴

⁴ Statistical tables are included in most elementary statistical textbooks. There are also special books of tables. Included among these are those of Arkin and Colton (4), Hald (5) and Fisher and Yates (6). Unless otherwise noted, all values used in this paper are those taken from Hald (5).

The use of the t-distribution is illustrated by the example given in table 2. A one per cent significance level and a two-tailed test will be used with a sample of size 10; therefore, d.f. = 10 - 1 = 9. Using a t-table we find our critical region for the test will include values of t which are less than -3.250 and values of t which exceed 3.250.

TABLE 2
Hypothetical sample of size 10 obtained by weighing a standard 100-gram weight on a rough balance

Order of    Estimate in
Weighing    Grams (x_i)    x_i - x̄    (x_i - x̄)²    x_i²
 1            100.1         -0.16       0.0256       10,020.01
 2            100.5          0.24       0.0576       10,100.25
 3            100.2         -0.06       0.0036       10,040.04
 4            100.5          0.24       0.0576       10,100.25
 5            100.4          0.14       0.0196       10,080.16
 6            100.6          0.34       0.1156       10,120.36
 7            100.3          0.04       0.0016       10,060.09
 8             99.9         -0.36       0.1296        9,980.01
 9            100.2         -0.06       0.0036       10,040.04
10             99.9         -0.36       0.1296        9,980.01
Total       1,002.6          0.00       0.5440      100,521.22

From column 2 of table 2:

$$\sum_{i=1}^{10} x_i = 1{,}002.6$$

Thus,

$$\bar{x} = \frac{\sum_{i=1}^{10} x_i}{n} = \frac{1{,}002.6}{10} = 100.26 \tag{4.13}$$

In column 4 of table 2

$$S.S. = \sum_{i=1}^{10} (x_i - \bar{x})^2 = 0.5440 \tag{4.14}$$

Therefore

$$s^2 = \frac{S.S.}{d.f.} = \frac{0.5440}{9} = 0.0604 \tag{4.15}$$

The null hypothesis is that $\mu = 100$ and from equation 4.13, $\bar{x} = 100.26$; therefore, letting y in equation 4.12 be the sample mean, $\bar{x}$:

$$t = \frac{\bar{x} - \mu}{s_{\bar{x}}} = \frac{\bar{x} - \mu}{\sqrt{s^2/n}} \tag{4.16}$$

or

$$t = \frac{100.26 - 100}{\sqrt{0.0604/10}} = \frac{0.26}{0.078} = 3.333 \tag{4.17}$$

This value of t lies in the critical region for the t-test; therefore the hypothesis that the balance is unbiased is rejected, and it is stated that the sample mean is significantly different, statistically, from the value given by the null hypothesis.

In table 2 and equation 4.14 the deviations of each of the observations from the sample mean were calculated, squared, and then added to obtain the S.S. The S.S. for a sample can be obtained by the use of a mathematical identity:

$$S.S. = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} \tag{4.18}$$
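
The t-test on the data of table 2, and the agreement between the two ways of computing the S.S., can be checked with a short Python sketch. The function names are hypothetical and the critical value 3.250 is simply the tabulated value quoted in the text; nothing here is taken from a statistics library. Because the sketch does not round the denominator to 0.078, its t comes out slightly above the 3.333 of equation 4.17.

```python
from math import sqrt

# Column 2 of table 2: ten weighings of a standard 100-gram weight.
weights = [100.1, 100.5, 100.2, 100.5, 100.4, 100.6, 100.3, 99.9, 100.2, 99.9]


def ss_from_deviations(xs):
    """S.S. as in equation 4.14: sum of squared deviations from the mean."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


def ss_from_identity(xs):
    """S.S. as in equation 4.18: sum of squares minus (total squared) / n."""
    return sum(x * x for x in xs) - sum(xs) ** 2 / len(xs)


def one_sample_t(xs, hypothesized_mean):
    """Equation 4.16, with s^2 = S.S. / d.f. as in equation 4.15."""
    n = len(xs)
    mean = sum(xs) / n
    variance = ss_from_deviations(xs) / (n - 1)
    return (mean - hypothesized_mean) / sqrt(variance / n)


T_CRITICAL = 3.250  # two-tailed, 1 per cent level, 9 d.f. (value quoted in the text)

t = one_sample_t(weights, 100.0)
print(round(ss_from_deviations(weights), 4))  # 0.544
print(round(ss_from_identity(weights), 4))    # 0.544
print(round(t, 2), abs(t) > T_CRITICAL)       # 3.34 True
```

One design note: the identity suited desk calculators well, but in floating-point arithmetic it subtracts two large, nearly equal numbers, so for data with a large mean and a small spread the deviation form is usually the safer choice.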

That is, to obtain the S.S. by equation 4.18: square the individual observations, add them, and subtract the square of the total of the observations divided by the number of observations in the sample. The reader can verify equation 4.18 by applying it to the data of table 2. Using equation 4.18:

$$S.S. = 100{,}521.22 - 100{,}520.676 = 0.5440$$

which is the same result obtained in equation 4.14. The mathematical identity shown in equation 4.18 saves time in computing the value for the S.S. when calculating machines are available. The identity is also used in setting up tables for more advanced statistical tests. Although either method may be used for computing an S.S. value, the identity will give a more precise answer if the mean of the sample is a number with a great many digits beyond the decimal point, since rounding off the number which represents the mean introduces an error which is magnified by the procedure of squaring and totaling the deviations from the mean.

Test of the difference between two treatments: paired samples. Until now, we have been testing hypotheses concerning a single sample mean. Many times we may wish to test the difference between two treatments. For example, suppose there is a standard method for isolating a given toxin from raw material and a new method is to be tested to determine whether any improvement is obtained in the yield. Or, the recovery of a known added amount of metabolite to an unknown in checking for bias in a new assay method may be tested.

One of the methods of comparing two treatments is the method of pairing, a method used to a great extent in scientific research. It attempts to make the group submitted to one treatment as nearly like the group submitted to another treatment as possible. An example would be a study to determine the effect of two treatments on rats. If a group of rats is to be divided into two subgroups, each of which would be given a different treatment, the entire group would be separated first into sets of two animals each. Each set would be made up of two animals which were as nearly alike as possible with respect to sex, age, weight, and so on. When the two groups for the treatments are selected, one animal is taken from each set of two for the first subgroup and the remaining animal from each set of two for the second subgroup. In this way, the two subgroups would be as nearly alike as we could make them.

Another method of pairing is used if a treatment has no lasting effect on the unit to which the treatment is applied. In this type of pairing, each unit is treated with one treatment and, when it recovers, is treated with the other treatment. Thus, each unit serves as its own control. An example of this would be obtained if two observers were to count the bacterial colonies on a given set of plates. Each of the observers would count the same plates. In this way, each plate would give a direct comparison of the two observers.

The data in table 3 (reference 7, table LVI, p. 105) are the results obtained when each of two observers (B and D) counted the number of colonies on each of 12 plates. These will be used to test for a differential bias of the two observers.

TABLE 3*†
Results obtained by two observers each counting the same 12 plates

              Plate Count
Plate No.   Observer B   Observer D   Difference (B-D)
R1             221          323           -102
R2             141          202            -61
R3              63           80            -17
R4             249          198            +51
R5             292          323            -31
R6              79           97            -18
P7             161          181            -20
P8             397          416            -19
P9             118          139            -21
P10             93          112            -19
P11             94           98             -4
P12            163          161             +2
Total        2,071        2,330           -259
Σx²        467,705      577,682         19,883

* This table originally appeared in The Bacteriological Grading of Milk, by G. S. Wilson, page 105, and is reproduced here with the permission of Her Majesty's Stationery Office, the copyright owner.
† Note that in this table the summation sign, Σ, is used without the indices i = 1 and 12 and that there is no index on x. As this is the usual practice when there is no doubt about what is being summed, indices will be used from now on only when necessary.

The last column of table 3 contains the differences between the counts of the two observers. In the analysis of a paired experiment, these differences (denoted by d) are treated as a sample from the population of differences. In this way the data may be analyzed by a test of a single sample mean. Letting y in equation 4.12 be $\bar{d}$, the average difference for the sample, we obtain

$$t = \frac{\bar{d} - \mu_d}{s_{\bar{d}}} \tag{4.19}$$

and, with our knowledge about the mean and variance of the population of sample means, equation 4.19 becomes

$$t = \frac{\bar{d} - \mu_d}{\sqrt{s_d^2/n}} \tag{4.20}$$

We will therefore use a t-test to test the differences.

The usual hypothesis on this test is that there is no difference between the two treatments or, in this example, the hypothesis is that both observers will, on the average, get the same plate count. With this hypothesis, $\mu_d = 0$. The alternatives to the hypothesis will be that the mean is greater than zero (observer B will obtain a higher count) or that the mean is less than zero (observer D will obtain the higher count). Therefore, the test is a two-tailed test. Use the 5 per cent level of significance for the test, and since there are 12 differences in the sample, there will be 12 - 1 = 11 degrees of freedom. From the t-table, $t_{.975}$ with 11 degrees of freedom is 2.201; therefore, the critical region for the test will be values of t which are less than -2.201 and values of t which exceed 2.201.

From column 4 of table 3:

$$\sum d = -259$$

therefore,

$$\bar{d} = \frac{\sum d}{n} = \frac{-259}{12} = -21.583 \tag{4.21}$$

Also from the same column,

$$\sum d^2 = 19{,}883$$

therefore, using equation 4.18,

$$\sum (d - \bar{d})^2 = \sum d^2 - \frac{\left(\sum d\right)^2}{n} = 19{,}883 - \frac{(-259)^2}{12} = 14{,}292.9167 \tag{4.22}$$

Since there are 11 degrees of freedom,

$$s_d^2 = \frac{\sum (d - \bar{d})^2}{n - 1} = \frac{14{,}292.9167}{11} = 1{,}299.3561 \tag{4.23}$$

Therefore, using equation 4.20,

$$t = \frac{-21.583 - 0}{\sqrt{1{,}299.3561/12}} = -2.0738 \tag{4.24}$$

The t-value obtained in equation 4.24 does not fall into the critical region of the test; therefore the mean plate counts of the two observers are not significantly different, statistically.

Test of the difference between two treatments: independent samples. The means from two samples can also be compared when the samples are not paired, but to develop the test for such a case, some additional information is needed, namely, the mean and variance for the difference between two sample means. This information is obtained from a more general bit of theory. If $y_1$ is an observation drawn from a parent population with mean $\mu_{y_1}$ and variance $\sigma_{y_1}^2$, and $y_2$ is an observation drawn from a parent population with mean $\mu_{y_2}$ and variance $\sigma_{y_2}^2$, then the mean of the difference between $y_1$ and $y_2$ (denoted by $\mu_{y_1 - y_2}$) is the difference between the means. That is

$$\mu_{y_1 - y_2} = \mu_{y_1} - \mu_{y_2} \tag{4.25}$$

The variance of the difference between $y_1$ and $y_2$ (denoted by $\sigma_{y_1 - y_2}^2$) is the sum of the variances. That is

$$\sigma_{y_1 - y_2}^2 = \sigma_{y_1}^2 + \sigma_{y_2}^2 \tag{4.26}$$

Now, if $y_1$ is a sample mean, $\bar{x}_1$, for a sample of size $n_1$ drawn from a parent population with mean $\mu_1$ and variance $\sigma_1^2$, and if $y_2$ is a sample mean, $\bar{x}_2$, for a sample of size $n_2$ drawn from a parent population with mean $\mu_2$ and variance $\sigma_2^2$, then

$$\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2 \tag{4.27}$$

and

$$\sigma_{\bar{x}_1 - \bar{x}_2}^2 = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \tag{4.28}$$

Equations 4.26 and 4.28 hold only for independent samples and not for paired samples.

If the two samples are drawn from parent populations with a normal distribution, a u-test
may be set up on the basis of the values of the mean and variance given in equations 4.27 and 4.28. That is,

$$u = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \tag{4.29}$$

will be normally distributed with mean 0 and variance 1. Thus, if the variances of the two parent populations were known, a u-test could be used to test some hypothesis concerning the difference between the means of the parent populations; however, this is not a common case in practice.

If the variances of the parent populations are not known, as is usually so, a t-test may be used with

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \tag{4.30}$$

The t-test for the difference between two sample means falls into two categories: one type of test if the variances of the parent populations are equal and another type of test if the variances of the parent populations are not equal. The test for the equal population variances will be illustrated first.

When the variances of the two parent populations are equal, the variance common to the two populations (denoted by $\sigma^2$) may be defined by

$$\sigma_1^2 = \sigma_2^2 = \sigma^2 \tag{4.31}$$

When $\sigma^2$ is substituted for the two population variances in equation 4.29:

$$u = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \tag{4.32}$$

Since the two variances are equal, $s_1^2$ and $s_2^2$ will both be estimates of the common variance, $\sigma^2$. Each of these estimates contains a certain amount of information about $\sigma^2$. Pooling all of this information about $\sigma^2$ would provide an even better estimate of this common variance; a method is available for doing this.

Since

$$s^2 = \frac{S.S.}{d.f.} \quad \text{(see equation 2.16)}$$

put $s_1^2$ and $s_2^2$ into this form and obtain

$$s_1^2 = \frac{S.S._1}{d.f._1} \quad \text{and} \quad s_2^2 = \frac{S.S._2}{d.f._2}$$

The pooled estimate of $\sigma^2$ (denoted by $s^2$) may be defined by

$$s^2 = \frac{S.S._1 + S.S._2}{d.f._1 + d.f._2} \tag{4.33}$$

In other words, the pooled estimate of $\sigma^2$ is equal to the sum of the S.S. divided by the sum of the degrees of freedom. The degrees of freedom of $s^2$ is $d.f._1 + d.f._2$.

When $s^2$ is substituted for $s_1^2$ and $s_2^2$ in equation 4.30:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \tag{4.34}$$

The degrees of freedom for this test is the degrees of freedom of $s^2$.

To illustrate the use of the t-test for t shown in equation 4.34, the data from table 3 that were used to illustrate the test in which we had paired samples will be used. Note, however, that the test for unpaired samples should not be used when the samples are paired. Although pairing is used to a great extent in scientific research, an all too common failing is that the worker doesn't take advantage of the pairing when the statistical analysis is done. Instead, he uses the t-test from equation 4.34, which is designed for unpaired samples. The reason that we will use the paired data to illustrate the test for unpaired samples is to show what happens when the wrong test is used.

Again, test the hypothesis that there is no difference in the mean plate counts for the observers, that is, test the hypothesis that $\mu_B - \mu_D = 0$. The alternatives to the hypothesis remain the same. A two-tailed test and the 5 per cent significance level will be used again. Since each observer has 12 plate counts and 11 degrees of freedom, $s^2$ and t will have 11 plus 11 or 22 degrees of freedom. From the t-table, $t_{.975}$ with 22 degrees of freedom is 2.074; therefore, the critical region for the test will consist of values of t which are less than -2.074 and values of t which exceed +2.074.

From table 3:

$$\bar{x}_B = 172.583 \tag{4.35}$$

$$S.S._B = 110{,}284.9167 \tag{4.36}$$

$$s_B^2 = 10{,}025.9015 \tag{4.37}$$

$$\bar{x}_D = 194.166 \tag{4.38}$$

$$\text{S.S.}_D = 125{,}273.6667 \tag{4.39}$$

$$s_D^2 = 11{,}388.5152 \tag{4.40}$$

Using equation 4.33,

$$s^2 = \frac{110{,}284.9167 + 125{,}273.6667}{11 + 11} = 10{,}707.2083 \tag{4.41}$$

Substitute these values in equation 4.34,

$$t = \frac{172.583 - 194.166 - 0}{\sqrt{10{,}707.2083\left(\dfrac{1}{12} + \dfrac{1}{12}\right)}} = \frac{-21.583}{\sqrt{1{,}784.5347}} = -0.511 \tag{4.42}$$

As the value of t obtained does not fall into the critical region of the test, the mean plate counts of the two observers are not significantly different, statistically. Although this is the same conclusion reached with the t-test for paired samples, compare the values of t obtained using the two methods. From equation 4.42 the t-value using the test for unpaired samples is -0.511, whereas the value of t obtained using the test for paired samples, given in equation 4.24, is -2.073. Further examination of these two equations for t shows that the numerators are the same and that the reason for the difference is that the denominator for the test for unpaired data is greater than that for the paired data. Checking back to the equations for the sample estimates of the variance for the two tests (equations 4.23 and 4.41), we see that the reason for the difference in the denominators is the difference in these two estimates. The sample estimate of the variance for the test for the unpaired samples is much greater than that for the test for the paired samples; precision has been lost by using the test for unpaired samples on the samples which are paired.

The reason for this loss in precision is evident on examination of the original data given in table 3. The actual numbers of colonies varied from plate to plate, thus, when the plate count obtained by B increased, the plate count for D also increased, and vice versa. This means that the plate count of B is correlated with that of D, that is, a change in D's count is associated with a similar change in B's count. Thus, although there is quite a bit of variation in plate counts obtained by each observer, the difference between their counts for the same plate does not vary nearly as much. The greater the correlation between the observations for each pair in paired samples, the greater the loss in precision will be when we apply the test for unpaired samples. It is for this reason that the test for unpaired samples should not be applied to data from paired samples.

Paired samples are not the only source of correlated observations in a test of the difference between two treatments. It is possible to have correlation between observations within the samples from each treatment. Here, again, the method for independent samples cannot be used. An example of correlation of this type will be found in the data from an experiment from Halvorson and Spiegelman (20). Yeast cells were nitrogen-starved for 80 minutes in a synthetic medium, replenished with NH4Cl for 15 minutes, centrifuged, washed, and resuspended in buffer. Equal aliquots were placed in each of 10 flasks: 5 control flasks containing 0.3 per cent glucose and 5 flasks containing 0.3 per cent glucose with 0.5 per cent α-methyl-glucoside (the two treatments). Free amino acid extracts were prepared from cells of each flask following aerobic incubation at 30 C for 140 minutes. Glutamic acid was assayed manometrically by the decarboxylase method, with two determinations per flask. The data from the control flasks given in table 6 (see next section) are used to illustrate another type of analysis.

Now, the amount of free glutamic acid will vary from flask to flask. If a flask has a high free glutamic acid content, both determinations should be high and if the free glutamic acid content is low for a flask the two determinations will be low; obviously, the two determinations for any flask will be correlated, since their values will be dependent upon the free glutamic acid content of the flask. The data thus consist of five pairs of correlated values for each treatment; hence the t-test for the difference between two means cannot be used on the data as they stand.

As previously noted, the effect of using the test for unpaired samples on samples which are really paired is to overestimate the variance of the difference between the two means, with a resulting loss in precision on the test. This in turn lowers the proportion of type I errors,


leading to the conclusion that the observed differences are not statistically significant, differences which would be statistically significant if the proper significance test were applied. The effect of using the test for independent samples on samples which have observations that are correlated within the samples is just the opposite. Here, the variance of the difference between the two means is underestimated. This raises the proportion of type I errors, which makes us state, as statistically significant, differences which would not be statistically significant if the proper test were applied. The difficulty of having the five correlated pairs of observations in our data is readily overcome: compute the mean of the two determinations for each flask and use the resulting data (now made up of five flask means for each treatment) for the t-test. Other slightly more complex methods of analysis are available but need not be used in place of the t-test.

As noted in equation 4.33, when the population variances are the same, the sample estimates of the variance can be pooled. Pooling these estimates leads to the simplification of the equation for the t-test (from equation 4.30 to equation 4.34). If the variances of the two parent populations are not equal, the sample estimates of the variance cannot be pooled, since they are not estimates of the same quantity, and the t-test (equation 4.30) must be used without any simplification. When the population variances are not equal, the quantity t, as defined in equation 4.30, does not follow a standard t-distribution. However, its distribution can be approximated by a t-distribution with an appropriately chosen degrees of freedom; of the several approximate solutions to this problem, two will be given here. Both solutions depend upon the ratios $s_1^2/n_1$ and $s_2^2/n_2$, therefore, the example will be worked out before taking up these.

The data in table 4 (statistical study of radioactivity measurements) illustrate the use of a t-test to test the difference between two sample means when the variances of the two parent populations are not equal. The index of radioactivity used was the number of counts per minute (CPM); 12 determinations were run on a sample from solution 1 and 14 determinations were run on a sample from solution 2.

TABLE 4
Radioactivity measurements on two solutions

                 CPM             CPM
                 Solution 1      Solution 2
                 898             400
                 892             411
                 860             425
                 864             390
                 87[?]           4[??]
                 84[?]           [?]
                 [?]             [?]
                 874             399
                 834             399
                 842             404
                 846             396
                 849             390
                                 399
                                 395
Σx               10,353          5,623
n                12              14
x̄                862.75          401.64
Σx²              8,936,475       2,259,483
(Σx)²/n          8,932,050.75    2,258,437.79
S.S.             4,424.25        1,045.21
s²               402.20          80.40

The hypothesis is that the two solutions have the same radioactivity per milliliter, that is $\mu_1 = \mu_2$. From table 4, obviously there is a difference in the radioactivity of the two solutions, hence a very high value of t is expected. There are two alternatives to the hypothesis, namely, the radioactivity of solution 1 is higher than that of solution 2 or the radioactivity of solution 2 is higher than that of solution 1 (obviously, the former is the situation). Therefore, we will use a two-tailed test with a significance level of 5 per cent for the test.

Substituting the values obtained in table 4 in equation 4.30,

$$t = \frac{(862.75 - 401.64) - 0}{\sqrt{\dfrac{402.20}{12} + \dfrac{80.40}{14}}} = \frac{461.11}{\sqrt{33.52 + 5.74}} = \frac{461.11}{\sqrt{39.26}} = \frac{461.11}{6.27} = 73.542 \tag{4.43}$$

To determine whether the value of t obtained falls into the critical region, the value for $t_{.975}$ for the test must be known.

A simple approximate solution to this problem is to examine $s_1^2/n_1$ and $s_2^2/n_2$ and take the degrees of freedom for the sample estimate of the


variance for the larger ratio. If $s_1^2/n_1 = s_2^2/n_2$, then take the smaller of the degrees of freedom for the two samples. In the example, $s_1^2$ has 11 degrees of freedom and $s_2^2$ has 13 degrees of freedom, and

$$\frac{s_1^2}{n_1} = \frac{402.20}{12} = 33.52 \tag{4.44}$$

$$\frac{s_2^2}{n_2} = \frac{80.40}{14} = 5.74 \tag{4.45}$$

Since $s_1^2/n_1$ is larger, the degrees of freedom for $s_1^2$, which is 11, are used. From the t-table, $t_{.975}$ for 11 degrees of freedom is 2.201, therefore, the critical region for the test will be values of t which are less than -2.201 and values of t which exceed 2.201. The value of t obtained in equation 4.43 falls into the critical region, therefore we reject the hypothesis.

The second approximate solution to be illustrated is that given by Cochran and Cox (8, page 92) in which the value of t for the critical region is determined directly. We wish to estimate $t_{.975}$ for the test. Letting $t_1$ be $t_{.975}$ for the degrees of freedom of $s_1^2$ and $t_2$ be $t_{.975}$ for the degrees of freedom of $s_2^2$, then the value of $t_{.975}$ for the test (which will be denoted by $t'$) is given by

$$t' = \frac{\dfrac{s_1^2}{n_1}\,t_1 + \dfrac{s_2^2}{n_2}\,t_2}{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} \tag{4.46}$$

From the t-table, $t_{.975}$ with 11 degrees of freedom is 2.201 and $t_{.975}$ with 13 degrees of freedom is 2.160. Therefore, substituting these values and the values from equations 4.44 and 4.45 into equation 4.46,

$$t' = \frac{(33.52)(2.201) + (5.74)(2.160)}{33.52 + 5.74} = \frac{73.77752 + 12.39840}{39.26} = \frac{86.17592}{39.26} = 2.195 \tag{4.47}$$

From the value obtained in equation 4.47, the critical region for the test will be values of t which are less than -2.195 and values of t which exceed 2.195. Again, the value of t obtained in equation 4.43 falls into the critical region of the test, and the hypothesis is rejected.

When the degrees of freedom are the same for $s_1^2$ and $s_2^2$, then

$$t_1 = t_2 = t' \tag{4.48}$$

and both of the approximate solutions will give the same result. Under other conditions, the second approximate method gives the more exact results.

One point to be considered in designing an experiment in which a test of the difference between two treatments is to be made, is that if the variances are equal, maximum precision for the test is obtained, for a given size of experiment, if the sample sizes are equal. If the variances are not equal, maximum precision for the test is obtained, for a given size of experiment, if the sample sizes are proportional to the variances of the parent populations from which they were drawn.

Test of two sample estimates of the variance. It has been noted that the method used to test the difference between two sample means depends upon whether the parent populations of the two samples have the same variance. In general, this is not known, so a test to decide whether they are the same must be devised. The test used for this purpose is the F-test. It stems from the fact that if a random sample is drawn from each of two parent populations which are normally distributed, the distribution of F, defined by equation 4.49, is known.

$$F = \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} \tag{4.49}$$

The frequency curves of the F-distribution are not symmetrical, and there are no negative values of F, since none of the values entering into the equation for F are negative numbers. The distribution of F depends on the precision of both of the sample estimates of the variance, thus the degrees of freedom of both of these estimates are used in setting up the tables of F for the determination of the critical region of the test. All tables of F are one-tailed tables. The usual method of tabulation is to set up a separate table for each of the various percentage points of F. Then, the columns of each table are assigned to the degrees of freedom of the numerator, while the rows are assigned to the degrees of freedom of the denominator.

The usual hypothesis in an F-test is that the two population variances are equal, i.e., the


variance terms in the equation for F cancel out, leaving

$$F = \frac{s_1^2}{s_2^2} \tag{4.50}$$

The test involved when we are checking to see whether the variances of the parent populations are equal in order to determine which method to use in testing the difference between two sample means is a two-tailed test. This arises from the fact that if the variances are not equal, the variance of the first population may be greater than that of the second or the variance of the second population may be greater than that of the first. Thus, to obtain the critical region for the test with a significance level of 5 per cent, the values of $F_{.025}$ and $F_{.975}$ must be found. There are no tables of $F_{.025}$; all of the available tables are set up for the upper tail only, but from equation 4.50 it can be seen that for F to fall below $F_{.025}$, $s_2^2$ must be greater than $s_1^2$. To use the available tables, then, simply interchange the subscripts on the two sample estimates of the variance so the larger estimate of the variance is in the numerator. In this way F-tests are made with the available tables, since values which would ordinarily fall below $F_{.025}$ are now transformed into values which will exceed $F_{.975}$ for the new test.

The F-test will be illustrated by testing the assumption made concerning the population variances for the t-test for unpaired samples with unequal variances. When we ran the t-test for unpaired samples on the data on plate counts of 12 plates by each of two observers (table 3), we said that the population variances were equal. The sample estimates of the variance cannot be tested to see whether this is correct as the F-test requires $s_1^2$ and $s_2^2$ to be independent and since the samples are paired they are correlated and thus not independent.

When the t-test on the radioactivity measurements on the two solutions (table 4) was made, it was said that the population variances were not equal. This can be tested with an F-test. From table 4, $s_1^2 = 402.20$ with 11 degrees of freedom and $s_2^2 = 80.40$ with 13 degrees of freedom. Since $s_1^2$ is the larger, place it in the numerator for the test. Therefore, if a test is run with a significance level of 5 per cent, $F_{.975}$ for 11 and 13 degrees of freedom (the number given first is the d.f. of the numerator) must be found. From the F-table, the value needed is 3.20, therefore, the critical region for the test will be values of F which exceed 3.20. Substituting the values from table 4 into equation 4.50,

$$F = \frac{s_1^2}{s_2^2} = \frac{402.20}{80.40} = 5.00 \tag{4.51}$$

The value of F obtained falls into the critical region for the test, therefore, the hypothesis is rejected that the population variances are equal, and our statement was not in disagreement with the data. Since under usual operating conditions for the t-tests for the difference between sample means for unpaired samples, it is not known whether the population variances are equal or not, it is a good idea to run an F-test to decide whether they are equal before deciding which t-test to use.

Significant versus practical differences. When the null hypothesis is rejected on a statistical test, we say that the results of an experiment are significantly different, statistically, from the results we would expect if the null hypothesis is true. The fact that the difference between observed results and expected results is significant does not mean that this difference has any practical value. It is also true that a difference of great enough size to be of practical value may exist between the true population and the population given by the null hypothesis and still the null hypothesis will not be rejected by a statistical test of the data obtained in an experiment. The size of the difference between the true state of affairs and the state given by the null hypothesis which is necessary to obtain a statistically significant difference depends on the power of the test used.

The power of a test is dependent upon the variance of the parent population (or populations) and on the size of the sample (or samples) used for the test. The power of the test is high if the variance of the parent population is low. Power can be increased by increasing the size of the sample used for the test. Thus, if the variance is low or if too large a sample is taken the result will be a test which is powerful enough to show, as statistically significant, differences which are of no practical value. In fact, if the sample is large enough, a statistically significant difference will be obtained between any two procedures unless they always give identical results. The larger the sample, the smaller the difference that can be detected by the test.

It is a good idea to examine the data from an


experiment before a statistical test is run to see whether the difference between the observed results and the results expected, if the hypothesis to be tested is true, is great enough to be of practical value. If the difference is of sufficient size to be of practical importance, go ahead with the test. If the difference is not of sufficient size to be of practical importance, there is no need to run the test.

At the other extreme, if the variance of the parent population is high or if too small a sample is taken, the power of the test may be so low that a difference which is of great enough size to be of practical importance may escape detection by the test. It is for these reasons that sample size is so critical. The size of a sample must be sufficient to detect differences of practical importance but not so large that differences too small to be important are also detected. A person who uses statistical tests must never become so enamored by the test that he loses sight of the problem to which the test is being applied. It is important to examine the results of a significance test carefully and translate those results in terms of the original problem. In making this translation, we must keep the power and the significance level of the test in mind.

Interpretation of results of significance tests. When we reject a hypothesis and say that the observed results are significantly different, statistically, from the results expected if the hypothesis were true, it is always important to examine the data to see how (in which direction) the observed results differ from the expected results. For example, if a person checking for bias in an instrument rejects the hypothesis that the instrument is unbiased, he should examine the data to see whether the bias is positive or negative. Thus, for the example of the rough balance (table 2), the hypothesis, which was rejected, was that the mean was 100 grams and, since the sample mean was 100.26 grams, this means that the balance is positively biased.

If one tests the difference between two means and rejects the hypothesis that they come from populations with equal means, he should examine the data to see which population mean is the larger. Thus, in the example of the two solutions of radioactive phosphorus (table 4), the hypothesis, which was rejected, was that the two population means were equal and, since the sample mean of the first solution is higher than the sample mean for the second solution, it is concluded that the level of radioactivity per milliliter for solution 1 is higher than that for solution 2.

When we say that we do not reject a hypothesis, this does not mean that we regard the hypothesis as true. As has been pointed out, the size of the difference between the null hypothesis and the true state of affairs which is necessary for a test of hypothesis to be significant depends upon the power of the test. Thus, a difference may exist but our sample size may not be large enough to detect it. Therefore, the interpretation given when a hypothesis is not rejected is that a difference may exist, but if it does, the sample taken was not of sufficient size to detect it. For example, when the difference between the plate counts of observers B and D (table 3) was tested, the difference was found not to be statistically significant. However, this does not mean that a difference may not exist, only that if a difference does exist, the sample size is not sufficient to detect it.

When the difference between observed results and expected results is of sufficient size to be of practical importance, but the difference is not statistically significant, it is a good idea to repeat the experiment, using larger samples. Thus, if the difference in plate counts between observers B and D is large enough to be of interest, the experiment should be repeated using more plates. It could be that there is a definite difference to be expected between the plate counts of these two observers. If so, it may be possible to detect it with a test in which the samples are larger. On the other hand, the observed difference may be a result of the variation in counts for the two observers and the next test may turn out to be not significant also (in fact, it could easily turn out that in the next test B's plate counts are higher than D's).

When results of statistical tests are published, it is important for the writer to give the significance level used for the test. Many times, authors will say that results obtained in an experiment were significantly different, statistically, by a t-test for instance, but fail to mention the significance level used in the test. Since not everybody will agree on the significance level to be used, the writer should not only include the significance level used, but he should also give the value obtained in the test along


with the degrees of freedom for the test. In this way, other investigators who might wish to examine the results at a different significance level may do so. Another point which is helpful in this respect is that a writer should, if possible, give all of the original data on which the test is based so other investigators will have a chance to check the results for themselves.

Test of the difference among more than two treatments. Many times we may wish to study the effect of more than two treatments. For example, the problem may be a test of the difference in the effect on growth among each of several amino acids when they are added to cultures of a given organism. The differences could be examined by applying t-tests for the difference between all of the possible pairs of treatments, but difficulties will be encountered. The number of t-tests that will be required mounts rapidly as the number of treatments increases. Thus, 3 treatments would require 3 t-tests, but 10 treatments need 45 t-tests. When the number of t-tests increases, there may be difficulty due to the significance level of the t-test. If a significance level of 5 per cent is used, the hypothesis will be rejected, on the average, 5 per cent of the time when the hypothesis is true. Under these conditions, if 45 t-tests were run, at least some of these would be expected to turn out significant even though there was no difference among the population means of the 10 treatments being tested. Conflicting results may also result from the t-tests. For example, consider three treatments, A, B, and C, lettered in increasing order of magnitude of the sample means; t-tests may show that A is not significantly less than B and B is not significantly less than C. In this way, there should be no difference between A and C. However, the t-test may reveal that A is significantly less than C.

Another point to be considered is that when t-tests are run, the information is used from only two of the treatments at a time; some of the information which could be contributed by the other treatments is missed.

The method developed for testing the difference among more than two treatments is the analysis of variance. The analysis of variance is based on the fact that if we have a set of populations which have the same variance, and we take a random sample from each of these populations, then, if the population means are equal, we will have two estimates of the common variance of the populations.

This will be illustrated with the data of Roth and Halvorson (9) given in table 5. These data are from an experiment designed to show the effect of oxidative rancidity on the germination of bacterial spores. There are three treatments, the control medium, medium with non-rancid lard added and the medium with rancid lard added. The observations are plate counts.

TABLE 5*
Effect of oxidative rancidity of lard on germination of spores of a putrefactive anaerobe (plate counts)

              Lard          Lard
              (Rancid)      (Not Rancid)    Control      Totals
              (Kreis +)     (Kreis -)
              71            123             133
              60            140             118
              70            129             132
              70            127             127
              62            136             138
              73            130             133
              74            119             129
              85            121             122
              88            149             144
              [one row of observations is missing in this copy]
T = Σx        756           1,285           1,317        3,358
n             10            10              10           30
x̄             75.6          128.5           131.7
Σx²           58,148        166,219         174,041      398,408
(Σx)²/n       57,153.6      165,122.5       173,448.9
S.S.          994.4         1,096.5         592.1        2,683.0
s²            110.49        121.83          65.79

* This table originally appeared in the Journal of Bacteriology, page 431, Volume 63, and is reproduced here with the permission of the copyright owner.

If the variances of the populations from which the samples are taken are equal, a pooled estimate of this common variance can be obtained by the same procedure used for the pooled estimate of the variance for the test of the difference between two treatments (see equation 4.18). The pooled estimate of the variance is equal to the sum of the S.S. divided by the sum of the degrees of freedom, or

$$\text{pooled } s^2 = \frac{\sum \text{S.S.}}{\sum \text{d.f.}} \tag{4.52}$$
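As a minimal sketch of equation 4.52, the following Python fragment rebuilds the sum of squares for each treatment from the summary rows of table 5 and pools them. The variable and function names are illustrative assumptions and do not come from the original paper; only the numbers are taken from the table.

```python
# Summary rows of table 5, one entry per treatment
# (rancid lard, non-rancid lard, control).
sum_x = [756, 1285, 1317]          # T = sum of the plate counts
sum_x2 = [58148, 166219, 174041]   # sum of the squared plate counts
n = [10, 10, 10]                   # number of plates per treatment


def sum_of_squares(total, total_sq, size):
    """S.S. = sum(x^2) - (sum x)^2 / n for one sample."""
    return total_sq - total ** 2 / size


# Sum of squares and degrees of freedom for each treatment.
ss = [sum_of_squares(t, t2, k) for t, t2, k in zip(sum_x, sum_x2, n)]
df = [k - 1 for k in n]

# Equation 4.52: pooled s^2 = (sum of S.S.) / (sum of d.f.).
pooled_s2 = sum(ss) / sum(df)

print("S.S. per treatment:", [round(v, 1) for v in ss])
print("pooled s^2:", round(pooled_s2, 2))
```

Working from the sums rather than the individual plate counts keeps the sketch independent of the raw observations, at the cost of not being able to check the table rows themselves.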


This gives one estimate of the common variance (call this the internal estimate of the common variance). In the example, each of the samples has 10 observations, therefore each has 9 degrees of freedom. Thus,

$$\text{pooled } s^2 = \frac{2{,}683.0}{27} = 99.37 \tag{4.53}$$

If the population means of the treatments are equal (this is the hypothesis), the variance of the sample means is equal to the variance common to the populations divided by the size of the samples (see equation 3.1), or

$$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$$

An estimate of $\sigma_{\bar{x}}^2$ is obtained from a sample estimate of the variance of the sample means. If $\bar{\bar{x}}$ is the mean of the sample means, that is,

$$\bar{\bar{x}} = \frac{\sum \bar{x}}{k} \tag{4.54}$$

where k is the number of samples, which is 3 in the example, then, the estimate of the variance of the means (denoted by $s_{\bar{x}}^2$) is

$$s_{\bar{x}}^2 = \frac{\sum (\bar{x} - \bar{\bar{x}})^2}{k - 1} \tag{4.55}$$

Now, $s_{\bar{x}}^2$ is an estimate of $\sigma^2/n$; thus, $n s_{\bar{x}}^2$ is an estimate of $\sigma^2$, giving the second estimate of the variance common to the three populations (we can call this the external estimate of the common variance).

The identity in equation 4.12 can be used to determine the S.S. for the sample means. Thus, for our example,

$$\sum \bar{x}^2 = 39{,}572.50$$

and

$$\frac{(\sum \bar{x})^2}{3} = \frac{112{,}761.64}{3} = 37{,}587.21$$

therefore,

$$\text{S.S.}_{\bar{x}} = 39{,}572.50 - 37{,}587.21 = 1{,}985.29 \tag{4.56}$$

Since there are three sample means, the degrees of freedom of $s_{\bar{x}}^2$ is 2. Thus,

$$s_{\bar{x}}^2 = \frac{1{,}985.29}{2} = 992.645 \tag{4.57}$$

and

$$n s_{\bar{x}}^2 = 10(992.645) = 9{,}926.45 \tag{4.58}$$

The two estimates of the variance common to the populations can be examined by the F-test to see whether these two quantities are estimates of the same variance (statistical theory states these estimates are independent so the F-test can be used). The hypothesis for the test is that the population means are equal. If this is true, both the pooled $s^2$ and $n s_{\bar{x}}^2$ will be estimates of the same variance. If the population means are not equal, $n s_{\bar{x}}^2$ will be an estimate of something larger than the variance common to the three populations since there will be additional variation due to the difference among the population means of the treatments. Thus the test has only one alternative to the hypothesis, that is that $n s_{\bar{x}}^2$ is an estimate of something which is larger than the variance estimated by the pooled $s^2$. Therefore, the F-test is one-tailed and, since $n s_{\bar{x}}^2$ is an estimate of a quantity which is equal to or larger than the variance estimated by the pooled $s^2$, $n s_{\bar{x}}^2$ is always put in the numerator of F. Thus,

$$F = \frac{n s_{\bar{x}}^2}{\text{pooled } s^2} \tag{4.59}$$

Now test the hypothesis that the population means of the three treatments for the data in table 5 are equal using the 5 per cent significance level. Since the test is a one-tailed test, the value needed for the critical region is $F_{.95}$ with 2 and 27 degrees of freedom. An F table reveals that the critical region for the test will be values of F which exceed 3.35. Substituting the values obtained in equations 4.53 and 4.58 into the equation for F (equation 4.59),

$$F = \frac{9{,}926.45}{99.37} = 99.89 \tag{4.60}$$

This value falls into the critical region of the test, therefore the hypothesis that the population means of the three treatments are equal is rejected.

The computations necessary for the analysis of variance are simplified by the use of an analysis of variance computing table. This table takes advantage of the type of mathematical identity shown in equation 4.13 to shorten the computing necessary to reach the F-test. The method used in the table consists of splitting the total S.S. for the experiment into its component parts.


One part of this total is the S.S. for the variation among the sample means (this will lead to $n s_{\bar{x}}^2$) while the other portion will be the S.S. for the variation within the samples (this will lead to the pooled $s^2$). The total degrees of freedom for the experiment are also split into similar parts. The details of the computing instructions for the analysis of variance computing table are given in the appendix.

Several assumptions underlie an analysis of variance, three of which have been mentioned already, namely, that the populations sampled are normally distributed, the samples taken are random samples and the populations have equal population variances. Eisenhart (10) lists and discusses these assumptions and Cochran (11) describes some of the consequences when these assumptions are not satisfied.

Statistical tests are available to check whether the assumption of equal population variances is met. When the population variances are equal, they are said to be homogeneous; thus, a test used to see whether the variances are homogeneous is called a homogeneity of variances test. Two homogeneity of variances tests are used fairly frequently: Bartlett's (12) method, which may be used if the degrees of freedom for each of the samples in the experiment is at least 4, and Box's (13) method, if the degrees of freedom for any of the samples is less than 4.

Box (14) has shown that homogeneity of variances tests are very sensitive to non-normality of the parent populations. This serves as a drawback to the use of homogeneity of variances tests unless the populations are indeed normal, since a statistically significant result may be due to non-normality or lack of homogeneity. This fact led Box to conclude:

"It has frequently been suggested that a test of homogeneity of variances should be applied before making an analysis of variance test for homogeneity of means in which homogeneity of variances is assumed. The present research suggests that when, as is usual, little is known of the parent distribution, this practice may well lead to more wrong conclusions than if the preliminary test was omitted. It has been shown that in the commonly occurring case in which the group sizes are equal or not very different, the analysis of variance test is affected surprisingly little by variance inequalities. Since this test is also known to be very insensitive to non-normality it would be best to accept the fact that it can be used safely under most practical conditions. To make the preliminary test on variances is rather like putting to sea in a rowing boat to find out whether conditions are sufficiently calm for an ocean liner to leave port!"5

5 This paragraph originally appeared in Biometrika, page 333, Volume 40, and is reproduced here with the permission of the copyright owner.

It is emphasized that the analysis of variance is by no means completely insensitive to lack of homogeneity of the variances. Variation in the population variances can be tolerated as long as the variation does not exceed reasonable limits. If the variation among the population variances is too great, methods are available such as that of James (15) and Welch (16) for correcting the analysis of variance procedure for the lack of homogeneity of the variances. Transformations are also of use in problems requiring the use of an analysis of variance when the variances of the original measurements are not homogeneous. For example, radioactivity measurements using an Autoscaler have a constant coefficient of variation and therefore the variances are not equal when the means differ; however, if the logarithm of the radioactivity measurement is used, the variances will be homogeneous. Some of the commonly used transformations are listed and discussed by Bartlett (17).

Problems of Estimation

Most experiments are run to estimate an unknown quantity and not to test some hypothesis. For example, we may wish to determine the number of bacteria in a sample of milk or the amount of a given metabolite in some source or the potency of a toxin. Here, no particular hypothesis is to be tested; instead, we have a problem of estimation. The same type of problem may arise if a hypothesis is rejected. For example, if the hypothesis that a measuring device is unbiased has been rejected, we may wish to determine the amount of bias, or if the hypothesis that two procedures give the same result is rejected, we may wish to determine the magnitude of the difference between them.

Confidence intervals: basic principles. In problems of estimation, we must, again, contend with the old problem of the variation of measurements. For example, consider the problem of


determining the bias of a balance. Now, if the balance gave exactly the same weight each time, it would be a simple matter to determine the bias of the balance; we would need only to obtain an estimate of the weight of a standard weight and the difference between the estimated weight and the true weight would give the bias of the balance. However, balances just don't do that. If a standard weight is weighed 10 times, the mean will give an estimate of the bias. If the same weight is weighed 10 more times the mean of the second set of 10 observations will give another estimate of the bias, but the second estimate probably won't agree with the first. Thus an estimate of the bias can be obtained from the mean of 10 observations but this mean, by itself, gives no idea about the reliability of the estimate. To obtain an estimate with a known amount of reliability, we utilize the information in the sample concerning the variation to which the observations are subject plus the information contained in the mean of the sample concerning the true amount of bias and come up with a range of values as an estimate of the bias of the balance. That is, all of the information the sample has to give is utilized and the statement is made that the true amount of bias lies within a certain range, the range being determined by the information in the sample. Now, we could say, with absolute assurance, that the bias of the instrument was between minus infinity grams and plus infinity grams. However, such a range would be of little value from a practical point of view. To obtain a range of values for the true bias which has greater practical usefulness, we settle for a little less assurance that the true bias lies within the given range.

Consider a parent population with a population parameter, $\theta$; take all of the possible samples of a given size from this population; for each of the samples some procedure is used for determining a range with which to estimate $\theta$ (exactly the same procedure is used on each of the samples). The procedure for setting up the range is such that the ranges for a certain percentage of the samples will contain $\theta$. The ranges set up in this way are called confidence intervals and the per cent of the samples whose confidence intervals contain the population parameter is called the confidence coefficient. Thus, if the procedure is such that the confidence intervals of 95 per cent of the samples contain $\theta$, the confidence coefficient will be 95 per cent. The procedures for setting up confidence intervals follow.

Confidence interval for a population mean. Before proceeding to the confidence interval for a population mean, a new notation will be needed, namely an inequality sign to show relative magnitudes. This sign is a V lying on its side, the open end pointing toward the larger quantity; thus, to show that A is less than B use the symbol A < B, or stating this inequality in another way, B is greater than A, as shown by the symbol B > A. The following example may help to clarify the use of these symbols.

$$2 < 7 < 16 \tag{4.61}$$

Inequality 4.61 says that 2 is less than 7 and 7, in turn, is less than 16. This can be stated in another way by saying that 7 lies between 2 and 16. This type of notation is used in setting up confidence intervals.

The confidence interval for a population mean is derived from the t-test shown in equation 4.16. If the 5 per cent significance level is used, the test will not be significant if

$$t_{.025} < \frac{\bar{x} - \mu}{\sqrt{s^2/n}} < t_{.975} \tag{4.62}$$

Now, if all three members of the inequality are multiplied by $\sqrt{s^2/n}$, which is a positive number, the direction of the inequality will be unchanged (the reader can verify this by multiplying all of the members of the inequality shown in 4.61 by some positive number, say +2). Thus,

$$t_{.025}\sqrt{s^2/n} < \bar{x} - \mu < t_{.975}\sqrt{s^2/n} \tag{4.63}$$

If all three members of the inequality are multiplied by -1, that is, if the sign of all of the members is changed, the direction of the inequality will be changed (the reader can verify this by multiplying all three members of the inequality given in 4.61 by -1 to give -2 > -7 > -16). Thus, multiplying all three members of the inequality shown in 4.63 by -1,

$$-t_{.025}\sqrt{s^2/n} > \mu - \bar{x} > -t_{.975}\sqrt{s^2/n} \tag{4.64}$$

Now add $\bar{x}$ to all three members of the inequality; this will not change the direction of the inequality (the reader can verify this by adding some number, either positive or negative, to all


three members of the inequality given in 4.61). Thus,

$$\bar{x} - t_{.025}\sqrt{s^2/n} > \mu > \bar{x} - t_{.975}\sqrt{s^2/n} \tag{4.65}$$

Rearranging the inequality given in 4.65,

$$\bar{x} - t_{.975}\sqrt{s^2/n} < \mu < \bar{x} - t_{.025}\sqrt{s^2/n} \tag{4.66}$$

Inequality 4.66 defines the confidence interval for a population mean with a confidence coefficient of 95 per cent.

The confidence interval will be illustrated with the data in table 2, from which the hypothesis that the rough balance was unbiased was rejected. From equation 4.13, $\bar{x}$ is 100.26 and from equation 4.17, $\sqrt{s^2/n}$ is 0.078. The number of degrees of freedom for t was 9, thus $t_{.025}$ is -2.262 and $t_{.975}$ is +2.262. Substituting these values in the inequality for the confidence interval,

$$100.26 - (2.262)(0.078) < \mu < 100.26 - (-2.262)(0.078)$$

or

$$100.08 < \mu < 100.44 \tag{4.67}$$

Thus the confidence interval, with a 95 per cent confidence coefficient, for the mean of the estimates of the weight lies between 100.08 and 100.44. The confidence interval for the bias is obtained by subtracting the true value of the standard weight from each, thus,

$$0.08 < \text{bias} < 0.44 \tag{4.68}$$

The confidence interval for a different confidence coefficient is determined by the appropriate choice of the significance level used in setting up inequality 4.62. For example, if $t_{.005}$ and $t_{.995}$ (for the 1 per cent significance level) are used, the confidence interval for the population mean with a confidence coefficient of 99 per cent results. This would be

$$\bar{x} - t_{.995}\sqrt{s^2/n} < \mu < \bar{x} - t_{.005}\sqrt{s^2/n} \tag{4.69}$$

Confidence interval for a population variance. This confidence interval for a population variance is based on the chi-square ($\chi^2$) distribution. The confidence interval is derived from the fact that the quantity

$$\chi^2 = \frac{\text{S.S.}}{\sigma^2} \tag{4.70}$$

follows the chi-square distribution with the degrees of freedom associated with the S.S. This quantity can be used to test the hypothesis that the variance of a population is equal to a given quantity. The test is much the same as any other test of hypothesis, such as the t-test, so its use will not be illustrated. The test is two-tailed; therefore, the hypothesis is not rejected, for a significance level of 5 per cent, if

$$\chi^2_{.025} < \frac{\text{S.S.}}{\sigma^2} < \chi^2_{.975} \tag{4.71}$$

Now, taking the reciprocal of each member of the inequality (i.e., each member of the inequality is divided into 1), the direction of the inequality changes (the reader can verify this by taking the reciprocal of all three members of inequality 4.61 to give 1/2 > 1/7 > 1/16). Thus,

$$\frac{1}{\chi^2_{.025}} > \frac{\sigma^2}{\text{S.S.}} > \frac{1}{\chi^2_{.975}} \tag{4.72}$$

Multiplying all three members of inequality 4.72 by S.S.,

$$\frac{\text{S.S.}}{\chi^2_{.025}} > \sigma^2 > \frac{\text{S.S.}}{\chi^2_{.975}} \tag{4.73}$$

Rearranging inequality 4.73 gives the confidence interval for the 95 per cent confidence coefficient:

$$\frac{\text{S.S.}}{\chi^2_{.975}} < \sigma^2 < \frac{\text{S.S.}}{\chi^2_{.025}} \tag{4.74}$$

Using the data in table 2 as illustration, from equation 4.14 the S.S. is 0.5440. The degrees of freedom will be 9 again, therefore $\chi^2_{.025}$ is 2.70 and $\chi^2_{.975}$ is 19.0; thus the confidence interval, for the 95 per cent confidence coefficient, will be

$$\frac{0.5440}{19.0} < \sigma^2 < \frac{0.5440}{2.70}$$

or

$$0.029 < \sigma^2 < 0.201 \tag{4.75}$$

Confidence interval for the difference between two means. Two methods are available for determining the confidence interval for the difference between two population means, corresponding to the two cases for the t-test for independent samples (paired samples will use equation 4.66).

If the population variances are equal for the two treatments, the equation for the t-test (equation 4.34) is used to derive the confidence interval for the difference between the two population means.
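As a check on the arithmetic, inequalities 4.66 and 4.74 can be evaluated directly for the table 2 data. The sketch below is illustrative only: the function names are hypothetical, the critical values are the tabulated ones quoted in the text rather than computed, and the true weight of 100.00 is an assumption inferred from comparing inequalities 4.67 and 4.68.

```python
def mean_confidence_interval(xbar, std_error, t_lower, t_upper):
    """Inequality 4.66: xbar - t_upper*SE < mu < xbar - t_lower*SE."""
    return xbar - t_upper * std_error, xbar - t_lower * std_error


def variance_confidence_interval(sum_of_squares, chi2_lower, chi2_upper):
    """Inequality 4.74: S.S./chi2_upper < sigma^2 < S.S./chi2_lower."""
    return sum_of_squares / chi2_upper, sum_of_squares / chi2_lower


# Values quoted in the text for the table 2 data (9 degrees of freedom).
mean_lo, mean_hi = mean_confidence_interval(100.26, 0.078, -2.262, 2.262)
var_lo, var_hi = variance_confidence_interval(0.5440, 2.70, 19.0)

# Assumed true weight of the standard (inferred from 4.67 and 4.68).
TRUE_WEIGHT = 100.00
bias_lo, bias_hi = mean_lo - TRUE_WEIGHT, mean_hi - TRUE_WEIGHT

print(f"mean:     {mean_lo:.2f} < mu < {mean_hi:.2f}")
print(f"bias:     {bias_lo:.2f} < bias < {bias_hi:.2f}")
print(f"variance: {var_lo:.3f} < sigma^2 < {var_hi:.3f}")
```

Passing the two critical values explicitly keeps the sketch tied to whatever table is used; a 99 per cent interval, as in inequality 4.69, needs only the corresponding tabulated values.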


With equal population variances, the resulting inequality is

$$\bar{x}_1 - \bar{x}_2 - t_{.975}\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} < \mu_1 - \mu_2 < \bar{x}_1 - \bar{x}_2 - t_{.025}\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \tag{4.76}$$

where $s^2$ is the pooled estimate of the common variance defined by equation 4.33.

If the population variances are not equal for the two treatments, we use the equation for the t-test (equation 4.30) along with the equation for the approximate t-value, $t'$ (equation 4.46) to derive the confidence interval for the difference between the two population means. The resulting inequality is

$$\bar{x}_1 - \bar{x}_2 - t'_{.975}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_1 - \mu_2 < \bar{x}_1 - \bar{x}_2 - t'_{.025}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \tag{4.77}$$

Components of variance technique. In some laboratory procedures variation in results may arise from more than one source. For example, each of a group of technicians makes the same measurement on an unknown; the variation in results for the group will arise from two sources: variation in results due to the measuring device and further variation arising from the differences among the technicians. Another example would be a procedure in which two steps are required before the final result is obtained, with variation in results from each of the steps in the procedure. When a problem arises in which it is necessary to obtain an estimate of the amount of variation arising from each of the sources, the components of variance technique can be used for this purpose. The components of variance technique is an extension of the analysis of variance and is based on the expected values of the mean squares in the analysis of variance table.

We will use the data in table 6 to illustrate the components of variance technique. These data are the control portion of the experiment of Halvorson and Spiegelman (20) discussed previously (page 181). Yeast cells were nitrogen-starved for 80 minutes in a synthetic medium, replenished with NH4Cl for 15 minutes, centrifuged, washed, and resuspended in buffer. Equal aliquots were placed in five flasks, containing 0.3 per cent glucose. Free amino acid extracts were prepared from cells following aerobic incubation at 30 C for 140 minutes. Glutamic acid was assayed manometrically by the decarboxylase method, with two determinations per flask.

TABLE 6*
Free glutamic acid in N replenished cells (control)

Flask No.      Glutamic Acid per 100 mg dry cell      Flask Totals
1              19.6, 20.4                             40.0
2              17.9, 17.2                             35.1
3              17.2, 18.0                             35.2
4              18.9, 19.6                             38.5
5              17.3, 17.5                             34.8
Grand total                                           183.6

* This table originally appeared in the Journal of Bacteriology, page 605, Volume 65, and appears here with the permission of the copyright owner.

Consider the parent population for the sample given in table 6. This population will be made up of many glutamic acid determinations on each of many flasks. Denote a member of this population, say the ith glutamic acid determination on the jth flask, by $X_{ij}$. Now, the glutamic acid determinations for a given flask, say the jth flask, will have a mean value which is called the flask mean and denoted by $\bar{X}_{.j}$. Thus if there are N determinations for the jth flask,

$$\bar{X}_{.j} = \frac{\sum_i X_{ij}}{N} \tag{4.78}$$

The parent population will also have a mean value, denoted by $\mu$, which will be the mean of all of the glutamic acid determinations for all of the flasks. Thus, if the total number of determinations in the population is N,

$$\mu = E(X_{ij}) = \frac{\sum_i \sum_j X_{ij}}{N} \tag{4.79}$$

The measurement of any member of the parent population can now be broken down into component parts, thus,

$$X_{ij} = \mu + (\bar{X}_{.j} - \mu) + (X_{ij} - \bar{X}_{.j}) \tag{4.80}$$

That is, a particular glutamic acid determination is the sum of the parent population mean plus the deviation of the flask mean from the population mean plus the deviation of the determination from the flask mean.

The variance of the population, denoted by $\sigma^2$, can also be broken down into component parts. Part of the over-all variation is due to the variation of the flask means about the population mean. This variation would be due to such things as the variation in the number of yeast cells in an aliquot and the variation in the aliquots themselves. The variance of the flask means about the population mean can be denoted by $\sigma_f^2$ and will be

$$\sigma_f^2 = E[(\bar{X}_{.j} - \mu)^2] \tag{4.81}$$

Another part of the over-all variation will be due to the variation of the glutamic acid determinations about their flask means. This would be the usual variation due to the procedure of obtaining estimates of the amount of glutamic acid. The variance of the glutamic acid determinations about the flask means can be denoted by $\sigma_d^2$ and will be

$$\sigma_d^2 = E[(X_{ij} - \bar{X}_{.j})^2] \tag{4.82}$$

Both $\sigma_d^2$ and $\sigma_f^2$ will be components of $\sigma^2$. The components of variance technique gives an estimate of each of these components of $\sigma^2$ from a sample taken from the population. Denote the estimate of $\sigma_f^2$ by $s_f^2$ and the estimate of $\sigma_d^2$ by $s_d^2$. Thus, $s_f^2$ gives an estimate of the variance due to variation in results among flasks and $s_d^2$ gives an estimate of the variance for the procedure for the glutamic acid determinations.

If we take a random sample of k flasks and run n determinations on each flask, an analysis of variance computing table can be set up for the resulting data as given in the appendix. The estimates of $\sigma_f^2$ and $\sigma_d^2$ will be based on the expected values of the mean square terms in the resulting analysis of variance table. The expected values of the mean square terms for an experiment involving n determinations on each of k flasks are given in table 7.

TABLE 7
Expected values of the mean square terms in an analysis of variance table for n determinations on each of k flasks

Variation Due to:                 Mean Square      E(M.S.)
Treatments (flasks)               M.S._T           $\sigma_d^2 + n\sigma_f^2$
Observations (determinations)     M.S._O           $\sigma_d^2$

From table 7 we see that $\text{M.S.}_O$ (equivalent to the pooled $s^2$) is an estimate of $\sigma_d^2$, that is,

$$s_d^2 = \text{M.S.}_O \tag{4.83}$$

And an estimate of $\sigma_f^2$ is

$$s_f^2 = \frac{\text{M.S.}_T - \text{M.S.}_O}{n} \tag{4.84}$$

where $\text{M.S.}_T$ is equivalent to $n s_{\bar{x}}^2$.

The use of table 7 and the accompanying equations 4.83 and 4.84 can be illustrated with the data in table 6. Table 8 is the analysis of variance computing table for the data. From the analysis of variance table $\text{M.S.}_O$ is 0.2300, therefore

$$s_d^2 = 0.2300 \tag{4.85}$$

and, since $\text{M.S.}_T$ is 2.8185 and n is 2, the estimate of $\sigma_f^2$ will be

$$s_f^2 = \frac{2.8185 - 0.2300}{2} = 1.2942 \tag{4.86}$$

The estimate of the variance due to the variation in glutamic acid determinations is 0.2300 while the estimate of the variance due to the variation of glutamic acid content among the flasks is 1.2942. Obviously, the contribution from the difference in results for different flasks to the over-all variance is greater than the contribution from variation in results due to the glutamic acid determinations. The components of variance technique can also be applied to experiments in which there are unequal numbers of observations for the various treatments, for example, unequal numbers of glutamic acid determinations per flask, but the method is slightly more complicated than the method for equal numbers of observations for the various treatments.

One point to be noticed is that the F-test for the analysis of variance of data of the type shown in table 6 tests the hypothesis that $\sigma_f^2 = 0$. That is, if the value of F falls into the critical region, we say that $\sigma_f^2$ is significantly different from zero, statistically. An interesting phenomenon may occur if $\sigma_f^2$ is not significantly different from zero: if the value of F is less than 1, then $\text{M.S.}_T < \text{M.S.}_O$ and $s_f^2$ will be negative.
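The estimates in equations 4.85 and 4.86 can be recomputed from the raw determinations of table 6. The following is a minimal sketch, assuming equal numbers of determinations per flask; the function name and dictionary keys are hypothetical, and the first determination for flask 3 is taken as 17.2, the value implied by the flask total of 35.2.

```python
from statistics import mean

# Table 6 determinations, one inner list per flask.
# The first flask-3 value (17.2) is inferred from the flask total 35.2.
FLASKS = [[19.6, 20.4], [17.9, 17.2], [17.2, 18.0], [18.9, 19.6], [17.3, 17.5]]


def variance_components(groups):
    """Estimate s_d^2 and s_f^2 for k groups of n observations each."""
    k, n = len(groups), len(groups[0])
    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]
    # S.S. among flask means and S.S. within flasks.
    ss_between = n * sum((m - grand_mean) ** 2 for m in group_means)
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    ms_t = ss_between / (k - 1)        # M.S. for treatments (flasks)
    ms_o = ss_within / (k * (n - 1))   # M.S. for observations (determinations)
    return {
        "ms_t": ms_t,
        "ms_o": ms_o,
        "s_d2": ms_o,                  # equation 4.83
        "s_f2": (ms_t - ms_o) / n,     # equation 4.84; may come out negative
        "F": ms_t / ms_o,
    }


result = variance_components(FLASKS)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

The function returns the raw value of equation 4.84 without adjustment, so a data set with F below 1 would show the negative estimate described above.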


TABLE 8
Analysis of variance computing table for the data in table 6

Part 1: Preliminary calculations

(1)                    (2)             (3)            (4)                 (5)
Source of Variation    Total of        No. of Items   Observations per    Total of Squares per
                       Squares         Squared        Squared Item        Observation (2) ÷ (4)
Correction*            33,708.96       1              10                  3,370.896
Flasks                 6,764.34        5              2                   3,382.170
Determinations         3,383.32        10             1                   3,383.320

Part 2: Analysis of variance table

(6)                    (7)             (8)            (9)                 (10)
Variation Due to:      Sum of Squares  Degrees of     Mean Square (M.S.)  F
                       (S.S.)          Freedom (d.f.) (7) ÷ (8)
Flasks                 11.274          4              2.8185              12.254
Determinations         1.150           5              0.2300
Total                  12.424          9

* The correction term does not constitute a source of variation.

A negative estimate of this kind is rather surprising; $\sigma_f^2$, by its very nature, cannot be negative. A negative value of $s_f^2$ can only occur when $\sigma_f^2$ is not significantly different from zero. When this occurs, reference to table 7 shows that both $\text{M.S.}_T$ and $\text{M.S.}_O$ will be estimates of $\sigma_d^2$. Since estimates of variances are subject to variation (this is the reason for having the F-distribution to test sample estimates of variances) it is not surprising to see $\text{M.S.}_T$ less than $\text{M.S.}_O$ when both are estimates of $\sigma_d^2$. Therefore, we use zero as our estimate of $\sigma_f^2$ when $s_f^2$ turns out to be negative. An example of a negative estimate of a component of variance is given by Stearman, Ward and Webster (21).

The components of variance technique is a very useful statistical tool in the designing of experiments. Its usefulness stems from our discussion of methods of increasing precision; equation 3.2 states that the variance of a mean of $n_f$ flasks with $n_d$ determinations per flask will be

$$\sigma_{\bar{x}}^2 = \frac{\sigma_f^2}{n_f} + \frac{\sigma_d^2}{n_f n_d} \tag{4.87}$$

By substituting the estimates from the components of variance technique into the equation, an estimate of this variance is

$$s_{\bar{x}}^2 = \frac{s_f^2}{n_f} + \frac{s_d^2}{n_f n_d} \tag{4.88}$$

Thus,

$$s_{\bar{x}}^2 = \frac{1.2942}{n_f} + \frac{0.2300}{n_f n_d} \tag{4.89}$$

Now, the variance of the mean can be decreased, thus increasing the precision, by increasing either the number of flasks or the number of determinations per flask or both. Where $s_f^2$ is the larger of the two components, the greatest gain in precision will be obtained by increasing the number of flasks. This is illustrated by considering an experiment with a total of four determinations. There are three ways of obtaining four determinations: one flask with four determinations, two flasks with two determinations per flask, or four flasks with one determination per flask. Which method gives the greatest precision?

If four determinations are done on one flask, the variance of the mean of the determinations will be

$$s_{\bar{x}}^2 = \frac{1.2942}{1} + \frac{0.2300}{4} = 1.2942 + 0.0575 = 1.3517$$

With two determinations on each of two flasks the estimate of the variance of the mean will be

$$s_{\bar{x}}^2 = \frac{1.2942}{2} + \frac{0.2300}{4} = 0.6471 + 0.0575 = 0.7046$$

which gives a fair increase in precision. On the other hand, if one determination is made on each of four flasks, the estimate of the variance of the mean will be

$$s_{\bar{x}}^2 = \frac{1.2942}{4} + \frac{0.2300}{4} = 0.3235 + 0.0575 = 0.3810$$

which gives an even greater increase in precision. Thus it is clear that increasing the number of flasks has a greater effect in increasing precision than increasing the number of determinations per flask. Estimates of the components can be used to determine the size of experiment and allocation of flasks and determinations necessary to obtain a given precision. Further discussion of the use of components of variance in designing experiments will be found in Stearman, Ward and Webster (21).
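The three allocations of four determinations compared above can be tabulated from equation 4.89 in a few lines. This is a sketch only: the function name and the list of designs are illustrative, and the component estimates are those of equations 4.85 and 4.86.

```python
def variance_of_mean(s_f2, s_d2, n_flasks, n_dets):
    """Equation 4.88: s_f^2/n_f + s_d^2/(n_f * n_d)."""
    return s_f2 / n_flasks + s_d2 / (n_flasks * n_dets)


S_F2 = 1.2942   # among-flask component, equation 4.86
S_D2 = 0.2300   # determination component, equation 4.85

# (number of flasks, determinations per flask), four determinations in all.
designs = [(1, 4), (2, 2), (4, 1)]
results = {d: variance_of_mean(S_F2, S_D2, *d) for d in designs}

for (n_f, n_d), var in results.items():
    print(f"{n_f} flask(s) x {n_d} determination(s): variance of mean = {var:.4f}")

# The allocation with the smallest variance of the mean is the most precise.
best = min(results, key=results.get)
print("most precise allocation:", best)
```

Extending `designs` to other totals gives a quick way to see how many flasks and determinations would be needed to reach a stated precision, provided the component estimates are taken as fixed.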


V. THE BINOMIAL DISTRIBUTION

The parent population. Some problems define a population in which each member of the population falls into one of two classes. An example would be a study of the effect of a given toxin on rats in which the classifications would be that an animal died or the animal did not die. The problem defines a population made up of the reactions of each of the animals to the toxin. The typical question to be answered in such a problem is, what proportion of rats will die, given a particular dose of the toxin? In other words, if the same dose of toxin is given to all of the population of rats from which samples are drawn in studies of this type, what proportion of the population would die?

Call one of the classes class A and the other class "not A" (denote the class "not A" by Ā). Thus, some of the members of the population will have attribute A and will fall in class A while the remaining members of the population will not have attribute A and will fall in class Ā (in some discussions, the two classes are referred to as "successes" and "failures"; however, since this type of nomenclature sometimes leads to confusion, we will use the classes A and Ā).

Denote the proportion of the population which falls into class A by P and the proportion of the population which falls in class Ā by Q. As an example, if three quarters of rats given toxin would die and the remaining quarter survive, P would be 0.75 and Q would be 0.25. Since all of the members of the population will fall into one class or the other, the total of the two proportions must be 1, thus,

$$P + Q = 1 \tag{5.1}$$

Equation 5.1 leads to a much-used relationship, namely

$$Q = 1 - P \tag{5.2}$$

The assignment of the members of the population to the two classes is based on our primary interests in the problem under study. For example, if, in the toxin example, primary interest is in the proportion of deaths of the test animals, assign the deaths to class A and the survivals to class Ā; however, if primary interest is in the proportion of survivals, assign the survivals to class A and the deaths to class Ā.

Probability. The statement that a certain proportion, P, of the parent population will fall into class A, is usually made by saying that the probability that a member of the parent population will fall into class A is P. In other words, the probability that a member of a population will have a certain attribute is nothing more than the proportion of the population members that have this attribute, e.g., in the example of the toxin given to rats the probability that a rat given the toxin will die is 0.75.

The term probability can also be applied to some of the points discussed previously. As an example, the significance level in a test of a hypothesis is the proportion, expressed in per cent, of type I errors which will be made if the hypothesis is correct. We can also speak of the probability of type I errors. Thus, at a significance level of 5 per cent, the probability of a type I error is 0.05.

Distribution of samples. Samples from the parent population in which members fall into one of two classes are like samples drawn from other populations in that they are subject to variation (unless all of the members of the population are alike). If a single member is drawn from the population it will either fall into class A or into class Ā. If a sample of size two is drawn from the parent population, three outcomes are possible: two members from class A, one member each from class A and class Ā, or two members from class Ā. The probabilities for each of these outcomes can be derived by considering the order in which the observations in the sample are drawn. To obtain a sample containing two members of class A, both the first and second observation must be in class A. The probability that the first observation is in class A is P and the probability that the second observation is in class A is also P (on the assumption that the parent population is an infinite population). The probability that both the first and second observation are in class A is the product of the two probabilities, or

$$\text{Probability of } AA = PP = P^2$$

A sample with one member each from class A and class Ā can be obtained in two ways; a sample in which a member from class A is drawn first and then a member from class Ā, or a


sample in which first a member from class Ā and then a member from class A is drawn. The probability that the first observation comes from class A is P and the probability that the second observation comes from class Ā is Q, therefore

$$\text{Probability of } A\bar{A} = PQ$$

The probability that the first observation comes from class Ā is Q and the probability that the second observation comes from class A is P, therefore

$$\text{Probability of } \bar{A}A = QP = PQ$$

Now, the probability of a sample with one member each from class A and class Ā is the total of the probabilities for the different ways in which we can obtain the sample, therefore

$$\text{Probability of either } A\bar{A} \text{ or } \bar{A}A = PQ + PQ = 2PQ$$

To obtain a sample containing two members from class Ā, both the first and second observation must be from class Ā. The probability that the first observation is from class Ā is Q and the probability that the second observation is from class Ā is Q, therefore

$$\text{Probability of } \bar{A}\bar{A} = QQ = Q^2$$

Now, if these are all of the possible outcomes, the total of the probabilities (proportions) must be 1. The proof of this is based on the fact that the three probabilities listed are the three terms in the binomial expansion of $(Q + P)^2$, that is,

$$(Q + P)^2 = Q^2 + 2PQ + P^2$$

since Q + P = 1 (see equation 5.1),

$$(Q + P)^2 = 1^2 = 1$$

That is, the total of the three probabilities is equal to 1.

If a sample of size 3 is drawn from the parent population, four outcomes are possible: none, one, two or three observations in the sample from class A. If the probabilities for each of the possible outcomes are computed in a method similar to that used for samples of size 2, the results are given in table 9.

TABLE 9
Binomial distribution for samples of size 3

No. of Observations      Order of Appearance   Probability for Order   Total
from Class A in Sample   in Sample             of Appearance           Probability
0                        ĀĀĀ                   QQQ = Q^3               Q^3
1                        AĀĀ                   PQQ = PQ^2
                         ĀAĀ                   QPQ = PQ^2
                         ĀĀA                   QQP = PQ^2              3PQ^2
2                        AAĀ                   PPQ = P^2Q
                         AĀA                   PQP = P^2Q
                         ĀAA                   QPP = P^2Q              3P^2Q
3                        AAA                   PPP = P^3               P^3

The four probabilities listed in column 4 of table 9 are the four terms in the binomial expansion of $(Q + P)^3$, that is,

$$(Q + P)^3 = Q^3 + 3PQ^2 + 3P^2Q + P^3$$

Here again the total of the probabilities will be 1, since

$$(Q + P)^3 = 1^3 = 1$$

If the method we used in determining the probabilities for the outcomes in samples of size 2 and 3 is applied to samples of greater size, it will be found that the probabilities for the different possible outcomes will coincide with the terms in a binomial expansion of (Q + P) raised to a power equal to the size of the sample. It is for this reason that the distribution of possible outcomes for samples from the parent population is called the binomial distribution.

Table 10 lists the probabilities for the general case where the sample size is equal to n; these probabilities are based on the binomial expansion of $(Q + P)^n$.

Let us consider the example of the effect of a toxin on rats for an illustration of the use of table 10. In the example, the rats had a probability of dying of 0.75. What is the probability that exactly 5 out of a sample of 12 rats will die? Here, n is 12 and k is 5, so the probability will be

$$\frac{(12)(11)(10)(9)(8)}{(1)(2)(3)(4)(5)}(0.75)^5(0.25)^7 = (792)(0.2373046875)(0.00006103515625) = 0.01147127 \tag{5.3}$$

The reader will note that the computations of the probabilities become quite involved; tables are available, however, which list the probabilities of the possible outcomes for various sizes of samples.


The tables of the National Bureau of Standards (22) list the probabilities for samples of sizes 1 to 49, and the tables of Romig (23) list the probabilities for samples of sizes 50 to 100 by steps of 5. Both sets of tables are set up for parent population probabilities by steps of 0.01.

TABLE 10
Binomial distribution for samples with n observations

No. of Observations from
Class A in Sample            Probability for Sample
0                            $Q^n$
1                            $nPQ^{n-1}$
2                            $\frac{n(n-1)}{(1)(2)} P^2 Q^{n-2}$
3                            $\frac{n(n-1)(n-2)}{(1)(2)(3)} P^3 Q^{n-3}$
...                          ...
k                            $\frac{n(n-1)(n-2)\cdots(n-k+1)}{(1)(2)(3)\cdots(k)} P^k Q^{n-k}$
...                          ...
n                            $P^n$

Parameters and statistics of the binomial distribution. The binomial distribution is a distribution of samples, i.e., if all the possible samples of a given size are drawn from the parent population, the distribution of the samples (table 10) will be the binomial distribution. The binomial distribution can be considered in two ways: (a) the proportion of observations from class A in the various possible samples; or (b) the number of observations from class A in the various possible samples. Table 10 shows the distribution for the number of observations from class A. A table of the distribution for the proportion of observations from class A is obtained by dividing the entries in column 1 of table 10 by the size of the sample, n.

The mean and variance of the binomial distribution depend upon whether the proportion or number of observations from class A is considered. If we are dealing with the proportion of observations from class A in the various samples,

$$\mu = P \tag{5.4}$$

and

$$\sigma^2 = \frac{PQ}{n} \tag{5.5}$$

However, if we are dealing with the number of observations from class A in the various possible samples,

$$\mu = nP \tag{5.6}$$

and

$$\sigma^2 = nPQ \tag{5.7}$$

One point which should be noted here is that equations 5.4 and 5.5 are also applicable when P and Q are given in terms of percentages.

When a sample is drawn from the parent population, it is drawn to obtain an estimate of the proportion, P, of the parent population which has some particular attribute, A. The sample estimate of P, denoted by p, is the proportion of observations in the sample from class A. A sample estimate of Q, denoted by q, will be 1 - p. Sample estimates of the variance of the distribution of samples are obtained by substituting p and q for P and Q in the appropriate equations. These estimates will be used later so they will not be illustrated at this point.

Significance Tests

The basic principles of significance tests for the binomial distribution are much the same as those for significance tests involving the normal distribution.

Binomial test of a single sample proportion. The basic procedure used in testing the number of class A members observed in a sample against a specified parent population proportion, P, is to compute the mean (nP) for the binomial distribution with the given size of sample and the value of P specified by the hypothesis, then determine the probability of obtaining deviations from this mean as great as or greater than the deviation of the observed number of class A members from the mean. With a two-tailed test, the probabilities for deviations of this type are computed in both tails of the distribution; with a one-tailed test, the deviations are computed for the one tail only.
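The procedure just described can be sketched in a few lines. This is only an illustration, assuming Python 3.8 or later; the function names and the small floating-point tolerance are choices made here, not part of the paper. The example call uses the 12-rat toxin case with 6 deaths observed, which is the case tabulated in table 11.

```python
from math import comb

def binomial_probability(k, n, P):
    """Probability of exactly k class A members in a sample of n."""
    return comb(n, k) * P**k * (1.0 - P)**(n - k)

def binomial_test_probability(observed, n, P, two_tailed=True):
    """Total probability of outcomes whose deviation from the mean nP
    is as great as or greater than that of the observed count."""
    mean = n * P
    deviation = abs(observed - mean)
    tol = 1e-12  # guards the >= comparison against rounding (an assumption)
    total = 0.0
    for k in range(n + 1):
        d = k - mean
        if two_tailed:
            extreme = abs(d) >= deviation - tol
        elif observed <= mean:
            extreme = d <= -deviation + tol   # lower tail only
        else:
            extreme = d >= deviation - tol    # upper tail only
        if extreme:
            total += binomial_probability(k, n, P)
    return total

# Hypothesis P = 0.75, 12 rats, 6 deaths observed, two-tailed test.
prob = binomial_test_probability(6, 12, 0.75)
print(f"two-tailed probability = {prob:.7f}")  # about 0.0861
```

The resulting probability is then compared with the chosen significance level.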


If the probability of the deviations falls below the significance level of the test, we reject the hypothesis.

This procedure can be illustrated with a hypothetical sample testing whether the proportion of rats which die when given the toxin in the example will be 0.75. If the toxin is given to 12 rats, and if the hypothesis is correct, the mean number of rats which will die for samples of size 12 will be

$$\mu = nP = (12)(0.75) = 9 \tag{5.8}$$

A two-tailed test will be used, since the proportion of rats which die in the parent population may be less than 0.75 or may be more than 0.75. Suppose that 6 of the 12 animals in the sample die; as the deviation of the observed number, 6, from the mean, 9, is 3, the probability of obtaining deviations of 3 or more from the mean is required. The numbers (of animals which would die) that have a deviation of 3 or more from the mean will include 0, 1, 2, 3, 4, 5, 6 and 12 (12 has a deviation of 3 and is included since the test is a two-tailed test). Now, if the total of the probabilities for these numbers falls below the significance level of the test, we will reject the hypothesis; however, if the total of the probabilities is greater than the significance level, the hypothesis will not be rejected. Let us use the 5 per cent (0.05) significance level. The probabilities for the various possible outcomes for the sample, taken from the National Bureau of Standards binomial distribution tables (22), are given in table 11.

TABLE 11
Binomial distribution for samples of size 12 with P = 0.75

No. of Rats That Die    Probability
 0                      0.0000001
 1                      0.0000021
 2                      0.0000354
 3                      0.0003541
 4                      0.0023898
 5                      0.0114713
 6                      0.0401494
 7                      0.1032415
 8                      0.1935777
 9                      0.2581036
10                      0.2322932
11                      0.1267054
12                      0.0316764
Total                   1.0000000

From table 11 the total of the probabilities for 0, 1, 2, 3, 4, 5, 6 and 12 rats dying is 0.0860786; since this value exceeds 0.05, the hypothesis that the value of P for the parent population is 0.75 is not rejected.

Normal approximation to the binomial distribution. Significance tests which use the binomial distribution to test hypotheses concerning sample proportions are, at best, rather involved procedures. It would be well to have some quicker and more easily used procedure in making the tests. Such a test is available from the fact that the binomial distribution can be approximated by the normal distribution. The only parameters necessary in setting up a normal distribution are the population mean and the population variance. Thus, equations 5.4 and 5.5 are used to set up the normal approximation for the binomial distribution when considering proportions of class A members, and equations 5.6 and 5.7 when considering the number of class A members.

How well the normal distribution approximates the binomial distribution depends on two factors, namely, the value of the parent population proportion of class A members, P, and the size of sample, n. When the value of P is near or equal to 0.5, the normal distribution is quite close to the binomial distribution even for small samples; however, as P departs from 0.5 the size of sample necessary to obtain a good fit increases. No attempt will be made to give any criterion for the size of sample necessary for the use of the normal approximation to the binomial distribution, since what would be considered a good fit depends upon the consequences of discrepancies between the binomial distribution and its normal approximation.

One other point to be considered is the fact that while the binomial distribution is discrete (it has values only at 0, 1, 2, ..., n), the normal curve that approximates it has a continuous distribution. The probability for a discrete point in the binomial distribution is approximated by the probability for an interval in the normal distribution, that is, the height of a bar in a bar graph is approximated by the area under a frequency curve in an interval. The method of handling this problem, as suggested by Yates (24), is quite straightforward. If the normal approximation for the probability of k members of class A in a sample is required, the area under the normal curve between (k - 1/2) and (k + 1/2) is used.
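The interval rule can be tried numerically. The sketch below is illustrative only: it assumes Python 3.8 or later, builds the normal area from the standard library's `math.erf`, and uses hypothetical function names. It compares the exact binomial probability of 6 class A members (n = 12, P = 0.75) with the normal area between 5.5 and 6.5.

```python
from math import comb, erf, sqrt

def normal_cdf(x, mean, sd):
    """Area under the normal curve to the left of x."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

def normal_point_approximation(k, n, P):
    """Area between k - 1/2 and k + 1/2 under the normal curve with
    mean nP and variance nPQ (intended here for 0 < k < n)."""
    mean = n * P
    sd = sqrt(n * P * (1.0 - P))
    return normal_cdf(k + 0.5, mean, sd) - normal_cdf(k - 0.5, mean, sd)

exact = comb(12, 6) * 0.75**6 * 0.25**6          # binomial probability of exactly 6
approx = normal_point_approximation(6, 12, 0.75)  # normal area from 5.5 to 6.5

print(f"exact binomial = {exact:.4f}, normal area = {approx:.4f}")
```

For this small sample with P well away from 0.5, the two values differ in the third decimal place, which is in line with the remarks above on when the fit is good.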


Thus, for the probability of 6 animals dying in a sample of size 12 from a parent population with P = 0.75, use the area under the normal curve with $\mu = 9$ ($\mu = nP = 9$) and $\sigma = 1.5$ ($\sigma^2 = nPQ = 2.25$) between 5.5 and 6.5 (see figure 7). Two special cases need mentioning. If k is 0, use the area for everything less than 1/2 (0 + 1/2), and for k = n, use the area for everything exceeding n - 1/2.

[Figure 7. Normal approximation to the binomial distribution for samples of size 12 with P = 0.75. Horizontal axis: number of class A members in sample (0 to 12); vertical axis: frequency.]

Normal approximation test of a single sample proportion. The significance test used for a single sample proportion depends on whether the number of class A members in the sample is being considered or the proportion of class A members in the sample. For either, the normal approximation is used. However, the mean and variance will depend on whether numbers or proportions are considered.

The u-test (not the t-test) is used for significance tests of a sample proportion. First consider tests that deal with the number of class A members in a sample. If the observed number of class A members in the sample, k, is less than nP, use

$$u = \frac{k + 1/2 - nP}{\sqrt{nPQ}} \tag{5.9}$$

If k is greater than nP, use

$$u = \frac{k - 1/2 - nP}{\sqrt{nPQ}} \tag{5.10}$$

The value 1/2 in equations 5.9 and 5.10 is called the correction for continuity and arises from the fact that the interval between (k - 1/2) and (k + 1/2) is used to approximate the binomial probability.

This test can be illustrated with the example used for the test of a sample proportion using the binomial distribution. In the example, 6 rats died in a sample of 12 animals, and the hypothesis was that P = 0.75. Since the mean from the hypothesis is 9 (equation 5.8) and the observed number of deaths, 6, is less than the mean, equation 5.9 tests the hypothesis. Thus,

$$u = \frac{6 + 1/2 - 9}{\sqrt{(12)(.75)(.25)}} = \frac{-2.5}{\sqrt{2.25}} = \frac{-2.5}{1.5} = -1.667 \tag{5.11}$$

The critical region for the u-test with a 5 per cent significance level will be values less than -2 and values exceeding +2 (see equations 4.8 and 4.9). As the value of u obtained does not fall into the critical region, the hypothesis that the probability of death is 0.75 is not rejected, the same result as in the binomial test. Some idea of how well the probabilities for the two tests agree can be gathered if the probability associated with the value of u obtained is determined. The probability obtained from the binomial test was 0.0861 (table 11); using the National Bureau of Standards tables of the normal distribution (25), we see that the probability associated with a value of u of -1.667 is 0.0955.

Now, tests that deal with the proportion of class A members in the sample can be considered. Here, again, a correction for continuity should be included; thus if the sample proportion is less than the proportion according to the hypothesis, P, use

$$u = \frac{\dfrac{k + 1/2}{n} - P}{\sqrt{\dfrac{PQ}{n}}} \tag{5.12}$$

If the sample proportion is greater than P, use

$$u = \frac{\dfrac{k - 1/2}{n} - P}{\sqrt{\dfrac{PQ}{n}}} \tag{5.13}$$

The values of u which will be obtained using these equations will be identical with the values of u obtained using equations 5.9 and 5.10, since 5.12 and 5.13 may be obtained by dividing the numerator and denominator of equations 5.9 and 5.10 by n.

This u-test can also be illustrated with the data of the previous example. Here, P = 0.75 and


the sample proportion is 0.5 (6/12), which is less than 0.75; therefore equation 5.12 is used:

$$u = \frac{\dfrac{6 + 1/2}{12} - 0.75}{\sqrt{\dfrac{(.75)(.25)}{12}}} = \frac{0.541667 - 0.75}{\sqrt{.015625}} = \frac{-0.208333}{0.125} = -1.667 \tag{5.14}$$

This is identical with the value of u obtained in the previous test (see equation 5.11), therefore the decision will be the same.

Normal approximation test for two sample proportions. The extension of the u-test to the case of two sample proportions is quite straightforward. The correction for continuity is made by subtracting 1/2 from the numerator of the larger sample proportion and adding 1/2 to the numerator of the smaller sample proportion; also, a pooled estimate of the variance (similar to the pooled estimate of the common variance in the t-test for two sample means) is used. This pooled estimate of the variance comes from the hypothesis, which is usually that the parent population proportions are equal. If the parent population proportions are equal, the best estimate of the variance will come from the estimates p and q from the combined samples.

The following notation is needed for the purpose of setting up the u-test. Let

$n_L$ = the size of the sample with the larger sample proportion
$k_L$ = the number of class A members in the sample with the larger sample proportion
$n_S$ = the size of the sample with the smaller sample proportion
$k_S$ = the number of class A members in the sample with the smaller sample proportion

Then

$p_L = \dfrac{k_L - 1/2}{n_L}$ = the proportion of class A members in the sample with the larger sample proportion, corrected for continuity
$p_S = \dfrac{k_S + 1/2}{n_S}$ = the proportion of class A members in the sample with the smaller sample proportion, corrected for continuity
$p = \dfrac{k_L + k_S}{n_L + n_S}$ = the pooled estimate of the common parent population proportion (if the hypothesis is true)

The u-test for the hypothesis that the two parent population proportions are equal will be

$$u = \frac{p_L - p_S}{\sqrt{\dfrac{pq}{n_L} + \dfrac{pq}{n_S}}} \quad \text{or} \quad \frac{p_S - p_L}{\sqrt{\dfrac{pq}{n_S} + \dfrac{pq}{n_L}}} \tag{5.15}$$

where

$$q = 1 - p$$

The test can be illustrated with a hypothetical check on a purification procedure for a toxin; for example, starting with some original material containing a toxin, suppose we are attempting to isolate the toxin in pure form. Now, after one or more stages in the isolation technique, the purified product is tested to determine if the toxicity has increased, which would indicate that the isolation technique was succeeding. To test the toxicity of the starting material and the purified product, we could find whether the proportion of animals showing a toxic reaction (or death) with the original material is the same as the proportion for animals given the purified product. Suppose a test is made on the two materials with the results summarized in table 12.

TABLE 12
Data from a hypothetical test of a purification procedure for a toxin

Test Material        Toxic Reaction   No Toxic Reaction   Total   Proportion of Toxic Reactions*
Original material          4                 8              12            0.333
Purified product          12                 2              14            0.857
Total                     16                10              26            0.615

* Not corrected for continuity.

Now, the proportion of toxic reactions is larger for the purified product, therefore

$$p_L = \frac{12 - 1/2}{14} = \frac{11.5}{14} = 0.821$$

$$p_S = \frac{4 + 1/2}{12} = \frac{4.5}{12} = 0.375$$

$$p = \frac{12 + 4}{14 + 12} = \frac{16}{26} = 0.615$$

$$u = \frac{.375 - .821}{\sqrt{\dfrac{(.615)(.385)}{12} + \dfrac{(.615)(.385)}{14}}} = \frac{-.446}{\sqrt{.019731 + .016912}} = \frac{-.446}{\sqrt{.036643}} = \frac{-.446}{.191} = -2.34 \tag{5.16}$$
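Both u-tests worked above reduce to a few lines of arithmetic. The sketch below is an illustration under stated assumptions, not the paper's own procedure in code: the function names are hypothetical, and the handling of a count exactly equal to nP (no correction applied) is an assumption, since the text does not cover that case.

```python
from math import sqrt

def u_one_sample(k, n, P):
    """Equations 5.9 and 5.10: continuity-corrected u for an observed
    count k tested against the hypothesised proportion P."""
    mean = n * P
    if k < mean:
        corrected = k + 0.5
    elif k > mean:
        corrected = k - 0.5
    else:
        corrected = k  # assumption: no correction when k equals nP
    return (corrected - mean) / sqrt(n * P * (1.0 - P))

def u_two_sample(k1, n1, k2, n2):
    """Equation 5.15 in the form (p_S - p_L) / sqrt(pq/n_S + pq/n_L),
    with the continuity correction and the pooled estimate p."""
    if k1 / n1 >= k2 / n2:
        kL, nL, kS, nS = k1, n1, k2, n2
    else:
        kL, nL, kS, nS = k2, n2, k1, n1
    pL = (kL - 0.5) / nL          # larger proportion, corrected downward
    pS = (kS + 0.5) / nS          # smaller proportion, corrected upward
    p = (kL + kS) / (nL + nS)     # pooled estimate under the hypothesis
    q = 1.0 - p
    return (pS - pL) / sqrt(p * q / nS + p * q / nL)

u_single = u_one_sample(6, 12, 0.75)   # rat example, equation 5.11
u_pair = u_two_sample(4, 12, 12, 14)   # table 12, equation 5.16

print(f"one-sample u = {u_single:.3f}")
print(f"two-sample u = {u_pair:.2f}")
```

The two-sample value computed without rounding the intermediate figures is about -2.33; equation 5.16 reaches -2.34 because it carries three-decimal intermediates. Either value is below -2.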


The value of u obtained falls in the critical region for the 5 per cent significance level (u is less than -2) so the hypothesis that the parent population proportions are equal is rejected. Since the sample proportion of toxic reactions is greater for the purified product, there is a statistically significant rise in the proportion of toxic reactions between the original material and the purified product.

A table of the type shown in table 12 is called a 2 X 2 contingency table, because the original data can be shown in a table consisting of two rows and two columns. The totals shown in the third row and the fourth column are called the marginal totals. Note that the proportion shown in the third row of the fifth column is the pooled estimate of the common population proportion. However, the two sample proportions are not the ones that are used in the u-test, since the table does not contain the correction for continuity. A table that contains the correction for continuity can be set up by subtracting 1/2 from the numerator of the larger proportion and adding or subtracting 1/2 from the remaining numbers in the 2 X 2 portion of the table in such a way as to maintain the marginal totals of the original data. An example of such a table is table 13, which shows the data in table 12 corrected for continuity; the values of $p_L$, $p_S$, and p which are needed in the u-test appear in the fifth column of the table.

TABLE 13
Data from table 12 corrected for continuity

Test Material        Toxic Reaction   No Toxic Reaction   Total   Proportion of Toxic Reactions
Original material         4.5               7.5             12            0.375
Purified product         11.5               2.5             14            0.821
Total                    16                10               26            0.615

Normal approximation test for more than two sample proportions. Some problems may lead to a test of more than two sample proportions; for example, a check of the toxicity of the toxin produced by a given organism under different conditions of temperature or pH or when cultured in different media, or a test of the proportion of positive reactions to tuberculin produced by different laboratories or by different processes. If we have more than two samples, say c samples, we can set the data up in the form of a contingency table as shown in table 14.

TABLE 14
General form of a 2 X c contingency table (two rows by c columns)

                      Sample
Class      1          2          ...    c            Total                      Pooled Estimates
$A$        $k_1$      $k_2$      ...    $k_c$        $\Sigma k$                 $p = \Sigma k / \Sigma n$
$\bar{A}$  $n_1-k_1$  $n_2-k_2$  ...    $n_c-k_c$    $\Sigma n - \Sigma k$      $q = (\Sigma n - \Sigma k)/\Sigma n$
Total      $n_1$      $n_2$      ...    $n_c$        $\Sigma n$

When a significance test on more than two sample proportions is made, the chi-square test is used instead of the u-test, and the correction for continuity is dropped. Yates (24) suggested that for contingency tables larger than the 2 X 2 table, there appears to be less need for the correction for continuity. The chi-square test presents a method of comparing the number of members in each of the classes for each of the samples which were observed (we can denote the observed number by O) with the number of class members which would be expected if the hypothesis to be tested were true (we denote the expected number by E). The usual hypothesis is that the proportion of members in each class is the same in the parent populations from which the samples were drawn. If this hypothesis is true, the pooled estimates of the proportions (p and q) will be the best estimates of the proportions which are common to the parent populations. These estimates are obtained by dividing the total of the members for the respective classes by the total number of members in the samples, as shown in table 14. Now, if the hypothesis is true, the expected proportions in each of the samples will be the same as the pooled estimates of the common proportions; thus, the expected proportion of class $A$ members in each sample will be p and the expected proportion of class $\bar{A}$ members will be q.
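The pooled estimates of table 14, and the chi-square statistic built from them (the sum of (O - E)^2 / E over all cells, equation 5.17), can be sketched as follows. This is a hedged illustration: the function name and the list-based layout of the table are assumptions made here, and the counts are those of the three-media example of table 15.

```python
def chi_square_2xc(k, n):
    """Chi-square for a 2 x c contingency table.

    k[i] is the number of class A members in sample i and n[i] is the
    size of sample i. Expected numbers are n[i]*p and n[i]*q, where p
    and q are the pooled estimates of table 14.
    """
    total_k = sum(k)
    total_n = sum(n)
    p = total_k / total_n   # pooled estimate for class A
    q = 1.0 - p             # pooled estimate for class not-A
    chi2 = 0.0
    for ki, ni in zip(k, n):
        expected_a = ni * p
        expected_not_a = ni * q
        chi2 += (ki - expected_a) ** 2 / expected_a
        chi2 += ((ni - ki) - expected_not_a) ** 2 / expected_not_a
    return p, q, chi2

# Three media: 3 of 15, 11 of 18 and 9 of 17 animals show a toxic reaction.
p, q, chi2 = chi_square_2xc([3, 11, 9], [15, 18, 17])

print(f"pooled p = {p:.2f}, pooled q = {q:.2f}")
print(f"chi-square = {chi2:.4f} with {3 - 1} degrees of freedom")
```

The statistic is then compared with the tabled chi-square value for c - 1 degrees of freedom at the chosen significance level.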


To obtain the expected numbers for each sample, multiply the number of observations for the sample by p and q in turn. Thus, the expected number of class $A$ members in any sample, say the ith sample, will be the product of p and the number of observations in the sample, or

$$E = n_i p = (n_i)\left(\frac{\Sigma k}{\Sigma n}\right) = \frac{(n_i)(\Sigma k)}{\Sigma n}$$

Similarly, the expected number of class $\bar{A}$ members in any sample will be

$$E = n_i q = (n_i)\left(\frac{\Sigma n - \Sigma k}{\Sigma n}\right) = \frac{(n_i)(\Sigma n - \Sigma k)}{\Sigma n}$$

Having obtained the expected values for the various cells in the contingency table, we are now ready to set up the test. This test is based on the fact that the quantity

$$\chi^2 = \Sigma \frac{(O - E)^2}{E} \tag{5.17}$$

follows the chi-square distribution with c - 1 degrees of freedom. The test is one-tailed; the procedure for obtaining chi-square is to square the difference between the observed and expected values, divide the resulting squared term by the expected value for each cell and then add up the resulting terms over all of the cells.

The test will be illustrated with a hypothetical example of toxin production by a given organism on each of three different media. On testing the three preparations on a group of animals, suppose the results given in table 15 are obtained. Since there are three samples, chi-square will have 3 - 1 or 2 degrees of freedom. The value of $\chi^2_{.95}$ for 2 degrees of freedom is 5.99, therefore our critical region for the test will be all values of chi-square which exceed 5.99. The details of the computations are shown in table 15.

TABLE 15
Test of data from a hypothetical experiment on toxin production by an organism on three media

                          Medium
Reaction            1         2         3        Total

I. Observed values (O)
$A$                 3        11         9        23*
$\bar{A}$          12         7         8        27*
Total              15        18        17        50
Proportion $A$      0.20      0.61      0.53

II. Expected values (E)
$A$                 6.90      8.28      7.82     23.0
$\bar{A}$           8.10      9.72      9.18     27.0
Total              15.0      18.0      17.0

III. Differences (O - E)
$A$                -3.90     +2.72     +1.18      0
$\bar{A}$          +3.90     -2.72     -1.18      0

IV. $(O - E)^2 / E$
$A$                 2.2043    0.8935    0.1781
$\bar{A}$           1.8778    0.7612    0.1517
$\chi^2$ = 6.0666

$A$ = toxic reaction.
$\bar{A}$ = nontoxic reaction.
* Pooled estimate for $A$ = 0.46; for $\bar{A}$ = 0.54.

Since the value of chi-square obtained in the test falls into the critical region, the hypothesis that the toxin production is the same for all three media is rejected.

Graphical methods for tests of sample proportions. Mosteller and Tukey (29) have developed a graphical method for testing hypotheses concerning one or more sample proportions. The method, being graphical, is necessarily somewhat crude; however, it is sufficiently precise for many problems and provides a quick and easy method for testing such hypotheses.

Problems of Estimation

Using the normal approximation, approximate confidence intervals can be obtained for the parent population proportions from sample estimates. The concepts of application of confidence intervals to the normal approximation to the binomial distribution are the same as those for the normal distribution. With the normal approximation to the binomial distribution, the u-test is used to derive the confidence intervals instead of the t-test as was done with the normal distribution.

Confidence interval for a population proportion. Generally, no correction is made for continuity in setting up the confidence intervals for the normal approximation to the binomial distribution. If the confidence interval is derived as was done for the normal distribution (equations 4.62 through 4.66) the following confidence interval


for a population proportion with a confidence coefficient of 95 per cent is obtained:

$$p - u_{.975}\sqrt{\frac{pq}{n}} < P < p - u_{.025}\sqrt{\frac{pq}{n}} \tag{5.18}$$

where p and q are the sample estimates of P and Q. Substituting the values of u in inequality 5.18,

$$p - 2\sqrt{\frac{pq}{n}} < P < p + 2\sqrt{\frac{pq}{n}} \tag{5.19}$$

The confidence interval for a confidence coefficient of 99 per cent is determined by substituting 2.576 for 2 in inequality 5.19. In this inequality, pq/n is the sample estimate of the variance. One point to be noted is that inequality 5.19 is applicable whether p and q are given in terms of proportion or percentage.

The use of the confidence interval given in inequality 5.19 can be illustrated with a hypothetical example. With a new toxin preparation, 3 out of 6 animals given the preparation showed a toxic reaction. What is the parent population proportion of animals that would show a toxic reaction? Fifty per cent of the animals in the test showed a toxic reaction, so the confidence interval would be

$$50 - 2\sqrt{\frac{(50)(50)}{6}} < P < 50 + 2\sqrt{\frac{(50)(50)}{6}}$$

$$50 - (2)(20.4) < P < 50 + (2)(20.4)$$

$$50 - 40.8 < P < 50 + 40.8 \tag{5.20}$$

$$9.2\% < P < 90.8\%$$

In addition to illustrating the use of inequality 5.19, this example also serves to emphasize that sample estimates of parent population proportions are quite imprecise when the sample size is small. This point is also illustrated in table 16, which shows the 95 per cent confidence limits for samples of various sizes in which 50 per cent of the sample members are from class A.

TABLE 16
Ninety-five per cent confidence limits for samples of various sizes with p = 0.5

Sample Size    Lower Limit (%)    Upper Limit (%)
   4                0                 100
   6                9.2                90.8
  10               18.4                81.6
  20               27.6                72.4
  50               35.8                64.2
 100               40.0                60.0
 250               43.6                56.4

Sample size and power in significance tests involving proportions. The effect of the imprecision of sample estimates of proportions based on small samples shows up not only in the confidence intervals, but also in tests of hypotheses. The imprecision of the sample estimates from small samples serves to decrease the power of the significance tests. When this happens, there exists a greater probability of accepting a hypothesis that is actually wrong. This fact can be illustrated by the determination of the sample sizes necessary to obtain a given power for tests of significance between samples taken from two populations with specified proportions. Davis and Zippin (30) have given charts which may be used to determine sample sizes for sampling from two populations with specified proportions for powers of the tests of 50 and 80 per cent. These two charts are given in figures 8 and 9. The method of finding the size of sample, as given by Davis and Zippin, is:

1. Find the vertical line whose value is that of the smaller of the two population percentages being compared.
2. Follow this vertical line up until it crosses the curved line corresponding to the other percentage.
3. From the point of intersection, read horizontally to determine the value of the horizontal line on which the intersection occurs. This value will be the size required for each sample.

To illustrate the use of the charts of Davis and Zippin, suppose an industrial firm is producing an antibiotic which, from past experience, seems to be effective in 50 per cent of cases of a particular type of disease. Further, a research unit of the firm has produced a new antibiotic which they wish to test against the antibiotic now in production to learn whether it will be economically feasible to place the new antibiotic in production. The cost of production of the new antibiotic is such that it will be economically feasible only if the new product will be effective in at least 75 per cent of the cases. The problem is to determine the size of the samples for the test necessary to detect a difference in proportions of this magnitude. The smaller of the two population percentages is 50 per cent, so in figure 8 find the vertical line corresponding to this value, then follow this vertical line to the point where it crosses the line corresponding to the other population percentage, 75 per cent. From the point of intersection, which is on one of the horizontal lines, read horizontally in either direction to determine the size of the samples; here, the size is 30.


[Figure 8. Chart for estimating number of animals required in each sample for statistical significance between percentages. Significance level, 5 per cent; power, 50 per cent. Horizontal axis: per cent in one sample. (Figures 8 and 9 originally appeared in The Journal of Wildlife Management (30). These figures are reproduced by permission of Dr. David E. Davis and the editor of that journal.)]

[Figure 9. Chart of the same kind; only the label "per cent in the other population" survives in this copy.]


on one of the horizontal lines, read horizontally in either direction to determine the size of the samples; here, the size is 30. Thus, two samples each of size 30 are necessary for the test. Now, the chart used was that for a power of 50 per cent, i.e., if the sample size is 30 for each sample, and if the difference between the two population percentages is that given, then in 50 per cent of such tests the conclusion will be reached that there is a significant difference between the two population proportions using a test with a 5 per cent level of significance. On the other hand, this means that there is a 50 per cent chance that even though there is the required difference between the two population proportions, the significance test on the two samples will lead to the conclusion that there is no significant difference between the two products, and will lead to the dropping of the new product. The research group would probably wish more assurance that their labors will not go unheeded if indeed they have come up with a new antibiotic which meets the necessary specifications, so let us see what happens to the sample size when the power of the test is increased from 50 to 80 per cent. Using figure 9 to determine the sample size, we proceed as before; this time the point of intersection of the two lines is on the horizontal line corresponding to a sample size of 60. Thus, to increase the power of the test from 50 to 80 per cent the size for each of the two samples must be doubled.

This example shows that increasing the power of a test requires increasing the sample size. Examination of the charts in figures 8 and 9 also shows that as the size of difference between the population proportions to be detected diminishes, the size of sample becomes increasingly large.

Other methods for a confidence interval for a population proportion. Other methods are also available for obtaining a confidence interval for a single population proportion. Clopper and Pearson (31) presented charts which can be used to obtain the exact (not normal approximation) confidence interval for a population proportion. Snedecor (32) gives tables based on the Clopper-Pearson charts. The binomial probability graph paper of Mosteller and Tukey (29) can also be used to obtain a confidence interval for a population proportion.

Confidence interval for the difference between two population proportions. The normal approximation is used to obtain a confidence interval for the difference between two population proportions, again with no correction for continuity. Since a difference between the population proportions is assumed, the pooled p and q are not used in estimating the variance. The confidence interval is derived as before:

$$p_1 - p_2 - (u_{.975})(s_{p_1-p_2}) < P_1 - P_2 < p_1 - p_2 - (u_{.025})(s_{p_1-p_2}) \qquad (5.21)$$

where

$$s_{p_1-p_2} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}} \qquad (5.22)$$

The confidence interval for the difference between two population proportions can be illustrated with the data in table 12, which has been used to illustrate the u-test for the difference between two proportions. The conclusion reached was that there was a significant difference between the two proportions, so now the next step in the process can be taken and an estimate of the difference obtained. Since the data are not to be corrected for continuity, table 12 is used rather than table 13. From table 12,

$$p_1 = 0.333 \quad (q_1 = 0.667), \qquad n_1 = 12$$

$$p_2 = 0.857 \quad (q_2 = 0.143), \qquad n_2 = 14$$

therefore

$$s_{p_1-p_2} = \sqrt{\frac{(0.333)(0.667)}{12} + \frac{(0.857)(0.143)}{14}} = 0.165 \qquad (5.23)$$

Thus,

$$0.333 - 0.857 - 2(0.165) < P_1 - P_2 < 0.333 - 0.857 + 2(0.165)$$

$$-0.524 - 0.330 < P_1 - P_2 < -0.524 + 0.330 \qquad (5.24)$$

$$-0.854 < P_1 - P_2 < -0.194$$

That is, the purification increased the proportion of toxic reactions by something between 0.194 and 0.854 or between 19.4 and 85.4 per cent.

VI. THE POISSON DISTRIBUTION

The third and final of the basic distributions of statistics to be discussed is the Poisson distribution. The discussion of the Poisson distribution along with its history and applications was ably handled by Eisenhart and Wilson (1, pages 62 to 92); therefore, the treatment here will be held to a minimum.

on March 6, 2020 by guest

http://mm

br.asm.org/

Dow

nloaded from

Page 46: mmbr.asm.org · STATISTICAL CONCEPTS IN MICROBIOLOGY' ROEBERTL. STEARMAN2 DepartmentofBiostatistics, SchoolofHygieneandPublicHealth, TheJohnsHopkins University, Baltimore, Maryland

1955] STATISTICAL CONCEPTS IN MICROBIOLOGY 205

Derivation of the Poisson Distribution

There are two major methods for deriving the Poisson distribution. In one, the Poisson distribution becomes the limit of the binomial distribution as n becomes large and at the same time P becomes small in such a way that the mean, $\mu = nP$, remains finite. In the other, if events or items are randomly distributed in time or space, then the number of events or items in samples taken with respect to time or space will have a Poisson distribution.

Limit of the binomial distribution. It has been already pointed out that the binomial distribution can be approximated by the normal distribution, and also that this approximation works well when the value of P is near or equal to 50 per cent and when the size of sample is large. On the other hand, if the proportion, P (or Q), of class A members in the population becomes small, say less than 0.01, and if we take large samples (of size n) so that the mean number of class A members, $\mu = nP$, is some small number, then the distribution of the number of class A members in the population of all possible samples can be approximated by the Poisson distribution. The Poisson distribution, like the binomial distribution, is the distribution of sample outcomes. The probabilities for the various possible outcomes are given in table 17.

The expression for the probability of k class A members in a sample can be simplified by the use of the notation $k! = (1)(2)(3)\cdots(k)$, where k! is read factorial k. As an example, 5! is (1)(2)(3)(4)(5) = 120. Then, the probability of k class A members in a sample is

$$\frac{(nP)^k e^{-nP}}{k!} \qquad (6.1)$$

(by definition, 0! = 1).

TABLE 17
Poisson distribution

No. of observations from class A in sample | Probability for sample
0 | $e^{-nP}$
1 | $\dfrac{nP\,e^{-nP}}{1}$
2 | $\dfrac{(nP)^2 e^{-nP}}{(1)(2)}$
3 | $\dfrac{(nP)^3 e^{-nP}}{(1)(2)(3)}$
... | ...
k | $\dfrac{(nP)^k e^{-nP}}{(1)(2)\cdots(k)}$

Items or events randomly distributed in time or space. As stated before, if events or items are randomly distributed in time or space, then the number of events or items in samples taken with respect to time or space will have a Poisson distribution. An example of items in space would be bacteria in a liquid such as milk or water. If the bacteria are randomly distributed in the liquid, then the number of bacteria in samples of a given volume, v, will have a Poisson distribution and the probability of obtaining k bacteria in a sample will be

$$\frac{(\lambda v)^k e^{-\lambda v}}{k!} \qquad (6.2)$$

where the lower case Greek letter lambda ($\lambda$) is the density or average number of bacteria per unit volume of the liquid.

An example of events randomly distributed in time would be cosmic rays. Here the number of cosmic rays counted in intervals of given length of time, t, will have a Poisson distribution and the probability of obtaining k cosmic rays in an interval of time will be

$$\frac{(\lambda t)^k e^{-\lambda t}}{k!} \qquad (6.3)$$

Here, lambda is the average number of cosmic rays per unit time.

General form of probability for Poisson distribution. All expressions for the probability of k events or items in a sample can be given in one form, namely

$$\text{Probability of } k \text{ events in sample} = \frac{\mu^k e^{-\mu}}{k!} \qquad (6.4)$$

where $\mu$ is the mean number of events. The previous three expressions for the probability (6.1, 6.2, 6.3) can be obtained by appropriate substitution for $\mu$. In the limit of the binomial, the mean number of class A members was nP. For the bacteria randomly distributed in a liquid, the mean number of bacteria for a given


206 ROEBERT L. STEARMAN [VOL. 19

volume, v, will be $\lambda v$. For the cosmic rays, the mean number of rays for a time interval, t, will be $\lambda t$. Thus, all expressions for the probability of k items or events in a sample can be lumped into one common expression, given by equation 6.4. Tables of the probabilities are given by Molina (33) and Kitagawa (34).

Mean and variance of the Poisson distribution. The mean and the variance of the binomial distribution are related; that is, if the proportion of class A members is considered, the mean is P and the variance is PQ/n. The relationship of the mean and variance of the Poisson distribution is even more striking; both the mean and the variance are equal to $\mu$. The standard deviation of the Poisson distribution will be $\sqrt{\mu}$.

Randomness is essential in the distribution of the events or items in time or space to obtain the Poisson distribution. If the events or items are not random, the mean of all possible samples will still remain $\mu$, but the variance will not be $\mu$. The effect of non-randomness on the variance will be discussed in the subsection on applications of the Poisson distribution.

Applications of the Poisson Distribution

Bacterial counts by chamber method. The Poisson distribution for the number of bacteria per square in a counting chamber will be obtained if the bacteria are randomly distributed in the chamber. This follows from considering the volume of liquid above each square to be the sample of volume, v, in the example of bacteria in liquid. Certain conditions must be met before the distribution of the bacterial cells in the chamber will be random: (a) cells should not repel one another, or else there must be sufficient space among the cells so that the repelling effect will be negligible; (b) the volume of the cells, relative to the volume of liquid in which they are suspended, should be small; (c) there should be no clumping of cells.

If the repelling effect among the cells is not negligible, or if the relative volume of the cells is too high, the cells in the chamber will tend to be uniformly or homogeneously distributed rather than random. That is, the numbers obtained will tend to be closer to the mean number than would be expected if the cells were randomly distributed. Here, the mean number of cells per square will remain the same, as long as the crowding is not great enough to cause some cells to cover others. However, the variance will be smaller than that of the Poisson distribution, since the observed numbers are clustered more closely about the mean than they would be in a Poisson distribution.

If there is clumping of the cells, the numbers obtained will tend to be more divergent from the mean number than we would expect from a Poisson distribution. That is, there will be too many squares with relatively large numbers of cells and too many squares with few cells. Here, if the clumping is sufficient to cause some of the cells to be hidden under others, the mean will be decreased. Also, the clumping will bring about an increase in the variance due to the divergence of the observed numbers from the mean.

Bacterial counts by plate method. The Poisson distribution will be applicable to this count if the bacteria are randomly distributed in the liquid being sampled. Also, if the repelling effect among the cells is not negligible or if the relative volume of the cells is too high, the mean count will remain the same but the variance of the counts will be less than that expected from a Poisson distribution. If there is clumping, the additional complication obtains of a clump of bacteria giving rise to a single colony. Thus, again, the same conditions imposed for bacterial counts by chamber method must be met.

Bacterial counts from dilution series. The estimation of bacterial densities by dilution series is one of the oldest applications of statistics to microbiology. The estimate obtained is known as the Most Probable Number (MPN). Tables for determining the MPN, published by the American Public Health Association (35, pages 220 and 221), are set up for 10-fold dilutions and for certain combinations of 5- and 10-fold dilutions.

Cochran (36) published a table, given here as table 18, which can be used to obtain confidence limits for the number of bacteria from a MPN, for dilution ratios of 2, 4, 5 and 10. The lower confidence limit is obtained by dividing the MPN by the factor and the upper confidence limit is obtained by multiplying the MPN by the factor. The use of Cochran's table can be illustrated by obtaining confidence limits for a value in one of the A. P. H. A. tables (35, page 221). This table is set up for five tubes per dilution with a dilution ratio of 10, thus, the


factor from Cochran's table will be 3.30. Now, suppose that 4 out of 5 tubes containing 0.1 ml. of the liquid being tested showed growth and 3 out of 5 containing 1 ml. and 5 out of 5 tubes containing 10 ml. showed growth. The value of the Most Probable Number will be 59 per ml. The upper confidence limit will be

$$(59)(3.30) = 194.7$$

and the lower confidence limit will be

$$59/3.30 = 17.9$$

Thus we see that our estimate of the number of bacteria per ml. of the liquid will be between 17.9 and 194.7.

TABLE 18*
Factors for confidence limits for most probable number

No. of samples per dilution | Factor for 95% confidence limits, dilution ratio 2 | dilution ratio 4 | dilution ratio 5 | dilution ratio 10
1 | 4.00 | 7.14 | 8.32 | 14.45
2 | 2.67 | 4.00 | 4.47 | 6.61
3 | 2.23 | 3.10 | 3.39 | 4.68
4 | 2.00 | 2.68 | 2.88 | 3.80
5 | 1.86 | 2.41 | 2.58 | 3.30
6 | 1.76 | 2.23 | 2.38 | 2.98
7 | 1.69 | 2.10 | 2.23 | 2.74
8 | 1.64 | 2.00 | 2.12 | 2.57
9 | 1.58 | 1.92 | 2.02 | 2.43
10 | [illegible] | 1.86 | 1.9[?] | 2.32

* This table originally appeared in Biometrics, Volume 6, page 115, 1950, and is reproduced here with the permission of Professor William G. Cochran and the editor of that journal.

Finney (37) has given a computational method for obtaining the Most Probable Number which has certain advantages: (a) it is readily computed for any dilution ratio; (b) the number of tubes need not be the same for all dilutions (this would be especially useful in the event of breakage through mishap); (c) confidence limits can be obtained directly from the computations.

VII. ACKNOWLEDGMENTS

I am indebted to my wife, Barbara D. Stearman (Technician in Charge, Diagnostic Bacteriology Laboratory, The Johns Hopkins Hospital), for her encouragement and for serving as a microbiologist reader of the first draft of this review, as well as to Professor William G. Cochran and Doctor Margaret Merrell (Department of Biostatistics, The Johns Hopkins University) for checking the first draft for statistical content and readability.

Thanks are due to the students and staff of the School of Hygiene and Public Health, The Johns Hopkins University, as well as to various persons in the fields of microbiology and statistics for their criticisms of the "Ditto" copies of the manuscript. The "Ditto" copies of the manuscript, as well as the final publication of this review, would not have been possible without the patient help of Miss Virginia Brooke Thompson and Mrs. Adrienne Holland.

VIII. APPENDIX

There are two slightly different computing tables used in the analysis of variance. One table will be used when there are an equal number of observations in each of the samples and another table will be used when the number of observations are not equal for the different samples. The method for samples of equal size can be illustrated using the data in table 5.

A few new notations will be required to set up the analysis of variance computing table. Starting with data from an experiment with k treatments and n observations per treatment (in the example, k is 3 and n is 10), let $x_{ij}$ be the ith observation from the jth treatment; thus, i will run from 1 to n and j will run from 1 to k. For example, if i is 5 and j is 3, this would give $x_{53}$, which would be the fifth observation from the third treatment (in the example this would be the fifth observation in the sample from the control, or 127). The total of the observations for the jth treatment will be denoted by $T_{.j}$. Thus, $T_{.j}$ will be the sum of all of the observations in the sample from the jth treatment or

$$T_{.j} = \sum_{i=1}^{n} x_{ij} = x_{1j} + x_{2j} + x_{3j} + \cdots + x_{nj} \qquad (8.1)$$

In the example, $T_{.2}$ would be the sum of the observations in the second treatment or 1,285. Let $T_{..}$ stand for the grand total of all of the observations in the experiment, thus,

$$T_{..} = \sum_{j=1}^{k}\sum_{i=1}^{n} x_{ij} = \sum_{j=1}^{k} T_{.j} = T_{.1} + T_{.2} + \cdots + T_{.k} \qquad (8.2)$$


In the example, $T_{..}$ would be the total of the totals, or 3,358.

The reader should note that in this notation, a dot in the subscript of a T shows that the letter replaced by the dot was the letter on which summation took place to obtain the total. Starting with $x_{ij}$, the treatment total, $T_{.j}$, is obtained when summation takes place on i while j remains constant, therefore the letter i is replaced by a dot in the subscript of T. The grand total, $T_{..}$, results from the summation on both i and j, therefore both letters are replaced by dots in the subscript of T.

The summation notation can be simplified by using only i or j as an index to the summation sign, since the limits of both i and j are known, thus

$$T_{.j} = \sum_i x_{ij} \qquad (8.3)$$

and

$$T_{..} = \sum_j \sum_i x_{ij} = \sum_j T_{.j} \qquad (8.4)$$

Table 19 is the analysis of variance computing table for an experiment with equal numbers of observations. Table 20 illustrates the use of table 19 with the data from table 5. The analysis of variance computing table has two parts: the first part consists of the preliminary calculations, while the second part is the analysis of variance table. When the results of an analysis of variance are published, only the second part is placed in the publication.

TABLE 19
Analysis of variance computing table for samples with equal numbers of observations

Part 1: Preliminary calculations
(1) Source of variation | (2) Total of squares | (3) No. of items squared | (4) Observations per squared item | (5) Total of squares per observation, (2) ÷ (4)
Correction* | $T_{..}^2$ | 1 = a | nk | $T_{..}^2/nk = A$
Treatments | $\sum_j T_{.j}^2$ | k = b | n | $\sum_j T_{.j}^2/n = B$
Observations | $\sum_j \sum_i x_{ij}^2$ | nk = c | 1 | $\sum_j \sum_i x_{ij}^2 = C$

Part 2: Analysis of variance table
(6) Variation due to | (7) Sum of squares (S.S.) | (8) Degrees of freedom (d.f.) | (9) Mean square ($s^2$), (7) ÷ (8) | (10) F
Treatments | B - A | b - a | (B - A)/(b - a) = M.S.T | M.S.T/M.S.O
Observations | C - B | c - b | (C - B)/(c - b) = M.S.O |
Total | C - A | c - a | |

* The correction term does not constitute a source of variation.

TABLE 20
Analysis of variance computing table for the data in table 5

Part 1: Preliminary calculations
(1) Source of variation | (2) Total of squares | (3) No. of items squared | (4) Observations per squared item | (5) Total of squares per observation, (2) ÷ (4)
Correction* | 11,276,164 | 1 | 30 | 375,872.1
Treatments | 3,957,250 | 3 | 10 | 395,725.0
Plate counts | 398,408 | 30 | 1 | 398,408.0

Part 2: Analysis of variance table
(6) Variation due to | (7) Sum of squares (S.S.) | (8) Degrees of freedom (d.f.) | (9) Mean square ($s^2$), (7) ÷ (8) | (10) F
Treatments | 19,852.9 | 2 | 9,926.45 | 99.89
Plate counts | 2,683.0 | 27 | 99.37 |
Total | 22,535.9 | 29 | |

* The correction term does not constitute a source of variation.

The first line of the preliminary calculations is devoted to the correction term. The correction term does not usually constitute a source of


variation; it is placed in the table for computational convenience. The second line is for computations involving the treatment totals, while the third line involves the observations of the experiment.

The second column contains the totals of squared terms. The first line contains the square of the grand total ($3{,}358^2 = 11{,}276{,}164$). The second line contains the sum of the squares of the treatment totals ($756^2 + 1{,}285^2 + 1{,}317^2 = 3{,}957{,}250$). The third line of the second column contains the sum of the squares of the observations (398,408).

The third column lists the number of items which were squared to obtain the entries in the second column. In the first line, this was a single item, the grand total; in the second line, each of the k totals, so the entry here is k (k = 3); in the third line each of the nk observations in the experiment, so the entry is nk (nk = 30).

The fourth column lists the number of observations which make up each of the items that were squared. In the first line, the grand total is the sum of the nk observations (nk = 30). In the second line each of the treatment totals was the sum of n observations (n = 10). In the third line each of the observations has been squared so the entry will be 1. Note that the product of an entry in column 3 multiplied by the respective entry in column 4 is always nk, the total number of observations. This fact gives a quick check on the entries in these columns; in the example, nk is 30, so the product will be 30.

The entries in the fifth column are obtained by dividing the entry in the same line of column 2 by the respective entry in the fourth column. The entries in the third and fifth columns have been designated by letters for convenience in setting up the entries in the analysis of variance table. The entries in the third column have been denoted by lower case letters, while the entries in column 5 have been denoted by capital letters.

Turning to the analysis of variance table, the entries in the seventh and eighth columns are obtained by simple subtractions involving the entries in columns 3 and 5 of the preliminary calculations. To obtain the treatment S.S., subtract A from B (395,725.0 - 375,872.1 = 19,852.9). To obtain the S.S. for the observations (the term error is often used in place of observations) subtract B from C (398,408.0 - 395,725.0 = 2,683.0). To obtain the total S.S., subtract A from C (398,408.0 - 375,872.1 = 22,535.9). The entries in column 7 represent the splitting of the total S.S. into two parts which represent the S.S. for the variation among the sample means (this is the treatment S.S.) and the S.S. for the variation within the samples (this is the observation S.S.).

The entries in the eighth column are obtained by performing the same operations on the lower case letters as those which were performed on the capital letters to obtain the entries in column 7. Thus to obtain the treatment d.f., subtract a from b (3 - 1 = 2). To obtain the observation d.f., subtract b from c (30 - 3 = 27). Subtract a from c for the total d.f. (30 - 1 = 29). The entries in the eighth column represent the splitting of the degrees of freedom into two parts which represent the degrees of freedom for the variation among sample means (the treatment d.f.) and the degrees of freedom for variation within the samples (the observation d.f.).

Having obtained the S.S. and the d.f. values, we are now ready to find the mean squares, which will be the external and internal estimates of the common variance. The mean squares are entered in column 9 and are obtained by dividing the entry in the same line of column 7 by the respective entry in column 8 (the S.S. divided by the degrees of freedom). There is no need for the total mean square, so it won't be computed. The treatment mean square (M.S.T) is obtained by dividing the treatment S.S. by the treatment d.f. (19,852.9/2 = 9,926.45). The treatment mean square is equal to the external estimate $n s_{\bar{x}}^2$ (see equation 4.58). The observation mean square is equal to the observation S.S. divided by the observation d.f. (2,683.0/27 = 99.37). The observation mean square (M.S.O) is equal to the internal estimate or the pooled $s^2$ (see equation 4.53). Now since

$$\text{M.S.}_T = n s_{\bar{x}}^2$$

and

$$\text{M.S.}_O = \text{pooled } s^2$$

therefore, using equation 4.59,

$$F = \frac{n s_{\bar{x}}^2}{\text{pooled } s^2} = \frac{\text{M.S.}_T}{\text{M.S.}_O} = 99.89 \qquad (8.5)$$

The degrees of freedom of F are found in column 8 and will be b - a degrees of freedom for the numerator and c - b degrees of freedom for the denominator. Thus the analysis of variance


computing table gives results identical to those obtained before (see equation 4.60).

The analysis of variance computing table must be changed when the number of observations are not equal for the different samples. One more piece of notation is needed to set up the analysis of variance computing table for the case of unequal sample size. Let $n_j$ be the size of the sample from the jth treatment. The rest of the notation will remain the same. If the sample size for the jth treatment is $n_j$, the total size of the experiment will be $\sum_j n_j$, that is, the size of the experiment will be the sum of the sizes of the samples involved.

The analysis of variance computing table for samples of unequal size is given in table 21. The use of this table will not be illustrated. The same basic form is used; the relationship among columns remains the same and the analysis of variance table is unchanged. Only slight changes occur in the preliminary calculations. In the first and third lines of the preliminary calculations, the only change made is the replacement of nk by $\sum_j n_j$ as the total number of observations in the experiment. In the second line for equal numbers, we first took the total of the squares of the treatment totals and then divided the total of the squares by the number of observations common to the samples. For unequal sample sizes, square each of the treatment totals and divide each squared item by the number of observations which make up the total before summing. That is, determine $T_{.j}^2/n_j$ for each treatment and then take the sum of these quantities. Columns 2 and 4 of line 2 are left blank, and the sum of the $T_{.j}^2/n_j$ values is entered directly in column 5 of line 2. The entry in column 3 of line 2 remains the same as before since the number of items squared is still k, the number of treatments.

TABLE 21
Analysis of variance computing table for samples with unequal numbers of observations

Part 1: Preliminary calculations
(1) Source of variation | (2) Total of squares | (3) No. of items squared | (4) Observations per squared item | (5) Total of squares per observation, (2) ÷ (4)
Correction* | $T_{..}^2$ | 1 = a | $\sum_j n_j$ | $T_{..}^2/\sum_j n_j = A$
Treatments | (blank) | k = b | (blank) | $\sum_j (T_{.j}^2/n_j) = B$
Observations | $\sum_j \sum_i x_{ij}^2$ | $\sum_j n_j = c$ | 1 | $\sum_j \sum_i x_{ij}^2 = C$

Part 2: Analysis of variance table
(6) Variation due to | (7) Sum of squares (S.S.) | (8) Degrees of freedom (d.f.) | (9) Mean square ($s^2$), (7) ÷ (8) | (10) F
Treatments | B - A | b - a | (B - A)/(b - a) = M.S.T | M.S.T/M.S.O
Observations | C - B | c - b | (C - B)/(c - b) = M.S.O |
Total | C - A | c - a | |

* The correction term does not constitute a source of variation.

None of the entries in an analysis of variance computing table should be negative numbers. If any entry should be negative, a mistake has been made in the computations. A point which is helpful in checking the entries in the analysis of variance table is that the treatment S.S. plus the observation S.S. is equal to the total S.S. and the total d.f. is the sum of the treatment d.f. and the observation d.f.

Tests to Supplement the Analysis of Variance

If the value of F obtained in the analysis of variance does not fall into the critical region for the test, the hypothesis that the population means for the treatments are the same is not rejected. If this happens, further tests for differences among the population means may be unnecessary. However, if the value of F falls into the critical region, further tests may be


needed to determine where the differences among the treatment population means lie.

The analysis of variance tests the variation among the sample means of the treatments to see whether the variation among these means is greater than the variation normally expected to arise from the procedure for obtaining the observations. For example, if the three treatments in table 5 had population means which were equal, we would expect to have some variation among the sample means arising from the variation among the plate counts. However, when the analysis of variance was applied to these data, the fact that the F-value fell into the critical region told us that the variation among the sample means for these three treatments was greater than that which could be explained on the basis of the variation among the plate counts. The statistically significant value of F does not tell the source of this increased variation. If all but one of the population means were equal, the fact that one of the population means was different could introduce sufficient variation among the sample means to obtain a statistically significant value of F. Similarly, if no two population means were equal, we could again get sufficient variation among the sample means to obtain a statistically significant value of F. Thus, if the value of F falls into the critical region for the test, we know only that at least one of the population means is "out of line."

In some problems, knowledge that the variation among the treatment means is greater than that which can be accounted for by the variation arising from the procedure for obtaining the observations may be sufficient and further testing may not be necessary. An example of this type of problem might arise from a group of laboratory technicians running the same laboratory procedure or some common test. Here, interest might be confined to knowing whether the different technicians agreed within the limits of their precisions. In this type of problem, the analysis of variance tests the variation among the technicians to see whether it is negligible with respect to the variation from the method of measurement. Here the treatments are samples and the observations are the measurements. Interest rests primarily on variation among the technicians, not in which technician gives the greatest or smallest average value; therefore, further tests after the analysis of variance are not needed. Here, an estimate of the magnitude of the variation from each source can be obtained by the use of the components of variance technique.

In other types of problems, interest may be centered in the relationship among the population means, for example, (a) in which treatment gives the best results, (b) in ranking the treatments, or (c) in finding whether the relationship among a set of treatments is linear. In these types of problems it is necessary to follow an analysis of variance by further tests if there is a statistically significant difference among the means.

One method of testing differences among means after an analysis of variance is by the t-test. We use the usual t-test for the difference between two treatments for populations with equal variances (the variances must be equal to use the analysis of variance) except that the pooled $s^2$ for two treatments is replaced by the pooled $s^2$ from all of the treatments (the observation mean square from the analysis of variance table) as well as the degrees of freedom for this pooled $s^2$ (the observation d.f.). In this way, all of the information afforded us by the entire experiment concerning the magnitude of the common population variance is utilized.

As previously pointed out, the F-test for the analysis of variance is fairly insensitive to lack of homogeneity of variances. The t-tests which use the internal estimate of the variance, the pooled $s^2$, are sensitive to lack of homogeneity of variances. Thus, Bartlett's test (12) may give us a useful warning not to use a pooled $s^2$ for t-tests which follow the F-test. If t-tests are to be used following the analysis of variance, the homogeneity of variance test should be used. Another of the troubles with the use of the t-test is that all too often conflicting results are obtained, as pointed out earlier. Still remaining is the problem of the number of t-tests which must be run.

Slightly more sophisticated methods have been offered for testing the differences, including the method of Nair (18), which may be combined with the t-test, and the more recent method by Duncan (19). These latter methods can be used in ranking treatments.

The most useful method, if it is applicable, for testing the differences among the treatments is the use of individual degrees of freedom. Actually, the method of individual degrees of freedom is an extension of the analysis of variance procedure. The procedure consists of a further breakdown of the treatment S.S. into parts, each of which has 1 degree of freedom. The partitioning of the treatment S.S. is done in such a way that the resulting mean squares can be used to test certain hypotheses concerning the differences among the treatments, the denominator for the tests being the observation mean square. With k treatments, there are k - 1 degrees of freedom for treatments; therefore, the treatment S.S. can be partitioned into k - 1 parts, each having 1 degree of freedom. For example, the data in table 5 have three treatments, hence two degrees of freedom for treatments (see table 20). The treatment S.S. can be split into two parts, each having 1 degree of freedom; this can be done in many ways, including the following three ways. The first degree of freedom could be used to test the difference between the control (treatment 3) and the two media with lard added, with the second degree of freedom being used to test the difference between the rancid lard (treatment 1) and lard which is not rancid (treatment 2), or we can use the first degree of freedom to test the difference between treatment 1 and treatments 2 and 3 with the second degree of freedom for testing the difference between treatments 2 and 3, or use the first degree of freedom to test the difference between treatment 2 and treatments 1 and 3 with the second degree of freedom being assigned to test the difference between treatments 1 and 3.

The way in which the treatment S.S. is partitioned must be meaningful. It should also be decided upon before the data are examined, so that the data do not bias the judgement as to which test is to be used. The researcher should have some particular hypothesis in mind before the data are obtained, in the event there is a statistically significant difference among the means. The choice of the hypothesis may be based on some theory to be tested. For example, before taking the data in table 5 it might have been thought that rancid lard would inhibit the germination of spores while non-rancid lard would have no effect differing from the control medium. Here, the proper choice would be to use the first degree of freedom to test the difference between treatment 1 and treatments 2 and 3, then use the second degree of freedom to test the difference between treatments 2 and 3. Another theory would require a different partitioning of the treatment S.S.

It is not necessary to partition the treatment S.S. completely to test a hypothesis. For example, the hypothesis to be tested might be that the k treatments could be divided into two groups, each group having equal population means within the group but with a difference in population means between the groups. That is, there would be $k_1$ treatments (the first group) with equal means and $k_2$ treatments (the second group) with equal means ($k_1 + k_2 = k$), but the mean common to the first group would not be equal to the mean which was common to the second group. Here, 1 degree of freedom would be used to test the difference between the two groups, $k_1 - 1$ degrees of freedom to test the difference among the means in the first group and $k_2 - 1$ degrees of freedom to test the difference among the means in the second group (treatment d.f. $= 1 + k_1 - 1 + k_2 - 1 = k_1 + k_2 - 1 = k - 1$).

Individual degrees of freedom can be used to test other types of hypotheses. For example, if the treatments consist of increasing equally spaced quantities of a metabolite or test material, the treatment S.S. can be split into units with single degrees of freedom to test whether the response is linear, quadratic, or cubic and higher. All in all, methods employing individual degrees of freedom cover a great many types of hypotheses and are very useful. The methods for partitioning the treatment S.S. for individual degrees of freedom are given in section 3.4 of Cochran and Cox (8).

The supplementary tests discussed here may be used in place of an analysis of variance. In fact, they have a definite advantage over the analysis of variance in that they are designed to test specific hypotheses whereas the analysis of variance tests a general hypothesis. The analysis of variance is like a shotgun, in that it covers a lot of territory and doesn't bring too much force to bear on any particular point. The supplementary tests, however, are like rifles, in that they bring all of their force to bear on a particular point. Thus the analysis of variance is a general test which tests for any and all types of divergence from equality of treatment population means with little power

against any specific type of divergence. The supplementary tests, on the other hand, test specified types of divergence and have much higher power. Therefore, in some problems the supplementary tests may be more appropriate than the straight analysis of variance.

Notes on the Application of the Chi-square Test

We can apply the chi-square test to cases where we are testing either two sample proportions or a single sample proportion. In the chi-square test of two sample proportions, use the same procedure on the 2 × 2 contingency table as is used on the 2 × c contingency table with one change; instead of running the test on the original 2 × 2 table, run the test on the 2 × 2 table that is corrected for continuity, as in table 13. The degrees of freedom for the chi-square test will be 2 - 1 or 1 degree of freedom.

If the chi-square test is used on a single sample proportion, again correct for continuity. The expected values are obtained from the proportion given by the hypothesis. The observed values for both class A and class X must be corrected for continuity. The test will have 1 degree of freedom.

The decisions reached concerning the hypothesis will be the same for both the u-test and the chi-square test. This arises from the fact that although the two tests look quite different, they are indeed related. The relationship between the two tests is that $u^2 = \chi^2$ with 1 degree of freedom. The critical regions of the two tests are such that for a given significance level, any value which falls in the critical region for one test also falls in the critical region for the other test.

The chi-square test can also be extended to results where the members of the parent populations fall into more than two classes. For example, instead of classifying members as either having a toxic reaction or no toxic reaction, the severity of the reaction might have three classifications, such as none, mild or severe toxic reaction. Another example would be the quantitation of the tuberculin reaction. If there are r classifications with c samples, an r × c contingency table results. The chi-square test proceeds as before by the determination of the expected values and the test being based on chi-square as defined by equation 5.17. With an r × c contingency table, chi-square will have (r - 1)(c - 1) degrees of freedom.

The chi-square test can also be used in tests of "goodness of fit" which are tests designed to determine whether data fit various theoretical frequencies. Tests of this sort include testing of hypotheses of genetical characters as well as testing the fit to such distributions as the normal, binomial or Poisson. The chi-square test for goodness of fit is discussed in most elementary textbooks [see also Eisenhart and Wilson (1)].

Chi-square, like the treatment S.S. in the analysis of variance, can be partitioned into individual degrees of freedom, each of which can be used to test some particular hypothesis. Cochran (26) presents several methods which are applicable to the various chi-square tests. Yates (27) has given a method, along with detailed computing instructions, for separating out a single degree of freedom of chi-square for testing the linearity of response in an r × c contingency table. Yates' method would be applicable, for example, to testing the hypothesis that the proportion of positive reactions to tuberculin tests is a linear function of increasing strengths of tuberculin.

Chi-square values are additive. Thus, in a series of chi-square values, each of which results from a test of the same hypothesis on different sets of data, the sum of these will be distributed as chi-square with degrees of freedom equal to the sum of the degrees of freedom of the individual tests. Cochran (28) pointed out the fact that the values of chi-square to be added must not be corrected for continuity, e.g., when the chi-square values for a series of tests of one or two sample proportions are added, the tests must not contain the correction for continuity.

There are certain restrictions concerning how small the expected values in the chi-square tests can be. Some authorities suggest that 10 is the minimum expected value which should be used while others have suggested that a minimum expected value of 5 will suffice. Studying the effect of small expectations, Cochran (28) concluded that the effect of small expected values depends upon the number of degrees of freedom in the test, and later (26) presented recommendations about minimum expectations. Yates' correction for continuity (24) is intended for use when the samples are small.
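
The relationship between the u-test and the chi-square test for a single sample proportion, both corrected for continuity, can be checked numerically. The following is a minimal sketch in Python, not a reproduction of the computing scheme of table 13 or equation 5.17. It assumes the usual forms of the two statistics: u is the absolute deviation from expectation, reduced by 0.5, divided by the binomial standard deviation, and chi-square is the sum over the two classes of the squared corrected deviation divided by the expected value. The counts, the hypothesized proportion and all function names are hypothetical and serve only as illustration.

```python
import math


def corrected_deviation(observed, expected):
    """Absolute deviation reduced by 0.5 (continuity correction), never below 0."""
    return max(abs(observed - expected) - 0.5, 0.0)


def u_statistic(a, n, p0):
    """Continuity-corrected normal deviate (magnitude only) for a count a out of n."""
    standard_deviation = math.sqrt(n * p0 * (1.0 - p0))
    return corrected_deviation(a, n * p0) / standard_deviation


def chi_square_statistic(a, n, p0):
    """Continuity-corrected chi-square over the two classes (1 degree of freedom)."""
    observed = [a, n - a]
    expected = [n * p0, n * (1.0 - p0)]
    return sum(corrected_deviation(o, e) ** 2 / e
               for o, e in zip(observed, expected))


# Hypothetical data: 36 members of class A in a sample of 100,
# tested against a hypothesized proportion of 0.5.
a, n, p0 = 36, 100, 0.5

u = u_statistic(a, n, p0)
chi2 = chi_square_statistic(a, n, p0)

# Two-sided 5 per cent critical value of u; the corresponding chi-square
# critical value for 1 degree of freedom is its square (about 3.84).
U_CRIT = 1.96
CHI2_CRIT = U_CRIT ** 2

print("u =", round(u, 4), " u squared =", round(u ** 2, 4))
print("chi-square =", round(chi2, 4))
print("u-test rejects:", u > U_CRIT)
print("chi-square test rejects:", chi2 > CHI2_CRIT)
```

Because the corrected deviation is the same in both classes, the chi-square sum reduces algebraically to the square of u, so the two tests must reach the same decision at any given significance level.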

Such correction is less necessary when the samples used are large. However, as a working rule, the correction for continuity takes little extra time and may as well be used regardless of the size of sample involved. The correction of Yates (24) is usually limited in application to u-tests or chi-square tests of either one or two sample proportions. Cochran (28) has given a correction for continuity which has wider application.

REFERENCES

1. EISENHART, C., AND WILSON, P. W. 1943 Statistical methods and control in bacteriology. Bacteriol. Revs., 7, 57-137.
2. HUFF, D. 1954 How to lie with statistics. W. W. Norton and Co., New York, N. Y.
3. BROSS, I. D. J. 1953 Design for decision. The MacMillan Co., New York, N. Y.
4. ARKIN, H., AND COLTON, R. R. 1950 Tables for statisticians. Barnes and Noble, Inc., New York, N. Y.
5. HALD, A. 1952 Statistical tables and formulas. John Wiley and Sons, Inc., New York, N. Y.
6. FISHER, R. A., AND YATES, F. 1953 Statistical tables for biological, agricultural and medical research. 4th ed. Oliver and Boyd, London, England.
7. WILSON, G. S. 1935 The bacteriological grading of milk. His Majesty's Stationery Office, London, England.
8. COCHRAN, W. G., AND COX, G. M. 1950 Experimental designs. John Wiley and Sons, Inc., New York, N. Y.
9. ROTH, N. G., AND HALVORSON, H. O. 1952 The effect of oxidative rancidity in unsaturated fatty acids on germination of bacterial spores. J. Bacteriol., 63, 429-435.
10. EISENHART, C. 1947 The assumptions underlying the analysis of variance. Biometrics, 3, 1-21.
11. COCHRAN, W. G. 1947 Some consequences when the assumptions for the analysis of variance are not satisfied. Biometrics, 3, 22-38.
12. BARTLETT, M. S. 1937 Some examples of statistical methods of research in agriculture and applied biology. J. Roy. Stat. Soc. (Suppl.), 4, 137-170.
13. BOX, G. E. P. 1949 A general distribution theory for a class of likelihood criteria. Biometrika, 36, 317-346.
14. BOX, G. E. P. 1953 Non-normality and tests on variances. Biometrika, 40, 318-335.
15. JAMES, G. S. 1951 The comparison of several groups of observations when ratios of the population variances are unknown. Biometrika, 38, 324-329.
16. WELCH, B. L. 1951 On the comparison of several mean values: an alternative approach. Biometrika, 38, 330-336.
17. BARTLETT, M. S. 1947 The use of transformations. Biometrics, 3, 39-52.
18. NAIR, K. R. 1948 The distribution of the extreme deviate from the sample mean and its studentized form. Biometrika, 35, 118-144.
19. DUNCAN, D. B. 1951 A significance test for differences between ranked treatments in an analysis of variance. Virginia J. Sci., 2(N.S.), 909-913.
20. HALVORSON, H. O., AND SPIEGELMAN, S. 1953 Net utilization of free amino acids during the induced synthesis of maltozymase in yeast. J. Bacteriol., 65, 601-608.
21. STEARMAN, R. L., WARD, T. G., AND WEBSTER, R. A. 1953 Use of a "components of variance" technique in biological experimentation. Am. J. Hyg., 58, 340-351.
22. National Bureau of Standards 1949 Tables of the binomial probability distribution. United States Government Printing Office, Washington, D. C.
23. ROMIG, H. G. 1953 50-100 binomial tables. John Wiley and Sons, Inc., New York, N. Y.
24. YATES, F. 1934 Contingency tables involving small numbers and the $\chi^2$ test. J. Roy. Stat. Soc. (Suppl.), 1, 217-235.
25. National Bureau of Standards 1953 Tables of normal probability functions. United States Government Printing Office, Washington, D. C.
26. COCHRAN, W. G. 1954 Some methods for strengthening the common $\chi^2$ tests. Biometrics, 10, 417-451.
27. YATES, F. 1948 The analysis of contingency tables with groupings based on quantitative characters. Biometrika, 35, 176-181.
28. COCHRAN, W. G. 1942 The $\chi^2$ correction for continuity. Iowa State Coll. J. Sci., 16, 421-436.
29. MOSTELLER, F., AND TUKEY, J. W. 1949 The uses and usefulness of binomial probability paper. J. Am. Stat. Assoc., 44, 174-212.
30. DAVIS, D. E., AND ZIPPIN, C. 1954 Planning wildlife experiments involving percentages. J. Wildlife Management, 18, 170-178.
31. CLOPPER, C. J., AND PEARSON, E. S. 1934 The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika, 26, 404-413.
32. SNEDECOR, G. W. 1950 Statistical methods. 4th ed. The Iowa State College Press, Ames, Iowa.
33. MOLINA, E. C. 1949 Poisson exponential binomial limit. D. Van Nostrand Co., Inc., New York, N. Y.
34. KITAGAWA, T. 1952 Tables of the Poisson distribution. Baifukan, Tokyo, Japan.
35. American Public Health Association 1936 Standard methods for the examination of water and sewage. 8th ed. American Public Health Association, New York, N. Y.
36. COCHRAN, W. G. 1950 Estimation of bacterial densities by means of the "most probable number." Biometrics, 6, 105-116.
37. FINNEY, D. J. 1951 The estimation of bacterial densities from dilution series. J. Hyg., 49, 26-35.
