
Probability Distribution

Date post: 02-Jun-2018
  • 8/10/2019 Probability Distribution

    LECTURER: DR. GORDON LIGHTBOURN

    BIOSTATISTICS BIOL 350


    5.3 THE POISSON DISTRIBUTION

    CONTINUATION OF CHAPTER 5


    Quite frequently we study cases where the sample size $k$ is very large and one of the events (probability $q$) is much more frequent than the other (probability $p$).

    For example, consider the expression $(0.001 + 0.999)^{1000}$. The expansion of this binomial would be quite tiresome!

    In cases like the one above we are generally interested in one tail of the distribution only.

    This is the tail represented by the terms:

    $$\binom{k}{0} p^0 q^k,\ \binom{k}{1} p^1 q^{k-1},\ \binom{k}{2} p^2 q^{k-2},\ \binom{k}{3} p^3 q^{k-3},\ \ldots$$
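The tail terms above are tedious by hand but easy to evaluate numerically. The following is a minimal sketch, assuming the example values k = 1000 and p = 0.001 from the expression above; the function name `binomial_terms` is a hypothetical helper introduced only for illustration.

```python
from math import comb


def binomial_terms(k, p, count):
    """Return the terms C(k, i) * p**i * q**(k - i) for i = 0 .. count - 1."""
    q = 1.0 - p
    return [comb(k, i) * p**i * q ** (k - i) for i in range(count)]


# First five terms of (0.001 + 0.999)**1000: 0, 1, 2, 3 and 4 rare events.
terms = binomial_terms(1000, 0.001, 5)
for i, term in enumerate(terms):
    print(f"{i} rare events: {term:.4f}")
```

Only the first handful of terms carry appreciable probability, which is why one tail of the distribution is all that matters in such cases.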


    The first term represents no rare events and $k$ frequent events (in a sample of $k$ events).
    The second term represents 1 rare event and $k-1$ frequent events.
    The third term represents 2 rare events and $k-2$ frequent events.
    And so forth.
    The expressions of the form $\binom{k}{i}$ are the binomial coefficients.

    We could use the binomial to compute the frequencies, as in 5.2; however, it is much easier to use another distribution, the **Poisson distribution**.


    The Poisson may be used to approximate the binomial when the probability of the rare event $p < 0.1$ and the product of sample size and probability $kp$


    The Poisson variable, $Y$, will be the number of rare events per sample.

    It can assume discrete (integer) values from 0 upward. The variable must have two properties:

    1. Its mean must be small relative to the maximum number of events per sampling unit. This means the event should be rare.

    2. An occurrence of the event must be independent of prior occurrences within the sampling unit. This means the event should be random.


    If the occurrence of one event enhances the probability of a second such event, we obtain clumped or contagious distributions.

    If the occurrence of one event impedes that of a second such event in the sampling unit, we obtain a repulsed distribution, spatially or temporally.

    The Poisson distribution can be used as a test for randomness or independence of distribution, both spatially and temporally.


    The Poisson series can be represented by:

    $$\frac{1}{e^{\mu}},\ \frac{\mu}{1!\,e^{\mu}},\ \frac{\mu^2}{2!\,e^{\mu}},\ \frac{\mu^3}{3!\,e^{\mu}},\ \frac{\mu^4}{4!\,e^{\mu}},\ \ldots,\ \frac{\mu^r}{r!\,e^{\mu}},\ \ldots \qquad (5.11)$$

    which are the relative expected frequencies corresponding to the following counts of the rare event $Y$:

    $$0,\ 1,\ 2,\ 3,\ 4,\ \ldots,\ r,\ \ldots$$

    The first term represents the relative expected frequency of samples containing no rare events (0).
    The second term, one rare event.
    The third term, two rare events.
    The fourth term, three rare events.
    And so forth.

    Explanation of the terms: $e$ is the base of the natural logarithms, a constant approximately equal to 2.71828, and $\mu$ is the parametric mean of the distribution.
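As a rough illustration of expression 5.11, the sketch below evaluates the first few Poisson terms and sets them beside the corresponding binomial terms of $(0.001 + 0.999)^{1000}$. It assumes the Poisson mean is taken as $\mu = kp$, the usual choice when approximating a binomial; the function names `poisson_term` and `binomial_term` are hypothetical helpers, not part of the lecture.

```python
from math import comb, exp, factorial


def poisson_term(mu, r):
    """Relative expected frequency of r rare events: mu**r / (r! * e**mu)."""
    return mu**r / (factorial(r) * exp(mu))


def binomial_term(k, p, r):
    """Binomial probability of exactly r rare events in a sample of k."""
    return comb(k, r) * p**r * (1.0 - p) ** (k - r)


k, p = 1000, 0.001
mu = k * p  # assumption: the Poisson mean is set to k * p

for r in range(5):
    print(f"{r}: binomial {binomial_term(k, p, r):.5f}  Poisson {poisson_term(mu, r):.5f}")
```

For these values the two columns should agree closely, which is the sense in which the Poisson stands in for the tail of the binomial.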


    TABLE 5.5 YEAST CELLS IN 400 SQUARES OF A HEMACYTOMETER

    $\bar{Y}$ = 1.8 cells per square; n = 400 squares sampled.
    ______________________________________________________________________
    (1)               (2)            (3)                  (4)
    Number of cells   Observed       Absolute expected    Deviation from
    per square        frequencies    frequencies          expectation
    $Y$               $f$            $\hat{f}$            $f - \hat{f}$
    ______________________________________________________________________
    0                  75             66.1                +
    1                 103            119.0                -
    2                 121            107.1                +
    3                  54             64.3                -
    4                  30             28.9                +
    5                  13             10.4                +
    6                   2              3.1                -
    7                   1              0.8                +
    8                   0              0.2                -
    9+                  1              0.0                +
    ______________________________________________________________________
    Total             400            399.9
    ______________________________________________________________________

    Bracketed classes 5 through 9+ combined: observed 17, expected 14.5, deviation +.


    EXAMPLE 5.5

    Distribution of yeast cells in 400 squares of a haemocytometer.

    Column (1) lists the number of yeast cells observed in each haemocytometer square.
    Column (2) gives the observed frequency: the number of squares containing a given number of yeast cells.

    Note that 75 squares contain no (0) yeast cells.
    Most squares held either one or two cells.
    Only 17 squares contained 5 or more yeast cells.


    Why would we expect this frequency distribution to be distributed in Poisson fashion?

    We have a relatively rare event.
    On average there are 1.8 cells per square.
    Relative to the amount of space, the number found is very low.
    We expect the occurrence of individual yeast cells in a square to be independent of the occurrence of other yeast cells.


    The mean of the rare events is the only quantity we need to know to calculate the relative expected frequencies (of a Poisson distribution).

    We do not know the parametric mean of the yeast cells.
    We employ an estimate (the sample mean) and calculate the expected frequencies with $\mu$ equal to the sample mean of Table 5.5.

    It is convenient to rewrite expression 5.11 as:

    $$\hat{f}_i = \hat{f}_{i-1} \left( \frac{\bar{Y}}{i} \right) \quad \text{for } i = 1, 2, \ldots, \quad \text{where } \hat{f}_0 = e^{-\bar{Y}} \qquad (5.12)$$


    Note that the parametric mean $\mu$ has been replaced by the sample mean $\bar{Y}$.
    Expression 5.12 yields relative expected frequencies.

    For absolute expected frequencies, set the first term to $\hat{f}_0 = n / e^{\bar{Y}}$.

    We list the expected frequencies in column (3) of Table 5.5.

    What have we learnt?

    When comparing the observed frequencies with the expected frequencies, we see a good fit (mean 1.8).
    No clear pattern of deviation from expectation is observed.
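The recursion of expression 5.12 can be sketched in a few lines. This is an illustrative calculation under two stated assumptions: the open class 9+ is treated as exactly 9 when computing the sample mean, and its expected frequency is taken as whatever remains after classes 0 to 8. The function name `poisson_expected` is hypothetical.

```python
from math import exp


def poisson_expected(mean, n, max_count):
    """Absolute expected frequencies for counts 0 .. max_count (Expression 5.12)."""
    freqs = [n / exp(mean)]  # first term: f_0 = n / e**mean
    for i in range(1, max_count + 1):
        freqs.append(freqs[i - 1] * mean / i)  # f_i = f_(i-1) * mean / i
    return freqs


# Observed frequencies from Table 5.5 for 0, 1, ..., 8 and 9+ cells per square.
observed = [75, 103, 121, 54, 30, 13, 2, 1, 0, 1]
n = sum(observed)
# Assumption: the open class 9+ counts as exactly 9 cells.
mean = sum(y * f for y, f in enumerate(observed)) / n

expected = poisson_expected(mean, n, 8)
expected.append(n - sum(expected))  # assumption: 9+ takes the remaining frequency

for y, (f, f_hat) in enumerate(zip(observed, expected)):
    print(f"{y}: observed {f:3d}  expected {f_hat:6.1f}")
```

Rounded to one decimal place, the computed values can be checked against column (3) of Table 5.5.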


    The biological interpretation: the yeast cells seem to be randomly dispersed in the counting chamber, indicating thorough mixing of the suspension.

    Note that in Table 5.5 we group the low frequencies at one tail of the curve, uniting them by means of a bracket. For a goodness-of-fit test, no expected frequency should be less than 5.

    Poisson distribution facts:
    To compute expected frequencies we need to know only one parameter, the mean of the distribution.
    The mean completely defines the shape of a given Poisson distribution.
    We have a simple relation between the mean and the variance: $\mu = \sigma^2$.
    The variance is equal to the mean.
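The grouping rule mentioned above (no expected frequency below 5) can be sketched as a simple pooling of the upper tail. This is one possible way to do it, not necessarily the procedure used for the tables: it assumes only the upper tail needs pooling, and `pool_tail` is a hypothetical helper name. The values are columns (2) and (3) of Table 5.5.

```python
def pool_tail(expected, minimum=5.0):
    """Merge classes from the upper tail until the pooled class reaches `minimum`.

    Returns the index of the first pooled class and the pooled frequency list.
    """
    start = len(expected) - 1
    pooled = expected[start]
    while pooled < minimum and start > 0:
        start -= 1
        pooled += expected[start]
    return start, expected[:start] + [pooled]


expected = [66.1, 119.0, 107.1, 64.3, 28.9, 10.4, 3.1, 0.8, 0.2, 0.0]
observed = [75, 103, 121, 54, 30, 13, 2, 1, 0, 1]

start, pooled_expected = pool_tail(expected)
# Pool the observed frequencies over the same classes.
pooled_observed = observed[:start] + [sum(observed[start:])]

print(f"classes {start} and above are pooled")
print("observed:", pooled_observed)
print("expected:", [round(x, 1) for x in pooled_expected])
```

For the yeast data this pools classes 5 and above, giving 17 observed against 14.5 expected, the same grouping that the bracket in Table 5.5 indicates.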


    In our example, the variance is 1.965, not much larger than the mean of 1.80, indicating that the yeast cells are distributed approximately in Poisson fashion.

    The coefficient of dispersion: $CD = \dfrac{s^2}{\bar{Y}}$

    This value will be near 1 in distributions that are essentially Poisson, > 1 in clumped samples, and < 1 in cases of repulsion.


    FIGURE 5.3


    TABLE 5.6 NUMBER OF MOSS SHOOTS (HYPNUM SCHREBERI) PER QUADRAT ON CHINA CLAY RESIDUES (MICA)

    ______________________________________________________________________
    (1)               (2)            (3)                  (4)
    Number of moss    Observed       Absolute expected    Deviation from
    shoots per        frequencies    frequencies          expectation
    quadrat
    $Y$               $f$            $\hat{f}$            $f - \hat{f}$
    ______________________________________________________________________
    0                 100             77.7                +
    1                   9             37.6                -
    2                   6              9.1                -
    3                   8              1.5                +
    4                   1              0.2                +
    5                   0              0.0                0
    6+                  2              0.0                +
    ______________________________________________________________________
    Total             126            126.1
    ______________________________________________________________________

    Bracketed classes 2 through 6+ combined: observed 17, expected 10.8, deviation +.

    $\bar{Y}$ = 0.4841    $s^2$ = 1.308    CD = 2.702
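The summary statistics under Table 5.6 can be reproduced from the observed frequencies alone. The sketch below assumes the open class 6+ is treated as exactly 6 and uses the sample variance with n - 1 in the denominator; `dispersion` is a hypothetical helper name.

```python
def dispersion(observed):
    """Return (mean, sample variance, CD) for a frequency table.

    observed[y] is the number of sampling units containing y events.
    """
    n = sum(observed)
    total = sum(y * f for y, f in enumerate(observed))
    total_sq = sum(y * y * f for y, f in enumerate(observed))
    mean = total / n
    variance = (total_sq - total**2 / n) / (n - 1)
    return mean, variance, variance / mean


# Observed frequencies from Table 5.6 for 0, 1, ..., 5 and 6+ shoots per quadrat.
moss = [100, 9, 6, 8, 1, 0, 2]  # assumption: the class 6+ counts as exactly 6
mean, variance, cd = dispersion(moss)
print(f"mean = {mean:.4f}, s^2 = {variance:.3f}, CD = {cd:.3f}")
```

The computed CD may differ from the tabled 2.702 in the last digit, since the tabled value appears to be derived from the already rounded mean and variance. The same function can be applied to the observed column of the other tables.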


    TABLE 5.6

    The first example is from an ecological study of mosses of the species Hypnum schreberi invading mica residue of china clay. The ecologist laid out 126 quadrats and counted the number of moss shoots in each. Expected frequencies are calculated using the mean number of moss shoots, $\bar{Y} = 0.4841$, as an estimate of $\mu$.

    We expect only 78 quadrats without a moss plant; we find 100.
    We also expect 1.7 quadrats containing 3 or more moss shoots; we find 11.
    The center classes are less frequent than expected.


    Instead of the nearly 38 expected quadrats with one moss plant each, we find only 9.
    This case illustrates clumping, which was also encountered in the binomial distribution.
    The sample variance $s^2 = 1.308$, much larger than the mean $\bar{Y} = 0.4841$, yields a coefficient of dispersion CD = 2.702.

    Biological explanation: the protonemata, or spores, of the moss were carried in by water and deposited at random, but each protonema gave rise to a number of upright shoots, so counts of the latter indicated a clumped distribution.
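The biological explanation can be illustrated with a small simulation. This is a toy model under loudly stated assumptions, not a reconstruction of the moss study: founders land in quadrats uniformly at random, every founder produces the same fixed number of shoots, and the quadrat and founder counts are arbitrary. The names `simulate` and `variance_to_mean` are hypothetical.

```python
import random


def variance_to_mean(counts):
    """Coefficient of dispersion: sample variance divided by the mean."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean


def simulate(quadrats, founders, shoots_per_founder, rng):
    """Drop founders into quadrats at random; each yields a fixed number of shoots."""
    counts = [0] * quadrats
    for _ in range(founders):
        counts[rng.randrange(quadrats)] += shoots_per_founder
    return counts


rng = random.Random(1)  # fixed seed so the run is repeatable

# One shoot per founder: shoots themselves are placed at random.
cd_random = variance_to_mean(simulate(1000, 500, 1, rng))
# Three shoots per founder: founders are random, but shoots come in groups.
cd_clumped = variance_to_mean(simulate(1000, 200, 3, rng))

print(f"CD with independent shoots: {cd_random:.2f}")
print(f"CD with grouped shoots:     {cd_clumped:.2f}")
```

In this toy model the first CD should come out near 1 and the second well above 1, even though the founders themselves are deposited at random in both runs.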


    TABLE 5.7 POTENTILLA (WEED) SEEDS IN 98 QUARTER-OUNCE SAMPLES OF GRASS SEEDS (PHLEUM PRATENSE)

    ______________________________________________________________________
    (1)               (2)            (3)                  (4)
    Number of weed    Observed       Poisson expected     Deviation from
    seeds per sample  frequencies    frequencies          expectation
    of seeds
    $Y$               $f$            $\hat{f}$            $f - \hat{f}$
    ______________________________________________________________________
    0                  37             31.3                +
    1                  32             35.7                -
    2                  16             20.4                -
    3                   9              7.8                +
    4                   2              2.2                -
    5                   0              0.5                -
    6                   1              0.1                +
    7+                  1              0.0                +
    ______________________________________________________________________
    Total              98             98.0
    ______________________________________________________________________

    Bracketed classes 3 through 7+ combined: observed 13, expected 10.6, deviation +.

    $\bar{Y}$ = 1.1429    $s^2$ = 1.711    CD = 1.497


    TABLE 5.7

    The second example tests the randomness of distribution of weed seeds in samples of grass seed.
    We can estimate $k$ (which is several thousand) and $q$, which represents the large proportion of grass seeds, as compared with $p$, the small proportion of weed seed.
    The data are structured as in a binomial distribution with alternative states: weed seed and grass seed.

    Only the number of weed seeds must be considered.
    This is a binomial in which the frequency of one outcome is very much smaller than that of the other, and the sample size is large.


    We can use the Poisson distribution as a useful approximation of the binomial frequencies for the tail of the distribution.
    We use the average number of weed seeds per sample of seeds as our estimate of the mean and calculate Poisson frequencies from it.
    Although the pattern of deviations and the coefficient of dispersion indicate clumping, this tendency is not pronounced, and we do not have sufficient evidence to suggest this is not a Poisson distribution.

    We conclude that the seeds are randomly distributed throughout the sample.
    If clumping had been found, it might mean that weed seeds stuck together for some physical reason.


    TABLE 5.9 AZUKI BEAN WEEVILS (CALLOSOBRUCHUS CHINENSIS) EMERGING FROM 112 AZUKI BEANS (PHASEOLUS RADIATUS)

    ______________________________________________________________________
    (1)               (2)            (3)                  (4)
    Number of weevils Observed       Poisson expected     Deviation from
    emerging per bean frequencies    frequencies          expectation
    $Y$               $f$            $\hat{f}$            $f - \hat{f}$
    ______________________________________________________________________
    0                  61             70.4                -
    1                  50             32.7                +
    2                   1              7.6                -
    3                   0              1.2                -
    4+                  0              0.1                -
    ______________________________________________________________________
    Total             112            112.0
    ______________________________________________________________________

    Bracketed classes 2 through 4+ combined: observed 1, expected 8.9, deviation -.

    $\bar{Y}$ = 0.4643    $s^2$ = 0.269    CD = 0.579


    This distribution is extracted from an experimental study of a population of the azuki bean weevil.
    The number of holes in the beans (emergence) is a good measure of the number of adults that have emerged.
    The rare event in this case is the weevil present in the bean.
    The distribution is strongly repulsed, a far rarer occurrence.
    There are many more beans containing one weevil than the Poisson distribution would predict.


    Biological explanation:
    It was found that the adult female weevils tend to deposit their eggs evenly rather than randomly over the available beans.
    This prevents too many eggs being placed on any one bean and precludes heavy competition among the developing larvae.

    A contributing factor was competition between larvae feeding on the same bean, generally resulting in all but one being killed or driven away.


    TABLE 5.10 MEN KILLED BY BEING KICKED BY A HORSE IN 10 PRUSSIAN ARMY CORPS IN THE COURSE OF 20 YEARS

    ______________________________________________________________________
    (1)               (2)            (3)                  (4)
    Number of men     Observed       Poisson expected     Deviation from
    killed per year   frequencies    frequencies          expectation
    per army corps
    $Y$               $f$            $\hat{f}$            $f - \hat{f}$
    ______________________________________________________________________
    0                 109            108.7                +
    1                  65             66.3                -
    2                  22             20.2                +
    3                   3              4.1                -
    4                   1              0.6                +
    5+                  0              0.1                -
    ______________________________________________________________________
    Total             200            200.0
    ______________________________________________________________________

    Bracketed classes 3 through 5+ combined: observed 4, expected 4.8, deviation -.

    $\bar{Y}$ = 0.610    $s^2$ = 0.611    CD = 1.002


    Table 5.10 is a frequency distribution of men killed by being kicked by a horse in 10 Prussian army corps over 20 years.
    The basic sampling unit is temporal: one army corps per year.
    The mean of 0.610 men killed per army corps per year is the rare event.
    If we knew the number of men in each army corps, we could calculate the probability of not being killed in one year.
    This would give us a binomial approximating the Poisson distribution.


    Knowing the sample size is large, however, we can consider the example from the Poisson model, using the observed mean number of men killed per army corps per year as an estimate of $\mu$.
    This example shows an almost perfect fit of the observed to the expected frequencies.

    What would clumping mean?
    Poor discipline in a particular corps, or a particularly vicious horse that killed several men before the corps got rid of it.
    Repulsion might mean that the men in a corps were careless until someone had been killed, after which they became more careful for a while.

