
M1905 Topic 2A Probability


    Monday, 15 August 2011

    Lecture 7 - Content

• Sets
• Probability and counting
• Conditional probability

Many students are not familiar with set theory, so spend some more time explaining the various operations carefully. If time is left, spend some time talking about the cardinality of ℕ, ℚ and ℝ.


    Sets

Before we look at probability it is necessary to understand sets, because probabilities are typically described in terms of sets of outcomes on which an event occurs.

Definition 1. The set of all possible outcomes of an experiment is called a sample space, denoted by Ω. Any subset A of the sample space, denoted by A ⊆ Ω, is called an event.

Definition 2. The counting operator N(A) is a set function that counts how many elements belong to the set (event) A.

    Example (Sample spaces).

Coin: Ω = {H, T}, N(Ω) = 2.
Dice: Ω = {1, 2, 3, 4, 5, 6}, N({1, 2, 5}) = 3.
Weight: Ω = ℝ⁺, N(ℝ⁺) = ∞. (We want to count objects for any event.)


    Set Notation

Before we introduce probability we need to introduce some notation. Let A, B ⊆ Ω.

symbol       set theory                 probability
Ω            largest set                certain event
∅            empty set                  impossible event
A ∪ B        union of A and B           event A or event B
A ∩ B        intersection of A and B    event A and event B
A^c = Ω \ A  complement of A            not event A


    Intersection Operator

The set A ∩ B denotes the set such that if C ∈ A ∩ B then C ∈ A and C ∈ B (∩ is called the intersection operator).


    Examples:

{1, 2} ∩ {red, white} = ∅.
{1, 2, green} ∩ {red, white, green} = {green}.
{1, 2} ∩ {1, 2} = {1, 2}.

    Some basic properties of intersections:

• A ∩ B = B ∩ A.
• A ∩ (B ∩ C) = (A ∩ B) ∩ C.
• A ∩ B ⊆ A.
• A ∩ A = A.
• A ∩ ∅ = ∅.
• A ⊆ B if and only if A ∩ B = A.


    Union Operator

    The set A

    B denotes the set such that ifC

    A

    B then C

    A and/or C

    B

    ( is called the union operator).


    Examples:

{1, 2} ∪ {red, white} = {1, 2, red, white}.
{1, 2, green} ∪ {red, white, green} = {1, 2, red, white, green}.
{1, 2} ∪ {1, 2} = {1, 2}.

    Some basic properties of unions:

• A ∪ B = B ∪ A.
• A ∪ (B ∪ C) = (A ∪ B) ∪ C.
• A ⊆ (A ∪ B).
• A ∪ A = A.
• A ∪ ∅ = A.
• A ⊆ B if and only if A ∪ B = B.


    Set Minus

The set A \ B denotes the set such that if C ∈ A \ B then C ∈ A and C ∉ B.


    Set Complement

The set A^c = Ω \ A denotes the set such that if C ∈ A^c then C ∉ A.


    Set Minus

    Examples:

{1, 2} \ {red, white} = {1, 2}.
{1, 2, green} \ {red, white, green} = {1, 2}.
{1, 2} \ {1, 2} = ∅.
{1, 2, 3, 4} \ {1, 3} = {2, 4}.

    Some basic properties of complements:

• A \ B ≠ B \ A in general.
• A ∩ A^c = ∅.
• A ∪ A^c = Ω.
• (A^c)^c = A.
• A \ A = ∅.
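All of these operations can be checked directly in R, whose base functions union(), intersect() and setdiff() treat vectors as sets. A small illustrative check, not part of the original slides (note that c(1, 2, "green") coerces the numbers to the strings "1" and "2"):

A = c(1, 2, "green"); B = c("red", "white", "green")
union(A, B)        # "1" "2" "green" "red" "white"
intersect(A, B)    # "green"
setdiff(A, B)      # "1" "2": elements of A not in B
setdiff(A, A)      # character(0), i.e. the empty set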


Theorem 1. The complement of the union of A and B equals the intersection of the complements:

(A ∪ B)^c = A^c ∩ B^c.

Proof. Use Venn diagrams for LHS and RHS and colour areas.

Theorem 2 (de Morgan's Laws).

(A_1 ∪ ... ∪ A_n)^c = A_1^c ∩ ... ∩ A_n^c   and   (A_1 ∩ ... ∩ A_n)^c = A_1^c ∪ ... ∪ A_n^c.


Counting: Ordered Sampling without Replacement

Example (Ordered samples without replacement). The number of ordered samples of size r we can draw without replacement from n objects is

n (n − 1) ⋯ (n − r + 1) = n! / (n − r)!.

Recall: 0! = 1.


Counting: Unordered Sampling without Replacement

    Example (Unordered samples without replacement).

nCr = C(n, r) = n! / (r!(n − r)!), read "n choose r".

Recall that nCr = nC(n−r), since

C(n, n − r) = n! / ((n − r)! (n − (n − r))!) = n! / ((n − r)! r!),

and so C(n, 0) = C(n, n) = 1.
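In R the counting formulas above correspond directly to factorial() and choose(); a quick illustrative check (not from the slides):

factorial(10) / factorial(10 - 3)   # ordered samples of size 3 from 10 objects: 720
choose(10, 3)                       # unordered samples: 120
choose(10, 3) == choose(10, 7)      # nCr = nC(n-r): TRUE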


    Sampling without replacement (cont)

Example. Consider {A, B, C} and select r = 2 items:

order not important: {A, B}, {A, C}, {B, C}, i.e. 3C2 = 3 samples;
order important: additionally {B, A}, {C, A}, {C, B}, i.e. 3 × 2 = 6 samples.


    Sampling in R

    # Creating ordered lists

n = 158; x = 1:n

    set.seed(6) # set random seed to 6 to reproduce results

    sample(x) # random permutation of nos 1,2,...,158: n! possibilities

    sample(x,10) # choose 10 numbers without replacement

    sample(x,10,TRUE) # choose 10 numbers with replacement = bootstrap sampling


    What is Probability?

1. Subjective probability expresses the strength of one's belief (the basis of Bayesian statistics; a bit more on that later).

2. The classical probability concept: a mathematical answer for equally likely outcomes.

Theorem 3. If there are n equally likely possibilities, of which one must occur and s are regarded as favourable (= successes), then the probability P of a success is given by s/n.


    3. The frequency interpretation of probability:

Theorem 4. The probability of an event (or outcome) is the proportion of times the event occurs in a long run of repeated experiments.

Or in words: if an experiment is repeated n times under identical conditions, and if the event A occurs m times, then as n becomes large (i.e. in the long run) the probability of A occurring is approximated by the ratio m/n.


• The constancy of the gender ratio at birth. In Australia, the proportion of male births is fairly stable at 0.51. This long-run relative frequency is used to estimate the probability that a randomly chosen birth is male.

• Cancer council records show the age-standardised mortality rate from breast cancer in NSW was close to 20 per 100,000 over the years 1972–2000. For a randomly chosen woman, we use 0.0002 as the probability of breast cancer.

    Example (Coin tossing).

Buffon (1707–1788): n = 4,040, P({H}) ≈ 50.7%.
Pearson (1857–1936): n = 24,000, P({H}) ≈ 50.05%.

Coin tossing in R:

table(sample(c("H","T"), 4040, TRUE)) / 4040
table(sample(c("H","T"), 24000, TRUE)) / 24000


Coin Tossing in the 2010s

In the 2010s Stanford professor Persi Diaconis developed the Coin Tosser 3000. However, the machine is designed to flip a coin with the same result every time!


    4. Mathematical formulation of probability

Definition 3 (due to Andrey Kolmogorov, 1933). Given a sample space Ω and an event A ⊆ Ω, we define P(A), the probability of A, to be a value of a non-negative additive set function that satisfies the following axioms:

A1: For any event A, 0 ≤ P(A) ≤ 1,
A2: P(Ω) = 1,
A3: If A and B are mutually exclusive events (A ∩ B = ∅), then P(A ∪ B) = P(A) + P(B).
A3': If A_1, A_2, A_3, ... is a finite or infinite sequence of mutually exclusive events in Ω, then

P(A_1 ∪ A_2 ∪ A_3 ∪ ...) = P(A_1) + P(A_2) + P(A_3) + ....


Example (Lotto). A lotto-type barrel contains 10 balls numbered 1, 2, ..., 10. Three balls are drawn.

i. How many distinct samples can be drawn?

n = C(10, 3) = (10 × 9 × 8)/(1 × 2 × 3) = 120.

ii. Event A = all drawn numbers lie in {1, 2, ..., 7}.

C(7, 3) = (7 × 6 × 5)/(1 × 2 × 3) = 35 successes, so P(A) = 35/120 = 7/24.

iii. B = all drawn numbers are even: P(B) = (1/120) × C(5, 3) = 10/120 = 1/12.

A ∩ B = {(2, 4, 6)}, so P(A ∩ B) = 1/120.

iv. P(A ∪ B)? To answer this we need our next theorem.
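The answers in i–iii can be verified by brute-force enumeration in R with combn(); a small illustrative check, not from the slides:

samples = combn(10, 3)                                  # all unordered draws, one per column
ncol(samples)                                           # 120
mean(apply(samples, 2, function(s) all(s <= 7)))        # P(A) = 35/120 = 7/24
mean(apply(samples, 2, function(s) all(s %% 2 == 0)))   # P(B) = 10/120 = 1/12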


    Addition Theorem

Theorem 5 (Addition Theorem). If A and B are any events in Ω, then

P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

Proof. Use Venn diagrams, i.e. draw two overlapping circles and colour regions.

Alternatively use the axioms only. First note that, by A3,

P(A) = P((A \ (A ∩ B)) ∪ (A ∩ B)) = P(A \ (A ∩ B)) + P(A ∩ B).  (1)

Similarly, P(B) = P(B \ (A ∩ B)) + P(A ∩ B). Next,

P(A ∪ B) = P((A \ (A ∩ B)) ∪ (B \ (A ∩ B)) ∪ (A ∩ B))
         = P(A \ (A ∩ B)) + P(B \ (A ∩ B)) + P(A ∩ B)   (by A3)
         = [P(A \ (A ∩ B)) + P(A ∩ B)] + [P(B \ (A ∩ B)) + P(A ∩ B)] − P(A ∩ B)
         = P(A) + P(B) − P(A ∩ B),

which follows from result (1).


Example (Lotto, cont). A lotto-type barrel contains 10 balls numbered 1, 2, ..., 10. Three balls are drawn.

i. How many distinct samples can be drawn? 120.
ii. Event A = all drawn numbers lie in {1, 2, ..., 7}: P(A) = 7/24.
iii. B = all drawn numbers are even: P(B) = 1/12. Also P(A ∩ B) = 1/120.
iv. P(A ∪ B)?

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 7/24 + 1/12 − 1/120 = 44/120 = 11/30.


Poincaré's Theorem

Theorem 6 (Poincaré's formula, not part of M1905). Let A_1, A_2, ..., A_n be any events in Ω. Then

P(A_1 ∪ ... ∪ A_n) = Σ_i P(A_i) − Σ_{i<j} P(A_i ∩ A_j) + Σ_{i<j<k} P(A_i ∩ A_j ∩ A_k) − ... + (−1)^(n+1) P(A_1 ∩ ... ∩ A_n).


    (Unconditional) probability

• Recall the 3 axioms of probability.
• P(A^c) = 1 − P(A), since A ∪ A^c = Ω; hence 1 = P(Ω) = P(A ∪ A^c) = P(A) + P(A^c).
• P(∅) = 0, because ∅ = Ω^c; hence P(∅) = 1 − P(Ω) = 0.
• etc.


Conditional Probability: Motivating Example

Consider the following (fictional) table of sports mortality rates compiled over the last decade:

SPORT         Description                                          DEATHS
Chess         Board game considered the national sport of Russia        0
Boxing        Barbaric sport where two people hit each other            5
Chess Boxing  5 minutes of chess followed by 2 minutes of boxing        0
Sky Diving    Jumping out of a plane with a parachute                  10
Lawn Bowls    Rolling a ball across grass to hit other balls         1000

    Hence, Lawn Bowls is the most dangerous sport by far!


However, the number of deaths given that the sportsperson is young is zero, so that

P(Dying from Lawn Bowls | sportsperson is young) ≈ 0,

even though P(Dying from Lawn Bowls) is large.


Conditional Probability: Another Motivating Example

What is the probability of the important event

A = (starting salary after uni ≥ 60k)?

What is the sample space Ω? Possibilities are:

Ω_1 = {all students},
Ω_2 = {all male students},
Ω_3 = {all students with a maths degree}.


    Conclusion

• Probability depends on the underlying sample space Ω!
• Hence, if it is unclear to what sample space A refers, make it clear by writing P(A|Ω) instead of P(A), which we read as the conditional probability of A relative to (or given) Ω.

Definition 4. If A and B are any events in Ω and P(B) ≠ 0, then the conditional probability of A given B is

P(A|B) = P(A ∩ B) / P(B).
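Conditional probabilities can be estimated by restricting a simulation to the conditioning event. A short sketch (a hypothetical die example, not from the slides), where P(X = 6 | X even) = 1/3:

set.seed(1)
x = sample(1:6, 1e5, replace = TRUE)    # 100,000 fair die rolls
mean(x == 6)                            # approx P(X = 6) = 1/6
mean(x[x %% 2 == 0] == 6)               # approx P(X = 6 | X even) = 1/3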


    Additional material for Lecture 7

    A combinatorial proof of the binomial theorem

The binomial theorem says

(x + y)^n = Σ_{k=0}^n C(n, k) x^k y^(n−k).

Consider the more complicated product

(x_1 + y_1)(x_2 + y_2) ⋯ (x_n + y_n).

Its expansion consists of the sum of 2^n terms, each term being the product of n factors. Each term contains either x_k or y_k, for each k = 1, ..., n. For example,

(x_1 + y_1)(x_2 + y_2) = x_1 x_2 + x_1 y_2 + y_1 x_2 + y_1 y_2.

Now, there is 1 = C(n, 0) term with y terms only, n = C(n, 1) terms with one x term and (n − 1) y terms, etc. In general, there are C(n, k) terms with exactly k x's and (n − k) y's. The theorem follows by letting x_k = x and y_k = y.


    More on set theory

The operations of forming unions, intersections and complements of events obey rules similar to the rules of algebra. Following are some examples for events A, B and C:

Commutative law: A ∪ B = B ∪ A and A ∩ B = B ∩ A.
Associative law: (A ∪ B) ∪ C = A ∪ (B ∪ C) and (A ∩ B) ∩ C = A ∩ (B ∩ C).
Distributive law: (A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C) and (A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C).


    Tuesday, 16 August 2011

Lecture 8 - Content

• Conditional probability
• Bayes' rule
• Integer-valued random variables

Conditional probability equation:

P(A|Ω) = P(A ∩ Ω)/P(Ω) = P(A);  and for P(B) > 0: P(A|B) = P(A ∩ B)/P(B).


    Conditional probability (cont)

Example (Defective machine parts). Suppose that 500 machine parts are inspected before they are shipped.

• I = (a machine part is improperly assembled)
• D = (a machine part contains one or more defective components)

N(Ω) = 500, N(I) = 30, N(D) = 15, N(I ∩ D) = 10.


    Example (cont)

Assumption: equal probabilities in the selection of one of the machine parts. Using the classical concept of probability we get:

P(D) = P(D|Ω) = N(D)/N(Ω) = 15/500 = 3/100,

P(D|I) = N(D ∩ I)/N(I) = 10/30 = 1/3 > 3/100.

Note that if N(Ω) > 0, then

P(D|I) = [N(D ∩ I) · (1/N(Ω))] / [N(I) · (1/N(Ω))] = P(D ∩ I)/P(I).


    General multiplication rule of probability

Theorem 7 (General multiplication rule of probability). If A and B are any events in Ω, then

P(A ∩ B) = P(B) · P(A|B), if P(B) ≠ 0; interchanging A and B yields
P(A ∩ B) = P(A) · P(B|A), if P(A) ≠ 0.

Proof. This holds because P(A|B) := P(A ∩ B)/P(B), etc.

What happens if P(A|B) = P(A)? The additional information of B is of no use, and we get the special multiplication rule:

P(A ∩ B) = P(A) · P(B).


    Definition of independence of events

Definition 5. If A and B are any two events in a sample space Ω, we say that A is independent of B if and only if P(A|B) = P(A). From the general multiplication rule it follows that if P(A|B) = P(A) then P(B|A) = P(B), and we say simply that A and B are independent.


    Alternative View of Independence

Alternatively, if A and B are independent then P(A ∩ B) = P(A) · P(B), and hence

P(B|A) = P(A ∩ B)/P(A)       (using Bayes' rule)
       = P(A) · P(B)/P(A)    (using independence)
       = P(B),

which can also be interpreted as saying that knowing A does not affect the probability of B.


    Independence

In other words, the events A and B are independent if the chance that one happens remains the same regardless of how the other turns out.

Example. Suppose that we toss a fair coin twice. Let A = {heads on the first toss} and B = {heads on the second toss}. Now suppose A occurred. Then

P(B knowing A has happened) = 1/2.


Independence: Example 2

Example. Consider the following 6 boxes (the colours were lost in extraction; say the first three are green):

1 2 3   1 2 3

Suppose we select a box at random; as it is drawn you see that it is green. Then

P(A = {getting a 2}) = 2/6 = 1/3,
P(B = {getting a 2 if I know it is green}) = 1/3.

Knowing the selected box is green has not changed our knowledge about which numbers might be drawn. Hence, the events A and B are independent.


Independence: Example 3

Example. Consider the following 6 boxes (again, say the first three are green):

1 1 2   1 2 2

Suppose we select a box at random; as it is drawn you see that it is green. Then

P(A = {getting a 2}) = 3/6 = 1/2,
P(B = {getting a 2 if I know it is green}) = 1/3.

Knowing the selected box is green HAS CHANGED our knowledge about which numbers might be drawn. Hence, the events A and B are NOT independent.


Independence: Example 4

Example. Two cards are drawn at random from an ordinary deck of 52 playing cards. What is the probability of getting two aces if

(a) the first card is replaced before the second is drawn?
(Solution: 4/52 × 4/52 = 1/169, since here P(A_1 ∩ A_2) = P(A_1) · P(A_2).)

(b) the first card is not replaced before the second card is drawn?
(Solution: 4/52 × 3/51 = 1/221, but unlike above P(A_2|A_1) ≠ P(A_2).)

Independence is violated when the sampling is without replacement.
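Case (b) can also be approximated by simulation; a small sketch in R (not from the slides):

set.seed(42)
deck = rep(c("ace", "other"), c(4, 48))            # a 52-card deck, aces vs the rest
draws = replicate(1e5, sample(deck, 2))            # two cards, without replacement
mean(draws[1, ] == "ace" & draws[2, ] == "ace")    # approx 1/221 = 0.0045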


Independence: Example 5

Medical records indicate that the proportion of children who have had measles by the age of 8 is 0.4. The corresponding proportion for chicken pox is 0.5. The proportion who have had both diseases by the age of 8 is 0.3. An infant is randomly selected. Let A represent the event that he contracts measles, and B that he contracts chicken pox, by the age of 8 years.

• Estimate P(A), P(B) and P(A ∩ B).
P(A) = 0.4, P(B) = 0.5 and P(A ∩ B) = 0.3.
• Are A and B independent?
P(A) · P(B) = 0.2 ≠ P(A ∩ B) = 0.3, so NO, A and B are not independent.


Bayes' rule

Example (The burgers are better...). Assume you get your burgers

• 60% from supplier B_1 [HJ]
• 30% from supplier B_2 [McD]
• 10% from supplier B_3 [RR]

P(B_1) = 0.6, P(B_2) = 0.3, and P(B_3) = 0.1. We are interested in the event A = (good burger).

[Draw a picture that shows that B_1, B_2 and B_3 are mutually exclusive events with B_1 ∪ B_2 ∪ B_3 = Ω.]


    Example (cont)

It follows that

A = A ∩ (B_1 ∪ B_2 ∪ B_3) = (A ∩ B_1) ∪ (A ∩ B_2) ∪ (A ∩ B_3).

Note that (A ∩ B_1), (A ∩ B_2) and (A ∩ B_3) are mutually exclusive. By Axiom 3 we get

P(A) = P((A ∩ B_1) ∪ (A ∩ B_2) ∪ (A ∩ B_3)) = P(A ∩ B_1) + P(A ∩ B_2) + P(A ∩ B_3).

Remember the general multiplication rule: we already know that

P(A ∩ B) = P(B) · P(A|B), if P(B) ≠ 0,
         = P(A) · P(B|A), if P(A) ≠ 0.


So we can write

P(A) = P(B_1) · P(A|B_1) + P(B_2) · P(A|B_2) + P(B_3) · P(A|B_3)
     = 0.6 × 0.95 (very good) + 0.3 × 0.80 (sufficient) + 0.1 × 0.65 (insufficient)
     = 0.875.

[The above probabilities 0.95, 0.80, 0.65 are from personal experience, i.e. subjective probability.]

What did the example teach us? Strategy: decompose complicated events into mutually exclusive simple(r) events!


    Total probability rule

Theorem 8 (Total probability rule). If B_1, B_2, ..., B_n are mutually exclusive events such that B_1 ∪ B_2 ∪ ... ∪ B_n = Ω, then for any event A ⊆ Ω,

P(A) = Σ_{i=1}^n P(B_i) · P(A|B_i).

Example (Burger, cont). We know already that supplier B_3 is bad. So what is P(B_3|A) (if a burger is good, is it from B_3)? By definition of the conditional probability, since P(A) > 0,

P(B_3|A) = P(A ∩ B_3)/P(A) = P(B_3) · P(A|B_3) / Σ_{i=1}^3 P(B_i) · P(A|B_i) = (0.1 × 0.65)/0.875 = 0.074.

After we know that a burger is good, the probability that it comes from B_3 decreases from 0.1 to 0.074.
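The whole burger calculation fits in a few lines of R; a small sketch using the (subjective) values from the slides:

prior = c(B1 = 0.6, B2 = 0.3, B3 = 0.1)   # P(B_i)
lik   = c(0.95, 0.80, 0.65)               # P(A | B_i), A = (good burger)
p_A   = sum(prior * lik)                  # total probability rule: 0.875
post  = prior * lik / p_A                 # Bayes' rule: posterior P(B_i | A)
round(post, 3)                            # B3 drops from 0.100 to 0.074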


Bayes' Rule or Theorem

What we just derived is the famous formula called Bayes' rule or theorem.

Theorem 9 (Bayes' Rule). If B_1, B_2, ..., B_n are mutually exclusive events such that B_1 ∪ B_2 ∪ ... ∪ B_n = Ω, then for any event A ⊆ Ω,

P(B_j|A) = P(A|B_j) · P(B_j) / Σ_{i=1}^n P(A|B_i) · P(B_i).

The probabilities P(B_i) are called the prior probabilities and the probabilities P(B_i|A) the posterior probabilities, i = 1, ..., n.


Reverend Thomas Bayes (1701–1761)

• Born in Hertfordshire (London, England),
• was a Presbyterian minister,
• studied theology and mathematics,
• best known for "Essay Towards Solving a Problem in the Doctrine of Chances",
• where Bayes' Theorem was first proposed.
• Words: Bayes' rule, Bayes' Theorem, Bayesian Statistics.


Example of Bayes' Rule: Screening Test for Tuberculosis

                      TB (D+)   No TB (D−)   Total
X-ray positive (S+)        22           51      73
X-ray negative (S−)         8         1739    1747
Total                      30         1790    1820

What is the probability that a randomly selected individual has tuberculosis given that his or her X-ray is positive, given that P(D+) = 0.000093?

• P(D+) = 0.000093, which implies that P(D−) = 0.999907.
• P(S+|D+) = 22/30 = 0.7333
• P(S+|D−) = 51/1790 = 0.0285

P(D+|S+) = P(S+|D+)P(D+) / [P(S+|D+)P(D+) + P(S+|D−)P(D−)]
         = (0.7333 × 0.000093) / (0.7333 × 0.000093 + 0.0285 × 0.999907) = 0.00239.
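The same number drops out of a two-line R check (illustrative only):

p_d = 0.000093; sens = 22/30; fp = 51/1790    # prevalence, P(S+|D+), P(S+|D-)
sens * p_d / (sens * p_d + fp * (1 - p_d))    # P(D+|S+) = 0.00239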


Integer-valued random variables

Many observed numbers are the random result of many possible numbers.

Definition 6. A random variable X is a real-valued function of the elements of a sample space Ω.

Note that such functions are denoted with capital letters and their images (outcomes) with lower-case letters, e.g. x.

Examples.
• How many times (X) will you be caught speeding?
• What will your final mark (Y) for MATH1905 be?
• How old (Z, in years) do you think your stats lecturer is?


Random Variable Example: 3 Coins

Consider tossing three coins. The number of heads showing when the coins land is a random variable: it assigns the number 0 to the outcome {T, T, T}, the number 1 to the outcome {T, T, H}, the number 2 to the outcome {T, H, H}, and the number 3 to the outcome {H, H, H}.


Random Variable Example: 3 Coins

Events           Random Variable          Probability
TTT                                       P(X = 0) = 1/8
TTH, THT, HTT    X = {number of heads}    P(X = 1) = 3/8
THH, HTH, HHT                             P(X = 2) = 3/8
HHH                                       P(X = 3) = 1/8
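The distribution can also be generated by enumerating the sample space in R; a short illustrative check (not from the slides):

outcomes = expand.grid(c("H","T"), c("H","T"), c("H","T"))   # all 8 equally likely outcomes
heads = rowSums(outcomes == "H")                             # X for each outcome
table(heads) / 8                                             # P(X = 0..3) = 1/8, 3/8, 3/8, 1/8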


Random Variable Notation: 3 Coins

We use upper-case letters to denote unobserved random variables, say X, and lower-case letters for their observed values, in this case x. For example, in the above example, before the three coins land we denote the number of heads by X; after the coins have landed we denote the observed number of heads by x, so that we can write P(X = x).


    The mother of all examples: Bernoulli trials!

Definition 7. Bernoulli trials satisfy the following assumptions:

(i) there are only two possible outcomes for each trial,
(ii) the probability of success is the same for each trial,
(iii) the outcomes from different trials are independent,
(iv) there is a fixed number n of Bernoulli trials conducted.

Example (n = 1, coin). Ω: Head or Tail. We can describe the trial (before flipping the coin) in full detail. Consider a function

X : {H, T} → {0, 1} s.t. X(H) = x_H = 1 and X(T) = x_T = 0.

What is the probability that X = x_H = 1?

P(X = 1) = P(X = x_H) = P(H) = p = 1/2 ⇒ P(X = 0) = 1/2.


Jacob Bernoulli (1654–1705)

• Born in Basel (Switzerland),
• 1 of 8 mathematicians in his family,
• studied theology, then maths & astronomy,
• best known for "Ars Conjectandi" (The Art of Conjecture),
• application of probability theory to games of chance, introduction of the law of large numbers.
• Words: Bernoulli trial, Bernoulli numbers.


    Monday, 22 August 2011

Lecture 9 - Content

• Distribution of a random variable
• Binomial distribution
• Mean of a distribution


    Revised Axioms of Probability

In Lecture 7 we used the following definition of probability.

Definition 8 (due to Andrey Kolmogorov, 1933). Given a sample space Ω and an event A ⊆ Ω, we define P(A), the probability of A, to be a value of a non-negative additive set function that satisfies the following three axioms:

A1: For any event A, 0 ≤ P(A) ≤ 1,
A2: P(Ω) = 1,
A3: If A_1, A_2, A_3, ... is a finite or infinite sequence of mutually exclusive events in Ω, then

P(A_1 ∪ A_2 ∪ A_3 ∪ ...) = P(A_1) + P(A_2) + P(A_3) + ....

However, nothing is lost if we replace A1: 0 ≤ P(A) ≤ 1 with A1': 0 ≤ P(A).


    Proof by Contradiction

Assume the following 3 axioms:

A1: For any event A ⊆ Ω, 0 ≤ P(A),
A2: P(Ω) = 1,
A3: If A_1, A_2, A_3, ... is a finite or infinite sequence of mutually exclusive events in Ω, then

P(A_1 ∪ A_2 ∪ A_3 ∪ ...) = P(A_1) + P(A_2) + P(A_3) + ....

Now let us assume that A4: for some event A ⊆ Ω, P(A) > 1. Then

• 1 = P(Ω) = P(A ∪ A^c) = P(A) + P(A^c).
• Rearranging, we have P(A^c) = 1 − P(A).
• By A4 we have P(A^c) < 0. This contradicts A1, hence A4 cannot be assumed.


    Revised Axioms of Probability

Definition 9 (due to Andrey Kolmogorov, 1933). Given a sample space Ω and an event A ⊆ Ω, we define P(A), the probability of A, to be a value of a non-negative additive set function that satisfies the following three axioms:

A1: For any event A, 0 ≤ P(A),
A2: P(Ω) = 1,
A3: If A_1, A_2, A_3, ... is a finite or infinite sequence of mutually exclusive events in Ω, then

P(A_1 ∪ A_2 ∪ A_3 ∪ ...) = P(A_1) + P(A_2) + P(A_3) + ....

This is the minimal set of axioms needed to define probability.

Random Variables Reminder

n = 1 (Coin): Ω = {H, T}, X : Ω → {0, 1} ⊂ ℝ. Thus X(H) = x_H = 1 and X(T) = x_T = 0, and P(X = 1) = P(H) = p.


    Distribution of a random variable

Definition 10. The probability distribution of an integer-valued random variable X is a list of the possible values of X together with their probabilities

p_i = P(X = i) ≥ 0 with Σ_i p_i = 1.

There is nothing special about the subscript i; we could and will equally well use j, k, x etc.

Definition 11. The probability that the value of a random variable X is less than or equal to x, that is

F(x) = P(X ≤ x),

is called the cumulative distribution function, or just the distribution function. Also note that for integer-valued random variables

P(X = x) = F(x) − F(x − 1).


Example (n = 3, IT problems). A network is fragile. From experience: P(F) = 0.1 = 1 − p that in any given week there is ≥ 1 major problem, and P(S) = 0.9 = p that there is none, respectively. Out of 3 weeks, how many weeks, X, were problem-free, and with what probability?

(a) All possible outcomes:

FFF SFF FSF FFS
SSF FSS SFS SSS

(b) What is the probability of each outcome? Use the special multiplication rule of probability, because weeks are independent!

(c) What is the probability distribution of the number of successes, X, among the 3 weeks?


    Example (cont)

P(X = 0) = P(FFF) = P(F) · P(F) · P(F) = (1 − p)^3.

P(X = 1) = P(SFF ∪ FSF ∪ FFS)   (mutually exclusive events)
         = P(SFF) + P(FSF) + P(FFS)
         = 3(1 − p)^2 p = C(3, 1)(1 − p)^2 p,   select one S out of 3 trials.

Similarly we get for X = 2 and X = 3:

P(X = 2) = C(3, 2)(1 − p)p^2,   select two S out of 3 trials,
P(X = 3) = C(3, 3)p^3,          select three S out of 3 trials.
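These four probabilities are exactly dbinom(0:3, 3, p); a quick R check with p = 0.9 (illustrative, not from the slides):

p = 0.9
c((1-p)^3, 3*(1-p)^2*p, 3*(1-p)*p^2, p^3)   # hand-computed: 0.001 0.027 0.243 0.729
dbinom(0:3, 3, p)                           # the same values from the binomial distribution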


    Binomial distribution

We can generalise this result for any n ≥ 1 and success probability p ∈ [0, 1].

Definition 12. The probability distribution of the number of successes X = i in n ∈ ℕ independent Bernoulli trials is called the binomial distribution,

p_i = P(X = i) = C(n, i) p^i (1 − p)^(n−i).

The success probability of a single Bernoulli trial is p and i = 0, 1, ..., n.

To say that the random variable X has the binomial distribution with parameters n and p we write X ~ B(n, p). This defines a family of probability distributions, with each member characterised by a given value of the parameter p and the number of trials n.


Since p_i, 0 ≤ i ≤ n, is a probability distribution, we have the identity (which we will use later on)

Σ_{i=0}^n C(n, i) p^i (1 − p)^(n−i) = 1

for any 0 ≤ p ≤ 1.

A special case of the binomial distribution is the Bernoulli distribution, where n = 1 and

P(X = i) = p^i (1 − p)^(1−i), i = 0, 1.

There is another special relationship between the Bernoulli distribution and the binomial distribution: if X_i ~ Bernoulli(p) for 1 ≤ i ≤ n and Y = Σ_{i=1}^n X_i, then Y ~ B(n, p).


Example (Dice). Roll a fair die 9 times. Let X be the number of sixes obtained. Then X ~ B(9, 1/6); that is,

p_i = P(X = i) = C(n, i) p^i (1 − p)^(n−i) = C(9, i) (1/6)^i (5/6)^(9−i) = C(9, i) 5^(9−i) / 6^9,   i = 0, 1, ..., 9.


With your pocket calculator or with R:

> n = 9
> p = 1/6
> round(dbinom(0:n, n, p), 4)  # dbinom for B(n,p) probabilities
 [1] 0.1938 0.3489 0.2791 0.1302 0.0391
 [6] 0.0078 0.0010 0.0001 0.0000 0.0000
> pbinom(1, n, p)              # for B(n,p) cumulative probabilities
[1] 0.5426588

Hence, P(X = 4) = 0.0391 and P(X < 2) = F(1) = 0.5426588.


    Shape of the binomial distribution

We get a binomial distribution if
1. we are counting something over a fixed number of trials or repetitions,
2. the trials are independent, and
3. the probability of the outcome of interest is constant across trials.

• The binomial distribution is centred at n · p,
• the closer p is to 1/2, the more symmetric the distribution/histogram,
• the larger n, the closer the shape to a bell (normal).


[Figure: probability histograms for X ~ B(10, 0.5), X ~ B(10, 0.1), X ~ B(10, 0.8) and X ~ B(10, 0.4).]
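The four panels can be reproduced in a couple of lines of R (a sketch, not the original plotting code):

par(mfrow = c(2, 2))                      # 2 x 2 grid of panels
for (p in c(0.5, 0.1, 0.8, 0.4))
    barplot(dbinom(0:10, 10, p), names.arg = 0:10,
            main = sprintf("Probabilities for X~B(10,%g)", p))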

Example. In a small pond there are 50 fish, 20 of which have been tagged. Seven fish are caught and X represents the number of tagged fish in the catch. Assume each fish in the pond has the same chance of being caught. Is X binomial

(a) if each fish is returned before the next catch?

Yes, provided the fish do not learn from their experience, i.e. the probability of catching each fish stays the same for each of the 7 trials.

P(X = 1) = C(7, 1) (20/50)^1 (30/50)^6 ≈ 0.131 = dbinom(1, 7, 0.4)


(b) if the fish are not returned once they are caught?

This situation cannot be modelled by a binomial, as the proportion of tagged fish changes at each trial. If there were 5,000 fish, 2,000 of which had been tagged, then the change in the proportion would be negligible and we could model with a binomial.

P(X = 1) = C(20, 1) C(30, 6) / C(50, 7)
         = choose(20,1)*choose(30,6)/choose(50,7)  (in R)
         = 0.119 (to 3 d.p.)

P(X = 1) = C(2000, 1) C(3000, 6) / C(5000, 7)
         = choose(2000,1)*choose(3000,6)/choose(5000,7)  (in R)
         = 0.131 (to 3 d.p.)


    Mean of a distribution

Definition 13. For a random variable X taking values 0, 1, 2, ... with P(X = i) = p_i, i = 0, 1, 2, ..., the mean or expected value of X is defined to be

μ = E(X) = Σ_i i · p_i.

Interpretation of E(X):
• Long-run average of observations of X, because p_i ≈ f_i/n.
• Centre of balance of the probability density (histogram). (Draw picture.)
• Measure of location of the distribution.

Definition 14. For any function g(X) we define the expected value E(g(X)) by

E(g(X)) = Σ_i g(i) · p_i.


    Expectation of a Dice Roll

Let X = {face showing from a die roll}, where p_i = P(X = i) = 1/6 for i = 1, 2, ..., 6. Then

μ = E(X) = Σ_{i=1}^6 i · p_i = Σ_i i · 1/6 = 3.5.

Note: the expected value in this case is not one of the observed values.


    Mean of a distribution (cont)

Theorem 10. For constants a and b,

E(aX + b) = a E(X) + b.

Proof.

E(aX + b) = Σ_{all i} g(i) p_i, where g(i) = a·i + b,
          = Σ_{all i} [(a·i) p_i + b p_i]
          = a Σ_{all i} i p_i + b Σ_{all i} p_i
          = a E(X) + b.


Expectation of X ~ B(n, p)

Theorem 11. The expectation of X ~ B(n, p) is E(X) = np.

Proof.

E(X) = Σ_{i=0}^n i p_i = Σ_{i=0}^n i · n!/(i!(n − i)!) · p^i (1 − p)^(n−i);  the i = 0 term vanishes, so
     = Σ_{i=1}^n i · n!/(i!(n − i)!) · p^i (1 − p)^(n−i);  simplify,
     = Σ_{i=1}^n i · n(n − 1)!/(i(i − 1)!(n − i)!) · p^i (1 − p)^(n−i)
     = np Σ_{i=1}^n (n − 1)!/((i − 1)!(n − i)!) · p^(i−1) (1 − p)^(n−i);  substitute j = i − 1, m = n − 1.

Hence, E(X) = np Σ_{j=0}^m C(m, j) p^j (1 − p)^(m−j) = np, since the sum equals 1 (its terms are the probabilities of Y ~ B(m, p)).


Example (Multiple choice section in the M1905 exam is worth 35%). There are 20 questions and each question has 5 possible answers. A student decides to answer the questions by selecting an answer at random.

(a) What is the expected number of correct responses? Let X denote the number of correct answers. X ~ B(20, 0.2). The expected number of correct answers is np = 4.

(b) Probability that the student has more than 10 correct answers?

P(X > 10) = 1 − P(X ≤ 10) = 1 − 0.9994, with 1-pbinom(10,20,0.2)
          = 0.0006

(c) If the student scores 4 for a correct answer but −1 for a wrong response, what is his expected score?

E[4X + (−1)(20 − X)] = E(5X − 20) = 5 × 4 − 20 = 0.
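A short R check of (a) and (b) (illustrative):

sum(0:20 * dbinom(0:20, 20, 0.2))   # E(X) computed from the definition: 4
1 - pbinom(10, 20, 0.2)             # P(X > 10) = 0.00056, i.e. approx 0.0006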


    Tuesday, 23 August 2011

    Lecture 10 - Content

• Variance of a distribution
• More integer-valued distributions
• Probability generating functions


Expectation of a distribution: Reminders

The expectation of a distribution (or expectation of a random variable) is the mean of the probability distribution (a measure of distribution location). Note that

• E(X) = Σ_i i p_i = Σ_i i P(X = i), and
• E(g(X)) = Σ_i g(i) p_i = Σ_i g(i) P(X = i).


    Variance of a distribution

Example. Suppose X (e.g. the number of shoes in a suitcase) takes the values 2, 4 and 6 with probabilities

i     2    4    6
p_i   0.1  0.3  0.6

Hence, μ = E(X) = Σ_i i p_i = 2 × 0.1 + 4 × 0.3 + 6 × 0.6 = 5.


What is E(X^2)?

Suppose X (e.g. the number of shoes in a suitcase) takes the values 2, 4 and 6 with probabilities

i     2    4    6
p_i   0.1  0.3  0.6

What is E(X^2)?

Solution 1: by definition, E(X^2) = Σ_i g(i) p_i with g(i) = i^2, so E(X^2) = Σ_i i^2 p_i = 26.8 ≠ 5^2 = 25.

Solution 2: set j = i^2 and Y = X^2, and use E(Y) = Σ_j j p_j:

j     4    16   36
p_j   0.1  0.3  0.6

The distribution of Y can be hard to get (e.g. for continuous random variables).


Definition 15. The variance of the random variable X is defined by

Var(X) = σ^2 = E(X − μ)^2 = E(X^2) − μ^2,

where μ = E(X); σ^2 is also a measure of spread. This is like the large-sample limit of a sample variance. The standard deviation of X is σ = √σ^2.
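For the shoes example above, both forms of the variance agree; a short R check (illustrative):

i = c(2, 4, 6); p = c(0.1, 0.3, 0.6)
mu = sum(i * p)         # E(X) = 5
sum((i - mu)^2 * p)     # E(X - mu)^2 = 1.8
sum(i^2 * p) - mu^2     # E(X^2) - mu^2 = 26.8 - 25 = 1.8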


    Variance of a Linear Transformation

Theorem 12. For any constants a and b,

Var(aX + b) = a^2 Var(X).

Proof.

Var(aX + b) = E[(aX + b)^2] − (E[aX + b])^2
            = E[a^2 X^2 + 2abX + b^2] − (a E[X] + b)^2
            = a^2 E[X^2] + 2ab E[X] + b^2 − (a^2 E[X]^2 + 2ab E[X] + b^2)
            = a^2 (E[X^2] − E[X]^2)
            = a^2 Var(X).


Example. If X ~ B(n, p) then, as we will show later, Var(X) = n p (1 − p).

• Hence, if p = 0 or 1 then the variance is 0.
• The variance is largest when p = 0.5, and in this case it is σ^2 = n/4.


    More integer-valued distributions

    Geometric distribution

The binomial random variable is just one possible integer-valued random variable. Suppose we have an infinite sequence of independent trials, each of which gives a success with probability p and a failure with probability q = 1 − p.

Definition 16. The geometric distribution with parameter p (= success probability) has probabilities for the number of failures X before the first success

p_i = P(X = i) = q^i p,   i = 0, 1, 2, ....

Note the probabilities add to 1:

P(X = 0) + p_1 + ... = p + qp + q^2 p + ... = p(1 + q + q^2 + ...) = p · 1/(1 − q) = 1.

[By induction we can prove that 1 + q + ... + q^n = (1 − q^(n+1))/(1 − q).]


Example. A fair die is thrown repeatedly until it shows a six.

(a) What is the probability that more than 7 throws are required?

P(X > 7) = 1 − P(X ≤ 7) = 1 − Σ_{i=0}^7 (5/6)^i (1/6) = (5/6)^8 = 0.233 (3 d.p.),

with 1-pgeom(7,1/6) or with 1-sum(dgeom(0:7,1/6)).

(b) Is it more likely that an odd number of throws is required or an even number? Because 0 ≤ P(X = i) ≤ 1 and F(∞) = 1, we find

P(X even) − P(X odd) = Σ_{j=1}^∞ P(X = 2(j − 1)) − Σ_{k=1}^∞ P(X = 2k − 1)
                     = Σ_{j=1}^∞ [P(X = 2(j − 1)) − P(X = 2j − 1)]
                     = Σ_{j=1}^∞ [q^(2(j−1)) p − q^(2j−1) p]
                     = p Σ_{j=1}^∞ q^(2(j−1)) (1 − q) ≥ 0,

so X even, i.e. an odd number of throws, is more likely.
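The parity claim in (b) can also be checked numerically with dgeom; an illustrative sketch, summing far enough that the remaining tail is negligible:

i = 0:1000                      # X = failures before the first six
pmf = dgeom(i, 1/6)
sum(pmf[i %% 2 == 0])           # P(X even) = P(odd number of throws) = 6/11 = 0.545
sum(pmf[i %% 2 == 1])           # P(X odd)  = 5/11 = 0.455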


    The Poisson approximation to the Binomial

The Poisson distribution often serves as a first theoretical model for counts which do not have a natural upper bound. Possible examples:

• modelling the number of accidents, crashes, breakdowns,
• modelling radioactivity measured by a Geiger counter,
• modelling so-called rare events (meteorite impacts, heart attacks).

Whether or not the Poisson distribution sufficiently describes count data is not answered at this early stage but postponed to later lectures in statistics.


The Poisson distribution can be seen as the limiting distribution of B(n, p): let n → ∞, while p → 0 and np → λ ∈ (0, ∞). For X ~ B(n, p) we know that

P(X = k) = C(n, k) p^k (1 − p)^(n−k) = (∗) · (∗∗).

Then

(∗) = C(n, k) p^k = [n(n − 1) ⋯ (n − k + 1) / n^k] · (np)^k / k! → λ^k / k!,

since each of the k factors n(n − 1) ⋯ (n − k + 1)/n^k tends to 1, and

(∗∗) = (1 − p)^(n−k) = (1 − λ/n)^n (1 − λ/n)^(−k) → e^(−λ).

Hence,

P(X = k) → e^(−λ) λ^k / k!,   for k = 0, 1, 2, ....


The approximation is good if n p^2 is small! Here X ~ B(158, 1/365) and n p^2 = 0.001186:

> # What is the probability that of 158 people, exactly k have a birthday today?
> n = 158; p = 1/365
> round(dbinom(0:7, n, p), 5)
[1] 0.64826 0.28139 0.06068 0.00867 0.00092 0.00008 0.00001 0.00000
> round(dpois(0:7, p*n), 5)
[1] 0.64864 0.28078 0.06077 0.00877 0.00095 0.00008 0.00001 0.00000

But for n = 10:

> n = 10; p = 1/3
> round(dbinom(0:4, n, p), 5)
[1] 0.01734 0.08671 0.19509 0.26012 0.22761
> round(dpois(0:4, p*n), 5)
[1] 0.03567 0.11891 0.19819 0.22021 0.18351


Probability distributions for X ~ B(n, p) and X ~ P(λ)

[Figure: probability histograms for X ~ B(158, 1/365) vs X ~ Poi(158/365), and X ~ B(10, 1/3) vs X ~ Poi(10/3).]


    Probability generating functions

Let X ∈ ℕ and p_i = P(X = i), i = 0, 1, 2, ....

Definition 17. The probability generating function is defined as

Π(s) = p_0 + p_1 s + p_2 s^2 + p_3 s^3 + ....

Example. If X only takes a finite number of values (e.g. X ~ B(n, p)) then Π(s) is a polynomial. Alternatively (e.g. X ~ P(λ)), Π(s) is a power series.


Properties of Π(s)

Let s ∈ [0, 1]. Then

• 0 ≤ Π(s) ≤ 1,
• Π(1) = p_0 + p_1 + ... = 1,
• Π'(s) = p_1 + 2p_2 s + 3p_3 s^2 + ... ≥ 0 for s ≥ 0,
• Π'(1) = p_1 + 2p_2 + 3p_3 + ... = E(X) (if E(X) is finite),
• Π''(s) = 2p_2 + 6p_3 s + 12p_4 s^2 + ...; at s = 1, Π''(1) = E(X(X − 1)), and

Var(X) = E(X^2) − (E X)^2 = Π''(1) + Π'(1) − (Π'(1))^2.


Example (Poisson distribution). For X ~ P(λ),

Π(s) = Σ_{i=0}^∞ e^(−λ) λ^i/i! · s^i = e^(−λ) Σ_{i=0}^∞ (λs)^i/i! = e^(−λ) e^(λs) = e^(λ(s−1)).

Hence,

Π'(s) = λ e^(λ(s−1)), so E(X) = λ (= Π'(1)),
Π''(s) = λ^2 e^(λ(s−1)), so E[X(X − 1)] = λ^2,

and

Var(X) = E(X^2) − (E X)^2 = λ^2 + λ − λ^2 = λ.


Example (Binomial distribution). Let X ~ B(n, p). First, note that

(x + y)^n = Σ_{i=0}^n C(n, i) x^i y^(n−i).  (2)

Then

Π(s) = Σ_{i=0}^n s^i C(n, i) p^i (1 − p)^(n−i) = Σ_{i=0}^n C(n, i) (sp)^i (1 − p)^(n−i) = (1 − p + ps)^n,

which follows from (2). Then

Π'(s) = np(1 − p + ps)^(n−1), so that Π'(1) = E(X) = np,
Π''(s) = n(n − 1)p^2 (1 − p + ps)^(n−2), so that Π''(1) = n(n − 1)p^2,

and finally

Var(X) = Π''(1) − (Π'(1))^2 + Π'(1) = n(n − 1)p^2 − n^2 p^2 + np = np(1 − p).
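A final numerical sanity check of Var(X) = np(1 − p) via simulation (illustrative, not from the slides):

set.seed(7)
x = rbinom(1e5, size = 10, prob = 0.3)   # 100,000 draws from B(10, 0.3)
var(x)                                   # approx np(1-p) = 10 * 0.3 * 0.7 = 2.1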
