Epidemiology 9509 - Principle of Biostatistics Chapter 5 Probability Distributions...

Post on 16-Feb-2020

Epidemiology 9509 probability dist’ns (continued)

Epidemiology 9509
Principle of Biostatistics

Chapter 5
Probability Distributions (continued)

John Koval

Department of Epidemiology and Biostatistics
University of Western Ontario

What was covered previously

1. probability
   P(A)
   sets
   P(A and B); P(A or B)

2. probability distributions
   2.1 discrete
       2.1.1 equiprobable
       2.1.2 Bernoulli
       2.1.3 binomial
       2.1.4 Poisson
   2.2 continuous
       2.2.1 uniform
       2.2.2 normal

3. calculating probabilities
   3.1 discrete: Pr(X = x)
   3.2 continuous: intervals, Pr(X < a), Pr(a < X < b)

What is being covered now

Using SAS to

1. calculate probabilities

2. calculate and plot probability distributions

Calculating probabilities

SAS function PDF

title 'calculate binomial probability';
data binom1;
  /* Pr(X = 4) for X ~ Bin(n = 10, p = 0.4) */
  prob = pdf('binomial', 4, 0.4, 10);
  output;
run;
proc print data=binom1;
run;
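As a cross-check on the PDF call, the same probability can be computed from the binomial formula Pr(X = x) = C(n, x) p^x (1 - p)^(n - x). The following is a minimal sketch and not part of the original slides; the data set name binom1check and the variable names byformula and bypdf are made up for illustration. COMB is the SAS function for the number of combinations.

```sas
title 'check binomial probability by formula';
data binom1check;
  n = 10;
  p = 0.4;
  x = 4;
  /* C(n, x) * p**x * (1 - p)**(n - x) */
  byformula = comb(n, x) * p**x * (1 - p)**(n - x);
  /* same quantity from the PDF function */
  bypdf = pdf('binomial', x, p, n);
  output;
run;
proc print data=binom1check;
run;
```

By hand, 210 × 0.0256 × 0.046656 = 0.25082, so the two columns should agree with each other and with the value quoted in the lecture.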

binomial probability

calculate binomial probability

Obs    prob
  1    0.25082

Does this agree with previous calculations?
0.251, Lecture Chapter 5, page 8

Calculating probability distribution

title 'calculate binomial probability distribution';
data binom2;
  /* Pr(X = x) for x = 0, ..., 10 with X ~ Bin(n = 10, p = 0.4) */
  do x = 0 to 10 by 1;
    prob = pdf('binomial', x, 0.4, 10);
    output;
  end;
run;
proc print data=binom2;
run;
proc gplot data=binom2;
  plot prob*x;
run;
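A quick sanity check on a computed distribution is that the probabilities sum to 1 and that the mean equals n × p (here 10 × 0.4 = 4). The sketch below is an illustration only; the data set name binom2check and the variables total and mean are hypothetical names, not from the slides.

```sas
title 'check binomial distribution: total probability and mean';
data binom2check;
  total = 0;   /* running sum of Pr(X = x) */
  mean = 0;    /* running sum of x * Pr(X = x) */
  do x = 0 to 10;
    prob = pdf('binomial', x, 0.4, 10);
    total = total + prob;
    mean = mean + x*prob;
  end;
  keep total mean;
  output;      /* one row, written after the loop finishes */
run;
proc print data=binom2check;
run;
```

Up to floating-point rounding, total should be 1 and mean should be 4.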

binomial probability distribution

calculate binomial probability distribution

Obs     x      prob
  1     0    0.00605
  2     1    0.04031
  3     2    0.12093
  4     3    0.21499
  5     4    0.25082
  6     5    0.20066
  7     6    0.11148
  8     7    0.04247
  9     8    0.01062
 10     9    0.00157
 11    10    0.00010

GPLOT of pdf

Calculating cumulative probabilities

values up to and including x, i.e. Pr(X ≤ x)
SAS function CDF

title 'calculate cumulative binomial probability';
data binom3;
  /* Pr(X <= 7) for X ~ Bin(n = 20, p = 0.4) */
  prob = cdf('binomial', 7, 0.4, 20);
  output;
run;
proc print data=binom3;
run;
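Because the cumulative probability is the sum of the individual probabilities up to and including the cut-off, the CDF result can be reproduced by adding up PDF values. This is a sketch for illustration, with made-up names (binom3check, bysum, bycdf, upper); the last line also shows the complementary upper tail Pr(X > 7) = 1 - Pr(X ≤ 7).

```sas
title 'check cumulative binomial probability by summation';
data binom3check;
  bysum = 0;
  do x = 0 to 7;
    /* add Pr(X = x) for x = 0, ..., 7 */
    bysum = bysum + pdf('binomial', x, 0.4, 20);
  end;
  bycdf = cdf('binomial', 7, 0.4, 20);  /* Pr(X <= 7) directly */
  upper = 1 - bycdf;                    /* Pr(X > 7) */
  keep bysum bycdf upper;
  output;
run;
proc print data=binom3check;
run;
```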

binomial cumulative distribution calculation

calculate cumulative binomial probability

Obs    prob
  1    0.41589

Does this agree with previous calculations?
0.4159, using R, Lecture Chapter 5, page 30

cumulative continuous probabilities

$$\Pr(X < b) = \Pr\left(Z_N < \frac{b-\mu}{\sigma}\right) = \Phi\left(\frac{b-\mu}{\sigma}\right)$$

Φ() given by SAS function PROBNORM

example

Recall normal approximation to binomial

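For X ~ Bin(20, 0.4): µ = 20 × 0.4 = 8 and σ = √(20 × 0.4 × 0.6) = 2.19

want
$$\Pr(X_{norm} < 7.5) = \Pr\left(Z_N < \frac{7.5-8}{2.19}\right) = \Phi(-0.228)$$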

title 'calculate Normal probability';
data norm1;
  /* Phi(-0.228): standard normal probability below z = -0.228 */
  prob = probnorm(-0.228);
  output;
run;
proc print data=norm1;
run;
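To see how close the normal approximation is, the exact binomial probability Pr(X ≤ 7) and the approximation with the continuity correction (cut-off 7.5) can be put side by side in one data step. This is a sketch under the assumption that the example refers to Bin(20, 0.4), which matches µ = 8 and σ = 2.19; the names approxcheck, exact, approx and diff are illustrative only.

```sas
title 'exact binomial versus normal approximation';
data approxcheck;
  n = 20;
  p = 0.4;
  mu = n*p;                          /* 8 */
  sd = sqrt(n*p*(1 - p));            /* about 2.19 */
  exact = cdf('binomial', 7, p, n);  /* Pr(X <= 7), exact */
  z = (7.5 - mu)/sd;                 /* continuity-corrected z, about -0.228 */
  approx = probnorm(z);              /* Phi(z) */
  diff = approx - exact;
  output;
run;
proc print data=approxcheck;
run;
```

Here z is not rounded to three decimals, so approx may differ from Φ(-0.228) in the fourth decimal place.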

normal cumulative probability calculation

calculate Normal probability

Obs    prob
  1    0.40982

Does this agree with previous calculations?
0.4098 by linear interpolation, see lecture Chapter 5, page 30

Probability of interval

$$\Pr(17 < X < 22) = \Pr\left(\frac{17-20}{5} < Z_N < \frac{22-20}{5}\right) = \Pr(-0.6 < Z_N < 0.4) = \Phi(0.4) - \Phi(-0.6)$$

title 'calculate Normal probability for interval';
data norm2;
  a = -0.6;                  /* (17 - 20)/5 */
  b = 0.4;                   /* (22 - 20)/5 */
  proba = probnorm(a);       /* Phi(a) */
  probb = probnorm(b);       /* Phi(b) */
  probint = probb - proba;   /* Pr(a < Z < b) */
  output;
run;
proc print data=norm2;
run;
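The standardisation step can also be left to SAS: the CDF function with the 'normal' distribution accepts the mean and standard deviation as extra arguments, so the interval probability can be computed on the original scale. A minimal sketch, assuming µ = 20 and σ = 5 as in the calculation above; the data set name norm2direct is made up.

```sas
title 'Normal probability for interval, without standardising by hand';
data norm2direct;
  mu = 20;
  sigma = 5;
  /* Pr(17 < X < 22) = F(22) - F(17) for X ~ N(mu, sigma) */
  probint = cdf('normal', 22, mu, sigma) - cdf('normal', 17, mu, sigma);
  output;
run;
proc print data=norm2direct;
run;
```

This should reproduce the value of probint obtained from PROBNORM, since Φ((b - µ)/σ) and the normal CDF at b are the same quantity.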

normal probability calculation for an interval

calculate Normal probability for interval

Obs      a      b     proba      probb     probint
  1   -0.6    0.4    0.27425    0.65542    0.38117

Does this agree with previous calculations?
0.3809, see lecture Chapter 5, page 26

Plotting normal density function

not usually done in practice

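data norm3;
  /* density of N(mean = 4, sd = 1.55) on a grid from 0 to 10 */
  do x = 0 to 10 by 0.05;
    density = pdf('normal', x, 4, 1.55);
    output;
  end;
run;
symbol interpol=join;   /* connect the points with a line */
proc gplot data=norm3;
  plot density*x;
run;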

GPLOT of pdf of Normal N(4,2.4)

normal approximation to binomial

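title 'Normal approximation to binomial';
data normbinom;
  n = 20;
  pi = 0.4;
  mu = n*pi;              /* binomial mean */
  var = n*pi*(1 - pi);    /* binomial variance */
  sd = sqrt(var);
  /* fine grid so the binomial plots as a step function */
  do i = 0 to 20.975 by 0.025;
    binompdf = pdf('binomial', floor(i), pi, n);
    x = i - 0.5;          /* centre each step on its integer count */
    normpdf = pdf('normal', x, mu, sd);
    output normbinom;
  end;
run;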

normal approximation to binomial (continued)

symbol interpol=join;   /* connect the points with a line */
proc gplot data=normbinom;
  plot binompdf*x normpdf*x /
       haxis=-1 to 21 by 1 vaxis=0 to 0.2 by 0.05 overlay;
run;

GPLOT of normal approximation to Bin(20,0.4)

another normal approximation to binomial

non-symmetric distribution Bin(10,.2)

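data normbinom2;
  n = 10;
  pi = 0.2;
  mu = n*pi;              /* binomial mean */
  var = n*pi*(1 - pi);    /* binomial variance */
  sd = sqrt(var);
  /* upper limit 10.975 mirrors the Bin(20, 0.4) example (slide had 10.9075) */
  do i = 0 to 10.975 by 0.025;
    binompdf = pdf('binomial', floor(i), pi, n);
    x = i - 0.5;          /* centre each step on its integer count */
    normpdf = pdf('normal', x, mu, sd);
    output normbinom2;
  end;
run;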

non-symmetric distribution (continued)

symbol interpol=join;   /* connect the points with a line */
proc gplot data=normbinom2;
  plot binompdf*x normpdf*x /
       haxis=-1 to 11 by 1 vaxis=0 to 0.5 by 0.05 overlay;
run;

Normal approximation to Bin(10,0.2)

original distribution is asymmetric
not a good fit to the normal
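The quality of the fit can also be judged numerically rather than from the plot. The sketch below tabulates, for each count k, the exact Bin(10, 0.2) probability next to the normal probability of the interval (k - 0.5, k + 0.5). It is an illustration only; the names fitcheck, exact, approx and diff are hypothetical.

```sas
title 'Bin(10,0.2): exact probabilities versus normal approximation';
data fitcheck;
  n = 10;
  p = 0.2;
  mu = n*p;
  sd = sqrt(n*p*(1 - p));
  do k = 0 to n;
    exact = pdf('binomial', k, p, n);   /* Pr(X = k) */
    /* normal probability of the interval (k - 0.5, k + 0.5) */
    approx = cdf('normal', k + 0.5, mu, sd) - cdf('normal', k - 0.5, mu, sd);
    diff = approx - exact;
    output;
  end;
run;
proc print data=fitcheck;
  var k exact approx diff;
run;
```

Changing n and p to 20 and 0.4 gives the same comparison for the more symmetric case.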
