Today: lab 2 due; Monday: Quiz 4; Wed: A3 due; Friday: Lab 3 due; Mon Oct 1: Exam I, this room, 12 pm.

Post on 17-Dec-2015

• Today: lab 2 due

• Monday: Quiz 4

• Wed: A3 due

• Friday: Lab 3 due

• Mon Oct 1: Exam I this room, 12 pm

Recap last lecture Ch 6.1

•Empirical frequency distributions

•Discrete

•Continuous

•Four forms

•F(Q=k), F(Q=k)/n, F(Q≤k), F(Q≤k)/n

•Four uses

•Summarization gives clue to process

•Summarization useful for comparisons

•Used to make statistical decisions

•Reliability evaluation
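
The four forms in the recap can be sketched in a few lines of Python. This is only an illustration: the function name `four_forms` and the sample values are hypothetical, not taken from the lecture.

```python
from collections import Counter
from itertools import accumulate


def four_forms(sample):
    """Return rows (k, F(Q=k), F(Q=k)/n, F(Q<=k), F(Q<=k)/n) for a discrete sample."""
    n = len(sample)
    counts = Counter(sample)
    ks = sorted(counts)                      # observed outcomes, in increasing order
    freq = [counts[k] for k in ks]           # F(Q=k)
    cum = list(accumulate(freq))             # F(Q<=k)
    return [(k, f, f / n, c, c / n) for k, f, c in zip(ks, freq, cum)]


# Hypothetical sample, used only to show the four forms side by side.
sample = [0, 1, 1, 2, 2, 2, 3, 3]
for row in four_forms(sample):
    print(row)
```

The last row always ends with a cumulative relative frequency of 1, which is a quick check that nothing was dropped.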

Today

Read lecture notes!

[Figure: histogram; x-axis: Age of mothers (15 to 45), y-axis: Frequency (0 to 12)]

Distribution of ages of mothers

Sample: students that attended class in 1997

Population: MUN students

Unknown distribution

Solution: use theoretical frequency dist to characterize pop

Assumption: observations are distributed in the same way as theoretical dist

Theoretical distribution is a model of a frequency distribution

Commonly used theoretical dist:

Discrete

Binomial

Multinomial

Poisson

Negative binomial

Hypergeometric

Uniform

Continuous

Normal

Chi-square (χ²)

t

F

Log-normal

Gamma

Cauchy

Weibull

Uniform

Theoretical frequency distributions

4 forms

Empirical (n=sample)

Theoretical (N=pop discrete)

Theoretical (N=pop continuous)

Theoretical frequency distributions - 4 uses

1. Clue to underlying process

If an empirical dist fits one of the following, this suggests the kind of mechanism that generated the data

a) Uniform dist

e.g. # of people per table

mechanism: all outcomes have equal prob

b) Normal dist

e.g. oxygen intake per day

mechanism: several independent factors, no prevailing factor

c) Poisson dist

e.g. # of deaths by horsekick in the Prussian army, per year

mechanism: rare & random event

d) Binomial dist

e.g. # of heads/tails on coin toss

mechanism: yes/no outcome

2. Summarize data

dist info contained in parameters

e.g. number of events per unit space or time can be summarized as the expected value of a Poisson dist

Can make comparisons
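
Why a single number is enough for a Poisson dist: a short derivation, writing λ for the Poisson parameter (the symbol is an assumption here, chosen to match the usual notation).

```latex
E[N] = \sum_{k=0}^{\infty} k \, \frac{e^{-\lambda} \lambda^{k}}{k!}
     = \lambda e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{(k-1)!}
     = \lambda e^{-\lambda} e^{\lambda}
     = \lambda
```

The variance of a Poisson dist is also λ, so the expected value summarizes both the centre and the spread.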

3. Decision making. Use theoretical dist to calculate p-value

p(X1 ≤ x)

p(X2 > x)

MiniTab: cdf

R: pnorm()
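
A minimal sketch of the two tail probabilities for a normal dist, using `statistics.NormalDist` from the Python standard library as a stand-in for MiniTab's cdf or R's pnorm(). The helper names `lower_tail` and `upper_tail` are hypothetical.

```python
from statistics import NormalDist


def lower_tail(x, mu=0.0, sigma=1.0):
    """p(X <= x) for a normal dist with mean mu and standard deviation sigma."""
    return NormalDist(mu, sigma).cdf(x)


def upper_tail(x, mu=0.0, sigma=1.0):
    """p(X > x), the complement of the lower tail."""
    return 1.0 - NormalDist(mu, sigma).cdf(x)


print(round(lower_tail(1.96), 4))  # about 0.975
print(round(upper_tail(1.96), 4))  # about 0.025
```

Which tail serves as the p-value depends on the direction of the hypothesis being tested.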

4. Reliability. Put probability range around outcome

MiniTab: invcdf

R: qnorm()
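
The inverse cdf goes the other way: from a probability to an outcome. A minimal sketch with `statistics.NormalDist.inv_cdf` as a stand-in for invcdf or qnorm(); the function name `central_range` and the 95% default are assumptions made for illustration.

```python
from statistics import NormalDist


def central_range(mu, sigma, coverage=0.95):
    """Return (low, high) so that a normal dist puts `coverage` probability between them."""
    alpha = 1.0 - coverage
    dist = NormalDist(mu, sigma)
    # Split the leftover probability equally between the two tails.
    return dist.inv_cdf(alpha / 2), dist.inv_cdf(1.0 - alpha / 2)


low, high = central_range(0.0, 1.0)
print(round(low, 2), round(high, 2))  # about -1.96 and 1.96
```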

Computing probabilities from observed vs theoretical dist

Theoretical

Advantages:

• Easy

• Familiar

• Recipes, known performance

Disadvantages:

• Assumptions may not apply, giving wrong p-values

• Checking assumptions is laborious

Empirical

Advantages:

• No assumptions

• Easy to defend

Disadvantages:

• Computation

• Not always easy to carry out

Ch 6.3 Fit of Observed to Theoretical

Will present 2 examples: 1 continuous, 1 discrete

More examples in lecture notes

Example 1 (Poisson)

Number of coal mining disasters, 1851-1866 (England)

NDisaster = [4 5 4 1 0 4 3 4 0 6 3 3 4 0 2 4]

sum(N)=47

k = [0 1 2 3 4 5 6] = outcomes(N)

n = 16 observations

Tabulate F(N=k) for k = 0, 1, ..., 6, and estimate the Poisson parameter as:

$$\hat{\lambda} = \frac{\mathrm{sum}(N)}{n}$$

k   F(N=k)   F(N=k)/n   Pr(N=k)   Obs-Exp
0   3        0.1875     0.0530     0.1345
1   1        0.0625     0.1557    -0.0932
2   1        0.0625     0.2287    -0.1662
3   3        0.1875     0.2239    -0.0364
4   6        0.3750     0.1644     0.2106
5   1        0.0625     0.0966    -0.0341
6   1        0.0625     0.0473     0.0152

$$\hat{\lambda} = 47/16 = 2.9375$$

$$\Pr(N = k) = \frac{e^{-\lambda} \lambda^{k}}{k!}$$
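
The Poisson table can be reproduced with a short Python sketch. The data are the disaster counts given in the example; the variable and function names (`n_disaster`, `poisson_pmf`, `rows`) are hypothetical.

```python
from collections import Counter
from math import exp, factorial

# Coal mining disasters per year, 1851-1866, as listed in the example.
n_disaster = [4, 5, 4, 1, 0, 4, 3, 4, 0, 6, 3, 3, 4, 0, 2, 4]


def poisson_pmf(k, lam):
    """Pr(N = k) for a Poisson dist with parameter lam."""
    return exp(-lam) * lam**k / factorial(k)


n = len(n_disaster)
lam_hat = sum(n_disaster) / n          # 47 / 16 = 2.9375
counts = Counter(n_disaster)

rows = []
for k in range(max(n_disaster) + 1):
    observed = counts[k] / n                     # F(N=k)/n
    expected = poisson_pmf(k, lam_hat)           # Pr(N=k)
    rows.append((k, counts[k], observed, expected, observed - expected))

for k, f, obs, exp_p, diff in rows:
    print(f"{k}  {f}  {obs:.4f}  {exp_p:.4f}  {diff:+.4f}")
```

The Obs-Exp column here is computed on proportions, as in the table; multiplying by n = 16 would give the same comparison on counts.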

Example 2 (Normal)

Age of mothers of students in quant 1997

Are the ages normally distributed?

[Figure: histogram; x-axis: Age of mothers (15 to 45), y-axis: Frequency (0 to 12)]

Strategy: work with probability plots, compute cdf

Expected distribution:

$$\Pr(\mathrm{Age} = x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$

[Figure: Normal Q-Q Plot; x-axis: Theoretical Quantiles (-2 to 2), y-axis: Sample Quantiles (20 to 40)]
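
A minimal sketch of how the points of a normal Q-Q plot can be computed. The class data are not in these slides, so the ages below are hypothetical; the plotting positions (i - 0.5)/n are one common convention and an assumption here, and software packages may use a slightly different rule.

```python
from statistics import NormalDist


def normal_qq_points(sample):
    """Return (theoretical quantile, sample quantile) pairs for a normal Q-Q plot."""
    ordered = sorted(sample)               # sample quantiles
    n = len(ordered)
    std_normal = NormalDist()              # standard normal: mean 0, sd 1
    # Assumed plotting positions: (i - 0.5) / n for i = 1..n.
    return [
        (std_normal.inv_cdf((i - 0.5) / n), x)
        for i, x in enumerate(ordered, start=1)
    ]


# Hypothetical ages, NOT the 1997 class data.
ages = [19, 22, 24, 25, 26, 27, 28, 30, 33, 38]
points = normal_qq_points(ages)
for theoretical, observed in points:
    print(f"{theoretical:6.2f}  {observed}")
```

If the ages are close to normally distributed, the points fall near a straight line; systematic curvature suggests skew or heavy tails.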