The Normal Distribution - University of Wisconsin–Madison

The Normal Distribution

Outline

1 Introduction: Continuous Random Variables (3.6)

2 Introduction: Normal Distribution (4.1-4.2)

3 Areas Under a Normal Curve (4.3)
   Standard Normal Table: Computing Probabilities
   Standard Normal Table: Finding Quantiles
   General Normal Distribution
   Using the Computer
   A Case Study

4 Assessing Normality (4.4)

1 Introduction: Continuous Random Variables (3.6)

Recall: Discrete Random Variables

We can describe a discrete random variable with a table. For example:

k            0    1    5    10
Pr {X = k}   0.1  0.5  0.1  0.3

But a continuous random variable can take any value in an interval; we could never list all of these values in a table.

Continuous Distributions

A continuous random variable has possible values over a continuum.

The total probability of one is not in discrete chunks at specific locations, but rather is ground up like a very fine dust and sprinkled on the number line.

We cannot represent the distribution with a table of possible values and the probability of each.

Continuous Distributions (continued)

Instead, we represent the distribution with a probability density function, which measures the thickness of the probability dust.

Probability is measured over intervals as the area under the curve.

A legal probability density f:

1 is never negative ($f(x) \ge 0$ for $-\infty < x < \infty$).
2 has a total area under the curve of one ($\int_{-\infty}^{\infty} f(x)\,dx = 1$).
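The two conditions for a legal density can be checked numerically. The sketch below uses a made-up density, f(x) = 2x on [0, 1] and zero elsewhere, which is an assumption chosen for illustration and does not come from the course material.

```r
# Hypothetical density for illustration: f(x) = 2x on [0, 1], zero elsewhere.
f <- function(x) ifelse(x >= 0 & x <= 1, 2 * x, 0)

# Condition 1: f is never negative (checked on a grid of points).
grid <- seq(-2, 3, by = 0.01)
never_negative <- all(f(grid) >= 0)

# Condition 2: the total area under the curve is one.
# f is zero outside [0, 1], so integrating over [0, 1] covers all of the area.
total_area <- integrate(f, lower = 0, upper = 1)$value

# Probability of an interval is the area under the curve over that interval.
# Here Pr {0.5 <= X <= 1} is the integral of 2x from 0.5 to 1, which is 0.75.
prob_half_to_one <- integrate(f, lower = 0.5, upper = 1)$value
```

Both checks are numerical, so they support the conditions without proving them.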


For You

Read 3.6 in your book

2 Introduction: Normal Distribution (4.1-4.2)

The Normal Distribution

The Normal Distribution is the most important distribution of continuous random variables.

The normal density curve is the famous symmetric, bell-shaped curve.

The central limit theorem is the reason that the normal curve is so important. Essentially, many statistics that we calculate from large random samples will have approximate normal distributions (or distributions derived from normal distributions), even if the underlying variables are not normally distributed.

This fact is the basis of most of the methods of statistical inference we will study in the last half of the course.

Chapter 4 introduces the normal distribution as a probability distribution.

Chapter 5 culminates in the central limit theorem, the primary theoretical justification for most of the methods of statistical inference in the remainder of the textbook.

The famous bell curve

Used everywhere: very useful description for lots of biological (and other) random variables.

Body weight,

Crop yield,

Protein content in soybean,

Density of blood components

Y ∼ N (µ, σ): values can, in principle, go to infinity, both ways.

[Figure: bell-shaped density curve over an axis marked µ − 4σ, µ − 2σ, µ, µ + 2σ, µ + 4σ, with a width of σ indicated on each side of µ.]

The Normal Density

Normal curves have the following bell-shaped, symmetric density.

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \qquad -\infty < x < \infty$$

Parameters: The parameters of a normal curve are the mean µ and the standard deviation σ.

Notation: a random variable is normal with mean µ and standard deviation σ,

Y ∼ N (µ, σ)

Standard Normal. We often consider µ = 0 and σ = 1.

Z ∼ N (0, 1)
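As a quick check of the density formula, the sketch below codes it directly and compares it with R's built-in dnorm. The function name normal_density and the test grid are illustrative choices, not part of the course material.

```r
# Normal density written out from the formula:
# f(x) = 1 / (sqrt(2*pi) * sigma) * exp(-1/2 * ((x - mu) / sigma)^2)
normal_density <- function(x, mu = 0, sigma = 1) {
  1 / (sqrt(2 * pi) * sigma) * exp(-0.5 * ((x - mu) / sigma)^2)
}

x <- seq(-4, 4, by = 0.5)

# Largest disagreement with R's dnorm for the standard normal N(0, 1) ...
max_diff_standard <- max(abs(normal_density(x) - dnorm(x)))

# ... and for a general normal, here N(10, 5) as an arbitrary example.
max_diff_general <- max(abs(normal_density(x, mu = 10, sigma = 5) -
                            dnorm(x, mean = 10, sd = 5)))
```

Both differences should be at the level of floating-point rounding error.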

Recall: Empirical Rule

The 68–95–99.7 Rule.

For every normal curve:
1 ≈ 68% of the area is within one SD of the mean
2 ≈ 95% of the area is within two SDs of the mean
3 ≈ 99.7% of the area is within three SDs of the mean
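The phrase "for every normal curve" can be illustrated with a short sketch: the area within k SDs of the mean does not depend on µ or σ. The helper area_within is a hypothetical name, and the second set of parameters is an arbitrary choice.

```r
# Area under N(mu, sigma) within k standard deviations of the mean.
area_within <- function(k, mu, sigma) {
  pnorm(mu + k * sigma, mean = mu, sd = sigma) -
    pnorm(mu - k * sigma, mean = mu, sd = sigma)
}

# Standard normal, k = 1, 2, 3.
standard <- sapply(1:3, area_within, mu = 0, sigma = 1)

# A different normal curve (arbitrary mean and SD) gives the same areas.
other <- sapply(1:3, area_within, mu = 112, sigma = 10)

round(standard, 4)  # approximately 0.6827 0.9545 0.9973
```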

Standard Normal Density

[Figure: standard normal density curve; x-axis "Possible Values" from −3 to 3, y-axis "Density".]

Area within 1 = 0.68

Area within 2 = 0.95

Area within 3 = 0.997

3 Areas Under a Normal Curve (4.3)

Recall: the Normal Density

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \qquad -\infty < x < \infty$$

(can you integrate that?)

Compute areas

There is no formula to calculate general areas under the normal curve.

(The integral of the density has no closed form solution.)

We will learn to use normal tables for this.

We will also learn how to use R.

However, you will have to use normal tables on the exams.
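Because the integral has no closed form, areas are computed numerically. As a rough sketch, the area to the left of 1 can be obtained by numerical integration of the density and compared with pnorm, which is what a table entry approximates.

```r
# Area to the left of z = 1 under the standard normal curve,
# computed by numerical integration of the density.
area_numeric <- integrate(dnorm, lower = -Inf, upper = 1)$value

# The same area from R's built-in cumulative distribution function.
area_pnorm <- pnorm(1)

# The two agree up to the numerical tolerance of integrate().
difference <- abs(area_numeric - area_pnorm)
```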

The Standard Normal Table

We have tables for N (0, 1)

The standard normal table lists the area to the left of z under the standard normal curve for each value from −3.49 to 3.49 by 0.01 increments.

The normal table is located:

1 inside front cover of your textbook.
2 Table 3 (p. 675-676)

Numbers in the margins represent z.

Numbers in the middle of the table are areas to the left of z.
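To see how the table is laid out, a small excerpt can be rebuilt in R. This is only a sketch of the layout described above (z in the margins, areas to the left in the middle); the range of rows is an arbitrary choice and the textbook's table may be formatted differently.

```r
# Margins: rows give z to one decimal, columns give the second decimal.
z_rows <- seq(0, 0.5, by = 0.1)
z_cols <- seq(0, 0.09, by = 0.01)

# Middle of the table: area to the left of z = row + column.
table_excerpt <- round(
  outer(z_rows, z_cols, function(tenths, hundredths) pnorm(tenths + hundredths)),
  4
)
dimnames(table_excerpt) <- list(format(z_rows, nsmall = 1),
                                format(z_cols, nsmall = 2))

# Entry for z = 0.30 (4th row, 1st column): area to the left of 0.30.
area_left_of_0.30 <- table_excerpt[4, 1]
```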


Warning!

Learn the normal table today.

Make it part of your being.


Standard Normal Table: Computing Probabilities

The standard normal Z ∼ N (0, 1)

[Figure: standard normal density curve.]

Whole area = 1 (total probability rule)
Pr {Z ≤ 0} = .5 by symmetry
Pr {Z = 0} = 0
Pr {Z < 1} = .8413 (table, front cover)
Pr {0 ≤ Z ≤ 1} = 0.8413 − 0.5 = .3413
Pr {−1 ≤ Z ≤ 1} = 2 ∗ 0.3413 = .6826
Pr {Z > 1.5} = 1 − Pr {Z ≤ 1.5} = 1 − 0.9332 = .0668

Draw a picture and make use of symmetry.

Pr {−0.5 ≤ Z ≤ 0.3} = Pr {Z ≤ 0.3} − Pr {Z ≤ −0.5} = .6179 − .3085 = .3094
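These table lookups can be cross-checked with pnorm, which returns the area to the left of z. The variable names below are illustrative; small differences in the last digit come from the table's rounding.

```r
p_left_0    <- pnorm(0)               # Pr {Z <= 0}, one half by symmetry
p_0_to_1    <- pnorm(1) - pnorm(0)    # Pr {0 <= Z <= 1}
p_within_1  <- pnorm(1) - pnorm(-1)   # Pr {-1 <= Z <= 1}
p_above_1.5 <- 1 - pnorm(1.5)         # Pr {Z > 1.5}; by symmetry this equals pnorm(-1.5)
p_between   <- pnorm(0.3) - pnorm(-0.5)  # Pr {-0.5 <= Z <= 0.3}

round(c(p_left_0, p_0_to_1, p_within_1, p_above_1.5, p_between), 4)
```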

Standard Normal Table: Finding Quantiles

Basic Idea

This is the inverse problem.

Before. Question: here is the interval. Answer: the probability of the interval.

Now. Question: here is the probability. Answer: the interval that matches the probability.

Problem: many intervals have the same area.

We ask for a certain type of interval.

The Reverse Question: Finding quantiles

Pr {Z ≤ ?} = .975    ? = 1.96
The value ? is a quantile.
Use Table 3 (or front cover) backward.

Pr {Z ≤ ?} = 0.20    ? = −0.84
Pr {Z ≥ ?} = 0.80    ? = −0.84

[Figures: two standard normal curves on an axis from −3 to 3, each with the unknown cutoff ? marked.]
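Reading the table backward corresponds to qnorm in R, which takes an area to the left and returns the cutoff. A minimal sketch of the three lookups above, with illustrative variable names:

```r
# Pr {Z <= ?} = .975
q_975 <- qnorm(0.975)

# Pr {Z <= ?} = 0.20
q_20 <- qnorm(0.20)

# Pr {Z >= ?} = 0.80: an area of 0.80 to the right is an area of 0.20 to the left,
# so this is the same cutoff. lower.tail = FALSE asks for the area to the right.
q_upper_80 <- qnorm(0.80, lower.tail = FALSE)

round(c(q_975, q_20, q_upper_80), 2)
```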


zα Notation

Instead of ? we introduce a fancy notation.

Let zα be the number such that

Pr {Z < zα} = 1− α

Draw Figure 4.19 (α to the right; 1− α to the left)

Example. z.025 = 1.96

Draw a picture!

(Note: your book uses a capital, Zα)
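In R, the definition Pr {Z < zα} = 1 − α translates into a one-line helper. The name z_alpha is hypothetical and is used here only to mirror the notation.

```r
# z_alpha: the number with area alpha to its right and 1 - alpha to its left.
z_alpha <- function(alpha) qnorm(1 - alpha)

z_025 <- z_alpha(0.025)   # about 1.96, as in the example

# Check the defining property: the area to the left of z_alpha is 1 - alpha.
area_left <- pnorm(z_alpha(0.05))
```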


General Normal Distribution


General Normal?

We are good if Y ∼ N (0, 1)

What if Y ∼ N (10, 5) or N (11, 3), or ...

(Directions from “Home”)

Standardization

All normal curves have the same shape, and are simply rescaled versions of the standard normal density.

Consequently, every area under a general normal curve corresponds to an area under the standard normal curve.

The key standardization formula is

$$z = \frac{x - \mu}{\sigma}$$

Solving for x yields

$$x = \mu + z\sigma$$

which says algebraically that x is z standard deviations above the mean.
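The two formulas can be written as a pair of small functions. This is a sketch with hypothetical names (standardize, unstandardize), using N(10, 5) and x = 17.5 as arbitrary illustration values.

```r
# z = (x - mu) / sigma
standardize <- function(x, mu, sigma) (x - mu) / sigma

# x = mu + z * sigma
unstandardize <- function(z, mu, sigma) mu + z * sigma

z <- standardize(17.5, mu = 10, sigma = 5)      # 1.5 SDs above the mean
x_back <- unstandardize(z, mu = 10, sigma = 5)  # recovers 17.5

# The area under the general curve equals the area under the standard curve.
area_general  <- pnorm(17.5, mean = 10, sd = 5)
area_standard <- pnorm(z)
```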

Probability Example

If X ∼ N(100, 2), find Pr {X > 97.5}.

Solution:

$$\begin{aligned}
\Pr\{X > 97.5\} &= \Pr\left\{\frac{X - 100}{2} > \frac{97.5 - 100}{2}\right\} \\
&= \Pr\{Z > -1.25\} \\
&= 1 - \Pr\{Z < -1.25\} \\
&= \ ?
\end{aligned}$$
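After working the example with the table, the answer can be checked in R. The sketch below follows the same steps (standardize, then take one minus the area to the left) and compares with a direct call; the variable names are illustrative.

```r
# Standardize the cutoff: z = (97.5 - 100) / 2
z <- (97.5 - 100) / 2

# Pr {Z > z} = 1 - Pr {Z < z}
p_via_z <- 1 - pnorm(z)

# The same probability without standardizing by hand.
p_direct <- pnorm(97.5, mean = 100, sd = 2, lower.tail = FALSE)
```

Printing p_via_z gives the value to compare with the table-based answer.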

Quantile Example

If X ∼ N(100, 2), find the cutoff values for the middle 70% of the distribution.

Solution: The cutoff points will be the 0.15 and 0.85 quantiles.

From the table, the 0.85 quantile satisfies 1.03 < z < 1.04, and z = 1.04 is closest.

Thus, the cutoff points are the mean plus or minus 1.04 standard deviations.

100 − 1.04(2) = 97.92, 100 + 1.04(2) = 102.08
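For comparison, qnorm gives the two cutoffs directly. Because the table rounds z to two decimals, the table-based cutoffs differ slightly from the computed ones; variable names below are illustrative.

```r
# 0.15 and 0.85 quantiles of N(100, 2), computed directly.
cutoffs <- qnorm(c(0.15, 0.85), mean = 100, sd = 2)

# The standard normal 0.85 quantile, which the table brackets between 1.03 and 1.04.
z_85 <- qnorm(0.85)

# Table-based cutoffs using the rounded value z = 1.04.
table_cutoffs <- 100 + c(-1, 1) * 1.04 * 2
```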

General form N (µ, σ)

All normal distributions have the same shape.

Transformation

$$Z = \frac{Y - \mu}{\sigma} \sim N(0, 1)$$

Systolic blood pressure in healthy adults has a normal distribution with mean 112 mmHg and standard deviation 10 mmHg, i.e. Y ∼ N (112, 10).

One day, I have 92 mmHg.

$$\Pr\{Y \le 92\} = \Pr\left\{\frac{Y - 112}{10} \le \frac{92 - 112}{10}\right\} = \Pr\{Z \le -2\} = 0.0228$$

General form N (µ, σ)

µ = 112 mmHg, σ = 10 mmHg.

$$\Pr\{102 \le Y \le 122\} = \Pr\left\{\frac{102 - 112}{10} \le \frac{Y - 112}{10} \le \frac{122 - 112}{10}\right\} = \Pr\{-1 \le Z \le 1\} = .6826$$

68.3% of healthy adults have systolic blood pressure between 102 and 122 mmHg.

A patient’s systolic blood pressure is 137 mmHg.

$$\Pr\{Y \ge 137\} = \Pr\left\{\frac{Y - 112}{10} \ge \frac{137 - 112}{10}\right\} = \Pr\{Z \ge 2.5\} = 1 - .9938 = 0.0062$$

This patient’s blood pressure is very high...
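The blood pressure calculations all have the same form, so they can be wrapped in one helper. This is a sketch; prob_between is a hypothetical name, and infinite endpoints are used to express one-sided probabilities.

```r
mu <- 112     # mean systolic blood pressure (mmHg)
sigma <- 10   # standard deviation (mmHg)

# Pr {a <= Y <= b} for Y ~ N(mu, sigma), as a difference of areas to the left.
prob_between <- function(a, b, mu, sigma) {
  pnorm(b, mean = mu, sd = sigma) - pnorm(a, mean = mu, sd = sigma)
}

p_low  <- prob_between(-Inf, 92, mu, sigma)   # Pr {Y <= 92}
p_mid  <- prob_between(102, 122, mu, sigma)   # Pr {102 <= Y <= 122}
p_high <- prob_between(137, Inf, mu, sigma)   # Pr {Y >= 137}
```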

General form N (µ, σ)

µ = 112 mmHg, σ = 10 mmHg.

What is “High blood pressure”? For instance, it could be the value BP∗ such that

Pr {Y ≤ BP∗} = .95

We need z∗ such that Pr {Z ≤ z∗} = .95. Table in back cover (last line): z∗ = 1.645. Thus BP∗ lies 1.645 standard deviations above the mean:

BP∗ = 112 + 1.645 ∗ 10 = 112 + 16.45 = 128.45 mmHg

Formal approach:

$$.95 = \Pr\{Y \le \mathrm{BP}^*\} = \Pr\left\{\frac{Y - 112}{10} \le \frac{\mathrm{BP}^* - 112}{10}\right\} = \Pr\{Z \le z^*\}$$

with $z^* = \frac{\mathrm{BP}^* - 112}{10}$, i.e. $\mathrm{BP}^* = 112 + 10\, z^*$


Using the Computer


Other Options

Remember you need the tables for the exam.

Statistical software (e.g. R)

Online Calculators. See course website, handouts.

Calculators for Various Statistical Problems

Doing calculations with R

> pnorm(1)                          # Pr {Z ≤ 1}
[1] 0.8413447
> pnorm(2)-pnorm(-2)                # area within 2 SDs
[1] 0.9544997
> pnorm(3)-pnorm(-3)                # area within 3 SDs
[1] 0.9973002
> 1- pnorm(137, mean=112, sd=10)    # Pr {Y ≥ 137}
[1] 0.006209665
> qnorm(.95, mean=112, sd=10)       # 0.95 quantile
[1] 128.4485
> pbinom(1, size=6, prob=1/6)       # Pr {Y ≤ 1}, binomial
[1] 0.7367755
> round(dbinom(0:6, size=6, prob=1/6), 3)   # Pr {Y = 0}, ..., Pr {Y = 6}
[1] 0.335 0.402 0.201 0.054 0.008 0.001 0.000

norm    normal distribution
binom   binomial
p       probability: Pr {Y ≤ . . .}
q       quantile
d       density, or probability mass function: Pr {Y = . . .}


A Case Study

Exam Hint

This is how exam questions will often look.

I give you some question in the context of a scientific area; you figure out the probability calculations to use.

(word problems)

Case Study

Example

Body temperature varies within individuals over time (it can be higher when one is ill with a fever, or during or after physical exertion). However, if we measure the body temperature of a single healthy person when at rest, these measurements vary little from day to day, and we can associate with each person an individual resting body temperature. There is, however, variation among individuals of resting body temperature.

The Question

Example

In the population, suppose that:

the mean resting body temperature is 98.25 degrees Fahrenheit;

the standard deviation is 0.73 degrees Fahrenheit;

resting body temperatures are normally distributed.

Let X be the resting body temperature of a randomly chosen individual. Find:

1 Pr {X < 98}, the proportion of individuals with temperature less than 98.

2 Pr {98 < X < 100}, the proportion of individuals with temperature between 98 and 100.

3 The 0.90 quantile of the distribution.

4 The cutoff values for the middle 50% of the distribution.
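One way to check table-based answers to the four parts is with pnorm and qnorm. This is a sketch under the stated assumptions (normal distribution, mean 98.25, SD 0.73); on the exam the table is still required, and the variable names are illustrative.

```r
mu <- 98.25    # mean resting body temperature (degrees Fahrenheit)
sigma <- 0.73  # standard deviation (degrees Fahrenheit)

# 1. Pr {X < 98}
p_below_98 <- pnorm(98, mean = mu, sd = sigma)

# 2. Pr {98 < X < 100}
p_98_to_100 <- pnorm(100, mean = mu, sd = sigma) - pnorm(98, mean = mu, sd = sigma)

# 3. The 0.90 quantile
q_90 <- qnorm(0.90, mean = mu, sd = sigma)

# 4. Cutoffs for the middle 50%: the 0.25 and 0.75 quantiles
middle_50 <- qnorm(c(0.25, 0.75), mean = mu, sd = sigma)
```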

4 Assessing Normality (4.4)

Normal Probability Plots

A standard question begins, “Assuming that variable Y has a normal distribution, . . . ”. But how do we know the distribution is approximately normal?

Sometimes we can rely on the central limit theorem.

If given data, there are tests for normality, but there are reasons not to do these.

It is generally better to make a plot that sheds light on the question, “Is the data so far from normality as to bias a method that assumes normality?”

There is no easy answer to the question, but a normal probability plot is much more informative than the result of a test.

What is a Normal Probability Plot?

A normal probability plot is a plot of the sorted sample data versus something close to the expected z-score for the corresponding rank of a random normal sample of the same size.

For example, the expected z-score of the minimum from a random sample of size 10 from a normal population is about −1.54.

If the plotted points are close to a straight line, there is evidence that the distribution is close to normal.

If the plotted points are far from a straight line, there is evidence of non-normality.
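The "something close to the expected z-score" can be sketched in R with ppoints and qnorm, which is also what qqnorm uses for its x-axis. The ten data values below are made up for illustration.

```r
# Hypothetical sample of size 10 (made-up values, for illustration only).
y <- c(4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.6, 4.4, 5.0, 4.9)
n <- length(y)

# Approximate expected z-scores for ranks 1, ..., n of a normal sample of size n.
scores <- qnorm(ppoints(n))

# The normal probability plot pairs each sorted observation with its score.
plot_data <- data.frame(normal_score = scores, observation = sort(y))

# Score for the minimum of a sample of size 10: close to the -1.54 quoted above.
score_of_minimum <- scores[1]
```

Calling plot(plot_data) draws essentially the same picture as qqnorm(y).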

The Basic Idea

A normal probability plot plots the ordered observations (y(1), . . . , y(n)) on the y-axis against some standard “normal scores” x1, . . . , xn on the x-axis.

If the data were truly from a normal distribution, the points would lie on a “straight” line.

Why a straight line? Y = µ + σZ
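The relation Y = µ + σZ can be seen in an idealized sketch: if the ordered observations were exactly µ + σ times the normal scores, a line fitted through the plot would have intercept µ and slope σ. The values of µ, σ and n below are arbitrary choices for illustration.

```r
mu <- 112
sigma <- 10
n <- 50

# Normal scores for a sample of size n.
z <- qnorm(ppoints(n))

# Idealized "data": exactly mu + sigma * z, with no sampling noise.
y <- mu + sigma * z

# A straight line through (z, y) recovers mu as intercept and sigma as slope.
fit <- lm(y ~ z)
line_coefficients <- unname(coef(fit))
```

With real data the points scatter around such a line rather than falling exactly on it.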

Normal Example: All three plots are normal (n = 50)

[Figure: three samples, each shown as a Normal Q−Q Plot (x-axis "Theoretical Quantiles", y-axis "Sample Quantiles") alongside a "Histogram of x" (x-axis "x", y-axis "Frequency").]

Normal Example: All three plots are normal (n = 500)

[Figure: three samples, each shown as a Normal Q−Q Plot (x-axis "Theoretical Quantiles", y-axis "Sample Quantiles") alongside a "Histogram of x" (x-axis "x", y-axis "Frequency").]

Non-normal Example: One plot is not normal (n = 100)

[Figure: three samples, each shown as a Normal Q−Q Plot (x-axis "Theoretical Quantiles", y-axis "Sample Quantiles") alongside a "Histogram of x" (x-axis "x", y-axis "Frequency").]
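Plots like these can be reproduced with qqnorm and hist. The sketch below simulates one normal sample and one non-normal sample; the exponential distribution for the non-normal case and the seed are assumptions made for illustration, since the slides do not say how their samples were generated.

```r
set.seed(1)   # arbitrary seed so the simulated samples are reproducible
n <- 100

x_normal <- rnorm(n)   # sample from N(0, 1)
x_skewed <- rexp(n)    # assumed example of a non-normal (right-skewed) sample

# Two rows of panels: Q-Q plot and histogram for each sample.
op <- par(mfrow = c(2, 2))
qqnorm(x_normal, main = "Normal Q-Q Plot (normal sample)")
qqline(x_normal)       # reference line through the quartiles
hist(x_normal, main = "Histogram of x_normal")
qqnorm(x_skewed, main = "Normal Q-Q Plot (skewed sample)")
qqline(x_skewed)
hist(x_skewed, main = "Histogram of x_skewed")
par(op)                # restore the previous plotting layout
```

The skewed sample should bend away from the reference line, while the normal sample should follow it apart from some wobble in the tails.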

For You

Look at Figures 4.26 - 4.32 (p.136 - 139)

See how different shaped histograms lead to different normal probability plots


Warning!

Learn the normal table today.

Make it part of your being.

