Chapter 2
Data Analysis
Section 2.2 Density Curves and Normal Distributions
Starnes/Tabor, The Practice of Statistics
By the end of this section, you should be able to:
LEARNING TARGETS
Density Curves and Normal Distributions
USE a density curve to MODEL distributions of quantitative data.
IDENTIFY the relative locations of the mean and median of a distribution from a density curve.
USE the 68–95–99.7 rule to estimate (i) the proportion of values in a specified interval, or (ii) the value that corresponds to a given percentile in a Normal distribution.
FIND the proportion of values in a specified interval in a Normal distribution using Table A or technology.
FIND the value that corresponds to a given percentile in a Normal distribution using Table A or technology.
DETERMINE whether a distribution of data is approximately Normal from graphical and numerical evidence.
Exploring Quantitative Data
In Chapter 1, we developed a kit of graphical and numerical tools for describing distributions. Now, we’ll add one more step to the strategy.
1. Always plot your data: make a graph, usually a dotplot, stemplot, or histogram.
2. Look for the overall pattern (shape, center, and variability) and for striking departures such as outliers.
3. Calculate numerical summaries to describe center and variability.
4. When there’s a regular overall pattern, use a simplified model called a density curve to describe it.
Density Curves
A density curve is a curve that
• Is always on or above the horizontal axis
• Has area exactly 1 underneath it
The area under the curve and above any interval of values on the horizontal axis estimates the proportion of all observations that fall in that interval.
This dotplot shows how long it took Selena to get to the bookstore by train on each of the last 1000 days she worked. Its overall pattern can be described by a horizontal line.
The overall pattern of this histogram of the scores of all 947 seventh-grade students in Gary, Indiana, on the vocabulary part of the Iowa Test of Basic Skills (ITBS) can be described by a smooth curve drawn through the tops of the bars.
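The two defining properties of a density curve can be checked on the simplest case, a flat (uniform) curve like the horizontal line described above. The sketch below is only an illustration: the endpoints LOW and HIGH and the helper uniform_area are hypothetical and are not taken from Selena's data.

```python
# Hypothetical uniform density curve: times spread evenly between
# LOW and HIGH minutes (illustrative values, not from the actual data).
LOW, HIGH = 20.0, 30.0
HEIGHT = 1.0 / (HIGH - LOW)  # height chosen so the total area is exactly 1


def uniform_area(a: float, b: float) -> float:
    """Area under the uniform density curve above the interval [a, b]."""
    left = max(a, LOW)
    right = min(b, HIGH)
    return max(0.0, right - left) * HEIGHT


total_area = uniform_area(LOW, HIGH)        # area under the whole curve
share_22_to_25 = uniform_area(22.0, 25.0)   # estimated proportion in [22, 25]

print(total_area)
print(share_22_to_25)
```

Under these assumed endpoints, the area above the interval from 22 to 25 minutes estimates the proportion of trips that take between 22 and 25 minutes.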
Describing Density Curves
The mean of a density curve is the point at which the curve would balance if made of solid material.
The median of a density curve is the equal-areas point, the point that divides the area under the curve in half.
The median and the mean are the same for a symmetric density curve. They both lie at the center of the curve. The mean of a skewed curve is pulled away from the median in the direction of the long tail.
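To see the mean pulled toward the long tail, here is a rough numerical sketch. It assumes a right-skewed density f(x) = e^(-x) for x ≥ 0, which is not from the slides and is chosen only because it is easy to work with. It approximates the balance point (mean) and the equal-areas point (median) by adding up thin rectangles.

```python
import math


def density(x: float) -> float:
    """Assumed right-skewed density curve: f(x) = e^(-x) for x >= 0."""
    return math.exp(-x)


STEP = 0.0005    # width of each thin rectangle
UPPER = 40.0     # the area beyond this point is negligible
n = int(UPPER / STEP)

xs = [(i + 0.5) * STEP for i in range(n)]      # midpoint of each rectangle
areas = [density(x) * STEP for x in xs]        # area of each rectangle
total = sum(areas)                             # should be very close to 1

# Mean: the balance point, a weighted average of x by area.
mean = sum(x * a for x, a in zip(xs, areas)) / total

# Median: the first point where the accumulated area reaches half the total.
running = 0.0
median = xs[-1]
for x, a in zip(xs, areas):
    running += a
    if running >= total / 2:
        median = x
        break

print(round(total, 4), round(mean, 3), round(median, 3))
```

For this assumed curve the long tail is on the right, so the mean lands to the right of the median.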
• A density curve is an idealized description of a distribution of data.
• We distinguish between the mean and standard deviation of the density curve and the mean and standard deviation computed from the actual observations.
• The usual notation for the mean of a density curve is µ (the Greek letter mu). We write the standard deviation of a density curve as σ (the Greek letter sigma).
Normal Distributions
The Normal curves, which describe Normal distributions, are one particularly important family of density curves.
• Shape: All Normal distributions have the same overall shape: symmetric, single-peaked, and bell-shaped.
• Center: The mean µ is located at the midpoint of the symmetric density curve and is the same as the median.
• Variability: The standard deviation σ measures the variability (width) of a Normal distribution.
A Normal distribution is described by a symmetric, single-peaked, bell-shaped density curve called a Normal curve. Any Normal distribution is completely specified by two numbers: its mean µ and standard deviation σ.
Why are Normal distributions important in statistics? Here are three reasons.
1. Normal distributions are good descriptions for some distributions of real data. Distributions that are often close to Normal include:
• Scores on tests taken by many people (such as SAT exams and IQ tests)
• Repeated careful measurements of the same quantity (like the diameter of a tennis ball)
• Characteristics of biological populations (such as lengths of crickets and yields of corn)
2. Normal distributions are good approximations to the results of many kinds of chance outcomes, like the proportion of heads in many tosses of a fair coin.
3. Many of the inference methods in Chapters 8–12 are based on Normal distributions.
The 68–95–99.7 Rule
Although there are many Normal curves, they all have properties in common.
The 68–95–99.7 Rule
In a Normal distribution with mean µ and standard deviation σ:
• Approximately 68% of the observations fall within σ of the mean µ.
• Approximately 95% of the observations fall within 2σ of the mean µ.
• Approximately 99.7% of the observations fall within 3σ of the mean µ.
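The three percentages in the rule can be checked with technology. The sketch below uses Python's standard-library statistics.NormalDist; the helper name within is hypothetical, and the second pair of mean and standard deviation values is arbitrary, included only to show that the result does not depend on µ or σ.

```python
from statistics import NormalDist


def within(k: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Proportion of a Normal(mu, sigma) distribution within k*sigma of mu."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


for k in (1, 2, 3):
    # Same proportion for the standard Normal curve and for an arbitrary one.
    print(k, round(within(k), 4), round(within(k, mu=100.0, sigma=15.0), 4))
```

The printed proportions round to about 68%, 95%, and 99.7%, which is why the rule is stated with "approximately."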
Finding Areas in a Normal Distribution
All Normal distributions are the same if we measure in units of size σ from the mean µ.
$$z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} = \frac{x - \mu}{\sigma}$$
The standard Normal distribution is the Normal distribution with mean 0 and standard deviation 1.
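A small sketch of why standardizing works: the area to the left of a value x in a Normal distribution matches the area to the left of its standardized value z in the standard Normal distribution. The numbers below (x = 70, µ = 64, σ = 3) are hypothetical, and standardize is an illustrative helper name.

```python
from statistics import NormalDist


def standardize(x: float, mu: float, sigma: float) -> float:
    """Return z = (x - mu) / sigma: how many standard deviations x lies from the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Hypothetical values chosen only for illustration.
x, mu, sigma = 70.0, 64.0, 3.0
z = standardize(x, mu, sigma)

area_original = NormalDist(mu, sigma).cdf(x)  # area to the left of x
area_standard = NormalDist(0, 1).cdf(z)       # area to the left of z

print(z, round(area_original, 4), round(area_standard, 4))
```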
How to Find Areas in any Normal Distribution
Step 1: Draw a Normal distribution with the horizontal axis labeled and scaled using the mean and standard deviation, the boundary value(s) clearly identified, and the area of interest shaded.
Step 2: Perform calculations and show your work! Do one of the following:
i. Standardize each boundary value and use Table A or technology to find the desired area under the standard Normal curve; or
ii. Use technology to find the desired area without standardizing.
Step 3: Be sure to answer the question that was asked!
Example: Consider the distribution of ITBS (Iowa Test of Basic Skills) vocabulary scores among all Gary, Indiana, seventh-graders. Recall that this distribution is approximately Normal with mean µ = 6.84 and standard deviation σ = 1.55. What proportion of these seventh-graders have vocabulary scores that are below sixth-grade level?
Step 1: Draw a Normal distribution with the horizontal axis labeled and scaled using the mean and standard deviation, the boundary value(s) clearly identified, and the area of interest shaded.
Step 2: Perform calculations and show your work! Do one of the following:
i. Standardize each boundary value and use Table A or technology to find the desired area under the standard Normal curve; or
ii. Use technology to find the desired area without standardizing.
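For the ITBS question, both routes in Step 2 can be sketched with Python's statistics.NormalDist standing in for "technology." One assumption to flag: "below sixth-grade level" is read here as a score below 6.0, so BOUNDARY = 6.0 is an interpretation, not a value stated on the slide.

```python
from statistics import NormalDist

MU, SIGMA = 6.84, 1.55   # mean and standard deviation given in the problem
BOUNDARY = 6.0           # assumption: "below sixth-grade level" means a score below 6.0

# Route (i): standardize the boundary, then use the standard Normal curve.
z = (BOUNDARY - MU) / SIGMA
area_via_z = NormalDist(0, 1).cdf(z)

# Route (ii): find the area directly, without standardizing.
area_direct = NormalDist(MU, SIGMA).cdf(BOUNDARY)

print(round(z, 2), round(area_via_z, 4), round(area_direct, 4))
```

Both routes give an area of about 0.294. A Table A lookup with z rounded to −0.54 gives a very slightly different value because of the rounding.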
Step 3: Be sure to answer the question that was asked!
We estimate that about 29.4% of Gary, Indiana, seventh-graders have scores below the sixth-grade level on the ITBS vocabulary test.
Working Backward: Finding Values from Areas
How to Find Values from Areas in any Normal Distribution
Step 1: Draw a Normal distribution with the horizontal axis labeled and scaled using the mean and standard deviation, the area of interest shaded, and the unknown boundary value clearly marked.
Step 2: Perform calculations and show your work! Do one of the following:
i. Use Table A or technology to find the value of z with the indicated area under the standard Normal curve, then “unstandardize” to transform back to the original distribution; or
ii. Use technology to find the desired value without standardizing.
Step 3: Be sure to answer the question that was asked!
According to www.cdc.gov/growthcharts/, the heights of 3-year-old females are approximately Normally distributed with a mean of 94.5 centimeters and a standard deviation of 4 centimeters. Seventy-five percent of 3-year-old girls are taller than what height?
Step 1: Draw a Normal distribution with the horizontal axis labeled and scaled using the mean and standard deviation, the area of interest shaded, and the unknown boundary value clearly marked.
Step 2: Perform calculations and show your work! Do one of the following:
i. Use Table A or technology to find the value of z with the indicated area under the standard Normal curve, then “unstandardize” to transform back to the original distribution; or
ii. Use technology to find the desired value without standardizing.
(i) Using Table A: z = –0.67
Using technology: invNorm(area: 0.25, mean: 0, SD: 1) = –0.67
$$-0.67 = \frac{x - 94.5}{4} \quad\Longrightarrow\quad x = -0.67(4) + 94.5 = 91.82$$
(ii) invNorm(area: 0.25, mean: 94.5, SD: 4) = 91.80
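The same calculation can be sketched in Python, with statistics.NormalDist.inv_cdf playing the role of the calculator's invNorm. Because 75% of girls are taller than the unknown height, the area to its left is 0.25. The variable names are illustrative.

```python
from statistics import NormalDist

MU, SIGMA = 94.5, 4.0        # mean and standard deviation in centimeters
area_left = 1 - 0.75         # 75% are taller, so 25% of the area lies to the left

# Route (i): find z on the standard Normal curve, then unstandardize.
z = NormalDist(0, 1).inv_cdf(area_left)
x_from_z = MU + z * SIGMA

# Route (ii): find the height directly, without standardizing.
x_direct = NormalDist(MU, SIGMA).inv_cdf(area_left)

print(round(z, 2), round(x_from_z, 2), round(x_direct, 2))
```

The small gap between 91.82 and 91.80 in the worked solution comes from rounding z to –0.67 before unstandardizing. Here z is kept at full precision, so both routes agree.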
Step 3: Be sure to answer the question that was asked!
About 75% of 3-year-old girls are taller than 91.80 cm.
After this section, you should be able to:
LEARNING TARGETS
Section Summary
USE a density curve to MODEL distributions of quantitative data.
IDENTIFY the relative locations of the mean and median of a distribution from a density curve.
USE the 68–95–99.7 rule to estimate (i) the proportion of values in a specified interval, or (ii) the value that corresponds to a given percentile in a Normal distribution.
FIND the proportion of values in a specified interval in a Normal distribution using Table A or technology.
FIND the value that corresponds to a given percentile in a Normal distribution using Table A or technology.
DETERMINE whether a distribution of data is approximately Normal from graphical and numerical evidence.
Assignment
Pages 138–144: #46, 48, 52, 54, 74, 80
If you are stuck on any of these, look at the odd-numbered exercise before or after it and check its answer in the back of your book. If you are still not sure, text a friend or me for help (before 8 pm).
Tomorrow we will check homework and review for 2.2 Quiz.