© 2014 by Pearson Higher Education, IncUpper Saddle River, New Jersey 07458 • All Rights Reserved
HLTH 300 Biostatistics for Public Health Practice,
Raul Cruz-Cano, Ph.D., 2/17/2014, Spring 2014
Fox/Levin/Forde, Elementary Statistics in Social Research, 12e
Chapter 4: Measures of Variability
Announcement
Let’s switch Lecture Chapter 5 and Exam 1
CHAPTER OBJECTIVES
4.1 Calculate the range and inter-quartile range
4.2 Calculate the variance and standard deviation
4.3 Obtain the variance and standard deviation from a simple frequency distribution
4.4 Understand the meaning of the standard deviation
4.5 Calculate the coefficient of variation
4.6 Use box plots to visualize distributions
Introduction
Measures of Central Tendency
• Summarize what is average or typical of a distribution
Measures of Variability
• Summarize how scores are scattered around the center of the distribution
Learning Objectives
After this lecture, you should be able to complete the following Learning Outcomes:
4.1 Calculate the range and inter-quartile range
The Range
The difference between the highest and lowest scores in a distribution
• Provides a crude measure of variation
– Outliers affect interpretation
\[ R = H - L \]
R = range
H = highest score in a distribution
L = lowest score in a distribution
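The range formula can be sketched in a few lines of Python. This is only an illustration: the function name `score_range` is made up for this example, and the scores are the distribution used in the IQR examples of this lecture.

```python
def score_range(scores):
    """Return R = H - L, the highest score minus the lowest score."""
    return max(scores) - min(scores)


scores = [1, 5, 2, 9, 13, 11, 4]
print(score_range(scores))  # 13 - 1 = 12
```

Because only the two extreme scores enter the calculation, a single outlier changes R completely.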
The Inter-Quartile Range
The difference between the score at the first quartile and the score at the third quartile
• Manages the effects of extreme outliers
– Sensitive to the way in which scores are concentrated around the center of the distribution
\[ IQR = Q_3 - Q_1 \]
IQR = inter-quartile range
Q_1 = the score value at or below which 25% of the cases fall
Q_3 = the score value at or below which 75% of the cases fall
IQR: Example 1
What is the inter-quartile range of the following distribution:
1, 5, 2, 9, 13, 11, 4
Step 1: Sort distribution from lowest to highest
1, 2, 4, 5, 9, 11, 13
Step 2: Locate the position of the median
\[ \text{Position of Median} = \frac{N + 1}{2} = \frac{7 + 1}{2} = \frac{8}{2} = 4 \]
Step 3: Locate the median
1, 2, 4, 5, 9, 11, 13 (the 4th score: Median = 5)
Step 4: Separate the 2 halves
1, 2, 4 | 9, 11, 13
Step 5: Find the “median” of each half
\[ \text{Position of 1st Quartile} = \frac{3 + 1}{2} = \frac{4}{2} = 2 \]
1, 2, 4 (the 2nd score: 1st Quartile = 2)
\[ \text{Position of 3rd Quartile} = \frac{3 + 1}{2} = \frac{4}{2} = 2 \]
9, 11, 13 (the 2nd score: 3rd Quartile = 11)
Step 6: Calculate inter-quartile range
IQR = 3rd Quartile – 1st Quartile = 11 – 2 = 9
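The six steps above can be sketched in Python. This is a minimal sketch of the median-of-halves procedure shown in this lecture, not a general-purpose quartile routine. The function names are invented for the illustration, and the code assumes that for an odd N the median itself is left out of both halves, as in the example.

```python
from statistics import median


def quartiles_by_halves(scores):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    For an odd number of scores the middle score belongs to neither half.
    """
    ordered = sorted(scores)                 # Step 1: sort
    half = len(ordered) // 2                 # size of each half
    if half == 0:
        raise ValueError("need at least two scores")
    lower = ordered[:half]                   # Step 4: lower half
    upper = ordered[len(ordered) - half:]    # Step 4: upper half
    return median(lower), median(upper)      # Step 5: "median" of each half


def iqr(scores):
    """Step 6: IQR = 3rd quartile - 1st quartile."""
    q1, q3 = quartiles_by_halves(scores)
    return q3 - q1


print(quartiles_by_halves([1, 5, 2, 9, 13, 11, 4]))  # (2, 11)
print(iqr([1, 5, 2, 9, 13, 11, 4]))                  # 9
```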
IQR: Example 2
What is the inter-quartile range of the following distribution:
4, 3, 1, 1, 6, 2, 2, 4
Step 1: Sort distribution from lowest to highest
1, 1, 2, 2, 3, 4, 4, 6
Step 2: Locate the position of the median
\[ \text{Position of Median} = \frac{N + 1}{2} = \frac{8 + 1}{2} = \frac{9}{2} = 4.5 \]
Step 3: Locate the median
1, 1, 2, 2, 3, 4, 4, 6 (position 4.5 falls between the 4th and 5th scores; take the halfway point between the two cases: Median = 2.5)
Step 4: Separate the 2 halves
1, 1, 2, 2 | 3, 4, 4, 6
Step 5: Find the “median” of each half
\[ \text{Position of 1st Quartile} = \frac{4 + 1}{2} = \frac{5}{2} = 2.5 \]
1, 1, 2, 2
\[ \text{1st Quartile} = \frac{1 + 2}{2} = \frac{3}{2} = 1.5 \]
\[ \text{Position of 3rd Quartile} = \frac{4 + 1}{2} = \frac{5}{2} = 2.5 \]
3, 4, 4, 6
\[ \text{3rd Quartile} = \frac{4 + 4}{2} = \frac{8}{2} = 4 \]
Step 6: Calculate inter-quartile range
IQR = 3rd Quartile – 1st Quartile = 4 – 1.5 = 2.5
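Statistical software does not always use the median-of-halves steps shown here, so its quartiles can differ slightly from a hand calculation. As a rough illustration (not part of the original slides), the sketch below compares the hand method with Python's standard-library `statistics.quantiles`, whose default "exclusive" method interpolates at position p(N + 1).

```python
from statistics import median, quantiles

scores = [4, 3, 1, 1, 6, 2, 2, 4]

# Hand method from the example: medians of the lower and upper halves.
ordered = sorted(scores)
half = len(ordered) // 2
q1_halves = median(ordered[:half])
q3_halves = median(ordered[len(ordered) - half:])

# Library method: three cut points (Q1, median, Q3), default method="exclusive".
q1_lib, med_lib, q3_lib = quantiles(scores, n=4)

print(q1_halves, q3_halves, q3_halves - q1_halves)  # 1.5 4.0 2.5
print(q1_lib, q3_lib, q3_lib - q1_lib)              # 1.25 4.0 2.75
```

Both answers are legitimate under their own convention. When comparing results, check which quartile rule was used.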
IQR from Frequency Table
X f cf
31 1 25
30 1 24
29 1 23
28 0 22
27 2 22
26 3 20
25 1 17
24 1 16
23 2 15
22 2 13
21 2 11
20 3 9
19 4 6
18 2 2
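When you are given a frequency table instead of the raw data (N = 25):
\[ \text{Median position} = \frac{N + 1}{2} = \frac{25 + 1}{2} = \frac{26}{2} = 13 \quad\Rightarrow\quad \text{Median} = 22 \]
\[ \text{1st Quartile position} = \frac{N + 1}{4} = \frac{26}{4} = 6.5 \quad\Rightarrow\quad \text{1st Quartile} = 19.5 \]
\[ \text{3rd Quartile position} = \frac{3(N + 1)}{4} = \frac{3(26)}{4} = \frac{78}{4} = 19.5 \quad\Rightarrow\quad \text{3rd Quartile} = 26 \]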
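The position lookups above can be sketched with cumulative frequencies. This is an illustrative sketch only: the helper name `score_at_position` is hypothetical, and it assumes positions are whole numbers or end in .5, as they do in this example. A position ending in .5 is taken as the halfway point between the two neighbouring cases.

```python
# (score, frequency) pairs from the table, sorted from lowest to highest score
FREQ_TABLE = [(18, 2), (19, 4), (20, 3), (21, 2), (22, 2), (23, 2), (24, 1),
              (25, 1), (26, 3), (27, 2), (28, 0), (29, 1), (30, 1), (31, 1)]


def score_at_position(freq_table, position):
    """Return the score at a 1-based position, using cumulative frequencies."""
    if (2 * position) != int(2 * position):
        raise ValueError("this sketch only handles whole or .5 positions")

    def score_at(k):
        cumulative = 0
        for score, f in freq_table:
            cumulative += f          # running cf column
            if k <= cumulative:
                return score
        raise ValueError("position is beyond N")

    low = int(position)                            # case just at/below position
    high = low if position == low else low + 1     # next case for .5 positions
    return (score_at(low) + score_at(high)) / 2


n = sum(f for _, f in FREQ_TABLE)                       # N = 25
med = score_at_position(FREQ_TABLE, (n + 1) / 2)        # position 13
q1 = score_at_position(FREQ_TABLE, (n + 1) / 4)         # position 6.5
q3 = score_at_position(FREQ_TABLE, 3 * (n + 1) / 4)     # position 19.5
print(med, q1, q3, q3 - q1)  # 22.0 19.5 26.0 6.5
```

With these quartiles the inter-quartile range for the table is 26 - 19.5 = 6.5.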
IQR from Frequency Table
X   = 18 18 19 19 19 19 20 20 20 21 21 22 22 23 23 24 25 26 26 26 27 27 29 30 31
Pos =  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Median: position 13 (X = 22)

Lower half, used for the 1st Quartile (position 6.5, halfway between 19 and 20):
X   = 18 18 19 19 19 19 20 20 20 21 21 22
Pos =  1  2  3  4  5  6  7  8  9 10 11 12

Upper half, used for the 3rd Quartile (position 6.5, halfway between 26 and 26):
X   = 23 23 24 25 26 26 26 27 27 29 30 31
Pos =  1  2  3  4  5  6  7  8  9 10 11 12
IQR Advantage: Outliers
What is the inter-quartile range of the following distribution:
1, 5, 2, 9, 1300, 11, 4
Step 1: Sort distribution from lowest to highest
1, 2, 4, 5, 9, 11, 1300
Step 2: Locate the position of the median
\[ \text{Position of Median} = \frac{N + 1}{2} = \frac{7 + 1}{2} = \frac{8}{2} = 4 \]
Step 3: Locate the median
1, 2, 4, 5, 9, 11, 1300 (the 4th score: Median = 5)
Step 4: Separate the 2 halves
1, 2, 4 | 9, 11, 1300
Step 5: Find the “median” of each half
\[ \text{Position of 1st Quartile} = \frac{3 + 1}{2} = \frac{4}{2} = 2 \]
1, 2, 4 (the 2nd score: 1st Quartile = 2)
\[ \text{Position of 3rd Quartile} = \frac{3 + 1}{2} = \frac{4}{2} = 2 \]
9, 11, 1300 (the 2nd score: 3rd Quartile = 11)
Step 6: Calculate inter-quartile range
IQR = 3rd Quartile – 1st Quartile = 11 – 2 = 9
The outlier (1300) leaves the IQR unchanged from Example 1.
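IQR Advantage: Outliers
What are the range and mean of the following distributions: 1, 5, 2, 9, 1300, 11, 4 vs. 1, 5, 2, 9, 13, 11, 4
With the outlier:
\[ \bar{X} = \frac{1 + 5 + 2 + 9 + 1300 + 11 + 4}{7} \approx 190.28 \]
Range = 1300 - 1 = 1299
Without the outlier:
\[ \bar{X} = \frac{1 + 5 + 2 + 9 + 13 + 11 + 4}{7} \approx 6.42 \]
Range = 13 - 1 = 12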
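The comparison can be reproduced with a short Python sketch; the helper name `range_and_mean` is invented for this illustration. Note that Python rounds the means to 190.29 and 6.43, while the slide values 190.28 and 6.42 are truncated to two decimals.

```python
from statistics import mean


def range_and_mean(scores):
    """Return (range, mean) for a list of scores."""
    return max(scores) - min(scores), mean(scores)


with_outlier = [1, 5, 2, 9, 1300, 11, 4]
without_outlier = [1, 5, 2, 9, 13, 11, 4]

for label, scores in (("with 1300", with_outlier), ("with 13", without_outlier)):
    r, m = range_and_mean(scores)
    print(label, "range =", r, "mean =", round(m, 2))
```

Both the range and the mean change drastically when 13 becomes 1300, while the IQR for both distributions is 9.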
Learning Objectives
After this lecture, you should be able to complete the following Learning Outcomes:
4.2 Calculate the variance and standard deviation
The Variance
We need a measure of variability that takes into account every score
• Deviation: the distance of any given raw score from the mean
• Squaring deviations eliminates the minus signs
• Summing the squared deviations and dividing by N gives us the average of the squared deviations
\[ s^2 = \frac{\sum (X - \bar{X})^2}{N} \]
s^2 = variance
\sum (X - \bar{X})^2 = sum of the squared deviations from the mean
N = total number of scores
The Standard Deviation
With the variance, the unit of measurement is squared
• It is difficult to interpret squared units
• We can remove the squared units by taking the square root of both sides of the equation
• This will give us the standard deviation
\[ s = \sqrt{\frac{\sum (X - \bar{X})^2}{N}} \]
“Original” formula for raw data
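The Raw-Score Formulas
There is an easier way to calculate the variance and standard deviation
• Using raw scores
\[ s^2 = \frac{\sum X^2}{N} - \bar{X}^2 \]
\[ s = \sqrt{\frac{\sum X^2}{N} - \bar{X}^2} \]
s^2 = variance
s = standard deviation
N = total number of scores
\bar{X}^2 = mean squared
Note: this raw-score form is the formula used for frequency tables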
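For reference, the raw-score formula follows from the "original" deviation formula by expanding the square. The short derivation below is added as a worked step, not taken from the slides, and uses only the fact that \sum X = N\bar{X}.

```latex
\begin{align*}
s^2 &= \frac{\sum (X - \bar{X})^2}{N}
     = \frac{\sum X^2 - 2\bar{X}\sum X + N\bar{X}^2}{N} \\
    &= \frac{\sum X^2}{N} - 2\bar{X}\,\frac{\sum X}{N} + \bar{X}^2
     = \frac{\sum X^2}{N} - 2\bar{X}^2 + \bar{X}^2 \\
    &= \frac{\sum X^2}{N} - \bar{X}^2
\end{align*}
```

Both formulas therefore give the same variance. The raw-score version simply avoids computing each deviation.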
Standard Deviation: Raw Data
What is the standard deviation of the following distribution: 1, 5, 2, 9, 13, 11, 4
\[ \bar{X} = \frac{1 + 5 + 2 + 9 + 13 + 11 + 4}{7} \approx 6.42 \]
X     Dev. (X - mean)       Sq. Dev. (X - mean)^2
1     1 - 6.42 = -5.42      (-5.42)^2 = 29.37
5     5 - 6.42 = -1.42      (-1.42)^2 = 2.01
2     2 - 6.42 = -4.42      (-4.42)^2 = 19.53
9     9 - 6.42 = 2.58       (2.58)^2 = 6.65
13    13 - 6.42 = 6.58      (6.58)^2 = 43.29
11    11 - 6.42 = 4.58      (4.58)^2 = 20.97
4     4 - 6.42 = -2.42      (-2.42)^2 = 5.85
\[ \sum (X - \bar{X})^2 = 29.37 + 2.01 + 19.53 + 6.65 + 43.29 + 20.97 + 5.85 = 127.67 \]
\[ s = \sqrt{\frac{\sum (X - \bar{X})^2}{N}} = \sqrt{\frac{127.67}{7}} = \sqrt{18.24} = 4.27 \]
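The same calculation can be checked in Python. This is a sketch of the deviation formula with N in the denominator; the function name `deviation_sd` is hypothetical. It uses the unrounded mean, so the sum of squared deviations comes out as 127.71 rather than the hand-rounded table total, and it is cross-checked against the standard library's `statistics.pstdev`, which also divides by N.

```python
from math import sqrt
from statistics import pstdev


def deviation_sd(scores):
    """Return (sum of squared deviations, variance, standard deviation)."""
    n = len(scores)
    mean = sum(scores) / n
    squared_devs = [(x - mean) ** 2 for x in scores]  # (X - mean)^2 per score
    ss = sum(squared_devs)
    variance = ss / n                                 # divide by N, not N - 1
    return ss, variance, sqrt(variance)


scores = [1, 5, 2, 9, 13, 11, 4]
ss, variance, sd = deviation_sd(scores)
print(round(ss, 2), round(variance, 2), round(sd, 2))  # 127.71 18.24 4.27
print(round(pstdev(scores), 2))                        # 4.27
```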
Learning Objectives
After this lecture, you should be able to complete the following Learning Outcomes:
4.3 Obtain the variance and standard deviation from a simple frequency distribution
Example
Obtaining the variance and standard deviation from a simple frequency distribution
X      f     fX     fX^2
31     1     31       961
30     1     30       900
29     1     29       841
28     0      0         0
27     2     54     1,458
26     3     78     2,028
25     1     25       625
24     1     24       576
23     2     46     1,058
22     2     44       968
21     2     42       882
20     3     60     1,200
19     4     76     1,444
18     2     36       648
Total  25    575    13,589
\[ \bar{X} = \frac{\sum fX}{N} = \frac{575}{25} = 23 \]
\[ \bar{X}^2 = (23)^2 = 529 \]
\[ s^2 = \frac{\sum fX^2}{N} - \bar{X}^2 = \frac{13{,}589}{25} - 529 = 543.56 - 529 = 14.56 \]
\[ s = \sqrt{\frac{\sum fX^2}{N} - \bar{X}^2} = \sqrt{14.56} = 3.82 \]
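The frequency-table calculation can be sketched in Python as follows. The function name `frequency_variance` is made up for this illustration; the (score, frequency) pairs are the ones in the table above.

```python
from math import sqrt

# (X, f) pairs from the example table
FREQ_TABLE = [(31, 1), (30, 1), (29, 1), (28, 0), (27, 2), (26, 3), (25, 1),
              (24, 1), (23, 2), (22, 2), (21, 2), (20, 3), (19, 4), (18, 2)]


def frequency_variance(freq_table):
    """Return N, sum of fX, sum of fX^2, mean, variance and standard deviation."""
    n = sum(f for _, f in freq_table)
    sum_fx = sum(f * x for x, f in freq_table)        # fX column total
    sum_fx2 = sum(f * x * x for x, f in freq_table)   # fX^2 column total
    mean = sum_fx / n
    variance = sum_fx2 / n - mean ** 2                # raw-score formula
    return n, sum_fx, sum_fx2, mean, variance, sqrt(variance)


n, sum_fx, sum_fx2, mean, variance, sd = frequency_variance(FREQ_TABLE)
print(n, sum_fx, sum_fx2)                                # 25 575 13589
print(mean, round(variance, 2), round(sd, 2))            # 23.0 14.56 3.82
```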
Additional Example
Find Variance and Standard Deviation using frequency table from last session
Learning Objectives
After this lecture, you should be able to complete the following Learning Outcomes:
4.4 Understand the meaning of the standard deviation
The Meaning of the Standard Deviation
The standard deviation converts the variance to units we can understand
But how do we interpret this new score?
• The standard deviation represents the average variability in a distribution
– It is roughly the average deviation from the mean
• The greater the variability, the larger the standard deviation
• Allows for a comparison between a given raw score in a set and a standardized measure
Learning Objectives
After this lecture, you should be able to complete the following Learning Outcomes:
4.5 Calculate the coefficient of variation
The Coefficient of Variation
Used to compare the variability for two or more characteristics that have been measured in different units
• The coefficient of variation is based on the size of the standard deviation
• Its value is independent of the unit of the measurement
\[ CV = 100 \left( \frac{s}{\bar{X}} \right) \]
CV = coefficient of variation
s = standard deviation
\bar{X} = mean
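A minimal Python sketch of the CV formula is shown below. The function name `coefficient_of_variation` is hypothetical, and the data are simply the seven scores used earlier in this lecture, chosen only for illustration. The standard deviation is computed with N in the denominator (`statistics.pstdev`), matching the formulas in this chapter.

```python
from statistics import mean, pstdev


def coefficient_of_variation(scores):
    """Return CV = 100 * (s / mean), with s computed using N in the denominator."""
    return 100 * pstdev(scores) / mean(scores)


scores = [1, 5, 2, 9, 13, 11, 4]
print(round(coefficient_of_variation(scores), 2))  # 66.44
```

Because s and the mean are in the same units, the units cancel. That is why the CV can be compared across characteristics measured on different scales.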
Example
1. Find CV
Learning Objectives
After this lecture, you should be able to complete the following Learning Outcomes:
4.6 Use box plots to visualize distributions
Figure 4.4
Figure 4.5
Box Plot: Examples
Draw the box plot of the following distribution: 1, 5, 2, 9, 13, 11, 4
Draw the box plot of the following distribution: 4, 3, 1, 1, 6, 2, 2, 4
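Before drawing a box plot by hand, it helps to collect the five values it displays. The sketch below assumes a basic box plot whose box runs from the 1st to the 3rd quartile, with a line at the median and whiskers reaching the lowest and highest scores; other whisker conventions exist. Quartiles use the median-of-halves steps from the IQR examples, and the function name `five_number_summary` is invented for this illustration.

```python
from statistics import median


def five_number_summary(scores):
    """Return (lowest, Q1, median, Q3, highest) using median-of-halves quartiles."""
    ordered = sorted(scores)
    half = len(ordered) // 2
    if half == 0:
        raise ValueError("need at least two scores")
    q1 = median(ordered[:half])                  # median of the lower half
    q3 = median(ordered[len(ordered) - half:])   # median of the upper half
    return ordered[0], q1, median(ordered), q3, ordered[-1]


print(five_number_summary([1, 5, 2, 9, 13, 11, 4]))     # (1, 2, 5, 11, 13)
print(five_number_summary([4, 3, 1, 1, 6, 2, 2, 4]))    # (1, 1.5, 2.5, 4.0, 6)
```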
Problem 12
Now in Excel
1. Find IQR of BMI: http://office.microsoft.com/en-us/excel-help/quartile-HP005209226.aspx
2. Find standard deviation of BMI: http://office.microsoft.com/en-us/excel-help/stdev-HP005209277.aspx
3. Find CV of BMI:
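Spreadsheet results may not match the hand methods in this chapter. Excel's QUARTILE interpolates between scores, and STDEV divides by N - 1 (STDEVP divides by N). The Python sketch below mimics both behaviours with the standard library. It is an illustration under the assumption that `statistics.quantiles(..., method="inclusive")` matches Excel's QUARTILE rule, and it uses the seven scores from the earlier examples rather than the BMI data.

```python
from statistics import pstdev, quantiles, stdev

scores = [1, 5, 2, 9, 13, 11, 4]

# Interpolated quartiles (assumed to correspond to Excel's QUARTILE).
q1, med, q3 = quantiles(scores, n=4, method="inclusive")
print(q1, med, q3, q3 - q1)        # 3.0 5.0 10.0 7.0

# Sample standard deviation (divides by N - 1, like Excel's STDEV).
print(round(stdev(scores), 2))     # 4.61

# Population standard deviation (divides by N, like the formulas in this chapter).
print(round(pstdev(scores), 2))    # 4.27
```

The hand method gave IQR = 9 and s = 4.27 for the same scores, so state which convention was used when reporting results.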
Homework #4
Problems (Chapter 4): Problems 20 (+boxplot) and 25
CHAPTER SUMMARY
4.1 Researchers can calculate the range and inter-quartile range for a crude measure of variation
4.2 The variance and standard deviation provide the researcher with a measure of variation that takes into account every score
4.3 The variance and standard deviation can also be calculated for data presented in a simple frequency distribution
4.4 The standard deviation can be understood as the average of deviations from the mean
4.5 The coefficient of variation is used to compare the variability for two or more characteristics that have been measured in different units
4.6 Social researchers often use box plots to visualize various aspects of a distribution