Agresti/Franklin Statistics, 1 of 63 Chapter 2 Exploring Data with Graphs and Numerical Summaries...

Date post: 26-Dec-2015
Upload: melinda-singleton

Agresti/Franklin Statistics, 1 of 63

Chapter 2: Exploring Data with Graphs and Numerical Summaries

Learn ….

The Different Types of Data

The Use of Graphs to Describe Data

The Numerical Methods of Summarizing Data

Section 2.1

What are the Types of Data?

In Every Statistical Study:

Questions are posed

Characteristics are observed

Characteristics are Variables

A Variable is any characteristic that is recorded for subjects in the study

Variation in Data

The terminology variable highlights the fact that data values vary.

Example: Students in a Statistics Class

Variables:

• Age

• GPA

• Major

• Smoking Status

• …

Data values are called observations

Each observation can be:

• Quantitative

• Categorical

Categorical Variable

Each observation belongs to one of a set of categories

Examples:

• Gender (Male or Female)

• Religious Affiliation (Catholic, Jewish, …)

• Place of residence (Apt, Condo, …)

• Belief in Life After Death (Yes or No)

Quantitative Variable

Observations take numerical values

Examples:

• Age

• Number of siblings

• Annual Income

• Number of years of education completed

Graphs and Numerical Summaries

Describe the main features of a variable

For Quantitative variables: key features are center and spread

For Categorical variables: key feature is the percentage in each of the categories

Quantitative Variables

Discrete Quantitative Variables

and

Continuous Quantitative Variables

Discrete

A quantitative variable is discrete if its possible values form a set of separate numbers such as 0, 1, 2, 3, …

Examples of discrete variables

• Number of pets in a household

• Number of children in a family

• Number of foreign languages spoken

Continuous

A quantitative variable is continuous if its possible values form an interval

Examples of Continuous Variables

• Height

• Weight

• Age

• Amount of time it takes to complete an assignment

Frequency Table

A method of organizing data

Lists all possible values for a variable along with the number of observations for each value
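
A frequency table of this kind can be sketched in a few lines of Python. The data below are hypothetical (they are not from the slides), and the function name `frequency_table` is only illustrative. Each row holds the count, the proportion (count divided by the total number of observations) and the percentage (proportion times 100).

```python
from collections import Counter

def frequency_table(observations):
    """Return rows of (category, count, proportion, percentage), most frequent first."""
    counts = Counter(observations)
    total = len(observations)
    return [
        (category, count, count / total, 100 * count / total)
        for category, count in counts.most_common()
    ]

# Hypothetical data: smoking status recorded for 10 students (not from the slides).
smoking_status = ["No", "No", "Yes", "No", "No", "Yes", "No", "No", "No", "Yes"]

for category, count, proportion, percent in frequency_table(smoking_status):
    print(f"{category:<5} {count:>3} {proportion:.2f} {percent:.0f}%")
```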

Example: Shark Attacks

Example: Shark Attacks

What is the variable?

Is it categorical or quantitative?

How is the proportion for Florida calculated?

How is the % for Florida calculated?

Example: Shark Attacks

Insights – what the data tells us about shark attacks

Identify the following variable as categorical or quantitative:

Choice of diet (vegetarian or non-vegetarian):

a. Categorical

b. Quantitative

Identify the following variable as categorical or quantitative:

Number of people you have known who have been elected to political office:

a. Categorical

b. Quantitative

Identify the following variable as discrete or continuous:

The number of people in line at a box office to purchase theater tickets:

a. Continuous

b. Discrete

Identify the following variable as discrete or continuous:

The weight of a dog:

a. Continuous

b. Discrete

Section 2.2

How Can We Describe Data Using Graphical Summaries?

Graphs for Categorical Data

Pie Chart: A circle having a “slice of pie” for each category

Bar Graph: A graph that displays a vertical bar for each category

Example: Sources of Electricity Use in the U.S. and Canada

Pie Chart

Bar Chart

Pie Chart vs. Bar Chart

Which graph do you prefer? Why?

Graphs for Quantitative Data

Dot Plot: shows a dot for each observation

Stem-and-Leaf Plot: portrays the individual observations

Histogram: uses bars to portray the data

Example: Sodium and Sugar Amounts in Cereals

Dotplot for Sodium in Cereals

Sodium Data:

0 210 260 125 220 290 210 140 220 200 125 170 250 150 170 70 230 200 290 180

Stem-and-Leaf Plot for Sodium in Cereal

Sodium Data:

0 210 260 125 220 290 210 140 220 200 125 170 250 150 170 70 230 200 290 180
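
A stem-and-leaf plot for the sodium data can be sketched as follows. This is a minimal illustration, assuming the stem is the value with its last digit dropped and the leaf is that last digit; the slide's own plot is not reproduced in this transcript, so its exact layout may differ. The function names are hypothetical.

```python
sodium = [0, 210, 260, 125, 220, 290, 210, 140, 220, 200,
          125, 170, 250, 150, 170, 70, 230, 200, 290, 180]

def stem_and_leaf(values):
    """Map each stem (value // 10) to its sorted list of leaves (value % 10)."""
    stems = {}
    for value in sorted(values):
        stems.setdefault(value // 10, []).append(value % 10)
    return stems

def print_stem_and_leaf(stems):
    """Print every stem from smallest to largest, including empty ones."""
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        print(f"{stem:>2} | {leaves}")

print_stem_and_leaf(stem_and_leaf(sodium))
```

Empty stems are printed on purpose, so that gaps in the data stay visible in the shape of the plot.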

Frequency Table

Sodium Data: 0 210 260 125 220 290 210 140 220 200 125 170 250 150 170 70 230 200 290 180

Histogram for Sodium in Cereals
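
The counts behind such a histogram can be sketched by grouping the sodium values into equal-width intervals. The interval width of 40 and the starting point of 0 are assumptions made for this illustration; the slide's figure is not available here, and other widths are equally valid.

```python
sodium = [0, 210, 260, 125, 220, 290, 210, 140, 220, 200,
          125, 170, 250, 150, 170, 70, 230, 200, 290, 180]

def bin_counts(values, width, start=0):
    """Count observations in intervals [start, start + width), [start + width, ...).

    Assumes every value is >= start.
    """
    counts = {}
    lower = start
    while lower <= max(values):
        counts[(lower, lower + width)] = 0
        lower += width
    for value in values:
        k = (value - start) // width
        counts[(start + k * width, start + (k + 1) * width)] += 1
    return counts

# Text rendering: one row per interval, one star per observation.
for (lower, upper), count in bin_counts(sodium, width=40).items():
    print(f"{lower:>3} to {upper - 1:<3} {'*' * count}")
```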

Which Graph?

Dot plot and stem-and-leaf plot:

• More useful for small data sets

• Data values are retained

Histogram:

• More useful for large data sets

• Most compact display

• More flexibility in defining intervals

Shape of a Distribution

Overall pattern:

• Clusters?

• Outliers?

• Symmetric?

• Skewed?

• Unimodal?

• Bimodal?

Symmetric or Skewed?

Example: Hours of TV Watching

Identify the minimum and maximum sugar values:

a. 2 and 14

b. 1 and 3

c. 1 and 15

d. 0 and 16

Consider a data set containing IQ scores for the general public:

What shape would you expect a histogram of this data set to have?

a. Symmetric

b. Skewed to the left

c. Skewed to the right

d. Bimodal

Consider a data set of the scores of students on a very easy exam in which most score very well but a few score very poorly:

What shape would you expect a histogram of this data set to have?

a. Symmetric

b. Skewed to the left

c. Skewed to the right

d. Bimodal

Section 2.3

How Can We Describe the Center of Quantitative Data?

Mean

The sum of the observations divided by the number of observations

$$\bar{x} = \frac{\sum x}{n}$$

Median

The midpoint of the observations when they are ordered from the smallest to the largest (or from the largest to the smallest)

Find the mean and median

CO2 Pollution levels in 8 largest nations measured in metric tons per person:

2.3 1.1 19.7 9.8 1.8 1.2 0.7 0.2

a. Mean = 4.6 Median = 1.5

b. Mean = 4.6 Median = 5.8

c. Mean = 1.5 Median = 4.6
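
The two definitions can be checked against the CO2 values above with a short sketch. The helper names `mean` and `median` are written out here for illustration; Python's standard `statistics` module offers equivalents.

```python
def mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)

def median(values):
    """Middle value of the ordered observations (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

co2 = [2.3, 1.1, 19.7, 9.8, 1.8, 1.2, 0.7, 0.2]
print(round(mean(co2), 1), round(median(co2), 1))  # 4.6 1.5
```

With eight observations there is no single middle value, so the median is the average of the fourth and fifth ordered values, 1.2 and 1.8.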

Outlier

An observation that falls well above or below the overall set of data

The mean can be highly influenced by an outlier

The median is resistant: not affected by an outlier
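
A small sketch can illustrate the contrast, reusing the CO2 values from the earlier exercise and treating the largest value (19.7) as the outlier. Dropping it is done here only to show the effect on each measure, not as a recommended way to handle outliers.

```python
import statistics

co2 = [2.3, 1.1, 19.7, 9.8, 1.8, 1.2, 0.7, 0.2]

# Same data with the single largest observation removed.
without_largest = [x for x in co2 if x != max(co2)]

mean_shift = statistics.mean(co2) - statistics.mean(without_largest)
median_shift = statistics.median(co2) - statistics.median(without_largest)

print(f"mean:   {statistics.mean(co2):.2f} -> {statistics.mean(without_largest):.2f}")
print(f"median: {statistics.median(co2):.2f} -> {statistics.median(without_largest):.2f}")
```

The mean moves by more than 2 metric tons per person, while the median moves by about 0.3. In practice, "not affected" means the median changes little, not that it never changes at all.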

Mode

The value that occurs most frequently.

The mode is most often used with categorical data
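
As a brief sketch, the mode of a categorical variable can be found with `statistics.multimode` (available from Python 3.8), which returns every most-frequent value so that ties are not hidden. The list of majors is hypothetical.

```python
import statistics

# Hypothetical data: declared major for 8 students (not from the slides).
majors = ["Biology", "History", "Biology", "Math", "Biology", "History", "Math", "Biology"]

modes = statistics.multimode(majors)
print(modes)  # ['Biology']
```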

Section 2.4

How Can We Describe the Spread of Quantitative Data?

Measuring Spread: Range

Range: difference between the largest and smallest observations

Measuring Spread: Standard Deviation

Creates a measure of variation by summarizing the deviations of each observation from the mean and calculating an adjusted average of these deviations

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
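
The range and the standard deviation can be computed step by step, as in the sketch below. It uses the cereal sodium values from Section 2.2; the function name `sample_std_dev` is illustrative, and `statistics.stdev` in the standard library uses the same n - 1 divisor.

```python
import math

sodium = [0, 210, 260, 125, 220, 290, 210, 140, 220, 200,
          125, 170, 250, 150, 170, 70, 230, 200, 290, 180]

def sample_std_dev(values):
    """Square root of the sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    squared_deviations = sum((x - x_bar) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))

data_range = max(sodium) - min(sodium)  # largest minus smallest observation
s = sample_std_dev(sodium)
print(data_range, round(s, 1))
```

Dividing by n - 1 rather than n is the "adjusted average" mentioned above.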

Empirical Rule

For bell-shaped data sets:

Approximately 68% of the observations fall within 1 standard deviation of the mean

Approximately 95% of the observations fall within 2 standard deviations of the mean

Approximately 100% of the observations fall within 3 standard deviations of the mean
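
The rule can be checked empirically on simulated bell-shaped data. The sketch below assumes a normal distribution with mean 100 and standard deviation 15 (hypothetical values chosen only for illustration) and a fixed random seed so that the run is repeatable.

```python
import random
import statistics

def share_within(values, k):
    """Proportion of observations within k standard deviations of the mean."""
    centre = statistics.mean(values)
    spread = statistics.stdev(values)
    inside = sum(1 for x in values if abs(x - centre) <= k * spread)
    return inside / len(values)

rng = random.Random(0)  # fixed seed for repeatability
simulated = [rng.gauss(100, 15) for _ in range(10_000)]

for k in (1, 2, 3):
    print(k, round(share_within(simulated, k), 3))
```

The printed proportions should land close to 0.68, 0.95 and 1.00. For data that are skewed or otherwise far from bell-shaped, they can differ noticeably.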

Parameter and Statistic

A parameter is a numerical summary of the population

A statistic is a numerical summary of a sample taken from a population

Section 2.5

How Can Measures of Position Describe Spread?

Quartiles

Quartiles split the data into four parts

The median is the second quartile, Q2

The first quartile, Q1, is the median of the lower half of the observations

The third quartile, Q3, is the median of the upper half of the observations

Example: Find the first and third quartiles

Prices per share of 10 most actively traded stocks on NYSE (rounded to nearest $)

2 4 11 12 13 15 31 31 37 47

a. Q1 = 2 Q3 = 47

b. Q1 = 12 Q3 = 31

c. Q1 = 11 Q3 = 31

d. Q1 = 11.5 Q3 = 32
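
The median-of-halves definition from the Quartiles slide can be applied to these stock prices as sketched below. One assumption is made explicit: when the number of observations is odd, the overall median is left out of both halves. Library routines such as `statistics.quantiles` interpolate differently and may give slightly different quartiles.

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3) using medians of the lower and upper halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + (n % 2):]  # skip the middle value when n is odd
    return median(lower), median(ordered), median(upper)

prices = [2, 4, 11, 12, 13, 15, 31, 31, 37, 47]
q1, q2, q3 = quartiles(prices)
print(q1, q2, q3)  # 11 14.0 31
```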

Measuring Spread: Interquartile Range

The interquartile range is the distance between the third quartile and first quartile:

IQR = Q3 – Q1

Detecting Potential Outliers

An observation is a potential outlier if it falls more than 1.5 x IQR below the first quartile or more than 1.5 x IQR above the third quartile
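
The 1.5 x IQR criterion can be sketched as a pair of cut-off values ("fences"). The example uses the cereal sodium values from Section 2.2 and the median-of-halves quartiles defined earlier in this section; the function names are illustrative.

```python
sodium = [0, 210, 260, 125, 220, 290, 210, 140, 220, 200,
          125, 170, 250, 150, 170, 70, 230, 200, 290, 180]

def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def outlier_fences(values):
    """Return (lower fence, upper fence) = (Q1 - 1.5 * IQR, Q3 + 1.5 * IQR)."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = median(ordered[: n // 2])
    q3 = median(ordered[n // 2 + (n % 2):])
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

low, high = outlier_fences(sodium)
potential_outliers = [x for x in sodium if x < low or x > high]
print(low, high, potential_outliers)  # 25.0 345.0 [0]
```

Here Q1 = 145 and Q3 = 225, so IQR = 80 and the fences sit 120 below Q1 and 120 above Q3. Only the cereal with 0 mg of sodium falls outside them.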

The Five-Number Summary

The five-number summary of a dataset:

• Minimum value

• First Quartile

• Median

• Third Quartile

• Maximum value

Boxplot

A box is constructed from Q1 to Q3

A line is drawn inside the box at the median

A line extends outward from the lower end of the box to the smallest observation that is not a potential outlier

A line extends outward from the upper end of the box to the largest observation that is not a potential outlier
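
The numbers needed to draw such a boxplot can be gathered in one place, as in this sketch. It assumes the median-of-halves quartiles and the 1.5 x IQR criterion from the preceding slides, uses the cereal sodium values from Section 2.2, and returns a plain dictionary whose keys are chosen for illustration only.

```python
sodium = [0, 210, 260, 125, 220, 290, 210, 140, 220, 200,
          125, 170, 250, 150, 170, 70, 230, 200, 290, 180]

def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def boxplot_stats(values):
    """Five-number summary plus whisker ends and potential outliers."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = median(ordered[: n // 2])
    q3 = median(ordered[n // 2 + (n % 2):])
    reach = 1.5 * (q3 - q1)
    # Observations that are not potential outliers; the whiskers end at the extremes of these.
    inside = [x for x in ordered if q1 - reach <= x <= q3 + reach]
    return {
        "min": ordered[0], "q1": q1, "median": median(ordered),
        "q3": q3, "max": ordered[-1],
        "lower_whisker": inside[0], "upper_whisker": inside[-1],
        "potential_outliers": [x for x in ordered if x < q1 - reach or x > q3 + reach],
    }

stats = boxplot_stats(sodium)
for name, value in stats.items():
    print(f"{name}: {value}")
```

For the sodium data the lower whisker stops at 70 rather than at the minimum of 0, because 0 is flagged as a potential outlier and would be plotted as a separate point.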

Boxplot for Sodium Data

Sodium Data: 0 70 125 125 140 150 170 170 180 200 200 210 210 220 220 230 250 260 290 290

Five-Number Summary:

Min: 0

Q1: 145

Med: 200

Q3: 225

Max: 290

Boxplot for Sodium in Cereals

Sodium Data: 0 210 260 125 220 290 210 140 220 200 125 170 250 150 170 70 230 200 290 180

Z-Score

The z-score for an observation measures how far an observation is from the mean in standard deviation units

An observation in a bell-shaped distribution is a potential outlier if its z-score < -3 or > +3

$$z = \frac{\text{observation} - \text{mean}}{\text{standard deviation}}$$
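
A z-score calculation can be sketched with the standard library. The example reuses the cereal sodium values from Section 2.2 purely for illustration; the z-score criterion is stated for bell-shaped distributions, and whether the sodium data qualify is not settled by this transcript.

```python
import statistics

sodium = [0, 210, 260, 125, 220, 290, 210, 140, 220, 200,
          125, 170, 250, 150, 170, 70, 230, 200, 290, 180]

def z_score(observation, values):
    """Number of standard deviations that an observation lies from the mean."""
    return (observation - statistics.mean(values)) / statistics.stdev(values)

z_min = z_score(min(sodium), sodium)
flagged = [x for x in sodium if abs(z_score(x, sodium)) > 3]
print(round(z_min, 2), flagged)
```

The smallest value, 0, has a z-score of roughly -2.6, so the z-score criterion does not flag it even though it lies more than 1.5 x IQR below the first quartile. The two criteria need not agree.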

