Warm-up
An investigator wants to study the effectiveness of two surgical procedures to correct near-sightedness: Procedure A uses cuts from a scalpel and procedure B uses a laser. The data to be collected are the degrees of improvement in vision after the procedure is performed.
Design an experiment for this.
Comparative Graphs
Section 1.1 Creating and Interpreting Comparative Graphs
After this section, you should be able to…
CONSTRUCT and INTERPRET Comparative bar graphs
CONSTRUCT and INTERPRET Segmented bar graphs
CONSTRUCT and INTERPRET Two Way Tables
CALCULATE & INTERPRET marginal and conditional distributions
ORGANIZE statistical problems
Learning Objectives
Two Way Table
Describes two categorical variables.
One variable is shown in the rows and the other is in the columns.
Example of Two Way Table
Young adults by gender & chance of getting rich
Gender
Opinion Female Male Total
Almost no chance 96 98 194
Some chance but probably not 426 286 712
A 50-50 chance 696 720 1416
A good chance 663 758 1421
Almost certain 486 597 1083
Total 2367 2459 4826
Reading a Two-Way Table
Look at the distribution of each variable separately.
The totals on the right give the distribution of opinions about becoming rich for all young adults in the table.
The totals at the bottom give the distribution of gender.
Marginal Distribution
The marginal distribution of one of the categorical variables in a two-way table of counts is the distribution of values of that variable among all individuals described by the table.
It’s the distribution of each category alone.
Percentages
Are often more informative than counts
Used when comparing groups of different sizes.
Find the percent of young adults who think there is a good chance they will be rich.
Young adults by gender & chance of getting rich
Gender
Opinion Female Male Total
Almost no chance 96 98 194
Some chance but probably not 426 286 712
A 50-50 chance 696 720 1416
A good chance 663 758 1421
Almost certain 486 597 1083
Total 2367 2459 4826
Find the marginal distribution (in %) of opinions. Make a graph to display the marginal distribution.
Young adults by gender & chance of getting rich
Gender
Opinion Female Male Total
Almost no chance 96 98 194
Some chance but probably not 426 286 712
A 50-50 chance 696 720 1416
A good chance 663 758 1421
Almost certain 486 597 1083
Total 2367 2459 4826
Response Percent
Almost no chance 4.0%
Some chance but probably not 14.8%
A 50-50 chance 29.3%
A good chance 29.4%
Almost certain 22.4%
Find the marginal distribution (in %) of gender. Make a graph to display the marginal distribution.
Young adults by gender & chance of getting rich
Gender
Opinion Female Male Total
Almost no chance 96 98 194
Some chance but probably not 426 286 712
A 50-50 chance 696 720 1416
A good chance 663 758 1421
Almost certain 486 597 1083
Total 2367 2459 4826
Response Percent
Male 51%
Female 49%
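The marginal percentages above can be checked with a short script. This is a minimal sketch in Python, not part of the original slides; the dictionary layout and variable names are assumptions chosen for illustration, and the counts are copied from the two-way table.

```python
# Counts from the two-way table: opinion -> (female, male).
counts = {
    "Almost no chance": (96, 98),
    "Some chance but probably not": (426, 286),
    "A 50-50 chance": (696, 720),
    "A good chance": (663, 758),
    "Almost certain": (486, 597),
}

# Grand total of all individuals in the table.
grand_total = sum(f + m for f, m in counts.values())

# Marginal distribution of opinion: each row total divided by the grand total.
opinion_pct = {
    opinion: round(100 * (f + m) / grand_total, 1)
    for opinion, (f, m) in counts.items()
}

# Marginal distribution of gender: each column total divided by the grand total.
female_total = sum(f for f, _ in counts.values())
male_total = sum(m for _, m in counts.values())
gender_pct = {
    "Female": round(100 * female_total / grand_total),
    "Male": round(100 * male_total / grand_total),
}

for opinion, pct in opinion_pct.items():
    print(f"{opinion}: {pct}%")
print(gender_pct)
```

The "A good chance" entry of opinion_pct is the answer to the earlier question about the percent of young adults who think there is a good chance they will be rich.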
Conditional Distribution
A conditional distribution of a variable describes the values of that variable among individuals who have a specific value of another variable.
Used to describe the relationship between the two categorical variables.
Conditional distributions of opinion for young women and for young men.
Young adults by gender & chance of getting rich
Gender
Opinion Female Male
Almost no chance 96 98
Some chance but probably not 426 286
A 50-50 chance 696 720
A good chance 663 758
Almost certain 486 597
Total 2367 2459
Side-by-Side Bar Graph
Response Women Men
Almost no chance 4.1% 4%
Some chance but probably not 18.0% 11.6%
A 50-50 chance 29.4% 29.3%
A good chance 28% 30.8%
Almost certain 20.5% 24.3%
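The percentages in this table are the conditional distributions of opinion for each gender: each count is divided by its own column total rather than by the grand total. A minimal sketch in Python, assuming the same illustrative dictionary layout (the names are not from the slides):

```python
# Counts from the two-way table: opinion -> (female, male).
counts = {
    "Almost no chance": (96, 98),
    "Some chance but probably not": (426, 286),
    "A 50-50 chance": (696, 720),
    "A good chance": (663, 758),
    "Almost certain": (486, 597),
}

# Column totals: the number of women and the number of men.
female_total = sum(f for f, _ in counts.values())
male_total = sum(m for _, m in counts.values())

# Conditional distribution of opinion given gender:
# divide each count by the total for that gender.
conditional = {
    opinion: (
        round(100 * f / female_total, 1),
        round(100 * m / male_total, 1),
    )
    for opinion, (f, m) in counts.items()
}

print(f"{'Response':<30}{'Women':>8}{'Men':>8}")
for opinion, (women_pct, men_pct) in conditional.items():
    print(f"{opinion:<30}{women_pct:>7}%{men_pct:>7}%")
```

Because each column is divided by its own total, each column of percentages adds to about 100%, which is what makes groups of different sizes comparable.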
Segmented Bar Graph
Response Women Men
Almost no chance 4.1% 4%
Some chance but probably not 18.0% 11.6%
A 50-50 chance 29.4% 29.3%
A good chance 28% 30.8%
Almost certain 20.5% 24.3%
Did we look at the right conditional distribution? Our goal was to analyze the relationship
between gender and opinion about chances of becoming rich for these young adults.
Hint: Does gender influence opinion or opinion influence gender?
Since gender may influence opinion (and not the other way around), we want to consider the conditional distribution of opinion for each gender.
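For contrast, the other conditional distribution (gender within each opinion category) divides each count by its row total instead. The sketch below, again in Python with illustrative names that are not from the slides, shows that it answers a different question: among people who gave a particular response, what share were women?

```python
# Counts from the two-way table: opinion -> (female, male).
counts = {
    "Almost no chance": (96, 98),
    "Some chance but probably not": (426, 286),
    "A 50-50 chance": (696, 720),
    "A good chance": (663, 758),
    "Almost certain": (486, 597),
}

# Conditional distribution of gender given opinion:
# divide each count by the total for that response (the row total).
female_share = {
    opinion: round(100 * f / (f + m), 1)
    for opinion, (f, m) in counts.items()
}

for opinion, pct in female_share.items():
    print(f"{opinion}: {pct}% female, {round(100 - pct, 1)}% male")
```

These row percentages depend on how many women and men happen to be in the sample, so they are less suited to the question of how opinion differs by gender.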
Four-Step Process
State: What’s the question that you’re trying to answer?
Plan: How will you go about answering the question? What statistical techniques does this problem call for?
Do: Make graphs and carry out needed calculations.
Conclude: Give your practical conclusion in the setting of the real-world problem.
State
What is the relationship between gender and responses to the question “What do you think are the chances you will have much more than a middle-class income at age 30?”
Plan
We suspect that gender might influence a young adult’s opinion about the chance of getting rich. So we’ll compare the conditional distributions of response for men alone and for women alone.
Response Women Men
Almost no chance 4.1% 4%
Some chance but probably not 18.0% 11.6%
A 50-50 chance 29.4% 29.3%
A good chance 28% 30.8%
Almost certain 20.5% 24.3%
Do
We’ll make a side-by-side bar graph to compare the opinions of males and females.
A segmented bar graph would work as well.
Side-by-Side Comparative Bar Graph
Response Women Men
Almost no chance 4.1% 4%
Some chance but probably not 18.0% 11.6%
A 50-50 chance 29.4% 29.3%
A good chance 28% 30.8%
Almost certain 20.5% 24.3%
Segmented Comparative Bar Graph
Response Women Men
Almost no chance 4.1% 4%
Some chance but probably not 18.0% 11.6%
A 50-50 chance 29.4% 29.3%
A good chance 28% 30.8%
Almost certain 20.5% 24.3%
Conclude
Based on the sample data, men seem somewhat more optimistic about their future income than women. Men were less likely than women to say that they have “some chance but probably not” (11.6% vs 18.0%). Men were more likely than women to say that they have “a good chance” (30.8% vs 28.0%) or are “almost certain” (24.3% vs 20.5%) to have much more than a middle-class income by age 30.
Association
We say there is an association between two variables if specific values of one variable tend to occur in common with specific values of the other.
Be careful, though: even a strong association between two categorical variables can be influenced by other variables lurking in the background.
Simpson’s Paradox
An association between two variables that holds for each individual value of a third variable can be changed or even reversed when the data for all values of the third variable are combined. This reversal is called Simpson’s paradox.
Accident victims are sometimes taken by helicopter from the accident scene to a hospital. Helicopters save time. Do they also save lives?
Helicopter Road
Victim Died 64 260
Victim survived 136 840
Total 200 1100
32% of helicopter patients died, but only 24% of the others did. This seems discouraging!
Helicopters are sent mostly to serious accidents.
Serious Accident
Helicopter Road
Died 48 60
Survived 52 40
Total 100 100
Less Serious Accident
Helicopter Road
Died 16 200
Survived 84 800
Total 100 1000
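The reversal can be verified by computing the death rate within each type of accident and then for the combined data. This is a minimal sketch in Python; the nested dictionary and the helper death_rate are illustrative names, not from the slides, and the counts are copied from the tables above.

```python
# (died, survived) counts by accident severity and transport, from the tables.
data = {
    "Serious": {"Helicopter": (48, 52), "Road": (60, 40)},
    "Less serious": {"Helicopter": (16, 84), "Road": (200, 800)},
}


def death_rate(died, survived):
    """Percent of victims who died."""
    return 100 * died / (died + survived)


# Death rate within each severity group.
rates = {
    severity: {t: death_rate(*c) for t, c in by_transport.items()}
    for severity, by_transport in data.items()
}

# Death rate after combining both severity groups.
combined = {}
for transport in ("Helicopter", "Road"):
    died = sum(data[s][transport][0] for s in data)
    survived = sum(data[s][transport][1] for s in data)
    combined[transport] = death_rate(died, survived)

for severity, by_transport in rates.items():
    print(severity, {t: f"{r:.0f}%" for t, r in by_transport.items()})
print("Combined", {t: f"{r:.0f}%" for t, r in combined.items()})
```

Within each type of accident the helicopter group has the lower death rate, yet the combined data show the opposite, because helicopter patients come disproportionately from serious accidents.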
Titanic Disaster
Homework
Page 24 (19, 21, 23, 24, 25, 27-32, 33, 35, 36)