Chapter 2 Displaying and Describing Categorical Data
Graphs for Categorical Variables
Our concern will be two types of visual representations.
1 Pie charts
2 Bar graphs
Since these both deal with categorical data, they both deal with counts in categories, so we are graphing either raw counts (frequency) or percentages (relative frequency).
Important: For all graphs, be sure to label everything clearly.
Pie Charts
Example: You sit on an overpass and record the color of the first 100 cars you see. The results are as follows:

color    frequency
red      15
blue     21
green    18
white    22
black    19
other    5

Construct a pie chart to illustrate the relationship between the colors of these cars.
How We Construct Pie Charts
What are the important things to keep in mind?
1 Sections must add up to 100%
2 Sections must be sized in proportion to their relative frequencies
To accomplish the latter, we use central angles.
Definition: A central angle is an angle whose vertex is the center of the circle and whose sides are radii of the circle.
Central Angles
So how do we find the central angle associated with a section of the pie chart?
Central Angle Calculation
To find the central angle, multiply the relative frequency by 360°.

color    frequency    central angle
red      15           0.15 × 360° = 54°
blue     21           75.6°
green    18           64.8°
white    22           79.2°
black    19           68.4°
other    5            18°
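
The calculation in the table can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name `central_angles` is our own, and the data are the car color counts from the example.

```python
def central_angles(frequencies):
    """Return the central angle (in degrees) for each category.

    frequencies: dict mapping category name -> raw count.
    Each angle is the category's relative frequency times 360.
    """
    total = sum(frequencies.values())
    return {
        category: round(count / total * 360, 1)
        for category, count in frequencies.items()
    }


# Car color counts from the overpass example (100 cars in total).
car_colors = {"red": 15, "blue": 21, "green": 18,
              "white": 22, "black": 19, "other": 5}

angles = central_angles(car_colors)
for color, angle in angles.items():
    print(f"{color:<6} {car_colors[color]:>3} {angle:>6.1f}")
```

Because the function divides by the total before multiplying by 360, it works equally well when the counts do not add up to a convenient 100.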
The Resulting Pie Chart
[Figure: pie chart of car colors. Red 15%, Blue 21%, Green 18%, White 22%, Black 19%, Other 5%.]
Drawbacks to Pie Charts
1 We must use relative frequencies
2 It is just as easy to read the frequency table as the pie chart
3 Only good for categorical variables
4 Not easy to compare two variables
5 Easy to manipulate
6 Be careful that all percentages are calculated the same way (i.e., with the same denominator)
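
One quick sanity check on a pie chart's labels is that they total 100%, which is what we expect when every share uses the same denominator. The sketch below is our own illustration: `check_percentages`, its `tolerance` parameter, and the mislabeled example are hypothetical and do not come from the slides.

```python
def check_percentages(percentages, tolerance=0.5):
    """Return True if the percentages total 100 within a rounding tolerance."""
    return abs(sum(percentages) - 100) <= tolerance


# Labels from the car color pie chart: they total exactly 100.
car_labels = [15, 21, 18, 22, 19, 5]

# Hypothetical mislabeled chart: the last share was computed from a
# different denominator, so the labels no longer describe one whole.
mixed_labels = [15, 21, 18, 22, 19, 25]

print(check_percentages(car_labels))
print(check_percentages(mixed_labels))
```

A total near 100 does not prove the chart is honest, but a total far from 100 is a clear sign that something was calculated inconsistently.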
Another Pie Chart Example
Example: The following is a breakdown of the solid waste that made up America’s garbage in 2000. Values given represent millions of tons.

Material          Weight
Food              25.9
Glass             12.8
Metal             18.0
Paper             86.7
Plastics          24.7
Rubber            15.8
Wood              12.7
Yard Trimmings    27.7
Other             7.5
Create a pie chart to represent this data.
Solution
We can’t make a pie chart with this data; at least not yet. What do we need?

Material          Weight    Relative Frequency
Food              25.9      11.2%
Glass             12.8      5.5%
Metal             18.0      7.8%
Paper             86.7      37.4%
Plastics          24.7      10.7%
Rubber            15.8      6.8%
Wood              12.7      5.5%
Yard Trimmings    27.7      11.9%
Other             7.5       3.2%
Total             231.9
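
The relative frequencies can be reproduced with a short Python sketch (our own illustration; the variable names are not from the slides). Note that adding the nine listed weights gives 231.8 rather than the 231.9 shown, presumably because of rounding in the source figures; the percentages agree to one decimal place with either total.

```python
# Weights in millions of tons, copied from the table.
waste = {
    "Food": 25.9, "Glass": 12.8, "Metal": 18.0, "Paper": 86.7,
    "Plastics": 24.7, "Rubber": 15.8, "Wood": 12.7,
    "Yard Trimmings": 27.7, "Other": 7.5,
}

total = sum(waste.values())

# Relative frequency = category weight / total weight, as a percentage.
relative_frequency = {
    material: round(weight / total * 100, 1)
    for material, weight in waste.items()
}

print(f"total = {total:.1f}")
for material, pct in relative_frequency.items():
    print(f"{material:<15} {pct:>5.1f}%")
```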
Solution
Now we can find the central angles and create our pie chart.
Material          Weight    Relative Frequency    Central Angle
Food              25.9      11.2%                 40.3°
Glass             12.8      5.5%                  19.8°
Metal             18.0      7.8%                  28.1°
Paper             86.7      37.4%                 134.6°
Plastics          24.7      10.7%                 38.5°
Rubber            15.8      6.8%                  24.5°
Wood              12.7      5.5%                  19.8°
Yard Trimmings    27.7      11.9%                 42.8°
Other             7.5       3.2%                  11.5°
[Figure: pie chart of solid waste by material. Food 11%, Glass 6%, Metal 7%, Paper 37%, Plastics 11%, Rubber 7%, Wood 6%, Trimmings 12%, Other 3%.]
Bar Graphs
Bar graphs give us essentially the same information as a pie chart, with a couple of advantages.
1 We can use raw frequencies, since all that matters is the relative size of the rectangles
2 We can compare multiple variables
Important: The bars must all be of the same width.
The Good and the Not-So-Good
Generally used for categorical variables
Bars can be vertical or horizontal
Cannot be used to analyze the shape of a distribution, because the classes are not necessarily in numerical order
Can be used for comparisons
Bar Graph Example
Example: The growth of the US population age 65 and over is given in the table. Create a bar graph to represent this data.

Year    Percent    Year    Percent
1900    4.1        1970    9.8
1910    4.3        1980    11.3
1920    4.7        1990    12.5
1930    5.5        2000    12.4
1940    6.9        2010    13.2
1950    8.1        2020    16.5
1960    9.2        2030    20.0
Here’s the Graph
[Figure: bar graph "Age of Seniors by Decade". Horizontal axis: Year, 1900 to 2030 by decade. Vertical axis: Percent, with ticks at 5, 10, 15, 20.]
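
For readers who want to reproduce the graph, here is a rough text-only sketch in Python. It is our own illustration: the helper `text_bar_chart` and its `scale` parameter are hypothetical, and a plotting library would normally be used instead. Each row is one bar, so all bars have the same thickness and only their lengths differ.

```python
# Percent of the US population age 65 and over, by decade (from the table).
seniors = {
    1900: 4.1, 1910: 4.3, 1920: 4.7, 1930: 5.5, 1940: 6.9,
    1950: 8.1, 1960: 9.2, 1970: 9.8, 1980: 11.3, 1990: 12.5,
    2000: 12.4, 2010: 13.2, 2020: 16.5, 2030: 20.0,
}


def text_bar_chart(data, scale=2):
    """Return one line per category; bar length is value * scale, rounded."""
    lines = []
    for label, value in data.items():
        bar = "#" * round(value * scale)
        lines.append(f"{label} | {bar} {value}")
    return lines


chart = text_bar_chart(seniors)
print("\n".join(chart))
```

Printing the value next to each bar follows the advice to label everything clearly.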
Note
Notice that we can’t do much analysis here other than see which class has the largest value. We don’t even have to put the bars in any particular order; if we ordered them by size, we’d have a Pareto chart. But since order does not matter, we cannot talk about the distribution the same way we will be able to for quantitative variables.
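
As a small illustration of ordering bars by size, the Python sketch below sorts the car color counts from the earlier example from largest to smallest. The variable names are our own.

```python
# Car color counts from the overpass example.
car_colors = {"red": 15, "blue": 21, "green": 18,
              "white": 22, "black": 19, "other": 5}

# Sort categories from the largest count to the smallest.
pareto_order = sorted(car_colors.items(), key=lambda item: item[1], reverse=True)

for color, count in pareto_order:
    print(f"{color:<6} {count}")
```

Drawing the bars in this order changes nothing about the data; it only makes the largest categories easier to spot.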
Comparisons Using Bar Graphs
Example: Create a bar graph for the given causes of death and analyze the results. Values given are the number per 100,000 people.

Cause of Death    1970    1980    1990    2000
Cardiovascular    640     509     387     318
Cancer            199     208     216     201
Accidents         62      46      36      34
And Our Graph
[Figure: bar graph "Causes of Death". Horizontal axis: Year (1970, 1980, 1990, 2000). Vertical axis: Number of Deaths (per 100,000), with ticks at 150, 300, 450, 600. Legend: Cardiovascular, Cancer, Accidents.]
Analysis
Cancer and accident death rates each stay roughly level from decade to decade
Cardiovascular disease deaths decrease each decade and are approaching the level of cancer deaths
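
These observations can be checked against the table with a short Python sketch. It is our own illustration with our own variable names; it computes the decade-to-decade change for each cause and the gap between cardiovascular and cancer rates.

```python
years = [1970, 1980, 1990, 2000]

# Deaths per 100,000 people, from the table in the example.
deaths = {
    "Cardiovascular": [640, 509, 387, 318],
    "Cancer": [199, 208, 216, 201],
    "Accidents": [62, 46, 36, 34],
}

# Decade-to-decade change for each cause (negative means a decrease).
changes = {
    cause: [later - earlier for earlier, later in zip(rates, rates[1:])]
    for cause, rates in deaths.items()
}

# Gap between cardiovascular and cancer rates in each year.
gap = [cv - ca for cv, ca in zip(deaths["Cardiovascular"], deaths["Cancer"])]

for cause, deltas in changes.items():
    print(f"{cause:<15} {deltas}")
print(f"Cardiovascular minus cancer: {dict(zip(years, gap))}")
```

The cardiovascular changes are all negative and large, while the gap to cancer shrinks in every decade, which matches what the bars show.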
Segmented Bar Graphs
Usage: Segmented bar graphs are best used to show the cumulative effect of a categorical variable.
Contingency Tables
Definition: Contingency tables are another way to display data. They differ from frequency tables in that the counts are broken down by two categorical variables at once, one across the rows and one across the columns.
Contingency tables look like charts with values based on different conditions. We often see these broken out by gender and by whether or not the people have a particular characteristic.
Contingency Table Example
Example: Suppose the following data was collected from voters leaving a polling station during the 2008 Presidential election. People were asked how they identified themselves and for which candidate they voted.

          Strong    Weak      Ind                 Ind       Weak      Strong    Row
          Dem       Dem       Dem       Ind       Repub     Repub     Repub     Total
McCain    4         17        15        18        69        104       164       389
          (2.6)     (14.9)    (11.7)    (40.2)    (79.5)    (89.6)    (97.0)    (49.1)
Obama     136       95        104       25        12        12        5         390
          (97.4)    (85.1)    (83.1)    (57.6)    (14.2)    (10.4)    (3.0)     (49.2)
Other     0         0         7         1         6         0         0         13
          (0.0)     (0.0)     (5.2)     (2.3)     (6.4)     (0.0)     (0.0)     (1.7)
Column    140       111       126       44        87        116       169       792
Total     (100)     (100)     (100)     (100)     (100)     (100)     (100)     (100)
Now the Questions
1 What percent of those who identify themselves as Independent Democrats voted for Obama?
2 What percent of those who identify themselves as Weak Republicans voted for McCain?
3 What percent of people identify themselves as Independent?
What If We Went The Other Way?
What percent of McCain voters consider themselves Weak Republicans?
The percentages in the table are based on the column totals. What must we consider to find our answer? Row totals:
104 / 389 = 26.7%
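
The same calculation as a minimal Python sketch (our own illustration; the helper `percent` and the variable names are hypothetical). The point is the choice of denominator: the 89.6% printed in the table answers a different question, namely what share of Weak Republicans voted for McCain.

```python
def percent(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(part / whole * 100, 1)


# Values read from the McCain row of the table.
mccain_weak_repub = 104   # McCain voters who identify as Weak Republican
mccain_row_total = 389    # all McCain voters (row total as printed)

# Conditioning on the row: the denominator is the row total,
# not the column total used for the percentages in parentheses.
row_pct = percent(mccain_weak_repub, mccain_row_total)
print(f"{row_pct}% of McCain voters identify as Weak Republicans")
```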