Date post: | 26-Dec-2015 |
Category: |
Documents |
Upload: | mabel-baldwin |
© The McGraw-Hill Companies, Inc. 2008 McGraw-Hill/Irwin
Describing Data: Frequency Tables, Frequency Distributions, and Graphic Presentation
Chapter 2
GOALS
• Organize qualitative data into a frequency table.
• Present a frequency table as a “bar chart” (in Excel these are called column charts) or a pie chart.
• Organize quantitative data into a frequency distribution.
• Present a frequency distribution for quantitative data using histograms, frequency polygons, and cumulative frequency polygons.
Mutually Exclusive
An individual, object, or measurement is included in only one category
– It can’t be in two categories
– Example: A particular phone call cannot originate with both AT&T and MCI
Frequency Table
Frequency Table: A grouping of qualitative data into mutually exclusive classes (categories) showing the number of observations in each class
Sales data from Auto Dealership
Price        Price ($000)   Age of Customer   ForD       Car Type
24,624.00    24.624         50                Domestic   GM
23,032.00    23.032         50                Domestic   Ford
27,556.00    27.556         59                Domestic   GM
20,384.00    20.384         32                Domestic   GM
20,953.00    20.953         29                Domestic   Ford
37,270.00    27.270         35                Foreign    Mercedes
21,006.00    21.006         57                Domestic   GM
27,594.00    27.594         43                Domestic   GM
29,636.00    29.636         51                Domestic   GM
26,357.00    26.357         31                Foreign    Honda
38,262.00    28.262         39                Foreign    Mercedes
38,910.00    38.910         25                Foreign    Honda
23,947.00    23.947         43                Domestic   Ford
We would like a Frequency Table that shows how many of each Car Type we sold last month, obtained by counting the Auto Dealership data.
Count of Car Type
Car Type      Total
Ford          28
GM            22
Honda         13
Mercedes      10
Toyota        7
Grand Total   80
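The counting step can be sketched in a few lines of Python. This is only an illustration: it uses the Car Type column of the 13 sample rows listed in the sales data, not the full set of 80 vehicles, so its totals differ from the table above. The variable names are assumptions made for this sketch.

```python
from collections import Counter

# Car Type values for the 13 sample rows shown in the sales data
# (the full data set of 80 vehicles is not reproduced here).
car_types = [
    "GM", "Ford", "GM", "GM", "Ford", "Mercedes", "GM",
    "GM", "GM", "Honda", "Mercedes", "Honda", "Ford",
]

# Each observation lands in exactly one class, so the classes are
# mutually exclusive and the counts add up to the number of rows.
frequency_table = Counter(car_types)

for car_type, count in sorted(frequency_table.items()):
    print(f"{car_type:<12}{count:>4}")
print(f"{'Grand Total':<12}{sum(frequency_table.values()):>4}")
```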
Relative Class Frequencies
Class frequencies can be converted to relative class frequencies to show the fraction of the total number of observations in each class.
A relative frequency captures the relationship between a class total and the total number of observations.
Car Type   Frequency   Relative Frequency
GM         22          27.50%
Ford       28          35.00%
Mercedes   10          12.50%
Honda      13          16.25%
Toyota     7           8.75%
Total      80          100.00%
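As a quick check of the percentages, each class frequency can be divided by the total number of observations. The sketch below uses the frequencies from the table; the dictionary and variable names are illustrative only.

```python
# Class frequencies taken from the Car Type table.
counts = {"GM": 22, "Ford": 28, "Mercedes": 10, "Honda": 13, "Toyota": 7}

total = sum(counts.values())  # total number of observations

# Relative frequency = class frequency / total number of observations.
relative = {car: n / total for car, n in counts.items()}

for car, share in relative.items():
    print(f"{car:<10}{counts[car]:>4}{share:>10.2%}")
print(f"{'Total':<10}{total:>4}{sum(relative.values()):>10.2%}")
```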
Textbook: Bar Charts Excel: Column Chart
In Excel, this is a Column chart. Column charts are good for Nominal Level Data. Notice that the columns do not touch.
Frequency Distribution
A frequency distribution is a grouping of data into mutually exclusive categories showing the number of observations in each class.
• The raw data are more easily interpreted if organized into a frequency distribution
• The resulting frequency distribution helps a person to quickly see the “shape” of the data
• Although the frequency distribution will result in the loss of some detail, seeing patterns in the data can help a person to make better decisions
5 Steps To Organize Raw Data Into A Frequency Distribution
Step 1: Decide on Number of Classes
Step 2: Determine The Class Interval
Step 3: Set The Individual Class Limits
Step 4: Tally The Data Into Classes
Step 5: Count The Tallies in Each Class & Present the Frequency Distribution
Step 1: Determining The Number Of Classes
Goal is to use just enough classes so you can see the “shape” of the data.
You must use professional judgment.
Useful recipe to determine the number of classes: 2^k ≥ n
n = total observations
k = number of classes
Best to use 5 < k < 15
These are general guidelines that are not always possible to follow. Thus, making frequency distributions is often referred to as an “art”.
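A minimal sketch of the recipe, assuming it is read as "pick the smallest k with 2^k ≥ n". The function name is hypothetical, and the result is only a starting point for the professional judgment mentioned above.

```python
def number_of_classes(n: int) -> int:
    """Smallest k such that 2**k >= n (the "2 to the k" recipe)."""
    if n < 1:
        raise ValueError("n must be a positive number of observations")
    k = 1
    while 2 ** k < n:
        k += 1
    return k


# With 80 observations: 2**6 = 64 is too small, 2**7 = 128 is enough.
print(number_of_classes(80))
```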
Definitions
Class Interval
– Distance between lower limit of class and lower limit of the next class
– The class interval is obtained by subtracting the lower limit of a class from the lower limit of the next class (also midpoint to midpoint)
Class Midpoint (Class Mark)
– The midpoint can be thought of as the “typical value” for the class
– This is the average of the upper and lower class limits: (Lower class limit + upper class limit)/2
Step 2: Determine The Class Interval Or Width
Class interval should be the same for every class
– If they are not equal, graphs may be misleading and calculations may be problematic
– In some cases, where there is a potential for many empty classes, unequal class intervals may be necessary
The classes all taken together must cover at least the distance from the lowest value in the raw data up to the highest value:
i ≥ (H - L) / k
i = Class Interval
H = Highest Value
L = Lowest Value
k = Number of Classes
EXAMPLE – Creating a Frequency Distribution Table
Ms. Kathryn Ball of AutoUSA wants to develop tables, charts, and graphs to show the typical selling price on various dealer lots. The table on the right reports only the price of the 80 vehicles sold last month at Whitner Autoplex.
Constructing a Frequency Table - Example
Step 1: Decide on the number of classes. A useful recipe to determine the number of classes (k) is the “2 to the k rule”: choose the smallest k such that 2^k > n. There were 80 vehicles sold, so n = 80. If we try k = 6, which means we would use 6 classes, then 2^6 = 64, somewhat less than 80. Hence, 6 is not enough classes. If we let k = 7, then 2^7 = 128, which is greater than 80. So the recommended number of classes is 7.
Step 2: Determine the class interval or width. The formula is i ≥ (H - L)/k, where i is the class interval, H is the highest observed value, L is the lowest observed value, and k is the number of classes.
($35,925 - $15,546)/7 = $2,911
Round up to some convenient number, such as a multiple of 10 or 100. Use a class width of $3,000.
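The Step 2 arithmetic can be reproduced as follows. This is a sketch: the function name and the `round_to` parameter are assumptions for illustration, and rounding up to the next multiple of 100 is just one of the "convenient number" choices the slide allows.

```python
import math


def class_interval(highest: float, lowest: float, k: int, round_to: int):
    """Return (minimum interval, interval rounded up to a multiple of round_to)."""
    minimum = (highest - lowest) / k          # i >= (H - L) / k
    rounded = math.ceil(minimum / round_to) * round_to
    return minimum, rounded


# Values from the example: H = $35,925, L = $15,546, k = 7 classes.
minimum, width = class_interval(35_925, 15_546, 7, round_to=100)
print(f"minimum interval: ${minimum:,.0f}")   # about $2,911
print(f"class width used: ${width:,}")

# Sanity check: 7 classes of this width must cover the whole range.
assert 7 * width >= 35_925 - 15_546
```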
Step 3: Set The Individual Class Limits
Classes must be mutually exclusive
Avoid overlapping or unclear class limits:
– Include lower limit
– Exclude upper limit
Example of class limits:
– $12,000 up to $15,000 and $15,000 up to $18,000
– $12,000 & $14,999 belong in the first class
– $15,000 belongs in the second class
Avoid open ended classes (problems with graphing)
The lower limit of the first class should be a multiple of the class interval (not always possible)
Convenient multiples of ten are useful
You must compare the actual range to the range implied by the number of classes & class interval
Step 4: Tally the vehicle selling prices into the classes.
Step 5: Count the number of items in each class.
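Steps 3 to 5 can be sketched together: build classes that include the lower limit and exclude the upper limit, then tally each price into exactly one class. The starting limit of $15,000 is an assumption for this sketch (a multiple of the $3,000 interval), and the sample prices are hypothetical apart from the lowest and highest values quoted in the example.

```python
def build_classes(first_lower: int, width: int, k: int):
    """Return k consecutive (lower, upper) class limits of equal width."""
    return [(first_lower + i * width, first_lower + (i + 1) * width)
            for i in range(k)]


def tally(values, classes):
    """Count values per class; lower limit included, upper limit excluded."""
    counts = [0] * len(classes)
    for value in values:
        for i, (lower, upper) in enumerate(classes):
            if lower <= value < upper:
                counts[i] += 1
                break
        else:
            raise ValueError(f"{value} falls outside every class")
    return counts


classes = build_classes(first_lower=15_000, width=3_000, k=7)

# Hypothetical prices, except 15,546 and 35,925 (the lowest and highest
# values quoted in the example). 17,999 and 18,000 land in different classes.
sample_prices = [15_546, 17_999, 18_000, 20_500, 20_999, 21_000, 26_400, 35_925]
counts = tally(sample_prices, classes)

for (lower, upper), count in zip(classes, counts):
    print(f"${lower:,} up to ${upper:,}: {count}")
```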
Constructing a Frequency Table
Observed Patterns:
Range: about $15,000 to about $36,000
Concentration between $18,000 & $27,000
Largest concentration is in the $18,000 up to $21,000 class
– Typical value = class midpoint = ($18,000 + $21,000)/2 = $19,500
Two sold for $33,000 or more
8 sold for less than $18,000
Relative Frequency Distribution
To convert a frequency distribution to a relative frequency distribution, each of the class frequencies is divided by the total number of observations.
Graphic Presentation of a Frequency Distribution
The three commonly used graphic forms are:
• Histograms
• Frequency polygons
• Cumulative frequency distributions
Histogram
A histogram for a frequency distribution based on quantitative data is very similar to the column chart (the book says: bar chart) showing the distribution of qualitative data. The classes are marked on the horizontal axis and the class frequencies on the vertical axis. The class frequencies are represented by the heights of the bars. The columns must touch to show visually that each class interval spans from its lower class limit to its upper class limit.
Other Notes About Histograms
Histograms constructed from relative frequency distributions have the same shape, but the vertical axis shows percentages instead of counts
Histograms must have the columns touching:
– The columns must touch in order to visually articulate that the class interval spans from lower class limit to upper class limit (a continuous variable)
– For nominal or ordinal level data, the columns are not drawn adjacent to each other
The category labels are usually words
Frequency Polygon
A frequency polygon also shows the shape of a distribution and is similar to a histogram.
It consists of line segments connecting the points formed by the intersections of the class midpoints and the class frequencies.
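The points of a frequency polygon can be computed directly from the class limits and frequencies. The sketch below uses the classes and frequencies from the second vehicle-price example in this deck (in $ thousands). Closing the polygon with zero-frequency points one class width beyond each end is a common drawing convention, not something stated on the slide, so treat it as an assumption.

```python
# Class limits ($ thousands) and frequencies from the second example.
classes = [(12, 15), (15, 18), (18, 21), (21, 24), (24, 27), (27, 30), (30, 33)]
frequencies = [8, 23, 17, 18, 8, 4, 2]

# Class midpoint = (lower class limit + upper class limit) / 2.
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Each polygon point pairs a class midpoint with its class frequency.
points = list(zip(midpoints, frequencies))

# Assumed convention: anchor the polygon to the axis at the midpoints
# of the (empty) classes just before the first and after the last class.
width = classes[0][1] - classes[0][0]
closed = [(midpoints[0] - width, 0)] + points + [(midpoints[-1] + width, 0)]

for x, y in closed:
    print(f"({x:>5.1f}, {y:>2})")
```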
Second Example of a Cumulative Frequency Distribution (prices of vehicles are lower)
Cumulative Frequency Distribution for Vehicle Selling Price
Selling Prices ($ thousands)   Number of Vehicles Sold (Frequency)   Cumulative Frequency
12 up to 15                    8                                     8
15 up to 18                    23                                    31 = 8 + 23
18 up to 21                    17                                    48 = 31 + 17
21 up to 24                    18                                    66 = 48 + 18
24 up to 27                    8                                     74 = 66 + 8
27 up to 30                    4                                     78 = 74 + 4
30 up to 33                    2                                     80 = 78 + 2
Total                          80
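The running totals in the cumulative column can be reproduced with a running sum over the class frequencies. A minimal sketch using Python's standard library; the variable names are illustrative.

```python
from itertools import accumulate

# Class frequencies from the table, in class order (12 up to 15, ..., 30 up to 33).
frequencies = [8, 23, 17, 18, 8, 4, 2]

# Each cumulative frequency is the previous total plus the current class frequency.
cumulative = list(accumulate(frequencies))

print(cumulative)
# The last cumulative frequency must equal the total number of observations.
assert cumulative[-1] == sum(frequencies)
```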
Cumulative Frequency Polygon
[Figure: cumulative frequency polygon. Horizontal axis: Selling Price ($000), 9 to 33 in steps of 3. Left vertical axis: Number of Vehicles Sold, 0 to 80. Right vertical axis: cumulative percent, 25% to 100%.]
Plot line on coordinate system
X-axis = Upper limit of class
Y-axis (Left) = Cumulative Frequency
Y-axis (Right) = %
First point on graph is: (lower limit of first class, 0)
x    y (left)
12   0
15   8
18   31
21   48
24   66
27   74
30   78
33   80
50% of the vehicles sold for less than about $19,500
25 of the vehicles sold for less than about $17,500
80% of the vehicles sold for less than about $24,000
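The three graph readings above can be approximated numerically by straight-line interpolation between the plotted points of the cumulative frequency polygon. This is a sketch under that assumption (the slides read the values off the chart by eye), and the function name is hypothetical. Interpolation gives roughly $19,600, $17,200, and $23,700, which is in line with the "about" figures quoted.

```python
# Plotted points: (upper class limit in $ thousands, cumulative frequency),
# starting from (lower limit of first class, 0).
points = [(12, 0), (15, 8), (18, 31), (21, 48), (24, 66), (27, 74), (30, 78), (33, 80)]


def price_below(count: float, points) -> float:
    """Price ($ thousands) below which `count` vehicles sold, by linear interpolation."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= count <= y1:
            return x0 + (count - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("count is outside the range of the cumulative frequencies")


total = points[-1][1]  # 80 vehicles

print(round(price_below(0.50 * total, points), 2))  # 50% of the vehicles
print(round(price_below(25, points), 2))            # 25 vehicles
print(round(price_below(0.80 * total, points), 2))  # 80% of the vehicles
```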