
Los Angeles Mission College

Date post: 12-Jul-2020

Chapter 2 Organizing and Summarizing Data

Ch2.1 Organizing Qualitative Data

Objective A : Interpretation of a Basic Statistical Graph

Example 1 : Identity Theft

Identity fraud occurs when someone else’s personal information is used to open credit card accounts, apply for a job, receive benefits, and so on. The following relative frequency bar graph represents the various types of identity theft based on a study conducted by the Federal Trade Commission (2008).

(a) Approximate what percentage of identity theft was loan fraud (such as applying for a loan in someone else’s name)?

0.05 x 100% = 5%

(b) If there were 10 million cases of identity fraud in 2008, how many were credit card fraud (someone uses someone else’s credit card to make a purchase)?

0.26 of the 10 million = 0.26 x 10 million = 2.6 million or 2,600,000 cases

Objective B : Construct a Frequency / Relative Frequency Distribution, Bar Graph, Pareto Chart and Pie Chart

B1. Frequency / Relative Frequency Distribution

- A frequency distribution lists each category of data and its frequency, which is the number of occurrences in that category. It converts the raw data in each category into counts in table format.

- A relative frequency distribution lists each category of data and its relative frequency, which is the proportion of observations within that category. The proportions add up to 1 (100%). It converts the raw data in each category into the equivalent proportions (percentages written as decimals) in table format.

Relative frequency = frequency / (sum of all frequencies)

*The sum of all frequencies is the total number surveyed. Ex.: 100 people are surveyed and 40 support free tuition in community colleges: relative freq. = 40/100 = 0.40 or 40%
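As a rough illustration of the two definitions above, the following Python sketch builds a frequency distribution and a relative frequency distribution from raw categorical data. The function name and the survey responses are hypothetical, made up to match the 40-out-of-100 example, and are not part of the original notes.

```python
from collections import Counter

def frequency_distributions(raw_data):
    """Return (frequency, relative frequency) dictionaries for categorical data."""
    freq = Counter(raw_data)        # number of occurrences per category
    total = sum(freq.values())      # sum of all frequencies
    rel_freq = {category: count / total for category, count in freq.items()}
    return dict(freq), rel_freq

# Hypothetical survey: 100 people, 40 of whom support free tuition.
responses = ["support"] * 40 + ["oppose"] * 60
freq, rel_freq = frequency_distributions(responses)
print(freq)       # {'support': 40, 'oppose': 60}
print(rel_freq)   # {'support': 0.4, 'oppose': 0.6}
```

The relative frequencies should always sum to 1, which is a quick way to check a table built by hand.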


*original data has been converted to a percentage (in decimals)


B2. Construct a Bar Graph, a Pareto Chart, or a Pie Chart

- A bar graph is constructed by labeling each category of data on either the horizontal or vertical axis and the frequency or relative frequency of the category on the other axis. Rectangles of equal width are drawn for each category. The height of each rectangle represents the category’s frequency or relative frequency.

*FOR CATEGORICAL DATA ONLY!

freq: # of times in a week
rel. freq.: # of times in a week as a proportion

Note: bars have the same width and there is a space between the bars.

- A Pareto chart is a bar graph whose bars are drawn in decreasing order of frequency or relative frequency.

*FOR CATEGORICAL DATA ONLY!

Note: the data are in order from greatest to least (a bar graph, but in order).

- A pie chart is a circle divided into sectors. Each sector represents a category of data. The area of each sector is proportional to the frequency of the category.
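The ordering rule for a Pareto chart and the proportionality rule for a pie chart can be sketched in a few lines of Python. The function names and the frequencies below are hypothetical. The sector angle is computed as relative frequency times 360 degrees, which follows from the fact that a sector's area is proportional to its central angle.

```python
def pareto_order(freq):
    """Sort categories from greatest to least frequency (Pareto chart order)."""
    return sorted(freq.items(), key=lambda item: item[1], reverse=True)

def pie_sector_angles(freq):
    """Central angle in degrees for each category's sector of a pie chart."""
    total = sum(freq.values())
    return {category: 360 * count / total for category, count in freq.items()}

# Hypothetical frequencies, for illustration only.
freq = {"walk": 10, "car": 25, "bus": 15}
print(pareto_order(freq))        # [('car', 25), ('bus', 15), ('walk', 10)]
print(pie_sector_angles(freq))   # {'walk': 72.0, 'car': 180.0, 'bus': 108.0}
```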


Chapter 2.2 Organizing Quantitative Data: The Popular Displays

Objective A : Histogram

- A histogram is constructed by drawing rectangles for each class of data. If the discrete data set is small, each number is a class. If the discrete data set is large or the data are continuous, the classes must be created using intervals of numbers. The height of each rectangle is the frequency or relative frequency of the class. The width of each rectangle is the same and the rectangles touch each other.

Construct Frequency Distribution and Histogram for Discrete Data
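For a small discrete data set, where each number is its own class, the frequency distribution and a rough text histogram might be sketched as follows. The data values and the function name are hypothetical, and the sketch assumes whole-number data so that values with zero frequency between the minimum and maximum still get a (blank) bar.

```python
from collections import Counter

def discrete_frequency_table(data):
    """Treat each whole number from min to max as its own class and count occurrences."""
    counts = Counter(data)
    # Counter returns 0 for values that never occur, so gaps stay visible.
    return {value: counts[value] for value in range(min(data), max(data) + 1)}

# Hypothetical raw data, for illustration only.
arrivals = [2, 3, 1, 2, 4, 2, 3, 1, 2, 3]
table = discrete_frequency_table(arrivals)
for value, freq in table.items():
    print(f"{value}: {'#' * freq} ({freq})")
```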


* raw data

Freq: number of customers during 15 min intervals


Objective B : Constructing a Stem-and-Leaf Plot

The stem of a data value will consist of the digits to the left of the rightmost digit. The leaf of a data value will be the rightmost digit.
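The stem and leaf rule above can be sketched in Python for non-negative whole numbers: integer division by 10 gives the digits to the left of the rightmost digit, and the remainder gives the rightmost digit. The function name and the data are hypothetical, and this simple version does not print stems that have no leaves.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Group non-negative whole numbers by stem (all digits but the last) and leaf (last digit)."""
    plot = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, 10)   # e.g. 103 -> stem 10, leaf 3
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical data, for illustration only.
scores = [27, 31, 35, 35, 42, 48, 103]
for stem, leaves in stem_and_leaf(scores).items():
    print(f"{stem:>3} | {' '.join(str(leaf) for leaf in leaves)}")
```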

Objective C : Construct Frequency Distributions and Histogram for Continuous Data

- Classes are categories into which data are grouped.
- The lower class limit is the smallest value within a class.
- The upper class limit is the largest value within a class.
- The class width is the difference between consecutive lower class limits.

- The class width is computed by the following formula.

Class width ≈ (largest data value - smallest data value) / number of classes

Round this value up to the same decimal place as the raw data.
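A minimal Python sketch of the class width calculation, assuming the usual formula (largest value minus smallest value, divided by the number of classes), which matches the calculation in Example 3 later in these notes. The function name and the `decimals` parameter are hypothetical. Floating-point rounding can affect borderline cases when `decimals` is greater than zero.

```python
import math

def class_width(largest, smallest, num_classes, decimals=0):
    """Approximate class width, rounded UP to `decimals` decimal places."""
    raw = (largest - smallest) / num_classes
    factor = 10 ** decimals
    return math.ceil(raw * factor) / factor

# Numbers from Example 3: largest 125, smallest 27, six classes.
print(class_width(125, 27, 6))   # 17.0
```

The result is only a lower bound on a workable width. A rounder number such as 20 can still be chosen if it is more practical.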


Types of graphs for Quantitative Data:

1. histogram
2. stem and leaf
3. time-series (only when time is involved)
4. box plot
5. dot plot


Example 1: The following data represent the fall 2006 student headcount enrollments for all public community colleges in the state of Illinois.

(a) Find the number of classes.
6

(b) Find the class limits.
Lower: 0, 5000, 10000, 15000, 20000, 25000
Upper: 4999, 9999, 14999, 19999, 24999, 29999

(c) Find the class width: 5000 - 0 = 5000
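The class limits in this example follow a regular pattern, which the sketch below reproduces from the first lower limit, the class width, and the number of classes. The function names and the enrollment values are hypothetical, since the original data table did not survive. The `gap` of 1 assumes whole-number data.

```python
def class_limits(first_lower, width, num_classes, gap=1):
    """Return (lower, upper) limits for consecutive classes of equal width."""
    limits = []
    for i in range(num_classes):
        lower = first_lower + i * width
        upper = lower + width - gap   # upper limit sits just below the next lower limit
        limits.append((lower, upper))
    return limits

def tally(data, limits):
    """Count how many data values fall inside each class."""
    return [sum(lower <= x <= upper for x in data) for lower, upper in limits]

limits = class_limits(0, 5000, 6)
print(limits)
# [(0, 4999), (5000, 9999), (10000, 14999), (15000, 19999), (20000, 24999), (25000, 29999)]

enrollments = [1200, 4800, 5300, 9100, 12500, 28000]   # hypothetical headcounts
print(tally(enrollments, limits))   # [2, 2, 1, 0, 0, 1]
```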

Example 2:


* 40 numbers

*freq. of five-year rate of returns

Identify the shape of each distribution.

Uniform Distribution
Ex: last digit of a phone number

Bell-Shaped Distribution ( Normal Distribution or symmetric )

Ex. Physical measurements of species


Skewed to the right Distribution (right tail)
Ex. Grades of a difficult exam

Skewed to the left Distribution (left tail)
Ex. Grades of an easy exam

Example 3: The largest value of a data set is 125 and the smallest value of the data set is 27. If six classes are to be formed, calculate an appropriate class width.

Class width ≈ (125 - 27) / 6 = 98 / 6 ≈ 16.33. Round up to 17. Use a width of 20, since it is more practical.

Percent vs proportion: a proportion is the decimal form of the fraction; a percent is the proportion expressed out of every 100 subjects.

Descriptive vs inferential: Descriptive statistics uses collected data to summarize results in tables, graphs, and calculations. Inferential statistics uses collected data to make inferences and predictions about a population based on the sample.

Objective D : Time Series Graphs

- A time series graph represents the values of a variable that have been collected over a specified period of time. The horizontal axis is the time and the vertical axis is the value of the variable. Line segments are drawn connecting consecutive points of time and the corresponding values of the variable.

Example 1: The following time-series graph shows the annual U.S. motor vehicle production from 1990 through 2008.


(a) Estimate the number of motor vehicles produced in the United States in 1991.
8,900 thousand = 8,900,000 or 8.9 million

(b) Estimate the number of motor vehicles produced in the United States in 1999.
13,000 thousand = 13,000,000 or 13 million

(c) Use the results from (a) and (b) to estimate the percent increase in the number of motor vehicles produced from 1991 to 1999.

13 million - 8.9 million = 4.1 million (4,100,000)

Formula: (increase / original amount) x 100% = (4.1 mil / 8.9 mil) x 100% ≈ 0.4607 x 100% = 46.07%, or about a 46.1% increase
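The percent change calculation can be checked with a short Python sketch. The function name is hypothetical, and the inputs are the graph estimates from parts (a) and (b), in millions of vehicles.

```python
def percent_change(original, new):
    """Percent change from `original` to `new`; positive = increase, negative = decrease."""
    return (new - original) / original * 100

increase = percent_change(8.9, 13)   # 1991 -> 1999, millions of vehicles
print(round(increase, 1))            # 46.1
```

A negative result from the same function indicates a percent decrease.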

(d) Estimate the percent decrease in the number of motor vehicles produced from 1999 to 2008.

1999: 13 mil; 2008: 8.8 mil; decrease = 13 - 8.8 = 4.2 mil

Formula: (decrease / original amount) x 100% = (4.2 mil / 13 mil) x 100% ≈ 0.3231 x 100% = 32.31%, or about a 32.3% decrease

Ch 2.3 Graphical Misrepresentations of Data

- The most common graphical misrepresentation of data is accomplished through manipulation of the scale of the graph.

Example 1: Union Membership

The following relative frequency histogram represents the proportion of employed people aged 25 to 64 years old who were members of a union.


Describe how this graph is misleading. What might a reader conclude from the graph?

The vertical scale doesn’t start at zero, so the differences between the bar heights appear larger than they actually are.

Example 2: Inauguration Cost: The following is a USA Today-type graph. Explain how it is misleading.

Not proportional: The cost for Reagan was about 4 times as much as for Carter. However, the drawing for Reagan is not four times as big as the drawing for Carter.
