
1. Understand the difference between a sample and a ...garcia/4.1reading.pdf · 15.1 Organizing and...

Date post: 23-Mar-2018
Upload: buithuan

15.1 Organizing and Visualizing Data*

Objectives
1. Understand the difference between a sample and a population.
2. Organize data in a frequency table.
3. Use a variety of methods to represent data visually.
4. Use stem-and-leaf displays to compare data.

In a survey of 100,000 women conducted by Cosmopolitan Magazine, it was found that over 70% of women who were married for more than 5 years had had an affair. These are truly shocking results, but before you swear off marriage completely, there are some questions that you should be asking, such as who conducted the study? How were the women selected for the study? Of the 100,000 women, how many responded to the questions in the study? Did those being surveyed have any particular biases?

It may reassure you to know that a larger survey of 200,000 women found that only 15% of the women reported that they had been unfaithful. These contradictory results make us wonder, which survey is correct? As you will learn shortly, perhaps we should trust neither survey.

Populations and Samples

In this chapter you will study statistics, an area of mathematics in which we are interested in gathering, organizing, analyzing, and making predictions from numerical information called data. In the two marriage surveys, it would have been ideal if the researchers had been able to contact each one of the millions of married women in the United States. This set of all married women is called the population. Of course, this is impractical, so the researchers contacted only a subset of the population, called a sample. It is very important that the sample is typical of the population as a whole. In fact, in the two studies just mentioned, both samples were chosen very poorly, and we should trust neither survey.

We will describe a sample as biased if it does not accurately reflect the population as a whole with regard to the data that we are gathering. Bias often occurs if we use poor sampling techniques. There are many ways in which bias can creep into a sample. It could occur because of the way in which we decide how to choose the people to participate in the survey. This is called selection bias. As an example, if we were to do a phone survey in the middle of a weekday afternoon, we would probably get an overrepresentation of retirees and stay-at-home parents in our sample. Call-in surveys conducted by local news shows and radio stations are also prone to selection bias.

*Many of the calculations that we do in this chapter can be done easily using computers and graphing calculators.

Copyright © 2010 Pearson Education, Inc.

It might seem we would get better, more reliable information if we were to walk around a town and ask people “randomly” to take part in our survey. We put the word randomly in quotes because studies have found that selection bias can occur using this method since interviewers tend to choose people who are better dressed and who look cooperative, thus skewing the sample.

Another issue that can affect the reliability of a survey is the way we ask the questions, which is called leading-question bias. For example, in a Roper poll conducted for the American Jewish Committee on the Holocaust, people were asked, “Does it seem possible or does it seem impossible to you that the Nazi extermination of the Jews never happened?” The use of double negatives in this question caused confusion in the way people responded to the survey. When the question was worded this way, 22% of those surveyed said that it was possible that the Holocaust did not occur. A new survey was conducted in which the question was rephrased, “Does it seem possible to you that the Nazi extermination of the Jews never happened, or do you feel certain that it happened?” In the new survey, only 1% of those surveyed stated that it was possible that the Holocaust never occurred.

We have only touched on the notion of how important it is that statistical conclusions are based on reliable, nonbiased data. There are whole books written on sampling theory. For now, we just want you, as an educated consumer of technical information, to be aware that just because a study says something, as the song says, “It ain’t necessarily so!”

We will now turn our focus to what to do once we have obtained reliable data.

Frequency Tables

When we gather information about a population, we often end up with a large collection of numbers. Unless we can organize the data in a meaningful way, it is nearly impossible to interpret these facts. For example, if you glance at the financial section of a newspaper, you will find several pages containing thousands of numerical facts about the daily performance of various stocks in the stock market. This amount of detail about the market’s activity seems overwhelming; it is difficult to understand the general pattern of changes in stock prices from those lists of numbers. However, if you were to tune in to the evening news, the commentator might summarize this set of data by saying, “The Dow Jones lost 99.59 points today, closing at 12,251.7. Losers outnumbered winners by three to one.” For any large amount of data to be comprehensible, we must organize it and present it so that we can see patterns, trends, and relationships.

DEFINITIONS We refer to a collection of numerical information as data or a distribution. A set of data listed with their frequencies is called a frequency distribution.

Sometimes we want to show the percent of the time that each item occurs in a frequency distribution. In this case, we call the distribution a relative frequency distribution. We often present a frequency distribution as a frequency table. In a frequency table, we list the values in one column and the frequencies of the values in another column, as we show in Example 1. We can also present a relative frequency distribution in table form.

KEY POINT

A frequency table is one method used to organize data.


EXAMPLE 1 Using Tables to Summarize TV Program Evaluations

CBS has asked 25 viewers to evaluate the latest episode of CSI. The possible evaluations are

(E)xcellent, (A)bove average, a(V)erage, (B)elow average, (P)oor.

After the show, the 25 evaluations were as follows:

A, V, V, B, P, E, A, E, V, V, A, E, P, B, V, V, A, A, A, E, B, V, A, B, V

Construct a frequency table and a relative frequency table for this list of evaluations.

SOLUTION: If we count the number of Es, As, and so on in the list, we get the results shown in Table 15.1.

By organizing the data in this table, we can see the distribution of favorable and unfavorable evaluations more quickly. Notice that the sum of the frequencies in Table 15.1 is 25, which is the number of viewers asked to evaluate the program.

We construct a relative frequency distribution for these data by dividing each frequency in Table 15.1 by 25. For example, because there are 4 Es, the relative frequency of the score E is 4/25 = 0.16. Table 15.2 shows the relative frequency distribution for the set of evaluations.

The sum of the relative frequencies in Table 15.2 is 1; however, in other examples, the sum of the relative frequencies may not be exactly 1 due to rounding.

If there are many different values in a data set, we may group the data values into classes to make the information more understandable. Although there is no hard-and-fast rule, generally using 8 to 12 classes will give a good presentation of the data. We illustrate how to group data in Example 2.

Evaluation    Frequency
E             4
A             7
V             8
B             4
P             2
Total         25

TABLE 15.1 Frequency table summarizing viewer evaluations of a police drama.

Evaluation    Relative Frequency
E             4/25 = 0.16
A             7/25 = 0.28
V             8/25 = 0.32
B             4/25 = 0.16
P             2/25 = 0.08
Total         1.00

TABLE 15.2 Relative frequency table summarizing viewer evaluations of a police drama.
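For readers who want to check tallies like those in Tables 15.1 and 15.2 by computer, the following is a minimal sketch in Python. The choice of language and the names evaluations, ratings, frequency, and relative_frequency are our own assumptions for illustration; they do not come from the text.

```python
from collections import Counter

# The 25 viewer evaluations listed in Example 1.
evaluations = "A V V B P E A E V V A E P B V V A A A E B V A B V".split()

# The order in which the ratings appear in Tables 15.1 and 15.2.
ratings = ["E", "A", "V", "B", "P"]

counts = Counter(evaluations)
total = sum(counts.values())

# Frequency table: how many times each rating occurs.
frequency = {r: counts[r] for r in ratings}

# Relative frequency table: each frequency divided by the total.
relative_frequency = {r: counts[r] / total for r in ratings}

print(f"{'Evaluation':<12}{'Frequency':>10}{'Relative':>10}")
for r in ratings:
    print(f"{r:<12}{frequency[r]:>10}{relative_frequency[r]:>10.2f}")
print(f"{'Total':<12}{total:>10}{sum(relative_frequency.values()):>10.2f}")
```

The same few lines should work for any short list of values, such as the distribution in the Quiz Yourself that follows, once the list and the display order are replaced.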

Quiz Yourself 1*

Construct a frequency table and a relative frequency table for the following distribution:

1, 2, 7, 2, 6, 5, 2, 7, 8, 8, 1, 3, 10, 7, 9, 1, 7, 3, 5, 2

HISTORICAL HIGHLIGHT

Presidential Polls

In 1936, pollster George Gallup boldly declared that the Literary Digest would incorrectly predict Alfred Landon to defeat Franklin Roosevelt for reelection as president of the United States. Gallup’s claim seemed far-fetched because he used a sample of only 50,000, whereas the Digest intended to survey 10 million.

When Roosevelt won, it was clear that Gallup’s superior sampling methods were more important than having a large sample. Because the Digest used telephone directories, magazine subscription lists, and membership lists of clubs and organizations, it had sampled people in the higher economic classes. Thus, its survey suffered from extreme selection bias and greatly overrepresented Republicans.

After this fiasco, pollsters developed quota sampling, in which samples would reflect the makeup of the population. The idea was to have the same percentage of men, women, Catholics, Jews, blacks, whites, and so on in a sample as there were in the population. However, when pollsters used quota sampling to forecast the 1948 presidential election, disaster struck again. All the major polling organizations wrongly predicted that Thomas Dewey would defeat the incumbent Harry S. Truman.

One major reason that the 1948 polls failed was that the polling stopped too soon, missing a late trend toward Truman. Even though quotas were met, selection bias crept in because interviewers had too much freedom in choosing whom to interview within those categories.

*Quiz Yourself answers begin on page 778.


EXAMPLE 2 Grouping Data Values into Classes

Suppose 40 health care workers take an AIDS awareness test and earn the following scores:

79, 62, 87, 84, 53, 76, 67, 73, 82, 68, 82, 79, 61, 51, 66, 77, 78, 66, 86, 70, 76, 64, 87, 82, 61, 59, 77, 88, 80, 58, 56, 64, 83, 71, 74, 79, 67, 79, 84, 68

Construct a frequency table and a relative frequency table for these data.

SOLUTION: Because there are so many different scores in this list, the frequency of each score will be very small; constructing a frequency table as we did in Example 1 would not give us any useful information.

We must decide how to group the scores before making a table. The smallest score is 51 and the largest is 88. The difference, 88 - 51 = 37, suggests that if we take a range of 40 and divide it into equal parts, we might get a reasonable grouping of the data. We will group the data into classes, each containing five values. The first class contains numbers from 50 to 54, the second contains numbers from 55 to 59, and so on. Counting the numbers in each class gives us the frequencies in the second column of Table 15.3.

To find the relative frequencies, we divide each count in the second column by 40, which is the total number of scores. For example, in the row labeled 55–59, we divide 3 by 40 to get 0.075 in the third column.

Table 15.3 helps us see patterns and trends in the data. For example, a large number of workers have test scores below 70, which might mean that these workers are not effective in treating AIDS patients and may require extra training.
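The grouping in Example 2 can also be sketched in a few lines of Python. This is only an illustration under our own assumptions: the names scores, class_width, lowest, frequency, and relative_frequency are hypothetical, and the classes are fixed at 50–54 through 85–89 as in the example.

```python
# The 40 test scores from Example 2.
scores = [
    79, 62, 87, 84, 53, 76, 67, 73, 82, 68,
    82, 79, 61, 51, 66, 77, 78, 66, 86, 70,
    76, 64, 87, 82, 61, 59, 77, 88, 80, 58,
    56, 64, 83, 71, 74, 79, 67, 79, 84, 68,
]

class_width = 5   # each class contains five values, e.g. 50 to 54
lowest = 50       # lower end of the first class

# Start every class at zero so that an empty class would still be listed.
frequency = {start: 0 for start in range(lowest, 90, class_width)}

for s in scores:
    # Round the score down to the start of its class.
    start = s - (s - lowest) % class_width
    frequency[start] += 1

# Divide each frequency by the number of scores.
relative_frequency = {start: n / len(scores) for start, n in frequency.items()}

for start, n in frequency.items():
    label = f"{start}-{start + class_width - 1}"
    print(f"{label:<8}{n:>4}{relative_frequency[start]:>8.3f}")
```

Changing class_width to 10 (and keeping lowest at 50) would give four wider classes instead, which is one way to see how the choice of grouping changes the picture.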

Representing Data Visually

The saying “A picture is worth a thousand words” certainly applies when working with large sets of data. By presenting data graphically, we can observe patterns more easily. A bar graph is one way to visualize a frequency distribution. In drawing a bar graph, we specify the classes on the horizontal axis and the frequencies on the vertical axis. If we are graphing a relative frequency distribution, then the heights of the bars correspond to the size of the relative frequencies, as we show in Example 3.

EXAMPLE 3 Drawing a Bar Graph of the Viewer Evaluation Data

a) Draw a bar graph of the frequency distribution of CBS viewers’ responses summarized in Table 15.1 in Example 1.

b) Draw a bar graph of the relative frequency distribution of the CBS viewers’ responses summarized in Table 15.2 in Example 1.

SOLUTION:

a) Because the largest frequency is 8, we labeled the vertical axis from 0 to 8. Next we drew five bars of heights 4, 7, 8, 4, and 2 to indicate the frequencies of the evaluations E, A, V, B, and P, as is shown in Figure 15.1(a).

Range of Scores on AIDS Awareness Test    Frequency    Relative Frequency
50–54                                     2            0.05
55–59                                     3            3/40 = 0.075
60–64                                     5            0.125
65–69                                     6            0.15
70–74                                     4            0.10
75–79                                     9            0.225
80–84                                     7            0.175
85–89                                     4            0.10
Total                                     40           1.00

TABLE 15.3 Frequency table and relative frequency table for scores on AIDS awareness test.

KEY POINT

We use bar graphs to represent frequency distributions graphically.


b) In Figure 15.1(b), we labeled the vertical axis from 0 to 35 because the largest relative frequency was 0.32, or 32%. Although both bar graphs have the same shape, it is usually better to draw a bar graph of relative frequency distributions when comparing two different data sets.

Now try Exercises 7 to 12.
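As a rough illustration of Example 3, the sketch below draws both bar graphs as text, with the bars running sideways as rows of # marks rather than upward as in Figure 15.1. The helper text_bars and its parameters are hypothetical names of our own, and Python is simply our choice of language.

```python
# Frequencies from Table 15.1.
frequency = {"E": 4, "A": 7, "V": 8, "B": 4, "P": 2}
total = sum(frequency.values())

# Relative frequencies from Table 15.2, expressed as percents.
percent = {label: 100 * count / total for label, count in frequency.items()}


def text_bars(values, units_per_mark=1, suffix=""):
    """Return one line per class; each '#' stands for units_per_mark units."""
    rows = []
    for label, value in values.items():
        marks = "#" * round(value / units_per_mark)
        rows.append(f"{label} | {marks} {value:g}{suffix}")
    return rows


# Part (a): one mark per viewer.
frequency_bars = text_bars(frequency)

# Part (b): one mark per 2 percentage points.
percent_bars = text_bars(percent, units_per_mark=2, suffix="%")

print("\n".join(frequency_bars))
print()
print("\n".join(percent_bars))
```

Because every percent here is exactly twice a whole number of marks, the two printouts have the same shape, which mirrors the remark in part (b) that both bar graphs have the same shape.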

If we are comparing two data sets of different sizes, graphing the relative frequencies, rather than the actual values in the data sets, allows us to compare the distributions. In this case, instead of drawing two separate bar graphs, we could show both distributions on a single graph, using, say, red for the bars in the first distribution and green for the bars in the second.

Until now, the data we have been organizing and graphing could not take on fractional values. By this we mean that in Example 1, a viewer could evaluate CSI as above average or excellent but could not give a rating between those two. Similarly, in Example 2, a score on the AIDS awareness test could be 78 or 79, but a score between these two numbers, such as 78.56, was not possible. A variable quantity that cannot take on arbitrary values is called discrete. Other quantities, called continuous variables, can take on arbitrary values. Weight is an example of a continuous variable. We may say that a person weighs 150 pounds; however, with a more accurate scale, we may find that the person actually weighs 150.3 or perhaps 150.314 pounds.

We use a special type of bar graph called a histogram to graph a frequency distribution when we are dealing with a continuous variable quantity. We also may use a histogram when the variable quantity is not continuous, but has a very large number of different possible values. Money is an example of such a quantity.

As with a bar graph, we specify classes for a histogram. With a histogram, however, we do not allow any spaces between the bars above each class. If a data value falls on the boundary between two data classes, then you must make it clear as to whether you are counting that value in the class to the right or to the left of the data value. We show how to draw a histogram for a frequency distribution in Example 4. As with bar graphs, we can also draw a histogram for a relative frequency distribution.

EXAMPLE 4 Drawing a Histogram to Represent Weight-Loss Data

The New You Clinic has the following data regarding the weight lost by its clients over the past 6 months. Draw a histogram for the relative frequency distribution for these data.

SOLUTION: We first must find the relative frequency distribution. Because there are 65 data values, we divide each frequency by 65 to obtain the corresponding relative frequency distribution in the third column of Table 15.4.

FIGURE 15.1 (a) Bar graph of frequency distribution of viewers’ ratings. (b) Bar graph of relative frequency distribution of viewers’ ratings. [Both panels show the ratings E, A, V, B, and P on the horizontal axis, labeled Viewers' Ratings. The vertical axis shows Frequency from 0 to 8 in panel (a) and Percent from 0 to 35 in panel (b).]

Graphing calculator representation of Figure 15.1(a).*

*Note that in this graphing calculator screen, there are no spaces between the bars as there are in the graph in Figure 15.1(a).

Quiz Yourself 2

Draw a bar graph representing the relative frequencies that you found in Quiz Yourself 1.

Pounds Lost    Frequency
0 to 10        14
10+ to 20      23
20+ to 30      17
30+ to 40      8
40+ to 50      3
Total          65


We now draw this histogram exactly like a bar graph, as shown in Figure 15.2, except that we do not allow spaces between the bars. Also, we label the endpoints of the class intervals on the horizontal axis.

Now try Exercises 13, 14, 17, and 18.

When we look at the histogram in Figure 15.2, we can see that the majority of the clients lost between 10 and 30 pounds. There is really no strict rule as to how to construct a histogram. It is up to you to decide whether to group the data and how large each data class should be; however, it is customary to have data classes all of the same size.
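The arithmetic behind Example 4 can be sketched as follows. The frequencies are those given for the New You Clinic; the function class_index is a hypothetical helper of our own that shows one way to apply the boundary rule suggested by the class labels (a loss of exactly 10 pounds is counted in "0 to 10", while anything above 10 falls in "10+ to 20"). We assume weight losses are never negative.

```python
import math

class_labels = ["0 to 10", "10+ to 20", "20+ to 30", "30+ to 40", "40+ to 50"]
frequencies = [14, 23, 17, 8, 3]
total = sum(frequencies)

# Relative frequencies, rounded to three places as in Table 15.4.
relative = [round(f / total, 3) for f in frequencies]


def class_index(pounds, width=10):
    """Position of the class containing `pounds`.

    A value on a boundary (10, 20, ...) is counted in the class to its left.
    """
    if pounds <= 0:
        return 0
    return math.ceil(pounds / width) - 1


# Share of clients in the two middle classes, 10+ to 20 and 20+ to 30.
share_10_to_30 = (frequencies[1] + frequencies[2]) / total

for label, f, r in zip(class_labels, frequencies, relative):
    print(f"{label:<10}{f:>4}{r:>8.3f}")
print(f"Lost between 10 and 30 pounds: {share_10_to_30:.1%}")
```

The last line supports the observation that the majority of the clients lost between 10 and 30 pounds, since 40 of the 65 clients fall in those two classes.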

EXAMPLE 5 Determining Information from a Graph

Figure 15.3 shows the number of Atlantic hurricanes* over a period of years. Use this bar graph to answer the following questions.


Pounds Lost    Frequency    Relative Frequency
0 to 10        14           0.215
10+ to 20      23           0.354
20+ to 30      17           0.262
30+ to 40      8            0.123
40+ to 50      3            0.046
Total          65           1.00

TABLE 15.4 Frequency and relative frequency distributions of weight loss at the New You Clinic.

FIGURE 15.2 Histogram of weight loss at the New You Clinic. [Horizontal axis: Number of Pounds Lost, with class endpoints 10, 20, 30, 40, and 50. Vertical axis: Percent, from 0 to 40.]

FIGURE 15.3 Number of hurricanes per year. [Horizontal axis: Number of Hurricanes per Year, from 4 to 19. Vertical axis: Frequency, from 0 to 10.]

*These data are from the Colorado State Tropical Prediction Center.

a) What was the smallest number of hurricanes in a year during this period? What was the largest?

b) What number of hurricanes per year occurred most frequently?

c) How many years were the hurricanes counted?

d) In what percentage of the years were there more than 10 hurricanes?

SOLUTION:

a) The smallest number of hurricanes in any year during this time period was 4. The largest number was 19.

b) The number of hurricanes per year that occurred most frequently corresponds to the tallest bar in Figure 15.3, which appears over the number 11. Therefore, 11 hurricanes occurred in 10 different years.


KEY POINT

A stem-and-leaf display is another way to display data.


c) To find the total number of years for which these data were gathered, we add the heights of all of the bars to get

1 + 1 + 6 + 6 + 9 + 4 + 6 + 10 + 5 + 5 + 3 + 1 + 1 = 58 years.

d) First, we count the number of years in which there were more than 10 hurricanes. If we add the heights of the bars above these values, we get 10 + 5 + 5 + 3 + 1 + 1 = 25. Because there are 58 years of data, we calculate 25/58 ≈ 0.431, which is approximately 43%.

Now try Exercises 19 to 22.
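The calculations in parts c) and d) of Example 5 are easy to confirm with a short Python sketch. The list bar_heights simply repeats the bar heights in the order they are added in part c); the variable names are our own, and we assume, as part d) does, that the last six heights belong to the years with more than 10 hurricanes.

```python
# Bar heights from Figure 15.3, in the order they are added in part c).
bar_heights = [1, 1, 6, 6, 9, 4, 6, 10, 5, 5, 3, 1, 1]

# The last six bars are the ones added in part d): more than 10 hurricanes.
heights_above_10 = bar_heights[-6:]

total_years = sum(bar_heights)          # part c)
years_above_10 = sum(heights_above_10)  # first step of part d)
share_above_10 = years_above_10 / total_years

print(f"Total years of data: {total_years}")
print(f"Years with more than 10 hurricanes: {years_above_10}")
print(f"Share of years: {share_above_10:.3f} (about {share_above_10:.0%})")
```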

Stem-and-Leaf Displays

A stem-and-leaf display is an effective way to present two sets of data “side by side” for analysis. This technique is used in an area called exploratory data analysis, developed by John Tukey, a mathematician who worked at Princeton University and Bell Labs.

Some sports fans believe that home run records have become meaningless in recent years because of the use of steroids by baseball players. In Example 6, we use a stem-and-leaf display to investigate whether there has been an increase in home run production in the National League recently.

EXAMPLE 6 Using Stem-and-Leaf Home Run Records from Two Eras

The following are the number of home runs hit by the home run champions in the National League for the years 1975 to 1989 and for 1993 to 2007.

a) 1975–1989: 38, 38, 52, 40, 48, 48, 31, 37, 40, 36, 37, 37, 49, 39, 47

b) 1993–2007: 46, 43, 40, 47, 49, 70, 65, 50, 73, 49, 47, 48, 51, 58, 50

Compare these home run records using a stem-and-leaf display.

SOLUTION: We first examine the home run data for 1975 to 1989. In constructing a stem-and-leaf display, we view each number as having two parts. The left digit is considered the stem and the right digit the leaf. For example, 38 has a stem of 3 and a leaf of 8. The stems for the data in part a) are 3, 4, and 5. We first list the stems in numerical order and draw a vertical bar to their right. Next, we write the leaves corresponding to each stem to the right of its stem and vertical bar. We also list the leaves in increasing order away from the stem. Figure 15.4 shows the stem-and-leaf display for the data in part a). We show the stem-and-leaf display for the data in part b) in Figure 15.5.

HIGHLIGHT

Using Technology to Graph Data*

In organizing a large amount of data, it is tedious to draw graphs by hand. Many software programs such as Microsoft Word and Excel and graphing calculators allow you to enter a frequency table and then will draw a graph for you. For example, you can use Microsoft Word to draw the following pie chart using the data from Example 1 regarding the TV program CSI.

[Pie chart with sectors labeled Excellent, Above Average, Average, Below Average, and Poor.]

*See your instructor for tutorials on using technology to graph data.

Stems | Leaves
  3   | 1 6 7 7 7 8 8 9
  4   | 0 0 7 8 8 9
  5   | 2

FIGURE 15.4 Stem-and-leaf display of home run data for 1975 to 1989.

Stems | Leaves
  4   | 0 3 6 7 7 8 9 9
  5   | 0 0 1 8
  6   | 5
  7   | 0 3

FIGURE 15.5 Stem-and-leaf display of home run data for 1993 to 2007.
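The construction described in the solution to Example 6 can be sketched in Python as shown below. This is one possible approach under our own assumptions, not a prescribed method: stem_and_leaf and format_display are hypothetical helper names, and the sketch assumes two-digit whole numbers so that the tens digit is the stem and the ones digit is the leaf.

```python
from collections import defaultdict

# Home run totals from Example 6.
era_1975_1989 = [38, 38, 52, 40, 48, 48, 31, 37, 40, 36, 37, 37, 49, 39, 47]
era_1993_2007 = [46, 43, 40, 47, 49, 70, 65, 50, 73, 49, 47, 48, 51, 58, 50]


def stem_and_leaf(values):
    """Map each stem (tens digit) to its leaves (ones digits) in increasing order."""
    display = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        display[stem].append(leaf)
    return dict(display)


def format_display(display):
    """Return one text line per stem, with stems in numerical order."""
    lines = []
    for stem, leaves in sorted(display.items()):
        lines.append(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
    return lines


for line in format_display(stem_and_leaf(era_1975_1989)):
    print(line)
print()
for line in format_display(stem_and_leaf(era_1993_2007)):
    print(line)
```

Sorting the values first is what puts the leaves in increasing order away from the stem, as the solution requires.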


We can compare these data by placing these two displays side by side as we show in Figure 15.6. Some call this display a back-to-back stem-and-leaf display.

From Figure 15.6, we can clearly see the pattern that National League home run champions hit significantly more home runs from 1993 to 2007 than from 1975 to 1989.

Now try Exercises 15 and 16.
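A back-to-back display can be sketched along the same lines. In the Python sketch below, the stems run down the middle, the leaves of the first data set are written right to left so that they increase away from the stem, and the leaves of the second data set are written left to right. The function names leaves_by_stem and back_to_back are our own hypothetical choices, and two-digit whole numbers are assumed.

```python
era_1975_1989 = [38, 38, 52, 40, 48, 48, 31, 37, 40, 36, 37, 37, 49, 39, 47]
era_1993_2007 = [46, 43, 40, 47, 49, 70, 65, 50, 73, 49, 47, 48, 51, 58, 50]


def leaves_by_stem(values):
    """Map each tens digit to its ones digits (as text) in increasing order."""
    table = {}
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        table.setdefault(stem, []).append(str(leaf))
    return table


def back_to_back(left_values, right_values):
    """Return the rows of a back-to-back stem-and-leaf display."""
    left = leaves_by_stem(left_values)
    right = leaves_by_stem(right_values)
    stems = sorted(set(left) | set(right))
    # Width of the widest left-hand row, used to line up the stems.
    width = max((len(" ".join(leaves)) for leaves in left.values()), default=0)
    rows = []
    for stem in stems:
        left_leaves = " ".join(reversed(left.get(stem, [])))
        right_leaves = " ".join(right.get(stem, []))
        rows.append(f"{left_leaves:>{width}} | {stem} | {right_leaves}".rstrip())
    return rows


rows = back_to_back(era_1975_1989, era_1993_2007)
print("\n".join(rows))
```

In the printed result, the left side is crowded around the stems 3 and 4 while the right side starts at 4 and reaches 7, which is the pattern noted for Figure 15.6.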

Because the numbers in Example 6 were two-digit numbers, we used single digits for the stems. If we had to represent a number such as 325, we could use either a stem of 32 and a leaf of 5 or a stem of 3 and a leaf of 25, depending on which way presented the data more clearly.

Organizing and displaying a collection of data is generally not our final goal; we usually want a concise numerical description of a set of data. In Section 15.2, we will discuss how to analyze data once we have organized it.

      1975–1989 |   | 1993–2007
9 8 8 7 7 7 6 1 | 3 |
    9 8 8 7 0 0 | 4 | 0 3 6 7 7 8 9 9
              2 | 5 | 0 0 1 8
                | 6 | 5
                | 7 | 0 3

FIGURE 15.6 Combined stem-and-leaf display of home run data.

Quiz Yourself 3

Use a stem-and-leaf display to represent the following collection of scores:

92, 68, 77, 98, 88, 75, 82, 62, 84, 67, 62, 91, 82, 73, 66, 81, 63, 90, 83, 71

HISTORICAL HIGHLIGHT

Florence Nightingale*

It may surprise you to see a reference to the legendary nurse Florence Nightingale in a discussion of statistics. Although she is best known for her compassion toward the sick and her work in improving hospital sanitation, she was also trained in mathematics. As a young woman, she had the good fortune to study with James Sylvester, one of the most eminent British mathematicians of the nineteenth century.

While serving as a nurse at a military hospital during the Crimean War, she was disturbed by the high mortality rate of her patients. Using statistical methods that she invented, she persuaded her superiors to carry out hospital reforms. As a result of improved sanitation, deaths decreased, and some credit her with saving the British Army during the Crimean War.

In recognition of her revolutionary work in developing a statistical approach to medicine, Nightingale was elected to the Statistical Society of England. She also was an advisor on military health during the American Civil War, and in 1874, she became an honorary member of the American Statistical Association. The renowned statistician Karl Pearson described Florence Nightingale as a prophetess in the development of applied statistics.

*This note is based on the biography of Florence Nightingale by Cynthia Audain, located at the Biographies of Women Mathematicians Web site (www.agnesscott.edu/lriddle/women/women.htm) at Agnes Scott College, Atlanta, Georgia. Also, we used “Mathematical Education in the Life of Florence Nightingale,” by Sally Lipsey, The Newsletter of the Association for Women in Mathematics, Vol. 23, No. 4 (1993), 11–12.


Exercises 15.1

Looking Back*

These exercises follow the general outline of the topics presented in this section and will give you a good overview of the material that you have just studied.

1. What are two kinds of bias that we mentioned that can affect the validity of a sample?

2. How did we find the relative frequencies in Table 15.2?

3. Why did we use a histogram in Example 4 rather than a bar graph?

4. What did we use to compare the home run data in Example 6?

5. Referring to the Historical Highlight on presidential polling on page 717, why were George Gallup’s predictions more reliable than those of the Literary Digest?

6. How did Florence Nightingale use mathematics to improve health care?

Sharpening Your Skills

In Exercises 7 and 8, construct a frequency table, a relative frequency table, and a bar graph for the data given.

7. The number of hours of flight time for 20 amateur pilots last month:

7, 8, 6, 5, 7, 10, 2, 7, 9, 5, 8, 8, 10, 9, 6, 5, 10, 7, 9, 8

8. The number of passengers per car on an Amtrak Acela train:

38, 39, 38, 37, 40, 38, 38, 37, 40, 38, 38, 39, 38, 37, 40, 37, 40, 38, 38, 37, 40, 38, 39, 38, 37, 40, 38, 38, 39, 38

*Before doing these exercises, you may find it useful to review the note How to Succeed at Mathematics on page xix.

In Exercises 9 and 10, construct a bar graph for the data given in each frequency table.

9. This table contains the EPA mileage ratings for 38 domestic cars.

MPG          19  20  21  22  23  24  25  26  27  28  29  30
Frequency     2   5   3   2   1   8   1   2   3   2   7   2

10. This table contains the weight loss by 30 participants in the Biggest Loser show.

Weight Loss (Pounds)   0   1   2   3   4   5   6   7   8   9  10
Frequency              2   3   3   0   1   6   1   4   3   2   5

In Exercises 11 and 12, construct a bar graph for the relative frequency table for the data given.

11. The following are the ages of 60 people who have volunteered to work for the Katrina relief effort:

21, 23, 27, 22, 23, 29, 28, 24, 24, 25, 27, 22, 26, 26, 23, 23, 26, 28, 25, 24, 28, 27, 24, 23, 29, 28, 24, 22, 27, 26, 22, 24, 26, 21, 24, 28, 24, 25, 22, 25, 27, 21, 23, 26, 23, 23, 27, 27, 23, 21, 22, 27, 26, 23, 25, 29, 24, 27, 27, 26

12. The following are the ages of 40 ExecuCorps volunteers who are advising young entrepreneurs:

51, 56, 57, 52, 53, 59, 58, 52, 54, 55, 53, 56, 58, 55, 54, 58, 57, 52, 53, 59, 52, 54, 53, 51, 54, 58, 54, 55, 52, 55, 52, 57, 57, 53, 51, 52, 57, 56, 53, 55

In Exercises 13 and 14, group the data as indicated and construct a histogram.

13. The following are the heights (in inches) of the players in four Eastern Conference teams of the WNBA for the 2007 season:

Chicago Sky: 69, 71, 74, 74, 78, 68, 68, 71, 75, 69, 76, 68, 69, 74, 65, 73

NY Liberty: 72, 67, 72, 73, 75, 77, 73, 69, 67, 75, 76, 76, 73, 74, 69, 70

Washington Mystics: 68, 72, 75, 68, 70, 69, 68, 74, 74, 69, 73, 76, 76, 80, 73, 78

Atlanta Dream: 72, 77, 71, 80, 69, 73, 69, 75, 66, 68, 77, 74, 76, 73, 72, 75

Use classes of width 2, starting at 64.5. (Source: www.wnba.com)


14. The following are the scores on a 100-point language aptitude test given to 60 people who are applying for the Peace Corps:

83, 71, 92, 87, 56, 64, 41, 95, 88, 91, 78, 73, 81, 79, 59, 73, 81, 93, 84, 66, 74, 51, 85, 78, 81, 98, 63, 91, 89, 64, 74, 61, 92, 77, 86, 79, 63, 91, 86, 91, 58, 83, 81, 77, 89, 83, 61, 83, 94, 76, 78, 61, 84, 88, 87, 68, 83, 71, 85, 64

Use classes of width 10, starting at 40.5.

In Exercises 15 and 16, represent the two sets of data on a single stem-and-leaf display.

15. A: 29, 32, 34, 43, 47, 43, 22, 38, 42, 39, 37, 33, 42, 18, 22, 39, 21, 26, 18, 43

B: 32, 38, 22, 39, 21, 26, 28, 16, 13, 20, 21, 29, 22, 24, 33, 47, 23, 22, 18, 33

16. X: 29, 42, 34, 44, 47, 43, 22, 38, 42, 59, 41, 16, 47, 43, 42, 18, 22, 49, 21, 26, 18, 45, 24, 40

Y: 32, 48, 22, 59, 21, 26, 28, 16, 14, 20, 17, 45, 21, 29, 22, 24, 34, 47, 23, 22, 18, 45, 21, 16
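For readers who want to check a hand-drawn display, here is a minimal Python sketch of a back-to-back stem-and-leaf display for two sets of two-digit numbers. The function name `back_to_back` is hypothetical, and the two short lists use only a few values borrowed from sets A and B above so that the full exercise is left to the reader.

```python
from collections import defaultdict

def back_to_back(left_data, right_data):
    """Return {stem: (left_leaves, right_leaves)} for two-digit data."""
    left, right = defaultdict(list), defaultdict(list)
    for x in left_data:
        left[x // 10].append(x % 10)    # tens digit is the stem, ones digit the leaf
    for x in right_data:
        right[x // 10].append(x % 10)
    stems = range(min(min(left), min(right)), max(max(left), max(right)) + 1)
    # Left leaves are stored largest-first so they read outward from the stem.
    return {s: (sorted(left[s], reverse=True), sorted(right[s])) for s in stems}

# A few illustrative values taken from sets A and B
a = [18, 22, 29, 34, 43]
b = [13, 21, 26, 33, 47]

rows = back_to_back(a, b)
width = 2 * max(len(l) for l, _ in rows.values())
for stem, (l, r) in rows.items():
    print(" ".join(map(str, l)).rjust(width), "|", stem, "|", " ".join(map(str, r)))
```

Placing the shared stems in the middle lets the eye compare the two distributions row by row, which is the point of putting both sets on a single display.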

18. The following data are the number of reports of mishandled baggage per 1,000 passengers for 10 U.S. airlines during 3 months of 2008. Use classes of width 1, starting at 3. (Source: U.S. Department of Transportation)

19. The manager at the local Starbucks counted the customers once each hour over a busy weekend. We summarize the results she obtained in the following bar graph; use it to answer the following questions.

20. Redo Exercise 19 using the following bar graph.

To estimate how much money to budget for snow removal, a town has kept records for the past 20 years of the number of days it has snowed each winter. The bar graph below summarizes this information. Use this graph to answer Exercises 21 and 22.

Year Jan. Feb. Mar. Apr. May Jun. Jul. Aug. Sep. Oct. Nov. Dec.

2005 1.823 1.918 2.065 2.283 2.216 2.176 2.316 2.506 2.927 2.785 2.343 2.186

2006 2.315 2.310 2.401 2.757 2.947 2.917 2.999 2.985 2.589 2.272 2.241 2.334

2007 2.274 2.285 2.592 2.860 3.130 3.052 2.961 2.782 2.789 2.793 3.069 3.020

Month  AA    UA    NWA   Delta  Comair  US Air  SWA   Am. Eagle  Jet Blue  Cont.
Jan.   7.75  6.47  5.00  7.87   9.28    7.35    6.99  13.71      3.93      4.76
Feb.   6.85  5.44  4.68  6.90   8.45    6.96    5.63  12.81      3.27      4.60
Mar.   7.34  4.86  4.57  7.90   9.83    6.93    5.49  12.74      3.51      5.50

[Bar graph: Frequency (vertical axis, 0 to 14) against Number of Customers (horizontal axis, 0 to 12)]

a. What was the smallest number of customers in the bar, and how often did it occur?

b. What was the largest number of customers in the bar, and how often did it occur?

c. What was the most frequently occurring nonzero customer count?

d. For how many hours were the customers counted?

e. For what fractional part of the total number of hours were there more than six customers in the bar?

[Bar graph: Frequency (vertical axis, 0 to 10) against Number of Customers (horizontal axis, 0 to 12)]

[Bar graph: Frequency (vertical axis, 0 to 5) against Number of Snow Days, in classes 25–29, 30–34, 35–39, 40–44, 45–49, 50–54]

Applying What You’ve Learned

In Exercises 17 and 18, group the data as indicated and construct a histogram.

17. The following table contains the average price (in dollars) of 1 gallon of unleaded regular gasoline for 2005 to 2007. Use classes of width 20 cents, starting at $1.80. (Source: U.S. Bureau of Labor Statistics)


21. a. How many years had between 40 and 49 snow days?

b. How many years had fewer than 30 snow days?

c. How many years had at least 40 snow days?

d. How many years had fewer than 50 snow days?

22. a. How many years had between 25 and 29 snow days?

b. How many years had fewer than 40 snow days?

c. How many years had at least 45 snow days?

d. How many years had more than 34 snow days?

Comparing wage data. The following bar graphs compare women’s and men’s hourly wages in a recent year. Use these graphs to answer Exercises 23–28. Because you are estimating your answers by looking at the graphs, your answers may not agree exactly with ours.

25. In what age and wage category do men seem to have the biggest advantage (in terms of a percentage difference) over women?

26. In what age and wage category do women seem to have the biggest advantage (in terms of a percentage difference) over men?

27. In what age and wage category do women and men seem to be most equal?

28. What general conclusions can you draw from these data?

29. Comparing training programs. Mary Kay gives sales training to its newly hired employees. To determine how effective the training is, the firm compared the monthly sales of a group that has completed the training with a group that has not. The following numbers indicate thousands of dollars of sales last month. Represent the two sets of data on a single stem-and-leaf display. Does the orientation program seem to be succeeding?

No Training: 19, 22, 34, 23, 27, 43, 42, 28, 32, 29, 41, 26, 28, 26, 43, 40

With Training: 29, 21, 39, 44, 41, 36, 37, 29, 43, 45, 28, 32, 28, 33, 36, 32

30. Comparing weight-loss programs. A hospital is testing two weight-loss programs to determine which program is more effective. The following data represent the amount of weight lost by comparable clients for a year in each program. Represent the two given sets of data on a single stem-and-leaf display. Which program seems to be more effective?

Program A: 19, 32, 27, 34, 33, 36, 47, 32, 25, 52, 29, 26, 37, 28, 26, 43, 31, 40

Program B: 29, 21, 39, 44, 41, 36, 37, 26, 26, 43, 45, 28, 32, 28, 33, 36, 53, 39

Communicating Mathematics

31. What is the difference between a population and a sample?

32. Is it better to use a frequency graph or a relative frequency graph to compare two different data sets?

33. What is the difference between a continuous variable and a discrete variable?

34. What type of graph do we use to graph a frequency distribution of a continuous variable quantity?

35. What do you see as an advantage in grouping data? A disadvantage?

36. How does grouping data affect the way a bar graph looks?

37. What do you think a bar graph (with grouped data) of the number of credits completed by members of your class would look like? Why?

38. What do you think a bar graph (with grouped data) of the day of the year that the members of your class were born (Jan. 1 = 1, Jan. 2 = 2, and so on) would look like? Why?

[Bar graphs: three panels comparing Women and Men in the age groups 16 to 19, 20 to 24, and 25 and older. Panel titles: Percentage Earning Less than $7.51 per Hour; Percentage Earning $7.51 to $9.99 per Hour; Percentage Earning at Least $10.00 per Hour]

23. What percent of women ages 20 to 24 earned $7.50 or less per hour?

24. What percent of men age 25 and older earned between $7.51 and $9.99 per hour?


Using Technology to Investigate Mathematics

39. See your instructor for tutorials on using graphing calculators, spreadsheets, or other technology to demonstrate some of the ideas we have discussed in this section. Use the technology to reproduce some of the graphs in this section.

40. Search the Internet for “bar graph and applets,” “histogram and applets,” or “stem-and-leaf display and applets.” Find an interactive program that interests you, experiment with it, and report on your findings.

For Extra Credit

In Exercises 41 and 42, use the data to construct a histogram that conveys the information as well as you can. Explain how you decided on the class width and the first class. Mention any problems that the data presented to you in making an attractive histogram.

41. According to Variety Magazine (World Almanac, 2008), the all-time top grossing movies through 2007 were as follows:

42. According to Nielsen Media Research (World Almanac, 2008), the favorite prime-time shows in 2007 and their average audience were as follows:

Movie  Gross (millions $)
1. Titanic (1997)  601.0
2. Star Wars Episode IV (1977)  461.0
3. Shrek 2 (2004)  437.0
4. E.T. (1982)  435.0
5. Star Wars Episode I (1999)  431.0
6. Pirates of the Caribbean: Dead Man’s Chest (2006)  423.3
7. Spider-Man (2002)  404.0
8. Star Wars Episode III—Revenge of the Sith (2005)  380.3
9. The Lord of the Rings: The Return of the King (2003)  377.0
10. Spider-Man 2 (2004)  373.4
11. The Passion of the Christ (2004)  370.3
12. Jurassic Park (1993)  357.1
13. The Lord of the Rings: The Two Towers (2002)  341.8
14. Finding Nemo (2003)  339.7
15. Spider-Man 3 (2007)  336.5
16. Forrest Gump (1994)  329.7
17. The Lion King (1994)  328.5
18. Shrek the Third (2007)  321.0
19. Harry Potter and the Sorcerer’s Stone (2001)  317.6
20. The Lord of the Rings: The Fellowship of the Ring (2001)  314.8
21. Star Wars Episode II—Attack of the Clones (2002)  310.7
22. Star Wars Episode VI—Return of the Jedi (1983)  309.2
23. Transformers (2007)  309.0
24. Pirates of the Caribbean: At World’s End (2007)  308.3
25. Independence Day (1996)  306.2

Program  Avg. Audience (%)
1. American Idol—Wednesday  17.3
2. American Idol—Tuesday  16.8
3. Dancing with the Stars  13.3
4. Dancing with the Stars—Monday  12.7
5. Dancing with the Stars—Results Show  12.6
6. CSI  12.2
7. Grey’s Anatomy  12.1
8. Dancing with the Stars—Tuesday  11.8
9. House  11.0
10. Desperate Housewives (tied with) NBC Sunday Night Football  10.8
12. CSI: Miami  10.7
13. FOX NFL Sunday  9.5
14. Without a Trace  9.4
15. Deal or No Deal—Monday  9.2
16. Survivor: Cook Islands (tied with) Two and a Half Men  9.1
18. NCIS  9.0
19. Cold Case (tied with) CSI: NY  8.9
21. Criminal Minds  8.8
22. 60 Minutes (tied with) Shark  8.7
24. Survivor: Fiji  8.4
25. Lost  8.3

43. How might you present three sets of data in the same graph?


44. The following table is an example of a double-stem display. Without having this table explained to you, try to interpret its meaning and list the data items in the table.

(201–250)  2  34 45 49
(251–300)  2  57 68 77 82
(301–350)  3  23 45
(351–400)  3  62 73 78
(401–450)  4  12 34
(451–500)  4  82 89 93

45. You may find it interesting to search the Internet for “how to lie with statistics.” In addition to many references to Darrell Huff’s classic book How to Lie With Statistics, you will find other sites that explain how to use statistics to mislead your audience. Find a site that interests you and report on your findings.
