Chapter 3 Frequency Distributions

3.1 Distributions as Tables

3.2 Distributions as Graphs
Histogram • Frequency Polygon

3.3 Eyeball-estimation

3.4 The Shape of Distributions
Describing Distributions • The Normal Distribution

3.5 Eyeball-calibration

3.6 Bar Graphs of Nominal and Ordinal Variables

3.7 Connections
Cumulative Review • Computers • Homework Tips

Exercises for Chapter 3

Personal Trainer
LECTLET 3A: Frequency Distributions as Tables
LECTLET 3B: Frequency Distributions as Graphs
LABS: Lab for Chapter 3
REVIEWMASTER 3A
RESOURCE 3A: Stem and Leaf Displays
RESOURCE 3X: Additional Exercises


ch03.indd 33 6/7/17 4:03 PM


Learning Objectives
❶ How is a tabular frequency distribution constructed?
❷ What are class intervals and what role do they play in the development of grouped frequency distributions?
❸ What are two graphical methods of representing interval/ratio data from grouped frequency distributions?
❹ What steps are involved in eyeball-estimating frequency distributions?
❺ How are these terms used to describe the shape of distributions: unimodal, bimodal, symmetric, positively skewed, negatively skewed, asymptotic, normal?
❻ How is a bar graph similar to and different from a histogram? For what kind of data is each used?

There are three main concepts in statistics: frequency distributions, the distribution of the means, and the test statistic. Here we focus on frequency distributions and how they are displayed: in tables (as ungrouped and grouped frequency distributions) and in graphs (as histograms and frequency polygons). You will learn to eyeball-estimate frequency distributions and the terminology for describing the shapes of distributions: unimodal, bimodal, symmetric, positively and negatively skewed, asymptotic, and normal.

Table 3.1 shows the grouped frequency distributions and Figure 3.1 shows the histograms for the intellectual growth scores in the Pygmalion study of Chapter 1, where Rosenthal and Jacobson (1968) led teachers to believe that “bloomers” would show IQ spurts, but that “other” children would not spurt. Actually, the bloomers were a random sample of students, no different, on average, from the other children. The grouped frequency distributions (Table 3.1) and histograms (Figure 3.1) show that the most frequently occurring IQ gain among the bloomers was between 11 and 20 IQ points, whereas the most frequently occurring IQ gain among the other children was between 1 and 10 IQ points. It is also easy to see that there is considerable overlap between the bloomer and other children’s intellectual growth.

TABLE 3.1 Grouped frequency distributions of intellectual growth (IQ gain) for Oak School bloomers and other children*

                  Frequency
IQ Gain       Bloomers   Others
61 – 70           1         0
51 – 60           0         0
41 – 50           0         0
31 – 40           0         1
21 – 30           2         2
11 – 20           6        14
1 – 10            1        18
–9 – 0            2         9
–19 – –10         0         3
                 12        47

*Based on Rosenthal & Jacobson (1968)

frequency distribution
enumeration (ee∙NEW∙mer∙A∙shun)
histogram
frequency polygon
unimodal (YOU∙ni∙MO∙dull)
bimodal (BYE∙MO∙dull)
symmetric
skewed
asymptotic (ASS∙sim∙TOT∙tic)
normal
bar graph


FIGURE 3.1 Histograms of intellectual growth (IQ gain) for Oak School bloomers and other children (Based on Rosenthal & Jacobson, 1968)
[Two histogram panels, “IQ Gain of Bloomers” and “IQ Gain of Others”; vertical axis f, from 0 to 20.]

The break in the X-axis shows that the Y-axis does not intersect at X = 0 as is customary.


We could have made the same observations by inspecting the data in Chapter 1, but the frequency distributions described in this chapter make it easier.

Recall that in Chapter 1 we pointed out the three major concepts in statistics: (1) what a “distribution of a variable” is and how to describe it, (2) what a “distribution of means” is and how it is related to the distribution of a variable, and (3) what a “test statistic” is and how it is related to the distribution of means. This chapter (and also Chapters 4 and 5) focuses on (1) the distributions of variables.

Our first task is to convince ourselves that understanding frequency distributions will make it simpler for us to think about and communicate about data. Suppose, for example, that we are interested in the weights of male students in our statistics class. There are 25 men in the class, and we measure each man’s weight, with the results shown in Figure 3.2, where each man is represented by a square.



FIGURE 3.2 Weights (lb) of male students as they sit in the classroom
[Each of the 25 men is drawn as a square containing his weight; the same 25 values are listed in Table 3.2.]

TABLE 3.2 Enumeration of male students’ weights (lb)

135  180  190  137  154
149  164  185  173  163
162  157  161  173  180
179  197  159  182  164
164  144  152  150  163

In an enumerated list, you have to hunt for the largest or smallest values.


Now suppose our friend Jack asks how heavy the male students in our class are. We could simply list the data: “The first student weighs 135 pounds, the next student weighs 180 pounds, the third student weighs . . . , and the last student weighs 163 pounds.” This kind of listing is called an enumeration.

Enumerations can be printed, as shown in Table 3.2. An enumeration is a perfectly accurate answer to Jack’s question about student weights, but it is probably an undesirable answer both because it is too long (we give Jack more information than he wants) and because it does not highlight the important characteristics of the distribution (it does not, for example, make it easy to discover the largest or the smallest weight, or to see which weights occur most frequently).

Statisticians make the answers to questions such as Jack’s more informative by using frequency distributions in the form of tables or graphs.

Click Lectlets and then 3A in the Personal Trainer for an audiovisual discussion of Sections 3.1 and 3.2.

3.1 Distributions as Tables

A tabular frequency distribution is a table that lists the numerical values of a variable in a logical order along with the frequency of each value. A variable, as we saw in Chapter 2, is that characteristic of interest that can take on different values for each participant, and the frequency (usually abbreviated f) is the number of times a particular value of the variable occurs. Table 3.3 shows the tabular frequency distribution of male students’ weights. The value 180 occurs twice (“has frequency 2”), as do 173 and 163, and the value 164 occurs three times (“has frequency 3”). All other frequencies for the weights listed in the table are 1. If a weight does not occur in the data of Table 3.2 (for example, 181 or 142), then its frequency is 0, and we omit it entirely in the Table 3.3 frequency distribution.

enumeration: Listing all points in a data set

tabular frequency distribution: An ordered listing of all values of a variable and their frequencies

Note that the sum of the frequencies (25 in our example) is shown at the bottom of the frequency column and is always equal to the number of entries in the original data set.
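The counting described above can be sketched in a few lines of Python. This is only an illustration, not part of the textbook's own materials: the function name `frequency_distribution` is our own, and the data are the 25 weights copied from Table 3.2.

```python
from collections import Counter

# Weights (lb) of the 25 male students, copied from Table 3.2.
weights = [
    135, 180, 190, 137, 154,
    149, 164, 185, 173, 163,
    162, 157, 161, 173, 180,
    179, 197, 159, 182, 164,
    164, 144, 152, 150, 163,
]


def frequency_distribution(values):
    """Return (value, f) pairs with the largest value first.

    Values that never occur (f = 0) simply do not appear in the result,
    matching the convention for ungrouped frequency distributions.
    (Hypothetical helper name, chosen for this illustration.)
    """
    counts = Counter(values)
    return sorted(counts.items(), reverse=True)


table = frequency_distribution(weights)
for value, f in table:
    print(f"{value:>6} {f:>3}")

# The frequencies must add up to the number of original observations.
print(f"{'Sum':>6} {sum(f for _, f in table):>3}")
```

The last printed line is the check described in the text: the sum of the f column equals the number of entries in the enumeration.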

Note also that the frequency distribution shows the values of the variable under consideration (weight) in order; the convention is to put the largest value first. This presentation simplifies our communication about the data. Now if Jack wants to know how heavy the men in our class are, we can immediately say, “The lightest is 135 pounds, the second lightest is 137 pounds, the third lightest is 144 pounds, . . . , and the heaviest is 197 pounds.” This is still a long and cumbersome answer, but the ordering makes it easier for Jack to gain some appreciation of how the weights stand in relation to each other. However, although this frequency distribution is more informative than a simple enumeration of the individual weights, it has disadvantages: It still provides a long list of weights and their frequencies, and it is relatively difficult to identify weights with 0 frequencies.

A grouped frequency distribution (also called a “frequency distribution using class intervals”) is more compact and efficient. Successive weights are grouped together into class intervals and the frequencies are reported for the intervals, not for the individual weights, as shown in Table 3.4. Note that the sum of the frequencies is still 25, the number of original observations.

The use of class intervals (130–139 pounds, etc.) generally makes a frequency distribution easier to understand. For example, it is now clear that the most frequent weights are in the 160s.

Class intervals must follow these rules: Each interval must be the same width (in Table 3.4, 10 pounds); the class intervals must be nonoverlapping and contiguous (for example, if the upper limit of one interval is 159 pounds, then the lower limit of the next must be 160 pounds); and it is generally desirable to have between 6 and 20 class intervals (exactly how many intervals is a matter of judgment, deciding which number of class intervals presents the data in the most informative manner).

Class intervals must be
◽ equal sized
◽ non-overlapping
◽ contiguous

grouped frequency distribution: A frequency distribution with adjacent values of the variable grouped together into class intervals

frequency: The number of times the particular value of a variable occurs

TABLE 3.3 Frequency distribution of male students’ weights

Weight (lb)    f
197            1
190            1
185            1
182            1
180            2
179            1
173            2
164            3
163            2
162            1
161            1
159            1
157            1
154            1
152            1
150            1
149            1
144            1
137            1
135            1
              25

In a frequency distribution, largest and smallest values are easy to spot.

Six steps are involved in creating a grouped frequency distribution:

STEP 1 Range. Find the range of the scores (highest score minus lowest score).

STEP 2 Number. Make a preliminary choice of the desired number of class intervals.

STEP 3 Width. Determine the interval width by dividing the range by the number of class intervals. Round the interval width in either direction to a convenient number, even if that means adjusting the number of class intervals you selected in Step 2.

STEP 4 Lowest. Determine the lower limit of the lowest interval. This must be chosen so that the lowest data point falls somewhere in the lowest interval; the intervals should have convenient limits.

STEP 5 Limits. Prepare a list of the limits of each class interval, beginning at the bottom of the table with the lowest score and proceeding upward. Be sure that the intervals are all the same width, are contiguous, and are nonoverlapping. Be sure that the highest interval contains the highest score.

STEP 6 Frequencies. Count the number of observations that occur in each interval and enter that count as the frequency of the interval.

Note that the word convenient or desired appears in Steps 2, 3, and 4. This indicates that there is no simple rule governing the choices to be made; instead, some judgment must be made about the intended communication and the selections chosen to make the clearest communication.
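The six steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a prescribed procedure: the function name `grouped_frequency_distribution` is our own, the scores are assumed to be whole numbers (so 135–139 is 5 wide), and the judgment calls of Steps 2 through 4 (rounding the width, choosing a convenient lower limit) are passed in by hand rather than automated.

```python
# Weights (lb) of the 25 male students, copied from Table 3.2.
weights = [
    135, 180, 190, 137, 154,
    149, 164, 185, 173, 163,
    162, 157, 161, 173, 180,
    179, 197, 159, 182, 164,
    164, 144, 152, 150, 163,
]


def grouped_frequency_distribution(values, lowest_limit, width):
    """Steps 4-6: return (lower, upper, f) rows, highest interval first.

    Hypothetical helper; assumes whole-number scores, so the interval
    lower..(lower + width - 1) contains exactly `width` possible values.
    """
    lowest_score = min(values)
    if not lowest_limit <= lowest_score <= lowest_limit + width - 1:
        raise ValueError("the lowest interval must contain the lowest score")
    rows = []
    lower = lowest_limit
    while lower <= max(values):          # stop once the highest score is covered
        upper = lower + width - 1
        f = sum(1 for v in values if lower <= v <= upper)
        rows.append((lower, upper, f))   # intervals with f = 0 are kept
        lower += width                   # equal width, contiguous, nonoverlapping
    return rows[::-1]


# Steps 1-3: range, preliminary number of intervals, preliminary width.
score_range = max(weights) - min(weights)
preliminary_width = score_range / 14
print(score_range, round(preliminary_width, 2))

# Our (judgment-based) choices: width 10 starting at 130, or width 5 starting at 135.
table_10 = grouped_frequency_distribution(weights, lowest_limit=130, width=10)
table_5 = grouped_frequency_distribution(weights, lowest_limit=135, width=5)
for lower, upper, f in table_10:
    print(f"{lower}-{upper} {f:>3}")
```

Calling the function with a different `lowest_limit` or `width` is a quick way to compare how alternative choices in Steps 2 through 4 change the table.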

We illustrate this procedure by preparing another grouped frequency distribution for our weight data. Suppose we inspect Table 3.4 and wonder whether we would have provided more information if we had used more class intervals than the seven shown there. We decide to prepare a grouped frequency distribution using twice as many class intervals, so we will use 14. We use the same steps.

In Step 1, we note that the highest weight is 197 pounds and the lowest is 135 pounds, so the range is 197 – 135 = 62.

TABLE 3.4 Grouped frequency distribution of male students’ weights using 10-pound class intervals

Class Interval    f
190–199           2
180–189           4
170–179           3
160–169           7
150–159           5
140–149           2
130–139           2
                 25

The largest values of X are at the top of all frequency distributions.

In Step 2, our preliminary choice of the number of class intervals this time is 14.

In Step 3, the interval width will be approximately the range divided by the number of intervals, or 62/14 = 4.43 pounds. We could round that to 4 or to 5 pounds, but we decide that 5 is more convenient because the lower limit of every second interval then ends in 0. That means we will have 13 instead of 14 intervals, but our statistical judgment is that using round numbers as interval limits more than offsets that disadvantage.

In Step 4, the lowest interval must contain the lowest point (135), so we could choose any of the following values as the lower limit of that interval: 131, 132, 133, 134, or 135 (or any intermediate decimal value if we desired, but that might imply that we measured weights to a greater precision than whole pounds). We select 135 pounds because it is a round number. The lowest interval is therefore 135–139 pounds. Note that that interval is 5 pounds wide (not 4 as it might appear) because it contains the five values 135, 136, 137, 138, and 139.

In Step 5, the first interval is 135–139. The second interval must have the same width and be contiguous, so it is 140–144. We continue creating intervals as shown in Table 3.5, ending with the highest interval (195–199), which includes the highest point (197 pounds).

In Step 6, we refer to our original frequency distribution (Table 3.3) and count the number of individuals whose weights fall into the listed classes. Two individuals (135 and 137) have weights in the first class (135–139), for example, so the frequency of that class is 2. We enter those frequencies in the right-hand column of Table 3.5, and the grouped frequency distribution using 5-pound intervals is completed. Double checking, we add down the list of frequencies; the sum must still be 25, the total number of data points.

Note that a grouped frequency distribution includes class intervals even if they have 0 frequency (for example, the interval 165–169 is included in Table 3.5). This is in contrast to an ungrouped frequency distribution (for example, Table 3.3), where values with 0 frequency are omitted.

TABLE 3.5 Grouped frequency distribution of male students’ weights using 5-pound class intervals

Class Interval    f
195–199           1
190–194           1
185–189           1
180–184           3
175–179           1
170–174           2
165–169           0
160–164           7
155–159           2
150–154           3
145–149           1
140–144           1
135–139           2
                 25

Note: intervals with zero frequency are included in grouped frequency distributions.


Choosing between these two grouped frequency distributions, the one with 10-pound intervals and the one with 5-pound intervals, is a matter of judgment. If we wish to convey that the frequency of weights rises to a maximum in the 160-pound range and then decreases again, we would probably prefer the 10-pound intervals (Table 3.4). But if we wish to demonstrate that no individuals weigh between 165 and 170 pounds, we would prefer the 5-pound intervals (Table 3.5). The choice is a matter of judgment and communication.

Note that some information is lost in a grouped frequency distribution. For example, Table 3.4 tells us that five weights are between 150 and 159, but we have no way of knowing whether those five individuals all weigh 150 or all weigh 159 or are spread over that range. Tables 3.4 and 3.5 have improved the clarity and ease of comprehending the data in comparison with Table 3.3, but they have sacrificed some detail in so doing.

Click Lectlets and then 3B in the Personal Trainer for an audiovisual discussion of Sections 3.2 through 3.6.

3.2 Distributions as Graphs

Histogram

The grouped frequency distributions we have shown as tables can also be represented graphically by histograms or frequency polygons. A histogram is a plot of the class intervals of the variable (weight, in our example) on the horizontal axis (sometimes called the X-axis or abscissa), with the frequency of each interval represented as a vertical bar. Figure 3.3 shows the histogram obtained from Table 3.4, which used 10-pound class intervals. Note that each interval’s frequency is displayed as a rectangle or “bar,” and there is no space (only a single vertical line) separating the bars.

histogram: A graphical presentation of a grouped frequency distribution with frequencies represented as vertical bars; it is appropriate for interval/ratio data

FIGURE 3.3 Histogram of male students’ weights
[Histogram: horizontal axis “Weight (lb)” with class intervals 130–139 through 190–199; vertical axis f, from 0 to 8.]

Check:
◽ Axes labeled (including units)
◽ Class intervals under bars
◽ Bars share common borders

Here are the steps used in creating a histogram:

STEP 1 Begin. Begin with a grouped frequency distribution (Table 3.4 for our weight example). We could have used Table 3.5 if we had preferred.

STEP 2 Axes. Draw and label the axes of the histogram. It is conventional to make the vertical axis about two-thirds to three-quarters the length of the horizontal axis. The vertical axis should be labeled “f.” The horizontal axis should be labeled with the name of the variable, “Weight” in our example, with the unit of measurement in parentheses, “(lb).” Enter the class interval ranges on the horizontal axis. They should be equally spaced because the intervals in the grouped frequency distribution should have been equal sized.

STEP 3 Break? It is conventional to have the vertical axis intersect the horizontal axis at the point where the value of the variable is 0. If this is not the case in your histogram, then indicate that with a break in the horizontal axis. The vertical axis in Figure 3.3 seems to intersect the horizontal axis at about 120 pounds, not 0, so a break is indicated in the axis.

STEP 4 Edges. Draw vertical lines at the edges of the class intervals to form the edges of the histogram bars. Note that the edges of each bar are single lines, indicating that there is no space between the values that the bars represent.

STEP 5 Heights. The height of each vertical bar should equal the frequency of the values in that interval.

Now when Jack asks us about the weights of the male students in our class, we can answer by showing him this histogram. This may be a more concise summary for Jack because he can see at a glance that the lowest weight is about 130 pounds, the highest weight is about 200 pounds, and most of the weights cluster around 160 or 170 pounds.
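To see the steps for creating a histogram in action without drawing software, here is a rough text-only sketch in Python. It is an illustration only: the function name `text_histogram` and the character layout are our own choices, the frequencies are those of Table 3.4, and the axis break of Step 3 is not drawn.

```python
# Class intervals and frequencies from Table 3.4, lowest interval first.
table_3_4 = [
    ("130-139", 2), ("140-149", 2), ("150-159", 5), ("160-169", 7),
    ("170-179", 3), ("180-189", 4), ("190-199", 2),
]


def text_histogram(rows, variable_label):
    """Return the lines of a character-based histogram (hypothetical helper).

    rows: (class interval label, f) pairs in increasing order of the variable.
    Neighboring bars share a single '|' border, so there is no gap between them.
    """
    top = max(f for _, f in rows)
    width = max(len(label) for label, _ in rows)
    lines = ["  f"]                                   # vertical-axis label
    for level in range(top, 0, -1):                   # draw from the top down
        cells = ["#" * width if f >= level else " " * width for _, f in rows]
        lines.append(f"{level:>2} |" + "|".join(cells) + "|")
    lines.append("   +" + "+".join("-" * width for _ in rows) + "+")
    lines.append("    " + " ".join(label.center(width) for label, _ in rows))
    lines.append("    " + variable_label)             # horizontal-axis label with units
    return lines


lines = text_histogram(table_3_4, "Weight (lb)")
print("\n".join(lines))
```

Each bar's height in rows of `#` equals the frequency of its interval (Step 5), and the class intervals appear directly under the bars, as the checklist for Figure 3.3 requires.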

The histogram based on class intervals presents a relatively complete, easily comprehensible picture of our data, and in fact histograms are among the most widely used ways of describing sets of data.

Frequency Polygon

Another, entirely equivalent, graphical presentation of data is the frequency polygon. Recall that a histogram is a plot of a series of rectangular bars, where each bar’s width is the width of the class interval and the bar’s height is the frequency of that class interval (see Figure 3.3). A frequency polygon is simply a transformation of the histogram obtained by substituting a single point for each bar and then connecting the points with straight line segments. Figure 3.4 is the histogram of Figure 3.3 with the equivalent frequency polygon superimposed. Figure 3.5 then shows the histogram removed, leaving the frequency polygon by itself. It is customary to connect the first and last dots to the horizontal axis with diagonal dashed lines. The frequency polygon is thus a second form of graphical frequency distribution; it contains exactly the same information as does the histogram.

frequency polygon: A graphical presentation of a grouped frequency distribution with frequencies represented as points; it is appropriate for interval/ratio data

Why no space between the bars? Because the class intervals are contiguous.

In my surveys, most, but not all, people prefer histograms to tables.

FIGURE 3.4 Transforming a histogram into a frequency polygon (The completed frequency polygon is shown in Figure 3.5.)
[Histogram of Figure 3.3 with the frequency polygon superimposed: horizontal axis “Weight (lb)” with class intervals 130–139 through 190–199; vertical axis f, from 0 to 8.]

FIGURE 3.5 Frequency polygon of male students’ weights
[Frequency polygon: horizontal axis “Weight (lb)” with midpoints 134.5, 144.5, 154.5, 164.5, 174.5, 184.5, 194.5; vertical axis f, from 0 to 8.]

Note that the values on the X-axis of a frequency polygon are shown as single points plotted at the midpoints of the class intervals rather than displaying the class intervals beneath the bars as was the case in the histogram. For example, the interval 160–169 pounds has the midpoint (160 + 169)/2 = 164.5. The dot of the frequency polygon is plotted directly above 164.5.
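The midpoint calculation can be sketched in Python. This is only an illustration: the function name `polygon_points` is our own, and the rows are the class intervals and frequencies of Table 3.4.

```python
# (lower limit, upper limit, f) rows from Table 3.4.
table_3_4 = [
    (190, 199, 2), (180, 189, 4), (170, 179, 3), (160, 169, 7),
    (150, 159, 5), (140, 149, 2), (130, 139, 2),
]


def polygon_points(rows):
    """Return (midpoint, f) pairs, lowest interval first (hypothetical helper).

    Each dot of the frequency polygon sits above the midpoint of its class
    interval, computed as (lower limit + upper limit) / 2.
    """
    return [((lower + upper) / 2, f) for lower, upper, f in sorted(rows)]


points = polygon_points(table_3_4)
for midpoint, f in points:
    print(f"{midpoint:>6} {f:>3}")
```

The midpoints produced this way are the values that label the X-axis of Figure 3.5; connecting the successive points with straight line segments gives the polygon.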

We make three important observations about histograms and frequency polygons. First, they both contain the identical information found in a grouped frequency distribution table. Therefore, the choice of using a histogram, a frequency polygon, or a grouped frequency distribution table is based entirely on clarity of presentation. Use whichever form presents your data most effectively.

Check:
◽ Axes labeled (including units)
◽ Midpoints at tic marks
◽ Dots above midpoints
◽ Dots connected with straight lines
◽ Dashed lines diagonally to 0

The histogram and frequency polygon are both shown so that you can see the relationships. In actual practice, choose one or the other, not both.

Second, we have plotted the frequencies on the vertical axis of the histograms and frequency polygons. In some cases, particularly when the number of observations is large, it is desirable to plot a measure of relative frequency, such as a proportion or percentage, on that axis. For example, because there are 25 men in our weight data, each man represents 1/25, or 4%, of the total sample. Thus, if we wished, the label on the vertical axis in Figures 3.3, 3.4, or 3.5 could be altered by replacing “f” with “Percentage” and replacing the values “1, 2, 3, . . .” with “4%, 8%, 12%, . . .” The new graph would convey the same information as the original.
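The conversion from frequency to relative frequency is a single division, sketched here in Python. As before, this is an illustration under our own naming (`relative_frequencies` is a hypothetical helper), applied to the frequencies of Table 3.4.

```python
# Class intervals and frequencies from Table 3.4.
table_3_4 = [
    ("190-199", 2), ("180-189", 4), ("170-179", 3), ("160-169", 7),
    ("150-159", 5), ("140-149", 2), ("130-139", 2),
]


def relative_frequencies(rows):
    """Return (label, f, percentage) rows, where percentage = 100 * f / N.

    N is the size of the group, that is, the sum of all frequencies.
    """
    n = sum(f for _, f in rows)
    return [(label, f, 100 * f / n) for label, f in rows]


percent_table = relative_frequencies(table_3_4)
for label, f, pct in percent_table:
    print(f"{label} {f:>3} {pct:>5.0f}%")
```

With N = 25, every unit of frequency is worth 4 percentage points, so relabeling the vertical axis changes the scale but not the shape of the graph.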

Third, as sample sizes become larger, the width of each bar in a histogram (or the distance between each point in a frequency polygon) can be made narrower and narrower. Then the shape of the figure becomes, in general, smoother and smoother. Figure 3.6 shows such a histogram. If we imagine an infinitely large sample with infinitely narrow class intervals, then the histogram or frequency polygon becomes a smooth curve.

Click Resources and then 3A in the Personal Trainer for one more important but optional way of presenting frequency distributions of interval/ratio data. The stem and leaf display combines some of the best properties of tables and graphs, but is not as yet widely used.

Click Lectlets and then 3B in the Personal Trainer for an audiovisual discussion of Sections 3.3 through 3.6.

relative frequency: Frequency divided by the size of the group, expressed as a proportion or percentage

The concept of relative frequency will be important in Chapter 6.

FIGURE 3.6 Histogram of male students’ weights with a very large sample size and very narrow class intervals
[Histogram: horizontal axis “Weight (lb)” labeled from 110 to 260 in steps of 10; vertical axis f.]

Generally, large samples and narrow intervals give smoother histograms.

3.3 Eyeball-estimation

We have followed one set of data, the weights of a group of students, from the raw data stage (Table 3.2) to the histogram (Figure 3.3) and frequency polygon (Figure 3.5) graphical presentations of those data. We have seen that histograms and frequency polygons are simpler, more informative ways of communicating about a distribution of data than is the enumeration of the raw data themselves. Now it is our task to become skilled in visualizing distributions even when we do not have the raw data to begin with. We want to be able to sketch a reasonably accurate frequency distribution “by eyeball,” based on what we know about the world in general. (In this textbook, we will highlight descriptions of the process of “eyeball-estimation” by using the eyeglass logo shown at the beginning of this section.)

Now we are starting over with the weights of male students. Forget for the moment that we ever saw the data in Figure 3.2. Our new task is to attempt to sketch the distribution of weights of students in a statistics class based on what we know about students in general, not based on any data about any particular class that we might have seen. The steps in this sketching-by-eyeball procedure are listed next.

STEP 1 Axes. Draw and label the axes. A frequency distribution always has the variable that we are interested in plotted on the X-axis; in our case, the variable is “Weight (lb),” which should be the label of the X-axis. Do not forget the unit of measurement (“lb”) in the variable label because we will see in later chapters that keeping the units clearly in mind is essential to understanding statistics. The vertical axis of a frequency distribution is always the frequency and is customarily labeled “f.” The vertical axis is customarily drawn to be about two-thirds to three-quarters the length of the horizontal axis. See Figure 3.7. Note that our eyeballed distribution is an abstraction in the sense that it does not refer to any particular sample of values or to any particular sample size, so the f-axis does not have numerical values.

STEP 2 Min and Max. Plot your estimates of the minimum and maximum values of the variable. What do you think is the smallest man’s weight in a college class? My estimate is about 120 pounds; your estimate might be a bit higher or lower depending on your experience. Our task here is to be approximate, not exact. Label the lowest weight (120 pounds) near the left end of the X-axis. If we think that the lowest weight is 120 pounds, then the frequency of any value (weight) below 120 must be 0, and the plot of frequencies should begin to rise at about the value 120 pounds. Repeat the process for the largest weight likely to be found in a college class, putting that value near the right end of the X-axis. My estimate is 220 pounds. See Figure 3.8.

STEP 3 Intermediate. Add intermediate values to the X-axis, using round numbers. Keep the intervals between the values equal, and make the distances between those values on the X-axis equal. In our example, we need round numbers between 120 and 220 with equal intervals between them; thus, 140, 160, 180, and 200 are appropriate and are equally spaced on the X-axis. See Figure 3.9.

STEP 4 Mode. Plot estimate(s) of the mode(s) of the frequency distribution. The mode is the value of the variable (weight in our example) that occurs most frequently in a distribution; there can sometimes be more than one mode. What weight or small range of weights do you think is likely to occur most frequently? It seems to me that the most frequently occurring weight among college men is about 170 pounds (your estimate, and the truth, might be somewhat higher or lower). With this assumption, the frequency distribution must reach its maximum at 170 and decrease in both directions from there. Therefore, we locate a point above 170 pounds on our frequency distribution and sketch a little peak at that point to indicate that the frequency will decrease as weight either increases or decreases from 170 pounds (see Figure 3.10). The absolute height of this peak is not important in this sketch. If there are two or more modes, then the relative height is important: The more frequently occurring mode should be higher.

STEP 5 Complete. Connect the parts of the sketch that you have just created, keeping the distribution smooth and continuous (unless there is good reason to have a discontinuity in the distribution). Nature as a rule does not change abruptly, and distributions of variables that occur in nature are therefore usually smooth. See Figure 3.11.

FIGURE 3.7 Eyeball-estimating the frequency distribution: Step 1, drawing and labeling the axes
[Axes only: horizontal axis “Weight (lb)”; vertical axis f.]

FIGURE 3.8 Eyeball-estimating the frequency distribution: Step 2, plotting minimum and maximum values
[Axes: horizontal axis “Weight (lb)”; vertical axis f.]

FIGURE 3.9 Eyeball-estimating the frequency distribution: Step 3, labeling intermediate values on the X-axis
[Axes: horizontal axis “Weight (lb)” labeled 120, 140, 160, 180, 200, 220; vertical axis f.]

FIGURE 3.10 Eyeball-estimating the frequency distribution: Step 4, sketching the mode
[Axes: horizontal axis “Weight (lb)” labeled 120, 140, 160, 180, 200, 220; vertical axis f.]


Our sketch of the distribution of men’s weights in college classes is now complete. How does our sketch (which we created without considering any particular data) compare with the histogram and frequency polygon built from actual data and shown in Figures 3.3 and 3.5? First, our sketched curve is smoother than either the histogram or the frequency polygon. The rough edges or sudden changes in direction in the plots of Figures 3.3 and 3.5 are artifacts of the small sample size and wide class intervals. As we saw in Figure 3.6, the larger the sample size and the narrower the class intervals, the smoother the histogram and frequency polygon. Our eyeball-estimated frequency distribution is smooth, as if we had extremely narrow class intervals.

Second, the minimum and maximum values in our estimate are close to, but not exactly the same as, those in the data. The difference could be because the data of Table 3.2 represented just one classroom whereas our sketch represented classrooms in general. If Table 3.2 had been of a different single classroom, then the minimum and maximum values of the data would likely have been different. Alternatively, it is possible that our eyeball is somewhat mistaken in estimating the minimum and maximum values of this variable.

Third, both the sketch and the data have the mode at about 170 pounds. Our estimate of the peak for men’s weights was corroborated by the data. In general, our sketch has roughly the same characteristics as the data.

3.4 The Shape of Distributions

Describing Distributions

As we sketch frequency distributions, it will be convenient to have some terms that describe the shapes of the distributions.

The first two terms describe the number of modes the distribution has. Recall that the mode is the most frequently occurring value in a distribution; it lies directly below the highest point on a graphical frequency distribution. A distribution that has one mode is called unimodal. Our sketch of the distribution of men’s weights in Figure 3.11 is unimodal.

FIGURE 3.11 Eyeball-estimating the frequency distribution: Step 5, completing the sketch with a smooth curve
[Axes: horizontal axis “Weight (lb)” labeled 120, 140, 160, 180, 200, 220; vertical axis f.]

The eyeball philosophy: Quick, easy approximations exercise your comprehension. You don’t have to be exact (that’s why we do computations).

A bimodal distribution has two modes. For example, we might expect the distribution of weights of college students in general to have two modes, one most frequently occurring weight for the men and one most frequently occurring weight for the women, as shown in Figure 3.12. Most distributions in nature are unimodal; bimodal distributions occur only when there is a splitting of a population into two relatively distinct parts. Being bimodal does not imply no overlap between the distributions (for example, in our weight example some men are lighter than the female mode and some women are heavier than the male mode).
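The strict definition of the mode (the most frequently occurring value) can be checked on the weights of Table 3.2 with a short Python sketch. The function name `modes` is our own, and the sketch makes one simplifying assumption that should be kept in mind: it reports only values tied for the highest frequency, so it does not detect the looser sense of “bimodal” used above, where two peaks may have unequal heights.

```python
from collections import Counter

# Weights (lb) of the 25 male students, copied from Table 3.2.
weights = [
    135, 180, 190, 137, 154,
    149, 164, 185, 173, 163,
    162, 157, 161, 173, 180,
    179, 197, 159, 182, 164,
    164, 144, 152, 150, 163,
]


def modes(values):
    """Return every value tied for the highest frequency, in increasing order.

    Hypothetical helper: one returned value suggests a unimodal sample;
    two returned values mean two values are exactly tied for most frequent.
    """
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, f in counts.items() if f == top)


weight_modes = modes(weights)
print(weight_modes)

# A small made-up list (not from the text) with two tied modes.
tied_modes = modes([1, 1, 2, 3, 3])
print(tied_modes)
```

In a sample as small as 25, the exact modal value depends heavily on chance, which is one reason the grouped distributions and smooth sketches of this chapter are more informative about shape.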

We call a distribution symmetric if the right side is a mirror image of the left side. The unimodal distribution of men’s weights in Figure 3.11 is symmetric; the bimodal distribution of men’s and women’s weights in Figure 3.12 is also approximately symmetric.
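The three terms introduced so far can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the distribution is given as a list of frequencies for equally spaced values, the function names and example frequencies are hypothetical, and treating any frequency that is higher than both of its neighbors as a "mode" is a simplification rather than a definition from this text.

```python
def count_peaks(freqs):
    """Count local peaks: frequencies strictly higher than both neighbors."""
    padded = [0] + list(freqs) + [0]  # beyond each end of the axis, f = 0
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]
    )


def is_symmetric(freqs):
    """True when the right side is an exact mirror image of the left side."""
    freqs = list(freqs)
    return freqs == freqs[::-1]


# Hypothetical frequencies, for illustration only.
one_peak = [1, 3, 6, 3, 1]
two_peaks = [1, 4, 2, 1, 2, 4, 1]

for name, freqs in [("one_peak", one_peak), ("two_peaks", two_peaks)]:
    n = count_peaks(freqs)
    label = {1: "unimodal", 2: "bimodal"}.get(n, f"{n} peaks")
    shape = "symmetric" if is_symmetric(freqs) else "not symmetric"
    print(name, label, shape)
```

Real data are rarely exact mirror images, so in practice "symmetric" is judged approximately, as in the discussion of Figure 3.12.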

FIGURE 3.12 A bimodal distribution: weights of college students (both men and women) (X-axis: Weight (lb), 100 to 220; Y-axis: f)

unimodal distribution: A distribution that has one most frequently occurring value

bimodal distribution: A distribution that has two most frequently occurring values

symmetric distribution: A distribution whose left side is a mirror image of its right side

unimodal (YOU∙ni∙MO∙dull)

bimodal (BYE∙MO∙dull)

FIGURE 3.13 A positively skewed distribution

FIGURE 3.14 A negatively skewed distribution


We call a distribution skewed if one “tail” (the side of a distribution) is longer than the other: thus, skewed distributions must be asymmetric. Distributions whose right-hand tail is longer than the left are positively skewed (see Figure 3.13), and those whose left-hand tail is longer are negatively skewed (see Figure 3.14).
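As a rough illustration of the "longer tail" idea, the sketch below compares the distance from the mode to the largest value with the distance from the mode to the smallest value. It assumes a unimodal distribution stored as a value-to-frequency table; the function name, the `tolerance` cutoff, and the example data are all hypothetical, and this comparison is a simplification for eyeball purposes, not a formal measure of skew.

```python
def describe_skew(freq_table, tolerance=0.10):
    """Compare the tail lengths on either side of the mode.

    freq_table maps each value of X to its frequency f.
    tolerance is a hypothetical cutoff: tails whose lengths differ by no
    more than this fraction of the total range are treated as equal.
    """
    values = sorted(freq_table)
    mode = max(values, key=lambda x: freq_table[x])  # value with the highest f
    left_tail = mode - values[0]
    right_tail = values[-1] - mode
    spread = values[-1] - values[0]
    if spread == 0 or abs(right_tail - left_tail) <= tolerance * spread:
        return "tails of roughly equal length"
    return "positively skewed" if right_tail > left_tail else "negatively skewed"


# Hypothetical frequency tables, for illustration only.
long_right_tail = {1: 2, 2: 9, 3: 6, 4: 4, 5: 2, 6: 1, 7: 1, 8: 1}
long_left_tail = {1: 1, 2: 1, 3: 1, 4: 2, 5: 4, 6: 6, 7: 9, 8: 2}

print(describe_skew(long_right_tail))
print(describe_skew(long_left_tail))
```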

We call the tail of a distribution asymptotic if it gradually approaches the X-axis but never actually touches it. (To be truly asymptotic is therefore a mathematical abstraction; all real distributions have a lowest value below which the frequency is 0 and thus where the distribution actually meets the X-axis.) The distribution of Figure 3.14 is asymptotic in the left tail but not asymptotic in the right tail.

The Normal Distribution

For reasons that will become clearer in Chapter 7, the distributions of many variables that occur in nature have a characteristic form called the normal distribution. The heights of people (or trees or dandelions), IQs of people, maze-running speeds of rats, and gains or losses on the stock exchange are all approximately normally distributed variables. This section will acquaint you with the general shape of normal distributions, and later chapters will fill in the computational details.

Normal distributions are unimodal, symmetric, and asymptotic, which means that they have the general bell shape shown in Figures 3.15, 3.16, and 3.17. Normal distributions can be tall or short, narrow or wide, but they must have the same general bell shape.

You may wonder why the normal distribution is called asymptotic when the curves drawn in Figures 3.15, 3.16, and 3.17 seem to touch the X-axis. Being asymptotic is a mathematical ideal, impossible to represent accurately in those portions of the curve where the correct height is less than the width of the ink lines. Figures 3.15, 3.16, and 3.17 are in fact precisely drawn normal distributions. (There is a widespread misconception, however, that normal distributions should seem to “float” above the X-axis; a misconception created by textbooks that incorrectly illustrate normal distributions.)
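To see how quickly the curve sinks below the width of an ink line, the sketch below computes the height of a normal curve as a fraction of its peak height. It relies on the standard formula for the normal curve and on the standard deviation as the unit of distance, neither of which this chapter has introduced yet, so treat it as a preview under those assumptions; the function name is hypothetical.

```python
import math


def relative_height(z):
    """Height of a normal curve at z standard deviations from its center,
    expressed as a fraction of the height at the peak."""
    return math.exp(-z * z / 2)


for z in range(0, 7):
    print(f"{z} standard deviations from center: "
          f"{relative_height(z):.8f} of peak height")
```

At 4 standard deviations from the center the curve is about 0.0003 of its peak height. On a drawing whose peak is 5 cm tall (an illustrative size), that is roughly 0.002 cm above the axis, which is why a correctly drawn curve appears to touch the X-axis even though the height never reaches exactly 0.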

For contrast, Figures 3.18 through 3.20 present non-normal distributions. As their captions show, these curves each have only two of the three characteristics required for normal distributions. Figure 3.21 shows a curve with all three characteristics, and yet it is still not actually normal.

FIGURE 3.15 A normal distribution

FIGURE 3.16 Another (wider) normal distribution

FIGURE 3.17 A third (narrower) normal distribution

FIGURE 3.18 A non-normal distribution (unimodal and symmetric but non-asymptotic)

FIGURE 3.19 A non-normal distribution (unimodal and asymptotic but not symmetric)

FIGURE 3.20 A non-normal distribution (symmetric and asymptotic but bimodal)

FIGURE 3.21 A non-normal distribution (unimodal, symmetric, and asymptotic but too flat on top)

positively skewed distribution: A distribution whose right tail is longer than its left tail

negatively skewed distribution: A distribution whose left tail is longer than its right tail

asymptotic: Gradually approaching the X-axis

normal distribution: A class of distributions that are unimodal, symmetric, and asymptotic

asymptotic (ASS∙sim∙TOT∙tic)


3.5 Eyeball-calibration

Let’s have some practice in eyeball-estimating frequency distributions. (In this text, we call such practice “eyeball-calibration” and indicate it with the glasses-and-ruler logo shown at left.) Suppose we are interested in the distribution of the speeds of automobiles on a section of an interstate highway where the speed limit is 65 miles per hour (mph). In Step 1 of our procedure, we draw and label the axes. The horizontal axis is labeled “Speed (mph)” because that is the variable of interest, and the vertical axis is the “frequency,” which we customarily abbreviate as “f.” In Step 2, we ask ourselves what is the slowest likely speed on such an interstate, and we might guess about 50 mph. (Note that we are making some implicit assumptions here, such as the absence of hills or snowy road conditions. One of the tasks of science is to make such assumptions explicit, but that is a topic of research design, not of statistics itself.) The maximum speed is perhaps about 80 mph. We mark the minimum and maximum speeds on the X-axis. In Step 3, we add the intermediate values to the X-axis: 60 and 70 mph. In Step 4, we enter our estimate of the mode, which we might guess to be about 65 mph. Then in Step 5, we connect the points to form the complete sketch shown in Figure 3.22. Our sketched distribution is more or less normally shaped.

FIGURE 3.22 Eyeball-estimate of the frequency distribution of speeds on an interstate highway (X-axis: Speed (mph), 50 to 80; Y-axis: f)

The general steps for this procedure are found on pages 38–40.

As another example, let’s eyeball-estimate the distribution of the length of college students’ hair. In Step 1, we draw and label the axes: The X-axis is “Hair length (in.)” and the Y-axis is “frequency” or “f.” In Step 2, we plot the minimum length, which is 0 inches, and the maximum length, which we take to be in the neighborhood of 40 inches. In Step 3, we enter the intermediate values: 10, 20, and 30 inches. In Step 4, we eyeball-estimate the mode. But now there would be two modes: one at about 3 or 4 inches (mostly males), and one somewhere between about 10 and 20 inches (mostly females). We put a peak in the graph above each of those values: the hair-length distribution is bimodal. In Step 5, we connect the points as shown in Figure 3.23. This distribution is bimodal and positively skewed. (Note that the peak of the distribution is higher for the (mostly) males than for the (mostly) females, because there is less variability in men’s hair length than in women’s. We will return to that discussion in Chapters 5 and 6.)

The exercises for this chapter will give you the chance to convince yourself that you can sketch by eyeball relatively accurate frequency distributions of all sorts of variables, ranging from the heights of basketball players to the lengths of McDonald’s french fries. Being able to sketch an approximately accurate frequency distribution is evidence that you understand what a frequency distribution is and that you have some knowledge of the way variables are distributed in the real world.

Statistics, to be sure, are frequently used to describe variables whose distributions we do not know; that’s one way scientists learn about the real world. However, at this stage in learning to understand statistics, we are focusing on variables that are familiar. Your task is to recognize that there are variables with distributions everywhere you look, and that if you are familiar with the variable, you can describe its distribution with at least some accuracy. I want you, as you’re walking across campus, to identify variables such as the height of pine trees, or the number of books students carry, or the amount of time students take to walk from the library to the student union, or the number of miles on the odometers of cars in the parking lot, or the number of times the word the appears per page in your textbooks, or. . . . I want you to visualize the frequency distributions of those variables. I want you to say to yourself as you walk things like, “The shortest pine trees on this campus seem to be about 4 feet and the tallest about 40 feet, and the distribution seems to be symmetric with the mode at about 20 feet.” That is, I want you, for the moment, to practice seeing distributions of heights of trees, rather than the trees themselves, and distributions of the number of books, rather than the books themselves.

FIGURE 3.23 Eyeball-estimate of the frequency distribution of hair length (X-axis: Hair Length (in.), 0 to 40; Y-axis: f)

Review: Three main concepts:
1. Distribution of a variable
2. Distribution of means
3. Test statistic
We are focusing here on distributions of variables.


We shall see throughout the remainder of this textbook that statistics is almost entirely the study of distributions and their characteristics. If we are to understand statistics, we must become confidently familiar with distributions, being able to “see” them accurately wherever we look.

3.6 Bar Graphs of Nominal and Ordinal Variables

Our discussions so far have focused on the graphical display of interval/ratio variables, that is, on histograms and frequency polygons. If a variable is nominal or ordinal, however, the use of a histogram or frequency polygon can be quite misleading.

Suppose the campus newspaper polls students about their political preference, asking them whether they are Republican, Democrat, Independent, or decline to state. This is clearly a nominal variable, and the frequencies are shown in Table 3.6. If we (mistakenly) were to create a frequency polygon for these data, we would plot a point for Republicans, a point for Democrats, and so on, and then connect the points with continuous lines. The line between the Republican and the Democrat points would give the erroneous impression that measurements were made at intermediate points between those two parties. To avoid such misleading displays, we do not use histograms or frequency polygons for nominal data, preferring instead the bar graph.

A bar graph is a graphical display of a frequency distribution where frequencies are represented as separated vertical bars, as shown in Figure 3.24. Note how the separation of the bars makes it clear that no intermediate values were measured. Note also that the order of the bars could be rearranged, with Democrats placed at the left, for example. This is because the political preference variable is measured at the nominal level, where order is not important.
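A text-only sketch of such a bar graph, using the frequencies of Table 3.6, might look like the following. The bars are drawn horizontally for convenience (Figure 3.24 uses vertical bars), and the function name and the scale of one symbol per 25 students are assumptions made for this illustration.

```python
# Frequencies from Table 3.6 (students' political preferences).
preferences = {
    "Republican": 375,
    "Democrat": 510,
    "Independent": 141,
    "Decline to state": 84,
}


def text_bar_graph(freqs, per_symbol=25):
    """Return one line per category: label, a bar of '#' symbols, and f.

    per_symbol is a hypothetical scale: each '#' stands for about
    per_symbol students.
    """
    width = max(len(label) for label in freqs)
    lines = []
    for label, f in freqs.items():
        bar = "#" * round(f / per_symbol)
        lines.append(f"{label:<{width}}  {bar} {f}")
    return lines


print("\n".join(text_bar_graph(preferences)))
print()

# Nominal categories have no inherent order, so reordering the bars
# (here, from most to least frequent) is equally valid.
by_frequency = dict(
    sorted(preferences.items(), key=lambda item: item[1], reverse=True)
)
print("\n".join(text_bar_graph(by_frequency)))
```

Each bar sits on its own line, separated from the others, which mirrors the point that no intermediate values were measured between categories.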

The frequency distributions of ordinal data are displayed as bar graphs, histograms, or frequency polygons. Using the bar graph for ordinal data has the advantage of making it clear that the intervals between points are not necessarily equal sized, but it has the disadvantage of implying that there are no intermediate values between the bars. The choice between the methods of display for ordinal data is a judgment call made for the individual frequency distribution, considering which kind of display gives the clearest communication about the data.

TABLE 3.6 Frequency distribution of students’ political preferences

Preference          f
Republican        375
Democrat          510
Independent       141
Decline to state   84

bar graph: A graphical presentation of a frequency distribution of nominal or ordinal data where frequencies are represented by separated bars

FIGURE 3.24 Bar graph of political party preference (X-axis: Political Party Preference, with bars for Republican, Democrat, Independent, and Declined; Y-axis: f, 0 to 600)

Spaces between bars in a bar graph indicate that the bars could be reordered.

To review: Histograms and frequency polygons are designed to represent data measured at the interval/ratio level (ordinal data also are often displayed in histograms). Bar graphs are designed for nominal data.

3.7 Connections

Cumulative Review

Chapters 1 and 2 distinguished between populations (the entire group of interest) and samples (some subset of that group) and between parameters (characteristics of populations) and statistics (characteristics of samples). Variables were defined as characteristics that could take on any of several values, and we distinguished among nominal (categorization only), ordinal (ordered categorization), and interval/ratio (ordered categorization where the categories all have equal sizes) levels of measurement. Summation notation and probability were introduced as two of the primary mathematical skills underlying statistics.

Computers

Use DataGen to create a frequency distribution of the data in Table 3.2:

1. In the Personal Trainer, click DataGen (Win) if you are a Windows user or DataGen (Mac) if you are a Macintosh user. DataGen (Win) requires that you have Microsoft Office Excel 2007 or later installed on your Windows computer. DataGen (Mac) requires that you have Microsoft Office Excel 2011 or later installed on your Macintosh computer. If you have a choice, use the Windows version of DataGen.

2. Enter the data of Table 3.2. Note that logically this is one column of data, so enter all 25 values down DataGen’s Variable 1 column.

3. Note that DataGen does not create frequency distributions or histograms, so this cannot be done directly. You can ask DataGen to sort the data, so that creating a histogram is easier. Highlight the 25 values in Variable 1. Click Edit variable and then click Sort (this variable only).

4. Save these data because we will use them later: Click Save As. Choose a folder in the Save As window. In the File name: cell type table 3-2. The file name at the top of the DataGen spreadsheet window should now read table 3-2.xlsx.

5. Open the file you just saved to make sure it was saved correctly: Click Open. Make sure table 3-2.xlsx appears in the File name: cell, and then click Open.

Use DataGen to add 100 to all the values of a variable:

1. Enter the five values 2, 4, 6, 8, and 10 into the Variable 1 column of the spreadsheet.

2. Highlight those five cells, click Edit variable, enter 100 into the text entry cell next to the Add button, and then click Add.

SPSS

Use SPSS to create a frequency distribution of the data in Table 3.2:

1. Enter the data from Table 3.2 by typing the values down a single column, using Enter between values. (Do not put the data in five different columns.)

2. Save these data so that you can use them later: Click File, then click Save As.... In the File name: cell, type table 3-2, which will make the new file name table 3-2.sav. Then click Save.

3. Recall this file to make sure it is saved correctly: Click File, then Open, and then Data.... Click table 3-2.sav. Click Open.

4. Click Analyze, then Descriptive Statistics, then Frequencies....

5. Click and then click OK.

6. Figure 3.25 shows the output, with my annotations, that will appear in the Output1-IBM SPSS Statistics Viewer window. Check to make sure that the frequency column sums to 25 and that the table matches Table 3.3 in the textbook. Note that SPSS chooses to list the smallest values first, unlike the convention described in the text.

VAR00001

          Value   Frequency   Percent   Valid Percent   Cumulative Percent
Valid    135.00       1         4.0          4.0                4.0
         137.00       1         4.0          4.0                8.0
         144.00       1         4.0          4.0               12.0
         149.00       1         4.0          4.0               16.0
         150.00       1         4.0          4.0               20.0
         152.00       1         4.0          4.0               24.0
         154.00       1         4.0          4.0               28.0
         157.00       1         4.0          4.0               32.0
         159.00       1         4.0          4.0               36.0
         161.00       1         4.0          4.0               40.0
         162.00       1         4.0          4.0               44.0
         163.00       2         8.0          8.0               52.0
         164.00       3        12.0         12.0               64.0
         173.00       2         8.0          8.0               72.0
         179.00       1         4.0          4.0               76.0
         180.00       2         8.0          8.0               84.0
         182.00       1         4.0          4.0               88.0
         185.00       1         4.0          4.0               92.0
         190.00       1         4.0          4.0               96.0
         197.00       1         4.0          4.0              100.0
         Total       25       100.0        100.0

Annotations: the value column is X and the Frequency column is f; check that the Total of 25 is correct.

FIGURE 3.25 Sample SPSS output: Frequency distribution of Table 3.2
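For readers without SPSS, the same kind of table can be produced with a short Python sketch. The 25 values below are read off the SPSS output in Figure 3.25 rather than copied from Table 3.2, so their order here is an assumption; the variable names are hypothetical.

```python
from collections import Counter

# The 25 values of Table 3.2, reconstructed from the Frequency column
# of the SPSS output in Figure 3.25 (sorted; original order unknown).
weights = [
    135, 137, 144, 149, 150, 152, 154, 157, 159, 161, 162,
    163, 163, 164, 164, 164, 173, 173, 179, 180, 180,
    182, 185, 190, 197,
]

counts = Counter(weights)  # maps each value X to its frequency f
n = len(weights)

print(f"{'X':>5} {'f':>3} {'Percent':>8} {'Cumulative':>11}")
cumulative = 0
for x in sorted(counts):  # smallest value first, as SPSS lists them
    cumulative += counts[x]
    percent = 100 * counts[x] / n
    print(f"{x:>5} {counts[x]:>3} {percent:>8.1f} {100 * cumulative / n:>11.1f}")
print(f"{'Total':>5} {n:>3}")
```

Replacing `sorted(counts)` with `sorted(counts, reverse=True)` lists the largest values first, which is the convention described in the text.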


Use SPSS to create a histogram of these data:

7. Click Graphs and then Chart Builder….

8. If a window that reads “Before you use this dialog . . .” appears, click OK.

9. In the Choose from: dialog, select Histogram.

10. Drag the icon for the histogram into the window that says Drag a Gallery chart here to use it as your starting point.

11. Drag VAR00001 from the Variables: dialog to the X-Axis? label of the histogram.

12. Click OK.

13. Note that the histogram appears in the Output1-IBM SPSS Statistics Viewer window.

Use SPSS to add 100 to all the values of a variable:

1. Clear the data by clicking File > New > Data.

2. Enter 2, 4, 6, 8, and 10 into the first column of the spreadsheet.

3. Click Transform and then click Compute variable....

4. Enter VAR00001 in the Target Variable: text entry cell.

5. Click , click +, click 1, click 0, click 0, and then click OK.

6. Click OK to Change existing variable?

7. Note the result in the Untitled-IBM SPSS Statistics Data Editor window.

Excel


Use Excel to create a grouped frequency distribution (like Table 3.4) of the data in Table 3.2 (Excel will not easily create an ungrouped frequency distribution, but see the DataGen hint above):

1. Enter the data from Table 3.2 into the first 5 cells of spreadsheet columns A through E.

2. Save these data so that you can use them later: Click File > Save As. In the File name: cell type table 3.2, which will make the new file name “table 3.2.xlsx.” Then click Save.

3. Enter the top values of the class intervals from Table 3.4 (199, 189, 179, etc.) into the first 7 cells of column F.

4. Highlight the first 7 cells of column G (that’s where the frequencies will go).

5. In the Formulas tab click Insert Function.

6. In the Or select a category: cell select Statistical.

7. Scroll down and select FREQUENCY and click OK.

8. Click the RefEdit control at the end of the Data_array cell. Highlight the 25 values in columns A through E. Click the RefEdit control again.

9. Click the RefEdit control at the end of the Bins_array cell. Highlight the 7 values in column F. Click the RefEdit control again.

10. On the keyboard press simultaneously Ctrl-Shift-Enter. This will enter the FREQUENCY function into all 7 cells in column G.

11. Compare the contents of column G to the frequencies column in Table 3.4.
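The counting that these steps ask Excel to do can be sketched in Python as follows. The data are reconstructed from the SPSS output in Figure 3.25, the full list of top values (199 down to 139) is inferred from the "199, 189, 179, etc." of step 3, and the interval width of 10 is inferred from those top values; the function name is hypothetical.

```python
def grouped_frequencies(data, upper_limits, width):
    """Return (lower, upper, f) for each class interval.

    upper_limits lists the top value of each interval; width is the number
    of whole-number values each interval covers.
    """
    table = []
    for upper in upper_limits:
        lower = upper - width + 1
        f = sum(1 for x in data if lower <= x <= upper)
        table.append((lower, upper, f))
    return table


# The 25 values of Table 3.2, reconstructed from Figure 3.25.
weights = [
    135, 137, 144, 149, 150, 152, 154, 157, 159, 161, 162,
    163, 163, 164, 164, 164, 173, 173, 179, 180, 180,
    182, 185, 190, 197,
]

# Top values of the seven class intervals (assumed to step down by 10).
upper_limits = [199, 189, 179, 169, 159, 149, 139]

table = grouped_frequencies(weights, upper_limits, width=10)
for lower, upper, f in table:
    print(f"{lower}-{upper}  {f:>3}")
print(f"Total    {sum(f for _, _, f in table):>3}")
```

As with the Excel result, the frequencies should sum to 25, the number of values entered.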



Use Excel to create a histogram of these data:

12. Click the Data tab and then click Data Analysis. If there is no Data Analysis entry in the ribbon, do step 12A. Otherwise go to step 13.

12A. If there is no Data Analysis entry, you will have to install the Excel Analysis ToolPak, which comes with Excel but is not automatically installed. Click File > Options > Add-Ins. Highlight Analysis ToolPak. Click Go. That produces the Add-Ins window. Check Analysis ToolPak and click OK. Then, in the Data tab click Data Analysis.

13. Scroll down to Histogram, highlight it, and click OK.

14. Click the RefEdit control at the end of the Input range: cell. Highlight the 25 values in columns A through E. Click the RefEdit control again.

15. Click the RefEdit control at the end of the Bin Range: cell. Highlight the 7 values in column F. Click the RefEdit control again.

16. Check the Chart Output box and click OK.

17. Compare the histogram that Excel creates to Figure 3.3.

18. Note that Excel places the histogram in a new worksheet. To return to the original data, click Sheet1 at the bottom of the window.

Use Excel to add 100 to all the values of a variable:

19. Clear the data by clicking File > New > Blank workbook.

20. Enter the five values 2, 4, 6, 8, 10 into the first cells of column A.

21. Click on cell B1, type =A1+100. Don’t forget to type the equal sign at the beginning of this entry. Then press Enter. Note that the value 102 (= 2 + 100) appears in cell B1.

22. Click on cell B1. Grab the fill handle (the little square box that appears at the lower right corner of the B1 cell) and drag it down until the first 5 cells in column B are highlighted. Then release the handle.
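The same operation, adding 100 to every value of a variable, takes one line in Python. This is a minimal sketch with hypothetical variable names, shown only so the result of the spreadsheet steps can be checked by other means.

```python
values = [2, 4, 6, 8, 10]

# Add the constant 100 to every value, like filling =A1+100 down column B.
shifted = [x + 100 for x in values]
print(shifted)  # [102, 104, 106, 108, 110]

# Adding a constant moves every value but leaves the distance between the
# smallest and largest values unchanged.
print(max(values) - min(values), max(shifted) - min(shifted))  # 8 8
```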

Homework Tips

1. Check the list of learning objectives at the beginning of this chapter. Do you understand each one?

2. When you are preparing a grouped frequency distribution, make sure that all the class intervals have the same width. The most frequent mistake is to make the first or the last interval wider or narrower than the remaining intervals. If the intervals are equal, then the last digit in the interval limits (including the first and last interval) will form some uniform sequence, such as 0, 5, 0, 5, . . . or 0, 4, 8, 2, 6, 0, 4, 8, 2, 6, . . .

3. Remember to plot the points in a frequency polygon at the midpoints of the class intervals and to connect the first and last dots to the axis with diagonal dashed lines.
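Tips 2 and 3 can be checked mechanically. The sketch below assumes whole-number data, so that an interval such as 8–9 has a width of 2; the function names are hypothetical, the first list holds the class intervals used in the worked answer to Exercise 1, and the second list is an invented example of the mistake that tip 2 warns about.

```python
def interval_widths(intervals):
    """Width of each (lower limit, upper limit) interval for whole-number data."""
    return [upper - lower + 1 for lower, upper in intervals]


def has_equal_widths(intervals):
    """True when every class interval has the same width (tip 2)."""
    return len(set(interval_widths(intervals))) == 1


def midpoints(intervals):
    """Midpoint of each interval, where frequency polygon dots go (tip 3)."""
    return [(lower + upper) / 2 for lower, upper in intervals]


# Class intervals from the worked answer to Exercise 1, largest first.
exercise_1 = [(22, 23), (20, 21), (18, 19), (16, 17),
              (14, 15), (12, 13), (10, 11), (8, 9)]

# Hypothetical mistake: the top interval is wider than the others.
too_wide_first = [(20, 23), (18, 19), (16, 17)]

print(has_equal_widths(exercise_1))
print(has_equal_widths(too_wide_first))
print([lower % 10 for lower, _ in exercise_1])  # last digits of lower limits
print(midpoints(exercise_1))
```

The printed last digits form a repeating sequence when the widths are equal, which is the quick visual check that tip 2 describes.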

Click ReviewMaster and then Chapter 3 in the Personal Trainer for an electronic interactive review of the concepts in Chapter 3.

Click Labs and then Chapter 3 in the Personal Trainer for interactive practice of the skills in Chapter 3 and a quiz to test your understanding.


Chapter 3 Exercises

Section A: Basic Exercises (Answers in Appendix D, page 576)

[HINT: Save your answers to these exercises. You will use them again for the exercises in Chapters 4 and 5. When sketching distributions, follow the steps of Figures 3.7–3.11. Do not forget to label your axes (including the unit of measurement).]

1. To measure the fitness of students, Edith Drabushen selects 20 students at random and asks them to ride a stationary bicycle as fast as they can for 30 minutes. The bicycle has an odometer, and she records the number of miles each student rides. The data (in miles) are 8, 15, 22, 17, 9, 16, 15, 15, 21, 12, 14, 16, 22, 17, 16, 11, 13, 15, 20, 15. Help Edith prepare these distributions:

(a) Frequency distribution
(b) Grouped frequency distribution with about eight groups
(c) Histogram
(d) Frequency polygon

Exercise 1 worked out

(a) Frequency distribution
✓ Labels “Distance ridden” and “f”
✓ Units “(mi)” shown
✓ Largest values first
✓ No zeros in f column
✓ f column sum shown
✓ f column sum equals n

Distance Ridden (mi)    f
22                      2
21                      1
20                      1
17                      2
16                      3
15                      5
14                      1
13                      1
12                      1
11                      1
9                       1
8                       1
                       20

(b) Grouped frequency distribution
✓ Labels “Distance ridden (mi)” and “f”
✓ Top interval includes largest point
✓ Bottom interval includes smallest point
✓ All groups same width (especially first and last!)
✓ No gaps between groups
✓ f column sum still equals n

Distance Ridden (mi)    f
22–23                   2
20–21                   2
18–19                   0
16–17                   5
14–15                   6
12–13                   2
10–11                   1
8–9                     2
                       20

(c) Histogram
✓ Labels “Distance ridden” and “f”
✓ Units “(mi)” shown
✓ No space between bars
✓ Break in horizontal axis because vertical axis is not at 0
✓ Intervals shown on X axis

[Histogram: X-axis “Distance Ridden (mi.)” with class intervals 8–9 through 22–23; Y-axis f, 0 to 7]

(d) Frequency polygon
✓ Labels “Distance ridden” and “f”
✓ Units “(mi)” shown
✓ Horizontal axis values at midpoints of class intervals
✓ Dots at midpoints of intervals
✓ Dashed lines from last dots to axis

[Frequency polygon: X-axis “Distance Ridden (mi.)” with midpoints 8.5 through 22.5; Y-axis f, 0 to 7]

2. (a) For the data of Exercise 1, prepare a histogram using about four groups.

(b) Which histogram is more informative, the one with eight groups or the one with four groups?

3. These data are the scores of all the students on the first exam in a statistics class: 81, 86, 91, 75, 96, 82, 88, 88, 71, 68, 72, 61, 84, 86, 95, 91, 84, 83, 91, 83, 89, 90. Prepare these distributions:

(a) Frequency distribution
(b) Grouped frequency distribution with about eight groups. The lower limit of the lowest group could be 60, 61, 62, or 63. Which of these choices makes the most sense? Why?
(c) Histogram
(d) Frequency polygon

4. Most professional basketball players are about 6 feet 8 inches (80 inches) tall. A few are as tall as 7 feet 2 inches, and a few are as short as 5 feet 8 inches. Sketch by eyeball the distribution of heights of professional basketball players. What terms can be used to describe this distribution?

5. Vibrato is the slight wavering in pitch that musicians add to musical notes to give them warmth. Most musicians’ vibrato has a frequency of about 7 Hz (hertz, or cycles per second). A few are as fast as 9 Hz, and a few are as slow as 5 Hz. Sketch by eyeball the distribution of vibrato frequencies. What terms can be used to describe this distribution?

6. Coca-Cola bottling companies manufacture bottles of Coke that say “16 ounces” on the label. Bottlers know that consumer groups will complain if too many bottles contain less than 16 ounces, but they also don’t want to waste money by putting too much Coke in the bottles. They know that their bottling machines are not perfectly accurate in dispensing exact amounts of liquid. They therefore set their machines to dispense Coke so that the modal amount of Coke is 16.5 ounces. Sketch by eyeball the distribution of amounts of Coke in “16 ounce” Coca-Cola bottles. What terms can be used to describe this distribution?

7. Sketch by eyeball the distribution of lengths of McDonald’s french fries. What terms can be used to describe this distribution?

8. Royal Perfecto is a competitor of McDonald’s, and it, like McDonald’s, grows its own potatoes from which french fries are made. Royal Perfecto has discovered a way of growing potatoes that are perfectly cubical, 5 inches on each side. Furthermore, Royal Perfecto potatoes, when cut into strips and deep fried, retain their exact 5-inch lengths. Sketch the distribution of lengths of Royal Perfecto french fries. What terms can be used to describe this distribution?

9. Big Brothers is an organization that pairs young boys with adult men. The Los Angeles Big Brothers Club has a luncheon where boys aged 7 can come with their adult Big Brothers. Sketch the distribution of heights of all people (boys and their Big Brothers) who attend this luncheon. What terms can be used to describe this distribution?

10. Sketch the distribution of lengths of blades of grass on the lawn near the building where your statistics class meets. What terms can be used to describe this distribution?

11. Suppose the college ground crew went on strike for two weeks and the lawn does not get mowed during that time. Sketch by eyeball the distribution of lengths of blades of grass at the end of the strike. How is this distribution similar to and different from the distribution you sketched in Exercise 10?


12. Suppose that the registrar at Hower University reports these enrollments in each of the colleges: Liberal Arts 3024, Science and Mathematics 1127, Performing Arts 752, Health Sciences 1452, Business and Economics 4320, and Education 2431.

(a) What kind of graphical presentation is appropriate for the frequency distribution at Hower U?

(b) Create that graphical distribution.

Section B: Supplementary Exercises

13. For a study on human learning, a researcher created a list of 25 pairs of nonsense syllables (such as dak, fom). The first syllable of each pair is presented to the participant one at a time, and the participant is asked to say its paired associate. If the participant is incorrect, the correct syllable is then presented. One of the variables of interest in this study is the number of times such a list has to be presented until the participant can correctly give all 25 associates. The researcher randomly selects 50 students from an introductory psychology course and trains each student on the list until the 100% criterion is reached. The numbers of presentations of the list for the 50 students are 8, 9, 7, 8, 16, 7, 10, 11, 16, 14, 12, 13, 12, 13, 12, 14, 8, 9, 17, 12, 5, 18, 14, 14, 12, 8, 11, 11, 9, 9, 18, 15, 11, 7, 9, 5, 6, 8, 10, 11, 11, 10, 14, 16, 6, 11, 15, 9, 19, 12. Prepare these distributions:

(a) Frequency distribution
(b) Grouped frequency distribution with six to eight groups
(c) Histogram
(d) Frequency polygon

14. A college is considering supplying personal computers for its students and wishes to know how fast the students can type. It takes a random sample of 55 students and administers a typing test. These are the results (in words per minute): 8, 24, 20, 20, 17, 18, 16, 19, 17, 29, 25, 27, 14, 5, 21, 36, 11, 16, 29, 20, 11, 17, 26, 10, 11, 5, 5, 19, 28, 7, 15, 8, 14, 32, 32, 12, 7, 12, 13, 30, 19, 16, 42, 26, 16, 30, 21, 8, 4, 23, 5, 15, 19, 9, 30. Prepare these distributions:

(a) Frequency distribution

(b) Grouped frequency distribution with six to eight groups


15. A sociologist is interested in the economic impact of school events at Waytoo High School. She interviews couples who attend the senior prom and asks how much money each couple spent on this one evening, including tickets, clothing, and entertainment. The data (in dollars) are 245, 190, 330, 225, 140, 120, 410, 395, 218, 264, 256, 302, 330, 310, 275, 272, 188, 380, 95, 160, 260, 265, 387, 342, 340. Prepare these distributions:

(a) Frequency distribution of money spent on the prom at Waytoo High

(b) Grouped frequency distribution with about eight groups


16. The president of a sorority reports these grade-point averages of sorority members: 3.2, 3.5, 3.2, 4.0, 2.7, 3.1, 3.1, 2.9, 3.7, 2.8, 3.6, 3.4, 3.8, 3.8, 2.8, 3.6, 3.9, 3.6, 3.3, 3.4, 3.3, 3.5, 3.4, 3.3, 3.7, 3.5, 3.9, 3.8, 3.1, 3.2. Prepare these distributions:

(a) Frequency distribution
(b) Grouped frequency distribution with about eight groups
(c) Histogram
(d) Frequency polygon

17. Sketch by eyeball the distribution of lengths of automobiles parked in your college parking lot. Describe (in words) this distribution.

18. Sketch by eyeball the distribution of lengths of songs played on a rock and roll radio station. Describe (in words) this distribution.

19. Sketch by eyeball the distribution of the prices of men’s shirts in a local department store. Describe (in words) this distribution.

20. Sketch by eyeball the distribution of the prices of men’s shirts in a local discount store. Describe (in words) this distribution.



Section C: Cumulative Review (Answers in Appendix D, page 578)

21. In the Australian gambling game of Two-Up, two coins are “spun” (thrown) simultaneously into the air. The person running the game calls out the outcome of the two coins using one of three terms: “heads” (both are heads), “tails” (both are tails), or “odds” (one head and one tail).

(a) What is the probability of “heads”?
(b) What is the probability of “tails”?
(c) What is the probability of “odds”?

22. Consider the data $X_i$: 14, 11, 12, 17, 13, 15. Find:

(a) $X_4$
(b) $\sum (4X_i^2 + 2X_i - 3)$

23. Sketch by eyeball the distribution of lengths of last names (number of letters in last name, including hyphenated surnames) for students in your college.

24. Identify the level of measurement (nominal, ordinal, or interval/ratio) for each of these variables:

(a) Number of pages in a textbook
(b) House address number as a measure of distance from the center of town
(c) Religious affiliation (1 = Protestant, 2 = Catholic, 3 = Jewish, etc.)

Click Resources and then 3X in the Personal Trainer for additional exercises.
