Chapters 1 & 3 Graphical Methods for Describing Data.

Date post: 05-Jan-2016
Upload: victoria-banks
Page 1: Chapters 1 & 3 Graphical Methods for Describing Data.

Chapters 1 & 3

Graphical Methods for Describing Data

Page 2: Chapters 1 & 3 Graphical Methods for Describing Data.

What is statistics?

• the science of collecting, organizing, analyzing, and drawing conclusions from data

Page 3: Chapters 1 & 3 Graphical Methods for Describing Data.

Why should one study statistics?

1. To be informed . . .
   a) Extract information from tables, charts and graphs
   b) Follow numerical arguments
   c) Understand the basics of how data should be gathered, summarized, and analyzed to draw statistical conclusions

Can dogs help patients with heart failure by reducing stress and anxiety?

When people take a vacation, do they really leave work behind?

Page 4: Chapters 1 & 3 Graphical Methods for Describing Data.

Why should one study statistics? (continued)

2. To make informed judgments

3. To evaluate decisions that affect your life

If you choose a particular major, what are your chances of finding a job when you graduate?

Many companies now require drug screening as a condition of employment. With these screening tests there is a risk of a false-positive reading. Is the risk of a false result acceptable?

Page 5: Chapters 1 & 3 Graphical Methods for Describing Data.

What is variability?

Suppose you went into a convenience store to purchase a soft drink. Does every can on the shelf contain exactly 12 ounces?

NO – there may be a little more or less in the various cans due to the variability that is inherent in the filling process.

In fact, variability is almost universal!

It is variability that makes life interesting!!

The quality, state, or degree of being variable or changeable.

Page 6: Chapters 1 & 3 Graphical Methods for Describing Data.

If the Shoe Fits ...

The two histograms to the right display the distribution of heights of gymnasts and the distribution of heights of female basketball players. Which is which? Why?

Heights – Figure A

Heights – Figure B

Page 7: Chapters 1 & 3 Graphical Methods for Describing Data.

If the Shoe Fits ...

Suppose you found a pair of size 6 shoes left outside the locker room. Which team would you go to first to find the owner of the shoes? Why?

Suppose a tall woman (5 ft 11 in) tells you she is looking for her sister, who is practicing in the gym. To which team would you send her? Why?

Page 8: Chapters 1 & 3 Graphical Methods for Describing Data.

The Data Analysis Process

1. Understand the nature of the problem
   It is important to have a clear direction before gathering data.

2. Decide what to measure and how to measure it
   It is important to carefully define the variables to be studied and to develop appropriate methods for determining their values.

3. Collect data
   It is important to understand how data is collected because the type of analysis that is appropriate depends on how the data was collected!

4. Summarize data and perform preliminary analysis
   This initial analysis provides insight into important characteristics of the data.

5. Perform formal analysis
   It is important to select and apply the appropriate inferential statistical methods.

6. Interpret results
   This step often leads to the formulation of new research questions.

Page 9: Chapters 1 & 3 Graphical Methods for Describing Data.

Suppose we wanted to know the average GPA of high school graduates in the nation this year.

We could collect data from all high schools in the nation.

What term would be used to describe “all high school graduates”?

Page 10: Chapters 1 & 3 Graphical Methods for Describing Data.

Population

• The entire collection of individuals or objects about which information is desired

• A census is performed to gather information about the entire population

What do you call it when you collect data about the entire population?

Page 11: Chapters 1 & 3 Graphical Methods for Describing Data.

GPA Continued: Suppose we wanted to know the average GPA of high school graduates in the nation this year.

We could collect data from all high schools in the nation.

Why might we not want to use a census here?

If we didn’t perform a census, what would we do?

Page 12: Chapters 1 & 3 Graphical Methods for Describing Data.

Sample

• A subset of the population, selected for study in some prescribed manner

What would a sample of all high school graduates across the nation look like?

High school graduates from each state (region), ethnicity, gender, etc.

Page 13: Chapters 1 & 3 Graphical Methods for Describing Data.

GPA Continued: Suppose we wanted to know the average GPA of high school graduates in the nation this year.

We could collect data from a sample of high schools in the nation.

Once we have collected the data, what would we do with it?

Page 14: Chapters 1 & 3 Graphical Methods for Describing Data.

Descriptive statistics
• the methods of organizing & summarizing data

If the sample of high school GPAs contained 1,000 numbers, how could the data be organized or summarized?

• Create a graph
• State the range of GPAs
• Calculate the average GPA

Page 15: Chapters 1 & 3 Graphical Methods for Describing Data.

GPA Continued: Suppose we wanted to know the average GPA of high school graduates in the nation this year.

We could collect data from a sample of high schools in the nation.

Could we use the data from our sample to answer this question?

Page 16: Chapters 1 & 3 Graphical Methods for Describing Data.

Inferential statistics
• involves making generalizations from a sample to a population

Based on the sample, if the average GPA for high school graduates was 3.0, what generalization could be made?

The average national GPA for this year’s high school graduates is approximately 3.0.

Could someone claim that the average GPA for graduates in your local school district is 3.0?

No. Generalizations based on the results of a sample can only be made back to the population from which the sample came.

Be sure to sample from the population of interest!!

Page 17: Chapters 1 & 3 Graphical Methods for Describing Data.

Variable
• any characteristic whose value may change from one individual to another

• Suppose we wanted to know the average GPA of high school graduates in the nation this year. Define the variable of interest.

The variable of interest is the GPA of high school graduates.

Is this a variable . . . The number of wrecks per week at the intersection outside school? YES

Page 18: Chapters 1 & 3 Graphical Methods for Describing Data.

Data
• The values for a variable from individual observations

For this variable . . . The number of wrecks per week at the intersection outside . . . What could observations be?

0, 1, 2, …

Page 19: Chapters 1 & 3 Graphical Methods for Describing Data.

Two types of variables

• categorical
• numerical
  – discrete
  – continuous

Page 20: Chapters 1 & 3 Graphical Methods for Describing Data.

Categorical variables
• Qualitative

• Identifies basic differentiating characteristics of the population

Can you name any categorical variables?

Page 21: Chapters 1 & 3 Graphical Methods for Describing Data.

Numerical variables
• quantitative

• observations or measurements take on numerical values

• makes sense to average these values

• two types - discrete & continuous

Can you name any numerical variables?

Page 22: Chapters 1 & 3 Graphical Methods for Describing Data.

Discrete (numerical)
• Isolated points along a number line

• usually counts of items

Page 23: Chapters 1 & 3 Graphical Methods for Describing Data.

Continuous (numerical)
• Variable that can be any value in a given interval

• usually measurements of something

Page 24: Chapters 1 & 3 Graphical Methods for Describing Data.

Identify the following variables:

1. the color of cars in the teacher’s lot: Categorical
2. the number of calculators owned by students at your school: Discrete numerical
3. the zip code of an individual: Categorical
4. the amount of time it takes students to drive to school: Continuous numerical
5. the appraised value of homes in your city: Discrete numerical

Is money a measurement or a count?

Page 25: Chapters 1 & 3 Graphical Methods for Describing Data.

Classifying variables by the number of variables in a data set

Suppose that the PE coach records the height of each student in his class.

Univariate - data that describes a single characteristic of the population

This is an example of univariate data.

Page 26: Chapters 1 & 3 Graphical Methods for Describing Data.

Classifying variables by the number of variables in a data set

Suppose that the PE coach records the height and weight of each student in his class.

Bivariate - data that describes two characteristics of the population

This is an example of bivariate data.

Page 27: Chapters 1 & 3 Graphical Methods for Describing Data.

Classifying variables by the number of variables in a data set

Suppose that the PE coach records the height, weight, number of sit-ups, and number of push-ups for each student in his class.

Multivariate - data that describes more than two characteristics (beyond the scope of this course)

This is an example of multivariate data.

Page 28: Chapters 1 & 3 Graphical Methods for Describing Data.

Graphs for categorical data

Page 29: Chapters 1 & 3 Graphical Methods for Describing Data.

Bar Chart

When to Use
Categorical data

How to construct
– Draw a horizontal line; write the categories or labels below the line at regularly spaced intervals
– Draw a vertical line; label the scale using frequency or relative frequency
– Place equal-width rectangular bars above each category label with a height determined by its frequency or relative frequency

Page 30: Chapters 1 & 3 Graphical Methods for Describing Data.

Bar Chart (continued)

What to Look For
Frequently or infrequently occurring categories

Collect the following data and then display the data in a bar chart:

What is your favorite ice cream flavor?

Vanilla, chocolate, strawberry, or other
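
As a rough sketch of the tallying step behind a bar chart, the snippet below counts each category and prints a text bar with its frequency and relative frequency. The `responses` list is made up for illustration; it is not class data from the slides.

```python
from collections import Counter

# Hypothetical survey responses (invented for illustration only).
responses = ["vanilla", "chocolate", "chocolate", "strawberry", "other",
             "vanilla", "chocolate", "other", "chocolate", "vanilla"]

counts = Counter(responses)   # frequency of each category
n = len(responses)            # total number of responses

# One text "bar" per category, labelled with frequency and relative frequency.
for flavor in ["vanilla", "chocolate", "strawberry", "other"]:
    freq = counts[flavor]
    print(f"{flavor:<12}{'#' * freq} {freq} ({freq / n:.2f})")
```

Either the frequency or the relative frequency could set the bar height; for a single group the shape of the chart is the same.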

Page 31: Chapters 1 & 3 Graphical Methods for Describing Data.

Double Bar Charts

When to Use
Categorical data

How to construct
– Constructed like bar charts, but with two (or more) groups being compared
– MUST use relative frequencies on the vertical axis
– MUST include a key to denote the different bars

Why MUST we use relative frequencies?

Page 32: Chapters 1 & 3 Graphical Methods for Describing Data.

Each year the Princeton Review conducts a survey of students applying to college and of parents of college applicants. In 2009, 12,715 high school students responded to the question “Ideally how far from home would you like the college you attend to be?” Also, 3007 parents of students applying to college responded to the question “how far from home would you like the college your child attends to be?” Data is displayed in the frequency table below.

Frequency

Ideal Distance          Students   Parents
Less than 250 miles     4450       1594
250 to 500 miles        3942       902
500 to 1000 miles       2416       331
More than 1000 miles    1907       180

Create a comparative bar chart with these data.

What should you do first?

Page 33: Chapters 1 & 3 Graphical Methods for Describing Data.

Relative Frequency

Ideal Distance          Students   Parents
Less than 250 miles     .35        .53
250 to 500 miles        .31        .30
500 to 1000 miles       .19        .11
More than 1000 miles    .15        .06

Students column: found by dividing the frequency by the total number of students

Parents column: found by dividing the frequency by the total number of parents

What does this graph show about the ideal distance college should be from home?
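
The conversion from frequencies to relative frequencies can be sketched as follows, using the Princeton Review counts from the frequency table. The function and variable names are illustrative choices, not part of the slides.

```python
distances = ["Less than 250 miles", "250 to 500 miles",
             "500 to 1000 miles", "More than 1000 miles"]
students = [4450, 3942, 2416, 1907]   # frequencies from the slide
parents = [1594, 902, 331, 180]       # frequencies from the slide

def relative_frequencies(counts):
    """Divide each frequency by the group's own total."""
    total = sum(counts)
    return [count / total for count in counts]

student_rel = relative_frequencies(students)
parent_rel = relative_frequencies(parents)

for label, s, p in zip(distances, student_rel, parent_rel):
    print(f"{label:<22}{s:>6.2f}{p:>6.2f}")
```

Because the two groups differ so much in size (12,715 students versus 3007 parents), dividing each column by its own total is what makes the bars comparable.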

Page 34: Chapters 1 & 3 Graphical Methods for Describing Data.

Segmented (or Stacked) Bar Charts

When to Use
Categorical data

How to construct
– MUST first calculate relative frequencies
– Draw a bar representing 100% of the group
– Divide the bar into segments corresponding to the relative frequencies of the categories

Page 35: Chapters 1 & 3 Graphical Methods for Describing Data.

Remember the Princeton survey . . .

Relative Frequency

Ideal Distance          Students   Parents
Less than 250 miles     .35        .53
250 to 500 miles        .31        .30
500 to 1000 miles       .19        .11
More than 1000 miles    .15        .06

Create a segmented bar graph with these data.

First draw a bar that represents 100% of the students who answered the survey.

Page 36: Chapters 1 & 3 Graphical Methods for Describing Data.

[Figure: segmented bar chart with one bar for Students and one bar for Parents; vertical axis: relative frequency, 0 to 1.0; segments: Less than 250 miles, 250 to 500 miles, 500 to 1000 miles, More than 1000 miles]

First draw a bar that represents 100% of the students who answered the survey.

Next, divide the bar into segments.

Do the same thing for parents – don’t forget a key denoting each category.

Notice that this segmented bar chart displays the same relationship between the opinions of students and parents concerning the ideal distance that college is from home as the double bar chart does.
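
One way to find where each segment of a segmented bar ends is to accumulate the relative frequencies from the bottom of the bar upward. This is a minimal sketch using the rounded values from the relative frequency table; `segment_tops` is a hypothetical helper name.

```python
from itertools import accumulate

# Relative frequencies from the slide, ordered from "Less than 250 miles"
# to "More than 1000 miles".
student_rel = [0.35, 0.31, 0.19, 0.15]
parent_rel = [0.53, 0.30, 0.11, 0.06]

def segment_tops(rel_freqs):
    """Height on the 0-to-1 axis at which each segment ends."""
    return [round(top, 2) for top in accumulate(rel_freqs)]

print(segment_tops(student_rel))
print(segment_tops(parent_rel))
```

The last boundary in each bar is 1.0, which matches the instruction to draw a bar representing 100% of the group.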

Page 37: Chapters 1 & 3 Graphical Methods for Describing Data.

Pie (Circle) Chart

When to Use
Categorical data

How to construct
– Draw a circle to represent the entire data set
– Calculate the size of each “slice”: Relative frequency × 360°
– Using a protractor, mark off each slice

To describe
– comment on which category had the largest proportion or smallest proportion

Page 38: Chapters 1 & 3 Graphical Methods for Describing Data.

Typos on a résumé do not make a very good impression when applying for a job. Senior executives were asked how many typos in a résumé would make them not consider a job candidate. The resulting data are summarized in the table below.

Number of Typos   Frequency   Relative Frequency
1                 60          .40
2                 54          .36
3                 21          .14
4 or more         10          .07
Don’t know        5           .03

Create a pie chart for these data.

Page 39: Chapters 1 & 3 Graphical Methods for Describing Data.

Number of Typos   Frequency   Relative Frequency
1                 60          .40
2                 54          .36
3                 21          .14
4 or more         10          .07
Don’t know        5           .03

First draw a circle to represent the entire data set.

Next, calculate the size of the slice for “1 typo”: .40 × 360º = 144º

Draw that slice.

Repeat for each slice.

Here is the completed pie chart created using Minitab.

What does this pie chart tell us about the number of typos occurring in résumés before the applicant would not be considered for a job?
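
The slice calculation (relative frequency × 360°) can be repeated for every category with a short sketch like the one below. The relative frequencies are the rounded values from the table; the dictionary names are illustrative.

```python
# Rounded relative frequencies from the résumé-typo table.
rel_freq = {
    "1": 0.40,
    "2": 0.36,
    "3": 0.14,
    "4 or more": 0.07,
    "Don't know": 0.03,
}

# Slice angle in degrees = relative frequency x 360.
angles = {category: round(rf * 360, 1) for category, rf in rel_freq.items()}

for category, angle in angles.items():
    print(f"{category:<12}{angle:>6.1f} degrees")
```

The angles add up to 360° here because the rounded relative frequencies in the table happen to sum to exactly 1.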

Page 40: Chapters 1 & 3 Graphical Methods for Describing Data.

Graphs for numerical data

Page 41: Chapters 1 & 3 Graphical Methods for Describing Data.

Dotplot

When to Use
Small numerical data sets

How to construct
– Draw a horizontal line and mark it with an appropriate numerical scale
– Locate each value in the data set along the scale and represent it by a dot. If there are two or more observations with the same value, stack the dots vertically
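
The construction steps above can be sketched as a small text dotplot. This is only an illustration: the `dotplot` function is hypothetical, it assumes single-digit whole-number data so the columns line up, and the sample values are invented.

```python
from collections import Counter

def dotplot(values):
    """Return a text dotplot: one '*' per observation, stacked over its value."""
    counts = Counter(values)
    lo, hi = min(values), max(values)
    lines = []
    # Build rows from the top of the tallest stack down to the axis.
    for level in range(max(counts.values()), 0, -1):
        row = " ".join("*" if counts[v] >= level else " "
                       for v in range(lo, hi + 1))
        lines.append(row.rstrip())
    # Numerical scale along the horizontal line.
    lines.append(" ".join(str(v) for v in range(lo, hi + 1)))
    return "\n".join(lines)

# Hypothetical small data set (invented for illustration).
print(dotplot([0, 0, 1, 2, 2, 2, 4, 0, 1, 6]))
```

Values with no observations still get a position on the scale, which is what lets gaps and unusual values show up in the plot.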

Page 42: Chapters 1 & 3 Graphical Methods for Describing Data.

Dotplot (continued)

What to Look For
– The representative or typical value
– The extent to which the data values spread out
– The nature of the distribution along the number line
– The presence of unusual values

Collect the following data and then display the data in a dotplot:

How many body piercings do you have?

Page 43: Chapters 1 & 3 Graphical Methods for Describing Data.

How to describe a numerical, univariate graph

Page 44: Chapters 1 & 3 Graphical Methods for Describing Data.

What strikes you as the most distinctive difference among the distributions of exam scores in classes A, B, & C?

Page 45: Chapters 1 & 3 Graphical Methods for Describing Data.

1. Center

• discuss where the middle of the data falls

• three measures of central tendency – mean, median, & mode

The mean and/or median is typically reported rather than the mode.

Page 46: Chapters 1 & 3 Graphical Methods for Describing Data.

What strikes you as the most distinctive difference among the distributions of scores in classes D, E, & F?

Page 47: Chapters 1 & 3 Graphical Methods for Describing Data.

2. Spread

• discuss how spread out the data is

• refers to the variability in the data

• Measures of spread are range, standard deviation, and IQR

Remember, Range = maximum value – minimum value

Standard deviation & IQR will be discussed in Chapter 4

Page 48: Chapters 1 & 3 Graphical Methods for Describing Data.

What strikes you as the most distinctive difference among the distributions of exam scores in classes G, H, & I?

Page 49: Chapters 1 & 3 Graphical Methods for Describing Data.

3. Shape

• refers to the overall shape of the distribution

The following slides will discuss these shapes.

Page 50: Chapters 1 & 3 Graphical Methods for Describing Data.

Symmetrical

• refers to data in which both sides are (more or less) the same when the graph is folded vertically down the middle

• bell-shaped is a special type–has a center mound with two sloping tails

1. Collect data by rolling two dice and recording the sum of the two dice. Repeat three times.

2. Plot your sums on the dotplot on the board.

3. What shape does this distribution have?
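
The class collects this distribution by rolling dice, but its expected shape can also be previewed by listing all 36 equally likely outcomes of two dice. This is a sketch of that enumeration, not a simulation of the class activity.

```python
from collections import Counter
from itertools import product

# Every ordered pair (first die, second die), each equally likely.
sums = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Text dotplot of how many of the 36 outcomes give each sum.
for total in range(2, 13):
    print(f"{total:>2} {'*' * sums[total]}")
```

The counts rise from 1 outcome for a sum of 2 to 6 outcomes for a sum of 7 and then fall in mirror image, which is the symmetrical mound shape described on this slide. A small class sample will only approximate it.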

Page 51: Chapters 1 & 3 Graphical Methods for Describing Data.

Uniform

• refers to data in which every class has equal or approximately equal frequency

1. Collect data by rolling a single die and recording the number rolled. Repeat five times.

2. Plot your numbers on the dotplot on the board.

3. What shape does this distribution have?

To help remember the name for this shape, picture soldiers standing in straight lines. What are they wearing?

Page 52: Chapters 1 & 3 Graphical Methods for Describing Data.

Skewed

• refers to data in which one side (tail) is longer than the other side

• the direction of skewness is on the side of the longer tail

1. Collect data by finding the age of five coins in circulation (current year minus year of coin) and recording each age.

2. Plot the ages on the dotplot on the board.

3. What shape does this distribution have?

The directions are right skewed or left skewed.

Name a variable with a distribution that is skewed left.

Page 53: Chapters 1 & 3 Graphical Methods for Describing Data.

Bimodal (multi-modal)

• refers to the number of peaks in the shape of the distribution

• Bimodal would have two peaks
• Multi-modal would have more than two peaks

Bimodal distributions can occur when the data set consists of observations from two different kinds of individuals or objects.

Suppose we collect data on the time it takes to drive from San Luis Obispo, California to Monterey, California. Some people may take the inland route (approximately 2.5 hours) while others may take the coastal route (between 3.5 and 4 hours).

What shape would this distribution have?

What would a distribution be called if it had ONLY one peak? Unimodal

Page 54: Chapters 1 & 3 Graphical Methods for Describing Data.

3. Shape

• refers to the overall shape of the distribution

• symmetrical, uniform, skewed, or bimodal

Page 55: Chapters 1 & 3 Graphical Methods for Describing Data.

What strikes you as the most distinctive difference among the distributions of exam scores in class J?

Page 56: Chapters 1 & 3 Graphical Methods for Describing Data.

4. Unusual occurrences

• Outlier - value that lies away from the rest of the data

• Gaps

• Clusters

Page 57: Chapters 1 & 3 Graphical Methods for Describing Data.

5. In context
• You must write your answer in reference to the context in the problem, using correct statistical vocabulary and using complete sentences!

Page 58: Chapters 1 & 3 Graphical Methods for Describing Data.

Dotplot (continued)

What to Look For
– The representative or typical value
– The extent to which the data values spread out
– The nature of the distribution along the number line
– The presence of unusual values

Collect the following data and then display the data in a dotplot:

How many body piercings do you have?

Describe the distribution of the number of body

piercings the class has.

Page 59: Chapters 1 & 3 Graphical Methods for Describing Data.

Numerical Graphs Continued

Page 60: Chapters 1 & 3 Graphical Methods for Describing Data.

Stem-and-Leaf Displays

When to Use
Univariate numerical data

How to construct
– Select one or more of the leading digits for the stem
– List the possible stem values in a vertical column
– Record the leaf for each observation beside each corresponding stem value
– Indicate the units for stems and leaves in a key or legend

To describe
– comment on the center, spread, and shape of the distribution and if there are any unusual features

Each number is split into two parts:
Stem – consists of the first digit(s)
Leaf – consists of the final digit(s)

Use for small to moderate sized data sets. Doesn’t work well for large data sets.

Be sure to list every stem from the smallest to the largest value.

If you have a long list of leaves behind a few stems, you can split stems in order to spread out the distribution.

Can also create comparative stem-and-leaf displays.

Remember the data set collected in Chapter 1 – how many piercings do you have? Would a stem-and-leaf display be a good graph for this distribution? Why or why not?

Page 61: Chapters 1 & 3 Graphical Methods for Describing Data.

The following data are price per ounce for various brands of dandruff shampoo at a local grocery store.

0.32 0.21 0.29 0.54 0.17 0.28 0.36 0.23

Create a stem-and-leaf display with this data.

What would an appropriate stem be?

List the stems vertically.

For the observation of “0.32”, write the 2 behind the “3” stem.

Continue recording each leaf with the corresponding stem.

Stem | Leaf
1    | 7
2    | 1 9 8 3
3    | 2 6
4    |
5    | 4

Describe this distribution.

The median price per ounce for dandruff shampoo is $0.285, with a range of $0.37. The distribution is positively skewed with an outlier at $0.54.
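
The display and the two summary values quoted above can be checked with a short sketch. It assumes the tenths digit is the stem and the hundredths digit is the leaf, and it works in whole cents to sidestep floating-point digit extraction; the variable names are illustrative.

```python
from collections import defaultdict
from statistics import median

prices = [0.32, 0.21, 0.29, 0.54, 0.17, 0.28, 0.36, 0.23]

# Convert dollars to whole cents so digits can be split exactly.
cents = [round(p * 100) for p in prices]

leaves = defaultdict(list)
for c in cents:
    stem, leaf = divmod(c, 10)      # 32 cents -> stem 3, leaf 2
    leaves[stem].append(leaf)

# List every stem from smallest to largest, including empty ones.
for stem in range(min(leaves), max(leaves) + 1):
    row = " ".join(str(leaf) for leaf in leaves.get(stem, []))
    print(f"{stem} | {row}")

price_median = median(cents) / 100              # middle of the sorted prices
price_range = (max(cents) - min(cents)) / 100   # maximum minus minimum
print(price_median, price_range)
```

The empty row for stem 4 is what makes the gap before $0.54 visible, which supports describing that value as an outlier.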

Page 62: Chapters 1 & 3 Graphical Methods for Describing Data.

The Census Bureau projects the median age in 2030 for the 50 states and Washington D.C. A stem-and-leaf display is shown below.

Notice that you really cannot see a distinctive shape for this distribution due to the long list of leaves.

We can split the stems in order to better see the shape of the distribution.

Notice that now you can see the shape of this distribution.

We use L for lower leaf values (0-4) and H for higher leaf values (5-9).

Page 63: Chapters 1 & 3 Graphical Methods for Describing Data.

The following is data on the percentage of primary-school-aged children who are enrolled in school for 19 countries in Northern Africa and for 23 countries in Central Africa.

Northern Africa
54.6 34.3 48.9 77.8 59.6 88.5 97.4 92.5 83.9 98.8
91.6 97.8 96.1 92.2 94.9 98.6 86.6 96.9 88.9

Central Africa
58.3 34.6 35.5 45.4 38.6 63.8 53.9 61.9 69.9 43.0
85.0 63.4 58.4 61.9 40.9 73.9 34.8 74.4 97.4
61.0 66.7 79.6

Create a comparative stem-and-leaf display. What is an appropriate stem?

Let’s truncate the leaves to the unit place.

“4.6” becomes “4”

Be sure to use comparative language when describing these distributions!

The median percentage of primary-school-aged children enrolled in school is larger for countries in Northern Africa than in Central Africa, but the ranges are about the same. The distribution for countries in Northern Africa is strongly negatively skewed, but the distribution for countries in Central Africa is approximately symmetrical.
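
A comparative (back-to-back) display with truncated leaves could be sketched as below. The values are exactly those listed in this transcript, which gives 22 Central Africa values even though the slide mentions 23 countries. Placing Northern Africa on the left is an arbitrary choice, and `stem_leaves` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import median

northern = [54.6, 34.3, 48.9, 77.8, 59.6, 88.5, 97.4, 92.5, 83.9, 98.8,
            91.6, 97.8, 96.1, 92.2, 94.9, 98.6, 86.6, 96.9, 88.9]
central = [58.3, 34.6, 35.5, 45.4, 38.6, 63.8, 53.9, 61.9, 69.9, 43.0,
           85.0, 63.4, 58.4, 61.9, 40.9, 73.9, 34.8, 74.4, 97.4,
           61.0, 66.7, 79.6]  # 22 values as transcribed

def stem_leaves(values):
    """Map tens digit (stem) to sorted units digits (leaves), truncating decimals."""
    table = defaultdict(list)
    for v in values:
        stem, leaf = divmod(int(v), 10)   # int() truncates: 54.6 -> 54 -> 5 | 4
        table[stem].append(leaf)
    return {stem: sorted(ls) for stem, ls in table.items()}

north, cent = stem_leaves(northern), stem_leaves(central)

# Shared stems in the middle; left-hand leaves read outward from the stem.
for stem in range(3, 10):
    left = " ".join(str(l) for l in reversed(north.get(stem, [])))
    right = " ".join(str(l) for l in cent.get(stem, []))
    print(f"{left:>20} | {stem} | {right}")

print(median(northern), median(central))
```

With these values the Northern Africa median is well above the Central Africa median, and most Northern Africa leaves pile up on the 9 stem, consistent with the comparison written on the slide.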

Page 64: Chapters 1 & 3 Graphical Methods for Describing Data.

Histograms

When to Use
Univariate numerical data

How to construct (discrete data)
– Draw a horizontal scale and mark it with the possible values for the variable
– Draw a vertical scale and mark it with frequency or relative frequency
– Above each possible value, draw a rectangle centered at that value with a height corresponding to its frequency or relative frequency

To describe
– comment on the center, spread, and shape of the distribution and if there are any unusual features

Constructed differently for discrete versus continuous data

For comparative histograms – use two separate graphs with the same scale on the horizontal axis

Page 65: Chapters 1 & 3 Graphical Methods for Describing Data.

Queen honey bees mate shortly after they become adults. During a mating flight, the queen usually takes several partners, collecting sperm that she will store and use throughout the rest of her life. A study on honey bees provided the following data on the number of partners for 30 queen bees.

12 2 4 6 6 7 8 7 8 11 8 3 5 6 7 10 1 9 7 6 9 7 5 4 7 4 6 7 8 10

Create a histogram for the number of partners of the queen bees.
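
The heights of the rectangles in this histogram come from tallying how often each value occurs. A minimal sketch with the 30 observations listed above (variable names are illustrative):

```python
from collections import Counter

partners = [12, 2, 4, 6, 6, 7, 8, 7, 8, 11, 8, 3, 5, 6, 7,
            10, 1, 9, 7, 6, 9, 7, 5, 4, 7, 4, 6, 7, 8, 10]

freq = Counter(partners)   # frequency of each number of partners
n = len(partners)          # 30 queen bees

# One row per possible value; bar length is the frequency.
for value in range(min(partners), max(partners) + 1):
    count = freq[value]
    print(f"{value:>2} {'#' * count:<8} {count} ({count / n:.3f})")
```

Switching the vertical axis from frequency to relative frequency divides every height by the same number, 30, so the bars keep the same relative heights.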

Page 66: Chapters 1 & 3 Graphical Methods for Describing Data.

First draw a horizontal axis, scaled with the possible values of the variable of interest.

Next draw a vertical axis, scaled with frequency or relative frequency.

Draw a rectangle above each value with a height corresponding to the frequency.

Suppose we use relative frequency instead of frequency on the vertical axis.

What do you notice about the shapes of these two histograms?

[Figure: histogram of the number of partners; horizontal axis: 0 to 12; vertical axis: frequency, 0 to 7]

Page 67: Chapters 1 & 3 Graphical Methods for Describing Data.

Histograms

When to Use
Univariate numerical data

How to construct (continuous data)
– Mark the boundaries of the class intervals on the horizontal axis
– Draw a vertical scale and mark it with frequency or relative frequency
– Draw a rectangle directly above each class interval with a height corresponding to its frequency or relative frequency

To describe
– comment on the center, spread, and shape of the distribution and if there are any unusual features

This is the type of histogram that most students are familiar with.

Page 68: Chapters 1 & 3 Graphical Methods for Describing Data.

A study examined the number of hours spent watching TV per day for a sample of children age 1 and for a sample of children age 3. Below are comparative histograms.

Children Age 1 Children Age 3

Notice the common scale on the horizontal axis

Write a few sentences comparing the distributions.

The median number of hours spent watching TV per day was greater for the 1-year-olds than for the 3-year-olds. The distribution for the 3-year-olds was more strongly skewed right than the distribution for the 1-year-olds, but the two distributions had similar ranges.

Page 69: Chapters 1 & 3 Graphical Methods for Describing Data.

Cumulative Relative Frequency Plot

When to use
– used to answer questions about percentiles

How to construct
– Mark the boundaries of the intervals on the horizontal axis
– Draw a vertical scale and mark it with relative frequency
– Plot the point corresponding to the upper end of each interval with its cumulative relative frequency, including the beginning point
– Connect the points

A percentile is a value with a given percent of observations at or below that value.

Page 70: Chapters 1 & 3 Graphical Methods for Describing Data.

The National Climatic Center has been collecting weather data for many years. The annual rainfall amounts for Albuquerque, New Mexico from 1950 to 2008 were used to create the frequency distribution below.

Annual Rainfall (in inches)   Relative frequency   Cumulative relative frequency
4 to <5                       0.052                0.052
5 to <6                       0.103                0.155  (0.052 + 0.103)
6 to <7                       0.086                0.241  (0.155 + 0.086)
7 to <8                       0.103
8 to <9                       0.172
9 to <10                      0.069
10 to <11                     0.207
11 to <12                     0.103
12 to <13                     0.052
13 to <14                     0.052

Find the cumulative relative frequency for each interval.

Continue this pattern to complete the table.

Page 71: Chapters 1 & 3 Graphical Methods for Describing Data.

The National Climatic Center has been collecting weather data for many years. The annual rainfall amounts for Albuquerque, New Mexico from 1950 to 2008 were used to create the frequency distribution below.

Annual Rainfall (in inches)   Relative frequency   Cumulative relative frequency
4 to <5                       0.052                0.052
5 to <6                       0.103                0.155
6 to <7                       0.086                0.241
7 to <8                       0.103                0.344
8 to <9                       0.172                0.516
9 to <10                      0.069                0.585
10 to <11                     0.207                0.792
11 to <12                     0.103                0.895
12 to <13                     0.052                0.947
13 to <14                     0.052                0.999

In the context of this problem, explain the meaning of this value.

In the context of this problem, explain the meaning of this value.

Why isn’t this value one (1)?

To create a cumulative relative frequency plot, graph a point for the upper value of the interval and the cumulative relative frequency.

Plot a point for each interval. Plot a starting point at (4,0).

Connect the points.
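
The running totals in the table, and the points to plot, can be generated with a sketch like this one. The relative frequencies are the rounded values from the table; the names are illustrative.

```python
from itertools import accumulate

# (lower, upper) bounds of each rainfall interval, in inches.
intervals = [(4, 5), (5, 6), (6, 7), (7, 8), (8, 9),
             (9, 10), (10, 11), (11, 12), (12, 13), (13, 14)]
rel_freq = [0.052, 0.103, 0.086, 0.103, 0.172,
            0.069, 0.207, 0.103, 0.052, 0.052]

# Each cumulative value adds the current relative frequency to the running total.
cum_rel_freq = [round(c, 3) for c in accumulate(rel_freq)]

# Points for the plot: the starting point, then (upper bound, cumulative value).
points = [(intervals[0][0], 0.0)] + [
    (upper, cum) for (_, upper), cum in zip(intervals, cum_rel_freq)
]

for (lower, upper), rf, cum in zip(intervals, rel_freq, cum_rel_freq):
    print(f"{lower:>2} to <{upper:<3}{rf:>7.3f}{cum:>7.3f}")
```

The final total comes out as 0.999 rather than 1, most likely because each relative frequency in the table was rounded to three decimal places before being added.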

Page 72: Chapters 1 & 3 Graphical Methods for Describing Data.

[Figure: cumulative relative frequency plot; horizontal axis: Rainfall, 2 to 14; vertical axis: Cumulative relative frequency, 0 to 1.0]

What proportion of years had rainfall amounts that were 9.5 inches or less?

Approximately 0.55

Page 73: Chapters 1 & 3 Graphical Methods for Describing Data.

[Figure: cumulative relative frequency plot; horizontal axis: Rainfall, 2 to 14; vertical axis: Cumulative relative frequency, 0 to 1.0]

Approximately 30% of the years had annual rainfall less than what amount?

Approximately 7.5 inches
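
Reading a value off the plot amounts to linear interpolation along the line segment between two plotted points. The sketch below reproduces both readings (the proportion at 9.5 inches and the amount at 30%) from the cumulative table; the function names are hypothetical.

```python
# (rainfall in inches, cumulative relative frequency), including the start point.
points = [(4, 0.0), (5, 0.052), (6, 0.155), (7, 0.241), (8, 0.344), (9, 0.516),
          (10, 0.585), (11, 0.792), (12, 0.895), (13, 0.947), (14, 0.999)]

def proportion_at_or_below(x):
    """Cumulative relative frequency at rainfall x, read along the connecting line."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x is outside the plotted range")

def rainfall_at_proportion(p):
    """Rainfall amount at which the cumulative relative frequency reaches p."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= p <= y1:
            return x0 + (x1 - x0) * (p - y0) / (y1 - y0)
    raise ValueError("p is outside the plotted range")

print(round(proportion_at_or_below(9.5), 2))
print(round(rainfall_at_proportion(0.30), 1))
```

Both results agree with the approximate answers read from the graph, about 0.55 and about 7.5 inches (the interpolated amount is closer to 7.6).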

Page 74: Chapters 1 & 3 Graphical Methods for Describing Data.

[Figure: cumulative relative frequency plot; horizontal axis: Rainfall, 2 to 14; vertical axis: Cumulative relative frequency, 0 to 1.0]

Which interval of rainfall amounts had a larger proportion of years – 9 to 10 inches or 10 to 11 inches? Explain.

The interval 10 to 11 inches, because its slope is steeper, indicating a larger proportion occurred.

Page 75: Chapters 1 & 3 Graphical Methods for Describing Data.

Displaying Bivariate Numerical Data

Page 76: Chapters 1 & 3 Graphical Methods for Describing Data.

Scatterplots

When to Use
Bivariate numerical data

How to construct
- Draw a horizontal scale and mark it with appropriate values of the independent variable
- Draw a vertical scale and mark it with appropriate values of the dependent variable
- Plot each point corresponding to the observations

To describe
- comment on the relationship between the variables

Scatterplots are discussed in much greater depth in Chapter 5.

Page 77: Chapters 1 & 3 Graphical Methods for Describing Data.

Time Series Plots

When to Use
- measurements collected over time at regular intervals

How to construct
- Draw a horizontal scale and mark it with appropriate values of time
- Draw a vertical scale and mark it with appropriate values of the observed variable
- Plot each point corresponding to the observations and connect the points

To describe
- comment on any trends or patterns over time

Can be considered bivariate data where the y-variable is the variable measured and the x-variable is time

Page 78: Chapters 1 & 3 Graphical Methods for Describing Data.

The accompanying time-series plot of movie box office totals (in millions of dollars) over 18 weeks in the summer for 2001 and 2002 appeared in USA Today (September 3, 2002).

Describe any trends or patterns that you see.

