
Scatterplots and time seriesmathsbooks.net/Maths Quest 11A for Queensland/MQA... · 11A...

Date post: 19-Jul-2020

Chapter 11: Scatterplots and time series

Syllabus reference
Strand: Statistics and probability
Core topic: Data collection and presentation

In this chapter
11A Scatterplots
11B Regression lines
11C Time series and trend lines

MQ Maths A Yr 11 - 11 Page 451 Thursday, July 5, 2001 11:03 AM

Maths Quest Maths A Year 11 for Queensland

Introduction

The manager of a small ski resort has a problem. He wants to be able to predict the number of skiers using his resort each weekend in advance, so that he can organise additional resort staffing and catering if needed. He knows that good deep snow will attract skiers in big numbers but scant covering is unlikely to attract a crowd. To investigate the situation further, he collects the following data over twelve consecutive weekends at his resort.

Depth of snow (m)    Number of skiers
0.5                  120
0.8                  250
2.1                  500
3.6                  780
1.4                  300
1.5                  280
1.8                  410
2.7                  320
3.2                  640
2.4                  540
2.6                  530
1.7                  200

As there are two types of data in this example, they are called bivariate data. For each item (weekend), two variables are considered (depth of snow and number of skiers). When analysing bivariate data, we are interested in examining the relationship between the two variables. In the case of the ski resort data, the manager might be interested in answering the following questions.

1. Are visitor numbers related to depth of snow?

2. If there is a relationship between visitor numbers and depth of snow, is it always true or is it just a guide? In other words, how strong is the relationship?

3. How much confidence could be placed in the prediction?

In this chapter we shall examine how questions such as these can be answered by the appropriate presentation of data.


1 When graphing pairs of variables, we commonly have an ‘independent variable’ and a ‘dependent variable’. Explain these terms.

2 By convention, which of the above two variable types is graphed on the horizontal axis?

3 For the following pairs of variables, state which variable would be classified as the dependent variable.
  a age and height
  b distance travelled and time taken (at a fixed speed)
  c temperature and elevation
  d blood alcohol level and reaction time
  e IQ and results on an academic test
  f overtime pay and hours worked
  g value of a car and its age
  h time taken to travel a given distance and speed of travel

4 For each of the pairs of variables in question 3, state what happens to the value of the second variable when there is an increase in the value of the first variable.

5 If we collected data for each of the pairs of variables in question 3 and plotted the points, we would obtain a set of points scattered over the Cartesian plane. Consider a straight line which could be drawn indicating the general trend in each case.
  a Draw up a set of labelled axes (no scale) for each case in question 3.
  b Sketch a straight line which would show the general trend of the relationship between the two variables.

Scatterplots

We shall now look more closely at the data collected by the ski resort manager, and at the three questions he posed. To help answer these questions, the data can be arranged on a scatterplot. Each of the data points is represented by a single visible point on the graph.

When drawing a scatterplot, it is important to choose the correct variable to assign to each of the axes. The convention is to place the independent variable on the x-axis and the dependent variable on the y-axis. The independent variable in an experiment or investigation is the variable that is deliberately controlled or adjusted by the investigator. The dependent variable is the variable that responds to changes in the independent variable.

Neither of the variables involved in the ski resort data was controlled directly by the investigator but ‘Number of skiers’ would be considered the dependent variable because it is likely to change depending on depth of snow. (The snow depth does not depend on numbers of skiers.) As ‘Number of skiers’ is the dependent variable, we graph it on the y-axis and the ‘Depth of snow’ on the x-axis.

[Figure: scatterplot of number of skiers (y-axis, 0 to 800) against depth of snow in metres (x-axis, 0 to 4)]

Notice how the scatterplot for the ski resort data shows a general upward trend. It is not a perfectly straight line but it is still clear that a general trend or relationship has formed: as the depth of snow increases, so too does the number of skiers.
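For readers working without graph paper or a calculator, the ski resort scatterplot can be roughed out in plain text. The sketch below is only an illustration: the helper name `text_scatterplot` and the grid size are assumptions made here, not part of the chapter. It places the independent variable (depth of snow) left to right and the dependent variable (number of skiers) bottom to top.

```python
# Ski resort data from the introduction: (depth of snow in m, number of skiers).
SKI_DATA = [
    (0.5, 120), (0.8, 250), (2.1, 500), (3.6, 780),
    (1.4, 300), (1.5, 280), (1.8, 410), (2.7, 320),
    (3.2, 640), (2.4, 540), (2.6, 530), (1.7, 200),
]


def text_scatterplot(points, width=40, height=12):
    """Return the rows of a rough text scatterplot (hypothetical helper).

    The independent variable runs left to right and the dependent
    variable runs bottom to top. Assumes the x-values are not all equal
    and the y-values are not all equal.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # flip rows so larger y is higher
    return ["".join(line) for line in grid]


rows = text_scatterplot(SKI_DATA)
print("\n".join(rows))
```

The printed stars should drift from the bottom left to the top right, which is the general upward trend described above.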

Once the scatterplot has been drawn, we can determine if any pattern is evident. Worked example 1 shows how, as a general rule, as height increases so does mass.

Worked example 1
The table below shows the height and mass of ten Year 11 students.

Height (cm)   120  124  130  135  142  148  160  164  170  175
Mass (kg)      45   50   54   59   60   65   70   78   75   80

Display the data on a scatterplot.

THINK
1 Show the height on the x-axis and the mass on the y-axis.
2 Plot the point given by each pair.

WRITE
[Figure: scatterplot of mass (kg) against height (cm)]

We can also look to see if the pattern is linear. In worked example 1, although the points are not in a perfect straight line, they approximate a straight line.

The figures below show examples of linear and non-linear relationships.

Linear relationships

[Figure: two scatterplots on x-y axes. Left: positive relationship between variables; that is, as one variable increases, the other variable also increases. Right: negative relationship between variables; that is, as one variable increases, the other variable decreases.]

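Non-linear relationships

[Figure: scatterplots on x-y axes showing non-linear relationships]

In other cases it may be that there is no relationship at all between the two variables. Such a scatterplot would look like the one shown in the figure below.

[Figure: scatterplot on x-y axes showing no relationship between the variables]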

Worked example 2
The table below shows the length and mass of a dozen eggs.

Length (cm)   6.2  3.9  4.5  5.8  7.2  7.6  6.1  6.7  7.3  5.1  6.0  7.3
Mass (g)       60   15   25   50   95  110   55   75   95   35   54   96

a Display this information in a scatterplot.
b Determine if there is any relationship between the length and mass of the eggs and state if the relationship is linear.

THINK
a 1 Display length on the x-axis and mass on the y-axis.
  2 Plot the point given by each pair.
b 1 Study the scatterplot to see if mass increases as length increases.
  2 Study the scatterplot to see if the points seem to approximate a straight line.

WRITE
a [Figure: scatterplot of mass (g) against length (cm)]
b As length increases, so does the mass of the egg.
  The points do not approximate a straight line and so the relationship is not linear.

An understanding of the relationship between two variables offers a tool for future planning by businesses. Umbrellas can be stocked in preparation for the wet season, heaters can be stocked for an oncoming winter, and stocks of ice-cream can be kept much lower during winter. An appreciation of these relationships provides an opportunity for a business to gain an edge over its competitors.

Investigation: The graphics calculator and statistical data

A graphics calculator is a great labour-saving device when it comes to statistical calculations and display of data. If you have one readily available, its use in this chapter will greatly reduce your manual calculations.

Let us repeat worked example 2 using a graphics calculator. (Instructions here apply to the Texas Instruments TI–83 calculator. Other brands offer similar functions.)

1 Clear the Y= editor and turn off any existing plots by pressing [2nd] [STAT PLOT] and choosing 4:PlotsOff.
2 Press [STAT] and select 1:Edit. Enter the x-data (length) in L1 and y-data (mass) in L2.
3 Press [2nd] [STAT PLOT] then press [ENTER] to select the options for Plot 1.
4 Turn the plot ON; select the type as a scatterplot; the Xlist as L1; and the Ylist as L2.
5 Press [ZOOM] and choose 9:ZoomStat.
6 The scatterplot shown in worked example 2 will be shown on the screen. Press [TRACE], then use the arrow keys to examine the plot.
7 To examine the summary statistics of the x- and y-variables, press [STAT], then use the right arrow key to highlight CALC. Option 1 displays statistics of the variable x (mean, standard deviation, quartiles, minimum and maximum values etc.) while option 2 displays statistics for both variables. Note the LinReg(ax + b) (linear regression function) available. We will use this function later to determine the equation of the best straight line which fits the data in the scatterplot.
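If no graphics calculator is at hand, some of the summary statistics mentioned in step 7 can be approximated in software. The following is a minimal sketch using Python's standard `statistics` module on the egg data from worked example 2. The function name `summarise` is an assumption made for illustration, and quartiles are left out because calculators and software do not always compute them the same way.

```python
import statistics

# Egg data from worked example 2.
lengths_cm = [6.2, 3.9, 4.5, 5.8, 7.2, 7.6, 6.1, 6.7, 7.3, 5.1, 6.0, 7.3]
masses_g = [60, 15, 25, 50, 95, 110, 55, 75, 95, 35, 54, 96]


def summarise(values):
    """Return a few one-variable summary statistics as a dictionary."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sample_sd": statistics.stdev(values),  # divides by n - 1
        "min": min(values),
        "median": statistics.median(values),
        "max": max(values),
    }


length_summary = summarise(lengths_cm)
mass_summary = summarise(masses_g)
for name, summary in (("length (cm)", length_summary), ("mass (g)", mass_summary)):
    print(name, summary)
```

Each list is summarised separately here, which corresponds loosely to running the one-variable option once per list.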

Remember
1. A scatterplot is a graph that is used to compare two variables.
2. One variable (the independent variable) is on the horizontal axis and the other variable (the dependent variable) is on the vertical axis.
3. Each point is plotted from the pair of values formed by the two variables.
4. A relationship between the variables exists if one increases as the other increases or if one decreases as the other increases.
5. If the points on the scatterplot seem to approximate a straight line, the relationship can be said to be linear.


11A Scatterplots

Use a graphics calculator if you have one available. (Questions in this exercise relate to worked examples 1 and 2; an Excel spreadsheet, ‘Scatterplot’, accompanies the exercise.)

1 The table below shows the marks obtained by a group of 10 students in history and geography.

History     36  65  82  72  58  39  58  74  82  66
Geography   45  78  66  72  50  51  61  70  60  88

Display this information on a scatterplot.

2 The table below shows the maximum temperature each day together with the number of people who attend the cinema that day.

Temperature (°C)    25   33   30   22   15   18   27   22   28   20
No. at cinema      256  184  190  312  458  401  200  357  312  423

Display the information on a scatterplot.

3 The table below shows the wages of 20 people and the amount of money they spend each week on entertainment.

Wages ($)                           370  380  500  510  395  430  535  490  495  550
Amount spent on entertainment ($)    55   85  150   75  145  100  130  115   70  150

Wages ($)                           810  460  475  520  530  475  610  780  350  460
Amount spent on entertainment ($)   220   50  100  150  140  160   90  130   40   50

Display this information on a scatterplot.

4 The table below shows the marks obtained by nine students in English and History.

English   55  20  27  33  73  18  37  51  79
History   72  37  53  74  73  44  59  55  84

a Display the information on a scatterplot.
b Is there any relationship between the marks obtained in English and in History? If there does appear to be a relationship, is the relationship linear?

5 The table below shows the daily temperature and the number of hot pies sold at the school canteen.

Temperature (°C)   24  32  28  23  16   14  26  20  29  21
No. of pies sold   56  20  24  60  84  120  70  95  36  63

a Display the information on a scatterplot.
b Determine if there appears to be any relationship between the two variables and if the relationship appears to be linear.

6 Container ships arriving on a wharf are unloaded by work teams. The table below shows the number of people in the work team and the time taken to unload the container ship.

No. in work team   15  18  12  19  22  21  17  16  18  20
Hours taken        20  16  25  15  14  13  18  20  17  14

a Display the information on a scatterplot.
b Determine if there appears to be a relationship between the number of people in the work team and the time taken to unload the container ship. If there is a relationship, does the relationship appear to be linear?

7 Multiple choice
Which of the following scatterplots does not display a linear relationship?
[Figure: four scatterplots on x-y axes, labelled A to D]

8 Multiple choice
In which of the following is no relationship evident between the variables?
[Figure: four scatterplots on x-y axes, labelled A to D]

9 Give an example of a situation where the scatterplot may look like the ones below.
[Figure: two scatterplots on x-y axes, labelled a and b]

WorkSHEET 11.1

Investigation: Collecting bivariate data
1 Choose one of the following and collect data from within your class.
  a Each person’s hand span and height.
  b Each person’s resting heart rate and the time it takes for them to run 400 m.
  c Each person’s mark in mathematics and in science.
2 Display the results on a scatterplot.
3 Discuss any relationship that may be evident between the two variables.

Fitting straight lines to bivariate data
The process of ‘fitting’ straight lines to bivariate data enables us to analyse relationships between the data and possibly make predictions based on the given data set.

Fitting a straight line by eye (line of best fit)
Consider the set of bivariate data points shown in the figure below. In this case the x-values could be the hand lengths of a group of people, while y-values could be the lengths of their feet. We wish to determine a linear relationship between these two random variables.

[Figure: scatterplot of bivariate data points on x-y axes]

Of course, there is no single straight line which would go through all the points, so we can only estimate such a line.

Furthermore, the more closely the points appear to be on or near a straight line, the more confident we are that such a linear relationship may exist and the more accurate our fitted line should be.

Consider the estimate, drawn ‘by eye’ in the figure below. It is clear that most of the points are on or very close to this straight line. This line was easily drawn since the points are very much part of an apparent linear relationship.

[Figure: the same scatterplot with a straight line fitted by eye]

However, note that some points are below the line and some are above it. Furthermore, if x is the hand length and y is the foot length, it seems that people’s feet are generally longer than their hands.

Regression analysis is concerned with finding these straight lines of best fit using various methods so that the numbers of points above and below the line are ‘balanced’.

Methods of drawing lines of best fit
There are many different methods of fitting a straight line by eye. They may appear logical or even obvious but fitting by eye involves a considerable margin of error. We are going to consider only one method: fitting the line by balancing the number of points.

The technique of balancing the number of points involves fitting a line so that there is an equal number of points above and below the line. For example, if there are 12 points in the data set, 6 should be above the line and 6 below it.

As mentioned above, estimating the position for the line of best fit by eye involves a considerable margin of error. For this reason, when making predictions of one variable from another using this graphical technique, the forecasts obtained can be considered rough estimates at best. A graphics calculator allows much more reliable predictions to be made.
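The balancing idea can be checked numerically once a candidate line has been chosen. The sketch below counts how many of the ski resort points lie above and below a line; the function name `count_above_below` is an assumption made for illustration. The candidate line used is the calculator equation y = 186.4x + 28.3 quoted later in this chapter for the same data.

```python
# Ski resort data from the introduction: (depth of snow in m, number of skiers).
SKI_DATA = [
    (0.5, 120), (0.8, 250), (2.1, 500), (3.6, 780),
    (1.4, 300), (1.5, 280), (1.8, 410), (2.7, 320),
    (3.2, 640), (2.4, 540), (2.6, 530), (1.7, 200),
]


def count_above_below(points, gradient, intercept):
    """Count the points above, below and exactly on y = gradient * x + intercept."""
    above = below = on_line = 0
    for x, y in points:
        line_y = gradient * x + intercept
        if y > line_y:
            above += 1
        elif y < line_y:
            below += 1
        else:
            on_line += 1
    return above, below, on_line


# Candidate line: the calculator's regression equation for this data.
balance = count_above_below(SKI_DATA, 186.4, 28.3)
print(balance)
```

For this data the count comes out as 8 points above and 4 below, which suggests that the calculator's line and a line drawn with the equal-number-of-points method need not coincide. Trying other gradients and intercepts shows how a by-eye line would have to shift to reach a 6 and 6 split.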

Worked example 3
Draw the line of best fit for the data in the figure using the equal-number-of-points method.

[Figure: scatterplot of 8 points on x-y axes]

THINK
1 Note that the number of points (n) is 8.
2 Fit a line where 4 points are above the line. Using a clear plastic ruler, try to fit the best line.
3 The first attempt has only 3 points below the line where there should be 4. Make refinements.
4 The second attempt is an improvement, but the line is too close to the points above it. Improve the position of the line until a better ‘balance’ between upper and lower points is achieved.

DRAW
[Figures for steps 2 to 4: successive attempts at the line drawn on the scatterplot]

Regression lines
After drawing a scatterplot, we look for a relationship between the variables. If there is a relationship, in many cases it appears to be linear. Once it has been established that a linear relationship exists, we can introduce a regression line that will enable us to make predictions about the two variables.

Consider the example at the beginning of the chapter where the manager of the ski resort wants to predict the number of skiers at the resort on the weekend. Based on past data, we drew a scatterplot.

[Figure: scatterplot of number of skiers against depth of snow (m)]

On the scatterplot, we can draw the line of best fit. This line of best fit can be extended and then used to make predictions about the data. When this is done, the line is called a regression line. In the above example the depth of snow can then be used to predict the number of visitors to the resort.

[Figure: the same scatterplot with the line of best fit drawn]

Making predictions
If a linear relationship exists between a pair of variables, it is useful to be able to make predictions of one variable from the other. There are two methods available:

1. Algebraically: we could find an equation for this regression line (or line of best fit), substitute the value of the known variable and so calculate the value of the unknown variable. A graphics calculator will generate the equation of the regression line from data entered in lists.

2. Graphically: we can draw lines horizontally or vertically from the known value to the line, then read off the corresponding value of the unknown variable. Using a graphics calculator, we can use the ‘trace’ function to determine variable values on the line.

Algebraic predictions

Using a graphics calculator
If you have access to a graphics calculator, the equation of the regression line can readily be produced from the two sets of entered data.

Manually
To calculate the line of best fit, we must determine the gradient and y-intercept of the drawn line.

Investigation: Equation of a line of best fit or regression line

To have the graphics calculator automatically find the equation of a line of best fit (also called a regression line), follow these steps. The example below uses the skiers and snow depth data given at the beginning of this chapter.
1. Enter the depth data in L1 and number of skiers in L2 (first press [STAT], and select EDIT and 1:Edit...).
2. Press [STAT], select CALC and choose 4:LinReg(ax + b).
3. Enter L1, L2, Y1 (that is, press [2nd] [L1] [,] [2nd] [L2] [,]; Y1 is found under VARS/Y-VARS/Function).
4. Press [ENTER].
5. Ensure a STATPLOT is set up by entering [2nd] [STATPLOT], setting Plot 1 to ON, setting the Type for a scatterplot, Xlist as L1 and Ylist as L2.
6. Adjust WINDOW settings to cope with extremes of the data, then press GRAPH.
7. Use [2nd] [CALC] and select 1:Value or press [TRACE] to find values on the line.
8. The graphics calculator shows the line of best fit to have the equation y = 186.4x + 28.3.
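The calculator result for the ski resort data can be cross-checked in software. The sketch below assumes that LinReg(ax + b) performs an ordinary least-squares fit, and computes the gradient and intercept from the usual sums of deviations; the function name `least_squares_line` is chosen here for illustration only.

```python
# Ski resort data from the introduction: (depth of snow in m, number of skiers).
SKI_DATA = [
    (0.5, 120), (0.8, 250), (2.1, 500), (3.6, 780),
    (1.4, 300), (1.5, 280), (1.8, 410), (2.7, 320),
    (3.2, 640), (2.4, 540), (2.6, 530), (1.7, 200),
]


def least_squares_line(points):
    """Return (gradient, intercept) of the least-squares line y = a*x + b."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    s_xx = sum((x - mean_x) ** 2 for x, _ in points)
    gradient = s_xy / s_xx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


gradient, intercept = least_squares_line(SKI_DATA)
print(f"y = {gradient:.1f}x + {intercept:.1f}")
```

Rounded to one decimal place, the gradient and intercept agree with the equation y = 186.4x + 28.3 given in step 8.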

Worked example 4
The table below shows the marks of 10 students in a physics and chemistry quiz.

Physics     2  5  6  7  7  8  8  9   9  10
Chemistry   4  7  5  8  6  6  9  7  10   9

Draw a scatterplot and on the graph show the regression line.

THINK
1 Put physics on the x-axis and chemistry on the y-axis.
2 Plot the point corresponding to each pair of marks.
3 Add a line of regression.

WRITE
[Figure: scatterplot of chemistry mark against physics mark (both axes 0 to 10) with the regression line drawn]

Once the regression line has been drawn, to make predictions, it is useful to find its equation.

The equation of a straight line has the form y = mx + c, where m is the gradient and c is the y-intercept. The gradient of a regression line is best found with a ruler and pencil by measuring the rise and the run from two points on the line of regression. The gradient is then found using:

Gradient (m) = rise/run

The y-intercept can then be seen by noting the point where the line crosses the y-axis.

Consider worked example 4.

[Figure: the scatterplot from worked example 4 with the regression line, marked rise = 5, run = 8 and y-intercept = approx. 3]

From this figure we can see that the equation of the regression line is approximately y = (5/8)x + 3.

That is, chemistry = 5/8 × physics + 3.

A teacher may then be able to estimate a student’s chemistry mark by using their result in physics. For example, a student who got a mark of 5 in physics may get:

y = 5/8 × 5 + 3
  = 6 1/8

The teacher would estimate a mark of 6 for this student in chemistry.

Use a graphics calculator to determine the equation of the regression line for the relationship between the chemistry and physics marks. Your resulting equation should be y = 0.6x + 2.8 (close to the one determined here).

Related resources: SkillSHEET 11.1; Cabri Geometry: Gradient; Cabri Geometry: Graphs and y-intercept.
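The manual method (measure the rise and the run, read off the y-intercept, then substitute) can be sketched in a few lines. The example below uses exact fractions so that the working matches hand calculation; the function name `line_from_rise_run` is an assumption made for illustration, and the numbers are those read from the physics and chemistry graph.

```python
from fractions import Fraction


def line_from_rise_run(rise, run, y_intercept):
    """Return the gradient and a prediction function for y = (rise/run)x + c."""
    gradient = Fraction(rise) / Fraction(run)

    def predict(x):
        return gradient * x + y_intercept

    return gradient, predict


# Values read from the graph: rise = 5, run = 8, y-intercept approx. 3.
gradient, predict_chemistry = line_from_rise_run(rise=5, run=8, y_intercept=3)
estimate = predict_chemistry(5)  # a physics mark of 5
print(gradient, estimate, float(estimate))
```

The estimate is 49/8, that is 6 1/8, which the teacher would round to a chemistry mark of 6.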

Worked example 5
The following table shows the bus fare charged by a bus company for different distances.

Distance (km)   Fare ($)
1.5             2.10
0.5             2.00
7.5             4.50
6               4.00
6               4.50
2.5             2.60
0.5             2.10
8               4.50
4               3.50
3               3.00

a Represent the data using a scatterplot and add the regression line.
b Find the gradient and y-intercept of the regression line; hence, find its equation, manually.
c Enter the data as lists into a graphics calculator and determine the equation of the regression line.

THINK
a 1 Show the distance on the x-axis and the fare on the y-axis.
  2 Plot the points given by each pair.
  3 Add the regression line to the graph.
b 1 Take two points on or closest to the regression line and find the gradient between them by substituting in the formula m = rise/run and simplifying.
  2 Note the y-intercept.
  3 Substitute into the formula y = mx + c to find the equation.
c 1 Enter distances into L1 and fares into L2.
  2 Follow the procedures outlined previously to determine the equation of the regression line.

WRITE
a [Figure: scatterplot of fare ($) against distance (km) with the regression line drawn]
b [Figure: the same graph marked rise = 2 and run = 5.5]
  rise = 2, run = 5.5
  m = 2/5.5
    ≈ 0.36
  y-intercept = 1.8
  y = 0.36x + 1.8
c The equation of the regression line is y = 0.37x + 1.8

(An Excel spreadsheet, ‘Gradient’, accompanies this example.)

Once the equation of the regression line has been found, it is possible to use this equation to make predictions by substituting the value of the given variable into it.

Worked example 6
A casino records the number of people, N, playing a jackpot game and the prize money, p, for that game and plots the results on a scatterplot. The regression line is found to have the equation N = 0.07p + 220.
a Find the number of people playing when the prize money is $2500.
b Find the likely prize on offer when there are 500 people playing.

THINK
a 1 Write the equation of the regression line.
  2 Substitute 2500 for p.
  3 Calculate N.
  4 Give a written answer.
b 1 Write the equation of the regression line.
  2 Substitute 500 for N.
  3 Solve the equation.
  4 Give a written answer.

WRITE
a N = 0.07p + 220
  N = 0.07 × 2500 + 220
    = 395
  There would be approximately 395 people playing.
b N = 0.07p + 220
  500 = 0.07p + 220
  280 = 0.07p
  p = 280/0.07
  p = 4000
  The prize would be approximately $4000.

Graphical predictions
As mentioned earlier, once the regression line has been drawn, horizontal or vertical lines can be added from the axes to the line to read off the value of the unknown variable. Remember that the values so determined are only estimates, and not exact values.
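Whether a prediction is read from the graph or calculated, the line is used in one of two directions: from a known value of the independent variable to the dependent variable, or the reverse. A minimal sketch of the calculated version follows, using the casino equation N = 0.07p + 220 from worked example 6. The function names are assumptions chosen for illustration.

```python
# Regression line from worked example 6: N = 0.07p + 220.
GRADIENT = 0.07
INTERCEPT = 220


def predict_players(prize):
    """Predict the number of players N for prize money p (in dollars)."""
    return GRADIENT * prize + INTERCEPT


def prize_for_players(players):
    """Rearrange N = 0.07p + 220 to find the prize p for a given N."""
    return (players - INTERCEPT) / GRADIENT


players = predict_players(2500)
prize = prize_for_players(500)
print(round(players), round(prize))
```

The results are rounded before printing because decimal values such as 0.07 are stored only approximately in floating point, and the predictions are estimates in any case.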

Worked example 7
Consider the graph drawn in worked example 5 representing bus fares for travelling various distances. Graphically estimate:
a the fare for a journey of 5 km
b how far you could expect to travel for a cost of $3.25.

THINK
a 1 Find the value of 5 km on the x-axis.
  2 Draw a vertical line from this point to meet the regression line.
  3 Draw a horizontal line from this point to the y-axis.
  4 Read this value on the y-axis.
b 1 Locate $3.25 on the y-axis.
  2 Draw a horizontal line from this point to the regression line.
  3 From this point, draw a vertical line to the x-axis.
  4 Read off the x-value at this point.

WRITE
a [Figure: graph of fare ($) against distance (km) with lines drawn from 5 km up to the regression line and across to $3.60, and from $3.25 across to the regression line and down to 4 km]
  A distance of 5 km of bus travel would cost about $3.60.
b For $3.25 you could expect to travel about 4 km.

Investigation: Causality
The term causality refers to one variable causing another. For example, there is a strong relationship between a person’s shoe size and shirt size (as one tends to increase, the other also tends to increase). However, this does not mean that one causes the other.

There also appears to be a strong relationship between the number of cigarettes smoked and the chance of contracting lung cancer. In this case, we are advised that smoking causes lung cancer.

We cannot draw the conclusion that one variable in a relationship causes another just because there is a strong relationship between the two. In fact, sometimes the strong relationship between two variables is caused by their relationship to a third variable. In the example of shoe and shirt sizes, above, it is more likely that they are both dependent on age.

In a small group, for each of the following topics, discuss:
a whether a positive or negative relationship exists between the variables
b how strong the relationship might be
c whether one variable causes the other.

1 Hours of study and exam marks.
2 Hours of exercise and resting pulse rate.
3 Weight and shirt size.
4 The number of hotels and churches in country towns.
5 The number of motels in a town and the number of flights landing at the nearest airport.
6 Age and the time taken to run 100 metres.

Interpolation and extrapolation
We use the term interpolation for the making of predictions from a graph’s regression line from within the bounds of the original experimental data.

Data may be interpolated either algebraically or graphically as shown by the previous examples.

We use the term extrapolation for the making of predictions from a graph’s regression line from outside the bounds of the original experimental data.

Suppose, in the ski resort problem, that we wished to find the number of skiers if the depth of snow was 4.5 m.

The data could be extrapolated graphically by first extending the trend line to make the prediction possible.

[Figure: scatterplot of number of skiers (0 to 1000) against snow depth (0 to 5 m) with the trend line extended]

The graph shows that about 900 skiers would be attracted to the resort if the snow depth was 4.5 m. How reliable is this prediction?

Reliability of predictions
Results predicted (whether algebraically or graphically) from the line of best fit of a scatterplot can be considered reliable only if:
1. a reasonably large number of points are used to draw the scatterplot,
2. the plotted points lie reasonably close to the line of best fit,
3. the predictions are made using interpolation and not extrapolation.

Extrapolated results can never be considered to be reliable because when extrapolation is used we are assuming that the relationship holds true for untested values. In the case of the ski resort data we would be assuming that the relationship continues to hold true even when the snow depth is extreme. This may or may not be the case. It might be that there are only a certain number of skiers in the population, which places a limit on the maximum number visiting the resort. In this situation the trend line, if continued, may look like that in the graph below.

[Figure: scatterplot of number of skiers (0 to 1000) against snow depth (0 to 5 m) with an alternative continuation of the trend line]

Alternatively, it might be that when the snow depth is extreme then skiers cannot even get to the slopes.

So take care when extrapolating data!

Investigation: Crayfish

A graphics calculator is recommended for this activity.

Rock lobsters (crayfish) are sized according to the length of their carapace (main body shell).

The table below gives the age and carapace length of 15 male rock lobsters.

Age (years)   Length of carapace (mm)
3             65
2.5           59
4.5           80
3.25          68
7.75          130
8             150
6.5           112
12            200
14            210
4.5           82
3.5           74
2.25          51
1.76          48
10            171
9.5           160


1 Draw a scatterplot of the data.

2 Draw in the line of best fit.

3 Suggest reasons why the points on the scatterplot are not perfectly linear.

4 Find an equation that represents the relationship between the length of the rock lobster, L, and its age, a.

5 Find the likely size of a 5-year-old male rock lobster.

6 Find the likely size of a 16-year-old male rock lobster.

7 Rock lobsters reach sexual maturity when their carapace length is approximately 65 mm. Find the age of the rock lobster at this stage.

8 The Fisheries Department wishes to set minimum size restrictions so that the rock lobsters have three full years from the time of sexual maturity in which to breed before they can be legally caught. What size should govern the taking of male crayfish?

9 Which of your answers to parts 5 to 8 do you consider reliable? Why?

Investigation: World population

The table below shows world population during the years 1955 to 1985.

1 Use a graphics calculator to create a scatterplot of the data.

2 Does the scatterplot of the Population-versus-Year data appear to be linear?

3 Use the graphics calculator to fit a regression line to the data; hence find the equation of the line and graph it on the scatterplot.

4 Use the equation to predict the world population in 1950.

5 Use the equation to predict the world population in 1982.

6 Use the equation to predict the world population in 1997.

7 Which of your answers to parts 4 to 6 do you consider reliable? Why?

8 In fact the world population in 1997 was 5840 million. Account for the discrepancy between this and your answer to part 6.

9 What would you estimate the world population to be in 2010?

Year                         1955  1960  1965  1970  1975  1980  1985
World population (millions)  2757  3037  3354  3696  4066  4432  4828
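The regression steps in this investigation can also be sketched in a few lines of Python. The sketch below assumes that the line a graphics calculator fits is the ordinary least-squares line; the function names `fit_line` and `predict` are illustrative only. It fits a line to the population table and labels each prediction as interpolation or extrapolation, depending on whether the year lies inside the range of the data.

```python
# World population data from the table above (population in millions).
years = [1955, 1960, 1965, 1970, 1975, 1980, 1985]
population = [2757, 3037, 3354, 3696, 4066, 4432, 4828]


def fit_line(xs, ys):
    """Return (gradient, intercept) of the least-squares line y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


def predict(x, xs, gradient, intercept):
    """Return the predicted y-value and whether x lies inside the data range."""
    kind = "interpolation" if min(xs) <= x <= max(xs) else "extrapolation"
    return gradient * x + intercept, kind


m, c = fit_line(years, population)
for year in (1950, 1982, 1997):
    estimate, kind = predict(year, years, m, c)
    print(f"{year}: about {estimate:.0f} million ({kind})")
```

Only the 1982 estimate lies within the bounds of the data, so by the reliability conditions given earlier it is the only one of the three that can be treated as reasonably reliable.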


Regression lines

Note: For many of these questions your answers may differ somewhat from those in the back of the book. The answers are provided as a guide but there are likely to be individual differences when fitting straight lines by eye.

Where appropriate, the use of a graphics calculator is recommended if one is available.

1 Fit a straight line to the data in each of the scatterplots a to e using the equal-number-of-points method.

Remember

1. To fit a straight line to bivariate data by eye, the line is drawn by balancing the number of points, ensuring an equal number of points above and below the fitted line. This line is called the line of best fit or regression line.

2. The equation of the regression line can be found by measuring the gradient and the y-intercept of the regression line and using the formula y = mx + c. Sometimes the gradient of the regression line will be negative. This occurs if one variable increases while the other variable decreases.

3. The equation of the regression line can also be determined using a graphics calculator.

4. Once the equation of the regression line has been found, it can then be used to make predictions about the variables.

5. Predictions can also be made graphically by drawing lines from the axes to the regression line.

6. Interpolation involves making predictions within the bounds of the original data.

7. Extrapolation involves making predictions beyond the bounds of the original data.

8. Predictions can be considered reliable if they are obtained using interpolation from a large number of points which lie reasonably close to the line of best fit.

[Figure for 11B question 1 not reproduced: scatterplots a to e, each with axes x and y. Margin reference: Worked example 3.]


2 The table below shows the marks achieved by a class of students in English and Mathematics.

English  64  75  81  63  32  56  47  59  73  64
Maths    76  62  89  56  49  57  53  72  80  50

Show these data on a scatterplot and on the graph show the regression line.

3 Position the straight line of best fit through each of the following graphs (a, b and c) and find the equation of each.

[Three graphs not reproduced. Axis ranges: x from 0 to 10 with y from 0 to 70; x from 0 to 120 with y from 0 to 70; x from 0 to 25 with y from 0 to 3000.]

4 In an experiment, a student measures the length of a spring (L) when different masses (M) are attached to it. Her results are shown below.

Mass (g)   Length of spring (mm)
0          220
100        225
200        231
300        235
400        242
500        246
600        250
700        254
800        259
900        264

a Draw a scatterplot of the data and on it draw the line of regression.
b Find the gradient and y-intercept of the regression line and, hence, find the equation of the regression line. Write your equation in terms of the variables L and M.

[Margin references on this page: Worked example 4; Worked example 5; Mathcad file ‘Gradient and y-intercept’.]
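The equal-number-of-points check can be sketched in Python using the spring data from question 4. The candidate line L = 0.05M + 220 below is a hypothetical line of the kind that might be drawn by eye; it is an assumption for illustration, not the book's answer, and `count_sides` is an illustrative helper name. The function counts how many points fall above, on and below the candidate line, so that the balance can be checked.

```python
# Spring data from question 4: mass M (g) and spring length L (mm).
masses = [0, 100, 200, 300, 400, 500, 600, 700, 800, 900]
lengths = [220, 225, 231, 235, 242, 246, 250, 254, 259, 264]


def count_sides(xs, ys, gradient, intercept, tolerance=1e-9):
    """Count the points above, on and below the line y = gradient*x + intercept."""
    above = on = below = 0
    for x, y in zip(xs, ys):
        difference = y - (gradient * x + intercept)
        if difference > tolerance:
            above += 1
        elif difference < -tolerance:
            below += 1
        else:
            on += 1
    return above, on, below


# Hypothetical candidate line drawn by eye: L = 0.05M + 220.
# These two values are an assumption, not the book's answer.
candidate_gradient = 0.05
candidate_intercept = 220

above, on, below = count_sides(masses, lengths, candidate_gradient, candidate_intercept)
print(f"above: {above}, on: {on}, below: {below}")
```

A line is balanced in the sense used here when the counts above and below are equal. Other lines drawn by eye may balance the points just as well, which is why answers are expected to differ slightly.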


5 A scientist who measures the volume of a gas at different temperatures provides the following table of values.

Temperature (°C)   Volume (L)
−40                1.2
−30                1.9
−20                2.4
0                  3.1
10                 3.6
20                 4.1
30                 4.8
40                 5.3
50                 6.1
60                 6.7

a Draw a scatterplot of the data and on it draw the line of regression.
b Give the equation of the line of best fit. Write your equation in terms of the variables: volume of gas, V, and its temperature, T.

6 A sports scientist is interested in the importance of muscle bulk to strength. He measures the biceps circumference of ten people and tests their strength by asking them to complete a lift test. His results are given in the following table.

Circumference of biceps (cm)   Lift test (kg)
25                             50
25                             52
27                             58
28                             51
30                             60
30                             62
31                             53
33                             62
34                             61
36                             66

a Draw a scatterplot of the data and draw the line of regression.
b Find a rule for determining the ability of a person to complete a lift test, S, from the circumference of their biceps, B.


7 A taxi company adjusts its meters so that the fare is charged according to the following equation: F = 1.2d + 3 where F is the fare, in dollars, and d is the distance travelled, in km.
a Find the fare charged for a distance of 12 km.
b Find the fare charged for a distance of 4.5 km.
c Find the distance that could be covered on a fare of $27.
d Find the distance that could be covered on a fare of $13.20.

8 Detectives can use the equation H = 6.1f − 5 to estimate the height of a burglar who leaves footprints behind. (H is the height of the burglar, in cm, and f is the length of the footprint.)
a Find the height of a burglar whose footprint is 27 cm in length.
b Find the height of a burglar whose footprint is 30 cm in length.
c Find the footprint length of a burglar of height 185 cm. (Give your answer correct to 2 decimal places.)
d Find the footprint length of a burglar of height 152 cm. (Give your answer correct to 2 decimal places.)

9 A football match pie seller finds that the number of pies sold is related to the temperature of the day. The situation could be modelled by the equation N = 870 − 23t, where N is the number of pies sold and t is the temperature of the day.
a Find the number of pies sold if the temperature was 5 degrees.
b Find the number of pies sold if the temperature was 25 degrees.
c Find the likely temperature if 400 pies were sold.
d How hot would the day have to be before the pie seller sold no pies at all?

10 The following table shows the average annual costs of running a car. It includes all fixed costs (registration, insurance etc.) as well as running costs (petrol, repairs etc.).

Distance (km)   Annual cost ($)
5 000           4 000
10 000          6 400
15 000          8 400
20 000          10 400
25 000          12 400
30 000          14 400

a Draw a scatterplot of the data.
b Draw in the line of best fit.
c Find an equation which represents the relationship between the cost of running a vehicle, C, and the distance travelled, d.
d Use your graph and its equation to find:
i the annual cost of running a car if it is driven 15 000 km
ii the annual cost of running a car if it is driven 1000 km
iii the likely number of kilometres driven if the annual costs were $8000
iv the likely number of kilometres driven if the annual costs were $16 000.

[Margin references on this page: Worked example 6; Worked example 7; SkillSHEET 11.2; SkillSHEET 11.3; SkillSHEET 11.4.]


11 A market researcher finds that the number of people who would purchase ‘Wise-up’ (the thinking man’s deodorant) is related to its price. He provides the following table of values.

Price ($)   Weekly sales (× 1000)
1.40        105
1.60        101
1.80        97
2.00        93
2.20        89
2.40        85
2.60        81
2.80        77
3.00        73
3.20        69
3.40        65

a Draw a scatterplot of the data.
b Draw in the line of best fit.
c Find an equation that represents the relationship between the number of cans of ‘Wise-up’ sold, N (in thousands), and its price, p.
d Use the equation to predict the number of cans sold each week if:
i the price was $3.10
ii the price was $4.60.
e At what price should ‘Wise-up’ be sold if the manufacturers wished to sell 80 000 cans?
f Given that the manufacturers of ‘Wise-up’ can produce only 100 000 cans each week, at what price should it be sold to maximise production?

12 The following table gives the adult return air fares between some Australian cities.

City                  Distance (km)   Price ($)
Melbourne–Sydney      713             580
Perth–Melbourne       2728            1490
Adelaide–Sydney       1172            790
Brisbane–Melbourne    1370            890
Hobart–Melbourne      559             520
Hobart–Adelaide       1144            820
Adelaide–Melbourne    669             570

a Draw a scatterplot of the data and on it draw the regression line.
b Find an equation that represents the relationship between the air fare, A, and the distance travelled, d.
c Use the equation to predict the likely air fare (to the nearest dollar) from:
i Sydney to the Gold Coast (671 km)
ii Perth to Adelaide (2125 km)
iii Hobart to Sydney (1024 km)
iv Perth to Sydney (3295 km).

[Margin reference on this page: Excel spreadsheet ‘Interpolation/extrapolation’.]


An electrical repair business charges its customers using the formula C = 40h + 35, where C is the cost of the repairs and h is the time taken for the repairs, in hours.

Find the cost of a repair job that took:
1 2 hours
2 5 hours
3 1 hour and 15 minutes.

Estimate the time taken (in hours and minutes) for repairs if the cost of the repairs was:
4 $175
5 $275
6 $145.
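A linear rule such as C = 40h + 35 can be used in both directions: substituting a time to find a cost, or rearranging to h = (C − 35) ÷ 40 to find a time from a cost. The following Python sketch illustrates both uses. The names `FIXED_CHARGE`, `HOURLY_RATE`, `cost` and `time_taken` are illustrative, and reading the constant 35 as a fixed charge in dollars is an interpretation rather than something the question states.

```python
FIXED_CHARGE = 35   # constant term in C = 40h + 35 (assumed to be a fixed charge, in dollars)
HOURLY_RATE = 40    # gradient in C = 40h + 35 (dollars per hour)


def cost(hours):
    """Cost in dollars of a repair taking the given number of hours."""
    return HOURLY_RATE * hours + FIXED_CHARGE


def time_taken(total_cost):
    """Rearrange C = 40h + 35 to h = (C - 35) / 40, then split into hours and minutes."""
    hours = (total_cost - FIXED_CHARGE) / HOURLY_RATE
    whole_hours = int(hours)
    minutes = round((hours - whole_hours) * 60)
    return whole_hours, minutes


print(cost(1 + 15 / 60))   # 1 hour and 15 minutes is 1.25 hours; prints 85.0
print(time_taken(145))     # prints (2, 45), that is, 2 hours 45 minutes
```

Note that times given in hours and minutes must be converted to a decimal number of hours before substitution, and converted back when reporting an answer.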

The information below is to be used for questions 7 to 10.

A survey relating exam marks to the amount of television watched finds that the regression line has the equation M = 95 − 15t, where M is the mark obtained and t is the average number of hours of television watched each night by the students.

7 Estimate the mark of a person who averages one hour of television per night.

8 Estimate the mark of a person who watches an average 4 hours of television per night.

9 Estimate the amount of television watched per night by a person who scores a mark of 65.

10 Jodie scored 27.5 on the exam. Estimate the average amount of television that Jodie watches each night.

Time series and trend lines

We have looked at bivariate, or (x, y), data where both x and y could vary independently. We shall now consider cases where the x-variable is time and, generally, where time goes up in even increments such as hours, days, weeks or even years. In these cases we have what is called a time series. The main purpose of a time series is to see how some quantity varies with time. For example, a company may wish to record its daily sales figures over a 10-day period.

Time       Day 1  Day 2  Day 3  Day 4  Day 5  Day 6  Day 7  Day 8  Day 9  Day 10
Sales ($)  5200   5600   6100   6200   7000   7100   7500   7700   7700   8000

We could also make a graph of this time series as shown in the figure.

[Figure not reproduced: sales ($), from 4000 to 10 000, against time t in days, from 0 to 10.]

As can be seen from this graph, there seems to be a trend upwards: clearly, this company is increasing its revenues!
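The upward trend in the sales table can also be checked numerically. The Python sketch below, offered as one possible way of looking at the data rather than the book's method, computes the change in sales from each day to the next and the average change per day over the whole period.

```python
# Daily sales ($) for days 1 to 10, from the table above.
sales = [5200, 5600, 6100, 6200, 7000, 7100, 7500, 7700, 7700, 8000]

# Change in sales from each day to the next.
daily_changes = [later - earlier for earlier, later in zip(sales, sales[1:])]

# Average change per day over the whole period.
average_change = (sales[-1] - sales[0]) / (len(sales) - 1)

print(daily_changes)
print(f"average change: ${average_change:.0f} per day")
```

None of the daily changes is negative, which is consistent with the upward trend seen in the graph, although ten days is a short period from which to draw conclusions.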


Types of trend

Many types of trend exist. They can be classified as secular, seasonal, cyclic and random.

Secular trends

If over a reasonably long period of time a trend appears to be either increasing or decreasing steadily, with no major changes of direction, then it is called a secular trend. It is important to look at the data over a long period. If the trend in the figure on the previous page continued for, say, 30 days, then we could safely conclude that the company was indeed becoming more profitable. What appears to be a steady increase over a short term (say, in stock market share prices) can turn out to be something quite different over the long run.

Seasonal trends

Certain data seem to fluctuate during the year, as the seasons change. Consequently, this is termed a seasonal trend. The most obvious example of a seasonal trend would be the average monthly maximum temperatures over a year.

A sample of this type of trend is shown in the figure.

[Figure not reproduced: temperature (°C), from 14 to 30, against time t in months, from 0 to 12.]

Note that the months have been coded, so that 1 = Jan., 2 = Feb., and so on. From which hemisphere of the world would these data come?

Cyclic trends

Like seasonal trends, cyclic trends show fluctuations upwards and downwards, but not according to season. Businesses often have cycles where at times profits increase, then decline, then increase again. A good example of this would be the sales of a new major software product, such as a word processor. At first, sales are slow; then they pick up as the product becomes popular. When enough people have bought the product, sales may fall off until a new version of the product comes on the market, causing sales to increase again. This cycle can be repeated many times, which is why there are many versions of some software products.

Random trends

Trends may seem to occur at random. This can be caused by external events such as floods, wars, new technologies or inventions, or anything else that results from random causes. There is no obvious way to predict the direction of the trend or even when it changes direction.

[Figure not reproduced: profits against time t, from 0 to 16.]

In the figure, there are a couple of minor fluctuations at t = 4 and t = 8, and a major one at t = 13. The major fluctuation could have been caused by a change in government which positively affected sales.

The trend line

If we want to predict the future values of a trend, it is important to be able to fit a straight line to the data that we already have. There are a number of techniques which can be used to determine the trend line. We will use the method of fitting the line ‘by eye’ employing the ‘equal number of points’ technique that was used previously.

Worked example 8

Fit a straight line to the following time series data, which represent the body temperature of a patient with appendicitis, taken every hour.

[Figure not reproduced: temperature (°C), from 30.0 to 33.0, against time t in hours, from 0 to 10.]

THINK and WRITE

1 Attempt to fit a line using your eye. By trial and error, a line such as the one at right could be the trend line. Try to balance the number of points either side of the line.

[Figure not reproduced: the same graph with a trend line fitted.]

2 Evaluate the trend. It is unlikely that the temperature will continue to rise indefinitely, but the line may be significant over the short term.

Investigation: Trend lines in ABS census data

This activity can be undertaken using a spreadsheet or graphics calculator.

The Australian Bureau of Statistics has published data from the 1996 Australian Census on its web site, http://www.abs.gov.au. (It will take some time for the 2001 Census material to be available. Download the updated data when it is published.) An extract from this material shows the age breakdown year by year for males and females to age 79 (groupings occur after this age).


AUSTRALIAN BUREAU OF STATISTICS
1996 Census of Population and Housing
Australia 7688965.464 sq kms
B03 AGE BY SEX
All Persons

Age                       Male    Female   Persons
0 years                 123101    116506    239607
1                       126232    119980    246212
2                       133747    126933    260680
3                       132414    126111    258525
4                       133393    126489    259882
0–4                     648887    616019   1264906
5                       133908    127701    261609
6                       133759    127144    260903
7                       130505    124238    254743
8                       129595    123650    253245
9                       129424    123389    252813
5–9                     657191    626122   1283313
10                      131934    125566    257500
11                      131046    126007    257053
12                      132320    126563    258883
13                      134123    127098    261221
14                      131117    123939    255056
10–14                   660540    629173   1289713
15                      130655    123635    254290
16                      127911    121142    249053
17                      126772    120884    247656
18                      126441    120841    247282
19                      127615    123792    251407
15–19                   639394    610294   1249688
20                      127481    124020    251501
21                      130978    128427    259405
22                      131793    129213    261006
23                      135793    133924    269717
24                      140685    140143    280828
20–24                   666730    655727   1322457
25                      144157    145410    289567
26                      134645    136527    271172
27                      133443    136049    269492
28                      130488    132804    263292
29                      128474    131264    259738
25–29                   671207    682054   1353261
30                      132867    136256    269123
31                      130269    133606    263875
32                      139492    142313    281805
33                      142209    145143    287352
34                      142150    145761    287911
30–34                   686987    703079   1390066
35                      145277    148560    293837
36                      141012    146205    287217
37                      137969    141896    279865
38                      137076    141660    278736
39                      134507    137849    272356
35–39                   695841    716170   1412011
40                      137481    140360    277841
41                      127118    130373    257491
42                      131417    134898    266315
43                      130649    132966    263615
44                      124503    125515    250018
40–44                   651168    664112   1315280
45                      127342    128336    255678
46                      124358    125646    250004
47                      120734    121165    241899
48                      125216    122796    248012
49                      132224    127795    260019
45–49                   629874    625738   1255612
50                      111979    109136    221115
51                      101519     98391    200082
52                      102489     99391    201880
53                       92095     89216    181311
54                       90288     87136    177424
50–54                   498370    483442    981812
55                       86624     83913    170537
56                       83892     80821    164713
57                       80225     78436    158661
58                       77967     77312    155279
59                       76179     74680    150859
55–59                   404887    395162    800049
60                       73329     74559    147888
61                       66143     66340    132483
62                       67868     68018    135886
63                       66560     68291    134851
64                       66288     66370    132658
60–64                   340188    343578    683766
65                       68947     72040    140987
66                       65982     68925    134907
67                       65505     68270    133775
68                       63737     68776    132513
69                       61639     66933    128572
65–69                   325810    344944    670754
70                       60621     68794    129415
71                       55060     62832    117892
72                       54121     64498    118619
73                       50717     62169    112886
74                       48188     60441    108629
70–74                   268707    318734    587441
75                       45523     58811    104334
76                       40564     53908     94472
77                       31558     43298     74856
78                       29702     42884     72586
79                       28122     41345     69467
75–79                   175469    240246    415715
80–84                   103256    174476    277732
85–89                    43716     94151    137867
90–94                    12286     35617     47903
95–98                     2331      8408     10739
99 years and over          587      2157      2744
Overseas visitor         65798     73796    139594
Total                  8849224   9043199  17892423
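One way to compare male and female numbers in each age category is to express them as a ratio. The Python sketch below uses a subset of the five-year group subtotals from the census table and calculates the number of males per 100 females in each group. Only five groups are included to keep the example short, and the measure itself and the name `males_per_100_females` are illustrative choices, not something the investigation prescribes.

```python
# Five-year group subtotals (males, females) copied from the census table above.
# Only a subset of the groups is listed here to keep the sketch short.
age_groups = {
    "0-4": (648887, 616019),
    "20-24": (666730, 655727),
    "40-44": (651168, 664112),
    "60-64": (340188, 343578),
    "80-84": (103256, 174476),
}


def males_per_100_females(males, females):
    """Number of males for every 100 females in a group."""
    return 100 * males / females


for group, (males, females) in age_groups.items():
    ratio = males_per_100_females(males, females)
    print(f"{group}: {ratio:.1f} males per 100 females")
```

A ratio above 100 means males outnumber females in that group, and a ratio below 100 means the reverse. The same calculation can be repeated for the remaining groups in the table.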


Consider this data set as a time series of age groupings 0–4, 5–9 etc. (Note that the totals for these groupings occur within the data.)

1 Graph the trend of male numbers over time.

2 On the same graph, plot the female numbers over time.

3 Examine the differences in these two trends.
a Compare the numbers of males and females in Australia in each age category.
b What has happened to the birth rate over time?
c Compare the age of death of males and females.
d Which age group represents the greatest proportion of the population?

4 Write a report of the age composition of Australians. Present your findings, backed by reference to tables and graphs.

Remember

1. Time series are a set of measurements taken over (usually) equally spaced time intervals, such as hourly, daily, weekly, monthly or annually.
2. There are 4 basic types of trend:
(a) secular: increasing or decreasing steadily
(b) seasonal: varying from season to season
(c) cyclic: similar to seasonal but not tied to a calendar cycle
(d) random: varying from external causes happening at random.

11C Time series and trend lines

Graphics calculators or spreadsheets can be used for the following where appropriate.

1 Identify whether the following trends are likely to be secular, seasonal, cyclic or random:
a the amount of rainfall, per month, in north Queensland
b the number of soldiers in the United States army, measured annually
c the number of people living in Australia, measured annually
d the share price of BHP, measured monthly
e the number of seats held by the Liberal Party in Federal Parliament.

2 Fit a trend line to the data in the graph at right.

[Figure not reproduced: temperature (°C), from 10 to 40, against time t in days, from 0 to 10.]


3 Fit a trend line to the following data. What type of trend is best reflected by these data?

t  1  2  3   4  5  6   7   8   9   10  11  12
y  6  9  13  8  9  14  15  17  14  11  15  19

4 The monthly share prices of a recently privatised telephone company were recorded as follows.

Date       Jan. 01  Feb. 01  Mar. 01  Apr. 01  May 01  Jun. 01  Jul. 01  Aug. 01
Price ($)  2.50     2.70     3.00     3.20     3.60    3.70     3.90     4.20

Graph the data (let 1 = Jan. 2001, 2 = Feb. 2001 . . . and so on) and fit a trend line to the data. Use this line to predict the share price in January 2002. Comment on the feasibility of the predicted share price.

5 Plot the following monthly sales data for umbrellas. Fit a trend line. Discuss the type of trend best reflected by the data and the limitations of your trend line.

Month  Jan.  Feb.  Mar.  Apr.  May  Jun.  Jul.  Aug.  Sep.  Oct.  Nov.  Dec.
Sales  5     10    15    40    70   95    100   90    60    35    20    10

6 Consider the data in the figure, which represent the price of oranges over a 19-week period.
a Fit a straight trend line to the data.
b Predict the price in week 25.

[Figure not reproduced: price (cents), from 20 to 100, against time t in weeks, from 0 to 25.]

7 The following table represents the quarterly sales figures (in thousands) of a popular software product. Plot the data and fit a trend line using the ‘equal number of points’ method. Discuss the type of trend best reflected by these data.

Quarter  Q1-99  Q2-99  Q3-99  Q4-99  Q1-00  Q2-00  Q3-00  Q4-00  Q1-01  Q2-01  Q3-01  Q4-01
Sales    120    135    150    145    140    120    100    110    120    140    190    220

8 The number of employees at the Comnatpac Bank was recorded over a 10-month period. Plot and fit a trend line to the data. What would you say about the trend?

Month      Mar.  Apr.  May   Jun.  Jul.  Aug.  Sep.  Oct.  Nov.  Dec.
Employees  6100  5700  5400  5200  4800  4400  4200  4000  3700  3300

[Margin references on this page: Worked example 8; WorkSHEET 11.2.]


Investigation: Weather report for snowfields

The Bureau of Meteorology publishes statistical data on its web site (http://www.bom.gov.au). A visit to this site reveals climate averages for many places throughout Australia. The following data set is the result of measurements collected at Mount Buffalo Chalet over a number of years.

CLIMATE AVERAGES: long term mean values of weather data

083073 MOUNT BUFFALO CHALET   Commenced: 1910   Last record: 2000
Latitude: 36.72 S   Longitude: 146.82 E   Elevation: 1372.0 m   State: VIC

Element                               JAN    FEB    MAR    APR    MAY    JUN    JUL    AUG    SEP    OCT    NOV    DEC     ANN  No. Yrs  %age comp
Mean Daily Max Temp (deg C)          19.5   18.9   16.4   11.6    7.9    4.9    3.7    4.6    8.0   10.8   14.2   17.5    11.6     43.3         68
Mean no. Days, Max >= 40.0 deg C      0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0     0.0      2.8         92
Mean no. Days, Max >= 35.0 deg C      0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0     0.0      2.8         92
Mean no. Days, Max >= 30.0 deg C      0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0     0.0      2.8         92
Highest Max Temp (deg C)             29.0   23.9   21.7   24.4   18.9   12.2   14.0   12.8   14.8   18.5   21.5   29.5    29.5      2.8         94
Mean Daily Min Temp (deg C)          10.8   11.0    9.0    5.3    2.6    0.2   −0.7   −0.3    1.8    3.8    6.4    9.2     5.0     43.6         69
Mean no. Days, Min =< 2.0 deg C       0.7    0.0    0.7    4.3   17.3   22.3   27.3   26.0   15.5   11.3    4.7    1.7   131.8      2.9         97
Mean no. Days, Min =< 0.0 deg C       0.0    0.0    0.3    1.3    7.0   13.7   20.0   19.0    9.5    5.7    1.3    0.3    78.2      2.9         97
Lowest Min Temp (deg C)               1.7    3.3    0.0   −0.6   −1.9   −4.5   −5.3   −4.4   −4.4   −3.6   −1.1   −0.6    −5.3      2.9         86
Mean 9am Air Temp (deg C)            14.9   15.0   12.9    8.7    5.6    2.7    1.7    2.4    4.9    7.4   10.3   13.2     8.4     40.8         89
Mean 9am Wet-bulb Temp (deg C)       10.6   11.2    9.7    6.4    3.8    1.4    0.6    1.2    3.0    5.0    7.2    9.6     5.9     40.6         88
Mean 9am Relative Humidity (%)         56     60     64     70     73     78     81     78     71     68     63     61      68     40.5         88
Mean 9am Wind Speed (km/hr)           6.9    6.4    8.1    7.7    9.0    8.4   10.1    9.3    9.8    7.9    7.2    7.5     8.2     10.1         65
Mean 3pm Wind Speed (km/hr)           4.9    3.4    4.3    5.3    6.6    7.9    8.9    8.1    7.9    6.8    6.3    6.3     6.5      2.1         75
Mean Rainfall (mm)                   88.4   83.2  100.8  131.0  197.3  212.7  230.8  227.0  196.8  191.4  130.1  112.6  1902.0     85.5         96
Median (Decile 5) Rainfall (mm)      64.2   57.9   81.8  108.3  168.2  181.4  214.3  225.1  190.5  159.5  112.7   94.8  1884.6       75
Decile 9 Rainfall (mm)              189.6  184.2  256.9  278.5  375.9  392.2  386.2  356.2  332.1  399.3  253.8  256.8  2493.8       75
Decile 1 Rainfall (mm)               12.1    9.2   15.1   22.4   56.6   56.3   85.1   86.4   65.5   45.1   49.7   27.6  1209.9       75
Mean no. of Raindays                  6.6    6.5    7.4    9.0   12.4   13.4   14.6   15.5   13.5   12.8   10.3    8.9   131.0     84.3         95
Highest Monthly Rainfall (mm)       364.7  412.9  359.2  764.1  614.0  607.4  717.4  557.7  496.0  525.3  323.1  374.6             85.5         96
Lowest Monthly Rainfall (mm)          1.3    0.0    4.8    0.0    0.0   32.0    2.3   21.4   31.7    4.0   10.4    5.1             85.5         96
Highest Recorded Daily Rain (mm)    193.0  134.4  174.0  189.7  173.0  216.7  157.0  145.4  195.0  212.0  108.6  109.7   216.7     85.1         96
Mean no. of Clear Days                0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0     0.0      3.0        100
Mean no. of Cloudy Days               0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0     0.0      3.0        100


The following article explains the ideal conditions for heavy snowfalls, and graphically displays the recorded snow depth in the Snowy Mountains over the years 1963 to 1995.

1 Carefully examine all the information presented.

2 Visit the web site for more information, if necessary.

3 Your school is planning a visit for your year level to the snowfields some time during the year. Analyse the data presented (together with any other you have collected) and write a report to the organisers identifying:
a the general weather conditions in the snowfields throughout the year (refer to as many factors as possible)
b the best time of the year for the excursion (in your opinion)
c the type of weather you would hope for, prior to your visit, to produce good snowfalls in the area.
Supply tables and graphs to support your recommendations.

Abundant snow seasons

The prime conditions for heavy snowfalls in the Australian Alps are persistent strong westerlies through the winter, which produce abundant precipitation and are generally accompanied by relatively low temperatures. There is no particular relationship with El Nino or La Nina, though the tendency for less precipitation in El Nino years usually leads to a poor season. The favoured conditions are generally different to those which produce snow at low levels, for which a strong cold outbreak (with generally south to southwesterly winds) is required.

On the Snowy Mountains the greatest snow depth in the 46 years from 1954 to 1999 was just over 3.5 metres, achieved in both 1961 and 1981. The winter of 1981 was also exceptionally wet in South Australia, Victoria and southern New South Wales: a severe storm struck Adelaide on 1 June, and this heralded three months of almost uninterrupted influence by depressions in the westerlies, which brought copious rain and snow. At Perisher Valley the snow depth rose steadily to 3.6 metres in early September before beginning to decline. Snow didn’t disappear completely until mid-November. More recently, 1992 was a very long season with deep snow.

At Spencer Creek and Perisher Valley in NSW the snow depth reached the high level of three metres at regular four-year intervals: 1956, 1960, 1964 and 1968. These were also years of heavy snow cover on the Victorian Alps. In each case weather patterns were characterised by strong, persistent westerlies and abundant rain at lower levels. In each case the snow cover persisted until late November, and in 1956, into December. In 1964 a particularly stormy third week of July dumped prodigious quantities of snow, stranding people at Falls Creek and Mt Hotham.

In Australia, the annual spring melting of the mountain snow is not in itself sufficient to cause flooding. However, in years when the snow depth is great, a combination of heavy rain and mild to warm conditions can augment spring floods.

Winter snow depth (cm) at Spencer Creek in the Snowy Mountains, 1964 through 1995 (data courtesy of Snowy Mountains Hydro-electric Authority)


Scatterplots

• When looking for a relationship between two variables, data can be represented on a scatterplot.
• One variable (the independent variable) is on the x-axis and the other variable (the dependent variable) is on the y-axis.
• Points are plotted by the coordinates formed by each piece of data.
• If the dependent variable consistently increases or decreases as the independent variable increases, a relationship exists.
• If all points on the scatterplot form a straight line, the relationship is said to be linear.
• The pattern of the scatterplot gives an indication of the strength of the relationship or level of association between the variables.
• A strong relationship between variables does not imply that one variable causes the other to occur.
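The strength of a linear relationship can also be summarised with a single number. The sketch below is one possible way to do this, not a method used in this chapter: it computes the Pearson correlation coefficient (an assumption introduced here) for the maximum and minimum temperatures given in question 1 of the chapter review. The function name pearson_r is hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Data from chapter review question 1 (temperatures in degrees Celsius)
max_temp = [25, 36, 21, 40, 24, 26, 30, 18, 20, 25]  # independent variable (x)
min_temp = [12, 21, 11, 23, 12, 15, 19, 10, 8, 13]   # dependent variable (y)

r = pearson_r(max_temp, min_temp)
print(round(r, 2))
```

A value close to 1 or to –1 suggests a strong linear relationship, and a value near 0 suggests a weak one. As the last point above warns, even a value close to 1 does not show that one variable causes the other.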

Regression lines

• The regression line is the line of best fit on a scatterplot.
• By measuring the gradient and the y-intercept on the regression line, we can use the formula y = mx + c to find the equation. A graphics calculator also can be used to determine the equation of the regression line.
• When the equation of the regression line has been found, it can then be used to make predictions about the data.
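The gradient m and y-intercept c can also be calculated rather than measured. The following is a minimal sketch that assumes a least-squares fit, which is one common definition of the line of best fit; a line drawn by eye will usually give slightly different values. The function name fit_line is hypothetical, and the data are the temperatures from question 1 of the chapter review.

```python
def fit_line(xs, ys):
    """Return (m, c) for the least-squares line y = mx + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx            # gradient
    c = mean_y - m * mean_x  # y-intercept: the line passes through the mean point
    return m, c


# Data from chapter review question 1 (temperatures in degrees Celsius)
max_temp = [25, 36, 21, 40, 24, 26, 30, 18, 20, 25]
min_temp = [12, 21, 11, 23, 12, 15, 19, 10, 8, 13]

m, c = fit_line(max_temp, min_temp)
print(f"y = {m:.2f}x + {c:.2f}")
```

Because the fitted line always passes through the point formed by the two means, a quick check on a line drawn by eye is to see whether it passes close to that point.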

Predictions

• Predictions can be made algebraically or graphically.
• Reliable predictions are produced by interpolation using a large number of points that lie reasonably close to the regression line.
• Be careful about extrapolating data.
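An algebraic prediction simply substitutes a value of x into y = mx + c. The sketch below also labels each prediction as interpolation or extrapolation by comparing x with the range of the original data. The function name predict, the line y = 0.5x + 2 and the data range 18 to 40 are all hypothetical values chosen for illustration.

```python
def predict(x, m, c, x_min, x_max):
    """Predict y from the line y = mx + c and classify the prediction.

    x_min and x_max are the smallest and largest x-values in the data
    that were used to find the regression line.
    """
    y = m * x + c
    kind = "interpolation" if x_min <= x <= x_max else "extrapolation"
    return y, kind


# Hypothetical regression line y = 0.5x + 2 found from data with x from 18 to 40
inside = predict(30, m=0.5, c=2.0, x_min=18, x_max=40)
outside = predict(50, m=0.5, c=2.0, x_min=18, x_max=40)
print(inside)
print(outside)
```

A prediction labelled as extrapolation relies on the relationship continuing beyond the data collected, which is why it should be treated with caution.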

Time series

• A time series is a set of measurements taken over (usually) equally spaced time intervals, such as hourly, daily, weekly, monthly or annually.

Trend lines

• There are 4 basic types of trend:
1. secular: increasing or decreasing steadily
2. seasonal: varying from season to season
3. cyclic: similar to seasonal but not tied to a calendar cycle
4. random: variations caused by external triggers happening at random.

Fitting trend lines

• The trend line is a straight line that can be used to represent the entire time series. Trend lines can be used for predicting the future values of the time series. The line can be found by eye using the ‘equal number of points’ technique.
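A trend line placed by eye can be checked numerically. The sketch below assumes that the ‘equal number of points’ technique means placing the line so that about as many points lie above it as below it. It counts the points on each side of a candidate line y = mt + c, where t is the time period numbered from 1. The function name count_above_below, the sales figures and both candidate lines are hypothetical.

```python
def count_above_below(values, m, c):
    """Count the points above and below the candidate trend line y = mt + c.

    values holds one measurement per time period; t is numbered 1, 2, 3, ...
    Points that fall exactly on the line are not counted on either side.
    """
    above = below = 0
    for t, y in enumerate(values, start=1):
        fitted = m * t + c
        if y > fitted:
            above += 1
        elif y < fitted:
            below += 1
    return above, below


# Hypothetical sales figures for 8 consecutive months
sales = [12, 15, 11, 18, 16, 21, 19, 21]

balanced = count_above_below(sales, m=1.5, c=10.0)  # a reasonable candidate line
too_low = count_above_below(sales, m=1.5, c=5.0)    # same gradient, placed too low
print(balanced, too_low)
```

If the two counts differ noticeably, the candidate line can be moved up or down, or its gradient adjusted, and the counts checked again.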

summary


Chapter review

Answers to some of these questions may vary due to different positions of the regression line. The use of a graphics calculator is recommended, where appropriate.

1 (11A) The table below shows the maximum and minimum temperature on 10 days chosen at random throughout the year.

Maximum temperature (°C) 25 36 21 40 24 26 30 18 20 25
Minimum temperature (°C) 12 21 11 23 12 15 19 10 8 13

Display this information on a scatterplot.

2 (11A) The table below shows the number of sick days taken by ten employees and relates this to the number of children that they have.

No. of children 1 0 3 2 2 4 6 0 1 2
No. of sick days 5 3 10 8 4 12 12 0 1 5

a Show this information on a scatterplot.
b Does a relationship appear to exist between the number of sick days taken and the number of children they have? If so, is the relationship linear?

3 (11A) The table below shows the number of cars and number of televisions in each household.

No. of cars 1 1 2 2 2 3 1 0 1 2
No. of televisions 2 1 1 2 0 1 4 3 1 1

a Show this information on a scatterplot.
b Does a relationship appear to exist between the number of televisions in each household and the number of cars they have? If so, is the relationship linear?

4 (11A, multiple choice) A researcher administers different amounts of fertiliser to a number of trial plots of potato crop. She then measures the total mass of potatoes harvested from each plot. When drawing the scatterplot, the researcher should graph:
A mass of harvest on the x-axis because it is the independent variable, and amount of fertiliser on the y-axis because it is the dependent variable
B mass of harvest on the y-axis because it is the independent variable, and amount of fertiliser on the x-axis because it is the dependent variable


C mass of harvest on the x-axis because it is the dependent variable, and amount of fertiliser on the y-axis because it is the independent variable
D mass of harvest on the y-axis because it is the dependent variable, and amount of fertiliser on the x-axis because it is the independent variable.

5 (11A, multiple choice) Which of the following graphs best depicts a strong negative relationship between the two variables?
A B C D
[Four scatterplots, each drawn on x- and y-axes; graphs not reproduced.]

6 (11A, multiple choice) What type of relationship is shown by the graph on the right?
[Scatterplot drawn on x- and y-axes; graph not reproduced.]
A Strong positive relationship
B Moderate positive relationship
C Moderate negative relationship
D Strong negative relationship

7 (11B) The table below shows the relationship between two variables, x and y.

x 2 4 18 7 9 12 2 7 11 10 16
y 103 75 20 66 70 50 95 40 27 42 30

a Prepare a scatterplot of the data.
b On the scatterplot, draw the regression line.
c By measuring the gradient and the y-intercept of the regression line, find its equation.

8 (11B) A survey is conducted comparing household income, I, with house value, V. A scatterplot is drawn and the regression line is found to have the equation V = 3.7I + 50 000. Use the equation to find:
a the likely value of a house owned by a family with an income of $52 000
b the likely income (to the nearest $1000) of a family living in a house valued at $320 000.

9 (11B) An entomologist conducted an experiment in which small amounts of insecticide were introduced to a container of 100 blowflies. The results are detailed below.

Insecticide (I) (micrograms) 1 2 3 4 5 6 7 8 9 10
No. remaining after 2 h (F) 99 92 81 74 62 68 52 45 38 24

a Display the above information on a scatterplot and draw the line of regression.
b Find the equation of the regression line.
c Use the equation to predict the number of blowflies that would remain after two hours if 4.25 micrograms of insecticide was introduced.
d Estimate the amount of insecticide needed to remove all blowflies.


10 (11C, multiple choice) The price of oranges over a 16-month period is recorded in the figure. The trend can be described as:
A Cyclic
B Seasonal
C Random
D Secular
E There is no trend.
[Figure: price of oranges (vertical axis, 0 to 50) plotted against months (horizontal axis, 0 to 16); graph not reproduced.]

11 (11C) The number of uniforms sold in a school uniform shop is reported in the table.

Jan. Feb. Mar. Apr. May Jun. Jul. Aug. Sep. Oct. Nov. Dec.
118 92 53 20 47 102 90 42 35 26 12 58

Fit a trend line to these data. What type of trend is best reflected by these data? Can you explain these trends?
