Date post: | 22-Dec-2015 |
Upload: | dominick-dennis |
© Boardworks Ltd 2005
AS-Level Maths: Statistics 1 for Edexcel
S1.2 Calculating means and standard deviations
Contents
Means
Calculating means
Calculating standard deviations
Coding
Mean

The mean is the most widely used average in statistics. It is found by adding up all the values in the data and dividing by how many values there are.

Notation: If the data values are $x_1, x_2, x_3, \ldots, x_n$, then the mean is

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{\sum x_i}{n}$$

Here $\bar{x}$ is the mean symbol, and $\sum x_i$ means the total of all the x values.

Note: The mean takes into account every piece of data, so it is affected by outliers in the data. The median is preferred over the mean if the data contains outliers or is skewed.
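As a minimal sketch of the note on outliers, the following Python snippet compares the mean and the median of a small data set before and after one extreme value is added. The data values are made up for illustration and do not come from the slides.

```python
from statistics import mean, median

# Hypothetical data set (illustrative only, not from the slides)
values = [4, 5, 5, 6, 7]
with_outlier = values + [45]  # add one extreme value

mean_before = mean(values)            # total of the values / number of values
median_before = median(values)
mean_after = mean(with_outlier)
median_after = median(with_outlier)

print("without outlier:", mean_before, median_before)
print("with outlier:   ", mean_after, median_after)
```

The mean moves a long way when the outlier is added, while the median barely changes.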
If data are presented in a frequency table:

Value    Frequency
$x_1$    $f_1$
$x_2$    $f_2$
…        …
$x_n$    $f_n$

then the mean is

$$\bar{x} = \frac{x_1 f_1 + x_2 f_2 + \cdots + x_n f_n}{\sum f_i} = \frac{\sum x_i f_i}{\sum f_i}$$
Example: The table shows the results of a survey into household size. Find the mean size.

To find the mean, we add a third column to the table.

Household size, x    Frequency, f    x × f
1                    20              20
2                    28              56
3                    25              75
4                    19              76
5                    16              80
6                    6               36
TOTAL                114             343

Mean = 343 ÷ 114 = 3.01 (to 3 s.f.)
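The household-size calculation can be sketched in Python as follows. The function name `mean_from_frequencies` is a hypothetical helper chosen for this illustration. The data are the values from the table.

```python
# Household-size frequency table from the example: value -> frequency
freq_table = {1: 20, 2: 28, 3: 25, 4: 19, 5: 16, 6: 6}


def mean_from_frequencies(table):
    """Return sum(x * f) / sum(f) for a {value: frequency} table."""
    total_f = sum(table.values())                      # total frequency
    total_xf = sum(x * f for x, f in table.items())    # total of the x * f column
    return total_xf / total_f


household_mean = mean_from_frequencies(freq_table)
print(round(household_mean, 2))
```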
Standard deviation
There are three commonly used measures of spread (or dispersion) – the range, the inter-quartile range and the standard deviation.
The standard deviation is widely used in statistics to measure spread. It is based on all the values in the data, so it is sensitive to the presence of outliers in the data.

The following formulae can be used to find the variance and s.d.

$$\text{variance} = \frac{\sum (x_i - \bar{x})^2}{n}$$

$$\text{s.d.} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}}$$

The variance is related to the standard deviation:

$$\text{variance} = (\text{standard deviation})^2$$
Example: The mid-day temperatures (in °C) recorded for one week in June were: 21, 23, 24, 19, 19, 20, 21

First we find the mean:

$$\bar{x} = \frac{21 + 23 + \cdots + 21}{7} = \frac{147}{7} = 21\ ^\circ\text{C}$$

Then we use

$$\text{variance} = \frac{\sum (x_i - \bar{x})^2}{n}$$

$x_i$    $x_i - \bar{x}$    $(x_i - \bar{x})^2$
21       0                  0
23       2                  4
24       3                  9
19       -2                 4
19       -2                 4
20       -1                 1
21       0                  0
Total:                      22

So variance = 22 ÷ 7 = 3.143
So, s.d. = √3.143 = 1.77°C (3 s.f.)
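The same steps can be sketched in Python, working directly from the definition of the variance. The variable names are chosen for this illustration only.

```python
from math import sqrt

temps = [21, 23, 24, 19, 19, 20, 21]  # mid-day temperatures in °C

n = len(temps)
mean_temp = sum(temps) / n                             # 147 / 7
squared_devs = [(t - mean_temp) ** 2 for t in temps]   # the (x - mean)^2 column
variance = sum(squared_devs) / n                       # divide by n, as in the formula above
sd = sqrt(variance)

print(mean_temp, round(variance, 3), round(sd, 2))
```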
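There is an alternative formula which is usually a more convenient way to find the variance:

$$\text{variance} = \frac{\sum (x_i - \bar{x})^2}{n}$$

But, since $\sum x_i = n\bar{x}$,

$$\sum (x_i - \bar{x})^2 = \sum (x_i^2 - 2\bar{x}x_i + \bar{x}^2) = \sum x_i^2 - 2\bar{x}\sum x_i + n\bar{x}^2 = \sum x_i^2 - 2\bar{x}(n\bar{x}) + n\bar{x}^2 = \sum x_i^2 - n\bar{x}^2$$

Therefore,

$$\text{variance} = \frac{\sum x_i^2}{n} - \bar{x}^2 \quad \text{and} \quad \text{s.d.} = \sqrt{\frac{\sum x_i^2}{n} - \bar{x}^2}$$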
Example (continued): Looking again at the temperature data for June: 21, 23, 24, 19, 19, 20, 21

We know that

$$\bar{x} = \frac{147}{7} = 21\ ^\circ\text{C}$$

Also, $\sum x_i^2 = 21^2 + 23^2 + \cdots + 21^2 = 3109$

So,

$$\text{variance} = \frac{\sum x_i^2}{n} - \bar{x}^2 = \frac{3109}{7} - 21^2 = 3.143$$

$$\text{s.d.} = \sqrt{3.143} = 1.77\ ^\circ\text{C}$$

Note: Essentially the standard deviation is a measure of how close the values are to the mean value.
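As a quick check, this Python sketch computes the variance of the temperature data both from the definition and from the alternative formula, and confirms the two agree. The variable names are assumptions made for this illustration.

```python
from math import sqrt, isclose

temps = [21, 23, 24, 19, 19, 20, 21]  # mid-day temperatures in °C
n = len(temps)
mean_temp = sum(temps) / n

# Alternative formula: (sum of x^2) / n - mean^2
sum_sq = sum(t * t for t in temps)
var_shortcut = sum_sq / n - mean_temp ** 2

# Definition: sum of squared deviations / n
var_definition = sum((t - mean_temp) ** 2 for t in temps) / n

same_answer = isclose(var_shortcut, var_definition)
sd = sqrt(var_shortcut)
print(sum_sq, round(var_shortcut, 3), round(sd, 2), same_answer)
```

The alternative formula needs only the totals $\sum x_i$ and $\sum x_i^2$, which is why it is usually quicker by hand.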
Calculating standard deviation from a table

When the data is presented in a frequency table, the formula for finding the standard deviation needs to be adjusted slightly:

$$\text{s.d.} = \sqrt{\frac{\sum f_i x_i^2}{\sum f_i} - \bar{x}^2}$$

Example: A class of 20 students were asked how many times they exercise in a normal week. Find the mean and the standard deviation.

Number of times exercise taken    Frequency
0                                 5
1                                 3
2                                 5
3                                 4
4                                 2
5                                 1
The table can be extended to help find the mean and the s.d.

No. of times exercise taken, x    Frequency, f    x × f    x² × f
0                                 5               0        0
1                                 3               3        3
2                                 5               10       20
3                                 4               12       36
4                                 2               8        32
5                                 1               5        25
TOTAL:                            20              38       116

$$\bar{x} = \frac{38}{20} = 1.9$$

$$\text{s.d.} = \sqrt{\frac{\sum f_i x_i^2}{\sum f_i} - \bar{x}^2} = \sqrt{\frac{116}{20} - 1.9^2} = 1.48$$
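The extended-table method can be sketched in Python as below. The helper `mean_and_sd` is a hypothetical name used for this illustration. It applies the frequency-table formula given above.

```python
from math import sqrt

# Exercise data from the example: number of times -> frequency
exercise = {0: 5, 1: 3, 2: 5, 3: 4, 4: 2, 5: 1}


def mean_and_sd(table):
    """Return (mean, s.d.) for a {value: frequency} table."""
    n = sum(table.values())                              # total frequency
    sum_fx = sum(f * x for x, f in table.items())        # total of the x * f column
    sum_fx2 = sum(f * x * x for x, f in table.items())   # total of the x^2 * f column
    mean = sum_fx / n
    variance = sum_fx2 / n - mean ** 2
    return mean, sqrt(variance)


ex_mean, ex_sd = mean_and_sd(exercise)
print(round(ex_mean, 1), round(ex_sd, 2))
```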
If data is presented in a grouped frequency table, it is only possible to estimate the mean and the standard deviation. This is because the exact data values are not known.
An estimate is obtained by using the mid-point of an interval to represent each of the values in that interval.
Example: The table shows the annual mileage for the employees of an insurance company.
Estimate the mean and standard deviation.
Annual mileage, x        Frequency
0 ≤ x < 5000             6
5000 ≤ x < 10,000        17
10,000 ≤ x < 15,000      14
15,000 ≤ x < 20,000      5
20,000 ≤ x < 30,000      3
Mileage            Frequency, f    Mid-point, x    f × x      f × x²
0 – 5000           6               2500            15,000     37,500,000
5000 – 10,000      17              7500            127,500    956,250,000
10,000 – 15,000    14              12,500          175,000    2,187,500,000
15,000 – 20,000    5               17,500          87,500     1,531,250,000
20,000 – 30,000    3               25,000          75,000     1,875,000,000
TOTAL              45                              480,000    6,587,500,000

$$\bar{x} = \frac{480{,}000}{45} \approx 10{,}667 \text{ miles}$$

$$\text{s.d.} = \sqrt{\frac{6{,}587{,}500{,}000}{45} - \bar{x}^2} \approx 5711 \text{ miles}$$

(The s.d. is calculated using the unrounded value of $\bar{x}$.)
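The mid-point method for grouped data can be sketched in Python as follows. The class boundaries and frequencies are taken from the mileage table. The variable names are assumptions made for this illustration, and the results are estimates only because the exact data values are not known.

```python
from math import sqrt

# (lower bound, upper bound, frequency) for each mileage class
classes = [
    (0, 5000, 6),
    (5000, 10000, 17),
    (10000, 15000, 14),
    (15000, 20000, 5),
    (20000, 30000, 3),
]

# Each class is represented by its mid-point
midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
freqs = [f for _, _, f in classes]

n = sum(freqs)
sum_fx = sum(f * m for f, m in zip(freqs, midpoints))
sum_fx2 = sum(f * m * m for f, m in zip(freqs, midpoints))

est_mean = sum_fx / n
est_sd = sqrt(sum_fx2 / n - est_mean ** 2)

print(round(est_mean), round(est_sd))
```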
Notes about standard deviation

Here are some notes to consider about standard deviation.

In many roughly symmetrical, bell-shaped distributions, about two-thirds of the data (roughly 68% for a normal distribution) will lie within 1 standard deviation of the mean, whilst nearly all the data values (about 95%) will lie within 2 standard deviations of the mean.

Values that lie more than 2 standard deviations from the mean are sometimes classed as outliers. Any such values should be treated carefully.

Standard deviation is measured in the same units as the original data. Variance is measured in the same units squared.

Most calculators have a built-in function which will find the standard deviation for you. Learn how to use this facility on your calculator.
Examination-style question

Examination-style question: The ages of the people in a cinema queue one Monday afternoon are shown in the stem-and-leaf diagram:

Key: 2 | 3 means 23 years old

2 | 3 6
3 | 1 6 6
4 | 1 2 5 6 9
5 | 0 4 7
6 | 1

a) Explain why the diagram suggests that the mean and standard deviation can be sensibly used as measures of location and spread respectively.
b) Calculate the mean and the standard deviation of the ages.
c) The mean and the standard deviation of the ages of the people in the queue on Monday evening were 29 and 6.2 respectively. Compare the ages of the people queuing at the cinema in the afternoon with those in the evening.
a) The mean and the standard deviation are appropriate, as the distribution of ages is roughly symmetrical and there are no outliers.

b) $\sum x_i = 597$ so, $\bar{x} = \frac{597}{14} = 42.64286 = 42.6$ (to 3 s.f.)

$\sum x_i^2 = 27{,}131$ so, $\text{s.d.} = \sqrt{\frac{27{,}131}{14} - 42.64286^2} = 10.9$ (to 3 s.f.)

c) The cinemagoers in the evening had a smaller mean age, meaning that they were, on average, younger than those in the afternoon.
The standard deviation for the ages in the evening was also smaller, suggesting that the evening audience were closer together in age.
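The totals used in part b) can be checked with a short Python sketch. It rebuilds the ages from the stem-and-leaf diagram. The dictionary layout and variable names are assumptions made for this illustration.

```python
from math import sqrt

# Stem-and-leaf diagram: stem (tens) -> leaves (units)
stem_and_leaf = {
    2: [3, 6],
    3: [1, 6, 6],
    4: [1, 2, 5, 6, 9],
    5: [0, 4, 7],
    6: [1],
}

# Rebuild each age as 10 * stem + leaf
ages = [10 * stem + leaf for stem, leaves in stem_and_leaf.items() for leaf in leaves]

n = len(ages)
sum_x = sum(ages)
sum_x2 = sum(a * a for a in ages)

mean_age = sum_x / n
sd_age = sqrt(sum_x2 / n - mean_age ** 2)

print(n, sum_x, sum_x2, round(mean_age, 1), round(sd_age, 1))
```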
Combining sets of data

Sometimes in examination questions you are asked to pool two sets of data together.
Example: Six male and five female students sit an A-level examination.
The mean marks were 52% and 57% for the males and females respectively. The standard deviations were 14 and 18 respectively.
Find the combined mean and the standard deviation for the marks of all 11 students.
Let $x_1, \ldots, x_6$ be the marks for the 6 male students.
Let $y_1, \ldots, y_5$ be the marks of the 5 female students.

To find the overall mean, we first need to find the total marks for all 11 students.

As $\bar{x} = 52$, $\sum x = 6 \times 52 = 312$

As $\bar{y} = 57$, $\sum y = 5 \times 57 = 285$

Therefore $\sum x + \sum y = 312 + 285 = 597$

So the combined mean is:

$$\frac{597}{11} = 54.2727\ldots = 54.3\% \text{ (to 3 s.f.)}$$
To find the overall standard deviation, we need to find the total of the marks squared for all 11 students.

Notice that the formula

$$\text{s.d.} = \sqrt{\frac{\sum x_i^2}{n} - \bar{x}^2}$$

rearranges to give

$$\sum x^2 = n\left(\text{s.d.}^2 + \bar{x}^2\right)$$

As $\text{s.d.}_x = 14$, $\sum x^2 = 6\,(14^2 + 52^2) = 17{,}400$

As $\text{s.d.}_y = 18$, $\sum y^2 = 5\,(18^2 + 57^2) = 17{,}865$

Therefore, $\sum x^2 + \sum y^2 = 35{,}265$

So the combined s.d. is:

$$\sqrt{\frac{35{,}265}{11} - 54.27^2} = 16.1\% \text{ (to 3 s.f.)}$$
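The pooling calculation can be sketched in Python as below. The helper `sum_of_squares` is a hypothetical name for the rearranged formula $\sum x^2 = n(\text{s.d.}^2 + \bar{x}^2)$. The group sizes, means and standard deviations are those given in the example.

```python
from math import sqrt


def sum_of_squares(n, mean, sd):
    """Recover the total of the squared values from n, the mean and the s.d."""
    return n * (sd ** 2 + mean ** 2)


# Summary statistics from the example
n_m, mean_m, sd_m = 6, 52, 14   # male students
n_f, mean_f, sd_f = 5, 57, 18   # female students

n_all = n_m + n_f
total = n_m * mean_m + n_f * mean_f                       # total of all marks
total_sq = (sum_of_squares(n_m, mean_m, sd_m)
            + sum_of_squares(n_f, mean_f, sd_f))          # total of all squared marks

combined_mean = total / n_all
combined_sd = sqrt(total_sq / n_all - combined_mean ** 2)

print(total, total_sq, round(combined_mean, 1), round(combined_sd, 1))
```

Note that the combined s.d. is not an average of the two separate standard deviations. It has to be rebuilt from the pooled totals.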
Coding
Coding is a technique that can simplify the numerical effort required in finding a mean or standard deviation.
Adding
If a number b is added to each piece of data, the mean value is also increased by b.
The standard deviation is unchanged.

Multiplying
If each piece of data is multiplied by a, the mean value is multiplied by a.
The standard deviation is also multiplied by a (or by |a| if a is negative, since a standard deviation cannot be negative).

More formally, if $y_i = a x_i + b$ then:

$$\bar{y} = a\bar{x} + b \qquad \text{s.d.}_y = |a| \times \text{s.d.}_x$$
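These coding rules can be checked numerically with a short Python sketch. The data set and the constants `a` and `b` are made up for illustration. `pstdev` from the standard library divides by n, which matches the standard deviation formula used in these slides.

```python
from math import isclose
from statistics import mean, pstdev

# Hypothetical data and coding constants (illustrative only)
x = [2, 4, 4, 5, 10]
a, b = 3, 7

y = [a * xi + b for xi in x]        # coded values y = a*x + b

mean_rule_holds = isclose(mean(y), a * mean(x) + b)
sd_rule_holds = isclose(pstdev(y), abs(a) * pstdev(x))

# With a negative multiplier the s.d. is scaled by |a|, not by a
y_neg = [-2 * xi + b for xi in x]
neg_rule_holds = isclose(pstdev(y_neg), 2 * pstdev(x))

print(mean_rule_holds, sd_rule_holds, neg_rule_holds)
```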
Example: Find the mean and the standard deviation of the values in the table. Use the transformation below to help you.

$$y = \frac{1}{10}x - 5$$

Using the given transformation, add a y column to the table.

x     Frequency    y
50    3            0
60    5            1
70    7            2
80    4            3
90    1            4
y        Frequency, f    y × f    y² × f
0        3               0        0
1        5               5        5
2        7               14       28
3        4               12       36
4        1               4        16
Total    20              35       85

To find the mean:

$$\bar{y} = \frac{35}{20} = 1.75$$

To find the s.d.:

$$\text{s.d.} = \sqrt{\frac{\sum f_i y_i^2}{\sum f_i} - \bar{y}^2} = \sqrt{\frac{85}{20} - 1.75^2} = 1.09$$
You have now found the mean and standard deviation of y. To find them for the x values, you must reverse the coding.

We can rearrange:

$$y = \frac{1}{10}x - 5$$

to get:

$$x = 10y + 50$$

Therefore the mean of x is:

$$\bar{x} = 10\bar{y} + 50 = 10 \times 1.75 + 50 = 67.5$$

And the standard deviation of x is: 10 × 1.09 = 10.9

Note how the coding helped to simplify the calculations by making the numbers smaller.
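The whole coding example can be sketched in Python, working on the coded values and then reversing the coding. As a check, the same statistics are also computed directly from the x values. The helper names `code` and `mean_and_sd` are assumptions made for this illustration.

```python
from math import isclose, sqrt

# Frequency table from the example: x value -> frequency
table = {50: 3, 60: 5, 70: 7, 80: 4, 90: 1}


def code(x):
    """The transformation from the example: y = x/10 - 5."""
    return x / 10 - 5


def mean_and_sd(freq_table):
    """Return (mean, s.d.) for a {value: frequency} table."""
    n = sum(freq_table.values())
    mean = sum(f * v for v, f in freq_table.items()) / n
    variance = sum(f * v * v for v, f in freq_table.items()) / n - mean ** 2
    return mean, sqrt(variance)


# Work with the smaller coded values first
coded = {code(x): f for x, f in table.items()}
mean_y, sd_y = mean_and_sd(coded)

# Reverse the coding: x = 10y + 50
mean_x = 10 * mean_y + 50
sd_x = 10 * sd_y

# Check against a direct calculation on the x values
direct_mean, direct_sd = mean_and_sd(table)
agrees = isclose(mean_x, direct_mean) and isclose(sd_x, direct_sd)

print(mean_y, round(sd_y, 2), mean_x, round(sd_x, 1), agrees)
```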