Date post: | 12-Jan-2016 |
Category: |
Documents |
Upload: | polly-watkins |
Correlation
• This Chapter is on Correlation
• We will look at patterns in data on a scatter graph
• We will be looking at how to calculate the variance and co-variance of variables
• We will see how to numerically measure the strength of correlation between two variables
Scatter Graphs
Scatter graphs are a way of representing 2 sets of data. It is then possible to see whether they are related.
• Positive Correlation: as one variable increases, so does the other
• Negative Correlation: as one variable increases, the other decreases
• No Correlation: there seems to be no pattern linking the two variables
[Figure: three example scatter graphs, labelled Positive, Negative and None]
6A
Scatter Graphs
In a study of a city, the population density, in people/hectare, and the distance from the city centre, in km, were investigated by choosing sample areas. The results are as follows:

Area                            A     B     C     D     E     F     G     H     I     J
Distance (km)                   0.6   3.8   2.4   3.0   2.0   1.5   1.8   3.4   4.0   0.9
Pop. density (people/hectare)   50    22    14    20    33    47    25    8     16    38

Plot a scatter graph and describe the correlation. Interpret what the correlation means.
[Figure: scatter graph of Pop. Density (people/hectare), axis from 0 to 50, against Distance from centre (km), axis from 0 to 4]
The correlation is negative, which means that as we get further from
the city centre, the population density decreases.
Variability of Bivariate Data
We learnt in chapter 3 that:
$\text{Variance} = \frac{\sum (x - \bar{x})^2}{n}$
(Although remember that this formula changed to make it easier to use)
In Correlation:
$S_{xx} = \sum (x - \bar{x})^2$   ('How x varies')
Similarly for y:
$S_{yy} = \sum (y - \bar{y})^2$   ('How y varies')
And you can also calculate the Co-variance of both variables:
$\text{Co-variance} = \frac{\sum (x - \bar{x})(y - \bar{y})}{n}$
$S_{xy} = \sum (x - \bar{x})(y - \bar{y})$   ('How x and y vary together')
6B/C
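The three sums can be computed directly from their definitions. The following is a minimal Python sketch, applied to the city data from the scatter graph example; the function name `deviation_sums` is an illustrative choice and does not come from the chapter.

```python
def deviation_sums(xs, ys):
    """Return (Sxx, Syy, Sxy) computed from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)                       # how x varies
    s_yy = sum((y - y_bar) ** 2 for y in ys)                       # how y varies
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))  # how they vary together
    return s_xx, s_yy, s_xy


# City data from the scatter graph example (areas A to J).
distance = [0.6, 3.8, 2.4, 3.0, 2.0, 1.5, 1.8, 3.4, 4.0, 0.9]
density = [50, 22, 14, 20, 33, 47, 25, 8, 16, 38]

s_xx, s_yy, s_xy = deviation_sums(distance, density)
print(f"Sxx = {s_xx:.2f}, Syy = {s_yy:.2f}, Sxy = {s_xy:.2f}")
print("Sxy is negative" if s_xy < 0 else "Sxy is not negative")
```

For this data Sxy comes out negative, which agrees with the negative correlation seen on the scatter graph.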
Like in chapter 3, we can use a formula which will make calculations easier.
$\text{Variance} = \frac{\sum (x - \bar{x})^2}{n}$   BUT: $S_{xx} = \sum (x - \bar{x})^2$
$\text{Variance} = \frac{S_{xx}}{n}$
$S_{xx} = n \times \text{Variance}$   (Multiply both sides by 'n')
$S_{xx} = n\left(\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2\right)$   (The easier formula for variance from chapter 3 is $\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2$)
$S_{xx} = n\left(\frac{\sum x^2}{n} - \frac{(\sum x)^2}{n^2}\right)$   (For the second fraction, square the top and bottom separately)
$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$   (Multiplying both fractions by 'n' will cancel a 'divide by n' from each of them)
Variability of Bivariate Data
These are the formulae for Sxx, Syy and Sxy. You are given these in the formula booklet. You do not need to know how to derive them (like we just did!)
$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$
$S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}$
$S_{xy} = \sum xy - \frac{\sum x \sum y}{n}$
Variability of Bivariate Data
Calculate Sxx, Syy and Sxy, based on the following information.
$n = 12$, $\sum x = 155$, $\sum y = 198$, $\sum x^2 = 2031$, $\sum y^2 = 3904$, $\sum xy = 2732$
$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 2031 - \frac{(155)^2}{12} = 28.92$
$S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} = 3904 - \frac{(198)^2}{12} = 637$
$S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 2732 - \frac{155 \times 198}{12} = 174.5$
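The same substitution can be checked in a few lines of Python. This is only a sketch: the function name `s_values_from_totals` and its parameter names are hypothetical, chosen to mirror the six totals given in the question.

```python
def s_values_from_totals(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Apply the formula-booklet versions of Sxx, Syy and Sxy to the totals."""
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    return s_xx, s_yy, s_xy


# Totals from the worked example: n = 12.
s_xx, s_yy, s_xy = s_values_from_totals(
    n=12, sum_x=155, sum_y=198, sum_x2=2031, sum_y2=3904, sum_xy=2732
)
print(round(s_xx, 2), round(s_yy, 2), round(s_xy, 2))  # 28.92 637.0 174.5
```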
Variability of Bivariate Data
The following table shows babies' head circumferences (cm) and the gestation period (weeks) for 6 newborn babies. Calculate Sxx, Syy and Sxy.

Baby                   A      B      C      D      E      F
Head size (x)          31     33     30     31     35     30
Gestation period (y)   36     37     38     38     40     40
x²                     961    1089   900    961    1225   900
y²                     1296   1369   1444   1444   1600   1600
xy                     1116   1221   1140   1178   1400   1200

We need:
$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$, $S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}$, $S_{xy} = \sum xy - \frac{\sum x \sum y}{n}$
$n = 6$, $\sum x = 190$, $\sum y = 229$, $\sum x^2 = 6036$, $\sum y^2 = 8753$, $\sum xy = 7255$
$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 6036 - \frac{(190)^2}{6} = 19.33$
$S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} = 8753 - \frac{(229)^2}{6} = 12.83$
$S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 7255 - \frac{190 \times 229}{6} = 3.33$
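When only the raw measurements are available, the totals have to be built first. The sketch below does that for the six babies, assuming the head sizes and gestation periods are typed in as two Python lists (the variable names are illustrative).

```python
# Raw data for the six babies, A to F.
head_size = [31, 33, 30, 31, 35, 30]   # x, in cm
gestation = [36, 37, 38, 38, 40, 40]   # y, in weeks

n = len(head_size)
sum_x = sum(head_size)
sum_y = sum(gestation)
sum_x2 = sum(x * x for x in head_size)                     # the x² row, added up
sum_y2 = sum(y * y for y in gestation)                     # the y² row, added up
sum_xy = sum(x * y for x, y in zip(head_size, gestation))  # the xy row, added up

s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

print(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy)  # 6 190 229 6036 8753 7255
print(round(s_xx, 2), round(s_yy, 2), round(s_xy, 2))  # 19.33 12.83 3.33
```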
Product Moment Correlation Coefficient
We can test the correlation of data by calculating the Product Moment Correlation Coefficient. This uses Sxx, Syy and Sxy.
$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$
The value of this number tells you what the correlation is and how strong it is.
The closer to 1, the stronger the positive correlation. The same applies for -1 and negative correlation. A value close to 0 implies no linear correlation.
[Figure: number line from -1 to 1, with Negative Correlation towards -1, No Linear Correlation at 0 and Positive Correlation towards 1]
Product Moment Correlation Coefficient
Given the following data, calculate the Product Moment Correlation Coefficient.
$S_{xx} = 74$, $S_{yy} = 150$, $S_{xy} = 102$
$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{102}{\sqrt{74 \times 150}} = 0.97$
There is positive correlation: as x increases, y does as well.
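As a quick check of this calculation, the formula for r can be written as a one-line Python function. The name `pmcc` is a hypothetical helper, not something defined in the chapter.

```python
import math


def pmcc(s_xx, s_yy, s_xy):
    """Product Moment Correlation Coefficient from Sxx, Syy and Sxy."""
    return s_xy / math.sqrt(s_xx * s_yy)


r = pmcc(s_xx=74, s_yy=150, s_xy=102)
print(round(r, 2))  # 0.97
```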
Limitations of the Product Moment Correlation Coefficient
Sometimes it may indicate correlation between unrelated variables.
• The number of cars on a particular street has increased, as have the sales of DVDs in town. The PMCC would indicate positive correlation where the two are most likely not linked.
• The speed of computers has increased, as has life expectancy amongst people. These are not directly linked, but are both due to scientific developments.
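Using Coding with the PMCC
Calculating the PMCC from this table.

x     102      103      102      103      104      103
y     320      335      345      355      360      380
x²    10404    10609    10404    10609    10816    10609
y²    102400   112225   119025   126025   129600   144400
xy    32640    34505    35190    36565    37440    39140

$n = 6$, $\sum x = 617$, $\sum y = 2095$, $\sum x^2 = 63451$, $\sum y^2 = 733675$, $\sum xy = 215480$
$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 63451 - \frac{(617)^2}{6} = 2.83$
$S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} = 733675 - \frac{(2095)^2}{6} = 2170.83$
$S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 215480 - \frac{617 \times 2095}{6} = 44.17$
6D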
$S_{xx} = 2.83$, $S_{yy} = 2170.83$, $S_{xy} = 44.17$
$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{44.17}{\sqrt{2.83 \times 2170.83}} = 0.563$
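Using Coding with the PMCC
Calculating the PMCC from this table, using coding.
$p = x - 100$, $q = \frac{y - 300}{5}$

x     102   103   102   103   104   103
y     320   335   345   355   360   380
p     2     3     2     3     4     3
q     4     7     9     11    12    16
p²    4     9     4     9     16    9
q²    16    49    81    121   144   256
pq    8     21    18    33    48    48

$n = 6$, $\sum p = 17$, $\sum q = 59$, $\sum p^2 = 51$, $\sum q^2 = 667$, $\sum pq = 176$
$S_{pp} = \sum p^2 - \frac{(\sum p)^2}{n} = 51 - \frac{(17)^2}{6} = 2.83$
$S_{qq} = \sum q^2 - \frac{(\sum q)^2}{n} = 667 - \frac{(59)^2}{6} = 86.83$
$S_{pq} = \sum pq - \frac{\sum p \sum q}{n} = 176 - \frac{17 \times 59}{6} = 8.83$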
$S_{pp} = 2.83$, $S_{qq} = 86.83$, $S_{pq} = 8.83$
$r = \frac{S_{pq}}{\sqrt{S_{pp} S_{qq}}} = \frac{8.83}{\sqrt{2.83 \times 86.83}} = 0.563$
So coding will not affect the PMCC!
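The effect of coding can also be checked numerically. The sketch below computes r twice, once from the original x and y values and once from the coded values p = x - 100 and q = (y - 300)/5; the helper name `pmcc_from_data` is an assumption made for this illustration.

```python
import math


def pmcc_from_data(xs, ys):
    """PMCC from paired raw data, using the formula-booklet S values."""
    n = len(xs)
    s_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    s_yy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    return s_xy / math.sqrt(s_xx * s_yy)


# Paired values from the table.
x = [102, 103, 102, 103, 104, 103]
y = [320, 335, 345, 355, 360, 380]

# Coding used in the example: p = x - 100, q = (y - 300) / 5.
p = [xi - 100 for xi in x]
q = [(yi - 300) / 5 for yi in y]

r_xy = pmcc_from_data(x, y)
r_pq = pmcc_from_data(p, q)
print(round(r_xy, 3), round(r_pq, 3))  # 0.563 0.563
```

Both calls give the same value up to floating-point rounding, which matches the conclusion that this kind of coding leaves the PMCC unchanged.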
Summary
• We have looked at plotting scatter graphs
• We have looked at calculating measures of variance, Sxx, Syy and Sxy
• We have also seen types of correlation and how to recognise them on a graph
• We have calculated the Product Moment Correlation Coefficient, and interpreted it. It is a numerical measure of correlation.