LEARNING OBJECTIVES
❶ Understand and interpret the terms dependent and independent variable.
❷ Calculate and interpret the coefficient of correlation, the coefficient of determination, and the standard error of estimate.
❸ Calculate and interpret Pearson's product moment coefficient of correlation, rp.
❹ Calculate and interpret Spearman's rank coefficient of correlation, rs.
Correlation Analysis and Scatter Diagram
Correlation analysis is the study of the relationship between variables. It is also defined as a group of techniques to measure the association between two variables.
A scatter diagram is a chart that portrays the relationship between the two variables. It is the usual first step in correlation analysis.
Dependent vs. Independent Variable
DEPENDENT VARIABLE: The variable that is being predicted or estimated. It is scaled on the Y-axis.
INDEPENDENT VARIABLE: The variable that provides the basis for estimation. It is the predictor variable. It is scaled on the X-axis.
The Coefficient of Correlation, r
The coefficient of correlation (r) is a measure of the strength of the linear relationship between two variables. It requires interval or ratio-scaled data.
It can range from -1.00 to 1.00.
Values of -1.00 or 1.00 indicate perfect correlation.
Values close to 0.0 indicate weak correlation.
Negative values indicate an inverse relationship and positive values indicate a direct relationship.
Scatter Plots and Correlation
A scatter plot (or scatter diagram) is used to show the relationship between two quantitative variables.
The linear relationship can be:
• Positive – as x increases, y increases
» As advertising dollars increase, sales increase
• Negative – as x increases, y decreases
» As expenses increase, net income decreases
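Pearson's Product Moment Coefficient of Correlation, rp
Formula:
$$r_p = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$
where
n = number of paired observations
rp = sample coefficient of correlation
X = value of the independent variable
Y = value of the dependent variable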
The duration of the last 9 business trips made by an employee and the corresponding expenses claimed are shown in the following table.
No. of days 3 5 2 1 3 4 1 3 1
Expenses ($) 100 300 90 30 240 200 150 170 60
Calculate the product moment coefficient of correlation between the number of days and the expenses.
Solution
Let X be the number of days and Y be the expenses.
        X      Y      XY     X²    Y²
        3      100    300    9     10,000
        5      300    1500   25    90,000
        2      90     180    4     8,100
        1      30     30     1     900
        3      240    720    9     57,600
        4      200    800    16    40,000
        1      150    150    1     22,500
        3      170    510    9     28,900
        1      60     60     1     3,600
Total   23     1340   4250   75    261,600
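$$r_p = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}} = \frac{9(4250) - (23)(1340)}{\sqrt{\left[9(75) - 23^2\right]\left[9(261{,}600) - 1340^2\right]}} = 0.8226$$
The coefficient of correlation, rp = +0.8226, indicates that there is a ___________ _______________ correlation between the number of days and expenses.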
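The calculation above can be checked with a short script. This is a minimal sketch of the sums formula for rp, using only the Python standard library; the function name `pearson_r` and the variable names are illustrative choices, not part of the original material.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r via the computational (sums) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Business-trip data from the worked example
days = [3, 5, 2, 1, 3, 4, 1, 3, 1]
expenses = [100, 300, 90, 30, 240, 200, 150, 170, 60]

r_trip = pearson_r(days, expenses)
print(round(r_trip, 4))  # 0.8226
```

The intermediate sums computed inside the function (23, 1340, 4250, 75 and 261,600) match the Total row of the table.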
Question:
The following sample observations
were randomly selected.
X: 4 5 3 6 10
Y: 4 6 5 7 7
Compute rp. Interpret your answer.
Spearman's Rank Coefficient of Correlation, rs
Introduced in 1904 by Charles Spearman (British psychologist).
Based on rank-order scores.
Works even if the original scores are nonnumeric, provided they can be ranked.
Much less affected by outliers than Pearson's coefficient.
Formula:
$$r_s = 1 - \frac{6\sum d^2}{n\left(n^2 - 1\right)}$$
where
d = r1 - r2
r1 = ranks for x
r2 = ranks for y
n = number of paired observations
Example 1: Rank Coefficient of Correlation (Data has already been ranked)
A German language teacher takes a group of 5 students. She ranks them in order of how confident they are when speaking (1 - extremely confident, 5 - not at all confident) and wants to correlate this with performance in the oral examination. A different teacher has given ratings of how well the students spoke in the oral exam (1 - hopeless, 5 - excellent). The following table was obtained as a result. Compute rs and interpret.
Person   Confidence   Oral exam performance
A        5            2
B        4            4
C        1            5
D        3            3
E        2            5
Solution
Let r1 = rankings of confidence
r2 = rankings of oral exam performance
Person   r1   r2   d    d²
A        5    2    3    9
B        4    4    0    0
C        1    5    -4   16
D        3    3    0    0
E        2    5    -3   9
Total                   34
$$r_s = 1 - \frac{6\sum d^2}{n\left(n^2 - 1\right)} = 1 - \frac{6(34)}{5\left(5^2 - 1\right)} = 1 - 1.7 = -0.7$$
The coefficient of rank correlation, rs = -0.7, indicates that there is a ___________ ____________ between the confidence and oral exam performance in their rankings.
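The same arithmetic can be sketched in code. This is a minimal illustration of the Σd² formula applied to values that are already ranks; the function name `spearman_from_ranks` is an illustrative choice, not part of the original material.

```python
def spearman_from_ranks(r1, r2):
    """Spearman's rs from two lists of ranks, using 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    n = len(r1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Example 1: persons A to E
confidence = [5, 4, 1, 3, 2]
oral_exam = [2, 4, 5, 3, 5]

rs_example1 = spearman_from_ranks(confidence, oral_exam)
print(round(rs_example1, 2))  # -0.7
```

Note that the two scales run in opposite directions here (1 is the most confident, but 5 is the best oral performance), which matters when interpreting the sign of rs.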
Example 2: Rank Coefficient of Correlation (Data has not yet been ranked)
The following data relates to the marks obtained by 5 students in the Economics and Statistics examinations. Compute rs and interpret.
Student   Economics   Statistics
1         36          52
2         98          91
3         75          68
4         65          53
5         82          62
Solution
Let X = marks in Economics
Y = marks in Statistics
r1 = ranks for X
r2 = ranks for Y
Student   X    Y    r1   r2   d    d²
1         36   52   5    5    0    0
2         98   91   1    1    0    0
3         75   68   3    2    1    1
4         65   53   4    4    0    0
5         82   62   2    3    -1   1
Total                              2
$$r_s = 1 - \frac{6\sum d^2}{n\left(n^2 - 1\right)} = 1 - \frac{6(2)}{5\left(5^2 - 1\right)} = 1 - 0.1 = 0.9$$
The coefficient of rank correlation, rs = +0.9, indicates that there is a ___________ _____________ between the rankings in both subjects.
Example 3: Rank Coefficient of Correlation (Tied Ranks)
The following data relates to the marks obtained by 5 students in Accounting and Costing examinations. Compute rs and interpret.
Student   Accounting   Costing
1         86           91
2         86           82
3         77           68
4         63           77
5         89           77
Solution
Let X = marks in Accounting
Y = marks in Costing
r1 = ranks for X
r2 = ranks for Y
Tied marks share the average of the ranks they occupy. For example, the two Accounting marks of 86 occupy ranks 2 and 3, so each receives rank 2.5.
Student   X    Y    r1    r2    d      d²
1         86   91   2.5   1     1.5    2.25
2         86   82   2.5   2     0.5    0.25
3         77   68   4     5     -1     1
4         63   77   5     3.5   1.5    2.25
5         89   77   1     3.5   -2.5   6.25
Total                                  12
$$r_s = 1 - \frac{6\sum d^2}{n\left(n^2 - 1\right)} = 1 - \frac{6(12)}{5\left(5^2 - 1\right)} = 1 - 0.6 = 0.4$$
The coefficient of rank correlation, rs = +0.4, indicates that there is a ___________ ____________ between the rankings in both subjects.
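Ranking raw marks by hand, especially with ties, is where slips tend to happen, so a small script can serve as a check. This is a minimal sketch under two assumptions taken from the worked examples: rank 1 goes to the highest mark, and tied marks share the average of the ranks they occupy. The helper names `average_ranks` and `spearman_rs` are illustrative, not part of the original material.

```python
def average_ranks(values):
    """Rank values with 1 = largest; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' over the run of values tied with the one at 'pos'
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2  # mean of 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def spearman_rs(xs, ys):
    """Rank both variables, then apply 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    r1, r2 = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Example 3 (tied ranks)
accounting = [86, 86, 77, 63, 89]
costing = [91, 82, 68, 77, 77]
print(average_ranks(accounting))  # [2.5, 2.5, 4.0, 5.0, 1.0]
print(average_ranks(costing))     # [1.0, 2.0, 5.0, 3.5, 3.5]
print(round(spearman_rs(accounting, costing), 2))  # 0.4
```

The same function reproduces Example 2 (no ties), giving 0.9. When ties are present, the Σd² formula is an approximation to computing Pearson's coefficient on the ranks; the sketch follows the worked example and applies the formula directly.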
Question:
The following sample observations
were randomly selected.
X: 4 5 3 6 10
Y: 4 6 5 7 7
Compute rs. Interpret your answer.
Coefficient of Determination
The coefficient of determination (r²) is the proportion of the total variation in the dependent variable (Y) that is explained or accounted for by the variation in the independent variable (X).
It is the square of the coefficient of correlation.
It ranges from 0 to 1.
It does not give any information on the direction of the relationship between the variables.
Coefficient of Determination (r²) - Example
Recall the business-trip example (number of days vs. expenses).
The coefficient of determination, r², is 0.677, found by (0.8226)².
This is a proportion or a percent; we can say that 67.7 percent of the variation in the expenses is explained, or accounted for, by the variation in the number of days.
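To see why r² can be read as "explained variation", it can help to compute it directly as 1 - SSE/SST rather than by squaring r. The sketch below assumes a least-squares straight line as the model for Y given X, which is not developed in this section; the function name `r_squared` is illustrative.

```python
def r_squared(xs, ys):
    """Proportion of variation in ys explained by a least-squares line on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # SSE: variation left unexplained by the line; SST: total variation in Y
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst

# Business-trip data
days = [3, 5, 2, 1, 3, 4, 1, 3, 1]
expenses = [100, 300, 90, 30, 240, 200, 150, 170, 60]

r2_trip = r_squared(days, expenses)
print(round(r2_trip, 3))  # 0.677
```

For a single predictor this value coincides with the square of the coefficient of correlation, which is why the shortcut of squaring 0.8226 gives the same 0.677.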
Testing the Significance of the Correlation Coefficient
H0: ρ = 0 (the correlation in the population is 0)
H1: ρ ≠ 0 (the correlation in the population is not 0)
Reject H0 if:
t > tα/2,n-2 or t < -tα/2,n-2
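The rejection rule compares a computed t with a critical value. The statistic itself is not written out here; the standard one for this test is t = r√(n - 2) / √(1 - r²), with n - 2 degrees of freedom. The sketch below applies it, purely as an illustration, to the business-trip result (r = 0.8226, n = 9), which is a different data set from the sales-call example. The critical value 2.365 (α = 0.05, two-tailed, 7 degrees of freedom) is taken from a standard t table, and the function name is an illustrative choice.

```python
from math import sqrt

def correlation_t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

# Illustration with the business-trip example (assumed inputs: r = 0.8226, n = 9)
r_trip, n_trip = 0.8226, 9
t_trip = correlation_t_statistic(r_trip, n_trip)

critical_t = 2.365  # t table value for alpha/2 = 0.025 and df = 7
reject_h0 = abs(t_trip) > critical_t
print(round(t_trip, 2), reject_h0)  # 3.83 True
```

Comparing |t| with the critical value is equivalent to the two-sided rule stated above.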
Testing the Significance of the Correlation Coefficient - Example
H0: ρ = 0 (the correlation in the population is 0)
H1: ρ ≠ 0 (the correlation in the population is not 0)
Reject H0 if:
t > tα/2,n-2 or t < -tα/2,n-2
t > t0.025,8 or t < -t0.025,8
t > 2.306 or t < -2.306
The computed t (3.297) is within the rejection region (3.297 > 2.306); therefore, we reject H0. This means the correlation in the population is not zero. From a practical standpoint, it indicates to the sales manager that there is a correlation between the number of sales calls made and the number of copiers sold.