Date post: | 27-Mar-2015 |
Upload: | ashton-mcmanus |
Correlation
Minium, Clarke & Coladarci, Chapter 7
Association
Univariate vs. Bivariate
one variable vs. two variables
When we have two variables, we can ask about the degree to which they co-vary
Is there any relationship between an individual’s score on one variable and his or her score on a second variable?
Number of beers consumed and reaction time (RT)
Number of hours of studying and score on an exam
Years of education and salary
Parent’s anxiety (or depression) and child anxiety (or depression)
The correlation coefficient
“a bivariate statistic that measures the degree of linear association between two quantitative variables.”
The Pearson product-moment correlation coefficient
Bivariate Distributions and Scatterplots
Scatter diagram
Graph that shows the degree and pattern of the relationship between two variables
Horizontal axis
Usually the variable that does the predicting (this is somewhat arbitrary)
e.g., price, studying, income, caffeine intake
Vertical axis
Usually the variable that is predicted
e.g., quality, grades, happiness, alertness
Bivariate Distributions and Scatterplots
Steps for making a scatter diagram
Draw axes and assign variables to them
Determine the range of values for each variable and mark the axes
Mark a dot for each person’s pair of scores
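The three steps above can be sketched in a few lines of Python. This is only an illustrative text-mode plot, not a replacement for a real plotting tool; the function name `scatter_rows` is hypothetical, and the data are the five X and Y scores used in the positive-correlation covariance example later in these slides.

```python
def scatter_rows(xs, ys, x_max, y_max):
    """Build a text scatter diagram for whole-number scores (illustrative helper)."""
    # Steps 1 and 2: the axes cover 0..x_max (horizontal) and 0..y_max (vertical).
    grid = [[" "] * (x_max + 1) for _ in range(y_max + 1)]
    # Step 3: mark one point for each person's pair of scores.
    for x, y in zip(xs, ys):
        grid[y][x] = "*"
    # Print the highest Y value first so the vertical axis reads upward.
    rows = [f"{y:2d} |" + "".join(grid[y]) for y in range(y_max, -1, -1)]
    rows.append("   +" + "-" * (x_max + 1))
    return rows

X = [9, 7, 5, 3, 1]      # variable on the horizontal axis
Y = [13, 9, 7, 11, 5]    # variable on the vertical axis
rows = scatter_rows(X, Y, x_max=12, y_max=15)
print("\n".join(rows))
```

Each `*` is one person; the marks drift upward to the right, which is the pattern described as a positive linear correlation.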
Bivariate Distributions and Scatterplots
Linear correlation
Pattern on a scatter diagram is a straight line
Example above
Curvilinear correlation
More complex relationship between variables
Pattern in a scatter diagram is not a straight line
Example below
Bivariate Distributions and Scatterplots
Positive linear correlation
High scores on one variable matched by high scores on another
Line slants up to the right
Negative linear correlation
High scores on one variable matched by low scores on another
Line slants down to the right
Bivariate Distributions and Scatterplots
Zero correlation
No line, straight or otherwise, can be fit to the relationship between the two variables
Two variables are said to be “uncorrelated”
Bivariate Distributions and Scatterplots
a. Negative linear correlation
b. Curvilinear correlation
c. Positive linear correlation
d. No correlation
The Covariance
Covariance is a number that reflects the degree and direction of association between two variables.
This is the definition
Note its similarity to the definition of variance (S²)
The logic of the Covariance
$$\mathrm{Cov} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n}$$
$$S^2 = \frac{\sum (X - \bar{X})(X - \bar{X})}{n}$$
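A minimal Python sketch of these two definitions may make the parallel concrete. The function names are illustrative, not taken from the textbook, and the sketch divides by n as the slides do (many statistics libraries divide by n - 1 by default, so their results will differ slightly).

```python
def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Sum of the cross-products of deviations, divided by n (as in the slides)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    mean_x, mean_y = mean(xs), mean(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)

def variance(xs):
    """Variance is the covariance of a variable with itself."""
    return covariance(xs, xs)

scores = [9, 7, 5, 3, 1]
print(variance(scores))  # 8.0
```

Writing the variance as `covariance(xs, xs)` is a deliberate choice here: it shows that the only difference between the two formulas is whether the second deviation comes from the same variable or from a different one.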
[Figure: scatterplot; horizontal axis "X Values" (0 to 12), vertical axis "Y Values" (0 to 15)]
The Covariance
Example (Positive Correlation) (see p. 109)
Person    X     Y    X-Xm    Y-Ym    (X-Xm)(Y-Ym)
A         9    13       4       4              16
B         7     9       2       0               0
C         5     7       0      -2               0
D         3    11      -2       2              -4
E         1     5      -4      -4              16
n = 5   Xm = 5   Ym = 9   sum = 28
Cov = 28/5 = 5.6
$$\mathrm{Cov} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n}$$
[Figure: scatterplot; horizontal axis "X Values" (0 to 12), vertical axis "Y Values" (0 to 15)]
The Covariance
Example (Negative Correlation) (see p. 109)
Person    X     Y    X-Xm    Y-Ym    (X-Xm)(Y-Ym)
A         9     5       4      -4             -16
B         7    11       2       2               4
C         5     7       0      -2               0
D         3     9      -2       0               0
E         1    13      -4       4             -16
n = 5   Xm = 5   Ym = 9   sum = -28
Cov = -28/5 = -5.6
$$\mathrm{Cov} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n}$$
[Figure: scatterplot; horizontal axis "X Values" (0 to 12), vertical axis "Y Values" (0 to 15)]
The Covariance
Example (Zero Correlation) (see p. 109)
Person    X     Y    X-Xm    Y-Ym    (X-Xm)(Y-Ym)
A         9    13       4     2.8            11.2
B         7     9       2    -1.2            -2.4
C         5     7       0    -3.2             0.0
D         3     9      -2    -1.2             2.4
E         1    13      -4     2.8           -11.2
n = 5   Xm = 5   Ym = 10.2   sum = 0
Cov = 0/5 = 0
$$\mathrm{Cov} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n}$$
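The three worked examples can be re-derived with a short script. This is a sketch for checking the arithmetic, with a hypothetical helper name (`deviation_table`); the scores are the ones listed in the three tables.

```python
def deviation_table(xs, ys):
    """Return (rows, cov); each row is (x, y, x - Xm, y - Ym, cross-product)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    rows = [(x, y, x - mean_x, y - mean_y, (x - mean_x) * (y - mean_y))
            for x, y in zip(xs, ys)]
    cov = sum(row[4] for row in rows) / n   # sum of cross-products over n
    return rows, cov

X = [9, 7, 5, 3, 1]   # the same X scores appear in all three examples
examples = {
    "positive": [13, 9, 7, 11, 5],
    "negative": [5, 11, 7, 9, 13],
    "zero": [13, 9, 7, 9, 13],
}

covariances = {}
for label, Y in examples.items():
    rows, cov = deviation_table(X, Y)
    covariances[label] = cov
    print(label, round(cov, 6))
```

The positive and negative examples give 5.6 and -5.6. The zero example comes out at 0 up to floating-point rounding, because its Y mean (10.2) is not exactly representable in binary.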
The Pearson r: the Pearson product-moment coefficient of correlation
Correlation coefficient, r, indicates the precise degree of linear correlation between two variables
Can vary from
-1 (perfect negative correlation)
through 0 (no correlation)
to +1 (perfect positive correlation)
r is more useful than Cov because it is independent of the underlying scales of the two variables
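if two variables produce an r of .5, for example, r will still equal .5 after any positive linear transformation of the two variables
linear transformation: adding or subtracting a constant, or multiplying or dividing by a constant (a negative multiplier on one variable reverses the sign of r but not its size)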
e.g., converting Celsius to Fahrenheit: F = 32 + 1.8C
e.g., converting Fahrenheit to Celsius: C = (F - 32) /1.8
$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y}) / n}{S_X S_Y} = \frac{\mathrm{Cov}}{S_X S_Y}$$
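The formula and the scale-independence claim can both be checked with a small Python sketch. The function name `pearson_r` is illustrative, the data are the X and Y scores from the positive covariance example, and treating X as a Celsius temperature is purely an assumption made to reuse the conversion formula above.

```python
import math

def pearson_r(xs, ys):
    """r = Cov / (S_X * S_Y), with n in every denominator as in the slides."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (s_x * s_y)

X = [9, 7, 5, 3, 1]
Y = [13, 9, 7, 11, 5]

r_original = pearson_r(X, Y)

# Positive linear transformation: pretend X is in Celsius and convert it.
X_fahrenheit = [32 + 1.8 * x for x in X]
r_rescaled = pearson_r(X_fahrenheit, Y)

# Multiplying one variable by a negative constant flips the sign of r.
r_flipped = pearson_r([-2 * x for x in X], Y)

print(round(r_original, 4), round(r_rescaled, 4), round(r_flipped, 4))
```

For these scores Cov = 5.6 and S_X = S_Y = √8, so r works out to about .70 before and after the rescaling, whereas the covariance itself would be multiplied by 1.8.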
The Pearson r: the Pearson product-moment coefficient of correlation
[Figure: six scatterplots illustrating r = .81, r = .46, r = .16, r = -.75, r = -.42, and r = -.18]
Correlation and Causality
When two variables are correlated, three possible directions of causality
1st variable causes 2nd
2nd variable causes 1st
Some 3rd variable causes both the 1st and the 2nd
There is inherent ambiguity in correlations
Factors influencing the Pearson r
Linearity
“To the extent that a bivariate distribution departs from linearity, r will underestimate that relationship.” (p. 121)
Outliers
“Discrepant data points, or outliers, affect the magnitude of r and the direction of the effect depending on the outlier’s location in the scatterplot.” (p. 122)
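Both effects are easy to see with small made-up data sets. The sketch below is illustrative only: the curvilinear data (Y equal to X squared) and the added outlier (20, 30) are assumptions chosen for the demonstration, while the base scores for the outlier case are those of the zero-correlation covariance example.

```python
import math

def pearson_r(xs, ys):
    """Pearson r computed as Cov / (S_X * S_Y), dividing by n throughout."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (s_x * s_y)

# Linearity: Y is perfectly determined by X, but the relationship is a curve.
curve_x = [-3, -2, -1, 0, 1, 2, 3]
curve_y = [x ** 2 for x in curve_x]
r_curve = pearson_r(curve_x, curve_y)

# Outliers: start from the zero-correlation example, then add one extreme point.
base_x = [9, 7, 5, 3, 1]
base_y = [13, 9, 7, 9, 13]
r_base = pearson_r(base_x, base_y)
r_outlier = pearson_r(base_x + [20], base_y + [30])

print(round(r_curve, 4), round(r_base, 4), round(r_outlier, 4))
```

The curved data give an r of zero despite the perfect (nonlinear) relationship, and the single added point moves r from essentially zero to about .87.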
Factors influencing the Pearson r
Restriction of Range
“Other things being equal, restricted variation in either X or Y will result in a lower Pearson r than would be obtained were variation greater.” (p. 122)
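A small constructed example shows the effect. The data below are invented for illustration (Y is X plus a fixed alternating error of +3 or -3), and the cut-off values 6 and 13 are arbitrary assumptions; nothing here comes from the textbook.

```python
import math

def pearson_r(xs, ys):
    """Pearson r computed as Cov / (S_X * S_Y), dividing by n throughout."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (s_x * s_y)

# Full range: X runs from 0 to 19; Y follows X with a fixed alternating error.
x_full = list(range(20))
y_full = [x + (3 if x % 2 == 0 else -3) for x in x_full]

# Restricted range: keep only the cases with X between 6 and 13.
kept = [(x, y) for x, y in zip(x_full, y_full) if 6 <= x <= 13]
x_restricted = [x for x, _ in kept]
y_restricted = [y for _, y in kept]

r_full = pearson_r(x_full, y_full)
r_restricted = pearson_r(x_restricted, y_restricted)
print(round(r_full, 2), round(r_restricted, 2))
```

The size of the error is identical in both sets, yet r drops from about .88 over the full range to about .49 in the restricted range, because the spread in X no longer dominates the error.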
Factors influencing the Pearson r
Context
“Because of the many factors that influence r, there is no such thing as the correlation between two variables. Rather, the obtained r must be interpreted in full view of the factors that affect it and the particular conditions under which it was obtained.” (p. 124)
Judging the Strength of Association
r²: proportion of common variance
The coefficient of determination, r², is the proportion of common variance shared by two variables.
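As a worked illustration (the value of r here is computed from the five-person positive covariance example in these slides, where Cov = 5.6 and both variances equal 8; it is not a figure quoted from the textbook):

$$r = \frac{\mathrm{Cov}}{S_X S_Y} = \frac{5.6}{\sqrt{8}\,\sqrt{8}} = .70, \qquad r^2 = (.70)^2 = .49$$

On this reading, about 49% of the variance is common to the two variables, even though r itself is .70.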
We will talk more about this when we discuss Chapter 8.