Correlation & Regression

Date post: 13-Jan-2016
Upload: thisbe
Transcript
Page 1: Correlation & Regression

Correlation & Regression

Association & Prediction

Page 2: Correlation & Regression

Measuring association

Editorial and letter to the editor, Indianapolis Star re CDC data

Differing opinions regarding degree of association

How to quantify the association between two variables
• e.g. smoking deaths & tax
• e.g. smoking percent & tax
• e.g. smoking percent & smoking deaths

Page 3: Correlation & Regression

Lots of Anecdotal & Clinical Relationships

Breast feeding & IQ

Smoking & Criminal Behavior

Abortion & Crime

Page 4: Correlation & Regression

Is there a relationship?

Student   SAT-V   GPA
John      333     1.0
Janet     756     3.8
Thomas    444     1.9
Scotty    629     3.2
Diana     501     2.3
Hilary    245     0.4

Page 5: Correlation & Regression

Plot out the data: The Scattergram

[Scattergram: SAT_V on the x-axis (200 to 800), GPA on the y-axis (0.0 to 4.0)]

Page 6: Correlation & Regression

Plot out the data: The Scattergram

[Scattergram: SAT_V on the x-axis (200 to 800), GPA on the y-axis (0.0 to 4.0). Labeled points: John and Janet (756, 3.8)]

Page 7: Correlation & Regression

Plot out the data: The Scattergram

[Scattergram: SAT_V on the x-axis (200 to 800), GPA on the y-axis (0.0 to 4.0)]

Each point represents a pair of scores from a single subject (case)

Page 8: Correlation & Regression

The Scattergram

[Scattergram: SAT_V (Mean = 484.67) on the x-axis (200 to 800), GPA (Mean = 2.1) on the y-axis (0.0 to 4.0)]

Page 9: Correlation & Regression

Add 2 more students

Student   SAT-V   GPA
John      333     1.0
Janet     756     3.8
Thomas    444     1.9
Scotty    629     3.2
Diana     501     2.3
Hilary    245     0.4
Joe       630     0.9
Patricia  404     3.1

Page 10: Correlation & Regression

The Scattergram

[Scattergram: SAT_V (Mean = 492.75) on the x-axis (200 to 800), GPA (Mean = 2.08) on the y-axis (0.0 to 4.0)]
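The means shown on the two scattergrams can be checked directly from the student tables. The short Python sketch below is an illustration added to this transcript, not part of the original slides; the variable names are assumptions chosen for readability.

```python
from statistics import mean

# Scores copied from the first student table (six students).
sat_v = [333, 756, 444, 629, 501, 245]
gpa = [1.0, 3.8, 1.9, 3.2, 2.3, 0.4]

mean_sat_6 = mean(sat_v)
mean_gpa_6 = mean(gpa)

# Add the two extra students, Joe and Patricia.
sat_v_8 = sat_v + [630, 404]
gpa_8 = gpa + [0.9, 3.1]

mean_sat_8 = mean(sat_v_8)
mean_gpa_8 = mean(gpa_8)

print(f"6 students: mean SAT-V = {mean_sat_6:.2f}, mean GPA = {mean_gpa_6:.3f}")
print(f"8 students: mean SAT-V = {mean_sat_8:.2f}, mean GPA = {mean_gpa_8:.3f}")
```

The eight-student GPA mean works out to 2.075, which the slide shows rounded to 2.08.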

Page 11: Correlation & Regression

Quantifying Relationships

Pearson: developed the technique

Pearson r
• Pearson correlation coefficient
• Pearson product-moment correlation coefficient
• r

Page 12: Correlation & Regression

Correlation

Correlation: how the score on one variable is related to the score on another variable

More specifically
• How relative performance on one variable is related to relative performance on another variable
• i.e. how each score relates to its mean and variability

Page 13: Correlation & Regression

Quantify relationship to the mean: Deviation Score

X = independent variable
Y = dependent variable

$X - \bar{X}$ (score on one variable related to its mean; deviation score of X; x)

$Y - \bar{Y}$ (score on another variable related to its mean; deviation score of Y; y)

Page 14: Correlation & Regression

Calculation of r : deviation score method

$$ r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2 \cdot \sum (Y_i - \bar{Y})^2}} $$

Page 15: Correlation & Regression

Calculation of r : deviation score method

$(X_i - \bar{X})$

• Deviation score of X
• x
• Note: will be + or - for each case

Page 16: Correlation & Regression

Calculation of r : deviation score method

$(Y_i - \bar{Y})$

• Deviation score of Y
• y
• Note: will be + or - for each case

Page 17: Correlation & Regression

Calculation of r : deviation score method

$(X_i - \bar{X})(Y_i - \bar{Y})$

• Product of paired deviation scores
• Product of x and y
• xy
• Note: product will be + or - for each case

Page 18: Correlation & Regression

Calculation of r : deviation score method

$\sum (X_i - \bar{X})(Y_i - \bar{Y})$

• Sum of the products of paired deviation scores
• Sum of xy
• Basis of the covariance (the sample covariance is this sum divided by n - 1)
• Note: will be + or - depending on ALL of the individual cases

Page 19: Correlation & Regression

Calculation of r : deviation score method

$$ r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2 \cdot \sum (Y_i - \bar{Y})^2}} $$
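The deviation score steps above (deviation of X, deviation of Y, their products, then the sums) can be sketched in a few lines of Python. This is a minimal illustration added to the transcript, not code from the lecture; the function name `pearson_r` and the variable names are assumptions. It is applied to the SAT-V and GPA scores from the earlier student tables.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r by the deviation score method (illustrative sketch)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 cases")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviation scores: x = X - mean(X), y = Y - mean(Y); each is + or -.
    x_dev = [x - mean_x for x in xs]
    y_dev = [y - mean_y for y in ys]
    # Sum of the products of paired deviation scores (sum of xy).
    sum_xy = sum(x * y for x, y in zip(x_dev, y_dev))
    # Sums of squared deviation scores (sum of x^2, sum of y^2).
    sum_x2 = sum(x * x for x in x_dev)
    sum_y2 = sum(y * y for y in y_dev)
    # Raises ZeroDivisionError if either variable has no spread at all.
    return sum_xy / sqrt(sum_x2 * sum_y2)


# Six students from the first table, then all eight.
sat_6 = [333, 756, 444, 629, 501, 245]
gpa_6 = [1.0, 3.8, 1.9, 3.2, 2.3, 0.4]
sat_8 = sat_6 + [630, 404]   # Joe, Patricia
gpa_8 = gpa_6 + [0.9, 3.1]

r_6 = pearson_r(sat_6, gpa_6)
r_8 = pearson_r(sat_8, gpa_8)
print(f"r for 6 students: {r_6:.2f}")
print(f"r for 8 students: {r_8:.2f}")
```

For the original six students r comes out very close to +1.00; adding Joe and Patricia, who do not follow the pattern, lowers it noticeably.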

Page 20: Correlation & Regression

Calculate r : T1&T2, T1&T3, T1&T4

Test 1 Test 2 Test 3 Test 4

Mike 11 11 5 9

Sue 9 9 7 5

Jan 7 7 9 11

Bob 5 5 11 7

Page 21: Correlation & Regression

r by deviation score method

Name T1(X) T2(Y) x y x^2 y^2 xy

Mike 11 11 3 3 9 9 9

Sue 9 9 1 1 1 1 1

Jan 7 7 -1 -1 1 1 1

Bob 5 5 -3 -3 9 9 9

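Mean of X = 8, mean of Y = 8. Column sums: Σx^2 = 20, Σy^2 = 20, Σxy = 20.

$$ r = \frac{\sum xy}{\sqrt{\sum x^2 \cdot \sum y^2}} = \frac{20}{\sqrt{20 \cdot 20}} = \frac{20}{20} = 1.00 $$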

Page 22: Correlation & Regression

r T1&T2 = 1.00
Perfect positive relationship (see scattergram on the next slide)

Test 1(X) Test 2(Y) Test 3(Y) Test 4(Y)

Mike 11 11 5 9

Sue 9 9 7 5

Jan 7 7 9 11

Bob 5 5 11 7

Page 23: Correlation & Regression

Graphical presentation of the data: perfect + relationship

[Scattergram: T1 on the x-axis (4 to 12), T2 on the y-axis (4 to 12)]

Page 24: Correlation & Regression

Test 1(X) Test 2(Y) Test 3(Y) Test 4(Y)

Mike 11 11 5 9

Sue 9 9 7 5

Jan 7 7 9 11

Bob 5 5 11 7

• T1 & T2: r = 1.00 (perfect positive)
• T1 & T3: r = -1.00 (perfect negative)
• T1 & T4: r = 0.00 (no relationship)
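The slides only show the deviation table for T1 & T2. As a check on the other two results, the same method can be worked through for T1 & T3 and T1 & T4. This working is added here for illustration; it uses the deviation scores x = (3, 1, -1, -3) for Test 1, and every test has a mean of 8.

```latex
% T1 & T3: deviation scores of Test 3 are y = (-3, -1, 1, 3)
\sum xy = (3)(-3) + (1)(-1) + (-1)(1) + (-3)(3) = -20, \qquad
\sum x^2 = 20, \qquad \sum y^2 = 20

r_{T1,T3} = \frac{\sum xy}{\sqrt{\sum x^2 \cdot \sum y^2}}
          = \frac{-20}{\sqrt{20 \cdot 20}} = -1.00

% T1 & T4: deviation scores of Test 4 are y = (1, -3, 3, -1)
\sum xy = (3)(1) + (1)(-3) + (-1)(3) + (-3)(-1) = 0, \qquad
\sum y^2 = 1 + 9 + 9 + 1 = 20

r_{T1,T4} = \frac{0}{\sqrt{20 \cdot 20}} = 0.00
```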

Page 25: Correlation & Regression

Possible values of r

Range from -1.00 to +1.00; any value in between is possible

• the closer the value to -1.00, the stronger the negative relationship between the two variables

• the closer the value to +1.00, the stronger the positive relationship between the two variables

Guess the correlation game

Page 26: Correlation & Regression

Possible values of r

Just what does r value of +0.25 mean?

Page 27: Correlation & Regression

Factors limiting a PMCC

1. Homogeneous group
• subjects very similar on the variables

2. Unreliable measurement instrument/technique
• measurements bounce all over the place

3. Nonlinear relationship
• Pearson's r is based on linear relationships

4. Ceiling or floor with measurement
• lots of scores clumped at the top or bottom, so there is no spread, which creates a problem similar to the homogeneous group [skewed data set(s)]
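The homogeneous group problem (restricted range) can be seen with a small made-up data set. The Python sketch below is an illustration only: the data are hypothetical, constructed so that Y follows X with a fixed wobble, and `pearson_r` is an assumed helper name implementing the deviation score formula.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r by the deviation score method (illustrative helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    x_dev = [x - mean_x for x in xs]
    y_dev = [y - mean_y for y in ys]
    sum_xy = sum(x * y for x, y in zip(x_dev, y_dev))
    return sum_xy / sqrt(sum(x * x for x in x_dev) * sum(y * y for y in y_dev))


# Hypothetical scores: Y tracks X, plus a fixed +3 / -3 wobble.
xs = list(range(1, 21))
ys = [x + (3 if x % 2 == 0 else -3) for x in xs]

# Full group: X spans 1 to 20.
r_full = pearson_r(xs, ys)

# Homogeneous subgroup: keep only cases with X between 8 and 13.
subgroup = [(x, y) for x, y in zip(xs, ys) if 8 <= x <= 13]
r_restricted = pearson_r([x for x, _ in subgroup], [y for _, y in subgroup])

print(f"full range:       r = {r_full:.2f}")
print(f"restricted range: r = {r_restricted:.2f}")
```

The underlying relationship is identical in both cases; only the spread of X changes, and r is much lower in the restricted subgroup.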

Page 28: Correlation & Regression

Assumptions of the PMCC

1. Measures are approximately normally distributed
• check with frequency distribution

2. The spread of scores on one measure is similar across the range of the other (homoscedasticity)
• check with scatterplot

3. The relationship is linear
• check with scatterplot

4. The sample represents the population

5. Variables are measured on an interval or ratio scale

Page 29: Correlation & Regression

NotCausation

Only Association

Page 30: Correlation & Regression

Correlations and causality

Correlations only describe the relationship, they do not prove cause and effect

Correlation is a necessary, but not a sufficient condition for determining causality

There are Three Requirements to Infer a Causal Relationship…

Page 31: Correlation & Regression

Correlations and causality

1. A statistically significant relationship between the variables

2. The causal variable occurred prior to the other variable

3. There are no other factors that could account for the cause

Correlation studies do not meet the last requirement and may not meet the second requirement

Page 32: Correlation & Regression

Correlations and causality

If there is a relationship between A and B, it could be because:
• A -> B
• A <- B
• A <- C -> B

Page 33: Correlation & Regression

Smoking & LBP

[Diagram: Smoking and Low Back Pain, r = 0.45]

Page 34: Correlation & Regression

Smoking & LBP

[Diagram: Smoking and Low Back Pain, r = 0.45. Does low back pain lead to smoking?]

Page 35: Correlation & Regression

Smoking & LBP

[Diagram: Smoking and Low Back Pain, r = 0.45. Do lifestyle factors (e.g. strength) account for both?]

Page 36: Correlation & Regression

Interpreting r

r is not a proportion.
• r = 0.25 does not mean one quarter similarity between the variables
• r = 0.50 does not mean one half similarity between the variables

r describes the co-variability of the variables

Page 37: Correlation & Regression

Coefficient of Determination

r^2: simply square the r value

What percentage of the variance in each variable is explained by knowledge of the variance of the other variable?
• what percentage of the variance within Y is predicted by the variance within X?

Page 38: Correlation & Regression

Coefficient of Determination

(Shared Variation)

• Correlation coefficient squared
• Percentage of the variability among scores on one variable that can be attributed to differences in the scores on the other variable

The coefficient of determination is useful because it gives the proportion of the variance of one variable that is predictable from the other variable

Page 39: Correlation & Regression

Notes about r^2

Coefficient of determination explains shared variance
• therefore 1 - r^2 is unexplained

r = 0.70 gives about 50% explained variance (why? because 0.70^2 = 0.49)

Always calculate r^2 to evaluate the extent of the correlation
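Squaring a few r values shows why r should not be read as a proportion. The Python sketch below is an added illustration; the function name `shared_variance` is an assumption, and the r values are the ones mentioned in the slides (0.25, 0.50, 0.70) plus the Smoking & LBP value of 0.45.

```python
def shared_variance(r):
    """Return (explained, unexplained) proportions of variance for a given r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1.00 and +1.00")
    explained = r ** 2           # coefficient of determination, r^2
    unexplained = 1 - explained  # 1 - r^2
    return explained, unexplained


for r in (0.25, 0.45, 0.50, 0.70):
    explained, unexplained = shared_variance(r)
    print(f"r = {r:.2f}: r^2 = {explained:.4f} "
          f"({explained:.1%} explained, {unexplained:.1%} unexplained)")
```

An r of 0.25 corresponds to only about 6% shared variance, and r must reach roughly 0.70 before about half of the variance is explained.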

Page 40: Correlation & Regression

Use of Correlation

Reliability of a test/measure
• relate test-retest scores
• relate tester 1 to tester 2

Validity of a test
• HR and fitness (aerobic capacity)

Relate multiple dependent variables (do all measure the same construct?)

Page 41: Correlation & Regression

Cautions concerning r

Appropriate only for linear relationships (use Anxiety&Performance.sav)

Sensitive to range of talent
• smaller range, lower r

Sensitive to sampling variation
• smaller samples, more unstable

The r calculated from a sample is not the population r
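The first caution, that r only captures linear relationships, can be illustrated with an inverted-U pattern. The Python sketch below uses hypothetical anxiety and performance scores invented for this illustration; it does not reproduce the contents of Anxiety&Performance.sav, and `pearson_r` is an assumed helper name.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r by the deviation score method (illustrative helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    x_dev = [x - mean_x for x in xs]
    y_dev = [y - mean_y for y in ys]
    sum_xy = sum(x * y for x, y in zip(x_dev, y_dev))
    return sum_xy / sqrt(sum(x * x for x in x_dev) * sum(y * y for y in y_dev))


# Hypothetical inverted-U: performance peaks at moderate anxiety (5)
# and falls off symmetrically on either side.
anxiety = list(range(1, 10))
performance = [10 - (a - 5) ** 2 for a in anxiety]

r_curved = pearson_r(anxiety, performance)
print(f"r for the inverted-U data: {r_curved:.2f}")
```

Performance here is completely determined by anxiety, yet r is essentially zero, because the rise on one side of the peak cancels the fall on the other. A scattergram would reveal the relationship that r misses.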

Page 42: Correlation & Regression

Anxiety & Skill Performance

Page 43: Correlation & Regression

Meyer et al, 2002. MSSE, 34:7, 1065-1070.

Page 44: Correlation & Regression

Adachi et al, 2002. Mechanoreceptors in the ACL contribute to the joint position sense. Acta Orthop Scand, 73:2:330-334.

Page 45: Correlation & Regression

Click here for a web site to review correlation concepts introduced in this lecture

