Describing Relationships: Scatterplots and Correlation

Post on 13-Dec-2015


Two Quantitative Variables

Plot observed data on a graph:

- Horizontal (X axis): independent variable (explanatory variable)
- Vertical (Y axis): dependent variable (response variable)

We call the graph a scatter diagram or scatter plot.

Example

X = Dosage of Drug
Y = Reduction in Blood Pressure

X      Y
100    10
200    18
300    32
400    44
500    56

Correlation: a measure of association that tests whether a relationship exists between two variables

Perfect positive linear correlation

[Figure: scatterplot of C1 and C2, both axes scaled 0 to 50]

Perfect negative linear correlation

[Figure: scatterplot of C1 and C2, both axes scaled 0 to 50]

Positive linear correlation

[Figure: scatterplot of C1 and C2, both axes scaled 0 to 50]

Negative linear correlation

[Figure: scatterplot of C1 and C2, both axes scaled 0 to 50]

Non-linear correlation

[Figure: scatterplot of C1 and C2, horizontal axis scaled 0 to 50, vertical axis scaled 0 to 30]

No correlation

[Figure: scatterplot of C1 and C2, both axes scaled 0 to 50]

We wish to quantify the strength and direction of a linear relationship (Pearson product-moment correlation coefficient, r)

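$$
r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}}
  = \frac{n \sum xy - \sum x \sum y}{\sqrt{n \sum x^2 - \left(\sum x\right)^2} \, \sqrt{n \sum y^2 - \left(\sum y\right)^2}}
$$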

-1 <= r <= 1

r = 1 Perfect Positive Linear Correlation

r = -1 Perfect Negative Linear Correlation

r = 0 no linear relationship

General Rule: |r| >= .75 indicates a strong linear relationship
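
The boundary cases listed here can be checked numerically. The following is a minimal Python sketch, not taken from the slides: the function name `pearson_r` and the small data sets are illustrative assumptions. It computes r from sums of deviations about the means and evaluates it on a perfectly increasing line, a perfectly decreasing line, and a symmetric curve.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r from deviations about the means (hypothetical helper)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)

# Illustrative data, not from the slides
xs = [0, 10, 20, 30, 40, 50]
r_up = pearson_r(xs, [x for x in xs])         # y = x: perfect positive
r_down = pearson_r(xs, [50 - x for x in xs])  # y = 50 - x: perfect negative

# y = x^2 on symmetric x: y is fully determined by x, but not linearly
xs_sym = [-2, -1, 0, 1, 2]
r_curve = pearson_r(xs_sym, [x * x for x in xs_sym])

print(r_up, r_down, r_curve)
```

The third case is the reason r = 0 is read as "no linear relationship" rather than "no relationship": the curve is an exact dependence, yet the positive and negative products of deviations cancel.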

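Example

X = Dosage of Drug
Y = Reduction in Blood Pressure

$$
r = \frac{n \sum xy - \sum x \sum y}{\sqrt{n \sum x^2 - \left(\sum x\right)^2} \, \sqrt{n \sum y^2 - \left(\sum y\right)^2}} = .99728
$$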
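
As a check on this value, the arithmetic can be written out using the five (X, Y) pairs from the drug example. The sums below were computed from that table and are not shown on the slide itself.

```latex
\[
n = 5, \quad \sum x = 1500, \quad \sum y = 160, \quad
\sum xy = 59800, \quad \sum x^2 = 550000, \quad \sum y^2 = 6520
\]
\[
r = \frac{5(59800) - (1500)(160)}
         {\sqrt{5(550000) - 1500^2} \, \sqrt{5(6520) - 160^2}}
  = \frac{59000}{\sqrt{500000} \, \sqrt{7000}}
  \approx 0.99728
\]
```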

The R-squared value is the percent of the variation of Y explained by the model

$$
0\% \le R^2 \le 100\%
$$

The higher $R^2$ is, the better the model

- For the Drug example: $r = .99728$, so $R^2 = 99.5\%$
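
To make "variation explained by the model" concrete, here is a short Python sketch for the drug data. It assumes the model is the least-squares straight line, which the slides do not state explicitly, and the variable names are illustrative. It computes R-squared as one minus the unexplained share of the variation in Y and compares it with the square of r.

```python
# Drug example data from the slides
dosage = [100, 200, 300, 400, 500]   # X
reduction = [10, 18, 32, 44, 56]     # Y

n = len(dosage)
x_bar = sum(dosage) / n
y_bar = sum(reduction) / n
sxx = sum((x - x_bar) ** 2 for x in dosage)
syy = sum((y - y_bar) ** 2 for y in reduction)  # total variation of Y
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(dosage, reduction))

# Assumed model: least-squares line y_hat = intercept + slope * x
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Variation of Y left unexplained by the line (sum of squared residuals)
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(dosage, reduction))

r_squared = 1 - sse / syy
r = sxy / (sxx * syy) ** 0.5

print(f"slope = {slope:.3f}, intercept = {intercept:.1f}")
print(f"R^2 = {r_squared:.1%}, r^2 = {r * r:.1%}")
```

Under that assumption both quantities come out at about 99.5%, matching the figure quoted for the drug example. For a straight-line fit with one explanatory variable, R-squared is the square of r.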

"Causal" Research: when the objective is to determine if a variable causes a certain behavior (whether there is a cause and effect relationship between variables)

- Note that it is never possible to prove causality just based on the relationship between two variables
- There is a strong statistical correlation over months of the year between ice cream consumption and the number of assaults in the U.S.
- Does this mean ice cream manufacturers are responsible for crime?

No! The correlation occurs statistically because the hot temperatures of summer increase both ice cream consumption and assaults

- Thus, correlation does NOT imply causation
- Other factors besides cause and effect can create an observed correlation
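
The ice cream argument can be imitated with a small simulation. Everything in this Python sketch is an assumption made for illustration: the numbers are synthetic, not real consumption or crime data, and the partial correlation used at the end is a standard technique that the slides do not mention. Temperature is generated first and then drives both series, with no direct link between them.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r from deviations about the means (hypothetical helper)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

random.seed(0)  # fixed seed so the sketch is repeatable

# Entirely synthetic numbers: temperature drives both series,
# and neither series has any direct effect on the other.
temperature = [random.uniform(0, 35) for _ in range(2000)]
ice_cream = [2.0 * t + random.gauss(0, 5) for t in temperature]
assaults = [1.5 * t + random.gauss(0, 5) for t in temperature]

r_ice_assault = pearson_r(ice_cream, assaults)
r_ice_temp = pearson_r(ice_cream, temperature)
r_assault_temp = pearson_r(assaults, temperature)

# Partial correlation of ice cream and assaults, holding temperature fixed
partial = (r_ice_assault - r_ice_temp * r_assault_temp) / sqrt(
    (1 - r_ice_temp ** 2) * (1 - r_assault_temp ** 2)
)

print(f"raw r = {r_ice_assault:.2f}, partial r given temperature = {partial:.2f}")
```

In this setup the raw correlation between the two series is strong, while the partial correlation given temperature is close to zero. That is the pattern expected when a third factor, rather than cause and effect, produces the association.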

To establish whether two variables are causally related you must establish:

- Time order: the cause must have occurred before the effect
- Co-variation (statistical association): the correlation coefficient must show a strong relationship between the dependent and independent variable
- Rationale: there must be a logical and compelling explanation for why these two variables are related
- Non-spuriousness: it must be established that the independent variable X, and only X, was the cause of changes in the dependent variable Y; rival explanations must be ruled out

This type of research is very complex and the researcher can never be completely certain that there are not other factors influencing the causal relationship

- To help identify a relationship as cause and effect, a study is often performed many times
- The study should yield the same results every time it is conducted (if this occurs it helps rule out rival explanations)