
Copyright © 2010 Pearson Education, Inc. 17-1 Chapter Seventeen Correlation and Regression.

Date post: 30-Dec-2015
Upload: jeffry-martin
Chapter Seventeen

Correlation and Regression

Chapter Outline

1) Overview

2) Product-Moment Correlation

3) Regression Analysis

4) Bivariate Regression

5) Multiple Regression

6) Multicollinearity

Variances are similar, t-tests are appropriate

Variances are not similar, t-tests could be misleading

Correlation and regression analysis can help….

Product Moment Correlation

• The product moment correlation, r, summarizes the strength of association between two metric (interval or ratio scaled) variables, say X and Y.

• In other words, you can have a correlation coefficient for Likert scale items, not dichotomous items.

• It is an index used to determine whether a linear (straight-line) relationship exists between X and Y.

• As it was originally proposed by Karl Pearson, it is also known as the Pearson correlation coefficient. It is also referred to as simple correlation, bivariate correlation, or merely the correlation coefficient.

Linear relationships

Product Moment Correlation

From a sample of n observations, X and Y, the product moment correlation, r, can be calculated as:

$$
r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}
$$

Division of the numerator and denominator by (n - 1) gives

$$
r = \frac{\sum_{i=1}^{n} \dfrac{(X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}}{\sqrt{\sum_{i=1}^{n} \dfrac{(X_i - \bar{X})^2}{n - 1}} \, \sqrt{\sum_{i=1}^{n} \dfrac{(Y_i - \bar{Y})^2}{n - 1}}} = \frac{COV_{xy}}{S_x S_y}
$$

$\bar{X}$ = average of all x’s

$\bar{Y}$ = average of all y’s

Don’t worry, we can do this in SPSS…

Product Moment Correlation

• r varies between -1.0 and +1.0.

• The correlation coefficient between two variables will be the same regardless of their underlying units of measurement.

• For example, comparing a 5 point scale to a 7 point scale is okay.
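The unit-invariance claim can be checked numerically. The sketch below uses the duration and attitude values from Table 17.1 of this deck; the helper name `pearson_r` and the factor of 12 used to rescale duration are illustrative assumptions, not part of the original slides.

```python
import math


def pearson_r(xs, ys):
    """Product moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


# Table 17.1: duration of residence (X) and attitude toward the city (Y).
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]

# Assumption for illustration: express duration in a different unit
# by multiplying every value by 12.
duration_rescaled = [12 * x for x in duration]

r_original = pearson_r(duration, attitude)
r_rescaled = pearson_r(duration_rescaled, attitude)
print(round(r_original, 4), round(r_rescaled, 4))
```

Both calls should print the same value, because rescaling X multiplies the numerator and the denominator of r by the same factor.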

Explaining Attitude Toward the City of Residence

Example: Table 17.1

Respondent No.   Attitude Toward the City   Duration of Residence   Importance Attached to Weather
1                6                          10                      3
2                9                          12                      11
3                8                          12                      4
4                3                          4                       1
5                10                         12                      11
6                4                          6                       1
7                5                          8                       7
8                2                          2                       4
9                11                         18                      8
10               9                          9                       10
11               10                         17                      8
12               2                          2                       5

Product Moment Correlation

The correlation coefficient may be calculated as follows:

Duration of residence (X)

Attitude toward the city (Y)

STEP 1: GET THE AVERAGES OF X AND Y

$\bar{X}$ = (10 + 12 + 12 + 4 + 12 + 6 + 8 + 2 + 18 + 9 + 17 + 2)/12 = 9.333 = average of X

$\bar{Y}$ = (6 + 9 + 8 + 3 + 10 + 4 + 5 + 2 + 11 + 9 + 10 + 2)/12 = 6.583 = average of Y

STEP 2: GET THE NUMERATOR

For each respondent, subtract the average of x from their x; subtract the average of y from their y, then multiply, then sum all values.

$\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$

= (10-9.33)(6-6.58) + (12-9.33)(9-6.58) + (12-9.33)(8-6.58) + (4-9.33)(3-6.58) + (12-9.33)(10-6.58) + (6-9.33)(4-6.58) + (8-9.33)(5-6.58) + (2-9.33)(2-6.58) + (18-9.33)(11-6.58) + (9-9.33)(9-6.58) + (17-9.33)(10-6.58) + (2-9.33)(2-6.58)

= -0.3886 + 6.4614 + 3.7914 + 19.0814 + 9.1314 + 8.5914 + 2.1014 + 33.5714 + 38.3214 - 0.7986 + 26.2314 + 33.5714

= 179.6668

Product Moment Correlation

STEP 3: GET THE DENOMINATOR

$\sum_{i=1}^{n} (X_i - \bar{X})^2$

= (10-9.33)^2 + (12-9.33)^2 + (12-9.33)^2 + (4-9.33)^2 + (12-9.33)^2 + (6-9.33)^2 + (8-9.33)^2 + (2-9.33)^2 + (18-9.33)^2 + (9-9.33)^2 + (17-9.33)^2 + (2-9.33)^2

= 0.4489 + 7.1289 + 7.1289 + 28.4089 + 7.1289 + 11.0889 + 1.7689 + 53.7289 + 75.1689 + 0.1089 + 58.8289 + 53.7289

= 304.6668

$\sum_{i=1}^{n} (Y_i - \bar{Y})^2$

= (6-6.58)^2 + (9-6.58)^2 + (8-6.58)^2 + (3-6.58)^2 + (10-6.58)^2 + (4-6.58)^2 + (5-6.58)^2 + (2-6.58)^2 + (11-6.58)^2 + (9-6.58)^2 + (10-6.58)^2 + (2-6.58)^2

= 0.3364 + 5.8564 + 2.0164 + 12.8164 + 11.6964 + 6.6564 + 2.4964 + 20.9764 + 19.5364 + 5.8564 + 11.6964 + 20.9764

= 120.9168

STEP 4: COMPLETE THE FORMULA

Thus,

$$
r = \frac{179.6668}{\sqrt{(304.6668)(120.9168)}} = 0.9361
$$
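The four steps can be reproduced in a few lines of Python. This is a minimal sketch using the Table 17.1 data; the variable names are illustrative. Because it uses unrounded averages rather than 9.33 and 6.58, the intermediate sums may differ from the hand calculation in the fourth decimal place.

```python
import math

# Table 17.1: duration of residence (X) and attitude toward the city (Y).
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]
n = len(duration)

# STEP 1: averages of X and Y.
mean_x = sum(duration) / n
mean_y = sum(attitude) / n

# STEP 2: numerator, the sum of cross-products of deviations.
numerator = sum((x - mean_x) * (y - mean_y)
                for x, y in zip(duration, attitude))

# STEP 3: denominator, from the two sums of squared deviations.
ss_x = sum((x - mean_x) ** 2 for x in duration)
ss_y = sum((y - mean_y) ** 2 for y in attitude)
denominator = math.sqrt(ss_x * ss_y)

# STEP 4: complete the formula.
r = numerator / denominator

print(f"mean X = {mean_x:.3f}, mean Y = {mean_y:.3f}")
print(f"numerator = {numerator:.4f}")
print(f"sum sq X = {ss_x:.4f}, sum sq Y = {ss_y:.4f}")
print(f"r = {r:.4f}")
```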

Interpretation of the Correlation Coefficient

• The correlation coefficient ranges from −1 to 1.

• A value of 1 implies that a linear equation describes the relationship between X and Y perfectly, with all data points lying on a line for which Y increases as X increases.

• A value of −1 implies that all data points lie on a line for which Y decreases as X increases.

• A value of 0 implies that there is no linear correlation between the variables.

Positive and Negative Correlation

Interpretation of the Correlation Coefficient

• As a rule of thumb, correlation values can be interpreted in the following manner:

Correlation   Negative        Positive
None          −0.09 to 0.0    0.0 to 0.09
Small         −0.3 to −0.1    0.1 to 0.3
Medium        −0.5 to −0.3    0.3 to 0.5
Strong        −1.0 to −0.5    0.5 to 1.0
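The rule-of-thumb bands can be sketched as a small lookup. The function name `describe_correlation` is hypothetical. The table lists 0.3 and 0.5 in two bands each, so this sketch assumes a boundary value belongs to the stronger band; that choice is an assumption, not something the table settles.

```python
def describe_correlation(r):
    """Label a correlation using the rule-of-thumb bands above.

    Assumption: a boundary value (0.1, 0.3, 0.5) is assigned to the
    stronger of the two bands that touch it.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)
    if size < 0.1:
        return "none"
    if size < 0.3:
        strength = "small"
    elif size < 0.5:
        strength = "medium"
    else:
        strength = "strong"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"


# Values taken from elsewhere in this deck.
print(describe_correlation(0.9361))   # attitude vs. duration
print(describe_correlation(-0.740))   # age vs. Internet usage
```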

SPSS Windows: Correlations

1. Select ANALYZE from the SPSS menu bar.

2. Click CORRELATE and then BIVARIATE.

3. Move “variable x” into the VARIABLES box. Then move “variable y” into the VARIABLES box.

4. Check PEARSON under CORRELATION COEFFICIENTS.

5. Check ONE-TAILED under TEST OF SIGNIFICANCE.

6. Check FLAG SIGNIFICANT CORRELATIONS.

7. Click OK.

SPSS Example: Correlation

Correlations

                                        Age     InternetUsage   InternetShopping
Age                Pearson Correlation  1       -.740           -.622
                   Sig. (1-tailed)              .000            .002
                   N                    20      20              20
InternetUsage      Pearson Correlation  -.740   1               .767
                   Sig. (1-tailed)      .000                    .000
                   N                    20      20              20
InternetShopping   Pearson Correlation  -.622   .767            1
                   Sig. (1-tailed)      .002    .000
                   N                    20      20              20
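The significance values in the output come from a t test on each correlation. As a hedged sketch, the standard test statistic is t = r·sqrt((n − 2)/(1 − r²)) with n − 2 degrees of freedom; the function name below is hypothetical, and the p-value lookup itself is left to SPSS or a t table.

```python
import math


def correlation_t_statistic(r, n):
    """Return (t, degrees of freedom) for testing whether r differs from 0."""
    df = n - 2
    t = r * math.sqrt(df / (1.0 - r * r))
    return t, df


# InternetUsage vs. InternetShopping from the output above: r = .767, N = 20.
t_value, df = correlation_t_statistic(0.767, 20)
print(f"t = {t_value:.3f} with {df} degrees of freedom")
```

For this pair the statistic comes out at about 5.07, which agrees with the t reported for InternetUsage in the bivariate regression output later in the deck, as expected when there is a single predictor.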

Regression Analysis

Regression analysis examines associative relationships between a metric dependent variable and one or more independent variables in the following ways:

• Determine whether the independent variables explain a significant variation in the dependent variable: whether a relationship exists.

• Determine how much of the variation in the dependent variable can be explained by the independent variables: strength of the relationship.

• Determine the structure or form of the relationship: the mathematical equation relating the independent and dependent variables.

• For example, does a change in age predict a change in Internet usage? Regression can answer this.

Statistics Associated with Bivariate Regression Analysis

• Bivariate regression model. The basic regression equation is $Y_i = \beta_0 + \beta_1 X_i + e_i$, where Y = dependent or criterion variable, X = independent or predictor variable, $\beta_0$ = intercept of the line, $\beta_1$ = slope of the line, and $e_i$ is the error term associated with the i th observation.

Y = B0 + B1X1 + e

• Coefficient of determination. The strength of association is measured by the coefficient of determination, $r^2$. It varies between 0 and 1 and signifies the proportion of the total variation in Y that is accounted for by the variation in X.

• Note: This is the correlation coefficient squared. Above .5 is good.

• Estimated or predicted value. The estimated or predicted value of $Y_i$ is $\hat{Y}_i = a + bx$, where $\hat{Y}_i$ is the predicted value of $Y_i$, and a and b are estimators of $\beta_0$ and $\beta_1$, respectively.

Statistics Associated with Bivariate Regression Analysis

• Regression coefficient. The estimated parameter b is usually referred to as the non-standardized regression coefficient.

• Scattergram. A scatter diagram, or scattergram, is a plot of the values of two variables for all the cases or observations.

• Standard error of estimate. This statistic, SEE, is the standard deviation of the actual Y values from the predicted $\hat{Y}$ values.

• Standard error. The standard deviation of b, SEb, is called the standard error.
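To make these statistics concrete, the sketch below computes b, r², SEE, and SEb for the Table 17.1 data (attitude regressed on duration). It assumes the usual bivariate formulas: SEE uses n − 2 degrees of freedom, and SEb is SEE divided by the square root of the sum of squared deviations of X. Variable names are illustrative.

```python
import math

# Table 17.1: duration of residence (X) and attitude toward the city (Y).
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]
n = len(duration)

mean_x = sum(duration) / n
mean_y = sum(attitude) / n
ss_x = sum((x - mean_x) ** 2 for x in duration)
ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(duration, attitude))

# Regression coefficient b (slope) and intercept a.
b = ss_xy / ss_x
a = mean_y - b * mean_x

# Residual and total sums of squares.
predicted = [a + b * x for x in duration]
ss_residual = sum((y - y_hat) ** 2 for y, y_hat in zip(attitude, predicted))
ss_total = sum((y - mean_y) ** 2 for y in attitude)

r_squared = 1.0 - ss_residual / ss_total       # coefficient of determination
see = math.sqrt(ss_residual / (n - 2))         # standard error of estimate
se_b = see / math.sqrt(ss_x)                   # standard error of b

print(f"b = {b:.4f}, r^2 = {r_squared:.4f}, SEE = {see:.4f}, SEb = {se_b:.5f}")
```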

Conducting Bivariate Regression Analysis

Fig. 17.2

Plot the Scatter Diagram

Formulate the General Model

Estimate Standardized Regression Coefficients (b)

Test for Significance (p-value)

Determine the Strength of Association (r-square)

Check Prediction Accuracy

Conducting Bivariate Regression Analysis: The Bivariate Regression Model

In the bivariate regression model, the general form of a straight line is:

$$
Y = \beta_0 + \beta_1 X
$$

where

Y = dependent variable
X = independent (predictor) variable
$\beta_0$ = intercept of the line
$\beta_1$ = slope of the line

The regression procedure adds an error term:

$$
Y_i = \beta_0 + \beta_1 X_i + e_i
$$

where $e_i$ is the error term associated with the i th observation.

Plot of Attitude with Duration

Actual Responses – Attitude Towards City v. Duration of Residence

Is there a pattern? And which line is most accurate?

[Figure: scatter plot of Attitude (vertical axis) against Duration of Residence (horizontal axis)]

Plot of Attitude with Duration

• In order to determine the correct line, we use the least-squares procedure (or OLS regression). Essentially, this finds the line that lies closest to all the points.
  • Least-squares minimizes the sum of the squared vertical distances of all the points from the line.

• Once we find the line, a formula can be derived:
  • Attitude = 1.0793 + 0.5897 (duration of residence)
  • This means that attitude towards city can be predicted by duration of residence
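The least-squares idea can be illustrated directly: fit the line to the Table 17.1 data, then compare its sum of squared vertical distances with that of a few other candidate lines. This is a sketch; `fit_line`, `sum_squared_errors`, and the alternative lines are hypothetical choices made for illustration.

```python
# Table 17.1: duration of residence (X) and attitude toward the city (Y).
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = ss_xy / ss_x
    return mean_y - slope * mean_x, slope


def sum_squared_errors(intercept, slope, xs, ys):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


intercept, slope = fit_line(duration, attitude)
best = sum_squared_errors(intercept, slope, duration, attitude)

# Arbitrary alternative lines, chosen only for comparison.
alternatives = [(intercept + 0.5, slope), (intercept, slope + 0.05), (0.0, 0.7)]
worse = [sum_squared_errors(a, b, duration, attitude) for a, b in alternatives]

print(f"Attitude = {intercept:.4f} + {slope:.4f} (duration of residence)")
print(f"least-squares SSE = {best:.4f}; alternatives = {[round(w, 4) for w in worse]}")
```

Every alternative line should show a larger sum of squared errors than the fitted one, which is what "most accurate" means under the least-squares criterion.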

SPSS Windows: Bivariate Regression

1. Select ANALYZE from the SPSS menu bar.

2. Click REGRESSION and then LINEAR.

3. Move “Variable y” into the DEPENDENT box.

4. Move “Variable x” into the INDEPENDENT(S) box.

5. Select ENTER in the METHOD box.

6. Click on STATISTICS and check ESTIMATES under REGRESSION COEFFICIENTS.

7. Check MODEL FIT.

8. Click CONTINUE.

9. Click OK.

SPSS Example: Bivariate Regression

Model Summary

Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .767   .588       .565                .93902

Coefficients

                   Unstandardized Coefficients   Standardized Coefficients
Model              B       Std. Error            Beta                        t       Sig.
1 (Constant)       .122    .577                                              .211    .835
  InternetUsage    .853    .168                  .767                        5.071   .000

Multiple Regression

The general form of the multiple regression model is as follows:

$$
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \dots + \beta_k X_k + e
$$

We will want to run multiple regression if we believe that multiple IVs will predict one DV.

Perhaps this is a more appropriate formula:

Attitude = 0.33732 + 0.48108 (Duration of residence) + 0.28865 (Importance of weather)
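The two-predictor equation can be reproduced from Table 17.1 by solving the least-squares normal equations. The sketch below handles only the two-predictor case, working with deviations from the means and solving the resulting 2×2 system by Cramer's rule; the helper names are illustrative, and a statistics package would normally do this step.

```python
# Table 17.1 columns.
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]      # Y
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]   # X1
weather = [3, 11, 4, 1, 11, 1, 7, 4, 8, 10, 8, 5]       # X2


def mean(values):
    return sum(values) / len(values)


def cross_deviation(u, v):
    """Sum of products of deviations from the means of u and v."""
    mean_u, mean_v = mean(u), mean(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


s11 = cross_deviation(duration, duration)
s22 = cross_deviation(weather, weather)
s12 = cross_deviation(duration, weather)
s1y = cross_deviation(duration, attitude)
s2y = cross_deviation(weather, attitude)

# Normal equations in deviation form:
#   s11*b1 + s12*b2 = s1y
#   s12*b1 + s22*b2 = s2y
determinant = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / determinant
b2 = (s11 * s2y - s12 * s1y) / determinant
b0 = mean(attitude) - b1 * mean(duration) - b2 * mean(weather)

print(f"Attitude = {b0:.5f} + {b1:.5f} (Duration) + {b2:.5f} (Weather)")
```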

SPSS Windows: Multiple Regression

1. Select ANALYZE from the SPSS menu bar.

2. Click REGRESSION and then LINEAR.

3. Move “Variable y” into the DEPENDENT box.

4. Move “Variable x1, x2, x3…” into the INDEPENDENT(S) box.

5. Select ENTER in the METHOD box.

6. Click on STATISTICS and check ESTIMATES under REGRESSION COEFFICIENTS.

7. Check MODEL FIT.

8. Click CONTINUE.

9. Click OK.

SPSS Example: Multiple Regression

Model Summary

Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .771   .595       .547                .95866

Coefficients

                   Unstandardized Coefficients   Standardized Coefficients
Model              B       Std. Error            Beta                        t       Sig.
1 (Constant)       .862    1.542                                             .559    .583
  InternetUsage    .754    .255                  .679                        2.957   .009
  Age              -.011   .021                  -.119                       -.520   .610

Multicollinearity

• Multicollinearity arises when correlations among the predictors are very high.

• Multicollinearity can result in several problems, including:
  • The regression coefficients may not be estimated precisely.
  • The magnitudes, as well as the signs of the partial regression coefficients, may change.
  • It becomes difficult to assess the relative importance of the independent variables in explaining the variation in the dependent variable.

Multicollinearity

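• In our example, age is actually a significant predictor when entered on its own (Sig. = .003 in the output below), yet it was not significant in the multiple regression (Sig. = .610). That, plus the high correlation between age and Internet usage (r = -.740), indicates that multicollinearity exists.

• Since age has a very low coefficient value in the multiple regression, we can feel safe dropping it from our final model.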

Coefficients

                   Unstandardized Coefficients   Standardized Coefficients
Model              B       Std. Error            Beta                        t        Sig.
1 (Constant)       5.073   .709                                              7.160    .000
  Age              -.056   .017                  -.622                       -3.366   .003

Thank you!

Questions??
