Chapter 4
Multiple Regression Analysis

Copyright © 2010 Pearson Education, Inc., publishing as Prentice-Hall.

LEARNING OBJECTIVES

Upon completing this chapter, you should be able to do the following:

• Determine when regression analysis is the appropriate statistical tool in analyzing a problem.

• Understand how regression helps us make predictions using the least squares concept.

• Use dummy variables with an understanding of their interpretation.

• Be aware of the assumptions underlying regression analysis and how to assess them.


LEARNING OBJECTIVES continued . . .

Upon completing this chapter, you should be able to do the following:

• Select an estimation technique and explain the difference between stepwise and simultaneous regression.

• Interpret the results of regression.

• Apply the diagnostic procedures necessary to assess “influential” observations.


Multiple Regression Defined

Multiple regression analysis . . . is a statistical technique that can be used to analyze the relationship between a single dependent (criterion) variable and several independent (predictor) variables.


Multiple Regression

Y' = b0 + b1X1 + b2X2 + . . . + bnXn + e

Y = dependent variable = # of credit cards

b0 = intercept (constant) = constant number of credit cards independent of family size and income

b1 = change in # of credit cards associated with a unit change in family size (regression coefficient)

b2 = change in # of credit cards associated with a unit change in income (regression coefficient)

X1 = family size

X2 = income

e = prediction error (residual)
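The slides do not tie the model to any particular software. As an illustration only, the sketch below fits the credit-card equation above by ordinary least squares in Python using statsmodels; the family-size, income, and credit-card values are made up for the example.

```python
# Minimal sketch: estimating Y' = b0 + b1*X1 + b2*X2 + e by least squares.
# The eight observations below are hypothetical, not the HBAT data.
import numpy as np
import statsmodels.api as sm

family_size  = np.array([2, 2, 4, 4, 5, 5, 6, 6])          # X1
income       = np.array([14, 16, 14, 17, 18, 21, 17, 25])  # X2, in $000s
credit_cards = np.array([4, 3, 4, 5, 6, 7, 6, 8])          # Y, # of credit cards

X = sm.add_constant(np.column_stack([family_size, income]))  # adds the b0 column
results = sm.OLS(credit_cards, X).fit()

print(results.params)   # b0 (intercept), b1 (family size), b2 (income)
print(results.resid)    # e, the prediction errors (residuals)
```

The fitted `results` object from a sketch like this is reused in the residual-plot and PRESS sketches later in the chapter.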


Variate (Y’) = XVariate (Y’) = X11bb11 + X + X22bb22 + . . . + X + . . . + Xnnbbnn

A variate value (Y’) is calculated for each respondent.A variate value (Y’) is calculated for each respondent.

The Y’ vaThe Y’ valuelue is a is a linear combinationlinear combination of the entire set of of the entire set of variables that best achieves the statistical objective.variables that best achieves the statistical objective.

Y’ X1

X2

X3


Multiple Regression Decision Process

Stage 1: Objectives of Multiple Regression

Stage 2: Research Design of Multiple Regression

Stage 3: Assumptions in Multiple Regression Analysis

Stage 4: Estimating the Regression Model and Assessing Overall Fit

Stage 5: Interpreting the Regression Variate

Stage 6: Validation of the Results


Stage 1: Objectives of Multiple Regression

In selecting suitable applications of multiple regression, the researcher must consider three primary issues:

1. the appropriateness of the research problem,
2. specification of a statistical relationship, and
3. selection of the dependent and independent variables.


Selection of Dependent and Independent Variables

The researcher should always consider three issues that can affect any decision about variables:

• The theory that supports using the variables,

• Measurement error, especially in the dependent variable, and

• Specification error.


Measurement Error in Regression

Measurement error that is problematic can be addressed through either of two approaches:

• Summated scales, or
• Structural equation modeling procedures.


Rules of Thumb 4–1

Meeting Multiple Regression Objectives

• Only structural equation modeling (SEM) can directly accommodate measurement error, but using summated scales can mitigate it when using multiple regression.

• When in doubt, include potentially irrelevant variables (as they can only confuse interpretation) rather than possibly omitting a relevant variable (which can bias all regression estimates).


Stage 2: Research Design of a Multiple Regression Analysis

Issues to consider . . .

• Sample size,

• Unique elements of the dependence relationship – can use dummy variables as independents, and

• Nature of the independent variables – can be both fixed and random.


Rules of Thumb 4–2

Sample Size Considerations

• Simple regression can be effective with a sample size of 20, but maintaining power at .80 in multiple regression requires a minimum sample of 50 and preferably 100 observations for most research situations.

• The minimum ratio of observations to variables is 5 to 1, but the preferred ratio is 15 or 20 to 1, and this should increase when stepwise estimation is used.

• Maximizing the degrees of freedom improves generalizability and addresses both model parsimony and sample size concerns.


Rules of Thumb 4–3

Variable Transformations

• Nonmetric variables can only be included in a regression analysis by creating dummy variables.

• Dummy variables can only be interpreted in relation to their reference category.

• Adding an additional polynomial term represents another inflection point in the curvilinear relationship.

• Quadratic and cubic polynomials are generally sufficient to represent most curvilinear relationships.

• Assessing the significance of a polynomial or interaction term is accomplished by evaluating incremental R2, not the significance of individual coefficients, due to high multicollinearity.
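As an illustrative sketch only (the variable names and data below are invented, not from HBAT), dummy coding a nonmetric variable and judging an added quadratic term by its incremental R2 might look like this with pandas and statsmodels:

```python
# Sketch of the two transformations above on made-up data: (1) dummy-code a
# nonmetric variable against a reference category, and (2) judge a quadratic
# term by the change in R2 (a partial F-test), not by its own coefficient.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "region": rng.choice(["North", "South", "West"], size=60),  # nonmetric
    "x": rng.normal(size=60),                                   # metric
})
df["y"] = 1.0 + 0.5 * df["x"] + 0.8 * df["x"] ** 2 + rng.normal(scale=0.5, size=60)

# (1) Dummy variables: "North" is dropped and serves as the reference category.
dummies = pd.get_dummies(df["region"], drop_first=True, dtype=float)

# (2) Nested models: baseline vs. baseline plus the quadratic (polynomial) term.
X_base = sm.add_constant(pd.concat([df[["x"]], dummies], axis=1))
X_poly = sm.add_constant(pd.concat([df[["x"]], (df["x"] ** 2).rename("x_sq"),
                                    dummies], axis=1))
base = sm.OLS(df["y"], X_base).fit()
poly = sm.OLS(df["y"], X_poly).fit()

# Incremental R2 assessed through the partial F-test of the nested models.
f_value, p_value, df_diff = poly.compare_f_test(base)
print(base.rsquared, poly.rsquared, p_value)
```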


Stage 3: Assumptions in Multiple Regression Analysis

• Linearity of the phenomenon measured.

• Constant variance of the error terms.

• Independence of the error terms.

• Normality of the error term distribution.


Rules of Thumb 4–4

Assessing Statistical Assumptions

• Testing assumptions must be done not only for each dependent and independent variable, but for the variate as well.

• Graphical analyses (i.e., partial regression plots, residual plots, and normal probability plots) are the most widely used methods of assessing assumptions for the variate.

• Remedies for problems found in the variate must be accomplished by modifying one or more independent variables as described in Chapter 2.


Stage 4: Estimating the Regression Model and Assessing Overall Model Fit

In Stage 4, the researcher must accomplish three basic tasks:

1. Select a method for specifying the regression model to be estimated,

2. Assess the statistical significance of the overall model in predicting the dependent variable, and

3. Determine whether any of the observations exert an undue influence on the results.


Variable Selection Approaches

• Confirmatory (Simultaneous)

• Sequential Search Methods (a minimal forward-selection sketch follows this list):
  – Stepwise (variables already in the equation may be removed if they no longer contribute significantly).
  – Forward Inclusion & Backward Elimination.
  – Hierarchical.

• Combinatorial (All-Possible-Subsets)
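The slides do not tie these procedures to any code. As a rough illustration of the sequential search idea only, the sketch below runs a bare-bones forward-inclusion loop that adds, one at a time, whichever candidate variable most improves adjusted R2; real packages use entry and removal F tests instead, so this is a simplification.

```python
# Illustrative forward-inclusion loop. X is assumed to be a pandas DataFrame of
# candidate independent variables and y the dependent variable; stopping on
# adjusted R2 is a simplification of the usual F-to-enter criterion.
import statsmodels.api as sm

def forward_selection(X, y, max_vars=None):
    selected, remaining = [], list(X.columns)
    best_adj_r2 = float("-inf")
    while remaining and (max_vars is None or len(selected) < max_vars):
        # Score every not-yet-included variable when added to the current set.
        scores = {}
        for var in remaining:
            model = sm.OLS(y, sm.add_constant(X[selected + [var]])).fit()
            scores[var] = model.rsquared_adj
        best_var = max(scores, key=scores.get)
        if scores[best_var] <= best_adj_r2:   # no candidate improves the model
            break
        best_adj_r2 = scores[best_var]
        selected.append(best_var)
        remaining.remove(best_var)
    return selected, best_adj_r2
```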


Description of HBAT Primary Database Variables

Variable   Description                                          Variable Type

Data Warehouse Classification Variables
X1         Customer Type                                        nonmetric
X2         Industry Type                                        nonmetric
X3         Firm Size                                            nonmetric
X4         Region                                               nonmetric
X5         Distribution System                                  nonmetric

Performance Perceptions Variables
X6         Product Quality                                      metric
X7         E-Commerce Activities/Website                        metric
X8         Technical Support                                    metric
X9         Complaint Resolution                                 metric
X10        Advertising                                          metric
X11        Product Line                                         metric
X12        Salesforce Image                                     metric
X13        Competitive Pricing                                  metric
X14        Warranty & Claims                                    metric
X15        New Products                                         metric
X16        Ordering & Billing                                   metric
X17        Price Flexibility                                    metric
X18        Delivery Speed                                       metric

Outcome/Relationship Measures
X19        Satisfaction                                         metric
X20        Likelihood of Recommendation                         metric
X21        Likelihood of Future Purchase                        metric
X22        Current Purchase/Usage Level                         metric
X23        Consider Strategic Alliance/Partnership in Future    nonmetric


Regression Analysis Terms

• Explained variance = R2 (coefficient of determination).

• Unexplained variance = residuals (error).

• Adjusted R-Square = reduces the R2 by taking into account the sample size and the number of independent variables in the regression model (it becomes smaller as we have fewer observations per independent variable).

• Standard Error of the Estimate (SEE) = a measure of the accuracy of the regression predictions. It estimates the variation of the dependent variable values around the regression line. It should get smaller as we add more independent variables, if they predict well.


Regression Analysis Terms Continued . . .

• Total Sum of Squares (SST) = total amount of variation that exists to be explained by the independent variables. SST = the sum of SSE and SSR.

• Sum of Squared Errors (SSE) = the variance in the dependent variable not accounted for by the regression model = residual. The objective is to obtain the smallest possible sum of squared errors as a measure of prediction accuracy.

• Sum of Squares Regression (SSR) = the amount of improvement in explanation of the dependent variable attributable to the independent variables.
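A small sketch tying these definitions together, with hypothetical inputs: given the observed values, the predicted values, and the number of independent variables, the quantities above and those on the previous slide follow directly.

```python
# Computing SST, SSE, SSR and the fit measures from the previous slide.
# y and y_hat are assumed to be NumPy arrays of observed and predicted values;
# k is the number of independent variables (all names are illustrative).
import numpy as np

def fit_statistics(y, y_hat, k):
    n = len(y)
    sse = np.sum((y - y_hat) ** 2)          # unexplained (residual) variation
    ssr = np.sum((y_hat - y.mean()) ** 2)   # improvement due to the predictors
    sst = np.sum((y - y.mean()) ** 2)       # total variation; SST = SSR + SSE
                                            # (exactly so when an intercept is fit)
    r2 = ssr / sst                          # coefficient of determination
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    see = np.sqrt(sse / (n - k - 1))        # standard error of the estimate
    return {"SST": sst, "SSE": sse, "SSR": ssr,
            "R2": r2, "adj R2": adj_r2, "SEE": see}
```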


Least Squares Regression Line

[Figure: scatterplot of Y against X with the least squares line and the line at the average of Y, showing for one observation the total deviation from the average split into the deviation explained by regression and the deviation not explained by regression]


Statistical vs. Practical Significance?

The F statistic is used to determine whether the overall regression model is statistically significant. If the F statistic is significant, it is unlikely that your sample would produce a large R2 when the population R2 is actually zero. A common rule of thumb is that results are considered statistically significant when there is less than a .05 probability that they are due to chance.

If the R2 is statistically significant, we then evaluate the strength of the linear association between the dependent variable and the several independent variables. R2, also called the coefficient of determination, is used to measure the strength of the overall relationship. It represents the amount of variation in the dependent variable associated with all of the independent variables considered together (it also is referred to as a measure of the goodness of fit). R2 ranges from 0 to 1.0 and represents the amount of the dependent variable “explained” by the independent variables combined. A large R2 indicates the straight line works well while a small R2 indicates it does not work well.

Even though an R2 is statistically significant, it does not mean it is practically significant. We also must ask whether the results are meaningful. For example, is the value of knowing you have explained 4 percent of the variation worth the cost of collecting and analyzing the data?
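To make both points concrete: the overall F test can be written directly from the sums of squares, and the hypothetical call below shows how an R2 of only .04 becomes statistically significant once the sample is large, which is exactly when practical significance must be judged separately (scipy supplies the p-value; all numbers are invented).

```python
# Overall model F test: F = (SSR / k) / (SSE / (n - k - 1)).
# ssr and sse are assumed to come from a fitted model with k predictors.
from scipy import stats

def overall_f_test(ssr, sse, n, k):
    f_value = (ssr / k) / (sse / (n - k - 1))
    p_value = stats.f.sf(f_value, k, n - k - 1)   # P(F >= f_value) under H0
    return f_value, p_value

# Hypothetical case: R2 = .04 (so SSR = .04*SST and SSE = .96*SST), n = 500, k = 3.
# The model is statistically significant (p < .05) yet explains only 4 percent
# of the variation -- statistical, but perhaps not practical, significance.
print(overall_f_test(ssr=0.04, sse=0.96, n=500, k=3))
```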


Rules of Thumb 4–5

Estimation Techniques

• No matter which estimation technique is chosen, theory must be a guiding factor in evaluating the final regression model because:
  – Confirmatory specification, the only method to allow direct testing of a pre-specified model, is also the most complex from the perspectives of specification error, model parsimony, and achieving maximum predictive accuracy.
  – Sequential search (e.g., stepwise), while maximizing predictive accuracy, represents a completely “automated” approach to model estimation, leaving the researcher almost no control over the final model specification.
  – Combinatorial estimation, while considering all possible models, still removes control from the researcher in terms of final model specification, even though the researcher can view the set of roughly equivalent models in terms of predictive accuracy.

• No single method is “best,” and the prudent strategy is to use a combination of approaches to capitalize on the strengths of each to reflect the theoretical basis of the research question.


Regression Coefficient Questions

Three questions about the statistical significance of any regression coefficient:

1) Was statistical significance established?

2) How does the sample size come into play?

3) Does it have practical significance in addition to statistical significance?


Rules of Thumb 4–6

Statistical Significance and Influential Observations

• Always ensure practical significance when using large sample sizes, as the model results and regression coefficients could be deemed irrelevant even when statistically significant, due just to the statistical power arising from large sample sizes.

• Use the adjusted R2 as your measure of overall model predictive accuracy.

• Statistical significance is required for a relationship to have validity, but statistical significance without theoretical support does not support validity.

• While outliers may be easily identifiable, the other forms of influential observations, which require more specialized diagnostic methods, can have an equal or even greater impact on the results.


Types of Influential Observations

Influential observations . . . include all observations that have a disproportionate effect on the regression results. There are three basic types based upon the nature of their impact on the regression results:

• Outliers are observations that have large residual values and can be identified only with respect to a specific regression model.

• Leverage points are observations that are distinct from the remaining observations based on their independent variable values.

• Influential observations are the broadest category, including all observations that have a disproportionate effect on the regression results. Influential observations potentially include outliers and leverage points but may include other observations as well.
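As one possible way to screen for all three types, the sketch below uses the influence diagnostics in statsmodels on invented data that includes one deliberately extreme case; the cutoffs in the comments are common conventions, not taken from these slides.

```python
# Flagging outliers, leverage points, and influential observations with the
# influence diagnostics in statsmodels, on made-up data with one extreme case.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 2))
y = 1 + X @ np.array([0.8, -0.5]) + rng.normal(scale=0.3, size=40)
X = np.vstack([X, [6.0, 6.0]])   # far from the other X values -> leverage point
y = np.append(y, 20.0)           # and badly predicted -> outlier / influential

results = sm.OLS(y, sm.add_constant(X)).fit()
influence = results.get_influence()

leverage = influence.hat_matrix_diag                   # leverage (hat values)
student_resid = influence.resid_studentized_external   # outliers: large residuals
cooks_d, _ = influence.cooks_distance                  # overall influence on fit

n, k = int(results.nobs), int(results.df_model)
flagged = np.where((leverage > 2 * (k + 1) / n)        # common leverage cutoff
                   | (np.abs(student_resid) > 2)       # residuals beyond +/- 2
                   | (cooks_d > 4 / n))[0]             # often-cited Cook's D rule
print(flagged)   # the appended case (index 40) should appear here
```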


Corrective Actions for Influentials

Influentials, outliers, and leverage points are based on one of four conditions, each of which has a specific course of corrective action:

1. An error in observations or data entry – remedy by correcting the data or deleting the case,

2. A valid but exceptional observation that is explainable by an extraordinary situation – remedy by deletion of the case unless variables reflecting the extraordinary situation are included in the regression equation,

3. An exceptional observation with no likely explanation – presents a special problem because there is no reason for deleting the case, but its inclusion cannot be justified either, suggesting analyses with and without the observations to make a complete assessment, and

4. An ordinary observation in its individual characteristics but exceptional in its combination of characteristics – indicates modifications to the conceptual basis of the regression model and should be retained.


Assessing Multicollinearity

The researcher’s task is to . . .

• Assess the degree of multicollinearity,

• Determine its impact on the results, and

• Apply the necessary remedies if needed.


Multicollinearity Diagnostics

• Variance Inflation Factor (VIF) – measures how much the variance of a regression coefficient is inflated by multicollinearity. If VIF equals 1, there is no correlation between that independent measure and the others; values somewhat above 1 indicate some association between predictor variables, but generally not enough to cause problems. A commonly used maximum acceptable VIF value is 10; anything higher indicates a problem with multicollinearity.

• Tolerance – the amount of variance in an independent variable that is not explained by the other independent variables. If the other variables explain a lot of the variance of a particular independent variable, we have a problem with multicollinearity; thus, small values for tolerance indicate problems. The cutoff value for tolerance is typically .10: a tolerance value smaller than .10 indicates a problem of multicollinearity.
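Both diagnostics follow directly from their definitions: regress each independent variable on all of the others and use that auxiliary R2. A minimal sketch (the function name is illustrative, and X is an assumed DataFrame of the independent variables only):

```python
# Tolerance = 1 - R2 of the auxiliary regression of one independent variable on
# the others; VIF = 1 / tolerance. X holds the independent variables only
# (no dependent variable, no constant column).
import pandas as pd
import statsmodels.api as sm

def collinearity_diagnostics(X: pd.DataFrame) -> pd.DataFrame:
    rows = []
    for col in X.columns:
        others = sm.add_constant(X.drop(columns=col))
        aux_r2 = sm.OLS(X[col], others).fit().rsquared
        tolerance = 1.0 - aux_r2               # variance NOT explained by the rest
        rows.append({"variable": col, "tolerance": tolerance,
                     "VIF": 1.0 / tolerance})  # 1.0 means no collinearity at all
    return pd.DataFrame(rows)
```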


Interpretation of Regression Results

• Coefficient of Determination

• Regression Coefficients (Unstandardized – bivariate)

• Beta Coefficients (Standardized)

• Variables Entered

• Multicollinearity?


Rules of Thumb 4–7

Interpreting the Regression Variate

• Interpret the impact of each independent variable relative to the other variables in the model, as model respecification can have a profound effect on the remaining variables (a short beta-weight sketch follows this list):
  – Use beta weights when comparing relative importance among independent variables.
  – Regression coefficients describe changes in the dependent variable, but can be difficult to compare across independent variables if the response formats vary.

• Multicollinearity may be considered “good” when it reveals a suppressor effect, but generally it is viewed as harmful since increases in multicollinearity: reduce the overall R2 that can be achieved, confound estimation of the regression coefficients, and negatively affect the statistical significance tests of coefficients.
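A beta weight is simply the unstandardized coefficient rescaled by the ratio of the predictor’s standard deviation to the dependent variable’s. A small illustrative helper (all names are assumptions, not from the slides):

```python
# Beta (standardized) weights from unstandardized coefficients, for comparing
# relative importance when predictors use different response formats.
# b maps variable names to unstandardized coefficients; X holds the predictors
# and y the dependent variable (all illustrative names).
import pandas as pd

def beta_weights(b: dict, X: pd.DataFrame, y: pd.Series) -> pd.Series:
    # beta_j = b_j * (s_xj / s_y); the intercept has no beta weight.
    return pd.Series({name: coef * X[name].std() / y.std()
                      for name, coef in b.items() if name in X.columns})
```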


Rules of Thumb 4–7 continued . . .

Interpreting the Regression Variate

• The generally accepted cutoff for multicollinearity (tolerance values of .10 or lower, corresponding to a VIF of 10 or higher) almost always indicates problems, but problems may be seen at much lower levels of collinearity or multicollinearity.
  – Bivariate correlations of .70 or higher may result in problems, and even lower correlations may be problematic if they are higher than the correlations between the dependent and independent variables.
  – Values much lower than the suggested thresholds (VIF values of even 3 to 5) may result in interpretation or estimation problems, particularly when the relationships with the dependent variable are weaker.


Residuals Plots

• Histogram of standardized residuals – enables you to determine if the errors are normally distributed.

• Normal probability plot – enables you to determine if the errors are normally distributed. It compares the observed (sample) standardized residuals against the expected standardized residuals from a normal distribution.

• Scatterplot of residuals – can be used to test regression assumptions. It compares the standardized predicted values of the dependent variable against the standardized residuals from the regression equation. If the plot exhibits a random pattern, this indicates no identifiable violations of the assumptions underlying regression analysis.
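A sketch of the three plots, assuming a fitted statsmodels OLS `results` object like the one in the earlier credit-card example; matplotlib and scipy are illustrative choices, not prescribed by the slides.

```python
# Histogram of standardized residuals, normal probability plot, and the
# standardized-residuals-versus-standardized-predictions scatterplot.
import matplotlib.pyplot as plt
from scipy import stats

std_resid = results.get_influence().resid_studentized_internal
fitted = results.fittedvalues
std_pred = (fitted - fitted.mean()) / fitted.std()

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

axes[0].hist(std_resid, bins=15)                       # normality of errors
axes[0].set_title("Standardized residuals")

stats.probplot(std_resid, dist="norm", plot=axes[1])   # observed vs. expected
axes[1].set_title("Normal probability plot")

axes[2].scatter(std_pred, std_resid)                   # look for a random pattern
axes[2].axhline(0.0, linestyle="--")
axes[2].set_title("Residuals vs. standardized predictions")

plt.tight_layout()
plt.show()
```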


Stage 6: Validation of the Results

• Additional or Split Samples

• Calculating the PRESS Statistic (see the sketch after this list)

• Comparing Regression Models

• Forecasting with the Model
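For the PRESS statistic referenced above, a minimal sketch using the standard leave-one-out shortcut: each deleted residual equals e_i / (1 - h_ii), so no model has to be refitted. `results` is again an assumed fitted statsmodels OLS object.

```python
# PRESS = sum over observations of the squared leave-one-out (deleted) residuals.
import numpy as np

def press_statistic(results):
    residuals = results.resid
    leverage = results.get_influence().hat_matrix_diag   # h_ii values
    deleted_resid = residuals / (1.0 - leverage)         # leave-one-out residuals
    return np.sum(deleted_resid ** 2)

# Smaller PRESS across candidate models suggests better out-of-sample prediction.
```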


Description of HBAT Primary Database Variables
(The full variable table is shown earlier in the chapter.)


Multiple Regression Learning Checkpoint

1. When should multiple regression be used?

2. Why should multiple regression be used?

3. What level of statistical significance and R2 would justify use of multiple regression?

4. How do you use regression coefficients?

