Date posted: 14-Apr-2018
Uploaded by: fahad-mushtaq
7/30/2019 Chapter 15 CRAVEN SALES MODEL - Multiple Regression
Variable Selection in Multiple Regression Modeling
Use multiple regression to answer the following questions:
1) Taking Sales as the target variable, develop a regression model.
2) Which variables should be included, and why?
3) Is the model adequate for future predictions?
Section A : Find Correlations between Predictor Variables
Variable Accounts has significant correlations with variables Time, Poten and Share.
Variable Poten has a significant correlation with variable Time.
Variable Rating has a significant correlation with variable AdvExp.
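A minimal sketch of how such a pairwise correlation screen could be run outside SPSS. The numbers below are made-up placeholder values, not the actual data set, and the helper name `pearson_r` is illustrative. The sketch reports the coefficients only and does not test them for significance.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Placeholder values for three predictors (NOT the actual data set).
predictors = {
    "Accounts": [75.0, 90.0, 140.0, 60.0, 120.0, 105.0],
    "Time": [43.0, 108.0, 130.0, 8.0, 120.0, 85.0],
    "Rating": [4.9, 5.1, 2.9, 3.4, 4.6, 4.5],
}

# Correlation for every unordered pair of predictors.
names = list(predictors)
corr = {
    (a, b): pearson_r(predictors[a], predictors[b])
    for i, a in enumerate(names)
    for b in names[i + 1:]
}
for (a, b), r in corr.items():
    print(f"{a:>8} vs {b:<8} r = {r:+.3f}")
```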
Section B : Perform Multiple Linear Regression
B.1 : ENTER REGRESSION METHOD (Include ALL Predictor Variables)
Now enter all 8 variables as predictors and Sales as the dependent variable for linear regression. The output is shown below. The Model Summary output shows that adjusted R2 is 0.89, indicating that the model explains about 89% of the variability in the data.
The ANOVA Table below shows that the regression model adequately fits the data.
The Coefficients Table below corroborates the correlation observations of Section A. Variable Accounts is the most strongly correlated with the other predictors, with VIF equal to 5.637.
Variable Share is also highly correlated with the other predictors, with VIF equal to 3.395. Similarly, variable Time is highly correlated, with VIF equal to 3.356.
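The VIF of a predictor is 1 / (1 - R^2), where R^2 comes from regressing that predictor on all the other predictors. A small sketch that inverts this relationship for the three VIF values quoted above. The function name is illustrative, and the implied R^2 values are derived here rather than read from the SPSS output.

```python
def r_squared_from_vif(vif):
    """Invert VIF = 1 / (1 - R^2) to get the R^2 of a predictor on the others."""
    return 1.0 - 1.0 / vif

# VIF values as quoted in the text.
for name, vif in {"Accounts": 5.637, "Share": 3.395, "Time": 3.356}.items():
    print(f"{name}: VIF = {vif}, implied R^2 = {r_squared_from_vif(vif):.3f}")
```

For Accounts this gives an implied R^2 of roughly 0.82, meaning about 82% of its variance is explained by the other seven predictors.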
Variables to be excluded are Time, Change, Accounts, Work and Rating, since the p-values of these variables exceed 0.05. So the variables that pass the significance test and are to be retained in the regression model are:
Poten, AdvExp, Share
To ascertain the reliability of this conclusion, we continue with other methods in multiple regression.
B.2: Stepwise Regression method.
We apply the stepwise method as shown below.
Now click on the Options tab as shown below.
The Options tab invokes the Stepping Method Criteria as shown.
A variable is entered into the solution if its p-value is less than 0.05 and is removed from the
solution if its p-value is greater than 0.10. In stepwise regression, the order in which variables are
selected into the solution is shown in the Coefficients Table.
The selected variables are,
Accounts, AdvExp, Poten, Share
Each selected variable has a p-value less than 0.05. As shown in the ANOVA Table, the four variables are
selected in four steps. Since the p-value of the overall model is less than 0.05, the model adequately fits the data.
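The entry and removal rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not a reproduction of the SPSS procedure: it assumes NumPy and SciPy are available, uses the two-sided t-test p-value of each coefficient, and runs on synthetic data rather than the sales data. The names `coef_p_values`, `stepwise` and `allow_removal` are hypothetical.

```python
import numpy as np
from scipy import stats

def coef_p_values(X, y):
    """Fit OLS with an intercept; return two-sided t-test p-values, one per column of X."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])           # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    df = n - A.shape[1]                            # residual degrees of freedom
    sigma2 = resid @ resid / df                    # residual variance estimate
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))
    p = 2.0 * stats.t.sf(np.abs(beta / se), df)
    return p[1:]                                   # skip the intercept

def stepwise(X, y, p_enter=0.05, p_remove=0.10, allow_removal=True):
    """Return the selected column indices in order of entry."""
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(2 * X.shape[1]):                # cap on steps guards against cycling
        if not remaining:
            break
        # Entry: p-value of each candidate when added to the current model.
        cand = {j: coef_p_values(X[:, selected + [j]], y)[-1] for j in remaining}
        best = min(cand, key=cand.get)
        if cand[best] >= p_enter:
            break
        selected.append(best)
        remaining.remove(best)
        # Removal: drop entered variables whose p-value has risen above p_remove.
        while allow_removal and len(selected) > 1:
            p = coef_p_values(X[:, selected], y)
            worst = int(np.argmax(p))
            if p[worst] <= p_remove:
                break
            remaining.append(selected.pop(worst))
    return selected

# Synthetic data (NOT the sales data): only columns 0 and 2 drive y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 4.0 + 3.0 * X[:, 0] + 2.0 * X[:, 2] + rng.normal(size=200)

chosen = stepwise(X, y)
forward_only = stepwise(X, y, allow_removal=False)
print("stepwise:", chosen)
print("forward :", forward_only)
```

With `allow_removal=False` the removal step is skipped, so a variable stays in the model once entered, which is the forward selection rule.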
The strength of the model is reported in the Model Summary Table shown below.
Note that the final model is obtained at the fourth step and explains 88.1% of the variability in the data. It is
interesting to observe that Adjusted R2 increases while the Standard Error of the Estimate decreases at each
step.
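The two quantities move in opposite directions by construction. A short derivation, using standard notation that does not appear in the original: n observations, k predictors, SSE the residual sum of squares, SST the total sum of squares, and s the standard error of the estimate.

```latex
\[
R^2_{\text{adj}} = 1 - \frac{SSE/(n-k-1)}{SST/(n-1)},
\qquad
s = \sqrt{\frac{SSE}{n-k-1}}
\quad\Longrightarrow\quad
s^2 = \left(1 - R^2_{\text{adj}}\right)\frac{SST}{n-1}.
\]
```

Since SST and n are the same at every step, s falls exactly when adjusted R2 rises.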
B.3: Backward Regression method.
Invoke the Backward regression as shown below.
Initially all the variables are entered in the solution. Then variables Work, Rating and Accounts are
removed at subsequent stages, since their respective p-values are greater than 0.10, as shown below.
As seen in the Coefficients Table below, all eight predictor variables are entered in the initial solution to find
their respective p-values.
Since variable Work has the highest p-value, 0.487, it is excluded from the solution. The solution for the
remaining seven variables (excluding Work) is shown below. Since variable Rating has the highest p-value,
0.440, it is excluded from the solution.
The regression model is then rebuilt with the remaining six variables. The output is shown
below. Variable Accounts has the highest p-value, 0.224, so it is excluded from the subsequent solution.
Regression model building is now carried out with five variables to find their respective p-values. The
Coefficients output is shown below.
Since all five remaining variables have p-values less than 0.10, no further exclusion is
carried out.
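The elimination loop walked through above can be sketched as follows. This is an illustration only: it assumes NumPy and SciPy, uses coefficient t-test p-values, and runs on synthetic data with generic column names, not on the sales data. The function names are hypothetical.

```python
import numpy as np
from scipy import stats

def coef_p_values(X, y):
    """Fit OLS with an intercept; return two-sided t-test p-values, one per column of X."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])           # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    df = n - A.shape[1]                            # residual degrees of freedom
    sigma2 = resid @ resid / df                    # residual variance estimate
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))
    p = 2.0 * stats.t.sf(np.abs(beta / se), df)
    return p[1:]                                   # skip the intercept

def backward_eliminate(X, y, names, p_remove=0.10):
    """Drop the predictor with the largest p-value until none exceeds p_remove."""
    kept = list(range(X.shape[1]))
    removed = []                                   # (name, p-value at removal), in order
    while kept:
        p = coef_p_values(X[:, kept], y)
        worst = int(np.argmax(p))
        if p[worst] <= p_remove:
            break
        removed.append((names[kept[worst]], float(p[worst])))
        del kept[worst]
    return [names[j] for j in kept], removed

# Synthetic data (NOT the sales data); names are generic placeholders.
names = [f"x{j}" for j in range(8)]
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 8))
y = 1.0 + 2.5 * X[:, 1] + 2.0 * X[:, 2] + 1.5 * X[:, 3] + rng.normal(size=200)

kept, removed = backward_eliminate(X, y, names)
print("kept   :", kept)
print("removed:", removed)
```

Each pass refits the model on the variables still kept, so the p-values used for the next removal always come from the reduced model.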
ANOVA Tables for the four stages of model development are shown below.
Note that the value of F increases with each stage. The five variables in Model 4 have an overall p-value reported as 0.000,
supporting the use of the model for predictive analysis.
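F tends to rise as weak predictors are dropped because the overall F statistic can be written as (R^2 / k) / ((1 - R^2) / (n - k - 1)), where k is the number of predictors and n the number of observations. Removing a weak predictor barely lowers R^2 but reduces k. A small sketch with hypothetical values of R^2 and n, not taken from the SPSS output:

```python
def overall_f(r_squared, n_obs, n_predictors):
    """Overall F statistic of a regression with an intercept, computed from R^2."""
    df_model = n_predictors
    df_resid = n_obs - n_predictors - 1
    return (r_squared / df_model) / ((1.0 - r_squared) / df_resid)

# Hypothetical values: dropping three weak predictors lowers R^2 only slightly.
f_full = overall_f(r_squared=0.920, n_obs=30, n_predictors=8)
f_reduced = overall_f(r_squared=0.915, n_obs=30, n_predictors=5)
print(f"F with 8 predictors: {f_full:.2f}")
print(f"F with 5 predictors: {f_reduced:.2f}")
```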
To find the strength of the model, we look at the MODEL SUMMARY Table.
The five variables in Model 4 give an Adjusted R Square value of 0.893, indicating that the model explains a very
large share of the variability in the data.
B.4: Forward Regression method.
The forward selection procedure starts with no independent variables. It adds variables one at a time
using the same procedure as stepwise regression for determining whether an independent variable
should be entered into the model. However, the forward selection procedure does not permit a
variable to be removed from the model once it has been entered. The procedure stops if the p-value for
each of the independent variables not in the model is greater than the alpha to enter (0.05 here).
We apply the Forward Regression method by invoking the option Forward as shown below.
We try all variables one-by-one as predictor variables with Sales as the dependent variable. Variable
Accounts has the smallest p-value of the eight predictor variables, so it is entered as the first variable in the
solution, as shown below.
Now there are seven variables remaining. Which one to choose? Pair each of the remaining variables
in turn with variable Accounts and find its p-value, as shown below.
Note that variable AdvExp, when paired with variable Accounts, yields the lowest p-value. So it is chosen to join the
selected variables group, as seen in the Coefficients Table.
Now we have two variables (Accounts, AdvExp) in the selected variable list. Adding the remaining six
variables one-by-one and finding the p-value of each, we have the following Table.
As seen, variable Poten has a p-value of 0.019 when combined with the two selected variables (Accounts, AdvExp). So
we add variable Poten to the selected variables list, as seen in the following Coefficients
Table.
Next, we combine the remaining five variables one-by-one with the three already selected variables
(Accounts, AdvExp, Poten) to select the next variable with the lowest p-value (less than .05). The results are
shown in the Table below.
Variable Share has a p-value of .001, so it joins the three selected variables, as seen in the following
Coefficients Table.
Next we combine the remaining four variables one-by-one with the four selected variables to find the p-values
of these remaining variables. The p-values of these four variables are shown in the following Table.
None of the remaining four variables has a p-value less than 0.05, so no further variable is
added to the list of selected variables.
The ANOVA Table for the four models is shown below, confirming that Model 4 has the highest F value.
The MODEL SUMMARY Table shows that Model 4 has the lowest standard error and the highest adjusted R2, 0.881.
SUMMARY OF FOUR REGRESSION METHODS
Here we compare the four regression methods and list which of the eight variables each method
recommends including.
Situation A:
Analysis of Regression Model including Four Variables Suggested by Backward Regression
Situation B:
Regression Model Including Three Variables as Recommended by the ENTER Method
Situation C:
Regression Model Including Four Variables as Suggested by
(i) Stepwise Method
(ii) Forward Method
Comparison of Three Situations
Situation C has the best parameters of the three situations. The model proposed by this situation has the
largest adjusted R2, 0.881, and the maximum F value, 45.226. However, its standard error of the estimate is
greater than that obtained in Situation B.
So the recommended model, according to Situation C, is:
Estimate of Sales = -1441.932 + 0.038 (Poten) + 0.175 (AdvExp) + 190.144 (Share) + 9.214 (Accounts)
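A minimal sketch of applying the recommended equation to one territory. The coefficients are those quoted above. The input values and the function name `estimate_sales` are hypothetical and chosen only to show the arithmetic.

```python
# Coefficients of the recommended model, as quoted in the text.
INTERCEPT = -1441.932
COEFFICIENTS = {"Poten": 0.038, "AdvExp": 0.175, "Share": 190.144, "Accounts": 9.214}

def estimate_sales(territory):
    """Estimated Sales for one territory given its four predictor values."""
    return INTERCEPT + sum(COEFFICIENTS[name] * territory[name] for name in COEFFICIENTS)

# Hypothetical territory, for illustration only.
example = {"Poten": 50000.0, "AdvExp": 5000.0, "Share": 8.0, "Accounts": 100.0}
print(f"Estimated Sales: {estimate_sales(example):.2f}")   # about 3775.62
```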