Date post: | 14-Apr-2018 |
Upload: | amin-haleeb |
7/30/2019 Chapter 14, Multiple Regression Using Dummy Variables
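Ch. 14: Multiple Regression Model Building
Idea: Examine the linear relationship between one dependent variable (Y) and two or more independent variables (Xi)
Multiple Regression Model with k Independent Variables:
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$
where $\beta_0$ is the Y-intercept, $\beta_1, \ldots, \beta_k$ are the population slopes, and $\varepsilon_i$ is the random error.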
The coefficients of the multiple regression model are estimated using sample data with k independent variables:
$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \cdots + b_k X_{ki}$
where $\hat{Y}_i$ is the estimated (or predicted) value of Y, $b_0$ is the estimated intercept, and $b_1, \ldots, b_k$ are the estimated slope coefficients.
Interpretation of the Slopes (each is referred to as a Net Regression Coefficient):
b1 = The change in the mean of Y per unit change in X1, taking into account the effect of X2 (or net of X2)
b0 = The Y-intercept. It is interpreted the same way as in simple regression.
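To make the estimation step concrete, here is a minimal sketch of how b0, b1, ..., bk can be obtained by least squares. It assumes the usual normal-equations approach; the function name fit_ols and the data are made up for illustration and are not from the slides. In practice a statistics package would do this step.

```python
# Minimal least-squares sketch: solve the normal equations (X'X) b = X'y
# with Gauss-Jordan elimination. fit_ols and the data are hypothetical.

def fit_ols(xs, y):
    """Return [b0, b1, ..., bk] for rows xs (k values each) and responses y."""
    rows = [[1.0] + list(r) for r in xs]          # prepend the intercept column
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(p)]
    for col in range(p):
        # Partial pivoting: bring the largest remaining entry onto the diagonal
        pivot = max(range(col, p), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        piv = a[col][col]
        a[col] = [v / piv for v in a[col]]
        for idx in range(p):
            if idx != col:
                f = a[idx][col]
                a[idx] = [vr - f * vc for vr, vc in zip(a[idx], a[col])]
    return [a[i][p] for i in range(p)]

# Made-up data generated exactly from Y = 1 + 2*X1 + 3*X2 (no error term)
xs = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in xs]
b0, b1, b2 = fit_ols(xs, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Because the made-up data contain no random error, the fitted coefficients recover the generating values; with real data they would only estimate the population slopes.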
Graph of a Two-Variable Model
[Figure: three-dimensional plot with axes Y, X1, and X2 showing the fitted plane $\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$]
Example: Simple Regression Results

                   Coefficients    Standard Error   t Stat
Intercept (b0)     165.0333581     16.50316094      10.000106
Lotsize (b1)       6.931792143     2.203156234      3.1463008

F-Value            9.89
Adjusted R Square  0.108
Standard Error     36.34

Multiple Regression Results

                   Coefficients    Standard Error   t Stat
Intercept          59.32299284     20.20765695      2.935669
Lotsize            3.580936283     1.794731507      1.995249
Rooms              18.25064446     2.681400117      6.806386

F-Value            31.23
Adjusted R Square  0.453
Standard Error     28.47

Check the size and significance level of the coefficients, the F-value, the R-Square, etc. You will see what the net effects are.
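Using the Equation to Make Predictions
Predict the appraised value at average lot size (7.24) and average number of rooms (7.12):
App. Val. = 59.32 + 3.58(7.24) + 18.25(7.12) = 215.18, or $215,180
What is the total effect of a 2000 sf increase in lot size and 2 additional rooms?
Increase in app. value = (3.58)(2) + (18.25)(2) = 43.66, or $43,660
(The arithmetic implies that lot size is measured in thousands of square feet, so 2000 sf = 2 units, and that appraised value is in thousands of dollars.)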
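The two calculations above can be checked with a short sketch. It uses the rounded coefficients from the multiple regression results and assumes lot size is in thousands of square feet and appraised value in thousands of dollars; the function names are made up for illustration.

```python
# Rounded coefficients from the multiple regression results
B0, B_LOT, B_ROOMS = 59.32, 3.58, 18.25

def predict_value(lot_size, rooms):
    """Predicted appraised value in $ thousands (lot_size in thousands of sq ft, assumed)."""
    return B0 + B_LOT * lot_size + B_ROOMS * rooms

def value_change(d_lot, d_rooms):
    """Change in predicted value for given changes in lot size and rooms."""
    return B_LOT * d_lot + B_ROOMS * d_rooms

at_means = predict_value(7.24, 7.12)   # average lot size and rooms
increase = value_change(2, 2)          # +2000 sf (2 units) and +2 rooms
print(f"{at_means:.2f}")   # 215.18
print(f"{increase:.2f}")   # 43.66
```

The intercept drops out of the change calculation because it is common to both predictions.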
Coefficient of Multiple Determination, r2, and Adjusted r2
Reports the proportion of total variation in Y explained by all X variables taken together (the model):
$r^2_{Y.12..k} = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}}$
Adjusted r2
r2 never decreases when a new X variable is added to the model
This can be a disadvantage when comparing models
What is the net effect of adding a new variable?
We lose a degree of freedom when a new X variable is added
Did the new X variable add enough explanatory power to offset the loss of one degree of freedom?
Adjusted r2 shows the proportion of variation in Y explained by all X variables, adjusted for the number of X variables used:
$r^2_{adj} = 1 - \left(1 - r^2_{Y.12..k}\right) \frac{n - 1}{n - k - 1}$
(where n = sample size, k = number of independent variables)
Penalizes excessive use of unimportant independent variables
Smaller than r2
Useful in comparing among models
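As a rough illustration of the penalty, the sketch below evaluates the adjusted r2 formula for two models with the same r2 but different numbers of X variables. The input values are made up, not taken from the example data.

```python
def adjusted_r2(r2, n, k):
    """Adjusted r2 for sample size n and k independent variables."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical case: same r2, one extra X variable that adds nothing
two_vars = adjusted_r2(0.50, n=21, k=2)
three_vars = adjusted_r2(0.50, n=21, k=3)
print(round(two_vars, 4), round(three_vars, 4))
```

With r2 unchanged, the model with the extra variable gets the lower adjusted r2, which is why the adjusted measure is preferred when comparing models.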
Multiple Regression Assumptions
Assumptions:
The errors are normally distributed
Errors have a constant variance
The model errors are independent
Errors (residuals) from the regression model: $e_i = Y_i - \hat{Y}_i$
These residual plots are used in multiple regression:
Residuals vs. $\hat{Y}_i$
Residuals vs. X1i
Residuals vs. X2i
Residuals vs. time (if time series data)
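A small sketch of how the residuals and the point sets for these plots could be assembled. The coefficients and observations are made up, and the plotting itself is left to whatever tool is at hand.

```python
# Hypothetical fitted two-variable model and made-up observations
b0, b1, b2 = 2.0, 1.5, -0.5
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [0.0, 1.0, 0.0, 1.0]
y = [3.8, 4.3, 6.4, 7.6]

# Fitted values and residuals e_i = Y_i - Y_hat_i
y_hat = [b0 + b1 * u + b2 * v for u, v in zip(x1, x2)]
resid = [yi - fi for yi, fi in zip(y, y_hat)]

# One list of (horizontal value, residual) points per diagnostic plot
plots = {
    "Residuals vs. Y_hat": list(zip(y_hat, resid)),
    "Residuals vs. X1": list(zip(x1, resid)),
    "Residuals vs. X2": list(zip(x2, resid)),
}
for title, points in plots.items():
    print(title, [(h, round(e, 2)) for h, e in points])
```

A pattern in any of these point sets (a funnel shape, a curve, a trend over time) would suggest that one of the assumptions is violated.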
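Two variable model
[Figure: three-dimensional plot with axes Y, X1, and X2 showing the fitted plane $\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$, an observed value $Y_i$, and its fitted value $\hat{Y}_i$]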
Are Individual Variables Significant?
Use t-tests of individual variable slopes
Shows if there is a linear relationship between the variable Xi and Y
Hypotheses:
H0: $\beta_i = 0$ (no linear relationship)
H1: $\beta_i \neq 0$ (linear relationship does exist between Xi and Y)
Test Statistic:
$t_{n-k-1} = \frac{b_i - 0}{S_{b_i}}$
Confidence interval for the population slope $\beta_i$:
$b_i \pm t_{n-k-1} S_{b_i}$
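The test statistic can be checked against the multiple regression output shown earlier. The sketch below recomputes the t Stat column from the reported coefficients and standard errors. The critical value passed to the confidence interval is a placeholder assumption, because the sample size is not given in the slides; the real value comes from a t table with n - k - 1 degrees of freedom.

```python
def t_stat(b, se):
    """t statistic for H0: beta_i = 0."""
    return (b - 0) / se

def slope_ci(b, se, t_crit):
    """Confidence interval b +/- t_crit * se for the population slope."""
    half_width = t_crit * se
    return b - half_width, b + half_width

# Coefficients and standard errors from the multiple regression results
t_lotsize = t_stat(3.580936283, 1.794731507)
t_rooms = t_stat(18.25064446, 2.681400117)
print(round(t_lotsize, 4), round(t_rooms, 4))

# t_crit = 2.0 is a placeholder assumption, not a value from the slides
low, high = slope_ci(18.25064446, 2.681400117, t_crit=2.0)
print(round(low, 2), round(high, 2))
```

If the interval for a slope excludes zero, the two-tailed t-test at the matching significance level rejects H0 for that variable.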
Is the Overall Model Significant?
F-Test for Overall Significance of the Model
Shows if there is a linear relationship between all of the X variables considered together and Y
Use F test statistic
Hypotheses:
H0: $\beta_1 = \beta_2 = \cdots = \beta_k = 0$ (no linear relationship)
H1: at least one $\beta_i \neq 0$ (at least one independent variable affects Y)
Test statistic:
$F = \frac{MSR}{MSE} = \frac{SSR / k}{SSE / (n - k - 1)}$
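A minimal sketch of the overall F statistic as defined above. The sums of squares, n, and k below are made-up inputs, since the slides report only the resulting F-values.

```python
def overall_f(ssr, sse, n, k):
    """F = MSR / MSE with k and n - k - 1 degrees of freedom."""
    msr = ssr / k
    mse = sse / (n - k - 1)
    return msr / mse

# Hypothetical sums of squares for a model with k = 2 and n = 13
f_value = overall_f(ssr=100.0, sse=50.0, n=13, k=2)
print(f_value)   # 10.0
```

The computed value is compared with the critical F for k and n - k - 1 degrees of freedom; a larger computed value leads to rejecting H0.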
Testing Portions of the Multiple Regression Model
To find out whether including an individual Xj, or a set of Xs, significantly improves the model, given that the other independent variables are included in the model
Two Measures:
1. Partial F-test criterion
2. The Coefficient of Partial Determination
Contribution of a Single Independent Variable Xj
SSR(Xj | all variables except Xj) = SSR(all variables) - SSR(all variables except Xj)
Measures the contribution of Xj in explaining the total variation in Y (SST)
Consider here a 3-variable model:
SSR(X1 | X2 and X3) = SSR(all variables X1-X3) - SSR(X2 and X3)
Here SSR(all variables X1-X3) is SSR_UR, from the unrestricted (UR) model, and SSR(X2 and X3) is SSR_R, from the restricted (R) model.
The Partial F-Test Statistic
Consider the hypothesis test:
H0: variable Xj does not significantly improve the model after all other variables are included
H1: variable Xj significantly improves the model after all other variables are included
$F = \frac{(SSR_{UR} - SSR_R) / (\text{df} = \text{number of restrictions})}{MSE_{UR}}$, where $MSE_{UR} = SSE_{UR} / (n - k - 1)$
Note that the numerator is the contribution of Xj to the regression.
If the actual F statistic is greater than the critical F, then the conclusion is: Reject H0; adding Xj does improve the model
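The partial F statistic can be sketched as a small function. The argument q stands for the number of restrictions (1 when testing a single Xj), and all numeric inputs below are made up for illustration.

```python
def partial_f(ssr_ur, ssr_r, sse_ur, n, k, q=1):
    """Partial F for dropping q variables from a model with k variables.

    ssr_ur, sse_ur: regression and error sums of squares of the unrestricted model
    ssr_r:          regression sum of squares of the restricted model
    """
    mse_ur = sse_ur / (n - k - 1)
    return ((ssr_ur - ssr_r) / q) / mse_ur

# Hypothetical 3-variable model, testing the contribution of one variable
f_partial = partial_f(ssr_ur=120.0, ssr_r=100.0, sse_ur=80.0, n=24, k=3, q=1)
print(f_partial)   # 5.0
```

The result is compared with the critical F for q and n - k - 1 degrees of freedom.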
Coefficient of Partial Determination for one or a set of variables
Measures the proportion of the variation in the dependent variable left unexplained by the other explanatory variables (SST - SSR_R) that is explained by Xj, that is, while controlling for (holding constant) the other explanatory variables
$r^2_{Yj.(\text{all variables except } j)} = \frac{SSR_{UR} - SSR_R}{SST - SSR_R}$
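A short sketch of the coefficient of partial determination, using made-up sums of squares. UR and R denote the models with and without Xj.

```python
def partial_r2(ssr_ur, ssr_r, sst):
    """Share of the variation left unexplained by the other Xs that Xj explains."""
    return (ssr_ur - ssr_r) / (sst - ssr_r)

# Hypothetical values: SST = 200, SSR with Xj = 120, SSR without Xj = 100
r2_partial = partial_r2(ssr_ur=120.0, ssr_r=100.0, sst=200.0)
print(r2_partial)   # 0.2
```

In this made-up case Xj explains 20% of the variation that the other variables leave unexplained, even though it adds only 10% of SST to the regression sum of squares.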
Using Dummy Variables
A dummy variable is a categorical explanatory variable with two levels:
yes or no, on or off, male or female
coded as 0 or 1
Regression intercepts are different if the variable is significant
Assumes equal slopes for other variables
If more than two levels, the number of dummy variables needed is (number of levels - 1)
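The (number of levels - 1) rule can be illustrated with a small coding helper. The function name dummy_code and the house-style categories are made up; the level chosen as baseline gets all zeros.

```python
def dummy_code(values, baseline):
    """Code a categorical variable as (levels - 1) columns of 0/1 dummies."""
    if baseline not in values:
        raise ValueError("baseline must be one of the observed levels")
    levels = sorted(set(values) - {baseline})
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows

# Hypothetical three-level variable: two dummy columns are needed
styles = ["ranch", "colonial", "split", "ranch"]
levels, rows = dummy_code(styles, baseline="ranch")
print(levels)   # ['colonial', 'split']
print(rows)     # [[0, 0], [1, 0], [0, 1], [0, 0]]
```

Leaving out one level avoids a column that is an exact linear combination of the intercept and the other dummies.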
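Different Intercepts, Same Slope
Fire Place: $\hat{Y} = b_0 + b_1 X_1 + b_2(1) = (b_0 + b_2) + b_1 X_1$
No Fire Place: $\hat{Y} = b_0 + b_1 X_1 + b_2(0) = b_0 + b_1 X_1$
[Figure: two parallel lines against X1 with Y (sales) on the vertical axis; the Fire Place line has intercept b0 + b2 and the No Fire Place line has intercept b0]
If H0: $\beta_2 = 0$ is rejected, then Fire Place has a significant effect on Values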
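A brief sketch of the parallel-lines idea: with a 0/1 dummy and no interaction term, the gap between the two groups equals b2 at every value of X1. The coefficients are made up for illustration.

```python
# Hypothetical estimates: intercept, slope on X1, and dummy coefficient
b0, b1, b2 = 50.0, 10.0, 5.0

def predict(x1, fireplace):
    """Predicted Y; fireplace is the 0/1 dummy variable."""
    return b0 + b1 * x1 + b2 * fireplace

# The gap between the two lines is b2 at any X1, so the lines are parallel
gaps = [predict(x1, 1) - predict(x1, 0) for x1 in (0.0, 3.0, 8.0)]
print(gaps)   # [5.0, 5.0, 5.0]
```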
Interaction Between Explanatory Variables
Hypothesizes interaction between pairs of X variables
Response to one X variable may vary at different levels of another X variable
Contains two-way cross product terms:
$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 = b_0 + b_1 X_1 + b_2 X_2 + b_3 (X_1 X_2)$
Effect of Interaction
Without interaction term, effect of X1 on Y is measured by $\beta_1$
With interaction term, effect of X1 on Y is measured by $\beta_1 + \beta_3 X_2$
Effect changes as X2 changes
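The statement that the effect of X1 becomes b1 + b3 X2 can be verified numerically. The sketch below uses made-up coefficients and compares the change in the prediction for a one-unit increase in X1 with that expression.

```python
# Hypothetical estimates for Y_hat = b0 + b1*X1 + b2*X2 + b3*(X1*X2)
b0, b1, b2, b3 = 10.0, 2.0, 5.0, 0.5

def predict(x1, x2):
    return b0 + b1 * x1 + b2 * x2 + b3 * (x1 * x2)

def effect_of_x1(x2):
    """Change in predicted Y per unit change in X1, at a given X2."""
    return b1 + b3 * x2

for x2 in (0.0, 3.0, 6.0):
    one_unit_change = predict(4.0, x2) - predict(3.0, x2)
    print(x2, one_unit_change, effect_of_x1(x2))
```

Each printed pair matches, and the effect grows with X2 because b3 is positive in this made-up case.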
Example: Suppose X2 is a dummy variable and the estimated regression equation is
$\hat{Y} = 1 + 2X_1 + 3X_2 + 4X_1X_2$
[Figure: $\hat{Y}$ plotted against X1, with axis ticks at 0, 0.5, 1, and 1.5]
Slopes are different if the effect of X1 on Y depends on the value of X2
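Substituting the two possible values of the dummy variable into the estimated equation makes the difference in slopes explicit. This is a worked step added for illustration, using only the equation given in the example.

```latex
X_2 = 0:\quad \hat{Y} = 1 + 2X_1 + 3(0) + 4X_1(0) = 1 + 2X_1
X_2 = 1:\quad \hat{Y} = 1 + 2X_1 + 3(1) + 4X_1(1) = 4 + 6X_1
```

Both the intercept (1 versus 4) and the slope on X1 (2 versus 6) change with X2, so the two lines are not parallel.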