Statistics for Managers Using Microsoft Excel, 2/e © 1999 Prentice-Hall, Inc.
Chapter 13 Student Lecture Notes 13-1
© 2004 Prentice-Hall, Inc. Chap 13-1
Basic Business Statistics (9th Edition)
Chapter 13: Simple Linear Regression
Chapter Topics
• Types of Regression Models
• Determining the Simple Linear Regression Equation
• Measures of Variation
• Assumptions of Regression and Correlation
• Residual Analysis
• Measuring Autocorrelation
• Inferences about the Slope
Chapter Topics (continued)
• Correlation: Measuring the Strength of the Association
• Estimation of Mean Values and Prediction of Individual Values
• Pitfalls in Regression and Ethical Issues
Purpose of Regression Analysis
Regression Analysis is Used Primarily to Model Causality and Provide Prediction
• Predict the values of a dependent (response) variable based on values of at least one independent (explanatory) variable
• Explain the effect of the independent variables on the dependent variable
Types of Regression Models
[Figure: four scatter diagrams showing a positive linear relationship, a negative linear relationship, a relationship that is not linear, and no relationship.]
Simple Linear Regression Model
• Relationship between Variables is Described by a Linear Function
• The Change of One Variable Causes the Other Variable to Change
• A Dependency of One Variable on the Other
Simple Linear Regression Model (continued)
The population regression line is a straight line that describes the dependence of the average value (conditional mean) of one variable on the other.
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
• $Y_i$: dependent (response) variable
• $X_i$: independent (explanatory) variable
• $\beta_0$: population Y intercept
• $\beta_1$: population slope coefficient
• $\varepsilon_i$: random error
• $\mu_{Y|X} = \beta_0 + \beta_1 X_i$: population regression line (conditional mean)
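Simple Linear Regression Model (continued)
[Figure: the population regression line $\mu_{Y|X} = \beta_0 + \beta_1 X_i$ (conditional mean) plotted against X, with intercept $\beta_0$ and slope $\beta_1$. Each observed value of Y equals the conditional mean plus the random error $\varepsilon_i$: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$.]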
Linear Regression Equation
The sample regression line provides an estimate of the population regression line as well as a predicted value of Y.
$$Y_i = b_0 + b_1 X_i + e_i$$
• $b_0$: sample Y intercept
• $b_1$: sample slope coefficient
• $e_i$: residual
$$\hat{Y} = b_0 + b_1 X$$
is the simple regression equation (fitted regression line, predicted value).
Linear Regression Equation (continued)
$b_0$ and $b_1$ are obtained by finding the values of $b_0$ and $b_1$ that minimize the sum of the squared residuals:
$$\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} e_i^2$$
• $b_0$ provides an estimate of $\beta_0$
• $b_1$ provides an estimate of $\beta_1$
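The slide states the minimization but not its solution. As a sketch of the standard least-squares result (the notation $S(b_0, b_1)$ for the sum of squared residuals is introduced here and is not from the slides), setting both partial derivatives to zero gives closed-form estimates:

```latex
\begin{aligned}
S(b_0, b_1) &= \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2 \\
\frac{\partial S}{\partial b_0} = 0 \;&\Rightarrow\; b_0 = \bar{Y} - b_1 \bar{X} \\
\frac{\partial S}{\partial b_1} = 0 \;&\Rightarrow\; b_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}
\end{aligned}
```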
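Linear Regression Equation (continued)
[Figure: observed values of Y plotted against X, showing the population regression line $\mu_{Y|X} = \beta_0 + \beta_1 X_i$ (intercept $\beta_0$, slope $\beta_1$, random error $\varepsilon_i$ from $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$) and the sample regression line $\hat{Y}_i = b_0 + b_1 X_i$ (intercept $b_0$, slope $b_1$, residual $e_i$ from $Y_i = b_0 + b_1 X_i + e_i$).]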
Interpretation of the Slope and Intercept
• $\beta_0 = \mu_{Y|X=0}$ is the average value of Y when the value of X is zero
• $\beta_1 = \dfrac{\text{change in } \mu_{Y|X}}{\text{change in } X}$ measures the change in the average value of Y as a result of a one-unit change in X
Interpretation of the Slope and Intercept (continued)
• $b_0 = \hat{Y}\,(X = 0)$ is the estimated average value of Y when the value of X is zero
• $b_1 = \dfrac{\text{change in } \hat{Y}}{\text{change in } X}$ is the estimated change in the average value of Y as a result of a one-unit change in X
Simple Linear Regression: Example
You wish to examine the linear dependency of the annual sales of produce stores on their sizes in square footage. Sample data for 7 stores were obtained. Find the equation of the straight line that fits the data best.
Store   Square Feet   Annual Sales ($1000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760
Scatter Diagram: Example
[Figure: Excel output, scatter diagram of Annual Sales ($000) against Square Feet for the 7 stores.]
Simple Linear Regression Equation: Example
$$\hat{Y}_i = b_0 + b_1 X_i = 1636.415 + 1.487 X_i$$
From Excel Printout:
             Coefficients
Intercept    1636.414726
X Variable   1.486633657
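The Excel coefficients can be reproduced by hand. The following is a minimal sketch in plain Python, assuming the seven-store data above; the function name `fit_line` is illustrative and is not part of Excel or PHStat.

```python
# Produce-store data from the example (annual sales are in $1000).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]


def fit_line(x, y):
    """Return (b0, b1), the intercept and slope that minimize the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares of X about the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_line(square_feet, annual_sales)
print(f"b0 = {b0:.6f}, b1 = {b1:.9f}")
```

Rounded to three decimals, the result matches the fitted equation with intercept 1636.415 and slope 1.487.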
Graph of the Simple Linear Regression Equation: Example
[Figure: scatter diagram of Annual Sales ($000) against Square Feet with the fitted line $\hat{Y}_i = 1636.415 + 1.487 X_i$.]
Interpretation of Results: Example
The slope of 1.487 means that for each increase of one unit in X, we predict the average of Y to increase by an estimated 1.487 units.
The equation estimates that for each increase of 1 square foot in the size of the store, the expected annual sales are predicted to increase by $1,487 (1.487 in the units of the sales variable, which is recorded in $1000).
$$\hat{Y}_i = 1636.415 + 1.487 X_i$$
Simple Linear Regression in PHStat
In Excel, use PHStat | Regression | Simple Linear Regression …
[Embedded Microsoft Excel worksheet: Excel Spreadsheet of Regression Sales on Footage]
Measures of Variation: The Sum of Squares
SST = SSR + SSE
Total Sample Variability = Explained Variability + Unexplained Variability
Measures of Variation: The Sum of Squares (continued)
• SST = Total Sum of Squares: measures the variation of the $Y_i$ values around their mean, $\bar{Y}$
• SSR = Regression Sum of Squares: explained variation attributable to the relationship between X and Y
• SSE = Error Sum of Squares: variation attributable to factors other than the relationship between X and Y
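Measures of Variation: The Sum of Squares (continued)
$$SST = \sum (Y_i - \bar{Y})^2$$
$$SSE = \sum (Y_i - \hat{Y}_i)^2$$
$$SSR = \sum (\hat{Y}_i - \bar{Y})^2$$
[Figure: Y plotted against X with the fitted line and the horizontal line at $\bar{Y}$; at a given $X_i$ the three deviations above are marked.]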
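The three sums of squares can be computed directly from their definitions. This is a small sketch in plain Python, assuming the produce-store data from the example; the variable names are illustrative.

```python
# Produce-store data from the example (annual sales are in $1000).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(annual_sales) / n

# Least-squares slope and intercept.
sxx = sum((x - x_bar) ** 2 for x in square_feet)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, annual_sales))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Predicted value for every store.
y_hat = [b0 + b1 * x for x in square_feet]

sst = sum((y - y_bar) ** 2 for y in annual_sales)               # total
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)                    # explained
sse = sum((y - yh) ** 2 for y, yh in zip(annual_sales, y_hat))  # unexplained

print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
print(f"SSR + SSE = {ssr + sse:.2f}")
```

The printed values can be compared with the SS column of the Excel ANOVA output for the produce stores.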
Venn Diagrams and Explanatory Power of Regression
[Figure: Venn diagram of two overlapping circles, Sales and Sizes.]
• Overlap (SSR): variations in Sales explained by Sizes, or variations in Sizes used in explaining variation in Sales
• Sales only (SSE): variations in Sales explained by the error term or unexplained by Sizes
• Sizes only: variations in store Sizes not used in explaining variation in Sales
The ANOVA Table in Excel
ANOVA
             df       SS    MS                  F         Significance F
Regression   k        SSR   MSR = SSR/k         MSR/MSE   P-value of the F Test
Error        n-k-1    SSE   MSE = SSE/(n-k-1)
Total        n-1      SST
Measures of Variation: The Sum of Squares: Example
Excel Output for Produce Stores
ANOVA
             df   SS            MS            F            Significance F
Regression   1    30380456.12   30380456.1    81.1790902   0.000281201
Error        5    1871199.595   374239.919
Total        6    32251655.71
• df column (degrees of freedom): regression (explained) df = 1, error (unexplained) df = 5, total df = 6
• SS column: SSR = 30380456.12, SSE = 1871199.595, SST = 32251655.71
The Coefficient of Determination
Measures the proportion of variation in Y that is explained by the independent variable X in the regression model
$$r^2 = \frac{SSR}{SST} = \frac{\text{Regression Sum of Squares}}{\text{Total Sum of Squares}}$$
Venn Diagrams and Explanatory Power of Regression (continued)
[Figure: Venn diagram of two overlapping circles, Sales and Sizes.]
$$r^2 = \frac{SSR}{SSR + SSE}$$
Coefficients of Determination ($r^2$) and Correlation ($r$)
[Figure: four scatter plots of Y against X, each with the fitted line $\hat{Y}_i = b_0 + b_1 X_i$:
• $r^2 = 1$, $r = +1$
• $r^2 = 1$, $r = -1$
• $r^2 = .81$, $r = +0.9$
• $r^2 = 0$, $r = 0$]
Standard Error of Estimate
Measures the standard deviation (variation) of the Y values around the regression equation
$$S_{YX} = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n-2}}$$
Measures of Variation: Produce Store Example
Excel Output for Produce Stores
Regression Statistics
Multiple R          0.9705572
R Square            0.94198129
Adjusted R Square   0.93037754
Standard Error      611.751517   ($S_{YX}$)
Observations        7            ($n$)
$r^2 = .94$: 94% of the variation in annual sales can be explained by the variability in the size of the store as measured by square footage.
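Both statistics follow from the sums of squares. As a brief sketch in plain Python, assuming the SSR and SSE values printed in the Excel ANOVA output for the produce stores (the variable names are illustrative):

```python
import math

# Values taken from the Excel ANOVA output for the produce stores.
ssr = 30380456.12   # regression (explained) sum of squares
sse = 1871199.595   # error (unexplained) sum of squares
n = 7               # number of stores

sst = ssr + sse                    # total sum of squares
r_squared = ssr / sst              # coefficient of determination
s_yx = math.sqrt(sse / (n - 2))    # standard error of the estimate

print(f"r^2 = {r_squared:.8f}")
print(f"S_YX = {s_yx:.6f}")
```

The results can be checked against the R Square and Standard Error rows of the Regression Statistics output.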
Linear Regression Assumptions
• Normality
  • Y values are normally distributed for each X
  • Probability distribution of error is normal
• Homoscedasticity (Constant Variance)
• Independence of Errors
Consequences of Violation of the Assumptions
• Violation of the Assumptions
  • Non-normality (error not normally distributed)
  • Heteroscedasticity (variance not constant): usually happens in cross-sectional data
  • Autocorrelation (errors are not independent): usually happens in time-series data
• Consequences of Any Violation of the Assumptions
  • Predictions and estimations obtained from the sample regression line will not be accurate
  • Hypothesis testing results will not be reliable
• It is Important to Verify the Assumptions
Variation of Errors Around the Regression Line
• Y values are normally distributed around the regression line.
• For each X value, the “spread” or variance around the regression line is the same.
[Figure: error distributions f(e) of equal spread centered on the sample regression line at X1 and X2.]
Residual Analysis
• Purposes
  • Examine linearity
  • Evaluate violations of assumptions
• Graphical Analysis of Residuals
  • Plot residuals vs. X and time
Residual Analysis for Linearity
[Figure: two cases, Not Linear and Linear. Each case shows a plot of Y against X and a plot of the residuals e against X, with a reference line at 0.]
Residual Analysis for Homoscedasticity
[Figure: two cases, Heteroscedasticity and Homoscedasticity. Each case shows a plot of Y against X and a plot of SR against X, with a reference line at 0.]
Residual Analysis: Excel Output for Produce Stores Example
Excel Output
Observation   Predicted Y    Residuals
1             4202.344417    -521.3444173
2             3928.803824    -533.8038245
3             5822.775103     830.2248971
4             9894.664688    -351.6646882
5             3557.14541     -239.1454103
6             4918.90184      644.0981603
7             3588.364717     171.6352829
[Figure: Residual Plot, residuals against Square Feet.]
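The Predicted Y and Residuals columns follow from the fitted equation. A minimal sketch in plain Python, assuming the unrounded Excel coefficients reported earlier in these notes (the variable names are illustrative):

```python
# Produce-store data from the example (annual sales are in $1000).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

# Unrounded coefficients from the Excel printout.
b0 = 1636.414726
b1 = 1.486633657

predicted = [b0 + b1 * x for x in square_feet]
# Residual = observed value minus predicted value.
residuals = [y - yh for y, yh in zip(annual_sales, predicted)]

for obs, (yh, e) in enumerate(zip(predicted, residuals), start=1):
    print(f"{obs}  {yh:12.4f}  {e:10.4f}")
```

Least-squares residuals sum to zero up to rounding, which is a quick sanity check on the table.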
Residual Analysis for Independence
• The Durbin-Watson Statistic
  • Used when data are collected over time to detect autocorrelation (residuals in one time period are related to residuals in another period)
  • Measures violation of independence assumption
$$D = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}$$
D should be close to 2. If not, examine the model for autocorrelation.
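The formula translates directly into code. Below is a minimal sketch in plain Python; the function name `durbin_watson` and both residual series are hypothetical and invented only to show how D behaves, they are not data from these notes.

```python
def durbin_watson(residuals):
    """Durbin-Watson D: squared successive differences divided by squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical time-ordered residuals, for illustration only.
trending = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]      # long runs of the same sign
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]   # sign flips every period

print(durbin_watson(trending))      # well below 2
print(durbin_watson(alternating))   # well above 2
```

Runs of same-signed residuals pull D toward 0, while residuals that flip sign each period push D toward 4.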
Durbin-Watson Statistic in PHStat
• PHStat | Regression | Simple Linear Regression …
• Check the box for Durbin-Watson Statistic
Obtaining the Critical Values of Durbin-Watson Statistic
Table 13.4 Finding Critical Values of Durbin-Watson Statistic (α = .05)
        k=1             k=2
n       dL      dU      dL      dU
15      1.08    1.36    .95     1.54
16      1.10    1.37    .98     1.54
Using the Durbin-Watson Statistic
H0: No autocorrelation (error terms are independent)
H1: There is autocorrelation (error terms are not independent)
Decision regions along the scale from 0 to 4 (centered at 2):
• 0 to dL: Reject H0 (positive autocorrelation)
• dL to dU: Inconclusive
• dU to 4-dU: Do not reject H0 (no autocorrelation)
• 4-dU to 4-dL: Inconclusive
• 4-dL to 4: Reject H0 (negative autocorrelation)
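The decision regions can be written as a small lookup. This sketch in plain Python uses a hypothetical helper name, `durbin_watson_decision`, and assumes that values falling exactly on dL, dU, 4-dU, or 4-dL are treated as inconclusive, a boundary convention the slide does not specify.

```python
def durbin_watson_decision(d, d_lower, d_upper):
    """Map a Durbin-Watson statistic to a decision, given critical values dL and dU."""
    if not 0 <= d <= 4:
        raise ValueError("the Durbin-Watson statistic lies between 0 and 4")
    if d < d_lower:
        return "reject H0 (positive autocorrelation)"
    if d <= d_upper:
        return "inconclusive"
    if d < 4 - d_upper:
        return "do not reject H0 (no autocorrelation)"
    if d <= 4 - d_lower:
        return "inconclusive"
    return "reject H0 (negative autocorrelation)"


# Critical values for n = 15, k = 1, alpha = .05 from Table 13.4.
D_LOWER, D_UPPER = 1.08, 1.36

for d in (0.9, 1.2, 2.0, 2.8, 3.1):
    print(d, "->", durbin_watson_decision(d, D_LOWER, D_UPPER))
```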
Residual Analysis for Independence
Graphical Approach
Residual is Plotted Against Time to Detect Any Autocorrelation
[Figure: two plots of residuals e against Time with a reference line at 0. Not Independent: Cyclical Pattern. Independent: No Particular Pattern.]
Inference about the Slope: t Test
• t Test for a Population Slope
  • Is there a linear relationship between Y and X?
• Null and Alternative Hypotheses
  • H0: β1 = 0 (no linear relationship)
  • H1: β1 ≠ 0 (linear relationship)
• Test Statistic
$$t = \frac{b_1 - \beta_1}{S_{b_1}} \quad \text{where} \quad S_{b_1} = \frac{S_{YX}}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}}, \qquad d.f. = n - 2$$
Example: Produce Store
Data for 7 Stores:
Store   Square Feet   Annual Sales ($000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760
Estimated Regression Equation:
$$\hat{Y}_i = 1636.415 + 1.487 X_i$$
The slope of this model is 1.487.
Are square footage and annual sales linearly related?
Inferences about the Slope: t Test Example
H0: β1 = 0
H1: β1 ≠ 0
α = .05
df = 7 - 2 = 5
Critical Value(s): ±2.5706 (reject H0 if t < -2.5706 or t > 2.5706, with .025 in each tail)
From Excel Printout:
            Coefficients   Standard Error   t Stat   P-value
Intercept   1636.4147      451.4953         3.6244   0.01515
Footage     1.4866         0.1650           9.0099   0.00028
In the Footage row, Coefficients is $b_1$, Standard Error is $S_{b_1}$, and t Stat is $t$.
Test Statistic: t = 9.0099 (p-value = 0.00028)
Decision: Reject H0.
Conclusion: There is evidence that square footage is linearly related to annual sales.
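The standard error of the slope and the t statistic can be recomputed from quantities already reported. A short sketch in plain Python, assuming the Standard Error of the estimate (611.751517), the unrounded slope, and the tabled critical value 2.5706 from these notes; the variable names are illustrative.

```python
import math

square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]

b1 = 1.486633657      # sample slope from the Excel printout
s_yx = 611.751517     # standard error of the estimate from the Excel printout
t_critical = 2.5706   # two-tail critical value for alpha = .05, df = 5

n = len(square_feet)
x_bar = sum(square_feet) / n
sxx = sum((x - x_bar) ** 2 for x in square_feet)

s_b1 = s_yx / math.sqrt(sxx)    # standard error of the slope
t_stat = (b1 - 0) / s_b1        # hypothesized slope under H0 is 0
reject_h0 = abs(t_stat) > t_critical

print(f"S_b1 = {s_b1:.4f}, t = {t_stat:.4f}, reject H0: {reject_h0}")
```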
Inferences about the Slope: Confidence Interval Example
Confidence Interval Estimate of the Slope:
$$b_1 \pm t_{n-2} S_{b_1}$$
Excel Printout for Produce Stores
            Lower 95%     Upper 95%
Intercept   475.810926    2797.01853
Footage     1.06249037    1.91077694
At 95% level of confidence, the confidence interval for the slope is (1.062, 1.911). It does not include 0.
Conclusion: There is a significant linear relationship between annual sales and the size of the store.
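As a quick arithmetic check, the interval can be rebuilt from the rounded printout values. This is a sketch in plain Python that assumes the four-decimal figures shown for Footage in the t test printout, so the last digits differ slightly from Excel's unrounded bounds.

```python
b1 = 1.4866           # slope (Footage coefficient), as printed
s_b1 = 0.1650         # standard error of the slope, as printed
t_critical = 2.5706   # t value with df = n - 2 = 5 for 95% confidence

margin = t_critical * s_b1
lower = b1 - margin
upper = b1 + margin

print(f"95% confidence interval for the slope: ({lower:.3f}, {upper:.3f})")
# The interval excludes 0, in line with rejecting H0: beta1 = 0.
print("contains 0:", lower <= 0 <= upper)
```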
Inferences about the Slope: F Test
• F Test for a Population Slope
  • Is there a linear relationship between Y and X?
• Null and Alternative Hypotheses
  • H0: β1 = 0 (no linear relationship)
  • H1: β1 ≠ 0 (linear relationship)
• Test Statistic
$$F = \frac{SSR / 1}{SSE / (n-2)}$$
  • Numerator d.f. = 1, denominator d.f. = n-2
Relationship between a t Test and an F Test
• Null and Alternative Hypotheses
  • H0: β1 = 0 (no linear relationship)
  • H1: β1 ≠ 0 (linear relationship)
• The p-value of a t Test and the p-value of an F Test are Exactly the Same
• The Rejection Region of an F Test is Always in the Upper Tail
$$(t_{n-2})^2 = F_{1,\,n-2}$$
Inferences about the Slope: F Test Example
H0: β1 = 0
H1: β1 ≠ 0
α = .05
numerator df = 1
denominator df = 7 - 2 = 5
Critical Value: $F_{1,\,n-2} = F_{1,5} = 6.61$ (reject H0 if F > 6.61, with α = .05 in the upper tail)
From Excel Printout:
ANOVA
             df   SS            MS            F        Significance F
Regression   1    30380456.12   30380456.12   81.179   0.000281
Residual     5    1871199.595   374239.919
Total        6    32251655.71
Test Statistic: F = 81.179 (p-value = 0.000281)
Decision: Reject H0.
Conclusion: There is evidence that square footage is linearly related to annual sales.
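The F statistic and its link to the t statistic can be verified numerically. A small sketch in plain Python, assuming the sums of squares from the ANOVA printout and the t statistic 9.0099 from the slope test; agreement is only up to the rounding of the printed values.

```python
ssr = 30380456.12    # regression sum of squares
sse = 1871199.595    # error (residual) sum of squares
n = 7                # number of stores
f_critical = 6.61    # upper-tail critical value of F with 1 and 5 df, alpha = .05
t_stat = 9.0099      # t statistic from the test on the slope

msr = ssr / 1           # numerator df = 1
mse = sse / (n - 2)     # denominator df = n - 2
f_stat = msr / mse

print(f"F = {f_stat:.3f}, t^2 = {t_stat ** 2:.3f}")
print("reject H0:", f_stat > f_critical)
```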
Purpose of Correlation Analysis
Correlation Analysis is Used to Measure Strength of Association (Linear Relationship) Between 2 Numerical Variables
• Only the strength of the relationship is of concern
• No causal effect is implied
Purpose of Correlation Analysis (continued)
Population Correlation Coefficient ρ (Rho) is Used to Measure the Strength of the Association between the Variables
Purpose of Correlation Analysis (continued)
Sample Correlation Coefficient r is an Estimate of ρ and is Used to Measure the Strength of the Linear Relationship in the Sample Observations
$$r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$$
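The formula for r can be applied directly to the produce-store data used throughout these notes. A minimal sketch in plain Python; the function name `sample_correlation` is illustrative.

```python
import math

# Produce-store data from the example (annual sales are in $1000).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]


def sample_correlation(x, y):
    """Sample correlation coefficient r computed from deviations about the means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


r = sample_correlation(square_feet, annual_sales)
print(f"r = {r:.7f}, r^2 = {r ** 2:.8f}")
```

The value corresponds to Multiple R in the Excel Regression Statistics output, and its square to R Square.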
Sample Observations from Various r Values
[Figure: five scatter plots of Y against X illustrating r = -1, r = -.6, r = 0, r = .6, and r = 1.]
Features of ρ and r
• Unit Free
• Range between -1 and 1
• The Closer to -1, the Stronger the Negative Linear Relationship
• The Closer to 1, the Stronger the Positive Linear Relationship
• The Closer to 0, the Weaker the Linear Relationship
t Test for Correlation
• Hypotheses
  • H0: ρ = 0 (no correlation)
  • H1: ρ ≠ 0 (correlation)
• Test Statistic
$$t = \frac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} \quad \text{where} \quad r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$$
Example: Produce Stores
From Excel Printout:
Regression Statistics
Multiple R          0.9705572    ($r$)
R Square            0.94198129
Adjusted R Square   0.93037754
Standard Error      611.751517
Observations        7
Is there any evidence of linear relationship between annual sales of a store and its square footage at .05 level of significance?
H0: ρ = 0 (no association)
H1: ρ ≠ 0 (association)
α = .05
df = 7 - 2 = 5
Example: Produce Stores Solution
$$t = \frac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = \frac{.9706}{\sqrt{\dfrac{1 - .9420}{5}}} = 9.0099$$
Critical Value(s): ±2.5706 (reject H0 if t < -2.5706 or t > 2.5706, with .025 in each tail)
Decision: Reject H0.
Conclusion: There is evidence of a linear relationship at 5% level of significance.
The value of the t statistic is exactly the same as the t statistic value for the test on the slope coefficient.
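The arithmetic of the correlation t test is short enough to check directly. A sketch in plain Python, assuming the Multiple R and R Square values from the Excel Regression Statistics output; the variable names are illustrative.

```python
import math

r = 0.9705572            # Multiple R from the Excel printout
r_squared = 0.94198129   # R Square from the Excel printout
n = 7                    # number of stores
rho_0 = 0.0              # hypothesized population correlation under H0
t_critical = 2.5706      # two-tail critical value for alpha = .05, df = 5

t_stat = (r - rho_0) / math.sqrt((1 - r_squared) / (n - 2))

print(f"t = {t_stat:.4f}")
print("reject H0:", abs(t_stat) > t_critical)
```

The result agrees with the t statistic from the test on the slope coefficient.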
Estimation of Mean Values
Confidence Interval Estimate for $\mu_{Y|X=X_i}$: The Mean of Y Given a Particular $X_i$
$$\hat{Y}_i \pm t_{n-2} \, S_{YX} \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$$
• $t_{n-2}$: t value from table with df = n-2
• $S_{YX}$: standard error of the estimate
• Size of interval varies according to distance away from mean, $\bar{X}$
Prediction of Individual Values
Prediction Interval for Individual Response $Y_i$ at a Particular $X_i$
$$\hat{Y}_i \pm t_{n-2} \, S_{YX} \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$$
Addition of 1 increases width of interval from that for the mean of Y
Interval Estimates for Different Values of X
[Figure: Y plotted against X with the fitted line $\hat{Y}_i = b_0 + b_1 X_i$. Around the line, at a given X, are the band for the Confidence Interval for the Mean of Y and the wider band for the Prediction Interval for an Individual $Y_i$; $\bar{X}$ is marked on the X axis.]
Example: Produce Stores
Data for 7 Stores:
Store   Square Feet   Annual Sales ($000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760
Regression Model Obtained:
$$\hat{Y}_i = 1636.415 + 1.487 X_i$$
Consider a store with 2000 square feet.
Estimation of Mean Values: Example
Find the 95% confidence interval for the average annual sales for stores of 2,000 square feet.
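Confidence Interval Estimate for $\mu_{Y|X=X_i}$:
Predicted sales: $\hat{Y}_i = 1636.415 + 1.487 X_i$. With $X_i = 2000$ and the unrounded Excel coefficients (1636.414726 and 1.486633657), $\hat{Y}_i = 4609.68$ (in thousands of dollars), which is the midpoint of the interval below.
$\bar{X} = 2350.29$, $S_{YX} = 611.75$, $t_{n-2} = t_5 = 2.5706$
$$\hat{Y}_i \pm t_{n-2} \, S_{YX} \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}} = 4609.68 \pm 612.66$$
$$3997.02 < \mu_{Y|X=X_i} < 5222.34$$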
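The interval for the mean can be recomputed step by step. A sketch in plain Python, assuming the unrounded Excel coefficients, the printed standard error of the estimate, and the tabled t value 2.5706; the variable names are illustrative.

```python
import math

square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]

b0, b1 = 1636.414726, 1.486633657   # unrounded Excel coefficients
s_yx = 611.751517                   # standard error of the estimate
t_critical = 2.5706                 # t value with df = 5 for 95% confidence
x_new = 2000                        # store size of interest, in square feet

n = len(square_feet)
x_bar = sum(square_feet) / n
sxx = sum((x - x_bar) ** 2 for x in square_feet)

y_hat = b0 + b1 * x_new
# Half-width of the confidence interval for the mean of Y at x_new.
half_width = t_critical * s_yx * math.sqrt(1 / n + (x_new - x_bar) ** 2 / sxx)
lower, upper = y_hat - half_width, y_hat + half_width

print(f"predicted mean sales = {y_hat:.2f} (in $1000)")
print(f"95% confidence interval: {lower:.2f} to {upper:.2f}")
```

The bounds reproduce the interval from 3997.02 to 5222.34 quoted on the slide, to within rounding.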
Prediction Interval for Y : Example
Find the 95% prediction interval for annual sales of one particular store of 2,000 square feet.
Prediction Interval for Individual $Y_{X=X_i}$:
Predicted sales: $\hat{Y}_i = 1636.415 + 1.487 X_i$. With $X_i = 2000$ and the unrounded Excel coefficients (1636.414726 and 1.486633657), $\hat{Y}_i = 4609.68$ (in thousands of dollars), which is the midpoint of the interval below.
$\bar{X} = 2350.29$, $S_{YX} = 611.75$, $t_{n-2} = t_5 = 2.5706$
$$\hat{Y}_i \pm t_{n-2} \, S_{YX} \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}} = 4609.68 \pm 1687.68$$
$$2922.00 < Y_{X=X_i} < 6297.37$$
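The prediction interval differs from the confidence interval for the mean only by the extra 1 under the square root. A sketch in plain Python, under the same assumptions as the inputs quoted above (unrounded Excel coefficients, printed standard error of the estimate, tabled t value 2.5706):

```python
import math

square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]

b0, b1 = 1636.414726, 1.486633657   # unrounded Excel coefficients
s_yx = 611.751517                   # standard error of the estimate
t_critical = 2.5706                 # t value with df = 5 for 95% confidence
x_new = 2000                        # store size of interest, in square feet

n = len(square_feet)
x_bar = sum(square_feet) / n
sxx = sum((x - x_bar) ** 2 for x in square_feet)
leverage = 1 / n + (x_new - x_bar) ** 2 / sxx

y_hat = b0 + b1 * x_new
mean_half_width = t_critical * s_yx * math.sqrt(leverage)        # for the mean of Y
pred_half_width = t_critical * s_yx * math.sqrt(1 + leverage)    # for one store

print(f"95% prediction interval: {y_hat - pred_half_width:.2f} to {y_hat + pred_half_width:.2f}")
print(f"prediction half-width {pred_half_width:.2f} vs mean half-width {mean_half_width:.2f}")
```

Because the tabled t value is rounded, the bounds agree with the slide's 2922.00 and 6297.37 only to within a few hundredths.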
Estimation of Mean Values and Prediction of Individual Values in PHStat
• In Excel, use PHStat | Regression | Simple Linear Regression …
• Check the “Confidence and Prediction Interval for X=” box
[Embedded Microsoft Excel worksheet: Excel Spreadsheet of Regression Sales on Footage]
Pitfalls of Regression Analysis
• Lacking an Awareness of the Assumptions Underlying Least-Squares Regression
• Not Knowing How to Evaluate the Assumptions
• Not Knowing What the Alternatives to Least-Squares Regression are if a Particular Assumption is Violated
• Using a Regression Model Without Knowledge of the Subject Matter
Strategy for Avoiding the Pitfalls of Regression
• Start with a scatter plot to observe a possible relationship between X and Y
• Perform residual analysis to check the assumptions
• Use a histogram, stem-and-leaf display, box-and-whisker plot, or normal probability plot of the residuals to uncover possible non-normality
Strategy for Avoiding the Pitfalls of Regression (continued)
• If there is violation of any assumption, use alternative methods to least-squares regression (e.g., least absolute deviation regression or least median of squares regression) or alternative least-squares models (e.g., curvilinear or multiple regression)
• If there is no evidence of assumption violation, then test for the significance of the regression coefficients and construct confidence intervals and prediction intervals
Chapter Summary
• Introduced Types of Regression Models
• Discussed Determining the Simple Linear Regression Equation
• Described Measures of Variation
• Addressed Assumptions of Regression and Correlation
• Discussed Residual Analysis
• Addressed Measuring Autocorrelation
Chapter Summary (continued)
• Described Inference about the Slope
• Discussed Correlation: Measuring the Strength of the Association
• Addressed Estimation of Mean Values and Prediction of Individual Values
• Discussed Pitfalls in Regression and Ethical Issues