QUANTITATIVE ANALYSIS FOR BUSINESS
Lecture 2
July 5th, 2010
Saksarun (Jay) Mativachranon
INTRO
Please turn your mobile phones off or switch them to silent mode, and please do not take calls during the lecture
Slides will be available at www.slideshow.com (soon)
Email: [email protected]
LINEAR REGRESSION
REGRESSION
Regression is used for estimating the unknown effect that a change in one variable has on another
The variable to be estimated is called the “dependent variable”
The variable that changes is called the “independent variable”
LINEAR REGRESSION ASSUMPTIONS
1. There is NO relationship between X and Y if $\beta_1$ equals 0
2. There is ALWAYS a relationship if $\beta_1$ does NOT equal 0
3. The independent variable (X) is not random
4. The expected value of the error ($\varepsilon$) is 0
$Y = \beta_0 + \beta_1 X + \varepsilon$
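The assumptions can be illustrated with a small simulation. This is only a sketch: the parameter values, the normal distribution of the errors, and the helper name `simulate_y` are assumptions made for illustration and do not come from the slides. It draws Y from the model with a slope of 0, so Y does not depend on X, and checks that the errors average out close to 0.

```python
import random

def simulate_y(beta0, beta1, xs, sigma=1.0, seed=0):
    """Draw Y = beta0 + beta1 * X + error (hypothetical helper).

    Errors are assumed normal with mean 0 and standard deviation sigma.
    """
    rng = random.Random(seed)
    errors = [rng.gauss(0.0, sigma) for _ in xs]
    ys = [beta0 + beta1 * x + err for x, err in zip(xs, errors)]
    return ys, errors

# X values are fixed in advance (not random), as assumption 3 requires.
xs = [i % 10 for i in range(10_000)]

# With a slope of 0, Y is just the intercept plus noise: no relationship with X.
ys, errors = simulate_y(beta0=2.0, beta1=0.0, xs=xs)

# The sample mean of the errors should be close to the expected value 0.
mean_error = sum(errors) / len(errors)
print(round(mean_error, 3))
```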
LINEAR REGRESSION ANALYSIS
Analyzing the correlation and directionality of the data
Estimating the model
Evaluating the validity and usefulness of the model
USAGE OF REGRESSION
Causal analysis
Forecasting an effect (of the independent variable on the dependent variable)
Forecasting (the trend of) future values
SIMPLE LINEAR REGRESSION
The true values of the slope and intercept are not known, so we estimate them using sample data
$\hat{Y} = b_0 + b_1 X$
where
$\hat{Y}$ = dependent variable (predicted value of Y)
X = independent variable
$b_0$ = intercept (value of $\hat{Y}$ when X = 0)
$b_1$ = slope of the regression line
SCATTER DIAGRAM
EXAMPLE
Linear Regression
SITUATION
Company A wants to know the relationship between the man-hours put in by its sales force and its sales
It has collected its sales data and the man-hours put in during the collection period
COMPANY A DATA
Sales of Company A ($)    Man Hour (Hour)
6                         3
8                         4
9                         6
5                         4
4.5                       2
9.5                       5
COMPANY A’S SALES SCATTER DIAGRAM
[Scatter diagram: Sales (vertical axis, 0 to 12) against Man Hour (horizontal axis, 0 to 8)]
FINDING THE REGRESSION
Company A is trying to predict its sales from the man-hours spent
The regression line is the one that minimizes the errors (measured as the sum of squared errors)
Y = Sales
X = Man Hour
Error = (Actual value) – (Predicted value)
$e = Y - \hat{Y}$
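To make "the line that minimizes the errors" concrete, the sketch below scores two candidate lines on Company A's data by their sum of squared errors. The helper name `sum_squared_errors` and the second candidate line (intercept 1, slope 1.5) are made up for illustration. The first candidate is the line the lecture arrives at later, and squaring the errors is the usual least-squares criterion, assumed here.

```python
# Company A data from the lecture: sales ($) and man-hours.
sales = [6, 8, 9, 5, 4.5, 9.5]
man_hours = [3, 4, 6, 4, 2, 5]

def sum_squared_errors(b0, b1, xs, ys):
    """Sum of squared errors e = Y - Y_hat for the line Y_hat = b0 + b1 * X."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Line derived later in the lecture: Y_hat = 2 + 1.25 X.
sse_fitted = sum_squared_errors(2.0, 1.25, man_hours, sales)

# Hypothetical alternative line, chosen only for comparison.
sse_other = sum_squared_errors(1.0, 1.5, man_hours, sales)

print(sse_fitted, sse_other)
```

The fitted line gives the smaller total, which is what "minimizes the errors" means in practice.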
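DATA MANIPULATION
For the simple linear regression model, the values of the intercept and slope can be calculated using the formulas below
$\hat{Y} = b_0 + b_1 X$
$\bar{X} = \frac{\sum X}{n}$ = average (mean) of X values
$\bar{Y} = \frac{\sum Y}{n}$ = average (mean) of Y values
$b_1 = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}$
$b_0 = \bar{Y} - b_1 \bar{X}$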
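The formulas above translate almost line for line into code. The following is a minimal sketch in plain Python; the function name `fit_simple_linear_regression` is an illustrative choice, and the data are Company A's figures from the lecture.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (b0, b1) for Y_hat = b0 + b1 * X using the slide formulas."""
    n = len(xs)
    x_bar = sum(xs) / n  # mean of X
    y_bar = sum(ys) / n  # mean of Y
    # Numerator and denominator of the slope formula.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

man_hours = [3, 4, 6, 4, 2, 5]   # X
sales = [6, 8, 9, 5, 4.5, 9.5]   # Y
b0, b1 = fit_simple_linear_regression(man_hours, sales)
print(b0, b1)
```

Note that the slope formula divides by the spread of X, so the sketch would fail if every X value were identical.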
REGRESSION CALCULATION
Y      X    $(X - \bar{X})^2$    $(X - \bar{X})(Y - \bar{Y})$
6      3    (3 – 4)^2 = 1        (3 – 4)(6 – 7) = 1
8      4    (4 – 4)^2 = 0        (4 – 4)(8 – 7) = 0
9      6    (6 – 4)^2 = 4        (6 – 4)(9 – 7) = 4
5      4    (4 – 4)^2 = 0        (4 – 4)(5 – 7) = 0
4.5    2    (2 – 4)^2 = 4        (2 – 4)(4.5 – 7) = 5
9.5    5    (5 – 4)^2 = 1        (5 – 4)(9.5 – 7) = 2.5
$\sum Y = 42$, $\bar{Y} = 42/6 = 7$
$\sum X = 24$, $\bar{X} = 24/6 = 4$
$\sum (X - \bar{X})^2 = 10$
$\sum (X - \bar{X})(Y - \bar{Y}) = 12.5$
REGRESSION CALCULATION (CONT.)
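$\bar{X} = \frac{\sum X}{n} = \frac{24}{6} = 4$
$\bar{Y} = \frac{\sum Y}{n} = \frac{42}{6} = 7$
$b_1 = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} = \frac{12.5}{10} = 1.25$
$b_0 = \bar{Y} - b_1 \bar{X} = 7 - (1.25)(4) = 2$
Therefore $\hat{Y} = 2 + 1.25X$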
RESULTS
Company A sales model
$\hat{Y} = 2 + 1.25X$
Predicting sales
With 1 man-hour, Company A is predicted to sell 2 + 1.25(1) = $3.25 worth of goods; each additional man-hour adds $1.25 to predicted sales
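Using the model for prediction is a one-line calculation. The sketch below wraps it in a function; the name `predict_sales` and the example inputs are illustrative, while the intercept 2 and slope 1.25 are the values from the lecture.

```python
def predict_sales(man_hours, b0=2.0, b1=1.25):
    """Predicted sales ($) for a given number of man-hours: Y_hat = b0 + b1 * X."""
    return b0 + b1 * man_hours

# Predictions for a few example inputs within the observed range of X.
for hours in (1, 4, 6):
    print(hours, predict_sales(hours))
```

Predictions far outside the observed man-hours (2 to 6 in the data) should be treated with caution, since the data say nothing about how sales behave there.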
MEASURING REGRESSION MODEL
A regression model can be developed for any variables Y and X
But how do we know how reliably Y is predicted from the variation of X?
COMPANY A’S SALES MODEL
[Figure: scatter diagram of Sales (0 to 12) against Man Hour (0 to 8) with the regression line; two of the gaps between data points and the line are labelled "Error"]
MEASURING REGRESSION MODEL (CONT.)
How do we know how reliably Y is predicted from the variation of X?
Can we find the average of the errors?
MEASUREMENT OF VARIABILITY
SST – Total variability about the mean
SSE – Variability about the regression line
SSR – Total variability that is explained by the model
MEASUREMENT OF VARIABILITY
Sum of squares total: $SST = \sum (Y - \bar{Y})^2$
Sum of squared errors: $SSE = \sum e^2 = \sum (Y - \hat{Y})^2$
Sum of squares due to regression: $SSR = \sum (\hat{Y} - \bar{Y})^2$
An important relationship: $SST = SSR + SSE$
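The relationship follows from splitting each deviation from the mean into an unexplained part and an explained part. The derivation below is a sketch. It relies on the standard result, not shown in the slides, that the cross term is zero when $b_0$ and $b_1$ are the least-squares estimates of a model with an intercept.

```latex
\begin{align*}
Y - \bar{Y} &= (Y - \hat{Y}) + (\hat{Y} - \bar{Y}) \\
\sum (Y - \bar{Y})^2 &= \sum (Y - \hat{Y})^2 + \sum (\hat{Y} - \bar{Y})^2
  + 2 \sum (Y - \hat{Y})(\hat{Y} - \bar{Y}) \\
&= SSE + SSR + 0 \\
SST &= SSR + SSE
\end{align*}
```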
COMPANY A EXAMPLE
Y      X    $(Y - \bar{Y})^2$       $\hat{Y}$               $(Y - \hat{Y})^2$    $(\hat{Y} - \bar{Y})^2$
6      3    (6 – 7)^2 = 1           2 + 1.25(3) = 5.75      0.0625               1.563
8      4    (8 – 7)^2 = 1           2 + 1.25(4) = 7.00      1                    0
9      6    (9 – 7)^2 = 4           2 + 1.25(6) = 9.50      0.25                 6.25
5      4    (5 – 7)^2 = 4           2 + 1.25(4) = 7.00      4                    0
4.5    2    (4.5 – 7)^2 = 6.25      2 + 1.25(2) = 4.50      0                    6.25
9.5    5    (9.5 – 7)^2 = 6.25      2 + 1.25(5) = 8.25      1.5625               1.563
$\sum (Y - \bar{Y})^2 = 22.5$, $\sum (Y - \hat{Y})^2 = 6.875$, $\sum (\hat{Y} - \bar{Y})^2 = 15.625$
$\bar{Y} = 7$, SST = 22.5, SSE = 6.875, SSR = 15.625
COMPANY A’S VARIABILITY
SST = 22.5
SSE = 6.875
SSR = 15.625
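These three figures can be reproduced directly from the data and the fitted line. The sketch below assumes the model $\hat{Y} = 2 + 1.25X$ from the lecture; the variable names are illustrative.

```python
# Company A data and the fitted model from the lecture.
sales = [6, 8, 9, 5, 4.5, 9.5]    # Y
man_hours = [3, 4, 6, 4, 2, 5]    # X
b0, b1 = 2.0, 1.25

y_bar = sum(sales) / len(sales)               # mean of Y
predicted = [b0 + b1 * x for x in man_hours]  # Y_hat for each observation

# Total variability about the mean.
sst = sum((y - y_bar) ** 2 for y in sales)
# Variability about the regression line (unexplained).
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(sales, predicted))
# Variability explained by the model.
ssr = sum((y_hat - y_bar) ** 2 for y_hat in predicted)

print(sst, sse, ssr)
```

The explained and unexplained parts add up to the total: SSR + SSE equals SST.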
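COMPANY A’S SALES MODEL
[Figure: scatter diagram of Sales (0 to 12) against Man Hour (0 to 8) with the regression line $\hat{Y} = 2 + 1.25X$ and the mean $\bar{Y}$; the deviations $Y - \bar{Y}$, $Y - \hat{Y}$ and $\hat{Y} - \bar{Y}$ are marked]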
COEFFICIENT OF DETERMINATION
The proportion of the variability in Y that is explained by the regression model
COEFFICIENT OF DETERMINATION
The coefficient of determination is $r^2$
$r^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$
COMPANY A EXAMPLE
For Company A
$r^2 = \frac{15.625}{22.5} = 0.6944$
Explanation
Over 69% of the variability in Y can be explained by variation of X
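Both forms of the formula give the same number, as this short check shows. It uses the SST, SSE and SSR values from the lecture; the variable names are illustrative.

```python
# Variability figures for Company A, taken from the lecture.
sst, sse, ssr = 22.5, 6.875, 15.625

# Two equivalent ways to compute the coefficient of determination.
r2_from_ssr = ssr / sst
r2_from_sse = 1.0 - sse / sst

print(round(r2_from_ssr, 4), round(r2_from_sse, 4))
```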
CORRELATION COEFFICIENT
The strength of the linear relationship between Y and X
It will always be between +1 and –1
The correlation coefficient is r
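The slides do not give a formula for r. The sketch below assumes the standard Pearson formula, $r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}}$, and applies it to Company A's data. The function name `correlation_coefficient` is an illustrative choice.

```python
import math

def correlation_coefficient(xs, ys):
    """Pearson correlation coefficient r (standard formula, assumed here)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

man_hours = [3, 4, 6, 4, 2, 5]   # X
sales = [6, 8, 9, 5, 4.5, 9.5]   # Y
r = correlation_coefficient(man_hours, sales)
print(round(r, 4), round(r ** 2, 4))
```

For simple linear regression, squaring this r gives the coefficient of determination, and r carries the same sign as the slope of the regression line.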
CORRELATION COEFFICIENT
[Figure: four scatter plots of Y against X]
(a) Perfect Positive Correlation: r = +1
(b) Positive Correlation: 0 < r < 1
(c) No Correlation: r = 0
(d) Perfect Negative Correlation: r = –1
NEXT WEEK
Linear Regression
Errors in Regression model
Variance
Mean Square Error
Standard Deviation
Testing the Model
Multiple Regression