The Ordinary Least Squares (OLS) Estimator
Regression Analysis
• Regression Analysis: a statistical technique for
investigating and modeling the relationship
between variables.
• Applications: Engineering, the physical and
chemical sciences, economics, management, the
life and biological sciences, and the social sciences
• Regression analysis may be the most widely used
statistical technique
• Example 1: delivery time vs. delivery
volume
– Suspect that the time required by a route
deliveryman to load and service a machine is
related to the number of cases of product
delivered
– 25 randomly chosen retail outlets
– The in-outlet delivery time and the volume of
product delivery
– Scatter diagram: display a relationship between
delivery time and delivery volume
[Figures: scatter diagram of delivery time vs. delivery volume]
• Y: delivery time, x: delivery volume
Y = β₀ + β₁x + ε
• Error, ε:
– The difference between y and β₀ + β₁x
– A statistical error, i.e. a random variable
– The effects of the other variables on delivery
time, measurement errors, …
• Simple linear regression model:
Y = β₀ + β₁x + ε
– x: independent (predictor, regressor) variable
– Y: dependent (response) variable
– ε : error
• If x is fixed, Y is determined by ε.
• Suppose that E(ε) = 0 and Var(ε) = σ².
Then
E(Y|x) = E(β₀ + β₁x + ε) = β₀ + β₁x
Var(Y|x) = Var(β₀ + β₁x + ε) = σ²
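The two identities above can be checked by simulation. A minimal sketch, with made-up values β₀ = 3.5, β₁ = 2, σ = 1 and x fixed at 4: the simulated Y values should average to β₀ + β₁x and have variance σ².

```python
import random

# Hypothetical parameters chosen for illustration only.
random.seed(0)
b0, b1, sigma, x = 3.5, 2.0, 1.0, 4.0

# Draw many realizations of Y = b0 + b1*x + eps with x held fixed.
ys = [b0 + b1 * x + random.gauss(0.0, sigma) for _ in range(200_000)]
mean_y = sum(ys) / len(ys)
var_y = sum((y - mean_y) ** 2 for y in ys) / (len(ys) - 1)
print(round(mean_y, 2), round(var_y, 2))  # close to E(Y|x) = 11.5 and Var(Y|x) = 1.0
```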
• The true regression line is a line of mean
values: the height of the regression line at
any x is the expected value of Y for that x.
• The slope, β₁: the change in the mean of Y
for a unit change in x
• The variability of Y at x is determined by
the variance of the error
• Example:
– E(Y|x) = 3.5 + 2x, and Var(Y|x) = σ²
– Y|x ~ N(β₀ + β₁x, σ²)
– σ² small: the observed values will fall close to
the line.
– σ² large: the observed values may deviate
considerably from the line.
• The regression equation is only an
approximation to the true functional
relationship between the variables.
• Regression model: Empirical model
• Valid only over the region of the regressor
variables contained in the observed data!
• Multiple linear regression model:
Y = β₀ + β₁x₁ + … + βₖxₖ + ε
• Linear: the model is linear in the
parameters β₀, β₁, …, βₖ, not because Y is a
linear function of the x’s.
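The "linear in the parameters" point can be made concrete with a sketch on made-up data: the model Y = β₀ + β₁·log(x) is nonlinear in x, yet it is still a linear regression model, because substituting z = log(x) lets the simple-linear-regression formulas apply unchanged.

```python
import math

# Made-up exact data following y = 1 + 2*log(x), for illustration only.
xs = [1.0, 2.0, 4.0, 8.0, 16.0]
ys = [1.0 + 2.0 * math.log(x) for x in xs]

# Transform the regressor; the model is linear in the parameters.
zs = [math.log(x) for x in xs]
n = len(zs)
zbar, ybar = sum(zs) / n, sum(ys) / n
Szz = sum((z - zbar) ** 2 for z in zs)
Szy = sum((z - zbar) * (y - ybar) for z, y in zip(zs, ys))
b1_hat = Szy / Szz
b0_hat = ybar - b1_hat * zbar
print(b0_hat, b1_hat)  # recovers b0 = 1 and b1 = 2 (up to floating point)
```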
• Two important objectives:
– Estimate the unknown parameters (fitting
the model to the data): The method of least
squares.
– Model adequacy checking: An iterative
procedure to choose an appropriate regression
model to describe the data.
• Remarks:
– Regression does not imply a cause-and-effect
relationship between the variables
– It can aid in confirming a cause-and-effect
relationship, but it is not the sole basis!
– Part of a broader data-analysis approach
The Least Squares Estimator
• Y = β₀ + β₁x + ε
– x: regressor variable
– Y: response variable
– β₀: the intercept, unknown
– β₁: the slope, unknown
– ε: error with E(ε) = 0 and Var(ε) = σ²
(unknown)
• The errors are uncorrelated.
• Given x,
E(Y|x) = E(β₀ + β₁x + ε) = β₀ + β₁x
Var(Y|x) = Var(β₀ + β₁x + ε) = σ²
• Responses are also uncorrelated.
• Regression coefficients: β₀, β₁
– β₁: the change in E(Y|x) for a unit change in x
– β₀: E(Y|x=0)
Least-squares Estimation of the Parameters
Estimation of β₀ and β₁
• Data: n pairs: (yi, xi), i = 1, …, n
• Method of least squares: Minimize
S(β₀, β₁) = Σᵢ₌₁ⁿ [yᵢ − (β₀ + β₁xᵢ)]²
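The least-squares criterion can be sketched numerically on made-up data: compute the closed-form estimates, then check that no small perturbation of them decreases S(β₀, β₁).

```python
# Made-up data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

def S(b0, b1):
    """Sum of squared deviations of the data from the line b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
Sxx = sum((x - xbar) ** 2 for x in xs)
Sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
b1_hat = Sxy / Sxx                 # closed-form slope estimate
b0_hat = ybar - b1_hat * xbar      # closed-form intercept estimate

# Perturbing the estimates in any direction should not decrease S.
deltas = (-0.5, -0.1, 0.1, 0.5)
assert all(S(b0_hat + d0, b1_hat + d1) >= S(b0_hat, b1_hat)
           for d0 in deltas for d1 in deltas)
print(b0_hat, b1_hat)
```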
• Setting ∂S/∂β₀ = 0 and ∂S/∂β₁ = 0 at β̂₀, β̂₁ gives
• Least-squares normal equations:
n β̂₀ + β̂₁Σᵢxᵢ = Σᵢyᵢ
β̂₀Σᵢxᵢ + β̂₁Σᵢxᵢ² = Σᵢxᵢyᵢ
• The least-squares estimators:
β̂₁ = Sxy/Sxx, where Sxx = Σᵢ(xᵢ − x̄)² and Sxy = Σᵢ(xᵢ − x̄)(yᵢ − ȳ)
β̂₀ = ȳ − β̂₁x̄
• The fitted simple regression model: ŷ = β̂₀ + β̂₁x
– A point estimate of the mean of Y for a
particular x
• Residual: eᵢ = yᵢ − ŷᵢ
– Plays an important role in investigating the
adequacy of the fitted regression model and in
detecting departures from the underlying
assumptions!
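A useful consequence of the normal equations, sketched here on made-up data: the residuals of a least-squares fit always satisfy Σᵢeᵢ = 0 and Σᵢxᵢeᵢ = 0.

```python
# Made-up data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 2.9, 3.1, 4.8, 6.0]
n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
Sxx = sum((x - xbar) ** 2 for x in xs)
Sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
b1_hat = Sxy / Sxx
b0_hat = ybar - b1_hat * xbar

# Residuals e_i = y_i - yhat_i of the fitted line.
residuals = [y - (b0_hat + b1_hat * x) for x, y in zip(xs, ys)]

# Both identities follow from the normal equations.
print(abs(sum(residuals)) < 1e-9)                             # True
print(abs(sum(x * e for x, e in zip(xs, residuals))) < 1e-9)  # True
```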
• Example 2: The Rocket Propellant Data
– Shear strength is related to the age in weeks of
the batch of sustainer propellant.
– 20 observations
– From scatter diagram, there is a strong
relationship between shear strength (Y) and
propellant age (x).
– Assumption
Y = β₀ + β₁x + ε
[Figure: scatter plot of shear strength vs. propellant age]
• Sxx = Σᵢxᵢ² − n x̄² = 1106.56
• Sxy = Σᵢxᵢyᵢ − n x̄ȳ = −41112.65
• The least-squares fit:
β̂₁ = Sxy/Sxx = −37.15
β̂₀ = ȳ − β̂₁x̄ = 2627.82
ŷ = 2627.82 − 37.15x
• How well does this equation fit the data?
• Is the model likely to be useful as a
predictor?
• Are any of the basic assumptions violated,
and if so, how serious is this?
Properties of the Least-Squares Estimators and the Fitted Regression Model
• β̂₀ and β̂₁ are linear combinations of the yᵢ:
β̂₁ = Σᵢcᵢyᵢ, where cᵢ = (xᵢ − x̄)/Sxx
β̂₀ = ȳ − β̂₁x̄
• β̂₀ and β̂₁ are unbiased estimators.
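The weights cᵢ = (xᵢ − x̄)/Sxx used in the unbiasedness proof satisfy Σᵢcᵢ = 0 and Σᵢcᵢxᵢ = 1; a sketch on made-up data verifies both identities and that β̂₁ really is the corresponding linear combination of the yᵢ.

```python
# Made-up data for illustration only.
xs = [2.0, 4.0, 5.0, 7.0, 9.0]
ys = [3.1, 5.2, 6.0, 8.3, 10.1]
n = len(xs)
xbar = sum(xs) / n
Sxx = sum((x - xbar) ** 2 for x in xs)

# Weights c_i = (x_i - xbar)/Sxx.
cs = [(x - xbar) / Sxx for x in xs]

assert abs(sum(cs)) < 1e-12                                 # sum of c_i is 0
assert abs(sum(c * x for c, x in zip(cs, xs)) - 1.0) < 1e-12  # sum of c_i*x_i is 1

# The slope estimator is a linear combination of the responses.
b1_hat = sum(c * y for c, y in zip(cs, ys))
print(b1_hat)
```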
• Unbiasedness:
E(β̂₁) = E(Σᵢcᵢyᵢ) = Σᵢ cᵢE(yᵢ) = Σᵢ cᵢ(β₀ + β₁xᵢ) = β₁
(using Σᵢcᵢ = 0 and Σᵢcᵢxᵢ = 1)
E(β̂₀) = E(ȳ − β̂₁x̄) = β₀ + β₁x̄ − β₁x̄ = β₀
• Variances:
Var(β̂₁) = Var(Σᵢcᵢyᵢ) = Σᵢ cᵢ²Var(yᵢ) = σ²Σᵢcᵢ² = σ²/Sxx
Var(β̂₀) = σ²(1/n + x̄²/Sxx)
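The moment results above can be checked by Monte Carlo on a made-up design: across many simulated samples, the average of β̂₁ should be near β₁ and its sample variance near σ²/Sxx.

```python
import random

# Hypothetical design points and parameters for illustration only.
random.seed(1)
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b0, b1, sigma = 2.0, 0.5, 1.0
n = len(xs)
xbar = sum(xs) / n
Sxx = sum((x - xbar) ** 2 for x in xs)   # 42.0 for these x's

# Refit the slope on many independent samples from the true model.
estimates = []
for _ in range(20_000):
    ys = [b0 + b1 * x + random.gauss(0.0, sigma) for x in xs]
    Sxy = sum((x - xbar) * y for x, y in zip(xs, ys))
    estimates.append(Sxy / Sxx)

mean_b1 = sum(estimates) / len(estimates)
var_b1 = sum((e - mean_b1) ** 2 for e in estimates) / (len(estimates) - 1)
print(round(mean_b1, 2), round(var_b1, 3))  # near b1 = 0.5 and sigma^2/Sxx ≈ 0.024
```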
Classical Linear Regression Assumptions
• 1. Regression is linear in parameters
• 2. Error term has zero population mean
• 3. Error term is not correlated with X’s
• 4. No serial correlation
• 5. No heteroskedasticity
• 6. No perfect multicollinearity
• and we usually add:
• 7. Error term is normally distributed
(*Note: normality was not used in deriving the OLS estimator; the derivation is distribution-free. A good property.)
Gauss-Markov Theorem
• Given OLS assumptions 1 through 6, the OLS
estimator of βk is the minimum variance estimator
from the set of all linear unbiased estimators of βk
for k=0,1,2,…,K. That is, the OLS is the BLUE
(Best Linear Unbiased Estimator)
* Furthermore, by adding assumption 7 (normality),
one can show that OLS = MLE and is the BUE (Best
Unbiased Estimator) also called the UMVUE.
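The BLUE property can be illustrated (not proved) by simulation on a made-up setup: the "endpoint" slope (yₙ − y₁)/(xₙ − x₁) is also a linear unbiased estimator of β₁, but its sampling variance exceeds that of the OLS slope, exactly as the theorem guarantees.

```python
import random

# Hypothetical design points and parameters for illustration only.
random.seed(2)
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b0, b1, sigma = 2.0, 0.5, 1.0
n = len(xs)
xbar = sum(xs) / n
Sxx = sum((x - xbar) ** 2 for x in xs)

# Compare OLS with a competing linear unbiased estimator over many samples.
ols, endpoint = [], []
for _ in range(20_000):
    ys = [b0 + b1 * x + random.gauss(0.0, sigma) for x in xs]
    ols.append(sum((x - xbar) * y for x, y in zip(xs, ys)) / Sxx)
    endpoint.append((ys[-1] - ys[0]) / (xs[-1] - xs[0]))

def var(sample):
    m = sum(sample) / len(sample)
    return sum((s - m) ** 2 for s in sample) / (len(sample) - 1)

print(var(ols) < var(endpoint))  # True: OLS has the smaller variance
```

Here the theoretical variances are σ²/Sxx = 1/42 for OLS versus 2σ²/(x₈ − x₁)² = 2/49 for the endpoint estimator, so the gap is visible even in a modest simulation.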
• Can you prove this theorem?
• This is your Quiz 2.
• Last but not least, we thank colleagues
who have uploaded their lecture notes to
the internet!