8/12/2019 6338_multicollinearity & Autocorrelation
1/28
MULTICOLLINEARITY
AUTOCORRELATION
Multicollinearity
The theory of causation and multiple causation
Interdependence between the independent variables and variability of the dependent variable
Parsimony and linear regression
Theoretical consistency and parsimony
[Diagram: explanatory variables X1, X2, X3, X4, X5 jointly determining Y]
One of the assumptions of the CLRM is that there is no multicollinearity amongst the explanatory variables. Multicollinearity refers to a perfect or exact linear relationship among some or all of the explanatory variables.

Expl.:
X1    X2    X2*
10    50    52
15    75    75
18    90    97
24   120   129
30   150   152
X2i = 5X1i, and X2* was created by adding the numbers 2, 0, 7, 9 and 2, taken from a random-number table, to X2.
Here r(X1, X2) = 1 and r(X2, X2*) = 0.99.
X1 and X2 show perfect multicollinearity; X2 and X2* show near-perfect multicollinearity.
The problem of multicollinearity and its degree vary with the type of data.
The overlap between the variables indicates its extent, as shown in the Venn diagram.
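The two correlations can be checked directly from the table above. A minimal sketch in plain Python (no libraries); `pearson_r` is a helper written for this illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x1 = [10, 15, 18, 24, 30]
x2 = [5 * v for v in x1]            # X2 = 5*X1: perfect collinearity
x2_star = [52, 75, 97, 129, 152]    # X2 plus small perturbations

print(pearson_r(x1, x2))       # exactly 1.0
print(pearson_r(x2, x2_star))  # ~0.996: near-perfect
```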
Example:
Y = a + b1X1 + b2X2 + u
where Y = consumption expenditure, X1 = income and X2 = wealth.
Consumption expenditure depends on income (X1) and wealth (X2).
The estimated equation from a set of data is as follows:
Ŷ = 24.77 + 0.94X1 − 0.04X2
t : (3.66) (1.14) (−0.52)    R² = 0.96    adj. R² = 0.95    F = 92.40
The individual coefficients are not significant although the F value suggests a high degree of association.
The coefficient of X2 carries the wrong (negative) sign.
The fact that the F test is significant but the t values of X1 and X2 are individually insignificant means that the two variables are so highly correlated that it is impossible to isolate the individual impact of either income or wealth on consumption. Let us regress X2 on X1:
X2 = 7.54 + 10.19X1
t = (0.25) (62.04)    R² = 0.99
This shows near-perfect multicollinearity between X2 and X1.
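The consumption data themselves are not reproduced here, but the same auxiliary-regression check can be sketched with the small X1/X2* table from the opening example; `simple_ols` is a helper written for this sketch.

```python
def simple_ols(x, y):
    """OLS of y on x with intercept; returns (intercept, slope, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = sxy * sxy / (sxx * syy)
    return intercept, slope, r2

# Auxiliary regression: regress one suspect regressor on the other.
x1 = [10, 15, 18, 24, 30]
x2_star = [52, 75, 97, 129, 152]
a, b, r2 = simple_ols(x1, x2_star)
print(r2)  # ~0.99: near-perfect collinearity
```

A high R² in the auxiliary regression is exactly the signal used in the slide: the "independent" variables largely explain each other.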
Y on X1: Ŷ = 24.24 + 0.51X1, t = (3.81) (14.24), R² = 0.96
Y on X2: Ŷ = 24.41 + 0.05X2, t = (3.55) (13.29), R² = 0.96
In the simple regression, wealth (X2) has a significant impact.
Dropping the highly collinear variable has made the other variable significant.
Sources of Multicollinearity
Data collection method employed:
sampling over a limited range of the values taken by the regressors in the population.
Constraints on the model or in the population being sampled:
e.g. a regression of electricity consumption on income and house size. There is a constraint: families with higher incomes tend to have larger homes, so the two regressors move together.
Practical Consequences of Multicollinearity:
In cases of near-perfect or high multicollinearity one is likely to encounter the following consequences:
1. The OLS estimators have large variances and covariances, making precise estimation difficult.
2. (a) Because of 1, the confidence intervals tend to be much wider, leading to acceptance of the zero null hypothesis (i.e. that the true population coefficient is zero) more readily.
(b) Because of 1, the t ratios of one or more coefficients tend to be statistically insignificant.
Practical Consequences of Multicollinearity:
3. Although the t ratio(s) of one or more coefficients is/are statistically insignificant, R², the overall measure of goodness of fit, can be very high.
4. The OLS estimators and their S.E.s can be
sensitive to small changes in the data.
Remedial Measures
1. A priori information and articulation
2. Dropping a highly collinear variable
3. Transformation of Data
4. Additional information or new data
5. Identifying the purpose of the model and reducing the degree of multicollinearity; or simply identifying it, if the purpose is prediction.
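Before applying these remedies, the degree of collinearity is often quantified with the variance inflation factor, VIF = 1/(1 − R²aux), where R²aux comes from regressing one regressor on the other(s). A minimal two-regressor sketch, reusing the X1/X2*-style figures from the earlier example as hypothetical income/wealth stand-ins:

```python
def r_squared(x, y):
    """R-squared of a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def vif(x, other):
    """Variance inflation factor of regressor x given the other regressor."""
    return 1.0 / (1.0 - r_squared(other, x))

income = [10, 15, 18, 24, 30]    # hypothetical
wealth = [52, 75, 97, 129, 152]  # hypothetical, nearly 5*income
print(vif(wealth, income))  # large VIF; > 10 is a common warning level
```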
AUTOCORRELATION
The assumption E(UU′) = σ²I
Each u distribution has the same variance (homoscedasticity).
All disturbances are pairwise uncorrelated, i.e. E(UiUj) = 0 for i ≠ j.
This assumption, when violated, leads to:
1. Heteroscedasticity
2. Autocorrelation

| Var(U1)     Cov(U1,U2)  . . .  Cov(U1,Un) |   | σ²  0   . . .  0  |
| Cov(U2,U1)  Var(U2)     . . .  Cov(U2,Un) |   | 0   σ²  . . .  0  |
| . . .       . . .       . . .  . . .      | = | . . .             |
| Cov(Un,U1)  Cov(Un,U2)  . . .  Var(Un)    |   | 0   0   . . .  σ² |
Covariance is the measure of how much two
random variables vary together (as distinct from
variance, which measures how much a single
variable varies.)
Covariance between two random variables say X
and Y is defined as
Cov(X, Y) = E[(X − μX)(Y − μY)]
where μX and μY are the expected values of X and Y respectively.
If X and Y are independent, their covariance is zero.
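The definition translates directly into plain Python (toy numbers chosen only for the illustration):

```python
def covariance(x, y):
    """Population covariance: mean of (X - mean_x)*(Y - mean_y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n

x = [1, 2, 3, 4]
print(covariance(x, [2, 4, 6, 8]))    # 2.5: the two vary together
print(covariance(x, [1, -1, -1, 1]))  # 0.0: no linear co-movement
```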
The assumption implies that the disturbance term relating to any observation is not influenced by the disturbance term relating to any other observation.
For example:
1. Suppose we are dealing with quarterly time-series data involving a regression of the following specification (time-series data):
Output (Q) = f (labour input L, capital input K)

Q       L     K     U
Q1.1    L1    K1    U1
Q1.2    L2    K2    U2
Q1.3    L3    K3    U3
Q1.4    L4    K4    U4
Q2.1    .     .     .
.       .     .     .
Qn.4    L4n   K4n   U4n

If output in one quarter is affected by a labour strike, there is no reason to believe that this disturbance will be carried over to U4 in the following quarter.
2. Let
Family consumption expenditure = f (income)
(a regression involving cross-section data)

Consumption expenditure of family   Income of family   Disturbance
F1                                  I1                 U1
F2                                  I2                 U2
.                                   .                  .
Fn                                  In                 Un
The effect of an increase in one family's income on its consumption expenditure is not expected to affect the consumption expenditure of another family.
The reality:
1. Disruption caused by a strike may affect production in subsequent periods.
2. Consumption expenditure of one family may influence that of another family, i.e. "keeping up with the Joneses" (the demonstration effect).
Autocorrelation is a feature of most time-series data. In cross-section data it is referred to as spatial autocorrelation.
Therefore, in regressions involving time-series data, successive observations are likely to be interdependent, which shows up as a systematic pattern in the ui's.
2. Specification bias: excluded variable(s) or incorrect functional form.
a) When some relevant variables have been excluded from the model, their effect will appear as a systematic pattern in the ui's.
b) In the case of an incorrect functional form, i.e. fitting a linear function when the true relationship is non-linear (and vice versa), the dependent variable will be either over-estimated or under-estimated, which will have a systematic impact on the ui's.
Example:
(Correct)   MC = β1 + β2·output + β3·(output)² + ui
(Incorrect) MC = b1 + b2·output + vi
where vi = β3·(output)² + ui, and hence vi will catch the systematic effect of (output)² on MC, leading to serial correlation in the disturbances.
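The effect can be demonstrated numerically. The sketch below (not Gujarati's own example; the quadratic coefficients 1, 2 and 0.3 are made up) generates cost data from a quadratic, fits a straight line by OLS, and prints the residual signs: they form long same-sign runs instead of a random scatter, which is the systematic pattern described above.

```python
def ols_residuals(x, y):
    """Residuals from a simple OLS fit of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / \
            sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

output = list(range(1, 21))
mc = [1 + 2 * q + 0.3 * q * q for q in output]  # true quadratic MC, no noise
res = ols_residuals(output, mc)
signs = ["+" if e > 0 else "-" for e in res]
print("".join(signs))  # long runs of +, then -, then +: not random
```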
3. Cobweb Phenomenon:
The supply of many agricultural commodities reflects the so-called cobweb phenomenon, where supply reacts to price with a lag of one time period, because supply decisions take time to implement (the gestation period).
Expl.: at the beginning of this year's planting of crops, farmers are influenced by the price prevailing last year.
Suppose at the end of period t the price Pt turns out to be lower than Pt−1. Therefore, in period t+1 farmers may decide to produce less than they did in period t.
Such phenomena are known as cobweb phenomena, and they give a systematic pattern to the ui's.
Similar problems arise with household expenditure, share prices, etc. In general, whenever a lagged variable that belongs in the model is not included, the ui's are correlated.
4. Manipulation of time-series data:
(i) Extrapolation of the values of variables like population gives rise to serial dependence among the successive ui's.
(ii) Very often we use projected population figures to arrive at per-capita figures for a macro-variable; when such figures are used in forecasting with regression, the successive ui's are serially correlated.
Consequences (proofs are not given):
In the presence of autocorrelation in a model:
a) The residual variance is likely to underestimate the true σ².
b) R² is likely to be overestimated.
c) t tests are not valid and, if applied, are likely to give misleading conclusions.
The OLS estimators, although linear and unbiased, do not have minimum variance, leading to invalid t and F tests.
Detection of autocorrelation:
The assumption of the CLRM relates to the population disturbance terms, which are not directly observable. Therefore their proxies, the OLS residuals ûi, are obtained and examined for the presence or absence of autocorrelation.
There are various methods. Some of them are:
1. Graphical method
2. Runs test (a non-parametric test): examines the signs of the residuals
3. The DW statistic, to which a decision rule is applied.
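The DW statistic itself is simple to compute from the residuals; a minimal sketch with two hand-made residual series (values chosen only to illustrate the two extremes):

```python
def durbin_watson(residuals):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2).
    Values near 2 suggest no autocorrelation; near 0, positive
    autocorrelation; near 4, negative autocorrelation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

smooth = [1, 2, 3, 4, 5, 4, 3, 2]    # slowly drifting: positive pattern
alternating = [1, -1, 1, -1, 1, -1]  # sign flips: negative pattern
print(durbin_watson(smooth))       # well below 2
print(durbin_watson(alternating))  # well above 2
```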
Remedial Measures:
Data transformation by
a) The first-difference method (Xt − Xt−1): one degree of freedom is lost.
b) The ρ-transformation, with ρ estimated from the data.
The transformed model becomes
(Yt − ρYt−1) = β1(1 − ρ) + β2(Xt − ρXt−1) + εt,  where εt = ut − ρut−1.
This is known as the generalised or quasi-difference equation.
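The quasi-difference transformation is mechanical once an estimate of ρ is in hand; a minimal sketch with made-up series and an assumed ρ = 0.5:

```python
def quasi_difference(series, rho):
    """Generalised (quasi-) difference: z_t = x_t - rho * x_{t-1}.
    The first observation is dropped here."""
    return [series[t] - rho * series[t - 1] for t in range(1, len(series))]

y = [10.0, 12.0, 14.0, 15.0]
x = [1.0, 2.0, 3.0, 4.0]
rho = 0.5  # in practice rho is estimated, e.g. from the DW statistic
y_star = quasi_difference(y, rho)
x_star = quasi_difference(x, rho)
print(y_star)  # [7.0, 8.0, 8.0]
print(x_star)  # [1.5, 2.0, 2.5]
```

Setting rho = 1 reduces this to the first-difference method in (a).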
Exercise 4 (refer to Ch. 10 & 12 of DNG)
1. Use time-series data in a multiple regression (MR).
2. Find the correlation table.
3. See the extent of multicollinearity.
4. Test for autocorrelation.
5. If autocorrelation is present, use the ρ-transformation.
6. Having addressed both problems, calculate the forecast error and select the equation which gives the minimum forecast error.