Advanced Econometrics by Sajid Ali Khan Rawalakot: 0334-5439066

Date post: 09-Jan-2017
Upload: sajid-ali-khan
Page 1: Advanced Econometrics by Sajid Ali Khan Rawalakot: 0334-5439066

Advanced Econometrics

1

ADVANCED

ECONOMETRICS

SAJID ALI KHAN

M.Phil. Statistics AIOU, Islamabad

M.Sc. Statistics AJKU, Muzaffarabad

PRINCIPAL

GREEN HILLS POSTGRADUATE COLLEGE

RAWALAKOT AZAD KASHMIR

E.Mail: [email protected]

Mobile: 0334-5439066

CONTENTS

Chapter: 1. Econometrics

1.1. Introduction

1.2. Mathematical and statistical relationship

1.3. Goals of econometrics

1.4. Types of econometrics

1.5. Methodology of econometrics

1.6. The role of the computer

1.7. Exercise

Chapter: 2. Simple Linear Regression

2.1. The nature of the regression analysis

2.2. Data

2.3. Method of ordinary least squares

2.4. Properties of least square regression line

2.5. Assumptions of ordinary least square

2.6. Properties of least squares estimators small/ large sample

2.7. Variance of disturbance term 𝑼𝒊

2.8. Distribution of dependent variable Y

2.9. Maximum likelihood method

2.10. Goodness of fit test

2.11. Mean prediction

2.12. Individual prediction

2.13. Sampling distributions and confidence interval

2.14. Exercise

Chapter: 3. Multiple Linear Regression and Correlation

3.1. Multiple linear regression

3.2. Coefficient of multiple determination

3.3. Adjusted 𝑹 𝟐

3.4. Cobb-Douglas production function

3.5. Partial correlation

3.6. Testing multiple regression (F-test)

3.7. Relation between 𝑹𝟐𝒂𝒏𝒅 𝑭

3.8. Exercise

Chapter: 4. General Linear Regression

4.1. Introduction

4.2. Properties of GLR

4.3. Polynomial

4.4. Exercise

Chapter: 5. Dummy Variables

5.1. Nature of dummy variables

5.2. Dummy variable trap

5.3. Uses of dummy variables

5.4. Exercise

Chapter: 6. Auto-Regressive and Distributed-Lag Model

6.1. Distributed-lag model

6.2. Auto-regressive model

6.3. Lag

6.4. Reasons/sources of lags

6.5. Types of distributed lag model

6.6. Estimation of distribution lag model

6.7. Exercise

Chapter: 7. Multicollinearity

7.1. Collinearity

7.2. Multicollinearity

7.3. Sources of multicollinearity

7.4. Types of multicollinearity

7.5. Estimation of multicollinearity

7.6. Consequences of multicollinearity

7.7. Detection of multicollinearity

7.8. Remedial measures of multicollinearity

7.9. Exercise

Chapter: 8. Heteroscedasticity

8.1. Nature of heteroscedasticity

8.2. Estimation of heteroscedasticity

8.3. Consequences of heteroscedasticity

8.4. Detection of heteroscedasticity

8.5. Remedial measures of heteroscedasticity

8.6. Exercise

Chapter: 9. Autocorrelation

9.1. Introduction

9.2. Reasons of autocorrelation

9.3. Estimation of autocorrelation

9.4. Consequences of autocorrelation

9.5. Detection of autocorrelation

9.6. Exercise

Chapter: 10. Simultaneous Equation System

10.1. Introduction

10.2. System of simultaneous equation

10.3. Simultaneous equation bias

10.4. Methods of estimation in simultaneous equation models

10.5. Exercise

Chapter: 11. Identification Problem

11.1. Introduction

11.2. Rules for identification

11.3. Conditions of identification

11.4. Exercise

Chapter: 1

ECONOMETRICS

1.1: INTRODUCTION

Econometrics is the field of economics that concerns itself

with the application of mathematical statistics and the tools of

statistical inference to the empirical measurement of relationships

postulated by economic theory.

Econometrics, which literally means “economic measurement”, is

the quantitative measurement and analysis of actual economic and

business phenomena. Econometrics is a fascinating set of

techniques that allows the measurements and analysis of economic

trends.

Econometrics, the result of a certain outlook on the role of

economics, consists of the application of mathematical statistics to

economic data to lend empirical support to the models constructed

by mathematical economics and to obtain numerical results.

Econometrics may be defined as the quantitative analysis of actual

economic phenomena based on the concurrent development of

theory and observation, related by appropriate methods of

inference.

Econometrics may be defined as the social science in which

the tools of economic theory, mathematics and statistical inference

are applied to the analysis of economic phenomena. Econometrics

is concerned with the empirical determination of economic laws.

Frisch (1933) and his society responded to an

unprecedented accumulation of statistical information. They saw a

need to establish a body of principles that could organize what

would otherwise become a bewildering mass of data. Neither the

pillars nor the objectives of econometrics have changed in the

years since this editorial appeared.

1.2: MATHEMATICAL AND STATISTICAL RELATIONSHIP

The main concern of mathematical economics is to express

economic theory in mathematical form without regard to

measurability or empirical verification of the theory.

Economic statistics is mainly concerned with collecting,

processing and presenting economic data in the form of charts and

tables. These are the jobs of the economic statistician. Economic data

collected by public and private agencies are non-experimental and

likely to contain errors of measurement.

1.3: GOALS OF ECONOMETRICS

POLICY MAKING: We apply the various techniques in

order to obtain reliable estimates of the individual

coefficients of the economic relationship from which we may

evaluate parameters of economic theory. The knowledge of

the numerical value of these coefficients is very important for

the decision of firms as well as for the formulation of the

economic policy of the government.

FORECASTING: In formulating policy decisions it is

essential to be able to forecast the value of the economic

magnitudes. Such forecasts will enable the policy-maker to

judge whether it is necessary to take any measures in order to

influence the relevant economic variables. Forecasting is

becoming increasingly important both for the regulation of

developed economies as well as for the planning of the

economic development of underdeveloped countries.

ANALYSIS: Econometrics aims primarily at the verification

of economic theories. In this case we say that the purpose of

the research is analysis that is obtaining empirical evidence to

test the explanatory power of economic theories.

1.4: TYPES OF ECONOMETRICS

Econometrics may be divided into two broad categories:

THEORETICAL ECONOMETRICS

Theoretical econometrics is concerned with the development of

appropriate methods for measuring economic relationship specified

by econometric models. Since economic data are observations of real life and are not derived from controlled experiments, econometric methods have been developed for such non-experimental data.

APPLIED ECONOMETRICS

In applied econometrics we use the tools of theoretical

econometrics to study some special field of economics and business,

such as the production function, investment function, demand and

supply function, etc.

Applied econometric methods will be used for estimation of

important quantities, analysis of economic outcomes, markets or

individual behavior, testing theories, and for forecasting. The last of

these is an art and science in itself, and the subject of a vast library of

sources.

1.5: METHODOLOGY OF ECONOMETRICS

Traditional econometric methodology has the following main

points:

1. Statement of theory or hypothesis.

2. Specification of the mathematical model of the theory.

3. Specification of the statistical or econometric model.

4. Obtaining the data.

5. Estimation of the parameters of the econometric model.

6. Hypothesis testing.

7. Forecasting or prediction.

8. Using the model for control or policy purpose.

1. Statement of Theory or Hypothesis

Keynes stated that the fundamental psychological law is that men (women) are disposed, as a rule and on average, to increase their consumption as their income increases, but not by as much as the increase in their income.

2. Specification of the Mathematical Model

Although Keynes postulated a positive relationship

between consumption and income, a mathematical economist

might suggest the following form of consumption function:

$$Y = \beta_1 + \beta_2 X, \qquad 0 < \beta_2 < 1$$

Where: $Y$ = consumption expenditure and $X$ = income,

$\beta_1$ = intercept coefficient and $\beta_2$ = slope coefficient or MPC (marginal propensity to consume).

3. Specification of the Econometric Model of Consumption

Because the relationship between economic variables is inexact, the econometrician would modify the deterministic consumption function as follows:

$$Y = \beta_1 + \beta_2 X + u$$

Where “u” is known as the disturbance, error term or random

(stochastic) variable.

4. Obtaining Data

To estimate the econometric model, that is, to obtain the numerical values of $\beta_1$ and $\beta_2$, we need data, e.g.

Year    Y     X
2004    55    67
2005    58    70
2006    60    72

5. Estimation of the Econometric Model

Regression analysis is the technique used to obtain the estimates of the model. Thus

$$\hat{Y} = 54 + 0.5576X$$

6. Hypothesis Testing

Assuming that the fitted model is a reasonably good

approximation of reality, we have to develop suitable criteria to

find out whether the estimates obtained are in accord with the

expectations of the theory that is being tested.

7. Forecasting or Prediction

If the chosen model does not refute the hypothesis or theory

under consideration, we may use it to predict the future value of

the dependent, or forecast variable Y on the basis of known or

expected future value of the explanatory or predictor variable X.

8. Use of the Model for Control or Policy Purposes

An estimated model may be used for control, or policy

purposes. By appropriate fiscal and monetary policy mix, the

government can manipulate the control variable X to produce the

desired level of the target variable Y.

1.6: THE ROLE OF THE COMPUTER

Regression analysis is carried out with software packages such as MINITAB, EVIEWS, SAS, SPSS, STATA, SHAZAM etc.

1.7: Exercise

1. What is econometrics? What are the types of econometrics?

2. Discuss the methodology of econometrics.

3. Differentiate between statistics and mathematics.

4. What are the goals of econometrics?

Chapter: 2

SIMPLE LINEAR REGRESSION

2.1: THE NATURE OF REGRESSION ANALYSIS

2.1.1: HISTORICAL ORIGIN OF THE TERM REGRESSION

The term regression was introduced by Francis Galton.

Galton found that, although there was a tendency for tall parents to have tall

children and for short parents to have short children, the average

height of children born of parents of a given height tended to move

or “regress” toward the average height in the population as a

whole.

2.1.2: THE MODERN INTERPRETATION OF REGRESSION

Regression analysis is concerned with the study of

dependence of one variable on one or more other variables

with a view to estimating the mean value of the former in terms of

the known or fixed values of the latter.

TERMINOLOGY AND NOTATION

Dependent variable Independent variable

Explained Explanatory

Predictand Predictor

Regressand Regressor

Response Stimulus

Endogenous Exogenous

Controlled Control

2.2: DATA

Collection of information or facts and figures is called data.

2.2.1: TYPES OF DATA

There are three types of data.

Time Series Data: A time series is a set of observations on the

values that a variable takes at different times. Such data may be

collected at regular time intervals, such as daily, weekly, monthly,

quarterly and yearly.

Cross-Section Data: Cross-Section data are data on one or more

variables collected at the same point in time, such as the census of

population conducted by the Census Bureau every 10 years.

Pooled Data: In pooled, or combined, data are elements of both

time series and cross-section data.

Panel, Longitudinal, or Micro panel Data: This is a

special type of pooled data in which the same cross-

sectional unit is surveyed over time.

2.3: METHOD OF ORDINARY LEAST SQUARES

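The method of ordinary least squares chooses the estimates that minimize the sum of squared differences between the observed Y and the estimated Ŷ. Let the population model be

$$Y = \alpha + \beta X + e$$

The estimated model is

$$\hat{Y} = a + bX$$

Then the residual sum of squares is

$$\sum e^2 = \sum (Y - \hat{Y})^2$$

$$\sum e^2 = \sum (Y - a - bX)^2 \qquad \text{eq. (A)}$$

Minimizing eq. (A) w.r.t. “a” and equating to zero:

$$\frac{\partial \sum e^2}{\partial a} = 2\sum (Y - a - bX)(-1)$$

$$0 = -2\sum (Y - a - bX)$$

$$0 = \sum (Y - a - bX)$$

$$0 = \sum Y - na - b\sum X$$

$$\sum Y = na + b\sum X \qquad \text{eq. (1)}$$

Minimizing eq. (A) w.r.t. “b” and equating to zero:

$$\frac{\partial \sum e^2}{\partial b} = 2\sum (Y - a - bX)(-X)$$

$$0 = -2\sum X(Y - a - bX)$$

$$0 = \sum XY - a\sum X - b\sum X^2$$

$$\sum XY = a\sum X + b\sum X^2 \qquad \text{eq. (2)}$$

Dividing eq. (1) by “n” on both sides:

$$\frac{\sum Y}{n} = a + b\frac{\sum X}{n}$$

$$\bar{Y} = a + b\bar{X}$$

$$a = \bar{Y} - b\bar{X}$$

Put the value of “a” in eq. (2):

$$\sum XY = (\bar{Y} - b\bar{X})\sum X + b\sum X^2$$

$$\sum XY = \bar{Y}\sum X - b\bar{X}\sum X + b\sum X^2$$

$$\sum XY - \frac{\sum X \sum Y}{n} = b\left\{\sum X^2 - \frac{(\sum X)^2}{n}\right\}$$

$$b = \frac{n\sum XY - \sum X\sum Y}{n\sum X^2 - (\sum X)^2}$$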
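The closed-form solution of the two normal equations can be sketched in a few lines of Python. This is only an illustration: the function name `ols_simple` is hypothetical, and the five (X, Y) pairs are the ones used in this chapter's worked example.

```python
def ols_simple(x, y):
    """Return (a, b) for the fitted line Y-hat = a + b*X.

    Uses the solution of the two normal equations:
    b = (n*SumXY - SumX*SumY) / (n*SumX^2 - (SumX)^2),  a = Ybar - b*Xbar.
    """
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b


x = [30, 60, 90, 120, 150]
y = [50, 80, 120, 130, 180]
a, b = ols_simple(x, y)
print("intercept a =", round(a, 4), " slope b =", round(b, 4))
```

With the unrounded slope the intercept comes out slightly different from a hand calculation that first rounds the slope to two decimals.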

2.4: PROPERTIES OF LEAST SQUARE REGRESSION LINE

It passes through the point of means $(\bar{X}, \bar{Y})$.

The mean of the estimated values $\hat{Y}$ is equal to the mean of the actual values $Y$.

The mean value of the residuals is 0.

The residuals are uncorrelated with the predicted $\hat{Y}$.

The residuals are uncorrelated with $X$.
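These properties can be checked numerically. The sketch below assumes the same illustrative data as the chapter's worked example, fits the line in deviation form, and confirms that each quantity is zero up to floating-point rounding; the tolerance `1e-8` and the dictionary name `checks` are arbitrary choices made for this illustration.

```python
x = [30, 60, 90, 120, 150]
y = [50, 80, 120, 130, 180]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept in deviation form.
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
a = y_bar - b * x_bar

fitted = [a + b * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

# Each entry should be (numerically) zero if the property holds.
checks = {
    "line passes through (X-bar, Y-bar)": abs((a + b * x_bar) - y_bar),
    "mean of fitted equals mean of Y": abs(sum(fitted) / n - y_bar),
    "sum of residuals": abs(sum(resid)),
    "sum of residual * fitted": abs(sum(e * f for e, f in zip(resid, fitted))),
    "sum of residual * X": abs(sum(e * xi for e, xi in zip(resid, x))),
}
for name, value in checks.items():
    print(f"{name}: holds = {value < 1e-8}")
```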

2.5: THE ASSUMPTIONS UNDERLYING THE

METHOD OF LEAST SQUARES: THE

CLASSICAL LINEAR REGRESSION

MODEL

1. Linear Regression Model

The regression model is linear in the parameters. That is

$$Y_i = \alpha + \beta X_i + U_i$$

2. X Values Are Fixed in Repeated Sampling

Values taken by the regressor X are considered fixed in

repeated samples. More technically, X is assumed to be

nonstochastic.

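3. Zero Mean Value of Disturbance Term $U_i$

Given the value of X, the mean or expected value of the random disturbance term $U_i$ is zero. Technically, the conditional mean value of $U_i$ is zero. That is

$$E[U_i \mid X_i] = 0$$

4. Homoscedasticity or Equal Variances of $U_i$

Given the value of X, the variance of $U_i$ is the same for all observations. That is, the conditional variances of $U_i$ are identical.

$$\operatorname{Var}[U_i \mid X_i] = E[U_i - E(U_i \mid X_i)]^2 = E[U_i^2 \mid X_i] = \sigma^2$$

5. No Autocorrelation between the Disturbance Terms $U_i$

Given any two X values $X_i$ and $X_j$ (i≠j), the correlation between any two $U_i$ and $U_j$ (i≠j) is zero.

$$\operatorname{Cov}[U_i, U_j \mid X_i, X_j] = E[\{U_i - E(U_i) \mid X_i\}\{U_j - E(U_j) \mid X_j\}]$$

$$\operatorname{Cov}[U_i, U_j \mid X_i, X_j] = E[(U_i \mid X_i)(U_j \mid X_j)]$$

$$\operatorname{Cov}[U_i, U_j \mid X_i, X_j] = 0$$

6. Zero Covariance between $U_i$ and $X_i$

$$\operatorname{Cov}(U_i, X_i) = E[U_i - E(U_i)][X_i - E(X_i)]$$

$$\operatorname{Cov}(U_i, X_i) = E[U_i(X_i - E(X_i))], \quad \text{since } E(U_i) = 0$$

$$\operatorname{Cov}(U_i, X_i) = E(U_i X_i) - E(X_i)E(U_i)$$

$$\operatorname{Cov}(U_i, X_i) = 0$$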

7. The Number of Observations “n” Must Be Greater than the Number of Parameters to Be Estimated

Alternatively, the number of observations “n” must be

greater than the number of explanatory variables.

8. Variability in X Values

The X values in a given sample must not all be the same.

Technically variance of X must be a finite positive number.

9. The Regression Model is Correctly Specified

Alternatively, there is no specification bias or error in the

model used in empirical analysis.

10. There is No Perfect Multicollinearity

There is no perfect linear relationship among the

explanatory variables.

2.6: PROPERTIES OF LEAST SQUARES ESTIMATORS

2.6.1: SMALL SAMPLE PROPERTIES OF THE LEAST SQUARES ESTIMATORS

I. Unbiasedness: An estimator is said to be unbiased if the

expected value is equal to the true population parameter.

II. Least Variance: An estimate is best when it has the smallest

variance as compared with any other estimate obtained from other

econometric methods.

III. Efficiency: An unbiased estimator is said to be efficient if the

variance of the sampling distribution is smaller than that of the

sampling distribution of any other unbiased estimator of the same

parameter.

IV. Linearity: An estimator is linear if it is a linear function of

the sample observation i.e. if it is determined by a linear

combination of the data.

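V. Mean Square Error: If there is more than one unbiased estimator, the problem arises of which one to choose out of the class of unbiased estimators. Moreover, one wants the sampling variance as well as the bias to be minimum. These problems are tackled with the help of the mean-squared error. The mean-squared error of an estimator $\hat{\theta}$ of $\theta$ is given as

$$\text{M.S.E}(\hat{\theta}) = E[\hat{\theta} - \theta]^2$$

$$\text{M.S.E}(\hat{\theta}) = E[\hat{\theta} - E(\hat{\theta}) + E(\hat{\theta}) - \theta]^2$$

$$\text{M.S.E}(\hat{\theta}) = E[\hat{\theta} - E(\hat{\theta})]^2 + [E(\hat{\theta}) - \theta]^2$$

$$\text{M.S.E}(\hat{\theta}) = \operatorname{Var}(\hat{\theta}) + (\text{Bias})^2$$

Where Bias $= E(\hat{\theta}) - \theta$. The cross-product term vanishes because $E[\hat{\theta} - E(\hat{\theta})] = 0$.

The mean squared error will be minimum if $\hat{\theta}$ is an unbiased estimator of $\theta$, i.e. $E(\hat{\theta}) = \theta$, and when $\operatorname{Var}(\hat{\theta})$ is minimum.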

VI. Sufficiency: An estimator is said to be sufficient if the statistic

used as estimator uses all the information that is contained in the

sample.

VII. Consistency: An estimator is said to be consistent if the

statistic to be used as estimator becomes closer and closer to the

population parameter being estimated as the sample size “n”

increases.

VIII. BLUE: An estimator that is linear, unbiased and has

minimum variance is called best linear, unbiased estimator or

BLUE.

2.6.2: LARGE SAMPLE PROPERTIES OF LEAST

SQUARES ESTIMATORS

(ASYMPTOTIC PROPERTIES)

I. Asymptotic Unbiasedness: An estimator $\hat{b}$ is an asymptotically unbiased estimator of the true population parameter b if the asymptotic mean of $\hat{b}$ is equal to b. That is

$$\lim_{n \to \infty} E[\hat{b}] = b$$

II. Consistency: An estimator $\hat{b}$ is a consistent estimator of the true population parameter b if it satisfies two conditions:

(a) $\hat{b}$ must be asymptotically unbiased. That is $\lim_{n \to \infty} E[\hat{b}] = b$

(b) The variance of $\hat{b}$ must approach zero as n tends to infinity. That is $\lim_{n \to \infty} \operatorname{Var}[\hat{b}] = 0$

III. Asymptotic Efficiency: An estimator $\hat{b}$ is an asymptotically efficient estimator of the true population parameter b if

(a) $\hat{b}$ is consistent.

(b) $\hat{b}$ has a smaller asymptotic variance as compared with any other consistent estimator.

2.6* GAUSS MARKOV THEOREM STATEMENT:

Least squares theory was put forth by Gauss in 1809, and the minimum variance approach to the estimators of the regression coefficients was proposed by Markov in 1900. Since determining the minimum variance linear unbiased estimator involves both concepts, the theorem is known as the Gauss-Markov theorem. It can be stated as follows:

Let $Y_1, Y_2, \dots, Y_n$ be n independent variables with mean $E(Y_i)$ and variance $\sigma^2$. The minimum variance linear unbiased estimators of the regression coefficients $\beta_j$ are $\hat{\beta}_j$ (j=1,2,..,k).

Under the terms and conditions imposed above, the minimum variance linear unbiased estimators of the regression coefficients are identically the same as the least squares estimators.

The combination of the above two statements is known as the Gauss-Markov theorem, i.e. the least squares estimators of $\alpha$ and $\beta$ are best linear unbiased estimators (BLUE).

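PROOF:

We use the model

$$Y_i = \alpha + \beta X_i + U_i$$

with deviations $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$.

FOR $\hat{\beta}$

LINEARITY:

$$\hat{\beta} = \frac{\sum x_i y_i}{\sum x_i^2}$$

$$\hat{\beta} = \frac{\sum x_i (Y_i - \bar{Y})}{\sum x_i^2}$$

$$\hat{\beta} = \frac{\sum x_i Y_i}{\sum x_i^2} - \frac{\bar{Y}\sum x_i}{\sum x_i^2}$$

$$\hat{\beta} = \frac{\sum x_i Y_i}{\sum x_i^2}, \quad \text{since } \sum x_i = 0$$

$$\hat{\beta} = \sum w_i Y_i$$

Where $w_i = \dfrac{x_i}{\sum x_i^2}$ are nonstochastic weights,

$$\hat{\beta} = w_1 Y_1 + w_2 Y_2 + \dots + w_n Y_n$$

This is a linear function of the sample observations $Y_i$.

UNBIASEDNESS:

$$\hat{\beta} = \sum w_i Y_i$$

$$\hat{\beta} = \sum w_i (\alpha + \beta X_i + U_i)$$

$$\hat{\beta} = \alpha\sum w_i + \beta\sum w_i X_i + \sum w_i U_i \qquad \text{eq. (1)}$$

Properties of $w_i$:

1. $\sum w_i = 0$

2. $\sum w_i^2 = \dfrac{1}{\sum x_i^2}$

3. $\sum w_i x_i = \sum w_i X_i = 1$

Put these results in eq. (1).

$$\hat{\beta} = \alpha(0) + \beta(1) + \sum w_i U_i$$

$$\hat{\beta} = \beta + \sum w_i U_i \qquad \text{eq. (2)}$$

$$E(\hat{\beta}) = E(\beta) + \sum w_i E(U_i)$$

$$E(\hat{\beta}) = \beta$$

Which shows that $\hat{\beta}$ is an unbiased estimator of $\beta$.

Variance of $\hat{\beta}$:

By definition

$$\operatorname{Var}(\hat{\beta}) = E[\hat{\beta} - E(\hat{\beta})]^2$$

$$\operatorname{Var}(\hat{\beta}) = E[\hat{\beta} - \beta]^2$$

$$\operatorname{Var}(\hat{\beta}) = E\left[\sum w_i U_i\right]^2 \qquad \text{from eq. (2)}$$

$$\operatorname{Var}(\hat{\beta}) = E[w_1^2 U_1^2 + \dots + w_n^2 U_n^2 + 2w_1 w_2 U_1 U_2 + \dots]$$

$$\operatorname{Var}(\hat{\beta}) = \sigma^2 \sum w_i^2, \quad \text{since } E(U_i^2) = \sigma^2,\ E(U_i U_j) = 0$$

$$\operatorname{Var}(\hat{\beta}) = \frac{\sigma^2}{\sum x_i^2}$$

And

FOR $\hat{\alpha}$

LINEARITY:

$$\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}$$

$$\hat{\alpha} = \frac{\sum Y_i}{n} - \bar{X}\sum w_i Y_i$$

$$\hat{\alpha} = \sum\left[\frac{1}{n} - \bar{X} w_i\right] Y_i$$

Which is a linear function of the sample observations $Y_i$.

UNBIASEDNESS:

$$\hat{\alpha} = \sum\left[\frac{1}{n} - \bar{X} w_i\right](\alpha + \beta X_i + U_i)$$

$$\hat{\alpha} = \alpha + \sum\left[\frac{1}{n} - \bar{X} w_i\right] U_i, \quad \text{using } \sum w_i = 0,\ \sum w_i X_i = 1$$

Taking expectation on both sides

$$E(\hat{\alpha}) = \alpha + \sum\left[\frac{1}{n} - \bar{X} w_i\right] E(U_i)$$

$$E(\hat{\alpha}) = \alpha$$

Variance of $\hat{\alpha}$:

By definition

$$\operatorname{Var}(\hat{\alpha}) = E[\hat{\alpha} - E(\hat{\alpha})]^2$$

$$\operatorname{Var}(\hat{\alpha}) = E[\hat{\alpha} - \alpha]^2$$

$$\operatorname{Var}(\hat{\alpha}) = E\left[\sum\left(\frac{1}{n} - \bar{X} w_i\right) U_i\right]^2$$

$$\operatorname{Var}(\hat{\alpha}) = \sigma^2 \sum\left[\frac{1}{n} - \bar{X} w_i\right]^2, \quad \text{since } E(U_i^2) = \sigma^2,\ E(U_i U_j) = 0$$

$$\operatorname{Var}(\hat{\alpha}) = \sigma^2 \sum\left[\frac{1}{n^2} - \frac{2\bar{X} w_i}{n} + \bar{X}^2 w_i^2\right]$$

$$\operatorname{Var}(\hat{\alpha}) = \sigma^2\left[\frac{1}{n} + \frac{\bar{X}^2}{\sum x_i^2}\right]$$

$$\operatorname{Var}(\hat{\alpha}) = \frac{\sigma^2 \sum X_i^2}{n\sum x_i^2}$$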

2.6** MINIMUM VARIANCE PROPERTY OF LEAST SQUARE

ESTIMATORS

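Suppose $\beta^*$ is any other linear unbiased estimator of $\beta$:

$$\beta^* = \sum c_i Y_i, \quad \text{where } c_i = w_i + d_i \text{ and } w_i = \frac{x_i}{\sum x_i^2}$$

$$\beta^* = \alpha\sum c_i + \beta\sum c_i X_i + \sum c_i U_i \qquad \text{eq. 2}$$

Taking expectation on both sides

$$E(\beta^*) = \alpha\sum c_i + \beta\sum c_i X_i$$

$$E(\beta^*) = \beta, \quad \text{provided } \sum c_i = 0 \text{ and } \sum c_i X_i = 1$$

These conditions imply $\sum d_i = 0$ and $\sum d_i X_i = 0$, so that $\sum w_i d_i = 0$.

Variance of $\beta^*$

$$\operatorname{Var}(\beta^*) = E[\beta^* - E(\beta^*)]^2$$

$$\operatorname{Var}(\beta^*) = E[\beta^* - \beta]^2$$

$$\operatorname{Var}(\beta^*) = E\left[\sum c_i U_i\right]^2 \qquad \text{from eq. 2}$$

$$\operatorname{Var}(\beta^*) = \sigma^2\sum c_i^2, \quad \text{since } E(U_i^2) = \sigma^2,\ E(U_i U_j) = 0$$

$$\operatorname{Var}(\beta^*) = \sigma^2\left[\sum w_i^2 + \sum d_i^2 + 2\sum w_i d_i\right]$$

$$\operatorname{Var}(\beta^*) = \frac{\sigma^2}{\sum x_i^2} + \sigma^2\sum d_i^2$$

$$\operatorname{Var}(\beta^*) = \operatorname{Var}(\hat{\beta}) + \sigma^2\sum d_i^2 \geq \operatorname{Var}(\hat{\beta})$$

MINIMUM VARIANCE PROPERTY OF $\hat{\alpha}$:

Suppose

$$\alpha^* = \sum\left[\frac{1}{n} - \bar{X} c_i\right] Y_i$$

Where $c_i = w_i + d_i$, so

$$\alpha^* = \alpha - \alpha\bar{X}\sum c_i + \beta\bar{X} - \beta\bar{X}\sum c_i X_i + \sum\left[\frac{1}{n} - \bar{X} c_i\right] U_i$$

Taking expectations on both sides, with $\sum c_i = 0$, $\sum c_i X_i = 1$ and $E(U_i) = 0$,

$$E(\alpha^*) = \alpha$$

so $\alpha^*$ is an unbiased estimator of $\alpha$.

Variance of $\alpha^*$

$$\operatorname{Var}(\alpha^*) = E[\alpha^* - E(\alpha^*)]^2$$

$$\operatorname{Var}(\alpha^*) = E[\alpha^* - \alpha]^2$$

$$\operatorname{Var}(\alpha^*) = E\left[\sum\left(\frac{1}{n} - \bar{X} c_i\right) U_i\right]^2$$

$$\operatorname{Var}(\alpha^*) = \sigma^2\sum\left[\frac{1}{n} - \bar{X} c_i\right]^2, \quad \text{since } E(U_i^2) = \sigma^2,\ E(U_i U_j) = 0$$

$$\operatorname{Var}(\alpha^*) = \sigma^2\left[\frac{1}{n} + \bar{X}^2\sum c_i^2\right]$$

$$\operatorname{Var}(\alpha^*) = \sigma^2\left[\frac{1}{n} + \bar{X}^2\sum w_i^2 + \bar{X}^2\sum d_i^2\right]$$

$$\operatorname{Var}(\alpha^*) = \sigma^2\left[\frac{1}{n} + \frac{\bar{X}^2}{\sum x_i^2}\right] + \sigma^2\bar{X}^2\sum d_i^2$$

$$\operatorname{Var}(\alpha^*) = \operatorname{Var}(\hat{\alpha}) + \sigma^2\bar{X}^2\sum d_i^2 \geq \operatorname{Var}(\hat{\alpha})$$

Hence proved.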

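2.6*** COVARIANCE OF $\hat{\alpha}$ AND $\hat{\beta}$

$$\operatorname{Cov}(\hat{\alpha}, \hat{\beta}) = E[\hat{\alpha} - E(\hat{\alpha})][\hat{\beta} - E(\hat{\beta})]$$

$$\operatorname{Cov}(\hat{\alpha}, \hat{\beta}) = E[\hat{\alpha} - \alpha][\hat{\beta} - \beta]$$

So

$$\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X} = \alpha + \beta\bar{X} + \bar{U} - \hat{\beta}\bar{X}$$

And

$$\hat{\alpha} - \alpha = -\bar{X}(\hat{\beta} - \beta) + \bar{U}$$

Now we get, since $E[\bar{U}(\hat{\beta} - \beta)] = 0$,

$$\operatorname{Cov}(\hat{\alpha}, \hat{\beta}) = -\bar{X}\,E[\hat{\beta} - \beta][\hat{\beta} - \beta] = -\bar{X}\operatorname{Var}(\hat{\beta}) = -\frac{\bar{X}\sigma^2}{\sum x_i^2}$$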

2.7: VARIANCE OF DISTURBANCE TERM 𝑼𝒊

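Let

$$Y_i = \alpha + \beta X_i + U_i$$

$$\bar{Y} = \alpha + \beta\bar{X} + \bar{U}$$

By subtraction

$$y_i = \beta x_i + (U_i - \bar{U}) \qquad \text{eq. 1}$$

For sample

$$Y_i = \hat{\alpha} + \hat{\beta} X_i + e_i, \qquad \bar{Y} = \hat{\alpha} + \hat{\beta}\bar{X}$$

By subtraction

$$e_i = y_i - \hat{\beta} x_i \qquad \text{eq. 2}$$

Making substitution in $e_i$, using eq. 1 and eq. 2:

$$e_i = -(\hat{\beta} - \beta) x_i + (U_i - \bar{U})$$

Applying squares and sum on both sides:

$$\sum e_i^2 = (\hat{\beta} - \beta)^2\sum x_i^2 + \sum (U_i - \bar{U})^2 - 2(\hat{\beta} - \beta)\sum x_i (U_i - \bar{U})$$

Taking expectation on both sides:

$$E\left(\sum e_i^2\right) = \sum x_i^2\,E[\hat{\beta} - \beta]^2 + E\left[\sum (U_i - \bar{U})^2\right] - 2E\left[(\hat{\beta} - \beta)\sum x_i (U_i - \bar{U})\right] \qquad \text{eq. (A)}$$

Now, the first term:

$$\sum x_i^2\,E[\hat{\beta} - \beta]^2 = \sum x_i^2 \cdot \frac{\sigma^2}{\sum x_i^2} = \sigma^2 \qquad \text{eq. (i)}$$

The second term:

$$E\left[\sum (U_i - \bar{U})^2\right] = E\left[\sum U_i^2 - n\bar{U}^2\right] = n\sigma^2 - n\cdot\frac{\sigma^2}{n} = (n - 1)\sigma^2 \qquad \text{eq. (ii)}$$

The third term, using $\hat{\beta} - \beta = \sum x_i U_i / \sum x_i^2$ and $\sum x_i (U_i - \bar{U}) = \sum x_i U_i$:

$$-2E\left[(\hat{\beta} - \beta)\sum x_i U_i\right] = -2E\left[\frac{(\sum x_i U_i)^2}{\sum x_i^2}\right] = -2\sigma^2 \qquad \text{eq. (iii)}$$

Put eq. (i), (ii) and (iii) in eq. (A):

$$E\left(\sum e_i^2\right) = \sigma^2 + (n - 1)\sigma^2 - 2\sigma^2$$

$$E\left(\sum e_i^2\right) = (n - 2)\sigma^2$$

$$E\left(\frac{\sum e_i^2}{n - 2}\right) = \sigma^2$$

$$E(\hat{\sigma}^2) = \sigma^2$$

This shows that $\hat{\sigma}^2 = \dfrac{\sum e_i^2}{n - 2}$ is an unbiased estimator of $\sigma^2$.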

2.8: DISTRIBUTION OF DEPENDENT VARIABLE Y

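Let

$$Y_i = \alpha + \beta X_i + U_i$$

Mean of $Y_i$:

$$E[Y_i] = E[\alpha + \beta X_i + U_i]$$

$$E[Y_i] = \alpha + \beta X_i + E(U_i)$$

$$E[Y_i] = \alpha + \beta X_i$$

Variance of $Y_i$:

$$\operatorname{Var}[Y_i] = E[Y_i - E(Y_i)]^2$$

$$\operatorname{Var}[Y_i] = E[U_i^2] = \sigma^2$$

The shape of the distribution of $Y_i$ is determined by the distribution of $U_i$ and by the assumptions of OLS. We assume that the distribution of $U_i$ is normal, and we also know that any linear function of a normal variable is also normal.

Since $U_i \sim N(0, \sigma^2)$, it follows that $Y_i \sim N(\alpha + \beta X_i, \sigma^2)$.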

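2.9: MAXIMUM LIKELIHOOD ESTIMATORS OF $\alpha$, $\beta$ AND $\sigma^2$

$$L(\alpha, \beta, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{(Y_i - \alpha - \beta X_i)^2}{2\sigma^2}\right\}$$

$$\ln L = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum (Y_i - \alpha - \beta X_i)^2 \qquad \text{eq. (A)}$$

Differentiate eq. (A) w.r.t. $\alpha$ and equating to zero.

$$\frac{\partial \ln L}{\partial \alpha} = -\frac{1}{2\sigma^2}\cdot 2\sum (Y_i - \alpha - \beta X_i)(-1)$$

$$0 = \frac{1}{\sigma^2}\sum (Y_i - \hat{\alpha} - \hat{\beta} X_i)$$

$$0 = \sum (Y_i - \hat{\alpha} - \hat{\beta} X_i)$$

$$0 = \sum Y_i - n\hat{\alpha} - \hat{\beta}\sum X_i$$

$$\sum Y_i = n\hat{\alpha} + \hat{\beta}\sum X_i \qquad \text{eq. 1}$$

Differentiate eq. (A) w.r.t. $\beta$ and equating to zero.

$$\frac{\partial \ln L}{\partial \beta} = -\frac{1}{2\sigma^2}\cdot 2\sum (Y_i - \alpha - \beta X_i)(-X_i)$$

$$0 = \frac{1}{\sigma^2}\sum X_i (Y_i - \hat{\alpha} - \hat{\beta} X_i)$$

$$0 = \sum X_i (Y_i - \hat{\alpha} - \hat{\beta} X_i)$$

$$\sum X_i Y_i - \hat{\alpha}\sum X_i - \hat{\beta}\sum X_i^2 = 0$$

Differentiate eq. (A) w.r.t. $\sigma^2$ and equating to zero.

$$\frac{\partial \ln L}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum (Y_i - \alpha - \beta X_i)^2$$

$$0 = -\frac{n}{2\tilde{\sigma}^2} + \frac{1}{2\tilde{\sigma}^4}\sum e_i^2$$

$$0 = -n\tilde{\sigma}^2 + \sum e_i^2$$

$$\tilde{\sigma}^2 = \frac{\sum e_i^2}{n}$$

Which is a biased estimator of $\sigma^2$.

Taking expectations on both sides.

$$E(\tilde{\sigma}^2) = \frac{1}{n}E\left(\sum e_i^2\right)$$

$$E(\tilde{\sigma}^2) = \frac{(n - 2)\sigma^2}{n}$$

$$E(\tilde{\sigma}^2) = \sigma^2 - \frac{2\sigma^2}{n} \neq \sigma^2$$

Hence the M.L.E. of $\sigma^2$ is a biased estimator. But the M.L.E.s of $\alpha$ and $\beta$ satisfy the same normal equations as the least squares estimators and are therefore unbiased.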

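2.10: TEST OF GOODNESS OF FIT $R^2$

The ratio of explained variation to the total variation is called the coefficient of determination. The $R^2$ varies between 0 and 1.

Total Variation = Unexplained Variation + Explained Variation

$$\sum (Y - \bar{Y})^2 = \sum (Y - \hat{Y})^2 + \sum (\hat{Y} - \bar{Y})^2$$

In deviation form:

$$\sum y^2 = \sum e^2 + \sum \hat{y}^2$$

Where $\sum \hat{y}^2 = \hat{\beta}^2\sum x^2 = \hat{\beta}\sum xy$.

$$R^2 = \frac{\text{Explained Variation}}{\text{Total Variation}} = \frac{\sum \hat{y}^2}{\sum y^2} = 1 - \frac{\sum e^2}{\sum y^2}$$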
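As a small numerical illustration of this decomposition, the following sketch computes the explained, unexplained and total variation in deviation form. The function name `variation_decomposition` is hypothetical and the data are the five pairs from this chapter's worked example.

```python
import math


def variation_decomposition(x, y):
    """Return (explained, unexplained, total, r_squared) for a simple regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    total = sum((yi - y_bar) ** 2 for yi in y)  # sum of y^2 in deviation form
    slope = sxy / sxx
    explained = slope * sxy                     # sum of y-hat^2 in deviation form
    unexplained = total - explained             # sum of e^2
    return explained, unexplained, total, explained / total


x = [30, 60, 90, 120, 150]
y = [50, 80, 120, 130, 180]
explained, unexplained, total, r2 = variation_decomposition(x, y)
r = math.sqrt(r2)  # positive root, because the slope is positive here
print(round(explained, 2), round(unexplained, 2), round(total, 2))
print("R^2 =", round(r2, 4), " r =", round(r, 4))
```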

2.11: MEAN PREDICTION

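$$\hat{Y}_0 = \hat{\alpha} + \hat{\beta} X_0$$

Where $\hat{Y}_0$ is the estimator of the mean value $E(Y \mid X_0) = \alpha + \beta X_0$.

$$E(\hat{Y}_0) = E(\hat{\alpha}) + E(\hat{\beta}) X_0 = \alpha + \beta X_0$$

$$\operatorname{Var}(\hat{Y}_0) = \operatorname{Var}(\hat{\alpha} + \hat{\beta} X_0)$$

$$\operatorname{Var}(\hat{Y}_0) = \operatorname{Var}(\hat{\alpha}) + X_0^2\operatorname{Var}(\hat{\beta}) + 2X_0\operatorname{Cov}(\hat{\alpha}, \hat{\beta})$$

$$\operatorname{Var}(\hat{Y}_0) = \sigma^2\left[\frac{1}{n} + \frac{\bar{X}^2}{\sum x^2}\right] + \frac{X_0^2\sigma^2}{\sum x^2} - \frac{2X_0\bar{X}\sigma^2}{\sum x^2}$$

$$\operatorname{Var}(\hat{Y}_0) = \sigma^2\left[\frac{1}{n} + \frac{X_0^2 - 2X_0\bar{X} + \bar{X}^2}{\sum x^2}\right]$$

$$\operatorname{Var}(\hat{Y}_0) = \sigma^2\left[\frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum x^2}\right]$$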

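2.12: INDIVIDUAL PREDICTION OF Y FOR GIVEN VALUE X

$$Y_0 = \alpha + \beta X_0 + U_0, \qquad \hat{Y}_0 = \hat{\alpha} + \hat{\beta} X_0$$

Prediction error is

$$Y_0 - \hat{Y}_0 = (\alpha - \hat{\alpha}) + (\beta - \hat{\beta}) X_0 + U_0$$

$$E(Y_0 - \hat{Y}_0) = E[\alpha - \hat{\alpha}] + X_0 E[\beta - \hat{\beta}] + E[U_0]$$

$$E(Y_0 - \hat{Y}_0) = 0$$

By definition variance of prediction error is:

$$\operatorname{Var}(Y_0 - \hat{Y}_0) = E[(Y_0 - \hat{Y}_0) - E(Y_0 - \hat{Y}_0)]^2$$

$$\operatorname{Var}(Y_0 - \hat{Y}_0) = E[(\alpha - \hat{\alpha}) + (\beta - \hat{\beta}) X_0 + U_0]^2$$

$$\operatorname{Var}(Y_0 - \hat{Y}_0) = \operatorname{Var}(\hat{\alpha}) + X_0^2\operatorname{Var}(\hat{\beta}) + 2X_0\operatorname{Cov}(\hat{\alpha}, \hat{\beta}) + \operatorname{Var}(U_0)$$

$$\operatorname{Var}(Y_0 - \hat{Y}_0) = \sigma^2\left[\frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum x^2}\right] + \sigma^2$$

$$\operatorname{Var}(Y_0 - \hat{Y}_0) = \sigma^2\left[1 + \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum x^2}\right]$$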
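The two variance formulas differ only by the extra $\sigma^2$ contributed by the new disturbance $U_0$. A minimal sketch, assuming a hypothetical helper `prediction_variances` and illustrative inputs (the X values of the chapter's example, a variance estimate of 90, and a prediction point of 60 chosen purely for illustration):

```python
def prediction_variances(x, sigma2, x0):
    """Return (variance of mean prediction, variance of individual prediction) at x0."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    leverage = 1 / n + (x0 - x_bar) ** 2 / sxx
    var_mean = sigma2 * leverage             # sigma^2 [1/n + (X0 - Xbar)^2 / sum x^2]
    var_individual = sigma2 * (1 + leverage)  # adds sigma^2 for the new disturbance
    return var_mean, var_individual


x = [30, 60, 90, 120, 150]
var_mean, var_individual = prediction_variances(x, sigma2=90.0, x0=60)
print("mean prediction variance:", round(var_mean, 4))
print("individual prediction variance:", round(var_individual, 4))
```

Both variances are smallest at $X_0 = \bar{X}$ and grow as the prediction point moves away from the sample mean.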

2.13: SAMPLING DISTRIBUTIONS AND CONFIDENCE INTERVAL

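Use the z-test if $\sigma^2$ is known or n is large; otherwise we use the t-test.

$$Z = \frac{\hat{\beta} - \beta}{\sigma / \sqrt{\sum x^2}}$$

and

$$t = \frac{\hat{\beta} - \beta}{SE(\hat{\beta})} \quad \text{with } (n - 2) \text{ d.f.}$$

$$SE(\hat{\alpha}) = \sqrt{\hat{\sigma}^2\left(\frac{\sum X^2}{n\sum x^2}\right)}$$

And

$$SE(\hat{\beta}) = \sqrt{\left(\frac{\hat{\sigma}^2}{\sum x^2}\right)}$$

Confidence Interval for $\alpha$: $\hat{\alpha} \pm t_{\alpha/2}\,SE(\hat{\alpha})$

Confidence Interval for $\beta$: $\hat{\beta} \pm t_{\alpha/2}\,SE(\hat{\beta})$

Confidence Interval for Mean Prediction:

$$\hat{Y}_0 \pm t_{\alpha/2}\,\hat{\sigma}\sqrt{\left[\frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum x^2}\right]}$$

Confidence Interval for Individual Prediction:

$$\hat{Y}_0 \pm t_{\alpha/2}\,\hat{\sigma}\sqrt{\left[1 + \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum x^2}\right]}$$

Confidence Interval for $\sigma^2$:

$$\frac{(n - 2)\hat{\sigma}^2}{\chi^2_{\alpha/2}} \leq \sigma^2 \leq \frac{(n - 2)\hat{\sigma}^2}{\chi^2_{1 - \alpha/2}}$$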

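Example: Given data

X    30    60    90    120    150
Y    50    80    120   130    180

i) Estimate the model $Y = \alpha + \beta X + U$.
ii) Estimate Y when X = 60.
iii) Test the significance of $\alpha$ and $\beta$.
iv) Find the 95% confidence interval of $\alpha$ and the 90% confidence interval of $\beta$.
v) Estimate $\operatorname{Cov}(\hat{\alpha}, \hat{\beta})$.
vi) Estimate mean and individual prediction when $X = X_0$.
vii) $R^2$ and r.

Solution:

X      Y      XY       X²       Y²
30     50     1500     900      2500
60     80     4800     3600     6400
90     120    10800    8100     14400
120    130    15600    14400    16900
150    180    27000    22500    32400
450    560    59700    49500    72600

i) $Y = \alpha + \beta X + U_i$

$$\hat{\beta} = \frac{n\sum XY - \sum X\sum Y}{n\sum X^2 - (\sum X)^2} = \frac{5(59700) - (450)(560)}{5(49500) - (450)^2} = \frac{46500}{45000} = 1.03$$

$$\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X} = 112 - 1.03(90) = 19.3$$

$$\hat{Y} = 19.3 + 1.03X$$

ii) When X = 60

$$\hat{Y} = 19.3 + 1.03(60) = 19.3 + 61.8 = 81.1$$

iii) Testing for $\alpha$

a) $H_0: \alpha = 0$ against $H_1: \alpha \neq 0$
b) Choose level of significance at $\alpha = 0.05$
c) Test statistic

$$t = \frac{\hat{\alpha}}{SE(\hat{\alpha})}, \qquad SE(\hat{\alpha}) = \sqrt{\hat{\sigma}^2\left(\frac{\sum X^2}{n\sum x^2}\right)} \quad \text{with n-2 d.f.}$$

d) Computation:

$$\sum x^2 = 49500 - \frac{(450)^2}{5} = 9000, \quad \sum y^2 = 72600 - \frac{(560)^2}{5} = 9880, \quad \sum xy = 59700 - \frac{(450)(560)}{5} = 9300$$

$$\sum e^2 = \sum y^2 - \frac{(\sum xy)^2}{\sum x^2} = 9880 - 9610 = 270, \qquad \hat{\sigma}^2 = \frac{270}{3} = 90$$

$$SE(\hat{\alpha}) = \sqrt{90\left[\frac{49500}{5(9000)}\right]} = \sqrt{99} = 9.95, \qquad t = \frac{19.3}{9.95} = 1.94$$

e) Critical region: $|t| \geq t_{0.025}(3) = 3.182$
f) Conclusion: Since our calculated value is less than the table value, we accept $H_0$ and may conclude that the null hypothesis is better than the alternative hypothesis.

Testing for $\beta$

a) $H_0: \beta = 0$ against $H_1: \beta \neq 0$
b) Choose level of significance at $\alpha = 0.05$
c) Test statistic

$$t = \frac{\hat{\beta}}{SE(\hat{\beta})} \quad \text{with n-2 d.f.}$$

d) Computation:

$$SE(\hat{\beta}) = \sqrt{\frac{90}{9000}} = 0.1, \qquad t = \frac{1.03}{0.1} = 10.3$$

e) Critical region: $|t| \geq t_{0.025}(3) = 3.182$
f) Conclusion: Since our calculated value is greater than the table value, we reject $H_0$ and may conclude that the alternative hypothesis is better.

iv) 95% confidence interval for $\alpha$:

$$\hat{\alpha} \pm t_{\alpha/2}\sqrt{\hat{\sigma}^2\left(\frac{\sum X^2}{n\sum x^2}\right)}$$

$$19.3 \pm 3.182(9.95)$$

$$19.3 - 31.66 \leq \alpha \leq 19.3 + 31.66$$

90% confidence interval for $\beta$:

$$\hat{\beta} \pm t_{\alpha/2}\sqrt{\frac{\hat{\sigma}^2}{\sum x^2}} = 1.03 \pm 2.353(0.1)$$

$$0.7947 \leq \beta \leq 1.2653$$

v) Covariance:

$$\operatorname{Cov}(\hat{\alpha}, \hat{\beta}) = -\frac{\bar{X}\hat{\sigma}^2}{\sum x^2} = -\frac{90(90)}{9000} = -0.9$$

vi) Mean prediction:

When $X = X_0$

$$\hat{Y}_0 = 19.3 + 1.03X_0$$

$$\operatorname{Var}(\hat{Y}_0) = \hat{\sigma}^2\left[\frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum x^2}\right] = 90\left[\frac{1}{5} + \frac{(X_0 - 90)^2}{9000}\right]$$

Individual prediction:

When $X = X_0$

$$\operatorname{Var}(Y_0 - \hat{Y}_0) = \hat{\sigma}^2\left[1 + \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum x^2}\right] = 90\left[1 + \frac{1}{5} + \frac{(X_0 - 90)^2}{9000}\right]$$

vii) $R^2$ and r:

Total Variation = Unexplained Variation + Explained Variation

$$\sum (Y - \bar{Y})^2 = \sum (Y - \hat{Y})^2 + \sum (\hat{Y} - \bar{Y})^2$$

In deviation form:

$$\sum y^2 = \sum e^2 + \sum \hat{y}^2$$

Unexplained Variation $= \sum e^2 = 270$

$$R^2 = 1 - \frac{\sum e^2}{\sum y^2} = 1 - \frac{270}{9880} = 0.9727, \qquad r = \sqrt{0.9727} = 0.9863$$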
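The arithmetic of this example can be reproduced with a short script. It is a sketch only: the two critical values are table values entered by hand (assumed here as $t_{0.025}(3) = 3.182$ and $t_{0.05}(3) = 2.353$), and the script keeps the unrounded slope, so its intercept and t-ratios differ slightly from a hand calculation that rounds the slope to 1.03.

```python
import math

x = [30, 60, 90, 120, 150]
y = [50, 80, 120, 130, 180]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross products in deviation form.
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

beta_hat = sxy / sxx
alpha_hat = y_bar - beta_hat * x_bar

rss = syy - beta_hat * sxy        # residual sum of squares
sigma2_hat = rss / (n - 2)        # unbiased estimate of sigma^2

se_beta = math.sqrt(sigma2_hat / sxx)
se_alpha = math.sqrt(sigma2_hat * sum(xi ** 2 for xi in x) / (n * sxx))
cov_alpha_beta = -x_bar * sigma2_hat / sxx

t_alpha = alpha_hat / se_alpha
t_beta = beta_hat / se_beta

T_CRIT_5PCT = 3.182   # assumed table value, two-sided 5%, 3 d.f.
T_CRIT_10PCT = 2.353  # assumed table value, two-sided 10%, 3 d.f.

ci_alpha_95 = (alpha_hat - T_CRIT_5PCT * se_alpha, alpha_hat + T_CRIT_5PCT * se_alpha)
ci_beta_90 = (beta_hat - T_CRIT_10PCT * se_beta, beta_hat + T_CRIT_10PCT * se_beta)
r_squared = 1 - rss / syy

print("alpha_hat, beta_hat:", round(alpha_hat, 4), round(beta_hat, 4))
print("sigma2_hat:", round(sigma2_hat, 4))
print("t(alpha), t(beta):", round(t_alpha, 3), round(t_beta, 3))
print("reject H0 for alpha:", abs(t_alpha) > T_CRIT_5PCT)
print("reject H0 for beta:", abs(t_beta) > T_CRIT_5PCT)
print("95% CI alpha:", tuple(round(v, 3) for v in ci_alpha_95))
print("90% CI beta:", tuple(round(v, 4) for v in ci_beta_90))
print("Cov(alpha_hat, beta_hat):", round(cov_alpha_beta, 4))
print("R^2:", round(r_squared, 4))
```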

2.14: Exercise

1. Discuss the nature of regression analysis.

2. What are the different types of data for economic analysis?

3. State and prove Gauss-Markov theorem.

4. Prove that

5. Prove that

E( ) =

6. Find the ML estimates of least square regression line.

7. Given the data:

X 2 3 1 5 9

Y 4 7 3 9 17

i. Estimate the model $Y = \alpha + \beta X + U$ by OLS.

ii. Find the variance of $\hat{\beta}$.

iii. Find “r” and $R^2$.

8. The following marks have been obtained by a class of students

in economics:

X 45 55 56 58 60 65 68 70 75 80 85

Y 56 50 48 60 62 64 65 70 74 82 90

1. Find the equation of the lines of regression.

2. Test the significance of .

3. 98% confidence interval of .

9. A sample of 20 observations corresponding to the model

gave the following data:

(a) Estimate and calculate estimates of variance of

your estimates.

(b) Find 95% confidence interval for . Explain the mean

value of Y corresponding to a value of X fixed at X = 10.

Chapter: 3

MULTIPLE LINEAR REGRESSION AND

CORRELATION

3.1: Multiple Linear Regression

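It investigates the dependence of one variable (the dependent variable) on more than one independent variable, e.g. production of wheat depends upon fertilizer, land condition, temperature, water etc.

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + U$$

Normal equations are:

$$\sum Y = n\hat{\beta}_0 + \hat{\beta}_1\sum X_1 + \hat{\beta}_2\sum X_2$$

$$\sum X_1 Y = \hat{\beta}_0\sum X_1 + \hat{\beta}_1\sum X_1^2 + \hat{\beta}_2\sum X_1 X_2$$

$$\sum X_2 Y = \hat{\beta}_0\sum X_2 + \hat{\beta}_1\sum X_1 X_2 + \hat{\beta}_2\sum X_2^2$$

Solving in deviation form (lower-case letters denote deviations from the means):

$$\hat{\beta}_1 = \frac{\left[\sum x_1 y \sum x_2^2 - \sum x_2 y \sum x_1 x_2\right]}{\left[\sum x_1^2 \sum x_2^2 - (\sum x_1 x_2)^2\right]}$$

And

$$\hat{\beta}_2 = \frac{\left[\sum x_2 y \sum x_1^2 - \sum x_1 y \sum x_1 x_2\right]}{\left[\sum x_1^2 \sum x_2^2 - (\sum x_1 x_2)^2\right]}$$

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1\bar{X}_1 - \hat{\beta}_2\bar{X}_2$$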

3.2: Coefficient of Multiple Determinations

The coefficient of multiple determination is the proportion of the total variation in the dependent variable Y that is explained by the independent variables $X_1$ and $X_2$.

$$R^2 = \frac{\text{Explained Variation}}{\text{Total Variation}} = \frac{\hat{\beta}_1\sum x_1 y + \hat{\beta}_2\sum x_2 y}{\sum y^2}$$
3.3: Adjusted 𝑹 𝟐

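An important property of $R^2$ is that it is non-decreasing. That is, including an additional explanatory variable increases the value of $R^2$ or leaves it unchanged; it never decreases it. To adjust for this we use the adjusted $\bar{R}^2$, where k is the number of explanatory variables:

$$\bar{R}^2 = 1 - (1 - R^2)\frac{n - 1}{n - k - 1}$$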

3.4: COBB-DOUGLAS PRODUCTION FUNCTION

The Cobb-Douglas production function, in its stochastic form, may be expressed as

$$Y = A X_1^{\beta_1} X_2^{\beta_2} e^{U}$$

Where Y = output, $X_1$ = labor input, $X_2$ = capital input,

U = stochastic disturbance term, e = base of natural logarithm.

The relationship between output and the two inputs is nonlinear. Using a log-transformation we obtain a regression model that is linear in the parameters:

$$\ln Y = \beta_0 + \beta_1\ln X_1 + \beta_2\ln X_2 + U$$

Where $\beta_0 = \ln A$.

3.5: Partial Correlation

If there are three variables Y, $X_1$ and $X_2$, then the correlation between Y and $X_1$ holding $X_2$ constant is called partial correlation. The simple partial correlation coefficient is the measure of strength of the linear relationship between Y and $X_1$ after removing the linear influence of $X_2$ from Y and $X_1$, and is denoted by $r_{Y1.2}$.

$$r_{Y1.2} = \frac{r_{Y1} - r_{Y2}\,r_{12}}{\sqrt{(1 - r_{Y2}^2)}\sqrt{(1 - r_{12}^2)}}$$
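The partial correlation formula can be illustrated with a few lines of code. The helper names `corr` and `partial_corr` are hypothetical, and the data are the four observations used in this chapter's multiple regression example.

```python
import math


def corr(u, v):
    """Simple (zero-order) correlation coefficient between two lists."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    suv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    suu = sum((a - u_bar) ** 2 for a in u)
    svv = sum((b - v_bar) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def partial_corr(r_y1, r_y2, r_12):
    """Correlation between Y and X1 after removing the linear influence of X2."""
    return (r_y1 - r_y2 * r_12) / math.sqrt((1 - r_y2 ** 2) * (1 - r_12 ** 2))


y = [5, 7, 8, 10]
x1 = [1, 3, 9, 8]
x2 = [2, 4, 3, 10]

r_y1 = corr(y, x1)
r_y2 = corr(y, x2)
r_12 = corr(x1, x2)
r_y1_2 = partial_corr(r_y1, r_y2, r_12)
print("r_Y1 =", round(r_y1, 4), " r_Y2 =", round(r_y2, 4), " r_12 =", round(r_12, 4))
print("r_Y1.2 =", round(r_y1_2, 4))
```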

3.6: TESTING THE OVERALL SIGNIFICANCE OF A

MULTIPLE REGRESSION (The F-test)

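Hypothesis: $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$ against $H_1$: at least one $\beta_j \neq 0$

Choose level of significance at $\alpha$ (e.g. $\alpha = 0.05$)

Test statistic to be used:

$$F = \frac{\text{Explained SS}/k}{\text{Residual SS}/(n - k - 1)} \quad \text{with } (k,\ n - k - 1) \text{ d.f.}$$

Computations:

Total SS = $\sum y^2 = \sum Y^2 - n\bar{Y}^2$

Residual SS = $\sum (Y - \hat{Y})^2$

Explained SS = Total SS - Residual SS

S.O.V         d.f         SS              MS                          F
Regression    k           Explained SS    Explained SS / k            F = MS(Regression) / MS(Residual)
Residual      n - k - 1   Residual SS     Residual SS / (n - k - 1)
Total         n - 1       Total SS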
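3.7: RELATION BETWEEN $R^2$ AND $F$

Total Variation = $\sum y^2$

Explained Variation = $R^2(\sum y^2)$

Unexplained Variation = Total Variation - Explained Variation

Unexplained Variation = $(1 - R^2)\sum y^2$

ANOVA TABLE IS:

S.O.V         d.f         SS                  MS
Regression    k           R²∑y²               R²∑y² / k
Residual      n - k - 1   (1 - R²)∑y²         (1 - R²)∑y² / (n - k - 1)
Total         n - 1       ∑y²

$$F = \frac{R^2 / k}{(1 - R^2)/(n - k - 1)}$$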
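Because $\sum y^2$ cancels from numerator and denominator, F can be computed from $R^2$ alone. A minimal sketch, with a hypothetical function name and purely illustrative input values:

```python
def f_from_r_squared(r_squared, n, k):
    """F statistic for overall significance, from R^2.

    n is the number of observations and k the number of explanatory
    variables, so the degrees of freedom are (k, n - k - 1).
    """
    return (r_squared / k) / ((1 - r_squared) / (n - k - 1))


# Illustrative values only: R^2 = 0.5 with 12 observations and 1 regressor.
f_value = f_from_r_squared(0.5, n=12, k=1)
print("F =", f_value)
```

For fixed n and k, a larger $R^2$ always gives a larger F, and $R^2 = 0$ gives F = 0.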

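Example: Given the following data:

Y     5    7    8    10
X₁    1    3    9    8
X₂    2    4    3    10

i. Estimate $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + U_i$ and interpret the estimates.
ii. Find $R^2$ and $\bar{R}^2$.
iii. Test the goodness of fit.

Solution:

i. Estimate $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + U_i$

Y     X₁    X₂    X₁Y    X₂Y    X₁X₂    X₁²    X₂²    Y²
5     1     2     5      10     2       1      4      25
7     3     4     21     28     12      9      16     49
8     9     3     72     24     27      81     9      64
10    8     10    80     100    80      64     100    100
30    21    19    178    162    121     155    129    238

Normal equations are:

$$30 = 4\hat{\beta}_0 + 21\hat{\beta}_1 + 19\hat{\beta}_2$$

$$178 = 21\hat{\beta}_0 + 155\hat{\beta}_1 + 121\hat{\beta}_2$$

$$162 = 19\hat{\beta}_0 + 121\hat{\beta}_1 + 129\hat{\beta}_2$$

Solving these equations, we get

$$\hat{\beta}_0 = 4.326, \qquad \hat{\beta}_1 = 0.296, \qquad \hat{\beta}_2 = 0.341$$

ii. Find $R^2$ and $\bar{R}^2$

In deviation form, $\sum x_1 y = 20.5$, $\sum x_2 y = 19.5$ and $\sum y^2 = 238 - \dfrac{(30)^2}{4} = 13$.

$$R^2 = \frac{\hat{\beta}_1\sum x_1 y + \hat{\beta}_2\sum x_2 y}{\sum y^2} = \frac{12.72}{13} = 0.978$$

$$\bar{R}^2 = 1 - (1 - R^2)\frac{n - 1}{n - k - 1} = 1 - (0.0217)\frac{3}{1} = 0.935$$

iii. Testing

a) $H_0: \beta_1 = \beta_2 = 0$ against $H_1$: at least one $\beta_j \neq 0$
b) Choose level of significance at $\alpha = 0.05$
c) Test statistic

$$F = \frac{R^2 / k}{(1 - R^2)/(n - k - 1)} \quad \text{with } (2, 1) \text{ d.f.}$$

d) Computation

$$F = \frac{0.978 / 2}{0.0217 / 1} = 22.6$$

e) Critical region: $F \geq F_{0.05}(2, 1) = 199.5$
f) Since our calculated value is less than the table value, we accept the null hypothesis.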
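The three normal equations of this example can also be solved numerically. The sketch below builds the cross-product sums from the data and solves the system by Gauss-Jordan elimination; the function `solve_linear_system` is a hypothetical helper written for this illustration, not a library routine.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def cross(u, v):
    """Sum of products of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


y = [5, 7, 8, 10]
x1 = [1, 3, 9, 8]
x2 = [2, 4, 3, 10]
n, k = len(y), 2

# Coefficient matrix and right-hand side of the normal equations.
columns = [[1] * n, x1, x2]
lhs = [[cross(ci, cj) for cj in columns] for ci in columns]
rhs = [cross(ci, y) for ci in columns]
b0, b1, b2 = solve_linear_system(lhs, rhs)

fitted = [b0 + b1 * u + b2 * v for u, v in zip(x1, x2)]
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
y_bar = sum(y) / n
tss = sum((yi - y_bar) ** 2 for yi in y)

r2 = 1 - rss / tss
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))

print("b0, b1, b2:", round(b0, 4), round(b1, 4), round(b2, 4))
print("R^2:", round(r2, 4), " adjusted R^2:", round(adj_r2, 4), " F:", round(f_stat, 2))
```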

3.8: Exercise

1. Differentiate between simple and multiple regression.

2. Write note on and .

3. Discuss the Cobb-Douglas production function.

4. How the overall significance of regression is tested?

5. Consider the following data:

Y 40 30 20 10 60 50 70 80 90

50 40 30 80 70 20 60 50 40

20 10 30 40 80 30 50 10 60

iv. Estimate and interpret

them.

v.Find and .

vi. Test the goodness of fit.

vii. Find variance of

6. Use the following data:

Y

5.5 190 49

6.5 170 58

8.0 210 55

7.5 170 58

7.0 190 55

5.0 180 49

6.0 200 46

6.5 210 46

a. Estimate by OLS.

b. Test overall significance of regression model.

c. Find adjusted coefficient of multiple correlation.

d. Find .


Chapter: 4 GENERAL LINEAR REGRESSION (GLR)

4.1: INTRODUCTION

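General linear regression is an extension of simple linear regression that involves more than one independent variable.

Suppose we have $n$ observations in which a linear relationship exists between a variable $Y$ and $k$ explanatory variables $X_1, X_2, \dots, X_k$; then the regression model is:

$$Y_i = \beta_0 + \beta_1X_{1i} + \beta_2X_{2i} + \dots + \beta_kX_{ki} + U_i$$

For $n$ observations

$$Y_1 = \beta_0 + \beta_1X_{11} + \beta_2X_{21} + \dots + \beta_kX_{k1} + U_1$$

$$Y_2 = \beta_0 + \beta_1X_{12} + \beta_2X_{22} + \dots + \beta_kX_{k2} + U_2$$

$$\vdots$$

$$Y_n = \beta_0 + \beta_1X_{1n} + \beta_2X_{2n} + \dots + \beta_kX_{kn} + U_n$$

It may be written in matrix notation as

$$\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \begin{bmatrix} 1 & X_{11} & X_{21} & \cdots & X_{k1} \\ 1 & X_{12} & X_{22} & \cdots & X_{k2} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & X_{1n} & X_{2n} & \cdots & X_{kn} \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix} + \begin{bmatrix} U_1 \\ U_2 \\ \vdots \\ U_n \end{bmatrix}$$

$$Y = X\beta + U$$

where $Y$ is $n \times 1$, $X$ is $n \times (k+1)$, $\beta$ is $(k+1) \times 1$ and $U$ is $n \times 1$.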


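Assumptions of GLR:

1. $E(U) = 0$

$$U = \begin{bmatrix} U_1 \\ U_2 \\ \vdots \\ U_n \end{bmatrix}$$

Taking expectation on both sides

$$E(U) = \begin{bmatrix} E(U_1) \\ E(U_2) \\ \vdots \\ E(U_n) \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = 0$$

2. Variance: $\operatorname{Var}(U) = E(UU') = \sigma^2I_n$

$$E(UU') = E\left(\begin{bmatrix} U_1 \\ U_2 \\ \vdots \\ U_n \end{bmatrix}\begin{bmatrix} U_1 & U_2 & \cdots & U_n \end{bmatrix}\right) = \begin{bmatrix} E(U_1^2) & E(U_1U_2) & \cdots & E(U_1U_n) \\ E(U_2U_1) & E(U_2^2) & \cdots & E(U_2U_n) \\ \vdots & \vdots & & \vdots \\ E(U_nU_1) & E(U_nU_2) & \cdots & E(U_n^2) \end{bmatrix}$$

Since $E(U_i^2) = \sigma^2$ and $E(U_iU_j) = 0$ for $i \neq j$,

$$E(UU') = \begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{bmatrix} = \sigma^2\begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = \sigma^2I_n$$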

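Prove that $\hat\beta = (X'X)^{-1}X'Y$.

Proof:

Let the population model be

$$Y = X\beta + U$$

The estimated model is

$$Y = X\hat\beta + e, \qquad e = Y - X\hat\beta$$

We minimize the sum of squares of the residuals, that is

$$e'e = [Y - X\hat\beta]'[Y - X\hat\beta] = [Y' - \hat\beta'X'][Y - X\hat\beta]$$

$$e'e = Y'Y - Y'X\hat\beta - \hat\beta'X'Y + \hat\beta'X'X\hat\beta$$

Since $\hat\beta'X'Y$ is a scalar, it is equal to its transpose, i.e. $\hat\beta'X'Y = Y'X\hat\beta$, so

$$e'e = Y'Y - 2\hat\beta'X'Y + \hat\beta'X'X\hat\beta$$

Differentiating with respect to $\hat\beta$ and equating to zero:

$$\frac{\partial(e'e)}{\partial\hat\beta} = -2X'Y + 2X'X\hat\beta = 0$$

$$X'X\hat\beta = X'Y \quad\Rightarrow\quad \hat\beta = (X'X)^{-1}X'Y$$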

4.2: PROPERTIES OF OLS ESTIMATORS IN (GLR)

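1. Linearity: $\hat\beta$ is a linear function of $Y$, and hence of the disturbances $U$. In a GLR model

$$\hat\beta = (X'X)^{-1}X'Y = (X'X)^{-1}X'(X\beta + U) = \beta + (X'X)^{-1}X'U \quad \text{……eq. (1)}$$

2. Unbiasedness: The OLS estimator $\hat\beta$ is unbiased.

Taking expectation on both sides of eq. (1)

$$E(\hat\beta) = \beta + (X'X)^{-1}X'E(U) = \beta$$

3. Minimum Variance: By definition

$$\operatorname{Var}(\hat\beta) = E\{[\hat\beta - E(\hat\beta)][\hat\beta - E(\hat\beta)]'\} = E\{[\hat\beta - \beta][\hat\beta - \beta]'\}$$

Using eq. (1) we get

$$\operatorname{Var}(\hat\beta) = E\{[(X'X)^{-1}X'U][(X'X)^{-1}X'U]'\} = E[(X'X)^{-1}X'UU'X(X'X)^{-1}]$$

$$\operatorname{Var}(\hat\beta) = (X'X)^{-1}X'E(UU')X(X'X)^{-1} = (X'X)^{-1}X'\sigma^2I_nX(X'X)^{-1}$$

$$\operatorname{Var}(\hat\beta) = \sigma^2(X'X)^{-1}X'X(X'X)^{-1} = \sigma^2(X'X)^{-1}$$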

Example: Given

Y 4 5 6 7 8

X 2 3 4 5 7

i) Calculate SLR estimate using GLR technique.

ii) Also find their variance and covariance.


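Solution:

| $Y$ | $X$ | $XY$ | $X^2$ | $Y^2$ |
|---|---|---|---|---|
| 4 | 2 | 8 | 4 | 16 |
| 5 | 3 | 15 | 9 | 25 |
| 6 | 4 | 24 | 16 | 36 |
| 7 | 5 | 35 | 25 | 49 |
| 8 | 7 | 56 | 49 | 64 |
| 30 | 21 | 138 | 103 | 190 |

i) $\hat\beta = (X'X)^{-1}X'Y$

$$X'X = \begin{bmatrix} n & \sum X \\ \sum X & \sum X^2 \end{bmatrix} = \begin{bmatrix} 5 & 21 \\ 21 & 103 \end{bmatrix}, \qquad X'Y = \begin{bmatrix} \sum Y \\ \sum XY \end{bmatrix} = \begin{bmatrix} 30 \\ 138 \end{bmatrix}$$

$$|X'X| = \begin{vmatrix} 5 & 21 \\ 21 & 103 \end{vmatrix} = 515 - 441 = 74$$

$$(X'X)^{-1} = \frac{1}{74}\begin{bmatrix} 103 & -21 \\ -21 & 5 \end{bmatrix}$$

Now

$$\hat\beta = \frac{1}{74}\begin{bmatrix} 103 & -21 \\ -21 & 5 \end{bmatrix}\begin{bmatrix} 30 \\ 138 \end{bmatrix} = \frac{1}{74}\begin{bmatrix} 192 \\ 60 \end{bmatrix} = \begin{bmatrix} 2.5946 \\ 0.8108 \end{bmatrix}$$

ii) Variance-covariance

$$\hat\beta'X'Y = \begin{bmatrix} 2.5946 & 0.8108 \end{bmatrix}\begin{bmatrix} 30 \\ 138 \end{bmatrix} = 189.7297$$

$$e'e = Y'Y - \hat\beta'X'Y = 190 - 189.7297 = 0.2703$$

$$\hat\sigma^2 = \frac{e'e}{n-k} = \frac{0.2703}{5-2} = 0.0901$$

$$\operatorname{Var\text{-}Cov}(\hat\beta) = \hat\sigma^2(X'X)^{-1} = \frac{0.0901}{74}\begin{bmatrix} 103 & -21 \\ -21 & 5 \end{bmatrix} = \begin{bmatrix} 0.1254 & -0.0256 \\ -0.0256 & 0.0061 \end{bmatrix}$$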
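The matrix steps of this example can be reproduced in a few lines. The following is a minimal sketch, assuming the design matrix has a column of ones and the single regressor $X$, so that $X'X$ is $2 \times 2$ and can be inverted by hand; the variable names are illustrative only.

```python
Y = [4, 5, 6, 7, 8]
X = [2, 3, 4, 5, 7]
n = len(Y)

# Elements of X'X and X'Y for the design matrix [1, X].
sx, sxx = sum(X), sum(x * x for x in X)
sy, sxy = sum(Y), sum(x * y for x, y in zip(X, Y))
syy = sum(y * y for y in Y)

det = n * sxx - sx * sx                      # |X'X|
inv = [[sxx / det, -sx / det],               # (X'X)^{-1} = adj(X'X) / |X'X|
       [-sx / det, n / det]]

beta = [inv[0][0] * sy + inv[0][1] * sxy,    # beta_hat = (X'X)^{-1} X'Y
        inv[1][0] * sy + inv[1][1] * sxy]

rss = syy - (beta[0] * sy + beta[1] * sxy)   # e'e = Y'Y - beta_hat' X'Y
sigma2 = rss / (n - 2)                       # estimate of sigma^2 with n - 2 d.f.
var_cov = [[sigma2 * v for v in row] for row in inv]

print("beta_hat:", [round(b, 4) for b in beta])
print("var-cov :", [[round(v, 4) for v in row] for row in var_cov])
```

The diagonal of `var_cov` holds the variances of the intercept and slope estimates, and the off-diagonal element is their covariance.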

4.3: POLYNOMIAL

Any algebraic expression in which every power of $X$ is a non-negative integer (positive or zero) is known as a polynomial. E.g.

$$Y = a_0 + a_1X + a_2X^2 + \dots + a_kX^k$$

POLYNOMIAL REGRESSION

It is a multiple linear regression in which the explanatory variables are all powers of a single variable. E.g. the second-degree polynomial model

$$Y_i = \beta_0 + \beta_1X_i + \beta_2X_i^2 + U_i$$

is called a polynomial regression model in one regressor. If we put $X_j = X^j$ for $j = 1, 2, \dots, k$, then this is a multiple linear regression with $k$ explanatory variables. The $k$th-order polynomial in one variable is:

$$Y_i = \beta_0 + \beta_1X_i + \beta_2X_i^2 + \dots + \beta_kX_i^k + U_i$$

The polynomial regression model is used where the relationship between the response variable and the explanatory variable is curvilinear.


4.4: Exercise

1. Discuss general linear regression.

2. State the assumptions under which OLS estimates are best,

linear and unbiased in general linear regression.

3. Prove that:

a) $\hat\beta = (X'X)^{-1}X'Y$

b) $\operatorname{Var}(\hat\beta) = \sigma^2(X'X)^{-1}$

4. Define polynomial regression.

5. Given the data:

X 15 20 30 50 100

Y 20 40 60 80 120

Find:

i)

ii) ( )

iii) 90% confidence interval of .

iv) Test the hypothesis when .

v) Estimate Y when X=200.

vi) And .

6. Consider the GLR model with the following data:

Y 3 7 5 9

7 11 8 10

5 3 9 3

Find:

i)

ii) ( )

iii) 90% confidence interval of

iv) Test the hypothesis when .

v) And .

7. Given the following information in deviation form:

*

+ ,

*

+


, ,

,

a) Find the estimates of . Also find their

variances and covariance.

b) How would you estimate

c) Test the hypothesis that .

d) And .

8. Given the following data:

2 1 3

3 5 4

8 6 7

10 8 6

12 10 11

16 13 14

19 17 18

20 21 20

22 23 25

25 24 27

Find:

a) Estimate the model in deviation form .

b) ( )

c) 95% confidence interval of and .

d) Test the hypothesis when .

e) And .


Chapter: 5

DUMMY VARIABLES

Econometric models are very flexible, as they allow the use of both qualitative and quantitative explanatory variables. For a quantitative response variable, each independent variable can be either a quantitative variable or a qualitative variable whose levels represent qualities and can only be categorized. Examples of qualitative variables are sex (male and female) and colour (black and white). For a qualitative variable a numerical scale does not exist, so we must assign a set of levels to it in order to account for the effect that the variable may have on the response; for this we use dummy variables.

“A dummy variable is a variable which we construct to describe the development or variation of the variable under consideration.”

5.1: NATURE OF DUMMY VARIABLES

In regression analysis the dependent variable is affected not only by quantitative variables but also by qualitative variables. For example, income, output, height and temperature can be quantified on some well-defined scale, whereas religion, nationality, strikes, earthquakes and sex are qualitative in nature.

All of these variables affect the dependent variable. In order to study them, we quantify the qualitative variables by using “0” and “1”: “0” means absence of the attribute and “1” means presence of the attribute. Variables that assume the values “0” and “1” are called dummy variables. Dummy variables are also called indicator, binary or categorical variables.

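EXAMPLE:

$$Y_i = \alpha + \beta D_i + U_i$$

Where $Y_i$ is the annual salary of a college professor, $D_i = 1$ if the professor is male and $D_i = 0$ if female.

Suppose the disturbances satisfy the usual assumptions and the model is estimated using the OLS method. There is only one dummy variable in the model.

Mean salary of Female College Professors:

$$E(Y_i \mid D_i = 0) = \alpha$$

Mean salary of Male College Professors:

$$E(Y_i \mid D_i = 1) = \alpha + \beta$$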

5.2: DUMMY VARIABLE TRAP

If a qualitative variable has $k$ categories, introduce only $k-1$ dummy variables; otherwise the situation of perfect multicollinearity arises and the researcher will fall into the dummy variable trap.

We consider a model

$$Y_i = \alpha + \beta_1D_{1i} + \beta_2D_{2i} + U_i$$

Where $D_{1i} = 1$ if male and 0 otherwise, and $D_{2i} = 1$ if female and 0 otherwise, so that $D_{2i} = 1 - D_{1i}$.

| $Y$ | Sex | $D_1$ |
|---|---|---|
| 3000 | Female | 0 |
| 4000 | Male | 1 |
| 5000 | Female | 0 |
| 6000 | Male | 1 |

This model is an example of the dummy variable trap, because $D_{1i} + D_{2i} = 1$ reproduces the intercept column exactly. There is a rule for introducing dummy variables: if a qualitative variable has “m” categories, introduce only $(m-1)$ dummy variables. If this rule is not followed, we say that there is a dummy variable trap.

EXAMPLES:

Sex has two categories, F and M, that is m = 2. If we introduce $m - 1 = 1$ dummy variable, we follow the rule of introducing dummy variables. If we introduce 2 dummy variables, then we say there is a dummy variable trap.

Suppose there are three categories of color: white, black and red. Then m = 3. If we do not introduce $m - 1 = 2$ dummy variables but introduce 3, then there will be a dummy variable trap.
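The trap can be seen numerically: with an intercept and a dummy for every category, $X'X$ is singular and cannot be inverted. The sketch below uses the four salary observations from the table; the names `male`, `female`, `gram` and `det` are illustrative assumptions, not notation from the text.

```python
salary = [3000, 4000, 5000, 6000]
male = [0, 1, 0, 1]                 # D = 1 for male, 0 for female
female = [1 - d for d in male]      # second dummy; male + female = 1 in every row


def gram(columns):
    """Return X'X for a design matrix given as a list of columns."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]


def det(m):
    """Determinant by cofactor expansion (adequate for 2x2 and 3x3)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total


ones = [1] * len(salary)
det_trap = det(gram([ones, male, female]))   # intercept + both dummies
det_ok = det(gram([ones, male]))             # intercept + (m - 1) = 1 dummy

# With one dummy, OLS reproduces the group means.
mean_female = sum(s for s, d in zip(salary, male) if d == 0) / female.count(1)
mean_male = sum(s for s, d in zip(salary, male) if d == 1) / male.count(1)
alpha_hat, beta_hat = mean_female, mean_male - mean_female

print("det with 2 dummies:", det_trap)
print("det with 1 dummy  :", det_ok)
print("alpha_hat, beta_hat:", alpha_hat, beta_hat)
```

A zero determinant means the normal equations have no unique solution, which is the perfect multicollinearity described above.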

5.3: USES OF DUMMY VARIABLES

a) Dummy variables are used as stand-ins for qualitative factors.

b) The dummy variables can be used to deseasonalize the time

series.

c) Dummy variables are used in spline function.

d) Interaction effects can be measured by using dummy

variables.

e) Dummy variables are used for determining the change of

regression coefficient.

f) Dummy variables are used as categorical regressors.


5.4: Exercise

1. What are the dummy variables? Discuss briefly the

features of the dummy variable regression model.

2. Discuss the uses of dummy variables.


Chapter: 6

AUTO-REGRESSIVE AND DISTRIBUTED-LAG

MODEL

6.1: DISTRIBUTED-LAG MODEL

In regression analysis involving time-series data, if the regression model includes not only the current but also the lagged (past) values of the explanatory variables (the X's), it is called a distributed-lag model. That is,

$$Y_t = \alpha + \beta_0X_t + \beta_1X_{t-1} + \beta_2X_{t-2} + U_t$$

represents a distributed-lag model.

6.2: AUTO-REGRESSIVE MODEL

If the model includes one or more lagged values of the dependent variable among its explanatory variables, it is called an auto-regressive model. That is,

$$Y_t = \alpha + \beta X_t + \gamma Y_{t-1} + U_t$$

represents an auto-regressive model. Auto-regressive models are also known as dynamic models. Auto-regressive and distributed-lag models are used extensively in econometric analysis.

6.3: LAG

In economics the dependence of a variable Y (the dependent variable) on other variables (the explanatory variables) is rarely instantaneous. Very often Y responds to X with a lapse of time; such a lapse of time is called a lag.


6.4: REASONS (SOURCES) OF LAGS

There are three main reasons for lags.

1. Psychological Reasons: Due to the force of habit, people do not change their consumption habits immediately following a price decrease or an income increase. For example, those who become instant millionaires by winning lotteries may not change their lifestyles at once. Given reasonable time, they may learn to live with their newly acquired fortune.

2. Technological Reasons: Technology is a major source of lags. If a drop in the price of capital is expected to be temporary, firms may not rush to substitute capital for labor, especially if they expect that after the temporary drop the price of capital may increase beyond its previous level. For example, since the introduction of electronic pocket calculators in the late 1960s, the price of most calculators has decreased dramatically; as a result, consumers may hesitate to buy until they have had time to look into the features and prices of all the competing brands. Moreover, they may hesitate to buy in the expectation of a further decrease in price.

3. Institutional Reasons: These reasons also contribute to lags. For example, those who have placed funds in long-term saving accounts for fixed durations such as 1 year, 3 years or 7 years are essentially “locked in”, even though market conditions may be such that higher yields are available elsewhere. Similarly, employers often give their employees a choice among several health insurance plans, but once a choice is made an employee may not switch to another plan for at least one year.


6.5: TYPES OF DISTRIBUTED LAG MODEL

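There are two types of distributed lag model:

1. Infinite Distributed Lag Model

In the case of the infinite distributed lag model we do not specify the length of the lag, that is, how far back into the past we want to go: e.g.

$$Y_t = \alpha + \beta_0X_t + \beta_1X_{t-1} + \beta_2X_{t-2} + \dots + U_t$$

2. Finite Distributed Lag Model

In the case of the finite distributed lag model we specify the length of the lag, $k$: e.g.

$$Y_t = \alpha + \beta_0X_t + \beta_1X_{t-1} + \beta_2X_{t-2} + \dots + \beta_kX_{t-k} + U_t$$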

6.6: ESTIMATION OF DISTRIBUTED LAG MODEL

We use the following methods for estimation of the distributed lag model.

1) Ad Hoc Estimation Method.

2) Koyck Estimation Method.

3) Almon Approach Method.

1) Ad Hoc Estimation Method

This is the approach taken by Alt and Tinbergen. They suggest estimating

$$Y_t = \alpha + \beta_0X_t + \beta_1X_{t-1} + \beta_2X_{t-2} + \dots + U_t$$

sequentially: first we regress $Y_t$ on $X_t$, then regress $Y_t$ on $X_t$ and $X_{t-1}$, and so on. This sequential procedure stops when the regression coefficients of the lagged variables start becoming statistically insignificant and/or the coefficient of at least one of the variables changes sign from positive to negative or vice versa.

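2) Koyck Approach

This method is used in the case of the infinite distributed lag model. Under this method we assume that the $\beta$'s are all of the same sign. Koyck assumes they decline geometrically as follows:

$$\beta_k = \beta_0\lambda^k \quad \text{………..eq. (A)} \qquad \text{where } k = 0, 1, 2, \dots$$

$$\beta_1 = \beta_0\lambda, \quad \beta_2 = \beta_0\lambda^2, \quad \beta_3 = \beta_0\lambda^3, \quad \dots$$

Where $\lambda$ ($0 < \lambda < 1$) is known as the rate of decline or decay of the distributed lag and $1 - \lambda$ is known as the speed of adjustment. The distributed lag model is:

$$Y_t = \alpha + \beta_0X_t + \beta_1X_{t-1} + \beta_2X_{t-2} + \dots + U_t \quad \text{……eq. (B)}$$

Substituting eq. (A) into eq. (B), we get

$$Y_t = \alpha + \beta_0X_t + \beta_0\lambda X_{t-1} + \beta_0\lambda^2X_{t-2} + \dots + U_t \quad \text{...eq. (C)}$$

Lagging one period, we get

$$Y_{t-1} = \alpha + \beta_0X_{t-1} + \beta_0\lambda X_{t-2} + \beta_0\lambda^2X_{t-3} + \dots + U_{t-1}$$

Multiplying by “λ” on both sides

$$\lambda Y_{t-1} = \lambda\alpha + \beta_0\lambda X_{t-1} + \beta_0\lambda^2X_{t-2} + \dots + \lambda U_{t-1} \quad \text{….eq. (D)}$$

Subtracting eq. (D) from eq. (C), we get

$$Y_t - \lambda Y_{t-1} = \alpha(1 - \lambda) + \beta_0X_t + (U_t - \lambda U_{t-1})$$

$$Y_t = \alpha(1 - \lambda) + \beta_0X_t + \lambda Y_{t-1} + V_t \quad \text{…..eq. (E)}$$

where $V_t = U_t - \lambda U_{t-1}$.

Model (E) is an auto-regressive model, so we can apply the OLS method to it and get estimates of $\alpha(1-\lambda)$, $\beta_0$ and $\lambda$; using them we can find $\alpha$ and $\beta_k = \beta_0\lambda^k$.

In a sense the multicollinearity problem is resolved by replacing $X_{t-1}, X_{t-2}, \dots$ by a single variable $Y_{t-1}$. But note the following features of the Koyck transformation.

The Koyck model is transformed from a distributed lag model into an auto-regressive model.

OLS applied to model (E) gives biased and inconsistent estimators, because $Y_{t-1}$ is correlated with the error term $V_t$.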
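Once the coefficients of the transformed auto-regressive model have been estimated, the implied lag coefficients follow from the geometric rule. The sketch below assumes hypothetical estimates (an intercept of 2.0, a coefficient of 0.4 on $X_t$ and 0.5 on $Y_{t-1}$); the function names are illustrative and the long-run multiplier $\beta_0/(1-\lambda)$ is the sum of the infinite geometric series of lag weights.

```python
def koyck_weights(beta0, lam, n_lags):
    """Lag coefficients beta_k = beta0 * lam**k for k = 0..n_lags."""
    if not 0 < lam < 1:
        raise ValueError("lambda must lie strictly between 0 and 1")
    return [beta0 * lam ** k for k in range(n_lags + 1)]


def recover_structural(intercept, beta0, lam):
    """Map the transformed-model intercept alpha*(1 - lam) back to alpha,
    and compute the long-run multiplier beta0 / (1 - lam)."""
    alpha = intercept / (1 - lam)
    long_run = beta0 / (1 - lam)
    return alpha, long_run


# Hypothetical estimates of the transformed model, for illustration only.
weights = koyck_weights(beta0=0.4, lam=0.5, n_lags=3)
alpha, long_run = recover_structural(intercept=2.0, beta0=0.4, lam=0.5)

print("lag weights:", weights)
print("alpha:", alpha, "long-run multiplier:", long_run)
```

A smaller $\lambda$ makes the weights die out faster, which corresponds to a higher speed of adjustment $1 - \lambda$.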

3) Almon Approach to Distributed Lag Models

If the $\beta$ coefficients do not decline geometrically but, for example, increase at first and then decrease, or follow a cyclical pattern, we apply the Almon approach. To illustrate the Almon technique, we use the finite distributed lag model

$$Y_t = \alpha + \beta_0X_t + \beta_1X_{t-1} + \beta_2X_{t-2} + \dots + \beta_kX_{t-k} + U_t$$

This may be written as:

$$Y_t = \alpha + \sum_{i=0}^{k}\beta_iX_{t-i} + U_t$$

Almon assumes that “$\beta_i$” can be approximated by a polynomial of suitable degree in “i” (the length of the lag).


6.7: Exercise

1. Differentiate between auto-regressive and distributed-lag models.

2. What is Lag? Discuss the sources of lags.

3. Discuss the different methods of distributed-lags model.


Chapter: 7

MULTICOLLINEARITY

7.1: Collinearity

In a multiple regression model with two independent

variables, if there is linear relationship between independent

variables, we say that there is collinearity.

7.2: Multicollinearity

If there are more than two independent variables and they

are linearly related, this linear relationship is called

multicollinearity.

Multicollinearity arises from the presence of interdependence among the regressors in a multivariable equation system. The departure from orthogonality in the set of regressors is a measure of multicollinearity. In its extreme form it means the existence of a perfect or exact linear relationship among some or all explanatory variables. When the explanatory variables are perfectly correlated, the method of least squares breaks down.

7.3: Sources of Multicollinearity

The data collection method employed: for example, sampling over a limited range of the values taken by the regressors in the population.

Constraints on the model or in the population being sampled: in the regression of electricity consumption ($Y$) on income ($X_1$) and house size ($X_2$) there is a physical constraint in the population, in that families with higher incomes generally have larger homes than families with lower incomes.

Model specification: for example, adding polynomial terms to a regression model, especially when the range of the variable is small.


An overdetermined model: this happens when the model has more explanatory variables than the number of observations. This could happen in medical research, where there may be a small number of patients about whom information is collected on a large number of variables.

An additional reason for multicollinearity, especially in time series data, may be that the regressors included in the model share a common trend, that is, they all increase or decrease over time. Thus in the regression of consumption expenditure on income, wealth and population, the regressors income, wealth and population may all be growing over time at more or less the same rate, leading to collinearity among these variables.

7.4: TYPES OF MULTICOLLINEARITY

There are two types of multicollinearity.

Perfect Multicollinearity

This relates to the situation where the explanatory variables are perfectly linearly related to each other, that is, when the correlation between two explanatory variables is exactly one (i.e. $r_{X_1X_2} = 1$). This situation is called perfect multicollinearity.

Imperfect Multicollinearity

If the correlation coefficient between two explanatory variables is not equal to one but close to one (approximately 0.9), it is called high multicollinearity. If it is approximately 0.5 it is called moderate, and if it is close to zero it is called low multicollinearity. Both high and moderate multicollinearity are troublesome because they cannot be easily detected.


7.5: ESTIMATION IN THE PRESENCE OF PERFECT

MULTICOLLINEARITY

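The three-variable regression model in deviation form is

$$y_i = \hat\beta_1x_{1i} + \hat\beta_2x_{2i} + e_i$$

with

$$\hat\beta_1 = \frac{\left(\sum x_1y\right)\left(\sum x_2^2\right) - \left(\sum x_2y\right)\left(\sum x_1x_2\right)}{\left(\sum x_1^2\right)\left(\sum x_2^2\right) - \left(\sum x_1x_2\right)^2}$$

And

$$\hat\beta_2 = \frac{\left(\sum x_2y\right)\left(\sum x_1^2\right) - \left(\sum x_1y\right)\left(\sum x_1x_2\right)}{\left(\sum x_1^2\right)\left(\sum x_2^2\right) - \left(\sum x_1x_2\right)^2}$$

Assume that $x_{2i} = \lambda x_{1i}$, where λ is a non-zero constant. Then $\sum x_2^2 = \lambda^2\sum x_1^2$, $\sum x_1x_2 = \lambda\sum x_1^2$ and $\sum x_2y = \lambda\sum x_1y$, so

$$\hat\beta_1 = \frac{\left(\sum x_1y\right)\left(\lambda^2\sum x_1^2\right) - \left(\lambda\sum x_1y\right)\left(\lambda\sum x_1^2\right)}{\left(\sum x_1^2\right)\left(\lambda^2\sum x_1^2\right) - \left(\lambda\sum x_1^2\right)^2} = \frac{\lambda^2\left[\left(\sum x_1y\right)\left(\sum x_1^2\right) - \left(\sum x_1y\right)\left(\sum x_1^2\right)\right]}{\lambda^2\left[\left(\sum x_1^2\right)^2 - \left(\sum x_1^2\right)^2\right]} = \frac{0}{0}$$

Similarly,

$$\hat\beta_2 = \frac{\left(\lambda\sum x_1y\right)\left(\sum x_1^2\right) - \left(\sum x_1y\right)\left(\lambda\sum x_1^2\right)}{\left(\sum x_1^2\right)\left(\lambda^2\sum x_1^2\right) - \left(\lambda\sum x_1^2\right)^2} = \frac{\lambda\left[\left(\sum x_1y\right)\left(\sum x_1^2\right) - \left(\sum x_1y\right)\left(\sum x_1^2\right)\right]}{\lambda^2\left[\left(\sum x_1^2\right)^2 - \left(\sum x_1^2\right)^2\right]} = \frac{0}{0}$$

Both estimators are therefore indeterminate. For the variances,

$$\operatorname{Var}(\hat\beta_1) = \frac{\sigma^2\sum x_2^2}{\left(\sum x_1^2\right)\left(\sum x_2^2\right) - \left(\sum x_1x_2\right)^2} = \frac{\sigma^2\lambda^2\sum x_1^2}{\lambda^2\left[\left(\sum x_1^2\right)^2 - \left(\sum x_1^2\right)^2\right]} = \frac{\sigma^2\sum x_1^2}{0} = \infty$$

Similarly,

$$\operatorname{Var}(\hat\beta_2) = \frac{\sigma^2\sum x_1^2}{\left(\sum x_1^2\right)\left(\sum x_2^2\right) - \left(\sum x_1x_2\right)^2} = \frac{\sigma^2\sum x_1^2}{\lambda^2\left[\left(\sum x_1^2\right)^2 - \left(\sum x_1^2\right)^2\right]} = \frac{\sigma^2\sum x_1^2}{0} = \infty$$

Put $x_{2i} = \lambda x_{1i}$ in the model:

$$y_i = \hat\beta_1x_{1i} + \hat\beta_2\lambda x_{1i} + e_i = (\hat\beta_1 + \lambda\hat\beta_2)x_{1i} + e_i = \hat\alpha x_{1i} + e_i$$

Where $\hat\alpha = \hat\beta_1 + \lambda\hat\beta_2$.

The regression of $y$ on $x_1$ gives:

$$\hat\alpha = \hat\beta_1 + \lambda\hat\beta_2 = \frac{\sum x_1y}{\sum x_1^2}$$

Therefore, although we can estimate $\alpha$ uniquely, there is no way to estimate $\beta_1$ and $\beta_2$ uniquely. Hence in the case of perfect multicollinearity the variances and standard errors of $\hat\beta_1$ and $\hat\beta_2$ individually are infinite.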

7.6: CONSEQUENCES OF MULTICOLLINEARITY

1) The estimates of the coefficients are statistically unbiased even when multicollinearity is strong. The property of unbiasedness of the estimates does not require that the X's be uncorrelated. On the other hand, a sample with multicollinear X's may render the values of the estimates seriously imprecise.

2) If the intercorrelation between the explanatory variables is perfect, then the estimates of the coefficients are indeterminate.


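Proof: The three-variable regression model in deviation form is $y_i = \hat\beta_1x_{1i} + \hat\beta_2x_{2i} + e_i$, with

$$\hat\beta_1 = \frac{\left(\sum x_1y\right)\left(\sum x_2^2\right) - \left(\sum x_2y\right)\left(\sum x_1x_2\right)}{\left(\sum x_1^2\right)\left(\sum x_2^2\right) - \left(\sum x_1x_2\right)^2}$$

And

$$\hat\beta_2 = \frac{\left(\sum x_2y\right)\left(\sum x_1^2\right) - \left(\sum x_1y\right)\left(\sum x_1x_2\right)}{\left(\sum x_1^2\right)\left(\sum x_2^2\right) - \left(\sum x_1x_2\right)^2}$$

Assume that $x_{2i} = \lambda x_{1i}$, where λ is a non-zero constant. Then

$$\hat\beta_1 = \frac{\left(\sum x_1y\right)\left(\lambda^2\sum x_1^2\right) - \left(\lambda\sum x_1y\right)\left(\lambda\sum x_1^2\right)}{\left(\sum x_1^2\right)\left(\lambda^2\sum x_1^2\right) - \left(\lambda\sum x_1^2\right)^2} = \frac{\lambda^2\left[\left(\sum x_1y\right)\left(\sum x_1^2\right) - \left(\sum x_1y\right)\left(\sum x_1^2\right)\right]}{\lambda^2\left[\left(\sum x_1^2\right)^2 - \left(\sum x_1^2\right)^2\right]} = \frac{0}{0}$$

Similarly,

$$\hat\beta_2 = \frac{\left(\lambda\sum x_1y\right)\left(\sum x_1^2\right) - \left(\sum x_1y\right)\left(\lambda\sum x_1^2\right)}{\left(\sum x_1^2\right)\left(\lambda^2\sum x_1^2\right) - \left(\lambda\sum x_1^2\right)^2} = \frac{\lambda\left[\left(\sum x_1y\right)\left(\sum x_1^2\right) - \left(\sum x_1y\right)\left(\sum x_1^2\right)\right]}{\lambda^2\left[\left(\sum x_1^2\right)^2 - \left(\sum x_1^2\right)^2\right]} = \frac{0}{0}$$

Hence both estimates are indeterminate.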


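3) If the intercorrelation of the explanatory variables is perfect (equal to one), then the standard errors of the estimates become infinitely large.

Proof:

If $r_{12} = 1$, the standard errors of the estimates become infinitely large in the model with two explanatory variables:

$$\operatorname{Var}(\hat\beta_1) = \frac{\sigma^2\sum x_2^2}{\left(\sum x_1^2\right)\left(\sum x_2^2\right) - \left(\sum x_1x_2\right)^2}$$

Dividing the numerator and denominator by $\sum x_2^2$,

$$\operatorname{Var}(\hat\beta_1) = \frac{\sigma^2}{\sum x_1^2 - \dfrac{\left(\sum x_1x_2\right)^2}{\sum x_2^2}} = \frac{\sigma^2}{\sum x_1^2\left[1 - \dfrac{\left(\sum x_1x_2\right)^2}{\sum x_1^2\sum x_2^2}\right]} = \frac{\sigma^2}{\sum x_1^2\left(1 - r_{12}^2\right)}$$

Putting $r_{12} = 1$,

$$\operatorname{Var}(\hat\beta_1) = \frac{\sigma^2}{\sum x_1^2(1 - 1)} = \frac{\sigma^2}{0} = \infty$$

which is infinitely large.

Similarly:

$$\operatorname{Var}(\hat\beta_2) = \frac{\sigma^2\sum x_1^2}{\left(\sum x_1^2\right)\left(\sum x_2^2\right) - \left(\sum x_1x_2\right)^2} = \frac{\sigma^2}{\sum x_2^2\left[1 - \dfrac{\left(\sum x_1x_2\right)^2}{\sum x_1^2\sum x_2^2}\right]} = \frac{\sigma^2}{\sum x_2^2\left(1 - r_{12}^2\right)}$$

Putting $r_{12} = 1$,

$$\operatorname{Var}(\hat\beta_2) = \frac{\sigma^2}{\sum x_2^2(1 - 1)} = \frac{\sigma^2}{0} = \infty$$

which is infinitely large.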

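4) In the case of strong (but not perfect) multicollinearity the regression coefficients are determinate but their standard errors are large.

Proof:

$$\operatorname{Var}(\hat\beta_1) = \frac{\sigma^2}{\sum x_1^2\left(1 - r_{12}^2\right)}, \qquad \operatorname{Var}(\hat\beta_2) = \frac{\sigma^2}{\sum x_2^2\left(1 - r_{12}^2\right)}$$

When $|r_{12}| < 1$ the denominators are non-zero, so the estimates and their variances are determinate. However, as $|r_{12}|$ approaches one the factor $1/(1 - r_{12}^2)$ grows without limit. In the case of $r_{12} = 0$,

$$\operatorname{Var}(\hat\beta_1) = \frac{\sigma^2}{\sum x_1^2}$$

whereas if $r_{12} = 0.9$,

$$\operatorname{Var}(\hat\beta_1) = \frac{\sigma^2}{\sum x_1^2(1 - 0.81)} = \frac{\sigma^2}{0.19\sum x_1^2} \approx 5.26\,\frac{\sigma^2}{\sum x_1^2}$$

so the variance is more than five times larger than in the orthogonal case.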

5) In case of multicollinearity the confidence interval

becomes wider.

6) In the presence of multicollinearity the t-test will be

misleading.

7) In the presence of multicollinearity prediction is not

accurate.

7.7: DETECTION OF MULTICOLLINEARITY

1. The Farrar and Glauber Test of Multicollinearity

A statistical test for multicollinearity has been developed by

Farrar and Glauber. It is really a set of three tests.

a) The first test is a $\chi^2$ test for the detection of the existence and the severity of multicollinearity in a function including several explanatory variables.

Procedure:

i. $H_0$: the X's are orthogonal. $H_1$: the X's are not orthogonal.

ii. Choose the level of significance $\alpha$.

iii. Test statistic to be used

$$\chi^2 = -\left[n - 1 - \frac{1}{6}(2k + 5)\right]\ln|R|$$

with $\frac{1}{2}k(k-1)$ d.f.

iv. Computations: where $|R|$ is the value of the standardized correlation determinant and $k$ is the number of explanatory variables.

v. Critical Region: $\chi^2 > \chi^2_{\alpha}\left(\tfrac{1}{2}k(k-1)\right)$

vi. Conclusion: Reject $H_0$ if our calculated value is greater than the table value. Otherwise accept.

b) The second test is an F-test for locating which variables are multicollinear.

Procedure:

i. $H_0: R^2_{x_i \cdot x_1x_2 \dots x_k} = 0$ against $H_1: R^2_{x_i \cdot x_1x_2 \dots x_k} \neq 0$

ii. Choose the level of significance $\alpha$.

iii. Test statistic to be used

$$F = \frac{R^2_{x_i \cdot x_1x_2 \dots x_k}/(k-1)}{\left(1 - R^2_{x_i \cdot x_1x_2 \dots x_k}\right)/(n-k)}$$

with $(k-1,\ n-k)$ d.f.

iv. Computations: Compute the multiple correlation coefficients among the explanatory variables.

v. Critical Region: $F > F_{\alpha}(k-1,\ n-k)$

vi. Conclusion: Reject $H_0$ if our calculated value is greater than the table value. Otherwise accept.

c) The third test is a t-test for finding out the pattern of multicollinearity, that is, for determining which variables are responsible for the appearance of the multicollinear variable.

Procedure:

i. $H_0: r_{x_ix_j \cdot x_1x_2 \dots x_k} = 0$ against $H_1: r_{x_ix_j \cdot x_1x_2 \dots x_k} \neq 0$

ii. Choose the level of significance $\alpha$.

iii. Test statistic to be used

$$t = \frac{r_{x_ix_j \cdot x_1x_2 \dots x_k}\sqrt{n-k}}{\sqrt{1 - r^2_{x_ix_j \cdot x_1x_2 \dots x_k}}}$$

with $(n-k)$ d.f.

iv. Computations: Compute the partial correlation coefficients.

v. Critical Region: $|t| > t_{\alpha/2}(n-k)$

vi. Conclusion: Reject $H_0$ if our calculated value is greater than the table value. Otherwise accept.

2. High Pair Wise Correlation among Regressors

Multicollinearity exists if the pairwise or zero-order correlation coefficients between two regressors are very high.

3. Eigen Value and Condition Number

A condition number K is defined as

$$K = \frac{\text{Maximum eigenvalue}}{\text{Minimum eigenvalue}}$$

If K is between 100 and 1000, there is moderate to strong multicollinearity, and if it exceeds 1000 there is severe multicollinearity.

The condition index is defined as

$$CI = \sqrt{\frac{\text{Maximum eigenvalue}}{\text{Minimum eigenvalue}}} = \sqrt{K}$$

If the condition index lies between 10 and 30, there is moderate to strong multicollinearity, and if it exceeds 30 there is severe multicollinearity.

4. Tolerance and Variance Inflation Factor

As $R_j^2$, the coefficient of determination in the regression of the regressor $X_j$ on the remaining regressors in the model, increases towards unity, that is, as the collinearity of $X_j$ with the other regressors increases, the VIF also increases, and in the limit it can be infinite.

$$VIF_j = \frac{1}{\left(1 - R_j^2\right)}$$

Tolerance can also be used to detect multicollinearity. That is

$$\text{Tolerance}_j = \frac{1}{VIF_j} = \left(1 - R_j^2\right)$$
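The relation between $R_j^2$, VIF and tolerance is easy to tabulate. This is a small illustrative sketch; the function names and the chosen $R_j^2$ values are assumptions, and in practice $R_j^2$ would come from the auxiliary regression of $X_j$ on the other regressors.

```python
def vif(r2_j):
    """Variance inflation factor 1 / (1 - R_j^2)."""
    if not 0 <= r2_j < 1:
        raise ValueError("R_j^2 must be in [0, 1); VIF is unbounded as R_j^2 -> 1")
    return 1.0 / (1.0 - r2_j)


def tolerance(r2_j):
    """Tolerance 1 - R_j^2, the reciprocal of the VIF."""
    return 1.0 - r2_j


# Illustrative values of R_j^2 from a hypothetical auxiliary regression.
table = [(r2, vif(r2), tolerance(r2)) for r2 in (0.0, 0.5, 0.9, 0.99)]
for r2, v, tol in table:
    print(f"R_j^2={r2:<5} VIF={v:8.2f} tolerance={tol:.2f}")
```

The output shows how slowly the VIF rises at first and how sharply it grows as $R_j^2$ approaches one.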

5. High 𝑹𝟐 but Few Significant t-Ratios

If $R^2$ is high, the F-test in most cases will reject the hypothesis that the partial slope coefficients are simultaneously equal to zero, but the individual t-tests will show that none or very few of the partial slope coefficients are statistically different from zero. This is the symptom of multicollinearity.

6. Some Other Multivariate Methods

Like Principal Component Analysis (PCA), Factor

Analysis (FA) and Ridge Regression can also be used for

detection of multicollinearity.

7.8. REMEDIAL MEASURES OF MULTICOLLINEARITY

i. A Priori Information

Suppose we consider the model

$$Y_i = \beta_0 + \beta_1X_{1i} + \beta_2X_{2i} + U_i$$

Where $Y$ = consumption, $X_1$ = income and $X_2$ = wealth. The income and wealth variables tend to be highly collinear. Suppose that, a priori, $\beta_2 = 0.10\beta_1$, that is, the rate of change of consumption with respect to wealth is one-tenth the corresponding rate with respect to income. We can then run the regression

$$Y_i = \beta_0 + \beta_1X_{1i} + 0.10\beta_1X_{2i} + U_i = \beta_0 + \beta_1X_i + U_i$$

Where $X_i = X_{1i} + 0.10X_{2i}$.

Once we obtain $\hat\beta_1$ we can estimate $\hat\beta_2$ from the postulated relationship between $\beta_1$ and $\beta_2$.

ii. Combining Cross-sectional and Time Series Data

A variant of the extraneous or a priori information technique is the combination of cross-sectional and time series data, known as pooling the data. Combining cross-sectional and time series data may reduce multicollinearity.

iii. Dropping a Variable or Variables

When faced with severe multicollinearity one of the

simplest things to do is to drop one of the collinear variables.

In dropping a variable from the model we may be

committing a specification bias or specification error.

iv. Transformation of Variables

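One way of minimizing this dependence is to proceed as follows. Suppose the time series model is

$$Y_t = \beta_0 + \beta_1X_{1t} + \beta_2X_{2t} + U_t \quad \text{…eq(1)}$$

If the above relation holds at time “t” it must also hold at time “t-1”, because the origin of time is arbitrary; therefore we have

$$Y_{t-1} = \beta_0 + \beta_1X_{1,t-1} + \beta_2X_{2,t-1} + U_{t-1}$$

Subtracting this from eq(1), we get

$$Y_t - Y_{t-1} = \beta_1(X_{1t} - X_{1,t-1}) + \beta_2(X_{2t} - X_{2,t-1}) + V_t \quad \text{…eq(2)}$$

where $V_t = U_t - U_{t-1}$. Eq(2) is known as the first difference form.

The first difference regression model often reduces the severity of multicollinearity.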
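The effect of first differencing on two trending regressors can be illustrated with a toy calculation. The two series below are made-up numbers chosen only so that both rise over time; `corr` and `first_difference` are illustrative helper names.

```python
def corr(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5


def first_difference(series):
    """Return the series of changes X_t - X_{t-1}."""
    return [cur - prev for prev, cur in zip(series, series[1:])]


# Hypothetical regressors that share a common upward trend.
x1 = [10, 12, 13, 15, 16, 18]
x2 = [20, 21, 23, 25, 26, 27]

r_levels = corr(x1, x2)
r_diff = corr(first_difference(x1), first_difference(x2))

print(f"correlation in levels           : {r_levels:.3f}")
print(f"correlation in first differences: {r_diff:.3f}")
```

In levels the common trend dominates and the correlation is close to one, while the period-to-period changes are only weakly related. Note that differencing also changes the error term to $U_t - U_{t-1}$, so it is not a costless remedy.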

v. Additional or New Data

Since multicollinearity is a sample feature, it is possible that in another sample involving the same variables multicollinearity may not be as serious as in the first sample. Sometimes simply increasing the size of the sample may reduce the multicollinearity problem.

vi. Other Methods

Multivariate statistical techniques such as factor analysis and principal components, or other techniques such as ridge regression, are often employed to solve the problem of multicollinearity.


7.9: Exercise

1) Explain the problem of multicollinearity and its types.

2) Explain the methods for detection of multicollinearity.

3) Describe the consequences of multicollinearity.

4) How would you proceed for estimation of parameters in

the presence of perfect multicollinearity?

5) Define any four methods for removal of multicollinearity.

6) Apply Farrar and Glauber test to the following data:

6 6 6.5 7.6 9

40.1 40.3 47.5 58 64.7

5.5 4.7 5.2 8.7 17.1

108 94 108 99 93

7) Find severity, location and pattern of multicollinearity to

the following data:


Chapter: 8

HETEROSCEDASTICITY

8.1. NATURE OF HETEROSCEDASTICITY

One of the important assumptions of the classical linear regression model is that the variance of each disturbance term $U_i$ is equal to $\sigma^2$. This is the assumption of homoscedasticity.

Symbolically, $E[U_i^2] = \sigma^2$

If this assumption of homoscedasticity fails, that is:

$$E[U_i^2] \neq \sigma^2$$

Then we say that the U's are heteroscedastic. That is

$$E[U_i^2] = \sigma_i^2$$

Where the subscript 'i' reflects the fact that the individual variances may all be different.

DIFFERENCE BETWEEN HOMOSCEDASTICITY AND

HETEROSCEDASTICITY

Homoscedasticity is the situation in which the probability distribution of the disturbance term remains the same over all observations of 'X', and in particular the variance of each $U_i$ is the same for all values of the explanatory variables.

Heteroscedasticity is the situation in which the probability distribution of the disturbance term does not remain the same over all observations of 'X', and in particular the variance of each $U_i$ is not the same for all values of the explanatory variables.

8.1.1. Reasons of Heteroscedasticity

i. Error Learning Model


As people learn, their errors of behaviour become smaller over time. In this case $\sigma_i^2$ is expected to decrease; e.g. as the number of hours of typing practice increases, the average number of typing errors as well as their variance decreases.

ii. Data Collection Technique

Another reason for heteroscedasticity is the data collection technique. As data collection techniques improve, $\sigma_i^2$ is likely to decrease.

iii. Variance in Cross-Section and Time Series Data

In cross-sectional data the variance is usually greater than in time series data, because in cross-sectional data one usually deals with members of a population at a given point in time.

iv. Due to Specification Error

Heteroscedasticity also arises from specification errors; because of such errors the variance tends to vary.

8.2. OLS ESTIMATION OF HETEROSCEDASTICITY

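Let us use the two-variable model

$$Y_i = \beta_0 + \beta_1X_i + U_i, \qquad E[U_i^2] = \sigma_i^2$$

In deviation form, $y_i = \beta_1x_i + (U_i - \bar{U})$ and $\sum x_i = 0$, so

$$\hat\beta_1 = \frac{\sum x_iy_i}{\sum x_i^2} = \frac{\sum x_i(\beta_1x_i + U_i)}{\sum x_i^2}$$

$$\hat\beta_1 = \beta_1 + \frac{\sum x_iU_i}{\sum x_i^2}$$

$$E(\hat\beta_1) = E(\beta_1) + \frac{\sum x_iE(U_i)}{\sum x_i^2}$$

$$E(\hat\beta_1) = \beta_1$$

Which shows that $\hat\beta_1$ is still an unbiased estimator of $\beta_1$, even in the presence of heteroscedasticity.

Variance of $\hat\beta_1$:

By definition

$$\operatorname{Var}(\hat\beta_1) = E[\hat\beta_1 - E(\hat\beta_1)]^2$$

$$\operatorname{Var}(\hat\beta_1) = E[\hat\beta_1 - \beta_1]^2$$

$$\operatorname{Var}(\hat\beta_1) = E\left[\frac{\sum x_iU_i}{\sum x_i^2}\right]^2$$

$$\operatorname{Var}(\hat\beta_1) = E\left[\frac{x_1^2U_1^2 + x_2^2U_2^2 + \dots + x_n^2U_n^2 + 2x_1x_2U_1U_2 + \dots}{\left(\sum x_i^2\right)^2}\right]$$

$$\operatorname{Var}(\hat\beta_1) = \frac{\sum x_i^2E(U_i^2) + 2\sum_{i<j}x_ix_jE(U_iU_j)}{\left(\sum x_i^2\right)^2}$$

By the assumption of heteroscedasticity

$$E(U_i^2) = \sigma_i^2, \qquad E(U_iU_j) = 0 \ (i \neq j)$$

$$\operatorname{Var}(\hat\beta_1) = \frac{\sum x_i^2\sigma_i^2}{\left(\sum x_i^2\right)^2}$$

which reduces to the usual $\sigma^2/\sum x_i^2$ only when $\sigma_i^2 = \sigma^2$ for all $i$.

In the presence of heteroscedasticity, we observe that the OLS estimator is still linear, unbiased and consistent but not BLUE; that is, $\hat\beta_1$ is not efficient, because it does not have minimum variance in the class of unbiased estimators in the presence of heteroscedasticity.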


8.3: CONSEQUENCES OF HETEROSCEDASTICITY

1) The OLS estimators in the presence of heteroscedasticity are still linear, unbiased and consistent.

2) In the presence of heteroscedasticity the OLS estimators are not BLUE, that is, they do not have minimum variance in the class of unbiased estimators.

3) In the presence of heteroscedasticity the confidence intervals of the OLS estimators are wider.

4) In the presence of heteroscedasticity the 't' and 'F' tests are misleading.

8.4: DETECTION OF HETEROSCEDASTICITY

1. The Park Test

Professor Park suggested that $\sigma_i^2$ is some function of the explanatory variable $X_i$. The functional form is

$$\sigma_i^2 = \sigma^2X_i^{\beta}e^{v_i}$$

Where $v_i$ is the stochastic disturbance term.

Taking ln on both sides, we get

$$\ln\sigma_i^2 = \ln\sigma^2 + \beta\ln X_i + v_i$$

Since $\sigma_i^2$ is generally not known, Park suggests using $e_i^2$ as a proxy and running the following regression

$$\ln e_i^2 = \ln\sigma^2 + \beta\ln X_i + v_i = \alpha + \beta\ln X_i + v_i$$

If $\beta$ turns out to be statistically significant, it means heteroscedasticity is present in the data; otherwise it is not present.

Two stages of Park test:

Stage 1: we run the OLS regression of $Y$ on $X$ and obtain the residuals $e_i$.

Stage 2: again we run OLS with $\ln e_i^2$ as the dependent variable.

2. Glejser Test

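The Glejser test is similar in spirit to the Park test. The difference is that Glejser suggests as many as six functional forms while Park suggested only one functional form. Furthermore, Glejser used absolute values of $e_i$. Glejser used the following functional forms to detect heteroscedasticity.

I. $|e_i| = \beta_1 + \beta_2X_i + v_i$

II. $|e_i| = \beta_1 + \beta_2\sqrt{X_i} + v_i$

III. $|e_i| = \beta_1 + \beta_2\left(\dfrac{1}{X_i}\right) + v_i$

IV. $|e_i| = \beta_1 + \beta_2\left(\dfrac{1}{\sqrt{X_i}}\right) + v_i$

V. $|e_i| = \sqrt{\beta_1 + \beta_2X_i} + v_i$

VI. $|e_i| = \sqrt{\beta_1 + \beta_2X_i^2} + v_i$

Stages of Glejser test:

Stage 1: Fit a model of Y on X and compute the residuals $e_i$.

Stage 2: Take the absolute value of $e_i$ and then regress $|e_i|$ on X using any one of the functional forms.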

3. Spearman Rank Correlation Test

The rank correlation coefficient can be used to detect heteroscedasticity. That is

Step 1: State the hypotheses

$H_0: \rho = 0$, $\quad H_1: \rho \neq 0$

Step 2: Fit the regression of Y on X and compute the residuals $e_i$.

Step 3: Take the absolute values of $e_i$. Rank both $|e_i|$ and X in ascending or descending order, then compute

$$r_s = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$

Where $d_i = \text{rank of } |e_i| - \text{rank of } X_i$.

Step 4: For $n > 8$

$$t = \frac{r_s\sqrt{n-2}}{\sqrt{1 - r_s^2}}$$

with $(n-2)$ d.f.

Step 5: C.R: $|t| > t_{\alpha/2}(n-2)$

Step 6: Conclusion: As usual.
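The steps of the rank correlation test can be sketched as follows. The absolute residuals in `abs_e` are hypothetical values invented for illustration (in practice they come from Step 2), and the simple `ranks` helper assumes there are no tied values.

```python
from math import sqrt


def ranks(values):
    """Ranks 1..n in ascending order (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r


def spearman_test(x, abs_resid):
    """Return the rank correlation r_s and its t statistic with n - 2 d.f."""
    n = len(x)
    d2 = sum((rx - re) ** 2 for rx, re in zip(ranks(x), ranks(abs_resid)))
    r_s = 1 - 6 * d2 / (n * (n ** 2 - 1))
    t = r_s * sqrt(n - 2) / sqrt(1 - r_s ** 2)
    return r_s, t


# Hypothetical data: |e_i| tends to grow with X.
x = list(range(1, 11))
abs_e = [0.2, 0.1, 0.4, 0.3, 0.6, 0.5, 0.8, 0.7, 1.0, 0.9]

r_s, t_stat = spearman_test(x, abs_e)
print(f"r_s = {r_s:.4f}, t = {t_stat:.3f} with {len(x) - 2} d.f.")
```

Here the computed $t$ would be compared with the tabulated $t_{\alpha/2}$ for 8 d.f.; a large value points to heteroscedasticity.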

4. Goldfeld Quandt Test

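This test is applicable to large samples. The observations must be at least twice as many as the parameters to be estimated.

Step 1. State null and alternative hypothesis.

Step 2. Choose the level of significance $\alpha$.

Step 3. Test statistic to be used

$F = \dfrac{RSS_2 / \left(\dfrac{n - c}{2} - K\right)}{RSS_1 / \left(\dfrac{n - c}{2} - K\right)} = \dfrac{RSS_2}{RSS_1}$

With $v_1 = v_2 = \left(\dfrac{n - c}{2} - K\right) = \left(\dfrac{n - c - 2K}{2}\right)$ degrees of freedom, where $RSS_1$ and $RSS_2$ are the residual sums of squares of the two sub-sample regressions described in Step 4.

Step 4. Computation: Where C is the number of central observations omitted and K is the number of parameters estimated.

i. We arrange the observations in ascending or descending order of magnitude of X.

ii. We arbitrarily select a certain number “C” of central observations, which we omit from the analysis. “C” should be at least one fourth of the observations for n>30.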


iii. The remaining (n-c) observations are divided into two sub-samples of equal size $(n - c)/2$, one including the small values of “X” and the other the large values.

iv. We fit a separate regression line to each sub-sample and obtain the sum of squared residuals from each of them, that is $RSS_1$ (small values of X) and $RSS_2$ (large values of X).

v. Compute the value of F.

Step 5. C.R: $F > F_{\alpha}(v_1, v_2)$

Step 6. Conclusion:

If the calculated value is greater than the table value, we reject the null hypothesis and may conclude that there is heteroscedasticity.
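The computation in Step 4 can be sketched as below. The function names are hypothetical, K = 2 is assumed (intercept and slope), and the data are invented so that the spread of Y increases with X.

```python
import numpy as np

def rss(x, y):
    """Residual sum of squares from a simple OLS fit of y on x."""
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def goldfeld_quandt(x, y, c, k=2):
    """Return (F, d.f.) after omitting c central observations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)                 # i. order observations by X
    x, y = x[order], y[order]
    n = len(x)
    m = (n - c) // 2                      # iii. size of each sub-sample
    rss1 = rss(x[:m], y[:m])              # iv. small values of X
    rss2 = rss(x[n - m:], y[n - m:])      #     large values of X
    df = m - k
    return (rss2 / df) / (rss1 / df), df  # v. F statistic

# Hypothetical data: the error term is +x or -x
x = np.arange(1.0, 31.0)
signs = np.where(np.arange(30) % 2 == 0, 1.0, -1.0)
y = 1.0 + 2.0 * x + x * signs

f_stat, df = goldfeld_quandt(x, y, c=6)
print(f_stat, df)
```

The value of F is then compared with the table value for (df, df) degrees of freedom.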

8.5: REMEDIAL MEASURES OF HETEROSCEDASTICITY

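There are two approaches of remediation:

(a) When $\sigma_i^2$ is known.

(b) When $\sigma_i^2$ is not known.

(a) When $\sigma_i^2$ is known

The most straightforward method of correcting heteroscedasticity when $\sigma_i^2$ is known is weighted least squares, for the estimators thus obtained are BLUE, i.e.

$Y_i = \beta_1 + \beta_2 X_i + u_i$

Dividing by $\sigma_i$ on both sides.

$\dfrac{Y_i}{\sigma_i} = \beta_1 \left(\dfrac{1}{\sigma_i}\right) + \beta_2 \left(\dfrac{X_i}{\sigma_i}\right) + \dfrac{u_i}{\sigma_i}$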


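(b) When $\sigma_i^2$ is unknown

We consider the two variable regression model. That is

$Y_i = \beta_1 + \beta_2 X_i + u_i$

Now we consider several assumptions about the pattern of heteroscedasticity.

I. The error variance is proportional to $X_i^2$. That is

$E(u_i^2) = \sigma^2 X_i^2$.

Proof: Dividing the original model by $X_i$.

$\dfrac{Y_i}{X_i} = \dfrac{\beta_1}{X_i} + \beta_2 + \dfrac{u_i}{X_i} = \beta_1 \dfrac{1}{X_i} + \beta_2 + v_i$

Where $v_i = u_i / X_i$ is the disturbance term.

Taking squaring and expectation on both sides.

$E(v_i^2) = E\left(\dfrac{u_i}{X_i}\right)^2 = \dfrac{1}{X_i^2} E(u_i^2)$

$E(v_i^2) = \dfrac{1}{X_i^2} (\sigma^2 X_i^2) = \sigma^2$

Hence the variance of $v_i$ is homoscedastic.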

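II. The error variance is proportional to $X_i$. That is

$E(u_i^2) = \sigma^2 X_i$.

Proof: The original model can be transformed as:

$\dfrac{Y_i}{\sqrt{X_i}} = \dfrac{\beta_1}{\sqrt{X_i}} + \beta_2 \sqrt{X_i} + \dfrac{u_i}{\sqrt{X_i}} = \beta_1 \dfrac{1}{\sqrt{X_i}} + \beta_2 \sqrt{X_i} + v_i$

Where $v_i = u_i / \sqrt{X_i}$ is the disturbance term.

Taking squaring and expectation on both sides.

$E(v_i^2) = E\left(\dfrac{u_i}{\sqrt{X_i}}\right)^2 = \dfrac{1}{X_i} E(u_i^2)$

$E(v_i^2) = \dfrac{1}{X_i} (\sigma^2 X_i) = \sigma^2$

Hence the variance of $v_i$ is homoscedastic.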

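III. The error variance is proportional to the square of the mean value of “Y”. That is

$E(u_i^2) = \sigma^2 [E(Y_i)]^2$.

Proof: The original model can be transformed as:

$\dfrac{Y_i}{E(Y_i)} = \dfrac{\beta_1}{E(Y_i)} + \beta_2 \dfrac{X_i}{E(Y_i)} + \dfrac{u_i}{E(Y_i)} = \beta_1 \dfrac{1}{E(Y_i)} + \beta_2 \dfrac{X_i}{E(Y_i)} + v_i$

Where $v_i = u_i / E(Y_i)$ is the disturbance term.

Taking squaring and expectation on both sides.

$E(v_i^2) = E\left(\dfrac{u_i}{E(Y_i)}\right)^2 = \dfrac{1}{[E(Y_i)]^2} E(u_i^2)$

$E(v_i^2) = \dfrac{1}{[E(Y_i)]^2} \sigma^2 [E(Y_i)]^2 = \sigma^2$

Hence the variance of $v_i$ is homoscedastic.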

IV. A log transformation such as:

$\ln Y_i = \beta_1 + \beta_2 \ln X_i + u_i$

Reduces heteroscedasticity, when compared with the regression: $Y_i = \beta_1 + \beta_2 X_i + u_i$.
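As an illustration of the transformations above, the sketch below applies the first one (error variance proportional to the square of X): the model is divided through by X and OLS is run on the transformed variables. The function name is hypothetical and the data are noiseless, chosen only so that the recovered coefficients can be checked by eye.

```python
import numpy as np

def wls_variance_prop_x2(x, y):
    """OLS on Y/X = b1*(1/X) + b2 + v, assuming Var(u_i) = sigma^2 * X_i^2.

    In the transformed model the coefficient on 1/X is the original
    intercept b1 and the constant term is the original slope b2.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X_star = np.column_stack([1.0 / x, np.ones(len(x))])
    coef, *_ = np.linalg.lstsq(X_star, y / x, rcond=None)
    return coef[0], coef[1]

# Hypothetical noiseless data generated from Y = 3 + 2X
x = np.arange(1.0, 11.0)
y = 3.0 + 2.0 * x

b1, b2 = wls_variance_prop_x2(x, y)
print(b1, b2)   # approximately 3 and 2
```

The other transformations follow the same pattern with a different divisor.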


8.9: Exercise

a) Define heteroscedasticity. What are the consequences of the violation of the assumption of homoscedasticity?

b) Review the suggested approaches to estimation of a regression model in the presence of heteroscedasticity.

c) Discuss the three methods for detection of heteroscedasticity.

d) What are the solutions of heteroscedasticity?

e) Apply Goldfeld and Quandt test on the following data to test whether there is heteroscedasticity or not.

X 20 25 23 18 26 27 29 31 22 27 32 35 40 41 39

Y 18 17 16 10 8 15 16 20 18 17 19 18 26 25 23

f) Given

Year Y 2000 3.5 15 16 -0.16

2001 4.5 20 13 0.43

2002 5.0 30 10 0.12

2003 6.0 42 7 0.22

2004 7.0 50 7 -0.50

2005 9.0 54 5 1.25

2006 8.0 65 4 -1.31

2007 12.0 8.5 3.5 -0.43

2008 14.0 90 2 1.07

Test heteroscedasticity by Spearman's rank test.

g) Consider the model:

Using the data below apply Park-Glejser test.

Year Y X

2002 37 4.5

2003 48 6.5

2004 45 3.5

2005 36 3.0


Chapter: 9

AUTOCORRELATION

9.1: INTRODUCTION

Autocorrelation refers to a case in which the error term in one time period is correlated with the error term in any other time period. It may be defined as “correlation between members of series of observations ordered in time, as in the case of time series data, or space, as in the case of cross-sectional data”.

One of the assumptions of the linear regression model is that there is zero correlation between error terms. That is

$E(u_i u_j) = 0$, $i \neq j$

If the above assumption is not satisfied then there is autocorrelation, that is, the value of $u$ in any particular period is correlated with its own preceding value or values. Therefore it is known as autocorrelation or serial correlation. That is $E(u_i u_j) \neq 0$. Autocorrelation is a special case of correlation: it refers to the relationship not between two different variables but between the successive values of the same variable.

Autocorrelation:

Lag correlation of a given series with itself is called autocorrelation; thus correlation between two time series such as $u_1, u_2, \ldots, u_{10}$ and $u_2, u_3, \ldots, u_{11}$ is called autocorrelation.

Serial Correlation:

Lag correlation between two different series is called serial correlation; thus correlation between two different series such as $u_1, u_2, \ldots, u_{10}$ and $v_2, v_3, \ldots, v_{11}$ is called serial correlation.


9.2. REASONS OF AUTOCORRELATION

There are several causes of autocorrelation.

1) Omitting Explanatory Variables:

Most economic variables tend to be autocorrelated. If an autocorrelated variable has been excluded from the set of explanatory variables, its influence will be reflected in the random variable “U”, whose values will be autocorrelated.

2) Misspecification of the Mathematical Model:

If we have adopted a mathematical form which differs from the true form of the relationship, the U's may show serial correlation.

3) Specification Bias:

Autocorrelation also arises from specification bias, that is, from excluding true variables from the model or from using the wrong functional form.

4) Lags:

Regression models using lagged values in time series data occur relatively often in economics, business and some fields of engineering. If we neglect the lagged term of an autoregressive model, the resulting error term will reflect a systematic pattern and therefore autocorrelation will be present.

5) Data Manipulation:

For empirical analysis, the raw data are often

manipulated. Manipulation introduces smoothness into the

raw data by dampening the fluctuations. This manipulation


leads to a systematic pattern and therefore, autocorrelation

will be there.

9.3. OLS ESTIMATION IN THE PRESENCE OF

AUTOCORRELATION

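Assume that the disturbances follow the first order autoregressive scheme $u_t = \rho u_{t-1} + v_t$ with $|\rho| < 1$, where $E(v_t) = 0$, $E(v_t^2) = \sigma_v^2$ and $E(v_t v_s) = 0$ for $t \neq s$. By successive substitution,

$u_t = v_t + \rho v_{t-1} + \rho^2 v_{t-2} + \cdots = \sum_{r=0}^{\infty} \rho^r v_{t-r}$

Mean:

Taking expectations on both sides

$E[u_t] = E\left[\sum_{r=0}^{\infty} \rho^r v_{t-r}\right]$

$E[u_t] = \sum_{r=0}^{\infty} \rho^r E[v_{t-r}]$

$E[u_t] = 0$

Variance: By definition:

$Var(u_t) = E[u_t - E(u_t)]^2 = E[u_t^2]$

$= E\left[\sum_{r=0}^{\infty} \rho^r v_{t-r}\right]^2$

$= \sum_{r=0}^{\infty} \rho^{2r} E[v_{t-r}^2]$, since the cross products have zero expectation

$= \sigma_v^2 \sum \rho^{2r}$, r=0, 1, 2, 3...

$= \sigma_v^2 [1 + \rho^2 + \rho^4 + \rho^6 + \cdots]$

The expression in brackets is the sum of a geometric progression with an infinite number of terms.

$S_n = \dfrac{a(1 - \lambda^n)}{1 - \lambda}$

Where $a$ is the first term of the geometric progression and $\lambda$ is the common ratio; when $|\lambda| < 1$ and $n \to \infty$, the formula reduces to

$S = \dfrac{a}{1 - \lambda}$

By using this formula with $a = 1$ and $\lambda = \rho^2$, we get

$\sigma_u^2 = \sigma_v^2 \left[\dfrac{1}{1 - \rho^2}\right]$

Where $|\rho| < 1$.

Covariance:

$Cov(u_t, u_{t-1}) = E[u_t - E(u_t)][u_{t-1} - E(u_{t-1})]$

$= E[u_t u_{t-1}]$

Given that

$u_t = v_t + \rho v_{t-1} + \rho^2 v_{t-2} + \cdots$ and $u_{t-1} = v_{t-1} + \rho v_{t-2} + \rho^2 v_{t-3} + \cdots$

$E[u_t u_{t-1}] = E[\{v_t + \rho (v_{t-1} + \rho v_{t-2} + \cdots)\}\{v_{t-1} + \rho v_{t-2} + \cdots\}]$

$= E[v_t (v_{t-1} + \rho v_{t-2} + \cdots)] + \rho E[(v_{t-1} + \rho v_{t-2} + \cdots)^2]$

$= 0 + \rho E[u_{t-1}^2]$

$= \rho \sigma_v^2 \left[\dfrac{1}{1 - \rho^2}\right]$

$Cov(u_t, u_{t-1}) = \rho \sigma_u^2$

Similarly:

$Cov(u_t, u_{t-2}) = \rho^2 \sigma_u^2$

In general

$Cov(u_t, u_{t-s}) = \rho^s \sigma_u^2$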
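The closed forms for a first order autoregressive disturbance can be checked numerically. The sketch below assumes the scheme $u_t = \rho u_{t-1} + v_t$ with $|\rho| < 1$, writes $u_t$ as a weighted sum of current and past $v$'s, truncates that sum at a large number of terms, and compares the result with $\sigma_v^2 / (1 - \rho^2)$ and $\rho^s \sigma_u^2$. The function name and parameter values are hypothetical.

```python
import numpy as np

def ar1_moments_from_series(rho, sigma_v2, max_lag, terms=2000):
    """Variance and autocovariances of u_t = sum_r rho^r * v_{t-r}.

    The infinite sum is truncated at `terms` terms, which is harmless
    when |rho| < 1 because the weights die out geometrically.
    """
    w = rho ** np.arange(terms)                      # weights rho^r
    var_u = sigma_v2 * np.sum(w ** 2)                # sigma_v^2 * sum rho^(2r)
    covs = np.array([
        sigma_v2 * np.sum(w[s:] * w[:terms - s])     # sum rho^(r+s) * rho^r
        for s in range(max_lag + 1)
    ])
    return var_u, covs

rho, sigma_v2 = 0.6, 1.0
var_u, covs = ar1_moments_from_series(rho, sigma_v2, max_lag=3)

var_closed = sigma_v2 / (1.0 - rho ** 2)             # geometric series sum
covs_closed = var_closed * rho ** np.arange(4)       # rho^s * sigma_u^2
print(var_u, var_closed)
print(covs, covs_closed)
```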


9.4. CONSEQUENCES OF AUTOCORRELATION

The consequences of applying the OLS method in the presence of autocorrelation are as follows.

1. The least squares estimators are unbiased even when the disturbances are correlated.

2. With autocorrelated values of the disturbance term, the OLS variances of the parameter estimates are likely to be larger than those of other econometric methods, so they do not have minimum variance, that is, they are not BLUE.

3. If the values of $u$ are autocorrelated, predictions based on ordinary least squares estimates will be inefficient in the sense that they will have larger variances as compared to others.

4. In the presence of autocorrelation the “t” and “F” tests are likely to give misleading conclusions.

5. The variance of the random term “U” may be seriously underestimated if the U's are autocorrelated.

9.5. DETECTION OF AUTOCORRELATION

1. Durbin Watson d-Statistic

This test was developed by Durbin and Watson to examine whether autocorrelation exists in a given situation or not. The Durbin Watson 'd' statistic is defined as follows:

$d = \dfrac{\sum_{t=2}^{n} (\hat{u}_t - \hat{u}_{t-1})^2}{\sum_{t=1}^{n} \hat{u}_t^2}$

Where $\hat{u}_t$ are the OLS residuals. If $\hat{\rho} = \dfrac{\sum \hat{u}_t \hat{u}_{t-1}}{\sum \hat{u}_t^2}$, then for large samples

$d \approx 2[1 - \hat{\rho}]$

The d-statistic is simply the ratio of the sum of squared differences in successive residuals to the RSS (residual sum of squares). Note that in the numerator of the d-statistic the number of observations is $(n - 1)$, because one observation is lost in taking successive differences.
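The ratio described above takes one line to compute. The sketch below is an illustration with a hypothetical function name and a made-up residual series that alternates in sign, which is the pattern of strong negative autocorrelation.

```python
import numpy as np

def durbin_watson(resid):
    """Sum of squared successive differences divided by the RSS."""
    e = np.asarray(resid, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Hypothetical residuals: successive differences are -2, 2, -2
e = [1.0, -1.0, 1.0, -1.0]
d = durbin_watson(e)
print(d)   # 12 / 4 = 3.0
```

Values of d near 2 point to no first order autocorrelation, values near 0 to positive and values near 4 to negative autocorrelation.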

Assumption of Durbin Watson d-Statistic

1. The regression model includes the intercept term.

2. The explanatory variables X's are non-stochastic, or fixed in repeated sampling.

3. The disturbance terms U's are generated by the first order autoregressive scheme, i.e. $u_t = \rho u_{t-1} + v_t$

4. The regression model does not include lagged values of the dependent variable Y.

5. There are no missing observations in the data.


9.6. REMEDIAL MEASURES OF AUTOCORRELATION

There are two types of remedial measures: when $\rho$ is known and when $\rho$ is unknown.

I. When $\rho$ is known

The problem of autocorrelation can be easily solved if the coefficient of first order autocorrelation $\rho$ is known.

II. When $\rho$ is not known

There are different ways of estimating $\rho$.

i. The First-Difference Method

ii. Durbin-Watson d-Statistic
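When the coefficient of first order autocorrelation is known, the usual remedy is the generalized difference transformation: subtract $\rho$ times the lagged observation from each variable and apply OLS to the transformed data. The sketch below assumes a two variable model $Y_t = \beta_1 + \beta_2 X_t + u_t$ and a known $\rho$ with $|\rho| < 1$. The function name and the noiseless data are hypothetical. When $\rho$ is unknown, a rough estimate such as $\hat{\rho} \approx 1 - d/2$ from the Durbin-Watson statistic could be passed in instead.

```python
import numpy as np

def generalized_difference(x, y, rho):
    """OLS on Y_t - rho*Y_{t-1} = b1*(1 - rho) + b2*(X_t - rho*X_{t-1}) + v_t.

    Returns estimates of the original intercept b1 and slope b2.
    The first observation is lost in the transformation.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    y_star = y[1:] - rho * y[:-1]
    x_star = x[1:] - rho * x[:-1]
    X = np.column_stack([np.ones(len(x_star)), x_star])
    coef, *_ = np.linalg.lstsq(X, y_star, rcond=None)
    return coef[0] / (1.0 - rho), coef[1]

# Hypothetical noiseless data from Y = 1 + 2X, with rho assumed known
x = np.arange(1.0, 11.0)
y = 1.0 + 2.0 * x

b1, b2 = generalized_difference(x, y, rho=0.5)
print(b1, b2)   # approximately 1 and 2
```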


9.7: Exercise

1) What is autocorrelation? Discuss its consequences.

2) Differentiate between autocorrelation and serial

correlation. What are its various sources?

3) How can one detect autocorrelation?

4) In the presence of autocorrelation how can one

obtain efficient estimates?

5) Describe briefly Durbin Watson d-statistic.

6) Apply Durbin Watson d-statistic to the following

data:

Y X

2 1 1.37

2 2 0.46

2 3 0.45

1 4 -2.36

3 5 1.27

5 6 -0.81

6 7 -0.09

6 8 -1.00

10 9 2.08

10 10 1.17

10 11 0.27

12 12 1.36

15 13 3.44

10 14 -2.46

11 15 2.37


Chapter: 10

SIMULTANEOUS EQUATION MODELS

10.1: INTRODUCTION

There are two types of simultaneous equation models:

1. Simultaneous Equation Models

2. Recursive Equation Models

1. Simultaneous Equation Models

When the dependent variable in one equation is also an independent variable in some other equation, we call it a simultaneous equations system or model. The variables entering a simultaneous equation model are of two types:

i. Endogenous variable ii. Exogenous variable

i. Endogenous variable

A variable whose values are determined within the model is called an endogenous variable.

ii. Exogenous variable

A variable whose values are determined outside the model is called an exogenous variable. These variables are treated as nonstochastic.

2. Recursive Equation Models

In this model one dependent variable may be a function of another dependent variable, but that other dependent variable is not a function of the first.


10.2: SYSTEM OF SIMULTANEOUS EQUATION

“A system describing the joint dependence of variables is called a system of simultaneous equations.”

If “Y” is a function of “X”, i.e. Y=f(X), but “X” is also a function of “Y”, i.e. X=f(Y), we cannot describe the relationship between Y and X by using a single equation.

We must use a multi-equation model which includes separate equations in which Y and X each appear as an endogenous variable, although each might appear as an explanatory variable in other equations of the model.

10.3: Simultaneous Equation Bias

It refers to the overestimation or underestimation of the structural parameters obtained from the application of OLS to the structural equations. This bias results because those endogenous variables of the system which are also explanatory variables are correlated with the error term.

Structural Equations and Parameters

Structural equations describe the structure of an economy or the behavior of some economic agents such as consumers or producers. There is only one structural equation for each of the endogenous variables of the system.

The coefficients of the structural equations are called structural parameters and express the direct effect of each explanatory variable on the dependent variable.

Reduced Form Equations

These are equations obtained by solving the system of structural equations so as to express each endogenous variable as a function of only the exogenous variables of the system. Since the exogenous variables of the system are uncorrelated with the error term, OLS gives consistent estimates of the reduced form parameters. These measure the total (direct and indirect) effect of a change in the exogenous variables on the endogenous variables and may be used to obtain consistent structural parameter estimates.

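Example:

Considering Keynesian model of consumption and income function:

$C_t = \beta_0 + \beta_1 Y_t + u_t$ …………. (i)

$Y_t = C_t + I_t$ …………. (ii)

Here $C_t$ and $Y_t$ are endogenous variables and $I_t$ is an exogenous variable; both are structural equations.

Putting eq (i) in eq (ii).

$Y_t = \beta_0 + \beta_1 Y_t + u_t + I_t$

$Y_t (1 - \beta_1) = \beta_0 + I_t + u_t$

$Y_t = \dfrac{\beta_0}{1 - \beta_1} + \dfrac{1}{1 - \beta_1} I_t + \dfrac{u_t}{1 - \beta_1}$ ……(*)

$Y_t = \pi_0 + \pi_1 I_t + w_t$ ……. (iii)

Putting eq (*) in eq (i).

$C_t = \beta_0 + \beta_1 \left[\dfrac{\beta_0}{1 - \beta_1} + \dfrac{1}{1 - \beta_1} I_t + \dfrac{u_t}{1 - \beta_1}\right] + u_t$

$C_t = \dfrac{\beta_0}{1 - \beta_1} + \dfrac{\beta_1}{1 - \beta_1} I_t + \dfrac{u_t}{1 - \beta_1}$

$C_t = \pi_2 + \pi_3 I_t + w_t$ ……. (iv)

Here $\beta_0$ and $\beta_1$ are two structural parameters, $\pi_0, \pi_1, \pi_2, \pi_3$ are four reduced form coefficients.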
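The link between structural parameters and reduced form coefficients can be made concrete with a small sketch. It assumes the Keynesian model $C_t = \beta_0 + \beta_1 Y_t + u_t$, $Y_t = C_t + I_t$, for which the reduced form coefficients are $\pi_0 = \pi_2 = \beta_0/(1-\beta_1)$, $\pi_1 = 1/(1-\beta_1)$ and $\pi_3 = \beta_1/(1-\beta_1)$. The function names and the parameter values are hypothetical.

```python
import math

def reduced_form(beta0, beta1):
    """Reduced form coefficients (pi0, pi1, pi2, pi3) of the Keynesian model."""
    denom = 1.0 - beta1
    return beta0 / denom, 1.0 / denom, beta0 / denom, beta1 / denom

def structural_from_reduced(pi0, pi1, pi2, pi3):
    """Recover (beta0, beta1) from the reduced form coefficients."""
    beta1 = pi3 / pi1          # [b1/(1-b1)] / [1/(1-b1)]
    beta0 = pi2 / pi1          # [b0/(1-b1)] / [1/(1-b1)]
    return beta0, beta1

# Hypothetical structural parameters
pis = reduced_form(beta0=10.0, beta1=0.8)
beta0_back, beta1_back = structural_from_reduced(*pis)
print(pis)
print(beta0_back, beta1_back)
```

Here $\pi_1$ is the investment multiplier: with a marginal propensity to consume of 0.8 it is about 5.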

10.4: Methods of Estimation in Simultaneous Equation

Models

The most common methods are:

1) Direct Least Square (DLS)

2) Indirect Least Square (ILS)

3) Two stage least square (2SLS)

4) Three stage least square(3SLS)

5) Instrumental variable method(IV)

6) Least variance ratio method(LVR)

1. Direct Least Square Method (DLS)

In this method, we estimate the structural parameters by applying OLS directly to the structural equations. This method does not require complete knowledge of the structural system. A related approach expresses all the endogenous variables as functions of all the predetermined variables of the system and applies ordinary least squares to these equations. This is called least squares with no restrictions, because it does not take into account any information on the structural parameters.

2. Indirect Least Square Method (ILS)

There is a definite relationship between the reduced form coefficients and the structural parameters. It is thus possible first to obtain estimates of the reduced form coefficients (the π's) by applying OLS to the reduced form equations, and then to substitute these estimates into the system of parameter relationships to obtain, indirectly, values for the structural parameters.

Advantages of ILS

1) The derivation of the reduced form π‟s from the structural β‟s

and the Y‟s is more efficient.

2) Structural changes occur continuously over time.

3) Extraneous information on some structural parameters may become available from other studies.

Disadvantages of ILS

1) It does not give the standard errors of the estimates of the structural parameters.

2) It cannot be used to calculate unique and consistent structural parameter estimates from the reduced form coefficients of the over identified equations of a simultaneous equation model.

Assumption of ILS method

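1) The structural equation must be exactly identified.

2) The disturbance term of the reduced form equations should satisfy the first six stochastic assumptions of the OLS method, for example $u$ is random, $E(u_i) = 0$ and $E(u_i u_j) = 0$ for $i \neq j$.

If these assumptions are satisfied, the OLS estimates of the reduced form coefficients are BLUE and the ILS estimates derived from them are consistent.

3) Micro variables should be correctly aggregated.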


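Question: Show that the ILS estimators $\hat{\beta}_0$ and $\hat{\beta}_1$ are consistent estimators.

Proof:

Consider Keynesian model

$C_t = \beta_0 + \beta_1 Y_t + u_t$

$Y_t = C_t + I_t$

Reduced forms are

$Y_t = \dfrac{\beta_0}{1 - \beta_1} + \dfrac{1}{1 - \beta_1} I_t + \dfrac{u_t}{1 - \beta_1}$ …….. (1)

And, $C_t = \beta_0 + \beta_1 \left[\dfrac{\beta_0}{1 - \beta_1} + \dfrac{1}{1 - \beta_1} I_t + \dfrac{u_t}{1 - \beta_1}\right] + u_t$

$C_t = \dfrac{\beta_0}{1 - \beta_1} + \dfrac{\beta_1}{1 - \beta_1} I_t + \dfrac{u_t}{1 - \beta_1}$ ……… (2)

Then, taking means,

$\bar{Y} = \dfrac{\beta_0}{1 - \beta_1} + \dfrac{1}{1 - \beta_1} \bar{I} + \dfrac{\bar{u}}{1 - \beta_1}$ ………. (3)

And,

$\bar{C} = \dfrac{\beta_0}{1 - \beta_1} + \dfrac{\beta_1}{1 - \beta_1} \bar{I} + \dfrac{\bar{u}}{1 - \beta_1}$ ………. (4)

Subtracting eq (3) from eq (1), with $y_t = Y_t - \bar{Y}$ and $i_t = I_t - \bar{I}$.

$y_t = \dfrac{1}{1 - \beta_1} i_t + \dfrac{u_t - \bar{u}}{1 - \beta_1}$ …….. (5)

Subtracting eq (4) from eq (2), with $c_t = C_t - \bar{C}$.

$c_t = \dfrac{\beta_1}{1 - \beta_1} i_t + \dfrac{u_t - \bar{u}}{1 - \beta_1}$ ……(6)

We know that

$\hat{\pi}_1 = \dfrac{\sum y_t i_t}{\sum i_t^2}$

Putting the value of $y_t$ from eq (5), and noting that $\sum i_t \bar{u} = 0$,

$\hat{\pi}_1 = \dfrac{1}{1 - \beta_1} \left[1 + \dfrac{\sum i_t u_t}{\sum i_t^2}\right]$ ……….(7)

Similarly

$\hat{\pi}_3 = \dfrac{\sum c_t i_t}{\sum i_t^2} = \dfrac{1}{1 - \beta_1} \left[\beta_1 + \dfrac{\sum i_t u_t}{\sum i_t^2}\right]$ …………(8)

The ILS estimator of $\beta_1$ is the ratio of eq (8) to eq (7):

$\hat{\beta}_1 = \dfrac{\hat{\pi}_3}{\hat{\pi}_1} = \dfrac{\beta_1 \sum i_t^2 / n + \sum i_t u_t / n}{\sum i_t^2 / n + \sum i_t u_t / n}$

Applying limit $n \to \infty$, $\sum i_t u_t / n \to 0$ because $I_t$ is exogenous, and $\sum i_t^2 / n$ tends to a constant, so

$plim\, \hat{\beta}_1 = \beta_1$

Similarly

$\hat{\beta}_0 = \dfrac{\hat{\pi}_0}{\hat{\pi}_1}$, where $\hat{\pi}_0 = \bar{Y} - \hat{\pi}_1 \bar{I}$

Applying limit $n \to \infty$, $\bar{u} \to 0$ and $\hat{\pi}_1 \to \dfrac{1}{1 - \beta_1}$, so that $\hat{\pi}_0 \to \dfrac{\beta_0}{1 - \beta_1}$ and

$plim\, \hat{\beta}_0 = \beta_0$

Hence proved $\hat{\beta}_0$ and $\hat{\beta}_1$ are consistent estimators of $\beta_0$ and $\beta_1$.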

3. The Method of Two Stage Least Square (2SLS)

This method was developed by Theil and Basmann. It is a method of obtaining consistent structural parameter estimates for the exactly or over identified equations of a simultaneous equation system. For an exactly identified equation, two stage least squares gives the same result as ILS. Two stage least squares estimation involves the application of OLS in two stages.

Stage 1:

In the first stage each endogenous variable is regressed on all the predetermined variables of the system. At this stage we get the estimated reduced form equations.

Stage 2:

In the second stage the predicted values rather than the actual values of the endogenous variables are used to estimate the structural equations of the model. That is, we obtain the estimates $\hat{Y}$ from the first stage, replace $Y$ in the original equation by the estimated $\hat{Y}$, and then apply OLS to the equation thus transformed.

The predicted values of the endogenous variables are uncorrelated with the error term, so this gives us the two stage least squares parameter estimates.


Advantages of 2SLS with respect to ILS

1) 2SLS can be used to get consistent structural parameter estimates for the over identified as well as exactly identified equations in a system of simultaneous equations.

2) 2SLS gives the standard errors of the estimated structural parameters directly, while ILS does not provide them.

3) 2SLS is very useful. It is the simplest and one of the best and most common of all the simultaneous equation estimators.

Properties of 2SLS estimator

1) 2SLS gives biased estimators for small samples.

2) For large samples 2SLS estimates are unbiased, that is, the bias tends to zero as $n \to \infty$.

3) 2SLS gives asymptotically efficient estimators.

4) 2SLS estimates are consistent.

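Question: Find the 2SLS estimate and show that in the case of exact identification 2SLS is the same as ILS.

Proof:

We use the simple Keynesian model

$C_t = \beta_0 + \beta_1 Y_t + u_t$ … (1)

$Y_t = C_t + I_t$ … (2)

Reduced forms are:

$Y_t = \pi_0 + \pi_1 I_t + w_t$ … (3)

$C_t = \beta_0 + \beta_1 [\pi_0 + \pi_1 I_t + w_t] + u_t$

$C_t = \pi_2 + \pi_3 I_t + w_t$ … (4)

Where $\pi_0 = \pi_2 = \dfrac{\beta_0}{1 - \beta_1}$, $\pi_1 = \dfrac{1}{1 - \beta_1}$, $\pi_3 = \dfrac{\beta_1}{1 - \beta_1}$ and $w_t = \dfrac{u_t}{1 - \beta_1}$.

Estimated equation of (3)

$\hat{Y}_t = \hat{\pi}_0 + \hat{\pi}_1 I_t$, where $\hat{\pi}_1 = \left(\sum y_t i_t\right) / \sum i_t^2$ and lower case letters denote deviations from means.

Residual

$\hat{w}_t = Y_t - \hat{Y}_t$, so that $Y_t = \hat{Y}_t + \hat{w}_t$ … (5)

Putting equation (5) in equation (1)

$C_t = \beta_0 + \beta_1 (\hat{Y}_t + \hat{w}_t) + u_t$

$C_t = \beta_0 + \beta_1 \hat{Y}_t + u_t^*$, where $u_t^* = u_t + \beta_1 \hat{w}_t$

Since $\hat{Y}_t$ involves only the exogenous variable $I_t$, which is distributed independently of $u_t$ and $\hat{w}_t$, the application of OLS will give us a consistent estimate.

$\hat{\beta}_1^{2SLS} = \dfrac{\sum c_t \hat{y}_t}{\sum \hat{y}_t^2}$

With $\hat{y}_t = \hat{\pi}_1 i_t$:

$\hat{\beta}_1^{2SLS} = \dfrac{\hat{\pi}_1 \sum c_t i_t}{\hat{\pi}_1^2 \sum i_t^2} = \dfrac{1}{\hat{\pi}_1} \left(\dfrac{\sum c_t i_t}{\sum i_t^2}\right) = \dfrac{\hat{\pi}_3}{\hat{\pi}_1}$

which is the ILS estimator of $\beta_1$. It means that 2SLS and ILS are the same in the case of exact identification.

Hence proved.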
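The equality of 2SLS and ILS in the exactly identified case can also be checked numerically. The sketch below assumes the Keynesian model $C_t = \beta_0 + \beta_1 Y_t + u_t$, $Y_t = C_t + I_t$ with investment as the only exogenous variable. All function names, parameter values and data are hypothetical, and the disturbance is an arbitrary deterministic series used only to make the data non-trivial.

```python
import numpy as np

def slope(x, y):
    """OLS slope of y on x (with intercept)."""
    xd, yd = x - x.mean(), y - y.mean()
    return np.sum(xd * yd) / np.sum(xd ** 2)

def ils_beta1(c, y, i):
    """ILS: ratio of the reduced form slopes of C on I and of Y on I."""
    return slope(i, c) / slope(i, y)

def tsls_beta1(c, y, i):
    """2SLS: regress Y on I, then regress C on the fitted values of Y."""
    pi1 = slope(i, y)
    pi0 = y.mean() - pi1 * i.mean()
    y_hat = pi0 + pi1 * i            # stage 1: fitted income
    return slope(y_hat, c)           # stage 2: C on fitted income

# Hypothetical data generated from the reduced form of the model
beta0, beta1 = 10.0, 0.8
i = np.arange(1.0, 21.0)
u = np.sin(np.arange(20.0))          # arbitrary disturbance series
y = (beta0 + i + u) / (1.0 - beta1)
c = y - i                            # identity Y = C + I

b_ils = ils_beta1(c, y, i)
b_2sls = tsls_beta1(c, y, i)
print(b_ils, b_2sls)                 # the two estimates coincide
```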


4. Three Stage Least Square Method (3SLS)

3SLS is a system method. It is applied to all the equations at the same time and gives estimates of all the parameters simultaneously. This method is a logical extension of the two stage least squares method. Under this method we apply least squares in three successive stages. It uses more information than single equation techniques.

The first two stages of 3SLS are the same as 2SLS. We deal with the reduced form of all the equations of the system. 3SLS is an application of GLS (Generalized Least Squares). It means that we apply least squares to a set of transformed equations in which the transformation is obtained from the reduced form residuals of the previous stage.

5. Method of Instrumental Variable (IV)

The instrumental variable method is a single equation method, being applied to one equation of the system at a time. It has been developed as a solution to simultaneous equation bias and is appropriate for over identified models.

The instrumental variable method reduces the dependence between 'U' and the explanatory variables by using appropriate exogenous variables (as instruments). The estimates obtained from this method are consistent for large samples and biased for small samples.

Procedure of IV Method

Step I:

An instrumental variable is an exogenous variable located somewhere in the system of simultaneous equations which satisfies the following conditions:

1) It must be strongly correlated with the endogenous variable which it replaces in the structural equation.

2) It must be truly exogenous, that is, uncorrelated with the random term 'U'.

3) If more than one instrumental variable is to be used in the same structural equation, they must be least correlated with each other.

Step II:

Multiplying the structural equation through by each of the instrumental variables and summing over all observations, we form the equations from which we obtain the estimators of the structural parameters.

Properties of IV

1) For small samples the estimators of the structural parameters are biased.

2) For large samples the estimates of the structural parameters are consistent.

3) The estimates are not asymptotically efficient.

Assumption of IV method

1) Exogenous variables are used as instrumental variables.

2) The disturbance term 'U' must satisfy the usual assumptions of OLS.

3) The exogenous variables must not be multicollinear.

4) The structural function must be identified.


10.5: Exercise

1) What is meant by simultaneous equations model? Discuss.

2) Show that OLS estimates are biased in simultaneous

equations problems.

3) Differentiate between endogenous and exogenous

variables.

4) Write short notes on following:

i. Indirect Least Squares Method

ii. Instrumental Variable Method

iii. Two Stage Least Squares Method

iv. Three Stage Least Squares Method

5) Show that ILS estimates are consistent estimators.


Chapter: 11

IDENTIFICATION

11.1 INTRODUCTION

By identification, we mean whether numerical estimates of the parameters of a structural equation can be obtained from the estimated reduced form coefficients.

If this can be done, we say that the particular equation is identified. If it is not possible then we say that the equation under consideration is unidentified or under identified.

In econometric theory there are two possible situations of identification.

1) Equation under identified

2) Equation identified

1) Equation Under Identified

If the numerical estimates of the parameters of a structural equation cannot be obtained from the estimated reduced form coefficients then we say that the equation under consideration is unidentified or under identified.

An equation is under identified if its statistical form is not unique, that is, if it is impossible to estimate all the parameters of the equation with any econometric technique.

A system is called under identified when one or more of its equations are under identified.

Example: Consider the following demand and supply model with equilibrium condition.

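Demand function: $Q_t^d = \alpha_0 + \alpha_1 P_t + u_{1t}$ …eq (1)

Supply function: $Q_t^s = \beta_0 + \beta_1 P_t + u_{2t}$ …eq (2)

Equilibrium condition: $Q_t^d = Q_t^s$

Solution:

$\alpha_0 + \alpha_1 P_t + u_{1t} = \beta_0 + \beta_1 P_t + u_{2t}$

$(\alpha_1 - \beta_1) P_t = \beta_0 - \alpha_0 + u_{2t} - u_{1t}$

$P_t = \dfrac{\beta_0 - \alpha_0}{\alpha_1 - \beta_1} + \dfrac{u_{2t} - u_{1t}}{\alpha_1 - \beta_1}$ … Eq (*)

$P_t = \pi_0 + v_t$ … Reduced form (a)

Put eq (*) in eq (1).

$Q_t = \alpha_0 + \alpha_1 \left[\dfrac{\beta_0 - \alpha_0}{\alpha_1 - \beta_1} + \dfrac{u_{2t} - u_{1t}}{\alpha_1 - \beta_1}\right] + u_{1t}$

$Q_t = \dfrac{\alpha_1 \beta_0 - \alpha_0 \beta_1}{\alpha_1 - \beta_1} + \dfrac{\alpha_1 u_{2t} - \beta_1 u_{1t}}{\alpha_1 - \beta_1}$

$Q_t = \pi_1 + w_t$ …reduced form (b)

The four structural parameters $\alpha_0, \alpha_1, \beta_0, \beta_1$ come from structural equations 1 and 2. We have two reduced form coefficients $\pi_0$ and $\pi_1$ from the reduced form equations a & b. These reduced form equations contain all four structural parameters. So there is no way in which the four unknown structural parameters can be estimated from only two reduced form coefficients. So the system of equations is unidentified or under identified.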

2) Equation Identified

If numerical estimates of the parameters of a structural equation can be obtained from the estimated reduced form coefficients then we say that the equation is identified.

If an equation has a unique statistical form we may say that the equation is identified.

Identification is a problem of model formulation, and an identified equation may be exactly (just) identified or over identified.

a. Exact (Just) Identification

An identified equation is said to be exactly identified if unique numerical values of the structural parameters can be obtained.

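Example: Consider the following demand and supply model with equilibrium condition.

Demand function: $Q_t = \alpha_0 + \alpha_1 P_t + \alpha_2 I_t + u_{1t}$ …eq (1)

Supply function: $Q_t = \beta_0 + \beta_1 P_t + \beta_2 P_{t-1} + u_{2t}$ …eq (2)

Solution:

$\alpha_0 + \alpha_1 P_t + \alpha_2 I_t + u_{1t} = \beta_0 + \beta_1 P_t + \beta_2 P_{t-1} + u_{2t}$

$(\alpha_1 - \beta_1) P_t = (\beta_0 - \alpha_0) - \alpha_2 I_t + \beta_2 P_{t-1} + (u_{2t} - u_{1t})$

$P_t = \dfrac{\beta_0 - \alpha_0}{\alpha_1 - \beta_1} - \dfrac{\alpha_2}{\alpha_1 - \beta_1} I_t + \dfrac{\beta_2}{\alpha_1 - \beta_1} P_{t-1} + \dfrac{u_{2t} - u_{1t}}{\alpha_1 - \beta_1}$ … Eq (*)

$P_t = \pi_0 + \pi_1 I_t + \pi_2 P_{t-1} + v_t$ … Reduced form (a)

Put eq (*) in eq (1).

$Q_t = \alpha_0 + \alpha_1 \left[\pi_0 + \pi_1 I_t + \pi_2 P_{t-1} + v_t\right] + \alpha_2 I_t + u_{1t}$

$Q_t = \pi_3 + \pi_4 I_t + \pi_5 P_{t-1} + w_t$ …reduced form (b)

Where $\pi_3 = \dfrac{\alpha_1 \beta_0 - \alpha_0 \beta_1}{\alpha_1 - \beta_1}$, $\pi_4 = -\dfrac{\alpha_2 \beta_1}{\alpha_1 - \beta_1}$, $\pi_5 = \dfrac{\alpha_1 \beta_2}{\alpha_1 - \beta_1}$ and $w_t = \dfrac{\alpha_1 u_{2t} - \beta_1 u_{1t}}{\alpha_1 - \beta_1}$.

We have six structural parameters that are $\alpha_0, \alpha_1, \alpha_2, \beta_0, \beta_1, \beta_2$ and six reduced form coefficients that are $\pi_0, \pi_1, \pi_2, \pi_3, \pi_4, \pi_5$; here we obtain a unique solution for the structural parameters. So the system of equations is exactly identified.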
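A unique solution means that the structural parameters can be computed back from the reduced form coefficients. The sketch below assumes a demand function $Q_t = \alpha_0 + \alpha_1 P_t + \alpha_2 I_t + u_{1t}$ and a supply function $Q_t = \beta_0 + \beta_1 P_t + \beta_2 P_{t-1} + u_{2t}$, maps six hypothetical structural values to the six reduced form coefficients and then recovers them. The function names and numbers are illustrative only.

```python
import math

def reduced_from_structural(a0, a1, a2, b0, b1, b2):
    """Six reduced form coefficients of the price and quantity equations."""
    den = a1 - b1
    return ((b0 - a0) / den, -a2 / den, b2 / den,                   # pi0..pi2
            (a1 * b0 - a0 * b1) / den, -a2 * b1 / den, a1 * b2 / den)  # pi3..pi5

def structural_from_reduced(p0, p1, p2, p3, p4, p5):
    """Unique structural parameters implied by the reduced form."""
    b1 = p4 / p1               # supply slope
    a1 = p5 / p2               # demand slope
    a0 = p3 - a1 * p0
    b0 = p3 - b1 * p0
    a2 = -p1 * (a1 - b1)
    b2 = p2 * (a1 - b1)
    return a0, a1, a2, b0, b1, b2

# Hypothetical structural parameters
true_params = (20.0, -2.0, 0.5, 5.0, 1.5, 0.3)
pis = reduced_from_structural(*true_params)
recovered = structural_from_reduced(*pis)
print(recovered)
```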

b. Over Identification

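An equation is said to be over identified if more than one numerical value can be obtained for some of the parameters of the structural equations.

Example: Consider the following demand and supply model with equilibrium condition.

Demand function: $Q_t = \alpha_0 + \alpha_1 P_t + \alpha_2 I_t + \alpha_3 R_t + u_{1t}$ …eq (1)

Supply function: $Q_t = \beta_0 + \beta_1 P_t + \beta_2 P_{t-1} + u_{2t}$ …eq (2)

Solution:

$\alpha_0 + \alpha_1 P_t + \alpha_2 I_t + \alpha_3 R_t + u_{1t} = \beta_0 + \beta_1 P_t + \beta_2 P_{t-1} + u_{2t}$

$(\alpha_1 - \beta_1) P_t = (\beta_0 - \alpha_0) - \alpha_2 I_t - \alpha_3 R_t + \beta_2 P_{t-1} + (u_{2t} - u_{1t})$

$P_t = \dfrac{\beta_0 - \alpha_0}{\alpha_1 - \beta_1} - \dfrac{\alpha_2}{\alpha_1 - \beta_1} I_t - \dfrac{\alpha_3}{\alpha_1 - \beta_1} R_t + \dfrac{\beta_2}{\alpha_1 - \beta_1} P_{t-1} + \dfrac{u_{2t} - u_{1t}}{\alpha_1 - \beta_1}$ … Eq (*)

$P_t = \pi_0 + \pi_1 I_t + \pi_2 R_t + \pi_3 P_{t-1} + v_t$ … Reduced form (a)

Put eq (*) in eq (1).

$Q_t = \alpha_0 + \alpha_1 \left[\pi_0 + \pi_1 I_t + \pi_2 R_t + \pi_3 P_{t-1} + v_t\right] + \alpha_2 I_t + \alpha_3 R_t + u_{1t}$

$Q_t = \pi_4 + \pi_5 I_t + \pi_6 R_t + \pi_7 P_{t-1} + w_t$ …reduced form (b)

Where $\pi_4 = \dfrac{\alpha_1 \beta_0 - \alpha_0 \beta_1}{\alpha_1 - \beta_1}$, $\pi_5 = -\dfrac{\alpha_2 \beta_1}{\alpha_1 - \beta_1}$, $\pi_6 = -\dfrac{\alpha_3 \beta_1}{\alpha_1 - \beta_1}$ and $\pi_7 = \dfrac{\alpha_1 \beta_2}{\alpha_1 - \beta_1}$.

We have seven structural parameters that are $\alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_0, \beta_1, \beta_2$ but there are eight reduced form coefficients that are $\pi_0, \pi_1, \ldots, \pi_7$. The number of equations is greater than the number of unknown parameters; as a result we may get more than one numerical value for some of the parameters of the structural equations. So the system of equations is over identified.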


11.2 RULES FOR IDENTIFICATION

Identification may be established either by examination of the specification of the structural model or by examination of the reduced form of the model.

1) Examination of Structural Model

It is the simpler and more useful method for identification.

2) Examination of Reduced form Determinant

This approach to identification is comparatively confusing and difficult to compute, because we must first find the reduced form of the structural model and then study its determinants.

11.3 CONDITIONS OF IDENTIFICATION

There are two conditions which must be fulfilled for an equation to be identified.

1) The Order Condition of the Identification

This condition is based on a counting rule of the variables included in and excluded from the particular equation. It is a necessary but not sufficient condition for the identification of an equation.

Definition: “For an equation to be identified the total number of variables (endogenous and exogenous) excluded from it must be equal to or greater than the number of endogenous variables in the model less one”. That is $(M - m) + (K - k) \geq M - 1$, or equivalently

$K - k \geq m - 1$

If $K - k = m - 1$ The equation is just or exactly identified.

If $K - k > m - 1$ It is over identified.

Where M = number of endogenous variables in the model or system.

m = number of endogenous variables in a given equation.

K = number of predetermined or exogenous variables in the model or system.

k = number of predetermined or exogenous variables in a given equation.

Example: Consider the following demand and supply function.

Demand function: $Q_t = \alpha_0 + \alpha_1 P_t + \alpha_2 I_t + u_{1t}$ …eq (1)

Supply function: $Q_t = \beta_0 + \beta_1 P_t + u_{2t}$ …eq (2)

Apply order condition.

Solution:

Q and P are endogenous variables. I is an exogenous variable. Apply order condition.

K=1 , M=2

For eq (1).

k=1 , m=2

$K - k = 1 - 1 = 0 < m - 1 = 1$

So demand function is unidentified.

For eq (2).

k=0 , m=2

$K - k = 1 - 0 = 1 = m - 1 = 1$

So supply function is just identified.

Example: Consider the following demand and supply function.

Demand function: $Q_t = \alpha_0 + \alpha_1 P_t + \alpha_2 I_t + \alpha_3 R_t + u_{1t}$ …eq (1)

Supply function: $Q_t = \beta_0 + \beta_1 P_t + \beta_2 P_{t-1} + u_{2t}$ …eq (2)

Apply order condition.

Solution:

Q and P are endogenous variables. I, R and $P_{t-1}$ are exogenous (predetermined) variables. Apply order condition.

K=3 , M=2

For eq (1).

k=2 , m=2

$K - k = 3 - 2 = 1 = m - 1 = 1$

So demand function is exactly identified.

For eq (2).

k=1 , m=2

$K - k = 3 - 1 = 2 > m - 1 = 1$

So supply function is over identified.
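The counting rule used in these examples, comparing K - k with m - 1, is easy to put into a small function. The function name and the returned labels are hypothetical; the four calls use the (K, k, m) values of the two examples above.

```python
def order_condition(K, k, m):
    """Classify an equation by the order condition.

    K: predetermined variables in the system
    k: predetermined variables in the equation
    m: endogenous variables in the equation
    """
    excluded = K - k
    if excluded < m - 1:
        return "unidentified"
    if excluded == m - 1:
        return "just identified"
    return "over identified"

# First example: K = 1
demand_1 = order_condition(K=1, k=1, m=2)   # unidentified
supply_1 = order_condition(K=1, k=0, m=2)   # just identified
# Second example: K = 3
demand_2 = order_condition(K=3, k=2, m=2)   # just identified
supply_2 = order_condition(K=3, k=1, m=2)   # over identified
print(demand_1, supply_1, demand_2, supply_2)
```

Because the order condition is only necessary, a result of "just identified" or "over identified" still has to be confirmed by the rank condition.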

2) The Rank Condition for Identification

The order condition is a necessary but not sufficient condition for identification. Sometimes the order condition is satisfied but the equation is still not identified.

Therefore we require another condition for identification, the rank condition, which is both a necessary and sufficient condition for identification.

Rank Condition

The rank condition states that in a system of G equations, a particular equation is identified if and only if (iff) it is possible to construct at least one non-zero determinant of order (G-1) from the coefficients of the variables excluded from that particular equation but contained in the other equations of the model.

Procedure of Rank Condition

a) Write down the equations in tabular form.

b) Strike out (exclude) the coefficients of the row in which the equation under consideration appears.

c) Also strike out the columns corresponding to those coefficients in step (b) which are non-zero.

d) The entries left in the table will give only the coefficients of the variables included in the system but not in the equation under consideration.
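The procedure can be sketched with a coefficient table held as a matrix, one row per equation and one column per variable. Asking for a non-zero determinant of order G-1 is the same as asking that the remaining sub-matrix has rank G-1, which is what the sketch checks. The function name is hypothetical and the two-equation table is an assumed demand and supply system with numeric coefficients invented for illustration.

```python
import numpy as np

def rank_condition(table, row):
    """True if the equation in `row` satisfies the rank condition."""
    A = np.asarray(table, dtype=float)
    G = A.shape[0]
    absent = np.isclose(A[row], 0.0)          # (c) keep columns with zero entries
    others = np.arange(G) != row              # (b) strike out the equation's row
    sub = A[others][:, absent]                # (d) remaining coefficients
    if sub.size == 0:
        return False                          # nothing is excluded from the equation
    return bool(np.linalg.matrix_rank(sub) >= G - 1)

# Columns: Q, P, I. Hypothetical coefficients after moving all terms to one side
table = [
    [1.0,  2.0, -0.5],   # demand: contains Q, P and I
    [1.0, -1.5,  0.0],   # supply: contains Q and P only
]

demand_ok = rank_condition(table, 0)   # no excluded variables
supply_ok = rank_condition(table, 1)   # I is excluded and appears in demand
print(demand_ok, supply_ok)
```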

Example: Given the following equations:

Apply rank condition to all the equations.

Solution:


Equation

1 -1 - 0 0 0

2 0 0 0 -1 0

3 0 -1 0 0 0

4 1 -1 0 1 0 1

Consider equation 1.

[

]

| | |

| |

| |

|

| |

Hence equation 1 is unidentified.

Consider equation 2.

[

]

| | |

| |

| |

|

| |

| |


Hence equation 2 is identified.

Consider equation 3.

[

]

| | |

| |

| |

|

| |

| |

Hence equation 3 is identified.

Consider equation 4.

[

]

| |

| |

Hence equation 4 is also identified.

Example: Consider the following system of equations

Determine whether each equation of the system is exactly identified, over identified or unidentified by using:

a) Rank condition

b) Order condition

Solution:

a) Rank condition

Equation

1 1 0 0 0

2 0 1 0 0

3 0 1 0 0

4 0 1 0 0

Consider equation 1.

[

]

| | |

| |

| |

|

| |

Hence equation 1 is unidentified.

Consider equation 2.


[

]

| | |

| |

| |

|

| |

Hence equation 2 is also unidentified.

Consider equation 3.

[

]

| | |

| |

| |

|

| |

Hence equation 3 is also unidentified.

Consider equation 4.

[

]

| | |

| |

| |

|

| |

| |

| |

Hence equation 4 is identified.


b) Order condition

M = number of endogenous variables in a system of

equations.

K = number of exogenous variables in a system of equations.

i.e. ( )

K = 3 i.e. ( )

m = number of endogenous variables in a given equations.

For equation 1:

m = 3 i.e. ( )

For equation 2:

m = 2 i.e. ( )

For equation 3:

m = 2 i.e. ( )

For equation 4:

m = 3 i.e. ( )

k = number of exogenous variables in a given equation.

For equation 1:

k = 1 i.e. ( )

For equation 2:

k = 2 i.e. ( )

For equation 3:


k = 2 i.e. ( )

For equation 4:

k = 1 i.e. ( )

Equation Result

1 Identified

2 Identified

3 Identified

4 Identified

Thus by order condition all the equations are identified but by

rank condition only equation 4 is identified.


11.4: Exercise

i. Discuss the problem of identification.

ii. Explain the rank condition of identification.

iii. Briefly discuss the procedure of order condition of

identification.

iv. Check the identifiability of the following model:

… (1)

… (2)

