AUTOCORRELATION OR SERIAL CORRELATION

Post on 19-Jan-2016


AUTOCORRELATION OR SERIAL CORRELATION

Serial Correlation (Chapter 11.1)

• Now let’s relax a different Gauss–Markov assumption.

• What if the error terms are correlated with one another?

• If I know something about the error term for one observation, I also know something about the error term for another observation.

• Our observations are NOT independent!

Serial Correlation (cont.)

• Serial Correlation frequently arises when using time series data (so we will index our observations with t instead of i)

• The error term includes all variables not explicitly included in the model.

• If a change occurs to one of these unobserved variables in 1969, it is quite plausible that some of that change will still be evident in 1970.

Serial Correlation (cont.)

• In this lecture, we will consider a particular form of correlation among error terms.

• Error terms are correlated more heavily with “nearby” observations than with “distant” observations.

• E.g., Cov(ε_1969, ε_1970) > Cov(ε_1969, ε_1990)

Serial Correlation (cont.)

• For example, inflation in the United States has been positively serially correlated for at least a century. We expect above average inflation in a given period if there was above average inflation in the preceding period.

• Let’s look at DEVIATIONS in US inflation from its mean from 1923–1952 and from 1973–2002. There is greater serial correlation in the more recent sample.

Figure 11.1 U.S. Inflation’s Deviations from Its Mean

Serial Correlation: A DGP

Y_t = β₀ + β₁X_1t + … + β_k X_kt + ε_t

E(ε_t) = 0

Var(ε_t) = σ²

Cov(ε_t, ε_t′) = σ_tt′, with σ_tt′ ≠ 0 for some t ≠ t′

Specifically: σ_tt′ = ρ^|t−t′| σ² for all t, t′

X’s fixed across samples

• We assume that covariances depend only on the distance between two time periods, |t − t′|
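Under this DGP the error covariance matrix decays geometrically with the distance between periods. A minimal numpy sketch (the values σ² = 1 and ρ = 0.5 are hypothetical, chosen only for illustration):

```python
import numpy as np

# Hypothetical illustration values: sigma^2 = 1, rho = 0.5, T = 5 periods.
sigma2, rho, T = 1.0, 0.5, 5
t = np.arange(T)
# Sigma[i, j] = sigma^2 * rho^|i - j|: covariance shrinks as |i - j| grows.
Sigma = sigma2 * rho ** np.abs(t[:, None] - t[None, :])

print(Sigma[0, 1])  # adjacent periods: 0.5
print(Sigma[0, 3])  # three periods apart: 0.125
```

Nearby covariances exceed distant ones, matching the earlier example that Cov(ε_1969, ε_1970) > Cov(ε_1969, ε_1990).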

OLS and Serial Correlation (Chapter 11.2)

• The implications of serial correlation for OLS are similar to those of heteroskedasticity:

–OLS is still unbiased

–OLS is inefficient

– The OLS formula for estimated standard errors is incorrect

• “Fixes” are more complicated

OLS and Serial Correlation (cont.)

• As with heteroskedasticity, we have two choices:

1. We can transform the data so that the Gauss–Markov conditions are met, and OLS is BLUE; OR

2. We can disregard efficiency, apply OLS anyway, and “fix” our formula for estimated standard errors.

Testing for Serial Correlation

• Correlograms & Q-statistics

View/residual tests/correlogram-q

If there is no autocorrelation in the residuals, the autocorrelations and partial autocorrelations at all lags should be near zero, and the Q-statistics should be insignificant, with large p-values.

Durbin–Watson Test

• How do we test for serial correlation?

• James Durbin and G.S. Watson proposed testing for correlation in the error terms between adjacent observations.

• In our DGP, we assume the strongest correlation exists between adjacent observations.

Durbin–Watson Test (cont.)

• Correlation between adjacent disturbances is called “first-order serial correlation.”

• To test for first-order serial correlation, we ask whether adjacent ε’s are correlated.

• As usual, we’ll use the residuals e_t to proxy for the unobserved ε’s.

• The trick is constructing a test statistic for which we know the distribution (so we can calculate the probability of observing the data, given the null hypothesis).

d = Σ_{t=2}^{T} (e_t − e_{t−1})² / Σ_{t=1}^{T} e_t²
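The d statistic can be computed directly from the residuals. A small numpy sketch (the simulated white-noise residuals are my own illustration; with no serial correlation, d should land near 2):

```python
import numpy as np

def durbin_watson_d(e):
    """Durbin-Watson d: sum of squared first differences of the
    residuals, divided by the residual sum of squares."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Serially uncorrelated residuals: d should be close to 2.
rng = np.random.default_rng(0)
print(round(durbin_watson_d(rng.standard_normal(500)), 1))
```

Positively autocorrelated residuals make adjacent differences small, pushing d below 2; negative autocorrelation pushes it above 2.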

Durbin–Watson Test (cont.)

• We end up with a somewhat opaque test statistic for first-order serial correlation

(1/(T − k)) Σ_{t=2}^{T} e_t e_{t−1}

Durbin–Watson Test (cont.)

• In large samples, the cross-product term (1/(T − k)) Σ_{t=2}^{T} e_t e_{t−1} approximately estimates the covariance between adjacent error terms. If there is no first-order serial correlation, this term will collapse to 0.

Durbin–Watson Test (cont.)

• When the Durbin–Watson statistic, d, gives a value far from 2, then it suggests the covariance term is not 0 after all

• i.e., a value of d far from 2 suggests the presence of first-order serial correlation

• d is bounded between 0 and 4

Durbin–Watson Statistic

• VERY UNFORTUNATELY, the exact distribution of the d statistic varies from application to application.

• FORTUNATELY, there are bounds on the distribution.

Durbin–Watson Statistic (cont.)

• There are TWO complications in applying the Durbin–Watson statistic:

1. We do not know the exact critical value, only a range within which the critical value falls.

2. The rejection regions are in relation to d = 2, not d = 0.

TABLE 11.2 The Durbin–Watson Test

Figure 11.3 Durbin–Watson Lower and Upper Critical Values for n = 20 and a Single Explanator, ρ = 0.5 (Panel B)

TABLE 11.1 Upper and Lower Critical Values for the Durbin–Watson Statistic (5% significance level)

Durbin–Watson Test

• The Durbin–Watson test has three possible outcomes: reject, fail to reject, or the test is uncertain.

• The Durbin–Watson test checks ONLY for first-order serial correlation.

Checking Understanding

• Suppose we are conducting a 2-sided Durbin–Watson test with n = 80 and 1 explanator. For the 0.05 significance level,

dl = 1.61

dh= 1.66

• Would you a) reject the null, b) fail to reject the null, or c) is the test uncertain, for: i) d = 0.5; ii) d = 1.64; iii) d = 2.1; and iv) d = 2.9?

Checking Understanding (cont.)

dl = 1.61

dh= 1.66

• Would you reject the null, fail to reject the null, or is the test uncertain, for:

– i) d = 0.5: d < dl , so we reject the null.

– ii) d = 1.64: dl <d < dh , so the test is uncertain.

– iii) d = 2.1: dh< d < (4-dh), so we fail to reject.

– iv) d = 2.9: d > (4-dl), so we reject the null.
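The decision rule applied above can be written out explicitly. A sketch, with the dl/dh bounds from this example passed in as parameters:

```python
def dw_decision(d, dl, dh):
    """Two-sided Durbin-Watson decision: rejection regions sit near 0
    and near 4, with uncertainty bands between the lower and upper
    critical values on each side."""
    if d < dl or d > 4 - dl:
        return "reject"
    if dl <= d <= dh or 4 - dh <= d <= 4 - dl:
        return "uncertain"
    return "fail to reject"

# The four cases from the example, with dl = 1.61 and dh = 1.66.
for d in (0.5, 1.64, 2.1, 2.9):
    print(d, dw_decision(d, dl=1.61, dh=1.66))
```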

Durbin–Watson Test

• Suppose we find serial correlation. OLS is unbiased but inefficient.

Durbin–Watson Test (cont.)

• Instead of using OLS, can we construct a BLUE Estimator?

• First, we need to specify our DGP.

One Serial Correlation DGP

Y_t = β₀ + β₁X_1t + … + β_k X_kt + ε_t

ε_t = ρε_{t−1} + v_t, for −1 < ρ < 1

E(v_t) = 0

Var(v_t) = σ_v²

Cov(v_t, v_t′) = 0 for t ≠ t′

X’s fixed across samples

BLUE Estimation (cont.)

For simplicity, focus on the case with only one explanator:

Y_t = β₀ + β₁X_t + ρε_{t−1} + v_t

The serial correlation is caused by the ρε_{t−1} term. If we could eliminate this term, we would be left with a Gauss–Markov DGP.

BLUE Estimation (cont.)

• To get rid of the serial correlation term, we must algebraically manipulate the DGP.

• Notice that

Y_t = β₀ + β₁X_t + ρε_{t−1} + v_t

Y_{t−1} = β₀ + β₁X_{t−1} + ε_{t−1}

ρY_{t−1} = ρβ₀ + ρβ₁X_{t−1} + ρε_{t−1}

Subtracting:

Y_t − ρY_{t−1} = β₀(1 − ρ) + β₁(X_t − ρX_{t−1}) + v_t

BLUE Estimation (cont.)

• If we regress (Y_t − ρY_{t−1}) against a constant and (X_t − ρX_{t−1}), we can estimate β₁ in a model that meets the Gauss–Markov assumptions.

Y_t − ρY_{t−1} = β₀(1 − ρ) + β₁(X_t − ρX_{t−1}) + v_t

E(v_t) = 0

Var(v_t) = σ_v²

Cov(v_t, v_t′) = 0 for t ≠ t′

X’s fixed across samples

BLUE Estimation (cont.)

• After transforming the data, we have the following DGP

Y_t − ρY_{t−1} = β₀(1 − ρ) + β₁(X_t − ρX_{t−1}) + v_t

BLUE Estimation (cont.)

• With the transformed data, OLS is BLUE.

• This approach is also called GLS.

• There are two problems:

– We cannot include the first observation in our regression, because there is no Y₀ or X₀ to subtract. Losing an observation may or may not be a problem, depending on T.

– We need to know ρ.

Y_t − ρY_{t−1} = β₀(1 − ρ) + β₁(X_t − ρX_{t−1}) + v_t for t > 1

Y₁ = β₀ + β₁X₁ + ε₁

E(v_t) = 0, E(ε₁) = 0

Var(v_t) = σ_v², Var(ε₁) = σ_v²/(1 − ρ²)

Cov(v_t, v_t′) = 0 for t ≠ t′

Cov(ε₁, v_t) = 0 for t > 1

X’s fixed across samples

Checking Understanding

• What Gauss–Markov assumption is violated if we restore the first observation?

Y_t − ρY_{t−1} = β₀(1 − ρ) + β₁(X_t − ρX_{t−1}) + v_t for t > 1

Y₁ = β₀ + β₁X₁ + ε₁

Cov(v_t, v_t′) = 0 for t ≠ t′

Cov(ε₁, v_t) = 0 for t > 1

Checking Understanding (cont.)

• The first observation’s error term is not correlated with any other error terms; this DGP does not suffer from serial correlation.

Y_t − ρY_{t−1} = β₀(1 − ρ) + β₁(X_t − ρX_{t−1}) + v_t for t > 1

Y₁ = β₀ + β₁X₁ + ε₁

Var(v_t) = σ_v², Var(ε₁) = σ_v²/(1 − ρ²)

Checking Understanding (cont.)

• The error term of the first observation has a different variance than the other error terms.

• This DGP suffers from heteroskedasticity.
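One common remedy for this heteroskedasticity (the Prais–Winsten adjustment, not named in the slides) keeps the first observation but rescales it by √(1 − ρ²), so its error variance matches the quasi-differenced observations. A sketch, treating ρ as known (in practice it must be estimated; the simulated series are my own illustration):

```python
import numpy as np

rho = 0.7  # assumed known for this sketch
rng = np.random.default_rng(3)
y = rng.standard_normal(10)
x = rng.standard_normal(10)

# Quasi-difference observations t = 2..T (this alone drops observation 1).
y_star = y[1:] - rho * y[:-1]
x_star = x[1:] - rho * x[:-1]

# Prais-Winsten: keep observation 1, scaled by sqrt(1 - rho^2), since
# Var(eps_1) = sigma_v^2 / (1 - rho^2) exceeds Var(v_t) = sigma_v^2.
w = np.sqrt(1 - rho ** 2)
y_full = np.concatenate([[w * y[0]], y_star])
x_full = np.concatenate([[w * x[0]], x_star])

print(len(y), len(y_star), len(y_full))  # prints: 10 9 10
```

Note that in a full implementation the intercept column must be transformed the same way: it equals (1 − ρ) for t > 1 and √(1 − ρ²) for the first row.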

LM Test

• If your model includes a lagged dependent variable on the right-hand side, then DW is not an appropriate test for serial correlation.

• Using OLS on such a regression results in biased and inconsistent coefficient estimates.

• Use the Breusch–Godfrey Lagrange Multiplier test to test for general higher-order autocorrelation.

• The test statistic has an asymptotic χ² (chi-squared) distribution.

• View/residual tests/serial correlation LM test
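The Breusch–Godfrey test can be sketched without a statistics package: regress the OLS residuals on the original regressors plus p lagged residuals, and compare LM = n·R² with a χ²(p) critical value. A numpy-only illustration (the function name and simulated data are my own; production code would use a library routine):

```python
import numpy as np

def breusch_godfrey_lm(y, x, p=2):
    """LM = n * R^2 from the auxiliary regression of OLS residuals on
    the regressors and p lagged residuals (asymptotically chi2(p))."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    # Lagged residuals, with the first k entries padded by zeros.
    lags = np.column_stack([np.concatenate([np.zeros(k), e[:-k]])
                            for k in range(1, p + 1)])
    Z = np.column_stack([X, lags])
    u = e - Z @ np.linalg.lstsq(Z, e, rcond=None)[0]
    return n * (1 - (u @ u) / (e @ e))

rng = np.random.default_rng(2)
x = rng.standard_normal(300)
y = 1.0 + 0.5 * x + rng.standard_normal(300)  # serially independent errors
print(breusch_godfrey_lm(y, x, p=2))  # compare with the chi2(2) 5% value, 5.99
```

With serially independent errors the LM statistic should usually fall below the critical value; strongly autocorrelated errors drive it far above.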

1) Regress Y_t = β₀ + β₁X_t + ε_t and save the residuals e_t.

2) Regress e_t = ρe_{t−1} + u_t to obtain ρ̂.

3) Transform the data using ρ̂.

4) Regress Y_t − ρ̂Y_{t−1} = β₀(1 − ρ̂) + β₁(X_t − ρ̂X_{t−1}) + u_t, for t > 1.

BLUE Estimation (cont.)

• This FGLS procedure is called the Cochrane–Orcutt estimator.
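The four Cochrane–Orcutt steps can be sketched end-to-end with numpy. The simulated DGP (β₀ = 2, β₁ = 3, ρ = 0.6) is hypothetical, chosen only to show the estimator recovering the parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
T, rho_true = 200, 0.6

# Simulate Y_t = 2 + 3*X_t + eps_t with AR(1) errors.
x = rng.standard_normal(T)
eps = np.zeros(T)
for t in range(1, T):
    eps[t] = rho_true * eps[t - 1] + rng.standard_normal()
y = 2.0 + 3.0 * x + eps

def ols(y, x):
    """Intercept and slope from a one-regressor OLS fit."""
    X = np.column_stack([np.ones(len(y)), x])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# 1) Regress Y on X; save residuals e_t.
b0, b1 = ols(y, x)
e = y - b0 - b1 * x
# 2) Regress e_t on e_{t-1} (no constant) to get rho-hat.
rho_hat = (e[1:] @ e[:-1]) / (e[:-1] @ e[:-1])
# 3) Quasi-difference the data with rho-hat.
y_star = y[1:] - rho_hat * y[:-1]
x_star = x[1:] - rho_hat * x[:-1]
# 4) Re-run OLS; the intercept of the transformed model is beta_0*(1 - rho-hat).
a_star, b1_hat = ols(y_star, x_star)
b0_hat = a_star / (1 - rho_hat)

print(round(rho_hat, 2), round(b0_hat, 2), round(b1_hat, 2))
```

The estimates should land near ρ = 0.6, β₀ = 2, and β₁ = 3, with the first observation dropped as the slides describe.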