Econometrics II
Seppo Pynnonen
Department of Mathematics and Statistics, University of Vaasa, Finland
Spring 2018
Part II
Panel Data
As of Jan 23, 2018
1 Panel Data
Pooling independent cross section across time
Fixed effects model
Two-period panel data analysis
More than two time periods
Fixed effects method
Dummy variable regression
Fixed effects or first differencing?
Balanced and unbalanced panels
Random effects models
Random effects or fixed effects
Hausman specification test
Policy analysis with panel data
Dynamic Panel Models
Data sets that combine time series and cross-sectional data are common in economics.
Independently pooled cross section:
Data are obtained by randomly sampling a large population at different points in time (e.g., yearly).
Allows investigating the effect of time, e.g., whether relationships have changed.
Typically raises only minor statistical complications.
Important feature:
The data set consists of independently sampled observations.
A panel data set (longitudinal data):
is a sample in which the same individuals, families, firms, cities, ... are followed across time.
E.g., OECD statistics contain numerous series observed yearly from several countries.
Similarly, time series data on several firms, industries, etc., are data of this type.
Pooling independent cross section across time
Example 1
Women's fertility over time: Data from the General Social Survey contain samples collected in even years from 1972 to 1984.
Model for explaining the total number of children born to a woman.
Data are available on the course web site (password protected).
dfr <- read.table(file = "http://www.econometrics.com/comdata/wooldridge/FERTIL1.shd",
stringsAsFactors = FALSE, na.strings = "-999") # read data; -999 marks missing values
vnames <- unlist(strsplit("year educ meduc feduc age kids
black east northcen west farm othrural
town smcity y74 y76 y78 y80
y82 y84 agesq y74educ y76educ y78educ
y80educ y82educ y84educ", split = "[ \n]+")) # variable names
vnames # check that OK
colnames(dfr) <- vnames # rename column names
str(dfr) # description of data frame dfr
'data.frame': 1129 obs. of 27 variables:
$ year : int 72 72 72 72 72 72 72 72 72 72 ...
$ educ : int 12 17 12 12 12 8 12 10 12 12 ...
$ meduc : int 8 8 7 12 3 8 12 12 8 6 ...
$ feduc : int 8 18 8 10 8 8 10 5 8 13 ...
$ age : int 48 46 53 42 51 50 47 46 41 36 ...
$ kids : int 4 3 2 2 2 4 0 1 2 4 ...
$ black : int 0 0 0 0 0 0 0 0 0 0 ...
$ east : int 0 0 0 0 0 0 0 0 0 0 ...
$ northcen: int 1 0 1 1 0 1 1 1 1 1 ...
$ west : int 0 0 0 0 0 0 0 0 0 0 ...
$ farm : int 0 0 0 0 1 1 0 0 0 1 ...
$ othrural: int 0 1 1 0 0 0 0 0 0 0 ...
$ town : int 0 0 0 1 0 0 1 0 0 0 ...
$ smcity : int 0 0 0 0 0 0 0 0 0 0 ...
.
.
.
etc
## average number of children per woman in years 1972 to 1984
avkids <- tapply(dfr$kids, INDEX = dfr$year, FUN = mean, na.rm = TRUE) # averages per year
round(avkids, digits = 1) # averages per year
72 74 76 78 80 82 84
3.0 3.2 2.8 2.8 2.8 2.4 2.2
table(dfr$year) # number of observations (families) per year
72 74 76 78 80 82 84
156 173 152 143 142 186 177
[Figure: Average Number of Children per Woman in 1972 to 1984; x-axis: Year (72 to 84), y-axis: average number of children (2.0 to 4.0)]
The fertility rate has clearly declined over the years.
The analysis can be substantially elaborated by regression analysis.
After controlling for other factors (education, age, etc.), what has happened to the fertility rate?
Build a regression with year dummies: y74 for 1974, ..., y84 for 1984.
Year 1972 is the base year.
lm(formula = kids ~ educ + age + I(age^2) + black + east + northcen +
west + farm + y74 + y76 + y78 + y80 + y82 + y84, data = dfr)
Residuals:
Min 1Q Median 3Q Max
-3.9493 -1.0420 -0.0663 0.9324 4.7785
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -7.894707 3.051590 -2.587 0.009805 **
educ -0.124241 0.018149 -6.846 1.25e-11 ***
age 0.538145 0.138400 3.888 0.000107 ***
I(age^2) -0.005868 0.001564 -3.751 0.000185 ***
black 1.083783 0.173404 6.250 5.83e-10 ***
east 0.227601 0.131252 1.734 0.083180 .
northcen 0.371391 0.119968 3.096 0.002012 **
west 0.218869 0.166352 1.316 0.188547
farm -0.091881 0.122027 -0.753 0.451637
y74 0.258628 0.172716 1.497 0.134569
y76 -0.101236 0.178732 -0.566 0.571228
y78 -0.067151 0.181449 -0.370 0.711393
y80 -0.075120 0.182707 -0.411 0.681042
y82 -0.532352 0.172339 -3.089 0.002058 **
y84 -0.538395 0.174472 -3.086 0.002080 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.556 on 1114 degrees of freedom
Multiple R-squared: 0.1263, Adjusted R-squared: 0.1153
F-statistic: 11.51 on 14 and 1114 DF, p-value: < 2.2e-16
Sharp drop in fertility in the early 1980s (the other year dummies are not statistically significant).
E.g., the coefficient on y82 indicates that, holding other factors fixed (educ, age, and others), per 100 women there were about 53 fewer children in 1982 than in 1972.
In particular, since education is controlled for, this decline is separate from the decline due to the increase in education.
Women with more education have fewer children (the coefficient −0.12 is highly statistically significant with t = −6.85 and p-value < 0.0005).
Other things equal, 100 women with a college education (four more years of schooling) tend to have 100 × 4 × 0.124 = 49.6, i.e., about 50, fewer children than 100 women with only a high school education.
In summary, pooled cross section data (independent samples) can be analyzed using dummy variables.
Fixed effects model
Two-period panel data analysis
From each individual (people, firms, schools, cities, countries, etc.) data are collected at two time points, t = 1 and t = 2.
In usual regression one major source of bias stems from omitted (important) variables.
For example, if the true model is
$$y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i + u_i, \tag{1}$$
but we estimate
$$y_i = \beta_0 + \beta_1 x_i + v_i, \tag{2}$$
where
$$v_i = \beta_2 z_i + u_i, \tag{3}$$
the bias in the OLS estimator $\hat\beta_1$ from model (2) is
$$E[\hat\beta_1] - \beta_1 = \beta_2 \frac{\sum_{i=1}^{n}(x_i - \bar{x}) z_i}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \tag{4}$$
which can be substantial if x and z are correlated and β2 is large.
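The bias expression (4) has an exact in-sample counterpart that can be checked numerically. The following is a minimal sketch on simulated data; the sample size, the coefficient values, and all variable names are assumptions made for illustration and are not taken from the course data.

```r
# Simulated illustration of omitted variable bias (all numbers are made up)
set.seed(1)
n <- 500
z <- rnorm(n)                        # variable that will be omitted
x <- 0.8 * z + rnorm(n)              # x is correlated with z
y <- 1 + 2 * x + 3 * z + rnorm(n)    # "true" model (1): beta1 = 2, beta2 = 3

long  <- coef(lm(y ~ x + z))         # model (1), z included
short <- coef(lm(y ~ x))             # model (2), z omitted

# Sample ratio appearing in (4): slope from regressing z on x
ratio <- sum((x - mean(x)) * z) / sum((x - mean(x))^2)

# In-sample identity: short slope = long slope + (coefficient on z) * ratio
implied <- unname(long["x"] + long["z"] * ratio)
c(short = unname(short["x"]), implied = implied, long = unname(long["x"]))
```

The short-regression slope matches the implied value exactly and lies well above the value 2 used in the simulation, because x and z are positively correlated and the coefficient on z is positive.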
The problem is that we usually do not know whether important variables are missing from our model!
Use of panel data makes it possible to eliminate the omitted variable bias in certain cases.
Suppose that we have the following situation in terms of model (1):
$$y_{it} = \beta_0 + \beta_1 x_{it} + \beta_2 z_i + u_{it}, \tag{5}$$
where i refers to individual i and t to time point t.
Thus, we have panel data where data are collected from each individual i at different time points t (in the two-period case, t = 1, 2).
Note that in (5) $z_i$ does not have the time index, which implies that variable z is time invariant (or at least changes very slowly with time).
Suppose we have from each of the n individuals observations on $y_{it}$ and $x_{it}$ at time points t = 1 and t = 2, thus altogether 2n observations.
However, we do not observe $z_i$.
Suppose further that we allow the possibility that the intercept β0 may be different at different time points, such that (5) can be written as
$$y_{it} = \beta_0 + \delta_0 D_t + \beta_1 x_{it} + \beta_2 z_i + u_{it}, \tag{6}$$
where $D_t = 0$ for t = 1 and $D_t = 1$ for t = 2 (time dummy).
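Then taking differences
$$\Delta y_i = y_{i2} - y_{i1},$$
the model in (6) becomes
$$\Delta y_i = \delta_0 + \beta_1 \Delta x_i + \Delta u_i, \tag{7}$$
i.e., the (unobserved) omitted variable disappears because $\beta_2 z_i$ is the same in both periods, and the OLS estimator of the slope parameter β1 is unbiased, provided that $\Delta u_i$ is uncorrelated with $\Delta x_i$.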
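A small simulation can illustrate how differencing removes the unobserved time invariant variable. This is only a sketch under assumed values (n = 2000, β1 = 2, β2 = 3, δ0 = 0.5); the data are artificial and the variable names are hypothetical.

```r
# Two-period panel with an unobserved z_i that is correlated with x_it
set.seed(2)
n  <- 2000
z  <- rnorm(n)                                # unobserved, time invariant
x1 <- z + rnorm(n)                            # x in period t = 1
x2 <- z + rnorm(n)                            # x in period t = 2
y1 <- 1 + 2 * x1 + 3 * z + rnorm(n)           # D_t = 0
y2 <- 1 + 0.5 + 2 * x2 + 3 * z + rnorm(n)     # D_t = 1, delta0 = 0.5

# Pooled OLS on (6) with z left out: the slope absorbs part of the z effect
pooled   <- data.frame(y = c(y1, y2), x = c(x1, x2), D = rep(0:1, each = n))
b_pooled <- unname(coef(lm(y ~ D + x, data = pooled))["x"])

# First differences as in (7): z drops out
dy <- y2 - y1
dx <- x2 - x1
fd <- coef(lm(dy ~ dx))

c(pooled_slope = b_pooled, fd_slope = unname(fd["dx"]),
  fd_intercept = unname(fd["(Intercept)"]))
```

In this setup the pooled slope is far from 2, while the first-differenced slope is close to 2 and the intercept of the differenced regression is close to the assumed δ0 = 0.5.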
The above generalizes immediately: if we denote
$$a_i = z_i'\gamma = \gamma_1 z_{i1} + \gamma_2 z_{i2} + \cdots + \gamma_q z_{iq} \tag{8}$$
and extend (6) to
$$y_{it} = \beta_0 + \delta_0 D_t + \beta_1 x_{it} + a_i + u_{it}, \tag{9}$$
taking differences again reduces to the estimation model (7).
The above model is called the fixed effect (FE) model, in which $a_i$ is fixed over the time periods ($a_i$ can be a random variable and can correlate with the explanatory variable $x_{it}$).
If $a_i$ is not correlated with the explanatory variables, the model is called the random effect (RE) model and is estimated with different techniques that yield more efficient estimators of the β-parameters than the fixed effect methods (which are basically OLS methods). We will return to the RE model later.
In the FE case the estimators of the regression parameters obtained by applying OLS to the first-differenced equation are called the first-differenced estimators (FD estimators).
We will deal with other fixed effect estimators later.
In summary:
Differencing eliminates all unobserved time invariant factors from the model.
A major pitfall is that differencing also wipes out observed time invariant variables (like gender) from the model!
FE cannot be used in these cases (if we want to estimate these effects), or in cases where the explanatory variables change very slowly across time (the difference is nearly zero).
In many cases the FD method is useful, however.
The following example highlights the biasing effect of unobserved factors and how panel estimation with the simple FD method likely solves the problem.
Example 2
Data set crime2 (Wooldridge) contains data on crime and unemployment rates for 46 US cities for 1982 (t = 1) and 1987 (t = 2).
Running a simple cross section regression of crmrte on unem using only 1987 yields
lm(formula = crmrte ~ unem, data = cdfr, subset = year == 87)
Residuals:
Min 1Q Median 3Q Max
-57.55 -27.01 -10.56 18.01 79.75
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 128.378 20.757 6.185 1.8e-07 ***
unem -4.161 3.416 -1.218 0.23
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 34.6 on 44 degrees of freedom
Multiple R-squared: 0.03262, Adjusted R-squared: 0.01063
F-statistic: 1.483 on 1 and 44 DF, p-value: 0.2297
The coefficient on unem is negative, −4.16!
However, it is not statistically significant.
The regression likely suffers from an omitted variables problem (age distribution, gender distribution, education levels, ...).
Most of these can be expected to be fairly stable across time. Thus, use of panel data techniques may be helpful.
Before proceeding to the panel data estimation, let us see what happens if we simply pool the two years and estimate
$$\text{crmrte} = \beta_0 + \delta_0 D87 + \beta_1 \text{unem} + u, \tag{10}$$
where D87 is the year 1987 dummy.
lm(formula = crmrte ~ d87 + unem, data = cdfr)
Residuals:
Min 1Q Median 3Q Max
-53.474 -21.794 -6.266 18.297 75.113
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 93.4203 12.7395 7.333 9.92e-11 ***
d87 7.9404 7.9753 0.996 0.322
unem 0.4265 1.1883 0.359 0.720
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 29.99 on 89 degrees of freedom
Multiple R-squared: 0.01221, Adjusted R-squared: -0.009986
F-statistic: 0.5501 on 2 and 89 DF, p-value: 0.5788
The situation does not change much qualitatively!
R, SAS, Stata, and EViews all have sophisticated panel data procedures.
We discuss some of them later.
R has the plm package for panel data analysis. To use it, variables that identify the individuals and the time periods must be available. In crime2, year is the time index, but a city identifier must be defined (call it city). With these, the FD method can be applied by setting model = "fd" and index = c("city", "year") (in this order!) in the plm call; see the example below.
In Stata the FD method can be applied with the regress routine after first declaring the data to be panel data with the xtset command (Menu: Statistics > Longitudinal/panel data > Setup and utilities > Declare data set to be panel data).
EViews: Proc > Structure/Resize Current Page..., and follow the instructions.
SAS: proc panel data = crime2; id city year; model crmrte = unem; run; Before applying proc panel the data must be sorted by proc sort.
Whichever software is used, identifiers for the individuals (in particular) are needed to indicate the multiple measurements on an individual.
After declaring the panel structure for the program, the model
$$\Delta\text{crmrte} = \delta_0 + \beta_1 \Delta\text{unem} + \Delta u \tag{11}$$
can be estimated with the FD method in R as follows:
plm(formula = crmrte ~ unem, data = cdfr, model = "fd", index = c("city", "year"))
Balanced Panel: n=46, T=2, N=92
Observations used in estimation: 46
Residuals :
Min. 1st Qu. Median 3rd Qu. Max.
-36.90 -13.40 -5.51 12.40 52.90
Coefficients :
Estimate Std. Error t-value Pr(>|t|)
(intercept) 15.40219 4.70212 3.2756 0.00206 **
unem 2.21800 0.87787 2.5266 0.01519 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Total Sum of Squares: 20256
Residual Sum of Squares: 17690
R-Squared: 0.1267
Adj. R-Squared: 0.10685
F-statistic: 6.3836 on 1 and 44 DF, p-value: 0.015189
In EViews, after the data have been restructured as panel data, the FD estimation can be carried out using Quick > Estimate Equation... to open the Equation Estimation window and entering d(crmrte) c d(unem), which gives results similar to those above.
The estimate of β1 ≈ 2.22 is now statistically significant at the 5% level (p ≈ 0.015) and of the expected sign.
The model predicts that a one percentage point increase in the unemployment rate increases crimes by about 2.2 per 1,000 people.
The constant term indicates that even if the change in the unemployment rate were zero, the crime rate has generally increased during the period from 1982 to 1987 by about 15.4 crimes per 1,000 people.
Note that the time dummy component δ0 in (11) captures all unobserved time effects that are common to all cross-sectional individuals.
That is, we can consider δ0 to represent
$$\delta_0 = \delta' z_t = \delta_1 z_{1t} + \delta_2 z_{2t} + \cdots + \delta_p z_{pt},$$
where the $z_t$'s are common trend components affecting all crime rates in t = 1987.
More than two time periods
Differencing can be used with more than two time periods to work out fixed effect estimation.
As an example consider a three-period model
$$y_{it} = \delta_1 + \delta_2 D_{2t} + \delta_3 D_{3t} + \beta_1 x_{it1} + \cdots + \beta_k x_{itk} + a_i + u_{it} \tag{12}$$
for t = 1, 2, 3, where $D_{2t} = 1$ for period t = 2 and zero otherwise, and $D_{3t} = 1$ for t = 3 and zero otherwise.
Differencing eliminates $a_i$ and yields
$$\Delta y_{it} = \delta_2 \Delta D_{2t} + \delta_3 \Delta D_{3t} + \beta_1 \Delta x_{it1} + \cdots + \beta_k \Delta x_{itk} + \Delta u_{it}, \quad t = 2, 3. \tag{13}$$
Note: For t = 2, $\Delta D_{2t} = 1$ and $\Delta D_{3t} = 0 = D_{3t}$; for t = 3, $\Delta D_{2t} = -1$ and $\Delta D_{3t} = 1 = D_{3t}$.
Again the model is simple to estimate by OLS.
Remark 1
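A model of the form (13) is usually reparametrized into an equivalent form
$$\Delta y_{it} = \alpha_0 + \alpha_3 D_{3t} + \beta_1 \Delta x_{it1} + \cdots + \beta_k \Delta x_{itk} + \Delta u_{it}, \tag{14}$$
where $\alpha_0 = \delta_2$ and $\alpha_0 + \alpha_3 = \delta_3 - \delta_2$, i.e., $\alpha_3 = \delta_3 - 2\delta_2$ (the intercept of (13) is $\delta_2$ for t = 2 and $\delta_3 - \delta_2$ for t = 3).
This generalizes to T time periods with time dummies $D_{1t}, D_{2t}, \ldots, D_{Tt}$, in which $D_{jt} = 1$ if j = t and zero otherwise, j = 1, 2, ..., T,
$$\Delta y_{it} = \alpha_0 + \alpha_3 D_{3t} + \cdots + \alpha_T D_{Tt} + \beta_1 \Delta x_{it1} + \cdots + \beta_k \Delta x_{itk} + \Delta u_{it}, \tag{15}$$
where $\alpha_0 = \delta_2$ and $\alpha_0 + \alpha_j = \delta_j - \delta_{j-1}$, j = 3, ..., T.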
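The relation between the differenced-dummy form (13) and the intercept form (14) can be checked numerically. The sketch below uses an artificial three-period panel; the sample size, the coefficient values, and the helper `fdiff` are assumptions introduced only for this illustration.

```r
# Three-period panel: compare form (13) with form (14) after differencing
set.seed(3)
n  <- 200                        # individuals (illustrative)
id <- rep(1:n, each = 3)
tt <- rep(1:3, times = n)        # period index t = 1, 2, 3
a  <- rep(rnorm(n), each = 3)    # unobserved effect a_i
x  <- a + rnorm(3 * n)
D2 <- as.numeric(tt == 2)
D3 <- as.numeric(tt == 3)
y  <- 1 + 0.3 * D2 + 0.8 * D3 + 2 * x + a + rnorm(3 * n)

# First difference within each individual (rows are sorted by id, then period)
fdiff <- function(v) ave(v, id, FUN = function(z) c(NA, diff(z)))
d <- data.frame(tt, D3, dy = fdiff(y), dx = fdiff(x),
                dD2 = fdiff(D2), dD3 = fdiff(D3))
d <- d[d$tt > 1, ]               # the first period is lost by differencing

unique(d[, c("tt", "dD2", "dD3")])   # differenced dummies for t = 2 and t = 3

m13 <- lm(dy ~ 0 + dD2 + dD3 + dx, data = d)   # form (13), no intercept
m14 <- lm(dy ~ D3 + dx, data = d)              # form (14), intercept plus D3

delta <- coef(m13)[c("dD2", "dD3")]
alpha <- coef(m14)[c("(Intercept)", "D3")]
rbind(delta = unname(delta), alpha = unname(alpha))
```

Both forms give the same fitted values and the same slope on dx; the intercept of (14) reproduces the coefficient on the differenced D2, and the intercept plus the D3 coefficient reproduces the period-3 intercept of (13).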
Fixed effects method
An alternative method, which in certain cases works better than the FD method, is called the fixed effects method.
Consider the simple model
$$y_{it} = \beta_1 x_{it} + a_i + u_{it}, \tag{16}$$
i = 1, ..., n, t = 1, ..., T.
Thus there are altogether n × T observations.
Define means over the T time periods
$$\bar{y}_i = \frac{1}{T}\sum_{t=1}^{T} y_{it}, \quad \bar{x}_i = \frac{1}{T}\sum_{t=1}^{T} x_{it}, \quad \bar{u}_i = \frac{1}{T}\sum_{t=1}^{T} u_{it}. \tag{17}$$
Then
$$\bar{y}_i = \beta_1 \bar{x}_i + a_i + \bar{u}_i. \tag{18}$$
Note that
$$\frac{1}{T}\sum_{t=1}^{T} a_i = \frac{1}{T} T a_i = a_i.$$
Thus, subtracting (18) from (16) eliminates $a_i$ and gives
$$y_{it} - \bar{y}_i = \beta_1 (x_{it} - \bar{x}_i) + (u_{it} - \bar{u}_i) \tag{19}$$
or
$$\ddot{y}_{it} = \beta_1 \ddot{x}_{it} + \ddot{u}_{it}, \tag{20}$$
where, e.g., $\ddot{y}_{it} = y_{it} - \bar{y}_i$ is the time-demeaned data on y.
This transformation is also called the within transformation, and the resulting (OLS) estimators of the regression parameters applied to (20) are called fixed effect estimators or within estimators.
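The within transformation can be carried out by hand in base R with `ave()`, which returns group means aligned with the original rows. The following sketch uses simulated data with assumed values (n = 300 individuals, 4 periods, β1 = 2); it is an illustration of (16) to (20), not the course example.

```r
# Within (fixed effects) estimator by explicit time demeaning
set.seed(4)
n     <- 300
n_per <- 4                               # number of periods (T in the text)
id    <- rep(1:n, each = n_per)
a     <- rep(rnorm(n), each = n_per)     # unobserved effect, correlated with x
x     <- a + rnorm(n * n_per)
y     <- 2 * x + a + rnorm(n * n_per)    # model (16) with beta1 = 2

y_dd <- y - ave(y, id)                   # time-demeaned y
x_dd <- x - ave(x, id)                   # time-demeaned x

b_fe     <- unname(coef(lm(y_dd ~ 0 + x_dd)))   # OLS on (20)
b_pooled <- unname(coef(lm(y ~ x))["x"])        # ignores a_i

c(within = b_fe, pooled = b_pooled)
```

In this simulation the within estimate is close to 2, whereas the pooled OLS slope is pushed upward because a_i is positively correlated with x.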
In the two-period case the FD method and the FE method lead to identical results.
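The equality of the FD and FE slope estimates for two periods can be verified on artificial data. In this sketch the sample size, coefficients, and variable names are illustrative assumptions; a period dummy is included in both estimations.

```r
# Two periods: first-difference slope equals within (FE) slope
set.seed(5)
n  <- 100
id <- rep(1:n, each = 2)
D  <- rep(0:1, times = n)               # period dummy, 1 in the second period
a  <- rep(rnorm(n), each = 2)
x  <- a + rnorm(2 * n)
y  <- 1 + 0.5 * D + 2 * x + a + rnorm(2 * n)

# FD: regress the change in y on an intercept and the change in x
dy   <- y[D == 1] - y[D == 0]
dx   <- x[D == 1] - x[D == 0]
b_fd <- unname(coef(lm(dy ~ dx))["dx"])

# FE: regress demeaned y on demeaned period dummy and demeaned x
yd <- y - ave(y, id)
xd <- x - ave(x, id)
Dd <- D - ave(D, id)
b_fe <- unname(coef(lm(yd ~ 0 + Dd + xd))["xd"])

c(fd = b_fd, fe = b_fe)
```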
Remark 2
The slope coefficient β1 estimated from (18) is called the between estimator. $v_i = a_i + \bar{u}_i$ is the error term. The estimator is biased, however, if the unobserved component $a_i$ is correlated with x.
Remark 3
When estimating the unobserved effect by the fixed effect (FE) method,
it is unfortunately not clear how the goodness-of-fit R-square should be
computed. Stata produces three different R-squares: within, between,
and total.
Remark 4
Usually a full set of year dummies (i.e., year dummies for all years but the first) is included in FE estimation to capture time variation. However, then the effect of any variable whose change across time is constant cannot be estimated (an example of such a variable is experience measured by the number of years; experience increases by one every year).
Remark 5
Although time invariant variables cannot be included by themselves in an FE model, their interactions with year dummies can. For example, in a wage equation (year dummy) × (education) measures the change in the return to education over time.
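Remarks 4 and 5 can be illustrated with a dummy variable regression on artificial data. The sketch below assumes a small made-up panel (50 individuals, 4 years); `exper` and `educ` are hypothetical variables constructed to mimic experience and education.

```r
# Individual dummies + year dummies: exper is not identified, educ x year is
set.seed(6)
n     <- 50
n_per <- 4
id    <- rep(1:n, each = n_per)
year  <- rep(1:n_per, times = n)
exper <- rep(sample(0:10, n, replace = TRUE), each = n_per) + year  # +1 every year
educ  <- rep(sample(9:16, n, replace = TRUE), each = n_per)         # time invariant
y     <- 0.1 * exper + rep(rnorm(n), each = n_per) + rnorm(n * n_per)

# Remark 4: exper is an exact linear combination of individual and year dummies
fit_exper <- lm(y ~ factor(id) + factor(year) + exper)
coef(fit_exper)["exper"]                 # reported as NA (not estimable)

# Remark 5: interactions of educ with year dummies can be estimated
fit_inter <- lm(y ~ factor(id) + factor(year) + factor(year):educ)
inter     <- coef(fit_inter)[grep(":educ", names(coef(fit_inter)), fixed = TRUE)]
inter                                    # year-specific education effects relative to a base year
```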
Dummy variable regression
Yet another method is to introduce dummy variables for the cross section units (n − 1 dummy variables) and (possibly) for the periods (T − 1 dummies).
If n and T are large this is not very practical.
It gives the same estimates of the regression coefficients as the time-demeaned method, and the standard errors and test statistics of these coefficients are the same.
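The equivalence of the dummy variable regression and the time-demeaned regression can be checked in base R. The sketch below uses simulated data with assumed values; note that a plain `lm()` on manually demeaned data does not know that n individual means were removed, so its standard error needs a degrees-of-freedom correction before it matches the dummy variable regression.

```r
# Dummy variable regression versus manual within regression
set.seed(7)
n     <- 40
n_per <- 5
id    <- rep(1:n, each = n_per)
a     <- rep(rnorm(n), each = n_per)
x     <- a + rnorm(n * n_per)
y     <- 2 * x + a + rnorm(n * n_per)

fit_dummy <- lm(y ~ x + factor(id))      # one dummy per individual (minus one)

xd <- x - ave(x, id)
yd <- y - ave(y, id)
fit_within <- lm(yd ~ 0 + xd)            # time-demeaned regression

b_dummy   <- unname(coef(fit_dummy)["x"])
b_within  <- unname(coef(fit_within)["xd"])
se_dummy  <- summary(fit_dummy)$coefficients["x", "Std. Error"]
se_naive  <- summary(fit_within)$coefficients["xd", "Std. Error"]

# Correct residual degrees of freedom: nT - n - k instead of nT - k (here k = 1)
N <- n * n_per
se_within <- se_naive * sqrt((N - 1) / (N - n - 1))

c(b_dummy = b_dummy, b_within = b_within, se_dummy = se_dummy, se_within = se_within)
```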
Example 3
Papke (1994), Journal of Public Economics 54, 37–49, studied the effect of the Indiana enterprise zone program on unemployment over the years 1980–1988 (Wooldridge's data base, file: ezunem). Six zones were designated in 1984 and four more in 1985. Twelve cities did not receive a zone (control group).
An evaluation model of the policy is
$$\log(uclms_{it}) = \theta_t + \beta_1 D_{it} + a_i + u_{it}, \tag{21}$$
where $\theta_t$ indicates a time-varying intercept, $uclms_{it}$ is the number of unemployment claims during year t in city i, and $D_{it} = 1$ if city i had the zone in year t and zero otherwise.
First difference estimates for β1:
plm(formula = log(uclms) ~ d81 + d82 + d83 + d84 + d85 + d86 +
d87 + d88 + ez, data = udfr, model = "fd", index = c("city",
"year"))
Balanced Panel: n = 22, T = 9, N = 198
Observations used in estimation: 176
Residuals:
Min. 1st Qu. Median 3rd Qu. Max.
-0.4925469 -0.1426738 -0.0091983 0.1494605 0.6062251
Coefficients:
Estimate Std. Error t-value Pr(>|t|)
d81 -0.321632 0.046064 -6.9823 6.547e-11 ***
d82 0.135496 0.065144 2.0799 0.039059 *
d83 -0.219255 0.079785 -2.7481 0.006654 **
d84 -0.558025 0.094564 -5.9011 1.958e-08 ***
d85 -0.556576 0.108961 -5.1080 8.809e-07 ***
d86 -0.586054 0.118298 -4.9541 1.769e-06 ***
d87 -0.853738 0.126950 -6.7250 2.666e-10 ***
d88 -1.192423 0.135049 -8.8296 1.382e-15 ***
ez -0.181878 0.078186 -2.3262 0.021209 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Total Sum of Squares: 20.678
Residual Sum of Squares: 7.7958
R-Squared: 0.623
Adj. R-Squared: 0.60494
F-statistic: 34.496 on 8 and 167 DF, p-value: < 2.22e-16
The estimate of β1, $\hat\beta_1 = -0.182$, indicates that the presence of an EZ causes about a 16.6% ($e^{-0.182} - 1 = -0.166$) fall in unemployment claims, which is both economically and statistically significant (t = −2.33).
Fixed Effect estimation results
plm(formula = log(uclms) ~ d81 + d82 + d83 + d84 + d85 + d86 +
d87 + d88 + ez, data = udfr, model = "within", index = c("city",
"year"))
Balanced Panel: n = 22, T = 9, N = 198
Residuals:
Min. 1st Qu. Median 3rd Qu. Max.
-0.5761817 -0.1083693 -0.0097701 0.1136396 0.4962292
Coefficients:
Estimate Std. Error t-value Pr(>|t|)
d81 -0.321632 0.060457 -5.3200 3.297e-07 ***
d82 0.135496 0.060457 2.2412 0.0263323 *
d83 -0.219255 0.060457 -3.6266 0.0003811 ***
d84 -0.579152 0.062318 -9.2935 < 2.2e-16 ***
d85 -0.591787 0.065495 -9.0355 3.919e-16 ***
d86 -0.621265 0.065495 -9.4856 < 2.2e-16 ***
d87 -0.888949 0.065495 -13.5727 < 2.2e-16 ***
d88 -1.227633 0.065495 -18.7438 < 2.2e-16 ***
ez -0.104415 0.055419 -1.8841 0.0612906 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Total Sum of Squares: 42.388
Residual Sum of Squares: 6.7144
R-Squared: 0.8416
Adj. R-Squared: 0.81314
F-statistic: 98.5855 on 9 and 167 DF, p-value: < 2.22e-16
Dummy variable regression:
lm(formula = log(uclms) ~ d81 + d82 + d83 + d84 + d85 + d86 +
d87 + d88 + c2 + c3 + c4 + c5 + c6 + c7 + c8 + c9 + c10 +
c11 + c12 + c13 + c14 + c15 + c16 + c17 + c18 + c19 + c20 +
c21 + c22 + ez, data = udfr)
Residuals:
Min 1Q Median 3Q Max
-0.57618 -0.10837 -0.00977 0.11364 0.49623
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 11.67615 0.08008 145.807 < 2e-16 ***
d81 -0.32163 0.06046 -5.320 3.30e-07 ***
d82 0.13550 0.06046 2.241 0.026332 *
d83 -0.21926 0.06046 -3.627 0.000381 ***
d84 -0.57915 0.06232 -9.293 < 2e-16 ***
d85 -0.59179 0.06550 -9.036 3.92e-16 ***
d86 -0.62126 0.06550 -9.486 < 2e-16 ***
d87 -0.88895 0.06550 -13.573 < 2e-16 ***
d88 -1.22763 0.06550 -18.744 < 2e-16 ***
(city dummies deleted)
ez -0.10441 0.05542 -1.884 0.061291 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.2005 on 167 degrees of freedom
Multiple R-squared: 0.9332, Adjusted R-squared: 0.9212
F-statistic: 77.75 on 30 and 167 DF, p-value: < 2.2e-16
The results show that the FE and dummy variable regression results are exactly the same.
Using the FE results, the coefficient −0.104 implies about a 10 percent ($e^{-0.104} - 1 \approx -0.099$) drop in unemployment claims due to the program. At the 5% level the estimate is significant in one-tailed testing but not in two-tailed testing (two-sided p-value 0.061).
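Because the dependent variable is in logs, a dummy coefficient b translates into a percentage change of 100(e^b − 1). The short sketch below applies this to the two ez estimates reported in the FD and FE outputs; the helper name `pct_effect` is introduced here only for illustration.

```r
# Percentage effect implied by a dummy coefficient in a log-level model
pct_effect <- function(b) 100 * (exp(b) - 1)

fd_ez <- pct_effect(-0.181878)   # ez coefficient from the first-difference output
fe_ez <- pct_effect(-0.104415)   # ez coefficient from the fixed effects output

round(c(FD = fd_ez, FE = fe_ez), 1)
```

For small coefficients the exact figure is close to 100 times the coefficient itself, which is why −0.104 is often read directly as roughly a 10 percent drop.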
Fixed effects or first differencing?
If the number of periods is 2 (T = 2), FE and FD give identical results.
When T ≥ 3, FE and FD are not the same. Both are unbiased under assumptions FE.1–FE.4:
FE.1 For each i, the model is $y_{it} = \beta_1 x_{it1} + \cdots + \beta_k x_{itk} + a_i + u_{it}$, t = 1, ..., T.
FE.2 We have a random sample from the cross section.
FE.3 Each explanatory variable changes over time, and the explanatory variables are not perfectly collinear.
FE.4 $E[u_{it} | X_i, a_i] = 0$ for all time periods ($X_i$ stands for all explanatory variables).
FE.5 $\mathrm{var}[u_{it} | X_i, a_i] = \sigma_u^2$ for all t = 1, ..., T.
FE.6 $\mathrm{cov}[u_{it}, u_{is}] = 0$ for all t ≠ s.
FE.7 $u_{it} | X_i, a_i \sim NID(0, \sigma_u^2)$.
Both are consistent under assumptions FE.1–FE.4 for fixed T as n → ∞.
If $u_{it}$ is serially uncorrelated, FE is more efficient than FD (because of this FE is more popular).
If $u_{it}$ is (highly) serially correlated, $\Delta u_{it}$ may be less serially correlated, which may favor FD over FE. However, typically T is rather small, such that serial correlation is difficult to detect.
In sum, there are no clear-cut guidelines for choosing between the two. Thus, good advice is to compute both and, if there is a big difference, try to determine why they differ.
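The efficiency comparison can be explored with a small Monte Carlo experiment. The following sketch is built on assumptions chosen for illustration (30 individuals, 5 periods, 2000 replications, a single regressor, no time dummies) and contrasts two extreme cases: serially uncorrelated errors and random walk errors, for which the differenced errors are uncorrelated.

```r
# Monte Carlo: sampling variance of FE and FD slopes under two error processes
set.seed(8)
n     <- 30
n_per <- 5
reps  <- 2000

one_draw <- function(random_walk) {
  x <- matrix(rnorm(n * n_per), n, n_per)      # rows: individuals, columns: periods
  e <- matrix(rnorm(n * n_per), n, n_per)
  u <- if (random_walk) t(apply(e, 1, cumsum)) else e
  y <- 2 * x + rnorm(n) + u                    # a_i (length n) is recycled across columns
  xd <- x - rowMeans(x)                        # within transformation
  yd <- y - rowMeans(y)
  dx <- x[, -1] - x[, -n_per]                  # first differences
  dy <- y[, -1] - y[, -n_per]
  c(fe = sum(xd * yd) / sum(xd^2), fd = sum(dx * dy) / sum(dx^2))
}

var_iid <- apply(replicate(reps, one_draw(FALSE)), 1, var)
var_rw  <- apply(replicate(reps, one_draw(TRUE)),  1, var)

round(rbind(uncorrelated_errors = var_iid, random_walk_errors = var_rw), 5)
```

In this design the FE slope has the smaller sampling variance when the errors are serially uncorrelated, and the FD slope has the smaller variance when the errors follow a random walk.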
Balanced and unbalanced panels
A data set is called a balanced panel if the same number of time series observations is available for each cross section unit. That is, T is the same for all individuals. The total number of observations in a balanced panel is nT.
All the above examples are balanced panel data sets.
If some cross section units have missing observations, so that for individual i there are $T_i$ time period observations, i = 1, ..., n, with $T_i \neq T_j$ for some i and j, we call the data set an unbalanced panel. The total number of observations in an unbalanced panel is $T_1 + \cdots + T_n$.
In most cases unbalanced panels do not cause major problems for fixed effect estimation.
Modern software packages make the appropriate adjustments to the estimation results.
Random effects models
Consider the simple unobserved effects model
$$y_{it} = \beta_0 + \beta_1 x_{it} + a_i + u_{it}, \tag{22}$$
i = 1, ..., n, t = 1, ..., T.
Typically, time dummies are also included in (22).
Using FD or FE eliminates the unobserved component $a_i$.
However, if $a_i$ is uncorrelated with $x_{it}$, using random effect (RE) estimation can lead to more efficient estimation of the regression parameters.
Generally, we call the model in equation (22) the random effects model if $a_i$ is uncorrelated with all explanatory variables, i.e.,
$$\mathrm{cov}[x_{it}, a_i] = 0, \quad t = 1, \ldots, T. \tag{23}$$
How to estimate β1 efficiently?
If (23) holds, β1 can be estimated consistently from a single cross section.
Obviously this discards lots of useful information.
Also, if (23) holds, β1 can be estimated consistently from the pooled data set by OLS. However, the composite errors $a_i + u_{it}$ are correlated across time within each individual, which typically leads to underestimation of the standard errors.
If the data set is simply pooled and the error term is denoted as $v_{it} = a_i + u_{it}$, we have the regression
$$y_{it} = \beta_0 + \beta_1 x_{it} + v_{it}. \tag{24}$$
Then
$$\mathrm{corr}[v_{it}, v_{is}] = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_u^2} \tag{25}$$
for t ≠ s, where $\sigma_a^2 = \mathrm{var}[a_i]$ and $\sigma_u^2 = \mathrm{var}[u_{it}]$.
That is, the error terms $v_{it}$ are (positively) autocorrelated, which biases the standard errors of the OLS estimator $\hat\beta_1$.
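The correlation in (25) can be illustrated by simulating composite errors for two periods. The standard deviations below (1.0 for a_i and 1.5 for u_it) and the sample size are arbitrary assumptions for this sketch.

```r
# Correlation of composite errors v_it = a_i + u_it across two periods
set.seed(9)
n       <- 20000
sigma_a <- 1.0
sigma_u <- 1.5

a  <- rnorm(n, sd = sigma_a)              # individual effect, shared by both periods
v1 <- a + rnorm(n, sd = sigma_u)          # composite error in period 1
v2 <- a + rnorm(n, sd = sigma_u)          # composite error in period 2

rho_theory <- sigma_a^2 / (sigma_a^2 + sigma_u^2)   # formula (25)
rho_sample <- cor(v1, v2)

c(theory = rho_theory, sample = rho_sample)
```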
If $\sigma_a^2$ and $\sigma_u^2$ were known, optimal estimators (BLUE) would be obtained by generalized least squares (GLS), which in this case reduces to estimating the regression slope coefficients from the quasi-demeaned equation
$$y_{it} - \lambda \bar{y}_i = \beta_0 (1 - \lambda) + \beta_1 (x_{it} - \lambda \bar{x}_i) + (v_{it} - \lambda \bar{v}_i), \tag{26}$$
where
$$\lambda = 1 - \left( \frac{\sigma_u^2}{\sigma_u^2 + T \sigma_a^2} \right)^{1/2}. \tag{27}$$
In practice $\sigma_u^2$ and $\sigma_a^2$ are unknown, but they can be estimated.
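The quasi-demeaning in (26) is easy to code directly when the variance components are treated as known. The sketch below does this on simulated data (the values n = 200, 4 periods, and unit variances are assumptions) and also evaluates the two boundary cases of λ; the function `quasi_demeaned_slope` is a hypothetical helper, not part of any package.

```r
# Quasi-demeaned OLS for a given lambda, with known variance components
set.seed(10)
n       <- 200
n_per   <- 4
id      <- rep(1:n, each = n_per)
sigma_a <- 1
sigma_u <- 1
a <- rep(rnorm(n, sd = sigma_a), each = n_per)
x <- rnorm(n * n_per)                         # uncorrelated with a_i, as (23) requires
y <- 1 + 2 * x + a + rnorm(n * n_per, sd = sigma_u)

quasi_demeaned_slope <- function(lambda) {
  ys <- y - lambda * ave(y, id)               # y_it - lambda * ybar_i
  xs <- x - lambda * ave(x, id)               # x_it - lambda * xbar_i
  unname(coef(lm(ys ~ xs))["xs"])             # intercept estimates beta0 * (1 - lambda)
}

lambda <- 1 - sqrt(sigma_u^2 / (sigma_u^2 + n_per * sigma_a^2))   # formula (27)

b_pooled <- unname(coef(lm(y ~ x))["x"])
xd <- x - ave(x, id)
yd <- y - ave(y, id)
b_within <- sum(xd * yd) / sum(xd^2)

c(lambda = lambda, re = quasi_demeaned_slope(lambda),
  at_zero = quasi_demeaned_slope(0), pooled = b_pooled,
  at_one = quasi_demeaned_slope(1), within = b_within)
```

With λ = 0 the transformation leaves the data unchanged and gives pooled OLS, while λ = 1 fully demeans the data and gives the within estimate; the RE estimate lies between these two transformations.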
One method is to estimate (24) from the pooled data set and use the OLS residuals $\hat{v}_{it}$ to estimate $\sigma_a^2$ and $\sigma_u^2$ and plug them into (27).
The resulting GLS estimators of the regression slope coefficients are called random effects estimators (RE estimators).
Under the random effects assumptions (see the footnote) the estimators are consistent, but not unbiased.
They are also asymptotically normal as n → ∞ for fixed T.
However, with small n and large T the properties of the RE estimator are largely unknown.
Footnote: The ideal random effects assumptions include FE.1, FE.2, and FE.4–FE.6. FE.3 is replaced with
RE.3: There are no perfect linear relationships among the explanatory variables.
RE.4: In addition to FE.4, $E[a_i | X_i] = 0$.
Note that setting λ = 0 in (26) gives the pooled regression, while λ = 1 gives the FE (within) regression.
RE estimation is available in modern statistical packages withdifferent options.
Example 4
Data set wagepan.xls (Wooldridge): n = 545, T = 8.
Is there a wage premium for belonging to a labor union?
$$\log(\mathit{wage}_{it}) = \beta_0 + \beta_1 \mathit{educ}_{it} + \beta_3 \mathit{exper}_{it} + \beta_4 \mathit{exper}_{it}^2 + \beta_5 \mathit{married}_{it} + \beta_6 \mathit{union}_{it} + a_i + u_{it}$$
Year dummies for 1980–1987 are included.
Note that including a full set of year dummies implies that the FE method cannot estimate the effects of variables that change by a constant amount over time. Experience (exper) is such a variable.
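A short derivation may make the collinearity explicit. It assumes that each worker's experience grows by exactly one year per period, that is $\mathit{exper}_{it} = \mathit{exper}_{i1} + (t - 1)$, which is an assumption about the data rather than something stated in the example.

```latex
\[
\mathit{exper}_{it} - \overline{\mathit{exper}}_i
  = (t - 1) - \frac{1}{T}\sum_{s=1}^{T}(s - 1)
  = t - \frac{T + 1}{2}.
\]
```

The time-demeaned experience does not depend on $i$, so it is an exact linear combination of the time-demeaned year dummies and its coefficient cannot be identified by FE. The squared term $\mathit{exper}_{it}^2$ still varies across individuals after demeaning, which is why it remains estimable.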
Estimate Std. Error
(Intercept) -0.0343056 0.06326
educ 0.0989945 0.00462
exper 0.0861696 0.01014
I(exper^2) -0.0027349 0.00071
married 0.1230113 0.01557
union 0.1685243 0.01707
-------------------------------------------
lwage | Pooled Random Fixed
| OLS Effects Effects
--------+----------------------------------
educ | .0989945 .0906150 ..
| (.0046227) (.0105807)
exper | .0861696 .1027934 ..
| (.0101415) (.0153853)
exper2 | -.0027349 -.0046859 -.0051855
| (.0007099) (.0006896) (.0007044)
married | .1230113 .0678821 .0466804
| (.0155714) (.0167369) (.0183104)
union | .1685243 .1031103 .0800019
| (.0170652) (.0178388) (.0193103)
-------------------------------------------
Note that the OLS standard errors tend to be smaller than in the RE or FE cases.
OLS standard errors underestimate the true standard errors.
OLS coefficient estimates also suffer from the omitted variable problem, which the panel estimators account for.
The Stata estimate of the correlation in (25) is .464.
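The reported correlation can be translated into the quasi-demeaning weight of (27). Dividing the numerator and denominator in (27) by $\sigma_a^2 + \sigma_u^2$ expresses the ratio through the correlation $\rho$ of (25) as $(1 - \rho)/(1 + (T - 1)\rho)$. The sketch below applies this with the values given in the example; the variable names are illustrative.

```python
import math

rho = 0.464   # correlation in (25) as reported in the text
T = 8         # number of time periods in the example

# sigma_u^2 / (sigma_u^2 + T * sigma_a^2) = (1 - rho) / (1 + (T - 1) * rho)
lam = 1.0 - math.sqrt((1.0 - rho) / (1.0 + (T - 1) * rho))
print(round(lam, 3))
```

The implied weight is roughly 0.64, so the RE transformation removes about two thirds of each unit's time average. This is consistent with the table, where the RE estimates for married and union lie between the pooled OLS and FE estimates.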
Random effects or fixed effects
FE is widely considered preferable because it allows correlation between $a_i$ and the x variables.
Provided that the common effects, aggregated into $a_i$, are not correlated with the x variables, an obvious advantage of RE is that it also allows estimation of the effects of factors that do not change over time (like education in the above example).
Typically the condition that the common effects $a_i$ are not correlated with the regressors (x variables) should be considered an exception rather than the rule, which favors FE.
Hausman specification test
Hausman (1978) devised a test for the orthogonality of the common effects ($a_i$) and the regressors.
The test compares the fixed effects (OLS) and random effects (GLS) estimates using the Wald testing approach.
The basic idea of the test relies on the fact that under the null hypothesis of orthogonality both OLS and GLS are consistent, while under the alternative hypothesis GLS is not consistent.
Thus, under the null hypothesis the OLS and GLS estimates should not differ much from each other.
The test compares these estimates with a Wald statistic.
In Stata, performing the Hausman test requires that both the OLS and GLS regression results are stored so that they are available to the postestimation test procedure.
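The Wald comparison described above can be sketched in a few lines. The statistic is the quadratic form $(b - B)'[V_b - V_B]^{-1}(b - B)$, where $b$ and $B$ are the FE and RE estimates of the coefficients common to both models. The function name and all numbers below are hypothetical and chosen only to make the arithmetic easy to follow. The use of a pseudo-inverse is a pragmatic choice of this sketch for the case where the variance difference is not positive definite.

```python
import numpy as np

def hausman_statistic(b_fe, b_re, V_fe, V_re):
    """Return (b - B)' [V_b - V_B]^(-1) (b - B) and the number of compared coefficients."""
    d = np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float)
    V_diff = np.asarray(V_fe, dtype=float) - np.asarray(V_re, dtype=float)
    stat = float(d @ np.linalg.pinv(V_diff) @ d)
    return stat, d.size

# Hypothetical estimates and covariance matrices (not taken from any data set).
b_fe = [1.0, 2.0]
b_re = [0.8, 2.1]
V_fe = np.diag([0.05, 0.04])
V_re = np.diag([0.01, 0.03])

stat, df = hausman_statistic(b_fe, b_re, V_fe, V_re)
print(stat, df)
```

Under the null hypothesis the statistic is compared with a chi-square distribution whose degrees of freedom equal the number of compared coefficients. A large value indicates that the FE and RE estimates differ systematically.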
Example 5
Applying the Hausman test to the case of Example 4 in Stata yields:
* Estimate fixed effects
xtreg lwage y81 y82 y83 y84 y85 y86 y87 exper2 married union, fe
* store the results into "hfixed"
estimates store hfixed
* Estimate the random effects model
xtreg lwage y81 y82 y83 y84 y85 y86 y87 educ exper exper2 married union, re
* store the results into "hrandom"
estimates store hrandom
* Hausman test
hausman hfixed hrandom
---- Coefficients ----
| (b) (B) (b-B) sqrt(diag(V_b-V_B))
| hfixed hrandom Difference S.E.
--------+---------------------------------------------------------
y81 | .1511912 .0427498 .1084414 .
y82 | .2529709 .035577 .2173939 .
y83 | .3544437 .0270943 .3273494 .
y84 | .4901148 .052207 .4379078 .
y85 | .6174822 .0690524 .5484299 .
y86 | .7654965 .1053229 .6601736 .
y87 | .9250249 .1505464 .7744785 .
exper2 | -.0051855 -.0046859 -.0004996 .000144
married | .0466804 .0678821 -.0212017 .0074261
union | .0800019 .1031103 -.0231085 .0073935
-------------------------------------------------------------------
b = consistent under Ho and Ha; obtained from xtreg
B = inconsistent under Ha, efficient under Ho; obtained from xtreg
Test: Ho: difference in coefficients not systematic
chi2(10) = (b-B)’[(V_b-V_B)^(-1)](b-B)
= 26.77
Prob>chi2 = 0.0028
(V_b-V_B is not positive definite)
The test rejects the orthogonality condition. Thus, FE should be used.
In EViews the Hausman test is obtained by first estimating the model as a random effects model and then selecting
View > Fixed/Random Effects Testing > Correlated Random Effects - Hausman Test
Policy analysis with panel data
Panel data are useful for policy analysis, in particular program evaluation.
Example 6
Continue Example 1.2, where the effect of a training program on worker productivity was evaluated.
The data include three years, 1987, 1988, and 1989.
The training program was first implemented in 1988.
We focus on the years 1987 (no program) and 1988 (program implemented) to see whether the program benefits firms.
The panel model is
$$\log(\mathit{scrap}_{it}) = \beta_0 + \delta_0\, \mathit{y88} + \beta_1 \mathit{grant}_{it} + a_i + u_{it}, \qquad (28)$$
where $\mathit{y88}$ is the year 1988 dummy (= 1 for year 1988 and = 0 otherwise) and $a_i$ includes the unobserved firm effects (worker skill, etc.).
Ignoring the panel structure, the OLS results suggest no improvement.
Dependent Variable: LOG(SCRAP)
Method: Panel Least Squares
Sample: 1 471 IF YEAR < 1989
Periods included: 2
Cross-sections included: 54
Total panel (balanced) observations: 108
=====================================================
Variable Coefficient Std. Error t-Statistic Prob.
-----------------------------------------------------
C 0.523144 0.159783 3.274086 0.0014
GRANT -0.058018 0.380949 -0.152299 0.8792
-----------------------------------------------------
R-squared 0.000219
Adjusted R-squared -0.009213
S.E. of regression 1.507393
F-statistic 0.023195
Prob(F-statistic) 0.879241
=====================================================
The coefficient for grant is not statistically significant, suggesting that
the program does not help in reducing the scrap rate.
Accounting for the possible firm effects and also including the year
dummy to account for a possible time effect yields
=====================================================
Variable Coefficient Std. Error t-Statistic Prob.
-----------------------------------------------------
C 0.568716 0.048603 11.70126 0.0000
GRANT -0.317058 0.163875 -1.934753 0.0585
-----------------------------------------------------
Effects Specification
Cross-section fixed (dummy variables)
Period fixed (dummy variables)
R-squared 0.964308
Adjusted R-squared 0.926556
S.E. of regression 0.406642
F-statistic 25.54364
Prob(F-statistic) 0.000000
The estimate of the coefficient for the grant is negative. It is close to statistically significant in two-sided testing, and in one-sided testing (program improves) against the alternative
$H_1: \beta_1 < 0$
it is significant at the 5% level with p-value 0.0265.
According to the estimate, participating in the program decreases the scrap rate on average by 32% (more accurately by 27%, since $\exp(-0.317058) - 1 \approx -0.272$).
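The two numbers quoted above can be reproduced with a few lines of arithmetic. The sketch assumes that the one-sided p-value is computed from the standard normal distribution, which matches the reported 0.0265; a t distribution with the regression's degrees of freedom would give a slightly larger value. The inputs are the GRANT coefficient and t-statistic from the fixed effects output.

```python
import math

t_stat = -1.934753      # t-statistic of GRANT in the fixed effects output
beta_grant = -0.317058  # coefficient of GRANT in the fixed effects output

# One-sided p-value for H1: beta1 < 0, i.e. the lower tail of the standard normal CDF
p_one_sided = 0.5 * (1.0 + math.erf(t_stat / math.sqrt(2.0)))

# Exact proportional change implied by a coefficient in a log-level model
pct_change = math.exp(beta_grant) - 1.0

print(round(p_one_sided, 4), round(pct_change, 3))
```

The gap between 32% and 27% arises because a coefficient in a log-level model approximates a percentage change well only when it is small in absolute value.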
Dynamic Panel Models
Many economic relationships are dynamic.
These may be characterized by the presence of lagged dependent variables
$$y_{it} = \delta y_{i,t-1} + x_{it}'\beta + v_{it}, \qquad (29)$$
where
$$v_{it} = a_i + u_{it} \qquad (30)$$
with $a_i \sim \mathrm{iid}(0, \sigma_a^2)$ and $u_{it} \sim \mathrm{iid}(0, \sigma_u^2)$ independent of each other, $i = 1, \ldots, n$, $t = 1, \ldots, T$.
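Alternatively the one-way error component model in (30) can be replaced by a two-way specification such that
$$v_{it} = a_i + b_t + u_{it}, \qquad (31)$$
where all the components are again assumed independent.
After differencing (29) with the one-way error (30), which removes $a_i$, we have
$$\Delta y_{it} = \delta \Delta y_{i,t-1} + \Delta x_{it}'\beta + \Delta u_{it}. \qquad (32)$$
The lagged term $y_{i,t-1}$, which enters the regressor $\Delta y_{i,t-1}$, is correlated with $u_{i,t-1}$, which enters the error $\Delta u_{it}$. This causes problems in estimation.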
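The source of the problem can be written out explicitly. The derivation below is a sketch that assumes the $u_{it}$ are serially uncorrelated and uncorrelated with the $x$ variables, so that $y_{i,t-1}$ depends on $u_{i,t-1}$ but not on $u_{it}$, and $y_{i,t-2}$ depends on neither.

```latex
\[
\begin{aligned}
\mathrm{cov}[\Delta y_{i,t-1}, \Delta u_{it}]
  &= \mathrm{cov}[\,y_{i,t-1} - y_{i,t-2},\; u_{it} - u_{i,t-1}\,] \\
  &= -\mathrm{cov}[\,y_{i,t-1}, u_{i,t-1}\,] = -\sigma_u^2 \neq 0, \\
\mathrm{cov}[\,y_{i,t-2}, \Delta u_{it}\,] &= 0 .
\end{aligned}
\]
```

The first result shows that OLS on the differenced equation is inconsistent. The second shows that, under the stated assumptions, $y_{i,t-2}$ (and earlier lags) are uncorrelated with the differenced error while still being related to $\Delta y_{i,t-1}$, which makes them candidate instruments.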
Once regressor variables are correlated with the error term, OLS or GLS estimators become inconsistent.
A typical solution to the problem is to apply some kind of instrumental variable estimation.
These are least squares (LS) or other types of methods in which instrumental variables are used to remove the inconsistency due to the correlation of the error term with the regressors.
A variable is suitable as an instrumental variable if it is not correlated with the error term but is correlated with the regressors.
Thus, those regressors that are not correlated with the error term can also be used as instruments.
Example 7
2SLS (two-stage least squares).
Consider a standard regression model
$$y_i = x_i'\beta + u_i, \qquad (33)$$
where $x_i$ is a $k$-vector of regressors (including the constant term) with $\mathrm{cov}[x_i, u_i] \neq 0$, $i = 1, \ldots, n$.
Suppose we have $m \geq k$ additional variables in $z_i$ (an $m$-vector) such that $\mathrm{cov}[z_i, u_i] = 0$ but $\mathrm{cov}[z_i, x_i] \neq 0$.
The 2SLS solution to the problem is first (first stage) to use OLS to regress the x-variables on the z-variables.
In the second stage, replace the original regressors $x_i$ by the predicted variables $\hat{x}_i$ from the first stage, and estimate $\beta$ from the regression
$$y_i = \hat{x}_i'\beta + u_i. \qquad (34)$$
The estimator
$$\hat{\beta}_{2SLS} = (\hat{X}'\hat{X})^{-1}\hat{X}'y \qquad (35)$$
is called the 2SLS estimator of $\beta$.
In particular, if $m = k$ then (35) becomes
$$\hat{\beta}_{IV} = (Z'X)^{-1}Z'y, \qquad (36)$$
which is called the instrumental variable (IV) estimator of $\beta$.
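The two stages of Example 7 can be sketched on simulated data. The data-generating process below is an assumption made purely for illustration: one endogenous regressor, one instrument, and a true slope of 2. With $m = k$ the sketch also checks that the 2SLS formula (35) and the IV formula (36) give the same estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)                 # instrument: drives x, unrelated to u
e = rng.normal(size=n)                 # common shock that makes x endogenous
x = 0.8 * z + e
u = e + 0.5 * rng.normal(size=n)       # error term, correlated with x through e
y = 1.0 + 2.0 * x + u                  # true slope is 2

X = np.column_stack([np.ones(n), x])   # regressors, k = 2
Z = np.column_stack([np.ones(n), z])   # instruments, m = 2 (the constant instruments itself)

# OLS for comparison: inconsistent because cov[x, u] != 0
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)

# First stage: fitted values from regressing each column of X on Z
X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
# Second stage: formula (35) with the fitted regressors
beta_2sls = np.linalg.solve(X_hat.T @ X_hat, X_hat.T @ y)
# Just-identified case m = k: formula (36)
beta_iv = np.linalg.solve(Z.T @ X, Z.T @ y)

print(beta_ols.round(3), beta_2sls.round(3), beta_iv.round(3))
```

In this setup the OLS slope is pulled well above 2 by the correlation between the regressor and the error, while the 2SLS and IV slopes are close to 2 and coincide up to rounding error. Note that standard errors taken from a manually run second-stage regression would not be valid, because its residuals are computed from the fitted rather than the original regressors.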
Example 8
(Data: http://eu.wiley.com/college/baltagi/ > Student companion site > datasets)
Demand for cigarettes in 46 US states [annual data, 1963–1992]. Estimated equation
$$c_{it} = \alpha + \beta_1 c_{i,t-1} + \beta_2 p_{it} + \beta_3 y_{it} + \beta_4 pn_{it} + v_{it}, \qquad (37)$$
where
$$v_{it} = a_i + b_t + u_{it}, \qquad (38)$$
$a_i$ and $b_t$ are fixed effects, $u_{it} \sim \mathrm{NID}(0, \sigma_u^2)$, and all the observable variables are in logarithms:
$c_{it}$ = real per capita sales of cigarettes by persons of smoking age (14 and older)
$p_{it}$ = real average retail price of a pack of cigarettes
$y_{it}$ = real per capita disposable income
$pn_{it}$ = the minimum real price of cigarettes in any neighboring state (proxy for the casual smuggling effect across state borders)
$c_{i,t-1}$ is very likely correlated with the error term.
For reference purposes, estimating with panel OLS (average of within
group regressions with time dummies) yields
Fixed-effects (within) regression Number of obs = 1334
Group variable: state Number of groups = 46
R-sq: within = 0.9283 Obs per group: min = 29
between = 0.9859 avg = 29.0
overall = 0.9657 max = 29
F(32,1256) = 508.07
corr(u_i, Xb) = 0.4743 Prob > F = 0.0000
-----------------------------------------------------
lc | Coef. Std. Err. t P>|t|
-------------+---------------------------------------
lc |
L1. | .8302514 .0126242 65.77 0.000
|
lp | -.2916822 .0230847 -12.64 0.000
ly | .1068698 .0233417 4.58 0.000
lpn | .0354559 .02656 1.33 0.182
_cons | .8204374 .2228775 3.68 0.000
-------------+---------------------------------------
sigma_u | .02738301
sigma_e | .03504776
rho | .37905103 (fraction of variance due to u_i)
-----------------------------------------------------
F test that all u_i=0: F(45, 1256) = 4.52
Prob > F = 0.0000
Several methods have been proposed for estimation when there is potential correlation between the error term and (some) regressors.
GMM (Generalized Method of Moments) estimation has lately gained much popularity, in particular when there are non-linear moment restrictions.
Stata has the xtdpd procedure, which produces the Arellano and Bond or the Arellano-Bover/Blundell-Bond estimator. These are GMM estimators in which the instruments are defined in a particular way (the idea will be discussed in the classroom).
xtdpd l(0/1).lc lp ly lpn y66-y92, div(lp ly lpn y66-y92) dgmmiv(lc)
Dynamic panel-data estimation Number of obs = 1334
Group variable: state Number of groups = 46
Time variable: year
Obs per group: min = 29
avg = 29
max = 29
Number of instruments = 437 Wald chi2(31) = 13273.45
Prob > chi2 = 0.0000
One-step results
-----------------------------------------------------
lc | Coef. Std. Err. z P>|z|
-------------+---------------------------------------
lc |
L1. | .8201729 .0161446 50.80 0.000
|
lp | -.3607549 .0311244 -11.59 0.000
ly | .1871102 .0334027 5.60 0.000
lpn | -.0215713 .0399233 -0.54 0.589
-----------------------------------------------------
Instruments for differenced equation
GMM-type: L(2/.).lc
Standard: D.lp D.ly D.lpn D.y66 D.y67 D.y68
D.y69 D.y70 D.y71 D.y72 D.y73 D.y74 D.y75
D.y76 D.y77 D.y78 D.y79 D.y80 D.y81 D.y82
D.y83 D.y84 D.y85 D.y86 D.y87 D.y88 D.y89
D.y90 D.y91 D.y92
Instruments for level equation
Standard: _cons
Test for the orthogonality conditions of the instruments
Sargan test of overidentifying restrictions
H0: overidentifying restrictions are valid
chi2(405) = 561.5047
Prob > chi2 = 0.0000
The orthogonality conditions are rejected.
The reason may be that the errors are MA(1), which implies that the GMM instruments ($lc_{t-2}, \ldots$) are correlated with the error term.
One can try to fix this by starting the instruments from $t - 3$ with the command · · · dgmmiv(lc, lagrange(3 .)).
Doing this improved the situation slightly but still led to rejection of the orthogonality conditions.
We do not, however, continue the analysis further here.