Selection Bias, Comparative Advantage and Heterogeneous
Returns to Education: Evidence from China
James J. Heckman (University of Chicago)
Xuesong Li (Institute of Quantitative & Technical Economics,
Chinese Academy of Social Sciences)
1. Introduction
2. Models with and without Heterogeneity
3. Selection Bias and The Marginal Treatment Effect
4. Data Set and Empirical Results
5. Concluding Remarks
1. Introduction
Heterogeneity and missing counterfactual states are central features of
micro data.
This paper uses Chinese micro data to estimate the return to education
in China, accounting for both heterogeneity and selection bias.
Our work builds on previous research by Heckman and Vytlacil (1999,
2001) and Carneiro (2002), which develops a semiparametric
framework.
2. Models with and without Heterogeneity
A conventional model of the return to education without heterogeneity in returns:
(1)  $\ln Y_i = \beta S_i + \gamma X_i + U_i$
i indexes individuals (i = 1, 2, ..., n),
ln Yi is log income,
Si is schooling level or years of schooling,
Xi is a vector of variables,
β is the rate of return to education,
γ is a vector of coefficients.
OLS problem: omitted ability Ai.
Three strategies:
(1) IV.
However, satisfactory instruments are very hard to find. In fact, most commonly
used instruments in the schooling literature are invalid because they are
correlated with the omitted ability.
(2) Fixed effect method: find a paired comparison such as a genetic twin or
sibling with similar or identical ability.
It needs enough information.
(3) Proxy variables for ability
Many empirical analyses reveal that better family background and better family
resources are usually associated with better environments which raise ability.
In our empirical work we use parental income as a proxy for ability.
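The omitted-ability problem and the proxy strategy can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the true return (0.30), the ability effect, and the way parental income tracks ability are hypothetical values chosen for illustration, not estimates from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

ability = rng.normal(size=n)                     # unobserved A_i
parental_income = ability + rng.normal(size=n)   # hypothetical noisy proxy for A_i
# Hypothetical selection: high-ability people are more likely to attend college
s = (0.8 * ability + rng.normal(size=n) > 0).astype(float)

beta_true = 0.30                                 # hypothetical homogeneous return
ln_y = 8.0 + beta_true * s + 0.5 * ability + rng.normal(scale=0.5, size=n)

def ols(y, *regressors):
    """Least-squares coefficients with an intercept in position 0."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_short = ols(ln_y, s)[1]                    # ability omitted
b_proxy = ols(ln_y, s, parental_income)[1]   # proxy included
b_full = ols(ln_y, s, ability)[1]            # infeasible: ability observed

print(f"omitted: {b_short:.3f}  proxy: {b_proxy:.3f}  full: {b_full:.3f}")
```

In this setup the proxy removes part, but not all, of the upward bias, because parental income measures ability with noise.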
A model with heterogeneous returns to education (in random coefficient form):
(2)  $\ln Y_i = \beta_i S_i + \gamma X_i + U_i$
βi is the heterogeneous rate of return to education, which varies among
individuals.
Xi is a vector of variables including the proxy for ability.
We focus on two schooling choices:
(1) high school Si=0
(2) college Si=1
The two potential outcomes are:
(3a)  $\ln Y_{0i} = \gamma_0 X_i + U_{0i}$  if  $S_i = 0$
(3b)  $\ln Y_{1i} = \gamma_1 X_i + U_{1i}$  if  $S_i = 1$
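Observed log earnings are:
(4a)  $\ln Y_i = S_i \ln Y_{1i} + (1 - S_i) \ln Y_{0i}$
(4b)  $\quad = (\gamma_1 - \gamma_0) X_i S_i + \gamma_0 X_i + U_{0i} + (U_{1i} - U_{0i}) S_i$
(4c)  $\quad = [(\gamma_1 - \gamma_0) X_i + (U_{1i} - U_{0i})] S_i + \gamma_0 X_i + U_{0i}$
(4d)  $\quad = \beta_i S_i + \gamma_0 X_i + U_{0i}$
where
(5)  $\beta_i = (\gamma_1 - \gamma_0) X_i + (U_{1i} - U_{0i})$
is the heterogeneous return to education for individual i.
βi varies in the population, and the return to schooling is a random variable with
a distribution.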
The mean of βi given X is:
(6)  $\bar\beta = E(\beta_i \mid X_i) = (\gamma_1 - \gamma_0) X_i$
Decision rule:
(7)  $S_i^* = P_i(Z_i) - U_{si}$;  $S_i = 1$ if $S_i^* \ge 0$, $S_i = 0$ otherwise,
where
Si* is a latent variable denoting the net benefit of going to school,
Zi is an observed vector of variables.
Pi = Pi (Zi) is the propensity score or probability of receiving treatment (going to
college). P(Z) can be estimated by a logit or probit model.
Usi is the unobserved heterogeneity for individual i in the treatment selection
equation. Without loss of generality, we may assume that Usi ~ Unif [0,1].
The decision of whether to go to college (or not) for individual i is determined
completely by the comparison of the observed heterogeneity Pi(Zi) with the
unobserved heterogeneity Usi.
The smaller the Usi, the more likely it is that the person goes to college.
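The decision rule can be sketched in a few lines. The scalar Z_i and the logit coefficients below are hypothetical, chosen only to make the rule concrete: a person attends college when the propensity score P_i(Z_i) is at least as large as the uniform draw U_si.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

z = rng.normal(size=n)                        # hypothetical scalar Z_i
p = 1.0 / (1.0 + np.exp(-(0.2 + 1.0 * z)))    # logit propensity score, made-up coefficients
u_s = rng.uniform(size=n)                     # U_si ~ Unif[0, 1]

s_star = p - u_s                              # latent net benefit S_i*
s = (s_star >= 0).astype(int)                 # S_i = 1 if S_i* >= 0, else 0

# Because U_si is uniform, P(S_i = 1 | Z_i) equals P_i(Z_i), so the college
# share should track the mean propensity score.
print("share in college:", s.mean(), " mean propensity:", p.mean())
# A smaller U_si makes attendance more likely.
print("attendance, U_s < 0.2:", s[u_s < 0.2].mean(), " U_s > 0.8:", s[u_s > 0.8].mean())
```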
3. Selection Bias and The Marginal Treatment Effect
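(8)  $ATE = E(\ln Y_{1i} - \ln Y_{0i} \mid X_i) = E(\beta_i \mid X_i) = \bar\beta$
ATE is the average treatment effect (the effect of randomly assigning a person to
schooling).
(9)  $\operatorname{plim}(\hat\beta_{OLS}) = E(\ln Y_i \mid X_i, S_i = 1) - E(\ln Y_i \mid X_i, S_i = 0)$
     $\quad = E(\gamma_1 X_i + U_{1i} \mid X_i, S_i = 1) - E(\gamma_0 X_i + U_{0i} \mid X_i, S_i = 0)$
     $\quad = \bar\beta + [E(U_{1i} \mid S_i = 1) - E(U_{0i} \mid S_i = 0)]$
     The first term is the ATE; the bracketed term is the bias.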
(10)  $\operatorname{plim}(\hat\beta_{OLS}) = E(\ln Y_i \mid X_i, S_i = 1) - E(\ln Y_i \mid X_i, S_i = 0)$
      $\quad = E(\beta_i \mid X_i, S_i = 1) + [E(U_{0i} \mid S_i = 1) - E(U_{0i} \mid S_i = 0)]$
      The first term is TT; the bracketed term is the selection bias.
Selection bias is the mean difference in the no-schooling (S = 0) unobservables
between those who go to school and those who do not.
TT (treatment on the treated) is the effect of treatment on those who receive it
(e.g. go to college) compared with what they would experience without treatment
(i.e. do not go to college), defined as
(11)  $TT = E(\ln Y_{1i} - \ln Y_{0i} \mid X_i, S_i = 1) = E(\beta_i \mid X_i, S_i = 1)$
      $\quad = \bar\beta + E(U_{1i} - U_{0i} \mid S_i = 1) = ATE + E(U_{1i} - U_{0i} \mid S_i = 1)$
      The last term, $E(U_{1i} - U_{0i} \mid S_i = 1)$, is the sorting effect.
Sorting effect is the mean gain of the unobservables for people who choose ‘1’.
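The decompositions in (9) to (11) can be checked numerically. The sketch below drops X_i for simplicity and uses a hypothetical data-generating process (the value of the ATE and the dependence of U_0i and U_1i on U_si are assumptions for illustration). It verifies that the OLS contrast equals TT plus selection bias, and that TT equals ATE plus the sorting effect.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
beta_bar = 0.40                                  # hypothetical ATE

p = rng.uniform(0.1, 0.9, size=n)                # propensity score P_i
u_s = rng.uniform(size=n)                        # U_si ~ Unif[0, 1]
s = p >= u_s                                     # decision rule (7)

v = u_s - 0.5
u0 = 0.5 * v + rng.normal(scale=0.3, size=n)     # U_0i, assumed to rise with U_si
u1 = -0.3 * v + rng.normal(scale=0.3, size=n)    # U_1i, assumed to fall with U_si
ln_y0 = 8.0 + u0                                 # potential outcome without college
ln_y1 = 8.0 + beta_bar + u1                      # potential outcome with college
ln_y = np.where(s, ln_y1, ln_y0)                 # observed outcome, as in (4a)

ols = ln_y[s].mean() - ln_y[~s].mean()           # OLS contrast without covariates
tt = (ln_y1 - ln_y0)[s].mean()                   # treatment on the treated
selection_bias = u0[s].mean() - u0[~s].mean()    # bracketed term in (10)
sorting_effect = (u1 - u0)[s].mean()             # last term in (11)

print("OLS vs TT + selection bias:", ols, tt + selection_bias)
print("TT vs ATE + sorting effect:", tt, beta_bar + sorting_effect)
```

With these assumed signs the selection bias is negative and the sorting effect is positive, so OLS understates TT; other parameter choices would give other signs.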
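IV is not a consistent estimator in the presence of heterogeneity and selection
bias.
(12)  $\operatorname{plim}(\hat\beta_{IV}) = \frac{\mathrm{Cov}(I_i, \ln Y_i)}{\mathrm{Cov}(I_i, S_i)} = \bar\beta + \frac{\mathrm{Cov}(I_i, U_{0i})}{\mathrm{Cov}(I_i, S_i)} + \frac{\mathrm{Cov}[I_i, (U_{1i} - U_{0i}) S_i]}{\mathrm{Cov}(I_i, S_i)}$
      $\quad = \bar\beta + \frac{\mathrm{Cov}[I_i, (U_{1i} - U_{0i}) S_i]}{\mathrm{Cov}(I_i, S_i)} = \bar\beta + \frac{\mathrm{Cov}[I_i, (U_{1i} - U_{0i}) \mid S_i = 1] \, P_i}{\mathrm{Cov}(I_i, S_i)}$
where Ii is the instrument; the second line takes Cov(Ii, U0i) = 0.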
Neither OLS nor IV is a consistent estimator of the mean return to education in
the presence of heterogeneity and selection.
Under certain assumptions, it is possible to identify the heterogeneous return
to education with the marginal treatment effect (MTE) via the method of local
instrumental variables (LIV), where the MTE is:
$MTE(X_i = x, U_{si} = u_s) = E(\ln Y_{1i} - \ln Y_{0i} \mid X_i = x, U_{si} = u_s) = E(\beta_i \mid X_i = x, U_{si} = u_s)$
$\quad = (\gamma_1 - \gamma_0) x + E(U_{1i} - U_{0i} \mid U_{si} = u_s).$
The MTE is the average willingness to pay (WTP) for lnY1i (compared to lnY0i)
given characteristics Xi and unobserved heterogeneity Usi.
MTE can be estimated from the following relationship, where LIV can be
estimated by semiparametric methods for derivatives (Heckman, 2001):
$MTE(X_i = x, U_{si} = p) = LIV(X_i = x, P_i = p) = \frac{\partial E(\ln Y_i \mid X_i = x, P_i = p)}{\partial p}$
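The LIV idea, that the MTE is the derivative of E(ln Y | P = p) with respect to p, can be sketched on simulated data. Everything here is an assumption for illustration: there are no covariates X, the true MTE curve is taken to be 0.8 - 0.6 u, and the slope is estimated by a local linear regression with a Gaussian kernel. The helper names `simulate` and `liv` are hypothetical and the code is not the authors' procedure.

```python
import numpy as np

def simulate(n=20_000, seed=0):
    """Simulate (P, ln Y) under a hypothetical MTE(u) = 0.8 - 0.6 u."""
    rng = np.random.default_rng(seed)
    p = rng.uniform(0.05, 0.95, size=n)        # propensity score P(Z)
    u_s = rng.uniform(size=n)                  # U_s ~ Unif[0, 1]
    s = (p >= u_s).astype(float)               # decision rule
    gain = 0.8 - 0.6 * u_s + rng.normal(scale=0.2, size=n)   # individual return
    ln_y = 8.0 + rng.normal(scale=0.3, size=n) + s * gain
    return p, ln_y

def liv(p, ln_y, p0, bandwidth=0.2):
    """Slope of a kernel-weighted linear fit of ln Y on P around p0."""
    w = np.exp(-0.5 * ((p - p0) / bandwidth) ** 2)     # Gaussian kernel weights
    X = np.column_stack([np.ones_like(p), p - p0])     # local intercept and slope
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], ln_y * sw, rcond=None)
    return coef[1]                                     # estimate of dE(ln Y | P)/dp at p0

p, ln_y = simulate()
mte_hat = liv(p, ln_y, p0=0.5)
print("estimated MTE at u_s = 0.5:", mte_hat, " assumed truth:", 0.8 - 0.6 * 0.5)
```

Near the ends of the support of P a local linear slope is less reliable, which is one reason the bandwidth and the range of evaluation points matter in practice.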
All the other treatment parameters can be expressed as weighted averages of the MTE:
$ATE = \int_0^1 MTE(u_s) \, du_s$
$TT = \int_0^1 MTE(u_s) \, h_{TT}(u_s) \, du_s$
$TUT = \int_0^1 MTE(u_s) \, h_{TUT}(u_s) \, du_s$
where the weights are:
$h_{TT}(u_s) = \frac{1 - F_P(u_s)}{E(P_i)} = \frac{\int_{u_s}^{1} f(p) \, dp}{E(P_i)}$
$h_{TUT}(u_s) = \frac{F_P(u_s)}{E(1 - P_i)} = \frac{\int_{0}^{u_s} f(p) \, dp}{E(1 - P_i)}$
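The weights for TT and TUT depend only on the distribution of the propensity score, so they can be computed from an empirical CDF and combined with any MTE curve. The sketch below is illustrative: the Beta-distributed propensity scores and the linear MTE curve are hypothetical, `treatment_weights` is a made-up helper name, and integrals over [0, 1] are approximated by averaging over grid midpoints.

```python
import numpy as np

def treatment_weights(p_sample, grid_size=1000):
    """Return grid u and weights h_TT(u), h_TUT(u) from a sample of propensity scores."""
    p_sorted = np.sort(np.asarray(p_sample, dtype=float))
    u = (np.arange(grid_size) + 0.5) / grid_size                  # midpoints of [0, 1]
    cdf = np.searchsorted(p_sorted, u, side="right") / p_sorted.size   # empirical F_P(u)
    h_tt = (1.0 - cdf) / p_sorted.mean()                          # [1 - F_P(u)] / E(P)
    h_tut = cdf / (1.0 - p_sorted.mean())                         # F_P(u) / E(1 - P)
    return u, h_tt, h_tut

rng = np.random.default_rng(3)
p_sample = rng.beta(2.0, 2.0, size=5000)     # hypothetical propensity scores
u, h_tt, h_tut = treatment_weights(p_sample)

mte = 0.8 - 0.6 * u                          # hypothetical declining MTE curve
ate = mte.mean()                             # integral of MTE over [0, 1]
tt = (mte * h_tt).mean()                     # integral of MTE * h_TT
tut = (mte * h_tut).mean()                   # integral of MTE * h_TUT

print("weights integrate to:", h_tt.mean(), h_tut.mean())
print("ATE, TT, TUT:", ate, tt, tut)
```

With a declining MTE curve, TT puts more weight on low values of u_s and therefore exceeds ATE, while TUT falls below it.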
Treatment on the untreated (TUT) is the effect of treatment on those
who do not receive it (i.e. do not go to college) compared with what
they would experience with the treatment (i.e. go to college):
$TUT = E(\ln Y_{1i} - \ln Y_{0i} \mid X_i, S_i = 0) = E(\beta_i \mid X_i, S_i = 0)$
$\quad = \bar\beta + E(U_{1i} - U_{0i} \mid S_i = 0).$
4. Data Set and Empirical Results
Data Source: China Urban Household Income and Expenditure Survey
(CUHIES) 2000
Conducted by the Urban Socio-Economic Survey Organization of the National
Bureau of Statistics.
Six provinces: Guangdong, Liaoning, Sichuan, Shaanxi, Zhejiang and Beijing.
Sample size: 4250 households.
For each household, there is rich information on all household
members, including head, spouse, children and parents.
Age, sex, education level, employment status and enterprise
ownership, occupation, years of work experience and total annual
income are available for each household member.
There are seven education levels in the sample: university, college,
special technical school, senior high school, junior high school, primary
school, and other.
The estimation sample consists of 587 individuals: 273 people with
four-year college (or university) certificates and 314 people with only
senior high school certificates.
Table 2. Summary Statistics

Variable                                          All (n=587)      Treated (n=273)  Untreated (n=314)
                                                  Mean   Std. Err  Mean   Std. Err  Mean   Std. Err
Log Wage                                          8.86   0.86      9.12   0.77      8.64   0.88
Age                                               26.25  4.72      26.48  4.14      26.06  5.16
Years of work experience                          6.41   4.92      5.83   4.47      6.91   5.23
4-Year college attendance                         0.47   0.50      1      0         0      0
Male                                              0.56   0.50      0.54   0.50      0.59   0.49
Lived in Guangdong Province (GD)                  0.18   0.39      0.19   0.39      0.18   0.38
Lived in Liaoning Province (LN)                   0.28   0.45      0.30   0.46      0.27   0.44
Lived in Shaanxi Province (SX)                    0.10   0.30      0.08   0.27      0.12   0.33
Lived in Sichuan Province (SC)                    0.16   0.37      0.15   0.36      0.17   0.38
Lived in Beijing (BJ)                             0.15   0.36      0.15   0.36      0.14   0.35
Lived in Zhejiang Province (ZJ)                   0.12   0.33      0.12   0.33      0.12   0.33
Worked in state owned enterprises (SOEs)          0.62   0.49      0.72   0.45      0.54   0.50
Worked in collective-owned firms                  0.08   0.27      0.04   0.20      0.11   0.32
Worked in joint-venture or foreign owned firms    0.18   0.39      0.19   0.40      0.17   0.38
Worked in private owned firms                     0.12   0.32      0.05   0.21      0.18   0.38
Worked in IND_CON sector*                         0.26   0.44      0.21   0.40      0.32   0.47
Worked in TRA_COM sector*                         0.03   0.17      0.03   0.17      0.03   0.18
Worked in HOU_RES sector*                         0.08   0.27      0.07   0.26      0.09   0.29
Worked in SPO_SOC sector*                         0.22   0.41      0.16   0.36      0.27   0.45
Worked in CUL_SCI sector*                         0.10   0.29      0.14   0.34      0.06   0.24
Worked in FIN_INS sector*                         0.11   0.32      0.09   0.28      0.13   0.34
Worked in GOVERN sector*                          0.03   0.16      0.04   0.20      0.02   0.13
Worked in OTHER sector*                           0.17   0.38      0.27   0.45      0.08   0.28
Years of father’s education                       11.36  3.38      12.26  3.26      10.57  3.28
Years of mother’s education                       9.90   2.99      10.41  3.31      9.46   2.60
Parental income (in 1000 yuan)                    21.39  16.59     24.36  15.89     18.81  16.78
Table 3. Estimated Mincer Model

Variable                            OLS                         IV
                                    Coefficient  Std. Error     Coefficient  Std. Error
Intercept                           8.3189       0.1493         8.3040       0.1552
4-year college attendance           0.2929       0.0630         0.5609       0.1695
Years of work experience            0.0380       0.0194         0.0196       0.0202
Experience squared                  -0.0016      0.0010         -0.0007      0.0010
Parental income in 1000 yuan        0.0117       0.0020         0.0098       0.0023
Male                                0.1537       0.0602         0.1439       0.0607
Lived in Guangdong Province         0.7543       0.1255         0.7908       0.1267
Lived in Liaoning Province          0.2693       0.1085         0.3142       0.1092
Lived in Sichuan Province           0.2278       0.1181         0.2759       0.1192
Lived in Beijing                    0.7246       0.1241         0.7775       0.1256
Lived in Zhejiang Province          0.6241       0.1297         0.6739       0.1314
Worked in state owned enterprises   -0.3679      0.0855         -0.3873      0.0868
Worked in collective-owned firms    -0.4786      0.1288         -0.5890      0.1298
Worked in private owned firms       -0.4649      0.1179         -0.5304      0.1179
Worked in IND_CON sector*           -0.2793      0.0788         -0.3048      0.0792
Worked in TRA_COM sector*           -0.4512      0.1762         -0.4645      0.1779
Worked in SPO_SOC sector*           -0.2880      0.0900         -0.3106      0.0905
Worked in FIN_INS sector*           -0.3220      0.1050         -0.3327      0.1061
Table 4. Estimated Logit Model For Schooling

Variable                        Coefficient  Standard Error  Mean Marginal Effect
Intercept                       -4.7370      0.7305          -
Years of father’s education     0.1017       0.0297          0.0211
Years of mother’s education     0.0605       0.0342          0.0126
Parental income in 1000 yuan    0.0190       0.0069          0.0040
Born before 1964                2.0008       0.7969          0.4159
Born in 1964                    1.7285       0.9189          0.3593
Born in 1965                    3.3423       0.8257          0.6947
Born in 1966                    3.1813       0.8552          0.6613
Born in 1967                    1.8455       1.1126          0.3836
Born in 1968                    2.9030       0.8161          0.6034
Born in 1969                    2.2569       0.7941          0.4691
Born in 1970                    1.5076       0.7534          0.3134
Born in 1971                    3.0771       0.7138          0.6396
Born in 1972                    2.6424       0.7183          0.5492
Born in 1973                    2.5395       0.6809          0.5279
Born in 1974                    2.7740       0.6753          0.5766
Born in 1975                    2.7931       0.6763          0.5806
Born in 1976                    2.8634       0.6669          0.5952
Born in 1977                    2.5890       0.6672          0.5381
Born in 1978                    2.5572       0.6656          0.5315
Born in 1979                    1.3631       0.7636          0.2833
Table 5. Estimated Coefficients from Local Linear Regression
(Gaussian kernel, bandwidth = 0.4)

Variable                            High School                 College
                                    Coefficient  Std. Err.      Coefficient  Std. Err.
Years of work experience            0.0360       0.0225         0.0141       0.0278
Experience squared                  -0.0013      0.0011         -0.0009      0.0013
Parental income in 1000 yuan        0.0188       0.0038         0.0077       0.0038
Male                                0.1365       0.0723         0.1913       0.0777
Lived in Guangdong Province         0.5712       0.1961         0.8853       0.1590
Lived in Liaoning Province          0.1901       0.1263         0.3929       0.1049
Lived in Sichuan Province           0.2612       0.1364         0.2296       0.1081
Lived in Beijing                    0.7122       0.1695         0.7971       0.1301
Lived in Zhejiang Province          0.6930       0.1551         0.5461       0.1744
Worked in state owned enterprises   -0.3368      0.1188         -0.4471      0.1093
Worked in collective-owned firms    -0.6060      0.2065         -0.5868      0.1771
Worked in private owned firms       -0.4205      0.1511         -0.6256      0.1677
Worked in IND_CON sector*           -0.2297      0.0821         -0.3978      0.0990
Worked in TRA_COM sector*           -0.3527      0.1318         -0.5040      0.1557
Worked in SPO_SOC sector*           -0.3702      0.1282         -0.3040      0.1202
Worked in FIN_INS sector*           -0.3345      0.1560         -0.3543      0.1331
Table 6. Comparison of Different Parameters

Parameter         Estimate
OLS               0.2929
IV*               0.5609
ATE               0.4336
TT                0.5149
TUT               0.3630
Bias              -0.1407
Selection Bias    -0.2220
Sorting Gain      0.0813

*Using propensity score as instrument
Bias = OLS - ATE
Selection Bias = OLS - TT
Sorting Gain = TT - ATE
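The last three rows of Table 6 follow by arithmetic from the OLS, ATE and TT estimates. The short check below reproduces them from the reported values; the final line, which divides the ATE by four years of college, is only a rough annualization offered as an assumption about how a per-year figure of about 11% could be obtained.

```python
# Point estimates reported in Table 6
ols, ate, tt, tut = 0.2929, 0.4336, 0.5149, 0.3630

bias = ols - ate              # Bias = OLS - ATE
selection_bias = ols - tt     # Selection Bias = OLS - TT
sorting_gain = tt - ate       # Sorting Gain = TT - ATE
annual_return = ate / 4       # rough per-year return over four years (assumption)

print(f"Bias           {bias:.4f}")
print(f"Selection Bias {selection_bias:.4f}")
print(f"Sorting Gain   {sorting_gain:.4f}")
print(f"ATE per year   {annual_return:.4f}")
```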
Figure 1. Density of P(S=1). Urban areas of six provinces of China, from CUHIES 2000.
[Plot: f(P), from 0% to 15%, against P, from 0.0 to 1.0.]
Figure 2. Marginal Treatment Effect. Including parental income as proxy for ability
in wage equation; all ownership and sectoral dummies also included; bandwidth = 0.4.
[Plot: MTE, from 0.1 to 0.9, against Us, from 0.0 to 1.0.]
Figure 3. Weights of Treatment Parameters for Main Specifications.
[Plot: h(Us), from 0.0 to 3.0, against Us, from 0.0 to 1.0. Curves: TT, TUT, ATE.]
Figure 4. Marginal Treatment Effect. Excluding parental income in wage equation,
but all ownership and sectoral dummies included; bandwidth = 0.3.
[Plot: MTE, from 0.5 to 2.0, against Us, from 0.0 to 1.0.]
Figure 5. Marginal Treatment Effect. All specifications include parental income
in earnings equation; bandwidth = 0.4.
[Plot: MTE, from 0.1 to 0.9, against Us, from 0.0 to 1.0. Curves: A+B+C, A+C, B+C, C.]
A: with firms’ ownership dummies but not sectoral dummies
B: with sectoral dummies but not ownership dummies
C: no sectoral and ownership dummies
Figure 6. Marginal Treatment Effect. All specifications exclude parental income
in earnings equation; bandwidth = 0.3.
[Plot: MTE, from 0.5 to 2.0, against Us, from 0.0 to 1.0. Curves: A+B, A, B, no A & B.]
5. Concluding Remarks
Neglecting heterogeneity and selection bias leads to biased and inconsistent
estimates, such as those obtained from conventional OLS and IV estimators.
We demonstrate the importance of proxying for ability in the wage equation
to identify returns to education. Excluding the proxy leads to implausibly high
estimates of the return to schooling.
In 2000 the average return to four-year college attendance is 43% (on
average, 11% annually) for young people in the urban areas of the six
provinces.
The results imply that, after more than twenty years of economic reform with
market orientation, the average return to education in China has increased
markedly compared with that of the 1980s and early 1990s.