15 november 2005 1
Nonlinear Trend in Inequality of Educational Opportunity in the
Netherlands 1930-1989
Maarten L. Buis
Harry B.G. Ganzeboom
Outline
• Background and research problem
• Main results
• Model selection
– Continuous or discrete measures of parental education and father’s occupational status
– Importance of mother’s education relative to father’s education
– Difference in effect between sons and daughters
• Non-linearity in trend in effects: identify periods of negative, positive, and no trend.
Historical / biographical background
• Previous studies of trends in IEO in the Netherlands (NO CHANGE):
– Dronkers et al. on student cohorts (at age 12) 1965, 1973, 1981, 1989.
– Peschar et al. on synthetic cohorts in one single survey (NPAO 1982).
• De Graaf & Ganzeboom (1990, 1993) on synthetic cohorts in 10+ surveys: DECLINE.
The historical trend
• Confirmed:
– With linear regression
– With loglinear models (uniform association, scaled association (RC-2))
– With ordered logits
– With sequential logits (transition model)
[Shavit & Blossfeld 1993]: Dutch exceptionalism? Aggressive welfare state policies?
Our explanation
• Use a lot of data, pooled surveys
– Wider time window
– More statistical power
– Smoothing of survey peculiarities
– Concentrate on global distribution of education (not transitions)
• But note:
– The estimated trend is far from trivial or small (-1% per year)
The problem
• Trend towards less IEO is well documented.
• Even the most recent accounts (Ganzeboom & Luijkx, 2004a, 2005b) find a linear trend.
• However, there is reason to believe that the trend cannot continue.
• When do we begin to observe a deceleration of the trend?
Main results
• Model of IEO:
– The distinction between highest and lowest educated parent is more important than the distinction between father and mother, or same-sex parent.
– Effects of parental education and father’s occupational status are the same for sons and daughters.
• Non-linearity in trend
– Effect of father’s status decreases non-linearly over time, slowing down significantly around 1970.
– Significance was determined with a parametric bootstrap of lowess curves.
[Figure: IEO by year in which respondent is 12 (1930-1990), estimated with OLS, SOR, and RC2. For each technique, one panel marks periods with a significant trend and one panel marks periods with a significant change in trend.]
Data
• International Stratification and Mobility File (ISMF) – on the Netherlands
• 25 surveys held between 1958 and 2003 with information on cohorts 1930-1989.
• 80,000 respondents aged between 24 and 65, of which 40,000 have complete information on child's, father’s and mother’s education and father's occupation.
• The number of cases is unequally distributed over cohorts.
[Figure: number of observations per cohort. y-axis: number of observations (0 to 2,500); x-axis: year in which respondent is 12 (1940-1980).]
Model 1: linear regression
• Dependent variable is level of education and treated as continuous.
• Parental education is entered as father’s and/or mother’s education, highest and/or lowest educated parent, or education of same-sex parent.
• Father’s occupational status is measured in ISEI scores.
• Trends in effects are measured as third-order orthogonal polynomials or lowess curves.
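A minimal sketch of how third-order orthogonal polynomial trend terms can be built for the 60 annual cohorts. The QR-based construction and the function name `orthogonal_poly` are assumptions made for illustration; the slides do not say how the polynomials were computed.

```python
import numpy as np

def orthogonal_poly(x, degree=3):
    """Return orthonormal polynomial terms of order 1..degree in x (assumed construction)."""
    x = np.asarray(x, dtype=float)
    # Columns 1, x, x^2, ..., x^degree, with x centred for numerical stability
    raw = np.vander(x - x.mean(), degree + 1, increasing=True)
    # QR decomposition orthogonalises the columns in order of increasing degree
    q, _ = np.linalg.qr(raw)
    return q[:, 1:]  # drop the constant column

cohorts = np.arange(1930, 1990)   # year in which respondent is 12
trend = orthogonal_poly(cohorts)  # columns: linear, quadratic, cubic
print(trend.shape)
```

Because the columns are mutually uncorrelated, the linear, quadratic, and cubic terms can be tested separately when they are interacted with parental background.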
Two objections against linear education
• Regression coefficient is affected by both ‘real’ effects of parental characteristics on probabilities of making transitions and expansion of the educational distribution
– True, if education is studied as a process
– False, if education is studied as an outcome
• Education is discrete
– This does not have to be a problem if there is no concentration in the lowest or highest category
[Figure: highest achieved level of education (primary or less, lower secondary, higher secondary, lower tertiary, higher tertiary) as a proportion (0 to 1) by year in which respondent is 12 (1940-1980).]
Model 2: Stereotype Ordered Regression (SOR)
• SOR allows for an unordered dependent variable.
• SOR will estimate an optimal scaling of education and the effect of independent variables on this scaled education.
• The dependent variable is nominal, and SOR reveals latent ordering based on associations with independent variables.
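For reference, one common way to write a model of this kind is the stereotype logistic form sketched here. The symbols (J education categories, scale scores φ_j, coefficient vector β) are notation introduced for illustration and do not come from the slides, so the authors' exact parameterisation may differ.

```latex
\[
\ln \frac{\Pr(Y = j \mid x)}{\Pr(Y = 1 \mid x)} = \alpha_j + \phi_j \, \beta' x ,
\qquad j = 2, \dots, J, \qquad \phi_1 = 0, \quad \phi_J = 1 .
\]
```

Under this normalisation the φ_j play the role of the estimated scaling of education (0 for the lowest category, 1 for the highest), and β gives the effect of the independent variables on that scaled education.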
Model 3: Row Column Association Model II (RC2)
• Objection against use of ISEI:
– Effect of father’s occupation is better represented by a small number of discrete classes than on a continuous scale.
• The classes used are those of the EGP classification.
• RC2 is an extension of SOR, as it also estimates an optimal scaling for father’s EGP class (FEGP).
Father’s and mother’s education
• Conventional model: Only father matters
• Individual model: Both mother and father matter
• Joint model: Effects of father and mother are equal
• Dominance model: Highest educated parent matters
• Modified Dominance model: Highest and lowest educated parent matter
• Sex Role model: Same-sex parent matters
BICs
name                      no.  model                                         OLS     SOR     RC2
baseline model            0a   FIS*BYR^3*FEM + BYR_D*FEM
                          0b   FIS*BYR^3 + BYR_D*FEM
conventional model        1a   (0a) + FED*BYR^3*FEM                       -10375  -20352  -34112
                          1b   (0a) + FED*BYR^3                           -10414  -20391  -34157
                          1c   (0b) + FED*BYR^3                           -10433  -20418  -34194
individual model          2a   (0a) + FED*BYR^3*FEM + MED*BYR^3*FEM       -11003  -21043  -34566
                          2b   (0a) + FED*BYR^3 + MED*BYR^3               -11075  -21124  -34659
                          2c   (0b) + FED*BYR^3 + MED*BYR^3               -11093  -21148  -34698
joint model               3a   (0a) + (FED=MED)*BYR^3*FEM                 -11065  -21120  -34580
                          3b   (0a) + (FED=MED)*BYR^3                     -11104  -21159  -34624
                          3c   (0b) + (FED=MED)*BYR^3                     -11121  -21183  -34662
dominance model           4a   (0a) + HI_ED*BYR^3*FEM                     -10923  -20923  -34552
                          4b   (0a) + HI_ED*BYR^3                         -10961  -20963  -34596
                          4c   (0b) + HI_ED*BYR^3                         -10980  -20989  -34634
modified dominance model  5a   (0a) + HI_ED*BYR^3*FEM + LO_ED*BYR^3*FEM   -11071  -21094  -34705
                          5b   (0a) + HI_ED*BYR^3 + LO_ED*BYR^3           -11149  -21180  -34797
                          5c   (0b) + HI_ED*BYR^3 + LO_ED*BYR^3           -11166  -21204  -34835
sex-role model            6a   (0a) + SS_ED*BYR^3*FEM                     -10231  -20217  -33736
                          6b   (0a) + SS_ED*BYR^3                         -10261  -20252  -33742
                          6c   (0b) + SS_ED*BYR^3                         -10275  -20283  -33790
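The model comparison can be read off the table mechanically. The sketch below copies the BICs of the "c" variants (no differences between sons and daughters) from the table and picks the lowest value per technique, on the usual reading that a lower BIC indicates the preferred model. The dictionary layout and labels are illustrative only.

```python
# BIC values copied from the table above ("c" variants of models 1 to 6)
bic = {
    "1c conventional":       {"OLS": -10433, "SOR": -20418, "RC2": -34194},
    "2c individual":         {"OLS": -11093, "SOR": -21148, "RC2": -34698},
    "3c joint":              {"OLS": -11121, "SOR": -21183, "RC2": -34662},
    "4c dominance":          {"OLS": -10980, "SOR": -20989, "RC2": -34634},
    "5c modified dominance": {"OLS": -11166, "SOR": -21204, "RC2": -34835},
    "6c sex-role":           {"OLS": -10275, "SOR": -20283, "RC2": -33790},
}

# For each technique, select the model with the lowest (most negative) BIC
best = {
    technique: min(bic, key=lambda model: bic[model][technique])
    for technique in ("OLS", "SOR", "RC2")
}
print(best)
```

With these values the modified dominance model (5c) has the lowest BIC under all three techniques.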
Scaling of father’s status
EGP                                          mean(ISEI)     RC2
I     Service class, higher grade               66.5       1.000
II    Service class, lower grade                56.4       0.838
IIIa  Routine non-manual employees              48.6       0.651
IIIb  Personal service workers                  41.7       0.370
IVa   Small proprietors with employees          45.4       0.467
IVb   Small proprietors without employees       44.7       0.184
V     Manual foremen and technicians            41.3       0.216
VI    Skilled manual workers                    34.8      -0.148
VIIa  Semi- and unskilled manual workers        29.7      -0.354
VIIb  Agricultural workers                      17.5      -0.553
IVc   Farmers and smallholders                  29.1       0.000
Scaling of education
education mean(educyr) SOR RC2
primary or less 6.0 0.000 0.000
lower secondary 9.3 0.348 0.299
higher secondary 11.0 0.646 0.601
lower tertiary 14.9 0.813 0.793
higher tertiary 17.1 1.000 1.000
Linearity of trend, orthogonal polynomials
                      t-values
        trend         OLS      SOR      RC2
FSES    linear      -10.85    -6.57   -20.15
        quadratic     6.94     4.39     4.41
        cubic         0.39    -0.39    -0.54
HI_ED   linear      -12.29    -4.63     0.50
        quadratic     0.94     1.07     5.27
        cubic        -2.29    -2.08     2.61
LO_ED   linear       -6.97    -3.07    -0.87
        quadratic     2.10     1.59     1.96
        cubic         2.47     1.54     3.96
Identifying periods with significant trend
• A negative slope means a negative trend.
• A positive slope means a positive trend.
• A zero slope means no trend, or not enough information.
Identifying periods with significant change in trend
• An accelerating trend means that a negative trend becomes more negative, so a negative change in slope.
• A decelerating trend means that a negative trend becomes less negative, so a positive change in slope.
• A constant trend means no change in slope.
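The rules on these two slides can be sketched as a small classifier. It assumes that "significant" means a confidence interval for the slope (or for the change in slope) excludes zero. The function name and the example bounds are hypothetical.

```python
def classify_sign(lower, upper):
    """Label each cohort by whether its confidence interval lies below, above, or around zero."""
    labels = []
    for lo, hi in zip(lower, upper):
        if hi < 0:
            labels.append("negative")   # whole interval below zero
        elif lo > 0:
            labels.append("positive")   # whole interval above zero
        else:
            labels.append("none")       # zero is inside: no trend, or not enough information
    return labels

# Hypothetical interval bounds for the slope in three cohorts
labels = classify_sign(lower=[-2.0, -1.0, 0.5], upper=[-1.0, 1.0, 2.0])
print(labels)
```

Applied to the first derivative this separates periods of negative, positive, and no trend. Applied to the second derivative of a declining curve, "negative" corresponds to acceleration and "positive" to deceleration.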
Data
• The ISMF dataset is converted into three new datasets, containing estimates of the association between father’s occupational status and child’s education for 60 annual cohorts.
• One dataset for each technique.
• The precision of the estimates (the standard error) is used to weigh the cohorts (weights are the inverse of error variances).
Lowess curves: locally weighted scatterplot smooth
• We have a dataset consisting of estimates of IEO for each annual cohort, each of which used only information from that cohort.
• If we think that IEO develops like a smooth curve over time, then nearby estimates also contain relevant information.
• The lowess curve creates an improved estimate of the IEO for each cohort using information from nearby cohorts.
• It results in a smooth line by connecting the lowess estimates.
• Estimates of the trend and change in trend at each cohort can also be obtained from this curve.
Lowess curve in 1949
• Point on lowess curve in 1949
• Select closest 60% of the points.
• Give larger weights to nearby points.
• Adjust weights for precision of estimated IEO.
• WLS regression of IEO on time, time squared and time cubed on weighted points.
• Predicted value in 1949 is smoothed value of 1949.
• First derivative in 1949 is trend in 1949.
• Second derivative in 1949 is change in trend in 1949.
• Repeat for all cohorts and all techniques and connect the dots.
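The steps for a single cohort can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the tricube and precision weights are combined by multiplication, time is centred on the target year and rescaled by the window width, and the function name and toy data are hypothetical.

```python
import numpy as np

def local_cubic_fit(x, y, se, x0, span=0.6):
    """Smoothed value, trend, and change in trend at x0 from a local cubic WLS fit."""
    x, y, se = (np.asarray(a, dtype=float) for a in (x, y, se))
    k = int(np.ceil(span * len(x)))        # number of points inside the window
    dist = np.abs(x - x0)
    idx = np.argsort(dist)[:k]             # closest share `span` of the points
    h = dist[idx].max()                    # half-width of the window
    tricube = (1 - (dist[idx] / h) ** 3) ** 3   # larger weights for nearby points
    w = tricube / se[idx] ** 2                  # assumed: multiply by inverse error variance
    u = (x[idx] - x0) / h                       # time centred on x0, scaled to [-1, 1]
    X = np.vander(u, 4, increasing=True)        # columns 1, u, u^2, u^3
    sw = np.sqrt(w)                             # WLS via least squares on sqrt-weighted rows
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y[idx] * sw, rcond=None)
    # At u = 0: value = b0, first derivative = b1 / h, second derivative = 2 * b2 / h^2
    return beta[0], beta[1] / h, 2 * beta[2] / h ** 2

# Toy data (not from the slides): an exact quadratic in time with constant standard errors
x = np.arange(1930, 1990)
y = 5 - 0.05 * (x - 1949) + 0.001 * (x - 1949) ** 2
se = np.full(len(x), 0.3)
value, trend, change = local_cubic_fit(x, y, se, x0=1949)
```

Calling the function for every cohort and connecting the fitted values gives the smooth curve, and the second and third return values give the trend and the change in trend at each cohort.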
[Figure: construction of the lowess estimate of IEO for 1949; x-axis: year in which respondent is 12 (1930-1990). Panels: (a) Observations within the window, span = 0.6; (b) Tricube weights; (c) Tricube (+), precision (x), and joint (o) weights; (d) Weighted third-degree polynomial (size of circle proportional to weight), with the predicted IEO marked at 1949.]
Selecting spans
• The percentage of closest points used (the span) determines the smoothness of the lowess curve.
• Trade-off between smoothness and goodness of fit.
• Can be judged visually by comparing lowess curves with different spans.
• Numerical representations of this trade-off are Generalized Cross Validation, and Akaike Information Criterion.
• Lower values mean a better trade-off.
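A rough sketch of how Generalized Cross Validation can be computed for a given span. It uses one common definition, GCV = n · RSS / (n − tr(H))², where H is the matrix that maps observed to smoothed values. For brevity it uses tricube weights only and ignores the precision weights, which is a simplification; the data are simulated and all names are hypothetical.

```python
import numpy as np

def smoother_matrix(x, span):
    """Matrix H with fitted = H @ y for a local cubic fit with tricube weights."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    k = int(np.ceil(span * n))
    H = np.zeros((n, n))
    for i in range(n):
        dist = np.abs(x - x[i])
        idx = np.argsort(dist)[:k]               # closest share `span` of the points
        h = dist[idx].max()
        w = (1 - (dist[idx] / h) ** 3) ** 3      # tricube weights
        X = np.vander((x[idx] - x[i]) / h, 4, increasing=True)
        XtW = X.T * w
        # First row of (X'WX)^-1 X'W: how each y contributes to the fitted value at x[i]
        H[i, idx] = np.linalg.solve(XtW @ X, XtW)[0]
    return H

def gcv(x, y, span):
    """Generalized Cross Validation score for one span (lower is better)."""
    H = smoother_matrix(x, span)
    resid = y - H @ y
    n = len(y)
    return n * np.sum(resid ** 2) / (n - np.trace(H)) ** 2

# Simulated IEO estimates (not from the slides)
rng = np.random.default_rng(0)
x = np.arange(1930, 1990)
y = 6 - 0.04 * (x - 1930) + rng.normal(0, 0.3, size=len(x))
scores = {span: gcv(x, y, span) for span in (0.5, 0.6, 0.7)}
```

A small span lowers the residual sum of squares but raises tr(H), the effective number of parameters, so the criterion penalises curves that follow the individual estimates too closely.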
[Figure: lowess curves of IEO by year in which respondent is 12 (1930-1990), for OLS, SOR, and RC2, each with span = .5, span = .6, and span = .7.]
[Figure: three pairs of panels plotting (a) Generalized Cross Validation (gcv) and (b) Akaike Information Criterion (aic) against span (0.2 to 1.0).]
Bootstrap confidence intervals
• Confidence interval gives the range of results that could plausibly occur just through sampling error.
• Make many ‘datasets’ that could have occurred just by sampling error.
• Fit lowess curves through each ‘dataset’.
• The area containing 90% of the curves is the 90% confidence interval.
• The estimates of IEO are regression, SOR, RC2 coefficients with standard errors.
• The standard error gives information about what values of IEO could plausibly occur in a ‘new’ dataset.
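The procedure can be sketched as a parametric bootstrap. The sketch assumes that new estimates are drawn from a normal distribution centred on each cohort's estimate with its standard error as the standard deviation, and that the envelope is taken pointwise from percentiles; neither detail is spelled out on the slide. To keep the block short, a weighted global cubic stands in for the lowess smoother, and the data are hypothetical.

```python
import numpy as np

def bootstrap_envelope(est, se, smoother, n_boot=1000, level=0.90, seed=0):
    """Pointwise confidence envelope from a parametric bootstrap of smoothed curves."""
    rng = np.random.default_rng(seed)
    est, se = np.asarray(est, dtype=float), np.asarray(se, dtype=float)
    # Each row is one 'dataset' that could have occurred through sampling error alone
    draws = rng.normal(loc=est, scale=se, size=(n_boot, len(est)))
    curves = np.array([smoother(d) for d in draws])   # one smooth curve per dataset
    tail = 100 * (1 - level) / 2
    lower, upper = np.percentile(curves, [tail, 100 - tail], axis=0)
    return lower, upper

# Hypothetical IEO estimates and standard errors for 60 cohorts
x = np.arange(1930, 1990)
t = (x - 1960) / 30.0
est = 6 - 2 * t + 0.5 * t ** 2
se = np.full(len(x), 0.3)

def cubic_smoother(values):
    # Stand-in for the lowess fit: global cubic, weighted by 1/se as np.polyfit expects
    return np.polyval(np.polyfit(t, values, 3, w=1 / se), t)

lower, upper = bootstrap_envelope(est, se, cubic_smoother, n_boot=500)
```

The same function gives envelopes for the trend and the change in trend if the smoother returns the first or second derivative of the fitted curve instead of its level.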
[Figure: first 25 bootstrap samples; x-axis: year in which respondent is 12 (1930-1990). Panels: (a) Lowess smooths (IEO); (b) Trend in IEO (change in IEO per 100 years); (c) Change in trend in IEO (change in trend per 100 years).]
[Figure: one row of panels each for OLS, SOR, and RC2; x-axis: year in which respondent is 12 (1930-1990). Panels: (a) Lowess smooth and 90% confidence envelope (IEO); (c) Trend in IEO and 90% confidence envelope (change in IEO per 100 years); (d) Change in trend in IEO and 90% confidence envelope (change in trend per 100 years).]
[Figure: IEO by year in which respondent is 12 (1930-1990), estimated with OLS, SOR, and RC2. For each technique, one panel marks periods with a significant trend and one panel marks periods with a significant change in trend.]