Extending Group-Based Trajectory Modeling to Account for Subject Attrition
(Sociological Methods & Research, 2011)
Amelia Haviland
Bobby Jones
Daniel S. Nagin
Trajectories of Physical Aggression (Child Development, 1999)
[Figure: actual and predicted physical aggression (scale 0 to 4.5) by age (6, 10 to 15) for four groups: Low, Moderate desister, High desister, Chronic. Group percentages shown on the plot: 4%, 28%, 52%, 16%.]
Trajectories Based on 1979 Dutch Conviction Cohort
[Figure: mean number of convictions per year (scale 0 to 3) by age (12 to 72) for four groups: SO (70.9%), LR-D (21.7%), MR-D (5.7%), HR-P (1.6%).]
The Likelihood Function
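$L = \prod_{i=1}^{N} P(Y_i)$
$P^j(Y_i)$ = probability of $Y_i$ given membership in group j
$\pi_j$ = probability of membership in group j
$\pi_j(x_i) = \dfrac{e^{x_i \theta_j}}{\sum_{j'} e^{x_i \theta_{j'}}}$, where $\theta_j$ is the vector of logit coefficients for group j
$P(Y_i \mid x_i) = \sum_j \pi_j(x_i)\, P^j(Y_i)$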
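The mixture likelihood above can be sketched numerically as follows. This is a minimal illustration, not the authors' estimation code: the function names, the toy covariates, and the per-group likelihood values are all hypothetical, and the group-specific likelihoods $P^j(Y_i)$ are simply taken as given inputs.

```python
import numpy as np

def membership_probs(x, theta):
    """Multinomial-logit group membership probabilities pi_j(x_i).

    x     : (n, k) array of subject covariates
    theta : (J, k) array of logit coefficients, one row per group
    """
    scores = x @ theta.T                          # (n, J) linear predictors
    scores -= scores.max(axis=1, keepdims=True)   # guard against overflow
    expd = np.exp(scores)
    return expd / expd.sum(axis=1, keepdims=True)

def mixture_log_likelihood(group_lik, pi):
    """log L = sum_i log( sum_j pi_j(x_i) * P^j(Y_i) ).

    group_lik : (n, J) array holding P^j(Y_i)
    pi        : (n, J) array holding pi_j(x_i)
    """
    return float(np.log((pi * group_lik).sum(axis=1)).sum())

# Hypothetical toy inputs: two subjects, two groups, intercept plus one covariate.
x = np.array([[1.0, 0.0], [1.0, 1.0]])
theta = np.array([[0.0, 0.0], [0.5, -1.0]])       # group 1 is the reference group
pi = membership_probs(x, theta)
group_lik = np.array([[0.20, 0.10], [0.05, 0.30]])
log_lik = mixture_log_likelihood(group_lik, pi)
```

Fixing the first row of `theta` at zero is one common way to identify a multinomial logit; any other normalization would serve equally well in this sketch.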
Missing Data
• Two types
  – Intermittent missing assessments (y1, y2, ., y4, ., y6)
  – Subject attrition, where assessments cease starting in period τ (y1, y2, y3, ., ., .)
• In the basic model, both types are assumed to be missing at random
• The model extension is designed to account for potentially non-random subject attrition
• No change in the model for intermittent missing assessments
Some Notation
$\tau_i$ = period t in which subject i drops out
T = number of assessment periods
$\theta_t^j$ = probability of dropout in group j in period t
Probability of Dropout in Period t
Period             Probability of dropout
1                  0
2, 3, 4, ..., T    period-specific dropout probability
No dropout         1 – (sum of all the above probabilities)
The Dropout Extended Likelihood for Group j
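$$P(Y_i \mid age_i, j; \beta^j, \theta^j) = \Big[\prod_{t=1}^{\tau_i - 1} p(y_{it} \mid w_{it} = 0, age_{it}, j; \beta^j)\,(1 - \theta_t^j)\Big]\,\theta_{\tau_i}^j \qquad (3)$$
Here $w_{it} = 0$ indicates that subject i has not dropped out by period t, and $\theta_t^j$ is the probability of dropout in group j in period t. For a subject who never drops out, the product runs to T and the final factor $\theta_{\tau_i}^j$ is omitted.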
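As a rough illustration of the dropout-extended likelihood for a single group, the sketch below multiplies the outcome density by the probability of remaining in the study in each observed period, then by the dropout probability in the period of exit. It rests on assumptions that are not in the slides: a normal outcome density with a common standard deviation, a group mean supplied directly per period rather than as a polynomial in age, and hypothetical function and argument names.

```python
import math

def normal_pdf(y, mean, sigma):
    """Density of a normal outcome (an assumed outcome model for this sketch)."""
    z = (y - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def group_likelihood_with_dropout(y, means, sigma, theta, tau=None):
    """Likelihood of one subject's record given membership in group j.

    y     : outcomes observed before dropout (periods 1..tau-1, or 1..T)
    means : group-j expected outcome in each of the T periods
    theta : group-j dropout probability in each period (theta[0] is 0)
    tau   : 1-based period of dropout, or None if the subject never drops out
    """
    T = len(means)
    last_observed = T if tau is None else tau - 1
    lik = 1.0
    for t in range(last_observed):
        # density of the outcome times the probability of not dropping out
        lik *= normal_pdf(y[t], means[t], sigma) * (1.0 - theta[t])
    if tau is not None:
        lik *= theta[tau - 1]   # probability of dropping out in period tau
    return lik

# Hypothetical example: three periods, dropout probability .2 from period 2 on.
means = [10.0, 9.5, 9.0]
theta = [0.0, 0.2, 0.2]
completer = group_likelihood_with_dropout([10.0, 9.5, 9.0], means, 1.0, theta)
dropout_at_3 = group_likelihood_with_dropout([10.0, 9.5], means, 1.0, theta, tau=3)
```

In a full model these group-specific values would be weighted by the group membership probabilities and summed over groups, as in the basic likelihood.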
Specification of $\theta_t^j$ (the probability of dropout in group j in period t)
• Binary logit model
• Predictor variables
  – Fixed characteristics of i, $x_i$
  – Prior values of the outcome, $y_{it-1}, y_{it-2}, \ldots$
• If trajectory group membership were known, dropout within trajectory group j would be “exogenous”, or “ignorable conditional on observed covariates”
• Because trajectory group is latent, dropout is “non-ignorable” at the population level
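A binary logit specification of the dropout probability might look like the sketch below. The coefficient names and the example values are hypothetical; the slides state only that the predictors are fixed characteristics of the subject and prior values of the outcome.

```python
import math

def dropout_probability(intercept, coef_x, x, coef_y_lags, y_lags):
    """Binary-logit dropout probability for one subject in one period.

    coef_x / x           : coefficients and values of fixed characteristics
    coef_y_lags / y_lags : coefficients and values of prior outcomes
    In the full model each trajectory group j has its own coefficient set.
    """
    eta = intercept
    eta += sum(c * v for c, v in zip(coef_x, x))
    eta += sum(c * v for c, v in zip(coef_y_lags, y_lags))
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical example: one fixed covariate and one lagged outcome.
p_low = dropout_probability(-1.0, [0.3], [1.0], [0.4], [0.0])
p_high = dropout_probability(-1.0, [0.3], [1.0], [0.4], [3.0])
```

With a positive coefficient on the lagged outcome, a higher prior value raises the dropout probability, which is the kind of outcome-dependent attrition the extension is meant to capture.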
Simulation Objectives
• Examine the effects of differential attrition rates across groups that are not initially well separated
• Examine the effects of using model estimates to make population-level projections
Simulation 1: Two Group Model With Different Drop Probabilities and Small Initial Separation
[Figure: four panels, each plotting E(y) against Time with the value 10 marked; annotations: "No dropout", "Slope = .5".]
Simulation Results: Group 1 and Group 2 Initially not Well Separated

                                                    Model Without Dropout       Model With Dropout
Group 1       Expected      Prob. of Group 1      Group 1      Percent      Group 1      Percent   Dropout
Per-Period    Group 1       Dropout on or         Prob. Est.   Bias         Prob. Est.   Bias      Prob.
Dropout       Assessment    before Period 6       (π1)                      (π1)                   Est.
Probability   Periods
0             6.0           0                     .200           0.0        .200          0.0      .000
.05           5.3           .226                  .171         -14.5        .199         -0.5      .051
.10           4.7           .410                  .146         -27.0        .199         -0.5      .099
.15           4.2           .556                  .122         -39.0        .200          0.0      .150
.20           3.7           .672                  .100         -50.0        .199         -0.5      .199
.25           3.3           .762                  .079         -60.5        .200          0.0      .250
.30           2.9           .832                  .061         -69.5        .199         -0.5      .301
.35           2.6           .884                  .046         -77.0        .199         -0.5      .350
.40           2.4           .922                  .034         -83.0        .199         -0.5      .398
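The second, third, and percent-bias columns of the table follow from simple arithmetic, sketched below. The sketch assumes six assessment periods, a constant per-period dropout probability, and no dropout possible in period 1; these are inferences from the table (6.0 expected periods at zero dropout), not statements from the slides. The function names are illustrative.

```python
def expected_periods(p, T=6):
    """Expected number of assessments when dropout can first occur in period 2.

    A subject is observed in period t with probability (1 - p) ** (t - 1).
    """
    return sum((1.0 - p) ** k for k in range(T))

def prob_dropout_by(p, T=6):
    """Probability of having dropped out on or before period T."""
    return 1.0 - (1.0 - p) ** (T - 1)

def percent_bias(estimate, truth):
    """Percent bias of an estimate relative to the true value."""
    return 100.0 * (estimate - truth) / truth

# Row with per-period dropout probability .05 and true group 1 share .200
row = (round(expected_periods(0.05), 1),
       round(prob_dropout_by(0.05), 3),
       round(percent_bias(0.171, 0.200), 1))
```

Under these assumptions the computed values agree with the tabled ones to within rounding in the last digit.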
An Important Distinction from Zhang and Rubin (2003)
• Dropout due to death
  – Subject exits the population of interest (the living)
  – Data said to be “truncated”
• Dropout due to termination of study participation
  – Subject exits the sample but remains in the population
  – Data said to be “censored”
Simulation 2: Projecting to the Population Level from Model Parameter Estimates
Simulation 2 Continued
[Figure: two group trajectories with the values 12.5 and 10 marked; one is labeled "No Dropout" and the other "Dropout = .2 per period".]
Table 2
Simulation 2: Predicting Population Averages With and Without Adjustments for Dropout

                      No Dropout Model     Model With Dropout
Period   Average Y    Predicted Y          $\tilde{\pi}_t^1$    Predicted Y
0        10.5         10.5                 .200                 10.5
1        10.0         10.1                 .166                 10.0
2        9.48         9.67                 .137                 9.48
3        8.95         9.26                 .112                 8.95
4        8.41         8.84                 .092                 8.42
5        7.87         8.43                 .075                 7.87
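The declining group 1 share in Table 2 can be reproduced approximately by reweighting the initial group probabilities by each group's probability of remaining in the study. The sketch below assumes group 1 (share .2) drops out at .2 per period while group 2 does not, and that the groups start at 12.5 and 10 respectively; this pairing is an inference from the figure labels and the period 0 average, and the function names are illustrative.

```python
def remaining_shares(pi, dropout, periods):
    """Group shares among subjects still under observation in periods 0..periods.

    pi      : initial group membership probabilities
    dropout : constant per-period dropout probability for each group
    """
    shares = []
    weights = list(pi)
    for _ in range(periods + 1):
        total = sum(weights)
        shares.append([w / total for w in weights])
        # survivors of each group carry over to the next period
        weights = [w * (1.0 - d) for w, d in zip(weights, dropout)]
    return shares

def population_mean(shares_t, group_means_t):
    """Average outcome among remaining subjects: share-weighted group means."""
    return sum(s * m for s, m in zip(shares_t, group_means_t))

shares = remaining_shares([0.2, 0.8], [0.2, 0.0], periods=5)
group1_share = [round(s[0], 3) for s in shares]
mean_period0 = population_mean(shares[0], [12.5, 10.0])
```

The tabled values are model estimates from simulated data, so they differ from these analytic shares in the third decimal place. A model without dropout keeps the shares fixed at their initial values, which is why its predicted averages drift away from the observed ones.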
Chinese Longitudinal Healthy Longevity Survey (CLHLS)
• Randomly selected counties and cities in 22 provinces
• 4 waves, 1998 to 2005
• 80 to 105 years old at baseline
• 8,805 individuals at baseline
• 68.9% had died by 2005
• Analyzed the cohort aged 90 to 93 in 1998
Activities of Daily Living
• On your own and without assistance, can you:
  – Bathe
  – Dress
  – Toilet
  – Get up from bed or chair
  – Eat
• Disability is measured by the count of items where assistance is required
Table 3
Summary Statistics for the Age 90 to 93 CLHLS Cohort at Baseline

Variable                    N       Average
ADL 1998 Count              1078    .84
ADL 2000 Count              580     1.05
ADL 2002 Count              335     1.16
ADL 2005 Count              120     1.26
Female                      1078    .52
Life Threatening Disease    1078    .11
ADL Trajectory Model Without Dropout
[Figure: ADL count (scale 0 to 5) by wave (1 to 4) for three groups: Low (27.1%), Medium (60.0%), High (12.9%).]
ADL Trajectory Model With Drop Out
[Figure: ADL count (scale 0 to 4.5) by wave (1 to 4) for three groups, with dropout probability (DP): Low (20.1%, DP = .34), Medium (58.6%, DP = .47), High (21.3%, DP = .64).]
Table 4
Predicted Population Average ADL Counts from the Models With and Without Dropout

                    Model Without Drop Out     Model With Drop Out
Period   Average    Predicted    % Error      $\tilde{\pi}_t^1$   $\tilde{\pi}_t^2$   $\tilde{\pi}_t^3$   Predicted    % Error
         ADL Count  ADL Count                                                                             ADL Count
1998     .84        .91          8.3          .201                .586                .213                .93          10.7
2000     1.05       1.19         13.3         .254                .600                .146                1.07         1.9
2002     1.16       1.42         22.4         .309                .593                .097                1.17         .9
2005     1.26       1.89         50.0         .366                .571                .063                1.58         25.4
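The % Error columns in Table 4 are the relative differences between predicted and observed average ADL counts. The short sketch below recomputes them from the tabled values; the dictionary and function names are illustrative only.

```python
def percent_error(predicted, actual):
    """Relative error of a prediction, in percent of the observed value."""
    return 100.0 * (predicted - actual) / actual

observed = {1998: 0.84, 2000: 1.05, 2002: 1.16, 2005: 1.26}
pred_without_dropout = {1998: 0.91, 2000: 1.19, 2002: 1.42, 2005: 1.89}
pred_with_dropout = {1998: 0.93, 2000: 1.07, 2002: 1.17, 2005: 1.58}

errors_without = {yr: round(percent_error(pred_without_dropout[yr], observed[yr]), 1)
                  for yr in observed}
errors_with = {yr: round(percent_error(pred_with_dropout[yr], observed[yr]), 1)
               for yr in observed}
```

The error of the model without dropout grows at every wave, while the model with dropout stays within about 2% in 2000 and 2002 before rising in 2005, when only 120 subjects remain.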
Adding Covariates to Model to Test the Morbidity Compression v. Expansion Hypothesis
• Will increases in longevity compress or expand disability level in the population of the elderly?
• “Had a life threatening disease” at baseline or prior is positively correlated with both ADL counts at baseline and subsequent mortality rate.
• Question: Would a reduction in the incidence of life threatening diseases at baseline increase or decrease the population level ADL count?
Testing Strategy and Results
• Specify group membership probability ($\pi_j$) and dropout probability ($\theta_t^j$) as functions of the life threatening disease variable
• Both are also functions of sex; dropout probability alone is also a function of the ADL count in the prior period
• Life threatening disease is significantly related to group membership in the expected way but has no relationship with dropout due to death
• Thus, unambiguous support for compression
Projecting the reduction in population average ADL count from a 25% reduction in the incidence of life threatening disease at baseline

Projected % Reduction in Population Average ADL Count
Year             1998    2000    2002    2005
Reduction (%)    3.0     2.2     1.5     .7
Table 6
Own and Cross Elasticity Estimates (%) for Life Threatening Disease Incidences

                                                 Cross Elasticity
Group                           Own Elasticity   Group 2    Group 3    Total Elasticity
1. Low ($\pi_1$ = .201)         NA               -.033      -.059      -.092
2. Medium ($\pi_2$ = .586)      .069             NA         -.173      -.104
3. High ($\pi_3$ = .213)        .232             -.036      NA         .196
Conclusions and Future Research
• Large differences in dropout rates across trajectory groups matter
• Future research
  – Investigate effects of endogenous selection
  – Compare results in data sets with more modest dropout rates
  – Further research on morbidity expansion and contraction