The (im)possibility of separating age, period and cohort effects
Andrew [email protected]
School of Geographical Sciences
NCRM Research Methods Festival, Oxford, July 2014
Summary
• Age, period and cohort (APC) effects
• The APC identification problem
• The HAPC model
• Why it doesn’t work
• Example: mental wellbeing
APC effects
• A: I can’t seem to shake off this tired feeling. Guess I’m just getting old. [Age effect]
• B: Do you think it’s stress? Business is down this year, and you’ve let your fatigue build up. [Period effect]
• A: Maybe. What about you?
• B: Actually, I’m exhausted too! My body feels really heavy.
• A: You’re kidding. You’re still young. I could work all day long when I was your age.
• B: Oh, really?
• A: Yeah, young people these days are quick to whine. We were not like that. [Cohort effect]
(From Suzuki 2012:452)
APC identification problem
• Age = Period – Cohort
• “the term [confounded] is not used in the traditional design sense of experimentally confounded but in the stronger sense of logically or mathematically confounded” (Goldstein, 1979, 19)
Impossible to isolate effects
• Cannot hold age and cohort constant and vary period (without time travel – Suzuki 2012)
– Goldstein 1979: “there is no direct evidence for three distinct types of causal factors”
– Glenn 2005: “One of the most bizarre instances in the history of science of repeated attempts to do something that is logically impossible”
• If you have age in your model, you also have period and cohort, and vice versa (whether you like it or not)
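A minimal sketch of why age cannot be separated from period and cohort. The toy data (birth years, survey years) are hypothetical and chosen only for illustration. Because Age = Period – Cohort holds for every row, the design matrix loses one dimension, and coefficients can be shifted along the direction (age +1, period –1, cohort +1) without changing any fitted value.

```python
import numpy as np

# Hypothetical toy data: every combination of birth cohort and survey year.
cohorts = np.arange(1940, 1980, 5)   # birth years (assumed values)
periods = np.arange(1991, 2009)      # survey years (assumed values)
cohort, period = [a.ravel().astype(float) for a in np.meshgrid(cohorts, periods)]
age = period - cohort                # the APC identity

# Design matrix: intercept plus linear age, period and cohort terms.
X = np.column_stack([np.ones_like(age), age, period, cohort])

rank = np.linalg.matrix_rank(X)
print(X.shape[1], rank)              # 4 columns, but rank 3

# Moving along this direction changes the coefficients, not the predictions.
null_direction = np.array([0.0, 1.0, -1.0, 1.0])
print(np.allclose(X @ null_direction, 0.0))
```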
Consider these data generating processes (DGPs)
• All will produce exactly the same outcome variable
• Given that dataset, there is no logical way of telling which DGP created it
• Putting all three of age, period and cohort into a regression model produces exact collinearity, so the model will not run.
• Grouping one of age, period or cohort breaks this collinearity, but produces arbitrary results (that depend on the chosen grouping)
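The points above can be sketched numerically. The three DGPs and the toy data below are illustrative assumptions, not necessarily those shown on the original slide. The first part checks that the three DGPs give identical outcomes. The second part generates data from a pure linear cohort trend and fits a model with age, period and 5-year cohort bands: the model now runs, but the grouping constraint pushes the cohort trend onto age and period.

```python
import numpy as np

# Hypothetical toy data: every combination of birth year and survey year.
cohorts = np.arange(1940, 1980)
periods = np.arange(1991, 2009)
cohort, period = [a.ravel().astype(float) for a in np.meshgrid(cohorts, periods)]
age = period - cohort

# Three illustrative DGPs (assumed for this sketch).
y1 = 1.0 * age + 1.0 * period
y2 = 2.0 * age + 1.0 * cohort
y3 = 2.0 * period - 1.0 * cohort
print(np.allclose(y1, y2), np.allclose(y2, y3))   # True True

# Data driven only by a linear cohort trend.
y_cohort = 1.0 * cohort

# Group cohorts into 5-year bands (first band is the reference category).
band = cohort // 5
bands = np.unique(band)
dummies = (band[:, None] == bands[None, 1:]).astype(float)

# Age and period enter linearly, cohort enters only through the bands.
X = np.column_stack([np.ones_like(age), age, period, dummies])
coef, *_ = np.linalg.lstsq(X, y_cohort, rcond=None)
b_age, b_period = coef[1], coef[2]

# The cohort trend is reported as an age effect near -1 and a period effect near +1.
print(round(b_age, 4), round(b_period, 4))
```

In this sketch the grouped model fits the data perfectly while reporting no cohort effect at all, so a perfect fit says nothing about whether the attribution is right.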
Yang and Land’s HAPC model
Multilevel model for individuals nested in cohort groups and periods
[Diagram: Cohort and Period at the higher level, Individual (Age) at the lower level]
Health = Intercept + Age linear trend + Age quadratic trend + Cohort residual + Period residual + individual residual
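One way to write the verbal equation above in symbols. The notation is an assumption of this sketch and is not taken from the slides: i indexes individuals, j cohort groups and k periods, and the three residuals are assumed Normal with their own variances, as is usual in multilevel models.

```latex
\begin{aligned}
\text{Health}_{i(jk)} &= \beta_0 + \beta_1\,\text{Age}_{i(jk)} + \beta_2\,\text{Age}_{i(jk)}^2 + u_j + v_k + e_{i(jk)},\\
u_j &\sim N(0, \sigma_u^2), \quad v_k \sim N(0, \sigma_v^2), \quad e_{i(jk)} \sim N(0, \sigma_e^2)
\end{aligned}
```

Here u_j is the cohort residual, v_k the period residual and e_{i(jk)} the individual residual. The fixed part contains no linear cohort or period trend.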
Yang and Land’s HAPC model
• Claimed that this breaks the collinearity by
– Including an age-squared term, and/or
– Treating age in a different way to periods/cohorts
• “the underidentification problem of the classical APC accounting model has been resolved by the specification of the quadratic function for the age effects” Yang and Land (2006:84)
• "An HAPC framework does not incur the identification problem because the three effects are not assumed to be linear and additive at the same level of analysis" Yang and Land (2013:191)
• "This contextual approach ...helps to deal with (actually completely avoids) the identification problem" Yang and Land (2013:71)
• Unfortunately this is not the case
– See Bell, A and Jones, K (2014) Another futile quest? A simulation study of Yang and Land’s Hierarchical age-period-cohort model. Demographic Research, 30, 11, 333-360. DOI: 10.4054/DemRes.2014.30.11
– Bell, A and Jones, K (2014) Don’t birth cohorts matter? A commentary and simulation exercise of Reither, Hauser and Yang’s age-period-cohort study of obesity. Social Science and Medicine, 101, 176-180
Obesity epidemic apparently the result of periods, not cohorts
Bell, A and Jones, K (2014) Don’t birth cohorts matter? A commentary and simulation exercise of Reither, Hauser and Yang’s age-period-cohort study of obesity. Social Science and Medicine, 101, 176-180
Why the model is enticing
• Intuitive
– Aging occurs within individuals
– Cohorts are external: we belong to them
– Periods are external: we pass from one into another
• Multilevel model, so has all the extensions that go with that
– Other covariates at all levels
– Additional levels (eg individuals, neighbourhoods)
– Random coefficients
Our view
• HAPC framework is valuable, but…
• Decision as to which of APC most likely caused the data should be made based on intuition and theory
• Assumptions constraining one of the parameters (often to zero) should be made explicitly (so they can be challenged)
• E.g. to constrain the continuous period trend to zero:
Health = Intercept + Age linear trend + Age quadratic trend + Cohort linear trend + Cohort quadratic trend + Cohort residual + Period residual + individual residual
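A simplified sketch of what the zero-period-trend assumption does, using linear trends only (no quadratic terms, random effects or noise). The data and coefficient values are hypothetical. When the assumption is true, age and cohort trends are recovered. When it is false, the ignored period trend is absorbed by both, because Period = Age + Cohort.

```python
import numpy as np

# Hypothetical toy data: every combination of birth year and survey year.
cohorts = np.arange(1900, 1990)
periods = np.arange(1991, 2009)
cohort, period = [a.ravel().astype(float) for a in np.meshgrid(cohorts, periods)]
age = period - cohort

def fit_age_cohort(y):
    """OLS of y on intercept, age and cohort: the period trend is assumed zero."""
    X = np.column_stack([np.ones_like(age), age, cohort])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1], coef[2]

# Case 1: the assumption holds (no period trend in the DGP).
y_true = 0.5 * age + 0.2 * cohort
age_ok, cohort_ok = fit_age_cohort(y_true)

# Case 2: the assumption fails (the DGP has a period trend of 0.1).
y_false = 0.5 * age + 0.2 * cohort + 0.1 * period
age_bad, cohort_bad = fit_age_cohort(y_false)

print(round(age_ok, 4), round(cohort_ok, 4))     # close to 0.5 and 0.2
print(round(age_bad, 4), round(cohort_bad, 4))   # close to 0.6 and 0.3
```

Both fits reproduce their data exactly, so nothing in the data flags the second set of estimates as wrong. This is why the constraint has to be defended by theory and stated openly.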
Example – mental wellbeing
• Previous consensus: life course of mental wellbeing is U-shaped, worsening to the ‘midlife crisis’ and then improving into old age
• I argue that linear period trends are unlikely, and so constrain continuous period trends to zero
• Mental wellbeing measured by GHQ score, using data from the BHPS 1991-2008.
• Additionally add higher levels (individuals, local authority districts, households), random coefficients, covariates, interactions (for more details see Bell, 2014)
Example – mental wellbeing
[Figure: predicted GHQ score (y-axis) against age (x-axis), with lines labelled by birth cohort, 1910 to 1980]
U-shape? But cohort is not yet controlled for in this graph
[Figure: predicted GHQ score (y-axis) against age (x-axis), shown separately for males and females]
No U-shape found
• Other findings of a U-shape result from older cohorts having better mental wellbeing (i.e. cohorts were not appropriately controlled)
• Find mental wellbeing worsens throughout the life course.
Example – mental wellbeing
• Cohort effects combine quadratic trend with stochastic variation
• Those brought up during recessions have generally better mental health throughout their life course?
[Figure: predicted GHQ score (y-axis) against birth year (x-axis), shown separately for males and females]
Conclusions
• Be careful. If you are interested in any of APC, be aware of the APC identification problem.
• If you have age in your model, you also have period and cohort (and vice versa)
• There is no mechanical solution to the problem
• Assumptions about APC need to be made; they should be based on theory and stated explicitly
For more information
• Bell, A and Jones, K (2014) Another futile quest? A simulation study of Yang and Land’s Hierarchical age-period-cohort model. Demographic Research, 30, 11, 333-360. DOI: 10.4054/DemRes.2014.30.11
• Bell, A and Jones, K (2014) Don’t birth cohorts matter? A commentary and simulation exercise of Reither, Hauser and Yang’s age-period-cohort study of obesity. Social Science and Medicine, 101, 176-180
• Bell, A and Jones, K (2013) Bayesian informative priors with Yang and Land’s Hierarchical age-period-cohort model. Quality and Quantity. DOI: 10.1007/s11135-013-9985-3
– For constraining parameters to something other than zero
• Bell, A (2014) Life course and cohort trajectories of mental wellbeing in the UK, 1991-2008 – a multilevel age-period-cohort analysis. Under review, available on researchgate.net
• Bell, A and Jones, K (forthcoming) Age, period and cohort processes in longitudinal and life course analysis: a multilevel perspective. In A life course perspective on health trajectories and transitions, edited by Claudine Burton-Jeangros, Stéphane Cullati, Amanda Sacker and David Blane. Springer.
Periods or Cohorts?
• For obesity: changing diets/exercise regimes/technologies etc
– Period effect: changes in culture affect everyone the same
– Cohort effect: changes affect the young in their formative years
– Could look at age effect: which is the most likely? (I think the one associated with cohorts)
– A cohort effect could cause a period effect? (eg parents/overall culture influenced by their children)
Periods or Cohorts?
• For mental wellbeing: changes in pace of life/working patterns/level of stigma/narcissism
– Period effect: changes in culture affect everyone the same
– Cohort effect: changes affect the young in their formative years
– Eg has everyone become more narcissistic? Or is increasing narcissism in society the result of narcissism amongst newer cohorts?
• I think the latter is more plausible / has a clearer causal mechanism