Post on 30-Dec-2015
S V Subramanian
Professor of Population Health and Geography
Harvard University
http://www.hsph.harvard.edu/faculty/sv-subramanian/
What’s wrong with single-level epidemiology?
Public Lecture Series
Duke-National University of Singapore Medical School
December 13, 2013
• The problem with single level models
• Revisiting a classic
• Importance of multilevel perspectives
• Concluding remarks
Outline
Single level perspectives
Ancel Keys
Seven Country Study
Thomas Dawber
Framingham Heart Study
• Framingham
– 1948 and on-going
– One town in Massachusetts, xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
– Inferential Unit: Individuals
– Population variability: a nuisance
– Unit of analysis: Individuals, xxxxxxxxxxxx
– http://www.framinghamheartstudy.org/about/history.html
• Seven Country
– 1958 – 1970
– 7 countries: Yugoslavia, Italy, Greece, Finland, Netherlands, USA, Japan
– Inferential Unit: Populations
– Population variability: of substantive interest
– Unit of analysis: Sites/Countries
– http://www.sph.umn.edu/epi/history/sevencountries.asp
Risk factors for Cardiovascular Disease
American Sociological Review, Vol. 15, No. 3 (Jun., 1950), pp. 351-357
Individual   Illiteracy   Black
Illiteracy   1
Black        0.203        1
State         %Illiteracy   %Black
%Illiteracy   1
%Black        0.773         1
Individual     Illiteracy   Foreign-born
Illiteracy     1
Foreign-born   0.118        1

State           %Illiteracy   %Foreign-born
%Illiteracy     1
%Foreign-born   -0.526        1
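The gap between the individual and state correlations above can be reproduced with a small worked example. The sketch below uses three hypothetical states with made-up counts (not Robinson's census data) and an assumed helper named `pearson`. Within each state the two groups have the same illiteracy rate, yet the state percentages line up perfectly.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical states (illustrative counts only):
# (group A illiterate, group A total, group B illiterate, group B total)
STATES = [
    (1, 10, 9, 90),
    (6, 30, 14, 70),
    (15, 50, 15, 50),
]


def individual_records(states):
    """Expand the counts into one (group membership, illiteracy) pair per person."""
    in_group_a, illiterate = [], []
    for a_ill, a_tot, b_ill, b_tot in states:
        for member, n_ill, n_tot in ((1, a_ill, a_tot), (0, b_ill, b_tot)):
            in_group_a += [member] * n_tot
            illiterate += [1] * n_ill + [0] * (n_tot - n_ill)
    return in_group_a, illiterate


def state_percentages(states):
    """Aggregate each state to (% in group A, % illiterate)."""
    pct_a, pct_ill = [], []
    for a_ill, a_tot, b_ill, b_tot in states:
        total = a_tot + b_tot
        pct_a.append(100 * a_tot / total)
        pct_ill.append(100 * (a_ill + b_ill) / total)
    return pct_a, pct_ill


individual_r = pearson(*individual_records(STATES))
ecological_r = pearson(*state_percentages(STATES))
print(f"individual r = {individual_r:.3f}, ecological r = {ecological_r:.3f}")
```

In this constructed case the individual-level correlation is small while the state-level one is essentially 1, so neither coefficient alone describes the data.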
• On the ecological relationship
– The purpose of this paper will have been accomplished if it prevents the future computation of meaningless correlations.
• On the individual relationship
– The purpose of this paper will have been accomplished if it stimulates the study of similar problems with use of meaningful correlation between the properties of individuals.
• Conclusion not supported by analysis
• Questionable Assumption
– “In each study that uses ecological correlations, the obvious purpose is to discover something about the behavior of individuals.”
• Methodological individualism
– Technically accurate but substantively misleading in asserting the primacy of “individual” relationships for understanding the association between race and illiteracy in the US
• Conflates ecology with aggregate
– the whole need not simply be the sum of its parts
Critique
• 3rd most cited paper in ASR (>1000 citations)
• Dire warnings of “ecological fallacy” - a cornerstone of ALL epidemiologic textbooks.
• Motivated collection of individual survey data
Robinson’s Reach
Multilevel, or more precisely, two-level perspectives
• Data: 1930 US Census
• Structure: 98 241 245 individuals in 49 States
• Outcome: Illiterate or not
• Predictors:
– Individual: Race/Nativity (Native White, Foreign-born White, and Black)
– State: Percentage of Black Population; and Jim Crow or not
• Model: Two-level Binomial Logistic Model using Markov Chain Monte Carlo (MCMC) estimation with Metropolis-Hastings Algorithm
Revisiting Robinson’s Example
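A two-level binomial logistic model of this kind is commonly written in the form below. This is a generic random-intercept sketch, not necessarily the exact specification fitted for the lecture. The indicator names Black and ForeignBorn (with Native White as the reference group) and the symbols are assumed for illustration, and the state-level predictors are left out.

```latex
\begin{aligned}
y_{ij} &\sim \text{Bernoulli}(\pi_{ij}) \\
\operatorname{logit}(\pi_{ij}) &= \beta_0 + \beta_1\,\text{Black}_{ij} + \beta_2\,\text{ForeignBorn}_{ij} + u_{0j} \\
u_{0j} &\sim N(0, \sigma^2_{u0})
\end{aligned}
```

Here i indexes individuals and j indexes states. Under these assumptions, exp(β_1) is the odds ratio for Black relative to Native White, and σ²_u0 is the between-state variation on the logit scale.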
Race-illiteracy association appears to be sensitive to States’ circumstances
               Ignoring states        Accounting for states
               OR (95% CI)            OR (95% CI)
Native White   1                      1
Black          11.66 (11.63, 11.69)   5.86 (5.84, 5.88)
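One way to read the two odds ratios is on the log-odds scale, where the model is linear. The short sketch below only does arithmetic on the two reported values. The variable names are made up for illustration, and the "attenuation" measure is one possible summary, not one used in the lecture.

```python
from math import log

# Odds ratios for Black vs. Native White as reported in the table.
or_ignoring_states = 11.66
or_accounting_states = 5.86

# Convert to log-odds, the scale on which logistic coefficients are estimated.
log_or_ignoring = log(or_ignoring_states)
log_or_accounting = log(or_accounting_states)

# Proportional reduction in the log-odds once states are accounted for.
attenuation = 1 - log_or_accounting / log_or_ignoring
print(f"log OR: {log_or_ignoring:.3f} -> {log_or_accounting:.3f} "
      f"({attenuation:.0%} smaller)")
```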
[Figure: between-state variation (in logits), y-axis from 0.00 to 2.00, for two models: Null and After accounting for race]
State “effects” not sensitive to racial composition
[Figure: illiteracy by state, shown separately for Native Whites and Blacks]
States labeled for Native Whites: Washington, Oregon, California, District of Columbia, Nevada, New Mexico, Louisiana, Kentucky, North Carolina, Tennessee
States labeled for Blacks: Oregon, New York, South Dakota, Minnesota, Nevada, South Carolina, Alabama, Louisiana, Mississippi, North Carolina
Substantial heterogeneity in the illiteracy-race association
“Everywhere is nowhere”
[Figure legend: Black; Foreign-born White; Native White]
Race and Racial Context
• Presence of Jim Crow Law
– i.e., federal and state laws that permitted racial discrimination under the concept of “separate but equal”
• Reality: “separate and unequal”
– Per capita educational expenditure in public schools
• Georgia: White Child = $11.30; “Colored” Child = $0.00
• Alabama: White Child = $26.47; “Colored” Child = $3.81
State not simply “aggregates” of individuals
States with and without Jim Crow Laws in Education
Association between state Jim Crow laws, race, and illiteracy
[Figure: predicted probability of being illiterate (y-axis from 0 to 0.14) for Native Whites and Blacks, in states without Jim Crow laws (NJC) and with Jim Crow laws (JC)]
The problem with Robinson’s analysis was thinking at only ONE level, leading to an impoverished interpretation of the data.
• Critical re-thinking of any single-level analyses: ecological or individual
• No longer need to choose A level of analysis: an inductive approach to ascertaining the level at which the action lies
Where do we go from here?
1. Conceptualizing Micro and Macro Contexts
– e.g., neighborhoods (micro contexts) often are embedded in larger settings such as counties or states or regions (macro contexts)
2. Considering geographical and non-geographical contexts simultaneously (e.g., neighborhoods and schools)
Current multilevel applications suffer from the problem of missing or omitted levels.
1. Importance of considering multiple (nested) geographies
Life expectancy patterns in the US
• Response: Life expectancy
• Predictor: Time (i.e., “technological progress”)
• Structure
– Repeated cross-section
– Three-level: years (1961-2000) at level-1 (n=122850) nested within 3150 counties at level-2 nested within 51 states at level-3.
• Model: Three-level random coefficient model
Data
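Variance         Ignore State         Include State
                 Estimate   SE        Estimate   SE
Between state                         1.705      0.350
Between county   2.984      0.076     1.422      0.036
Between time     0.512      0.002     0.512      0.002

Two-level model (state ignored), with i indexing years and j counties:

\[
\begin{aligned}
y_{ij} &= \beta_{0j} + \beta_1 x_{ij} + e_{0ij} \\
\beta_{0j} &= \beta_0 + u_{0j} \\
u_{0j} &\sim N(0, \sigma^2_{u0}) \\
e_{0ij} &\sim N(0, \sigma^2_{e0})
\end{aligned}
\]

Three-level model (state included), with i indexing years, j counties, and k states:

\[
\begin{aligned}
y_{ijk} &= \beta_{0jk} + \beta_1 x_{ijk} + e_{0ijk} \\
\beta_{0jk} &= \beta_{0k} + u_{0jk} \\
\beta_{0k} &= \beta_0 + v_{0k} \\
v_{0k} &\sim N(0, \sigma^2_{v0}) \\
u_{0jk} &\sim N(0, \sigma^2_{u0}) \\
e_{0ijk} &\sim N(0, \sigma^2_{e0})
\end{aligned}
\]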
Level at which action lies: Two Level
Level at which action lies: Three Level
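The variance estimates reported for the two models can be turned into shares of total variation, which makes the shift between levels easier to see. The sketch below is plain arithmetic on the reported estimates. It assumes the rows are additive variance components, and the function name `variance_shares` is made up for illustration.

```python
def variance_shares(components):
    """Return each variance component as a fraction of their sum."""
    total = sum(components.values())
    return {name: value / total for name, value in components.items()}


# Estimates as reported for the two models.
ignore_state = {"county": 2.984, "time": 0.512}
include_state = {"state": 1.705, "county": 1.422, "time": 0.512}

shares_ignore = variance_shares(ignore_state)
shares_include = variance_shares(include_state)

for label, shares in (("Ignore State", shares_ignore),
                      ("Include State", shares_include)):
    summary = ", ".join(f"{name} {share:.1%}" for name, share in shares.items())
    print(f"{label}: {summary}")
```

When the state level is omitted, the variation that belongs to states is absorbed into the county component, and the between-time component is unchanged.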
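\[
\begin{aligned}
y_{ijk} &= \beta_{0jk} + \beta_{1jk} x_{ijk} + e_{0ijk} \\
\beta_{0jk} &= \beta_{0k} + u_{0jk}, \qquad \beta_{0k} = \beta_0 + v_{0k} \\
\beta_{1jk} &= \beta_{1k} + u_{1jk}, \qquad \beta_{1k} = \beta_1 + v_{1k} \\
\begin{bmatrix} v_{0k} \\ v_{1k} \end{bmatrix} &\sim N(0, \Omega_v) : \quad
\Omega_v = \begin{bmatrix} \sigma^2_{v0} & \\ \sigma_{v01} & \sigma^2_{v1} \end{bmatrix} \\
\begin{bmatrix} u_{0jk} \\ u_{1jk} \end{bmatrix} &\sim N(0, \Omega_u) : \quad
\Omega_u = \begin{bmatrix} \sigma^2_{u0} & \\ \sigma_{u01} & \sigma^2_{u1} \end{bmatrix} \\
e_{0ijk} &\sim N(0, \sigma^2_{e0})
\end{aligned}
\]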
State Variation trumps County Variation
Macro contexts are as important as micro contexts, if not more so, for life expectancy variations in the US.
2. Importance of non-geographical/spatial contexts (e.g., schools, workplaces, hospitals, social networks)
• In prior multilevel research, we have not given sufficient attention to all potentially relevant contexts on health
– Most multilevel research to date has focused on one setting: neighborhoods
– However, other settings may be relevant for health
– Biased inferences
One context at a time
• 180 days/year
• 6 or more hours/day
• 12-13 years
Empirical illustration: schools versus neighborhoods
[Figure: Level-2 units shown as a grid of Neighborhoods 1-4 (rows) by Schools 1-4 (columns), with students (☺) in the cells; students from one neighborhood attend several schools, and each school draws students from several neighborhoods]
The idea of cross-classification
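Whether two groupings are hierarchical or cross-classified can be checked directly from the data. The sketch below is a minimal illustration on made-up student records. The function names `is_nested` and `classify` and the school and neighborhood labels are hypothetical, not part of the study.

```python
from collections import defaultdict


def is_nested(pairs):
    """True if every lower unit (first item) occurs with exactly one upper unit."""
    uppers = defaultdict(set)
    for lower, upper in pairs:
        uppers[lower].add(upper)
    return all(len(found) == 1 for found in uppers.values())


def classify(students):
    """students: iterable of (school, neighborhood) pairs, one per student."""
    students = list(students)
    schools_in_neighborhoods = is_nested(students)
    neighborhoods_in_schools = is_nested((n, s) for s, n in students)
    if schools_in_neighborhoods or neighborhoods_in_schools:
        return "hierarchical"
    return "cross-classified"


# Hypothetical records: School 1 draws from two neighborhoods,
# and each neighborhood sends students to two schools.
crossed = [
    ("School 1", "Neighborhood 1"),
    ("School 2", "Neighborhood 1"),
    ("School 1", "Neighborhood 2"),
    ("School 3", "Neighborhood 2"),
]

# Hypothetical records: every neighborhood feeds exactly one school.
nested = [
    ("School 1", "Neighborhood 1"),
    ("School 1", "Neighborhood 2"),
    ("School 2", "Neighborhood 3"),
]

print(classify(crossed), classify(nested))
```

In the cross-classified case neither grouping can be treated as a level sitting inside the other, so both need their own variance term in the model.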
• In-home Survey: 20,745 students nested in 132 schools
– Mean 125.4 students/school
• 2142 neighborhoods (census tracts)
– Mean 7.3 students/tract
Study population
3 outcomes:
Tobacco use
Depression
Weight status
Variance Estimate (Standard Error)
               Hierarchical School   Hierarchical Neighborhood   Cross-Classified
Neighborhood   -                     4.7 (0.5)                   0.35 (0.3)
School         5.6 (0.6)             -                           5.54 (0.8)
Number of days cigarettes smoked (last 30 days)
Variance Estimate (Standard Error)
               Hierarchical School   Hierarchical Neighborhood   Cross-Classified
Neighborhood   -                     0.25 (0.03)                 0.06 (0.02)
School         0.35 (0.04)           -                           0.36 (0.06)
Log odds of smoking
Variance Estimate (Standard Error)
               Hierarchical School   Hierarchical Neighborhood   Cross-Classified
Neighborhood   -                     0.73 (0.10)                 0.22 (0.08)
School         0.98 (0.11)           -                           0.87 (0.14)
Self reported Body Mass Index
Variance Estimate (Standard Error)
               Hierarchical School   Hierarchical Neighborhood   Cross-Classified
Neighborhood   -                     1.84 (0.52)                 0.45 (0.34)
School         2.05 (0.56)           -                           1.69 (0.59)
CES-D Scale (Depression)
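The cross-classified estimates for the four outcomes can be summarized as the neighborhood share of the contextual (neighborhood plus school) variance. The sketch below uses only the reported cross-classified estimates. Individual-level variances are not reported in the slides, so these are shares of contextual variance only, and the names in the code are made up for illustration.

```python
# Cross-classified variance estimates: (neighborhood, school).
CROSS_CLASSIFIED = {
    "Days smoked (last 30 days)": (0.35, 5.54),
    "Log odds of smoking": (0.06, 0.36),
    "Self-reported BMI": (0.22, 0.87),
    "CES-D (depression)": (0.45, 1.69),
}


def neighborhood_share(neighborhood_var, school_var):
    """Neighborhood variance as a fraction of neighborhood + school variance."""
    return neighborhood_var / (neighborhood_var + school_var)


shares = {
    outcome: neighborhood_share(*estimates)
    for outcome, estimates in CROSS_CLASSIFIED.items()
}

for outcome, share in shares.items():
    print(f"{outcome}: neighborhood share of contextual variance = {share:.1%}")
```

For every outcome the neighborhood share is well under half once schools are modeled at the same time, although each hierarchical neighborhood-only model attributes substantial variance to neighborhoods.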
• Need critical re-thinking of ALL single-level epidemiological analyses
– Dire warnings of “ecological fallacy”, but the science is full of studies that risk “individualistic or atomistic fallacy”
• Need to carefully consider the units of analysis in epidemiological investigations and the problem of “omitted” levels
• Need to consider multiple contexts simultaneously (places/schools/worksites)
Concluding remarks