
S V Subramanian

Professor of Population Health and Geography

Harvard University

http://www.hsph.harvard.edu/faculty/sv-subramanian/

What’s wrong with single-level epidemiology?

Public Lecture Series

Duke-National University of Singapore Medical School

December 13, 2013


• The problem with single level models

• Revisiting a classic

• Importance of multilevel perspectives

• Concluding remarks

Outline


Single level perspectives


Ancel Keys: Seven Country Study

Thomas Dawber: Framingham Heart Study

• Framingham
– 1948 and on-going
– One town in Massachusetts, xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
– Inferential Unit: Individuals
– Population variability: a nuisance
– Unit of analysis: Individuals, xxxxxxxxxxxx
– http://www.framinghamheartstudy.org/about/history.html

• Seven Country
– 1958 – 1970
– 7 countries: Yugoslavia, Italy, Greece, Finland, Netherlands, USA, Japan
– Inferential Unit: Populations
– Population variability: of substantive interest
– Unit of analysis: Sites/Countries
– http://www.sph.umn.edu/epi/history/sevencountries.asp


Risk factors for Cardiovascular Disease


American Sociological Review, Vol. 15, No. 3 (Jun., 1950), pp. 351-357


Individual    Illiteracy   Black
Illiteracy    1
Black         0.203        1


State         %Illiteracy  %Black
%Illiteracy   1
%Black        0.773        1


Individual      Illiteracy   Foreign-born
Illiteracy      1
Foreign-born    0.118        1

State           %Illiteracy  %Foreign-born
%Illiteracy     1
%Foreign-born   -0.526       1
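The sign reversal between the individual and state correlations can be reproduced with a small constructed example. The sketch below uses three hypothetical states with made-up counts (these are not the 1930 census figures), and the names `states`, `phi`, and `pearson` are illustrative. Within every hypothetical state the foreign-born are more often illiterate, yet states with a larger foreign-born share have lower overall illiteracy, so the two correlations take opposite signs.

```python
from math import sqrt

# Hypothetical counts, NOT the 1930 census:
# state -> (foreign-born illiterate, foreign-born literate,
#           native illiterate, native literate)
states = {
    "A": (30, 20, 190, 760),
    "B": (80, 120, 80, 720),
    "C": (80, 320, 12, 588),
}


def phi(a, b, c, d):
    """Pearson correlation between two binary variables from a 2x2 table."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Individual level: pool all persons into a single 2x2 table.
pooled = [sum(column) for column in zip(*states.values())]
individual_r = phi(*pooled)

# Ecological level: one (% foreign-born, % illiterate) pair per state.
pct_foreign = [100 * (a + b) / (a + b + c + d) for a, b, c, d in states.values()]
pct_illiterate = [100 * (a + c) / (a + b + c + d) for a, b, c, d in states.values()]
ecological_r = pearson(pct_foreign, pct_illiterate)

print(f"individual r = {individual_r:.3f}")
print(f"ecological r = {ecological_r:.3f}")
```

Neither number is "the" correct correlation; each answers a question posed at a different level, which is the point the rest of the talk develops.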


• On the ecological relationship

– The purpose of this paper will have been accomplished if it prevents the future computation of meaningless correlations.

• On the individual relationship

– The purpose of this paper will have been accomplished if it stimulates the study of similar problems with use of meaningful correlation between the properties of individuals.


• Conclusion not supported by analysis

• Questionable Assumption
– “In each study that uses ecological correlations, the obvious purpose is to discover something about the behavior of individuals”.

• Methodological individualism
– Technically accurate but substantively misleading in asserting the primacy of “individual” relationships for understanding the association between race and illiteracy in the US

• Conflates ecology with aggregate
– the whole need not simply be the sum of its parts

Critique


• 3rd most cited paper in ASR (>1000 citations)

• Dire warnings of “ecological fallacy” - a cornerstone of ALL epidemiologic textbooks.

• Motivated collection of individual survey data

Robinson’s Reach


Multilevel, or more precisely, Two-level perspectives


• Data: 1930 US Census

• Structure: 98,241,245 individuals in 49 States

• Outcome: Illiterate or not

• Predictors:
– Individual: Race/Nativity (Native White, Foreign-born White, and Black)
– State: Percentage of Black Population; and Jim Crow or not

• Model: Two-level Binomial Logistic Model using Markov Chain Monte Carlo (MCMC) estimation with Metropolis-Hastings Algorithm

Revisiting Robinson’s Example
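One way to write the model described on this slide is as a two-level random-intercept logistic model. The slide names the outcome and predictors but does not show the equation, so the notation below is an assumption: the variable names are hypothetical labels, Native White is taken as the reference category, and only the intercept is allowed to vary across states.

```latex
\begin{aligned}
y_{ij} &\sim \text{Bernoulli}(\pi_{ij}) \\
\operatorname{logit}(\pi_{ij}) &= \beta_{0j}
  + \beta_{1}\,\text{Black}_{ij}
  + \beta_{2}\,\text{ForeignBornWhite}_{ij} \\
\beta_{0j} &= \beta_{0}
  + \alpha_{1}\,\text{PctBlack}_{j}
  + \alpha_{2}\,\text{JimCrow}_{j}
  + u_{0j} \\
u_{0j} &\sim N(0, \sigma^2_{u0})
\end{aligned}
```

Here i indexes individuals and j indexes states; the variance \sigma^2_{u0} is the between-state variation on the logit scale.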


Race-illiteracy association appears to be sensitive to States’ circumstances

                Ignoring states          Accounting for states
                OR (95% CI)              OR (95% CI)
Native White    1                        1
Black           11.66 (11.63, 11.69)     5.86 (5.84, 5.88)


[Figure: between-state variation (in logits), y-axis from 0.00 to 2.00, shown for two models: Null, and After accounting for race]

State “effects” not sensitive to racial composition


[Figure: illiteracy by state, in two panels. Native Whites panel, states labelled: Washington, Oregon, California, District of Columbia, Nevada, New Mexico, Louisiana, Kentucky, North Carolina, Tennessee. Blacks panel, states labelled: Oregon, New York, South Dakota, Minnesota, Nevada, South Carolina, Alabama, Louisiana, Mississippi, North Carolina. Y-axis: Illiteracy]

Substantial heterogeneity in the illiteracy-race association


“Everywhere is nowhere”


[Figure: series shown for Black, Foreign-born White, and Native White]

Race and Racial Context


• Presence of Jim Crow Law
– i.e., federal and state laws that permitted racial discrimination under the concept of “separate but equal”

• Reality: “separate and unequal”
– Per capita educational expenditure in public schools
  • Georgia: White Child = $11.30; “Colored” Child = $0.00
  • Alabama: White Child = $26.47; “Colored” Child = $3.81

State not simply “aggregates” of individuals


States with and without Jim Crow Laws in Education


Association between state Jim Crow laws, race, and illiteracy

[Figure: predicted probability of being illiterate (y-axis, 0 to 0.14) for Native Whites and Blacks, with separate series labelled NJC and JC]

The problem with Robinson’s analysis was thinking at only ONE level, leading to an impoverished interpretation of the data.


• Critical re-thinking of any single-level analyses: ecological or individual

• No longer need to choose a single level of analysis: an inductive approach can ascertain at which level the action lies


Where do we go from here?


1. Conceptualizing Micro and Macro Contexts
– e.g., neighborhoods (micro contexts) often are embedded in larger settings such as counties or states or regions (macro contexts)

2. Considering geographical and non-geographical contexts simultaneously (e.g., neighborhoods and schools)

Current multilevel applications suffer from the problem of missing or omitted levels.


1. Importance of considering multiple (nested) geographies

Life expectancy patterns in the US


• Response: Life expectancy

• Predictor: Time (i.e., “technological progress”)

• Structure
– Repeated cross-section
– Three-level: years (1961-2000) at level-1 (n=122850) nested within 3150 counties at level-2 nested within 51 states at level-3.

• Model: Three-level random coefficient model

Data


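                  Ignore State         Include State
Variance          Estimate   SE        Estimate   SE
Between state                          1.705      0.350
Between county    2.984      0.076     1.422      0.036
Between time      0.512      0.002     0.512      0.002

Ignore State (two-level model; i indexes years, j counties):

\[
\begin{aligned}
y_{ij} &= \beta_{0j} + \beta_{1} x_{ij} + e_{0ij} \\
\beta_{0j} &= \beta_{0} + u_{0j} \\
u_{0j} &\sim N(0, \sigma^2_{u0}) \\
e_{0ij} &\sim N(0, \sigma^2_{e0})
\end{aligned}
\]

Include State (three-level model; i indexes years, j counties, k states):

\[
\begin{aligned}
y_{ijk} &= \beta_{0jk} + \beta_{1} x_{ijk} + e_{0ijk} \\
\beta_{0jk} &= \beta_{0k} + u_{0jk} \\
\beta_{0k} &= \beta_{0} + v_{0k} \\
v_{0k} &\sim N(0, \sigma^2_{v0}) \\
u_{0jk} &\sim N(0, \sigma^2_{u0}) \\
e_{0ijk} &\sim N(0, \sigma^2_{e0})
\end{aligned}
\]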

Level at which action lies: Two Level


Level at which action lies: Three Level
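The comparison between the two columns of the variance table can be made concrete by expressing each estimate as a share of the summed variance in its model. The sketch below only rearranges the estimates reported on the slide; the function name `shares` and the dictionary layout are illustrative, and treating the simple sum of components as the total is an assumption that ignores the standard errors.

```python
# Variance estimates as reported in the table on the slide (life expectancy).
two_level = {"county": 2.984, "time": 0.512}                    # Ignore State
three_level = {"state": 1.705, "county": 1.422, "time": 0.512}  # Include State


def shares(components):
    """Return each component's proportion of the summed variance."""
    total = sum(components.values())
    return {level: value / total for level, value in components.items()}


two_level_shares = shares(two_level)
three_level_shares = shares(three_level)

for name, result in (("ignore state", two_level_shares),
                     ("include state", three_level_shares)):
    print(name, {level: round(share, 3) for level, share in result.items()})
```

When the state level is omitted, the variation that belongs to states is absorbed into the between-county estimate, which is why the county share looks so much larger in the two-level model.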


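\[
\begin{aligned}
y_{ijk} &= \beta_{0jk} + \beta_{1jk} x_{ijk} + e_{0ijk} \\
\beta_{0jk} &= \beta_{0k} + u_{0jk} \\
\beta_{1jk} &= \beta_{1k} + u_{1jk} \\
\beta_{0k} &= \beta_{0} + v_{0k} \\
\beta_{1k} &= \beta_{1} + v_{1k} \\
\begin{bmatrix} v_{0k} \\ v_{1k} \end{bmatrix} &\sim N(0, \Omega_v) : \quad
\Omega_v = \begin{bmatrix} \sigma^2_{v0} & \\ \sigma_{v01} & \sigma^2_{v1} \end{bmatrix} \\
\begin{bmatrix} u_{0jk} \\ u_{1jk} \end{bmatrix} &\sim N(0, \Omega_u) : \quad
\Omega_u = \begin{bmatrix} \sigma^2_{u0} & \\ \sigma_{u01} & \sigma^2_{u1} \end{bmatrix} \\
e_{0ijk} &\sim N(0, \sigma^2_{e0})
\end{aligned}
\]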

State Variation trumps County Variation


Macro contexts are as important as micro contexts, if not more so, for life expectancy variations in the US.


2. Importance of non-geographical/spatial contexts (e.g., schools, workplaces, hospitals, social networks)

• In prior multilevel research, we have not given sufficient attention to all the contexts potentially relevant to health

– Most multilevel research to date has focused on one setting: neighborhoods

– However, other settings may be relevant for health

– Biased inferences

One context at a time

• 180 days/year

• 6 or more hours/day

• 12-13 years

Empirical illustration: schools versus neighborhoods


[Figure: grid of students (☺) cross-classified by neighborhood (rows, Neighborhood 1 to 4) and school (level-2 columns, School 1 to 4); students from a given neighborhood are spread across several schools]

The idea of cross classified
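Whether two contexts are nested or cross-classified can be checked directly from the data. The sketch below is a minimal illustration with hypothetical records (the student, neighborhood, and school identifiers are invented, and `is_nested` is an illustrative helper, not part of any library): a structure is cross-classified when neither context is nested within the other.

```python
from collections import defaultdict

# Hypothetical (student, neighborhood, school) records, for illustration only.
students = [
    ("s1", "N1", "S1"), ("s2", "N1", "S2"),
    ("s3", "N2", "S1"), ("s4", "N2", "S3"),
    ("s5", "N3", "S2"), ("s6", "N3", "S3"),
]

NEIGHBORHOOD, SCHOOL = 1, 2  # positions of each context within a record


def is_nested(records, lower, upper):
    """True if every `lower` unit maps to exactly one `upper` unit."""
    parents = defaultdict(set)
    for record in records:
        parents[record[lower]].add(record[upper])
    return all(len(units) == 1 for units in parents.values())


neighborhoods_in_schools = is_nested(students, NEIGHBORHOOD, SCHOOL)
schools_in_neighborhoods = is_nested(students, SCHOOL, NEIGHBORHOOD)
cross_classified = not (neighborhoods_in_schools or schools_in_neighborhoods)

print("cross-classified:", cross_classified)
```

If either check returned True, an ordinary hierarchical (nested) model would describe the structure; when both are False, students belong to combinations of neighborhood and school, and a cross-classified model is the natural choice.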

• In-home Survey: 20,745 students nested in 132 schools

• Mean 125.4 students/school

• 2142 neighborhoods (census tracts)

– Mean 7.3 students/tract

Study population

3 outcomes:

Tobacco use

Depression

Weight status

Variance Estimate (Standard Error)

                Hierarchical School   Hierarchical Neighborhood   Cross-Classified
Neighborhood    -                     4.7 (0.5)                   0.35 (0.3)
School          5.6 (0.6)             -                           5.54 (0.8)

Number of days cigarettes smoked (last 30 days)

Variance Estimate (Standard Error)

                Hierarchical School   Hierarchical Neighborhood   Cross-Classified
Neighborhood    -                     0.25 (0.03)                 0.06 (0.02)
School          0.35 (0.04)           -                           0.36 (0.06)

Log odds of smoking
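Variances on the logit scale are hard to read directly. Two common summaries, neither of which appears in the talk, are the median odds ratio (MOR) and the latent-variable variance partition coefficient, which takes the individual-level variance of a logistic model to be π²/3. The sketch below applies both to the cross-classified estimates for smoking; applying the MOR separately to each classification is an assumption made for illustration, and the function names are hypothetical.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

# Cross-classified variance estimates (logit scale) for smoking, from the slide.
var_neighborhood = 0.06
var_school = 0.36


def median_odds_ratio(variance):
    """Median odds ratio implied by a random-intercept variance (logit scale)."""
    z75 = NormalDist().inv_cdf(0.75)  # 75th percentile of the standard normal
    return exp(sqrt(2 * variance) * z75)


def latent_vpc(variance, other_context_variance=0.0):
    """Share of total variance, with individual-level variance set to pi^2 / 3."""
    return variance / (variance + other_context_variance + pi ** 2 / 3)


mor_school = median_odds_ratio(var_school)
mor_neighborhood = median_odds_ratio(var_neighborhood)
vpc_school = latent_vpc(var_school, var_neighborhood)
vpc_neighborhood = latent_vpc(var_neighborhood, var_school)

print(f"MOR school = {mor_school:.2f}, MOR neighborhood = {mor_neighborhood:.2f}")
print(f"VPC school = {vpc_school:.3f}, VPC neighborhood = {vpc_neighborhood:.3f}")
```

Both summaries point the same way as the raw variances: once the two contexts are modelled together, schools account for more of the variation in smoking than neighborhoods do.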

Variance Estimate (Standard Error)

                Hierarchical School   Hierarchical Neighborhood   Cross-Classified
Neighborhood    -                     0.73 (0.10)                 0.22 (0.08)
School          0.98 (0.11)           -                           0.87 (0.14)

Self reported Body Mass Index

Variance Estimate (Standard Error)

                Hierarchical School   Hierarchical Neighborhood   Cross-Classified
Neighborhood    -                     1.84 (0.52)                 0.45 (0.34)
School          2.05 (0.56)           -                           1.69 (0.59)

CES-D Scale (Depression)
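The pattern across the four outcomes is easier to see as percentages. The sketch below takes the variance estimates reported on the slides and computes how much each context's variance changes between its single-context hierarchical model and the cross-classified model, plus the neighborhood share of the contextual variance. The names `estimates`, `percent_change`, and `summary` are illustrative, and standard errors are ignored, so this is a rough reading rather than a formal comparison.

```python
# Variance estimates copied from the slides:
# outcome -> (neighborhood, hierarchical; neighborhood, cross-classified;
#             school, hierarchical; school, cross-classified)
estimates = {
    "days smoked": (4.7, 0.35, 5.6, 5.54),
    "log odds of smoking": (0.25, 0.06, 0.35, 0.36),
    "body mass index": (0.73, 0.22, 0.98, 0.87),
    "CES-D": (1.84, 0.45, 2.05, 1.69),
}


def percent_change(before, after):
    """Percentage change from `before` to `after` (negative means a reduction)."""
    return 100 * (after - before) / before


summary = {}
for outcome, (nb_h, nb_cc, sch_h, sch_cc) in estimates.items():
    summary[outcome] = {
        "neighborhood_change_pct": percent_change(nb_h, nb_cc),
        "school_change_pct": percent_change(sch_h, sch_cc),
        # Share of the contextual (neighborhood + school) variance only;
        # individual-level variance is not reported on the slides.
        "neighborhood_share_of_context": nb_cc / (nb_cc + sch_cc),
    }

for outcome, row in summary.items():
    print(outcome, {key: round(value, 2) for key, value in row.items()})
```

For every outcome the neighborhood variance shrinks far more than the school variance once both contexts are modelled together, which is the "omitted context" problem the slides illustrate.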


• Need critical re-thinking of ALL single-level epidemiological analyses
– Dire warnings of “ecological fallacy”, but the science is full of studies that risk “individualistic or atomistic fallacy”

• Need to carefully consider the units of analysis in epidemiological investigations and the problem of “omitted” levels

• Need to consider multiple contexts simultaneously (places/schools/worksites)

Concluding remarks