Advanced Epi
August 15-19th, 2011
SACEMA

Matthew Fox
Boston University
Center for Global Health and Development
Department of Epidemiology
Health Economics and Epidemiology Research

Introductions
Who are you?
Where do you work/study?
What do you study?

Welcome
About me
Week-long short course on epi methods
  2 sessions/day, each about 3 hours (depending)
  Assumes intro/intermediate epi and practical experience with epi and stats
  Mix of lecture and discussion
Too much material: take good notes and go back to them
Finish mid-day on Friday
Course works if you read and participate

Course Overview
Review basic epidemiologic principles
  Reinterpret them in a new light
Think through problems/implications of what we learned in intro/intermediate epi
Develop causal framework(s) on which to hang our epidemiologic thinking
Learn/apply advanced epi methods

Modern Epidemiology III

Questions for Today
What is epidemiology, and what is its goal?
What are measures of association and measures of effect?
  What do these measures really mean?
  Which ones have causal meanings?
What is the odds ratio really about?
  Why does everyone use it?

The goal of epidemiologic research
Epidemiology is the study of:
  The distribution and determinants of disease in human populations and the application of that knowledge to the control of disease
But the goal is:
  To obtain a valid and precise (and generalizable) estimate of the effect of an exposure on a disease
  Validity is the opposite of bias; precision is the opposite of random error

Fundamentally concerned with measurement

Anyone remember Type I and Type II error?

What are they?

Basic Statistics

                     Truth about null
Our study null       Effect                  No effect
Effect               Correct                 Type I error (alpha)
No effect            Type II error (beta)    Correct

Type I: If we reject the null, what is the chance there is no effect?
Type II: If we fail to reject the null, what is the chance there is an effect?

How do we know a particular epidemiologic finding is true?
Find that the relative risk of exposure to vitamin # on cancer @ is 2.5, p=0.049
Assume we did the perfect study
  No bias (confounding, selection, information)
  80% power, alpha = 0.05
What is the chance there is really no effect of vitamins on cancer?
  i.e. the true relative risk is 1

Syphilis testing in the US

In the US, pre-2005, Massachusetts required a syphilis test before marriage
Assume the test was:
  95% sensitive and 95% specific
If I test positive, how likely is it that I truly have syphilis?
Answer is that it depends

Syphilis
Prevalence is: 1%
Se = 95%, Sp = 95%

              Truth
Test          +        -        Total
+             95       495      590
-             5        9405     9410
Total         100      9900     10,000

PPV = 95/590 = 16%
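The arithmetic behind the syphilis table can be sketched in a few lines of Python. This is only an illustration: the function name `ppv_table` and the default population of 10,000 are assumptions chosen to mirror the slide, not part of the original course material.

```python
def ppv_table(prevalence, sensitivity, specificity, n=10_000):
    """Fill in a 2x2 test-vs-truth table and return the cells plus the PPV.

    Hypothetical helper for illustration; n is an arbitrary population size.
    """
    diseased = prevalence * n
    healthy = n - diseased
    true_pos = sensitivity * diseased    # diseased who test positive
    false_neg = diseased - true_pos      # diseased who test negative
    true_neg = specificity * healthy     # healthy who test negative
    false_pos = healthy - true_neg       # healthy who test positive
    return {
        "TP": true_pos,
        "FP": false_pos,
        "FN": false_neg,
        "TN": true_neg,
        "PPV": true_pos / (true_pos + false_pos),
    }


# Same test (Se = Sp = 95%), different prevalences: the PPV "depends".
for prev in (0.01, 0.10, 0.50):
    cells = ppv_table(prev, 0.95, 0.95)
    print(f"prevalence {prev:.0%}: PPV = {cells['PPV']:.2f}")
```

At 1% prevalence this reproduces the slide's PPV of about 16%; raising the prevalence while holding the test fixed raises the PPV.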

Back to our study

                     Truth
Our study            Effect         No effect
Effect               Correct
No effect                           Correct

Alpha and beta use the TRUTH as the denominator and so are like Se and Sp

Back to our study

                     Truth
Our study            Effect         No effect
Effect               Correct
No effect                           Correct

Judging the “correctness” of a single study is the PPV, and depends on the prevalence of true hypotheses

Back to our study
alpha = 5% (Sp 95%), beta = 5% (Se 95%)
Prevalence of true hypotheses is: 10%

              Truth
Our study     +        -        Total
+             950      450      1400
-             50       8550     8600
Total         1000     9000     10,000

950/1400 = 68% chance our study is right
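The same predictive-value logic can be written directly in terms of alpha, power, and the prevalence of true hypotheses. The sketch below is illustrative only; the function name `prob_finding_is_true` is an assumption, and the second call plugs in the 80% power from the earlier vitamin example rather than the 95% used in this table.

```python
def prob_finding_is_true(prior, alpha, power):
    """Probability that a 'significant' finding reflects a real effect.

    prior: assumed proportion of tested hypotheses that are true
    alpha: Type I error rate (1 - specificity in the testing analogy)
    power: 1 - beta (sensitivity in the testing analogy)
    """
    true_positives = power * prior           # real effects that we detect
    false_positives = alpha * (1 - prior)    # null effects we wrongly "detect"
    return true_positives / (true_positives + false_positives)


# Slide's table: alpha = 5%, beta = 5% (power 95%), 10% of hypotheses true.
print(round(prob_finding_is_true(0.10, 0.05, 0.95), 2))

# Same prior and alpha, but with 80% power.
print(round(prob_finding_is_true(0.10, 0.05, 0.80), 2))
```

With the slide's inputs this gives roughly 0.68. Lower power or a lower prevalence of true hypotheses pushes the figure down further.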

Take home message: We need to critically examine the way we have been taught to design and interpret epidemiologic research

Review of basic concepts

Study design, measures of disease frequency, measures of effect/association

The Source Population
The population that gives rise to cases
It is defined:
  In time and place
  With respect to population characteristics
  With respect to external influences (modifiers)
  Not as a sample of the general population

Cohorts
Membership in a cohort requires that a person meet admissibility criteria
  Members have common admissibility-defining events
Membership begins once the temporally last criterion is met
Once a member, a person never leaves (membership is static or closed)
A closed cohort adds no new members and loses members only to death; an open cohort keeps adding new members

Dynamic population

Membership requires that a person satisfy the membership status criteria
  Members have common admissibility-defining characteristics
Membership exists so long as all of the status criteria are satisfied
A person can enter a dynamic population, leave it, and then re-enter

Cohorts vs. Dynamic Populations

Framingham Heart Study
Cohort: the admissibility criterion is enrolling in the study in 1948. You never leave the cohort once you enroll.
Dynamic population: could instead have studied all residents of Framingham from 1948 onwards, the catchment population for a case registry there. Some will leave, new people will join.

STUDY DESIGN: How to harvest information from the base
Census (cohort) or sample (case-control)
Cases are valuable (information rich)
  In SE calcs, these drive your standard error
  Ex. SE(ln(RR)) = sqrt(1/A - 1/N1 + 1/B - 1/N0)
  Include all the cases in the population
Information density of the population that gave rise to the cases is not great
  Can include all or sample
  Nearly all of the base's information is harvested when the sample of the base is a small multiple of the cases
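To see why the cases drive the standard error, the SE(ln(RR)) formula above can be evaluated term by term. The sketch below is illustrative: the function name `rr_with_ci` is an assumption, the counts (400 of 1000 exposed and 100 of 1000 unexposed become cases) are borrowed from the trohoc fallacy slides later in the deck, and the 1.96 multiplier assumes a conventional 95% Wald interval.

```python
import math


def rr_with_ci(a, n1, b, n0, z=1.96):
    """Risk ratio with a Wald confidence interval built on the log scale.

    a, b   : exposed and unexposed cases
    n1, n0 : exposed and unexposed totals at risk
    """
    rr = (a / n1) / (b / n0)
    se_log_rr = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n0)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, se_log_rr, lower, upper


a, n1, b, n0 = 400, 1000, 100, 1000  # illustrative counts
rr, se, lower, upper = rr_with_ci(a, n1, b, n0)
print(f"RR = {rr:.2f}, SE(ln RR) = {se:.3f}, 95% CI {lower:.2f} to {upper:.2f}")

# Contribution of each pair of terms to the variance of ln(RR):
case_terms = 1 / a + 1 / b      # depends only on the number of cases
total_terms = 1 / n1 + 1 / n0   # depends only on the denominators
print(f"case terms = {case_terms:.4f}, denominator terms = {total_terms:.4f}")
```

In this example the case terms are several times larger than the denominator terms, which is the sense in which the cases carry most of the information.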

Which is the best measure to assess causal effects?
1) Risk Difference
2) Risk Ratio
3) Odds Ratio

In a case-control study, from what population do we sample controls?
1) Those with disease
2) Those without disease
3) Everyone, regardless of whether they have the disease

Cohort Study

Case-control Study

Kramer and Boivin 1987

“We define a cohort study as a study in which subjects are followed forward from exposure to outcome… Inferential reasoning is from cause to effect. In case-control studies, the directionality is the reverse. Study subjects are investigated backwards from outcome to exposure, and the reasoning is from effect to cause.”

Cohort Study: Relative Risks

            Index (E+)   Reference (E-)
Cases       A            B
Non-cases   C            D
Total       N1           N0

Relative risk: (A/N1) / (B/N0)
Risk in exposed / risk in unexposed
Risk is number of cases / total at risk
  Numerator is number of cases
  Denominator is cases and controls!

Cohort Concept
[Figure: exposed (NE+) and unexposed (NE-) groups followed from t0 to t. Exposed cases: A. Unexposed cases: B. Non-cases: C = NE+ - A and D = NE- - B.]

Cohort Study: Relative Risks

            Index (E+)   Reference (E-)
Cases       A            B
Non-cases   C            D
Total       N1           N0

Relative risk: (A/N1)/(B/N0) can be rearranged as (A/B)/(N1/N0)
  A/B is the ratio of exposed to unexposed cases
  N1/N0 is the ratio of exposed to unexposed in the population
Relative risk has meaning: average increase in risk produced by exposure

Case-control: Cases

Members of the population who develop disease over the follow-up period
Same cases as the analogous cohort study
Case ascertainment is influenced by design
  Primary base: population defined first
  Secondary base: cases defined first

Case-control: Controls

A sample of the population experience that gave rise to the cases
3 options (paradigms):
  Un-diseased experience
  Population at risk at beginning of the study
  Population experience over follow-up

            0 mos   6 mos   12 mos   18 mos   24 mos
Cases       0       5       10       15       20
Non-cases   100     95      90       85       80

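Case-control Concept
[Figure: exposed (NE+) and unexposed (NE-) groups followed from t0 to t. Exposed cases: A. Unexposed cases: B. Non-cases: C = NE+ - A and D = NE- - B. Control sampling options marked on the diagram: Option 1: Cumulative. Option 2: Case-cohort. Option 3: Density sampling.]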

Case-control study

            Index   Reference
Cases       A       B
Controls    C       D

Now we can't estimate the risks A/N1 and B/N0 because we don't know the denominators
Left with an odds ratio
But how to interpret?

2 ways to calculate an OR

            Index   Reference
Cases       A       B
Controls    C       D

Cross product ratio: (A*D)/(B*C)
Not particularly meaningful, but it works

2 ways to calculate an OR

            Index   Reference
Cases       A       B
Controls    C       D

Case ratio/base ratio: (A/B) / (C/D)
  A/B is the ratio of exposed to unexposed cases
  C/D is the ratio of exposed to unexposed controls
Remember back to Relative Risk
  Here C/D fills in for N1/N0

The trohoc fallacy

Full cohort:
            Index   Reference
Cases       400     100
Non-cases   600     900
Total       1000    1000

RR = (400/1000) / (100/1000) = 4.0

Case-control study, 10% sample of non-cases:
            Index   Reference
Cases       400     100
Non-cases   60      90
Total       Not sampled

OR = (400/60) / (100/90) = 6.0

The trohoc fallacy is the idea that a case-control study is a cohort study done backwards (heteropalindrome: "trohoc" is "cohort" spelled backwards)
Requires a rare disease assumption for the odds ratio to approximate the relative risk
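The gap between the odds ratio and the relative risk under this sampling scheme can be checked numerically. The sketch below is illustrative: the function names are assumptions, the sampled control counts are expected values (no sampling variability), and the "rare disease" counts of 40 and 10 cases per 1000 are hypothetical numbers chosen only to show the approximation improving.

```python
def cohort_rr(a, b, n1, n0):
    """Relative risk from the full cohort."""
    return (a / n1) / (b / n0)


def cumulative_sampling_or(a, b, n1, n0, fraction):
    """Odds ratio when controls are a sample of those still un-diseased
    at the end of follow-up (cumulative sampling)."""
    c = fraction * (n1 - a)   # expected exposed controls
    d = fraction * (n0 - b)   # expected unexposed controls
    return (a / b) / (c / d)


# Slide's example: common disease (40% vs 10% risk), 10% sample of non-cases.
print(round(cohort_rr(400, 100, 1000, 1000), 2))
print(round(cumulative_sampling_or(400, 100, 1000, 1000, 0.10), 2))

# Hypothetical rare disease (4% vs 1% risk): same RR, OR much closer to it.
print(round(cohort_rr(40, 10, 1000, 1000), 2))
print(round(cumulative_sampling_or(40, 10, 1000, 1000, 0.10), 3))
```

With the slide's counts the odds ratio comes out at 6.0 against a relative risk of 4.0. With the rarer hypothetical disease the two nearly coincide, which is what the rare disease assumption buys.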

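Case-control Concept
[Figure: exposed (NE+) and unexposed (NE-) groups followed from t0 to t. Exposed cases: A. Unexposed cases: B. Non-cases: C = NE+ - A and D = NE- - B. Control sampling options marked on the diagram: Option 1: Cumulative. Option 2: Case-cohort.]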

The trohoc fallacy revealed

Full cohort:
            Index   Reference
Cases       400     100
Non-cases   600     900
Total       1000    1000

RR = (400/1000) / (100/1000) = 4.0

Case-control study, 10% sample of the population that gave rise to the cases:
            Index   Reference
Cases       400     100
Non-cases   Not sampled
Controls    100     100

OR = (400/100) / (100/100) = 4.0

Sample the total population that gave rise to the cases (which includes the cases), not the undiseased at the end
Cases can be their own controls if randomly sampled
Requires no rare disease assumption

Miettinen on the trohoc fallacy

“Consider the clinical trial: the concern is, as always, to contrast categories of treatment as to subsequent occurrence of some outcome phenomenon, whereas comparing different categories of the outcome as to the antecedent distribution of treatment is uninteresting if not downright perverse.”

Preferred terms like “case-referent” and “case-base” studies as “the base sample is no more a control series than a census of the base is”

Why it works

            Index   Ref
Cases       A       B
Non-cases   C       D
Total       N1      N0

OR = [A*D] / [B*C] = [A/B] / [C/D]
If we sample 10% of the base, then the odds ratio is:
OR = [A/B] / [(10%*N1)/(10%*N0)] = [A/B] / (N1/N0) = RR
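The cancellation of the sampling fraction can be confirmed with a short calculation. This is a sketch under stated assumptions: the function name `case_cohort_or` is hypothetical, the counts are the 400/100 cases out of 1000/1000 used in the trohoc fallacy slides, and controls are taken at their expected values, so random sampling variability is ignored.

```python
def case_cohort_or(a, b, n1, n0, fraction):
    """Odds ratio when controls are a sample of the whole base population
    (cases included), taken at the start of follow-up."""
    c = fraction * n1   # expected exposed controls
    d = fraction * n0   # expected unexposed controls
    return (a / b) / (c / d)


relative_risk = (400 / 1000) / (100 / 1000)

# The sampling fraction appears in both c and d, so it cancels in c/d.
for fraction in (0.05, 0.10, 0.50):
    odds_ratio = case_cohort_or(400, 100, 1000, 1000, fraction)
    print(f"sampling {fraction:.0%} of the base: OR = {odds_ratio:.2f}, RR = {relative_risk:.2f}")
```

Whatever fraction of the base is sampled, the expected odds ratio equals the relative risk, with no appeal to disease rarity. In a real study the sampled counts would vary around these expected values.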

Cohort studies exclude those who are not at risk for disease (though they don't need to). In a case-control study, should we exclude those not at risk for exposure?

Ex. In a study of hormonal contraception and heart disease, should we exclude nuns?

With appropriate sampling, the odds ratio is interpreted as an estimate of the relative risk, which has meaning.

Case-control studies are cohort studies done efficiently, not cohort studies done backwards.

Measures of Disease Frequency

Provide an estimate of the occurrence of disease in a population
Typically we study first occurrence, as later occurrences are often affected by the first
Incorporates:
  Disease state
  Time
  Population definition


Prevalence:
Proportion of the population with disease at a particular time
Cross-sectional
Reflects rate of disease occurrence and survival with disease


Cumulative Incidence (Simple)
Proportion of a population that develops disease over a follow-up period
Also called incidence proportion or risk
Bounded by 0 and 1
Time is not part of the measure but must be reported
Difficult to measure in dynamic populations

CI(t0,t) = I(t0,t)/N0


Incidence rate (density)
Number of newly developed cases divided by accumulated person time
Time is part of the denominator
Can be used in dynamic populations/cohorts
Ignores distinction between individuals (2/100 py could be 2 people followed 50 yrs each who both get the event, or 100 people followed 1 yr each of whom 2 get the event)

IR(t0,t) = I(t0,t) / ΣPT
where PT = Σ(i=1..N) t_i or PT = N × t
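The point that an incidence rate ignores how person-time is distributed across individuals can be shown with the slide's own 2/100 person-year example. The function name `incidence_rate` and the list-based representation of follow-up are assumptions made for this sketch.

```python
def incidence_rate(events, person_times):
    """New cases divided by the summed person-time of everyone followed."""
    return events / sum(person_times)


# 2 people followed 50 years each, both get the event.
few_followed_long = incidence_rate(events=2, person_times=[50, 50])

# 100 people followed 1 year each, 2 of them get the event.
many_followed_briefly = incidence_rate(events=2, person_times=[1] * 100)

print(few_followed_long, many_followed_briefly)  # both are 2 per 100 person-years
```

Both scenarios give 0.02 events per person-year, even though everyone in the first cohort became a case and only 2% did in the second.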


Rules for counting person time
Start disease free, free of history of disease at entry
At risk for outcome? Not necessary, but wasteful
Start after exposure is complete (not during) and after the minimum induction period
Stop when disease occurs (date or midpoint)
Stop if withdrawn (lost to follow-up, death from another cause, study ends, no longer at risk)
Only those eligible to be counted in the numerator are in the denominator
  Ask: if they became a case, would I have counted them?

Person Time Issues I

We conduct a cohort study of continuous smoking vs. no smoking and prostate cancer
  Enroll 1000 smokers and 1000 non-smokers
At the end, find 100 non-smokers became smokers. Should we exclude them?
  Can't, because if they had become cases while not smoking we would have included them

Person Time Issues II

Study HAART regimens and death
  But much death and LTFU in first 6 months, and we care about long-term mortality
Exclude any deaths in first 6 months
  OK if all we care about is long-term effects
When should person time start?
  Immortal person-time biases towards the null

[Figure: follow-up lines for 9 individuals, with black triangles marking disease onset. Person-time contributions shown: eight of 5 and one of 2, total 42.]
Prevalence = 2/8 = 0.25
Cum Inc = 2/9
Inc Rate = 2/42

Measure of Effect

Comparison of occurrence of outcome in the same population at the same time under two different conditions
  Only one can be observed
  Second is “counterfactual” (we will come back to this)
Theoretical, so we substitute a measure of association
  But as an approximation to the measure of effect

Measures of Association

Comparison of incidence in 2+ populations
Relative:
  Comparison by division
  Null (no effect) is 1
  Log scale (distance from 0 to 1 is the same as from 1 to infinity)
Difference:
  Comparison by subtraction
  Null (no effect) is 0
  Distance above and below null is equivalent

Calculations
RR = CI(E+) / CI(E-)
RD = CI(E+) - CI(E-)
IRR = IR(E+) / IR(E-)
IRD = IR(E+) - IR(E-)
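The relative and difference measures on this slide are the same two operations, division and subtraction, applied either to cumulative incidences or to incidence rates. The sketch below is only illustrative: the helper name `relative_and_difference` is an assumption, the cumulative incidences reuse the 400/1000 and 100/1000 from the trohoc fallacy slides, and the incidence-rate inputs (30 and 10 events per 1000 person-years) are hypothetical numbers.

```python
def relative_and_difference(freq_exposed, freq_unexposed):
    """Return (ratio, difference) of two disease frequencies.

    Works for cumulative incidences (giving RR and RD) or for
    incidence rates (giving IRR and IRD).
    """
    ratio = freq_exposed / freq_unexposed       # null value is 1
    difference = freq_exposed - freq_unexposed  # null value is 0
    return ratio, difference


# Cumulative incidence: 400/1000 exposed vs. 100/1000 unexposed.
rr, rd = relative_and_difference(400 / 1000, 100 / 1000)
print(f"RR = {rr:.2f}, RD = {rd:.2f}")

# Incidence rates (hypothetical): 30 vs. 10 events per 1000 person-years.
irr, ird = relative_and_difference(30 / 1000, 10 / 1000)
print(f"IRR = {irr:.2f}, IRD = {ird:.3f} per person-year")
```

Note that the difference measures keep the units of the underlying frequency (a proportion for RD, events per unit person-time for IRD), while the ratio measures are unitless.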

Conclusion
Objective is a VALID and PRECISE estimate of the effect of an exposure on an outcome
Need to think critically about the logic of the methods we have been taught
Make sure we understand how to validly design studies and how to correctly interpret study findings
Odds ratios are odd
  Correct sampling means we can reduce reliance on them
