Advanced Epi
August 15-19th 2011
SACEMA
Matthew Fox
Boston University
Center for Global Health and Development
Department of Epidemiology
Health Economics and Epidemiology Research [email protected]
Introductions
  Who are you?
  Where do you work/study?
  What do you study?
Welcome
  About me
  Week long short course on epi methods
    2 sessions/day, each about 3 hours (depending)
    Assumes intro/intermediate epi, practical experience with epi and stats
    Mix of lecture and discussion
  Too much material, take good notes, go back to them
  Finish mid-day on Friday
  Course works if you read and participate
Course Overview
  Review basic epidemiologic principles
    Reinterpret them in a new light
    Think through problems/implications of what we learned in intro/intermed epi
  Develop a causal framework(s) to hang our epidemiologic thinking on
  Learn/apply advanced epi methods
Modern Epidemiology III
Questions for Today
  What is epidemiology, what is its goal?
  What are measures of association and measures of effect?
    What do these measures really mean?
    Which ones have causal meanings?
  What is the odds ratio really about?
    Why does everyone use it?
The goal of epidemiologic research
  Epidemiology is the study of:
    The distribution and determinants of disease in human populations and the application of that knowledge to the control of disease
  But the goal is:
    To obtain a valid and precise (and generalizable) estimate of the effect of an exposure on a disease
    Validity is the opposite of bias, precision is the opposite of random error
  Fundamentally concerned with measurement
Anyone remember Type I and Type II error?
What are they?
Basic Statistics

                    Truth about null
Our study           Effect                  No effect
  Effect            Correct                 Type I error (alpha)
  No effect         Type II error (beta)    Correct

Type I: If we reject the null, what is the chance there is no effect?
Type II: If we fail to reject the null, what is the chance there is an effect?
How do we know a particular epidemiologic finding is true?
  Find that the relative risk of exposure to vitamin # on cancer @ is 2.5, p=0.049
  Assume we did the perfect study
    No bias (confounding, selection, information)
    80% power, alpha = 0.05
  What is the chance there is really no effect of vitamins on cancer?
    i.e. the true relative risk is 1
Syphilis testing in the US
  In the US pre-2005, Massachusetts required a syphilis test before marriage
  Assume the test was:
    95% sensitive and 95% specific
  If I test positive, how likely is it that I truly have syphilis?
  Answer is that it depends
Syphilis
Prevalence is: 1%
Se = 95%, Sp = 95%

                Truth +    Truth -    Total
Test +             95        495        590
Test -              5       9405       9410
Total             100       9900     10,000

PPV = 16%
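The arithmetic behind the syphilis table can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function name and the population size of 10,000 are assumptions chosen to match the table.

```python
def positive_predictive_value(sensitivity, specificity, prevalence, n=10_000):
    """Fill in the 2x2 table for a population of n and return the PPV."""
    diseased = prevalence * n                      # truly positive column total
    not_diseased = n - diseased                    # truly negative column total
    true_positives = sensitivity * diseased
    false_positives = (1 - specificity) * not_diseased
    # PPV uses the TEST-POSITIVE row as the denominator, not the truth column.
    return true_positives / (true_positives + false_positives)


ppv = positive_predictive_value(sensitivity=0.95, specificity=0.95, prevalence=0.01)
print(f"PPV at 1% prevalence: {ppv:.0%}")

# The same test applied where half the population is infected.
ppv_common = positive_predictive_value(0.95, 0.95, prevalence=0.50)
print(f"PPV at 50% prevalence: {ppv_common:.0%}")
```

Changing only the prevalence changes the answer, which is the sense in which "it depends".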
Back to our study

                    Truth
Our study           Effect                  No effect
  Effect            Correct                 Type I error (alpha)
  No effect         Type II error (beta)    Correct

Alpha and beta use the TRUTH as the denominator and so are like Se and Sp
Judging the “correctness” of a single study is the PPV, and depends on the prevalence of true hypotheses
Back to our study
alpha = 5% (Sp 95%), beta = 5% (Se 95%)
Prevalence of true hypotheses is: 10%

                Truth +    Truth -    Total
Our Study +       950        450       1400
Our Study -        50       8550       8600
Total            1000       9000     10,000

68% chance our study is right
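The same calculation can be written directly in terms of alpha, power (1 - beta), and the prevalence of true hypotheses. The sketch below is illustrative only; the function name and the extra prior values in the loop are assumptions, not figures from the slides.

```python
def prob_finding_is_true(prior, alpha=0.05, power=0.95):
    """Chance that a 'significant' study reflects a real effect.

    prior : assumed proportion of tested hypotheses that are really true
    alpha : Type I error rate (1 - specificity of the study)
    power : 1 - beta (sensitivity of the study)
    """
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)


# Slide values: alpha = 5%, beta = 5%, 10% of hypotheses true.
slide_value = prob_finding_is_true(prior=0.10)
print(f"prior 10%: {slide_value:.0%}")

# Hypothetical priors, to show how strongly the answer depends on them.
for prior in (0.50, 0.01):
    print(f"prior {prior:.0%}: {prob_finding_is_true(prior):.0%}")
```

With the 80% power assumed on the earlier vitamin slide instead of 95%, `prob_finding_is_true(0.10, power=0.80)` gives a lower value still.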
Take home message: We need to critically examine the way we have been taught to design and interpret
epidemiologic research
Review of basic concepts
Study design, measures of disease frequency, measures of effect/association
The Source Population
  The population that gives rise to cases
  It is defined:
    In time and place
    With respect to population characteristics
    With respect to external influences (modifiers)
    Not as a sample of the general population
Cohorts
  Membership in a cohort requires that a person meet admissibility criteria
    Have common admissibility-defining events
  Membership begins once the temporally last criterion is met
  Once a member, a person never leaves (membership is static or closed)
  A closed cohort adds no new members and loses members only to death; an open cohort keeps adding new members
Dynamic population
  Membership requires a person satisfy the membership status criteria
    They have common admissibility-defining characteristics
  Membership exists so long as all of the status criteria are satisfied
  A person can enter a dynamic population, leave it, and then re-enter
Cohorts vs. Dynamic Populations
  Framingham heart study
    Cohort: the admissibility criterion is enrolling in the study in 1948. You never leave the cohort once you enroll.
    Dynamic population: could have instead studied all residents of Framingham from 1948 onwards, the catchment population for a case registry there. Some will leave, new people will join.
STUDY DESIGN: How to harvest information from the base
  Census (cohort) or Sample (case-control)
  Cases are valuable (information rich)
    In SE calcs, these drive your standard error
    Ex. SE(ln(RR)) = sqrt(1/A - 1/N1 + 1/B - 1/N0)
    Include all the cases in the population
  Information density of population that gave rise to cases is not great
    Can include all or sample
    Nearly all of the base's info is harvested when the sample of the base is a small multiple of the cases
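To see why the cases drive the standard error, the SE(ln(RR)) formula above can be evaluated numerically. The counts below are hypothetical and chosen only for illustration; the function names are likewise assumptions.

```python
import math


def log_rr_standard_error(a, n1, b, n0):
    """SE of ln(RR) for a cohort: a of n1 exposed and b of n0 unexposed become cases."""
    return math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n0)


def rr_with_interval(a, n1, b, n0, z=1.96):
    """Risk ratio with a Wald-type interval computed on the log scale."""
    rr = (a / n1) / (b / n0)
    se = log_rr_standard_error(a, n1, b, n0)
    return rr, rr * math.exp(-z * se), rr * math.exp(z * se)


# Hypothetical cohort: 40 cases among 1000 exposed, 10 among 1000 unexposed.
a, n1, b, n0 = 40, 1000, 10, 1000
rr, lower, upper = rr_with_interval(a, n1, b, n0)
se = log_rr_standard_error(a, n1, b, n0)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}, SE(ln RR) = {se:.3f}")

# Size of the case terms relative to the denominator terms in the variance.
case_terms = 1 / a + 1 / b
denominator_terms = 1 / n1 + 1 / n0
print(f"case terms = {case_terms:.3f}, denominator terms = {denominator_terms:.3f}")
```

In this example the 1/A and 1/B terms are far larger than the 1/N1 and 1/N0 terms, so the number of cases, not the size of the base, largely determines precision.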
Which is the best measure to assess causal effects?
  1) Risk Difference
  2) Risk Ratio
  3) Odds Ratio
In a case-control study, from what population do we sample controls?
  1) Those with disease
  2) Those without disease
  3) Everyone, regardless of whether they have the disease
Cohort Study
Case-control Study
Kramer and Boivin 1987
“We define a cohort study as a study in which subjects are followed forward from exposure to outcome… Inferential reasoning is from cause to effect. In case-control studies, the directionality is the reverse. Study subjects are investigated backwards from outcome to exposure, and the reasoning is from effect to cause.”
Cohort Study: Relative Risks
  Relative risk: (A/N1) / (B/N0)
    Risk in exposed / risk in unexposed
    Risk is number of cases / total at risk
    Numerator is number of cases
    Denominator is cases and controls!

                Index (E+)    Reference (E-)
Cases               A               B
Non-cases           C               D
Total               N1              N0
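[Figure: Cohort Concept. Exposed (NE+) and unexposed (NE-) groups are followed from t0 to t, yielding exposed cases A and unexposed cases B; the remaining non-cases are C = NE+ - A and D = NE- - B.]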
Cohort Study: Relative Risks
  Relative risk: (A/N1)/(B/N0) can be rearranged as (A/B)/(N1/N0)
    A/B is ratio of exposed to unexposed cases
    N1/N0 is ratio of exposed to unexposed in population

                Index (E+)    Reference (E-)
Cases               A               B
Non-cases           C               D
Total               N1              N0
Relative risk has meaning: average increase in risk produced by exposure
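The rearrangement of the relative risk can be checked numerically. The counts here are hypothetical and the function names are assumptions made for this sketch.

```python
import math


def rr_as_risk_ratio(a, n1, b, n0):
    """Risk in exposed divided by risk in unexposed."""
    return (a / n1) / (b / n0)


def rr_as_case_ratio_over_base_ratio(a, n1, b, n0):
    """Ratio of exposed to unexposed cases, divided by the same ratio in the population."""
    return (a / b) / (n1 / n0)


# Hypothetical cohort with unequal group sizes.
a, n1, b, n0 = 30, 200, 10, 400
first = rr_as_risk_ratio(a, n1, b, n0)
second = rr_as_case_ratio_over_base_ratio(a, n1, b, n0)
print(first, second, math.isclose(first, second))
```

The second form needs only the case counts and the ratio N1/N0, not N1 and N0 themselves, which is what makes sampling the base possible.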
Case-control: Cases
  Members of population who develop disease over the follow-up period
  Same cases as the analogous cohort study
  Case ascertainment is influenced by design
    Primary base: population defined first
    Secondary base: cases defined first
Case-control: Controls
  A sample of the population experience that gave rise to the cases
  3 options (paradigms)
    Un-diseased experience
    Population at risk at beginning of the study
    Population experience over follow-up

                0 mos    6 mos    12 mos    18 mos    24 mos
Cases              0        5       10        15        20
Non-cases        100       95       90        85        80
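[Figure: Case-control Concept. Exposed (NE+) and unexposed (NE-) groups followed from t0 to t, with exposed cases A, unexposed cases B, and non-cases C = NE+ - A and D = NE- - B. Three control-sampling options are marked. Option 1: Cumulative. Option 2: Case-cohort. Option 3: Density Sampling.]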
Case-control study

                Index    Reference
Cases             A          B
Controls          C          D

  Now we can’t estimate risk A/N1 and B/N0 because we don’t know the denominators
  Left with an odds ratio
  But how to interpret?
2 ways to calculate an OR

                Index    Reference
Cases             A          B
Controls          C          D

  Cross product ratio: (A*D)/(B*C)
  Not particularly meaningful, but it works
2 ways to calculate an OR

                Index    Reference
Cases             A          B
Controls          C          D

  Case ratio/base ratio: (A/B) / (C/D)
    A/B is the ratio of exposed to unexposed cases
    C/D is the ratio of exposed to unexposed controls
  Remember back to Relative Risk
    Here C/D fills in for N1/N0
The trohoc fallacy

Full cohort:
                Index    Reference
Cases            400        100
Non-cases        600        900
Total           1000       1000
RR = (400/1000) / (100/1000) = 4.0

Case-control study, 10% sample of non-cases:
                Index    Reference
Cases            400        100
Non-cases         60         90
Total           Not sampled
OR = (400/60) / (100/90) = 6.0

  The trohoc fallacy is the idea that a case-control study is a cohort study done backwards (heteropalindrome)
  Requires a rare disease assumption for the odds ratio to approximate the relative risk
[Figure: Case-control Concept, repeated, marking Option 1: Cumulative and Option 2: Case-cohort.]
The trohoc fallacy revealed

Full cohort:
                Index    Reference
Cases            400        100
Non-cases        600        900
Total           1000       1000
RR = (400/1000) / (100/1000) = 4.0

Case-control study, 10% sample of population that gave rise to cases:
                Index    Reference
Cases            400        100
Non-cases       Not sampled
Controls         100        100
OR = (400/100) / (100/100) = 4.0

  Sample the total population that gave rise to cases (which includes cases), not the undiseased at end
  Cases can be their own controls if randomly sampled
  Requires no rare disease assumption
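The two sampling schemes can be compared side by side using the cohort counts from the slide (400/1000 exposed cases, 100/1000 unexposed cases). This is a sketch with expected control counts rather than a random draw; the function name and the extra 25% sampling fraction are assumptions.

```python
def odds_ratio(a, b, c, d):
    """Case ratio (a/b) divided by control ratio (c/d)."""
    return (a / b) / (c / d)


# Cohort counts from the slide.
A, B = 400, 100          # exposed and unexposed cases
N1, N0 = 1000, 1000      # exposed and unexposed people at the start of follow-up
true_rr = (A / N1) / (B / N0)

for fraction in (0.10, 0.25):
    # Option 1 (cumulative): controls drawn from those still undiseased at the end.
    or_cumulative = odds_ratio(A, B, fraction * (N1 - A), fraction * (N0 - B))
    # Option 2 (case-cohort): controls drawn from everyone at baseline, cases included.
    or_case_cohort = odds_ratio(A, B, fraction * N1, fraction * N0)
    print(f"sampling {fraction:.0%}: cumulative OR = {or_cumulative:.1f}, "
          f"case-cohort OR = {or_case_cohort:.1f}, cohort RR = {true_rr:.1f}")
```

The sampling fraction cancels in both schemes; only the choice of what to sample from decides whether the odds ratio matches the relative risk.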
Miettinen on the trohoc fallacy
“Consider the clinical trial: the concern is, as always, to contrast categories of treatment as to subsequent occurrence of some outcome phenomenon, whereas comparing different categories of the outcome as to the antecedent distribution of treatment is uninteresting if not downright perverse.”
Preferred terms like “case-referent” and “case-base” studies as “the base sample is no more a control series than a census of the base is”
Why it works

                Index    Ref
Cases             A        B
Non-case          C        D
Total             N1       N0

  OR = [A*D] / [B*C] = [A/B] / [C/D]
  If we sample 10% of the base then the odds ratio is:
    OR = [A/B] / [(10%*N1)/(10%*N0)] = [A/B] / (N1/N0) = RR
Cohort studies exclude those who are not at risk for disease (though they don’t need to). In a case-control study, should we exclude those not at risk for exposure?
Ex. In a study of hormonal contraception and heart disease, should we exclude nuns?
With appropriate sampling, odds ratio is interpreted as
estimate of relative risk, which has meaning.
Case control studies are cohort studies done
efficiently, not cohort studies done backwards.
Measures of Disease Frequency
Provide an estimate of the occurrence of disease in a populationTypically we study first occurrence as later
occurrences are often affected by first Incorporates:
Disease stateTimePopulation definition
Measures of Disease Frequency
  Prevalence:
    Proportion of population with disease at a particular time
    Cross-sectional
    Reflects rate of disease occurrence and survival with disease
Measures of Disease Frequency
  Cumulative Incidence (Simple)
    Proportion of a population that develops disease over a follow-up period
    Also called incidence proportion or risk
    Bounded by 0 and 1
    Time not part of measure but must report
    Difficult to measure in dynamic populations
  CI(t0,t) = I(t0,t)/N0
Measures of Disease Frequency
  Incidence rate (density)
    Number of newly developed cases divided by accumulated person time
    Time is part of the denominator
    Can be used in dynamic populations/cohorts
    Ignores distinction between individuals (2/100 py could be 2 followed 50 yrs each, both get event or 100 followed 1 yr each, 2 get event)
$$IR(t_0,t) = \frac{I(t_0,t)}{PT}, \quad \text{where } PT = \sum_{i=1}^{N} t_i \ \text{ or } \ PT = N \times t$$
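The point that an incidence rate ignores the distinction between individuals can be illustrated with the two scenarios mentioned on the slide (2 people followed 50 years each, or 100 people followed 1 year each). This is a minimal sketch; the data layout and function name are assumptions.

```python
def incidence_rate(follow_up):
    """Cases divided by accumulated person-time.

    follow_up is a list of (years_followed, had_event) pairs, one per person.
    """
    cases = sum(1 for _, had_event in follow_up if had_event)
    person_years = sum(years for years, _ in follow_up)
    return cases / person_years


# Scenario A: 2 people followed 50 years each, both get the event.
scenario_a = [(50, True), (50, True)]

# Scenario B: 100 people followed 1 year each, 2 get the event.
scenario_b = [(1, True)] * 2 + [(1, False)] * 98

rate_a = incidence_rate(scenario_a)
rate_b = incidence_rate(scenario_b)
print(f"A: {rate_a * 100:.0f} per 100 py, B: {rate_b * 100:.0f} per 100 py")

# Cumulative incidence, by contrast, differs sharply between the two.
risk_a = sum(e for _, e in scenario_a) / len(scenario_a)
risk_b = sum(e for _, e in scenario_b) / len(scenario_b)
print(f"A: risk {risk_a:.0%} over 50 yrs, B: risk {risk_b:.0%} over 1 yr")
```

Both scenarios give the same rate even though the follow-up experience is very different, which is why the follow-up period should be reported alongside a cumulative incidence.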
Measures of Disease Frequency
  Rules for counting person time
    Start disease free, free of history of disease at entry
    At risk for outcome? Not necessary, but wasteful
    Start after exposure is complete (not during) and after minimum induction period
    Stop when disease occurs (date or midpoint)
    Stop if withdrawn (lost to follow up, death from another cause, study ends, no longer at risk)
  Only those eligible to be counted in numerator are in denominator
    Ask, if became a case, would I have counted them?
Person Time Issues I
  We conduct a cohort study of continuous smoking vs. no smoking and prostate cancer
    Enroll 1000 smokers and 1000 non-smokers
  At end, find 100 non-smokers became smokers. Should we exclude them?
    Can’t, because if they became cases while not smoking we would have included them
Person Time Issues II
  Study HAART regimens and death
    But much death and LTFU in first 6 months and we care about long-term mortality
  Exclude any deaths in first 6 months
    OK if all we care about is long-term effects
    When should person time start?
  Immortal person-time biases towards null
[Figure, shown three times with the legend "Black triangle": Prevalence = 2/8 = 0.25; Cum Inc = 2/9; Inc Rate = 2/42. Numeric labels in the figure: 5 (eight times) and 2.]
Measure of Effect
  Comparison of occurrence of outcome in the same population at same time under two different conditions
    Only one can be observed
    Second is “counterfactual” (we will come back to this)
  Theoretical, as such we substitute measure of association
    But as an approximation to measure of effect
Measures of Association
  Comparison of incidence in 2+ populations
  Relative:
    Comparison by division
    Null (no effect) is 1
    Log scale (distance from 0-1 is same as 1 to infinity)
  Difference:
    Comparison by subtraction
    Null (no effect) is 0
    Distance above and below null is equivalent
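The remark about the log scale for relative measures can be made concrete. The ratios below are arbitrary illustrative values, not results from any study.

```python
import math

# On the raw scale, halving and doubling look unequal; on the log scale they mirror each other.
for ratio in (0.25, 0.5, 1.0, 2.0, 4.0):
    raw_distance = ratio - 1          # distance from the null value of 1
    log_distance = math.log(ratio)    # distance from the null value of ln(1) = 0
    print(f"ratio {ratio:<4}: raw {raw_distance:+.2f}, log {log_distance:+.3f}")

halving = math.log(0.5)
doubling = math.log(2.0)
print(math.isclose(halving, -doubling))
```

This symmetry is the reason confidence intervals for ratio measures are usually built on the log scale and then exponentiated.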
Calculations

$$RR = \frac{CI_E}{CI_{\bar{E}}} \qquad RD = CI_E - CI_{\bar{E}}$$

$$IRR = \frac{IR_E}{IR_{\bar{E}}} \qquad IRD = IR_E - IR_{\bar{E}}$$
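As a worked illustration of the four measures (risk ratio, risk difference, incidence rate ratio, incidence rate difference), the sketch below uses entirely hypothetical counts and person-time; the dictionary layout and variable names are assumptions.

```python
import math

# Hypothetical data for exposed (E) and unexposed (not E) groups.
exposed = {"cases": 30, "persons": 1000, "person_years": 2500}
unexposed = {"cases": 10, "persons": 1000, "person_years": 2600}

# Cumulative incidence (risk) and incidence rate in each group.
ci_e = exposed["cases"] / exposed["persons"]
ci_u = unexposed["cases"] / unexposed["persons"]
ir_e = exposed["cases"] / exposed["person_years"]
ir_u = unexposed["cases"] / unexposed["person_years"]

rr = ci_e / ci_u     # relative: comparison by division, null is 1
rd = ci_e - ci_u     # difference: comparison by subtraction, null is 0
irr = ir_e / ir_u
ird = ir_e - ir_u

print(f"RR  = {rr:.2f}   RD  = {rd:.3f}")
print(f"IRR = {irr:.2f}   IRD = {ird:.4f} per person-year")
```

The ratio measures are unitless, while the incidence rate difference keeps units of cases per unit of person-time.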
Conclusion
  Objective is a VALID and PRECISE estimate of the effect of an exposure on an outcome
  Need to think critically about the logic of the methods we have been taught
  Make sure we understand how to validly design studies and how to correctly interpret study findings
  Odds ratios are odd
    Correct sampling means we can reduce reliance on them