Date post: | 25-Dec-2015 |
Upload: | gervais-hopkins |
Bias, Chance & Confounding
- Bias - Systematic deviation from the truth
• systematic deviation of the results (from the true value) that leads to incorrect conclusions
• can arise at any stage of a study
– design
– data collection
– analysis
– interpretation
– publication
– review
Consequences of bias
• underestimate or overestimate the parameter you are trying to measure
– eg blood pressure, prevalence of arthritis
• incorrect estimate of association between disease and exposure
– eg prevalence of lung cancer among smokers
Three types of bias
• Selection bias
• Information bias
• Confounding
Birth complications in a Canadian hospital
Hospital    Complications    Total births    % complications
Summer      20               240             8.3%
Winter      20               180             11.1%
What does it show?
Births at hospital and at home
Home        Complications    Total births    % complications
Summer      2                60              3.3%
Winter      2                120             1.7%
Hospital    Complications    Total births    % complications
Summer      20               240             8.3%
Winter      20               180             11.1%
Why?
Combining hospital and home data
All births  Complications    Total births    % complications
Summer      22               300             7.3%
Winter      22               300             7.3%
Home deliveries were more common in winter. Labour complications among home deliveries were low. Women with prolonged or complicated labour attempt to reach the hospital no matter what the season.
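The hospital, home and combined percentages can be recomputed from the counts on the slides. The following is a minimal sketch in Python; the dictionary layout and the helper name `complication_rate` are illustrative choices, not part of the original material.

```python
# Counts copied from the slides: (complications, total births).
births = {
    "hospital": {"summer": (20, 240), "winter": (20, 180)},
    "home": {"summer": (2, 60), "winter": (2, 120)},
}


def complication_rate(complications, total):
    """Return the complication rate as a percentage."""
    return 100.0 * complications / total


# Rates within each place of birth.
for place, seasons in births.items():
    for season, (cases, total) in seasons.items():
        print(f"{place:8s} {season:6s} {complication_rate(cases, total):5.1f}%")

# Rates after pooling hospital and home births by season.
combined = {}
for season in ("summer", "winter"):
    cases = sum(births[place][season][0] for place in births)
    total = sum(births[place][season][1] for place in births)
    combined[season] = complication_rate(cases, total)
    print(f"all      {season:6s} {combined[season]:5.1f}%")
```

Looking only at hospital births suggests a winter excess; pooling both places of birth removes it, because the mix of home and hospital deliveries differs by season.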
Selection bias: systematic differences between those who do and do not take part in a study
Prevalence of Human papilloma virus by source of subjects
Adapted from Revzina NV Int J STD/AIDS 2005
Prevalence of alcohol abuse (St Louis, Missouri)
No. of contact attempts    Recruitment rate    Prevalence (%) alcohol abuse
1-5                        56.5                3.89
7                          56.5                3.98
8                          70.3                4.22
9                          73.1                4.26
10-57                      85.2                4.61
Cottler et al 1987
Selection bias: depression in chronic pain patients
% with depression    Setting
4.6%                 pain clinic
13.4%                arthritis clinic
24.2%                inpatient neurosurgery
57.1%                psychiatric clinic
Dworkin & Gitlin 1991
Selection bias in words
• error due to systematic difference in characteristics of those who do and those who do not participate in a study
• study group is not representative of the population from which you think it was sampled
• can lead to
– misleading prevalence estimate
– overestimate or underestimate of association between risk factor and disease
Exploring bias: cataracts and occupational irradiation
               Cataract    No cataract    Total
Exposed        10          90             100
Not exposed    200         1800           2000
Percent of employees who developed cataract
Exposed = 10/100 = 10%
Non-exposed = 200/2000 = 10%
What type of bias might occur
In reality
Self-selection bias I: the exposed are more concerned about sight
               Cataract    No cataract    Total
Exposed        10          54             64
Not exposed    120         1080           1200
All the exposed with sight problems turn up
Only 60% of all the other groups turn up
Percent of employees who developed cataract
Exposed = 10/64 = 15.6%
Non-exposed = 120/1200 = 10%
Self-selection bias II: the exposed with sight problems have moved to other jobs
               Cataract    No cataract    Total
Exposed        1           54             55
Not exposed    120         1080           1200
Only 10% of the exposed with sight problems are still employed
60% of all the other groups turn up
Percent of employees who developed cataract
Exposed = 1/55 = 1.82%
Non-exposed = 120/1200 = 10%
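Both self-selection scenarios can be reproduced by applying a participation fraction to each cell of the true table. This is a sketch only: the function names and the dictionary layout are hypothetical, and each percentage is taken as cases divided by the row total (so scenario I gives 10/64).

```python
def percent_with_cataract(cataract, no_cataract):
    """Percentage of a group with cataract: cases divided by the group total."""
    return 100.0 * cataract / (cataract + no_cataract)


def observed(true_counts, participation):
    """Scale each cell of the true 2x2 table by its participation fraction."""
    return {
        group: tuple(n * f for n, f in zip(true_counts[group], participation[group]))
        for group in true_counts
    }


# True counts from the slides: (cataract, no cataract).
truth = {"exposed": (10, 90), "not_exposed": (200, 1800)}

# Scenario I: every exposed worker with cataract attends; 60% of all others attend.
scenario_1 = observed(truth, {"exposed": (1.0, 0.6), "not_exposed": (0.6, 0.6)})
# Scenario II: only 10% of exposed workers with cataract are still employed.
scenario_2 = observed(truth, {"exposed": (0.1, 0.6), "not_exposed": (0.6, 0.6)})

for name, table in [("truth", truth), ("scenario I", scenario_1), ("scenario II", scenario_2)]:
    exposed = percent_with_cataract(*table["exposed"])
    unexposed = percent_with_cataract(*table["not_exposed"])
    print(f"{name:12s} exposed {exposed:5.2f}%  not exposed {unexposed:5.2f}%")
```

The true percentages are equal in both groups, yet the same population yields an apparent excess or an apparent deficit among the exposed depending on who turns up.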
Types of selection bias
• sampling bias
– eg faulty sampling frame
• self-selection bias
– eg healthy worker effect
• response bias
– eg more middle class/worried participate
• diagnostic bias
– knowledge of exposure status influences diagnosis
• admission (Berkson’s) bias
– eg poor social support
Size is no protection….
The fall of the Literary Digest..
Magazine had 10,000,000 subscribers
Predicted a Republican victory
Democrats won, magazine folded
Information bias
• systematic difference in quality/accuracy of data
– point estimates
– between groups
• reporting bias
– recall bias
– social desirability (halo effect)
– Hawthorne effect
• measurement bias
– poorly calibrated machine
– poorly phrased questions
• observer bias
– different observers give different results
– eg blood pressure
Measuring blood pressure
Kim E S et al. Dia Care 2007;30:1959-1963
Misclassification bias
• a type of information bias
– either exposure status incorrect
– or disease status incorrect
• two types
– at random
– differential
Misclassification at random
No misclassification
               Disease    No disease
Exposed        250        250
Not exposed    100        400
% with disease
exposed 50%
non-exposed 20%
40% of exposed misclassified as not exposed
               Disease    No disease
Exposed        150        150
Not exposed    200        500
% with disease
exposed 50%
non-exposed 28.6% (200/700)
Random misclassification weakens the size of effect
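The random misclassification example can be checked by moving 40% of the exposed row into the not-exposed row, regardless of disease status. A minimal sketch, with hypothetical function names; percentages are cases divided by the row total.

```python
def percent_diseased(disease, no_disease):
    """Percentage of a row with disease: cases divided by the row total."""
    return 100.0 * disease / (disease + no_disease)


def misclassify_exposed(table, fraction):
    """Relabel `fraction` of the exposed row as not exposed, whatever the disease status."""
    e_d, e_n = table["exposed"]
    u_d, u_n = table["not_exposed"]
    moved_d, moved_n = e_d * fraction, e_n * fraction
    return {
        "exposed": (e_d - moved_d, e_n - moved_n),
        "not_exposed": (u_d + moved_d, u_n + moved_n),
    }


# True counts from the slide: (disease, no disease).
truth = {"exposed": (250, 250), "not_exposed": (100, 400)}
misclassified = misclassify_exposed(truth, 0.4)

for name, table in [("true", truth), ("misclassified", misclassified)]:
    exposed = percent_diseased(*table["exposed"])
    unexposed = percent_diseased(*table["not_exposed"])
    print(f"{name:14s} exposed {exposed:.1f}%  not exposed {unexposed:.1f}%  "
          f"difference {exposed - unexposed:.1f}")
```

The difference between the groups shrinks but keeps its direction, which is the sense in which random misclassification weakens the size of effect.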
Differential (systematic) misclassification
No misclassification
               Disease    No disease
Exposed        250        250
Not exposed    100        400
% with disease
exposed 50%
non-exposed 20%
Systematic misclassification
Exposed: disease free labelled “disease”
Non-exposed: diseased labelled “disease free”
               Disease    No disease
Exposed        400        100
Not exposed    100        400
% with disease
exposed 80%
non-exposed 20%
Differential misclassification can bias result in either direction
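A short sketch of the "either direction" point. The first relabelling follows the slide (150 of the 250 disease-free exposed, ie 60%, are labelled diseased). The second, in the opposite direction, is a hypothetical scenario added here for illustration only; the function names are likewise assumptions.

```python
def percent_diseased(disease, no_disease):
    """Percentage of a row with disease: cases divided by the row total."""
    return 100.0 * disease / (disease + no_disease)


def relabel_exposed(table, fraction, to_diseased):
    """Relabel disease status in the exposed row only (differential error)."""
    e_d, e_n = table["exposed"]
    if to_diseased:
        moved = e_n * fraction  # disease-free exposed labelled "disease"
        new_exposed = (e_d + moved, e_n - moved)
    else:
        moved = e_d * fraction  # diseased exposed labelled "disease free"
        new_exposed = (e_d - moved, e_n + moved)
    return {"exposed": new_exposed, "not_exposed": table["not_exposed"]}


truth = {"exposed": (250, 250), "not_exposed": (100, 400)}

# From the slide: 60% of disease-free exposed are labelled diseased.
inflated = relabel_exposed(truth, 0.6, to_diseased=True)
# Hypothetical opposite error: 60% of diseased exposed are labelled disease free.
deflated = relabel_exposed(truth, 0.6, to_diseased=False)

for name, table in [("true", truth), ("inflated", inflated), ("deflated", deflated)]:
    print(f"{name:9s} exposed {percent_diseased(*table['exposed']):.0f}%  "
          f"not exposed {percent_diseased(*table['not_exposed']):.0f}%")
```

In the first case the association is exaggerated; in the hypothetical second case it disappears altogether.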
Effect of bias in epidemiology
• Systematic error in study resulting in incorrect estimate of association between exposure and disease
– unusual people participate
– exposure incorrectly measured
– outcome incorrectly classified
• There are lots of ways to do a poor study
• Need rigorous design and conduct of studies
The Play of Chance:
Influences the results of all studies, sometimes a little, sometimes a lot.
The play of chance?
• A week in Ninewells - 25 newborns
• 16 boys; 9 girls
• Natural variability
• Small numbers are volatile
Subsequent days
6 boys & 10 girls
11 boys & 3 girls
7 boys & 7 girls
In subsequent weeks (different samples):
25 babies: 11 boys: proportion = 44%
25 babies: 16 boys: proportion = 64%
25 babies: 14 boys: proportion = 56%
25 babies: 17 boys: proportion = 68%
25 babies: 9 boys: proportion = 36%
[Figure: each observed proportion marked with an X on a scale from 10 to 100, labelled "Proportion Observed"]
• Natural variability & small numbers are volatile
• Some (unknown) ‘true’ proportion
• Any particular sample gives just one possible result.
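The volatility of small samples can be seen in a quick simulation. This is a sketch under stated assumptions: the true proportion of boys is set to 0.5 (the slides leave it unknown), and the function name, seed and the larger sample size of 2500 are arbitrary choices for illustration.

```python
import random


def simulate_weeks(n_weeks, babies_per_week=25, p_boy=0.5, seed=1):
    """Return the number of boys observed in each simulated week."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    results = []
    for _ in range(n_weeks):
        boys = sum(rng.random() < p_boy for _ in range(babies_per_week))
        results.append(boys)
    return results


# Small samples: 25 babies per week.
weekly_boys = simulate_weeks(5)
for boys in weekly_boys:
    print(f"25 babies: {boys} boys: proportion = {100 * boys / 25:.0f}%")

# Much larger samples for comparison (assumed size, not from the slides).
large_samples = simulate_weeks(5, babies_per_week=2500)
for boys in large_samples:
    print(f"2500 babies: {boys} boys: proportion = {100 * boys / 2500:.1f}%")
```

With 25 babies the observed proportion swings widely from week to week; with 2500 it stays close to the assumed true value.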
Breast cancer patients
Sample    % surviving 1 year
1         85%
2         77%
3         79%
4         89%
Suicide rate in women, Northern Ireland
[Figure: rate per 100,000 by year]
In research
chance influences the results
how can we find out if a result is due to chance?
can we get an idea of what the true answer might be?
Interpreting clinical trials
% surviving 3 years
Active treatment    Control
60%                 55%
Is there a real difference?
Interpreting results
The key question is:
How likely is it that the results happened by chance?
We need to measure chance
ie the likelihood of events happening
Probability: a measure of chance
Event                      Frequency          Probability
throwing a six             1/6                0.167
throwing 3 sixes           1/216              0.005
dying if fly 1000 miles    1 in 1 million     0.000001
winning the lottery        1 in 14 million    0.00000007
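The dice and lottery entries follow from simple arithmetic: independent events multiply, and a frequency of "1 in N" is a probability of 1/N. A small sketch using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

# One throw of a fair die.
p_six = Fraction(1, 6)

# Three independent throws: the probabilities multiply.
p_three_sixes = p_six ** 3

# "1 in 14 million" expressed as a probability.
p_lottery = Fraction(1, 14_000_000)

print(f"throwing a six:      {float(p_six):.3f}")
print(f"throwing 3 sixes:    {p_three_sixes} = {float(p_three_sixes):.3f}")
print(f"winning the lottery: {float(p_lottery):.8f}")
```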
Guide to probabilities
• likely event: large probability, eg frost in January (0.99)
• unlikely event: small probability, eg snow in August (0.001)
The Statistical Test
Proposes no effect, eg no true difference between groups
By the play of chance a small difference may be seen
Calculate the probability that the difference is simply due to chance
The logic of a statistical test
Propose Null Hypothesis ie no effect
calculate probability of result by chance
if p small conclude chance unlikely
therefore the effect is real
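To make the logic concrete, here is a sketch applied to the earlier trial result of 60% versus 55% survival. The slides give no sample sizes and name no specific test, so both are assumptions here: group sizes of 100 and 2000 per arm are invented for illustration, and the test used is a two-proportion z-test (normal approximation).

```python
import math


def two_proportion_p_value(p1, p2, n1, n2):
    """Two-sided p-value for the null hypothesis that the true proportions are equal.

    Uses the normal approximation with a pooled standard error.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)), the two-sided tail area.
    return math.erfc(abs(z) / math.sqrt(2))


# Same observed difference, two assumed (hypothetical) study sizes.
p_small_study = two_proportion_p_value(0.60, 0.55, 100, 100)
p_large_study = two_proportion_p_value(0.60, 0.55, 2000, 2000)

print(f"100 per group:  p = {p_small_study:.3f}")
print(f"2000 per group: p = {p_large_study:.4f}")
```

The same 5 percentage point difference is easily explained by chance in the small study but not in the large one.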
Examples of p-values
Outcome measure            New treatment    Control    p-value
Diastolic BP               85 mmHg          97 mmHg    0.002
% alive at 1 year          85%              70%        0.13
% pain free at 6 months    36%              23%        0.04
Decision rule
• if p<0.05 reject chance
– ie conclude real effect
• if p>0.05 cannot exclude chance
– ie cannot conclude there is real effect
Non-significance (P>0.05)
• Non-significant ≠ no effect
• Absence of evidence ≠ evidence of absence
BSE infected meat is safe
We have no reason to believe it is harmful
We have no evidence
The meaning of p<0.05
If p=0.05
would get a result as extreme as this 1 time in 20 by chance alone, if there were no true effect
ie if 20 independent tests
expect 1 spurious significance by chance
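The "1 in 20" statement can be turned into two numbers: the expected count of spurious results, and the probability of seeing at least one. A minimal sketch, assuming 20 independent tests with no true effects and a 0.05 threshold.

```python
alpha = 0.05   # significance threshold
n_tests = 20   # independent tests, all with no true effect

# Expected number of spuriously significant results.
expected_false_positives = alpha * n_tests

# Probability that at least one of the tests is spuriously significant.
p_at_least_one = 1 - (1 - alpha) ** n_tests

print(f"expected spurious results: {expected_false_positives:.1f}")
print(f"probability of at least one: {p_at_least_one:.2f}")
```

So with 20 independent tests, a spurious significant finding is more likely than not.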
Two types of problem
• conclude there is effect when none exists: Type I error
– happens with multiple testing
• fail to detect an effect when one exists: Type II error
– happens with small studies
The trade off
Statistical significance guards against chance findings
If want too much protection (p<0.01), risk missing true effects
If too little protection (eg p<0.1), then likely to get spurious effects
Another approach: 95% confidence interval
We know
i) observed treatment difference
ii) our result is affected by chance
We need to know
i) where the true value might lie
Several independent studies
[Figure: Repeated survival studies, plotting study number (0 to 25) against % survival (0 to 40)]
Mean 32.25
Most studies 30-34
True value likely to be 30 – 34 ish
Confidence Intervals
• repeated studies cluster round the true value
• from one study
– need to specify a range
– likely to contain the real value
• calculate 95% confidence interval
– Most of the time (95%) the confidence interval will contain the real value (but sometimes it will not).
Where does the true value lie?
• calculate the confidence interval
– Mean +/- 1.96 * Std error
• 95% confident it includes the true value
• decide what this means
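The "estimate +/- 1.96 * standard error" recipe can be applied to the 16 boys out of 25 births seen earlier. This is a sketch only: it uses the usual normal approximation for the standard error of a proportion, which is rough for a sample this small, and the function name is hypothetical.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Approximate 95% confidence interval for a proportion.

    Uses estimate +/- z * standard error, with the standard error of a
    proportion taken as sqrt(p * (1 - p) / n).
    """
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se


low, high = proportion_ci(16, 25)
print(f"observed proportion: {16 / 25:.0%}")
print(f"95% confidence interval: {low:.1%} to {high:.1%}")
```

The interval is wide because the sample is small; it comfortably includes 50%.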
Examples of confidence intervals
• percent of boy babies - 62%
– 95% confidence interval: 35% to 89%
• mean diastolic pressure - 95 mmHg
– 95% confidence interval: 88 to 102 mmHg
Interpreting the 95% c.i.
[Figure: axis for the difference Active - Placebo, from -20 to 20; values below zero labelled "Placebo better", values above zero labelled "Active better"; zero marks "If active same as placebo"]
[Figure: the same axis with a 95% c.i. drawn on it]
Is zero likely to be the true value?
Can we reject the Null Hypothesis?
[Figure: the same axis with a 95% c.i. marked at -10 and 5]
Can we reject the Null Hypothesis?
Pulling it together
• for difference in mean treatment effect
– if zero within confidence interval: NOT significant
• for ratio measures eg relative risk
– no difference if ratio = 1
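The rule can be written as a single check: does the interval exclude the "no effect" value (0 for a difference, 1 for a ratio)? A sketch with a hypothetical helper name; the interval of -10 to 5 is the one marked on the earlier figure, while the relative risk interval of 1.2 to 2.5 is invented purely for illustration.

```python
def excludes_null(ci_low, ci_high, null_value):
    """Return True if the confidence interval does not contain the no-effect value."""
    return not (ci_low <= null_value <= ci_high)


# Difference in means: no effect corresponds to 0.
difference_significant = excludes_null(-10, 5, null_value=0)

# Ratio measure: no effect corresponds to 1 (hypothetical interval).
ratio_significant = excludes_null(1.2, 2.5, null_value=1)

print(f"difference, c.i. -10 to 5: significant = {difference_significant}")
print(f"relative risk, c.i. 1.2 to 2.5: significant = {ratio_significant}")
```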
Confounding
[Image: The Glowing Field of Confounding Fruit, Shona MacDonald]
[Diagram: Coffee drinking and Pancreatic cancer, shown first on their own and then with Smoking added as a third factor]
Defining confounding
• the observed association between two factors is due to the effect of a third factor
– an apparent association may be spurious
– a real association may be obscured
More confounding
• Divorced men drink more
• Did alcohol cause the divorce?
• Or did an unhappy marriage cause both?
• Derivation: Latin confundere – to mix up
• when an apparent association between a factor (F) and an outcome (O) is due to a third factor (R)
Confounding
R
F O
Confounding: Ministers & Beer
[Figure: Ministers’ salaries and Price of beer]
Road traffic accidents
Age of driver    Mortality rate (per 100,000)
35-44 years      1.9
75-84 years      6.2
Evidence for Hell’s Grannies?
Clarification of terms
Factor of interest: the one thought to be a new risk factor
Confounder: the one that alters the observed relation between factor of interest and outcome
Requirements for confounding
• Confounding factor must be associated with risk factor of interest
• confounder influences risk of disease
Why worry about confounding?
• Does air pollution cause bronchitis?
[Diagram: Breathe polluted air → Develop bronchitis?, with "Have choices and power" as the third factor]
Do seatbelts reduce crash injuries?
[Diagram: Wear seatbelts → ↓ Injured in a crash?, with "Risk averse" as the third factor]
Do STDs increase HIV transmission?
[Diagram: STD → HIV?, with "Risky sex" as the third factor]
Does smoking lead to illicit drug use?
[Diagram: Factor (eg smoking) → Outcome (eg drug taking)?, with R, the true risk factor (eg social deprivation), as the third factor]
Dealing with confounding
• Must collect information on all known potential confounding factors
• Explore for confounding in the analysis
• Practically, difficult to know which are the important confounders
[Figure: Cases of Down Syndrome by Birth Order and Maternal Age. Source: EPIET (www)]
Note on confounders
A confounder
• is the true causal factor responsible for the disease
• has to be more strongly associated with the disease than the supposed risk factor
– if smoking increased risk of lung cancer x10
– confounder would need to have a bigger effect
Dealing with confounding: in the design
• restrict recruitment
– to one level of confounding factor
– could compromise sample size
• matching
– see case-control studies
Dealing with confounding: in the analysis
• stratified analysis
– explore association between the factor of interest (F) and outcome (O) for each level (stratum) of the confounding factor (R)
– then calculate an overall, weighted (unconfounded) estimate
• direct and indirect standardisation
• statistical modelling (multiple regression)
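A stratified analysis can be sketched with the birth data from earlier in these slides, treating season as the factor of interest, complications as the outcome and place of birth as the stratifying factor. The slides do not name a weighting scheme; the Mantel-Haenszel risk ratio used below is one common choice and is an assumption here, as are the function and variable names.

```python
def mantel_haenszel_rr(tables):
    """Mantel-Haenszel weighted risk ratio across strata.

    Each table is (exposed cases, exposed total, unexposed cases, unexposed total).
    """
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n0 in tables:
        total = n1 + n0
        numerator += a * n0 / total
        denominator += c * n1 / total
    return numerator / denominator


# Winter is treated as "exposed", summer as "unexposed".
# (winter complications, winter births, summer complications, summer births)
strata = {
    "hospital": (20, 180, 20, 240),
    "home": (2, 120, 2, 60),
}

# Risk ratio within each stratum.
stratum_rr = {name: (a / n1) / (c / n0) for name, (a, n1, c, n0) in strata.items()}

# Crude risk ratio, ignoring place of birth.
winter_cases = sum(t[0] for t in strata.values())
winter_total = sum(t[1] for t in strata.values())
summer_cases = sum(t[2] for t in strata.values())
summer_total = sum(t[3] for t in strata.values())
crude_rr = (winter_cases / winter_total) / (summer_cases / summer_total)

adjusted_rr = mantel_haenszel_rr(strata.values())

for name, rr in stratum_rr.items():
    print(f"{name:8s} risk ratio {rr:.2f}")
print(f"crude    risk ratio {crude_rr:.2f}")
print(f"adjusted risk ratio {adjusted_rr:.2f}")
```

The crude and weighted estimates differ, which is the signal that place of birth distorts the crude comparison. The two strata also point in opposite directions, with very few home complications, so a single pooled figure should be read with caution.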
Control of confounding: hard to control unknown risk factors
• These methods can control only known potential confounders.
• Only random assignment of exposure can control for unknown potential confounders (see randomised controlled trials)
More Confounding
Is the association between obesity and mortality due to the confounding effect of hypertension?
[Diagram: Obesity → Mortality, with Hypertension as a possible confounder (?)]
Hypertension is probably not a real confounder but rather the mechanism whereby obesity causes mortality
[Diagram: Obesity → Hypertension → Mortality, with hypertension as mechanism or intervening variable]
*Manson JE et al: JAMA 1987;257:353-8.
Even if hypertension is a mechanism linking obesity to mortality, are there alternative mechanisms that may causally link obesity and mortality?
[Diagram: Obesity → Mortality, with the path through Hypertension marked "Block by adjustment?"]
Requirements for confounding
• Confounding factor must be associated with both the factor of interest and the disease
• confounder is not an intermediate step on the causal pathway between the factor of interest and the disease
Is maternal smoking a risk factor for perinatal death?
Is the association confounded by low birth weight?
[Diagram: Maternal smoking → Perinatal mortality, with Low birth weight as a possible confounder (?)]
Is low birth weight the mechanism by which maternal smoking leads to higher risk of perinatal death?
Low birth weight is an intervening variable
[Diagram: Maternal smoking → Low birthweight → Perinatal mortality]
BUT THERE COULD BE AN ADDITIONAL QUESTION: Does maternal smoking cause perinatal death by mechanisms other than low birth weight?
[Diagram: Maternal smoking → Perinatal mortality through a direct toxic effect (?), with the path through Low birthweight marked "Block by adjustment"]
Causal models
R
F O
(F confounding)
F R O
(R intervening)
Why confounding is a problem
• risk behaviours are related
– smoking, drinking, diet, exercise, health care seeking, compliance with medication
• risk factors are hard to measure fully
– eg diet, alcohol, social class
• social factors have complex associations
– social class, race, education
• physical environment complex
– air quality, noise, traffic, parks are inter-related
• many confounders are not known
– hence control in the analysis is limited
Exploring dietary advice
The Japanese eat very little fat and suffer fewer heart attacks than the British or Americans.
On the other hand, the French eat a lot of fat and also suffer fewer heart attacks than the British or Americans.
Dietary advice
The Japanese drink very little red wine and suffer fewer heart attacks than the British or Americans.
On the other hand, Italians drink lots of red wine and also suffer fewer heart attacks than the British or Americans.
Conclusion: Eat & drink what you like. It appears that speaking English is what kills you.
borrowed from Victor J. Schoenbach, PhD
Bias, Chance & Confounding
• Assess bias first: Critical Appraisal
• Then assess chance: assumes no bias
• Then assess confounding: effect real, what is the explanation?
What you should know
• bias
– selection
– information
• play of chance
– p-values and confidence intervals
– Type I and II errors
• confounding
– meaning
– reasons for
– methods to control
– stratify / adjust in statistical model
• intervening variables