Variation: role of error, bias and confounding
Raj Bhopal, Bruce and John Usher Professor of Public Health,
Public Health Sciences Section, Division of Community Health Sciences,
University of Edinburgh, Edinburgh
Educational objectives
On completion of your studies you should understand:
That error is crucially important in applied sciences based on free-living populations, such as epidemiology
Bias, considered as an error which affects comparison groups unequally, is particularly important in epidemiology
The major causes of error and bias in epidemiology can be analysed based on the chronology of a research project
Bias in posing the research question, stating hypotheses and choosing the study population are relatively neglected but important topics in epidemiology
Errors and bias in data interpretation and publication are particularly important in epidemiology because of its health policy and health care applications
Confounding is the mis-measurement of the relationship between a risk factor and disease and arises in comparisons of groups which differ in ways that affect disease
Different epidemiological study designs share most of the problems of error and bias
Exercise: Error and bias
Reflect on the words error and bias. What is the difference, if any, between error and bias?
Why might error and bias be particularly common and important in epidemiology?
Error
An error is by definition an act, an assertion, or a belief that deviates from what is right... but what is right?
The true length of a metre is arbitrarily decided by agreeing a definition
The difference between a "correct" metre stick and an erroneous one can be accurately measured
For health and disease the truth is usually unknown and cannot be defined in the way we define a metre
Error should be considered as an inevitable and important part of human endeavour
The Popperian view is that science progresses by the rejection of hypotheses (by falsification) rather than the establishing of so-called truths (by verification)
Bias
A preference or an inclination
Bias may be intentional or unintentional
In statistics a bias is an error caused by systematically favouring some outcomes over others
Bias in epidemiology can be conceptualised as error which applies unequally to comparison groups.
Error and bias in biology
Biological research is difficult because of the complexity and variety of living things
Circadian and other natural rhythms cause change
Measurement techniques are usually limited by technology, cost or ethical considerations
Strict rules restrict what measurement is permissible ethically and what humans are willing to give their consent to
Experimental manipulation to test a hypothesis is usually done late
Figure 4.1
(a) Error is unequal in one of these groups leading to a false interpretation of the pattern of disease - falsely detecting differences
(b) Error is unequal in one of these groups leading to a false interpretation of the pattern of disease - here failure to detect differences
Error and bias in epidemiology
Error and bias in epidemiology focus on: (a) selection (of population), (b) information (collection, analysis and interpretation of data) and (c) confounding
Error and bias are also inherent in the process of developing research questions and hypotheses, but this is seldom discussed
Are questions of sex or racial differences in intelligence, disease, physiology or health biased questions?
The research question, theme or hypothesis
Science is done by human beings who often have strong ideas and views
They share in the social values and beliefs of their era such as class, racial and sexual prejudice
The question "Are men more intelligent (or healthy) than women?" could be considered a biased question
Research question
Apparently the neutral hypothesis here would be that there are no gender differences in intelligence
The underlying values of the researchers may be that men are more intelligent than women
Such values are likely to be revealed at the analysis and interpretation stage by biased interpretation
It is problematic to describe difference without conveying a sense of superiority and inferiority
The research question
The Syphilis Study of the US Public Health Service followed up 600 African American men for some 40 years
The question: does syphilis have different and, particularly, less serious outcomes in African Americans than in European-origin Americans?
Investigators denied the study subjects treatment even when it was available and curative (penicillin)
Choice of population
Known as selection bias
Volunteers are a popular choice
Volunteers tend to be different in their attitudes, behaviours and health status compared to those who do not volunteer
Men have been more often selected than women
Investigators are prone to exclude individuals and populations for reasons of convenience, cost or preference rather than for neutral, scientific reasons
Selection bias
Selection bias is inevitable, simply because investigators need to make choices
Captive populations are popular: some may be fairly representative, e.g. schoolchildren, others not at all, e.g. university students
People are also missed either inadvertently or because they actively do not participate
Selection bias matters much more in epidemiology than in biologically based medical sciences.
Biological factors are usually generalisable between individuals and populations, so there is a prior presumption of generalisability
If an anatomist describes the presence of a particular muscle, or cell type, based on one human being it is likely to be present in all human beings (and possibly all mammals)
Non-participation
Some subjects chosen for a study do not participate, causing selection bias
The non-response in good studies is typically 30%-40%
Non-responders differ from those who respond
The problem is compounded when the non-response differs greatly in two populations that are to be compared
The effect may be understood if some information is available on those not participating, e.g. their age, sex, social circumstances and why they refused
Non-response bias is an intrinsic limitation of the survey method and hence of epidemiology
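The effect of non-response on a prevalence estimate can be illustrated with a small calculation. This is a minimal sketch with invented numbers: the 35% non-response, the 20% prevalence among responders and the 30% prevalence among non-responders are all assumptions chosen for illustration, and the function names are hypothetical.

```python
def overall_prevalence(response_rate, prev_responders, prev_nonresponders):
    """Population prevalence as a weighted average of responders and non-responders."""
    return (response_rate * prev_responders
            + (1.0 - response_rate) * prev_nonresponders)


def prevalence_bounds(response_rate, prev_responders):
    """Widest possible range when nothing is known about the non-responders."""
    lowest = overall_prevalence(response_rate, prev_responders, 0.0)
    highest = overall_prevalence(response_rate, prev_responders, 1.0)
    return lowest, highest


# Hypothetical survey: 35% non-response, 20% prevalence among responders.
response_rate = 0.65
observed = 0.20

# Assumed scenario: non-responders are less healthy (30% prevalence).
assumed_true = overall_prevalence(response_rate, observed, 0.30)
low, high = prevalence_bounds(response_rate, observed)

print(f"Observed among responders: {observed:.3f}")
print(f"True value under the assumed scenario: {assumed_true:.3f}")
print(f"Possible range with no information: {low:.3f} to {high:.3f}")
```

The width of the range shows why even limited information on those not participating (age, sex, reasons for refusal) is valuable: it narrows what can plausibly be assumed about them.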
Figure 4.2
Study population
Ignored population
Comparison population
Ignoring populations
Questions harming one population
Measuring unequally
Generalising from unrepresentative populations
Comparing risk factor-disease outcome relationships in populations which differ (confounding)
Confounding is a difficult idea to explain and grasp
It is the error in the measure of association between a specific risk factor and disease outcome, which arises when there are differences in the comparison populations other than the risk factor under study
Confounding is derived from a Latin word meaning to mix up, a useful idea, for confounding mixes up causal and non-causal relationships
The potential for it to occur is there whenever the cardinal rule “compare like-with-like” is broken
Exercise: Confounding
Imagine that a study follows up people who drink alcohol and observes the occurrence of lung cancer
A group of people who do not drink and are of the same age and sex provide the comparison group
The study finds that lung cancer is more common in alcohol drinkers, i.e. there is an association between alcohol consumption and lung cancer
Did alcohol cause lung cancer?
Confounding
In what other important ways might the study (alcohol drinking) and comparison (no alcohol drinking) populations be different?
Could the association between alcohol and lung cancer be confounded?
What might be the confounding variable?
The first key analysis in all epidemiological studies is to compare the characteristics of the populations under study
Examples of confounding
(a) The confounded association: People who drink alcohol have a raised risk of lung cancer
One possible explanation: Alcohol drinking and smoking are behaviours which go together
The confounded factor: Alcohol, which is a marker for, on average, smoking more cigarettes
The confounding (causal) factor: Tobacco, which is associated with both alcohol and the disease
To check the assumption: See if the alcohol-lung cancer relationship holds in people not exposed to tobacco: if yes, tobacco is not a confounder (stratified analysis, chapter 7)
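The check described above, looking at the alcohol-lung cancer relationship separately in people exposed and not exposed to tobacco, can be sketched as follows. The counts are entirely hypothetical and were chosen so that alcohol has no effect within either smoking stratum; the names `risk_ratio` and `strata` are illustrative only.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk in the exposed divided by risk in the unexposed."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)


# Hypothetical cohort: "exposed" = drinks alcohol, "unexposed" = does not drink.
# Drinkers are mostly smokers and non-drinkers are mostly non-smokers.
strata = {
    "smokers": dict(cases_exp=80, n_exp=800, cases_unexp=20, n_unexp=200),
    "non_smokers": dict(cases_exp=2, n_exp=200, cases_unexp=8, n_unexp=800),
}

# Crude analysis: ignore smoking and pool the two strata.
keys = ("cases_exp", "n_exp", "cases_unexp", "n_unexp")
crude_counts = {k: sum(s[k] for s in strata.values()) for k in keys}
crude_rr = risk_ratio(**crude_counts)

# Stratified analysis: the same comparison within each smoking group.
stratum_rr = {name: risk_ratio(**s) for name, s in strata.items()}

print(f"Crude risk ratio (smoking ignored): {crude_rr:.2f}")
for name, rr in stratum_rr.items():
    print(f"Risk ratio among {name}: {rr:.2f}")
```

In this invented example the crude risk ratio is well above 1 while the risk ratio within each stratum is 1, which is the pattern expected when tobacco fully explains the association. Had the association persisted among non-smokers, tobacco would not be the explanation.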
Figure 4.3
Apparent but spurious risk factor for disease
The true cause and confounding variable
Disease
A statistical but not causal association
Association between the apparent risk factor and the causal factor
One of the causes of the disease
Figure 4.4
Alcohol drinking
Smoking
Lung cancer
Alcohol is statistically but not causally linked to lung cancer
Smoking is associated with the apparent risk factor alcohol, and vice versa
Smoking causes lung cancer
Possible actions to control confounding
Study design: Randomise individual subjects or units of populations, e.g. schools
Study design: Select comparable groups / restrict entry into study
Study design: Match individuals or whole populations
Analysis: Analyse subgroups separately
Analysis: Adjust data statistically
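As one illustration of "adjust data statistically", the sketch below computes a Mantel-Haenszel pooled risk ratio across strata of a confounder. The slides do not name a particular adjustment method, so the choice of Mantel-Haenszel is an assumption, and the counts (two age strata, each with a true risk ratio of 2) are hypothetical.

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel pooled risk ratio over a list of strata.

    Each stratum is (cases_exposed, n_exposed, cases_unexposed, n_unexposed).
    """
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n0 in strata:
        total = n1 + n0
        numerator += a * n0 / total
        denominator += c * n1 / total
    return numerator / denominator


# Hypothetical data: exposure is far more common in the older, higher-risk stratum.
older = (160, 800, 20, 200)    # risks 20% vs 10%
younger = (8, 200, 16, 800)    # risks 4% vs 2%
strata = [older, younger]

# Crude risk ratio: pool the strata and ignore age.
crude_rr = ((160 + 8) / (800 + 200)) / ((20 + 16) / (200 + 800))
adjusted_rr = mantel_haenszel_rr(strata)

print(f"Crude risk ratio: {crude_rr:.2f}")
print(f"Age-adjusted (Mantel-Haenszel) risk ratio: {adjusted_rr:.2f}")
```

Here the crude figure overstates the association because the exposed group is older; the adjusted figure recovers the within-stratum value. Analysing subgroups separately and adjusting statistically are complementary: the first shows each stratum, the second summarises them in one number.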
Measurement errors in epidemiology
Information bias
Why are measurement errors in epidemiology likely to be more common and more important than in other scientific disciplines, say physics, anatomy, biochemistry or animal physiology?
Assessing the presence of disease in living human beings requires a judgement
Measuring socio-economic circumstances, ethnic group, cigarette smoking habits or alcohol consumption is a complex matter
These errors are life-and-death matters, even in epidemiological research
Measurement errors
Past exposures will need to be estimated, sometimes from contemporary measures
Biological variation needs to be taken into account, e.g. blood pressure varies from moment to moment in response to physiological needs related to activity, in a 24 hour (circadian) cycle with lowered pressure in the night, and with the ambient temperature
Some variables have natural variation so great that making estimates is extremely difficult, for example, in diet, alcohol consumption, and the level of stress
Machine imprecision is also inevitable
Inaccurate observation by the investigator or diagnostician
Measurement errors and bias
Measurement errors which occur unequally in the comparison populations:
are differential misclassification errors or bias
are likely to irreversibly destroy a study
can distort the strength of the association, for example increasing it in error
Non-differential errors or biases, occurring equally in both comparison populations, are much more likely to occur
Misclassification bias
Misclassification error (or bias) occurs when a person is put into the wrong category (or population sub-group), usually as a result of faulty measurement
Some people who are hypertensive will be misclassified as normal
Some who are not hypertensive will be misclassified as hypertensive
The end result in terms of the prevalence of hypertension may be about right
The degree to which a measure leads to a correct classification can be quantified using the concepts of sensitivity and specificity - and these are discussed in relation to screening tests
In measuring the strength of association between exposures and disease outcomes non-differential misclassification error has an important and not always predictable effect
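The points above about hypertension can be made concrete with a small calculation of sensitivity, specificity and apparent prevalence. The counts are hypothetical and deliberately chosen so that the two kinds of misclassification cancel out; the variable names are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of truly hypertensive people classified as hypertensive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of truly normotensive people classified as normal."""
    return true_neg / (true_neg + false_pos)


# Hypothetical survey of 1,000 people, 200 of whom are truly hypertensive.
true_pos = 160   # hypertensive, classified as hypertensive
false_neg = 40   # hypertensive, misclassified as normal
false_pos = 40   # not hypertensive, misclassified as hypertensive
true_neg = 760   # not hypertensive, classified as normal

total = true_pos + false_neg + false_pos + true_neg
true_prevalence = (true_pos + false_neg) / total
apparent_prevalence = (true_pos + false_pos) / total
misclassified = false_neg + false_pos

print(f"Sensitivity: {sensitivity(true_pos, false_neg):.2f}")
print(f"Specificity: {specificity(true_neg, false_pos):.2f}")
print(f"True prevalence: {true_prevalence:.2f}")
print(f"Apparent prevalence: {apparent_prevalence:.2f}")
print(f"People in the wrong category: {misclassified}")
```

In this invented example the apparent prevalence equals the true prevalence even though 80 people are in the wrong category. The prevalence estimate looks right, but any analysis relating hypertension to an exposure or outcome would still be affected by those 80 misclassified individuals.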
Non-differential misclassification error
Imagine a study of 20,000 women, 10,000 on the contraceptive pill and the rest not
Say that over 10 years 20% of those on the pill develop a cardiovascular disease compared to 10% of those not on the pill
The rate of disease in the oral contraceptive group is doubled (relative risk = 2)
Assume that misclassification in exposure occurs 10% of the time, so that 10% of women actually on the pill were classified as not on the pill, and that 10% who were not on were classified as on the pill
Imaginary study of cardiovascular outcome and pill use: no misclassification
True classification of pill use status | Cardiovascular disease: Yes | No | Total
Yes | 2,000 | 8,000 | 10,000
No | 1,000 | 9,000 | 10,000
Total | 3,000 | 17,000 | 20,000
Pill and cardiovascular disease: 10% misclassification of pill use
Classification of pill use status | Cardiovascular disease: Yes | No | Total
Yes, classified right (on the pill, so incidence rate is 20%) | 1,800 | 7,200 | 9,000
Yes, classified wrong (actually not on the pill, so incidence rate is 10%) | 100 | 900 | 1,000
Subtotal | 1,900 | 8,100 | 10,000
No, classified right (not on the pill, so incidence rate is 10%) | 900 | 8,100 | 9,000
No, classified wrong (actually on the pill, so incidence rate is 20%) | 200 | 800 | 1,000
Subtotal | 1,100 | 8,900 | 10,000
TOTAL | 3,000 | 17,000 | 20,000
Misclassification: the pill
The risk of CVD in the "pill users group" with 10% misclassification is 1,900/10,000, and in the "not on the pill group" is 1,100/10,000, so the relative risk is $\frac{1{,}900/10{,}000}{1{,}100/10{,}000} \approx 1.7$
Misclassification will, inevitably, also arise in measurement of the disease outcome, further reducing the strength of the association
Generally, non-differential misclassification bias lowers the relative risk.
This general principle may break down when misclassification occurs in confounding variables as well
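The arithmetic of the pill example can be reproduced with a short calculation. This is a sketch of the imaginary study only: the group sizes, the 20% and 10% risks and the 10% misclassification rate come from the example, while the function name and its parameters are hypothetical.

```python
def observed_relative_risk(n_exposed, n_unexposed,
                           risk_exposed, risk_unexposed, error_rate):
    """Relative risk after the same share of each group is put in the wrong category."""
    exposed_wrong = n_exposed * error_rate      # on the pill, classified as not
    unexposed_wrong = n_unexposed * error_rate  # not on the pill, classified as on
    exposed_right = n_exposed - exposed_wrong
    unexposed_right = n_unexposed - unexposed_wrong

    # Group classified as "on the pill": correct users plus misplaced non-users.
    n_classified_on = exposed_right + unexposed_wrong
    cases_classified_on = (exposed_right * risk_exposed
                           + unexposed_wrong * risk_unexposed)

    # Group classified as "not on the pill": correct non-users plus misplaced users.
    n_classified_off = unexposed_right + exposed_wrong
    cases_classified_off = (unexposed_right * risk_unexposed
                            + exposed_wrong * risk_exposed)

    return ((cases_classified_on / n_classified_on)
            / (cases_classified_off / n_classified_off))


true_rr = observed_relative_risk(10_000, 10_000, 0.20, 0.10, error_rate=0.0)
rr_10_percent = observed_relative_risk(10_000, 10_000, 0.20, 0.10, error_rate=0.10)

print(f"Relative risk with no misclassification: {true_rr:.2f}")
print(f"Relative risk with 10% misclassification: {rr_10_percent:.2f}")
```

Trying larger values of `error_rate` shows the relative risk moving further towards 1, which matches the general statement that non-differential misclassification of exposure weakens the observed association.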
Analysis and interpretation
Usually the potential for data analysis is far greater than that actually done
The choices will be informed by the prior interests (and biases) and expertise of the researcher
External scrutiny of the research protocol at an early stage by objective advisors could reduce such biases
Inclusion of objective, uninvolved people in the research team at the data analysis and interpretation stage is possible but unusual, so investigators should ensure their analysis is driven by hypotheses, research questions and an analysis strategy prepared in advance
One proposal is that investigators should make public their data questionnaire, the analysis strategy, and other information required to replicate the analysis
Judgement and action
The data and interpretation are examined by those who need to make decisions
Interpretations, especially those which involve change that may threaten powerful interests, will be contested
Interpretation is a matter of judgement and judgement will depend on the prior values, beliefs and interests of the observer
Epidemiologists are not the sole arbiters of the theory and data.
Epidemiologists, however, have responsibilities for minimising the impact of their own biases and preventing the misinterpretation of data and recommendations by those with vested interests
Study population bias: generalisation
Much of epidemiology is concerned with population subgroups and comparisons between them
The interpretation rests on the assumption that the results apply, at least, to the whole group as originally chosen if not the whole population
Error arises in the inappropriate generalisation of study data to another population
Controlling errors and bias
Error control requires awareness and good scientific technique
Bias control needs equal attention to error control in all the population sub-groups
Error and bias cannot be fully controlled so the most important need is for systematic, cautious and critical interpretation of data
Conclusion
Bias is a central issue in epidemiology
When epidemiological data are applied to provide health advice to individuals and to shape public health policy, error and bias are especially important
I am not aware of an epidemiological theory on why error and bias occur
Social sciences research on the nature of science indicates that the scientific endeavour is not wholly objective but open to the influence of society and context
The framework provided by the chronology and structure of a research project offers a logical approach to analysis of bias and error
Conclusions
The main principles are:
develop research questions and hypotheses which benefit all the population and will not lead to harm
study a representative population
measure accurately and with equal care across comparison groups
compare like-with-like
check for the main findings in subgroups before assuming that inferences and generalisations apply across all groups
findings of a single study should rarely be accepted at face value
first consider artefact
a critical attitude is essential