Variation: role of error, bias and confounding

Raj Bhopal, Bruce and John Usher Professor of Public Health,

Public Health Sciences Section, Division of Community Health Sciences,

University of Edinburgh, Edinburgh

Educational objectives

On completion of your studies you should understand:

That error is crucially important in applied sciences based on free-living populations, such as epidemiology

Bias, considered as an error which affects comparison groups unequally, is particularly important in epidemiology

The major causes of error and bias in epidemiology can be analysed based on the chronology of a research project

Bias in posing the research question, stating hypotheses and choosing the study population are relatively neglected but important topics in epidemiology

Educational objectives

Errors and bias in data interpretation and publication are particularly important in epidemiology because of its health policy and health care applications

Confounding is the mis-measurement of the relationship between a risk factor and disease and arises in comparisons of groups which differ in ways that affect disease

Different epidemiological study designs share most of the problems of error and bias

Exercise: Error and bias

Reflect on the words error and bias. What is the difference, if any, between error and bias?

Why might error and bias be particularly common and important in epidemiology?

Error

An error is by definition an act, an assertion, or a belief that deviates from what is right, but what is right?

The true length of a metre is arbitrarily decided by agreeing a definition

The difference between a "correct" metre stick and an erroneous one can be accurately measured

For health and disease the truth is usually unknown and cannot be defined in the way we define a metre

Error should be considered as an inevitable and important part of human endeavour

The Popperian view is that science progresses by the rejection of hypotheses (by falsification) rather than the establishing of so-called truths (by verification)

Bias

A preference or an inclination

Bias may be intentional or unintentional

In statistics a bias is an error caused by systematically favouring some outcomes over others

Bias in epidemiology can be conceptualised as error which applies unequally to comparison groups.

Error and bias in biology

Biological research is difficult because of the complexity and variety of living things

Circadian and other natural rhythms cause change

Measurement techniques are usually limited by technology, cost or ethical considerations

Strict rules restrict what measurement is permissible ethically and what humans are willing to give their consent to

Experimental manipulation to test a hypothesis is usually done late

Figure 4.1

(a) Error is unequal in one of these groups leading to a false interpretation of the pattern of disease - falsely detecting differences

(b) Error is unequal in one of these groups leading to a false interpretation of the pattern of disease - here failure to detect differences

Error and bias in epidemiology

Error and bias in epidemiology focus on: (a) selection (of population), (b) information (collection, analysis and interpretation of data) and (c) confounding

Error and bias are also inherent in the process of developing research questions and hypotheses but are seldom discussed

Are questions of sex or racial differences in intelligence, disease, physiology or health biased questions?

The research question, theme or hypothesis

Science is done by human beings who often have strong ideas and views

They share in the social values and beliefs of their era such as class, racial and sexual prejudice

The question "Are men more intelligent (or healthy) than women?" could be considered a biased question

Research question

Apparently the neutral hypothesis here would be that there are no gender differences in intelligence

The underlying values of the researchers may be that men are more intelligent than women

These values are likely to be revealed at the analysis and interpretation stage by biased interpretation

It is problematic to describe difference without conveying a sense of superiority and inferiority

The research question

The Syphilis Study of the US Public Health Service followed up 600 African American men for some 40 years

The question: does syphilis have different and, particularly, less serious outcomes in African Americans than in European-origin Americans?

Investigators denied the study subjects treatment even when it was available and curative (penicillin)

Choice of population

Known as selection bias

Volunteers are a popular choice

Volunteers tend to be different in their attitudes, behaviours and health status compared to those who do not volunteer

Men have been more often selected than women

Investigators are prone to exclude individuals and populations for reasons of convenience, cost or preference rather than for neutral, scientific reasons

Selection bias

Selection bias is inevitable, simply because investigators need to make choices

Captive populations are popular; some may be fairly representative, e.g. schoolchildren, others not at all, e.g. university students

People are also missed either inadvertently or because they actively do not participate

Selection bias matters much more in epidemiology than in biologically based medical sciences.

Biological factors are usually generalisable between individuals and populations, so there is a prior presumption of generalisability

If an anatomist describes the presence of a particular muscle, or cell type, based on one human being it is likely to be present in all human beings (and possibly all mammals)

Non-participation

Some subjects chosen for a study do not participate, causing selection bias

The non-response in good studies is typically 30%-40%

Non-responders differ from those who respond

The problem is compounded when the non-response differs greatly in two populations that are to be compared

The effect may be understood if some information is available on those not participating, e.g. their age, sex, social circumstances and why they refused

Non-response bias is an intrinsic limitation of the survey method and hence of epidemiology

Figure 4.2: Study population, ignored population and comparison population. Sources of bias shown: ignoring populations; questions harming one population; measuring unequally; generalising from unrepresentative populations.

Comparing risk factor-disease outcome relationships in populations which differ (confounding)

Confounding is a difficult idea to explain and grasp

It is the error in the measure of association between a specific risk factor and disease outcome, which arises when there are differences in the comparison populations other than the risk factor under study

Confounding is derived from a Latin word meaning to mix up, a useful idea, for confounding mixes up causal and non-causal relationships

The potential for it to occur is there whenever the cardinal rule “compare like-with-like” is broken

Exercise: Confounding

Imagine that a study follows up people who drink alcohol and observes the occurrence of lung cancer

A group of people who do not drink and are of the same age and sex provide the comparison group

The study finds that lung cancer is more common in alcohol drinkers, i.e. there is an association between alcohol consumption and lung cancer.

Did alcohol cause lung cancer?

Confounding

In what other important ways might the study (alcohol drinking) and comparison (no alcohol drinking) populations be different?

Could the association between alcohol and lung cancer be confounded?

What might be the confounding variable?

The first key analysis in all epidemiological studies is to compare the characteristics of the populations under study

Examples of confounding

The confounded association: (a) People who drink alcohol have a raised risk of lung cancer

One possible explanation: Alcohol drinking and smoking are behaviours which go together

The confounded factor: Alcohol, which is a marker for, on average, smoking more cigarettes

The confounding (causal) factor: Tobacco, which is associated with both alcohol and with the disease

To check the assumption: See if the alcohol-lung cancer relationship holds in people not exposed to tobacco: if yes, tobacco is not a confounder (stratified analysis, chapter 7).
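The check described for the alcohol example, looking for the association separately in people exposed and not exposed to tobacco, can be sketched in Python as a stratified analysis. This is only an illustration: the counts below are invented (they do not come from any study), and the names `cohort`, `risk`, `relative_risk` and `pooled` are assumptions made for this sketch.

```python
# Hypothetical cohort counts, invented purely for illustration.
# Each entry: (lung cancer cases, people in the group).
cohort = {
    "smokers":     {"drinkers": (80, 800), "non_drinkers": (20, 200)},
    "non_smokers": {"drinkers": (2, 200),  "non_drinkers": (8, 800)},
}


def risk(cases, total):
    """Proportion of the group that developed the disease."""
    return cases / total


def relative_risk(exposed, unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return risk(*exposed) / risk(*unexposed)


def pooled(group):
    """Add up cases and people for one exposure group across all strata."""
    cases = sum(stratum[group][0] for stratum in cohort.values())
    total = sum(stratum[group][1] for stratum in cohort.values())
    return cases, total


# Crude comparison: smoking status is ignored, so like is not compared with like.
crude_rr = relative_risk(pooled("drinkers"), pooled("non_drinkers"))

# Stratified comparison: drinkers versus non-drinkers within each smoking stratum.
stratum_rr = {
    name: relative_risk(s["drinkers"], s["non_drinkers"])
    for name, s in cohort.items()
}

print(f"Crude relative risk: {crude_rr:.2f}")
for name, rr in stratum_rr.items():
    print(f"Relative risk among {name}: {rr:.2f}")
```

In these invented data the crude relative risk is well above 1, yet within each smoking stratum drinkers and non-drinkers have the same risk. That is the pattern expected when the alcohol-lung cancer association is produced by smoking rather than by alcohol.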

Figure 4.3: The apparent but spurious risk factor has a statistical but not causal association with the disease. The apparent risk factor is associated with the causal factor (the true cause and confounding variable), which is one of the causes of the disease.

Figure 4.4: Alcohol drinking, smoking and lung cancer. Alcohol is statistically but not causally linked to lung cancer. Smoking is associated with the apparent risk factor alcohol, and vice versa. Smoking causes lung cancer.

Possible actions to control confounding

Study design: Randomise individual subjects or units of populations, e.g. schools

Study design: Select comparable groups / restrict entry into study

Study design: Match individuals or whole populations

Analysis: Analyse subgroups separately

Analysis: Adjust data statistically
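As one illustration of "adjust data statistically", the sketch below pools stratum-specific results into a single adjusted relative risk using the Mantel-Haenszel risk ratio. The slides do not name a method, so this choice is an assumption, as are the function names and the invented counts (drinkers versus non-drinkers within smokers and non-smokers).

```python
# Hypothetical strata, invented for illustration only.
# Each tuple: (exposed cases, exposed total, unexposed cases, unexposed total).
strata = [
    (80, 800, 20, 200),  # smokers: drinkers versus non-drinkers
    (2, 200, 8, 800),    # non-smokers: drinkers versus non-drinkers
]


def crude_risk_ratio(strata):
    """Risk ratio after collapsing all strata into a single table."""
    exposed_cases = sum(a for a, _, _, _ in strata)
    exposed_total = sum(n1 for _, n1, _, _ in strata)
    unexposed_cases = sum(c for _, _, c, _ in strata)
    unexposed_total = sum(n0 for _, _, _, n0 in strata)
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel pooled risk ratio across strata."""
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n0 in strata:
        n = n1 + n0                   # everyone in this stratum
        numerator += a * n0 / n       # exposed cases weighted by unexposed size
        denominator += c * n1 / n     # unexposed cases weighted by exposed size
    return numerator / denominator


print(f"Crude risk ratio:    {crude_risk_ratio(strata):.2f}")
print(f"Adjusted risk ratio: {mantel_haenszel_risk_ratio(strata):.2f}")
```

With these invented counts the crude ratio suggests a strong association, while the adjusted ratio shows none once smokers are compared with smokers and non-smokers with non-smokers.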

Measurement errors in epidemiology

Information bias

Why are measurement errors in epidemiology likely to be more common and more important than in other scientific disciplines, say physics, anatomy, biochemistry or animal physiology?

Assessing the presence of disease in living human beings requires a judgement

Measuring socio-economic circumstances, ethnic group, cigarette smoking habits or alcohol consumption is a complex matter

These errors are life-and-death matters, even in epidemiological research

Measurement errors

Past exposures will need to be estimated, sometimes from contemporary measures

Biological variation needs to be taken into account, e.g. blood pressure varies from moment to moment in response to physiological needs related to activity, in a 24 hour (circadian) cycle with lowered pressure in the night, and with the ambient temperature

Some variables have natural variation so great that making estimates is extremely difficult, for example, in diet, alcohol consumption, and the level of stress

Machine imprecision is also inevitable

Inaccurate observation by the investigator or diagnostician

Measurement errors and bias

Measurement errors which occur unequally in the comparison populations:

- are differential misclassification errors or biases
- are likely to irreversibly destroy a study
- will increase the strength of the association in error

Non-differential errors or biases, occurring equally in both comparison populations, are much more likely to occur

Misclassification bias

Misclassification error (or bias) occurs when a person is put into the wrong category (or population sub-group), usually as a result of faulty measurement

Some people who are hypertensive will be misclassified as normal

Some who are not hypertensive will be misclassified as hypertensive

The end result in terms of the prevalence of hypertension may be about right

The degree to which a measure leads to a correct classification can be quantified using the concepts of sensitivity and specificity - and these are discussed in relation to screening tests

In measuring the strength of association between exposures and disease outcomes non-differential misclassification error has an important and not always predictable effect
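The hypertension example can be made concrete with a short Python sketch. The counts are invented for illustration (1,000 people, 200 of them truly hypertensive) and the variable names are assumptions. They are chosen so that the two kinds of misclassification cancel out, which is how the measured prevalence can be "about right" even though individuals are put in the wrong category.

```python
# Hypothetical counts for 1,000 people, invented for illustration.
true_positive = 160   # hypertensive, classified as hypertensive
false_negative = 40   # hypertensive, misclassified as normal
false_positive = 40   # not hypertensive, misclassified as hypertensive
true_negative = 760   # not hypertensive, classified as normal

total = true_positive + false_negative + false_positive + true_negative

# Sensitivity: proportion of truly hypertensive people classified correctly.
sensitivity = true_positive / (true_positive + false_negative)

# Specificity: proportion of non-hypertensive people classified correctly.
specificity = true_negative / (true_negative + false_positive)

# Prevalence according to the truth and according to the measurement.
true_prevalence = (true_positive + false_negative) / total
measured_prevalence = (true_positive + false_positive) / total

# Number of individuals placed in the wrong category.
misclassified = false_negative + false_positive

print(f"Sensitivity: {sensitivity:.2f}")
print(f"Specificity: {specificity:.2f}")
print(f"True prevalence: {true_prevalence:.2f}")
print(f"Measured prevalence: {measured_prevalence:.2f}")
print(f"People misclassified: {misclassified}")
```

Here the measured and true prevalence agree, yet 80 people are in the wrong category, which matters as soon as the classification is used to compare disease outcomes between groups.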

Non differential misclassification error

Imagine a study of 20,000 women, 10,000 on the contraceptive pill and the rest not

Say that over 10 years 20% of those on the pill develop a cardiovascular disease compared to 10% of those not on the pill

The rate of disease in the oral contraceptive group is doubled (relative risk = 2)

Assume that misclassification in exposure occurs 10% of the time, so that 10% of women actually on the pill were classified as not on the pill, and 10% of those not on the pill were classified as on the pill

Imaginary study of cardiovascular outcome and pill use: no misclassification

True classification     Cardiovascular disease
of pill use status      Yes      No       Total
Yes                     2,000    8,000    10,000
No                      1,000    9,000    10,000
Total                   3,000    17,000   20,000

Pill and cardiovascular disease: 10% misclassification of pill use

Classification of pill use status                                           Cardiovascular disease
                                                                            Yes      No       Total
Yes, classified right (on the pill, so incidence rate is 20%)               1,800    7,200    9,000
Yes, classified wrong (actually not on the pill, so incidence rate is 10%)  100      900      1,000
Subtotal                                                                    1,900    8,100    10,000
No, classified right (not on the pill, so incidence rate is 10%)            900      8,100    9,000
No, classified wrong (actually on the pill, so incidence rate is 20%)       200      800      1,000
Subtotal                                                                    1,100    8,900    10,000
TOTAL                                                                       3,000    17,000   20,000

Misclassification: the pill

The risk of CVD in the "pill users group" with 10% misclassification is 1,900/10,000, and in the "not on the pill group" is 1,100/10,000, so the relative risk is (1,900/10,000) / (1,100/10,000) = 1.7

Misclassification will, inevitably, also arise in measurement of the disease outcome, further reducing the strength of the association

Generally, non-differential misclassification bias lowers the relative risk.

This general principle may break down when misclassification occurs in confounding variables as well
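The arithmetic of the pill example can be reproduced with a short Python sketch. The group sizes, risks and 10% misclassification rate come from the slides; the function and parameter names are assumptions made for this illustration, and the extra rates in the loop (20% and 30%) are hypothetical values added only to show the trend.

```python
def observed_relative_risk(n_exposed, n_unexposed, risk_exposed,
                           risk_unexposed, misclassified):
    """Relative risk seen when the fraction `misclassified` of each exposure
    group is recorded in the opposite group (non-differential error)."""
    right = 1 - misclassified

    # Recorded as exposed: truly exposed classified right, plus
    # truly unexposed classified wrong.
    recorded_exposed = n_exposed * right + n_unexposed * misclassified
    cases_exposed = (n_exposed * right * risk_exposed
                     + n_unexposed * misclassified * risk_unexposed)

    # Recorded as unexposed: truly unexposed classified right, plus
    # truly exposed classified wrong.
    recorded_unexposed = n_unexposed * right + n_exposed * misclassified
    cases_unexposed = (n_unexposed * right * risk_unexposed
                       + n_exposed * misclassified * risk_exposed)

    return ((cases_exposed / recorded_exposed)
            / (cases_unexposed / recorded_unexposed))


# 10,000 women on the pill (20% develop CVD), 10,000 not (10% develop CVD).
for rate in (0.0, 0.1, 0.2, 0.3):
    rr = observed_relative_risk(10_000, 10_000, 0.20, 0.10, rate)
    print(f"Misclassification {rate:.0%}: relative risk {rr:.2f}")
```

With no misclassification the relative risk is 2, with 10% it falls to about 1.7 as in the table, and it keeps moving towards 1 as the misclassification rate rises.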

Analysis and interpretation

Usually the potential for data analysis is far greater than that actually done

The choices will be informed by the prior interests (and biases) and expertise of the researcher

External scrutiny at an early stage by objective advisors of the research protocol could reduce such biases

Inclusion of objective, uninvolved people in the research team at the data analysis and interpretation stage is possible but unusual, so investigators should ensure their analysis is driven by hypotheses, research questions and an analysis strategy prepared in advance

One proposal is that investigators should make public their data questionnaire, the analysis strategy, and other information required to replicate the analysis

Judgement and action

The data and interpretation are examined by those who need to make decisions

Interpretations, especially those which involve change that may threaten powerful interests, will be contested.

Interpretation is a matter of judgement and judgement will depend on the prior values, beliefs and interests of the observer

Epidemiologists are not the sole arbiters of the theory and data.

Epidemiologists, however, have responsibilities for minimising the impact of their own biases and preventing the misinterpretation of data and recommendations by those with vested interests

Study population bias: generalisation

Much of epidemiology is concerned with population subgroups and comparisons between them

The interpretation rests on the assumption that the results apply, at least, to the whole group as originally chosen if not the whole population

Error arises in the inappropriate generalisation of study data to another population

Controlling errors and bias

Error control requires awareness and good scientific technique

Bias control needs equal attention to error control in all the population sub-groups

Error and bias cannot be fully controlled so the most important need is for systematic, cautious and critical interpretation of data

Conclusion

Bias is a central issue in epidemiology

When epidemiological data are applied to provide health advice to individuals and to shape public health policy, error and bias are especially important

I am not aware of an epidemiological theory on why error and bias occur

Social sciences research on the nature of science indicates that the scientific endeavour is not wholly objective but open to the influence of society and context

The framework provided by the chronology and structure of a research project offers a logical approach to analysis of bias and error

Conclusions

The main principles are:

develop research questions and hypotheses which benefit all the population and will not lead to harm

study a representative population

measure accurately and with equal care across comparison groups

compare like-with-like

check for the main findings in subgroups before assuming that inferences and generalisations apply across all groups

findings of a single study should rarely be accepted at face value

first consider artefact

a critical attitude is essential
