Nutritional Epidemiology
Lecture 3 of 4
Dr. Sarah McMullen
Room 37, North Lab
Epidemiological studies
What do we need to measure?
– Exposure of interest
– Outcome of interest
– The relationship between exposure and outcome
Learning objectives
• To understand the role of hypothesis testing and confidence intervals in assessing the significance of observed associations
• To be able to recognise and interpret significant diet-disease associations in the types of analyses commonly presented in the literature
Evaluating Associations
If we observe an exposure/disease association, we must consider
– Is the association valid? Do the study findings reflect the true relationship between the exposure and disease? Or do they reflect chance, bias or confounding?
– Is the association causal? Is there sufficient evidence to infer that a causal association exists between the exposure and the disease?
Chance
To understand why chance could be involved it is important to consider sampling error
– Rarely can a whole population be studied
– Instead, a SAMPLE of the population is studied
– The observations of the sample provide an ESTIMATE of what would be observed in the true population
– Variation will always exist between random samples from the same population – sampling error
– BY CHANCE, an association may be observed due to sampling error alone
Chance
[Figure: mean cholesterol concentrations (mmol/l) plotted against sample number (1 to 18)]
Plasma cholesterol concentrations were measured in two groups of students sampled from a population of students.
Do the mean values for samples 1 and 2 appear different?
Chance
[Figure: mean cholesterol concentrations (mmol/l) plotted against sample number (1 to 18)]
NO! If we look at the mean values for repeated random samples of students from the population, we can see that the difference between the mean values of samples 1 and 2 could have occurred by chance, because the variation between samples is large.
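The repeated-sampling idea on this slide can be sketched in a few lines of Python. This is an illustration only: the population (5000 values, mean 5.0 mmol/l, SD 1.0) and the sample size of 10 are assumptions, not data from the lecture.

```python
import random
import statistics

def sample_means(population, n_samples=18, sample_size=10, seed=1):
    """Draw repeated random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [statistics.mean(rng.sample(population, sample_size))
            for _ in range(n_samples)]

# Hypothetical population of plasma cholesterol values (mmol/l).
rng = random.Random(0)
population = [rng.gauss(5.0, 1.0) for _ in range(5000)]

means = sample_means(population)
for i, m in enumerate(means, start=1):
    print(f"sample {i:2d}: mean = {m:.2f} mmol/l")
print(f"range of sample means: {min(means):.2f} to {max(means):.2f}")
```

Every sample comes from the same population, so any difference between two of the printed means is sampling error alone.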
Chance
In practice, we cannot take repeat samples
An estimate of sampling error is required to determine whether the observation is accounted for by chance or not
[Figure: mean cholesterol concentrations (mmol/l) plotted against sample number (1 to 18)]
An estimate of sampling error is calculated from
– Standard deviation of data from group
– Number of subjects in group
This estimate is expressed as
– Standard error of the mean
– Confidence intervals
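As a minimal sketch of that calculation: the standard error of the mean is the standard deviation divided by the square root of the number of subjects, and an approximate 95 % confidence interval is the mean plus or minus 1.96 standard errors. The data are hypothetical, and the 1.96 multiplier is a normal approximation (for a group this small a t-distribution multiplier would be somewhat larger).

```python
import math
import statistics

def sem_and_ci(values, z=1.96):
    """Return (mean, SEM, (lower, upper)) using a normal approximation."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample standard deviation
    sem = sd / math.sqrt(n)         # standard error of the mean
    return mean, sem, (mean - z * sem, mean + z * sem)

# Hypothetical cholesterol values (mmol/l) for one group of students.
group = [4.8, 5.3, 5.1, 4.6, 5.9, 5.0, 4.7, 5.4, 5.2, 4.9]
mean, sem, (lower, upper) = sem_and_ci(group)
print(f"mean = {mean:.2f}, SEM = {sem:.3f}, 95% CI = {lower:.2f} to {upper:.2f}")
```

Because the SEM divides by the square root of n, a larger group gives a smaller SEM and a narrower interval for the same standard deviation.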
Chance
Statistical tools must be used to estimate sampling error and determine how likely it is that the observation occurred by chance
– Hypothesis testing and P-values
– Estimation and Confidence intervals
Hypothesis testing
P-value
– Probability that a result at least as extreme as the one observed would have occurred by chance alone (i.e. if the null hypothesis were true)
– Usual threshold for significance ~ P = 0.05
– If there were truly no difference and the study was repeated 100 times, a significant difference would be expected by chance about 5 times
– The bigger the sample size, the smaller the difference needs to be to reach statistical significance
– But is it biologically significant?
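The "5 times in 100" interpretation can be checked with a small simulation, sketched here under stated assumptions: both groups are drawn from the same hypothetical population (mean 5.0, SD 1.0, 50 per group), and the P-value uses a normal approximation rather than any particular test named in the lecture.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def two_sample_p(a, b):
    """Two-sided P-value for a difference in means (normal approximation)."""
    se = math.sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rate(n_studies=2000, n_per_group=50, seed=42):
    """Fraction of studies with no true difference that reach P < 0.05."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        a = [rng.gauss(5.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(5.0, 1.0) for _ in range(n_per_group)]
        if two_sample_p(a, b) < 0.05:
            hits += 1
    return hits / n_studies

rate = false_positive_rate()
print(f"null studies significant at P < 0.05: {rate:.1%}")
```

The printed fraction should sit close to 5 %, which is what the threshold means: a significant result still arises by chance in roughly 1 in 20 studies when no real difference exists.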
Comparing means
[Figure: mean cholesterol concentrations (mmol/l) in the Control and HF diet groups]
Figure 1. Mean circulating cholesterol concentrations in adult Wistar rats fed a control (n=12) or high fat (HF, n=12) diet for a period of 6 weeks. P<0.05.
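The figure legend does not say which test produced P<0.05. As one possible illustration, the sketch below compares two group means with a permutation test, chosen here only because it needs nothing beyond the Python standard library; a t-test would be the more conventional choice. The cholesterol values are hypothetical, not the rat data from the figure.

```python
import random
from statistics import mean

def permutation_p(a, b, n_perm=5000, seed=0):
    """Two-sided permutation P-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign values to groups at random
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)

# Hypothetical cholesterol concentrations (mmol/l), n=12 per group.
control = [1.9, 2.1, 2.0, 2.3, 1.8, 2.2, 2.0, 1.7, 2.1, 2.4, 1.9, 2.0]
high_fat = [2.5, 2.8, 2.4, 2.9, 2.6, 2.7, 2.3, 3.0, 2.6, 2.5, 2.8, 2.7]

p = permutation_p(control, high_fat)
print(f"difference in means = {mean(high_fat) - mean(control):.2f} mmol/l, P = {p:.4f}")
```

The P-value is the share of random regroupings that give a difference at least as large as the observed one, which is the "occurred by chance alone" idea in direct form.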
Correlation
Examines strength of association between exposure and outcome
Correlation coefficient (r)
– Describes strength of the association
– +1 perfect positive
– 0 no association
– -1 perfect negative
[Figure: incidence of colon cancer (cases per 100,000 in 2002) plotted against average consumption of meat (g/person/day); r = +0.23]
Correlation
[Figure: incidence of colon cancer (cases per 100,000 in 2002) plotted against average consumption of meat (g/person/day); r = +0.89]
Correlation
[Figure: incidence of colon cancer (cases per 100,000 in 2002) plotted against average consumption of meat (g/person/day); r = -0.89]
r² = 79 %
– 79 % of variation in one variable accounted for by the other
p = 0.02 (p<0.05)
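A minimal sketch of how r and r² are obtained, assuming Pearson's correlation coefficient (the slides do not name the coefficient). The meat consumption and incidence values are hypothetical and are not the data plotted in the figures.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical country-level data.
meat = [50, 80, 120, 150, 200, 240, 280]   # g/person/day
colon_cancer = [8, 14, 12, 22, 25, 31, 30]  # cases per 100,000

r = pearson_r(meat, colon_cancer)
print(f"r = {r:+.2f}, r squared = {r * r:.0%}")
```

Squaring removes the sign, so r = +0.89 and r = -0.89 both give r² of about 79 %.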
Regression
How much does the outcome change for a given change in exposure?
[Figure: incidence of colon cancer (cases per 100,000 in 2002) plotted against average consumption of meat (g/person/day), with a fitted regression line; r = +0.96, p = 0.02 (p<0.05)]
Plots a regression line through the data
Linear regression
– Assesses the effect of one predictor variable
Multiple regression
– Includes more than one predictor variable
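For a single predictor, the regression line can be sketched with ordinary least squares. The slope answers the question on the slide: how much the outcome changes per unit change in exposure. The data below are hypothetical and the function name `fit_line` is introduced only for this illustration.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

# Hypothetical country-level data.
meat = [50, 80, 120, 150, 200, 240, 280]   # g/person/day
colon_cancer = [8, 14, 12, 22, 25, 31, 30]  # cases per 100,000

slope, intercept = fit_line(meat, colon_cancer)
print(f"incidence = {intercept:.2f} + {slope:.3f} x meat consumption")
print(f"each extra 100 g/person/day goes with {100 * slope:.1f} more cases per 100,000")
```

Multiple regression extends the same idea to several predictors at once and is not covered by this sketch.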
Relative Risk (RR) – Example 1
Relative risk (or rate ratio): incidence in the exposed group divided by the incidence in the unexposed group

                   With outcome   Without outcome
Exposed (n=11)     9              2
Unexposed (n=14)   3              11

Incidence in exposed group = 9/11 = 0.818
Incidence in unexposed group = 3/14 = 0.214
RR = 3.82
Relative Risk (RR) – Example 2
Relative risk (or rate ratio): incidence in the exposed group divided by the incidence in the unexposed group

                   With outcome   Without outcome
Exposed (n=11)     3              8
Unexposed (n=14)   8              6

Incidence in exposed group = 3/11 = 0.273
Incidence in unexposed group = 8/14 = 0.571
RR = 0.477
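Both worked examples can be reproduced with a short function. This is a sketch; the function name and argument names are introduced here for illustration only.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Incidence in the exposed group divided by incidence in the unexposed group."""
    incidence_exposed = exposed_cases / exposed_total
    incidence_unexposed = unexposed_cases / unexposed_total
    return incidence_exposed / incidence_unexposed

rr1 = relative_risk(9, 11, 3, 14)   # Example 1
rr2 = relative_risk(3, 11, 8, 14)   # Example 2
print(f"Example 1: RR = {rr1:.2f}")
print(f"Example 2: RR = {rr2:.3f}")
```

Working from the unrounded incidences gives 3.82 and 0.477. Rounding the incidences to three decimal places before dividing shifts the second result to 0.478, so it is safer to round only at the end.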
Relative risk / Odds ratio
– Unexposed/control group: RR/OR = 1
– If the exposure increases risk: RR/OR > 1
– If the exposure is protective: RR/OR < 1
But is the risk of outcome significantly increased or decreased?
[Figure: CONTROL and INTERVENTION groups plotted on a relative risk axis]
Confidence intervals
– Tell us the range within which the actual population value is likely to lie
– Based on estimates of sampling error
Commonly used in cohort and case-control studies
– Risk in unexposed or control group set at 1
– Risk in exposed or case group calculated (relative risk or odds ratio)
– Confidence interval for exposed or case group calculated
Confidence intervals
95 % confidence interval
– 95 % confident that the true population value (e.g. mean, RR or OR) lies within the given range
– If the range of the confidence interval for the exposed/case group includes 1, there is no significant difference between the groups (the observed difference could be due to sampling error)
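The slides do not say how the interval for a relative risk is computed. One common approximation, assumed here, works on the log scale: the standard error of ln(RR) is sqrt(1/a - 1/n1 + 1/c - 1/n0), where a of n1 exposed and c of n0 unexposed subjects have the outcome. The sketch applies it to the counts from the two relative risk examples; with groups this small the approximation is rough.

```python
import math

def rr_with_ci(a, n1, c, n0, z=1.96):
    """Relative risk with an approximate 95% CI from the log-RR standard error.

    a of n1 exposed subjects and c of n0 unexposed subjects have the outcome.
    """
    rr = (a / n1) / (c / n0)
    se_log = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

for label, counts in [("Example 1", (9, 11, 3, 14)), ("Example 2", (3, 11, 8, 14))]:
    rr, lower, upper = rr_with_ci(*counts)
    verdict = "includes 1" if lower <= 1 <= upper else "excludes 1"
    print(f"{label}: RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f} ({verdict})")
```

Under this approximation the interval for Example 1 lies wholly above 1, while the interval for Example 2 spans 1, so only the first would be called a significant difference.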
Confidence intervals
[Figure: CONTROL and INTERVENTION groups plotted on a relative risk axis]
Relative Risk
Cohort study: relative risk of post-menopausal breast cancer in women grouped by body weight at 18 years of age. Adjusted for BMI and WHR. Sellers, 2002

Weight (pounds)   Cases   Total no. of women   RR
<128              206     5763                 1
129-140           236     5701                 1.17
141-155           308     6107                 1.45
156-174           283     5274                 1.56
>174              335     5754                 1.83

Risk of breast cancer increases with increasing body weight (as shown by increasing risk relative to those <128 pounds)
But at which weights is the RR significantly higher?
Relative Risk

Weight (pounds)   Cases   Total no. of women   RR     95 % confidence interval   Significantly increased?
<128              206     5763                 1      n/a                        n/a
129-140           236     5701                 1.17   0.97 - 1.42                No
141-155           308     6107                 1.45   1.21 - 1.75                Yes
156-174           283     5274                 1.56   1.28 - 1.90                Yes
>174              335     5754                 1.83   1.49 - 2.24                Yes
P-value for trend: P<0.001

RR significantly increased in groups weighing >140 pounds (95 % confidence interval excludes 1)
The association between weight and RR is also assessed by correlation analysis (P-value for trend), which tests for a dose-response relationship
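The "does the interval include 1?" rule can be applied to the table mechanically. The sketch below copies the RR values and 95 % confidence limits from the table; the helper name `ci_excludes_one` is introduced only for this illustration.

```python
def ci_excludes_one(lower, upper):
    """True when the whole interval lies above 1 or below 1."""
    return lower > 1 or upper < 1

# (weight group in pounds, RR, lower 95% limit, upper 95% limit), from the table.
table = [
    ("129-140", 1.17, 0.97, 1.42),
    ("141-155", 1.45, 1.21, 1.75),
    ("156-174", 1.56, 1.28, 1.90),
    (">174", 1.83, 1.49, 2.24),
]

verdicts = {group: ci_excludes_one(lo, hi) for group, rr, lo, hi in table}
for group, rr, lo, hi in table:
    label = "significant" if verdicts[group] else "not significant"
    print(f"{group:>8}: RR {rr:.2f} ({lo:.2f} to {hi:.2f}) -> {label}")
```

Only the 129-140 pound group has an interval that spans 1. The same check works for odds ratios, where an interval lying wholly below 1 indicates a significant reduction.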
Odds Ratio
Case-control study: effect of physical activity level on risk of breast cancer. Adjusted OR. Gilliland, 2002

MET h/week   OR     95 % confidence interval   Significantly decreased?
0-25         1      n/a                        n/a
25-<50       1.17   0.53 - 2.55                No
50-<80       0.49   0.22 - 1.07                No
>80          0.29   0.12 - 0.72                Yes
P-value for trend: P<0.001

Confidence intervals are similarly calculated for odds ratios
Must calculate the 95 % confidence interval
– 95 % certain that the population value lies within the range given
– If the range includes 1, the OR is not significantly increased or decreased
In this case risk is reduced (OR < 1) by increased exercise; the reduction is significant only in the >80 MET h/week group, where the confidence interval excludes 1
Odds Ratio
[Figure: OR (95 % CI) plotted on an odds ratio axis]
It is important to consider
– The effect size
– The range of the confidence interval
– The significance of the effect
– What does each of these tell us about the data?
Study size and power
Too few experimental units
– Poor estimate of true population mean
– High standard error of mean/wide confidence interval
– Difficult to show statistical significance
Power
– Probability of being able to demonstrate a statistically significant finding, should one exist
A successful study must be adequately powered
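Power can be estimated by simulation, as sketched below. The assumptions are all illustrative: a true difference of 0.5 mmol/l between groups, SD of 1.0, a normal-approximation test at P < 0.05 (which is somewhat optimistic for very small groups), and 500 simulated studies per group size.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def estimated_power(n_per_group, true_diff=0.5, sd=1.0, n_studies=500, seed=7):
    """Fraction of simulated studies that detect a real difference at P < 0.05."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        a = [rng.gauss(5.0, sd) for _ in range(n_per_group)]
        b = [rng.gauss(5.0 + true_diff, sd) for _ in range(n_per_group)]
        se = math.sqrt(stdev(a) ** 2 / n_per_group + stdev(b) ** 2 / n_per_group)
        z = (mean(b) - mean(a)) / se
        p = 2 * (1 - NormalDist().cdf(abs(z)))
        if p < 0.05:
            hits += 1
    return hits / n_studies

powers = {n: estimated_power(n) for n in (10, 40, 100)}
for n, power in powers.items():
    print(f"n = {n:3d} per group: estimated power = {power:.0%}")
```

The same real difference is detected far more often as the group size grows, which is why a study with too few experimental units is likely to miss an effect that exists.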