Date post: | 04-Jan-2016 |
Upload: | randolf-cain |
MBP1010 – Lecture 8: March 1, 2011
1. Odds Ratio/Relative Risk
• Logistic Regression
• Survival Analysis
Reading: papers on OR and survival analysis (Resources); Ch 10 Multifactorial Analyses
Assignment 3
Due: March 8
- solutions will be posted after the due date
- but marks will likely not be available prior to the exam
Observational Studies with Binary Outcomes
- case/control and cohort studies - common in cancer research
  Outcome: cancer/no cancer, dead/alive
- cross-sectional studies - classify subjects into categories of 2 binary variables
Case Control Study
[Figure: subjects marked X and 0, classified by exposure (eg diet)]
Measure of risk: odds ratio (OR)
Cohort Study
[Figure: subjects marked X and 0, classified by exposure (eg diet) and followed for cancer (yes/no)]
Measure of risk: RR or OR
Cross-sectional Study
• Subjects NOT selected on exposure or outcome
• Classify subjects into exposure and outcome
• OR or RR can be used to describe association with binary outcome
Observational Studies with Binary Outcomes
- case/control, cohort studies, cross-sectional studies
Ways to examine association:
• chi square test for association (2 x 2 contingency table)
  • X2
• odds ratio (OR) or relative risk (RR)
  • X2, magnitude of risk and CI
• logistic regression
  • X2, magnitude of risk, CI and can include other variables of interest
Relative Risk: Prospective Cohort Studies
RR = p1/p2
p1 = probability of disease for exposed individuals
p2 = probability of disease for unexposed individuals
RR = 1.0: no association
RR = 1.4: 1.4 times the risk (40% higher risk)
RR = 0.8: 0.8 times the risk (20% lower risk)
MDM2 protein expression and breast cancer prognosis - cohort study
- women with invasive breast cancer at BCCA
- TMA stained for MDM2 protein expression
- data on outcome (dead/alive) available
Turbin et al, Modern Pathology 2006
MDM2 protein expression and breast cancer prognosis
Prospective Cohort Study
p1 = 28/49 = 0.57
p2 = 94/313 = 0.30
X2 = 12.75, df = 1, p-value < 0.01
RR = (28/49)/(94/313) = 1.90
Women with MDM2 protein expression were at 1.9 times the risk of dying from breast cancer compared to women without MDM2 protein expression (p<0.01).
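The relative risk arithmetic above can be checked with a few lines of Python. This is a minimal sketch; the function name `relative_risk` is illustrative and not part of the lecture.

```python
def relative_risk(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Return (p1, p2, RR) for a cohort 2 x 2 table."""
    p1 = exposed_events / exposed_total      # risk in the exposed group
    p2 = unexposed_events / unexposed_total  # risk in the unexposed group
    return p1, p2, p1 / p2

# MDM2 cohort counts quoted on the slide: 28 of 49 vs 94 of 313 died
p1, p2, rr = relative_risk(28, 49, 94, 313)
print(f"p1 = {p1:.2f}, p2 = {p2:.2f}, RR = {rr:.2f}")
```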
(from lecture 4)
Case Control Study of Family History and Breast Cancer
- cases of breast cancer identified by cancer registry
- controls identified through provincial screening program
- data collected by questionnaire (after diagnosis in cases)
Case Control Study of Family History and Breast Cancer
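2 x 2 Contingency Table (counts as used in the odds calculations later in the lecture)
                     Breast cancer   No breast cancer   Total
Family history            238              180           418
No family history         862              920          1782
Total                    1100             1100          2200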
Chi-square results with Yates’ continuity correction:
X2 = 9.60, df = 1, p-value = 0.00195 (< 0.01)
We conclude that there is a statistically significant association between first degree family history of breast cancer and breast cancer risk (p<0.01). 22% of women with breast cancer have a first degree family history of breast cancer compared to 16% of women without breast cancer.
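As a rough check of the chi-square result, the sketch below applies the standard Yates-corrected formula for a 2 x 2 table. The cell counts (238, 180, 862, 920) are taken from the odds ratio calculation later in the lecture, and the function name is illustrative. For df = 1 the upper-tail p-value can be written with the complementary error function, so only the standard library is needed.

```python
import math

def yates_chi_square(a, b, c, d):
    """Yates-corrected chi-square and p-value (df = 1) for a 2 x 2 table.

    a, b = exposed cases, exposed controls
    c, d = unexposed cases, unexposed controls
    """
    n = a + b + c + d
    # Continuity correction: shrink |ad - bc| by n/2 (never below zero)
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

chi2, p_value = yates_chi_square(238, 180, 862, 920)
print(f"X2 = {chi2:.2f}, p-value = {p_value:.5f}")

# Proportion with a family history among cases and among controls
print(f"cases: {238 / (238 + 862):.0%}, controls: {180 / (180 + 920):.0%}")
```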
Estimate of Risk from Case-Control Study
• we fixed the number of women with and without breast cancer
• we cannot estimate the probabilities of breast cancer in women with and without family history
- Relative Risk cannot be estimated
What can we do?
A day at the racetrack.....
Gamblers calculate their chances of winning using a term called the odds.
Suppose that the horse is a favourite and it is declared to have a 1 in 4 chance of winning [1 / (1 + 3)].
The gambler might say that the horse has odds of 1 to 3 of winning. However, gamblers are much more likely to say that the odds of the horse losing are 3 to 1.
A horse that is a longshot may have only a 1 in 50 chance of winning. On the tote board the gambler will read that it has 49 to 1 odds against winning.
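The racetrack numbers follow from treating odds as the ratio of the chance of winning to the chance of losing. A small sketch using exact fractions (the helper names are made up for illustration):

```python
from fractions import Fraction

def odds_in_favour(p):
    """Odds of the event happening: p / (1 - p)."""
    return p / (1 - p)

def odds_against(p):
    """Odds of the event not happening: (1 - p) / p."""
    return (1 - p) / p

favourite = Fraction(1, 4)   # 1 in 4 chance of winning
longshot = Fraction(1, 50)   # 1 in 50 chance of winning

print(odds_in_favour(favourite))  # 1/3, i.e. 1 to 3
print(odds_against(favourite))    # 3, i.e. 3 to 1 against
print(odds_against(longshot))     # 49, i.e. 49 to 1 against
```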
Estimate of risk: Odds Ratio
If the probability of an event = p, then:
The odds in favour of an event = p/(1-p)
• ratio of the probability that the event occurs to the probability that it does not
Odds Ratio:
OR = (odds in favour of disease for the exposed group) / (odds in favour of disease for the unexposed group)
odds = p/(1-p)
odds of breast cancer with FHX = (238/418) / (1 - 238/418) = 1.32
odds of breast cancer with no FHX = (862/1782) / (1 - 862/1782) = 0.94
OR = 1.32/0.94 = 1.41
Odds Ratio
Simple method for calculating OR:
Odds = ratio of the number of times the event occurs to the number of times it doesn't
OR = (a/b)/(c/d) = (238/180)/(862/920) = 1.41
Alternate equation: OR = (a*d)/(b*c) = (238*920)/(180*862) = 1.41
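The three routes to the odds ratio (from probabilities, from the ratio of counts, and from the cross product) give the same number. A short sketch with the family history counts; the function names are illustrative only.

```python
def odds_from_probability(p):
    """Odds in favour of an event with probability p."""
    return p / (1 - p)

def odds_ratio(a, b, c, d):
    """OR as (a/b)/(c/d): odds in the exposed over odds in the unexposed."""
    return (a / b) / (c / d)

def odds_ratio_cross_product(a, b, c, d):
    """OR as (a*d)/(b*c)."""
    return (a * d) / (b * c)

# a, b = cases and controls with FHX; c, d = cases and controls without FHX
a, b, c, d = 238, 180, 862, 920

odds_fhx = odds_from_probability(a / (a + b))     # about 1.32
odds_no_fhx = odds_from_probability(c / (c + d))  # about 0.94
or_from_p = odds_fhx / odds_no_fhx
or_counts = odds_ratio(a, b, c, d)
or_cross = odds_ratio_cross_product(a, b, c, d)

print(f"{or_from_p:.2f} {or_counts:.2f} {or_cross:.2f}")
```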
Confidence Interval for OR
- OR has a skewed distribution: limited at the lower end because it can't be negative, but not limited at the upper end
- log(OR), however, can take any value and has an approximately normal distribution
Calculate limits on log(OR) and then "exponentiate":
SE for ln(OR) = sqrt(1/a + 1/b + 1/c + 1/d) = sqrt(1/238 + 1/180 + 1/862 + 1/920) = 0.110
95% CI for ln(OR): ln(1.41) ± 1.96 x 0.110 = 0.344 ± 0.215 = 0.13 to 0.56
95% CI for OR: e^0.13 to e^0.56 = 1.14 to 1.75
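The interval can be recomputed directly from the four cell counts. This is a sketch of the log-scale (Wald) interval described on the slide, with an illustrative function name; it carries full precision rather than rounded intermediate values.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Return (OR, SE of ln(OR), lower limit, upper limit)."""
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(log_or - z * se)  # exponentiate the limits on ln(OR)
    upper = math.exp(log_or + z * se)
    return math.exp(log_or), se, lower, upper

or_est, se, lower, upper = odds_ratio_ci(238, 180, 862, 920)
print(f"OR = {or_est:.2f}, SE(ln OR) = {se:.4f}, 95% CI {lower:.2f} to {upper:.2f}")
```

The limits are symmetric around ln(OR) on the log scale, so the lower limit of about 1.14 sits closer to the OR than the upper limit of about 1.75.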
What is the interpretation of the OR?
Strictly speaking, the OR should be expressed in terms of "odds":
The odds of breast cancer in women with a family history are about 1.41 times those in women without a family history.
However, when the outcome is rare (as it generally is for cancer), the OR is approximately equal to the RR, and results are often expressed as risk (ie more or less likely to develop cancer).
OR is reversible
Disease Odds Ratio = (odds in favour of disease for the exposed group) / (odds in favour of disease for the unexposed group) = 1.41
Exposure Odds Ratio = (odds in favour of being exposed for diseased subjects) / (odds in favour of being exposed for non-diseased subjects) = 1.41
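Both ratios reduce to the same cross product (a*d)/(b*c), which is why the OR can be estimated from a case-control study. A quick check with the family history counts (function names are illustrative):

```python
import math

def disease_odds_ratio(a, b, c, d):
    """Odds of disease in the exposed (a/b) over odds in the unexposed (c/d)."""
    return (a / b) / (c / d)

def exposure_odds_ratio(a, b, c, d):
    """Odds of exposure in the diseased (a/c) over odds in the non-diseased (b/d)."""
    return (a / c) / (b / d)

# a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls
a, b, c, d = 238, 180, 862, 920
disease_or = disease_odds_ratio(a, b, c, d)
exposure_or = exposure_odds_ratio(a, b, c, d)
print(f"{disease_or:.2f} {exposure_or:.2f}")
```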
MDM2 protein expression and breast cancer prognosis
Prospective Cohort Study
p1 = 28/49 = 0.57
p2 = 94/313 = 0.30
X2 = 12.75, df = 1, p-value < 0.01
RR = (28/49)/(94/313) = 1.90
OR = (28*219)/(21*94) = 3.11
Proportion dying = 122/362 = 34% (the outcome is not rare, so the OR is not close to the RR)
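To see how the gap between OR and RR depends on how common the outcome is, the sketch below computes both for the MDM2 table and for a second table with a rare outcome. The rare-outcome counts are hypothetical and chosen only for illustration.

```python
def rr_and_or(a, b, c, d):
    """Return (RR, OR) for a cohort table.

    a, b = exposed with and without the outcome
    c, d = unexposed with and without the outcome
    """
    rr = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)
    return rr, odds_ratio

# MDM2 cohort: 28 of 49 and 94 of 313 died (about 34% overall)
rr_common, or_common = rr_and_or(28, 21, 94, 219)

# Hypothetical rare outcome: 20 of 10000 exposed vs 10 of 10000 unexposed
rr_rare, or_rare = rr_and_or(20, 9980, 10, 9990)

print(f"common outcome: RR = {rr_common:.2f}, OR = {or_common:.2f}")
print(f"rare outcome:   RR = {rr_rare:.2f}, OR = {or_rare:.2f}")
```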
Caution about Case/Control Studies
- "Recall" bias: subjects with disease may recall their exposures differently from controls
- Biological samples collected after diagnosis may be affected by presence of disease
- Selection of controls extremely important (different population?)
- Treatment of samples from cases and controls must be the same
- Posted paper (on website under resources): Sources of Bias in Specimens for Research about Molecular markers for cancer
Ransohoff, D. F. et al. J Clin Oncol; 28:698-704 2010. Copyright © American Society of Clinical Oncology
Fig 1. The fundamental comparison in experimental and observational study design
Nested Case-Control Study
Measure of risk: OR
[Figure: nested case-control design, with panels labelled: cohort; measure exposure; follow to identify cases; select cases & subset of controls]
Nested Case-Control Study
• Do a prospective cohort study
• Identify cases
• Select controls (randomly) from the cohort study
  - usually matched to case
  - followed same length of time as case
  - match on other characteristics (eg age, site etc)
• Perform measurements of exposure
• Analyze as case-control (Odds Ratio)
- Still requires a cohort study, but fewer measurements required
- Controls from same population as cases
- Measurements from baseline (no recall bias)
Logistic Regression
• a generalization of chi square to examine association of a binary variable with one or more independent variables (categorical or continuous)
• Logistic regression quantifies the relationship between a risk factor (or treatment) and a disease, after adjusting for other variables.
• Binary dependent variable: an event which is either present or absent ("success" or "failure")
• Goal is to examine factors associated with the probability of an event
• uses method of maximum likelihood rather than least squares
How does logistic regression work?
• Logistic regression finds an equation that predicts a binary outcome variable from one or more x variables.
Outcome = probability of disease (p)
p = β0 + β1X1 + β2X2 + …
But probabilities can only range from 0 to 1, and the right-hand side could be < 0 or > 1 for some values of X. Solution: use the logit transformation.
How does logistic regression work?
logit transformation: logit(p) = ln(p/(1-p))
The natural logarithm of the odds can take on any value (negative or positive).
Logistic Regression Model:
ln(Odds) = β0 + β1X1 + β2X2 + …
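The logit transformation and its inverse can be sketched in a few lines. The point is that any value of the linear predictor, however large or small, maps back to a probability strictly between 0 and 1. The function names and the sample values are illustrative.

```python
import math

def logit(p):
    """ln(p / (1 - p)): maps a probability in (0, 1) to any real number."""
    return math.log(p / (1 - p))

def inverse_logit(x):
    """Maps any real number back to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-x))

for p in (0.01, 0.5, 0.99):
    print(f"p = {p:<4}  logit(p) = {logit(p):+.3f}")

# Linear predictor values far outside [0, 1] still give valid probabilities
for x in (-5.0, 0.0, 5.0):
    print(f"ln(odds) = {x:+.1f}  p = {inverse_logit(x):.4f}")
```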
Logistic Regression family history example
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.06512    0.04740  -1.374  0.16953
fhx          0.34443    0.10956   3.144  0.00167
ln(Odds) = -0.065 + 0.344x
Intercept (β0): ln(odds) in baseline group (x = 0)
Slope (β1): difference in ln(odds) for a 1-unit increase in the x variable
To interpret, use the transformation:
Odds(FHX) / Odds(no FHX) = e^β1 = e^0.344 = 1.41
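With a single yes/no predictor, the maximum likelihood estimates have a closed form in the 2 x 2 counts, so the coefficient table can be reproduced without fitting software. This is a sketch under that assumption; it is not necessarily how the output above was produced, and the function names are made up. The counts are those used in the odds ratio calculation.

```python
import math

def fit_single_binary_predictor(a, b, c, d):
    """Closed-form logistic regression for one 0/1 predictor.

    a, b = cases and controls with x = 1; c, d = cases and controls with x = 0.
    Returns {term: (estimate, standard error)}.
    """
    beta0 = math.log(c / d)              # ln(odds) in the baseline group
    beta1 = math.log((a * d) / (b * c))  # ln(odds ratio)
    se0 = math.sqrt(1 / c + 1 / d)
    se1 = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return {"(Intercept)": (beta0, se0), "fhx": (beta1, se1)}

def two_sided_p(z):
    """Two-sided p-value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

fit = fit_single_binary_predictor(238, 180, 862, 920)
for term, (estimate, se) in fit.items():
    z = estimate / se
    print(f"{term:<12}{estimate:>9.5f}{se:>9.5f}{z:>8.3f}{two_sided_p(z):>9.5f}")
```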
Case Control Study of Family History and Breast Cancer
Since there are only 2 values for x (family history: yes/no):
For women with family history (x = 1): ln(Odds) = β0 + β1
For women with no family history (x = 0): ln(Odds) = β0
ln(Odds) = -0.065 + 0.344x
A little more detail on interpretation….
odds of breast cancer with FHX = (238/418) / (1 - 238/418) = 1.32
odds of breast cancer with no FHX = (862/1782) / (1 - 862/1782) = 0.94
OR = 1.32/0.94 = 1.41
Case Control Study of Family History and Breast Cancer
Since there are only 2 values for x (family history: yes/no):
For women with family history (x = 1): ln(Odds) = β0 + β1 = -0.065 + 0.344 = 0.279 = ln(1.32)
For women with no family history (x = 0): ln(Odds) = β0 = -0.065 = ln(0.94)
ln(Odds) = -0.065 + 0.344x
β1 = difference in ln(odds) between categories = ln(ratio of odds) = 0.279 - (-0.065) = 0.344
OR = 1.32/0.94 = 1.41; e^0.344 = 1.41
Multiple Logistic Regression – Family History Example
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.015944   0.372387  -0.043  0.96585
fhx          0.355486   0.109996   3.232  0.00123
age         -0.003756   0.004749  -0.791  0.42897
bmi          0.003721   0.010040   0.371  0.71092
HRT          0.204735   0.091312   2.242  0.02495
Note: z test used for coefficients. For 95% CI can use 1.96 x se
Multiple Logistic Regression – Family History Example
                   OR   lower 95% CI (2.5 %)   higher 95% CI (97.5 %)
(Intercept) 0.9841825        0.4741266               2.042309
fhx         1.4268734        1.1508620               1.771679
age         0.9962506        0.9870106               1.005564
bmi         1.0037280        0.9841700               1.023710
HRT         1.2271996        1.0262560               1.468051
Interpretation: The odds of a woman with family history developing breast cancer are 1.43 times (95% CI 1.15 to 1.77) those of a woman without a family history, after adjustment for age, BMI and HRT use.
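The odds ratios and approximate confidence limits can be recovered from the reported estimates and standard errors by exponentiating estimate ± 1.96 x se. This is a sketch of that rule of thumb. The limits it gives may differ from the tabulated ones in the third decimal place, presumably because the table was produced with a different interval method (an assumption, since the slides do not say).

```python
import math

# Estimates and standard errors copied from the multiple regression output
coefficients = {
    "(Intercept)": (-0.015944, 0.372387),
    "fhx": (0.355486, 0.109996),
    "age": (-0.003756, 0.004749),
    "bmi": (0.003721, 0.010040),
    "HRT": (0.204735, 0.091312),
}

def wald_or_ci(estimate, se, z=1.96):
    """Return (OR, lower limit, upper limit) by exponentiating estimate +/- z * se."""
    return (
        math.exp(estimate),
        math.exp(estimate - z * se),
        math.exp(estimate + z * se),
    )

for term, (estimate, se) in coefficients.items():
    odds_ratio, lower, upper = wald_or_ci(estimate, se)
    print(f"{term:<12}{odds_ratio:>8.3f}{lower:>8.3f}{upper:>8.3f}")
```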
Studies with Binary Outcomes - Summary
Ways to examine association:
• chi square test for association (2 x 2 contingency table)
• odds ratio (OR) or relative risk (RR)*
  • test of association, magnitude of risk and CI
• logistic regression
  • OR as measure of risk, CI and can include other variables of interest
* for a case-control study only the OR is appropriate; for cohort and cross-sectional studies both OR and RR are valid; if the outcome is rare, OR and RR will be similar