
MBP1010 – Lecture 8: March 1, 2011

1. Odds Ratio/Relative Risk

• Logistic Regression

• Survival Analysis

Reading: papers on OR and survival analysis (Resources)

Ch 10 Multifactorial Analyses

Assignment 3

Due: March 8

- solutions will be posted after due date

- but marks will likely not be available prior to exam

Observational Studies with Binary Outcomes

Case/control and cohort studies - common in cancer research

Outcome: cancer/ no cancer, dead/alive

- cross-sectional studies - classify subjects into categories of 2 binary variables

Case Control Study

[Figure: schematic of subjects marked X and 0; label: Exposure (eg diet)]

Measure of risk: odds ratio (OR)

Cohort Study

[Figure: schematic of a cohort of subjects marked 0, some marked X; labels: Exposure (eg diet), Cancer (yes/no)]

Measure of risk: RR or OR

Cross-sectional Study

• Subjects NOT selected on exposure or outcome

• Classify subjects into exposure and outcome

• OR or RR can be used to describe association with binary outcome

Observational Studies with Binary Outcomes

-case/control, cohort studies, cross-sectional studies

Ways to examine association:

• chi square test for association (2 x 2 contingency table)
   • X2

• odds ratio (OR) or relative risk (RR)
   • X2 and magnitude of risk and CI

• logistic regression
   • X2, magnitude of risk, CI and can include other variables of interest

Relative Risk

Prospective Cohort Studies

RR = 1.0: no association

RR = 1.4: 1.4 times the risk (40% higher risk)

RR = 0.8: 20% lower risk

RR = p1/p2

p1 = probability of disease for exposed individuals

p2 = probability of disease for unexposed individuals

MDM2 protein expression and breast cancer prognosis - cohort study

- women with invasive breast cancer at BCCA

- TMA stained for MDM2 protein expression

- data on outcome (dead/alive) available

Turbin et al, Modern Pathology 2006

MDM2 protein expression and breast cancer prognosis

Prospective Cohort Study

p1 = 28/49 = 0.57

p2 = 94/313 = 0.30

X2 = 12.75, df = 1, p-value < 0.01

RR = (28/49)/(94/313) = 1.90

Women with MDM2 protein expression were at 1.9 times the risk of dying from breast cancer compared to women without MDM2 protein expression (p<0.01).

(from lecture 4)
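The relative risk in the MDM2 example can be reproduced directly from the counts on the slide. The following is a minimal sketch in Python; the function name `relative_risk` and the variable names are illustrative choices, not part of the lecture.

```python
# Counts from the MDM2 cohort example: deaths / group size.
deaths_pos, n_pos = 28, 49     # women with MDM2 expression (exposed)
deaths_neg, n_neg = 94, 313    # women without MDM2 expression (unexposed)


def relative_risk(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Return (p1, p2, RR), where RR = p1 / p2."""
    p1 = events_exposed / n_exposed        # probability of outcome, exposed
    p2 = events_unexposed / n_unexposed    # probability of outcome, unexposed
    return p1, p2, p1 / p2


p1, p2, rr = relative_risk(deaths_pos, n_pos, deaths_neg, n_neg)
print(f"p1 = {p1:.2f}, p2 = {p2:.2f}, RR = {rr:.2f}")
# prints: p1 = 0.57, p2 = 0.30, RR = 1.90
```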

Case Control Study of Family History and Breast Cancer

- cases of breast cancer identified by cancer registry

- controls identified through provincial screening program

- data collected by questionnaire (after diagnosis in cases)


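2 x 2 Contingency Table

                       Breast cancer   No breast cancer   Total
Family history (FHX)      238 (a)          180 (b)          418
No family history         862 (c)          920 (d)         1782
Total                    1100             1100             2200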

Chi-square results with Yates’ continuity correction:

X2 = 9.60, df = 1, p-value = 0.00195 (< 0.01)

We conclude that there is a statistically significant association between first degree family history of breast cancer and breast cancer risk (p<0.01). 22% of women with breast cancer have a first degree family history of breast cancer compared to 16% of women without breast cancer.
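The chi-square statistic with Yates' continuity correction can be checked by hand. The sketch below assumes the family history counts that appear in the odds ratio slides (238 cases and 180 controls with a family history, 862 cases and 920 controls without); the function name `yates_chi_square` is illustrative. It uses only the standard library, relying on the fact that for df = 1 the upper-tail chi-square probability equals erfc(sqrt(X2/2)).

```python
import math


def yates_chi_square(a, b, c, d):
    """Chi-square statistic (df = 1) with Yates' continuity correction
    for the 2 x 2 table [[a, b], [c, d]], together with its p-value."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    # For df = 1, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p = yates_chi_square(238, 180, 862, 920)
print(f"X2 = {chi2:.2f}, df = 1, p-value = {p:.5f}")
```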

Estimate of Risk from Case-Control Study

• we fixed the number with and without breast cancer

• we cannot estimate the probabilities of breast cancer in women with and without family history

- Relative Risk cannot be estimated

What can we do?

A day at the racetrack.....

Gamblers calculate their chances of winning using a term called the odds.

Suppose that the horse is a favourite and it is declared to have a 1 in 4 chance of winning [1 / (1 + 3)].

The gambler might say that the horse had odds of 1 to 3 of winning. However, gamblers are much more likely to say that the odds of the horse losing are 3 to 1.

A horse that is a longshot may have only a 1 in 50 chance of winning. On the tote board the gambler will read that it has 49 to 1 odds against winning.

Estimate of risk: Odds Ratio

If the probability of an event = p, then:

The odds in favour of an event = p/(1-p)

• ratio of the probability that the event occurs to the probability that it does not
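The racetrack numbers follow from this definition. A small sketch, using exact fractions so that the odds print the way a gambler would read them; the function names are illustrative.

```python
from fractions import Fraction


def odds_in_favour(p):
    """Odds in favour of an event with probability p: p / (1 - p)."""
    return p / (1 - p)


def odds_against(p):
    """Odds against an event with probability p: (1 - p) / p."""
    return (1 - p) / p


favourite = Fraction(1, 4)    # 1 in 4 chance of winning
longshot = Fraction(1, 50)    # 1 in 50 chance of winning

print(odds_in_favour(favourite))  # 1/3 -> odds of 1 to 3 of winning
print(odds_against(favourite))    # 3   -> 3 to 1 against
print(odds_against(longshot))     # 49  -> 49 to 1 against
```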

Odds Ratio:

OR = (odds in favour of disease for the exposed group) / (odds in favour of disease for the unexposed group)

odds = p/(1-p)

odds of breast cancer with FHX = (238/418) / (1 - 238/418) = 1.32

odds of breast cancer with no FHX = (862/1782) / (1 - 862/1782) = 0.94

OR = 1.32/0.94 = 1.41

Odds Ratio

Simple method for calculating OR:

Odds = ratio of the number of times the event occurs to the number of times it doesn’t

OR = (a/b)/(c/d) = (238/180)/(862/920) = 1.41

Alternate equation: OR = (a*d)/(b*c) = (238*920)/(180*862) = 1.41
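The probability-based calculation and the two count-based formulas should all agree. A minimal sketch, assuming the cell labels a, b (cases and controls with a family history) and c, d (cases and controls without):

```python
# Family history example: a, b = cases / controls with FHX,
#                         c, d = cases / controls without FHX.
a, b, c, d = 238, 180, 862, 920

# Route 1: odds = p / (1 - p) within each exposure group, then take the ratio.
p_fhx = a / (a + b)
p_no_fhx = c / (c + d)
or_from_probabilities = (p_fhx / (1 - p_fhx)) / (p_no_fhx / (1 - p_no_fhx))

# Route 2: ratio of "times it occurs" to "times it doesn't" in each group.
or_from_counts = (a / b) / (c / d)

# Route 3: cross-product form.
or_cross_product = (a * d) / (b * c)

print(f"{or_from_probabilities:.2f} {or_from_counts:.2f} {or_cross_product:.2f}")
# prints: 1.41 1.41 1.41
```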

Confidence Interval for OR

- OR has a skewed distribution: limited at the lower end because it can’t be negative, but not limited at the upper end

- log(OR), however, can take any value and has an approximately normal distribution

Calculate limits on log(OR) and then “exponentiate”:

SE for ln(OR) = sqrt(1/a + 1/b + 1/c + 1/d) = sqrt(1/238 + 1/180 + 1/862 + 1/920) = 0.110

95% CI for ln(OR): ln(1.41) ± 1.96 x 0.110 = 0.344 ± 0.216 = 0.128 to 0.560

95% CI for OR: e^0.128 to e^0.560 = 1.14 to 1.75
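The interval calculation can be sketched in a few lines: compute the limits on the log scale, then exponentiate. The function name `odds_ratio_ci` and the default multiplier `z=1.96` are illustrative choices; the counts are those of the family history example.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio for the 2 x 2 table with an approximate 95% CI.

    The limits are computed on ln(OR), where the sampling distribution
    is roughly normal, and then exponentiated back to the OR scale.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, se_log_or, lower, upper


odds_ratio, se, lower, upper = odds_ratio_ci(238, 180, 862, 920)
print(f"OR = {odds_ratio:.2f}, SE ln(OR) = {se:.4f}, "
      f"95% CI = {lower:.2f} to {upper:.2f}")
```

Because the interval is symmetric on the log scale, it is not symmetric around the OR itself: the upper limit sits further from the estimate than the lower limit.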

What is the interpretation of the OR?

The odds of breast cancer in women with a family history are about 1.41 times those in women without a family history.

Strictly speaking, the OR should be expressed in terms of “odds” (as above).

However, when the outcome is rare (as it generally is for cancer), the OR is approximately equal to the RR, and results are often expressed as risk (ie more or less likely to develop cancer).

OR is reversible

Disease Odds Ratio:

(odds in favour of disease for the exposed group) / (odds in favour of disease for the unexposed group) = 1.41

Exposure Odds Ratio:

(odds in favour of being exposed for diseased subjects) / (odds in favour of being exposed for non diseased subjects) = 1.41
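Reversibility is what makes the OR usable in a case-control study, where only the exposure odds can be observed directly. A short check, assuming the family history counts (a, b = cases and controls with a family history; c, d = cases and controls without): both ratios reduce algebraically to (a*d)/(b*c).

```python
a, b, c, d = 238, 180, 862, 920

# Disease odds ratio: odds of being a case, exposed vs unexposed.
disease_or = (a / b) / (c / d)

# Exposure odds ratio: odds of being exposed, cases vs controls.
exposure_or = (a / c) / (b / d)

print(f"disease OR = {disease_or:.2f}, exposure OR = {exposure_or:.2f}")
# prints: disease OR = 1.41, exposure OR = 1.41
```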

MDM2 protein expression and breast cancer prognosis

Prospective Cohort Study

p1 = 28/49 = 0.57

p2 = 94/313 = 0.30

X2 = 12.75, df = 1, p-value < 0.01

RR = (28/49)/(94/313) = 1.90

OR = (28*219)/(21*94) = 3.11

Proportion dying = 34%
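The gap between RR and OR here comes from the outcome being common (about a third of the women died). The sketch below contrasts the MDM2 counts with a second table whose numbers are hypothetical, chosen only to illustrate a rare outcome; they are not from the lecture.

```python
def rr_and_or(a, b, c, d):
    """a, b = events / non-events in the exposed group;
    c, d = events / non-events in the unexposed group."""
    rr = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)
    return rr, odds_ratio


# MDM2 cohort: 28 of 49 exposed and 94 of 313 unexposed women died.
rr, odds_ratio = rr_and_or(28, 21, 94, 219)
print(f"common outcome: RR = {rr:.2f}, OR = {odds_ratio:.2f}")
# prints: common outcome: RR = 1.90, OR = 3.11

# Hypothetical rare outcome (illustrative numbers): 2% vs 1% risk.
rare_rr, rare_or = rr_and_or(20, 980, 10, 990)
print(f"rare outcome:   RR = {rare_rr:.2f}, OR = {rare_or:.2f}")
# prints: rare outcome:   RR = 2.00, OR = 2.02
```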

Caution about Case/Control Studies

- “Recall” bias: subjects with disease may recall their exposures differently from controls

- Biological samples collected after diagnosis may be affected by presence of disease

- Selection of controls extremely important (different population?)

- Treatment of samples from cases and controls must be the same

- Posted paper: Sources of Bias in Specimens for Research about Molecular Markers for Cancer

Copyright © American Society of Clinical Oncology

Ransohoff, D. F. et al. J Clin Oncol; 28:698-704 2010

Fig 1. The fundamental comparison in experimental and observational study design

- paper posted on website under resources

Nested Case-Control Study

[Figure: schematic of a cohort of subjects marked 0 and X; labels: cohort; measure exposure; follow to identify cases; select cases & subset of controls]

Measure of risk: OR


Nested Case-Control Study

• Do a prospective cohort study

• Identify cases

• Select controls (randomly) from the cohort study
   - usually matched to case
   - followed same length of time as case
   - match on other characteristics (eg age, site etc)

• perform measurements of exposure

• Analyze as case-control (Odds Ratio)

- Still requires cohort study, but fewer measurements required

- Controls from same population as cases

- Measurements from baseline (no recall bias)

Logistic Regression

• a generalization of chi square to examine association of a binary variable with one or more independent variables (categorical or continuous)

• Logistic regression quantifies the relationship between a risk factor (or treatment) and a disease, after adjusting for other variables.

• Binary dependent variable: an event which is either present or absent (“success” or “failure”)

• Goal is to examine factors associated with the probability of an event

• uses method of maximum likelihood rather than least squares

How does logistic regression work?

• Logistic regression finds an equation that predicts an outcome variable that is binary from one or more x variables.

Outcome = probability of disease (p)

p = β0 + β1X1 + β2X2…

But… probabilities can only range from 0 to 1, and the right hand side could be < 0 or > 1 for some values of X. Use the logit transformation.

How does logistic regression work?

logit transformation: logit(p) = ln(p/(1-p))

Natural logarithm of the odds can take on any value (negative or positive).

Logistic Regression Model:

ln(Odds) = β0 + β1X1 + β2X2 + …
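The logit and its inverse show why the transformation solves the range problem: probabilities in (0, 1) map to the whole real line, and any value of the linear predictor maps back to a valid probability. A minimal sketch; the function names are illustrative.

```python
import math


def logit(p):
    """ln(p / (1 - p)): maps a probability in (0, 1) to any real number."""
    return math.log(p / (1 - p))


def inverse_logit(x):
    """Maps any real number back to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-x))


for p in (0.1, 0.5, 0.9):
    print(f"p = {p}: ln(odds) = {logit(p):.3f}")

# However extreme the linear predictor, the implied probability stays in (0, 1).
for linear_predictor in (-5, 0, 5):
    print(f"ln(odds) = {linear_predictor}: p = {inverse_logit(linear_predictor):.4f}")
```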

Logistic Regression family history example

Coefficients:
              Estimate   Std. Error   z value   Pr(>|z|)
(Intercept)   -0.06512      0.04740    -1.374    0.16953
fhx            0.34443      0.10956     3.144    0.00167

ln(Odds)= -0.065 + 0.344x
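To make "maximum likelihood rather than least squares" concrete, the estimates and standard errors for this example can be reproduced with a hand-rolled Newton-Raphson fit on the grouped counts. This is a sketch under stated assumptions: the slides do not say which software or algorithm produced the table, the counts (238 of 418 with a family history, 862 of 1782 without) are taken from the odds ratio slides, and `fit_logistic` is an illustrative name.

```python
import math

# Grouped data: (x, number of cases, group size); x = 1 for FHX, 0 for no FHX.
groups = [(1, 238, 418), (0, 862, 1782)]


def fit_logistic(groups, iterations=25):
    """Newton-Raphson maximum likelihood for ln(odds) = b0 + b1 * x."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # information matrix (negative Hessian)
        for x, cases, n in groups:
            p = 1 / (1 + math.exp(-(b0 + b1 * x)))
            g0 += cases - n * p
            g1 += (cases - n * p) * x
            w = n * p * (1 - p)
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system  information * step = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Standard errors: square roots of the diagonal of the inverse information
    # matrix, evaluated in the last iteration (the fit has converged by then).
    se0 = math.sqrt(h11 / det)
    se1 = math.sqrt(h00 / det)
    return b0, b1, se0, se1


b0, b1, se0, se1 = fit_logistic(groups)
print(f"(Intercept) {b0:.5f}  SE {se0:.5f}")
print(f"fhx         {b1:.5f}  SE {se1:.5f}")
```

With a single binary predictor the fit has a closed form (the intercept is the log odds in the unexposed group and the slope is ln(OR)), so the iteration is only needed once further variables are added.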

Intercept (β0): log odds in baseline group (x = 0)

Slope (β1): difference in ln(odds) for a 1 unit increase in the x variable

To interpret, use the transformation:

Odds(FHX) / Odds(no FHX) = e^β1 = e^0.344 = 1.41


A little more detail on interpretation….

ln(Odds) = -0.065 + 0.344x

Since there are only 2 values for x (family history: yes/no):

For women with family history (x=1): ln(Odds) = β0 + β1

For women with no family history (x=0): ln(Odds) = β0

odds of breast cancer with FHX = (238/418) / (1 - 238/418) = 1.32

odds of breast cancer with no FHX = (862/1782) / (1 - 862/1782) = 0.94

OR = 1.32/0.94 = 1.41


ln(Odds) = -0.065 + 0.344x

Since there are only 2 values for x (family history: yes/no):

For women with family history (x=1): ln(Odds) = β0 + β1 = -0.065 + 0.344 = 0.279 = ln(1.32)

For women with no family history (x=0): ln(Odds) = β0 = -0.065 = ln(0.94)

β1 = difference in ln(odds) between categories = ln(ratio of odds) = 0.279 - (-0.065) = 0.344

OR = 1.32/0.94 = 1.41; e^0.344 = 1.41
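Going from coefficients back to odds is a matter of exponentiating. A short sketch using the estimates from the coefficient table (variable names are illustrative):

```python
import math

b0, b1 = -0.06512, 0.34443   # intercept and fhx slope from the fitted model

ln_odds_fhx = b0 + b1        # x = 1: family history
ln_odds_no_fhx = b0          # x = 0: no family history

odds_fhx = math.exp(ln_odds_fhx)
odds_no_fhx = math.exp(ln_odds_no_fhx)

# A difference on the ln(odds) scale is a ratio on the odds scale.
print(round(odds_fhx, 2), round(odds_no_fhx, 2),
      round(odds_fhx / odds_no_fhx, 2), round(math.exp(b1), 2))
# prints: 1.32 0.94 1.41 1.41
```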

Multiple Logistic Regression – Family History Example

Coefficients:
               Estimate   Std. Error   z value   Pr(>|z|)
(Intercept)   -0.015944     0.372387    -0.043    0.96585
fhx            0.355486     0.109996     3.232    0.00123
age           -0.003756     0.004749    -0.791    0.42897
bmi            0.003721     0.010040     0.371    0.71092
HRT            0.204735     0.091312     2.242    0.02495

Note: z test used for coefficients. For 95% CI can use estimate ± 1.96 x SE

Multiple Logistic Regression – Family History Example

                     OR   lower 95% CI (2.5 %)   higher 95% CI (97.5 %)
(Intercept)   0.9841825              0.4741266                 2.042309
fhx           1.4268734              1.1508620                 1.771679
age           0.9962506              0.9870106                 1.005564
bmi           1.0037280              0.9841700                 1.023710
HRT           1.2271996              1.0262560                 1.468051

Interpretation: The odds of a woman with a family history developing breast cancer are 1.43 times (95% CI 1.15 to 1.77) those of a woman without a family history, after adjustment for age, BMI and HRT use.
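The adjusted odds ratios can be approximated from the coefficient table using estimate ± 1.96 x SE on the log scale. The sketch below copies the estimates and standard errors from the multiple logistic regression output; `wald_or_ci` is an illustrative name. The limits agree with the tabulated ones to about two decimal places; the small differences in the third decimal would be expected if the software used a different interval method (for example profile likelihood), which is an assumption and not something the slides state.

```python
import math

# (estimate, standard error) for each predictor, from the coefficient table.
coefficients = {
    "fhx": (0.355486, 0.109996),
    "age": (-0.003756, 0.004749),
    "bmi": (0.003721, 0.010040),
    "HRT": (0.204735, 0.091312),
}


def wald_or_ci(estimate, se, z=1.96):
    """Odds ratio and approximate 95% CI from a logistic coefficient."""
    return (math.exp(estimate),
            math.exp(estimate - z * se),
            math.exp(estimate + z * se))


results = {name: wald_or_ci(est, se) for name, (est, se) in coefficients.items()}
for name, (odds_ratio, lower, upper) in results.items():
    print(f"{name}: OR = {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that excludes 1 (fhx, HRT) corresponds to a coefficient that is significant at the 5% level in the z test; intervals that include 1 (age, bmi) do not.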

Studies with Binary Outcomes - Summary

Ways to examine association:

• chi square test for association (2 x 2 contingency table)

• odds ratio (OR) or relative risk (RR)*
   - test of association, magnitude of risk and CI

• logistic regression
   • OR as measure of risk, CI and can include other variables of interest

* for case-control studies only the OR is appropriate; for cohort and cross-sectional studies both OR and RR are valid; if the outcome is rare, OR and RR will be similar
