
April 11 Logistic Regression –Modeling interactions –Analysis of case-control studies –Data...

Date post: 29-Dec-2015
Upload: judith-patrick
Page 1: April 11 Logistic Regression –Modeling interactions –Analysis of case-control studies –Data presentation.

April 11

• Logistic Regression

– Modeling interactions

– Analysis of case-control studies

– Data presentation

Subgroup Analyses
Journal Tables

              Treatment A   Treatment B     OR
Overall           100           150        0.67

Men                40            90        x.xx
Women              60            60        x.xx

Age <50            25            30        x.xx
Age 51-60          35            50        x.xx
Age 60+            40            70        x.xx

SBP < 160          40            70        x.xx
SBP ≥ 160          60            80        x.xx

Is there any evidence that the effect of treatment differs among subgroups?

TOMHS Example

• Question: Does the effect of active BP treatment on CVD differ for younger versus older persons?

• Looking at an interaction effect (effect modification)

• Compare

– Relative odds of CVD (treatment vs. placebo) in younger patients

– Relative odds of CVD (treatment vs. placebo) in older patients

Logistic regression equation

Model log odds of outcome as a linear function of one or more variables

The model is:

$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots$$

where $p$ is the probability of the outcome

$X_i$ = predictors, independent variables

$\beta_i$ is the increase in log odds for a 1-unit increase in $X_i$

$e^{\beta_i}$ is the relative odds for a 1-unit increase in $X_i$

Logistic Model For Interaction

$X_1$ = 1 for active treatment and 0 for placebo

$X_2$ = 1 for age ≥ 55 and 0 for age < 55

$X_3 = X_1 \times X_2$

$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2$$


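Log odds (placebo, young) = $\beta_0$

Log odds (active, young) = $\beta_0 + \beta_1$

Log odds (placebo, old) = $\beta_0 + \beta_2$

Log odds (active, old) = $\beta_0 + \beta_1 + \beta_2 + \beta_3$

Difference (active − placebo) for young = $\beta_1$; $\exp(\beta_1)$ is the odds ratio (A v P) for young

Difference (active − placebo) for old = $\beta_1 + \beta_3$; $\exp(\beta_1 + \beta_3)$ is the odds ratio (A v P) for old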


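$\exp(\beta_1)$ is the odds ratio (A v P) for young

$\exp(\beta_1 + \beta_3)$ is the odds ratio (A v P) for old

What does $\beta_3$ mean?

$$\frac{\text{Odds ratio (A v P) for old}}{\text{Odds ratio (A v P) for young}} = \frac{\exp(\beta_1 + \beta_3)}{\exp(\beta_1)} = \exp(\beta_3)$$

A ratio of ratios!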
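
The ratio-of-ratios identity can be checked numerically. The sketch below uses made-up coefficient values (they are assumptions for illustration, not estimates from the slides) and confirms that the odds ratio for old divided by the odds ratio for young equals exp(beta3).

```python
import math

# Hypothetical coefficients, chosen only for illustration (not from the slides).
b0, b1, b2, b3 = -1.5, -0.9, 0.1, 0.8

def log_odds(active, old):
    """Log odds from the interaction model, with X3 = X1 * X2."""
    return b0 + b1 * active + b2 * old + b3 * active * old

# Odds ratio (active vs. placebo) within each age group.
or_young = math.exp(log_odds(1, 0) - log_odds(0, 0))
or_old = math.exp(log_odds(1, 1) - log_odds(0, 1))

ratio_of_ratios = or_old / or_young
print(or_young, or_old, ratio_of_ratios, math.exp(b3))
```

Whatever values are chosen for b0 and b2, they cancel out of both odds ratios, which is why only b1 and b3 appear in the result.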

Interaction Hypothesis

$H_0$: $\beta_3 = 0$

$H_a$: $\beta_3 \neq 0$

Test in SAS just like any other coefficient

$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2$$

TOMHS: Overall Effect of Active Treatment

PROC MEANS DATA=temp N MEAN SUM;
  CLASS active;
  VAR cvd;
RUN;

Analysis Variable : cvd

            N
active    Obs      N         Mean          Sum
==============================================
     0    234    234    0.1623932   38.0000000
     1    668    668    0.1107784   74.0000000
==============================================

Active: 74/668 or 11.1%

Placebo: 38/234 or 16.2%

RR = 0.68 (32% lower rate of CVD with active treatment)
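
As a quick check on the arithmetic, the relative risk and the odds ratio can both be computed from the event counts in the PROC MEANS output (74 of 668 on active, 38 of 234 on placebo). This is a minimal sketch; the variable names are illustrative.

```python
# Event counts and group sizes read from the PROC MEANS output.
active_events, active_n = 74, 668
placebo_events, placebo_n = 38, 234

# Risk = proportion with CVD in each group.
risk_active = active_events / active_n
risk_placebo = placebo_events / placebo_n
relative_risk = risk_active / risk_placebo

# Odds = events / non-events in each group.
odds_active = active_events / (active_n - active_events)
odds_placebo = placebo_events / (placebo_n - placebo_events)
odds_ratio = odds_active / odds_placebo

print(f"RR = {relative_risk:.2f}, OR = {odds_ratio:.3f}")
```

The odds ratio is somewhat further from 1 than the relative risk because CVD is not rare in these groups.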

OVERALL (ACTIVE VERSUS PLACEBO)

The LOGISTIC Procedure

Analysis of Maximum Likelihood Estimates

                            Standard         Wald
Parameter   DF   Estimate      Error   Chi-Square   Pr > ChiSq
Intercept    1    -1.6405     0.1773      85.6626       <.0001
active       1    -0.4423     0.2159       4.1964       0.0405

Odds Ratio Estimates

            Point       95% Wald
Effect   Estimate   Confidence Limits
active      0.643     0.421     0.981

The active group has 36% lower odds of CVD compared to placebo.
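
To connect this output to the raw proportions, the fitted coefficients can be converted back to probabilities with the inverse logit. A minimal sketch, using the two estimates from the table (the helper name `inv_logit` is just a label chosen here):

```python
import math

# Coefficients from the PROC LOGISTIC output.
intercept, beta_active = -1.6405, -0.4423

def inv_logit(log_odds):
    """Convert log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

p_placebo = inv_logit(intercept)               # active = 0
p_active = inv_logit(intercept + beta_active)  # active = 1
odds_ratio = math.exp(beta_active)

print(f"placebo {p_placebo:.4f}, active {p_active:.4f}, OR {odds_ratio:.3f}")
```

With a single binary predictor the model reproduces the observed proportions, so these should agree with 38/234 and 74/668 up to rounding of the coefficients.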

Reading Data and Creating Indicator Variables and Interaction Variable

LIBNAME tomhs 'C:/';
DATA temp;
  SET tomhs.bpstudy;
  cvd = second;
  * group 6 is treated as placebo, all other groups as active;
  if group = 6 then active = 0; else active = 1;
  if age < 55 then old = 0; else old = 1;
  * compute interaction term (x3);
  active_old = active*old;
RUN;

* Get simple counts and proportions first;
PROC MEANS DATA=temp N MEAN SUM;
  CLASS old active;
  VAR cvd;
RUN;

The MEANS Procedure

Analysis Variable : cvd

                   N
old   active     Obs      N         Mean          Sum
=====================================================
  0        0     115    115    0.1565217   18.0000000
           1     350    350    0.0714286   25.0000000
  1        0     119    119    0.1680672   20.0000000
           1     318    318    0.1540881   49.0000000

It appears the effect of treatment is mostly in younger patients.
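
The age-specific odds ratios can be computed directly from these four cells before fitting any model. The sketch below does that and also forms the ratio of the two odds ratios; the dictionary layout and function names are illustrative choices.

```python
# (CVD events, group size) from the PROC MEANS output, keyed by (old, active).
cells = {
    (0, 0): (18, 115),
    (0, 1): (25, 350),
    (1, 0): (20, 119),
    (1, 1): (49, 318),
}

def odds(events, n):
    """Odds of CVD = events / non-events."""
    return events / (n - events)

def odds_ratio_active_vs_placebo(old):
    """Odds ratio (active vs. placebo) within one age group."""
    return odds(*cells[(old, 1)]) / odds(*cells[(old, 0)])

or_young = odds_ratio_active_vs_placebo(0)
or_old = odds_ratio_active_vs_placebo(1)
ratio_of_odds_ratios = or_old / or_young

print(f"young {or_young:.3f}, old {or_old:.3f}, ratio {ratio_of_odds_ratios:.3f}")
```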

PROC LOGISTIC DATA=temp DESCENDING;
  MODEL CVD = active old active_old;
  CONTRAST 'A v P (Young)' active 1 / ESTIMATE=BOTH;
  * The next contrast will give us beta1 + beta3;
  CONTRAST 'A v P (Old)' active 1 active_old 1 / ESTIMATE=BOTH;
RUN;

SAS OUTPUT

Response Profile

Ordered                 Total
  Value   cvd       Frequency
      1     1             112
      2     0             790

Probability modeled is cvd=1.

Testing Global Null Hypothesis: BETA=0

Test                 Chi-Square   DF   Pr > ChiSq
Likelihood Ratio        15.7787    3       0.0013
Score                   14.7851    3       0.0020
Wald                    14.0735    3       0.0028

The LOGISTIC Procedure

Analysis of Maximum Likelihood Estimates

                             Standard         Wald
Parameter    DF   Estimate      Error   Chi-Square   Pr > ChiSq
Intercept     1    -1.6843     0.2566      43.0730       <.0001
active        1    -0.8806     0.3301       7.1180       0.0076   (b1)
old           1     0.0850     0.3549       0.0573       0.8108   (b2)
active_old    1     0.7771     0.4395       3.1261       0.0770   (b3)

Odds Ratio Estimates

               Point       95% Wald
Effect      Estimate   Confidence Limits
active         0.415     0.217     0.792
old            1.089     0.543     2.183
active_old     2.175     0.919     5.147

Odds ratio for CVD (A v P) for younger patients = exp(b1) = 0.415

Odds ratio for CVD (A v P) for older patients = exp(b1 + b3) = exp(-0.10) = 0.90

Ratio of odds ratios = exp(b3) = 0.9017/0.4145 = 2.175
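
The Wald test for the interaction coefficient can be reproduced by hand from its estimate and standard error. This is a sketch using only the rounded values printed in the output, so the chi-square agrees with SAS only to about two decimal places.

```python
import math

# Estimate and standard error for active_old (b3) from the output.
estimate, std_error = 0.7771, 0.4395

z = estimate / std_error
wald_chi_square = z ** 2

# Two-sided p-value; for 1 degree of freedom the chi-square tail
# equals the two-sided standard normal tail, erfc(|z| / sqrt(2)).
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"Wald chi-square = {wald_chi_square:.2f}, p = {p_value:.3f}")
```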

CONTRAST 'A v P (Old)' active 1 active_old 1 / ESTIMATE=BOTH;

Computes 1*beta1 + 0*beta2 + 1*beta3 = beta1 + beta3

Plus test and 95% CI

Contrast Rows Estimation and Testing Results

                                         Standard             Lower     Upper
Contrast        Type   Row   Estimate      Error   Alpha     Limit     Limit
A v P (Young)   PARM     1    -0.8806     0.3301    0.05   -1.5275   -0.2337
A v P (Young)   EXP      1     0.4145     0.1368    0.05    0.2171    0.7916
A v P (Old)     PARM     1    -0.1035     0.2902    0.05   -0.6723    0.4653
A v P (Old)     EXP      1     0.9017     0.2617    0.05    0.5105    1.5925

PARM rows are on the log-odds scale; EXP rows are the exponentiated estimates (odds ratios).
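
Because the model with active, old, and active_old has one parameter per cell of the age-by-treatment table, the contrast for older patients should match the log odds ratio computed directly from the older patients' counts, with the standard error given by the usual sum-of-reciprocals (Woolf) formula. The sketch below checks this under that assumption, using the counts from the PROC MEANS output (49 of 318 on active, 20 of 119 on placebo).

```python
import math

# Older patients: events and non-events in each treatment group.
a, b = 49, 318 - 49   # active: CVD, no CVD
c, d = 20, 119 - 20   # placebo: CVD, no CVD

log_or = math.log((a / b) / (c / d))
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

z = 1.959964  # 97.5th percentile of the standard normal
lower = log_or - z * se_log_or
upper = log_or + z * se_log_or

print(f"log OR = {log_or:.4f}, SE = {se_log_or:.4f}")
print(f"95% CI for OR: {math.exp(lower):.4f} to {math.exp(upper):.4f}")
```

This shortcut works only because no other covariates are in the model; with additional predictors the contrast needs the fitted covariance matrix, which is what the CONTRAST statement uses.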

Description of Findings

Patients in the active group had 36% lower odds of CVD compared to the placebo group (OR: 0.64; 95% CI: 0.42-0.98). Analyses by age showed that the benefit of active treatment was greatest in younger patients. In patients under age 55 the odds of CVD were 58% lower with active treatment (OR: 0.42), whereas in patients aged 55 or older the odds were only 10% lower (OR: 0.90). The test for interaction between treatment and age approached significance (p = 0.077).

Logistic Regression for Case-Control Studies

• Same analyses as prospective study

• Outcome:

– Y = 1 is a case

– Y = 0 is a control

• Model log (odds) of being a case

• Odds ratios have same meaning

• Estimating probability of being a case not appropriate
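
A small numeric illustration of the last two points: when controls are sampled at some fraction of those available, the exposure odds ratio is unchanged, but the apparent proportion of cases changes with the sampling fraction. All counts below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Hypothetical counts, for illustration only (not from the slides).
exposed_cases, unexposed_cases = 60, 40
exposed_controls, unexposed_controls = 300, 600

def odds_ratio(a, b, c, d):
    """Exposure odds in cases (a/b) divided by exposure odds in controls (c/d)."""
    return (a / b) / (c / d)

def proportion_cases_among_exposed(cases, controls):
    return cases / (cases + controls)

# Keep every case but sample only one in ten controls.
sampled_exposed_controls = exposed_controls // 10
sampled_unexposed_controls = unexposed_controls // 10

or_full = odds_ratio(exposed_cases, unexposed_cases,
                     exposed_controls, unexposed_controls)
or_sampled = odds_ratio(exposed_cases, unexposed_cases,
                        sampled_exposed_controls, sampled_unexposed_controls)

p_full = proportion_cases_among_exposed(exposed_cases, exposed_controls)
p_sampled = proportion_cases_among_exposed(exposed_cases, sampled_exposed_controls)

print(or_full, or_sampled)   # same odds ratio
print(p_full, p_sampled)     # different "probability of being a case"
```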

Example Colon Polyp Study

• Cases (N=574)

– Patients diagnosed with colorectal polyps from colonoscopy

• Controls (N=707)

– Patients clear of colorectal polyps from colonoscopy

• Risk Factors Under Study

– Family history (FH) of colon cancer

– Smoking and alcohol

– Reproductive history factors

– Obesity and adiposity (waist-to-hip measures)

Example Colon Polyp Study

• Variables

– CC Status (1=case, 2=control)

– Age (years)

– FH colon cancer (1=Y, 0=N)

– Current Smoking (1=Y, 0=N)

– Gender (1=Men, 0=Women)

– Waist to Hip Ratio

• Variable Names

– CC, AGE, FHCC, SMOKERS, MEN, and WHIP

PROC LOGISTIC DATA=temp;
  MODEL cc = age fhcc smokers men whip;
  UNITS whip = 0.1;
RUN;

Response Profile

Ordered                 Total
  Value   CC1       Frequency
      1     1             561
      2     2             690

Probability modeled is CC=1.

Testing Global Null Hypothesis: BETA=0

Test                 Chi-Square   DF   Pr > ChiSq
Likelihood Ratio       165.4379    5       <.0001
Score                  155.7546    5       <.0001
Wald                   139.8082    5       <.0001

Analysis of Maximum Likelihood Estimates

                            Standard         Wald
Parameter   DF   Estimate      Error   Chi-Square   Pr > ChiSq
Intercept    1    -4.0683     0.5953      46.7054       <.0001
AGE          1     0.0497    0.00618      64.8156       <.0001
FHCC         1    -0.4434     0.1505       8.6798       0.0032
smokers      1     0.5272     0.1623      10.5537       0.0012
men          1     0.8379     0.1503      31.0610       <.0001
WHIP         1     0.7491     0.6287       1.4197       0.2335

Odds Ratio Estimates

            Point       95% Wald
Effect   Estimate   Confidence Limits
AGE         1.051     1.038     1.064
FHCC        0.642     0.478     0.862
smokers     1.694     1.233     2.329
men         2.312     1.722     3.104
WHIP        2.115     0.617     7.253

UNITS whip = 0.1;

Effect     Unit   Estimate   95% Confidence Limits
WHIP     0.1000      1.078      0.953      1.219
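
The UNITS line rescales the WHIP odds ratio from a 1-unit change to a 0.1-unit change in waist-to-hip ratio. A minimal sketch of that rescaling, using the estimate and standard error from the table (the Wald interval is computed on the log-odds scale and then exponentiated):

```python
import math

# WHIP estimate and standard error from the output.
estimate, std_error = 0.7491, 0.6287
unit = 0.1    # change in waist-to-hip ratio of interest
z = 1.959964  # 97.5th percentile of the standard normal

or_per_unit = math.exp(unit * estimate)
lower = math.exp(unit * (estimate - z * std_error))
upper = math.exp(unit * (estimate + z * std_error))

print(f"OR per {unit} = {or_per_unit:.3f} ({lower:.3f}, {upper:.3f})")
```

A full 1-unit change in waist-to-hip ratio is far outside the plausible range of the data, which is why the 0.1-unit odds ratio is the more interpretable number.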

Interaction Model

• Is the relationship between waist-to-hip ratio and polyp status different for men and women?

• Define interaction term

– whip * men

PROC LOGISTIC DATA=temp DESCENDING;
  MODEL cc = age fhcc smokers men whip whip_men;
RUN;

Analysis of Maximum Likelihood Estimates

                            Standard         Wald
Parameter   DF   Estimate      Error   Chi-Square   Pr > ChiSq
Intercept    1    -2.4771     0.8632       8.2349       0.0041
AGE          1     0.0511    0.00626      66.7467       <.0001
FHCC         1    -0.4528     0.1511       8.9866       0.0027
smokers      1     0.5487     0.1631      11.3203       0.0008
men          1    -2.5148     1.3103       3.6838       0.0549
WHIP         1    -1.2470     1.0235       1.4846       0.2231
whip_men     1     3.7225     1.4392       6.6897       0.0097

Odds Ratio Estimates

            Point       95% Wald
Effect   Estimate   Confidence Limits
AGE         1.052     1.040     1.065
FHCC        0.636     0.473     0.855
smokers     1.731     1.257     2.383
men         0.081     0.006     1.055
WHIP        0.287     0.039     2.136
whip_men   41.367     2.464   694.576

The WHIP row (p = 0.2231) is the p-value for the waist-to-hip ratio effect in women (men = 0).
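
With the interaction term in the model, the WHIP coefficient is the slope for women and the slope for men is the sum of the WHIP and whip_men coefficients. The sketch below works out both slopes and the corresponding odds ratios per 0.1 unit of waist-to-hip ratio; the 0.1 unit is carried over from the earlier UNITS statement as an assumption about what scale is useful.

```python
import math

# Coefficients from the interaction model output.
beta_whip, beta_whip_men = -1.2470, 3.7225
unit = 0.1  # change in waist-to-hip ratio of interest

slope_women = beta_whip                 # men = 0
slope_men = beta_whip + beta_whip_men   # men = 1

or_women = math.exp(unit * slope_women)
or_men = math.exp(unit * slope_men)

print(f"women: slope {slope_women:.4f}, OR per {unit} = {or_women:.2f}")
print(f"men:   slope {slope_men:.4f}, OR per {unit} = {or_men:.2f}")
```

A confidence interval for the men's slope cannot be obtained from this table alone, because it depends on the covariance between the two estimates; a CONTRAST statement like the one used for TOMHS would supply it.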

Some Practical Aspects for Analyses

• Divide the continuous variable of interest into 3-5 categories and compute relative odds for increasing categories.

• Summarize results with the beta coefficient from a model that uses the factor as a continuous variable.

Example Omega-3 Intake and CHD

Omega-3 Intake   CHD N   Odds Ratio (95% CI)
I                   40   1.00
II                  42   1.08 (0.80 – 1.45)
III                 37   0.92 (0.68 – 1.32)
IV                  35   0.89 (0.66 – 1.25)
V                   24   0.61 (0.34 – 0.98)

Beta (SE)                0.30 (0.15)

Advantages

• Can determine if risk increases linearly with increasing levels of factor

• No assumptions of pattern of risk when using categories

• Can determine if there is a threshold effect

• Eliminates possible effect of outliers


Analysis

• Create indicator variables for quintiles of omega-3 and run logistic regression

• Run regression using omega-3 as continuous variable
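
The first step, creating quintile indicator variables, might look like the sketch below. The intake values are invented for illustration, and the cut points come from Python's `statistics.quantiles` with its default method, which may differ slightly from the percentile definition a SAS procedure would use.

```python
import statistics

# Hypothetical omega-3 intake values (illustration only, not study data).
intake = [0.2, 0.5, 0.9, 1.1, 1.4, 1.8, 2.0, 2.3, 2.9, 3.5]

# Four cut points split the data into five groups (quintiles).
cut_points = statistics.quantiles(intake, n=5)

def quintile(value):
    """Return 1..5 according to how many cut points the value exceeds."""
    return 1 + sum(value > cut for cut in cut_points)

def indicators(value):
    """Indicator variables for quintiles II-V; quintile I is the reference."""
    q = quintile(value)
    return [1 if q == level else 0 for level in (2, 3, 4, 5)]

for x in intake:
    print(x, quintile(x), indicators(x))
```

The four indicators would then be entered as predictors in place of the continuous intake variable, so each odds ratio compares one quintile with quintile I.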

In Class Exercise

• Investigate whether the odds of CVD increase linearly with age

• Divide age into 4 categories

– < 50; 50-54; 55-59; 60+

• Two CVD endpoints:

– Clinical – major CVD

– Second – major + minor CVD

• Compute percent with CVD within each age category

• Run logistic regression with 3 indicator variables using < 50 as reference group

• Run logistic regression using age as continuous variable
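
One way to set up the first steps of the exercise is sketched below: map age to the four categories, build the three indicator variables with < 50 as the reference, and tabulate the percent with CVD in each category. The function names and the records are hypothetical; the real exercise would use the TOMHS data.

```python
from collections import defaultdict

def age_category(age):
    """Map age in years to one of the four exercise categories."""
    if age < 50:
        return "<50"
    if age < 55:
        return "50-54"
    if age < 60:
        return "55-59"
    return "60+"

def age_indicators(age):
    """Three 0/1 indicators; all zeros means the < 50 reference group."""
    category = age_category(age)
    return [int(category == level) for level in ("50-54", "55-59", "60+")]

def percent_with_cvd(records):
    """records: (age, cvd) pairs with cvd coded 0/1. Returns percent by category."""
    events = defaultdict(int)
    totals = defaultdict(int)
    for age, cvd in records:
        category = age_category(age)
        events[category] += cvd
        totals[category] += 1
    return {c: 100.0 * events[c] / totals[c] for c in totals}

# Hypothetical records, for illustration only (not TOMHS data).
records = [(47, 0), (49, 0), (52, 1), (53, 0), (57, 0), (58, 1), (63, 1), (66, 1)]
percents = percent_with_cvd(records)
print(percents)
```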

