
Visualising and analysing time-to-event data: lifting the veil of censoring

Patrick Royston
Cancer Group, MRC Clinical Trials Unit, London

12th UK Stata Users’ meeting, September 2006


A poet writes about censored observations:

Last night I saw upon the stair
A little man who wasn't there
He wasn't there again today
Oh, how I wish he'd go away!

From Antigonish (1899)

Hughes Mearns (1875-1965)


Outline

• Why is censoring of time-to-event data an issue?
• Example in breast cancer
• Visualisation of censored data using model-based imputation
• Multiple imputation and analysis of survival data with missing covariate observations
• Demonstration with Stata


Why is censoring an issue?

• You can’t picture the raw data easily
• Reliance on Kaplan-Meier plots
  - Exaggerates differences between groups
  - Attracts attention to unreliable survival estimates at extreme times
• Data will be analysed using the Cox model
  - Still the almost-automatic choice, although decent alternatives exist
• Time is “forgotten about” in the Cox model
  - Analysis is based on the ranks of failure times


• Results of Cox regression models are usually expressed as (log) hazard ratios
  - Indirect: not dealing directly with time
  - Can be hard to interpret: different effect on survival curves at high and low survival probabilities
  - Particularly difficult for interactions: ‘ratio of hazard ratios’
• Non-proportional hazards
  - Data with long-term follow-up typically have them
  - Modelling and interpretation may be complex


Example: Primary node-positive breast cancer

• GBSG trial BMFT-2
• 686 patients, 299 events for recurrence-free survival (RFS)
• Patients assigned to hormonal therapy (TAM) or not
• Visualise the effect of TAM on RFS
• Visualise interaction between TAM and ER (estrogen receptor status)


Traditional visualisation: Kaplan-Meier by TAM group

[Figure: Kaplan-Meier survival estimates, by hormone. S(t) from 0 to 1 against recurrence-free survival time (0 to 8 yr); one curve each for No TAM and TAM.]


Dot plot by TAM therapy: unhelpful with censored data

[Figure: dot plot of recurrence-free survival time (0 to 8 yr) by hormone therapy group (1 = No TAM, 2 = TAM); each observation is drawn as a 0 or 1 symbol.]


How better to visualise survival times?

• To make progress with visualisation, aim to impute the “missing” part of censored times
• Assume a parametric distribution of survival time
• Survival times are sometimes approximately lognormally distributed (Royston 2001a)
  - Can check by using modified Normal Q-Q plot
• If lognormal approximation is not good, can consider Box-Cox transformation of time
  - Or another transformation towards normality


Assessing lognormality: modified Normal Q-Q plot

• Simple transformation of Kaplan-Meier survival curve

[Figure: recurrence-free survival time (log scale, 0.2 to 10 yr) plotted against Normal equivalent deviates (-3 to 1).]
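A sketch of why this plot checks lognormality, assuming log time is normal with mean \mu and standard deviation \sigma (both symbols are notation introduced here, not taken from the slides). Under that assumption the survival function can be inverted to give a straight-line relationship:

```latex
S(t) = 1 - \Phi\!\left(\frac{\ln t - \mu}{\sigma}\right)
\quad\Longrightarrow\quad
\ln t = \mu + \sigma\,\Phi^{-1}\bigl(1 - S(t)\bigr)
```

Replacing S(t) by the Kaplan-Meier estimate at the observed event times gives points that should lie close to a line with intercept \mu and slope \sigma when the lognormal approximation is reasonable; marked curvature suggests a different transformation of time.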


Normal Q-Q plot by TAM group

[Figure: recurrence-free survival time (log scale, 0.2 to 10 yr) plotted against Normal equivalent deviates (-3 to 1), separately for No TAM and TAM.]


Visualisation of censored data using imputation

• Create m (≥ 1) copies of the data with censored survival times imputed
• Need an imputation model to reflect
  - Distribution of times (e.g. lognormal)
  - Effects of covariates (prognostic factors)
• Creating an imputation model:
  - Use mfp with cnreg (censored normal regression) to model possibly non-linear effects of covariates
  - E.g. mfp cnreg lnt x1 x2 x3 x4a x4b x5 x6 x7 hormone, censored(c) select(1) dfdefault(2)


Creating the imputed dataset(s)

• Can use the ice multiple imputation command to create the imputations
  - Royston (2004, 2005a, 2005b) Stata J
• ice varlist using filename[.dta] [if exp] [in range] [weight], [m(#) cmd(cmdlist) cycles(#) boot[(varlist)] seed(#) dryrun eq(eqlist) passive(passivelist) substitute(sublist) dropmissing interval(intlist) other_options]


Interval censoring with ice

• gen ll = lnt
• gen ul = cond(_d==1, lnt, ln(50))
  // chose upper limit of 50 years for RFS: can use . for +∞
• (generate FP transformations of x1, x5, x6)
• ice x1_1 x2 x3 x4a x4b x5_1 x6_1 x7 hormone ll ul lnt using imputed.dta, interval(lnt:ll ul) m(10)


How interval() works

[Figure: normal density on a standardised axis from -4 to 4, with markers labelled “Complete obs”, “Censored obs” and “Upper limit” and a shaded region.]

• Sample randomly from truncated normal distribution (shaded)
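The sampling step can be written out as an inverse-CDF draw. This is a sketch read off the uvis.ado fragment shown in this talk; the symbols are notation introduced here: \mu for the linear predictor (xb in the code), \sigma for the residual SD (rmsestar), and a, b for the lower and upper limits (ll, ul).

```latex
\Phi_A = \Phi\!\left(\frac{a - \mu}{\sigma}\right), \qquad
\Phi_B = \Phi\!\left(\frac{b - \mu}{\sigma}\right), \qquad
u \sim \mathrm{U}(0, 1)

y^{\mathrm{imp}} = \mu + \sigma\,\Phi^{-1}\bigl(\Phi_A + u\,(\Phi_B - \Phi_A)\bigr)
```

A missing lower limit corresponds to \Phi_A = 0 and a missing upper limit to \Phi_B = 1. Because \Phi_A + u(\Phi_B - \Phi_A) always lies between \Phi_A and \Phi_B, the imputed value falls between a and b.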


Code fragment from uvis.ado

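`cmd' `yvarlist' `xvars' `wgt', `options'
...
if "`cmd'"=="intreg" {
    * Normal CDF at the lower and upper limits (0 and 1 when a limit is missing)
    tempvar PhiA PhiB
    gen `PhiA' = cond(missing(`ll'), 0, norm((`ll'-`xb')/`rmsestar'))
    gen `PhiB' = cond(missing(`ul'), 1, norm((`ul'-`xb')/`rmsestar'))
    * Inverse-CDF draw from the normal distribution truncated to [ll, ul]
    replace `yimp' = `xb' + `rmsestar'*invnorm(`u'*(`PhiB'-`PhiA')+`PhiA')
}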
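The same idea can be tried outside ice on simulated data. The do-file below is a minimal sketch, not the uvis.ado implementation: all data are simulated, the variable names (x, d, lntrue, lncens, yimp and so on) are hypothetical, and it plugs in the intreg point estimates rather than drawing the parameters afresh for each imputation, which the starred name rmsestar in the fragment suggests ice does. It uses normal() and invnormal(), the current names for the norm() and invnorm() functions in the fragment.

```stata
* Sketch only: impute right-censored log times from a truncated normal.
* Simulated data; every variable name here is hypothetical.
clear
set seed 2006
set obs 500

* One covariate, true log survival time, and censoring between 1 and 8 years
generate double x = rnormal()
generate double lntrue = 1.5 + 0.5*x + rnormal()
generate double lncens = ln(1 + 7*runiform())
generate byte d = lntrue <= lncens
generate double lnt = min(lntrue, lncens)

* Limits as on the interval-censoring slide: exact times have ll = ul,
* censored times lie between lnt and ln(50)
generate double ll = lnt
generate double ul = cond(d == 1, lnt, ln(50))

* Imputation model: intreg treats ll < ul as interval-censored
intreg ll ul x
predict double xb, xb
scalar sigma = e(sigma)

* Inverse-CDF draw from N(xb, sigma^2) truncated to [ll, ul]
generate double PhiA = cond(missing(ll), 0, normal((ll - xb)/sigma))
generate double PhiB = cond(missing(ul), 1, normal((ul - xb)/sigma))
generate double u = runiform()
generate double yimp = lnt
replace yimp = xb + sigma*invnormal(PhiA + u*(PhiB - PhiA)) if d == 0

* Imputed log times for the censored observations
summarize yimp if d == 0
```

Running the last block repeatedly with fresh draws of u would give several imputed copies; exp(yimp) puts the times back on the year scale for plotting.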


Uses of the interval() option

• Impute right-, left- or interval-censored outcomes
  - Response variable in time-to-event studies
• Impute when a covariate is sometimes partly observed, sometimes complete
  - Some observations recorded exactly
  - Others known to be below or above a cutoff
  - E.g. D-dimer in DVT, PgR/ER in breast cancer
• Interval-censored covariates
  - Income in surveys recorded as ranges only


Breast cancer data: visualisation of time to recurrence

[Figure: dot plot of recurrence-free survival time (log scale, 0.05 to 50 yr) by imputation number (0 and 1); each observation is drawn as a 0 or 1 symbol.]


Visualisation: some plots using the first imputed sample

[Figure, left panel: recurrence-free survival time (log scale, 0.2 to 50 yr) by hormone therapy (No TAM, TAM).]

[Figure, right panel: recurrence-free survival time (log scale, 0.2 to 50 yr) against x5, number of positive lymph nodes (0 to 50).]


Visualisation: treatment by covariate interaction

[Figure: recurrence-free survival time (log scale, 0.2 to 50 yr) by TAM x ER status, in four groups: ER neg, no TAM; ER neg, TAM; ER pos, no TAM; ER pos, TAM.]


Limitations

• Imputed times to event are helpful for visualisation, but less so for analysis
  - Effectively, such imputations are extrapolations into the future
  - We don’t know the future distribution
  - Estimates of means, SDs, regression coefficients etc. are heavily dependent on the distributional assumptions
  - Potential for bias if assumed distribution is wrong
• Imputed times may be unrealistic
  - E.g. survival time 150 years!


Other approaches

• A reasonably large literature exists
• Buckley-James estimation (Buckley & James 1979)
  - Estimates the mean of the censored part
  - Not so good for visualisation
• Wei & Tanner (1991)
  - Two algorithms which give multiple imputations of the censored part
  - Relaxes the normality assumption: samples taken from the distribution of the residuals
• stpm (Royston 2001b, Royston & Parmar 2002)
  - More flexible distributions of survival time available


Imputation of survival data with missing covariate observations

• So far, have assumed covariates have complete data
• If covariates have missing data, need a suitable algorithm for multiple imputation of all missing values
  - e.g. MICE (ice)
• To reduce bias, must include the response (time-to-event) in the imputation model
  - How?
• “Standard” approach is to include (censored) log time and the censoring indicator in the imputation model
  - No theoretical justification
• May be better to
  - Include covariates as usual
  - Impute right-censored times using ice with interval() option
• Can also use imputed data for visualisation


Analysis of survival data with missing covariate observations

• Disregard the imputed times in the MI dataset
  - Except for visualisation purposes
• Use original time and censoring indicator
• Can analyse the MI dataset using
  - stcox (Cox regression)
  - streg (several models available)
  - stpm (flexible parametric survival models)
• micombine supports such models
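For reference, the standard way to pool results across m imputed datasets is Rubin's rules, which micombine is understood to apply. The notation is introduced here for illustration: \hat\beta_j is the estimate of a coefficient (for example a log hazard ratio from stcox) in imputed dataset j and U_j is its estimated variance.

```latex
\bar\beta = \frac{1}{m}\sum_{j=1}^{m}\hat\beta_j, \qquad
W = \frac{1}{m}\sum_{j=1}^{m} U_j, \qquad
B = \frac{1}{m-1}\sum_{j=1}^{m}\bigl(\hat\beta_j - \bar\beta\bigr)^2

T = W + \left(1 + \frac{1}{m}\right) B
```

The pooled estimate is \bar\beta with variance T, the within-imputation variance W inflated by the between-imputation variance B. The between-imputation term needs m ≥ 2.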


Conclusions

• Use of familiar graphical tools with imputed times to event can give greater insight into censored survival data
  - Scatter plots, smoothers, etc.
• Treatment or prognostic effects may be depressingly small when displayed as scatter plots of times
  - Much overlap between groups
  - Weak regression relationships
• Imputation of times may be helpful in multiple imputation with missing covariate values


Some references

• Buckley J, James I (1979) Linear regression with censored data. Biometrika 66: 429-436
• Faucett CL, Schenker N, Taylor JMG (2002) Survival analysis using auxiliary variables via multiple imputation, with application to AIDS clinical trial data. Biometrics 58: 37-47
• Ma SG (2006) Multiple augmentation with partial missing regressors. Biometrical Journal 48: 83-92
• Pan W (2000) A multiple imputation approach to Cox regression with interval-censored data. Biometrics 56: 199-203
• Royston P (2001a) The lognormal distribution as a model for survival time in cancer, with an emphasis on prognostic factors. Statistica Neerlandica 55: 89-104
• Royston P (2001b) Flexible alternatives to the Cox model, and more. The Stata Journal 1: 1-28
• Royston P, Parmar MKB (2002) Flexible proportional-hazards and proportional-odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects. Statistics in Medicine 21: 2175-2197
• Royston P (2004) Multiple imputation of missing values. Stata Journal 4: 227-241
• Royston P (2005a) Multiple imputation of missing values: update. Stata Journal 5: 188-201
• Royston P (2005b) Multiple imputation of missing values: update of ice. Stata Journal 5: 527-536
• Tanner MA, Wong WH (1987) The calculation of posterior distributions by data augmentation. JASA 82: 528-540 [Cited 642 times, WoS 10.9.2006]
• Wei GCG, Tanner MA (1990) Posterior computations for censored regression data. JASA 85: 829-839
• Wei GCG, Tanner MA (1991) Application of multiple imputation to the analysis of censored regression data. Biometrics 47: 1297-1309
