Springer Series in Statistics
Advisors: P. Bickel, P. Diggle, S. Fienberg, K. Krickeberg, I. Olkin, N. Wermuth, S. Zeger
Springer Science+Business Media, LLC
Andersen/Borgan/Gill/Keiding: Statistical Models Based on Counting Processes.
Atkinson/Riani: Robust Diagnostic Regression Analysis.
Berger: Statistical Decision Theory and Bayesian Analysis, 2nd edition.
Bolfarine/Zacks: Prediction Theory for Finite Populations.
Borg/Groenen: Modern Multidimensional Scaling: Theory and Applications.
Brockwell/Davis: Time Series: Theory and Methods, 2nd edition.
Chen/Shao/Ibrahim: Monte Carlo Methods in Bayesian Computation.
David/Edwards: Annotated Readings in the History of Statistics.
Devroye/Lugosi: Combinatorial Methods in Density Estimation.
Efromovich: Nonparametric Curve Estimation: Methods, Theory, and Applications.
Fahrmeir/Tutz: Multivariate Statistical Modelling Based on Generalized Linear Models, 2nd edition.
Farebrother: Fitting Linear Relationships: A History of the Calculus of Observations 1750-1900.
Federer: Statistical Design and Analysis for Intercropping Experiments, Volume I: Two Crops.
Federer: Statistical Design and Analysis for Intercropping Experiments, Volume II: Three or More Crops.
Fienberg/Hoaglin/Kruskal/Tanur (Eds.): A Statistical Model: Frederick Mosteller's Contributions to Statistics, Science and Public Policy.
Fisher/Sen: The Collected Works of Wassily Hoeffding.
Friedman et al.: The Elements of Statistical Learning: Data Mining, Inference and Prediction.
Glaz/Naus/Wallenstein: Scan Statistics.
Good: Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses, 2nd edition.
Gouriéroux: ARCH Models and Financial Applications.
Grandell: Aspects of Risk Theory.
Haberman: Advanced Statistics, Volume 1: Description of Populations.
Hall: The Bootstrap and Edgeworth Expansion.
Härdle: Smoothing Techniques: With Implementation in S.
Harrell: Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis.
Hart: Nonparametric Smoothing and Lack-of-Fit Tests.
Hartigan: Bayes Theory.
Hedayat/Sloane/Stufken: Orthogonal Arrays: Theory and Applications.
Heyde: Quasi-Likelihood and its Application: A General Approach to Optimal Parameter Estimation.
Huet/Bouvier/Gruet/Jolivet: Statistical Tools for Nonlinear Regression: A Practical Guide with S-PLUS Examples.
Ibrahim/Chen/Sinha: Bayesian Survival Analysis.
Kolen/Brennan: Test Equating: Methods and Practices.
Kotz/Johnson (Eds.): Breakthroughs in Statistics Volume I.
Kotz/Johnson (Eds.): Breakthroughs in Statistics Volume II.
Kotz/Johnson (Eds.): Breakthroughs in Statistics Volume III.
(continued after index)
Joseph G. Ibrahim Ming-Hui Chen Debajyoti Sinha
Bayesian Survival Analysis
With 51 Illustrations
Springer
Joseph G. Ibrahim
Department of Biostatistics
Harvard School of Public Health and Dana-Farber Cancer Institute
44 Binney Street
Boston, MA 02115
USA
[email protected]

Ming-Hui Chen
Department of Mathematical Sciences
Worcester Polytechnic Institute
100 Institute Road
Worcester, MA 01609-2280
USA
[email protected]

Debajyoti Sinha
Department of Biometry and Epidemiology
Medical University of South Carolina
135 Rutledge Ave
PO Box 250551
Charleston, SC 29425
USA
[email protected]
Library of Congress Cataloging-in-Publication Data
Ibrahim, Joseph George.
Bayesian survival analysis / Joseph G. Ibrahim, Ming-Hui Chen, Debajyoti Sinha.
p. cm. - (Springer series in statistics)
Includes bibliographical references and indexes.
ISBN 978-1-4419-2933-4 ISBN 978-1-4757-3447-8 (eBook)
DOI 10.1007/978-1-4757-3447-8
1. Failure time data analysis. 2. Bayesian statistical decision theory. I. Chen, Ming-Hui, 1961- II. Sinha, Debajyoti. III. Title. IV. Series.
QA276 .I27 2001
519.5'42-dc21 2001020443
Printed on acid-free paper.
© 2001 Springer Science+Business Media New York
Originally published by Springer-Verlag New York, Inc. in 2001
Softcover reprint of the hardcover 1st edition 2001
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher Springer Science+Business Media, LLC, except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.
Production managed by MaryAnn Brickner; manufacturing supervised by Erica Bresler. Photocomposed pages prepared from the authors' LaTeX2e files.
9 8 7 6 5 4 3 2 1
ISBN 978-1-4419-2933-4 SPIN 10833390
To Joseph G. Ibrahim's parents, Guirguis and Assinat Ibrahim
Ming-Hui Chen's parents, his wife, Lan, and his daughters, Victoria and Paula
Debajyoti Sinha's parents, Nemai Chand and Purabi Sinha, and his wife, Sebanti
Preface
Survival analysis arises in many fields of study including medicine, biology, engineering, public health, epidemiology, and economics. Recent advances in computing, software development such as BUGS, and practical methods for prior elicitation have made Bayesian survival analysis of complex models feasible for both practitioners and researchers. This book provides a comprehensive treatment of Bayesian survival analysis. Several topics are addressed, including parametric and semiparametric models, proportional and nonproportional hazards models, frailty models, cure rate models, model selection and comparison, joint models for longitudinal and survival data, models with time-varying covariates, missing covariate data, design and monitoring of clinical trials, accelerated failure time models, models for multivariate survival data, and special types of hierarchical survival models. We also consider various censoring schemes, including right and interval censored data. Several additional topics related to the Bayesian paradigm are discussed, including noninformative and informative prior specifications, computing posterior quantities of interest, Bayesian hypothesis testing, variable selection, model checking techniques using Bayesian diagnostic methods, and Markov chain Monte Carlo (MCMC) algorithms for sampling from the posterior and predictive distributions.
The book presents a balance between theory and applications, and for each of the models and topics mentioned above, we present detailed examples and analyses from case studies whenever possible. Moreover, we demonstrate the use of the statistical package BUGS for several of the models and methodologies discussed in this book. Theoretical and applied problems are given in the exercises at the end of each chapter. The book is
structured so that the methodology and applications are presented in the main body of each chapter and all rigorous proofs and derivations are placed in Appendices. This should enable a wide audience of readers to use the book without having to go through the technical details. Without compromising our main goal of presenting Bayesian methods for survival analysis, we have tried to acknowledge and briefly review the relevant frequentist methods. We compare the frequentist and Bayesian techniques whenever possible and discuss the advantages and disadvantages of Bayesian methods for each topic.
Several types of parametric and semiparametric models are examined. For the parametric models, we discuss the exponential, gamma, Weibull, log-normal, and extreme value regression models. For the semiparametric models, we discuss a wide variety of models based on prior processes for the cumulative baseline hazard, the baseline hazard, or the cumulative baseline distribution function. Specifically, we discuss the gamma process, beta process, Dirichlet process, and correlated gamma process. We also discuss frailty survival models that allow the survival times to be correlated between subjects, as well as multiple event time models where each subject has a vector of time-to-event variables. In addition, we examine parametric and semiparametric models for univariate survival data with a cure fraction (cure rate models) as well as multivariate cure rate models. Also, we discuss accelerated failure time models and flexible classes of hierarchical survival models based on neural networks. The applications are all essentially from the health sciences, including cancer, AIDS, and the environment.
The book is intended as a graduate textbook or a reference book for a one- or two-semester course at the advanced master's or Ph.D. level. The prerequisites include one course in statistical inference and Bayesian theory at the level of Casella and Berger (1990) and Box and Tiao (1992). The book can also be used after a course in Bayesian statistics using the books by Carlin and Louis (1996) or Gelman, Carlin, Stern, and Rubin (1995). This book focuses on an important subfield of application. It would be most suitable for second- or third-year graduate students in statistics or biostatistics. It would also serve as a useful reference book for applied or theoretical researchers as well as practitioners. Moreover, the book presents several open research problems that could serve as useful thesis topics.
We would like to acknowledge the following people, who gave us permission to use some of the contents from their work, including tables and figures: Elja Arjas, Brad Carlin, Paul Damien, Dipak Dey, Daria Gasbarra, Robert J. Gray, Paul Gustafson, Lynn Kuo, Sandra Lee, Bani Mallick, Nalini Ravishanker, Sujit Sahu, Daniel Sargent, Dongchu Sun, Jeremy Taylor, Bruce Turnbull, Helen Vlachos, Chris Volinsky, Steve Walker, and Marvin Zelen. Joseph Ibrahim would like to give deep and special thanks to Marvin Zelen for being his wonderful mentor and friend at Harvard, and to whom he feels greatly indebted. Ming-Hui Chen would like to give special thanks to his advisors James Berger and Bruce Schmeiser, who have
served as his wonderful mentors for the last ten years. Finally, we owe deep thanks to our parents and our families for their constant love, patience, understanding, and support. It is to them that we dedicate this book.
Joseph G. Ibrahim, Ming-Hui Chen, and Debajyoti Sinha
March 2001
Contents

Preface

1 Introduction
  1.1 Aims
  1.2 Outline
  1.3 Motivating Examples
  1.4 Survival Analysis
    1.4.1 Proportional Hazards Models
    1.4.2 Censoring
    1.4.3 Partial Likelihood
  1.5 The Bayesian Paradigm
  1.6 Sampling from the Posterior Distribution
  1.7 Informative Prior Elicitation
  1.8 Why Bayes?
  Exercises

2 Parametric Models
  2.1 Exponential Model
  2.2 Weibull Model
  2.3 Extreme Value Model
  2.4 Log-Normal Model
  2.5 Gamma Model
  Exercises

3 Semiparametric Models
  3.1 Piecewise Constant Hazard Model
  3.2 Models Using a Gamma Process
    3.2.1 Gamma Process on Cumulative Hazard
    3.2.2 Gamma Process with Grouped-Data Likelihood
    3.2.3 Relationship to Partial Likelihood
    3.2.4 Gamma Process on Baseline Hazard
  3.3 Prior Elicitation
    3.3.1 Approximation of the Prior
    3.3.2 Choices of Hyperparameters
    3.3.3 Sampling from the Joint Posterior Distribution of (β, δ, a_0)
  3.4 A Generalization of the Cox Model
  3.5 Beta Process Models
    3.5.1 Beta Process Priors
    3.5.2 Interval Censored Data
  3.6 Correlated Gamma Processes
  3.7 Dirichlet Process Models
    3.7.1 Dirichlet Process Prior
    3.7.2 Dirichlet Process in Survival Analysis
    3.7.3 Dirichlet Process with Doubly Censored Data
    3.7.4 Mixtures of Dirichlet Process Models
    3.7.5 Conjugate MDP Models
    3.7.6 Nonconjugate MDP Models
    3.7.7 MDP Priors with Censored Data
    3.7.8 Inclusion of Covariates
  Exercises

4 Frailty Models
  4.1 Proportional Hazards Model with Frailty
    4.1.1 Weibull Model with Gamma Frailties
    4.1.2 Gamma Process Prior for H_0(t)
    4.1.3 Piecewise Exponential Model for h_0(t)
    4.1.4 Positive Stable Frailties
    4.1.5 A Bayesian Model for Institutional Effects
    4.1.6 Posterior Likelihood Methods
    4.1.7 Methods Based on Partial Likelihood
  4.2 Multiple Event and Panel Count Data
  4.3 Multilevel Multivariate Survival Data
  4.4 Bivariate Measures of Dependence
  Exercises

5 Cure Rate Models
  5.1 Introduction
  5.2 Parametric Cure Rate Model
    5.2.1 Models
    5.2.2 Prior and Posterior Distributions
    5.2.3 Posterior Computation
  5.3 Semiparametric Cure Rate Model
  5.4 An Alternative Semiparametric Cure Rate Model
    5.4.1 Prior Distributions
  5.5 Multivariate Cure Rate Models
    5.5.1 Models
    5.5.2 The Likelihood Function
    5.5.3 The Prior and Posterior Distributions
    5.5.4 Computational Implementation
  Appendix
  Exercises

6 Model Comparison
  6.1 Posterior Model Probabilities
    6.1.1 Variable Selection in the Cox Model
    6.1.2 Prior Distribution on the Model Space
    6.1.3 Computing Prior and Posterior Model Probabilities
  6.2 Criterion-Based Methods
    6.2.1 The L Measure
    6.2.2 The Calibration Distribution
  6.3 Conditional Predictive Ordinate
  6.4 Bayesian Model Averaging
    6.4.1 BMA for Variable Selection in the Cox Model
    6.4.2 Identifying the Models in A'
    6.4.3 Assessment of Predictive Performance
  6.5 Bayesian Information Criterion
    6.5.1 Model Selection Using BIC
    6.5.2 Exponential Survival Model
    6.5.3 The Cox Proportional Hazards Model
  Exercises

7 Joint Models for Longitudinal and Survival Data
  7.1 Introduction
    7.1.1 Joint Modeling in AIDS Studies
    7.1.2 Joint Modeling in Cancer Vaccine Trials
    7.1.3 Joint Modeling in Health-Related Quality of Life Studies
  7.2 Methods for Joint Modeling of Longitudinal and Survival Data
    7.2.1 Partial Likelihood Models
    7.2.2 Joint Likelihood Models
    7.2.3 Mixture Models
  7.3 Bayesian Methods for Joint Modeling of Longitudinal and Survival Data
  Exercises

8 Missing Covariate Data
  8.1 Introduction
  8.2 The Cure Rate Model with Missing Covariate Data
  8.3 A General Class of Covariate Models
  8.4 The Prior and Posterior Distributions
  8.5 Model Checking
  Appendix
  Exercises

9 Design and Monitoring of Randomized Clinical Trials
  9.1 Group Sequential Log-Rank Tests for Survival Data
  9.2 Bayesian Approaches
    9.2.1 Range of Equivalence
    9.2.2 Prior Elicitation
    9.2.3 Predictions
    9.2.4 Checking Prior-Data Compatibility
  9.3 Bayesian Sample Size Determination
  9.4 Alternative Approaches to Sample Size Determination
  Exercises

10 Other Topics
  10.1 Proportional Hazards Models Built from Monotone Functions
    10.1.1 Likelihood Specification
    10.1.2 Prior Specification
    10.1.3 Time-Dependent Covariates
  10.2 Accelerated Failure Time Models
    10.2.1 MDP Prior for θ_i
    10.2.2 Polya Tree Prior for θ_i
  10.3 Bayesian Survival Analysis Using MARS
    10.3.1 The Bayesian Model
    10.3.2 Survival Analysis with Frailties
  10.4 Change Point Models
    10.4.1 Basic Assumptions and Model
    10.4.2 Extra Poisson Variation
    10.4.3 Lag Functions
    10.4.4 Recurrent Tumors
    10.4.5 Bayesian Inference
  10.5 The Poly-Weibull Model
    10.5.1 Likelihood and Priors
    10.5.2 Sampling the Posterior Distribution
  10.6 Flexible Hierarchical Survival Models
    10.6.1 Three Stages of the Hierarchical Model
    10.6.2 Implementation
  10.7 Bayesian Model Diagnostics
    10.7.1 Bayesian Latent Residuals
    10.7.2 Prequential Methods
  10.8 Future Research Topics
  Appendix
  Exercises

List of Distributions

References

Author Index

Subject Index