Page 1: Springer Series in Statistics978-1-4757-3447-8/1.pdf · ibrahim@jimmy.harvard.edu Debajyoti Sinha Department of Biometry and Epidemiology Medical Universtiy of South Carolina 135

Springer Series in Statistics

Advisors: P. Bickel, P. Diggle, S. Fienberg, K. Krickeberg, I. Olkin, N. Wermuth, S. Zeger

Springer Science+Business Media, LLC

Springer Series in Statistics

Andersen/Borgan/Gill/Keiding: Statistical Models Based on Counting Processes.
Atkinson/Riani: Robust Diagnostic Regression Analysis.
Berger: Statistical Decision Theory and Bayesian Analysis, 2nd edition.
Bolfarine/Zacks: Prediction Theory for Finite Populations.
Borg/Groenen: Modern Multidimensional Scaling: Theory and Applications.
Brockwell/Davis: Time Series: Theory and Methods, 2nd edition.
Chen/Shao/Ibrahim: Monte Carlo Methods in Bayesian Computation.
David/Edwards: Annotated Readings in the History of Statistics.
Devroye/Lugosi: Combinatorial Methods in Density Estimation.
Efromovich: Nonparametric Curve Estimation: Methods, Theory, and Applications.
Fahrmeir/Tutz: Multivariate Statistical Modelling Based on Generalized Linear Models, 2nd edition.
Farebrother: Fitting Linear Relationships: A History of the Calculus of Observations 1750-1900.
Federer: Statistical Design and Analysis for Intercropping Experiments, Volume I: Two Crops.
Federer: Statistical Design and Analysis for Intercropping Experiments, Volume II: Three or More Crops.
Fienberg/Hoaglin/Kruskal/Tanur (Eds.): A Statistical Model: Frederick Mosteller's Contributions to Statistics, Science and Public Policy.
Fisher/Sen: The Collected Works of Wassily Hoeffding.
Friedman et al.: The Elements of Statistical Learning: Data Mining, Inference and Prediction.
Glaz/Naus/Wallenstein: Scan Statistics.
Good: Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses, 2nd edition.
Gouriéroux: ARCH Models and Financial Applications.
Grandell: Aspects of Risk Theory.
Haberman: Advanced Statistics, Volume I: Description of Populations.
Hall: The Bootstrap and Edgeworth Expansion.
Härdle: Smoothing Techniques: With Implementation in S.
Harrell: Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis.
Hart: Nonparametric Smoothing and Lack-of-Fit Tests.
Hartigan: Bayes Theory.
Hedayat/Sloane/Stufken: Orthogonal Arrays: Theory and Applications.
Heyde: Quasi-Likelihood and its Application: A General Approach to Optimal Parameter Estimation.
Huet/Bouvier/Gruet/Jolivet: Statistical Tools for Nonlinear Regression: A Practical Guide with S-PLUS Examples.
Ibrahim/Chen/Sinha: Bayesian Survival Analysis.
Kolen/Brennan: Test Equating: Methods and Practices.
Kotz/Johnson (Eds.): Breakthroughs in Statistics Volume I.
Kotz/Johnson (Eds.): Breakthroughs in Statistics Volume II.
Kotz/Johnson (Eds.): Breakthroughs in Statistics Volume III.

(continued after index)

Joseph G. Ibrahim Ming-Hui Chen Debajyoti Sinha

Bayesian Survival Analysis

With 51 Illustrations

Springer

Joseph G. Ibrahim
Department of Biostatistics
Harvard School of Public Health and Dana-Farber Cancer Institute
44 Binney Street
Boston, MA 02115
USA
ibrahim@jimmy.harvard.edu

Debajyoti Sinha
Department of Biometry and Epidemiology
Medical University of South Carolina
135 Rutledge Ave
PO Box 250551
Charleston, SC 29425
USA
[email protected]

Ming-Hui Chen
Department of Mathematical Sciences
Worcester Polytechnic Institute
100 Institute Road
Worcester, MA 01609-2280
USA
[email protected]

Library of Congress Cataloging-in-Publication Data
Ibrahim, Joseph George.
Bayesian survival analysis / Joseph G. Ibrahim, Ming-Hui Chen, Debajyoti Sinha.
p. cm. - (Springer series in statistics)
Includes bibliographical references and indexes.
ISBN 978-1-4419-2933-4
ISBN 978-1-4757-3447-8 (eBook)
DOI 10.1007/978-1-4757-3447-8
1. Failure time data analysis. 2. Bayesian statistical decision theory. I. Chen, Ming-Hui, 1961- II. Sinha, Debajyoti. III. Title. IV. Series.
QA276 .127 2001
519.5'42-dc21 2001020443

Printed on acid-free paper.

© 2001 Springer Science+Business Media New York
Originally published by Springer-Verlag New York, Inc. in 2001
Softcover reprint of the hardcover 1st edition 2001

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher Springer Science+Business Media, LLC, except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.

Production managed by MaryAnn Brickner; manufacturing supervised by Erica Bresler. Photocomposed pages prepared from the authors' LaTeX2e files.

9 8 7 6 5 4 3 2 1

ISBN 978-1-4419-2933-4 SPIN 10833390

To Joseph G. Ibrahim's parents, Guirguis and Assinat Ibrahim

Ming-Hui Chen's parents, his wife, Lan, and his daughters, Victoria and Paula

Debajyoti Sinha's parents, Nemai Chand and Purabi Sinha, and his wife, Sebanti

Preface

Survival analysis arises in many fields of study including medicine, biology, engineering, public health, epidemiology, and economics. Recent advances in computing, software development such as BUGS, and practical methods for prior elicitation have made Bayesian survival analysis of complex models feasible for both practitioners and researchers. This book provides a comprehensive treatment of Bayesian survival analysis. Several topics are addressed, including parametric and semiparametric models, proportional and nonproportional hazards models, frailty models, cure rate models, model selection and comparison, joint models for longitudinal and survival data, models with time-varying covariates, missing covariate data, design and monitoring of clinical trials, accelerated failure time models, models for multivariate survival data, and special types of hierarchical survival models. We also consider various censoring schemes, including right and interval censored data. Several additional topics related to the Bayesian paradigm are discussed, including noninformative and informative prior specifications, computing posterior quantities of interest, Bayesian hypothesis testing, variable selection, model checking techniques using Bayesian diagnostic methods, and Markov chain Monte Carlo (MCMC) algorithms for sampling from the posterior and predictive distributions.

The book presents a balance between theory and applications, and for each of the models and topics mentioned above, we present detailed examples and analyses from case studies whenever possible. Moreover, we demonstrate the use of the statistical package BUGS for several of the models and methodologies discussed in this book. Theoretical and applied problems are given in the exercises at the end of each chapter. The book is structured so that the methodology and applications are presented in the main body of each chapter and all rigorous proofs and derivations are placed in Appendices. This should enable a wide audience of readers to use the book without having to go through the technical details. Without compromising our main goal of presenting Bayesian methods for survival analysis, we have tried to acknowledge and briefly review the relevant frequentist methods. We compare the frequentist and Bayesian techniques whenever possible and discuss the advantages and disadvantages of Bayesian methods for each topic.

Several types of parametric and semiparametric models are examined. For the parametric models, we discuss the exponential, gamma, Weibull, log-normal, and extreme value regression models. For the semiparametric models, we discuss a wide variety of models based on prior processes for the cumulative baseline hazard, the baseline hazard, or the cumulative baseline distribution function. Specifically, we discuss the gamma process, beta process, Dirichlet process, and correlated gamma process. We also discuss frailty survival models that allow the survival times to be correlated between subjects, as well as multiple event time models where each subject has a vector of time-to-event variables. In addition, we examine parametric and semiparametric models for univariate survival data with a cure fraction (cure rate models) as well as multivariate cure rate models. Also, we discuss accelerated failure time models and flexible classes of hierarchical survival models based on neural networks. The applications are all essentially from the health sciences, including cancer, AIDS, and the environment.

The book is intended as a graduate textbook or a reference book for a one- or two-semester course at the advanced master's or Ph.D. level. The prerequisites include one course in statistical inference and Bayesian theory at the level of Casella and Berger (1990) and Box and Tiao (1992). The book can also be used after a course in Bayesian statistics using the books by Carlin and Louis (1996) or Gelman, Carlin, Stern, and Rubin (1995). This book focuses on an important subfield of application. It would be most suitable for second- or third-year graduate students in statistics or biostatistics. It would also serve as a useful reference book for applied or theoretical researchers as well as practitioners. Moreover, the book presents several open research problems that could serve as useful thesis topics.

We would like to acknowledge the following people, who gave us permission to use some of the contents from their work, including tables and figures: Elja Arjas, Brad Carlin, Paul Damien, Dipak Dey, Daria Gasbarra, Robert J. Gray, Paul Gustafson, Lynn Kuo, Sandra Lee, Bani Mallick, Nalini Ravishanker, Sujit Sahu, Daniel Sargent, Dongchu Sun, Jeremy Taylor, Bruce Turnbull, Helen Vlachos, Chris Volinsky, Steve Walker, and Marvin Zelen. Joseph Ibrahim would like to give deep and special thanks to Marvin Zelen for being his wonderful mentor and friend at Harvard, and to whom he feels greatly indebted. Ming-Hui Chen would like to give special thanks to his advisors James Berger and Bruce Schmeiser, who have served as his wonderful mentors for the last ten years. Finally, we owe deep thanks to our parents and our families for their constant love, patience, understanding, and support. It is to them that we dedicate this book.

Joseph G. Ibrahim, Ming-Hui Chen, and Debajyoti Sinha
March 2001

Contents

Preface

1 Introduction
  1.1 Aims
  1.2 Outline
  1.3 Motivating Examples
  1.4 Survival Analysis
    1.4.1 Proportional Hazards Models
    1.4.2 Censoring
    1.4.3 Partial Likelihood
  1.5 The Bayesian Paradigm
  1.6 Sampling from the Posterior Distribution
  1.7 Informative Prior Elicitation
  1.8 Why Bayes?
  Exercises

2 Parametric Models
  2.1 Exponential Model
  2.2 Weibull Model
  2.3 Extreme Value Model
  2.4 Log-Normal Model
  2.5 Gamma Model
  Exercises

3 Semiparametric Models
  3.1 Piecewise Constant Hazard Model
  3.2 Models Using a Gamma Process
    3.2.1 Gamma Process on Cumulative Hazard
    3.2.2 Gamma Process with Grouped-Data Likelihood
    3.2.3 Relationship to Partial Likelihood
    3.2.4 Gamma Process on Baseline Hazard
  3.3 Prior Elicitation
    3.3.1 Approximation of the Prior
    3.3.2 Choices of Hyperparameters
    3.3.3 Sampling from the Joint Posterior Distribution of (β, Δ, a₀)
  3.4 A Generalization of the Cox Model
  3.5 Beta Process Models
    3.5.1 Beta Process Priors
    3.5.2 Interval Censored Data
  3.6 Correlated Gamma Processes
  3.7 Dirichlet Process Models
    3.7.1 Dirichlet Process Prior
    3.7.2 Dirichlet Process in Survival Analysis
    3.7.3 Dirichlet Process with Doubly Censored Data
    3.7.4 Mixtures of Dirichlet Process Models
    3.7.5 Conjugate MDP Models
    3.7.6 Nonconjugate MDP Models
    3.7.7 MDP Priors with Censored Data
    3.7.8 Inclusion of Covariates
  Exercises

4 Frailty Models
  4.1 Proportional Hazards Model with Frailty
    4.1.1 Weibull Model with Gamma Frailties
    4.1.2 Gamma Process Prior for H₀(t)
    4.1.3 Piecewise Exponential Model for h₀(t)
    4.1.4 Positive Stable Frailties
    4.1.5 A Bayesian Model for Institutional Effects
    4.1.6 Posterior Likelihood Methods
    4.1.7 Methods Based on Partial Likelihood
  4.2 Multiple Event and Panel Count Data
  4.3 Multilevel Multivariate Survival Data
  4.4 Bivariate Measures of Dependence
  Exercises

5 Cure Rate Models
  5.1 Introduction
  5.2 Parametric Cure Rate Model
    5.2.1 Models
    5.2.2 Prior and Posterior Distributions
    5.2.3 Posterior Computation
  5.3 Semiparametric Cure Rate Model
  5.4 An Alternative Semiparametric Cure Rate Model
    5.4.1 Prior Distributions
  5.5 Multivariate Cure Rate Models
    5.5.1 Models
    5.5.2 The Likelihood Function
    5.5.3 The Prior and Posterior Distributions
    5.5.4 Computational Implementation
  Appendix
  Exercises

6 Model Comparison
  6.1 Posterior Model Probabilities
    6.1.1 Variable Selection in the Cox Model
    6.1.2 Prior Distribution on the Model Space
    6.1.3 Computing Prior and Posterior Model Probabilities
  6.2 Criterion-Based Methods
    6.2.1 The L Measure
    6.2.2 The Calibration Distribution
  6.3 Conditional Predictive Ordinate
  6.4 Bayesian Model Averaging
    6.4.1 BMA for Variable Selection in the Cox Model
    6.4.2 Identifying the Models in A'
    6.4.3 Assessment of Predictive Performance
  6.5 Bayesian Information Criterion
    6.5.1 Model Selection Using BIC
    6.5.2 Exponential Survival Model
    6.5.3 The Cox Proportional Hazards Model
  Exercises

7 Joint Models for Longitudinal and Survival Data
  7.1 Introduction
    7.1.1 Joint Modeling in AIDS Studies
    7.1.2 Joint Modeling in Cancer Vaccine Trials
    7.1.3 Joint Modeling in Health-Related Quality of Life Studies
  7.2 Methods for Joint Modeling of Longitudinal and Survival Data
    7.2.1 Partial Likelihood Models
    7.2.2 Joint Likelihood Models
    7.2.3 Mixture Models
  7.3 Bayesian Methods for Joint Modeling of Longitudinal and Survival Data
  Exercises

8 Missing Covariate Data
  8.1 Introduction
  8.2 The Cure Rate Model with Missing Covariate Data
  8.3 A General Class of Covariate Models
  8.4 The Prior and Posterior Distributions
  8.5 Model Checking
  Appendix
  Exercises

9 Design and Monitoring of Randomized Clinical Trials
  9.1 Group Sequential Log-Rank Tests for Survival Data
  9.2 Bayesian Approaches
    9.2.1 Range of Equivalence
    9.2.2 Prior Elicitation
    9.2.3 Predictions
    9.2.4 Checking Prior-Data Compatibility
  9.3 Bayesian Sample Size Determination
  9.4 Alternative Approaches to Sample Size Determination
  Exercises

10 Other Topics
  10.1 Proportional Hazards Models Built from Monotone Functions
    10.1.1 Likelihood Specification
    10.1.2 Prior Specification
    10.1.3 Time-Dependent Covariates
  10.2 Accelerated Failure Time Models
    10.2.1 MDP Prior for θᵢ
    10.2.2 Polya Tree Prior for θᵢ
  10.3 Bayesian Survival Analysis Using MARS
    10.3.1 The Bayesian Model
    10.3.2 Survival Analysis with Frailties
  10.4 Change Point Models
    10.4.1 Basic Assumptions and Model
    10.4.2 Extra Poisson Variation
    10.4.3 Lag Functions
    10.4.4 Recurrent Tumors
    10.4.5 Bayesian Inference
  10.5 The Poly-Weibull Model
    10.5.1 Likelihood and Priors
    10.5.2 Sampling the Posterior Distribution
  10.6 Flexible Hierarchical Survival Models
    10.6.1 Three Stages of the Hierarchical Model
    10.6.2 Implementation
  10.7 Bayesian Model Diagnostics
    10.7.1 Bayesian Latent Residuals
    10.7.2 Prequential Methods
  10.8 Future Research Topics
  Appendix
  Exercises

List of Distributions

References

Author Index

Subject Index