Bayesian Hierarchical Models in

Statistical Quality Control Methods to

Improve Healthcare in Hospitals

Hassan Assareh

Bachelor of Industrial Engineering/Master of Socio-Economic Systems Engineering

Bachelor of Statistics/Master of Human Resource Management

Iran University of Science and Technology/Azad Science and Research University

Karaj Payame Noor University/Tehran Payame Noor University

A thesis submitted in partial fulfilment of the requirements for the degree of

Doctor of Philosophy

June 2012

Principal Supervisor: Prof. Kerrie Mengersen

Associate Supervisor: Dr. Helen Johnson

Queensland University of Technology

School of Mathematical Sciences

Science and Engineering Faculty

Brisbane, Queensland, 4001, AUSTRALIA

© Copyright by Hassan Assareh 2012

All Rights Reserved

Dedicated to my beloved wife and daughter

Arezoo & Elina

Keywords

Acceptance sampling, Angioplasty, Bayesian method, Bayesian hierarchical model, Bernoulli process, Cardiac surgery, Censored data, Change point, Clinical database, Control chart, Data collection, Data quality, Healthcare surveillance, Hospital outcome, Inspection cost, Intensive care unit, Linear trend, Logistic regression, Markov chain Monte Carlo, Modification cost, Multiple change, Multivariate control chart, Optimization, Patient mix, Poisson process, Quality improvement, Reversible jump Markov chain Monte Carlo, Risk-adjustment, Risk model, Root causes analysis, Sample size determination, Statistical process control, Step change, Survival time, Utility, Value of information

Abstract

Quality-oriented management systems and methods have become the dominant business and governance paradigm. From this perspective, satisfying customers’ expectations by supplying reliable, good quality products and services is the key factor for an organization and even a government. During recent decades, Statistical Quality Control (SQC) methods have been developed as the technical core of quality management and the continuous improvement philosophy, and are now applied widely to improve the quality of products and services in the industrial and business sectors. Recently SQC tools, in particular quality control charts, have been used in healthcare surveillance. In some cases, these tools have been modified and developed to better suit the characteristics and needs of the health sector. It seems that some of the work in the healthcare area has evolved independently of the development of industrial statistical process control methods. Therefore, analysing and comparing the paradigms and characteristics of quality control charts and techniques across the different sectors presents opportunities for transferring knowledge and for future development in each sector. Meanwhile, the capabilities of the Bayesian approach, particularly Bayesian hierarchical models and computational techniques in which all uncertainty is expressed as a structure of probability, facilitate decision making and cost-effectiveness analyses.

Therefore, this research investigates the use of the quality improvement cycle in a health setting using clinical data from a hospital. The need for clinical data for monitoring purposes is investigated in two respects. First, a framework and appropriate tools from the industrial context are proposed and applied to evaluate and improve data quality in available datasets and data flow. Second, a data capturing algorithm using Bayesian decision making methods is developed to determine an economical sample size for statistical analyses within the quality improvement cycle.

Once clinical data quality has been ensured, some characteristics of control charts in the health context are considered, including the necessity of monitoring attribute data and correlated quality characteristics. To this end, multivariate control charts from an industrial context are adapted to monitor radiation delivered to patients undergoing diagnostic coronary angiogram, and various risk-adjusted control charts are constructed and investigated for monitoring binary outcomes of clinical interventions as well as post-intervention survival time.

Meanwhile, adoption of a Bayesian approach is proposed as a new framework for estimating the change point following a control chart’s signal. This estimate aims to facilitate root cause analysis efforts in the quality improvement cycle, since it narrows the search for the potential causes of detected changes to a tighter time-frame prior to the signal. Because the results are obtained as probability distributions, this approach yields highly informative estimates of the change point parameters.

Using Bayesian hierarchical models and Markov chain Monte Carlo computational methods, Bayesian estimators of the time and the magnitude of various change scenarios, including step change, linear trend and multiple change in a Poisson process, are developed and investigated.
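
To make the step change scenario concrete, the following is a minimal Python sketch. It is not the hierarchical Markov chain Monte Carlo models developed in this thesis. It assumes a known in-control rate `lambda0`, a single step change, a uniform prior on the change time and a conjugate Gamma prior on the out-of-control rate, so that the posterior of the change time can be computed in closed form. The function name `change_point_posterior`, the prior parameters `a` and `b`, and the toy counts are illustrative assumptions.

```python
import math


def change_point_posterior(counts, lambda0, a=1.0, b=0.05):
    """Posterior P(tau | counts) for a single step change in a Poisson rate.

    Assumed model (illustrative only):
      observations 1..tau     ~ Poisson(lambda0), lambda0 known
      observations tau+1..T   ~ Poisson(lambda1), lambda1 ~ Gamma(shape=a, rate=b)
      tau                     ~ Uniform{1, ..., T-1}
    lambda1 is integrated out analytically (Gamma-Poisson conjugacy).
    """
    T = len(counts)
    total = sum(counts)
    log_post = []
    pre_sum = 0
    for tau in range(1, T):
        pre_sum += counts[tau - 1]
        post_sum = total - pre_sum
        n_post = T - tau
        # In-control segment: Poisson(lambda0) log-likelihood.
        # The log(x!) terms are the same for every tau and are dropped.
        pre = pre_sum * math.log(lambda0) - tau * lambda0
        # Out-of-control segment: log marginal likelihood under the Gamma prior,
        # again up to terms that do not depend on tau.
        post = math.lgamma(a + post_sum) - (a + post_sum) * math.log(b + n_post)
        log_post.append(pre + post)
    # Normalise on the log scale for numerical stability.
    m = max(log_post)
    weights = [math.exp(v - m) for v in log_post]
    z = sum(weights)
    return {tau: w / z for tau, w in zip(range(1, T), weights)}


# Toy data: 50 in-control counts followed by 20 counts at a higher level.
counts = [20] * 50 + [30] * 20
posterior = change_point_posterior(counts, lambda0=20.0)
tau_hat = max(posterior, key=posterior.get)  # posterior mode of the change time
print(tau_hat, round(posterior[tau_hat], 3))
```

Returning the whole posterior over the change time, rather than only a point estimate, is what permits probability statements about when the shift occurred.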

The benefits of change point investigation are revisited and promoted in monitoring hospital outcomes, where the developed Bayesian estimator reports the true time of the shifts, compared with a priori known causes, detected by control charts monitoring the rate of excess usage of blood products and major adverse events during and after cardiac surgery in a local hospital.

The development of the Bayesian change point estimators is then pursued in healthcare surveillance for processes in which pre-intervention characteristics of patients affect the outcomes. In this setting, the Bayesian estimator is first extended to capture the patient mix (covariates) through the risk models underlying risk-adjusted control charts. Variations of the estimator are developed to estimate the true time of step changes and linear trends in the odds ratio of intensive care unit outcomes in a local hospital. Secondly, the Bayesian estimator is extended to identify the time of a shift in mean survival time after a clinical intervention that is being monitored by risk-adjusted survival time control charts. In this context, the survival time after a clinical intervention is also affected by patient mix, and the survival function is constructed using a survival prediction model.

The simulation studies undertaken in each research component, together with the empirical results, indicate that the developed Bayesian estimators are a strong alternative for change point estimation within the quality improvement cycle in healthcare surveillance. The advantages of the proposed Bayesian framework and estimators become more apparent when the probability quantification, flexibility and generalizability of the developed models are also considered. These advantages, seen in the general context of quality control, may also extend to the industrial and business domains where quality monitoring was initially developed.

List of Publications

This thesis comprises 11 papers that have been published, accepted or submitted for publication. They are listed below:

Chapter 3 : Assareh, H., Waterhouse, M. A., Moser, C., Brighouse, R. D., Foster, K.

A., Smith, I. R. and Mengersen, K. (2011) Data quality improvement in clinical

databases using statistical quality control: review and case study, Drug Informa-

tion Journal, in press.

Chapter 4 : Assareh, H., Waterhouse, M. A., Brighouse, R. D., Foster, K. A., Smith,

I. R. and Mengersen, K. An economical sample size determination algorithm for

clinical data statistical analysis, IIE Transactions on Healthcare Systems Engi-

neering, submitted.

Chapter 5 : Waterhouse, M. A., Smith, I. R., Assareh, H., and Mengersen, K. (2010)

Implementation of multivariate control charts in a clinical setting, International

Journal for Quality in Health Care, 22 (5): 408-414.

Chapter 6 : Assareh, H., Noorossana, R. and Mengersen, K. (2011) Bayesian change

point detection in monitoring cardiac surgery outcomes, Computer and Industrial

Engineering, submitted.

Chapter 7 : Assareh, H. and Mengersen, K. (2011) Bayesian multiple change point estimation of Poisson rates in control charts, IIE Transactions, submitted.

Chapter 8 : Assareh, H., Smith, I. and Mengersen, K. (2011) Bayesian change point

detection in monitoring cardiac surgery outcomes, Quality Management in Health

Care, 20(3): 227-232.

Chapter 9 : Assareh, H., Smith, I. and Mengersen, K. (2011) Change point estimation

in risk-adjusted control charts, Statistical Methods in Medical Research, in press.

Chapter 10 : Assareh, H., Smith, I. and Mengersen, K. (2011) Bayesian estimation of the time of a linear trend in risk-adjusted control charts, IAENG International Journal of Computer Science, 38 (4): 409–417.

Chapter 11 : Assareh, H. and Mengersen, K. (2011) Bayesian estimation of the time of a decrease in risk-adjusted survival time control charts, IAENG International Journal of Applied Mathematics, 41 (4): 360–366.

Chapter 12 : Assareh, H. and Mengersen, K. (2011) Change point estimation in moni-

toring survival time, PLOS One, under revision.

Chapter 13 : Assareh, H. and Mengersen, K. (2011) Estimation of the time of a linear

trend in monitoring survival time, under preparation.

Acknowledgements

First of all I would like to acknowledge my principal supervisor, Prof. Kerrie Mengersen, for her inspiration, guidance, friendship and encouragement throughout my PhD experience. Kerrie, I am privileged to have had the opportunity to work with you. Being part of your team has been interesting and rewarding. Your wide-ranging knowledge, tremendous expertise, endless and invaluable support and insightful suggestions have helped me through all of the research and personal difficulties faced during my PhD. The trust and confidence you have shown in me, and your motivation and passion for this research, are unforgettable and deeply appreciated. This short acknowledgment is insufficient to express my gratitude.

I would like to thank Dr. Mary Waterhouse, Ian Smith, Russell Brighouse and Dr. Kelley Foster from St Andrew’s Medical Institute, Brisbane, Australia, who have supported my study by sharing data, thoughts and comments and enhanced this research through their valuable contributions. I offer my sincere thanks to all who contributed to the preparation of clinical data at St Andrew’s War Memorial Hospital, Brisbane, Australia.

I would like to acknowledge my supervisor, the discipline of Mathematical Sciences, QUT, and St Andrew’s Medical Institute for their financial support, which made this research possible.

I am grateful to my friends and colleagues at QUT and the BRAG group for their ongoing support and companionship. Outside of Brisbane, I thank all of my amazing friends in Sydney for everything they have done to help me and to support my wife in my absence during my PhD.

No words can express the depth of my gratitude and love for my parents who have

always warmly supported me and encouraged me to do whatever I believe I can do.

Huge thanks to my beloved wife Arezoo, who has offered me a tremendous amount of encouragement, love and understanding and endured my absence during my PhD journey; and finally, thanks to Elina, my lovely newborn daughter, who brought joy and happiness to our life.

Contents

Keywords v

Abstract vii

List of Publications xi

Acknowledgements xiii

1 Introduction 1

1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Research Aim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.3 Research Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.3.1 Objective 1: Dataset Quality Evaluation . . . . . . . . . . . . . . 4

1.3.2 Objective 2: Control Charts Development and Application . . . 5

1.4 Research Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.4.1 Contribution to Application . . . . . . . . . . . . . . . . . . . . . 6

1.4.2 Contribution to Method . . . . . . . . . . . . . . . . . . . . . . . 6

1.5 Research Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.6 Thesis Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.7 Thesis Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

2 Literature Review 11

2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2.2 Statistical Quality Control . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2.2.1 Quality in Clinical Datasets . . . . . . . . . . . . . . . . . . . . . 12

2.2.2 Statistical Process Control . . . . . . . . . . . . . . . . . . . . . . 18

2.2.3 Quality Control Charts . . . . . . . . . . . . . . . . . . . . . . . 19

2.2.4 Control Charts in Healthcare . . . . . . . . . . . . . . . . . . . . 27

2.2.5 Change Point Estimation in Control Charting . . . . . . . . . . . 37

2.3 Bayesian Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

2.3.1 Bayesian Computation . . . . . . . . . . . . . . . . . . . . . . . . 41

2.3.2 Bayesian Change Point Estimation . . . . . . . . . . . . . . . . . 44

2.4 Bayesian Quality Control . . . . . . . . . . . . . . . . . . . . . . . . . . 45

2.4.1 Optimal Control Policy . . . . . . . . . . . . . . . . . . . . . . . 46

2.4.2 Inferences and Estimating . . . . . . . . . . . . . . . . . . . . . . 47

2.4.3 Bayesian Control Chart . . . . . . . . . . . . . . . . . . . . . . . 50

2.4.4 Other Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 50

2.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

3 Data Quality Improvement in Clinical Databases 65

3.1 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

3.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

3.3 Data Quality and Sampling Definitions . . . . . . . . . . . . . . . . . . . 72

3.4 Acceptance Sampling Plans . . . . . . . . . . . . . . . . . . . . . . . . . 73

3.4.1 Sampling Plans . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

3.4.2 Case Study: ICU data . . . . . . . . . . . . . . . . . . . . . . . . 75

3.5 Statistical Process Control . . . . . . . . . . . . . . . . . . . . . . . . . . 77

3.5.1 Quality Control Charts . . . . . . . . . . . . . . . . . . . . . . . 79

3.5.2 Case Study: Radiation Metrics Data Collection . . . . . . . . . . 83

3.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

3.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

4 An Economical Sample Size Determination Algorithm 95

4.1 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

4.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

4.3 Theoretical Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

4.4 General Data Capturing Algorithm . . . . . . . . . . . . . . . . . . . . . 101

4.4.1 Phase 1: Prediction for Bj . . . . . . . . . . . . . . . . . . . . . 103

4.4.2 Phase 2: Estimation for Bj . . . . . . . . . . . . . . . . . . . . . 106

4.4.3 Phase 3: Prediction for Bj+1 . . . . . . . . . . . . . . . . . . . . 108

4.5 Customized Algorithm for Risk Model Construction . . . . . . . . . . . 108

4.5.1 Assumptions and Definitions . . . . . . . . . . . . . . . . . . . . 108

4.5.2 Utility Loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

4.6 Case Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

4.6.1 Utility Function Construction . . . . . . . . . . . . . . . . . . . . 116

4.6.2 Algorithm Iterations . . . . . . . . . . . . . . . . . . . . . . . . . 124

4.6.3 Algorithm Termination . . . . . . . . . . . . . . . . . . . . . . . 125

4.7 Algorithm Development and Extension . . . . . . . . . . . . . . . . . . . 128

4.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130

5 Implementation of Multivariate Control Charts 135

5.1 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138

5.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138

5.3 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139

5.3.1 Description of case study data . . . . . . . . . . . . . . . . . . . 139

5.3.2 A general framework for multivariate monitoring . . . . . . . . . 140

5.3.3 Control chart construction . . . . . . . . . . . . . . . . . . . . . . 142

5.3.4 Outline of simulation study . . . . . . . . . . . . . . . . . . . . . 143

5.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

5.4.1 Case study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

5.4.2 Simulation Study . . . . . . . . . . . . . . . . . . . . . . . . . . . 146

5.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148

6 Change Point Estimation in Poisson Control Charts 153

6.1 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157

6.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157

6.3 Bayesian Poisson Process Step Change Model . . . . . . . . . . . . . . . 160

6.3.1 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160

6.3.2 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161

6.3.3 Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . . 163

6.4 Bayesian Poisson Process Linear Trend Change Model . . . . . . . . . . 168

6.4.1 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168

6.4.2 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169


6.5 Bayesian Poisson Process Multiple Change Model . . . . . . . . . . . . . 172

6.5.1 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172

6.5.2 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173


6.6 Comparative Performance and Model Selection . . . . . . . . . . . . . . 176

6.7 Comparison of Bayesian Estimator with other Methods . . . . . . . . . 177

6.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181

7 Multiple Change Point in Poisson Control Charts 187

7.1 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190

7.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190

7.3 Bayesian Multiple Change Point Model and RJMCMC Steps . . . . . . 193

7.3.1 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193

7.3.2 Parameter Estimation . . . . . . . . . . . . . . . . . . . . . . . . 195

7.3.3 Birth and Death of a Change Point . . . . . . . . . . . . . . . . . 196

7.3.4 Proposal Distributions . . . . . . . . . . . . . . . . . . . . . . . . 197

7.4 Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198

7.4.1 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198

7.4.2 One Change Point . . . . . . . . . . . . . . . . . . . . . . . . . . 199

7.4.3 Two change points . . . . . . . . . . . . . . . . . . . . . . . . . . 203

7.4.4 Three change points . . . . . . . . . . . . . . . . . . . . . . . . . 208

7.5 Comparison of Bayesian Estimator with Other Methods . . . . . . . . . 212

7.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215

8 Change Point Detection in Cardiac Surgery Outcomes 219

8.1 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222

8.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222

8.3 Cardiac Surgery Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224

8.3.1 Data Description . . . . . . . . . . . . . . . . . . . . . . . . . . . 224

8.3.2 Process Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . 225

8.3.3 Change Point Detection . . . . . . . . . . . . . . . . . . . . . . . 228

8.4 Angioplasty Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235

8.4.1 Data Description . . . . . . . . . . . . . . . . . . . . . . . . . . . 235

8.4.2 Process Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . 236

8.4.3 Change Point Detection . . . . . . . . . . . . . . . . . . . . . . . 239

8.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242

9 Change Point Estimation in Risk-Adjusted Charts 249

9.1 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253

9.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253

9.3 Risk-Adjusted Control Charts . . . . . . . . . . . . . . . . . . . . . . . . 256

9.4 Change Point Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258

9.5 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262


9.7 Comparative Performance and Model Selection . . . . . . . . . . . . . . 275


9.9 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279

10 Linear Trend Estimation in Risk-Adjusted Charts 287

10.1 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290

10.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290

10.3 Risk-Adjusted Control Charts . . . . . . . . . . . . . . . . . . . . . . . . 293


10.5 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298



10.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308

11 Estimation of a Decrease in Survival Time 313

11.1 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316

11.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316

11.3 Risk-Adjusted Survival Time Control Charts . . . . . . . . . . . . . . . 319


11.5 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323


11.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329

12 Change Point in Monitoring Survival Time 335

12.1 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339

12.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339



12.5 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347


12.7 The Effect of Censoring Time . . . . . . . . . . . . . . . . . . . . . . . . 357


12.9 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361

13 Linear Trend Estimation in Survival Time 367

13.1 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 370

13.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 370



13.5 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377



13.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390

14 Conclusion 397

14.1 Research Findings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397

14.1.1 Objective 1: Dataset Quality Evaluation . . . . . . . . . . . . . . 398

14.1.2 Objective 2: Control Charts Application and Development . . . 399

14.1.3 Contribution to Application . . . . . . . . . . . . . . . . . . . . . 401

14.1.4 Contribution to Method . . . . . . . . . . . . . . . . . . . . . . . 401

14.2 Research Summary and Remarks . . . . . . . . . . . . . . . . . . . . . . 402

14.3 Future Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 404

14.3.1 Immediate Research . . . . . . . . . . . . . . . . . . . . . . . . . 404

14.3.2 Relevant Research . . . . . . . . . . . . . . . . . . . . . . . . . . 407

Bibliography 411

List of Figures

1.1 Research aim. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.2 Research objectives defined within implementation of quality improve-

ment cycle in the pilot hospital. . . . . . . . . . . . . . . . . . . . . . . . 4

2.1 Process improvement cycle (Montgomery, 2008). . . . . . . . . . . . . . 19

3.1 Process improvement cycle (Montgomery, 2008). . . . . . . . . . . . . . 78

3.2 u-chart of observed errors in radiation metrics dataset; Stage 1: before

intervention-April 2009, Stage 2: after intervention-May 2009. . . . . . . 84

3.3 Pareto chart of observed error types in radiation metrics dataset in April

2009. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

3.4 Cause and Effect diagram of potential causes of observed errors in radi-

ation metrics dataset. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

3.5 CCC-chart for observed errors in radiation metrics dataset for July-

September 2009. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

3.6 A guideline for statistical quality control tools selection in clinical data

management. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

4.1 Conceptual diagram of the data capturing algorithm. . . . . . . . . . . . 101

4.2 Algorithm components for the jth iteration. . . . . . . . . . . . . . . . . 103

4.3 Utility function loop. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

4.4 Customized data capturing algorithm for risk model construction. . . . 112

4.5 Performance criteria of calibrated APACHE II over observed and simu-

lated blocks, under (1) Fix and (2) Updating I and II approaches of the

data capturing algorithm implementation. . . . . . . . . . . . . . . . . . 119

4.6 Utility functions and termination points under Fix (-F) and Updating I (-I) and II (-II) approaches: (1) Budget line utility; (2) Linear utility, where the asterisk shows the estimated total cost obtained at the end of the third iteration, $C_{T,3} = 2877.2$; (3) Performance based utility function (PU1) with $k_1$ equal to 305.23 and 287.71 for Updating I and II approaches, respectively; (4) Performance based utility function (PU2) with a 5% increase in $k_1$, where $k_2$ is equal to 320.23 and 291.65 for Updating I and II approaches, respectively. A vertical line is drawn to show when updating occurs in the algorithm. . . . . 123

5.1 Hotelling’s $T^2$ chart for the simultaneous monitoring of D, T and F for females undergoing a CA in November 2005. . . . . 143

5.2 MEWMA chart for the simultaneous monitoring of D, T and F for


5.3 MCUSUM chart for the simultaneous monitoring of D, T and F for


5.4 Plot of $\mathrm{ARL}_1$ versus $\|\delta\|$ for the $T^2$, MEWMA and MCUSUM charts, given $\rho_{12} = \rho_{13} = \rho_{23} = 0.2$. Results are shown for the cases where no data are missing ($\gamma = 0$) and when $\gamma = 0.2$. In the latter case, MI has been used to impute for missing values. . . . . 147

6.1 Posterior distributions of the time $\tau$ and the magnitude $\delta$ of a step change following signals from (a1, a2) c-chart, (b1, b2) Poisson EWMA ($r = 0.1$ and $A^{\pm} = 2.67$) and (c1, c2) Poisson CUSUM ($(k^{+}, h^{+}) = (22.4, 22)$, $(k^{-}, h^{-}) = (17.4, 14)$) where $\lambda_0 = 20$, $\delta = +6$ and $\tau = 100$. . . . . 163

6.2 Directed acyclic graph for the step change model in a Poisson process. . 182

6.3 Directed acyclic graph for the linear trend change model in a Poisson

process. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183

6.4 Directed acyclic graph for the multiple change model in a Poisson process. . . . . 183

7.1 Posterior distributions of the number $k$ and the time $\tau_{1,1}$ of a step change of sizes (a) $\delta_{1,1} = -5$ and (b) $\delta_{1,1} = +5$ following signals from c-chart where $\lambda_{1,0} = 20$ and $\tau_{1,1} = 25$. . . . . 200

7.2 Posterior distributions of the number $k$ and the times, $\tau_{2,1}$ and $\tau_{2,2}$, of two consecutive changes of sizes (a) $(\delta_{2,1}, \delta_{2,2}) = (-5, -10)$ and (b) $(\delta_{2,1}, \delta_{2,2}) = (-5, +10)$ following signals from c-chart where $\lambda_{2,0} = 20$ and $(\tau_{2,1}, \tau_{2,2}) = (25, 35)$. . . . . 204

7.3 Posterior distributions of the number $k$ and the times, $\tau_{3,1}$, $\tau_{3,2}$ and $\tau_{3,3}$, of three consecutive changes of sizes (a) $(\delta_{3,1}, \delta_{3,2}, \delta_{3,3}) = (-5, +5, -5)$ and (b) $(\delta_{3,1}, \delta_{3,2}, \delta_{3,3}) = (+5, -5, +5)$ following signals from c-chart where $\lambda_{3,0} = 20$ and $(\tau_{3,1}, \tau_{3,2}, \tau_{3,3}) = (25, 35, 45)$. . . . . 209

8.1 Exponentially weighted moving average graphs (with smoothing constant of 0.01) tracking the incidence of patients returning to theatre for re-operation for bleeding related issues and cases requiring excess blood product utilisation (>10 units) in the first 24 hours post CABG surgery. Data is drawn from cardiac surgical procedures performed at SAWMH in the period 2002-2010. . . . . 225

8.2 Bernoulli CUSUM and EWMA control charts for the re-operation (a1-2) and the use of blood products (b1-2) variables over 1072 patients who underwent CABG surgery during 2006-2010. . . . . 229

8.3 Posterior distributions of the time τ (1) and the magnitude δ (2) of

the change in the rate of re-operation detected by the Bernoulli EWMA

control chart at the 32nd patient who underwent CABG surgery. . . . . 231

8.4 Exponentially weighted moving average graph (with smoothing constant

of 0.01) for rates of patients for whom Aprotinin was used in CABG

surgery during 2006-2010 at SAWMH. . . . . . . . . . . . . . . . . . . . 234

8.5 Exponentially weighted moving average graphs (with smoothing con-

stant of 0.01) for rates of patients who underwent CABG or PTCA on

the lesion target of the angioplasty procedure (TLR) and the rate of

patients who experienced either TLR or heart attack or died (MACE).

Data is drawn from cardiac surgical procedures performed at SAWMH

in the period 2002-2006. . . . . . . . . . . . . . . . . . . . . . . . . . . . 237

8.6 Bernoulli CUSUM and EWMA control charts for TLR (a1-2) and MACE (b1-2) variables over 982 patients who underwent angioplasty during 2005-2006. . . . . 238

8.7 Exponentially weighted moving average graph (with smoothing constant of 0.01) for rates of patients for whom DES was used in the angioplasty procedure during 2005-2006 at SAWMH. . . . . 240

9.1 Distribution of calculated (1) logit of APACHE II scores logit(p); and

(2) risk of mortality for 4644 patients admitted to ICU during 2000-2009. 262

9.2 Effect of a change of size {0.2, 0.5, 0.8, 1.25, 2, 5} in (1) odds ratio, $\delta$, and (2) slope, $\beta_1$, in an in-control Bernoulli process with baseline risks of $p_0$. . . . . 263

9.3 Distribution of observable risk of mortality after a step change in (1) odds ratio of size $\delta = 0.33$ and (2) slope of size $\beta_1 = 0.33$ for 4644 patients admitted to ICU during 2000-2009. . . . . 264

9.4 Risk-adjusted (a1) CUSUM ($(h^{+}, h^{-}) = (5.85, 5.33)$) and (b1) EWMA ($\lambda = 0.01$ and $L = 2.83$) control charts and obtained posterior distributions of (a2, b2) time $\tau$ and (a3, b3) magnitude $\delta$ of an induced step change of size $\delta = 0.33$ in odds ratio where $E(p_0) = 0.082$ and $\tau = 500$. . . . . 266


($\lambda = 0.01$ and $L = 2.83$) control charts and obtained posterior distributions of (a2, b2) time $\tau$ and (a3, b3) magnitude $\beta_1$ of an induced step change of size $\beta_1 = 0.33$ in slope where $E(p_0) = 0.082$ and $\tau = 500$. . . . . 267

10.1 Distribution of calculated (1) logit of APACHE II scores logit(p); and (2)

probability of mortality for 4644 patients who were admitted to ICU during

2000-2009. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298

10.2 Effect of linear trend disturbances with a slope of β occurring at i = 500

in odds ratio of an in-control Bernoulli process for the 600th patient with

a baseline risk of p0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300

10.3 Distribution of observable probability of mortality after (1) 50, (2) 100,

(3) 150 and (4) 200 observations since occurrence of a linear trend dis-

turbance with a slope of size β = 0.025 in odds ratio for 4644 patients

who were admitted to ICU during 2000-2009. . . . . . . . . . . . . . . . 301


10.4 Risk-adjusted (a1) CUSUM ((h+, h−) = (5.85, 5.33)) and (b1) EWMA

(λ = 0.01 and L = 2.83) control charts and obtained posterior distributions

of (a2, b2) time τ and (a3, b3) magnitude β of an induced linear

trend with a slope of size β = 0.025 in odds ratio where E(p0) = 0.082

and τ = 500. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302

11.1 (1) Risk-adjusted survival time CUSUM chart (h = 4.88) and obtained

posterior distributions of (2) time τ and (3) magnitude k of a decrease of

size k = 0.25 in λ (mean survival time) where λ0 = 42133.6 and τ = 500. 325

12.1 Cumulative distribution functions of prior distributions. The assigned

priors for the magnitude of the change, k, in the scale parameter of the

Weibull AFT model λ in the cases of detection of (1) an increase, or (2)

a decrease in k. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346

12.2 Estimated survival curves for patients with (1) low to medium and (2)

medium to high Parsonnet scores (risks prior to surgery) over the follow-

up period of 30 days obtained through the fitted Weibull AFT model to

the training survival time data. . . . . . . . . . . . . . . . . . . . . . . . 348

12.3 Estimated probability of survival at the 15th and the 30th day of the

follow-up period of 30 days over all Parsonnet scores prior and after (1)

an increase of size k = 4, and (2) a decrease of size k = 0.25 in the MST.

Prior and after the change are indexed by 1 and the value of k. . . . . . 349

12.4 Estimated absolute magnitude of change in probability of survival over

all Parsonnet scores prior and after changes in the MST. Probabilities

at the 15th and the 30th day of the follow-up period of 30 days prior and

after (1) an increase of size k = 4, and (2) a decrease of size k = 0.25 in

the MST. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350

12.5 Risk-adjusted survival time CUSUM charts ((h+, h−) = (4.88, 4.53)) and

obtained posterior distributions of the time τ and the magnitude k of

(a1-a3) an increase of size k = 4, and (b1-b3) a decrease of size k = 0.25

in λ (mean survival time) where λ0 = 42133.6 and τ = 500. . . . . . . . 352

13.1 Estimated survival curves for patients with (1) low to medium and (2)

medium to high Parsonnet scores (risks prior to surgery) over the follow-

up period of 30 days obtained through the fitted Weibull AFT model to

the training survival time data. . . . . . . . . . . . . . . . . . . . . . . . 378

13.2 Estimated probability of survival at the (1) 15th and the (2) 30th day of

the follow-up period of 30 days over all Parsonnet scores prior (i = 500)

and after (i = {550, 600, 650}) (a) an increasing trend with a slope of size

k = 0.005, and (b) a decreasing trend with a slope of size k = −0.005 in

the MST. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380

13.3 Estimated absolute magnitude of change in probability of survival at the

15th and the 30th day of the follow-up period of 30 days over all Parsonnet

scores following (i = 600) (1) an increasing trend with a slope of size

k = 0.005, and (2) a decreasing trend with a slope of size k = −0.005 in

the MST. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381

13.4 Risk-adjusted survival time CUSUM charts ((h+, h−) = (4.88, 4.53)) and

obtained posterior distributions of the time τ and the magnitude k of

(a1-a3) an increasing trend with a slope of size k = 0.005, and (b1-b3)

a decreasing trend with a slope of size k = −0.005 in λ (mean survival

time) where λ0 = 42133.6 and τ = 500. . . . . . . . . . . . . . . . . . . . 383

List of Tables

3.1 Single sampling plans for APACHE II data, LTPD=1.0% and process

average=0.5%. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

3.2 Double sampling plans for APACHE II data, LTPD=1.0% and process

average=0.5%. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

3.3 Quality control charts and their components. . . . . . . . . . . . . . . . 80

4.1 Customized APACHE II model parameters using logistic regression over

observed and simulated blocks under Fix and Updating approaches.

Highlighted rows are sets of parameters which are used for utility func-

tion construction within the utility loop of the algorithm. . . . . . . . . 117

4.2 Raw and relative performance criteria (Somers’ statistic D, external accuracy

Ea, precision P, weights w and performance index PI) of the calibrated

APACHE II model over observed and simulated data obtained

using Fix approach. Relative criteria are based on the comparison of

M0 with MF . Highlighted row is the set of parameters which is used

for utility function construction within the utility loop of the algorithm. 118

4.3 Raw and relative performance criteria (Somers’ statistic D, external accuracy

Ea, precision P, weights w and performance index PI) of the

calibrated APACHE II model over observed and simulated data obtained

using Updating I and II approaches. Relative criteria are based on the

comparison of M0 with MF . Highlighted rows are sets of parameters

which are used for utility function construction within the utility loop

of the algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

4.4 Predicted number of errors (E(xj)), costs (E(Cj)) and related utility

values Uj for utility function scenarios, BL, LU, PU1 and PU2, over

observed and simulated blocks under Fix and Updating I approaches. . 121

4.5 Predicted number of errors (E(xj)), costs (E(Cj)) and related utility

values Uj for utility function scenarios, BL, LU, PU1 and PU2, over

observed and simulated blocks under Updating II approach. Highlighted

row is the set of parameters which is used for utility function construction

within the utility loop of the algorithm. . . . . . . . . . . . . . . . . . . 122

4.6 Data capturing algorithm iterations and termination points for four util-

ity function scenarios under Fix approach. . . . . . . . . . . . . . . . . . 125

4.7 Data capturing algorithm iterations and termination points for four util-

ity function scenarios under Updating I and II approaches. . . . . . . . . 126

6.1 Posterior estimates (mode, sd.) of step change point model parame-

ters τ and δ following signals (RL) from c-, Poisson EWMA (r = 0.1

and A± = 2.67) and Poisson CUSUM charts ((k+, h+) = (22.4, 22),

(k−, h−) = (17.4, 14)) where λ0 = 20 and τ = 100. Standard devia-

tions are shown in parentheses. . . . . . . . . . . . . . . . . . . . . . . . 164

6.2 Credible intervals for step change point model parameters τ and δ fol-

lowing signals from c-, Poisson EWMA (r = 0.1 and A± = 2.67) and

Poisson CUSUM charts ((k+, h+) = (22.4, 22), (k−, h−) = (17.4, 14))

where λ0 = 20 and τ = 100. . . . . . . . . . . . . . . . . . . . . . . . . . 165

6.3 Probability of the occurrence of the change point in the last 10, 25 and

50 observed samples prior to signalling for c-, Poisson EWMA (r =

0.1 and A± = 2.67) and Poisson CUSUM charts ((k+, h+) = (22.4, 22),

(k−, h−) = (17.4, 14)) where λ0 = 20 and τ = 100. . . . . . . . . . . . . 166

6.4 Average of posterior estimates (mode, sd.) of step change point model

parameters τ and δ following signals (RL) from c-, Poisson EWMA (r =


(k−, h−) = (17.4, 14)) where λ0 = 20 and τ = 100. Standard deviations

are shown in parentheses. . . . . . . . . . . . . . . . . . . . . . . . . . . 166

6.5 Posterior estimates (mode, sd.) of linear trend change point model pa-

rameters τ and β following signals (RL) from c-, Poisson EWMA (r = 0.1

and A± = 2.67) and Poisson CUSUM charts ((k+, h+) = (22.4, 22),



6.6 Credible intervals for linear trend change point model parameters τ and β

following signals from c-, Poisson EWMA (r = 0.1 and A± = 2.67) and

Poisson CUSUM charts ((k+, h+) = (22.4, 22), (k−, h−) = (17.4, 14))

where λ0 = 20 and τ = 100. . . . . . . . . . . . . . . . . . . . . . . . . . 170

6.7 Average of posterior estimates (mode, sd.) of linear trend change point

model parameters τ and β following signals (RL) from c-, Poisson EWMA

(r = 0.1 and A± = 2.67) and Poisson CUSUM charts ((k+, h+) = (22.4, 22),



6.8 Posterior estimates (mode, sd.) of multiple change point model parame-

ters τ1, δ1, τ2 and δ2 following signals (RL) from c-, Poisson EWMA (r =

0.1 and A± = 2.67) and Poisson CUSUM charts ((k+, h+) = (22.4, 22),

(k−, h−) = (17.4, 14)) where λ0 = 20, τ1 = 100 and τ2 = 110. Standard

deviations are shown in parentheses. . . . . . . . . . . . . . . . . . . . . 174

6.9 Credible intervals for multiple change point model parameters τ1, δ1, τ2

and δ2 following signals from c-chart, Poisson EWMA (r = 0.1 and A± =

2.67) and Poisson CUSUM ((k+, h+) = (22.4, 22), (k−, h−) = (17.4, 14))

where λ0 = 20, τ1 = 100 and τ2 = 110. . . . . . . . . . . . . . . . . . . . 174

6.10 Average of posterior estimates (mode, sd.) of multiple step change

point model parameters τ and δ following signals (RL) from c-, Poisson

EWMA (r = 0.1 and A± = 2.67) and Poisson CUSUM charts ((k+, h+) =

(22.4, 22), (k−, h−) = (17.4, 14)) where λ0 = 20 and τ = 100. Standard


6.11 Performance and goodness of the change point models on different change

types following signal from a c-chart where λ0 = 20, τ1 = 100 and τ2 = 110. . . 177

6.12 Average of detected time of a step change in a Poisson process obtained

by the Bayesian estimator, CUSUM and EWMA built-in estimators and

MLE estimator following signals (RL) from c-, Poisson EWMA (r =




6.13 Average of detected time of a linear trend in a Poisson process obtained

by the Bayesian estimator, CUSUM and EWMA built-in estimators and

MLE estimator following signals (RL) from c-, Poisson EWMA (r =




7.1 Posterior distributions (mode, sd.) of multiple change point model pa-

rameters mk and θm1 = (τ1,1, δ1,1) following signals (RL) from c-chart

where λ1,0 = 20 and τ1,1 = 25. Standard deviations and 80% credible

intervals are shown in round and square parentheses, respectively. . . . . 200

7.2 Average of posterior estimates (E(mode), E(sd.)) of multiple change

point model parameters mk and θm1 = (τ1,1, δ1,1) following signals (RL)

from c-chart where λ1,0 = 20 and τ1,1 = 25. Standard deviations are

shown in parentheses. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201


rameters mk and θm2 = (τ2,i, δ2,i), i = 1, 2, following signals (RL) from

c-chart where λ2,0 = 20, τ2,1 = 25 and τ2,2 = 35. Standard deviations

and 80% credible intervals are shown in round and square parentheses,

respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205


point model parameters mk and θm2 = (τ2,i, δ2,i), i = 1, 2, following

signals (RL) from c-chart where λ2,0 = 20, τ2,1 = 25 and τ2,2 = 35.

Standard deviations are shown in parentheses. . . . . . . . . . . . . . . . 205


point model parameters mk and θm1 =(τ1,1, δ1,1) following signals (RL)

from c-chart for replications in which the number of change points was

underestimated where λ2,0 = 20, τ2,1 = 25 and τ2,2 = 35. Standard

deviations are shown in parentheses. . . . . . . . . . . . . . . . . . . . . 207


rameters mk and θm3 = (τ3,i, δ3,i), i = 1, 2, 3, following signals (RL) from

c-chart where λ3,0 = 20, τ3,1 = 25, τ3,2 = 35 and τ3,3 = 45. Standard

deviations and 80% credible intervals are shown in round and square

parentheses, respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . 210


point model parameters mk and θm3 = (τ3,i, δ3,i), i = 1, 2, 3, following

signals (RL) from c-chart where λ3,0 = 20, τ3,1 = 25, τ3,2 = 35 and

τ3,3 = 45. Standard deviations are shown in parentheses. . . . . . . . . . 211

7.8 Average of change point estimates obtained through the built-in EWMA

(τewma) and CUSUM (τcusum), MLE (τmle) and Bayesian (τb, time of the

first change) estimators following signals from Poisson EWMA (RLewma),

Poisson CUSUM (RLcusum) and c-chart (RLc) where λk,0 = 20 and

τk,1 = 25. Standard deviations are shown in parentheses. . . . . . . . . . 214

8.1 Posterior distributions (mode, sd.) and credible intervals (CI) of the

change point parameters τ and δ following signals from the Bernoulli

CUSUM (h± = (3.37, 2.87) and h± = (3.22, 2.68)) and EWMA (λ =

0.05, A± = 4.15 and A± = 4.25) charts on the rate of re-operation

and the use of blood products over 1072 patients who underwent CABG

surgery during 2006-2010. Standard deviations are shown in parentheses. 232

8.2 Posterior distributions (mode, sd.) and credible intervals (CI) of the

change point parameters τ and δ following signals from the Bernoulli

CUSUM (h± = (3.78, 3.27) and h± = (4.60, 4.07)) and EWMA (λ =

0.05, A± = 4.50 and A± = 4.05) charts on TLR and MACE variables

over 982 patients who underwent angioplasty during 2005-2006. Standard



ters (τ , δ and β1) following signals (RL) from RACUSUM ((h+, h−) =

(5.85, 5.33)) and RAEWMA charts (λ = 0.01 and L = 2.83) where

E(p0) = 0.082 and τ = 500. Standard deviations are shown in parenthe-

ses. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268

9.2 Credible intervals for step change point model parameters (τ , δ and β1)

following signals (RL) from RACUSUM ((h+, h−) = (5.85, 5.33)) and

RAEWMA charts (λ = 0.01 and L = 2.83) where E(p0) = 0.082 and

τ = 500. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270

9.3 Probability of the occurrence of the change point in the last {25, 50, 100, 200, 300} observations prior to signalling for RACUSUM ((h+, h−) =


E(p0) = 0.082 and τ = 500. . . . . . . . . . . . . . . . . . . . . . . . . . 270


parameters (τ and δ) for a change in odds ratio following signals (RL)

from RACUSUM ((h+, h−) = (5.85, 5.33)) and RAEWMA charts (λ =

0.01 and L = 2.83) where E(p0) = 0.082 and τ = 500. Standard devia-

tions are shown in parentheses. . . . . . . . . . . . . . . . . . . . . . . . 271


parameters (τ and β1) for a change in slope following signals (RL) from

RACUSUM ((h+, h−) = (5.85, 5.33)) and RAEWMA charts (λ = 0.01

and L = 2.83) where E(p0) = 0.082 and τ = 500. Standard deviations


9.6 Performance and goodness of the change point models on different change

types following signal from a RAEWMA (λ = 0.01 and L = 2.83) where

E(p0) = 0.082 and τ = 500. . . . . . . . . . . . . . . . . . . . . . . . . . 275

9.7 Average of detected time of a step change in odds ratio obtained by

the Bayesian estimator (τb), CUSUM and EWMA built-in estimators



τ = 500. Standard deviations are shown in parentheses. . . . . . . . . . 276

9.8 Average of detected time of a step change in slope obtained by the

Bayesian estimator (τb), CUSUM and EWMA built-in estimators follow-

ing signals (RL) from RACUSUM ((h+, h−) = (5.85, 5.33)) and RAEWMA

charts (λ = 0.01 and L = 2.83) where E(p0) = 0.082 and τ = 500. Stan-

dard deviations are shown in parentheses. . . . . . . . . . . . . . . . . . 278


rameters (τ and β) following signals (RL) from RACUSUM ((h+, h−) =


E(p0) = 0.082 and τ = 500. Standard deviations are shown in parenthe-

ses. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303

10.2 Credible intervals for linear trend change point model parameters (τ and

β) following signals (RL) from RACUSUM ((h+, h−) = (5.85, 5.33)) and


τ = 500. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303

10.3 Probability of the occurrence of the change point in the last 25, 50

and 100 observations prior to signalling for RACUSUM ((h+, h−) =


E(p0) = 0.082 and τ = 500. . . . . . . . . . . . . . . . . . . . . . . . . . 304


model parameters (τ and β) for a drift in odds ratio following signals

(RL) from RACUSUM ((h+, h−) = (5.85, 5.33)) and RAEWMA charts

(λ = 0.01 and L = 2.83) where E(p0) = 0.082 and τ = 500. Standard


10.5 Average of detected time of a linear trend change in odds ratio obtained

by the Bayesian estimator (τb), CUSUM and EWMA built-in estimators




11.1 Posterior estimates (mode, sd.) of step change point model parameters

(τ and k) following signals (RL) from RAST CUSUM (h = 4.88) where

λ0 = 42133.6 and τ = 500. . . . . . . . . . . . . . . . . . . . . . . . . . . 326

11.2 Credible intervals for step change point model parameters (τ and k)

following signals (RL) from RAST CUSUM (h = 4.88) where λ0 =

42133.6 and τ = 500. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327

11.3 Probability of the occurrence of the change point in the last {25, 50, 100, 200, 300, 400, 500} observations prior to signalling for RAST CUSUM

(h = 4.88) where λ0 = 42133.6 and τ = 500. . . . . . . . . . . . . . . . . 327


parameters (τ and k) for a change in the mean survival time following

signals (RL) from RAST CUSUM (h = 4.88) where λ0 = 42133.6 and



ters (τ and k) following signals (RL) from RAST CUSUM ((h+, h−) =

(4.88, 4.53)) where λ0 = 42133.6 and τ = 500. . . . . . . . . . . . . . . . 351

12.2 Credible intervals for step change point model parameters (τ and k)

following signals (RL) from RAST CUSUM ((h+, h−) = (4.88, 4.53))

where λ0 = 42133.6 and τ = 500. . . . . . . . . . . . . . . . . . . . . . . 353


((h+, h−) = (4.88, 4.53)) where λ0 = 42133.6 and τ = 500. . . . . . . . . 354


parameters (τ and k) for a change in the mean survival time following

signals (RL) from RAST CUSUM ((h+, h−) = (4.88, 4.53)) where λ0 =

42133.6 and τ = 500. Standard deviations are shown in parentheses. . . 355


parameters (τ and k) for a change in the mean survival time using dif-

ferent censoring time, c, following signals (RL) from RAST CUSUM

((h+, h−) = (4.88, 4.53)) where λ0 = 42133.6 and τ = 500. Standard


12.6 Average of detected time of a step change in the mean survival time

obtained by the Bayesian estimator (τb) and CUSUM built-in estimator

following signals (RL) from RACUSUM ((h+, h−) = (5.85, 5.33)) where

λ0 = 42133.6 and τ = 500. Standard deviations are shown in parentheses. 360


rameters (τ and k) following signals (RL) from RAST CUSUM ((h+, h−) =

(4.88, 4.53)) where λ0 = 42133.6 and τ = 500. . . . . . . . . . . . . . . . 382

13.2 Credible intervals for linear trend change point model parameters (τ and

k) following signals (RL) from RAST CUSUM ((h+, h−) = (4.88, 4.53))

where λ0 = 42133.6 and τ = 500. Standard deviations are shown in

parentheses. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384


((h+, h−) = (4.88, 4.53)) where λ0 = 42133.6 and τ = 500. . . . . . . . . 385


model parameters (τ and k) for a change in the mean survival time

following signals (RL) from RAST CUSUM ((h+, h−) = (4.88, 4.53))

where λ0 = 42133.6 and τ = 500. Standard deviations are shown in

parentheses. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 386

13.5 Average of detected time of a linear trend in the mean survival time

obtained by the Bayesian estimator (τb) and CUSUM built-in estimator

following signals (RL) from RACUSUM ((h+, h−) = (5.85, 5.33)) where

λ0 = 42133.6 and τ = 500. Standard deviations are shown in parentheses. 389

14.1 Summary of research components. . . . . . . . . . . . . . . . . . . . . . 405

Statement of Original Authorship

The work contained in this thesis has not been previously submitted for a

degree or diploma at any other higher education institution. To the best of

my knowledge and belief, the thesis contains no material previously published

or written by another person except where due reference is made.

Signature: Hassan Assareh

Date:

CHAPTER 1

Introduction

1.1 Motivation

Quality oriented management systems and methods have become the dominant business and governance paradigm. From this perspective, satisfying customers’ expectations by supplying reliable, good quality products and services is the key factor

for an organization and even government. One of the significant public concerns is the

quality of healthcare services, which is affecting quality of life and public and private

investments. The Institute of Medicine (2000) reported that the number of deaths due

to medical errors in U.S. hospitals may have exceeded 100,000 per year; and the num-

bers of unnecessary surgeries and hospital infections have topped 12,000 and 80,000,

respectively. It is well recognized in the research community that many of these events

can be avoided by the effective use of healthcare standardization, improvement, and

surveillance methods (Tsui et al., 2008).

During recent decades, Statistical Quality Control (SQC) methods have been developed

as the technical core of quality management and continuous improvement philosophy

and now are being applied widely to improve the quality of products and services in

industrial and business sectors. The short and long term benefits and achievements

obtained by the industrial and business sectors via the implementation of SQC methods,

notably Statistical Process Control (SPC) tools and Acceptance Sampling Plans (ASP),

have motivated other sectors to consider those tools and include them as an essential

part of the monitoring process of management. The former includes quality control

charts and root causes analysis diagrams which aim to monitor the process through the

product or service specifications online and statistically in order to reduce variation;

and the latter aims to evaluate the quality of products or services.

Since the 1990s, SQC tools, in particular quality control charts, have been used in healthcare

surveillance. In some cases, these tools have been modified and developed to better suit

the health sector characteristics and needs. Woodall (2006), a well-recognized expert

in the development of SPC techniques, believes that some of the work in the healthcare

area has evolved independently of the development of industrial statistical process con-

trol methods. Therefore analyzing and comparing the characteristics of quality control

charts across the different sectors presents some opportunities for transferring knowledge and future development in each sector. For example, for healthcare surveillance, it could be instructive to consider techniques developed in an industrial context, such as monitoring of attributes data, multiple units and rare events (Woodall, 2006;

Woodall et al., 2010).

One of the exciting new paradigms in industrial SQC is the use of Bayesian approaches

and methods. Bayesian inference is an approach to statistics in which many forms of

uncertainty are included in the model and expressed in terms of probability. Bayesian

inference uses a numerical estimate of the degree of belief in a hypothesis before evidence

has been observed and calculates a numerical estimate of the degree of belief in the

hypothesis after evidence has been observed. In the SQC context, this implies revision of

belief about a process and product/service after observation of data, and the possibility

of dynamic updating of estimated parameters and control chart components as new

data are gathered.
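The updating of belief described above can be illustrated with a minimal sketch. It assumes a conjugate Beta prior on an in-hospital mortality rate and Binomial counts of outcomes; the function names, the flat Beta(1, 1) prior and the patient counts are hypothetical choices made only for illustration and are not taken from the analyses in this thesis.

```python
def update_beta(alpha, beta, deaths, survivors):
    """Return the Beta posterior parameters after observing new binary outcomes."""
    # Conjugacy: a Beta(alpha, beta) prior with Binomial data gives a Beta posterior.
    return alpha + deaths, beta + survivors


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution, used here as a point estimate."""
    return alpha / (alpha + beta)


# Hypothetical flat prior belief about the mortality rate.
prior = (1.0, 1.0)

# First batch of 100 hypothetical patients: 8 deaths, 92 survivors.
post1 = update_beta(*prior, deaths=8, survivors=92)

# Second batch of 100 hypothetical patients: 5 deaths, 95 survivors.
# The previous posterior is used as the new prior (dynamic updating).
post2 = update_beta(*post1, deaths=5, survivors=95)

# Processing all 200 patients at once gives the same posterior.
batch = update_beta(*prior, deaths=13, survivors=187)

print(post2, batch, round(beta_mean(*post2), 4))
```

In this sketch the sequential and the batch updates coincide, which is the property that allows estimated parameters to be revised as each new set of observations arrives.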

Figure 1.1 Research aim.

Developments in computational methods and software in Bayesian statistics pave the way for extending current approaches to capture and model important sources of uncertainty under a new outlook. Bayesian hierarchical models (BHM) aim to define parameters of interest in the system as variables which behave under an unknown probability distribution. A BHM thus presents a structure of observable and unobservable variables, parameters and their dependencies. This structure allows more flexibility and supports deeper inference about the parameters of the monitored system, process or product.
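As a generic sketch of such a structure (not the specific models developed in later chapters), a two-level hierarchy can be written with observations $y_i$, process parameters $\theta$ and hyperparameters $\phi$. These symbols are illustrative assumptions rather than notation taken from the thesis, and the observations are assumed conditionally independent given $\theta$.

```latex
\begin{aligned}
y_i \mid \theta &\sim p(y_i \mid \theta), \quad i = 1, \dots, n, \\
\theta \mid \phi &\sim p(\theta \mid \phi), \\
\phi &\sim p(\phi), \\
p(\theta, \phi \mid y_1, \dots, y_n) &\propto p(\phi)\, p(\theta \mid \phi) \prod_{i=1}^{n} p(y_i \mid \theta).
\end{aligned}
```

The last line is Bayes' theorem applied to the joint model: the observable variables enter through the likelihood terms, while the unobservable parameters and their dependencies enter through the two prior layers.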

Consideration of the capabilities of the Bayesian approach in SQC and achievements

obtained in an industrial context has motivated the present research, which aims to

develop Bayesian hierarchical models and methods for monitoring and improving hospital outcomes by obtaining accurate and informative results which support clinicians in decision making and healthcare management.

1.2 Research Aim

This research will consider the advancement of capabilities of statistical quality control

techniques and particularly quality control charts by development of Bayesian hierar-

chical models and adaptation of other state-of-the-art approaches from the industrial

context to the monitoring of healthcare processes and hospital services as depicted in

Figure 1.1. It aims to satisfy the larger demand for evidence-based quality improvement

in patient-based outcomes. This study was motivated by the quality improvement efforts at St Andrew’s War Memorial Hospital (SAWMH) implemented by the St Andrew’s Medical Institute (SAMI) team.

Figure 1.2 Research objectives defined within the implementation of the quality improvement cycle in the pilot hospital.

1.3 Research Objectives

Development of Bayesian hierarchical models for enhancing capabilities of statistical

quality control methods in order to evaluate, monitor and improve the quality of services

in hospitals is the main purpose of the current research. This will be achieved by

pursuing the following objectives defined as part of the implementation of the

quality improvement cycle in the pilot hospital illustrated in Figure 1.2.

1.3.1 Objective 1: Dataset Quality Evaluation

The first part of this study concerns quality assurance of gathered data and existing

datasets in hospital databases. These datasets are being used to estimate parameters

and quantities including risk adjusted models and key clinical indicators for construct-

ing control charts and undertaking statistical analyses of data from ongoing processes.

The datasets are also used to analyze the behavior of processes from past to present

longitudinally and identify shifts. It is known that the datasets contain missing data

and errors such as in data format and scale which reduce the effectiveness of their usage

and the accuracy of estimated statistics. Therefore determining the quality of current

data and making decisions on whether to accept them should be investigated. This

objective will be followed by addressing the following goals:

• Goal 1: Estimation of data quality and setting acceptance criteria for clinical

datasets

• Goal 2: On-line data quality monitoring and improvement

• Goal 3: Determination of optimal data size
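To make the idea of an acceptance criterion for a clinical dataset concrete, the following sketch computes the probability of accepting a dataset under a single sampling plan with a binomial error model. The sample size n and acceptance number c are hypothetical; the two error rates echo the process average of 0.5% and the LTPD of 1.0% mentioned in the captions of Tables 3.1 and 3.2, but the plan itself is only an assumption for illustration.

```python
from math import comb


def acceptance_probability(n, c, p):
    """P(accept) = P(at most c erroneous records in a sample of n), binomial model."""
    return sum(comb(n, d) * p**d * (1 - p) ** (n - d) for d in range(c + 1))


# Hypothetical plan: inspect n = 500 records, accept if at most c = 5 contain errors.
pa_process_average = acceptance_probability(n=500, c=5, p=0.005)  # error rate 0.5%
pa_ltpd = acceptance_probability(n=500, c=5, p=0.01)              # error rate 1.0%

print(round(pa_process_average, 3), round(pa_ltpd, 3))
```

Evaluating the function over a range of error rates traces the operating characteristic curve of the plan, which is the usual basis for choosing n and c.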

1.3.2 Objective 2: Control Charts Development and Application

With increasing demands in the application of SPC tools in the health sector, health

researchers have modified and developed quality control charts in order to improve

the efficiency of their performance in this new context. It seems that there are op-

portunities for more research which takes into account the specific characteristics of

health processes. This part of the study aims to advance control charting methods and

their capabilities in the health sector using Bayesian techniques and considering recent

paradigms and developments in the industrial area. This objective will be satisfied by

meeting the following goals within the health domain:

• Goal 1: Monitoring attributes data and rare events

• Goal 2: Monitoring related clinical variables simultaneously

• Goal 3: Facilitation of root causes analysis through estimation of the time when

a clinical process has changed
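Goal 3 can be illustrated with a simplified sketch of Bayesian change point estimation for a step change in a Poisson process. It assumes a known in-control rate, a uniform prior on the change time and a uniform grid prior on the unknown post-change rate, evaluated by direct enumeration. The names (lam0, lam1_grid, tau) and the synthetic counts are hypothetical, and the sketch is not the hierarchical models or the MCMC estimation developed in later chapters.

```python
import math


def log_poisson(y, lam):
    """Log of the Poisson probability mass at count y with rate lam."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)


def logsumexp(values):
    """Numerically stable log of a sum of exponentials."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def change_point_posterior(counts, lam0, lam1_grid):
    """Posterior over tau, the index of the last in-control observation."""
    n = len(counts)
    taus = list(range(1, n))  # at least one observation on each side of the change
    log_post = []
    for tau in taus:
        before = sum(log_poisson(y, lam0) for y in counts[:tau])
        # Marginalise the unknown post-change rate over a uniform grid prior.
        after = logsumexp(
            [sum(log_poisson(y, lam1) for y in counts[tau:]) for lam1 in lam1_grid]
        ) - math.log(len(lam1_grid))
        log_post.append(before + after)  # the uniform prior on tau is a constant
    norm = logsumexp(log_post)
    return {tau: math.exp(lp - norm) for tau, lp in zip(taus, log_post)}


# Synthetic, noise-free counts: rate 20 for 30 samples, then a step to 30.
counts = [20] * 30 + [30] * 20
grid = [0.5 * k for k in range(2, 121)]  # candidate post-change rates 1.0 to 60.0
posterior = change_point_posterior(counts, lam0=20.0, lam1_grid=grid)
tau_mode = max(posterior, key=posterior.get)
print(tau_mode, round(posterior[tau_mode], 3))
```

The full posterior over the change time, rather than a single point estimate, is what allows probability statements about when the process changed and can narrow the search window for root causes analysis.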

1.4 Research Contribution

This research essentially can be characterized under the quality engineering stream

of research. In particular, it contributes to statistical quality control methods in a


healthcare context through the application of Bayesian approaches and methods. The contributions made by pursuing the objectives above and related goals can be addressed

by considering two aspects, Application and Method.

1.4.1 Contribution to Application

Translation and application of well-established SQC methods such as acceptance sampling plans and control charts to evaluate and improve the quality of clinical datasets through Goals 1 and 2 of Objective 1 can be categorized under this title. Implementation of multivariate control charts in monitoring clinical processes, Goal 2 of Objective 2, can also be seen as a component of contribution to Application. Furthermore, any

investigation containing implementation of developed control charting methods in the

healthcare context within the development components of Goal 3 of Objective 2 would

be characterized as a contribution to Application.

1.4.2 Contribution to Method

Development of a sample size determination method within Goal 3 of Objective 1

is the first contribution to the Method component. This will be followed by develop-

ment of change point models in a Bayesian framework for control charting purposes.

In particular, the design and investigation of Bayesian estimators for various circum-

stances of monitoring clinical and non-clinical outcomes are the main components of

the contribution to Method.

1.5 Research Scope

In the context of quality management and improvement in healthcare surveillance, there

are other problems, methods and inferences which are not addressed in this thesis and

can be considered for further investigation and research. As briefly explained in the

objectives, this thesis attempts to respond to specific research problems raised in the

practice of quality improvement in a local hospital, including clinical data quality evaluation and improvement, sample size determination, multivariate and attribute control charting and, mainly, change point estimation following a signal from a control chart. Each research component has its own scope and limitations, which are explained within the chapters. In clinical data quality research, error detection and importance determination are not pursued in this research; however, they are addressed in the context of the proposed framework. A statistically designed sample size determination method could also be developed for each complex statistical analysis; however, in this thesis a general algorithm is constructed which benefits from flexibility and generality. In terms of change point

investigation, this study focuses on better estimation of change in the process in order

to facilitate root causes analysis efforts; hence, development and assessment of subse-

quent action plans and interventions in the process and system are beyond the scope

of the thesis.

In the advancement of control charts, this thesis focuses on monitoring healthcare in

hospitals and clinical centers; hence other statistical and charting methods developed

and applied for public health and syndromic surveillance to detect disease outbreaks are not pursued here. Risk-adjustment of multivariate control charts can be a topic of further research; however, this thesis focuses on well-established standard control charts

adapted from an industrial context. In change point estimation, this thesis focuses on

retrospective investigation in which identification of the true time of change prior to the control

chart’s signal is sought. Prospective change point estimation is not investigated.

1.6 Thesis Structure

This thesis is written in publication form since the research objectives are met by a

series of research components outlined in independent articles submitted to, or accepted

by, journals. Chapters are presented here in the form in which they were submitted

or accepted. These articles are presented in Chapters 3 to 13. Each chapter thus

has its own relevant literature review and references and there is necessarily some

overlap and repetition across chapters, in particular across Chapters 6 to 13 since

the study is extended to capture more complexity. A more comprehensive literature

review is presented in Chapter 2 and the references for all chapters are compiled into

a bibliography that appears at the end of the thesis.


1.7 Thesis Outline

Chapter 2 comprises a literature review on clinical data quality improvement and re-

lated sample size determination for quality control purposes under the first objective.

This is followed by outlining control charting methods for monitoring hospital outcomes

and Bayesian methods, particularly in change point detection. Some differences between

the literature reviewed in this chapter and the references cited within Chapters 6

to 13 are expected, owing to journal word limits at submission as well as differing dates of

preparation.

Chapter 3 addresses the data quality objective and provides an applied and compre-

hensive discussion on the translation and application of acceptance sampling plans and

statistical process control techniques in evaluation and improvement of clinical datasets.

Within this study some of the outlined methods were implemented at SAMI and the

results are reported here.

Chapter 4 targets the sample size determination goal considering cost-effectiveness of

data quality improvement efforts. In this chapter an innovative Bayesian algorithm is

proposed and applied in construction of risk models for intensive care units at SAWMH.

Chapter 5 focuses on the application of well-established industrial charting methods

in a clinical context under the second research objective. Within this chapter a mul-

tivariate control chart is constructed to monitor the radiation delivered to patients

undergoing diagnostic coronary angiogram procedures at SAWMH. Incompleteness of

patients’ records and non-normality of data are some of the associated concerns which

are discussed.

In Chapter 6 the Bayesian approach is proposed as a new paradigm in the well-known

change point estimation stream of research in control charting. A Bayesian estimator

is developed to identify the time and the magnitude of shifts of known number and type

in the mean of a Poisson process. The motivation of this study arose from monitoring

radiation instruments using Poisson based control charts in SAWMH. The performance

of the proposed Bayesian estimator is investigated over several change scenarios and

compared to the available alternatives.

This work is then extended to a circumstance in which no prior knowledge on the num-

ber of change points exists. In Chapter 7 a Bayesian multiple change point estimator

is developed and investigated. The performance of the proposed estimator is compared

to the control chart signals.

In Chapter 8 the potential capabilities of change point estimation from an industrial quality

control perspective are considered under the control charting objective. A Bayesian

change point estimator is developed to study the potential causes of detected shifts

in two clinical variables including cardiac surgery outcomes and excess use of blood

products during angioplasty procedures at SAWMH. This study highlights the benefits

of such investigations in a clinical setting and promotes the Bayesian approach.

In Chapter 9 the Bayesian change point model developed and applied to clinical out-

comes in Chapter 6 is extended to capture patient mix with respect to clinical risk

factors. In this setting the in-control state of the clinical process is explained through

risk models. In this chapter a Bayesian estimator is developed to identify the time, type

and the magnitude of a detected step change in odds ratio of death among patients

who were admitted to intensive care units at SAWMH.

In Chapter 10 the developed Bayesian estimator in the presence of patient mix is

extended to the case in which there is a linear trend in odds ratio of death. The perfor-

mance of the proposed Bayesian estimator is investigated over several trend scenarios

and compared to the available alternatives.

In Chapter 11 under the same perspective, a Bayesian estimator is developed to identify

the time of a decrease in mean survival time among patients who underwent a cardiac

surgery. In this scenario, the quality characteristic of the process is the survival time

in the follow-up period after the surgery which may be right censored. Similar to the

studies in Chapters 9 and 10, the in-control state of this clinical process is constructed

considering specific characteristics of each patient. Therefore the control chart as well as

the proposed estimator will capture all covariates. The proposed estimator is extended

to identify a wider range of step changes including jumps and drops in the mean survival

time in Chapter 12 and investigated over various follow-up period scenarios.

In Chapter 13 the proposed estimator for estimation of the time of a step change in


survival time is extended to identify a linear trend in mean survival time among patients

who underwent a cardiac surgery. The performance of the proposed Bayesian estimator

is investigated over several trend scenarios and compared to the available alternative.

In Chapter 14 the thesis is concluded and findings from each chapter are restated and

mapped to the objectives of the research. Areas in which the research can be advanced

are also outlined.

Bibliography

Tsui, K. L., Chiu, W., Gierlich, P., Goldsman, D., Liu, X., and Maschek, T. (2008). A

review of healthcare, public health, and syndromic surveillance. Quality Engineering,

20(4):435–450.

Woodall, W. H. (2006). The use of control charts in health-care and public-health

surveillance. Journal of Quality Technology, 38(2):89–104.

Woodall, W. H., Grigg, O. A., and Burkom, H. S. (2010). Research issues and ideas on

health-related surveillance. Frontiers in Statistical Quality Control 9, 38(2):145–155.

CHAPTER 2

Literature Review

2.1 Introduction

This chapter provides an overview of the body of knowledge related to the research objectives

and goals, which is also reviewed within the chapters of this thesis. For the sake of integrity,

the review of the literature in this chapter is presented in a different structure from that seen within

the following chapters.

2.2 Statistical Quality Control

In ISO 9000 “Quality” is defined as the “degree to which a set of inherent characteristics

fulfills requirements” and quality control refers to all operational activities that

are used to fulfill the requirements for quality. The field of statistical quality control

(SQC) can be broadly defined as those statistical and engineering methods that are

used in measuring, monitoring, controlling, and improving quality.

Woodall and Montgomery (1999) depict SQC as a division of industrial statistics that

encompasses the segments of design of experiments (DOE), capability analysis, acceptance

sampling, and statistical process control (SPC).

DOE places the emphasis on process optimization through identification and controlling

of important variables, referred to as factors, which directly influence product and

process quality levels. Capability analysis is an exercise to analyze collected data and

determine if a particular process is capable of meeting specification tolerance limits.

Acceptance sampling is inspection or testing of products in order to infer the overall

quality of the lot for the purpose of accepting or rejecting it. This approach remains

in regular use by quality assurance organisations in evaluating product conformities at

incoming inspection and test, shipping, or in-process inspection gates.

SPC is used for process monitoring and is a proactive approach. Unlike acceptance

sampling which is generally applied only at the end of the process and in which prod-

uct nonconformities have already occurred, SPC is used to signal when a process is out

of control, and then institute necessary corrective and preventive actions to preclude

potential product nonconformities. Specifically, it is an application of statistical meth-

ods for collecting and charting of data, and monitoring the variability of a particular

process of interest over time relative to the upper and lower control limits, normally set

three standard deviations above and below the process mean.

Each of these segments complements the others and serves as an integral part of SQC

used for process improvement and optimization. In this research the last two components

of SQC will be considered.

2.2.1 Quality in Clinical Datasets

There is an increasing demand for high quality medical registries and clinical databases.

Progress in information technology has paved the way for the systematic collection of

predefined patient data at a local, regional and national level. Clinical databases and

registries provide a valuable resource for the study of disease trends, interventions and

medical decision making and outcomes (Black, 1999). They are also a component of

quality improvement programs. They are used to assess productivity, to identify best

practices and to evaluate effectiveness of new procedures, drugs and services (Arts et al.,

2002).

To meet these objectives it is vital to have a good database design and high-quality

data. Indeed, the quality of any analysis is affected by data quality and database

structure (Beretta et al., 2007; Hattemer-Apostel et al., 2008). Inconsistencies in data

recording, such as missing values and errors, can lead to biased results. Arts et al.

(2002) define data quality as the totality of features and characteristics of a dataset

that affect its ability to meet its intended uses (based on ISO 8402-1986). Clinical data

managers are now responsible for providing high quality datasets, and often do this by

monitoring data capture and flow processes (Hattemer-Apostel et al., 2008).

The International Conference on Harmonization E6 Guidelines for Good Clinical Prac-

tice indicates that quality control should be applied to each stage of data handling to

ensure that all data are reliable and have been processed correctly (Shen and Zhou,

2006). Data quality assurance programs that consist of systematic procedures before,

during and after data collection are being developed and applied by data managers to

minimize inaccurate and incomplete data in final datasets (Arts et al., 2002). Whitney

et al. (1998) have defined quality assurance as a program that includes all activities

before data collection to ensure that the data are of the highest possible quality at the

time of collection.

Evaluation and Improvement of Data Quality

In a seminal paper, Arts et al. (2002) developed a total framework for quality assurance

in medical registries. Based on their model, procedures have been developed to prevent

the collection of insufficient data, to detect imperfect data and its causes, and to apply

relevant corrective actions in local and central registries. This framework has been

applied in the construction of databases for intensive care units in Australia and New

Zealand (Stow et al., 2006). However, although this framework provides comprehensive

guidelines for the construction of a high-quality database, it lacks practical mechanisms

by which we can evaluate data quality, give feedback to providers, conduct root causes


analysis, and prioritize preventive and corrective actions.

Whitney et al. (1998) characterized quality control procedures which take place dur-

ing and after data collection to identify and correct errors and their causes. During

the collection process, data are transferred from paper or electronic-based case report

forms (CRFs) to databases, as well as between datasets and centers. As such, it is rec-

ommended that audits and quality review programs be applied at the different stages

(data entry, data transcription, merging and dataset locking) of database construction

(Hattemer-Apostel et al., 2008; Brunelle and Kleyle, 2002; Zhang, 2004).

Although data quality needs to be sufficiently high that objectives can be met reliably,

auditing an entire dataset, particularly when it is large, involves substantial effort and

the resources usually cannot be justified (Hattemer-Apostel et al., 2008). Rostami et al.

(2009) highlight the Institute of Medicine’s (IOM) statement which says that “there

can be no perfect dataset” and that “there may be a decreasing marginal benefit from

pursuing such a goal”. Therefore a number of minor errors might be acceptable. The

problem then becomes one of determining what is meant by acceptable, and this will

depend in part on the importance of the variables involved. It may be reasonable to

execute a 100% audit on just certain critical variables (Zhang, 2004). To this end, some

researchers have designed sampling plans based on statistical quality control methods as

an alternative to 100% audit, particularly for non-critical variables in clinical databases.

The objective of an acceptance sampling plan (ASP) is to determine whether an entire

dataset is acceptable, in terms of its error rate, based upon the number of defective

items in a sample from the dataset. Brunelle and Kleyle (2002) extended a statisti-

cal approach proposed by Sullivan et al. (1997) by designing a sampling plan which

uses acceptable and limiting quality levels (AQL=0.1% and LQL=1.0%). Zhang (2004)

developed a hypothesis test for error rates which can be used to decide whether to

accept or reject a dataset given an acceptable quality level (0.1%). Shen and Zhou

(2006) developed acceptance sampling plans based on acceptable error rates of 0.0%

and 0.5% for critical and non-critical variables, respectively. To determine the impor-

tance of variables and acceptable error rates, a study of the effect of errors in clinical

decision making and resultant outcomes is essential. In this regard, systematic and

quantitative methods have been proposed aiming to evaluate the clinical consequences


of different errors in such variables for both patients and the healthcare system. Among

these, the Failure Mode and Effects Analysis (FMEA) procedure has been applied to

assess the risk associated with errors in an electronic health record system (Win et al.,

2004). Hasan and Padman (2006) developed a statistical approach to translate the

uncertainty about data quality into the risk of negative medical consequences. This

approach was then applied to distinguish critical from non-critical variables and to design

an efficient data quality improvement program. In a more recent study Rostami

et al. (2009) have used a control chart for error rates during the audit process to find

outliers and run root causes analyses. Their procedure led to an approximate 50%

saving in time when compared to a full audit, while producing the same decrease in

error rates. Despite this research, it seems that statistical quality control methods for

the evaluation and improvement of data quality have not been as widely used in the

clinical context as in industrial applications (Hattemer-Apostel et al., 2008). This may

be due to: a lack of managerial approach and technical knowledge of statistical quality

control, unwillingness to acknowledge total solutions, lack of communication with data

providers and users and their shared responsibilities in a process-oriented approach, a

lack of documentation on quality control techniques in a clinical context and also lack

of time in clinical settings. In particular, most sampling plans for audits have been

conducted either in an ad-hoc manner (Zhang, 2004) or using a fixed sample size of

10% of recorded data (Shen and Zhou, 2006). In addition, many of them have been

developed using an average quality level which leads to a high rate of errors in the long

term (Montgomery, 2008).

Acceptance Sampling in Clinical Datasets

Acceptance sampling plans (ASPs) can be used to assess a dataset when 100% inspection

is uneconomical. The dataset is accepted if its quality is satisfactory, based upon

the number of defective items observed in a sample or set of samples from the dataset,

and it is rejected otherwise. Rejected datasets may be returned to their source and

submitted to 100% inspection and correction, termed rectifying inspection.

Although ASPs as audit tools do not directly lead to an improvement in the data


collection process (Montgomery, 2008), there may be a psychological effect due to

rectifying inspection. That is, clinicians may be more careful when entering data if

records have been returned to them for correction. ASPs can be applied during any step

of the data collection process, thereby ensuring an acceptable level of quality of either

the data received from the clinician or delivered to the database users. Of the various

systems available for designing an ASP, the Dodge-Romig system (Dodge and Romig,

1959) embraces both rectifying inspection and critical variables (Montgomery, 2008).

This system has been developed based upon the lot tolerance percent defective (LTPD)

and average outgoing quality limit (AOQL). LTPD is the poorest level of quality that

the data user is willing to accept in an individual dataset, and AOQL is the worst

possible average quality that would result from a plan with rectifying inspection in the

long term. To design an ASP, the user is required to specify one of these parameters

and the average rate of errors for incoming datasets. If the average rate of errors is

unknown, it may be estimated from a preliminary sample.

The simplest ASP involves choosing a sample size n and acceptance number c. A

random sample of size n is taken from the dataset and if the number of defective

records in the sample does not exceed c, then the dataset is accepted. Otherwise, it is

rejected. This is termed a single sampling plan.
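
The single sampling rule above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the thesis: it assumes the sample is small relative to the dataset so that the number of defective records is approximately binomial, and the function names and the values of N, n and c are hypothetical. The average outgoing quality uses the standard rectifying-inspection approximation AOQ = Pa × p × (N − n)/N.

```python
from math import comb


def accept_prob(n: int, c: int, p: float) -> float:
    """P(accept) = P(D <= c) for D ~ Binomial(n, p) defective records."""
    return sum(comb(n, d) * p**d * (1 - p) ** (n - d) for d in range(c + 1))


def accept_dataset(defective_in_sample: int, c: int) -> bool:
    """Single sampling decision: accept when defectives do not exceed c."""
    return defective_in_sample <= c


def average_outgoing_quality(N: int, n: int, c: int, p: float) -> float:
    """AOQ with rectifying inspection (rejected datasets fully corrected)."""
    return accept_prob(n, c, p) * p * (N - n) / N


# Illustrative (assumed) plan: dataset of N records, sample n, acceptance number c.
N, n, c = 10_000, 200, 1
for p in (0.001, 0.005, 0.02):
    print(f"p={p:.3f}  Pa={accept_prob(n, c, p):.3f}  "
          f"AOQ={average_outgoing_quality(N, n, c, p):.5f}")

# AOQL approximated by a grid search over incoming error rates.
aoql = max(average_outgoing_quality(N, n, c, k / 10_000) for k in range(1, 500))
print(f"approximate AOQL = {aoql:.5f}")
```

Plotting `accept_prob` against p gives the operating characteristic curve of the plan, which is the usual way to compare candidate values of n and c.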

A double sampling plan depends upon four parameters: n1, n2, c1 and c2. A sample

of size n1 is taken. If the number of defective records does not exceed c1, then the

dataset is accepted; if it exceeds c2, then the dataset is rejected; and otherwise a second

sample of size n2 is taken. A decision is then made by comparing the total number

of defective records from both samples to c2. Double sampling plans are cheaper than

single sampling plans when the data quality is either very good or very bad because, on

average, they inspect fewer items than required by a single sampling plan (Montgomery,

2008).
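
As a hedged sketch of the double sampling plan just described, the following Python computes the probability of acceptance and the average sample number (ASN) without curtailment. The binomial model, the function names and the plan parameters are assumptions made for illustration only.

```python
from math import comb


def binom_pmf(d: int, n: int, p: float) -> float:
    return comb(n, d) * p**d * (1 - p) ** (n - d)


def binom_cdf(c: int, n: int, p: float) -> float:
    return sum(binom_pmf(d, n, p) for d in range(c + 1))


def double_plan(n1: int, c1: int, n2: int, c2: int, p: float):
    """Return (P(accept), ASN) for a double sampling plan, no curtailment."""
    pa = binom_cdf(c1, n1, p)        # accepted on the first sample
    p_second = 0.0                   # probability a second sample is needed
    for d1 in range(c1 + 1, c2 + 1):
        p_d1 = binom_pmf(d1, n1, p)
        p_second += p_d1
        # accept if the combined number of defectives does not exceed c2
        pa += p_d1 * binom_cdf(c2 - d1, n2, p)
    asn = n1 + n2 * p_second
    return pa, asn


# Illustrative (assumed) plan parameters.
for p in (0.001, 0.02, 0.10):
    pa, asn = double_plan(n1=50, c1=1, n2=100, c2=3, p=p)
    print(f"p={p:.3f}  Pa={pa:.3f}  ASN={asn:.1f}")
```

The ASN stays close to n1 when the error rate is very low or very high, which is the source of the saving relative to a single plan with comparable protection.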

When first implementing ASPs, it is recommended that a single sampling plan is

adopted. Terminating the audit once the number of defective records exceeds the

acceptance number is referred to as curtailment. In a database context, curtailment

may be inadvisable when using a single sampling plan, since complete inspection will


provide a better estimate of data quality. If the quality is estimated to be either very

good or very bad, a double sampling plan can be adopted, in which case, curtailment

in the second stage may be acceptable.

ASPs can be extended to more than two samples. The reader is referred to Montgomery

(2008) for a discussion on multiple and sequential sampling plans. These methods break

large samples into smaller ones and revisit the accept/reject decision after each successive sample

or observation. Although both plans are more complicated to administer, some

economic efficiency may be gained and the plans may be more appealing to both

clinicians and users of the database.

For critically important variables, the acceptance number is usually set to zero in a

single sampling plan. In this case, it may be preferable to adopt a Chain sampling plan

(Chsp) instead (Dodge, 1955). In Chsp the decision about whether to reject or accept

is based on the results from previous samples as well as a sample from the current

dataset. The dataset is only accepted if either there are no defective records in the

current sample (of size n), or there is one defective record in the current sample and no

defective records in the previous i datasets. For details on how n and i are estimated,

the reader is referred to Montgomery (2008), Dodge (1955), Dodge and Stephens (1966)

and Schilling and Neubauer (2009).
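
The chain sampling rule can be illustrated as follows. The acceptance probability uses the standard ChSP-1 expression Pa = P(0, n) + P(1, n) × P(0, n)^i under a binomial model; the function names and numerical values are hypothetical and are not taken from the thesis.

```python
def chsp1_decision(current_defects: int, previous_defects: list) -> bool:
    """Accept on zero defects, or on one defect if the previous i samples
    (passed in `previous_defects`) were all free of defects."""
    if current_defects == 0:
        return True
    return current_defects == 1 and all(d == 0 for d in previous_defects)


def chsp1_accept_prob(n: int, i: int, p: float) -> float:
    """Long-run acceptance probability of ChSP-1 with sample size n."""
    p0 = (1 - p) ** n                   # no defective records in a sample
    p1 = n * p * (1 - p) ** (n - 1)     # exactly one defective record
    return p0 + p1 * p0**i


# Illustrative comparison with a zero-acceptance-number single plan (assumed n).
n = 50
for p in (0.001, 0.01, 0.05):
    print(f"p={p:.3f}  c=0 plan: {(1 - p) ** n:.3f}  "
          f"ChSP-1 (i=3): {chsp1_accept_prob(n, 3, p):.3f}")
```

The extra term raises the acceptance probability at low error rates, which softens the very steep operating characteristic curve of a plan with acceptance number zero.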

Chsp is appropriate only if the quality of incoming datasets is both relatively stable and

high. If repeated application of Chsp suggests consistently high quality of incoming

data, then a Skip-Lot sampling plan (SkSp) may be considered in order to reduce

the burden of inspection (Dodge, 1943; Perry, 1973). This involves using a reference

sampling plan, such as single or double sampling, to sentence datasets. If a specified

number of consecutive datasets is accepted, then instead of inspecting each new dataset,

the reference sampling plan is applied to a specified fraction of incoming datasets. If,

however, a dataset is rejected while using the reduced inspection process, then normal

inspection (of each dataset) is resumed.

In some data collection processes, incoming data flow is continuous, rather than periodic

and in batch form. In this case, data aggregation may be undertaken to provide

large datasets before using an ASP. This approach has some disadvantages particularly


in administration and corrective action. Continuous sampling plans (CSPs) (Dodge,

1947; Dodge and Torrey, 1951) are recommended for this circumstance. The simplest

plan, a CSP-1, begins with 100% inspection of all incoming records; as with SkSp, if a

specified number of consecutive records are accepted, then instead of inspecting each

new record, we inspect only a fraction of them. If, however, a record is rejected while

using the reduced inspection process, then 100% inspection is resumed.
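
A minimal sketch of the CSP-1 logic described above is given below. The parameter names `clearance` (the required run of consecutive error-free records) and `fraction` (the share of records inspected in the reduced phase) are assumptions for illustration, as is the use of a seeded random generator to pick which records are sampled.

```python
import random


def csp1(records, clearance: int, fraction: float, rng=None):
    """Return a list of booleans saying which records were inspected.

    `records` is an iterable of booleans (True = record contains an error).
    """
    rng = rng or random.Random(0)
    inspected = []
    screening = True     # start with 100% inspection
    clean_run = 0
    for has_error in records:
        if screening:
            inspected.append(True)
            clean_run = 0 if has_error else clean_run + 1
            if clean_run >= clearance:
                screening = False          # switch to fractional inspection
        else:
            look = rng.random() < fraction
            inspected.append(look)
            if look and has_error:         # error found: resume 100% inspection
                screening = True
                clean_run = 0
    return inspected


# Illustrative stream: one erroneous record among otherwise clean records.
stream = [False] * 30 + [True] + [False] * 30
flags = csp1(stream, clearance=10, fraction=0.2, rng=random.Random(42))
print(f"inspected {sum(flags)} of {len(stream)} records")
```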

2.2.2 Statistical Process Control

Acceptance sampling plans and rectifying inspection might ensure the quality of in-

coming/outgoing data, but they do not lead to improvement in data collection. The

data must be produced, transferred and stored accurately and completely. In a more

general framework, to obtain the desirable clinical outcomes, all clinical interventions

and care procedures should be monitored and set to the standards from admission to

discharge over all patients.

Improvement in quality of care is achieved by stabilizing clinical interventions and

care processes through the elimination of sources of variability. Statistical process

control (SPC) is a set of tools that diagnose, control and prioritize on-line variation

problems, analyze their root causes and reflect the effect of corrective actions and

improvements. Due to these capabilities, quality management programs, including

Six Sigma, have embedded SPC tools into the technical core of their methodologies

(Montgomery and Woodall, 2008). SPC consists of seven tools. The Check Sheet and

Defect Concentration Diagram are data collection and summary tools that present the

current situation of a process via its measurements and observed defects. Histogram

and Scatter Plots analyze the behavior of the process factors and variables individually

and interactively. A Control Chart interprets data quality and detects changes in the

process. A Pareto Chart categorizes and prioritizes observed errors and their root

causes. Finally, a Cause and Effect Diagram identifies and categorizes the potential

causes of observed errors. For more details on SPC tools, see Montgomery (2008) and

Ishikawa (1990). A process may improve when a control chart identifies undesirable

variation in the process outcomes, root cause analysis is implemented using Pareto


charts and Cause and Effect diagrams, and corrective action is defined and accomplished

(Figure 2.1). This procedure is known as an Out-of-Control Action Plan (OCAP). The

success of an SPC program requires data managers to involve and support OCAP cycles

within their systems (Montgomery, 2008).

Figure 2.1 Process improvement cycle (Montgomery, 2008).

2.2.3 Quality Control Charts

A control chart is essentially a graphical display of a measured quality characteristic

versus time. The standard assumptions justifying the use of control charts are that

the in-control data are independent and normally distributed. Montgomery (2008)

indicated that even slight correlation between data points will adversely affect the per-

formance of most control charts and increase the false alarm incidence rate. Typically,

two horizontal lines called the upper control limit (UCL) and the lower control limit

(LCL) are plotted on a control chart. If a process is in control, it is expected that

nearly all of the sample points will fall in a random pattern between these two limits.

A point falling outside the control limits is interpreted as evidence that the process is

out of control, as is a non-random pattern of points falling within the control limits.

In practice, a control chart’s parameters are estimated using a series of observations or

samples when the process is assumed to be in the in-control state. Then the constructed

control chart is used to monitor the process and detect possible shifts in the process

mean. The former and latter stages are called phase I and phase II, respectively.

Because shifting the distance between control limits and the process average will alter

the false positive and false negative incidence rates, specifying control limits is a critical


control chart design parameter (Montgomery, 2008). A standard theoretical measure

of control chart performance is the chart’s average run length (ARL).

Montgomery (2008) defines average run length as the average number of data points

that must be acquired before a shift is detected and an out-of-control alarm is issued.

Thus, when there is no change in the mean level of a process, µ0, the ARL0 should be

large. In contrast, when there is a change in the mean level, µ1 = µ0 + δ, the ARL1

should be small. The ARL0 is a measure of the cost of false alarms, while the ARL1

measures the delay in detecting the change and thus the cost of false negatives. When

comparing different SPC chart procedures, it is common practice to fix a common ARL0 value

across the procedures and compare their ARL1 values, with smaller values preferred.
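
To make the ARL definitions concrete, the sketch below computes ARL0 and ARL1 for a chart of sample means with limits at k standard errors either side of the in-control mean. It assumes independent, normally distributed observations with known parameters, so that the run length is geometric and the ARL is the reciprocal of the per-point signal probability; the function name and the shift sizes are illustrative.

```python
from statistics import NormalDist


def shewhart_arl(shift_in_sigma: float, n: int = 1, k: float = 3.0) -> float:
    """ARL of a k-sigma chart for the mean after a shift of delta = shift * sigma."""
    z = NormalDist()
    d = shift_in_sigma * n**0.5           # shift measured in standard errors
    p_signal = 1.0 - (z.cdf(k - d) - z.cdf(-k - d))
    return 1.0 / p_signal


print(f"ARL0 (no shift):         {shewhart_arl(0.0):.1f}")   # about 370.4
for shift in (0.5, 1.0, 2.0, 3.0):
    print(f"ARL1 (shift {shift:.1f} sigma): {shewhart_arl(shift):.1f}")
```

The output shows the trade-off described above: widening the limits (larger k) increases ARL0 but also increases ARL1 for any given shift.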

There is a close connection between control charts and hypothesis testing. The control

chart tests the hypothesis that the process is in a state of statistical control. A point

plotting within the control limits is equivalent to failing to reject the hypothesis of sta-

tistical control, and a point plotting outside the control limits is equivalent to rejecting

the hypothesis of statistical control.

Shewhart Control Charts

The theory of control charts was first proposed by Shewhart (1926, 1927) and control

charts developed according to these principles are often called Shewhart control charts.

In the Shewhart setting, if the quality characteristic of interest is normally distributed with

known mean, $\mu_0$, and standard deviation, $\sigma_0$, then the sample mean, $\bar{X}$, is normally

distributed with mean $\mu_0$ and standard deviation $\sigma_0/\sqrt{n}$, where $n$ is the sample size.

The sample mean is then plotted sequentially on the $\bar{X}$ control chart with a center line

of $\mu_0$ and control limits of $\mu_0 \pm Z_{\alpha/2}\,\sigma_0/\sqrt{n}$. The standard normal score is often

replaced by a coefficient of three. A 3-sigma Shewhart control chart has

$\mathrm{ARL}_0 \approx 370$. This class of charts is well known for detecting large shifts in the

process mean (Montgomery, 2008; Shewhart, 1927).
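
The construction of the chart for the sample mean can be sketched as below, assuming the in-control mean and standard deviation are known. The function names and the example subgroups are hypothetical; when `alpha` is omitted the usual coefficient of three is used in place of the normal quantile.

```python
from math import sqrt
from statistics import NormalDist, mean


def xbar_limits(mu0: float, sigma0: float, n: int, alpha=None):
    """(LCL, CL, UCL) for the sample mean of subgroups of size n."""
    k = 3.0 if alpha is None else NormalDist().inv_cdf(1 - alpha / 2)
    half_width = k * sigma0 / sqrt(n)
    return mu0 - half_width, mu0, mu0 + half_width


def signalling_subgroups(subgroups, mu0: float, sigma0: float):
    """Indices of subgroups whose mean falls outside the control limits."""
    out = []
    for i, group in enumerate(subgroups):
        lcl, _, ucl = xbar_limits(mu0, sigma0, len(group))
        if not lcl <= mean(group) <= ucl:
            out.append(i)
    return out


print(xbar_limits(10.0, 2.0, 4))   # (7.0, 10.0, 13.0)
print(signalling_subgroups([[10, 11, 9, 10], [14, 15, 13, 14]], 10.0, 2.0))
```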

All control charts, including Shewhart charts, have been developed in two classes,

variable and attribute, according to the quality characteristic of interest. In a


variable chart, data are measured on a continuous scale; whereas, in attribute charts

they are measured on a discrete scale.

The $\bar{X}$ control chart is the most common variable control chart. In monitoring variable data

it is highly recommended to monitor the variation of data among observed samples at

the same time. In this regard either an S chart or R chart may be used. They monitor

the observed standard deviation and the range within drawn samples, respectively

(Montgomery, 2008). If the sample size is one, a moving average control chart or I chart

may be applied instead of the $\bar{X}$ chart.

The attribute control chart is normally used when there are insufficient means of collect-

ing or measuring variable data due to the nature of the process. In some circumstances,

it may be more economical to collect attribute data in lieu of measuring and collecting

the exact characteristics for variable data.

The fraction of nonconforming items in a sample can be monitored by a p-chart. A

binomial distribution with parameter p0 underlies this control chart. The mean fraction

nonconforming, p0, might be known from previous experience or estimated from observed

data from preliminary in-control samples. Negative LCLs are set to zero (Montgomery,

2008).
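
A p-chart of this kind can be sketched as follows, using the normal approximation to the binomial for the 3-sigma limits and truncating a negative LCL at zero as described above. The function names and the counts in the example are hypothetical.

```python
from math import sqrt


def p_chart_limits(p0: float, n: int, k: float = 3.0):
    """(LCL, CL, UCL) for the fraction nonconforming in samples of size n."""
    half_width = k * sqrt(p0 * (1 - p0) / n)
    return max(0.0, p0 - half_width), p0, min(1.0, p0 + half_width)


def p_chart_signals(defectives, n: int, p0: float):
    """Indices of samples whose fraction nonconforming is outside the limits."""
    lcl, _, ucl = p_chart_limits(p0, n)
    return [i for i, d in enumerate(defectives) if not lcl <= d / n <= ucl]


print(p_chart_limits(0.10, 100))
print(p_chart_signals([10, 8, 25, 12], n=100, p0=0.10))   # third sample signals
```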

It is often more informative to monitor the types of nonconformity rather than defective

items. Nonconformity control charts are proposed as alternatives to fraction

nonconforming control charts such as the p-chart.

A c-chart monitors the occurrence of nonconformity, c say, in an inspection unit. In

this chart, a Poisson distribution models the number of occurrences in an interval of

time or space. The assumption here is that the fraction of nonconformities is small

relative to the sample size and that all units have the same underlying probability of

being defective.

If the number of items monitored in an inspection unit varies, the c-chart components would

need to be redefined and the center line will be non-constant. An alternative is to

construct a chart based on the average number of nonconformities/occurrences per

inspection unit, u say. A u-chart is defined with a base inspection unit size and the


observed nonconformities/occurrences in a unit with different size are converted to this

base size. The resultant u-chart has a constant center line and variable limits.
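
The constant center line and variable limits of the u-chart can be seen in a short sketch. It assumes Poisson counts, estimates the center line from the pooled data, and uses 3-sigma limits; the function name and the example counts and unit sizes are illustrative.

```python
from math import sqrt


def u_chart(counts, sizes, k: float = 3.0):
    """Return (u_bar, rows), where each row is (u_i, LCL_i, UCL_i).

    counts[i] is the number of nonconformities observed in sample i and
    sizes[i] is its size expressed in base inspection units.
    """
    u_bar = sum(counts) / sum(sizes)          # constant center line
    rows = []
    for c, n in zip(counts, sizes):
        half_width = k * sqrt(u_bar / n)      # limits narrow as n grows
        rows.append((c / n, max(0.0, u_bar - half_width), u_bar + half_width))
    return u_bar, rows


u_bar, rows = u_chart(counts=[4, 6, 10], sizes=[1, 2, 2])
print(f"center line = {u_bar}")
for u, lcl, ucl in rows:
    print(f"u = {u:.2f}  LCL = {lcl:.2f}  UCL = {ucl:.2f}")
```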

As discussed earlier, the c- and u-charts provide more information upon which to make

decisions regarding corrective actions. Often quality characteristics are not equally

important and categorizing them as critical or non-critical is advised. It may be worth-

while constructing separate control charts for the different characteristics types. A

notable extension of the u-chart simultaneously takes into account the importance of

the nonconformities. In this development a demerit system is used to classify

nonconformities (Montgomery, 2008). This system assigns different levels of severity to

nonconformities according to their effects on outcome quality (Jones and Woodall, 1999).

Nonconformities may occur in clusters, and the probability of a

nonconformity may not be constant. In this case, two distributions can be used, one to

express the number of clusters and another to model the number of nonconformities in

clusters. In this case, a compound Poisson distribution or other mixture model can be

applied; see Kaminsky et al. (1992), Gardiner (1987) and Montgomery (2008).

Control Charts for Rare Events

When the quality of the process is high, the number of occurrences/nonconformities

tends to zero, and a sequence of zeros will be observed. In this situation Shewhart

charts are not useful, since the observed data are no longer approximately normally distributed.

Count- and time-based control charts that monitor time or number of conforming prod-

ucts between two nonconformities may be more appropriate. In the count-based ap-

proach, the observed faultless items between two defective items are counted and plotted

on a cumulative count of conforming (CCC) control chart. The construction of a CCC-

chart is similar to a p-chart, except the number of conforming items is plotted when

a defective item has been observed. The interpretation of a CCC-chart differs from

conventional Shewhart control charts. A succession of conforming items will eventually

result in a statistic exceeding the UCL, indicating an improvement in the process. On the

other hand, a signal below the LCL shows a decline in the process quality. For more

information on the chart’s construction and parameter definition refer to Calvin (1983),


Goh (1987) and Xie et al. (2002). As an extension of the CCC-chart, the number of

faultless cases may be counted until r > 1 defective cases are observed. In this case a

CCC-r chart is constructed on a negative binomial distribution; see Xie et al. (1999)

and Ohta et al. (2001).
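As a rough sketch of how CCC-chart limits may be obtained, the block below assumes that the plotted count includes the nonconforming item itself and therefore follows a geometric distribution with in-control fraction nonconforming `p0`, and that probability limits with false alarm rate `alpha` are wanted. Both the function name and these choices are illustrative assumptions; the cited references give the authoritative constructions.

```python
import math

def ccc_limits(p0, alpha=0.0027):
    """Probability limits for a geometric count with P(X <= n) = 1 - (1 - p0)**n."""
    log_q = math.log(1.0 - p0)
    lcl = math.log(1.0 - alpha / 2.0) / log_q  # P(X <= lcl) = alpha / 2
    cl = math.log(0.5) / log_q                 # median of the count
    ucl = math.log(alpha / 2.0) / log_q        # P(X > ucl) = alpha / 2
    return lcl, cl, ucl

# Hypothetical high-quality process: one nonconforming item per 1000
lcl, cl, ucl = ccc_limits(0.001)
```

A count above `ucl` then suggests improvement and a count below `lcl` suggests deterioration, matching the interpretation given in the text.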

If the event follows a Poisson process, the time between two events has an exponential

distribution. Since this distribution is skewed, transformation of an exponential random

variable to an approximately normal variable via taking logarithms or $x = y^{0.25}$

(Kittlitz, 1999; Nelson, 1994) will allow the CL and control limits to be calculated using

the usual mean and three standard deviations based on transformed data. Similar to

a CCC-chart, an out-of-control point higher than the UCL indicates an improvement

in quality and a signal below the LCL shows a drop in process quality. Although the

time-based approach seems easier than the count-based method, care should be taken when

defining and measuring the desired variable. Plotting the time to observe r nonconform-

ing items may also be considered. In this case the control chart would be constructed

based on a Gamma distribution; see Zhang et al. (2007).
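The power transformation for times between events can be illustrated with a short sketch. It assumes the $x = y^{0.25}$ transformation and estimates the center line and three-sigma limits from the transformed data; the function name `t_chart_limits` and the times are hypothetical.

```python
import statistics

def t_chart_limits(times, power=0.25, sigma_mult=3.0):
    """Transform times between events and return (x, lcl, center, ucl)."""
    x = [t ** power for t in times]   # approximately normal after transformation
    center = statistics.mean(x)
    spread = statistics.stdev(x)      # sample standard deviation of transformed data
    return x, center - sigma_mult * spread, center, center + sigma_mult * spread

# Hypothetical times (in days) between successive adverse events
times = [12.0, 3.5, 40.2, 7.1, 18.9, 1.2, 25.0, 9.8]
x, lcl, center, ucl = t_chart_limits(times)
```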

Cumulative Sum Control Charts

Cumulative sum (CUSUM) control charts were proposed by Page (1954). A CUSUM

control chart is a sequential monitoring scheme that accumulates evidence of the per-

formance of the process and signals when either a deterioration or an improvement is

detected. To this end a Wald sequential probability ratio test (SPRT) (Wald, 1947) is

applied to detect a change in the process. In practice, samples or observations are

taken at regular time intervals and combined with information obtained from prior

samples or observations.

In monitoring a normal process, let $X_i$ be the $i$th observation of the process. We

assume $X_i \sim N(\mu_0, \sigma^2)$ when the process is in control. In an out-of-control state,

$X_i \sim N(\mu_1, \sigma^2)$. A two-sided CUSUM design for monitoring the process mean is as

follows:

$$S_i^+ = \max\{0,\; S_{i-1}^+ + X_i - \mu_0 - k\} \tag{2.1}$$

$$S_i^- = \min\{0,\; S_{i-1}^- + X_i - \mu_0 + k\}, \tag{2.2}$$

where the starting values are commonly $S_0^+ = S_0^- = 0$, $k$ is called the reference value and

$k = |\mu_1 - \mu_0|/2 = \delta\sigma/2$. Here $\delta = |\mu_1 - \mu_0|/\sigma$ is the shift size in units of the standard deviation $\sigma$.

When either $S_i^+ > h^+$ or $S_i^- < h^-$, the process is considered to be out-of-control.

$h^+$ and $h^-$ are called the decision thresholds. CUSUM control charts accumulate

information from previous successive samples. Therefore, they are effective in detecting

small sustained shifts caused by persistent special causes (Hawkins and Olwell, 1998).
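A minimal sketch of the two-sided CUSUM recursion is given below. It assumes symmetric thresholds, $h^+ = h$ and $h^- = -h$, and restarts both statistics at zero after a signal; the function name and the data are hypothetical.

```python
def two_sided_cusum(xs, mu0, k, h):
    """Return (path, signals); path holds (S_plus, S_minus) before any reset."""
    s_hi, s_lo = 0.0, 0.0
    path, signals = [], []
    for i, x in enumerate(xs, start=1):
        s_hi = max(0.0, s_hi + x - mu0 - k)  # accumulates evidence of an upward shift
        s_lo = min(0.0, s_lo + x - mu0 + k)  # accumulates evidence of a downward shift
        path.append((s_hi, s_lo))
        if s_hi > h or s_lo < -h:
            signals.append(i)
            s_hi, s_lo = 0.0, 0.0            # restart the chart after signalling
    return path, signals

# Hypothetical data: the mean moves from 0 to roughly 1.3 at the fourth observation
xs = [0.1, -0.2, 0.3, 1.2, 1.5, 1.1, 1.4, 1.3]
path, signals = two_sided_cusum(xs, mu0=0.0, k=0.5, h=2.0)
```

The threshold `h=2.0` is chosen only to keep the example short; in practice it would be set to achieve a desired in-control run length.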

It is common practice to restart the chart from $S_0^+ = S_0^- = 0$ after signalling. However, the proposed

initialization may also be altered to achieve better performance in the detection

of changes that immediately occur after control chart initialization. Lucas and Crosier

(1982) proposed the fast initial response (FIR) as a modification to the above rule by

setting an initial “head start” value of h/2, instead of 0, to gain a more rapid response

to the out-of-control state. This scheme has been extended to several distributions. For

more details on CUSUM control charts see Hawkins and Olwell (1998) and Montgomery

(2008).

Exponentially Weighted Moving Average Control Charts

The EWMA control chart was first introduced by Roberts (1959). This chart is another

efficient control chart for detecting small shifts. The EWMA statistic accumulating

current and past sample information is defined as:

$$Z_i = \lambda X_i + (1-\lambda) Z_{i-1}, \tag{2.3}$$

where $Z_0$ is the starting value, commonly set to the in-control mean $\mu_0$ (so $Z_0 = 0$ for observations centred on $\mu_0$), and $0 < \lambda \le 1$ is a weighting parameter called the smoothing constant.

In some applications, when a sample of size greater than one is taken, the individual

observation is replaced by the sample average. By using the recursive relationship we

obtain

$$Z_i = \lambda \sum_{j=0}^{i-1} (1-\lambda)^j X_{i-j} + (1-\lambda)^i Z_0. \tag{2.4}$$

Therefore, $Z_i$ can be viewed as a weighted average of all past and current observations,

$X_1, X_2, \ldots, X_i$, and the weights $\lambda(1-\lambda)^j$ decrease exponentially with $j$. Together with the weight $(1-\lambda)^i$ on $Z_0$, all the weights

sum to 1. If the observations are all independent with standard deviation $\sigma$, then the

variance of $Z_i$ is

$$\sigma^2_{Z_i} = \sigma^2 \left(\frac{\lambda}{2-\lambda}\right)\left[1 - (1-\lambda)^{2i}\right]. \tag{2.5}$$

The calculated $Z_i$ is then plotted on a chart centered at $\mu_0$ with control limits of

$\mu_0 \pm L\sigma_{Z_i}$. Note that as $i$ becomes larger, $[1 - (1-\lambda)^{2i}]$ approaches 1. Therefore,

the control limits tend to stabilize.
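The EWMA recursion and its time-varying limits can be sketched as below. The sketch assumes the statistic is started at the in-control mean so that the chart is centered at $\mu_0$; the function name `ewma_chart` and the data are hypothetical.

```python
import math

def ewma_chart(xs, mu0, sigma, lam=0.2, L=3.0):
    """Return rows of (Z_i, lcl_i, ucl_i) using the exact variance of Z_i."""
    z = mu0  # assumed starting value
    rows = []
    for i, x in enumerate(xs, start=1):
        z = lam * x + (1 - lam) * z
        sd = sigma * math.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * i)))
        rows.append((z, mu0 - L * sd, mu0 + L * sd))
    return rows

# Hypothetical observations around an in-control mean of 10
rows = ewma_chart([10.2, 9.7, 10.9, 11.3], mu0=10.0, sigma=1.0)
```

For $i = 1$ the standard deviation reduces to $\lambda\sigma$, and the limits then widen towards $\mu_0 \pm L\sigma\sqrt{\lambda/(2-\lambda)}$.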

The value of L depends on the given type I error of the EWMA control chart. It is

found that L = 3 (the usual three-sigma limits) works reasonably well in practice. The

parameter λ has an end effect on the performance of the EWMA scheme. Larger λ

values provide more weight to the recent data; whereas, smaller λ values provide more

weight to the older data values. For a λ value close to 0, the EWMA is similar to

the CUSUM chart and can detect small to moderate shifts in the process mean

(Crowder, 1989). However, a very small λ may postpone detection of a change in the process

mean when the value of the EWMA is on one side of the central line and a mean shift is

in the opposite direction, since the chart does not weight the new observation heavily.

This is called the inertia effect; see Woodall and Mahmoud (2005) for more discussion.

Multivariate Control Charts

In most SPC applications more than one quality characteristic is of interest. A simple

way to tackle this need is to use a univariate control chart for each quality characteristic.

However, this method has been criticized since the overall probability of a false

alarm is inflated, unless the control limits are adjusted accordingly, and any correlation

between the variables is ignored. This suggests that it might be worthwhile adopting


multivariate techniques.

Hotelling (1947) introduced a statistic which uniquely lends itself to plotting multivariate

observations. This statistic, appropriately named Hotelling’s $T^2$, is a scalar that

combines information from the dispersion and mean of several variables.

Let Xi be the ith vector of observations for the p variables that we want to monitor.

When the process is in-control, it is assumed that Xi follows a multivariate normal

distribution, with mean vector µ0 and covariance matrix Σ, independent of other observations.

Note that the $T^2$ chart is highly sensitive to the normality assumption (Stoumbos

and Sullivan, 2002).

In multivariate charting the objective is to detect a shift from µ0 to µ1 and charts

consider only the magnitude of any shift and not its direction. Hence, they use only an

upper control limit (UCL). If a statistic exceeds the UCL, the chart is said to signal,

and the process should be investigated to determine if the signal is due to an error in

the data, is indicative of a genuine shift, or simply the result of natural variability.

In a $T^2$ chart, for each observation $T_i^2 = (X_i - \mu_0)'\Sigma^{-1}(X_i - \mu_0)$ is plotted on a chart

with a UCL of $\chi^2_{\alpha,p}$, where $\mu_0$ and $\Sigma$ have been specified in the control chart construction

stage, Phase I, using a large sample. Other values for the control limit may also be used

depending on the size of samples and stage of monitoring; see Tracy et al. (1992) for

more details.
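A small sketch of the $T^2$ statistic for $p = 2$ variables follows. To stay within the standard library it inverts the $2 \times 2$ covariance matrix explicitly, and it uses the fact that a chi-square variable with two degrees of freedom has upper-tail probability $\exp(-x/2)$, so the UCL is available in closed form. The function names, the covariance matrix and the observations are hypothetical.

```python
import math

def t2_statistic(x, mu0, cov):
    """T^2 = (x - mu0)' Sigma^{-1} (x - mu0) for two variables."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    d1, d2 = x[0] - mu0[0], x[1] - mu0[1]
    # Sigma^{-1} = (1 / det) * [[d, -b], [-c, a]]
    return (d * d1 * d1 - (b + c) * d1 * d2 + a * d2 * d2) / det

def chi2_ucl_two_variables(alpha):
    """Upper-alpha chi-square quantile for 2 degrees of freedom."""
    return -2.0 * math.log(alpha)

mu0 = (0.0, 0.0)
cov = ((1.0, 0.5), (0.5, 1.0))             # positively correlated variables
ucl = chi2_ucl_two_variables(0.0027)
with_corr = t2_statistic((2.0, 2.0), mu0, cov)      # shift along the correlation
against_corr = t2_statistic((2.0, -2.0), mu0, cov)  # shift against the correlation
```

Both points lie two standard deviations from the mean on each variable, yet only the one that contradicts the correlation structure exceeds the UCL, which is the information separate univariate charts would ignore.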

There exist multivariate versions of the univariate EWMA and CUSUM. In the MEWMA we

let $Z_i = \lambda(X_i - \mu_0) + (1-\lambda)Z_{i-1}$, where $Z_0 = 0$, and plot $Z_i'\Sigma_{Z_i}^{-1}Z_i$, where

$\Sigma_{Z_i} = \frac{\lambda}{2-\lambda}\left[1 - (1-\lambda)^{2i}\right]\Sigma$ (Lowry et al., 1992). $\lambda$ and the UCL may be set by simulation considering

the desired performance of the chart.
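The MEWMA statistic can be sketched for two variables as follows, assuming the covariance of $Z_i$ is the in-control covariance matrix scaled by $\frac{\lambda}{2-\lambda}[1-(1-\lambda)^{2i}]$. The helper `quad_form_inv2`, the function name `mewma` and the data are hypothetical, and no UCL is computed since it would normally be found by simulation.

```python
def quad_form_inv2(v, cov):
    """v' cov^{-1} v for a 2 x 2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    return (d * v[0] ** 2 - (b + c) * v[0] * v[1] + a * v[1] ** 2) / det

def mewma(xs, mu0, cov, lam=0.2):
    """Return the plotted MEWMA statistics for bivariate observations xs."""
    z = [0.0, 0.0]
    stats = []
    for i, x in enumerate(xs, start=1):
        z = [lam * (x[j] - mu0[j]) + (1 - lam) * z[j] for j in range(2)]
        scale = lam / (2 - lam) * (1 - (1 - lam) ** (2 * i))
        # Sigma_Zi = scale * Sigma, so the quadratic form is divided by scale
        stats.append(quad_form_inv2(z, cov) / scale)
    return stats

identity = ((1.0, 0.0), (0.0, 1.0))
stats = mewma([(1.0, 1.0), (0.5, -0.2)], (0.0, 0.0), identity)
```

For the first observation the statistic coincides with the corresponding $T^2$ value, since $Z_1 = \lambda(X_1 - \mu_0)$ and the scale factor equals $\lambda^2$.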

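Among several versions of the MCUSUM, Crosier (1988) plots $L_i'\Sigma^{-1}L_i$, where

$$L_i = \begin{cases} 0 & \text{if } C_i \le k \\ (L_{i-1} + X_i - \mu_0)(1 - k/C_i) & \text{if } C_i > k, \end{cases} \tag{2.6}$$

and $C_i = \left((L_{i-1} + X_i - \mu_0)'\Sigma^{-1}(L_{i-1} + X_i - \mu_0)\right)^{1/2}$. Crosier (1988) recommended

setting $L_0 = 0$ and $k = (\delta'\Sigma^{-1}\delta)^{1/2}/2$. The UCL is calculated by simulation in order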

to achieve a desired ARL0. See Pignatiello and Runger (1990) for alternative MCUSUM

methods.
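The recursion in (2.6) can be sketched for two variables as below. The helper `quad_form_inv2`, the function name `mcusum`, the reference value and the data are hypothetical; the UCL is omitted because it would be obtained by simulation.

```python
import math

def quad_form_inv2(v, cov):
    """v' cov^{-1} v for a 2 x 2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    return (d * v[0] ** 2 - (b + c) * v[0] * v[1] + a * v[1] ** 2) / det

def mcusum(xs, mu0, cov, k):
    """Return the plotted statistics L_i' Sigma^{-1} L_i for bivariate data."""
    l_vec = [0.0, 0.0]
    stats = []
    for x in xs:
        s = [l_vec[j] + x[j] - mu0[j] for j in range(2)]
        c = math.sqrt(quad_form_inv2(s, cov))
        # shrink the accumulated vector towards zero by k, or reset it
        l_vec = [0.0, 0.0] if c <= k else [sj * (1 - k / c) for sj in s]
        stats.append(quad_form_inv2(l_vec, cov))
    return stats

identity = ((1.0, 0.0), (0.0, 1.0))
stats = mcusum([(0.3, 0.0), (3.0, 4.0)], (0.0, 0.0), identity, k=0.5)
```

When $C_i > k$ the plotted value equals $(C_i - k)^2$, which the second observation illustrates: $C_2 = 5$ gives $4.5^2 = 20.25$.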

2.2.4 Control Charts in Healthcare

The objective in healthcare surveillance is to monitor hospital incidents or performance

(lab turnaround time, number of medical errors, infection or death rates, readmission

rates, etc) for better understanding of incident patterns, detection of errors and im-

provement of service performance. Control charts, as the technical tools of continuous

improvement philosophy and quality management programs in the industrial sector,

have been considered by medical experts and are now being developed and applied

widely in healthcare services and hospital outcomes monitoring. Woodall (2006) and

Woodall et al. (2010) comprehensively reviewed the increasing stream of adaptations of

control charts and their implementation in healthcare surveillance. They acknowledged

the need for modification of the tools according to health sector characteristics such as

emphasis on monitoring individuals, particularly dichotomous data, multiple units, rare

events and patient mix.

Healthcare-Related Characteristics in Monitoring

The motivation of all developments and modifications of control charts in healthcare

surveillance is the differences between background specification and special aims within

industrial areas and the health context. In the following, some of the significant characteristics

of the health context are discussed. Meanwhile, industry-based techniques

which can potentially contribute to modification and development of control charts are

briefly addressed.

Attribute Data

In the healthcare sector the use of attribute data such as failures, counts and

times between events, as well as the use of surrogate data such as ICU length of stay

for mortality events, is much more usual than in industrial sectors (Woodall, 2006).


Therefore techniques developed for attributes, discussed in Woodall (1997), and particularly

those for high yield processes in industry, which are designed using exponential

and geometric models, could provide great resources for the health sector (Xie et al.,

2002; Yang et al., 2002).

Sampling

Throughput in healthcare systems and hospitals is generally very slow. In most cases

the behavior of a process in hospitals is monitored over all produced and gathered data,

not just on sampled observations (Woodall, 2006; Tsui et al., 2008).

Process Adjustment

In the healthcare sector and specifically in public health surveillance, it is not possible

to adjust an out of control process to return it quickly to in control performance. In

many SPC applications, the control chart might continue to provide alarms after its

first signal (Woodall, 2006; Tsui et al., 2008).

Phase I and II

Woodall (2006) comments that integration of stabilizing (Phase I) and monitoring

(Phase II) stages in running control charts in the health sector is the opposite of iden-

tified stages in industrial contexts. In light of this characteristic control charts are not

well examined through the common criterion (ARL) over different sizes of shifts where

the process is stable. Also, because of the tendency towards 100% inspection, other

alternatives such as time of observations to signal are recommended.

Short Run Process and Start Up

In spite of the fact that monitoring short run production and new processes have

been considered in the industrial context and qualified control charts such as Q-charts

and modified CUSUM charts have been developed and analyzed (Del Castillo and

Montgomery, 1994; Garjani et al., 2010; Zantek and Nestler, 2009; Celano et al., 2011),

Benneyan (2006) indicates that there has been no notable attempt to develop or apply

such methods in the health sector. In some monitoring cases there is no long run

of measurement; an example is monitoring vital signs of patients in intensive care

(Tsiamyrtzis and Hawkins, 2005). Therefore, in these circumstances sufficient data for

construction of a control chart in Phase I do not exist. On the other hand, the difficulty

and expense of reaching the Phase I condition to obtain in-control parameters motivates

healthcare researchers to monitor the process from the beginning and mix Phases I and II

(Woodall, 2006).

Multiple Units

Monitoring outcomes for more than one unit simultaneously, where a unit could be, for

example, a surgeon, general practitioner or hospital, is one of the important issues in

healthcare surveillance. Prominent early research which considered this issue is

by Aylin et al. (2003), which followed the Shipman case and investigated mortality rates

in primary care and was later developed by others (Marshall et al., 2004). These

authors faced extra-Poisson variation due to unmeasured case mix, which is known as

over-dispersion. Some authors have attempted to tackle over-dispersion of attribute

data (Christensen et al., 2003; Woodall, 1997), but Woodall (2006) argues that this

issue remains unresolved. Recently, Funnel plots have been applied for comparative

analysis in the context of multiple units. This plot is a standard tool within meta-

analysis as a graphical check of any relationship between effect estimates and their

precision. Spiegelhalter (2005a,b) has investigated the performance of Funnel plots on

risk adjusted data taking into account their over-dispersion. He has proposed different

strategies to overcome observed over-dispersion, such as improving risk stratification,

clustering, estimating an over-dispersion factor and assuming a random effects model.

Due to their performance and proposed modifications, Funnel plots have been considered

and applied by other researchers for the analysis of multiple units (Jones et al., 2008;

Mayer et al., 2009; Mohammed and Deeks, 2008; Mohammed et al., 2008).

Risk Adjustment

One of the major differences in the health sector is the variability of probability of

failure, such as probability of death for a patient in a hospital, which is related to

a patient’s personal characteristics such as age, demographics and health conditions.

The presence of risk and the required risk adjustment in constructing a control chart jeopardise

the stability of an in-control parameter in the design of a control chart. Therefore risk

adjusted versions of common control charts have been developed, reviewed and applied


in healthcare surveillance; see the following sections. However, the choice of a model for

calculating risk adjustment, which affects the control chart performance, requires more

research and development (Woodall, 2006).

Aggregated Data

Monitoring a cluster of events (diseases), which means that after obtaining data regu-

larly in intervals they are aggregated by location and time, is another usual procedure,

in particular for prospective public health and syndromic surveillance. It seems that

in the industrial area there is no study in cluster monitoring. Some of the proposed

methods have included a combination of control charts (CUSUM) and control charts

based on multivariate statistics (MEWMA, MCUSUM), but these have been argued

to be ineffective (Woodall, 2006). A few methods are being developed in the health

context, such as Scan methods (Rolka et al., 2007; Sonesson, 2007; Woodall et al., 2008)

which take into account aggregation by location and time. This significant difference

is not in the main interest of this study, so the argument is not followed here. For

more details see Woodall (2006), Tsui et al. (2008) and Morton et al. (2010) and the

references therein.

Prospective Approach

The prospective detection of clusters of events occurring close together both tempo-

rally and spatially is important in finding outbreaks of disease within a geographic

region in syndromic and public health surveillance (Woodall et al., 2008). Prospective

monitoring and prediction of such adverse events is one of the main challenges of

healthcare researchers compared with their industrial colleagues. Some modifications

and developments on Scan statistics and control charts have been proposed to provide

more predictive tools (Fricker Jr and Chang, 2008; Marshall et al., 2007; Sego, 2006;

Sego et al., 2009; Woodall et al., 2008).

Control Charts Development and Application in Healthcare

Morton and Lindsten (1976) proposed the resetting sequential probability ratio test (RSPRT)

chart to monitor the rate of Down’s syndrome over time. This method is a special case of

Wald’s test (Wald, 1947) and an alternative to the CUSUM method; however, the CUSUM

was found to be superior in performance (Grigg et al., 2003). The sets method

initially proposed by Chen (1978) was considered to monitor the number of newborns

with a specific congenital malformation. Although this method is easy in construction

and implementation compared to CUSUM and EWMA, it is outperformed by alterna-

tives (Sego, 2006). One of the earliest comprehensive research studies was undertaken

by Benneyan (1998a,b) who utilized SPC methods and control charts in epidemiology

and infection control and discussed a wide range of control charts in the health context.

He also investigated the application of the geometric control chart for tracking

adverse events and then implemented this method for mortality rare events (Benneyan,

2001; Benneyan et al., 2003). Morton et al. (2001) also applied CUSUM, EWMA and

Shewhart control charts in monitoring hospital-acquired infections at a local hospital

and compared their performance. However the nature of processes in healthcare drove

quality experts to consider specific characteristics of clinical monitoring.

Risk adjustment has been considered in the development of control charts due to the

impact of the human element in process outcomes. Lovegrove et al. (1997) proposed the

variable life adjusted display (VLAD) in which the cumulative difference between the

expected and observed cumulative deaths, $\sum_i p_i - \sum_i y_i$, is monitored. In this formulation,

$p_i$ is the expected risk of death predicted by an appropriate risk model, and $y_i$ is

the observed process outcome. Although the chart benefits from illustrative features, the

statistical performance of the chart in signalling has been questioned (Grigg and Farewell,

2004a). A variation of the VLAD chart was also proposed by Poloniecki et al. (1998). In

this chart the number of deaths is assumed to follow a Poisson distribution with a

mean obtained by considering expected risk of death and in-control performance. The

chart has a control limit based on the $\chi^2_1$ statistic. The performance of the chart in terms of ARL

has also been challenged (Steiner and Cook, 2000).
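The VLAD statistic is simple to compute, as the following sketch shows. The function name `vlad`, the predicted risks and the outcomes are hypothetical; outcomes are coded 1 for death and 0 for survival.

```python
from itertools import accumulate

def vlad(risks, outcomes):
    """Cumulative expected-minus-observed deaths after each patient."""
    return list(accumulate(p - y for p, y in zip(risks, outcomes)))

# Hypothetical predicted risks and observed outcomes for four patients
path = vlad([0.1, 0.2, 0.5, 0.05], [0, 0, 1, 0])
```

Each survivor moves the display up by the patient's predicted risk and each death moves it down by one minus that risk, so a sustained downward drift suggests more deaths than expected.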

A risk-adjusted p-chart was also proposed for monitoring mortality in intensive care

units, where the control limits are adjusted according to the predicted risk of death and

number of patients admitted (Cook et al., 2003). Care should be taken since

Shewhart charts are very sensitive to the normality assumption (Grigg and Farewell,

2004b).


In a seminal paper, Steiner and Cook (2000) developed a risk-adjusted version of

CUSUM to monitor surgical outcomes, death and survival, which are influenced by

the state of a patient’s health, age and other clinical factors known prior to the pro-

cedure. This approach has been extended to EWMA (RAEWMA) (Cook, 2004; Grigg

and Spiegelhalter, 2007). Both modified procedures have been intensively reviewed

and are now well established for monitoring clinical outcomes where the observations

are recorded as binary data (Grigg et al., 2003; Grigg and Farewell, 2004b; Grigg and

Spiegelhalter, 2006; Cook et al., 2008). Risk adjustment was also considered in the application

of the sets method (Grigg and Farewell, 2004a) and the RSPRT chart (Spiegelhalter

et al., 2003) in a clinical setting. These methods are not followed in this study since

their performance in comparison with CUSUM and EWMA methods has been questioned

(Woodall, 2006; Sego, 2006).

Monitoring patient survival time instead of binary outcomes of a process in the presence

of patient mix has recently been proposed in the healthcare context. In this setting a

continuous time-to-event variable within a follow-up period is considered. The variable

may be right censored due to a finite follow-up period. Biswas and Kalbfleisch (2008)

developed a risk-adjusted CUSUM based on the Cox model for failure time outcomes. Sego

et al. (2009) used an accelerated failure time regression model to capture the heterogeneity

among patients prior to the surgery and developed a risk-adjusted survival time

CUSUM (RAST CUSUM) scheme. They showed that this procedure is more sensitive

in detection of an increase in the odds ratio compared to risk-adjusted CUSUM charts.

Steiner and Jones (2010) challenged the updating feature of the RAST CUSUM and

highlighted the 30-day delay of the chart in capturing patients who survived the follow-up

period. They extended this approach by proposing an updating EWMA (uEWMA)

procedure based on the same survival time model discussed by Sego et al. (2009). In

this scheme, Steiner and Jones (2010) allow the chart to be updated on an ongoing

basis to reflect the latest information. Therefore it signals more quickly than the RAST CUSUM,

which must wait until the follow-up period passes to update the monitoring. For more

details on uEWMA refer to Steiner and Jones (2010) since it is not followed in this

research.

It should be noted that there are also other developments in charting methods in the


health context, which are not in the scope of the current study, including: a) Public

health surveillance which aims to understand trends and detect changes in disease

incidence and death rates for planning, implementation and evaluation of public health

practice, b) Syndromic surveillance which aims to detect disease outbreaks (natural

or an intended bioattack) earlier than would be achieved via conventional reporting of

confirmed cases. For more details about these areas and methods mainly spatial and

temporal techniques, excellent reviews by Woodall (2006), Woodall et al. (2010) and

Tsui et al. (2008) and the references therein are recommended.

Risk-Adjusted CUSUM and EWMA

The risk of death of a patient after a cardiac surgery is affected by the rate of mortality

in the cardiac surgery and also a patient’s covariates such as age, gender, co-morbidities,

etc. Risk-adjusted control charts are monitoring procedures designed to detect changes

in a process parameter of interest, such as rate of mortality, where the process outcomes

are affected by covariates that we are not really interested in, such as case mix. In these

procedures, risk models are used to adjust control charts in a way that the effects of

covariates for each input, patient say, would be taken into account.

A risk-adjusted CUSUM (RACUSUM) control chart is a sequential monitoring scheme

that accumulates evidence of the performance of the process and signals when either

a deterioration or an improvement is detected, where the evidence has been adjusted

according to a patient’s prior risk (Steiner and Cook, 2000).

For the $i$th patient, we observe $y_i$ where $y_i \in \{0, 1\}$. This leads to a sequential set of

Bernoulli data. The RACUSUM continuously evaluates a hypothesis of an unchanged

risk-adjusted odds ratio, $OR_0$, against an alternative hypothesis of a changed odds ratio,

$OR_1$, in the Bernoulli process (Cook et al., 2008). A weight $W_i$, the so-called CUSUM

score, is given to each patient considering the observed outcome $y_i \in \{0, 1\}$ and the

prior risk $p_i$,

$$W_i^{\pm} = \begin{cases} \ln\left[\dfrac{(1 - p_i + OR_0 \times p_i) \times OR_1}{1 - p_i + OR_1 \times p_i}\right] & \text{if } y_i = 1 \\[2ex] \ln\left[\dfrac{1 - p_i + OR_0 \times p_i}{1 - p_i + OR_1 \times p_i}\right] & \text{if } y_i = 0. \end{cases} \tag{2.7}$$

Upper and lower CUSUM statistics are obtained through $X_i^+ = \max\{0, X_{i-1}^+ + W_i^+\}$

and $X_i^- = \min\{0, X_{i-1}^- - W_i^-\}$, respectively, and then plotted over $i$. Often the null

hypothesis, $OR_0$, is set to 1 and the CUSUM statistics, $X_0^+$ and $X_0^-$, are initialized at

0. Therefore an increase in the odds ratio, $OR_1 > 1$, is detected when a plotted $X_i^+$

exceeds a specified decision threshold $h^+$; similarly, if $X_i^-$ falls below a specified decision

threshold $h^-$, the RACUSUM chart signals that a decrease in the odds ratio, $OR_1 < 1$,

has occurred. See Steiner and Cook (2000) for more details.
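A minimal sketch of the RACUSUM follows. The score is written as the log-likelihood ratio of the alternative odds ratio against the null odds ratio for a single Bernoulli outcome, in the spirit of Steiner and Cook (2000); with $OR_0 = 1$ a death contributes $\ln[OR_1/(1 - p_i + OR_1 p_i)]$ and a survival contributes $-\ln(1 - p_i + OR_1 p_i)$. The function names, the alternative odds ratios of 2 and 0.5, and the data are hypothetical, and decision thresholds are omitted.

```python
import math

def racusum_score(y, p, or1, or0=1.0):
    """Log-likelihood ratio of odds ratio or1 versus or0 for outcome y (1 = death)."""
    base = math.log((1 - p + or0 * p) / (1 - p + or1 * p))
    return base + (math.log(or1 / or0) if y == 1 else 0.0)

def racusum(outcomes, risks, or_up=2.0, or_down=0.5):
    """Return the path of (X_plus, X_minus) over patients."""
    x_up, x_down = 0.0, 0.0
    path = []
    for y, p in zip(outcomes, risks):
        x_up = max(0.0, x_up + racusum_score(y, p, or_up))        # deterioration
        x_down = min(0.0, x_down - racusum_score(y, p, or_down))  # improvement
        path.append((x_up, x_down))
    return path

# Hypothetical patients: predicted risks and observed outcomes
path = racusum([0, 1, 1, 0], [0.10, 0.30, 0.05, 0.20])
```

A death of a low-risk patient adds more to the upper statistic than a death of a high-risk patient, which is how the prior risk enters the accumulated evidence.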

A risk-adjusted EWMA (RAEWMA) control chart is a monitoring procedure in which

an exponentially weighted estimate of the observed process mean is continuously com-

pared to the corresponding predicted process mean obtained through the underlying

risk model. The EWMA statistic of the observed mean is obtained through $Zo_i = \lambda \times y_i + (1-\lambda) \times Zo_{i-1}$.

$Zo_i$ is then plotted in a control chart constructed with

$Zp_i = \lambda \times p_i + (1-\lambda) \times Zp_{i-1}$ as the center line and control limits of $Zp_i \pm L \times \sigma_{Zp_i}$, where

the variance of the predicted mean is equal to $\sigma^2_{Zp_i} = \lambda^2 \times p_i(1-p_i) + (1-\lambda)^2 \times \sigma^2_{Zp_{i-1}}$. We

let $\sigma^2_{Zp_0} = 0$ and initialize both running means, $Zo_0$ and $Zp_0$, at the overall observed

mean, $p_0$ say, in the calibration stage of the risk model and control chart (so-called

Phase I in an industrial context); see Cook (2004) and Cook et al. (2008) for more

details. The smoothing constant λ of EWMA charts is determined considering the size

of shift that is desired to be detected and the overall process mean; see Somerville et al.

(2002) for more details.
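The RAEWMA recursions can be sketched directly from the definitions above. The function name `raewma`, the smoothing constant, the limit coefficient and the data are hypothetical choices made only for illustration.

```python
import math

def raewma(outcomes, risks, p0, lam=0.05, L=3.0):
    """Return rows of (Zo_i, lcl_i, Zp_i, ucl_i) for binary outcomes and predicted risks."""
    z_obs = z_pred = p0  # both running means start at the overall observed mean
    var = 0.0            # variance of the predicted mean starts at zero
    rows = []
    for y, p in zip(outcomes, risks):
        z_obs = lam * y + (1 - lam) * z_obs
        z_pred = lam * p + (1 - lam) * z_pred
        var = lam ** 2 * p * (1 - p) + (1 - lam) ** 2 * var
        half = L * math.sqrt(var)
        rows.append((z_obs, z_pred - half, z_pred, z_pred + half))
    return rows

# Hypothetical patients with predicted risks and observed outcomes (1 = death)
rows = raewma([1, 0, 0], [0.1, 0.2, 0.05], p0=0.1, lam=0.1)
```

Unlike the standard EWMA chart, the center line and the limits both move with the case mix of the patients entering the process.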

The decision thresholds of the RACUSUM, h+ and h−, and the coefficient of the control

limits in RAEWMA control charts, L, are determined in a way that the charts have

a specified performance in terms of false alarm and detection of shifts in odds ratio;

see Montgomery (2008) and Steiner and Cook (2000) for more details. The proposed

initialization may also be altered to achieve better performance in the detection of

changes that immediately occur after control chart initialization, see Steiner (1999) and


Knoth (2005) for more details on fast initial response (FIR). There exists an alternative

for risk-adjusted EWMA in which the focus is on estimation of probability of death

using pseudo observations and Bayesian methods (Cook et al., 2008). This formulation

will not be followed here; see Grigg and Spiegelhalter (2007) for more details.

Risk-Adjusted Survival Time CUSUM

The survival time of a patient after cardiac surgery is affected by the rate of mortality

of cardiac surgery within the hospital and also patient covariates such as age, gender,

co-morbidities and so on. Risk-adjusted control charts of time-to-event are monitoring

procedures designed to detect changes in a process parameter of interest, such as sur-

vival time, where the process outcomes are affected by covariates that we are not really

interested in, such as patient mix. In these procedures, regression models for time are

used to adjust control charts in such a way that the effects of covariates for each input,

patient say, would be eliminated.

The RAST CUSUM proposed by Sego et al. (2009) continuously evaluates a hypoth-

esis of an unchanged and in-control survival time distribution, f(xi, θi0), against an

alternative hypothesis of a changed, out-of-control, distribution, f(xi, θi1) for the ith

patient. In this setting the density function f(.) explains the observed survival time,

xi, that should be adjusted based on the observed patient covariates.

The patient index i = 1, 2, ... corresponds to the time order in which the patients

undergo the surgery. We thus observe $(t_i, \delta_i)$ where

$$t_i = \min(x_i, c) \quad \text{and} \quad \delta_i = \begin{cases} 1 & \text{if } x_i \le c \\ 0 & \text{if } x_i > c. \end{cases} \tag{2.8}$$

Here c is a fixed censoring time, equal to the follow-up period. We assume that the

survival time, xi, for the ith patient and consequently (ti, δi), are not updated after the

follow-up period. This leads to a dataset of right censored times, ti.

An accelerated failure time (AFT) regression model is used to predict survival time


functions, f(.), for each patient in the presence of covariates, ui. However other models

such as a Cox model that also allows capture of covariates can be considered in a similar

manner.

In an AFT model the survival function for the $i$th patient with covariates $u_i$, $S(x_i, \theta_i \mid u_i)$,

is equivalent to the baseline survival function $S_0(x_i \exp(\beta^T u_i))$, where $\beta$ is a vector

of covariate coefficients.

Several distributions can be used to model survival time with an AFT. Here we focus

on the Weibull distribution and outline relevant RAST CUSUM statistics; see Klein

and Moeschberger (1997) for more details. For a Weibull distribution the baseline

survival function is $S_0(x) = \exp[-(x/\lambda)^{\alpha}]$ where $\alpha > 0$ and $\lambda > 0$ are shape and

scale parameters, respectively. For the RAST CUSUM procedure, all parameters of

the Weibull survival function, β, α and λ, are estimated using training data, so-called

Phase I. In this phase, an available dataset of patient records is used, assuming that

the process is in-control for that period of time. A set of independent priors can also

be used to obtain posterior estimates of the AFT parameters over the training data.

It has been discussed that any shift in the quality of the process of interest can

be interpreted in terms of a shift in the scale parameter, λ; see Sego et al. (2009) and

Steiner and Jones (2010). Hence the RAST CUSUM procedure can be constructed and

calibrated to detect a specific size of change in the average or median survival time

(MST) since any shift in λ is equivalent to an identical shift in the size of average or

median survival time. Thus the CUSUM score, Wi, is given by

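$$W_i^{\pm}(t_i, \delta_i \mid u_i) = \left(1 - (\rho^{\pm})^{-\alpha}\right)\left(\frac{t_i \exp(\beta^T u_i)}{\lambda_0}\right)^{\alpha} - \delta_i \alpha \ln \rho^{\pm}, \tag{2.9}$$

where it is designed to detect an increase (or decrease) from $\lambda_0$ to $\lambda_1^+ = \rho^+\lambda_0$ ($\lambda_1^- = \rho^-\lambda_0$).

Upper and lower CUSUM statistics are obtained through $Z_i^+ = \max\{0, Z_{i-1}^+ + W_i^+\}$

and $Z_i^- = \min\{0, Z_{i-1}^- - W_i^-\}$, respectively, and then plotted over $i$. Often the

CUSUM statistics, $Z_0^+$ and $Z_0^-$, are initialized at 0.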
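The censoring rule and the Weibull score can be sketched as follows for a single shift factor $\rho$. The score is the log-likelihood ratio of a Weibull AFT model with scale $\rho\lambda_0$ against scale $\lambda_0$, which gives $(1 - \rho^{-\alpha})(t_i e^{\beta^T u_i}/\lambda_0)^{\alpha} - \delta_i \alpha \ln \rho$. The function names, the argument `linpred` (standing for $\beta^T u_i$), the parameter values and the data are hypothetical, and only one accumulating statistic of the form $\max\{0, Z_{i-1} + W_i\}$ is shown.

```python
import math

def censor(x, c):
    """Return (t, delta): the observed time and the event indicator for follow-up c."""
    return (min(x, c), 1 if x <= c else 0)

def rast_score(t, delta, linpred, rho, alpha, lam0):
    """Weibull AFT log-likelihood ratio of scale rho * lam0 versus lam0."""
    z = (t * math.exp(linpred) / lam0) ** alpha
    return (1.0 - rho ** (-alpha)) * z - delta * alpha * math.log(rho)

def rast_cusum(records, rho, alpha, lam0, c):
    """records: iterable of (survival_time, linpred); returns the CUSUM path."""
    z_stat, path = 0.0, []
    for x, linpred in records:
        t, delta = censor(x, c)
        z_stat = max(0.0, z_stat + rast_score(t, delta, linpred, rho, alpha, lam0))
        path.append(z_stat)
    return path

# Hypothetical patients: one survives the 30-day follow-up, one dies on day 5
path = rast_cusum([(45.0, 0.0), (5.0, 0.0)], rho=2.0, alpha=1.0, lam0=10.0, c=30.0)
```

With $\rho > 1$ a censored (surviving) patient always contributes a positive score, while an early death contributes a negative one.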

An increase in the MST is detected when a plotted Z−i exceeds a specified decision

threshold h−; similarly, if Z+i exceeds a specified decision threshold h+, the RAST


CUSUM chart signals that a decrease in the MST has occurred. Although this interpretation

of a chart’s signals is in contrast with the common expression used for

standard risk-adjusted control charts for binary outcomes, it seems reasonable taking

into account that any increase in the MST can be characterized as a drop in the odds

of mortality. However in the Weibull distribution scenario for a specific change size in

the MST, the equivalent magnitude of shift in odds is not obtainable; see Sego et al. (2009).


The magnitudes of the decision thresholds in RAST CUSUM, h+ and h−, are deter-

mined in such a way that the charts have a specified performance in terms of false

alarm and detection of shifts in the MST. In this regard, Markov chain and simulation

approaches can be applied; see Sego (2006) for more details. The proposed initializa-

tion may also be altered to achieve better performance in the detection of changes that

immediately occurred after control chart construction; see Steiner (1999) and Knoth

(2005) for more details on fast initial response (FIR).

2.2.5 Change Point Estimation in Control Charting

The need to know the time at which a process began to vary, the so-called change

point, has recently been raised and discussed in the industrial context of quality control.

Accurate detection of the time of change can help in the search for a potential cause

more efficiently as a tighter time-frame prior to the signal in the control charts is

investigated.

A built-in change point estimator in CUSUM charts was suggested by Page (1954, 1961). An equivalent estimator in EWMA charts was also proposed by Nishina (1992). The change points from CUSUM and EWMA are the points at which the charts were last at zero (Hawkins and Olwell, 1998) and at the process mean (Nishina, 1992), respectively. Neither estimator provides any statistical inference on the obtained estimates. That said, Hinkley (1971) studied the distribution of the built-in estimator of CUSUM charts and derived an asymptotic distribution that enables inferences to be made. These early built-in change point estimators can be applied for all discrete and continuous distributions underlying the charts.
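As a minimal sketch of the built-in CUSUM estimator, the function below scans backwards from the signal and returns the last index at which the plotted statistic was zero. The function name is hypothetical, and the path is assumed to start with the initial value $Z_0 = 0$.

```python
def cusum_builtin_change_point(z, signal_index):
    """Return the last index at or before signal_index where the CUSUM was zero.

    z: CUSUM path including the initial value z[0] = 0 (assumed).
    signal_index: index at which the chart signalled.
    The process is then estimated to have changed after the returned index.
    """
    for i in range(signal_index, -1, -1):
        if z[i] == 0:
            return i
    return None  # only reached if the path never touched zero


# Example: the chart signals at i = 5 and was last at zero at i = 2.
estimate = cusum_builtin_change_point([0, 1.0, 0, 0.5, 1.7, 3.2], 5)
```

The estimate is a single point with no measure of uncertainty attached.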


Samuel et al. (1998b) proposed a maximum likelihood estimator (MLE) for the estimation of the change point in control charting. This estimator was extended to estimate the time of a step change in a normal process mean being monitored by an $\bar{X}$ chart (Samuel and Pignatiello, 1998). Similar MLE estimators were also developed and compared to the chart’s signal following a step change in a Poisson process and a process fraction nonconformity monitored by a c-chart (Samuel et al., 1998a) and a p-chart (Samuel and Pignatiello, 2001), respectively. They demonstrated how closely MLE estimators estimate the change point in comparison with the Shewhart control charts.
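To make the idea concrete, the following is a minimal profile-likelihood sketch for a step change in a Poisson rate. It assumes the in-control rate `lam0` is known and the post-change rate is unknown and replaced by the mean of the post-change segment. It is an illustration under these assumptions, not the exact estimator derived in the cited papers, and all names are hypothetical.

```python
import math


def poisson_step_change_mle(counts, lam0):
    """Estimate tau, the number of in-control observations before a step change.

    counts[:tau] are assumed Poisson(lam0) with lam0 known; counts[tau:] are
    assumed Poisson(lam1) with lam1 replaced by its MLE (the segment mean).
    Returns (tau_hat, maximised log-likelihood ratio against "no change").
    """
    n = len(counts)
    best_tau, best_llr = None, -math.inf
    for tau in range(n):
        tail = counts[tau:]
        m, s = len(tail), sum(tail)
        lam1 = s / m
        if s > 0:
            # The x! terms cancel in the ratio against the no-change model.
            llr = s * math.log(lam1 / lam0) - m * (lam1 - lam0)
        else:
            llr = m * lam0  # limit of the expression above as lam1 -> 0
        if llr > best_llr:
            best_tau, best_llr = tau, llr
    return best_tau, best_llr


counts = [2, 1, 3, 2, 1, 2, 3, 1, 2, 2, 9, 8, 10, 7, 9, 11, 8, 9, 10, 8]
tau_hat, _ = poisson_step_change_mle(counts, lam0=2.0)
```

In this artificial series the rate jumps after the tenth observation, and the search recovers that position.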

Subsequently, Perry (2004) evaluated the performance of the MLE estimator and re-

ported that it outperforms Poisson CUSUM and Poisson EWMA built-in estimators

in the presence of a step change. He also constructed a confidence set on the estimated

change point which covers the true process change point with a given level of certainty

using a likelihood function based upon the method proposed by Box and Cox (1964).

Perry and Pignatiello (2005) extended the model and compared the performance of the

derived MLE estimator for a step change in monitoring fraction nonconformity with

EWMA and CUSUM charts. Perry et al. (2006) then derived an MLE estimator and confidence set under a linear trend assumption where the process parameter changes over time. This type of change is common and can be caused, for example, by tool wear, improvement in operator skill and the spread of infections over time. They showed

that this is superior to the step change estimator if a linear trend disturbance occurs

in the Poisson rate.

Perry et al. (2007a,b) challenged the underlying assumption of knowing the form of

change types in these approaches and noted that either a step change or a linear trend

with constant slope could not adequately describe what often happens in practice.

They extended the MLE approach to the situation in which no prior knowledge of the

change type exists. The only assumption they made was that the form of shifts belongs

to the set of monotonic effects. They derived a change point estimator and constructed

confidence sets for non-decreasing multiple step change points using isotonic regression

models. The performance of this estimator was compared with the step change and

linear trend MLE estimators where a step change, a linear trend and multiple change

points are present. The multiple change point estimator was reported to relatively


outperform other MLE estimators for some magnitudes of step and linear trend dis-

turbances and in the case of multiple change points it was shown to be the superior

estimator. However, the estimator still remains dependent on a priori knowledge about

the behavior of the shifts, such as monotonic change. In practice, it is not uncommon to experience non-monotonic consecutive changes that may occur as a result of one

influential process input variable changing several times or several influential process

input variables changing at different times. Indeed, these changes could influence the

process mean in any direction and lead to multiple change points in the Poisson mean

which are not necessarily monotonic.

All MLE estimators described above were developed assuming that the underlying

distribution is stable over time. This assumption often cannot be satisfied in monitoring

clinical outcomes as the mean of the process being monitored is highly correlated with

individual characteristics of patients. Therefore, it is required that the risk model,

which explains patient mix, be taken into consideration in detection of true change

points in control charts.

The development of change point estimators has been extended to more complex probability distributions, types of processes (multistage processes, profile quality characteristics, high yield processes), types of data (multivariate, profile variables and autocorrelated data) and change type scenarios (mean and variance). For example, in the case of a very low fraction non-conforming, Noorossana et al. (2009) derived and analyzed the MLE estimator of a step change based on the geometric distribution control charts discussed by

Xie et al. (2002). Other research has also proposed and analyzed new estimators based

upon clustering (Ghazanfari et al., 2008; Alaeddini et al., 2009) and artificial neural

network methods (Ahmadzadeh, 2009). Amiri and Allahyari (2011) comprehensively

reviewed the body of knowledge in change point estimation in control charting in an

industrial context. Yet, no change point model has been developed considering clinical

characteristics.


2.3 Bayesian Approach

A Bayesian approach to statistical modelling and analysis allows estimates to be based

on a synthesis of prior distributions and current sample data. In the classical Fre-

quentist viewpoint of statistical theory, a statistical procedure is judged by averaging

its performance over all possible data. However, the Bayesian approach gives prime

importance to how a given procedure performs for the actual data observed in a given

situation. Further, in contrast to the Frequentist procedures, Bayesian approaches for-

mally use information available from sources other than the statistical investigation.

Such information, available through expert judgment, past experience or literature,

is described by a probability distribution on the set of all possible values of the un-

known parameter of the statistical model at hand. Bayesian methods provide a com-

plete paradigm for both statistical inference and decision making under uncertainty.

Bayesian methods contain (as particular cases) many of the more often used Frequen-

tist procedures, solve many of the difficulties faced by conventional statistical methods,

and extend the applicability of statistical methods.

Statistical inference for a quantity of interest in a Bayesian framework is described as the modification of the uncertainty about its value in the light of evidence, and Bayes’ theorem precisely specifies how this modification should be made:

Posterior ∝ Likelihood × Prior, (2.10)

where “Prior” is the state of knowledge about the quantity of interest in terms of a

probability distribution before data are observed; “Likelihood” is a model underlying

the observations, and “Posterior” is the state of knowledge about the quantity after

data are observed, which also is in the form of a probability distribution.

Applying the Bayesian framework and obtaining a posterior distribution for parameters

of interest enables us to construct probability based intervals around estimated param-

eters. A credible interval (CI) is a posterior probability based interval which involves

those values of highest probability in the posterior density of the parameter of interest.
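When the posterior is available only through simulated draws, an interval of highest posterior density can be approximated by the shortest interval that contains the required share of the draws. The sketch below assumes a unimodal posterior; the function name is hypothetical.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the posterior draws.

    Assumes a unimodal posterior, so that the highest-density region is a
    single interval; for multimodal posteriors this is only an approximation.
    """
    s = sorted(samples)
    n = len(s)
    k = max(1, math.ceil(mass * n))          # number of draws to cover
    # Slide a window of k consecutive order statistics and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: s[i + k - 1] - s[i])
    return s[start], s[start + k - 1]
```

For a skewed set of draws the interval concentrates on the dense region instead of cutting equal tails on each side.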

Choice of prior distribution is very critical as it essentially indicates how we believe the

parameter would behave if we had no data from which to make our decision. In other

words, a prior is often the purely subjective assessment of an expert. An informative

prior expresses specific, definite information about the parameter of interest. The posterior of the parameter of interest is therefore largely determined by the prior when the data carry little information. When there is no a priori knowledge about the parameter, an uninformative distribution, or diffuse prior, can be used. In this setting, the posterior is heavily affected by the observed data. In either case it is common to apply conjugate

priors to make calculation of the posterior distribution easier by giving a closed-form

expression for the posterior (Gelman et al., 2004).
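A standard example of conjugacy is a Gamma prior on a Poisson rate, for which the posterior is again a Gamma distribution. The sketch below assumes the shape and rate parameterisation; the function name and the numbers are illustrative only.

```python
def gamma_poisson_update(a, b, counts):
    """Posterior parameters for a Poisson rate under a Gamma(a, b) prior.

    a is the shape and b the rate of the prior (assumed parameterisation).
    With n counts summing to s, the posterior is Gamma(a + s, b + n).
    """
    return a + sum(counts), b + len(counts)


# Prior Gamma(2, 1) and three observed counts give a Gamma(14, 4) posterior.
a_post, b_post = gamma_poisson_update(2.0, 1.0, [3, 5, 4])
posterior_mean = a_post / b_post   # 14 / 4 = 3.5
```

The posterior mean lies between the prior mean (2) and the sample mean (4), with the weight on the data growing as more counts are observed.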

The Bayesian structure can be expanded to multiple levels in a hierarchical fashion, giving Bayesian hierarchical models, which enrich the model by capturing all kinds of uncertainty in the observed data as well as in the priors. Bayesian hierarchical modeling has been increasingly recognized as a powerful approach for analyzing complex phenomena. Bayesian

hierarchical models are now commonly used both within and outside the statistics

literature, and are widely lauded for their capacity to synthesize data from different

sources, to accommodate complicated dependence structures, to handle irregular fea-

tures of data such as missingness and censoring, and to incorporate scientifically based

process information (Craigmile et al., 2009).

2.3.1 Bayesian Computation

In complicated Bayesian models, it is often not easy to obtain the posterior distribution

analytically. This analytic bottleneck has been eliminated by the emergence of Markov

chain Monte Carlo (MCMC). In MCMC algorithms a Markov chain is constructed whose stationary distribution is the posterior distribution of the parameters. After a sufficiently long run, samples generated from the Markov chain using a proposal transition density can be treated as draws from the posterior distributions of interest. An advantage of estimating Bayesian model parameters with MCMC methods is that they yield estimates of all model parameters, including estimates of model parameters associated with specific respondents. In addition, the use of MCMC methods facilitates the study

of functions of model parameters that are closely related to decisions faced by process


experts.

MCMC is the general procedure of simulating such Markov chains and using them to draw inference about the characteristics of the target distribution $f(x)$. The methods that popularized MCMC are the Gibbs sampler and the more general Metropolis-Hastings algorithm. These methods are simply prescriptions for constructing a Markov transition kernel $p(x \mid x^*)$ which generates a Markov chain $x^{(1)}, \ldots, x^{(k)}$ converging to $f(x)$. We here briefly outline

the above methods as well as an extension of MCMC known as reversible jump MCMC

which is used for model selection. For more details on Bayesian computation methods

and other variations of MCMC see Gelman et al. (2004).

Metropolis-Hastings algorithm

A Metropolis-Hastings algorithm generates a Markov chain which converges to $f(x)$ by successively sampling from an (essentially) arbitrary proposal distribution $q(x \mid x^*)$, a Markov transition kernel, and imposing a random rejection step at each transition. For a candidate proposal distribution $q(x \mid x^*)$, the algorithm entails simulating $x^{(1)}, \ldots, x^{(k)}$ as follows (Hastings, 1970):

• Simulate a transition candidate $x^C$ from $q(x \mid x^{(j)})$.

• Set $x^{(j+1)} = x^C$ with probability

$$\alpha(x^{(j)}, x^C) = \min\left\{1, \frac{q(x^{(j)} \mid x^C)}{q(x^C \mid x^{(j)})}\,\frac{f(x^C)}{f(x^{(j)})}\right\};$$

otherwise set $x^{(j+1)} = x^{(j)}$.

The original Metropolis algorithm was based on a symmetric proposal distribution, $q(x \mid x^*) = q(x^* \mid x)$, for which $\alpha$ takes the simple form $\min\{1, f(x^C)/f(x^{(j)})\}$. If the proposal distribution

is chosen such that the Markov chain satisfies modest conditions (irreducibility and ape-

riodicity), then convergence to f(x) is guaranteed. However, the rate of convergence

will depend on the relationship between q(x|x∗) and f(x).

The choice of the proposal distribution is critical for the efficiency of the algorithm.

On one hand, it could lead to a large number of candidates xC being rejected, and

on the other hand it could result in accepting nearly all proposed candidates, but the

candidates could be close to each other in the space of the distribution of x(j). In both

cases the algorithm is inefficient as it does not mix rapidly.
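The Metropolis-Hastings steps can be sketched in a few lines. The example below assumes a standard normal target and a Gaussian random-walk proposal with unit step size, both chosen purely for illustration; all function names are hypothetical. Densities are handled on the log scale to avoid numerical underflow.

```python
import math
import random


def metropolis_hastings(log_f, sample_q, log_q, x0, n_iter, rng):
    """Generic Metropolis-Hastings sampler.

    log_f(x): log of the target density, known up to a constant.
    sample_q(x, rng): draws a candidate from q(. | x).
    log_q(x_new, x_old): log q(x_new | x_old).
    """
    x, chain = x0, []
    for _ in range(n_iter):
        cand = sample_q(x, rng)
        # log of [q(x | cand) / q(cand | x)] * [f(cand) / f(x)]
        log_alpha = (log_q(x, cand) - log_q(cand, x)) + (log_f(cand) - log_f(x))
        if rng.random() < math.exp(min(0.0, log_alpha)):
            x = cand                 # accept the candidate
        chain.append(x)              # on rejection the old state is repeated
    return chain


def log_target(x):                   # illustrative target: standard normal
    return -0.5 * x * x


def random_walk(x, rng):             # symmetric proposal, step size 1 (assumed)
    return x + rng.gauss(0.0, 1.0)


def log_random_walk(x_new, x_old):
    return -0.5 * (x_new - x_old) ** 2


chain = metropolis_hastings(log_target, random_walk, log_random_walk,
                            0.0, 20000, random.Random(1))
```

Because the proposal here is symmetric, the proposal ratio cancels and the sampler reduces to the original Metropolis algorithm. A much smaller or much larger step size would illustrate the two kinds of slow mixing described above.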

Gibbs Sampler

The Gibbs sampler, originally developed by Geman and Geman (1984), is a special case of the Metropolis-Hastings algorithm in which the proposal density for updating a component of $x^{(j)}$ equals its full conditional distribution, so that proposals are accepted with probability 1. In the Gibbs algorithm, samples are drawn from the full conditional component distributions $f(x_i \mid x_{-i})$, $i = 1, \ldots, p$, where $x_{-i}$ denotes the components of $x$ other than $x_i$. The samples are generated as follows:

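• Initialize $x_2^{(0)}, x_3^{(0)}, \ldots, x_p^{(0)}$.

• For $j = 0, \ldots, k-1$ generate

– $x_1^{(j+1)} \sim f(x_1 \mid x_2^{(j)}, x_3^{(j)}, \ldots, x_p^{(j)})$

– $x_2^{(j+1)} \sim f(x_2 \mid x_1^{(j+1)}, x_3^{(j)}, \ldots, x_p^{(j)})$

...

– $x_p^{(j+1)} \sim f(x_p \mid x_1^{(j+1)}, x_2^{(j+1)}, \ldots, x_{p-1}^{(j+1)})$.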
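As a small illustration of these steps, the sketch below runs a systematic-scan Gibbs sampler for a bivariate normal distribution with zero means, unit variances and correlation `rho`, a textbook case in which both full conditionals are normal. The target and the function name are assumptions made for the example. Each component is drawn conditional on the most recently drawn value of the other.

```python
import random


def gibbs_bivariate_normal(rho, n_iter, rng, x2_init=0.0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Full conditionals: x1 | x2 ~ N(rho * x2, 1 - rho^2), and symmetrically
    for x2 | x1. Only x2 needs an initial value because x1 is drawn first.
    """
    sd = (1.0 - rho ** 2) ** 0.5
    x2 = x2_init
    draws = []
    for _ in range(n_iter):
        x1 = rng.gauss(rho * x2, sd)   # update x1 given the current x2
        x2 = rng.gauss(rho * x1, sd)   # update x2 given the new x1
        draws.append((x1, x2))
    return draws


draws = gibbs_bivariate_normal(0.8, 20000, random.Random(7))
```

No rejection step appears because every draw from a full conditional is accepted.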

Reversible Jump MCMC

Reversible jump MCMC (RJMCMC), developed by Green (1995), provides a general

framework for MCMC simulation in which the dimension of the parameter space can

vary between iterations of the Markov chain. Thus, the dimensionality of the space

is considered to be a stochastic variable as well as the parameters of interest in each

dimension. The reversible jump sampler can be seen as an extension of the standard

Metropolis-Hastings algorithm onto a more general space that jumps between models

with parameter spaces of different dimensions.

Let $\theta_m$ denote the parameter vector corresponding to model $m$, where $\theta_m$ has dimension $d_m$. If the current state of the Markov chain is $(m, \theta_m)$, then a general version of the algorithm is the following:

(a) Propose a new model m′ with probability j(m,m′).

(b) Generate u from a specified proposal density q(u | θm,m,m′).


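(c) Propose a new vector of parameters $\theta'_{m'}$ by setting $(\theta'_{m'}, u') = g_{m,m'}(\theta_m, u)$, where $g_{m,m'}$ is a specified invertible function.

(d) Accept the proposed move to model $m'$ with probability

$$\alpha = \min\left(1, \frac{f(x \mid m', \theta'_{m'})\, f(\theta'_{m'} \mid m')\, f(m')\, j(m', m)\, q(u' \mid \theta'_{m'}, m', m)}{f(x \mid m, \theta_m)\, f(\theta_m \mid m)\, f(m)\, j(m, m')\, q(u \mid \theta_m, m, m')} \left| \frac{\partial g_{m,m'}(\theta_m, u)}{\partial(\theta_m, u)} \right| \right). \qquad (2.11)$$

(e) Return to step (a) until the required number of iterations is reached.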

The proportion of iterations that the chain spends in model $m$ estimates the posterior probability of that model, and the samples drawn while the chain is in model $m$ are draws from the posterior distribution of the parameter set $\theta_m$.

Important elements of the algorithm are the proposal distributions $q(u \mid \theta_m, m, m')$ and the matching function $g_{m,m'}$. The vectors $u$ and $u'$, of dimensions $d_u$ and $d_{u'}$, are used to make the dimensions of the parameter spaces of $m$ and $m'$ equal. The usual practice is to set $d_u$ or $d_{u'}$ equal to zero depending on which model has fewer parameters. When $d_m < d_{m'}$ we set $d_{u'} = 0$, generate $u$ as described above, and calculate $\theta'_{m'}$ using the matching function $g_{m,m'}$. Otherwise, when $d_{m'} < d_m$, we set $d_u = 0$ and directly calculate $\theta'_{m'}$ and $u'$ using the matching function $g_{m,m'}$, since we do not need to generate any additional parameters. The corresponding proposal distributions are usually constructed by single MCMC runs within each model, while the matching function $g_{m,m'}$ is constructed by considering the structural properties of each model and their possible association.

2.3.2 Bayesian Change Point Estimation

In the Bayesian context, change point estimation has been investigated and recently revisited in Bayesian hierarchical models. Carlin et al. (1992) applied MCMC and Gibbs sampling methods to the conditional distributions of parameters of interest in coal mining disaster data to obtain posterior distributions of a step change point. The idea of using MCMC was then extended by Green (1995) to the situation where the number of change points is unknown. He developed RJMCMC by using a Metropolis-Hastings step which switches between models with different numbers of change points. The application of

MCMC methods in change point detection has also been studied and compared by other

researchers (Chib, 1998; Lavielle and Lebarbier, 2001). MCMC methods can provide a

comprehensive statistical inference on the estimated change point(s) and change point

model selection which will be considered in this research. There also exist some other

alternatives for change point detection methods in the Bayesian context. Barry and

Hartigan (1992) introduced Product Partition Models (PPMs) for multiple change point

detection where the number of changes is unknown and taken as a random variable.

Loschi and Cruz (2002a,b) applied PPMs and studied the effect of prior distributions

on PPMs. This technique was developed to provide posterior distributions of change points using Gibbs sampling (Loschi et al., 2003; Loschi and Cruz, 2005; Loschi et al., 2005, 2008). Liang et al. (2007) proposed the Stochastic Approximation Monte Carlo (SAMC) algorithm as an alternative to RJMCMC. Its convergence was then improved using smoothing methods (Liang, 2009). SAMC was applied to multiple change point detection and was shown to improve significantly on RJMCMC for complex Bayesian model selection problems in change-point estimation (Cheon and Kim, 2010).

Yet, no Bayesian change point estimation models and computation techniques have

been considered in control charting.
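For a single step change in a Poisson rate, the posterior distribution of the change point can be computed exactly when conjugate priors are used, which gives a compact illustration of what a Bayesian change point estimate provides. The sketch below assumes independent Gamma(a, b) priors (shape and rate) on the rates before and after the change and a uniform prior on the change point. These choices and the function name are illustrative assumptions, not the models of the cited authors.

```python
import math


def change_point_posterior(counts, a=1.0, b=1.0):
    """Posterior P(tau | data) for one step change in a Poisson rate.

    tau is the number of observations before the change (1 <= tau <= n - 1).
    The two rates have independent Gamma(a, b) priors and are integrated
    out analytically; tau has a uniform prior. All priors are assumptions.
    """
    n = len(counts)
    prefix = [0]
    for x in counts:
        prefix.append(prefix[-1] + x)
    log_w = []
    for tau in range(1, n):
        s1, s2 = prefix[tau], prefix[n] - prefix[tau]
        # Log marginal likelihood of each segment, dropping terms free of tau.
        log_w.append(math.lgamma(a + s1) - (a + s1) * math.log(b + tau)
                     + math.lgamma(a + s2) - (a + s2) * math.log(b + n - tau))
    top = max(log_w)
    w = [math.exp(v - top) for v in log_w]   # stabilise before normalising
    total = sum(w)
    return {tau: wi / total for tau, wi in zip(range(1, n), w)}


counts = [2, 1, 3, 2, 1, 2, 3, 1, 2, 2, 9, 8, 10, 7, 9, 11, 8, 9, 10, 8]
posterior = change_point_posterior(counts)
mode = max(posterior, key=posterior.get)
```

Unlike a point estimate, the result is a full probability distribution over candidate change points, from which credible sets can be read directly.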

2.4 Bayesian Quality Control

In a Bayesian framework statistical inference about a quantity of interest is described

as the modification of the uncertainty about its value in the light of evidence, and

Bayes’ theorem precisely specifies how this modification should be made. In the SPC

context, this implies revision of belief about a process after observation of data, and

the possibility of dynamic updating of control charts as new data are gathered. The

methodology and application of Bayesian approaches in SPC can be considered in terms of the SPC aims, which are discussed in the following subsections.


2.4.1 Optimal Control Policy

The major use of Bayesian methods in SPC is in estimating the control chart parameters more efficiently, taking into account the cost of sampling and chart performance. This issue is now being investigated for the economic design of control charts and more recently for adaptive control charts. The traditional approach to control

chart design considers the classical control chart framework with the objective of deter-

mining the values of the chart parameters, namely the sample size, sampling interval,

and the control limits to satisfy economic or statistical requirements. Under a Bayesian

approach, focus can be shifted to determining the optimal control policy based on the

posterior probability that the process is out of control, minimizing the total expected

cost over a finite horizon, or the long-run expected average cost.

One of the earliest contributions to this area is the study by Girshick et al. (1955), who formulated the inspection decision in a Bayesian context; other research extended their work in a partially observable Markov decision process (POMDP) framework (Eckles, 1968; White, 1977). Taylor (1965, 1967) showed that non-Bayesian control

techniques are not optimal and suggested that in the general case, the action decision,

sample size, and the sampling interval should be determined based on the probability

that the process is out of control, which is updated whenever a new sample is taken.
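The updating of the out-of-control probability can be sketched for a simple two-state model. The example assumes normally distributed observations with known in-control mean `mu0`, out-of-control mean `mu1` and common standard deviation, a fixed probability `shift_prob` that the process shifts between samples, and an out-of-control state that persists until intervention. These are illustrative assumptions and not the specific models of the cited authors.

```python
import math


def update_out_of_control_prob(p_prev, x, mu0, mu1, sigma, shift_prob):
    """One Bayesian update of the probability that the process is out of control.

    p_prev: probability of being out of control after the previous sample.
    x: the new observation. All model parameters are assumed known.
    """
    # Before seeing x, the process may have shifted since the last sample.
    prior = p_prev + (1.0 - p_prev) * shift_prob

    def normal_pdf(v, mu):
        z = (v - mu) / sigma
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

    # Bayes' theorem over the two states.
    num = prior * normal_pdf(x, mu1)
    den = num + (1.0 - prior) * normal_pdf(x, mu0)
    return num / den


# An observation near the shifted mean raises the probability sharply.
p_after = update_out_of_control_prob(0.1, 2.0, 0.0, 2.0, 1.0, 0.05)
```

A control policy would then compare this probability with a threshold chosen to balance the costs of sampling, false alarms and undetected shifts.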

In more recent works, Tagaras (1994, 1996) has proposed a dynamic programming approach for the modelling and cost minimization of statistical process control activities, in particular $\bar{X}$ charts, in contrast with the traditional economic design by Duncan (1956).

The decision parameters, including sampling interval, sample size and control limit lo-

cation of the control chart, are allowed to change dynamically as new information about

the process becomes available. It has been shown with numerical examples that the

dynamic programming solution can be much more economical than the conventional

static solution with fixed control chart parameters.

Calabrese (1995) developed a Bayesian process control for attributes and showed that

this model is optimal compared to non-Bayesian techniques. Porteus and Angelus

(1997) discussed the advantages of combining a dynamic programming approach with a Bayesian approach. Tagaras and Nikolaidis (2002) have evaluated the relative effectiveness of partially and fully one-sided Bayesian $\bar{X}$ charts and derived properties about the structure of their optimal policies. Nenes and Tagaras (2007) extended that study to a two-sided form and assessed its economic performance. Nikolaidis et al. (2007) applied different adaptive Bayesian control charts (all combinations of fixed and variable sampling intervals and sample sizes) for a tile manufacturer and reported their

economic performances. Kooli and Limam (2009) have investigated optimal solutions

of the static np-chart, the basic Bayesian np-chart, and the Bayesian scheme with adap-

tive sample size and reported that the last of these is preferred in terms of cost. They

have indicated that Bayesian control charts are affected by the length of the production

run.

Most recently Makis (2008, 2009) has extended a Bayesian approach and optimal control

policy into a multivariate context and showed that the Bayesian multivariate control

chart is highly cost effective in comparison with MEWMA and Chi-square charts. See

Yin and Makis (2011), Zhang and Su and Cheng et al. (2011) for more developments

in this area. There appears to be no literature on the application of Bayesian methods

in the design of control charts in healthcare surveillance that considers both economic and statistical issues simultaneously. Hence this gap needs to be investigated for different specifications of the health sector in terms of cost and risk.

2.4.2 Inferences and Estimating

Bayesian methods have also been used in SPC aiming to overcome some existing prob-

lems in construction of control charts where the process or data have uncommon speci-

fications with respect to the primary assumptions underlying control charts. This part

may have some overlaps with the previous section, but the different characteristics

encourage separate discussion here.

The Process Capability Ratio (PCR) and its variations are useful for assessing the

capability of manufacturing processes where the quality characteristic of interest is

a variable (not attribute). Shiau et al. (1999a) and Shiau et al. (1999b) indicated

that the usual practice of judging process capability by evaluating point estimates


of some process capability indexes has a flaw in that there is no assessment of the

error distributions of these estimates. They analyzed the properties of the PCR using

Bayesian methods and obtained its distribution. This area has been extended recently by other research (Wu, 2008).

A search of the healthcare literature shows that there is no research on the application

of such indexes in monitoring health processes. Thus there is a very good opportunity

to contribute in this area and develop modified PCR indexes using both Frequentist

and Bayesian approaches.

Sturm et al. (1991) discussed monitoring processes in which the parameters can vary

over time. Relaxation of the stability of the in-control process parameters has been pursued in several studies (Feltz and Sturm, 1994; Feltz and Shiau, 2001; Jain, 1993; Jain et al., 1993; Shiau et al., 2005). This is contrary to the primary axiom of stability of parameters in an in-control process in traditional SPC. These studies developed univariate and multivariate control charts for continuous and

discrete measurements in a Bayesian framework and applied empirical Bayes to update

the process parameters. They showed that their methods, in particular the recursive

estimation equations, are very efficient for automated, high-speed, high volume and

data intensive manufacturing lines; this has been termed monitoring in real time. In a

recent study, Bayarri and García-Donato (2005) extended this approach to overcome the

extra variation in the Poisson distribution of a u-chart by taking the Poisson parameter

as a variable.

In healthcare surveillance, effort has focused on achieving a control chart based on

constant parameters, but most often control charts are required for data which come

from unknown and different distributions. It seems that applying Bayesian methods and allowing control chart parameters to vary could open a new perspective on tackling risk. Moreover, there is an opportunity to deal simultaneously with over- and under-dispersion. Ryan (2011) has argued that most processes involving proportions and

counts exhibit extra-binomial or extra-Poisson variation (over-dispersion). This idea

might be applicable for monitoring multiple units and the application of Funnel plots.


Hamada (2002) stated that the probability content of standard control limits for at-

tributes can vary because distribution parameters that appear in the control limits are

estimated based on previous data. He applied Bayesian tolerance interval control lim-

its which control the probability content at a specified level with a given confidence.

He showed that Bayesian tolerance interval control limits can be used for processes at

start-up where there is not enough data to estimate control chart parameters accurately.

Tsiamyrtzis (2000) and Tsiamyrtzis and Hawkins (2005) supported the performance of

conventional control charts such as CUSUM and EWMA in short-run production, but

they pinpointed the necessity of a procedure which can be applied to detect changes

from the very first observation and addressed the strengths of Bayesian framework in

this regard. They have developed a procedure to detect shifts in the mean of a process by use of a mixture of normal distributions. They recognize their method as a generalization of the Kalman filter. The usefulness of this approach for monitoring the vital signs of a patient in intensive care has been highlighted (Tsiamyrtzis and Hawkins, 2005). They have also extended the Bayesian approach to EWMA for the start-up phase of production and applied it to environmental monitoring (Tsiamyrtzis and Hawkins, 2008).

The interest in monitoring from the beginning of a process in healthcare surveillance

motivates the application of Bayesian methods in order to employ prior knowledge

and uncertainty to construct and run the control chart without wasting time at the

beginning of phase I. This approach would be useful for monitoring short runs and

start up processes such as patient based processes (treatment), new procedures and

new devices.

Triantafyllopoulos (2006) has developed a new multivariate control chart based on

Bayes factors. This control chart specifically aims to monitor multivariate autocorre-

lated and serially correlated processes. His general idea is to form a target distribution,

to construct a predictive density with good forecast ability and then to apply a univari-

ate control chart such as EWMA for the logarithm of the Bayes factor of the predictive

error density against the target error density.


2.4.3 Bayesian Control Chart

Recently Marcellus (2008a,b) has developed a Bayesian chart as an alternative for a

conventional control chart. He claims that his work differs substantially from previous

contributions in Bayesian SPC which focused on optimization and estimation. In this

Bayesian chart the probabilities of two out-of-control levels are plotted and used to make

decisions about the state of a process. He indicates that this requires more knowledge

about process structure than most popular charts, but acquiring this knowledge can

yield real benefits.

2.4.4 Other Applications

There is further scope in Bayesian approaches and other research which uses Bayesian

techniques that have not been covered in the previous sections. One of the most

important of these is process adjustment. The main aim of using SPC methods is

firstly to identify a shift in process parameters and secondly to adjust an out of control

process considering its root causes. The area of research called Engineering Process Control (EPC), which focuses on the latter role of SPC, has recently been considered from a Bayesian perspective. Recent research in this area can be found in Colosimo and Del Castillo (2007). This area is not pursued in this thesis and remains of interest for further research. Note that the applicability of such an approach and intervention in the clinical setting of quality control needs comprehensive consideration of the special characteristics of healthcare surveillance discussed earlier.

2.5 Conclusion

This literature review comprehensively addressed the significant characteristics and necessary modifications of statistical quality control methods for monitoring clinical outcomes. This section targeted a broader gap analysis in the development of control charting methods in healthcare and discussed the possible contribution of advanced techniques from an industrial context. The Bayesian approach and well-developed computational methods were introduced and their potential in probabilistic inference and decision making for monitoring purposes was highlighted. Among these are the estimation of control chart parameters in the presence of limited historical data, variable parameters and the associated cost of failures, as well as probability quantification and forecasting capabilities. More broadly, the advances achievable in monitoring clinical outcomes through the application of Bayesian techniques were addressed. In this regard, the reviewed body of knowledge on Bayesian techniques developed for industrial monitoring procedures can be used as a benchmark. Although a large number of potential research and development directions were proposed within the review of the literature, this research was limited to satisfying the objectives discussed in the Introduction, which mainly focus on data quality improvement and estimation of change points in statistical quality control practices in a healthcare context.

Bibliography

Ahmadzadeh, F. (2009). Change point detection with multivariate control charts by ar-

tificial neural network. The International Journal of Advanced Manufacturing Tech-

nology.

Alaeddini, A., Ghazanfari, M., and Nayeri, M. (2009). A hybrid fuzzy-statistical clus-

tering approach for estimating the time of changes in fixed and variable sampling

control charts. Information Sciences, 179(11):1769–1784.

Amiri, A. and Allahyari, S. (2011). Change point estimation methods for control

chart postsignal diagnostics: a literature review. Quality and Reliability Engineering

International, doi:10.1002/qre.1266.

Arts, D. G. T., Keizer, N. F. D., and Scheffer, G. J. (2002). Defining and improving data

quality in medical registries: A literature review, case study, and generic framework.

Journal of the American Medical Informatics Association, 9(6):600–611.

Aylin, P., Best, N., Bottle, A., and Marshall, C. (2003). Following Shipman: a pilot

system for monitoring mortality rates in primary care. The Lancet, 362(9382):485–

491.

Barry, D. and Hartigan, J. (1992). Product partition models for change point problems.

The Annals of Statistics, pages 260–279.

Bayarri, M. and García-Donato, G. (2005). A Bayesian sequential look at u-control

charts. Technometrics, 47(2):142–151.

Benneyan, J. (2006). Discussion-the use of control charts in health-care and public-

health surveillance. Journal of Quality Technology, 38(2):113–123.


Benneyan, J. C. (1998a). Statistical quality control methods in infection control and

hospital epidemiology, part i: introduction and basic theory. Infection Control and

Hospital Epidemiology, 19(3):194–214.

Benneyan, J. C. (1998b). Statistical quality control methods in infection control and

hospital epidemiology, part ii: chart use, statistical properties, and research issues.

Infection Control and Hospital Epidemiology, 19(4):265–283.

Benneyan, J. C. (2001). Performance of number-between g-type statistical control

charts for monitoring adverse events. Health Care Management Science, 4(4):319–

336.

Benneyan, J. C., Lloyd, R. C., and Plsek, P. E. (2003). Statistical process control as

a tool for research and healthcare improvement. Quality and Safety in Health Care,

12(6):458.

Beretta, L., Aldrovandi, V., Grandi, E., Citerio, G., and Stocchetti, N. (2007). Im-

proving the quality of data entry in a low-budget head injury database. Acta Neu-

rochirurgica, 149(9):903–909.

Biswas, P. and Kalbfleisch, J. D. (2008). A risk-adjusted CUSUM in continuous time

based on the Cox model. Statistics in Medicine, 27(17):3382–3406.

Black, N. (1999). High-quality clinical databases: breaking down barriers. The Lancet,

353:1205–1206.

Box, G. and Cox, D. (1964). An analysis of transformations. Journal of the Royal

Statistical Society. Series B (Methodological), 26(2):211–252.

Brunelle, R. and Kleyle, R. (2002). A database quality review process with interim

checks. Drug Information Journal, 36(2):357–367.

Calabrese, J. (1995). Bayesian process control for attributes. Management Science,

41(4):637–645.

Calvin, T. (1983). Quality control techniques for zero defects. IEEE Transactions on

Components, Hybrids, and Manufacturing Technology, 6(3):323–328.

Carlin, B., Gelfand, A., and Smith, A. (1992). Hierarchical Bayesian analysis of change-

point problems. Applied statistics, pages 389–405.

Celano, G., Castagliola, P., Trovato, E., and Fichera, S. (2011). Shewhart and EWMA

control charts for short production runs. Quality and Reliability Engineering Inter-

national, 27(3):313–326.

Chen, R. (1978). A surveillance system for congenital malformations. Journal of the

American Statistical Association, pages 323–327.

Cheng, S., Mao, H., Goswami, V., Laxmi, P., Meyners, M., Srivastava, P., Jain, N.,

Sivakumar, B., Jain, M., Gupta, R., et al. (2011). The economic design of multivariate

MSE control chart. Economic Design, 8(2):75–85.

Cheon, S. and Kim, J. (2010). Multiple change-point detection of multivariate mean

vectors with the Bayesian approach. Computational Statistics & Data Analysis,

54(2):406–415.

Chib, S. (1998). Estimation and comparison of multiple change-point models. Journal

of Econometrics, 86(2):221–241.

Christensen, A., Melgaard, H., Iwersen, J., and Thyregod, P. (2003). Environmen-

tal monitoring based on a hierarchical Poisson-Gamma model. Journal of Quality

Technology, 35(3):275–285.

Colosimo, B. and Del Castillo, E. (2007). Bayesian Process Monitoring, Control and

Optimization. Chapman and Hall/CRC.

Cook, D. (2004). The Development of Risk Adjusted Control Charts and Machine

learning Models to monitor the Mortality of Intensive Care Unit Patients. PhD

thesis, University of Queensland, Australia.

Cook, D., Steiner, S., Cook, R., Farewell, V., and Morton, A. (2003). Monitoring the

evolutionary process of quality: risk-adjusted charting to track outcomes in intensive

care. Critical Care Medicine, 31(6):1676.

Cook, D. A., Duke, G., Hart, G. K., Pilcher, D., and Mullany, D. (2008). Review of

the application of risk-adjusted charts to analyse mortality outcomes in critical care.

Critical Care Resuscitation, 10(3):239–251.

Craigmile, P., Calder, C., Li, H., Paul, R., and Cressie, N. (2009). Hierarchical model

building, fitting, and checking: a behind-the-scenes look at a Bayesian analysis of

arsenic exposure pathways. Bayesian Analysis, 4(1):1–36.

Crosier, R. B. (1988). Multivariate generalizations of cumulative sum quality control

schemes. Technometrics, 30(3):291–303.

Crowder, S. V. (1989). Design of exponentially weighted moving average schemes.

Journal of Quality Technology, 21(3):155–162.

Del Castillo, E. and Montgomery, D. (1994). Short-run statistical process control: Q-

chart enhancements and alternative methods. Quality and Reliability Engineering

International, 10(2):87–97.

Dodge, H. (1943). Skip-lot sampling plan. Statistics, 14(3):264–279.

Dodge, H. (1947). Sampling plans for continuous production. Industrial Quality Con-

trol, 14(3):5–9.


Dodge, H. (1955). Chain sampling inspection plan. Industrial Quality Control,

11(4):10–13.

Dodge, H. and Romig, H. (1959). Sampling Inspection Tables: Single and Double

Sampling. Wiley.

Dodge, H. and Stephens, K. (1966). Some new chain sampling inspection plans. In-

dustrial Quality Control, 23(2):61–67.

Dodge, H. and Torrey, M. (1951). Additional continuous sampling inspection plans.

Industrial Quality Control, 7(5):7–12.

Duncan, A. (1956). The economic design of X charts used to maintain current control

of a process. Journal of the American Statistical Association, pages 228–242.

Eckles, J. (1968). Optimum maintenance with incomplete information. Operations

Research, 16(5):1058–1067.

Feltz, C. and Shiau, J. (2001). Statistical process monitoring using an empirical Bayes

multivariate process control chart. Quality and Reliability Engineering International,

17(2):119–124.

Feltz, C. and Sturm, G. (1994). Real-time empirical Bayes manufacturing process

monitoring for censored data. Quality and Reliability Engineering International,

10(6):467–476.

Fricker Jr, R. and Chang, J. (2008). A spatio-temporal methodology for real-time

biosurveillance. Quality Engineering, 20(4):465–477.

Gardiner, J. (1987). Detecting Small Shifts in Quality Levels in a Near Zero Defect

Environment for Integrated Circuits. PhD thesis, University of Washington, Seattle,

Washington.

Garjani, M., Noorossana, R., and Saghaei, A. (2010). A neural network-based control

scheme for monitoring start-up processes and short runs. The International Journal

of Advanced Manufacturing Technology, 51(9):1023–1032.

Gelman, A., Carlin, J., Stern, H., and Rubin, D. (2004). Bayesian Data Analysis.

Chapman & Hall/CRC.

Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions and the

Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine

Intelligence, 6(2):721–741.

Ghazanfari, M., Alaeddini, A., Niaki, S., and Aryanezhad, M. (2008). A clustering

approach to identify the time of a step change in Shewhart control charts. Quality

and Reliability Engineering International, 24(7):765–778.

Girshick, M., Rubin, H., and Sitgreaves, R. (1955). Estimates of bounded relative error

in particle counting. The Annals of Mathematical Statistics, 26(2):276–285.

Goh, T. (1987). A control chart for very high yield processes. Quality Assurance,

13(1):18–22.

Green, P. (1995). Reversible jump Markov chain Monte Carlo computation and

Bayesian model determination. Biometrika, 82(4):711–732.

Grigg, O. and Farewell, V. (2004a). A risk-adjusted sets method for monitoring adverse

medical outcomes. Statistics in Medicine, 23(10):1593–1602.

Grigg, O. V. and Farewell, V. T. (2004b). An overview of risk-adjusted charts. Journal

of the Royal Statistical Society: Series A (Statistics in Society), 167(3):523–539.

Grigg, O. V. and Spiegelhalter, D. J. (2006). Discussion. Journal of Quality Technology,

38(2):124–136.

Grigg, O. V. and Spiegelhalter, D. J. (2007). A simple risk-adjusted exponen-

tially weighted moving average. Journal of the American Statistical Association,

102(477):140–152.

Grigg, O. V., Spiegelhalter, D. J., and Farewell, V. T. (2003). Use of risk-adjusted

CUSUM and RSPRT charts for monitoring in medical contexts. Statistical Methods

in Medical Research, 12(2):147–170.

Hamada, M. (2002). Bayesian tolerance interval control limits for attributes. Quality


Hasan, S. and Padman, R. (2006). Analyzing the effect of data quality on the accuracy

of clinical decision support systems: a computer simulation approach. In AMIA An-

nual Symposium Proceedings, volume 2006, page 324. American Medical Informatics

Association.

Hastings, W. (1970). Monte Carlo sampling methods using Markov chains and their

applications. Biometrika, 57(1):97–109.

Hattemer-Apostel, R., Fischer, S., and Nowak, H. (2008). Getting better clinical trial

data: An inverted viewpoint. Drug Information Journal, 42(2):123–130.

Hawkins, D. and Olwell, D. (1998). Cumulative Sum Charts and Charting for Quality

Improvement. Springer Verlag.

Hinkley, D. (1971). Inference about the change-point from cumulative sum tests.

Biometrika, 58(3):509–523.

Hotelling, H. (1947). Multivariate quality control-illustrated by the air testing of sample

bombsights. Techniques of Statistical Analysis, pages 111–184.


Ishikawa, K. (1990). Introduction to Quality Control. Productivity Press.

Jain, K. (1993). A Bayesian Approach to Multivariate Quality Control. PhD thesis,

University of Maryland at College Park.

Jain, K., Alt, F., and Grimshaw, S. (1993). Multivariate quality control-a Bayesian

approach. In Annual Quality Congress Transactions-American Society for Quality

Control, volume 47, pages 645–645. American Society for Quality control.

Jones, H., Ohlssen, D., and Spiegelhalter, D. (2008). Use of the false discovery rate

when comparing multiple health care providers. Journal of Clinical Epidemiology,

61(3):232–240.

Jones, L. and Woodall, W. (1999). Exact properties of demerit control charts. Journal

of Quality Technology, 31(2):207–216.

Kaminsky, F. C., Benneyan, J. C., Davis, R. D., and Burke, R. J. (1992). Statistical

control charts based on a geometric distribution. Journal of Quality Technology,

24(2):63–69.

Kittlitz, R. G. J. (1999). Transforming the exponential for SPC applications. Journal


Klein, J. P. and Moeschberger, M. L. (1997). Survival Analysis: Techniques for Cen-

sored and Truncated Data. Springer: New York.

Knoth, S. (2005). Fast initial response features for EWMA control charts. Statistical

Papers, 46(1):47–64.

Kooli, I. and Limam, M. (2009). Bayesian np control charts with adaptive sample

size for finite production runs. Quality and Reliability Engineering International,

25(4):439–448.

Lavielle, M. and Lebarbier, E. (2001). An application of MCMC methods for the

multiple change-points problem. Signal Processing, 81(1):39–53.

Liang, F. (2009). Improving SAMC using smoothing methods: theory and applications

to Bayesian model selection problems. The Annals of Statistics, 37(5B):2626–2654.

Liang, F., Liu, C., and Carroll, R. (2007). Stochastic approximation in Monte Carlo

computation. Journal of the American Statistical Association, 102(477):305–320.

Loschi, R. and Cruz, F. (2002a). An analysis of the influence of some prior specifications

in the identification of change points via product partition model. Computational

Statistics & Data Analysis, 39(4):477–501.

Loschi, R. and Cruz, F. (2002b). Applying the product partition model to the identi-

fication of multiple change points. Advances in Complex Systems, 5(4):371–388.

Loschi, R. and Cruz, F. (2005). Extension to the product partition model: computing

the probability of a change. Computational Statistics & Data Analysis, 48(2):255–

268.

Loschi, R., Cruz, F., and Arellano-Valle, R. (2005). Multiple change point analysis for

the regular exponential family using the product partition model. Journal of Data

Science, 3(3):305–330.

Loschi, R., Cruz, F., Iglesias, P., and Arellano-Valle, R. (2003). A Gibbs sampling

scheme to the product partition model: an application to change-point problems.

Computers & Operations Research, 30(3):463–482.

Loschi, R., Cruz, F., Takahashi, R., Iglesias, P., Arellano-Valle, R., and MacGre-

gor Smith, J. (2008). A note on Bayesian identification of change points in data

sequences. Computers & Operations Research, 35(1):156–170.

Lovegrove, J., Valencia, O., Treasure, T., Sherlaw-Johnson, C., and Gallivan, S. (1997).

Monitoring the results of cardiac surgery by variable life-adjusted display. The Lancet,

350(9085):1128–1130.

Lowry, C. A., Woodall, W. H., Champ, C. W., and Rigdon, S. E. (1992). A multivariate

exponentially weighted moving average control chart. Technometrics, 34(1):46–53.

Lucas, J. and Crosier, R. (1982). Fast initial response for CUSUM quality-control

schemes: give your CUSUM a head start. Technometrics, 24(3):199–205.

Makis, V. (2008). Multivariate Bayesian control chart. Operations Research, 56(2):487–

496.

Makis, V. (2009). Multivariate Bayesian process control for a finite production run.

European Journal of Operational Research, 194(3):795–806.

Marcellus, R. (2008a). Bayesian monitoring to detect a shift in process mean. Quality


Marcellus, R. (2008b). Bayesian statistical process control. Quality Engineering,

20(1):113–127.

Marshall, B., Spitzner, D., and Woodall, W. (2007). Use of the local Knox statistic

for the prospective monitoring of disease occurrences in space and time. Statistics in

Medicine, 26(7):1579–1593.

Marshall, C., Best, N., Bottle, A., and Aylin, P. (2004). Statistical issues in the

prospective monitoring of health outcomes across multiple units. Journal of the

Royal Statistical Society. Series A (Statistics in Society), 167(3):541–559.

Mayer, E., Bottle, A., Rao, C., Darzi, A., and Athanasiou, T. (2009). Funnel plots and

their emerging application in surgery. Annals of Surgery, 249(3):376.


Mohammed, M. and Deeks, J. (2008). In the context of performance monitoring, the

Caterpillar plot should be mothballed in favor of the Funnel plot. The Annals of

Thoracic Surgery, 86(1):348.

Mohammed, M., Worthington, P., and Woodall, W. (2008). Plotting basic control

charts: tutorial notes for healthcare practitioners. Quality and Safety in Health

Care, 17(2):137.

Montgomery, D. and Woodall, W. (2008). An overview of six sigma. International

Statistical Review, 76(3):329–346.

Montgomery, D. C. (2008). Introduction to Statistical Quality Control. Wiley.

Morton, A., Mengersen, K., Waterhouse, M., and Steiner, S. (2010). Analysis of aggre-

gated hospital infection data for accountability. Journal of Hospital Infection.

Morton, A., Whitby, M., McLaws, M., Dobson, A., McElwain, S., Looke, D., Stackel-

roth, J., and Sartor, A. (2001). The application of statistical process control charts

to the detection and monitoring of hospital-acquired infections. Journal of Quality

in Clinical Practice, 21(4):112–117.

Morton, N. and Lindsten, J. (1976). Surveillance of Down’s syndrome as a paradigm of

population monitoring. Human Heredity, 26(5):360–371.

Nelson, L. (1994). A control chart for parts-per-million nonconforming items. Journal


Nenes, G. and Tagaras, G. (2007). The economically designed two-sided Bayesian

control chart. European Journal of Operational Research, 183(1):263–277.

Nikolaidis, Y., Rigas, G., and Tagaras, G. (2007). Using economically designed She-

whart and adaptive X charts for monitoring the quality of tiles. Quality and Relia-

bility Engineering International, 23(2):233–245.

Nishina, K. (1992). A comparison of control charts from the viewpoint of change-point

estimation. Quality and Reliability Engineering International, 8(6):537–541.

Noorossana, R., Saghaei, A., Paynabar, K., and Abdi, S. (2009). Identifying the period

of a step change in high-yield processes. Quality and Reliability Engineering International, 25(7):875–883.

Ohta, H., Kusukawa, E., and Rahim, A. (2001). A CCC-r chart for high-yield processes.

Quality and Reliability Engineering International, 17(6):439–446.

Page, E. S. (1954). Continuous inspection schemes. Biometrika, 41(1/2):100–115.

Page, E. S. (1961). Cumulative sum charts. Technometrics, 3(1):1–9.

Perry, M. and Pignatiello, J. (2005). Estimating the change point of the process fraction

nonconforming in SPC applications. International Journal of Reliability, Quality and

Safety Engineering, 12(2):95–110.

Perry, M., Pignatiello, J., and Simpson, J. (2006). Estimating the change point of

a Poisson rate parameter with a linear trend disturbance. Quality and Reliability

Engineering International, 22(4):371–384.

Perry, M., Pignatiello, J., and Simpson, J. (2007a). Change point estimation for mono-

tonically changing Poisson rates in SPC. International Journal of Production Re-

search, 45(8):1791–1813.

Perry, M., Pignatiello, J., and Simpson, J. (2007b). Estimating the change point of

the process fraction non-conforming with a monotonic change disturbance in SPC.


Perry, M. B. (2004). Robust Change Detection and Change Point Estimation for Pois-

son Count Processes. PhD thesis, Florida State University, USA.

Perry, R. L. (1973). Skip-lot sampling plans. Journal of Quality Technology, 5(3):123–

130.

Pignatiello, J. J. and Runger, G. C. (1990). Comparisons of multivariate CUSUM

charts. Journal of Quality Technology, 22(3):173–186.

Poloniecki, J., Valencia, O., and Littlejohns, P. (1998). Cumulative risk adjusted mor-

tality chart for detecting changes in death rate: observational study of heart surgery.

British Medical Journal, 316(7146):1697–1700.

Porteus, E. and Angelus, A. (1997). Opportunities for improved statistical process

control. Management Science, 43(9):1214–1228.

Roberts, S. W. (1959). Control chart tests based on geometric moving averages. Tech-

nometrics, 1(3):239–250.

Rolka, H., Burkom, H., Cooper, G., Kulldorff, M., Madigan, D., and Wong, W. (2007).

Issues in applied statistics for public health bioterrorism surveillance using multiple

data streams: research needs. Statistics in Medicine, 26(8):1834–1856.

Rostami, R., Nahm, M., and Pieper, C. (2009). What can we learn from a decade of

database audits? The Duke Clinical Research Institute experience, 1997–2006. Clinical

Trials, 6(2):141–150.

Ryan, T. P. (2011). Statistical Methods for Quality Improvement. Wiley.

Samuel, T. and Pignatiello, J. (2001). Identifying the time of a step change in the

process fraction nonconforming. Quality Engineering, 13(3):357–365.

Samuel, T., Pignatiello, J., and Calvin, J. (1998a). Identifying the time of a step change

in a normal process variance. Quality Engineering, 10(3):529–538.

Samuel, T., Pignatiello, J., and Calvin, J. (1998b). Identifying the time of a step change

with control charts. Quality Engineering, 10(3):521–527.


Samuel, T. and Pignatiello, J. (1998). Identifying the time of a change in a Poisson

rate parameter. Quality Engineering, 10(4):673–681.

Schilling, E. and Neubauer, D. (2009). Acceptance Sampling in Quality Control. Chap-

man & Hall/CRC.

Sego, L. H. (2006). Applications of Control Charts in Medicine and Epidemiology. PhD

thesis, United States-Virginia, Virginia Polytechnic Institute and State University.

Sego, L. H., Reynolds, J. D. R., and Woodall, W. H. (2009). Risk adjusted monitoring

of survival times. Statistics in Medicine, 28(9):1386–1401.

Shen, L. Z. and Zhou, J. (2006). A practical and efficient approach to database quality

audit in clinical trials. Drug Information Journal, 40(4):385–393.

Shewhart, W. (1926). Quality control charts. Bell System Technical Journal, 5:593–602.

Shewhart, W. (1927). Quality control. Bell System Technical Journal, 6:722–735.

Shiau, J., Chen, C., and Feltz, C. (2005). An empirical Bayes process monitoring

technique for polytomous data. Quality and Reliability Engineering International,

21(1):13–28.

Shiau, J., Chiang, C., and Hung, H. (1999a). A Bayesian procedure for process capa-

bility assessment. Quality and Reliability Engineering International, 15(5):369–378.

Shiau, J., Hung, H., and Chiang, C. (1999b). A note on Bayesian estimation of process

capability indices. Statistics & Probability Letters, 45(3):215–224.

Somerville, S. E., Montgomery, D. C., and Runger, G. C. (2002). Filtering and smoothing methods for mixed particle count distributions. International Journal of

Production Research, 40(13):2991–3013.

Sonesson, C. (2007). A CUSUM framework for detection of space–time disease clusters

using scan statistics. Statistics in Medicine, 26(26):4770–4789.

Spiegelhalter, D. (2005a). Funnel plots for comparing institutional performance. Statis-

tics in Medicine, 24(8):1185–1202.

Spiegelhalter, D. (2005b). Handling over-dispersion of performance indicators. Quality

and Safety in Health Care, 14(5):347.

Spiegelhalter, D., Grigg, O., Kinsman, R., and Treasure, T. (2003). Risk-adjusted

sequential probability ratio tests: applications to Bristol, Shipman and adult cardiac

surgery. International Journal for Quality in Health Care, 15(1):7–13.

Steiner, S. H. (1999). EWMA control charts with time-varying control limits and fast

initial response. Journal of Quality Technology, 31(1):75–86.

Steiner, S. H. and Cook, R. J. (2000). Monitoring surgical performance using risk

adjusted cumulative sum charts. Biostatistics, 1(4):441–452.

Steiner, S. H. and Jones, M. (2010). Risk-adjusted survival time monitoring with an

updating exponentially weighted moving average (EWMA) control chart. Statistics

in Medicine, 29(4):444–454.

Stoumbos, Z. G. and Sullivan, J. H. (2002). Robustness to non-normality of the mul-

tivariate EWMA control chart. Journal of Quality Technology, 34(3):260–276.

Stow, P. J., Hart, G. K., Higlett, T., George, C., Herkes, R., McWilliam, D., and Bel-

lomo, R. (2006). Development and implementation of a high-quality clinical database:

the Australian and New Zealand Intensive Care Society adult patient database. Journal

of Critical Care, 21(2):133–141.

Sturm, G., Feltz, C., and Yousry, M. (1991). An empirical Bayes strategy for analysing

manufacturing data in real time. Quality and Reliability Engineering International,

7(3):159–167.

Sullivan, E., Gorko, M., Stellon, R., and Chao, G. (1997). A statistically-based process

for auditing clinical data listings. Drug Information Journal, 31(3):647–653.

Tagaras, G. (1994). A dynamic programming approach to the economic design of

X-charts. IIE Transactions, 26(3):48–56.

Tagaras, G. (1996). Dynamic control charts for finite production runs. European

Journal of Operational Research, 91(1):38–55.

Tagaras, G. and Nikolaidis, Y. (2002). Comparing the effectiveness of various Bayesian

X control charts. Operations Research, 50(2):878–888.

Taylor, H. (1965). Markovian sequential replacement processes. The Annals of Math-

ematical Statistics, 36(6):1677–1694.

Taylor, H. (1967). Statistical control of a Gaussian process. Technometrics, 9(1):29–41.

Tracy, N. D., Young, J. C., and Mason, R. L. (1992). Multivariate control charts for

individual observations. Journal of Quality Technology, 24(2):88–95.

Triantafyllopoulos, K. (2006). Multivariate control charts based on Bayesian state space

models. Quality and Reliability Engineering International, 22(6):693–707.

Tsiamyrtzis, P. (2000). A Bayesian Approach to Quality Control Problems. PhD thesis,

University of Minnesota.

Tsiamyrtzis, P. and Hawkins, D. M. (2005). A Bayesian scheme to detect changes in

the mean of a short-run process. Technometrics, 47(4):446–456.


Tsiamyrtzis, P. and Hawkins, D. M. (2008). A Bayesian EWMA method to detect jumps

at the start-up phase of a process. Quality and Reliability Engineering International,

24(6):721–735.



20(4):435–450.

Wald, A. (1947). Sequential Analysis. John Wiley & Sons.

White, C. (1977). A Markov quality control process subject to partial observation.

Management Science, 23(8):843–852.

Whitney, C., Lind, B., and Wahl, P. (1998). Quality assurance and quality control in

longitudinal studies. Epidemiologic Reviews, 20(1):71–80.

Win, K. T., Phung, H., Young, L., Tran, M., Alcock, C., and Hillman, K. (2004).

Electronic health record system risk assessment: a case study from the MINET.

Health Information Management, 33(2):43–48.





Woodall, W. (1997). Control charts based on attribute data: bibliography and review.


Woodall, W., Brooke Marshall, J., Joner Jr, M., Fraker, S., and Abdel-Salam, A.

(2008). On the use and evaluation of prospective scan methods for health-related

surveillance. Journal of the Royal Statistical Society: Series A (Statistics in Society),

171(1):223–237.

Woodall, W. H. and Mahmoud, M. A. (2005). The inertial properties of quality. Tech-

nometrics, 47(4):425–436.

Woodall, W. H. and Montgomery, D. C. (1999). Research issues and ideas in statistical

process control. Journal of Quality Technology, 31(4):376–386.

Wu, C. (2008). Assessing process capability based on Bayesian approach with subsam-

ples. European Journal of Operational Research, 184(1):207–228.

Xie, M., Goh, T., and Kuralmani, V. (2002). Statistical Models and Control Charts for

High-Quality Processes. Kluwer Academic Publishers.

Xie, M., Lu, X., Goh, T., and Chan, L. (1999). A quality monitoring and decision-

making scheme for automated production processes. International Journal of Quality

and Reliability Management, 16(2):148–157.

Yang, Z., Xie, M., Kuralmani, V., and Tsui, K. (2002). On the performance of geometric

charts with estimated control limits. Journal of Quality Technology, 34(4):448–458.

Yin, Z. and Makis, V. (2011). Economic and economic-statistical design of a mul-

tivariate Bayesian control chart for condition-based maintenance. IMA Journal of

Management Mathematics, 22(1):47–63.

Zantek, P. and Nestler, S. (2009). Performance and properties of Q-statistic monitoring

schemes. Naval Research Logistics (NRL), 56(3):279–292.

Zhang, C. W., Xie, M., Liu, J. Y., and Goh, T. N. (2007). A control chart for the

Gamma distribution as a model of time between events. International Journal of


Zhang, P. (2004). Statistical issues in clinical trial data audit. Drug Information

Journal, 38(4):371–387.

Zhang, P. and Su, Q. (2011). The economically designed control chart for short-run production based on Bayesian method. In Artificial Intelligence, Management Science

and Electronic Commerce (AIMSEC), 2011 2nd International Conference on, pages

4828–4831. IEEE.

CHAPTER 3

Data Quality Improvement in Clinical

Databases Using Statistical Quality Control:

Review and Case Study

Preamble

Success of any quality improvement program in healthcare depends on the accuracy

of quality characteristics measured in the system. Clinical databases and medical reg-

istries are now widely used in construction of benchmarks, risk models and in-control

status of clinical procedures. Therefore assessment of the quality of data in clinical and

medical contexts is an essential stage in monitoring clinical outcomes. In this chapter

we reviewed the statistical analysis components in data quality improvement. Follow-

ing a gap analysis in the body of knowledge on quality evaluation and improvement

techniques in this area, we promoted well-established acceptance sampling plans (ASP)

and statistical process control (SPC) tools from an industrial context, including control

charts and root causes analysis, as the technical core of the data quality improvement

mechanism. In this regard, all potential tools were discussed and adapted to the data quality context. In a more general framework we illustrated how the proposed methods can be merged into the current approaches, techniques and infrastructure of clinical data collection and management, and when the transition between methods should be made. Two case studies were also presented in which we applied some of the techniques to databases maintained by St Andrew’s War Memorial Hospital.

This chapter focuses on the first objective of the thesis, mainly goals 1 and 2, in which

data quality estimation and enhancement are sought. The second objective of the thesis

is also addressed, since a comprehensive overview of control charting for attribute data is provided. The main contribution of this chapter lies in knowledge adaptation and application, as the promoted methods are well-known techniques in an industrial context.

This chapter has been written as a journal article for which I am the principal author.

It is reprinted here in its entirety. I was responsible for the conception of the paper,

statistical analysis, writing and addressing the reviewers’ comments.

Statement for Authorship

This chapter has been written as a journal article. The authors listed below have

certified that:

(a) they meet the criteria for authorship in that they have participated in the concep-

tion, execution or interpretation of at least that part of the publication in their

field of expertise;

(b) they take public responsibility for their part of the publication, except for the

responsible author who accepts overall responsibility for the publication;

(c) there are no other authors of the publication according to these criteria;

(d) potential conflicts of interest have been disclosed to granting bodies, the editor or

publisher of journals or other publications and the head of responsible academic

unit; and

(e) they agree to the use of the publication in the student’s thesis and its publication

on the Australian Digital Thesis database consistent with any limitations set by

publisher requirements.

In the case of this chapter, the reference for the associated publication is:

Assareh, H., Waterhouse, M. A., Moser, C., Brighouse, R. D., Foster, K. A., Smith,

I. R. and Mengersen, K. (2011) Data quality improvement in clinical databases using

statistical quality control: review and case study, Drug Information Journal, in press.

Contributor Statement of contribution

H. Assareh Conception and conduct research, implement statistical analysis, write manuscript, make modifications to manuscript as suggested by co-authors and reviewers

Signature & Date:

M.A. Waterhouse Conception, comments on manuscript, editing

C. Moser Conception

R. D. Brighouse Data collection

K. A. Foster Data collection

I. Smith Data collection, comments on manuscript

K. Mengersen Supervise research, comments on manuscript, editing

Principal Supervisor Confirmation: I have sighted email or other correspondence for

all co-authors confirming their authorship.

Name: ——————— Signature: ——————— Date: ———————


3.1 Abstract

Ensuring the quality of data being collected in clinical and medical contexts is a concern

for data managers and users. Quality assurance frameworks, systematic audits and

correction procedures have been proposed to enhance accuracy and completeness of

databases. Following an overview of the approaches undertaken, particularly statistical

methods, we promote acceptance sampling plans (ASP) and statistical process control

(SPC) tools, including control charts and root causes analysis, as the technical core of

the data quality improvement mechanism. We review ASP and SPC techniques and

discuss their implementation in data quality evaluation and improvement. Two case

studies are presented in which we apply some of the techniques to databases maintained

by a local hospital. We propose guidelines for which techniques are appropriate with

regard to dataflow and database specifications.

3.2 Introduction

There is an increasing demand for high quality medical registries and clinical databases.

Progress in information technology has paved the way for the systematic collection of

predefined patient data at a local, regional and national level. Clinical databases and

registries provide a valuable resource for the study of disease trends, interventions and

medical decision making and outcomes (Black, 1999). They are also a component of

quality improvement programs. They are used to assess productivity, to identify best

practices and to evaluate effectiveness of new procedures, drugs and services (Arts et al.,

2002).

To meet these objectives it is vital to have a good database design and high-quality

data. Indeed, the quality of any analysis is affected by data quality and database

structure (Beretta et al., 2007; Hattemer-Apostel et al., 2008). Inconsistencies in data

recording, such as missing values and errors, can lead to biased results. Arts et al.

(2002) define data quality as the totality of features and characteristics of a dataset

that affect its ability to meet its intended uses (based on ISO 8402-1986). Clinical data

managers are now responsible for providing high quality datasets, and often do this by

monitoring data capture and flow processes (Hattemer-Apostel et al., 2008).

The International Conference on Harmonization E6 Guidelines for Good Clinical Prac-

tice indicates that quality control should be applied to each stage of data handling to

ensure that all data are reliable and have been processed correctly (Shen and Zhou,

2006). Data quality assurance programs that consist of systematic procedures before,

during and after data collection are being developed and applied by data managers to

minimize inaccurate and incomplete data in final datasets (Arts et al., 2002). Whitney

et al. (1998) have defined quality assurance as a program that includes all activities

before data collection to ensure that the data are of the highest possible quality at the

time of collection.

In a seminal paper, Arts et al. (2002) developed a total framework for quality assurance

in medical registries. Based on their model, procedures have been developed to prevent

the collection of insufficient data, to detect imperfect data and its causes, and to apply

relevant corrective actions in local and central registries. This framework has been

applied in the construction of databases for intensive care units in Australia and New

Zealand (Stow et al., 2006). However, although this framework provides comprehensive

guidelines for the construction of a high-quality database, it lacks practical mechanisms

by which we can evaluate data quality, give feedback to providers, conduct root causes

analysis, and prioritize preventive and corrective actions.

Whitney et al. (1998) characterized quality control procedures which take place dur-

ing and after data collection to identify and correct errors and their causes. During

the collection process, data are transferred from paper or electronic-based case report

forms (CRFs) to databases, as well as between datasets and centers. As such, it is rec-

ommended that audits and quality review programs be applied at the different stages

(data entry, data transcription, merging and dataset locking) of database construction

(Hattemer-Apostel et al., 2008; Brunelle and Kleyle, 2002; Zhang, 2004).

Although data quality needs to be sufficiently high that objectives can be met reliably,

auditing an entire dataset, particularly when it is large, involves substantial effort and

the resources usually cannot be justified (Hattemer-Apostel et al., 2008). Rostami et al.

(2009) highlight the Institute of Medicine’s (IOM) statement that “there

can be no perfect dataset” and that “there may be a decreasing marginal benefit from

pursuing such a goal”. Therefore a number of minor errors might be acceptable. The

problem then becomes one of determining what is meant by acceptable, and this will

depend in part on the importance of the variables involved. It may be reasonable to

execute a 100% audit on just certain critical variables (Zhang, 2004). To this end, some

researchers have designed sampling plans based on statistical quality control methods as

an alternative to 100% audit, particularly for non-critical variables in clinical databases.

The objective of an acceptance sampling plan (ASP) is to determine whether an entire

dataset is acceptable, in terms of its error rate, based upon the number of defective

items in a sample from the dataset. Brunelle and Kleyle (2002) extended a statistical

approach proposed by Sullivan et al. (1997) by designing a sampling plan which uses

acceptable and limited quality levels (AQL=0.1% and LQL=1.0%). Zhang (2004) de-

veloped a hypothesis test for error rates which can be used to decide whether to accept

or reject a dataset given an acceptable quality level (0.1%). Shen and Zhou (2006)

developed acceptance sampling plans based on acceptable error rates of 0.0% and 0.5%

for critical and non-critical variables, respectively. To determine the importance of

variables and acceptable error rates, a study of the effect of errors in clinical decision

making and resultant outcomes is essential. In this regard, systematic and quantitative

methods have been proposed aiming to evaluate the clinical consequences of different

errors in such variables for both patients and the healthcare system. Among these,

the Failure Mode and Effects Analysis (FMEA) procedure has been applied to assess

the risk associated with errors in an electronic health record system (Win et al., 2004).

Hasan and Padman (2006) developed a statistical approach to translate the uncertainty

about data quality into the risk of negative medical consequences. This approach was

then applied to distinguish critical and non-critical variables and to design an efficient

data quality improvement program.

In a more recent study Rostami et al. (2009) have used a control chart for error rates

during the audit process to find outliers and run root causes analyses. Their procedure

led to an approximate 50% saving in time when compared to a full audit, while pro-

ducing the same decrease in error rates. Despite this research, it seems that statistical

quality control methods for the evaluation and improvement of data quality have not

been as widely used in the clinical context as in industrial applications (Hattemer-

Apostel et al., 2008). This may be due in some part to a lack of managerial approach

and technical knowledge of statistical quality control, unwillingness to acknowledge to-

tal solutions, lack of communication with data providers and users and their shared

responsibilities in a process-oriented approach and a lack of documentation on quality

control techniques in a clinical context. In particular, most sampling plans for audits

have been conducted either in an ad-hoc manner (Zhang, 2004) or using a fixed sample

size of 10% of recorded data (Shen and Zhou, 2006). In addition, a large majority of

them have been developed using an average quality level which leads to a high rate of

errors in the long term (Montgomery, 2008).

This paper gives an overview of acceptance sampling plans (ASPs) and statistical pro-

cess control (SPC) methods that can be employed to facilitate high-quality databases.

Guidelines are also suggested for which tools are appropriate given a data collection

system’s maturity and quality history.

The work presented in this paper has been done in conjunction with St Andrew’s Med-

ical Institute (SAMI), a research centre associated with St Andrew’s War Memorial

Hospital (SAWMH) in Brisbane, Australia. As part of their Applied Medical Intelli-

gence project, SAMI is seeking to improve the quality and safety of patient care through

better clinical audit processes. Achieving this goal requires developing procedures for

better data acquisition, analysis and utilization. In order to improve data acquisition,

SAMI has begun using Dendrite database software and they are implementing some

of the methods surveyed in this paper. In this paper, we consider the effectiveness of

these techniques with respect to two of SAMI’s databases.

In Section 3.3 we introduce quality attributes and sampling terminology with respect

to data collection processes. In Section 3.4 we discuss ASPs and demonstrate their appli-

cation to SAMI’s intensive care unit (ICU) database. In Section 3.5 SPC is discussed and

a case study using SAMI’s radiation metrics database is presented. Method selection

and final comments are covered in the final two sections.


3.3 Data Quality and Sampling Definitions

In this paper, the quality attributes of interest are defined for concreteness as accuracy

and completeness. Accuracy refers to the extent to which recorded data are correct.

Completeness refers to the extent to which (available) data have been registered (Arts

et al., 2002). An entry that is inaccurate or incomplete is an error, and we consider both

systematic and random errors. Systematic errors include programming errors, unclear

definitions for data items, and violations of the data collection protocol. Examples of

random errors are inaccurate data transcription, typing errors, and illegible handwriting

in case report forms (CRFs) (Arts et al., 2002). Methods of error detection include re-

entering data, automatic domain and inconsistency checks, visual checks, site audits,

delta checks and statistical analysis of data (Arts et al., 2002). In delta checks the

results of one or more measurements from two successive samples from the same patient

are compared. If the magnitude of the difference falls outside of a predetermined

threshold, there is evidence of a possible error (Nosanchuk and Gottmann, 1974). Other

statistical analysis includes cross-tabulation for categorical variables, regression models

for correlated variables, confidence intervals and control charts for outliers. Substantial

care should be taken when selecting the original source for an audit (Brunelle and

Kleyle, 2002). CRFs (or eCRFs) have been proposed as suitable original sources in most

cases (Shen and Zhou, 2006; Zhang, 2004). A combination of auditing and statistical

tools is recommended since data may be incorrectly recorded in the original source.
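
The delta check described above can be sketched in a few lines of Python. This is a minimal illustration only: the measurement (serum potassium), the values and the threshold are hypothetical and are not taken from the SAMI databases.

```python
def delta_check(previous, current, threshold):
    """Return True when two successive results differ by more than the threshold."""
    return abs(current - previous) > threshold


# Hypothetical successive results (mmol/L) for one patient.
results = [4.1, 4.3, 7.9, 4.2]

# Compare each result with the one before it; a True flag marks a possible error.
flags = [delta_check(a, b, threshold=2.0) for a, b in zip(results, results[1:])]
```

A flagged pair indicates only that one of the two values deserves review against the original source; it does not identify which value is wrong.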

Variables in a dataset usually have differing levels of importance; hence categorizing

variables and determining different levels and methods of checking are recommended

(Hattemer-Apostel et al., 2008). Critical variables are subjected to more stringent tests

with respect to accuracy and completeness. In this regard, the proposed statistical and

systematic evaluation techniques can be applied (Win et al., 2004; Hasan and Padman,

2006). Since quantitative determination of the criticality of variables is outside the scope of

this study and may depend on study-specific factors, it is not pursued here.

In general, the choice of sampling unit is somewhat arbitrary; however, the use of data

points as sample units may lead to more complexity. Here we assume that sampling

is undertaken at the patient level. This level has been considered by other researchers

and also extended to patient-form and patient-visit (Brunelle and Kleyle, 2002; Zhang,

2004; Rostami et al., 2009). The use of patient level leads to cluster sampling since

when a patient’s record is selected randomly as a sample unit, all data elements within

that record are considered (Shen and Zhou, 2006).

3.4 Acceptance Sampling Plans

Acceptance sampling plans (ASPs) can be used to assess, or sentence, a dataset when

100% inspection is uneconomical. The dataset is accepted if its quality is satisfactory,

based upon the number of defective items observed in a sample or set of samples from

the dataset, and it is rejected otherwise. Rejected datasets may be returned to their

source and submitted to 100% inspection and correction, termed rectifying inspection.

Although ASPs as audit tools do not directly lead to an improvement in the data

collection process (Montgomery, 2008), there may be a psychological effect due to

rectifying inspection. That is, clinicians may be more careful when entering data if

records have been returned to them for correction previously. ASPs can be applied

during any step of the data collection process, thereby ensuring an acceptable level

of quality of either the data received from the clinician or delivered to the database

users. Of the various systems available for designing an ASP, the Dodge-Romig system

(Dodge and Romig, 1959) embraces both rectifying inspection and critical variables

(Montgomery, 2008). This system has been developed based upon the lot tolerance

percent defective (LTPD) and average outgoing quality limit (AOQL). LTPD is the

poorest level of quality that the data user is willing to accept in an individual dataset,

and AOQL is the worst possible average quality that would result from a plan with

rectifying inspection in the long term. To design an ASP, the user is required to specify

one of these parameters and the average rate of errors for incoming datasets. If the

average rate of errors is unknown, it may be estimated from a preliminary sample.
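
To make the AOQL concept concrete, the following Python sketch computes the average outgoing quality (AOQ) of a single sampling plan with rectifying inspection and takes its maximum over a grid of incoming error rates. It assumes a binomial approximation to the sampling distribution and the common textbook expression AOQ = Pa * p * (N - n) / N; the function names and the grid are illustrative choices, and the result need not reproduce published Dodge-Romig table values exactly.

```python
from math import comb


def accept_prob(n, c, p):
    """P(accept) = P(D <= c) for D ~ Binomial(n, p)."""
    return sum(comb(n, d) * p**d * (1 - p) ** (n - d) for d in range(c + 1))


def aoq(N, n, c, p):
    """Average outgoing quality under rectifying inspection (binomial approximation)."""
    return accept_prob(n, c, p) * p * (N - n) / N


def aoql(N, n, c, p_max=0.05, grid=2000):
    """Approximate AOQL: the largest AOQ over incoming error rates in (0, p_max]."""
    return max(aoq(N, n, c, k * p_max / grid) for k in range(1, grid + 1))


# Example: a dataset of N = 336 records audited with n = 175 and c = 0.
worst_average_quality = aoql(N=336, n=175, c=0)
```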


3.4.1 Sampling Plans

The simplest ASP involves choosing a sample size n and acceptance number c. A

random sample of size n is taken from the dataset and if the number of defective

records in the sample does not exceed c, then the dataset is accepted. Otherwise, it is

rejected. This is termed a single sampling plan.
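
The single sampling rule can be sketched as follows. The acceptance probability uses a binomial approximation, which is an assumption of this sketch; the plan (n = 175, c = 0) and the observed error counts are those reported in Table 3.1.

```python
from math import comb


def accept_prob(n, c, p):
    """Probability of accepting a dataset with true error rate p (binomial approximation)."""
    return sum(comb(n, d) * p**d * (1 - p) ** (n - d) for d in range(c + 1))


def sentence(defectives, c):
    """Accept the dataset when the number of defective records does not exceed c."""
    return "Accepted" if defectives <= c else "Rejected"


# Observed defective records per block, as reported in Table 3.1.
observed = {"2001-1": 2, "2001-2": 0, "2002-1": 1, "2002-2": 2}
results = {block: sentence(d, c=0) for block, d in observed.items()}

# Chance of accepting a dataset whose error rate equals the LTPD of 1.0%.
pa_at_ltpd = accept_prob(n=175, c=0, p=0.01)
```

Evaluating `accept_prob` over a range of error rates gives the operating characteristic curve of the plan.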

A double sampling plan depends upon four parameters: n1, n2, c1 and c2. A sample

of size n1 is taken. If the number of defective records does not exceed c1, then the

dataset is accepted; if it exceeds c2, then the dataset is rejected; and otherwise a second

sample of size n2 is taken. A decision is then made by comparing the total number

of defective records from both samples to c2. Double sampling plans are cheaper than

single sampling plans when the data quality is either very good or very bad because, on

average, they inspect fewer items than required by a single sampling plan (Montgomery,

2008).
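
A minimal sketch of the double sampling rule follows, together with its acceptance probability and average sample number (ASN) under a binomial approximation and without curtailment. The function names are illustrative assumptions, not part of any cited method's notation.

```python
from math import comb


def binom_pmf(n, d, p):
    """P(D = d) for D ~ Binomial(n, p)."""
    return comb(n, d) * p**d * (1 - p) ** (n - d)


def first_stage(d1, c1, c2):
    """Decision after the first sample of size n1."""
    if d1 <= c1:
        return "accept"
    if d1 > c2:
        return "reject"
    return "second sample"


def second_stage(d1, d2, c2):
    """Decision after the second sample: compare the combined count with c2."""
    return "accept" if d1 + d2 <= c2 else "reject"


def accept_prob(n1, c1, n2, c2, p):
    """Overall probability of acceptance for error rate p."""
    pa = sum(binom_pmf(n1, d, p) for d in range(c1 + 1))
    for d1 in range(c1 + 1, c2 + 1):
        # A second sample is drawn; accept if the combined count stays within c2.
        pa += binom_pmf(n1, d1, p) * sum(binom_pmf(n2, d2, p) for d2 in range(c2 - d1 + 1))
    return pa


def average_sample_number(n1, c1, n2, c2, p):
    """Expected number of records inspected (no curtailment)."""
    p_second = sum(binom_pmf(n1, d, p) for d in range(c1 + 1, c2 + 1))
    return n1 + n2 * p_second
```

Comparing `average_sample_number` with the sample size of a single plan that offers similar protection shows where the saving in inspection effort arises.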

When first implementing ASPs, it is recommended that a single sampling plan is

adopted. Terminating the audit once the number of defective records exceeds the

acceptance number is referred to as curtailment. In a database context, curtailment

may be inadvisable when using a single sampling plan, since complete inspection will

provide a better estimate of data quality. If the quality is estimated to be either very

good or very bad, a double sampling plan can be adopted, in which case, curtailment

in the second stage may be acceptable.

ASPs can be extended to more than two samples. The reader is referred to Montgomery

(2008) for a discussion on multiple and sequential sampling plans. These methods break

large samples into smaller ones and relocate the decision point on consecutive sampling

and observations. Although both plans are more complicated to administer, some

economic efficiency may be gained and the plans may be more appealing to both

clinicians and users of the database.

For critically important variables, the acceptance number is usually set to zero in a

single sampling plan. In this case, it may be preferable to adopt a Chain sampling plan

(Chsp) instead (Dodge, 1955). In Chsp the decision about whether to reject or accept

is based on the results from previous samples as well as a sample from the current

dataset. The dataset is only accepted if either there are no defective records in the

current sample (of size n), or there is one defective record in the current sample and no

defective records in the previous i datasets. For details on how n and i are calculated,

the reader is referred to Montgomery (2008), Dodge (1955), Dodge and Stephens (1966)

and Schilling and Neubauer (2009).
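
The chain sampling decision rule stated above can be sketched as below. Two points are assumptions of this sketch rather than statements from the cited sources: a dataset with one defective record is not accepted until i previous samples are available, and the acceptance probability uses the binomial form P(0, n) + P(1, n) P(0, n)^i.

```python
def chsp_accept(current_defects, previous_defects, i):
    """Chain sampling decision.

    previous_defects holds the defect counts of earlier samples, most recent last.
    """
    if i < 1:
        raise ValueError("i must be at least 1")
    if current_defects == 0:
        return True
    if current_defects == 1:
        recent = previous_defects[-i:]
        # Assumption: a full history of i clean samples is required.
        return len(recent) == i and all(d == 0 for d in recent)
    return False


def chsp_accept_prob(n, i, p):
    """Acceptance probability under a binomial model with error rate p."""
    p0 = (1 - p) ** n                    # no defective records in a sample
    p1 = n * p * (1 - p) ** (n - 1)      # exactly one defective record
    return p0 + p1 * p0**i
```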

Chsp is appropriate only if the quality of incoming datasets is both relatively stable and

high. If repeated application of Chsp suggests consistently high quality of incoming

data, then a Skip-Lot sampling plan (SkSp) may be considered in order to reduce

the burden of inspection (Dodge, 1943; Perry, 1973). This involves using a reference

sampling plan, such as single or double sampling, to sentence datasets. If a specified

number of consecutive datasets is accepted, then instead of inspecting each new dataset,

the reference sampling plan is applied to a specified fraction of incoming datasets. If,

however, a dataset is rejected while using the reduced inspection process, then normal

inspection (of each dataset) is resumed.

In some data collection processes, incoming data flow is continuous, rather than peri-

odic, and is in batch form. In this case, data aggregation may be undertaken to provide

large datasets before using an ASP. This approach has some disadvantages particularly

in administration and corrective action. Continuous sampling plans (CSPs) (Dodge,

1947; Dodge and Torrey, 1951) are recommended for this circumstance. The simplest

plan, a CSP-1, begins with 100% inspection of all incoming records; as with SkSp, if a

specified number of consecutive records are accepted, then instead of inspecting each

new record, we inspect only a fraction of them. If, however, a record is rejected while

using the reduced inspection process, then 100% inspection is resumed.
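
The switching logic of a CSP-1 plan, which mirrors the skip-lot logic described earlier, can be sketched as a small state machine. The parameter names `clearance` (the specified number of consecutive accepted records) and `fraction` (the sampled fraction) are labels chosen for this sketch. The expression used for the average fraction inspected (AFI) is the standard textbook result for CSP-1 under a constant error rate p and is included here as an assumption, not as a result from this chapter.

```python
import random


def csp1_inspected(defective, clearance, fraction, rng):
    """Return, for each incoming record, whether it is inspected under CSP-1."""
    inspected, run, sampling = [], 0, False
    for is_defective in defective:
        # Inspect everything in 100% mode, a random fraction in sampling mode.
        look = (not sampling) or (rng.random() < fraction)
        inspected.append(look)
        if not look:
            continue
        if is_defective:
            sampling, run = False, 0     # revert to 100% inspection
        elif not sampling:
            run += 1
            if run >= clearance:
                sampling = True          # switch to fractional inspection
    return inspected


def csp1_afi(p, clearance, fraction):
    """Long-run average fraction of records inspected for error rate p."""
    q = 1.0 - p
    return fraction / (fraction + (1.0 - fraction) * q**clearance)
```

For example, `csp1_inspected(flags, clearance=20, fraction=0.1, rng=random.Random(1))` returns the inspection pattern for a list of booleans `flags` marking defective records.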

3.4.2 Case Study: ICU data

The Acute Physiology and Chronic Health Evaluation II (APACHE II) (Knaus et al.,

1985) is an intensive care unit (ICU) scoring system based on a logistic regression model

that predicts the mortality of a patient given 12 physiological measurements taken in


Table 3.1 Single sampling plans for APACHE II data, LTPD=1.0% and process average=0.5%.

                    Design                                       Implementation
Block    Count of   Sample size   Acceptance    AOQL (%)   Observed    Result
         records    (n)           number (c)               error (d)
2001-1   336        175           0             0.12       2           Rejected
2001-2   341        175           0             0.12       0           Accepted
2002-1   301        175           0             0.12       1           Rejected
2002-2   324        175           0             0.12       2           Rejected

the first 24 hours after admission to ICU, chronic health status and age. These pre-

dictions enable clinicians and health managers to select treatment and procedures,

facilitate clinical resources utilization, monitor care processes and conduct quality im-

provement programs (Moreno and Matos, 2001; Sakr et al., 2008; Shahian et al., 2004).

The APACHE II data elements are routinely collected and recorded in local datasets by

SAMI and then submitted to ANZICS CORE Adult Patient Database (APD) periodi-

cally. The SAMI dataset contained 4644 records for patients admitted to ICU between

2000 and 2009. Although data modification had been applied during data collection

and registration at SAMI, data quality evaluation and rectifying inspection were con-

sidered prior to submission to APD to ensure that the released data are of high quality.

The SAMI ICU dataset was partitioned into 6 month periods. As 44 and 142 records

were collected during the 4th quarter of 2000 and the 1st quarter of 2009 respectively,

they were incorporated into adjacent periods. Incomplete and inaccurate data were

considered to be errors. Errors were detected by comparing the electronic dataset to

the original forms. Since all variables in the dataset are used in the calculation of

the APACHE II score, they were all considered to be critical. Having said that, risk

assessment methods can also be implemented to categorize the variables, see Hasan

and Padman (2006) for more details. A single sampling plan was designed by setting

LTPD=1.0% and assuming the average error rate was 0.5%. For an LTPD of 1%, this

is the highest average error rate for which Dodge and Romig (1959) provide a design.

Table 3.1 shows the sampling plans for the first four data blocks, the number of errors

observed in each sample, and our conclusion on whether the entire block of data should

be accepted or rejected. The AOQL column indicates the outgoing data quality level

when rectifying inspection is used. The plans were implemented and rejected blocks

were subjected to inspection and modification.

Three of the first four blocks were rejected, suggesting that the quality was poor.

Consequently, a double sampling plan was developed and applied to the remaining

blocks. Table 3.2 shows the design and the results of its implementation.

Table 3.2 Double sampling plans for APACHE II data, LTPD=1.0% and process average=0.5%.

                    Design                             Implementation
Block    Count of   n1    c1   n2    c2   AOQL (%)   Observed      Observed      Result
         records                                     error in n1   error in n2
2003-1   304        200   0    90    1    0.12       1             1             Accepted
2003-2   277        180   0    75    1    0.10       2             -             Rejected
2004-1   283        180   0    75    1    0.10       0             -             Accepted
2004-2   210        165   0    -     -    0.10       1             -             Rejected
2005-1   244        165   0    -     -    0.10       0             -             Accepted
2005-2   261        180   0    75    1    0.10       0             -             Accepted
2006-1   295        180   0    75    1    0.10       1             2             Rejected
2006-2   251        165   0    -     -    0.10       0             -             Accepted
2007-1   277        180   0    75    1    0.10       1             1             Accepted
2007-2   237        165   0    -     -    0.10       1             -             Rejected
2008-1   282        180   0    75    1    0.10       2             -             Rejected
2008-2   421        215   0    100   1    0.14       1             3             Rejected

Seven out of sixteen data blocks were accepted and the rest were submitted to 100%

inspection and modification. This provided an outgoing error rate of around 0.1% on

average when rectifying inspection was conducted. We were not able to use double

sampling plans for four of the final twelve blocks (2004-2, 2005-1, 2006-2, 2007-2) due

to their small sizes. The four plans described in Table 3.2 are similar to those obtained

from single sampling. When double sampling was used, a second sample was required

in four cases and two of these blocks were accepted (2003-1, 2007-1). In this study, the

potential benefit of double sampling relative to single sampling was not great because

of the small block sizes and the very high level of desired quality. The acceptance

of blocks 2003-1 and 2007-1 illustrated the advantage of double sampling. Under a

single sampling scheme, both blocks would have been rejected and submitted to 100%

inspection. In contrast, rejected blocks 2006-1 and 2008-2 would have been rejected

under a single sampling scheme, but with fewer records inspected.

3.5 Statistical Process Control

Acceptance sampling plans and rectifying inspection might ensure the quality of incom-

ing/outgoing data, but they do not lead to improvement in data collection. The data

must be produced, transferred and stored accurately and completely. Ongoing improve-

ment in data quality is achieved by stabilizing data collection and registration processes


Figure 3.1 Process improvement cycle (Montgomery, 2008).

through the elimination of sources of variability. Statistical process control (SPC) is

a set of tools that diagnose, control and prioritize on-line variation problems, analyze

their root causes and reflect the effect of corrective actions and improvements. Due to

these capabilities, quality management programs, including Six Sigma, have embedded

SPC tools into the technical core of their methodologies (Montgomery, 2008).

SPC consists of seven tools. The Check Sheet and Defect Concentration Diagram

are data collection and summary tools that present the current situation of a process

via its measurements and observed defects. Histogram and Scatter Plots analyze the

behavior of the process factors and variables individually and interactively. A Control

Chart interprets data quality and detects changes in the process. A Pareto Chart

categorizes and prioritizes observed errors and their root causes. Finally, a Cause

and Effect Diagram identifies and categorizes the potential causes of observed errors

(Montgomery, 2008; Ishikawa, 1990).
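
As an illustration of the Pareto step, the sketch below ranks error causes by frequency and reports the cumulative percentage. The cause labels echo the error examples given in Section 3.3, but the counts are hypothetical.

```python
from collections import Counter


def pareto_table(error_causes):
    """Return (cause, count, cumulative %) rows, most frequent cause first."""
    counts = Counter(error_causes).most_common()
    total = sum(n for _, n in counts)
    rows, cumulative = [], 0
    for cause, n in counts:
        cumulative += n
        rows.append((cause, n, round(100 * cumulative / total, 1)))
    return rows


# Hypothetical audit log: one entry per observed error.
log = ["illegible handwriting"] * 5 + ["typing error"] * 3 + ["unclear definition"] * 2
rows = pareto_table(log)
```

The first few rows of such a table indicate which causes a corrective action should address first.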

A process may improve when a control chart identifies undesirable variation in the

process outcomes, root cause analysis is implemented using Pareto charts and Cause

and Effect diagrams, and corrective action is defined and accomplished as shown in

Figure 3.1. This procedure is known as an Out-of-Control Action Plan (OCAP). The

success of an SPC program requires data managers to be involved in and to support OCAP cycles

within their systems (Montgomery, 2008).


3.5.1 Quality Control Charts

Control charts are used to identify whether the variation of the process outcomes is

due to assignable or random causes and whether the process is statistically in or out

of control. A control chart presents a quality characteristic of the process over time.

Generally speaking, the chart has a center line (CL) showing the in control mean of the

characteristic, and an upper and lower control limit, denoted UCL and LCL, respec-

tively. A sample point outside the control limits indicates that the process is out of

control. The size of the sample used to calculate the plotted statistic and the frequency

of sampling depends upon the shift size to be detected. In general, small samples with a

high frequency of sampling (small intervals) are recommended (Montgomery, 2008). As

errors and defective records are the quality characteristics of interest in data collection

processes, Shewhart control charts for attributes are considered. These charts have a

CL equal to the mean of the underlying distribution of the process, and the UCL and

LCL are equal to three standard deviations above and below the mean, respectively.

This is based on the assumption that the quantity being monitored, typically a sample

mean, is normally distributed.

The proportion of defective cases can be monitored by a p-chart. The components of a p-

chart are presented in Table 3.3. The CL, $\bar{p}$, might be known from previous experience

or estimated from observed data from preliminary in-control samples. Negative LCLs

are set to zero (Montgomery, 2008).
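
A minimal sketch of the p-chart calculation is given below, assuming m preliminary samples of equal size n and estimating the center line from the data. The defective counts are hypothetical.

```python
from math import sqrt


def p_chart(defectives, n):
    """Return the center line, control limits and indices of out-of-control samples."""
    m = len(defectives)
    p_bar = sum(defectives) / (m * n)            # center line
    sigma = sqrt(p_bar * (1 - p_bar) / n)
    lcl = max(0.0, p_bar - 3 * sigma)            # negative LCLs are set to zero
    ucl = p_bar + 3 * sigma
    points = [d / n for d in defectives]
    signals = [i for i, p in enumerate(points) if p > ucl or p < lcl]
    return p_bar, lcl, ucl, signals


# Hypothetical numbers of defective records in ten samples of 50 records each.
p_bar, lcl, ucl, signals = p_chart([2, 1, 3, 2, 12, 1, 2, 3, 2, 2], n=50)
```

In a trial (preliminary) phase, samples that signal and have an identified assignable cause would normally be set aside before the limits are recalculated.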

It is often more informative to monitor the types of errors rather than defective records.

In this case, parameters of the control chart must be redefined using data points instead

of records. The additional benefits of these charts must be weighed against the added

complication and administration that they impose.

Nonconformity control charts are proposed as alternatives for fraction nonconforming

control charts such as the p-chart (Montgomery, 2008). A c-chart monitors the occur-

rence of errors, c say, in an inspection unit, which is taken here to be a database record

or fixed number of records. Subsequent units are sampled and the observed number

of errors is counted for each unit and plotted on a chart that is based on a Poisson

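Table 3.3 Quality control charts and their components.

p-chart
  Quality characteristic: Proportion of defective records
  Underlying distribution: Binomial
  Plotted statistic: $p_i = d_i / n$
  Chart components: $\mathrm{UCL} = \bar{p} + 3\sqrt{\bar{p}(1-\bar{p})/n}$; $\mathrm{CL} = \bar{p}$; $\mathrm{LCL} = \bar{p} - 3\sqrt{\bar{p}(1-\bar{p})/n}$
  Parameters definition: $\bar{p} = \sum_{i=1}^{m} d_i / (mn)$

c-chart
  Quality characteristic: Number of errors per inspection unit
  Underlying distribution: Poisson
  Plotted statistic: $c_i$
  Chart components: $\mathrm{UCL} = \bar{c} + 3\sqrt{\bar{c}}$; $\mathrm{CL} = \bar{c}$; $\mathrm{LCL} = \bar{c} - 3\sqrt{\bar{c}}$
  Parameters definition: $\bar{c} = \sum_{i=1}^{m} c_i / m$

u-chart
  Quality characteristic: Average number of errors per inspection unit
  Underlying distribution: Poisson
  Plotted statistic: $u_i = c_i / n_i$
  Chart components: $\mathrm{UCL} = \bar{u} + 3\sqrt{\bar{u}/n}$; $\mathrm{CL} = \bar{u}$; $\mathrm{LCL} = \bar{u} - 3\sqrt{\bar{u}/n}$
  Parameters definition: $\bar{u} = \sum_{i=1}^{m} u_i / m$

Weighted u-chart
  Quality characteristic: Weighted average number of errors per inspection unit
  Underlying distribution: Poisson
  Plotted statistic: $u_i = \sum_{j=1}^{k} w_j c_{ij} / n$
  Chart components: $\mathrm{UCL} = \bar{u} + 3\sqrt{\sum_{j=1}^{k}\sum_{i=1}^{m} w_j^2 c_{ij} / (mn)}$; $\mathrm{CL} = \bar{u}$; $\mathrm{LCL} = \bar{u} - 3\sqrt{\sum_{j=1}^{k}\sum_{i=1}^{m} w_j^2 c_{ij} / (mn)}$
  Parameters definition: $\bar{u} = \sum_{i=1}^{m} u_i / m$; $w_j$: weight for error/variable type $j$

CCC-chart
  Quality characteristic: Number of correct cases/data points between defective cases/errors
  Underlying distribution: Weibull (Geometric)
  Plotted statistic: $x_i$
  Chart components: $\mathrm{UCL} = \ln(\alpha/2) / \ln(1-p)$; $\mathrm{CL} = 1/p$; $\mathrm{LCL} = \ln(1-\alpha/2) / \ln(1-p)$
  Parameters definition: $\alpha$: acceptable risk of false alarm; $p$: observed ratio of defective cases/errors

X-chart
  Quality characteristic: Time between defective cases/errors
  Underlying distribution: Weibull (approximately normal)
  Plotted statistic: $x_i$
  Chart components: $\mathrm{UCL} = \bar{X} + 3\sqrt{\sum_{i=1}^{m}(x_i-\bar{X})^2/(m-1)}$; $\mathrm{CL} = \bar{X}$; $\mathrm{LCL} = \bar{X} - 3\sqrt{\sum_{i=1}^{m}(x_i-\bar{X})^2/(m-1)}$
  Parameters definition: $\bar{X} = \sum_{i=1}^{m} x_i / m$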


distribution, see Table 3.3. The assumption here is that the fraction of nonconformi-

ties is small relative to the sample size and that all units have the same underlying

probability of being defective.

In the c-chart, the Poisson distribution models the number of occurrences of

nonconformities in an interval of time or space.

If the number of items monitored on a record or the number of records in an inspection

unit varies, the c-chart components would need to be redefined and the center line would be

non-constant. An alternative is to construct a chart based on the average number of

errors per inspection unit, u say. A u-chart is defined with a base inspection unit

size, for example 10 records, and the observed errors in a unit with different size are

converted to this base size; for example a sample of size 15 is 1.5 inspection units. The

resultant u-chart has a constant center line (CL) and variable limits. Similar to a p-chart, the c- and u-charts may

be built using a known average number of errors in the process data, or constructed in

trial mode and modified using preliminary samples.
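
The u-chart with a base inspection unit can be sketched as follows. The center line follows the definition in Table 3.3 (the mean of the plotted statistics); using each sample's own size in the limits, which produces the variable limits mentioned above, is an assumption of this sketch. The error counts are hypothetical.

```python
from math import sqrt


def u_chart(errors, records, base_size=10):
    """Return plotted statistics, center line and per-sample (LCL, UCL) pairs."""
    units = [r / base_size for r in records]       # e.g. 15 records = 1.5 units
    u = [c / n for c, n in zip(errors, units)]     # errors per inspection unit
    u_bar = sum(u) / len(u)                        # constant center line
    limits = []
    for n in units:
        half_width = 3 * sqrt(u_bar / n)
        limits.append((max(0.0, u_bar - half_width), u_bar + half_width))
    return u, u_bar, limits


# Hypothetical error counts in samples of 10, 15 and 10 records.
u, u_bar, limits = u_chart(errors=[3, 6, 4], records=[10, 15, 10])
```

Larger samples give narrower limits around the same center line.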

As discussed earlier, the c- and u-charts provide more information upon which to make

decisions regarding corrective actions. Often quality characteristics are not equally

important and categorizing them as critical or non-critical is advised. It may be worth-

while constructing separate control charts for the different variable types. A notable

extension of the u-chart simultaneously takes into account the importance of the vari-

ables/errors. In this development a demerit system is used to classify either errors

(Montgomery, 2008) or variables (Alidousti et al., 2005). This system assigns differ-

ent levels of severity to variables/errors according to their effects on outcome data quality

(Jones and Woodall, 1999). In this regard the systematic and statistical risk assessment

methods can be applied to rank the error types (Win et al., 2004; Hasan and Padman,

2006).

Errors may occur in clusters, so that the probability of an error is

not constant. In this case, two distributions can be used, one to express the number of

clusters and another to model the number of errors within each cluster. A compound

Poisson distribution or other mixture model can then be applied; see Kaminsky et al. (1992),

Gardiner (1987) and Montgomery (2008).

When the quality of the process is high, the number of errors and proportion of de-

fective cases tend to zero, in which case a sequence of zeros will be observed. In this

situation the Shewhart charts are not useful since the observed data are no longer approximately

normally distributed. Count- and time-based control charts that monitor time or number

of conforming products between two nonconformities may be more appropriate. In the

count-based approach, the observed faultless cases/data points between two defective

cases/errors are counted and plotted on a cumulative count of conforming (CCC) con-

trol chart. The construction of a CCC-chart is similar to a p-chart, except the number

of conforming items is plotted when a defective item has been observed. The interpre-

tation of a CCC-chart differs from conventional Shewhart control charts. A succession

of error-free records will eventually result in a statistic exceeding the UCL, indicating

an improvement in data quality. On the other hand, a signal below the LCL shows

a decline in the data quality. For more information on the chart’s construction and

parameter definition refer to Calvin (1983), Goh (1987) and Xie et al. (2002). As an

extension of the CCC-chart, the number of faultless cases may be counted until r > 1

defective cases are observed. In this case a CCC-r chart is constructed on a negative

binomial distribution; see Xie et al. (1999) and Ohta et al. (2001).
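
The CCC-chart limits listed in Table 3.3 can be computed directly, as in the sketch below. The values of p and alpha are hypothetical, and the function names are labels chosen for this illustration.

```python
from math import log


def ccc_limits(p, alpha):
    """Return (LCL, CL, UCL) for the count of conforming items between defectives."""
    ucl = log(alpha / 2) / log(1 - p)
    cl = 1 / p
    lcl = log(1 - alpha / 2) / log(1 - p)
    return lcl, cl, ucl


def ccc_signal(count, lcl, ucl):
    """Interpret a plotted count: long runs of correct records are good news."""
    if count > ucl:
        return "improvement"
    if count < lcl:
        return "deterioration"
    return "in control"


# Hypothetical process: 1% defective records, false alarm risk of 0.27%.
lcl, cl, ucl = ccc_limits(p=0.01, alpha=0.0027)
```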

If the event follows a Poisson process, the time between two events has an exponential

distribution. Since this distribution is skewed, transformation of an exponential ran-

dom variable to an approximately normal variable via taking logarithms or $x = y^{0.25}$

(Kittlitz, 1999; Nelson, 1994) will allow the CL and control limits to be calculated using

the usual mean and three standard deviations based on transformed data. Similar to

a CCC-chart, an out-of-control point higher than the UCL indicates an improvement

in process quality and a signal below the LCL shows a drop in process quality. Al-

though the time-based approach seems easier than count-based method, care should be

taken when defining and measuring the desired variable. Since the time between two defective


cases/errors is measured, there should be a constant volume of dataflow in order to

accept time as an alternative to the count of cases. Plotting the time to observe r

defective cases/errors may also be considered. In this case the control chart would be

constructed based on a Gamma distribution; see Zhang et al. (2007).
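
The transformation step described above can be sketched as follows. This is a minimal illustration, not the procedure used in the thesis: the inter-event times are simulated, the mean of 10 time units and the function name `transformed_limits` are hypothetical, and the limits are simply the mean plus or minus three standard deviations on the transformed scale.

```python
import random
import statistics

def transformed_limits(times, power=0.25, k=3.0):
    """Return (LCL, CL, UCL) computed on the transformed scale x = y**power."""
    x = [t ** power for t in times]
    centre = statistics.mean(x)
    sd = statistics.stdev(x)
    return centre - k * sd, centre, centre + k * sd

# Hypothetical data: 200 simulated times between errors, exponential with mean 10.
rng = random.Random(1)
times = [rng.expovariate(1 / 10) for _ in range(200)]

lcl, cl, ucl = transformed_limits(times)

def signal(new_time, power=0.25):
    """Classify a new time between events against the transformed limits."""
    x = new_time ** power
    if x > ucl:
        return "above UCL: longer gap between errors (improvement)"
    if x < lcl:
        return "below LCL: shorter gap between errors (deterioration)"
    return "in control"
```

New observations must be transformed with the same power before they are compared with the limits, which is why `signal` repeats the transformation.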

3.5.2 Case Study: Radiation Metrics Data Collection

When a patient undergoes a cardiac procedure in Australia, it is a legal requirement

that the hospital records the amount of radiation to which the patient is exposed.

SAMI collects data for a total of five variables, three of which are used to monitor the

amount of radiation, namely fluoroscopy time, the number of digital frames taken, and

the dose area product (DAP). The other two variables are case type and diagnosis.

At SAWMH, data recorded in paper form during radiography by the radiographer and

attending cardiologist are entered into SAMI’s database at the end of each day. We

implemented SPC tools to monitor the data collection process in order to facilitate the

detection of errors and identification of their causes. Inaccuracy and incompleteness

were considered to be types of errors and were identified by comparing database records with

paper forms. All variables were seen as critical and sample units were defined at the

patient level. That is, a patient’s record comprises the five variables, and if any of these

variables are inaccurate or missing, the record is considered defective and the number

of errors is registered. However, risk assessment methods can also be implemented

to categorize the critical and non-critical variables, see Hasan and Padman (2006) for

more details. Initially, a sample of 50 records was taken from the database; 16 errors

were observed. Given this high error rate, we decided to monitor all incoming data on

a daily basis for a month. The aim of 100% inspection was to identify assignable causes

and to run corrective interventions until the process achieved a desired error rate, set

by SAMI collaborators at 1.0%. We used a u-chart because the number of records per

day varied. Stage 1 of Figure 3.2 shows a chart of the observed number of errors for 23

days in April 2009. In total there were 45 errors in 305 inspected records during

this period, giving an estimated error rate of 14.7%.
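
As a rough illustration of how u-chart limits respond to a varying number of records per day, the sketch below uses the Stage 1 totals reported above (45 errors in 305 records) with the usual three-sigma limits for a u-chart. The daily record counts are hypothetical, since the text does not list them, and `u_chart_limits` is an illustrative name.

```python
import math

def u_chart_limits(total_errors, total_units, n_i, k=3.0):
    """Three-sigma u-chart limits for a day on which n_i records were inspected."""
    u_bar = total_errors / total_units          # centre line: average errors per record
    half_width = k * math.sqrt(u_bar / n_i)     # limits widen as n_i shrinks
    return max(0.0, u_bar - half_width), u_bar, u_bar + half_width

# Hypothetical daily record counts; the totals are those reported for April 2009.
for n_i in (8, 13, 20):
    lcl, cl, ucl = u_chart_limits(45, 305, n_i)
    print(f"n={n_i:2d}  LCL={lcl:.3f}  CL={cl:.3f}  UCL={ucl:.3f}")
```

With daily volumes of this size the lower limit is truncated at zero, so only unusually high daily error rates can signal.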

We categorized different types of errors using a Pareto chart. Figure 3.3 shows that most


Figure 3.2 u-chart of observed errors in radiation metrics dataset; Stage 1: before intervention (April 2009), Stage 2: after intervention (May 2009).

errors occurred in the recording of case type (40%), followed by the number of frames

and DAP. A Cause and Effect diagram was constructed and revised in collaboration

with radiographers and SAMI researchers. Figure 3.4 presents identified causes of

observed errors from the radiographers’ view and other potential sources that were

identified via brainstorming. Comparing the most common errors with potential causes

revealed that errors in case type were most frequently caused by a lack of communication

between the radiographer and cardiologist. Errors in the number of frames were often

due to entering an estimated value prior to completion of the procedure instead of

reading the true number from the device. Other common errors were due to carelessness

and delay in recording measurements, illegible handwriting on the original forms, and

the complexity of the database interface.

In collaboration with SAWMH clinicians and SAMI researchers, we organized and im-

plemented the required interventions and then continued charting for the next month

of May 2009. Stage 2 of the chart in Figure 3.2 shows that the data collection process

became more accurate, with the error rate dropping to 2.8%. As such, we switched

to inspecting only 50% of the daily records. As the error rate approached the desired

level of 1.0%, the u-chart was replaced by a CCC-chart in July 2009. The CCC-chart

plotted the number of correct records between defective records, using α = 0.0027 and

an estimated error rate of p = 0.016. Figure 3.5 shows the resultant CCC-chart for July


Figure 3.3 Pareto chart of observed error types in radiation metrics dataset in April 2009.

Figure 3.4 Cause and Effect diagram of potential causes of observed errors in radiation metrics dataset.


Figure 3.5 CCC-chart for observed errors in radiation metrics dataset for July-September 2009.

to September 2009. By continual monitoring and running corrective interventions in

this manner, longer sequences of correct records were more consistently observed over

time.
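
For readers who wish to reproduce limits of the kind used in the CCC-chart, the following sketch computes probability limits from the geometric distribution with the reported values α = 0.0027 and p = 0.016. The formulas are an assumption: they follow the common convention in which the plotted count includes the defective record, which may differ slightly from the counting convention used for Figure 3.5.

```python
import math

def ccc_limits(p, alpha):
    """Probability limits for a CCC-chart based on the geometric distribution.

    Assumes the cumulative count n satisfies P(count <= n) = 1 - (1 - p)**n.
    """
    log_q = math.log(1.0 - p)
    lcl = math.log(1.0 - alpha / 2.0) / log_q
    cl = math.log(0.5) / log_q
    ucl = math.log(alpha / 2.0) / log_q
    return lcl, cl, ucl

lcl, cl, ucl = ccc_limits(p=0.016, alpha=0.0027)
print(f"LCL={lcl:.2f}  CL={cl:.1f}  UCL={ucl:.1f}")
```

Under these assumptions the lower limit falls below one record, so a single short run cannot signal deterioration; only long runs of correct records, beyond the UCL, produce a signal.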

The corrective interventions comprised changes to practice, administration and soft-

ware. Missing data were found more systematically than other errors because the

software reacted to missing values. In light of this, the radiation metrics form was re-

vised to include multiple-choice items. Other interventions included clearly defining the

responsibilities associated with data entry and confirmation, providing weekly feedback

on error rates to the radiographers and cardiologists, redesigning the software interface

and incorporating inbuilt checks for obvious inconsistencies.

Note that if we designed a p-chart, it would be similar to the u-chart because each of

the observed defective records contained only one error. In these circumstances the rate

of defective records (p) and the average number of errors per inspection unit (u) are

equal. Care should be taken when a u-chart is replaced by a CCC-chart with respect

to the difference between an error and a defective record when the sample unit has

been defined at patient level. As such, when constructing the CCC-chart, we started

counting the number of correct records between two defective records instead of errors.

3.6 Discussion

The question of whether to use acceptance sampling or the tools of statistical process

control depends upon a number of issues. Data collection at the local level tends

to be continual, via a mixture of paper and electronic forms. SPC tools are well

suited to monitoring data quality at the local level. In some cases, datasets are then

periodically released to central authorities or other external users, and ASPs can be

used to determine whether a dataset intended for release is of sufficiently high quality.

When there is good communication between the data users and providers, application

of SPC tools to these datasets can also be highly beneficial. Control charting quality

characteristics of datasets helps determine when the error rate has changed in a timely

manner.

Quality improvement is not consistently and automatically achieved in datasets if there

is no system and process-oriented thinking about associated data collection tasks. Im-

proved quality can only be achieved if database managers and the people involved in

data collection are committed to the OCAP cycle. As has been seen, successful

implementation of the cycle at SAMI led to a significant drop in the error rate of

the radiation metrics data collection, where radiographers and data management experts

contributed and interventions were conducted. Although rectifying inspection

will result in an entirely clean dataset, it does nothing to ensure the quality of future

datasets. Similarly, constructing control charts and identifying changes in the error

rate is a largely fruitless and often detrimental activity if an OCAP is not enacted.

Ideally, management should be involved in the different levels of the system, including

data production, transfer and audit. Moreover, it is critical that management ensures

that corrective and preventive actions are taken in light of SPC analysis. Their willing-

ness to be involved in the entire quality improvement cycle is a measure of the system’s

maturity. Taking into account the maturity of the system and its quality history leads

to the conceptual guideline shown in Figure 3.6.

It should be noted that this diagram is a static guideline for tool selection. The effective

application of SPC tools by data providers should result in the users being able to use


economical ASPs designed for high quality output. In some cases, data users may even

terminate inspection if they believe the provider is trustworthy. Generally speaking,

all elements in Figure 3.6 should be considered parts of a transition process within

a life cycle of quality improvement programs. This transition can be seen at SAMI,

where the u-chart was replaced by a CCC-chart once the error rate

approached one percent. This cycle may begin with implementation of

acceptance sampling plans on incoming data to the process and released data to internal

or external users. This is followed by the application of SPC tools to improve the

process, which then motivates other process owners to replace 100% audit of datasets

with acceptance sampling plans. The process is continued in order to provide high

quality datasets for downstream processes and users until the owner changes plans to

high quality methods or possibly terminates them. During this maturity evolution,

other activities are involved to apply SPC tools and communicate effectively to their

data providers and users so that the quality of data is improved and ensured from the

beginning.

3.7 Conclusion

The reliability of any statistical results in the clinical context is affected by the quality of

data being used. In this study we consider application of the well-established statistical

tools in industrial quality control programs to improve the quality of clinical data in

the medical registries. Technical and procedural aspects of acceptance sampling plans

and statistical process control tools including control charts and root causes analysis

Figure 3.6 A guideline for statistical quality control tools selection in clinical data management.

were discussed and translated into a quality control program for clinical data. Sampling

plans were implemented on a dataset containing APACHE II scores from a local hospital.

It was shown how sampling plans enable us to evaluate the quality of data and

reduce inspection and modification efforts. The transition between plans was also

demonstrated.

In a process-oriented approach, SPC tools were considered to improve the quality of

radiation metrics data through monitoring error rates. The potential causes of the

high error rate observed in the u-chart, a control chart for the average number of errors, were

investigated using the root causes analysis procedure. The error rate dropped

significantly once the required interventions and corrective actions were implemented in the

system. Because the error rate had then fallen to around one percent, we switched to a

CCC-chart, which monitors the number of correct records between errors.

A guideline was provided for quality control tool selection in the context of clinical

data management. We considered the quality of data and the engagement of all parts

of the data generation and collection process as essential criteria in tool selection. However,

this guideline should be applied dynamically: the transition between tools and the attainment of a

higher level of maturity and quality in the system should be pursued.

The achievements obtained through the application of statistical quality control tech-

niques in the above case studies promote consideration of them in current data quality

improvement efforts within clinical contexts. Current data quality improvement pro-

grams benefit from ASP and SPC in two aspects. These methods can be embedded

in the quality assurance approaches and procedures and may take both operational

and technical roles. In this setting, the program is well-equipped with a wide range of

statistical tools promising cost-effectiveness of the efforts. Alternatively, ASP and SPC

can be integrated with well-established error detection methods such as delta

checks and variable identification techniques such as failure mode and effects analy-

sis and risk assessment. In this framework error detection and variable identification

methods are implemented interactively and the time-based characteristic of ASP and

SPC enables us to analyze the effectiveness of such techniques on the resultant datasets

longitudinally. Moreover, the dynamics of OCAP and the transition between ASP and


SPC tools support data quality improvement programs in evaluation of the current

state of achievable goals in data quality and actions which need to be taken in order to

ensure higher data quality.

The emergence of information technology-based platforms for data collection within and

across clinical centers during patient care and clinical trials such as Electronic Data

Capture (EDC) and Electronic Medical Record (EMR) systems dramatically enhances

the data quality since less transcription from paper sources to databases is required

(Choi et al., 2011). However, the chance of recording inaccurate and incomplete data

is not fully eliminated. Failure in structural design of the database, data capturing

instruments, transcription from the source and so on are among potential causes of

such errors. Therefore quality control and routine audits are recommended for the

resultant databases at the early stage of data collection in batch forms (Helms, 2001).

In this regard, statistical quality control methods can still serve this end.

Bibliography

Alidousti, S., Assareh, H., and Kazempour, Z. (2005). Quality control of indexing

process. Faslnameye Ketab, 63:63–73.








353:1205–1206.





Choi, J., Horn, D., Kist, M., and D'Agostino Jr, R. (2011). Evaluation of data entry

errors and data changes to an electronic data capture clinical trial database. Drug

Information Journal, 45:421–430.



trol, 14(3):5–9.


11(4):10–13.


Sampling. Wiley.







Washington.


13(1):18–22.




Association.



Helms, R. (2001). Data quality issues in electronic data capture. Drug Information

Journal, 35(3):827–837.






24(2):63–69.



Knaus, W., Draper, E., Wagner, D., and Zimmerman, J. (1985). APACHE II: a severity

of disease classification system. Critical Care Medicine, 13(10):818–829.



Moreno, R. and Matos, R. (2001). New issues in severity scoring: interfacing the ICU

and evaluating it. Current Opinion in Critical Care, 7(6):469–474.



Nosanchuk, J. and Gottmann, A. (1974). CUMS and delta checks. A systematic approach

to quality control. American Journal of Clinical Pathology, 62(5):707–712.




130.



Trials, 6(2):141–150.

Sakr, Y., Krauss, C., Amaral, A., Rea-Neto, A., Specht, M., Reinhart, K., and Marx,

G. (2008). Comparison of the performance of SAPS II, SAPS 3, APACHE II, and

their customized prognostic models in a surgical intensive care unit. British Journal

of Anaesthesia, 101(6):798–803.


man & Hall/CRC.

Shahian, D., Blackstone, E., Edwards, F., Grover, F., Grunkemeier, G., Naftel, D.,

Nashef, S., Nugent, W., and Peterson, E. (2004). Cardiac surgery risk models: a

position article. The Annals of Thoracic Surgery, 78(5):1868–1877.























Journal, 38(4):371–387.

CHAPTER 4

An Economical Sample Size Determination

Algorithm for Clinical Data Statistical Analysis

Preamble

Implementation of quality control charts for monitoring hospital outcomes often involves

construction and calibration of risk models that express the in-control state of

the clinical processes. In monitoring intensive care unit (ICU) outcomes, historical admission

data are used to predict the pre-admission risk of death for each patient admitted

to the ICU when the care procedures in the ICU are stable and in control. The

accuracy of predictions and the overall cost of the risk model construction are affected

by the quality and amount of data used. There has been considerable research into

data quality evaluation and improvement considering short term and long term costs.

Also several sample size determination methods have been proposed for statistical anal-

ysis and estimation. Yet no research has been found to tackle a combined approach

in which determination of sample size for construction of complex statistical models

simultaneously satisfies statistical (accuracy and precision) and economical (cost of data

inspection and error modification) requirements. This chapter presents

a general data capturing algorithm which addresses this issue. It uses Value of Infor-

mation theory from a Bayesian decision making context and the concept of Utility. We

propose a customized version of the algorithm to determine an appropriate sample size

for risk model construction using logistic regression and then apply it for calibration

of the Acute Physiology and Chronic Health Evaluation II (APACHE II), for various

utility scenarios. We also outline extensions which could be made to the framework

and techniques.

The focus of this chapter is the first objective of the thesis, mainly goal 3, in which

optimal sample size for construction of risk models applied in control charting program

is sought. This chapter contributes to development of new methods using capabilities

of the Bayesian approach in a clinical context. In this regard, value of information theory

is combined with the utility concept from operations research to build a

recursive algorithm.



statistical analysis, writing manuscripts and addressing the reviewer’s comments.



certified that:



field of expertise;






unit; and





Assareh, H., Waterhouse, M. A., Brighouse, R. D., Foster, K. A., Smith, I. R. and

Mengersen, K. An economical sample size determination algorithm for clinical data

statistical analysis, IIE Transactions on Healthcare Systems Engineering, submitted.


H. Assareh Conception and conduct research, implement statistical analysis, write code, write manuscript, make modifications to manuscript as suggested by co-authors and reviewers

Signature & Date:

M.A. Waterhouse Conception, comments on manuscript, editing

R. D. Brighouse Data collection

K. A. Foster Data collection

I. Smith Data collection

K. Mengersen Supervise research, conception, comments on manuscript, editing





4.1 Abstract

For most data analysis problems, sample size formulae are constructed by focusing

on statistical characteristics rather than economical constraints. When performing a

complicated statistical analysis involving clinical data, such as risk model construction,

choosing a sample size which simultaneously satisfies statistical (accuracy and precision)

and economical (cost of data inspection and error modification) requirements is nontrivial.

This research presents a general data capturing algorithm which addresses this

issue. It uses Value of Information theory from a Bayesian decision making context and

the concept of Utility. We propose a customized version of the algorithm to determine

an appropriate sample size for risk model construction using logistic regression and

then apply it for calibration of the Acute Physiology and Chronic Health Evaluation II

(APACHE II), for various utility scenarios. We also outline extensions which could be

made to the framework and techniques.

4.2 Introduction

Many clinical and medical studies use historical data. For example, it is common to use

past data to estimate parameters and quantities when constructing a baseline against

which to compare current and future data. Similarly, the construction of risk models

frequently involves the use of historical data to build a prediction equation for the status

of a process outcome based on a number of predictors. The accuracy of estimates and

the overall cost of the study are affected by the quality and amount of data used.

The success of any clinical and medical research depends on the quality of the recorded

data. Inconsistencies such as missing values and errors can lead to biased analyses and

inferences. There has been considerable research into evaluating and improving dataset

quality. For example, investigating a sample dataset and running corrective actions,

such as re-entering, filling and modifying, depending on the types of errors has been

proposed (Beretta et al., 2007).

Some researchers have focused on the process of data entering and cleaning. Standard-

ized procedures, forms and recording have been recommended (Harvey et al., 2007).

Arts et al. (2002) developed a framework for quality assurance in medical registries by

considering a wide range of literature. Based on their model, procedures have been

developed to prevent poor data quality, to detect imperfect data and its causes, and

to apply relevant corrective actions in local and central registries. This framework

has been applied in the construction of intensive care unit databases in Australia and

New Zealand (Stow et al., 2006). It provides an effective approach to obtaining such a

database, but it does not include any systematic way to evaluate data quality.

Other research has focused on using a statistical quality control framework to audit

datasets, suggesting the use of sampling plans and their acceptance criteria (Brunelle

and Kleyle, 2002; Sullivan et al., 1997). Shen and Zhou (2006) asserted that quality

auditing of a database should be conducted with emphasis on efficient and statistically

accepted methods. They categorized variables as either critical or noncritical and

developed sampling plans based on the binomial distribution with differing acceptable

ratios of errors for the two categories. Their paper includes a discussion of sample sizes

and the estimated number of errors.

Determining an appropriate sample size depends on the objectives of the study and

the desired accuracy and precision of the results. Chow et al. (2007) and Spiegelhalter

et al. (2004) introduced various methods for the design of clinical research studies.

Some methods seek to derive an optimal sample size with respect to precision while

incorporating the cost of sampling. To date, these methods have only been developed

for simple analyses such as mean and variance estimation and linear regression. Popular

sampling methods include simple, stratified and clustered sampling (Kish, 1995; Sarndal

et al., 2003). Yet another approach is to use acceptance sampling plans which seek to

accept or reject a batch of data based upon the number of defects in a sample taken

from the batch. In economic extensions long term costs and quality of accepted batches

can also be taken into account (Montgomery, 2008).

The effects of data quality and sample size on the accuracy and cost of a study have

only been studied in combination for very simple cases. It would, therefore, appear

that there is a need for an integrated model for more complicated statistical analyses.

This research develops a general data capturing algorithm to determine an economical


sample size which takes into account data utility, quality and modification costs. In

Section 4.3, we first introduce the fundamental concepts on which the algorithm is

constructed. The general algorithm components are developed in Section 4.4. We

then customize the algorithm for risk model construction in Section 4.5. We apply the

proposed algorithm for calibration of an existing risk model (APACHE II) in a local

hospital. The performance of the algorithm and its components are investigated over

different scenarios and obtained results are discussed in Section 4.6. In Section 4.7

some developments and extensions of the general and the customized algorithms are

proposed. The study is then summarized in Section 4.8.

4.3 Theoretical Framework

We phrase the question of interest as follows: “Do we have enough data to meet our

research goals, or is more required?” Our decision depends upon the potential costs and

benefits of accumulating more data. Benefits include increased precision in estimates,

while costs include the time required to evaluate the quality of the new data and make

any corrections.

The term “Utility” expresses benefit. It is a measure of the relative satisfaction from,

or desirability of, consumption of one or more goods and services (here data). Utility

has been considered in economics, operations research and decision theory.

The principle of diminishing marginal utility says that after some point, as consump-

tion of a good increases, the marginal utility of that good will begin to fall (Besanko

and Braeutigam, 2002). This principle is applicable here. For example, although the

variability of the sample mean decreases with an increase in sample size, after a certain

point this decrease is sufficiently marginal that it does not justify the cost of extra

sampling.
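
The sample-mean example above can be made concrete with a short calculation. The sketch below is illustrative only: the standard deviation, block size and function names are hypothetical, and the "gain" is simply the reduction in the standard error of the mean contributed by each additional block.

```python
import math

def standard_error(sigma, n):
    """Standard error of a sample mean based on n observations."""
    return sigma / math.sqrt(n)

def marginal_gains(sigma, block_size, n_blocks):
    """Reduction in standard error contributed by blocks 2, 3, ..., n_blocks."""
    gains = []
    for j in range(2, n_blocks + 1):
        before = standard_error(sigma, block_size * (j - 1))
        after = standard_error(sigma, block_size * j)
        gains.append(before - after)
    return gains

# Hypothetical values: sigma = 1, blocks of 100 records, up to 6 blocks.
gains = marginal_gains(sigma=1.0, block_size=100, n_blocks=6)
for j, g in enumerate(gains, start=2):
    print(f"block {j}: standard error falls by {g:.4f}")
```

Each block costs roughly the same to inspect and correct, yet its contribution to precision shrinks, which is the situation in which further sampling stops being worthwhile.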

Value of Information (VOI) theory enables a decision maker (DM) to evaluate the utility

of more information with respect to its effects on alternate decisions (Spiegelhalter

et al., 2003; Winkler, 2003). In VOI theory the DM tries to predict whether the

current optimum choice (with maximum benefit/minimum cost) will change if they buy

Figure 4.1 Conceptual diagram of the data capturing algorithm.

more information about the status of all variables and, if so, whether the purchase is

worthwhile considering obtainable benefits of change in the optimal choice and the cost

of information. This theory has been applied widely in the medical context (Claxton

et al., 2001, 2004).

4.4 General Data Capturing Algorithm

VOI and Utility theories provide a framework for the development of a general data cap-

turing algorithm which simultaneously considers utility, quality and cost. The general

principle of this algorithm is shown in Figure 4.1. Blocks of data, each comprising a

set of records, are accepted until the total cost of obtaining and modifying data exceeds

the increase in utility.

We preface this section by giving a set of definitions followed by an explanation of the

algorithm components.

Data block: Available data are considered as sets of records. The records are parti-

tioned into blocks which, for simplicity, are of equal size, N say. Decisions are made

with respect to blocks, not individual records. Blocks are built in time order. The

anticipated effect of this is that records in blocks are more homogeneous.

Cost: All accepted blocks are subjected to 100% inspection and all errors are corrected.

This leads to costs which depend upon the type of errors and corrections. Sampling,

inspection and modification costs are all taken into account.


Time continuity: Given that most clinical research uses data which has been recorded

over a continuous period of time, time continuity is maintained for accepted blocks.

Therefore, evaluation and acceptance of blocks run in one time direction, backward

(newest to oldest blocks) or forward (oldest to newest). This means that we do not

allow a block to be skipped. In Section 4.7 an alternative is discussed.

Utility function: This function is based on the contribution of each block (if accepted)

to the research goal such as precision of an estimate. It is expressed in terms of money.

It can be built and updated during iterations of the algorithm by taking into account

the DM’s perspective and measurable benefits. This measure can be defined either as a

fixed quantity which might be updated at each iteration, or as a random variable

whose distribution parameters are updated at each iteration.

Net benefit: This is the monetary gain of an accepted block when its utility and

associated costs are taken into account. The term “Benefit” is used for any monetary

gain.

Label the blocks of data B1, B2, . . . , BF . The jth iteration of the algorithm involves

deciding whether to accept or reject Bj . Suppose that we have completed j−1 iterations

and that we have accepted B1, B2, . . . , Bj−1. In this section we outline the steps taken

in the jth iteration, as shown in Figure 4.2.

The algorithm is divided into three main phases, namely

• Phase 1. Prediction for Bj ;

• Phase 2. Estimation for Bj ; and

• Phase 3. Prediction for Bj+1.

In Phase 1 we predict error rates and costs for Bj given what we have observed for

previous blocks. If we predict that the benefit of accepting Bj outweighs the cost then

we proceed to Phase 2, otherwise we terminate the algorithm.

In Phase 2 we estimate error rates and costs for Bj based upon a sample from this

block. At this stage, we either accept Bj or proceed to Phase 3.

In Phase 3 we make predictions about the next block of data, Bj+1. Phase 3 is only


Figure 4.2 Algorithm components for the jth iteration.

undertaken when we estimate a negative net benefit associated with Bj . It seeks to

determine whether this negative net benefit is anomalous. If the negative net benefit is

thought to be a local effect, then we accept Bj and begin a new iteration of the algo-

rithm. Otherwise we terminate the algorithm and only accept blocks B1, B2, . . . , Bj−1.
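
The control flow of the three phases can be sketched as a simple loop. This is a schematic under stated assumptions rather than the thesis implementation: the three callables stand in for the prediction and estimation steps detailed in the following subsections, and all names (`capture_blocks`, `predicted_benefit_ok`, `estimated_net_benefit`, `loss_is_local`) are hypothetical.

```python
from typing import Callable, List

def capture_blocks(
    n_blocks: int,
    predicted_benefit_ok: Callable[[int], bool],    # Phase 1: prediction for block j
    estimated_net_benefit: Callable[[int], float],  # Phase 2: estimation for block j
    loss_is_local: Callable[[int], bool],           # Phase 3: looks ahead to block j + 1
) -> List[int]:
    """Return the indices of accepted blocks, preserving time continuity."""
    accepted: List[int] = []
    for j in range(1, n_blocks + 1):
        # Phase 1: stop if the predicted benefit does not outweigh the cost.
        if not predicted_benefit_ok(j):
            break
        # Phase 2: accept when the estimated net benefit is non-negative.
        if estimated_net_benefit(j) >= 0:
            accepted.append(j)
            continue
        # Phase 3: only reached after a negative estimate for block j.
        if loss_is_local(j):
            accepted.append(j)
            continue
        break  # no block may be skipped, so the algorithm terminates here
    return accepted

# Toy inputs (entirely hypothetical) to exercise the control flow.
net = {1: 2.0, 2: 1.0, 3: -0.5, 4: 0.3, 5: -1.0}
accepted = capture_blocks(
    n_blocks=6,
    predicted_benefit_ok=lambda j: j <= 5,
    estimated_net_benefit=lambda j: net[j],
    loss_is_local=lambda j: j == 3,
)
print(accepted)
```

In the toy run, block 3 has a negative estimated net benefit but is kept because the loss is judged local, whereas block 5 ends the run.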

With respect to notation, we use $\hat{z}$ and $\tilde{z}$ to denote the estimated and predicted values

of a variable $z$, respectively.

4.4.1 Phase 1: Prediction for Bj

Based upon data in B1, B2, . . . , Bj−1, the current iteration begins by making predictions

for Bj . In this phase, called “Preposterior Analysis” (Winkler, 2003), we predict: the

quality of data in Bj, the costs associated with cleaning the data in Bj, and the contribution

of Bj to utility should it be accepted.

Based on these predictions, we either


a. Terminate the algorithm and accept only B1, B2, . . . , Bj−1; or

b. Subject Bj to inspection.

We now outline the prediction steps in more detail.

i. Predict Data Quality

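Suppose that there are $l$ error types, each with an associated cost of correction. For each observation in $B_j$, let $\theta_{j0}$ be the probability that it is correct, and let $\theta_{jk}$ be the probability that it is an error of type $k$, for $k = 1, 2, \ldots, l$. Note that $\theta_{j0}, \ldots, \theta_{jl} \ge 0$ and $\sum_{k=0}^{l} \theta_{jk} = 1$. Similarly, let $x_{j0}$ denote the number of correct observations, and let $x_{jk}$ denote the number of errors of type $k$ ($k = 1, 2, \ldots, l$) in $B_j$. Let $\boldsymbol{\theta}_j = (\theta_{j0}, \theta_{j1}, \ldots, \theta_{jl})$ and $\mathbf{x}_j = (x_{j0}, x_{j1}, \ldots, x_{jl})$ and allow that $\mathbf{x}_j \mid \boldsymbol{\theta}_j \sim \text{Multinomial}(N; \boldsymbol{\theta}_j)$. Hence

$$p(\mathbf{x}_j \mid \boldsymbol{\theta}_j) = \frac{N!}{x_{j0}!\, x_{j1}! \cdots x_{jl}!}\; \theta_{j0}^{x_{j0}}\, \theta_{j1}^{x_{j1}} \cdots \theta_{jl}^{x_{jl}}. \tag{4.1}$$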

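In this step the number of errors in $B_j$ is predicted. This is achieved using the posterior predictive distribution from a Bayesian formulation:

$$p(\mathbf{x}_j \mid \mathbf{x}_{j-1}) = \int p(\mathbf{x}_j \mid \boldsymbol{\theta})\, p(\boldsymbol{\theta} \mid \mathbf{x}_{j-1})\, d\boldsymbol{\theta}, \tag{4.2}$$

where the posterior follows from Bayes' rule:

$$p(\boldsymbol{\theta} \mid \mathbf{x}_{j-1}) = \frac{p(\mathbf{x}_{j-1} \mid \boldsymbol{\theta})\, \pi(\boldsymbol{\theta})}{\int p(\mathbf{x}_{j-1} \mid \boldsymbol{\theta})\, \pi(\boldsymbol{\theta})\, d\boldsymbol{\theta}}. \tag{4.3}$$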

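A Dirichlet distribution with parameters $\alpha_0, \alpha_1, \ldots, \alpha_l$ is used as a prior for $\boldsymbol{\theta}$:

$$\pi(\boldsymbol{\theta}) = \frac{\Gamma(\alpha_0 + \cdots + \alpha_l)}{\Gamma(\alpha_0) \cdots \Gamma(\alpha_l)}\; \theta_0^{\alpha_0 - 1}\, \theta_1^{\alpha_1 - 1} \cdots \theta_l^{\alpha_l - 1}. \tag{4.4}$$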

This distribution is a conjugate prior for the multinomial distribution (Gelman et al.,

2004).
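
Because of this conjugacy, updating the prior after a block has been inspected only requires adding the observed counts to the Dirichlet parameters. The sketch below illustrates that update and the resulting expected counts for the next block; the counts, block size and function names are hypothetical.

```python
def dirichlet_update(alpha, counts):
    """Posterior Dirichlet parameters after observing multinomial counts."""
    return [a + x for a, x in zip(alpha, counts)]

def dirichlet_mean(alpha):
    """Mean of a Dirichlet distribution: alpha_k / sum(alpha)."""
    total = sum(alpha)
    return [a / total for a in alpha]

def expected_counts(alpha, block_size):
    """Expected category counts in a new block of the given size."""
    return [block_size * m for m in dirichlet_mean(alpha)]

# Hypothetical example: uninformative prior, two error types (l = 2),
# and a previous block with 90 correct records, 7 type-1 and 3 type-2 errors.
prior = [1, 1, 1]
posterior = dirichlet_update(prior, [90, 7, 3])
predicted = expected_counts(posterior, block_size=100)
print(posterior, [round(c, 2) for c in predicted])
```

The first element of each list refers to correct records (index 0), matching the indexing of $\theta_{j0}$ and $x_{j0}$ in the text.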

If $j > 1$, then the posterior distribution obtained for the previous iteration, $p(\boldsymbol{\theta} \mid \mathbf{x}_{j-1})$, is used as the prior for $B_j$. If $j = 1$, then either the DM sets the parameters of Equation (4.4), or an uninformative prior can be used by setting $\alpha_k = 1$, for $k = 0, 1, \ldots, l$. If $l = 1$, then there is only one type of error, $x_j \mid \theta \sim \text{Binomial}(N; \theta)$, and a Beta prior is used. In this case, Tuyl et al. (2009) show that the predictive posterior will be in the form of a hypergeometric distribution:

$$p(x_j \mid N, x_{j-1}, n) = \frac{n + \alpha + \beta - 1}{N + n + \alpha + \beta - 1} \times \frac{\dbinom{x_j + x_{j-1} + \alpha - 1}{x_j} \dbinom{N + n - x_j - x_{j-1} + \beta - 1}{N - x_j}}{\dbinom{N + n + \alpha + \beta - 2}{N}}, \tag{4.5}$$

where $N$ is the size of the unobserved block for which we want to determine $x_j$, and $n$ is the size of the observed sample/block in which $x_{j-1}$ errors were found. In Equation (4.5), $\alpha$ and $\beta$ are the parameters of the prior Beta distribution.
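
Equation (4.5) can be evaluated directly with binomial coefficients. The sketch below does so with exact rational arithmetic, assuming integer values of α and β (so that the binomial coefficients are ordinary ones); the example values of N, n and the previous error count are hypothetical, as is the function name.

```python
from fractions import Fraction
from math import comb

def predictive_pmf(x_new, N, x_prev, n, alpha=1, beta=1):
    """Equation (4.5) for integer alpha and beta, returned as an exact Fraction."""
    s = n + alpha + beta
    lead = Fraction(s - 1, N + s - 1)
    numerator = (comb(x_new + x_prev + alpha - 1, x_new)
                 * comb(N + n - x_new - x_prev + beta - 1, N - x_new))
    denominator = comb(N + s - 2, N)
    return lead * Fraction(numerator, denominator)

# Hypothetical example: 16 errors seen in n = 50 records, next block of N = 20.
N, n, x_prev = 20, 50, 16
pmf = [predictive_pmf(x, N, x_prev, n) for x in range(N + 1)]

total = sum(pmf)                                # should equal 1
mean = sum(x * p for x, p in enumerate(pmf))    # expected number of errors
print(total, float(mean))
```

The probabilities sum to one, and the mean equals N(x_prev + α)/(n + α + β), the familiar beta-binomial result, which offers a quick check on the reconstruction of the formula.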

ii. Predict Total Net Benefit (TNB)

Let Ui and CT,i denote the utility and total cost incurred by accepting B1, B2, . . . , Bi,

respectively. To investigate the total net benefit of accepting Bj , denoted TNBj , it

is necessary to compare predicted values for Uj and CT,j . Construction of a utility

function and calculation of Uj are discussed in Section 4.5.

Let CS and CI denote the costs of sampling and inspection, respectively, and let CMk

denote the cost of modifying an error of type k (k = 1, 2, . . . , l). The overall cost

associated with Bj is denoted by Cj and is predicted using

$$C_j = C_S + N \times C_I + \sum_{k=1}^{l} x_{jk}\, C_{Mk}. \tag{4.6}$$

The estimated total cost of accepting B1, B2, . . . , Bj−1 is

$$C_{T,j-1} = \sum_{i=1}^{j-1} \left[ C_S + N \times C_I + N \sum_{k=1}^{l} \theta_{ik}\, C_{Mk} \right]. \tag{4.7}$$

If Bj is accepted, then the predicted total cost is CT,j = CT,j−1+ Cj , and the expected

predicted total net benefit is E(TNBj) = E(Uj)− E(CT,j).
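As an illustration of Equation (4.6) and the TNB calculation, the sketch below plugs in figures from the case study in Section 4.6 (a sampling cost of $25, an inspection cost of $1 per record, a modification cost of $5 per error, blocks of 400 records, and a predicted 45.353 errors in the first block). The function names are hypothetical, and the $6000 utility is the budget-line scenario used there.

```python
def block_cost(c_s, c_i, n_records, errors_by_type, c_m_by_type):
    """Predicted cost of one block, Equation (4.6):
    sampling + inspection of every record + modification of each error."""
    modification = sum(x * c for x, c in zip(errors_by_type, c_m_by_type))
    return c_s + n_records * c_i + modification


def total_net_benefit(utility, total_cost):
    """TNB = utility minus accumulated cost."""
    return utility - total_cost


# Preliminary block B0: 644 records fully inspected, 67 errors corrected.
c_t0 = 644 * 1 + 67 * 5

# First block B1, with a single error type (l = 1).
c_1 = block_cost(c_s=25, c_i=1, n_records=400,
                 errors_by_type=[45.353], c_m_by_type=[5])
c_t1 = c_t0 + c_1

# Budget-line scenario: the utility is a fixed budget of $6000.
tnb_1 = total_net_benefit(6000, c_t1)
```

With these inputs the block cost is about $651.8 and the accumulated cost about $1630.8, in line with the first row of Table 4.4.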

If E(TNBj) ≥ 0, then it might be worthwhile accepting Bj and we proceed to the


estimation phase of the algorithm; see Section 4.4.2. Otherwise the predicted cost

exceeds the utility function (Figure 4.1). However, immediate termination of the algorithm may not be beneficial: the predictions are uncertain, they are based only on the data in B1, B2, . . . , Bj−1, and the quality of Bj may be better than that of previous blocks. In addition, the marginal utility may exceed the marginal cost at block j. In this case, the local net benefit (LNB) associated with Bj can be predicted.

Using TNB as the sole criterion for accepting a block for further investigation may lead to the capture of a data block with high cost and low benefit. To avoid this, it is worthwhile to compare marginal benefits with marginal costs at each block, regardless of the total net benefit obtained. Another alternative is discussed in Section 4.7.

iii. Predict Local Net Benefit (LNB)

The local net benefit associated with accepting Bj , denoted LNBj , is defined to be the

area between the utility function and the line joining CT,j−1 and CT,j . It is predicted

using

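$$LNB_j = \int_{z_{j-1}}^{z_j} \left\{ \left[ \frac{E(U_j) - U_{j-1}}{z_j - z_{j-1}} (z - z_{j-1}) + U_{j-1} \right] - \left[ \frac{E(C_{T,j}) - C_{T,j-1}}{z_j - z_{j-1}} (z - z_{j-1}) + C_{T,j-1} \right] \right\} dz, \tag{4.8}$$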

where zj = N × j and zj−1 = N × (j − 1). If LNBj ≥ 0, then we proceed to the

estimation phase of the algorithm. Otherwise, the algorithm stops and we only accept

B1, B2, . . . , Bj−1.
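Because both bracketed terms in Equation (4.8) are linear in z, the integral reduces to the area of a trapezoid: half the block size times the sum of the net benefits at the two end points. The sketch below computes this closed form and checks it against a midpoint-rule integration; the function names are hypothetical, and the example uses the LU utilities and predicted total costs for B1 and B2 from Table 4.4 purely as an illustration.

```python
def local_net_benefit(u_prev, u_pred, c_prev, c_pred, block_size):
    """Closed form of Equation (4.8) for linear utility and cost segments."""
    return 0.5 * block_size * ((u_prev - c_prev) + (u_pred - c_pred))


def lnb_numeric(u_prev, u_pred, c_prev, c_pred, z_prev, z_curr, steps=1000):
    """Midpoint-rule evaluation of the integral in Equation (4.8)."""
    width = (z_curr - z_prev) / steps
    total = 0.0
    for i in range(steps):
        z = z_prev + (i + 0.5) * width
        frac = (z - z_prev) / (z_curr - z_prev)
        utility = (u_pred - u_prev) * frac + u_prev   # line from U_{j-1} to E(U_j)
        cost = (c_pred - c_prev) * frac + c_prev      # line from C_{T,j-1} to E(C_{T,j})
        total += (utility - cost) * width
    return total


# Illustration with j = 2, N = 400 (LU column and E(C_T) column of Table 4.4).
u1, u2 = 1758.0, 2341.0
c1, c2 = 1630.7, 2357.5
lnb_2 = local_net_benefit(u1, u2, c1, c2, block_size=400)
lnb_2_check = lnb_numeric(u1, u2, c1, c2, z_prev=400, z_curr=800)
tnb_2 = u2 - c2
```

In this illustration the predicted TNB for the second block is negative while the LNB is positive, which is exactly the situation in which the algorithm continues to the estimation phase instead of stopping.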

4.4.2 Phase 2: Estimation for Bj

The steps of Phase 2 are very similar to those of Phase 1. The difference is that we

now estimate the quality of Bj based upon a sample taken from that block, as opposed

to making predictions based upon the quality of previous blocks. In Phase 2 we either

a. Accept Bj and move to the next iteration of the algorithm; or


b. Proceed to Phase 3 in which we make predictions regarding Bj+1, before deciding

whether or not to accept Bj .

We now outline the estimation steps.

i. Take a Sample

Let θ denote the proportion of all errors, $\theta = \sum_{k=1}^{l} \theta_k$, in Bj , based on either the

mean of the prior distribution or previous samples, and let e denote the desired level

of precision in estimates. We take a random sample of size

$$n = \frac{n_0}{1 + \dfrac{n_0 - 1}{N}} \tag{4.9}$$

from Bj , where

$$n_0 = \frac{Z^2\, \theta (1 - \theta)}{e^2}, \tag{4.10}$$

and Z corresponds to the standard normal distribution quantile for a specified signifi-

cance level (Cochran, 2007). The sample size can be updated at each iteration; however

for simplicity, we assume here a fixed sample size.
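Equations (4.9) and (4.10) can be checked with a few lines of code. The sketch below uses the inputs quoted later in the case study (67 errors in 644 records, Z = 1.96, e = 0.08, blocks of 400 records); the function name is hypothetical.

```python
def sample_size(theta, e, z, block_size):
    """Sample size with finite population correction, Equations (4.9)-(4.10)."""
    n0 = z ** 2 * theta * (1 - theta) / e ** 2      # Equation (4.10)
    return n0 / (1 + (n0 - 1) / block_size)         # Equation (4.9)


# Case-study inputs from Section 4.6.
n = sample_size(theta=67 / 644, e=0.08, z=1.96, block_size=400)
```

With these inputs n comes out slightly above 49, which agrees with the "around 49 records" reported in Section 4.6.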

ii. Estimate Data Quality

The sample is inspected and we count the number of each type of error. Using the

observed proportion of errors in the sample, we estimate the overall proportion of errors

in Bj by the expected value of the posterior distribution, t(θ | x) =∫p(x | θ)π(θ)dθ.

iii. Estimate TNB

The TNB is estimated directly using TNBj = Uj − CT,j ; note that Uj is not affected

by the sampling process if it has been defined as a deterministic, not random variable.

iv. Estimate LNB

The LNB is estimated using Equation (4.8), where E(Uj) and E(CT,j) are replaced by the estimates Ûj and ĈT,j , respectively. If LNBj ≥ 0, then we accept Bj and return to the start of the

algorithm. Otherwise, we proceed to Phase 3.


4.4.3 Phase 3: Prediction for Bj+1

Phase 3 aims to determine whether the negative net benefits associated with Bj are

a localized effect due to a locally high proportion of errors in that block. We repeat

the steps outlined in Phase 1 for prediction of TNB, replacing xj by xj+1, and xj−1 by

xj . If the predicted TNB for Bj+1 is also negative, then we terminate the algorithm,

having accepted only B1, B2, . . . , Bj−1. Otherwise, we conclude that there is a potential

reduction in the error rate after Bj . In this case, we accept Bj and use the predictions

from this step in the first phase of the next iteration involving Bj+1.

4.5 Customized Algorithm for Risk Model Construction

Risk prediction models have been developed and widely applied in clinical research

and practice. They can predict the probability of mortality for a patient who under-

goes surgery or stays at an intensive care unit (ICU). These prediction models enable

clinicians and health managers to select proper treatment and procedures, facilitate

clinical resources utilization, monitor care processes and conduct quality improvement

programs (Moreno and Matos, 2001; Sakr et al., 2008; Shahian et al., 2004). To estimate the probability p of an event given risk factors x1, ..., xn, a logistic regression model is commonly used (Hosmer and Lemeshow, 2000):

$$\ln\left(\frac{p}{1 - p}\right) = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n. \tag{4.11}$$

The magnitude of each coefficient describes the size of the contribution of that risk

factor. In this section, we customize the general data capturing algorithm in order

to determine the optimal number of records to use for constructing a risk model by a

logistic regression.

4.5.1 Assumptions and Definitions

All assumptions given in Section 4.4 still hold. We introduce some new definitions

which arise from the context of risk model construction.


Preliminary block and model: The first block of data is automatically accepted,

checked and corrected in order to define error types and determine their modification

costs, as well as provide a more informative prior distribution for errors. This block is

labelled B0, and we let M0 denote the model based upon B0.

Other blocks and models: The remaining data are partitioned into blocks labelled

B1, B2, . . . , BF . Without loss of generality, here we assume equally sized blocks. For

i = 1, 2, . . . , F , we let $\mathbf{B}_i$ (bold) denote the collection of blocks B0, B1, ..., Bi, and we let Mi denote the model based upon $\mathbf{B}_i$.

Utility function: The performance of the constructed risk model is used to construct

the utility function. There are many criteria which can be used to assess this performance. However, the criteria should be chosen with caution, as some goodness-of-fit statistics behave misleadingly when the sample size increases; see Kramer and Zimmerman (2007), Marcin and Romano (2007) and Nemes et al. (2009) for more details. We define and use two raw criteria which are sensitive to dataset size, and which we broadly name accuracy and precision.

• Accuracy: We define the accuracy in terms of two measures: a measure for goodness-of-fit of the model and a measure of accuracy of the constructed model over external data. Let Dj be the value of Somers' statistic for goodness-of-fit (Somers, 1962). Somers' statistic is a rank correlation measure which is used as a performance indicator of a predictor of a binary variable. Let $\bar{p}_j$ and $p_{ob}$ be the overall average of predicted probabilities and the observed proportion of occurrences of the event based on Bj and BF , respectively. We measure the accuracy of Mj over external data using $Ea_j = 1 - \dfrac{|\bar{p}_j - p_{ob}|}{p_{ob}}$.

• Precision: Let $\sigma_{i,j}^2$ be the estimated variance of αi based on Bj for i = 0, . . . , n. Among a number of measures for overall precision of Mj , we consider the average precision of αi for i = 0, . . . , n. That is, we define the precision of Mj to be $P_j = \left( \dfrac{1}{\sigma_{0,j}} + \cdots + \dfrac{1}{\sigma_{n,j}} \right) / n$.
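The two raw criteria defined above can be sketched as follows. The function names and the accuracy inputs are hypothetical. One assumption is made explicit: the precision is averaged over the number of coefficients supplied, which for the two-coefficient calibration model of Section 4.6 gives values close to those reported in Table 4.2 (about 3.45 against the reported 3.441 for M0).

```python
def external_accuracy(mean_predicted, observed_proportion):
    """Ea_j = 1 - |mean predicted probability - observed proportion| / observed proportion."""
    return 1 - abs(mean_predicted - observed_proportion) / observed_proportion


def precision(std_errors):
    """Average reciprocal standard error of the coefficients.
    Assumption: the divisor is the number of coefficients supplied."""
    return sum(1.0 / s for s in std_errors) / len(std_errors)


# Hypothetical accuracy inputs: mean predicted risk 0.18, observed rate 0.20.
ea = external_accuracy(0.18, 0.20)

# Standard errors of beta_0 and beta_1 for M0, taken from Table 4.1.
p_m0 = precision([0.444, 0.215])
```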

Figure 4.3 Utility function loop.

We are also interested in relative criteria, which compare the accuracy and precision of Mj to that of Mk, where j + 1 ≤ k ≤ F . The relative accuracy (RD and REa) and relative precision (RP ) of Mj compared to MF , for example, are given by:

$$RD_{j,j+1} = \frac{D_j}{D_{j+1}}, \qquad REa_{j,j+1} = \frac{Ea_j}{Ea_{j+1}}, \qquad RP_{j,j+1} = \frac{P_j}{P_{j+1}}; \tag{4.12}$$

however REa and Ea are identical since EaF = 1.

If desired, we can obtain a single performance index (PI) by taking an average of the

relative criteria. Hence, under the simplest weighted model that gives weights of w1, w2 and w3 to criteria where $\sum_{k=1}^{3} w_k = 1$, for the jth iteration we have

PIj,j+1 = w1RDj,j+1 + w2REaj,j+1 + w3RPj,j+1. (4.13)

As the effect of external accuracy decreases as j tends to F , a variable set of weights over j may also be of interest. For each iteration, the raw and relative criteria

(or PI) are reported and the DM uses this information to determine the utility in a

monetary form; see Section 4.5.2. We denote the utility at the jth iteration by Uj .
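The weighted index in Equation (4.13) is a plain weighted sum. A minimal sketch follows, using the relative criteria and weights reported for M0 in Table 4.2 as inputs; the function name is hypothetical.

```python
def performance_index(rd, rea, rp, weights):
    """Weighted performance index, Equation (4.13). Weights must sum to one."""
    w1, w2, w3 = weights
    if abs(w1 + w2 + w3 - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return w1 * rd + w2 * rea + w3 * rp


# Relative criteria and weights for M0 as reported in Table 4.2.
pi_m0 = performance_index(rd=0.956, rea=0.875, rp=0.397,
                          weights=(0.40, 0.40, 0.20))
```

Rounded to three decimals this gives 0.812, the PI reported for M0.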

4.5.2 Utility Loop

We interpret utility to be the maximum amount of money that the DM would be

willing to pay in order to gain an improvement in the performance of the risk model.

Presenting obtainable gains in performance and the predicted costs associated with the

addition of another block enables the DM to express their idea of utility, which can

then be used in the general data capturing algorithm.


This process is implemented via the insertion of a 3-step loop in the general algorithm;

see Figure 4.3. The first step involves simulation of unobserved block(s) by random

sampling from all observed data from previous iterations. In the second step the risk

model is constructed using all blocks (observed and simulated) and performance criteria

are calculated in raw and relative forms. In the third step, the DM is asked to assign

a value to the utility point based on the performance criteria and the predicted cost.

This loop can be performed in three ways: for each iteration, producing updated utility points block by block; for the first iteration only, producing a set of fixed points in the form of a utility function to be used for all blocks; or as a combination of these approaches, in which the utility function is updated after some iterations. In this section we first consider updating the utility points for each iteration. The other approach and its alternatives will be discussed in the next section.

Utility Points

The utility loop is inserted into all three phases of the general algorithm (Figure 4.4).

In Phase 1, we simulate Bj using B0 and all data observed in previous iterations of

the algorithm. The risk model is constructed using Bj and performance criteria are

calculated in raw and relative forms.

The DM is asked to assign a value to the utility point for iteration j, which is then used

in the next phase of the algorithm. The criteria and assigned utility for Bj from Phase

1 are then updated based on the sample observed in Phase 2. That is, the utility loop

is implemented using B0 and observed samples from iterations 1 to j. The decision

regarding whether to accept Bj may depend on an investigation of Bj+1, as discussed in

Section 4.4.3. In this case, Bj+1 is simulated using available data from B0 and observed

samples from iterations 1 to j, and the risk model is constructed over all data blocks.

The DM determines Uj+1, which can then be used in the next iteration of the algorithm.

Consequently, the utility loop in Phase 1 of the next iteration is not required.

Alternatively, the utility loop in Phase 3 can be integrated into the loop in Phase 2.

In this case, Bj and Bj+1 are simulated in one loop and the risk model is constructed

twice, first using Bj , and then using Bj+1.


Figure 4.4 Customized data capturing algorithm for risk model construction.

The advantage of this approach is that the algorithm uses the most up-to-date utility

values, based on the observed data. On the other hand, it is quite demanding on the

DM. In addition, continuous updating of the utility does not provide a total view of

the performance of the risk model. That is, it does not compare performance of the

current model to that which is based on all blocks. It limits the relative criteria to

the previously accepted block (Bj−1) and, if Phase 3 is implemented, the next block

(Bj+1).

Utility Function

The utility loop can be performed in the first iteration to produce a single fixed utility

function for all further iterations. To do this, in Phase 1 we simulate B1, . . . , BF

based on B0, and we calculate raw and relative performance criteria for each Mi (i =

1, 2, . . . , F ), where relative criteria compare Mi to MF .

The calculated performance estimates and predicted costs for all blocks enable the


DM to express their utility for each added block using a function. Some rational and

easily constructed utility functions are now discussed. The proposed functions combine

different levels of prediction and the DM’s point of view.

i. Budget Line (BL)

When there is a fixed amount of money allocated for the risk model calibration, the

utility function is a horizontal line. Blocks are added until the modification costs equal

the fixed budget. In the algorithm Uj is replaced by the fixed budget b.

Since the marginal cost of adding a block is always non-negative, if the total cost exceeds

the budget (TNB < 0) in Phase 2, then we do not proceed to the marginal analysis

through the local net benefit measure and Phase 3. The algorithm is terminated without

accepting the block.

ii. Linear Utility (LU)

If the DM prefers to control associated costs rather than obtainable performance, a

linear utility function may be appropriate. The DM is asked to adjust (confirm, in-

crease or decrease) CT,0 and CT,F . These adjustments can be in the same or opposite

directions. A line is then drawn between the adjusted CT,0 and the adjusted CT,F .

The adjusted CT,F is considered to be UF , and other utility points (U1, . . . , UF−1) are

obtained by linear interpolation.
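The linear utility can be sketched as a straight line between the two adjusted costs. The example below uses the adjustments and costs from the LU scenario of the case study (+20% on CT,0 = $979 and −20% on the predicted CT,F = $8762.7, with F = 10); the function name and argument names are hypothetical.

```python
def linear_utility(c_t0, c_tf, adjust_0, adjust_f, n_blocks):
    """Utility points U_0, ..., U_F on the line joining the adjusted C_{T,0}
    and the adjusted C_{T,F}. Adjustments are fractions, e.g. 0.20 for +20%."""
    u_0 = c_t0 * (1 + adjust_0)
    u_f = c_tf * (1 + adjust_f)
    step = (u_f - u_0) / n_blocks
    return [u_0 + j * step for j in range(n_blocks + 1)]


# LU scenario from Section 4.6.
lu_points = linear_utility(c_t0=979, c_tf=8762.7,
                           adjust_0=0.20, adjust_f=-0.20, n_blocks=10)
lu_step = lu_points[1] - lu_points[0]
```

Under these inputs the step between consecutive utility points is about 583.5, and the resulting points are close to the LU column of Table 4.4.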

iii. Performance Curve-Based Utility (PU)

The DM may prefer to spend money relative to overall performance. In this case, Uj

depends on the amount of improvement in PI associated with accepting Bj . We let

Uj = Uj−1 + (k × (100(PIj − PIj−1))), (4.14)

where k is the amount of money that the DM is willing to pay to achieve an additional

one percent improvement in the model performance. The cost incurred per one percent of improvement achieved in M1, or an adjustment of it, is a proposed value for k. For the

first iteration, we substitute CT,0 for U0 in Equation (4.14). Performance indices are


calculated for the constructed risk models using observed and simulated blocks.
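The recursion in Equation (4.14) can be sketched as below. The function name is hypothetical. The PI values are the rounded figures from Table 4.2 and k = 305.2 is the PU1 value from Section 4.6, so the output only approximates the utilities reported in the thesis, which were presumably computed from unrounded indices.

```python
def performance_utility(u_0, pi_values, k):
    """Utility points from Equation (4.14):
    U_j = U_{j-1} + k * 100 * (PI_j - PI_{j-1})."""
    utilities = [u_0]
    for j in range(1, len(pi_values)):
        gain_percent = 100 * (pi_values[j] - pi_values[j - 1])
        utilities.append(utilities[-1] + k * gain_percent)
    return utilities


# Rounded PI values for M0..M10 under the Fix approach (Table 4.2).
pi_fix = [0.812, 0.833, 0.861, 0.894, 0.918, 0.929,
          0.936, 0.950, 0.963, 0.981, 1.000]

# U_0 is replaced by C_{T,0} = 979 in the first iteration.
pu1 = performance_utility(u_0=979, pi_values=pi_fix, k=305.2)
```

Because the increments telescope, the final utility depends only on the starting point, k and the total improvement PI_F − PI_0.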

iv. Customized Utility (CU)

In the most interactive case, all performance criteria and predicted costs are reported to

the DM at each stage. The DM is asked to determine the amount of money they would

be willing to spend in order to obtain the reported performance by adding blocks. The

expressed utilities are inputted into the algorithm directly. This method is an extension

of the utility point approach for all blocks.

The aforementioned approaches and methods can be used in a renewable manner.

That is, after accepting k blocks, say, the utility loop in Phase 1 is performed using all

observed data (B0 and samples from B1, . . . , Bk) and the utility function is updated

for the remaining blocks, Bk+1, . . . , BF . In this approach, the performance based and

customized utility functions are updated; however other utilities may also be renewed

by interaction with the DM.

The advantage of the utility function approach is that the algorithm is less dependent

on the DM’s intervention and therefore finds the optimal sample size faster. Moreover,

as the relative criteria are calculated using the performance of MF , the criteria are

easier to interpret and a total view of the contribution of each added block is provided.

However, its disadvantages include the fact that it is not updated and inaccuracies

may be introduced via the simulation process. In general, choice of utility function

influences the number of accepted blocks and the termination point of the algorithm.

4.6 Case Study

Risk models have become an essential part of any quality improvement program in

hospitals. The Acute Physiology and Chronic Health Evaluation II (APACHE II)

(Knaus et al., 1985) is an ICU scoring system which predicts the probability (p) of

mortality of a patient based on a logistic regression given 12 physiological measurements

taken in the first 24 hours after admission to ICU, chronic health status and age. St.

Andrew’s Medical Institute (SAMI), a research centre associated with St Andrew’s


War Memorial Hospital (SAWMH) in Brisbane, Australia, is routinely collecting and

recording the APACHE II data elements in SAWMH’s ICU. SAMI’s research team

decided to construct risk adjusted control charts using the APACHE II model to express

patient mix.

The direct use of a ready-made risk model has been criticized by researchers since it may not accurately predict local outcomes. Ivanov et al. (1999) and Hannan et al.

(1997) suggested that it is wiser to develop risk models that are tailored to regional

patients. Teres and Lemeshow (1999) pinpointed model customization as a strategy to

achieve better model performance for local patients’ data.

Recalibration of the model most frequently involves recalculating the coefficients and

fitting a logistic regression on calculated logit(p) in Equation (4.11) based on local data

(Beck et al., 2002), as follows:

logit(p∗) = β0 + β1(logit(p)), (4.15)

where p is the probability of death calculated using the original model and p∗ is its calibrated value.
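Applying Equation (4.15) to a single predicted probability amounts to a logit transform, a linear map and an inverse logit. The sketch below illustrates this with the M0 coefficients reported in Table 4.1 (β0 = −0.500, β1 = 1.339); the function names are hypothetical, and fitting the coefficients themselves is not shown.

```python
from math import exp, log


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return log(p / (1 - p))


def inv_logit(x):
    """Inverse of the logit transform."""
    return 1 / (1 + exp(-x))


def recalibrate(p, beta_0, beta_1):
    """Calibrated probability p* from Equation (4.15)."""
    return inv_logit(beta_0 + beta_1 * logit(p))


# Original APACHE II prediction of 50%, recalibrated with the M0 coefficients.
p_star = recalibrate(0.5, beta_0=-0.500, beta_1=1.339)
```

At p = 0.5 the logit is zero, so the calibrated value depends only on the intercept; a negative intercept therefore pulls the prediction below 50%.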

Research shows that a calibrated APACHE II model outperforms the original model

(Schonhofer et al., 2004; Suistomaa et al., 2002). A logistic regression model was

chosen for calibration of APACHE II based on local patients. The dataset contained

4644 records for patients admitted to ICU between 2000 and 2009. A review of the

dataset revealed that there were some records in which the diagnostic categories were

inaccurate. This inconsistency can lead to bias in calibrated predictions and control

charts. The research team therefore decided to check and correct the recorded data prior to calibration. As data cleaning is a costly process, we applied the algorithm developed in Section 4.5 to determine the amount of data required to achieve a well calibrated APACHE II model at an acceptable price. In this regard, initial parameters of the algorithm were defined as

follows:

Data block: As the dataset was used for monitoring, continuity and time order were

considered. The records were ordered by time, with the most recent 644 cases taken


to be the preliminary block, labelled B0. The remaining data were partitioned into 10

blocks of 400 records each. These blocks are labelled B1, . . . , B10, where B10 contains

the oldest data.

Costs: B0 was completely inspected and inaccurate coding of diagnostic categories was

identified for 67 of the 644 records. The inspection cost per record and the modification cost per inaccurate code were determined to be $1 and $5, respectively. The total cost of cleaning B0 was

obtained as CT,0 = 644× CI + 67× CM = $979.

Sample size: Given that 67 of the 644 inspected records were defective, the observed

error rate is 10.4%. Substituting Z = 1.96 and e = ±8.0% into Equations (4.9) and (4.10), we

calculated that around 49 records should be sampled from a block in the estimation

phase. For simplicity, we rounded this up to 50 records, and set the cost of sampling

to be $25.

Utility function: To construct a utility function, two approaches were considered.

First, the implementation of the utility loop only at the first iteration of the data

capturing algorithm was applied. In the second approach, the utility functions were updated after the third iteration. In this approach, we considered two procedures: partial and full updating. The former involves updating performance related

criteria which only affects performance based utility functions, whereas in the latter,

all defined and predicted parameters may also be updated; see Section 4.5.2. We refer

to these as “Fix”, “Updating I” and “Updating II”, respectively. They are outlined in

the next section.

4.6.1 Utility Function Construction

As discussed in Section 4.5.2 the unobserved blocks (B1, . . . , B10) are simulated and

then a logistic regression is fitted to B0,B1, . . . ,B10. Under Updating I and II ap-

proaches, unobserved blocks (B4, . . . , B10) were re-simulated using B0 and the samples

obtained and modified in the first three iterations of the algorithm. In this regard, each

block B1, . . . , B3 was simulated using associated samples, then unobserved blocks were

re-simulated using B0 and extended B1, . . . , B3. This procedure maintains the effect of


Table 4.1 Customized APACHE II model parameters using logistic regression over observed and simulated blocks under Fix and Updating approaches. Highlighted rows are sets of parameters which are used for utility function construction within the utility loop of the algorithm.

                    Fix                               Updating I,II
Model               β0       σβ0     β1      σβ1      β0       σβ0     β1      σβ1
M0                  -0.500   0.444   1.339   0.215    -0.500   0.444   1.339   0.215
M1                  -0.211   0.370   1.471   0.186    -0.211   0.370   1.471   0.186
M2                  -0.352   0.303   1.367   0.146    -0.352   0.303   1.367   0.146
M3                  -0.274   0.269   1.332   0.126    -0.274   0.296   1.332   0.126
M3 (highlighted)                                      -0.682   0.381   1.663   0.200
M4                  -0.310   0.241   1.323   0.113    -0.868   0.345   1.547   0.171
M5                  -0.229   0.227   1.365   0.108    -0.755   0.333   1.634   0.171
M6                  -0.203   0.216   1.411   0.104    -0.794   0.301   1.580   0.153
M7                  -0.237   0.207   1.401   0.099    -0.836   0.287   1.570   0.145
M8                  -0.236   0.196   1.417   0.095    -0.742   0.275   1.608   0.139
M9                  -0.256   0.184   1.406   0.089    -0.728   0.260   1.617   0.134
M10                 -0.274   0.177   1.399   0.085    -0.726   0.249   1.632   0.130

each block considering their sizes in the simulated data. We, then, fitted a logistic re-

gression to B3, baseline data for unobserved blocks, and B4, . . . ,B10. Reconstruction of

a model over B3 under Updating approaches leads to a new set of parameters that are,

here, only used for updating the utility functions, not for re-evaluation of the results

obtained through the third iteration of the algorithm.

Table 4.1 shows the parameter estimates for models based upon different amounts

of data for all approaches. As updating occurred in the third iteration, the same

parameters as those obtained through the Fix approach were used for the first three

iterations under Updating I and II. The first row of parameters for B3 is the set that was used in the third iteration of the algorithm, before updating, and the second row, in gray, was used for updating utility functions. Note that in practice the values

presented under Updating I and II approaches in Tables 4.1-4.7 were obtained when

the algorithm proceeded and reached the end of the third iteration.

Table 4.1 indicates that the intercept parameter, β0, of the calibrated APACHE II,

remains a negative value and tends to be stable when more blocks are used; however

obtained values under Updating approaches are significantly less than those obtained

under Fix. Conversely, the slope, β1, becomes larger after updating. Having a larger

negative value in the intercept shows a larger consistent drop in the observed odds ratio

of death; whereas a slope of size more than one, expresses a decrease and an increase


Table 4.2 Raw and relative performance criteria (Somers' statistic D, external accuracy Ea, precision P , weights w and performance index PI) of the calibrated APACHE II model over observed and simulated data obtained using Fix approach. Relative criteria are based on the comparison of M0 with MF . Highlighted row is the set of parameters which is used for utility function construction within the utility loop of the algorithm.

         Raw Criteria             Relative Criteria        Overall Performance
Model    Dj      Eaj     Pj       RDj     REaj    RPj      wj,1, wj,2, wj,3    PIj
M0       0.771   0.875   3.441    0.956   0.875   0.397    0.40, 0.40, 0.20    0.812
M1       0.784   0.886   4.036    0.972   0.886   0.466    0.43, 0.36, 0.21    0.833
M2       0.789   0.892   5.060    0.978   0.892   0.584    0.45, 0.32, 0.23    0.861
M3       0.791   0.938   5.816    0.980   0.938   0.671    0.48, 0.28, 0.24    0.894
M4       0.793   0.967   6.459    0.982   0.967   0.745    0.51, 0.24, 0.25    0.918
M5       0.795   0.974   6.803    0.985   0.974   0.785    0.53, 0.20, 0.27    0.929
M6       0.795   0.974   7.074    0.985   0.974   0.816    0.56, 0.16, 0.28    0.936
M7       0.797   0.992   7.443    0.987   0.992   0.859    0.59, 0.12, 0.29    0.950
M8       0.800   0.992   7.791    0.991   0.992   0.899    0.61, 0.08, 0.31    0.963
M9       0.802   0.998   8.265    0.994   0.998   0.954    0.64, 0.04, 0.32    0.981
M10      0.807   1.000   8.661    1.000   1.000   1.000    0.67, 0.00, 0.33    1.000

in the odds of death for those patients with less and more than 50% chance of death,

respectively.

As proposed in Section 4.5, we considered the goodness-of-fit D, external accuracy Ea

and precision P as the three criteria of performance of a logistic risk model in raw and

relative forms. We set M10 as the base for calculation of the relative criteria. To obtain

the overall performance index PI, we defined a variable set of non-identical weights so

that at M0, w0,k = {0.4, 0.4, 0.2} for k = 1, 2, 3 in Equation (4.13); then the weight of external accuracy, wj,2, gradually decreases in favor of the weights of goodness-of-fit and precision, wj,1 and wj,3, as j tends to F = 10 and finally becomes 0 at j = F .

Tables 4.2 and 4.3 give the details of the performance criteria of the constructed model

over observed and simulated blocks for Fix and Updating I,II approaches.

As seen in Table 4.2 and depicted in Figure 4.5-1, in the Fix approach the relative goodness-of-fit criterion, RD, gradually increases and reaches one as more blocks are used. The same consistent increase can be seen in the relative precision, RP , but with a steeper slope. In contrast, the relative external accuracy, REa, increases unevenly: it accelerates when the third and fourth blocks are added and then slows down over the last four blocks. These behaviors lead to a nearly smooth increase in the overall performance index, PI.


Table 4.3 Raw and relative performance criteria (Somers' statistic D, external accuracy Ea, precision P , weights w and performance index PI) of the calibrated APACHE II model over observed and simulated data obtained using Updating I and II approaches. Relative criteria are based on the comparison of M0 with MF . Highlighted rows are sets of parameters which are used for utility function construction within the utility loop of the algorithm.

                    Raw Criteria             Relative Criteria        Overall Performance
Model               Dj      Eaj     Pj       RDj     REaj    RPj      wj,1, wj,2, wj,3    PIj
M0                  0.771   0.875   3.441    0.956   0.875   0.397    0.40, 0.40, 0.20    0.812
M1                  0.784   0.886   4.036    0.972   0.886   0.466    0.43, 0.36, 0.21    0.833
M2                  0.789   0.892   5.060    0.978   0.992   0.584    0.45, 0.32, 0.23    0.861
M3                  0.791   0.938   5.816    0.980   0.938   0.671    0.48, 0.28, 0.24    0.894
M3 (highlighted)    0.819   0.930   3.808    0.965   0.930   0.651    0.48, 0.28, 0.24    0.880
M4                  0.825   0.960   4.367    0.972   0.960   0.746    0.51, 0.24, 0.25    0.912
M5                  0.826   0.962   4.410    0.973   0.962   0.754    0.53, 0.20, 0.27    0.913
M6                  0.830   0.982   4.925    0.978   0.982   0.842    0.56, 0.16, 0.28    0.940
M7                  0.838   0.989   5.182    0.987   0.989   0.886    0.59, 0.12, 0.29    0.957
M8                  0.839   0.992   5.389    0.989   0.992   0.921    0.61, 0.08, 0.31    0.968
M9                  0.845   0.992   5.641    0.996   0.998   0.964    0.64, 0.04, 0.32    0.986
M10                 0.848   1.000   5.847    1.000   1.000   1.000    0.67, 0.00, 0.33    1.000

Figure 4.5 Performance criteria of calibrated APACHE II over observed and simulated blocks, under (1) Fix and (2) Updating I and II approaches of the data capturing algorithm implementation.

In the Updating I and II approaches, as shown in Table 4.3 and Figure 4.5-2, a fall occurs in the relative Somers' statistic, RD, at the fourth block when the model is constructed on the re-simulated data. RP also increases unevenly. These changes lead to a performance index, PI, that fluctuates more under Updating I and II than under the Fix approach.

To illustrate the algorithm’s performance given different utility functions, we considered

four scenarios:

BL: A budget line corresponding to $6000;


LU: A linear utility with +20% adjustment in the calculated cost for B0 and −20%

adjustment in the predicted total cost for B10, based upon the belief that data

quality diminishes with time;

PU1: A performance curve-based utility, where k1 in Equation (4.14) is set to 305.2, which is estimated by E(C1)/(100× (PI1 − PI0)); and

PU2: A performance curve-based utility with a 5% increase in k1 obtained in PU1,

k2 = 320.5.

For scenarios LU, PU1 and PU2 in the Fix approach of the algorithm, we predicted

the number of errors and associated costs for each added block based upon the error

rate observed for B0. When using Equation (4.2), the number of errors has a binomial

distribution and we use a Beta(1, 1) prior. The consequent predictive distribution is

hypergeometric; see Equation (4.5). The costs were also predicted using Equations

(4.6) and (4.7). Table 4.4 summarizes the predicted errors, costs and utility values for

all four scenarios under the Fix approach.

In the Updating I approach, only performance indices were updated. Consequently,

we only updated PU1 and PU2 using obtained parameters PI in Table 4.3. In this

setting, other utility functions remained unchanged and it was assumed no interaction

with the DM was made; therefore, initial values of k were also used. As this approach

shares most parameters with the Fix approach, PU1 and PU2 under the Updating I

are also presented in Table 4.4. Note that, if either a BL or an LU function is considered, the termination point of the algorithm under Updating I is the same as that obtained through the Fix approach.

Comparison of the growth of the predicted number of errors E(xj) and associated costs E(Cj) in Table 4.4 reveals that the hypergeometric distribution in Equation (4.5) tends to predict a larger error rate for each added block than was observed in the preliminary block. This trend begins with an error rate of 45.3/400 = 0.113 for the first block and reaches a maximum of (476.9 − 399.7)/400 = 0.193, with a cost of $810.9, at M7, then decreases to (706.7 − 630.6)/400 = 0.190 at M10.

However this behavior does not affect the LU function, as it is constructed using the


Table 4.4 Predicted number of errors (E(xj)), costs (E(Cj)) and related utility values Uj for utility function scenarios, BL, LU, PU1 and PU2, over observed and simulated blocks under Fix and Updating I approaches.

                                          Uj-Fix                        Uj-Updating I
Dataset   E(xj)     E(Cj)   E(CT,j)       BL     LU     PU1    PU2      PU1    PU2
B1        45.353    651.7   1630.7        6000   1758   1631   1663     1631   1663
B2        105.719   726.8   2357.5        6000   2341   2486   2561     2486   2561
B3        174.271   767.7   3125.3        6000   2925   3501   3161     3501   3161
B4        247.375   790.5   3915.8        6000   3508   4247   4379     4043   4197
B5        322.944   802.8   4718.7        6000   4092   4578   4743     4059   4214
B6        399.714   808.8   5527.5        6000   4676   4791   4965     4907   5104
B7        476.901   810.9   6338.5        6000   5259   5218   5412     5429   5652
B8        554.001   810.4   7149.0        6000   5843   5617   5829     5761   6001
B9        630.689   808.4   7957.4        6000   6426   6173   6411     6290   6557
B10       706.755   805.3   8762.7        6000   7010   6728   6992     6714   7001

observed cost at the preliminary block B0, CT,0, and the predicted total cost obtained

for B10, E(CT,F ). Utility values for the LU scenario increase by a constant step of approximately 583.5. In contrast, the performance based utilities, PU1 and PU2, increase unevenly as they capture the behavior of the obtained PI shown in Figure 4.5-1.

In the Updating II approach, we updated all elements of the linear and performance

based utility functions, including performance indices and baseline costs in interaction

with the DM. To construct utility functions, we predicted the number of errors and

associated costs for each added block based upon the error rate observed for B0 and

samples taken from B1, ..., B3. As will be discussed later in Section 4.6.2, four, six

and seven errors were observed in the samples of size 50 taken from blocks B1, B2 and

B3, respectively, in the first three iterations of the algorithm. We then used the same

prediction methods discussed for the Fix approach, in which the number of observed

errors and records were set to the values obtained at the end of the third iteration.

We retained the adjustment coefficients proposed for the linear utility function in the Fix approach, ±20%; for the PUs, however, the values of k were replaced by those obtained through (CT,3 − CT,0)/(100 × (PI3 − PI0)), in which the new PI3 was applied; see the highlighted row for M3 in Table 4.3. Thus k1 and k2 were set to 287.71 and 291.65 for PU1 and

PU2, respectively. Table 4.5 shows the predicted errors, costs and utility values for the

Updating II approach. The estimated number of errors and associated costs at the end


Table 4.5 Predicted number of errors (E(xj)), costs (E(Cj)) and related utility values Uj for utility function scenarios, BL, LU, PU1 and PU2, over observed and simulated blocks under Updating II approach. Highlighted row is the set of parameters which is used for utility function construction within the utility loop of the algorithm.

                                   Uj-Updating II
Dataset  E(xj)    E(Cj)  E(CT,j)   BL    LU    PU1   PU2
B1       45.353   651.7  1630.7    6000  1758  1632  1662
B2       105.719  726.8  2357.5    6000  2341  2489  2558
B3       174.271  767.7  3125.3    6000  2925  3507  3623
B3*      124.637  636.6  2877.2    -     -     -     -
B4       177.472  689.1  3566.3    6000  4279  3777  3822
B5       245.788  766.6  4332.9    6000  4637  3792  3837
B6       322.008  806.1  5139.0    6000  4996  4566  4651
B7       402.343  826.7  5965.7    6000  5354  5042  5150
B8       484.717  836.9  6802.6    6000  5713  5346  5469
B9       567.919  841.0  7643.6    6000  6071  5829  5977
B10      651.220  841.5  8485.1    6000  6429  6216  6383

* Highlighted row.

of the third iteration, borrowed from Table 4.6, are also presented in the highlighted

row. This set of parameters was used to update the utility functions, in particular LU, PU1 and PU2.

The predicted numbers of errors E(xj) and associated costs E(Cj) in the Updating II approach, where the prediction uses the observed number of errors in the preliminary block and in the samples taken from blocks B1, B2 and B3, also confirm the tendency of the hypergeometric distribution to predict larger numbers of errors and higher costs as more blocks are added. Table 4.5 shows that an error rate of (177.4 − 124.6)/400 = 0.132 is predicted for the fourth block; the rate then grows and reaches a maximum of (651.2 − 567.9)/400 = 0.208, with a cost of $841.5, at M10.

Figure 4.6 compares the utility functions under the different approaches. As defined and discussed earlier, a BL utility function is not affected by updating; it therefore remains unchanged for all approaches. The updated LU in the Updating II approach starts above the fixed LU, crosses it at approximately the eighth block, and then lies below the initial LU; see Figure 4.6-2. The initial superiority of the updated LU arises because the estimated total cost obtained at the end of the third iteration, CT,3 = 2877.2, is multiplied by the adjustment value of 1.2, which gives a higher value than the corresponding initial utility value, U3 = 2925. After updating, however, since a lower cost is predicted when all blocks up to B10 are captured than that obtained through the Fix or Updating


Figure 4.6 Utility functions and termination points under Fix (-F) and Updating I (-I) and II (-II) approaches: (1) Budget line utility; (2) Linear utility, where the asterisk shows the estimated total cost obtained at the end of the third iteration, CT,3 = 2877.2; (3) Performance based utility function (PU1) with k1 equal to 305.23 and 287.71 for the Updating I and II approaches, respectively; (4) Performance based utility function (PU2) with a 5% increase in k1, where k2 is equal to 320.23 and 291.65 for the Updating I and II approaches, respectively. A vertical line is drawn to show when updating occurs in the algorithm.

I approaches, the updated LU increases in smaller steps of approximately 358.5; see Tables 4.4 and 4.5.

Figure 4.6-3 indicates that the updated performance based utility function in which only the performance index was updated, PU1-I, lies below the PU1 obtained for the Fix approach, PU1-F, over B4 and B5. This is due to the observed drop in PI for B3 after re-simulation; see Table 4.3. However, PU1-I reaches higher utility values and stands above PU1-F over the remaining blocks since PI increases significantly; see Figure 4.5. That said, the increase in PI is not large enough to offset the drop in the updated k1 = 287.71, so the PU1 under Updating II never stands, even partly, above the PU1 obtained for Fix. As expected, it also lies below the PU1 obtained for Updating I, as k1 drops significantly in Updating II.


Similar behavior can be seen in Figure 4.6-4, since PU2 is obtained by applying a 5% increase to the k1 estimated for PU1, giving k2. This increase generally leads to utility lines that sit at higher values. The difference between the PU2s obtained for Updating I and II is slightly larger than that seen for PU1, since a 5% increase yields a larger absolute change for a higher k1.

4.6.2 Algorithm Iterations

For each utility function scenario under the three approaches, Fix, Updating I and II,

the algorithm was run until it terminated.

In Phase 1, we predicted the mean number of errors (E(xj)) in Bj using Equation

(4.5). We used Equations (4.6) and (4.7) to calculate the mean cost (Cj) and total cost

(CT,j), respectively. The predicted total net benefit (TNBj) was calculated using Uj

obtained through different approaches. If necessary, we calculated the predicted local

net benefit (LNBj) using Equation (4.8).

In Phase 2, we took a random sample of 50 records from the block of size 400. The numbers of errors observed in the successive samples were {4, 6, 7, 4, 9, 11, 10, 13}. No sample was taken from block B10 since the algorithm terminated prior to the tenth iteration for all utilities and approaches; see Tables 4.6 and 4.7.

We then updated our prior knowledge using the posterior distribution in Equation (4.3).

All costs were then updated based upon the calculated mode of the Beta posterior dis-

tribution, which is given by (α−1)/(α+β−2). We ran Phase 3 for those cases in which

Phase 2 did not lead us to accept the block; see Section 4.4.3. At the beginning of the fourth iteration of the algorithm, the linear and performance based utility functions were partly and fully updated under the Updating I and II approaches, respectively, as discussed in Section 4.6.1. The results are outlined in the next section.
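The updating step can be illustrated with a short Python sketch. It is not the original implementation: it assumes that the prior entering the first iteration is a uniform Beta(1, 1) prior updated with the preliminary block (644 records, 67 errors; see Section 4.6.3), and that each Phase 2 sample of 50 records is added by the usual conjugate Beta-binomial update. Names are hypothetical.

```python
PRELIM_RECORDS = 644   # size of the preliminary block B0
PRELIM_ERRORS = 67     # errors found in B0
SAMPLE_SIZE = 50       # Phase 2 sample size n
sample_errors = [4, 6, 7, 4, 9, 11, 10, 13]  # errors observed in successive samples


def beta_mode(alpha, beta):
    """Mode of a Beta(alpha, beta) distribution, valid for alpha, beta > 1."""
    return (alpha - 1) / (alpha + beta - 2)


# Assumption: Beta(1, 1) prior combined with the preliminary block.
alpha = 1 + PRELIM_ERRORS
beta = 1 + (PRELIM_RECORDS - PRELIM_ERRORS)

posterior_modes = []
for errors in sample_errors:
    # Conjugate update: add observed errors and error-free records.
    alpha += errors
    beta += SAMPLE_SIZE - errors
    posterior_modes.append(beta_mode(alpha, beta))

for j, mode in enumerate(posterior_modes, start=1):
    print(f"iteration {j}: estimated error rate {mode:.4f}")
```

Under these assumptions the modes agree with the θj column of Table 4.6 to within 0.001, and they move slowly because each sample adds only 50 records to a prior based on 644.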


Table 4.6 Data capturing algorithm iterations and termination points for four utility function scenarios under Fix approach.

                Phase 1 (N = 400)                                Phase 2 (n = 50)
Utility  Model  E(xj | xj−1)  E(Cj)  E(CT,j)  E(TNBj)  E(LNBj)   xj  θj     Cj   CT,j  TNBj  LNBj
BL       M1     45.3          651    1630     4370     -         4   0.102  629  1608  4392  -
         M2     44.1          645    2276     3724     -         6   0.103  631  2240  3760  -
         M3     43.9          644    2921     3079     -         7   0.105  636  2877  3123  -
         M4     44.0          645    3566     2434     -         4   0.104  633  3510  2490  -
         M5     43.1          640    4207     1793     -         9   0.108  642  4152  1848  -
         M6     43.9          644    4852     1148     -         11  0.114  653  4806  1194  -
         M7     45.1          650    5502     498      -         10  0.118  662  5468  532   -
         M8     45.8          654    6157     -157     68200     13  0.125  675  6144  -144

LU       M1     45.3          651    1630     128      -         4   0.102  629  1608  150   -
         M2     44.1          645    2276     65       -         6   0.103  631  2240  101   -
         M3     43.9          644    2921     4        -         7   0.105  636  2877  48    -
         M4     44.0          645    3566     -58      -10800

PU1      M1     45.3          651    1630     0        -         4   0.102  629  1608  22    -
         M2     44.1          645    2276     209      -         6   0.103  631  2240  245   -
         M3     43.9          644    2921     579      -         7   0.105  636  2877  624   -
         M4     44.0          645    3566     673      -         4   0.104  633  3510  729   -
         M5     43.1          640    4207     362      -         9   0.108  642  4152  417   -
         M6     43.9          644    4852     -71      58200     11  0.114  653  4806  -25   78400
         M7     44.0          645    5502     -294     -73000

PU2      M1     45.3          651    1630     33       -         4   0.102  629  1608  55    -
         M2     44.1          645    2276     285      -         6   0.103  631  2240  321   -
         M3     43.9          644    2921     706      -         7   0.105  636  2877  750   -
         M4     44.0          645    3566     836      -         4   0.104  633  3510  892   -
         M5     43.1          640    4207     541      -         9   0.108  642  4152  597   -
         M6     43.9          644    4852     120      -         11  0.114  653  4806  165   -
         M7     45.1          650    5502     -83      7200      10  0.118  662  5468  -49   23000
         M8     45.8          654    6157     -320     -80600

4.6.3 Algorithm Termination

Tables 4.6 and 4.7 show that the algorithm’s termination point depends upon which utility function and approach are adopted. Since all approaches were applied simultaneously within a single run of the algorithm, identical samples and associated costs were obtained. For this reason, the predictions and estimates of the number of errors and costs are not repeated in Table 4.7.

When a BL was considered, the algorithm terminated at the eighth iteration. Although B8 led to a negative prediction of total benefit (TNB = −157) in all approaches, the algorithm was not terminated in Phase 1 because the TCP line crossed the BL over B7 in Figure 4.6-1 and a positive local benefit was obtained. When a negative value was also returned in Phase 2, TNB = −144, the algorithm terminated. In this case the algorithm accepted B7 and the calibrated model had a PI of 0.950 when no updating process was applied. With updating, however, PI increases by 0.007, reaching 0.957; see Table 4.3. Note that Phase 3 is not implemented for budget line utilities; see Section 4.5.2.


Table 4.7 Data capturing algorithm iterations and termination points for four utility function scenarios under Updating I and II approaches.

                Updating I                            Updating II
                Phase 1 (N = 400)  Phase 2 (n = 50)   Phase 1 (N = 400)  Phase 2 (n = 50)
Utility  Model  E(TNBj)  E(LNBj)   E(TNBj)  E(LNBj)   E(TNBj)  E(LNBj)   E(TNBj)  E(LNBj)
BL       M1     4370     -         4392     -         4370     -         4392     -
         M2     3724     -         3760     -         3724     -         3760     -
         M3     3079     -         3123     -         3079     -         3123     -
         M4     2434     -         2490     -         2434     -         2490     -
         M5     1793     -         1848     -         1793     -         1848     -
         M6     1148     -         1194     -         1148     -         1194     -
         M7     498      -         532      -         498      -         532      -
         M8     -157     68200     -144               -157     68200     -144

LU       M1     128      -         150      -         128      -         150      -
         M2     65       -         101      -         65       -         101      -
         M3     4        -         48       -         4        -         48       -
         M4     -58      -10800                       713      -         769      -
         M5                                           430      -         485      -
         M6                                           144      -         190      -
         M7                                           -148     -800

PU1      M1     0        -         22       -         0        -         22       -
         M2     209      -         245      -         209      -         245      -
         M3     579      -         624      -         579      -         624      -
         M4     477      -         532      -         210      -         266      -
         M5     -148     65400     -93      88000     -415     -61200
         M6     56       -         101      -
         M7     -74      -3600

PU2      M1     33       -         55       -         33       -         55       -
         M2     285      -         321      -         285      -         321      -
         M3     706      -         750      -         706      -         750      -
         M4     629      -         685      -         255      -         311      -
         M5     6        -         61       -         -370     -43200
         M6     252      -         298      -
         M7     149      -         182      -
         M8     -156     -419400

When using a LU under the Fix approach, the algorithm terminated at the fourth iteration in Phase 1, prior to sampling and observing B4. Under this scenario, M3 was the optimal model, costing approximately $2877 and achieving a PI of 0.894. The same result was obtained when a partial run of the utility loop was applied (Updating I). Under Updating II, however, the termination point was postponed since a higher utility value, $4279, was obtained for B4, so that the next three blocks were accepted. The algorithm terminated at Phase 1 of the seventh iteration, when a negative value was also obtained for the local net benefit, E(LNB7) = −800 (see Table 4.7), since the slope of the updated LU was significantly less than those observed for the estimated and predicted total costs; see the TCP and TCE lines in Figure 4.6-2.

With a PU function in which the DM was willing to pay $305.23 for each additional percent of improvement in the performance of the calibration model, and with no updating, the algorithm accepted B6 despite negative predicted and estimated TNBs (-71, -25) because the corresponding local benefits were positive. It then stopped at the beginning of the seventh iteration


because the LNB was predicted to be negative. In this case, M6 was estimated to cost $4806, with a PI of 0.936. Under the Updating I approach, where only PI was updated, PU1 initially dropped below the PU1 obtained for the Fix approach and also lay below the estimated and predicted total costs for B5, as shown in Figure 4.6-3. However, the algorithm correctly treated B5 as a local deterioration in performance and passed it on the basis of positive local net benefits. It then stopped at the seventh iteration, where the local net benefit was predicted to be negative, as in the Fix approach. Conversely, under the Updating II approach, where all elements of PU1 were updated, the algorithm terminated at B5, since the updated PU1 stood entirely below the cost lines; see Figure 4.6-3. In this scenario, the optimal choice was M4, which cost $3510 and gave a calibrated model with a PI of 0.912; see Table 4.3.

With a PU with k2 = 320.23 under the Fix approach, PU2, the algorithm terminated at the eighth iteration, later than for PU1, since a higher k was applied and a higher set of utility values was obtained; compare Figures 4.6-3 and 4.6-4. In this scenario B8 was rejected before taking a sample, based upon the predicted costs and the calculated negative benefits (-320 and -80600). The optimal choice under this scenario cost $5468 and had a PI of 0.963; see Table 4.2. Under the Updating I approach, although an initial drop in PU2 relative to the PU2 for Fix can be seen in Figure 4.6-4, it did not bring PU2 below the cost lines. The algorithm terminated at the eighth iteration and proposed M7 as the optimal choice, costing approximately $5468 and achieving an updated PI of 0.957.

Figure 4.6-4 shows that a 5% increase in k, where it is fully updated, was still not sufficient to keep PU2 above the cost lines over B5 and the subsequent blocks in the Updating II approach. This led to acceptance of M4 as the optimal model.

Under all scenarios, the algorithm terminated in Phase 1 because E(CT,j) and CT,j

were very close; see Table 4.6 and Figure 4.6. If E(TNBj) ≥ 0 in Phase 1, then it is

more likely that TNBj ≥ 0, which defers the termination point to the next iteration.

The termination point for the BL scenario is an exception, since costing more than the

budget is not acceptable.
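The Phase 1 behaviour seen in Table 4.6 can be summarised in a small sketch. This is a simplified reading of the rule as it appears in the case study, not the full algorithm of Section 4.4: it assumes that an iteration proceeds when the predicted total net benefit is non-negative, or when it is negative but the predicted local net benefit is positive. The LNB values are copied from Table 4.6 rather than computed from Equation (4.8), and the function names are hypothetical. The BL scenario, which terminated in Phase 2, is not covered.

```python
def phase1_continues(tnb, lnb):
    """Assumed Phase 1 check: continue if E(TNB) >= 0, or if E(LNB) > 0."""
    if tnb >= 0:
        return True
    return lnb is not None and lnb > 0


def termination_iteration(phase1_rows):
    """Return the first iteration (1-based) at which Phase 1 stops, else None."""
    for iteration, (tnb, lnb) in enumerate(phase1_rows, start=1):
        if not phase1_continues(tnb, lnb):
            return iteration
    return None


# Phase 1 pairs (E(TNBj), E(LNBj)) from Table 4.6, Fix approach; None stands for "-".
lu_fix = [(128, None), (65, None), (4, None), (-58, -10800)]
pu1_fix = [(0, None), (209, None), (579, None), (673, None),
           (362, None), (-71, 58200), (-294, -73000)]
pu2_fix = [(33, None), (285, None), (706, None), (836, None),
           (541, None), (120, None), (-83, 7200), (-320, -80600)]

for name, rows in [("LU", lu_fix), ("PU1", pu1_fix), ("PU2", pu2_fix)]:
    print(f"{name}: Phase 1 termination at iteration {termination_iteration(rows)}")
```

With these inputs the sketch stops at the fourth, seventh and eighth iterations for LU, PU1 and PU2, matching the termination points reported for the Fix approach.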

The closeness of E(CT,j) and CT,j is explained by the effect of block and sample size.


The preliminary block contained 644 records of which 67 contained errors, giving an

error rate of 0.104. This informed the prior distribution about errors to be used in

the algorithm iterations, with the result that the number of errors observed in the

samples of size 50 did not strongly influence the overall error rate and consequent cost,

CT,j . Hence, although the observed error rate in Phase 2 increased as the algorithm

proceeded, reaching 13/50 = 0.26 by the eighth iteration, the estimated error rate for

the same iteration was only θ = 0.125. In other words, using such an informative prior

for the first iteration reduced the algorithm’s sensitivity to shifts in the observed error

rate.

Figure 4.6 shows that all scenarios start above the cost lines. For the LU and PU scenarios, this initial value is due to the observed cost at the preliminary block and the predicted total cost(s) in the utility loop of the first algorithm iteration, which are shown in Tables 4.4 and 4.5. As discussed earlier, the hypergeometric distribution in Equation (4.5) tends to predict a higher error rate for each added block than was observed in the preliminary block.

Consequently, all predicted costs and rates in Tables 4.4 and 4.5 are higher than the corresponding values in the iterations of the algorithm in Table 4.6, even where the observed error rate increases. For example, in the ninth iteration the observed rate reaches 13/50 = 0.26 and the total cost is estimated to be $6144, whereas the predicted costs obtained in the utility loop at B9 are $7957 and $7643 before and after updating, respectively. Knowing this helps the DM to define utility values and function parameters more appropriately. One consideration is assigning a negative adjustment to TCF in the linear utility function construction.

4.7 Algorithm Development and Extension

The primary assumptions of the general and customized algorithm described in Sections

4.4 and 4.5 can be changed to suit different circumstances. In this section we discuss

possible adaptations to the algorithm as well as ways in which the case study could

have been undertaken differently.


If time continuity is not important, then blocks with high error rates can be skipped

in the general data capturing algorithm. In this case, Phase 2 should incorporate

acceptance sampling such that the block under consideration is accepted or rejected

based upon the number of errors observed in the sample. The simplest acceptance plan

uses a single sample: all units in the random sample are inspected and if the number

of errors exceeds the acceptance number, then the block is rejected. The sample size

and acceptance number are defined based on the desired level of quality (Montgomery,

2008).
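As an illustration of such a plan, the probability of accepting a block can be computed from the hypergeometric distribution. The Python sketch below is not part of the case study: the block and sample sizes (400 and 50) are taken from it, but the acceptance number c = 5 and the function name are hypothetical choices made only for the example.

```python
from math import comb


def acceptance_probability(block_size, sample_size, errors_in_block, acceptance_number):
    """P(accept) for a single-sample plan: hypergeometric P(X <= acceptance_number)."""
    total_samples = comb(block_size, sample_size)
    probability = 0.0
    for d in range(min(acceptance_number, sample_size) + 1):
        # comb() returns 0 when d exceeds errors_in_block, so impossible cases drop out.
        ways = comb(errors_in_block, d) * comb(block_size - errors_in_block, sample_size - d)
        probability += ways / total_samples
    return probability


BLOCK_SIZE = 400        # N, as in the case study
SAMPLE_SIZE = 50        # n, as in the case study
ACCEPTANCE_NUMBER = 5   # hypothetical acceptance number c

for errors in (20, 40, 60, 80):
    p = acceptance_probability(BLOCK_SIZE, SAMPLE_SIZE, errors, ACCEPTANCE_NUMBER)
    print(f"{errors} errors in block (rate {errors / BLOCK_SIZE:.2f}): P(accept) = {p:.3f}")
```

Plotting this probability against the block error rate gives the operating characteristic curve of the plan, which is how the sample size and acceptance number are usually tuned to a desired quality level.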

Another primary assumption that can be revisited is the necessity of inspection and

modification of all accepted blocks. It might suffice to accept a block without 100%

inspection and cleaning if its estimated error rate is low. In this case the cost formulae

in Equations (4.6) and (4.7) would be divided into two parts corresponding to modified

and non-modified blocks. This adjustment can obviously be used in combination with

the aforementioned non-continuous data capturing model.

An exciting development of the current algorithm is its promotion from data capturing

to an economical statistical model selection algorithm. This involves equipping the

algorithm with appropriate performance criteria for each of the alternative statistical

models under consideration. In our case study, a combination of accuracy and precision was used to quantify the performance of the constructed risk model. Other

criteria could be developed for competing models and the algorithm would then seek

the model which provides the best utility and cost. In this respect, the algorithm is

completely adapted to the VOI framework, using the expected value of perfect and

sample information (EVPI and EVSI respectively) parameters (Winkler, 2003).

All extensions discussed above can be applied to the customized algorithm for risk

model construction. Other changes and developments that could be made to the utility

loop and the customized algorithm are considered here. With respect to determining

an appropriate minimum sample size for estimation in Phase 2, various options have

been proposed for logistic regression model construction. Concato et al. (1995) and Peduzzi et al. (1995) investigated the effect of a small number of events per variable on regression model performance and associated statistical tests. The rule of thumb they proposed says that at least 10 events are required in the sample for every variable included

in the model. Other studies have focused on the prevalence of events for each level

of the predictor variables. Whittemore (1981) proposed a formula derived from the

information matrix, and Self and Mauritsen (1988) and Shieh (2001) extended this

work. Hsieh et al. (1998) simplified the model and proposed simple formulae for linear

and logistic regression models. Any of these methods could be used to determine the

preliminary block size in the data capturing algorithm.

In our case study, predictions were obtained by fitting the model using observed and

simulated blocks. The latter were obtained by randomly sampling from observed blocks.

If there is evidence that blocks closer together (in time) are more similar with respect

to quality than blocks that are further apart, then simulation should take this into

account. This can be achieved by preferentially sampling from nearby blocks. In general, a moving window of fixed width can be applied: when a new block is accepted, the oldest block is removed from the moving window.
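A fixed-width window of this kind can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the procedure used in the case study: the window width of three blocks, the toy records and the function name are hypothetical, the window is assumed to drop its oldest block when a new one is accepted, and simulated blocks are assumed to be drawn with replacement from the records currently in the window.

```python
import random
from collections import deque


def simulate_block(window, block_size, rng):
    """Resample a simulated block (with replacement) from the blocks in the window."""
    pool = [record for block in window for record in block]
    return [rng.choice(pool) for _ in range(block_size)]


WINDOW_WIDTH = 3  # hypothetical number of recent blocks kept
window = deque(maxlen=WINDOW_WIDTH)

# Toy blocks: each record is (block_id, record_id); real records would be patient rows.
for block_id in range(1, 6):
    accepted_block = [(block_id, record_id) for record_id in range(4)]
    window.append(accepted_block)  # deque discards the oldest block once full

rng = random.Random(0)  # fixed seed so the sketch is reproducible
simulated = simulate_block(window, block_size=4, rng=rng)

blocks_in_window = [block[0][0] for block in window]
print("blocks in window:", blocks_in_window)
print("simulated block:", simulated)
```

Using a bounded deque keeps the bookkeeping implicit: appending a newly accepted block is the only operation needed to slide the window.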

4.8 Conclusion

Sample size determination is an important component of many statistical analyses. Most sample size determination methods consider only the statistical characteristics of an analysis; however, economical implementation of such analyses becomes a concern, particularly when the cost of data matters. Data inspection and error modification costs are known costs associated with clinical data.

In this study, we developed a general data capturing algorithm based on Value of Information theory from a Bayesian decision making context and the concept of Utility. Within the algorithm’s components, a suite of costs, the decision maker’s concerns and preferences, and the performance of the statistical analysis were quantified and then translated into monetary terms on a common scale.

We customized the algorithm for construction of logistic based risk models. To evaluate

its performance, we applied the customized algorithm to calibrate an available risk

model, APACHE II, using logistic regression over a local dataset. The results showed


how well the algorithm captures all economical and statistical parameters of interest

including inspection, sampling, and modification costs as well as accuracy and precision

of the constructed risk model.

The performance of the algorithm was investigated over various methods of quantifi-

cation of the DM’s economical constraints within different approaches of algorithm

implementation. The obtained results supported the flexibility of the algorithm from

statistical and also economical perspectives, while sufficient sensitivity to changes was

maintained.

We also outlined a range of plug-in extensions that can be applied to the general and the proposed customized algorithms in order to consider more practical and statistical parameters, such as choosing the best data blocks and merging with available sample size determination methods. Moreover, extending the proposed framework and algorithm to the model selection context and to finding optimal solutions can also be of interest.

Acknowledgement

The authors gratefully acknowledge financial support from the Queensland University

of Technology and St Andrews Medical Institute through an ARC Linkage Project.

Bibliography




Beck, D., Smith, G., and Pappachan, J. (2002). The effects of two methods for cus-

tomising the original SAPS II model for intensive care patients from South England.

Anaesthesia, 57(8):778–817.




Besanko, D. and Braeutigam, R. (2002). Microeconomics: An Integrated Approach.

Wiley.




Chow, S., Shao, J., and Wang, H. (2007). Sample Size Calculations in Clinical Research.

Chapman & Hall.

Claxton, K., Ginnelly, L., Sculpher, M., Philips, Z., and Palmer, S. (2004). A pilot study on the use of decision theory and value of information analysis as part of the NHS health technology assessment programme. Health Technology Assessment, 8(31):1–103.

Claxton, K., Neumann, P. J., Araki, S., and Weinstein, M. C. (2001). Bayesian value-of-information analysis - an application to a policy model of Alzheimer’s disease. International Journal of Technology Assessment in Health Care, 17(1):38–55.

Cochran, W. (2007). Sampling Techniques. Wiley-India.

Concato, J., Peduzzi, P., Holford, T., and Feinstein, A. (1995). Importance of events per independent variable in proportional hazards analysis I. Background, goals, and general strategy. Journal of Clinical Epidemiology, 48(12):1495–1501.


Chapman & Hall/CRC.

Hannan, E., Farrell, L., and Cayten, C. (1997). Predicting survival of victims of motor vehicle crashes in New York State. Injury, 28(9-10):607–615.

Harvey, A., Zhang, H., Nixon, J., and Brown, C. (2007). Comparison of data extraction

from standardized versus traditional narrative operative reports for database-related

research and quality control. Surgery, 141(6):708–714.

Hosmer, D. and Lemeshow, S. (2000). Applied Logistic Regression. Wiley-Interscience.

Hsieh, F., Bloch, D., and Larsen, M. (1998). A simple method of sample size calculation

for linear and logistic regression. Statistics in Medicine, 17(14):1623–1634.

Ivanov, J., Tu, J., and Naylor, C. (1999). Ready-made, recalibrated, or remodeled?:

issues in the use of risk indexes for assessing mortality after coronary artery bypass

graft surgery. Circulation, 99(16):2098–2104.

Kish, L. (1995). Survey Sampling. Wiley.



Kramer, A. and Zimmerman, J. (2007). Assessing the calibration of mortality benchmarks in critical care: the Hosmer-Lemeshow test revisited. Critical Care Medicine, 35(9):2052–2056.


Marcin, J. and Romano, P. (2007). Size matters to a model’s fit. Critical Care Medicine,

35(9):2212–2213.




Nemes, S., Jonasson, J., Genell, A., and Steineck, G. (2009). Bias in odds ratios by

logistic regression modelling and sample size. BMC Medical Research Methodology,

9(1):56–60.

Peduzzi, P., Concato, J., Feinstein, A., and Holford, T. (1995). Importance of events per independent variable in proportional hazards regression analysis II. Accuracy and precision of regression estimates. Journal of Clinical Epidemiology, 48(12):1503–1510.





Sarndal, C., Swensson, B., and Wretman, J. (2003). Model Assisted Survey Sampling.

Springer Verlag.

Schonhofer, B., Guo, J., Suchi, S., Kohler, D., and Lefering, R. (2004). The use of

APACHE II prognostic system in difficult-to-wean patients after long-term mechan-

ical ventilation. European Journal of Anaesthesiology, 21(7):558–565.

Self, S. and Mauritsen, R. (1988). Power/sample size calculations for generalized linear

models. Biometrics, 44(1):79–86.






Shieh, G. (2001). Sample size calculations for logistic and Poisson regression models.

Biometrika, 88(4):1193–1199.

Somers, R. (1962). A new asymmetric measure of association for ordinal variables.

American Sociological Review, pages 799–811.

Spiegelhalter, D., Abrams, K., and Myles, J. (2004). Bayesian Approaches to Clinical

Trials and Health-Care Evaluation. Wiley.









Suistomaa, M., Niskanen, M., Kari, A., Hynynen, M., and Takala, J. (2002). Customised prediction models based on APACHE II and SAPS II scores in patients with prolonged length of stay in the ICU. Intensive Care Medicine, 28(4):479–485.



Teres, D. and Lemeshow, S. (1999). When to customize a severity model. Intensive

Care Medicine, 25(2):140–142.

Tuyl, F., Gerlach, R., and Mengersen, K. (2009). Posterior predictive arguments in

favor of the Bayes-Laplace prior as the consensus prior for binomial and multinomial

parameters. Bayesian Analysis, 4(1):151–158.

Whittemore, A. (1981). Sample size for logistic regression with small response proba-

bility. Journal of the American Statistical Association, pages 27–32.

Winkler, R. (2003). Introduction to Bayesian Inference and Decision. Probabilistic

Publishing.

CHAPTER 5

Implementation of multivariate control charts in

a clinical setting

Preamble

It is not rare to be interested in monitoring more than one quality characteristic of a clinical process using control charts. In an industrial context, multivariate control charts, including the T², multivariate exponentially weighted moving average (MEWMA), and multivariate cumulative sum (MCUSUM) charts, have been proposed and applied widely. These procedures are superior to simultaneously monitoring variables using several univariate control charts since the structure of correlation between characteristics is captured and the false alarm rate is not inflated. In this chapter we adapted the above techniques to monitoring radiation delivered to patients undergoing diagnostic coronary angiogram procedures at a local hospital. Within this adaptation, we faced a common challenge in a clinical context, namely incompleteness of data in the measured variables. We investigated and compared the performance of the charts when different imputation methods were used. The results of the simulation study favored MEWMA


and MCUSUM in the presence of small shifts in the mean of the measured characteristics and supported the use of the multiple imputation method.

The focus of this chapter is on the second objective of the thesis, mainly goal 2, in which monitoring related variables is of interest. This chapter contributes to the application and adaptation of well-established charting methods from an industrial context to healthcare. Within this knowledge transfer, the common challenge of missing data in the clinical setting is also considered.

This chapter has been written as a journal article for which I am the third author. It is reprinted here in its entirety. I contributed to the design of the statistical analysis and simulation.






Waterhouse, M. A., Smith, I. R., Assareh, H., and Mengersen, K. (2010) Implementa-

tion of multivariate control charts in a clinical setting, International Journal for Quality

in Health Care, 22 (5): 408-414.


M. A. Waterhouse    Conception and conduct research, design and implement statistical analysis, write code, write manuscript, make modifications to manuscript as suggested by co-authors and reviewers

I. Smith            Conception, data collection, comments on manuscript, editing

H. Assareh          Design of statistical analysis






5.1 Abstract

In most clinical monitoring cases there is a need to track more than one quality char-

acteristic. If separate univariate charts are used, the overall probability of a false

alarm may be inflated since correlation between variables is ignored. In such cases,

multivariate control charts should be considered. This paper considers the implementation and performance of the T², multivariate exponentially weighted moving average (MEWMA), and multivariate cumulative sum (MCUSUM) charts in light of the challenges faced in clinical settings. We discuss how to handle incomplete records and

non-normality of data, and we provide recommendations on chart selection. Our dis-

cussion is supported by a case study involving the monitoring of radiation delivered

to patients undergoing diagnostic coronary angiogram procedures at St Andrew’s War

Memorial Hospital, Australia. We also perform a simulation study to investigate chart

performance for various correlation structures, patterns of mean shifts, amounts of

missing data, and methods of imputation. The multivariate exponentially weighted

moving average (MEWMA) chart and the multivariate cumulative sum (MCUSUM)

chart detect small to moderate shifts quickly, even when the quality characteristics are

uncorrelated. The T² chart performs less well overall, although it is useful for rapid

detection of large shifts. When records are incomplete, we recommend using multiple

imputation.

5.2 Introduction

Control charts are becoming widely accepted in the health domain as a means of mon-

itoring processes and outcomes (Spiegelhalter et al., 2003). To date, attention has

mainly focused on the application of univariate control charts. In most clinical moni-

toring cases, however, there is a need to track more than one quality characteristic. If

separate univariate charts are used to monitor each quality characteristic, the overall

probability of a false alarm may be inflated (unless the control limits are adjusted ac-

cordingly) since any correlation between the variables is ignored. This suggests that it

might be worthwhile adopting multivariate techniques.
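The size of this inflation can be illustrated under the simplifying assumption that the p charts are independent, in which case the overall false alarm probability is 1 − (1 − α)^p for a per-chart probability α. The Python sketch below is an illustration only; the value α = 0.0027 (three-sigma limits) and the function names are assumptions introduced here, and with correlated variables the true overall rate differs from this formula.

```python
def overall_false_alarm(alpha, n_charts):
    """Overall false alarm probability for n_charts independent charts."""
    return 1.0 - (1.0 - alpha) ** n_charts


def per_chart_alpha(target_overall, n_charts):
    """Per-chart alpha that gives the target overall rate under independence."""
    return 1.0 - (1.0 - target_overall) ** (1.0 / n_charts)


ALPHA = 0.0027   # assumed per-chart false alarm probability (three-sigma limits)
N_CHARTS = 3     # dose area product, fluoroscopy time and frames

inflated = overall_false_alarm(ALPHA, N_CHARTS)
adjusted = per_chart_alpha(ALPHA, N_CHARTS)

print(f"overall false alarm probability with {N_CHARTS} charts: {inflated:.6f}")
print(f"per-chart alpha needed to keep the overall rate at {ALPHA}: {adjusted:.6f}")
```

With three charts the overall probability is roughly three times the per-chart value, which is why the control limits must be widened if separate univariate charts are retained.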


In this paper we survey some of the charts available for monitoring the means of con-

tinuous variables. We consider their implementation and performance in light of the

challenges faced in clinical settings. In particular, we address the fact that clinical

records are frequently incomplete.

The paper proceeds as follows. In Section 5.3 we outline the general multivariate framework and explain how to construct Hotelling’s T², the multivariate exponentially weighted moving average (MEWMA), and the multivariate cumulative sum (MCUSUM) charts.

The discussion is supported by a case study involving the monitoring of radiation deliv-

ered to patients undergoing diagnostic coronary angiogram procedures at St Andrew’s

War Memorial Hospital, Australia. We also outline the methodology of a simulation

study used to investigate how each chart performs for various correlation structures,

patterns of mean shifts, amounts of missing data, and methods of imputation. Re-

sults are given in Section 5.4, and we provide recommendations in Section 5.5 on chart

selection and implementation.

5.3 Methods

5.3.1 Description of case study data

Our dataset contains information for three variables linked to the radiation delivered

to a patient, namely the dose area product, fluoroscopy time, and the number of dig-

ital images (frames) acquired during coronary angiogram procedures at St Andrew’s

War Memorial Hospital between April 2005 and December 2008. Dose area product,

measured in mGy·cm², provides a measure of the total radiation to which a patient

is exposed. It is affected by the fluoroscopy time (low radiation dose rate component

of the study associated with positioning catheters in the heart), the number of frames

(high dose rate documentation phase), and other procedural and clinical factors such

as the patient’s weight. To minimise variations in dose area product associated with

patient size, data included in the case study have been limited to female patients only.

Under radiation safety and protection guidelines, every effort is taken to limit patient


radiation exposure. To this end St Andrew’s War Memorial Hospital routinely moni-

tors and reviews dose area product, fluoroscopy time and frames separately in an effort

to achieve an optimised risk versus benefit balance. Although the number of frames is

technically discrete, we will regard it as continuous.

5.3.2 A general framework for multivariate monitoring

Let $X_i$ be the $i$th vector of observations for the $p$ variables that we want to monitor.

With respect to the case study, $X_i$ comprises the values of dose area product, fluoroscopy

time and frames for the $i$th patient. For example, if the tenth patient had

a dose area product of 30638 mGy·cm², a fluoroscopy time of 2.56 minutes, and 622

frames were taken of their heart, then $X_{10} = [30638\ \ 2.56\ \ 622]'$.

When the process is in-control, it is assumed that $X_i$ follows a multivariate normal

distribution, with mean vector $\mu_0$ and covariance matrix $\Sigma$, independent of other observations.

That is, $X_i \overset{iid}{\sim} N_p(\mu_0, \Sigma)$. There will be many occasions, however, where

clinical data do not satisfy this assumption. The MEWMA chart can be designed

to be robust against deviations from normality (Stoumbos and Sullivan, 2002; Testik

et al., 2003), and there exists a non-parametric version of the MCUSUM chart (Qiu

and Hawkins, 2003), but the $T^2$ chart is highly sensitive to the normality assumption

(Stoumbos and Sullivan, 2002).

When normality is questionable, it will often suffice to transform one or more variables.

For example, although dose area product and fluoroscopy time are both strongly right-skewed,

normality appears to hold for the natural log of dose area product and the inverse

of fluoroscopy time. Consequently, instead of creating control charts using the raw

data, we would construct them for $X^*_i = [D\ \ T\ \ F]'$, where $D = \ln(\text{dose area product})$,

$T = (\text{fluoroscopy time})^{-1}$ and $F$ denotes the number of frames. Hence, $X^*_{10} = [10.3\ \ 0.39\ \ 622]'$.
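The transformation of the tenth record can be checked with a few lines of Python. This is a minimal sketch; the function name `transform` is hypothetical and the input values are those quoted in the text.

```python
import math

# Raw record for the tenth patient, as quoted in the text.
dose_area_product = 30638.0   # mGy·cm²
fluoroscopy_time = 2.56       # minutes
frames = 622.0


def transform(dap, fluoro_time, n_frames):
    """Return [D, T, F] = [ln(dose area product), 1 / fluoroscopy time, frames]."""
    return [math.log(dap), 1.0 / fluoro_time, n_frames]


x_star_10 = transform(dose_area_product, fluoroscopy_time, frames)
print([round(v, 2) for v in x_star_10])
```

Rounding the first two components to one and two decimal places, respectively, reproduces the transformed vector given above.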

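The parameters $\mu_0$ and $\Sigma$ can either be specified by management or estimated using a

sample from a stable process. We assume that $X^*_i \overset{iid}{\sim} N_3(\mu_0, \Sigma)$, where

\[
\mu_0 = \begin{bmatrix} 9.5 \\ 0.55 \\ 586 \end{bmatrix}
\quad \text{and} \quad
\Sigma = \begin{bmatrix} 0.2 & -0.03 & 23.8 \\ -0.03 & 0.04 & -6.0 \\ 23.8 & -6.0 & 14882 \end{bmatrix}.
\]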

The objective is to detect a shift from $\mu_0$ to $\mu_1$. The $T^2$, MEWMA and MCUSUM

charts consider only the magnitude of any shift and not its direction. Hence, they use

only an upper control limit (UCL). If a statistic exceeds the UCL, the chart is said

to ‘signal’, and the process should be investigated to determine if the signal is due to

an error in the data, is indicative of a genuine shift, or simply the result of natural

variability. Univariate charts and the raw data should be inspected to determine the

variable(s) responsible for the signal and whether it is associated with a change in

patterns of use of radiation or variation in imaging equipment performance. In terms

of our case study, a signal would correspond to increased levels of radiation exposure.

From a clinical governance perspective, deviations from a stable process must be identified

as quickly as possible, while limiting the occurrence of false alarms. The performance

of a control chart is described in terms of the average run length (ARL), that is, the average

number of observations monitored before the chart ‘signals’. For each chart, the choice

of the signal threshold is a trade-off between the ARL when the process is in-control

(ARL0) and when the process is out-of-control (ARL1). Under ideal circumstances the

chart should have a very low false alarm rate (long ARL0) while rapidly detecting true

changes (short ARL1).

Before we can construct a multivariate chart, we need to deal with any missing data.

One solution is to use imputation, methods of which include multiple imputation (Ru-

bin, 1987), insertion of the sample mean, and regression-based imputation.

Multiple imputation is preferred because it preserves variability in the missing values

and performs well for small sample sizes and/or large proportions of missing data.

Generally speaking, it involves creating 3 ≤ r ≤ 10 complete datasets, performing

analyses on each of these datasets, and then combining the results. For each variable

with missing values, it is necessary to construct r imputation models. In practice, most


researchers will not need to develop these models directly, since software, such as NORM

(Schafer; Schafer and Olsen, 1998), is available that performs multiple imputation.

The rates of missing data for D, T and F are 3.0%, 1.4%, and 1.5%, respectively.

Instead of constructing control chart statistics for multiple datasets and then combining

the results, we use multiple imputation to create five observations for each missing value,

and we impute the average of these five values to create a single complete dataset.

5.3.3 Control chart construction

We have chosen to concentrate on the $T^2$, MEWMA and MCUSUM charts because they

are extensions of univariate charts commonly used to monitor clinical data, namely the

Shewhart, the exponentially weighted moving average (EWMA) and the cumulative

sum (CUSUM) charts. In what follows, we briefly describe how these charts are con-

structed. A more detailed explanation can be found in Montgomery (2008). See also

Bersimis et al. (2007) for a more comprehensive survey of research into multivariate

charts.

The $T^2$ chart plots $T^2_i = (X_i - \mu_0)'\Sigma^{-1}(X_i - \mu_0)$ (Hotelling, 1947). If $\mu_0$ and $\Sigma$ have

been specified by management or estimated using a sufficiently large sample (in excess

of 100 observations), then the UCL is $\chi^2_{\alpha,p}$ (Seber, 1984). If they have been estimated

using a “small” sample, the UCL depends upon whether the researcher is performing a

retrospective (Phase I) analysis or wants to monitor future values (Phase II). The Phase

I and II UCLs are $\beta_{\alpha,p/2,(n-p-1)/2}\,[(n-1)^2/n]$ and $F_{\alpha,p,n-p}\,[p(n+1)(n-1)/(n^2-np)]$,

respectively (Tracy et al., 1992), where $n$ is the sample size.
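As an illustration of the statistic itself, the following Python sketch evaluates $T^2$ for the transformed tenth record using the in-control parameters stated in Section 5.3.2. It is not the Matlab code used in the study; the helper names `solve` and `t2_statistic` are hypothetical, and the linear system is solved by plain Gaussian elimination so that only the standard library is needed.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def t2_statistic(x, mu0, sigma):
    """Hotelling's T^2 = (x - mu0)' Sigma^{-1} (x - mu0) for one observation."""
    d = [xi - mi for xi, mi in zip(x, mu0)]
    w = solve(sigma, d)  # w = Sigma^{-1} d, without forming the inverse explicitly
    return sum(di * wi for di, wi in zip(d, w))


# In-control parameters and transformed tenth record, as given in the text.
mu0 = [9.5, 0.55, 586.0]
sigma = [[0.2, -0.03, 23.8],
         [-0.03, 0.04, -6.0],
         [23.8, -6.0, 14882.0]]
x_star_10 = [10.3, 0.39, 622.0]

t2_10 = t2_statistic(x_star_10, mu0, sigma)
print(round(t2_10, 2))
```

The resulting value would be compared with whichever UCL applies to the chosen phase and sample size.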

Construction of the MEWMA chart (Lowry et al., 1992) requires specification of a

weight $\lambda$, $0 < \lambda \leq 1$, that is used to assign importance to observations, with recent

observations being weighted more heavily than observations more distant in time.

Letting $Z_i = \lambda(X_i - \mu_0) + (1 - \lambda)Z_{i-1}$, where $Z_0 = 0$, it plots $Z_i'\Sigma_{Z_i}^{-1}Z_i$, where

\[
\Sigma_{Z_i} = \frac{\lambda}{2 - \lambda}\left[1 - (1 - \lambda)^{2i}\right]\Sigma.
\]

Prabhu and Runger (1997) determined the optimal weight

and corresponding UCL for selected combinations of $p$, the size of the shift to be detected,

and the desired ARL0. For cases not considered, the UCL can be obtained

through simulation in order to achieve a desired ARL0.
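The MEWMA recursion can be sketched in Python as follows. The sketch assumes that the inverse of $\Sigma$ has already been computed and is passed in as `sigma_inv`; the function names and the example observations are hypothetical, and an identity covariance matrix is used only to keep the example short.

```python
def quad_form(v, sigma_inv):
    """Return v' sigma_inv v."""
    size = len(v)
    return sum(v[r] * sigma_inv[r][c] * v[c]
               for r in range(size) for c in range(size))


def mewma_statistics(observations, mu0, sigma_inv, lam):
    """MEWMA statistic Z_i' Sigma_{Z_i}^{-1} Z_i for each observation in turn."""
    z = [0.0] * len(mu0)  # Z_0 = 0
    stats = []
    for i, x in enumerate(observations, start=1):
        z = [lam * (xj - mj) + (1.0 - lam) * zj
             for xj, mj, zj in zip(x, mu0, z)]
        # Sigma_{Z_i} = scale * Sigma, so its inverse is sigma_inv / scale.
        scale = lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * i))
        stats.append(quad_form(z, sigma_inv) / scale)
    return stats


# Hypothetical standardised observations with an identity covariance matrix.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
obs = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0], [2.0, 1.0, -1.0]]
stats = mewma_statistics(obs, [0.0, 0.0, 0.0], identity, lam=0.1)
print([round(s, 3) for s in stats])
```

Setting `lam=1.0` removes the memory of the chart, in which case each statistic coincides with the corresponding $T^2$ value.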

Several versions of the MCUSUM chart have been proposed (Crosier, 1988). We consider

a version proposed by Crosier (1988). It plots $(L_i'\Sigma^{-1}L_i)^{1/2}$, where

\[
L_i =
\begin{cases}
0, & \text{if } C_i \leq k \\
(L_{i-1} + X_i - \mu_0)(1 - k/C_i), & \text{if } C_i > k
\end{cases}
\]

and $C_i = \left((L_{i-1} + X_i - \mu_0)'\Sigma^{-1}(L_{i-1} + X_i - \mu_0)\right)^{1/2}$. Crosier (1988) recommended

setting $L_0 = 0$ and $k = (\delta'\Sigma^{-1}\delta)^{1/2}/2$, where $\delta = \mu_1 - \mu_0$ is the shift to be detected, and we follow the

convention of resetting the MCUSUM chart following a signal. The UCL is calculated by simulation in order to

achieve a desired ARL0.
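A minimal Python sketch of this recursion is given below. As before, the inverse covariance matrix is assumed to be available as `sigma_inv`, the function names and example data are hypothetical, and the optional `ucl` argument implements the reset-after-signal convention mentioned above.

```python
import math


def quad_form(v, sigma_inv):
    """Return v' sigma_inv v."""
    size = len(v)
    return sum(v[r] * sigma_inv[r][c] * v[c]
               for r in range(size) for c in range(size))


def mcusum_statistics(observations, mu0, sigma_inv, k, ucl=None):
    """Crosier-type MCUSUM statistic (L_i' Sigma^{-1} L_i)^{1/2} for each observation."""
    l_prev = [0.0] * len(mu0)  # L_0 = 0
    stats = []
    for x in observations:
        s = [lj + xj - mj for lj, xj, mj in zip(l_prev, x, mu0)]
        c = math.sqrt(quad_form(s, sigma_inv))
        if c <= k:
            l_new = [0.0] * len(mu0)
        else:
            l_new = [sj * (1.0 - k / c) for sj in s]
        y = math.sqrt(quad_form(l_new, sigma_inv))
        stats.append(y)
        # Restart from zero after a signal if a UCL has been supplied.
        signalled = ucl is not None and y > ucl
        l_prev = [0.0] * len(mu0) if signalled else l_new
    return stats


# Hypothetical standardised observations with an identity covariance matrix.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
obs = [[3.0, 4.0, 0.0], [0.0, 0.0, 0.0], [0.1, 0.0, 0.0]]
stats = mcusum_statistics(obs, [0.0, 0.0, 0.0], identity, k=0.5)
print([round(s, 3) for s in stats])
```

Because $L_i$ is a rescaled version of $L_{i-1} + X_i - \mu_0$, the plotted statistic equals $\max(0, C_i - k)$, which gives a convenient check on any implementation.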

Statistics were generated for all 884 records in the case study dataset using code

written in Matlab. In Figures 5.1 to 5.3 we show the $T^2$, MEWMA and MCUSUM

charts for procedures performed in November 2005. The figures were produced using

R (http://www.r-project.org). We discuss chart interpretation in Section 5.4.1.

Figure 5.1 Hotelling’s $T^2$ chart for the simultaneous monitoring of D, T and F for females undergoing a CA in November 2005 (horizontal axis: observation; vertical axis: $T^2$ statistic).

Figure 5.2 MEWMA chart for the simultaneous monitoring of D, T and F for females undergoing a CA in November 2005 (horizontal axis: observation; vertical axis: MEWMA statistic).

Figure 5.3 MCUSUM chart for the simultaneous monitoring of D, T and F for females undergoing a CA in November 2005 (horizontal axis: observation; vertical axis: MCUSUM statistic).

5.3.4 Outline of simulation study

The simulation study considers the monitoring of $X_i = [V_1\ \ V_2\ \ V_3]'$, where the correlation

between variables $V_i$ and $V_j$, denoted $\rho_{ij}$, takes the values 0, 0.2 and 0.8. To study

ARL0, we simulate data from $N_3(0, \Sigma)$, where $\Sigma$ is in correlation form. To study

ARL1, we generate data such that the first 50 records are drawn from an in-control

process, and the remaining records are drawn from $N_3(\delta, \Sigma)$, where $\delta = [\delta_1\ \ \delta_2\ \ \delta_3]' \neq 0$.

We let δi be 0, 0.5 or 2, and we allow for shifts of different magnitudes amongst the

variables. Data are generated such that Vi is missing with probability γ, where γ

= 0, 0.05 or 0.2, subject to the constraint that a record cannot have all three of its

observations missing. When γ > 0, a complete dataset is then obtained by multiple

imputation, imputation of the mean, or regression-based imputation. When the process

is stable, the run length is the number of records before the chart signals. When the

process is unstable, the run length is the number of out-of-control records before the

chart signals. Run lengths for 10000 datasets are averaged to produce ARLs.

Parameters for the multivariate charts have been chosen such that ARL0 is approximately

200. The UCL of the $T^2$ chart is 12.85. We use λ = 0.1 and a UCL of 10.97 for

the MEWMA chart, and k = 0.5 and a UCL of 6.88 for the MCUSUM chart.
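The link between the $T^2$ limit of 12.85 and an ARL0 of about 200 can be checked with a small Python sketch. It makes several simplifying assumptions that are not part of the study design: the variables are uncorrelated ($\Sigma$ is the identity), only 2000 runs are averaged rather than 10000, no data are missing, and the shift $\delta = [2\ 0\ 0]'$ is present from the first record. The function names are hypothetical. For known parameters the in-control $T^2$ statistic is chi-square distributed with 3 degrees of freedom, whose upper tail has the closed form used below.

```python
import math
import random

UCL_T2 = 12.85  # T^2 limit quoted in the text


def chi2_3_tail(x):
    """P(chi-square with 3 d.f. > x); this closed form holds for 3 d.f. only."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


def run_length(delta, rng, ucl=UCL_T2, max_n=100_000):
    """Number of records until T^2 exceeds ucl for N3(delta, I) data."""
    for n in range(1, max_n + 1):
        t2 = sum((rng.gauss(0.0, 1.0) + d) ** 2 for d in delta)
        if t2 > ucl:
            return n
    return max_n


def average_run_length(delta, n_runs, seed=1):
    """Average of n_runs simulated run lengths."""
    rng = random.Random(seed)
    return sum(run_length(delta, rng) for _ in range(n_runs)) / n_runs


arl0_exact = 1.0 / chi2_3_tail(UCL_T2)                          # in control, exact
arl0_sim = average_run_length((0.0, 0.0, 0.0), n_runs=2000)     # in control, simulated
arl1_sim = average_run_length((2.0, 0.0, 0.0), n_runs=2000)     # after a large shift

print(round(arl0_exact, 1), round(arl0_sim, 1), round(arl1_sim, 1))
```

The same simulation loop, with the chart statistic replaced, is the kind of procedure that can be used to tune the MEWMA and MCUSUM limits to a target ARL0.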

In addition to creating multivariate charts, we construct separate univariate charts for

each variable. For a particular univariate chart, we define the “overall” run length to

be the shortest of the three run lengths. For each Shewhart chart, we use the standard

lower and upper control limits of −3 and 3, respectively. For each EWMA chart, we

use λ = 0.1 and L = 2.814. Given these values, the control limits for the $i$th observation

are $\pm 2.814\sqrt{0.1\,[1 - 0.9^{2i}]/1.9}$. For each CUSUM chart we use K = 0.5 and a UCL of 5.
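The time-varying EWMA limits can be evaluated directly. The short Python sketch below assumes standardised data (unit in-control variance); the function name `ewma_limit` is hypothetical.

```python
import math


def ewma_limit(i, lam=0.1, width=2.814):
    """Half-width of the EWMA control limits at observation i (standardised data)."""
    return width * math.sqrt(lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * i)))


for i in (1, 5, 20, 100):
    print(i, round(ewma_limit(i), 4))
```

The limits widen with $i$ and settle at $2.814\sqrt{0.1/1.9}$ once the term $0.9^{2i}$ becomes negligible.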

5.4 Results

5.4.1 Case study

The $T^2$ chart (Figure 5.1) registered out-of-control signals at times 17, 25 and 34.

The MEWMA chart first signalled at time 37 and the MCUSUM chart signalled at

time 38 (Figures 5.2 and 5.3). Since the $T^2$ statistic only uses information from the

current observation, investigation into the causes of the $T^2$ signals requires inspection

of only the 17th, 25th and 34th records. In contrast, when interpreting MEWMA and

MCUSUM signals, we should also consider records preceding a signal.


To help identify the causes of signals, we constructed Shewhart, EWMA and CUSUM

charts for each variable. The Shewhart chart for D suggested that high values of dose

area product may be the cause of the signals observed in the $T^2$ chart. Moreover,

$X_{17} = [50011\ \ 5.26\ \ 506]$, $X_{25} = [67508\ \ 2.27\ \ 803]$ and $X_{34} = [84965\ \ 7.14\ \ 816.2]$.

In each case, the dose area product is considerably higher than the stable mean of

exp(9.5) = 13360 mGy·cm². The MEWMA and MCUSUM signals are perhaps due to

the high dose area product in X34, but it is also interesting to note that the EWMA

chart for F exhibited an increasing trend starting at observation 20. T was stable

during November 2005. There were no false alarms associated with any of the EWMA

or CUSUM charts, or with the Shewhart charts used to monitor D and T . However,

the Shewhart chart for F generated a false alarm at time 14. The number of frames

(955) is reasonable when considered simultaneously with the dose area product of 28283

mGy·cm² and the fluoroscopy time of 3.7 minutes. Moreover, an increasing trend in F

does not begin until several observations later. As an aside, 6 of the 44 records from

November 2005 required imputation. F was imputed in X9, X23, X24 and X34, T was

imputed in X26, and D was imputed in X27.

Starting around May 2006, there was a sharp increase in the number of signals across

all types of charts, both multivariate and univariate. This was due to a change in

equipment and processes used at St Andrew’s War Memorial Hospital. If monitoring

the situation in “real time”, we would have allowed the process to stabilise, before re-

estimating µ0 and Σ to determine parameters appropriate for the changed conditions.

5.4.2 Simulation study

In this section we summarise broad trends. Tables of results for all charts, combi-

nations of parameters, and types of imputation are available upon request from the

corresponding author.

In keeping with our choice of parameters, if γ = 0, we expect a false alarm every 200

records, on average. As the amount of missing data, and hence imputation, increases,

so too does ARL0. For example, when γ = 0.2 and multiple imputation is used, ARL0

is approximately 300 for the $T^2$ chart, and approximately 400 for the MEWMA and

MCUSUM charts. For almost all scenarios considered, using separate univariate charts

resulted in a quicker false alarm than using the multivariate counterpart.

Regardless of correlation structure, the MEWMA and MCUSUM charts detect small to

moderate shifts significantly more quickly than the $T^2$ chart. For small shifts, the MCUSUM

chart marginally outperforms the MEWMA chart. For large shifts, the performance of

all charts is essentially the same.

Figure 5.4 Plot of ARL1 versus $\|\delta\|$ for the $T^2$, MEWMA and MCUSUM charts, given $\rho_{12} = \rho_{13} = \rho_{23} = 0.2$. Results are shown for the cases where no data are missing (γ = 0) and when γ = 0.2. In the latter case, multiple imputation (MI) has been used to impute missing values.

The charts are slower to detect changes as the amount of imputation performed increases.

However, the effect is negligible for large shifts. The $T^2$ chart is most affected

by imputation. The MEWMA and MCUSUM charts are affected to the same extent.

Figure 5.4 is representative of the trends described above. It plots ARL1 versus the

shift size, as summarised by $\|\delta\| = \sqrt{\delta_1^2 + \delta_2^2 + \delta_3^2}$.

When there is little or no correlation between the variables, using multiple Shewhart

charts detects small shifts more quickly than a $T^2$ chart. In contrast, the MEWMA

chart detects small shifts almost as quickly as multiple EWMA charts, and the MCUSUM

chart is actually superior to multiple CUSUM charts, even when the variables are com-

pletely uncorrelated. When the variables are highly correlated, the multivariate charts


tend to detect an unstable process more quickly than multiple univariate charts. The

multivariate charts are slower when the means of all three variables have shifted by the

same amount.

If at least one pair of variables is highly correlated, small shifts are detected most quickly

when the average has been imputed. However, under these conditions, imputing the

average produces the worst ARL0. For example, when γ = 0.2, ρ12 = 0.2, ρ13 = 0.2

and ρ23 = 0.8, ARL0 is 160 if the average is imputed. When multiple imputation and

regression-based imputation are used, ARL0 is 422 and 510, respectively. If a large

amount of imputation is required, multiple imputation tends to produce better results

than regression-based imputation, but the choice of imputation technique is largely

irrelevant if interest lies in detecting moderate to large shifts.

5.5 Discussion

When there is more than one quality characteristic to be monitored, we advise using

multivariate charts to avoid excessive false signals associated with using separate uni-

variate charts. Of the charts considered in this paper, the MCUSUM chart showed

the best overall performance. However, it is only marginally superior to the MEWMA

chart, with differences becoming negligible for moderate to large shifts. Indeed, many

clinicians may feel that the ability of the MCUSUM chart to detect small shifts more quickly

than the MEWMA chart is outweighed by the increased complexity of its construction.

This is especially pertinent given that many statistical software packages do not include

an in-built function for creating MCUSUM charts. We strongly recommend against relying

on the $T^2$ chart. However, if the data follow a multivariate normal distribution,

then the $T^2$ chart can be used in a supplementary manner for the purposes of quickly

detecting large shifts, as demonstrated in our case study.

Our case study highlighted some standard transformations that can be used when data

do not follow a normal distribution. If multivariate normality is questionable and

transformations prove unsatisfactory, then the MEWMA chart should be used. In this

case, the clinician should follow the design recommendations in Stoumbos and Sullivan

(2002) or Testik et al. (2003).

In addition to having a skewed distribution, dose area product is strongly related to

patient weight, with heavier patients exposed to higher levels of radiation, on average.

As such, what is considered a “normal” dose area product depends upon the patient’s

size. Since weight was only recorded for approximately 4% of the patients in the case

study dataset, it was not feasible to use it as another covariate in our charts. Instead,

we used gender as a surrogate measure of weight, constructing charts for only the

females. Partitioning records in this way is useful if the distribution of one or more of

the quality characteristics is multimodal. If setting up an ongoing multivariate chart

for the case study, in the absence of weight data, we would continue to monitor records

for males and females separately.

Multivariate charts are known to perform well for a moderate number of variables.

However, as the number of variables increases they become less efficient in detecting

shifts. If more than ten variables are to be monitored, we recommend using principal

components analysis to reduce the dimensionality of the problem. Details on this

procedure with respect to control charts can be found in Bersimis et al. (2007).

Supplementary univariate charts can be used to investigate the cause(s) of multivariate

signals. They can be used to examine the behavior of individual variables and

to identify the direction of any shift(s). If the signal is associated with an improve-

ment, management should make efforts to maintain whatever procedures precipitated

the change. If a signal suggests an undesirable change, investigations should be con-

ducted to determine whether the signal is a result of a genuine deterioration in the

process, a mistake in the data collection process, or the result of natural variability.

If data are missing, we recommend using multiple imputation to create complete

datasets, particularly if a large amount of imputation is required. In our study, we

imputed the average of five observations for each missing value to create a single com-

plete dataset. An alternative approach would be to use multiple imputation to create

five distinct datasets, calculate statistics for each dataset, and plot the average of the

statistics. This method is more difficult to implement and interpretation of signals

becomes more complicated.


We advise caution when using imputation based on the sample mean or regression

because these methods artificially reduce variability and may also distort relationships

between variables. We strongly advise against deleting incomplete records. Considering

our case study, if X34 had been discarded on account of the number of frames being

missing, then the unusually large dose area product associated with that record would

have been overlooked.
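The reduction in variability caused by mean imputation is easy to demonstrate. In the Python sketch below, the observed values and the number of missing entries are hypothetical; filling each gap with the sample mean leaves the sum of squared deviations unchanged while increasing the sample size, so the sample variance necessarily shrinks.

```python
import statistics

observed = [9.1, 9.4, 9.6, 9.9, 10.3, 8.8, 9.5, 10.0]  # hypothetical observed values
n_missing = 2                                           # hypothetical number of gaps

mean_observed = statistics.mean(observed)
mean_imputed = observed + [mean_observed] * n_missing   # fill each gap with the mean

var_observed = statistics.variance(observed)
var_imputed = statistics.variance(mean_imputed)
print(round(var_observed, 4), round(var_imputed, 4))
```

In this example the variance is multiplied by 7/9, the ratio of the degrees of freedom before and after imputation.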

Because imputation reduces the amount of variability in the data, it has the effect

of increasing both ARL0 and ARL1. While the former is not considered a problem,

a higher ARL1 is less desirable. This effect can be countered somewhat by setting a

lower UCL than would be used for a dataset with no missing values.

When estimating Σ in the case study, we used the sample covariance matrix. An

alternative approach uses the differences between successive vectors of observations

(Holmes and Mergen, 1993). This is analogous to using the moving range to estimate

the standard deviation. In this case the estimator of $\Sigma$ is $S = \frac{1}{2(n-1)} V V'$, where $V$ is a

$p \times (n-1)$ matrix, the $i$th column of which is given by $X_{i+1} - X_i$, for $i = 1, 2, \ldots, n-1$.

If successive observations are independent, this estimator should be used because it

results in a chart that is better able to detect sustained shifts in the mean vector

(Sullivan and Woodall, 1996). It should not be used if observations are autocorrelated

because it results in a large false alarm rate (Capilla, 2009).
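The successive-differences estimator can be sketched in a few lines of Python. The function name and the example data are hypothetical; the second example is a univariate series with a sustained shift half-way through, included only to show that the estimator is much less inflated by such a shift than the ordinary sample variance would be.

```python
def successive_difference_covariance(observations):
    """Return S = V V' / (2 (n - 1)), where column i of V is X_{i+1} - X_i."""
    n = len(observations)
    p = len(observations[0])
    diffs = [[b[j] - a[j] for j in range(p)]
             for a, b in zip(observations, observations[1:])]
    return [[sum(d[r] * d[c] for d in diffs) / (2.0 * (n - 1))
             for c in range(p)] for r in range(p)]


# Small bivariate example with constant differences [1, 2].
s_small = successive_difference_covariance([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]])
print(s_small)

# Univariate series with a sustained shift after the fourth observation.
shifted = [[0.0], [1.0], [0.0], [1.0], [10.0], [11.0], [10.0], [11.0]]
s_shift = successive_difference_covariance(shifted)[0][0]
print(round(s_shift, 3))
```

For the shifted series the ordinary sample variance is 202/7, roughly 28.9, whereas the successive-differences estimate is 87/14, roughly 6.2.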

We considered continuous clinical data. If the institution wants to monitor multiple

discrete quality characteristics, a multi-attribute chart should be used instead. There

has been less research into multi-attribute charts, but the interested reader is referred

to Patel (1973), Lu et al. (1998) and Skinner et al. (2003).

Acknowledgements

The authors thank Dr Tony Morton for helpful discussions.

Funding Support

This research was funded by an Australian Research Council Linkage Project.

Bibliography

Bersimis, S., Psarakis, S., and Panaretos, J. (2007). Multivariate statistical pro-

cess control charts: an overview. Quality and Reliability Engineering International,

23(5):517–543.

Capilla, C. (2009). Application and simulation study of the Hotelling’s $T^2$ control chart

to monitor a wastewater treatment process. Environmental Engineering Science,

26(2):333–342.



Holmes, D. and Mergen, A. (1993). Improving the performance of the $T^2$ control chart.

Quality Engineering, 5(4):619–625.





Lu, X., Xie, M., Goh, T., and Lai, C. (1998). Control chart for multivariate attribute

processes. International Journal of Production Research, 36(12):3477–3489.


Patel, H. (1973). Quality control methods for multivariate binomial and Poisson dis-

tributions. Technometrics, pages 103–112.

Prabhu, S. and Runger, G. (1997). Designing a multivariate EWMA control chart.


Qiu, P. and Hawkins, D. (2003). A nonparametric multivariate cumulative sum pro-

cedure for detecting shifts in all directions. Journal of the Royal Statistical Society:

Series D (The Statistician), 52(2):151–164.

Rubin, D. (1987). Multiple Imputation for Nonresponse in Surveys. Wiley Online

Library.

Schafer, J. Norm: Multiple imputation of incomplete multivariate data under a normal

model, version 2. http://www.stat.psu.edu/~jls/misoftwa.html.


Schafer, J. and Olsen, M. (1998). Multiple imputation for multivariate missing-data

problems: A data analyst’s perspective. Multivariate Behavioral Research, 33(4):545–

571.

Seber, G. (1984). Multivariate Observations. Wiley Online Library.

Skinner, K., Montgomery, D., and Runger, G. (2003). Process monitoring for multiple

count data using generalized linear model-based control charts. International Journal

of Production Research, 41(6):1167–1180.






Sullivan, J. and Woodall, W. (1996). A comparison of multivariate control charts for


Testik, M., Runger, G., and Borror, C. (2003). Robustness properties of multivariate

EWMA control charts. Quality and Reliability Engineering International, 19(1):31–

38.



CHAPTER 6

Bayesian Change Point Estimation in Poisson

Based Control Charts

Preamble

Any enhancement in the quality of a process outcome is gained through quick detection

of an out-of-control state of the process and investigation of the potential causes of such

shifts. This stage is then followed by the implementation of preventive and

corrective actions. Recently, the need to know the time at which a process began to

vary, the so-called change point, has been raised and discussed in the industrial context

of quality control. Accurate estimation of the time of change makes the search for

a potential cause more efficient, since a tighter time-frame prior to the control chart

signal is investigated. Several methods, including maximum likelihood estimators (MLEs) and data mining

techniques such as neural networks and fuzzy clustering, have been proposed and investigated

for various processes involving single variables, multiple variables and profile

monitoring.

An overview of the related literature revealed that the capabilities of the Bayesian

framework in this area of research have so far been overlooked. In a Bayesian setting, the

results obtained from the model are highly informative and can contribute directly to

the decisions made in root cause analysis. Moreover, this approach, along with computational

techniques such as MCMC, simplifies modeling of the change point for complex

processes and scenarios and avoids much of the analytical difficulty.

In this study, Bayesian estimation of change points following detected changes in a Poisson

process monitored by control charts was considered. To this end, Bayesian hierarchical

models were constructed in the presence of a step change, multiple changes (where

the number of changes is known) and a linear drift in the process mean. Simulation results

showed that more accurate estimates of the time of change can be obtained when the

Bayesian estimators are used in conjunction with Poisson based control charts. These

estimates were also supported by probabilistic inferences for the time and the magnitude

of changes. In comparison with alternative estimators, namely MLE and built-in estimators,

the Bayesian estimator performed reasonably well and remains a strong alternative.

Its advantages are further enhanced when other criteria, such as probability

quantification through credible intervals and probabilistic inferences, flexibility, generalization

and simplicity, are taken into account.

This study was mainly motivated by the monitoring of radiation instruments, using

Poisson based control charts in a local hospital, with the aim of planning preventive

maintenance across medical instruments that influence clinical outcomes. For the sake of

confidentiality, datasets reflecting the real data were generated in the simulation study. The simulated datasets had the added

advantage of allowing more explicit examination of the performance of the methods

developed, as well as extension of the results to other Poisson processes within and outside

the health sector.

The focus of this chapter is on the second objective of the thesis, mainly goal 3, in

which facilitation of root cause analysis through change point estimation is sought. This

chapter contributes to methodology: using a Bayesian framework and computational

components, change point estimators were designed to estimate the time of a step change,

a linear trend and multiple changes prior to signals from Poisson control charts.







certified that:



field of expertise;






unit; and





Assareh, H., Noorossana, R. and Mengersen, K. (2011) Bayesian change point estima-

tion in Poisson based control charts, Computer and Industrial Engineering, submitted.


H. Assareh Conception and conduct research, design and implement statistical analysis, write code, write manuscript, make modifications to manuscript as suggested by co-authors and reviewers

Signature & Date:

R. Noorossana Conception, comments on manuscript





6.1 Abstract

Precise identification of the time when a process has changed enables process engi-

neers to search for a potential special cause more effectively. In this paper, we develop

change point estimation methods for a Poisson process in a Bayesian framework. We

apply Bayesian hierarchical models to formulate the change point where there exists

a step change, a linear trend and a known number of multiple changes in the Poisson

rate. Markov chain Monte Carlo is used to obtain posterior distributions of the change

point parameters and corresponding probabilistic intervals and inferences. The performance

of the Bayesian estimator is investigated through simulations, and the results

show that precise estimates can be obtained when it is used in conjunction with

the well-known c-, Poisson EWMA and Poisson CUSUM control charts for different

change type scenarios. We also apply the Deviance Information Criterion as a model

selection criterion in the Bayesian context, to find the best change point model for a

given dataset where there is no prior knowledge about the change type in the pro-

cess. In comparison with built-in estimators of EWMA and CUSUM and MLE based

estimators, the Bayesian estimator performs reasonably well and remains a strong alternative.

These advantages are enhanced when probability quantification, flexibility

and generalizability of the Bayesian change point detection model are also considered.

6.2 Introduction

Statistical process control charts are used to detect changes in a process by distinguish-

ing between assignable causes and common causes of the process variation. When a

control chart signals, process engineers initiate a search to identify and eliminate the

source of variation. Knowing the time at which the process began to vary, the so-called

change point, would help to conduct the search more efficiently in a tighter time-frame.

A Poisson process is often used to model the number of occurrences in an interval of

time. In this regard, Poisson based control charts have been developed and frequently

applied in an industry context to monitor the number of defects and nonconformities

in a product (Gardiner, 1987; White et al., 1997), and in a health context to monitor


patient mortality and spread of an infection in a hospital (Benneyan, 1998; Limayea

et al., 2008). The most commonly used control chart procedures adopted for Poisson

distributed data include c-charts (Shewhart, 1926, 1927), CUSUM (Page, 1954, 1961;

Brook and Evans, 1972), and EWMA (Roberts, 1959; Trevanich and Bourke, 1993;

Borror et al., 1998); see Woodall (1997) and Montgomery (2008) for more details.

The motivation of this study arose from monitoring radiation instruments using Poisson

based control charts in a local hospital, St Andrew’s War Memorial Hospital (SAWMH),

Brisbane, Australia. This monitoring was a part of an ongoing quality improvement

program that aims to plan preventive maintenance across medical instruments influenc-

ing clinical outcomes. In this program all inspections, instrument status, repairs and

associated costs are recorded and monitored. Since hospital data are commonly confi-

dential and subject to other pressures, simulated datasets were generated which reflect

the main features of the real data. The simulated datasets have the added advantage

of allowing more explicit examination of the performance of the methods developed in

this paper.

It has been shown that Poisson CUSUM and Poisson EWMA charts are more sensitive

for detecting small shifts in the process parameters whereas a c-chart still remains

efficient for detection of large shifts (Montgomery, 2008). However, upon signaling,

none of them provide specific information regarding the time at which the process

changed and the magnitude and the type of the change.

There exists a built-in change point estimator in CUSUM charts suggested by Page

(1954) and also an equivalent estimator in EWMA charts proposed by Nishina (1992).

Samuel et al. (1998) developed and applied a maximum likelihood estimator (MLE) for

the change point in a c-chart assuming that the change type is a step change. They

demonstrated how closely this new estimator estimates the change point in comparison

with the usual c-chart signal.

Perry (2004) evaluated the performance of the MLE estimator and reported that it

outperforms Poisson CUSUM and Poisson EWMA built-in estimators if a step change

is present. He also constructed a confidence set on the estimated change point which

covers the true process change point with a given level of certainty using a likelihood

function based upon the method proposed by Box and Cox (1964). Perry et al. (2006)
then derived an MLE and a confidence set under a linear trend assumption,
in which the process parameter changes gradually over time. This type of change is common and
can be caused, for example, by tool wear or by improvement in an operator’s skill over
time. They showed that this estimator is superior to the step change estimator if a linear trend
disturbance occurs in the Poisson rate.

The underlying assumption that the form of the change is known was challenged, and
an MLE was derived for non-decreasing multiple step change points using
isotonic regression models (Perry et al., 2007). The estimator was reported to be a reasonable
alternative for some magnitudes of the step and linear trend disturbances. In the case
of multiple change points, it was shown to be the superior estimator.

An interesting approach which has only recently been considered in the SPC context is

Bayesian hierarchical modelling (BHM) using, where necessary, computational methods

such as Markov Chain Monte Carlo (MCMC). Application of these theoretical and

computational frameworks to change point estimation provides a way of making a set

of inferences based on posterior distributions for the time and the magnitude of a change

as well as assessing the validity of underlying assumptions in the change point model

itself (Gelman et al., 2004).

In this paper we model and estimate the change point in a Bayesian framework. We

first model and estimate change points assuming that the underlying change type is

known. In this scenario the change type is in the form of a step change, a linear trend

and a multiple change with known number of changes respectively. For each model

we analyze and discuss the performance of the Bayesian change point model through

posterior estimates and probability based intervals. The three models are demonstrated

and evaluated in Sections 6.3-6.5, and then compared with respect to goodness of fit in

Section 6.6. We then compare the Bayesian estimator with MLE based estimators and

others in Section 6.7 and summarize the study and obtained results in Section 8.5.


6.3 Bayesian Poisson Process Step Change Model

6.3.1 Model






In a Bayesian framework, inference is based on the relation

Posterior ∝ Likelihood × Prior, (6.1)

where “Prior” is the state of knowledge about the quantity of interest, expressed as a
probability distribution before the data are observed; “Likelihood” is a model underlying
the data; and “Posterior” is the state of knowledge about the quantity after the data
are observed, also expressed as a probability distribution. This structure can be
extended to multiple levels in a hierarchical fashion, giving so-called Bayesian hierarchical
models (BHM), which enrich the model by capturing all kinds of uncertainty
in the observed data as well as in the priors. In complicated BHMs it is not easy to obtain the
posterior distribution analytically. This analytic bottleneck has been removed by the
emergence of Markov chain Monte Carlo (MCMC) methods. In MCMC algorithms
a Markov chain (in its simplest form, a random walk) is constructed whose stationary
distribution is the posterior distribution of the parameters. Samples generated from a long
run of the Markov chain using a proposal transition density are treated as draws from the posterior
distributions of interest. Common MCMC methods for drawing samples include
Metropolis-Hastings and the Gibbs sampler; see Gelman et al. (2004) for more details.
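The idea of a chain whose stationary distribution is the target can be illustrated with a minimal random-walk Metropolis sampler in Python. This is only a sketch under stated assumptions: the toy target (a normal density centred at 3), the function names and the tuning values are hypothetical and are not taken from the thesis, which used WinBUGS.

```python
import math
import random

def metropolis(log_target, start, n_iter, step_sd, rng):
    """Random-walk Metropolis sampler for a one-dimensional target density."""
    x = start
    log_p = log_target(x)
    draws = []
    for _ in range(n_iter):
        proposal = x + rng.gauss(0.0, step_sd)      # symmetric proposal
        log_p_new = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(x)).
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
        draws.append(x)                             # repeat x if rejected
    return draws

def toy_log_target(x):
    # Hypothetical target: unnormalised log density of a N(3, 1) distribution.
    return -0.5 * (x - 3.0) ** 2

rng = random.Random(1)
draws = metropolis(toy_log_target, start=0.0, n_iter=20000, step_sd=1.0, rng=rng)
kept = draws[2000:]                                 # discard burn-in draws
posterior_mean = sum(kept) / len(kept)
```

After burn-in, the retained draws behave approximately like a sample from the target, so their average should lie close to 3.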

Consider a Poisson process Xt, t = 1, ..., T , that is initially in-control, with independent

observations coming from a Poisson distribution with a known rate λ0. At an unknown

point in time, τ , the Poisson rate parameter changes from its in-control state of λ0 to λ1,

λ1 = λ0 + δ, δ ≠ 0. The Poisson process step change model can thus be parameterized

as follows:

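$$
p(x_t \mid \lambda_t) =
\begin{cases}
\exp(-\lambda_0)\,\lambda_0^{x_t}/x_t! & \text{if } t = 1, 2, \ldots, \tau \\
\exp(-\lambda_1)\,\lambda_1^{x_t}/x_t! & \text{if } t = \tau + 1, \ldots, T
\end{cases}
\qquad (6.2)
$$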

Relating this to Equation 6.1, p(· | ·) is the likelihood that underlies the observations.
Posterior distributions of the time and the magnitude of a step change will be
constructed and investigated, as these are the unknown parameters of interest in the
change point analysis. Assume that the process Xt is monitored by a control chart
that signals at time T. We assign a normal distribution with mean 0 and standard
deviation 6√λ0 as a prior distribution for δ. This is a reasonably informative
prior for the magnitude of the change in an in-control Poisson rate, as the control chart
is sensitive enough to detect very large shifts and estimate associated change points.
Other distributions such as uniform or Gamma might also be of interest; see Gelman
et al. (2004) for more details on selection of prior distributions. We place a uniform
distribution on the range (1, T − 1) as a prior for τ. See the Appendix for the step
change model code in WinBUGS (Spiegelhalter et al., 2003). To avoid obtaining a
negative value for λ1 within MCMC, particularly when a drop has occurred, we added
a constraint that λ1 must be positive. Although other methods, such as modelling the
process on the log scale, may be of interest, we do not pursue these here, since doing so may
sacrifice simplicity and a direct representation of the process.
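For intuition, the step change posterior can also be approximated without MCMC by evaluating the prior times the likelihood on a grid. The Python sketch below does this under stated assumptions: τ is treated as discrete uniform on {1, ..., T − 1}, δ is placed on an evenly spaced grid restricted to λ0 + δ > 0 (mirroring the positivity constraint), and the function name, grid step and idealised data are all illustrative choices rather than the thesis's WinBUGS implementation.

```python
import math

def step_change_posterior(xs, lam0, delta_step=0.05):
    """Grid approximation to the step change posterior.

    Returns (tau_probs, tau_map, delta_map): the marginal posterior of tau
    on {1, ..., T-1} and the joint posterior mode of (tau, delta).
    """
    T = len(xs)
    prior_sd = 6.0 * math.sqrt(lam0)        # prior sd for delta, as in the text
    prefix = [0]                            # prefix[t] = x_1 + ... + x_t
    for x in xs:
        prefix.append(prefix[-1] + x)
    # Grid for delta on (-lam0, 2 * lam0], so that lam1 = lam0 + delta > 0.
    n_grid = int(round(3.0 * lam0 / delta_step))
    deltas = [-lam0 + delta_step * (j + 1) for j in range(n_grid)]

    log_marginal = []
    best = (-math.inf, None, None)
    for tau in range(1, T):
        # Log-likelihood up to a constant; the x_t! terms do not depend on
        # (tau, delta) and are dropped.
        pre = prefix[tau] * math.log(lam0) - tau * lam0
        s_post = prefix[T] - prefix[tau]
        n_post = T - tau
        vals = []
        for d in deltas:
            lam1 = lam0 + d
            vals.append(pre + s_post * math.log(lam1) - n_post * lam1
                        - 0.5 * (d / prior_sd) ** 2)
        top = max(vals)
        if top > best[0]:
            best = (top, tau, deltas[vals.index(top)])
        # Sum over the delta grid (log-sum-exp) to marginalise delta out.
        log_marginal.append(top + math.log(sum(math.exp(v - top) for v in vals)))

    top = max(log_marginal)
    weights = [math.exp(v - top) for v in log_marginal]
    total = sum(weights)
    tau_probs = {tau: w / total for tau, w in zip(range(1, T), weights)}
    return tau_probs, best[1], best[2]

# Idealised illustrative data (not from the thesis): 20 per sample, then 30.
xs = [20] * 100 + [30] * 20
tau_probs, tau_map, delta_map = step_change_posterior(xs, lam0=20.0)
```

With these noise-free data the posterior mass for τ concentrates on the 100th sample; real Poisson data would give a more dispersed, typically skewed, posterior.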

6.3.2 Evaluation

We used Monte Carlo simulation to study the performance of the constructed BHM
in step change estimation following a signal from c-, Poisson CUSUM and Poisson
EWMA control charts when a change is simulated to occur at τ = 100. We generated
100 observations of a Poisson process with an in-control rate of λ0 = 20. We then
induced step changes of sizes δ = {+2, +6} as an example and δ = {±2, ±6, ±15}
for a replication study, and continued sampling until the control charts signalled. Because we know that the
process is in-control during the first 100 observations, any out-of-control observation
generated in that period was taken as a false alarm and the simulation
was restarted. In practice, however, a false alarm may lead to stopping the process
and analyzing root causes; when no cause is found, the process continues without
adjustment. The simulation was also repeated for rate parameters of 5 and 10 over
equivalent step changes; since the results were similar to those obtained for λ0 = 20,
they are not reported here. This simulation model reflects the data obtained from the
monitoring program at SAWMH.
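The simulation scheme described above can be sketched in Python as follows. This is a simplified illustration for the c-chart only, assuming conventional three-sigma limits λ0 ± 3√λ0 (the limits are not stated explicitly in this section); the function names and the seed are hypothetical, and the thesis itself used R.

```python
import math
import random

def poisson_draw(lam, rng):
    """Knuth's multiplication method; adequate for the moderate rates used here."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_step_change(lam0, delta, n_in_control, rng):
    """Simulate in-control data, then a step change, until a c-chart signals."""
    ucl = lam0 + 3.0 * math.sqrt(lam0)               # assumed 3-sigma limits
    lcl = max(0.0, lam0 - 3.0 * math.sqrt(lam0))
    # In-control phase: restart whenever a false alarm occurs.
    while True:
        xs = [poisson_draw(lam0, rng) for _ in range(n_in_control)]
        if all(lcl <= x <= ucl for x in xs):
            break
    # Out-of-control phase: add observations until the chart signals.
    while True:
        xs.append(poisson_draw(lam0 + delta, rng))
        if not (lcl <= xs[-1] <= ucl):
            return xs, len(xs)                       # data and signal time T

rng = random.Random(2012)
xs, signal_time = simulate_step_change(lam0=20.0, delta=6.0, n_in_control=100, rng=rng)
```

The returned series ends at the signalling observation, so its length plays the role of T in the change point model.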

To construct control charts, we applied the procedures of Shewhart (1926, 1927), Brook and Evans (1972)
and Trevanich and Bourke (1993) for c-, Poisson CUSUM and Poisson
EWMA control charts, respectively. A Poisson CUSUM accumulates the difference
between an observed value and a reference value k through
$S_i^+ = \max\{0,\; x_i - k^+ + S_{i-1}^+\}$ and $S_i^- = \max\{0,\; k^- - x_i + S_{i-1}^-\}$,
where $k^+ = (\lambda_1^+ - \lambda_0)/(\ln \lambda_1^+ - \ln \lambda_0)$
and $k^- = (\lambda_0 - \lambda_1^-)/(\ln \lambda_0 - \ln \lambda_1^-)$.
If $S_i^\pm$ exceeds a specified decision interval
$h^\pm$, the control chart signals that an increase (a decrease) in the Poisson rate
has occurred. We calibrated the charts to detect a 25% shift in the Poisson rate and to have
an estimated in-control average run length ($\widehat{ARL}_0$) of approximately 370, close to that of a standard
c-chart; see Woodall and Adams (1993). The resultant Poisson CUSUM charts had
(k+, h+) = (22.4, 22) and (k−, h−) = (17.4, 14). For simplicity, the values were rounded
to one decimal place.
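As a check on the reference values quoted above, the Python sketch below computes k+ and k− from the stated formulas and runs the two-sided CUSUM recursion. The function names are illustrative, and reading the 25% shift as λ1± = λ0(1 ± 0.25) is an assumption, although it does reproduce the reported 22.4 and 17.4 for λ0 = 20.

```python
import math

def cusum_reference_values(lam0, shift=0.25):
    """Reference values k+ and k- for detecting a relative shift in the rate."""
    lam_up = lam0 * (1.0 + shift)
    lam_down = lam0 * (1.0 - shift)
    k_plus = (lam_up - lam0) / (math.log(lam_up) - math.log(lam0))
    k_minus = (lam0 - lam_down) / (math.log(lam0) - math.log(lam_down))
    return k_plus, k_minus

def poisson_cusum(xs, k_plus, h_plus, k_minus, h_minus):
    """Return (signal index, direction), or (None, None) if no signal occurs."""
    s_plus = s_minus = 0.0
    for i, x in enumerate(xs, start=1):
        s_plus = max(0.0, x - k_plus + s_plus)       # accumulates increases
        s_minus = max(0.0, k_minus - x + s_minus)    # accumulates decreases
        if s_plus > h_plus:
            return i, "increase"
        if s_minus > h_minus:
            return i, "decrease"
    return None, None

k_plus, k_minus = cusum_reference_values(20.0)
# Idealised series (not thesis data): in control at 20, then a jump to 30.
up_signal = poisson_cusum([20] * 50 + [30] * 10, 22.4, 22, 17.4, 14)
down_signal = poisson_cusum([20] * 10 + [10] * 3, 22.4, 22, 17.4, 14)
```

With the rounded chart parameters, the idealised jump is flagged at the third out-of-control observation and the idealised drop at the second.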

In a Poisson EWMA, exponentially weighted values of the observations are obtained through
$Z_i = r\,x_i + (1 - r)\,Z_{i-1}$, where $Z_0 = \lambda_0$, and plotted on a chart with
$UCL = \lambda_0 + A^+\sqrt{\mathrm{Var}(Z_i)}$ and $LCL = \lambda_0 - A^-\sqrt{\mathrm{Var}(Z_i)}$.
We let r = 0.1 and A± = 2.67 to build a chart with an
ARL0 of 370, close to that of a standard c-chart.
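A minimal Python sketch of such a chart is given below, using the standard recursion Z_i = r x_i + (1 − r) Z_{i−1}. It assumes the exact time-varying variance Var(Z_i) = λ0 · r/(2 − r) · [1 − (1 − r)^{2i}] for independent observations with variance λ0; whether the thesis used this form or the asymptotic variance is not stated here, and the function name is illustrative.

```python
import math

def poisson_ewma(xs, lam0, r=0.1, a_plus=2.67, a_minus=2.67):
    """Return (signal index, direction), or (None, None) if no signal occurs."""
    z = lam0                                         # Z_0 = lambda_0
    for i, x in enumerate(xs, start=1):
        z = r * x + (1.0 - r) * z
        # Exact variance of Z_i for independent observations (assumed form).
        var_z = lam0 * (r / (2.0 - r)) * (1.0 - (1.0 - r) ** (2 * i))
        ucl = lam0 + a_plus * math.sqrt(var_z)
        lcl = lam0 - a_minus * math.sqrt(var_z)
        if z > ucl:
            return i, "increase"
        if z < lcl:
            return i, "decrease"
    return None, None

# Idealised series (not thesis data).
no_signal = poisson_ewma([20] * 50, 20.0)
jump_signal = poisson_ewma([30] * 5, 20.0)
```

For the idealised jump, Z_1 = 21 stays inside the limits and Z_2 = 21.9 exceeds the upper limit, so the chart signals at the second observation.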

The step change and control charts were simulated in R (http://www.r-project.org).
To obtain posterior distributions of the time and the magnitude of the
changes, we used the R2WinBUGS interface (Sturtz et al., 2005) to generate 100,000
samples through MCMC iterations in WinBUGS (Spiegelhalter et al., 2003) for all
change point scenarios, with the first 20,000 samples discarded as burn-in. We then
analyzed the results using the CODA package in R (Plummer et al., 2010). See the
Appendix for the step change model code in WinBUGS.


Figure 6.1 Posterior distributions of the time τ and the magnitude δ of a step change following signals from (a1, a2) c-chart, (b1, b2) Poisson EWMA (r = 0.1 and A± = 2.67) and (c1, c2) Poisson CUSUM ((k+, h+) = (22.4, 22), (k−, h−) = (17.4, 14)), where λ0 = 20, δ = +6 and τ = 100.

6.3.3 Performance Analysis

The posterior distributions for the time and the magnitude of a step change of size

+6 are presented in Figure 6.1. For all control charts, posterior distributions of the

change point concentrate on the 100th sample which is the real change point. Since

the posteriors are asymmetric and skewed, particularly for the time of the change, the

mode of posteriors is used as an estimator for change point model parameters (τ, δ).

Table 6.1 shows the posterior estimates for increases of size +2 and +6 in the process


Table 6.1 Posterior estimates (mode, sd.) of step change point model parameters τ and δ following signals (RL) from c-, Poisson EWMA (r = 0.1 and A± = 2.67) and Poisson CUSUM charts ((k+, h+) = (22.4, 22), (k−, h−) = (17.4, 14)) where λ0 = 20 and τ = 100. Standard deviations are shown in parentheses.

        c-chart                   Poisson EWMA              Poisson CUSUM
δ       RL    τ        δ          RL    τ        δ          RL    τ        δ
+2      201   101.1    2.15       142   103.0    2.03       108   103.1    2.50
              (16.3)   (0.46)           (24.1)   (0.92)           (16.6)   (2.1)
+6      138   100.2    4.5        113   100.1    3.1        106   100.1    5.8
              (4.2)    (0.8)            (13.7)   (1.4)            (20.1)   (2.7)

mean. The c-chart detects an increase of around half a standard deviation (δ = +2) in the
Poisson rate after 101 samples, whereas the mode of the posterior distribution reports the
101st sample as the change point. For a medium shift size, δ = +6, around one and a half
standard deviations, the posterior mode concentrates on the 100th sample whereas
the c-chart signals with a delay of 38 samples. The Poisson EWMA chart detects the shifts,
+2 and +6, after 42 and 13 samples, whereas the posterior distributions report the 103rd
and 100th samples as the change points, respectively. This result implies that although
the obtained posterior modes overestimate the change point for small shifts, they still
perform considerably better than the Poisson EWMA chart. The resultant posteriors from
a Poisson CUSUM are almost identical to those from the Poisson EWMA. For a shift of
small to medium size, the posterior mode outperforms the CUSUM chart. Bayesian
estimates of the magnitude of the change tend to estimate small shifts almost precisely.
However, the medium shift sizes are underestimated, although this slight bias must be
considered in the context of the corresponding standard deviations.

Applying the Bayesian framework enables us to construct probability based intervals

around estimated parameters. A credible interval (CI) is a posterior probability based

interval which involves those values of highest probability in the posterior density of

the parameter of interest. Table 6.2 presents 50% and 80% credible intervals for the

estimated time and the magnitude of step changes in all three control charts. As

expected, the CIs are affected by the dispersion and higher order behaviour of the

posterior distributions. Under the same probability of 0.8 for the c-chart, the CI for

the time of the step change of size δ = +2 covers 53 samples around the 100th sample

whereas it decreases to 6 samples for δ = +6 due to the smaller standard deviation,


Table 6.2 Credible intervals for step change point model parameters τ and δ following signals from c-, Poisson EWMA (r = 0.1 and A± = 2.67) and Poisson CUSUM charts ((k+, h+) = (22.4, 22), (k−, h−) = (17.4, 14)) where λ0 = 20 and τ = 100.

                 c-chart                    Poisson EWMA                  Poisson CUSUM
δ    Parameter   50%          80%           50%           80%             50%           80%
+2   τ           (101,105)    (65,118)      (96.6,114)    (71.2,125.8)    (98.2,105)    (65.2,108)
     δ           (2.1,3.2)    (1.9,3.2)     (1.41,2.65)   (0.76,3.08)     (0.12,2.50)   (-0.23,4.8)
+6   τ           (97.9,100)   (96,102)      (96.9,101)    (88,103)        (96,101)      (83,106)
     δ           (3.9,5.2)    (3.4,5.5)     (2.2,4.1)     (1.2,4.8)       (1.31,4)      (0.05,4.9)

see Table 6.1.
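A credible interval of this highest-probability kind can be approximated from MCMC output by taking the shortest interval that contains the required fraction of the draws. The Python sketch below shows one simple way to do this, assuming a unimodal posterior; the function name and the example draws are hypothetical and do not come from the thesis, which used the CODA package.

```python
import math

def hpd_interval(samples, mass):
    """Shortest interval containing at least `mass` of the posterior draws."""
    ordered = sorted(samples)
    n = len(ordered)
    n_in = max(1, math.ceil(mass * n))      # number of draws inside the interval
    best_lo, best_hi = ordered[0], ordered[n_in - 1]
    # Slide a window of n_in consecutive order statistics; keep the narrowest.
    for i in range(1, n - n_in + 1):
        lo, hi = ordered[i], ordered[i + n_in - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi

# Hypothetical left-skewed draws for the change time tau.
tau_draws = [88, 95, 97, 98, 99, 100, 100, 100, 101, 103]
ci_50 = hpd_interval(tau_draws, 0.5)
ci_80 = hpd_interval(tau_draws, 0.8)
```

Because the draws are left-skewed, raising the probability from 0.5 to 0.8 moves the lower bound further than the upper bound, the same pattern that appears in the intervals for τ in Table 6.2.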

Comparison of the 50% and 80% CIs for the estimated time of a step change of size
δ = +6 in the Poisson EWMA chart reveals that the posterior distribution of the time
is highly left-skewed: the increase in probability moves the left boundary
of the interval from 96.9 to 88, a much larger shift than that in the right boundary (from 101 to 103).
This investigation can be extended to other shift sizes and control chart scenarios for
the time estimates. As shown in Table 6.1 and discussed above, the magnitudes of the
changes are not estimated as precisely as the time. However, Table 6.2 shows that in
most cases for δ = +2 the real size of the change is contained in the respective posterior
50% and 80% CIs.

Having a distribution for the time of the change enables us to make other probabilistic
inferences. As an example, Table 6.3 shows the probability of the occurrence of the
change point in the last 10, 25 and 50 observed samples prior to signalling in the control
charts. For a step change of size δ = +2, since the c-chart signals very late (see Table
6.1), it is unlikely that the change point occurred in the last 10, 25 or even 50 samples.
In contrast, for the Poisson EWMA and CUSUM charts, which both signal earlier
than the c-chart, the probabilities of occurrence in the last 10 samples are 0.55 and
0.59, increasing to 0.76 and 0.82, respectively, when the preceding 15 samples are included.
In the case of δ = +6, most (0.98) of the probability mass is located between the
last 25 and 50 samples for the c-chart, whereas 0.80 of it lies between the last 10 and
25 samples for the Poisson EWMA chart and 0.91 lies in the last 10
samples for the Poisson CUSUM chart. These kinds of probability computations and
inferences can be extended to other change scenarios.
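Such probabilities are simple proportions of the posterior draws of τ. The Python sketch below illustrates the calculation under an assumed convention, namely that the change falls within the last m samples when τ ≥ T − m; the function name and the draws are hypothetical.

```python
def prob_change_in_last(tau_draws, signal_time, m):
    """Posterior probability that the change occurred in the last m samples.

    Assumed convention: tau is the last in-control sample, so the change
    lies within the last m samples when tau >= signal_time - m.
    """
    hits = sum(1 for tau in tau_draws if tau >= signal_time - m)
    return hits / len(tau_draws)

# Hypothetical posterior draws of tau for a chart that signalled at T = 138.
tau_draws = [100, 101, 99, 100, 102, 100, 98, 100, 100, 130]
p_last_10 = prob_change_in_last(tau_draws, signal_time=138, m=10)
p_last_50 = prob_change_in_last(tau_draws, signal_time=138, m=50)
```

When a chart signals late, as in this hypothetical case, almost none of the posterior mass falls in the most recent samples, which is the pattern reported for the c-chart.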


Table 6.3 Probability of the occurrence of the change point in the last 10, 25 and 50 observed samples prior to signalling for c-, Poisson EWMA (r = 0.1 and A± = 2.67) and Poisson CUSUM charts ((k+, h+) = (22.4, 22), (k−, h−) = (17.4, 14)) where λ0 = 20 and τ = 100.

        c-chart               Poisson EWMA          Poisson CUSUM
δ       10     25     50      10     25     50      10     25     50
+2      0.00   0.00   0.01    0.55   0.76   0.86    0.59   0.82   0.91
+6      0.00   0.01   0.99    0.06   0.86   0.95    0.91   0.97   0.99

Table 6.4 Average of posterior estimates (mode, sd.) of step change point model parameters τ and δ following signals (RL) from c-, Poisson EWMA (r = 0.1 and A± = 2.67) and Poisson CUSUM charts ((k+, h+) = (22.4, 22), (k−, h−) = (17.4, 14)) where λ0 = 20 and τ = 100. Standard deviations are shown in parentheses.

       c-chart                               Poisson EWMA                          Poisson CUSUM
δ      E(RL)      E(τ)     E(στ)    E(δ)     E(RL)     E(τ)     E(στ)    E(δ)      E(RL)     E(τ)     E(στ)    E(δ)
-15    101.17     100.45   22.48    -6.43    102.36    100.40   3.28     -11.49    101.13    100.46   23.21    -6.21
       (0.42)     (0.36)   (8.42)   (4.75)   (0.67)    (0.38)   (5.33)   (2.79)    (0.33)    (0.36)   (7.39)   (4.76)
-6     174.65     101.12   3.26     -5.90    106.43    100.72   14.92    -4.37     103.94    100.74   24.90    -2.10
       (66.38)    (1.72)   (4.48)   (1.06)   (2.84)    (0.76)   (7.79)   (2.47)    (2.36)    (0.76)   (5.31)   (2.13)
-2     663.24     103.05   21.33    -2.03    124.72    103.50   24.45    -2.11     127.54    103.23   27.13    -1.64
       (517.23)   (2.78)   (8.06)   (0.38)   (18.74)   (2.91)   (6.92)   (0.98)    (26.82)   (2.91)   (6.30)   (0.80)
+2     184.22     102.66   20.50    2.00     119.72    103.00   23.26    2.05      117.77    102.70   24.79    1.85
       (88.91)    (3.83)   (9.17)   (0.83)   (16.08)   (3.18)   (7.49)   (0.82)    (18.75)   (3.20)   (7.15)   (0.76)
+6     113.44     101.10   13.54    3.73     106.33    101.20   18.00    3.00      105.30    101.22   23.61    1.89
       (13.17)    (1.67)   (9.71)   (2.45)   (2.87)    (1.31)   (7.35)   (2.26)    (2.55)    (1.32)   (5.37)   (1.80)
+15    101.51     100.48   22.00    3.81     102.56    100.51   10.33    7.52      101.77    100.50   19.43    4.84
       (0.96)     (0.30)   (7.69)   (4.17)   (0.89)    (0.29)   (8.15)   (3.78)    (0.60)    (0.30)   (6.68)   (4.21)

To investigate the behavior of the Bayesian estimator over the population for different
change sizes, we replicated the simulation method explained in Section 6.3.2 100 times.
This provides a distribution of estimates with standard errors of the order of 10. The
number of replications is a compromise between excessive computational time,
given the MCMC iterations, and the adequacy of the resulting distributions, even in the
tails.

Simulated datasets that were obvious outliers were excluded. Table 6.4 shows the
average of the estimated parameters obtained from the replicated datasets. As seen,
although the c-chart detects small to medium shifts, from half to one and a half
standard deviations, with a large delay, it performs better where there is a jump rather than a fall.
The longer delay in detecting a decrease in the Poisson rate, compared
with an increase of the same size, is due to the equality of the mean and the
variance of the Poisson distribution: a fall in the mean leads to less dispersed
observations. The Poisson EWMA and CUSUM charts behave in the same manner.

For a step change of size around half a standard deviation (δ = ±2) in the Poisson rate,
the average of the modes, E(τ), reports the 103rd sample as the change point in all
three control charts, whereas the charts detect the changes with delays of more than 17
samples, the shortest being obtained with the Poisson CUSUM. This superiority persists where a medium
shift of size δ = ±6 has occurred in the process mean. In this scenario, the bias of
the Bayesian estimator does not exceed one observation, whereas the minimum delay
is four samples, for the Poisson CUSUM in detection of the fall. As expected, for large
shift sizes (δ = ±15), around three standard deviations, all control charts perform
well, yet the mean of the modes outperforms them with a bias of less than one observation.

Table 6.4 reveals that in all three control charts, the variation of the Bayesian estimates
for time tends to reduce when the magnitude of the shift in the process mean increases.
However, by the nature of the Poisson distribution, for small to medium drops, δ =
(−2, −6), the observed variation is less than that obtained in estimation of jumps. The
mean of the standard deviation of the posterior estimates of time, E(στ), also decreases
when moving from small shift sizes to medium and large sizes in the Poisson EWMA and
CUSUM charts. In contrast, the greatest variation is obtained for a large shift of size
δ = ±15 in the c-chart. This is due to the early detection of such shifts by the c-chart,
which leaves only a very short run of samples after the change and hence little
data to inform the MCMC algorithm.

The average of the Bayesian estimates of the magnitude of the change, E(δ), shows

that the modes of posteriors for change sizes do not perform as well as the posterior

distributions of the time across different shift sizes; however, promising results are

obtained where a small shift, δ = ±2, has occurred in the process mean. This estimator

tends to underestimate the sizes, particularly where there exists a jump. This bias

increases when the shift size increases since a very short run of samples coming from

the out-of-control state of the process with a high variance was used. As seen in

Table 6.4, the best estimates are obtained in Poisson EWMA cases. Having said that,

Bayesian estimates of the magnitude of the change must be studied in conjunction with

their corresponding standard deviations. In this manner, analysis of credible intervals

would be effective.


6.4 Bayesian Poisson Process Linear trend Change Model

6.4.1 Model

Consider a Poisson process Xt, t = 1, ..., T , that is initially in-control with independent

observations coming from a Poisson distribution with a known rate λ0. After an un-

known point in time, τ say, the Poisson rate parameter changes according to a linear

trend model

$$\lambda_t = \lambda_0 + \beta(t - \tau), \qquad t > \tau, \qquad (6.3)$$

where β is the magnitude (slope) of the linear trend. A positive β implies an increasing trend
in which λt > λ0, and a negative β leads to a linear reduction of the Poisson rate, with
λt < λ0, for t = τ + 1, ..., T.

The Poisson process linear trend change model can thus be parameterized as follows:

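$$
p(x_t \mid \lambda_t) =
\begin{cases}
\exp(-\lambda_0)\,\lambda_0^{x_t}/x_t! & \text{if } t = 1, 2, \ldots, \tau \\
\exp\{-(\lambda_0 + \beta(t - \tau))\}\,(\lambda_0 + \beta(t - \tau))^{x_t}/x_t! & \text{if } t = \tau + 1, \ldots, T
\end{cases}
\qquad (6.4)
$$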

where τ and T are the change time and the time of the chart’s signal, respectively.

Similar to the step change model, prior distributions must be defined for the
unknown parameters, τ and β. We assign a normal distribution with mean 0 and
standard deviation 6√λ0, and a uniform distribution on the range (1, T − 1),
as prior distributions for β and τ, respectively. As in the step change model, we
added a constraint that the Poisson rate λt must be positive. See the Appendix for the linear trend change
model code in WinBUGS. As discussed in Section 6.3, other priors could be considered;
see Gelman et al. (2004).
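The likelihood in Equation 6.4, together with the positivity constraint on the rate, can be written compactly in Python. The sketch below is illustrative only: the function name and the idealised data are assumptions, and returning minus infinity for a non-positive rate is one simple way to mimic the constraint, not necessarily how it was coded in WinBUGS.

```python
import math

def linear_trend_loglik(xs, lam0, tau, beta):
    """Log-likelihood of the Poisson linear trend change model (Equation 6.4)."""
    total = 0.0
    for t, x in enumerate(xs, start=1):
        lam = lam0 if t <= tau else lam0 + beta * (t - tau)
        if lam <= 0.0:
            return -math.inf                 # rate must stay positive
        # Poisson log-probability: x log(lam) - lam - log(x!)
        total += x * math.log(lam) - lam - math.lgamma(x + 1)
    return total

# Idealised data (not thesis data): rate 20, then rising by 1 per sample.
xs = [20] * 100 + [21, 22, 23, 24, 25]
ll_true = linear_trend_loglik(xs, 20.0, tau=100, beta=1.0)
ll_no_trend = linear_trend_loglik(xs, 20.0, tau=100, beta=0.0)
ll_early = linear_trend_loglik(xs, 20.0, tau=95, beta=1.0)
```

On these idealised data the log-likelihood is highest at the generating values (τ, β) = (100, 1), and a steep negative slope that drives the rate to zero is ruled out.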

6.4.2 Evaluation


We used Monte Carlo simulation to study the performance of the constructed BHM
in linear trend change estimation following a signal from c-, Poisson CUSUM and
Poisson EWMA control charts when a change is simulated to occur at τ = 100. We
generated 100 observations of a Poisson process with an in-control rate of λ0 = 20.
We then induced linear trend changes of slopes β = {+0.50, +1.0} as an example and
β = {±0.50, ±1.0, ±2.0} for a replication study, and continued sampling until the control charts signalled. As
before, if an out-of-control observation was generated among the first 100
in-control observations, it was taken as a false alarm and the simulation was restarted.
All control charts were constructed, and the MCMC analysis conducted, as discussed in Section
6.3.2.


Table 6.5 shows the posterior estimates for linear trends with positive slopes (increasing
trends) of sizes around 0.1 (β = +0.5) to 0.25 (β = +1.0) standard deviations in the
Poisson rate. The c-chart detects the trends with delays that drop from 21 to 12 samples when
the slope increases, whereas the posterior distributions of time concentrate on the
101st and 100th samples, respectively, as the change point. The Poisson EWMA and
CUSUM charts detect a trend with a slope of β = +0.5 with a delay of around 10 observations,
which is better than the c-chart; however, both are still outperformed by the
Bayesian modes, which report the 102nd sample as the change point. For larger slopes
(β = +1.0, say), although the Poisson EWMA and CUSUM charts tend to signal more
promptly, with delays dropping to eight and five samples, respectively, the Bayesian
estimator identifies the change point with no bias. As seen in Table 6.5, Bayesian
estimates of the magnitude of the slope also perform well, as most of the estimated slopes,
taken as posterior modes, are close to the simulated magnitudes.

Table 6.6 presents 50% and 80% credible intervals for the estimated time and the
magnitude of slope in the linear trend changes based on all three control charts. The
small standard deviations in Table 6.5 lead to CIs for the time of the change
that are precise and informative. As shown in Table 6.5, the magnitudes of the slopes


Table 6.5 Posterior estimates (mode, sd.) of linear trend change point model parameters τ and β following signals (RL) from c-, Poisson EWMA (r = 0.1 and A± = 2.67) and Poisson CUSUM charts ((k+, h+) = (22.4, 22), (k−, h−) = (17.4, 14)) where λ0 = 20 and τ = 100. Standard deviations are shown in parentheses.

        c-chart                   Poisson EWMA              Poisson CUSUM
β       RL    τ        β          RL    τ        β          RL    τ        β
+0.5    121   101.6    0.48       111   101.9    0.55       110   102.6    0.44
              (4.9)    (0.29)           (6.1)    (1.26)           (9.2)    (1.12)
+1      112   100      1.0        108   100.1    1.06       105   100.1    2.36
              (3.6)    (0.49)           (9.2)    (1.07)           (7.38)   (1.89)

Table 6.6 Credible intervals for linear trend change point model parameters τ and β following signals from c-, Poisson EWMA (r = 0.1 and A± = 2.67) and Poisson CUSUM charts ((k+, h+) = (22.4, 22), (k−, h−) = (17.4, 14)) where λ0 = 20 and τ = 100.

                  c-chart                     Poisson EWMA                    Poisson CUSUM
β      Parameter  50%          80%            50%             80%             50%             80%
+0.5   τ          (98.2,104)   (95,106.8)     (100.9,105.8)   (99,109.2)      (100.4,106.1)   (98.4,109.9)
       β          (0.3,0.6)    (0.25,0.75)    (0.32,1.2)      (0.07,1.9)      (0.13,1.02)     (-0.01,1.89)
+1     τ          (99,102.2)   (96.9,103.4)   (98.8,103.1)    (94.8,106.3)    (100.7,103.1)   (98.5,103.6)
       β          (0.8,1.3)    (0.58,1.7)     (0.09,1.12)     (0.02,1.9)      (0.7,3)         (0,4.2)

are estimated as precisely as the time. Table 6.6 shows that almost all of the true slope

parameter values are contained in constructed 50% and 80% CIs.

Similar to the step change model, we are able to make probabilistic inferences using
the obtained posterior distributions. As an example, the probability of the occurrence
of the change point in the last 10 samples prior to signalling in the c-chart, where there
is a linear trend change of slope β = +1, is 0.20. This probability increases to
0.98 if the last 25 samples are considered. In the Poisson EWMA case, these probabilities
are 0.68 and 0.99, respectively. For the Poisson CUSUM it is much more probable (0.91)
that the linear trend change began within the last 10 samples.

To investigate the behavior of the Bayesian estimator for replicate datasets sampled
from the same population, for different slope sizes, we replicated the simulation method
explained in Section 6.4.2 100 times. Simulated datasets that were obvious outliers were
excluded. Table 6.7 shows the average of the estimated parameters.

For a linear trend with small slopes of size β = ±0.5 in the Poisson rate, the average
modal value, E(τ), reports the 105th sample or earlier as the change point in all
three control charts, whereas the charts detect the changes with delays of around 10
samples or more, the shortest being obtained with the Poisson CUSUM. This superiority also persists where a trend

Table 6.7 Average of posterior estimates (mode, sd.) of linear trend change point model parameters τ and β following signals (RL) from c-, Poisson EWMA (r = 0.1 and A± = 2.67) and Poisson CUSUM charts ((k+, h+) = (22.4, 22), (k−, h−) = (17.4, 14)) where λ0 = 20 and τ = 100. Standard deviations are shown in parentheses.

        c-chart                             Poisson EWMA                        Poisson CUSUM
β       E(RL)    E(τ)     E(στ)    E(β)     E(RL)    E(τ)     E(στ)    E(β)     E(RL)    E(τ)     E(στ)    E(β)
-2.0    106.48   100.83   2.27     -2.07    105.35   100.75   3.73     -1.72    104.07   100.92   5.29     -1.51
        (1.47)   (1.16)   (2.14)   (1.22)   (1.14)   (0.93)   (3.25)   (0.95)   (1.15)   (0.96)   (3.08)   (1.59)
-1.0    111.24   102.05   2.80     -1.35    108.01   102.14   5.96     -1.02    106.46   102.74   7.62     -0.78
        (2.56)   (2.36)   (1.73)   (0.75)   (1.76)   (2.07)   (5.33)   (0.71)   (1.92)   (2.18)   (4.82)   (1.06)
-0.5    120.08   102.96   4.19     -0.67    111.65   104.60   8.93     -0.50    109.67   104.70   9.40     -0.47
        (4.96)   (2.50)   (1.54)   (0.55)   (2.51)   (2.91)   (5.28)   (0.64)   (3.06)   (2.91)   (4.77)   (0.74)
+0.5    113.93   103.75   6.66     0.43     110.98   104.45   8.75     0.43     109.82   104.78   8.83     0.49
        (5.22)   (2.99)   (3.15)   (0.55)   (2.56)   (2.94)   (5.12)   (0.62)   (2.89)   (2.78)   (4.65)   (0.54)
+1.0    109.20   102.55   5.65     0.79     107.92   102.75   7.21     0.68     107.19   102.78   7.88     0.74
        (3.14)   (2.05)   (3.87)   (0.77)   (2.13)   (2.11)   (5.93)   (0.81)   (2.09)   (2.36)   (4.87)   (0.70)
+2.0    105.46   101.20   4.93     1.48     105.52   101.18   5.21     1.75     104.82   101.19   6.19     1.66
        (1.88)   (1.02)   (3.21)   (0.94)   (1.35)   (1.04)   (4.06)   (1.05)   (1.30)   (1.04)   (3.59)   (0.88)

with larger slopes of size β = ±1.0, ±2.0 has occurred in the process mean. In these
scenarios, the bias of the Bayesian estimator does not exceed two and one samples,
whereas the minimum delays are seven and four samples, respectively.

Table 6.7 shows that in all three control charts, the variation of the Bayesian estimates
for time tends to reduce when the magnitude of the slope increases. The mean of the posterior
standard deviation for time, E(στ), also decreases when moving from small slope sizes
to medium and large sizes in both directions. However, the observed variation for
estimation of a decreasing trend is less than that obtained for an increasing trend
with the same slope size.

The average of the posterior estimates for the magnitude of the change, E(β), shows
that the modes of the posteriors for change sizes perform as well as the posterior
estimates of the time, particularly for the c-chart and Poisson EWMA chart. In the
CUSUM chart, the posteriors tend to underestimate the slope sizes. Of course,
Bayesian estimates of the magnitude of the change must be studied in conjunction
with their corresponding standard deviations.


6.5 Bayesian Poisson Process Multiple Change model

6.5.1 Model

In order to address the possibility of having change types other than step and linear

trend forms (Perry et al., 2007), we introduce a multiple change point scenario where

the number of change points is known. This prior knowledge might have been obtained

based on awareness and past experience of process engineers in factors such as changes

in operators, materials, procedures, tools and policies which may lead to increasing or

decreasing step changes in the Poisson rate. Here, we consider the case of two sequential

step changes. Other cases with more than two change points can be modeled in the

same way.

Consider a Poisson process Xt, t = 1, ..., T , that is initially in-control, with independent

observations drawn from a Poisson distribution with a known rate λ0. At an unknown

point in time, τ1, the Poisson rate parameter changes from its in-control state of λ0

to λ1, λ1 = λ0 + δ1, δ1 ≠ 0. For a period of time, the process continues with the

new parameter, λ1, and then at an unknown point in time, τ2, it changes to λ2, λ2 =

λ0 + δ2, δ2 ≠ δ1 ≠ 0. The Poisson process multiple change point model with 2 step

changes can thus be parameterized as follows:

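$$
p(x_t \mid \lambda_t) =
\begin{cases}
\exp(-\lambda_0)\,\lambda_0^{x_t}/x_t! & \text{if } t = 1, 2, \ldots, \tau_1 \\
\exp(-\lambda_1)\,\lambda_1^{x_t}/x_t! & \text{if } t = \tau_1 + 1, \ldots, \tau_2 \\
\exp(-\lambda_2)\,\lambda_2^{x_t}/x_t! & \text{if } t = \tau_2 + 1, \ldots, T
\end{cases}
\qquad (6.5)
$$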

Similar to the step change model, prior distributions are required for the unknown

parameters, τ1, τ2, δ1 and δ2 say. We assign a normal distribution with mean of 0 and

standard deviation of 6×√λ0 for δ1 and δ2, and a uniform distribution on the range of

(1, T −1) for τ1 and τ2 as prior distributions. See the Appendix for the multiple change

model code in WinBUGS. As discussed in Section 6.3, other priors could be considered;

see Gelman et al. (2004). Similar constraints as discussed for the step change model

were also used to avoid having a negative value for Poisson means.
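Equation 6.5 and the accompanying constraints can be sketched in Python as a log-likelihood for two sequential step changes. The function name and the idealised data are hypothetical; requiring τ1 < τ2 is an assumption implied by the ordering in Equation 6.5 rather than stated as part of the priors, and δ2 is measured from λ0, as in the model definition.

```python
import math

def two_step_loglik(xs, lam0, tau1, tau2, delta1, delta2):
    """Log-likelihood of the two-step Poisson change model (Equation 6.5)."""
    T = len(xs)
    lam1, lam2 = lam0 + delta1, lam0 + delta2    # both shifts measured from lam0
    # Assumed ordering of the change times, and positivity of the Poisson means.
    if not (1 <= tau1 < tau2 < T) or lam1 <= 0.0 or lam2 <= 0.0:
        return -math.inf
    total = 0.0
    for t, x in enumerate(xs, start=1):
        lam = lam0 if t <= tau1 else (lam1 if t <= tau2 else lam2)
        total += x * math.log(lam) - lam - math.lgamma(x + 1)
    return total

# Idealised data (not thesis data): rate 20, then 24 after t = 100, then 28 after t = 110.
xs = [20] * 100 + [24] * 10 + [28] * 10
ll_true = two_step_loglik(xs, 20.0, 100, 110, 4.0, 8.0)
ll_single = two_step_loglik(xs, 20.0, 100, 110, 4.0, 4.0)
ll_shifted = two_step_loglik(xs, 20.0, 95, 110, 4.0, 8.0)
```

On these idealised data the generating configuration has the highest log-likelihood among those compared; cases with more than two change points could be handled by adding further segments in the same way.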

6.5.2 Evaluation

As before we used Monte Carlo simulation to study the performance of the constructed

BHM in multiple change estimation following a signal from c-, Poisson CUSUM and

Poisson EWMA control charts when two changes are simulated to occur at (τ1, τ2) =

(100, 110). We generated 100 observations of a Poisson process with an in-control rate

of λ0 = 20. We then induced first and second changes of sizes (δ1, δ2) = (+2,+3) as an

example and (δ1, δ2) = {(±4,±8), (±4,±12)} as part of a replication study at the de-

termined times of change (τ1, τ2) until the control charts signalled. If an out-of-control

observation was generated in the simulation of the early 100 in-control observations,

it was taken as a false alarm and the simulation was restarted. Similarly, if the charts

signalled before the second change had been simulated, that simulation was terminated and

discarded; in practice, the step change model should be used for such processes. All control

charts were constructed and the MCMC method was conducted as discussed in Section 6.3.2.
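The simulation procedure described above can be sketched in Python for the c-chart case. This is only an illustration under stated assumptions: the control limits λ0 ± 3√λ0 are the conventional c-chart limits and are assumed here rather than taken from Section 6.3.2, and the names `poisson_draw` and `simulate_until_signal` are hypothetical.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) count with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k = 0
    p = 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= limit:
            return k - 1


def simulate_until_signal(lam0=20.0, tau1=100, tau2=110,
                          delta1=2.0, delta2=3.0, seed=1, max_len=100000):
    """Simulate counts until the c-chart signals.

    Returns (counts, run_length), or None when the run is discarded because
    the chart signalled at or before tau2 (false alarm or early signal).
    """
    rng = random.Random(seed)
    ucl = lam0 + 3 * math.sqrt(lam0)             # assumed 3-sigma upper limit
    lcl = max(lam0 - 3 * math.sqrt(lam0), 0.0)   # assumed 3-sigma lower limit
    counts = []
    for t in range(1, max_len + 1):
        if t <= tau1:
            lam = lam0
        elif t <= tau2:
            lam = lam0 + delta1
        else:
            lam = lam0 + delta2
        x = poisson_draw(rng, lam)
        counts.append(x)
        if x > ucl or x < lcl:
            return None if t <= tau2 else (counts, t)
    return None
```

Calling `simulate_until_signal` with successive seeds until it returns a result mimics restarting the simulation after a discarded run.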


Table 6.8 shows the posterior estimates for two consecutive increasing step changes of
sizes +2 and +3 in the Poisson rate. The c-chart detects the changes with a delay
of 36 samples; this delay drops to 13 samples for the Poisson EWMA and CUSUM charts.
However, the posterior distributions outperform the charts and concentrate on the
102nd and 101st samples for the time of the first step change. The second change
point is also estimated precisely by the posteriors. As seen in Table 6.8, although the
magnitude of the first step change is slightly underestimated and the second step size
is overestimated, there is still some gain in studying the estimated sizes and
directions in conjunction with their corresponding standard deviations.

Table 6.9 presents 50% and 80% credible intervals for the estimated time and the

magnitude of two consecutive step changes. The obtained CIs contain the true values

of the time and the size of shifts. As has been discussed in the previous change models,

we can also support the estimates with probabilistic inferences, such as the probability

of the occurrence of the change point in a specified number of observed samples prior


Table 6.8 Posterior estimates (mode, sd.) of multiple change point model parameters τ1, δ1, τ2 and δ2 following signals (RL) from c-, Poisson EWMA (r = 0.1 and A± = 2.67) and Poisson CUSUM charts ((k+, h+) = (22.4, 22), (k−, h−) = (17.4, 14)) where λ0 = 20, τ1 = 100 and τ2 = 110. Standard deviations are shown in parentheses.

δ1, δ2 = +2, +3

Chart           RL    τ1            δ1          τ2            δ2
c-chart         136   102 (30.5)    1.7 (1.8)   109.3 (24)    3.2 (2.1)
Poisson EWMA    113   101.4 (28)    1.6 (1.7)   110 (15.8)    3.8 (2.2)
Poisson CUSUM   113   101.4 (28)    1.6 (1.7)   110 (15.8)    3.8 (2.2)

Table 6.9 Credible intervals for multiple change point model parameters τ1, δ1, τ2 and δ2 following signals from c-chart, Poisson EWMA (r = 0.1 and A± = 2.67) and Poisson CUSUM ((k+, h+) = (22.4, 22), (k−, h−) = (17.4, 14)) where λ0 = 20, τ1 = 100 and τ2 = 110.

δ1, δ2 = +2, +3

Parameter   c-chart 50%   c-chart 80%   EWMA 50%        EWMA 80%     CUSUM 50%       CUSUM 80%
τ1          (84.6,106)    (41.2,112)    (91,111)        (54.9,113)   (91,111)        (54.9,113)
δ1          (0.3,2.5)     (-0.2,4.0)    (0.1,2.1)       (-0.5,3.7)   (0.1,2.1)       (-0.5,3.7)
τ2          (98.4,122)    (94.3,136)    (105.5,112.6)   (97.1,113)   (105.5,112.6)   (97.1,113)
δ2          (2.0,4.8)     (0.0,5.2)     (2.1,5.1)       (0.8,6.1)    (2.1,5.1)       (0.8,6.1)

to signal. For examples and discussions see Section 6.3.3 and Section 6.4.3.
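Credible intervals and probabilistic statements of this kind can be computed directly from MCMC draws. The Python sketch below is illustrative only: it assumes equal-tailed intervals based on order statistics (the thesis does not state here which interval type was used), and the function names are hypothetical.

```python
import math


def credible_interval(samples, level):
    """Equal-tailed credible interval from posterior draws (order statistics)."""
    s = sorted(samples)
    n = len(s)
    lo_idx = int(math.floor((1 - level) / 2 * (n - 1)))
    hi_idx = int(math.ceil((1 + level) / 2 * (n - 1)))
    return s[lo_idx], s[hi_idx]


def prob_change_within(tau_samples, signal_time, k):
    """Posterior probability that the change occurred in the last k samples
    before (and including) the signal at signal_time."""
    hits = sum(1 for tau in tau_samples if signal_time - k <= tau <= signal_time)
    return hits / len(tau_samples)
```

With posterior draws of τ1 in hand, `prob_change_within(draws, 113, 10)` would estimate the probability that the first change occurred within the 10 samples preceding a signal at sample 113.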

To investigate the behavior of the Bayesian estimator over the population for different
scenarios of two consecutive step changes, we replicated the simulation method
explained in Section 7.4.1 100 times. Here, we applied the multiple change point model
following signals of the c-chart only, as the Poisson EWMA and CUSUM charts mostly
signal before the second change is simulated in the process.

As seen in Table 6.10 and discussed in Section 6.3.3, the c-chart signals earlier when a
larger shift, either an increase or a decrease, has occurred in the second change; however,
it performs better where there exists a jump, regardless of the direction of the first
change. The chart alarmed after 38 samples when two consecutive drops of sizes around
one and two standard deviations, δ1,2 = (−4,−8), occurred. Although this delay
falls to 16 samples when the second change happens in the opposite direction,
the modes of the posteriors for the time of the first change, E(τ1), outperform the chart.
This superiority persists when the size of the second change increases to around three
standard deviations, δ2 = (±12). The same results are also obtained where the first
change is an increase in magnitude of one standard deviation, δ1 = (+4).

Table 6.10 reveals that the Bayesian estimator tends to underestimate the time of the

first change of two monotonic changes where the second change is of size δ2 = (±12).


Table 6.10 Average of posterior estimates (mode, sd.) of multiple step change point model parameters τ and δ following signals (RL) from c-, Poisson EWMA (r = 0.1 and A± = 2.67) and Poisson CUSUM charts ((k+, h+) = (22.4, 22), (k−, h−) = (17.4, 14)) where λ0 = 20 and τ = 100. Standard deviations are shown in parentheses.

δ1, δ2    E(RL)            E(τ1)            E(στ1)         E(δ1)          E(τ2)           E(στ2)         E(δ2)
-4,-12    113.49 (2.46)    98.18 (15.29)    28.50 (4.19)   -1.17 (1.24)   109.26 (1.69)   7.08 (5.64)    -8.32 (3.07)
-4,-8     138.94 (25.94)   101.04 (7.60)    28.84 (5.31)   -1.18 (1.33)   109.06 (2.06)   5.48 (3.36)    -7.69 (1.65)
-4,+8     116.98 (5.75)    100.74 (6.52)    23.30 (9.20)   -2.48 (1.80)   110.49 (0.60)   8.14 (8.90)    4.69 (3.63)
-4,+12    112.78 (2.29)    100.37 (6.75)    23.86 (8.27)   -2.21 (1.75)   110.30 (1.06)   11.08 (8.90)   5.42 (4.87)
+4,-12    113.10 (2.50)    101.77 (3.40)    25.03 (7.67)   1.64 (1.74)    110.48 (0.34)   7.23 (8.91)    -9.34 (3.28)
+4,-8     134.69 (22.74)   101.67 (7.04)    24.92 (6.78)   1.41 (1.38)    110.71 (0.73)   3.36 (5.59)    -7.7 (2.03)
+4,+8     117.69 (5.69)    101.28 (11.10)   30.40 (2.53)   0.59 (0.93)    108.81 (1.96)   11.61 (7.35)   4.41 (2.50)
+4,+12    112.23 (2.50)    98.32 (15.32)    29.93 (3.19)   0.09 (1.13)    108.37 (2.00)   11.90 (8.15)   4.17 (2.71)

The associated variation, within replications, increases when the second step change
increases in the same direction as the first change. The minimum variations of the
posterior distributions for the time of the first change, E(στ1), are obtained where
there exist non-monotonic changes; see δ1,2 = (−4,+8) and δ1,2 = (+4,−8). This
variation also increases when the second step change increases in the same direction as
the first change.

The time of the second step change is estimated precisely by the posterior modes. Table
6.10 shows that the average, E(τ2), mostly concentrates on the 110th sample. Surprisingly,
the variation between replications and also the variation of the posterior distributions
obtained for the time of the second change, E(στ2), are less than those obtained for the
first step change.

The average of the posterior estimates of the magnitude of the changes, E(δ1) and

E(δ2), shows that the modes of the posteriors for change sizes do not perform as well

as the posterior distributions of the time across different scenarios. The modes tend

to underestimate the sizes, particularly, for jumps in either the first or the second

step change. However, there still exists some gain in studying the estimated sizes and


directions, particularly when the obtained standard deviations are also considered.

6.6 Comparative Performance and Model Selection

We used Monte Carlo simulation to study the performance of the developed change

point models in different change point scenarios following a signal from a c-chart. We

generated 100 observations of a Poisson process with an in-control rate of λ0 = 20.

We then induced a step, a linear trend and a multiple change in the Poisson rate. For

each scenario the three change point models were applied and the time of the change

was estimated. Based on the MCMC simulation, the Deviance Information Criterion

(DIC) and related parameters, mean and variance of the posterior distribution of the

deviance and the penalty value, were recorded. The DIC is a goodness of fit criterion

which takes into account the deviance of the model, −2 ln(p(y | θ)), and a penalty for

the model complexity, pD (Spiegelhalter et al., 2002). To allow for asymmetry in the

posterior distribution, seen in Figure 6.1, pV was used as an alternative to pD, where

pV is half of the variance of the posterior distribution of the deviance (Gelman et al.,

2004).
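The two versions of the criterion can be computed from the sampled deviances as in the following Python sketch. The function names are hypothetical; `deviance_at_posterior_mean` stands for the deviance evaluated at the posterior means of the parameters, which is the quantity needed for pD.

```python
def dic_summary(deviance_draws, deviance_at_posterior_mean):
    """Return the mean deviance, both penalties and both DIC versions."""
    n = len(deviance_draws)
    d_bar = sum(deviance_draws) / n
    var_d = sum((d - d_bar) ** 2 for d in deviance_draws) / (n - 1)
    p_d = d_bar - deviance_at_posterior_mean   # penalty of Spiegelhalter et al. (2002)
    p_v = var_d / 2                            # half the posterior variance of the deviance
    return {"Dbar": d_bar, "pD": p_d, "pV": p_v,
            "DIC_D": d_bar + p_d, "DIC_V": d_bar + p_v}


def dic_v_from_summaries(mean_deviance, sd_deviance):
    """DIC based on pV, computed from the reported mean and sd of the deviance."""
    return mean_deviance + sd_deviance ** 2 / 2
```

As a check against Table 6.11, the step model fitted to the step change of δ = −4 has mean deviance 1163.4 and Std(D) 2.7, so `dic_v_from_summaries(1163.4, 2.7)` gives approximately 1167.0, in agreement with the reported DICV of 1167.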

Table 6.11 indicates that the Bayesian estimate of a step change outperforms other

Bayesian estimates, linear and multiple, where there is a step change in the process

parameter. It estimates 101.9 and 108.3 as the time of change of size δ = −4 and

δ = +4 respectively, whereas the linear model underestimates the time with a bias of

around 55 and 24 samples and the multiple model tends to overestimate it relative to

the step model. According to the reported DICs, the DICV supports that the step

model with values of 1167 and 845.5 is a preferable fit where there exists either an

increasing or a decreasing step change.

In the case of an occurrence of a linear trend shift in the Poisson rate, the Bayesian

estimate of a linear trend change outperforms other Bayesian estimates in estimating

the change point. The reported DICV is convincing that the linear model with values

of 603.7 and 630.9 is also the best fit. These results can be extended to the multiple

change scenario. Table 6.11 shows that the Bayesian estimate of a multiple change (two


Table 6.11 Performance and goodness of the change point models on different change types following signal from a c-chart where λ0 = 20, τ1 = 100 and τ2 = 110.

Change type   Change size        RL    Model      τ       Deviance   Std(D)   pD     DICD     DICV
Step          δ = −4             200   Step       101.9   1163.4     2.7      1.9    1165.3   1167
                                       Linear     45.2    1168.9     2.1      2.1    1171.0   1171.1
                                       Multiple   102.5   1163.7     3.3      -0.8   1162.9   1169.1
Step          δ = +4             148   Step       108.3   842.2      2.6      0.8    843      845.5
                                       Linear     86.3    843.7      2.1      2.1    845.8    845.9
                                       Multiple   108.5   841.7      3.2      -1.7   840      846.8
Linear        β = −1             107   Step       102.3   607        3.9      -2.1   604.9    614.6
                                       Linear     101.7   601.9      1.9      1.5    603.4    603.7
                                       Multiple   102.4   604.9      3.2      -0.5   604.4    610.0
Linear        β = +1             108   Step       102.1   631.5      3.8      0.9    632.4    638.7
                                       Linear     100.4   629.1      1.9      0.3    629.5    630.9
                                       Multiple   101.5   628.4      4.0      -1.2   627.2    636.4
Multiple      δ1 = −4, δ2 = −8   138   Step       109.5   788.8      2.7      -9.9   778.9    792.4
                                       Linear     88.1    788.2      3.3      1.9    790.1    793.6
                                       Multiple   100.1   784        3.8      0.8    784.9    791.2
Multiple      δ1 = +4, δ2 = +8   119   Step       100.5   723.1      2.8      0.0    723.1    727.0
                                       Linear     108.3   722.8      3.1      -0.7   658.9    727.6
                                       Multiple   100.3   722.6      2.9      1.7    722.3    726.8

changes) outperforms other Bayesian estimates, step and linear, where there are two

consecutive changes in the Poisson rate. Similarly, the reported DICV supports that

the multiple model with values of 791.2 and 726.8 is also the best fit in this case.

6.7 Comparison of Bayesian Estimator with other Methods

To study the performance of the proposed Bayesian estimators in comparison with

those introduced in Section 6.2, we ran the alternatives, the built-in estimators of Poisson

EWMA and CUSUM charts and MLE estimators, within replications as discussed in

Sections 6.3-6.5.

Table 6.12 shows the mean of Bayesian estimates and detected change points provided

by built-in estimators of EWMA (Nishina, 1992) and CUSUM (Page, 1954) charts and

the MLE estimator (Perry, 2004) for a step change in a Poisson process.

Although the Bayesian estimator, τb, tends to overestimate the time of a step change of


Table 6.12 Average of detected time of a step change in a Poisson process obtained by the Bayesian estimator, CUSUM and EWMA built-in estimators and MLE estimator following signals (RL) from c-, Poisson EWMA (r = 0.1 and A± = 2.67) and Poisson CUSUM charts ((k+, h+) = (22.4, 22), (k−, h−) = (17.4, 14)) where λ0 = 20 and τ = 100. Standard deviations are shown in parentheses.

c-chart

δ     E(RL)             E(τmle)         E(τb)
-15   101.17 (0.42)     99.97 (0.22)    100.45 (0.36)
-6    174.65 (66.38)    100.19 (1.72)   101.12 (1.72)
-2    663.24 (517.23)   93.16 (19.76)   103.05 (2.78)
+2    184.22 (88.91)    94.20 (22.15)   102.66 (3.83)
+6    113.44 (13.17)    100.55 (2.65)   101.10 (1.65)
+15   101.51 (0.96)     99.95 (0.45)    100.48 (0.30)

Poisson EWMA

δ     E(RL)            E(τewma)         E(τmle)          E(τb)
-15   102.36 (0.67)    95.42 (10.12)    99.98 (0.20)     100.40 (0.38)
-6    106.43 (2.84)    96.73 (6.41)     99.65 (2.19)     100.72 (0.76)
-2    124.72 (18.74)   97.86 (17.80)    102.70 (17.91)   103.50 (2.91)
+2    119.72 (16.80)   100.87 (13.51)   96.75 (17.12)    103.00 (3.18)
+6    106.33 (2.87)    95.94 (10.04)    99.31 (7.81)     101.20 (1.31)
+15   102.56 (0.89)    94.95 (9.09)     99.51 (4.02)     100.51 (0.29)

Poisson CUSUM

δ     E(RL)            E(τcusum)        E(τmle)          E(τb)
-15   101.13 (0.33)    99.56 (0.89)     99.96 (0.24)     100.46 (0.36)
-6    103.94 (2.36)    100.08 (1.69)    97.78 (13.16)    100.74 (0.76)
-2    127.54 (26.82)   122.70 (26.51)   103.56 (15.62)   103.23 (2.91)
+2    117.77 (18.75)   109.12 (19.63)   96.89 (21.09)    102.70 (3.20)
+6    105.30 (2.55)    99.75 (2.36)     99.29 (7.79)     101.22 (1.32)
+15   101.77 (0.60)    98.92 (2.32)     99.51 (4.02)     100.50 (0.30)

small sizes, δ = ±2, with a delay of three samples, it outperforms the MLE estimator,

τmle, which underestimates the time by six samples following a signal from the c-chart.

For step sizes of one and a half and three standard deviations, the MLE estimator

performs slightly better than the Bayesian estimator; however, this superiority diminishes

once the obtained standard deviations are considered, particularly where there exists a

jump in the process mean.

Table 6.12 reveals that the EWMA estimator, τewma, underestimates the change point
when the size of the shift increases in either direction, whereas the Bayesian estimator
tends to be more precise. τb still remains the best estimator for small changes and shows
acceptable performance in comparison with τmle over larger shifts, particularly when
the standard deviations are taken into account.

The CUSUM estimator, τcusum, outperforms the equivalent estimators in EWMA for

larger shifts, δ = (±6,±15); however, it overestimates the time of small shifts sig-

nificantly. Similar to c-chart and EWMA cases, in CUSUM, the Bayesian estimator

outperforms alternatives for small shifts and offers acceptable performance over other

shift sizes, considering the obtained standard deviations over replications.
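The CUSUM built-in estimator compared above takes the change point to be the last sample at which the CUSUM statistic was zero before the signal. A minimal Python sketch for the upper-sided chart follows; the recursion S_t = max(0, S_{t−1} + x_t − k) and the signalling rule S_t > h are the standard textbook form and are assumed here, with k+ = 22.4 and h+ = 22 taken from the table captions. The function name is hypothetical.

```python
def cusum_upper_change_point(x, k=22.4, h=22.0):
    """Run an upper-sided Poisson CUSUM over counts x (samples t = 1..T).

    Returns (signal_time, tau_hat): the first sample at which the statistic
    exceeds h and the last sample at which it was zero before that.
    signal_time is None if the chart never signals.
    """
    s = 0.0
    last_zero = 0  # 0 means the statistic has been positive since the start
    for t, xt in enumerate(x, start=1):
        s = max(0.0, s + xt - k)
        if s == 0.0:
            last_zero = t
        elif s > h:
            return t, last_zero
    return None, last_zero
```

For the counts `[20] * 5 + [30] * 5` the statistic stays at zero for the first five samples and then accumulates 7.6 per sample, so the chart signals at sample 8 and the built-in estimate is sample 5.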

Table 6.13 shows the mean of the Bayesian estimates and detected change points pro-

vided by built-in estimators of EWMA (Nishina, 1992) and CUSUM (Page, 1954) charts

and the MLE estimator (Perry et al., 2006) for a linear trend change in a Poisson pro-

cess. Application of the proposed MLE estimator is restricted to trends with positive


Table 6.13 Average of detected time of a linear trend in a Poisson process obtained by the Bayesian estimator, CUSUM and EWMA built-in estimators and MLE estimator following signals (RL) from c-, Poisson EWMA (r = 0.1 and A± = 2.67) and Poisson CUSUM charts ((k+, h+) = (22.4, 22), (k−, h−) = (17.4, 14)) where λ0 = 20 and τ = 100. Standard deviations are shown in parentheses.

c-chart

β      E(RL)           E(τmle)         E(τb)
-2.0   106.48 (1.47)   -               100.83 (1.16)
-1.0   111.24 (2.56)   -               102.05 (2.36)
-0.5   120.08 (4.96)   -               102.96 (2.50)
+0.5   113.93 (5.22)   103.55 (3.48)   103.75 (2.99)
+1.0   109.20 (3.14)   102.70 (3.19)   102.55 (2.05)
+2.0   105.46 (1.88)   100.23 (2.81)   101.20 (1.02)

Poisson EWMA

β      E(RL)           E(τewma)        E(τmle)          E(τb)
-2.0   105.35 (1.14)   97.59 (6.11)    -                100.75 (0.93)
-1.0   108.01 (1.76)   97.34 (9.92)    -                102.14 (2.07)
-0.5   111.65 (2.51)   97.61 (12.03)   -                104.60 (2.91)
+0.5   110.98 (2.56)   99.37 (9.12)    102.02 (9.23)    104.45 (2.94)
+1.0   107.92 (2.13)   97.13 (9.70)    101.08 (12.42)   102.75 (2.11)
+2.0   105.52 (1.35)   96.35 (8.80)    100.57 (4.07)    101.18 (1.04)

Poisson CUSUM

β      E(RL)           E(τcusum)       E(τmle)          E(τb)
-2.0   104.07 (1.15)   100.36 (1.82)   -                100.92 (0.96)
-1.0   106.46 (1.92)   102.49 (2.74)   -                102.74 (2.18)
-0.5   109.67 (3.06)   104.94 (3.74)   -                104.70 (2.91)
+0.5   109.82 (2.89)   104.00 (3.30)   102.12 (11.68)   104.78 (2.78)
+1.0   107.19 (2.09)   101.07 (3.01)   101.57 (3.59)    102.78 (2.36)
+2.0   104.82 (1.30)   99.61 (3.47)    100.59 (3.81)    101.19 (1.04)

slope as Newton’s method is not tractable for decreasing trends in Poisson mean; see

Perry et al. (2006) for more details.

The Bayesian estimator, τb, in most cases outperforms the built-in estimator of EWMA,
τewma, where there exists a decreasing trend. This superiority increases as the
slope becomes steeper, β = −2. The CUSUM estimator, τcusum, estimates the change point
more precisely than the EWMA estimator; however, the Bayesian estimator, τb, still remains
the best alternative for detection of linear trends with negative slopes when the variation
of the estimates is taken into account.

Table 6.13 reveals that the Bayesian estimator, τb, is slightly outperformed by the MLE

estimator, τmle, across the charts when there exists an increasing linear trend in the

process mean. However, the Bayesian estimator can still be a reasonable alternative in

light of the obtained standard deviations, which are smaller than those observed for the MLE

estimator over replications.

The MLE estimator proposed by Perry et al. (2007) is suitable for monotonic consecu-

tive changes. In contrast, the Bayesian estimator for a known number of change points

proposed in Section 6.5 can also be applied where there exist non-monotonic consecutive

changes in the process mean. Therefore, the comparison study was not followed

for the multiple change point case as there is no appropriate MLE alternative against

which to evaluate the Bayesian estimator.

As discussed in Section 7.4.1, the built-in EWMA and CUSUM estimators cannot be


studied as they tend to signal before the second change point. In the case of signalling

after the second change, they also failed as they tend to concentrate on the time of the

latter step change as the change point in non-monotonic scenarios.

Apart from accuracy and precision criteria used for the comparison study, the poste-

rior distributions for the time and the magnitude of a change enable us to construct

probabilistic intervals around estimates and probabilistic inferences about the location

of change point as discussed in Sections 6.3-6.5. This is a significant advantage of

the Bayesian approach. Although similar results may be obtained when resampling is used in

conjunction with MLE methods, the inferential basis of that approach is more limited;

see Bernardo and Smith (1994) for more details. Also the flexibility of Bayesian

hierarchical models, ease of extension to more complicated change scenarios such as

combination of steps and linear and nonlinear trends, relief of analytic calculation of

likelihood function, particularly for non-tractable likelihood functions, and ease of cod-

ing with available packages, should be considered as additional benefits of the proposed

Bayesian change point model. This approach can be easily applied for other types of

data and processes such as Bernoulli, normal and exponential family data.

The two-step approach to change-point identification described in this paper has the

advantage of building on control charts that may be already in place in practice (as in

the motivating case study in SAWMH). An alternative may be to retain the two-step

approach but to use a Bayesian framework in both stages. There is now a substantial

literature on Bayesian formulation of control charts and extensions such as monitoring

processes with varying parameters (Feltz and Shiau, 2001), over-dispersed data (Bayarri

and García-Donato, 2005), and start-up and short runs (Tsiamyrtzis and Hawkins, 2005,

2008). A further alternative is to consider a fully Bayesian, one-step approach, in

which both the monitoring of the in-control process and the retrospective or prospective

identification of changes is undertaken in the one analysis. This is the subject of further

research.


6.8 Conclusion

Identification of the time when a process has changed enables process engineers to pur-

sue investigation of special causes more effectively. Indeed, knowing the change point

restricts the search efforts to a tighter window of observations and related variables. In

this paper we modeled the change point estimation for a Poisson process in a Bayesian

framework. The Poisson process is a reflection of the processes being monitored at

SAWMH as part of a quality improvement program in preventive maintenance plans

of medical instruments. We considered three scenarios of changes, a step change, a

linear trend and a multiple change when the number of changes is known. We con-

structed Bayesian hierarchical models and derived posterior distributions for change

point estimates using MCMC. We compared the performance of the Bayesian estima-

tors with c-, Poisson EWMA and CUSUM control charts. The results showed that the

Bayesian estimates outperform standard control charts in change estimation, particu-

larly where there exists a small to medium size of step change(s) and a linear trend

change with small to relatively large magnitude of slope. In comparison with built-in

estimators of EWMA and CUSUM and MLE based estimators, the Bayesian estima-

tor performs reasonably well and remains a strong alternative, particularly when other

criteria such as probability quantification through credible intervals and probabilistic

inferences, flexibility and generalization are taken into account.

Investigation of the performance of the Bayesian estimates over different change sce-

narios reveals that each Bayesian change point model outperforms other models where

its underlying change type has occurred in the Poisson process. The results also sup-

port the idea of using DIC as a primary step in change point estimation which can

direct process engineers to identify the appropriate change point model before making

inferences about the derived underlying changes in the process.

Acknowledgments




Appendix

Step Change Model

Figure 6.2 Directed acyclic graph for the step change model in a Poisson process.

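model {

# RL.c: number of observations in xc (printed as 'RL c' in the source; the separator is assumed)

for(i in 1 : RL.c){

xc[i] ~ dpois(lambda2[i])

lambda1[i] <- lambda0+delta*step(i-change)

# floor the rate at a small positive value so the Poisson mean is never negative

lambda2[i] <- max(lambda1[i],0.000001)}

# the second argument of dnorm in BUGS is a precision (1/variance)

tau <- 1/(6*sqrt(lambda0))

RL <- RL.c-1

delta ~ dnorm(0, tau)

change ~ dunif(1,RL)}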

Linear Trend Change Model

Figure 6.3 Directed acyclic graph for the linear trend change model in a Poisson process.

model {

lambda1[i] <- lambda0+beta*(i-change)*step(i-change)

RL <- RL.c-1

beta ~ dnorm(0, tau)


Multiple Change Model

Figure 6.4 Directed acyclic graph for the multiple change model in a Poisson process.

model {



lambda1[i] <- lambda0+delta1*step(i-change1)*step(change2-i)+delta2*step(i-change2)




RL <- RL.c-1

delta1 ~ dnorm(0, tau)

delta2 ~ dnorm(0, tau)

change1 ~ dunif(1,change2)

change2 ~ dunif(change1,RL)}

Bibliography



Benneyan, J. C. (1998). Statistical quality control methods in infection control and



Bernardo, J. M. and Smith, A. F. M. (1994). Bayesian Theory. Wiley.

Borror, C., Champ, C., and Rigdon, S. (1998). Poisson EWMA control charts. Journal




Brook, D. and Evans, D. (1972). An approach to the probability distribution of CUSUM

run length. Biometrika, 59(3):539–549.



17(2):119–124.



Washington.


Chapman & Hall/CRC.


Limaye, S. S., Mastrangelo, C. M., and Zerr, D. M. (2008). A case study in monitoring

hospital-associated infections with count control charts. Quality Engineering,

20(4):404–413.









Perry, M., Pignatiello, J., and Simpson, J. (2007). Change point estimation for mono-


search, 45(8):1791–1813.



Plummer, M., Best, N., Cowles, K., Vines, K., and Plummer, M. M. (2010). Package

coda. Citeseer.


nometrics, 1(3):239–250.

Samuel, T., Pignatiello, J., and Calvin, J. (1998). Identifying the time of a step change




Spiegelhalter, D., Best, N., Carlin, B., and Van Der Linde, A. (2002). Bayesian measures of

model complexity and fit. Journal of the Royal Statistical Society. Series B (Methodological),

64(4):583–639.

Spiegelhalter, D., Thomas, A., and Best, N. (2003). WinBUGS version 1.4. Bayesian

inference using Gibbs sampling. MRC Biostatistics Unit. Institute for Public Health,

Cambridge, United Kingdom.

Sturtz, S., Ligges, U., and Gelman, A. (2005). R2WinBUGS: a package for running

WinBUGS from R. Journal of Statistical Software, 12(3):1–16.

Trevanich, A. and Bourke, P. (1993). EWMA control charts using attributes data. The

Statistician, 42(3):215.






24(6):721–735.

White, C. H., Keats, J. B., and Stanley, J. (1997). Poisson CUSUM versus c-chart for

defect data. Quality Engineering, 9(4):673–679.



Woodall, W. H. and Adams, B. M. (1993). The statistical design of CUSUM charts.


CHAPTER 7

Bayesian Multiple Change Point Estimation of

Poisson Rates in Control Charts

Preamble

Any enhancement in quality of a process is gained through the quick detection of an

out-of-control state and investigation of potential causes of such shifts. This is followed

by implementation of preventive and corrective actions. The need to know the time at

which a process began to vary, the so-called change point, has been recently discussed in

the industrial context of quality control. Accurate estimation of the time of change can

help in the search for a potential cause more efficiently as a tighter time-frame prior

to the signal in the control charts is investigated. Several methods including MLE

estimators and data mining techniques such as Neural networks and Fuzzy clustering

have been proposed and investigated for processes involving single variable, multivariate

and monitoring profiles.

An overview of the related body of literature revealed that the capabilities of the Bayesian
framework in this stream of research have been ignored so far. In a Bayesian setting the
results obtained from the model are highly informative and can contribute directly to
the decisions made in root causes analysis. Moreover, this approach, along with
computational techniques such as MCMC, simplifies modeling the change point for complex
processes and scenarios and simultaneously avoids the analytical difficulties.

Recently, the case in which no a priori knowledge exists on the model of changes prior to the
control chart’s signal has been addressed. Among the several scenarios, consecutive monotonic
step changes may lead the process to be out-of-control. In this study, a multiple change
point model for a Poisson process mean prior to a c-chart signal was considered in a Bayesian
framework. This model is an extension of the models proposed and evaluated
in Chapter 6. The number of step changes was unknown and treated as a random
variable. Using reversible jump MCMC, posterior distributions of the number of change
points, as well as the time and the magnitudes of changes, were obtained over several
change sizes and directions. Simulations showed that more accurate estimates of the time
of changes can be obtained when the Bayesian estimator is used in conjunction with the
Poisson control chart. These estimates were also supported by probabilistic inferences
for the time and the magnitude of changes. Compared with the alternatives, Poisson EWMA
and CUSUM built-in estimators and an MLE-based estimator, the Bayesian estimator
performed satisfactorily over consecutive monotonic and non-monotonic changes. This
superiority is enhanced when probability quantification, flexibility and generalization
of the Bayesian multiple change point model are considered.



chapter contributes to methods since, using a Bayesian framework and computational

components, a change point estimator was designed to estimate the number, time and magnitude

of consecutive step changes prior to the Poisson control chart’s signal.












Assareh, H. and Mengersen, K. (2011) Bayesian multiple change point estimation of

Poisson rates in control charts, IIE Transactions, submitted.









7.1 Abstract

Precise identification of the time when a process has changed enables process engineers

to search for a potential special cause more effectively. In this paper, we consider

Bayesian change point estimation methods for a Poisson process in a control chart

context. We apply Bayesian hierarchical models to formulate the change point where

there exists an unknown number of step changes in the Poisson rate. Reversible Jump

Markov Chain Monte Carlo is used to obtain posterior distributions of the change point

parameters including number, location and magnitude of changes and also correspond-

ing probabilistic intervals and inferences. The performance of the Bayesian estimator

is investigated through simulations and the results show that precise estimates can

be obtained when they are used in conjunction with the c-chart for different num-

ber and direction of change scenarios. Compared with alternatives, Poisson EWMA

and CUSUM built-in estimators and an MLE-based estimator, the Bayesian estimator

performs satisfactorily over consecutive monotonic and non-monotonic changes. This

superiority is enhanced when probability quantification, flexibility and generalization

of the Bayesian multiple change point model are considered.

7.2 Introduction

Statistical process control charts are used to detect changes in a process by distinguish-

ing between assignable causes and common causes of the process variation. When a

control chart signals, process engineers initiate a search to identify and eliminate the

source of variation. Knowing the time at which the process began to vary, the so-called

change point, would help to conduct the search more efficiently in a tighter time-frame.

Taylor (2000) highlighted the capability of change point analysis in characterization

of a change and promoted it as a complementary tool in process control efforts. This

approach was widely undertaken in various contexts. In an industrial area, Duarte and

Saraiva (2003) showed the informativeness of change point estimation in conjunction

with control charts in monitoring a quality variable of a magnesium bisulphite pulp mill

process. In a clinical context, Brown et al. (2004) applied this approach to study the


effect of stimulation of the subthalamic area in Parkinson’s disease and Assareh et al.

(2011) studied the change point in monitoring adverse events; and in an environmental

study, Abu-Taleb et al. (2007) monitored relative humidity in Jordan during 1923-2006

and identified an increasing trend that occurred in 1979 using change point analysis.

A Poisson process is often used to model the number of occurrences in an interval of

time. In this regard, Poisson based control charts have been developed and frequently

applied in industry to monitor the number of defects and nonconformities in a product

(Gardiner, 1987; White et al., 1997), and in health to monitor patient mortality and

spread of an infection in a hospital (Benneyan, 1998; Limaye et al., 2008). The most

commonly used control charts adopted for Poisson distributed data include c-charts

(Shewhart, 1926, 1927), cumulative sum (CUSUM) (Page, 1954, 1961; Brook and Evans,

1972) and exponentially weighted moving average (EWMA) (Roberts, 1959; Trevanich

and Bourke, 1993; Borror et al., 1998) control charts; see Montgomery (2008) and

Woodall (1997) for more details.

Poisson CUSUM and Poisson EWMA charts are more sensitive for detecting small shifts

in the process parameters, whereas a c-chart is efficient in the detection of large shifts

(Montgomery, 2008). However, upon signaling, none of them provides information

regarding the time at which the process changed and the magnitude and the type of

the change.

There exists a change point estimator in CUSUM charts suggested by Page (1954)

and also an equivalent estimator in EWMA charts proposed by Nishina (1992). These

estimators are known as built-in estimators since the time of the change is estimated

through monitoring the behavior of the cumulative sum and exponentially weighted moving average

statistics of CUSUM and EWMA control charts, respectively. Samuel et al. (1998)

developed and applied a maximum likelihood estimator (MLE) for the change point

in a c-chart assuming that the change type is a step change. They demonstrated how

closely it estimates the change point in comparison with the usual c-chart signal.
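The idea behind a maximum likelihood estimator of a step change point can be sketched in Python as follows. This is an illustration in the spirit of the estimator described above, not a reproduction of the formula in Samuel et al. (1998): for each candidate τ the post-change rate is replaced by the mean of the observations after τ, and the τ maximizing the resulting log-likelihood ratio against the no-change model is returned. The function name is hypothetical.

```python
import math


def mle_step_change(x, lam0):
    """Profile-likelihood estimate of the last in-control sample tau (0..T-1).

    x holds the counts for samples 1..T, with the last one at the signal.
    For each tau, the log-likelihood ratio versus no change is
    sum_{t > tau} [x_t * log(lam1 / lam0) - (lam1 - lam0)],
    with lam1 set to the mean of the counts after tau.
    """
    T = len(x)
    best_tau, best_ll = None, -math.inf
    suffix_sum = 0
    for tau in range(T - 1, -1, -1):
        suffix_sum += x[tau]      # x[tau] is sample tau + 1, the first post-change sample
        n = T - tau
        lam1 = suffix_sum / n
        if lam1 <= 0:
            continue              # all-zero suffix: the logarithm is undefined, skip it
        ll = suffix_sum * math.log(lam1 / lam0) - n * (lam1 - lam0)
        if ll > best_ll:
            best_tau, best_ll = tau, ll
    return best_tau, best_ll
```

For ten counts of 20 followed by ten counts of 40 with λ0 = 20, the maximum is attained at τ = 10, the last in-control sample.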

Perry (2004) evaluated the performance of the MLE estimator and reported that it

outperforms Poisson CUSUM and Poisson EWMA built-in estimators in presence of a

step change. He also constructed a confidence set on the estimated change point which


covers the true process change point with a given level of certainty using a likelihood

function based upon the method proposed by Box and Cox (1964).

Perry et al. (2006) then derived an MLE estimator and confidence set under a linear

trend assumption where the process parameter changes over time. They showed that

this is superior to the step change estimator if a linear trend disturbance occurs in the

Poisson rate.

Perry et al. (2007) challenged the underlying assumption of knowing the form of change

types in these approaches and noted that either a step change or a linear trend with

constant slope could not adequately describe what often happens in practice. They ex-

tended the MLE approach to the situation in which no prior knowledge of the change

type exists. The only assumption they made was that the form of shifts belongs to

the set of monotonic effects. They derived a change point estimator and constructed

confidence sets for non-decreasing multiple step change points using isotonic regression

models. The performance of this estimator was compared with the step change and

linear trend MLE estimators where a step change, a linear trend and multiple change

points are present. The multiple change point estimator was reported to relatively

outperform other MLE estimators for some magnitudes of step and linear trend dis-

turbances and in the case of multiple change points it was shown to be the superior

estimator. However, the estimator still remains dependent on a priori knowledge about

the behavior of the shifts, such as monotonic change. In practice, it is not uncommon

to experience non-monotonic consecutive changes that may occur as a result of one

influential process input variable changing several times or several influential process

input variables changing at different times. Indeed, these changes could influence the

process mean in any direction and lead to multiple change points in the Poisson mean

which are not necessarily monotonic.

An interesting approach which has only recently been considered in the statistical pro-

cess control context is Bayesian hierarchical modelling (BHM) using, where necessary,

computational methods such as Markov Chain Monte Carlo (MCMC) and Reversible

Jump Markov Chain Monte Carlo (RJMCMC). Application of these theoretical and



of inferences based on posterior distributions for the number, the time and the mag-

nitude of a change as well as assessing the validity of underlying assumptions in the

change point model itself (Gelman et al., 2004; Brooks, 1998; Green, 1995; Lavielle and

Lebarbier, 2001).

In this paper we model and estimate the change point in a Bayesian framework. We

first describe the Bayesian model and define RJMCMC details. The application of the

model is demonstrated through a simulation study of a set of change point scenarios.

We then investigate the performance of the model over a wide range of consecutive

changes. The model is explained in Section 7.3 and implemented and analysed in

Section 7.4. We then compare the performance of the estimator with alternatives in

Section 7.5 and summarize the study and obtained results in Section 7.6.

7.3 Bayesian Multiple Change Point Model and RJMCMC Steps

7.3.1 Model







the observations given the quantity of interest, and “Posterior” is the state of knowledge

about the quantity after data are observed which also is in the form of a probability

distribution. This structure is expandable to multiple levels in a hierarchical fashion,

so-called Bayesian hierarchical models (BHM), which allows us to enrich the model by

capturing all kinds of uncertainties for observed data as well as priors. In complicated

BHMs it is not easy to obtain the posterior distribution analytically. This analytic


bottleneck has been eliminated by the emergence of MCMC methods. In MCMC

algorithms a Markov chain, also known as a random walk, is constructed whose sta-

tionary distribution is the posterior distribution of the parameter of interest. Samples

generated from a long run of the Markov chain using a proposal transition density

are drawn from posterior distributions of interest. Some common MCMC methods for

drawing samples include Metropolis-Hastings and the Gibbs sampler, see Gelman et al.


Consider a Poisson process Xt, t = 1, ..., T , that is initially in-control and then k

change points with unknown location and magnitude occur in the process rate. Thus

at k unknown points in time, τk,1, τk,2, ..., τk,k, the Poisson rate parameter changes

from its known in-control state of λk,0 to λk,l, λk,l = λk,0 + δk,l and λk,l ≠ λk,0 for

l = 1, ..., k. The Poisson process multiple change point model can be parameterized

as xt ∼ Poisson(λk,i), t = τk,i, ..., τk,i+1 for i = 0, ..., k where τk,0 = 1 and τk,k+1 = T .

That is,

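\[
p(x_t \mid \lambda_t) =
\begin{cases}
\exp(-\lambda_{k,0})\,\lambda_{k,0}^{x_t}/x_t! & \text{if } t = 1, 2, \ldots, \tau_{k,1}, \\
\exp(-\lambda_{k,1})\,\lambda_{k,1}^{x_t}/x_t! & \text{if } t = \tau_{k,1} + 1, \ldots, \tau_{k,2}, \\
\quad\vdots & \\
\exp(-\lambda_{k,k})\,\lambda_{k,k}^{x_t}/x_t! & \text{if } t = \tau_{k,k} + 1, \ldots, T.
\end{cases}
\tag{7.2}
\]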

The quantities of interest are thus the number, the time and the magnitude of the

changes.
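As a minimal sketch of how the likelihood in Equation (7.2) can be evaluated, the Python function below sums the Poisson log-probabilities segment by segment. The function name and argument layout are illustrative assumptions and are not taken from the MATLAB code used in this study.

```python
import math


def log_likelihood(x, change_points, rates):
    """Log-likelihood of counts under a piecewise-constant Poisson rate.

    x             -- counts x_1, ..., x_T (x[0] holds x_1)
    change_points -- sorted 1-based times tau_1 < ... < tau_k; as in
                     Eq. (7.2), rates[i] applies from t = tau_i + 1 onward
    rates         -- the k + 1 rates lambda_0, ..., lambda_k
    """
    if len(rates) != len(change_points) + 1:
        raise ValueError("need exactly one more rate than change points")
    total = 0.0
    segment = 0
    for t, count in enumerate(x, start=1):
        # Advance to the next segment once t has passed the current change point.
        while segment < len(change_points) and t > change_points[segment]:
            segment += 1
        lam = rates[segment]
        # log of exp(-lam) * lam**count / count!
        total += -lam + count * math.log(lam) - math.lgamma(count + 1)
    return total
```

Working on the log scale avoids underflow when T is large, which matters once such a function is called inside a sampler.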

Let the maximum number of change points be K − 1, so that there exist K models,

mk, k = 0, 1, ..., K − 1, where k is the number of changes in the Poisson process. We

assign a discrete distribution for k; for example in the following simulation study, a

uniform distribution is imposed due to lack of any other information and K is set to 7

based on the problem context, so that f(m = k) = 1/7, k = 0, ..., 6. In other contexts

other distributions such as truncated Poisson or Gamma might also be of interest; see

Gelman et al. (2004) for more details on selection of prior distributions.

We place a Gamma distribution as a prior for the mean of the Poisson process; so that

λk,i ∼ Gamma(αk,i, βk,i), i = 0, ..., k. For example, in the simulation study described


below, since there is no other information on which to base the choice of the hyperparameters,

we follow Carlin and Louis (2000) and set all αk,i, βk,i for k = 0, ...,K−1 and i = 0, ..., k

to be equal and use Empirical Bayes methods to estimate α and β. Thus we let the

prior have a mean (α/β) of 20, equal to the in-control rate λk,0, and a variance (α/β²)

of at least 6 ×√λk,0, approximately. This is a reasonably informative prior for the

magnitude of the change in an in-control Poisson rate as the control chart is sensitive

enough to detect very large shifts and estimate associated change points. We thus set

α = 10 and β = 0.5.
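The choice of hyperparameters can be checked with a few lines of arithmetic. The sketch below assumes the Gamma distribution is parameterised by shape α and rate β, which is the parameterisation implied by the mean α/β and variance α/β² quoted above; the helper name is illustrative.

```python
import math


def gamma_moments(alpha, beta):
    """Mean and variance of a Gamma prior with shape alpha and rate beta."""
    return alpha / beta, alpha / beta ** 2


lambda_0 = 20.0            # in-control rate used in the simulation study
alpha, beta = 10.0, 0.5    # hyperparameters chosen in the text
prior_mean, prior_var = gamma_moments(alpha, beta)
variance_floor = 6 * math.sqrt(lambda_0)   # the "at least 6 x sqrt(lambda_0)" bound
print(prior_mean, prior_var, round(variance_floor, 2))
```

With these values the prior mean equals the in-control rate of 20 and the prior variance of 40 exceeds the stated lower bound.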

7.3.2 Parameter Estimation

To obtain posterior estimates of the parameters of interest, we apply the RJMCMC

method (Green, 1995) which has extensively been studied and used in complex change

point and model selection problems (Brooks, 1998; Lavielle and Lebarbier, 2001; Zhao

and Chu, 2010). RJMCMC provides a general framework for Markov chain Monte

Carlo (MCMC) simulation in which the dimension of the parameter space can vary

between iterations of the Markov chain. Thus, the dimensionality of the space, here

the number of change points, is considered to be a stochastic variable as well as the time

and magnitude of the change given the dimension. In this view, the reversible jump

sampler can be seen as an extension of the standard Metropolis-Hastings algorithm

onto more general state spaces that jumps between models with parameter spaces of

different dimensions.

Let θm denote the parameter vector corresponding to model m, where θm has dimension

dm. If the current state of the Markov chain is (m, θm),

then a general version of the algorithm is the following:

(a) Propose a new model m′ with probability j(m,m′).

(b) Generate u from a specified proposal density q(u | θm,m,m′).

(c) Propose a new vector of parameters θ′m′ by setting (θ′m′ , u′) = gm,m′(θm, u) where

gm,m′ is a specified invertible function.


(d) Accept the proposed move to model m′ with probability

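\[
\alpha = \min\left(1,\;
\frac{f(x \mid m', \theta'_{m'})\, f(\theta'_{m'} \mid m')\, f(m')\, j(m', m)\, q(u' \mid \theta_m, m', m)}
     {f(x \mid m, \theta_m)\, f(\theta_m \mid m)\, f(m)\, j(m, m')\, q(u \mid \theta_{m'}, m, m')}
\left| \frac{\partial g(\theta_m, u)}{\partial (\theta_m, u)} \right| \right).
\tag{7.3}
\]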

(e) Return to step (a) until the required number of iterations is reached.
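In practice the acceptance probability in step (d) is usually assembled on the log scale. The following is a hedged sketch in which each argument is the logarithm of one group of factors in Equation (7.3); the argument names are assumptions introduced here for illustration only.

```python
import math


def acceptance_probability(log_post_new, log_post_old,
                           log_jump_num, log_jump_den,
                           log_q_num, log_q_den,
                           log_abs_jacobian=0.0):
    """min(1, ratio) of Eq. (7.3), computed from log-scale components.

    log_post_new / log_post_old -- log of likelihood x parameter prior x model
                                   prior for the proposed and current states
    log_jump_num / log_jump_den -- log j(m', m) and log j(m, m')
    log_q_num / log_q_den       -- logs of the proposal densities appearing in
                                   the numerator and denominator of Eq. (7.3)
    log_abs_jacobian            -- log of the absolute Jacobian determinant
    """
    log_ratio = (log_post_new - log_post_old
                 + log_jump_num - log_jump_den
                 + log_q_num - log_q_den
                 + log_abs_jacobian)
    # Cap at 1 without exponentiating a large positive number.
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)
```

A proposed move is then accepted when a Uniform(0, 1) draw falls below the returned value.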

The proportion of times that a model m is accepted in the simulation represents the

posterior probability of the model and the samples from each iteration within the

model m are drawn from the posterior distributions of the parameter set of θm.
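The statement above translates directly into a frequency count over the sampled model indices. The sketch below assumes the sampler has recorded the number of change points k at every iteration in a list; the function name and the burn-in argument are illustrative.

```python
from collections import Counter


def model_probabilities(k_chain, burn_in=0):
    """Estimate p(m_k | x) as the share of retained iterations spent in m_k."""
    kept = k_chain[burn_in:]
    if not kept:
        raise ValueError("no iterations left after burn-in")
    counts = Counter(kept)
    return {k: counts[k] / len(kept) for k in sorted(counts)}
```

For example, a chain that visits models 1, 1, 1 and 2 after burn-in yields estimated probabilities of 0.75 for m1 and 0.25 for m2.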

Important elements of the algorithm are the proposal distributions q(u′ | θm,m′,m) and

the matching function gm,m′ . The vectors u and u′ are used to make the dimensions of

the parameter spaces of m and m′ equal.

The corresponding proposal distributions are usually constructed by single MCMC runs

within each model, while the matching function gm,m′ is constructed by considering the

structural properties of each model and their possible association. In the following, we

adopt the approach taken by Zhao and Chu (2010); for completeness we paraphrase

the RJMCMC steps below.

7.3.3 Birth and Death of a Change Point

In step (a) of the RJMCMC algorithm, a model, mk, is randomly proposed and can

be limited to adjacent models, mk−1 and mk+1 say, of the last iteration. We set

the probability of transition to adjacent models, the so-called birth and death of a

change point, j(mk,mk+1) = j(mk,mk−1) = 0.5 for 0 < k < K − 1 and j(m0,m1) =

j(mK−1,mK−2) = 1 where there only exists one adjacent model.
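The transition rule between adjacent models can be written as a small function. This is only a sketch of the probabilities stated above; the function name and the (birth, death) return convention are assumptions.

```python
def move_probabilities(k, K):
    """Return (birth, death) probabilities for model m_k, with k in 0..K-1."""
    if not 0 <= k <= K - 1:
        raise ValueError("k must lie in 0..K-1")
    if K == 1:
        return 0.0, 0.0      # a single model has no neighbour to move to
    if k == 0:
        return 1.0, 0.0      # only a birth is possible from m_0
    if k == K - 1:
        return 0.0, 1.0      # only a death is possible from m_{K-1}
    return 0.5, 0.5
```

With K = 7, as in the simulation study, interior models split their mass evenly while m0 and m6 each have a single admissible move.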

For ease of exposition, subscripts of new parameters obtained through birth and death

moves are dropped. In the birth of a new change point τ , in a move from mk to mk+1,

all existing change points and most Poisson rates remain untouched. A non-informative

prior for τ is p(τ) = 1/(T − k − 1), as the birth cannot occur at t = 1 or at the existing change points τk,1, ..., τk,k.

Assume that τ occurs within (τk,j , τk,j+1) and splits this epoch into two parts. In this

circumstance, the old λk,j is replaced by two new rates λ1 and λ2, where under the


competing model mk+1 their conditional posteriors are

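\[
\begin{aligned}
\lambda_1 \mid x, \theta_{m_k}, \tau &\sim \mathrm{Gamma}\Big(\alpha + \sum_{t=\tau_{k,j}}^{\tau-1} x_t,\; \beta + \tau - \tau_{k,j}\Big), \\
\lambda_2 \mid x, \theta_{m_k}, \tau &\sim \mathrm{Gamma}\Big(\alpha + \sum_{t=\tau+1}^{\tau_{k,j+1}-1} x_t,\; \beta + \tau_{k,j+1} - \tau - 1\Big).
\end{aligned}
\tag{7.4}
\]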

In contrast, for the death of a change point, in a move from mk+1 to mk, two epochs

are merged and the two rates are replaced by one rate. The conditional posterior of

the merged rate is

\[
\lambda \mid x, \theta_{m_{k+1}} \sim \mathrm{Gamma}\Big(\alpha + \sum_{t=\tau_{k+1,j}}^{\tau_{k+1,j+2}-1} x_t,\; \beta + \tau_{k+1,j+2} - \tau_{k+1,j}\Big).
\tag{7.5}
\]
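The Gamma-Poisson conjugate updates for the birth and death moves can be sketched as follows. The code follows the summation ranges of Equations (7.4) and (7.5) as printed; the function names and the 1-based time arguments are assumptions made for illustration, and the default hyperparameters are those of the simulation study.

```python
def segment_posterior(x, start, end, alpha=10.0, beta=0.5):
    """Gamma (shape, rate) posterior for the rate over times start..end, 1-based inclusive."""
    if end < start:
        raise ValueError("empty segment")
    counts = x[start - 1:end]
    return alpha + sum(counts), beta + len(counts)


def birth_posteriors(x, tau_left, tau, tau_right, alpha=10.0, beta=0.5):
    """Posteriors of the two new rates when tau splits (tau_left, tau_right), Eq. (7.4)."""
    lam1 = segment_posterior(x, tau_left, tau - 1, alpha, beta)
    lam2 = segment_posterior(x, tau + 1, tau_right - 1, alpha, beta)
    return lam1, lam2


def death_posterior(x, tau_left, tau_right, alpha=10.0, beta=0.5):
    """Posterior of the single merged rate over tau_left..tau_right - 1, Eq. (7.5)."""
    return segment_posterior(x, tau_left, tau_right - 1, alpha, beta)
```

Each returned pair can be passed to a Gamma sampler; note that Python's `random.gammavariate` expects a scale, so the rate would need to be inverted first.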

7.3.4 Proposal Distributions

Finding appropriate proposal densities for the moves, qmk,mk+1(u′ | θmk, mk+1, mk) for birth

and qmk+1,mk(u | θmk+1, mk, mk+1) for death, is critical in the RJMCMC algorithm.

For a birth move, the vector u′ includes three parameters τ, λ1 and λ2. We let the

proposal density be

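\[
q(\tau, \lambda_1, \lambda_2 \mid \theta_{m_k}) = p(\lambda_1 \mid \theta_{m_k}, \tau) \times p(\lambda_2 \mid \theta_{m_k}, \tau) \times p(\tau \mid \theta_{m_k}),
\tag{7.6}
\]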

where p(λ1 | θmk, τ) and p(λ2 | θmk, τ) are the posteriors obtained in Equation (7.4)

and p(τ | θmk) is set close to the posterior of the new change point, calculated

as below (see Zhao and Chu (2010) for derivation details):

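\[
p(\tau \mid \theta_{m_k}, \lambda_1, \lambda_2, m_{k+1}, x) \propto e^{(\tau - \tau_{k,j})(\lambda_2 - \lambda_1)} \left(\lambda_1 / \lambda_2\right)^{\sum_{t=\tau_{k,j}}^{\tau-1} x_t},
\tag{7.7}
\]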


where λ1 and λ2 are replaced by the means of the posteriors obtained in Equation (7.4).
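A discrete proposal for the new change point can be obtained by evaluating an unnormalised weight at each candidate τ and normalising. The sketch below uses the weight exp((τ − τk,j)(λ2 − λ1)) (λ1/λ2)^Σ, which is the form obtained from the ratio of the Poisson likelihoods of the two sub-epochs; the function name, the candidate range and the 1-based indexing are assumptions.

```python
import math


def tau_proposal(x, tau_left, tau_right, lam1, lam2):
    """Normalised weights over candidate tau in tau_left + 1 .. tau_right - 1.

    x holds x_1, ..., x_T with x[0] = x_1; lam1 applies to tau_left..tau - 1.
    """
    candidates = list(range(tau_left + 1, tau_right))
    if not candidates:
        raise ValueError("no admissible position for a new change point")
    log_weights = []
    running = 0  # sum of x_t for t = tau_left, ..., tau - 1
    for tau in candidates:
        running += x[tau - 2]  # adds x_{tau - 1}
        log_weights.append((tau - tau_left) * (lam2 - lam1)
                           + running * math.log(lam1 / lam2))
    peak = max(log_weights)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in log_weights]
    total = sum(weights)
    return {tau: w / total for tau, w in zip(candidates, weights)}
```

Subtracting the largest log-weight before exponentiating keeps the computation stable when the epoch is long.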

For a death move, the vector u includes one parameter λ. We need to propose a new rate

for the period [τk,j , τk,j+1 − 1] under mk. Here, the proposal is set in a straightforward

manner by applying the posterior of λ as below:

\[
p(\lambda \mid \theta_{m_{k+1}}) \sim \mathrm{Gamma}\Big(\alpha + \sum_{t=\tau_{k+1,j}}^{\tau_{k+1,j+2}-1} x_t,\; \beta + \tau_{k+1,j+2} - \tau_{k+1,j}\Big).
\tag{7.8}
\]

All priors and proposals obtained through Sections 7.3.1-7.3.4 are then replaced in the

acceptance ratio defined in Equation (7.3) for birth and death moves appropriately.

See Zhao and Chu (2010) for more details.

7.4 Performance Analysis

7.4.1 Simulation

We used Monte Carlo simulation to study the performance of the constructed BHM in

multiple change detection following a signal from a c-chart. Processes with one, two and

three change points were considered, k = {1, 2, 3}. We generated 25 observations of a

Poisson process with an in-control rate of λk,0 = 20. We then induced step changes until

the c-chart (Shewhart, 1926, 1927) signalled. Because we know that the process is in-

control, if an out-of-control observation was generated in the simulation of the first 25

in-control observations, it was taken as a false alarm and the simulation was restarted.

However, in practice a false alarm may lead to stopping the process and searching for

root causes. When no cause is found, the process would continue without adjustment.

The simulation was also repeated for rate parameters of 5 and 10 over equivalent step

changes; since the results were similar to those obtained for λk,0 = 20, k = {1, 2, 3},

they are not reported here. We then examined the behavior of the posterior estimates

of the number, k, magnitude, δk,1, . . . , δk,k, and location, τk,1, . . . , τk,k, of changes using

100 replications.
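The data-generating procedure described above can be sketched in Python as follows. The original simulations were run in MATLAB; the sketch assumes conventional three-sigma c-chart limits, λ0 ± 3√λ0, and all function names are hypothetical. Step sizes are taken relative to the in-control rate, as in the model of Section 7.3.1.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) count with Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def c_chart_limits(lam0):
    """Assumed three-sigma limits for a c-chart with known in-control rate."""
    half_width = 3.0 * math.sqrt(lam0)
    return max(0.0, lam0 - half_width), lam0 + half_width


def simulate_until_signal(lam0, change_points, deltas, seed=0, max_len=10_000):
    """Simulate counts with step changes until the c-chart signals.

    change_points -- 1-based times tau_1 < ... < tau_k; the rate becomes
                     lam0 + deltas[i] from t = tau_i + 1 onward
    Runs that signal during the in-control period are discarded and redrawn.
    If no signal occurs within max_len samples the unsignalled run is returned.
    """
    if not change_points or len(change_points) != len(deltas):
        raise ValueError("need one step size per change point")
    rng = random.Random(seed)
    lcl, ucl = c_chart_limits(lam0)
    while True:
        x, false_alarm = [], False
        for t in range(1, max_len + 1):
            passed = sum(1 for tau in change_points if t > tau)
            lam = lam0 + (deltas[passed - 1] if passed else 0.0)
            x.append(poisson_draw(lam, rng))
            if x[-1] < lcl or x[-1] > ucl:
                false_alarm = t <= change_points[0]
                break
        if not false_alarm:
            return x
```

For instance, `simulate_until_signal(20.0, [25], [5.0])` returns 25 in-control counts followed by counts at rate 25 up to and including the signalling sample.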


The multiple changes and control charts were simulated in MATLAB. For each change

point scenario, we modified and used the RJMCMC algorithm made available in MATLAB

by Zhao and Chu (2010) to generate 100,000 samples with the first 20,000 samples

ignored as burn-in.

7.4.2 One Change Point

Two processes were simulated in which step changes of sizes δ1,1 = +5 and δ1,1 = −5

were induced at τ1,1 = 25. The posterior distributions for the number and the time of

step changes for the two processes are presented in Figure 7.1. For both change sizes,

a larger mass of the probability function concentrates on the model with one change

point, see Figure 7.1-a1,b1. Acceptance of the model with one change point, m1, leads

to posterior distributions of the time, τ1,1, and the magnitude, δ1,1, of the change. As

seen in Figure 7.1-a2,b2, the posteriors for time concentrate on the 25th sample which

is the real change point. Since the posteriors tend to be asymmetric, the mode of the

posteriors is used as an estimator for the change point model parameter.

Table 7.1 shows the posterior estimates for the induced change sizes, δ1,1 = ±5, in

the process mean. The Bayesian estimator suggests that one change point is more

probable, p(m1) = 0.48, prior to signal of the c-chart where a change of size δ1,1 = −5,

around one standard deviation, was induced. The c-chart detects a fall after 51 samples

whereas the mode of the posterior distribution of τ1,1 reports the 25th sample accurately

as the change point. For an increase of size δ1,1 = +5, although the posterior mode

underestimates the time of the change, τ1,1 = 24, it still outperforms the c-chart.

Posterior estimates of the magnitude of the change tend to be quite accurate, taking

into account their corresponding standard deviations.



interval which involves those values of highest probability in the posterior density of the

parameter of interest. Table 7.1 also presents 80% credible intervals for the estimated

time and the magnitude of step changes. As expected, the CIs are affected by the

dispersion and higher order behaviour of the posterior distributions. As shown in Table


[Figure 7.1: panels (a1) and (b1) plot probability against the number of change points; panels (a2) and (b2) plot probability against time τ1,1.]

Figure 7.1 Posterior distributions of the number k and the time τ1,1 of a step change of sizes (a) δ1,1 = −5 and (b) δ1,1 = +5 following signals from c-chart where λ1,0 = 20, and τ1,1 = 25.

Table 7.1 Posterior distributions (mode, sd.) of multiple change point model parameters mk and θm1 = (τ1,1, δ1,1) following signals (RL) from c-chart where λ1,0 = 20 and τ1,1 = 25. Standard deviations and 80% credible intervals are shown in round and square parentheses, respectively.

δ1,1   RL    p(m0)     p(m1)   p(m2)   p(m3)   τ1,1             δ1,1
-5     76    0.0009    0.43    0.34    0.15    25               -4.97
                                               (5.46)           (0.51)
                                               [24.47,25.55]    [-5.01,-4.95]
+5     44    0.019     0.51    0.29    0.12    24               5.88
                                               (3.68)           (1.02)
                                               [23.97,24.23]    [5.79,5.91]

7.1 and discussed above, the magnitudes of the changes are not estimated as precisely

as the time of the change.

The probability of having a specified number of changes is presented in Table 7.1.

The probability of having more than three changes prior to signalling of the chart is

0.08 when a decrease of size δ1,1 = −5 has occurred. It is less likely, 0.06, for a

change of size δ1,1 = +5. An interesting inference can also be made on the falseness


Table 7.2 Average of posterior estimates (E(mode), E(sd.)) of multiple change point model parameters mk and θm1 = (τ1,1, δ1,1) following signals (RL) from c-chart where λ1,0 = 20 and τ1,1 = 25. Standard deviations are shown in parentheses.

δ1,1   E(RL)      E(p(m0))  E(p(m1))  E(p(m2))  E(p(m3))  E(τ1,1)   E(στ1,1)   E(δ1,1)
-15    26.30      0.001     0.46      0.34      0.18      25.65     2.08       -6.45
       (0.64)     (0.05)    (0.04)    (0.02)    (0.02)    (0.71)    (1.54)     (2.29)
-10    32.83      0.006     0.68      0.24      0.05      26.00     2.54       -9.52
       (7.23)     (0.01)    (0.08)    (0.05)    (0.02)    (0.93)    (3.84)     (1.41)
-5     148.92     0.10      0.36      0.27      0.15      28.25     127.92     -1.49
       (126.32)   (0.08)    (0.05)    (0.03)    (0.03)    (2.81)    (55.34)    (0.70)
-3     293.51     0.24      0.35      0.22      0.10      33.35     529.13     -0.31
       (257.68)   (0.09)    (0.07)    (0.02)    (0.02)    (13.12)   (180.05)   (0.28)
+3     100.39     0.22      0.32      0.24      0.13      28.14     88.24      1.09
       (58.25)    (0.06)    (0.03)    (0.02)    (0.01)    (12.38)   (24.77)    (1.04)
+5     45.10      0.05      0.40      0.30      0.15      27.72     15.95      3.25
       (20.21)    (0.07)    (0.05)    (0.02)    (0.02)    (9.25)    (10.80)    (1.36)
+10    29.18      0.002     0.58      0.28      0.09      26.01     1.53       8.38
       (3.34)     (0.01)    (0.08)    (0.04)    (0.03)    (1.35)    (1.98)     (2.15)
+15    26.63      0.008     0.51      0.28      0.12      25.86     1.16       11.3
       (1.06)     (0.03)    (0.12)    (0.05)    (0.04)    (0.91)    (1.35)     (3.71)

of the signal using p(m0), the probability of having no step change point. We can also

construct other probabilistic inferences using the posterior distributions of parameters.

As an example, the probability that a change point occurred in the last 10, 20 and

40 observed samples prior to signalling in the control charts can be obtained since the

right tail of the posterior was truncated at the chart’s signal. For a step change of size

δ1,1 = −5, since the c-chart signals very late (see Table 7.1), it is unlikely that the

change point has occurred in the last 10, 20 and even 40 samples with probability

0.0, 0.0 and 0.03, respectively, whereas for a change of size δ = +5, it is very probable

that the change has occurred in the last 40 samples with probability 0.99, and with

probability of 0.58, it is between the last 10 and 20 samples. These kinds of probability

computations and inferences can be extended to the magnitude of the changes.
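Both kinds of probabilistic statement made above reduce to simple counts and sums. The sketch below is illustrative only: `tau_samples` stands for a hypothetical list of posterior draws of the change point, and the model probabilities used in the check are the rounded values reported in Table 7.1.

```python
def prob_change_in_last(tau_samples, signal_time, window):
    """Posterior probability that the change lies in the last `window` samples before the signal."""
    if not tau_samples:
        raise ValueError("no posterior samples supplied")
    hits = sum(1 for tau in tau_samples
               if signal_time - window <= tau < signal_time)
    return hits / len(tau_samples)


def prob_more_than(model_probs, k):
    """P(number of changes > k) = 1 - sum of p(m_j) over j <= k."""
    return 1.0 - sum(p for j, p in model_probs.items() if j <= k)


# Rounded model probabilities from Table 7.1 for the two step changes.
decrease = {0: 0.0009, 1: 0.43, 2: 0.34, 3: 0.15}
increase = {0: 0.019, 1: 0.51, 2: 0.29, 3: 0.12}
print(round(prob_more_than(decrease, 3), 2), round(prob_more_than(increase, 3), 2))
```

The two printed values reproduce the probabilities of more than three changes quoted in the text for δ1,1 = −5 and δ1,1 = +5.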

The above inferences are based on a single dataset from each of the processes con-

sidered. To investigate the behavior of the Bayesian estimator for multiple datasets

generated from the same process, we replicated the simulation method explained in

Section 7.4.1 100 times. Simulated datasets that were obvious outliers were excluded.

This replication allows us to obtain distributions of the estimates with standard errors in the order

of 10. The number of replications is, indeed, a compromise between excessive

computational time, considering the RJMCMC iterations, and the sufficiency of the achievable

distributions, even in the tails. Table 7.2 shows the average of the estimated parameters

obtained from the replicated datasets. As seen, although the c-chart detects small to

medium shifts, from half to two standard deviations, with a large delay, it performs

better where there exists a jump. Having a longer delay in detection of a decrease in

the Poisson rate in comparison with an increase of the same size in the c-chart is due

to the equality of mean and the variance of the Poisson distribution. Therefore a fall

in the mean leads to less dispersed observations.
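The asymmetry between drops and jumps can be illustrated by computing the per-sample signal probability of the chart after a shift. The sketch assumes three-sigma c-chart limits around the in-control rate and independent samples, so that the expected delay after the change is the reciprocal of the signal probability; the function names are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability mass at k, evaluated on the log scale."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def signal_probability(lam, lam0=20.0):
    """Chance that one count at rate lam falls outside the limits set for lam0."""
    half_width = 3.0 * math.sqrt(lam0)
    lcl, ucl = max(0.0, lam0 - half_width), lam0 + half_width
    # Counts k with lcl <= k <= ucl do not trigger a signal.
    inside = sum(poisson_pmf(k, lam)
                 for k in range(math.ceil(lcl), math.floor(ucl) + 1))
    return 1.0 - inside


def expected_delay(lam, lam0=20.0):
    """Mean number of post-change samples until the chart signals."""
    return 1.0 / signal_probability(lam, lam0)
```

Comparing `expected_delay(15.0)` with `expected_delay(25.0)` shows a markedly longer delay for the drop than for the jump of the same size, in line with the pattern in Table 7.2.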

For all change sizes, the model with one change point, m1, has the highest posterior

probability; however, the strength of this comparison varies over different change sizes.

As seen in Table 7.2, the probability, p(m1), almost doubles when the magnitude of the

shift increases from δ1,1 = ±3, half a standard deviation, to δ1,1 = ±10, two standard

deviations. As the magnitude of the change increases, the posterior probability of the

model with no change point, p(m0), decreases in favor of the model with one change

point. This implies that the model with no change point, m0, closely competes with the

model with one change point, m1, over small shifts whereas the model with two change

points, m2, is the runner-up over medium to large shifts. For a large shift, δ1,1 = ±15

around three standard deviations, the probability of a model with one change point

significantly drops, particularly where there exists a drop in the Poisson mean. This

is due to the early detection of such shifts by the c-chart that leads to a very short

run of samples after the change which then compresses the data and hence informs the

RJMCMC algorithm.

For a step change of size at most one standard deviation (δ1,1 = ±3,±5) in the Poisson

rate, the average of the posterior modes, denoted here by E(τ), reports at most the

33rd sample as the change point, whereas the corresponding c-charts detect the changes

with delays greater than 75 samples. This superiority persists where a medium shift

of size δ = ±10 has occurred in the process mean. In this scenario, the bias of the

Bayesian estimator does not exceed one observation, whereas the minimum delay is

four samples in detection of the fall. As expected, for large shift sizes (δ1,1 = ±15),

around three standard deviations, the c-chart performs well, yet the expected values of


modes outperform it with a delay of less than one observation.

Table 7.2 reveals that the variation of the Bayesian estimates for time tends to reduce

when the magnitude of shift in the process mean increases. However, by the nature

of the Poisson distribution, for drops, the observed variation is generally less than that

obtained in detection of jumps. The mean of the standard deviation of the posterior

estimates of time, E(στ1,1), also decreases dramatically when moving from small shift

sizes to medium and large sizes.

The average of the Bayesian estimates of the magnitude of the change, E(δ1,1), shows

that the modes of posteriors for change sizes do not perform as well as the corresponding

posterior modes of the time across different shift sizes; however, promising results are

obtained where a medium shift, δ1,1 = ±10, has occurred in the process mean. This

estimator tends to underestimate the sizes. Having said that, Bayesian estimates of

the magnitude of the change must be studied in conjunction with their corresponding

standard deviations. In this manner, analysis of credible intervals is effective.

7.4.3 Two Change Points

We considered a process with k = 2 change points and induced two consecutive

changes of size (δ2,1, δ2,2) = (−5,−10) and (δ2,1, δ2,2) = (−5,+10) that occurred at

(τ2,1, τ2,2) = (25, 35). The posterior distributions for the number and the time of step

changes are presented in Figure 7.2. The Bayesian estimator suggests that it is more

probable, p(m2) = 0.37, that two change points exist prior to signalling of the c-chart

where monotonic changes of size (δ2,1, δ2,2) = (−5,−10), around one and two standard

deviations, were induced. Table 7.3 shows that the c-chart detects such a consecu-

tive fall after 13 samples where the mode of the posterior distribution reports the 25th

and 36th samples as the change points. As seen in Figure 7.2-b1, for a non-monotonic

multiple change of size (δ2,1, δ2,2) = (−5,+10), the model with two change points is

identified as the most probable model with p(m2) = 0.38. Although the posterior mode

underestimates the time of the first change, τ2,1 = 24, it still outperforms the c-chart

which signals at the 38th sample. Bayesian estimates of the magnitude of the changes

tend to estimate the first change more accurately than the second change, but more


[Figure 7.2: panels (a1) and (b1) plot probability against the number of change points; panels (a2) and (b2) plot probability against time τ2,1; panels (a3) and (b3) plot probability against time τ2,2.]

Figure 7.2 Posterior distributions of the number k and the time, τ2,1 and τ2,2, of two consecutive changes of sizes (a) (δ2,1, δ2,2) = (−5,−10) and (b) (δ2,1, δ2,2) = (−5,+10) following signals from c-chart where λ2,0 = 20, and (τ2,1, τ2,2) = (25, 35).

accurate results were obtained for the non-monotonic changes; see Table 7.3.

Similar to the one step change scenario discussed in Section 7.4.2, we are able to

construct credible intervals around estimated parameters as well as make probabilistic

inferences using the obtained posterior distributions. Table 7.3 reveals that most of

true values of time of the consecutive changes are seen in the obtained CIs. For the

monotonic changes, (−5,−10), the probability of having less than two changes is equal

to the probability associated with the accepted model with two change points; see Table


Table 7.3 Posterior distributions (mode, sd.) of multiple change point model parameters mk and θm2 = (τ2,i, δ2,i), i = 1, 2, following signals (RL) from c-chart where λ2,0 = 20, τ2,1 = 25 and τ2,2 = 35. Standard deviations and 80% credible intervals are shown in round and square parentheses, respectively.

δ2,1, δ2,2   RL   p(m0)   p(m1)   p(m2)   p(m3)   τ2,1             δ2,1             τ2,2             δ2,2
-5, -10      38   0.11    0.26    0.37    0.17    25               -4.88            36               -7.21
                                                  (6.22)           (1.74)           (4.87)           (1.46)
                                                  [22.96,25.82]    [-5.02,-4.51]    [34.61,36.72]    [-7.432,-6.98]
-5, +10      38   0.042   0.047   0.38    0.31    24               -4.35            35               4.76
                                                  (3.27)           (1.11)           (1.34)           (1.65)
                                                  [23.84,24.22]    [-4.39,-4.31]    [34.96,35.06]    [4.72,5.11]


Table 7.4 Average of posterior estimates (E(mode), E(sd.)) of multiple change point model parameters mk and θm2 = (τ2,i, δ2,i), i = 1, 2, following signals (RL) from c-chart where λ2,0 = 20, τ2,1 = 25 and τ2,2 = 35. Standard deviations are shown in parentheses.

δ2,1, δ2,2   E(RL)     E(p(m0))  E(p(m1))  E(p(m2))  E(p(m3))  E(τ2,1)   E(στ2,1)  E(δ2,1)   E(τ2,2)   E(στ2,2)  E(δ2,2)
-5, -10      42.28     0.00      0.29      0.47      0.17      26.71     4.47      -8.53     38.71     6.76      -11.38
             (6.73)    (0.00)    (0.08)    (0.05)    (0.03)    (2.75)    (1.53)    (2.38)    (3.89)    (4.94)    (1.06)
-5, +5       68.25     0.12      0.25      0.34      0.17      28.21     12.38     -4.55     36.75     20.97     3.28
             (28.02)   (0.09)    (0.08)    (0.03)    (0.01)    (4.69)    (7.17)    (1.72)    (1.70)    (16.79)   (1.48)
-5, +10      38.64     0.00      0.15      0.52      0.22      25.96     3.84      -4.26     36.00     1.37      9.45
             (2.97)    (0.01)    (0.12)    (0.08)    (0.05)    (1.94)    (1.61)    (4.18)    (0.42)    (1.16)    (2.09)
-3, +10      38.50     0.006     0.22      0.45      0.21      24.64     5.11      -4.02     35.91     2.26      8.21
             (2.97)    (0.03)    (0.11)    (0.08)    (0.05)    (3.48)    (1.75)    (4.04)    (0.58)    (1.14)    (2.42)
+3, -10      42.10     0.00      0.27      0.54      0.14      24.21     4.12      2.28      35.96     3.46      -9.79
             (4.61)    (0.00)    (0.11)    (0.08)    (0.05)    (3.70)    (1.50)    (4.49)    (0.18)    (1.73)    (1.56)
+5, -10      42.30     0.00      0.18      0.59      0.18      26.07     3.41      6.63      35.95     4.16      -9.03
             (6.42)    (0.00)    (0.14)    (0.10)    (0.05)    (2.21)    (1.28)    (1.77)    (0.31)    (3.88)    (1.28)
+5, -5       156.19    0.001     0.28      0.35      0.21      29.32     14.17     4.12      38.14     22.08     -5.45
             (120.58)  (0.05)    (0.04)    (0.02)    (0.02)    (5.71)    (8.54)    (3.32)    (2.39)    (11.34)   (3.29)
+5, +10      39.30     0.007     0.34      0.40      0.18      25.20     7.11      3.12      37.21     5.14      11.03
             (2.90)    (0.01)    (0.04)    (0.03)    (0.02)    (1.39)    (1.65)    (3.04)    (4.42)    (1.19)    (2.15)

7.3. This probability drops to 0.09 in favor of the model with three change points that

competes with m2 where there exist non-monotonic changes of size (−5,+10). Other

probabilistic inferences can also be made about the time and the magnitude of the

consecutive changes; see Section 7.4.2.

We replicated the simulation method explained in Section 7.4.1 100 times in order to

study the behavior of the Bayesian estimator for different datasets drawn from the same

process. Simulated datasets that were obvious outliers were excluded.

Table 7.4 presents the posterior means obtained through the replications. In all change

scenarios (monotonic and non-monotonic), the posterior probability of the model with

two change points, m2, is highest; however, the strength of this varies over different

change sizes. Comparison of (−3,+10) with (−5,+10) shows that when the magnitude

of the first shift increases in the opposite direction of the second change, the probability


of the model with two change points, p(m2), increases. The same result is seen for non-

monotonic cases of (+3,−10) and (+5,−10). In contrast, with reduction in the size of

the second shift, (−5,+10) to (−5,+5), that leads to a decrease of the absolute differ-

ences between non-monotonic consecutive change sizes, p(m2) drops in favor of models

with no and one change points. Notably, when the size of the second change increases

and reaches around two standard deviations, (−5,−10), the associated probability of

m2 increases again due to a drop in the probability of the model with no change point,

p(m0). As seen in Table 7.4, the same results were obtained for an increase in δ2,2

where δ2,1 = +5. This implies that the probability of the model with two changes is

affected by the magnitude of absolute differences between the size of the consecutive

changes and the direction of the changes. For monotonic changes, the model with one

change point, m1, competes with the true model, m2, whereas for small non-monotonic

changes, the model with no change, m0, also contends with the true model.

Table 7.4 shows that, as expected, the performance of the c-chart in detection of two

consecutive changes specifically depends on the size of the second shift. As discussed

in Section 7.4.2 and seen in Table 7.4, although the c-chart detects larger shifts earlier

than small to medium shifts, it is always outperformed by the Bayesian estimator, τ2,1,

in all scenarios. Similar to the behavior of the associated probability of the true model

discussed above, the performance of the posterior modes for the first shift is affected

by the direction of changes and the size of their absolute differences. For non-monotonic

cases where the size of the first shift is half a standard deviation, δ2,1 = ±3, the modes

tend to underestimate the time, with a bias of a sample. With an increase of the size of

the first change to one standard deviation, δ2,1 = ±5, approaching the value of the

second shift size, the modes tend to overestimate the time of the first change. This bias

reaches maximum delays of three and four samples for small non-monotonic scenarios,

(δ2,1, δ2,2) = (−5,+5) and (δ2,1, δ2,2) = (+5,−5), respectively. Table 7.4 shows that the

posterior modes for the time of the second shifts, τ2,2, also tend to overestimate the

time. Having said that, both posterior modes provide reasonably precise estimates of the

location of changes where monotonic and non-monotonic consecutive changes occurred

in the Poisson rate, particularly when associated standard deviations are taken into

account.



Table 7.5 Average of posterior estimates (E(mode), E(sd.)) of multiple change point model parameters mk and θm1 = (τ1,1, δ1,1) following signals (RL) from c-chart for replications in which the number of change points was underestimated where λ2,0 = 20, τ2,1 = 25 and τ2,2 = 35. Standard deviations are shown in parentheses.

δ2,1, δ2,2   E(RL)     E(p(m0))  E(p(m1))  E(p(m2))  E(p(m3))  E(τ1,1)   E(στ1,1)  E(δ1,1)
-5, -10      42.01     0.02      0.50      0.37      0.07      32.36     4.42      -8.50
             (5.94)    (0.13)    (0.06)    (0.07)    (0.06)    (3.99)    (1.94)    (1.68)
-3, -10      43.82     0.00      0.56      0.36      0.06      32.43     5.10      -6.86
             (5.56)    (0.00)    (0.06)    (0.06)    (0.06)    (5.28)    (2.24)    (1.86)
+3, +10      38.88     0.003     0.52      0.37      0.07      29.98     7.79      7.5
             (4.38)    (0.08)    (0.05)    (0.05)    (0.03)    (6.03)    (2.41)    (4.87)
+5, +10      39.06     0.003     0.48      0.39      0.09      27.92     7.68      7.55
             (4.65)    (0.01)    (0.04)    (0.02)    (0.02)    (7.64)    (1.84)    (3.21)

Table 7.4 reveals that the variation of Bayesian estimates for the time of the changes

behaves in almost the same manner. The maximum variation was obtained for the small

non-monotonic cases, (−5,+5) and (+5,−5). The average of the Bayesian estimates of

the magnitude of the changes, E(δ2,1) and E(δ2,2), shows that while the point estimates

slightly deviate from the true values, there is no consistent pattern in these deviations

and the true values are typically encompassed in the corresponding 80% CIs.

For a few simulated datasets (less than 15% approximately) for monotonic multiple

change scenario with two changes, the probability for the model with one change point

was larger than that obtained for the true model, m2. To investigate the performance

of the Bayesian estimator, we accepted the model with one change point, m1, instead of

the true model; then we considered the associated posterior estimates, θm1 . Table 7.5

presents the mean of estimates over replications in which the number of change points

was underestimated. The posterior estimates for time still outperform the c-chart in that

they locate the change point with shorter delays, of at most seven and five observations

for two decreasing, (δ2,1, δ2,2) = (−3,−10), and increasing, (δ2,1, δ2,2) = (+3,+10),

consecutive changes, respectively. In this circumstance, the obtained estimate for the

magnitude of the change is close to the average of the two consecutive changes. This result

implies that, although the proposed Bayesian estimator may fail in identification of the

true model, particularly for monotonic multiple changes, it still provides more accurate

information about the location of the first change in comparison with the c-chart.


7.4.4 Three change points

We induced three consecutive changes of size (δ3,1, δ3,2, δ3,3)=(−5,+5,−5) and (δ3,1,

δ3,2, δ3,3)=(+5,−5,+5) that occurred at (τ3,1, τ3,2, τ3,3) = (25, 35, 45). The posterior

distributions for the number and the time of changes are presented in Figure 7.3. The

Bayesian estimator identified the model with three change points as the most probable, p(m3) = 0.34,

prior to signalling of the c-chart when three non-monotonic changes of the

same size, (−5,+5,−5), each around one standard deviation, were induced. Table 7.6 shows

that the c-chart detects such consecutive shifts after 55 samples, whereas the modes of

the posterior distributions report the 26th, 36th and 46th samples as the change points.

As seen in Figure 7.3-(b1), for a non-monotonic multiple change of the same size but

opposite direction, (+5,−5,+5), the model with three change points is identified as the

most probable model with p(m3) = 0.30. Although the posterior modes overestimate

the time of all changes with a delay of one observation, they still outperform the c-chart,

which signals at the 62nd sample. Bayesian estimates of the magnitude of the changes tend

to estimate the first change more precisely than the other changes; see Table 7.6.

Similar to the one change scenario discussed in Section 7.4.2, we are able to construct

credible intervals around estimated parameters. Precise credible intervals were obtained

for change point model parameters, particularly for the time of shifts; see Table 7.6.

We can also support the estimates with probabilistic inferences, such as the probability

of the occurrence of the change points in a specified number of observed samples prior

to the signal. For examples and discussions see Section 7.4.2.
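
The summaries described here (posterior mode, credible interval, and the probability that a change fell within a given window before the signal) can be computed directly from posterior draws. The following is a minimal sketch only: it assumes the draws of one change time are available as an array, uses an equal-tailed interval (the intervals reported in this chapter may have been constructed differently), and the function name and arguments are hypothetical.

```python
import numpy as np

def summarize_change_point(tau_samples, signal_time, window, level=0.80):
    """Summarize posterior draws of a single change point time.

    tau_samples : posterior draws of the change time (hypothetical input)
    signal_time : sample index at which the control chart signalled
    window      : number of samples before the signal to consider
    level       : coverage of the equal-tailed credible interval (assumption)
    """
    tau = np.asarray(tau_samples)

    # Posterior mode: the most frequently visited value among the draws.
    values, counts = np.unique(tau, return_counts=True)
    mode = values[np.argmax(counts)]

    # Equal-tailed credible interval from empirical quantiles.
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(tau, [alpha, 1.0 - alpha])

    # Posterior probability that the change occurred within the last
    # `window` samples before the signal.
    in_window = (tau >= signal_time - window) & (tau < signal_time)
    prob_in_window = float(np.mean(in_window))

    return mode, (lower, upper), prob_in_window
```

For instance, calling `summarize_change_point(draws, signal_time=62, window=40)` on draws of τ3,1 would return the mode, an 80% interval, and the posterior probability that the first change occurred in the 40 samples preceding a signal at the 62nd sample.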

We replicated the simulation method explained in Section 7.4.1 100 times in order to

study the behavior of the Bayesian estimator over different datasets drawn from the

same population. As the several combinations of shift size and direction investigated

gave results similar to those obtained for the scenario with two changes discussed

in Section 7.4.3, we limit the report and discussion here to changes in the size

of the third shift in non-monotonic multiple change cases.

Table 7.7 presents the posterior means of parameters of interest obtained through the

replications. In all change scenarios, the model with three change points, m3, has the

highest probability; however, the strength of this varies over different change sizes.


[Figure 7.3: eight panels, vertical axis "Probability". Panels (a1), (b1): number of change points k (0 to 6); panels (a2), (b2): Time (3,1); panels (a3), (b3): Time (3,2); panels (a4), (b4): Time (3,3).]

Figure 7.3 Posterior distributions of the number k and the time, τ3,1, τ3,2 and τ3,3, of three consecutive changes of sizes (a) (δ3,1, δ3,2, δ3,3) = (−5,+5,−5) and (b) (δ3,1, δ3,2, δ3,3) = (+5,−5,+5) following signals from c-chart where λ3,0 = 20, and (τ3,1, τ3,2, τ3,3) = (25, 35, 45).

Table 7.6 Posterior distributions (mode, sd.) of multiple change point model parameters mk and θm3 = (τ3,i, δ3,i), i = 1, 2, 3, following signals (RL) from c-chart where λ3,0 = 20, τ3,1 = 25, τ3,2 = 35 and τ3,3 = 45. Standard deviations and 80% credible intervals are shown in round and square parentheses, respectively.

                          p(mk)                         θm3, i=1                        θm3, i=2                        θm3, i=3
δ3,1, δ3,2, δ3,3  RL      m0      m1      m2      m3      τ3,1            δ3,1            τ3,2            δ3,2            τ3,3            δ3,3
-5,+5,-5          81      0.022   0.011   0.21    0.34    26              -4.70           36              3.42            46              -2.81
                                                          (6.05)          (2.59)          (3.21)          (2.07)          (1.98)          (1.28)
                                                          [25.95,26.12]   [-6.02,-4.31]   [35.95,37.04]   [3.37,4.42]     [45.79,46.34]   [-3.40,-2.36]
+5,-5,+5          62      0.051   0.24    0.12    0.30    26              5.15            36              -3.32           46              4.36
                                                          (4.95)          (1.87)          (3.55)          (1.64)          (2.82)          (1.08)
                                                          [25.84,26.27]   [5.02,5.30]     [35.91,36.11]   [-3.46,-3.11]   [45.89,46.04]   [4.31,4.47]

Table 7.7 Average of posterior estimates (E(mode), E(sd.)) of multiple change point model parameters mk and θm3 = (τ3,i, δ3,i), i = 1, 2, 3, following signals (RL) from c-chart where λ3,0 = 20, τ3,1 = 25, τ3,2 = 35 and τ3,3 = 45. Standard deviations are shown in parentheses.

                            E(p(mk))                                θm3, i=1                      θm3, i=2                      θm3, i=3
δ3,1, δ3,2, δ3,3  E(RL)     m0        m1        m2        m3        E(τ3,1)   E(στ3,1)  E(δ3,1)   E(τ3,2)   E(στ3,2)  E(δ3,2)   E(τ3,3)   E(στ3,3)  E(δ3,3)
-5,+5,-10         52.88     0.00      0.06      0.12      0.56      26.90     2.75      -5.87     36.15     1.71      4.30      45.95     2.56      -9.87
                  (6.49)    (0.00)    (0.08)    (0.11)    (0.11)    (2.63)    (1.40)    (2.69)    (1.36)    (0.89)    (2.40)    (0.39)    (1.63)    (1.45)
-5,+5,-5          187.33    0.00      0.18      0.21      0.28      27.61     7.34      -4.65     36.35     6.32      3.31      46.95     8.91      -4.37
                  (133.80)  (0.00)    (0.03)    (0.03)    (0.02)    (3.45)    (1.92)    (3.87)    (2.01)    (2.22)    (3.31)    (4.11)    (5.32)    (3.31)
+5,-5,+5          60.05     0.01      0.10      0.16      0.40      26.29     5.27      4.41      36.01     5.61      -4.32     46.41     11.14     5.33
                  (11.08)   (0.02)    (0.07)    (0.09)    (0.07)    (2.80)    (1.65)    (4.53)    (0.61)    (3.28)    (1.68)    (3.89)    (4.94)    (1.06)
+5,-5,+10         49.23     0.00      0.06      0.17      0.48      26.15     4.07      3.73      36.03     1.80      -3.90     46.06     1.37      9.32
                  (3.59)    (0.01)    (0.09)    (0.12)    (0.10)    (2.27)    (1.43)    (4.66)    (0.84)    (0.82)    (1.61)    (0.43)    (0.71)    (1.96)


It is seen that when the magnitude of the third change increases, from one standard

deviation to two standard deviations, the Bayesian estimator distinguishes the true

model more strongly. As expected, the c-chart detects non-monotonic multiple changes

with medium shifts more quickly than those with a small shift in the third change. The

averages associated with the first change point, E(τ3,1) and E(στ3,1), reveal that

when the magnitude of the third change increases, the Bayesian estimator tends to be

more accurate and precise. This result remains consistent across estimates of the second

and the third change points where the true times of changes are also well estimated by

the posterior modes.

Table 7.7 also shows that the accuracy and the direction of bias of Bayesian estimates

for the magnitude of the changes, δ3,1, δ3,2 and δ3,3 are not consistent across differ-

ent scenarios. However, there exists some gain in studying the estimated sizes and

directions, particularly when the obtained standard deviations are also considered.

7.5 Comparison of Bayesian Estimator with Other Methods

To study the performance of the proposed Bayesian estimator in comparison with al-

ternatives, we considered Poisson EWMA and CUSUM charts and associated built-in

estimators (Page, 1954; Nishina, 1992) and the proposed MLE estimator for a step

change in Poisson processes (Samuel et al., 1998) within replications discussed in Sec-

tions 7.4.2-7.4.4.

As expected, since both EWMA and CUSUM charts are very sensitive to shifts, simula-

tion of more than one change point before signalling is unlikely. However, we considered

the application of these charts in contexts in which the monitoring process and charts

are not terminated when the chart has signalled. Woodall (2006) highlighted this cir-

cumstance as a significant characteristic of monitoring in a clinical setting, where an

out-of-control process may not be able to be stopped and root causes analysis pro-

cedures are conducted simultaneously. We chose the MLE estimator for step change

proposed by Samuel et al. (1998) because it is the only proposed MLE method that

can be applied over different change scenarios, as the developed MLE estimators for

linear trend (Perry et al., 2006) and multiple change (Perry et al., 2007) in a Poisson

mean are restricted to increasing trends and monotonic changes; see Section 7.2.
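
To make the idea of a step-change MLE concrete, the sketch below maximizes a profile log-likelihood ratio over candidate change times, assuming a known in-control rate and a single sustained shift to an unknown rate. It is a generic illustration under those assumptions, not necessarily the exact formulation of Samuel et al. (1998); the function name and the convention that the returned value is the number of in-control samples before the change are hypothetical choices.

```python
import numpy as np

def poisson_step_change_mle(x, lam0):
    """Profile-likelihood estimate of a single step change in a Poisson rate.

    x    : counts observed up to and including the signalling sample
    lam0 : known in-control Poisson rate
    Returns t, the estimated number of in-control samples before the change
    (i.e. the shifted rate is assumed to apply from sample t + 1 onwards).
    """
    x = np.asarray(x, dtype=float)
    T = len(x)
    best_t, best_ll = 0, -np.inf
    for t in range(T):
        n = T - t                 # number of samples after the candidate change
        lam1 = x[t:].mean()       # MLE of the shifted rate for this candidate
        # Log-likelihood ratio of (lam1 after t) versus (lam0 throughout):
        # sum_i [x_i * ln(lam1/lam0) - (lam1 - lam0)] with sum_i x_i = n * lam1.
        ll = -n * (lam1 - lam0)
        if lam1 > 0:
            ll += n * lam1 * np.log(lam1 / lam0)
        if ll > best_ll:
            best_t, best_ll = t, ll
    return best_t
```

Because the shifted rate is re-estimated for every candidate time, the estimator handles both increases and decreases, which is the property that motivates its use as a comparator here.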

To construct control charts, we applied the procedures of Brook and Evans (1972) and

Trevanich and Bourke (1993) for Poisson CUSUM and Poisson EWMA control charts

respectively. A Poisson CUSUM accumulates the difference between an observed value

and a reference value k through $S_i^+ = \max\{0,\, x_i - k^+ + S_{i-1}^+\}$ and $S_i^- = \max\{0,\, k^- - x_i + S_{i-1}^-\}$,

where $k^+ = (\lambda_1^+ - \lambda_0)/(\ln\lambda_1^+ - \ln\lambda_0)$ and $k^- = (\lambda_0 - \lambda_1^-)/(\ln\lambda_0 - \ln\lambda_1^-)$.

If $S_i^{\pm}$ exceeds a specified decision interval $h^{\pm}$ then the control chart signals that an

increase (a decrease) in the Poisson rate has occurred. We calibrated the charts to detect

a 25% shift in Poisson rates and to have an in-control average run length ($\widehat{ARL}_0$) of

approximately 370, close to that of a standard c-chart; see Woodall and Adams (1993). The

resultant Poisson CUSUM charts had $(k^+, h^+) = (22.4, 22)$ and $(k^-, h^-) = (17.4, 14)$.

For simplicity, the values were rounded to one decimal place.
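
The reference values and the two-sided recursion described above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the decision intervals are passed in as given rather than calibrated to a target in-control run length.

```python
import math

def cusum_reference_value(lam0, lam1):
    """Reference value k for a Poisson CUSUM designed to detect a shift
    from lam0 to lam1; the same expression serves for k+ and k-."""
    return (lam1 - lam0) / (math.log(lam1) - math.log(lam0))

def poisson_cusum(x, k_plus, h_plus, k_minus, h_minus):
    """Run a two-sided Poisson CUSUM started at zero.

    Returns (sample index, direction) at the first signal, or (None, None).
    Sample indices are 1-based.
    """
    s_plus = 0.0
    s_minus = 0.0
    for i, xi in enumerate(x, start=1):
        s_plus = max(0.0, xi - k_plus + s_plus)      # accumulates evidence of an increase
        s_minus = max(0.0, k_minus - xi + s_minus)   # accumulates evidence of a decrease
        if s_plus > h_plus:
            return i, "increase"
        if s_minus > h_minus:
            return i, "decrease"
    return None, None

# Reference values for a 25% shift from an in-control rate of 20.
k_plus = cusum_reference_value(20.0, 25.0)
k_minus = cusum_reference_value(20.0, 15.0)
```

With an in-control rate of 20, the 25% shifts to 25 and 15 give reference values that round to the 22.4 and 17.4 reported in the text.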

In a Poisson EWMA, exponentially weighted averages of the observations are obtained through $Z_i = r\,x_i + (1-r)\,Z_{i-1}$,

where $Z_0 = \lambda_0$, and plotted on a chart with $UCL = \lambda_0 + A^+\sqrt{\mathrm{Var}(Z_i)}$

and $LCL = \lambda_0 - A^-\sqrt{\mathrm{Var}(Z_i)}$. We let r = 0.1 and $A^{\pm} = 2.67$ to build a chart with an

ARL0 of 370, close to that of a standard c-chart.
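
A minimal sketch of such a chart is given below. The text does not state the variance expression; the sketch assumes the standard time-varying EWMA variance for independent in-control Poisson counts, Var(Z_i) = λ0 · r/(2 − r) · (1 − (1 − r)^{2i}), and the function name is hypothetical.

```python
import math

def poisson_ewma(x, lam0, r=0.1, a_plus=2.67, a_minus=2.67):
    """Run a two-sided Poisson EWMA chart started at Z_0 = lam0.

    Returns (sample index, direction) at the first signal, or (None, None).
    Sample indices are 1-based.
    """
    z = lam0
    for i, xi in enumerate(x, start=1):
        z = r * xi + (1.0 - r) * z
        # Exact variance of Z_i for independent counts with variance lam0
        # (an assumption of this sketch; not stated in the text).
        var_z = lam0 * r / (2.0 - r) * (1.0 - (1.0 - r) ** (2 * i))
        ucl = lam0 + a_plus * math.sqrt(var_z)
        lcl = lam0 - a_minus * math.sqrt(var_z)
        if z > ucl:
            return i, "increase"
        if z < lcl:
            return i, "decrease"
    return None, None
```

The small smoothing constant gives the chart a long memory, which is why it reacts to sustained small shifts that a c-chart would take much longer to detect.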

Table 7.8 shows the expected value of the Bayesian estimates and detected change

points provided by built-in estimators of EWMA (Nishina, 1992) and CUSUM (Page,

1954) charts and the MLE estimator (Samuel et al., 1998) for a step change in a Poisson

process. For scenarios of more than one change, the posterior estimates for the time of

the first change were considered.

Although the Bayesian estimator, τb, tends to overestimate the time of a step change,

particularly for small shifts of size δ = ±5 with a delay of three samples, it outperforms

the EWMA, RLewma, and CUSUM, RLcusum, charts as well as their built-in estimators,

τewma and τcusum, which tend to signal with larger delays and to underestimate the change

point, respectively. The exception is τcusum for shifts of size δ = ±5. In this

scenario of change, the Bayesian estimator, τb, is outperformed by the MLE estimator,

τmle, which has a delay of at most one observation. This is not surprising since the MLE

estimator was specifically designed to detect such shifts, whereas no such assumption or

limitation was made in applying the Bayesian estimator.

Table 7.8 Average of change point estimates obtained through the built-in EWMA (τewma) and CUSUM (τcusum), MLE (τmle) and Bayesian (τb, time of the first change) estimators following signals from Poisson EWMA (RLewma), Poisson CUSUM (RLcusum) and c-chart (RLc) where λk,0 = 20 and τk,1 = 25. Standard deviations are shown in parentheses.

δ            E(RLc)     E(τmle)    E(RLewma)  E(τewma)   E(RLcusum) E(τcusum)  τb
-10          32.83      25.05      27.76      21.82      28.15      23.40      26.00
             (7.23)     (0.92)     (0.95)     (5.23)     (0.79)     (2.59)     (0.93)
-5           148.92     25.13      32.31      22.32      33.18      24.27      28.25
             (126.32)   (3.74)     (3.80)     (5.53)     (3.80)     (3.69)     (2.81)
+5           45.10      26.08      32.14      23.67      33.14      25.23      27.72
             (20.21)    (4.02)     (4.19)     (4.76)     (4.52)     (3.50)     (9.25)
+10          29.18      24.99      28.04      22.35      28.35      23.40      26.01
             (3.34)     (1.64)     (1.31)     (4.69)     (1.32)     (3.31)     (1.35)
-5,-10       42.28      28.39      31.71      22.12      32.62      24.01      26.71
             (6.73)     (4.51)     (3.01)     (5.53)     (2.94)     (3.49)     (2.75)
-5,+10       38.64      33.69      32.16      24.00      32.84      25.87      25.96
             (2.97)     (4.73)     (3.65)     (7.24)     (3.42)     (5.45)     (1.94)
+5,-10       42.30      31.03      31.96      24.87      32.70      26.33      26.07
             (6.42)     (8.69)     (5.39)     (3.93)     (6.84)     (3.74)     (2.21)
+5,+10       39.30      26.62      31.98      22.76      32.61      24.55      25.20
             (2.90)     (4.78)     (3.28)     (5.30)     (3.33)     (3.71)     (1.29)
-5,+5,-10    52.88      39.68      33.02      23.85      34.12      26.71      26.90
             (6.49)     (7.23)     (5.48)     (8.12)     (5.42)     (6.91)     (2.63)
+5,-5,+10    49.23      37.91      33.18      25.77      34.39      27.50      26.15
             (3.59)     (9.10)     (5.90)     (7.93)     (6.00)     (6.75)     (2.27)

Where there exist two consecutive changes that are monotonic, the Bayesian estimator,

τb, outperforms almost all alternatives. For (δ2,1, δ2,2) = (−5,−10), although the CUSUM

built-in estimator, τcusum, reports a slightly more accurate estimate of the location of

the first change point, 24.01, its lower precision, 3.49, in comparison with

that obtained for the Bayesian estimator, 2.75, offsets this superiority. If there exist

two non-monotonic changes, the Bayesian estimator, τb, remains the most accurate and

precise estimator.

Table 7.8 reveals that the superiority of posterior modes persists for the three non-

monotonic change scenarios, particularly where the obtained standard deviations over

replications are considered. In these cases, the MLE estimator obviously failed. The

built-in estimators compete with the posterior modes, but are less precise.

In addition to accuracy and precision criteria used for the comparison study, the pos-

terior distributions for the number, the time and the magnitude of a change enable us

to construct probabilistic intervals around estimates and to make probabilistic inferences about

the number and location of change points, as discussed in Sections 7.4.2-7.4.4. This is

a significant advantage of the proposed Bayesian approach. Although similar results

may be obtained when resampling in conjunction with MLE methods, the inferential

capabilities of this approach are more limited; see Bernardo and Smith (1994) for more

details.

This approach can be easily applied to other types of data and processes, such as

Bernoulli, normal and other exponential family data. Moreover, the proposed Bayesian

estimator provides an integrated and comprehensive view of the number, magnitude and

direction of changes, which can substantially improve the efficiency of root

causes analysis efforts within the quality improvement cycle.

7.6 Conclusion

Identification of the time when a process has changed enables process engineers to pur-

sue investigation of special causes more effectively. Indeed, knowing the change point

restricts the search efforts to a tighter window of observations and related variables.

The benefits of change point analysis in conjunction with control charting have been

recognized in monitoring and quality control programs within various contexts includ-

ing chemical processes, environmental and clinical studies. In monitoring a quality

characteristic, it is likely to experience consecutive changes prior to signalling of qual-

ity control procedures. In monitoring health outcomes it is not possible to stop an

out-of-control process; therefore such multiple change point patterns are expected.

To tackle this, in this paper we modeled the multiple change point detection for a Pois-

son process in a Bayesian framework. We constructed Bayesian hierarchical models and

derived posterior distributions for change point estimates using RJMCMC. We considered

three scenarios of changes: a step change, and two and three consecutive changes,

both monotonic and non-monotonic. Through simulation we investigated

the performance of the Bayesian estimator when it is used in conjunction with

a c-chart. The results showed that the Bayesian estimates outperform the Shewhart


control chart in change detection over different scenarios of the number and direction

of changes.

We then compared the Bayesian estimator with built-in estimators of EWMA and

CUSUM and MLE based estimators. The Bayesian estimator performs reasonably well

and remains a strong alternative. It becomes the superior estimator when one also considers

the relaxation of setting assumptions, incorporation of prior knowledge, flexibility of models,

ease of extension to more complicated change scenarios such as combinations of steps

and linear and nonlinear trends, and relief from analytic calculation of the likelihood function,

particularly for non-tractable likelihood functions.


advantage of building on control charts that may be already in place in practice. An

alternative may be to retain the two-step approach but to use a Bayesian framework in

both stages. There is now a substantial literature on Bayesian formulation of control

charts and extensions such as monitoring processes with varying parameters (Feltz and

Shiau, 2001), over-dispersed data (Bayarri and García-Donato, 2005), and start-up and

short runs (Tsiamyrtzis and Hawkins, 2005, 2008). A further alternative is to consider

a fully Bayesian, one-step approach, in which both the monitoring of the in-control

process and the retrospective or prospective identification of changes is undertaken in

the one analysis. This is the subject of further research.

Bibliography

Abu-Taleb, A. A., Alawneh, A. J., and Smadi, M. M. (2007). Statistical analysis of

recent changes in relative humidity in Jordan. Environmental Sciences, 3(2):75–77.

Assareh, H., Smith, I., and Mengersen, K. (2011). Bayesian change point detec-

tion in monitoring cardiac surgery outcomes. Quality Management in Health Care,

20(3):207–222.



Benneyan, J. C. (1998). Statistical quality control methods in infection control and









Brooks, S. P. (1998). Markov chain Monte Carlo method and its application. Journal

of the Royal Statistical Society. Series D (The Statistician), 47(1):69–100.

Brown, P., Mazzone, P., Oliviero, A., Altibrandi, M. G., Pilato, F., Tonali, P. A., and

Di Lazzaro, V. (2004). Effects of stimulation of the subthalamic area on oscillatory

pallidal activity in Parkinson’s disease. Experimental Neurology, 188(2):480–490.

Carlin, B. and Louis, T. (2000). Empirical Bayes: past, present and future. Journal of

the American Statistical Association, 95(452):1286–1289.

Duarte, B. and Saraiva, P. (2003). Change point detection for quality monitoring of

chemical processes. Computer Aided Chemical Engineering, 14:401–406.



17(2):119–124.



Washington.


Chapman & Hall/CRC.







20(4):404–413.










Perry, M., Pignatiello, J., and Simpson, J. (2007). Change point estimation for mono-


search, 45(8):1791–1813.




nometrics, 1(3):239–250.





Taylor, W. (2000). Change-point analysis: a powerful new tool for detecting changes.

http://www.variation.com/cpa/tech/changepoint.html.







24(6):721–735.









Zhao, X. and Chu, P. S. (2010). Bayesian change-point analysis for extreme events (ty-

phoons, heavy rainfall, and heat waves): a RJMCMC approach. Journal of Climate,

23(5):1034–1046.

CHAPTER 8

Bayesian Change Point Detection in Monitoring

Cardiac Surgery Outcomes






of quality control. Accurate estimation of the time of change can help in the search

for a potential cause more efficiently as a tighter time-frame prior to the signal in the

control charts is investigated.

This study aims to illustrate how well change point estimation fits in quality improve-

ment efforts in a clinical setting. To this end, in a retrospective manner, incidence of

return to theater for excessive bleeding and excess blood product usage, defined as use

of more than 10 units of blood products within the first 24 hours post surgery, for each

patient undergoing Coronary Artery Bypass Graft (CABG) surgery and the 12 month

major adverse cardiac event outcome rate of patients undergoing Percutaneous Trans-

luminal Coronary Angioplasty (PTCA) were monitored using Bernoulli EWMA and

CUSUM control charts at a local hospital. Following the control charts’ signals, the times

of changes were estimated using a Bayesian approach, a variation of models proposed

and evaluated in Chapters 6 and 7. The observed coincidence of obtained estimates

for time of changes and the timing of known potential causes supported change point

investigation. This study also revisited the capabilities of the Bayesian approach in

modeling change point in a clinical setting and construction of probabilistic inferences.

The focus of this chapter is on the second objective of the thesis, mainly goals 1 and 3, in

which monitoring attributes data and facilitation of root cause analysis through change

point estimation is sought. This chapter contributes to application as well as method.

In this study, the concept of change point estimation from the industrial context is

adapted and applied in healthcare. Meanwhile, using a Bayesian framework and

computational components, a change point model is designed to estimate the time of one

and two step changes prior to Bernoulli control charts’ signals.












Assareh, H., Smith, I. and Mengersen, K. (2011) Bayesian change point detection in

monitoring cardiac surgery outcomes, Quality Management in Health Care, 20(3): 207-222.




I. Smith: Supplied data, assisted with discussion, comments on manuscript, editing






8.1 Abstract

Precise identification of the time when a clinical process has changed, following a control chart’s

signal, enables clinicians to search for a potential special cause more effectively. In

this paper, we develop a change point estimation method for Bernoulli processes in a

Bayesian framework. We apply Bayesian hierarchical models to formulate the change

point model and Markov Chain Monte Carlo to obtain posterior distributions of the

change point parameters. The performance of the Bayesian estimator is investigated

through applications on clinical data. We monitor outcomes of cardiac surgery and

angioplasty procedures using Bernoulli EWMA and CUSUM control charts. We then

identify the time of changes prior to signals obtained from the charts. Study of the known

potential causes of changes in the outcomes reveals that estimated change points and

shifts in the known causes are coincident.

8.2 Introduction

A control chart monitors behavior of a process over time by taking into account the

stability and dispersion. The chart signals when a significant change has occurred due to

the existence of assignable causes. This signal is investigated using root causes analysis

to identify potential causes of the change and then corrective or preventive actions are

conducted. Following this cycle leads to variation reduction and process stabilization

(Montgomery, 2008). The achievements obtained by industrial and business sectors

via the implementation of a quality improvement cycle including quality control charts

and root causes analysis have motivated other sectors such as healthcare to consider

those tools and apply them as an essential part of the monitoring process in order to

improve the quality of healthcare delivery.

The need for modification of the tools according to health sector characteristics such

as emphasis on monitoring individuals and patient mix was raised by quality control

experts and clinicians. In this regard, risk adjustment control charts have been devel-

oped and applied within medical contexts; see Steiner and Cook (2000), Cook (2004)

and Grigg and Spiegelhalter (2007) for more details. However, there still exists a lack


of communication and knowledge transfer among experts between health sectors and

industrial and business sectors. Consideration of identified needs and how they are

being satisfied in each sector can accelerate other sectors in their own research and

development of effective quality improvement tools (Woodall, 2006; Woodall et al.,

2010).

The need to know the time at which a process began to vary, the change point, has

recently been raised and discussed in the industrial context of quality control. Accurate

detection of the time of change can help in the search for a potential cause more

efficiently as a tighter time-frame prior to the signal in the control charts is investigated.

A built-in change point estimator in CUSUM charts suggested by Page (1954, 1961) and

also an equivalent estimator in EWMA charts proposed by Nishina (1992) are two early

change point estimators which can be applied for all discrete and continuous distributions

underlying the charts. However, they do not provide any statistical inferences on the

obtained estimates. Samuel and Pignatiello (2001) developed and applied a maximum

likelihood estimator (MLE) for the change point in a process fraction nonconformity

monitored by a p-chart, assuming that the change type is a step change. They showed

how closely this new estimator detects the change point in comparison with the usual

p-chart signal. Subsequently, Perry and Pignatiello (2005) compared the performance

of the derived MLE estimator with EWMA and CUSUM charts. These authors also

constructed a confidence set based on the estimated change point which covers the

true process change point with a given level of certainty using a likelihood function

based on the method proposed by Box and Cox (1964). This approach was extended

to other probability distributions as well as change type scenarios. In the case of a very

low fraction non-conforming, Noorossana et al. (2009) derived and analyzed the MLE

estimator of a step change based on the geometric distribution control charts discussed

by Xie et al. (2002).

An interesting approach which has recently been considered in the SPC context is

Bayesian hierarchical modelling (BHM) using, where necessary, computational methods




of inferences based on posterior distributions for the time and the magnitude of a change

as well as assessing the validity of underlying assumptions in the change point model

itself (Gelman et al., 2004).

In this paper we consider the problem of change point estimation in process monitoring

in a clinical setting. The two processes of interest are Coronary Artery Bypass Graft

(CABG) surgery and Percutaneous Transluminal Coronary Angioplasty (PTCA) at St

Andrew’s War Memorial Hospital (SAWMH), Brisbane, Australia. In the former pro-

cess the incidence of return to theater for excessive bleeding and excess blood product

usage, defined as use of more than 10 units of blood products within the first 24 hours

post surgery, for each patient undergoing CABG surgery are of interest. For the latter

process, we are interested in monitoring the 12 month major adverse cardiac event

outcome rate of patients undergoing PTCA.

To monitor the processes, two control charts, a Bernoulli CUSUM and a Bernoulli

EWMA, are applied. We then construct the Bayesian estimators for Bernoulli obser-

vation data. The change points prior to signals of the control charts are identified and

investigated for the two datasets. In Section 8.3 we describe the problem of monitoring

and change point detection of CABG data and in Section 8.4 the angioplasty data are

investigated. We then summarize and discuss the implication of the methodology and

its application in Section 8.5.

8.3 Cardiac Surgery Data

8.3.1 Data Description

This analysis involved the review of prospectively collected data acquired as part of an

ongoing quality monitoring program conducted by the cardiac surgical unit of SAWMH.

Ethical approval was gained to undertake collection of these data. In total, records re-

lating to 1971 consecutive isolated CABG procedures performed in the period from

December 2002 to January 2010 were available for analysis. All procedures were per-

formed by seven experienced cardiac surgeons. Details recorded for each procedure

included patient demographic and preoperative co-morbidity details, comprehensive

procedural details and post procedural outcomes including major adverse events dur-

ing the term of the admission.

8.3.2 Process Monitoring

Excessive post-operative bleeding following CABG surgery can be physiological and/or

technical in origin. Clinical practice in these cases is first to administer blood products

(whole blood, platelet concentrates, or fresh frozen plasma) to replace lost blood while

the natural clotting mechanism has time to deal with the bleeding. However, if blood

loss continues, the patient may be returned to the operating theatre to check for tech-

nical problems with the suture lines. As both actions carry increased risk to the patient

(infection risk and operative complications), monitoring the rate of patients requiring

intervention to deal with excessive bleeding is of interest in ensuring the continued

quality of a cardiac surgical service.

To cover the two treatment mechanisms, both excess blood product

usage (patients requiring in excess of 10 units of blood products in the first 24 hours

after surgery) and re-operation for excess bleeding are monitored as variables of interest. Figure 8.1

shows the rate of high blood product use and re-operations for the 1971 patients in the

period from December 2002 to January 2010.

Figure 8.1 Exponentially weighted moving average graphs (with smoothing constant of 0.01) tracking the incidence of patients returning to theatre for re-operation for bleeding related issues and cases requiring excess blood product utilisation (>10 units) in the first 24 hours post CABG surgery. Data are drawn from cardiac surgical procedures performed at SAWMH in the period 2002-2010.


Although it has been shown that the inclusion of risk adjustment in control charts mon-

itoring clinical outcomes has the potential to improve their performance (by accounting

for known sources of variation), in the case of blood product use and/or excess bleeding

there is no recognised risk-adjustment algorithm; see Steiner and Cook (2000), Cook

(2004) and Grigg and Spiegelhalter (2007) for more details. Clinical performance of the

cardiac surgical service is, however, subject to a formalised morbidity and mortality

review process and this did not identify any significant variation in either case mix

or the underlying risk factors that would normally be expected to be associated with

variations in outcome such as process of care, patient age, sex, case complexity, etc.

For the ith patient, we observe $(y_{R_i}, y_{B_i})$ where $y_{R_i}, y_{B_i} \in \{0, 1\}$. This leads to a

dataset of Bernoulli data. It enables us to monitor the rates for each patient instead of

monitoring grouped data in which the detection of the change is postponed until

n > 1 patients are observed (Reynolds and Stoumbos, 1999). In this setting, we assume

$y_{R_i} \sim \mathrm{Bernoulli}(p_R)$ and $y_{B_i} \sim \mathrm{Bernoulli}(p_B)$.

To monitor the probability of an event, referred to hereafter as the “rate”, based on

Bernoulli data, we considered two well-established control chart procedures, Bernoulli

CUSUM (Steiner and Cook, 2000; Page, 1954) and Bernoulli EWMA (Somerville et al.,

2002). Alternatives for monitoring Bernoulli data based on counting observations between

two events, in which the observations are assumed to have a geometric distribution,

may also be of interest. This approach was applied to Shewhart (Xie et al., 2002; Goh,

1987; Benneyan, 2001), CUSUM (Bourke, 1991; Chang and Gan, 2001) and EWMA

(Yeh et al., 2008) control charts. However they were found inappropriate if the rate is

not low and the detection of a decrease in the rate is also of interest (Yeh et al., 2008).

The disadvantage of geometric distribution based charting is that the observation is

not plotted on the chart until an event occurs. This may cause delay in the change

detection and ineffectiveness of root causes analysis particularly when the detection

of a decrease in event rate is of interest. In contrast, in Bernoulli based charts the

observations are plotted as soon as they have been observed.

A Bernoulli CUSUM monitors an in-control rate, $p_0$ say, using a CUSUM score $W_i$ through $X_i^{+} = \max\{0, X_{i-1}^{+} + W_i^{+}\}$ and $X_i^{-} = \min\{0, X_{i-1}^{-} - W_i^{-}\}$, where

$$
W_i^{\pm} =
\begin{cases}
\ln\left((1 - p_1^{\pm})/(1 - p_0)\right) & \text{if } y_i = 0, \\
\ln\left(p_1^{\pm}/p_0\right) & \text{if } y_i = 1,
\end{cases}
\qquad (8.1)
$$

and $p_1^{+}$ and $p_1^{-}$ are an increased and a decreased rate, respectively, that the chart is designed to detect. If $X_i^{+}$ ($X_i^{-}$) exceeds a specified decision threshold $h^{+}$ ($h^{-}$), then the control chart signals that an increase (a decrease) in the Bernoulli rate has occurred.
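The recursion in (8.1) can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in this study (the charts here were built in R); the function names, the odds multipliers passed as arguments and the example thresholds are assumptions made for the sketch.

```python
import math


def shifted_rate(p0, odds_multiplier):
    """Rate whose odds equal odds_multiplier times the in-control odds."""
    odds = odds_multiplier * p0 / (1.0 - p0)
    return odds / (1.0 + odds)


def cusum_scores(p0, p1):
    """Scores from (8.1), returned as (score for y = 0, score for y = 1)."""
    return math.log((1.0 - p1) / (1.0 - p0)), math.log(p1 / p0)


def bernoulli_cusum(y, p0, h_plus, h_minus, odds_up=2.0, odds_down=0.5):
    """Run the two-sided chart from X0 = 0; return both paths and signal times."""
    w_up = cusum_scores(p0, shifted_rate(p0, odds_up))
    w_down = cusum_scores(p0, shifted_rate(p0, odds_down))
    x_plus, x_minus = 0.0, 0.0
    upper, lower, signals = [], [], []
    for i, yi in enumerate(y, start=1):
        x_plus = max(0.0, x_plus + w_up[yi])
        x_minus = min(0.0, x_minus - w_down[yi])
        upper.append(x_plus)
        lower.append(x_minus)
        # The lower threshold is applied as a negative value.
        if x_plus > h_plus or x_minus < -h_minus:
            signals.append(i)
    return upper, lower, signals


# Illustrative values only: p0 = 0.02 and thresholds of 3 are assumptions.
upper, lower, signals = bernoulli_cusum([1] * 6, 0.02, 3.0, 3.0)
print(signals)
```

With these assumed values, six consecutive events push the upper statistic over its threshold at the fifth observation, while the lower statistic stays at zero.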

As shown in Figure 8.1, the rates of re-operation and use of blood products seem

to be relatively stable for the 568 patients undergoing CABG during 2004 and 2005.

The associated rates, p0R = 0.021 and p0B = 0.018, of this segment were therefore

considered as the in-control rates for the chart construction. The event rates for the

subsequent 1072 patients were monitored by control charts using these rates.

We constructed the CUSUM chart to detect a doubling and a halving of the odds ratio, $\frac{p_0/(1-p_0)}{p_1/(1-p_1)} = \{0.5, 2\}$, in the in-control rates (0.021, 0.018) and to have an in-control average run length ($\widehat{ARL}_0$) of approximately five years (1500 procedures). This setting, with the chart initialized at zero, $X_0^{\pm} = 0$, led to decision intervals of $h_R^{\pm} = (3.37, 2.87)$ and $h_B^{\pm} = (3.22, 2.68)$ for re-operation and blood products, respectively. As two-sided charts were considered, the negative values of $h^{-}$ were used. The associated CUSUM scores were $W_{Ri}^{\pm} = (-0.027, 0.014)$ and $W_{Ri}^{\pm} = (0.665, -0.678)$ where $y_i$ is 0 and 1, respectively, for the re-operation variable, and $W_{Bi}^{\pm} = (-0.020, 0.010)$ and $W_{Bi}^{\pm} = (0.672, -0.682)$ where $y_i$ is 0 and 1, respectively, for the blood products variable.

In a Bernoulli EWMA, exponentially weighted averages of the observations are obtained through $Z_i = \lambda y_i + (1 - \lambda) Z_{i-1}$, where $Z_0 = p_0$, and plotted in a chart with $UCL = p_0 + A^{+}\sigma_Z$ and $LCL = p_0 - A^{-}\sigma_Z$, where $\sigma_Z^2 = p_0(1 - p_0)\,\frac{\lambda}{2 - \lambda}$. We set $\lambda = 0.05$ since the in-control rates were low ($p_{0R} = 0.021$, $p_{0B} = 0.018$); see Somerville et al. (2002) for more details. $A^{\pm}$ were calibrated so that the same in-control average run length ($\widehat{ARL}_0$) as the Bernoulli CUSUM was obtained. The resultant charts had $A_R^{\pm} = 4.15$ and $A_B^{\pm} = 4.25$. A negative lower control limit in the Bernoulli EWMA was replaced by zero.
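The EWMA statistic and its control limits can be sketched as follows. This is a minimal illustration in Python, not the R code used for the charts in this chapter; the function name and the values of $p_0$ and $A^{\pm}$ in the example are assumptions chosen only to show the mechanics.

```python
import math


def bernoulli_ewma(y, p0, lam=0.05, a_plus=4.0, a_minus=4.0):
    """Return the EWMA path, the (LCL, UCL) pair and the signal times."""
    sigma_z = math.sqrt(p0 * (1.0 - p0) * lam / (2.0 - lam))
    ucl = p0 + a_plus * sigma_z
    # A negative lower control limit is replaced by zero.
    lcl = max(0.0, p0 - a_minus * sigma_z)
    z = p0  # Z0 = p0
    path, signals = [], []
    for i, yi in enumerate(y, start=1):
        z = lam * yi + (1.0 - lam) * z
        path.append(z)
        if z > ucl or z < lcl:
            signals.append(i)
    return path, (lcl, ucl), signals


# Illustrative values only: p0 = 0.02 and A = 4.5 are assumptions.
path, limits, signals = bernoulli_ewma([1, 1, 1], 0.02, lam=0.05,
                                       a_plus=4.5, a_minus=4.5)
print(limits, signals)
```

Because the statistic is a weighted average of values in {0, 1} and $p_0$, it never falls below zero, so a lower limit truncated at zero cannot produce a signal for a decrease.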

We constructed the charts in R (http://www.r-project.org). The obtained


Bernoulli CUSUM and EWMA control charts are shown in Figure 8.2. According to

the CUSUM chart (Figure 8.2-a1) the rate of re-operation among patients who had

undergone CABG surgery was in-control; however the EWMA signalled (Figure 8.2-

a2) at the 32nd patient as an increase in the rate was detected. This signal was held for

the next two patients and then the re-operation rate returned to the in-control state.

The behavior of the rate of excess blood product use seemed different between the two

charts. The Bernoulli CUSUM chart (Figure 8.2-b1) first signalled at the 61st and then

at the 71st observations and remained out-of-control over the next 34 patients. This was

followed by two signals for the 128th-144th and 158th-183rd patients. Later, a signal

was identified at the 529th patient. This signal was extended to the rest of observations

as a long-term increase in the rate was detected by the CUSUM chart. Although the

CUSUM never returned to the in-control state, the Bernoulli EWMA chart (Figure 8.2-

b2) detected 12 short-term signals followed by in-control periods. The signal periods

were 14-15, 32-36, 41-45, 531-564, 721-731, 789-790, 800-801, 812-813, 982-983, 990-

1001, 1052-1056 and 1058-1064, each indicating that an increase in the rate occurred.

The shortest signal period contained two observations, whereas the longest included 34

observations. As seen in Figure 8.2-b2, three quarters of the signals were detected in

the second half of the observations in the Bernoulli EWMA chart, beginning with the

531st patient. This was close to the patient detected by the Bernoulli CUSUM chart

at the start of the long-term increase in the rate of the blood products (number 529).

8.3.3 Change Point Detection








Figure 8.2 Bernoulli CUSUM (a1, b1) and EWMA (a2, b2) control charts for the re-operation (a1-2) and the use of blood products (b1-2) variables over 1072 patients who underwent CABG surgery during 2006-2010.



This structure is expandable to multiple levels in a hierarchical fashion, giving so-called Bayesian hierarchical models (BHM), which enrich the model by capturing all kinds of uncertainty in the observed data as well as in the priors. In complicated BHMs it is not easy to obtain the posterior distribution analytically. This analytic bottleneck has been eliminated by the emergence of Markov chain Monte Carlo (MCMC) methods. In MCMC algorithms a Markov chain, loosely a random walk through the parameter space, is constructed whose stationary distribution is the posterior distribution of the parameters. Samples generated from a long run of the Markov chain using a proposal transition density are treated as draws from the posterior distributions of interest. Some common MCMC methods for drawing samples include Metropolis-Hastings and the Gibbs sampler; see Gelman et al. (2004) for more details.

Consider a Bernoulli process $y_i$, $i = 1, \ldots, T$, that is initially in-control, with independent observations coming from a Bernoulli distribution with a known rate $p_0$. At an unknown point in time, $\tau$, the Bernoulli rate parameter changes from its in-control state of $p_0$ to $p_1$, $p_1 = p_0 + \delta$ and $p_1 \neq p_0$. The Bernoulli process step change model can thus be parameterized as follows:

$$
p(y_i \mid p_i) =
\begin{cases}
p_0^{y_i}(1 - p_0)^{1 - y_i} & \text{if } i = 1, 2, \ldots, \tau, \\
p_1^{y_i}(1 - p_1)^{1 - y_i} & \text{if } i = \tau + 1, \ldots, T.
\end{cases}
\qquad (8.3)
$$

Assume that the process $y_i$ is monitored by a control chart that signals at time $T$. We assign a normal distribution with mean 0 and standard deviation $6 \times \sqrt{p_0(1 - p_0)}$ as a prior distribution for $\delta$. This normal prior is truncated to the interval $(-p_0, 1 - p_0)$ so that $p_1 = p_0 + \delta$ remains between 0 and 1. This is a reasonably diffuse prior for the magnitude of the change in an in-control Bernoulli rate, as the control chart is sensitive enough to detect very large shifts and estimate associated change points. See Gelman et al. (2004) for more details on selection of prior distributions. We place a uniform distribution on the range $(1, T)$ as a prior for $\tau$, the time of the step change in the in-control rate.

To run the model and obtain posterior distributions of the time and the magnitude of the changes following signals from the charts, we used the R2WinBUGS interface (Sturtz et al., 2005) to generate 100,000 samples through MCMC iterations in WinBUGS (Spiegelhalter et al., 2003) for all signals, with the first 20,000 samples discarded as burn-in. We then analyzed the results using the CODA package in R (Plummer et al., 2010). See Appendix for the step change model code in WinBUGS. It should be noted that the posteriors can also be obtained analytically.
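As an illustration of how the posterior of the step change model can be evaluated without MCMC, the sketch below computes it on a grid. It is a minimal Python approximation under stated assumptions, not the WinBUGS implementation used in this study: the change point is treated as a discrete value in 1, ..., T-1 with equal prior weight, the truncated normal prior for δ is evaluated on an evenly spaced grid, and the function name and grid size are hypothetical.

```python
import math


def step_change_posterior(y, p0, n_delta=199):
    """Grid approximation to the joint posterior of (tau, delta) under (8.3)."""
    T = len(y)
    sd = 6.0 * math.sqrt(p0 * (1.0 - p0))  # prior standard deviation of delta
    # Interior grid over the truncated support (-p0, 1 - p0), so 0 < p0 + delta < 1.
    deltas = [-p0 + (k + 1) / (n_delta + 1) for k in range(n_delta)]
    log_prior = [-0.5 * (d / sd) ** 2 for d in deltas]
    total_events = sum(y)
    log_post = []  # one row per tau = 1, ..., T-1; one column per delta value
    events_before = 0
    for tau in range(1, T):
        events_before += y[tau - 1]
        events_after = total_events - events_before
        n_after = T - tau
        base = (events_before * math.log(p0)
                + (tau - events_before) * math.log(1.0 - p0))
        row = []
        for d, lp in zip(deltas, log_prior):
            p1 = p0 + d
            row.append(base + events_after * math.log(p1)
                       + (n_after - events_after) * math.log(1.0 - p1) + lp)
        log_post.append(row)
    # Normalise on the log scale to avoid underflow.
    peak = max(max(row) for row in log_post)
    weights = [[math.exp(v - peak) for v in row] for row in log_post]
    norm = sum(sum(row) for row in weights)
    tau_marginal = [sum(row) / norm for row in weights]
    delta_marginal = [sum(row[j] for row in weights) / norm
                      for j in range(n_delta)]
    return tau_marginal, deltas, delta_marginal


# Synthetic example: 30 non-events followed by 10 events.
y = [0] * 30 + [1] * 10
tau_m, deltas, delta_m = step_change_posterior(y, 0.02)
print("posterior mode of tau:", tau_m.index(max(tau_m)) + 1)
```

The marginal distributions returned by the function play the same role as the MCMC output: their modes give point estimates of τ and δ, and intervals can be read from their cumulative sums.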

As shown in Figure 8.2, the Bernoulli EWMA chart detected an early out-of-control state in the rate of re-operation for patients who had undergone CABG surgery, at around the 32nd observation, where the CUSUM did not signal. The posterior distributions of the time, τ, and the magnitude, δ, of the detected change are shown in Figure 8.3. As seen in Figure 8.3-1, the distribution of τ is bimodal, concentrating on the 11th and the 28th patients. Having two modes in the obtained posterior distribution suggests that there were two step changes, and consequently two change points, in this subset of the observations. Figure 8.3-2 shows that the resultant posterior distribution of δ is unimodal with 0.14 as the mode and 0.24 as the mean. This distribution may also be a mixture of two (slightly) different distributions for the two changes.

Figure 8.3 Posterior distributions of the time τ (1) and the magnitude δ (2) of the change in the rate of re-operation detected by the Bernoulli EWMA control chart at the 32nd patient who underwent CABG surgery.

To investigate the cases with two changes, we developed a multiple change point model. In this scenario, we assume that at an unknown point in time, $\tau_1$, the Bernoulli rate changes from its in-control state of $p_0$ to $p_1$, $p_1 = p_0 + \delta_1$ and $p_1 \neq p_0$. For a period of time, the process follows the parameter $p_1$ and then at an unknown point in time, $\tau_2$, it changes to $p_2$, $p_2 = p_0 + \delta_2$ and $p_2 \neq p_1 \neq p_0$. Similar to the step change model, we used a normal distribution with mean 0 and standard deviation $6 \times \sqrt{p_0(1 - p_0)}$ for $\delta_1$ and $\delta_2$, and uniform distributions on the ranges $(1, \tau_2)$ and $(\tau_1, T)$ for $\tau_1$ and $\tau_2$, respectively, as prior distributions. See Appendix for the multiple change model code in WinBUGS.
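The likelihood behind the multiple change point model can be written down directly. The Python sketch below is an illustration only: it evaluates the log-likelihood for given change points and shift sizes and, as a simplification that is not part of the model in this chapter, searches exhaustively for the best pair of change points with the shift sizes held fixed. In the full model the shift sizes carry priors and the posterior is explored by MCMC. The function names and the synthetic data are hypothetical.

```python
import math


def two_change_loglik(y, p0, tau1, tau2, delta1, delta2):
    """Log-likelihood with rate p0 up to tau1, p0 + delta1 up to tau2,
    and p0 + delta2 afterwards."""
    total = 0.0
    for i, yi in enumerate(y, start=1):
        if i <= tau1:
            p = p0
        elif i <= tau2:
            p = p0 + delta1
        else:
            p = p0 + delta2
        total += math.log(p) if yi == 1 else math.log(1.0 - p)
    return total


def best_change_points(y, p0, delta1, delta2):
    """Exhaustive search over 1 <= tau1 < tau2 < T for fixed shift sizes."""
    T = len(y)
    candidates = ((t1, t2) for t1 in range(1, T - 1)
                  for t2 in range(t1 + 1, T))
    return max(candidates,
               key=lambda c: two_change_loglik(y, p0, c[0], c[1],
                                               delta1, delta2))


# Synthetic data: 20 non-events, 10 alternating outcomes, then 10 events.
y = [0] * 20 + [1, 0] * 5 + [1] * 10
print(best_change_points(y, 0.02, 0.48, 0.93))
```

Fixing the shift sizes keeps the example small; replacing the search with a sum over a grid of shift sizes, weighted by the priors, would give a grid posterior for the two change points.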

Table 8.1 Posterior distributions (mode, sd.) and credible intervals (CI) of the change point parameters τ and δ following signals from the Bernoulli CUSUM (h± = (3.37, 2.87) and h± = (3.22, 2.68)) and EWMA (λ = 0.05, A± = 4.15 and A± = 4.25) charts on the rate of re-operation and the use of blood products over 1072 patients who underwent CABG surgery during 2006-2010. Standard deviations are shown in parentheses; 50% credible intervals are shown in single brackets and 80% credible intervals in double brackets.

Variable      Chart  Signal       Model     τ1       δ1       τ2       δ2       CI for τ1         CI for τ2
Re-operation  EWMA   32           Multiple  11.1     0.106    28.6     0.131    [7.3,12]          [26.2,31.9]
                                            (6.57)   (0.17)   (7.42)   (0.21)   [[1,12.7]]        [[15.3,31.9]]
Blood Prod.   CUSUM  61,71        Step      11.3     0.089    -        -        [6.6,12]          -
                                            (12.6)   (0.09)   -        -        [[1,13.2]]        -
Blood Prod.   CUSUM  128          Step      127      0.131    -        -        [126.4,128]       -
                                            (4.4)    (0.25)   -        -        [[122.7,128]]     -
Blood Prod.   CUSUM  158          Step      157      0.406    -        -        [156.5,157.9]     -
                                            (2.81)   (0.25)   -        -        [[154.9,158]]     -
Blood Prod.   CUSUM  529          Multiple  480.1    0.008    528.1    0.056    [430.7,482]       [502.8,529]
                                            (80.7)   (0.11)   (42.9)   (0.20)   [[322.8,505]]     [[467.2,529]]
Blood Prod.   EWMA   14           Step      11.0     0.341    -        -        [10.5,12]         -
                                            (2.3)    (0.20)   -        -        [[7.8,12]]        -
Blood Prod.   EWMA   32,41        Step*     28.0     0.30     -        -        [27,28.9]         -
                                            (2.9)    (0.19)   -        -        [[23.9,29]]       -
Blood Prod.   EWMA   71           Multiple  60.0     0.047    70.2     0.129    [54.6,61]         [67.7,71]
                                            (5.8)    (0.18)   (5.1)    (0.23)   [[46.3,61]]       [[61.2,71]]
Blood Prod.   EWMA   531          Multiple  480.2    0.009    528.0    0.073    [435.5,485.9]     [521,531]
                                            (99.7)   (0.1)    (34.5)   (0.22)   [[326.6,518.9]]   [[481,531]]
Blood Prod.   EWMA   721          Step      709.1    0.122    -        -        [703,711]         -
                                            (18.6)   (0.14)   -        -        [[686.3,716]]     -
Blood Prod.   EWMA   789,800,812  Step*     758.2    0.094    -        -        [750.6,762]       -
                                            (13.1)   (0.14)   -        -        [[745.9,781]]     -
Blood Prod.   EWMA   982,990      Step      969.1    0.037    -        -        [964.3,971]       -
                                            (36.7)   (0.14)   -        -        [[942.2,982]]     -
Blood Prod.   EWMA   1052         Step      1035     0.25     -        -        [1043,1045]       -
                                            (4.9)    (0.15)   -        -        [[1040,1046]]     -

Table 8.1, row one, shows the resultant estimates for the multiple change point model following the first signal obtained by the Bernoulli EWMA chart. Change point analysis was carried out for all signals provided by the Bernoulli CUSUM and EWMA charts that contained at least ten in-control observations prior to the signal. The shorter subsets were merged with the preceding signals. The multiple change point model was applied where it was appropriate. Since the posteriors tended to be asymmetric and skewed, the mode of the posteriors was used. These are reported in Table 8.1 as estimates of the change point parameters (τ, δ). Applying the Bayesian framework enables us to construct probability based intervals for these parameters. A credible interval (CI) is an interval that contains the values of highest posterior probability density of the distribution of the parameter of interest. Table 8.1 also presents 50% and 80% credible intervals for the estimated time of the changes, one or multiple, for all signals.
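A highest posterior density interval of the kind described above can be approximated from posterior draws by taking the shortest window that contains the required share of the sorted samples. The sketch below is a generic Python illustration, not the CODA routine used in this study; the function name and the sample values are hypothetical, and the shortest-window rule is only appropriate when the posterior is unimodal.

```python
import math


def hpd_interval(samples, mass=0.5):
    """Shortest interval containing at least `mass` of the sorted draws."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of draws the interval must cover
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


# Hypothetical draws of a change point, skewed towards the signal time.
draws = [1, 2, 3, 10, 11, 11.5, 12, 12, 12.2, 30]
print(hpd_interval(draws, mass=0.5))
```

For a bimodal posterior, such as the one obtained for τ after the first EWMA signal, the highest density region may consist of two separate intervals, which a single shortest window cannot represent.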

The multiple change point model identified two increases of sizes 0.106 and 0.131 at the 11th and the 28th patients, respectively, within the first 32 observations. This result highlights the inability of EWMA control charts to detect a change in very early observations, a weakness that Fast Initial Response (FIR) features are intended to address. Several such techniques have been proposed and investigated; see Steiner (1999) and Knoth (2005) for more details.

As seen in Table 8.1, the mode of the posterior distribution obtained from the first


signal of the CUSUM chart on the blood products (61) reports the 11th patient as the

time of the change which is identical to the time provided through the first signal of the

EWMA. This case also addresses the FIR problem in CUSUM charts; see Montgomery


According to Table 8.1, the estimates of the time of the changes prior to the signals in the CUSUM chart at the 128th and the 158th patients suggest that the changes occurred at the last observed patient and that the chart detected the shift immediately, whereas for the signals at the 61st and the 529th (first change) patients they indicate that the chart detected the changes with a long delay. Comparing the estimates of the magnitude of the change in conjunction with the time is the key point here. The relatively large changes for the former signals and the small changes for the latter imply that a larger change is detected more quickly by the chart, so the change point model tends to place the change point at the signal. In contrast, if a small shift occurs in the process, the chart detects the change with delay, but the change point model tends to identify the real time.

The multiple change point model was found appropriate for one of five signals obtained from the Bernoulli CUSUM chart on the blood products. The modes indicated that two consecutive changes occurred at patients number 480 and 528, prior to the associated signal at the 529th patient. This signal was also detected by the EWMA chart with a delay of two patients. Although different subsets of observations were used, the same change point model was found appropriate and almost the same estimates were obtained for the τs. The minor differences can be seen in the associated standard deviations and δs.

The multiple change point model also identified two changes prior to the signal at the

patient number 71 in the EWMA chart. In this case, the signal and 10 observations

prior to the signal were obtained. For signals highlighted by an asterisk in Table 8.1, 32

and 789, although bimodal posterior distributions were obtained for τ , a step change

point was reported since the modes are very close.

Figure 8.4 Exponentially weighted moving average graph (with smoothing constant of 0.01) for rates of patients for whom Aprotinin was used in CABG surgery during 2006-2010 at SAWMH.

Incorporating the obtained change points with the signals of the Bernoulli CUSUM and EWMA charts shifts the focus of experts’ efforts in root causes analysis from a

biased time frame to a time closer to when the changes really occurred in the rate of

interest. It also reveals changes ignored by other charting methods. To investigate this,

we focus on some signals and associated change point estimates. We then compare the

results with changes that occurred in the use of the drug Aprotinin (Trasylol, Bayer),

the known potential cause. Aprotinin is used during complex surgical procedures, such

as CABG surgery, to reduce bleeding.

As seen in Table 8.1, an early increase in the rate of blood products was identified

at the 11th patient, who had undergone CABG surgery in January 2006, whereas the
CUSUM signalled at the 61st patient in March 2006. An identical change point was

also obtained following the first signal of the EWMA chart on the rate of re-operation.

This change coincides with the time, early 2006, when there was a temporary drop

in the use of Aprotinin. This reduction occurred following publication of work by

Mangano et al. (2006) that linked Aprotinin use to an increased risk of post procedural

complications including death. Figure 8.4 shows this reduction. However, following

a review of adverse outcomes in the surgical unit’s regular morbidity and mortality

meeting, where no significant effect was detected, this reduction was not sustained and

therefore the rate of blood products decreased to the in-control range as Aprotinin use

increased.

The second identical change point which was identified following signals of the CUSUM


and EWMA charts on the blood products, is at patient number 480, who underwent

surgery in late September 2007. As seen in Figure 8.2-b1, the CUSUM shows a stable

increase in the rate. Again, the change point associated with the increase in use of

blood products matches with changes in the use of Aprotinin. At this time, follow-up

studies supported the results reported by Mangano et al. (2006) and the U.S. Food

and Drug Administration (FDA) recognized this risk in late October 2007 (US). This

warning was followed by the withdrawal of Aprotinin by Bayer in early November 2007

(http://www.trasylol.com/). This action had an immediate impact on the routine use

of Aprotinin at SAWMH as seen by the large drop in Figure 8.4. It should be noted,

however, that it is also worthwhile to investigate other potential causes of changes in

the rate of blood products using the process described by Mohammed et al. (2004).

8.4 Angioplasty Data

8.4.1 Data Description

This analysis involved the review of prospectively collected data acquired as part of an

ongoing quality monitoring program conducted by the interventional cardiology unit of

St Andrew’s War Memorial Hospital, Brisbane, Australia. As with the cardiac surgical

data, ethical approval was gained to undertake collection of these data. In total, data

for 2104 index PTCA procedures performed in the period from May 2002 to December

2006 with 12 month follow-up data were available for analysis. Any instance of a patient

requiring a PTCA procedure within 12 months of the index procedures was treated as

a complication of the initial PTCA (as such they were not counted separately). All

procedures were performed by five experienced interventional cardiologists. Data were

entered into a purpose-designed database which stored patient demographic and pre-

operative co-morbidity details, comprehensive procedural details including number and

type of stents used, and outcomes including major adverse events during the admission

of the procedure as well as at 30 days and 12 months post procedures.


8.4.2 Process Monitoring

Two measures commonly used to assess the outcome for patients following PTCA are

the rate of subsequent revascularisation procedures, either repeat PTCA or CABG,

involving the target lesion (TLR) and the rate of any major adverse cardiac event

(MACE) defined as any instance of TLR, myocardial infarction or death (Ajani et al.,

2008). Common time intervals after the procedure at which the TLR or MACE rates

are quoted are 30 days and 12 months following the index procedure.

Study of PTCA outcomes frequently involves a patient’s pre-operative risk factors. In

this regard, risk adjusted monitoring procedures are recommended; see Steiner and

Cook (2000), Cook (2004) and Grigg and Spiegelhalter (2007) for more details. How-

ever risk adjustment is not followed here since the demographics of the patients who

underwent coronary angioplasty and the characteristics of the procedure were relatively

stable through the period of observation. Therefore the variation caused by the case

mix is unlikely to have significantly contributed to any detected shifts in the outcome,

particularly, when low risks were obtained through current risk models.

Similar to the first study, for the $i$th patient, we observe $(y_{Ti}, y_{Mi})$ where $y_{Ti}, y_{Mi} \in \{0, 1\}$. This leads to two datasets of Bernoulli data. We assume $y_{Ti} \sim \text{Bernoulli}(p_T)$ and $y_{Mi} \sim \text{Bernoulli}(p_M)$. As discussed in Section 8.3.2, we first considered Bernoulli

CUSUM and EWMA control charts. According to Figure 8.5, the rates of TLR and

MACE seem to be stable from the middle of 2003 to the end of 2004 for 598 patients.

Therefore the associated rates, p0T = 0.020 and p0M = 0.040, of this segment were

considered as the in-control rates for the chart construction and the subsequent 982

patients were monitored by the resultant control charts.

Figure 8.5 Exponentially weighted moving average graphs (with smoothing constant of 0.01) for rates of patients who underwent CABG or PTCA on the target lesion of the angioplasty procedure (TLR) and the rate of patients who experienced either TLR or heart attack or died (MACE). Data are drawn from cardiac surgical procedures performed at SAWMH in the period 2002-2006.

In the same way as discussed in Section 8.3.2, we calibrated the CUSUM charts to detect a doubling and a halving of the odds ratio in the in-control rates ($p_{0T} = 0.020$ and $p_{0M} = 0.040$) and to have an in-control average run length ($\widehat{ARL}_0$) of approximately five years. This setting led to an $\widehat{ARL}_0$ of 2250 patients, and decision intervals of $h_T^{\pm} = (3.78, 3.27)$ and $h_M^{\pm} = (4.60, 4.07)$ for TLR and MACE, respectively. As two-sided charts were considered, the negative values of $h^{-}$ were used. The associated CUSUM scores obtained were $W_{Ti}^{\pm} = (-0.020, 0.010)$ and $W_{Ti}^{\pm} = (0.673, -0.683)$ where $y_i$ is 0 and 1, respectively, for TLR, and $W_{Mi}^{\pm} = (-0.039, 0.020)$ and $W_{Mi}^{\pm} = (0.653, -0.672)$ where $y_i$ is 0 and 1, respectively, for MACE. In the Bernoulli EWMA we set $\lambda = 0.05$ as the in-control rates are low and obtained $A_T^{\pm} = 4.50$ and $A_M^{\pm} = 4.05$ for TLR and MACE such that the same in-control average run length ($\widehat{ARL}_0 = 2250$) was satisfied.
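The CUSUM scores for TLR and MACE follow from the in-control rates once the doubled and halved odds are converted back to rates. The short Python check below works through that arithmetic; it is an illustration with hypothetical function names, not the code used to build the charts.

```python
import math


def shifted_rate(p0, odds_multiplier):
    """Rate whose odds equal odds_multiplier times the in-control odds."""
    odds = odds_multiplier * p0 / (1.0 - p0)
    return odds / (1.0 + odds)


def cusum_scores(p0, odds_multiplier):
    """Scores for y = 0 and y = 1 when the odds change by odds_multiplier."""
    p1 = shifted_rate(p0, odds_multiplier)
    return math.log((1.0 - p1) / (1.0 - p0)), math.log(p1 / p0)


for name, p0 in (("TLR", 0.020), ("MACE", 0.040)):
    up = cusum_scores(p0, 2.0)     # doubling of the odds
    down = cusum_scores(p0, 0.5)   # halving of the odds
    print(name, [round(v, 3) for v in up], [round(v, 3) for v in down])
```

Under these assumptions the printed values agree with the scores reported above to within about 0.001.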

The constructed Bernoulli CUSUM and EWMA control charts are shown in Figure

8.6. According to the CUSUM chart (Figure 8.6-a1), the rate of TLR among patients

who had undergone angioplasty was in-control except for two short periods of time

including 575-578 and the last three patients of the dataset, 980-982. After the signal,

the CUSUM chart returned to the in-control state; however the associated statistic of

the Bernoulli CUSUM in detection of an increase in the rate (X+i ) tended to remain

away from the center and never became zero. The Bernoulli EWMA for TLR signalled

at four periods of observations. The first pair includes six observations in the periods

301-302 and 307-310 whereas the second pair contains eight observations in the periods

570-572 and 575-579. As seen in Figure 8.6-a1, the last signal in the EWMA chart is

almost identical to the patients first detected as out-of-control in the CUSUM chart.

The behavior of the rate of MACE among patients who had undergone angioplasty seemed different between the two charts. The Bernoulli CUSUM chart (Figure 8.6-b1) first signalled an increase in the rate at the 552nd patient and remained out-of-control for the remaining patients. In contrast, the Bernoulli EWMA chart (Figure 8.6-b2) detected four short-term signals followed by in-control periods. The signal periods were 326-328, 557, 570-571 and 575-577, each implying that an increase in the rate occurred. The shortest signal period contained one observation, whereas the longest included three observations. As seen in Figure 8.6-b2, the beginning of the second signal, an increase in the rate of MACE, detected by the Bernoulli EWMA is almost identical to the starting time of the long-period signal detected by the Bernoulli CUSUM chart.

Figure 8.6 Bernoulli CUSUM (a1, b1) and EWMA (a2, b2) control charts for TLR (a1-2) and MACE (b1-2) variables over 982 patients who underwent angioplasty during 2005-2006.

Table 8.2 Posterior distributions (mode, sd.) and credible intervals (CI) of the change point parameters τ and δ following signals from the Bernoulli CUSUM (h± = (3.78, 3.27) and h± = (4.60, 4.07)) and EWMA (λ = 0.05, A± = 4.50 and A± = 4.05) charts on TLR and MACE variables over 982 patients who underwent angioplasty during 2005-2006. Standard deviations are shown in parentheses; 50% credible intervals are shown in single brackets and 80% credible intervals in double brackets.

Variable      Chart  Signal       Model     τ1       δ1       τ2       δ2       CI for τ1         CI for τ2
TLR           CUSUM  575          Step      550      0.123    -        -        [542.5,551]       -
                                            (65.4)   (0.121)  -        -        [[540.5,568]]     -
TLR           CUSUM  980          Multiple  904.5    0.005    971      0.055    [876.3,959]       [950.7,980]
                                            (101)    (0.11)   (53.4)   (0.20)   [[744,968]]       [[903,980]]
TLR           EWMA   301,307      Step      298      0.065**  -        -        [298.4,300]       -
                                            (25.4)   (0.27)   -        -        [[285.1,300.2]]   -
TLR           EWMA   570,575      Step*     550      0.131    -        -        [537.2,551]       -
                                            (22.5)   (0.19)   -        -        [[542.4,568]]     -
MACE          CUSUM  552          Multiple  299.5    0.022    550      0.031    [210.8,300.7]     [547,551.8]
                                            (113)    (0.09)   (107)    (0.3)    [[86,389.6]]      [[373.6,552]]
MACE          EWMA   326          Multiple  299      0.018    321      0.124    [249.1,301]       [317.7,325.9]
                                            (82)     (0.15)   (23.8)   (0.20)   [[161.8,322]]     [[293.2,326]]
MACE          EWMA   557          Step      550      0.044    -        -        [528.5,551.1]     -
                                            (62)     (0.16)   -        -        [[447.7,557]]     -
MACE          EWMA   570,575      Step*     565      0.38     -        -        [566.5,568]       -
                                            (2.2)    (0.21)   -        -        [[565,569]]       -

8.4.3 Change Point Detection

We applied the change point model proposed in Section 8.3.3 to the TLR and MACE

data. As discussed, either a step change point or multiple change point model was

allowed. Table 8.2 presents the posterior distributions and associated credible intervals

for the time and the magnitude of the change points following signals of the Bernoulli

CUSUM and EWMA charts on TLR and MACE variables. See Section 8.3.3 and

Appendix for details on the MCMC method and implementation.

The mode of the posterior distribution of the time of the first signal detected by the

CUSUM in the rate of TLR suggested that the change occurred at the 550th patient,

25 patients earlier than the observed signal. This estimate and the magnitude of the

change were confirmed when the step change point model was implemented over a

different subset of data (311 to 570) prior to the second pair of signals obtained from

the EWMA chart.

The multiple change point model was found appropriate for the second signal provided by the CUSUM in TLR. The model suggested that a small increase in the rate occurred at the 904th patient and then a relatively large increase occurred at the 971st, which was followed by the signal. Having two change points prior to the signal calls for a recalibration of the causal analysis around the process.

Figure 8.7 Exponentially weighted moving average graph (with smoothing constant of 0.01) for rates of patients for whom DES was used in the angioplasty procedure during 2005-2006 at SAWMH.

The mode proposed the 298th patient as the change point related to the signal at patient

301 in the EWMA chart. However care should be taken regarding the associated mag-

nitude as a diffuse posterior distribution was obtained (highlighted by double asterisks

in Table 8.2).

Implementation of the multiple change point model for the only signal at patient 552

of the Bernoulli CUSUM chart on the MACE data reported the 299th and the 550th

patients as the change points. The former was also identified as a change point if a

shorter subset of data is used when the first signal of the Bernoulli EWMA, at patient

326, is taken into consideration. The mode of the posterior proposed the latter point as

the change point when a step change point model was applied to the signal at patient

557 in the EWMA. Thus although different subsets of observations were used, almost

the same estimates were obtained for the parameter τs. The minor differences can be

seen in associated standard deviations and the magnitude of the changes (δs).

For the signal at patient 570 in the CUSUM and EWMA charts, highlighted by an

asterisk in Table 8.2, although bimodal posterior distributions were obtained for τ , the

step change point model was reported since the modes are too close to each other and

no more informative results can be provided by the multiple change point model.


As discussed in Section 8.3.3, having the change points beside the signals leads to more efficient root causes analysis efforts. To show this advantage of change point detection, we focus here on some signals and associated change point estimates. We compare the obtained estimates with changes in the use of drug-eluting stents (DES), the known potential factor influencing the rate of MACE (Stone et al., 2004). A DES

is a coronary stent placed into narrowed, diseased coronary arteries that slowly releases

a drug to block cell proliferation. The stent is placed within the coronary artery by an

interventional cardiologist during an angioplasty procedure.

As seen in Table 8.2, an identical change point at the 300th patient, who underwent

angioplasty procedure in mid August 2005, was obtained following the first signals of

the CUSUM and EWMA charts on the rate of MACE. This time of change coincides

with a reduction in the use of DES across the second half of 2005, see Figure 8.7.

As one of the major benefits of DES over conventional bare metal stents (BMS) is a

reduction in the rate of late restenosis, a slow decline in their use and a shift back

to BMS, may contribute to an increase in the rate of MACE although this was not

sufficient to induce signals in TLR alone (a component of MACE). The decline in

use of DES in late 2005 and early 2006 appears to be linked with an increase in the

number of reports such as that by Bavry and colleagues at the November 2005 American

Heart Association meeting linking the use of DES to a possible increase in the risk for

late thrombosis (reported in Bavry et al. (2006)). Concerns were reinforced at the

European Society of Cardiology in Barcelona in September 2006 when similar results

(particularly those of the Swedish Coronary Angiography and Angioplasty Registry)

were presented, see Daemen et al. (2007) and Lagerqvist et al. (2007). This behavior

can be seen as the source of following signals and estimated change points. As shown in

Table 8.2, although different signals were obtained by the EWMA and CUSUM charts

for TLR and MACE in early (patient 552), mid (patient 557) and end (patients 570 and

575) of February 2006, the real change point was identified as occurring on 6 February

2006 (patient 550). This change point matches with a larger drop in the use of DES at

SAWMH at the same time, as can be seen in Figure 8.7.


8.5 Conclusion

Control charts play an essential role in improvement of quality of healthcare deliv-

ery. The chart signals when a significant change has occurred due to the existence of

assignable causes. This signal is investigated using root causes analysis for identifying

potential causes and then corrective or preventive actions are conducted. Identification

of the time when a process has changed enables process owners to run their investiga-

tion for special causes more effectively. Indeed, knowing the change point restricts the

search efforts to a tighter window of observations and related variables.

In this paper we first studied the rate of patients who had undergone re-operations for

bleeding and who were administered more than 10 units of blood products following

their CABG surgery. We applied Bernoulli CUSUM and EWMA control charts to

1072 CABG procedures over 2006-2010. The behavior of the charts was discussed. We

then developed a Bayesian change point model to identify the time when the potential

changes in the underlying rate occurred prior to signals. Either a step or a multiple

change model was used depending on which was found more appropriate. Posterior

distributions of the time and the magnitude of the changes were constructed using

MCMC method and the estimates were reported. To assess the reliability of estimates,

the changes in use of Aprotinin as a known potential cause were compared with the

obtained estimates. The coincidence of change points obtained from Bayesian posteriors

and the changes occurring in the use of Aprotinin confirmed the capability of change

point detection in root causes analysis.

In the second study we monitored the rate of patients experiencing adverse events

including re-operation, heart attack and death in the follow-up period of angioplasty.

Similar to the first study, the Bernoulli CUSUM and EWMA charts were applied and

change points prior to signals were estimated. In the same manner, root causes analysis

was conducted considering the use of DES as a source of changes. Shifts in the use of

DES were matched to the estimated time of changes identified by the Bayesian posterior

estimates.

Although these investigations were implemented off-line for a restricted range of related

factors, the obtained results support the role of change point detection in process

monitoring and root causes analysis. It could be considered as a plug-in procedure

following signals obtained by control charts in an on-line monitoring program to provide

more specific and probabilistic information which may lead to more productive efforts

in assignable causes identification.

However, prior to practice, further investigation to validate the performance of the

proposed change point estimator over various change scenarios is required. Meanwhile,

modification of the change point model to capture the underlying patient mix in monitoring

hospital outcomes is essential.

Acknowledgment



Appendix

Step change model code for blood products

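model {
# Bernoulli likelihood with a single step change of size delta after observation 'change'
for(i in 1 : RL ){
x[i] ~ dbern(p1[i])
p1[i] <- p0 + delta*step(i-change)}
# tau is passed to dnorm as its precision parameter
tau <- sqrt(1/(6*p0*(1-p0)))
delta ~ dnorm(0, tau)I(-0.0176,0.982)
# assumption: the prior for 'change' is missing from this listing; a uniform prior
# over the run length is assumed here by analogy with the multiple change model
change ~ dunif(1, RL)}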



Multiple change model code for blood products

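model {
for(i in 1 : RL ){
x[i] ~ dbern(p2[i])
# rate after the first and second step changes
p1[i] <- p0 + delta1*step(i-change1)*step(change2-i) + delta2*step(i-change2)
# keep the Bernoulli rate strictly below 1
p2[i] <- min(p1[i], 0.999)}
# tau is passed to dnorm as its precision parameter
tau <- sqrt(1/(6*p0*(1-p0)))
delta1 ~ dnorm(0, tau)I(-0.0176,0.982)
delta2 ~ dnorm(0, tau)I(-0.0176,0.982)
change1 ~ dunif(1,change2)
change2 ~ dunif(change1,RL)}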

Bibliography

Ajani, A., Reid, C., Duffy, S., Andrianopoulos, N., Lefkovits, J., Black, A., New, G.,

Lew, R., Shaw, J., Yan, B., et al. (2008). Outcomes after percutaneous coronary

intervention in contemporary Australian practice: insights from a large multicentre

registry. Medical Journal of Australia, 189(8):423–428.

Bavry, A., Kumbhani, D., Helton, T., Borek, P., Mood, G., and Bhatt, D. (2006). Late

thrombosis of drug-eluting stents: a meta-analysis of randomized clinical trials. The

American Journal of Medicine, 119(12):1056–1061.



336.

Bourke, P. D. (1991). Detecting a shift in the fraction of nonconforming items us-

ing run-length control charts with 100% inspection. Journal of Quality Technology,

23(3):225–238.



Chang, T. C. and Gan, F. F. (2001). Cumulative sum charts for high yield processes.

Statistica Sinica, 11(1):791–805.




Daemen, J., Wenaweser, P., Tsuchida, K., Abrecht, L., Vaina, S., Morger, C., Kukreja,

N., Juni, P., Sianos, G., Hellige, G., et al. (2007). Early and late coronary stent

thrombosis of sirolimus-eluting and paclitaxel-eluting stents in routine clinical prac-

tice: data from a large two-institutional cohort study. The Lancet, 369(9562):667–

678.


Chapman & Hall/CRC.


13(1):18–22.



102(477):140–152.


Papers, 46(1):47–64.

Lagerqvist, B., James, S., Stenestrand, U., Lindback, J., Nilsson, T., and Wallentin,

L. (2007). Long-term outcomes with drug-eluting stents versus bare-metal stents in

Sweden. New England Journal of Medicine, 356(10):1009–1019.

Mangano, D., Tudor, I., Dietzel, C., et al. (2006). The risk associated with Aprotinin

in cardiac surgery. New England Journal of Medicine, 354(4):353–365.

Mohammed, M., Rathbone, A., Myers, P., Patel, D., Onions, H., and Stevens, A.

(2004). An investigation into general practitioners associated with high patient mor-

tality flagged up through the Shipman inquiry: retrospective analysis of routine data.














coda. Citeseer.

Reynolds, M. J. and Stoumbos, Z. G. (1999). A CUSUM chart for monitoring a pro-

portion when inspecting continuously. Journal of Quality Technology, 31(1):87–108.













Stone, G., Ellis, S., Cox, D., Hermiller, J., O’Shaughnessy, C., Mann, J., Turco, M.,

Caputo, R., Bergin, P., Greenberg, J., et al. (2004). One-year clinical results with the

slow-release, polymer-based, paclitaxel-eluting TAXUS stent: the TAXUS-IV trial.

Circulation, 109(16):1942–1947.



US. Food and Drug Administration (FDA): Early Communication about

an Ongoing Safety Review Aprotinin Injection (marketed as Trasylol).

http://www.fda.gov/cder/drug/earlycomm/aprotinin.htm.







Yeh, A., Mcgrath, R., Sembower, M., and Shen, Q. (2008). EWMA control charts

for monitoring high-yield processes based on non-transformed observations. Interna-

tional Journal of Production Research, 46(20):5679–5699.

CHAPTER 9

Change Point Estimation in Risk Adjusted

Control Charts

Preamble









Following the illustrative study in Chapter 8, which discussed the

potential advantages of change point investigation in monitoring hospital outcomes,

this chapter aims to propose a Bayesian model that captures patient mix. Several

methods, including MLE estimators and data mining techniques such as neural networks

and fuzzy clustering, have been proposed and investigated in an industrial context for

various processes involving single variables, multivariate data and profile monitoring, yet no

model has considered patient mix and risk adjustment for observed dichotomous

outcomes.

The benefits obtained by adopting the Bayesian approach in modelling and estimation

of change point parameters, reported in Chapters 6 and 7, in particular its flexibility and

the relaxation of analytical calculations, motivated this study to employ the Bayesian

framework and associated computational components in the development of an estimator that

is able to handle the variability of the in-control state of a clinical process. This

variability is commonly explained by the risk models that underlie the observations plotted

in risk-adjusted control charts.

In this chapter we applied Bayesian hierarchical models to formulate the change point

where there exists a step change in the odds ratio and logit of risk of a Bernoulli process.

The outcomes of patients admitted to the Intensive Care Unit (ICU) of a local hospital were

considered as the quality characteristic of monitoring interest and a risk model that

predicts the probability (p) of mortality based on a logistic regression was selected for

risk adjustment. The performance of the Bayesian estimator was investigated through

simulations and the result showed that precise estimates can be obtained when they

are used in conjunction with the risk-adjusted CUSUM and EWMA control charts. In

comparison with alternative EWMA and CUSUM estimators, more accurate and precise

estimates were obtained by the Bayesian estimator. These advantages were enhanced

when probability quantification, flexibility and generalizability of the Bayesian change

point detection model were also considered. The Deviance Information Criterion, as a

model selection criterion in the Bayesian context, was applied to find the best change

point model for a given dataset where there was no prior knowledge about the change

type in the process.




components change point estimators were designed to estimate time of a step change in

the odds ratio of hospital outcomes in the presence of patient mix. Meanwhile, the simulation

study implemented in this research contributes to an analytic application of the

risk-adjusted control charts over various change scenarios.







certified that:



field of expertise;






unit; and





Assareh, H., Smith, I. and Mengersen, K. (2011) Change point estimation in risk-

adjusted control charts, Statistical Methods in Medical Research, in press.



Signature & Date:

I. Smith: Supplied data, comments on manuscript, editing





9.1 Abstract

Precise identification of the time when a change in a clinical process has occurred en-

ables experts to identify a potential special cause more effectively. In this paper, we

develop change point estimation methods for a clinical dichotomous process in the pres-

ence of case mix. We apply Bayesian hierarchical models to formulate the change point

where there exists a step change in the odds ratio and logit of risk of a Bernoulli pro-

cess. Markov Chain Monte Carlo is used to obtain posterior distributions of the change

point parameters including location and magnitude of changes and also corresponding

probabilistic intervals and inferences. The performance of the Bayesian estimator is

investigated through simulations and the result shows that precise estimates can be ob-

tained when they are used in conjunction with the risk-adjusted CUSUM and EWMA

control charts. In comparison with alternative EWMA and CUSUM estimators, more

accurate and precise estimates are obtained by the Bayesian estimator. These advantages

are enhanced when probability quantification, flexibility and generalizability of the

Bayesian change point detection model are also considered. The Deviance Information

Criterion, as a model selection criterion in the Bayesian context, is applied to find the

best change point model for a given dataset where there is no prior knowledge about

the change type in the process.

9.2 Introduction


stability and dispersion of the process. The chart signals when a significant change

has occurred. This signal can then be investigated to identify potential causes of the

change and corrective or preventive actions can then be implemented. Following this

cycle leads to variation reduction and process stabilization (Montgomery, 2008). The

achievements obtained by industrial and business sectors via the implementation of a

quality improvement cycle including quality control charts and root causes analysis have

motivated other sectors such as healthcare to consider those tools and apply them as

an essential part of the monitoring process in order to improve the quality of healthcare


delivery.

One of the earliest comprehensive research studies was undertaken by Benneyan (1998a,b)

who utilized Statistical Process Control (SPC) methods and control charts in epidemi-

ology and infection control and discussed a wide range of control charts in the health

context. Woodall (2006) comprehensively reviewed the increasing stream of adaptations

of control charts and their implementation in healthcare surveillance. He acknowledged

the need for modification of the tools according to health sector characteristics such

as the emphasis on monitoring individuals, particularly dichotomous data, and patient mix.


impact of the human element in process outcomes. Steiner and Cook (2000) devel-

oped a Risk-adjusted version of Cumulative Sum charts (CUSUM) to monitor surgical

outcomes, death and survival, which are influenced by the state of a patient’s health,

age and other clinical factors known prior to the procedure. This approach has been

extended to Exponentially Weighted Moving Average (EWMA) control charts (Cook, 2004; Grigg

and Spiegelhalter, 2007). Both modified procedures have been intensively reviewed

and are now well established for monitoring clinical outcomes where the observations

are recorded as binary data (Cook et al., 2008; Grigg and Farewell, 2004; Grigg and

Spiegelhalter, 2006).

Consideration of identified needs and how they are being satisfied in industrial and

business sectors can accelerate other sectors in their own research and development of

effective quality improvement tools. The need to know the time at which a process

began to vary, the so-called change point, has recently been raised and discussed in the

industrial context of quality control. Accurate detection of the time of change makes the

search for a potential cause more efficient, as only a tighter time-frame prior to the

signal in the control charts needs to be investigated.

A built-in change point estimator in CUSUM charts was suggested by Page (1954, 1961).

An equivalent estimator in EWMA charts was also proposed by Nishina (1992). The

change points from CUSUM and EWMA are the points at which they were last at zero

(Hawkins and Olwell, 1998) and at the process mean (Nishina, 1992), respectively. Neither

estimator provides any statistical inference on the obtained estimates. That said,

Hinkley (1971) studied the distribution of the built-in estimator of CUSUM

charts and derived an asymptotic distribution that enables inferences to be made. These

early built-in change point estimators can be applied for all discrete and continuous

distributions underlying the charts.
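The built-in CUSUM estimator described above can be sketched as follows. This is a minimal illustration for an upper CUSUM only; the function names `cusum_path` and `builtin_change_point`, the toy scores and the threshold are assumptions made for the example and are not taken from the cited papers.

```python
def cusum_path(scores, start=0.0):
    """Upper CUSUM path X_i = max(0, X_{i-1} + w_i) for a sequence of scores."""
    path, x = [], start
    for w in scores:
        x = max(0.0, x + w)
        path.append(x)
    return path


def builtin_change_point(path, threshold):
    """Return (signal_time, change_point_estimate), both 1-indexed.

    The estimate is the last time before the signal at which the CUSUM was at
    zero; 0 means the chart never returned to zero after initialization.
    (None, None) is returned when the chart does not signal.
    """
    for t, x in enumerate(path, start=1):
        if x > threshold:
            for s in range(t - 1, 0, -1):
                if path[s - 1] == 0.0:
                    return t, s
            return t, 0
    return None, None


# Toy example with hypothetical scores and threshold.
path = cusum_path([-1.0, 0.5, -1.0, 1.0, 1.0, 1.0, 1.0])
signal_time, tau_hat = builtin_change_point(path, threshold=2.5)
```

The estimator returns a single point with no measure of uncertainty, which is the limitation noted in the text.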

Samuel and Pignatiello (2001) developed and applied a maximum likelihood estima-

tor (MLE) for the change point in a process fraction nonconformity monitored by a

p-chart, assuming that the change type is a step change. They showed how closely this

new estimator detects the change point in comparison with the usual p-chart signal.

Subsequently, Perry and Pignatiello (2005) compared the performance of the derived

MLE estimator with EWMA and CUSUM charts. These authors also constructed

a confidence set based on the estimated change point which covers the true process

change point with a given level of certainty using a likelihood function based on the

method proposed by Box and Cox (1964). This approach was extended to other prob-

ability distributions and change type scenarios. In the case of a very low fraction

non-conforming, Noorossana et al. (2009) derived and analyzed the MLE estimator of

a step change based on the geometric distribution control charts discussed by Xie et al.

(2002).
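The idea behind a maximum likelihood change point estimator for a step change can be sketched for individual Bernoulli observations as follows. This is an illustration in the spirit of the MLE approach, assuming a known in-control rate `p0` and an unknown out-of-control rate estimated from the observations after each candidate change point; it is not the subgroup-based p-chart formulation of the cited authors, and the function names are hypothetical.

```python
import math


def bernoulli_loglik(ys, p):
    """Log-likelihood of binary observations ys under a common rate p."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
    k = sum(ys)
    return k * math.log(p) + (len(ys) - k) * math.log(1.0 - p)


def mle_change_point(ys, p0):
    """Return tau_hat, the estimated last in-control observation (0..T-1).

    For each candidate tau, the first tau observations are assigned the known
    rate p0 and the remaining ones the sample proportion after tau.
    """
    best_tau, best_ll = None, -math.inf
    for tau in range(len(ys)):
        post = ys[tau:]
        p1_hat = sum(post) / len(post)
        ll = bernoulli_loglik(ys[:tau], p0) + bernoulli_loglik(post, p1_hat)
        if ll > best_ll:
            best_tau, best_ll = tau, ll
    return best_tau


# Toy series: 20 in-control observations followed by an obvious shift.
tau_hat = mle_change_point([0] * 20 + [1] * 10, p0=0.1)
```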



clinical outcomes as the mean of the process being monitored is highly correlated to

individual characteristics of patients. Therefore, it is required that the risk model,


points in control charts.

The motivation of this study arose from a monitoring program of mortality of patients

admitted to the Intensive Care Unit (ICU) of a local hospital in Brisbane, Australia. The

Acute Physiology and Chronic Health Evaluation II (APACHE II), an ICU scoring

system (Knaus et al., 1985), is used to quantify and express patient mix in quality

control charting. APACHE II predicts the probability (p) of mortality based on a

logistic regression given 12 physiological measurements taken in the first 24 hours after

admission to ICU, as well as chronic health status and age. In this program detection


of the true change point in control charts, as a part of root cause efforts, is sought. It

should be noted that the APACHE II has been chosen to demonstrate change point

detection as it is available for all ICU admissions from 2000 at the pilot hospital.

However, for practical implementations more recent versions of this risk adjustment

tool may be of interest.




computational frameworks to change point estimation facilitates modelling the process

where heterogeneity exists and also provides a way of making a set of inferences based

on posterior distributions for the time and the magnitude of a change as well as assessing

the validity of underlying assumptions in the change point model itself (Gelman et al.,

2004). In a recent paper, Assareh et al. (2011) applied this approach and discussed the

advantages of the Bayesian framework in investigation of the change point of control

charts monitoring the rate of adverse events and the use of blood products among patients

who had undergone cardiac surgery.

In this paper we model and detect the change point in a Bayesian framework. The

change points are estimated assuming that the underlying shift is a step change. In

this scenario, we model a step change in the odds ratio and logit of risk of a Bernoulli

process. For each model we analyze and discuss the performance of the Bayesian

change point model through posterior estimates and probability based intervals. The

two models are demonstrated and evaluated in Sections 9.4-9.6, and then compared

with respect to goodness of fit in Section 9.7. We then compare the Bayesian estimator

with CUSUM and EWMA built-in estimators in Section 9.8 and summarize the study

and obtained results in Section 9.9.

9.3 Risk-Adjusted Control Charts

The risk of death of a patient admitted to ICU is affected by the rate of mortality in the

ICU and also an individual patient’s covariates such as age, gender, co-morbidities, etc.

Risk-adjusted control charts are monitoring procedures designed to detect changes in

a process parameter of interest, such as rate of mortality, where the process outcomes

are affected by covariates, such as case mix. In these procedures, risk models are used

to adjust control charts in a way that the effects of covariates for each input, patient

say, would be taken into account.


that accumulates evidence of the performance of the process and signals when either

a deterioration or an improvement is detected, where the weight of evidence has been

adjusted according to a patient’s prior risk (Steiner and Cook, 2000).

For the ith patient, we observe yi where yi ∈ {0, 1}. This leads to a sequential set of



OR1, in the Bernoulli process (Cook et al., 2008). A weight Wi, the so-called CUSUM

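score, is given to each patient considering the observed outcome yi ∈ {0, 1} and the

prior risk pi,

$$W_i^{\pm} = \begin{cases} \log\left[\dfrac{1 - p_i + OR_0 \times p_i}{1 - p_i + OR_1 \times p_i}\right] & \text{if } y_i = 0 \\[2ex] \log\left[\dfrac{(1 - p_i + OR_0 \times p_i) \times OR_1}{(1 - p_i + OR_1 \times p_i) \times OR_0}\right] & \text{if } y_i = 1, \end{cases} \qquad (9.1)$$

where $OR_0$ is the odds ratio under the in-control hypothesis (Steiner and Cook, 2000). Upper and lower CUSUM statistics are obtained through $X_i^+ = \max\{0, X_{i-1}^+ + W_i^+\}$

and $X_i^- = \min\{0, X_{i-1}^- - W_i^-\}$, respectively, with both statistics initialized at

0. Therefore an increase in the odds ratio, $OR_1 > 1$, is detected when a plotted $X_i^+$

exceeds a specified decision threshold $h^+$; similarly, if $X_i^-$ exceeds a specified decision

threshold $h^-$, the RACUSUM chart signals that a decrease in the odds ratio, $OR_1 < 1$,

has occurred. See Steiner and Cook (2000) for more details.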
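A minimal sketch of a two-sided risk-adjusted CUSUM is given below. It uses the log-likelihood-ratio score of Steiner and Cook (2000) with the in-control odds ratio set to 1 by default. The alternative odds ratios (2 and 0.5) and the thresholds (5.85 and 5.33) are the values reported later in this chapter; the function names and the convention of signalling when the lower statistic falls below the negative of its threshold are assumptions made for the illustration.

```python
import math


def racusum_score(y, p, or1, or0=1.0):
    """Log-likelihood-ratio score for one patient with outcome y and prior risk p."""
    num = 1.0 - p + or0 * p
    den = 1.0 - p + or1 * p
    if y == 1:
        return math.log((num * or1) / (den * or0))
    return math.log(num / den)


def racusum(ys, ps, or_up=2.0, or_down=0.5, h_up=5.85, h_down=5.33):
    """Run a two-sided RACUSUM; return (signal_time, direction) or (None, None)."""
    x_up = x_down = 0.0  # both statistics are initialized at zero
    for i, (y, p) in enumerate(zip(ys, ps), start=1):
        x_up = max(0.0, x_up + racusum_score(y, p, or_up))
        x_down = min(0.0, x_down - racusum_score(y, p, or_down))
        if x_up > h_up:
            return i, "increase"
        if x_down < -h_down:
            return i, "decrease"
    return None, None
```

Because each score depends on the patient's prior risk, a death of a low-risk patient moves the upper statistic further than a death of a high-risk patient.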









$= \lambda^2 \times p_i(1 - p_i) + (1 - \lambda)^2 \times \sigma^2_{Z_{p_{i-1}}}$. We

let $\sigma^2_{Z_{p_0}}$







The decision thresholds of the RACUSUM, h+ and h−, and the coefficient of the con-

trol limits in RAEWMA control charts, L, are determined in a way that the charts

have a specified performance in terms of false alarm and detection of shifts in odds

ratio; see Montgomery (2008) and Steiner and Cook (2000) for more details. The pro-

posed initialization may also be altered to achieve better performance in the detection

of changes that immediately occur after control chart initialization, see Steiner (1999)

and Knoth (2005) for more details on fast initial response (FIR). It should be noted

that there exists an alternative for risk-adjusted EWMA in which the focus is on esti-

mation of probability of death using pseudo observations and Bayesian methods (Cook

et al., 2008). This formulation is not considered in this study; see Grigg and

Spiegelhalter (2007) for more details.

9.4 Change Point Model









This structure is expandable to multiple levels in a hierarchical fashion, so-called

Bayesian hierarchical models (BHM), which allow the model to be enriched by capturing

all kinds of uncertainty in the observed data as well as in the priors. In complicated BHMs it is

not easy to obtain the posterior distribution analytically. This analytic bottleneck has

been eliminated by the emergence of Markov chain Monte Carlo (MCMC) methods.

In MCMC algorithms a Markov chain, also known as a random walk, is constructed

whose stationary distribution is the posterior distribution of the parameters. Samples

generated from a long run of the Markov chain using a proposal transition density

are drawn from posterior distributions of interest. Some common MCMC methods for

drawing samples include Metropolis-Hastings and the Gibbs sampler, see Gelman et al.


For monitoring a process with a dichotomous outcome, death say, where no covariates

contribute to the outcomes and standard control charts are applied, the observations

yi, i = 1, ..., T , are considered as samples that independently come from a Bernoulli

distribution. Assume that such a process is initially in-control with a known rate of

p0. At an unknown point in time, τ , the Bernoulli rate parameter changes from its

in-control state of p0 to p1, p1 = p0 + q and p1 ≠ p0. The general Bernoulli process step

change model can thus be parameterized as follows:

$$\mathrm{pr}(y_i \mid p_i) = \begin{cases} p_0^{y_i}(1 - p_0)^{1 - y_i} & \text{if } i = 1, 2, \ldots, \tau \\ p_1^{y_i}(1 - p_1)^{1 - y_i} & \text{if } i = \tau + 1, \ldots, T. \end{cases} \qquad (9.3)$$

However, this formulation does not hold where the in-control rate is not stable due to

covariate contributions. In other words in risk-adjusted charting procedures, we let the

process mean vary over observations and we control the variable observed rate against

the corresponding expected rate obtained through the risk models. In this setting,

a Bernoulli process is in the in-control state when observations can be statistically

expressed by the underlying risk models, taking into account their individual covariates.

The risk-adjusted control chart signals when observations tend to violate the underlying


risk model.

To express a change in an in-control process and construct a change point model, where

covariates exist, we use the common formulation of recalibration of risk models on local

data (Beck et al., 2002), as follows:

logit(p∗) = β0 + β1(logit(p)), (9.4)

where p is the probability of death using the original model and p∗ is its calibrated

value.

To model a change, let p be an in-control rate and p∗ be the associated out-of-control

rate which is caused by departure from either β0 = 0 or β1 = 1. Hence, two step change

models can be formulated, change in the intercept and the slope.

Violation in β0 = 0 can easily be parametrized in terms of the odds ratio, δ = exp(β0),

which is frequently used for design of control charts in a clinical monitoring context

(Steiner and Cook, 2000). In this setting, δ = 1 is equivalent to no change in the intercept,

β0 = 0. To model a change point in the presence of covariates, consider a Bernoulli

process yi, i = 1, ..., T , that is initially in-control, with independent observations coming

from a Bernoulli distribution with known variable rates p0i that can be explained by

an underlying risk model p0i | xi ∼ f(xi), where f(.) is a link function and x is a vector

of covariates. At an unknown point in time, τ , the Bernoulli rate parameter changes

from its in-control state of p0i to p1i obtained through

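$$\delta = \frac{p_{1i}/(1 - p_{1i})}{p_{0i}/(1 - p_{0i})} \quad \text{and} \quad p_{1i} = \frac{\delta \times p_{0i}/(1 - p_{0i})}{1 + (\delta \times p_{0i}/(1 - p_{0i}))}, \qquad (9.5)$$

where δ ≠ 1 and δ > 0 so that p1i ≠ p0i, i = τ, ..., T. The Bernoulli process step change

model in the presence of covariates can thus be parameterized as follows:

$$\mathrm{pr}(y_i \mid p_i) = \begin{cases} p_{0i}^{y_i}(1 - p_{0i})^{1 - y_i} & \text{if } i = 1, 2, \ldots, \tau \\ p_{1i}^{y_i}(1 - p_{1i})^{1 - y_i} & \text{if } i = \tau + 1, \ldots, T. \end{cases} \qquad (9.6)$$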

Modeling a step change in terms of odds ratios benefits the change point model since


no constraint on each p1i, i = τ, ..., T , is needed. In this parametrization, any δ > 1

induces an increase in the rate whereas 0 < δ < 1 causes a fall. This type of change

is analogous to step change models in a Bernoulli process rate without covariates. As

seen in Equation (9.5), although a specific magnitude of change is induced in the odds

ratio, the obtained out-of-control rates, p1i, i = τ, ..., T , are affected differently; see

Section 9.5 for more details.
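The mapping in Equation (9.5) from an in-control risk to an out-of-control risk can be sketched as below. The function name `shift_odds_ratio` and the example risks are assumptions chosen for illustration.

```python
def shift_odds_ratio(p0, delta):
    """Out-of-control risk obtained by multiplying the odds of p0 by delta > 0."""
    odds = delta * p0 / (1.0 - p0)
    return odds / (1.0 + odds)


# The same delta changes individual risks by different amounts.
changes = {p0: shift_odds_ratio(p0, 2.0) - p0 for p0 in (0.05, 0.5, 0.95)}
```

Since the result is always a valid probability for any positive delta, no additional constraint on the out-of-control rates is required, and the change in absolute risk is largest for patients with a baseline risk near 0.5.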

Violation in β1 = 1 may also be of interest. It is not rare that an in-control process

experiences a signal which is due to shifts that are not consistent in direction over

patients with different risks. Such changes in cardiac surgery outcomes may be expected

when patients with higher or lower risks are allocated to more and less experienced

surgeons. In this scenario, at an unknown point in time, τ , the Bernoulli rate parameter

changes from its in-control state of p0i to p1i obtained through

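$$\beta_1 = \frac{\mathrm{logit}(p_{1i})}{\mathrm{logit}(p_{0i})} \quad \text{and} \quad p_{1i} = \frac{(p_{0i}/(1 - p_{0i}))^{\beta_1}}{1 + (p_{0i}/(1 - p_{0i}))^{\beta_1}}, \qquad (9.7)$$

where β1 ≠ 1 so that p1i ≠ p0i, i = τ, ..., T. The resultant Bernoulli process step change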

model is similar to Equation (9.6). Again, no constraint on each p1i, i = τ, ..., T ,

is needed; however, the obtained out-of control rates, p1i, i = τ, ..., T , are affected

differently in direction as well as magnitude when a specific change size is induced in

the slope, β1. See Section 9.5 for more details.
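The slope change in Equation (9.7) can be sketched in the same way. The function name `shift_slope` and the example risks are hypothetical.

```python
def shift_slope(p0, beta1):
    """Out-of-control risk obtained by multiplying logit(p0) by beta1."""
    odds = (p0 / (1.0 - p0)) ** beta1
    return odds / (1.0 + odds)


# A slope below 1 pulls risks towards 0.5 from both sides.
low_risk_after = shift_slope(0.1, 0.33)   # higher than 0.1
high_risk_after = shift_slope(0.9, 0.33)  # lower than 0.9
```

A baseline risk of exactly 0.5 has a logit of zero and is therefore unaffected by any change in the slope, which is why the direction of the shift differs on either side of 0.5.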

Relating this to Equation (9.2), pr(. | .) is the likelihood that underlies the observa-

tions; the time and the magnitude of a step change in odds ratio and slope are the

unknown parameters of interest; and the posterior distributions of these parameters

will be investigated in the change point analysis.

Assume that the process yi is monitored by a control chart that signals at time T . We

assign a normal distribution (µ = 1, σ2 = k) for β1 and a zero left truncated normal

distribution (µ = 1, σ2 = k)I(0,∞) for δ as prior distributions where k is study-specific.

In the following, we set k = 25, giving a relatively informed prior for the magnitude

of the change in an in-control rate as the control chart is sensitive enough to detect

very large shifts and estimate associated change points. Other distributions such as


Figure 9.1 Distribution of calculated (1) logit of APACHE II scores, logit(p); and (2) risk of mortality for 4644 patients admitted to ICU during 2000-2009.

uniform and Gamma might also be of interest for δ since it is always a positive value;

see Gelman et al. (2004) for more details on selection of prior distributions. We place

a uniform distribution on the range of (1, T − 1) as a prior for τ where T is set to the

time of signal of the control charts. See Appendix for the step change model code in

WinBUGS.
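To make the structure of the posterior concrete, the sketch below computes the marginal posterior of τ for the odds ratio model by a simple grid approximation over δ. This deliberately replaces the MCMC sampling used in the study with brute-force enumeration, so it is only an illustration of the likelihood in Equation (9.6) combined with the stated priors (uniform on 1, ..., T − 1 for τ, and a normal prior with mean 1 and variance k = 25 truncated at zero for δ). The function names, the equally spaced grid for δ and the toy data are assumptions.

```python
import math


def shifted_risk(p0, delta):
    """Out-of-control risk after multiplying the odds of p0 by delta."""
    odds = delta * p0 / (1.0 - p0)
    return odds / (1.0 + odds)


def log_bernoulli(y, p):
    return math.log(p) if y == 1 else math.log(1.0 - p)


def posterior_tau(ys, p0s, delta_grid, k=25.0):
    """Marginal posterior of tau; entry [tau] holds pr(tau | y), entry [0] is unused."""
    T = len(ys)
    # prefix[t]: in-control log-likelihood of observations 1..t
    prefix = [0.0]
    for y, p in zip(ys, p0s):
        prefix.append(prefix[-1] + log_bernoulli(y, p))
    log_post = {}
    for d in delta_grid:  # positive, equally spaced values assumed
        log_prior = -((d - 1.0) ** 2) / (2.0 * k)  # truncated normal, up to a constant
        # suffix[t]: out-of-control log-likelihood of observations t+1..T
        suffix = [0.0] * (T + 1)
        for i in range(T - 1, -1, -1):
            suffix[i] = suffix[i + 1] + log_bernoulli(ys[i], shifted_risk(p0s[i], d))
        for tau in range(1, T):  # uniform prior on tau
            log_post[(tau, d)] = prefix[tau] + suffix[tau] + log_prior
    top = max(log_post.values())
    total = sum(math.exp(v - top) for v in log_post.values())
    post = [0.0] * T
    for (tau, _), v in log_post.items():
        post[tau] += math.exp(v - top) / total
    return post


# Toy data: 30 survivors followed by 10 deaths, all with a baseline risk of 0.1.
ys = [0] * 30 + [1] * 10
post = posterior_tau(ys, [0.1] * 40, [0.1 * j for j in range(1, 201)])
tau_mode = max(range(1, len(ys)), key=lambda t: post[t])
```

Unlike the built-in chart estimators, the result is a full distribution over τ, so probabilistic intervals can be read off directly.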

9.5 Evaluation


in step change detection following a signal from RACUSUM and RAEWMA control

charts when a change in either odds ratio or slope is simulated to occur at τ = 500.

However, to extend the results to those that would be obtained in practice, we considered

a dataset of available APACHE II scores that was routinely collected over 2000-2009

in the pilot hospital for construction of baseline risks in the control charts. The ICU

outcomes were subject to regular clinical review as a part of the governance process of

the hospital.

Figure 9.1-1 shows the calculated logit of APACHE II scores (logit(p)) for 4644 patients

who were admitted to ICU. The scores led to a distribution of logit values with a mean

of -2.53 and a variance of 1.05. The distribution of the obtained probability of death

over patients is also shown in Figure 9.1-2. This led to an overall risk of death of 0.082

Figure 9.2 Effect of a change of size {0.2, 0.5, 0.8, 1.25, 2, 5} in (1) odds ratio, δ, and (2) slope, β1, in an in-control Bernoulli process with baseline risks of p0.

(average of obtained risks) with a variance of 0.012 among patients in the pilot hospital.

To generate observations of a process in the in-control state, yi, i = 1, ..., τ , we first

randomly generated the logits of the associated risks, logit(p0i), i = 1, ..., τ , from a normal

distribution (µ = −2.53, σ2 = 1.05), back-transformed them to risks p0i, and then drew binary

outcomes from a Bernoulli distribution with rates of p0i, i = 1, ..., τ . Plotting the obtained

observations against the associated risks results in risk-adjusted control charts that are in-control. However, other

distributions such as Beta and uniform distributions with proper parameters or even

sampling randomly from the baseline data can be applied to generate risks directly.
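The in-control data generation can be sketched as follows, assuming that the normal distribution with mean −2.53 and variance 1.05 is placed on the logit scale, as the summary of the baseline logit values suggests. The function name `simulate_in_control` and the use of a seed are assumptions made for the illustration.

```python
import math
import random


def simulate_in_control(n, mu=-2.53, var=1.05, seed=None):
    """Draw n baseline risks (normal on the logit scale) and binary outcomes."""
    rng = random.Random(seed)
    sd = math.sqrt(var)
    risks, outcomes = [], []
    for _ in range(n):
        logit = rng.gauss(mu, sd)
        p = 1.0 / (1.0 + math.exp(-logit))  # back-transform to a probability
        risks.append(p)
        outcomes.append(1 if rng.random() < p else 0)
    return risks, outcomes


risks, outcomes = simulate_in_control(500, seed=1)
```

Out-of-control observations can then be generated in the same way after transforming each drawn risk according to the chosen step change.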

Because we know that the process is in-control, if an out-of-control signal was

generated during the first 500 in-control observations, it was taken as a

false alarm and the simulation was restarted. However, in practice a false alarm may

lead to stopping the process and analyzing root causes through the investigation model

proposed by Mohammed et al. (2004). When no cause is found, the process would

continue without adjustment.

Under the step change in odds ratio, we then induced changes of sizes δ = {1.25, 1.5, 2, 3, 5}

as increases, and their inverse values of δ = {0.2, 0.33, 0.5, 0.66, 0.8} as decreases in odds

ratio and generated observations until the control charts signalled. These changes led

to different change sizes in the in-control process rate shown in Figure 9.2-1. As seen,

patients with more extreme risks of mortality are less affected compared to patients

who have a probability of around 0.5.

The effect of a drop of size δ = 0.33 in odds ratio is demonstrated in Figure 9.3-1. The

resultant distribution is shifted to the left and highly concentrates on smaller values of


Figure 9.3 Distribution of observable risk of mortality after a step change in (1) odds ratio of size δ = 0.33 and (2) slope of size β1 = 0.33 for 4644 patients admitted to ICU during 2000-2009.

risks in comparison with the observed risks in Figure 9.1-2. The overall risk drops to

0.034 with a variance of 0.004.

For change in slope, we induced changes of sizes β1 = {1.25, 1.5, 2, 3, 5} and their

inverse values of β1 = {0.2, 0.33, 0.5, 0.66, 0.8} and generated observations until the

control charts signalled. These changes of slope led to different changes in size and

direction in the in-control process rate.

As shown in Figure 9.2-2 if a drop occurs in the slope, the risk of death increases for

patients who initially have been at a risk of less than 50%, whereas it drops for those

with a risk of higher than 50%. The obtained distribution of risks after a change of

size β1 = 0.33 in the slope is shown in Figure 9.3-2 for observed risks in the baseline

data. The concentration of the distribution is shifted to higher values compared to

Figure 9.1-2, so that the overall risk increases to 0.286 with a variance of 0.005. This is

an under-dispersed distribution compared to the baseline risks since observed risks of

more than 0.5 are projected onto the same range of risks as that obtained for patients

with low risks.

To construct risk-adjusted control charts, we applied the procedures discussed in Section

9.3. We constructed RACUSUM to detect a doubling and a halving of the odds ratio

in the in-control rate, p0 = 0.082, and have an in-control average run length ($\widehat{ARL}_0$)

of approximately 3000 observations. We used Monte Carlo simulation to determine


decision intervals, h±. However other approaches may be of interest; see Steiner and

Cook (2000) and Grigg et al. (2003). This setting led to decision intervals of h+ = 5.85

and h− = 5.33. As two sided charts were considered, the negative values of h− were

used. The associated CUSUM scores were also obtained through Equation (9.1) where

yi is 0 and 1, respectively.

We let the smoothing constant of RAEWMA be λ = 0.01 since the in-control rate was

low and detection of small changes was desired; see Somerville et al. (2002), Cook (2004)

and Grigg and Spiegelhalter (2007) for more details. The value of L was calibrated so

that the same in-control average run length ($\widehat{ARL}_0$) as the RACUSUM was obtained.

The resultant chart had L = 2.83. A negative lower control limit in the RAEWMA

was replaced by zero.





change point scenarios with the first 20000 samples ignored as burn-in. We then analyzed

the results using the CODA package in R (Plummer et al., 2010). See Appendix

for the step change model code in WinBUGS.


To demonstrate the achievable results of Bayesian change point detection in risk-

adjusted control charts, we induced a step change of size δ = 0.33 at time τ = 500

in an in-control binary process with an overall death rate of p0 = 0.082. RACUSUM

and RAEWMA, respectively, detected a drop in odds ratio and signalled at the 669th

and 659th observations, corresponding to delays of 169 and 159 observations as shown

in Figure 9.4-a1, b1. The posterior distributions of time and magnitude of the change

were then obtained using MCMC discussed in Section 9.5. For both control charts,

the distribution of the time of the change, τ , concentrates on the 500th observation,

approximately, as seen in Figure 9.4-a2, b2. The posteriors for the magnitude of the


Figure 9.4 Risk-adjusted (a1) CUSUM ((h+, h−) = (5.85, 5.33)) and (b1) EWMA (λ = 0.01 and L = 2.83) control charts and obtained posterior distributions of (a2, b2) time τ and (a3, b3) magnitude δ of an induced step change of size δ = 0.33 in odds ratio where E(p0) = 0.082 and τ = 500.

change, δ, also well identified the exact change size as they highly concentrate on values

of less than 0.5 shown in Figure 9.4-a3, b3. As expected, there exist slight differences

between the distributions obtained following RACUSUM and RAEWMA signals since

non-identical series of binary values were used for two procedures.

This investigation was replicated using a smaller shift of size δ = 0.5 in the odds ratio.


Figure 9.5 Risk-adjusted (a1) CUSUM ((h+, h−) = (5.85, 5.33)) and (b1) EWMA (λ = 0.01 and L = 2.83) control charts and obtained posterior distributions of (a2, b2) time τ and (a3, b3) magnitude β1 of an induced step change of size β1 = 0.33 in slope where E(p0) = 0.082 and τ = 500.

Table 9.1 summarizes the posterior estimates for both change sizes. If the posterior

was asymmetric and skewed, the mode of the posteriors was used as an estimator for

the change point model parameter (τ, δ and β1).
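As an illustration of using the mode as a point estimate, the sketch below extracts a mode from posterior samples. It is a simple stand-in, not the procedure used in the study: the discrete version counts repeated values of τ, and the continuous version uses a fixed-width histogram whose `bin_width` is a hypothetical tuning choice.

```python
import math
from collections import Counter


def posterior_mode_discrete(samples):
    """Most frequently sampled value, e.g. for the integer change point tau."""
    return Counter(samples).most_common(1)[0][0]


def posterior_mode_continuous(samples, bin_width):
    """Centre of the fullest fixed-width bin, e.g. for the change size."""
    bins = Counter(math.floor(s / bin_width) for s in samples)
    fullest_bin = bins.most_common(1)[0][0]
    return (fullest_bin + 0.5) * bin_width
```

For a skewed posterior the mode can sit well away from the mean, which is why it is reported alongside the standard deviation rather than instead of it.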

The RACUSUM signalled after 202 observations when the odds ratio of death was

halved. This delay significantly increased for the RAEWMA procedure and reached


Table 9.1 Posterior estimates (mode, sd.) of step change point model parameters (τ, δ and β1) following signals (RL) from RACUSUM ((h+, h−) = (5.85, 5.33)) and RAEWMA charts (λ = 0.01 and L = 2.83) where E(p0) = 0.082 and τ = 500. Standard deviations are shown in parentheses.

                          RACUSUM                              RAEWMA
Change type  Change size  RL    τ              δ(β1)         RL    τ              δ(β1)
Odds ratio   δ = 0.33     669   493 (93.2)     0.31 (0.20)   659   492.9 (103.5)  0.35 (0.26)
Odds ratio   δ = 0.50     702   481 (105.0)    0.39 (0.21)   1403  478 (142.7)    0.51 (0.29)
Slope        β1 = 0.33    537   499 (17.2)     0.33 (0.14)   532   499 (27.8)     0.42 (0.23)
Slope        β1 = 0.50    588   522 (24.5)     0.44 (0.24)   583   519 (31.7)     0.46 (0.26)

903 observations where the posterior distribution reported a drop at the 481st and 478th

observations, respectively. This result implies that although the obtained posterior

estimates underestimated the change point, they still performed significantly better

than the risk-adjusted control charts.

We also induced the same shift sizes in the slope parameter introduced in Section 9.4.

As discussed in Section 9.5, a drop in the slope, β1, leads to an increase of risk since

the distribution of baseline probabilities is highly concentrated on values smaller than

0.5. Figure 9.5-a1, b1 shows that although both charts detected the increase caused by

a drop of size β1 = 0.33 with a short delay, they were outperformed by the posterior

distributions for time, which concentrated on the exact value (see Figure 9.5-a2, b2) and

estimated observation 499 as the change point (Table 9.1). For a smaller shift

of size β1 = 0.5, the posterior estimates still outperformed the charts; however, they

overestimated the time of the change by around 20 observations.

Bayesian estimates of the magnitude of the change tend to be relatively accurate fol-

lowing signals of the control charts, see Figure 9.5-a3, b3 and Table 9.1. The slight

bias observed in the figures must be considered in the context of their corresponding

standard deviations.

Comparison of estimates obtained across change sizes reveals that although a shorter

run of observations from the out-of-control state of the process is used when a larger

shift size occurs, less dispersed posteriors are obtained. This behavior is also seen

when the distributions obtained for δ = 0.33 and β1 = 0.33 in Figures 9.4 and 9.5 are

compared, knowing that this drop in the slope leads to a more significant change in

risks and less delay in the charts' signals. However, care should be taken in comparing

results between the two change scenarios, odds ratio and slope, as they were defined

on different scales (see Section 9.4), and the same size of change leads to shifts that are

different in magnitude and even direction (see Section 9.5).





estimated time and the magnitude of changes in odds ratio and slope for RACUSUM

and RAEWMA control charts. As expected, the CIs are affected by the dispersion and

higher order behaviour of the posterior distributions. Under the same probability of

0.5 for the RACUSUM, the CI for the time of the change of size δ = 0.33 in odds ratio

covers 37 samples around the 500th observation whereas it increases to 57 observations

for δ = 0.5 due to the larger standard deviation, see Table 9.1.

Comparison of the 50% and 80% CIs for the estimated time of a change of size β1 = 0.33

in slope obtained for the RAEWMA chart reveals that the posterior distribution of the

time is left-skewed: the increase in the probability extends the left boundary of

the interval from 483 to 473, with no shift in the right boundary. This

investigation can be extended to other shift sizes and control chart scenarios for the

time estimates. As shown in Table 9.1 and discussed above, the magnitudes of the

changes are also estimated reasonably well, and Table 9.2 shows that in most cases the

real sizes of changes are contained in the respective posterior 50% and 80% CIs.
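Intervals of this kind can be read directly off the MCMC draws. The sketch below computes an equal-tailed interval from sorted samples; whether the reported CIs are equal-tailed or highest-density intervals is not stated here, so the equal-tailed choice and the function name are assumptions made for illustration.

```python
import math


def equal_tailed_interval(samples, prob):
    """Interval leaving (1 - prob) / 2 of the samples in each tail.

    Index positions are rounded outwards, so the interval is slightly
    conservative for small sample sizes.
    """
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - prob) / 2.0
    lower_index = int(math.floor(tail * (n - 1)))
    upper_index = int(math.ceil((1.0 - tail) * (n - 1)))
    return ordered[lower_index], ordered[upper_index]
```

For a left-skewed posterior, raising `prob` from 0.5 to 0.8 moves the lower bound much further than the upper bound, which is the pattern described above for the slope change.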



change point in the last {25, 50, 100, 200, 300} observations prior to signalling in the

control charts. For a step change of size δ = 0.5 in odds ratio, since the RAEWMA

signals very late (see Table 9.1), it is unlikely that the change point occurred in the

last 300 observations. In contrast, in the RACUSUM, where it signals earlier, the


Table 9.2 Credible intervals for step change point model parameters (τ, δ and β1) following signals (RL) from RACUSUM ((h+, h−) = (5.85, 5.33)) and RAEWMA charts (λ = 0.01 and L = 2.83) where E(p0) = 0.082 and τ = 500.

                                     RACUSUM                     RAEWMA
Change type  Change size  Parameter  50%          80%            50%          80%
Odds ratio   δ = 0.33     τ          (486,523)    (418,618)      (483,530)    (323,551)
Odds ratio   δ = 0.33     δ          (0.07,0.32)  (0.04,0.49)    (0.08,0.35)  (0.05,0.55)
Odds ratio   δ = 0.50     τ          (479,536)    (437,562)      (449,517)    (418,598)
Odds ratio   δ = 0.50     δ          (0.19,0.41)  (0.13,0.56)    (0.43,0.55)  (0.38,0.62)
Slope        β1 = 0.33    τ          (493,501)    (478,501)      (483,501)    (473,501)
Slope        β1 = 0.33    β1         (0.24,0.42)  (0.16,0.50)    (0.31,0.52)  (0.21,0.62)
Slope        β1 = 0.50    τ          (526,534)    (511,541)      (525,534)    (504,537)
Slope        β1 = 0.50    β1         (0.35,0.51)  (0.29,0.60)    (0.36,0.54)  (0.29,0.62)

Table 9.3 Probability of the occurrence of the change point in the last {25, 50, 100, 200, 300} observations prior to signalling for RACUSUM ((h+, h−) = (5.85, 5.33)) and RAEWMA charts (λ = 0.01 and L = 2.83) where E(p0) = 0.082 and τ = 500.

                          RACUSUM                         RAEWMA
Change type  Change size  25    50    100   200   300     25    50    100   200   300
Odds ratio   δ = 0.33     0.00  0.01  0.08  0.71  0.85    0.00  0.02  0.08  0.68  0.83
Odds ratio   δ = 0.50     0.00  0.01  0.03  0.47  0.91    0.00  0.00  0.00  0.00  0.00
Slope        β1 = 0.33    0.00  0.58  0.98  0.99  0.99    0.00  0.53  0.95  0.98  0.99
Slope        β1 = 0.50    0.00  0.05  0.94  0.99  0.99    0.00  0.23  0.94  0.99  0.99

probability of occurrence in the last 200 observations is 0.47, which increases to 0.91 when

the next 100 observations are included. In the case of a change in slope of β1 = 0.5, most

(0.89) of the probability density is located between the last 50 and 100 observations

for the RACUSUM, whereas with probability 0.71 it is between the last 50 and 100

observations for the RAEWMA chart. The associated probabilities drop for β1 = 0.33

in favor of the probability of occurrence of the change point in the 50 observations

immediately prior to the signals. These kinds of probability computations and inferences can be

extended to other change scenarios.
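Probabilities such as those in Table 9.3 are simple proportions of posterior draws. The following sketch shows the computation under the assumption that "the last k observations prior to signalling" means the window (RL − k, RL]; the function name and the toy draws in the usage note are hypothetical.

```python
def prob_change_in_last(tau_samples, signal_time, window):
    """Posterior probability that tau lies in the last `window` observations
    before the chart signalled at `signal_time`."""
    hits = sum(1 for tau in tau_samples
               if signal_time - window < tau <= signal_time)
    return hits / len(tau_samples)
```

The probability mass between two window sizes, for example between the last 50 and the last 100 observations, is the difference of the two cumulative values, which is how figures such as 0.94 − 0.05 = 0.89 arise from the table.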

The above studies were based on a single sample drawn from the underlying distribution.

To investigate the behavior of the Bayesian estimator over different sample

datasets, for different change sizes of δ, we replicated the simulation method explained

in Section 9.5 100 times. Simulated datasets that were obvious outliers were excluded.

This replication yields distributions of estimates with standard errors on the order

of 10. The number of replications is a compromise between excessive

computational time, considering the MCMC iterations, and sufficiency of the achievable


Table 9.4 Average of posterior estimates (mode, sd.) of step change point model parameters (τ and δ) for a change in odds ratio following signals (RL) from RACUSUM ((h+, h−) = (5.85, 5.33)) and RAEWMA charts (λ = 0.01 and L = 2.83) where E(p0) = 0.082 and τ = 500. Standard deviations are shown in parentheses.

      RACUSUM                                                       RAEWMA
δ     E(RL)            E(τ)             E(στ)          E(δ)         E(RL)            E(τ)             E(στ)          E(δ)
0.2   656.4 (54.1)     478.7 (55.6)     90.7 (44.6)    0.24 (0.12)  668.0 (75.8)     479.5 (53.4)     87.7 (44.1)    0.23 (0.12)
0.33  698.7 (86.7)     505.1 (91.9)     110.1 (50.0)   0.31 (0.15)  737.5 (137.4)    505.4 (94.8)     107.7 (50.8)   0.30 (0.14)
0.5   835.8 (193.8)    497.7 (105.5)    152.0 (68.0)   0.42 (0.14)  1018.2 (414.6)   512.9 (111.0)    152.8 (75.1)   0.41 (0.19)
0.66  1316.2 (591.0)   716.5 (294.4)    291.7 (202.0)  0.52 (0.16)  1716.8 (925.3)   762.0 (339.2)    330.6 (221.9)  0.58 (0.36)
0.8   2050.4 (1190.2)  1244.5 (708.2)   537.5 (442.3)  0.68 (0.20)  3314.0 (2432.9)  1311.8 (852.8)   661.3 (506.2)  0.87 (1.59)
1.25  2609.3 (1730.7)  1326.9 (1013.2)  569.9 (501.1)  1.50 (0.30)  1313.9 (949.5)   974.6 (755.9)    301.0 (283.5)  1.56 (0.28)
1.5   1214.9 (494.2)   751.0 (268.2)    245.4 (142.4)  1.74 (0.59)  903.9 (266.7)    669.0 (166.1)    200.4 (115.5)  1.70 (0.59)
2.0   701.9 (134.9)    505.1 (79.9)     99.9 (52.8)    2.58 (1.41)  628.6 (80.6)     494.7 (82.6)     104.3 (48.6)   2.34 (1.17)
3.0   584.8 (40.8)     493.5 (57.9)     66.6 (40.8)    3.22 (1.85)  553.1 (30.5)     488.8 (71.0)     81.4 (44.1)    2.80 (1.90)
5.0   545.5 (19.7)     490.7 (42.3)     37.1 (31.0)    4.86 (2.36)  533.1 (17.7)     489.3 (38.0)     49.2 (42.0)    4.67 (2.74)

distributions even for tails. Table 9.4 shows the average of the estimated parameters

obtained from the replicated datasets where there exists a change in odds ratio, δ.

As seen, although the RACUSUM and RAEWMA control charts tend to detect larger

shifts in the odds ratio with less delay, they perform better where there exists a jump.

A longer delay in detection of a decrease in odds ratio in comparison with an increase

of the same size is due to the different effect of a change of odds ratio on the Bernoulli

rate and the dependency of the mean and the variance of the Bernoulli distribution. A

shift in the rate, p, towards more extreme values, zero and one, leads to less dispersed

observations in a Bernoulli process. As the overall death rate in the in-control state

of this study was set to p0 = 0.082, consistent with the motivating real data in the

case study described earlier, for the extreme change scenarios of δ = 0.2 and δ = 5

the rate drops and jumps to 0.017 and 0.30, respectively. Therefore the charts detect

an increase of odds ratio faster than a fall of the same size since the observations are

more dispersed after the change point. In this context the contribution of the absolute

magnitude of the shift in the overall rate caused by changes in odds ratio should also

be considered.
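The asymmetry described above can be checked with a short calculation. The sketch below applies an odds ratio directly to the overall rate p0 = 0.082, which is a simplification: the reported figures presumably average over patient-level risks, so the values only approximately match 0.017 and 0.30. The function names are hypothetical.

```python
def shift_by_odds_ratio(p, delta):
    """Rate obtained when the odds p / (1 - p) are multiplied by delta."""
    odds = delta * p / (1.0 - p)
    return odds / (1.0 + odds)


def bernoulli_variance(p):
    """Variance of a single Bernoulli observation with rate p."""
    return p * (1.0 - p)


p0 = 0.082
p_drop = shift_by_odds_ratio(p0, 0.2)   # odds divided by five
p_jump = shift_by_odds_ratio(p0, 5.0)   # odds multiplied by five
```

Under this simplification the drop moves the rate by roughly 0.06 while the jump moves it by more than 0.2, and the Bernoulli variance shrinks after the drop and grows after the jump, consistent with the argument in the text.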

Comparison of performance of RACUSUM and RAEWMA charts in Table 9.4 reveals

that, although the RACUSUM detected changes in odds ratio faster for drops, it is

outperformed by the RAEWMA chart for increases in the odds ratio. The obtained

precisions behaved in the same manner.

For a very small change of size δ = 0.8 and its inverse, δ = 1.25, in the odds ratio, the

expected values of the mode, E(τ), report the 1244th and 1326th observations as the

change point in RACUSUM, respectively, whereas the chart detected the changes with

delays greater than 1550 observations, obtained for the drop. This superiority persists

for the RAEWMA chart, as well as for a small shift of size 0.66 and its inverse, 1.5, in

the odds ratio, where a delay is still associated with the estimates of the time, τ, at best

169 obtained for δ = 1.5 following the signal of the RAEWMA. Although the RACUSUM chart

was designed to detect moderate size shifts of 2.0 and 0.5 in odds ratio, it signalled with

a delay of at least 201 observations. In this scenario, the bias of the Bayesian estimator,

E(τ), did not exceed five observations. This bias slightly increased for the RAEWMA

chart, reaching 12 observations, yet significantly outperformed the chart’s signal.

At best, the RACUSUM and RAEWMA signals at the 656th and 533rd observations

for the most extreme decrease and increase in odds ratio were also outperformed by

posterior modes, E(τ), that exhibited a bias of 22 and 12 observations, respectively.

The Bayesian estimator for τ tended to underestimate the time of large shifts, jumps

and drops, in odds ratio.

Table 9.4 indicates that in both risk-adjusted control charts, the variation of the

Bayesian estimates for time tends to reduce when the magnitude of shift in the odds

ratio increases. However, for drops in odds ratio the observed variation is greater than

that obtained for detection of jumps. The mean of the standard deviation of the

posterior estimates of time, E(στ ), also decreases when shift sizes increase.

The average of the Bayesian estimates of the magnitude of the change, E(δ), shows

that the posterior modes identify change sizes reasonably well. This estimator tends to

overestimate the sizes where there exists a small change. As seen in Table 9.4, better


Table 9.5 Average of posterior estimates (mode, sd.) of step change point model parameters (τ and β1) for a change in slope following signals (RL) from RACUSUM ((h+, h−) = (5.85, 5.33)) and RAEWMA charts (λ = 0.01 and L = 2.83) where E(p0) = 0.082 and τ = 500. Standard deviations are shown in parentheses.

      RACUSUM                                                       RAEWMA
β1    E(RL)            E(τ)             E(στ)          E(β1)        E(RL)            E(τ)             E(στ)          E(β1)
0.2   531.3 (11.7)     501.7 (10.6)     18.9 (24.0)    0.17 (0.18)  522.2 (10.7)     503.6 (9.6)      35.6 (37.4)    0.20 (0.28)
0.33  541.4 (17.0)     503.2 (13.5)     31.9 (27.9)    0.30 (0.13)  529.5 (15.3)     507.3 (13.3)     50.7 (43.2)    0.33 (0.27)
0.5   574.0 (35.1)     518.2 (42.9)     55.4 (36.7)    0.46 (0.14)  550.2 (28.6)     524.4 (35.1)     75.4 (43.4)    0.55 (0.25)
0.66  664.4 (103.3)    495.1 (85.8)     93.8 (50.7)    0.61 (0.14)  612.9 (81.7)     491.6 (78.6)     103.8 (54.0)   0.62 (0.21)
0.8   1031.4 (406.5)   599.5 (214.6)    192.9 (98.5)   0.72 (0.11)  816.9 (284.7)    583.2 (178.0)    175.5 (89.9)   0.78 (0.20)
1.25  1138.1 (418.2)   667.0 (180.0)    233.1 (130.4)  1.34 (0.16)  1543.2 (837.2)   652.9 (192.5)    230.4 (126.8)  1.31 (0.17)
1.5   760.5 (94.2)     560.2 (126.8)    105.3 (43.4)   1.59 (0.29)  858.7 (135.4)    539.8 (68.4)     102.0 (47.8)   1.62 (0.37)
2.0   660.2 (45.6)     495.6 (63.5)     63.7 (36.3)    1.95 (0.49)  672.6 (53.0)     494.1 (52.5)     60.6 (35.3)    1.97 (0.40)
3.0   632.4 (31.6)     491.9 (57.3)     36.5 (23.5)    2.44 (0.60)  636.4 (33.8)     493.7 (48.0)     35.7 (21.9)    2.48 (0.60)
5.0   619.8 (30.0)     483.9 (30.6)     24.7 (16.3)    5.17 (3.34)  615.0 (34.2)     483.2 (27.2)     27.0 (17.6)    5.487 (2.74)

estimates are obtained in moderate to large shifts. Having said that, Bayesian estimates

of the magnitude of the change must be studied in conjunction with their corresponding

standard deviations. In this manner, analysis of credible intervals is effective.

Similar to the step change in odds ratio scenario, we replicated the simulation 100 times

to study the performance of the proposed change point model for different datasets

drawn from the same population for various slopes, β1, discussed in Section 9.4. Table 9.5

shows the average of the estimated parameters obtained from the replicated datasets.

As shown in Table 9.1 and discussed in Section 9.5, the effect of a change in the slope,

β1, depends on the overall Bernoulli rate. Since E(p0) = 0.082, a drop in the slope

causes an increase of the rate, see Figure 9.5, and vice versa. Therefore it is expected

that the behavior of the risk-adjusted control charts and the Bayesian estimator will be

opposite to that observed for a change in the odds ratio, δ.

Table 9.5 indicates that RACUSUM and RAEWMA control charts tend to detect drops

in the slope, β1, faster than jumps of the same size. This is due to the magnitude

of the shift and the associated dispersion in the overall Bernoulli rate after the change point,

both of which are larger for a drop in the slope than for an increase;

see Figure 9.1-2. In the extreme change scenarios of β1 = 0.2 and β1 = 5, the overall

rate, p0 = 0.082, shifts by approximately +0.281 and −0.064, reaching 0.363 and 0.018,

respectively.
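The direction of these shifts can be illustrated with a small sketch. It assumes that the slope change of Section 9.4 multiplies each patient's logit risk by β1, that is logit(p1) = β1 · logit(p0); this form is inferred from the surrounding discussion and is not quoted from the model definition, and the function names are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of the logit transform."""
    return 1.0 / (1.0 + math.exp(-x))


def shift_by_slope(p, beta1):
    """Patient-level risk after the logit of p is multiplied by beta1 (assumed form)."""
    return inv_logit(beta1 * logit(p))
```

Under this assumption a slope below one pulls every risk towards 0.5, so low-risk patients (the majority when the mean risk is 0.082) see their risk rise, while a slope above one pushes risks away from 0.5. The overall rate after the change depends on the whole distribution of baseline risks, not on p0 alone.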

The RACUSUM outperformed the RAEWMA for most increases in slope (falls

in the in-control rate), whereas the RAEWMA detected drops in slope

(increases of death rate) more quickly. This superiority is not consistent for the corresponding

Bayesian estimates of time of change in the slope, E(τ). Comparing the obtained

posterior modes reveals that the time of change was estimated more accurately where there

exists a longer series of observations coming from the out-of-control state of the process.

However, this does not hold for very small changes, 0.8 and 1.25.

Table 9.5 shows that the Bayesian estimator mostly tends to underestimate and overes-

timate the time of the change, τ , where the slope increases and decreases, respectively.

However, it outperformed both control charts, with a bias smaller by at least 19 observations,

obtained for the RAEWMA with β1 = 0.2. This bias reached 167 observations

for RACUSUM with a change of β1 = 1.25, which is still significantly less than the

obtained delay of 638 observations based on the chart’s alarm.

Similar to the results observed for a change in odds ratio in both charts, the observed

variation of Bayesian estimates and mean of the standard deviation of the posterior

estimates of time, E(στ ), over jumps in overall rate (here drop in β1) tend to be less

than those obtained for falls in rate (increase in β1).

The mean of the Bayesian estimates of the magnitude of the change, E(β1), shows that

the modes of posteriors for change sizes perform reasonably well; although they tend to

overestimate the shift sizes over small changes, these slight biases should be considered

in the context of their corresponding standard deviations and credible intervals.


Table 9.6 Performance and goodness of the change point models on different change types following signal from a RAEWMA (λ = 0.01 and L = 2.83) where E(p0) = 0.082 and τ = 500.

Change type  Change size  E(RL)  Model       E(τ)   E(µD)  E(σD)  E(DIC)
Odds ratio   δ = 0.33     727.3  Odds ratio  503.4  328.5  2.95   330.6
                                 Slope       535.8  331.0  2.72   333.6
Odds ratio   δ = 3.0      565.1  Odds ratio  486.0  295.1  2.72   296.5
                                 Slope       457.9  296.8  2.81   300.3
Slope        β1 = 0.33    530.8  Odds ratio  510.1  288.7  3.04   291.1
                                 Slope       508.2  282.7  3.08   284.8
Slope        β1 = 3.0     621.8  Odds ratio  484.3  257.8  3.43   261.3
                                 Slope       492.5  252.2  2.41   254.0

9.7 Comparative Performance and Model Selection

The change point models developed and investigated through Sections 9.4-9.6 were

based on availability of prior knowledge about the form of the change in an in-control

Bernoulli rate, namely a change in the odds ratio or the slope. However, in practice

there may be no information or experience about the underlying change type. In this

circumstance, implementation of an incorrect change point model may return misleading

results about the change point.

To study the performance of the change point models in different change scenarios,

we used the simulation procedure discussed in Section 9.5 in which both change point

models were implemented following signals from a RAEWMA. This simulation was re-

peated 100 times. Based on the MCMC simulation, the Deviance Information Criterion

(DIC) and related parameters, mean (µD) and variance (σD) of the posterior distribu-

tion of the deviance, were also recorded in each iteration. The DIC is a goodness of

fit criterion which takes into account the deviance of the model, −2 ln(p(y | θ)), and

a penalty for the model complexity, pV , which is half of the variance of the posterior

distribution of the deviance (Gelman et al., 2004).
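The criterion described above can be computed from the sampled deviances alone. The sketch below follows the definition given in the text (mean deviance plus a penalty of half the posterior variance of the deviance); the function name and the use of the sample variance are illustrative assumptions, and packaged DIC tools may use a different complexity penalty.

```python
import statistics


def dic_from_deviance(deviance_samples):
    """Return (DIC, mean deviance, penalty) from MCMC deviance draws.

    The penalty is half the sample variance of the deviance, following the
    variance-based definition of model complexity described in the text.
    """
    mean_deviance = statistics.fmean(deviance_samples)
    penalty = 0.5 * statistics.variance(deviance_samples)
    return mean_deviance + penalty, mean_deviance, penalty
```

When two change point models are fitted to the same signalled series, the one with the smaller value returned here would be preferred, which mirrors the comparison reported in Table 9.6.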

Table 9.6 indicates that the Bayesian estimate obtained through the change point model

for the odds ratio outperforms the corresponding estimator for the slope where there

is either an increase or a decrease in the odds ratio. It estimates 503.4 and 486.0 as

the time of change of size δ = 0.33 and δ = 3.0 respectively, whereas the slope model


Table 9.7 Average of detected time of a step change in odds ratio obtained by the Bayesian estimator (τb), CUSUM and EWMA built-in estimators following signals (RL) from RACUSUM ((h+, h−) = (5.85, 5.33)) and RAEWMA charts (λ = 0.01 and L = 2.83) where E(p0) = 0.082 and τ = 500. Standard deviations are shown in parentheses.

      RACUSUM                                            RAEWMA
δ     E(RL)            E(τcusum)        E(τb)            E(RL)            E(τewma)         E(τb)
0.2   656.4 (54.1)     432.5 (99.6)     478.7 (55.6)     668.0 (75.8)     472.2 (80.0)     479.5 (53.4)
0.33  698.7 (86.7)     416.0 (123.2)    505.1 (91.9)     737.5 (137.4)    471.6 (114.6)    505.4 (94.8)
0.5   835.8 (193.8)    487.4 (167.4)    497.7 (105.5)    1018.2 (414.6)   607.7 (314.6)    512.9 (111.0)
0.66  1316.2 (591.0)   848.6 (582.5)    716.5 (294.4)    1716.8 (925.3)   1319.9 (920.6)   762.0 (339.2)
0.8   2050.4 (1190.2)  1693.1 (1154.2)  1244.5 (708.2)   3314.0 (2432.9)  3030.3 (2395)    1311.8 (852.8)
1.25  2609.3 (1730.7)  2281.7 (1735.9)  1326.9 (1013.2)  1313.9 (949.5)   1121.2 (976.6)   974.6 (755.9)
1.5   1214.9 (494.2)   933.7 (494.9)    751.0 (268.2)    903.9 (266.7)    714.8 (275.4)    669.0 (166.1)
2.0   701.9 (134.9)    506.0 (82.4)     505.1 (79.9)     628.6 (80.6)     498.7 (106.6)    494.7 (82.6)
3.0   584.8 (40.8)     437.4 (81.3)     493.5 (57.9)     553.1 (30.5)     462.5 (81.4)     488.8 (71.0)
5.0   545.5 (19.7)     454.8 (73.3)     490.7 (42.3)     533.1 (17.7)     461.9 (84.9)     489.3 (38.0)

overestimates and underestimates the time with a larger bias of around 35 and 43

observations. The DIC supports that the odds ratio model with values of 330.6 and

296.5 is a preferable fit where there exists a change in the odds ratio.

In the case of a change in the slope, the Bayesian estimate from the slope model

outperforms that from the odds ratio model in detecting the change point, with a smaller bias.

The reported DIC also indicates that the slope model, with values of 284.8 and 254.0, is

the better fit.




those introduced in Section 9.2, we ran the available alternatives, built-in change point


estimators of Bernoulli EWMA and CUSUM charts, within the replications discussed

in Section 9.6.

Based on the suggestion of Page (1954), if an increase in a process rate is detected by a CUSUM

chart, an estimate of the change point is obtained through $\tau_{cusum} = \max\{i : X_i^+ = 0\}$;

similarly, for detection of a decrease, the estimated change point is $\tau_{cusum} = \max\{i : X_i^- = 0\}$

(Hawkins and Olwell, 1998). We modified the built-in estimator of EWMA

proposed by Nishina (1992) and estimated the change point using $\tau_{ewma} = \max\{i : Z_{oi} \le Z_{pi}\}$

and $\tau_{ewma} = \max\{i : Z_{oi} \ge Z_{pi}\}$ following signals of an increase and a

decrease in the Bernoulli rate, respectively.
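Both built-in estimators amount to scanning the chart history up to the signal for the last time a condition held. The sketch below shows that scan; the function names and the convention of returning 0 when the condition never held are hypothetical choices made for illustration.

```python
def cusum_builtin_change_point(cusum_path):
    """Last 1-based index at which the one-sided CUSUM statistic was zero.

    cusum_path[i - 1] is the statistic after observation i, up to the signal.
    Returns 0 if the statistic never returned to zero.
    """
    last_zero = 0
    for i, value in enumerate(cusum_path, start=1):
        if value == 0.0:
            last_zero = i
    return last_zero


def ewma_builtin_change_point(z_obs, z_pred, increase=True):
    """Last 1-based index at which the observed EWMA was on the in-control
    side of the predicted EWMA (below it for an increase signal, above it
    for a decrease signal). Returns 0 if that never happened."""
    last_index = 0
    for i, (zo, zp) in enumerate(zip(z_obs, z_pred), start=1):
        on_in_control_side = zo <= zp if increase else zo >= zp
        if on_in_control_side:
            last_index = i
    return last_index
```

Because each estimator returns a single index with no measure of uncertainty, it cannot provide the credible intervals or window probabilities that the posterior distribution of τ offers.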

Table 9.7 shows the average of the Bayesian estimates, τb, and detected change points

provided by the built-in estimators of CUSUM, τcusum, and EWMA, τewma, charts for

changes in the odds ratio, δ.

The built-in estimators of EWMA and CUSUM charts outperform associated signals

over small to moderate shifts in the odds ratio; however, they tend to significantly

underestimate the exact change point when the magnitude of shift increases. Notably,

in the case of δ = 5.0, where both charts signal quickly after the change occurs, the built-in

estimators failed significantly. The EWMA built-in estimator, τewma, outperforms

the alternative built-in estimator over jumps in the odds ratio, exactly over the same

range of changes in which the RAEWMA is superior. Conversely, the superiority of

the RACUSUM is not reproduced by its built-in estimator, τcusum, over all magnitudes of

drops in the odds ratio, since it is outperformed for large drops.

Although the Bayesian estimator, τb, tends to overestimate the time of changes of small

sizes, δ = 0.8, 0.66 and their inverse values, with delays of between 216 and 826 observations

obtained for the RACUSUM, it outperforms both built-in estimators, τcusum and τewma,

which overestimate the time by at least 348 observations, obtained for the RACUSUM. For

large shifts, whether increases or decreases, the Bayesian estimator tends to underestimate

the change point, yet remains less biased than the built-in estimators. As discussed

earlier and also seen in Table 9.7, posterior modes provide the most accurate estimation

of the time of the change for large change sizes, where the built-in estimators applied after the

charts' signals failed.


Table 9.8 Average of detected time of a step change in slope obtained by the Bayesian estimator (τb), CUSUM and EWMA built-in estimators following signals (RL) from RACUSUM ((h+, h−) = (5.85, 5.33)) and RAEWMA charts (λ = 0.01 and L = 2.83) where E(p0) = 0.082 and τ = 500. Standard deviations are shown in parentheses.

      RACUSUM                                            RAEWMA
β1    E(RL)            E(τcusum)        E(τb)            E(RL)            E(τewma)         E(τb)
0.2   531.3 (11.7)     465.4 (59.6)     501.7 (10.6)     522.2 (10.7)     476.0 (51.5)     503.6 (9.6)
0.33  541.4 (17.0)     462.3 (64.6)     503.2 (13.5)     529.5 (15.3)     466.1 (82.8)     507.3 (13.3)
0.5   574.0 (35.1)     465.9 (67.0)     518.2 (42.9)     550.2 (28.6)     469.9 (88.5)     524.4 (35.1)
0.66  664.4 (103.3)    507.4 (105.7)    495.1 (85.8)     612.9 (81.7)     506.1 (113.7)    491.6 (78.6)
0.8   1031.4 (406.5)   775.0 (381.4)    599.5 (214.6)    816.9 (284.7)    661.4 (267.6)    583.2 (178.0)
1.25  1138.1 (418.2)   656.4 (370.4)    667.0 (180.0)    1543.2 (837.2)   1065.0 (830.4)   652.9 (192.5)
1.5   760.5 (94.2)     463.5 (93.9)     560.2 (126.8)    858.7 (135.4)    507.8 (132.5)    539.8 (68.4)
2.0   660.2 (45.6)     414.0 (101.6)    495.6 (63.5)     672.6 (53.0)     462.1 (83.3)     494.1 (52.5)
3.0   632.4 (31.6)     437.7 (78.2)     491.9 (57.3)     636.4 (33.8)     462.1 (78.3)     493.7 (48.0)
5.0   619.8 (30.0)     416.0 (101.2)    483.9 (30.6)     615.0 (34.2)     444.3 (94.2)     483.2 (27.2)

The built-in estimator of EWMA outperforms the proposed Bayesian estimator for

an increase of size δ = 2.0, with a bias smaller by four observations; however, considering

the corresponding standard deviations over replications, the Bayesian estimator remains

a reasonable alternative. Comparison of variation of estimated change points also

supports the superiority of the Bayesian estimators over alternatives across various

change sizes and directions in odds ratio.

Similar to the changes in the odds ratio scenario, we studied the comparative perfor-

mances of all change point estimators over changes in the slope, β1. Table 9.8 shows

the mean of Bayesian estimates, τb, and detected change points provided by the built-in

estimators of CUSUM, τcusum, and EWMA, τewma, charts.

As expected, although the built-in estimators of the EWMA and CUSUM charts outperform

associated signals over small to moderate shifts in the slope, they failed to

provide a better estimation than the charts' signals do over large drops in the slope

(jumps in Bernoulli rates). The EWMA built-in estimator, τewma, outperforms the

alternative built-in estimator over all change scenarios, except for β1 = 1.25, even for

increases in the slope where the RACUSUM signals faster than the RAEWMA.

Table 9.8 indicates that the Bayesian estimator, τb, remains the superior change point

estimator in comparison with the alternatives across almost all change sizes and directions

in the slope, since smaller biases were obtained. This estimator is

outperformed by the EWMA built-in estimator for β1 = 1.5; however, the alternative

provides no substantial gain in accuracy when the corresponding standard deviations are

also taken into account. Comparison of variation of estimated change points also supports

the superiority of the Bayesian estimator over alternatives across various change

sizes and directions in the slope.

9.9 Conclusion

Quality improvement programs and monitoring processes for medical outcomes are now

being widely implemented in health care. These programs aim to drive stability in

outcomes through detection of shifts and investigation of potential causes. Obtaining

accurate information about the time when a change occurred in the process has recently

been considered within industrial and business contexts of quality control applications.

Indeed, knowing the change point enhances the efficiency of root causes analysis efforts by

limiting investigation to a tighter window of observations and related variables.

In this paper, using a Bayesian framework, we modeled change point detection for

a clinical process with a dichotomous outcome, death, where case mix was present.

We considered two frequently seen change scenarios, a change in odds ratio and a

change in the logit of risks, defined by a coefficient (slope), of the in-control rate. We

constructed Bayesian hierarchical models and derived posterior distributions for change

point estimates using MCMC.

The performance of the Bayesian estimators was investigated through simulation when

they were used in conjunction with well-known risk-adjusted CUSUM and EWMA

control charts monitoring mortality rate in the ICU of the pilot hospital where risk of


death was evaluated using APACHE II, a logistic prediction model.

The results showed that the Bayesian estimates significantly outperform the RACUSUM

and RAEWMA control charts in change detection over different scenarios of magnitude

and direction of changes. It was also seen that the RAEWMA chart outperformed the

RACUSUM in detection of increases in the rate of death, caused by either a jump

in the odds ratio or a drop in the slope, whereas the RACUSUM was superior

for decreases. This finding may suggest one charting procedure over the alternative

for various scenarios. However, further comprehensive investigations need to be done

in order to provide a guideline, since this has been beyond the scope of the current

study and no such research has been conducted yet. In this regard a wider range of chart

parameters, including smoothing constants of RAEWMA charts and ARL1, over various

baseline risk models should be considered. Other criteria such as interpretability, ease

of construction and calibration may also be of interest in chart selection.

While the choice and the design of the charting procedure are vital in quick detection of

changes in clinical outcomes, risk of death here, the obtained results promote the application

of Bayesian estimators in conjunction with both RACUSUM and RAEWMA

control charts since more accurate estimates of change points are achievable. This precision

enables clinicians to identify contributing factors in a timely manner and leads

to immediate and effective interventions in the system, which will save time, money and

also lives, considering that clinical procedures cannot be stopped.

We compared the Bayesian estimator with built-in estimators of EWMA and CUSUM.

The Bayesian estimator performs reasonably well and outperforms alternatives, particularly

when the precision of the estimators is taken into account.

Investigation of the performance of the Bayesian estimates over different change scenarios

reveals that each Bayesian change point model outperforms the other model where its

underlying change type has occurred in the Bernoulli process. The results also support

the idea of using DIC as a primary step in change point detection, which can direct process

experts to identify the appropriate change point model before making inferences

about the derived underlying changes in the process.





of the change point. This is a significant advantage of the proposed Bayesian approach.

Furthermore, the flexibility of Bayesian hierarchical models; the ease of extension to more

complicated change scenarios, such as a combination of changes in odds ratio and slope (a

common practice in a clinical context for calibration) and linear and nonlinear trends; relief

from analytic calculation of the likelihood function, particularly for non-tractable likelihood

functions; and the ease of coding with available packages should be considered as additional

benefits of the proposed Bayesian change point model for monitoring purposes.

The investigation conducted in this study was based on a specific in-control rate of mortality observed in the pilot hospital. Although the superiority of the proposed Bayesian estimator is expected to persist over other processes in which the in-control rate and the distribution of baseline risk may differ, the results obtained for estimators and control charts over various change scenarios motivate replication of the study using other case mix profiles. Moreover, modification of change point model elements, such as replacing priors with more informative alternatives or truncating prior distributions based on the type of signal and prior knowledge, may be of interest.



the pilot hospital). An alternative may be to retain the two-step approach but to use

a Bayesian framework in both stages. There is now a substantial body of literature on

Bayesian formulation of control charts and extensions such as monitoring processes with

varying parameters (Feltz and Shiau, 2001), over-dispersed data (Bayarri and García-Donato, 2005), and start-up and short runs (Tsiamyrtzis and Hawkins, 2005, 2008). A

further alternative is to consider a fully Bayesian, one-step approach, in which both the

monitoring of the in-control process and the retrospective or prospective identification

of changes is undertaken in the one analysis. This is the subject of further research.


Acknowledgments



They would also like to thank the referees and the editor for helpful suggestions which

improved the presentation of this paper.

Appendix

Change point model code for odds ratio

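model {
  for (i in 1 : RLcusum) {
    y[i] ~ dbern(p[i])
    # x[i] is the baseline risk; from the change point onwards its odds are multiplied by delta
    p[i] <- x[i] + step(i - change) * (-x[i] + (delta * x[i]) / (x[i] * (delta - 1) + 1))
  }
  RL <- RLcusum - 1
  delta ~ dnorm(1, 0.04)I(0,)   # second argument is a precision; truncated to delta > 0
  change ~ dunif(1, RL)         # uniform up to the observation before the signal
}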

Change point model code for slope

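model {
  for (i in 1 : RLcusum) {
    y[i] ~ dbern(p[i])
    # logitx[i] is the baseline logit risk; its slope changes from 1 to beta1 at the change point
    logit(p[i]) <- logitx[i] + step(i - change) * (beta1 - 1) * logitx[i]
  }
  RL <- RLcusum - 1
  beta1 ~ dnorm(1, 0.04)        # second argument is a precision
  change ~ dunif(1, RL)
}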


Bibliography



20(3):207–222.






















17(2):119–124.


Chapman & Hall/CRC.

Grigg, O. V. and Farewell, V. T. (2004). An overview of risk-adjusted charts. Journal



38(2):124–136.



102(477):140–152.








Biometrika, 58(3):509–523.




Papers, 46(1):47–64.
















coda. Citeseer.




















24(6):721–735.





CHAPTER 10

Bayesian Estimation of the Time of a Linear

Trend in Risk-Adjusted Control Charts

Preamble









Linear trends in the process mean have been considered as a frequent type of change which eventually leads the process to be out-of-control. Such drifts are common in an industrial context, where they are caused by tool wear. However, it is not rare to experience such a shift when monitoring a clinical measure, due to skill improvement of the surgical team, spread of infections, changes in effectiveness of medications and so on.

Following the adaptation of the Bayesian approach to change point estimation in Chapters 6 to 7, and the accuracy and precision achieved in Chapter 9 by the Bayesian estimator of a step change in the odds ratio of a Bernoulli process in the presence of patient mix, in this chapter the Bayesian change point model is extended to identify the time of a linear trend in ICU outcomes of a local hospital.

To model the process and change point, a linear trend in the odds ratio of a Bernoulli process is formulated using hierarchical models in a Bayesian framework. We used MCMC to obtain posterior distributions of the change point parameters, including the location and magnitude of changes, and also the corresponding probabilistic intervals and inferences. The performance of the Bayesian estimator was investigated through simulations, and the results showed that precise estimates can be obtained when it is used in conjunction with the risk-adjusted CUSUM and EWMA control charts for change scenarios of different magnitudes and directions. In comparison with the alternative EWMA and CUSUM estimators, reasonably accurate and precise estimates are obtained by the Bayesian estimator. These advantages are enhanced when the probability quantification, flexibility and generalizability of the Bayesian change point detection model are also considered.




components change point estimators were designed to estimate the time of a linear trend in the odds ratio of hospital outcomes in the presence of patient mix. Meanwhile, the simulation study implemented in this research contributes to an analytic application of the risk-adjusted control charts over various change scenarios.







certified that:



field of expertise;






unit; and





Assareh, H., Smith, I. and Mengersen, K. (2011). Bayesian estimation of the time of a linear trend in risk-adjusted control charts. IAENG International Journal of Computer Science, 38(4):409–417.



Signature & Date:

I. Smith Supplied data, comments on manuscript, editing






10.1 Abstract

Change point detection is recognized as an essential tool of root cause analyses within

quality control programs as it enables clinical experts to search for potential causes of

disturbance in hospital outcomes more effectively. In this paper, we consider estima-

tion of the time when a linear trend disturbance has occurred in an in-control clinical

dichotomous process in the presence of variable patient mix. To model the process and

change point, a linear trend in the odds ratio of a Bernoulli process is formulated using

hierarchical models in a Bayesian framework. We use Markov Chain Monte Carlo to

obtain posterior distributions of the change point parameters including location and

magnitude of changes and also corresponding probabilistic intervals and inferences.

The performance of the Bayesian estimator is investigated through simulations and the results show that precise estimates can be obtained when it is used in conjunction with the risk-adjusted CUSUM and EWMA control charts for change scenarios of different magnitudes and directions. In comparison with the alternative EWMA and CUSUM estimators, reasonably accurate and precise estimates are obtained by the Bayesian estimator. These advantages are enhanced when the probability quantification, flexibility and generalizability of the Bayesian change point detection model are also considered.

10.2 Introduction

Control charts monitor behavior of processes over time by taking into account their

stability and dispersion. The chart signals when a significant change has occurred. This

signal can then be investigated to identify potential causes of the change and corrective

or preventive actions can then be implemented. Following this cycle leads to variation

reduction and process stabilization (Montgomery, 2008). The achievements obtained by

industrial and business sectors through the implementation of a quality improvement

cycle including quality control charts and root causes analysis have motivated other

sectors such as healthcare to consider these tools and apply them as an essential part

of the monitoring process in order to improve the quality of healthcare delivery.

One of the earliest comprehensive research studies was undertaken by Benneyan (1998a,b), who utilized SPC methods and control charts in epidemiology and infection control and discussed a wide range of control charts in the health context. Woodall (2006) comprehensively reviewed the increasing stream of adaptations of control charts and their implementation in healthcare surveillance. He acknowledged the need for modification of the tools according to health sector characteristics such as emphasis on monitoring individuals, particularly dichotomous data, and patient mix. Risk adjustment has been considered in the development of control charts due to the impact of the human element in process outcomes. Steiner and Cook (2000) developed a risk-adjusted type of cumulative sum control chart (CUSUM) to monitor surgical outcomes, such as death, which are influenced by the state of a patient’s health, age and other factors. This approach has been extended to exponentially weighted moving average control charts (EWMA) (Cook, 2004; Grigg and Spiegelhalter, 2007). Both modified procedures have been intensively reviewed and are now well established for monitoring clinical outcomes where the observations are recorded as binary data (Grigg and Farewell, 2004; Grigg and Spiegelhalter, 2006; Cook et al., 2008).





industrial context of quality control. Precise identification of the time when a change in

a hospital outcome has occurred enables clinical experts to search for potential special

causes more effectively since a tighter range of time and observations is investigated.

Assareh et al. (2011a) discussed the benefits of change point investigation in monitoring

cardiac surgery outcomes and post-signal root causes analysis by providing precise

estimates of the time of the change in the rates of use of blood products during surgery

and adverse events in the follow-up period.

A built-in change point estimator in CUSUM charts suggested by Page (1954, 1961) and an equivalent estimator in EWMA charts proposed by Nishina (1992) are two early change point estimators which can be applied for all discrete and continuous distributions underlying the charts. However, they do not provide any statistical inferences on the obtained estimates.



the change point in a process fraction nonconformity monitored by a p-chart, assuming

that the change type is a step change. They showed how closely this new estimator

detects the change point in comparison with the usual p-chart signal. Subsequently,

Perry and Pignatiello (2005) compared the performance of the derived MLE estimator

with EWMA and CUSUM charts. These authors also constructed a confidence set

based on the estimated change point which covers the true process change point with

a given level of certainty using a likelihood function based on the method proposed

by Box and Cox (1964). It is not rare to experience other types of change in the process parameters. Bissell (1984) and Gan (1991, 1992) investigated the performance of CUSUM and EWMA control charts over linear trends in the process mean. Such drifts can be caused by tool wear, spread of infections, learning curves and skill improvement, or motivation reduction, any of which may shift the process parameter over time in an industrial or clinical context. Maximum likelihood estimators of the time when such drifts have occurred were developed for normal (Perry and Pignatiello Jr, 2006) and Poisson processes (Perry et al., 2006).




computational frameworks to change point estimation in a clinical context facilitates

modelling the process and also provides a way of making a set of inferences based on

posterior distributions for the time and the magnitude of a change (Gelman et al.,

2004). This approach has recently been considered by Assareh et al. (2011a) in change

point investigation of two clinical outcomes.



clinical outcomes as the mean of the process being monitored is highly correlated to

individual characteristics of patients. Therefore it is required that the risk model, which

explains patient mix, be taken into consideration in detection of true change points in

control charts for different change types. Assareh and Mengersen (2011) and Assareh

et al. (2011b) recently proposed Bayesian modelling for estimation of changes in the rate


of death and survival time after surgery among patients with varying pre-operation risk

of death. In this setting the process mean is no longer stable and risk models explain

in-control state of the process.

The motivation of this study arose from a program monitoring the mortality of patients admitted to an Intensive Care Unit (ICU) in a local hospital in Brisbane, Australia. The Acute Physiology and Chronic Health Evaluation II (APACHE II), an ICU scoring system (Knaus et al., 1985), is used to quantify and express patient mix in quality control charting. APACHE II predicts the probability (p) of mortality based on a logistic regression given 12 physiological measurements taken in the first 24 hours after admission to ICU, as well as chronic health status and age. In this program, detection of the true change point in control charts in the presence of linear trend disturbances is sought as part of root cause analysis efforts.


change points are estimated assuming that the underlying change is a linear trend. In

this scenario, we model the linear trend in the odds ratio of risk of a Bernoulli process.

We analyze and discuss the performance of the Bayesian change point model through

posterior estimates and probability based intervals. We review risk-adjusted control

charts in Section 10.3. The model is demonstrated and evaluated in Sections 10.4-10.6.

We then compare the Bayesian estimator with CUSUM and EWMA built-in estimators

in Section 10.7 and summarize the study and obtained results in Section 10.8.

10.3 Risk-Adjusted Control Charts

The probability of death of a patient who has undergone cardiac surgery is affected by the rate of mortality of cardiac surgery within the hospital and also by the patient’s covariates such as age, gender and co-morbidities. Risk-adjusted control charts are monitoring tools designed to detect changes in a process parameter of interest, such as probability of mortality, where the process outcomes are affected by covariates, such as patient mix. In these procedures, risk models are used to adjust control charts so that the effects of covariates for each input, a patient say, are taken into account.


that accumulates evidence of the performance of the process and signals when either a

significant deterioration or improvement is detected, where the weight of evidence has

been adjusted according to patient’s prior risk (Steiner and Cook, 2000).

For the ith patient, we observe an outcome yi where yi ∈ {0, 1}. This leads to a set of



OR1, in the Bernoulli process (Cook et al., 2008). A weight Wi, the so-called CUSUM

score, is given to each patient considering the observed outcomes yi and their prior

risks pi,

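$$W_i^{\pm} = \begin{cases} \log\left[\dfrac{1 - p_i + OR_0 \times p_i}{1 - p_i + OR_1 \times p_i}\right] & \text{if } y_i = 0 \\[2ex] \log\left[\dfrac{(1 - p_i + OR_0 \times p_i) \times OR_1}{(1 - p_i + OR_1 \times p_i) \times OR_0}\right] & \text{if } y_i = 1. \end{cases} \qquad (10.1)$$

The CUSUM statistics are $X_i^+ = \max\{0, X_{i-1}^+ + W_i^+\}$ and $X_i^- = \min\{0, X_{i-1}^- - W_i^-\}$, initialised at $X_0^+ = X_0^- = 0$. Therefore an increase in the odds ratio, OR1 > 1, is detected when a plotted $X_i^+$ exceeds a specified decision threshold h+; conversely, if $X_i^-$ falls below a specified decision threshold h−, the RACUSUM chart signals that a decrease in the odds ratio, OR1 < 1, has occurred. See Steiner and Cook (2000) for more details.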
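The RACUSUM recursion described above can be sketched in Python as follows. This is a minimal illustration, assuming the log-likelihood-ratio scores of Steiner and Cook (2000) with OR0 = 1; the function names, the default alternatives (a doubling and a halving of the odds) and the default thresholds (5.85 and −5.33, the values reported later in this chapter) are illustrative choices, not the code used in the study.

```python
import math


def racusum_score(y, p, or1, or0=1.0):
    """Log-likelihood-ratio score for one patient with outcome y and prior risk p."""
    base = (1.0 - p + or0 * p) / (1.0 - p + or1 * p)
    return math.log(base * or1 / or0) if y == 1 else math.log(base)


def racusum(ys, ps, or_up=2.0, or_down=0.5, h_up=5.85, h_down=-5.33):
    """Run a two-sided RACUSUM; return (first signal index or None, upper path, lower path)."""
    x_up, x_down = 0.0, 0.0              # both statistics start at zero
    up_path, down_path = [], []
    signal = None
    for i, (y, p) in enumerate(zip(ys, ps), start=1):
        # Upper statistic accumulates evidence of an increased odds ratio.
        x_up = max(0.0, x_up + racusum_score(y, p, or_up))
        # Lower statistic accumulates evidence of a decreased odds ratio.
        x_down = min(0.0, x_down - racusum_score(y, p, or_down))
        up_path.append(x_up)
        down_path.append(x_down)
        if signal is None and (x_up > h_up or x_down < h_down):
            signal = i                   # 1-based index of the first signal
    return signal, up_path, down_path
```

With a constant prior risk of 0.1, for instance, ten consecutive deaths are needed before the upper statistic crosses 5.85.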








$= \lambda^2 \times p_i(1-p_i) + (1-\lambda)^2 \times \sigma^2_{Z_{p_{i-1}}}$. We let $\sigma^2_{Z_{p_0}}$








The magnitude of the decision thresholds in the RACUSUM, h+ and h−, and the coefficient of the control limits in RAEWMA control charts, L, are determined so that the charts have a specified performance in terms of false alarms and detection of shifts in odds ratio; see Montgomery (2008) and Steiner and Cook (2000) for more details. The proposed initialization may also be altered to achieve better performance in the detection of changes that occur immediately after control chart initialization; see Steiner (1999) and Knoth (2005) for more details on fast initial response (FIR). It should be noted that there exists an alternative for risk-adjusted EWMA in which the focus is on estimation of the probability of death using pseudo observations and Bayesian methods (Cook et al., 2008). This formulation is not considered in this study; see Grigg and Spiegelhalter (2007) for more details.










For monitoring a process with dichotomous outcomes, survival say, where no covariates contribute to the outcomes and standard control charts are applied, the observations yi, i = 1, ..., T, are considered as samples that independently come from a Bernoulli distribution. Assume that such a process is initially in-control with a known rate of p0. At an unknown point in time, τ, the Bernoulli rate parameter changes from its in-control state of p0 to p1, p1 = p0 + δ and p1 ≠ p0. The general Bernoulli process step change model can thus be parameterized as follows:

$$\mathrm{pr}(y_i \mid p_i) = \begin{cases} p_0^{y_i}(1-p_0)^{1-y_i} & \text{if } i = 1, 2, \ldots, \tau \\ p_1^{y_i}(1-p_1)^{1-y_i} & \text{if } i = \tau+1, \ldots, T. \end{cases} \qquad (10.3)$$

However, this formulation does not hold where the in-control rate is not stable due to covariate contributions. In other words, in risk-adjusted charting procedures, we let the

process mean vary over observations and we control the variable observed rate against

the corresponding expected rate obtained through the risk models. In this setting,

a Bernoulli process is in the in-control state when observations can be statistically

expressed by the underlying risk models, taking into account their individual covariates.

The risk-adjusted control chart signals when observations tend to violate the underlying

risk model.

To express an in-control process and construct a change point model, where covariates

exist, we apply the common parameter of odds ratio, OR, which is frequently used

for design of control charts in a clinical monitoring context (Steiner and Cook, 2000).

In this setting, OR0 = 1 is identical to no change and departing from that through

OR1 = OR0 + β × t leads to a linear trend with a slope of size β over time t in the

Bernoulli process.

To model a change point in the presence of covariates, consider a Bernoulli process

yi, i = 1, ..., T , that is initially in-control, with independent observations coming from

a Bernoulli distribution with known variable rates p0i that can be explained by an

underlying risk model p0i | xi ∼ f(xi), where f(.) is a link function and x is a vector

of covariates. At an unknown point in time, τ , the Bernoulli rate parameter changes

from its in-control state of p0i to p1i obtained through


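$$OR_1 = OR_0 + \beta \times (i-\tau) = \frac{p_{1i}/(1-p_{1i})}{p_{0i}/(1-p_{0i})} \qquad (10.4)$$

and

$$p_{1i} = \frac{(OR_0 + \beta \times (i-\tau)) \times p_{0i}/(1-p_{0i})}{1 + (OR_0 + \beta \times (i-\tau)) \times p_{0i}/(1-p_{0i})}, \qquad (10.5)$$

where OR1 ≠ 1 and OR1 > 0 so that p1i ≠ p0i, i = τ, ..., T.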

The Bernoulli process linear trend change model in the presence of covariates can thus

be parameterized as follows:

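$$\mathrm{pr}(y_i \mid p_i) = \begin{cases} p_{0i}^{y_i}(1-p_{0i})^{1-y_i} & \text{if } i = 1, 2, \ldots, \tau \\ p_{1i}^{y_i}(1-p_{1i})^{1-y_i} & \text{if } i = \tau+1, \ldots, T. \end{cases} \qquad (10.6)$$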
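Equations (10.4) to (10.6) can be sketched in Python as follows. This is an illustrative sketch rather than the study's code (which used R and WinBUGS): the function names are hypothetical, OR0 = 1 is assumed by default, baseline risks are assumed to lie strictly between 0 and 1, and the change is applied for i > τ as in Equation (10.6).

```python
import math


def out_of_control_risk(p0, i, tau, beta, or0=1.0):
    """Equation (10.5): risk of patient i after a linear trend in the odds ratio."""
    odds_ratio = or0 + beta * (i - tau)          # Equation (10.4)
    odds = odds_ratio * p0 / (1.0 - p0)          # shifted odds of the outcome
    return odds / (1.0 + odds)


def log_likelihood(ys, p0s, tau, beta):
    """Equation (10.6) on the log scale: in-control up to tau, trending afterwards."""
    total = 0.0
    for i, (y, p0) in enumerate(zip(ys, p0s), start=1):
        p = p0 if i <= tau else out_of_control_risk(p0, i, tau, beta)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total
```

For example, a patient with a baseline risk of 0.082 observed 50 patients after the change with β = 0.025 has an odds ratio of 2.25 and a risk of about 0.167.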

Modeling a linear trend in terms of odds ratios benefits the change point model since

no constraint on each p1i, i = τ, ..., T , is needed. In this parametrization, any β > 0

corresponds to OR1 > 1 that induces an increase in the rate. This type of change

is analogous to linear trend models in a Bernoulli process rate without covariates.

Equivalently, a negative slope, β < 0, causes a fall; however, such a disturbance cannot last long since OR1 is restricted to be positive. Therefore, for simplicity, we limit the investigation to increasing linear trend scenarios where β > 0.

As seen in Equation (10.5), although a specific magnitude of change is induced in the odds ratio, the resulting out-of-control rates, p1i, i = τ, ..., T, are affected differently; see Section 10.5 for more details.

Relating this to Equation (10.2), pr(. | .) is the likelihood that underlies the obser-

vations; the time, τ , and the magnitude of the slope, β, in the linear trend in odds

ratio are the unknown parameters of interest; and the posterior distributions of these

parameters will be investigated in the change point analysis. Assume that the process

delivering yi is monitored by a control chart that signals at time T .

Figure 10.1 Distribution of calculated (1) logit of APACHE II scores, logit(p); and (2) probability of mortality for 4644 patients who were admitted to ICU during 2000-2009.

We assign a normal distribution truncated on the left at zero, (µ = 0, σ2 = k)I(0,∞), as the prior distribution for β, where k is study-specific. In the following, we set k = 1, giving a relatively informative prior for the magnitude of the slope change in an in-control rate, as the control chart is sensitive enough to detect very large shifts and estimate the associated change points. Other distributions such as uniform and Gamma might also be of interest for β since it is assumed to be a positive value; see Gelman et al. (2004) for more details on the selection of prior distributions. We place a uniform distribution on the range (1, T − 1) as a prior for τ, where T is set to the time of the signal of the control charts. See the Appendix for the linear trend change model code in WinBUGS.
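To make the combination of these priors with the likelihood concrete, the following Python sketch evaluates the unnormalised posterior on a grid and sums over β to obtain a marginal posterior for τ. Note that this swaps the MCMC sampling used in the study for a simple grid approximation, restricts τ to the integers 1, ..., T − 1, and uses hypothetical function names; it is a sketch under those assumptions only.

```python
import math


def log_posterior(ys, p0s, tau, beta, k=1.0):
    """Unnormalised log posterior of (tau, beta) for the linear trend model."""
    if beta <= 0.0:
        return float("-inf")                     # prior for beta is truncated at zero
    # Normal(0, k) kernel for beta; the uniform prior on tau only adds a constant.
    total = -beta * beta / (2.0 * k)
    for i, (y, p0) in enumerate(zip(ys, p0s), start=1):
        if i > tau:                              # out-of-control part of the likelihood
            odds = (1.0 + beta * (i - tau)) * p0 / (1.0 - p0)
            p = odds / (1.0 + odds)
        else:                                    # in-control part of the likelihood
            p = p0
        total += math.log(p if y == 1 else 1.0 - p)
    return total


def tau_marginal(ys, p0s, betas, k=1.0):
    """Marginal posterior of tau on 1..T-1, summing over a grid of beta values."""
    taus = range(1, len(ys))
    logs = [[log_posterior(ys, p0s, t, b, k) for b in betas] for t in taus]
    peak = max(max(row) for row in logs)         # subtract the peak for numerical stability
    weights = [sum(math.exp(v - peak) for v in row) for row in logs]
    norm = sum(weights)
    return {t: w / norm for t, w in zip(taus, weights)}
```

The grid is only practical for short series; for runs of several hundred observations a sampler such as the one described in this chapter is the more natural choice.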

10.5 Evaluation

We used Monte Carlo simulation to study the performance of the constructed model

in linear trend detection following a signal from RACUSUM and RAEWMA control

charts when a change in odds ratio is simulated to occur at τ = 500. To reflect the results that would be obtained in practice, we used a dataset of available APACHE II scores that was routinely collected over 2000-2009 in the pilot hospital to construct the baseline risks in the control charts.

Figure 10.1-1 shows the calculated logit of APACHE II scores (logit(p)) for 4644 pa-

tients who were admitted to ICU. The scores led to a distribution of logit values with

a mean of -2.53 and a variance of 1.05. The distribution of the obtained probability of

death over patients is also shown in Figure 10.1-2. This led to an overall risk of death


of 0.082 (average of obtained risks) with a variance of 0.012 among patients in the pilot

hospital.

To generate observations of a process in the in-control state, yi, i = 1, ..., τ, we first randomly generated the logits of the associated risks, logit(p0i), i = 1, ..., τ, from a normal distribution (µ = −2.53, σ2 = 1.05), transformed them to risks p0i, and then drew binary outcomes from a Bernoulli distribution with rates of p0i, i = 1, ..., τ. Plotting the obtained observations against the associated risks results in risk-adjusted control charts that are in-control. Alternatively, other distributions such as Beta and uniform distributions with proper parameters, or even random sampling from the baseline data, can be applied to generate risks directly.
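The data-generating step can be sketched in Python as follows, assuming that the normal distribution above describes the logit of the baseline risk (as in Figure 10.1-1). The function name, the seed handling and the fixed number of post-change observations are illustrative assumptions; in the study, observations were generated until a control chart signalled.

```python
import math
import random


def simulate_process(n_in_control=500, n_after=100, beta=0.025, seed=1):
    """Simulate outcomes with a linear trend in the odds ratio after n_in_control patients."""
    rng = random.Random(seed)
    ys, p0s = [], []
    for i in range(1, n_in_control + n_after + 1):
        # Baseline risk: logit drawn from N(-2.53, variance 1.05), then back-transformed.
        logit = rng.gauss(-2.53, math.sqrt(1.05))
        p0 = 1.0 / (1.0 + math.exp(-logit))
        # Odds ratio is 1 while in control and grows by beta per patient afterwards.
        odds_ratio = 1.0 + beta * max(0, i - n_in_control)
        odds = odds_ratio * p0 / (1.0 - p0)
        p = odds / (1.0 + odds)
        ys.append(1 if rng.random() < p else 0)
        p0s.append(p0)
    return ys, p0s
```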




lead to stopping the process and analyzing root causes. When no cause is found, the

process would follow without adjustment.

To form an increasing linear trend in odds ratio, we then induced trends with a slope

of sizes β = {0.0025, 0.005, 0.01, 0.025, 0.05, 0.1} and generated observations until the

control charts signalled. The effect of such drifts should be considered in two ways,

over different base-line risk and time.

These slopes led to different shift sizes in the in-control process rate, p0i, for the ith patient after the occurrence of the change. As shown in Figure 10.2, at i = 600, after 100 observations from an out-of-control process caused by linear trend disturbances of size β, patients with a more extreme risk of mortality are less affected than patients who have a probability of around 0.5. This effect remains consistent over subsequent patients, as the size of the change in odds ratio increases with time.
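The dependence of the shift on the baseline risk can be tabulated with a short Python sketch. The function names are hypothetical, and OR0 = 1 with the change point 100 observations before the evaluated patient is assumed, matching the setting of Figure 10.2.

```python
def shifted_risk(p0, odds_ratio):
    """Risk obtained by multiplying the baseline odds by odds_ratio."""
    odds = odds_ratio * p0 / (1.0 - p0)
    return odds / (1.0 + odds)


def shift_table(betas, baseline_risks, observations_since_change=100):
    """Shift p1 - p0 for each slope and baseline risk, a fixed time after the change."""
    table = {}
    for beta in betas:
        odds_ratio = 1.0 + beta * observations_since_change
        table[beta] = [shifted_risk(p0, odds_ratio) - p0 for p0 in baseline_risks]
    return table
```

For β = 0.025 the odds ratio at i = 600 is 3.5, so a baseline risk of 0.5 rises by about 0.28, whereas baseline risks of 0.01 and 0.99 rise by only about 0.02 and 0.01, respectively.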

The effect of a linear trend with a positive slope of size β = 0.025 in odds ratio over time, that is over subsequent patients, is demonstrated in Figure 10.3. The resultant distributions are more dispersed, shifted to the right and concentrated on higher values of risk in comparison with the observed risks in Figure 10.1-2. As seen in Figure 10.3-1 for the 550th patient, when the odds ratio increases and reaches OR1 = 2.25, the overall risk increases to 0.15 with a variance of 0.021. The risk almost doubles after the next 150 patients, reaching an overall risk of 0.28 with a variance of 0.033; see Figure 10.3-4.
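As a quick check of the odds ratios behind the first and last panels of Figure 10.3, assuming OR0 = 1 and τ = 500 as in the simulation, Equation (10.4) gives the following values for the 550th and 700th patients:

```latex
\begin{align*}
OR_1(i = 550) &= 1 + 0.025 \times (550 - 500) = 2.25, \\
OR_1(i = 700) &= 1 + 0.025 \times (700 - 500) = 6.
\end{align*}
```

The overall risks of 0.15 and 0.28 quoted in the text are averages over the whole patient population and therefore cannot be read off from these odds ratios alone.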

We constructed risk-adjusted control charts using the procedures discussed in Section 10.3. We designed the RACUSUM to detect a doubling and a halving of the odds ratio in the in-control rate, p0 = 0.082, and to have an in-control average run length ($\widehat{ARL}_0$) of approximately 3000 observations. We used Monte Carlo simulation to determine the decision intervals, h±; other approaches may also be of interest, see Steiner and Cook (2000). This setting led to decision intervals of h+ = 5.85 and h− = 5.33. As two-sided charts were considered, the negative value of h− was used. The associated CUSUM scores were obtained through Equation (10.1) for yi equal to 0 and 1, respectively.

We set the smoothing constant of RAEWMA to λ = 0.01 as the in-control rate was low

and detection of small changes was desired; see Somerville et al. (2002), Cook (2004)

and Grigg and Spiegelhalter (2007) for more details. The value of L was calibrated so that the same in-control average run length ($\widehat{ARL}_0$) as the RACUSUM was obtained.

The resultant chart had L = 2.83. A negative lower control limit in the RAEWMA

was replaced by zero.

Figure 10.2 Effect of linear trend disturbances with a slope of β, occurring at i = 500 in the odds ratio of an in-control Bernoulli process, for the 600th patient with a baseline risk of p0.


Figure 10.3 Distribution of observable probability of mortality after (1) 50, (2) 100, (3) 150 and (4) 200 observations since the occurrence of a linear trend disturbance with a slope of size β = 0.025 in odds ratio for 4644 patients who were admitted to ICU during 2000-2009.

The linear trend disturbances and control charts were simulated in R (http://www.r-project.org). To obtain posterior distributions of the time and the magnitude of the changes, we used the R2WinBUGS interface (Sturtz et al., 2005) to generate 100,000 samples through MCMC iterations in WinBUGS (Spiegelhalter et al., 2003) for all change point scenarios, with the first 20,000 samples discarded as burn-in. We then analyzed the results using the CODA package in R (Plummer et al., 2010). See the Appendix for the linear trend change model code in WinBUGS.



adjusted control charts, we induced a linear trend with a slope of size β = 0.025 at time τ = 500 in an in-control binary process with an overall death rate of p0 = 0.082. RACUSUM and RAEWMA detected an increase in the odds ratio and signalled at the 595th and 565th observations, respectively, corresponding to delays of 95 and 65 observations, as shown in Figure 10.4-a1, b1. The posterior distributions of the time and magnitude of the change were then obtained using the MCMC procedure discussed in Section 10.5.

Figure 10.4 Risk-adjusted (a1) CUSUM ((h+, h−) = (5.85, 5.33)) and (b1) EWMA (λ = 0.01 and L = 2.83) control charts and obtained posterior distributions of (a2, b2) time τ and (a3, b3) magnitude β of an induced linear trend with a slope of size β = 0.025 in odds ratio where E(p0) = 0.082 and τ = 500.


For both control charts, the distribution of the time of the change, τ, concentrates on values close to the 500th observation, as seen in Figure 10.4-a2, b2. The posteriors for the magnitude of the change, β, also approximately identified the exact change size, as they concentrate highly on values of less than 0.05, as shown in Figure 10.4-a3, b3. As expected, there exist slight differences between the distributions obtained following RACUSUM and RAEWMA signals, since non-identical series of binary values were used for the two procedures.

Table 10.1 summarizes the obtained posteriors. If the posterior was asymmetric and skewed, the mode of the posterior was used as an estimator for the change point model parameters (τ and β). As shown, the Bayesian estimator of the time outperforms the charts’ signals, particularly for the RACUSUM, with a delay of three observations. However, the magnitude of the slope of the linear trend tends to be overestimated by the Bayesian estimator, obtaining 0.051 and 0.041 for RACUSUM and RAEWMA charts, respectively. Having said that, these estimates must be studied in conjunction with their corresponding standard deviations.





Table 10.1 Posterior estimates (mode, sd.) of linear trend change point model parameters (τ and β) following signals (RL) from RACUSUM ((h+, h−) = (5.85, 5.33)) and RAEWMA charts (λ = 0.01 and L = 2.83) where E(p0) = 0.082 and τ = 500. Standard deviations are shown in parentheses.

          RACUSUM                      RAEWMA
β         RL     τ        β            RL     τ        β
0.025     595    503.0    0.051        565    513.9    0.042
                 (34.9)   (0.14)              (22.9)   (0.13)

Table 10.2 Credible intervals for linear trend change point model parameters (τ and β) following signals (RL) from RACUSUM ((h+, h−) = (5.85, 5.33)) and RAEWMA charts (λ = 0.01 and L = 2.83) where E(p0) = 0.082 and τ = 500.

                     RACUSUM                          RAEWMA
β       Parameter    50%             80%              50%             80%
0.025   τ            (496,521)       (476,531)        (511,526)       (497,532)
        β            (0.028,0.081)   (0.020,0.141)    (0.021,0.079)   (0.018,0.129)


Table 10.3 Probability of the occurrence of the change point in the last 25, 50 and 100 observations prior to signalling for RACUSUM ((h+, h−) = (5.85, 5.33)) and RAEWMA charts (λ = 0.01 and L = 2.83) where E(p0) = 0.082 and τ = 500.

          RACUSUM                RAEWMA
β         25     50     100      25     50     100
0.025     0.02   0.04   0.70     0.04   0.57   0.98

estimated time and the magnitude of slope of the linear trend disturbance in odds ratio

for RACUSUM and RAEWMA control charts. As expected, the CIs are affected by

the dispersion and higher order behaviour of the posterior distributions. Under the same probability of 0.5, the CI for the time of the change of size β = 0.025 in odds ratio covers 25 observations around the 500th observation for the RACUSUM, whereas it narrows to 15 observations for the RAEWMA, which has the smaller standard deviation; see Table 10.1.

Comparison of the 50% and 80% CIs for the estimated time for the RACUSUM chart reveals that the posterior distribution of the time tends to be left-skewed: the increase in the probability moves the left boundary of the interval by 20 observations, from 496 to 476, in comparison with a shift of 10 observations in the right boundary. This result can also be seen for the RAEWMA chart. As shown in Table 10.1 and discussed above, the magnitudes of the changes are overestimated; however, Table 10.2 indicates that the real sizes of slope are approximately contained in the respective posterior 50% and 80% CIs. Construction of probabilistic intervals can be extended to other sizes of slope and directions of linear trends in odds ratio.



change point in the last {25, 50, 100} observations prior to signalling in the control

charts. For a linear trend with a slope of size β = 0.025 in odds ratio, since the

RACUSUM signals late (see Table 10.1), it is unlikely that the change point occurred

in the last 25 or 50 observations. In contrast, for the RAEWMA, which signals earlier, the probability of occurrence in the last 50 observations is 0.57, and increases to 0.98 when a further 50 observations are included. These kinds of probability computations and inferences can be extended to other change scenarios.
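Summaries of the kind reported in Tables 10.1 to 10.3 can be computed directly from posterior draws of τ. The Python sketch below is illustrative only: the function name is hypothetical, the mode is taken over draws rounded to whole observations, the interval is assumed to be equal-tailed (the text does not state which type of credible interval was used), and "the last k observations" is read as τ > RL − k.

```python
import statistics
from collections import Counter


def summarize_change_point(tau_samples, signal_time, level=0.5, windows=(25, 50, 100)):
    """Mode, sd, equal-tailed interval and window probabilities from posterior draws of tau."""
    draws = sorted(tau_samples)
    n = len(draws)
    # Posterior mode over draws rounded to the nearest observation index.
    mode = Counter(round(t) for t in draws).most_common(1)[0][0]
    # Equal-tailed interval read off the sorted draws.
    lower = draws[int((0.5 - level / 2.0) * (n - 1))]
    upper = draws[int((0.5 + level / 2.0) * (n - 1))]
    # Share of draws that fall within the last k observations before the signal.
    in_window = {k: sum(1 for t in draws if t > signal_time - k) / n for k in windows}
    return {
        "mode": mode,
        "sd": statistics.pstdev(draws),
        "interval": (lower, upper),
        "prob_in_last": in_window,
    }
```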


Table 10.4 Average of posterior estimates (mode, sd.) of linear trend change point model parameters (τ and β) for a drift in odds ratio following signals (RL) from RACUSUM ((h+, h−) = (5.85, 5.33)) and RAEWMA charts (λ = 0.01 and L = 2.83) where E(p0) = 0.082 and τ = 500. Standard deviations are shown in parentheses.

          RACUSUM                                   RAEWMA
β         E(RL)     E(τ)      E(στ)     E(β)        E(RL)     E(τ)      E(στ)     E(β)
0.0025    920.8     740.1     151.7     0.006       861.8     763.0     114.9     0.008
          (101.5)   (94.8)    (77.6)    (0.004)     (95.8)    (94.5)    (31.5)    (0.003)
0.005     787.7     633.6     125.1     0.010       723.1     657.4     76.9      0.013
          (88.3)    (78.5)    (64.7)    (0.041)     (78.2)    (78.0)    (31.7)    (0.032)
0.01      689.0     579.7     76.4      0.022       655.5     591.5     59.7      0.028
          (33.5)    (41.8)    (36.6)    (0.050)     (36.4)    (31.2)    (31.1)    (0.044)
0.025     610.3     524.4     52.8      0.041       590.6     528.4     49.2      0.039
          (26.6)    (42.9)    (29.1)    (0.045)     (30.9)    (35.4)    (23.2)    (0.048)
0.05      583.3     514.7     37.8      0.081       569.3     513.7     42.9      0.078
          (17.5)    (20.4)    (17.3)    (0.027)     (16.6)    (21.1)    (19.9)    (1.034)
0.1       562.7     504.3     28.7      0.129       552.4     503.3     34.6      0.130
          (11.8)    (17.5)    (14.6)    (0.033)     (11.8)    (17.2)    (19.3)    (0.031)



datasets, for different slope sizes of β, we replicated the simulation method explained

in Section 10.5 a total of 100 times. Simulated datasets that were obvious outliers were excluded.

Table 10.4 shows the average of the estimated parameters obtained from the replicated

datasets where there exists a linear trend in odds ratio.

Comparison of the performance of the RACUSUM and RAEWMA charts in Table 10.4 reveals

that the RAEWMA detected increasing linear trend disturbances in odds ratio faster.

This superiority drops from 59 observations for β = 0.0025 to 10 observations when

the slope size reaches β = 0.1. For a very small slope of size β = 0.0025, the average

of the mode, E(τ), reports the 740th observation as the change point in RACUSUM,

whereas the chart detected the change with a delay of 420 observations. This superiority

persists for the RAEWMA chart; however, a delay of 263 observations is still associated

with the estimate of the time, τ, for β = 0.0025 following the RAEWMA signal.

Table 10.4 shows that, although the RACUSUM signals later than the alternative,

RAEWMA, particularly over small to medium slope sizes, the average of posterior estimates

for the time, E(τ), outperforms the estimates obtained for RAEWMA charts. For the

β = 0.0025 scenario, the delay is 23 observations shorter. This difference drops when

the slope size increases. Over medium to large sizes of slope, β = {0.025, 0.05, 0.1}, the

bias of the Bayesian estimator, E(τ), did not exceed 24 observations for the RACUSUM.

This bias slightly increased for the RAEWMA chart, reaching 28 observations, yet

significantly outperformed the chart’s signal. At best, the RACUSUM and RAEWMA

signals at the 562nd and 552nd observations for the most extreme jump in the slope of

the linear trend in odds ratio were also outperformed by posterior modes, E(τ), that

exhibited a bias of four and three observations, respectively.
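The delays and biases quoted in this comparison follow from simple differences between the entries of Table 10.4 and the true change point τ = 500. The short sketch below reproduces that arithmetic; the dictionary layout and the helper name are illustrative choices, while the numbers are copied from the table.

```python
TRUE_TAU = 500

# (E(RL), E(tau)) for each slope size, copied from Table 10.4.
racusum = {0.0025: (920.8, 740.1), 0.005: (787.7, 633.6), 0.01: (689.0, 579.7),
           0.025: (610.3, 524.4), 0.05: (583.3, 514.7), 0.1: (562.7, 504.3)}
raewma = {0.0025: (861.8, 763.0), 0.005: (723.1, 657.4), 0.01: (655.5, 591.5),
          0.025: (590.6, 528.4), 0.05: (569.3, 513.7), 0.1: (552.4, 503.3)}


def delays(table):
    """Return (signal delay, estimator bias) relative to the true change point."""
    return {beta: (round(rl - TRUE_TAU, 1), round(tau - TRUE_TAU, 1))
            for beta, (rl, tau) in table.items()}


for beta in racusum:
    # Positive gap: the RAEWMA signalled earlier than the RACUSUM.
    signal_gap = round(racusum[beta][0] - raewma[beta][0], 1)
    print(beta, delays(racusum)[beta], delays(raewma)[beta], signal_gap)
```

Laying the differences out this way makes it easy to see that the posterior modes sit much closer to τ = 500 than the run lengths do for every slope size.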

Table 10.4 indicates that in both risk-adjusted control charts, the variation of the

Bayesian estimates for time tends to reduce when the magnitude of slope increases.

The mean of the standard deviation of the posterior estimates of time, E(στ ), also

decreases when the slope size increases. The average of the Bayesian estimates of the

magnitude of the change, E(β), shows that the posterior modes tend to overestimate

slope sizes. As seen in Table 10.4, better estimates are obtained in moderate to large

slopes. Having said that, Bayesian estimates of the magnitude of the change must be

studied in conjunction with their corresponding standard deviations. In this manner,

analysis of credible intervals is effective.


ods


those introduced in Section 10.2, we ran the available alternative, built-in estimators of

Bernoulli EWMA and CUSUM charts, within the replications discussed in Section 10.6.

Following the suggestion of Page (1954), if an increase in a process rate is detected by CUSUM

charts, an estimate of the change point is obtained through τcusum = max{i : X_i^+ = 0}.

We modified the built-in estimator of EWMA proposed by Nishina (1992) and estimated

the change point using τewma = max{i : Zoi ≤ Zpi} following signals of an increase in

the Bernoulli rate.
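The CUSUM built-in estimator described above can be sketched in a few lines: it is the last observation at which the upper CUSUM statistic was still at zero before the signal. The sketch below works on a generic list of CUSUM scores; the function names and the toy scores are assumptions for illustration, and the risk-adjusted scores used in the thesis are not reproduced here.

```python
def upper_cusum(scores):
    """Upper CUSUM path X_i^+ = max(0, X_{i-1}^+ + W_i), started at 0."""
    path, current = [], 0.0
    for w in scores:
        current = max(0.0, current + w)
        path.append(current)
    return path


def cusum_change_point(scores, h):
    """Return max{i : X_i^+ = 0} (1-based) before the first signal.

    Returns None when the chart never exceeds h, and 0 when the statistic
    never returned to zero before the signal."""
    path = upper_cusum(scores)
    signal = next((i for i, x in enumerate(path) if x > h), None)
    if signal is None:
        return None
    zeros = [i for i, x in enumerate(path[: signal + 1]) if x == 0.0]
    return zeros[-1] + 1 if zeros else 0


# Toy scores: the statistic resets at the third observation, then climbs.
toy_scores = [-1.0, 0.5, -1.0, 1.0, 1.0, 1.0]
print(cusum_change_point(toy_scores, h=2.5))
```

In this toy run the path is 0, 0.5, 0, 1, 2, 3, so the estimate is the third observation, the last time the statistic touched zero.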


Table 10.5 Average of detected time of a linear trend change in odds ratio obtained by the Bayesian estimator (τb), CUSUM and EWMA built-in estimators following signals (RL) from RACUSUM ((h+, h−) = (5.85, 5.33)) and RAEWMA charts (λ = 0.01 and L = 2.83) where E(p0) = 0.082 and τ = 500. Standard deviations are shown in parentheses.

          RACUSUM                            RAEWMA
β         E(RL)     E(τcusum)   E(τb)        E(RL)     E(τewma)   E(τb)

0.0025    920.8     727.7       740.1        861.8     739.4      763.0
          (101.5)   (131.9)     (94.8)       (95.8)    (128.5)    (94.5)

0.005     787.7     605.8       633.6        723.1     622.3      657.4
          (88.3)    (110.2)     (78.5)       (78.2)    (103.9)    (78.0)

0.01      689.0     559.7       579.7        655.5     573.2      591.5
          (33.5)    (56.2)      (41.8)       (36.4)    (55.5)     (31.2)

0.025     610.3     513.1       524.4        590.6     514.0      528.4
          (26.6)    (62.6)      (42.9)       (30.9)    (67.6)     (35.4)

0.05      583.3     495.2       514.7        569.3     506.1      513.7
          (17.5)    (56.5)      (20.4)       (16.6)    (61.4)     (21.1)

0.1       562.7     483.4       504.3        552.4     497.8      503.3
          (11.8)    (45.7)      (17.5)       (11.8)    (63.2)     (17.2)

provided by the built-in estimators of CUSUM, τcusum, and EWMA, τewma, charts for

drifts in the odds ratio, OR. The built-in estimators of EWMA and CUSUM charts

outperform associated signals over all drifts in the odds ratio; however, they tend to

underestimate the exact change point when the magnitude of slope is large, β = 0.1.

The CUSUM built-in estimator, τcusum, outperforms the alternative built-in estimator

over small to moderate slopes, exactly over the same range of changes in which the

Bayesian estimates obtained for RACUSUM are superior.

The Bayesian estimator, τb, is outperformed by both built-in estimators, τcusum and

τewma, which have shorter delays; the difference is at most 35 observations, obtained for RAEWMA for

β = 0.005. Having said that, considering corresponding standard deviations over replications,

the Bayesian estimator remains a reasonable alternative. The superiority of the

built-in estimators drops when slope size increases since they tend to underestimate the

time of the change, whereas the average of posterior modes estimates more accurately.

Comparison of variation of estimated change points also supports the superiority of the

Bayesian estimators over the alternatives for linear trends with a small slope.


10.8 Conclusion

Quality improvement programs and monitoring processes for medical outcomes are now

being widely implemented in the health context to achieve stability in outcomes through

detection of shifts and investigation of potential causes. Obtaining accurate information

about the time when a change occurred in the process has recently been considered

within industrial and business contexts of quality control applications. Indeed, knowing

the change point enhances the efficiency of root causes analysis efforts by restricting the

search to a tighter window of observations and related variables.

In this paper, using a Bayesian framework, we modeled change point detection for a

clinical process with dichotomous outcomes, death and survival, where patient mix was

present. We considered an increasing drift in odds ratio, caused by a linear trend with a

positive slope, of the in-control rate. We constructed Bayesian hierarchical models and

derived posterior distributions for change point estimates using MCMC. The performance

of the Bayesian estimators was investigated through simulation when they were

used in conjunction with well-known risk-adjusted CUSUM and EWMA control charts

monitoring mortality rate in the ICU of the pilot hospital where risk of death was evalu-

ated by APACHE II, a logistic prediction model. The results showed that the Bayesian

estimates significantly outperform the RACUSUM and RAEWMA control charts in

change detection over different scenarios of magnitude of slopes in drifts. We then

compared the Bayesian estimator with built-in estimators of EWMA and CUSUM. Al-

though the Bayesian estimator was outperformed by the built-in estimators, it remains

a viable alternative when the precision of the estimators is taken into account.




of the change point. This is a significant advantage of the proposed Bayesian ap-

proach. Furthermore, flexibility of Bayesian hierarchical models, ease of extension to

more complicated change scenarios such as decreasing linear trends, nonlinear trends,

relief of analytic calculation of likelihood function, particularly for non-tractable like-

lihood functions and ease of coding with available packages should be considered as

additional benefits of the proposed Bayesian change point model for monitoring pur-

poses.


mortality observed in the pilot hospital. Although it is expected that superiority of the

proposed Bayesian estimator persists over other processes in which the in-control rate

and the distribution of baseline risk may differ, the results obtained for estimators and

control charts over various change scenarios motivate replication of the study using

other patient mix profiles. Moreover, modification of change point model elements such

as replacing priors with more informative alternatives, or truncation of prior distribu-

tions based on type of signals and prior knowledge, may be of interest.



the pilot hospital). An alternative may be to retain the two-step approach but to use

a Bayesian framework in both stages. There is now a substantial body of literature on







Bibliography

Assareh, H. and Mengersen, K. (2011). Detection of the time of a step change in

monitoring survival time. Lecture Notes in Engineering and Computer Science: Pro-

ceedings of The World Congress on Engineering 2011, 2190:314–319.

Assareh, H., Smith, I., and Mengersen, K. (2011a). Bayesian change point detec-


20(3):207–222.

Assareh, H., Smith, I., and Mengersen, K. (2011b). Identifying the time of a linear

trend disturbance in odds ratio of clinical outcomes. Lecture Notes in Engineering

and Computer Science: Proceedings of The World Congress on Engineering 2011,

2190:365–370.










Bissell, A. (1984). The performance of control charts and CUSUMs under linear trend.

Applied Statistics, 33(2):145–151.











17(2):119–124.

Gan, F. F. (1991). EWMA control chart under linear drift. Journal of Statistical

Computation and Simulation, 38(1-4):181–200.

Gan, F. F. (1992). CUSUM control charts under linear drift. The Statistician, 41(1):71–

84.


Chapman & Hall/CRC.




38(2):124–136.



102(477):140–152.




Papers, 46(1):47–64.












Perry, M. and Pignatiello Jr, J. (2006). Estimating the change point of a normal process

mean with a linear trend disturbance in SPC. Quality Technology and Quantitative

Management, 3(3):325–334.


coda. Citeseer.




















24(6):721–735.



CHAPTER 11

Bayesian Estimation of the Time of a Decrease

in Risk-Adjusted Survival Time Control Charts

Preamble









Monitoring patient survival time, instead of the binary outcome of a process (death), has

recently been considered in the control charting context of clinical outcomes. To this end,

risk-adjusted survival time CUSUM and EWMA control charts have been developed

and employed. Similar to standard risk-adjusted charts, the mean survival time for

each patient undergoing a clinical procedure, is predicted using a survival prediction

model and the observed outcome then is adjusted and plotted on the charts considering

expected survival time.

Following the Bayesian approach to estimation of step changes and linear trends in the odds

ratio of binary outcomes in risk-adjusted control charts in Chapters 9 and 10, and the

accuracy and precision achieved by the developed Bayesian estimator in the presence

of patient mix, in this chapter the Bayesian change point model was extended to

identify the time of a drop in the mean survival time of patients who underwent cardiac

surgery. The data were right censored since the monitoring was conducted over a

limited follow-up period, and the effect of risk factors prior to the surgery was captured

using a Weibull accelerated failure time regression model.

Posterior distributions of the change point parameters including location and magnitude

of changes and also corresponding probabilistic intervals and inferences were obtained

using MCMC. The performance of the Bayesian estimator was investigated through

simulations, and the results showed that precise estimates can be obtained when they

are used in conjunction with the risk-adjusted survival time CUSUM control charts for

different magnitudes of decrease. This advantage of the proposed Bayesian estimator

was enhanced when probability quantification, flexibility and generalizability of the

Bayesian change point detection model are also considered.




components change point estimators were designed to estimate time of a decrease in

mean survival time of hospital outcomes in presence of patient mix. Meanwhile the

simulation study implemented in this research, contributes to an analytic application

of the risk-adjusted survival time control charts over various change scenarios.











Assareh, H. and Mengersen, K. (2011) Bayesian estimation of the time of a decrease

in risk-adjusted survival time control charts, IAENG International Journal of Applied

Mathematics, 41 (4):360–366.









11.1 Abstract

Change point detection has been recognized as an essential part of root cause analyses

within quality control programs since it enables clinical experts to search for potential

causes of disturbance in hospital outcomes more effectively. In this paper, we consider

estimation of the time when a drop has occurred in the mean survival time observed over

patients who have undergone an in-control cardiac surgery with death and survival outcomes in

the presence of variable patient mix. The data are right censored since the monitoring

is conducted over a limited follow-up period. The effect of risk factors prior to the

surgery is captured using a Weibull accelerated failure time regression model.

We apply Bayesian hierarchical models to formulate the change point. Markov Chain

Monte Carlo is used to obtain posterior distributions of the change point parameters

including location and magnitude of drops and also corresponding probabilistic intervals

and inferences. The performance of the Bayesian estimator is investigated through

simulations and the results show that precise estimates can be obtained when they are

used in conjunction with the risk-adjusted survival time CUSUM control charts for

different magnitudes of change. This advantage is enhanced when probability quantification,

flexibility and generalizability of the Bayesian change point detection model are

also considered.

11.2 Introduction


stability and dispersion of the process. The chart signals when a significant change has

occurred. This signal can then be investigated to identify potential causes of the change

and corrective or preventive actions can then be conducted. Following this cycle leads

to variation reduction and process stabilization (Montgomery, 2008).


impact of the human element in process outcomes. Steiner and Cook (2000) developed

a risk-adjusted type of cumulative sum control chart (CUSUM) to monitor surgical

outcomes, death, which are influenced by the state of a patient’s health, age and other


factors. This approach has been extended to exponentially weighted moving average control charts

(EWMA) (Cook, 2004; Grigg and Spiegelhalter, 2007). Both modified procedures have

been intensively reviewed and are now well established for monitoring clinical outcomes

where the observations are recorded as binary data (Grigg and Farewell, 2004; Grigg

and Spiegelhalter, 2006; Cook et al., 2008).





developed a risk-adjusted CUSUM based on a Cox model for failure time outcomes.

Sego et al. (2009) used an accelerated failure time regression model to capture the

heterogeneity among patients prior to the surgery and developed a risk-adjusted sur-

vival time CUSUM (RAST CUSUM) scheme. Steiner and Jones (2010) extended this

approach by proposing a EWMA procedure based on the same survival time model

discussed by Sego et al. (2009).

The need to know the time at which a process began to vary, the so-called change point,

has recently been raised and discussed in the industrial context of quality control. Accurate

detection of the time of change makes the search for a potential cause more

efficient, as a tighter time-frame prior to the signal in the control charts is investigated.

Assareh et al. (2011a) discussed the benefits of change point investigation in

monitoring cardiac surgery outcomes and post-signal root causes analysis by providing

precise estimates of the time of the change in the rates of use of blood products during

surgery and adverse events in the follow-up period.

Samuel and Pignatiello (2001) developed and applied a maximum likelihood estimator

(MLE) for the change point in a process fraction nonconformity monitored by a p-

chart, assuming that the change type is a step change. They showed how closely this



MLE estimator with EWMA and CUSUM charts. These authors also constructed a

confidence set based on the estimated change point which covers the true process change


point with a given level of certainty using a likelihood function based on the method

proposed by Box and Cox (1964).

This approach was extended to other probability distributions and change type scenar-

ios. In the case of a very low fraction non-conforming, Noorossana et al. (2009) derived

and analyzed the MLE estimator of a step change based on the geometric distribution

control charts discussed by Xie et al. (2002).

All MLE estimators described above were developed assuming that the underlying dis-

tribution is stable over time. This assumption cannot often be satisfied in monitoring

clinical outcomes as the mean of the process being monitored is highly linked to indi-

vidual characteristics of patients. Therefore it is required that the survival time model,


points in time-to-event control charts.





where heterogeneity exists as well as inferences based on posterior distributions for the

time and the magnitude of a change (Gelman et al., 2004).

In recent studies, Bayesian change point estimators have been developed for monitoring

clinical outcomes where the mean of the process is highly linked to individual characteristics

of patients. In monitoring outcomes of a surgery, Assareh et al. (2011b)

captured the pre-operation risk of death using a logistic regression model. Assareh and

Mengersen (2011) also proposed this approach for monitoring survival time.


change points are estimated assuming that the underlying change is a sudden drop in

survival time which can be interpreted as an increase in odds of mortality following a

surgical process. In this scenario, we model the step change in the mean survival time

of a clinical process. We analyze and discuss the performance of the Bayesian change

point model through posterior estimates and probability based intervals. Risk-adjusted

survival time CUSUM charts are reviewed in Section 11.3. The change point model is

demonstrated and evaluated in Sections 11.4-11.6. We then summarize the study and

obtained results in Section 11.7.

11.3 Risk-Adjusted Survival Time Control Charts

Risk-adjusted control charts for time-to-event are monitoring procedures designed to

detect changes in a process parameter of interest, such as survival time, where the

process outcomes are affected by covariates, such as risk factors. In these procedures,

regression models for time are used to adjust control charts in a way that the effects of

covariates for each input, patient say, would be eliminated.









\[
\delta_i =
\begin{cases}
1 & \text{if } x_i \le c \\
0 & \text{if } x_i > c.
\end{cases}
\tag{11.1}
\]







manner.



ui), is equivalent to the baseline survival function S0(xi exp(βTui)), where β is a vector








phase I. In this phase, an available dataset of patients records is used assuming that

the process is in-control for that period of time.







\[
W_i(t_i, \delta_i \mid u_i) = \left(1 - \rho^{-\alpha}\right)\left(\frac{t_i \exp(\beta^{T} u_i)}{\lambda_0}\right)^{\alpha} - \delta_i \alpha \log \rho,
\tag{11.2}
\]

where it is designed to detect a decrease from λ0 to λ1 = ρλ0. The upper CUSUM statistic

is obtained through Zi = max{0, Zi−1 + Wi} and then plotted over i. Often the CUSUM

statistic is initialized at 0.
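The score in Equation (11.2) and the recursion for Zi translate directly into code. The following is a minimal sketch, assuming a single scalar covariate (such as a Parsonnet score) so that β^T u reduces to a product; the function names are illustrative, and the parameter values in the example are those reported later in Section 11.5.

```python
import math


def rast_cusum_score(t, delta, u, alpha, lam0, beta, rho):
    """Score W_i of Equation (11.2) for one patient with a scalar covariate u."""
    z = (t * math.exp(beta * u) / lam0) ** alpha
    return (1.0 - rho ** (-alpha)) * z - delta * alpha * math.log(rho)


def run_rast_cusum(scores, h):
    """Accumulate Z_i = max(0, Z_{i-1} + W_i) from Z_0 = 0.

    Returns the plotted path and the 1-based index of the first Z_i > h
    (None if the chart does not signal)."""
    z, path = 0.0, []
    for i, w in enumerate(scores, start=1):
        z = max(0.0, z + w)
        path.append(z)
        if z > h:
            return path, i
    return path, None


# A death shortly after surgery pushes the chart up; a patient censored
# alive at 30 days pulls it down slightly.
death = rast_cusum_score(1.0, 1, 0.0, 0.4909, 42133.6, 0.1307, 0.255)
survivor = rast_cusum_score(30.0, 0, 0.0, 0.4909, 42133.6, 0.1307, 0.255)
print(death, survivor)
```

With ρ < 1 the first term of the score is always negative and the second is positive only for observed deaths, which is why a run of deaths drives Zi towards the threshold h.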

Therefore a reduction in the MST is detected when a plotted Zi exceeds a specified de-

cision threshold h. Although this interpretation of chart’s signal is in contrast with the

common expression used for standard risk-adjusted control charts for binary outcomes,

it seems reasonable to take into account that any drop in the MST can be characterized

as an increase in the odds of mortality. However, under the Weibull model, the

magnitude of the increase in odds equivalent to a specific drop in the MST is not

available in closed form; see Sego et al. (2009) for more details.

The magnitude of the decision thresholds in RAST CUSUM, h, is determined in a


way that the charts have a specified performance in terms of false alarm and detection

of shifts in the MST. In this regard, Markov chain and simulation approaches can be

applied; see Sego (2006) for more details.









the data are observed, which also is in the form of a probability distribution.

As discussed in Section 11.2, in RAST CUSUM procedures, we let the survival func-

tion vary over patients and we control the observed survival time, which may be right

censored, against the corresponding predicted survival function obtained through the

survival time model. In this setting, a process is in the in-control state when observa-

tions can be statistically expressed by the underlying survival time model, taking into

account their individual covariates. The RAST CUSUM signals when observations tend

to violate the underlying model.

To model a change point in the presence of covariates, consider a process that results

in a survival time of ti, i = 1, ..., T , that is initially in-control. The observations can

be explained by a survival function S(ti, ui), where the underlying distribution, (f(.)),

is a Weibull distribution with parameters (α0, λ0), and ui is a vector of covariates. At

an unknown point in time, τ , the Weibull scale parameter changes from its in-control

state of λ0 to λ1, λ1 = k× λ0, 0 < k < 1. The right censored survival time step change

model can thus be parameterized using a survival function as follows:


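\[
S(t_i, u_i) =
\begin{cases}
\exp\left[-\left(\dfrac{t_i \exp(\beta_0^{T} u_i)}{\lambda_0}\right)^{\alpha_0}\right] & \text{if } i = 1, 2, \ldots, \tau \\[2ex]
\exp\left[-\left(\dfrac{t_i \exp(\beta_0^{T} u_i)}{\lambda_1}\right)^{\alpha_0}\right] & \text{if } i = \tau + 1, \ldots, T
\end{cases}
\tag{11.4}
\]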

where β0 is the vector of covariate coefficients.

If desired, an overall estimation of change size in odds of mortality equivalent to a

specific shift in the MST or λ can be obtained through simulation and averaging over

different values of covariate, ui.

Relating this to Equation (11.3), the likelihood that underlies the observations is obtained

through f(.)^δ S(.)^(1−δ); see Sego et al. (2009). The time and the magnitude of a

drop in the MST are the unknown parameters of interest; and the posterior distribu-

tions of these parameters will be investigated in the change point analysis.
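To make the likelihood concrete, the sketch below evaluates the right-censored Weibull AFT log-likelihood, f^δ S^(1−δ) on the log scale, under the step change model of Equation (11.4). It assumes a scalar covariate, and all function names are illustrative. The last helper computes posterior weights for τ over a grid for a fixed k under a uniform prior; this is a simplification for illustration and not the MCMC scheme used in the study.

```python
import math


def weibull_aft_loglik(obs, alpha, lam, beta):
    """Sum of log f (observed deaths) and log S (censored) over (t, delta, u) triples."""
    total = 0.0
    for t, delta, u in obs:
        z = (t * math.exp(beta * u) / lam) ** alpha   # equals -log S(t | u)
        if delta == 1:
            # f = S * (alpha / t) * z for this Weibull AFT parameterization
            total += math.log(alpha) - math.log(t) + math.log(z) - z
        else:
            total += -z
    return total


def step_change_loglik(obs, tau, k, alpha0, lam0, beta0):
    """Log-likelihood when the scale drops from lam0 to k * lam0 after observation tau."""
    return (weibull_aft_loglik(obs[:tau], alpha0, lam0, beta0)
            + weibull_aft_loglik(obs[tau:], alpha0, k * lam0, beta0))


def tau_weights_given_k(obs, k, alpha0, lam0, beta0):
    """Normalized posterior weights of tau = 1..T-1 for fixed k and a uniform prior on tau."""
    taus = list(range(1, len(obs)))
    logliks = [step_change_loglik(obs, tau, k, alpha0, lam0, beta0) for tau in taus]
    top = max(logliks)
    weights = [math.exp(v - top) for v in logliks]
    norm = sum(weights)
    return taus, [w / norm for w in weights]


# Hypothetical records (time, death indicator, risk score), for illustration only.
records = [(30.0, 0, 3.0), (5.0, 1, 10.0), (30.0, 0, 0.0), (2.0, 1, 20.0), (30.0, 0, 5.0)]
taus, weights = tau_weights_given_k(records, 0.25, 0.4909, 42133.6, 0.1307)
print(list(zip(taus, weights)))
```

Subtracting the largest log-likelihood before exponentiating keeps the weights numerically stable when the series is long.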

Assume that the process ti is monitored by a control chart that signals at time T .

We assign a truncated normal distribution, N(µ, σ)I(.), to k as the prior distribution, where

all parameters are study-specific. For a decrease in k which is detected by the

upper RAST CUSUM, exceeding the upper threshold h, we set N(µ = 0.255, σ =

0.6)I(0.01, 0.99). This setting leads to a relatively informed prior for the magnitude

of the fall.

The mean of the prior was set to correspond to the shift that the chart was calibrated to

detect, see Section 11.5. The prior was chosen to be sensitive in detection of small to fairly large

falls in k. Note that other distributions such as uniform and Gamma might also be

of interest for k since it is always a positive value; see Gelman et al. (2004) for more

details on selection of prior distributions.

We place a uniform distribution on the range of (1, T − 1) as prior for τ where T is set

to the time of the signal of control charts. See the Appendix for the step change model

code in WinBUGS.
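As an illustration of the two priors, the sketch below draws from a normal distribution truncated to (0.01, 0.99) by simple rejection and from a uniform prior on the change point. It is not the WinBUGS code referred to above: the helper names are hypothetical, and treating τ as an integer between 1 and T − 1 is an assumption made here for simplicity.

```python
import random


def sample_truncated_normal(mu, sigma, low, high, rng):
    """Rejection sampler: redraw until the normal draw falls inside (low, high)."""
    while True:
        x = rng.gauss(mu, sigma)
        if low < x < high:
            return x


def sample_prior(signal_time, rng):
    """One joint prior draw (tau, k) for a chart that signalled at `signal_time`."""
    k = sample_truncated_normal(0.255, 0.6, 0.01, 0.99, rng)
    tau = rng.randint(1, signal_time - 1)   # both end points are included
    return tau, k


rng = random.Random(0)
prior_draws = [sample_prior(651, rng) for _ in range(1000)]
print(prior_draws[:3])
```

With σ = 0.6 on an interval of width below one, the truncated prior is fairly flat, so the draws of k spread over most of (0.01, 0.99).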

11.5 Evaluation


step change detection following a signal from a RAST CUSUM control chart when a

change in mean survival time is simulated to occur at τ = 500. To reflect the results

that would be obtained in practice, we considered the same cardiac surgery dataset

that was used by Steiner and Cook (2000) and then Sego et al. (2009) to construct risk-

adjusted control charts for Bernoulli and time-to-event variables, respectively. It was

reported that this dataset contains information on 6449 operations that were performed

between 1992 and 1998 at a single surgical center in the U.K. The Parsonnet score (Parsonnet

et al., 1989) was recorded to quantify the patient’s risk prior to the cardiac surgery.

A follow-up period of 30 days after the surgery was set as the censoring time. A

Weibull AFT model with parameters of α0 = 0.4909, λ0 = 42133.6 and β0 = 0.1307

was reported by Sego et al. (2009) when the first two years of the data were used as

training data to fit the model and construct the in-control state of the process and

RAST CUSUM. They also found that the recorded Parsonnet scores of the training

data can be well approximated by an exponential distribution with a mean of 8.9.

To generate right censored survival time observations of a process in the in-control

state ti, i = 1, ..., τ , we first randomly generated the Parsonnet score, ui, i = 1, ..., τ ,

from an exponential distribution with a mean of 8.9 and then drew an associated

survival time, xi, i = 1, ..., τ , from the Weibull AFT model with α0 = 0.4909, λ0 =

42133.6, and β0 = 0.1307. Finally, ti and δi were obtained considering a censoring

time of c = 30 through Equation 11.1. Plotting the obtained observations when the

associated covariates are considered results in a RAST CUSUM chart that is in-control.

Note that other distributions such as uniform distributions with proper parameters or

even sampling randomly from the baseline Parsonnet scores can be applied to generate

covariates directly.
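The data generation steps described above can be sketched as follows. Survival times are drawn by inverting the Weibull AFT survival function, x = λ exp(−β0 u) (−log V)^(1/α0) with V uniform on (0, 1], and are then censored at c = 30 days. The function names are illustrative, and the post-change series has a fixed length here, whereas the study generated observations until the chart signalled.

```python
import math
import random

ALPHA0, LAMBDA0, BETA0 = 0.4909, 42133.6, 0.1307   # fitted values reported by Sego et al. (2009)
MEAN_PARSONNET = 8.9
CENSOR_TIME = 30.0


def simulate_patient(lam, rng):
    """Return (t, delta, u): censored time, death indicator and Parsonnet score."""
    u = rng.expovariate(1.0 / MEAN_PARSONNET)
    v = 1.0 - rng.random()                     # uniform on (0, 1], avoids log(0)
    x = lam * math.exp(-BETA0 * u) * (-math.log(v)) ** (1.0 / ALPHA0)
    t = min(x, CENSOR_TIME)
    delta = 1 if x <= CENSOR_TIME else 0
    return t, delta, u


def simulate_process(n_in_control, n_after, k, rng):
    """In-control observations followed by observations with scale k * LAMBDA0."""
    before = [simulate_patient(LAMBDA0, rng) for _ in range(n_in_control)]
    after = [simulate_patient(k * LAMBDA0, rng) for _ in range(n_after)]
    return before + after


rng = random.Random(42)
data = simulate_process(500, 200, 0.25, rng)
deaths_before = sum(delta for _, delta, _ in data[:500])
deaths_after = sum(delta for _, delta, _ in data[500:])
print(deaths_before / 500, deaths_after / 200)
```

Comparing the two death fractions gives a quick check that the induced drop in λ0 has the intended effect on 30-day mortality.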

To generate the drops in λ0, or MST, we then induced changes of sizes k = {0.05,

0.066, 0.1, 0.143, 0.20, 0.25, 0.33, 0.50, 0.66, 0.75} and generated observations until

the control charts signalled. These changes led to different change sizes in in-control

estimated survival probability over days for a patient with ui as well as survival curves


between patients with different Parsonnet scores.




To construct a RAST CUSUM, we applied the procedures discussed in Section 11.3.

We calibrated the RAST CUSUM to detect a decrease in the MST that corresponds to

a doubling of the odds ratio within the follow-up period and with an estimated in-control average

run length (ARL0) of approximately 10000 observations. As mentioned in Section 11.3,

for the Weibull AFT model the corresponding odds ratio formula, discussed by Sego

et al. (2009), is not reduced to a closed form of λ0 and ρ± since the covariate term is

not simplified in

\[
OR = \frac{O_{i1}}{O_{i0}}, \quad \text{and} \quad O_i = \frac{1 - S(c \mid u_i)}{S(c \mid u_i)}
\tag{11.5}
\]

where S(c | ui) is the probability of survival at the end of follow-up period, c.

Therefore we used Monte Carlo simulation to estimate the corresponding ρ. To do so,

we set ρ such that, over 100,000 replications of generating Parsonnet scores from the

fitted exponential distribution with a mean of 8.9 and calculating the odds ratio in

Equation 11.5, the desired odds ratio of OR = 2 was obtained. A decrease of

ρ = 0.255 in the MST was found to correspond to the desired jump in odds ratio.
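A calibration of this kind can be sketched with a simple bisection over ρ. The text does not state how the odds ratios were summarized across the simulated patients, so the sketch below uses the median odds ratio, which is an assumption made here; the median is chosen because the odds ratio grows very quickly for high-risk patients and an average would be dominated by a few extreme draws. All names are illustrative.

```python
import math
import random
import statistics

ALPHA0, LAMBDA0, BETA0 = 0.4909, 42133.6, 0.1307
CENSOR_TIME = 30.0


def log_odds_of_death(u, lam):
    """log of O = (1 - S(c|u)) / S(c|u) = exp(x) - 1, computed stably."""
    x = (CENSOR_TIME * math.exp(BETA0 * u) / lam) ** ALPHA0
    return x + math.log(-math.expm1(-x))


def median_odds_ratio(rho, scores):
    """Median over patients of the odds ratio in Equation (11.5) for scale rho * LAMBDA0."""
    log_ors = [log_odds_of_death(u, rho * LAMBDA0) - log_odds_of_death(u, LAMBDA0)
               for u in scores]
    return math.exp(statistics.median(log_ors))


def calibrate_rho(target_or, scores, lo=0.05, hi=0.99, iters=30):
    """Bisection: the odds ratio decreases as rho increases towards 1."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if median_odds_ratio(mid, scores) > target_or:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


rng = random.Random(0)
parsonnet = [rng.expovariate(1.0 / 8.9) for _ in range(10_000)]
rho_hat = calibrate_rho(2.0, parsonnet)
print(rho_hat, median_odds_ratio(0.255, parsonnet))
```

Under this summary the calibrated value lands close to the ρ = 0.255 reported above, although a different summary of the simulated odds ratios would give a somewhat different answer.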

We also used Monte Carlo simulation to determine decision intervals, h. However other

approaches may also be considered; see Steiner and Cook (2000) and Sego et al. (2009).

This setting led to decision interval of h = 4.88. The associated CUSUM scores were

also obtained through Equation (11.2) considering the generated ti, δi and ui.





change point scenarios with the first 20000 samples ignored as burn-in. We then analyzed

the results using the CODA package in R (Plummer et al., 2010). See the

Appendix for the step change model code in WinBUGS.

Figure 11.1 (1) Risk-adjusted survival time CUSUM chart (h = 4.88) and obtained posterior distributions of (2) time τ and (3) magnitude k of a decrease of size k = 0.25 in λ (mean survival time) where λ0 = 42133.6 and τ = 500.



Table 11.1 Posterior estimates (mode, sd.) of step change point model parameters (τ and k) following signals (RL) from RAST CUSUM (h = 4.88) where λ0 = 42133.6 and τ = 500.

k (true)   RL     τ (estimate)   στ      k (estimate)   σk
0.25       651    499.8          96.0    0.226          0.18
0.33       722    494.8          160.6   0.27           0.19

adjusted control charts, we induced a drop of size k = 0.25 at time τ = 500 in an

in-control process with an overall survival time of λ0 = 42133.6. RAST CUSUM detected

the drop and signalled at the 651st observation, corresponding to a delay of

151 observations as shown in Figure 11.1-1. The posterior distributions of time and

magnitude of the change were then obtained using MCMC discussed in Section 11.5.

The distribution of the time of the change, τ , concentrates on the 500th observation,

approximately, as seen in Figure 11.1-2. The posterior for the magnitude of the change,

k, also reasonably identified the exact change size as it highly concentrates on values

of around 0.25 shown in Figure 11.1-3.

This investigation was replicated using a smaller shift of size k = 0.33 in λ. Table

11.1 summarizes the posterior estimates for both change sizes. If the posterior was

asymmetric and skewed, the mode of the posterior was used as an estimator for the

change point model parameters (τ and k).
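A posterior mode can be approximated from MCMC draws in more than one way. The sketch below uses the most frequent value for the integer-valued τ and the midpoint of the fullest histogram bin for the continuous k; both choices, the bin count, and the function names are assumptions for illustration, not the procedure used to produce Table 11.1.

```python
from collections import Counter


def discrete_mode(samples):
    """Most frequent value among integer-valued draws (for example tau)."""
    return Counter(samples).most_common(1)[0][0]


def histogram_mode(samples, low, high, bins=50):
    """Midpoint of the fullest equal-width bin on [low, high] (for example k).

    Assumes every draw lies inside [low, high]."""
    width = (high - low) / bins
    counts = [0] * bins
    for x in samples:
        idx = min(int((x - low) / width), bins - 1)
        counts[idx] += 1
    best = max(range(bins), key=counts.__getitem__)
    return low + (best + 0.5) * width


# Hypothetical draws, for illustration only.
tau_draws = [498, 500, 500, 501, 500, 499]
k_draws = [0.21, 0.22, 0.23, 0.24, 0.60]
print(discrete_mode(tau_draws), histogram_mode(k_draws, 0.0, 1.0, bins=10))
```

The histogram mode depends on the bin width, so in practice it is worth checking that the estimate is stable across a few reasonable choices of `bins`.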

The RAST CUSUM signalled after 222 observations when the mean survival time be-

came a third whereas the posterior distribution reported a drop at the 491st observation.

This result implies that although the obtained posterior estimates underestimated the

change point, they still performed significantly better than the RAST CUSUM charts.


lowing signals of the control chart, see Figure 11.1-3 and Table 11.1. The slight bias,

here underestimation, observed in the figures must be considered in the context of their

corresponding standard deviations.



shift size occurred, less dispersed posteriors are obtained, particularly for posteriors of

time.



Table 11.2 Credible intervals for step change point model parameters (τ and k) following signals (RL) from RAST CUSUM (h = 4.88) where λ0 = 42133.6 and τ = 500.

        CI 50%                        CI 80%
k       τ            k                τ            k
0.25    (488, 551)   (0.14, 0.33)     (453, 581)   (0.09, 0.48)
0.33    (487, 648)   (0.15, 0.40)     (359, 709)   (0.09, 0.57)

Table 11.3 Probability of the occurrence of the change point in the last {25, 50, 100, 200, 300, 400, 500} observations prior to signalling for RAST CUSUM (h = 4.88) where λ0 = 42133.6 and τ = 500.

k       25     50     100    200    300    400    500
0.25    0.03   0.07   0.20   0.89   0.94   0.96   0.97
0.33    0.01   0.05   0.20   0.59   0.77   0.82   0.90




estimated time and the magnitude of changes in λ0 for the RAST CUSUM chart. As


posterior distributions. Under the same probability of 0.5, the CI for the time of the

change of size k = 0.25 covers 63 observations around the 500th observation whereas

it increases to 161 observations for k = 0.33 due to the larger standard

deviation, see Table 11.1. This investigation can be extended to other shift sizes for

the time estimates. As shown in Table 11.1 and discussed above, the magnitudes of the

changes are also estimated reasonably well and Table 11.2 shows that in all cases the

real sizes of changes are contained in the respective posterior 50% and 80% CIs.
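Intervals such as those in Table 11.2 can be computed from sorted posterior draws. The sketch below returns equal-tailed intervals; the chapter does not state which type of interval was reported, so the equal-tailed form, the function name, and the stand-in draws are assumptions for illustration.

```python
def credible_interval(samples, level):
    """Equal-tailed interval holding `level` of the draws (nearest-rank quantiles)."""
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(round(tail * (n - 1)))]
    upper = ordered[int(round((1.0 - tail) * (n - 1)))]
    return lower, upper


# Stand-in draws 0, 1, ..., 100 make the quantiles easy to check by hand.
draws = list(range(101))
ci50 = credible_interval(draws, 0.5)
ci80 = credible_interval(draws, 0.8)
print(ci50, ci80, ci50[1] - ci50[0])
```

The width of the 50% interval (upper minus lower bound) is the quantity compared between k = 0.25 and k = 0.33 in the text.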



change point in the last {25, 50, 100, 200, 300, 400, 500} observations prior to signalling

in the control charts. For a step change of size k = 0.33 in the mean survival time,

since the RAST CUSUM signals late (see Table 11.1), it is unlikely that the change

point occurred in the last 100 observations. A considerable growth in the probability

is seen when the next 200 observations are included, reaching 0.77, whereas for a

larger drop of size k = 0.25, it is more certain that the change point has occurred in

the last 200 observations with a probability of 0.89.


Table 11.4 Average of posterior estimates (mode, sd.) of step change point model parameters (τ and k) for a change in the mean survival time following signals (RL) from RAST CUSUM (h = 4.88) where λ0 = 42133.6 and τ = 500. Standard deviations are shown in parentheses.

                          Change point                        Change size
k       E(RL)             E(τ)              E(στ)             E(k)            E(σk)
0.05    542.4 (16.2)      486.0 (57.3)      91.2 (34.7)       0.077 (0.086)   0.173 (0.022)
0.066   554.8 (26.6)      490.5 (62.5)      92.9 (36.7)       0.083 (0.075)   0.177 (0.025)
0.10    568.3 (39.7)      485.7 (70.9)      99.4 (33.9)       0.127 (0.094)   0.183 (0.017)
0.143   594.2 (49.2)      487.3 (72.5)      110.9 (34.5)      0.154 (0.090)   0.185 (0.016)
0.20    624.7 (71.3)      503.7 (87.1)      119.5 (36.6)      0.182 (0.103)   0.183 (0.018)
0.25    692.3 (150.4)     527.3 (146.2)     132.9 (53.4)      0.211 (0.111)   0.183 (0.018)
0.33    779.6 (187.7)     554.3 (162.3)     153.9 (58.9)      0.25 (0.118)    0.176 (0.023)
0.50    1139.0 (605.0)    661.8 (287.7)     258.9 (173.0)     0.43 (0.16)     0.178 (0.028)
0.66    2469.4 (2169.8)   1270.3 (783.2)    562.1 (456.6)     0.51 (0.22)     0.183 (0.047)
0.75    2773.4 (2195.4)   1748.0 (1304.4)   697.9 (720.8)     0.53 (0.25)     0.195 (0.047)



datasets, for different reduction in λ0, we replicated the simulation method explained

in Section 11.5 100 times.

Table 11.4 shows the average of the estimated parameters obtained from the replicated

datasets where there exists a drop in λ0 of size k.

As seen, the RAST CUSUM control chart tends to detect larger shifts in the MST

with shorter delays. For a large drop, a k of size 0.143 and less, the chart signals with

a delay of at most 95 observations. This delay increases over moderate reductions in

λ0, reaching 279 observations for k = 0.33. However, the chart fails to detect small

drops promptly, since signals with a long delay of more than 639 observations were obtained

when the MST halved, k = 0.50.

For large drops in the MST, a k of size 0.143 and less, the average values of the modes,


E(τ), tend to underestimate the time of the change, reporting at best the 490th

observation for k = 0.066. However, the Bayesian estimator still outperforms the chart

signal, with less bias over large reductions. This superiority persists for moderate shifts

in the MST, where less bias is still associated with the Bayesian estimates of the time,

τ, with a best bias of three observations obtained for k = 0.20. Although the RAST CUSUM chart

was designed to detect a moderate drop of 0.255 in the MST, it is outperformed by the

posterior mode, which locates the change point with a delay of 27 observations.

Table 11.4 shows that the bias of the Bayesian estimator, E(τ), did not exceed 55

observations over moderate reductions. This bias increased when the MST halved,

reaching 162 observations, yet the estimator still significantly outperformed the chart’s signal. For

smaller reductions, k = (0.66, 0.75), the posterior modes significantly overestimate the

change point since the RAST CUSUM signals very late. The variation of the Bayesian

estimates for time tends to reduce when the magnitude of shift in the MST increases.

The mean of the standard deviation of the posterior estimates of time, E(στ ), also

decreases when shift sizes increase.

Table 11.4 indicates that the average of the Bayesian estimator of the magnitude of

the change, E(k), identifies change sizes with some bias. This estimator tends to

overestimate and underestimate the sizes where there exist large drops and moderate to

small drops, respectively. Having said that, Bayesian estimates of the magnitude of the

change must be studied in conjunction with their corresponding standard deviations.

In this manner, analysis of credible intervals is effective.

11.7 Conclusion

Quality improvement programs and monitoring of medical process outcomes are now




within industrial and business quality control applications. Indeed, knowing the change

point enhances efficiency of root cause analysis efforts by restricting the search to a


tighter window of observations and related variables.

In this paper, using a Bayesian framework, we modeled change point detection in

time-to-event data for a clinical process with dichotomous outcomes, death and sur-

vival, where patient mix was present. We considered a drop in the mean survival time

of an in-control process. We constructed Bayesian hierarchical models and derived

posterior distributions for change point estimates using MCMC. The performance of

the Bayesian estimators was investigated through simulation when they were used in

conjunction with risk-adjusted survival time CUSUM control charts monitoring right

censored survival time of patients who underwent cardiac surgery procedures within

a follow-up period of 30 days where the severity of risk factors prior to the surgery

was evaluated by the Parsonnet score. The results showed that the Bayesian estimates

significantly outperform the RAST CUSUM control charts in change detection over

different magnitudes of drops in the mean survival time.



probabilistic intervals around estimates and probabilistic inferences about the loca-

tion of the change point. This is a significant advantage of the proposed Bayesian

approach. Furthermore, flexibility of Bayesian hierarchical models, ease of extension

to more complicated change scenarios such as linear and nonlinear trends in survival

time, relief of analytic calculation of likelihood function, particularly for non-tractable

likelihood functions and ease of coding with available packages should be considered

as additional benefits of the proposed Bayesian change point model for monitoring

purposes.


mortality observed in the pilot hospital. Although it is expected that the superiority of

the proposed Bayesian estimator persists over other processes in which the in-control

rate and the distribution of baseline risk may differ, the results obtained for estimators

and control charts over various change scenarios motivate replication of the study using


as replacing priors with more informative alternatives may be of interest.



advantage of building on control charts that may be already in place in practice (as

in the pilot hospital). An alternative may be to retain the two-step approach but

to use a Bayesian framework in both stages. There is now a substantial literature on







Acknowledgment



Appendix

Change point model code in WinBUGS

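model {

# Likelihood: right-censored Weibull AFT model with a step change at tau.
# The loop header was lost in the source; looping over the observations up to
# the chart signal (RLcusum, supplied as data) is an assumption.
for (i in 1:RLcusum) {

y[i] ~ dweib(alpha0, gamma[i])I(yc[i],)

# Weibull rate: the scale is lambda0 before tau and k * lambda0 from tau onwards.
gamma[i] <- pow(exp(beta0 * riskscore[i])/(lambda0 + step(i-tau) * lambda0 * (k-1)), alpha0) }

RL <- RLcusum-1

# Priors: dnorm takes a precision, so 2.77 corresponds to a standard deviation of 0.6.
k ~ dnorm(0.255, 2.77)I(0.01, 0.99)

tau ~ dunif(1, RL) }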
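For readers without WinBUGS, the likelihood that this model encodes can be sketched in plain Python. This is only an illustrative sketch and not the code used in the study: the function name `log_lik`, the convention that observations with index greater than `tau` use the changed scale, and all parameter values in any usage are assumptions made here.

```python
import math

def log_lik(tau, k, t, delta, u, alpha0, beta0, lambda0):
    """Log-likelihood of right-censored Weibull AFT data with a step change.

    t[i] is the observed time, delta[i] is 1 for a death and 0 if censored,
    u[i] is the risk score. Observations with index <= tau use scale lambda0;
    later observations use k * lambda0 (indexing convention assumed here).
    """
    total = 0.0
    for i, (ti, di, ui) in enumerate(zip(t, delta, u), start=1):
        lam = lambda0 if i <= tau else k * lambda0
        # z is the cumulative hazard, so that S(t | u) = exp(-z)
        z = (ti * math.exp(beta0 * ui) / lam) ** alpha0
        if di:
            # log density of an observed death: log(alpha0 / t) + log z - z
            total += math.log(alpha0 / ti) + math.log(z) - z
        else:
            # log survival probability of a censored patient
            total += -z
    return total
```

Evaluating `log_lik` over a grid of `tau` and `k` values, and adding the log priors, gives an unnormalized log posterior that can serve as a check on sampler output.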


Bibliography

Assareh, H. and Mengersen, K. (2011). Detection of the time of a step change in





20(3):207–222.

Assareh, H., Smith, I., and Mengersen, K. (2011b). Identifying the time of a linear



2190:365–370.















17(2):119–124.


Chapman & Hall/CRC.




38(2):124–136.



102(477):140–152.







Parsonnet, V., Dean, D., and Bernstein, A. D. (1989). A method of uniform stratifi-

cation of risk for evaluating the results of surgery in acquired adult heart disease.

Circulation, 79(6):3–12.





coda. Citeseer.














in Medicine, 29(4):444–454.







24(6):721–735.




CHAPTER 12

Change Point Estimation in Monitoring

Survival Time

Preamble

















Chapters 9 and 10 adapted the Bayesian approach to the estimation of changes in the odds ratio of binary outcomes in risk-adjusted control charts, and Chapter 11 showed that the developed Bayesian estimator identifies drops in mean survival time accurately and precisely in the presence of patient mix. In this chapter, the proposed Bayesian change point model was extended to estimate a wider range of step changes, both increases and decreases, in the mean survival time of patients who underwent cardiac surgery. As in Chapter 11, the data were right censored since the monitoring was conducted over a limited follow-up period, and the effect of risk factors prior to the surgery was captured using a Weibull accelerated failure time regression model.






different magnitude scenarios. The performance of the estimator was also investigated

over various follow-up periods (censoring times), and the results showed that it performed

better over longer follow-up periods. In comparison with the alternative built-in CUSUM

estimator, more accurate and precise estimates were obtained by the Bayesian estima-

tor. These superiorities were enhanced when probability quantification, flexibility and

generalizability of the Bayesian change point detection model are also considered.




components change point estimators were designed to estimate time of a step change

in mean survival time of hospital outcomes in presence of patient mix. Meanwhile the

simulation study implemented in this research contributes to an analytic application

of the risk-adjusted survival time control charts over various change scenarios.








certified that:



field of expertise;






unit; and





Assareh, H. and Mengersen, K. (2011) Change point estimation in monitoring survival

time, PLOS One, under revision.



Signature & Date:






12.1 Abstract

Precise identification of the time when a change in a hospital outcome has occurred

enables clinical experts to search for a potential special cause more effectively. In this

paper, we develop change point estimation methods for survival time of a clinical pro-

cedure in the presence of patient mix in a Bayesian framework. We apply Bayesian

hierarchical models to formulate the change point where there exists a step change

in the mean survival time of patients who underwent cardiac surgery. The data are

right censored since the monitoring is conducted over a limited follow-up period. We

capture the effect of risk factors prior to the surgery using a Weibull accelerated failure

time regression model. Markov Chain Monte Carlo is used to obtain posterior distri-

butions of the change point parameters including location and magnitude of changes

and also corresponding probabilistic intervals and inferences. The performance of the

Bayesian estimator is investigated through simulations and the result shows that pre-

cise estimates can be obtained when they are used in conjunction with the risk-adjusted

survival time CUSUM control charts for different magnitude scenarios. The proposed

estimator shows a better performance where a longer follow-up period, censoring time,

is applied. In comparison with the alternative built-in CUSUM estimator, more accu-

rate and precise estimates are obtained by the Bayesian estimator. These superiorities



12.2 Introduction

A control chart monitors the behavior of a process over time by taking into account

the stability and dispersion of the process. The chart signals when a significant change

has occurred. This signal can then be investigated to identify potential causes of the

change and corrective or preventive actions can then be conducted. Following this

cycle leads to variation reduction and process stabilization (Montgomery, 2008). The

achievements obtained by industrial and business sectors via the implementation of a

quality improvement cycle including quality control charts and root causes analysis have


motivated other sectors such as healthcare to consider those tools and apply them as

an essential part of the monitoring process in order to improve the quality of healthcare

delivery.

One of the earliest comprehensive research studies was undertaken by Benneyan

(1998a,b), who utilized SPC methods and control charts in epidemiology and infection

control and discussed a wide range of control charts in the health context. Woodall

(2006) comprehensively reviewed the increasing stream of adaptations of control charts

and their implementation in healthcare surveillance. He acknowledged the need for

modification of the tools according to health sector characteristics such as emphasis on

monitoring individuals, particularly dichotomous data, and patient mix. Risk adjustment

has been considered in the development of control charts due to the impact of the hu-

man element in process outcomes. Steiner and Cook (2000) developed a risk-adjusted




(Cook, 2004; Grigg and Spiegelhalter, 2007). Both modified procedures have been in-

tensively reviewed and are now well established for monitoring clinical outcomes where

the observations are recorded as binary data (Grigg and Farewell, 2004; Grigg and

Spiegelhalter, 2006; Cook et al., 2008).






Sego et al. (2009) used an accelerated failure time regression model to capture the het-

erogeneity among patients prior to the surgery and developed a risk-adjusted survival

time CUSUM (RAST CUSUM) scheme. They showed that this procedure is more sensi-

tive in detection of an increase in odds ratio compared to risk-adjusted CUSUM charts.

Steiner and Jones (2010) extended this approach by proposing an EWMA procedure

based on the same survival time model discussed by Sego et al. (2009).






industrial context of quality control. Accurate detection of the time of change can help

in the search for a potential cause more efficiently as a tighter time-frame prior to the

signal in the control charts is investigated. Assareh et al. (2011) discussed the benefits

of change point investigation in monitoring cardiac surgery outcomes and post-signal

root causes analysis by providing precise estimates of the time of the change in the rates

of use of blood products during surgery and adverse events in the follow-up period.

A built-in change point estimator in CUSUM charts suggested by Page (1954, 1961)

and also an equivalent estimator in EWMA charts proposed by Nishina (1992) are two

early change point estimators which can be applied for all discrete and continuous dis-

tributions underlying the charts. However they do not provide any statistical inferences

on the obtained estimates.

Samuel and Pignatiello (2001) developed and applied a maximum likelihood estimator

(MLE) for the change point in a process fraction nonconformity monitored by a p-

chart, assuming that the change type is a step change. They showed how closely this



MLE estimator with EWMA and CUSUM charts. These authors also constructed a

confidence set based on the estimated change point which covers the true process change

point with a given level of certainty using a likelihood function based on the method

proposed by Box and Cox (1964).

This approach was extended to other probability distributions and change type scenar-

ios. In the case of a very low fraction non-conforming, Noorossana et al. (2009) derived

and analyzed the MLE estimator of a step change based on the geometric distribution

control charts discussed by Xie et al. (2002).

All MLE estimators described above were developed assuming that the underlying dis-

tribution is stable over time. This assumption cannot often be satisfied in monitoring


clinical outcomes as the mean of the process being monitored is highly linked to indi-

vidual characteristics of patients. Therefore it is required that the survival time model,


points in time-to-event control charts.





where heterogeneity exists as well as inferences based on posterior distributions for the

time and the magnitude of a change (Gelman et al., 2004).


change points are estimated assuming that the underlying change is a step change. In

this scenario, we model the step change in the mean survival time of patients following

a clinical process. We analyze and discuss the performance of the Bayesian change

point model through posterior estimates and probability based intervals. Risk-adjusted

survival time CUSUM charts are reviewed in Section 12.3. The change point model is

demonstrated in Section 12.4 and evaluated in Sections 12.5-12.7. We then compare the

Bayesian estimator with the CUSUM built-in estimator in Section 12.8 and summarize

the study and obtained results in Section 12.9.


The survival time of a patient who has undergone cardiac surgery is affected by the rate

of mortality of cardiac surgery within the hospital and also patient covariates such as

age, gender, co-morbidities and so on. Risk-adjusted control charts of time-to-event are

monitoring procedures designed to detect changes in a process parameter of interest,

such as survival time, where the process outcomes are affected by covariates, such as

risk factors. In these procedures, regression models for time are used to adjust control

charts in such a way that the effects of covariates for each input, patient say, would be

eliminated.










$$\delta_i = \begin{cases} 1 & \text{if } x_i \le c \\ 0 & \text{if } x_i > c. \end{cases} \qquad (12.1)$$







manner.










phase I. In this phase, an available dataset of patient records is used assuming that the

process is in-control for that period of time. A set of independent priors can also be


used to obtain posterior estimates of the AFT parameters over the training data.







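$$W_i^{\pm}(t_i, \delta_i \mid u_i) = \left(1 - (\rho^{\pm})^{-\alpha}\right)\left(\frac{t_i \exp(\beta^{T} u_i)}{\lambda_0}\right)^{\alpha} - \delta_i\, \alpha \log \rho^{\pm}. \qquad (12.2)$$

where it is designed to detect an increase (a decrease) from $\lambda_0$ to $\lambda_1^{+} = \rho^{+}\lambda_0$ ($\lambda_1^{-} = \rho^{-}\lambda_0$). The CUSUM scores are $Z_i^{+} = \max\{0, Z_{i-1}^{+} + W_i^{+}\}$ and $Z_i^{-} = \min\{0, Z_{i-1}^{-} - W_i^{-}\}$.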






























This structure may be expanded to multiple levels in a hierarchical fashion, resulting in

a Bayesian hierarchical model (BHM). In complicated BHMs it is not easy to obtain the


emergence of Markov chain Monte Carlo (MCMC) methods. In MCMC algorithms a

Markov chain is constructed whose stationary distribution is the posterior distribution

of the parameters. Samples generated from a long run of the Markov chain can then

be used for posterior inferences. Some common MCMC methods for drawing samples

include Metropolis-Hastings and the Gibbs sampler; see Gelman et al. (2004) for more

details.


in a survival time of ti, i = 1, ..., T , that is initially in-control. The observations can

be explained by a survival function S(ti, ui), where the underlying distribution, (f(.)),

is a Weibull distribution with parameters (α0, λ0), and ui is a vector of covariates. At

an unknown point in time, τ , the Weibull scale parameter changes from its in-control

state of λ0 to λ1, λ1 = k × λ0, k > 0 and k ≠ 1. The right censored survival time step

change model can thus be parameterized using a survival function as follows:


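$$S(t_i, u_i) = \begin{cases} \exp\left[-\left(\dfrac{t_i \exp(\beta_0^{T} u_i)}{\lambda_0}\right)^{\alpha_0}\right] & \text{if } i = 1, 2, \ldots, \tau \\[2ex] \exp\left[-\left(\dfrac{t_i \exp(\beta_0^{T} u_i)}{\lambda_1}\right)^{\alpha_0}\right] & \text{if } i = \tau + 1, \ldots, T \end{cases} \qquad (12.4)$$

Figure 12.1 Cumulative distribution functions of prior distributions. The assigned priors for the magnitude of the change, k, in the scale parameter of the Weibull AFT model λ in the cases of detection of (1) an increase, or (2) a decrease in k.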


Assume that the process ti is monitored by a control chart that signals at time T .

We assign a truncated normal prior distribution (µ, σ)I(.) for k where all parameters

are set to correspond to the design of RAST CUSUM and the obtained signals. For

an increase in k which is detected by the lower bound h− of the RAST CUSUM, we

set N(µ = 4.004, σ = 8)I(1.01, 20). Similarly, the prior is set to N(µ = 0.255, σ =

0.6)I(0.01, 0.99) for a drop of k that is detected by the upper bound h+ of the RAST

CUSUM. This setting leads to relatively informed priors for the magnitude of the

change. The cumulative distribution functions (CDF) of these priors are shown in

Figure 12.1. The means of both priors were set to correspond to the shifts that the

chart was calibrated to detect; see Section 12.5. The priors encourage sensitivity in

detection of low to relatively large jumps and falls in k.

Note that other distributions such as the uniform and the Gamma might also be of

interest for k since it is always a positive value; see Gelman et al. (2004) for more

details on selection of prior distributions. We place a uniform distribution on the range

(1, T − 1) as a prior for τ where T is set to the time of the signal of the control chart.


See the Appendix for the step change model code in WinBUGS.
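The truncated normal priors described above can be inspected numerically with a short sketch. It assumes that the second parameter in N(µ, σ) is a standard deviation (an interpretation consistent with the precision of 2.77 used in the WinBUGS code, but not stated explicitly in the text); the function names are hypothetical.

```python
import math

def norm_cdf(x, mu, sigma):
    """CDF of a normal distribution with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def truncnorm_cdf(x, mu, sigma, lower, upper):
    """CDF of N(mu, sigma) truncated to the interval [lower, upper]."""
    if x <= lower:
        return 0.0
    if x >= upper:
        return 1.0
    a = norm_cdf(lower, mu, sigma)
    b = norm_cdf(upper, mu, sigma)
    return (norm_cdf(x, mu, sigma) - a) / (b - a)

def prior_increase_cdf(k):
    """Prior CDF for an increase: N(4.004, 8) truncated to (1.01, 20)."""
    return truncnorm_cdf(k, 4.004, 8.0, 1.01, 20.0)

def prior_decrease_cdf(k):
    """Prior CDF for a decrease: N(0.255, 0.6) truncated to (0.01, 0.99)."""
    return truncnorm_cdf(k, 0.255, 0.6, 0.01, 0.99)

# WinBUGS parameterizes dnorm by precision = 1 / sigma^2 (assumed sd = 0.6).
precision_decrease = 1.0 / 0.6 ** 2
```

Plotting `prior_increase_cdf` and `prior_decrease_cdf` over their supports gives curves of the kind summarized in Figure 12.1, and `precision_decrease` is about 2.78, close to the value passed to `dnorm` in the model code.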

12.5 Evaluation


step change detection following a signal from a RAST CUSUM control chart when a














We apply the same Weibull AFT model to simulate observations coming from the in-

control state of the process. Figure 12.2 shows the estimated survival curves obtained

through the in-control survival time model for patients with a range of different Par-

sonnet scores. As seen, a patient with a low score, u = 10 or below, is highly likely

(p ≥ 0.902) to survive within the follow-up period; see Figure 12.2-1. In contrast for

patients with a score of u = 50 and higher, death is not unlikely within this period

since the risk of death is estimated to be at least 51% for the last day shown in Figure

12.2-2.





Figure 12.2 Estimated survival curves for patients with (1) low to medium and (2) medium to high Parsonnet scores (risks prior to surgery) over the follow-up period of 30 days obtained through the fitted Weibull AFT model to the training survival time data.













To generate the step change in λ0, or MST, we then induced changes of sizes k = {1.33,

1.5, 2, 3, 4, 5, 7, 10, 15, 20} as increases and their inverse values of k = {0.05, 0.066, 0.1,

0.143, 0.20, 0.25, 0.33, 0.50, 0.66, 0.75} as decreases and generated observations until

the control charts signalled. These changes led to different change sizes in in-control

estimated survival probability over days for a patient with ui as well as survival curves

between patients with different Parsonnet scores.
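The data-generating step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation code of the study: the shape `alpha` and coefficient `beta` must be supplied by the user (the fitted values are not reproduced here), Parsonnet scores are drawn from a continuous exponential distribution with mean 8.9, the follow-up period is 30 days, and the function names are hypothetical.

```python
import math
import random

def simulate_patient(rng, lam, alpha, beta, c=30.0, mean_score=8.9):
    """Draw one right-censored survival time from the Weibull AFT model."""
    u = rng.expovariate(1.0 / mean_score)            # risk score prior to surgery
    # Inverse-CDF draw from S(x | u) = exp(-(x * exp(beta * u) / lam) ** alpha)
    e = -math.log(1.0 - rng.random())
    x = lam * math.exp(-beta * u) * e ** (1.0 / alpha)
    t = min(x, c)                                    # observed time
    delta = 1 if x <= c else 0                       # 1 = death within follow-up
    return t, delta, u

def simulate_step_change(n, tau, k, lambda0, alpha, beta, seed=0):
    """Simulate n patients; the scale becomes k * lambda0 after observation tau."""
    rng = random.Random(seed)
    data = []
    for i in range(1, n + 1):
        lam = lambda0 if i <= tau else k * lambda0
        data.append(simulate_patient(rng, lam, alpha, beta))
    return data
```

In a full replication the generation would continue patient by patient until the control chart signals, rather than stopping at a fixed `n`.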


Figure 12.3 Estimated probability of survival at the 15th and the 30th day of the follow-up period of 30 days over all Parsonnet scores prior and after (1) an increase of size k = 4, and (2) a decrease of size k = 0.25 in the MST. Prior and after the change are indexed by 1 and the value of k.

The effects of an increase of size k = 4 and a drop of size k = 0.25 in the MST on the

probability of survival at the midpoint, day 15, and the end, day 30, of the follow-up

period for all possible Parsonnet scores are demonstrated in Figure 12.3. As expected,

the probability of survival for each patient would increase when a jump in the MST

occurred. However the magnitude of this increase is larger for patients with higher

Parsonnet scores.

It was also found that the resultant magnitude of the shift in the probability of sur-

vival for an individual patient with a covariate of ui, is not constant over days. The

magnitude of increases in the probability at the end of period is slightly higher than

those obtained for the midpoint of the period caused by a jump of k = 4 in the MST

for patients with Parsonnet scores of less than 63; this is demonstrated by comparison

of the absolute change in probability for the days 15 and 30 of the follow-up period

before (k = 1) and after the increase (k = 4) in Figure 12.4-1. As shown for patients

with higher scores, the increase in probability for the end of the follow-up period is less

than the midpoint. The same behavior was also observed for a drop of size k = 0.25;

however, the shift in the probability at the end of the period declines and falls below

the corresponding shift at the midpoint of the period over a wider range of Parsonnet

scores; see Figure 12.4-2.



Figure 12.4 Estimated absolute magnitude of change in probability of survival over all Parsonnet scores prior and after changes in the MST. Probabilities at the 15th and the 30th day of the follow-up period of 30 days prior and after (1) an increase of size k = 4, and (2) a decrease of size k = 0.25 in the MST.

We calibrated the RAST CUSUM to detect an increase and a decrease in the MST that

correspond to a halving and a doubling of the odds ratio within the follow-up period and

with an in-control average run length ($\widehat{ARL}_0$) of approximately 10000 observations. As

noted in Section 12.3, for the Weibull AFT model the corresponding odds ratio formula,

discussed by Sego et al. (2009), is not reduced to a closed form of λ0 and ρ± since the

covariate term is not simplified in

$$OR = \frac{O_{i1}}{O_{i0}}, \quad \text{and} \quad O_i = \frac{1 - S(c \mid u_i)}{S(c \mid u_i)} \qquad (12.5)$$


Therefore we used Monte Carlo simulation to estimate the corresponding ρ±. To do

so, we set ρ± such that over 100,000 replications of generating Parsonnet scores from

the fitted exponential distribution with a mean of 8.9 and calculating the odds ratio

in Equation 12.5, the desired odds ratios of size OR = 2 and OR = 0.5 were obtained.

An increase of ρ+ = 4.004 and a decrease of ρ− = 0.255 in the MST were found to

correspond to the desired drop and jump in odds ratio, respectively.
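One way to read this calibration is as a root-finding problem on a Monte Carlo estimate of the odds ratio. The sketch below averages the per-patient odds ratio of Equation 12.5 over simulated Parsonnet scores and bisects on ρ; the averaging rule, the function names and the requirement that `alpha` and `beta` be supplied by the user are all assumptions made for illustration, so the output will reproduce the reported ρ values only if the fitted model parameters are used.

```python
import math
import random

def odds_of_death(c, u, lam, alpha, beta):
    """Odds (1 - S) / S of death by day c under the Weibull AFT model."""
    z = (c * math.exp(beta * u) / lam) ** alpha      # S(c | u) = exp(-z)
    return math.expm1(z)                             # (1 - e^-z) / e^-z = e^z - 1

def mean_odds_ratio(rho, lambda0, alpha, beta, c=30.0, n=100_000,
                    mean_score=8.9, seed=1):
    """Average odds ratio (scale rho * lambda0 versus lambda0) over simulated scores."""
    rng = random.Random(seed)                        # fixed seed: common random numbers
    total = 0.0
    for _ in range(n):
        u = rng.expovariate(1.0 / mean_score)
        total += (odds_of_death(c, u, rho * lambda0, alpha, beta)
                  / odds_of_death(c, u, lambda0, alpha, beta))
    return total / n

def solve_rho(target_or, lo, hi, tol=1e-3, **model):
    """Bisect on rho; the mean odds ratio decreases as rho increases."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_odds_ratio(mid, **model) > target_or:
            lo = mid                                 # odds ratio still too high
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

A call such as `solve_rho(0.5, 1.0, 20.0, lambda0=42133.6, alpha=..., beta=...)` would search for the increase in the MST that halves the odds ratio, and a bracket below 1 with a target of 2 would handle the drop.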

We also used Monte Carlo simulation to determine decision intervals, h±. However

other approaches may also be considered; see Steiner and Cook (2000) and Sego et al.

(2009). This setting led to decision intervals of h+ = 4.88 and h− = 4.53. As two sided


Table 12.1 Posterior estimates (mode, sd.) of step change point model parameters (τ and k) following signals (RL) from RAST CUSUM ((h+, h−) = (4.88, 4.53)) where λ0 = 42133.6 and τ = 500.

k       RL      τ       στ      k       σk
0.25    651     499.8   96.0    0.226   0.18
0.33    722     494.8   160.6   0.27    0.19
3.00    1107    734.1   165.8   3.52    3.6
4.00    839     496.3   109.1   3.68    2.4

charts were considered, the negative value of h− was used. The associated CUSUM

scores were also obtained through Equation (12.2) considering the generated ti, δi and

ui.
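The score computation can be sketched as below, under several assumptions: the weight follows the Weibull log-likelihood ratio form of Equation (12.2), the upper arm accumulates the weights and the lower arm accumulates their negatives, and the caller decides which shift size `rho_upper` and `rho_lower` each arm targets. The function names are hypothetical, and `alpha` and `beta` must be supplied by the user.

```python
import math

def cusum_weight(t, delta, u, rho, lambda0, alpha, beta):
    """Log-likelihood-ratio weight for one patient, in the form of Equation (12.2)."""
    z = (t * math.exp(beta * u) / lambda0) ** alpha
    return (1.0 - rho ** (-alpha)) * z - delta * alpha * math.log(rho)

def first_signal(data, rho_upper, rho_lower, h_upper, h_lower,
                 lambda0, alpha, beta):
    """Run a two-sided CUSUM over (t, delta, u) records.

    Returns the 1-based index of the first signal, or None if neither arm
    crosses its decision limit.
    """
    z_up, z_low = 0.0, 0.0
    for i, (t, delta, u) in enumerate(data, start=1):
        z_up = max(0.0, z_up + cusum_weight(t, delta, u, rho_upper,
                                            lambda0, alpha, beta))
        z_low = min(0.0, z_low - cusum_weight(t, delta, u, rho_lower,
                                              lambda0, alpha, beta))
        if z_up > h_upper or z_low < -h_lower:
            return i
    return None
```

With a shift size below one on the upper arm, each early death adds a positive weight, so a run of deaths drives the upper statistic towards its limit; this pairing is one reading of the design and should be checked against the chart being replicated.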




samples through MCMC iterations in WinBUGS (Spiegelhalter et al., 2003), with the

first 20000 samples ignored as burn-in, for all change point scenarios. We then analyzed

the results using the CODA package in R (Plummer et al., 2010). See the Appendix

for the step change model code in WinBUGS.


To demonstrate the results of Bayesian change point detection in risk-adjusted control

charts, we induced a jump and a drop of sizes k = 4.0 and k = 0.25, respectively, at

time τ = 500 in an in-control process with an overall survival time of λ0 = 42133.6.

The RAST CUSUM chart detected the changes and signalled at the 839th and 651st

observations, corresponding to delays of 339 and 151 observations as shown in Figures

12.5-a1 and 12.5-b1, respectively. The posterior distributions of time and magnitude of

the change were then obtained using MCMC discussed in Section 12.5. Both distribu-

tions of the time of the change, τ , concentrate on the 500th observation, approximately,

as seen in Figures 12.5-a2 and 12.5-b2. The posterior for the magnitude of the change,

k, also reasonably identified the exact change sizes as it highly concentrates on values

of around 4.0 and 0.25 shown in Figure 12.5-a3 and 12.5-b3.


Figure 12.5 Risk-adjusted survival time CUSUM charts ((h+, h−) = (4.88, 4.53)) and obtained posterior distributions of the time τ and the magnitude k of (a1-a3) an increase of size k = 4, and (b1-b3) a decrease of size k = 0.25 in λ (mean survival time) where λ0 = 42133.6 and τ = 500.

This investigation was replicated using smaller shifts in both directions, k = 0.33 and

k = 3.0 in λ0. Table 12.1 summarizes the posterior estimates for all scenarios. If

the posterior was asymmetric and skewed, the mode of the posterior was used as an

estimator for the change point model parameters (τ and k).
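A posterior mode can be approximated from MCMC draws by binning them and taking the most populated bin. The sketch below is one simple way to do this and is not necessarily how the modes reported here were computed; the function name and the choice of bin width are assumptions.

```python
import math
from collections import Counter

def posterior_mode(samples, bin_width):
    """Approximate the mode of a sample as the centre of its fullest bin.

    Ties are broken in favour of the smaller bin.
    """
    counts = Counter(math.floor(x / bin_width) for x in samples)
    best_bin, _ = max(counts.items(), key=lambda kv: (kv[1], -kv[0]))
    return (best_bin + 0.5) * bin_width
```

For the change point τ a bin width of one observation is natural, whereas for the magnitude k a narrower width (for example 0.01 for drops) would be needed; the estimate is sensitive to this choice when the posterior is flat.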

The RAST CUSUM signalled after 222 observations when the mean survival time


Table 12.2 Credible intervals for step change point model parameters (τ and k) following signals (RL) from RAST CUSUM ((h+, h−) = (4.88, 4.53)) where λ0 = 42133.6 and τ = 500.

        CI 50%                        CI 80%
k       τ            k               τ            k
0.25    (488, 551)   (0.14, 0.33)    (453, 581)   (0.09, 0.48)
0.33    (487, 648)   (0.15, 0.40)    (359, 709)   (0.09, 0.57)
3.00    (681, 891)   (2.08, 5.74)    (604, 995)   (1.52, 9.32)
4.00    (397, 505)   (2.48, 5.39)    (389, 611)   (1.51, 7.20)

became 0.33 where the posterior distribution reported the drop at the 491st observation.

This result implies that although the obtained posterior estimates underestimated the

change point, they still performed substantially better than the RAST CUSUM charts.

However, a large bias was associated with the Bayesian estimate of the time where the

MST became 3.0.


lowing signals of the control chart; see Figures 12.5-a3 and 12.5-b3 and Table 12.1.

The slight bias observed in the figures must be considered in the context of their cor-

responding standard deviations.



shift size occurred, less dispersed posteriors are obtained, particularly for posteriors of

time.







posterior distributions. Under the same probability of 0.5, the CI for the time of

the change of size k = 4.0 covers only eight observations around the 500th observation

whereas it increases to 210 observations for k = 3.0 due to the larger standard deviation;

see Table 12.1. In this scenario, the true change point was in neither CI, whereas the

intervals obtained for the equivalent change size in the opposite direction, k = 0.33,


Table 12.3 Probability of the occurrence of the change point in the last {25, 50, 100, 200, 300, 400, 500} observations prior to signalling for RAST CUSUM ((h+, h−) = (4.88, 4.53)) where λ0 = 42133.6 and τ = 500.

k       25     50     100    200    300    400    500
0.25    0.03   0.07   0.20   0.89   0.94   0.96   0.97
0.33    0.01   0.05   0.20   0.59   0.77   0.82   0.90
3.0     0.00   0.01   0.02   0.12   0.40   0.57   0.82
4.0     0.01   0.02   0.04   0.08   0.27   0.72   0.96

are highly informative.

This investigation can be extended to other shift sizes for the time estimates. As shown

in Table 12.1 and discussed above, the magnitudes of the changes are also estimated

reasonably well and Table 12.2 shows that in all cases the real sizes of the changes are

contained in the respective posterior 50% and 80% CIs.
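Credible intervals of this kind can be read directly off the MCMC draws. The sketch below computes an equal-tailed interval from sorted samples; whether the intervals in Table 12.2 are equal-tailed or highest posterior density intervals is not stated, so the equal-tailed rule and the function name are assumptions.

```python
import math

def credible_interval(samples, prob):
    """Equal-tailed credible interval holding about `prob` posterior mass."""
    xs = sorted(samples)
    n = len(xs)
    tail = (1.0 - prob) / 2.0
    lower = xs[int(math.floor(tail * (n - 1)))]
    upper = xs[int(math.ceil((1.0 - tail) * (n - 1)))]
    return lower, upper
```

The width `upper - lower` of the interval for τ is the number of observations that would need to be searched at the chosen probability level.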




in the control charts. For a step change of size k = 4.0 in the mean survival time, since

the RAST CUSUM signals late (see Table 12.1), it is unlikely that the change point

occurred in the last 200 observations. A considerable growth in the probability is seen

when the next 200 observations are included, reaching 0.72, whereas for a smaller

increase of size k = 3.0, it is still not unlikely that the change point occurred prior to

the last 400 observations, with a probability of 0.43. For drops, k = 0.33, 0.25, the

probability that the change occurred in the last 200 observations is noticeably high

since more precise posteriors of time were obtained; see Table 12.1.
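Probabilities of this form are simple proportions of the posterior draws of τ. A minimal sketch, with a hypothetical function name and made-up draws in the usage note, is:

```python
def prob_change_in_last(tau_samples, signal_time, m):
    """Posterior probability that the change point lies within the last m
    observations before the chart signalled at signal_time."""
    window_start = signal_time - m
    hits = sum(1 for tau in tau_samples if tau > window_start)
    return hits / len(tau_samples)
```

For example, with illustrative draws `[500, 510, 600, 640, 300]` and a signal at observation 651, four of the five draws fall in the last 200 observations, giving 0.8; the probability that the change occurred before that window is the complement, 0.2.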



datasets, for different changes in λ0, we replicated the simulation method explained in

Section 12.5 100 times. This replication allows us to have a distribution of estimates

with standard errors of the order of 10. The number of replications is a compromise

between computational time and posterior estimation of the expected value and par-

ticular tail probabilities. Table 12.4 shows the average of the estimated parameters

obtained from the replicated datasets where there exists a step change in λ0 of size k.


Using Monte Carlo simulation an equivalent odds ratio of mortality in the follow-up

period, OR, for each step change in the MST was also obtained.

Table 12.4 Average of posterior estimates (mode, sd.) of step change point model parameters (τ and k) for a change in the mean survival time following signals (RL) from RAST CUSUM ((h+, h−) = (4.88, 4.53)) where λ0 = 42133.6 and τ = 500. Standard deviations are shown in parentheses.

                                  Change point                        Change size
k       OR      E(RL)             E(τ)              E(στ)             E(k)            E(σk)
0.05    4.73    542.4 (16.2)      486.0 (57.3)      91.2 (34.7)       0.077 (0.086)   0.173 (0.022)
0.066   3.94    554.8 (26.6)      490.5 (62.5)      92.9 (36.7)       0.083 (0.075)   0.177 (0.025)
0.10    3.26    568.3 (39.7)      485.7 (70.9)      99.4 (33.9)       0.127 (0.094)   0.181 (0.017)
0.143   2.70    594.2 (49.2)      487.3 (72.5)      110.9 (34.5)      0.154 (0.090)   0.182 (0.016)
0.20    2.26    624.7 (71.3)      503.7 (87.1)      119.5 (36.6)      0.182 (0.103)   0.183 (0.018)
0.25    2.02    692.3 (150.4)     527.3 (146.2)     132.9 (53.4)      0.231 (0.111)   0.183 (0.018)
0.33    1.75    779.6 (187.7)     554.3 (162.3)     153.9 (58.9)      0.27 (0.118)    0.186 (0.023)
0.50    1.41    1139.0 (605.0)    661.8 (287.7)     258.9 (173.0)     0.43 (0.16)     0.188 (0.028)
0.66    1.23    2469.4 (2169.8)   1270.3 (783.2)    562.1 (456.6)     0.57 (0.22)     0.193 (0.047)
0.75    1.16    2773.4 (2195.4)   1748.0 (1304.4)   697.9 (720.8)     0.63 (0.25)     0.195 (0.047)
1.33    0.87    2921.9 (2629.8)   2080.6 (1674.0)   635.8 (763.4)     1.59 (2.57)     3.25 (0.747)
1.5     0.81    2438.8 (1671.8)   1764.9 (1238.9)   510.0 (555.5)     1.85 (2.59)     3.60 (0.788)
2.0     0.70    1454.0 (626.9)    928.8 (434.4)     291.9 (197.6)     2.69 (2.10)     3.81 (0.819)
3.0     0.58    1004.7 (382.2)    645.1 (250.3)     179.9 (98.8)      3.60 (2.19)     3.98 (0.618)
4.0     0.50    828.8 (196.5)     525.9 (134.9)     137.0 (68.0)      4.12 (2.26)     4.08 (0.401)
5.0     0.45    785.6 (170.2)     514.5 (128.8)     113.7 (63.5)      5.79 (2.42)     4.14 (0.394)
7.0     0.38    753.2 (125.4)     493.1 (100.9)     106.1 (46.8)      6.69 (2.36)     4.17 (0.364)
10.0    0.32    692.4 (89.6)      471.8 (90.9)      95.3 (43.2)       8.90 (2.20)     4.27 (0.291)
15.0    0.26    689.5 (84.7)      467.6 (78.2)      88.7 (41.6)       12.25 (2.23)    4.38 (0.270)
20.0    0.22    670.7 (61.6)      465.2 (73.5)      80.1 (35.3)       14.73 (2.04)    4.45 (0.148)

As seen, the RAST CUSUM control chart tends to detect larger shifts in the MST

with less delay. For a large jump, k of size 10 or more, the chart signals with a

delay of at most 192 observations. This delay increases over moderate increases in λ0,

reaching 504 observations for k = 3.0. However, the chart tends to fail in detection


of small jumps since signals with a long delay of more than 954 observations were

obtained when the MST doubled, k = 2.0. This behavior is also consistent over drops.

Having said that, the RAST CUSUM performs better where there exists a drop. For

a fall of size k = 0.25 in the MST, equivalent to doubling of the odds ratio, a delay

of 192 observations is associated with the obtained signal on average, while it is 328

observations for an equivalent increase of k = 4.0, halving of the odds ratio. Moreover,

more precision is associated with the RAST CUSUM signals over reductions.

This superiority can be explained by the nature of censored data. Since survival times

are right censored, the effect of improvements in the process is less observable and

detectable than deteriorations. In other words, the data obtained after an increase in

the MST is less informative than those obtained following a drop; see Section 12.7.

For a large jump in the MST, k of size 7.0 or more, the average value of the modes,

E(τ), tends to underestimate the time of the change since it reports at best the 493rd

observation for k = 7.0. However, the Bayesian estimator still outperforms the chart

signal with less bias over large increases. For inverse change sizes, large falls, the

posterior mode also reports the true change point with less bias than the chart’s signal.

The magnitude of this bias is less than that obtained over jumps in the MST (drops

in the odds ratio).

Although the RAST CUSUM chart was designed to detect moderate shifts in the MST,

approximately k = (0.25, 4.0), it is significantly outperformed by the posterior mode

that detects the change point with a delay of three and 27 observations, respectively,

compared to the chart’s signal with a bias of at best 192 observations obtained for the

fall (doubling of the odds ratio).

Table 12.4 shows that the Bayesian estimator of time, E(τ), tends to overestimate

the time of the change over moderate to small step changes. This bias dramatically

increases over small to very small shifts, a drop of size k = (0.5, 0.66, 0.75) and their

inverse values for jumps, reaching a bias of 1080 observations obtained for k = 1.33,

yet significantly outperforms the chart’s signal. However, it may still be considered as

an informative estimate of the time of the change.

Table 12.4 indicates that the average of the Bayesian estimator of the magnitude of the

change, E(k), identifies change sizes with some bias. For large drops, this estimator

tends to overestimate the change size whereas it underestimates the size over moderate

to small drops. It behaves conversely over jumps. The magnitude of an increase

is overestimated for small to moderate shifts while it is underestimated over large

increases. The best estimates were obtained for moderate shift sizes. Having said

that, Bayesian estimates of the magnitude of the change must be studied in conjunction

with their corresponding standard deviations. In this manner, analysis of credible

intervals is effective.

12.7 The Effect of Censoring Time

Specification of the time c at which the survival times are right censored affects the

resulting performance of the RAST CUSUM chart. Sego et al. (2009) have addressed

construction of a RAST CUSUM chart in an updating fashion that uses longer censoring

times. Here we investigated the performance of the chart and the proposed Bayesian es-

timator of the change point over longer censoring times. Using the simulation procedure

discussed in Section 12.5 and followed above, we replicated generating in-control and

out-of-control states of the sample process and change point detections for a selection

of decreases, k = (0.1, 0.25), and increases, k = (4, 10), in the MST where the observed

survival times are right censored using follow-up periods of c = {30, 90, 180, 365} which

correspond to a month, a quarter, half a year and a full year, respectively. Note that the

RAST CUSUM chart was not re-calibrated based on the new censoring time since it

was assumed that no updates were obtained for the patients in the training dataset.

Table 12.5 shows that when a longer censoring time is used, the chart detects a fall with

less delay. For a large reduction of size k = 0.1 this delay drops by 26 observations

on average when a follow-up period of 90 days is considered instead of the common

30 days. In this scenario, applying longer periods improves the run length since more

accurate and precise E(RL) were obtained. However it is not as significant as that

observed by replacing a month with a quarter of a year. This behavior is consistent

over a moderate drop of size k = 0.25. The average of the Bayesian estimator of the time,

E(τ), also shows that estimates with less bias and variation would be obtained if a

longer follow-up period was used.

Table 12.5 Average of posterior estimates (mode, sd.) of step change point model parameters (τ and k) for a change in the mean survival time using different censoring times, c, following signals (RL) from RAST CUSUM ((h+, h−) = (4.88, 4.53)) where λ0 = 42133.6 and τ = 500. Standard deviations are shown in parentheses.

k       c      E(RL)            E(τ)             E(k)
0.1     30     568.3 (39.7)     485.7 (70.9)     0.127 (0.094)
0.1     90     542.8 (19.5)     486.3 (62.3)     0.124 (0.102)
0.1     180    534.8 (16.1)     492.6 (36.5)     0.124 (0.101)
0.1     365    527.9 (12.5)     489.7 (33.6)     0.148 (0.139)
0.25    30     692.3 (150.4)    527.3 (146.2)    0.231 (0.111)
0.25    90     609.9 (77.6)     513.7 (63.0)     0.233 (0.119)
0.25    180    579.4 (46.5)     509.7 (50.6)     0.213 (0.126)
0.25    365    562.6 (44.4)     495.3 (63.9)     0.227 (0.130)
4       30     828.8 (196.5)    525.9 (134.9)    4.12 (2.26)
4       90     708.5 (126.6)    517.3 (115.9)    4.38 (2.38)
4       180    645.6 (97.3)     515.2 (86.2)     4.16 (2.41)
4       365    604.7 (69.7)     512.2 (76.2)     4.29 (2.52)
10      30     692.4 (89.6)     471.8 (90.9)     8.90 (2.20)
10      90     621.3 (55.9)     473.3 (79.5)     8.81 (2.40)
10      180    591.4 (37.9)     488.2 (40.4)     8.72 (2.56)
10      365    562.7 (25.4)     489.1 (32.4)     8.92 (2.71)

The behavior of the chart and the estimator observed for drops persists over increases

in the MST as well. Having said that, it seems to be more significant over increases. For

a moderate jump of size k = 4, the bias of the chart signal drops by 120 observations

when a follow-up period of 30 days is replaced by 90 days. The delay of the chart

reaches 104 observations for a follow-up period of a year. In a large increase scenario

of k = 10, the delay reduces from 192 to 62 observations over censoring times of 30

and 365 days, respectively. The Bayesian estimator of the time, E(τ), also tends to

detect the change point more accurately and precisely since less overestimation and

underestimation were observed over a longer censoring time for moderate and large

jumps, respectively.

Although the discussed results are in favor of following up patients for a longer time,

care should be taken with this approach, since the possibility that risk factors other

than the process of interest, cardiac surgery, contribute to the observed survival

time increases. Investigation of incorporating such post-surgery factors and also the

effect of re-calibration of the RAST CUSUM is left for further research.


ods

To study the performance of the proposed Bayesian estimators in comparison with that

introduced in Section 12.2, we ran the available alternative, the built-in estimator of the

CUSUM chart, within the replications discussed in Section 12.6.

Based on the suggestion by Page (1954), if an increase in a process rate is detected

by CUSUM charts, an estimate of the change point is obtained through τcusum =

max{i : Z+i = 0}. Similarly, for detection of a decrease, the estimated change point is

τcusum = max{i : Z−i = 0}.
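The built-in estimator can be illustrated with a short sketch. It assumes the CUSUM path up to the signal is stored as a list indexed from observation 1; the function name `page_change_point` and the example paths are hypothetical, and returning 0 when the path never revisits zero (the chart's starting value) is an assumption.

```python
def page_change_point(z_path):
    """Return max{i : Z_i = 0} for a CUSUM path Z_1, ..., Z_RL.

    Returns 0 if the path never equals zero, which corresponds to the
    usual starting value Z_0 = 0 (an assumption of this sketch).
    """
    last_zero = 0
    for i, z in enumerate(z_path, start=1):
        if z == 0:
            last_zero = i
    return last_zero


# Hypothetical upper and lower CUSUM paths ending at the signal.
upper_path = [0.0, 0.4, 0.0, 0.0, 0.7, 1.9, 3.2, 5.1]
lower_path = [0.0, -0.2, 0.0, -1.0, -2.5, -4.8]
tau_upper = page_change_point(upper_path)
tau_lower = page_change_point(lower_path)
```

The estimator returns a single time point with no measure of uncertainty, which is the limitation the Bayesian estimator is meant to address.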


provided by the built-in estimator of CUSUM, τcusum charts for shifts in the mean


Table 12.6 Average of detected time of a step change in the mean survival time obtained by the Bayesian estimator (τb) and CUSUM built-in estimator following signals (RL) from RACUSUM ((h+, h−) = (5.85, 5.33)) where λ0 = 42133.6 and τ = 500. Standard deviations are shown in parentheses.

k       E(RL)             E(τcusum)         E(τb)
0.05    542.4 (16.2)      458.22 (77.9)     486.0 (57.3)
0.066   554.8 (26.6)      467.5 (80.9)      490.5 (62.5)
0.10    568.3 (39.7)      456.0 (79.9)      485.7 (70.9)
0.143   594.2 (49.2)      474.5 (67.1)      487.3 (72.5)
0.20    624.7 (71.3)      477.1 (75.6)      503.7 (87.1)
0.25    692.3 (150.4)     523.5 (138.4)     527.3 (146.2)
0.33    779.6 (187.7)     565.4 (158.9)     554.3 (162.3)
0.50    1139.0 (605.0)    903.7 (568.0)     661.8 (287.7)
0.66    2469.4 (2169.8)   2289.5 (2168.2)   1270.3 (783.2)
0.75    2773.4 (2195.4)   2426.0 (1906.9)   1748.0 (1304.4)
1.33    2921.9 (2629.8)   2561.2 (2655.8)   2080.6 (1674.0)
1.5     2438.8 (1671.8)   2035 (1672.0)     1764.9 (1238.9)
2.0     1454.0 (626.9)    997.6 (597.2)     928.8 (434.4)
3.0     1004.7 (382.2)    635.1 (332.1)     645.1 (250.3)
4.0     828.8 (196.5)     468.4 (174.0)     525.9 (134.9)
5.0     785.6 (170.2)     470.0 (160.6)     514.5 (128.8)
7.0     753.2 (125.4)     455.2 (100.5)     493.1 (100.9)
10.0    692.4 (89.6)      417.0 (122.2)     471.8 (90.9)
15.0    689.5 (84.7)      432.3 (102.4)     467.6 (78.2)
20.0    670.7 (61.6)      430.5 (112.9)     465.2 (73.5)

survival time, λ0 say.

The built-in estimator of CUSUM charts outperforms associated signals over all shifts

in the MST; however, it tends to significantly underestimate the exact change point

when the magnitude of the shifts increases. The largest biases of 83 and 44 observations

were obtained for a shift size of k = 10 and its inverse value, k = 0.1, as an increase and

a decrease respectively. It has been discussed in Section 12.6 that the RAST CUSUM

has a better performance over drops; this finding persists for the built-in estimator

since less bias and higher precision are associated with the change point estimates over

drops. Having said that, the superiority of the built-in estimator over the chart’s signal

is more significant over jumps in the MST, as the same bias, 42 observations, but in

opposite directions was associated with the reported time of the change through the

chart signal and the built-in estimator for a large reduction of size k = 0.05 (odds ratio

of OR = 4.73).

Although the Bayesian estimator, τb, also tends to underestimate the time of changes

over large shifts, k = 7 or more, and their inverse, it outperforms the built-in estimator,

τcusum, with less bias, reaching 15 and 35 observations over large drops and jumps,

respectively.

The posterior mode tends to overestimate the true change point over moderate to small

shift sizes, yet it reports more accurate results than the alternative which is associated

with significantly large delays. In the only exceptional scenarios, shifts of sizes k = 0.25

and k = 3.0, where less bias is associated with the built-in estimator, no significant

superiority is gained when the obtained variation of the estimates is also taken into

account. Comparison of variation of estimated change points across other scenarios of

shifts in the mean survival time also supports the superiority of the Bayesian estimator

over the alternative.

12.9 Conclusion









In this paper, using a Bayesian framework, we modeled change point detection in time-

to-event data for a clinical process with dichotomous outcomes, death and survival,

where patient mix was present. We considered a range of jumps and falls in the mean

survival time of an in-control process. We constructed Bayesian hierarchical models

and derived posterior distributions for change point estimates using MCMC. The per-

formance of the Bayesian estimators was investigated through simulation in conjunction

with risk-adjusted survival time CUSUM control charts for monitoring right censored

survival time of patients who underwent cardiac surgery procedures within a follow-up

period of 30 days. Here the severity of risk factors prior to the surgery was evaluated by

the Parsonnet score. The results showed that the Bayesian estimates significantly

outperform the RAST CUSUM control charts in change detection over different magnitudes

of shifts in the mean survival time. Over longer follow-up periods better estimates were

provided by the RAST CUSUM chart and the Bayesian estimator. We then compared

the Bayesian estimator with built-in estimators of CUSUM. The Bayesian estimator

was found to perform reasonably well and outperform the alternative.






plicated change scenarios such as linear and nonlinear trends in survival time, relief





















Acknowledgment



Appendix

Change point model code for survival time: Detection of a decrease in

the mean survival time

model {



gamma[i] <- pow(exp(beta0 * riskscore[i])/(lambda0+step(i-tau) * lambda0 * (k-1)), alpha) }

RL <- RLcusum-1

k ~ dnorm(0.255, 2.77)I(0.01, 0.99)

#k ~ dnorm(4.004, 0.0156)I(1.01, 20) # For jump scenarios


Bibliography



20(3):207–222.





















17(2):119–124.


Chapman & Hall/CRC.




38(2):124–136.



102(477):140–152.




Papers, 46(1):47–64.















coda. Citeseer.

















in Medicine, 29(4):444–454.







24(6):721–735.





CHAPTER 13

Estimation of the Time of a Linear Trend in

Monitoring Survival Time

Preamble
















Following the accuracy and precision achieved in Chapters 11 and 12 by the Bayesian

estimator of the time of a step change in the mean survival time of patients who underwent

cardiac surgery in the presence of patient mix, in this chapter the

Bayesian change point model was extended to identify the time of a linear trend in the

mean survival time of patients who underwent cardiac surgery. The data were right

censored since the monitoring was conducted over a limited follow-up period and the

effect of risk factors prior to the surgery was captured using a Weibull accelerated

failure time regression model.






different magnitude of slope scenarios. The proposed estimator showed a better per-

formance where a longer follow-up period, censoring time, was applied. In comparison

with the alternative built-in CUSUM estimator, more accurate and precise estimates

were obtained by the Bayesian estimator. These superiorities were enhanced when

probability quantification, flexibility and generalizability of the Bayesian change point

detection model are also considered.



chapter contributes to Bayesian methodology and the simulation study implemented in

this research contributes to an analytic application of the risk-adjusted survival time

control charts over various linear drifts.










Assareh, H. and Mengersen, K. (2011) Bayesian estimation of the time of a linear trend

in monitoring survival time, Ready for submission.








13.1 Abstract

Change point detection is recognized as an essential tool of root cause analyses within

quality control programs as it enables clinical experts to search for potential causes of

change in hospital outcomes more effectively. In this paper, we consider estimation of

the time when a linear trend disturbance has occurred in survival time following an

in-control clinical intervention in the presence of variable patient mix. To model the

process and change point, a linear trend in the survival time of patients who underwent

cardiac surgery is formulated using hierarchical models in a Bayesian framework. The

data are right censored since the monitoring is conducted over a limited follow-up

period. We capture the effect of risk factors prior to the surgery using a Weibull

accelerated failure time regression model. We use Markov Chain Monte Carlo to obtain

posterior distributions of the change point parameters including the location and the

slope size of the trend and also corresponding probabilistic intervals and inferences.

The performance of the Bayesian estimator is investigated through simulations and the

results show that precise estimates can be obtained when they are used in conjunction

with the risk-adjusted survival time CUSUM control charts for different trend scenarios.

In comparison with the alternative built-in CUSUM estimator, reasonably accurate

and precise estimates are obtained by the Bayesian estimator. These superiorities



13.2 Introduction

A control chart monitors the behavior of a clinical process over time or patients by

taking into account the stability and dispersion of the process. The chart signals when

a significant change has occurred. This signal can then be investigated to identify po-

tential causes of the change and corrective or preventive actions can then be conducted.

Following this cycle leads to variation reduction and process stabilization (Montgomery,

2008).

In monitoring hospital outcomes it is necessary to consider the impact of patient health


on process outcomes. To this end, risk adjustment has been taken into account in the

development of control charts. Steiner and Cook (2000) developed a risk-adjusted




(Cook, 2004; Grigg and Spiegelhalter, 2007). Both modified procedures have been

intensively reviewed and are now well established for monitoring clinical outcomes

where the observations are recorded as binary data (Grigg and Farewell, 2004; Grigg

and Spiegelhalter, 2006; Cook et al., 2008).






Sego et al. (2009) used an accelerated failure time regression model to capture the het-

erogeneity among patients prior to the surgery and developed a risk-adjusted survival

time CUSUM (RAST CUSUM) scheme. They showed that this procedure is more sensi-

tive in detection of an increase in odds ratio compared to risk-adjusted CUSUM charts.

Steiner and Jones (2010) extended this approach by proposing an EWMA procedure

based on the same survival time model discussed by Sego et al. (2009).

The need to know the time at which a process began to vary, the so-called change point,

has been raised and discussed in an industrial context of quality control. Accurate

detection of the time of change can help in the search for a potential cause more

efficiently as a tighter time-frame prior to the signal in the control charts is investigated.

In a clinical study, Assareh et al. (2011a) illustrated the capabilities of the change point

investigation through comparison of obtained estimates for the true time of detected

changes in rate of excess use of blood products and major adverse events during and

after cardiac surgery with the time of known potential causes.

A built-in change point estimator in CUSUM charts suggested by Page (1954, 1961)

and also an equivalent estimator in EWMA charts proposed by Nishina (1992) are two


early change point estimators which can be applied for all discrete and continuous dis-

tributions underlying the charts. However they do not provide any statistical inferences

on the obtained estimates.


the change point in a process fraction nonconformity monitored by a p-chart, assuming

that the change type is a step change. They showed how closely this new estimator

detects the change point in comparison with the usual p-chart signal. Subsequently,

Perry and Pignatiello (2005) compared the performance of the derived MLE estimator

with EWMA and CUSUM charts. These authors also constructed a confidence set

based on the estimated change point which covers the true process change point with

a given level of certainty using a likelihood function based on the method proposed by

Box and Cox (1964). Recently, Assareh et al. (2011c) have argued the compatibility

of the developed methods in an industrial context for monitoring clinical outcomes

and proposed a series of Bayesian estimators for step change in odds ratio of clinical

outcomes. These estimators were shown to be precise, highly informative and flexible

for change point investigation in the presence of patient mix. The proposed approach

was then extended to develop a Bayesian estimator of time of a drop in the mean

survival time for patients who have undergone cardiac surgery with different pre-operative

risk of death (Assareh and Mengersen, 2011).

It is common to experience other types of change in the process parameters. Bissell

(1984) and Gan (1991, 1992) investigated the performance of CUSUM and EWMA

control charts over linear trends in the process mean. Such drifts can be caused by

tools wearing, spread of infections, learning curve and skill improvement or motivation

reduction that may shift the process parameter over time in industrial

or clinical contexts. MLE estimators of the time when such drifts have occurred were

developed for normal (Perry and Pignatiello Jr, 2006) and Poisson processes (Perry

et al., 2006). In the presence of patient mix, Assareh et al. (2011b) developed a Bayesian

estimator for linear trends that can be applied in conjunction with risk-adjusted control

charts.

In this paper we extend the proposed Bayesian change point estimator to identify the


time of a linear trend disturbance in monitoring survival time. In this scenario, we

model a possible linear trend in the mean survival time of patients following a cardiac

surgery. We analyze and discuss the performance of the Bayesian change point model

through posterior estimates and probability based intervals. Risk-adjusted survival time

CUSUM charts are reviewed in Section 13.3. The change point model is demonstrated

in Section 13.4 and evaluated in Sections 13.5-13.6. We then compare the Bayesian

estimator with the CUSUM built-in estimator in Section 13.7 and summarize the study

and obtained results in Section 13.8.


The survival time of a patient who has undergone cardiac surgery is affected by the rate

of mortality of cardiac surgery within the hospital and also patient covariates such as

age, gender, co-morbidities and so on. Risk-adjusted control charts of time-to-event are

monitoring procedures designed to detect changes in a process parameter of interest,

such as survival time, where the process outcomes are affected by covariates, such as

risk factors. In these procedures, regression models for time are used to adjust control

charts in such a way that the effects of covariates for each input, patient say, would be

eliminated.









$$\delta_i = \begin{cases} 1 & \text{if } x_i \le c \\ 0 & \text{if } x_i > c. \end{cases} \qquad (13.1)$$








manner.










phase I. In this phase, an available dataset of patient records is used assuming that the

process is in-control for that period of time. A set of independent priors can also be

used to obtain posterior estimates of the AFT parameters over the training data.







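$$W_i^{\pm}(t_i, \delta_i \mid u_i) = \left(1 - (\rho^{\pm})^{-\alpha}\right)\left(\frac{t_i \exp(\beta^T u_i)}{\lambda_0}\right)^{\alpha} - \delta_i \alpha \log \rho^{\pm}. \qquad (13.2)$$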

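where it is designed to detect an increase (a decrease) from $\lambda_0$ to $\lambda_1^+ = \rho^+ \lambda_0$ ($\lambda_1^- = \rho^- \lambda_0$). The CUSUM statistics are $Z_i^+ = \max\{0, Z_{i-1}^+ + W_i^+\}$ and $Z_i^- = \min\{0, Z_{i-1}^- - W_i^-\}$.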
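The score and the two-sided recursion can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the study: the covariate is taken to be a scalar Parsonnet score, the bracketed term of the score is assumed to be raised to the power α (as in the Weibull survival function), both statistics start at zero, and the chart is assumed to signal when Z+ exceeds h+ or Z− falls below −h−. The names `rast_score` and `rast_cusum` and the example records are hypothetical.

```python
import math


def rast_score(t, delta, u, beta, alpha, lam0, rho):
    """CUSUM score for one patient: observed time t, event indicator delta,
    scalar risk score u, tuned to a scale change from lam0 to rho * lam0."""
    scaled = (t * math.exp(beta * u) / lam0) ** alpha
    return (1.0 - rho ** (-alpha)) * scaled - delta * alpha * math.log(rho)


def rast_cusum(records, beta, alpha, lam0, rho_up, rho_down, h_up, h_down):
    """Run both recursions over (t, delta, u) records.

    Returns (index of first signal or None, Z+ path, Z- path).
    """
    z_up, z_down = 0.0, 0.0
    up_path, down_path = [], []
    signal = None
    for i, (t, delta, u) in enumerate(records, start=1):
        z_up = max(0.0, z_up + rast_score(t, delta, u, beta, alpha, lam0, rho_up))
        z_down = min(0.0, z_down - rast_score(t, delta, u, beta, alpha, lam0, rho_down))
        up_path.append(z_up)
        down_path.append(z_down)
        if signal is None and (z_up > h_up or z_down < -h_down):
            signal = i
    return signal, up_path, down_path


# Hypothetical toy records: four patients censored at c = 30 with u = 0.
toy = [(30.0, 0, 0.0)] * 4
signal, up_path, down_path = rast_cusum(
    toy, beta=0.0, alpha=1.0, lam0=10.0,
    rho_up=2.0, rho_down=0.5, h_up=4.88, h_down=4.53,
)
```

In this toy run every patient survives the follow-up period although the in-control scale is short, so the upper statistic accumulates while the lower one stays at zero.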






























This structure may be expanded to multiple levels in a hierarchical fashion, resulting in

a Bayesian hierarchical model (BHM). In complicated BHMs it is not easy to obtain the


emergence of Markov chain Monte Carlo (MCMC) methods. In MCMC algorithms a

Markov chain is constructed whose stationary distribution is the posterior distribution

of the parameters. Samples generated from a long run of the Markov chain can then

be used for posterior inferences. Some common MCMC methods for drawing samples

include Metropolis-Hastings and the Gibbs sampler; see Gelman et al. (2004) for more

details.


in a survival time of ti, i = 1, ..., T , that is initially in-control. The observations can be

explained by a survival function S(ti, ui), where the underlying distribution, (f(.)), is

a Weibull distribution with parameters (α0, λ0), and ui is a vector of covariates. At an

unknown point in time, τ , the Weibull scale parameter changes from its in-control state

of λ0 to λ1i, λ1i = λ0 × (1 + k(i − τ)), k ≠ 0. The right censored survival time linear

trend change model can thus be parameterized using a survival function as follows:

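$$S(t_i, u_i) = \begin{cases} \exp\left[-\left(\dfrac{t_i \exp(\beta_0^T u_i)}{\lambda_0}\right)^{\alpha_0}\right] & \text{if } i = 1, 2, \ldots, \tau \\ \exp\left[-\left(\dfrac{t_i \exp(\beta_0^T u_i)}{\lambda_{1i}}\right)^{\alpha_0}\right] & \text{if } i = \tau + 1, \ldots, T \end{cases} \qquad (13.4)$$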
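As a minimal sketch of this piecewise survival function, the code below evaluates the probability of surviving to time t for patient i with a scalar risk score u. The function names are hypothetical, and the values of β0 and α0 used in the example are placeholders chosen for illustration, not the fitted values of the study. The sketch is only valid while the trended scale stays positive.

```python
import math


def trend_scale(i, tau, k, lam0):
    """Weibull scale for patient i: lam0 up to tau, then lam0 * (1 + k * (i - tau))."""
    return lam0 if i <= tau else lam0 * (1.0 + k * (i - tau))


def survival(t, u, i, tau, k, beta0, alpha0, lam0):
    """Probability that patient i with risk score u survives beyond time t."""
    lam = trend_scale(i, tau, k, lam0)
    return math.exp(-((t * math.exp(beta0 * u) / lam) ** alpha0))


# Placeholder parameters (beta0 and alpha0 are illustrative assumptions).
params = dict(tau=500, k=0.005, beta0=0.05, alpha0=0.5, lam0=42133.6)
s_before = survival(30.0, 20.0, i=500, **params)
s_after = survival(30.0, 20.0, i=600, **params)
```

With a positive slope the scale at i = 600 is 1.5 times its in-control value, so the 30-day survival probability after the change point is higher than before it.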


We assign a left-truncated normal prior distribution N(µ = 0, σ = 1)I(0, ) for k, the

magnitude of the slope, where an increase in survival time is detected by the lower

bound h− of the RAST CUSUM. Similarly, the prior is right-truncated N(µ = 0, σ =

1)I(, 0), where a decrease in survival time is detected by the upper bound h+ of the

RAST CUSUM. This setting leads to relatively informed priors for the magnitude of

the slope.

Note that other distributions such as the uniform might also be of interest for k; see

Gelman et al. (2004) for more details on selection of prior distributions. We place a

uniform distribution on the range (1, T − 1) as a prior for τ where T is set to the time

of the signal of the control chart. See the Appendix for the linear trend change point

model code in WinBUGS.

13.5 Evaluation


linear trend estimation following a signal from a RAST CUSUM control chart when a














We apply the same Weibull AFT model to simulate observations coming from the in-

control state of the process. Figure 13.1 shows the estimated survival curves obtained

through the in-control survival time model for patients with a range of different Par-

sonnet scores. As seen, a patient with a low score, u = 10 or below, is highly likely

(p ≥ 0.94) to survive within the follow-up period; see Figure 13.1-1. In contrast for

patients with a score of u = 50 and higher, death is not unlikely within this period

since the risk of death is estimated to be at least 51% for the last day shown in Figure

13.1-2.




Figure 13.1 Estimated survival curves for patients with (1) low to medium and (2) medium to high Parsonnet scores (risks prior to surgery) over the follow-up period of 30 days obtained through the fitted Weibull AFT model to the training survival time data.














To generate the linear trend in λ0, or MST, we then induced trends with slopes of sizes

k = {0.0025, 0.005, 0.01, 0.025, 0.05, 0.1, 0.2, 0.5, 1} as increasing drifts, improve-

ment in MST, and their negative values in a tighter range of k = {−0.0025, −0.005,

−0.01, −0.025, −0.05, −0.1} as decreasing trends, deterioration in MST, and generated

observations until the control charts signalled. Note that study of decreasing trends

is limited since at some point in time λ1i tends to be negative. To avoid this in the

simulation study, the obtained negative values for λ1i were replaced by 1. It is worth

mentioning that, as the control chart is designed to detect such large shifts in MST, no

long sequence of observations coming from the replaced parameter is expected.
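The data-generating step described above can be sketched as follows, using inverse-transform sampling from the Weibull AFT survival function and right censoring at c. This is an illustration only: the function names are hypothetical, β0 and α0 are placeholder values, Parsonnet scores are drawn from an exponential distribution with mean 8.9, and a scale of exactly zero is replaced by 1 along with negative values (an assumption of this sketch).

```python
import math
import random


def trend_scale(i, tau, k, lam0):
    """Scale after a linear trend; non-positive values are replaced by 1."""
    lam = lam0 if i <= tau else lam0 * (1.0 + k * (i - tau))
    return lam if lam > 0 else 1.0


def simulate_patient(i, tau, k, beta0, alpha0, lam0, c, rng):
    """Return (t, delta, u): censored time, event indicator, risk score."""
    u = rng.expovariate(1.0 / 8.9)            # Parsonnet score, mean 8.9
    lam = trend_scale(i, tau, k, lam0)
    e = -math.log(1.0 - rng.random())         # unit exponential draw
    x = lam * math.exp(-beta0 * u) * e ** (1.0 / alpha0)  # solves S(x) = U
    t = min(x, c)                             # right censoring at c
    delta = 1 if x <= c else 0
    return t, delta, u


rng = random.Random(2012)
records = [
    simulate_patient(i, tau=500, k=-0.005, beta0=0.05, alpha0=0.5,
                     lam0=42133.6, c=30.0, rng=rng)
    for i in range(1, 801)
]
```

With k = −0.005 the trended scale reaches zero 200 observations after the change point, which is where the replacement rule takes effect.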

These changes led to different change sizes in in-control estimated survival probability

over days for a patient with ui as well as survival curves between patients with different

Parsonnet scores.

The effects of an increasing trend with a slope of size k = 0.005 and a decreasing

one with a slope of size k = −0.005 in the MST on the probability of survival at the

midpoint, day 15, and the end, day 30, of the follow-up period for all possible Parsonnet

scores for patients who undergo the surgery prior to and 50, 100 and 150 days after the

occurrence of the drifts are demonstrated in Figure 13.2.

As expected, the probability of survival for each patient would increase when an in-

creasing trend in the MST occurred. However the magnitude of this increase is larger

for patients with higher Parsonnet scores, in particular for the midpoint of the follow-up

period. Similar behavior can also be seen for a drift with a negative slope. The magni-

tude of changes in probability of survival following an increasing trend tends to reduce

over time. In contrast, the effect of a decreasing trend in MST on probability of

survival increases over time. This is demonstrated by comparison of the gaps between

lines in Figures 13.2-a and 13.2-b.

It was also found that the resultant magnitude of the shift in the probability of survival

behaves non-constantly over patients with a particular covariate of ui for different days

in the follow-up period.

The magnitude of increases in the probability at the end of period is slightly higher

than those obtained for the midpoint of the period caused by a slope of k = 0.005 in

the MST for patients with Parsonnet scores of less than 61; this is demonstrated by

comparison of the absolute change in probability for the days 15 and 30 of the follow-up

period after 100 observations, i = 600, following the change point in Figure 13.3-1. As

shown for patients with higher scores, the increase in probability for the end of the

follow-up period is less than the midpoint.


Figure 13.2 Estimated probability of survival at the (1) 15th and the (2) 30th day of the follow-up period of 30 days over all Parsonnet scores prior (i = 500) and after (i = {550, 600, 650}) (a) an increasing trend with a slope of size k = 0.005, and (b) a decreasing trend with a slope of size k = −0.005 in the MST.

The same behavior was also observed for a slope of size k = −0.005; however the

superiority of the resultant magnitude of the shift in the probability for the end of the

period tends to decline and falls below the corresponding value for the midpoint

of the period over a wider range of Parsonnet scores; see Figure 13.3-2. Experiencing

larger shifts in the probability of survival following a decreasing drift is also revisited

in Figure 13.3.


We calibrated the RAST CUSUM to detect an increase and a decrease in the MST that

correspond to a halving and a doubling of the odds ratio within the follow-up period and

with an in-control average run length ($\widehat{ARL}_0$) of approximately 10000 observations. As

noted in Section 13.3, for the Weibull AFT model the corresponding odds ratio formula,

discussed by Sego et al. (2009), is not reduced to a closed form of λ0 and ρ± since the

covariate term is not simplified in

$$OR = \frac{O_{i1}}{O_{i0}}, \quad \text{and} \quad O_i = \frac{1 - S(c \mid u_i)}{S(c \mid u_i)}. \qquad (13.5)$$

Figure 13.3 Estimated absolute magnitude of change in probability of survival at the 15th and the 30th day of the follow-up period of 30 days over all Parsonnet scores following (i = 600) (1) an increasing trend with a slope of size k = 0.005, and (2) a decreasing trend with a slope of size k = −0.005 in the MST.


Therefore we used Monte Carlo simulation to estimate the corresponding ρ±. To do

so, we set ρ± such that over 100,000 replications of generating Parsonnet scores from

the fitted exponential distribution with a mean of 8.9 and calculating the odds ratio

in Equation 13.5, the desired odds ratios of size OR = 2 and OR = 0.5 were obtained.

An increase of ρ+ = 4.004 and a decrease of ρ− = 0.255 in the MST were found to

correspond to the desired drop and jump in odds ratio, respectively.
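The calibration of ρ± described above can be sketched as follows. This is only an illustration under stated assumptions, not the code used in this study: the Weibull shape `ALPHA` and the risk-score coefficient `BETA` are placeholder values, the survival function is assumed to take the form implied by the WinBUGS code in the Appendix, the target odds ratio is assumed to be the average of the patient-level odds ratios in Equation 13.5, and a bisection search stands in for whatever search was actually used.

```python
import math
import random

LAMBDA0 = 42133.6   # in-control scale of the survival model (from the text)
C = 30.0            # follow-up period in days (from the text)
MEAN_SCORE = 8.9    # mean of the exponential Parsonnet-score distribution (from the text)
ALPHA = 0.5         # ASSUMED Weibull shape, placeholder only
BETA = 0.1          # ASSUMED risk-score coefficient, placeholder only


def survival(c, u, lam):
    """Assumed AFT form: S(c | u) = exp(-(c * exp(BETA * u) / lam) ** ALPHA)."""
    return math.exp(-((c * math.exp(BETA * u) / lam) ** ALPHA))


def odds_of_death(c, u, lam):
    """O_i = (1 - S(c | u_i)) / S(c | u_i), as in Equation 13.5."""
    s = survival(c, u, lam)
    return (1.0 - s) / s


def mean_odds_ratio(rho, scores):
    """Average of O_i1 / O_i0 when the scale moves from LAMBDA0 to rho * LAMBDA0."""
    total = 0.0
    for u in scores:
        total += odds_of_death(C, u, rho * LAMBDA0) / odds_of_death(C, u, LAMBDA0)
    return total / len(scores)


def calibrate_rho(target_or, scores, lo, hi, tol=1e-6):
    """Bisection on rho; the mean odds ratio decreases as rho increases."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_odds_ratio(mid, scores) > target_or:
            lo = mid   # odds ratio still too large, so a larger rho is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)


rng = random.Random(1)
scores = [rng.expovariate(1.0 / MEAN_SCORE) for _ in range(5000)]
rho_plus = calibrate_rho(0.5, scores, 1.0, 50.0)    # improvement: odds ratio halves
rho_minus = calibrate_rho(2.0, scores, 0.01, 1.0)   # deterioration: odds ratio doubles
print(rho_plus, rho_minus)
```

With the placeholder parameters the resulting values will differ from ρ+ = 4.004 and ρ− = 0.255 reported above; the sketch only shows the mechanics of matching a simulated odds ratio to a target.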

We also used Monte Carlo simulation to determine the decision intervals, h±. However, other approaches may also be considered; see Steiner and Cook (2000) and Sego et al. (2009). This setting led to decision intervals of h+ = 4.88 and h− = 4.53. As two-sided charts were considered, the negative value of h− was used. The associated CUSUM scores were also obtained through Equation (13.2) considering the generated ti, δi and ui.


Table 13.1 Posterior estimates (mode, sd.) of linear trend change point model parameters (τ and k) following signals (RL) from RAST CUSUM ((h+, h−) = (4.88, 4.53)) where λ0 = 42133.6 and τ = 500.

k (true)   RL    τ (mode)   τ (sd.)   k (mode)   k (sd.)
0.005      742   519.67     82.48     0.011      0.28
-0.005     632   503.4      93.02     -0.004     0.006

The linear trends and control charts were simulated in R (http://www.r-project.org). To obtain posterior distributions of the parameters of the trends we used the R2WinBUGS interface (Sturtz et al., 2005) to generate 100,000 samples through MCMC iterations in WinBUGS (Spiegelhalter et al., 2003), with the first 20,000 samples discarded as burn-in, for all change point scenarios. We then analyzed the results using the CODA package in R (Plummer et al., 2010). See the Appendix for the linear trend change point model code in WinBUGS.


To demonstrate the results of Bayesian change point detection in risk-adjusted control charts, we induced two linear trends with slopes of sizes k = 0.005 and k = −0.005, respectively, at time τ = 500 in an in-control process with an overall survival time of λ0 = 42133.6. The RAST CUSUM chart detected the increasing and decreasing drifts and signalled at the 742nd and 632nd observations, corresponding to delays of 242 and 132 observations, as shown in Figures 13.4-a1 and 13.4-b1, respectively. The posterior distributions of the time and the magnitude of the slope were then obtained using MCMC as discussed in Section 13.5. The distribution of the time obtained for the decreasing trend concentrates approximately on the 500th observation, as seen in Figure 13.4-b2. The posterior of the associated slope also concentrates highly on the true magnitude. For the increasing trend, k = 0.005, both posteriors tend to slightly overestimate the time and the magnitude; however, reasonable information can still be obtained (Figure 13.4-a2,a3).

Table 13.1 summarizes the posterior estimates for the above scenarios. If the posterior

was asymmetric and skewed, the mode of the posterior was used as an estimator for

the change point model parameters (τ and k).
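The use of the posterior mode for skewed posteriors can be illustrated with a short sketch. It assumes a simple histogram-based mode (the midpoint of the most populated bin); the estimator actually applied to the MCMC output is not specified in this section, and the gamma-distributed draws below are synthetic stand-ins for posterior samples of τ.

```python
import math
import random
from collections import Counter


def posterior_mode(samples, bin_width):
    """Midpoint of the most populated histogram bin of width bin_width."""
    counts = Counter(math.floor(x / bin_width) for x in samples)
    best_bin = max(counts, key=counts.get)
    return (best_bin + 0.5) * bin_width


# Synthetic right-skewed "posterior" for tau: 480 + Gamma(shape=2, scale=40),
# whose true mode is 520 and whose mean is 560.
rng = random.Random(7)
tau_draws = [480.0 + rng.gammavariate(2.0, 40.0) for _ in range(50000)]

mode_tau = posterior_mode(tau_draws, bin_width=10.0)
mean_tau = sum(tau_draws) / len(tau_draws)
print(mode_tau, mean_tau)
```

For a right-skewed posterior such as this one the mean is pulled towards the long tail, so the mode sits closer to the region of highest posterior density.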


Figure 13.4 Risk-adjusted survival time CUSUM charts ((h+, h−) = (4.88, 4.53)) and obtained posterior distributions of the time τ and the magnitude k of (a1-a3) an increasing trend with a slope of size k = 0.005, and (b1-b3) a decreasing trend with a slope of size k = −0.005 in λ (mean survival time) where λ0 = 42133.6 and τ = 500.

The RAST CUSUM signalled 242 observations after an increasing linear trend (an improvement in MST) with a slope of size k = 0.005 occurred in the mean survival time, whereas the posterior distribution reported the change at the 519th observation. This result implies that although the obtained posterior estimates overestimated the change point, they still performed substantially better than the RAST CUSUM charts.


Table 13.2 Credible intervals for linear trend change point model parameters (τ and k) following signals (RL) from RAST CUSUM ((h+, h−) = (4.88, 4.53)) where λ0 = 42133.6 and τ = 500. Standard deviations are shown in parentheses.

           CI 50%                           CI 80%
k          τ            k                   τ            k
0.005      (486, 562)   (0.003, 0.19)       (431, 625)   (0.002, 0.48)
-0.005     (440, 548)   (-0.007, -0.003)    (399, 599)   (-0.011, -0.001)

The superiority of the Bayesian estimator is even greater for a slope of the same size in the opposite direction, k = −0.005.


lowing signals of the control chart; see Figures 13.4-a3 and 13.4-b3 and Table 13.1.

The slight bias observed in the figures must be considered in the context of their cor-

responding standard deviations.

Comparison of estimates obtained for both slope sizes reveals that the RAST CUSUM chart and the Bayesian estimator perform better over decreasing trends, that is, deterioration in MST. Although a shorter run of observations from the out-of-control state of the process is used when a decreasing trend occurs, more accurate posteriors are obtained.







posterior distributions. Under the same probability of 0.5, the CI for the time of the

slope of size k = 0.005 covers 66 observations around the 500th observation whereas it

increases to 88 observations for k = −0.005 due to the larger standard deviation; see

Table 13.1. In both scenarios, the true change point was in both CIs.

As shown in Table 13.1 and discussed above, the magnitudes of the slopes are also

estimated reasonably well. Table 13.2 shows that in all cases the real sizes of the slopes

are contained in the respective posterior 50% and 80% CIs.
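The coverage statement above can be checked directly against the intervals in Table 13.2. The sketch below also shows one way such intervals could be computed from posterior draws; it assumes equal-tailed intervals based on order statistics, which may differ from the interval type used to produce Table 13.2.

```python
import math


def credible_interval(samples, prob):
    """Equal-tailed interval from order statistics (an assumed interval type)."""
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - prob) / 2.0
    lower = ordered[math.floor(tail * (n - 1))]
    upper = ordered[math.ceil((1.0 - tail) * (n - 1))]
    return lower, upper


def covers(interval, value):
    """True if value lies inside the closed interval."""
    lower, upper = interval
    return lower <= value <= upper


TRUE_TAU = 500

# Intervals copied from Table 13.2, keyed by (true k, probability):
# (interval for tau, interval for k)
TABLE_13_2 = {
    (0.005, 0.5): ((486, 562), (0.003, 0.19)),
    (0.005, 0.8): ((431, 625), (0.002, 0.48)),
    (-0.005, 0.5): ((440, 548), (-0.007, -0.003)),
    (-0.005, 0.8): ((399, 599), (-0.011, -0.001)),
}

all_cover = all(
    covers(tau_ci, TRUE_TAU) and covers(k_ci, k)
    for (k, _), (tau_ci, k_ci) in TABLE_13_2.items()
)
print(all_cover)
```

In practice `credible_interval` would be applied to the retained MCMC draws of τ and k for each scenario.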



Table 13.3 Probability of the occurrence of the change point in the last {25, 50, 100, 150, 200, 300, 400} observations prior to signalling for RAST CUSUM ((h+, h−) = (4.88, 4.53)) where λ0 = 42133.6 and τ = 500.

k        25     50     100    150    200    300    400
0.005    0.00   0.01   0.04   0.15   0.37   0.85   0.97
-0.005   0.00   0.08   0.25   0.50   0.70   0.92   0.97



in the control charts. For a trend with a slope of size k = 0.005 in the mean survival time, since the RAST CUSUM signals late (see Table 13.1), it is unlikely that the change point occurred in the last 150 observations. A considerable growth in the probability is seen when the next 150 observations are included, reaching 0.85, whereas for the decreasing trend, the probability mass located between the last 100 and 200 observations is noticeably high, 0.45.
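The probabilities in Table 13.3 are simple functionals of the posterior of τ. A minimal sketch follows, assuming that "the last m observations prior to signalling" means the window from RL − m to RL; the posterior draws used here are synthetic and purely illustrative.

```python
def prob_change_in_last(tau_samples, run_length, m):
    """Posterior probability that tau lies within the last m observations before the signal."""
    hits = sum(1 for tau in tau_samples if run_length - m <= tau <= run_length)
    return hits / len(tau_samples)


# Synthetic draws spread evenly over observations 432..631, with a signal at RL = 632.
tau_draws = list(range(432, 632))
p_last_100 = prob_change_in_last(tau_draws, run_length=632, m=100)
p_last_200 = prob_change_in_last(tau_draws, run_length=632, m=200)

# Mass between the last 100 and 200 observations for k = -0.005, from Table 13.3.
mass_100_to_200 = 0.70 - 0.25
print(p_last_100, p_last_200, round(mass_100_to_200, 2))
```

The final line reproduces the value of 0.45 quoted above as the difference between two cumulative entries of Table 13.3.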



datasets, for different trends in λ0, we replicated the simulation method explained in

Section 13.5 100 times. This replication allows us to have a distribution of estimates

with standard errors of the order of 10. The number of replications is a compromise

between computational time and posterior estimation of the expected value and par-

ticular tail probabilities. Table 13.4 shows the average of the estimated parameters

obtained from the replicated datasets where there exists a linear trend in λ0 with a

slope of size k.

As seen, the RAST CUSUM control chart tends to detect large increasing drifts in the process, induced by a linear trend in the MST with a slope of size k = 0.1 and higher, with shorter delays than for lower values of positive slopes. For large slopes of k = 0.2 and more, the chart signals with a delay of at most 175 observations. This delay increases over moderate increasing trends in λ0, reaching 309 observations for k = 0.025. Over increasing trends with a small slope of size k = 0.01 and less, the chart fails since a long delay of more than 448 observations is associated with the signal. This delay reaches 657 for k = 0.0025.


Table 13.4 Average of posterior estimates (mode, sd.) of linear trend change point model parameters (τ and k) for a change in the mean survival time following signals (RL) from RAST CUSUM ((h+, h−) = (4.88, 4.53)) where λ0 = 42133.6 and τ = 500. Standard deviations are shown in parentheses.

                            Change point                     Slope size
k         E(RL)             E(τ)             E(στ)           E(k)              E(σk)
1.0       635.0 (36.5)      492.6 (28.9)     29.2 (6.5)      0.668 (0.128)     0.614 (0.013)
0.5       653.6 (40.3)      499.2 (34.2)     37.5 (11.0)     0.451 (0.222)     0.581 (0.038)
0.2       675.7 (54.1)      502.8 (53.2)     60.9 (32.7)     0.234 (0.238)     0.556 (0.051)
0.1       722.6 (67.0)      519.0 (67.3)     72.5 (32.1)     0.129 (0.137)     0.538 (0.056)
0.05      766.6 (90.5)      533.4 (86.5)     92.8 (42.2)     0.070 (0.080)     0.528 (0.054)
0.025     809.9 (113.1)     566.2 (107.5)    104.7 (38.8)    0.035 (0.040)     0.527 (0.053)
0.01      948.0 (154.2)     624.1 (145.3)    117.3 (46.7)    0.019 (0.036)     0.534 (0.065)
0.005     1064.8 (222.1)    682.2 (181.2)    162.9 (65.1)    0.008 (0.015)     0.521 (0.069)
0.0025    1157.5 (374.3)    734.4 (212.5)    171.4 (68.4)    0.007 (0.013)     0.537 (0.059)
-0.0025   822.6 (86.3)      575.7 (102.0)    176.7 (45.3)    -0.0024 (0.001)   0.032 (0.105)
-0.005    684.0 (64.3)      531.9 (95.4)     84.6 (57.7)     -0.0048 (0.012)   0.048 (0.130)
-0.01     599.3 (50.4)      505.8 (81.8)     51.2 (46.2)     -0.010 (0.029)    0.029 (0.077)
-0.025    544.0 (33.3)      489.4 (53.1)     21.0 (18.7)     -0.022 (0.045)    0.025 (0.054)
-0.05     525.0 (21.7)      487.8 (20.6)     15.5 (11.0)     -0.056 (0.069)    0.041 (0.044)
-0.1      516.2 (10.8)      484.6 (10.2)     7.3 (4.3)       -0.110 (0.083)    0.049 (0.031)

This behavior of the chart is also consistent over deterioration in mean survival time induced by linear trends with medium to small negative slopes. Having said that, the RAST CUSUM performs better where there exists a decreasing trend. For a trend with a medium slope of size k = −0.05 in the MST, a short delay of only 25 observations is associated with the obtained signal on average, while it is 266 observations for an equivalent increasing trend. The delay reaches 184 observations for a small slope scenario of k = −0.005, which is still far less than the delay of 564 observations obtained for k = 0.005. In the worst case, the difference between delays is 335 observations for k = ±0.0025. This superiority of the chart over decreasing trends is also reflected in the obtained precisions.

This performance can be explained by the nature of censored data as well as the different

effect of linear trends with negative and positive slopes on the survival model. Since

survival times are right censored, the effect of improvements in the process is less

observable and detectable than deteriorations. In other words, the data obtained after

an increasing trend in the MST is less informative than those obtained following a

decreasing trend. It was also shown in Figure 13.3 and discussed in Section 13.5 that a

trend with a negative slope has a larger impact on the probability of the survival and

the observed survival time after the time of the change than a trend with a positive

slope of the same size.

As noted in Section 13.5, decreasing trends with large slopes are not common and are therefore not investigated here, since λ quickly becomes negative.

For an increasing trend with a large slope in the MST, k of size 0.2 or more, the average of the modes, E(τ), estimates the time of the change accurately, since at worst a bias of eight observations is associated with k = 1.0. This estimator outperforms the chart signal, with less bias over dramatic drifts.

For moderate to gradual increasing trends, the posterior mode tends to overestimate the true change point with more bias than for drifts with large slopes. However, this bias is noticeably less than that obtained from the chart’s signal. For medium slopes, the bias reaches 66 observations where k = 0.025 and still outperforms the chart’s signal with a delay of 309 observations. Over increasing trends, this bias is at most 234 observations for a very small slope of k = 0.0025, whereas 657 observations is the associated delay based on the chart’s signal. As the magnitude of the slope in increasing trends in MST is reduced, more bias is associated with the Bayesian estimates.
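The delays and biases quoted above follow directly from Table 13.4 by subtracting the true change point τ = 500. The short calculation below reproduces them; only the E(RL) and E(τ) columns of the table are used, and the helper names are introduced here for illustration.

```python
TRUE_TAU = 500

# (E(RL), E(tau)) by slope k, copied from Table 13.4
TABLE_13_4 = {
    1.0: (635.0, 492.6),
    0.5: (653.6, 499.2),
    0.2: (675.7, 502.8),
    0.1: (722.6, 519.0),
    0.05: (766.6, 533.4),
    0.025: (809.9, 566.2),
    0.01: (948.0, 624.1),
    0.005: (1064.8, 682.2),
    0.0025: (1157.5, 734.4),
    -0.0025: (822.6, 575.7),
    -0.005: (684.0, 531.9),
    -0.01: (599.3, 505.8),
    -0.025: (544.0, 489.4),
    -0.05: (525.0, 487.8),
    -0.1: (516.2, 484.6),
}


def chart_delay(mean_run_length):
    """Average delay of the chart signal relative to the true change point."""
    return mean_run_length - TRUE_TAU


def estimator_bias(mean_tau):
    """Average bias of the posterior-mode estimate of the change point."""
    return mean_tau - TRUE_TAU


summary = {
    k: (chart_delay(rl), estimator_bias(tau))
    for k, (rl, tau) in TABLE_13_4.items()
}
for k, (delay, bias) in summary.items():
    print(f"k = {k:>8}: chart delay {delay:6.1f}, estimator bias {bias:6.1f}")
```

In every row the absolute bias of the Bayesian estimate is smaller than the delay of the chart signal, which is the comparison drawn in the text.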

Table 13.4 reveals that although the Bayesian estimator of time, E(τ), tends to overestimate the time of the change over gradual decreasing trends, it outperforms the chart’s signal with shorter delays. A bias of 31 observations is associated with the proposed estimator for k = −0.005, whereas it is 184 observations for the RAST CUSUM. For moderate decreasing trends in MST, the Bayesian estimator tends to underestimate the true time of the change. In the most extreme scenario, k = −0.1, where the worst performance of the Bayesian estimator is seen, the associated bias is still about equal to the delay based on the control chart’s signal; yet better precision is associated with the Bayesian estimator.

Table 13.4 shows that the Bayesian estimator behaves in the same manner as the chart, performing better over deteriorations. The Bayesian estimator of time, E(τ), tends to overestimate the time of the change over moderate decreasing trends. However, the obtained biases are far less than those reported for equivalent increasing trends. This bias reaches 75 observations for k = −0.0025 and is significantly less than the bias of 234 observations observed for the same trend in the opposite direction.

Table 13.4 indicates that the average of the Bayesian estimator of the magnitude of the slope, E(k), identifies change sizes with some bias. For large positive slopes, this estimator tends to underestimate the slope size, whereas it overestimates the size for moderate to small increasing trends. Having said that, Bayesian estimates of the magnitude of the change should be studied in conjunction with their corresponding standard deviations; in this respect, analysis of credible intervals is effective. This estimator behaves more accurately and precisely over deteriorations in MST. Although the direction of the observed small biases is not consistent over small to moderate slopes, in all scenarios highly informative estimates are obtained, particularly compared to the equivalent estimates obtained for trends with positive slopes.


ods

To study the performance of the proposed Bayesian estimators in comparison with that introduced in Section 13.2, we ran the available alternative, the built-in estimator of the CUSUM chart, using the replications discussed in Section 13.6.

Based on the suggestion by Page (1954), if an increase in a process rate is detected by CUSUM charts, an estimate of the change point is obtained through $\tau_{cusum} = \max\{i : Z_i^{+} = 0\}$. Similarly, for detection of a decrease, the estimated change point is $\tau_{cusum} = \max\{i : Z_i^{-} = 0\}$.
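The built-in estimator can be sketched for a generic upper one-sided CUSUM. The recursion Z_i = max(0, Z_{i−1} + W_i), the convention Z_0 = 0 and the signalling rule Z_i > h are assumptions made for illustration; the scores W_i of the RAST CUSUM itself come from Equation (13.2), which is not reproduced here, and the lower side is handled analogously.

```python
def builtin_change_point(scores, h):
    """Run an upper one-sided CUSUM and return (signal_index, tau_cusum).

    Indices are 1-based. tau_cusum is the last index at which the statistic
    was zero before the signal; signal_index is None if no signal occurs.
    """
    z = 0.0
    last_zero = 0  # Z_0 = 0 by convention
    for i, w in enumerate(scores, start=1):
        z = max(0.0, z + w)
        if z == 0.0:
            last_zero = i
        if z > h:
            return i, last_zero
    return None, last_zero


# Hypothetical CUSUM scores: in control up to observation 4, drifting afterwards.
example_scores = [-1.0, -1.0, 0.5, -1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
signal_at, tau_cusum = builtin_change_point(example_scores, h=4.88)
print(signal_at, tau_cusum)
```

Here the statistic last touches zero at observation 4 and crosses the decision interval at observation 9, so the built-in estimate places the change immediately after observation 4.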

Table 13.5 Average of detected time of a linear trend in the mean survival time obtained by the Bayesian estimator (τb) and CUSUM built-in estimator following signals (RL) from RACUSUM ((h+, h−) = (5.85, 5.33)) where λ0 = 42133.6 and τ = 500. Standard deviations are shown in parentheses.

k         E(RL)             E(τcusum)        E(τb)
1.0       635.0 (36.5)      449.0 (48.7)     492.6 (28.9)
0.5       653.6 (40.3)      454.1 (94.8)     499.2 (34.2)
0.2       675.7 (54.1)      456.7 (133.4)    502.8 (53.2)
0.1       722.6 (67.0)      460.7 (117.2)    519.0 (67.3)
0.05      766.6 (90.5)      475.1 (136.2)    533.4 (86.5)
0.025     809.9 (113.1)     482.3 (132.0)    566.2 (107.5)
0.01      948.0 (154.2)     590.5 (177.5)    624.1 (145.3)
0.005     1064.8 (222.1)    609.4 (247.1)    682.2 (181.2)
0.0025    1157.5 (374.3)    681.2 (351.1)    734.4 (212.5)
-0.0025   822.6 (86.3)      655.3 (125.0)    575.7 (102.0)
-0.005    684.0 (64.3)      560.4 (120.4)    531.9 (95.4)
-0.01     599.3 (50.4)      510.3 (98.8)     505.8 (81.8)
-0.025    544.0 (33.3)      488.4 (80.4)     489.4 (53.1)
-0.05     525.0 (21.7)      484.0 (48.6)     487.8 (20.6)
-0.1      516.2 (10.86)     482.6 (40.7)     484.6 (10.2)


provided by the built-in estimator of CUSUM, τcusum charts for shifts in the mean

survival time, λ0 say.

The built-in estimator of CUSUM charts outperforms the associated signals over all drifts in the MST, except k = −0.1; however, it tends to significantly underestimate the exact change point when the magnitude of the slope increases over moderate to dramatic improvement trends in MST, k = 0.025 and higher. This estimator behaves in the same manner over decreasing trends with medium slopes, k = −0.025 and larger magnitudes of slopes. It has been discussed in Section 13.6 that the RAST CUSUM has a better performance over decreasing trends; this finding persists for the built-in estimator since less bias and higher precision are associated with the signals over deteriorations. For gradual linear trends, increasing or decreasing, the built-in estimator overestimates the change point, yet shorter delays are seen over deteriorations. For k = −0.005 the delay is 60 observations, compared with a delay of 109 observations associated with the change point estimate for k = 0.005.

Although the Bayesian estimator, τb, also tends to underestimate the time of changes over large slopes in increasing trends in MST, k = 0.2 or more, as well as medium slopes in decreasing trends, k = −0.025, it outperforms the built-in estimator, τcusum, with smaller biases of 8 and 16 observations compared to 51 and 18 observations for k = 1.0 and k = −0.1, respectively.

The posterior mode tends to overestimate the true change point over drifts with small

negative slopes and moderate positive slopes, yet it estimates more accurately than the

alternative reporting estimates with larger delays. In the only exceptional scenarios,

increasing trends with slopes of sizes k = 0.005 and k = 0.0025, where less bias is

associated with the built-in estimator, no significant superiority is gained when the

obtained variation of the estimates is also taken into account. Comparison of variation

of estimated change points across other scenarios of shifts in the mean survival time

also supports the superiority of the Bayesian estimator over the alternative.

13.8 Conclusion









In this paper, using a Bayesian framework, we modeled change point detection in-

duced by linear trends in time-to-event data for a clinical process with dichotomous

outcomes, death and survival, where patient mix was present. We considered a range

of increasing and decreasing trends in the mean survival time of an in-control pro-

cess. We constructed Bayesian hierarchical models and derived posterior distributions

for change point estimates using MCMC. The performance of the Bayesian estimators

was investigated through simulation in conjunction with risk-adjusted survival time

CUSUM control charts for monitoring right censored survival time of patients who

underwent cardiac surgery procedures within a follow-up period of 30 days. Here the

severity of risk factors prior to the surgery was evaluated by the Parsonnet score. The

results showed that the Bayesian estimates significantly outperform the RAST CUSUM control charts in change detection over different magnitudes of slopes of linear trends in the mean survival time. We then compared the Bayesian estimator with the built-in estimator of CUSUM. The Bayesian estimator was found to perform reasonably well and to outperform the alternative.






plicated change scenarios such as linear and nonlinear trends in survival time, relief






















Acknowledgment



Appendix

Change point model code for survival time: Estimation of an increasing

trend in the mean survival time

model {



gamma[i] = pow(exp(beta0 * riskscore[i])/
(lambda0+step(i-change)*lambda0*(k*(i-change))),alpha) }

# gamma[i] = pow(exp(beta*riskscore[i])/lambda2[i],alpha) # For decreasing trend scenario

# lambda2[i] = max((lambda0+step(i-change)*lambda0*(k*(i-change))),1) # For decreasing trend scenario

RL = RLcusum-1

k ~ dnorm(0,1)I(0,)

#k ~ dnorm(0,1)I(0,1) # For decreasing trend scenario
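The linear trend encoded in the model code above can be traced with a short Python sketch. It mirrors the expression lambda0 + step(i − change) · lambda0 · k · (i − change), with step(e) equal to 1 when e ≥ 0 and 0 otherwise, as in WinBUGS, and it applies the floor of 1 used in the commented decreasing-trend lines. The function names are introduced here for illustration only.

```python
LAMBDA0 = 42133.6  # in-control value used throughout the chapter
CHANGE = 500       # true change point tau used in the simulations


def step(x):
    """WinBUGS-style step function: 1 if x >= 0, otherwise 0."""
    return 1 if x >= 0 else 0


def trend_lambda(i, k, lambda0=LAMBDA0, change=CHANGE, floor=None):
    """Scale parameter at observation i under a linear trend of slope k."""
    lam = lambda0 + step(i - change) * lambda0 * (k * (i - change))
    return lam if floor is None else max(lam, floor)


lam_before = trend_lambda(400, k=0.005)                    # before the change
lam_up_600 = trend_lambda(600, k=0.005)                    # 100 observations after, increasing
lam_down_700 = trend_lambda(700, k=-0.005)                 # decreasing trend reaches zero
lam_down_750 = trend_lambda(750, k=-0.005, floor=1)        # floored as in the lambda2 line
print(lam_before, lam_up_600, lam_down_700, lam_down_750)
```

With k = −0.005 the unfloored value reaches zero 200 observations after the change, which illustrates why steeper decreasing trends quickly drive λ negative and why the decreasing-trend variant of the model bounds it below by 1.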


Bibliography

Assareh, H. and Mengersen, K. (2011). Bayesian estimation of the time of a decrease in

risk-adjusted survival time control charts. IAENG International Journal of Applied

Mathematics, 41(4):360–366.



20(3):207–222.

Assareh, H., Smith, I., and Mengersen, K. (2011b). Bayesian estimation of the time

of a linear trend in risk-adjusted control charts. IAENG International Journal of

Computer Science, 38(4):409–417.

Assareh, H., Smith, I., and Mengersen, K. (2011c). Change point detec-

tion in risk adjusted control charts. Statistical Methods in Medical Research,

doi:10.1177/0962280211426356.


















17(2):119–124.




84.


Chapman & Hall/CRC.




38(2):124–136.



102(477):140–152.




Papers, 46(1):47–64.




















coda. Citeseer.
















in Medicine, 29(4):444–454.







24(6):721–735.

CHAPTER 14

Conclusion

This research aimed mainly to promote the application and translation of well-established concepts and techniques of industrial SQC in a health context, and to develop new methods that satisfy health-related characteristics and needs by employing the Bayesian approach. To this end, research questions were raised in a practical manner during the implementation of quality control programs within either a local hospital or similar clinical centers. A series of research studies was developed and followed through Chapters 3-13 to meet the outlined objectives. In this chapter, the findings and contributions from all studies of this thesis are summarized and mapped back to the original research objectives and goals outlined in Chapter 1. This is followed by a discussion of possible future research.

14.1 Research Findings

To summarize and map the findings we first categorize the achievements of the studies

under the objectives stated in Chapter 1, and then revisit the contributions of each

study.


14.1.1 Objective 1: Dataset Quality Evaluation

This objective targeted the data collection and preparation required for monitoring purposes in a clinical setting. In quality control practice, historical data are employed to study the behavior of clinical processes and to establish in-control and out-of-control states. This effort mainly contributes to phase I of control charting, in which the control chart parameters are estimated.

In Chapter 3, we adapted well-established acceptance sampling plans and control charting methods to evaluate and improve data quality in collecting clinical data and building databases in a local hospital. The special characteristics of clinical data management procedures were discussed, and elements of procedures from an industrial context of quality control were modified and fitted accordingly. Application of the promoted techniques led to a reduction in modification costs through economical design of data inspection. It also led to an improvement in data quality, with respect to the rate of errors, by monitoring ongoing data collection processes and implementing a quality improvement cycle.

In the more specific case of using historical data for control charting purposes, when construction of a risk model underlying observed mortality among patients admitted to the ICU of a local hospital was required for monitoring ICU outcomes, the quality of the data and the associated modification costs were major concerns. In Chapter 4, in an optimization framework, a comprehensive algorithm was developed in which the trade-off between the accuracy of the obtained risk model and the associated modification costs is investigated recursively. The benefits of the algorithm include the integration of value of information theory and associated Bayesian components. For example, the predictive posterior distribution of modification costs and utility theory quantitatively express the statistical contribution of the data being captured in the risk model construction. The algorithm was applied to calibrate an available APACHE II risk model in a local hospital, and the economical and statistical achievements obtained by this application were reported. Moreover, the flexibility and generality of the proposed algorithm in sample size determination for complex statistical models were discussed, and a series of possible extensions and applications was also addressed.


14.1.2 Objective 2: Control Charts Application and Development

This objective aimed to explore recent developments and paradigms in the industrial area and adapt them to current control charting methods in a clinical context to improve their practical capabilities in the health sector. In this setting, control charts and associated components borrowed from an industrial context were adapted, modified and developed to meet the specific characteristics of healthcare surveillance. Again, a Bayesian approach was considered as the framework for this development.

Following the adaptation of attribute control charts in clinical data management processes in Chapter 3, multivariate charting methods for continuous measures were considered in Chapter 5 for monitoring correlated clinical quality characteristics. Common data issues such as data incompleteness were raised and handled using various imputation techniques. The performance of the multivariate charting and imputation methods was investigated and discussed, and the best performers were identified through simulated scenarios.

In Chapter 6, in a general context of control charting, a Bayesian approach was proposed

as an alternative for the well-known problem of change point estimation following an

out-of-control signal from a control chart. This post-signal effort aims to facilitate root

causes analysis and enhance efficiency of quality improvement activities. A Bayesian

estimator was developed and posterior distributions of the time and the magnitude

of changes in a Poisson rate, here a realization of monitoring the number of failures

of clinical instruments in a local hospital, were obtained using MCMC methods. The

capabilities of the Bayesian framework in construction of probabilistic inferences, the

ease of extension to complex change scenarios and the ease of computations were highlighted. It was shown that more accurate and precise estimates can be obtained when the proposed Bayesian estimators are applied to Poisson control charts. The

comparison study also supported the Bayesian estimator as a strong alternative in the

field of change point estimation in control charting. In this setting a Bayesian model

selection criterion was also proposed to distinguish the underlying type of change of

detected shifts. The Bayesian estimator was then extended to capture the number of

change points prior to the control chart’s signal as a variable. In this extended study

in Chapter 7, using reversible jump MCMC methods the performance of the Bayesian


estimator was investigated over several scenarios of multiple change points, including

monotonic and non-monotonic step changes. The results supported employment of the

Bayesian estimator in conjunction with control charts. Comparison with competitors

showed that the Bayesian estimators were preferred with respect to statistical criteria

as well as computational aspects such as flexibility and generalisation.

In Chapter 8, the potential benefits of post-signal change point investigation were comprehensively illustrated and discussed in a clinical setting. Following the proposed

Bayesian approach and achieved results in Chapters 6 and 7, a Bayesian estimator was

developed to study the true time of detected changes by Bernoulli EWMA and CUSUM

control charts in excess blood product usage for each patient undergoing CABG and ad-

verse cardiac event outcome of patients undergoing PTCA. This study validated change

point estimation in quality control programs in a clinical context since the obtained

estimates coincided with the expected time of change in the process due to known

potential causes.

Following the benefits of post-signal change point investigation in monitoring hospital

outcomes shown in Chapter 8 and also the advantages of the Bayesian framework in

change point estimation discussed in Chapters 6 and 7, in Chapter 9 a Bayesian change

point estimator was developed to identify the true time of a step change in the odds ratio

of death in an ICU detected by risk-adjusted control charts in which binary observations

were adjusted using a risk model. This study considered the special characteristics

of patient mix in development and modification of control charting methods in the

healthcare context. The proposed estimators were found highly informative when they

were used in conjunction with risk-adjusted CUSUM and EWMA control charts and

were superior in comparison with alternatives.

In Chapter 10 the above study was replicated for the case in which the odds ratio of

death was unstable over time and could be explained by a linear function. This study

also supported the Bayesian post-signal change point approach since more accurate and

precise estimates of change point parameters could be achieved. The superiority of

the Bayesian estimator over alternatives was also maintained in this study.

In Chapter 11 a Bayesian estimator was developed for a non-binary outcome, notably to identify the time of a decrease in mean survival time of patients who have undergone cardiac surgery. This model considered the pre-operative covariates contributing to surgery outcomes and handled censored observations since patients were followed for a limited period of time after the surgery. This model was then extended in Chapter 12 to

estimate a wider range of step change scenarios in mean survival time. The simulation

study supported the employment of the proposed Bayesian estimator after signalling

of risk-adjusted survival time CUSUM control charts since more accurate and precise

estimates of change point parameters can be obtained even in comparison with the built-

in estimator of CUSUM. The proposed Bayesian change point model was extended in

Chapter 13 to the case in which a linear trend exists in survival time after cardiac

surgery. The appealing characteristics of the Bayesian estimator and its superiority

over alternatives were also maintained in this study.

14.1.3 Contribution to Application

The adaptation of acceptance sampling plans and control charting methods in eval-

uation and improvement of clinical data quality contributed to Application since the

body of knowledge was transferred across sectors in Chapter 3. In the same manner,

in Chapter 5 the application of developed multivariate control charts was considered

in a clinical setting. This contribution to application was followed by importing the

change point estimation paradigm from an industrial context of quality engineering in

healthcare surveillance in Chapter 8. In this study the benefits of such investigation in

monitoring hospital outcomes were illustrated. Within Chapters 9 to 13 several risk-adjusted control charts were constructed and applied to real and simulated data and

the performance of the charts was comprehensively investigated in detection of various

shift scenarios. These components of investigations were also seen as contributions to

Application.

14.1.4 Contribution to Method

In Chapter 4 using a Bayesian framework, an algorithm was developed to determine

optimal sample size for construction of a logistic regression risk model. Compared to


alternative models, both economical and statistical concerns were considered simulta-

neously. The proposed algorithm was able to capture more complexity and was also

superior with respect to practical and economical criteria.

In Chapter 6 a Bayesian estimator was developed for change point estimation in control

charts. It was shown that the Bayesian estimator was a strong alternative considering

statistical as well as practical criteria. This estimator was extended over various change

scenarios including step change, linear trend and multiple changes. In Chapter 7 the

Bayesian change point model was advanced using reversible jump MCMC to handle

multiple change scenarios in which the number of changes was unknown.

The proposed Bayesian framework was considered in a clinical setting to model and

estimate the time of changes in binary hospital outcomes in Chapter 8. In Chapter 9 a

Bayesian change point estimator was developed for estimation of a step change in odds

ratios of hospital outcomes in the presence of patient mix. This estimator was then

modified to handle linear trends in the odds ratio in Chapter 10. Following obtained

results, in Chapters 11 and 12 a Bayesian estimator was developed to identify the time

of a step change in mean survival time observed after a clinical intervention which is

being monitored by risk-adjusted survival time control charts. This estimator was then

extended in Chapter 13 to estimate the time of a linear trend in mean survival time.
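For reference, one common way to write a step change in the odds ratio under risk adjustment is sketched below. The notation is an assumption made for illustration and is not taken from Chapters 9 to 13: $p_{0i}$ denotes the baseline risk of patient $i$ predicted by the risk model, $\delta$ the odds ratio after the change, $\tau$ the change point and $T$ the number of monitored patients.

```latex
y_i \sim
\begin{cases}
\mathrm{Bernoulli}(p_{0i}), & i = 1, \dots, \tau, \\
\mathrm{Bernoulli}(p_{1i}), & i = \tau + 1, \dots, T,
\end{cases}
\qquad
\frac{p_{1i}}{1 - p_{1i}} = \delta \, \frac{p_{0i}}{1 - p_{0i}}
\;\Longleftrightarrow\;
p_{1i} = \frac{\delta \, p_{0i}}{1 - p_{0i} + \delta \, p_{0i}} .
```

Under this form, $\delta = 1$ corresponds to no change, and patient mix enters the model only through the individual risks $p_{0i}$.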

All of the above model developments and extensions were categorized as contributions

to Method.

14.2 Research Summary and Remarks

Integrating the findings obtained and discussed across the chapters meets the general

aim of this thesis in taking the opportunities for adaptation and exchange of SQC knowl-

edge between sectors, mainly transferring paradigms and techniques from an industrial

context to a healthcare context. From this perspective, acceptance sampling plans and

statistical process control tools were adapted for clinical data quality evaluation and

improvement purposes. Multivariate charting methods and post-signal change point

estimation were also adapted to enhance efficiency of improvement programs which

affect quality of hospital outcomes.

The benefits of the Bayesian approach and its hierarchical structures and computational

methods in reducing the difficulties of model development in the complex environment

of monitoring hospital outcomes, incorporating expert’s knowledge as well as histori-

cal data, and providing highly informative results for decision making purposes, were

reported and highlighted in several studies. A summary of research components and

findings is outlined in Table 14.1.

The proposed change point estimators have been developed in conjunction with a lim-

ited but well-known set of control charting procedures. These can be modified and

applied for other charting techniques in a clinical setting including variable life ad-

justed display and resetting sequential probability ratio charts to facilitate root causes

investigations. In Chapter 8 we illustrated the potential benefits of change point inves-

tigation in root causes analysis following signals of two clinical procedures; however,

identification of causes needs a comprehensive investigation of all possible factors and

potential contributors including data errors, patient mix, resources and process of care

(Lilford et al., 2004). In this investigation observing a lag between causes and effects

and dealing with a complex system of causes are expected (Hay and Pettitt, 2001; Vin-

cent, 2003). In some cases a detected change only represents the behavior of a complex

system in a clinical setting (Galea et al., 2010). Such investigations lead to efficient

interventions in the clinical system and process of care in hospitals, thus enhancing

patient-based outcomes.

Some of the developed models and tools across research components were implemented

at SAWMH. The results of the research and practical considerations regarding their

implementation were discussed. Due to some limitations in available datasets and

data collection processes at SAWMH, Bayesian change point estimators developed for

Poisson, risk-adjusted binary and survival time control charts within Chapters 6-7 and

9-13, were investigated using simulated datasets and a range of change patterns. That

said, all datasets were simulated based on historical data collected at SAWMH or on

well-known studies, and can be considered realizations of processes observed in practice.

It is worthwhile to extend the implementation of the developed models and to study

their performance in processes where practical complications such as over-dispersion

and departure from assumptions about change patterns exist.

14.3 Future Research

As discussed in each component of this thesis, extensions and development of the proposed

and adapted methods can be pursued. In addition to these immediate potential

extensions, opportunities and areas for further research were also raised and addressed

in the reviewed body of literature in Chapter 2. Here, possible developments of the

research are summarized and discussed and then some of the major areas for future

work related to the context of this thesis are highlighted.

14.3.1 Immediate Research

The application of acceptance sampling plans and statistical process control tools for

evaluation and improvement of clinical data quality was explained and demonstrated in

the context of a local hospital. As discussed in Chapter 3, it is worthwhile to incorporate

statistical methods for risk assessment of data errors (Win et al., 2004; Hasan and

Padman, 2006) as well as error detection methods (Nosanchuk and Gottmann, 1974).

A study of the integration of the discussed tools into information technology-based platforms

for data collection within and across clinical centers during patient care and clinical

trials, such as Electronic Data Capture (EDC) and Electronic Medical Record (EMR)

systems, is also of further research interest.

In Chapter 4 an economical sample size determination algorithm was proposed. The

flexibility and ease of generalization of the developed method were also addressed. It

would be worth investigating this algorithm with other optimal seeking procedures

such as acceptance sampling plans (Montgomery, 2008) and statistically defined criteria

(Concato et al., 1995; Peduzzi et al., 1995) for sample size determination. An advanced

version of the approach could be applied in model selection when all features of the

value of information theory have been incorporated into the algorithm (Winkler, 2003).

Table 14.1 Summary of research components.

| Chapter | Chapter title | Context | Process | Developed model | Application | Outcome |
|---|---|---|---|---|---|---|
| 3 | Data Quality Improvement in Clinical Databases | Healthcare | Clinical data collection | Adoption of acceptance sampling plans and quality improvement cycle from an industrial context | SAWMH - ICU and radiation metrics data | Improvement in data quality and reduction in associated costs |
| 4 | An Economical Sample Size Determination Algorithm | Healthcare | Data modification and control chart construction | Development of a data capturing algorithm using utility and value of information theories from economy and Bayesian contexts | SAWMH - Calibration of ICU (APACHE II) risk model | Determination of optimized sample size considering data modification and correction costs |
| 5 | Implementation of Multivariate Control Charts | Healthcare | Monitoring correlated clinical variables | Implementation of multivariate control charts, T² and MEWMA and MCUSUM, and imputation methods | SAWMH - Radiography process | Design and performance of multivariate charting for correlated variables and imputation for missing data |
| 6 | Change Point Estimation in Poisson Control Charts | General | Monitoring count (Poisson) data | Development of an estimator of time and magnitude of change for step, linear trend and multiple changes scenarios in Poisson processes using Bayesian models | Simulated data | Design of Bayesian change point estimator; it outperformed Poisson control charts and alternative estimators |
| 7 | Multiple Change Point in Poisson Control Charts | General | Monitoring count (Poisson) data | Extension of the proposed Bayesian estimator for estimation number, time and magnitude of changes in multiple changes scenario in Poisson processes | Simulated data | Design of Bayesian change point estimator; it outperformed Poisson control charts and alternative estimators |
| 8 | Change Point Detection in Cardiac Surgery Outcomes | Healthcare | Monitoring clinical binary outcomes | Development of an estimator of time and magnitude of changes in binary outcomes of clinical processes using Bayesian models | SAWMH - Cardiac surgery and angioplasty outcomes | Design of Bayesian change point estimator; it provided better estimates compared to EWMA and CUSUM control charts |
| 9 | Change Point Estimation in Risk-Adjusted Charts | Healthcare | Monitoring risk-adjusted outcomes | Development of the proposed Bayesian estimator for estimation of step changes in odds ratio of mortality outcomes of clinical processes at presence of patient mix | Simulated data based on SAWMH - ICU outcomes, risk adjusted using APACHE II risk model | Design of Bayesian change point estimator; it outperformed RAEWMA and RACUSUM control charts and alternative estimators |
| 10 | Linear Trend Estimation in Risk-Adjusted Charts | Healthcare | Monitoring risk-adjusted outcomes | Extension of the Bayesian estimator for linear trends in odds ratio of mortality outcomes of clinical processes at presence of patient mix | Simulated data based on SAWMH - ICU outcomes, risk adjusted using APACHE II risk model | Design of Bayesian change point estimator; it outperformed RAEWMA and RACUSUM control charts and alternative estimators |
| 11 | Estimation of a Decrease in Survival Time | Healthcare | Monitoring survival time outcomes | Development of the Bayesian estimator for estimation of time and magnitude of decreases in right-censored survival time following clinical processes at presence of patient mix | Simulated data based on cardiac surgery dataset (Steiner and Cook, 2000), risk adjusted using Parsonnet risk model | Design of Bayesian change point estimator; it outperformed RASTCUSUM control chart and alternative estimator |
| 12 | Change Point in Monitoring Survival Time | | | Extension of the Bayesian estimator for step changes in right-censored survival time following clinical processes at presence of patient mix | | |
| 13 | Change Point in Monitoring Survival Time | | | Extension of the Bayesian estimator for linear trends in right-censored survival time following clinical processes at presence of patient mix | | |

In Chapter 5, applications of multivariate charting methods were discussed in a clinical

setting where the characteristics of interest were variables. Adaptation of multivariate

charting methods for attributes, binary or count data, as well as consideration of patient

mix for all or some of the correlated variables, can be pursued in further research. A

study of the related body of knowledge and techniques in an industrial context is also highly

recommended.

Further research could also consider extension and replication of the proposed Bayesian

estimator of the Poisson rate, detailed in Chapter 6, over other values of rates as

well as other underlying distributions. It is

also worthwhile to study the effect of various informative and non-informative prior

distributions in the model. For the multiple change point case where no prior knowledge

exists about the number of change points, other methods and formulations in the

Bayesian context may be of interest. Among those, product partition models (Barry

and Hartigan, 1992; Loschi and Cruz, 2002a,b) and stochastic approximation Monte

Carlo (Liang et al., 2007) are directions for further research.

The candidate partly investigated the Chib model (Chib, 1998) and an original version

of the product partition model (Barry and Hartigan, 1992); since no reasonable results

were obtained in comparison with the MCMC method, this line of research was not pursued

and the results were not included in this thesis.

The promotion of the change point estimation investigation in monitoring hospital

outcomes requires further replication of the study undertaken in Chapter 8. Clinicians

are encouraged to apply such investigations in their quality improvement programs

and report their results. Furthermore, the presence of patient mix and the effect of

covariates can be considered in such replication studies.

The proposed Bayesian estimators for step change and linear trend in odds ratios

of hospital outcomes and mean survival time were studied over a limited range of

covariates in Chapters 9 to 13. Replication of this study over other populations as

well as underlying risk models would be interesting for further research. Meanwhile,

other priors can be applied in the change point model in the presence of patient mix.

In this regard, the study of computational characteristics of Bayesian estimators, including

timing, complexity and convergence, using informative and uninformative priors

with different underlying distributions is a major direction for further research. In this

area, little has been discussed within this thesis.

In bridging research, change point estimation can be applied in the area of clinical data

quality. Where data are collected over the long term, as in meta-analyses or clinical trials,

finding the true time of deteriorations in data quality and identifying corrective actions

are of practical concern. At the same time, the cost-effectiveness of defined interventions

in the system can be investigated using the value of information theory discussed in Chapter

4.

The Bayesian approach can also contribute to estimation of a change in the process

parameters of interest, clinical or non-clinical, in a prospective manner. To this end,

the change point model can be formulated in a model selection context in which a

hypothesis of no change competes with a hypothesis of a change. This approach has

been partly studied in phase I monitoring where no prior knowledge exists about the

in-control state of the process (Tsiamyrtzis and Hawkins, 2005, 2008). It is worthwhile

to employ various formulations and model selection criteria and procedures such as

model averaging, particularly considering the patient mix in the models. This poten-

tial direction of research can contribute to monitoring clinical processes in which the

underlying risk model has not been constructed or calibrated.

This extension of the research is now under investigation by the candidate. In this research

reversible jump MCMC has been considered as the model selection procedure. Since the

research is ongoing and no results have been obtained yet, it has not been outlined in

this thesis.
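The model selection formulation described above, in which a hypothesis of no change competes with a hypothesis of a change, can be illustrated with a small sketch. It is not the reversible jump MCMC procedure under investigation: for simplicity it assumes Poisson counts without patient mix, conjugate Gamma(a, b) priors on the rates, a uniform prior on the change point and equal prior probabilities for the two hypotheses, so that both marginal likelihoods are available in closed form. All names and data are hypothetical.

```python
import math


def log_marginal(s, n, a=1.0, b=1.0):
    """Log marginal likelihood of n Poisson counts with sum s.

    The rate is integrated out under an assumed Gamma(a, b) prior
    (shape a, rate b). The product of factorials is shared by both
    hypotheses and is therefore omitted.
    """
    return (a * math.log(b) - math.lgamma(a)
            + math.lgamma(a + s) - (a + s) * math.log(b + n))


def prob_change(counts, a=1.0, b=1.0, prior_change=0.5):
    """Posterior probability that a single step change occurred."""
    T = len(counts)
    total = sum(counts)
    # Hypothesis of no change: one rate for the whole sequence.
    log_m0 = log_marginal(total, T, a, b)
    # Hypothesis of a change: average over a uniform prior on tau.
    terms = []
    running = 0
    for tau in range(1, T):
        running += counts[tau - 1]
        terms.append(log_marginal(running, tau, a, b)
                     + log_marginal(total - running, T - tau, a, b))
    peak = max(terms)
    log_m1 = (peak + math.log(sum(math.exp(t - peak) for t in terms))
              - math.log(T - 1))
    # Posterior log odds = prior log odds + log Bayes factor.
    log_odds = (math.log(prior_change) - math.log(1.0 - prior_change)
                + log_m1 - log_m0)
    return 1.0 / (1.0 + math.exp(-log_odds))


stable = [3] * 12                                  # hypothetical in-control data
shifted = [2, 3, 1, 2, 3, 2, 8, 9, 7, 10, 8, 9]    # hypothetical shifted data
p_stable = prob_change(stable)
p_shifted = prob_change(shifted)
print(round(p_stable, 3), round(p_shifted, 3))
```

In a prospective setting the same probability could be recomputed as each new observation arrives and compared with a threshold; choosing that threshold, and extending the marginal likelihoods to risk-adjusted outcomes, are left open here.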

14.3.2 Relevant Research

Achievements obtained through the application of the Bayesian approach in an industrial

context of quality control encourage clinicians to tackle some of the known problems

in monitoring clinical outcomes using Bayesian methods. Economical design of sample

size and intervals in industrial charting methods has been investigated using a

Bayesian framework. Although in healthcare surveillance all individuals are monitored

instead of samples of patients, there may still exist some gain in transferring the body


of knowledge and methods into a new context in which cost has been replaced by risk

and observations are affected by covariates. In healthcare surveillance, effort has

focused on achieving a control chart based on constant parameters, but most often con-

trol charts are required for data which come from unknown and different distributions.

It seems that by applying Bayesian methods and allowing control chart parameters to

vary, a new perspective on tackling risk could be built. Moreover, there is an opportunity

to simultaneously deal with over- and under-dispersion. This idea might also be

applicable for monitoring multiple units and the application of funnel plots where ex-

tra variation exists. Other advantages, such as construction of probabilistic inferences

about the state of the process and prediction for the subsequent observations, have

been reported from the use of a Bayesian approach in an industrial context. These

can also direct further research in healthcare surveillance.

Beyond the scope of this thesis, adaptation of other monitoring concepts and paradigms

from industrial SQC can be of interest. Among those, monitoring profiles and

multistage processes are worth considering. At the same time, extension of the

proposed Bayesian estimators to complex processes in the industrial context can be

a direction for further research in an industrial environment.

Bibliography








Galea, S., Riddle, M., and Kaplan, G. (2010). Causal thinking and complex system

approaches in epidemiology. International Journal of Epidemiology, 39(1):97–106.




Association.

Hay, J. and Pettitt, A. (2001). Bayesian analysis of a time series of counts with covari-

ates: an application to the control of an infectious disease. Biostatistics, 2(4):433–444.



Lilford, R., Mohammed, M. A., Spiegelhalter, D., and Thomson, R. (2004). Use and

misuse of process and outcome data in managing performance of acute medical care:

avoiding institutional stigma. The Lancet, 363(9415):1147–1154.


















24(6):721–735.

Vincent, C. (2003). Understanding and responding to adverse events. New England

Journal of Medicine, 348(11):1051–1056.





Publishing.

Bibliography

Abu-Taleb, A. A., Alawneh, A. J., and Smadi, M. M. (2007). Statistical analysis of

recent changes in relative humidity in Jordan. Environmental Sciences, 3(2):75–77.

Ahmadzadeh, F. (2009). Change point detection with multivariate control charts by ar-

tificial neural network. The International Journal of Advanced Manufacturing Tech-

nology.

Ajani, A., Reid, C., Duffy, S., Andrianopoulos, N., Lefkovits, J., Black, A., New, G.,

Lew, R., Shaw, J., Yan, B., et al. (2008). Outcomes after percutaneous coronary

intervention in contemporary Australian practice: insights from a large multicentre

registry. Medical Journal of Australia, 189(8):423–428.

Alaeddini, A., Ghazanfari, M., and Nayeri, M. (2009). A hybrid fuzzy-statistical clus-

tering approach for estimating the time of changes in fixed and variable sampling

control charts. Information Sciences, 179(11):1769–1784.

Alidousti, S., Assareh, H., and Kazempour, Z. (2005). Quality control of indexing

process. Faslnameye Ketab, 63:63–73.

Amiri, A. and Allahyari, S. (2011). Change point estimation methods for control

chart postsignal diagnostics: a literature review. Quality and Reliability Engineering

International, doi:10.1002/qre.1266.




Assareh, H. and Mengersen, K. (2011a). Bayesian estimation of the time of a decrease in

risk-adjusted survival time control charts. IAENG International Journal of Applied

Mathematics, 41(4):360–366.

Assareh, H. and Mengersen, K. (2011b). Detection of the time of a step change in





20(3):207–222.

Assareh, H., Smith, I., and Mengersen, K. (2011b). Bayesian estimation of the time

of a linear trend in risk-adjusted control charts. IAENG International Journal of

Computer Science, 38(4):409–417.

Assareh, H., Smith, I., and Mengersen, K. (2011c). Change point detec-

tion in risk adjusted control charts. Statistical Methods in Medical Research,

doi:10.1177/0962280211426356.

Assareh, H., Smith, I., and Mengersen, K. (2011d). Identifying the time of a linear



2190:365–370.

Aylin, P., Best, N., Bottle, A., and Marshall, C. (2003). Following Shipman: a pilot

system for monitoring mortality rates in primary care. The Lancet, 362(9382):485–

491.



Bavry, A., Kumbhani, D., Helton, T., Borek, P., Mood, G., and Bhatt, D. (2006). Late

thrombosis of drug-eluting stents: a meta-analysis of randomized clinical trials. The

American Journal of Medicine, 119(12):1056–1061.






Benneyan, J. (2006). Discussion-the use of control charts in health-care and public-

health surveillance. Journal of Quality Technology, 38(2):113–123.









336.

Benneyan, J. C., Lloyd, R. C., and Plsek, P. E. (2003). Statistical process control as

a tool for research and healthcare improvement. Quality and Safety in Health Care,

12(6):458.




Bernardo, J. M. and Smith, A. F. M. (1994). Bayesian Theory. Wiley.

Bersimis, S., Psarakis, S., and Panaretos, J. (2007). Multivariate statistical pro-

cess control charts: an overview. Quality and Reliability Engineering International,

23(5):517–543.

Besanko, D. and Braeutigam, R. (2002). Microeconomics: An Integrated Approach.

Wiley.






353:1205–1206.



Bourke, P. D. (1991). Detecting a shift in the fraction of nonconforming items us-

ing run-length control charts with 100% inspection. Journal of Quality Technology,

23(3):225–238.





Brooks, S. P. (1998). Markov chain Monte Carlo method and its application. Journal

of the Royal Statistical Society. Series D (The Statistician), 47(1):69–100.

Brown, P., Mazzone, P., Oliviero, A., Altibrandi, M. G., Pilato, F., Tonali, P. A., and

Di Lazzaro, V. (2004). Effects of stimulation of the subthalamic area on oscillatory

pallidal activity in Parkinson’s disease. Experimental Neurology, 188(2):480–490.



Calabrese, J. (1995). Bayesian process control for attributes. Management Science,

41(4):637–645.



Capilla, C. (2009). Application and simulation study of the Hotelling’s T² control chart

to monitor a wastewater treatment process. Environmental Engineering Science,

26(2):333–342.

Carlin, B., Gelfand, A., and Smith, A. (1992). Hierarchical Bayesian analysis of change-

point problems. Applied statistics, pages 389–405.

Carlin, B. and Louis, T. (2000). Empirical Bayes: past, present and future. Journal of

the American Statistical Association, 95(452):1286–1289.

Celano, G., Castagliola, P., Trovato, E., and Fichera, S. (2011). Shewhart and EWMA

control charts for short production runs. Quality and Reliability Engineering Inter-

national, 27(3):313–326.

Chang, T. C. and Gan, F. F. (2001). Cumulative sum charts for high yield processes.

Statistica Sinica, 11(1):791–805.

Chen, R. (1978). A surveillance system for congenital malformations. Journal of the

American Statistical Association, pages 323–327.

Cheng, S., Mao, H., Goswami, V., Laxmi, P., Meyners, M., Srivastava, P., Jain, N.,

Sivakumar, B., Jain, M., Gupta, R., et al. (2011). The economic design of multivariate

MSE control chart. Economic Design, 8(2):75–85.

Cheon, S. and Kim, J. (2010). Multiple change-point detection of multivariate mean

vectors with the Bayesian approach. Computational Statistics & Data Analysis,

54(2):406–415.



Choi, J., Horn, D., Kist, M., and D’Agostino Jr, R. (2011). Evaluation of data entry

errors and data changes to an electronic data capture clinical trial database. Drug

Information Journal, 45:421–430.

Chow, S., Shao, J., and Wang, H. (2007). Sample Size Calculations in Clinical Research.

Chapman & Hall.

Christensen, A., Melgaard, H., Iwersen, J., and Thyregod, P. (2003). Environmen-

tal monitoring based on a hierarchical Poisson-Gamma model. Journal of Quality

Technology, 35(3):275–285.

Claxton, K., Ginnelly, L., Sculpher, M., Philips, Z., and Palmer, S. (2004). A pilot

study on the use of decision theory and value of information analysis as part of

the NHS health technology assessment programme. Health Technology Assessment,

8(31):1–103.

Claxton, K., Neumann, P. J., Araki, S., and Weinstein, M. C. (2001). Bayesian value-of-information

analysis - an application to a policy model of Alzheimer’s disease.

International Journal of Technology Assessment in Health Care, 17(1):38–55.

Cochran, W. (2007). Sampling Techniques. Wiley-India.

Colosimo, B. and Del Castillo, E. (2007). Bayesian Process Monitoring, Control and

Optimization. Chapman and Hall/CRC.







Cook, D., Steiner, S., Cook, R., Farewell, V., and Morton, A. (2003). Monitoring the

evolutionary process of quality: risk-adjusted charting to track outcomes in intensive

care. Critical Care Medicine, 31(6):1676.




Craigmile, P., Calder, C., Li, H., Paul, R., and Cressie, N. (2009). Hierarchical model

building, fitting, and checking: a behind-the-scenes look at a Bayesian analysis of

arsenic exposure pathways. Bayesian Analysis, 4(1):1–36.



Crowder, S. V. (1989). Design of exponentially weighted moving average schemes.


Daemen, J., Wenaweser, P., Tsuchida, K., Abrecht, L., Vaina, S., Morger, C., Kukreja,

N., Juni, P., Sianos, G., Hellige, G., et al. (2007). Early and late coronary stent

thrombosis of sirolimus-eluting and paclitaxel-eluting stents in routine clinical prac-

tice: data from a large two-institutional cohort study. The Lancet, 369(9562):667–

678.

Del Castillo, E. and Montgomery, D. (1994). Short-run statistical process control: Q-

chart enhancements and alternative methods. Quality and Reliability Engineering

International, 10(2):87–97.



trol, 14(3):5–9.


11(4):10–13.


Sampling. Wiley.





Duarte, B. and Saraiva, P. (2003). Change point detection for quality monitoring of

chemical processes. Computer Aided Chemical Engineering, 14:401–406.

Duncan, A. (1956). The economic design of X charts used to maintain current control

of a process. Journal of the American Statistical Association, pages 228–242.

Eckles, J. (1968). Optimum maintenance with incomplete information. Operations

Research, 16(5):1058–1067.



17(2):119–124.

Feltz, C. and Sturm, G. (1994). Real-time empirical Bayes manufacturing process

monitoring for censored data. Quality and Reliability Engineering International,

10(6):467–476.

Fricker Jr, R. and Chang, J. (2008). A spatio-temporal methodology for real-time

biosurveillance. Quality Engineering, 20(4):465–477.

Galea, S., Riddle, M., and Kaplan, G. (2010). Causal thinking and complex system

approaches in epidemiology. International Journal of Epidemiology, 39(1):97–106.




84.



Washington.

Garjani, M., Noorossana, R., and Saghaei, A. (2010). A neural network-based control

scheme for monitoring start-up processes and short runs. The International Journal

of Advanced Manufacturing Technology, 51(9):1023–1032.


Chapman & Hall/CRC.

Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions and the

Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine

Intelligence, 6(2):721–741.

Ghazanfari, M., Alaeddini, A., Niaki, S., and Aryanezhad, M. (2008). A clustering

approach to identify the time of a step change in Shewhart control charts. Quality


Girshick, M., Rubin, H., and Sitgreaves, R. (1955). Estimates of bounded relative error

in particle counting. The Annals of Mathematical Statistics, 26(2):276–285.


13(1):18–22.



Grigg, O. and Farewell, V. (2004a). A risk-adjusted sets method for monitoring adverse

medical outcomes. Statistics in Medicine, 23(10):1593–1602.

Grigg, O. V. and Farewell, V. T. (2004b). An overview of risk-adjusted charts. Journal



38(2):124–136.



102(477):140–152.




Hamada, M. (2002). Bayesian tolerance interval control limits for attributes. Quality


Hannan, E., Farrell, L., and Cayten, C. (1997). Predicting survival of victims of motor

vehicle crashes in New York State. Injury, 28(9-10):607–615.

Harvey, A., Zhang, H., Nixon, J., and Brown, C. (2007). Comparison of data extraction

from standardized versus traditional narrative operative reports for database-related

research and quality control. Surgery, 141(6):708–714.




Association.

Hastings, W. (1970). Monte Carlo sampling methods using Markov chains and their

applications. Biometrika, 57(1):97–109.





Hay, J. and Pettitt, A. (2001). Bayesian analysis of a time series of counts with covari-

ates: an application to the control of an infectious disease. Biostatistics, 2(4):433–444.

Helms, R. (2001). Data quality issues in electronic data capture. Drug Information

Journal, 35(3):827–837.


Biometrika, 58(3):509–523.

Holmes, D. and Mergen, A. (1993). Improving the performance of the T² control chart.


Hosmer, D. and Lemeshow, S. (2000). Applied Logistic Regression. Wiley-Interscience.



Hsieh, F., Bloch, D., and Larsen, M. (1998). A simple method of sample size calculation

for linear and logistic regression. Statistics in Medicine, 17(14):1623–1634.


Ivanov, J., Tu, J., and Naylor, C. (1999). Ready-made, recalibrated, or remodeled?:

issues in the use of risk indexes for assessing mortality after coronary artery bypass

graft surgery. Circulation, 99(16):2098–2104.

Jain, K. (1993). A Bayesian Approach to Multivariate Quality Control. PhD thesis,

University of Maryland at College Park.

Jain, K., Alt, F., and Grimshaw, S. (1993). Multivariate quality control-a Bayesian

approach. In Annual Quality Congress Transactions-American Society for Quality

Control, volume 47, pages 645–645. American Society for Quality control.

Jones, H., Ohlssen, D., and Spiegelhalter, D. (2008). Use of the false discovery rate

when comparing multiple health care providers. Journal of Clinical Epidemiology,

61(3):232–240.





24(2):63–69.

Kish, L. (1995). Survey Sampling. Wiley.








Papers, 46(1):47–64.

Kooli, I. and Limam, M. (2009). Bayesian np control charts with adaptive sample

size for finite production runs. Quality and Reliability Engineering International,

25(4):439–448.

Kramer, A. and Zimmerman, J. (2007). Assessing the calibration of mortality bench-

marks in critical care: the Hosmer-Lemeshow test revisited*. Critical Care Medicine,

35(9):2052–2056.

Lagerqvist, B., James, S., Stenestrand, U., Lindback, J., Nilsson, T., and Wallentin,

L. (2007). Long-term outcomes with drug-eluting stents versus bare-metal stents in

Sweden. New England Journal of Medicine, 356(10):1009–1019.



Liang, F. (2009). Improving SAMC using smoothing methods: theory and applications

to Bayesian model selection problems. The Annals of Statistics, 37(5B):2626–2654.



Lilford, R., Mohammed, M. A., Spiegelhalter, D., and Thomson, R. (2004). Use and

misuse of process and outcome data in managing performance of acute medical care:

avoiding institutional stigma. The Lancet, 363(9415):1147–1154.



20(4):404–413.






Loschi, R. and Cruz, F. (2005). Extension to the product partition model: computing

the probability of a change. Computational Statistics & Data Analysis, 48(2):255–

268.

Loschi, R., Cruz, F., and Arellano-Valle, R. (2005). Multiple change point analysis for

the regular exponential family using the product partition model. Journal of Data

Science, 3(3):305–330.

Loschi, R., Cruz, F., Iglesias, P., and Arellano-Valle, R. (2003). A Gibbs sampling

scheme to the product partition model: an application to change-point problems.

Computers & Operations Research, 30(3):463–482.

Loschi, R., Cruz, F., Takahashi, R., Iglesias, P., Arellano-Valle, R., and MacGre-

gor Smith, J. (2008). A note on Bayesian identification of change points in data

sequences. Computers & Operations Research, 35(1):156–170.

Lovegrove, J., Valencia, O., Treasure, T., Sherlaw-Johnson, C., and Gallivan, S. (1997).

Monitoring the results of cardiac surgery by variable life-adjusted display. The Lancet,

350(9085):1128–1130.



Lu, X., Xie, M., Goh, T., and Lai, C. (1998). Control chart for multivariate attribute

processes. International Journal of Production Research, 36(12):3477–3489.

Lucas, J. and Crosier, R. (1982). Fast initial response for CUSUM quality-control

schemes: give your CUSUM a head start. Technometrics, 24(3):199–205.

Makis, V. (2008). Multivariate Bayesian control chart. Operations Research, 56(2):487–

496.

Makis, V. (2009). Multivariate Bayesian process control for a finite production run.

European Journal of Operational Research, 194(3):795–806.

Mangano, D., Tudor, I., Dietzel, C., et al. (2006). The risk associated with Aprotinin

in cardiac surgery. New England Journal of Medicine, 354(4):353–365.

Marcellus, R. (2008a). Bayesian monitoring to detect a shift in process mean. Quality


Marcellus, R. (2008b). Bayesian statistical process control. Quality Engineering,

20(1):113–127.

Marcin, J. and Romano, P. (2007). Size matters to a model’s fit. Critical Care Medicine,

35(9):2212–2213.

Marshall, B., Spitzner, D., and Woodall, W. (2007). Use of the local Knox statistic

for the prospective monitoring of disease occurrences in space and time. Statistics in

Medicine, 26(7):1579–1593.

Marshall, C., Best, N., Bottle, A., and Aylin, P. (2004). Statistical issues in the

prospective monitoring of health outcomes across multiple units. Journal of the

Royal Statistical Society. Series A (Statistics in Society), 167(3):541–559.

Mayer, E., Bottle, A., Rao, C., Darzi, A., and Athanasiou, T. (2009). Funnel plots and

their emerging application in surgery. Annals of Surgery, 249(3):376.

Mohammed, M. and Deeks, J. (2008). In the context of performance monitoring, the

Caterpillar plot should be mothballed in favor of the Funnel plot. The Annals of

Thoracic Surgery, 86(1):348.





Mohammed, M., Worthington, P., and Woodall, W. (2008). Plotting basic control

charts: tutorial notes for healthcare practitioners. Quality and Safety in Health

Care, 17(2):137.

Montgomery, D. and Woodall, W. (2008). An overview of six sigma. International

Statistical Review, 76(3):329–346.




Morton, A., Mengersen, K., Waterhouse, M., and Steiner, S. (2010). Analysis of aggre-

gated hospital infection data for accountability. Journal of Hospital Infection.

Morton, A., Whitby, M., McLaws, M., Dobson, A., McElwain, S., Looke, D., Stackel-

roth, J., and Sartor, A. (2001). The application of statistical process control charts

to the detection and monitoring of hospital-acquired infections. Journal of Quality

in Clinical Practice, 21(4):112–117.

Morton, N. and Lindsten, J. (1976). Surveillance of Down’s syndrome as a paradigm of

population monitoring. Human Heredity, 26(5):360–371.




Nemes, S., Jonasson, J., Genell, A., and Steineck, G. (2009). Bias in odds ratios by

logistic regression modelling and sample size. BMC Medical Research Methodology,

9(1):56–60.

Nenes, G. and Tagaras, G. (2007). The economically designed two-sided Bayesian

control chart. European Journal of Operational Research, 183(1):263–277.

Nikolaidis, Y., Rigas, G., and Tagaras, G. (2007). Using economically designed She-

whart and adaptive X charts for monitoring the quality of tiles. Quality and Relia-

bility Engineering International, 23(2):233–245.














Patel, H. (1973). Quality control methods for multivariate binomial and Poisson dis-

tributions. Technometrics, pages 103–112.











Perry, M., Pignatiello, J., and Simpson, J. (2007a). Change point estimation for mono-


search, 45(8):1791–1813.

Perry, M., Pignatiello, J., and Simpson, J. (2007b). Estimating the change point of

the process fraction non-conforming with a monotonic change disturbance in SPC.








130.

Pignatiello, J. J. and Runger, G. C. (1990). Comparisons of multivariate CUSUM

charts. Journal of Quality Technology, 22(3):173–186.


coda. Citeseer.

Poloniecki, J., Valencia, O., and Littlejohns, P. (1998). Cumulative risk adjusted mor-

tality chart for detecting changes in death rate: observational study of heart surgery.


Porteus, E. and Angelus, A. (1997). Opportunities for improved statistical process

control. Management Science, 43(9):1214–1228.

Prabhu, S. and Runger, G. (1997). Designing a multivariate EWMA control chart.


Qiu, P. and Hawkins, D. (2003). A nonparametric multivariate cumulative sum pro-

cedure for detecting shifts in all directions. Journal of the Royal Statistical Society:

Series D (The Statistician), 52(2):151–164.

Reynolds, M. J. and Stoumbos, Z. G. (1999). A CUSUM chart for monitoring a pro-

portion when inspecting continuously. Journal of Quality Technology, 31(1):87–108.


nometrics, 1(3):239–250.

Rolka, H., Burkom, H., Cooper, G., Kulldorff, M., Madigan, D., and Wong, W. (2007).

Issues in applied statistics for public health bioterrorism surveillance using multiple

data streams: research needs. Statistics in Medicine, 26(8):1834–1856.




Trials, 6(2):141–150.

Rubin, D. (1987). Multiple Imputation for Nonresponse in Surveys. Wiley Online

Library.

Ryan, T. P. (2011). Statistical Methods for Quality Improvement. Wiley.







Samuel, T., Pignatiello, J., and Calvin, J. (1998a). Identifying the time of a step change

in a normal process variance. Quality Engineering, 10(3):529–538.

Samuel, T., Pignatiello, J., and Calvin, J. (1998b). Identifying the time of a step change


Samuel, T. and Pignatiello, J. (1998). Identifying the time of a change in a Poisson

rate parameter. Quality Engineering, 10(4):673–681.

Sarndal, C., Swensson, B., and Wretman, J. (2003). Model Assisted Survey Sampling.

Springer Verlag.

Schafer, J. Norm: Multiple imputation of incomplete multivariate data under a normal

model, version 2. http://www.stat.psu.edu/~jls/misoftwa.html.

Schafer, J. and Olsen, M. (1998). Multiple imputation for multivariate missing-data

problems: A data analyst’s perspective. Multivariate Behavioral Research, 33(4):545–

571.


man & Hall/CRC.

Schonhofer, B., Guo, J., Suchi, S., Kohler, D., and Lefering, R. (2004). The use of

APACHE II prognostic system in difficult-to-wean patients after long-term mechan-

ical ventilation. European Journal of Anaesthesiology, 21(7):558–565.

Seber, G. (1984). Multivariate Observations. Wiley Online Library.






Self, S. and Mauritsen, R. (1988). Power/sample size calculations for generalized linear

models. Biometrics, 44(1):79–86.








Shiau, J., Chen, C., and Feltz, C. (2005). An empirical Bayes process monitoring

technique for polytomous data. Quality and Reliability Engineering International,

21(1):13–28.

Shiau, J., Chiang, C., and Hung, H. (1999a). A Bayesian procedure for process capa-

bility assessment. Quality and Reliability Engineering International, 15(5):369–378.

Shiau, J., Hung, H., and Chiang, C. (1999b). A note on Bayesian estimation of process

capability indices. Statistics & Probability Letters, 45(3):215–224.

Shieh, G. (2001). Sample size calculations for logistic and Poisson regression models.

Biometrika, 88(4):1193–1199.

Skinner, K., Montgomery, D., and Runger, G. (2003). Process monitoring for multiple

count data using generalized linear model-based control charts. International Journal

of Production Research, 41(6):1167–1180.

Somers, R. (1962). A new asymmetric measure of association for ordinal variables.

American Sociological Review, pages 799–811.




Sonesson, C. (2007). A CUSUM framework for detection of space–time disease clusters

using scan statistics. Statistics in Medicine, 26(26):4770–4789.

Spiegelhalter, D. (2005a). Funnel plots for comparing institutional performance. Statis-

tics in Medicine, 24(8):1185–1202.

Spiegelhalter, D. (2005b). Handling over-dispersion of performance indicators. Quality

and Safety in Health Care, 14(5):347.


Spiegelhalter, D., Abrams, K., and Myles, J. (2004). Bayesian Approaches to Clinical

Trials and Health-Care Evaluation. Wiley.




Spiegelhalter, D., Best, N., Carlin, B., and Van Der Linde, A. (2002). Bayesian measures of

model complexity and fit. Journal of the Royal Statistical Society. Series B (Method-

ological), 64(4):583–639.










in Medicine, 29(4):444–454.

Stone, G., Ellis, S., Cox, D., Hermiller, J., O’Shaughnessy, C., Mann, J., Turco, M.,

Caputo, R., Bergin, P., Greenberg, J., et al. (2004). One-year clinical results with the

slow-release, polymer-based, paclitaxel-eluting TAXUS stent: the TAXUS-IV trial.

Circulation, 109(16):1942–1947.







Sturm, G., Feltz, C., and Yousry, M. (1991). An empirical Bayes strategy for analysing

manufacturing data in real time. Quality and Reliability Engineering International,

7(3):159–167.



Suistomaa, M., Niskanen, M., Kari, A., Hynynen, M., and Takala, J. (2002). Cus-

tomised prediction models based on APACHE II and SAPS II scores in patients with

prolonged length of stay in the ICU. Intensive Care Medicine, 28(4):479–485.




Sullivan, J. and Woodall, W. (1996). A comparison of multivariate control charts for


Tagaras, G. (1994). A dynamic programming approach to the economic design of

X-charts. IIE Transactions, 26(3):48–56.

Tagaras, G. (1996). Dynamic control charts for finite production runs. European

Journal of Operational Research, 91(1):38–55.

Tagaras, G. and Nikolaidis, Y. (2002). Comparing the effectiveness of various Bayesian

X control charts. Operations Research, 50(2):878–888.

Taylor, H. (1965). Markovian sequential replacement processes. The Annals of Math-

ematical Statistics, 36(6):1677–1694.

Taylor, H. (1967). Statistical control of a Gaussian process. Technometrics, 9(1):29–41.

Taylor, W. (2000). Change-point analysis: a powerful new tool for detecting changes.

http://www.variation.com/cpa/tech/changepoint.html.

Teres, D. and Lemeshow, S. (1999). When to customize a severity model. Intensive

Care Medicine, 25(2):140–142.

Testik, M., Runger, G., and Borror, C. (2003). Robustness properties of multivariate

EWMA control charts. Quality and Reliability Engineering International, 19(1):31–

38.





Triantafyllopoulos, K. (2006). Multivariate control charts based on Bayesian state space

models. Quality and Reliability Engineering International, 22(6):693–707.

Tsiamyrtzis, P. (2000). A Bayesian Approach to Quality Control Problems. PhD thesis,

University of Minnesota.





24(6):721–735.




20(4):435–450.

Tuyl, F., Gerlach, R., and Mengersen, K. (2009). Posterior predictive arguments in

favor of the Bayes-Laplace prior as the consensus prior for binomial and multinomial

parameters. Bayesian Analysis, 4(1):151–158.

U.S. Food and Drug Administration (FDA): Early Communication about

an Ongoing Safety Review Aprotinin Injection (marketed as Trasylol).

http://www.fda.gov/cder/drug/earlycomm/aprotinin.htm.

Vincent, C. (2003). Understanding and responding to adverse events. New England

Journal of Medicine, 348(11):1051–1056.

Wald, A. (1947). Sequential Analysis. John Wiley & Sons.

White, C. (1977). A Markov quality control process subject to partial observation.

Management Science, 23(8):843–852.





Whittemore, A. (1981). Sample size for logistic regression with small response proba-

bility. Journal of the American Statistical Association, pages 27–32.





Publishing.







Woodall, W., Brooke Marshall, J., Joner Jr, M., Fraker, S., and Abdel-Salam, A.

(2008). On the use and evaluation of prospective scan methods for health-related

surveillance. Journal of the Royal Statistical Society: Series A (Statistics in Society),

171(1):223–237.




Woodall, W. H. and Mahmoud, M. A. (2005). The inertial properties of quality control charts.

Technometrics, 47(4):425–436.

Woodall, W. H. and Montgomery, D. C. (1999). Research issues and ideas in statistical

process control. Journal of Quality Technology, 31(4):376–386.

Wu, C. (2008). Assessing process capability based on Bayesian approach with subsam-

ples. European Journal of Operational Research, 184(1):207–228.






Yang, Z., Xie, M., Kuralmani, V., and Tsui, K. (2002). On the performance of geometric

charts with estimated control limits. Journal of Quality Technology, 34(4):448–458.

Yeh, A., McGrath, R., Sembower, M., and Shen, Q. (2008). EWMA control charts

for monitoring high-yield processes based on non-transformed observations. Interna-

tional Journal of Production Research, 46(20):5679–5699.

Yin, Z. and Makis, V. (2011). Economic and economic-statistical design of a mul-

tivariate Bayesian control chart for condition-based maintenance. IMA Journal of

Management Mathematics, 22(1):47–63.

Zantek, P. and Nestler, S. (2009). Performance and properties of Q-statistic monitoring

schemes. Naval Research Logistics (NRL), 56(3):279–292.





Journal, 38(4):371–387.

Zhang, P. and Su, Q. The economically designed control chart for short-run pro-

duction based on Bayesian method. In Artificial Intelligence, Management Science

and Electronic Commerce (AIMSEC), 2011 2nd International Conference on, pages

4828–4831. IEEE.

Zhao, X. and Chu, P. S. (2010). Bayesian change-point analysis for extreme events (ty-

phoons, heavy rainfall, and heat waves): a RJMCMC approach. Journal of Climate,

23(5):1034–1046.
