OPTIMAL EXPERIMENTAL DESIGNS FOR ACCELERATED LIFE TESTS WITH
CENSORING AND CONSTRAINTS
by
Eric Michael Monroe
A Dissertation Presented in Partial Fulfillment of the Requirements for the Degree
Doctor of Philosophy
ARIZONA STATE UNIVERSITY
May 2009
©2009 Eric Michael Monroe All Rights Reserved
OPTIMAL EXPERIMENTAL DESIGNS FOR ACCELERATED LIFE TESTS WITH
CENSORING AND CONSTRAINTS
by
Eric Michael Monroe
has been approved
April 2009
Graduate Supervisory Committee:
Rong Pan, Co-Chair
Douglas C. Montgomery, Co-Chair
Christine M. Anderson-Cook
Connie M. Borror
Robert L. Parker
ACCEPTED BY THE GRADUATE COLLEGE
ABSTRACT
Manufacturers are continually faced with customer expectations to deliver products
faster while assuring high reliability. Companies must diligently plan and execute accelerated
life tests in order to ensure that future reliability performance targets are met. In many industrial
applications, accelerated life test results involve considerations that hinder the analyst’s
ability to easily define a suitable test plan. A methodology for defining efficient, yet discerning,
tests could ensure that corporate investments in reliability testing are properly selected
to mitigate risk while minimizing cost. This dissertation consists of three main studies.
First, classical design of experiments methods are extended to multi-stress accelerated life
test plans. This investigation mitigates the uncertainty in model parameter estimation for
non-linear models with censoring and constrained feasible design regions. An electronics
industry case study serves as the motivation for this research. Inference comparisons are
drawn against current best practices documented in the literature. Second, an alternate
technique for determining an optimal stress test level is introduced using a generalized linear
model framework. This approach achieves results equivalent to traditional maximum likelihood
estimation techniques but avoids the need to compute complex first and second
partial derivatives. In addition, it avoids the convergence problems associated with locally
optimal solutions. Finally, a sensitivity and robustness study demonstrates additional tools
to use in the planning of accelerated life tests.
To Janelle and Jackson.
ACKNOWLEDGMENTS
The writing of a dissertation is a highly individualized and internally-focused en-
deavor, yet it is not possible without the personal and professional support of numerous
people. Thus, my sincere gratitude goes to faculty, fellow colleagues, friends, and family
for their time, support, and patience over the years. Without their help, this dissertation
would not have been possible.
I especially want to thank Dr. Douglas C. Montgomery and Dr. Rong Pan for being
the co-chairs of my Ph.D. committee. I wish to thank Dr. Montgomery, who assisted me
in entering the Ph.D. program and provided me guidance both as a student and as a full-
time working professional. To Dr. Pan, I appreciate his countless hours of mentoring and
coaching during the writing process. I also would like to thank my committee members, Dr.
Christine M. Anderson-Cook, Dr. Connie M. Borror, and Dr. Robert L. Parker for timely
advice at critical points along the way. Also, I would like to thank the National Science
Foundation for their support of my research through research grant CMMI-0654417.
My graduate study would not have been the same without the social, professional,
and academic support I have received. I am particularly thankful for the support of my
friend, Jinsuk Lee, in learning the LaTeX syntax used to format my dissertation. Professionally,
I would like to thank Intel Corporation for their financial and temporal support. In
particular, I’d like to thank my supervisors who offered me the flexibility to manage both
an academic and professional career. They are Kenneth T. Yee, Nicholas P. Mencinger,
Mary A. VerHelst, Derek M. Wolfe, and Eric R. Gee. Academically, I would like to express
my gratitude to the faculty and staff of the Industrial Engineering department at Arizona
State University for their assistance and encouragement during my course of study.
Finally, I wish to acknowledge all of my friends and family. They have listened to
me moan, cheer, complain, laugh, and ponder my way through this academic program over
the years, but also provided me with emotional support throughout this long process. I
wish to especially thank my wife, Janelle, who provided never-ending support and endured
many of my late night working sessions. This endeavor would not have succeeded without
their fullest support.
TABLE OF CONTENTS
Page
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
CHAPTER 1 INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1. Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2. Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
3. Research objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
4. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
CHAPTER 2 LITERATURE REVIEW . . . . . . . . . . . . . . . . . . . . . . . . 7
1. Accelerated life testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2. Optimal experimental designs . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.1. Historical overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2. Relevant literature . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3. Alphabetic optimality criteria . . . . . . . . . . . . . . . . . . . . . . 12
2.4. Computer algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.4.1. Point exchange algorithms . . . . . . . . . . . . . . . . . . 16
2.4.2. Branch-and-bound methods . . . . . . . . . . . . . . . . . . 18
2.4.3. Simulated annealing . . . . . . . . . . . . . . . . . . . . . . 18
2.4.4. Coordinate-exchange algorithms . . . . . . . . . . . . . . . 19
2.4.5. Genetic algorithms . . . . . . . . . . . . . . . . . . . . . . . 21
2.5. Software integration . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.6. Applications in reliability . . . . . . . . . . . . . . . . . . . . . . . . 24
3. Unknown model parameter dependency . . . . . . . . . . . . . . . . . . . . 25
3.1. Parameter dependency . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.2. Model-robust designs . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.3. Model-sensitive designs . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.4. Bayesian designs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.5. Multifaceted selection criteria designs . . . . . . . . . . . . . . . . . 30
4. Use of a Bayesian D-optimal criterion in designing accelerated life tests . . 31
5. Generalized Linear Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
CHAPTER 3 OPTIMAL DESIGNS FOR CONSTRAINED REGIONS . . . . . . 35
1. Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2. Problem statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3. Data collection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.1. Acceleration model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.2. Irregular feasible design region . . . . . . . . . . . . . . . . . . . . . 40
3.3. Legacy experiments and preliminary data analysis . . . . . . . . . . 42
4. Analysis and interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.1. Experimental designs and simulation study . . . . . . . . . . . . . . 45
4.2. Effects of experimental design, sample size, and censoring . . . . . . 48
4.3. Alternative test plan with Type-I censoring . . . . . . . . . . . . . . 50
5. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
6. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
CHAPTER 4 A GLM APPROACH TO DESIGNING ACCELERATED LIFE TESTS 55
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
1.1. Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
1.2. Problem statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
1.3. Scope of work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
2. Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
2.1. Maximum likelihood estimation approach . . . . . . . . . . . . . . . 60
2.2. Generalized linear model approach to failure time analysis . . . . . . 62
3. Computational procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3.1. Search methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3.2. Initial values and convergence criteria . . . . . . . . . . . . . . . . . 68
4. Numerical examples and results . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.1. Case study 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.2. Case study 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
5. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
CHAPTER 5 SENSITIVITY AND ROBUSTNESS ANALYSIS . . . . . . . . . . 83
1. Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
2. Numerical example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
3. Influence of parameter uncertainty on U-optimal design performance . . . . 86
3.1. Uncertainty range . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.2. Design construction sensitivity . . . . . . . . . . . . . . . . . . . . . 90
3.3. Prediction variance sensitivity for a single acceleration factor . . . . 91
3.4. Prediction variance sensitivity for multiple acceleration factors . . . 93
4. Mitigation schemes using controllable design factors . . . . . . . . . . . . . 97
5. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
CHAPTER 6 CONCLUSIONS AND FUTURE WORK . . . . . . . . . . . . . . . 100
1. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
2. Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
APPENDIX A D-OPTIMAL SIMULATION RESULTS . . . . . . . . . . . . . . . 113
APPENDIX B DERIVATION OF THE WEIGHT MATRIX . . . . . . . . . . . . 117
APPENDIX C STATISTICAL ANALYSIS SOFTWARE (SAS 9.2) CODE . . . . 121
1. D-optimality computations for a two-factor design . . . . . . . . . . . . . . 122
2. U-optimality computations for a two-factor with interaction design . . . . . 123
3. Computation of the lifetime model coefficients . . . . . . . . . . . . . . . . . 124
4. Non-linear search method using conjugate gradients . . . . . . . . . . . . . 125
5. Formatting of the D-optimal design output . . . . . . . . . . . . . . . . . . 126
6. Formatting of the U-optimal design output . . . . . . . . . . . . . . . . . . 127
7. Computation of the prediction variance contours . . . . . . . . . . . . . . . 128
APPENDIX D MATRIX RESULTS FOR U-OPTIMAL DESIGNS . . . . . . . . . 130
1. Fisher information matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
2. Covariance matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
APPENDIX E U-OPTIMAL DESIGNS . . . . . . . . . . . . . . . . . . . . . . . . 133
LIST OF TABLES
Table Page
1. Life-stress relationship models for accelerated life testing . . . . . . . . . . . 8
2. Design criteria for alphabetic optimality . . . . . . . . . . . . . . . . . . . . 12
3. Feasible design region . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4. Accelerated life test results from the legacy test conditions . . . . . . . . . . 43
5. Regression table for the full model . . . . . . . . . . . . . . . . . . . . . . . 44
6. Regression table for the reduced model without regressor Tmax . . . . . . . 44
7. Regression table for the reduced model without regressor tsoak . . . . . . . . 44
8. Regression table for the reduced model without regressor ∆T . . . . . . . . 45
9. Degenerate design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
10. Degenerate split design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
11. D-optimal design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
12. U-optimal design without an interaction effect . . . . . . . . . . . . . . . . . 73
13. U-optimal design with a small interaction effect . . . . . . . . . . . . . . . . 80
14. U-optimal design with a large interaction effect . . . . . . . . . . . . . . . . 80
15. U-optimal design for a set of hypothesized model coefficients . . . . . . . . 84
16. Ranges for the lifetime regression model coefficients . . . . . . . . . . . . . . 87
17. Model sensitivity study on prediction variance . . . . . . . . . . . . . . . . . 88
18. Legacy design test conditions . . . . . . . . . . . . . . . . . . . . . . . . . . 114
19. Orthogonal design test conditions . . . . . . . . . . . . . . . . . . . . . . . . 114
20. D-optimal design test conditions . . . . . . . . . . . . . . . . . . . . . . . . 114
21. Parameter estimation results for the legacy design simulations . . . . . . . . 115
22. Parameter estimation results for the orthogonal design simulations . . . . . 115
23. Parameter estimation results for the D-optimal design simulations . . . . . 116
24. Parameter estimation results for a Type-I censored D-optimal design . . . . 116
25. U-optimal design for condition – – – . . . . . . . . . . . . . . . . . . . . . . 134
26. U-optimal design for condition – – + . . . . . . . . . . . . . . . . . . . . . . 134
27. U-optimal design for condition – + – . . . . . . . . . . . . . . . . . . . . . . 134
28. U-optimal design for condition – + + . . . . . . . . . . . . . . . . . . . . . 134
29. U-optimal design for condition + – – . . . . . . . . . . . . . . . . . . . . . . 135
30. U-optimal design for condition + – + . . . . . . . . . . . . . . . . . . . . . 135
31. U-optimal design for condition + + – . . . . . . . . . . . . . . . . . . . . . 135
32. U-optimal design for condition + + + . . . . . . . . . . . . . . . . . . . . . 135
LIST OF FIGURES
Figure Page
1. Air-to-air temperature cycle chamber with three temperature zones . . . . . 36
2. Temperature cycle profile with respect to the model variables . . . . . . . . 40
3. Feasible design region for ALT testing conditions: a 3-D view (left) and a
2-D view (right) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4. Legacy (top), orthogonal (middle), and D-optimal designs (bottom) . . . . 47
5. Estimation of parameter a by numerical simulation using legacy (upper),
orthogonal (middle), and D-optimal designs (lower). . . . . . . . . . . . . . 49
6. Estimation of parameters a (top), b (middle) and c (bottom) using Type-I
censoring compared with Type-II censoring. . . . . . . . . . . . . . . . . . . 52
7. Normalized unit square feasible design region . . . . . . . . . . . . . . . . . 60
8. Feasible design region in terms of ξ (left) and x (right) scales . . . . . . . . 73
9. Contour plots of prediction variance at the use condition under the degenerate
split (top), D-optimal (middle), and U-optimal (bottom) criterion . . . . . . 76
10. U-optimal designs for varying magnitudes of interaction effect: none (top),
low (middle), and high (bottom) . . . . . . . . . . . . . . . . . . . . . . . . 81
11. U-optimal design for hypothesized values of model coefficients . . . . . . . . 85
12. Comparing U-optimal designs for (+++) and (– – –) levels . . . . . . . . . 91
13. Sensitivity plot of activation energy and prediction variance . . . . . . . . . 92
14. Sensitivity of U-optimal designs to model mis-specification . . . . . . . . . . 94
15. Prediction variance contour plot for censoring time and sample size . . . . . 98
CHAPTER 1
Introduction
1. Overview
Reliability is the science of assessing the probability that a component, system, or
device (henceforth called “unit”) will maintain an expected level of performance for a pre-
scribed period of time. Implied in this definition is a set of typical conditions associated
with the operation of the unit. Reliability is an important consideration for both consumers
and manufacturers. The ability of a product to consistently meet the performance expec-
tations of a unit under normal operating conditions greatly influences the buying patterns
of a consumer. A consumer that has a negative experience with a product due to its in-
ability to function properly over time is likely going to buy a different product for their
next purchase. Conversely, a positive experience is likely to promote consumers to buy the
product again. So strong are the emotions associated with reliability, contingent valuation
studies by Tversky and Kahneman (1986) reveal that consumers are willing to pay more for
products that demonstrate high reliability. In addition to there being a potential upside in
sales due to high reliability, the converse is also true. From a manufacturer’s point of view,
there is often a liability associated with a product failing to perform as advertised. This
is observed as a warranty claim in which a replacement product is offered at no additional
cost or a proportion of the original purchase price is refunded to the customer. In either
case, there is an implicit “cost” to the manufacturer that directly impacts profitability.
In today’s competitive marketplace, companies are increasingly cost-conscious.
They now seek to tailor their product’s reliability to customer usage patterns in lieu of meeting
a military-grade “gold standard” (MIL-HDBK-217F, 1991) of reliability suitable for
enduring all possible environmental conditions. As such, manufacturers now view reliability
as an optimization variable when making cost, design and marketing decisions. To deter-
mine whether a product meets a stated reliability, experimental testing is typically done as
part of the research and development of a product. However, time-to-market pressures do
not allow for testing products under their typical operating conditions. Product launch decisions are
often dependent on conclusions made under a shortened test time period. These timelines
are achieved through experiments called Accelerated Life Tests (ALT). An ALT seeks to
replicate the same type of failure modes experienced under normal use conditions, but to
compress the onset of failure into a shorter period of time. This is achieved by conducting
experiments at two or more stress levels and forming a relationship (or regression model)
between the product lifetime and the stress level. This stress level to lifetime relationship
is then extrapolated back to the expected field stress conditions and lifetime performance
is inferred.
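To make the fit-and-extrapolate workflow concrete, here is a minimal sketch in Python; the stress levels, lifetimes, and Arrhenius-style model below are hypothetical illustrations, not data or results from this dissertation:

```python
import math

# Hypothetical mean lifetimes (hours) observed at two elevated temperatures (Kelvin).
# Under an Arrhenius-type life-stress model, log(life) is linear in 1/T.
stress_temps = [398.0, 423.0]          # accelerated test temperatures
mean_lives = [1200.0, 400.0]           # observed mean lifetimes at those stresses

x = [1.0 / t for t in stress_temps]    # transformed stress: 1/T
y = [math.log(L) for L in mean_lives]  # log lifetime

# Two stress levels determine the line log(life) = a + b/T exactly.
b = (y[1] - y[0]) / (x[1] - x[0])
a = y[0] - b * x[0]

# Extrapolate back to the use condition (e.g., 328 K) to infer field lifetime.
use_temp = 328.0
predicted_life = math.exp(a + b / use_temp)
print(round(predicted_life))
```

The key step is the last line: the relationship estimated entirely at accelerated stress levels is evaluated at the (unobserved) use condition, which is why model form and stress level selection matter so much.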
Although many reasons can be stated in support of reliability tests, they are often
very expensive to conduct. Costs include test sample construction and preparation, testing
apparatus setup and operation, as well as response measurement collection and analysis.
Therefore, companies must actively balance the need for precise parameter estimates with
the cost of acquiring such information. Planning for these trade-off decisions is a chal-
lenge due to additional complexities associated with reliability testing. Examples of these
complexities include data censoring, constraints in the feasible test region, non-normal re-
sponse distributions, and non-linearity and/or uncertainty in the form of the model. Thus,
a practitioner may be faced with many, often competing, objectives that need to be ad-
dressed when selecting the “best” test for an application. To help guide an experimenter,
Box and Draper (1975) and Myers, Montgomery, and Vining (2002) provide a list of good
experimental design properties.
Once the planning and execution of a reliability test is completed, the analysis
of the data still remains. Today, maximum likelihood estimation (MLE) is the most
commonly used approach toward estimating ALT model parameters involving censoring
(Aldrich, 1997). This approach involves evaluating first and second partial derivatives of
a log-likelihood function. For censored data, solving the resulting system of equations can
be complex and require iterative computational procedures such as the Newton-Raphson
method. Although computer software programs automate this analysis, it is often desirable
for practitioners to retain insight into the underlying structure of the
solution. Thus, a simplified method is preferred; one such method is discussed further in Chapters
4 and 5.
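As an illustration of what the likelihood machinery involves, consider a minimal sketch for Type-I censored exponential lifetimes (the failure times and censoring count below are made up). For this simple model the score equation happens to have a closed-form root, which a Newton-Raphson iteration on the log-likelihood recovers as well:

```python
# Hypothetical Type-I censored sample: the test ends at t_c = 100 hours.
failures = [12.0, 35.0, 58.0, 91.0]   # observed failure times
n_censored = 6                         # units still unfailed at t_c
t_c = 100.0

r = len(failures)
total_time = sum(failures) + n_censored * t_c   # total time on test

# For the exponential model, log L(lam) = r*log(lam) - lam*total_time,
# so the closed-form MLE is lam_hat = r / total_time.
lam_closed = r / total_time

# Newton-Raphson on the score function U(lam) = r/lam - total_time,
# whose derivative is U'(lam) = -r/lam**2.
lam = 0.001                            # starting guess
for _ in range(50):
    score = r / lam - total_time
    hessian = -r / lam ** 2
    lam -= score / hessian

print(lam_closed, lam)
```

For richer models (Weibull shape parameters, multiple stresses, interval censoring) no closed form exists and the iterative step above is all that remains, which is the complexity the GLM-based approach of Chapter 4 aims to sidestep.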
2. Motivation
Accelerated life testing is an integral part of reliability evaluations. This dissertation
explores planning and assessment techniques that can provide the best designs for ALTs
with censoring and non-linearity. The following section highlights the motivating factors
for investigating this area of research.
The first element of this research is motivated through my professional experience
at Intel Corporation as a reliability engineer. Over the last decade, there has been an ever-
increasing global push towards more environmentally sound manufacturing practices and
policies. One of the main concerns within the electronics industry is the use of lead in sol-
der. This issue came to the forefront of the entire industry as the European Union adopted
a “Restriction of Hazardous Substances” (RoHS) directive that took effect on July 1, 2006.
Complementing the RoHS directive is the “Waste Electrical and Electronic Equipment”
(WEEE) directive, which covers equipment recyclability. As a result, electronic manufacturers
are required to eliminate eutectic tin-lead solder from their products. Thus, companies are
evaluating many alternate materials that provide the necessary performance and reliability
for electronic interconnects. The problem arising from this effort is the need to revalidate all
supporting reliability models. Solder joint fatigue due to thermal cycling from powering the
system on and off is one of the main reliability concerns with interconnect materials. How-
ever, over forty years of testing data is available for eutectic solder fatigue. This is not true
for the proposed lead-free materials being investigated. As such, industry practitioners are
seeking to efficiently collect data in order to refine empirical models and to validate that
products with lead-free materials are sufficiently robust. Inefficient experimental design
construction prevents practitioners from effectively discerning between competing empirical
models. The origin of these inefficient experimental designs is traced back to industry stan-
dard test procedures. These procedures pre-date the adoption of “mechanism-based” or
“knowledge-based” use condition evaluations (e.g., Blish et al. (1999), Mencinger (2000))
which employ model building techniques commonly used by industrial statisticians. These
industry standard designs are only used because decades of data existed prior to the need
for empirical model building. Thus, data to support the selection of the model form for
eutectic solder is not a problem. However, lead free solder evaluations are not afforded
the same luxury in time. This motivates the integration of classical experimental design
techniques into accelerated life testing practices. More specifically, this research explores
design construction that is best suited to deal with data censoring, non-linear model form,
and restricted design regions. Details of this research (Monroe and Pan, 2008) are discussed
further in Chapter 3.
The second element of this research is motivated from the need to address model
parameter dependency problems typically incurred when solving an optimization involving
a non-linear model. Through this research, we offer an extension of the use of generalized
linear models (GLM) in a reliability application. By integrating GLM techniques into the
reliability planning procedure, a simpler and more structured formulation of the problem
is possible. With a structured approach toward experimental design selection in place,
it is easier to discover relationships that have an effect on test point selection during the
reliability planning stage. Work by Chipman and Welch (1996) suggests that optimal design
points for non-linear models do not necessarily lie at the extreme points, as is the case for
linear models. The ability to succinctly describe this behavior is missing from the literature
and is better facilitated through the use of GLMs.
The final element of this research addresses optimal design sensitivity and robustness.
Reliability practitioners are increasingly faced with the challenge of planning accelerated life
tests under large degrees of uncertainty. This uncertainty coupled with difficulties associated
with restricted feasible design regions, censoring, and sample size selection increases the risk
of an unacceptable experimental outcome. Moreover, since the initial values specified for
a lifetime model represent expert opinion or an initial guess, practitioners could feel some
degree of anxiety over the impact model parameter mis-specification has on the resulting
predictions. Thus, evaluating the performance of such optimal designs under various levels
of uncertainty enables a practitioner to more readily assess their comfort with a test plan
during the planning stage. We conclude with a discussion on the use of controllable factors
such as censoring time and sample size to mitigate highly sensitive scenarios. Through these
efforts, we provide practitioners with graphical and computational tools to better facilitate
accelerated life test planning.
3. Research objectives
The research objectives of this dissertation are:
1. To illustrate the importance of integrating experimental design considerations into
the early stages of accelerated life test planning. A design matrix’s impact on uncer-
tainty in estimating model parameters is quantified and compared to more traditional
considerations such as sample size, test condition allocation levels, test time, and
censoring scheme.
2. To extend the use of generalized linear models as an alternate technique for estimating
model parameters for survival data in lieu of maximum likelihood estimation.
3. To evaluate optimal stress level selection and sample size allocation strategies and
the differences that exist between linear and non-linear forms and to assess model
parameter sensitivity.
4. Summary
The remainder of this dissertation is structured as follows. A literature review of
the key areas of focus is summarized in Chapter 2. The review includes current research
in the area of accelerated life testing, optimal experimental designs, the model dependency
problem, and generalized linear models. Chapter 3 is a case study that further describes
the motivation for this research. This chapter addresses how experimental design concepts
augment existing ALT planning procedures. Chapter 4 introduces the theoretical work
associated with the GLM-based approach toward planning optimal experimental designs
for ALTs. Chapter 5 highlights the robustness of the proposed designs to uncertainty in
model parameter values as well as model form. Finally, Chapter 6 presents conclusions and
directions for future research.
CHAPTER 2
Literature Review
In this chapter, literature reviews are provided that are associated with the main ele-
ments of this dissertation. This review is partitioned into four parts: accelerated life testing,
optimal experimental designs, the model parameter dependency problem, and generalized
linear models.
1. Accelerated life testing
Accelerated life testing (ALT) is a procedure commonly used in the reliability field to
estimate product reliability. This procedure is motivated by the need to estimate the probability
distribution of a unit’s lifetime when it is operated within its standard environment. For
units with high reliability, lifetime testing is cost- and time-prohibitive. Thus, acceleration
methods are commonly used to hasten the onset of degradation or failure. Accelerated
life testing procedures are comprised of two components: an acceleration model and a life
distribution.
The first component is an acceleration model. To develop this model, accelerated
testing applies a more severe level of stress to the unit. The degree of stress
incurred is linked to an acceleration variable that can
be controlled during testing. Typically, the acceleration variable is temperature, voltage,
humidity, force, or any other engineering metric relevant to the system performance. An
acceleration model is statistically fit based on the results observed from several different
stress levels, all of which exceed the normal operating condition. We note that there is an
inherent risk in conducting accelerated life tests: stress levels that exceed a physical
property or limitation can introduce failure mechanisms that are not present under use conditions.
Therefore, prudent care must be taken to consider these possibilities prior to extrapolating
the results of an acceleration model back to the use stress level where inferences are made.
Some common life-stress acceleration models are Arrhenius, Inverse Power, and
Eyring. The Arrhenius reaction rate equation originates from the Swedish physical chemist
Svante Arrhenius. The Arrhenius life-stress model is formulated by assuming that life is
proportional to the inverse reaction rate of the process. It is used for accelerated testing
involving a thermal acceleration variable. In contrast, the Inverse Power Law is used for
non-thermal accelerated stresses. This model has a linear form when we take the logarithm
of both sides. The main property of a power law model is its scale invariance. The scaling
exponent in these models is often linked to the physical sciences. Finally, the generalized
Eyring relationship (1941) is based on chemistry and quantum mechanics principles. It is
used when temperature and a second non-thermal stress such as humidity or voltage are
the accelerated stresses of a test and their interaction is also of interest. Nelson (1990)
and Tobias and Trindade (1995) are excellent sources for additional details on life-stress
acceleration models. Examples of these life-stress acceleration models are shown in Table 1
for an exponential distribution, where a, b, and c are constants and s is the stress factor.

Table 1: Life-stress relationship models for accelerated life testing

Relationship      Model
Arrhenius         θ(s) = c · e^(b/s)
Inverse Power     λ(s) = c · s^a
Eyring            λ(s) = c · s · e^(−b/s)
The second component of an accelerated life test is the life distribution, which de-
scribes the dispersion of results expected from a population. Many common life distributions
belong to the exponential family with mean θ(s), and hence failure rate λ(s) = 1/θ(s). A
relation between the stress and failure rate is needed. In reliability, an exponential, Weibull,
or lognormal distribution is typically used.
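Putting the two components together, the following is a hedged numerical sketch (the constants b and c and the temperatures are illustrative choices, not values from this work) of an exponential life distribution whose failure rate is driven by an Arrhenius life-stress model:

```python
import math

def arrhenius_mean_life(s, b=7000.0, c=1e-4):
    # Acceleration model: mean life theta(s) = c * exp(b / s), with s the
    # absolute temperature (thermal acceleration variable); b, c illustrative.
    return c * math.exp(b / s)

def exponential_reliability(t, s):
    # Life distribution: exponential with failure rate lambda(s) = 1 / theta(s),
    # so the reliability function is R(t) = exp(-lambda(s) * t).
    lam = 1.0 / arrhenius_mean_life(s)
    return math.exp(-lam * t)

use_temp, test_temp = 328.0, 423.0
# Acceleration factor: ratio of mean lives at the use vs. the test stress level.
af = arrhenius_mean_life(use_temp) / arrhenius_mean_life(test_temp)
print(round(af, 1))
```

Here the acceleration factor quantifies the test-time compression: each hour at the elevated test temperature stands in for roughly `af` hours at the use condition, under the assumed model.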
2. Optimal experimental designs
2.1. Historical overview
Since Fisher (1926) introduced the concept of factorial designs, experimental design
has become an important consideration during the planning stages of a test. Standard
factorial designs offer many desirable properties such as orthogonality, ease of estimation,
and symmetry of estimation variances. In addition, experimenters enjoy the ease of use
associated with the symmetrical construction of a factorial design. In this construction,
test point locations are automatically specified once the practitioner defines the feasible
operating region. Despite all its benefits, one of the drawbacks of a factorial design is the
large sample size required as the number of factors grows. To address this issue, fractional
factorial designs and symmetry-based designs such as the Central Composite Design (CCD)
(Box and Wilson, 1951) and the Box-Behnken Design (BBD) (Box and Behnken, 1960), are
often used. These designs retain some of the desirable properties of a full factorial design,
while providing a smaller sample size requirement. However, even with CCD and BBD
designs available, there are some cases where these designs are not the “best” choice given
the requirements of the test. Such is the case when the distribution is non-normal or the
feasible design region is not regularly shaped. In such situations, forcing a symmetrical
design upon an irregularly shaped region leads to designs that do not fully span the feasible
design region. Under these conditions, selection of the best stress test levels may not be
initially obvious, and thus an alternate approach is needed in the selection of stress test
levels. Experimental designs generated for these types of conditions are referred to as
“optimal” designs in that they select the test point locations based on a criterion.
The following section provides a summary of the research in this area.
2.2. Relevant literature
Practitioners typically face many choices in planning an ALT. Typical considerations
include stress level selection, stress loading, sample size, sample allocation to each stress
level, and data censoring strategy. These considerations should be jointly evaluated with an
optimization criterion based on the purpose of the test. Given the cost and time associated
with this type of testing, interest in developing a procedure for optimizing the test plan
continues to be an active area of research. The remainder of this section highlights previous
research completed in each of these areas.
The seminal work in developing an optimal accelerated life testing procedure is
presented by Chernoff (1962). He outlines accelerated life testing plans for two life-stress
functions and applies them to exponential life distributions. Plans for alternate lifetime distributions
such as the Weibull, two-parameter Weibull, lognormal, and extreme value are developed
shortly thereafter by Mann (1972), Kielpinski and Nelson (1975), Meeker and Nelson (1975),
Nelson and Kielpinski (1976), and Nelson and Meeker (1978). Test plans with a single stress
factor are discussed in these works. Two-stress-level optimal plans are efficient, but they
are not robust to model mis-specification and do not allow validation or assessment of the
stress-life relationship. This is addressed by Meeker and Hahn (1977) through a compromise
plan that specifies three unique stress levels and employs a 4:2:1 ratio of test samples to
each stress level when stress levels are ordered low to high. This strategy is used since
failure rates decrease as stress levels decrease. Hence, by allocating an increasingly larger
proportion of samples as stress levels decrease, we maintain a reasonable chance of observing
failures. The work of Escobar and Meeker (1995) is the first to outline ALT plans for two
stress variables. Their approach states a primary and secondary objective. The primary
objective is to minimize the standardized asymptotic variance of the maximum likelihood
estimator for a given quantile, while the second objective is to achieve D-optimality. Since
then, a variety of extensions continue to be published. These include Park and Yum (1996)
who expand this approach to include two variables with an interaction effect; and Tang
et al. (2002) who use three stress levels and contour plots to minimize the variance of
an estimate of interest. For cases involving a categorical response, Joseph and Wu (2004)
offer the Failure Amplification Method (FAMe) as an approach for improving parameter
estimation. Given that categorical responses are not as informative as continuous ones,
this approach seeks to maximize the information that can be gained by amplifying the
probability of failure.
In addition to the particular combinations of design variables, the precision of results
relies on test unit allocation. Jiao (2001) and Ng, Balakrishnan, and Chan (2007) offer
recent additions to the literature regarding optimal sample size allocation while Bai and
Yun (1996) and Kim and Bai (1999) explore accelerated life tests for experiments with
unequal allocations to design combinations.
Censoring is yet another important consideration in reliability experiments. It is
generally cost and time prohibitive to fully execute reliability testing to completion. As a
result, practitioners must consider the added complexity of incomplete information. Typical
ANOVA or least squares regression techniques cannot be applied to censored data.
Escobar and Meeker (1995) document the procedure for computing Fisher Information ma-
trices. They use asymptotic covariance matrices to evaluate the precision of maximum likelihood
estimators (MLEs) under combinations of censoring, truncation, and explanatory variables. Tang, Goh,
and Ong (1999) expand this work to include failure-free life considerations.
In conclusion, many ALT planning approaches exist in the literature. Nelson (1990,
2005), Escobar and Meeker (1995), and Viertl (1988) are excellent summaries of the main
contributions in the area of accelerated life test planning.
2.3. Alphabetic optimality criteria
Optimal experimental design is first discussed by Smith (1918) in the context of
one-factor polynomials. However, focus on how to define an optimization criterion for ex-
perimental designs did not occur until the late 1950’s. Kiefer (1959) first introduces a set
of single value measures that are related to desirable characteristics involving a design’s
moment matrix. Each characteristic is assigned a “letter” based on the desirability charac-
teristic, and thus these measures are commonly referred to as “alphabetic” criteria for experimental
designs. Building upon the initial work of Kiefer, the alphabetic criteria now include
A, D, E, G, and I (also known as V or IV). These criteria are described in Table 2.
Table 2: Design criteria for alphabetic optimality

Criterion   Description of the objective
A           minimizes the trace
D           minimizes the determinant (volume of the joint confidence ellipsoid)
E           minimizes the norm (diameter of the joint confidence ellipsoid)
G           minimizes the maximum prediction variance over a region of interest
I, V        minimizes the average prediction variance over a region of interest
Letting X represent the design matrix, M be the moment matrix (X′X), and ξ be
a design measure, the various optimality criteria, Ψ, are stated in Equations (2.1) through (2.5)

Ψ_A{M(ξ)} = tr M⁻¹(ξ)    (2.1)

Ψ_D{M(ξ)} = log |M⁻¹(ξ)|    (2.2)

Ψ_E{M(ξ)} = ‖M⁻¹(ξ)‖ = [min λ_i(M)]⁻¹    (2.3)

Ψ_G{M(ξ)} = max_x [w(x) d(x, ξ)]    (2.4)

Ψ_I{M(ξ)} = min ∫_R d(x, ξ) µ(dx)    (2.5)

where λ_i(M) is an eigenvalue of M, d(x, ξ) = f′(x)M⁻¹f(x) is the standardized prediction
variance, and µ is a measure on the region of interest R.
Of all the optimality criteria, D-optimality is the most popular due to the simplicity
of its computation. However, G-optimality is increasing in popularity as computer software
packages facilitate its computation. Increased use of optimal designs is linked to the devel-
opment of computer algorithms. The origin of computer-based optimization algorithms can
be traced back to Mitchell’s development of DETMAX (1974). It is an exchange algorithm
that allows the size of the design to vary instead of requiring that the addition of a point
be immediately followed by a deletion.
In comparing D-optimal, G-optimal, and I-optimal designs, it is important to rec-
ognize the relative strengths and weaknesses of each. The D-optimality criterion minimizes
the joint confidence region of all the parameters. As such, its focus is on parameter estima-
tion. Thus, D-optimality is most often selected as a criterion in conjunction with screening
experiments where the objective of the experimenter is model selection. In comparison,
the focus of G- and I-optimality is prediction variance. G-optimality focuses on guiding the
practitioner toward designs that achieve the lowest worst-case prediction variance over a
prescribed region of interest. In contrast, I-optimality integrates over the region and returns
a design with the lowest average prediction variance. In the case that the region of interest
is a single point, I-optimality and G-optimality are equal. Finally, it is important to note
that all three optimality criteria assume that the form of the model is already known.
Upon selecting the optimality criterion, it is natural to compare the several
designs under consideration, which are often of different sizes. Design efficiencies are scaled
values that provide a metric of comparison ranging from 0 to 100%. D-efficiency and G-
efficiency can be expressed as
Deff = [ |M(ξ)| / |M(ξ∗)| ]^(1/p)    (2.6)

Geff = d(ξ∗) / d(ξ) = p / d(ξ)    (2.7)
where p is the number of parameters in the model, ξ is the design measure, and ξ∗ is the
optimal design measure. An excellent discussion of optimal designs and theory is written by
Atkinson (2006). More recently, research has extended the optimality criterion beyond a sin-
gle point measure. Variance dispersion graphs (VDGs) and fraction of design space (FDS)
plots allow a practitioner to evaluate prediction variances across a region. Giovannitti-
Jensen and Myers (1989) introduce VDGs as a means to evaluate locational dependencies
of prediction variances by plotting their minimum, mean, and maximum values as a func-
tion of radii. Similarly, Zahran, Anderson-Cook, and Myers (2003) develop FDS plots to
display scaled prediction variance (SPV) as a proportion of the volume of the design region.
FDS plots illustrate the level of design robustness based on their shape, with a flat line
at low levels of prediction variance being the most desirable. An excellent discussion of
these graphical techniques is found in Myers, Montgomery, and Anderson-Cook (2009) and
Goldfarb et al. (2004).
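The design efficiencies in Equations (2.6) and (2.7) are simple to compute for concrete designs. The sketch below computes the D-efficiency of a hypothetical 3-run design relative to a 2² factorial under a first-order model; the factorial is treated as the reference design ξ∗ for illustration, not as a proven global optimum.

```python
import numpy as np

def d_efficiency(X, X_ref):
    """Deff = (|M(xi)| / |M(xi*)|)**(1/p), where M = X'X / n is the
    scaled moment matrix and p is the number of model parameters."""
    p = X.shape[1]
    m = np.linalg.det(X.T @ X / len(X))
    m_ref = np.linalg.det(X_ref.T @ X_ref / len(X_ref))
    return (m / m_ref) ** (1.0 / p)

# First-order model in two factors: columns are (1, x1, x2).
full = np.array([[1, -1, -1], [1, -1, 1], [1, 1, -1], [1, 1, 1]], float)
# A smaller 3-run design in the same region (hypothetical).
small = np.array([[1, -1, -1], [1, 1, -1], [1, -1, 1]], float)

print(f"D-efficiency of the 3-run design: {d_efficiency(small, full):.3f}")
```

Scaling the moment matrix by the run count makes the comparison fair for designs of different sizes, which is the point of an efficiency measure.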
2.4. Computer algorithms
Alphabetic design optimality criteria are first described by Kiefer (1959) in the
late 1950’s. However, the prominent use of these optimal designs did not occur until the
development of computer-based algorithms. Mitchell’s DETMAX (1974) is the first example
of such an algorithm. Its objective is to find the maximum determinant of a design’s moment
matrix. Since then, many improvements have been made in the algorithms that are currently
available in today’s software. This section compares the various approaches using the D-
criterion as an illustration and concludes with a discussion on their current adoption into
commonly used software.
Currently, D-optimality algorithms in software can be summarized into one of five
categories. We list these chronologically based on their first publication as:
• Point exchange algorithms;
• Branch and bound methods;
• Simulated annealing;
• Coordinate exchange algorithms; and
• Genetic algorithms.
We briefly discuss each approach noting that the D-optimality criterion, ΨD, is
written as
Ψ_D = max_ξ |M(ξ)| = max_ξ |X′X|    (2.8)
where X is the design matrix, M is the moment matrix, and ξ is the design space.
2.4.1. Point exchange algorithms. Point exchange procedures were the first algo-
rithms developed to construct optimal experimental designs. Starting with Mitchell’s sem-
inal work in 1974, these algorithms use a steepest ascent approach toward locating the
optimal design. The initial approach and subsequent adaptations are briefly summarized as
follows. The user is first prompted to input a model form, number of experimental design
runs (n), and a list of potential candidate design points. Next, a randomly selected n-run
design is made. At this point, the initial design is improved by adding the (n + 1)st run
that maximizes the possible increase in |M(ξ)|. This is followed by a removal of the single
run that minimizes the decrease in |M(ξ)|. Thus, a series of “steps” are taken that provide
the largest incremental increase in the determinant. The step size, ∆D, can therefore be
expressed as
∆D = argmax[ |M(ξi+1)| − |M(ξi)| ]    (2.9)
where i is the ith cycle of adding and subtracting design points, and D is used to reference
the D-optimality criterion. This process continues until the incremental improvements in the
criterion, ∆D, become sufficiently small, typically less than 1 × 10⁻⁴. Noting that steepest
ascent strategies often yield locally optimal results, the procedure is repeated from many
different initial starting points. In addition, to allow the flexibility of examining designs
of many different sizes, Mitchell also incorporated the idea of “excursions”. An excursion
allows a design to grow or shrink provided it generates a higher D-efficiency value. The
DETMAX procedure represents a major breakthrough as considerations for constructing
designs had previously been limited to “direct search” methods described by Box and Draper
(1971). Yet, even with the innovative leap of DETMAX, practitioners were still limited by
the size of designs that could be considered. This is due to the reliance on punch cards as
the primary computing method during the mid-1970’s.
Despite this setback, DETMAX inspired many research efforts into creating various
permutations of Mitchell’s approach in an attempt to improve computational efficiency.
One such example is the work of Galil and Kiefer (1980b). They focus on improving the
selection of the initial design, thus reducing the required search time. Still other researchers
focused on the approach for augmenting (adding and subtracting) design points. A rank-
1 augmentation chooses points to add and delete sequentially. Examples include Wynn’s
algorithm (1972) and Mitchell’s DETMAX (1974). This is in contrast to an improved
rank-2 augmentation strategy where points are added and deleted simultaneously. The
most notable of these works are by Fedorov (1972), a modification to the Fedorov approach
by Cook and Nachtsheim (1980), the k-exchange algorithm by Johnson and Nachtsheim
(1983), and the kl-exchange algorithm by Atkinson and Donev (1989). The computational
improvements made are notable. For example, the modified Fedorov exchange algorithm is
about twice as fast as the full Federov algorithm while producing comparable designs. An
excellent review of optimal design algorithm development is found in Cook and Nachtsheim
(1980) and Steinberg and Hunter (1984).
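The add/remove cycle behind a DETMAX-style point exchange can be sketched as follows. This is illustrative only: a small candidate grid, a first-order model, and no excursions, with |X′X| recomputed directly rather than updated incrementally.

```python
import numpy as np

rng = np.random.default_rng(0)

# Candidate list: 3x3 grid on [-1, 1]^2, expanded for a first-order
# model with columns (1, x1, x2).
cand = np.array([[1.0, a, b] for a in (-1, 0, 1) for b in (-1, 0, 1)])

def det_xtx(rows):
    X = cand[rows]
    return np.linalg.det(X.T @ X)

def point_exchange(n=4, tol=1e-4, max_cycles=200):
    """DETMAX-style cycle: add the candidate giving the largest increase
    in |X'X|, then delete the run whose removal decreases it least."""
    rows = list(rng.choice(len(cand), size=n, replace=False))
    current = det_xtx(rows)
    for _ in range(max_cycles):
        add = max(range(len(cand)), key=lambda j: det_xtx(rows + [j]))
        rows.append(add)
        keep = max(range(len(rows)),
                   key=lambda i: det_xtx(rows[:i] + rows[i + 1:]))
        rows.pop(keep)
        new = det_xtx(rows)
        if new - current < tol:   # stop once the improvement is negligible
            break
        current = new
    return rows, det_xtx(rows)

rows, d = point_exchange()
print("design points:", cand[rows][:, 1:].tolist(), "|X'X| =", round(d, 2))
```

Because removing the just-added point always recovers the previous design, each cycle is non-decreasing in the determinant; as the text notes, such steepest-ascent searches are typically restarted from several random initial designs.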
2.4.2. Branch-and-bound methods. A branch-and-bound algorithm was developed by
Welch (1982). The concept employed by Welch was previously popularized by the work of
Little et al. (1963) to solve the so-called “traveling salesman” problem. This approach
seeks to divide all possible n-point designs into two partitions which form a binary tree
(branching) structure. At first glance, this is essentially an exhaustive search approach
which would be inefficient. However, through the use of convenient “bounds”, much of the
tree structure is never evaluated. This is analogous to “pruning” decision-trees in the field
of data mining and machine learning. By adapting some of the work of Fedorov (1972)
and applying the Hadamard inequality for positive definite matrices, a strategy similar
to point exchange algorithms is applied where a ∆D function is used. It is important to
note that although many of the algorithms discussed up to this point may be viewed as
an approximation to Fedorov’s delta function, their exact handling of the optimization of
the delta function may differ. In addition, although sharing many similarities with point-
exchange, branch-and-bound methods differ in that they guarantee a global D-optimum
design. However, this guarantee comes at the expense of additional computing time.
2.4.3. Simulated annealing. A third type of computer-generated algorithm, called
simulated annealing, was proposed by Haines (1987). This heuristic method is based on the
Metropolis algorithm used in statistical physics. The key to this approach is to introduce
a perturbation scheme that eliminates the need for gradient-based calculations (e.g. ∆D).
Furthermore, what distinguishes Haines’ work from others is the nesting of the Metropolis
algorithm within an annealing procedure. This approach differs from the previous point-
exchange approaches in that a path of steepest ascent is not followed. Instead, the annealing
algorithm allows some degree of random oscillations to occur in the hope of avoiding lo-
cal minima. As such, it is simply a heuristic and differs from branch-and-bound in that
the optimum returned is not necessarily global. However, it does offer robustness to small
changes in the coordinates of the design points. As such, it forms yet another choice for
practitioners to consider as they make trade-off decisions between computation efficiency,
global optimality, and algorithm robustness.
2.4.4. Coordinate-exchange algorithms. Cyclic coordinate-exchange algorithms
were initially developed by Meyer and Nachtsheim (1995). This class of algorithm dif-
fered greatly from the wide range of point-exchange techniques. Among the main benefits
associated with this approach are (1) eliminating the need for an explicitly defined, con-
structed, or enumerated candidate set; (2) avoiding the problems associated with a need to
identify global optima; (3) adding the ability to mix discrete and continuous design spaces
together; and (4) reducing computational time by one or two orders of magnitude for large
problems.
The coordinate-exchange algorithm is a modification of the k-exchange algorithm
of Johnson and Nachtsheim (1983). This approach was prompted by observing that “the
performance of design algorithms seems to improve with the level of greediness” (Meyer
and Nachtsheim, 1995). One of the key features that enabled the eventual development
of coordinate-exchange is Fedorov’s “delta function” term. Fedorov showed that if the
exchange of points results in the maximum increase in the determinant, that relationship
can be expressed as
|M(ξi+1)| = |M(ξi)| [ 1 + ∆D,i(xj , x) ]    (2.10)
where ∆D,i(xj , x) is the delta function term and the D subscript indicates that the function
is applicable to the D-optimality criterion. However, Meyer and Nachtsheim recognized
that the computational complexity of Fedorov’s delta function could be greatly reduced by
partitioning the coordinates in a special way. Specifically, by partitioning the coordinates
into a fixed group and an exchanged group, a much lower order computation is required.
This can be illustrated by the following simple example originally discussed by Meyer and
Nachtsheim (1995).
Suppose the jth coordinate group of the ith design point, xij , is to be exchanged. If we
partition the model terms such that those involving xj form an exchanged group and the
remaining terms, involving x−j , form a fixed group, then f(x) can be reordered as

f(x) = [ f1(xj) ; f2(x−j) ]    (2.11)

where f2(x−j) corresponds to the components of f(x) that do not explicitly involve xj .
Now consider a model with f′(x) = (1, x1, x2, x1x2, x1²) where we exchange coordinates
along j = 1. This places all terms involving x1 in the exchange group, f1, and the remaining
terms in group f2. Thus f1′(x1) = (x1, x1x2, x1²) and f2′(x−1) = (1, x2), respectively. The
partitioned coefficient matrix, A, corresponding to f can then be reduced to Equation (2.12).

A = [ A11 A12 ; A21 A22 ]    (2.12)

This, in turn, allows the delta function for D-optimality to be reduced to Equation (2.13),
where a and c are both constants.

∆_D^ij(xij , xj , xi,−j) = f1′(xj) A11 f1(xj) + a′f1(xj) + c    (2.13)

Therefore, this approach applied to a D-optimality condition can be written as

max_{xj ∈ χj} ∆_D^ij(xij , xj , xi,−j) = ∆_D^ij(xij , xj∗ , xi,−j).    (2.14)
Thus, the maximization of the delta function involves only the evaluation of a reduced matrix
and its inner products. For example, for first-order models, the computation involves only
four multiplications and a single addition. By comparison, previous point-exchange methods
for convex design spaces involved the evaluation of a quadratic form involving a p×p matrix.
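A minimal sketch of the coordinate-exchange idea follows, assuming a first-order model in two factors and a small discrete level set. For clarity this version recomputes |X′X| at every trial level; a real implementation would exploit the delta-function shortcut of Equations (2.13)–(2.14) instead.

```python
import numpy as np

rng = np.random.default_rng(1)

def log_det(design):
    """log |X'X| for a first-order model f(x) = (1, x1, x2)."""
    X = np.column_stack([np.ones(len(design)), design])
    sign, ld = np.linalg.slogdet(X.T @ X)
    return ld if sign > 0 else -np.inf

def coordinate_exchange(n=6, levels=(-1.0, 0.0, 1.0), max_passes=20):
    """Greedy coordinate exchange: visit one coordinate of one run at a
    time and keep a new level only if it strictly increases |X'X|."""
    design = rng.choice(levels, size=(n, 2))
    best = log_det(design)
    for _ in range(max_passes):
        improved = False
        for i in range(n):
            for j in range(2):
                old = design[i, j]
                for v in levels:
                    design[i, j] = v
                    cand = log_det(design)
                    if cand > best + 1e-12:
                        best, old, improved = cand, v, True
                design[i, j] = old   # restore the best level found
        if not improved:
            break
    return design, best

design, ld = coordinate_exchange()
```

Note that no candidate set is enumerated: each coordinate is varied in place, which is exactly the feature that lets coordinate exchange mix discrete and continuous factors.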
2.4.5. Genetic algorithms. The most recent development in computer-generated de-
signs involves Genetic Algorithms (GA). This approach explores a much different path
toward developing computer-generated optimal designs. These algorithms reflect matur-
ing methods for constructing optimal experimental designs. In particular, these techniques
have an ability to selectively choose preferred elements of multiple different algorithms. In
previous algorithms, the need for developing a candidate set was often a limitation. In
addition, the strict application of an alphabetic criterion often drew serious criticism for
creating designs that are very dependent on the model specified by the experimenter. By
comparison, genetic algorithms eliminate both of these issues.
The methods genetic algorithms employ toward finding these solutions are also dif-
ferent and are loosely based on Darwin’s evolution theory. Back (1996) was among the
first to apply these concepts. These iterative heuristics involve one of several processes for
determining the next “generation” of designs. These processes are recombination, mutation,
and selection. Recombination, or crossover, generates offspring
designs in which some test runs are received from each parent. Mutation is the process
that attempts to emulate natural variation. Under this scenario, the new design contains a
random change introduced into the original design. Many different approaches have been
taken. For example, Heredia-Langner et al. (2003) used a Gaussian perturbation scheme
for selecting a mutation, while Borkowski (2003) employed a blending technique involving a
linear combination of two runs. Finally, selection evaluates offspring designs against an
objective function and uses the result as the criterion for choosing which designs serve as
parents in the next iteration.
Using these various processes, several research efforts have emerged over the last
decade. A small sampling of works includes Parkinson (2000), Hamada et al. (2001),
Heredia-Langner et al. (2003, 2004), and Borkowski (2003), and this area continues to grow
in popularity. A complete taxonomy of hybrid heuristics can be found in Talbi (2002).
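The recombination, mutation, and selection loop can be sketched with a toy genetic algorithm. Here the "fitness" is |X′X| for a one-factor quadratic model; the population size, mutation scale, and generation count are arbitrary illustrative choices, not tuned values from the literature.

```python
import numpy as np

rng = np.random.default_rng(7)

def fitness(design):
    """|X'X| for a quadratic model f(x) = (1, x, x^2) in one factor."""
    X = np.column_stack([np.ones(len(design)), design, design ** 2])
    return np.linalg.det(X.T @ X)

def evolve(pop_size=20, n_runs=4, generations=60, mut_sd=0.1):
    pop = rng.uniform(-1, 1, size=(pop_size, n_runs))  # random initial designs
    for _ in range(generations):
        scores = np.array([fitness(d) for d in pop])
        parents = pop[np.argsort(scores)[-pop_size // 2:]]   # selection
        children = []
        while len(children) < pop_size - len(parents):
            p1, p2 = parents[rng.choice(len(parents), 2, replace=False)]
            mask = rng.random(n_runs) < 0.5                  # recombination
            child = np.where(mask, p1, p2)
            # mutation: small Gaussian perturbation, clipped to the region
            child = np.clip(child + rng.normal(0, mut_sd, n_runs), -1, 1)
            children.append(child)
        pop = np.vstack([parents, children])
    scores = np.array([fitness(d) for d in pop])
    return pop[np.argmax(scores)], scores.max()

best, score = evolve()
```

Because each run's coordinates evolve freely in [-1, 1], no candidate set is required, which mirrors the advantage the text attributes to genetic approaches.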
2.5. Software integration
As with many great innovations, the real challenge lies in the ability to maintain
functionality while distributing the knowledge to a broader audience. The same can be said
for the integration of computer-based optimal design algorithms into commercially avail-
able software. Over the last 30 years, optimal design algorithms have gone from something
only possible with punch card systems to standardized modules capable of being run on
the most basic personal computer in a matter of minutes. In the 1980’s, point-exchange
algorithms were the only option available, although their availability was generally limited
to custom code written in languages such as FORTRAN. During the 1990’s, researchers continued to follow
a path in which modest improvements in the computational efficiency of point-exchange
algorithms were made. During this period, the widespread use of personal computers and
corresponding growth in software availability led to greater use of computer-based experi-
mental designs. Echip, SAS, RS/Discover, and Design Expert were examples of packages
that provided optimal-design options at that time (Meyer and Nachtsheim, 1995). Even so,
the numerical algorithms available at that time generated designs focusing on a select few
desirable properties. Thus, they were not well-suited for the general practitioner. In partic-
ular, they would rely on the user to be cognizant of the many pitfalls of optimal designs. A
decade earlier, Steinberg and Hunter (1984) recognized the importance of easy-to-use “in-
teractive expert software”. The interactive programs that they sought would be intelligent
enough to prompt the experimenter to consider many of the same questions that a seasoned
statistical consultant would ask. In recent years, the statistical community has witnessed
just such an event. Algorithms have become increasingly more discerning and efficient with
respect to their optimization criteria, allowing users to explore a much broader range of
complex problems. In particular, the emergence of graphical techniques has served as an
excellent tool to demonstrate the sensitivities and pitfalls of certain design considerations.
Today, coordinate-exchange algorithms are the most widely used method of
computing D-optimal designs. They are prominently found in software packages such as SAS,
JMP, Matlab, Design Expert, and others. The exception is Minitab, which currently uses
a point-exchange method (along with a Fedorov-type algorithm) in its most recent
software release. In some cases (such as SAS), the practitioner may choose among several
of these algorithms. In conclusion, with the development of clever algorithms such as
coordinate-exchange and the emergence of genetic algorithms, there is great promise for
continued improvements in software for the foreseeable future.
2.6. Applications in reliability
It is important to note that optimal designs have localized benefits. The ability of
a practitioner to select the optimal design is predicated on having a priori knowledge
of the model form. Although optimal designs might be used for screening purposes, they
do not guarantee that the practitioner can easily manage sequential testing options such
as design augmentation. Despite this limitation, these designs generally can provide useful
information provided care is used by the experimenter.
With the theoretical foundation laid in the previous sections, we now turn our at-
tention to applications of optimal design within the reliability field. Optimal experimental
design has largely been used for two purposes: to minimize the prediction variance of the
response at the use condition stress or to minimize the variance of the model parameter
estimates. The majority of the literature addresses the former case.
Escobar and Meeker (1995) first applied D-optimality as a criterion for accelerated
test design; however, it was a secondary criterion. The focus of this and subsequent work by
this team has been on reducing the prediction variance of the response. As an extension of
this research, Park and Yum (1996) have explored designs with two stresses involved with
potential interaction between them. Focusing more on model parameter estimates, Onar
and Padgett (2003) have previously applied a locally penalized D-optimality criterion. This
technique addresses situations when constraints exist in the feasible region. Other research,
such as Ng et al. (2007), has focused on sample allocation. Sitter and Torsney (1995)
explored binary response experiments. Tang, Goh, Sun, and Ong (1999a) discussed the
challenges of censoring for two-parameter exponential distributions while Tang and Yang
(2002) discussed testing at multiple constant stress levels. Most recently, Zhao and Elsayed
(2005) have extended the optimality discussion to include proportional mean residual life.
A comprehensive look at accelerated life test plans is given by Nelson (2005a, 2005b).
3. Unknown model parameter dependency
The previous section has outlined many criteria for determining the “optimal” design
for prediction quality or precision of parameter estimates. Equations (2.1) through (2.5)
illustrate the strong dependence a design matrix, X, has on the outcome of the experiment.
The selection of the experimental design is based in large part on the ability of the practi-
tioner to postulate an adequate model in the region of interest. For the case of non-linear
models, this poses a dilemma as the formulation of a model is dependent upon knowledge
of the parameter estimates, which are unknown. Khuri and Mukhopadhyay (2006) outline
some of the common approaches toward solving this design dependence problem. They
include:
1. the specification of initial values, or best guesses of the parameters and the con-
struction of locally-optimal designs using a design criterion such as D-optimality or
G-optimality;
2. a sequential experimentation approach in which the estimate of the parameters are
updated in successive stages, starting with an initial value; and
3. a Bayesian approach in which a prior distribution is assumed on the parameters, which
is then incorporated into an appropriate design criterion by integration over the prior
distribution.
For this research, we focus our discussion on strategies associated with the first option listed.
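A toy version of that first option, a locally optimal design built from parameter best guesses, is sketched below for an exponential-life test with a log-linear mean and Type I censoring. The parameter guesses, censoring time, and stress range are all hypothetical, and only two-point designs on a grid are searched.

```python
import numpy as np
from itertools import combinations

# Hypothetical "best guesses" of the model parameters (option 1 above):
# mean life theta(x) = exp(b0 + b1 * x), x a coded stress in [0, 1].
b0, b1 = 2.0, -3.0
tc = 5.0                  # Type I censoring time (hypothetical)

def unit_info(x):
    """Per-unit Fisher information about (b0, b1) at stress x. For an
    exponential life with Type I censoring, the information scales with
    the probability of observing a failure before tc."""
    theta = np.exp(b0 + b1 * x)
    p_fail = 1.0 - np.exp(-tc / theta)
    f = np.array([1.0, x])
    return p_fail * np.outer(f, f)

# Locally D-optimal two-point design: maximize |sum of unit information|.
grid = np.linspace(0.0, 1.0, 101)
best = max(combinations(grid, 2),
           key=lambda pts: np.linalg.det(sum(unit_info(x) for x in pts)))
print("locally D-optimal stress pair (low, high):", best)
```

The resulting design depends directly on the guessed (b0, b1): the low stress point moves until the spread of the two points is balanced against the shrinking failure probability at low stress, which is precisely why such designs are only locally optimal.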
3.1. Parameter dependency
Alphabetic-optimal designs have become increasingly popular as the algorithms have
been integrated into commercial software. However, for GLMs, these “optimal” designs are
sensitive to model parameter assumptions. Thus, using the same model with even a small
departure in the assumed parameter values could greatly change the resulting optimal de-
sign. This sensitivity has been recognized as a problem, and alternate approaches have been
developed to counter it. This section will summarize the approaches most commonly found
in the literature and provide a detailed discussion of a recent Bayesian approach by Zhang
and Meeker (2006) that pertains specifically to design selection for accelerated life tests.
The section will conclude with a brief overview of how the Bayesian method may be applied.
The approaches toward developing more robust designs can be summarized into the
following categories:
1. Model-robust designs;
2. Model-sensitive (discriminating) designs;
3. Bayesian designs; and
4. Multifaceted selection criteria designs.
3.2. Model-robust designs
The selection process with alphabetic-optimal designs assumes that both the model
selected and its parameter estimates are perfectly known, which is unlikely under even the
best of circumstances. For model-robust designs, a practitioner may have strong assertions
that the model specification is correct, but may have doubts as to the exact values for the
model parameters. Under these circumstances, designs are sought that provide reasonable
results across a wide range of potential values. Another scenario is that the practitioner
is uncertain about the model specification. In this situation, the practitioner wishes to
select designs that are robust such that inaccuracies in the model could be identified or
refinements in the model could be considered.
In each of these cases, the primary concern raised with alphabetic-optimal (D-
optimal) designs is their tendency to concentrate all the experimental runs on a select
few design points which are ideally suited toward estimating the coefficients of the assumed
model. This is a poor model-building strategy as it provides little or no opportunity to
check for lack of fit. Furthermore, bias will also likely become more severe as design points
are moved further away from one another. This motivated Box and Draper (1959) and
Box (1982) to advocate alternative experimental designs that consider the possible effects
of bias. They argued that a more appropriate criterion for comparing experimental designs
is the average mean square error (AMSE) which can be written as the sum of the average
prediction variance (APV) and average squared bias (ASB). Equation (2.15) shows this
relationship
AMSE = APV + ASB

     = (NK/σ²) ∫_R E[ y(x) − E[y(x)] ]² dx + (NK/σ²) ∫_R [ E[y(x)] − f(x) ]² dx

     = (NK/σ²) ∫_R E[ y(x) − f(x) ]² dx    (2.15)

where N/σ² is a scale factor for variance, and K is the reciprocal of the volume of the
region. This criterion effectively penalizes designs that place all of their points in the
extreme regions.
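The trade-off that Equation (2.15) captures can be demonstrated numerically. The sketch below fits a straight line when the assumed true response is quadratic (a hypothetical truth, for illustration) and compares the average prediction variance and average squared bias of an extreme-points design against a more spread-out design; the NK/σ² scaling is omitted and the integrals are approximated by grid averages.

```python
import numpy as np

def f1(x):
    """Fitted first-order model terms (1, x)."""
    return np.array([1.0, x])

def truth(x):
    """Hypothetical true response, quadratic in x."""
    return 1.0 + x + 0.8 * x ** 2

def apv_asb(design, sigma2=1.0, grid=np.linspace(-1, 1, 201)):
    X = np.array([f1(x) for x in design])
    M_inv = np.linalg.inv(X.T @ X)
    eta = np.array([truth(x) for x in design])
    beta_bar = M_inv @ X.T @ eta          # expected least-squares coefficients
    pv, sb = [], []
    for x in grid:
        fx = f1(x)
        pv.append(sigma2 * fx @ M_inv @ fx)        # prediction variance
        sb.append((fx @ beta_bar - truth(x)) ** 2)  # squared bias
    return np.mean(pv), np.mean(sb)

extreme = [-1, -1, 1, 1]          # all runs at the extremes
spread = [-1, -1 / 3, 1 / 3, 1]   # runs spread across the region
for name, d in [("extreme", extreme), ("spread", spread)]:
    v, b2 = apv_asb(d)
    print(f"{name}: average PV = {v:.3f}, average squared bias = {b2:.3f}")
```

Under this hypothetical quadratic truth, the extreme design has the lower variance but a much larger bias, so its average mean square error exceeds that of the spread design, which is the point of the Box and Draper argument.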
In the spirit of this work, Huber (1975) completed sensitivity studies of model mis-
specification using a minimax analysis. He similarly concluded that considerable bias could
occur from the quadratic terms. Kussmaul (1969) examined model mis-specification and
suggested that a G-optimal design for a higher order polynomial would be a good strategy.
Lauter (1974) suggested examining a collection of potential model forms and pro-
posed maximizing the average of the log of the determinant. Cheng, Steinberg, and Sun
(1999) used estimation capacity (EC) as a criterion, which considers the percentage of
alternate or potential models that are estimable. Cook and Nachtsheim (1982) and Li
and Nachtsheim (2000) have constructed model-robust factorial designs where the design
criterion is given by

φ(D) = Σ_{i=1}^{m} w_i e_i(D)      (2.16)

where e_i(D) is a measure of the efficiency or optimality of the design D for model i. A good
review of these and other approaches is offered by Steinberg and Hunter (1984).
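A minimal sketch of the weighted criterion of Eq. (2.16), assuming hypothetical candidate models (an intercept plus subsets of two main effects), equal weights, and e_i taken as a scaled D-value; the designs and model list are invented for illustration:

```python
import numpy as np

def d_value(X):
    """Scaled D-criterion value |X'X|^(1/p) / n for model matrix X."""
    n, p = X.shape
    return np.linalg.det(X.T @ X) ** (1.0 / p) / n

def phi(design, models, weights):
    """phi(D) = sum_i w_i e_i(D), with e_i(D) taken here as the scaled
    D-value of design D under candidate model i."""
    total = 0.0
    for w, cols in zip(weights, models):
        X = np.column_stack([np.ones(len(design))] +
                            [design[:, j] for j in cols])
        total += w * d_value(X)
    return total

D1 = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1]], float)   # 2^2 factorial
D2 = np.array([[-1, -1], [1, -1], [-1, 1], [0, 0]], float)   # one run moved to center
models = [(0,), (1,), (0, 1)]   # candidate main-effect models
w = [1 / 3, 1 / 3, 1 / 3]
print(phi(D1, models, w), phi(D2, models, w))
```

The factorial design scores higher here because it is efficient for every candidate model simultaneously, which is the point of the weighted criterion.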
3.3. Model-sensitive designs
Model-sensitive designs are an alternate approach toward achieving increased model
robustness. These designs focus on being able to discriminate between alternate (candidate)
models. As such, they are often referred to as model discrimination designs. This idea orig-
inated in the development of discriminating experimental designs for regression modeling.
Pioneers in this approach include Hunter and Reiner (1965), Box and Hill (1967), and Hill,
Hunter, and Wichern (1968). As an example, the latter group formed a hybrid of this
approach with a focus both on model discrimination and parameter estimation. Similarly,
Atkinson (1972) offered suggestions on designs that served a dual role. His compromise
designs targeted both parameter estimation for an assumed model, and enabled testing
for lack of fit associated with quadratic terms. More recently, the concept of model-robust
factorial designs (MRFD’s) was developed by Li and Nachtsheim (2000). This spawned
the research of Bingham and Li (2002) into a class of optimal robust parameter designs
that included control-by-noise interactions. Work by Jones, Li, Nachtsheim, and Ye (2009)
integrates model-robustness with supersaturated and partially supersaturated designs. Although
many other publications exist, they all share a common thread: a desire to expose model
shortcomings in such a way as to help the experimenter develop a better model.
3.4. Bayesian designs
As previously mentioned, a practitioner may have doubts as to the exact values of
the model parameters; in such cases, Bayesian methods are useful in constructing robust
designs. Knowledge (either through prior experimental results or expert opinion) of the range
and relative likelihood of potential values is stated as a prior distribution. This prior
may reflect either a highly informed state of knowledge or relative uncertainty,
typically expressed as a diffuse probability density function such as the uniform distribution.
A weighted objective criterion is then computed by integrating across the model parameter
uncertainty using the probability density function. A design that maximizes this objective
function is called a Bayesian optimal design.
Based on the bias concerns previously identified, O'Hagan (1978) promoted a
Bayesian strategy. Interestingly, his design criterion, which was based on a posterior
variance, advocated the addition of more points in the center region of the design,
a common conclusion among many other approaches. In light of this, Draper (1982) offered
an approach toward determining how many center points to add. DeGroot and Goel
(1979) and Chaloner and Larntz (1992) examined the applicability of Bayesian estimation
for accelerated life testing. More recently, DuMouchel and Jones (1994) published a sim-
ple Bayesian approach to D-optimal designs to reduce dependence on an assumed model.
Chaloner and Verdinelli (1995) noted that experimental design is a situation where it is
meaningful within Bayesian theory to average over the sample space. The work by Agboto
and Nachtsheim (2005) also added support for the use of Bayesian techniques.
In the realm of accelerated life testing, many techniques have been published. They
include Polson’s (1993) general decision theory; Verdinelli, Polson, and Singpurwalla’s
(1993) utility functions based on a Shannon information criterion; and Erkanli and Soyer's
(2000) designs for exponential lifetimes without censoring. With respect to the D-optimality
criterion, Ginebra and Sen (1998) and Pascual and Montepiedra (2002) used a minimax design
criterion. Others such as Zhang and Meeker (2006) targeted parameter estimation when
censoring exists.
3.5. Multifaceted selection criteria designs
Genetic algorithms have emerged as a technique for finding experimental designs
that meet multifaceted selection criteria. As such, their objective functions can incorporate
considerations from several areas. Initial methods that illustrate this in the literature in-
clude research by Parkinson (2000), Hamada et al. (2001), Heredia-Langner et al. (2003,
2004), and Borkowski (2003). This area continues to grow in popularity as an approach.
Each published study seems to spawn new and innovative ideas. Such is the case with
authors such as Drain et al. (2004) who have used hybrid heuristics (genetic algorithm and
simulated annealing) and Del Castillo et al. (2007) who have created designs robust against
noise factors.
4. Use of a Bayesian D-optimal criterion in designing accelerated life tests
Many features of this approach may be adapted to the present research. In particular,
we believe the construction of the acceleration
model could easily be expanded to include three stress factors. Standardization of the
accelerating variables and the use of the log-location scale distribution would also be an
easy adaptation. For designs that are focused on parameter estimation, and not prediction
variance, we suggest changing the utility function to include a Shannon loss function. The
Shannon criterion is an entropy-based idea originally developed by Lindley (1956) and shown
in Equation (2.17),

U(d, θ, D, Y) ∝ log [ p(θ | y, D) / p(θ) ] .      (2.17)
Substituting this loss function into a Bayesian-based utility function yields

U(D) = ∫∫ log [ p(θ | y, D) / p(θ) ] p(y, θ | D) dθ dy .      (2.18)
For a normal linear model with p parameters, it has been previously shown that the solution
D* that maximizes information gain is

U(D*) = −(p/2) log(2π) − p/2 + (1/2) log | (X'X + R) / σ² |      (2.19)
where the first two terms are a constant, R is the contribution from the prior distribution,
and X is the design matrix. Thus, the Bayesian D-optimal information gain occurs for
design D* in which

D* = arg max_D | X'X + R | .      (2.20)
On a final note, Jones, Lin, and Nachtsheim (2008) recognized that maximizing
|X'X + R| is equivalent to maximizing |M_D + n⁻¹R|, where M_D = X'X/n is the normalized
information matrix of design D. They advocate this form of the objective because it provides
an insightful look at how the Bayesian optimal approach compares to the standard D-optimality
criterion: when either n is large (n → ∞) or little prior information about θ is
available (R → 0), the Bayesian approach reduces to D-optimality.
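This limiting behavior is easy to verify numerically; the sketch below uses a hypothetical 2-parameter design matrix and a diagonal prior precision τI standing in for R:

```python
import numpy as np

def bayes_d(X, R):
    """Bayesian D-criterion |X'X + R| of Eq. (2.20)."""
    return np.linalg.det(X.T @ X + R)

X = np.array([[1, -1], [1, -1], [1, 1], [1, 1]], float)   # n = 4 runs, p = 2
classical = np.linalg.det(X.T @ X)                        # standard D-criterion
for tau in [10.0, 1.0, 1e-8]:   # prior precision shrinking toward zero
    print(tau, bayes_d(X, tau * np.eye(2)))
# As R -> 0 (or as n grows relative to R), |X'X + R| -> |X'X|,
# recovering the classical D-optimality criterion.
```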
5. Generalized Linear Models
Linear models are the basis for most analysis of continuous data. This is typically
expressed in the form

E(Y_i) = μ_i = x_i'β,    Y_i ~ N(μ_i, σ²)      (2.21)

where x_1, x_2, . . ., x_p are covariates and the random variables Y_i are assumed to be
independent and normally distributed with mean μ_i and constant variance σ². In some cases, the
random variables involve count data, binary responses, or other non-normally distributed
data. Advances in statistical theory have allowed the structure of linear model fitting to be
applied in more general situations. This branch of analysis is referred to as a generalized
linear model (GLM). Nelder and Wedderburn (1972) were the co-developers of the GLM
technique that unifies the linear, logistic, and Poisson regression under a single framework.
A GLM consists of three elements:
1. a response variable distribution function from the exponential family;
2. a linear predictor η_i = x_i'β; and
3. a link function g such that g(μ_i) = η_i = x_i'β.
The appeal of GLMs is the unifying structure they provide. Excellent sources for
details pertaining to the construction of GLMs can be found in McCullagh and Nelder
(1989), Myers, Montgomery, and Vining (2002), and Dobson (2002).
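As a concrete, purely illustrative instance of the three elements, the sketch below fits a Poisson GLM with a log link by iteratively reweighted least squares; the simulated data and coefficient values are invented, and note that the algorithm requires no user-supplied starting values:

```python
import numpy as np

def poisson_irls(X, y, tol=1e-8, max_iter=50):
    """Fit a Poisson GLM with log link by iteratively reweighted least
    squares.  The three GLM elements: Poisson response distribution,
    linear predictor eta = X b, and link g(mu) = log(mu).  The working
    response is built from the data, so no starting values are needed."""
    eta = np.log(y + 0.5)         # crude self-starting linear predictor
    b = np.zeros(X.shape[1])
    for _ in range(max_iter):
        mu = np.exp(eta)
        W = mu                    # Poisson weights: Var(Y_i) = mu_i
        z = eta + (y - mu) / mu   # working (adjusted) response
        XtW = (X * W[:, None]).T
        b_new = np.linalg.solve(XtW @ X, XtW @ z)
        eta = X @ b_new
        if np.max(np.abs(b_new - b)) < tol:
            b = b_new
            break
        b = b_new
    return b

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 400)
X = np.column_stack([np.ones_like(x), x])
y = rng.poisson(np.exp(0.5 + 1.2 * x)).astype(float)   # true coefficients (0.5, 1.2)
b = poisson_irls(X, y)
print(b)   # maximum likelihood estimates, close to (0.5, 1.2)
```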
GLMs have been applied to a broad range of problems. One such area is survival
data analysis. Aitkin and Clayton (1980) and Whitehead (1980) applied GLMs to survival
data. Most notably, they discuss how to address both censoring using a Poisson variate and
time-dependent covariates through a proportional hazards (PH) model. Other published
research using the PH model includes the paper by Barbosa and Louzada-Neto (1994), who
examined ALTs with interval censoring.
Wang and Kececioglu (2000) highlight the estimation benefits GLM structures pro-
vide. These authors note that maximum likelihood estimation often relies on numerical
methods (Newton-Raphson) to estimate a complex model with many unknown parameters.
Problems often arise with the Newton-Raphson algorithm failing to converge when the start-
ing point is not close to the solution. This issue is particularly a problem when censoring
exists. In light of this problem, we propose an alternative to a likelihood-based approach.
By using GLMs, Wang and Kececioglu reduce the complexity of the system of equations to one
involving iteratively reweighted least squares, which does not require initial parameter guesses.
This approach is very effective, but it has yet to be applied in the planning stages. Given the
computational ease that the GLM construction offers, it seems logical to extend this line of
research to address the non-linear, constrained, and censored-data environment described
in the Introduction. The literature search completed thus far shows that this extension has
not been addressed, which motivates the research presented in Chapter 4.
CHAPTER 3
Optimal Designs for Constrained Regions
1. Motivation
Reliability engineers often postulate empirically derived accelerated life testing mod-
els to extrapolate expected lifetime performance of a component. While pragmatic, exper-
imentally derived estimation procedures often yield unacceptably large variances for model
parameters. Thus, correct determination of a “best” model among several competing op-
tions remains difficult. With the migration to environmentally friendly lead-free alloys for
microelectronic components, there has been a need to update current solder based fatigue
models. Unlike eutectic solder, decades of historical data do not exist for lead-free alloys
such as Sn-Ag-Cu (Clech, 2005). Moreover, microprocessor packaging has become increasingly
complex in its design and integration with the system. This, coupled with a
cost-conscious business environment, drives the need for efficient test plans that maximize
the information gained from a reliability test investment. Thus, accelerated life tests are
expected to provide conclusive evidence that a model is both sufficient and appropriate for its
intended application. Historically used test plans do not satisfy the requirements of today’s
competitive business environment; thus, a new approach is needed.
2. Problem statement
In evaluating the reliability performance of a solder interconnect there are three pri-
mary failure mechanisms of concern: tensile rupture through mechanical overloading, creep
failure through static loading, and low-cycle fatigue caused by cyclic thermo-mechanical
loading. It is the focus of this research to explore the cyclic fatigue case. Solder joint
fatigue is attributed primarily to stresses incurred due to mismatches in the coefficient of
thermal expansion (CTE). Although solder joints may exist in a near equilibrium state
during the elevated temperatures of assembly, they experience dynamic loading as the tem-
Fig. 1: Air-to-air temperature cycle chamber with three temperature zones
perature patterns fluctuate during field use. Furthermore, for microprocessor packaging
that involves hundreds of fine pitched joints, the stress state of the solder joint is location
dependent. Other influential factors include solder chemistry, standoff height, solder fillet
radius, and solder ball diameter. Estimating these interdependencies at a microscopic level
is often achieved through finite element analysis (FEA). However, we restrict our attention
to the macroscopic level and describe behavior through an empirically derived model generated
from accelerated life testing. Such tests are conducted in a thermal cycle chamber as
depicted by Fig. 1. In this chamber, test units are moved between a hot zone and cold zone
in a predetermined pattern to complete a thermal cycle.
To better understand this problem, we provide an overview of the evolving form of
solder joint fatigue models. Coffin (1954) and Manson (1953) were the first to empirically
describe cyclic thermal stresses in ductile metals. They observed that fatigue failure times
are expressed as an inverse power law function of the temperature amplitude, ΔT,

N_f = A(ΔT)^(−n)      (3.1)

where N_f is the number of cycles to failure and A is a constant. This work was noteworthy
in that it provided a simple means of correlating expected product lifetimes to an
acceleration test setting in a controlled environment. The first modification to the Coffin-
Manson equation was proposed by Norris and Landzberg (1969) and accounted for creep
and relaxation properties of solder,

N_f = A(ΔT)^(−n) (f)^m exp( Q / (k_B T_max) )      (3.2)

where f is the frequency of reversals within a fixed time, Q is the activation energy, and k_B
is Boltzmann's constant (8.617 × 10⁻⁵ eV/K). As temperature cycle and thermal shock equipment
vendors offered alternate heating and cooling media, it became necessary to specify the ramp
rates and soak periods. Sahasrabudhe et al. (2003) observed that dwell or soak time, t_soak,
is a better predictor of reliability performance than frequency of reversals and postulated
the following model,

N_f = A(ΔT)^(−n) (t_soak)^(−m) .      (3.3)
Over the last two decades, solder fatigue models have been postulated in many dif-
ferent forms (e.g., Engelmaier (1984), Darveaux (1997), Syed and Amkor (2001)); thus
complicating the selection of the "best" or "most appropriate" model. Although much
research has been devoted to model formulation and variance reduction for product lifetime
predictions in the field of reliability engineering (e.g., Escobar and Meeker (1995), Park
and Yum (1996), Tang et al. (2002), Zhao and Elsayed (2005)), relatively little has been
devoted to evaluating experimental designs for estimation and validation of acceleration
models. Legacy conditions associated with the use of standards-based testing still pervade the
industry despite the transition to use- or field-condition modeling (Blish et al. (1999),
Mencinger (2000)). Examining the design matrix of these legacy test conditions reveals
high degrees of co-linearity, thus inflating uncertainty about each parameter’s true value.
As a result, additional cost is incurred associated with larger sample size requirements, use
of higher reliability materials, or product redesign. Even worse is the loss in ability to verify
that the postulated model has been correctly formulated. Therefore, it is the purpose of
this research to illustrate the importance of experimental design in reliability test planning.
Proposed designs are constrained in a feasible design region that is defined by equipment,
material, and time-to-market considerations. We draw relative comparisons between the
influence of experimental design selection and traditional reliability test plan variables
such as sample size, censoring scheme, and sample allocation to a given stress level, with
the objective of minimizing the variance of parameter estimates for the failure
acceleration model.
In the following sections, we first present an industrial example in which the selection
of an acceleration model cannot be conclusively determined. Next, with a hypothesized
form of the model and its parameter values, we specify candidate experimental designs and
simulate failure data for test plans with different sample sizes, censoring methods, and test
unit allocations. Maximum likelihood estimation of the model parameters is then completed
and the results are compared for these test plans. Based on these findings, recommended
augmentations to traditional ALT test plans are provided to aid practitioners in model
selection.
3. Data collection
3.1. Acceleration model
We first select a modified form of a Norris-Landzberg model for modeling lead-free
solder joint fatigue,
N_f = A(ΔT)^(−a) (t_soak)^(−b) exp( c / (k_B T_max) ) .      (3.4)
The corresponding acceleration model becomes
AF = (ΔT_use / ΔT_stress)^(−a) (t_soak,use / t_soak,stress)^(−b) exp[ (c/k_B) (1/T_max,use − 1/T_max,stress) ]      (3.5)
where AF is the acceleration factor, and a, b, and c are the coefficients to be estimated. The
subscript “use” is used to distinguish expected field operating conditions whereas “stress”
denotes accelerated conditions conducted in a laboratory setting. The three regressor
variables are temperature amplitude (∆T ), soak time (tsoak), and maximum temperature
(Tmax). All three of these measurements are taken by thermocouple readings within the test
chamber over time. Temperature amplitude is defined as the temperature range associated
with one cycle period. The soak time is the length of time during the profile where the
temperature has reached a near saturation state. Physically, we are concerned with the
stress relaxation that occurs associated with the maximum temperature region, although
Fig. 2: Temperature cycle profile with respect to the model variables
the profile typically has symmetric loadings for hot and cold regions. Finally, the last mea-
surement is the maximum temperature obtained within a cycle. It is assumed that the
thermal profile is repeatable over time and that chamber drifting does not occur. Fig. 2
shows the temperature versus time profile of a typical temperature cycle system.
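The acceleration factor of Eq. (3.5) is straightforward to compute; the sketch below uses the hypothesized coefficient values adopted later in the simulation study (a = 1.80, b = 0.333, c = 0.122), and the stress condition shown is invented, so the specific numbers are illustrative only:

```python
import math

KB = 8.617e-5   # Boltzmann's constant, eV/K

def accel_factor(stress, use, a=1.80, b=0.333, c=0.122):
    """Acceleration factor of Eq. (3.5).  Conditions are dicts with
    dT (temperature amplitude, C), t_soak (min), T_max (Kelvin)."""
    return ((use["dT"] / stress["dT"]) ** (-a)
            * (use["t_soak"] / stress["t_soak"]) ** (-b)
            * math.exp((c / KB) * (1.0 / use["T_max"] - 1.0 / stress["T_max"])))

use = {"dT": 65.0, "t_soak": 10.0, "T_max": 85.0 + 273.15}
stress = {"dT": 100.0, "t_soak": 15.0, "T_max": 125.0 + 273.15}
af = accel_factor(stress, use)
print(af)   # greater than 1: the stress condition shortens the expected lifetime
```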
3.2. Irregular feasible design region
Next, a feasible design region is defined to reflect the testing constraints from equipment
limitations, material property dependencies, and time-to-market requirements. The
addition of these constraints is an important distinction from previous work in this
area and reflects many of the trade-off decisions facing engineers. Table 3 highlights the
relevant constraints considered.
For example, the test temperature is limited below 125◦C; otherwise artificial failure
modes that are not experienced in the field (e.g. melting) could be introduced. Similarly,
solder creep is a temperature dependent effect and the test chamber temperature must
be less than 50% of the melting temperature of solder in Kelvin. In addition to material
temperature limitations, the air-to-air temperature cycling chambers used for accelerated
Table 3: Feasible design region

Constraint  Description                              Variable  Minimum   Maximum
C1          time to market                           Tmax      80°C
C2          material property (creep, melting)       Tmax                125°C
C3          equipment capability (compressor)        Tmin      -55°C
C4          equipment capability (compressor)        Tmin                5°C
C5          time to market (when tsoak = 60 mins)    AF        4
C6          time to market (when tsoak = 10 mins)    AF        4
C7          material property (stress relaxation)    tsoak     10 mins   60 mins
Fig. 3: Feasible design region for ALT testing conditions: a 3-D view (left) and a 2-D view(right)
life testing have operating constraints. Finally, competitive pressures often dictate that
the profitability of a product only exists within a narrow window of time. Thus, time-to-
market considerations require tests to be reasonably accelerated. These constraints form
an irregular shaped region as shown in Fig. 3.
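These constraints are simple to encode programmatically; the following sketch is a hypothetical feasibility check using the limits of Table 3, with the acceleration factor passed in (e.g., computed from Eq. (3.5)):

```python
def feasible(T_min, T_max, t_soak, af):
    """True if a candidate test condition lies in the feasible region of
    Table 3.  T_min, T_max in Celsius, t_soak in minutes, af is the
    acceleration factor at the condition."""
    return (80.0 <= T_max <= 125.0        # C1, C2: time to market / material limits
            and -55.0 <= T_min <= 5.0     # C3, C4: compressor capability
            and 10.0 <= t_soak <= 60.0    # C7: stress relaxation window
            and af >= 4.0)                # C5, C6: sufficient acceleration

print(feasible(-25.0, 125.0, 15.0, 6.2))   # a point inside the region
print(feasible(-25.0, 140.0, 15.0, 6.2))   # T_max exceeds the 125 C limit
```

In a design search, such a predicate can filter a candidate grid before any optimality criterion is evaluated.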
3.3. Legacy experiments and preliminary data analysis
The legacy design considered here traces its origin back to military and industry
standards-based testing (JEDEC 2004, 2005). Fixed temperature cycle conditions were used
as a comparative benchmark of reliability performance across a wide range of products and
applications. As such, they were not defined with the intent of empirical model building.
This approach differs significantly from today's knowledge-based reliability assessments that
are tailored to customer usage behavior (e.g. Blish (1999), Mencinger (2000)). In
fact, qualitative inference is often made through lack of failures using legacy data, in contrast
to building a predictive model associated with the failure mechanism. Four legacy testing
conditions that have been used in the microelectronic industry are shown in Appendix Table
18.
While the collection of conditions shown in Table 18 allows for the estimation of
three model parameters, a, b and c, it is far from an efficient design. This, coupled with
the interval readouts and right censoring typically associated with reliability testing, greatly
reduces the precision of the parameter estimates. Table 4 provides test results from an ALT
plan that used the legacy test conditions. The first two columns give the start and end
cycle times. The remaining entries give the number of failures at each of the four stress
levels (L1-L4) within a cycle interval. The four asterisked numbers indicate that testing at
that stress level was censored at the start time of that row.
The likelihood function for interval and right-censored ALT data is given by

LIK = ∏_{i=1}^{k} [ F(t_i⁺) − F(t_i⁻) ]^{r_i} R(t_i⁻)^{s_i}      (3.6)
where i is the time interval, t−i and t+i are the start time and end time of the ith interval,
Table 4: Accelerated life test results from the legacy test conditions
Start Time   End Time   Failures (L1-L4)
0            750        -
750          1000       18*
1000         1250       -
1250         1500       3, 1
1500         1750       18*, 1, 5
1750         2000       2, 26
2000         2250       1, 18*
2250         2500       2, 2
2500         2750       14, 5
2750         3000       6, 5
3000         3250       14*
respectively, ri is the number of failed units and si is the number of right censored units.
Assuming a Weibull failure distribution, the failure and reliability functions are

F(t) = 1 − exp[ −(t/α)^β ]      (3.7)

and

R(t) = exp[ −(t/α)^β ]      (3.8)
where α is the characteristic life of the Weibull distribution, and it is related to the testing
condition by

log(α) = intercept − a log(ΔT) − b log(t_soak) + c / (k_B T_max) .      (3.9)
The above formula is a linear regression model of characteristic life with three regressors, but
it is unknown whether this model is the true acceleration model or if a reduced model with
fewer regressors is sufficient. Parameter estimation is carried out by the maximum likelihood
method, which maximizes the likelihood function with respect to model parameters, a, b,
c, intercept and the shape parameter β.
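A sketch of this likelihood for a single stress level, using hypothetical readout data and profiling only the characteristic life α over a grid (β fixed at 2 for brevity); all data values are invented for illustration:

```python
import math

def weibull_F(t, alpha, beta):
    """Weibull failure function of Eq. (3.7)."""
    return 1.0 - math.exp(-((t / alpha) ** beta))

def log_lik(intervals, alpha, beta):
    """Log of Eq. (3.6): `intervals` holds (t_lo, t_hi, r, s) tuples with
    r failures in (t_lo, t_hi] and s units right-censored at t_lo."""
    ll = 0.0
    for t_lo, t_hi, r, s in intervals:
        if r:
            ll += r * math.log(weibull_F(t_hi, alpha, beta)
                               - weibull_F(t_lo, alpha, beta))
        if s:
            ll += s * math.log(1.0 - weibull_F(t_lo, alpha, beta))
    return ll

# hypothetical readouts: 16 failures over three intervals, 34 censored at 750
data = [(0, 250, 2, 0), (250, 500, 5, 0), (500, 750, 9, 0), (750, 750, 0, 34)]
alpha_hat = max(range(500, 3000, 25), key=lambda a: log_lik(data, a, 2.0))
print(alpha_hat)   # grid maximum-likelihood value of the characteristic life
```

In practice the full likelihood is maximized jointly over all parameters; the grid profile above only illustrates how the interval and censoring terms enter.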
Table 5: Regression table for the full model
Predictor   Coefficient   Error   Z       P       95% CI
Intercept   -9.89         17.05   -0.58   0.562   (-43.30, 23.53)
a           -1.84         2.37    -0.77   0.439   (-6.49, 2.82)
b           0.49          0.37    1.32    0.187   (-0.24, 1.22)
c           0.33          0.21    1.56    0.119   (-0.08, 0.74)
Table 5 shows that none of the three regressors is significant when the full model is fitted.
However, when we reduce the model to involve only two regressors, all three reduced models
are found to be significant as shown in Tables 6, 7, and 8.
Table 6: Regression table for the reduced model without regressor Tmax

Predictor   Coefficient   Error   Z       P       95% CI
Intercept   16.65         0.65    25.79   0.000   (15.39, 17.92)
a           1.86          0.12    15.70   0.000   (1.62, 2.09)
b           -0.09         0.03    -2.65   0.008   (-0.16, -0.02)
Table 7: Regression table for the reduced model without regressor tsoak

Predictor   Coefficient   Error   Z       P       95% CI
Intercept   12.36         1.99    6.21    0.000   (8.46, 16.27)
a           1.26          0.29    4.31    0.000   (0.69, 1.83)
c           0.05          0.02    2.80    0.005   (0.02, 0.09)
This makes model selection difficult, particularly when engineering or professional
experience suggests an alternative model form. A straightforward remedy is to increase the
sample size and extend the observation time to allow more failures to be observed, but this carries a
Table 8: Regression table for the reduced model without regressor ∆T
Predictor   Coefficient   Error   Z       P       95% CI
Intercept   3.30          0.22    15.10   0.000   (2.87, 3.73)
b           0.20          0.04    4.64    0.000   (0.12, 0.29)
c           0.17          0.01    16.70   0.000   (0.15, 0.19)
heavy cost penalty, particularly in reliability testing. Based on this end result, a retrospective
inquiry would ask whether the experiment could have been executed in a way that increases the
likelihood of revealing the "correct" model during the first set of experiments. The answer
lies in selecting an experimental design better suited to the feasible region.
4. Analysis and interpretation
4.1. Experimental designs and simulation study
To demonstrate the influence of experimental design selection on an ALT analysis,
we perform a series of simulations in which random failure times are generated based on
a set of assumed testing conditions and hypothesized parameter values. The use condition
is defined to be ΔT = 65°C, t_soak = 10 minutes, and T_max = 85°C, and the distribution
of lifetimes follows a Weibull distribution with a characteristic life of 10,000 cycles and a
shape parameter of 2. In addition, the coefficients a, b and c are 1.80, 0.333 and 0.122,
respectively. These values are selected by a combined consideration of both the estimated
values from the legacy experiments and values appearing in the engineering literature (Pan
et al. (2005)).
Most reliability test plans discussed in the literature focus on reliability prediction
at the use stress level (e.g., Park and Yum (1996), Tang et al. (1999a), Tang and Yang
(2002), and Onar and Padgett (2003)). Based on this objective and with a linear life-stress
model, only tests on the highest and lowest possible acceleration levels are needed, and the
allocation of test units on these levels needs to be optimized. The current problem, however,
requires the exploration of an irregularly shaped experimental design region for model
estimation, so using only two experimental points is not effective. Three experimental
designs are considered: a non-statistically based design based on legacy conditions,
an orthogonal design, and a design based on the D-optimality criterion. The D-efficiencies
of these three designs are <1%, 38%, and 74%, respectively. In all cases, experimental
designs are restricted to the minimum number of test conditions (four) sufficient for fitting
Equation (3.9). Many companies adopt legacy or standards-based designs because they
are easy to set up and their simplicity allows higher test chamber utilization. In contrast,
orthogonal designs have deeply rooted statistical properties. The orthogonal design selected
here is a Resolution III 2^(3−1) fractional factorial design. Since second-order effects are
assumed negligible, all main effects can be estimated independently of one another. Finally,
the D-optimal based design is formed by considering the D-optimal criterion, which seeks
to maximize the determinant of the Fisher information matrix and thus minimize variances
of parameter estimates. These designs are depicted as two-dimensional projections in Fig.
4. The magnitude of the soak time is depicted as a circle for 10 minutes, a triangle for 15
minutes, a trapezoid for 30 minutes, and a square for 60 minutes. Set points used for each
test condition are summarized in the Appendix A Tables 18-20.
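Relative D-efficiency between two candidate designs can be computed directly from the model matrix implied by Eq. (3.9); the set points below are invented for illustration and are not the dissertation's actual legacy or D-optimal conditions:

```python
import numpy as np

KB = 8.617e-5   # Boltzmann's constant, eV/K

def model_matrix(conditions):
    """Rows of (dT in C, t_soak in min, T_max in K) mapped to the
    regressors of Eq. (3.9): [1, log(dT), log(t_soak), 1/(kB*T_max)]."""
    return np.array([[1.0, np.log(dt), np.log(ts), 1.0 / (KB * tm)]
                     for dt, ts, tm in conditions])

def d_efficiency(X_a, X_b):
    """Relative D-efficiency (|X_a'X_a| / |X_b'X_b|)^(1/p)."""
    p = X_a.shape[1]
    return (np.linalg.det(X_a.T @ X_a) / np.linalg.det(X_b.T @ X_b)) ** (1.0 / p)

# nearly collinear set points versus well-spread ones (both hypothetical)
collinear = model_matrix([(100, 10, 373), (110, 10, 378), (120, 15, 388), (125, 15, 398)])
spread = model_matrix([(70, 10, 358), (125, 10, 398), (70, 60, 398), (125, 60, 358)])
print(d_efficiency(spread, collinear))   # far greater than 1
```

The near-collinearity of the first design collapses the information determinant, which is the same mechanism that inflates the legacy design's parameter variances.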
First we use Equation (3.5) to calculate the acceleration factor at each designed
experimental point using the hypothesized model parameter values. Assuming that the
failure times are linearly accelerated at higher stress levels, the shape parameter of failure
distribution at the stress testing condition remains the same as that at the use condition,
but the scale parameter changes according to the acceleration factor, i.e., βstress,k = 2
and αstress,k = 10,000AFk
, for k = 1, 2, 3, 4. Failure times are simulated according to the Weibull
Fig. 4: Legacy (top), orthogonal (middle), and D-optimal designs (bottom)
failure distribution at each experimental point. We vary the sample size from 100 to 800 and
consider complete failure data, 60% right censored data, 30% right censored data, and 30%
interval censored data. The right censoring is a failure-based (Type-II) censoring scheme,
where the experiments are terminated after a certain number of failures are observed. For
example, if 100 test units are placed in the testing chamber, with 30% right censoring
we stop the experiments after 30 test units have failed. The interval censoring scheme
provides readouts every 250 cycles up to 30% failure. We allocate an equal number of test
units at each experimental test condition, because it has been shown that for complete and
right censored data, equal sample allocation is the best strategy to minimize the estimation
variation of model parameters (Ng et al. (2007), Guo and Pan (2007)). We repeat the
simulation 200 times at each experimental point and calculate the means and standard
errors of the maximum likelihood estimation of regressor coefficients a, b and c.
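The simulation of one stress condition can be sketched as follows (a failure-based Type-II scheme that stops at the r-th ordered failure; the AF value is illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)

def simulate_type2(alpha, beta, n, fail_frac):
    """Draw n Weibull(alpha, beta) lifetimes and apply failure-based
    (Type-II) censoring: testing stops at the r-th ordered failure."""
    t = np.sort(alpha * rng.weibull(beta, size=n))
    r = int(np.ceil(fail_frac * n))
    failures = t[:r]          # observed, ordered failure times
    t_censor = t[r - 1]       # remaining units censored at the r-th failure
    return failures, t_censor, n - r

# one condition with AF = 4: characteristic life 10,000 / 4 = 2,500 cycles
failures, t_c, n_cens = simulate_type2(alpha=10000 / 4, beta=2.0, n=100, fail_frac=0.30)
print(len(failures), n_cens)   # 30 failures observed, 70 units censored
```

Repeating such draws over all design points and feeding the results to the maximum likelihood routine reproduces the structure of the study described above.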
4.2. Effects of experimental design, sample size, and censoring
The numerical simulation results are presented in Tables 21-23 in Appendix A.
Based on these values, we plot the estimation of parameter a in different experimental
settings in Fig. 5. Here the average of estimated values and the 95% confidence intervals
are given and compared with the true value. For all cases, the parameter estimates are close
to the true value, which is due to MLEs being asymptotically unbiased for large sample sizes. However,
the confidence interval varies dramatically depending on the sample size, censoring scheme
and experimental design.
It is not surprising to see that by increasing sample size and by increasing failure
time information (reducing censoring), we improve the parameter estimation in terms of
narrowing its confidence interval. However, the magnitude of improvement is relatively
Fig. 5: Estimation of parameter a by numerical simulation using legacy (upper), orthogonal(middle), and D-optimal designs (lower).
small when compared with the reduction incurred by selecting the “right” experimental
design. For example, a D-optimal design achieves equivalent precision for parameter a
using one-third of the test units required for an orthogonal design. Similarly, a D-optimal
design needs only one-eighth of the test units used in a legacy design. The benefit can also be
expressed in terms of improved precision for a fixed sample size. From this perspective, the
standard errors of D-optimal designs are 1.75 and 3.5 times smaller than those of the orthogonal
and legacy designs, respectively. Similar trends are also observed for parameters b and c. Furthermore,
it is clear that the legacy design is poorly suited to estimate parameters b and c. Examining
the b and c columns of Table 21, we observe that standard deviation values are of the same
magnitude as the parameter estimates when censoring approaches 30%. This implies that
neither b nor c is distinguishable from zero (i.e., statistically insignificant). This mimics
the situation observed in the legacy design example. Examining Tables 22 and 23, we
observe that the orthogonal and D-optimal designs do not have this concern. Therefore,
selecting a good experimental design, especially when a design criterion is optimized prior
to initiating tests, is much more desirable than increasing sample size or generating more
failure data. Unfortunately, the selection of optimal experimental designs for reliability
testing applications has not been properly emphasized in many industrial practices as we
have shown by the legacy design example.
4.3. Alternative test plan with Type-I censoring
Often, to prepare a reliability test, it is preferred to have a fixed testing time instead
of fixed number of failures, so the total testing period is controlled in advance. For those test
units that do not fail during the testing period, they are Type-I censored. To accommodate
this need, we design an alternative test plan with Type-I censoring that has comparable
performance to a Type-II censoring plan. We allocate the number of test units unevenly
on each experimental point such that roughly the same number of failures occur at all
experimental conditions within the fixed testing period. The sample allocation scheme is
found by solving the following system of equations:

F_k(t_c) = 1 - \exp\left[-\left(\frac{t_c \cdot AF_k}{\alpha}\right)^{\beta}\right], (3.10)

\sum_k F_k(t_c)\, n_k = qN, (3.11)

and

\sum_k n_k = N, (3.12)

where n_k is the number of test units allocated to test condition k, N is the total number of
units on test, F_k(t_c) is the Weibull failure probability at condition k by the censoring time,
AF_k is the acceleration factor at condition k, \alpha and \beta are the Weibull scale and shape
parameters, t_c is the Type-I censoring time, and q is the percentage of units failed in the
Type-II censored test.
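The system above can be solved numerically. The following Python sketch is ours, not the dissertation's; the acceleration factors AF_k, the Weibull parameters, and the reading of Eq. (3.11) as an equal split of the qN failures across the K conditions (so that roughly the same number of failures occurs everywhere) are illustrative assumptions. It finds t_c by geometric bisection and then the allocations n_k:

```python
import math

def weibull_fraction_failed(tc, af, alpha, beta):
    # Eq. (3.10): expected fraction failed by the censoring time tc at a
    # condition with acceleration factor af.
    return 1.0 - math.exp(-((tc * af / alpha) ** beta))

def allocate_type1(afs, alpha, beta, N, q):
    # Eqs. (3.10)-(3.12): choose tc and n_k so that each of the K conditions
    # contributes q*N/K expected failures while the allocations sum to N.
    K = len(afs)
    target = q * N / K  # expected failures wanted at each condition

    def total_units(tc):
        # Units needed at time tc: n_k = target / F_k(tc); decreasing in tc.
        return sum(target / weibull_fraction_failed(tc, af, alpha, beta)
                   for af in afs)

    # Geometric bisection on tc, since tc can span orders of magnitude.
    lo, hi = 1e-6, 1e6
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if total_units(mid) > N:
            lo = mid  # need a longer test to produce enough failures
        else:
            hi = mid
    tc = math.sqrt(lo * hi)
    n = [target / weibull_fraction_failed(tc, af, alpha, beta) for af in afs]
    return tc, n
```

The resulting n_k are fractional; in practice they would be rounded while keeping the total at N.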
We execute simulations for the alternative Type-I censoring of the 30% right censored
D-optimal experiments. The parameter estimations are depicted in Fig. 6. One can see
that for all three parameters the performance of the Type-I censoring plan is close to that of
the Type-II right censoring plan.
5. Conclusions
In this chapter, we have exemplified experimental design selection as a critical step
in reliability test-plan development. This is motivated by a field-case example in which
competing alternative forms of the reliability model were equally viable
based on their log-likelihoods. Numerical simulations demonstrated that the importance of
[Figure: boxplots of the estimates of parameters a, b, and c versus total sample size (100, 200, 400, 800) within censoring type (Type I, Type II).]
Fig. 6: Estimation of parameters a (top), b (middle) and c (bottom) using Type-I censoring compared with Type-II censoring.
experimental design selection can exceed that of sample size determination or of the choice
of test-unit allocation strategy when the goal is to build the product-failure acceleration
model, which differs from the typical reliability-prediction purpose of ALTs. Proper selection
of experimental designs offers the practitioner two options in which to realize benefits. The
first option is a sample size reduction for a fixed level of precision. In this study, we observed
an eight-fold reduction from the legacy design. The benefit can alternatively be realized as
an increase in precision for a fixed sample size. With this objective in mind, the D-optimal
design achieved a 3.5× reduction in the width of the confidence interval about the estimates
for a, b, and c. The increased precision ensured that parameters b and c could be determined
and included in the model whereas they could not in the legacy design. Avoiding Type II
errors (omission of significant factors) is particularly important as incorrect decisions made
during the empirical model-building stage are amplified in ALT because the regions of typical
interest (use conditions) are often extrapolations well outside the design region. Recognizing
that knowledge of model form is not always known a priori, we have also illustrated that
similar, but more modest, benefits also exist for the orthogonal design. In some cases, the
robustness of orthogonal designs may supersede selection of a D-optimal design, but that
should not diminish the importance of experimental design considerations presented here.
6. Summary
As the electronic industry has adopted lead-free solder joint technology, the failure
acceleration model of solder joint fatigue needs to be re-investigated through accelerated
life tests (ALT). However, traditional ALT designs currently employed by the industry fail
to provide sufficient information for this purpose. This chapter explores a criterion for effec-
tively planning ALT. It addresses the problem by hypothesizing a model form, deriving the
model parameters using real data, and simulating ALT results for different experimental
design plans. Relative comparisons are drawn between the influence of experimental design
selection and some traditional reliability test plan variables such as sample size, censoring
scheme, and sample allocation. We find that the importance of selecting an experimental
design properly exceeds the importance usually given to the selection of sample size and
censoring scheme, or to the choice of a test unit allocation strategy in the context of a
reliability test case study. D-optimal based designs provide good performance for accelera-
tion model validation, especially when the feasible testing region is asymmetric. D-optimal
designs are shown to offer an eight-fold reduction in sample size over legacy designs for a fixed
precision. Alternatively, D-optimal designs yielded a 3.5× increase in parameter estimate
precision over legacy designs for a fixed sample size. Numerical simulations confirmed that
legacy designs provide insufficient precision to determine if two of three model parameters
are significant.
CHAPTER 4
A GLM Approach to Designing Accelerated Life Tests
1. Introduction
1.1. Motivation
In an increasingly competitive market, the time dimension of developing and evaluating
product quality cannot be ignored. A manufacturer’s viability and level of profitability
is often dictated by its ability to produce highly reliable products and introduce them into
the market prior to their competitors; thus, effective and efficient tools for assessing relia-
bility are needed at the product development stage. Statistical techniques associated with
design of experiments (DOE) are well accepted by design and quality engineers; however,
the application of DOE theories and techniques for reliability improvement still remains
in its infancy. Traditional reliability test methods, such as accelerated life tests (ALT),
typically only consider the effects of environmental and operational conditions on product
lifetime. As such, these tests are conducted mainly for the purpose of reliability verification.
This is in sharp contrast to reliability tests that optimize the information gained toward
an experimenter’s objective. Examining the literature, we find few publications that incor-
porate optimal DOE concepts into the planning phase of ALTs. This may be partly due
to the theoretical and computational difficulties uniquely associated with reliability testing.
These challenges have inhibited the straightforward application of traditional DOE tools,
and thus, hampered their broader adoption. In particular, these difficulties arise from:
• Responses that are failure times or failure counts that cannot be sensibly modeled
with a normal distribution;
• Censoring which causes extremely scarce and imbalanced experimental outcomes;
• Design, cost, and time constraints which restrict the feasible design region;
• Increased uncertainty in model estimation and prediction due to ALT extrapolations
to the intended use condition.
1.2. Problem statement
As described earlier, practitioners must carefully balance time-to-market require-
ments against the loss of information associated with censoring in order to yield an efficient
ALT experimental design. Often the objective involves minimizing the uncertainty in a
parameter or prediction estimate. To determine the experimental design to achieve such an
objective, it is necessary to understand the relationship between the testing variables and
the response. In a multi-variable problem, these statistical relationships are contained in
a covariance matrix. The inverse of the covariance matrix is called the information matrix
and is often used to assess the goodness of an experimental design as measured by an opti-
mality criterion. The maximum likelihood method is a prominent estimation method. For
reliability applications, the likelihood function includes contributions from both failure and
survival units - leading to more complex factor dependencies for the reliability practitioner
to consider. With these challenges in mind, this chapter introduces an alternate method
for computing the maximum likelihood estimators (and thus the information matrix) using
a Generalized Linear Model (GLM) formulation.
GLMs provide a unifying structure for analyzing linear and non-linear regression
models. For details pertaining to the construction of GLMs, one may refer to McCullagh
and Nelder (1989), Myers, Montgomery, and Vining (2002), and Dobson (2002). Aitkin
and Clayton (1980) and Whitehead (1980) applied GLMs to survival data analysis. For
reliability testing data analysis, Barbosa and Louzada-Neto (1994) examined ALTs with
interval censoring. Wang and Kececioglu (2000) highlighted the model estimation benefits
that GLM structures provide. These authors note that maximum likelihood estimation
often relies on numerical methods such as Newton-Raphson to estimate a complex model
with many unknown parameters. Thus, convergence problems may arise when the starting
point is not close to the solution. This issue is particularly a problem when censoring
exists. In contrast, the iteratively weighted least squares (IWLS) approach employed in
GLMs typically provides stable solutions toward parameter estimation. By reformulating
the problem in terms of a GLM, we avoid the necessity of complex numerical methods to
find the corresponding solution of a system of equations involving second partial derivatives
of non-linear functions, as required by the maximum likelihood estimation method.
Although yielding equivalent results, the GLM-based computational approach offers
several benefits. The first is the ability to solve these complex systems of equations through
the method of iteratively weighted least squares (IWLS). This simplification allows the
practitioner to directly evaluate levels of sensitivity that exist between the definable test
condition factors against the system response during the planning stage. This knowledge is
important in that it further enhances a practitioner’s ability to recognize the most influential
factors and their relative sensitivity. Second, the standard algorithms for GLM optimal
designs extend easily to handle two or more factors and their interactions. In doing so, we
are able to explore more complicated reliability regression models.
1.3. Scope of work
In this chapter, we focus on alphabetic-optimal designs. The strengths of an
optimality-criterion-based design, as compared to a standard design such as a factorial or
fractional factorial design, are two-fold: it can accommodate an arbitrary experimental
design region and an arbitrary number of experimental points. These designs have become
increasingly popular as their algorithms have been integrated into commercial software.
We acknowledge that these designs have well documented limitations, most notably the
sensitivity to the model parameter assumptions and limitations in checking for model lack
of fit, see Ozol-Godfrey, Anderson-Cook, and Robinson (2007) for more details. Zhang
and Meeker (2006) have discussed Bayesian methods to mitigate such concerns and can be
applied in conjunction with the general strategy outlined here.
To begin, we formulate a GLM structure for failure times with an exponential distri-
bution and two design factors, and obtain the optimal ALT design based on this formulation.
We consider two optimality criteria – one for model parameter estimation and another for
model prediction. We compare our results with existing methods such as Meeker and Es-
cobar’s degenerate and degenerate split designs (1998, Chapter 20). Some other associated
problems, such as the sensitivity of optimal design to parameter estimates and the model
parameter dependency problem, are addressed as future research.
The first criterion provides optimal designs for parameter estimation using a maxi-
mum determinant-based objective function (D-optimality) as shown by
\max_{\forall x \in R} D = \left| \frac{X'WX}{n} \right| (4.1)
where X is the design matrix comprised of test points within a feasible region R, W is the
weight matrix associated with iteratively weighted least squares (IWLS), and n is the sam-
ple size. The second criterion provides experimental designs that minimize the prediction
variance (U-optimality) at a prescribed location. The prediction variance at the intended
use condition is written as
\min_{\forall x \in R} U = x_{use}^{(m)\prime} \cdot (X'WX)^{-1} \cdot x_{use}^{(m)} (4.2)

where X is the design matrix, W is the weight matrix, and x_{use}^{(m)} is the location of the use
condition (outside the feasible region R) for a given model form m. The computation of
this prediction variance is consistent with a particular instance of the c-optimality criterion
defined by Chaloner and Verdinelli (1995). However, it is important to note that acceler-
ated life test predictions are extrapolations as the operating level typically lies outside the
experimental design region. In addition, the use condition may not be restricted to a sin-
gle point. For example, as described in Monroe and Pan (2009), the ambient temperature
and humidity over time can be characterized as a bivariate distribution, so the operational
stresses experienced by an outdoor device are random variables, instead of constants. Fig.
7 illustrates the concept of the separation of the experimental test region and the use region. To
emphasize these distinct features of ALTs and their corresponding predictions, we desig-
nate U-optimality as the criterion for minimizing the prediction variance for any set of use
conditions that lie outside the feasible design region (R). In this research, we restrict our
analysis to only investigate the scenario where the use condition is constant.
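To make the two criteria concrete, they can be evaluated in a few lines. The Python/NumPy sketch below is illustrative only (it is not the SAS implementation discussed later); it takes a candidate design matrix X with an intercept column and an IWLS weight vector w:

```python
import numpy as np

def d_criterion(X, w):
    # Eq. (4.1): determinant of the scaled information matrix |X'WX / n|.
    n = X.shape[0]
    return float(np.linalg.det(X.T @ np.diag(w) @ X / n))

def u_criterion(X, w, x_use):
    # Eq. (4.2): prediction variance x_use' (X'WX)^{-1} x_use at a use condition.
    M = np.linalg.inv(X.T @ np.diag(w) @ X)
    return float(x_use @ M @ x_use)
```

Note that a design maximizing `d_criterion` need not minimize `u_criterion`: the use condition x_use typically lies outside the region spanned by the rows of X, which is exactly the ALT extrapolation issue discussed above.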
The remainder of this chapter is outlined as follows. Section 2 provides the theoret-
ical development associated with the proposed method. Starting with a likelihood function
for reliability, we derive an equivalent GLM formulation and apply an iteratively weighted
least squares approach to find MLE estimators. Section 3 discusses the computational meth-
ods applied using the SAS software. In Section 4, a two-factor acceleration model is used to
illustrate the results for both D-optimality and U-optimality criteria, and to compare these
designs to other ALT plans. This analysis is followed by a second example that expands
the model to include a two-factor interaction term. Finally, we summarize our results and
conclusions.

Fig. 7: Normalized unit square feasible design region
2. Methodology
In this section we briefly summarize the existing techniques and then provide a
detailed overview of the technique being proposed. Through this discussion, we reveal the
features that allow the adoption of this method across a broader and more complex set of
regression models.
2.1. Maximum likelihood estimation approach
Nelson (1990) and others have previously outlined an approach for computing the
covariance matrix. This procedure is based on a large sample size assumption and involves
using maximum likelihood estimation (MLE) techniques to approximate the asymptotic
variance. For reliability testing involving censoring, the likelihood function is
L = \prod_i [f(t_i, x_i)]^{c_i} \, [S(t_i, x_i)]^{1-c_i}, (4.3)

where f(t_i, x_i) is the failure density function, S(t_i, x_i) is the survival function, and c_i
is an indicator variable equal to 1 if unit i failed and 0 if it was censored. Applying the
expectation to the negative second partial derivatives of the log-likelihood function yields
the Fisher information matrix, I_\beta, as given by

I_\beta = E\left[-\frac{\partial^2 \ell(\beta)}{\partial\beta\, \partial\beta'}\right]. (4.4)
Computing the inverse of the Fisher information matrix yields the covariance matrix as
shown by
\Sigma_\beta = I_\beta^{-1}. (4.5)
Due to the complexity of the second partial derivatives, this estimation approach relies
on numerical methods. Unfortunately, these numerical methods offer few insights into the
relative importance of the experimental design matrix. For functions involving multiple
parameters, g = g(\beta), we can compute the asymptotic variance as

\mathrm{Avar}(g) = \left[\frac{\partial g(\beta)}{\partial \beta}\right]' \Sigma_\beta \left[\frac{\partial g(\beta)}{\partial \beta}\right]. (4.6)
Since reliability times are strictly positive, we use an alternate form of the delta method
approximation in which

\log[g(\beta)] \sim N\big[\log[g(\beta)],\ \mathrm{Ase}[\log(g)]\big], (4.7)

where \mathrm{Ase}[\log(g)] = (1/g)^2 \, \mathrm{Avar}(g). Recall that the prediction variance is given by
\mathrm{Var}[y(x)] = x^{(m)\prime} (X'WX)^{-1} x^{(m)} \sigma^2, (4.8)

where x^{(m)\prime} is a function of the location of the design variables and the model form. For a
first-order model with a two-way interaction, we designate this as

x^{(1)\prime} = [1,\ x_1,\ x_2,\ x_1 x_2], (4.9)

where m = 1 is used to designate a first-order model.
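A tiny numerical illustration of this delta-method step may help; this is our own sketch, and the gradient and covariance values below are made up for illustration:

```python
import numpy as np

def avar_log_g(grad_g, Sigma, g_val):
    # Eq. (4.6): Avar(g) = [dg/dbeta]' Sigma [dg/dbeta], followed by the
    # log-scale quantity of Eq. (4.7): Ase[log(g)] = (1/g)^2 * Avar(g).
    avar_g = float(grad_g @ Sigma @ grad_g)
    return avar_g / g_val ** 2
```

For example, if g(\beta) = e^{\beta_1}, then \partial g/\partial\beta_1 = g, and the log-scale variance collapses to Var(\hat\beta_1), which is what one expects for a strictly positive, log-transformed quantity such as a lifetime.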
2.2. Generalized linear model approach to failure time analysis
To better understand the underlying structure of optimal experimental designs with
potential data censoring, we reformulate the problem in terms of a generalized linear model.
Recall that a linear model is typically expressed in the form

E(Y_i) = \mu_i = x_i'\beta, \qquad Y_i \sim N(\mu_i, \sigma^2), (4.10)
where x1, x2, . . . , xp are covariates and the random variables Yi are assumed to be indepen-
dent, normally distributed with mean value, µi, and with constant variance, σ2. In the cases
where the random variables involve count data, binary responses, or other non-normally dis-
tributed data, linear models are inadequate. However, GLMs allow the structure of linear
model fitting to be applied in more general situations. Nelder and Wedderburn (1972) de-
veloped the GLM technique that unifies the linear, logistic, and Poisson regression under a
single framework. A GLM consists of three elements:
1. a response variable distribution function from the exponential family;
2. a linear predictor \eta_i = x_i'\beta, where x_i is a p×1 vector of covariates and \beta is a p×1 vector of parameters; and
3. a link function such that ηi = g(µi).
Aitkin and Clayton (1980) and Whitehead (1980) demonstrate how this technique
is used for analyzing survival data with covariates, which are modeled by the Cox’s propor-
tional hazard (PH) model. The proportional hazard function h(ti, xi) is written as
h(t_i, x_i) = \lambda(t_i) \cdot \exp(x_i'\beta), (4.11)

where \lambda(t_i) is the baseline hazard, the dependence on the covariates x_i enters through the
linear predictor \eta_i = x_i'\beta, and the exponential function is used to ensure the hazard value is strictly positive.
Following the notation of McCullagh and Nelder (1989, Chapter 9), let the obser-
vation from a life test be a pair (ti, ci), where ti is the failure or survival time and ci is
an indicator variable, which is 1 if a failure time is observed and 0 if the observed unit is
censored. For simplicity, we assume the failure time to have an exponential distribution, so
the hazard function, conditioned on a set of fixed covariates, is a constant, or λ(ti) = λ0.
It is worth noting that for the exponential failure time distribution the proportional hazard
model is equivalent to the accelerated fail time model, which is often assumed in reliability
engineering applications (Lawless, 2003). Note that this equivalency is not true for other
distributions, such as the Weibull distribution. The cumulative hazard H(ti, xi) function is
then defined as

H(t_i, x_i) = \Lambda(t_i) \cdot e^{x_i'\beta}, (4.12)

where

\Lambda(t_i) = \int_{-\infty}^{t_i} \lambda(u)\, du. (4.13)
Recalling the logarithmic relationship between the survival function and the negative
cumulative hazard function, we express the survival function, S(ti, xi), as
S(t_i, x_i) = e^{-H(t_i, x_i)} = \exp\big[-\Lambda(t_i) \cdot e^{x_i'\beta}\big], (4.14)
and the failure function, F (ti, xi), as
F(t_i, x_i) = 1 - S(t_i, x_i) = 1 - e^{-H(t_i, x_i)}. (4.15)
It is easy to derive the failure density function, f(t_i, x_i), by taking a partial derivative, as
shown by

f(t_i, x_i) = \frac{\partial}{\partial t}\big[F(t_i, x_i)\big] = \lambda(t_i) \cdot \exp\big[x_i'\beta - \Lambda(t_i) \cdot e^{x_i'\beta}\big]. (4.16)
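As a quick numerical sanity check on Eqs. (4.14) and (4.16), the density should equal the negative time-derivative of the survival function. The sketch below is our own, specialized to the exponential baseline \lambda(t) = \lambda_0 (so \Lambda(t) = \lambda_0 t); the parameter values are arbitrary:

```python
import math

def surv(t, lam0, eta):
    # Eq. (4.14) with Lambda(t) = lam0 * t and eta = x'beta.
    return math.exp(-lam0 * t * math.exp(eta))

def dens(t, lam0, eta):
    # Eq. (4.16): f = lam0 * exp(eta) * exp(-lam0 * t * exp(eta)).
    return lam0 * math.exp(eta) * surv(t, lam0, eta)

# Central-difference check that f(t) matches -dS/dt at an arbitrary point.
t, lam0, eta, h = 2.0, 0.3, 0.5, 1e-6
numeric = -(surv(t + h, lam0, eta) - surv(t - h, lam0, eta)) / (2 * h)
```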
Then, we find the likelihood function as the product of all failure time and censored obser-
vations,
L = \prod_i [f(t_i, x_i)]^{c_i} \, [S(t_i, x_i)]^{1-c_i}, (4.17)

or the log-likelihood function,

\ell = \log(L) = \sum_i \big[c_i \cdot \log f(t_i, x_i)\big] + \sum_i \big[(1 - c_i) \cdot \log S(t_i, x_i)\big]. (4.18)
Computing the natural logarithm of Equations (4.16) and (4.14) yields
\log f(t_i, x_i) = \log \lambda(t_i) + x_i'\beta - \Lambda(t_i) \cdot e^{x_i'\beta} (4.19)

and

\log S(t_i, x_i) = -\Lambda(t_i) \cdot e^{x_i'\beta}. (4.20)
Substituting these results into Equation (4.18) and simplifying provides the following form,
\ell = \sum_i \big[c_i \big(\log \lambda(t_i) + x_i'\beta\big)\big] + \sum_i \big[-\Lambda(t_i) \cdot e^{x_i'\beta}\big]. (4.21)
This can be rewritten in a more convenient form by adding and subtracting a constant as
shown by
\ell = \sum_i \big[c_i \big(\log \lambda(t_i) + x_i'\beta\big)\big] + \sum_i \big[-\Lambda(t_i) \, e^{x_i'\beta}\big] + \sum_i \big[c_i \log \Lambda(t_i) - c_i \log \Lambda(t_i)\big]. (4.22)
Exchanging the elements from the first and third terms of Equation (4.22) and regrouping
yields
\ell = \sum_i \Big[c_i \big(\log \Lambda(t_i) + x_i'\beta\big) - \Lambda(t_i) \cdot e^{x_i'\beta}\Big] + \sum_i \Big[c_i \log\Big(\frac{\lambda(t_i)}{\Lambda(t_i)}\Big)\Big]. (4.23)
Defining \mu_i = H(t_i, x_i) = \Lambda(t_i) \cdot e^{x_i'\beta} and letting \lambda(t_i) = \lambda_0 for an exponential distribution, we
find that

\ell = \sum_i \big[c_i \log \mu_i - \mu_i\big] + \sum_i \Big[c_i \cdot \log\Big(\frac{1}{t_i}\Big)\Big]. (4.24)
The first term of the above function is the kernel of the log-likelihood function for a
Poisson variable with mean µi. In addition, since the second term does not depend on the
unknown β’s, it can be dropped from the maximization equation. Therefore, we can write
this as a generalized linear model strictly in terms of a Poisson distributed variable, such as
p_i \sim \mathrm{Poisson}(\mu_i) (4.25)

and

\log \mu_i = \log \lambda_0 + x_i'\beta + \log t_i, (4.26)

where the last term in the log link function is an offset term.
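This Poisson GLM can be fit with a few lines of IWLS. The sketch below is an illustrative NumPy re-implementation, not the dissertation's SAS program; it treats the censoring indicator c_i as the response and log t_i as the offset:

```python
import numpy as np

def iwls_poisson(X, c, log_t, n_iter=50):
    # IWLS for the Poisson GLM of Eqs. (4.25)-(4.26): response c_i
    # (1 = failure, 0 = censored), log link, offset log(t_i).
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta + log_t               # linear predictor plus offset
        mu = np.exp(eta)                     # Poisson mean; also the IWLS weight
        z = (eta - log_t) + (c - mu) / mu    # working response (offset removed)
        XtW = X.T * mu                       # X'W with W = diag(mu)
        beta = np.linalg.solve(XtW @ X, XtW @ z)
    mu = np.exp(X @ beta + log_t)
    cov = np.linalg.inv((X.T * mu) @ X)      # (X'WX)^{-1}, cf. Eq. (4.27), phi = 1
    return beta, cov
```

As a check, with an intercept-only model the fit recovers the closed-form exponential MLE \hat\lambda = \sum c_i / \sum t_i.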
Recalling that the elements of a weight matrix are the reciprocal of the variance,
the inverse of the weight matrix is simply the variance of the Poisson distribution. Since a
Poisson distribution has variance equal to the mean, the elements of µ are used to construct
a weight matrix of the form W = Diag(µ1, µ2, . . . , µn). Thus, the variance-covariance
matrix of the estimate of \beta is approximated by

\mathrm{Var}(\hat\beta) \approx \frac{1}{\phi} (X'WX)^{-1}, (4.27)

where X is an n \times (p+1) matrix with the first column being all ones,

X = \begin{bmatrix} 1 & x_{1,1} & \cdots & x_{p,1} \\ 1 & x_{1,2} & \cdots & x_{p,2} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{1,n} & \cdots & x_{p,n} \end{bmatrix},

x_{j,i} is the level of the jth factor at the ith run, p is the number
of experimental factors, n is the total number of experimental runs, and \phi is a dispersion
matrix, they are replaced by their expected values. Given the censoring time is tc, the
expected value of the mean lifetime is expressed as
E(\mu_i) = \Phi(t_c) + t_c \cdot e^{\beta_0 + x_i'\beta} \cdot [1 - \Phi(t_c)], (4.28)

where \beta_0 = \log \lambda_0 (the derivation of this result is in Appendix B). Thus, we have an
expression relating the mean of each distribution as a function of the vector of model
parameters β, the censoring time tc, and the failure distribution Φ. With this result in
mind, we next discuss the computational procedures used to conduct this analysis.
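For design evaluation, Eq. (4.28) is straightforward to compute. The sketch below is ours, specialized to an exponential failure-time distribution so that \Phi(t) = 1 - e^{-\lambda t} with \lambda = e^{\beta_0 + x'\beta}; it returns the expected Poisson mean that serves as an IWLS planning weight:

```python
import math

def expected_mu(tc, beta0, x, beta):
    # Eq. (4.28) for an exponential failure time: the failure probability by
    # tc plus lambda*tc times the survival probability at tc.
    lam = math.exp(beta0 + sum(b * xi for b, xi in zip(beta, x)))
    phi_tc = 1.0 - math.exp(-lam * tc)  # exponential CDF at the censoring time
    return phi_tc + tc * lam * (1.0 - phi_tc)
```

A convenient check: when \lambda t_c = 1 the two terms combine to exactly one expected "event" per unit, and the weight grows beyond one as the test runs longer relative to the mean life.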
3. Computational procedure
This section describes the key features of the computational methods for finding
optimal experimental designs. To find an optimal design, we need to find the design matrix
X∗ such that the D-optimal or U -optimal design criteria expressed in Equations (4.1) and
(4.2), are optimized. To enable this analysis, SAS programs are written using the PROC
IML procedure and non-linear call functions. These programs are listed in Appendix C.
3.1. Search methods
Our initial search methods employed a grid search strategy. Although highly effi-
cient programmatically, it does not guarantee finding the global optimal design. To improve
the likelihood of finding the global optimal design, increasingly smaller grid sizes are re-
quired. Thus, the effectiveness of this approach is inversely proportional to the grid size.
With these results in mind, our approach focused on examining non-linear search methods.
We compared non-linear programming methods such as Quasi-Newton (NLPQN), Nelder-
Mead Simplex Method (NLPNMS), and Conjugate Gradient (NLPCG). These subroutines
minimize or maximize a continuous non-linear function f = f(x) of n parameters, where
x = (x1, . . . , xn)′. Although we selected NLPCG based on its shortest computation time
for large designs, we note that NLPNMS and NLPQN are more suitable for situations in-
volving non-linear constraints on the parameters. Some examples of these search methods
are discussed and outlined by Atkinson (2006).
3.2. Initial values and convergence criteria
To avoid locally optimal solutions, we select data points with a uniform distributed
random number generator and a changing seed value. To mimic a typical reliability plan-
ning situation, we considered designs up to 200 units. For smaller designs involving less
than 50 samples, we observed that the sample size has only a minimal impact on compu-
tational time. However, as candidate designs exceed 100 test points, computational time
and termination criteria (tc) thresholds become increasingly important. Examples of
these thresholds include the maximum number of iterations allowed, tc[1], the maximum
number of function calls, tc[2], and the absolute function convergence criterion, tc[3]. The
default levels for these thresholds are largest for the conjugate gradient (NLPCG) method
and are observed to be sufficient for designs up to 200 test points. For larger designs, these
convergence criteria can easily be changed by specifying the desired value as an option
in the call command. For smaller, unconstrained problems, the NLPCG call requires less
memory than other subroutines (SAS, 2008). As a result, we found that NLPCG is the
fastest algorithm, as it scales linearly with n, while all other optimization techniques scale
as n^2.
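Although our implementation uses SAS PROC IML with the non-linear search calls above, the grid-search baseline of Section 3.1 is easy to sketch in Python. This is illustrative only; the first-order model matrix [1, x1, x2] and the user-supplied weight function are assumptions:

```python
import itertools
import numpy as np

def grid_search_d_optimal(n_points, grid, weight_fn):
    # Exhaustive search over a candidate grid for the n-point support
    # maximizing |X'WX| -- a simple analog of the Section 3.1 strategy.
    best, best_val = None, -np.inf
    for pts in itertools.combinations(grid, n_points):
        X = np.array([[1.0, x1, x2] for (x1, x2) in pts])
        w = np.array([weight_fn(x1, x2) for (x1, x2) in pts])
        val = float(np.linalg.det(X.T @ np.diag(w) @ X))
        if val > best_val:
            best, best_val = pts, val
    return best, best_val
```

Refining the grid around the returned support approaches the continuous optimum, but the combinatorial cost grows quickly, which is exactly what motivates the non-linear programming subroutines discussed above.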
4. Numerical examples and results
In this section, we illustrate two important features of the GLM approach: design ef-
ficiency and model flexibility. To demonstrate these features, we introduce two case studies.
The first study examines a temperature-humidity acceleration model with only main effect
terms. Efficiency comparisons are made to experimental design heuristics for two-factor
ALTs introduced by Meeker and Escobar (1998). The second study illustrates the flexibil-
ity of the U-optimality approach by expanding the lifetime regression model to include an
interaction effect. Changes in the test locations and sample allocation for the U-optimal
design resulting from the addition of an interaction term are discussed.
4.1. Case study 1
As part of our first study, we consider a reliability test plan associated with a micro-
processor device. In particular, we consider a thermo-moisture failure mode. Examples of
such failure modes include corrosion, passivation cracking, and bond interface degradation.
Models for predicting such behavior have been described in the literature and are sum-
marized by Shirley (1994). With these types of failure mechanisms in mind, we introduce
a two-factor accelerated life testing example with a non-linear response surface and data
right censored at 100 hours. A temperature-humidity test is planned for 100 test specimens.
Prior testing of a previous product suggests that the lifetimes follow a Weibull distribution.
For simplicity, we assume the shape parameter is known, thus allowing the data to be trans-
formed into an exponential distribution. Examining the results of several different failure
modes, we observe that the activation energy value ranges from 0.75 to 1.15 and that
the moisture exponent ranges from 3 to 5. Assuming a mean lifetime of one million hours at
the use condition and applying conservative (smaller) values for the model coefficients, we
postulate a true lifetime regression model for the mean duration, or Mean Time to Failure
(MTTF in hours) as
\log(MTTF) = -5.238 + 0.750\, s_1 - 3.000\, s_2, (4.29)

where s_1 and s_2 are the natural stresses of temperature and humidity, with s_1 = 11{,}605/T for
temperature T in Kelvin and s_2 = \log(h) for humidity h measured as a percentage. We
observe that the sign of s1 and s2 differ due to our formulation of these natural stresses.
For this example, we note that as temperature increases, s1 decreases. On the other hand,
higher humidities yield larger values of s2. Furthermore, the intercept of Equation (4.29)
reflects the mean lifetime of the use condition under a normalized scale and is discussed
further below. To compute this normalized scale, we define the region of acceptable test
conditions. Based on a maximum test duration of two months, an activation energy of 0.75
and a Peck coefficient of -3.00, and a use condition of 30◦C / 25% RH, a minimum test
condition constraint is defined as 60◦C for temperature and 60% RH for humidity. This
corresponds to location (1,1) on Fig. 7. Similarly, the melting point for the test board
constrains temperature to a maximum value of 110◦C. At this temperature, test chambers
can typically achieve humidity levels of 90%. Thus, a maximum stress condition of 110◦C
/ 90% RH is selected corresponding to location (0,0) on Fig. 7.
Following Meeker and Escobar’s notation, these regressors are normalized to a unit
square with the ith level of variable j defined as \xi_{ji} \equiv (s_{ji} - s_{jL})/(s_{jH} - s_{jL}), where s_{jH} and
s_{jL} are the high and low stress level conditions, respectively. Thus, the use condition lies
at (0,0) and the highest stress level lies at (1,1). After this transformation, the previously
postulated lifetime regression model becomes
\log(MTTF) = 13.816 - 5.994\,\xi_1 - 3.843\,\xi_2. (4.30)
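This transformation is easy to script. The sketch below is ours; it reproduces the natural-stress values that appear in Table 9, with the coding oriented so that the use condition maps to 0 and the highest stress to 1, per the normalization in the text:

```python
import math

def natural_stresses(temp_c, rh_pct):
    # Arrhenius and Peck natural stresses: s1 = 11605/T with T in Kelvin,
    # s2 = log(h) for relative humidity h in percent.
    return 11605.0 / (temp_c + 273.15), math.log(rh_pct)

def unit_square(s, s_use, s_high):
    # Unit-square coding: 0 at the use condition, 1 at the high stress level.
    return (s - s_use) / (s_high - s_use)
```

For example, 30 C gives s1 close to the 38.3 shown for the use condition, and 110 C gives s1 close to the 30.3 shown for the high stress condition.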
Meeker and Escobar (1998) provide a degenerate design by combining the effects
of two factors together. Then, an optimal split design is constructed by considering the
minimal parameter estimation variance as an additional criteria. We create these designs
by minimizing the prediction variance of the 0.1 percentile lifetime at a use condition of
30◦C / 25% RH. These serve as benchmarks for our comparison later. Table 9 presents
the test plan for the degenerate design. The low stress level is found at 87◦C / 66% RH,
and there are 67 test specimens allocated. The remaining 33 specimens are allocated at
Table 9: Degenerate design
Test   Temp    Code   Std    Hum    Code   Std    Reg     Std     Fail    Alloc  Exp
i      (◦C)    s1     ξ1     (%)    s2     ξ2     µ(x)    ζi      Φ       ni     fails
Use    30.0    38.3   0.000  25.0   3.219  0.000  13.82   -9.207  0.0001
Low    87.0    32.2   0.758  66.0   4.190  0.758   6.36   -1.754  0.1590  67     10.7
High   110.0   30.3   1.000  90.0   4.500  1.000   3.98    0.827  0.8461  33     27.9
Table 10: Degenerate split design
Test   Temp    Code   Std    Hum    Code   Std    Reg     Std     Fail    Alloc  Exp
i      (◦C)    s1     ξ1     (%)    s2     ξ2     µ(x)    ζi      Φ       ni     fails
Use    30.0    38.3   0.000  25.0   3.219  0.000  13.82   -9.207  0.0001
Split1 91.3    31.8   0.806  60.0   4.094  0.683   6.36   -1.754  0.1590  26     6.5
Split2 73.7    33.5   0.603  90.0   4.500  1.000   6.36   -1.754  0.1590  41     4.2
High   110.0   30.3   1.000  90.0   4.500  1.000   3.98    0.627  0.8461  33     27.7
the user-defined high stress level condition of 110◦C / 90% RH. For more details on this
procedure, refer to Meeker and Escobar (1998, Chapter 20).
Next, we split the allocation of units assigned to the low stress level to two opposing
locations along a line of constant failure rate. This generates two points at the north and
south boundaries of the feasible test region as denoted by the triangle markers in the left
panel of Fig. 8. Table 10 presents the split design, where “Split1” corresponds to the point
on the south boundary in the left graph of Fig. 8 and “Split2” corresponds to the point on
the north boundary.
Table 11: D-optimal design
Test   Temp    Code   Std    Hum    Code   Std    Reg     Std     F(t)    Alloc  Exp
i      (◦C)    s1     x1     (%)    s2     x2     µi      ζi      Φ       ni     fails
1      60.0    34.83  1.00   90     4.500  0.00   7.39    -2.782  0.0060  33     2.0
2      83.2    32.56  0.50   60     4.090  1.00   6.90    -2.295  0.0959  21     2.0
3      110     30.29  0.00   60     4.090  1.00   5.19    -0.590  0.4257  20     8.5
4      110     30.29  0.00   90     4.500  0.00   3.98     0.627  0.8461  26     22.0
[Figure: two panels of the feasible design region with the Use condition, Low, High, Split 1, and Split 2 locations marked on the x1-x2 axes.]

Fig. 8: Feasible design region in terms of ξ (left) and x (right) scales
Table 12: U-optimal design without an interaction effect
Test   Temp    Code   Std    Hum    Code   Std    Reg     Std     F(t)    Alloc  Exp
i      (◦C)    s1     x1     (%)    s2     x2     µi      ζi      Φ       ni     fails
1      70.9    33.73  0.78   61.4   4.117  0.95   7.71    -3.102  0.0441  50     2.2
2      60.0    34.83  1.00   90     4.500  0.00   7.39    -2.782  0.0600  21     1.3
3      110     30.29  0.00   60     4.090  1.00   5.19    -0.590  0.4257  16     6.8
4      110     30.29  0.00   90     4.500  0.00   3.98     0.627  0.8461  13     11.0
It is important to note that in our previous discussion of an experimental design re-
gion, we adopted an orientation different than that used by Meeker and Escobar. Through-
out the remainder of this chapter, we designate the high stress condition to lie at the origin
instead of the use condition as shown by the right panel of Fig. 8. This is a simple linear
transformation (rotation and scaling) and does not inhibit prediction variance comparisons
from being made. We compute xji ≡ (sji − sjH)/(sjC − sjH), where sjH and sjC are the
high and constrained stress level conditions. This alternate orientation is selected so that
the use condition region (in contrast to a single use condition point) will fall in the first
74
quadrant of the design space. Using this nomenclature yields three non-collinear locations
labeled as High, Split1, and Split2 in Table 10. Note that Split1 is on the north boundary
established by the 60% humidity constraint instead of the east boundary as would occur in
an unconstrained situation (see dashed triangle on the right panel of Fig. 8).
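The stress transforms and the standardization described above can be sketched numerically. The snippet below (function names are our own) applies the Arrhenius and log-humidity transforms and maps the high stress condition to the origin, recovering the coded use-condition coordinates of roughly (1.76, 3.16) quoted later in this chapter.

```python
import math

K_PER_EV = 11605.0  # reciprocal Boltzmann constant, in K/eV

def natural_stresses(temp_c, rh_pct):
    """Arrhenius temperature stress and log-humidity stress."""
    return K_PER_EV / (temp_c + 273.15), math.log(rh_pct)

def coded(s, s_high, s_constrained):
    """Map the high-stress level to 0 and the constrained level to 1."""
    return (s - s_high) / (s_constrained - s_high)

s1_hi, s2_hi = natural_stresses(110, 90)  # highest test stress
s1_lo, s2_lo = natural_stresses(60, 60)   # constrained (mildest) test stress

# The 30 C / 25% RH use condition falls outside the unit square
s1_use, s2_use = natural_stresses(30, 25)
x1_use = coded(s1_use, s1_hi, s1_lo)
x2_use = coded(s2_use, s2_hi, s2_lo)
print(round(x1_use, 3), round(x2_use, 3))  # 1.758 3.159
```

Because the use condition is milder than any feasible test condition, both coded coordinates exceed 1, placing the use condition in the first quadrant beyond the unit square.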
Examining the resulting three-point design, we note some areas of concern. First,
the allocation of specimens at the two split points is constrained by the total number
of specimens obtained in the degenerate design. Although this maintains the secondary
design criterion (D-optimality), we find that it allows the distinct possibility of only a few
specimens failing for the Split2 condition (with the expected number of failures being 4.2).
Secondly, there are no specimens allocated to the lowest stress condition corner at 60◦C
/ 60% RH. Given our objective is to minimize the prediction variance of a location well
outside the feasible design region, it seems logical to have a large allocation of test specimens
at a stress level close to the use condition.
The U-optimal design we obtained is shown in the bottom panel of Fig. 9 and
summarized in Table 12. Four distinct experimental points are selected. Three of them are
at the corners of the feasible design region and one is close to the lowest stress level corner.
We note that the testing condition with the lowest possible stress level is not selected. This
is due to the non-linearity of the response function, which is common in GLM experimental
designs, and is also the result of the high censoring rate that occurs at low stress levels,
which is a common characteristic of reliability testing. The number of specimens allocated
to the three corner points is smaller than the number allocated at the lowest stress level
point. This is because more testing samples are censored at the low stress level. We also
observe that the selected experimental points are widely spread across the entire feasible
design region so that the model parameters can be better estimated, thus reducing the
prediction variance at the use condition. Although maximizing the spacing between the
design points has an obvious benefit in terms of prediction variance, one must carefully
balance this benefit against the risk of incurring no failures, particularly for small sample
sizes.
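This failure-count risk can be checked directly from a design table. A minimal sketch using the allocations and failure probabilities of the Table 12 U-optimal design (the four-failure threshold is our own illustrative choice):

```python
# Allocation n_i and failure probability F(t_c) for the four test points
# of the Table 12 U-optimal design (100 specimens total)
design = [(50, 0.0441), (21, 0.0600), (16, 0.4257), (13, 0.8461)]

expected_fails = [round(n * p, 1) for n, p in design]
print(expected_fails)  # [2.2, 1.3, 6.8, 11.0] -- matches the Fails column

# Points expecting fewer than four failures warrant extra attention
at_risk = [i + 1 for i, ef in enumerate(expected_fails) if ef < 4]
print(at_risk)  # [1, 2]
```

The two low-stress points are the ones at risk of yielding too few failures, which motivates the sample-size discussion later in the chapter.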
Applying the prediction variance equation shown in Equation (4.2) at the normalized
and scaled use condition location (1.758, 3.159), we observe that the U-optimal design has
the lowest prediction variance of 0.539 followed by the D-optimal design at 0.561 and the
Optimal Split design at 0.683. Thus, the U-optimal design provides a 2% (√(0.561/0.539) − 1)
and a 12.5% (√(0.683/0.539) − 1) reduction in the width of the confidence intervals for the use
condition lifetimes over the D-optimal and Degenerate Split designs, respectively. This result is not
surprising since the objective function and goal of the Degenerate Split design are different
and there are additional constraints for its construction.
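Since confidence interval width scales with the square root of the prediction variance, the quoted percentage reductions follow from a one-line calculation (the helper name is ours; the second value rounds to 12.6%, which the text reports as 12.5%):

```python
import math

def ci_width_ratio(pv_other, pv_u):
    """Fractional increase in confidence interval width of a competing
    design over the U-optimal design; width scales as sqrt(PV)."""
    return math.sqrt(pv_other / pv_u) - 1.0

print(round(100 * ci_width_ratio(0.561, 0.539), 1))  # 2.0  (D-optimal)
print(round(100 * ci_width_ratio(0.683, 0.539), 1))  # 12.6 (Degenerate Split)
```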
Fig. 9 depicts the Degenerate Split, D-optimal, and U-optimal designs. We note
that all three designs exhibit good radial symmetry as one extends beyond the
design region; however, their constructions differ. With the circular marker size drawn
proportional to the allocation of a design test condition, one can see that the D-optimal
and U-optimal designs are supported by four distinct test point locations versus the three
locations that always result from a Degenerate Split design. In addition, D-optimality,
which focuses more on good estimators for all the model parameters, places experimental
points more evenly at the corners of the feasible design region. Once again, due to the
curvature of response function and high censoring at low testing stress levels, the D-optimal
ALT plan does not place all its experimental points at the corners and does not assign an
equal sample size to all the test locations. For the models we studied, D-optimal plans are
given in Table 11 and are illustrated in the middle panel of Fig. 9.
Fig. 9: Contour plots of prediction variance at the use condition under the degenerate split (top), D-optimal (middle), and U-optimal (bottom) criteria
In contrast to the other designs, the U-optimal design has 50% of all the test units
allocated to the corner closest to the use condition. We also observe the impact of censoring
as the location of this point is not at the (1,1) corner. Examining the contour plots in Fig. 9
and the prediction variance at the use condition (denoted with a square marker), we observe
that the U-optimal design has the lowest prediction variance at the use condition location
and is reasonably robust to changes in the use condition location as shown by the contours.
Both censoring and interaction effects are important facets of accelerated life testing that
drive these unique features. We now discuss the effect of including an interaction term in
the model in the following case study.
4.2. Case study 2
Recalling the mean lifetime regression model with two main effects, we now examine
a situation when a two-way interaction effect is also present. We consider the following
log-linear function for mean time to failure
log(MTTF ) = −5.238 + 0.750s1 − 3.000s2 − 0.040s1s2. (4.31)
Applying the same scaling and normalization as previously described yields
log(MTTF ) = 8.887− 4.965x1 − 5.804x2 + 0.410x1x2. (4.32)
We also examine the design when a larger interaction effect (double the previous case
considered) is present by assuming the log lifetime model is
log(MTTF ) = −5.238 + 0.750s1 − 3.000s2 − 0.080s1s2. (4.33)
Applying the same scaling and normalization as previously described yields
log(MTTF ) = 3.958− 3.936x1 − 7.767x2 + 0.819x1x2. (4.34)
We note that the optimal degenerate design and split design cannot directly address
the effect of the two-way interaction because they combine the effects of multiple factors
as one effect characterized by an acceleration factor. These designs provide two or three
experimental points, which are insufficient for estimating the four model parameters. These
issues are not a problem for the more flexible construction of the U-optimal design. U-
optimal designs for three different interaction levels (none, low, and high) are now discussed.
We first consider the design constructed assuming the model form includes a two-way
interaction effect, but its true coefficient value is zero. Note that the construction of the
U-optimal design has exactly the same four test points as previously described in Table 12.
However, the resulting prediction variance is different (0.539 versus 4.054). This is because
the assumed model form is different. In Case 1, the assumed model form is strictly a main
effects model with three parameters to estimate. In Case 2, we assume a main effects model
with a two-factor interaction effect, thus having four parameters to estimate. The impact a
different design has on the prediction variance is more readily apparent by comparing the
(X ′WX) and (X ′WX)−1 matrices for both model assumptions. These matrices (as well as
those for the other interaction models) are shown in Appendix D. Examining the first two
(X ′WX) matrices in Appendix D1, we note that the dimensions are different, reflecting
the difference in the number of model parameters. We also observe that the upper left
3×3 submatrix is equal to the matrix for the main effects model. However, this is not true
for the (X ′WX)−1 matrices. Hence, a different prediction variance exists, even if the true
coefficient value of the interaction term is zero, so long as it has been estimated. We may also explain
the prediction variance differences in terms of their available degrees of freedom. Given this
U-optimal design is supported by four test points, the model is saturated. Because all of
the degrees of freedom are used in the estimation of the parameters, there are zero residual
degrees of freedom left for testing goodness of fit. Furthermore, this inflates the parameter
estimation error and, in turn, inflates the prediction variance (since the prediction variance
is a function of the parameter estimation error).
We next consider two models with a small and large interaction coefficient, respec-
tively. We observe that four distinct test locations also are optimal for minimizing the
prediction variance of these designs, just as occurred for the previous models. For the
model with a low level of interaction, the results are summarized in Table 13 and illustrated
in the middle panel of Fig. 10. For the model with a high level of interaction, the results
are summarized in Table 14 and the bottom panel of Fig. 10. Examining the three panels
of Fig. 10, we observe that the allocation of the lowest stress level test units increases (from
50 to 53 to 59) as the magnitude of the interaction effect increases (0 to 0.41 to 0.82). In
addition, the locations of the two points at the lower humidity levels migrate inward into
the middle of the feasible design region. This is most easily seen by comparing the shapes
of the contour lines. In the no interaction case, they are symmetric. As the interaction ef-
fect strengthens, the temperature acceleration (which is exponential) becomes much larger
than the moisture acceleration (which is logarithmic). Thus, we observe a stretching of the
contour bands along the x1 and x2 axes. In addition, the contours show that the predic-
tion variance deteriorates more quickly as both temperature and humidity become large.
This distortion of the contour bands leads to an increasingly larger gradient of prediction
variance values as you extend away from the design region. For our fixed use condition
location, we observe the prediction variance increases from 4.054 to 5.990 to 10.504 as the
interaction effect is increased.
Table 13: U-optimal design with a small interaction effect

Test    Temperature               Humidity                  Reg      Std      F(t)     Alloc   Exp
Cond    Natural   Code    Std     Natural   Code    Std     Model    Time
i       °C        s1      x1      %         s2      x2      µi       ζi       Φ        ni      Fails
1       69.6      33.97   0.785   79.1      4.203   0.733   7.55     -2.942   0.0514   53      2.7
2       60        34.83   1.000   90        4.500   0.000   7.39     -2.782   0.0600   22      1.3
3       110       30.29   0.000   60        4.090   1.000   5.19     -0.590   0.4257   15      6.4
4       110       30.29   0.000   90        4.500   0.000   3.98     0.627    0.8461   11      9.3
Table 14: U-optimal design with a large interaction effect

Test    Temperature               Humidity                  Reg      Std      F(t)     Alloc   Exp
Cond    Natural   Code    Std     Natural   Code    Std     Model    Time
i       °C        s1      x1      %         s2      x2      µi       ζi       Φ        ni      Fails
1       64.3      34.38   0.902   71.2      4.265   0.579   7.76     -3.152   0.0419   59      2.5
2       60.0      34.83   1.000   90.0      4.500   0.000   7.39     -2.782   0.0600   17      1.0
3       110       30.29   0.000   64.9      4.170   0.805   4.96     -0.354   0.5043   13      6.6
4       110       30.29   0.000   90.0      4.500   0.000   3.98     0.627    0.8461   11      11.0
Although these designs minimize the prediction variance at the use condition, careful
consideration must also be given to the total sample size selected. For convenience, we
selected 100 test specimens. Upon examining the expected quantity of failures for the U-
optimal designs (the last column of Tables 12, 13, and 14), we observe that two design
locations are expected to observe fewer than four failures. If sample size is unconstrained,
then one can simply apportion a multiple of the original design that satisfies the minimum
failure count constraint. This recommendation is based on our observation that the test
locations and their corresponding allocation proportions are constant for designs larger than
80 samples. Thus, 300 samples would be a good choice in that a minimum of four or five
Fig. 10: U-optimal designs for varying magnitudes of interaction effect: none (top), low (middle), and high (bottom)
failures are expected at each stress level. In the case that the total sample size available is
larger but modestly constrained, we advise augmenting the original design with additional
test specimens, at only the test locations with low failure counts, sufficient to yield four or
five failures. If no increase to the sample size is possible, we suggest a compromise to the
U-optimal design locations in which the lowest stress levels are increased sufficiently to ensure
that four to five failures occur.
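The augmentation arithmetic above can be sketched by scaling the base design proportionally; here we reuse the Table 12 allocation, with per-100-unit proportions taken from that table:

```python
# Table 12 U-optimal allocation (per 100 units) and failure probabilities
alloc = [50, 21, 16, 13]
probs = [0.0441, 0.0600, 0.4257, 0.8461]

def expected_failures(total):
    """Expected failure counts when the design is scaled to `total` units."""
    return [round(total * n / 100 * p, 1) for n, p in zip(alloc, probs)]

print(expected_failures(300))  # [6.6, 3.8, 20.4, 33.0]
```

At 300 units the smallest expected count is about 3.8 failures, which is essentially the four-to-five failure target discussed above.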
5. Conclusions
This chapter highlights some of the benefits associated with incorporating design of
experiment concepts into accelerated life test planning. It also examines cases that involve
curvature in the response surface coupled with censoring. A GLM framework is shown to
facilitate the computation of maximum likelihood estimators for reliability, and thus, the
determination of designs that more fully meet a practitioner’s needs. Recent improvements
in the non-linear search capabilities in standard software packages such as SAS enable large
scale matrix computations with relative ease. Using this approach, designs are constructed
that offer gains of 2% and 12.5% over comparable D-optimal and Degenerate Split designs. In
addition, graphical depictions integrating prediction variance contours coupled with their
corresponding experimental designs are shown. These recent advancements offer practition-
ers an ability to easily compare multiple designs for their robustness. More importantly,
the framework outlined with the GLM construction extends the range of regression mod-
els beyond two factors. As such, it enables more complex designs to be considered by a
reliability practitioner.
CHAPTER 5
Sensitivity and Robustness Analysis
1. Motivation
Reliability practitioners increasingly are faced with the challenge of making product
lifetime assessments under time, material, and cost constraints. As such, accelerated life
test plans often involve difficult choices associated with restricted design regions, censor-
ing, and sample size selection. The complexity of these trade-off decisions is compounded
by uncertainty in the model parameter estimates associated with environmental accelera-
tion factors. In the previous chapter, we propose the use of a U-optimality criterion in the
construction of such designs to minimize the variance of a use condition prediction. The U-
optimal design is shown to minimize the use condition prediction variance for a prescribed
model. Since the initial values specified for a lifetime model represent expert opinion or an
initial guess, practitioners could feel some degree of anxiety over the impact that model
parameter mis-specification has on the resulting predictions. Thus, we revisit the U-optimal
design and evaluate its performance under various levels of uncertainty. We offer several
approaches toward quantifying the sensitivity of a U-optimal design that enable a practitioner
to more readily assess their comfort with possible outcomes from a test plan in the
planning stage. We conclude with a discussion on the use of controllable factors such as
censoring time and sample size to mitigate highly sensitive scenarios.
2. Numerical example
As part of this evaluation, we examine a two-factor accelerated life test example
similar to those introduced in Chapter 4. With the same set of assumptions and constraints,
we express a log-lifetime regression model of the form shown in Equation (5.1)
log(MTTF ) = −5.238 + 0.750s1 − 3.000s2 − 0.025s1s2, (5.1)
where s1 and s2 are natural stresses of temperature and humidity, s1 = 11,605/T with
temperature in Kelvin, and s2 = log(h) with humidity measured as a percentage. Applying
the same scaling and normalization scheme we describe in Chapter 4 yields the regression
model shown in Equation (5.2).
log(MTTF ) = 10.735− 5.351x1 − 5.069x2 + 0.256x1x2 (5.2)
Applying the U-optimality criterion,

    min_{x ∈ R}  U = x_use^(1)′ (X′WX)^{-1} x_use^(1),    (5.3)
a design having four unique test point locations in feasible region R is constructed for a use
condition location of x_use^(1)′ = [1, 1.759, 3.159, 5.555], which corresponds to 30°C / 25% RH in
natural units. The optimal design with factor combinations and proportions of the sample
are summarized in Table 15 and are graphically depicted in Fig. 11.
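Under the assumptions of Equation (5.3), the criterion is a quadratic form in the inverse information matrix. The sketch below illustrates only the computation: the diagonal weight matrix here is a placeholder proportional to the allocations, whereas the true W also reflects the censoring scheme and the GLM link.

```python
import numpy as np

# Coded test points (x1, x2) and allocations from Table 15
pts = np.array([[0.779, 0.809], [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
n = np.array([53.0, 20.0, 16.0, 11.0])

# Model matrix: intercept, x1, x2, and the x1*x2 interaction
X = np.column_stack([np.ones(4), pts[:, 0], pts[:, 1], pts[:, 0] * pts[:, 1]])

# Placeholder GLM weights, here simply the allocations; the dissertation's
# W also depends on the censoring time through the failure probabilities
W = np.diag(n)

# Expanded use-condition vector [1, x1, x2, x1*x2]
x_use = np.array([1.0, 1.759, 3.159, 1.759 * 3.159])

# U-criterion: quadratic form giving the scaled prediction variance
U = x_use @ np.linalg.inv(X.T @ W @ X) @ x_use
print(U > 0.0)  # a positive variance, to be minimized over candidate designs
```

Searching over candidate test points and allocations to minimize this quadratic form is what yields designs such as the one in Table 15.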
Table 15: U-optimal design for a set of hypothesized model coefficients

Test    Temperature        Humidity           F(t)     Alloc   Exp
Cond    Natural   Std      Natural   Std
i       °C        x1       % RH      x2       Φ        ni      Fails
1       69.9      0.779    64.8      0.809    0.0480   53      2.5
2       60        1.000    90        0.000    0.0600   20      1.3
3       110       0.000    60        1.000    0.4257   16      6.4
4       110       0.000    90        0.000    0.8461   11      9.3
In Fig. 11, we summarize two key features of the U-optimal design: its construction
and its resulting prediction variance contours across a range of use conditions. The first
Fig. 11: U-optimal design for hypothesized values of model coefficients
summary describes the location of the test points within the feasible design region and their
corresponding sample allocation. We observe that over half (53) of the samples are allocated
to the lowest stress condition (0.779, 0.809), while the highest stress condition (0,0) receives
only 11 samples. The remaining samples are approximately equally allocated at the (0,1)
and (1,0) corners. This allocation reflects, in part, differences in failure rates across the four
stress test conditions. Note that the failure rate at the lowest stress condition of 69.9◦C /
64.8% RH is 4.8% as compared to 84.6% at the highest stress condition of 110◦C / 90% RH.
Even with this disproportionate allocation, we observe that the expected number of failures
at the lowest stress condition (2.5) is still lower than the highest stress condition (9.3). We
also observe that the lowest stress condition does not lie at the (1,1) corner. Instead, it lies
at (0.779, 0.809), which reflects an additional measure of compensation by the design. This
is in response to the low rate of failure occurring at the intersection of the design region
constraints (1,1) as discussed in Section 4.
The second summary included in Fig. 11 is the contour profile associated with the
prediction variance of this design. We note that the contours are elongated closest to the
x1 and x2 axes. This produces steep contours as one progresses diagonally in the direction
from the highest stress condition (0,0) to the use condition (1.759, 3.159). We observe a
prediction variance value of 5.272 at the use condition location and denote it with a square
marker. With a baseline design constructed, we now evaluate and discuss the sensitivity to
model parameter mis-specification on this design.
3. Influence of parameter uncertainty on U-optimal design performance
Although the U-optimal design of Table 15 provides the lowest use condition predic-
tion variance conditional on the initial parameter guesses, this designation depends on the
specific values of the regression model coefficients. In some situations, practitioners have
little knowledge of these coefficient values prior to conducting the tests. In such
circumstances, it is desirable to have a design which performs robustly and is relatively in-
sensitive to model parameter uncertainty. This provides the practitioner some reassurance
that small to modest mis-specifications in model parameters do not result in unacceptably
large increases in the width of the prediction intervals. To demonstrate this idea, we discuss
the implications of model parameter mis-specification from two perspectives assuming that
the true values of the model coefficients deviate no more than ±20% from the hypothe-
sized values. First, we evaluate how the optimal design would change based on different
parameter values. U-optimal designs constructed with perfect knowledge of the parameter
coefficients are evaluated over the ±20% range of uncertainty. The location and allocation
of stress test points are compared to the U-optimal design constructed assuming the hy-
pothesized coefficient values are true. Second, we examine the impact of mis-specification
on the prediction variance - the main objective of this study. Trends and general patterns
associated with the uncontrollable parameter estimates (e.g. the lifetime regression model
coefficients) are discussed. Combinations that result in undesirably large shifts and inflation
of the prediction variance are identified. We then offer augmentation strategies involving
the controllable factors such as censoring time and sample size to mitigate the degree to
which prediction variances are negatively impacted.
3.1. Uncertainty range
Recall that the initial guesses for our natural (unscaled) model coefficient values are
Q = 0.75, c = -3.00, and Q × c = -0.025, respectively. Minimum and maximum levels for each
parameter estimate are computed as 20% deviations from these values and shown in Table
16.
Table 16: Ranges for the lifetime regression model coefficients

Model     Low      Midpt.   High
Coeff.    (–)      (0)      (+)
Q          0.60     0.75     0.90
c         -2.40    -3.00    -3.60
Q × c     -0.02    -0.025   -0.03
It is worth noting that although the range in coefficient size examined is ±20%, its
impact in terms of an overall acceleration level is considerably higher. For example, the
thermal acceleration is modeled as an Arrhenius relationship, as shown by
    AF = exp[ -(Q/kB)(1/Thigh - 1/Tlow) ]    (5.4)
where Q is the activation energy, temperatures are in Kelvin, and kB is Boltzmann’s con-
stant (8.617×10−5). Given the range of temperatures associated with the feasible design
region (60-110°C), the corresponding thermal acceleration factor ranges from 15 to 60 (nominally
at 30.2). A similar investigation into the moisture acceleration reveals a smaller, more
linear impact, as the acceleration ranges from 2.65 to 4.30 (nominally at 3.38). Noting that
we assume an Eyring model relationship such that the joint thermo-moisture acceleration
is multiplicative, an overall acceleration factor range of 40 to 257 exists about the nominal
acceleration factor of 102. For a practitioner, it offers an additional measure of comfort to
know that lifetimes may differ substantially more than expected, yet still fall well within
the range of values we examine here.
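The quoted acceleration-factor ranges can be verified directly from Equation (5.4) and the log-humidity (power-law) relationship; the helper names below are ours:

```python
import math

KB = 8.617e-5  # Boltzmann constant, eV/K

def thermal_af(q_ev, t_low_c=60.0, t_high_c=110.0):
    """Arrhenius acceleration factor between the two temperature extremes."""
    return math.exp(q_ev / KB * (1 / (t_low_c + 273.15) - 1 / (t_high_c + 273.15)))

def moisture_af(c, rh_low=60.0, rh_high=90.0):
    """Power-law (log-humidity) acceleration factor."""
    return (rh_high / rh_low) ** c

for q in (0.60, 0.75, 0.90):
    print(round(thermal_af(q), 1))   # 15.3, 30.2, 59.8

for c in (2.4, 3.0, 3.6):
    print(round(moisture_af(c), 2))  # 2.65, 3.38, 4.3

# Eyring-style joint acceleration is multiplicative
print(round(thermal_af(0.60) * moisture_af(2.4)))  # 40
print(round(thermal_af(0.90) * moisture_af(3.6)))  # 257
```

The extremes reproduce the overall acceleration-factor range of roughly 40 to 257 about the nominal value of 102.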
With high and low levels specified, we create a 2³ factorial design and calculate the
corresponding eight new log-lifetime regression models shown in the middle four columns
of Table 17. Next, a U-optimal design is constructed for each of these permutations. These
designs are summarized in Appendix E. The corresponding prediction variance of these U-
optimal designs are also computed, thus establishing a “best” achievable prediction variance
level under the premise of perfect knowledge. These values are summarized under the PVbest
column heading in Table 17.
Table 17: Model sensitivity study on prediction variance

Run   Design    β0      β1      β2      β3      PVbest   PVorig   RE
1     – – –     7.54    -4.28   -4.05   0.205    2.99     3.84    78.0
2     – – +     6.31    -4.02   -4.55   0.307    3.38     4.00    84.5
3     – + –     3.68    -4.28   -5.59   0.205    7.07     7.43    95.2
4     – + +     2.45    -4.02   -6.08   0.307    9.17    11.07    82.9
5     + – –    19.02    -6.68   -4.06   0.205    4.48     4.56    98.3
6     + – +    17.79    -6.42   -4.55   0.307    4.82     4.92    97.8
7     + + –    15.16    -6.68   -5.59   0.205    7.47     9.96    75.1
8     + + +    13.93    -6.42   -6.08   0.307    8.29    11.61    71.4
In addition, the prediction variance of the “original” (mis-specified) design shown
in Table 15 is computed for each of these combinations. These results are summarized as
PVorig in Table 17. By computing the quotient of the prediction variances resulting from
the “best” and “original” designs, we establish a relative efficiency (RE),
    RE = PVbest / PVorig,    (5.5)
where PV is the prediction variance evaluated at the use condition level of 30◦C / 25% RH.
The relative efficiency results of this analysis are summarized in the last column of Table
17.
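The RE column follows directly from Equation (5.5); recomputing from the rounded PV values in Table 17 reproduces it to within roughly a tenth of a percent (small differences arise because the table's RE values were computed from unrounded prediction variances):

```python
# (PV_best, PV_orig) pairs for runs 1-8 of Table 17
pv = [(2.99, 3.84), (3.38, 4.00), (7.07, 7.43), (9.17, 11.07),
      (4.48, 4.56), (4.82, 4.92), (7.47, 9.96), (8.29, 11.61)]

re = [round(100 * best / orig, 1) for best, orig in pv]
print(re)  # recovers Table 17's RE column to within a tenth of a percent
```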
Examining Table 17 further, we observe that the first row represents a condition
where all coefficients are at the low (–) level. Under this condition, the original design yields
a prediction variance of 3.84 as compared to the best achievable value of 2.99. Applying
Equation (5.5), this yields a relative efficiency score of 78%. Similar studies were done for
the other test conditions. We observe efficiencies to range from 71% to 98% and categorize
them into three groups. The first group contains four design combinations (runs 1, 2, 5,
and 6 from Table 17) that achieve prediction variance values better than expected with the
original design. The second group contains two design combinations (runs 3 and 4) whose
best achievable prediction variance levels (PVbest) are larger, but retain a high degree of
relative efficiency (83-95%). The third group contains two design combinations (runs 7 and
8) that have both an increase in the best achievable prediction variance (from 5.27 to 7.47
and 8.29, respectively) coupled with an additional increase in the prediction variance due to
design inefficiency (71-75%). This further raises the observed prediction variance to 9.96 and
11.61, respectively. Perhaps more important to the practitioner is the absolute magnitude
of the prediction variance. We observe the original design yields values ranging from 3.84 to
11.61. Given the expected prediction variance level of 5.27, we observe confidence interval
lengths of 85% (√(3.84/5.27)) to 148% (√(11.61/5.27)) of those initially expected.
3.2. Design construction sensitivity
Examining the results of Table 17, we select two conditions with low relative effi-
ciency scores for comparison. Condition (+++) is the worst performing condition and thus
is selected. We also select condition (– – –) based on its low relative efficiency score. Note
we opted to select the (– – –) location over the (++–) location, which has a lower relative
efficiency score, since the former design represents the design combination furthest from
(and most different than) the (+++) location. Fig. 12 compares these two designs to the
original design (000).
To facilitate the comparison, we have placed and sized unshaded circular markers to
depict the original design. In the top panel, we observe that the (+++) design has nearly
identical test point locations. The only significant departure between the
two designs is the location of the lowest stress corner, which has shifted from (0.78, 0.81) to
(0.74, 0.95) for the (+++) design. The allocation of the samples between the two designs
is also quite similar. Examining the lower panel of Fig. 12, we compare the (– – –) and
(000) designs. For the (– – –) design, we observe a shift in the lowest stress level condition
and the (0,1) stress condition to higher humidity levels. The lowest stress condition shifts
from (0.78, 0.81) to (0.86, 0.64) which maintains a near constant failure rate level while
the (0,1) stress condition shifts to (0, 0.94). The allocation of samples to the lowest stress
condition was observed to grow from 53 to 63 samples, with fewer samples allocated to each
of the remaining locations.
Fig. 12: Comparing U-optimal designs for (+++) and (– – –) levels
From this investigation, we find that aside from changes in the sample allocation
given to the lowest stress level condition, the overall U-optimal design changes very little
across the extreme regions of the model parameter ranges shown here.
3.3. Prediction variance sensitivity for a single acceleration factor
With the discussion on the construction sensitivity complete, we now evaluate its
impact on prediction variance. In Fig. 13, we illustrate the sensitivity of the prediction
Fig. 13: Sensitivity plot of activation energy and prediction variance
variance across a range of activation energy values corresponding to the low (–) and high (+)
levels. This plot represents the prediction variance of the original design evaluated under
various mis-specification scenarios. We first draw your attention to the square at (0.75,
5.272). This is the U-optimal result for the original design when correctly specified. Next,
we highlight the three groups of lines labeled A, B, and C. These groups are established
according to the level of their moisture acceleration exponent. Group A represents c at the
high (+) level, Group B represents c at the nominal (0) level, and Group C represents c at
the low (–) level. Within each group are three lines, which represent the high (+), nominal
(0), and low (–) levels for the interaction term. From Fig. 13, we observe the following
about each group of lines.
• Group A represents the most sensitive set of conditions. Values of the prediction
variance are higher than expected and vary widely within the group.
• Group B represents only mild mis-specification. Prediction variance values observed
within this group are both higher and lower than expected, but the deviations from
the nominal level (square marker in Fig. 13) are small.
• Group C represents the least sensitive conditions. Although mis-specified, the predic-
tion variance values are all better than expected.
Finally, we conclude this discussion by noting the unequal effect of changing the
activation energy (Q) and moisture exponent (c) by ±20%. For c we have a monotonic
increase in prediction variance as we go from the (–) to (0) to (+) levels. On the other
hand, for Q we have non-monotonic changes as shown in Group A.
3.4. Prediction variance sensitivity for multiple acceleration factors
Having completed a univariate look at the sensitivity for a given x1 level, we now
take a more geometric look at the model parameter sensitivity. Recall the 2³ full factorial
design results shown in Table 17. In this table, the column labeled PVbest is the prediction
variance resulting from a U-optimal design with perfect knowledge of that specific set of
coefficients. The PVorig column represents the prediction variance resulting from the use of
the U-optimal design associated with the mis-specified (original) values of the coefficients.
The results of the factorial and center point (original) design are graphically depicted in
Fig. 14. In this figure we illustrate four key features: the U-optimal test point locations
and their sample allocation, the relative efficiency (RE) of the U-optimal design based
on the practitioner’s assumptions, and the theoretical and actual use condition prediction
variances, respectively.
Fig. 14: Sensitivity of U-optimal designs to model mis-specification
The first feature is the U-optimal design test points and their corresponding allocation
for each of the eight parameter value combinations considered. This is shown with
different sized circles in a unit square feasible design region for each location. Note that the
original design is also included in this figure and is denoted with a darker rectangular outline.
Recall that the lower left corner of each feasible region is the origin (0,0) and represents
the highest stress level of 110°C / 90% RH. The upper right corner of each feasible design
region is the minimum stress level constraint established at (1,1), which corresponds
to 60°C / 60% RH as discussed in Chapter 4. For clarity, we designate model coefficient
levels with + and – sign combinations, while specific stress test condition levels are specified
as coordinates within the unit square. With this in mind, we return to the discussion
associated with the lower left corner of the overall factorial design (i.e., the cube outlined
with the dashed lines). Within the feasible region, circular markers indicate the locations
of the U-optimal design levels for that specific set of coefficients, and the allocation of the
test units is proportional to the size of each marker. For this design location, we observe that 44
units are allocated to the coordinate location (0.94, 1.0), 21 units to (1,0), 18
units to (0,1), and 17 units to (0,0). The second feature of this
figure is the relative efficiency score as previously defined in Equation (5.5). This is shown
as the top number (a percentage) within the given design region. For the same lower left
design we note a relative efficiency score of 78%. The third feature of this figure is the
theoretically best prediction variance that can be achieved given the true model parameter
levels. Pragmatically, this can never be achieved; however, it provides a useful benchmark
for comparing designs. Its value is listed as the middle number within the feasible design
region square and is 2.99 for the lower left combination of model parameters. Finally, the
fourth feature is the prediction variance expected given the model parameter assumptions
made by the practitioner; this is 3.84 for the same lower left combination.
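The coded coordinates above can be translated back into physical stress levels. The sketch below assumes, purely for illustration, a linear map between the corners of the feasible region; the actual coding in Chapter 4 may be in transformed stress (e.g. reciprocal absolute temperature), so treat `decode` as a hypothetical decoding, not the dissertation's exact transformation.

```python
# Illustrative decoding of coded design coordinates into physical stress
# levels, assuming a simple linear map between the corners of the region:
# (0, 0) = 110 C / 90% RH (highest stress) and (1, 1) = 60 C / 60% RH
# (minimum-stress constraint). This is an assumption for the sketch only.

HIGH_STRESS = (110.0, 90.0)  # temperature (C), relative humidity (%) at (0, 0)
LOW_STRESS = (60.0, 60.0)    # temperature (C), relative humidity (%) at (1, 1)

def decode(x1: float, x2: float) -> tuple[float, float]:
    """Map coded (x1, x2) in the unit square to (temperature, humidity)."""
    temp = HIGH_STRESS[0] + x1 * (LOW_STRESS[0] - HIGH_STRESS[0])
    rh = HIGH_STRESS[1] + x2 * (LOW_STRESS[1] - HIGH_STRESS[1])
    return temp, rh

# The four support points of the lower-left design and their allocations:
design = {(0.94, 1.0): 44, (1.0, 0.0): 21, (0.0, 1.0): 18, (0.0, 0.0): 17}
for (x1, x2), n in design.items():
    temp, rh = decode(x1, x2)
    print(f"coded ({x1}, {x2}): {n} units at {temp:.0f} C / {rh:.0f}% RH")
```

Under this assumed linear map, the heavily loaded point (0.94, 1.0) sits just inside the minimum-stress corner of the region.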
Continuing our examination of Fig. 14, we note the following three observations
about the optimal location and allocation.
• The highest stress level condition at (0,0) is present in all designs. It receives the
smallest proportion of samples of all design points due to its high failure rate.
• The lowest stress level condition consistently has the highest allocation of points. It
receives the largest proportion of samples of all design points due to its low failure
rate. We also note that the lowest stress condition does not always reside at (1,1).
• The off-diagonal stress levels are generally located at the corners and receive
approximately equal allocations.
We also note the following four observations on the level of relative efficiency (RE)
and the overall prediction variance.
• Two-factor U-optimal designs are reasonably insensitive to mis-specification if one of
the two factor estimates is close to its true value. Examining Fig. 13, these designs are
denoted with parameter levels of (+ 0), (– 0), (0 +), and (0 –), respectively. Relative
efficiency values for these cases averaged 93%.
• Two-factor U-optimal designs are also insensitive to mis-specification if the coefficients
are mis-specified in opposite directions. These designs are denoted with parameter
levels of (+ –) and (– +). Relative efficiency values for these cases averaged 94%.
• Larger degrees of sensitivity exist for models having both coefficients
mis-specified in the same direction. We denote these designs as (– –) and (++)
and note their average relative efficiency of 77%.
• The relative efficiency metric is necessary but not sufficient for identifying potentially
sensitive regions. Some design combinations with low efficiencies
had prediction variance levels lower than expected. Consider the (– – –) design in Fig.
14: its relative efficiency is 78%, but the prediction variance is lower than expected
with the original design (2.99 vs. 5.27). However, the most sensitive and concerning
mis-specification situations do have low relative efficiency scores; the (+++) and
(– ++) levels are examples. Both of these situations would concern a practitioner,
as the best possible prediction variance levels are larger (8.29 and 9.17 compared
to 5.27) due to the combination of model parameters, and are further inflated
(8.29 → 11.61, 9.17 → 11.07) by the design inefficiency values of 71% and 83%,
respectively.
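The relative-efficiency comparisons above can be reproduced numerically. A minimal sketch follows, assuming Equation (5.5) takes the usual ratio form RE = PVbest / PVorig, where PVbest is the prediction variance of the U-optimal design built with the true coefficients and PVorig that of the design built from the practitioner's mis-specified assumptions; the numeric pairs are the values quoted in the text.

```python
# Relative efficiency of a design under mis-specification, assuming the
# ratio form RE = PV_best / PV_orig (our reading of Equation (5.5)).
# An RE of 1.0 means the mis-specified design loses nothing.

def relative_efficiency(pv_best: float, pv_orig: float) -> float:
    return pv_best / pv_orig

# Prediction-variance pairs quoted in the text for three coefficient combinations:
cases = {
    "(- - -)": (2.99, 3.84),
    "(+++)": (8.29, 11.61),
    "(- ++)": (9.17, 11.07),
}
for label, (pv_best, pv_orig) in cases.items():
    re = relative_efficiency(pv_best, pv_orig)
    print(f"{label}: RE = {re:.0%}  (PV_best = {pv_best}, PV_orig = {pv_orig})")
# -> 78%, 71%, and 83%, matching the efficiencies cited above
```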
4. Mitigation schemes using controllable design factors
So far we have focused the discussion on the uncontrollable variables (e.g. the
parameter coefficients) involved in a reliability test plan. In this section, we examine factors
that are controllable by the practitioner, namely the total sample size and censoring time.
Although cost and time-to-market constraints are likely to exist, a practitioner is well served
in understanding the incremental gains in information (reduction in the use condition
prediction variance) that are associated with increasing or decreasing these controllable
factors. The remainder of this section provides an approach for mitigating uncertainties
in the model parameter estimates. For this study, we assume the censoring time and sample
size are restricted to at most 300 hours of testing and 300 test samples, respectively.
Recall from the previous section that the (+++) combination of levels poses the
largest concern for a practitioner. The prediction variance at the use condition is larger due
to the combination of model parameters as well as the low relative efficiency of the original
design. The prediction variance for the U-optimal design under this combination of levels
is 11.61, more than double the expected value of 5.27. This increase in prediction variance
is likely to inhibit meaningful conclusions from being drawn. Thus, we seek to provide
recommendations for mitigating this problem through adjustments to censoring time and
sample size.
As part of the original design, we selected a total sample size of 100 test specimens
and suspended the test after 100 hours. At this point, it is desirable to know whether an
increase in the sample size, an extension of the censoring time, or some combination of both
can reach the prediction variance level of 5.27 associated with the original
design. To facilitate this decision, we compute the prediction variance associated with the
original design at the (+++) level for different combinations of censoring time and sample
size. A contour plot of the results is shown in Fig. 15.
Fig. 15: Prediction variance contour plot for censoring time and sample size
We observe that the existing design resides at a censoring time of 100 hours and a
sample size of 100 units; it is denoted with a square marker at a prediction variance value
of 11.61. Extending horizontally along a line of constant sample size, we observe that the
censoring time required to reduce the prediction variance to 5.27 exceeds 300 hours and thus is
not a viable option. Extending vertically along a line of constant censoring time, we observe
that increasing the sample size to 225 units reduces the use condition prediction variance to
the desired level. This option is denoted in Fig. 15 with a circular marker. There are also
many additional combinations involving an increase in both censoring time and
sample size. In particular, we highlight a solution that is approximately perpendicular to
the gradient of the contours. This combination offers a more balanced solution; it lies at
a censoring time of 160 hours and a sample size of 160 units and is denoted with a triangular
marker. Finally, combinations are possible for censoring times as low
as 75 hours, provided that the sample size can be increased to 300. Thus, this two-staged
approach, which first determines the model parameter sensitivity and then mitigates any
unacceptably large risk regions by adjusting the censoring time and/or sample size, allows
a practitioner to increase the robustness of a reliability test plan.
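The trade-off search behind Fig. 15 can be sketched as a simple grid search. The true prediction variance comes from the censored-lifetime information matrix; the sketch below stands in a deliberately simple toy model in which PV is inversely proportional to the sample size times the probability a unit fails before censoring, with a hypothetical exponential mean life and a calibration to the original design's PV of 11.61. Both the failure model and the calibration are assumptions, not the dissertation's actual computation.

```python
import math

# Toy stand-in for the prediction variance at the use condition:
# PV(n, tc) = K / (n * p(tc)), where p(tc) = 1 - exp(-tc / THETA) is the
# chance a unit fails before the censoring time tc. THETA (a hypothetical
# mean life) and the calibration point are assumptions for this sketch.

THETA = 200.0  # hypothetical mean life (hours)

def p_fail(tc: float) -> float:
    return 1.0 - math.exp(-tc / THETA)

K = 11.61 * 100 * p_fail(100.0)  # calibrate to PV = 11.61 at (100 h, 100 units)

def pred_var(n: int, tc: float) -> float:
    return K / (n * p_fail(tc))

TARGET = 5.27  # the PV of the original design under correct parameters
feasible = [(tc, n)
            for tc in range(75, 301, 5)    # censoring times up to 300 h
            for n in range(100, 301, 5)    # sample sizes up to 300 units
            if pred_var(n, tc) <= TARGET]

# For selected censoring times, the smallest sample size meeting the target:
best_n = {}
for tc, n in feasible:
    best_n[tc] = min(n, best_n.get(tc, 301))
for tc in (100, 160, 300):
    print(f"tc = {tc:3d} h -> smallest n with PV <= {TARGET}: {best_n.get(tc)}")
```

Under these toy assumptions the search recovers options of the kind discussed above: roughly 225 units at the original 100-hour censoring time, and a balanced 160 h / 160 unit combination.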
5. Conclusions
This chapter highlights the considerations a practitioner faces in applying a U-optimal
design criterion. We examined the construction of the design and note that both the test
point levels and the relative allocation are largely insensitive to even large degrees of model
parameter uncertainty. We completed a similar investigation of the impact of model parameter
uncertainty on the use condition prediction variance. By linking model parameter uncertainty
to physical and chemical-kinetic parameter levels (e.g. Q and c), we show that U-optimal
designs provide stable results, even with mild levels of uncertainty in the model parameter
estimates. For the most extreme levels of mis-specification in the uncontrollable factors, we
offer an approach that uses controllable factors, such as censoring time and sample size, to
mitigate unacceptably large increases in the width of the prediction intervals.
CHAPTER 6
Conclusions and Future Work
1. Conclusions
This dissertation presents methods of selecting and constructing optimal experimental
designs for the unique challenges associated with accelerated life testing in reliability.
These include restricted design regions, non-normal response variables, and censoring, which
causes scarce and/or imbalanced experimental outcomes. The main contributions of this
dissertation are the application and development of optimality criteria for minimizing
either the variance of the model parameter estimates or the prediction variance at an
extrapolated use condition. Using several case studies, we show that the application of
statistical techniques associated with design of experiments provides significant improvements
over traditional accelerated life testing approaches.
In Chapter 3, we provide a criterion for effectively planning accelerated life tests for
thermo-mechanical testing. Comparisons are drawn between the influence of experimental
design selection and traditional test plan variables such as sample size, censoring scheme,
and sample allocation. Using a D-optimality criterion, we improve parameter estimate
precision three-fold and reduce sample size requirements eight-fold.
In Chapter 4, we develop an alternate way of computing the information matrix used
during the planning stage of an accelerated life testing experiment. A generalized
linear model approach is developed that allows optimal designs to be computed using
iteratively weighted least squares solutions rather than maximum likelihood. Optimality criteria are
discussed for parameter estimation and for prediction variance at an intended ambient usage
condition, subject to a constrained accelerated test region. Through this generalized approach,
we enable practitioners to extend accelerated life models beyond two factors. In addition,
we provide several useful guidelines and graphical techniques for reliability practitioners.
Important guidelines found in this study include the optimal number of support points
for a two-factor model with an interaction term (four) and strategies for extrapolating design
construction results from smaller designs (fewer than 80 samples). Graphical interpretation
is enhanced through the use of an alternate orientation (rotated and scaled) for showing
experimental design points and use conditions, the designation of the sample proportion
at a given test level condition via marker size, and the integration of prediction variance
contours.
In Chapter 5, we conclude with a sensitivity and robustness study of U-optimal
designs. We assess the impact of model parameter uncertainty on both the design construction
and the resulting prediction variance. Design behavior is quantified through a relative
efficiency measure. Mitigation strategies with controllable factors are demonstrated that
prevent unacceptably large increases in the width of the prediction intervals when extreme
levels of mis-specification exist in the uncontrollable factors.
2. Future work
The research in this dissertation has highlighted the complex trade-off decisions
associated with accelerated life testing plans. Practitioners continue to face increased time
and cost pressures in our global marketplace. Some extensions of this work include:
• using split-plot designs to address noise factors and spatial variations that exist
in temperature cycle chambers;
• deriving the U-optimal solution for other distributions, such as the Weibull;
• deriving the U-optimal criterion for differing censoring times per stress condition
(Type II censoring);
• determining the number of distinct support points associated with optimal designs
when acceleration factors exceed two and model complexity increases;
• establishing a set of guidelines for locating and allocating experimental design
conditions based on the distance of the use condition from the origin (high stress condition),
lifetime model coefficients, and censoring time;
• incorporating Bayesian methods with a U-optimality criterion using the quadrature
methods described by Gotwalt, Jones, and Steinberg (2009);
• constructing hybrid U-optimal designs that augment a design for multi-objective
robustness; and
• developing design efficiency measures for use conditions that are random variables, as
described in Monroe and Pan (2009).
REFERENCES
Agboto, V. and Nachtsheim, C. (2005) Bayesian model robust designs. Unpublished, http://www.csom.umn.edu/Assets/32654.pdf.
Aitkin, M. and Clayton, D. (1980) The fitting of exponential, Weibull and extreme value distributions to complex censored survival data using GLIM. Applied Statistics, 29, 156 – 163.
Aldrich, J. (1997) R.A. Fisher and the making of maximum likelihood 1912-1922. Statistical Science, 12(3), 162 – 176.
Atkinson, A. (1972) Planning experiments to detect inadequate regression models. Biometrika, 59, 275 – 293.
Atkinson, A. (2006) Generalized Linear Models and Response Transformations, World Scientific Publishing, New York.
Atkinson, A. and Donev, A. (1989) The construction of exact D-optimum experimental designs with application to blocking response surface designs. Biometrika, 76, 515 – 526.
Back, T. (1996) Evolutionary Algorithms in Theory and Practice, Oxford University Press, Oxford.
Bai, D., Cha, M., and Chung, S. (1992) Optimum simple ramp tests for the Weibull distribution and Type-I censoring. IEEE Transactions on Reliability, 41, 407 – 413.
Bai, D. and Chun, Y. (1991) Optimum simple step-stress accelerated life-tests with competing causes of failure. IEEE Transactions on Reliability, 40, 622 – 627.
Bai, D., Chun, Y., and Cha, M. (1997) Time-censored ramp tests with stress bound for Weibull life distribution. IEEE Transactions on Reliability, 46, 99 – 107.
Bai, D., Chun, Y., and Kim, J. (1995) Failure-censored accelerated life test sampling plans for Weibull distribution under expected test time constraint. Reliability Engineering and System Safety, 50, 61 – 68.
Bai, D., Kim, M., and Chun, Y. (1993) Design of failure-censored accelerated life-test sampling plans for lognormal and Weibull distributions. Engineering Optimization, 21, 187 – 212.
Bai, D., Kim, M., and Lee, S. (1989) Optimum simple step-stress accelerated life tests with censoring. IEEE Transactions on Reliability, 38, 528 – 532.
Bai, D. and Yun, H. (1996) Accelerated life tests for products of unequal size. IEEE Transactions on Reliability, 45, 611 – 618.
Barbosa, E. P. and Louzada-Neto, F. (1994) Analysis of accelerated life tests with Weibull failure distribution via generalized linear models. Communications in Statistics - Simulation and Computation, 23(2), 455 – 465.
Bingham, D. and Li, W. (2002) A class of optimal robust parameter designs. Journal of Quality Technology, 34(3), 244 – 259.
Blish, R., Huber, S., and Durrant, N. (1999) Use condition based reliability evaluation of new semiconductor technologies. Sematech Technology Transfer, August 31, 1999.
Borkowski, J. (2003) Using a genetic algorithm to generate small exact response surface designs. Journal of Probability and Statistical Science, 1, 65 – 88.
Box, G. and Behnken, D. (1960) Some new three level designs for the study of quantitative variables. Technometrics, 2, 455 – 475.
Box, G. and Draper, N. (1975) Robust designs. Biometrika, 62, 347 – 352.
Box, G. (1982) Choice of response surface design and alphabetic optimality. Utilitas Mathematica, 21B, 11 – 55.
Box, G. and Hill, W. (1967) Discrimination among mechanistic models. Technometrics, 9, 57 – 71.
Box, G. and Draper, N. (1959) A basis for the selection of a response surface design. Journal of the American Statistical Association, 54, 622 – 653.
Box, G. and Wilson, K. (1951) On the experimental attainment of optimum conditions. Journal of the Royal Statistical Society, 13, 1 – 45.
Box, M. and Draper, N. (1971) Factorial designs, the |X′X| criterion, and some related matters. Technometrics, 13, 731 – 742.
Chaloner, K. and Larntz, K. (1992) Bayesian design for accelerated life testing. Journal of Statistical Planning and Inference, 33, 245 – 259.
Chaloner, K. and Verdinelli, I. (1995) Bayesian experimental design: A review. Statistical Science, 10, 273 – 304.
Cheng, C., Steinberg, D., and Sun, D. (1999) Minimum aberration and model robustness for two-level fractional factorial designs. Journal of the Royal Statistical Society - B, 61, 85 – 93.
Chernoff, H. (1962) Optimal accelerated life designs for estimation. Technometrics, 4, 381 – 408.
Chipman, H. and Welch, W. (1996) D-optimal design for generalized linear models. Unpublished, http://math.acadiau.ca/chipmanh/publications.html.
Clech, J. (2005) Lead free solder joint reliability trends. Chapter 4 of Lead Free Solder Interconnect Reliability, The Materials Information Society, ASM International, Materials Park, Ohio.
Coffin, J. (1954) A study of the effects of cyclic thermal stresses on a ductile material. Transactions ASME, 76, 931 – 950.
Cook, R. and Nachtsheim, C. (1980) A comparison of algorithms for constructing D-optimal designs. Technometrics, 22, 315 – 324.
Cook, R. and Nachtsheim, C. (1982) Model robust, linear-optimal designs. Technometrics, 24, 49 – 54.
Darveaux, R. (1997) Solder joint fatigue life model. Proceedings of the Minerals, Metals, and Materials Society (TMS), Orlando, FL, 213 – 218.
DeGroot, M. and Goel, P. (1979) Bayesian estimation and optimal design in partially accelerated life testing. Naval Research Logistics, 26, 223 – 235.
Del Castillo, E., Alvarez, M., Ilzarbe, L., and Viles, E. (2007) A new design criterion for robust parameter estimates. Journal of Quality Technology, 39(3), 279 – 295.
Dobson, A. (2002) An Introduction to Generalized Linear Models, Chapman & Hall, 2nd edition, Boca Raton, FL.
Drain, D., Carlyle, W., Montgomery, D., Borror, C., and Anderson-Cook, C. (2004) A genetic algorithm hybrid for constructing optimal response surface designs. Quality and Reliability Engineering International, 20(7), 637 – 650.
Draper, N. (1982) Center points in second-order response surface designs. Technometrics, 24, 127 – 133.
DuMouchel, W. and Jones, B. (1994) A simple Bayesian modification of D-optimal designs to reduce dependence on an assumed model. Technometrics, 36(1), 37 – 47.
Engelmaier, W. (1984) Functional cycles and surface mounting attachment reliability. ISHM Technical Monograph Series 6984-002, International Society for Hybrid Microelectronics, Silver Springs, MD, 87 – 114.
Erkanli, A. and Soyer, R. (2000) Simulation-based designs for accelerated life tests. Journal of Statistical Planning and Inference, 90, 335 – 348.
Escobar, L. and Meeker, W. (1995) Planning accelerated life tests with two or more experimental factors. Technometrics, 37, 411 – 427.
Eyring, H. (1941) The activated complex in chemical reactions. Journal of Chemical Physics, 3(2), 107 – 115.
Federov, V. (1972) Theory of Optimal Experiments, Academic Press, New York.
Fisher, R. (1912) On an absolute criterion for fitting frequency curves. Messenger of Mathematics, 41, 155 – 160.
Fisher, R. (1926) The arrangement of field experiments. Journal of the Ministry of Agriculture of Great Britain, 33, 503 – 513.
Galil, Z. and Kiefer, J. (1980b) Time- and space-saving computer methods, related to Mitchell's DETMAX, for finding D-optimum designs. Technometrics, 21, 301 – 313.
Ginebra, J. and Sen, A. (1998) Minimax approach to accelerated life tests. IEEE Transactions on Reliability, 47(3), 261 – 267.
Giovannitti-Jensen, A. and Myers, R. (1989) Graphical assessment of the prediction capability of response surface designs. Technometrics, 31, 159 – 171.
Goldfarb, H., Anderson-Cook, C., Borror, C., and Montgomery, D. (2004) Fraction of design space plots for assessing mixture and mixture-process designs. Journal of Quality Technology, 36(2), 169 – 179.
Gotwalt, C., Jones, B., and Steinberg, D. (2009) Fast computation of designs robust to parameter uncertainty for nonlinear settings. Technometrics, 51(1), 88 – 95.
Guo, H. and Pan, R. (2007) D-optimal reliability test design for two-stress accelerated life tests. The Proceedings of IEEE International Conference on Industrial Engineering and Engineering Management, 1236 – 1240.
Haines, L. (1987) The application of the annealing algorithm to the construction of exact optimal designs for linear-regression models. Technometrics, 29(4), 439 – 447.
Hamada, M., Martz, H., Reese, C., and Wilson, A. (2001) Finding near-optimal Bayesian experimental designs via genetic algorithms. The American Statistician, 55, 175 – 181.
Heredia-Langner, A., Carlyle, W., Montgomery, D., Borror, C., and Runger, G. (2003) Genetic algorithms for the construction of D-optimal designs. Journal of Quality Technology, 35, 28 – 46.
Heredia-Langner, A., Montgomery, D., Carlyle, W., and Borror, C. (2004) Model-robust optimal designs: A genetic algorithm approach. Journal of Quality Technology, 36(3), 263 – 279.
Hill, W., Hunter, W., and Wichern, D. (1968) A joint design criterion for the dual problem of model discrimination and parameter estimation. Technometrics, 10, 145 – 160.
Huber, P. (1975) A Survey of Statistical Design and Linear Models, Srivastava, J. (ed), North-Holland, New York, 287 – 301.
Hunter, W. and Reiner, A. (1965) Designs for discriminating between two rival models. Technometrics, 7, 307 – 323.
JEDEC (2004) JEDEC Standard JESD22-A105C for power and temperature cycling. JEDEC Solid State Technology Association, http://www.jedec.org.
JEDEC (2005) JEDEC Standard JESD22-A104C for temperature cycling. JEDEC Solid State Technology Association, http://www.jedec.org.
Jiao, L. (2001) Optimal allocations of stress levels and test units in accelerated life tests. Ph.D. dissertation, Rutgers University, Dept. of Industrial and Systems Engineering, Newark, NJ.
Johnson, M. and Nachtsheim, C. (1983) Some guidelines for constructing exact D-optimal designs on convex design spaces. Technometrics, 25, 271 – 277.
Jones, B., Lin, D., and Nachtsheim, C. (2008) Bayesian D-optimal supersaturated designs. Journal of Statistical Planning and Inference, 138(1), 86 – 92.
Jones, B., Li, W., Nachtsheim, C., and Ye, K. (2009) Model discrimination - another perspective on model-robust designs. Journal of Statistical Planning and Inference, 139(1), 45 – 53.
Joseph, V. and Wu, C. (2004) Failure amplification method: an information maximization approach to categorical response optimization. Technometrics, 46(1), 1 – 24.
Khuri, A. and Mukhopadhyay, S. (2006) GLM designs: the dependence on unknown parameters dilemma. Response Surface Methodology and Related Topics, World Scientific Publishing, 203 – 223.
Kiefer, J. (1959) Optimum experimental designs. Journal of the Royal Statistical Society - B, 21, 272 – 304.
Kielpinski, T. and Nelson, W. (1975) Optimum censored accelerated life tests for normal and lognormal distributions. IEEE Transactions on Reliability, 24, 310 – 320.
Kim, C. and Bai, D. (1999) Design of step-stress accelerated life tests for Weibull distributions with a nonconstant shape parameter. Journal of the Korean Statistical Society, 28, 415 – 433.
Kussmaul, K. (1969) Protection against assuming the wrong degree in polynomial regression. Technometrics, 11, 677 – 682.
Lauter, E. (1974) Experimental design in a class of models. Mathematische Operationsforschung und Statistik, 5, 379 – 398.
Lawless, J. (2003) Statistical Models and Methods for Lifetime Data, John Wiley & Sons, 2nd edition, Hoboken.
Li, W. and Nachtsheim, C. (2000) Model-robust factorial designs. Technometrics, 42, 345 – 352.
Liao, H. and Elsayed, E. (2006) Reliability inference for field conditions from accelerated degradation testing. Naval Research Logistics, 53(6), 576 – 587.
Little, J., Murty, K., Sweeney, D., and Karel, C. (1963) An algorithm for the traveling salesman problem. Operations Research, 11, 972 – 989.
Lindley, D. (1956) On the measure of information provided by an experiment. The Annals of Mathematical Statistics, 27(4), 986 – 1005.
Mann, N. (1972) Design of over-stress life-test experiments when failure times have a two-parameter Weibull distribution. Technometrics, 14, 437 – 451.
Manson, S. (1953) Behavior of materials under conditions of thermal stress. NACA-TN-2933, NASA Lewis Research Center, Cleveland, OH.
McCullagh, P. and Nelder, J. A. (1989) Generalized Linear Models, Chapman and Hall, 2nd edition, New York.
Meeker, W. and Escobar, L. (1998) Statistical Methods for Reliability Data, John Wiley & Sons, New York.
Meeker, W. and Hahn, G. (1977) Asymptotically optimum over-stress tests to estimate the survival probability at a condition with a low expected failure probability. Technometrics, 19(4), 381 – 399.
Meeker, W. and Nelson, W. (1975) Optimum accelerated life tests for Weibull and extreme value distributions and censored data. IEEE Transactions on Reliability, 24, 321 – 332.
Mencinger, N. (2000) A mechanism-based methodology for processor package reliability assessments. Assembly Technology Development Quality and Reliability, Intel Corp., http://download.intel.com/technology/itj/q32000/pdf/reliability.pdf.
Meyer, R. and Nachtsheim, C. (1995) The coordinate-exchange algorithm for constructing exact optimal experimental designs. Technometrics, 37(1), 60 – 69.
MIL-HDBK-217F (1991) Reliability prediction for electronic equipment. Naval Publications and Forms Center, F edition, 5801 Tabor Ave, Philadelphia, PA 19120.
Mitchell, T. (1974) An algorithm for the construction of D-optimal experimental designs. Technometrics, 20, 203 – 210.
Monroe, E. and Pan, R. (2008) Experimental designs for accelerated life tests with non-linear constraints and censoring. Journal of Quality Technology, 40(4), 355 – 367.
Monroe, E. and Pan, R. (2009) Knowledge-based reliability assessments for time-varying climates. Quality and Reliability Engineering International, 25, 111 – 124.
Monroe, E., Pan, R., Anderson-Cook, C., Montgomery, D., and Borror, C. (2009) A generalized linear model approach to designing accelerated life test experiments. Los Alamos Technical Report LA-UR 09-02209.
Myers, R., Montgomery, D., and Anderson-Cook, C. (2009) Response Surface Methodology: Process and Product Optimization using Designed Experiments, John Wiley & Sons, 3rd edition, New York.
Myers, R., Montgomery, D., and Vining, G. (2002) Generalized Linear Models: With Applications in Engineering and the Sciences, John Wiley & Sons, New York.
Nelder, J. and Wedderburn, R. (1972) Generalized linear models. Journal of the Royal Statistical Society - A, 135, 370 – 384.
Nelson, W. (1990) Accelerated Testing - Statistical Models, Test Plans, and Data Analysis, John Wiley & Sons, New York.
Nelson, W. (2005a) A bibliography of accelerated test plans. IEEE Transactions on Reliability, 54(2), 194 – 197.
Nelson, W. (2005b) A bibliography of accelerated test plans Part II - references. IEEE Transactions on Reliability, 54(3), 370 – 372.
Nelson, W. and Kielpinski, T. (1976) Theory for optimum censored accelerated life tests for normal and lognormal life distributions. Technometrics, 18, 105 – 114.
Nelson, W. and Meeker, W. (1978) Theory for optimum censored accelerated life tests for Weibull and extreme value distributions. Technometrics, 20(2), 171 – 177.
Ng, H. K. T., Balakrishnan, N., and Chan, P. S. (2007) Optimal sample size allocation for tests with multiple levels of stress with extreme value regression. Naval Research Logistics, 54(3), 237 – 249.
Norris, K. and Landzberg, A. (1969) Reliability of controlled collapse interconnections. IBM Journal of Research and Development, May 1969, 266 – 271.
O'Hagan, A. (1978) Curve fitting and optimal design for prediction. Journal of the Royal Statistical Society - B, 40, 1 – 41.
Onar, A. and Padgett, W. (2003) A penalized local D-optimality approach to design for accelerated test models. Journal of Statistical Planning and Inference, 119, 411 – 420.
Ozol-Godfrey, A., Anderson-Cook, C., and Robinson, T. (2007) Fraction of design space plots for generalized linear models. Journal of Statistical Planning and Inference, 138, 203 – 219.
Pan, N., Henshall, G., Dai, S., Strum, M. J., Lewis, R., Benedetto, E., and Rayner, J. (2005) An acceleration model for Sn-Ag-Cu solder joint reliability under various thermal cycle conditions. Surface Mount Technology Association International (SMTA).
Park, J. and Yum, B. (1996) Optimal design of accelerated life tests with two stresses. Naval Research Logistics, 43, 863 – 884.
Parkinson, D. (2000) Robust design employing a genetic algorithm. Quality and Reliability Engineering International, 16, 201 – 208.
Pascual, F. and Montepiedra, G. (2002) On minimax designs when there are two candidate models. Journal of Statistical Computation and Simulation, 72, 841 – 862.
Polson, N. (1993) Advances in Reliability, Basu, A. (ed), Elsevier, New York, 321 – 330.
Sahasrabudhe, S., Monroe, E., Tandon, S., and Patel, M. (2003) Understanding the effect of dwell time on fatigue life of packages using thermal shock and intrinsic material behavior. The Proceedings of Electronic Components and Technology Conference, 53, 898 – 904.
SAS (2008) SAS/STAT 9.2 User's Guide: Language Reference, SAS Institute, Cary, NC, http://support.sas.com/documentation/.
Shirley, C., Ed. (1994) THB reliability models and life prediction for intermittently-powered non-hermetic components. International Reliability Physics Symposium, 32, San Jose, CA, 72 – 78.
Sitter, R. and Torsney, B. (1995) Optimal designs for binary response experiments with two design variables. Statistica Sinica, 5, 405 – 419.
Smith, K. (1918) On the standard deviations of adjusted and interpolated values of an observed polynomial function and its constants and the guidance they give towards a proper choice of the distribution of observations. Biometrika, 12, 1 – 85.
Steinberg, D. and Hunter, W. (1984) Experimental design: review and comment. Technometrics, 26(2), 71 – 97.
Syed, A. and Amkor (2001) Predicting solder joint reliability for thermal, power, and bend cycle within 25% accuracy. IEEE 51st Electronic Components and Technology Conference, 255 – 263.
Talbi, E. (2002) A taxonomy of hybrid metaheuristics. Journal of Heuristics, 8, 541 – 564.
Tang, L., Goh, T., and Ong, H. (1999a) Planning accelerated life tests for censored two-parameter exponential distributions. Naval Research Logistics, 46(2), 169 – 186.
Tang, L. and Yang, G. (2002) Planning multiple-levels constant-stress accelerated life tests. Proceedings 2002 Annual Reliability and Maintainability Symposium, 338 – 342.
Tobias, P. and Trindade, D. (1995) Applied Reliability, Chapman & Hall / CRC, 2nd edition, Boca Raton.
Tversky, A. and Kahneman, D. (1986) Rational choice and the framing of decisions. Journal of Business, 59(4), 251 – 278.
Verdinelli, I., Polson, N., and Singpurwalla, N. (1993) Reliability and Decision Making, Barlow, R., Clariotti, C., Spizzichino, F. (eds), Chapman & Hall, London, 247 – 256.
Viertl, R. (1988) Statistical Methods for Accelerated Life Testing, Vandenhoeck and Ruprecht, Göttingen.
Wang, W. and Kececioglu, D. (2000) Fitting the Weibull log-linear model to accelerated life-test data. IEEE Transactions on Reliability, 49(2), 217 – 223.
Welch, W. (1982) Branch-and-bound search for experimental designs based on D-optimality and other criteria. Technometrics, 24, 41 – 48.
Whitehead, J. (1980) Fitting Cox's regression model to survival data using GLIM. Applied Statistics, 29, 268 – 275.
Wynn, H. (1972) Results in the theory and construction of D-optimum experimental designs. Journal of the Royal Statistical Society - B, 34, 133 – 147.
Zahran, A., Anderson-Cook, C., and Myers, R. (2003) Fraction of design space to assess prediction capability of response surface designs. Journal of Quality Technology, 5, 377 – 386.
Zhang, Y. and Meeker, W. (2006) Bayesian methods for planning accelerated life tests. Technometrics, 48, 49 – 60.
Zhao, W. and Elsayed, E. (2005) Optimum accelerated life testing plans based on proportional mean residual life. Quality and Reliability Engineering International, 21, 701 – 713.
Table 18: Legacy design test conditions
condition   ∆T (°C)   tsoak (min)   Tmax (°C)   Tmin (°C)
L1          125       30            85          -40
L2          165       30            125         -40
L3          125       15            100         -25
L4          150       15            125         -25
Table 19: Orthogonal design test conditions
condition   ∆T (°C)   tsoak (min)   Tmax (°C)   Tmin (°C)
O1          155       15            125         -30
O2          120       15            100         -20
O3          155       60            100         -55
O4          120       60            125         5
Table 20: D-optimal design test conditions
condition   ∆T (°C)   tsoak (min)   Tmax (°C)   Tmin (°C)
D1          120       10            125         5
D2          180       60            125         -55
D3          93        60            98          5
D4          140       10            85          -55
Table 21: Parameter estimation results for the legacy design simulations
Censoring              Test units   a (stdev)      b (stdev)      c (stdev)
None (complete data)   200          1.836(0.255)   0.321(0.079)   0.115(0.042)
                       100          1.830(0.374)   0.326(0.113)   0.117(0.062)
                       50           1.891(0.579)   0.315(0.163)   0.107(0.091)
                       25           1.753(0.779)   0.364(0.222)   0.129(0.129)
Rt 60%                 200          1.820(0.334)   0.331(0.103)   0.121(0.056)
                       100          1.811(0.500)   0.328(0.147)   0.124(0.083)
                       50           1.822(0.705)   0.338(0.206)   0.122(0.114)
                       25           1.830(0.925)   0.324(0.289)   0.118(0.148)
Rt 30%                 200          1.785(0.501)   0.335(0.142)   0.124(0.081)
                       100          1.806(0.699)   0.314(0.212)   0.116(0.118)
                       50           1.824(1.010)   0.332(0.304)   0.129(0.174)
                       25           1.829(1.435)   0.309(0.463)   0.116(0.252)
Int 30%                200          1.827(0.530)   0.321(0.154)   0.111(0.088)
                       100          1.750(0.727)   0.338(0.223)   0.129(0.123)
                       50           1.803(1.227)   0.329(0.315)   0.121(0.186)
                       25           1.783(1.751)   0.304(0.438)   0.110(0.283)
Table 22: Parameter estimation results for the orthogonal design simulations
Censoring              Test units   a (stdev)      b (stdev)      c (stdev)
None (complete data)   200          1.807(0.129)   0.333(0.025)   0.122(0.020)
                       100          1.792(0.186)   0.332(0.037)   0.124(0.026)
                       50           1.836(0.259)   0.337(0.050)   0.122(0.037)
                       25           1.814(0.389)   0.329(0.074)   0.119(0.059)
Rt 60%                 200          1.800(0.180)   0.336(0.034)   0.120(0.026)
                       100          1.808(0.266)   0.336(0.045)   0.119(0.033)
                       50           1.782(0.357)   0.331(0.068)   0.123(0.049)
                       25           1.766(0.483)   0.336(0.100)   0.128(0.073)
Rt 30%                 200          1.811(0.250)   0.335(0.046)   0.125(0.034)
                       100          1.812(0.381)   0.333(0.064)   0.125(0.049)
                       50           1.774(0.598)   0.337(0.099)   0.113(0.073)
                       25           1.732(0.804)   0.317(0.138)   0.122(0.101)
Int 30%                200          1.764(0.262)   0.329(0.048)   0.131(0.032)
                       100          1.821(0.369)   0.328(0.071)   0.123(0.047)
                       50           1.800(0.599)   0.322(0.111)   0.127(0.066)
                       25           1.777(0.897)   0.341(0.142)   0.126(0.112)
Table 23: Parameter estimation results for the D-optimal design simulations
Censoring              Test units   a (stdev)      b (stdev)      c (stdev)
None (complete data)   200          1.809(0.073)   0.335(0.020)   0.122(0.013)
                       100          1.803(0.115)   0.332(0.030)   0.121(0.019)
                       50           1.793(0.159)   0.330(0.040)   0.122(0.030)
                       25           1.798(0.245)   0.335(0.055)   0.124(0.039)
Rt 60%                 200          1.798(0.100)   0.333(0.024)   0.119(0.017)
                       100          1.796(0.156)   0.330(0.038)   0.123(0.025)
                       50           1.794(0.213)   0.333(0.054)   0.122(0.038)
                       25           1.823(0.311)   0.341(0.080)   0.119(0.053)
Rt 30%                 200          1.800(0.146)   0.334(0.036)   0.122(0.026)
                       100          1.799(0.189)   0.336(0.054)   0.121(0.037)
                       50           1.813(0.289)   0.341(0.080)   0.124(0.053)
                       25           1.787(0.425)   0.335(0.118)   0.122(0.077)
Int 30%                200          1.768(0.150)   0.329(0.034)   0.119(0.026)
                       100          1.757(0.214)   0.317(0.053)   0.123(0.036)
                       50           1.783(0.314)   0.318(0.080)   0.119(0.054)
                       25           1.817(0.490)   0.323(0.114)   0.112(0.078)
Table 24: Parameter estimation results for a Type-I censored D-optimal design
Censoring   Test units   a (stdev)      b (stdev)      c (stdev)
Int 30%     200          1.799(0.160)   0.341(0.039)   0.118(0.026)
            100          1.797(0.219)   0.325(0.055)   0.136(0.047)
            50           1.844(0.320)   0.357(0.081)   0.131(0.054)
            25           1.883(0.441)   0.325(0.107)   0.109(0.072)
This section further details the derivation of the weight matrix and its relationship to
a reliability likelihood function involving censoring. Since we have assumed an exponential
distribution, the log mean can be written as

    log(µi) = log[Λ(ti)] + x′iβ
            = log[λ0 ti] + x′iβ
            = log(λ0) + log(ti) + x′iβ
            = β0 + x′iβ + log(ti).                                        (B.1)
Since µi = H(ti, xi) = Λ(ti) · e^(x′iβ), the expected value of the mean of the distribution is

    E(µi) = exp(β0 + x′iβ) · E(ti) = λ0 · e^(x′iβ) · E(ti)               (B.2)
where β0 = log(λ0). Since ti is not known beforehand, we replace it with its expectation.
Given that a priori knowledge does not exist as to whether a unit will be censored or
not, we partition the expectation function into two conditional cases proportional to their
probabilities of occurrence. This is expressed as
E(ti) = P (t < tc) · E(ti|t < tc) + P (t ≥ tc) · E(ti|t ≥ tc) (B.3)
where tc is the censoring time planned for the experiment. Next, the probability terms are
replaced with the survival function, S(ti, xi), and the failure function, F(ti, xi), as shown
in Equations (4.14) and (4.15). Integration by parts is used to solve the first expectation
term on the right-hand side of Equation (B.3), which yields

    E(ti | t < tc) = 1 / (λ0 · e^(x′iβ)).                                 (B.4)
The second expectation on the right-hand side of Equation (B.3) is simply
the censoring time, as shown by

    E(ti | t ≥ tc) = tc.                                                  (B.5)

Substituting Equations (B.4) and (B.5) into Equation (B.3) yields

    E(ti) = [1 − e^(−H(ti, xi))] · [1 / (λ0 · e^(x′iβ))] + e^(−H(ti, xi)) · tc.   (B.6)
Therefore, by substituting Equation (B.6) into Equation (B.2) we have an expected value
for the mean of each distribution, as shown by

    E(µi) = e^(β0 + x′iβ) · E(ti)
          = [1 − e^(−µi(tc))] + tc · e^(β0 + x′iβ) · e^(−µi(tc))
          = Φ(tc) + tc · e^(β0 + x′iβ) · [1 − Φ(tc)],                     (B.7)

written as a function of the vector of model parameters β, the censoring time tc, and the failure
distribution Φ. The elements of µ can, in turn, be used to construct the weight matrix,
which has the form W = diag(µ1, µ2, . . . , µn). With the GLM formulation, the variance-
covariance matrix, Σ, is computed as (X′WX)⁻¹. The X matrix in this example is
    X = [ 1  x11  x21  x31
          1  x12  x22  x32
          ⋮   ⋮    ⋮    ⋮
          1  x1n  x2n  x3n ]

where the matrix element xji represents the jth stress factor level (j = 1, 2, 3) for the ith unit
(i = 1, 2, . . . , n), and the weight matrix, W, is

    W = [ µ1  0   0   0
          0   µ2  0   0
          0   0   µ3  0
          0   0   0   µ4 ].
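Equation (B.7) is compact enough to check numerically. The following is an illustrative Python re-implementation (it is not part of the dissertation's SAS programs, and the parameter values below are arbitrary, chosen only to exercise the formula):

```python
import math

def expected_mu(beta0, xb, tc):
    """Eq. (B.7): E(mu_i) = Phi(tc) + tc * exp(beta0 + x'b) * [1 - Phi(tc)].

    beta0 -- intercept, log(lambda0)
    xb    -- linear predictor x'b for one test condition
    tc    -- planned censoring time
    """
    rate = math.exp(beta0 + xb)        # lambda0 * exp(x'b)
    phi = 1.0 - math.exp(-rate * tc)   # failure probability by tc, Phi(tc)
    return phi + tc * rate * (1.0 - phi)

# Weight matrix W = diag(mu_1, ..., mu_n): one such value per test condition.
weights = [expected_mu(-2.0, xb, tc=5.0) for xb in (0.0, 1.0, 2.0)]
```

A unit virtually certain to fail before tc contributes a weight near Φ(tc) ≈ 1, while a lightly stressed unit contributes a weight near zero, which is what pushes the optimal designs toward higher stress levels.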
1. D-optimality computations for a two-factor design
proc iml; /* IML = Interactive Matrix Language */
start DOptimal(xx) /* NLPCG call - nonlinear conj grad optimiz */
global (Weight, D, b0, b1, b2, t); /* Passing variables */
x=shape(xx,nrow(xx)*ncol(xx)/2,2); /* Matrix with nrow(xx)=1 and ncol(xx)=Nd*2 */
/* Nd = the number of test points in design */
F=j(nrow(x),1) || x[,1] || x[,2]; /* F is [1 | x1 | x2 ] matrix */
G= x[,1] || x[,2]; /* G is [x1 | x2 ] matrix */
b=b1//b2; /* b is the beta matrix, a column vector */
a1=G*b; /* This section computes the weight matrix */
Phi= 1-exp(-b0*t*exp(G*b)); /* Refer to the derivation in Appendix B */
a2=exp(a1);
a3=exp(-b0*t*a2);
a4=1-a3;
a5=1-Phi;
a5d=diag(a5);
a6=diag(b0*t*a5);
a7=diag(a6)*a2;
a8=Phi+a7; /* a8 = mu_i = Eqn B.7 */
W=diag(a8); /* W = diag(mu_i) */
XWX=F`*W*F; /* Computes the Fisher Information Matrix */
D=det(XWX); /* Computes the determinant */
return(D); /* Returns value to NLPCG function */
finish;
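The objective evaluated by the DOptimal module can be mirrored outside SAS. The sketch below is a hedged Python/NumPy re-implementation (function name and parameter values are illustrative): build F = [1 | x1 | x2], form the Equation (B.7) weights exactly as the IML code does (with b0 entering as a rate multiplier), and return det(F′WF).

```python
import numpy as np

def d_criterion(design, b0, b1, b2, t):
    """det(F'WF) for a two-factor design: the quantity the IML module
    DOptimal returns.  `design` is an (Nd, 2) array of coded stress levels."""
    x = np.asarray(design, dtype=float)
    F = np.column_stack([np.ones(len(x)), x[:, 0], x[:, 1]])  # [1 | x1 | x2]
    eta = x @ np.array([b1, b2])                   # G*b, slope terms only
    phi = 1.0 - np.exp(-b0 * t * np.exp(eta))      # failure probability by t
    mu = phi + b0 * t * np.exp(eta) * (1.0 - phi)  # Eq. (B.7) weights
    return float(np.linalg.det(F.T @ (mu[:, None] * F)))

# Corner points of the unit square give a well-conditioned information matrix
d_square = d_criterion([[0, 0], [1, 0], [0, 1], [1, 1]],
                       b0=0.05, b1=2.0, b2=1.0, t=100.0)
```

A degenerate design (all runs at one point) drives the determinant to zero, which is why the search in the following sections must keep the support points spread apart.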
2. U-optimality computations for a two-factor design with interaction
proc iml; /* IML = Interactive Matrix Language */
start UOptimal(xx) /* NLPCG call - nonlinear conj grad optimiz */
global (Weight, Var, b0, b1, b2, b3, t); /* Passing variables */
x=shape(xx,nrow(xx)*ncol(xx)/2,2); /* Matrix with nrow(xx)=1 and ncol(xx)=Nd*2 */
/* Nd = the number of test points in design */
F=j(nrow(x),1)||x[,1]||x[,2]||x[,1]#x[,2]; /* F is [1 | x1 | x2 | x1*x2] matrix */
G= x[,1] || x[,2] || x[,1]#x[,2]; /* G is [x1 | x2 | x1*x2] matrix */
b=b1//b2//b3; /* b is the beta matrix, a column vector */
a1=G*b; /* This section computes the weight matrix */
Phi= 1-exp(-b0*t*exp(G*b)); /* Refer to the derivation in Appendix B */
a2=exp(a1);
a3=exp(-b0*t*a2);
a4=1-a3;
a5=1-Phi;
a5d=diag(a5);
a6=diag(b0*t*a5);
a7=diag(a6)*a2;
a8=Phi+a7; /* a8 = mu_i = Eqn B.7 */
W=diag(a8); /* W = diag(mu_i) */
use={1 1.758 3.159 5.555}; /* Defines use condition location */
XWX=F`*W*F; /* Computes the Fisher Information Matrix */
Var=use*inv(XWX)*use`; /* Computes the pred variance at a use cond */
return(Var); /* Returns value to NLPCG function */
finish;
3. Computation of the lifetime model coefficients
do g1=0.60 to 0.90 by 0.15; /* g1 = the activation energy, Q */
do h1=-2.4 to -3.6 by -0.60; /* h1 = the moisture exponent, c */
do i1=-0.020 to -0.030 by -0.005; /* i1 = the interaction term, Q x c */
do y=2 to 2 by 1; /* exponent for censoring time */
do Nd=100 to 100 by 25; /* Nd = the number of test points in design */
t=10**y; /* computes censoring time */
c0=-5.2378; /* intercept to scale to 10^6 hours @ use */
c1=38.2801; /* s1_use = 11605/T for T=(30+273.15) */
c2=30.2876; /* s1_hi = 11605/T for T=(110+273.15) */
c3=4.5000; /* s2_use = log_e(RH) for RH=25% */
c4=3.2189; /* s2_hi = log_e(RH) for RH=90% */
b0=round(c0+c1*g1+c4*h1+c1*c4*i1, 0.0001); /* intercept for scaled lifetime model */
b1=round((c2-c1)*g1 + c4*(c2-c1)*i1, 0.0001); /* coefficient for accel factor 1 */
b2=round((c3-c4)*h1 + c1*(c3-c4)*i1, 0.0001); /* coefficient for accel factor 2 */
b3=round((c2-c1)*(c3-c4)*i1, 0.0001); /* coefficient for interaction 1*2 */
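The coefficient-scaling arithmetic in the DO loops above can be isolated as a small function. This Python sketch (an assumed helper, not taken from the original program) reproduces the four `round(..., 0.0001)` expressions; the call below uses the center point of the grid (Q = 0.75, c = -3, interaction = -0.025), the same planning values used later in section 7.

```python
def scaled_model_coefficients(g1, h1, i1,
                              c0=-5.2378, c1=38.2801, c2=30.2876,
                              c3=4.5000, c4=3.2189):
    """Rescale (Q, c, Q*c) into coded-model coefficients b0..b3, mirroring
    the DO-loop arithmetic (c1/c2: 11605/T at use/high temperature,
    c3/c4: log_e(RH) at use/high humidity)."""
    b0 = round(c0 + c1*g1 + c4*h1 + c1*c4*i1, 4)   # scaled intercept
    b1 = round((c2 - c1)*g1 + c4*(c2 - c1)*i1, 4)  # acceleration factor 1
    b2 = round((c3 - c4)*h1 + c1*(c3 - c4)*i1, 4)  # acceleration factor 2
    b3 = round((c2 - c1)*(c3 - c4)*i1, 4)          # interaction 1*2
    return b0, b1, b2, b3

# Center point of the grid: Q = 0.75, c = -3, interaction = -0.025
b0, b1, b2, b3 = scaled_model_coefficients(0.75, -3.0, -0.025)
```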
4. Non-linear search method using conjugate gradients
x0init = ranuni((1:(Nd*2))`); /* uniform random variate generator */
con = shape(0,1,Nd*2) // shape(1,1,Nd*2); /* unit square constraint matrix */
/* Note: If you wish to change the termination criteria */
/* Create a tc matrix (e.g. tc= [x // y // z] */
/* Add tc to NLPCG call statement (e.g. x0init, 0, con, tc); */
/* tc[1] = maximum number of iterations allowed */
/* tc[2] = maximum number of functional calls */
/* tc[3] = absolute functional convergence criteria */
call nlpcg(rc, /* Return code */
x0, /* Returned optimum factors */
"UOptimal", /* Function to optimize */
x0init, /* Initial value of factors */
0, /* 0=minimization; 1=maximization */
con); /* Boundary constraint matrix */
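NLPCG carries out a conjugate-gradient search inside the unit-square bounds encoded by `con`. The core idea (take a descent step, then clamp each coordinate back into [0, 1]) can be sketched in pure Python; this is an illustrative projected-gradient routine with a forward-difference numerical gradient, not SAS's NLPCG algorithm itself:

```python
def projected_gradient(f, x0, lo=0.0, hi=1.0, step=0.1, iters=500, h=1e-6):
    """Minimize f over the box [lo, hi]^d: gradient step, then clamp."""
    x = list(x0)
    for _ in range(iters):
        fx = f(x)
        g = []
        for i in range(len(x)):            # forward-difference gradient
            xp = list(x)
            xp[i] += h
            g.append((f(xp) - fx) / h)
        # descend, then project back onto the box constraints
        x = [min(hi, max(lo, xi - step * gi)) for xi, gi in zip(x, g)]
    return x

# Toy objective with unconstrained minimum at (1.5, -0.5); the constrained
# optimum therefore sits on the boundary of the unit square, at (1.0, 0.0).
opt = projected_gradient(lambda v: (v[0] - 1.5)**2 + (v[1] + 0.5)**2,
                         [0.5, 0.5])
```

As with the random `x0init` seed above, several restarts are advisable in practice, since the weighted determinant and variance surfaces can have local optima.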
5. Formatting of the D-optimal design output
/* Enter your specific computer name and directory path here */
libname tojmp 'C:\Documents and Settings\emmonroe\Desktop';
toout=round(shape(x0, nd,2),0.001) || /* Formats the output in tabular cols */
Weight || repeat(D,nd,1) || /* Two pipe operators || used to define col */
repeat(b0,nd,1) || repeat(b1,nd,1)||
repeat(b2,nd,1) || repeat(t,nd,1) ||
repeat(Nd,nd,1) ;
first=1; /* runs create statement first time through */
if first then do;
colnames={'x1' 'x2' 'weight' 'Determ' 'b0' 'b1' 'b2' 't' 'Nd'};
create tojmp.newdat from toout[colname=colnames];
end;
append from toout;
first=0; /* turns create flag off after first pass */
end; /* closes DO loops shown in section D3 */
end;
end;
end;
end;
end;
quit;
proc freq data=tojmp.newdat; /* Formats data in a nice table */
tables b0*b1*b2*t*Nd*x1*x2*Determ / noprint out=tojmp.summary(drop=percent);
run;
6. Formatting of the U-optimal design output
toout=round(shape(x0, nd,2),0.001) || /* Formats the output in tabular cols */
Weight || repeat(Var,nd,1) || /* Two pipe operators || used to define col */
repeat(b0,nd,1) || repeat(b1,nd,1) ||
repeat(b2,nd,1) || repeat(b3,nd,1) ||
repeat(t,nd,1) || repeat(Nd,nd,1) ;
first=1; /* runs create statement first time through */
if first then do;
colnames={'x1' 'x2' 'weight' 'var' 'b0' 'b1' 'b2' 'b3' 't' 'Nd'};
create tojmp.newdat from toout[colname=colnames];
end;
append from toout;
first=0; /* turns create flag off after first pass */
end;
end;
end;
end;
end;
end;
quit;
proc freq data=tojmp.newdat; /* groups data into nice summary format */
tables b0*b1*b2*b3*t*Nd*x1*x2*var / noprint out=tojmp.summary(drop=percent);
run;
7. Computation of the prediction variance contours
proc iml; /* IML = Interactive Matrix Language */
n1=50; /* Samples per stress level condition */
n2=21;
n3=16;
n4=13;
N=n1+n2+n3+n4;
d1=shape( {0.782, 0.951},n1,2); /* Matrices w. normalized stress levels */
d2=shape( {1,0},n2,2);
d3=shape( {0,1},n3,2);
d4=shape( {0,0},n4,2);
D=d1//d2//d3//d4; /* Design matrix */
F=j(N,1) || D || D[,1]#D[,2]; /* [1 | x1 | x2 | x1*x2] */
G= D || D[,1]#D[,2]; /* [x1 | x2 | x1*x2 ] */
t=100; /* is the censoring time */
g1=0.75; /* is the activation energy, Q */
h1=-3; /* is the moisture exponent, c */
i1=-0.025; /* is the interaction term, Q x c */
c0=-5.2378; /* intercept to scale to 10^6 hours @ use */
c1=38.2801; /* s1_use = 11605/T for T=(30+273.15) */
c2=30.2876; /* s1_hi = 11605/T for T=(110+273.15) */
c3=4.5000; /* s2_use = log_e(RH) for RH=25% */
c4=3.2189; /* s2_hi = log_e(RH) for RH=90% */
b0=round(c0+c1*g1+c4*h1+c1*c4*i1, 0.0001); /* intercept for scaled lifetime model */
b1=round((c2-c1)*g1 + c4*(c2-c1)*i1, 0.0001); /* coefficient for accel factor 1 */
b2=round((c3-c4)*h1 + c1*(c3-c4)*i1, 0.0001); /* coefficient for accel factor 2 */
b3=round((c2-c1)*(c3-c4)*i1, 0.0001); /* coefficient for interaction 1*2 */
b=b1//b2//b3;
a1=G*b; /* Computation of the weight matrix */
Phi= 1-exp(-b0*t*exp(G*b));
a2=exp(a1);
a3=exp(-b0*t*a2);
a4=1-a3;
a5=1-Phi;
a5d=diag(a5);
a6=diag(b0*t*a5);
a7=diag(a6)*a2;
a8=Phi+a7;
W=diag(a8);
XWX=F`*W*F; /* Compute the X'WX matrix */
I=inv(XWX); /* Invert to obtain the covariance matrix */
use={1 1.758 3.159 5.555}; /* Declaration of the use condition */
Var=use*I*use`; /* Computation of prediction variance */
print Var;
quit;
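The same computation can be cross-checked outside SAS. The sketch below is an illustrative Python/NumPy re-implementation of the IML statements above, with the same allocations, coded stress levels, and planning values (it deliberately mirrors the IML code in using b0 directly as a multiplier inside the weight formula):

```python
import numpy as np

# Allocations and coded stress levels from the design matrix above
n_alloc = [50, 21, 16, 13]
pts = [(0.782, 0.951), (1.0, 0.0), (0.0, 1.0), (0.0, 0.0)]
D = np.array([p for p, m in zip(pts, n_alloc) for _ in range(m)])
F = np.column_stack([np.ones(len(D)), D, D[:, 0] * D[:, 1]])  # [1|x1|x2|x1*x2]
G = F[:, 1:]                                                  # [x1|x2|x1*x2]

t = 100.0                        # censoring time
g1, h1, i1 = 0.75, -3.0, -0.025  # activation energy Q, moisture exp. c, Q x c
c0, c1, c2, c3, c4 = -5.2378, 38.2801, 30.2876, 4.5000, 3.2189
b0 = c0 + c1*g1 + c4*h1 + c1*c4*i1
b = np.array([(c2 - c1)*g1 + c4*(c2 - c1)*i1,
              (c3 - c4)*h1 + c1*(c3 - c4)*i1,
              (c2 - c1)*(c3 - c4)*i1])

phi = 1.0 - np.exp(-b0 * t * np.exp(G @ b))      # failure probability by t
mu = phi + b0 * t * np.exp(G @ b) * (1.0 - phi)  # Eq. (B.7) weights
XWX = F.T @ (mu[:, None] * F)                    # X'WX without forming diag(W)
use = np.array([1.0, 1.758, 3.159, 5.555])       # use-condition contrast
var = float(use @ np.linalg.solve(XWX, use))     # prediction variance at use
```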
1. Fisher information matrices
Assuming a main effects model with no interaction term yields,

    (X′WX) = [ 77.50  42.86  40.59
               42.86  38.45  19.23
               40.59  19.23  39.39 ].
Assuming a model with an interaction term whose coefficient is estimated to be zero yields,

    (X′WX) = [ 77.50  42.86  40.59  19.23
               42.86  38.45  19.23  15.04
               40.59  19.23  39.39  18.29
               19.23  15.04  18.29  14.30 ].
Assuming a model with an interaction term and a small interaction coefficient yields,

    (X′WX) = [ 77.04  43.49  36.58  15.58
               43.49  38.92  15.58  12.23
               36.58  15.58  31.28  11.42
               15.58  12.23  11.42   8.97 ].
Assuming a model with an interaction term and a large interaction coefficient yields,

    (X′WX) = [ 59.47  34.96  20.80  10.37
               34.96  33.20  10.37   9.35
               20.80  10.37  14.15   6.00
               10.37   9.35   6.00   5.42 ].
2. Covariance matrices
Assume a main effects model with no interaction term:

    (X′WX)⁻¹ = [  0.057  −0.046  −0.037
                 −0.046   0.071   0.013
                 −0.037   0.013   0.057 ]
Assume a model with an interaction term whose coefficient is zero:

    (X′WX)⁻¹ = [  0.077  −0.077  −0.077   0.076
                 −0.077   0.121   0.077  −0.122
                 −0.077   0.077   0.139  −0.156
                  0.076  −0.122  −0.156   0.296 ]
Assume a model with an interaction term whose coefficient is small:

    (X′WX)⁻¹ = [  0.091  −0.091  −0.091   0.081
                 −0.091   0.136   0.091  −0.143
                 −0.091   0.091   0.151  −0.158
                  0.082  −0.143  −0.158   0.366 ]
Assume a model with an interaction term whose coefficient is large:

    (X′WX)⁻¹ = [  0.091  −0.091  −0.113   0.108
                 −0.091   0.150   0.113  −0.209
                 −0.113   0.113   0.274  −0.282
                  0.108  −0.209  −0.282   0.652 ]
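As a sanity check on the section 1 and section 2 listings, inverting the first information matrix reproduces the first covariance matrix to the three decimal places quoted (a Python/NumPy sketch):

```python
import numpy as np

# Main-effects information matrix from section 1
XWX = np.array([[77.50, 42.86, 40.59],
                [42.86, 38.45, 19.23],
                [40.59, 19.23, 39.39]])
cov = np.linalg.inv(XWX)

# Covariance matrix quoted in section 2, rounded to three decimals
quoted = np.array([[ 0.057, -0.046, -0.037],
                   [-0.046,  0.071,  0.013],
                   [-0.037,  0.013,  0.057]])
```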
Table 25: U-optimal design for condition – – –

Test    Temperature         Humidity           F(t)     Alloc   Exp.
cond i  natural °C  std x1  natural %  std x2  Φ        ni      fails
1       62.6        0.940   60         1.000   0.0223   44      1.0
2       60          1.000   90         0.000   0.0600   21      1.3
3       110         0.000   60         1.000   0.4257   18      7.7
4       110         0.000   90         0.000   0.8461   17      14.4
Table 26: U-optimal design for condition – – +

Test    Temperature         Humidity           F(t)     Alloc   Exp.
cond i  natural °C  std x1  natural %  std x2  Φ        ni      fails
1       62.1        0.952   61.55      0.937   0.0231   48      1.1
2       60          1.000   90         0.000   0.0600   20      1.2
3       110         0.000   60         1.000   0.4257   17      7.2
4       110         0.000   90         0.000   0.8461   15      12.7
Table 27: U-optimal design for condition – + –

Test    Temperature         Humidity           F(t)     Alloc   Exp.
cond i  natural °C  std x1  natural %  std x2  Φ        ni      fails
1       69.3        0.792   67.2       0.720   0.0510   57      2.9
2       60          1.000   90         0.000   0.0600   18      1.1
3       110         0.000   60         1.000   0.4257   15      6.4
4       110         0.000   90         0.000   0.8461   11      9.3
Table 28: U-optimal design for condition – + +

Test    Temperature         Humidity           F(t)     Alloc   Exp.
cond i  natural °C  std x1  natural %  std x2  Φ        ni      fails
1       66.0        0.864   69.5       0.637   0.0443   62      2.8
2       60          1.000   90         0.000   0.0600   18      1.1
3       110         0.000   61.5       0.940   0.0195   10      4.5
4       110         0.000   90         0.000   0.8461   10      8.5
Table 29: U-optimal design for condition + – –

Test    Temperature         Humidity           F(t)     Alloc   Exp.
cond i  natural °C  std x1  natural %  std x2  Φ        ni      fails
1       71.5        0.744   61.3       0.948   0.0457   52      2.4
2       60          1.000   90         0.000   0.0600   18      1.1
3       110         0.000   60         1.000   0.4257   16      6.8
4       110         0.000   90         0.000   0.8461   14      11.8
Table 30: U-optimal design for condition + – +

Test    Temperature         Humidity           F(t)     Alloc   Exp.
cond i  natural °C  std x1  natural %  std x2  Φ        ni      fails
1       72.6        0.720   62.9       0.885   0.0533   50      2.7
2       60          1.000   90         0.000   0.0600   19      1.1
3       110         0.000   60         1.000   0.4257   20      8.5
4       110         0.000   90         0.000   0.8461   5       4.2
Table 31: U-optimal design for condition + + –

Test    Temperature         Humidity           F(t)     Alloc   Exp.
cond i  natural °C  std x1  natural %  std x2  Φ        ni      fails
1       76.4        0.641   66.9       0.732   0.0828   59      4.9
2       60          1.000   90         0.000   0.0600   17      1.0
3       110         0.000   60         1.000   0.4257   19      8.1
4       110         0.000   90         0.000   0.8461   5       4.2
Table 32: U-optimal design for condition + + +

Test    Temperature         Humidity           F(t)     Alloc   Exp.
cond i  natural °C  std x1  natural %  std x2  Φ        ni      fails
1       76.4        0.641   68.0       0.691   0.0868   55      4.8
2       60          1.000   90         0.000   0.0600   18      1.1
3       110         0.000   60         1.000   0.4257   19      8.1
4       110         0.000   90         0.000   0.8461   8       6.8