VIII. Introduction to Response Surface Methodology
A. Sequential Experimentation
1. Phases of Experimentation
I. Screening
• very small design (Resolution III)
• little, if any, replication
• analyze by normal probability plots
• extremely cost conscious (save resources for later)
• little, if any, concern about lack-of-fit
II. Initial Steepest Ascent (Descent)
• replicate at least the center
• begin to be concerned over lack-of-fit
• serious consideration to Resolution IV or higher designs
• less cost conscious
III. Follow-Up Steepest Ascent
IV. Optimization
• replication extremely important
• often starts as a mid-course correction
• lack-of-fit may suggest design augmentation
• popular designs
(a) central composite design (CCD)
(b) augment to a CCD
(c) Box-Behnken
• extremely expensive
Important Considerations in Choosing Designs
• Purpose of Experiment
• Proposed Model
• Estimation versus Testing
• Concern Over Lack-of-Fit
• Ability to Augment, if Necessary
• Protection from Outliers
The Relationship Between Design and Model
The specific design used determines which models are estimable!
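1. Screening Designs
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \varepsilon_i = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij} + \varepsilon_i$$
2. “Interactive” Model
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \beta_{12} x_{i1} x_{i2} + \beta_{13} x_{i1} x_{i3} + \cdots + \beta_{k-1,k}\, x_{i,k-1} x_{ik} + \varepsilon_i$$
$$\phantom{y_i} = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij} + \sum_{j < j'} \beta_{jj'} x_{ij} x_{ij'} + \varepsilon_i$$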
3. Optimization
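$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \beta_{11} x_{i1}^2 + \beta_{22} x_{i2}^2 + \cdots + \beta_{kk} x_{ik}^2 + \beta_{12} x_{i1} x_{i2} + \beta_{13} x_{i1} x_{i3} + \cdots + \beta_{k-1,k}\, x_{i,k-1} x_{ik} + \varepsilon_i$$
$$\phantom{y_i} = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij} + \sum_{j=1}^{k} \beta_{jj} x_{ij}^2 + \sum_{j < j'} \beta_{jj'} x_{ij} x_{ij'} + \varepsilon_i$$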
B. Steepest Ascent
Steepest ascent is an example of an optimization.
In calculus, how do we optimize something?
Consider a situation where we may model the response by a strict first-order model
$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k$$
Taking the first derivative with respect to $x_j$,
$$\frac{\partial \hat{y}}{\partial x_j} = b_j \neq 0$$
Technically, we find the path of steepest ascent by a constrained optimization technique based on Lagrangian multipliers.
If we have only two factors, the path of steepest ascent is the line from the origin to the maximum response over the circle defined by
$$x_1^2 + x_2^2 = c^2 \quad \text{for any } c > 0,$$
where $c$ is the radius of the circle.
For k ≥ 3 factors, the path of steepest ascent is the line from the origin to the maximum response over the sphere defined by
$$\sum_{j=1}^{k} x_j^2 = c^2 \quad \text{for any } c > 0.$$
Since the path of steepest ascent represents the optimum response over spheres, we need to construct this path in the metric where spheres make the most sense.
Our procedure is:
• construct the path in the design variables, and
• convert this line back to the natural units.
[Figure: Path of Steepest Ascent]
Let $x_1$ be the “key” factor, and let $x_{10}$ be a specific value for this factor along the desired path.
The settings for the other factors are
$$x_{j0} = \frac{b_j}{b_1}\, x_{10}, \qquad j = 2, 3, \ldots, k.$$
This path passes through the center of the region of interest.
To convert this line back to the natural units,
• let $c_j$ be the center value, in the natural units, for the jth factor, and
• let $d_j$ be the “scaling” factor.
Let $x_{j0}^*$ be the specific setting for the jth factor along the path of steepest ascent; thus,
$$x_{j0}^* = c_j + d_j x_{j0}.$$
We usually pick as our key factor the one whose estimated coefficient is largest in absolute value.
We construct the line by increasing this key factor by a convenient amount each time.
We then run a series of experiments along this path.
Example: Kilgo (1988) performed an experiment to determine the effect of CO2 pressure, CO2 temperature, peanut moisture, CO2 flow rate, and peanut particle size on the total yield of oil per batch of peanuts. A $2^{5-1}$ design was carried out, and only temperature, $x_2$, and particle size, $x_5$, were important.
Since $x_5$ has the largest coefficient in absolute value, we use it as our key factor.
        x2        x5
bj      9.875     -22.25
cj      60        2.665
dj      35        1.385

For a specific setting of particle size, $x_{50}$, along the path, the appropriate setting for temperature, $x_{20}$, is given by
$$x_{20} = \frac{b_2}{b_5}\, x_{50} \approx -0.44\, x_{50}$$
We can convert each value of $x_{20}$ back to the natural units by
$$x_{20}^* = c_2 + d_2 x_{20} = 60 + 35\, x_{20}$$
We can convert each value of $x_{50}$ back to the natural units by
$$x_{50}^* = c_5 + d_5 x_{50} = 2.665 + 1.385\, x_{50}$$
        Design variables      Natural units
Run     x20       x50         x20*      x50*      y
1       0.444     -1.0        75.5      1.28      81
2       0.488     -1.1        77.1      1.14      84
3       0.533     -1.2        78.7      1.00      90
4       0.577     -1.3        80.2      0.86      97
5       0.622     -1.4        81.8      0.73      95
6       0.666     -1.5        83.3      0.59      92
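The arithmetic behind this table can be sketched in a few lines of Python. This is only an illustration of the two formulas above (the ratio of coefficients, then the conversion to natural units); the function and variable names are our own, and the step of 0.1 in the key factor is read off the table.

```python
# Estimated coefficients, centers, and scaling factors from the Kilgo example.
b = {"x2": 9.875, "x5": -22.25}
center = {"x2": 60.0, "x5": 2.665}
scale = {"x2": 35.0, "x5": 1.385}


def path_point(x50):
    """Return (x20, x20*, x50*) for a given key-factor setting x50."""
    x20 = (b["x2"] / b["x5"]) * x50            # x_j0 = (b_j / b_key) * x_key0
    nat2 = center["x2"] + scale["x2"] * x20    # temperature in natural units
    nat5 = center["x5"] + scale["x5"] * x50    # particle size in natural units
    return x20, nat2, nat5


# Step the key factor x5 by -0.1 per run, as in runs 1 to 6 of the table.
path = []
for i in range(6):
    x50 = -(1.0 + 0.1 * i)
    path.append((x50,) + path_point(x50))

for run, (x50, x20, nat2, nat5) in enumerate(path, start=1):
    print(f"{run}  x20={x20:.3f}  x50={x50:.1f}  x20*={nat2:.1f}  x50*={nat5:.2f}")
```

Small differences in the last digit relative to the table can arise from rounding the ratio of coefficients before multiplying.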
C. Second-Order Experiments
1. Overview
For first order designs, we must have:
• at least two levels for each variable;
• at least as many points as parameters to estimate, i.e., $k+1$; and
• main effects that are not completely aliased with each other.
For a second order model, we now must have:
• at least three levels for each variable in order to estimate both the first order and pure quadratic effects;
• at least as many points as parameters to estimate, i.e.,
$$1 + 2k + \frac{k(k-1)}{2} = \frac{(k+1)(k+2)}{2};$$
and
• main effects and two-factor interactions that are not completely aliased with each other.
The first design which meets these criteria is the $3^k$ factorial design.
Note:
• this design uses three levels for each variable.
• $3^k \ge (k+1)(k+2)/2$ [equality if and only if $k=1$]
• the $3^k$ allows us to estimate all first order, pure quadratic, two-factor and higher interactions.
A major disadvantage of the $3^k$ factorial:
Often the $3^k$ design points are more than are required for the second order model.
Thus, the $3^k$ factorial is really too expensive to be practical.
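To see how quickly the gap opens up, one can tabulate the number of runs in a $3^k$ factorial against the number of parameters in the second order model. The following is a minimal sketch; the function names are ours.

```python
def second_order_params(k):
    """Number of coefficients in a full second order model: (k+1)(k+2)/2."""
    return (k + 1) * (k + 2) // 2


def three_level_runs(k):
    """Number of runs in a full 3^k factorial."""
    return 3 ** k


for k in range(1, 7):
    p = second_order_params(k)
    n = three_level_runs(k)
    print(f"k={k}: parameters={p:3d}, 3^k runs={n:4d}, excess runs={n - p}")
```

For k = 3 the model has 10 parameters while the $3^3$ design needs 27 runs, and the excess grows rapidly with k.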
2. The Central Composite Design
The single most popular second order response surface design is the central composite design (CCD) developed by Box and Wilson (JRSS, B 1951).
The CCD was intended to be a more economical alternative to the $3^k$.
The design consists of three parts:
• a Res. V fraction of a $2^k$;
• a series of “axial” runs; and
• a series of center runs.
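Sometimes, we convey this information by:
$$D = \begin{bmatrix}
\pm 1 & \pm 1 & \cdots & \pm 1 \\
-\alpha & 0 & \cdots & 0 \\
\alpha & 0 & \cdots & 0 \\
0 & -\alpha & \cdots & 0 \\
0 & \alpha & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & -\alpha \\
0 & 0 & \cdots & \alpha \\
0 & 0 & \cdots & 0
\end{bmatrix}$$
Note: The CCD is rather flexible in that $\alpha$ is not fixed.
Thus, we may choose $\alpha$ in order to meet some particular needs.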
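It is instructive to see the two variable (k=2) CCD with $\alpha = 1$:
$$D = \begin{bmatrix}
-1 & -1 \\
1 & -1 \\
-1 & 1 \\
1 & 1 \\
-1 & 0 \\
1 & 0 \\
0 & -1 \\
0 & 1 \\
0 & 0
\end{bmatrix}$$
Note: With the center run, the k=2 CCD with $\alpha = 1$ is the $3^2$ factorial design.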
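For k = 3, the CCD with $\alpha = 1$ is
$$D = \begin{bmatrix}
-1 & -1 & -1 \\
1 & -1 & -1 \\
-1 & 1 & -1 \\
1 & 1 & -1 \\
-1 & -1 & 1 \\
1 & -1 & 1 \\
-1 & 1 & 1 \\
1 & 1 & 1 \\
-1 & 0 & 0 \\
1 & 0 & 0 \\
0 & -1 & 0 \\
0 & 1 & 0 \\
0 & 0 & -1 \\
0 & 0 & 1 \\
0 & 0 & 0
\end{bmatrix}$$
Note: With a single center run, the CCD requires 15 design runs [$2^3 + 2 \cdot 3 + 1$] as opposed to the 27 required by a $3^3$ factorial.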
There are three common choices for $\alpha$:
• $\alpha = 1$ (cuboidal)
• $\alpha = \sqrt{k}$ (spherical CCD)
• $\alpha = n_f^{0.25}$ (rotatable), where $n_f$ is the number of factorial points.
A rotatable design is one where the prediction variance for any two points the same distance from the design center is the same.
As a result, if a design is rotatable, the prediction variance at some specific location only depends on that location's distance from the design center.
Finally, an important question concerns how many center runs we should use.
From a variance-based optimality perspective: 1-3 are usually enough.
For detecting Lack of Fit, probably 6-8.
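The three parts of the CCD, and the three common choices of $\alpha$, can be generated mechanically. The sketch below is one possible way to do it in Python, under the assumption that the factorial portion is the full $2^k$ (a Res. V fraction would replace it for larger k); the function name `central_composite` is our own.

```python
from itertools import product


def central_composite(k, alpha, n_center=1):
    """Return the CCD runs in design units as a list of tuples.

    Assumes the full 2^k factorial for the factorial portion.
    """
    factorial = [tuple(float(v) for v in pt) for pt in product((-1, 1), repeat=k)]
    axial = []
    for j in range(k):
        for sign in (-1.0, 1.0):
            pt = [0.0] * k
            pt[j] = sign * alpha          # one factor at +/- alpha, the rest at 0
            axial.append(tuple(pt))
    center = [tuple([0.0] * k)] * n_center
    return factorial + axial + center


k = 3
n_f = 2 ** k                               # number of factorial points
alphas = {"cuboidal": 1.0, "spherical": k ** 0.5, "rotatable": n_f ** 0.25}
design = central_composite(k, alphas["rotatable"])
print(len(design), {name: round(a, 3) for name, a in alphas.items()})
```

With k = 3 and one center run the design has $2^3 + 2 \cdot 3 + 1 = 15$ runs, and with k = 2 and $\alpha = 1$ the nine runs coincide with the $3^2$ factorial.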
D. Optimization
Primary goal of the second order experiment: optimization.
Consider:
$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_{12} x_1 x_2 + b_{11} x_1^2 + b_{22} x_2^2$$
From calculus, the point of optimal response is
• either the stationary point
• or some point on the boundary of the region.
Let $x_0$ denote the factor settings at the stationary point.
Let $y_0$ be the response at this point.
To find $x_0$, we need to solve the system obtained by
$$\left. \frac{\partial \hat{y}}{\partial x} \right|_{x = x_0} = 0$$
It is important to note that the stationary point may be:
• a point of maximum response;
• a point of minimum response, or
• a saddle point.
Even if the stationary point is an optimum, it may lie outside the region of experimentation.
Hence, we have little faith in it.
Bottom Line: Often, the stationary point is not a reliable point of optimal response.
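As an illustration of these points, the following sketch solves the two-factor stationary point equations directly and classifies the result. The coefficients are hypothetical (they are not from any experiment in these notes), and the check against the region assumes a cuboidal region with each factor between -1 and 1.

```python
import math

# Hypothetical fitted coefficients (illustration only, not from the notes).
b1, b2 = 2.0, 3.0
b11, b22, b12 = -4.0, -3.0, 1.0

# Setting both partial derivatives of y-hat to zero gives the linear system
#   2*b11*x1 +   b12*x2 = -b1
#     b12*x1 + 2*b22*x2 = -b2
# which is solved here by Cramer's rule.
det = 4.0 * b11 * b22 - b12 ** 2
x1_0 = (-2.0 * b22 * b1 + b12 * b2) / det
x2_0 = (-2.0 * b11 * b2 + b12 * b1) / det

# Eigenvalues of B = [[b11, b12/2], [b12/2, b22]] classify the stationary point.
half_gap = math.sqrt((b11 - b22) ** 2 + b12 ** 2) / 2.0
lam_lo = (b11 + b22) / 2.0 - half_gap
lam_hi = (b11 + b22) / 2.0 + half_gap
if lam_hi < 0:
    kind = "maximum"
elif lam_lo > 0:
    kind = "minimum"
else:
    kind = "saddle point"

in_region = max(abs(x1_0), abs(x2_0)) <= 1.0   # assumed cuboidal region
print(x1_0, x2_0, kind, in_region)
```

Both eigenvalues negative indicates a maximum, both positive a minimum, and mixed signs a saddle point.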
Thus, the point of optimal response often lies on the boundary of the region of interest.
How should we find this point?
Consider Lagrangian multipliers.
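We thus optimize
$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_{12} x_1 x_2 + b_{11} x_1^2 + b_{22} x_2^2$$
subject to the constraint that
$$\sum_{j=1}^{k} x_j^2 = R^2,$$
where $R$ is the radius of the region of interest.
Let
$$L = \hat{y} - \mu \left( \sum_{j=1}^{k} x_j^2 - R^2 \right),$$
where $\mu$ is the Lagrangian multiplier.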
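The constrained optimization can be carried one step further in matrix form. The notation here is an assumption on our part and does not appear in the notes: $\mathbf{b}$ collects the first order coefficients, and $\mathbf{B}$ is the symmetric matrix with the pure quadratic coefficients $b_{jj}$ on the diagonal and half the interaction coefficients, $b_{jj'}/2$, off the diagonal.

```latex
\begin{aligned}
\hat{y} &= b_0 + \mathbf{x}'\mathbf{b} + \mathbf{x}'\mathbf{B}\mathbf{x}, \\
L &= \hat{y} - \mu\left(\mathbf{x}'\mathbf{x} - R^2\right), \\
\frac{\partial L}{\partial \mathbf{x}} &= \mathbf{b} + 2\mathbf{B}\mathbf{x} - 2\mu\mathbf{x} = \mathbf{0}
\quad\Longrightarrow\quad
(\mathbf{B} - \mu\mathbf{I})\,\mathbf{x} = -\tfrac{1}{2}\,\mathbf{b}.
\end{aligned}
```

Setting $\mu = 0$ recovers the unconstrained stationary point, $\mathbf{x}_0 = -\tfrac{1}{2}\mathbf{B}^{-1}\mathbf{b}$. For other values of $\mu$, the solution $\mathbf{x}$ is a candidate optimum on the sphere whose radius is $\sqrt{\mathbf{x}'\mathbf{x}}$.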
D. Multiple Responses
In many engineering experiments, we have more than one response of interest.
The key: to find appropriate compromise operating conditions.
Two basic approaches for jointly optimizing two or more responses:
• the desirability function, and
• nonlinear programming approaches.
Several statistical software packages include some form of the desirability function.
Some spreadsheets, including EXCEL, use good reduced gradient algorithms to perform appropriate constrained optimization.
The Desirability Function
The desirability function provides an overall measure for the “goodness” of a specific setting:
• A large value indicates a desirable set of values for the various responses.
• A low value indicates an undesirable set of values.
Derringer and Suich (Journal of Quality Technology 1980) proposed an approach which:
1. determines the individual desirabilities for each response and
2. then combines these individual desirabilities into an overall desirability.
The analyst then seeks to find the settings in the factors which maximize the overall desirability.
The individual desirabilities depend upon whether we wish
• to maximize the response of interest,
• to minimize the response of interest, or
• to achieve a specific target value for the response of interest.
Derringer and Suich use a scale from 0, which represents completely undesirable, to 1, which represents fully desirable, for their individual desirability functions.
Consider the target value case first.
• $\hat{y}$ is the predicted value for the response.
• $y_T$ is the specific target value for the response of interest.
• $y_L$ is the smallest possible value which has any desirability.
• $y_U$ is the largest possible value which has any desirability.
One approach defines the desirability for this response by
$$d = \begin{cases}
0 & \text{for } \hat{y} < y_L \\
\dfrac{\hat{y} - y_L}{y_T - y_L} & \text{for } y_L \le \hat{y} \le y_T \\
\dfrac{y_U - \hat{y}}{y_U - y_T} & \text{for } y_T \le \hat{y} \le y_U \\
0 & \text{for } \hat{y} > y_U
\end{cases}$$
With this definition,
• we give any predicted value for the response less than $y_L$ or greater than $y_U$ a desirability of 0.
• if the predicted value is exactly at the target value, we give it a desirability of 1.
• the further the predicted value is from the target, the lower desirability we give it.
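Derringer and Suich actually proposed the following slight modification:
$$d = \begin{cases}
0 & \text{for } \hat{y} < y_L \\
\left( \dfrac{\hat{y} - y_L}{y_T - y_L} \right)^{s} & \text{for } y_L \le \hat{y} \le y_T \\
\left( \dfrac{y_U - \hat{y}}{y_U - y_T} \right)^{t} & \text{for } y_T \le \hat{y} \le y_U \\
0 & \text{for } \hat{y} > y_U
\end{cases}$$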
The exponents s and t provide greater flexibility in assigning the desirability within the range of interest.
Suppose we wish to maximize the response.
• yL is the smallest desirable value for this response.
• yU is a fully desirable value.
Basically, yU represents the point of diminishing returns.
In some cases, yU represents a true bound for the response.
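In other cases, $y_U$ is some arbitrary value larger than the largest observed response.
For this situation, Derringer and Suich proposed
$$d = \begin{cases}
0 & \text{for } \hat{y} < y_L \\
\left( \dfrac{\hat{y} - y_L}{y_U - y_L} \right)^{s} & \text{for } y_L \le \hat{y} \le y_U \\
1 & \text{for } \hat{y} > y_U
\end{cases}$$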
Suppose we wish to minimize the response.
• yU is the largest desirable value for this response.
• yL is a fully desirable value.
Basically, yL represents the point of diminishing returns.
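$$d = \begin{cases}
1 & \text{for } \hat{y} < y_L \\
\left( \dfrac{y_U - \hat{y}}{y_U - y_L} \right)^{s} & \text{for } y_L \le \hat{y} \le y_U \\
0 & \text{for } \hat{y} > y_U
\end{cases}$$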
Once we have the individual desirabilities, we need to combine them in a meaningful way.
How should we do this?
Note:
• If any of the individual responses is completely undesirable, then the overall desirability also should be completely undesirable.
• Similarly, the overall desirability should be 1 if and only if all of the individual responses are completely desirable.
Suppose we have m responses of interest.
Let d1, d2, … , dm be the individual desirabilities.
Derringer and Suich defined the overall desirability, $D$, by
$$D = \left( \prod_{j=1}^{m} d_j \right)^{1/m},$$
which is the geometric mean of the desirabilities.
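The three individual desirability functions and their geometric mean translate almost line for line into code. The following Python sketch is one way to write them; the function names are ours, and the exponents default to 1.

```python
def d_target(y_hat, y_low, y_target, y_high, s=1.0, t=1.0):
    """Desirability when a specific target value is wanted."""
    if y_hat < y_low or y_hat > y_high:
        return 0.0
    if y_hat <= y_target:
        return ((y_hat - y_low) / (y_target - y_low)) ** s
    return ((y_high - y_hat) / (y_high - y_target)) ** t


def d_maximize(y_hat, y_low, y_high, s=1.0):
    """Desirability when larger responses are better."""
    if y_hat < y_low:
        return 0.0
    if y_hat > y_high:
        return 1.0
    return ((y_hat - y_low) / (y_high - y_low)) ** s


def d_minimize(y_hat, y_low, y_high, s=1.0):
    """Desirability when smaller responses are better."""
    if y_hat < y_low:
        return 1.0
    if y_hat > y_high:
        return 0.0
    return ((y_high - y_hat) / (y_high - y_low)) ** s


def overall_desirability(ds):
    """Geometric mean of the individual desirabilities."""
    prod = 1.0
    for d in ds:
        prod *= d
    return prod ** (1.0 / len(ds))


print(overall_desirability([d_maximize(90.0, 80.0, 100.0),
                            d_target(57.5, 55.0, 57.5, 60.0)]))
```

Because the overall value is a product, a single individual desirability of 0 forces the overall desirability to 0, which is the first property noted above.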
Myers and Montgomery (1995) outline an experiment, originally presented in Box, Hunter, and Hunter (1978).
Purpose: to find the settings for
• reaction time (x1),
• reaction temperature (x2), and
• the amount of catalyst (x3)
which maximize the conversion ($y_1$) of a polymer and achieve a target value of 57.5 for the thermal activity ($y_2$).
The lower bound for the conversion is 80.
The maximum possible value is 100.
Thermal activity must be between 55 and 60.
The experimental results:

x1        x2        x3        y1     y2
-1        -1        -1        74     53.2
 1        -1        -1        51     62.9
-1         1        -1        88     53.4
 1         1        -1        70     62.6
-1        -1         1        71     57.3
 1        -1         1        90     67.9
-1         1         1        66     59.8
 1         1         1        97     67.8
-1.682     0         0        76     59.1
 1.682     0         0        79     65.9
 0        -1.682     0        85     60.0
 0         1.682     0        97     60.7
 0         0        -1.682    55     57.4
 0         0         1.682    81     63.2
 0         0         0        81     59.2
 0         0         0        75     60.4
 0         0         0        76     59.1
 0         0         0        83     60.6
 0         0         0        80     60.8
 0         0         0        91     58.9
A reasonable model for conversion is
$$\hat{y}_1 = 81.09 + 1.03 x_1 + 4.04 x_2 + 6.20 x_3 - 1.83 x_1^2 + 2.94 x_2^2 - 5.19 x_3^2 + 2.13 x_1 x_2 + 11.37 x_1 x_3 - 3.87 x_2 x_3$$
A reasonable model for thermal activity is
$$\hat{y}_2 = 60.23 + 4.26 x_1 + 2.23 x_3$$
Let
• s=1 for conversion and
• s=t=1 for thermal activity.
The Derringer-Suich approach recommends a setting of
$$x_1 = -0.389, \quad x_2 = 1.682, \quad \text{and} \quad x_3 = -0.484.$$
This setting gives a predicted conversion of 95.21 and a predicted thermal activity of 57.50.
The overall desirability for this setting is 0.8720, which is reasonably close to 1.
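As a rough check on these figures, the recommended setting can be plugged into the two prediction equations and the desirabilities computed by hand. The sketch below assumes the coefficients as least squares gives them from the data table (rounded to two decimals), takes $x_1$ and $x_3$ as negative and $x_2$ as positive, and uses s = t = 1; the function names are ours.

```python
def conversion(x1, x2, x3):
    """Fitted second order model for conversion (rounded coefficients)."""
    return (81.09 + 1.03 * x1 + 4.04 * x2 + 6.20 * x3
            - 1.83 * x1 ** 2 + 2.94 * x2 ** 2 - 5.19 * x3 ** 2
            + 2.13 * x1 * x2 + 11.37 * x1 * x3 - 3.87 * x2 * x3)


def thermal_activity(x1, x3):
    """Fitted first order model for thermal activity (rounded coefficients)."""
    return 60.23 + 4.26 * x1 + 2.23 * x3


x1, x2, x3 = -0.389, 1.682, -0.484
y1 = conversion(x1, x2, x3)
y2 = thermal_activity(x1, x3)

# Conversion: maximize, lower bound 80, fully desirable at 100, s = 1.
d1 = min(max((y1 - 80.0) / (100.0 - 80.0), 0.0), 1.0)

# Thermal activity: target 57.5, acceptable between 55 and 60, s = t = 1.
if y2 < 55.0 or y2 > 60.0:
    d2 = 0.0
elif y2 <= 57.5:
    d2 = (y2 - 55.0) / (57.5 - 55.0)
else:
    d2 = (60.0 - y2) / (60.0 - 57.5)

D = (d1 * d2) ** 0.5                      # geometric mean of two desirabilities
print(round(y1, 2), round(y2, 2), round(D, 3))
```

With the coefficients rounded to two decimals, the computed values agree with the reported ones to within rounding.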
Nonlinear Programming Approaches
Jointly optimizing two or more responses when the prediction equations contain second order or higher terms is a standard example of a nonlinear programming problem.
Many spreadsheets have built-in routines for solving these problems, for example the SOLVER routine in Microsoft EXCEL.
The major spreadsheets use good algorithms, usually based on reduced gradients.
We simply need
1. to input the appropriate prediction equations,
2. to input the constraints, and
3. to specify one response as the “key.”
The spreadsheet routine finds the optimal setting.
These routines are not guaranteed to find a solution within the experimental region unless we specify some additional constraints.
For cuboidal experimental regions, i.e., when we use a face centered cube CCD, each $x_j$ must fall within the interval -1 to 1.
In this case, we need the following additional constraints:
$$-1 \le x_1 \le 1, \quad -1 \le x_2 \le 1, \quad \ldots, \quad -1 \le x_k \le 1.$$
For spherical experimental regions, we need the additional constraint
$$\sum_{j=1}^{k} x_j^2 \le k.$$
With these additional constraints, the spreadsheet routine may not find a feasible solution.
When this occurs, we must relax one or more of our constraints in order to find a solution.
We can use the SOLVER routine in Microsoft EXCEL to find optimal conditions.
We use the same second order prediction equations as before.
Recall, we seek to maximize the conversion.
We thus specify conversion, $\hat{y}_1$, as our key response and tell the routine that we want to maximize it.
Since we have a target value of 57.5 for the thermal activity, we specify the following constraint:
$$\hat{y}_2 = 57.5$$
Since this experiment uses a spherical CCD, we need to impose the additional constraint
$$\sum_{j=1}^{3} x_j^2 \le 3$$
The spreadsheet recommends the setting
$$x_1 = -0.429, \quad x_2 = 1.682, \quad \text{and} \quad x_3 = -0.404.$$
This setting gives a conversion of 94.37% and a thermal activity of 57.5.
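For readers without a spreadsheet at hand, the same kind of constrained problem can be explored with a crude grid search. This is only a stand-in for SOLVER's reduced gradient routine, not a reproduction of it: the sketch uses the fact that the thermal activity model is linear, so the constraint $\hat{y}_2 = 57.5$ can be solved for $x_3$, and then scans $x_1$ and $x_2$ on a grid of step 0.01 (an arbitrary choice) inside the sphere $\sum x_j^2 \le 3$. The coefficients are the rounded least squares values, the function names are ours, and the answer need not coincide with the spreadsheet's reported setting.

```python
def conversion(x1, x2, x3):
    """Fitted second order model for conversion (rounded coefficients)."""
    return (81.09 + 1.03 * x1 + 4.04 * x2 + 6.20 * x3
            - 1.83 * x1 ** 2 + 2.94 * x2 ** 2 - 5.19 * x3 ** 2
            + 2.13 * x1 * x2 + 11.37 * x1 * x3 - 3.87 * x2 * x3)


def x3_on_target(x1, target=57.5):
    """Solve 60.23 + 4.26*x1 + 2.23*x3 = target for x3."""
    return (target - 60.23 - 4.26 * x1) / 2.23


best = None
for i in range(-173, 174):                 # x1 from -1.73 to 1.73
    x1 = i / 100.0
    x3 = x3_on_target(x1)                  # enforces the thermal activity target
    for j in range(-173, 174):             # x2 from -1.73 to 1.73
        x2 = j / 100.0
        if x1 * x1 + x2 * x2 + x3 * x3 > 3.0:
            continue                       # outside the spherical region
        y1 = conversion(x1, x2, x3)
        if best is None or y1 > best[0]:
            best = (y1, x1, x2, x3)

print(best)
```

A finer grid, or a proper nonlinear programming routine started from the grid's best point, would sharpen the answer.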
E. Robust Parameter Design
1. Overall Taguchi Philosophy
Consider the manufacture of a ball point pen.
• important characteristic is the fit between the barrel and the cap.
• barrel and the cap are produced by separate injection molding processes.
• How can we produce these barrels and caps such that the fit is “optimal”?
What are the real issues in this problem?
The Japanese would view any part which does not achieve the target value as having some tangible loss of value.
Often, they use a squared error loss function:
$$\text{Loss} = k (y - T)^2$$
Thus, a part may be within specifications and still considered “poor”, just not quite poor enough to be rejected.
Impacts of such a philosophy
1. should seek conditions which minimize the expected “loss”
2. must consider both the mean and the variance
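The second impact follows from taking the expectation of the loss. The short derivation below assumes that $T$ is the target value, $k$ is a positive constant, and $y$ has mean $\mu_y$ and variance $\sigma_y^2$ (the last two symbols are our own notation).

```latex
\begin{aligned}
E\left[k (y - T)^2\right]
  &= k\, E\left[(y - \mu_y)^2 + 2 (y - \mu_y)(\mu_y - T) + (\mu_y - T)^2\right] \\
  &= k \left[\sigma_y^2 + (\mu_y - T)^2\right].
\end{aligned}
```

The expected loss is therefore small only when the variance is small and the mean is close to the target, which is why both must be considered.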
2. Overview of Taguchi's Parameter Design
Fundamental to this approach are the concepts of
1. control factors: factors which the experimenter can readily control.
2. noise factors: factors which the experimenter either cannot or will not directly control in the process. These factors “move” randomly in the actual process, although they can be fixed for the experiment.
Suppose we wish to develop a cake mix “robust” to customer use.
What are possible control factors?
What are possible noise factors?
Goal of parameter design: find the settings for the control factors which are most “robust” to the noise factors.
Taguchi proposes “crossing”:
1. a design for the control factors (inner or control array)
2. a design for the noise factors (outer or noise array)
Each point of the inner array is replicated according to a design in the noise factors called the outer array.
Typically, these designs are “saturated” or “near-saturated”.
For example, suppose we have three control and three noise factors.
Let x1, x2, and x3 represent the control factors.
Let z1, z2, and z3 represent the noise factors.
An appropriate inner array is a $2^{3-1}$ fraction, or

x1    x2    x3
-1    -1    -1
-1     1     1
 1    -1     1
 1     1    -1

Each of these settings is replicated by the outer array.

z1    z2    z3
-1    -1    -1
-1     1     1
 1    -1     1
 1     1    -1

The resulting design consists of 4 x 4 or 16 runs and follows.

x1    x2    x3    z1    z2    z3
-1    -1    -1    -1    -1    -1
-1    -1    -1    -1     1     1
-1    -1    -1     1    -1     1
-1    -1    -1     1     1    -1
-1     1     1    -1    -1    -1
-1     1     1    -1     1     1
-1     1     1     1    -1     1
-1     1     1     1     1    -1
 1    -1     1    -1    -1    -1
 1    -1     1    -1     1     1
 1    -1     1     1    -1     1
 1    -1     1     1     1    -1
 1     1    -1    -1    -1    -1
 1     1    -1    -1     1     1
 1     1    -1     1    -1     1
 1     1    -1     1     1    -1
While the inner and outer arrays are completely saturated, all of the interactions between the control and noise factors are estimable!
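One way to convince oneself of this is to build the crossed array and check that the intercept, the six main effects, and the nine control by noise interaction columns are mutually orthogonal. The Python sketch below does exactly that; the variable names are ours.

```python
from itertools import product

# The 2^(3-1) fraction used for both the inner and the outer array.
half_fraction = [(-1, -1, -1), (-1, 1, 1), (1, -1, 1), (1, 1, -1)]
inner = half_fraction                      # control factors x1, x2, x3
outer = half_fraction                      # noise factors z1, z2, z3

# Crossing: every inner-array run is paired with every outer-array run.
crossed = [x + z for x, z in product(inner, outer)]

# Model columns: intercept, six main effects, nine control x noise interactions.
cols = {"1": [1] * len(crossed)}
for idx, name in enumerate(("x1", "x2", "x3", "z1", "z2", "z3")):
    cols[name] = [run[idx] for run in crossed]
for i in range(3):
    for j in range(3, 6):
        cols[f"x{i + 1}*z{j - 2}"] = [run[i] * run[j] for run in crossed]

names = list(cols)
orthogonal = all(
    sum(a * b for a, b in zip(cols[p], cols[q])) == 0
    for n, p in enumerate(names)
    for q in names[n + 1:]
)
print(len(crossed), len(cols), orthogonal)
```

The 16 columns being mutually orthogonal over 16 runs means every one of these effects can be estimated, although no degrees of freedom remain for error.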
An important question:
Why run the experiment in the noise factors?
• We seek to find the settings in the control factors which are most “robust” to the noise factors.
• Thus, the noise levels ±1 correspond to what?
What is the natural consequence?
How does this contrast with typical experimentation?
All the designs recommended by Taguchi (the so-called “Taguchi designs”) are orthogonal arrays of strength 2.
• allow the estimation of “main effects”
• do not allow the estimation of any interactions.
Examples of orthogonal arrays of strength 2 include:
1. Resolution III fractional factorial designs
2. Plackett-Burman designs.
Three level orthogonal arrays do exist.
• allow the estimation of the linear and pure quadratic terms
• do not allow estimation of the two-factor or higher interactions.
3. Contributions/Drawbacks
The greatest contributions of this total approach are:
1. it seriously considers the variance over a region of interest; and
2. it provides a rationale for modeling the behavior of the noise in terms of the control factors.
Note:
1. RSM “buries” the impact of the noise factors in εi.
2. Taguchi assumes that the variance is not constant over the region of interest!
• A nice insight: it may be possible to model the variance.
• For Taguchi, the variance changes as the result of noise × control interactions.
This is a rather limited approach for modeling the variance, but a start.
Naturally, there are several drawbacks to the Taguchi approach.
1. the sequential nature of investigation is not exploited; (not completely fair)
2. it uses an unnecessarily limited number of designs which do not adequately deal with interactions;
3. better, simpler, and more efficient analyses exist;
4. importance of data transformations seems not to be appreciated or exploited; (definitely unfair)
5. Taguchi uses baffling terminology.
6. the designs used by Taguchi are much larger than really required since they completely cross the noise and the control factors;
7. the Taguchi approach does not go far enough to model the variance.
4. Statistical Alternatives to Robust Parameter Design
I. The “Combined Array” Method
The basic ideas underlying the “combined array” are
• propose a single model in both the control and noise factors
• run a design specifically for the model proposed.
In the process,
• we can estimate some of the control by control interactions
• we can allocate our experimental resources more efficiently,
• in some cases, the resulting designs are significantly smaller.
(Usually, if we use a fractional factorial, the combined array is about the same size as the corresponding crossed array.)
5. Treating the Mean and the Variance as a Dual Response Problem
The basic goal of parameter design is to achieve a target condition for the mean while simultaneously minimizing the variance.
• Suppose that an n point design has been replicated such that each design point has been run a total of m ≥ 2 times.
Any reasonable method may be applied to generate the replication, including the use of an outer array.
• Let $s_i^2$ be the estimated variance at the ith design point.
• Consider modeling the mean by
$$\hat{y}_i = f(x_i)$$
• Consider modeling the variance by
$$t(s_i^2) = g(x_i),$$
where $t(s_i^2)$ is a suitable transformation of the variance.
Much work has been and currently is being done on modeling variance.
• Most authors suggest using the natural logarithm of the variance, $\log(s_i^2)$, but the theory for this transformation requires moderate to large amounts of replication (m ≥ 10) to justify this approach.
• A reasonable alternative uses the standard deviation, which is the square root transformation.
• Other approaches can be employed, including generalized linear models (GLIM).
Two legitimate questions surface:
1. What do we gain by explicitly modeling the variance?
2. What are the consequences of modeling the variance, particularly with regard to estimation?
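The first step in a dual response analysis is simply to reduce the replicates at each design point to a mean and a variance, and then to transform the variance. The sketch below shows that step on hypothetical data (the responses are invented for illustration, with m = 3 runs at each point of a two-factor design); the resulting columns would then be used as the responses in two separate regressions.

```python
import math
from statistics import mean, variance

# Hypothetical replicated responses, m = 3 runs per design point.
replicates = {
    (-1, -1): [10.2, 11.1, 9.8],
    (1, -1): [14.9, 18.2, 12.5],
    (-1, 1): [20.3, 20.9, 19.7],
    (1, 1): [25.0, 31.4, 22.1],
}

summary = {}
for point, ys in replicates.items():
    s2 = variance(ys)                      # sample variance, divisor m - 1
    summary[point] = {
        "mean": mean(ys),                  # response for the mean model
        "s2": s2,
        "sd": math.sqrt(s2),               # square root transformation
        "log_s2": math.log(s2),            # log transformation
    }

for point, row in summary.items():
    print(point, {key: round(val, 3) for key, val in row.items()})
```

With only three replicates per point, each variance estimate is very noisy, which is the practical concern behind the caution about the log transformation above.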
Example: The Printing Study Experiment
The purpose of the experiment was to study the effect of:
• $x_1$: speed
• $x_2$: pressure
• $x_3$: distance
upon a printing machine's ability to apply coloring inks upon package labels.
The experiment used a $3^3$ complete factorial with three runs at each design point (m=3).
Assume that the goal of the experiment is to find the conditions which minimize the variability while achieving a target value of 500.0 for the mean response.
The fitted response surface for the response itself was:
$$\hat{y} = 314.7 + 177 x_1 + 109.4 x_2 + 131.5 x_3 + 66 x_1 x_2 + 75.5 x_1 x_3 + 43.6 x_2 x_3 + 82.8 x_1 x_2 x_3$$
The fitted response surface for the standard deviation was:
$$\hat{\sigma} = 48.0 + 29.2 x_3$$
Consider using the Derringer-Suich desirability approach to minimize $\hat{\sigma}$ over the cube defined by the $3^3$ design subject to the constraints:
1. $\hat{y} = 500$ (acceptable range: 490-510)
2. a maximal acceptable value for $\hat{\sigma}$ of 60.
The resulting best settings are
$x_1 = 1.00$, $x_2 = 1.00$, $x_3 = -0.50$
which yield an estimated standard deviation of 33.4 and a desirability of 0.6662.
Consider using the SOLVER tool in EXCEL to minimize $\hat{\sigma}$ over the cube defined by the $3^3$ design subject to the constraint:
$$\hat{y} = 500$$
The resulting best settings again are
$x_1 = 1.00$, $x_2 = 1.00$, $x_3 = -0.50$
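The recommended setting is easy to check by evaluating the two fitted surfaces. The sketch below assumes the rounded coefficients of the fitted mean and standard deviation models for this study; the function names are ours.

```python
def mean_response(x1, x2, x3):
    """Fitted surface for the mean response (rounded coefficients)."""
    return (314.7 + 177.0 * x1 + 109.4 * x2 + 131.5 * x3
            + 66.0 * x1 * x2 + 75.5 * x1 * x3 + 43.6 * x2 * x3
            + 82.8 * x1 * x2 * x3)


def std_dev(x3):
    """Fitted surface for the standard deviation (rounded coefficients)."""
    return 48.0 + 29.2 * x3


x1, x2, x3 = 1.00, 1.00, -0.50
y_hat = mean_response(x1, x2, x3)
sd_hat = std_dev(x3)
print(round(y_hat, 1), round(sd_hat, 1))
```

With these rounded coefficients the predicted mean is about 500.4, inside the acceptable range of 490 to 510, and the predicted standard deviation is 33.4, matching the value quoted above.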
The advantages of the statistical procedures are:
1. They directly consider the question of interest rather than burying this question in a signal-to-noise ratio.
2. They are standard applications of a basic RSM procedure. As a result,
(a) they allow a sequential investigation;
(b) they allow the use of a broad array of experimental designs; and
(c) they use more rigorous methods of analysis.
IV. Concluding Remarks
This course has introduced the fundamentals of data analysis to engineers.
Students need to understand that this course only offers a beginning.
Other statistical topics of importance to engineers include:
• more model building and model diagnostics (more regression analysis),
• more process control,
• more experimental design,
• reliability, and
• time series.
Many departments offer full courses on each of these topics.
This course gives the student a reasonable foundation for pursuing these more advanced areas in statistics.