
Factorial Experiments - NUS

Date post: 13-Mar-2022

CHAPTER 12

Factorial Experiments

There are several good reasons why the large-scale factorial experiments (the name, factorial experiment, was coined by R. A. Fisher) popular in some disciplines are rarely employed in clinical studies; large is used in the sense of many factors, not necessarily many subjects. The main reason is complexity in execution: if there are p experimental factors with the ith factor having $L_i$ levels, then the total number of treatment combinations is $g = L_1 \times L_2 \times \cdots \times L_p$. This number can be overwhelmingly large if either p or any of the $L_i$'s is even moderate in magnitude. A large number of treatment combinations may not cause serious administrative difficulties when the experimental units are laboratory animals or plots of land, but probably will cause such difficulties when the experimental units are human subjects: the number of criteria for excluding patients from the study increases, the rules for adjusting dosages become more complex, the possible adverse reactions that must be watched for grow in number, etc.
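To give a feel for how quickly the number of treatment combinations grows, the product formula can be sketched in a few lines of Python. The function name and the five-factor design are illustrative assumptions, not part of the chapter.

```python
from math import prod

def treatment_combinations(levels):
    """Return g, the product of the numbers of levels of the p factors."""
    return prod(levels)

# The 2^3 layout analyzed in this chapter: three factors at two levels each.
g_two_cubed = treatment_combinations([2, 2, 2])

# The two-factor layout mentioned for Section 12.3: three levels by four levels.
g_three_by_four = treatment_combinations([3, 4])

# A hypothetical five-factor study with only moderate numbers of levels.
g_five_factors = treatment_combinations([2, 3, 4, 5, 3])

print(g_two_cubed, g_three_by_four, g_five_factors)
```

Even with no factor having more than five levels, the hypothetical five-factor design calls for several hundred distinct treatment combinations.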

Other reasons for clinical experiments' rarely involving many factors include complexity in analysis if the sample sizes for the g treatment combinations vary, and complexity in interpretation if interactions exist. This was seen to be the case in Section 4.2 even for the simplest factorial experiment (two factors, both with two levels) and will be seen to operate to an even greater degree for a more complicated experiment.

Sections 12.1 and 12.2 are devoted to the $2^p$ factorial experiment (p factors each at two levels) with p = 3 for illustrative purposes in both sections. The case of unequal sample sizes is considered in the former section and that of equal sample sizes in the latter. In both sections are repeated examples of and exercises in the estimation of effects and their standard errors. The purpose is to ensure that the reader has a thorough understanding of their meaning and appreciates that these estimates are fundamental to and more important than the sums of squares in the associated analysis of variance table. Section 12.3 is devoted to factorial studies with more than two levels per factor. A study with two factors, one having three levels and the other four, is used for illustration. Section 12.4, finally, is devoted to the fractional replication of a factorial study (i.e., to the administration of only a fraction of all possible treatment combinations). An example is given of its usefulness in studies at the interface of the clinical and social sciences.

As was shown in Problem 5.4, a factorial experiment may be conducted in randomized blocks. Chapter 13 presents some design strategies for the case when the natural or desirable block size is less than the total number of treatment combinations. The reader interested in the design and analysis of more complicated factorial experiments than those considered here may refer to Chapter 5 of Cochran and Cox (1957) and to Chapters 7 and 8 of Federer (1955).

12.1. THE $2^p$ FACTORIAL STUDY, UNEQUAL SAMPLE SIZES

The easiest factorial study to conceive, carry out, analyze, and interpret calls for each of the p factors to have two levels. An example of a factorial study with p = 2 was presented and analyzed in Section 4.2. The data presented there (see Table 4.2) were from only half of the full experiment. Table 12.1 presents individual and summary data for the effects on the weight of the thymus of castration and of adrenalectomy separately for male and for female mice (castration of a female mouse consisted of the removal of her ovaries). The study is formally like a $2^3$ factorial experiment except that only two of the factors, castration (yes or no) and adrenalectomy (yes or no), are bona fide experimental factors whose levels are capable of being randomly assigned to experimental units. The third factor, sex, is a classificatory factor whose levels are obviously incapable of being randomly applied. The present study is therefore like those many clinical experiments in which (approximately) equal numbers of males and females or members of one diagnostic subgroup and another are enrolled in order to ascertain whether different kinds of patients respond differently to the treatments being compared. The effects on the analysis and on the inferences of studying a mixture of experimental and classificatory variables will be discussed later.

Table 12.1. Values of thymus weight (in milligrams) for male and female mice each in a 2 × 2 factorial study

                           Males                     Females
                           Adrenalectomy             Adrenalectomy
Castration   Statistic     No          Yes           No          Yes

No           n             13          10            14          10
             mean          26.231      32.100        35.429      42.300
             s             4.512       7.564         7.783       4.029

Yes          n             14          8             14          12
             mean          37.357      43.000        36.429      46.917
             s             4.845       6.547         3.995       8.702

Individual values:

Castration No, females with adrenalectomy (n = 10):
44 48 47 42 44 44 38 35 39 42

Castration No, the other three cells combined (37 values):
25 19 44 31 50 31 26 25 35 27 48 32 31 32 26 41 35 35
24 35 39 48 33 27 27 32 29 30 20 23 25 35 29 27 21 36 25

Castration Yes, females with adrenalectomy (n = 12):
40 45 52 26 55 38 48 52 46 53 52 56

Castration Yes, the other three cells combined (36 values):
31 41 44 48 34 39 34 30 41 45 36 41 39 38 32 38
32 38 39 34 36 39 44 34 42 30 36 43 45 33 37 43 32 55 35 39

Let $n_{ijk}$, $\bar{X}_{ijk}$, and $s_{ijk}$ denote the sample size, mean, and standard deviation for level i of the first factor (here, i = 1 for males, i = 2 for females), level j of the second (here, j = 1 for no castration, j = 2 for castration), and level k of the third (here, k = 1 for no adrenalectomy, k = 2 for adrenalectomy) of p = 3 factors. The standard deviations in Table 12.1 do not seem to vary in any systematic way, so no transformation of the data is necessary.

The value of the pooled variance is

$$s^2 = \frac{\sum_i \sum_j \sum_k (n_{ijk} - 1)\, s_{ijk}^2}{\sum_i \sum_j \sum_k (n_{ijk} - 1)} = 38.3730 \tag{12.1}$$

with $\sum_i \sum_j \sum_k (n_{ijk} - 1) = 87$ degrees of freedom.

12.1.1. Three-Way Interaction

The analysis appropriately begins with an examination of the $2^p$ means for any evidence of all p factors' interacting, a so-called p-way interaction. Here, the assessment of three-way interaction sensibly proceeds as follows. For males, the estimated interaction between adrenalectomy and castration, say $E_{AC|M}$, is the contrast

$$\begin{aligned}
E_{AC|M} &= (\bar{X}_{AC|M} - \bar{X}_{\bar{A}C|M}) - (\bar{X}_{A\bar{C}|M} - \bar{X}_{\bar{A}\bar{C}|M}) \\
&= \bar{X}_{122} - \bar{X}_{121} - \bar{X}_{112} + \bar{X}_{111} \\
&= 43.000 - 37.357 - 32.100 + 26.231 = -0.226.
\end{aligned} \tag{12.2}$$

The notation $E_{AC}$ for the interaction effect and the notation $A$, $\bar{A}$, and so on for the two levels of a factor is the same as in Section 4.2. Necessarily, the estimated interaction in (12.2) is the same as that obtained in (4.10), except for the different numbers of decimal places to which the means were reported.

For females, the estimated interaction between adrenalectomy and castration is

$$\begin{aligned}
E_{AC|F} &= \bar{X}_{222} - \bar{X}_{221} - \bar{X}_{212} + \bar{X}_{211} \\
&= 46.917 - 36.429 - 42.300 + 35.429 = 3.617.
\end{aligned} \tag{12.3}$$

The estimated three-way interaction, say $E_{SAC}$ (S for sex), is the difference between the two separate two-way interactions,

$$E_{SAC} = E_{AC|F} - E_{AC|M} = 3.617 - (-0.226) = 3.843. \tag{12.4}$$

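The estimated variance of the contrast $E_{AC|M}$ is

$$s^2 \sum_j \sum_k \frac{1}{n_{1jk}} = \frac{s^2}{w_{AC|M}},$$

where

$$w_{AC|M} = \left( \sum_j \sum_k \frac{1}{n_{1jk}} \right)^{-1} = \left( \frac{1}{13} + \frac{1}{10} + \frac{1}{14} + \frac{1}{8} \right)^{-1} = 2.6784, \tag{12.5}$$

and the estimated variance of $E_{AC|F}$ is $s^2 / w_{AC|F}$, where

$$w_{AC|F} = \left( \sum_j \sum_k \frac{1}{n_{2jk}} \right)^{-1} = \left( \frac{1}{14} + \frac{1}{10} + \frac{1}{14} + \frac{1}{12} \right)^{-1} = 3.0657. \tag{12.6}$$

The estimated standard error of $E_{SAC}$ is then

$$\begin{aligned}
\operatorname{se}(E_{SAC}) &= \sqrt{s^2 \left( \frac{1}{w_{AC|M}} + \frac{1}{w_{AC|F}} \right)} \\
&= \sqrt{38.3730 \left( \frac{1}{2.6784} + \frac{1}{3.0657} \right)} = 5.181.
\end{aligned} \tag{12.7}$$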

The ratio of $E_{SAC}$ to its estimated standard error is

$$L_{SAC} = \frac{3.843}{5.181} = 0.74 \tag{12.8}$$

with 87 degrees of freedom, so there is no statistical evidence of an interaction involving all three factors.

And fortunate that is. The interpretation of three-way and higher-order interactions is no trivial undertaking. A three-way interaction, for example, involves a difference between differences of differences. Suppose the ratio in (12.8) had been statistically significant. The description of the interaction might have been as follows. "For males, the effect of castration when performed along with adrenalectomy is essentially the same as its effect when performed without adrenalectomy. For females, on the other hand, the effect of castration is greater when performed along with adrenalectomy than when performed alone. The difference in the differential effect of castration is significant." The third level of difference in this statement is implicit in the notion of an experimental effect: the "effect" of castration is measured as the difference between the mean response to castration and the mean response to its control operation (which consisted of anesthetizing and cutting into an animal, but not of removing any tissue).
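The arithmetic of this section, from the pooled variance through the ratio of the three-way interaction to its standard error, can be retraced with a short Python sketch. The dictionary layout and function names are illustrative assumptions; the sample sizes, means, and standard deviations are the summary values of Table 12.1, with the standard deviation of the cell for uncastrated, adrenalectomized females taken as 4.029, the value recomputed from that cell's ten listed weights.

```python
from math import sqrt

# Cell summaries from Table 12.1, keyed by (i, j, k) with
# i = sex (1 male, 2 female), j = castration (1 no, 2 yes),
# k = adrenalectomy (1 no, 2 yes). Each value is (n, mean, sd).
CELLS = {
    (1, 1, 1): (13, 26.231, 4.512), (1, 1, 2): (10, 32.100, 7.564),
    (1, 2, 1): (14, 37.357, 4.845), (1, 2, 2): (8, 43.000, 6.547),
    (2, 1, 1): (14, 35.429, 7.783), (2, 1, 2): (10, 42.300, 4.029),
    (2, 2, 1): (14, 36.429, 3.995), (2, 2, 2): (12, 46.917, 8.702),
}

def pooled_variance(cells):
    """Pooled within-cell variance and its degrees of freedom, as in (12.1)."""
    df = sum(n - 1 for n, _, _ in cells.values())
    ss = sum((n - 1) * sd ** 2 for n, _, sd in cells.values())
    return ss / df, df

def ac_interaction(cells, sex):
    """Castration-by-adrenalectomy contrast and its weight for one sex."""
    def mean(j, k):
        return cells[(sex, j, k)][1]
    estimate = mean(2, 2) - mean(2, 1) - mean(1, 2) + mean(1, 1)
    weight = 1.0 / sum(1.0 / cells[(sex, j, k)][0]
                       for j in (1, 2) for k in (1, 2))
    return estimate, weight

s2, df = pooled_variance(CELLS)
e_m, w_m = ac_interaction(CELLS, sex=1)   # males, as in (12.2) and (12.5)
e_f, w_f = ac_interaction(CELLS, sex=2)   # females, as in (12.3)
e_sac = e_f - e_m                         # three-way interaction, as in (12.4)
se_sac = sqrt(s2 * (1.0 / w_m + 1.0 / w_f))
ratio = e_sac / se_sac

print(f"pooled variance = {s2:.4f} on {df} df")
print(f"E(AC|M) = {e_m:.3f}, w = {w_m:.4f}")
print(f"E(AC|F) = {e_f:.3f}, w = {w_f:.4f}")
print(f"E(SAC) = {e_sac:.3f}, se = {se_sac:.3f}, ratio = {ratio:.2f}")
```

Working from the cell summaries rather than from intermediate rounded values keeps each step traceable to the table; small differences in the last decimal place relative to the text would only reflect rounding.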

12.1.2. Two-Way Interactions

The failure to find a significant three-way interaction also has as a fortunate consequence the fact that the several two-way interactions may be analyzed and interpreted relatively simply. For example, it is valid to analyze the data for evidence of an interaction between the two kinds of surgery without complicating matters by taking sex into account. Indeed, the analysis just performed leads directly to an analysis of the interaction between castration and adrenalectomy.

The estimated interaction effect, say $\bar{E}_{AC}$, is a weighted average of the two sex-specific estimates in (12.2) and (12.3),

$$\bar{E}_{AC} = \frac{w_{AC|M} E_{AC|M} + w_{AC|F} E_{AC|F}}{w_{AC|M} + w_{AC|F}}, \tag{12.9}$$

where the weighting factors are given in (12.5) and (12.6). Thus

$$\bar{E}_{AC} = \frac{2.6784 \times (-0.226) + 3.0657 \times 3.617}{2.6784 + 3.0657} = 1.825. \tag{12.10}$$

On the average, the effect on the weight of the thymus of castration when performed along with adrenalectomy is about 1.8 milligrams greater than its effect when performed alone. Equivalently, the average effect of adrenalectomy when performed along with castration is about 1.8 milligrams greater than its effect when performed alone.

The estimated standard error of $\bar{E}_{AC}$ is

$$\operatorname{se}(\bar{E}_{AC}) = \sqrt{\frac{s^2}{w_{AC|M} + w_{AC|F}}} = \sqrt{\frac{38.3730}{5.7441}} = 2.585 \tag{12.11}$$

and

$$L_{AC} = \frac{\bar{E}_{AC}}{\operatorname{se}(\bar{E}_{AC})} = \frac{1.825}{2.585} = 0.71 \tag{12.12}$$

with 87 degrees of freedom. There is obviously no statistical evidence of an interaction between the two kinds of surgery, and therefore inferences about the effects of one may be made irrespective of the level of the other.
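The weighted average in (12.9) and its standard error in (12.11) amount to combining independent estimates with weights inversely proportional to their variances. A minimal sketch follows, assuming the rounded estimates and weights quoted in the text as inputs; the helper name `weighted_average` is hypothetical.

```python
from math import sqrt

def weighted_average(estimates, weights, s2):
    """Combine independent estimates whose variances are s2 / weight."""
    total_weight = sum(weights)
    estimate = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    se = sqrt(s2 / total_weight)
    return estimate, se

S2 = 38.3730                                  # pooled variance from (12.1)
sex_specific_estimates = [-0.226, 3.617]      # males (12.2), females (12.3)
sex_specific_weights = [2.6784, 3.0657]       # weights quoted with (12.9)

e_ac, se_ac = weighted_average(sex_specific_estimates, sex_specific_weights, S2)
l_ac = e_ac / se_ac

print(f"E(AC) = {e_ac:.3f}, se = {se_ac:.3f}, ratio = {l_ac:.2f}")
```

The same helper applies unchanged to any set of independent contrasts that share the pooled variance, which is how the remaining weighted estimates in Section 12.1 are built.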

Such is not the case for one of the two-way interactions involving sex. The reader is asked in Problem 12.1 to work through the algebra for analyzing the interaction between sex and castration, and the interaction between sex and adrenalectomy. The latter is found there to be small and statistically nonsignificant: $E_{SA} = 2.916$ with an estimated standard error of 2.590. It is therefore valid, as in Section 12.1.3, to make inferences about the effect of adrenalectomy, without having to take account of sex or castration.

A sizable and statistically highly significant interaction exists between sex and castration, however. The estimated interaction effect is

$$E_{SC} = -8.526, \tag{12.13}$$

about 3⅓ times its estimated standard error of

$$\operatorname{se}(E_{SC}) = 2.554. \tag{12.14}$$

On the average, the effect of castration on the weight of the thymus for females is over 8.5 milligrams less than the effect of castration on the weight of the thymus for males. There is nothing implied in the description of interaction about the effect of castration being statistically significant (or not) for females or about the effect being statistically significant (or not) for males. Whatever the two effects are, so far they are only known to differ significantly from each other.

The subsequent analysis of the effect of castration could take one of two forms, both of which will be illustrated in Section 12.1.3.

1. Not only is castration of a male a physically different kind of operation from castration of a female, it turns out statistically to have different consequences with respect to the response variable. It therefore seems appropriate to examine the effect of castration separately for males and for females. It makes little physical sense to inquire into an average effect of castration, with the average taken over males and females. A similar decision concerning the subsequent analysis (analyze the effect of one factor separately within levels of the other) would be made if a significant interaction were found between two experimental factors rather than, as here, between an experimental and a classificatory factor.

2. Suppose that the significant interaction was between sex and adrenalectomy, not between sex and castration. Adrenalectomy is an operation that is physically the same for males and females, and thus it may make physical sense to ask, "What is the average effect of adrenalectomy on mice in general, bearing in mind that the effect is not the same for the two sexes?" A similar decision concerning the subsequent analysis (analyze the average effect of an experimental factor, averaged across levels of a classificatory factor, even if there is interaction) might be made in a clinical experiment. For example, suppose that an interaction exists between subtype of depression (so-called neurotic versus so-called psychotic depression) and treatment (Imipramine versus placebo). It may make clinical sense to inquire into the average effect of treatment on depressed patients in general and indeed an answer is possible. In fact, as will be seen shortly, infinitely many answers exist.

12.1.3. Main Effects

This section is devoted to making inferences about the effects of single factors, either averaged over or specific to levels of the other factors. Little space will be devoted to the effect of sex. Neither in the present example nor in most factorial studies with one or more classificatory factors controlled will there be a question of whether one kind of subject differs on the average from another on the response variable. Average differences are known a priori to exist, and the only serious questions pertain to whether the effects of treatments are the same for different kinds of subjects. All that is informative about sex in the study being analyzed is already known: the effect of adrenalectomy seems to be the same for males and females, but the effect of castration is not.

Thanks to the absence of interaction between adrenalectomy and either of the other factors, the inferences about the effect of adrenalectomy are straightforward. Four independent estimates of its effect are available, one for each sex and each level of castration. The reader is asked in Problem 12.2 to confirm that the optimally weighted average of the four estimates is, say,

$$E_A = 7.368, \tag{12.15}$$

over 5½ times its estimated standard error of

$$\operatorname{se}(E_A) = 1.291. \tag{12.16}$$

The effect of adrenalectomy is statistically highly significant and strong. On the average, adrenalectomy increases the weight of a mouse's thymus nearly 7.4 milligrams no matter what the mouse's sex and irrespective of a concomitant castration. A 95% confidence interval for the underlying effect, say $\Delta_A$, is $E_A \pm 1.99\,\operatorname{se}(E_A)$, or

$$4.80 \le \Delta_A \le 9.94. \tag{12.17}$$

At the lower limit, the effect of adrenalectomy is moderate; at the upper limit, the effect is strong.
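One way to arrive at the figures in (12.15) through (12.17) is sketched below: each of the four sex-by-castration strata supplies a difference between the adrenalectomy and no-adrenalectomy means, weighted by the reciprocal of 1/n + 1/n for its two cells. This weighting is assumed here by analogy with (12.9); the chapter leaves the algebra to Problem 12.2, so the sketch is an illustration rather than the book's worked solution.

```python
from math import sqrt

S2 = 38.3730   # pooled variance from (12.1)
T = 1.99       # multiplier used in the text for 95% limits on 87 df

# (n, mean) without and with adrenalectomy, per stratum, from Table 12.1.
STRATA = {
    "males, no castration":   ((13, 26.231), (10, 32.100)),
    "males, castration":      ((14, 37.357), (8, 43.000)),
    "females, no castration": ((14, 35.429), (10, 42.300)),
    "females, castration":    ((14, 36.429), (12, 46.917)),
}

def pooled_effect(strata, s2):
    """Inverse-variance weighted average of the stratum-specific differences."""
    numerator = 0.0
    total_weight = 0.0
    for (n_without, mean_without), (n_with, mean_with) in strata.values():
        weight = 1.0 / (1.0 / n_without + 1.0 / n_with)
        numerator += weight * (mean_with - mean_without)
        total_weight += weight
    return numerator / total_weight, sqrt(s2 / total_weight)

e_a, se_a = pooled_effect(STRATA, S2)
lower, upper = e_a - T * se_a, e_a + T * se_a

print(f"E(A) = {e_a:.3f}, se = {se_a:.3f}")
print(f"95% interval: {lower:.2f} to {upper:.2f}")
```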

According to Decision 1 concerning the effect of castration, the effect would be analyzed separately for males and for females (see Problem 12.3). For males, the estimated effect is, say,

$$E_{C|M} = 11.036 \tag{12.18}$$

with an estimated standard error of

$$\operatorname{se}(E_{C|M}) = 1.852. \tag{12.19}$$

Not only is the effect of castration statistically highly significant for males [$E_{C|M}/\operatorname{se}(E_{C|M}) = 5.96$], the values covered by the confidence interval $E_{C|M} \pm 1.99 \times \operatorname{se}(E_{C|M})$, or

$$7.35 \le \Delta_{C|M} \le 14.72, \tag{12.20}$$

all suggest a strong effect.

For females, the estimated effect of castration is much less than for males and is statistically nonsignificant:

$$E_{C|F} = 2.584 \tag{12.21}$$

with an estimated standard error of

$$\operatorname{se}(E_{C|F}) = 1.755. \tag{12.22}$$

A confidence interval for the underlying effect in females is $2.584 \pm 1.99 \times 1.755$, or

$$-0.91 \le \Delta_{C|F} \le 6.08. \tag{12.23}$$

The data are consistent with a possibly strong and positive effect of castration for females but are also consistent with no or even a negative effect. Perhaps the only unequivocal statement possible about the effect of castration for females is that it is significantly less than the effect for males (this follows from the significant interaction between sex and castration). The data are insufficient to establish either the reality (in the sense of different from zero) or the direction of the effect for females.

12.1.4. Averaging over Levels of a Classificatory Factor

If Decision 2 were adopted for proceeding with inferences about the effect of castration, the investigator would be obliged to specify or to estimate what the distribution was across the levels of the classificatory factor in the population to which the inferences would apply. Rarely in a factorial clinical experiment will the distribution in the sample mirror the distribution in the population of interest. The current example of an experiment involving mice is not ideal for illustrating the point, but the study cited earlier of the effect of Imipramine (versus placebo) for neurotic and for psychotic depressives is helpful. Such a study would call for the enrollment of as nearly equal numbers of the two kinds of patients as possible regardless of the ratio of neurotic to psychotic depressives in the population from which the patients were drawn. (Given that the aims of a study with diagnostic subtype controlled as a factor are to test whether an interaction exists and, if necessary, to make inferences about the effect of treatment separately for each subtype, equal numbers of patients from the different subtypes are statistically optimal.)

In the clinical world in which knowledge about the effect of Imipramine will be put to use, a distribution that is other than even is likely. Assume for the sake of illustration that the numerical results that have been obtained for the effects of castration pertain to Imipramine and apply to psychotic depressives (instead of to male mice) and to neurotic depressives (instead of to female mice). If P is the proportion of depressives in a particular population who are of the psychotic subtype (P will be close to 1 in many psychiatric hospitals and close to 0 in many community mental health centers) and if 1 - P is the proportion who are of the neurotic subtype, then the estimate of the effect of Imipramine in that population is, say,

$$E_C = P \cdot E_{C|M} + (1 - P) \cdot E_{C|F} \tag{12.24}$$

with an estimated standard error of

$$\operatorname{se}(E_C) = \sqrt{P^2 \left(\operatorname{se}(E_{C|M})\right)^2 + (1 - P)^2 \left(\operatorname{se}(E_{C|F})\right)^2}. \tag{12.25}$$

Figure 12.1 shows how the estimated effect and the associated confidence limits,

$$E_C \pm 1.99\,\operatorname{se}(E_C), \tag{12.26}$$

vary as a function of P. In any population for which P < 0.08, the confidence interval includes zero and thus no statistically significant average effect of treatment may be asserted to exist. If the population is such that P ≥ 0.08, however, a statistically significant positive effect exists. In spite of tests performed by packaged programs (see Section 12.1.5) for the statistical significance of the purportedly single main or average effect of a treatment that interacts with a classificatory factor, there are actually infinitely many average effects, one for each proportionate distribution of a population across levels of that factor. It is for the investigator to estimate the distribution and to describe the kinds of clinical populations characterized by that distribution.

Figure 12.1. Estimated average effect of factor "C" as a function of the proportion in category "M", and 95% confidence limits for the effect. (Horizontal axis: P, the proportion in the "M" category, from 0 to 1.0.)
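The dependence of the average effect on P described by (12.24) through (12.26) can be tabulated directly. The sketch below takes the sex-specific estimates and standard errors from (12.18) through (12.22) as given; the function name and the grid of P values are illustrative choices.

```python
from math import sqrt

E_M, SE_M = 11.036, 1.852   # category "M": estimate (12.18) and se (12.19)
E_F, SE_F = 2.584, 1.755    # category "F": estimate (12.21) and se (12.22)
T = 1.99                    # multiplier used in the text for 95% limits

def average_effect(p):
    """Estimate and 95% limits when a proportion p of the population is in 'M'."""
    estimate = p * E_M + (1.0 - p) * E_F                         # (12.24)
    se = sqrt(p ** 2 * SE_M ** 2 + (1.0 - p) ** 2 * SE_F ** 2)   # (12.25)
    return estimate, estimate - T * se, estimate + T * se

for p in (0.0, 0.07, 0.08, 0.25, 0.5, 0.75, 1.0):
    estimate, lower, upper = average_effect(p)
    print(f"P = {p:.2f}: estimate = {estimate:6.3f}, "
          f"limits = ({lower:6.3f}, {upper:6.3f})")
```

Scanning the printed lower limits shows where the interval stops covering zero, which is the crossover the text places near P = 0.08.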

12.1.5. The Analysis of Variance Table

Presented throughout the preceding sections were the several quantities needed to determine the first set of sums of squares in Table 12.2. For any factorial effect whose ratio of estimated effect to estimated standard error is L, the sum of squares is equal to

$$\text{SS for effect} = s^2 \cdot L^2. \tag{12.27}$$
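Formula (12.27) is easy to apply to every effect at once. The sketch below assumes the rounded Type II estimates and standard errors reported in this section (and collected in Table 12.3) as inputs; the dictionary and function names are illustrative.

```python
S2 = 38.3730  # pooled variance from (12.1)

# Effect: (estimate, estimated standard error), as reported in the text.
EFFECTS = {
    "Castration": (6.583, 1.274),
    "Adrenalectomy": (7.368, 1.291),
    "Castration x adrenalectomy": (1.825, 2.585),
    "Sex": (5.323, 1.277),
    "Castration x sex": (-8.526, 2.554),
    "Adrenalectomy x sex": (2.916, 2.590),
    "Castration x adrenalectomy x sex": (3.843, 5.181),
}

def sum_of_squares(estimate, se, s2=S2):
    """Sum of squares s^2 * L^2, where L = estimate / se, as in (12.27)."""
    ratio = estimate / se
    return s2 * ratio ** 2

ss = {name: sum_of_squares(est, se) for name, (est, se) in EFFECTS.items()}

for name, value in ss.items():
    print(f"{name:34s} {value:10.4f}")
```

Because each effect has one degree of freedom, these sums of squares are also the mean squares for the effects.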

Also presented in Table 12.2 are the sums of squares (rounded down to four decimal places from the original eight) for the GLM Type II and GLM Type III analyses of the SAS package (1982). The Type II sums of squares are all calculated using the same principles that were applied here; the two sets of sums of squares in the table differ only because of rounding errors.

Table 12.2. Mean squares for the data in Table 12.1 using formula (12.27) and from SAS's GLM Types II and III

                                              Mean Squares
Source of Variation                  df   From (12.27)   Type II       Type III

Castration                            1   1,024.5534     1,024.5266    1,092.3398
Adrenalectomy                         1   1,249.8920     1,250.8200    1,191.5937
Castration x adrenalectomy            1      19.1263        19.1251       16.4309
Sex                                   1     666.7406       666.9327      716.3663
Castration x sex                      1     427.6364       427.7044      384.9353
Adrenalectomy x sex                   1      48.6409        48.6188       48.8784
Castration x adrenalectomy x sex      1      21.1125        21.1123       21.1123
Within cells                         87      38.3730        38.3712       38.3712

A rather different principle underlies the GLM Type III analysis (see Sections 6.4 and 6.5): unweighted rather than weighted averages of differences between means are calculated and tested for statistical significance. Consider, for example, the Type III sum of squares for interaction between adrenalectomy and castration. The unweighted average of the two sex-specific estimated interactions [see (12.2) and (12.3)] is, say,

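$$\dot{E}_{AC} = \tfrac{1}{2}(E_{AC|M} + E_{AC|F}) = \tfrac{1}{2}(-0.226 + 3.617) = 1.696 \tag{12.28}$$

with a variance proportional to, say,

$$\frac{1}{\dot{w}_{AC}} = \frac{1}{4}\left( \frac{1}{w_{AC|M}} + \frac{1}{w_{AC|F}} \right) = \tfrac{1}{4}(0.3734 + 0.3262) = 0.1749 \tag{12.29}$$

[see (12.5) and (12.6)]. The Type III sum of squares is equal to

$$\dot{w}_{AC} \cdot \dot{E}_{AC}^2 = \frac{1}{0.1749}(1.696)^2 = 16.4461, \tag{12.30}$$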

which is the same, to one decimal place, as the value in Table 12.2.

The reader will notice in Table 12.2 that, while different for all effects save the three-way interaction, the Type II and Type III sums of squares are not so different that they would lead to opposite inferences about statistical significance. On the contrary, their percentage differences, 100 × (larger SS - smaller SS)/smaller SS, are modest. At least as important is the fact that the two methods yield similar estimates of the several factorial effects. Table 12.3 presents these estimates as well as their estimated standard errors for all the factorial effects. Problem 12.4 asks the reader to confirm the correctness of the Type III estimate of the main effect of adrenalectomy.

Table 12.3. Type II and Type III estimates of all factorial effects, and their estimated standard errors, for the data in Table 12.1

                                        Type II              Type III
Effect                               Estimate    se       Estimate    se

Castration                             6.583    1.274       6.911    1.295
Adrenalectomy                          7.368    1.291       7.218    1.295
Castration x adrenalectomy             1.825    2.585       1.696    2.591
Sex                                    5.323    1.277       5.597    1.295
Castration x sex                      -8.526    2.554      -8.204    2.591
Adrenalectomy x sex                    2.916    2.590       2.924    2.591
Castration x adrenalectomy x sex       3.843    5.181       3.843    5.181

The estimated effects are identical for the three-way interaction and are similar (one might even say identical for all practical purposes) for the remaining factorial effects. The standard errors of the Type III estimates are all greater than or equal to the standard errors of the Type II estimates, an inequality that is true not only in this example but in general. This inequality follows from the Type III estimates' being unweighted averages of differences between means, but the Type II estimates' being weighted averages, with weights specifically intended to minimize the variance of the final estimate (see Problem 4.4). The magnitudes of the Type III estimates sometimes happen, by chance, to exceed the magnitudes of the Type II estimates. It is therefore impossible to predict beforehand for which type of estimate the ratio of estimated effect to estimated standard error will be larger.

The reader will notice in Table 12.3 that none of the estimated standard errors of the Type II estimates are equal. For the Type III estimates, on the other hand, there are not only some equalities but also patterns to the estimated standard errors when they are unequal. In particular, the standard error of the estimated three-way interaction is

estimated to be

$$\operatorname{se}(\dot{E}_{3\text{-way}}) = s \sqrt{\sum_i \sum_j \sum_k \frac{1}{n_{ijk}}}, \tag{12.31}$$

that of each estimated two-way interaction is estimated to be

$$\operatorname{se}(\dot{E}_{2\text{-way}}) = \frac{s}{2} \sqrt{\sum_i \sum_j \sum_k \frac{1}{n_{ijk}}}, \tag{12.32}$$

and that of each estimated main effect is estimated to be

$$\operatorname{se}(\dot{E}_{\text{main}}) = \frac{s}{4} \sqrt{\sum_i \sum_j \sum_k \frac{1}{n_{ijk}}}. \tag{12.33}$$
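The unweighted (Type III) estimates and the standard errors in (12.31) through (12.33) can be generated for all seven effects from a single sign rule: each cell mean enters with the product of the signs of the factors involved in the effect, and the resulting contrast is divided by 4 for a main effect, 2 for a two-way interaction, and 1 for the three-way interaction. The sketch below is an illustration under that reading, with hypothetical names and the Table 12.1 summaries as data; it is not the SAS algorithm itself.

```python
from math import sqrt

S2 = 38.3730  # pooled variance from (12.1)

# (n, mean) keyed by (sex, castration, adrenalectomy); level 1 or 2 as in the text.
CELLS = {
    (1, 1, 1): (13, 26.231), (1, 1, 2): (10, 32.100),
    (1, 2, 1): (14, 37.357), (1, 2, 2): (8, 43.000),
    (2, 1, 1): (14, 35.429), (2, 1, 2): (10, 42.300),
    (2, 2, 1): (14, 36.429), (2, 2, 2): (12, 46.917),
}
SEX, CASTRATION, ADRENALECTOMY = 0, 1, 2   # positions within each key
SIGN = {1: -1, 2: +1}                      # level 1 enters negatively

def type3_effect(factors, cells=CELLS, s2=S2):
    """Unweighted estimate and standard error for the effect of `factors`."""
    contrast = 0.0
    for key, (_, mean) in cells.items():
        sign = 1
        for position in factors:
            sign *= SIGN[key[position]]
        contrast += sign * mean
    divisor = 2 ** (3 - len(factors))      # 4, 2, or 1
    se = sqrt(s2 * sum(1.0 / n for n, _ in cells.values())) / divisor
    return contrast / divisor, se

results = {
    "Castration": type3_effect([CASTRATION]),
    "Adrenalectomy": type3_effect([ADRENALECTOMY]),
    "Castration x adrenalectomy": type3_effect([CASTRATION, ADRENALECTOMY]),
    "Sex": type3_effect([SEX]),
    "Castration x sex": type3_effect([CASTRATION, SEX]),
    "Adrenalectomy x sex": type3_effect([ADRENALECTOMY, SEX]),
    "Castration x adrenalectomy x sex": type3_effect([SEX, CASTRATION, ADRENALECTOMY]),
}

for name, (estimate, se) in results.items():
    print(f"{name:34s} {estimate:8.3f} {se:7.3f}")
```

The output can be set beside the Type III columns of Table 12.3; agreement is expected to within rounding in the third decimal place.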

The major advantage of a Type II over a Type III analysis is that, in the absence of interaction, the estimated main effects are unbiased and have the smallest possible variance. In the absence of higher-order interaction, the estimated lower-order interactions are unbiased and have the smallest possible variance. The major practical advantage of a Type III over a Type II analysis (when the analysis is performed by hand) is simplicity. Its major theoretical advantages (see Nelder, 1977; Speed, Hocking, and Hackney, 1978; and Yates, 1934) concern the interpretation of its estimated main effects and interactions in the presence of higher-order interactions. The position taken here is that, in the presence of interaction, it must be the investigator and not a general-purpose algorithm that decides how the analysis should proceed. Given the results of the most important comparison between the two types of analyses, namely that the estimated effects, standard errors, and sums of squares will be close when there are only random departures from the desired equal sample sizes, it usually makes little practical difference which type is selected for use.

12.2. THE $2^p$ FACTORIAL STUDY, EQUAL SAMPLE SIZES

The analysis of the data from a factorial study (but not necessarily the interpretation of the results) simplifies greatly when the sample sizes are all equal to a common n. One example of the simplification that results from equal sample sizes is the identity of a Type II and a Type III analysis. Others will be given throughout this section. Because so much attention has already been paid to it, the illustrative example used in Section 12.1 will be used again here, with equal sample sizes produced by selecting for analysis, in each of the eight cells in Table 12.1, the first n = 8 measurements and by discarding the remaining ones. This device was adopted solely for the sake of continuity with Section 12.1. Under no circumstances in practice should any measurements be discarded in order to produce equal sample sizes.

The means and standard deviations of the resulting measurements are presented in Table 12.4; they differ only little from the corresponding

Table 12.4. Summary values of thymus weight (in milligrams) derived from Table 12.1 by discarding all measurements after the

eighth from each cell

Males Females Adrenalectomy Adrenalectomy

Castration No Yes No Yes

X 111 == 24.875 X 112 =31.625 x211 = 39.ooo x212=42.125

No s111 = 3.907 s112 == 7.633 s211 == 8.350 S212 =4.518

)(121 =37.125 X122 == 43.000 X221 == 36.750 x222 = 49.250

Yes s121 = 4.518 s122 == 6.547 s221 = 4.713 s222 = 5.471

320

FACTORIAL EXPERIMENTS

Table 12.5. Point estimates, estimated standard errors, and mean squares for all factorial effects for the values in Table 12.4

Effect                                 Estimate    se        MS (df)
Castration                              7.1250     1.4766    812.2500 (1)
Adrenalectomy                           7.0625     1.4766    798.0625 (1)
Castration x adrenalectomy              4.2500     2.9532     72.2500 (1)
Sex                                     7.6250     1.4766    930.2500 (1)
Castration x sex                       -9.3750     2.9532    351.5625 (1)
Adrenalectomy x sex                     1.5000     2.9532      9.0000 (1)
Castration x adrenalectomy x sex       10.2500     5.9064    105.0625 (1)
Within cells                                                  34.8852 (56)

values in Table 12.1. The value of the pooled variance is

$$s^2 = 34.8852$$

with 56 degrees of freedom. The estimated factorial effects, their estimated standard errors, and their associated mean squares appear in Table 12.5. Because each mean square for a factorial effect has 1 degree of freedom, the mean squares and sums of squares are identical. The reader may check that the sum of the seven sums of squares for the factorial effects is equal to the sum of squares with 7 degrees of freedom measuring the variability among the eight means, $8 \sum\sum\sum (\bar X_{ijk} - \bar X_{...})^2 = 3{,}078.4375$. This equality holds only because the sample sizes are equal.

Because of the reduced sample sizes in the cells, from an average of nearly 12 in Table 12.1 to a constant of 8 in Table 12.4, the standard errors in Table 12.5 are larger than those in Table 12.3. The inferences about the factorial effects from Table 12.5, however, are the same as those based on the entries in Tables 12.2 and 12.3. The remainder of this section is devoted to the description and application of three different numerical procedures for calculating the summary statistics for a $2^p$ factorial study.
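The pooled variance and the among-means sum of squares quoted above can be checked with a few lines of Python. This is only a sketch: the dictionary layout and the variable names are assumptions made for illustration, and the numbers are the rounded entries of Table 12.4.

```python
# Cell means and standard deviations from Table 12.4, keyed by
# (sex, castration, adrenalectomy) with 1 = male / no and 2 = female / yes.
means = {
    (1, 1, 1): 24.875, (1, 1, 2): 31.625, (2, 1, 1): 39.000, (2, 1, 2): 42.125,
    (1, 2, 1): 37.125, (1, 2, 2): 43.000, (2, 2, 1): 36.750, (2, 2, 2): 49.250,
}
sds = {
    (1, 1, 1): 3.907, (1, 1, 2): 7.633, (2, 1, 1): 8.350, (2, 1, 2): 4.518,
    (1, 2, 1): 4.518, (1, 2, 2): 6.547, (2, 2, 1): 4.713, (2, 2, 2): 5.471,
}
n = 8  # common sample size per cell

# With equal sample sizes the pooled variance is the plain average of the cell variances.
pooled_variance = sum(s ** 2 for s in sds.values()) / len(sds)
df_within = len(sds) * (n - 1)

# Sum of squares (7 df) measuring the variability among the eight cell means.
grand_mean = sum(means.values()) / len(means)
ss_among = n * sum((m - grand_mean) ** 2 for m in means.values())

print(round(pooled_variance, 4), df_within, ss_among)
```

The last value can be compared with the total of the seven mean squares listed in Table 12.5.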

12.2.1. Procedure 1 (Simple Averages of Means)

Thanks to the equality of the sample sizes, the following procedure provides all of the quantities needed for making inferences about the interaction effects and about the main effects when the factors do not interact. The p-way interaction may be estimated by considering any of the $(p-1)$-way interactions, and by taking the difference between the estimated interaction at one level of the remaining factor and the estimated interaction at the other. For example, the estimated interaction between castration and adrenalectomy for males is

$$E_{AC(M)} = \bar X_{122} - \bar X_{121} - \bar X_{112} + \bar X_{111} = -0.875 \tag{12.34}$$

and the estimated interaction for females is

$$E_{AC(F)} = \bar X_{222} - \bar X_{221} - \bar X_{212} + \bar X_{211} = 9.375. \tag{12.35}$$

The estimated three-way interaction is the difference between these two estimates,

$$E_{SAC} = E_{AC(F)} - E_{AC(M)} = 9.375 - (-0.875) = 10.250. \tag{12.36}$$

The same estimate is obtained no matter which $(p-1)$-way interaction one starts with (see Problem 12.5).

Each $(p-1)$-way interaction may be estimated from the values in the table obtained by averaging across the levels of the remaining factor. The means in Table 12.6, for example, are informative about the interaction between sex and adrenalectomy. It is important for the reader to appreciate the two meanings of each of the entries in such a table. One is as the simple unweighted average of the means at the levels of the remaining factor (here, castration and no castration). In this sense the entries are the same as the quantities analyzed according to a Type III analysis. The second meaning of each entry is as the mean of the 2n (here, 16) measurements for the given combination of factors. The two meanings apply only because the sample sizes are the same in all cells.

The estimated interaction between sex and adrenalectomy is

$$E_{AS} = \bar X_{2.2} - \bar X_{1.2} - \bar X_{2.1} + \bar X_{1.1} = 1.5, \tag{12.37}$$

where $\bar X_{i.k} = (\bar X_{i1k} + \bar X_{i2k})/2$ for $i$ and $k$ equal to 1 and 2 (and similarly for $\bar X_{ij.}$ and $\bar X_{.jk}$). The two remaining estimated two-way interactions are easily checked to be equal to the values in Table 12.5 (see Problem 12.6).

Table 12.6. Means informative about the interaction between sex and adrenalectomy in Table 12.4

                     Adrenalectomy
Sex                  No (k = 1)                  Yes (k = 2)
Males (i = 1)        $\bar X_{1.1}$ = 31.0000    $\bar X_{1.2}$ = 37.3125
Females (i = 2)      $\bar X_{2.1}$ = 37.8750    $\bar X_{2.2}$ = 45.6875

When the number of factors, p, exceeds three, the estimation of $(p-2)$-way interactions, $(p-3)$-way interactions, and so on proceeds as just shown, with the relevant tables of means equivalently obtained by averaging means across levels of the factors not involved in an interaction or by averaging the responses of all subjects with a given combination of factors. Each mean that is relevant to a $(p-2)$-way interaction is the average of four fundamental means (and thus of 4n measurements), each mean that is relevant to a $(p-3)$-way interaction is the average of eight fundamental means (and thus of 8n measurements), and in general each mean that is relevant to a $(p-q)$-way interaction is the average of $2^q$ fundamental means (and thus of $2^q n$ measurements).

Eventually, the step of estimating the main or average effects of the factors is reached. Each is easily estimated as the difference between the mean response to one level of the factor (averaged over all $2^{p-1}$ levels of the remaining factors, and thus based on $2^{p-1} n$ measurements) and the mean response to the other level of the factor (also based on $2^{p-1} n$ measurements). For the data in Table 12.4, the main effect of adrenalectomy, for example, is the difference between the mean response of the $2^{3-1} \times 8 = 32$ mice that received an adrenalectomy, $\bar X_{..2} = 41.5$, and the mean response of the 32 that did not, $\bar X_{..1} = 34.4375$. Thus

$$E_A = \bar X_{..2} - \bar X_{..1} = 41.5000 - 34.4375 = 7.0625.$$

Likewise, the estimated effect of castration is

$$E_C = \bar X_{.2.} - \bar X_{.1.} = 41.53125 - 34.40625 = 7.125,$$

and the estimated effect of sex is

$$E_S = \bar X_{2..} - \bar X_{1..} = 41.78125 - 34.15625 = 7.625.$$
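The averaging steps of Procedure 1 can be sketched in Python as follows. The helper names (`cell_mean_over`, `ca_interaction`) and the indexing of the cells by (sex, castration, adrenalectomy) are assumptions made for this illustration; the means are those of Table 12.4.

```python
# Cell means from Table 12.4, keyed by (sex, castration, adrenalectomy),
# with 1 = male / no and 2 = female / yes.
means = {
    (1, 1, 1): 24.875, (1, 1, 2): 31.625, (2, 1, 1): 39.000, (2, 1, 2): 42.125,
    (1, 2, 1): 37.125, (1, 2, 2): 43.000, (2, 2, 1): 36.750, (2, 2, 2): 49.250,
}
SEX, CAST, ADR = 0, 1, 2  # positions of the factors in each key


def cell_mean_over(fixed):
    """Unweighted average of the cell means whose levels match `fixed`."""
    chosen = [m for key, m in means.items()
              if all(key[pos] == level for pos, level in fixed.items())]
    return sum(chosen) / len(chosen)


def ca_interaction(sex):
    """Castration x adrenalectomy interaction within one sex, as in (12.34)-(12.35)."""
    return (means[(sex, 2, 2)] - means[(sex, 2, 1)]
            - means[(sex, 1, 2)] + means[(sex, 1, 1)])


e_ac_m = ca_interaction(1)
e_ac_f = ca_interaction(2)
e_sac = e_ac_f - e_ac_m  # three-way interaction, as in (12.36)

# Sex x adrenalectomy interaction from the averaged means of Table 12.6, as in (12.37).
e_as = (cell_mean_over({SEX: 2, ADR: 2}) - cell_mean_over({SEX: 1, ADR: 2})
        - cell_mean_over({SEX: 2, ADR: 1}) + cell_mean_over({SEX: 1, ADR: 1}))

# Main effects: mean at the upper level minus mean at the lower level.
e_a = cell_mean_over({ADR: 2}) - cell_mean_over({ADR: 1})
e_c = cell_mean_over({CAST: 2}) - cell_mean_over({CAST: 1})
e_s = cell_mean_over({SEX: 2}) - cell_mean_over({SEX: 1})

print(e_ac_m, e_ac_f, e_sac, e_as, e_a, e_c, e_s)
```

Averaging cell means in this way matches averaging the raw measurements only because every cell has the same n.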

The standard errors of the estimated factorial effects may be estimated as follows. Each $(p-q)$-way interaction is a contrast (with coefficients +1 and -1) in $2^{p-q}$ means, with each of these means being an average of $2^q n$ measurements; q varies from a minimum of zero for the highest-order interaction, the p-way, to a maximum of $p-1$ for the main effects. Thus, from the algebra of contrasts that has been applied throughout the text,

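$$\mathrm{Var}(E_{(p-q)\text{-way}}) = \frac{s^2}{2^q n} \times 2^{p-q} = \frac{s^2\, 2^{p-2q}}{n}.$$

Here, with $s^2 = 34.8852$, $n = 8$, $p = 3$, and q varying from zero to two,

$$\mathrm{se}(E_{3\text{-way}}) = \sqrt{\frac{34.8852 \times 2^{3-0}}{8}} = \sqrt{\frac{34.8852 \times 8}{8}} = 5.9064, \tag{12.42}$$

$$\mathrm{se}(E_{2\text{-way}}) = \sqrt{\frac{34.8852 \times 2^{3-2}}{8}} = \sqrt{\frac{34.8852 \times 2}{8}} = 2.9532, \tag{12.43}$$

and

$$\mathrm{se}(E_{1\text{-way}}) = \mathrm{se}(E_{\text{main effect}}) = \sqrt{\frac{34.8852 \times 2^{3-4}}{8}} = \sqrt{\frac{34.8852}{2 \times 8}} = 1.4766. \tag{12.44}$$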

The sum of squares for a factorial effect (and, because it has 1 degree of freedom, also its mean square) is equal to

$$SS((p-q)\text{-way effect}) = E^2_{(p-q)\text{-way}} \times \frac{n}{2^{p-2q}}. \tag{12.45}$$

In the present example, therefore, the sum of squares for the three-way interaction is equal to the squared estimate times $8/2^{3-0} = 8/8 = 1$, so that

$$SS(\text{three-way interaction}) = 10.25^2 = 105.0625.$$

The sum of squares for each two-way interaction is equal to the squared estimate times $8/2^{3-2} = 8/2 = 4$, so that, for example,

$$SS(\text{castration} \times \text{adrenalectomy}) = 4.25^2 \times 4 = 72.25.$$

The sum of squares for each main effect is equal to the squared estimate times $8/2^{3-4} = 8 \times 2 = 16$, so that, for example,

$$SS(\text{sex}) = 7.625^2 \times 16 = 930.25.$$
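The standard-error and sum-of-squares rules above can be wrapped in two small Python functions. This is a sketch only: the function names are illustrative, and q denotes, as in the text, the number of factors averaged over (q = 0 for the three-way interaction, q = 2 for a main effect when p = 3).

```python
import math

s2, n, p = 34.8852, 8, 3  # pooled variance, common cell size, number of factors


def effect_se(q):
    """Estimated standard error of a (p - q)-way effect: sqrt(s^2 * 2^(p - 2q) / n)."""
    return math.sqrt(s2 * 2 ** (p - 2 * q) / n)


def effect_ss(estimate, q):
    """Sum of squares of a (p - q)-way effect, following (12.45)."""
    return estimate ** 2 * n / 2 ** (p - 2 * q)


for label, estimate, q in [("three-way", 10.25, 0),
                           ("castration x adrenalectomy", 4.25, 1),
                           ("sex", 7.625, 2)]:
    print(label, round(effect_se(q), 4), effect_ss(estimate, q))
```

The printed values can be compared with the se and MS columns of Table 12.5.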

The simplicity of the preceding analysis, thanks to the equality of the sample sizes, should not be taken to mean that the analysis necessarily ends there. For example, the finding of a significant $(p-q)$-way interaction means that the estimated lower-order interactions and estimated main effects involving those factors may not have any reasonable interpretation. These lower-order effects would have to be estimated separately within levels of the other factors and then, as discussed in Section 12.1, either averaged according to characteristics of a population or analyzed individually. In Table 12.5, for example, the interaction between castration and sex is statistically significant, and so the estimate there of the main effect of castration has little meaning. A more thorough analysis of the effect of castration is therefore in order (see Problem 12.7).

A final comment concerns the definitions adopted here of interaction effects. These effects have been defined as differences between more fundamental differences between means, with a consequence being that interactions of different orders have different precisions [see (12.4…)]. Beginning with work by F. Yates in the 1930s and continuing in major texts on the design of experiments (Cochran and Cox, 1957, pp. 153-158; Federer, 1955, pp. 174, 181, and 188), interpretability has been sacrificed for the sake of equal precision and thus of simple analysis by having the factorial effects defined as

$$\begin{aligned}
\text{Yates's main effect} &= \text{current main effect},\\
\text{Yates's 2-way interaction} &= \tfrac{1}{2}\,(\text{current 2-way interaction}),\\
\text{Yates's 3-way interaction} &= \tfrac{1}{2^2}\,(\text{current 3-way interaction}),\\
&\;\;\vdots\\
\text{Yates's } p\text{-way interaction} &= \tfrac{1}{2^{p-1}}\,(\text{current } p\text{-way interaction}).
\end{aligned}$$

The reader is asked in Problem 12.8 to confirm that each of Yates's estimated effects has the same estimated variance,

$$\mathrm{Var}(\text{Yates's estimated effect}) = \frac{s^2}{2^{p-2} n},$$

and that the sum of squares for each is equal to the squared estimate times the same constant,

$$SS = (\text{Yates's estimated effect})^2 \times 2^{p-2} n.$$
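A short numerical check of Yates's rescaling, using three of the estimates in Table 12.5, might look like the sketch below. The function names are illustrative assumptions, and `order` is the number of factors involved in the effect (1 for a main effect).

```python
s2, n, p = 34.8852, 8, 3  # pooled variance, common cell size, number of factors


def yates_effect(current_estimate, order):
    """Rescale an effect as defined in this chapter to Yates's definition."""
    return current_estimate / 2 ** (order - 1)


# Every rescaled effect shares this estimated variance.
yates_variance = s2 / (2 ** (p - 2) * n)


def yates_ss(current_estimate, order):
    """Squared Yates effect times the common constant 2^(p-2) * n."""
    return yates_effect(current_estimate, order) ** 2 * 2 ** (p - 2) * n


print(yates_variance)
print(yates_ss(7.125, 1), yates_ss(4.25, 2), yates_ss(10.25, 3))
```

The sums of squares are unchanged by the rescaling, which is what makes the common constant possible.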

12.2.2. Procedure 2 (Yates's Algorithm)

Table 12.7 lays out for the means in Table 12.4 a simple algorithm due to F. Yates for estimating all of the factorial effects in a $2^p$ factorial experiment. Each of the p factors has an arbitrarily defined "upper level," here designated by the subscript 2, and a "lower level," here designated by the subscript 1. In general, the $2^p$ treatment combinations are labeled and ordered as follows:

(1): all factors at their lower levels;
(a): factor A (arbitrarily defined) at its higher level, all others at their lower levels;
(b): factor B (arbitrarily defined) at its higher level, all others at their lower levels;
(ab): factors A and B both at their higher levels, all others at their lower levels;
(c): factor C (arbitrarily defined) at its higher level, all others at their lower levels;
(ac): factors A and C both at their higher levels, all others at their lower levels;
...
(abc...): all factors at their higher levels.

In Table 12.7, the factors are listed in the order castration (its higher level is designated c), adrenalectomy (its higher level is designated a), and sex (its higher level is designated s).

Table 12.7. Application of Yates's algorithm to the means in Table 12.4

Factorial
combination   Mean                       I         II         III        Effect     SS
(1)           $\bar X_{111}$ = 24.875    62.000    136.625    303.750    37.9688
(c)           $\bar X_{121}$ = 37.125    74.625    167.125     28.500     7.1250    812.2500
(a)           $\bar X_{112}$ = 31.625    75.750     23.625     28.250     7.0625    798.0625
(ca)          $\bar X_{122}$ = 43.000    91.375      4.875      8.500     4.2500     72.2500
(s)           $\bar X_{211}$ = 39.000    12.250     12.625     30.500     7.6250    930.2500
(cs)          $\bar X_{221}$ = 36.750    11.375     15.625    -18.750    -9.3750    351.5625
(as)          $\bar X_{212}$ = 42.125    -2.250     -0.875      3.000     1.5000      9.0000
(cas)         $\bar X_{222}$ = 49.250     7.125      9.375     10.250    10.2500    105.0625

Each of the p columns headed I, II, ... calls for exactly the same arithmetic to be applied to the entries in the preceding column. The entries in a column are paired [in the column of means, e.g., the pairs are (24.875, 37.125), (31.625, 43.000), ..., (42.125, 49.250)], the intrapair sums are entered into the first $2^{p-1}$ places in the next column, and the intrapair differences (the second value minus the first) are entered into the last $2^{p-1}$ places in the next column. The entries in the pth column (here, the third) are the numerators of the effects related as follows to the designated factorial combinations.

The first entry in column p, which corresponds to combination (1), is the numerator of the overall mean level of response in the entire study. Its divisor is, in general, $2^p$. Here,

$$\bar X_{...} = \frac{303.750}{8} = 37.9688.$$

Each entry in column p that appears in a row designated by a single letter is the numerator of the corresponding factor's estimated main effect. Its divisor is, in general, $2^{p-1}$. Here, for example, the estimated main effect of adrenalectomy is found to be, in the row designated (a), $E_A = 28.250/4 = 7.0625$. Similarly, entries in rows designated by pairs of letters are the numerators of the estimates of the corresponding two-way interactions, with each denominator equal to $2^{p-2}$; etc.

The sums of squares for the factorial effects may be found using the formulas presented in Section 12.2.1, but a much simpler rule is:

square the entry in the pth column and multiply by $n/2^p$. (12.47)

Here, because each mean is based on n = 8 measurements, a quantity that happens to equal $2^3$, the sums of squares are seen to bear especially simple relations to the values in column III.

Yates's algorithm is applicable even when the $2^p$ sample sizes are unequal. The data analyst using it in such a situation should realize that it produces Type III estimates of the factorial effects. It produces Type III sums of squares when n in the rule given in (12.47) is replaced by $\bar n^{(H)}$, the harmonic mean of the $2^p$ sample sizes. The reader is asked in Problem 12.9 to apply Yates's algorithm to the means in Table 12.1 and to confirm that the results agree (except for errors due to rounding) with those in Tables 12.2 and 12.3 for the Type III analysis.
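The column-by-column arithmetic of Yates's algorithm is compact enough to sketch in Python. The function name `yates` and the way effects are labeled by strings are assumptions made for this illustration; the means are entered in the order used in Table 12.7.

```python
def yates(values):
    """Apply Yates's algorithm to 2**p values listed in standard order.

    Returns the p-th column: sums of pairs first, then differences
    (second member minus first), repeated p times.
    """
    p = len(values).bit_length() - 1
    if 2 ** p != len(values):
        raise ValueError("the number of values must be a power of two")
    column = list(values)
    for _ in range(p):
        pairs = [(column[i], column[i + 1]) for i in range(0, len(column), 2)]
        column = [a + b for a, b in pairs] + [b - a for a, b in pairs]
    return column


labels = ["(1)", "c", "a", "ca", "s", "cs", "as", "cas"]
means = [24.875, 37.125, 31.625, 43.000, 39.000, 36.750, 42.125, 49.250]
n, p = 8, 3

column_p = yates(means)
effects, sums_of_squares = {}, {}
for label, entry in zip(labels, column_p):
    if label == "(1)":
        effects[label] = entry / 2 ** p  # overall mean
        continue
    effects[label] = entry / 2 ** (p - len(label))  # divisor 2^(p - number of letters)
    sums_of_squares[label] = entry ** 2 * n / 2 ** p  # rule (12.47)

print(column_p)
print(effects)
print(sums_of_squares)
```

For unequal sample sizes the text indicates that n in the last rule would be replaced by the harmonic mean of the cell sizes; the sketch does not attempt that case.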

12.2.3. Procedure 3 (The Algebra of the $2^p$ Factorial Study)

When all sample sizes are equal, each factorial effect is estimated as a constant times the difference between the sum of half of all the means and the sum of the remaining half of the means. There are patterns to the means that receive one sign and the means that receive the other. These patterns not only help to strengthen one's insight into the meaning of factorial effects, they also are important when only a fraction of all factorial combinations must be selected for study (see Section 12.4) or when factorial effects must be confounded with block-to-block variation (see Section 13.2). With the factorial combinations designated as in Section 12.2.2, Table 12.8 displays the patterns of pluses and minuses for each of the effects in a $2^3$ factorial study, and presents the results of applying the indicated arithmetic to the means in Table 12.4.

Table 12.8. Patterns of pluses and minuses defining the factorial effects in a $2^3$ factorial study, applied to the means in Table 12.4

                                    Factorial combination (mean)
                                    (1)      (c)      (a)      (ca)     (s)      (cs)     (as)     (cas)
Effect                              24.875   37.125   31.625   43.000   39.000   36.750   42.125   49.250   Result
Castration                          -        +        -        +        -        +        -        +         28.500
Adrenalectomy                       -        -        +        +        -        -        +        +         28.250
Castration x adrenalectomy          +        -        -        +        +        -        -        +          8.500
Sex                                 -        -        -        -        +        +        +        +         30.500
Castration x sex                    +        -        +        -        -        +        -        +        -18.750
Adrenalectomy x sex                 +        +        -        -        -        -        +        +          3.000
Castration x adrenalectomy x sex    -        +        +        -        +        -        -        +         10.250

The values in the final column of the table are identical to those in column III of Table 12.7.

Note that each main effect is characterized by a plus sign for every factorial combination in which its identifying letter is present (which, by convention, means that the factor is present at its "higher level") and by a minus sign for every factorial combination in which its identifying letter is absent (which, by convention, means that the factor is present at its "lower level"). This pattern corresponds to taking the sum (or average) of all mean responses to one of the two levels of a factor, and subtracting from it the sum (or average) of all mean responses to the other.

Each two-way interaction is characterized by a plus sign for every factorial combination in which both identifying letters are present or both are absent and by a minus sign for every factorial combination in which only one identifying letter is present. The three-way interaction is characterized by a plus sign for every factorial combination in which all three identifying letters are present or only one is present and by a minus sign for every factorial combination in which two identifying letters are present or none is present. In the general $2^p$ factorial experiment, each r-way interaction is characterized by a plus sign for every factorial combination in which all r identifying letters are present or $r-2$ of them are present, and so on, and by a minus sign for every factorial combination in which $r-1$ identifying letters are present, or $r-3$ of them are present, and so on.

An algebraic device that produces the pattern of pluses and minuses for any factorial effect is the following. Let a, b, c, ... again denote the upper levels of the factors. The main effect of factor A, for example, is characterized by the algebraic quantity

$$(a - 1)(b + 1)(c + 1), \tag{12.48}$$

which is understood to be equal to the expanded product

$$(abc) + (ab) + (ac) + (a) - (bc) - (b) - (c) - (1),$$

each term of which is taken to represent the mean response at the corresponding combination of factors. The expression in (12.48) may be read as, "Take the difference between the upper and lower levels of factor A, and sum (or average) them across levels of the other factors." The interaction between factors B and C, as another example, may be characterized by the expression

$$(a + 1)(b - 1)(c - 1),$$

which is understood to be equal to the expanded product

$$(abc) + (bc) + (a) + (1) - (ab) - (ac) - (b) - (c).$$

The expression may be read as, "Take the difference between B's upper versus lower difference at the upper level of C and at the lower level of C, and sum (or average) them across levels of the other factors." In general each effect may be characterized by one of the $2^p$ expressions $(a \pm 1)(b \pm 1)(c \pm 1) \cdots$, where the sign is taken to be minus if the letter represents one of the factors defining the effect and is taken to be plus if the letter does not. The expression with each sign a plus corresponds to the overall mean.
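The sign attached to each factorial combination by an expression of the form $(a \pm 1)(b \pm 1)(c \pm 1)$ can be generated mechanically: a combination receives a minus sign once for every defining letter of the effect that is absent from it. The Python sketch below applies that rule to the means of Table 12.4; the function names and the use of the letters c, a, s (castration, adrenalectomy, sex, as in Table 12.7) are choices made for this illustration.

```python
factors = "cas"  # castration, adrenalectomy, sex
means = [24.875, 37.125, 31.625, 43.000, 39.000, 36.750, 42.125, 49.250]


def standard_order(factors):
    """Combinations in standard order: the first factor varies fastest."""
    combos = []
    for index in range(2 ** len(factors)):
        combos.append("".join(f for j, f in enumerate(factors) if (index >> j) & 1))
    return combos  # "" stands for combination (1)


def sign(effect, combo):
    """Coefficient of `combo` when (x - 1) is used for letters in `effect`, (x + 1) otherwise."""
    result = 1
    for letter in effect:
        if letter not in combo:
            result = -result
    return result


combos = standard_order(factors)
contrasts = {}
for effect in combos[1:]:  # skip "", which would give the overall total
    contrasts[effect] = sum(sign(effect, c) * m for c, m in zip(combos, means))

print(combos)
print(contrasts)
```

Each contrast is the numerator of the corresponding effect, so the printed values can be compared with column III of Table 12.7.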

The final algebraic identities that underlie the definitions of the factorial effects may be seen in the signs in Table 12.8. Those signs may be interpreted as abbreviations of the coefficients + 1 and -1 that multiply the mean responses. The multiplication of two signs may therefore be defined as (+)x(+)=(+1)x(+1)=+1=(+); (+)x(-)= (+1) x (-1) = -1 = (-); and (-) x (-) = (-1) x (-1) = +1 = (+). Notice the following patterns to the products of the signs.

Take any pair of main effects, and multiply their signs for each of the $2^p$ factorial combinations. The resulting signs characterize the interaction between the two effects. Symbolically, if A, B, C, ... represent the patterns characterizing the main effects and if AB, AC, ... represent the patterns characterizing the two-way interactions, then

$$A \times B = AB, \quad A \times C = AC, \quad \text{and so on.} \tag{12.49}$$

Take any two-way interaction and a main effect not involved in that interaction. The signs resulting from a multiplication of each of the $2^p$ pairs of signs characterize the interaction of all three effects. If ABC represents the pattern characterizing the three-way interaction, then

$$AB \times C = AC \times B = BC \times A = ABC. \tag{12.50}$$

Take any pair of effects and carry through the multiplication of their signs. The result characterizes the generalized interaction between those two effects: if a factor is represented in one effect but not in the other, then it is represented in their generalized interaction; if a factor is not represented in either effect, or if a factor is represented in both, then it is not represented in their generalized interaction. Thus the examples in (12.49) and (12.50) are of generalized interactions that correspond to ordinary ones. But the relations

$$A \times AB = B, \quad AB \times BC = AC, \quad ABC \times B = AC, \quad \text{and so on,} \tag{12.51}$$

point out that the notion of a generalized interaction carries the idea of interaction beyond the level at which it has so far been considered. The rule of multiplication illustrated in (12.51) is that a squared symbol is replaced by unity:

$$A \times AB = A^2B = B, \quad AB \times BC = AB^2C = AC, \quad \text{and so on.} \tag{12.52}$$
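In computational terms the multiplication rule amounts to keeping the letters that occur in exactly one of the two effects. The sketch below checks this against an elementwise product of sign patterns; the function names and the label "I" for the case in which every letter cancels are assumptions made for the illustration.

```python
def generalized_interaction(effect1, effect2):
    """Letters occurring in exactly one of the two effects (a squared letter drops out)."""
    letters = set(effect1) ^ set(effect2)
    return "".join(sorted(letters)) or "I"


def sign_pattern(effect, factors="ABC"):
    """Signs of `effect` over the 2**p combinations, first factor varying fastest."""
    signs = []
    for index in range(2 ** len(factors)):
        present = {f for j, f in enumerate(factors) if (index >> j) & 1}
        absent = sum(1 for letter in effect if letter not in present)
        signs.append(-1 if absent % 2 else 1)
    return signs


product_of_signs = [x * y for x, y in zip(sign_pattern("AB"), sign_pattern("BC"))]

print(generalized_interaction("A", "AB"))
print(generalized_interaction("AB", "BC"))
print(generalized_interaction("ABC", "B"))
print(product_of_signs == sign_pattern("AC"))
```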

The practical importance of generalized interactions will be illustrated in Sections 12.4 and 13.2.

12.3. A 3 x 4 FACTORIAL EXPERIMENT

A number of exercises were provided in Chapter 6 in the calculation of the sums of squares for main effects and interaction in a study with two factors, one or both of which had more than two levels. Here, more emphasis is placed on the estimation and interpretation of the contrasts that underlie the sums of squares and that would be examined if significant main effects or interactions were found. The values in Table 6.10 from Afifi and Azen (1972) and Kutner (1974) will again be used for illustrative purposes. They are presented in Table 12.9, with the only change being that of the name of one of the factors in Table 6.10, Stratum, to Disease (the experimental subjects were dogs having one of three different diseases). Because the 12 sample sizes differ from one another only for random reasons, the analysis of unweighted means [the method used by subroutine P2V of the BMDP package (1981), by the GLM Type III routine of the SAS package (1982), and by option 9 of the SPSSX ANOVA routine (1983)] is appropriate and will be the only method applied.

[Table 12.9: cell means, standard deviations, and sample sizes for the three diseases (rows) and four treatments (columns), bordered by unweighted marginal means and harmonic mean sample sizes.]

The table is bordered by the two sets of unweighted marginal means and by the harmonic means of the sample sizes on which the individual means are based. In the general case of a factorial study with r levels to the row factor and c levels to the column factor,

$$\bar X'_{i.} = \frac{1}{c} \sum_{j=1}^{c} \bar X_{ij}, \tag{12.53}$$

$$\bar X'_{.j} = \frac{1}{r} \sum_{i=1}^{r} \bar X_{ij}, \tag{12.54}$$

$$\bar n^{(H)}_{i.} = \frac{c}{\sum_{j=1}^{c} \dfrac{1}{n_{ij}}}, \tag{12.55}$$

and

$$\bar n^{(H)}_{.j} = \frac{r}{\sum_{i=1}^{r} \dfrac{1}{n_{ij}}}. \tag{12.56}$$

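The variance of $\bar X'_{i.}$ is inversely proportional to $w'_{i.}$, say, where

$$w'_{i.} = \left(\frac{1}{c^2} \sum_{j=1}^{c} \frac{1}{n_{ij}}\right)^{-1} = c\,\bar n^{(H)}_{i.}, \tag{12.57}$$

and that of $\bar X'_{.j}$ is inversely proportional to $w'_{.j}$, say, where

$$w'_{.j} = \left(\frac{1}{r^2} \sum_{i=1}^{r} \frac{1}{n_{ij}}\right)^{-1} = r\,\bar n^{(H)}_{.j}. \tag{12.58}$$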

The pooled sum of squares for variation within treatment groups is equal to $\sum\sum (n_{ij} - 1) s^2_{ij}$ with $\sum\sum (n_{ij} - 1) = n_{..} - rc$ degrees of freedom. The method for calculating the sum of squares for interaction between the row and column factors was illustrated in Section 6.4; that method is valid whether the main effects of the factors are estimated as functions of weighted or of unweighted averages of the individual means. The sum of squares for the row factor, say RSS, is equal to

$$RSS = \sum_i w'_{i.} \left(\bar X'_{i.} - \bar X'_{..(\text{rows})}\right)^2 \tag{12.59}$$

with $r - 1$ degrees of freedom, where

$$\bar X'_{..(\text{rows})} = \frac{\sum w'_{i.} \bar X'_{i.}}{\sum w'_{i.}}. \tag{12.60}$$

The sum of squares for the column factor, finally, say CSS, is equal to

$$CSS = \sum_j w'_{.j} \left(\bar X'_{.j} - \bar X'_{..(\text{col's})}\right)^2 \tag{12.61}$$

with $c - 1$ degrees of freedom, where

$$\bar X'_{..(\text{col's})} = \frac{\sum w'_{.j} \bar X'_{.j}}{\sum w'_{.j}}. \tag{12.62}$$

In general, $\bar X'_{..(\text{rows})}$ is not equal to $\bar X'_{..(\text{col's})}$.

The analysis of variance table for the values in Table 12.9 is given in Table 12.10. The value of the sum of squares for interaction was found in (6.55), and the value of the sum of squares for treatments was found in Problem 6.13. Because $\bar X'_{..(\text{rows})} = 18.850980$, the sum of squares for diseases is equal to

$$4\left(4.444444\,[21.816667 - 18.850980]^2 + \cdots + 4.897959\,[15.316667 - 18.850980]^2\right) = 415.8730.$$
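The unweighted-means sum of squares for the row factor can be sketched in Python as below. The function names are illustrative, and the small 2 x 3 layout of means and sample sizes is hypothetical data chosen only to exercise the formulas; it is not taken from Table 12.9.

```python
def harmonic_mean(sizes):
    """Harmonic mean of the sample sizes, as in (12.55) and (12.56)."""
    return len(sizes) / sum(1.0 / n for n in sizes)


def unweighted_row_ss(cell_means, cell_sizes):
    """Row sum of squares (12.59); each inner list is one level of the row factor."""
    c = len(cell_means[0])
    row_means = [sum(row) / c for row in cell_means]                # (12.53)
    weights = [c * harmonic_mean(sizes) for sizes in cell_sizes]    # (12.57)
    centre = (sum(w * m for w, m in zip(weights, row_means))
              / sum(weights))                                       # (12.60)
    return sum(w * (m - centre) ** 2 for w, m in zip(weights, row_means))


# Hypothetical 2 x 3 example with equal sample sizes (not the data of Table 12.9).
example_means = [[10.0, 12.0, 14.0], [16.0, 18.0, 20.0]]
example_sizes = [[4, 4, 4], [4, 4, 4]]
rss = unweighted_row_ss(example_means, example_sizes)
print(rss)
```

The column sum of squares follows from the same function after transposing both arguments, for instance with `list(map(list, zip(*example_means)))`.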

Inferences about the interaction effects and about the main effects may

be made as follows.

Table 12.10. Analysis of variance (method of unweighted means) of the values in Table 12.9

Source of Variation    df    SS            MS          F Ratio
Treatments              3    2,997.4721    999.1574    9.05
Diseases                2      415.8730    207.9365    1.88
Interaction             6      707.2663    117.8777    1.07
Within cells           46    5,080.9944    110.4564

12.3.1. Interaction Between the Factors

Interaction between treatments and diseases is tested for statistical significance by comparing the F ratio for interaction with $F_{(r-1)(c-1),\,n_{..}-rc,\,\alpha}$. Here, with $\alpha = 0.05$, $F_{6,46,0.05} = 2.31$. If the interaction had turned out to be statistically significant, individual contrasts constituting the interaction would have been tested for significance and confidence intervals would have been constructed using the Scheffe critical value

$$S = \sqrt{(r-1)(c-1)\,F_{(r-1)(c-1),\,n_{..}-rc,\,\alpha}} = \sqrt{6 \times 2.31} = 3.72. \tag{12.63}$$

An example of such a contrast is

$$C = \left(\tfrac{1}{2}[\bar X_{11} + \bar X_{12}] - \bar X_{13}\right) - \left(\tfrac{1}{2}[\bar X_{31} + \bar X_{32}] - \bar X_{33}\right), \tag{12.64}$$

which compares the difference between the average for Treatments 1 and 2 on the one hand and Treatment 3 on the other for Disease 1 with the corresponding difference for Disease 3. An "interaction contrast" is in general defined as

$$C = \sum_i \sum_j c_i^{(r)} c_j^{(c)} \bar X_{ij}, \tag{12.65}$$

where $\sum c_i^{(r)} = \sum c_j^{(c)} = 0$. Its estimated standard error is

$$\mathrm{se}(C) = \sqrt{WMS \sum_i \sum_j \frac{\left[c_i^{(r)} c_j^{(c)}\right]^2}{n_{ij}}}. \tag{12.66}$$

The coefficients defining the contrast in (12.64) are $c_1^{(r)} = +1$, $c_2^{(r)} = 0$, $c_3^{(r)} = -1$ and $c_1^{(c)} = +1/2$, $c_2^{(c)} = +1/2$, $c_3^{(c)} = -1$, $c_4^{(c)} = 0$. Its value is C = 1.55, and its estimated standard error is se(C) = 9.20. Because C/se(C) = 0.17, the underlying contrast is obviously not significantly different from zero. A confidence interval for the underlying contrast is $C \pm 3.72 \times \mathrm{se}(C)$, or the interval from -32.67 to 35.77.
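The Scheffe critical value and the resulting interval can be reproduced with the sketch below. The function names are illustrative; the F quantile 2.31 is the tabulated value quoted in the text rather than one computed here, and the critical value is rounded to two decimals, an assumption about how the printed interval was obtained.

```python
import math


def scheffe_critical_value(df_effect, f_quantile):
    """Scheffe critical value: sqrt(df of the effect times the tabulated F quantile)."""
    return math.sqrt(df_effect * f_quantile)


def scheffe_interval(estimate, se, critical_value):
    """Simultaneous confidence interval: estimate +/- critical value times se."""
    half_width = critical_value * se
    return estimate - half_width, estimate + half_width


r, c = 3, 4                      # diseases (rows) and treatments (columns)
f_6_46 = 2.31                    # F(6, 46) at alpha = 0.05, as quoted in the text
S = round(scheffe_critical_value((r - 1) * (c - 1), f_6_46), 2)

contrast, se_contrast = 1.55, 9.20   # value and standard error of the contrast (12.64)
lower, upper = scheffe_interval(contrast, se_contrast, S)
ratio = contrast / se_contrast

print(S, round(ratio, 2), round(lower, 2), round(upper, 2))
```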

Continue to assume that the interaction is statistically significant. Hypothesis tests and confidence intervals may be desired for such specific contrasts involving the diseases as

$$C_1 = \bar X_{13} - \bar X_{23} \tag{12.67}$$

and such specific contrasts involving the treatments as

$$C_2 = \tfrac{1}{4}(\bar X_{11} + \bar X_{12} + \bar X_{31} + \bar X_{32}) - \tfrac{1}{2}(\bar X_{13} + \bar X_{33}). \tag{12.68}$$

A reasonable critical value appears to be

$$S = \sqrt{(r + c - 2)\,F_{r+c-2,\,n_{..}-rc,\,\alpha}} = \sqrt{5 \times F_{5,46,0.05}} = \sqrt{5 \times 2.43} = 3.49. \tag{12.69}$$

The basis for this critical value is that there are $rc - 1$ degrees of freedom for comparing all rc means, of which $(r-1)(c-1)$ are for the interaction between the two factors. The difference of $(rc - 1) - (r - 1)(c - 1) = r + c - 2$ is thus the number of degrees of freedom for specific comparisons other than interaction contrasts.

The contrast in (12.67) is an especially simple one, but not so the contrast in (12.68). It is the average of the two estimates of the contrast in the treatment means that were compared in the interaction contrast in (12.64). The two estimates could validly be averaged because their difference was far from being statistically significant. The general point is that a statistically significant interaction means only that some (not all) contrasts in one factor vary across levels of the other and therefore might not sensibly be averaged. Other contrasts will be relatively constant and so may validly be averaged. The reader is asked in Problem 12.10 to make inferences about $C_1$ and $C_2$ using the critical value in (12.69).

12.3.2. No Interaction Between the Factors

In the present example, the interaction between treatments and diseases is not statistically significant, which means that: (1) none of the interaction contrasts, defined by (12.65), could possibly be significant using the Scheffe criterion; and (2) all inferences about the effects of one factor and about the effects of the other may validly be based on the two sets of marginal means. The statistical significance of the effects of the row factor may be tested by comparing the F ratio for that factor to $F_{r-1,\,n_{..}-rc,\,\alpha}$. If statistically significant effects are found, inferences about contrasts such as $C^{(r)} = \sum c_i^{(r)} \bar X'_{i.}$ may be made by estimating the standard error of C as

$$\mathrm{se}(C) = \sqrt{WMS \sum_i \frac{\left[c_i^{(r)}\right]^2}{c\,\bar n^{(H)}_{i.}}}$$

and by using

$$S = \sqrt{(r-1)\,F_{r-1,\,n_{..}-rc,\,\alpha}} \tag{12.71}$$

as the critical value. With a few appropriate and obvious changes, inferences about the effects of the column factor would be made similarly.
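As a rough illustration of these two formulas, the sketch below compares the first and last diseases, using the marginal means and harmonic mean sample sizes that appear in the sum-of-squares calculation for diseases and the within-cells mean square of Table 12.10. Reading the first and last terms of that calculation as Diseases 1 and 3 is an assumption, as are the function names; the F quantile 3.20 is the tabulated $F_{2,46,0.05}$ used in this section.

```python
import math


def marginal_contrast_se(wms, coefficients, harmonic_sizes, other_levels):
    """se of sum c_i * Xbar'_i. : sqrt(WMS * sum c_i^2 / (c * nH_i))."""
    total = sum(ci ** 2 / (other_levels * nh)
                for ci, nh in zip(coefficients, harmonic_sizes))
    return math.sqrt(wms * total)


wms = 110.4564                    # within-cells mean square, Table 12.10
c_levels = 4                      # number of treatments averaged over
row_means = [21.816667, 15.316667]    # assumed to be Diseases 1 and 3
row_nh = [4.444444, 4.897959]         # their harmonic mean sample sizes
coefficients = [1, -1]

estimate = sum(ci * m for ci, m in zip(coefficients, row_means))
se = marginal_contrast_se(wms, coefficients, row_nh, c_levels)
critical = math.sqrt((3 - 1) * 3.20)  # (12.71) with r = 3

print(round(estimate, 4), round(se, 4), round(critical, 2), abs(estimate) / se > critical)
```

Because the F ratio for diseases is not significant, such a contrast would not ordinarily be examined; the calculation only shows how the pieces fit together.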

Here the significance of the F ratio in Table 12.10 for diseases is tested by comparing it to $F_{2,46,0.05} = 3.20$; the marginal unweighted averages in Table 12.9 for the three diseases do not differ significantly one from another. The critical value for testing for the significance of the differences among the averages for the four treatments is $F_{3,46,0.05} = 2.81$, so the differences are statistically significant. The appropriate critical value for making inferences about contrasts in the marginal unweighted averages for the four treatments is

$$S = \sqrt{3 \times F_{3,46,0.05}} = \sqrt{3 \times 2.81} = 2.90. \tag{12.72}$$

The reader is asked in Problem 12.11 to make inferences about

$$C_1^{(c)} = \tfrac{1}{2}(\bar X'_{.1} + \bar X'_{.2}) - \bar X'_{.3} \tag{12.73}$$

and

$$C_2^{(c)} = \bar X'_{.4} - \bar X'_{.3}. \tag{12.74}$$

12.4. FRACTIONAL REPLICATION OF A $2^p$ STUDY

The fractional replication of a $2^p$ factorial study is one in which only a fraction, say $2^{-q}$, of all $2^p$ treatment combinations are applied. With p = 6, for example, a $2^6$ factorial study calls for 64 treatment combinations and at least that many observations. These may be more observations than the investigator can afford to make, or, because each main effect would be estimated as the difference between a pair of means both based on a multiple of 32 measurements, the complete study may give more precision than necessary. In either case, it would be helpful if the investigator could study only half or only a quarter of all the possible factorial combinations and, under certain conditions, still make valid inferences about the important factorial effects. This will be possible with the appropriate selection of a $2^{p-q}$ fractional factorial design. With half (i.e., $2^{-1}$) of all factorial combinations studied, q = 1 and the design is a $2^{p-1}$ fractional factorial. With a quarter (i.e., $2^{-2}$) of them studied, q = 2 and the design is a $2^{p-2}$ fractional factorial. It is assumed throughout this section that at least two independent observations are made at each of the $2^{p-q}$ combinations studied, so that an unbiased estimator of the within-cell variance is available.

To introduce terminology and to point out some of the problems with fractional replication, consider for the sake of simplicity a $2^{3-1}$ study (i.e., the study of only four of the eight possible factorial combinations in a full $2^3$ study). It should be obvious that not all effects will be estimable when there is fractional replication. The selection of the four particular factorial combinations that will be studied calls for the identification by the investigator of the least important factorial effect; that will be the effect not capable of estimation in the final study. It (in fact, every factorial effect) is defined, as in Table 12.8, as a contrast that subtracts the sum of half of the mean responses from the sum of the other half. Select, at random, either the set of factorial combinations that is associated with a plus sign or the set associated with a minus sign in the defining contrast associated with the least important effect. The selected set constitutes the factorial combinations that will be studied.

In a $2^3$ study the three-way interaction is usually the least important effect. Let A, B, and C denote the three factors and let a, b, and c denote their (arbitrarily defined) upper levels. Suppose that the four factorial combinations associated with a plus sign in the defining contrast for the three-way interaction are selected. This means that the studied combinations are (a), (b), (c), and (abc): factor A at its upper level, B and C at their lower levels; similarly for B and for C; and factors A, B, and C all at their upper levels.
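The selection rule just described can be sketched in a few lines of Python. This is only an illustration under the sign convention used in the text (a factor at its upper level contributes +1 to the contrast, at its lower level −1); the function names `label` and `abc_sign` are hypothetical and are not part of the original text.

```python
from itertools import product

FACTORS = "abc"  # a lower-case letter marks a factor at its upper level


def label(levels):
    """Name a combination by the letters of the factors at their upper level."""
    name = "".join(f for f, up in zip(FACTORS, levels) if up)
    return f"({name})" if name else "(1)"


def abc_sign(levels):
    """Sign of the ABC contrast: product of +1 (upper level) and -1 (lower level)."""
    sign = 1
    for up in levels:
        sign *= 1 if up else -1
    return sign


all_combos = list(product([False, True], repeat=3))
by_size = lambda s: (len(s), s)  # list single letters before longer labels
plus_half = sorted((label(c) for c in all_combos if abc_sign(c) == +1), key=by_size)
minus_half = sorted((label(c) for c in all_combos if abc_sign(c) == -1), key=by_size)

print("plus half: ", plus_half)   # the half-replicate studied in the text
print("minus half:", minus_half)  # the complementary half-replicate
```

Either half could be chosen at random; the text proceeds with the plus half.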

Table 12.11 lays out the patterns of pluses and minuses defining the contrasts associated with the seven factorial effects. Notice that the three-way interaction is nonestimable, being defined as the sum (or average) of the four mean responses.

Table 12.11. Contrasts defining factorial effects in a $2^{3-1}$ fractional factorial study with the three-way interaction serving as the defining contrast

                      Factorial Combination
Factorial Effect      (a)    (b)    (c)    (abc)
A                      +      -      -      +
B                      -      +      -      +
AB                     -      -      +      +
C                      -      -      +      +
AC                     -      +      -      +
BC                     +      -      -      +
ABC                    +      +      +      +

The three main effects are all estimated, in accordance with the meaning of a main effect, by contrasts that take the difference between the sum of the mean responses to the higher level of the factor and the sum of the mean responses to the lower level. Notice, however, that each of these contrasts also serves to define the estimator of a two-way interaction. Effects that are estimated by the same contrast in a $2^{p-q}$ fractional factorial study are referred to as aliases of one another. Thus, in the illustrative $2^{3-1}$ study, A (now denoting the estimated main effect of factor A) and BC (now denoting the estimated interaction effect between B and C) are aliases, as are B and AC, and C and AB. Note that each estimable effect has as its alias the generalized interaction [see (12.51) and (12.52)] between it and the three-way interaction, for example,

$$A \times ABC = BC.$$
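The alias rule can be checked mechanically. The sketch below assumes that the generalized interaction of two effects consists of the letters appearing in exactly one of them (squared letters cancel), which reproduces the example A × ABC = BC; the helper name `generalized_interaction` is illustrative only.

```python
def generalized_interaction(x, y):
    """Letters that appear in exactly one of the two effects, alphabetically."""
    return "".join(sorted(set(x) ^ set(y)))


DEFINING = "ABC"  # defining contrast of the illustrative half-replicate
estimable = ["A", "B", "C", "AB", "AC", "BC"]

aliases = {effect: generalized_interaction(effect, DEFINING) for effect in estimable}
for effect, alias in aliases.items():
    print(f"{effect:>2} is aliased with {alias}")
```

Each pairing printed here corresponds to two rows of Table 12.11 that carry the same pattern of signs.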

The identity between an effect's alias and its generalized interaction with the defining contrast is true in general for a $2^{p-1}$ fractional factorial study, and a generalization that will be pointed out later holds for a $2^{p-q}$ fractional factorial study.

It is now clear what kinds of assumptions must be made in order for an estimated effect in a $2^{p-q}$ fractional factorial study to be taken as an unbiased estimator of the underlying factorial effect: the effect's single alias (in case q = 1) or the effect's several aliases (in case q > 1) must be assumed to be zero. In general, two-way interactions may be expected to be nonzero (although possibly small relative to the magnitudes of the underlying main effects), and thus they may exert a biasing effect on their aliases in a $2^{3-1}$ fractional factorial study. The fractional replication of a $2^3$ factorial study, while useful didactically, is to be avoided in practice. The $2^{4-1}$ fractional factorial study, however, represents one that might be adopted in practice. Table 12.12 lays out the factorial combinations and the patterns of their sums and differences for estimating the factorial effects when the four-way interaction is taken as the defining contrast (the eight combinations with a minus sign were selected). Notice that each estimated main effect has a three-way interaction as an alias: A is the negative of A × ABCD = BCD, B is the negative of B × ABCD = ACD, etc. (Negatives must be taken because the "negative half" of the ABCD interaction was selected.) Thus, under the frequently reasonable assumption that there is no three-way interaction among any of the factors, unbiased inferences may be made about each of the main effects. (An exercise in the estimation of these effects is given in Problem 12.12.) Each two-way interaction has another one as an alias (AB is the negative of AB × ABCD = CD, etc.), so that a one-half replicate of a $2^4$ factorial study is useless for gaining information about specific interactions.

Table 12.12. Contrasts defining factorial effects in a $2^{4-1}$ fractional factorial study with the four-way interaction serving as the defining contrast

                      Factorial Combination
Factorial Effect      (a)    (b)    (c)    (d)    (abc)  (abd)  (acd)  (bcd)
A                      +      -      -      -      +      +      +      -
B                      -      +      -      -      +      +      -      +
AB                     -      -      +      +      +      +      -      -
C                      -      -      +      -      +      -      +      +
AC                     -      +      -      +      +      -      +      -
BC                     +      -      -      +      +      -      -      +
ABC                    +      +      +      -      +      -      -      -
D                      -      -      -      +      -      +      +      +
AD                     -      +      +      -      -      +      +      -
BD                     +      -      +      -      -      +      -      +
ABD                    +      +      -      +      -      +      -      -
CD                     +      +      -      -      -      -      +      +
ACD                    +      -      +      +      -      -      +      -
BCD                    -      +      +      +      -      -      -      +
ABCD                   -      -      -      -      -      -      -      -
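The sign patterns of Table 12.12 follow from one rule, and a short Python sketch can regenerate them and confirm the alias statements numerically. The rule assumed here is that the coefficient of a combination in the contrast for an effect is the product, over the letters of the effect, of +1 if that factor is at its upper level and −1 otherwise. The names `COMBOS`, `sign`, and `table` are illustrative, and the rows are printed in order of interaction size rather than in the order of the table.

```python
from itertools import combinations
from math import prod

FACTORS = "ABCD"
# The eight combinations of Table 12.12, named by the factors at their upper level.
COMBOS = ["a", "b", "c", "d", "abc", "abd", "acd", "bcd"]


def sign(effect, combo):
    """+1/-1 coefficient of a combination's mean in the contrast for `effect`."""
    return prod(1 if factor.lower() in combo else -1 for factor in effect)


effects = ["".join(c) for r in range(1, 5) for c in combinations(FACTORS, r)]
table = {e: [sign(e, combo) for combo in COMBOS] for e in effects}

for e in effects:
    print(f"{e:>4}  " + "  ".join("+" if s > 0 else "-" for s in table[e]))

# Alias checks quoted in the text: A = -(BCD), B = -(ACD), AB = -(CD).
for effect, alias in [("A", "BCD"), ("B", "ACD"), ("AB", "CD")]:
    assert table[effect] == [-s for s in table[alias]]
```

The row for ABCD comes out as all minus signs, which is simply another way of saying that the "negative half" was selected.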

As an illustration of the complexity that arises in designing a $2^{p-q}$ fractional factorial study when q > 1, consider the selection of four out of the eight factorial combinations in Table 12.12 so that a $2^{4-2}$ study results. The four-way interaction already was used as a defining contrast, so it would appear to make sense to use a three-way interaction as a second defining contrast. Let the ACD interaction be the one judged to be worth sacrificing, and let the factorial combinations having a plus sign in the row for ACD be selected. The resulting factorial combinations and patterns of pluses and minuses defining the estimated effects appear in Table 12.13. An important feature of the $2^{4-2}$ design is that a third factorial effect has, willy-nilly, ended up serving as a defining contrast. It is true in general that if any two (or more) factorial effects serve as defining contrasts, then so do their generalized interactions. Here ABCD × ACD = B, a main effect, turned out to be a defining contrast and thus to be nonestimable. The inability to estimate a main effect would generally render a particular fractional factorial design inappropriate.
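The closure property can be illustrated with a small Python sketch that lists every product of the chosen defining contrasts. As before, the generalized interaction is taken to be the set of letters appearing in exactly one of the two effects; the function names are hypothetical.

```python
from itertools import combinations


def generalized_interaction(x, y):
    """Letters that appear in exactly one of the two effects, alphabetically."""
    return "".join(sorted(set(x) ^ set(y)))


def defining_contrasts(chosen):
    """The chosen contrasts together with all of their generalized interactions."""
    found = set()
    for r in range(1, len(chosen) + 1):
        for subset in combinations(chosen, r):
            word = ""
            for effect in subset:
                word = generalized_interaction(word, effect)
            found.add(word)
    return sorted(found, key=lambda w: (len(w), w))


# The choice made in the text sacrifices the main effect of B ...
print(defining_contrasts(["ABCD", "ACD"]))
# ... whereas a pair of three-way interactions sacrifices a two-way interaction.
print(defining_contrasts(["ABC", "ACD"]))
```

Checking the full list before fixing a design shows at once whether a main effect has been lost.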

Table 12.13. Contrasts defining factorial effects in a $2^{4-2}$ fractional factorial study with the ABCD and ACD interactions serving as defining contrasts

                      Factorial Combination
Factorial Effect      (a)    (c)    (d)    (acd)
A                      +      -      -      +
B                      -      -      -      -
AB                     -      +      +      -
C                      -      +      -      +
AC                     -      -      +      +
BC                     +      -      +      -
ABC                    +      +      -      -
D                      -      -      +      +
AD                     -      +      -      +
BD                     +      +      -      -
ABD                    +      -      +      -
CD                     +      -      -      +
ACD                    +      +      +      +
BCD                    -      +      +      -
ABCD                   -      -      -      -

A gratuitous by-product of there actually being three defining contrasts in a $2^{p-2}$ fractional factorial study is that each estimable factorial effect has three aliases, its generalized interactions with the defining contrasts. Thus, in the example, A has as aliases A × B or the AB interaction, A × ACD or the CD interaction, and A × ABCD or the BCD interaction. The reader is asked to confirm in Problem 12.13 that the use of any pair of three-way interactions to serve as defining contrasts in a $2^{4-2}$ fractional factorial study assures that the main effects are estimable, but that some main effects have others as aliases.

The selection of factorial combinations for a $2^{p-q}$ fractional factorial study when q > 1 is obviously not trivial if main effects are to be estimable at all, and to have as aliases interactions of the highest order possible. Plans for p up to eight are given in tables by Cochran and Cox (1957, pp. 276-289) and in figures as well as in tables by the National Bureau of Standards (1963, p. 12-15 to p. 12-18). Some plans for p as high as 10 are given on page 253 of Cox (1958).

Given the rarity with which clinical experiments are conducted as high-order factorial studies, the even greater rarity of the fractional replication of a factorial study is not surprising. An appropriate application is in the use of vignettes in socioclinical studies of factors affecting clinicians' attitudes, decisions to treat or to refer patients, etc. (Link, 1983; Nathanson and Becker, 1978). Consider the following initial portion of a hypothetical vignette that might be sent to several clinicians for perusal and then for responses to a series of questions concerning recommendations for treatment, hospitalization, and so on. The particular vignette sent to a clinician would be determined at random.

___A___ is a ___B___-year-old ___C___ person whom you diagnose as having ___D___; ______ is covered by ___E___.

A represents the patient's sex by the insertion of a male's name (Mr. John Abbott) or a female's name (Ms. Jane Abbott). B represents the patient's age by the insertion of a number denoting relative youth (30) or maturity (60). C might represent the patient's color, D his or her diagnosis, E his or her kind of insurance coverage, and so on. If the number of factors is at all large [Nathanson and Becker (1978), for example, varied seven factors in their study of the kind of care obstetricians and gynecologists might recommend to different kinds of women], then the fractional replication of the full factorial study is indicated.

For a one-half replicate (q = 1), the highest-order interaction should serve as the defining contrast. For a one-quarter replicate (q = 2), the following defining contrasts for p up to eight assure that all main effects are estimable, that none has another main effect as an alias, and that the maximum number of main effects have as aliases interactions of the highest possible order.

p = 5: any pair of three-way interactions that have only one factor in common (e.g., ACD and BDE, and therefore also ABCE).

p = 6: any pair of four-way interactions that have two factors in common (e.g., ACEF and ABDE, and therefore also BCDF).

p = 7: any pair of five-way interactions that have three factors in common (e.g., ACDEG and BCDFG, and therefore also ABEF).

p = 8: any pair of five-way interactions that have two factors in common (e.g., ABDEF and ACDGH, and therefore also BCEFGH).

The suggested defining contrasts for the $2^{7-2}$ fractional factorial study, for example, are such that four main effects (A, B, E, and F) have as aliases a three-way, a four-way, and a six-way interaction. The other three main effects (C, D, and G) have 2 four-way interactions and a five-way interaction as aliases. Had the defining contrasts instead been, for example, ABCD and DEFG (and therefore also ABCEFG), every one of the main effects would have had at least one three-way interaction as an alias, and one of the main effects, D, would have had 2 three-way interactions as aliases.
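The comparison of the two sets of defining contrasts for the $2^{7-2}$ study can be reproduced with a short Python sketch. It assumes, as in the text, that the aliases of a main effect are its generalized interactions with the two chosen contrasts and with their product; `alias_orders` is a hypothetical helper that reports only the order (number of letters) of each alias.

```python
def generalized_interaction(x, y):
    """Letters that appear in exactly one of the two effects, alphabetically."""
    return "".join(sorted(set(x) ^ set(y)))


def alias_orders(main_effect, pair):
    """Sorted orders of the three aliases of a main effect in a quarter-replicate."""
    first, second = pair
    contrasts = [first, second, generalized_interaction(first, second)]
    return sorted(len(generalized_interaction(main_effect, c)) for c in contrasts)


recommended = ("ACDEG", "BCDFG")  # pair suggested in the text for p = 7
alternative = ("ABCD", "DEFG")    # pair the text warns against

recommended_orders = {f: alias_orders(f, recommended) for f in "ABCDEFG"}
alternative_orders = {f: alias_orders(f, alternative) for f in "ABCDEFG"}

for factor in "ABCDEFG":
    print(factor, recommended_orders[factor], alternative_orders[factor])
```

Under the recommended pair no main effect has more than one three-way alias, while under the alternative every main effect has at least one and D has two.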

Problem 12.1. Confirm that the estimated interaction effect between sex and adrenalectomy for the data in Table 12.1 is, say, $\bar{E}_{SA} = 2.916$ with an estimated standard error of 2.590. (Hint: Confirm that, in the presence of castration, the estimated interaction is, say,

$$E_{SA|C} = (\bar{X}_{AC|F} - \bar{X}_{\bar{A}C|F}) - (\bar{X}_{AC|M} - \bar{X}_{\bar{A}C|M}) = \bar{X}_{222} - \bar{X}_{221} - \bar{X}_{122} + \bar{X}_{121}$$

$$= 46.917 - 36.429 - 43.000 + 37.357 = 4.845$$

with a weight of

$$w_{SA|C} = \left(\frac{1}{12} + \frac{1}{14} + \frac{1}{8} + \frac{1}{14}\right)^{-1} = 2.8475;$$

and confirm that, in the absence of castration, the estimated interaction is $E_{SA|\bar{C}} = 1.002$ with a weight of $w_{SA|\bar{C}} = 2.8707$. Use the fact that

$$\bar{E}_{SA} = \frac{w_{SA|C} E_{SA|C} + w_{SA|\bar{C}} E_{SA|\bar{C}}}{w_{SA|C} + w_{SA|\bar{C}}},$$

and that

$$\mathrm{se}(\bar{E}_{SA}) = \sqrt{\frac{s^2}{w_{SA|C} + w_{SA|\bar{C}}}}.)$$

Confirm that the estimated interaction effect between sex and castration is, say, $\bar{E}_{SC} = -8.526$ with an estimated standard error of 2.554. (Hint: Confirm that, in the presence of adrenalectomy, the estimated interaction is, say,

$$E_{SC|A} = (\bar{X}_{AC|F} - \bar{X}_{A\bar{C}|F}) - (\bar{X}_{AC|M} - \bar{X}_{A\bar{C}|M}) = \bar{X}_{222} - \bar{X}_{212} - \bar{X}_{122} + \bar{X}_{112}$$

$$= 46.917 - 42.300 - 43.000 + 32.100 = -6.283$$

with a weight of

$$w_{SC|A} = \left(\frac{1}{12} + \frac{1}{10} + \frac{1}{8} + \frac{1}{10}\right)^{-1} = 2.4490;$$

and confirm that, in the absence of adrenalectomy, the estimated interaction is $E_{SC|\bar{A}} = -10.126$ with a weight of $w_{SC|\bar{A}} = 3.4340$.)
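The arithmetic in the first hint can be sketched in Python as follows. The cell means are those listed in Problem 12.9. The cell sample sizes are an assumption: Table 12.1 is not reproduced here, so they were inferred from the weights quoted in Problems 12.1 to 12.3. Cells are indexed (sex, castration, adrenalectomy) with 1 = male or "no" and 2 = female or "yes"; the function name `sa_interaction` is illustrative.

```python
# Cell means from Problem 12.9, indexed (sex, castration, adrenalectomy).
means = {
    (1, 1, 1): 26.231, (1, 2, 1): 37.357, (1, 1, 2): 32.100, (1, 2, 2): 43.000,
    (2, 1, 1): 35.429, (2, 2, 1): 36.429, (2, 1, 2): 42.300, (2, 2, 2): 46.917,
}
# ASSUMED sample sizes, inferred from the weights quoted in Problems 12.1-12.3.
sizes = {
    (1, 1, 1): 13, (1, 2, 1): 14, (1, 1, 2): 10, (1, 2, 2): 8,
    (2, 1, 1): 14, (2, 2, 1): 14, (2, 1, 2): 10, (2, 2, 2): 12,
}


def sa_interaction(j):
    """Sex-by-adrenalectomy interaction at castration level j, and its weight."""
    cells = [(2, j, 2), (2, j, 1), (1, j, 2), (1, j, 1)]
    signs = [+1, -1, -1, +1]
    estimate = sum(s * means[c] for s, c in zip(signs, cells))
    weight = 1.0 / sum(1.0 / sizes[c] for c in cells)
    return estimate, weight


e_cast, w_cast = sa_interaction(2)      # in the presence of castration
e_nocast, w_nocast = sa_interaction(1)  # in the absence of castration
e_sa = (w_cast * e_cast + w_nocast * e_nocast) / (w_cast + w_nocast)

print(round(e_cast, 3), round(w_cast, 4))
print(round(e_nocast, 3), round(w_nocast, 4))
print(round(e_sa, 3))
```

The standard error additionally requires the within-cell mean square $s^2$ from Table 12.1, which is why it is not computed in this sketch.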

Problem 12.2. Check that the following are the four estimates from Table 12.1 of the effect of adrenalectomy, say, $E_{A|ij} = \bar{X}_{ij2} - \bar{X}_{ij1}$, with weights $w_{ij} = n_{ij1} \cdot n_{ij2}/(n_{ij1} + n_{ij2})$.

Sex               Castration      $E_{A|ij}$    $w_{ij}$
Male (i = 1)      No (j = 1)       5.869        5.6522
Male (i = 1)      Yes (j = 2)      5.643        5.0909
Female (i = 2)    No (j = 1)       6.871        5.8333
Female (i = 2)    Yes (j = 2)     10.488        6.4615

Thus confirm that the optimally weighted average is

$$\bar{E}_A = \frac{\sum\sum w_{ij} E_{A|ij}}{\sum\sum w_{ij}} = 7.368$$

with an estimated standard error of

$$\mathrm{se}(\bar{E}_A) = \sqrt{\frac{s^2}{\sum\sum w_{ij}}} = 1.291.$$

Problem 12.3. Check that the two estimates from Table 12.1 of the effect of castration for males, with their weights, are as follows.

Adrenalectomy     $E_{C|M,k}$    $w_{M,k}$
No (k = 1)         11.126         6.7407
Yes (k = 2)        10.900         4.4444

Thus confirm that the optimally weighted average,

$$\bar{E}_{C|M} = \frac{\sum w_{M,k} E_{C|M,k}}{\sum w_{M,k}},$$

and its estimated standard error,

$$\mathrm{se}(\bar{E}_{C|M}) = \sqrt{\frac{s^2}{\sum w_{M,k}}},$$

are the values given in (12.18) and (12.19).

Check that the two estimates from Table 12.1 of the effect of castration for females, with their weights, are as follows.

Adrenalectomy     $E_{C|F,k}$    $w_{F,k}$
No (k = 1)          1.000         7.0000
Yes (k = 2)         4.617         5.4545

Thus confirm that the optimally weighted average and its estimated standard error are the values given in (12.21) and (12.22).


Problem 12.4. Confirm that the Type III estimate of the main effect of adrenalectomy for the data in Table 12.1, and its estimated standard error, are the values given in Table 12.3. (Hint: The four separate estimates of the effect of adrenalectomy are given in Problem 12.2. Confirm that their unweighted average is, say, $E_A = \sum\sum E_{A|ij}/4 = 7.218$. Because the estimated variance of $E_{A|ij}$ is $s^2/w_{ij}$, therefore

$$\widehat{\mathrm{Var}}(E_A) = \frac{s^2}{16} \sum\sum \frac{1}{w_{ij}}.$$

Confirm that the estimated variance is 1.6777, the square root of which is given in Table 12.3.)

Problem 12.5. Confirm that the same estimate as in (12.36) is obtained for the three-way interaction in Table 12.4 if $E_{SAC}$ is defined as $E_{AS|C} - E_{AS|\bar{C}}$ and if $E_{SAC}$ is defined as $E_{CS|A} - E_{CS|\bar{A}}$.

Problem 12.6. Confirm that the estimated interaction between castration and adrenalectomy in Table 12.4 is $E_{AC} = 4.2500$. (Hint: Check that the following is the relevant set of means.

                      Adrenalectomy
Castration        No (k = 1)                    Yes (k = 2)
No (j = 1)        $\bar{X}_{.11} = 31.9375$     $\bar{X}_{.12} = 36.8750$
Yes (j = 2)       $\bar{X}_{.21} = 36.9375$     $\bar{X}_{.22} = 46.1250$

Use the fact that $E_{AC} = \bar{X}_{.22} - \bar{X}_{.12} - \bar{X}_{.21} + \bar{X}_{.11}$.)

Confirm that the estimated interaction between castration and sex in Table 12.4 is $E_{CS} = -9.3750$. (Hint: Check that the following is the relevant set of means.

                      Sex
Castration        Males (i = 1)                 Females (i = 2)
No (j = 1)        $\bar{X}_{11.} = 28.2500$     $\bar{X}_{21.} = 40.5625$
Yes (j = 2)       $\bar{X}_{12.} = 40.0625$     $\bar{X}_{22.} = 43.0000$

Use the fact that $E_{CS} = \bar{X}_{22.} - \bar{X}_{12.} - \bar{X}_{21.} + \bar{X}_{11.}$.)

Problem 12.7. Confirm that a 95% confidence interval for the effect of castration for males for the data in Table 12.4 is $7.64 \le \Delta_{C|M} \le 15.99$. (Hint: Check that the estimated effect is $E_{C|M} = 11.8125$ with an estimated standard error of $\sqrt{s^2\left(\tfrac{1}{16} + \tfrac{1}{16}\right)} = 2.088$.)

Confirm that a 95% confidence interval for the effect of castration for females is $-1.74 \le \Delta_{C|F} \le 6.61$. (Hint: Check that the estimated effect is $E_{C|F} = 2.4375$, also with an estimated standard error of 2.088.)

Confirm that, in a population with 40% "M's" and 60% "F's", a 95% confidence interval for the average effect of castration is $3.18 \le \Delta_C \le 9.20$. (Hint: Check that the estimated average effect is

$$E_C = 0.4 \times 11.8125 + 0.6 \times 2.4375 = 6.1875$$

with an estimated standard error of

$$\sqrt{0.4^2\,(\mathrm{se}(E_{C|M}))^2 + 0.6^2\,(\mathrm{se}(E_{C|F}))^2} = 1.506.)$$
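The population-weighted estimate in the last hint can be sketched in Python. The estimates and the common standard error are those quoted in the problem. The critical value `T_CRIT = 2.00` is an assumption: it is the rounded two-sided 5% point of Student's t that reproduces the intervals quoted above, and it is not stated in the problem itself.

```python
from math import sqrt

E_MALES, E_FEMALES = 11.8125, 2.4375  # estimated effects of castration (from the hints)
SE_EACH = 2.088                       # common estimated standard error (from the hints)
P_MALES, P_FEMALES = 0.4, 0.6         # population proportions given in the problem
T_CRIT = 2.00                         # ASSUMED critical value implied by the quoted intervals

e_avg = P_MALES * E_MALES + P_FEMALES * E_FEMALES
# Variance of a weighted sum of two independent estimates.
se_avg = sqrt((P_MALES * SE_EACH) ** 2 + (P_FEMALES * SE_EACH) ** 2)
lower, upper = e_avg - T_CRIT * se_avg, e_avg + T_CRIT * se_avg

print(round(e_avg, 4), round(se_avg, 3))
print(round(lower, 2), round(upper, 2))
```

Because the two standard errors are equal, the combined standard error is simply 2.088 multiplied by the square root of 0.52.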

Problem 12.8. For q = 0, 1, ..., p − 1, Yates's estimator of the (p − q)-way interaction, say $\bar{Y}_{(p-q)\text{-way}}$, is related to the estimator defined in Section 12.2 by

$$\bar{Y}_{(p-q)\text{-way}} = \frac{1}{2^{p-q-1}} E_{(p-q)\text{-way}}$$

(by convention, q = p − 1 defines a main effect). Show that the estimated variance is

$$\widehat{\mathrm{Var}}(\bar{Y}_{(p-q)\text{-way}}) = \frac{s^2}{2^{p-2}\,n},$$

and that the associated sum of squares is

$$\bar{Y}^2_{(p-q)\text{-way}} \cdot 2^{p-2}\,n,$$

both independent of q. [Hint: To prove the first result, use equation (12.41) and the fact that $\mathrm{Var}(aX) = a^2\,\mathrm{Var}(X)$ for any constant a. To prove the second result, use the fact that the sum of squares for an effect is equal to $s^2 \times \mathrm{Effect}^2/\mathrm{Var}(\mathrm{Effect})$.]

Problem 12.9. Complete the following application of Yates's algorithm to the means in Table 12.1. Each estimated main effect is the corresponding entry in column III divided by 4; each estimated two-way interaction is the corresponding entry in column III divided by 2; and the estimated three-way interaction is just the corresponding entry in column III. Compare the estimated effects with those for the Type III analysis in Table 12.3.

Each sum of squares is the squared entry in column III times $n^{(H)}/8$, where $n^{(H)}$ is the harmonic mean of the eight sample sizes in Table 12.1. Confirm that $n^{(H)} = 11.436$, and compare the sums of squares with those for the Type III analysis in Table 12.2.

Combination    Mean      I         II         III       Effect    SS
(1)            26.231    63.588    138.688    ____
(c)            37.357    ____      161.075    27.643    ____      ____
(a)            32.100    71.858    ____       ____      7.218     1,191.5378
(ca)           43.000    ____      5.617      ____      1.696     16.4377
(s)            35.429    ____      ____       22.387    ____      ____
(cs)           36.429    10.900    17.359     ____      -8.204    384.9004
(as)           42.300    ____      ____       5.847     2.924     48.8709
(cas)          46.917    4.617     ____       3.843     ____      ____
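For readers who wish to check their completed table, the following Python sketch applies Yates's algorithm as it is usually described: each pass writes the sums of successive pairs, followed by the differences (second member minus first) of the same pairs, and p passes are made for $2^p$ means in standard order. The function name `yates` and the constant `N_HARMONIC` are illustrative; the means and the harmonic mean 11.436 are taken from the problem.

```python
def yates(values):
    """Yates's algorithm: p passes of pairwise sums followed by pairwise differences."""
    column = list(values)
    size = len(column)
    passes = size.bit_length() - 1
    if 2 ** passes != size:
        raise ValueError("the number of means must be a power of two")
    for _ in range(passes):
        pairs = [(column[i], column[i + 1]) for i in range(0, size, 2)]
        column = [lo + hi for lo, hi in pairs] + [hi - lo for lo, hi in pairs]
    return column


labels = ["(1)", "(c)", "(a)", "(ca)", "(s)", "(cs)", "(as)", "(cas)"]
means = [26.231, 37.357, 32.100, 43.000, 35.429, 36.429, 42.300, 46.917]
N_HARMONIC = 11.436  # harmonic mean of the eight sample sizes, as quoted above

column_iii = yates(means)
for name, total in list(zip(labels, column_iii))[1:]:
    order = len(name) - 2              # number of letters = order of the effect
    effect = total / 2 ** (3 - order)  # divide by 4, 2, or 1 as stated in the problem
    ss = total ** 2 * N_HARMONIC / 8
    print(f"{name:>5}  III = {total:8.3f}  effect = {effect:7.3f}  SS = {ss:10.4f}")
```

The first entry of column III is the sum of the eight means and does not correspond to a factorial effect, so it is skipped in the printout.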

Problem 12.10. Using the critical value in (12.69), confirm that confidence intervals for the contrasts in (12.67) and (12.68) are:

for $C_1$, -14.85 to 38.72;

for $C_2$, -4.50 to 27.62.

(Hint: Confirm that $C_1 = 11.9333$ and that

$$\mathrm{Var}(C_1) = \mathrm{WMS}\left(\frac{1}{n_{13}} + \frac{1}{n_{23}}\right) = 58.9101;$$

and confirm that $C_2 = 11.5583$ and that

$$\mathrm{Var}(C_2) = \mathrm{WMS}\left(\frac{1}{16}\left[\frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{31}} + \frac{1}{n_{32}}\right] + \frac{1}{4}\left[\frac{1}{n_{13}} + \frac{1}{n_{33}}\right]\right) = 21.1708.)$$

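Problem 12.11. Using the critical value in (12.72), confirm that confidence intervals for the contrasts in (12.73) and (12.74) are:

for $C_1^{(C)}$, 5.92 to 27.15;

for $C_2^{(C)}$, -8.00 to 15.60.

(Hint: Confirm that $C_1^{(C)} = 16.5306$ and that

$$\mathrm{Var}(C_1^{(C)}) = \mathrm{WMS}\left(\frac{1}{4}\left[\frac{1}{3 \times \bar{n}_{.1}^{(H)}} + \frac{1}{3 \times \bar{n}_{.2}^{(H)}}\right] + \frac{1}{3 \times \bar{n}_{.3}^{(H)}}\right) = 13.3980,$$

and confirm that $C_2^{(C)} = 3.8000$ and that

$$\mathrm{Var}(C_2^{(C)}) = \mathrm{WMS}\left(\frac{1}{3 \times \bar{n}_{.3}^{(H)}} + \frac{1}{3 \times \bar{n}_{.4}^{(H)}}\right) = 16.5685.)$$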
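Once a contrast and its estimated variance are in hand, the interval is the contrast plus or minus S times the square root of the variance. A minimal Python sketch follows, using the rounded critical value S = 2.90 from (12.72) and the values quoted in the hint; the function name `contrast_interval` is illustrative.

```python
from math import sqrt

S = 2.90  # critical value from (12.72), rounded as in the text


def contrast_interval(estimate, variance, critical=S):
    """Simultaneous confidence limits: estimate -/+ critical * sqrt(variance)."""
    half_width = critical * sqrt(variance)
    return estimate - half_width, estimate + half_width


lower_1, upper_1 = contrast_interval(16.5306, 13.3980)  # first contrast, (12.73)
lower_2, upper_2 = contrast_interval(3.8000, 16.5685)   # second contrast, (12.74)

print(round(lower_1, 2), round(upper_1, 2))
print(round(lower_2, 2), round(upper_2, 2))
```

Using the unrounded value of S (about 2.903) shifts each limit by roughly 0.01, so agreement with the printed limits depends on using 2.90.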

Problem 12.12. Suppose that three measurements were taken at each of the eight factorial combinations in Table 12.12, with means as follows.

Combination    Mean      Combination    Mean
(a)            4.23      (abc)          3.90
(b)            2.88      (abd)          4.16
(c)            5.75      (acd)          6.22
(d)            5.04      (bcd)          5.93

Assuming that the three-way interactions are all negligible, confirm that the following are estimates of, and sums of squares for, the main effects.

Effect    Estimate    Sum of Squares
A         -0.2725      0.445538
B         -1.0925      7.161338
C          1.3725     11.302538
D          1.1475      7.900538

(Hint: Each estimated effect is obtained by multiplying each mean by +1 or -1 as indicated in Table 12.12, by summing the results, and by dividing by 4 (the number of upper level-lower level differences summed by this simple process). If n denotes the number of measurements contributing to each mean, then each sum of squares is equal to the squared estimated effect times 2n. Here, with n = 3, the sum of squares is equal to six times the squared estimate. Equivalently, each sum of squares is equal, in a general $2^{p-q}$ fractional factorial study, to $n/2^{p-q}$ times the square of the corresponding net sum of the $2^{p-q}$ means.)

Confirm that the following are sums of squares for the two-way interactions.

Interaction     Sum of Squares
AB and CD       0.063038
AC and BD       1.545338
AD and BC       0.003038

(Hint: Each sum of squares is equal to $3/2^{4-1} = \tfrac{3}{8}$ times the square of the corresponding net sum of the eight means.)

If the within-group mean square (with 16 degrees of freedom) is equal to 0.6603, confirm that none of the sums of squares for interaction is statistically significant at the 0.05 level and that the main effects of B, C, and D are.
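The computations described in the hints can be sketched in Python. The sign rule is the one underlying Table 12.12 (+1 if a factor of the effect is at its upper level in the combination, −1 otherwise). The critical value `F_CRIT = 4.49` is an assumption: it is the tabulated 5% point of F with 1 and 16 degrees of freedom and is not stated in the problem. The names `net_sum`, `sum_of_squares`, and `significant` are illustrative.

```python
from math import prod

N_PER_MEAN = 3   # measurements contributing to each mean
WMS = 0.6603     # within-group mean square, 16 degrees of freedom
F_CRIT = 4.49    # ASSUMED tabulated 5% point of F on 1 and 16 degrees of freedom

means = {"a": 4.23, "b": 2.88, "c": 5.75, "d": 5.04,
         "abc": 3.90, "abd": 4.16, "acd": 6.22, "bcd": 5.93}


def net_sum(effect):
    """Signed sum of the eight means for the contrast defining `effect`."""
    return sum(prod(1 if f.lower() in combo else -1 for f in effect) * mean
               for combo, mean in means.items())


def sum_of_squares(effect):
    """n / 2**(p - q) times the squared net sum, as in the hint (2**(4 - 1) = 8 means)."""
    return N_PER_MEAN / len(means) * net_sum(effect) ** 2


estimates = {e: net_sum(e) / 4 for e in "ABCD"}
ss = {e: sum_of_squares(e) for e in ["A", "B", "C", "D", "AB", "AC", "AD"]}
significant = [e for e, value in ss.items() if value / WMS > F_CRIT]

for e, value in ss.items():
    print(f"{e:>2}  SS = {value:9.6f}  F = {value / WMS:6.2f}")
print("significant at the 0.05 level:", significant)
```

Each sum of squares carries one degree of freedom, so the F ratio is simply the sum of squares divided by the within-group mean square.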


Problem 12.13. Confirm that, when any pair of three-way interactions (e.g., ABC and ACD) serve as defining contrasts for a $2^{4-2}$ fractional factorial study, then each main effect is estimable. Two of them (A and C in the example) have 2 two-way interactions and 1 three-way interaction as aliases; and two of them (B and D in the example) have a main effect, a two-way interaction, and the four-way interaction as aliases.

REFERENCES

Afifi, A. A. and Azen, S. P. (1972). Statistical analysis: A computer oriented approach. New York: Academic Press.

BMDP (1981). BMDP statistical software. Berkeley: University of California Press.

Cochran, W. G. and Cox, G. M. (1957). Experimental designs, 2nd ed. New York: Wiley.

Cox, D. R. (1958). Planning of experiments. New York: Wiley.

Federer, W. T. (1955). Experimental design: Theory and application. New York: Macmillan.

Kutner, M. H. (1974). Hypothesis testing in linear models. Am. Stat., 28, 98-100.

Link, B. (1983). Reward system of psychotherapy: Implications for inequities in service delivery. J. Health Soc. Behav., 24, 61-69.

Nathanson, C. A. and Becker, M. H. (1978). Physician behavior as a determinant of utilization patterns: The case of abortion. Am. J. Publ. Health, 68, 1104-1114.

National Bureau of Standards (1963). Experimental statistics. Washington, D.C.: National Bureau of Standards Handbook 91.

Nelder, J. A. (1977). A reformulation of linear models (with discussion). J. R. Stat. Soc. Ser. A, 140, 48-76.

SAS Institute, Inc. (1982). SAS user's guide: Statistics. Cary, N.C.: SAS Institute, Inc.

Speed, F. M., Hocking, R. R., and Hackney, O. P. (1978). Methods of analysis of linear models with unbalanced data. J. Am. Stat. Assoc., 73, 105-112.

SPSS, Inc. (1983). SPSSX user's guide. New York: McGraw-Hill.

Yates, F. (1934). The analysis of multiple classifications with unequal numbers in the different classes. J. Am. Stat. Assoc., 29, 51-66.

