
This article was downloaded by: [University of North Carolina Chapel Hill]
On: 16 December 2008
Access details: [subscription number 768122806]
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Communications in Statistics - Theory and Methods
Publication details, including instructions for authors and subscription information: http://www.informaworld.com/smpp/title~content=t713597238

Sample size re-estimation without unblinding for normally distributed outcomes with unknown variance
A. Lawrence Gould a; Weichung Joseph Shih a

a Biostatistics and Research Data Systems, Merck, Sharp, and Dohme Research Laboratories,

Online Publication Date: 01 January 1992

To cite this Article: Gould, A. Lawrence and Shih, Weichung Joseph (1992) 'Sample size re-estimation without unblinding for normally distributed outcomes with unknown variance', Communications in Statistics - Theory and Methods, 21:10, 2833-2853

To link to this Article: DOI: 10.1080/03610929208830947

URL: http://dx.doi.org/10.1080/03610929208830947

Full terms and conditions of use: http://www.informaworld.com/terms-and-conditions-of-access.pdf

This article may be used for research, teaching and private study purposes. Any substantial or systematic reproduction, re-distribution, re-selling, loan or sub-licensing, systematic supply or distribution in any form to anyone is expressly forbidden.

The publisher does not give any warranty express or implied or make any representation that the contents will be complete or accurate or up to date. The accuracy of any instructions, formulae and drug doses should be independently verified with primary sources. The publisher shall not be liable for any loss, actions, claims, proceedings, demand or costs or damages whatsoever or howsoever caused arising directly or indirectly in connection with or arising out of the use of this material.


COMMUN. STATIST.-THEORY METH., 21(10), 2833-2853 (1992)

SAMPLE SIZE REESTIMATION WITHOUT UNBLINDING

FOR NORMALLY DISTRIBUTED OUTCOMES

WITH UNKNOWN VARIANCE

A. Lawrence Gould Weichung Joseph Shih

Biostatistics and Research Data Systems

Merck, Sharp, and Dohme Research Laboratories

West Point, PA 19486 Rahway, NJ 07065-914

Key Words and Phrases: sample size adjustment; interim analysis;

clinical trial; EM algorithm

ABSTRACT

Monitoring clinical trials in nonfatal diseases where ethical considerations do not dictate early termination upon demonstration of efficacy often requires examining the interim findings to assure that the protocol-specified sample size will provide sufficient power against the null hypothesis when the alternative hypothesis is true. The sample size may be increased, if necessary, to assure adequate power. This paper presents a new method for carrying out such interim power evaluations for observations from normal distributions without unblinding the treatment assignments or discernibly affecting the Type 1 error rate. Simulation studies confirm the expected performance of the method.

Copyright © 1992 by Marcel Dekker, Inc.


1. INTRODUCTION


As a rule, group sequential methods (e.g., Pocock, 1977, 1982; O'Brien and Fleming, 1979; Gould and Pecore, 1982; Lan and DeMets, 1983; Geller and Pocock, 1987) allow early rejection (or, sometimes, acceptance) of the null hypothesis if warranted by the interim findings. These methods often are used in clinical trials in cancer, heart disease, and other life-threatening conditions where ethical considerations require terminating the trial if there is compelling early evidence of efficacy.

Double-blinded trials should remain so until completion if the null

hypothesis will not be accepted or rejected at an interim stage, to prevent

conscious or unconscious bias. However, the ability to check the assump-

tions made in determining the sample size without unblinding would be use-

ful, to assure that the trial has adequate power. Gould (1992) described a

means for doing so when the outcomes were binomially distributed. Nor-

mally distributed outcomes with unknown within-group variances require a

different approach because estimating the within-group variances requires

group mean information unneeded for binomially distributed outcomes.

Sample size readjustment for normally distributed data has been

studied previously, most recently by Lohr (1988) and by Wittes and Brittain

(1990). Lohr obtained the asymptotic properties of estimates of the mean

and covariance matrix of a multivariate normal distribution when the sam-

ple size can be adjusted on the basis of one or two interim analyses of the

data. Lohr's method is based on the usual corrected cross-product estimator

of the sample covariance matrix, and so would require knowing the



individual group means in the hypothesis-testing situation considered here.

Wittes and Brittain studied by simulation a procedure for adjusting the

sample size in finite samples that also requires knowing the individual group

means at the interim examination. The approach considered here does not

require knowing the group means at the interim examination, and applies

for finite samples from univariate normal distributions.

Section 2 below describes the method and how it affects the Type 1 error rate in finite samples. Section 3 discusses estimating $\sigma^2$, the common within-group variance. Section 4 provides the findings from a series of simulation studies that confirm the anticipated performance of the method. Section 6 addresses briefly several issues arising in its application.

2. METHOD

2.1 Description The method is analogous to Stein's (1945) method for

obtaining a sample large enough to provide a specified-width confidence

interval, but differs in (a) not requiring identification of the treatment

assignments, and (b) using all of the information in the combined sample.

Suppose that $N$ observations are to be drawn, $\theta N$ from a $N(\mu_1, \sigma^2)$ distribution and $(1-\theta)N$ from a $N(\mu_2, \sigma^2)$ distribution, $\sigma^2$ unknown, $0 < \theta < 1$. For simplicity, assume $\theta = 0.5$, although this is not essential. The null hypothesis $H_0$: $\mu_1 = \mu_2$ ordinarily would be tested against the alternative $H_1$: $\mu_1 \neq \mu_2$ using a Student t test. Given Type 1 and Type 2 error rates $\alpha$ and $\beta$, respectively, the total sample size would be determined from

$$N = \frac{4\sigma^{*2}\,(z_{\alpha/2} + z_{\beta})^2}{(\mu_1 - \mu_2)^2} \qquad (1)$$



where $\mu_1 - \mu_2$ is determined by a specific alternative hypothesis $H_1$: $\mu_1 - \mu_2 = \Delta$ (a known value), $\sigma^{*2}$ is an assumed value for $\sigma^2$, and $z_\gamma$ is the value at which the standard normal cdf equals $\gamma$. If $\sigma^{*2}$ underestimates $\sigma^2$, the actual likelihood of rejecting $H_0$ when $H_1$ is true will be less than the power specified for the trial.
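The power loss from underestimating the variance can be illustrated with a minimal sketch. It assumes the usual normal-approximation form for the total sample size with equal allocation, N = 4σ*²(z_{1-α/2} + z_{1-β})²/Δ², and it ignores the negligible rejection probability in the opposite tail. The function names and the example values (Δ = 1, assumed σ* = 1, true σ = 1.414) are illustrative choices, not computations from the paper.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def total_sample_size(delta, sigma_assumed, alpha=0.05, power=0.90):
    """Total N (both groups) for a two-sided test, normal approximation."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return 4 * sigma_assumed**2 * (z_alpha + z_beta) ** 2 / delta**2


def approximate_power(n_total, delta, sigma_true, alpha=0.05):
    """Approximate power with n_total/2 observations per group."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(sqrt(n_total) * delta / (2 * sigma_true) - z_alpha)


# Plan with sigma* = 1, then evaluate power if the true sigma is larger.
n_planned = total_sample_size(delta=1.0, sigma_assumed=1.0)
power_if_correct = approximate_power(n_planned, delta=1.0, sigma_true=1.0)
power_if_underestimated = approximate_power(n_planned, delta=1.0, sigma_true=1.414)
```

Under these assumptions the planned design attains its nominal 90% power only when the assumed and true standard deviations agree; with the larger true value the approximate power falls well below the target.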

Now suppose that the sample size will be reconsidered after $n$ ($< N$) observations (e.g., $n \approx N/2$) without knowing the treatment assignments. With a reasonable estimate, $\hat\sigma^2$, of the within-group variance, $\sigma^2$, one can determine via (1) the actual sample size

$$N' = \frac{4\hat\sigma^{2}\,(z_{\alpha/2} + z_{\beta})^2}{\Delta^2} \qquad (2)$$

needed to provide $100(1-\beta)\%$ power for rejecting the null hypothesis. If $N'$ is "sufficiently larger" than $N$, additional patients would be obtained to bring the final sample size up to $N'$; otherwise, the trial would be completed as planned. For example, requiring $N'/N > 1.25$ means that the sample will be increased only if the "correct" sample size is more than 25% larger than the original sample size. To keep the final sample size within reasonable limits, $N'$ might be limited to no more than some multiple $\omega$ of $N$ (e.g., $N' \le 2N$). The options when $N' > \omega N$ are discussed in Section 6.
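The decision rule just described can be sketched as follows. The sketch assumes that the required sample size is proportional to the variance, so that N' = N·σ̂²/σ*², and it simply caps the result at the chosen multiple of N; capping is only one possible policy, and Section 6 discusses alternatives. The function name, argument names, and default thresholds are hypothetical.

```python
from math import ceil


def adjusted_total_sample_size(n_planned, sigma_assumed, sigma_interim,
                               min_ratio=1.25, max_ratio=2.0):
    """Return the final total sample size after a blinded interim look.

    n_planned     -- protocol-specified total sample size N
    sigma_assumed -- standard deviation assumed at the design stage
    sigma_interim -- interim estimate of the within-group standard deviation
    min_ratio     -- increase only if N'/N exceeds this value
    max_ratio     -- never exceed max_ratio * N (an assumed capping policy)
    """
    n_needed = n_planned * sigma_interim**2 / sigma_assumed**2
    if n_needed / n_planned <= min_ratio:
        return n_planned  # complete the trial as planned
    return min(ceil(n_needed), int(max_ratio * n_planned))


unchanged = adjusted_total_sample_size(100, 1.0, 1.1)   # ratio 1.21, below 1.25
increased = adjusted_total_sample_size(100, 1.0, 1.25)  # ratio 1.5625
capped = adjusted_total_sample_size(100, 1.0, 1.5)      # ratio 2.25, capped at 2N
```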

2.2 Effect on Type 1 Error Rate Let the random variable $Z_1$ denote the difference between the means of the initial samples based on a total of $n_1$ observations, and let the random variable $Z_2$ denote the difference between the subsequent sample means, based on a total of $n_2$ observations. $Z_1$ and $Z_2$ both estimate $\delta = \mu_1 - \mu_2$; neither $Z_1$ nor $Z_2$ actually would be observed in practice because the group membership of the data remains blinded. Suppose for simplicity that equal numbers of observations are drawn from each distribution. Combining the two samples yields

$$N = n_1 + n_2, \quad m = m_1 + m_2, \quad Z = (n_1 Z_1 + n_2 Z_2)/N, \quad \text{and} \quad s^2 = (m_1 s_1^2 + m_2 s_2^2)/m,$$

where $s_i^2$ denotes an estimator of $\sigma^2$ based on $m_i$ degrees of freedom from the initial ($i = 1$) or subsequent ($i = 2$) sample. Assume that $m_i s_i^2/\sigma^2$ has a chi-square distribution with $m_i$ degrees of freedom, at least approximately. Values for $s_1^2$ and $s_2^2$ are required in practice. The hypothesis $H_0$: $\delta = 0$ will be tested using the statistic $t = Z/s$.

The probability of wrongly rejecting $H_0$ when $n_2$ does not depend on $s_1^2$ is provided by the integral of a central t density with $m$ degrees of freedom over the set of values $|t| > t_c$, an appropriate critical value. The probability cannot be computed in this way when $n_2$ depends on $s_1^2$.

The joint density of the mean and sample variance from the initial sample is essentially the product of a normal and a chi-square density. Conditional on $s_1^2$, the same is true of the joint density of the mean and sample variance from the second sample. Consequently, the joint density of the statistics from both samples is the product of these densities. The joint density of $Z$ and the sample variances can be written as



Since $n_1$ and $\sigma^2$ are fixed quantities, this expression can be simplified with no loss of generality by the transforms $v_i = m_i s_i^2/\sigma^2$, $i = 1, 2$. With the additional transformation of $Z$ to the statistic $t$, the density becomes

The probability of rejecting $H_0$ is the integral of (3):

The quantity $t_c(v_1)$ depends on $v_1$ because the distribution of $t$ and $v_2$ depends on $n_2$, which is determined by $s_1^2$ and, therefore, by $v_1$. Consequently, the order of integration in (4) cannot be interchanged, as the usual derivation of the Student t density would require.

To illustrate the effect of the dependence, suppose that $n_2$ depends on $v_1$ in the following way: $v_1 \le v_1^*$, $n_2 = n_{21}$; $v_1 > v_1^*$, $n_2 = n_{22}$. With the transformation $v_1, v_2 \to v$ ($= v_1 + v_2$), $w$ ($= v_1/v$), (4) can be written as

where $I_x(\cdot,\cdot)$ denotes the usual incomplete Beta function, $f_{\chi^2}(\cdot\,; m)$ denotes a central chi-square density with $m$ degrees of freedom, $\Phi(\cdot)$ denotes the standard normal cdf, $t_c^{(1)}$ denotes the critical value for a central t distribution with $m^{(1)} = m_1 + m_{21}$ degrees of freedom, $m_{21} = n_{21} - 2$, etc. The first integral in (5) is $\alpha$, the nominal Type 1 error rate. The remaining terms of (5) represent the perturbation of the Type 1 error rate due to the sequential sampling scheme. These latter two terms cancel if $n_{21} = n_{22}$.

The magnitude of the perturbation can be calculated easily. Thus, suppose that $n_1 = 20$, so that $m_1 = 18$. This is not a large initial sample. At the interim stage, decide to obtain $n_2 = 20$ more observations (10 from each group) if $s_1^2 < 1.5$, or $n_2 = 40$ more observations if $s_1^2 > 1.5$. Suppose that the test is to be at a nominal 5% level, so that the critical t value would be $t_c = 2.03$ ($n_2 = 20$) or 2.00 ($n_2 = 40$). Assume that $\sigma = 1$. Then the lower integration limit in (5) is $v_1^* = m_1 s_1^2/\sigma^2 = 18 \times 1.5/1 = 27$. Figure 1 plots the values of the algebraic sum of the second and third terms of (5). The net value of this sum is -0.0002, which represents the negligible difference between the true and nominal Type 1 error rates in this example. The simulation findings presented below also support the assertion that this approach has a negligible effect on the Type 1 error rate.
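The numerical illustration above can also be checked by simulation. The following is a rough Monte Carlo sketch, not the authors' integral calculation: it draws the sufficient statistics directly (mean differences as normal variates, scaled variance estimates as chi-square variates) under the null hypothesis with σ = 1, and it standardizes the combined mean difference by its standard error, sqrt(4s²/N), which is an assumption about how the t statistic is scaled. The function name, seed, and number of replications are arbitrary.

```python
import math
import random


def simulate_type1_rate(reps=100_000, n1=20, s1_sq_cutoff=1.5,
                        n2_small=20, n2_large=40,
                        t_crit_small=2.03, t_crit_large=2.00, seed=12345):
    """Estimate the Type 1 error rate of the two-stage rule under H0."""
    rng = random.Random(seed)
    m1 = n1 - 2  # degrees of freedom of the initial variance estimate
    rejections = 0
    for _ in range(reps):
        # Stage 1: mean difference and m1*s1^2/sigma^2 (chi-square, m1 d.f.).
        z1 = rng.gauss(0.0, math.sqrt(4.0 / n1))
        v1 = rng.gammavariate(m1 / 2.0, 2.0)
        # The second-stage size depends on the interim variance estimate.
        if v1 / m1 < s1_sq_cutoff:
            n2, t_crit = n2_small, t_crit_small
        else:
            n2, t_crit = n2_large, t_crit_large
        m2 = n2 - 2
        # Stage 2: mean difference and m2*s2^2/sigma^2.
        z2 = rng.gauss(0.0, math.sqrt(4.0 / n2))
        v2 = rng.gammavariate(m2 / 2.0, 2.0)
        # Combine the stages and form the standardized test statistic.
        n_total = n1 + n2
        z = (n1 * z1 + n2 * z2) / n_total
        s_sq = (v1 + v2) / (m1 + m2)
        t = z / math.sqrt(4.0 * s_sq / n_total)
        if abs(t) > t_crit:
            rejections += 1
    return rejections / reps
```

Calling `simulate_type1_rate()` returns an estimated rejection rate that can be compared with the nominal 5% level; with 100,000 replications the Monte Carlo standard error is roughly 0.0007, so the sketch can confirm only that any perturbation is small, not its exact size.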

Figure 1. Deviation from True Type 1 Error Rate: Values of Differences Between Integrands. Curves: $n_1 = 20$, $n_2 = 0$ or 20 (integrated diff. = -0.0103); $n_1 = 20$, $n_2 = 20$ or 40 (integrated diff. = -0.0002).

The sample size re-estimation approach described here does not rule out the possibility that the interim estimate of $\sigma^2$ might be small enough so that no further observations would be required to assure the desired power, i.e., that $n_2 = 0$. Essentially the same argument used to obtain (5) establishes the following result:



Here, $t_c^{(1)}$ refers to the critical value for a t distribution with $m_1 = n_1 - 2$ degrees of freedom and $t_c^{(2)}$ refers to a t distribution with $m = m_1 + m_2$ d.f.

To illustrate the effect of possible early termination on the Type 1 error rate, suppose that $n_1 = 20$. At the interim stage, obtain $n_2 = 20$ more observations if $s_1^2 \ge 0.5$, or call the trial complete if $s_1^2 < 0.5$. For a nominal 5% level test, the critical t value would be $t_c = 2.10$ ($n_2 = 0$) or 2.03 ($n_2 = 20$). If $\sigma = 1$ then the lower integration limit in (6) is $v_1^* = m_1 s_1^2/\sigma^2 = 18 \times 0.5/1 = 9$. Figure 1 also displays the results of the calculations for this case. Even with the small sample size (10 or 20 observations per group), the RHS of (6) is -0.01, a small and conservative effect on the Type 1 error rate.

3. ESTIMATING $\sigma^2$

If the treatment assignments were known, an estimate of $\sigma^2$ could be computed by pooling the within-group sample variances. Since the assignments are unknown, $\sigma^2$ must be estimated some other way. We consider two ways to estimate $\sigma^2$: a simple adjustment of the pooled sample variance based on the difference between the means presumed by $H_1$; and the EM algorithm, which does not depend on $H_1$.

3.1 Simple adjustment Suppose the interim sample contains $\theta n$ observations from group 1 and $(1-\theta)n$ observations from group 2; $n$ is known, $\theta$ is unknown. Let $x_{ij}$ denote the $j$-th observation from group $i$. The overall estimate of $\sigma^2$ based on the pooled sample can be computed without unblinding and written formally as

$$s^2 = \frac{1}{n-1}\sum_{i}\sum_{j}(x_{ij} - \bar{x})^2 = \frac{1}{n-1}\left\{(n-2)\hat\sigma^2 + n\theta(1-\theta)(\bar{x}_1 - \bar{x}_2)^2\right\},$$

where $\hat\sigma^2$ denotes the unknown within-group estimate of $\sigma^2$. Since the interim sample is blinded, $\theta$ and the group sample means $\bar{x}_1$, $\bar{x}_2$ will be unknown, as will both terms of this last expression. However, if the alternative hypothesis $H_1$: $\mu_1 - \mu_2 = \Delta$ is true and if $n$ is large enough so that $\bar{x}_1 - \bar{x}_2$ is reasonably close to $\Delta$, then

$$n\theta(1-\theta)(\bar{x}_1 - \bar{x}_2)^2 \approx \theta(1-\theta)(n-1)\Delta^2,$$

so that if $\theta = 0.5$,

$$\hat\sigma^2 = \frac{n-1}{n-2}\left(s^2 - \Delta^2/4\right). \qquad (7)$$

When a blocked randomization scheme is used to assign subjects to treatments, $\theta$ will be very nearly known and very close to 0.5. This will be true especially if the block size is 1× or 2× the number of treatments. The effect will be to improve the approximation immediately preceding (7).
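A minimal sketch of the simple adjustment, assuming that (7) has the form σ̂² = (n-1)(s² - Δ²/4)/(n-2) with equal allocation, where s² is the ordinary sample variance of the pooled, blinded observations. The function name is hypothetical, and the raw value is returned without truncation, so it can be negative when s² is smaller than Δ²/4.

```python
from statistics import variance


def simple_adjusted_variance(blinded_values, delta):
    """Blinded within-group variance estimate under the assumed form of (7).

    blinded_values -- pooled interim observations, treatment labels unknown
    delta          -- mean difference specified by the alternative hypothesis
    """
    n = len(blinded_values)
    s_sq = variance(blinded_values)  # pooled sample variance, divisor n - 1
    return (n - 1) / (n - 2) * (s_sq - delta**2 / 4.0)


# Tiny illustration: four pooled observations and an assumed difference of 1.
example_estimate = simple_adjusted_variance([0.0, 0.0, 1.0, 1.0], delta=1.0)
```

In this toy case s² = 1/3, so the adjusted estimate is 1.5 × (1/3 - 1/4) = 0.125.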

3.2 EM Algorithm Since the treatment identifications are unknown, any of the interim observations $x_i$, $i = 1, \ldots, n$ could be in either treatment group, so that the treatment assignments are "missing at random" (Rubin, 1976). Let $\tau_i$ denote the treatment group membership indicator:

$\tau_i = 1$ (0) if sample member $i$ is in treatment group 1 (group 2).

$\tau_1, \ldots, \tau_n$ are independent random variables with $P(\tau_i = 1) = \theta$. Given $\tau_i$, $x_i$ ($i = 1, \ldots, n$) has a normal distribution with density



TABLE 1

Accuracy of EM algorithm estimate of sigma (100 iterations per case)

25 obs/gp             |True Mean Difference| / True σ
                 0              0.5             1               2
True σ      Mean   S.D.    Mean   S.D.    Mean   S.D.    Mean   S.D.

50 obs/gp             |True Mean Difference| / True σ
                 0              0.5             1               2
True σ      Mean   S.D.    Mean   S.D.    Mean   S.D.    Mean   S.D.
0.5         0.481  0.035   0.494  0.038   0.511  0.041   0.576  0.091

Notes: (1) Each recursive computation of $\hat\sigma$ continued until convergence (successive estimates differing by 0.01 or less) or until 50 cycles had been reached, whichever came first.

(2) The tabulated quantities are the estimated values of $\sigma$ and the corresponding standard deviations among the 100 repetitions of each case.

The expression for the conditional probability (or expectation) of $\tau_i$ given $x_i$ therefore is $P(\tau_i = 1 \mid x_i) = E(\tau_i \mid x_i)$


The log likelihood of the interim observations follows from (8),

The EM algorithm (Dempster, Laird, and Rubin, 1977) for estimating $\sigma$ proceeds as follows. Assume $\theta = 0.5$. The "E" step consists of substituting "current" estimates of $\mu_1$, $\mu_2$, and $\sigma$ into (9) to obtain provisional values for the expectations of the $\tau_i$. The "M" step consists of obtaining maximum likelihood estimates of $\mu_1$, $\mu_2$, and $\sigma^2$ after replacing the $\tau_i$ in (10) with their provisional expectations. The "E" and "M" steps are repeated until the value of $\sigma^2$ stabilizes; the resulting value is the estimate, $\hat\sigma^2$, of $\sigma^2$ required in (2). Table 1 provides the results of a small simulation study investigating the performance of this algorithm. Although $\sigma^2$ was estimated accurately, $(\mu_1 - \mu_2)/\sigma$ was not estimated well. The averages over the iterations of the values of $(\hat\mu_1 - \hat\mu_2)/\hat\sigma$, based on maximum likelihood estimators, ranged from 0.3 to 0.5 in 29 of the 32 cases shown in Table 1, in no particular pattern; the exceptional values were 0.6, 0.7, and 0.8. This is consistent with Fowlkes's (1979) assertion that the accuracy of the estimates of $\mu_1$ and $\mu_2$ cannot be assured due to their sensitivity to the starting values.
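The E and M steps described above can be sketched as follows for a two-component normal mixture with equal mixing proportions and a common variance. This is a generic EM sketch under those assumptions, not a transcription of the paper's displays (8) to (10): the E step uses the standard posterior membership probability, the M step uses weighted means and a weighted pooled variance, and the stopping rule (successive σ estimates within 0.01, at most 50 cycles) follows the notes to Table 1. The function name, starting values, and simulated data are hypothetical.

```python
import math
import random
import statistics


def em_common_sigma(values, mu1, mu2, sigma, tol=0.01, max_cycles=50):
    """EM for a 50:50 mixture of N(mu1, sigma^2) and N(mu2, sigma^2)."""
    n = len(values)
    for _ in range(max_cycles):
        # E step: provisional probability that each value is from group 1.
        weights = []
        for x in values:
            exponent = (mu1 - mu2) * (mu1 + mu2 - 2.0 * x) / (2.0 * sigma**2)
            exponent = max(min(exponent, 700.0), -700.0)  # avoid overflow
            w = 1.0 / (1.0 + math.exp(exponent))
            weights.append(min(max(w, 1e-12), 1.0 - 1e-12))
        # M step: weighted maximum likelihood estimates.
        w_sum = sum(weights)
        mu1 = sum(w * x for w, x in zip(weights, values)) / w_sum
        mu2 = sum((1.0 - w) * x for w, x in zip(weights, values)) / (n - w_sum)
        new_sigma = math.sqrt(
            sum(w * (x - mu1) ** 2 + (1.0 - w) * (x - mu2) ** 2
                for w, x in zip(weights, values)) / n
        )
        converged = abs(new_sigma - sigma) <= tol
        sigma = new_sigma
        if converged:
            break
    return mu1, mu2, sigma


# Illustration on simulated blinded data (true sigma = 1, mean difference = 1).
_rng = random.Random(2024)
data = ([_rng.gauss(-0.5, 1.0) for _ in range(200)]
        + [_rng.gauss(0.5, 1.0) for _ in range(200)])
overall_sd = statistics.pstdev(data)
center = statistics.fmean(data)
mu1_hat, mu2_hat, sigma_hat = em_common_sigma(
    data, center - 0.2, center + 0.2, overall_sd)
```

Because the M step measures spread around two weighted means rather than one overall mean, the resulting estimate of σ can never exceed the overall standard deviation of the pooled data.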

3.3 Initial values for EM algorithm We adapt a suggestion of Fowlkes (1979) for finding initial parameter estimates for the EM algorithm. Let $z_{(1)} < z_{(2)} < \ldots < z_{(n)}$ denote the ordered data at the interim evaluation. Let $p_i = (i - 0.5)/n$ for $i = 1, \ldots, n$ and calculate $q_i = \Phi^{-1}(p_i)$, where $\Phi^{-1}$ denotes the inverse of the standard normal distribution function. Fit a simple linear regression by least squares to the points $\{(q_i, z_{(i)}),\ i = 1, \ldots, n\}$; let $b$ denote the slope of the fitted line, and let $a$ denote its intercept:

The initial values of $\sigma$, $\mu_1$, and $\mu_2$ then are

where $c$ is some chosen constant. The choice of $c$ influences the estimation of the means, but not the variance. Ideally, we would like $c = 2\sigma/(\mu_2 - \mu_1)$; however, although $b$ estimates $\sigma$, there is no good estimate of $(\mu_2 - \mu_1)$. We get around this problem in the following way. In most clinical trials that use a normal approximation for estimating the sample size, the inverse of the coefficient of variation $\Delta/\sigma = (\mu_1 - \mu_2)/\sigma$ usually ranges between 0.20 and 0.50 (which correspond to about 430 and 70 patients per group, respectively, for power = .90, one-sided $\alpha$ = 0.05). We suggest taking the middle value in this range, 0.35, and converting it to $c = 2 \times (1/0.35) = 5.71$.
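The starting-value recipe can be sketched as below. The regression of the ordered data on normal quantiles follows the description above; the specific starting values returned (slope b for σ, and a - b/c and a + b/c for the two means) are an assumption inferred from the stated ideal c = 2σ/(μ2 - μ1), since the original display is not reproduced here. The function name is hypothetical, and at least two distinct observations are needed.

```python
from statistics import NormalDist, fmean


def em_initial_values(values, c=5.71):
    """Starting values (sigma0, mu1_0, mu2_0) from a normal quantile regression."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()
    # Normal quantiles q_i at plotting positions p_i = (i - 0.5) / n.
    q = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    q_bar, z_bar = fmean(q), fmean(ordered)
    # Least-squares slope b and intercept a of ordered data on quantiles.
    slope = (sum((qi - q_bar) * (zi - z_bar) for qi, zi in zip(q, ordered))
             / sum((qi - q_bar) ** 2 for qi in q))
    intercept = z_bar - slope * q_bar
    # Assumed starting values: means placed symmetrically about the intercept.
    return slope, intercept - slope / c, intercept + slope / c


sigma0, mu1_0, mu2_0 = em_initial_values([-2.0, -1.0, 0.0, 1.0, 2.0])
```

With c = 5.71 the two starting means are separated by 2b/c, that is, by about 0.35 times the starting value of σ.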

4. SIMULATION STUDIES

4.1 Design Simulation studies explored the behavior of the procedure over a range of parameter values likely to occur in practice. The values of sigma assumed by the design ($\sigma^*$) and the true value of sigma ($\sigma$) were set at 0.707, 1, 1.414, 2, 2.828, and 4. All combinations of $\sigma$ and $\sigma^*$ values were considered. The design always assumed $\Delta = 1$, and the sample size was selected to provide 90% power for rejecting the null hypothesis when the alternative was true. Equal samples were taken from each distribution ($\theta = 0.5$). For the simulation, the true mean differences were set at 0 (null hypothesis true), 0.5, 1, and 2. The effects of evaluating the sample size after obtaining 25% and 50% of the initially planned data were considered, as were the effects of two rules for deciding to increase the sample size (increase if $N'/N > 1.33$ or 1.05). In all cases, $N' \le 2N$, reflecting a practical limitation on increasing the size of ongoing studies. The effect of the algorithm used to estimate $\sigma$ (simple or EM) also was evaluated. In all, 864 cases (36 combinations of $\sigma$ and $\sigma^*$, 3 nonzero true mean difference values, 2 examination time values, 2 values of sample size increase rule, 2 algorithms) were run. Each case included a test with a zero mean difference and a nonzero true mean difference, so there were 864 tests of the null hypothesis when it was true. Each case was replicated 1000 times, and statistics were collected about the number of rejections of the null hypothesis when it was true and when it was false, and the distributions of the final sample size under either hypothesis.
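The factorial layout of the simulation can be written down explicitly, which also confirms the count of 864 cases. This is only an enumeration of the design factors listed above; the variable names and the labels for the two algorithms are illustrative.

```python
from itertools import product

sigma_values = [0.707, 1, 1.414, 2, 2.828, 4]   # used for both design and true sigma
true_mean_differences = [0.5, 1, 2]             # nonzero alternatives
interim_fractions = [0.25, 0.50]                # share of planned data at the look
increase_thresholds = [1.33, 1.05]              # increase if N'/N exceeds this
algorithms = ["simple", "EM"]                   # way of estimating sigma

cases = list(product(sigma_values, sigma_values, true_mean_differences,
                     interim_fractions, increase_thresholds, algorithms))
n_cases = len(cases)                            # 6 * 6 * 3 * 2 * 2 * 2
```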

4.2 Results The probability of rejecting $H_0$ when it was true did not depend materially on any of the factors defining the cases: none of the coefficients differed significantly from 0 in a logistic regression relating the probability of wrongly rejecting $H_0$ to these factors for each algorithm. Therefore, the 864 rejection frequency values should be distributed like Binomial variates with n = 1000 and p = 0.05. Figure 2 displays the distributions of the rejection frequencies for the two algorithms. The results agree closely with expectation.
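As a rough guide to the spread one would expect in Figure 2, assume each case's rejection count $X$ behaves as a Binomial(1000, 0.05) variate. The mean and standard deviation then work out as follows; the normal-approximation range in the last expression is an added illustration, not a result reported from the simulations.

```latex
E(X) = np = 1000 \times 0.05 = 50, \qquad
\mathrm{SD}(X) = \sqrt{np(1-p)} = \sqrt{47.5} \approx 6.9, \qquad
50 \pm 1.96 \times 6.9 \approx (36.5,\ 63.5).
```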


Figure 2. Observed and Expected CDF of Rejections of $H_0$ in 1000 Runs (432 cases for each way of estimating $\sigma^2$). Horizontal axis: rejections of $H_0$ (30 to 80); plotted: observed CDF (EM algorithm) and expected CDF if p = 0.05.

Figure 3 displays the effects of correctly and incorrectly specifying the true mean difference and the true variance on the likelihood of rejecting $H_0$ when $\Delta \neq 0$. The two algorithms for estimating $\sigma$ behaved essentially identically. This probably reflects the range of $\Delta/\sigma$ values used in the simulations (which covers most of the situations in clinical trials that use a normal approximation for sample size calculations). Overspecifying the true mean difference or underspecifying the true variance caused a loss in power, as expected. However, when the true mean difference and variance were correctly specified, the power was very close to the assumed value of 90%, usually exceeding it slightly. Since the EM procedure does not depend on $\sigma^*$, the value assumed for $\sigma$ in calculating the sample size, the loss of power when $\sigma^*$ ($= \sigma_{des}$ in Fig. 3) is less than $\sigma_{true}$ actually was due to requiring that $N' \le 2N$.

Figure 3. Percent of 1000 Runs Rejecting $H_0$ as a Function of the True Mean Difference (TMD) and $\sigma_{des}/\sigma_{true}$. Curves: TMD = 0.5, 1, and 2; marked point: design assumptions (power = 90%, $\sigma_{des} = \sigma_{true}$).

5. EXAMPLE

Suppose that a difference $\Delta = 0.30$ is to be detected with 90% power using a 1-sided 5% level test ($\alpha = .05$). A design taking $\sigma^* = 1.5$ would require 430 patients per group; a design with $\sigma^* = 0.80$ would require 120 patients in each group. If the (unknown) true value of $\sigma$ actually were 1, then the trial should contain 190 patients per group. In practice an interim examination might be carried out after observing 100 patients, 50 from each group, and might suggest that the final sample should contain 200 patients in each group. If the trial had been designed with $\sigma^* = 0.80$, this would mean that 160 more patients than planned needed to be entered into the trial and assigned at random to the two groups. If the trial had been designed with $\sigma^* = 1.5$, no further patients beyond those planned would need to be recruited for the trial.
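The per-group sample sizes quoted in this example can be reproduced approximately from the usual normal-approximation formula for a one-sided test with equal allocation. The formula below is an assumed form, stated for illustration, with $z_{0.95} \approx 1.645$ and $z_{0.90} \approx 1.282$; the results round to the values given in the text.

```latex
n_{\text{per group}} = \frac{2\sigma^2\,(z_{1-\alpha} + z_{1-\beta})^2}{\Delta^2}
  = \frac{2\sigma^2\,(1.645 + 1.282)^2}{0.30^2} \approx 190.4\,\sigma^2

\sigma = 1.5:\ 190.4 \times 2.25 \approx 428 \ (\text{about } 430), \qquad
\sigma = 0.80:\ 190.4 \times 0.64 \approx 122 \ (\text{about } 120), \qquad
\sigma = 1:\ \approx 190

\text{Extra patients if designed with } \sigma^* = 0.80:\quad 2 \times 200 - 2 \times 120 = 160
```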

6. DISCUSSION

The method described here does not estimate reliably the true difference between the treatment means (Fowlkes, 1979), and so does not provide a way to ascertain the actual magnitude of $\mu_1 - \mu_2$. The average and median "mean difference/σ" values estimated from the 100 repetitions of each case summarized in Table 1 did not depend materially on the true "mean difference/σ" values.

The statistical power specified at the planning stage and checked

at the interim stage corresponds to a fixed alternative hypothesis that

the true mean difference equals A, a quantity specified by the

researcher. In the context of a clinical trial, A would be the least

clinically meaningful difference worth detecting, identified a priori.

The method provides a given level of assurance for detecting a specified



difference if it is present. It is not designed to enhance the likelihood of

detecting the difference that appears to be present (which cannot be

estimated).

The method ordinarily needs to be applied only once, when enough data are available to provide a reasonably reliable estimate of $\sigma^2$. Table 1 suggests that as few as 25 observations per group should suffice. From (2), $N'$ is a random variable with a heavy tail to the right; when the assumed and true $\sigma$ values happen to be close, then overly large $N'$ values become disproportionately more likely with smaller values of N. Thus, an interim look with fewer than 25 observations per group may lead to too large a final sample size. The procedure does not have to be repeated after obtaining a reliable estimate of $\sigma^2$ because the estimate and, therefore, the sample size, will not change materially with further looks. Moreover, adding new patients to a multicenter clinical trial brings up many administrative issues, e.g., changes in contracts, funding, perhaps number of centers, etc. The fewer of these that have to be made, and the earlier, the better.

When $N' > \omega N$, there are two options. The trial may be terminated immediately and its results summarized without testing the hypotheses. Such a trial would be regarded as uninformative about the hypotheses, and reexamination of the assumptions about the variability of the responses or the relevance of the target population would be appropriate. Alternatively, the trial could be continued to completion with the additional observations, accepting the possibility that the actual power may be less than desired. Less power does not mean zero power, so rejection of the null hypothesis still could occur on completion of the trial.

The reestimated sample size could turn out to be much smaller

than the planned size (e.g., 180 vs. 430 patients per group as in Section

5 ) , suggesting that the trial could be terminated after obtaining the

initial observations. This is unlikely to affect the Type 1 error rate

materially, as shown in section 2.2. However, unless ethical considera-

tions dictate otherwise, the trial should not be terminated because

demonstrating efficacy with respect to a single variable seldom is the

only objective of a trial.

The EM algorithm always reasonably estimates $\sigma$, regardless of the true and assumed values of $\Delta$ and $\sigma$. This certainly is useful for designing additional trials in the same indication before completing the current trial. More importantly, however, the value of $N'$ provided by (2) is the value likely to provide the required power for rejecting $H_0$ in favor of the specified alternative. This is not necessarily true for the simple method. The simple estimate of $\sigma$ assumes a value for $\Delta$ and, from (7), may understate or overstate the true value of $\sigma$ depending on whether this assumed value overestimates or underestimates the true value. Overestimating the true value of $\Delta$ causes underestimation of $\sigma$, so $N'$ is insufficient to provide the required power against $H_0$ in favor of the specified alternative. This guards against an inflated sample size when $H_0$ is true, but the power loss may be excessive when the true

value of $\Delta$ is only a little less than the value set by $H_1$. The converse is true when the assumed value of $\Delta$ is less than the true value, so that the simple method has the undesirable property of moving the sample size away from clinical reality (Spiegelhalter, Freeman, and Blackburn, 1986).

BIBLIOGRAPHY

Dempster, A.P., Laird, N.M., & Rubin, D.B. (1977). Maximum

likelihood from incomplete data via the EM algorithm (with

discussion). Journal of the Royal Statistical Society, B 39, 1-38.

Fowlkes, E.B. (1979). Some methods for studying the mixture of two

normal (lognormal) distributions. Journal of the American Statistical

Association 74, 561 - 575.

Geller, N. L. & Pocock, S. J. (1987). Interim analyses in randomized clinical trials: Ramifications and guidelines for practitioners. Biometrics 43, 213-223.

Gould, A. L. (1992). Interim analyses for monitoring clinical trials that do not affect the type I error rate. Statistics in Medicine 11, 55-66.

Gould, A.L. and Pecore, V.J. (1982). Group sequential methods for

clinical trials allowing early acceptance of Ho and incorporating costs.

Biometrika 69, 75-80.

Lan, K.K.G. and DeMets, D.L. (1983). Design and analysis of group sequential tests based on the Type 1 error spending rate function. Biometrika 74, 149-154.

Lohr, S. L. (1988). Accurate multivariate estimation using double and

triple sampling. University of Minnesota Technical Report No. 505,

February 1988.



O'Brien, P. C. & Fleming, T. R. (1979). A multiple testing procedure for clinical trials. Biometrics 35, 549-556.

Pocock, S.J. (1977). Group sequential methods in the design and

analysis of clinical trials. Biometrika 64, 191-199.

Pocock, S. J. (1982). Interim analyses for randomized clinical trials: The group sequential approach. Biometrics 38, 153-162.

Rubin, D. B. (1976). Inference and missing data. Biometrika 63, 581-

592.

Stein, C. (1945). A two-sample test for a linear hypothesis whose

power is independent of the variance. Annals of Mathematical

Statistics 16, 243-258.

Wittes, J. and Brittain, E. (1990). The role of internal pilot studies in increasing the efficiency of clinical trials. Statistics in Medicine 9, 65-72.

Received November 1991; Revised May 1992

