
Austral. J. Statist. 36(2), 1994, 199-209

SAMPLE SIZE FOR TWO INDEPENDENT PROPORTIONS: A REVIEW

IAN GORDON¹ University of Melbourne

Summary

This paper reviews methods for determining sample sizes when inference is required on the difference between two proportions from independent samples. The sample sizes for Fisher’s exact test are generally regarded as the desired values, but they are very tedious to calculate, so many approximations have been introduced. These are described and their relative merits assessed.

Key words: Sample size; independent proportions; approximations; Fisher’s exact test.

1. Introduction

This paper reviews methods for determining sample sizes when inference is required on the difference between two proportions from independent samples.

Define $X_1$ and $X_2$ to be independent binomial random variables, with $X_1 \overset{d}{=} \mathrm{Bi}(n_1, p_1)$ and $X_2 \overset{d}{=} \mathrm{Bi}(n_2, p_2)$. Inference is required on the difference between $p_1$ and $p_2$. For simplicity, we assume that $n_1 = n_2 = n$. This special case is by no means artificial or even unusual, as many randomised trials and case-control studies, for example, are designed to have equal sample sizes in the two groups being compared. However, we make the assumption for simplicity of presentation only; formulae for the general case are straightforward extensions.

In spite of the increasing sophistication of sample size methods, it seems likely that this simple setting will remain important, if only because at initial planning stages researchers often do not have, and cannot obtain, the necessary information to refine the sample size calculation (e.g. to allow for confounding variables). Other reviews of the two proportions case may be found in Donner (1984) and Gebski & McNeil (1985).


2. Fisher’s Exact Test

The ‘gold standard’ for the comparison of proportions from two independent samples is often regarded, explicitly or implicitly, as the sample size requirements for Fisher's exact test (Fisher, 1935). Others sometimes credited for this test are Yates (1934) and Irwin (1935)². The test proceeds using an argument based on the conditionality principle (Birnbaum, 1962), which states that inferences should be made conditional on the value of an ancillary statistic, if one exists. An ancillary statistic (Fisher, 1925) is one whose distribution is free of the parameter of interest.

Received March 1993; revised September 1993; accepted December 1993.

¹Statistical Consulting Centre, Dept of Statistics, University of Melbourne, Parkville, Vic. 3052, Australia.

Acknowledgements. The author thanks Ray Watson for valuable discussions during the preparation of this paper.

Here, we consider the conditional distribution of $X_1$ with $T = X_1 + X_2$ fixed at, say, $T = t$. This distribution is non-central hypergeometric. Define the odds ratio $\psi = p_1(1 - p_2)/\{p_2(1 - p_1)\}$; then

$$\Pr\{X_1 = x \mid T = t;\ \psi\} = \frac{\binom{n}{x}\binom{n}{t-x}\psi^{x}}{\sum_{j=L}^{U}\binom{n}{j}\binom{n}{t-j}\psi^{j}} \qquad (x = L, L+1, \ldots, U). \tag{1}$$

Here $L = \max(0, t - n)$ and $U = \min(t, n)$. $T$ is the ancillary statistic. In fact, the distribution of $T$ is not completely free of $\psi$, but the information available from $T$ is inconclusive; the MLE of $\psi$ is 0 or $\infty$ (see Clark (1972) and Plackett (1977)). The hypothesis $p_1 > p_2$ is equivalent to $\psi > 1$. Further, in case-control studies in particular, the sample odds ratio itself is a statistic of direct interest, being a consistent estimator (under mild conditions) of the rate ratio between the exposed and unexposed population in appropriately designed studies (Prentice & Breslow, 1978; Greenland & Thomas, 1982).
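As an illustration only, the conditional distribution (1) can be evaluated directly when n is small. The sketch below assumes that (1) has the usual non-central hypergeometric form, with weights $\binom{n}{x}\binom{n}{t-x}\psi^x$ normalised over $x = L, \ldots, U$; the function name is hypothetical and is not taken from any of the papers reviewed.

```python
from math import comb


def noncentral_hypergeom_pmf(x, t, n, psi):
    """Pr{X1 = x | T = t; psi} for two groups of equal size n (assumed form of (1))."""
    lower, upper = max(0, t - n), min(t, n)
    if not lower <= x <= upper:
        return 0.0  # x lies outside the support L..U

    def weight(j):
        # Unnormalised probability of X1 = j given T = t
        return comb(n, j) * comb(n, t - j) * psi ** j

    return weight(x) / sum(weight(j) for j in range(lower, upper + 1))


if __name__ == "__main__":
    # With psi = 1 the distribution reduces to the ordinary hypergeometric.
    for x in range(0, 5):
        print(x, round(noncentral_hypergeom_pmf(x, t=4, n=5, psi=1.0), 4))
```

Setting psi = 1 gives the null distribution used to construct the critical region; values of psi above 1 shift probability towards larger x.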

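To test $H_0: p_1 = p_2$ ($\psi = 1$) against the alternative $H_1: p_1 > p_2$ ($\psi > 1$), with Pr{type I error} $\le \alpha$, the critical region is $C = \{x_c, x_c + 1, \ldots, U\}$, where

$$x_c = \min\Big\{x : \sum_{j=x}^{U} \Pr\{X_1 = j \mid T = t;\ \psi = 1\} \le \alpha\Big\}.$$

In some cases the critical region may be empty. The conditional power of this test, for a specific alternative $H_1: \psi = \psi_1$, is given by

$$R(\psi_1 \mid t) = \sum_{x=x_c}^{U} \Pr\{X_1 = x \mid T = t;\ \psi_1\} = \frac{\sum_{x=x_c}^{U}\binom{n}{x}\binom{n}{t-x}\psi_1^{x}}{\sum_{x=L}^{U}\binom{n}{x}\binom{n}{t-x}\psi_1^{x}}. \tag{2}$$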

²Although Yates's paper was obviously published before the other two, the issue of priority is complicated by two things. First, Yates states (modestly?) in his paper, "It was suggested to me by Professor Fisher that the probability of any observed set of values in a 2 x 2 contingency table with given marginal totals can be exactly determined." Yates (1984) also suggests that the exact (i.e. hypergeometric) distribution must have been known to Fisher much earlier, due to a cryptic comment in Fisher (1926). Secondly, Irwin states in a footnote that his paper was concluded by May 1933 but that publication was unavoidably delayed.

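This is the ratio of two polynomials in $\psi_1$. Further, the average conditional power, called the over-all power by some authors, is given by

$$R(p_1, p_2) = \sum_{t} R(\psi_1 \mid t)\,\Pr\{T = t\}.$$

Summation here is over all values of $t = x_1 + x_2$ which are such that $x_c$ exists.

By using this with (1) and the definition of $\psi$, we obtain

$$R(p_1, p_2) = \sum_{t}\sum_{x=x_c}^{U}\binom{n}{x}\binom{n}{t-x}\,p_1^{x}(1-p_1)^{n-x}\,p_2^{t-x}(1-p_2)^{n-t+x}. \tag{3}$$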

The actual calculation of $R(\psi_1 \mid t)$ or $R(p_1, p_2)$ is ‘rather involved generally’ (Bennett & Hsu, 1960). Finney (1948) and Latscha (1953) provide tables of $x_c$ for nominal levels of significance and unequal sample sizes up to $\max(n_1, n_2) = 20$. Bennett & Hsu (1960) calculate $R(p_1, p_2)$ using the Finney-Latscha critical values, and present their findings diagrammatically. Even as early as 1960 this was done on a computer, although previously approximations had been developed based on the normal approximation to the hypergeometric distribution (Patnaik, 1948) and the arcsine transformation (Sillitto, 1949). Finney et al. (1963) published extensive tables.

However, sample size calculations based on Fisher’s test require the even more difficult computational task of inverting (3), i.e. given $p_1$ and $p_2$ under $H_1$ (note that it is insufficient to specify $\psi_1$ alone) we require the smallest value of $n$ such that $R(p_1, p_2)$ achieves a pre-set level. Denote this value of $n$ by $n_E$.
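The inversion described above can be sketched by brute force for small sample sizes. The code below is a minimal illustration, not the algorithm of any of the cited authors: it assumes that $x_c$ is the smallest value whose upper-tail hypergeometric probability under $\psi = 1$ does not exceed $\alpha$, sums the joint binomial probabilities over the critical regions to obtain $R(p_1, p_2)$, and returns the first $n$ at which the target power is reached. All function names and the `n_max` guard are hypothetical. Because of discreteness the exact power need not increase monotonically with $n$, so "first $n$ reaching the target" is one possible reading of "smallest $n$".

```python
from math import comb


def critical_value(t, n, alpha):
    """Smallest x_c with Pr{X1 >= x_c | T = t; psi = 1} <= alpha, or None if C is empty."""
    lower, upper = max(0, t - n), min(t, n)
    total = comb(2 * n, t)
    tail, x_c = 0.0, None
    for x in range(upper, lower - 1, -1):  # accumulate the upper tail
        tail += comb(n, x) * comb(n, t - x) / total
        if tail > alpha:
            break
        x_c = x
    return x_c


def binom_pmf(k, n, p):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def exact_power(p1, p2, n, alpha=0.05):
    """Average conditional power R(p1, p2) of the one-sided exact test."""
    power = 0.0
    for t in range(2 * n + 1):
        x_c = critical_value(t, n, alpha)
        if x_c is None:
            continue  # empty critical region for this t
        for x in range(x_c, min(t, n) + 1):
            power += binom_pmf(x, n, p1) * binom_pmf(t - x, n, p2)
    return power


def smallest_n(p1, p2, target, alpha=0.05, n_max=200):
    """First n (up to the hypothetical cap n_max) with exact power >= target."""
    for n in range(1, n_max + 1):
        if exact_power(p1, p2, n, alpha) >= target:
            return n
    raise ValueError("target power not reached for n <= n_max")


if __name__ == "__main__":
    print(smallest_n(0.95, 0.05, target=0.90))
```

Evaluating `exact_power(p, p, n)` with equal proportions gives the actual size of the conditional test, which cannot exceed the nominal level. The double loop costs on the order of n² operations per evaluation, so this sketch is only practical for modest n.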

Gail & Gart (1973) calculate $n_E$ for a power of 0.5, 0.8 or 0.9 for $\alpha$ = 0.05 and 0.01, $p_1$ = 0.1(0.1)0.9, 0.95 and $p_2$ = 0.05, 0.1(0.1)0.9. In passing, they confirm a suggestion in an editorial note following Bennett & Hsu’s (1960) paper³: the anomalies sometimes found between the power of the conditional test and approximations based on, for example, the arcsine transformation, are due to the conditional test often having an actual size that is considerably less than the nominal value, because of the requirement that $\Pr\{T \in C \mid H_0\} \le \alpha$. Haseman (1978) revised Gail & Gart’s results by calculating the exact sample size for all cases in their table.

Casagrande, Pike & Smith (1978a) provide further sample sizes for the exact test, with some overlap (and agreement) with Haseman’s results. They calculated $n$ required for a power of 0.8, 0.9 or 0.95, $p_1$ = 0.05(0.05)0.5, $p_2$ = 0.1(0.05)0.95 and $\alpha$ = 0.05, 0.025 and 0.005; not all combinations were tabulated. The algorithm used is given in Casagrande, Pike & Smith (1978b).

³Note that Gail & Gart (1973) actually refer to an editorial note following Sillitto’s (1949) paper; this appears to be an error, since there is no such editorial note, and the point they are making clearly refers to the note following Bennett & Hsu (1960).

Nearly sixty years after its introduction, the exact test remains the subject of debate. There have been intermittent attempts to justify the position originally put by Barnard (1947), that an unconditional test is preferable, in the sense of having greater power (e.g. Berkson, 1978; Upton, 1982; Haviland, 1990), and some extensive work has been done on implementing such a test (McDonald, Davis & Milliken, 1977; Suissa & Schuster, 1985). For a full discussion of these issues, see Yates (1984). In summary, the arguments for the conditional test are that the margins in a 2 x 2 table contain essentially no information about the association under scrutiny; therefore, conditioning on the margins entails no loss of information and leads to a more sensitive test. Further, these arguments apply under whatever sampling scheme the table was obtained; in particular, they do not depend on the margins being actually fixed by design. It should be noted that Barnard himself, the first to suggest an unconditional approach, has withdrawn his suggestion and argued for the conditional test not once but several times, and as recently as 1989 (Barnard, 1949; Barnard, 1976; Barnard, 1989).

3. Approximations

Because of the onerous computational demands of Fisher's test in samples which are not small, simplifying approximations have been developed. These all use asymptotic theory, and hence normal approximations.

3.1. The Normal Approximation to the Binomial

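Straightforward application of the appropriate normal approximation implies that the power of the test, $R(p_1, p_2)$, may be approximated by

$$R(p_1, p_2) \approx \Phi\left(\frac{\sqrt{n}\,(p_1 - p_2) - c_{1-\alpha}\sqrt{2\bar{p}\bar{q}}}{\sqrt{p_1 q_1 + p_2 q_2}}\right), \tag{4}$$

where $\Phi$ denotes the distribution function of the standard normal distribution, $c_\gamma = \Phi^{-1}(\gamma)$, $q_i = 1 - p_i$, $\bar{p} = \tfrac{1}{2}(p_1 + p_2)$ and $\bar{q} = 1 - \bar{p}$.

If $R(p_1, p_2) \ge 1 - \beta$, then

$$\sqrt{n}\,(p_1 - p_2) - c_{1-\alpha}\sqrt{2\bar{p}\bar{q}} \ge c_{1-\beta}\sqrt{p_1 q_1 + p_2 q_2}.$$

Solving this inequality for $n$ gives

$$n \ge n_U = \frac{A}{(p_1 - p_2)^2}, \tag{5}$$

where

$$A = \left[c_{1-\alpha}\sqrt{2\bar{p}\bar{q}} + c_{1-\beta}\sqrt{p_1 q_1 + p_2 q_2}\right]^2. \tag{6}$$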

This formula has an unclear pedigree, probably because it is so straightforward. Schlesselman (1974) calls it a ‘well-known’ formula and cites Halperin et al. (1968)⁴; Casagrande, Pike & Smith (1978c) call it the ‘uncorrected χ² formula’ and cite Fleiss (1973).
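A minimal numerical sketch of this uncorrected formula follows. It assumes the form $n_U = A/(p_1 - p_2)^2$ with $A = [c_{1-\alpha}\sqrt{2\bar{p}\bar{q}} + c_{1-\beta}\sqrt{p_1 q_1 + p_2 q_2}]^2$, which is a reconstruction from the surrounding definitions rather than a quotation; the function name is hypothetical. Normal quantiles come from Python's standard `statistics.NormalDist`.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_uncorrected(p1, p2, alpha=0.05, power=0.90):
    """Unrounded n_U = A / (p1 - p2)^2 for a one-sided test (assumed form)."""
    c_alpha = NormalDist().inv_cdf(1 - alpha)  # c_{1-alpha}
    c_beta = NormalDist().inv_cdf(power)       # c_{1-beta}
    p_bar = (p1 + p2) / 2
    q_bar = 1 - p_bar
    a = (c_alpha * sqrt(2 * p_bar * q_bar)
         + c_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return a / (p1 - p2) ** 2


if __name__ == "__main__":
    # Round up to the next integer to obtain a sample size per group.
    print(ceil(n_uncorrected(0.95, 0.90)))
```

For $p_1 = 0.95$, $p_2 = 0.90$, $\alpha = 0.05$ and power 0.90 the rounded-up value is 474, which agrees with the corresponding entry of Table 1.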

Equation (5) underestimates the required sample size, relative to that based on (3). A suggested reason for this is that no correction for continuity is used in its derivation. Two basic formulae which claim to apply such a correction have been proposed. The first is due to Kramer & Greenhouse (1959). The derivation (following Casagrande, Pike & Smith, 1978c) is as follows. Let $D = X_1 - X_2$. We reject $H_0$ if $D \ge d_c$, where $d_c$ is an (integer) critical value. Then

$$\Pr\{D \ge d_c \mid H_0\} \approx \Pr\{D^* \ge d_c - 1 \mid H_0\},$$

where $D^*$ has a normal distribution with the same mean and variance as $D$, and the subtraction of 1 is a correction for continuity. The use of 1 rather than ½ is a reminder that we are approximating the exact test. In that test the total $T = X_1 + X_2$ is taken as fixed at its observed value $t$, so that we can rewrite $D$ as $X_1 - (t - X_1) = 2X_1 - t$, from which it is evident that given $T = t$, $D$ does not take consecutive integer values as expected prima facie, but only every second integer value from $-t$ to $t$. Alternatively, we see that the subtraction of 1 is required to give a test statistic which, upon squaring, is well-known as Yates’s correction to the χ² test statistic. This leads to

$$d_c - 1 = c_{1-\alpha}\sqrt{2n\bar{p}\bar{q}}. \tag{7}$$

The requirement for power leads to a second formula involving $d_c$,

$$d_c - \zeta = n(p_1 - p_2) - c_{1-\beta}\sqrt{n(p_1 q_1 + p_2 q_2)}, \tag{8}$$

where $\zeta$ is another correction factor owing to the discreteness of the distribution of $D$. By simultaneously solving (7) and (8) and setting the value of $\zeta$, we may eliminate $d_c$ and obtain a formula for $n$.

The choice $\zeta = -1$ gives the Kramer-Greenhouse formula,

$$n_{KG} = \frac{A}{4(p_1 - p_2)^2}\left[1 + \sqrt{1 + \frac{8(p_1 - p_2)}{A}}\right]^2, \tag{9}$$

with $A$ as in (6).

⁴However, Halperin et al. themselves refer to the formula as ‘well-known’.

This formula was used for tables of sample size in the first edition of Fleiss (1973). Casagrande, Pike & Smith (1978c) suggest that the use of (any) continuity correction in the power calculation does not seem justified, and so we should use $\zeta = 0$.⁵ This leads to a formula which gives a value of $n$ that is approximately half-way between the value given by (5), which is too liberal, and the value given by (9), which is too conservative. This is the Casagrande, Pike & Smith (1978c) formula,

$$n_{CPS} = \frac{A}{4(p_1 - p_2)^2}\left[1 + \sqrt{1 + \frac{4(p_1 - p_2)}{A}}\right]^2, \tag{10}$$

with $A$ as in (6).

Casagrande, Pike & Smith claimed that $n_{CPS}$ was so close to $n_E$ for a wide range of values of $p_1$ and $p_2$ that it 'effectively eliminates the necessity of ever requiring the exact values'. Aleong & Bartlett (1979) produced graphs based on this approximation.
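The two corrected formulae can be compared numerically. The sketch below assumes the closed forms $n = \tfrac{1}{4} n_U [1 + \sqrt{1 + k/\{n_U (p_1 - p_2)\}}]^2$ with $k = 8$ for Kramer-Greenhouse and $k = 4$ for Casagrande, Pike & Smith, where $n_U = A/(p_1 - p_2)^2$ is left unrounded. These forms are reconstructed from the derivation rather than quoted, and the function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_uncorrected(p1, p2, alpha=0.05, power=0.90):
    """Unrounded n_U = A / (p1 - p2)^2 (assumed form of the uncorrected formula)."""
    c_alpha = NormalDist().inv_cdf(1 - alpha)
    c_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    a = (c_alpha * sqrt(2 * p_bar * (1 - p_bar))
         + c_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return a / (p1 - p2) ** 2


def n_corrected(p1, p2, k, alpha=0.05, power=0.90):
    """k = 8: Kramer-Greenhouse; k = 4: Casagrande, Pike & Smith (assumed forms)."""
    n_u = n_uncorrected(p1, p2, alpha, power)
    delta = abs(p1 - p2)
    return n_u / 4 * (1 + sqrt(1 + k / (n_u * delta))) ** 2


if __name__ == "__main__":
    p1, p2 = 0.95, 0.90
    print("n_U  ", ceil(n_uncorrected(p1, p2)))
    print("n_CPS", ceil(n_corrected(p1, p2, k=4)))
    print("n_KG ", ceil(n_corrected(p1, p2, k=8)))
```

For $p_1 = 0.95$, $p_2 = 0.90$ this gives 474, 513 and 551 respectively, matching the first row of Table 1 and showing the ordering $n_U < n_{CPS} < n_{KG}$ described in the text.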

3.2. The Arcsine Transformation

A standard method of making approximate inferences on a proportion $p$ is to use the arcsine transformation, to stabilise the variance of the estimator. For the comparison of two proportions, $p_1$ and $p_2$, the test statistic may be written as

$$z_A = \sqrt{2n}\,\lambda(\hat{p}_1, \hat{p}_2), \tag{11}$$

where $\lambda(p_1, p_2) = \arcsin\sqrt{p_1} - \arcsin\sqrt{p_2}$, in the case of equal sample sizes in the two groups.

The usual calculations mean that requiring the power to be at least $1 - \beta$ gives

$$n_A = \frac{(c_{1-\alpha} + c_{1-\beta})^2}{2\,\lambda(p_1, p_2)^2}. \tag{12}$$

Sillitto (1949) states that this formula is given by Paulson & Wallis (1947), who provide a nomogram based on it. Sillitto (1949) reproduces the formula and Cochran & Cox (1957) use it to produce sample size tables for $\alpha$ = 0.05, power = 0.8, 0.9 or 0.95, and $\delta = p_1 - p_2$ = 0.05(0.05)0.7. Feigl (1978) produces isographs based on (12) for both one- and two-sided tests, using $\alpha$ = 0.01 with a power of 0.90 or 0.95, and $\alpha$ = 0.05 with a power of 0.80 or 0.90.
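The arcsine sample size is simple to compute. The sketch below assumes the form $n_A = (c_{1-\alpha} + c_{1-\beta})^2 / \{2\lambda(p_1, p_2)^2\}$ with $\lambda(p_1, p_2) = \arcsin\sqrt{p_1} - \arcsin\sqrt{p_2}$ (angles in radians), which follows from the variance of $\arcsin\sqrt{\hat{p}}$ being approximately $1/(4n)$; the function name is hypothetical.

```python
from math import asin, ceil, sqrt
from statistics import NormalDist


def n_arcsine(p1, p2, alpha=0.05, power=0.90):
    """Unrounded uncorrected arcsine sample size per group (assumed form of (12))."""
    f = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(power)
    lam = asin(sqrt(p1)) - asin(sqrt(p2))  # lambda(p1, p2), in radians
    return f * f / (2 * lam * lam)


if __name__ == "__main__":
    print(ceil(n_arcsine(0.95, 0.90)))
```

For $p_1 = 0.95$, $p_2 = 0.90$, $\alpha = 0.05$ and power 0.90 the rounded-up value is 463, in agreement with Table 1.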

Anscombe (1948) proposed a related approximation based on the transformations

$$\arcsin\sqrt{\frac{X_i + 3/8}{n + 3/4}} \qquad (i = 1, 2), \tag{13}$$

yielding a corresponding sample size calculation; Makuch & Simon (1980) give tables based on (13).

⁵Neither the Kramer-Greenhouse nor the Casagrande, Pike & Smith correction can legitimately be regarded as implementations of the usual correction for continuity (see Gordon & Watson, 1994a), but they were originally motivated in this way.

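No continuity correction is employed in the derivation of (12). Noting this, Walters (1979) proposes the use of the alternative statistic

$$z_W = \sqrt{2n}\left[\arcsin\sqrt{\hat{p}_1 - \frac{1}{2n}} - \arcsin\sqrt{\hat{p}_2 + \frac{1}{2n}}\right].$$

The usual calculation leads to the following formula for Walters’ sample size, denoted by $n_W$; iteration is required to find $n_W$.

$$n_W = \frac{(c_{1-\alpha} + c_{1-\beta})^2}{2\left[\arcsin\sqrt{p_1 - 1/(2n_W)} - \arcsin\sqrt{p_2 + 1/(2n_W)}\right]^2}.$$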

Walters claims that the agreement between $n_W$ and $n_E$ is ‘very good indeed’. The evidence for this claim is empirical; for six configurations considered, $n_W$ yielded a value that was different from $n_E$ by at most 3, while the uncorrected arcsine value $n_A$ was as much as 8% in error, and on the non-conservative side (e.g. $n_E$ = 503, $n_A$ = 464).
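The iteration for $n_W$ can be sketched as a simple upward search. The code below assumes that Walters' statistic subtracts $1/(2n)$ from the larger proportion and adds $1/(2n)$ to the smaller one before the arcsine transformation, so that $n_W$ is the smallest integer $n$ satisfying $n \ge (c_{1-\alpha} + c_{1-\beta})^2 / \{2\lambda_n^2\}$, where $\lambda_n = \arcsin\sqrt{p_1 - 1/(2n)} - \arcsin\sqrt{p_2 + 1/(2n)}$. The function name and the `n_max` cap are hypothetical, and $p_1 > p_2$ is required.

```python
from math import asin, ceil, sqrt
from statistics import NormalDist


def n_walters(p1, p2, alpha=0.05, power=0.90, n_max=10**6):
    """Smallest n meeting the continuity-corrected arcsine requirement (assumed form)."""
    if not p1 > p2:
        raise ValueError("this sketch assumes p1 > p2")
    f = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(power)
    # The uncorrected arcsine value is a lower bound, so the search starts there.
    lam0 = asin(sqrt(p1)) - asin(sqrt(p2))
    n = max(1, ceil(f * f / (2 * lam0 * lam0)))
    while n <= n_max:
        c = 1 / (2 * n)
        if p1 - c > p2 + c:  # corrected proportions must stay ordered
            lam = asin(sqrt(p1 - c)) - asin(sqrt(p2 + c))
            if n >= f * f / (2 * lam * lam):
                return n
        n += 1
    raise ValueError("no solution found for n <= n_max")


if __name__ == "__main__":
    print(n_walters(0.95, 0.90))
```

For $p_1 = 0.95$, $p_2 = 0.90$, $\alpha = 0.05$ and power 0.90 the search stops at 503, the value shown for both $n_W$ and $n_E$ in Table 1. A linear search is used for clarity; a fixed-point or bisection scheme would reach the same value in fewer steps.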

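Although finding $n_W$ requires an iterative solution, Dobson & Gebski (1986) show that by using Taylor series approximations for $\arcsin\sqrt{p \pm 1/(2n)}$ a closed expression for $n$ may be found, which has very good accuracy. Specifically, using

$$\arcsin\sqrt{p \pm \frac{1}{2n}} \approx \arcsin\sqrt{p} \pm \frac{1}{4n\sqrt{pq}}$$

yields

$$n_{DG} = \frac{\left[F + \sqrt{F^2 + 2G\,\lambda(p_1, p_2)}\right]^2}{8\,\lambda(p_1, p_2)^2},$$

where $F = c_{1-\alpha} + c_{1-\beta}$ and $G = 1/\sqrt{p_1 q_1} + 1/\sqrt{p_2 q_2}$. They give six examples for which $n_{DG} = n_W$, and also provide a version of the formula for unequal sample sizes.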
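A minimal sketch of this closed form follows. It assumes the first-order expansion $\arcsin\sqrt{p \pm 1/(2n)} \approx \arcsin\sqrt{p} \pm 1/(4n\sqrt{pq})$, which turns the corrected requirement into a quadratic in $\sqrt{2n}$ with solution $n = [F + \sqrt{F^2 + 2G\lambda}]^2 / (8\lambda^2)$, where $\lambda = \arcsin\sqrt{p_1} - \arcsin\sqrt{p_2}$. This algebraic form is a reconstruction and may differ in presentation from the published one; the function name is hypothetical.

```python
from math import asin, ceil, sqrt
from statistics import NormalDist


def n_dobson_gebski(p1, p2, alpha=0.05, power=0.90):
    """Unrounded closed-form approximation to Walters' sample size (assumed form)."""
    f = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(power)  # F
    g = 1 / sqrt(p1 * (1 - p1)) + 1 / sqrt(p2 * (1 - p2))              # G
    lam = asin(sqrt(p1)) - asin(sqrt(p2))                              # lambda(p1, p2)
    return (f + sqrt(f * f + 2 * g * lam)) ** 2 / (8 * lam * lam)


if __name__ == "__main__":
    print(ceil(n_dobson_gebski(0.95, 0.90)))
```

For $p_1 = 0.95$, $p_2 = 0.90$ the rounded-up value is 503, and for $p_1 = 0.95$, $p_2 = 0.80$ it is 90; both agree with the $n_{DG}$ column of Table 1. Because $G$ grows without bound as either proportion approaches 0 or 1, the expansion can be expected to degrade near the edges of the parameter space.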

3.3. Normal Approximation to the Hypergeometric

A third method uses a normal approximation to the conditional distribution of $X_1 \mid t$, the non-central hypergeometric distribution, given by (1). Patnaik (1948) uses this to derive an approximation for the conditional power and the average conditional power. This approach does not yield a closed solution and is of historical interest only.

3.4. Accuracy of the Approximations

In addition to the assessments already mentioned, a further empirical study indicated that $n_W$ was, in fact, superior to $n_{CPS}$, but the advantage was mainly confined to cases in which at least one of $p_1$ and $p_2$ was close to 0 or 1 (Ury, 1981).

TABLE 1
Comparison of approximate formulae for $n_E$, the sample size required to test $H_0: p_1 = p_2$ against $H_1: p_1 > p_2$, for $\alpha$ = 0.05 and power = 0.90

  p2    p1     nE    nU    nA  nCPS   nKG   nDG    nW
0.90  0.95    503   474   463   513   551   503   503
0.80  0.95     89    82    76    95   107    90    90
0.70  0.95     42    38    35    46    53    44    43
0.60  0.95     25    23    21    28    33    28    27
0.50  0.95     18    15    14    20    23    20    19
0.40  0.95     13    11    10    14    17    15    15
0.30  0.95     10     8     8    11    13    12    11
0.20  0.95      8     6     6     8    10    10     9
0.10  0.95      6     4     5     6     8     8     7
0.05  0.95      5     3     4     5     7     7     7
0.80  0.90    232   217   213   237   255   233   233
0.70  0.90     74    67    65    77    86    75    75
0.60  0.90     39    34    33    41    47    40    40
0.50  0.90     25    21    20    26    30    26    26
0.40  0.90     17    14    14    18    21    18    18
0.30  0.90     12    10    10    13    16    14    14
0.20  0.90     10     7     7    10    12    11    11
0.10  0.90      8     5     5     7     9     9     8
0.70  0.80    338   320   319   339   359   339   339
0.60  0.80     97    89    88    98   108    98    98
0.50  0.80     47    42    42    48    54    49    49
0.40  0.80     30    24    24    29    34    30    30
0.30  0.80     18    16    16    19    23    20    20
0.20  0.80     12    10    11    14    16    14    14
0.60  0.70    408   388   388   408   427   408   408
0.50  0.70    111   101   102   111   121   112   112
0.40  0.70     53    46    46    52    58    53    53
0.30  0.70     31    25    26    30    35    31    31
0.50  0.60    445   423   423   442   462   443   443
0.40  0.60    116   106   106   115   125   116

Radhakrishna (1983) made a more comprehensive comparison, again empirical. He compared 180 combinations involving the 30 unique combinations of $p_1$ and $p_2$ in Haseman’s (1978) tables, $\alpha$ = 0.05 or 0.01, and power equal to 0.90, 0.80 and 0.50. He considered the difference between the approximate sample size and the exact sample size. This difference was zero in 34% of cases for $n_W$, and 22% of cases for $n_{CPS}$. The magnitude of the difference was less than or equal to two in 97% of cases for $n_W$, and 81% of cases for $n_{CPS}$.

The empirical evidence in Radhakrishna’s (1983) analysis indicates that $n_{CPS}$ tends to be slightly conservative; this is less clear for $n_{DG}$. However, Thomas & Conlon (1992) consider the performance of these approximations for very small $p_1$ and $p_2$. They show that at this extremity of the parameter space it is true that $n_{DG}$ over-estimates $n_E$, in some cases by far greater amounts than $n_{CPS}$ (e.g. $p_2$ = 0.0001, $p_1$ = 0.1: $n_E$ = 79, $n_{CPS}$ = 99, $n_{DG}$ = 170). The poor performance of $n_{DG}$ is due in large part to the failure of the Taylor series approximation to the arcsine function for such small values (Gordon & Watson, 1994b). The Walters approximation, $n_W$, was found to be much better, though it still consistently overestimates $n_E$.

In summary, unless very small $p_1$ and $p_2$ are envisaged, the modified arcsine value $n_{DG}$ appears to be the best available approximation to sample sizes given by Fisher’s exact test.

The approximations considered in this paper for the equal sample size case have straightforward parallels for the unequal sample size case; see Fleiss, Tytun & Ury (1980), Lee (1984) and Dobson & Gebski (1986).

To conclude this section we give an indicative table, Table 1, showing the exact values based on Fisher’s test and the key approximations, for the equal sample size context. This table demonstrates the salient features already mentioned, i.e. $n_U$ and $n_A$ are consistently too low, $n_{KG}$ is too high, $n_{CPS}$ and $n_{DG}$ are both very good, with $n_{DG}$ being marginally better, and $n_W$ best of all. $n_{DG}$ is so close to $n_W$ that the additional effort required to calculate $n_W$ (for $p_1$ and $p_2$ in these ranges) does not seem warranted. The table is based on the first 30 configurations in Radhakrishna’s (1983) evaluation.

References

ALEONG, J. & BARTLETT, D.E. (1979). Improved graphs for calculating sample sizes when comparing two independent binomial distributions. Biometrics 35, 875-881.
ANSCOMBE, F.J. (1948). The transformation of Poisson, binomial and negative-binomial data. Biometrika 35, 246-254.
BARNARD, G.A. (1947). Significance tests for 2 x 2 tables. Biometrika 34, 123-138.
- (1949). Statistical inference. J. Roy. Statist. Soc. Ser. B 11, 115-149.
- (1976). Conditional inference is not inefficient. Scand. J. Statist. 3, 132-134.
- (1989). On alleged gains in power from lower P-values. Statist. in Medicine 8, 1469-1477.
BENNETT, B.M. & HSU, P. (1960). On the power function of the exact test for the 2 x 2 contingency table. Biometrika 47, 393-398. (Correction. (1961). 48, 475.)
BERKSON, J. (1978). In dispraise of the exact test. J. Statist. Plann. Inference 2, 27-42.
BIRNBAUM, A. (1962). On the foundations of statistical inference (with discussion). J. Amer. Statist. Assoc. 57, 269-326.
CASAGRANDE, J.T., PIKE, M.C. & SMITH, P.G. (1978a). The power function of the ‘exact’ test for comparing two binomial distributions. Appl. Statist. 27, 176-180.
-, - & - (1978b). Algorithm AS 129: The power function of the ‘exact’ test for comparing two binomial distributions. Appl. Statist. 27, 212-219.
-, - & - (1978c). An improved approximate formula for calculating sample sizes for comparing two binomial distributions. Biometrics 34, 483-496.
CLARK, R.M. (1972). Partial sufficiency, information and the 2 x 2 table. Department of Probability and Statistics, University of Sheffield. Research Report 101.
COCHRAN, W.G. & COX, G.M. (1957). Experimental Designs, 2nd ed. New York: Wiley.
DOBSON, A.J. & GEBSKI, V.J. (1986). Sample sizes for comparing two independent proportions using the continuity-corrected arc sine transformation. Statistician 35, 51-53.
DONNER, A. (1984). Approaches to sample size estimation in the design of clinical trials - a review. Statist. Medicine 3, 199-214.
FEIGL, P. (1978). A graphical aid for determining sample sizes when comparing two independent proportions. Biometrics 34, 111-122.
FINNEY, D.J. (1948). The Fisher-Yates test of significance in 2 x 2 contingency tables. Biometrika 35, 145-156.
-, LATSCHA, R., BENNETT, B.M. & HSU, P. (1963). Tables for testing significance in a 2 x 2 contingency table. Cambridge: Cambridge University Press.
FISHER, R.A. (1925). Theory of statistical estimation. Proc. Cambridge Philos. Soc. 28, 257-261.
- (1926). Bayes' theorem and the fourfold table. Eugenics Rev. 18, 32-33.
- (1935). The logic of inductive inference (with discussion). J. Roy. Statist. Soc. Ser. A 98, 39-82.
FLEISS, J.L. (1973). Statistical Methods for Rates and Proportions. New York: Wiley.
-, TYTUN, A. & URY, H.K. (1980). A simple approximation for calculating sample sizes for comparing independent proportions. Biometrics 36, 343-346.
GAIL, M. & GART, J.J. (1973). The determination of sample sizes for use with the exact conditional test in 2 x 2 comparative trials. Biometrics 29, 441-448.
GEBSKI, V. & McNEIL, D. (1985). Sample sizes for clinical trials: the choice is yours. A review of the sample size problem in clinical trials. Statistical Computing Laboratory, Macquarie University. Technical Report No. 31.
GORDON, I. & WATSON, R. (1994a). The myth of continuity-corrected sample size formulae. Biometrics, submitted.
- & - (1994b). A note on the comparison of small probabilities. Controlled Clin. Trials, (to appear.)
GREENLAND, S. & THOMAS, D.C. (1982). On the need for the rare disease assumption in case control studies. Amer. J. Epidemiol. 118, 547-553.
HALPERIN, M., ROGOT, E., GURIAN, J. & EDERER, F. (1968). Sample sizes for medical trials with special reference to long-term therapy. J. Chronic Disease 21, 13-24.
HASEMAN, J.K. (1978). Exact sample sizes for use with the Fisher-Irwin test for 2 x 2 tables. Biometrics 34, 106-109.
HAVILAND, M.G. (1990). Yates’s correction for continuity and the analysis of 2 x 2 contingency tables. Statist. in Medicine 9, 363-367.
IRWIN, J.O. (1935). Tests of significance for differences between percentages based on small numbers. Metron 12, 83-94.
KRAMER, M. & GREENHOUSE, S.W. (1959). Determination of sample size and selection of cases. In Psychopharmacology: Problems in Evaluation, eds. J.O. Cole and R.W. Gerard, pp.356-371. Washington: National Academy of Sciences, National Research Council.
LATSCHA, R. (1953). Tests of significance in a 2 x 2 contingency table: extension of Finney’s table. Biometrika 40, 74-86.
LEE, Y.J. (1984). Quick and simple approximation of sample sizes for comparing two independent binomial distributions: different-sample-size case. Biometrics 40, 239-241.
MAKUCH, R.W. & SIMON, R.M. (1980). Sample size considerations for non-randomized comparative studies. J. Chronic Disease 33, 175-181.
McDONALD, L.L., DAVIS, B.M. & MILLIKEN, G.A. (1977). A nonrandomized unconditional test for comparing two proportions in 2 x 2 contingency tables. Technometrics 19, 145-157.
PATNAIK, P.B. (1948). The power function of the test for the difference between two proportions in a 2 x 2 table. Biometrika 35, 157-175.
PAULSON, E. & WALLIS, W.A. (1947). Planning and analyzing experiments for comparing two percentages. In Selected Techniques of Statistical Analysis, eds. C. Eisenhart, M.W. Hastay and W.A. Wallis, pp.247-266. New York: McGraw-Hill.
PLACKETT, R.L. (1977). The marginal totals of a 2 x 2 table. Biometrika 64, 37-42.
PRENTICE, R.L. & BRESLOW, N.E. (1978). Retrospective studies and failure time models. Biometrika 65, 153-158.
RADHAKRISHNA, S. (1983). Computation for sample size for comparing two proportions. Indian J. Med. Res. 77, 915-919.
SCHLESSELMAN, J.J. (1974). Sample size requirements in cohort and case-control studies of disease. Amer. J. Epidemiol. 99, 381-384.
SILLITTO, G.P. (1949). Note on approximations to the power function of the ‘2 x 2 comparative trial’. Biometrika 36, 347-352.
SUISSA, S. & SCHUSTER, J.J. (1985). Exact unconditional sample sizes for the 2 x 2 binomial trial. J. Roy. Statist. Soc. Ser. A 148, 317-327.
THOMAS, R.G. & CONLON, M.C. (1992). Sample size determination based on Fisher’s exact test for use in 2 x 2 comparative trials with low event rates. Controlled Clinical Trials 13, 134-147.
UPTON, G.J.G. (1982). A comparison of alternative tests for the 2 x 2 comparative trial. J. Roy. Statist. Soc. Ser. A 45, 86-105.
URY, H.K. (1981). Continuity-corrected approximations to sample sizes for comparing two independent proportions with the use of Yates’ correction. Biometrics 36, 347-351.
WALTERS, D.E. (1979). In defence of the arc sine approximation. Statistician 28, 219-222.
YATES, F. (1934). Contingency tables involving small numbers and the χ² test. J. Roy. Statist. Soc. Suppl. 1, 217-235.
- (1984). Tests of significance for 2 x 2 contingency tables (with discussion). J. Roy. Statist. Soc. Ser. A 147, 42-63.

