
The Analysis of Designed Experiments with Censored Observations

Author(s): J. Taylor. Source: Biometrics, Vol. 29, No. 1 (Mar., 1973), pp. 35-43. Published by: International Biometric Society. Stable URL: http://www.jstor.org/stable/2529674

BIOMETRICS 29, 35-43 March 1973

THE ANALYSIS OF DESIGNED EXPERIMENTS WITH CENSORED OBSERVATIONS

J. TAYLOR

Unilever Research Laboratory Colworth/Welwyn, Colworth House, Sharnbrook, Bedford, England

SUMMARY

An observation is said to be "censored" if we know only that it is less than (or greater than) a certain known value. Sampford and Taylor [1959] proposed an iterative method of analysis for randomized block experiments with Type I censoring, and this method can also be used for other experimental designs. The method has been applied to simulated data. The results confirm that parameters can be estimated with only a small bias, and that it is possible to do an approximate t-test to compare treatment means.

An approximation due to Tiku will reduce the labour of calculation.

1. INTRODUCTION

An observation $x_i$ is said to be "censored" if we know only that $x_i < L_i$ or $x_i \ge U_i$, the value of $L_i$ or $U_i$ being known but the exact value of $x_i$ being unknown. This is called Type I censoring. Censored observations, when they occur, are usually few in number and the exact value of $x_i$ is known for most members of the sample. For example, we may be able to weigh most of the items in a sample exactly but a few may be too heavy to be weighed by the balance; or we may record the times which animals take to eat some food but a few animals have not eaten it when the recorder must go to do other work. Censored observations may also be obtained in plate counts of bacteria, since an unexpectedly high density will give overlapping colonies which are uncountable but are certainly more than some known number.

Several authors (e.g. Cohen [1959, 1961, 1963], Swan [1969a], Tiku [1967]) have considered estimation of the parameters of a single normal distribution when a sample is drawn and is found to contain censored observations, and Tiku [1971] has considered censored samples from two populations which have different means and a common standard deviation. When we analyse the results of a designed experiment, or do a regression analysis, we assume that the observations have come from different populations whose means are related in some way. Sampford and Taylor [1959] gave a method of analysis for randomized block experiments with censored observations, and said that it could be used for other experimental designs. They derived maximum likelihood (ML) estimates of the parameters and suggested a method of correcting the estimate of the variance for bias. They also suggested an approximate t-test to compare treatment means. These suggestions cannot be verified mathematically, but Sampford and Taylor presented results from a small simulation test which showed that they were reasonable. A much larger simulation has now been done, and it is described in this paper.

2. SUMMARY OF THE METHOD OF ANALYSIS

(a) Model

In a randomized block experiment with m blocks and n treatments, we assume that the observation for treatment j in block i is given by

$$x_{ij} = \mu_{ij} + \epsilon_{ij}$$

where

$$\mu_{ij} = \mu + \rho_i + \tau_j \qquad \left(\text{with } \sum_i \rho_i = \sum_j \tau_j = 0\right)$$

and where the $\epsilon_{ij}$ are normally and independently distributed with mean 0 and standard deviation $\sigma$.

From any set of values for the $x_{ij}$, we can calculate estimates of the $\mu_{ij}$, and these are

$$\hat\mu_{ij} = \bar{x}_{i.} + \bar{x}_{.j} - \bar{x}_{..} \qquad (1)$$

Now suppose that r of the observations are censored. For h = 1, ..., r, let $x_h$ represent the censored value $x_{ij}$. We know that $x_h >$ some value $U_h$ but we do not know the exact value of $x_h$.
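As an illustration of equation (1), the following minimal Python sketch computes the fitted cell means for a complete m-by-n table. The function name `fitted_values` and the list-of-rows layout are assumptions made for this example, not part of the original paper.

```python
def fitted_values(x):
    """Return mu_hat[i][j] = row mean + column mean - grand mean, as in equation (1).

    x is a complete m-by-n table given as a list of m rows (blocks),
    each holding n values (treatments). This layout is an assumption.
    """
    m, n = len(x), len(x[0])
    row_means = [sum(row) / n for row in x]
    col_means = [sum(x[i][j] for i in range(m)) / m for j in range(n)]
    grand_mean = sum(row_means) / m
    return [[row_means[i] + col_means[j] - grand_mean for j in range(n)]
            for i in range(m)]


# A table that is exactly additive in blocks and treatments
# is reproduced by its own fitted values.
example = [[1.0, 2.0], [3.0, 4.0]]
fitted = fitted_values(example)
```

When some cells are censored, the same function would be applied to the table in which the censored cells have been filled with their currently assumed values.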

(b) Maximum likelihood estimates

If we assume a set of values for $x_h$, we can calculate estimates of $\mu_{ij}$ from (1), using the r assumed values and the (mn - r) values of $x_{ij}$ which are observed exactly. Sampford and Taylor showed that the ML estimates are obtained if we find $x_h$ such that

$$x_h = \hat\mu_h + \hat\sigma\, v(\hat\eta_h) \qquad (h = 1, \ldots, r) \qquad (2)$$

where

$$v(\eta) = f(\eta) / \{1 - F(\eta)\},$$

f and F being the frequency function and distribution function of a standardized normal variate,

$$\hat\eta_h = (U_h - \hat\mu_h) / \hat\sigma$$

and

$$\hat\sigma^2 = \frac{\sum (x_{ij} - \hat\mu_{ij})^2}{mn - r + \sum_h \lambda(\hat\eta_h)} \qquad (3)$$

where the summation in the numerator is over all $x_{ij}$, both assumed and observed, and where

$$\lambda(\eta) = v(\eta)\{v(\eta) - \eta\}.$$

$\hat\sigma^2$ is the ML estimate of $\sigma^2$, and $\hat\eta_h$ is the estimated standardized distance between the censoring point and the mean of the distribution from which the observation was drawn. $v(\eta)$ is the reciprocal of Mill's ratio. It was tabulated by Sampford [1952], or it may be computed by an approximating formula (Swan [1969b]).
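The two functions $v(\eta)$ and $\lambda(\eta)$ are easy to evaluate on a modern computer. The sketch below uses the complementary error function from the Python standard library instead of the tables or the approximating formula cited above; the function names are chosen for this example only.

```python
import math


def std_normal_pdf(z):
    """Frequency function f of a standardized normal variate."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def upper_tail(z):
    """1 - F(z) for a standardized normal variate, via erfc."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def v(eta):
    """Reciprocal of Mill's ratio: f(eta) / {1 - F(eta)}."""
    return std_normal_pdf(eta) / upper_tail(eta)


def lam(eta):
    """lambda(eta) = v(eta) * {v(eta) - eta}."""
    v_eta = v(eta)
    return v_eta * (v_eta - eta)
```

At $\eta = 0$ these reduce to $v(0) = \sqrt{2/\pi}$ and $\lambda(0) = 2/\pi$, which gives a quick check. Far in the upper tail (very large $\eta$) the denominator underflows, so a production routine would switch to an asymptotic form there; that refinement is left out of this sketch.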

The calculations are done iteratively. Initial values of $x_h$ are assumed, and $\hat\sigma$ is calculated from (3). $\hat\sigma$ is then held constant and equations (2) are solved for each $x_h$ in turn, until the values converge. $\hat\sigma$ is then re-calculated and the process is continued until it converges. Sampford and Taylor gave an example of the calculations. If the calculations are done by a computer, it is simplest to recalculate $\hat\sigma$ after each cycle of the $x_h$.
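The iteration can be sketched in Python as follows, recalculating the estimate of the standard deviation after each cycle as suggested for computer use. This is a minimal illustration under stated assumptions, not the programme used in the paper: the starting values (each censored cell set to its censoring point), the starting divisor mn, the stopping rule, and the choice to update all censored cells from the same set of fitted values within a cycle are all assumptions of this example.

```python
import math


def v(eta):
    """Reciprocal of Mill's ratio, f(eta) / {1 - F(eta)}."""
    if eta > 30.0:  # assumption: asymptotic form, avoids 0/0 far in the tail
        return eta + 1.0 / eta
    pdf = math.exp(-0.5 * eta * eta) / math.sqrt(2.0 * math.pi)
    return pdf / (0.5 * math.erfc(eta / math.sqrt(2.0)))


def lam(eta):
    return v(eta) * (v(eta) - eta)


def fitted_values(x):
    """Equation (1): row mean + column mean - grand mean."""
    m, n = len(x), len(x[0])
    rows = [sum(r) / n for r in x]
    cols = [sum(x[i][j] for i in range(m)) / m for j in range(n)]
    grand = sum(rows) / m
    return [[rows[i] + cols[j] - grand for j in range(n)] for i in range(m)]


def rss(x, mu):
    """Residual sum of squares over all cells, assumed and observed."""
    return sum((x[i][j] - mu[i][j]) ** 2
               for i in range(len(x)) for j in range(len(x[0])))


def fit_censored(x_obs, censor_points, tol=1e-12, max_iter=500):
    """x_obs: m-by-n table; censor_points: {(i, j): U_h} for cells known only to exceed U_h."""
    m, n = len(x_obs), len(x_obs[0])
    x = [row[:] for row in x_obs]
    for (i, j), u in censor_points.items():
        x[i][j] = u  # starting value (an assumption of this sketch)
    mu = fitted_values(x)
    sigma = math.sqrt(rss(x, mu) / (m * n))
    for _ in range(max_iter):
        change = 0.0
        for (i, j), u in censor_points.items():  # equation (2)
            new = mu[i][j] + sigma * v((u - mu[i][j]) / sigma)
            change = max(change, abs(new - x[i][j]))
            x[i][j] = new
        mu = fitted_values(x)
        lam_sum = sum(lam((u - mu[i][j]) / sigma)
                      for (i, j), u in censor_points.items())
        new_sigma = math.sqrt(rss(x, mu) / (m * n - len(censor_points) + lam_sum))  # equation (3)
        change = max(change, abs(new_sigma - sigma))
        sigma = new_sigma
        if change < tol:
            break
    return x, mu, sigma


# Hypothetical 3-by-3 data; the cell in block 3, treatment 1 is censored at 5.0.
table = [[4.1, 3.2, 1.4], [4.6, 3.9, 1.7], [5.0, 3.5, 1.2]]
filled, mu_hat, sigma_hat = fit_censored(table, {(2, 0): 5.0})
```

Because $v(\eta) > \eta$ for every $\eta$, each assumed value produced by equation (2) lies above its censoring point, which is a useful sanity check on any implementation.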

(c) Estimate of $\sigma^2$

$\hat\sigma^2$ is a biassed estimate of $\sigma^2$. Sampford and Taylor suggested that the bias would be approximately corrected if the divisor in equation (3) was replaced by

$$(m - 1)(n - 1) - r + \sum_h \lambda(\hat\eta_h).$$

They also suggested that this divisor should be used as the effective degrees of freedom of the estimate. This is analogous to the correction which is required for uncensored samples. It will be called the absolute correction, since it assumes that the absolute difference between the divisors for ML and corrected estimates is the same for censored and for uncensored samples.

One of the aims of the present investigation was to find how well this correction removes the bias. The first results suggested that it over-corrects, by using too low a value for the divisor. Another divisor was therefore tried as well. This was

$$\frac{(m - 1)(n - 1)\{mn - r + \sum_h \lambda(\hat\eta_h)\}}{mn}.$$

It will be called the proportionate correction, since it assumes that the ratio of the divisors for ML and corrected estimates is the same for censored and for uncensored samples.
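The three divisors can be compared directly. The sketch below is illustrative only; the function names are assumptions, and `lam_sum` stands for the sum of the $\lambda(\hat\eta_h)$ over the r censored observations, taken here as a given number.

```python
def ml_divisor(m, n, r, lam_sum):
    """Divisor of the ML estimate in equation (3)."""
    return m * n - r + lam_sum


def absolute_divisor(m, n, r, lam_sum):
    """Absolute correction: ML divisor reduced by mn - (m-1)(n-1)."""
    return (m - 1) * (n - 1) - r + lam_sum


def proportionate_divisor(m, n, r, lam_sum):
    """Proportionate correction: ML divisor scaled by (m-1)(n-1)/mn."""
    return (m - 1) * (n - 1) * (m * n - r + lam_sum) / (m * n)


# Hypothetical illustration: 6 blocks, 5 treatments, 7 censored cells
# whose lambda values sum to 5.0.
d_ml = ml_divisor(6, 5, 7, 5.0)
d_abs = absolute_divisor(6, 5, 7, 5.0)
d_prop = proportionate_divisor(6, 5, 7, 5.0)
```

With no censoring (r = 0 and an empty sum) both corrections reduce to the usual residual degrees of freedom (m - 1)(n - 1). Since each $\lambda$ lies between 0 and 1, the proportionate divisor is never smaller than the absolute one.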

(d) t-tests

Sampford and Taylor suggested that an approximate test of the hypothesis that $\tau_j = \tau_k$ would be given by dividing the difference between the estimated means for treatments j and k by

$$s \sqrt{\frac{1}{\sum_i w_{ij}} + \frac{1}{\sum_i w_{ik}}}$$

where s is the square root of the "unbiassed" estimate of $\sigma^2$, $w_{ij} = 1$ if the (i, j) observation is uncensored, and $w_{ij} = \lambda(\hat\eta_h)$ if it is the hth censored observation. The ratio would be referred to the t-distribution, with degrees of freedom equal to the divisor which is used to obtain the "unbiassed" estimate. In the present investigation the performance of this test has been observed, using first the absolute correction and then the proportionate correction to calculate $s$.
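A minimal sketch of the approximate t statistic follows. It assumes that the treatment means, the value of s, and the weights (1 for an uncensored cell, the $\lambda$ value for a censored cell) have already been obtained; the function name and argument layout are assumptions of this example.

```python
import math


def approximate_t(mean_j, mean_k, s, weights_j, weights_k):
    """Difference of two treatment means divided by its approximate standard error.

    weights_j and weights_k hold one weight per block for treatments j and k:
    1 for an uncensored cell, lambda(eta_h) for a censored cell.
    """
    standard_error = s * math.sqrt(1.0 / sum(weights_j) + 1.0 / sum(weights_k))
    return (mean_j - mean_k) / standard_error


# With six uncensored blocks per treatment the standard error is s * sqrt(2/6).
t_uncensored = approximate_t(1.0, 0.0, 1.0, [1.0] * 6, [1.0] * 6)

# Hypothetical case: one censored cell with weight 0.6 in treatment j.
t_censored = approximate_t(1.0, 0.0, 1.0, [1.0] * 5 + [0.6], [1.0] * 6)
```

Because a censored cell contributes a weight below 1, the standard error is larger, and the statistic smaller in magnitude, than for a fully observed treatment with the same means and the same s.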

3. EXPERIMENTAL METHOD

Randomized block experiments with m = 6, n = 5 were simulated. In each experiment, the block parameters were $\rho_1 = 0$, $\rho_2 = -0.1$, $\rho_3 = +0.1$, $\rho_4 = +0.3$, $\rho_5 = -0.3$, $\rho_6 = 0$. The values of $(\mu + \tau_j)$ for treatments 3, 4 and 5 were 4.55, 4.0, and 1.5. Treatments 1 and 2 each had the same parameter; and for case 1 this was the same as the parameter for treatment 3, for case 2 the same as for treatment 4, and for case 3 the same as for treatment 5. The values of $\mu_{ij}$ are given in Table 1. The censoring point $U_{ij}$ was taken to be 5.0 for all (i, j). $\sigma$ was taken to be 1.0. The expected proportion of censored observations in each cell is given in brackets in Table 1.

TABLE 1
VALUES OF $\mu_{ij}$, WITH THE PROBABILITY OF CENSORING IN BRACKETS

                         Treatment
Block         3                4                5
  1      4.55 (0.326)     4.00 (0.159)     1.50 (0.000)
  2      4.45 (0.291)     3.90 (0.136)     1.40 (0.000)
  3      4.65 (0.363)     4.10 (0.184)     1.60 (0.000)
  4      4.85 (0.440)     4.30 (0.242)     1.80 (0.000)
  5      4.25 (0.227)     3.70 (0.097)     1.20 (0.000)
  6      4.55 (0.326)     4.00 (0.159)     1.50 (0.000)

The expected number of censored observations in an experiment is approximately 2 for treatment 3, 1 for treatment 4, and 0 for treatment 5.
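The bracketed probabilities in Table 1 follow from the normal upper tail beyond the censoring point. The sketch below recomputes them from the tabulated cell means, with the censoring point 5.0 and standard deviation 1.0 stated in the text; the variable and function names are assumptions of this example.

```python
import math


def prob_censored(mu, upper=5.0, sigma=1.0):
    """P(x > upper) for a normal observation with mean mu and s.d. sigma."""
    return 0.5 * math.erfc((upper - mu) / (sigma * math.sqrt(2.0)))


# Cell means copied from Table 1 (blocks 1 to 6) for treatments 3, 4 and 5.
MU = {
    3: [4.55, 4.45, 4.65, 4.85, 4.25, 4.55],
    4: [4.00, 3.90, 4.10, 4.30, 3.70, 4.00],
    5: [1.50, 1.40, 1.60, 1.80, 1.20, 1.50],
}

probabilities = {t: [prob_censored(mu) for mu in column] for t, column in MU.items()}
expected_per_treatment = {t: sum(p) for t, p in probabilities.items()}


def expected_censored(case):
    """Expected number of censored values per experiment.

    Treatments 1 and 2 share the parameter of treatment 3, 4 or 5
    in cases 1, 2 and 3 respectively.
    """
    shared = {1: 3, 2: 4, 3: 5}[case]
    return 2 * expected_per_treatment[shared] + sum(expected_per_treatment.values())
```

For example, `prob_censored(4.55)` is the upper tail beyond 0.45 standard deviations, and `expected_censored(1) / 30` gives the expected proportion of censored observations in case 1.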

Values of $\epsilon_{ij}$ were generated by a pseudo-random method and were added to the parameters to give a sample result of an experiment. 1000 samples (i.e. 30000 values of $\epsilon_{ij}$) were generated for case 1, 3000 samples for case 2, and 1000 samples for case 3.

The pseudo-random values were obtained by the modified polar method (Marsaglia and Bray [1964, section 1]). Several properties of this generator had already been tested and found satisfactory, and further tests which might be relevant to this investigation were applied to the $\epsilon_{ij}$. These tested (i) mean values for each (i, j) in each case, for each treatment, and for all the $\epsilon_{ij}$; (ii) the mean and the distribution of the residual mean square in an analysis of variance of the uncensored data; (iii) the mean and the distribution of t, as calculated for a t-test to compare treatments 1 and 2; (iv) the mean number of censored values (> 5.0) for each (i, j) in each case, for all samples in each case, and for all samples in all cases; (v) the frequency distribution of the number of censored values in each sample. These tests gave satisfactory results. The worst results were the mean of all 150000 values of $\epsilon_{ij}$, which was -0.0064 (expectation = 0, S.E. = 0.0026); and the mean number of censored values in a sample for case 2, which was 4.833 (expectation = 4.903, S.E. = 0.035). These should not seriously affect the results which follow.
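For readers who wish to repeat a simulation of this kind, the following Python sketch generates normal deviates by the polar method as it is commonly described (a uniform point in the unit circle, rescaled) and builds one censored sample. It is an illustration under assumptions, not a reconstruction of the original programme: the uniform generator is Python's `random.Random`, the seed is arbitrary, and the function names are invented for this example.

```python
import math
import random


def polar_normal_pair(rng):
    """Return two independent standard normal deviates by the polar method."""
    while True:
        u = 2.0 * rng.random() - 1.0
        w = 2.0 * rng.random() - 1.0
        s = u * u + w * w
        if 0.0 < s < 1.0:  # keep points strictly inside the unit circle
            scale = math.sqrt(-2.0 * math.log(s) / s)
            return u * scale, w * scale


def simulate_sample(mu, sigma, upper, rng):
    """Return (values, censored) for one experiment.

    mu is the table of cell means; a cell whose simulated value exceeds
    `upper` is flagged as censored and recorded as `upper`.
    """
    values, censored = [], []
    for row in mu:
        value_row, flag_row = [], []
        for cell_mean in row:
            # Only the first deviate of each pair is used, for simplicity.
            x = cell_mean + sigma * polar_normal_pair(rng)[0]
            flag_row.append(x > upper)
            value_row.append(min(x, upper))
        values.append(value_row)
        censored.append(flag_row)
    return values, censored


rng = random.Random(1973)  # arbitrary seed, for reproducibility only
deviates = [z for _ in range(10000) for z in polar_normal_pair(rng)]
```

Checks of the kind listed above, such as the overall mean of the deviates and the mean number of censored values per sample, can then be applied to the generated values before the censored analysis is trusted.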

4. RESULTS

The censored samples were analysed by the method described in section 2, using both the absolute and the proportionate correction in the estimation of $\sigma^2$.

(a) Number of useful samples

In some samples, all observations for treatment 3 were censored. This has a probability of 0.0011. It also happens for treatments 1 and 2 in case 1. These samples have been excluded from the remaining analyses, since it is impossible to analyse the results for all treatments. In this way we excluded 3 samples out of 1000 for case 1, 2 out of 3000 for case 2, and 4 out of 1000 for case 3. If such results were obtained in a real experiment we should analyse the results for the remaining treatments only, and should obtain some estimates, but the fact that we have not done so in these few samples should not noticeably affect our conclusions.
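The probability of 0.0011 is the product of the six censoring probabilities for treatment 3, since the errors are independent. A short check in Python, using the Table 1 means, the censoring point 5.0 and standard deviation 1.0 (names are assumptions of this example):

```python
import math


def prob_censored(mu, upper=5.0, sigma=1.0):
    """P(x > upper) for a normal observation with mean mu and s.d. sigma."""
    return 0.5 * math.erfc((upper - mu) / (sigma * math.sqrt(2.0)))


# Cell means for treatment 3 in blocks 1 to 6, from Table 1.
treatment_3_means = [4.55, 4.45, 4.65, 4.85, 4.25, 4.55]

# All six observations are censored only if every block exceeds 5.0.
p_all_censored = math.prod(prob_censored(mu) for mu in treatment_3_means)
```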

(b) Mean values

The differences between the mean estimates for treatment parameters in the analyses of the censored samples and of the same samples uncensored were 0.004, 0.003 and 0.000 for treatments 3, 4 and 5 respectively. These values include the estimates for treatments 1 and 2 in the appropriate cases. The censored means were slightly greater than the uncensored ones, for the treatments which had an appreciable amount of censoring. This may be partly a compensation for the slight negative bias in the uncensored values (see section 3).

(c) Estimates of $\sigma^2$

Table 2 gives the following properties of the estimates of $\sigma^2$ which were obtained by using the alternative divisors:

(i) The mean value of the estimate is given.

(ii) We wish to compare this mean with the mean estimate of $\sigma^2$ from the uncensored values for the same samples. The difference (or ratio) of the estimates follows no known distribution. We have therefore used a t-test for paired samples, and have relied on the robustness of the t-test in large samples.

(iii) The distribution of the estimates has been tested by applying a $\chi^2$ test to the frequencies in 40 cells, each of which has the same expected frequency. For each sample separately we compared the estimate of $\sigma^2$ with the fractiles of the distribution for the "effective degrees of freedom" (see section 2(c)), and allocated the sample to the appropriate cell.

[TABLE 2: contents not recoverable from the scanned page]
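The test on 40 equally probable cells described in item (iii) can be sketched as follows. The sketch assumes that each estimate has already been converted to its cumulative probability under the reference distribution (a value between 0 and 1); that conversion depends on the effective degrees of freedom of each sample and is not shown. The function name is an assumption of this example.

```python
def equiprobable_chi_square(cdf_values, cells=40):
    """Chi-square statistic for counts in `cells` equally probable cells.

    cdf_values: cumulative probabilities in [0, 1], one per sample.
    Returns (chi_square, counts); the statistic has cells - 1 degrees of freedom.
    """
    counts = [0] * cells
    for p in cdf_values:
        # A value of exactly 1.0 is placed in the last cell.
        counts[min(int(p * cells), cells - 1)] += 1
    expected = len(cdf_values) / cells
    chi_square = sum((c - expected) ** 2 / expected for c in counts)
    return chi_square, counts


# Perfectly even allocation: 25 values at the midpoint of each of the 40 cells.
even_values = [(k + 0.5) / 40 for k in range(40)] * 25
chi_even, counts_even = equiprobable_chi_square(even_values)
```

With 40 cells the statistic has 39 degrees of freedom, so an excess of samples in the low cells, or in the tails, shows up as a large value.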

The estimates with the absolute correction are too large on the average, while those with the proportionate correction are too small. The errors are greater for the absolute correction, but the means in the three cases are only 3.4%, 1.9% and 1.5% respectively greater than the corresponding values for uncensored samples. The significant value of $\chi^2$, for the proportionate correction in case 2, was caused by an excess of low values.

(d) Values of t

The approximate t-test, described in section 2(d), was made to compare treatments 1 and 2. We should expect the mean value of t to be close to zero and not to differ significantly from the value for uncensored samples, since the treatments had the same parameters, and this was confirmed. The frequency distribution of t was also compared with the theoretical distribution, in the same way as for the estimates of $\sigma^2$. Table 3 gives the values of $\chi^2$. It also gives the percentages of values of t which were outside the 5% point of the distribution, for a two-sided test of significance.

TABLE 3
DISTRIBUTION IN CENSORED SAMPLES OF t TO COMPARE THE MEANS OF TREATMENTS WITH THE SAME PARAMETER

                                            Case 1    Case 2    Case 3
Absolute correction
  Chi-square for distribution (39 d.f.)      50.3      21.3      31.1
  Percentage outside 5% point                 4.3       4.9       4.4
Proportionate correction
  Chi-square for distribution (39 d.f.)      54.4      19.5      38.2
  Percentage outside 5% point                 4.2       4.9       4.3

Standard error of each percentage = 0.7

* For the same 1000 samples as in Table 2.


No values of $\chi^2$ were significantly large, at the P = 0.05 level. The values for case 2 were smaller than would occur by chance in 95% of experiments. This may nevertheless be a chance effect, since we should have expected case 2 to be intermediate between cases 1 and 3 if there were any real effects.

5. DISCUSSION

The methods suggested by Sampford and Taylor have worked well for the experiments which were considered in this investigation. The estimates of $\sigma^2$ using the absolute correction were slightly too high, which implies that the "effective degrees of freedom" are too low. The estimated bias was only 3.4% in case 1, for which 23% of the observations are censored. The approximate t-test gave quite a good approximation to the theoretical distribution and the frequency of significant values was close to the expected value.

All analyses depend on assumptions about the distribution of errors, and an analysis of censored data is particularly sensitive to departures from the assumptions in the tails of the distribution. We have considered small experiments with a high percentage of censored observations, and these are as extreme as any which one might hope would give reasonable results. If the assumptions about the distribution are correct, then we may expect that the approximations suggested in this paper will be better in larger experiments with the same percentage of censored observations, and in experiments with a lower percentage.

The proportionate correction for bias gave smaller absolute errors in the estimates of $\sigma^2$ than the absolute correction and the distribution of t was almost as good. The variance tended to be underestimated, so the "effective degrees of freedom" were too high. The absolute correction is recommended because it will give more conservative tests of significance and because the bias, although greater than that for the proportionate correction, is quite small.

6. TIKU'S APPROXIMATION

Tiku [1967] has pointed out that (in our notation) $v(\eta)$ is approximately a linear function of $\eta$ and that this simplifies estimation for a single censored sample. This approximation may also be used in the method which is considered in this paper. For a given value of $\hat\sigma$, the equations (2) are then linear in the $x_h$ and so can be solved without iteration. $\hat\sigma$ must be estimated iteratively, and the $x_h$ must be recalculated.
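The effect of a linear approximation can be illustrated in Python. The sketch below replaces $v(\eta)$ by the chord $\alpha + \beta\eta$ through two chosen points, which is an illustrative choice and not necessarily the coefficients that Tiku recommends. It then solves equation (2) in closed form for a single censored observation, assuming the fitted value depends linearly on the assumed value as $\hat\mu_h = a + c\,x_h$ (for a complete two-way table, c = 1/m + 1/n - 1/(mn)). All names and the numerical inputs are hypothetical.

```python
import math


def v(eta):
    """Reciprocal of Mill's ratio, f(eta) / {1 - F(eta)}."""
    pdf = math.exp(-0.5 * eta * eta) / math.sqrt(2.0 * math.pi)
    return pdf / (0.5 * math.erfc(eta / math.sqrt(2.0)))


def chord_coefficients(eta_low, eta_high):
    """Coefficients (alpha, beta) of the straight line through
    (eta_low, v(eta_low)) and (eta_high, v(eta_high))."""
    beta = (v(eta_high) - v(eta_low)) / (eta_high - eta_low)
    alpha = v(eta_low) - beta * eta_low
    return alpha, beta


def solve_single_censored(a, c, sigma, upper, alpha, beta):
    """Solve x = mu + sigma * (alpha + beta * (upper - mu) / sigma)
    for x, where mu = a + c * x, with sigma held fixed."""
    return ((1.0 - beta) * a + sigma * alpha + beta * upper) / (1.0 - (1.0 - beta) * c)


alpha, beta = chord_coefficients(0.0, 2.0)

# Hypothetical inputs: 3-by-3 table (c = 5/9), sigma held at 0.25, censoring point 5.0.
a, c, sigma, upper = 1.9, 5.0 / 9.0, 0.25, 5.0
x_h = solve_single_censored(a, c, sigma, upper, alpha, beta)
```

With several censored observations the same substitution gives a set of simultaneous linear equations in the $x_h$, one per censored cell, which can be solved directly for each trial value of the standard deviation.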

We have not calculated estimates using Tiku's approximation in this investigation, but they should have similar properties to the ML estimates. It will be necessary to make corrections for bias in the same way. Tiku's approximation will greatly reduce the labour of calculation by a desk machine. It may save some time in computer calculation, but this is unlikely to be great and it seems preferable to use the exact formula for $v(\eta)$.


ACKNOWLEDGMENT

I thank Miss M. Kuo for writing the computer programme which performed the simulation experiment and summarized the results.

L'ANALYSE D'EXPERIENCES AVEC DES OBSERVATIONS "CENSUREES"

RESUME

Une observation est dite "censurée" si l'on sait seulement qu'elle est plus petite (ou plus grande) qu'une certaine valeur connue. Sampford et Taylor [1959] ont proposé une méthode itérative d'analyse pour les expériences en blocs randomisés avec une censure du type I, et cette méthode peut aussi être utilisée pour d'autres plans expérimentaux. La méthode a été appliquée à des données simulées. Les résultats confirment que les paramètres peuvent être estimés avec seulement un petit biais, et qu'il est possible de faire un test approché pour comparer les traitements moyens. Une approximation due à Tiku devrait réduire le travail de calcul.

REFERENCES

Cohen, A. C., Jr. [1959]. Simplified estimators for the normal distribution when samples are singly censored or truncated. Technometrics 1, 217-37.

Cohen, A. C., Jr. [1961]. Tables for maximum likelihood estimates: singly truncated and singly censored samples. Technometrics 3, 535-41.

Cohen, A. C., Jr. [1963]. Progressively censored samples in life testing. Technometrics 5, 327-39.

Marsaglia, G. and Bray, T. A. [1964]. A convenient method for generating normal variables. SIAM Review 6, 260-4.

Sampford, M. R. [1952]. The estimation of response-time distributions. II Multi-stimulus distributions. Biometrics 8, 307-69.

Sampford, M. R. and Taylor, J. [1959]. Censored observations in randomized block experiments. J. R. Statist. Soc. B. 21, 214-37.

Swan, A. V. [1969a]. Computing maximum-likelihood estimates for parameters of the normal distribution from grouped and censored data. Appl. Statist. 18, 65-9.

Swan, A. V. [1969b]. Algorithm AS17. The reciprocal of Mill's ratio. Appl. Statist. 18, 115-6.

Tiku, M. L. [1967]. Estimating the mean and standard deviation from a censored normal sample. Biometrika 54, 155-65.

Tiku, M. L. [1971]. Estimating the means and standard deviation from two censored normal samples. Biometrika 58, 241-3.

Received February 1972, Revised August 1972

Key words: Censored observations; Randomized blocks; Simulation.
