
IRAG Working Group 2: CAM-based assays


Pergamon Food and Chemical Toxicology 35 (1997) 39-66


IRAG WORKING GROUP 2

CAM-based Assays

H. SPIELMANN*†, M. LIEBSCH†, F. MOLDENHAUER†, H.-G. HOLZHÜTTER‡, D. M. BAGLEY§, J. M. LIPMAN¶, W. J. W. PAPE**, H. MILTENBURGER††, O. de SILVA‡‡, H. HOFER§§ and W. STEILING¶¶

†ZEBET (National Centre for Documentation and Validation of Alternatives to Animal Experiments), BgVV (Federal Institute for Health Protection of Consumers and Veterinary Medicine), Diedersdorfer Weg 1, D-12277 Berlin, Germany, ‡Humboldt University, Berlin, Germany, §Colgate-Palmolive, Piscataway, NJ, USA, ¶Hoffmann-La Roche, Inc., Nutley, NJ 07110-1199, USA, **Beiersdorf AG, 20245 Hamburg, Germany, ††University of Darmstadt, Germany, ‡‡L'Oréal, Aulnay sous Bois, France, §§Austrian Research Center, Dept of Toxicology, A-1010 Wien, Austria and ¶¶Henkel, Düsseldorf, Germany

Abstract. CAM-based assays, in which test material is applied to the chorion allantoic membrane (CAM) of embryonated chicken eggs, were assessed as alternatives to the Draize eye irritation test. Two general types of CAM-based assays are currently in use, the HET-CAM test and the CAMVA assay. Evaluations were made of five data sets produced with three different modifications of the HET-CAM test and two data sets obtained with the same CAMVA protocol. Data sets consisted of 9-133 test chemicals, usually from the sponsor's product line, and also from a validation trial. Each data set and assay protocol were analysed for quality of data, purpose and proposed use of the assay, range of responses covered, range of test materials amenable, and current use in safety and risk assessment both in-house and for regulatory purposes. Since the MMAS Draize score was not available for all in vivo data sets, the ΣMMMIS, which correlates well with the MMAS, was used instead. In vitro/in vivo correlations calculated with Pearson's linear coefficient ranged from r = 0.6 to r = 0.9 for six of seven data sets. Corneal opacity and inflammation of the iris showed the best correlation to in vitro data. Prediction rates were significantly improved when partial linear regression was used, and the predictivity of three different HET-CAM protocols was almost the same. HET-CAM assays showed the best prediction with surfactants and surfactant-based formulations, whereas the CAMVA assay provided the best performance with alcohols. © 1997 Elsevier Science Ltd

Introduction

In 1992 a steering committee was formed for a workshop of the US Interagency Regulatory Alternatives Group (IRAG) to be held in November 1993 on "Eye Irritation Testing: Practical Applications of Non-whole Animal Alternatives." IRAG Working Group 2 (WG-2) assumed the task of evaluating CAM-based assays according to IRAG's Guidelines for the Evaluation of Eye Irritation Alternative Tests (IRAG, WG 6, 1993). In May 1993 IRAG sent letters of invitation to submit data to institutions and individuals who had indicated that they were interested in participating in the IRAG workshop. They were asked to submit both in vivo Draize eye test data and in vitro data that were generated in their laboratories. Those who responded and indicated that they intended to submit data from CAM-based assays to IRAG were asked to send their data to WG-2.

*Chair. Other members of Working Group 2: Bagley, de Silva, Steiling, Miltenburger and Holzhütter.

WG-2 held two meetings in Berlin on 16 June and 18 August 1993. During the first WG-2 meeting the general strategy was developed for distributing the work within WG-2 and for identifying sponsors who might submit their data to WG-2. The second meeting focused on data handling, data analysis and data presentation. To facilitate data analysis for the biostatisticians both at ZEBET and at the Humboldt University in Berlin, WG-2 decided to ask for submission of all of the data not only in a printed document but also on computer diskettes.

Data were submitted to WG-2 for two types of CAM-based assays, the HET-CAM assay, which was developed by Luepke in Germany (Luepke, 1985), and the CAMVA assay, which was developed by Bagley in the United States. In the HET-CAM assay testing is performed on the "chorion allantoic membrane" (CAM) of chicken egg on day 9 of embryonation; the CAMVA assay is performed on the CAM on either day 10 or day 14 of embryonation.

0278-6915/97/$17.00 + 0.00 © 1997 Elsevier Science Ltd. All rights reserved. Printed in Great Britain. PII S0278-6915(96)00103-2


The following laboratories have submitted data to WG-2:

CAMVA assay: Colgate-Palmolive, Piscataway, NJ, USA; Hoffmann-La Roche, Nutley, NJ, USA and MB-Research Laboratories, Spinnerstown, PA, USA.

HET-CAM assay: ZEBET-BGA, German HET-CAM assay validation study; Beiersdorf, Hamburg, Germany (data were submitted in 1994 after the IRAG meeting held in 1993); Forschungsinstitut Seibersdorf, Seibersdorf, Austria; Henkel, Düsseldorf, Germany; L'Oréal, Aulnay sous Bois, France; and Merck, Darmstadt, Germany. The selection of members of WG-2 indicates that the predominant use of CAM-based assays is in Europe.

For reasons of confidentiality names of sponsors are not given in the WG-2 report except for the German validation study, which has already been published (Spielmann et al., 1993).

After the data had been reviewed by WG-2, it was decided that the following data could not be included in the in vivo/in vitro correlation procedure at all or only to a limited extent:

CAMVA assay. Since one laboratory had not submitted in vivo data on a diskette in a standard format, statistical analysis of the data submitted by this sponsor could not be performed. The submission of this laboratory contained acceptable in vitro data, but the in vivo data were classified Draize eye test data, which did not permit analysis of the individual tissue scores.

HET-CAM assay. One of the sponsors submitted a set of 26 chemicals with both in vivo and in vitro data. Only nine of these chemicals met the quality standard defined for acceptance by IRAG. Since the remaining number of chemicals with both acceptable in vivo and in vitro data was too small for statistical analysis, WG-2 decided to present the data but not to include this data set in the statistical analysis. Another sponsor did not submit any in vivo Draize eye test data; therefore, this data set could not be analysed and is not covered by the present report.

Test protocols, test chemicals and data analysis

Each HET-CAM laboratory used different versions of the original test protocol developed by Luepke in 1985. Evaluation of the HET-CAM assay was even more complicated because each laboratory chose test chemicals according to the product line of the company. On the one hand, a broad spectrum of industrial chemicals was tested in a validation study, and on the other hand, several companies of the cosmetics industry have submitted only data obtained with surfactants and surfactant-based formulations. In contrast, the two CAMVA laboratories covered by WG-2 have used the same test protocol. Comparison of the performance of the CAMVA assay is, therefore, less complex than the evaluation of the performance of the HET-CAM assay within this study.

Correlation analysis of in vivo v. in vitro data was performed on two data sets obtained with one protocol of the CAMVA assay and on five data sets generated with three different variations of the HET-CAM protocol. Six of these data sets are in-house data selected by the sponsors. The data set from the national German HET-CAM validation study, which was performed under blind conditions in eight laboratories, was obtained by testing 133 carefully selected chemicals covering a broad spectrum of chemical and toxicological properties.

The data submitted by the sponsors were analysed by WG-2 according to the guidelines developed by IRAG WG-6 for acceptance and evaluation of the data submitted for comparing in vitro with in vivo data (IRAG Focus Group Six, Draft Report Guidelines, 1993). After evaluation of the quality of the data according to criteria for data acceptance suggested by WG-6, biostatistical analysis of the data was carried out in Berlin in a joint venture between ZEBET and Hermann-Georg Holzhütter at the Department of Biochemistry, Humboldt University Medical School "Charité".

During statistical analysis an attempt was made to identify the utility of each of the HET-CAM protocols as well as of the CAMVA protocol by taking chemical and product class of each data set into account, as recommended by IRAG WG-6.

The Methods section of this report describes conditions for evaluation and acceptance of the data sets submitted. The same section also covers the biostatistical methods used. In the Results section, in vitro and in vivo data of each set are presented and analysed separately. In the same section the predictivity of the in vitro endpoints of the CAM-based assays is compared with the different in vivo reaction scores of the rabbit's eye as suggested by IRAG WG-6. In addition, the relation between predictivity of CAM-based assays and chemical classes of test materials was analysed, and the results are discussed with respect to the goal of the study.

The WG-2 report was prepared at ZEBET by Manfred Liebsch, Ferdinand Moldenhauer and Horst Spielmann.

Methods

Data sets submitted

Data submissions followed the Guidelines for the Evaluation of Eye Irritation Alternative Tests: Criteria for Data Submission, as suggested by the IRAG Guidelines Working Group WG-6 (IRAG, 1993). Data were submitted from sponsors who had generally performed both in vivo and in vitro assays on test materials. To facilitate determination of in vitro/in vivo correlations for individual tissues of the eye, individual scores were requested for every animal tested rather than average Draize scores. This approach allowed comparison of responses in each tissue and provided a measure of the variability of each tissue response among the three to six rabbits tested. As suggested by IRAG WG-6, the following details were carefully evaluated for each data set of CAM-based assays submitted to WG-2.

1. In vitro data. Purpose, basis and proposed use of each in vitro assay; standard protocols and their limitations; data characteristics of in vitro data; specific characteristics of in vitro tests (e.g. existence of concurrent positive and negative controls); numbers of replicates; range of irritation potential for the data set as defined by in vivo Draize scores; range of response potential for the in vitro assay; spectrum of test materials; variance of the assay according to sponsors and, if the assay was evaluated in a validation trial, information on interlaboratory variation; if the validation was conducted under blind conditions and has been published in the peer-reviewed literature.

2. In vivo data. Whether in vivo data were provided; whether the study was conducted according to GLP principles; number of animals used per test material, in particular if more than three animals were used; individual tissue scores for each animal and time point; whether MAS scores were provided or sufficient information to calculate MAS; whether MMMIS scores were provided or sufficient information to calculate MMMIS; if the test was conducted for 21 days or truncated earlier; and if original protocols were provided.

The majority of data sets submitted included most of the requested data. Those lacking in vivo data were not reviewed by WG-2 and are not included in this report. An outline of the protocol of the specific CAM-based assay is given in the Results section for each data set, and the purpose and proposed use of the test, range of responses and range of test materials amenable to use in the assay are described as well as the use of each assay for safety and risk assessment.

In vivo scores used for biostatistics

Individual tissue scores of the Draize eye test. According to the IRAG WG-6 guidelines (IRAG, 1993) the MMMIS (mean of modified maximum individual tissue score) was used as the in vivo score for effects observed on the different tissues of the rabbit's eye; that is, only effects from 24 hr or more after dosing were used, to reduce the potential scoring bias from immediate, transient effects. The MMMIS is calculated according to the following equation for each tissue of the rabbit's eye:

$$\mathrm{MMMIS}_{\mathrm{tissue}} = \frac{1}{n} \sum_{i=1}^{n} \max(\mathrm{score})_i$$

where n is the number of animals used in the Draize eye test and max(score) is the highest score recorded for an animal during the observation period between 24 hr and 21 days after application.

Summarized tissue score of the Draize eye test. Because information on area of cornea affected and discharge was missing from several European data sets, the in vivo Draize eye score MAS (maximum average score) or MMAS (modified MAS, excluding the first hour) could not be calculated for each of the data sets submitted. Instead, a non-weighted sum (ΣMMMIS) of those individual scores available from all data sets was used to characterize the range and distribution of the in vivo responses covered by each of the data sets and to determine the overall correlation between in vitro scores and the rabbit's eye. MMAS and ΣMMMIS were calculated according to the following equations:

$$\mathrm{MMAS} = 5 \times (\mathrm{score}_{\mathrm{opacity}} \times \mathrm{score}_{\mathrm{area}}) + 5 \times \mathrm{score}_{\mathrm{iris}} + 2 \times (\mathrm{score}_{\mathrm{erythema}} + \mathrm{score}_{\mathrm{chemosis}} + \mathrm{score}_{\mathrm{discharge}})$$

$$\Sigma\mathrm{MMMIS} = \mathrm{MMMIS}_{\mathrm{opacity}} + \mathrm{MMMIS}_{\mathrm{erythema}} + \mathrm{MMMIS}_{\mathrm{chemosis}} + \mathrm{MMMIS}_{\mathrm{discharge}}$$

The essential information for calculating the MMAS was provided with two of the data sets, HETCAM-II and CAMVA-Laboratory B. These two data sets are provided to show that ΣMMMIS can be used as a summarized tissue score for the overall reaction of the rabbit's eye. Figures 1 and 2 show the correlation between MMAS and ΣMMMIS for the two data sets. Although linear correlation between ΣMMMIS and MMAS is significant in the two data sets (r = 0.81 or r = 0.98; P < 0.001), the CAMVA data set (Fig. 2), which covers the entire range of reactivity of the MMAS and the ΣMMMIS scores, indicates that the relation between MMAS and ΣMMMIS is non-linear (saturation) at the upper end. This is due to the fact that the MMAS is a summarized score, in which corneal damages are weighted with a higher factor (× 5) than effects on iris and conjunctiva (× 2). At the lower end of the MMAS scale, predominantly conjunctiva effects contribute to the MMAS. At the upper end of the MMAS scale, an increasing proportion of corneal effects with higher weighting factors contribute to the MMAS. Therefore, the MMAS shows a non-linear increase in the range of severe eye irritancy. In contrast, the non-weighted score ΣMMMIS increases in a linear fashion over the entire range of eye irritancy. This difference has to be taken into account in comparing in vitro/in vivo correlations of the present analysis of WG-2, which uses the summarized ΣMMMIS in most cases, and similar analyses conducted by the other IRAG Working Groups, which are based on the MMAS.

Fig. 1. Draize eye test scores: correlation between modified maximum average score (MMAS) and summarized modified means of maximum individual tissue scores (ΣMMMIS). In vivo data submitted by laboratory HETCAM-II allowed the calculation of both MMAS and ΣMMMIS. Linear correlation coefficient of the two scores is significant (r_p = 0.98, P ≤ 0.001) in the lower range of eye irritation (MMAS 0-58 and ΣMMMIS 0-12).

Fig. 2. Draize eye test scores: correlation between modified maximum average score (MMAS) and summarized modified mean of maximum individual tissue scores (ΣMMMIS). In vivo data submitted with data set of CAMVA-Laboratory B allowed the calculation of both MMAS and ΣMMMIS. The linear correlation coefficient for the two scores is significant (r_p = 0.81, P ≤ 0.001). The data points follow the same pattern as in Fig. 1. However, the correlation follows a non-linear saturation curve at the upper end of the ΣMMMIS scale (see text).

Biostatistical methods used for data analysis

Test on normal distribution of data. Normal distribution of all in vitro and in vivo scores was assessed by the Kolmogorov-Smirnov test (Lozán, 1992). The two values p_X and p_Y indicate the level of significance for X or Y being normally distributed according to this statistical test.

Correlation analysis. Correlation between two variables, X and Y, was assessed by Pearson's product-moment correlation coefficient r_p and Spearman's rank correlation coefficient r_s (Lozán, 1992).

Linear regression modelling: single linear regression (pr1). The functional relation between two variables X and Y was analysed by the linear regression model Y = mX + n. The two parameters m (slope) and n (intercept) were estimated by minimizing the sum of deviation squares

$$\mathrm{SDS} = \sum_i (\Delta Y_i)^2, \qquad \Delta Y_i = Y_i^{\mathrm{predicted}} - Y_i^{\mathrm{observed}}$$

where ΔY_i is the difference between theoretical and experimental Y values (regression type I); that is, errors in the independent variable X (usually the in vitro score) were neglected.

The predictive power of the model was assessed on the basis of the prediction rate pr [%], defined as the percentage of data points for which

$$|\Delta Y_i| \le \frac{\mathrm{max\_score}\ Y}{10}$$

where max_score Y designates the total range of Y scores. An example of this analysis is given in Fig. 3, which shows the data set of the submission HETCAM-III. The bold line represents the linear regression for prediction of the MMAS by the in vitro score "SCORE1" of HETCAM-III and the dashed lines cover the area of ±11 MMAS, which is 1/10 of the maximum in vivo score. 32 of 42 data points are within this range, resulting in a prediction rate of 76%.

Fig. 3. Single linear regression (s = 1): In vitro data of the HETCAM-III assay (SCORE1, X-axis) plotted v. in vivo data in the Draize eye test (MMAS, Y-axis) to calculate single linear correlation and the prediction rate (pr1). The bold line represents the linear regression line for predicting the MMAS score from in vitro scores. The dashed lines indicate confidence limits for prediction of single values at P = 0.05. The two thin lines at a distance ± max MMAS/10 from the regression line (= ±11) are used to assess the prediction rate (pr1): 32 of 42 observed data points fall into the area enclosed by these two lines and thus are considered as correctly predicted, namely pr1 = 76%.

Partial linear regression (prs). Many X-Y diagrams for comparing in vitro/in vivo data revealed a considerable scatter of data points which cannot be satisfactorily related to each other by a single linear regression line. Large scattering of data may arise from various sources, all of which may be superimposed in a given plot:

1. A common biological or mechanistic basis for the two test systems under comparison may be lacking so that the data sets are truly non-associated.

2. The "noise" of the data originating both from experimental errors and from biological variation of the test system can increase to such an extent that the existing functional relationship between X and Y can no longer be extracted from the data.

3. Subgroups of data display different functional relationships which superimpose. This is characteristic of studies performed on a set of heterogeneous test chemicals comprised of various classes of materials that differ considerably in their structure and thus in their mechanistic mode of action. Since the type and degree of functional relationship may vary from group to group, the plotting of all data in one diagram results in a large scatter of data points.

4. The functional relationship between the endpoints of different test systems may be non-linear. It is therefore suggested that splitting the data set further into several linearly linked subsets permits identification of parts of the non-linear relationship.

Therefore, it is important to know whether there is a complete lack of functional relationship between the data or whether it is possible to select at least some fractions of data (corresponding to certain classes of toxic materials tested) for which a linear functional relationship can be established. This was assessed by a novel statistical method which we have called partial linear regression. The basic idea of this concept is to perform a fractionation of the whole data set into s non-overlapping subsets $\{X,Y\}_1, \{X,Y\}_2, \ldots, \{X,Y\}_s$ such that the total sum of deviation squares $\mathrm{SDS}_{tot} = \mathrm{SDS}_1 + \mathrm{SDS}_2 + \ldots + \mathrm{SDS}_s$ becomes minimum. Here $\mathrm{SDS}_i$ is the sum of deviation squares obtained in fitting a straight line to the data of fraction $\{X,Y\}_i$.

The problem addressed here belongs to the class of so-called NP-complete problems; that is, the search for the global minimum of $\mathrm{SDS}_{tot}$ requires considering all $s^N$ possible fractionations (the total number of possibilities to assign N data couples to one of s fractions). For N = 25 data couples and s = 3 fractions, a computing time of 1 sec for fitting a single linear regression function would result in a total computation time of 26,867 years! Therefore, instead of determining the global minimum, the analyst must be satisfied with "good" fractionations referring to deep local minima of $\mathrm{SDS}_{tot}$. This can be achieved by means of a random search algorithm: starting with a random initial fractionation, "mutations" are carried out by randomly selecting a single data point $(X_k, Y_k)$ and assigning it to a different subset. The linear regression is repeated in the two subsets affected by this transfer of the data point and the "mutation" is accepted if the total sum of deviation squares is smaller than before. Otherwise the mutation is rejected, that is, the data point is reassigned to the original subset. This procedure is repeated until no further decline of $\mathrm{SDS}_{tot}$ can be achieved after a large enough number of mutations. The quality of the fit was again assessed by the prediction rate (prs).
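The random search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' PASCAL program: the function names, the `patience` stopping rule (stop after that many consecutive rejected mutations) and the fixed random seed are hypothetical, and `prediction_rate` uses the simple tolerance of max_score Y/10 defined for single linear regression rather than confidence limits.

```python
import random

def fit_line(points):
    """Least-squares line Y = m*X + n through points; returns (m, n, SDS)."""
    k = len(points)
    if k == 0:
        return 0.0, 0.0, 0.0
    mean_x = sum(x for x, _ in points) / k
    mean_y = sum(y for _, y in points) / k
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    m = sxy / sxx if sxx > 0 else 0.0  # all X equal: fall back to a flat line
    n = mean_y - m * mean_x
    sds = sum((m * x + n - y) ** 2 for x, y in points)
    return m, n, sds

def prediction_rate(points, m, n, max_score_y):
    """Percentage of points with |predicted - observed| <= max_score_y / 10."""
    if not points:
        return 0.0
    hits = sum(1 for x, y in points if abs(m * x + n - y) <= max_score_y / 10)
    return 100.0 * hits / len(points)

def subset(points, labels, j):
    """Points currently assigned to fraction j."""
    return [p for p, g in zip(points, labels) if g == j]

def partial_linear_regression(points, s, patience=2000, seed=0):
    """Random search for a fractionation into s subsets with low total SDS."""
    rng = random.Random(seed)
    labels = [rng.randrange(s) for _ in points]  # random initial fractionation
    sds = [fit_line(subset(points, labels, j))[2] for j in range(s)]
    failures = 0
    while s > 1 and points and failures < patience:
        k = rng.randrange(len(points))           # pick one data point
        old = labels[k]
        new = rng.choice([j for j in range(s) if j != old])
        labels[k] = new                          # "mutation": move it
        trial_old = fit_line(subset(points, labels, old))[2]
        trial_new = fit_line(subset(points, labels, new))[2]
        if trial_old + trial_new < sds[old] + sds[new]:
            sds[old], sds[new] = trial_old, trial_new  # accept
            failures = 0
        else:
            labels[k] = old                      # reject: restore assignment
            failures += 1
    return labels, sum(sds)

def exhaustive_search_years(n_points, s, seconds_per_fit=1.0):
    """Time to try all s**N fractionations, in 365-day years."""
    return s ** n_points * seconds_per_fit / (365 * 24 * 3600)
```

With N = 25, s = 3 and one second per fit, `exhaustive_search_years(25, 3)` gives roughly 26,867 years, which is the figure quoted in the text. The search only finds a local minimum, so different seeds may return different fractionations.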

An example demonstrating the method of partial linear regression is given in Figs 4a, b, which show the same data set as Fig. 3 (HETCAM-III), analysed with partial linear regression in either two data subsets (Fig. 4a) or three data subsets (Fig. 4b). Again, the bold lines represent the linear regression lines derived from two or three subsets of the data and the dashed lines cover the area of correct prediction. Since regression lines calculated with partial linear regression show a better fit to the data, 95% confidence limits are used to define the area of correct prediction. A prediction rate of 76% was achieved with two subsets (s = 2), and a prediction rate of 81% was achieved with three data subsets (s = 3).

Fig. 4. Partial linear regression (s = 2, s = 3). In vitro data of the HETCAM-III assay (SCORE1, X-axis) plotted v. in vivo data in the Draize eye test (MMAS, Y-axis) to calculate prediction rates (pr2, pr3) based on two or three subsets of data. The bold lines represent linear regression lines derived from two data subsets (a) or three data subsets (b) for predicting the MMAS from the in vitro scores. The dashed lines indicate the confidence limits (P = 0.05) for prediction of single MMAS values. With two data subsets the prediction rate is 76%; with three data subsets it is 81%.

Analysis of predictive power of in vitro assays for classified test materials. To assess whether there are significant differences among the prediction rates obtained for the various classes of materials tested, we set up (r x s) contingency tables by compiling the frequencies hrs with which (s) in vivo scores are correctly predicted for test materials belonging to the chemical class (r). The chi-square test (e.g. Colquhoun, 1971 or Weber, 1986) was used to estimate the degree of dependency between both variables, chemical group and goodness of prediction.
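A minimal sketch of the chi-square statistic for such an (r x s) contingency table follows. The function name and the example counts in the usage note are hypothetical and are not data from the report; the code assumes every row and column total is non-zero and returns only the statistic and the degrees of freedom, leaving the significance lookup to a table or a statistics library.

```python
def chi_square(table):
    """Chi-square statistic and degrees of freedom for an r x s table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if chemical class and prediction quality
            # were independent of each other.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, dof
```

For example, a hypothetical table with rows for two chemical classes and columns for correctly and incorrectly predicted materials, `chi_square([[20, 0], [0, 20]])`, yields a large statistic with one degree of freedom, whereas `chi_square([[10, 10], [10, 10]])` yields zero.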

Software used. Data were collected and composed with Microsoft EXCEL 4.0 for a uniform biometrical data analysis of different data submissions. Kolmogorov-Smirnov test, Spearman's and Pearson's correlation coefficients, and χ²-values were computed with a program written in the programming language Borland C++ using standard equations taken from Colquhoun (1971), Lozán (1992) and Weber (1986). A program to analyze the predictive power of the CAM-based assays (prediction rates derived from linear regression and partial linear regression) according to the principles or equations listed above was written in the programming language PASCAL.
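For readers who want to reproduce the correlation coefficients named above without the original programs, the standard textbook formulas can be sketched as follows. This is an assumption-laden illustration, not the Borland C++ code: the function names are hypothetical, ties in the Spearman ranks are given average ranks, and both inputs are assumed to be non-constant and of equal length.

```python
import math

def pearson(xs, ys):
    """Pearson's product-moment correlation coefficient of two sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman's rank correlation: Pearson's coefficient of the ranks."""
    return pearson(ranks(xs), ranks(ys))
```

A monotonic but non-linear relation illustrates the difference between the two coefficients: for `xs = [1, 2, 3, 4]` and `ys = [1, 4, 9, 16]`, `spearman` returns 1 while `pearson` stays below 1.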

Results

Scientific basis of CAM-based assays

CAM-based assays are derived from the observation that the vascularized chorion allantoic membrane (CAM) of an embryonated hen's egg is quite similar to the vascularized mucous tissues of the eye. In all CAM-based assays, fertile hen's eggs are incubated for 9 days (HET-CAM) or 10 and 14 days (CAMVA); then eggs are opened and the test material is applied to the CAM. In all of the HET-CAM assays blood vessels, capillaries and albumen are observed for up to four different endpoints typical of irritation (capillary injection, haemorrhage, coagulation and lysis) but in the CAMVA assay only for vascular effects (ghost vessels, capillary injection and haemorrhage). Two entirely different scoring systems are used with the HET-CAM assay: either test materials are tested at fixed concentrations and the time point of first occurrence of specific irritant effects is registered, or the threshold concentration of the test material is determined at which the main irritant effect is observed on the CAM within a fixed time period. In the CAMVA assay that specific concentration of a test material is determined which exhibits a positive vascular reaction on the CAM (of any type or degree) in 50% of the eggs.

HETCAM-I, validation study

Protocol. The HET-CAM assay developed by Luepke (1985) was modified by using a precise determination of the time points of the first occurrence of reactions on the CAM (haemorrhage, lysis, coagulation) and calculating a weighted irritation score "IS" (Kalweit et al., 1990; Spielmann et al., 1991). In addition, the irritation threshold "IT" was determined, that is the lowest concentration of a test chemical exhibiting a reaction on the CAM (Spielmann et al., 1992). Fresh and fertile White Leghorn eggs were incubated while they were rotating at 38°C and 60% humidity for 9 days. After non-viable embryos were discarded, egg shells were opened at the air cell, and 0.3 ml of the test material was applied to the CAM.

To determine the IS, materials were tested both pure (100%) and at 10% concentration, dissolved in either 0.9% NaCl solution or olive oil, using three eggs per test. For an observation period of up to 300 sec maximum, blood vessels, capillaries and albumen were examined for haemorrhage (H), lysis (L) and coagulation (C) and the time points of first occurrence of each of the three reactions were recorded with a computer program which also calculated the irritation score according to the following formula:

I S = 5 ( 3 0 1 - H ) + 7 ( 3 0 1 - L ) + 9 ( 3 0 1 - C ) 300 300 300
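As an illustration, the irritation score can be computed as in the following minimal sketch. It assumes that H, L and C are given in seconds and that a reaction not seen within the 300 sec observation period is entered as 301 sec, so that it contributes nothing to the score; this convention is stated in the text only for HETCAM-III and is an assumption here. The function name is hypothetical.

```python
NO_REACTION_SEC = 301  # assumed code for "reaction not observed within 300 sec"


def irritation_score(h_sec, l_sec, c_sec):
    """Weighted irritation score IS from the first-occurrence times (sec)
    of haemorrhage (H), lysis (L) and coagulation (C)."""
    for t in (h_sec, l_sec, c_sec):
        if not 1 <= t <= NO_REACTION_SEC:
            raise ValueError("reaction times must lie between 1 and 301 sec")
    # Weights 5, 7 and 9 as in the formula; each term is scaled by 300 sec.
    return (5 * (301 - h_sec) + 7 * (301 - l_sec) + 9 * (301 - c_sec)) / 300


if __name__ == "__main__":
    # Haemorrhage after 30 sec, lysis after 120 sec, no coagulation.
    print(round(irritation_score(30, 120, NO_REACTION_SEC), 2))
```

With these weights the score runs from 0 (no reaction) to 21 (all three reactions within the first second).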

Concurrent positive controls (0.1 N NaOH and 0.1% SDS) and a negative control (0.9% NaCl) were included in each experiment. To determine the irritation threshold (IT) several concentrations of each material were tested on three eggs each and the highest concentration of a test material inducing a weak irritation reaction (haemorrhage, lysis or coagulation) was recorded.

Tests were performed in a validation trial with 13 laboratories participating in phase I (interlaboratory assessment) and seven laboratories in phase II (data base development). For test materials of phase I (existing chemicals), original Draize eye test data were not available. Therefore, the data set submitted to IRAG covered only data of 133 new chemicals of phase II; each of them was tested in at least two experiments in two of the seven laboratories.

Purpose and proposed use of the test. Although the HET-CAM assay was proposed to allow prediction of all degrees of eye irritation (Luepke, 1985), the validation study aimed only at identifying severe eye irritants, which are labelled R 41 according to the European Industrial Chemicals Act. Evaluation criteria for the HET-CAM assay were developed in agreement with the aim of the study to achieve a maximum of correct predictions of severely irritating chemicals (R 41) with an acceptably low percentage of overlabelling. All materials that test negative must be tested in vivo in the Draize eye irritation test.

Range of responses covered. According to European criteria for the classification and labelling of industrial chemicals, 50% of the 133 test materials were not classified (n.c.), 8% were classified as eye irritants (R 36) and 42% were classified as severe eye irritants (R 41). Since these classifications do not permit comparing the in vivo database with other submitted data sets, a non-weighted in vivo score (MMMIS; see Biostatistical Methods) was calculated for each chemical. When all tissues of the eye are severely damaged, the MMMIS is 16. Figure 5(a) shows the frequency distribution of the MMMIS for 76 of the 133 test materials. For 57 chemicals the MMMIS could not be calculated, since the in vivo score "discharge" was not reported with the Draize test data. Figure 5(a) demonstrates that one-third of the test materials hardly affected the tissues of the rabbit's eye (MMMIS ≤ 2), whereas about two-thirds were equally distributed over the range of ≤ 4 to ≤ 14 with a maximum at ≤ 8; that is, the test materials selected for this validation study adequately covered the entire range of in vivo responses.
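The summation and the class frequencies described above can be sketched as follows. This is a minimal illustration, not the procedure of the Methods section: it assumes the usual Draize grading maxima (opacity 4, iris 2, erythema 3, chemosis 4, discharge 3, which add up to the stated maximum of 16), it treats a chemical with a missing tissue score as not calculable, and the function and key names are hypothetical.

```python
# Assumed maximum grade per tissue; the five maxima sum to 16.
TISSUE_MAX = {"opacity": 4, "iris": 2, "erythema": 3, "chemosis": 4, "discharge": 3}


def sum_mmmis(scores):
    """Non-weighted sum of the five tissue scores, or None if one is missing."""
    total = 0.0
    for tissue, top in TISSUE_MAX.items():
        value = scores.get(tissue)
        if value is None:  # e.g. "discharge" not reported
            return None
        if not 0 <= value <= top:
            raise ValueError(f"{tissue} score out of range")
        total += value
    return total


def frequency_distribution(sums, edges=(2, 4, 6, 8, 10, 12, 14, 16)):
    """Percentage of chemicals per class (<=2, <=4, ..., <=16).

    Chemicals without a calculable sum are excluded from the 100% base."""
    usable = [s for s in sums if s is not None]
    if not usable:
        return {edge: 0.0 for edge in edges}
    counts = {edge: 0 for edge in edges}
    for s in usable:
        for edge in edges:
            if s <= edge:
                counts[edge] += 1
                break
    return {edge: 100 * n / len(usable) for edge, n in counts.items()}
```

Excluding incomplete records from the base mirrors the way Fig. 5 expresses each panel as a percentage of the chemicals with sufficient information.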


[Figure 5: six bar charts. Horizontal axis: ΣMMMIS classes (≤2, ≤4, ≤6, ≤8, ≤10, ≤12, ≤14, ≤16); vertical axis: percentage of chemicals.]

5a: HETCAM-I, validation (7 labs), 76 chemicals = 100%
5b: HETCAM-I, LAB-A, 52 chemicals = 100%
5c: HETCAM-II, 42 chemicals = 100%
5d: HETCAM-III, 97 chemicals = 100%
5e: CAMVA, LAB-A, 93 chemicals = 100%
5f: CAMVA, LAB-B, 20 chemicals = 100%

Fig. 5(a-f). Frequency distribution of summarized in vivo scores measured in Draize rabbit's eye tests of all of the CAM-based assays submitted to WG-2. HET-CAM and CAMVA assays and ΣMMMIS are described in the Methods section. Frequency distribution is expressed as percent of the total number of chemicals with sufficient information for calculating ΣMMMIS within each of the data sets.

Range of materials amenable to use in the assay. According to statements of the submitting laboratories, a wide spectrum of materials of different chemical natures can be assessed in the HETCAM-I assay. Since coloured test materials have to be rinsed off before reactions on the CAM can be determined, an exact measurement of the time-dependent score IS is not feasible. To demonstrate the variety of test materials which were used in the HETCAM-I validation study, a short characterization of solubility, range of pH and chemical nature is given:

number: 133
solubility: 51% in water, 49% in olive oil
pH: 14% ≤3; 15% ≤6; 26% ≤8; 15% ≤11; 1.5% >11; 29% no specification
inorganic: S, N, Si, CN
aliphatic organic: Cl, amines, nitro compounds, alcohols (mono-, di-), glycerine derivatives, amino sugars, aldehydes, ketones, carbon acids (mono-, di-), amino acids, esters, ethers, phosphate esters, urethane, nitriles, silanes
aromatic: hydrocarbons, chlorinated compounds, amines, nitro compounds, aldehydes, ketones
heterocyclic: purines, triazole, thiadiazole.

Use in risk assessment. In single cases, the HETCAM-I is accepted for classification of industrial chemicals as severe eye irritants and labelling with the defined term "may cause severe damage of the eye" (R 41) according to the regulation of the Hazardous Chemicals Act in Germany.

HETCAM-I, Laboratory A

Protocol. The HET-CAM assay protocol used in an industrial laboratory was exactly the same as described in the previous section for the HETCAM-I validation study.

Purpose and proposed use of test. The HETCAM-I assay as part of an in vitro test battery is used by the submitting laboratory for comparative assessment of the general irritation potential (ocular and skin) of raw materials and final formulations (cosmetics and personal-care products).

Range of responses covered. The frequency distribution of the non-weighted sum score MMMIS is shown in Fig. 5(b) for the submitted in vivo data set. Figure 5(b) demonstrates that the majority of test materials (56%) only slightly affected the tissues of the rabbit's eye (MMMIS ≤ 2), and 90% of the test materials induced in vivo responses of ≤ 8 and the remaining 10% up to 12 as the maximum score.

Range of materials amenable to use in the assay. As with many other CAM-based assays, the test is compatible with a wide spectrum of materials of different chemical natures and of different physicochemical properties. A data set of 52 test materials was submitted, all formulations, which were tested in the CTFA Evaluation of Alternatives Program of alternatives to the Draize eye test (Gettings et al., 1991, 1992 and 1994a, b). It covered 10 hydroalcoholic formulations from phase I of the program, 18 oil/water-based formulations from phase II and 25 surfactant-based formulations from phase III.

Use in risk assessment. Because of the type of products developed by the company (mainly cosmetics and personal care products) the HETCAM-I assay is not used for general risk assessment but only for safety assessment of raw materials and products.

HETCAM-II

Protocol. The HET-CAM assay protocol developed by Luepke (1985) and Luepke and Kemper (1986) was slightly modified by the industrial laboratory which submitted the data set. Briefly, fertile White Leghorn eggs were incubated for 9 days (37.8°C, 55% humidity) and 0.3 ml of the test material was applied to the CAM. Vessels and capillaries of the CAM were observed for the first occurrence of hyperaemia (capillary injection), haemorrhage and coagulation (thrombosis in big and medium vessels or opacity of albumen). Observations are terminated 5 min after application, and two time-dependent scores ("SCOREI", "HEMI") are calculated according to the scheme developed by Luepke (1985):

Time after application (min)    ≤0.5    ≤2    ≤5

Hyperaemia                       5       3     1
Haemorrhage                      7       5     3
Coagulation                      9       7     5

"SCOREI" is the sum of the scores resulting from hyperaemia, haemorrhage and coagulation (maximum = 21) and "HEMI" is the score resulting only from haemorrhage (maximum = 7).

Tests were performed on four eggs each either in a single assay (when formulations were tested) or in two repeated assays (when chemicals were tested). Two concurrent negative controls (distilled water and a negative product of the category of the test material) and a positive control (according to the category of the test material) were included in each experiment.
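The time-dependent scoring scheme of the HETCAM-II protocol can be sketched for a single egg as follows. This is an illustration only: the function names are hypothetical, onset times are assumed to be recorded in minutes, a reaction that does not occur within 5 min is assumed to score zero, and the way in which the four eggs of a test are combined is not stated in the text and is therefore left out.

```python
# Points per reaction for onset within <=0.5, <=2 and <=5 min (Luepke scheme).
TIME_LIMITS_MIN = (0.5, 2.0, 5.0)
POINTS = {
    "hyperaemia": (5, 3, 1),
    "haemorrhage": (7, 5, 3),
    "coagulation": (9, 7, 5),
}


def reaction_points(reaction, onset_min):
    """Points for one reaction; None or an onset after 5 min scores 0."""
    if onset_min is None:
        return 0
    for limit, points in zip(TIME_LIMITS_MIN, POINTS[reaction]):
        if onset_min <= limit:
            return points
    return 0


def hetcam2_scores(onsets_min):
    """Return the pair (sum score, haemorrhage score) for one egg.

    onsets_min maps a reaction name to its onset time in minutes."""
    per_reaction = {r: reaction_points(r, onsets_min.get(r)) for r in POINTS}
    return sum(per_reaction.values()), per_reaction["haemorrhage"]


if __name__ == "__main__":
    # Hyperaemia after 1 min, haemorrhage after 3 min, no coagulation.
    print(hetcam2_scores({"hyperaemia": 1.0, "haemorrhage": 3.0}))
```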

Purpose and proposed use of test. The HET-CAM test is used for in-house testing combined with other organotypic and cytotoxicity-based in vitro assays for screening of new cosmetic ingredients and formulations with respect to irritation potential.

Range of responses covered. Figure 5(c) shows that in vivo responses of the data set ranged from MMMIS ≤ 2 to ≤ 12. In this data set 36% of test materials only slightly affected the tissues of the rabbit's eye (MMMIS ≤ 2), whereas 57% of the chemicals showed moderate to severe effects not exceeding MMMIS = 12. Higher classes up to the maximum of MMMIS ≤ 16 were not represented. Since the data show clusters in the no-effect range and middle range, the test materials do not adequately cover the entire range of in vivo eye irritation responses.

Range of materials amenable to use in the assay. According to the statement of the submitting laboratory, the HETCAM-II protocol is compatible with liquid and solid test materials and is particularly adapted to surfactant-based products and aqueous substances. The submitted data set covered 20 surfactants and 22 cosmetic formulations, which are characterized as follows:

number: 42
solubility: 42 water-soluble
pH: 0 ≤6; 16 ≤8; 4 ≤9; 0 >9; 22 not determined
classes: 20 chemicals (surfactants); 22 formulations (12 lotions, three gels, seven shampoos)
form: 39 liquids, three gels.

Use in risk assessment. The HETCAM-II assay is not used for risk assessment; rather, it is used for safety assessment of new ingredients and formulations. The evaluation takes into account the historical background of the chemical or product family and evaluation in other in vitro test systems.

HETCAM-III

Protocol. The HETCAM-III assay is a modification of Luepke's method (1985) used in an industrial laboratory (Bartnik et al., 1988; Sterzel et al., 1990). Although the experimental design is similar to HETCAM-I and HETCAM-II, HETCAM-III uses an entirely different scoring system. Furthermore, materials may be tested at various dilutions and/or with shorter periods of exposure.

Briefly, fertile hen's eggs (e.g. White Leghorn) were incubated for 9 days (37.5 ± 1°C, 40-60% humidity) and 0.3 ml of the test material was applied to the CAM on day 10. Each material was tested on six eggs. In general, solid and liquid materials were tested undiluted, whereas turbid materials were diluted with water (or solvents) down to a concentration which allowed observation of the CAM. When enough of the CAM could be seen through the test material, the score "Q" was determined by a reaction-time method. When reactions on the CAM could not be detected through the test material (e.g. with coloured test materials), the endpoint score "S" was determined after rinsing at 0.5, 1, 3 or 5 min post application.

For calculation of the score "Q" the time points of first occurrence of haemorrhage, coagulation and lysis in capillaries, vessels and albumen were determined within ≤5 min (300 sec) after application. For each reaction that occurred, weighting factors g1 and g2 and the score for the test material were calculated according to the equations given below. The final score "Q" is the ratio of the observed score and the score of a positive benchmark reference chemical (sodium magnesium lauryl-myristyl-6-ethoxysulfate; Texapon ASV, an anionic surfactant) which is concurrently determined in each assay (score ≈ 750 ± 100).

$$g_1 = \frac{301 - \frac{1}{M}\sum_{j=1}^{M} x_{k,j}}{10}$$

$$g_2 = \frac{1}{M}\sum_{j=1}^{M} d_{k,j}$$

$$y_{k,j} = \frac{301 - x_{k,j}}{x_{k,j}}$$

$$R_{k,j} = g_1 \cdot y_{k,j} \cdot g_2$$

$$\mathrm{score} = \frac{1}{M}\sum_{j=1}^{M}\sum_{k=1}^{3} R_{k,j}$$

1 ≤ M ≤ 6 (number of eggs)
1 ≤ j ≤ 6 (index of egg number)
1 ≤ k ≤ 3 (index of reaction)
1 ≤ x_{k,j} ≤ 301 (reaction time)
d_{k,j} = 1 if x_{k,j} ≠ 301 sec
d_{k,j} = 0 if x_{k,j} = 301 sec

$$Q = \frac{\mathrm{score}_{\mathrm{test}}}{\mathrm{score}_{\mathrm{benchmark}}}$$

For endpoint assessment (score "S"), the main reactions (haemorrhage, coagulation, lysis) observed on the CAM of each of six eggs are scored semiquantitatively (0 = none, 1 = weak, 2 = moderate, 3 = strong). "S" is the sum of scores determined on six eggs tested.

Purpose and proposed use of test. According to the statement of the submitting laboratory, the HETCAM-III assay is used to screen raw materials and to evaluate the irritating properties of formulations, for example cosmetics and household products. In some cases the HETCAM-III assay is used in combination with in vitro cytotoxicity assays (e.g. RBC haemolysis, 3T3 cell neutral red uptake) or with human dermatological studies.

Range of responses covered. According to the statement of the submitting laboratory, the HETCAM-III protocol is capable of covering the entire range of in vivo eye irritation from mild irritation of cosmetic formulations up to severe irritation of raw materials. As shown in Fig. 5(d), the in vivo responses of the data set submitted to IRAG did not adequately cover the entire range of irritation of the rabbit's eye. On the basis of the non-weighted sum of the in vivo scores MMMIS, 62% of the materials tested were practically non-irritant (MMMIS ≤ 2) and the remaining 38% were equally represented, with only 6-9% in the five classes up to MMMIS ≤ 14. Since the majority of test materials were of low eye-irritating properties, the data set does not adequately cover the entire range of in vivo eye irritation responses.

Range of materials amenable to use in the assay. According to the submitting laboratory, the HETCAM-III assay protocol can be used to test a great variety of test materials of different chemical classes and physicochemical properties. As with other HET-CAM tests, the assay cannot be used to test chemicals that are coloured or turbid and that stick to the CAM. The data set submitted to IRAG covered only tests with raw materials of the following spectrum:

number: 97
physical appearance: 78 liquids, four creams and 17 solids
solubility in water: 37 soluble, five non-soluble, 57 not specified
pH: not specified
chemical classes:
inorganic: alkali sulfite
aliphatic: fatty alcohols, ketones, acids, esters, ethers, fatty acid amides and amines, triglycerides, sulfosuccinates, betaines and sulfonates
alicyclic: quinones, benzoic acid esters, phthalic acid esters, phosphonic acid esters, dyes, polycyclic esters
heterocyclic: fragrances, triazines, vitamins
C-H derivatives: glucosides
polymers: protein hydrolysates, polysaccharides, polyacrylates

Use in risk assessment. The HETCAM-III assay is routinely used in-house for safety assessment of raw materials and formulations. The company also uses the HETCAM-III assay to classify industrial chemicals as severe eye irritants according to EU regulations (R 41).

CAMVA, Laboratory A

Protocol. The chorion allantoic membrane vascular assay (CAMVA) was developed by Leighton et al. (1985) and refined by Kong et al. (1987). The protocol was subsequently modified in an industrial laboratory to reduce the sensitivity and to use day 10 rather than day 14 eggs (Bagley et al., 1988 and 1991a, b). In contrast to the HET-CAM assays described in the previous sections, only vascular effects on the CAM are recorded in the CAMVA assay.

Briefly, fertile hen's eggs (DeKalb XL strain) are incubated at 99 ± 1°F at 50-60% humidity for 10 or 14 days. On day 4 of incubation a 2.5 ml portion of egg albumen is removed, and a window is cut in the shell and taped. On day 10 (day 14) of incubation, tapes are removed and 40 µl of the test material is applied in a series of 10 concentrations to the CAM of 10 eggs. 30 min after dosing, the CAMs are examined for hypoaemia (ghost vessels), hyperaemia (capillary injection) and haemorrhage. If a vascular effect of any degree is observed, the egg is considered positive. The concentration of test material inducing a positive response in 50% of the eggs (RC50) is calculated by probit analysis.
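The probit step can be illustrated with the following minimal sketch. The text does not say which probit implementation the laboratory used; the version below is an assumption that fits P(positive) = Φ(a + b·log10 c) by iteratively reweighted least squares (Fisher scoring) and returns the concentration at which the fitted probability is 50%. The function name, starting values and clipping constants are hypothetical choices for illustration.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def probit_rc50(concentrations, positives, eggs, iterations=100, tol=1e-10):
    """Estimate RC50 from quantal data (positive eggs out of eggs dosed).

    Fits a + b*log10(concentration) on the probit scale by Fisher scoring."""
    x = [math.log10(c) for c in concentrations]
    a, b = 0.0, 1.0  # assumed starting values
    for _ in range(iterations):
        sw = sx = sxx = sz = sxz = 0.0
        for xi, ri, ni in zip(x, positives, eggs):
            eta = a + b * xi
            p = min(max(_STD_NORMAL.cdf(eta), 1e-9), 1 - 1e-9)
            dens = max(_STD_NORMAL.pdf(eta), 1e-12)
            w = ni * dens * dens / (p * (1 - p))  # Fisher weight
            z = eta + (ri / ni - p) / dens        # working response
            sw += w
            sx += w * xi
            sxx += w * xi * xi
            sz += w * z
            sxz += w * xi * z
        det = sw * sxx - sx * sx
        new_b = (sw * sxz - sx * sz) / det
        new_a = (sz - new_b * sx) / sw
        converged = abs(new_a - a) < tol and abs(new_b - b) < tol
        a, b = new_a, new_b
        if converged:
            break
    if b <= 0:
        raise ValueError("no increasing dose-response relationship")
    return 10 ** (-a / b)  # fitted probability is 0.5 where a + b*x = 0
```

The result is expressed in the same unit as the concentrations supplied, and at least two different concentrations with intermediate response rates are needed for the fit to be defined.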

Purpose and proposed use of test. The CAMVA is used by the company for screening new products (mainly cosmetics and personal care products) and ingredients for eye irritation potential, and also to aid in development of milder products.

Range of responses covered. According to the statement of the submitting laboratory, the CAMVA is capable of predicting the entire range of in vivo eye irritation responses from mild to moderate and severe. In Fig. 5(e) the frequency distribution of the non-weighted sum of the in vivo scores MMMIS is shown for the submitted data set of 93 materials tested on day 14 eggs. Figure 5(e) demonstrates that one-third of the test materials induce only minimal effects on the rabbit's eye (MMMIS ≤ 2), whereas two-thirds of the test materials were equally distributed over the range of MMMIS ≤ 4 to ≤ 14 with a maximum at MMMIS ≤ 10; that is, the test materials adequately covered the entire range of in vivo responses.

Range of materials amenable to use in the assay. According to the statement of the submitting laboratory, the CAMVA can be used to test hydrophilic and hydrophobic samples in both solid and liquid form. The test has been successfully used to predict eye irritation potential of surfactant-based formulations; it is less accurate for predicting the eye irritation potential of oil-in-water emulsions. Furthermore, the CAMVA can be used to predict eye irritation potential of alcohol-containing formulations if the scale for evaluation is shifted in comparison to the scale used for surfactants. Materials containing PEG fatty acids typically result in false positives. Test materials of the data set submitted to IRAG cover the following spectrum:

number: 124
solubility: 121 water-soluble
classes: soap, shampoos (hair, body), detergents (laundry, dish), bleaches, fabric softener, cleaner (all-purpose, toilet), Tween (20-80), laureates, glyceride derivatives, cosmetics (with or without alcohol).

Use in risk assessment. To assess the ocular irritation potential of a product within the company submitting the CAMVA data, first the historical data of similar products is evaluated and the eye irritation potential of the ingredients of the product is assessed; thereafter the CAMVA serves as a screen for eye irritation potential.

CAMVA, Laboratory B

Protocol. The same CAMVA protocol (Bagley et al., 1988) was used as described in the previous section (CAMVA laboratory A), except that RC50 was determined graphically instead of by a computer-aided probit analysis. All assays were performed at day 14 of embryonation.
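How the graphical determination was carried out is not described. One plausible numerical equivalent, shown here purely as an assumption, is to read the 50% point off a plot of the fraction of positive eggs against log concentration by linear interpolation between the two neighbouring concentrations that bracket 50%. The function name is hypothetical.

```python
import math


def rc50_by_interpolation(concentrations, fraction_positive):
    """Interpolate the 50% response on a log10 concentration axis.

    Returns None when no pair of neighbouring concentrations brackets 50%."""
    pairs = sorted(zip(concentrations, fraction_positive))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(pairs, pairs[1:]):
        if f_lo <= 0.5 <= f_hi and f_hi > f_lo:
            x_lo, x_hi = math.log10(c_lo), math.log10(c_hi)
            x50 = x_lo + (0.5 - f_lo) * (x_hi - x_lo) / (f_hi - f_lo)
            return 10 ** x50
    return None
```

Unlike a probit fit, this uses only the two points nearest to the 50% response, so it is more sensitive to the scatter of individual concentrations.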

Purpose and proposed use of test. The proposed use of the CAMVA assay by the pharmaceutical company submitting the data set is still in the process of evaluation. In addition to Draize eye tests and CAMVA data, the data set submitted to IRAG contained data of the bovine corneal opacity and permeability assay (BCOP). The two organotypic assays were chosen by the company, since they reflect effects on two different in vivo endpoints, the cornea (BCOP) and the conjunctivae (CAMVA).

Range of responses covered. A data set of 20 intermediates of pharmaceutical chemicals was chosen for submission to IRAG. Figure 5(f) shows the frequency distribution of the non-weighted sum of the in vivo scores MMMIS. The figure demonstrates that the selection of data is different from the other data sets submitted, since these data are clustered on both ends of the scale of eye irritation responses: 50% of the data represent mild eye irritation (MMMIS ≤ 2 and ≤ 4) and 30% represent severe eye irritation in vivo (MMMIS ≤ 14 and ≤ 16). None of the other data sets submitted contained test materials with a sum score of 14 < MMMIS ≤ 16. The remaining 20% of the chemicals are equally distributed in the middle range with low frequencies of 0-10%. Thus, although the data selected represent the greatest range of in vivo responses of all data sets submitted to IRAG, because of the clustering at both ends of the Draize eye irritation scale the data selection influences statistical measures for assessing predictivity of the assay.

Range of materials amenable to use in the assay. The range of materials amenable in the CAMVA was not addressed by the submitting laboratory. The test materials of the data set provided to IRAG cover the following spectrum:

number: 20
solubility in H2O: 11 insoluble; two negligible; two slight; one moderate; two soluble; one 5% soluble; one readily soluble
pH: 1 = 3.5; 2 = 6; 2 = 7.8; 1 = 12.5; 14 not determined
classes: nucleosides, alcohols, esters, ketones, amines, steroids (aliphatic, cyclic, heterocyclic), tranquillizers.

Use in risk assessment. According to the statement of the submitting laboratory, the current data base is insufficient to establish the utility of the two in vitro assays (the CAMVA and the BCOP) as replacements for the Draize eye test. It is therefore the goal of the company to evaluate whether the two in vitro tests can be used independently or in combination to determine if a compound is either non- to mildly irritating or moderately to extremely irritating. If the assays prove reliable at predicting this classification, they could be used as a screen in which compounds found to be moderately to extremely irritating would be labelled as such on the Material Data Safety Sheets (MDSS). To ensure public safety, compounds that score non- to mildly irritating in the in vitro test(s) would be tested in the in vivo Draize eye test before classification on the MDSSs to ensure that no false negative results are reported. This approach both ensures public safety and meets the goals of federal agencies.

Biostatistical analysis

Of the nine data sets of CAM-based assays submitted, two could not be included in the analysis because of incomplete raw data. The remaining seven data sets (five submissions for HET-CAM and two submissions for CAMVA) were analysed by Working Group 2 (Fig. 6).

In vivo tissue scores (MMMIS): percentage of chemicals [%] per data set

Tissue           Score   HETCAM-I     HETCAM-I   HETCAM-II   HETCAM-III   CAMVA   CAMVA
                         validation   LAB-B                               LAB-A   LAB-B
Cornea opacity   =0      45           64         21          68           46      55
                 ≤1      23           9          17          13           17      10
                 ≤2      18           0          48          18           23      0
                 ≤3      8            9          14          1            11      5
                 ≤4      6            18         0           0            3       30
Iris             =0      51           55         26          76           60      50
                 ≤1      49           18         74          22           39      25
                 ≤2      0            27         0           2            1       25
Erythema         =0      16           36         5           20           14      0
                 ≤1      21           18         33          41           26      10
                 ≤2      23           9          60          15           20      45
                 ≤3      40           36         2           24           40      45
Chemosis         =0      33           36         12          53           34      10
                 ≤1      21           18         26          22           16      35
                 ≤2      24           9          36          18           19      20
                 ≤3      15           0          19          6            27      10
                 ≤4      8            36         7           0            3       25
Discharge        =0      29           data not   36          51           31      15
                 ≤1      19           submitted  24          22           23      40
                 ≤2      23                      33          9            20      15
                 ≤3      29                      7           19           26      30

Fig. 6. Frequency distribution of individual in vivo tissue scores (MMMIS) from Draize rabbit eye test in each data set. The number of chemicals with sufficient data to be analysed is given for each CAM-based data set. Frequency distribution is given as percent of chemicals of a data set for each tissue of the rabbit's eye. Length of horizontal bars indicates percentage of chemicals inducing a specific classified effect.

According to the IRAG Guidelines, data analysis was aimed primarily at obtaining information on the power of each assay to correctly predict in vivo Draize scores for each single tissue of the rabbit eye. In addition, analysis was aimed at comparing the predictive power (performance) of the different modifications of CAM-based assays. Whether the predictive power of the assays was related to the chemistry of test materials was also investigated. Finally, the relation between in vitro scores and the total irritant effect on the rabbit's eye was assessed for each of the data sets according to Pearson's linear correlation analysis. Because of missing information on corneal area affected and discharge, however, the MMAS could be correctly calculated only with two data sets, HETCAM-II and CAMVA, Laboratory B. Therefore, a non-weighted summary score, ΣMMMIS (see Methods section), was used to assess the overall performance of each assay. Figures 1 and 2 demonstrate that the two summary scores, MMAS and ΣMMMIS, are correlated in an almost linear fashion in the lower range and in a non-linear saturation-type fashion in the upper range.
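The two correlation measures reported in the tables of this section, Pearson's linear coefficient and Spearman's rank coefficient, can be computed as in the following minimal sketch. It uses the textbook definitions, with ties given their average rank; it is not the software used by the working group, and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson's linear correlation coefficient of two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman's rank correlation: Pearson's r of the rank-transformed data."""
    return pearson_r(average_ranks(x), average_ranks(y))
```

An in vitro score that falls as irritancy rises, such as a reaction time or lg RC50, gives a negative coefficient, which is why several entries in the tables carry a minus sign.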

For assessing the ability of CAM-based assays to predict Draize eye test scores correctly, a new statistical tool, the prediction rate (Fig. 3), was created. Prediction rates were calculated by using either single linear regressions (pr1, Fig. 3) or partial linear regression derived from three data subsets (pr3, Fig. 4b).

In general, all statistical measures were calculated for all combinations between each single in vitro score and each of the following five in vivo scores determined in Draize eye tests in the rabbit: cornea/opacity, iris, erythema, chemosis, discharge.


Table 1. Pearson's correlation coefficients for ΣMMMIS and MMAS

Data set                       In vitro score   ΣMMMIS    n     MMAS     n     P
HETCAM-I VALIDATION            IS                0.607    76                   0.001
                               C                -0.618    76                   0.001
HETCAM-I LAB-A (CTFA I-III)    IS                0.331    52                   0.05
HETCAM-I LAB-B                 IS                0.825    9                    0.01
HETCAM-II                      HEM1              0.913    42     0.907   42    0.001
                               SCORE1            0.880    42     0.861   42    0.001
HETCAM-III                     Q                 0.848    86                   0.001
                               R                 0.783    86                   0.001
                               S                 0.813    11                   0.01
CAMVA LAB-A                    lg RC50 10d      -0.766    42                   0.001
                               lg RC50 14d      -0.711    93                   0.001
CAMVA LAB-B                    lg RC50          -0.806    20    -0.882   20    0.001
                               RC50             -0.812    20    -0.744   20    0.001

In addition, in submissions from the USA, the cornea/area was included. For reasons discussed later, the in vivo score for days to clear could not be considered for biometrical analysis. In the present study the MMMIS (mean of modified maximum individual score) was used as a score of single tissues of the eye, that is, immediate reactions occurring within the first hour after application were omitted.

Table 1 summarizes Pearson's linear correlation coefficients that were determined for the correlation between in vitro scores of the seven data sets of CAM-based assays and the total in vivo scores of eye irritancy, usually ΣMMMIS; also MMAS for two assays.

The results of biometrical analysis of each data set are presented in Tables 2-8.

Overall analysis of in vitro/in vivo correlation

Table 1 shows that Pearson's correlation coefficients for all in vitro/in vivo correlations for all CAM-based assays were significant (P ≤ 0.01) or even highly significant (P ≤ 0.001), with the exception of HETCAM-I, Lab A (P ≤ 0.05). The latter data set is a complex one composed of test materials from CTFA studies I-III. It did not show an acceptable in vitro/in vivo correlation. However, correlation coefficients of the other data sets ranged from r_p = 0.607 in the data set of a blind interlaboratory validation trial, which covered a broad spectrum of chemicals and eye irritancy, up to r_p = 0.913 in the data set from a single laboratory (HETCAM-II), which was restricted to cosmetics and covered the lower range of eye irritancy. The scatter plots of the data summarized in Table 1 are shown individually for each of the data sets in Figs 7-15. In addition, the figures give information on linear regression analysis with the regression line and the 95% confidence limits for both the regression line and the prediction of single values.

Table 2. Summarized statistical results of in vivo/in vitro correlation and prediction of individual in vivo tissue scores in the Draize eye test HETCAM-I, validation study (7 laboratories)

Score   Statistic   Cornea opacity    Iris              Erythema          Chemosis          Discharge
H       N           133               129               132               131               78
        px; py      0.01; 0.01        0.01; 0.01        0.01; 0.01        0.01; 0.01        0.01; 0.01
        r_p (P)     -0.333 (0.001)    -0.34 (0.001)     -0.334 (0.001)    -0.364 (0.001)    -0.459 (0.001)
        r_s (P)     -0.362 (0.001)    -0.284 (0.001)    -0.348 (0.001)    -0.378 (0.001)    -0.52 (0.001)
        pr1; pr3    30%; 71%          27%; 87%          17%; 60%          28%; 70%          21%; 41%
C       N           133               129               132               131               78
        px; py      0.01; 0.01        0.01; 0.01        0.01; 0.01        0.01; 0.01        0.01; 0.01
        r_p (P)     -0.59 (0.001)     -0.463 (0.001)    -0.454 (0.001)    -0.364 (0.001)    -0.611 (0.001)
        r_s (P)     -0.355 (0.001)    -0.212 (0.001)    -0.313 (0.001)    -0.336 (0.001)    -0.336 (0.001)
        pr1; pr3    46%; 72%          16%; 92%          15%; 58%          18%; 72%          14%; 49%
L       N           133               129               132               131               78
        px; py      0.01; 0.01        0.01; 0.01        0.01; 0.01        0.01; 0.01        0.01; 0.01
        r_p (P)     -0.382 (0.001)    -0.282 (0.001)    -0.333 (0.001)    -0.311 (0.001)    -0.464 (0.001)
        r_s (P)     -0.315 (0.001)    -0.185 (0.001)    -0.304 (0.001)    -0.285 (0.001)    -0.503 (0.001)
        pr1; pr3    18%; 81%          10%; 85%          14%; 57%          17%; 66%          15%; 44%
IT      N           125               121               124               123               72
        px; py      0.01; 0.01        0.01; 0.01        0.01; 0.01        0.01; 0.01        0.01; 0.01
        r_p (P)     -0.389 (0.001)    -0.375 (0.001)    -0.395 (0.001)    -0.414 (0.001)    -0.542 (0.001)
        r_s (P)     -0.495 (0.001)    -0.351 (0.001)    -0.466 (0.001)    -0.486 (0.001)    -0.625 (0.001)
        pr1; pr3    32%; 63%          33%; 92%          16%; 53%          30%; 61%          29%; 54%
IS      N           133               129               132               131               78
        px; py      0.01; 0.01        0.01; 0.01        0.01; 0.01        0.01; 0.01        0.01; 0.01
        r_p (P)     0.538 (0.001)     0.423 (0.001)     0.452 (0.001)     0.476 (0.001)     0.629 (0.001)
        r_s (P)     0.527 (0.001)     0.463 (0.001)     0.431 (0.001)     0.475 (0.001)     0.624 (0.001)
        pr1; pr3    33%; 79%          27%; 92%          17%; 46%          19%; 75%          20%; 54%

N number of chemicals; px probability for normal distribution of in vitro variable x; py probability for normal distribution of in vivo variable y; r_p Pearson's correlation coefficient; r_s Spearman's rank correlation coefficient; pr1 correctly predicted in vivo scores for single linear regression; pr3 correctly predicted in vivo scores for 3 subsets of data.

Table 3. Summarized statistical results of in vivo/in vitro correlation and prediction of individual in vivo tissue scores in the Draize eye test HETCAM-I, Laboratory-A

Score   Statistic   Cornea           Area             Iris             Erythema         Chemosis         Discharge
IS      N           52               52               52               52               52               52
        px; py      0.01; 0.01       0.01; 0.01       0.01; 0.01       0.01; 0.01       0.01; 0.01       0.01; 0.01
        r_p (P)     0.310 (0.05)     0.308 (0.05)     0.360 (0.01)     0.327 (0.05)     0.284 (0.05)     0.282 (0.05)
        r_s (P)     0.374 (0.001)    0.373 (0.001)    0.409 (0.001)    0.380 (0.001)    0.350 (0.001)    0.379 (0.001)
        pr1; pr3    51%; 90%         21%; 88%         40%; 98%         7%; 78%          30%; 98%         28%; 86%

N number of chemicals; px probability for normal distribution of in vitro variable x; py probability for normal distribution of in vivo variable y; r_p Pearson's correlation coefficient; r_s Spearman's rank correlation coefficient; pr1 correctly predicted in vivo scores for single linear regression; pr3 correctly predicted in vivo scores for 3 subsets of data.

HETCAM-I (validation study)

The analysis covers tests on 133 industrial chemicals submitted by ZEBET and derived from the German interlaboratory validation trial. The data were produced within a double-blinded phase II of the validation trial (data base development), in which each of the chemicals was tested independently in two experiments by two of the seven participating laboratories. Draize eye tests were performed on three, four or six rabbits for each chemical according to OECD Test Guideline 405 in laboratories of the German chemical industry. Each data unit (per chemical) was, therefore, the mean of four independent HET-CAM assays performed in two laboratories and a Draize eye test performed in one laboratory.

Figure 7 shows the overall linear regression analysis for 76 of the 133 chemicals in the validation

Table 4. Summarized statistical results of in vivo/in vitro correlation and prediction of individual in vivo tissue scores in the Draize eye test HETCAM-I, Laboratory-B

Score   Statistic   Cornea opacity    Iris              Erythema          Chemosis
H10     N           9                 9                 9                 9
        px; py      0.05; 0.1         0.05; >0.1        0.05; >0.1        0.05; 0.1
        r_p (P)     -0.751 (0.05)     -0.904 (0.001)    -0.904 (0.001)    -0.953 (0.001)
        r_s (P)     -0.565 (0.05)     -0.792 (0.01)     -0.787 (0.01)     -0.787 (0.01)
L10     N           9                 9                 9                 9
        px; py      >0.1; 0.1         >0.1; >0.1        >0.1; >0.1        >0.1; 0.1
        r_p (P)     -0.541 (>0.05)    -0.251 (>0.05)    -0.015 (>0.05)    -0.012 (>0.05)
        r_s (P)     -0.457 (>0.05)    -0.142 (>0.05)    -0.035 (>0.05)    -0.035 (>0.05)
C10     N           9                 9                 9                 9
        px; py      0.01; 0.1         0.01; 0.1         0.01; 0.1         0.01; 0.1
        r_p (P)     -0.559 (>0.05)    -0.463 (>0.05)    -0.364 (>0.05)    -0.382 (>0.05)
        r_s (P)     -0.467 (>0.05)    -0.38 (>0.05)     -0.317 (>0.05)    -0.317 (>0.05)
H100    N           6                 6                 6                 6
        px; py      0.1; 0.01         0.1; 0.01         0.1; >0.1         0.1; >0.1
        r_p (P)     -0.821 (0.05)     -0.821 (0.05)     -0.463 (>0.05)    -0.569 (>0.05)
        r_s (P)     -0.75 (>0.05)     -0.75 (>0.05)     -0.364 (>0.05)    -0.357 (>0.05)
L100    N           6                 6                 6                 6
        px; py      >0.1; 0.01        >0.1; 0.01        >0.1; >0.1        >0.1; >0.1
        r_p (P)     0.289 (0.667)     0.289 (0.667)     0.289 (0.335)     0.289 (0.333)
        r_s (P)     -0.367 (>0.05)    -0.367 (>0.05)    0.25 (>0.05)      0.246 (>0.05)
C100    N           6                 6                 6                 6
        px; py      >0.1; 0.01        >0.1; 0.01        >0.1; >0.1        >0.1; >0.1
        r_p (P)     -0.287 (>0.05)    -0.287 (>0.05)    -0.82 (0.05)      -0.724 (0.05)
        r_s (P)     -0.122 (>0.05)    -0.122 (>0.05)    -0.75 (0.05)      -0.769 (0.05)
IS10    N           9                 9                 9                 9
        px; py      >0.1; 0.1         >0.1; >0.1        >0.1; >0.1        >0.1; 0.1
        r_p (P)     -0.898 (0.1)      0.831 (0.01)      0.687 (0.05)      0.722 (0.05)
        r_s (P)     0.74 (0.05)       0.788 (0.01)      0.735 (0.05)      0.736 (0.05)
IS100   N           6                 6                 6                 6
        px; py      >0.1; 0.01        >0.1; 0.01        >0.1; >0.1        >0.1; >0.1
        r_p (P)     0.764 (0.05)      0.764 (0.05)      0.581 (>0.05)     0.692 (>0.05)
        r_s (P)     0.6 (>0.05)       0.6 (>0.05)       0.615 (>0.05)     0.636 (>0.05)

C10 coagulation, test concentration 10%; C100 coagulation, test concentration 100%; H10 haemorrhage, test concentration 10%; H100 haemorrhage, test concentration 100%; L10 lysis, test concentration 10%; L100 lysis, test concentration 100%; N number of chemicals; px probability for normal distribution of in vitro variable x; py probability for normal distribution of in vivo variable y; r_p Pearson's correlation coefficient; r_s Spearman's rank correlation coefficient.


Table 5. Summarized statistical results of in vivo/in vitro correlation and prediction of individual in vivo tissue scores in the Draize eye test
HETCAM-II

HETCAM-II           Cornea opacity   Cornea area     Iris            Erythema        Chemosis        Discharge

SCORE1  N           42               42              42              42              42              42
        px  py      >0.1  0.01       >0.1  0.01      >0.1  0.01      >0.1  0.01      >0.1  0.01      >0.1  0.01
        r_p         0.79  0.001      0.81  0.001     0.77  0.001     0.84  0.001     0.90  0.001     0.83  0.001
        r_s         0.82  0.001      0.82  0.001     0.81  0.001     0.87  0.001     0.90  0.001     0.84  0.001
        pr1  pr3    26%  100%        33%  85%        47%  100%       38%  97%        59%  97%        61%  97%

HEM1    N           42               42              42              42              42              42
        px  py      0.01  0.01       0.01  0.01      0.01  0.01      0.01  0.01      0.01  0.01      0.01  0.01
        r_p         0.86  0.001      0.86  0.001     0.89  0.001     0.91  0.001     0.85  0.001     0.82  0.001
        r_s         0.87  0.001      0.92  0.001     0.88  0.001     0.93  0.001     0.89  0.001     0.87  0.001
        pr1  pr3    73%  97%         64%  95%        78%  100%       71%  95%        73%  95%        59%  97%

N number of chemicals; px probability for normal distribution of in vitro variable x; py probability for normal distribution of in vivo variable y; r_p Pearson's correlation coefficient; r_s Spearman's rank correlation coefficient; pr1 correctly predicted in vivo scores for single linear regression; pr3 correctly predicted in vivo scores for 3 subsets of data.

trial, including confidence limits for the mean and for the prediction of single values. Since data on "discharge" were missing, 57 of the 133 chemicals could not be included. Figure 7 demonstrates that the chemicals covered the whole range of eye irritancy. Testing a broad spectrum of industrial chemicals under blind conditions in several laboratories has increased the variability of the in vitro assay to the extent that prediction of in vivo eye irritancy is insufficient with the linear regression model.

Table 2 gives the detailed statistical analysis of the whole data set (133 chemicals). The reaction times for haemorrhage (H), coagulation (C) and lysis (L) as well as the weighted irritation score (IS) and irritating threshold (IT) concentration of the chemicals were used as in vitro parameters for data analysis. Table 2 shows a normal distribution of all in vitro (x) and in vivo (y) scores (px and py ≤ 0.01). Furthermore, although Pearson's linear correlation coefficients and Spearman's rank correlation coefficients (r_p and r_s) show fairly low absolute values of 0.2-0.5, correlation between in vitro/in vivo scores was significant (P ≤ 0.001) with sample sizes of n = 72-133. In general, prediction rates calculated with single linear regression (pr1) were low (18-46%), whereas prediction rates calculated with partial linear regression (pr3) were much better (41-92%). In particular, effects on cornea (opacity) and on iris were better predicted by all of the in vitro scores (71-92%) than erythema, chemosis and discharge (41-75%). In comparing the HETCAM-I data set with the other submitted HET-CAM data sets, inter-laboratory variances due to participation of seven laboratories in the validation trial have to be taken into account.
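The two correlation measures reported throughout the tables (r_p and r_s) can be illustrated with a minimal Python sketch. The function names and the example numbers are hypothetical and are not taken from any of the submitted data sets; the sketch only shows that Spearman's coefficient is Pearson's coefficient computed on ranks, which is why a monotonic but non-linear relation can give a higher r_s than r_p.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical in vitro scores (x) and in vivo scores (y), monotonic but non-linear.
in_vitro = [1, 2, 3, 4, 5]
in_vivo = [1, 4, 9, 16, 25]
print(pearson_r(in_vitro, in_vivo), spearman_r(in_vitro, in_vivo))
```

The significance levels quoted next to each coefficient in the tables are not reproduced by this sketch; they would require a separate test against the sample size.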

HETCAM-I, Laboratory A

The data of this submission were obtained in a laboratory of the cosmetics industry using the same modification of the HET-CAM assay as in the previous section. The data set covers 53 test materials, all of them formulations, which were tested in the CTFA Evaluation of Alternatives Program (Gettings et al., 1991, 1992 and 1994a, b) on alternatives to the Draize eye test. 10 hydroalcoholic formulations were tested in phase I of the CTFA program, 18 oil/water-based formulations in phase II and 25 surfactant-based formulations in phase III. To allow a comparison of data obtained with an identical set of test materials from the CTFA validation study in the three different CAM-based assays (see Table 9), the data were submitted after the IRAG workshop 1993 was held.

Figure 8 shows the overall linear regression analysis obtained with 52 test materials. The graph

Table 6. Summarized statistical results of in vivo/in vitro correlation and prediction of individual in vivo tissue scores in the Draize eye test
HETCAM-III

HETCAM-III       Cornea opacity    Iris                Erythema          Chemosis          Discharge

Q    N           87                87                  87                87                87
     px  py      0.01  0.01        0.01  0.01          0.01  0.01        0.01  0.01        0.01  0.01
     r_p         0.719  0.001      0.683  0.001        0.697  0.001      0.7  0.001        0.777  0.001
     r_s         0.755  0.001      0.724  0.001        0.696  0.001      0.731  0.001      0.734  0.001
     pr1  pr3    81%  93%          73%  93%            32%  82%          70%  96%          22%  78%

R    N           87                87                  87                87                87
     px  py      0.01  0.01        0.01  0.01          0.01  0.01        0.01  0.01        0.01  0.01
     r_p         0.63  0.001       0.646  0.001        0.635  0.001      0.667  0.001      0.721  0.001
     r_s         0.703  0.001      0.695  0.001        0.62  0.001       0.684  0.001      0.685  0.001
     pr1  pr3    75%  97%          70%  95%            29%  72%          64%  90%          18%  81%

S    N           11                11                  11                11                11
     px  py      >0.1  0.05        >0.1  0.01          >0.1  >0.1        >0.1  >0.1        >0.1  >0.1
     r_p         0.729  0.05       0.373  >0.05        0.724  0.05       0.944  0.01       0.734  0.05
     r_s         0.825  0.01       0.383  >0.05        0.763  0.01       0.930  0.01       0.798  0.01
     pr1  pr3    63%  100%         54%  not calcul.    27%  90%          72%  100%         63%  100%

N number of chemicals; px probability for normal distribution of in vitro variable x; py probability for normal distribution of in vivo variable y; r_p Pearson's correlation coefficient; r_s Spearman's rank correlation coefficient; pr1 correctly predicted in vivo scores for single linear regression; pr3 correctly predicted in vivo scores for 3 subsets of data.

Table 7. Summarized statistical results of in vivo/in vitro correlation and prediction of individual in vivo tissue scores in the Draize eye test
CAMVA, Laboratory-A

CAMVA LAB-A               Cornea opacity   Cornea area      Iris             Erythema         Chemosis         Discharge

ln RC50-14d   N           93               93               93               93               93               93
              px  py      0.01  0.01       0.01  0.01       0.01  0.01       0.01  0.01       0.01  0.01       0.01  0.01
              r_p         -0.570  0.001    -0.549  0.001    -0.311  0.01     -0.711  0.001    -0.736  0.001    -0.710  0.001
              r_s         -0.577  0.001    -0.579  0.001    -0.249  0.001    -0.721  0.001    -0.739  0.001    -0.72  0.001
              pr1  pr3    39%  82%         36%  78%         36%  94%         26%  80%         34%  82%         32%  74%

ln RC50-10d   N           42               42               42               42               42               42
              px  py      0.05  >0.1       0.05  0.01       0.05  0.01       0.05  0.01       0.05  0.01       0.05  0.05
              r_p         -0.692  0.001    -0.731  0.001    -0.543  0.001    -0.837  0.001    -0.730  0.001    -0.649  0.001
              r_s         -0.814  0.001    -0.812  0.001    -0.544  0.001    -0.825  0.001    -0.772  0.001    -0.720  0.001
              pr1  pr3    54%  92%         21%  80%         28%  97%         26%  83%         38%  92%         23%  83%

N number of chemicals; px probability for normal distribution of in vitro variable x; py probability for normal distribution of in vivo variable y; r_p Pearson's correlation coefficient; r_s Spearman's rank correlation coefficient; pr1 correctly predicted in vivo scores for single linear regression; pr3 correctly predicted in vivo scores for 3 subsets of data.

Table 8. Summarized statistical results of in vivo/in vitro correlation and prediction of individual in vivo tissue scores in the Draize eye test
CAMVA, Laboratory-B

CAMVA LAB-B               Cornea opacity   Cornea area      Iris             Erythema         Chemosis         Discharge

ln RC50-14d   N           20               20               20               20               20               20
              px  py      0.01  0.01       0.01  0.01       0.01  0.1        0.01  0.1        0.01  0.1        0.01  0.01
              r_p         -0.817  0.001    -0.842  0.001    -0.789  0.001    -0.766  0.001    -0.779  0.001    -0.804  0.001
              r_s         -0.862  0.001    -0.852  0.001    -0.874  0.001    -0.779  0.001    -0.777  0.001    -0.772  0.001
              pr1  pr3    40%  100%        45%  100%        45%  90%         50%  95%         25%  85%         20%  90%

ln RC50-14d   N           20               20               20               20               20               20
              px  py      0.01  0.01       0.01  0.01       0.01  0.1        0.01  0.1        0.01  0.1        0.01  0.01
              r_p         -0.823  0.001    -0.800  0.001    -0.849  0.001    -0.704  0.001    -0.779  0.001    -0.767  0.001
              r_s         -0.862  0.001    -0.852  0.001    -0.874  0.001    -0.779  0.001    -0.777  0.001    -0.772  0.001
              pr1  pr3    55%  100%        10%  90%         55%  95%         45%  95%         35%  85%         40%  80%

N number of chemicals; px probability for normal distribution of in vitro variable x; py probability for normal distribution of in vivo variable y; r_p Pearson's correlation coefficient; r_s Spearman's rank correlation coefficient; pr1 correctly predicted in vivo scores for single linear regression; pr3 correctly predicted in vivo scores for 3 subsets of data.

Fig. 7. Overall performance of the HETCAM-I assay, validation trial. Linear correlation between the score "IS" of the HET-CAM assay and the summarized Draize eye score ΣMMMIS. Pearson's single linear correlation analysis (r = 0.61, P ≤ 0.001). Linear regression line and the confidence limits of the regression line and of the prediction of single ΣMMMIS values are shown.

Fig. 8. Overall performance of the HETCAM-I assay, Laboratory A. Linear correlation between the score IS of the HET-CAM assay and the summarized Draize eye score ΣMMMIS. Pearson's single linear correlation analysis (r = 0.33, P ≤ 0.05). Linear regression line and confidence limits of the regression line and of the prediction of single ΣMMMIS values are shown.

includes confidence limits of the regression line and the prediction of single values. The linear in vitro/in vivo correlation of this data set shows the lowest correlation coefficient (r_p = 0.33, P ≤ 0.05) of all of the data sets of CAM-based assays. When this result is compared with other data sets, the fact that the data set consists of different test materials, which were tested in phases I, II and III of the CTFA validation study, has to be taken into account.

Statistical calculations shown in Table 3 were performed on those 52 of the 53 CTFA test materials for which complete units of in vitro and in vivo data were available. Since the submitted data reported only the weighted irritation score (IS), single reaction times for the occurrence of haemorrhage (H), lysis (L) and coagulation (C) were not included in statistical calculations. Furthermore, the irritation threshold concentration (IT) of the test materials was determined only within CTFA phase III. Therefore, IT was also not included in the calculations of in vitro/in vivo correlations and predictivity.

The data in Table 3 are distributed normally, and significant Pearson's correlation coefficients of r_p ≈ 0.3 were determined for all correlations between IS and the different tissue scores of the rabbit's eye (P ≤ 0.05). Spearman's non-parametric correlation analysis also revealed similar correlation coefficients of r_s ≈ 0.4 between IS and each of the in vivo eye scores (P ≤ 0.001). Similarly, as reported in the previous section for the data set from the German validation project (HETCAM-I, validation), rates of correct prediction were fairly low with single linear regression (pr1: 7-51%) and were significantly improved when partial regression was used (pr3: 78-90%).

HETCAM-I, Laboratory B

The analysis was performed on data obtained in a contract laboratory with the same modification of the HET-CAM assay as in the two previous sections with 26 test materials, most of which were products used in the construction business, for

Table 9. Summarized statistical results of in vivo/in vitro correlation and prediction of individual in vivo tissue scores in the Draize eye test for chemicals of CTFA study phase II and phase III
HETCAM-I LABORATORY-A, HETCAM-II AND CAMVA LABORATORY-A

                               CTFA PHASE II     CTFA PHASE III
                               Erythema          Erythema          Chemosis

HETCAM-I        N              18                25                25
LAB-A           px  py         0.05  0.05        >0.01  0.01       >0.1  0.01
IS              r_p            0.519  0.05       -0.138  >0.05     -0.198  >0.05
                r_s            0.583  0.01       -0.130  >0.05     -0.225  >0.05
                pr1  pr3       16%  94%          28%  92%          84%  92%

HETCAM-II       N              18                25                25
SCORE1          px  py         0.05  0.05        0.1  0.01         0.1  0.01
                r_p            0.250  >0.05      0.209  >0.05      0.119  >0.05
                r_s            0.203  >0.05      0.257  >0.05      0.054  >0.05
                pr1  pr3       16%  94%          36%  76%          88%  100%

CAMVA           N              18                25                25
LAB-A           px  py         0.01  0.05        0.01  0.01        0.01  0.01
ln RC50-14/10d  r_p            -0.271  >0.05     0.0035  >0.05     0.004  >0.05
                r_s            -0.120  >0.05     0.0504  >0.05     0.114  >0.05
                pr1  pr3       27%  100%         24%  92%          88%  100%

N number of chemicals; px probability for normal distribution of in vitro variable x; py probability for normal distribution of in vivo variable y; r_p Pearson's correlation coefficient; r_s Spearman's rank correlation coefficient; pr1 correctly predicted in vivo scores for single linear regression; pr3 correctly predicted in vivo scores for 3 subsets of data.

Fig. 9. Overall performance of the HETCAM-I assay, Laboratory B. Linear correlation between the score IS of the HET-CAM assay and the summarized Draize eye score ΣMMMIS. Pearson's single linear correlation analysis (r = 0.83, P ≤ 0.01). Linear regression line and the confidence limits of the regression line and of the prediction of single ΣMMMIS values are shown.

example concrete-based formulations. Because of incomplete in vitro or in vivo data, only six to nine complete data units could be analysed.

Figure 9 gives the overall linear regression analysis of the nine test materials of this data set. It includes confidence limits for the regression line and for the prediction of single values. The small data set is shown because it meets the IRAG acceptance criteria. However, the random sample of n = 9 data pairs seems too small to allow drawing any relevant conclusions.

Table 4 gives the analysis of in vitro scores v. single in vivo tissue scores and shows that basic criteria for assessing predictivity were not met by most of the data. This may be due to the small size of the data samples (n = 6 or n = 9). Data shown in Table 4 met the IRAG acceptance criteria and were therefore included in this report. However, the submission was not used for further statistical assessment of the predictive value of the HET-CAM assay.

HETCAM-II

The submission contained data on 42 water-soluble materials (20 surfactants and 22 cosmetic formulations: lotions, shampoos, gels) which were tested in a laboratory of the cosmetics industry. Most of the Draize eye tests (39 of 42) were performed with six rabbits; a few (3 of 42) were performed with three or five rabbits, using a protocol according to which eyes were rinsed 1 hr after application. The HETCAM-II assay was performed with four eggs per test material either in a single assay per formulation or in two repeated tests per chemical. Two different time-dependent weighted scores were used for data analysis, either based on hyperaemia, haemorrhage and coagulation (SCORE1) according to Luepke (1985), or only on haemorrhage (HEM1).

Figure 10(a, b) shows the overall linear regression analysis of 42 test materials, including confidence limits of the regression line and prediction of single values. Figure 10(a) gives the correlation between SCORE1 and ΣMMMIS, and Fig. 10(b) gives the correlation between SCORE1 and MMAS. The two graphs reveal that results for the correlation coefficients (r_p = 0.88/r_p = 0.86) and the confidence limits are similar. Single ΣMMMIS values can be predicted with a confidence of 95% within a variation of ±4 and single MMAS values can be predicted within a variation of ±20. Furthermore, the scatterplots also show that the correlation between SCORE1 and MMAS or ΣMMMIS is non-linear. Therefore, partial linear regression (s = 2 and s = 3), which was usually applied in the present study to analyse the prediction of single tissue scores, was also applied to the overall analysis (Fig. 4a, b). As demonstrated by these figures, the percentage of correct predictions in the HETCAM-II based on single linear regression (Fig. 3) was the same as that based on two data subsets (76%). The predictivity increased to 81% when three data subsets were formed.
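The idea of comparing a single regression line (s = 1) with partial linear regression over s data subsets can be sketched in Python as follows. This is only an illustration under stated assumptions, not the procedure used in the study: the subsets are assumed to be contiguous, roughly equal-sized groups of the data sorted by in vitro score, and a prediction is assumed to count as "correct" when it lies within a tolerance `tol` of the observed in vivo score. Both the splitting rule and `tol` are hypothetical, as are the function names and the example data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # All x values identical: fall back to a horizontal line at the mean.
        return 0.0, mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def prediction_rate(xs, ys, s=1, tol=0.5):
    """Fraction of scores predicted within `tol` when a separate line is
    fitted to each of `s` contiguous subsets of the x-sorted data.
    The subset rule and the tolerance criterion are assumptions."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    correct = 0
    for k in range(s):
        chunk = pairs[k * n // s:(k + 1) * n // s]
        if not chunk:
            continue
        cx = [p[0] for p in chunk]
        cy = [p[1] for p in chunk]
        slope, intercept = fit_line(cx, cy)
        correct += sum(abs(slope * x + intercept - y) <= tol for x, y in chunk)
    return correct / n


# Hypothetical non-linear (plateau, rise, plateau) relation between scores.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 2, 3, 3, 3, 3]
print(prediction_rate(x, y, s=1, tol=0.25), prediction_rate(x, y, s=3, tol=0.25))
```

With data of this shape, one line cannot follow both plateaus and the rise, so fitting three subsets separately raises the fraction of scores predicted within the tolerance, which mirrors the pattern of pr3 exceeding pr1 in the tables.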

Table 5 gives the statistical analysis of the correlation between the two in vitro scores of HETCAM-II and single tissue scores of the rabbit's eye. Table 5 demonstrates that SCORE1 values did not show a normal distribution over the total scale of reactions (px > 0.1), whereas HEM1 showed a normal distribution (px and py ≤ 0.01). Therefore,

Fig. 10. (a) Overall performance of the HETCAM-II assay. Linear correlation between the score "SCORE1" of the HET-CAM assay and the summarized Draize eye score ΣMMMIS. Pearson's single linear correlation analysis (r = 0.88, P ≤ 0.001). Linear regression line and confidence limits of the regression line and of the prediction of single ΣMMMIS values are shown. (b) Overall performance of the HETCAM-II assay. Linear correlation between the score "SCORE1" of the HET-CAM assay and the summarized Draize eye score MMAS. Pearson's single linear correlation analysis (r = 0.86, P ≤ 0.001). Linear regression line and confidence limits of the regression line and of the prediction of single MMAS values are shown.

valuable calculations of predictive power (pr1/pr3) could only be performed with HEM1. When single linear regression was used, about two-thirds of Draize eye tissue scores were correctly predicted by HEM1 (pr1 = 59-78%); scores of corneal opacity and iris were slightly better than those of other tissues. When partial linear regression was used, prediction of in vivo tissue scores was again very high (pr3 = 95-100%) for all of the in vivo tissue scores.

HETCAM-III

The analysis covers data obtained with the third type of HET-CAM assay in an industrial laboratory. 97 in vitro/in vivo data units derived from testing pure chemicals and formulations were submitted. Depending on the physicochemical properties of test materials, two different HETCAM-III modifications were used: for the 86 materials which allowed observation of the CAM through test materials (transparent solids or liquids) the time-dependent score "Q" was determined, and for 10 turbid or coloured materials the score "S" was determined at fixed time points. In addition, one material was tested with both protocols. The score "Q" is the ratio of the observed score and the score of a reference benchmark anionic surfactant which is concurrently determined in each assay; the score "S" is the sum of scores of the main reaction observed on the CAM (haemorrhage, coagulation, lysis) determined in six eggs. In addition, a second time-dependent score "R" (which is not corrected by the benchmark score) was calculated.

Figures 11 and 12 show the overall linear regression analysis for the 86 test materials of this data set, including confidence limits of the regression line and prediction of single values. Figure 11 gives the correlation between the score "Q" and ΣMMMIS, and Fig. 12 shows the correlation between the score "R" and ΣMMMIS. Linear correlation between the score "Q" and Draize eye test data and confidence limits are slightly better

Fig. 12. Overall performance of the HETCAM-III assay. Linear correlation between the score R of the HET-CAM assay and the summarized Draize eye score ΣMMMIS. Pearson's single linear correlation analysis (r = 0.78, P ≤ 0.001). Linear regression line and confidence limits of the regression line and of the prediction of single ΣMMMIS values are shown.

(r_p = 0.85) than between the score "R" and Draize eye test data (r_p = 0.78). Single ΣMMMIS values can be predicted with a confidence of 95% within a variation of ±4 when the HET-CAM score "Q" is used. Furthermore, the data set indicates a clustering at the low and middle range of eye irritancy, whereas the high range is not represented. Figure 13 gives the result of the overall analysis of those 11 materials of the submission for which the HET-CAM endpoint "S" was determined instead of the time-dependent scores "Q" or "R". Although 9 of 11 data points are within the confidence limits of the whole regression line, the confidence limits for the prediction of ΣMMMIS cover a variation of about ±6. Again, as with other small random samples, significant conclusions should not be drawn from this analysis.

Table 6 gives the analysis of the correlation between the in vitro scores and single tissue scores of the rabbit's eye. It shows that in vivo and in vitro data sets were normally distributed for chemicals which allowed determination of the time-dependent "Q" (benchmark corrected) and "R" (Q without benchmark correction). In contrast, of the 11 test

Fig. 11. Overall performance of the HETCAM-III assay. Linear correlation between the score "Q" of the HET-CAM assay and the summarized Draize eye score ΣMMMIS. Pearson's single linear correlation analysis (r = 0.85, P ≤ 0.001). Linear regression line and confidence limits of the regression line and of the prediction of single ΣMMMIS values are shown.

Fig. 13. Overall performance of the HETCAM-III assay. Linear correlation between the score S of the HET-CAM assay and the summarized Draize eye score ΣMMMIS. Pearson's single linear correlation analysis (r = 0.81, P ≤ 0.01). Linear regression line and confidence limits of the regression line and of the prediction of single ΣMMMIS values are shown.


materials for which "S" was determined, neither in vivo nor in vitro data showed a normal distribution. Thus, for these 11 data units, only the non-parametric correlation coefficient (r_s) can be used to characterize in vivo/in vitro correlations. A significant correlation (P ≤ 0.01) was observed between "S" and cornea, erythema, chemosis and discharge. Prediction rates for the score "S" shown in Table 6 were also determined but are not considered in further evaluation, since the prerequisite of normal distribution of the data is not met.

Furthermore, Table 6 demonstrates that all statistical values computed for the scores "Q" and "R" are of the same order of magnitude. Indeed, according to the χ² test there is no significant difference between the two time-dependent scores "Q" and "R", since the critical difference of 12% (for n = 87) is not reached in any of the values. Quite similar to other HET-CAM protocols described so far, the best prediction rates (with either of the two regression models s = 1 and s = 3) were obtained with effects on cornea and iris (pr1: 70-81%; pr3: 93-97%). In addition, and in contrast to other HET-CAM assays, chemosis was sufficiently well predicted (pr1: 64-70%; pr3: 90-96%). Low rates of correct predictions were obtained for erythema and discharge with single linear regression (pr1: 18-32%), whereas partial regression revealed prediction rates of 72-82%, indicating a non-linear correlation between the in vitro and in vivo scores.

CAMVA, Laboratory A

The data submission covered two data sets of 93 and 42 materials respectively tested in the CAMVA (Chorion Allantoic Membrane Vascular Assay) in a single industrial laboratory. The 93 materials were tested on eggs after 14 days of embryonation and the 42 materials were tested after 10 days of embryonation. 12 materials of the two data sets were tested with both protocols. The majority of materials were water-soluble household or personal care products with low eye-irritating properties. In the CAMVA, vascular effects observed on the CAM within 30 min are recorded and the number of positive eggs of a group of 10 eggs tested is determined per concentration of test material. The concentration of test material inducing a positive response in 50% of the eggs (RC50) is calculated. Statistical analysis was performed by using both RC50 and log RC50 as in vitro scores.
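How an RC50 might be derived from counts of positive eggs per concentration can be sketched in Python. The text does not state the calculation used in Laboratory A, so the method below, linear interpolation of the response fraction on a log10 concentration scale between the two concentrations that bracket 50%, is an assumption chosen for simplicity. The function name and the dose-response numbers are hypothetical.

```python
from math import log10


def rc50_interpolated(dose_response, eggs_per_group=10):
    """Estimate the concentration giving a positive response in 50% of eggs.

    dose_response: iterable of (concentration, positive_eggs) pairs with
    concentrations > 0. Interpolates the response fraction linearly on a
    log10 concentration scale (an assumed method, not the study's own).
    Returns None when the data never bracket the 50% response.
    """
    previous = None
    for conc, positives in sorted(dose_response):
        fraction = positives / eggs_per_group
        if fraction == 0.5:
            return conc
        if previous is not None and previous[1] < 0.5 < fraction:
            conc0, fraction0 = previous
            # Relative position of the 50% response between the two points.
            t = (0.5 - fraction0) / (fraction - fraction0)
            return 10 ** (log10(conc0) + t * (log10(conc) - log10(conc0)))
        previous = (conc, fraction)
    return None


# Hypothetical test: concentrations in % and positive eggs out of 10.
data = [(0.3, 0), (1.0, 2), (10.0, 8), (30.0, 10)]
print(rc50_interpolated(data))
```

A graphical or probit-type procedure would generally be preferred for real data; the interpolation above is only meant to make the definition of RC50 concrete.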

Figure 14(a) shows the overall linear regression analysis for 42 test materials of the data set obtained with the CAMVA assay after 10 days of embryonation, including confidence limits of the regression line and prediction of single values. Figure 14(b) shows the same analysis of 93 materials tested with the CAMVA assay after 14 days of embryonation. The analysis reveals similar correlation coefficients for the two data sets (10-day CAMVA: r_p = 0.77;

Fig. 14. (a) Overall performance of the CAMVA assay, Laboratory A. Linear correlation between the score log RC50-10d of the assay and the summarized Draize eye score ΣMMMIS. Pearson's single linear correlation analysis (r = 0.76, P ≤ 0.001). Linear regression line and confidence limits of the regression line and of the prediction of single ΣMMMIS values are shown. Data were obtained on day 10 of embryonation. (b) Linear correlation between the score log RC50-14d of the assay and the summarized Draize eye score ΣMMMIS. Pearson's single linear correlation analysis (r = 0.71, P ≤ 0.001). Linear regression line and confidence limits of the regression line and of the prediction of single ΣMMMIS values are shown. Data were obtained on day 14 of embryonation.

14-day CAMVA: r_p = 0.71) and also similar confidence limits; single in vivo scores can be predicted with 95% probability by log RC50 values with a variation of ±6 (ΣMMMIS). Although the regression lines of Fig. 14(a, b) are characterized by the same slope, the regression line of the 10-day CAMVA crosses the Y-axis at an intercept of 6 (ΣMMMIS) and the regression line of the 14-day CAMVA crosses the Y-axis at an intercept of 4 (ΣMMMIS). This indicates that the 14-day CAMVA assay predicts higher in vivo scores than the 10-day CAMVA assay within the entire range of the in vitro score log RC50.

Table 7 shows that, with a single exception (cornea opacity in data set 10-day), all in vitro and in vivo data showed normal distribution and a significant correlation (P ≤ 0.001 for r_p and r_s), which allows analysis of prediction rates. According to Table 7, prediction rates obtained with single linear regression were significantly lower (pr1 = 23-54%) than those obtained with partial linear regression (pr3 = 74-92%). This indicates a non-linear correlation between in vitro/in vivo data, although log RC50 was used for the purpose of linearization, instead of RC50. The submitter reported a better correlation between day 10 in vitro and in vivo data than between day 14 in vitro and in vivo data. However, our statistical analysis (Fisher's exact test) did not reveal significant differences between 10-day and 14-day in vitro data. With single linear regression no differences were observed in prediction of effects on iris, erythema, chemosis and discharge in the rabbit eye, whereas corneal opacity was predicted slightly better (P = 0.04) and corneal area was predicted with a lower rate (P = 0.03) with the day 10 protocol. When prediction rates based on partial linear regression (pr3) are compared, no significant difference was seen between day 10 and day 14 data. Furthermore, although higher correlation coefficients were obtained between RC50 and conjunctiva scores than between RC50 and cornea scores, the corresponding prediction rates for the two ocular tissues did not differ significantly.
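A comparison of two prediction rates with Fisher's exact test can be sketched in Python as below. The 2 x 2 layout (correct and incorrect predictions for each protocol) and the example counts are hypothetical; the text does not give the exact tables that were tested. The two-sided P value is computed here as the sum of the probabilities of all tables, with the same margins, that are no more likely than the observed one, which is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(k):
        # Probability that the top-left cell equals k, given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative slack guards against floating-point ties.
    return sum(prob(k) for k in range(lowest, highest + 1)
               if prob(k) <= p_observed * (1 + 1e-9))


# Hypothetical counts: correct / incorrect predictions under two protocols.
p_value = fisher_exact_two_sided(23, 19, 36, 57)
print(p_value)
```

Because the test works on counts rather than percentages, two protocols with different numbers of test materials can be compared directly.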

CAMVA, Laboratory B

The data submission covers 20 chemicals tested in a contract laboratory by the CAMVA assay according to Bagley et al. (1988) and submitted by a company of the pharmaceutical industry. The same protocol was used as described in the previous section (Laboratory A), except that RC50 was determined graphically according to Litchfield and Wilcoxon (1949) and that all assays were performed on eggs on day 14 of embryonation. The in vivo data were obtained with six rabbits per Draize eye test. Test chemicals belonged to different chemical classes, all of which were intermediates of pharmaceuticals with unequally distributed in vivo responses. This is shown in Fig. 5f with clusters of data at both the lower and the upper ends of the scale.

Figure 15 shows the overall linear regression analysis of 20 test materials of the data set obtained by the CAMVA in Laboratory B, including confidence limits of the regression line and prediction of single values. Since this data set allowed calculation of the MMAS from in vivo data, the overall analysis shows the correlation between log RC50 and the MMAS. The data show two clusters at the lower and the upper ends of the MMAS scale; thus the linear correlation coefficient is high (r_p = 0.88) although the scatterplot suggests a non-linear correlation between log RC50 and MMAS. This is also supported by the wide confidence interval of about ±45 for the prediction of single MMAS values.

Table 8 shows statistical analysis of the correlation between the in vitro scores RC50 and log RC50 and the single tissue scores of the rabbit's eye. The data were originally submitted as RC50 and transformed to log RC50, since this transformation improved in vitro/in vivo correlation in CAMVA Laboratory A. As shown in Table 8, log transformation did not significantly improve the statistical measures for the data set of CAMVA Laboratory B. Correlation coefficients (r_p and r_s) between RC50 and all of the in vivo scores were high, ranging from 0.77 to 0.87. Prediction rates obtained with single linear regression (pr1) were comparable to other CAM-based assays, ranging from 20 to 55%. In contrast, prediction rates obtained with partial regression were higher than in most of the other data sets submitted, ranging from 80 to 100%.

Comparison of predictivity of different CAM-based assays

Since data sets submitted to WG-2 were generated with different sets of test materials, it is difficult to compare the ability of the different CAM-based assays to predict tissue reactions in vivo. However, three data sets submitted by three laboratories were generated by three different CAM-based assays but with an identical set of cosmetic formulations, which were tested in the CTFA Evaluation of Alternatives Program (Gettings et al., 1991, 1992 and 1994a, b) for alternatives to the Draize eye test. Ten hydroalcoholic formulations were tested in phase I of the CTFA program, 18 oil/water-based formulations in phase II and 25 surfactant-based formulations in phase III. Since the assays were slightly modified during the CTFA study, only data of phase II and phase III of the CTFA program could be used in the present study to compare the predictive value of the three assays (Table 9).

Fig. 15. Overall performance of the CAMVA assay, Laboratory B. Linear correlation between the score log RC50-14d of the CAMVA assay and the summarized Draize eye score MMAS. Pearson's single linear correlation analysis (r = 0.81, P ≤ 0.001). Linear regression line and confidence limits of the regression line and of the prediction of single MMAS values are shown. Data were obtained on day 14 of embryonation.

The data shown in Table 9 are subsets of the data sets submitted by HETCAM-I (Laboratory A), HETCAM-II and CAMVA (Laboratory A). Correlations of all in vitro scores and all in vivo tissue scores were calculated, and only those nine in vitro/in vivo correlations providing the best results are shown in Table 9. Since the test materials consisted of cosmetic formulations, which rarely produce corneal damage, all of the three assays correlated quite well with the reaction of the conjunctiva of the rabbit's eye, especially with erythema. In addition, test materials of Phase III (surfactant-based formulations) showed a satisfactory correlation of in vitro data with chemosis in the rabbit's eye in vivo.

Furthermore, in vitro/in vivo correlations and predictivity with the three CAM-based assays were quite similar. When single linear regression was used to calculate the predictivity of the three assays, the prediction rates (pr1) obtained were 16-36% for erythema and 84-88% for chemosis. Prediction rates calculated with partial regression (pr3) were significantly higher, ranging from 92 to 100% for eight of the nine in vitro/in vivo correlations shown in Table 9. It is important to emphasize that predictivity should not be assessed on the basis of a single linear regression model. In eight of the nine in vitro/in vivo correlations shown in Table 9, Pearson's correlation coefficients (r_p) exceeded 5% probability of error, that is, were not significant. Nevertheless, results in Table 9 obtained with partial linear regression demonstrate that no differences in predictivity could be detected among the three different CAM-based assays when identical test materials of the CTFA study were analysed.

Prediction of specific in vivo tissue scores of the rabbit's eye

Table 10 shows the mean prediction rates for single tissue scores of the eye, which were calculated from all data sets. If a CAM-based assay provided several scores (see Tables 2-8), only the most predictive in vitro score was used for calculation of the mean. It must be taken into account that test materials used in the assays belonged to different chemical classes and produced quite different irritation reactions in the rabbit's eye. Table 10 demonstrates that corneal opacity and inflammation of the iris are the in vivo tissue scores which were predicted equally well with both regression models (s = 1 and s = 3). Although effects on conjunctiva (erythema, chemosis and discharge) were not sufficiently well predicted by the single linear regression model (mean pr1: 35-57%), using partial linear regression significantly improved the mean prediction rates for these scores (mean pr3: 83-95%).

Table 10. Mean prediction rates of in vivo scores from in vitro scores of all CAM-based assays obtained with single linear regression (pr1) and with partial linear regression (pr3)

              Mean pr1 (%)   Mean pr3 (%)
Cornea            57.5           92.2
Iris              52.5           95.9
Erythema          33.8           83.5
Chemosis          45.7           90.8
Discharge         35.3           82.9

For each CAM-based assay the most predictive in vitro score was used for calculation.

Relation between predictivity of CAM-based assays and chemical classes of test materials

It is well known that the ability of many in vitro assays to predict complex in vivo effects may vary with the chemical nature of the test materials. Therefore, each of the submitted data sets was analysed for the existence of a relation between chemistry of test materials and predictivity of each CAM-based assay by using contingency tables and the chi-square test. The contingency tables in Table 11(a-e) were established by counting the number of in vivo scores which were correctly predicted by the most predictive in vitro score of each of the CAM-based assays. For each test material, numbers between 0 and 5 (columns of contingency tables) were used to describe how many of the five in vivo scores (corneal opacity, iris, erythema, chemosis and discharge) were predicted correctly. When information on corneal area affected was also available (HETCAM-II and CAMVA) the maximum possible number of in vivo scores that might be correctly predicted increases to six instead of five. The lines in the contingency tables were established by grouping the test materials into specific classes according to recommendations of the submitting laboratory or as suggested by WG-2. The last column of Table 11(a-e) lists the means of in vivo tissue scores which were correctly predicted for each chemical group. Although contingency tables were established for both prediction models (s = 1 and s = 3), Table 11(a-e) shows only those obtained by single linear regression. On the one hand, when partial regression was used, the maximum number of in vivo scores predicted correctly was significantly increased, and on the other hand, because of the better fit of the regression lines, there was no longer a significant correlation between chemical classes and predictive power of the in vitro assays (see Discussion section and Table 12).
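The contingency analysis described above can be illustrated with a minimal Python sketch that computes Pearson's chi-square statistic for the counts listed in Table 11(e) (CAMVA, Laboratory B). The function name `chi_square` and the list-of-rows layout are illustrative assumptions and are not taken from the WG-2 analysis; how the reported P values were derived is not restated here, so only the statistic and its degrees of freedom are computed. For these counts the statistic is about 21.3, in line with the value given in Table 11(e).

```python
# Sketch: Pearson's chi-square statistic for an r x c contingency table.
# Counts are those of Table 11(e) (CAMVA, Laboratory B): rows are chemical
# groups, columns are the number of correctly predicted in vivo scores (0-6).

def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Rows or columns without any observation carry no information; skip them.
    rows = [i for i, total in enumerate(row_totals) if total > 0]
    cols = [j for j, total in enumerate(col_totals) if total > 0]
    stat = 0.0
    for i in rows:
        for j in cols:
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(cols) - 1)
    return stat, dof


table_11e = [
    [0, 1, 0, 0, 1, 1, 0],  # alcohols
    [1, 1, 0, 1, 0, 0, 0],  # amines/amides
    [2, 0, 1, 1, 2, 1, 0],  # aromatics
    [2, 0, 0, 0, 0, 0, 1],  # other
    [1, 0, 1, 1, 0, 1, 0],  # steroids
]

stat, dof = chi_square(table_11e)
print(f"chi-square = {stat:.1f}, degrees of freedom = {dof}")
```

With only 20 test materials spread over 35 cells, most expected counts are far below 5, so a chi-square approximation on such a table should be read with caution.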

The results summarized in Table 11(a-e) show that the chi-square values for three of the five data sets (which provided 83% of all data and a great variety of chemicals) are significantly greater (P ≤ 0.05) than the critical value for chi-square. This means that in three of five data sets the maximum number of correctly predicted in vivo scores is significantly related to specific subgroups of test materials. The strongest correlation between grouped test materials and the maximum number of correctly predicted in vivo scores was found in the data set in Table 11(d) of CAMVA (Laboratory A) (P = 0.001). In this data set effects of alcohols on the rabbit's eye were predicted with the highest rate (3.8 of 6 tissue scores) and effects induced by shampoos showed the lowest prediction rate (0.9 of 6 tissue scores). A significant correlation between chemical groups and the maximum number of correctly predicted in vivo scores was also seen in the data sets of HETCAM-I validation study (P = 0.025, Table 11a) and in the data set of HETCAM-III (P = 0.05, Table 11c). In the HETCAM-I validation data set, effects of heterocyclic compounds showed the highest prediction rate (2.3 of 5 tissue scores); aliphatic chemicals showed the lowest rate (1.2 of 5 tissue scores). Among the chemicals tested in HETCAM-III the effects of sugar derivatives on the eye were predicted with the highest rate (4.3 of 5 possible tissue scores) and sulfonates with the lowest rate (1.0 of 5 possible tissue scores). Attempts of WG-2 to assign test materials for which data were submitted by various laboratories to subgroups of comparable chemistry were limited by the scarce amount of information provided with the data sets.

Table 11. Relation between predictivity of CAM-based assays and chemical classes of test materials

                                Sum of correctly predicted in vivo scores
Chemical group           Total  None     1     2     3     4     5     6   Mean

a. HETCAM-I, Validation (7 labs): Haemorrhage (H) max. 5
Aliphatic                   40    18     6     9     4     2     1     -   1.23
Aromatic                     8     4     2     0     1     0     1     -   1.25
Heterocyclic*                3     0     2     0     0     0     1     -   2.33
Inorganic*                   6     0     1     3     2     0     0     -   2.17
Other                       19     6     2     4     2     5     0     -   1.89
Total                       76    28    13    16     9     7     3     -   1.51
χ² = 34.9, P = 0.025

b. HETCAM-II: Haemorrhage (HEMI) max. 6
Gels/shampoos*              10     0     0     0     1     2     4     3   4.90
Lotions                     12     1     1     1     2     2     2     3   3.75
Surfact. ionic              12     0     2     0     1     4     2     3   4.08
Surfact. non-ionic*          8     0     0     2     0     2     2     2   4.25
Total                       42     1     3     3     4    10    10    11   4.21
χ² = 14.4, P = 0.70

c. HETCAM-III: Time-dependent weighted score (R) max. 5
Alcohols                     7     0     0     0     6     1     0     -   3.14
Esters                      19     3     0     1     8     7     0     -   2.84
Ethers                      10     1     1     3     3     1     1     -   2.50
Fatty acids                  4     2     0     0     1     1     0     -   1.75
Fatty acids (N sub.)         3     1     1     0     0     0     1     -   2.00
Fragrances*                  7     0     0     0     3     4     0     -   3.57
Glycerides                   5     1     0     1     3     0     0     -   2.20
Other                       15     2     3     2     4     2     2     -   2.47
Phosph. comp.                5     2     0     2     0     1     0     -   1.60
Protein deriv.               3     1     0     2     0     0     0     -   1.33
Salt                         3     0     0     1     1     0     1     -   3.33
Sugar derivs.*               3     0     0     0     0     2     1     -   4.33
Sulfonates                   3     1     1     1     0     0     0     -   1.00
Total                       87    14     6    13    29    19     6     -   2.59
χ² = 80.0, P = 0.05

d. CAMVA, Laboratory A: ln RC50 14d max. 6
Alcohols*                    5     0     1     0     1     0     3     0   3.80
Cosmetics                   28     9     3     2     2     1     5     6   2.79
Detergents                  25     7     9     4     4     1     0     0   1.32
Fatty acid derivs.           4     2     0     1     0     0     0     1   2.00
Other                       18     5     3     6     3     0     0     1   1.67
Shampoos                     7     2     4     1     0     0     0     0   0.86
Sulfate*                     6     0     0     2     2     2     0     0   3.00
Total                       93    25    20    16    12     4     8     8   2.06
χ² = 72.6, P = 0.0001

e. CAMVA, Laboratory B: ln RC50 14d
Alcohols*                    3     0     1     0     0     1     1     0   3.33
Amine/amide                  3     1     1     0     1     0     0     0   1.33
Aromatics*                   7     2     0     1     1     2     1     0   2.57
Other                        3     2     0     0     0     0     0     1   2.00
Steroids                     4     1     0     1     1     0     1     0   2.50
Total                       20     6     2     2     3     3     3     1   2.40
χ² = 21.3, P = 0.70

*Groups with the best prediction rate.

Discussion

IRAG WG-2 has evaluated five sets of data produced with three different modifications of the HET-CAM assay and two sets of data obtained with a single CAMVA protocol. All of the HET-CAM assays were performed in Europe, whereas the two CAMVA assays were generated in US laboratories. Other HET-CAM and CAMVA data submitted could not be included in the evaluation because they did not meet the IRAG quality criteria for in vivo Draize eye test data. It is important to note that, except for the data set from the German HET-CAM "blind" validation trial, data selection may be biased to some extent, since the sponsors selected the test chemicals and the data that were submitted.

Comparative assessment of the predictive power of CAM-based assays

The main goal of the IRAG initiative was a comparative assessment of the overall utility of the various in vitro methods currently discussed as alternatives to the Draize eye irritation test. In this field of in vitro toxicology many in vitro methods have been developed during the past decade. Some of them have proven reliability (defined as intra- and interlaboratory reproducibility). However, the most important aspect, the ability of in vitro data to correspond sufficiently to in vivo reactions (also defined as predictivity), has usually not been proven. Therefore, assessment of the predictive value of in vitro methods is still one of the most crucial aspects of evaluating data from validation studies. This aspect is even more difficult with the Draize eye test, which is used for several quite different regulatory purposes, for instance on the one hand for safety assessment of consumer products including cosmetics and on the other hand for risk assessment of pesticides, industrial chemicals and pharmaceuticals. It also must be taken into account that although the Draize eye test is performed according to a standard test protocol (OECD Guideline 405, OECD, 1981 and 1994), several quite different scoring systems for Draize eye test data are used by different regulatory authorities for regulatory purposes (Chambers et al., 1993).

In many validation studies, in vitro data have been correlated to complex scoring systems generated from the in vivo Draize eye test data, for example, MAS or MMAS. High correlations between in vitro and in vivo data do not necessarily reflect the "real" predictive value of in vitro assays, since the correlation coefficient is strongly dependent on the spectrum of test materials selected. If data sets consist of clusters of data in the low and high range of in vivo responses, the entire range of in vivo responses is not adequately represented. Under such circumstances highly significant correlation coefficients will be obtained although the predictive value of the specific in vitro assay is in fact low.
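The effect of clustered data on the correlation coefficient can be made concrete with a small Python sketch. The numbers below are hypothetical and were chosen only for illustration (they are not taken from any submitted data set): within each cluster the in vitro and in vivo values are uncorrelated, yet the pooled data give a Pearson coefficient close to 1.

```python
# Sketch: two clusters of hypothetical data at the low and high ends of the
# scale produce a high overall Pearson r although neither cluster shows any
# internal relation between the in vitro and the in vivo score.
from math import sqrt


def pearson_r(x, y):
    """Pearson's linear correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical in vitro (x) and in vivo (y) scores.
low_x, low_y = [1, 2, 3, 4], [3, 1, 4, 2]
high_x, high_y = [11, 12, 13, 14], [43, 41, 44, 42]

r_low = pearson_r(low_x, low_y)
r_high = pearson_r(high_x, high_y)
r_all = pearson_r(low_x + high_x, low_y + high_y)

print(f"r within low cluster:  {r_low:.2f}")
print(f"r within high cluster: {r_high:.2f}")
print(f"r for pooled data:     {r_all:.2f}")
```

In this constructed example the pooled coefficient reflects only the distance between the two clusters, which is the situation described above for data sets lacking intermediate in vivo responses.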

In contrast, if toxicity tests are used to predict a specific toxic potential of test materials to be classified as positive or negative, the predictivity of a new assay is characterized by its sensitivity, specificity and the rates of positive and negative prediction. This information can be calculated from the percentages of correct or false positive and negative classifications obtained with the new assay compared with the existing test, as is usually done in 2 × 2 contingency tables (e.g. Balls et al., 1990). It is therefore much easier to assess the utility of an alternative to the Draize eye test if the intended use of the in vitro method is limited to identifying severe eye irritants v. low and non-irritating material, than to assess whether the assay is able to predict in vivo effects for the whole spectrum of eye irritation.
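For the classification setting described above, the four measures follow directly from the cells of a 2 × 2 contingency table. The following Python sketch shows the arithmetic; the function name and the counts are hypothetical and serve only as an illustration, with the existing in vivo test taken as the reference classification.

```python
# Sketch: sensitivity, specificity and predictive values from a 2 x 2 table
# comparing a new assay with the existing (reference) test.
# tp: reference positive, assay positive    fn: reference positive, assay negative
# fp: reference negative, assay positive    tn: reference negative, assay negative

def classification_metrics(tp, fp, fn, tn):
    """Return the four standard rates as fractions between 0 and 1."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "positive_predictive_value": tp / (tp + fp),
        "negative_predictive_value": tn / (tn + fn),
    }


# Hypothetical counts for 50 test materials (20 severe irritants, 30 others).
metrics = classification_metrics(tp=18, fp=5, fn=2, tn=25)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

Unlike sensitivity and specificity, the two predictive values depend on the proportion of positives among the test materials, so they are tied to the composition of the data set from which they were calculated.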

Since the IRAG approach is aimed at assessing the utility of alternatives to replace the Draize eye test not only for severe irritants but for the whole spectrum of eye irritation, a new biostatistical tool, the prediction rate, was used to assess the predictive power of CAM-based assays for the entire range of in vivo responses. The prediction rate is defined as the percentage of correctly predicted in vivo scores within the whole range of in vivo responses. Based on linear regression and on partial linear regression between in vitro and in vivo scores, an area of ± 1/10 of the maximum in vivo score was defined as "correct prediction" (see Fig. 3). Many examples in Tables 2-8 demonstrate that the same level of predictivity is achieved with in vitro/in vivo endpoints characterized not only by a high correlation but also by a low correlation. For example, with HETCAM-I, cornea opacity was correctly predicted for 51% (pr1) of the test materials when the linear correlation coefficient was very low, for example r_p = 0.31 (Table 3), whereas in the CAMVA assay (Laboratory B) about the same prediction rate was obtained (pr1 = 55%) with a much higher linear correlation coefficient of r_p = 0.82 (Table 8). It should be taken into account that in the latter data set, test materials of low and extremely high eye irritation potential were over-represented, leading to unusually high linear correlation coefficients. Although the example may suggest that applying the new biostatistical tool "prediction rate" will provide better assessment and that comparison of the overall utility of the various CAM-based assays is possible, the outcome still depends on the selection of test chemicals.
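The prediction rate for the single linear regression model can be sketched in a few lines of Python. This is one reading of the definition given above, under two stated assumptions: the tolerance band of ± 1/10 of the maximum in vivo score is measured around the fitted regression line, and the maximum in vivo score is passed in by the caller (whether the scale maximum or the largest observed value is meant is not restated here). The function names and the example data are hypothetical.

```python
# Sketch: prediction rate for a single linear regression (s = 1).
# A prediction counts as correct when the observed in vivo score lies within
# +/- 1/10 of the maximum in vivo score around the regression line (assumed).

def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def prediction_rate(in_vitro, in_vivo, max_in_vivo_score):
    """Percentage of in vivo scores predicted within the tolerance band."""
    slope, intercept = fit_line(in_vitro, in_vivo)
    tolerance = max_in_vivo_score / 10.0
    hits = sum(
        1
        for x, y in zip(in_vitro, in_vivo)
        if abs(y - (slope * x + intercept)) <= tolerance
    )
    return 100.0 * hits / len(in_vivo)


# Hypothetical data on an in vivo scale with maximum score 10.
example_rate = prediction_rate([0, 1, 2, 3, 4], [0, 0, 4, 4, 4], max_in_vivo_score=10)
print(f"prediction rate: {example_rate:.0f}%")
```

In this example three of the five points fall inside the band, giving a prediction rate of 60%. A partial linear regression version would apply the same counting to each of the s regression lines and their associated subsets.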

Prediction rates derived from single linear regression and obtained with the different types of CAM-based assays were often very low (Tables 2-8) with two exceptions: with HETCAM-II (score HEMI) and HETCAM-III (scores Q and IS) prediction of many in vivo scores was sufficient (about 60-80%). It is obvious that the data sets of the two submissions did not represent a wide spectrum of chemical classes. In the case of HETCAM-III, it must be taken into account that the scoring system includes an anionic surfactant as benchmark reference and is optimized for in-house testing of raw materials and products of the company submitting the data. The lowest rate of prediction was obtained with the data set from the German interlaboratory validation study (HETCAM-I, validation). The broad physicochemical spectrum of the 133 industrial chemicals tested on the one hand and blind testing in seven laboratories on the other hand may have accounted for the result. This assumption is supported by comparison of the irritation score (IS) in the German validation study (Table 2) with both HETCAM-I (Laboratory A) (Table 3) and HETCAM-III (Table 6). Prediction rates derived from single linear regression and also from partial linear regression were significantly better when the data were produced within a single laboratory (Table 3) and with test materials from the same physicochemical spectrum, for instance surfactants (Table 6), than in the validation study.

Prediction rates were considerably improved up to 100% when partial linear regression (s = 3) was used. Partial linear regression (PLR) provided much better prediction rates of all in vivo scores than single linear regression (SLR). This indicates either a non-linear relation between in vitro and in vivo scores or the existence of chemical subclasses of test materials which exhibit similar tissue reactions in vivo by different mechanisms, or both. The new tool of partial linear regression permitted assessing the potential predictivity of the in vitro assays based on the actual data sets submitted. However, the available information on the physicochemical properties of the test materials used was not sufficient to elucidate the rules behind the fractions obtained by this method. In that respect, the predictive value of our approach is restricted, since a new material cannot be unambiguously assigned to one of the three fractions. Nevertheless, the predictive power of this method is significantly better than what can be achieved with the single linear approach because the prediction bounds (confidence limits for prediction of a single value, Fig. 4b) of the various regression lines constructed by PLR are usually located much closer together than those associated with a single regression line (Fig. 3). Furthermore, the various fractions do not overlap within the whole range of in vitro scores; namely, ranges of the in vitro scores exist in which all data are associated with only one regression line. This is illustrated by the data and regression lines in Fig. 4(b). Chemicals producing in vitro scores (SCORE1) below 6 are unequivocally associated with the lowest regression line. Because of the narrow range enclosed by the confidence limits for prediction of single values (dashed lines) of this line the accuracy of predictions possible for chemicals with SCORE1 < 6 is much higher than on the basis of the single regression model (cf. prediction bounds in Fig. 3). On the other hand, materials with in vitro scores > 6 can be alternatively associated with one of the three regression lines, and without further knowledge of their physicochemical properties, it is impossible to decide to which of the three possible fractions they actually belong. Even here, however, PLR is more predictive than conventional linear regression analysis, since instead of a single very broad range of possible MMAS values (usually between 0 and 100% of the in vivo scale), it offers two or three distinct narrow ranges into which the in vivo scores are likely to fall. According to the PLR analysis depicted in Fig. 4b, materials with in vitro scores (SCORE1) between 8 and 10 would be assigned to either low (MMAS < 24) or high (MMAS > 40) levels of eye irritancy, whereas intermediate in vivo scores are unlikely to occur.

Prediction of individual tissue scores in the rabbit's eye

According to IRAG Guidelines an essential goal was to analyse correlations between in vitro scores of specific assays and single endpoints of the Draize eye test in vivo. This approach was appreciated by WG-2, since it may lead to a better understanding of mechanistic similarities between in vitro and in vivo endpoints. Furthermore, the summarized MAS score currently used to evaluate the Draize eye test is based on a mixture of effects observed in different tissues of the eye, which are multiplied by weighting factors derived from ocular risk assessment in humans (× 5 for corneal damage and iritis, × 2 for conjunctival effects). It is therefore quite unlikely that a single in vitro test or a single score of an in vitro testing system will be able to predict all of the complex scores used to calculate the Draize eye test.
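To show how the weighting factors mentioned above combine the tissue grades into one number, the following Python sketch computes a weighted Draize total for a single observation. It assumes the conventional Draize grading ranges (corneal opacity 0-4, corneal area 1-4, iris 0-2, conjunctival redness 0-3, chemosis 0-4, discharge 0-3) and the conventional multiplication of corneal opacity by area; these details are not restated in the present report and the function name is hypothetical.

```python
# Sketch: weighted Draize total for one animal at one observation time,
# assuming the conventional Draize grading scheme (an assumption, see above).

def weighted_draize_score(opacity, area, iris, redness, chemosis, discharge):
    """Combine tissue grades using the factors x5 (cornea, iris) and x2 (conjunctiva)."""
    cornea = opacity * area * 5                           # maximum 4 * 4 * 5 = 80
    iris_part = iris * 5                                  # maximum 2 * 5 = 10
    conjunctiva = (redness + chemosis + discharge) * 2    # maximum 10 * 2 = 20
    return cornea + iris_part + conjunctiva


max_total = weighted_draize_score(4, 4, 2, 3, 4, 3)
conjunctiva_only = weighted_draize_score(0, 0, 0, 2, 1, 1)
print(max_total, conjunctiva_only)
```

Under these assumptions the cornea contributes up to 80 of the 110 possible points, so a material acting only on the conjunctiva cannot exceed a total of 20. This is one reason why an in vitro score may correlate well with a single tissue score and still correlate poorly with the summarized score.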

An important result of the comparative data analysis in Tables 2-8 is that all types of CAM-based assays, irrespective of scores and endpoints, showed their best performance when used to predict corneal lesions and iritis, whereas their performance in predicting conjunctiva reactions was less efficient. This holds true even for the CAMVA assay, which according to the submitting laboratories is more predictive for conjunctiva effects, since the in vitro endpoints determined are based solely on vascular reactions of the CAM. In Laboratory B the CAMVA assay proved to be most predictive for cornea and iris. In Laboratory A the predictive power of the CAMVA assay for damage observed on cornea, iris and conjunctiva was identical. This is probably due to the chemicals selected by CAMVA-Laboratory A; Fig. 6 shows the frequency distribution of single tissue scores of the submitted data sets. Only 13% of the test materials in Laboratory A showed corneal effects of MMMIS 3 and MMMIS 4, whereas 35% of the test materials in Laboratory B induced corneal lesions of MMMIS 3 and MMMIS 4.

Table 12. Representation of chemicals of a specific data set by regression lines RL1, RL2 and RL3 of the s = 3 model (see Fig. 4b). Example shown: CAMVA, LAB-B

Chemical   CAMVA ln RC50-14d   Draize cornea    Chemical   RL1, RL2
number     [g/litre]           opacity MMMIS    group      or RL3
12              6.908               0.33        I          •
 7              6.908               0.00        I          •
14              2.398               2.33        I          •
 2              6.214               0.00        II         •
16              3.857               3.50        II
13              4.461               1.00        II
11              6.215               0.00        III        •
15              4.460               3.50        III        •
18              4.277               3.50        III        •
 1              6.908               0.00        III        •
 5              6.908               0.00        III        •
 9              6.908               0.00        III        •
10              6.908               0.00        III        •
17              3.689               3.50        IV         •
19             -0.274               4.00        IV         •
20             -2.813               4.00        IV         •
 6              4.942               0.00        V          •
 3              6.908               0.00        V          •
 4              6.908               0.00        V          •
 8              6.908               0.00        V          •

Chemical groups: I alcohols; II amines/amides; III aromatics; IV other; V steroids; for regression lines RL1, RL2, RL3 see Fig. 4b.

To assess the ability of in vitro methods to predict reversibility of the irritant reactions of the rabbit's eye, an analysis of the in vivo score "days to clear" was recommended by the IRAG Guidelines group. Unfortunately, the time period between application of test materials and termination of the Draize test showed great variability between laboratories. Draize tests were terminated on day 7, day 14, day 21 or even day 35. Therefore, scores ranging from >7 to >35 should have been used to cover the different periods of exposure in regression analysis. There is still uncertainty whether a specific eye lesion would have persisted until day 21, whereas the Draize test was actually terminated on day 7. It is important to note that because of the short observation time of ≤5 min, CAM-based assays cannot be expected to predict reversibility of ocular lesions.

Correlation of predictivity of CAM-based assays to chemical classes

Contingency tables (Table 11a-e) clearly demonstrate that the predictive power of the various protocols of CAM-based assays is related to the chemistry of test materials. On the other hand the overall analysis is limited, since the sets of test materials used in the various CAM-based assays were quite different. Taking this limitation into consideration, the present data show that HET-CAM assays performed best with surfactants or surfactant-based formulations, and that the CAMVA assay performed best with alcohols. A definitive statement about predictivity of various protocols of the CAMVA assay is impossible because of the considerable differences between materials tested. It is therefore recommended that in future validation studies, statistical analysis should be performed on each total data set as well as on subgroups of materials tested.

Finally, WG-2 used partial linear regression for a contingency analysis between chemical groups and the maximum number of correctly predicted in vivo scores. The analysis revealed non-significant chi-square values in all of the data sets. Because the rate of correctly predicted scores improved, a correlation between the chemistry of the test materials and the predictive power of the assay could no longer be detected; this is shown in Table 12 for only one data set (CAMVA, Laboratory B). Chemicals grouped into four different chemical classes were reassigned to their respective regression lines RL1, RL2 and RL3 of the s = 3 model. Since none of the four groups of chemicals is sufficiently represented by one of the three regression lines, no correlation could be detected between chemistry of a test material and predictive power of the CAMVA assay.


Future use of CAM-based assays

As described and analysed in the present report, CAM-based assays have been developed in the cosmetics industry where they are used at the lower end of the Draize eye test scale to discriminate between non-irritant and mildly irritant surfactant-based formulations. In the present report biostatistical analysis reveals a very good prediction of in vivo Draize eye test scores from in vitro HET-CAM and CAMVA data for surfactant-based materials with MAS scores of 0-40. Our data therefore support the use of CAM-based assays for safety assessment of non-irritating and mildly irritating surfactants.

A similar result was obtained in the international EC/Home Office validation trial of nine in vitro alternatives to the Draize eye test, including the HET-CAM assay (Balls et al., 1995), since preliminary evaluation suggests that, with the possible exception of predicting the irritancy of surfactants, none of the nine tests met the performance criteria that were expected from an in vitro test to be acceptable for regulatory purposes.

The HET-CAM assay also showed promise in a national German validation project for replacing the Draize eye test for severely eye-irritating chemicals (Spielmann et al., 1995). Discriminant analysis revealed that among the nine endpoints routinely determined in the HET-CAM assay, coagulation of a 10% solution was the best discriminating factor to identify severely eye-irritating chemicals and coagulation of undiluted chemicals for the less water-soluble ones. Stepwise discriminant analysis allowed development of an in vitro testing strategy for identifying severely eye-irritating chemicals by combining coagulation data from the HET-CAM assay with cytotoxicity data. Results obtained with 200 chemicals under blind conditions suggest that this approach will provide an acceptable predictivity for regulatory purposes.

Summary

The main goal of IRAG Working Group 2 was to assess the overall utility of CAM-based assays as alternatives to the Draize eye irritation test. Sponsors of data submitted both in vivo results obtained under standard conditions in the Draize eye test and results from testing the same materials in vitro in a CAM-based assay. The CAM assay is an organotypic test in which test material is applied to the sensitive chorion allantoic membrane (CAM) of embryonated chicken eggs. Two general types of CAM-based assays are currently in use for in vitro testing: in Europe chiefly the HET-CAM test and in the United States chiefly the CAMVA assay.

WG-2 has evaluated five sets of data produced with three different modifications of the HET-CAM test and two sets of data obtained with a single CAMVA assay protocol. Most of the data sets were produced by in-house testing; one was produced in a blind validation trial. The data sets covered between 9 and 133 test chemicals, usually from the product line of the sponsor. In contrast, in the validation trial a wide spectrum of chemicals was tested.

According to guidelines provided by IRAG, WG-2 documented the quality of the data submitted, assessed the test protocols in a standardized fashion and evaluated the purpose and proposed use of each of the CAM-based assays. WG-2 has also analysed the range of responses covered by each CAM-based assay and the range of test materials amenable to use in the assay. In addition, each assay was evaluated with respect to safety and risk assessment either for in-house use or for regulatory purposes.

Pearson's linear correlation between in vitro scores and a summarized Draize eye score was analysed to assess overall performance of CAM-based assays. Calculation of the MMAS Draize score was impossible because information was missing on a few specific tissue scores (cornea, area affected, and conjunctiva discharge) in most data sets from European laboratories. Instead, the ΣMMMIS, a non-weighted summarized score, could be calculated from the tissue scores of all of the data sets submitted. Two data sets permitted demonstrating that the MMAS and ΣMMMIS are significantly correlated to each other. When ΣMMMIS was used as the in vivo score to calculate Pearson's linear correlation coefficient r for in vitro/in vivo correlations, values ranging from r = 0.6 to r = 0.9 were found with six of seven data sets.

To assess in vitro/in vivo correlations between different in vitro endpoints or scores and each of the tissue scores of the Draize eye test, both Pearson's single linear regression and partial linear regression were calculated. In general, corneal opacity and inflammation of the iris showed a better correlation to in vitro data than the other reactions on the rabbit's eye. Prediction rates of the single linear regression were low. In contrast, prediction rates were significantly better with partial linear regression, assuming three subsets of data. Moreover, when the data sets were produced within a single laboratory and the test materials were restricted to a limited physicochemical spectrum, for example surfactants, prediction of tissue damage was significantly better (>95%) than with data sets covering a broad spectrum of physicochemical properties, namely in a validation trial.

Comparison of the predictive power of in vitro data generated by the various protocols of the HET-CAM and the CAMVA assay was difficult because of differences in the test materials used. However, there was no obvious difference in predictivity between three of the CAM-based assays when identical test materials from the CTFA study were compared.

Attempts of WG-2 to assign test materials to subgroups of similar chemical properties were limited by the scarce amount of information provided. However, the subgroup of surfactant-based chemicals of low eye irritant potential was predominant in most of the data sets. For this group of chemicals, biostatistical analysis revealed a quite acceptable in vitro/in vivo correlation. HET-CAM assays provided the best prediction with surfactants and surfactant-based formulations, whereas the CAMVA assay showed its best performance with alcohols.

REFERENCES

Bagley D. M., Bruner L. H., de Silva O., Cottin M., O'Brien K. A. F., Uttley M. and Walker A. P. (1992) An evaluation of five potential alternatives in vitro to the rabbit eye irritation test in vivo. Toxicology in Vitro 6, 275-284.

Bagley D. M., Rizvi P. Y., Kong B. M. and De Salva S. J. (1988) An improved CAM assay for predicting ocular irritation potential. In Progress in In Vitro Toxicology, Alternative Methods in Toxicology. Vol. 6. pp. 131-138. Edited by A. M. Goldberg. Mary Ann Liebert, New York.

Bagley D. M., Rizvi P. Y., Kong B. M. and De Salva S. J. (1991a) Factors affecting use of the hen's egg chorioallantoic membrane as a model for predicting eye irritation potential: I. Journal of Toxicology-Cutaneous and Ocular Toxicology 10, 95-104.

Bagley D. M., Rizvi P. Y., Kong B. M. and De Salva S. J. (1991b) Evaluation of the vascular components of the chorioallantoic membrane assay as a model for eye irritation potential: II. Journal of Toxicology-Cutaneous and Ocular Toxicology 10, 105-113.

Balls M., Blaauboer B., Brusick D., Frazier J., Lamb D., Pemberton M., Reinhardt C., Roberfroid M., Rosenkranz H., Schmid B., Spielmann H., Stammati A. L. and Walum E. (1990) Report and recommendations of the CAAT/ERGATT workshop on the validation of toxicity test procedures. ATLA 18, 313-337.

Balls M., Botham P. A., Bruner L. H. and Spielmann H. (1995) The EC/HO international validation study on alternatives to the Draize eye irritation test for classification and labelling of chemicals. Toxicology in Vitro 9, 871-929.

Bartnik F. G., Kästner W., Künstler K. and Sterzel W. (1988) Bewertung der lokalen Verträglichkeit von Tensiden mittels in vitro Methoden. Seifen-Öle-Fette-Wachse 114, 41-47.

Chambers W. A., Green S., Gupta K. C., Hills R. N., Huntley K., Hurley P. M., Lambert L. A., Lee C. C., Lee J. K., Liu P. T., Lowther D. K., Roberts C. D., Seabaugh V. M., Springer J. A. and Wilcox N. L. (1993) Scoring for eye irritation test. Food and Chemical Toxicology 31, 111-115.

Colquhoun D. (1971) Lectures on Biostatistics. Clarendon Press, Oxford.

Gettings S. D., Bagley D. M., Demetrulias J. L., Dipasquale L. C., Hintze K. L., Rozen M. G., Teal J. J., Weise S. L., Chudkowski M., Marenus K. D., Pape W. J. W., Roddy M. T., Schnetzinger R., Silber P. M., Glaza S. M. and Kurtz P. J. (1991) The CTFA Evaluation of Alternatives Program: An evaluation of in vitro alternatives to the Draize primary eye irritation test. (Phase I) Hydro-alcoholic formulations; (Part 2) Data analysis and biological significance. In Vitro Toxicology 4, 247-288.

Gettings S. D., Bagley D. M., Chudkowski M., Demetrulias J. L., Dipasquale L. C., Galli C. L., Gay R., Hintze K. L., Janus J., Marenus K. D., Muscatiello M. J., Pape W. J. W., Renskers K. J., Roddy M. T. and Schnetzinger R. (1992) The CTFA Evaluation of Alternatives Program: Development of potential alternatives to the Draize eye test. (Phase II) Review of materials and methods. ATLA 20, 164-171.

Gettings S. D., Dipasquale L. C., Bagley D. M., Casterton P. L., Chudkowski M., Curren R. D., Demetrulias J. L., Feder P. I., Galli C. L., Gay R., Glaza S. M., Hintze K. L., Janus J., Kurtz P. J., Lordo R. A., Marenus K. D., Moral J., Muscatiello M., Pape W. J. W., Renskers K. J., Roddy M. T. and Rozen M. G. (1994a) The CTFA Evaluation of Alternatives Program: An evaluation of in vitro alternatives to the Draize primary eye irritation test. (Phase II) Oil/water emulsions. Food and Chemical Toxicology 32, 943-976.

Gettings S. D., Hintze K. L., Bagley D. M., Casterton P. L., Chudkowski M., Curren R. D., Demetrulias J. L., Dipasquale L. C., Earl L. K., Feder P. I., Galli C. L., Gay R., Glaza S. M., Gordon V. C., Janus J., Kurtz P. J., Lordo R. A., Marenus K. D., Moral J., Pape W. J. W., Renskers K. J., Rheins L. A., Roddy M. T., Rozen M. G., Tedeschi J. P. and Zyracki J. (1994b) The CTFA Evaluation of Alternatives Program. (Phase III) Surfactant-based formulations. World Congress on Alternatives and Animal Use in the Life Sciences, Baltimore, MD, USA, Nov. 14-19, 1993. In Vitro Toxicology 7, 166.

Grimm H. and Recknagel R. D. (1985) Grundkurs Biostatistik. 1st Ed. Gustav Fischer Verlag, Jena, Germany.

IRAG (1993) Guidelines for the evaluation of eye irritation alternative tests: Criteria for data submissions. Working Group 6, Draft Report. Interagency Regulatory Alternatives Group, Washington, DC.

Kalweit S., Besoke R., Gerner I. and Spielmann H. (1990) A national validation project of alternative methods to the Draize rabbit eye test. Toxicology in Vitro 4, 702-706.

Kong B., Viau C., Rizvi P. and De Salva S. (1987) The development and evaluation of the chorioallantoic membrane (CAM) assay. In Alternative Methods in Toxicology. Vol. 5. In Vitro Toxicology, Approaches to Validation. p. 163. Edited by A. M. Goldberg. Mary Ann Liebert, New York.

Leighton J., Nassauer J. and Tchao R. (1985) The chick embryo in toxicology: an alternative to the rabbit eye. Food and Chemical Toxicology 23, 293-298.

Lozán L. (1992) Angewandte Statistik für Naturwissenschaftler. Pareys Studientexte; Nr. 74, Berlin, Hamburg.

Luepke N. P. (1985) Hen's egg chorioallantoic membrane test for irritation potential. Food and Chemical Toxicology 23, 287-291.

Luepke N. P. and Kemper F. H. (1986) HET-CAM: an alternative to the Draize eye test. Food and Chemical Toxicology 24, 495-496.

OECD (1981) Guidelines for testing of chemicals. Organisation for Economic Cooperation and Development, Paris.

OECD (1994) Guidelines for testing of chemicals. Organisation for Economic Cooperation and Development, Paris.

Spielmann H., Gerner I., Kalweit S., Moog R., Wirnsberger T., Krauser K., Kreiling R., Kreuzer H., Luepke N. P., Miltenburger H. G., Müller N., Mürmann P., Pape W., Siegemund B., Spengler J., Steiling W. and Wiebel F. (1991) Interlaboratory assessment of alternatives to the Draize eye irritation test in Germany. Toxicology in Vitro 5, 539-542.

Spielmann H., Kalweit S., Liebsch M., Wirnsberger T., Gerner I., Bertram-Neis E., Krauser K., Kreiling R., Miltenburger H. G., Pape W. and Steiling W. (1993) Validation study of alternatives to the Draize eye irritation test in Germany: cytotoxicity testing and HET-CAM test with 136 industrial chemicals. Toxicology in Vitro 7, 505-510.

Spielmann H., Liebsch M., Moldenhauer F. H. G., Holzhütter H. G. and de Silva O. (1995) Modern biostatistical methods for assessing in vitro/in vivo correlation in a validation trial on in vitro alternatives to the Draize eye test. Toxicology in Vitro 9, 549-556.


Sterzel W., Bartnik F. G., Matthies W., Kästner W. and Künstler K. (1990) Comparison of two in vitro and two in vivo methods for the measurement of irritancy. Toxicology in Vitro 4, 698-701.

Weber E. (1986) Grundriss der biologischen Statistik: Anwendung d. math. Statistik in Forschung, Lehre u. Praxis. 9th Ed. Gustav Fischer Verlag, Stuttgart.

Weil C. S. and Scala R. A. (1971) Study of intra- and interlaboratory variability in the results of rabbit eye and skin irritation tests. Toxicology and Applied Pharmacology 19, 276-360.

