
Quantification of contributions of molecular fragments for eye irritation of organic chemicals using QSAR study

Supratik Kar, Kunal Roy*

Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India

Article info

Article history: Received 5 December 2013; Accepted 24 February 2014

Keywords: Eye irritation; In silico; OECD; Molar-adjusted eye score; QSAR

Abstract

The eye irritation potential of chemicals has largely been evaluated using the Draize rabbit-eye test for a very long time. The Draize eye-irritation data on 38 compounds established by the European Center for Ecotoxicology and Toxicology of Chemicals (ECETOC) has been used in the present quantitative structure–activity relationship (QSAR) analysis in order to predict molar-adjusted eye scores (MES) and determine possible structural requisites and attributes that are primarily responsible for the eye irritation caused by the studied solutes. The developed model was rigorously validated internally as well as externally by applying principles of the Organization for Economic Cooperation and Development (OECD). The test for applicability domain was also carried out in order to check the reliability of the predictions. Important fragments contributing to higher MES values of the solutes were identified through critical analysis and interpretation of the developed model. Considering all the identified structural attributes, one can choose or design safe solutes with low eye irritant properties. The presented approach suggests a model for use in the context of virtual screening of relevant solute libraries. The developed QSAR model can be used to predict existing as well as future chemicals falling within the applicability domain of the model in order to reduce the use of animals.

© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

Eye irritation is one of the major ocular toxicities due to air pollutants, chemicals and pharmaceuticals [1]. In order to evaluate the eye irritation potential of chemicals, the Draize in vivo eye irritation test has been used as a standard testing protocol for a long time [2]. In this test, sample chemicals are applied into the lower conjunctival cul-de-sac of rabbit eyes, and the ocular responses are scored based on damage to the cornea, iris, and conjunctiva. The tissue grades are combined into a weighted score and the highest average score across different test animals on various days is termed the maximum average score (MAS).

The Draize test has been used by various regulatory agencies and pharmaceutical companies in order to evaluate the ocular toxicity of chemicals worldwide [3]. The United Nations Globally Harmonized System (GHS) [4], the U.S. Environmental Protection Agency (U.S. EPA) classification system [5], and the European Union (EU) classification system [6] are three major regulatory criteria used for ocular hazard classification based on Draize test results. The in vivo rabbit eye irritation test has frequently been criticized for its cruelty, and there are some common disadvantages to animal tests; for example, they are expensive and time-consuming. At the same time, there is increasing pressure from social and economic forces to halt the use of large numbers of animals and find alternative methods to evaluate chemical ocular toxicity.

Efforts have been made to develop alternative in vitro methods to reproduce and predict eye irritation potential. The National Toxicology Program (NTP) Interagency Center for the Evaluation of Alternative Toxicological Methods (NICEATM) and the Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM) utilized different in vitro methods in order to evaluate ocular toxicity [7]. In vitro methods may be a useful alternative to the in vivo test but none of them is sufficiently validated to replace the in vivo test completely, as the currently available in vitro assays have several limitations. Their primary shortcoming is that they all require physical samples of compounds for testing and, in spite of noteworthy technical advances in the ICCVAM ocular toxicity assay, they still require eye tissues in the assay, which is time-consuming and costly [8].

Quantitative structure–activity relationship (QSAR) studies can be utilized to predict eye irritation potential as an alternative in silico method, just as they have been used successfully to predict several other toxicological endpoints for some time [9]. Compared to in vivo and in vitro studies, computational tools can be used to predict potential chemical toxicity using less money and time, and are also applicable to virtual compounds before they are even synthesized. Unfortunately, few studies have been performed to model the eye irritation potential of chemicals. Patlewicz et al. [10] demonstrated that a lack of good quality in vivo data hindered the development of QSAR models for the prediction of eye irritation. Again, in vivo results for local health endpoints like eye irritation can be inconsistent and imprecise. With respect to biological assays, there is a prominent complexity in terms of mechanisms of toxic action and some mechanisms have yet to be elucidated. For skin and eye irritation there is presently no validated replacement for the Draize test, as suggested by Patlewicz et al. [10]. Barratt et al. [11] developed a QSAR model of eye irritation potential for neutral organic chemicals using an octanol/water partition coefficient and dipole moment. Abraham et al. [12] performed a QSAR analysis using the Draize eye irritation database. Significant membrane-interaction QSAR (MI-QSAR) models for eye irritation potential were constructed based on the European Center for Ecotoxicology and Toxicology of Chemicals (ECETOC) data set of 38 compounds [13,14]. Solimeo et al. [15] developed QSAR models for small molecules using animal eye toxicity data compiled by the National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods. In that study, the authors used a composite score from three different regulatory agencies to classify all of the compounds into ocular toxicants and non-toxicants based on Draize test results. The most significant result of that study was the demonstration of the superior performance of the consensus modeling approach in predicting eye toxicity.

Contents lists available at ScienceDirect

Computers in Biology and Medicine

journal homepage: www.elsevier.com/locate/cbm

http://dx.doi.org/10.1016/j.compbiomed.2014.02.014
0010-4825 © 2014 Elsevier Ltd. All rights reserved.

* Correspondence to: Manchester Institute of Biotechnology, Manchester M1 7DN, United Kingdom. Tel.: +91 98315 94140; fax: +91 33 2837 1078.

E-mail addresses: [email protected], [email protected], [email protected] (K. Roy).

URL: http://sites.google.com/site/kunalroyindia/ (K. Roy).

Computers in Biology and Medicine 48 (2014) 102–108

In the present study, the ECETOC data set [16] has been used to develop a QSAR model for eye irritation potential. The reason behind undertaking this study was to explore the structural fragments and attributes responsible for eye irritation and construct a reproducible QSAR model for regulatory agencies and pharmaceutical companies. Our model has also been compared with a previously reported model using the same data set, considering statistical validation criteria as well as with respect to mechanistic interpretation. The presented model provides rich information in the context of virtual screening of relevant chemical libraries to predict eye irritation toxicity.

2. Materials and methods

2.1. The dataset

The European Center for Ecotoxicology and Toxicology of Chemicals (ECETOC) [16] has established a standard data set for the eye irritation potential of 38 chemicals whose Draize rabbit eye-irritation scores have been measured in compliance with OECD Guideline 405 [17]. A total of 38 compounds were used to develop a statistically robust as well as interpretable QSAR model for eye irritants. The dependent variable for eye irritation potential is the molar-adjusted eye score (MES) from the Draize rabbit eye irritation test. In QSAR studies, responses are expressed on the molar scale. So, the molarity of the solution was calculated using

$$\text{Molarity} = \frac{\text{Density} \times 1000}{\text{relative molecular mass}}$$

The MES values were then calculated as the raw eye irritation scores (Draize MAS value) divided by the molarity of the solution.
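The two-step conversion from a raw Draize score to an MES value can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above: the function names are hypothetical, the density is assumed to be in g/mL, and the numbers in the usage lines are invented rather than taken from the ECETOC data.

```python
def molarity(density_g_per_ml: float, relative_molecular_mass: float) -> float:
    """Molarity of the neat liquid: (density * 1000) / relative molecular mass."""
    return density_g_per_ml * 1000.0 / relative_molecular_mass


def molar_adjusted_eye_score(mas: float, density_g_per_ml: float,
                             relative_molecular_mass: float) -> float:
    """MES = raw Draize MAS value divided by the molarity."""
    return mas / molarity(density_g_per_ml, relative_molecular_mass)


# Invented inputs (not ECETOC values): density 0.80 g/mL, molecular mass 100, MAS 20.
example_molarity = molarity(0.80, 100.0)
example_mes = molar_adjusted_eye_score(20.0, 0.80, 100.0)
print(example_molarity, example_mes)
```

Keeping the molarity step separate makes it easy to check the unit assumption on the density before any score is adjusted.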

2.2. Descriptor calculation

The conformational analysis of the molecules was carried out using the optimal search method available in Cerius2 version 4.10 software [18]; descriptors belonging to various categories calculated for the present study included: (i) spatial, (ii) topological, (iii) thermodynamic, (iv) electronic, (v) structural parameters and (vi) E-state parameters. Easily interpretable constitutional and functional group count descriptors were calculated using Dragon 6 software [19]. Additionally, the extended topochemical atom indices (ETA descriptors) developed by Roy and co-workers [20] were also calculated for the present work using the PaDEL-Descriptor version 2.11 software [21]. Only those descriptors that can demonstrate the structural attributes of the molecules with a clear physical meaning were used in the final model development, in order to comply with the OECD principle. These choices were made manually based on the knowledge of previous QSAR studies. Note that, during development of the QSAR model, we used the whole pool of calculated descriptors in order to identify the best descriptors for our model using the genetic method (vide infra).

2.3. Dataset splitting

Selection of the training and test sets plays a crucial role in the construction of a statistically significant QSAR model. The selection should be such that the test set molecules lie within the chemical space occupied by the training set molecules. In this study, all the molecules were first sorted based on their MES values, and then every third compound was placed in the test set. As a result, two-thirds of the total compounds were incorporated into the training set ($N_{Training} = 26$) and the rest were incorporated into the test set ($N_{Test} = 12$). This method ensures uniform selection of test set molecules covering the entire range of the activity space of the total dataset. As the data set was divided into training and test sets based on toxicity profiles, a principal component analysis (PCA) score plot was constructed based on the structural descriptors to check whether the data set was appropriately divided considering the structural features. The PCA score plot strongly suggested that the test compounds were located in close proximity to the training set compounds. The PCA score plot is presented in the Supplementary material as Fig. S1. Structural features were also considered in order to perform a diversity validation test by the Euclidean distance [22] method to prove that the distribution of the training and test set compounds was quite good. A scatter plot presented in the Supplementary material as Fig. S2 shows that not a single test compound fell outside of the mean normalized distance of the training set compounds.
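The activity-sorted split described above can be sketched as follows. This is a minimal illustration, not the authors' script: the function name is hypothetical, the choice of which position starts the "every third" pattern is an assumption (the text does not say), and the response values are synthetic.

```python
def split_every_third(responses):
    """Sort compound indices by response and send every third one to the test set.

    Assumption: the 3rd, 6th, 9th, ... compounds of the sorted list form the
    test set; the text only says "every third compound".
    """
    order = sorted(range(len(responses)), key=lambda i: responses[i])
    test = order[2::3]
    test_lookup = set(test)
    train = [i for i in order if i not in test_lookup]
    return train, test


# Synthetic responses for 38 compounds (placeholders, not MES data).
toy_responses = [0.1 * i for i in range(38)]
train_idx, test_idx = split_every_third(toy_responses)
print(len(train_idx), len(test_idx))
```

With 38 compounds this rule yields 26 training and 12 test compounds, the same sizes reported in the text.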

2.4. Model development

The descriptor-based QSAR models were built using two different chemometric tools: stepwise multiple linear regression (stepwise MLR) [23] and genetic partial least squares (G/PLS) [24].

2.5. Software

Software tools like STATISTICA 7.0 [25], SPSS 9.0 [26] and MINITAB 14 [27] were used in the present study to develop the in silico models.

2.6. Validation metrics

Different statistical metrics were employed to ensure the fitness of the in silico models, and internal, external and overall validation methodologies were subsequently employed for model validation. The goodness-of-fit of the equation was judged by the quality metric determination coefficient ($R^2$), as well as using the following internal validation metric: the leave-one-out cross-validation parameter, $Q^2_{LOO}$, and the external validation metric, $R^2_{pred}$. The $r_m^2$ metrics [28], namely $r_m^2$ and $\Delta r_m^2$, developed by the present group of authors for internal, external and overall validation of models were also employed for the present work. The calculation of the $r_m^2$ metrics for the test set data ($r^2_{m(test)}$, $\Delta r^2_{m(test)}$) estimated the closeness between the values of the predicted and the corresponding observed activity data. The value of $r^2_{m(test)}$ should be greater than 0.5 and the value of $\Delta r^2_{m(test)}$ should be less than 0.2, as suggested by Roy et al. [29]. Similarly, the $r^2_{m(LOO)}$ and $\Delta r^2_{m(LOO)}$ parameters were used for the training set, and $r^2_{m(overall)}$ and $\Delta r^2_{m(overall)}$ were used for the overall set [29]. The models were also subjected to additional validation tests like $Q^2_{ext(F2)}$ [30] and Golbraikh and Tropsha's [31] criteria to check each model's reliability.

The robustness of the models was checked based on the Y-randomization technique. For a robust model, the determination coefficient ($R^2$) of the non-random model should exceed the squared average correlation coefficient of the randomized models ($R_r^2$). The model randomization was performed at a 99% confidence level followed by calculation of the $^cR_p^2$ parameter [32], which penalizes model $R^2$ for small differences in the values of $R^2$ and $R_r^2$:

$$^cR_p^2 = R \times \sqrt{R^2 - R_r^2} \qquad (1)$$

For an acceptable model, the value of $^cR_p^2$ should be greater than 0.5.

The best model was also scanned under the chance correlation test using the Y-scrambling approach suggested by Eriksson et al. [33].
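Eq. (1) is simple enough to compute directly. The sketch below is illustrative only: the function name is hypothetical, and the inputs in the usage line are invented values chosen to show the penalty, not results from this study.

```python
import math


def c_rp2(r2_model: float, r2_randomized: float) -> float:
    """cRp^2 = R * sqrt(R^2 - Rr^2), following Eq. (1).

    r2_model      -- R^2 of the non-random model
    r2_randomized -- squared average correlation coefficient of the randomized models
    """
    if r2_randomized > r2_model:
        raise ValueError("randomized models should not fit better than the real model")
    return math.sqrt(r2_model) * math.sqrt(r2_model - r2_randomized)


# Invented inputs: a model with R^2 = 0.90 whose randomized runs average R_r^2 = 0.20.
example_value = c_rp2(0.90, 0.20)
print(example_value > 0.5)
```

The closer the randomized fit gets to the real fit, the smaller the square-root term becomes, which is how the metric penalizes chance correlation.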

2.7. Test for applicability domain (AD)

According to OECD principle 3, an acceptable QSAR model should possess a defined AD, which represents the chemical space defined by the structural information extracted from the chemicals used in model development, i.e., the training set compounds in a QSAR analysis. Here, the applicability domain of the QSAR model was checked using the leverage approach [34].
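The text does not spell out how the leverage threshold is set. A commonly used convention, assumed here, is a warning leverage of $h^* = 3(p + 1)/n$ for $p$ descriptors and $n$ training compounds, combined with a ±2.5 limit on standardized residuals in a Williams plot. The sketch below uses hypothetical function names and a hypothetical compound; with four descriptors and 26 training compounds the assumed formula gives about 0.577, which is the critical value reported in the Results.

```python
def critical_leverage(n_descriptors: int, n_training: int) -> float:
    """Warning leverage h* = 3(p + 1)/n (assumed convention, not stated in the text)."""
    return 3.0 * (n_descriptors + 1) / n_training


def williams_status(leverage: float, std_residual: float,
                    h_star: float, residual_limit: float = 2.5) -> str:
    """Classify one compound by its position on a Williams plot."""
    high_leverage = leverage > h_star
    large_residual = abs(std_residual) > residual_limit
    if high_leverage and large_residual:
        return "X and Y outlier"
    if high_leverage:
        return "X outlier (outside AD)"
    if large_residual:
        return "Y outlier"
    return "inside AD"


h_star = critical_leverage(n_descriptors=4, n_training=26)
print(round(h_star, 3))  # 0.577
print(williams_status(0.70, 1.0, h_star))  # hypothetical compound with high leverage
```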

3. Results

The developed QSAR models were examined to quantify the contributions of structural fragments from the studied molecules for their eye irritation potential. The statistical results of the developed QSAR models were analyzed using various quality and validation metrics. The GFA-spline model, being the most satisfactory, is explained further here:

$$\mathrm{MES} = -6.938 + 6.012 \times (\text{Atype-H-50}) - 102.808 \times \langle \Delta\varepsilon_D - 0.0644 \rangle - 0.570 \times (\mathrm{HOMO}) + 0.070 \times (\text{Jurs-RNCS}) \qquad (2)$$

$n_{Training} = 26$; $R^2 = 0.917$; $R^2_a = 0.901$; $Q^2_{LOO} = 0.798$; $r^2_{m(LOO)Scaled} = 0.749$; $\Delta r^2_{m(LOO)Scaled} = 0.048$

$n_{Test} = 12$; $Q^2_{F1} = R^2_{pred} = 0.807$; $Q^2_{F2} = 0.806$; $r^2_{m(test)Scaled} = 0.765$; $\Delta r^2_{m(test)Scaled} = 0.001$

$r^2_{m(overall)Scaled} = 0.754$; $\Delta r^2_{m(overall)Scaled} = 0.033$

Here, $n_{Training}$ and $n_{Test}$ refer to the number of compounds in the training and test sets, respectively. The metric $R^2$ (0.917) is the determination coefficient used to judge the goodness-of-fit. The predictive potential of the developed model in terms of internal and external validation tests is reflected in the acceptable values of the $Q^2_{LOO}$ (0.798) and $R^2_{pred}$ (0.807) metrics, respectively. The value of the $\Delta r_m^2$ parameter should be less than 0.2, and the values of the remaining parameters should be greater than 0.5. All the internal and external validation metrics for the developed model lie within acceptable limits. Further, satisfactory values for all the $r_m^2$ metrics indicate small deviations of the predicted activity data from the corresponding experimental observations. Nearly identical values for the $R^2_{pred}$ (0.807) and $Q^2_{ext(F2)}$ (0.806) metrics indicate that the test set selected for the QSAR model development had a similar distribution of responses as the training set. The scatter plot of observed versus calculated/predicted MES values of the training/test set compounds is presented in Fig. 1. The plot shows that most points were close to the line of fit, which again indicated the good quality of the developed model. The observed and calculated/predicted MES values of the 38 chemicals are reported in Table 1. The model was further subjected to leave-many-out (leave-10%-, 25%- and 50%-out) cross-validation to check its robustness. The results are as follows: $R^2_{leave\text{-}10\%\text{-}out} = 0.92$, $Q^2_{leave\text{-}10\%\text{-}out} = 0.79$, $R^2_{leave\text{-}25\%\text{-}out} = 0.92$, $Q^2_{leave\text{-}25\%\text{-}out} = 0.78$, $R^2_{leave\text{-}50\%\text{-}out} = 0.92$ and $Q^2_{leave\text{-}50\%\text{-}out} = 0.70$.

The external predictivity of the model was further judged by Golbraikh and Tropsha's criteria. For the GFA spline model, these statistical parameters yielded the following results, which are highly acceptable:

$$Q^2 = 0.798 > 0.5$$

$$r^2 = 0.815 > 0.6$$

$$\left| r_0^2 - r_0'^2 \right| = 0.0001 < 0.3$$

$$\frac{r^2 - r_0^2}{r^2} = 0.005 < 0.1 \quad \text{or} \quad \frac{r^2 - r_0'^2}{r^2} = 0.005 < 0.1$$

$$0.85 \le k = 0.943 \le 1.15 \quad \text{or} \quad 0.85 \le k' = 0.947 \le 1.15$$
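The five conditions listed above can be checked programmatically. The sketch below is an illustration with a hypothetical function name. The values of $r_0^2$ and $r_0'^2$ are not reported individually in the text, so the ones used in the example are placeholders chosen only to be roughly consistent with the reported ratio of 0.005.

```python
def golbraikh_tropsha_checks(q2, r2, r0_sq, r0_prime_sq, k, k_prime):
    """Return each Golbraikh-Tropsha condition with a pass/fail flag."""
    return {
        "Q2 > 0.5": q2 > 0.5,
        "r2 > 0.6": r2 > 0.6,
        "|r0^2 - r0'^2| < 0.3": abs(r0_sq - r0_prime_sq) < 0.3,
        "(r2 - r0^2)/r2 < 0.1 (either line)": (
            (r2 - r0_sq) / r2 < 0.1 or (r2 - r0_prime_sq) / r2 < 0.1
        ),
        "0.85 <= k <= 1.15 (either slope)": (
            0.85 <= k <= 1.15 or 0.85 <= k_prime <= 1.15
        ),
    }


# Q2, r2, k and k' are the reported values; r0_sq and r0_prime_sq are placeholders.
checks = golbraikh_tropsha_checks(
    q2=0.798, r2=0.815, r0_sq=0.811, r0_prime_sq=0.811, k=0.943, k_prime=0.947
)
for name, passed in checks.items():
    print(name, passed)
```

Returning a dictionary rather than a single boolean shows which condition fails when a model does not meet the criteria.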

The value of $^cR_p^2$ (model randomization: 0.726) calculated based on the randomization results ($R_r = 0.397$) was much higher than the threshold value of 0.5, ensuring that the model was not the outcome of mere chance alone. Again, Y-scrambling was done 100 times according to Eriksson et al. [33] and the obtained result is highly acceptable ($R^2_Y = 0.28$ and $Q^2_Y = 0.01$).

The descriptors appearing in Eq. (2) obey the following order of significance based on their standardized coefficients: (a) Atype-H-50, (b) $\langle \Delta\varepsilon_D - 0.0644 \rangle$, (c) HOMO and (d) Jurs-RNCS, and the standardized coefficient values are 3.86, −3.09, −1.54 and 0.68, respectively.

Fig. 1. Scatter plot of observed versus calculated/predicted MES values of the training/test set compounds.

The descriptor Atype-H-50 refers to an atom-centered fragment descriptor, which signifies the hydrophobicity of H attached to a heteroatom. This descriptor actually indicates a positive contribution of the hydrogen bond donor feature. According to the GFA spline equation, Atype-H-50 is the most significant descriptor among the descriptors present in the equation. As its coefficient is positive, a high value of the Atype-H-50 descriptor increases the predicted toxicity. Compounds like 12, 26, 29 and 30 show higher toxicity profiles due to their high values of the Atype-H-50 descriptor. In contrast, although compound number 25 has two Atype-H-50 fragments, it shows reduced toxicity, as the corresponding value of $\Delta\varepsilon_D$ is the highest among all the training set compounds (vide infra).

The second important descriptor ($\Delta\varepsilon_D$) belongs to the class of ETA descriptors. This is again a measure of the contribution of hydrogen bond donor atoms [35]. In a hydrogen bond donor fragment, a hydrogen atom is bound to a highly electronegative atom such as nitrogen, oxygen or fluorine. Thus, the contributions of nitrogen, oxygen and fluorine atoms are considered by this descriptor. The descriptor $\Delta\varepsilon_D$ is the difference between $\varepsilon_2$ and $\varepsilon_5$, where

$$\varepsilon_2 = \frac{\sum \varepsilon_{EH}}{N_V}, \qquad \varepsilon_5 = \frac{\sum \varepsilon_{EH} + \sum \varepsilon_{XH}}{N_V + N_{XH}}$$

The descriptor $\varepsilon$ is an indicator of the electronegativity of atoms, $\sum \varepsilon_{EH}$ is the sum of the electronegativity measure without considering hydrogens, $\sum \varepsilon_{XH}$ is the sum of the electronegativity measure for hydrogen atoms attached to heteroatoms of a particular compound, $N_V$ is the vertex count excluding hydrogens and $N_{XH}$ is the number of hydrogens connected to a heteroatom.
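The definition above can be turned into a small calculation. The sketch below is illustrative: the function name is hypothetical, the subtraction order $\varepsilon_2 - \varepsilon_5$ is an assumption (the text only says "difference between"), and the inputs are invented numbers rather than real per-atom ETA values.

```python
def delta_epsilon_d(sum_eps_eh: float, sum_eps_xh: float,
                    n_v: int, n_xh: int) -> float:
    """Delta-epsilon-D computed as epsilon_2 - epsilon_5 (subtraction order assumed).

    sum_eps_eh -- sum of the electronegativity measure, hydrogens excluded
    sum_eps_xh -- sum of the measure for hydrogens attached to heteroatoms
    n_v        -- vertex count excluding hydrogens
    n_xh       -- number of hydrogens connected to a heteroatom
    """
    eps_2 = sum_eps_eh / n_v
    eps_5 = (sum_eps_eh + sum_eps_xh) / (n_v + n_xh)
    return eps_2 - eps_5


# Invented inputs: five heavy atoms and one heteroatom-bound hydrogen.
example_delta = delta_epsilon_d(sum_eps_eh=3.0, sum_eps_xh=0.2, n_v=5, n_xh=1)
print(example_delta)
```

When a molecule has no hydrogens on heteroatoms, the two averages coincide and the descriptor is zero, consistent with its role as a hydrogen bond donor measure.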

A negative coefficient for the spline term $\langle \Delta\varepsilon_D - 0.0644 \rangle$ indicates that the toxicity of the molecules increases with a decrease in the value of the spline term. The spline function $\langle \Delta\varepsilon_D - 0.0644 \rangle$ exerts zero contribution for values of the $\Delta\varepsilon_D$ parameter less than that of the knot of the spline, i.e., 0.064, since a negative value within a spline term denotes zero contribution of the corresponding descriptor. When the value of the $\Delta\varepsilon_D$ parameter is greater than 0.064, the toxic potency of the compound decreases, as the spline term has a positive value and a negative coefficient. Thus, the contribution ($\Delta\varepsilon_D$) of hydrogen bond donor atoms in a molecule should be greater than 0.064 in order to exert a lower toxicity profile. Note that the term $\Delta\varepsilon_D$ balances the Atype-H-50 term in Eq. (2); the low toxicity of compound 25 can be easily explained by taking these two descriptors into consideration.
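The truncated behaviour of the spline term is easiest to see in code. The sketch below evaluates Eq. (2) with the reported coefficients and treats the spline as max(0, value − knot), as the paragraph above describes. The function names and all descriptor values in the usage lines are hypothetical, and the units of HOMO and Jurs-RNCS are assumed to match those of the original descriptor software.

```python
KNOT = 0.0644  # knot of the spline term in Eq. (2)


def spline(value: float, knot: float = KNOT) -> float:
    """Truncated spline term: zero below the knot, (value - knot) above it."""
    return max(0.0, value - knot)


def predict_mes(atype_h_50: float, delta_eps_d: float,
                homo: float, jurs_rncs: float) -> float:
    """Evaluate Eq. (2) using the coefficients reported in the text."""
    return (-6.938
            + 6.012 * atype_h_50
            - 102.808 * spline(delta_eps_d)
            - 0.570 * homo
            + 0.070 * jurs_rncs)


# Hypothetical descriptor values, used only to show the effect of the spline term.
below_knot = predict_mes(atype_h_50=1.0, delta_eps_d=0.05, homo=-10.0, jurs_rncs=5.0)
above_knot = predict_mes(atype_h_50=1.0, delta_eps_d=0.10, homo=-10.0, jurs_rncs=5.0)
print(below_knot > above_knot)
```

Holding the other descriptors fixed, raising $\Delta\varepsilon_D$ above the knot lowers the predicted MES, while changes below the knot have no effect.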

The third important descriptor is HOMO, referring to the highest occupied molecular orbital energy, which is crucially important in governing molecular reactivity and properties. The HOMO energy refers to the ability of the molecules to donate electrons during bond formation and thus is a measure of the nucleophilicity of the molecules. Since the value of the HOMO descriptor of a compound bears a negative value, a negative coefficient for the descriptor in Eq. (2) denotes an increase in the eye irritation potential of molecules with higher negative values of the HOMO descriptor. This is very evident in the cases of compound nos. 26, 29 and 30, which bear high negative values for the HOMO descriptor and thereby exhibit higher MES values. Again, an increase in the negative value of the HOMO descriptor refers to an increase in the nucleophilicity of the molecule with a subsequent decrease in its electrophilic nature. Thus, a chemical should be less nucleophilic in order to exert reduced eye irritation.

Table 1. The observed and calculated/predicted MES values for the ECETOC dataset using Eq. (2).

| ID No. | Chemical name | Observed MES | Calculated/predicted MES |
|---|---|---|---|
| Training set compounds | | | |
| 1 | 3-Methyl hexane | 0.10 | 1.04 |
| 2 | 2-Methyl pentane | 0.26 | 1.21 |
| 6 | 1,5-Hexadiene | 0.55 | 1.07 |
| 7 | cis-Cyclooctene | 0.43 | 0.40 |
| 8 | 1,5-Dimethylcyclooctadiene | 0.44 | −0.26 |
| 9 | 4-Bromophenol | 0.19 | 0.46 |
| 10 | 2,4-Difluoronitrobenzene | 0.40 | 1.49 |
| 11 | 3-Ethyltoluene | 0.32 | 0.23 |
| 12 | 4-Fluoroaniline | 6.62 | 6.50 |
| 13 | Xylene | 1.10 | 0.02 |
| 14 | Toluene | 0.96 | 0.49 |
| 15 | Styrene | 0.77 | 0.42 |
| 17 | 1,3-Di-isopropylbenzene | 0.38 | 0.23 |
| 18 | Methyl amyl ketone | 2.26 | 2.06 |
| 19 | Methyl isobutyl ketone | 0.59 | 2.20 |
| 20 | Methyl ethyl ketone | 4.48 | 2.99 |
| 22 | n-Butanol | 5.47 | 6.05 |
| 24 | Isopropanol | 2.34 | 3.14 |
| 25 | Propylene glycol | 0.10 | −0.24 |
| 26 | 2-Ethyl-1-hexanol | 7.82 | 7.57 |
| 29 | Butyl cellosolve | 8.99 | 7.81 |
| 30 | Cyclohexanol | 8.29 | 7.76 |
| 31 | Ethyl acetate | 1.47 | 1.34 |
| 32 | Methyl acetate | 3.14 | 1.72 |
| 33 | Methyl trimethyl acetate | 0.36 | 0.81 |
| 38 | 2,2-Dimethylbutanoic acid | 5.59 | 7.03 |
| Test set compounds | | | |
| 3 | Methylcyclopentane | 0.41 | 1.31 |
| 4 | 1,9-Decadiene | 0.37 | 0.68 |
| 5 | Dodecane | 0.45 | 0.42 |
| 16 | 1-Methylpropylbenzene | 0.31 | 0.31 |
| 21 | Acetone | 4.83 | 3.66 |
| 34 | Ethyl trimethyl acetate | 0.63 | 0.73 |
| 36 | n-Butyl acetate | 0.99 | 1.18 |
| 37 | Ethyl-2-methylacetoacetate | 2.55 | 0.82 |
| 28 | Hexanol | 8.13 | 8.26 |
| 35 | Cellosolve acetate | 2.03 | 5.05 |
| 23 | Isobutanol | 6.44 | 5.60 |
| 27 | Glycerol | 0.12 | −0.72 |

Fig. 2. Schematic diagram of the essential structural fragments and attributes required to make a low eye irritant chemical.

Fig. 3. Williams plot for the best regression model.

The descriptor Jurs-RNCS is the relative negative-charge surface area. It is the solvent-accessible surface area of the most negatively charged atom multiplied by the relative negative charge (RNCG), i.e.,

$$\mathrm{RNCS} = SA^{-}_{\max} \times \mathrm{RNCG}$$

A positive coefficient for this descriptor in Eq. (2) indicates that the eye irritation potential of a chemical increases with an increase in the value of the descriptor. The descriptor thus signifies that an increase in the charge or surface area of the most negatively charged atoms within the molecules increases their toxicity profile (eye irritation). This observation is valid for most of the toxic compounds, like compound nos. 22, 29 and 30. Considering all the structural requirements and attributes, one can design molecules with reduced potency as eye irritants. A schematic diagram is presented in Fig. 2 to summarize the major findings. The descriptors HOMO and RNCS indicate that compounds need to be less nucleophilic in order to be low eye irritant chemicals.

The applicability domain of the GFA spline model was analyzed using the leverage approach. The leverage approach showed a critical HAT diagonal value ($h^*$) of 0.577. We used ±2.5σ (standard deviation units) of the predicted residual values to define the applicability domain of the model. The Williams plot is presented in Fig. 3. All the training set compounds bore values of standardized residuals within the limit of ±2.5σ, indicating that none of them was a prediction outlier. However, as the leverage values of compounds 9 and 25 in the training set are greater than the critical HAT value of 0.577 ($h > h^*$), these compounds behave as influential observations (X outliers) although they are not response outliers (not Y outliers). Out of the 12 test set compounds, 10 were found to be within the applicability domain. Chemical 27, although its standardized residual was within ±2.5σ, was completely outside the AD of the model as defined by the vertical HAT line (high leverage value $h$). In contrast, chemical 35 is a response outlier (Y outlier) and was incorrectly predicted because the standardized residual value of this chemical is greater than ±2.5σ. So, we could confidently predict 83.33% of the test set compounds based on the developed model.

4. Comparison with a previous model

Kulkarni et al. [14] reported an MI-QSAR model using different combinations of physicochemical parameters associated with solute molecules along with physicochemical parameters of their receptors and parameters that described the physicochemical interaction between them using the same dataset. In contrast, the present work reveals important structural requisites responsible for the eye irritation toxicity profiles of the studied molecules using different classes of descriptors (structural and physicochemical properties of the chemicals without considering intermolecular solute–membrane interaction properties). Since the dataset provides a wide coverage of different solutes, it affords an excellent scope for checking the external predictive ability of the developed model. In our study, the test set was utilized to check the external predictivity, justifying the reliability of the developed QSAR models. Unlike the present work, Kulkarni et al. performed neither an external validation to check the predictive capability of their model nor a randomization test to check the chance correlation of their model. Though the dataset used is the same, the basis of the descriptor computation is totally different for these two QSAR models (i.e., the models developed by us and Kulkarni et al.). So, we have confined our comparison of the models only to statistical qualities. A detailed statistical comparison of the previous model with the model developed in this study is shown in Table 2.

The fact that properties known to mediate eye toxicity are also obtained from our model reflects the good mechanistic and statistical interpretability of the model developed here. The utility of QSAR models also lies in their ability to predict the response properties of new untested chemicals. From the social and economic point of view, in silico models like QSAR play an important role in filling data gaps for untested chemicals and also support the 3Rs [36] (replacement, refinement and reduction of animals in research) and REACH policies [37].
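The randomization test mentioned in this section checks for chance correlation by refitting the model after shuffling the response values. The following sketch illustrates the idea for a single-descriptor linear fit, which is a deliberate simplification: the model in this paper is a four-descriptor GFA spline model. The cR²p statistic is computed with the commonly used definition cR²p = R·sqrt(R² − R²r), where R²r is the mean R² of the randomized models. This formula, the number of trials, and all function names are assumptions that are not given in this section.

```python
import random
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation, equal to R^2 of a one-descriptor linear fit."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def y_randomization(x, y, n_trials=100, seed=0):
    """Return (R^2, mean randomized R^2, cR2p) for a one-descriptor model."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    r2 = r_squared(x, y)
    shuffled = list(y)
    random_r2 = []
    for _ in range(n_trials):
        rng.shuffle(shuffled)          # break the descriptor-response link
        random_r2.append(r_squared(x, shuffled))
    r2_rand = mean(random_r2)
    # Assumed definition: cR2p = R * sqrt(R^2 - R^2_r), floored at zero.
    c_rp2 = (r2 ** 0.5) * (max(r2 - r2_rand, 0.0) ** 0.5)
    return r2, r2_rand, c_rp2
```

A model that is not the result of chance correlation should give a mean randomized R² well below the R² of the real model, and therefore a cR²p close to R².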

5. Conclusions

The QSAR approach has been successfully applied here to model a very complex biological endpoint. An adequate and robust regression-based model was established for 38 structurally diverse solutes. The success of the present study was in developing a model to predict their eye irritant profile and identify the major contributing features in the response. The developed regression model was applicable to diverse classes of solutes, and the efficiency of the model in predicting the eye irritant toxicity of new solutes was adequately validated according to OECD guidelines. The major goals and findings are summarized below:

1. Taking the regression model into consideration, we can conclude that to synthesize a chemical with low potential as an eye irritant, one should focus on the following points: (a) the number of hydrogen bond donor features relative to molecular size should be optimal, (b) the chemicals should be less nucleophilic in nature, and (c) the charge and surface area of the most negatively charged atoms in the molecule should be minimal. Considering all these points, one can design suitable solutes having reduced eye irritant profiles.

2. Here, we have used simple descriptors to make the calculation easy and reproducible as well as time effective, and these descriptors can be used for solutes with diverse structures.

Table 2. Comparison of the present GFA-spline model with the previously developed non-linear QSAR model by Kulkarni et al. [13] on the same data set.

Metric                    Kulkarni et al. [13]    Present study
Number of descriptors     5                       4
R²                        0.78                    0.81
Q²                        0.73                    0.76
SEE                       –                       0.20
r²m(LOO)                  –                       0.67
Δr²m(LOO)                 –                       0.11
r²(test)                  –                       0.87
R²pred                    –                       0.88
SEP                       –                       0.12
r²m(test)                 –                       0.79
Δr²m(test)                –                       0.12
Q²F2                      –                       0.87
GTC^a                     –                       Passed
r²m(overall)              –                       0.68
Δr²m(overall)             –                       0.12
cR²p                      –                       0.81

^a Golbraikh and Tropsha's criteria.

S. Kar, K. Roy / Computers in Biology and Medicine 48 (2014) 102–108

3. The overall goal of this study was to exhibit the potential benefits of using cheminformatic approaches such as QSAR modeling to obtain predictive knowledge for solutes that affect the eye, and utilize this knowledge to improve the experimental design of solutes and enable their prioritization for in vivo testing.

4. Considering statistical quality and interpretability, the regression-based QSAR model reported here is highly satisfactory in comparison to a previously developed model [13]. A comparison of the present GFA-spline model with the previously developed non-linear model by Kulkarni et al. [13] is presented in Table 2. Here, we have tried to focus as much as possible on the mechanistic interpretation, obeying OECD principle 5, and the reported model is also statistically sound. In other words, we have given importance here to the transferability and reproducibility of the models. We think that herein lies the real applicability of any QSAR report (transferability of a QSAR model), and this is in consonance with the spirit of the OECD guidelines on QSAR model development.

5. The developed QSAR model provides useful information in the context of virtual screening of open or commercial solute libraries.

There are a very limited number of QSAR models with an eye irritation endpoint, and many of them are just classification models. We have tried here to develop a QSAR equation that is statistically robust based on internal, external and overall validation approaches. We strongly believe that in silico models cannot replace experimental studies, but they may help the experimental scientist throw more light on the design of molecules with less ocular toxicity by considering the points concluded by the QSAR model.

Conflicts of interest statement

The authors declare no conflict of interest.

Acknowledgments

SK thanks the Department of Science and Technology, Government of India for awarding him a Research fellowship under the INSPIRE scheme. KR thanks the Council of Scientific and Industrial Research (CSIR), New Delhi for awarding a major research project.

Appendix A. Supplementary material

Supplementary data associated with this article can be found in the online version at http://dx.doi.org/10.1016/j.compbiomed.2014.02.014.

References

[1] J.E. Cometto-Muniz, W.S. Cain, H.K. Hudnell, Agonistic sensory effects of airborne chemicals in mixtures: odor, nasal pungency and eye irritation, Percept. Psychophys. 59 (5) (1997) 665–674.

[2] J.H. Draize, G. Woodard, H.O. Calvery, Methods for the study of irritation and toxicity of substances applied topically to the skin and mucous membranes, J. Pharmacol. Exp. Ther. 82 (1944) 377–390.

[3] K.R. Wilhelmus, The Draize eye test, Surv. Ophthalmol. 45 (6) (2001) 493–515.

[4] United Nations Globally Harmonized System of Classification and Labelling of Chemicals (GHS), United Nations Publications, New York & Geneva, 2007.

[5] Office of Prevention, P. &. S. O. (Eds.), EPA Label Review Manual: EPA735-B-03-001, U.S. Environmental Protection Agency, Washington, DC, 2012.

[6] European Union Commission Directive 2001/59/EC of 6 August 2001 adapting to technical progress for the 28th time Council Directive 67/548/EEC on the approximation of the laws, regulations and administrative provisions relating to the classification, packaging and labelling of dangerous substances, J. Eur. Commun. (2001) 1–333.

[7] ICCVAM and NICEATM, ICCVAM Test Method Evaluation Report: Current Validation Status of In Vitro Test Methods Proposed for Identifying Eye Injury Hazard Potential of Chemicals and Products, NIH Document No. 10-7553, National Toxicology Program, Research Triangle Park, NC, 2010.

[8] ICCVAM and NICEATM, Independent Scientific Peer Review Panel Report: Evaluation of the Validation Status of Alternative Ocular Safety Testing Methods and Approaches. National Toxicology Program, Research Triangle Park, NC, 2009.

[9] S. Kar, K. Roy, Predictive toxicology using QSAR: a perspective, J. Indian Chem. Soc. 87 (2010) 1455–1515.

[10] G. Patlewicz, R. Rodford, J.D. Walker, Quantitative structure-activity relationships for predicting skin and eye irritation, Environ. Toxicol. Chem. 22 (8) (2003) 1862–1869.

[11] M.D. Barratt, QSARS for the eye irritation potential of neutral organic chemicals, Toxicol. In Vitro 11 (1-2) (1997) 1–8.

[12] M.H. Abraham, R. Kumarsingh, J.E. Cometto-Muniz, W.S. Cain, A quantitative structure-activity relationship (QSAR) for a Draize eye irritation database, Toxicol. In Vitro 12 (3) (1998) 201–207.

[13] A.S. Kulkarni, A.J. Hopfinger, Membrane-interaction QSAR analysis: application to the estimation of eye irritation by organic compounds, Pharm. Res. 16 (8) (1999) 1245–1253.

[14] A. Kulkarni, A.J. Hopfinger, R. Osborne, L.H. Bruner, E.D. Thompson, Prediction of eye irritation from organic chemicals using membrane-interaction QSAR analysis, Toxicol. Sci. 59 (2) (2001) 335–345.

[15] R. Solimeo, J. Zhang, M. Kim, A. Sedykh, H. Zhu, Predicting chemical ocular toxicity using a combinatorial QSAR approach, Chem. Res. Toxicol. 25 (2012) 2763–2769.

[16] D.M. Bagley, P.A. Botham, J.R. Gardner, G. Holland, R. Kreiling, R.W. Lewis, D.A. Stringer, A.P. Walker, Eye irritation: reference chemicals data bank, Toxicol. In Vitro 6 (6) (1992) 487–491.

[17] OECD, Acute Eye Irritation/Corrosion OECD Guideline for Testing of Chemicals 405, OECD, Paris, 1987.

[18] CERIUS2 Version 4.10, Accelrys Inc., San Diego, CA, USA; available at ⟨http://www.accelrys.com⟩.

[19] DRAGON ver. 6 is software of TALETE srl, Italy ⟨http://www.talete.mi.it/products/dragon_molecular_descriptors.htmis⟩.

[20] K. Roy, R.N. Das, On extended topochemical atom (ETA) indices for QSPR studies, in: E.A. Castro, A.K. Hagi (Eds.), Advanced Methods and Applications in Chemoinformatics: Research Progress and New Applications, IGI Global, Hershey, 2011.

[21] C.W. Yap, PaDEL-descriptor: an open source software to calculate molecular descriptors and fingerprints, J. Comput. Chem. 32 (7) (2011) 1466–1474.

[22] EUCLIDEAN (a program written in Java) is developed and validated on known data sets by Pravin Ambure (Email: [email protected]) of Drug Theoretics and Cheminformatics Laboratory, Jadavpur University, 2013.

[23] R.B. Darlington, Regression and Linear Models. McGraw-Hill, New York, 1990.

[24] D. Rogers, A.J. Hopfinger, Application of genetic function approximation to quantitative structure activity relationships and quantitative structure property relationships, J. Chem. Inf. Comput. Sci. 34 (4) (1994) 854–866.

[25] STATISTICA is a Statistical Software of STATSOFT Inc., USA ⟨http://www.statsoft.com/⟩.

[26] SPSS is a statistical software of SPSS Inc., USA ⟨http://www.spss.com⟩.

[27] MINITAB is a Statistical Software of Minitab Inc., USA ⟨http://www.minitab.com⟩.

[28] P.K. Ojha, I. Mitra, R.N. Das, K. Roy, Further exploring rm2 metrics for validation of QSPR models, Chemom. Intell. Lab. Syst. 107 (2011) 194–205.

[29] K. Roy, I. Mitra, S. Kar, P. Ojha, R.N. Das, H. Kabir, Comparative studies on some metrics for external validation of QSPR models, J. Chem. Inf. Model. 52 (2012) 396–408.

[30] G. Schüürmann, R.U. Ebert, J. Chen, B. Wang, R. Kühne, External validation and prediction employing the predictive squared correlation coefficient-test set activity mean vs. training set activity mean, J. Chem. Inf. Model. 48 (2008) 2140–2145.

[31] A. Golbraikh, A. Tropsha, Beware of q2! J. Mol. Graph. Model. 20 (4) (2002) 269–276.

[32] I. Mitra, A. Saha, K. Roy, Exploring quantitative structure-activity relationship QSAR studies of antioxidant phenolic compounds obtained from traditional Chinese medicinal plants, Mol. Simul. 36 (13) (2010) 1067–1079.

[33] L. Eriksson, J. Jaworska, A.P. Worth, M.T.D. Cronin, R.M. McDowell, P. Gramatica, Methods for reliability and uncertainty assessment and for applicability evaluations of classification- and regression-based QSARs, Environ. Health Perspect. 111 (10) (2003) 1361–1375.

[34] P. Gramatica, Principles of QSAR models validation: internal and external, QSAR Comb. Sci. 26 (5) (2007) 694–701.

[35] K. Roy, R.N. Das, QSTR with extended topochemical atom (ETA) indices. 15. Development of predictive models for toxicity of organic chemicals against fathead minnow using second generation ETA indices, SAR QSAR Environ. Res. 23 (1-2) (2012) 125–140.

[36] R. Benigni, A. Giuliani, Putting the predictive toxicology challenge into perspective: reflections on the results, Bioinformatics 19 (2003) 1194–1200.

[37] E.S. Williams, J. Panko, D.J. Paustenbach, The European Union's REACH regulation: a review of its history and requirements, Crit. Rev. Toxicol. 39 (2009) 553–675.

Supratik Kar is a researcher in the Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India. His research interests are QSAR and molecular modeling. He has about 5 years of experience in the area of QSAR and has published 25 articles in peer-reviewed journals.

Kunal Roy (http://sites.google.com/site/kunalroyindia/) is an Associate Professor in the Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India and Fellow in the Manchester Institute of Biotechnology, University of Manchester, United Kingdom. He is an Associate Editor of the Springer journal Molecular Diversity and a member of the Editorial Advisory Board of European Journal of Medicinal Chemistry (Elsevier). His research interests are QSAR and molecular modeling. Dr. Roy has published more than 200 research papers in refereed journals (http://sites.google.com/site/kunalroyindia/home/krlistofpublications). Dr. Roy has been a recipient of Bioorganic and Medicinal Chemistry Most Cited Paper 2003–2006, 2004–2007 and 2006–2009 Awards (Elsevier), Bioorganic and Medicinal Chemistry Letters Most Cited Paper 2006–2009 Award (Elsevier), AICTE Career Award (AICTE, New Delhi), etc. He is a reviewer of QSAR papers in different journals like Journal of Molecular Modeling (Springer), Journal of Chemical Information and Modeling (ACS), European Journal of Medicinal Chemistry (Elsevier), Bioorganic and Medicinal Chemistry Letters (Elsevier), Journal of Computational Chemistry (Wiley), Chemosphere (Elsevier), Molecular Informatics (Wiley), etc. Dr. Roy is also a member of the Cheminformatics and QSAR Society.
