BioSystems 113 (2013) 177–195

Contents lists available at SciVerse ScienceDirect. Journal homepage: www.elsevier.com/locate/biosystems

First report on exploring structural requirements of alpha and beta thymidine analogs for PfTMPK inhibitory activity using in silico studies

Probir Kumar Ojha, Kunal Roy *,1

Drug Theoretics and Cheminformatics Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India

Article info

Article history: Received 16 May 2013; received in revised form 26 June 2013; accepted 2 July 2013

Keywords: Malaria; QSAR; PfTMPK; Pharmacophore; Docking

Abstract

With the emergence of multi-drug resistance to the currently available antimalarial drugs, including the “magic bullet” artemisinin derivatives in the market, there is an urgent need for discovery and development of new potent antimalarial molecules. The present work deals with quantitative structure–activity relationship (QSAR) modeling, pharmacophore mapping and docking studies of a series of 35 thymidine analogs as inhibitors of Plasmodium falciparum thymidylate kinase (PfTMPK), an enzyme that catalyzes phosphorylation of thymidine monophosphate (TMP) to thymidine diphosphate (TDP). The models were validated both internally and externally and significant statistical results were obtained, indicating the robustness and reliability of the developed models. The docking study was performed using the LigandFit option of the receptor–ligand interactions protocol section available in Discovery Studio 2.1, where a low RMSD value (0.6931 Å) between the co-crystallized ligand and the re-docked ligand assured that the ligand was bound in the same binding pocket.
The QSAR, pharmacophore mapping and docking studies provide an understanding of the important structural requirements, essential molecular properties or features of the molecules, and important binding interactions, and provide important guidance for the chemist to synthesize new molecules with an improved PfTMPK inhibitory activity profile. This work revealed the importance of the –NH– fragment, the electrophilicity of the molecules and the number of oxygen atoms for the PfTMPK inhibitory activity of the molecules. To the best of our knowledge, this work presents the first QSAR and pharmacophore report for thymidine analogs, which may serve as an efficient tool for the design and synthesis of potent molecules as PfTMPK inhibitors to address the increasing threat of multi-drug resistance against P. falciparum.

© 2013 Elsevier Ireland Ltd. All rights reserved.

* Corresponding author. Tel.: +91 98315 94140; fax: +91 33 2837 1078. E-mail addresses: kunalroy [email protected], [email protected] (K. Roy).
1 URL: http://sites.google.com/site/kunalroyindia/.
0303-2647/$ – see front matter © 2013 Elsevier Ireland Ltd. All rights reserved. http://dx.doi.org/10.1016/j.biosystems.2013.07.005

1. Introduction

The most virulent parasitic disease, malaria, is caused by four species of Plasmodium parasites, namely Plasmodium falciparum, Plasmodium vivax, Plasmodium ovale and Plasmodium malariae, among which P. falciparum causes the highest percentage (95%) of morbidity and mortality (Schlitzer, 2007). Though there has been significant progress in developing antimalarial vaccines, chemotherapy still remains the first-line treatment against malaria in the world. In the World Malaria Report (2011), WHO reported an estimated 216 million new clinical cases (81% in the African region) with nearly 655,000 deaths (91% in Africa), mainly among African children under five years of age and pregnant women. This awful situation is due to multi-drug resistance to the currently available antimalarial drugs, including the “magic bullet” artemisinin derivatives in the market (Dondorp et al., 2010). So, there is an urgent need to develop new antimalarial drugs against new molecular targets to overcome multi-drug resistance to the clinically available antimalarial drugs (Wells et al., 2009). With the availability of the genome sequence of the Plasmodium parasite, it has been found that these parasites lack the enzymes required for pyrimidine salvage (Reyes et al., 1982) and are totally dependent on de novo pyrimidine nucleoside synthesis for DNA replication. So, enzymes involved in the pyrimidine metabolism pathway are validated antimalarial drug targets. Many enzymes are involved in the pyrimidine metabolism pathways (Anderson, 2005; Booker et al., 2010), among which P. falciparum thymidylate kinase (PfTMPK) is an important one. PfTMPK catalyzes phosphorylation of thymidine monophosphate (TMP) to thymidine diphosphate (TDP), and represents a potentially attractive drug target for malaria. PfTMPK is able to tolerate a broad range of substrates, which distinguishes it from other thymidylate kinases, in particular the human enzyme (Kandeel et al., 2009; Whittingham et al., 2010). Drug development is a tedious, laborious and time consuming process. Due to the availability of the crystallographic structures of the antimalarial targets and a good understanding of the target enzyme at the molecular level, computer aided drug designing (CADD) approaches (both
structure and ligand based) may overcome this problem and may be helpful for the design of potential antimalarial compounds with a reduced degree of cross-resistance (Gardner et al., 2002).

CADD is a unifying theme that focuses on the questions of why we do chemistry and how we decide what to synthesize and study. Nowadays, both structure based (molecular docking) and ligand based (quantitative structure–activity relationship (QSAR), pharmacophore mapping) drug designing approaches are used to optimize the essential structural requirements for optimum antimalarial activity. QSAR analysis is an effective approach in the drug design process. The QSAR approach leads to the design of new compounds and estimates their biological activity prior to compound preparation and biological tests. Thus, this analysis avoids unnecessary time and cost-consuming procedures for less promising molecules. QSAR attempts to find the correlation between structural/molecular properties, in the form of descriptors, and biological activities or toxicities for a set of similar compounds by means of different statistical methods (Hecht et al., 2008; Landavazo et al., 2002; Santos-Filho et al., 2001; Weekes and Fogel, 2003). Pharmacophore mapping plays an important role in the drug discovery process, identifying new potential drugs by exploring the key chemical features and the spatial relationships among them (configurations) that determine the biological activities of chemical compounds. The 3D pharmacophore model thus provides a rational hypothetical ensemble of the primary chemical features responsible for the activity of the molecules and thereby proves to be an extremely successful tool for virtual screening to identify lead compounds from a large database (Kurogi and Güner, 2001). Based on the availability of the crystallographic structure of the target protein, molecular docking is also performed to predict how a protein (enzyme) interacts with small molecules (ligands) at the molecular level (Kirkpatrick, 2004).

Roy and Ojha (2010) have reviewed QSAR models of different classes of antimalarials based on both chemical classes and specific targets. Ojha and Roy also reported QSAR analysis, pharmacophore mapping and docking studies of antimalarial compounds acting on different targets (P. falciparum dihydroorotate dehydrogenase (PfDHODH) (Ojha and Roy, 2010a; Ojha et al., 2012), plasmepsin-II (Ojha and Roy, 2011a), P. falciparum dihydrofolate reductase-thymidylate synthase (PfDHFR-TS) (Ojha and Roy, 2011b)) and belonging to different chemical classes (aryltriazolylhydroxamates (Ojha and Roy, 2010b), endochin analogues (Ojha and Roy, 2011c) and 1,2,3,4-tetrahydroacridin-9(10H)-one (THA) derivatives (Ojha and Roy, 2013)) using physicochemical, electronic, topological, structural and thermodynamic descriptors.

In our present work, we have performed QSAR modeling, pharmacophore mapping and docking studies of a series of PfTMPK inhibitors to explore the structural requirements or molecular properties and to determine the binding interactions with the amino acid residues at the molecular level, in order to understand the essential features required for optimum PfTMPK inhibitory activity. We have validated the QSAR and pharmacophore models using different validation parameters (both qualitative and quantitative validation tests). We have also defined the domain of applicability of the best QSAR models. It may be noted that, to the best of our knowledge, this work presents the first QSAR and pharmacophore report for α- and β-thymidine analogs as novel antimalarials, which may serve as efficient weapons to address the increasing threat of malaria in the developing countries.

2. Method and materials

2.1. Data preparation

A dataset comprising 35 compounds (Table 1), used in the present work for the development and validation of the QSAR and pharmacophore models and for docking studies, was collected from the reported literature (Cui et al., 2012). In this context, we have used only those compounds which showed PfTMPK enzyme inhibitory activity. All the molecules were screened against recombinant PfTMPK enzyme using a coupled assay with pyruvate kinase and lactate dehydrogenase, carried out with TMP as substrate (Cui et al., 2012). The inhibitory activity (Ki) against the P. falciparum enzyme was measured on the micromolar scale. For the development of QSAR models, the inhibitory constant (Ki, μM; the concentration required to produce half maximum inhibition) of the molecules was converted to the negative logarithmic scale (pKi = −log Ki), while Ki values were used for the development of the pharmacophore model. Although the range of the pKi values in this dataset is small (1.829 log units), we have performed extensive statistical validation tests to check the reliability and predictive ability of the models.
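The conversion from Ki to pKi can be sketched as below. This is a minimal illustration, assuming that Ki is supplied in micromolar and that the logarithm is taken of the molar concentration, as the "(pKi, M)" column headings of Table 1 suggest; the function names are hypothetical.

```python
import math

def pki_from_ki_micromolar(ki_um: float) -> float:
    """Return pKi = -log10(Ki), with Ki converted from micromolar to mol/L.

    Assumption: the molar scale is used, as implied by the "(pKi, M)"
    column headings of Table 1.
    """
    if ki_um <= 0:
        raise ValueError("Ki must be positive")
    return -math.log10(ki_um * 1e-6)

def ki_micromolar_from_pki(pki: float) -> float:
    """Inverse conversion: pKi (molar scale) back to Ki in micromolar."""
    return 10.0 ** (-pki) * 1e6

if __name__ == "__main__":
    # 100 micromolar is the activity threshold used later in the paper.
    print(round(pki_from_ki_micromolar(100.0), 3))
```

Under this convention, the 100 μM activity threshold used for the pharmacophore classification corresponds to pKi = 4.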

2.2. Descriptors

All the dataset compounds were sketched using Discovery Studio 2.1 (Discovery Studio, 2001) and saved as an MDL mol file. This MDL mol file was used for the descriptor calculation, pharmacophore analysis and molecular docking studies. Various categories of descriptors (spatial, topological, thermodynamic, electronic, structural parameters and E-state parameters) were calculated using the Descriptor+ module of the Cerius2 version 4.10 software (Cerius2, 2005, Accelrys Inc, USA). Additionally, Dragon 6 software (DRAGON, 2010, TALETE srl, Italy) was used to calculate constitutional, information indices and functional group count descriptors. Variables with zero or near-zero variance were omitted from the initial pool of descriptors calculated using the Cerius2 version 4.10 software, whereas descriptors having 95% inter-correlation were removed from the initial pool of Dragon descriptors. Finally, 165 descriptors, calculated by both the Cerius2 and Dragon software, were used for the development of the final QSAR model.
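The two pruning steps described above (dropping near-constant descriptors and dropping highly inter-correlated ones) can be sketched as follows. This is an illustration under stated assumptions, not the procedure built into Cerius2 or Dragon: both filters are applied to one table here, the variance tolerance `var_tol` is a hypothetical value, and of two correlated descriptors the one listed first is kept.

```python
import math
from statistics import mean, pvariance

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def prune_descriptors(table, var_tol=1e-8, corr_cutoff=0.95):
    """Return the names of descriptors that survive both filters.

    table: dict mapping descriptor name -> list of values (one per compound).
    var_tol and the keep-first rule are assumptions made for illustration.
    """
    # Step 1: remove descriptors with zero or near-zero variance.
    varying = {name: vals for name, vals in table.items()
               if pvariance(vals) > var_tol}
    # Step 2: greedily keep a descriptor only if it is not highly
    # correlated with any descriptor that has already been kept.
    selected = []
    for name, vals in varying.items():
        if all(abs(pearson(vals, varying[kept])) < corr_cutoff
               for kept in selected):
            selected.append(name)
    return selected

if __name__ == "__main__":
    demo = {
        "a": [1, 2, 3, 4],
        "b": [2, 4, 6, 8.1],   # almost perfectly correlated with "a"
        "c": [5, 5, 5, 5],     # constant
        "d": [1, -1, 1, -1],
    }
    print(prune_descriptors(demo))
```

The greedy keep-first rule makes the result depend on descriptor order; other tie-breaking rules (for example keeping the descriptor better correlated with the response) are equally plausible.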

2.3. Splitting of the dataset into training and test sets

The most crucial step in the development of QSAR and pharmacophore models is the splitting of the dataset into training and test sets. The test set should be selected in such a way that the entire domain of the test set molecules lies within the chemical space occupied by the training set molecules. In this way, the quality of the models and their predictive ability for the test set compounds can be improved. Here, for the division into training and test sets, we have performed k-means clustering analysis using SPSS software (SPSS, 1999, USA). The k-means clustering technique (Leonard and Roy, 2006) helps to classify similar compounds into groups. From the total dataset of 35 compounds, we randomly selected about one third of the compounds from each cluster (12 compounds in total) for the test set, and the remaining two thirds (23 compounds) formed the training set (Table 2). Subsequently, the training set molecules were used for model development and the test set molecules were used to evaluate the external predictive potential of the developed models.
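The cluster-wise random selection can be sketched as below, using the cluster memberships listed in Table 2. This is only an illustration: the clustering itself (done in SPSS in the paper) is not reproduced, the per-cluster test count is obtained by simple rounding, and the random seed is arbitrary, so the resulting split will generally differ from the one actually used by the authors.

```python
import random

# Cluster memberships as listed in Table 2 (serial numbers of compounds).
CLUSTERS = {
    1: [19, 20, 21, 22, 29, 30, 31, 32],
    2: [1, 2, 4, 5, 6, 7, 8, 9, 12, 13, 14, 15, 16, 25, 27, 35],
    3: [17, 18, 23, 24, 26, 28, 33, 34],
    4: [3, 10, 11],
}

def split_by_cluster(clusters, test_fraction=1 / 3, seed=0):
    """Randomly draw about `test_fraction` of each cluster as test compounds.

    The rounding rule and the seed are assumptions for illustration.
    Returns (training_ids, test_ids), both sorted.
    """
    rng = random.Random(seed)
    train, test = [], []
    for members in clusters.values():
        n_test = round(len(members) * test_fraction)
        picked = set(rng.sample(members, n_test))
        test.extend(m for m in members if m in picked)
        train.extend(m for m in members if m not in picked)
    return sorted(train), sorted(test)

if __name__ == "__main__":
    training_ids, test_ids = split_by_cluster(CLUSTERS)
    print(len(training_ids), len(test_ids))
    print(test_ids)
```

Sampling within each cluster, rather than from the pooled dataset, is what keeps every region of the clustered chemical space represented in both sets.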

2.4. QSAR model development and assessment of its quality

In this context, we have developed QSAR models using different chemometric tools such as stepwise regression, genetic function approximation (GFA) (Rogers and Hopfinger, 1994) and genetic partial least squares (G/PLS) (Rogers and Hopfinger, 1994; Wold et al., 2001) techniques. Among these models, we have reported here only the statistically best models developed by GFA followed

Table 1
Structural features, experimental and calculated PfTMPK inhibitory activity values (pKi, M), fit values, dock score and experimental and estimated activity scale of thymidine analogs according to QSAR, pharmacophore mapping^a and docking studies.

[Scheme: general structures of the alpha derivatives and the beta derivatives, with ring positions 1–6 and 1'–5' and the substituent R. The R-group drawings and the α/β entries of the "Derivatives" column are not recoverable from the extracted text.]

| Sl. no. | Observed activity (pKi, M) (Cui et al., 2012) | Calculated activity (QSAR) (pKi, M) | Calculated activity (pharm-2) (pKi, M) | Fit value | Dock score | Activity scale: observed | Activity scale: estimated (Hypothesis 2) |
|---|---|---|---|---|---|---|---|
| 1 | 3.267 | 3.121 | 3.305 | 4.638 | 120.376 | − | − |
| 2 | 3.267 | 3.442 | 3.187 | 4.520 | 122.162 | − | − |
| 3 | 3.703 | 3.597 | 3.861 | 5.194 | 118.377 | − | − |
| 4* | 3.780 | 3.756 | 3.990 | 5.324 | 121.156 | − | − |
| 5 | 3.688 | 3.860 | 3.825 | 5.159 | 122.485 | − | − |
| 6* | 3.975 | 4.180 | 3.664 | 4.998 | 125.839 | − | − |
| 7 | 3.943 | 3.936 | 3.700 | 5.034 | 125.114 | − | − |
| 8* | 3.520 | 3.442 | 3.187 | 4.520 | 123.454 | − | − |
| 9* | 3.130 | 3.572 | 3.442 | 4.776 | 129.765 | − | − |
| 10* | 3.697 | 3.755 | 3.958 | 5.291 | 119.750 | − | − |
| 11 | 4.194 | 4.050 | 4.081 | 5.414 | 121.701 | + | + |
| 12 | 3.654 | 4.003 | 3.812 | 5.145 | 120.630 | − | − |
| 13* | 4.222 | 4.158 | 4.545 | 5.878 | 121.396 | + | + |
| 14 | 4.284 | 4.169 | 4.556 | 5.889 | 116.689 | + | + |
| 15 | 4.553 | 4.536 | 4.543 | 5.876 | 119.574 | + | + |
| 16 | 4.854 | 4.634 | 4.545 | 5.878 | 124.004 | + | + |
| 17 | 4.046 | 4.401 | 4.015 | 5.349 | 127.004 | + | + |
| 18 | 4.187 | 4.401 | 4.327 | 5.660 | 126.504 | + | + |
| 19 | 3.860 | 3.936 | 3.903 | 5.236 | 122.911 | − | − |
| 20 | 3.799 | 3.653 | 3.753 | 5.086 | 125.832 | − | − |
| 21* | 3.879 | 3.662 | 3.672 | 5.006 | 125.565 | − | − |
| 22 | 3.777 | 3.679 | 3.740 | 5.074 | 122.771 | − | − |
| 23* | 4.509 | 4.401 | 4.442 | 5.776 | 132.534 | + | + |
| 24 | 4.056 | 4.216 | 4.410 | 5.743 | 130.297 | + | + |
| 25* | 4.959 | 4.958 | 4.417 | 5.750 | 131.357 | + | + |
| 26 | 4.432 | 4.401 | 4.509 | 5.843 | 129.088 | + | + |
| 27 | 4.824 | 4.634 | 4.542 | 5.875 | 124.220 | + | + |
| 28 | 4.495 | 4.401 | 4.503 | 5.836 | 122.190 | + | + |
| 29 | 3.971 | 4.068 | 4.323 | 5.656 | 121.136 | − | + |
| 30* | 4.161 | 4.427 | 4.386 | 5.719 | 124.097 | + | + |
| 31 | 4.301 | 4.369 | 4.315 | 5.649 | 125.167 | + | + |
| 32* | 4.328 | 4.089 | 4.023 | 5.356 | 123.429 | + | + |
| 33* | 4.602 | 4.401 | 4.442 | 5.776 | 130.708 | + | + |
| 34 | 4.569 | 4.216 | 4.554 | 5.888 | 128.682 | + | + |
| 35 | 4.959 | 4.958 | 4.417 | 5.750 | 129.888 | + | + |

* Denotes the test set compounds.
^a The compounds were classified as follows: compounds with Ki (μM) ≤ 100 are active (+) and those with Ki (μM) > 100 are considered as inactive (−).

Table 2
Serial numbers of compounds under different clusters for QSAR analysis.

| Cluster no. | Serial no. of the compounds |
|---|---|
| 1 | 19, 20, 21, 22, 29, 30, 31, 32 |
| 2 | 1, 2, 4, 5, 6, 7, 8, 9, 12, 13, 14, 15, 16, 25, 27, 35 |
| 3 | 17, 18, 23, 24, 26, 28, 33, 34 |
| 4 | 3, 10, 11 |

by PLS regression technique. The quality of the QSAR model was assessed using both internal and external validation parameters. The statistical parameters used to check the quality of the models were the determination coefficient ($R^2$), variance ratio ($F$), standard error of estimate ($s$), adjusted $R^2$ ($R_a^2$), etc. However, these parameters are not sufficient to judge the robustness of a model. Thus, we have used some other classical statistical parameters, namely the leave-one-out cross-validated correlation coefficient ($Q^2$) and $R^2_{pred}$ (for internal and external validation, respectively), and calculated some novel $r_m^2$ metrics (both scaled and unscaled versions, from the site http://aptsoftware.co.in/rmsquare/) such as $r^2_{m(LOO)}$ and $\Delta r^2_{m(LOO)}$ for internal validation, $r^2_{m(test)}$ and $\Delta r^2_{m(test)}$ for external validation, $r^2_{m(overall)}$ and $\Delta r^2_{m(overall)}$ for overall validation, and $^cR_p^2$ based on model randomization (Ojha et al., 2011; Mitra et al., 2010; Roy et al., 2012a,b; Roy et al., 2013). We have also used the $r^2_{m(rank)}$ metric recently developed by our group to ascertain the rank order prediction of the test set molecules (Roy et al., 2012a,b). The minimum cutoff value for $Q^2$, $r^2_{m(LOO)}$, $R^2_{pred}$, $r^2_{m(test)}$, $r^2_{m(overall)}$ and $r^2_{m(rank)}$ is 0.5, and the maximum limit for the $\Delta r^2_{m(LOO)}$, $\Delta r^2_{m(test)}$ and $\Delta r^2_{m(overall)}$ parameters is 0.2. The minimum cutoff value for the corrected randomization parameter $^cR_p^2$ is 0.5, which ensures that the developed models are not obtained by chance only (Mitra et al., 2010).

The final PLS model was also validated using an additional randomization test (Melagraki and Afantitis, 2013) by randomly reordering (100 permutations) the dependent variable (pKi) using SIMCA-P software (UMETRICS, 2002, Sweden). Here, the inhibitory activity data (Y) are randomly permuted, keeping the descriptor matrix intact, followed by a PLS run. Each randomization and subsequent PLS analysis generates a new set of $R^2$ and $Q^2$ values. The $R^2$ and $Q^2$ values are plotted against the correlation coefficient between the original Y-values and the permuted Y-values. The developed model is considered valid if the intercepts of the resulting regression lines satisfy $R^2_{int} < 0.4$ and $Q^2_{int} < 0.05$. We have also checked the model performance using additional external validation parameters proposed by Golbraikh and Tropsha (2002). According to Golbraikh and Tropsha (2002), the model will be acceptable if

$$r^2 > 0.6 \qquad \text{(i)}$$

$$\frac{r^2 - r_0^2}{r^2} < 0.1 \quad \text{or} \quad \frac{r^2 - r_0'^2}{r^2} < 0.1 \qquad \text{(ii)}$$

$$1.15 > k > 0.85 \quad \text{or} \quad 1.15 > k' > 0.85 \qquad \text{(iii)}$$
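A few of the external validation statistics named above can be sketched as follows. The sketch assumes the commonly used definitions $R^2_{pred} = 1 - \sum (Y_{obs(test)} - Y_{pred(test)})^2 / \sum (Y_{obs(test)} - \bar{Y}_{training})^2$, $k = \sum y\hat{y} / \sum \hat{y}^2$ and $k' = \sum y\hat{y} / \sum y^2$, and it uses the observed and "Calculated activity (QSAR)" columns of Table 1 as input. The value obtained this way is an illustration and need not coincide with the statistics the paper reports for a particular model.

```python
import math
from statistics import mean

# Observed pKi of the 23 training compounds (Table 1, rows without asterisk).
TRAIN_OBS = [3.267, 3.267, 3.703, 3.688, 3.943, 4.194, 3.654, 4.284,
             4.553, 4.854, 4.046, 4.187, 3.860, 3.799, 3.777, 4.056,
             4.432, 4.824, 4.495, 3.971, 4.301, 4.569, 4.959]
# Test compounds 4, 6, 8, 9, 10, 13, 21, 23, 25, 30, 32, 33 (Table 1).
TEST_OBS = [3.780, 3.975, 3.520, 3.130, 3.697, 4.222,
            3.879, 4.509, 4.959, 4.161, 4.328, 4.602]
TEST_PRED = [3.756, 4.180, 3.442, 3.572, 3.755, 4.158,
             3.662, 4.401, 4.958, 4.427, 4.089, 4.401]

def r2_pred(train_obs, test_obs, test_pred):
    """External predictive R2, referenced to the training set mean."""
    y_bar_train = mean(train_obs)
    press = sum((o - p) ** 2 for o, p in zip(test_obs, test_pred))
    ss = sum((o - y_bar_train) ** 2 for o in test_obs)
    return 1.0 - press / ss

def squared_pearson(obs, pred):
    """Squared correlation coefficient between observed and predicted."""
    mo, mp = mean(obs), mean(pred)
    sxy = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    sxx = sum((o - mo) ** 2 for o in obs)
    syy = sum((p - mp) ** 2 for p in pred)
    return sxy ** 2 / (sxx * syy)

def slopes_through_origin(obs, pred):
    """Slopes k and k' of the regression lines forced through the origin."""
    sxy = sum(o * p for o, p in zip(obs, pred))
    k = sxy / sum(p * p for p in pred)
    k_prime = sxy / sum(o * o for o in obs)
    return k, k_prime

if __name__ == "__main__":
    print("R2pred:", round(r2_pred(TRAIN_OBS, TEST_OBS, TEST_PRED), 3))
    print("r2:", round(squared_pearson(TEST_OBS, TEST_PRED), 3))
    print("k, k':", slopes_through_origin(TEST_OBS, TEST_PRED))
```

The $r_0^2$ terms and the $r_m^2$ family are left out of the sketch because their conventions (which variable is placed on which axis) should be taken from the cited references.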

2.5. Applicability domain (AD)

According to the Organisation for Economic Co-operation and Development (OECD) principles, a QSAR model should be reported with a defined domain of applicability. The applicability domain of a QSAR model is defined as a theoretical region of the physicochemical, structural or biological space, knowledge or information on which the developed model makes predictions for new compounds with a given reliability. Thus, the AD represents the chemical space defined by the structural information of the chemicals used in model development, i.e., the training set compounds in a QSAR analysis. It makes it possible to verify whether the compounds predicted using the developed model lie within or outside this region of chemical space. In this study, we have checked the applicability domain of the developed QSAR model using the DModX (distance to model) approach (Wold et al., 2001) at the 90% confidence level using SIMCA-P software (UMETRICS, 2002, Sweden).

2.6. Pharmacophore model generation and assessment of quality

Pharmacophore mapping (a ligand based approach), which is widely used in the in silico drug discovery process, captures the spatial arrangement of the various chemical features (viz. hydrogen bond donor, hydrogen bond acceptor, hydrophobic, hydrophobic aliphatic, hydrophobic aromatic, ring aromatic, positive ionizable and negative ionizable groups) that are essential for the desired biological activity of the molecules. In this work, we have developed the 3D pharmacophore model using hydrogen bond donor, hydrogen bond acceptor, hydrophobic and ring aromatic features. Prior to the development of the pharmacophore models, conformational analysis for all the molecules was performed using the BEST method of conformer generation with the poling algorithm (Smellie et al., 1995). The pharmacophore models were developed using the HypoGen module implemented in Discovery Studio 2.1 (Discovery Studio 2.1, Accelrys Inc, USA, 2001). A value of 1.5 was assigned to the uncertainty parameter, which means that the biological activity of a particular inhibitor molecule is assumed to lie within a range of 1.5 times higher or lower than the experimental value of the inhibitor (Kurogi and Güner, 2001; Li et al., 2000; Sutter et al., 2000; Poptodorov et al., 2006). The hypotheses generated were analyzed in terms of their correlation coefficients and cost function values. A large difference (more than 40 bits) between the fixed cost and null cost values reduces the probability of chance correlation for the developed hypotheses. The difference between the total cost of each hypothesis and the fixed cost should be minimum for a robust pharmacophore model.

In order to validate the model externally, the conformers generated for the test set compounds were mapped onto the developed pharmacophore hypothesis and their activity was estimated based on the degree of mapping. The best pharmacophore model thus selected was further validated using both qualitative (Recall, Precision, Specificity, Accuracy, F-measure, G-mean (Chang et al., 2013), Cohen's κ (Chang et al., 2013), Güner–Henry score (GH) (Seal et al., 2013) and Matthews correlation coefficient (MCC) (Matthews, 1975)) and quantitative ($R^2_{pred}$ and $r^2_{m(test)}$) validation parameters. This makes it possible to verify the ability of the pharmacophore to predict the activity of the test set molecules as compared to the experimentally determined values.
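The meaning of the uncertainty parameter described in this section can be illustrated with a short sketch. It only restates the interpretation given in the text (the activity is assumed to lie between the experimental value divided by, and multiplied by, the uncertainty); it is not a reproduction of the HypoGen implementation, and the function name is hypothetical.

```python
def uncertainty_window(ki_um: float, uncertainty: float = 1.5):
    """Return the (lower, upper) activity bounds implied by the uncertainty.

    Assumption: the window is multiplicative, i.e. Ki / u to Ki * u,
    following the description in the text.
    """
    if ki_um <= 0:
        raise ValueError("Ki must be positive")
    if uncertainty < 1:
        raise ValueError("uncertainty must be at least 1")
    return ki_um / uncertainty, ki_um * uncertainty

if __name__ == "__main__":
    low, high = uncertainty_window(150.0)
    print(low, high)
```

With an uncertainty of 1.5, an experimental Ki of 150 μM is treated as lying between 100 μM and 225 μM, so two compounds are regarded as distinguishable only when their windows do not overlap.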

For calculation of the qualitative validation parameters, the compound set was divided into true actives (TA) (active compounds correctly classified as active), false actives (FA) (inactive compounds wrongly classified as active), false negatives (FN) (active compounds wrongly classified as inactive) and true negatives (TN) (inactive compounds correctly classified as inactive). In this study, for estimation (prediction) purposes, the compounds were classified as follows: compounds with Ki (μM) ≤ 100 are active (denoted by a “+” sign) and those with Ki (μM) > 100 are considered inactive (denoted by a “−” sign) (Table 1). Recall (or Sensitivity) and Specificity measure the ability of the models to discriminate between the active and inactive classes, and Accuracy is the fraction of compounds that are correctly classified. The F-measure is the harmonic mean of Precision and Recall. The calculations of the above parameters according to Fawcett (2006) are as follows (Matthews, 1975; Chang et al., 2013; Seal et al., 2013):

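$$\text{Recall} = \frac{TA}{TA + FN} \qquad \text{(i)}$$

$$\text{Precision} = \frac{TA}{TA + FA} \qquad \text{(ii)}$$

$$\text{Specificity} = \frac{TN}{FA + TN} \qquad \text{(iii)}$$

$$\text{Accuracy} = \frac{TA + TN}{TA + FA + FN + TN} \qquad \text{(iv)}$$

$$F\text{-measure} = \frac{2\,(\text{Recall})(\text{Precision})}{\text{Recall} + \text{Precision}} \qquad \text{(v)}$$

$$G\text{-means} = \sqrt{\text{Sensitivity} \times \text{Specificity}} \qquad \text{(vi)}$$

$$\text{Cohen's } \kappa = \frac{\Pr(a) - \Pr(e)}{1 - \Pr(e)} \qquad \text{(vii)}$$

where

$$\Pr(a) = \frac{TA + TN}{TA + FN + FA + TN} \quad \text{and} \quad \Pr(e) = \frac{\{(TA + FA) \times (TA + FN)\} + \{(TN + FA) \times (TN + FN)\}}{(TA + FN + FA + TN)^2}$$

$$MCC = \frac{(TA \times TN) - (FA \times FN)}{\sqrt{(TA + FA) \times (TA + FN) \times (TN + FA) \times (TN + FN)}} \qquad \text{(viii)}$$

$$GH = \left(\frac{3}{4} \times \text{Precision} + \frac{1}{4} \times \text{Sensitivity}\right) \times \text{Specificity} \qquad \text{(ix)}$$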
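The classification parameters can be computed from the four confusion-matrix counts as sketched below, using the standard definitions (precision taken over the compounds predicted active, specificity over the observed inactives). As input, the sketch tallies the observed and estimated activity-scale signs of all 35 compounds in Table 1; this tally is made here for illustration and the resulting numbers are not values quoted from the paper. The sketch assumes that no denominator is zero.

```python
import math

# Activity scale of compounds 1..35 from Table 1 ("+" active, "-" inactive).
OBSERVED = "-" * 10 + "+" + "-" + "+" * 6 + "-" * 4 + "+" * 6 + "-" + "+" * 6
ESTIMATED = "-" * 10 + "+" + "-" + "+" * 6 + "-" * 4 + "+" * 13

def confusion_counts(observed, estimated):
    """Return (TA, FA, FN, TN) from two equally long strings of '+'/'-'."""
    ta = fa = fn = tn = 0
    for obs, est in zip(observed, estimated):
        if obs == "+" and est == "+":
            ta += 1
        elif obs == "-" and est == "+":
            fa += 1
        elif obs == "+" and est == "-":
            fn += 1
        else:
            tn += 1
    return ta, fa, fn, tn

def classification_metrics(ta, fa, fn, tn):
    """Compute the qualitative validation parameters (nonzero denominators)."""
    n = ta + fa + fn + tn
    recall = ta / (ta + fn)
    precision = ta / (ta + fa)
    specificity = tn / (tn + fa)
    accuracy = (ta + tn) / n
    pr_e = ((ta + fa) * (ta + fn) + (tn + fa) * (tn + fn)) / n ** 2
    return {
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_measure": 2 * recall * precision / (recall + precision),
        "g_means": math.sqrt(recall * specificity),
        "kappa": (accuracy - pr_e) / (1 - pr_e),
        "mcc": (ta * tn - fa * fn) / math.sqrt(
            (ta + fa) * (ta + fn) * (tn + fa) * (tn + fn)),
        "gh": (0.75 * precision + 0.25 * recall) * specificity,
    }

if __name__ == "__main__":
    counts = confusion_counts(OBSERVED, ESTIMATED)
    print(counts)
    for name, value in classification_metrics(*counts).items():
        print(f"{name}: {value:.3f}")
```

In this tally only compound 29 is misclassified (observed inactive, estimated active), which is why recall equals 1 while precision and specificity fall slightly below 1.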

2.7. Molecular docking

Molecular docking (Venkatesan et al., 2010) is another important tool in the field of molecular modeling that predicts the interaction between proteins and small molecules. In the present work, the crystal structure of the enzyme (2WWF.pdb) (Whittingham et al., 2010) was obtained from the RCSB Protein Data Bank (http://www.pdb.org). A high resolution crystal structure of PfTMPK complexed with TMP and ADP was selected for the docking studies. The enzyme (2WWF.pdb) is a trimeric enzyme containing three chains (A, B and C). We have concentrated on chain C and the TMP binding site for the docking studies (Whittingham et al., 2010). The docking studies were performed using the LigandFit option of the receptor–ligand interactions protocol section available in Discovery Studio 2.1 (Discovery Studio 2.1, Accelrys Inc, USA, 2001). Initially, both the database ligands and the protein were pretreated. For ligand preparation, all duplicate structures were removed, and the options for ionization change, tautomer generation, isomer generation and 3D generator were set to true. For protein preparation, hydrogen atoms were added to the whole protein. The protein molecule thus prepared was defined as the total receptor and the active site was selected based on the TMP binding domain. All the molecules of the dataset were docked into the active site of the enzyme and the interaction energies, in the form of dock scores, between each ligand and the receptor were calculated. The binding interactions were compared with those of the bound TMP in order to assess whether the molecules fit into the specified active site of the enzyme or not. Docking was performed using PLP1 as the energy grid. The conformational search of the ligand poses was performed by the Monte Carlo trial method. Maximum internal energy was set at 10,000 Cal. Pose saving and interaction filters were set as default and fifty poses were saved for each conformation. The 50 docking poses thus saved for each conformation were ranked according to their dock score function (based on LigScore1, LigScore2, PLP1, PLP2, Jain and PMF) (Discovery Studio 2.1, Accelrys Inc, USA, 2001) and their interaction with the receptor was analyzed. From the docking studies, the receptor–ligand interactions were correlated with the biological activity of the dataset compounds. The docking procedure was validated by extracting the co-crystallized TMP from the active site of the enzyme and re-docking it to the enzyme to ensure that it binds to the same active site and interacts with the same amino acid residues as before. A schematic representation of the workflow (QSAR, pharmacophore mapping and docking studies) is shown in Fig. 1.

[Displaced fragment of a QSAR model equation printed on the same page; terms lost in extraction are marked with "…".]

pKi = 3.5057 − 0.5973 × ⟨7.4409 …⟩ − 10.2273 × ⟨LUMO-2.40418⟩ + …

n(training) = 23, R2 = 0.849, R2a = …, Q2 = 0.806, r2m(LOO) = 0.733, Δ…

n(test) = 12, R2pred = 0.818, Q2f2 = …, r2m(overall) = 0.734, Δr2m(overall) = …

2.8. Software used

In this present study, all the molecules were drawn in Dis-covery Studio 2.1 [Discovery Studio 2.1, Accelrys Inc, USA, 2001]software; Cerius 2 version 4.10 software (Cerius2, 2005, AccelrysInc, USA) and Dragon 6 (DRAGON, 2010, TALETE srl, Italy) softwarewere used for the calculation of descriptors where only the formerone was used model development purposes. Discovery Studio 2.1(Discovery Studio 2.1, 2005) was also used for the development ofpharmacophore model and docking studies. SIMCA-P (UMETRICS,2002, Sweden) was used to check the applicability domain for thetest set compounds. SPSS software (SPSS, 1999, USA) was used forcluster analysis while MINITAB (MINITAB, 2004) was used for PLSregression analysis.

3. Results and discussion

3.1. QSAR

Among all the developed models, the model developed by GFAfollowed by PLS was statistically most acceptable one. In this paper,we have reported only the statistically most significant modelsand summarized in Tables 3 and 4. Though both linear and splineoptions were used to develop the genetic models, the models devel-oped with spline option showed statistically more significant resultthan those developed using only linear option.

3.1.1. GFA followed by PLS analysisssNH

69 × nO

25, F = 56.13(df 2, 20), s = 0.191, PRESS = 0.932,

O) = 0.113

15, r2m(test) = 0.714, �r2

m(test) = 0.153,

8, r2m(rank) = 0.648.

(1)

In the above equation, $n_{training}$ is the number of compounds used to develop the model and $n_{test}$ is the number of compounds used to validate the developed model. The acceptable values of the cross-validated predictive variance ($Q^2$ = 0.806) and external predictive variance ($R^2_{pred}$ = 0.818) indicate the good predictive potential of the developed model. Moreover, statistically significant results for all other external validation parameters, such as the Golbraikh and Tropsha (2002) criteria (see Table 4), the $r_m^2$ metrics (both scaled and unscaled) ($r^2_{m(test)}$, $r^2_{m(LOO)}$, $r^2_{m(overall)}$ and $r^2_{m(rank)}$) and $^cR_p^2$, indicate the predictive performance and robustness of the developed model. Additionally, the small difference between the two metrics, $Q^2_{(f2)}$ and $R^2_{pred}$, further indicates the stability of the developed model. The randomization (both process and model) results of this model are reported in Table 3. We have also validated the final PLS model using another randomization test, randomly reordering (100 permutations) the dependent variable (pKi) using SIMCA-P software. Here the intercept values for both $R^2$ (−0.00289) and $Q^2$ (−0.298) are below zero, indicating that the developed model was not obtained by chance (Fig. 2). Using the standardized variable matrix for regression, the significance level of the descriptors was found to be in the following order: S_ssNH, LUMO and nO. The observed and predicted pKi values according to Eq. (1) are given in Table 1.

Fig. 1. Schematic representation of the QSAR, pharmacophore mapping and docking studies.

Fig. 2. Y-scrambling of the final PLS model (Eq. (1)) based on 100 randomization cycles.
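The two external validation metrics compared above differ only in the reference mean used in the denominator. The sketch below uses the commonly cited definitions, which this section does not spell out and which are therefore an assumption here: $R^2_{pred}$ is referenced to the training-set mean activity, and $Q^2_{(f2)}$ to the test-set mean. The function names and all numerical values are hypothetical and serve only to show the arithmetic.

```python
def r2_pred(y_obs_test, y_pred_test, y_train_mean):
    """External predictive variance, referenced to the training-set mean."""
    press = sum((o - p) ** 2 for o, p in zip(y_obs_test, y_pred_test))
    ss_ref = sum((o - y_train_mean) ** 2 for o in y_obs_test)
    return 1 - press / ss_ref


def q2_f2(y_obs_test, y_pred_test):
    """Same form as r2_pred, but referenced to the test-set mean."""
    test_mean = sum(y_obs_test) / len(y_obs_test)
    press = sum((o - p) ** 2 for o, p in zip(y_obs_test, y_pred_test))
    ss_ref = sum((o - test_mean) ** 2 for o in y_obs_test)
    return 1 - press / ss_ref


# Hypothetical pKi values, for illustration only (not data from Table 1).
y_obs = [3.0, 4.0, 5.0, 6.0]
y_pred = [3.2, 3.9, 5.1, 5.7]
train_mean = 4.2  # assumed mean observed pKi of the training set

r2_pred_value = r2_pred(y_obs, y_pred, train_mean)
q2_f2_value = q2_f2(y_obs, y_pred)
print(f"R2pred = {r2_pred_value:.3f}, Q2f2 = {q2_f2_value:.3f}")
```

When the training-set and test-set means are close, the two values nearly coincide, which is the sense in which a small difference between them is read as a sign of model stability.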

Table 3. Comparison of statistical quality and validation parameters of descriptor-based QSAR model.

| Parameter | GFA | GFA followed by PLS |
|---|---|---|
| $R^2$ | 0.851 | 0.849 |
| $Q^2$ | 0.770 | 0.806 |
| $R^2_{pred}$ | 0.832 | 0.818 |
| $Q^2_{f2}$ | 0.830 | 0.815 |
| $r^2_{m(LOO)}$, unscaled | 0.680 | 0.723 |
| $r^2_{m(LOO)}$, scaled | 0.690 | 0.733 |
| $\Delta r^2_{m(LOO)}$, unscaled | 0.098 | 0.127 |
| $\Delta r^2_{m(LOO)}$, scaled | 0.089 | 0.113 |
| $r^2_{m(test)}$, unscaled | 0.721 | 0.703 |
| $r^2_{m(test)}$, scaled | 0.730 | 0.714 |
| $\Delta r^2_{m(test)}$, unscaled | 0.160 | 0.172 |
| $\Delta r^2_{m(test)}$, scaled | 0.143 | 0.153 |
| $r^2_{m(overall)}$, unscaled | 0.704 | 0.726 |
| $r^2_{m(overall)}$, scaled | 0.712 | 0.734 |
| $\Delta r^2_{m(overall)}$, unscaled | 0.139 | 0.162 |
| $\Delta r^2_{m(overall)}$, scaled | 0.129 | 0.148 |
| $r^2_{m(rank)}$ | 0.648 | 0.648 |
| $^cR_p^2$ (randomization): process, model | 0.607, 0.797 | 0.846 |

Fig. 3. DModX values of the test set compounds at 90% confidence level of the developed QSAR model. The thick horizontal line signifies the critical DModX value (2.078) at the 90% confidence level.

The S_ssNH descriptor refers to the summation of E-state values for –NH– fragments. Since the spline term 〈7.44098 − S_ssNH〉 bears a negative coefficient, it can be inferred that the numerical value of the S_ssNH descriptor should be more than 7.44098 so that the spline function makes zero contribution and does not lower the PfTMPK inhibitory activity. The higher activity profile of compound nos. 16, 27 and 28 may be explained by their corresponding numerical values for the S_ssNH descriptor (7.441, 7.506 and 8.280, respectively), which are more than the knot of the spline (7.44098). On the contrary, compound nos. 1 and 2, with lower values (4.994 and 4.758, respectively) of this descriptor, show lower PfTMPK inhibitory activity. Thus, it may be suggested that the E-state fragment S_ssNH influences the PfTMPK inhibitory activity of the molecules.

The second most significant descriptor, LUMO (an electronic descriptor), refers to the energy of the lowest unoccupied molecular orbital. It is the lowest energy level in the molecule that contains no electrons and measures the electrophilicity of a molecule. The spline term 〈LUMO − 2.40418〉, having a negative regression coefficient, indicates that the numerical value of LUMO should be less than 2.40418 (more electrophilic), which in turn gives zero impact of the spline function and a subsequent improvement of activity. This implies that more electrophilic compounds (such as compound nos. 18, 25, 27 and 35) have more PfTMPK inhibitory activity than less electrophilic compounds (such as compound nos. 3, 10, 20 and 21), suggesting that the molecule should be more electrophilic for better PfTMPK inhibitory activity.

The least significant of the three selected descriptors, nO, refers to the number of oxygen atoms present in the molecules. The positive regression coefficient of this descriptor indicates that a larger number of oxygen atoms favors PfTMPK inhibitory activity, as shown by compound nos. 25, 30 and 31. Compound nos. 3, 4, 5 and 22 show a poor PfTMPK inhibitory activity profile, which may be attributed to the correspondingly smaller number of oxygen atoms in these molecules.
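The way the two spline terms and the nO term combine in Eq. (1) can be illustrated with a short sketch. It assumes the usual truncated-spline convention, in which 〈x〉 equals x when x is positive and zero otherwise, which matches the "zero contribution" reading given above. The function names and the descriptor values in the example are hypothetical and are not taken from Table 1.

```python
def spline(x):
    """Truncated spline term <x>: x when positive, otherwise zero."""
    return max(0.0, x)


def predict_pki(s_ssnh, lumo, n_o):
    """Evaluate Eq. (1) for one compound from its three descriptor values."""
    return (
        3.5057
        - 0.5973 * spline(7.44098 - s_ssnh)   # penalty when S_ssNH is below the knot
        - 10.2273 * spline(lumo - 2.40418)    # penalty when LUMO is above the knot
        + 0.2269 * n_o                        # each oxygen atom adds to pKi
    )


# Hypothetical descriptor values, for illustration only.
above_knot = predict_pki(s_ssnh=8.0, lumo=2.0, n_o=5)   # both spline terms vanish
below_knot = predict_pki(s_ssnh=5.0, lumo=2.0, n_o=5)   # S_ssNH term is penalized
print(f"{above_knot:.4f} {below_knot:.4f}")
```

In this sketch a compound whose S_ssNH value lies above the knot and whose LUMO value lies below its knot is predicted from the intercept and the nO term alone, so any shortfall in S_ssNH or excess in LUMO can only lower the predicted pKi.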

The applicability domain of the developed model was analyzed according to the DModX (distance to model) approach using SIMCA-P software. Fig. 3 shows that all the test set compounds are within the critical DModX value at the 90% confidence level (D-critical = 2.048), indicating that all the test set compounds are within the applicability domain.

3.2. Pharmacophore modeling

A total of 10 pharmacophore hypotheses (Table 5) were developed using 23 training set compounds based on the conformers obtained from the BEST method of conformer generation. All the hypotheses obtained were satisfactory in terms of the total cost values, which were close to the fixed cost value and far from the null cost of the model for each of the hypotheses. Good overall correlation between the observed and calculated activity data was also reflected in the satisfactory correlation coefficients of the 10 hypotheses. Further, the acceptability of the developed models was assessed based on their external predictive potential. Thus, external validation was performed based on the mapping of test set compounds to each of the developed hypotheses with the subsequent calculation of the $R^2_{pred}$ parameter. Amongst all, Hypothesis 2 yielded the best results in terms of external predictive potential ($R^2_{pred}$ = 0.638). The proximity between the observed and the predicted activity data for the test set compounds was also analyzed based on the calculation of the $r^2_{m(test)}$ metrics. Statistically significant results were obtained only in case of Hypothesis 2, thereby indicating the predictive ability of the developed model. Additionally, the $Q^2_{(f2)}$ parameter was calculated taking into consideration the average observed activity data of the test set molecules. A small difference between the two metrics, $Q^2_{(f2)}$ and $R^2_{pred}$, indicates stability of the developed model, which is aptly reflected in the results obtained for Hypothesis 2. The acceptability of the developed pharmacophore model was also assessed based on different qualitative validation parameters. This ensures the ability of the model to rightly predict the active and the inactive compounds belonging to the training and test sets. Based on the ability of the model to distinguish between the two classes, different qualitative validation parameters were calculated for the training and test sets (listed in Table 6). The results obtained for all the qualitative validation metrics were within the acceptable range, signifying the predictive potential of the developed model both internally and externally.

Table 4. Results of QSAR and pharmacophore model (Hypothesis 2) obtained according to Golbraikh and Tropsha (2002) criteria.

| No. | Parameters | Pharmacophore model (Hypothesis 2) | Remarks | GFA QSAR model | GFA followed by PLS | Remarks | Threshold value |
|---|---|---|---|---|---|---|---|
| 1 | $r^2$ | 0.650 | Passed | 0.831 | 0.817 | Passed | $r^2$ > 0.6 |
| 2 | $(r^2 - r_0^2)/r^2$ | 0.012 | Passed | 0.002 | 0.001 | Passed | < 0.1 |
|   | $(r^2 - r_0'^2)/r^2$ | 0.147 |  | 0.063 | 0.073 |  |  |
| 3 | $k$ | 1.011 | Passed | 1 | 1 | Passed | 0.85 < $k$ or $k'$ < 1.15 |
|   | $k'$ | 0.984 |  | 0.998 | 0.997 |  |  |

Table 5. Results for the 10 pharmacophore hypotheses generated using conformers developed from the BEST method of conformer search.

| Hypothesis no. | Total cost | Error cost | RMS | Correlation (R) | Pharmacophoric features | $R^2_{pred}$ | $r^2_{m(test)}$ unscaled | $r^2_{m(test)}$ scaled | $\Delta r^2_{m(test)}$ unscaled | $\Delta r^2_{m(test)}$ scaled | $Q^2_{f2}$ | $r^2_{m(rank)}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 83.973 | 64.655 | 0.943 | 0.932 | HBA, HBD, HBD | 0.116 | – | – | – | – | – | – |
| 2 | 89.510 | 70.214 | 1.171 | 0.893 | HBA, HBD, HBD | 0.638 | 0.521 | 0.537 | 0.143 | 0.079 | 0.634 | 0.576 |
| 3 | 99.814 | 78.840 | 1.457 | 0.829 | HBA, HBD, HBD, HYD | −3.22 | – | – | – | – | – | – |
| 4 | 101.761 | 76.889 | 1.397 | 0.857 | HBA, HBD, HYD | 0.490 | – | – | – | – | – | – |
| 5 | 103.081 | 83.047 | 1.577 | 0.795 | HBA, HBD, HBD | 0.573 | 0.427 | 0.400 | 0.320 | 0.312 | 0.568 |  |
| 6 | 104.192 | 83.847 | 1.599 | 0.789 | HBA, HBD, HBD, HYD | −3.206 | – | – | – | – | – | – |
| 7 | 104.295 | 84.797 | 1.625 | 0.781 | HBA, HBD, HBD | 0.533 | 0.415 | 0.431 | 0.103 | 0.061 | 0.528 | – |
| 8 | 104.593 | 82.984 | 1.576 | 0.796 | HBA, HBA, HBD, HYD | −3.512 | – | – | – | – | – | – |
| 9 | 104.960 | 85.347 | 1.639 | 0.776 | HBA, HBD, HBD | 0.518 | 0.243 | 0.257 | 0.435 | 0.398 | 0.512 | – |
| 10 | 105.066 | 85.573 | 1.654 | 0.774 | HBA, HBD, HBD | −0.684 | – | – | – | – | – | – |

Configuration cost = 18.15. Null cost = 132.126. Fixed cost = 76.713.

Table 6. Statistical results according to qualitative validation parameters and randomization test for the best pharmacophore Hypothesis 2.

| Hypothesis 2 | Recall | Precision | Specificity | Accuracy | F-Measure | G-Mean | Cohen's κ | GH | MCC |
|---|---|---|---|---|---|---|---|---|---|
| Training set | 100.00 | 90.00 | 92.86 | 95.65 | 94.737 | 0.964 | 0.911 | 0.907 | 0.914 |
| Test set | 100.00 | 100.000 | 100.000 | 100.00 | 100.00 | 1.00 | 1.00 | 1.00 | 1.00 |

Fig. 4. Best ranking three feature pharmacophore (Hypothesis 2) developed with conformers from the BEST method.

Fig. 5. Mapping of the pharmacophore obtained from Hypothesis 2 (developed with conformers from the BEST method) onto one of the most active compounds (compound no. 35).

Fig. 6. Mapping of the pharmacophore obtained from Hypothesis 2 (developed with conformers from the BEST method) onto one of the least active compounds (compound no. 8).

Fig. 7. Docked conformation of one of the most active compounds (compound no. 25) with interacting amino acid residues obtained from the software using the Ligand-Fit module available in Discovery Studio 2.1.

Fig. 8. Docked conformation of one of the least active compounds (compound no. 2) with interacting amino acid residues obtained from the software using the Ligand-Fit module available in Discovery Studio 2.1.

Subsequently, Hypothesis 2 (Fig. 4) was selected as the best ranking three feature (HBA, HBD, HBD) pharmacophore based on the results obtained for the different qualitative and quantitative validation metrics calculated for both the training and test sets. The HBA feature is located at a distance of 8.940 Å and 8.179 Å from the two HBD features. The HBD feature lying at a distance of 8.940 Å from the HBA feature makes an angle of 60.283° with the other HBD feature, while the two HBD features lie at a distance of 7.005 Å from each other (Fig. 4). The vectors for the HBA and HBD features indicate the direction of formation of hydrogen bonds between the electronegative group and the electropositive hydrogen atom of the drug molecule, respectively, and the corresponding feature at the receptor site. Mapping of the most active compound (compound no. 35) (Fig. 5) with the developed pharmacophore indicates that the HBA feature matches with the ketonic group attached at the C-4 position of the parent thymidine nucleus. One of the HBD features maps with the hydroxyl group attached at the 3′ position of the parent nucleus, while the other HBD feature maps with the –NH– group bearing the aryl substituent, which forms part of the urea side chain attached at the 5′ position of the thymidine nucleus. The HBA feature indicates regions in the molecule favorable for substitution with electronegative groups, and such an observation matches with the QSAR analysis, which indicates that molecules with an increased number of oxygen atoms show an improvement in antimalarial activity (as in case of compound nos. 25, 30 and 31). Additionally, the HBD features indicate regions bearing electropositive hydrogen atoms attached to strong electronegative groups (–NH– and –OH fragments in this case). This observation also corroborates the results of the QSAR analysis, which indicate an influence of the secondary amine (–NH–) fragment on the activity profile of the molecules. The presence of such matching features in case of compound nos. 16, 27 and 28 accounts for their increased PfTMPK inhibitory activity. Fig. 6 shows the mapping of the least active compound with the developed pharmacophore. Due to the lack of all the essential features, compound no. 8 fails to map perfectly with the developed pharmacophore and thus shows the lowest activity profile.
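The inter-feature geometry quoted for Hypothesis 2 can be cross-checked with the law of cosines, since the three feature centres form a triangle with the reported side lengths. The sketch below is only a consistency check on the published distances; the function name `angle_deg` is hypothetical, and it assumes the reported angle is the one at the HBD feature that lies 8.940 Å from the HBA feature.

```python
import math


def angle_deg(adjacent_a, adjacent_b, opposite):
    """Angle (degrees) between two triangle sides, given the side opposite to it."""
    cos_theta = (adjacent_a ** 2 + adjacent_b ** 2 - opposite ** 2) / (
        2 * adjacent_a * adjacent_b
    )
    return math.degrees(math.acos(cos_theta))


# Distances in angstroms, as reported for Hypothesis 2.
hba_to_hbd1 = 8.940   # HBA to the first HBD feature
hba_to_hbd2 = 8.179   # HBA to the second HBD feature
hbd1_to_hbd2 = 7.005  # distance between the two HBD features

# Angle at the first HBD feature, between the HBA and the second HBD feature.
angle_at_hbd1 = angle_deg(hba_to_hbd1, hbd1_to_hbd2, hba_to_hbd2)
print(f"{angle_at_hbd1:.2f} degrees")
```

Under that assumption, the three distances reproduce the reported angle to within a few hundredths of a degree.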

3.3. Molecular docking

We have performed molecular docking studies using the Ligand-Fit module of the receptor–ligand interactions section available under Discovery Studio 2.1 (Accelrys) to understand the orientation of the inhibitors in the binding pocket of the enzyme and the interaction between the target (enzyme) and the small molecules (ligands) at the molecular level. The inhibitors were docked in the TMP binding site. The ligand–receptor interactions revealed that the PfTMPK inhibitors bind to a pocket containing the amino acid residues Asp17, Lys21, Leu59, Phe44, Arg47, Pro45, Phe74, Arg78, Arg99, Tyr100, Ser103, Gly104, Tyr107 and Tyr153. The RMSD value between the co-crystallized TMP and the re-docked TMP obtained using Discovery Studio 2.1 (Accelrys) was 0.6931 Å, which indicates that the ligand was bound in the same binding pocket and interacted with the surrounding amino acid residues. Ligand–receptor interactions suggest that one of the most active compounds (compound no. 25) forms hydrogen bonds with Arg78 (using one of the keto groups of the thymidine nucleus, which acts as a hydrogen bond acceptor feature based on the results obtained from the pharmacophore modeling studies), Arg99 (using the –NH group of the urea moiety, which also acts as a hydrogen bond donor based on the results obtained from pharmacophore modeling), Arg47 (using the –C=O group of the urea moiety), Asp17 (using the hydroxyl group of the thymidine nucleus, which acts as a hydrogen bond donor group according to the pharmacophore analysis) and Ser22 (using one of the oxygen atoms of the nitro group attached to the phenyl ring at the para position), a π–π interaction with Phe74 (using the thymine ring of the thymidine nucleus), and π–cation interactions with Tyr43 (using the nitrogen atom of the nitro group attached to the phenyl ring at the para position) and Arg99 (using the thymine ring of the thymidine nucleus), explaining its better PfTMPK inhibitory activity (Fig. 7). Unlike compound no. 25, compound no. 2 shows a low activity profile despite forming hydrogen bonds with Arg78 (using one of the keto groups of the thymidine nucleus, which acts as a hydrogen bond acceptor based on the results obtained from the pharmacophore modeling studies) and Arg99 (–C=O group of the amide moiety), a π–π interaction with Phe74 (with the thymine ring of the thymidine nucleus) and π–cation interactions with Tyr43 (nitrogen atom of the nitro group attached to the phenyl ring at the para position) and the Na ion. This may be attributed to the inability of compound no. 2 (lacking one of the side chain –NH– fragments) to form hydrogen bonds with the Arg47, Asp17 and Ser22 residues, which implies a failure of the molecule to fit properly into the binding pocket of the enzyme (Fig. 8). The docking scores of all the inhibitors are given in Table 1. The QSAR and pharmacophore modeling studies are thus well corroborated by the docking studies.

Fig. 9. Schematic diagram of mechanistic interpretation of QSAR, pharmacophore mapping and docking studies of PfTMPK inhibitors for one of the most active compounds (compound no. 25).
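The re-docking check reported in this section rests on the root-mean-square deviation (RMSD) between the co-crystallized and re-docked ligand coordinates. A minimal sketch of that quantity is given below. It assumes that both poses list the same atoms in the same order and share the receptor's coordinate frame, so no superposition is applied; how Discovery Studio selects atoms for its RMSD is not described in the text. The function name and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """In-place RMSD between two poses with identical atom ordering."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("poses must be non-empty and contain the same atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical three-atom poses (angstroms), for illustration only.
crystal_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
redocked_pose = [(x + 0.3, y + 0.4, z) for x, y, z in crystal_pose]

pose_rmsd = rmsd(crystal_pose, redocked_pose)
print(f"RMSD = {pose_rmsd:.3f} angstroms")
```

A value well below about 2 Å is the usual rule of thumb for a successful re-docking, which is why the reported 0.6931 Å is read as the ligand returning to the same pocket.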

4. Overview and conclusions

The work presents QSAR analysis, pharmacophore modeling and docking studies using a set of thymidine analogs having well defined PfTMPK inhibitory activity. For the development of the QSAR and pharmacophore models, the total dataset was divided into training and test sets using the k-means clustering technique. The best QSAR model, evolved from GFA followed by PLS, was validated using both internal and external validation parameters. The applicability domain of the developed QSAR model was checked according to the DModX approach using SIMCA-P software. The developed pharmacophore models were analyzed in terms of their correlation coefficients and cost function values, and the best hypothesis thus selected was further validated using both qualitative (Recall, Precision, Specificity, Accuracy, F-measure, G-mean, Cohen's κ, Güner–Henry score and Matthews correlation coefficient) and quantitative ($R^2_{pred}$ and $r^2_{m(test)}$) validation parameters. Both the QSAR and pharmacophore models were statistically significant, and their results corroborate each other well. Further, the binding interaction between the molecules and the receptor was assessed using docking analysis. All the dataset molecules were docked in the TMP binding site of the trimeric enzyme (pdb: 2WWF) at the C chain using the LigandFit option of the receptor–ligand interactions protocol section available in Discovery Studio 2.1. The low RMSD value between the co-crystallized ligand and the re-docked ligand indicated that the ligand was bound in the same binding pocket. After consideration of the QSAR, pharmacophore mapping and docking studies, it can be concluded that (Fig. 9):

1. Presence of the –NH– fragment in the molecule is important for PfTMPK inhibitory activity (conclusion drawn from QSAR (indicates a conducive effect of –NH– fragments), pharmacophore mapping (acts as a hydrogen bond donor) and docking studies (forms H-bonds with nearby amino acid residues));

2. Presence of the –OH group at the 3′ position of the thymidine nucleus is important for PfTMPK inhibitory activity (conclusion drawn from QSAR (the number of oxygen atoms should be higher), pharmacophore mapping (acts as a hydrogen bond donor group) and docking studies (forms an H-bond with the nearby amino acid residue));

3. The number of oxygen atoms in the molecule should be higher for PfTMPK inhibitory activity (conclusion drawn from QSAR (the number of oxygen atoms should be higher), pharmacophore mapping (act as hydrogen bond donor and acceptor features) and docking studies (form H-bonds with surrounding amino acid residues));

4. The urea moiety substituted at the 5′ position of the thymidine nucleus is important for maintaining proper geometry of the molecules at the receptor cavity of the target enzyme for proper interactions (conclusion drawn from pharmacophore mapping and docking studies).

To the best of our knowledge, this work presents the first QSAR and pharmacophore report for thymidine analogs, which may serve as an efficient query tool giving useful insight for the design and synthesis of potent molecules to address the increasing threat of malaria in developing countries. This work therefore provides an understanding of the important structural requirements or essential molecular properties and the requisite features of molecules that ensure appropriate binding of the molecule at the active site of the target enzyme. The models provide important guidance for the chemist to synthesize new molecules with an improved PfTMPK inhibitory activity profile, which in turn may prove useful for the identification of drugs having improved activity with reduced likelihood of cross-resistance.

Acknowledgments

Financial assistance from the UGC (New Delhi) in the form of a fellowship to PKO and a major research project to KR is thankfully acknowledged.

References

Anderson, A.C., 2005. Targeting DHFR in parasitic protozoa. Drug Discovery Today 10, 121–128.

Booker, M.L., Bastos, C.M., Kramer, M.L., Barker, R.H., Skerlj, R., Sidhu, A.B., Deng, X.Y., Celatka, C., Cortese, J.F., Bravo, J.E.G., Llado, K.N.C., Serrano, A.E., Angulo-Barturen, I., Jimenez-Diaz, M.B., Viera, S., Garuti, H., Wittlin, S., Papastogiannidis, P., Lin, J.W., Janse, C.J., Khan, S.M., Duraisingh, M., Coleman, B., Goldsmith, E.J., Phillips, M.A., Munoz, B., Wirth, D.F., Klinger, J.D., Wiegand, R., Sybertz, E., 2010. Novel inhibitors of Plasmodium falciparum dihydroorotate dehydrogenase with anti-malarial activity in the mouse model. J. Biol. Chem. 285, 33054–33064.

Cerius2, 2005. Version 4.10. Accelrys, Inc: San Diego, CA. http://www.accelrys.com/cerius2

Chang, C.Y., Hsu, M.T., Esposito, E.X., Tseng, Y.J., 2013. Oversampling to overcome overfitting: exploring the relationship between data set composition, molecular descriptors, and predictive modeling methods. J. Chem. Inf. Model. 53, 958–971.

Cui, H., Carrero-Lérida, J., Silva, A.P.G., Whittingham, J.L., Brannigan, J.A., Ruiz-Pérez, L.M., Read, K.D., Wilson, K.S., González-Pacanowska, D., Gilbert, I.H., 2012. Synthesis and evaluation of α-thymidine analogues as novel antimalarials. J. Med. Chem. 55, 10948–10957.

Discovery Studio 2.1, 2005. Discovery Studio 2.1 is a Product of Accelrys Inc, San Diego, CA, USA.

Dondorp, A.M., Yeung, S., White, L., Nguon, C., Day, N.P., Socheat, D., von Seidlein, L., 2010. Artemisinin resistance: current status and scenarios for containment. Nat. Rev. Microbiol. 8, 272–280.

DRAGON, 2010. DRAGON Version 6 is Software of TALETE srl, Italy. Available at http://www.talete.mi.it/products/dragon_molecular_descriptors.htm

Fawcett, T., 2006. An introduction to ROC analysis. Pattern Recognit. Lett. 27, 861–874.

Gardner, M.J., Hall, N., Fung, E., White, O., Berriman, M., Hyman, R.W., Carlton, J.M., Pain, A., Nelson, K.E., Bowman, S., Paulsen, I.T., James, K., Eisen, J.A., Rutherford, K., Salzberg, S.L., Craig, A., Kyes, S., Chan, M.S., Nene, V., Shallom, S.J., Suh, B., Peterson, J., Angiuoli, S., Pertea, M., Allen, J., Selengut, J., Haft, D., Mather, M.W., Vaidya, A.B., Martin, D.M.A., Fairlamb, A.H., Fraunholz, M.J., Roos, D.S., Ralph, S.A., McFadden, G.I., Cummings, L.M., Subramanian, G.M., Mungall, C., Venter, J.C., Carucci, D.J., Hoffman, S.L., Newbold, C., Davis, R.W., Fraser, C.M., Barrell, B., 2002. Genome sequence of the human malaria parasite Plasmodium falciparum. Nature 419, 498–511.

Golbraikh, A., Tropsha, A., 2002. Beware of q2! J. Mol. Graphics Modell. 20, 269–276.

Hecht, D., Cheung, M., Fogel, G.B., 2008. QSAR using evolved neural networks for the inhibition of mutant PfDHFR by pyrimethamine derivatives. BioSystems 92, 10–15.

Kandeel, M., Ando, T., Kitamura, Y., Abdel-Aziz, M., Kitade, Y., 2009. Mutational, inhibitory and microcalorimetric analyses of Plasmodium falciparum TMP kinase. Implications for drug discovery. Parasitology 136, 11–25.

Kirkpatrick, P., 2004. Virtual screening: gliding to success. Nat. Rev. Drug Discovery 3, 294–299.

Kurogi, Y., Güner, O.F., 2001. Pharmacophore modeling and three-dimensional database searching for drug design using catalyst. Curr. Med. Chem. 8, 1035–1055.

Landavazo, D.G., Fogel, G.B., Fogel, D.B., 2002. Quantitative structure–activity relationships by evolved neural networks for the inhibition of dihydrofolate reductase by pyrimidines. BioSystems 65, 37–47.

Leonard, J.T., Roy, K., 2006. On selection of training and test sets for the development of predictive QSAR models. QSAR Comb. Sci. 25, 235–251.

Li, H., Sutter, J., Hoffmann, R.D., 2000. Pharmacophore perception, development, and use in drug design. In: Güner, O.F., Jolla, C.A.L. (Eds.), IUL Biotechnology Series. International University Line, San Diego, CA.

Matthews, B.W., 1975. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim. Biophys. Acta 405, 442–451.

Melagraki, G., Afantitis, A., 2013. Enalos KNIME nodes: exploring corrosion inhibition of steel in acidic medium. Chemom. Intell. Lab. Syst. 123, 9–14.

MINITAB, 2004. MINITAB is a Statistical Software of Minitab Inc., USA.

Mitra, I., Saha, A., Roy, K., 2010. Exploring quantitative structure–activity relationship studies of antioxidant phenolic compounds obtained from traditional Chinese medicinal plants. Mol. Simul. 36, 1067–1079.

Ojha, P.K., Mitra, I., Das, R.N., Roy, K., 2011. Further exploring $r_m^2$ metrics for validation of QSPR models. Chemom. Intell. Lab. Syst. 107, 194–205.

Ojha, P.K., Mitra, I., Kar, S., Das, R.N., Roy, K., 2012. Lead hopping for PfDHODH inhibitors as antimalarials based on pharmacophore mapping, molecular docking and comparative binding energy analysis (COMBINE): a three-layered virtual screening approach. Mol. Inf. 31, 711–718.

Ojha, P.K., Roy, K., 2010a. Chemometric modeling, docking and in silico design of triazolopyrimidine-based dihydroorotate dehydrogenase inhibitors as antimalarials. Eur. J. Med. Chem. 45, 4645–4656.

Ojha, P.K., Roy, K., 2010b. Chemometric modelling of antimalarial activity of aryltriazolylhydroxamates. Mol. Simul. 36, 939–952.

Ojha, P.K., Roy, K., 2011a. Exploring molecular docking and QSAR studies of plasmepsin-II inhibitor di-tertiary amines as potential antimalarial compounds. Mol. Simul. 37, 779–803.

Ojha, P.K., Roy, K., 2011b. Exploring QSAR, pharmacophore mapping and docking studies and virtual library generation for cycloguanil derivatives as PfDHFR-TS inhibitors. Med. Chem. 7, 173–199.

Ojha, P.K., Roy, K., 2011c. Comparative QSARs for antimalarial endochins: importance of descriptor-thinning and noise reduction prior to feature selection. Chemom. Intell. Lab. Syst. 109, 146–161.

Ojha, P.K., Roy, K., 2013. First report on exploring structural requirements of 1,2,3,4-tetrahydroacridin-9(10H)-one analogs as antimalarials using multiple QSAR approaches: descriptor-based QSAR, CoMFA–CoMSIA 3DQSAR, HQSAR and G-QSAR approaches. Comb. Chem. High Throughput Screening 16, 7–21.

Poptodorov, K., Luu, T., Hoffmann, R.D., 2006. In: Langer, T., Hoffmann, R.D. (Eds.), In Methods and Principles in Medicinal Chemistry, Pharmacophores and Pharmacophores Searches. Wiley-VCH, Weinheim.

Reyes, P., Rathod, P.K., Sanchez, D.J., Mrema, J.E.K., Rieckmann, K.H., Heidrich, H.G., 1982. Enzymes of purine and pyrimidine metabolism from the human malaria, Plasmodium falciparum. Mol. Biochem. Parasitol. 5, 275–290.

Rogers, D., Hopfinger, A.J., 1994. Application of genetic function approximation to quantitative structure–activity relationships and quantitative structure–property relationships. J. Chem. Inf. Comput. Sci. 34, 854–866.

Roy, K., Chakraborty, P., Mitra, I., Ojha, P.K., Kar, S., Das, R.N., 2013. Some case studies on application of "$r_m^2$" metrics for judging quality of quantitative structure–activity relationship predictions: emphasis on scaling of response data. J. Comput. Chem. 34, 1071–1082.

Roy, K., Mitra, I., Kar, S., Ojha, P.K., Das, R.N., Kabir, H., 2012a. Comparative studies on some metrics for external validation of QSPR models. J. Chem. Inf. Model. 52, 396–408.

Roy, K., Mitra, I., Ojha, P.K., Kar, S., Das, R.N., Kabir, H., 2012b. Introduction of $r^2_{m(rank)}$ metric incorporating rank-order predictions as an additional tool for validation of QSAR/QSPR models. Chemom. Intell. Lab. Syst. 118, 200–210.

Roy, K., Ojha, P.K., 2010. Advances in quantitative structure–activity relationship models of antimalarials. Expert Opin. Drug Discovery 5, 751–778.

Santos-Filho, O.A., Mishra, R.K., Hopfinger, A.J., 2001. Free energy force field (FEFF) 3DQSAR analysis of a set of P. falciparum dihydrofolate reductase inhibitors. J. Comput.-Aided Mol. Des. 15, 787–810.

Schlitzer, M., 2007. Malaria chemotherapeutics part I: History of antimalarial drug development, currently used therapeutics, and drugs in clinical development. Chem. Med. Chem. 2, 944–986.

Seal, A., Yogeeswari, P., Sriram, D., Consortium, O.S.D.D., Wild, D.J., 2013. Enhanced ranking of PknB inhibitors using data fusion methods. J. Chem. Inf. 5, 2.

Smellie, A., Teig, S.L., Towbin, P., 1995. Poling: promoting conformational variation. J. Comput. Chem. 16, 171–187.

SPSS, 1999. SPSS is Statistical Software of SPSS Inc., USA.

Sutter, J., Güner, O., Hoffmann, R.D., Li, H., Waldman, M., 2000. Pharmacophore perception, development, and use in drug design. In: Güner, O.F., Jolla, C.A.L. (Eds.), IUL Biotechnology Series. International University Line, San Diego, CA.

UMETRICS, 2002. UMETRICS SIMCA-P 10.0. www.umetrics.com. Umea, Sweden.

Venkatesan, S.K., Shukla, A.K., Dubey, V.K., 2010. Molecular docking studies of selected tricyclic and quinone derivatives on trypanothione reductase of Leishmania infantum. J. Comput. Chem. 31, 2463–2475.

Weekes, D., Fogel, G.B., 2003. Evolutionary optimization, backpropagation, and data preparation issues in QSAR modeling of HIV inhibition by HEPT derivatives. BioSystems 72, 149–158.

Wells, T.N.C., Alonso, P.L., Gutteridge, W.E., 2009. New medicines to improve control and contribute to the eradication of malaria. Nat. Rev. Drug Discovery 8, 879–891.

Whittingham, J.L., Carrero-Lerida, J., Brannigan, J.A., Ruiz-Perez, L.M., Silva, A.P., Fogg, M.J., Wilkinson, A.J., Gilbert, I.H., Wilson, K.S., González-Pacanowska, D., 2010. Structural basis for the efficient phosphorylation of AZT-MP (3′-azido-3′-deoxythymidine monophosphate) and dGMP by Plasmodium falciparum type I thymidylate kinase. Biochem. J. 428, 499–509.

Wold, S., Sjostrom, M., Eriksson, L., 2001. PLS-regression: a basic tool of chemometrics. Chemom. Intell. Lab. Syst. 58, 109–130.

World Malaria Report, 2011. WHO, Geneva, Switzerland.

