ORIGINAL RESEARCH
Predictive 3D-QSAR and HQSAR model generation of isocitrate lyase (ICL) inhibitors by various alignment methods combined with docking study
Nirzari Gupta • Vivek K. Vyas • Bhumika Patel •
Manjunath Ghate
Received: 11 February 2013 / Accepted: 22 October 2013
© Springer Science+Business Media New York 2013
Abstract Isocitrate lyase (ICL) is one of the most
important targets in the treatment of Mycobacterium
tuberculosis infection. In this study, a diverse set of 2-benzanilide
derivatives was aligned by two different methods for
CoMFA, CoMSIA, and HQSAR analysis. The best CoMFA
model was obtained with an internal validation value
(q2) of 0.730 and a conventional coefficient (r2) of 0.944.
Various CoMSIA models were generated and cross-validated;
the best cross-validation coefficient (q2) was
statistically satisfactory (0.688). Both
models were validated with a test set of 10 compounds, giving
satisfactory predictive values (r2pred) of 0.725 and 0.631 for
CoMFA and CoMSIA, respectively. A cross-validation
coefficient (q2) of 0.694 and an r2 of 0.856 were
obtained for the HQSAR study. The docking study reveals that
the R substituents of these compounds occupy a large
hydrophobic pocket, and an electronegative surface is observed
near the R1 substituent. The results of the 3D-QSAR analysis
corroborate the molecular docking results, and our
findings will serve as a basis for further development of
better allosteric inhibitors of ICL against
M. tuberculosis.
Keywords Isocitrate lyase (ICL) · CoMFA · CoMSIA · Hologram QSAR · Distil · Docking · Mycobacterium tuberculosis · Tripos
Introduction
Mycobacterium tuberculosis is an etiological agent for
tuberculosis (TB). The incidence of TB has steadily risen
in the last years and TB is the world’s second most com-
mon cause of death from infectious diseases after acquired
immunodeficiency syndrome (AIDS) (Ginsburg et al.,
2003). By 2020, it is estimated that one billion people will
be infected, over 125 million people will get sick, and over
30 million will die of TB if control is not further
strengthened. Moreover, the emergence of new virulent
forms such as multidrug-resistant (MDR-TB) and extensively
drug-resistant (XDR-TB) tuberculosis has become a major threat to
humankind (Rusell et al., 2010; Russell, 2007). TB is
responsible for 2–3 million deaths every year and, proba-
bly, 30 times as many infections (Manabe and Bishai,
2000). A new agent that targets the persistent state of
growth would have a significant impact on the treatment of
TB by shortening the duration of therapy (Nikalie and
Mudassar, 2011). The glyoxylate pathway uses isocitrate
lyase (ICL) and malate synthase to incorporate carbon
during growth of microorganisms on acetate or fatty acids
as the primary carbon source. The glyoxylate cycle is a
reaction sequence in which acetates are converted to suc-
cinates during the energy production and biosynthesis of
cell constituents; this cycle enables bacteria and fungi to
grow on acetate in a hostile environment inside the mac-
rophage where glucose is not available (Ernesto et al.,
2006; Oh et al., 2010). Because ICL is required to sustain the
infection, it is an attractive target for anti-tubercular drugs.
The absence of ICL orthologs in mammals should facilitate
the development of glyoxylate cycle inhibitors as novel
drugs for the treatment of TB. Various moieties have been
identified as ICL inhibitors such as 3-nitropropionate
(McFadden and Purohit, 1977), 3-bromopyruvate (Ko and
N. Gupta · V. K. Vyas · B. Patel · M. Ghate (✉)
Department of Pharmaceutical Chemistry, Institute of Pharmacy,
Nirma University, Ahmedabad 382 481, Gujarat, India
e-mail: [email protected]
N. Gupta
e-mail: [email protected]
Med Chem Res
DOI 10.1007/s00044-013-0865-0
MEDICINAL CHEMISTRY RESEARCH
McFadden, 1990), 3-phosphoglycerate (Ko et al., 1989),
mycenon (Hautzel et al., 1990), oxalate, and itaconate
(Shin et al., 2005). However, these inhibitors are not
pharmacologically suitable for testing in vivo because of
their toxicity and low activity. TB continues to be a global
health threat, which makes 2-benzanilides an important new
class of potential therapeutics. In this study, we developed CoMFA
and CoMSIA 3D-QSAR models of 2-methoxybenzanilides,
2-hydroxybenzanilides, and their thioxo analogs for the
identification of novel benzanilide derivatives as ICL inhibitors
(Kozic et al., 2012). A three-dimensional quantitative
structure–activity relationship (3D-QSAR) can be consid-
ered as the ensemble of steric and electrostatic features of
different compounds which are necessary to ensure optimal
supramolecular interactions with a specific biological tar-
get structure and to trigger or to block its biological
response (Caballero 2010). A QSAR model was developed
using CoMFA (comparative molecular field analysis) and
CoMSIA (comparative similarity indices analysis) methods.
CoMFA, developed by Cramer et al. (1988), is one of the most
commonly used 3D-QSAR techniques in drug discovery.
Steric and electrostatic fields are obtained
from CoMFA analysis. In CoMSIA, similarity indices are
calculated at regularly placed grid points for the aligned
molecules. Besides electrostatic and steric fields, CoMSIA
also calculates other descriptors like hydrophobic, hydro-
gen bond donor (HBD) and hydrogen bond acceptor (HBA)
(Zeng and Zhang, 2010). The purpose of the HQSAR study
is to explore individual atomic contributions to molecular
bioactivity, with a visual display of the active centers in a
compound. Furthermore, because the HQSAR technique does not
require molecular superposition, its results can serve as a
control for assessing our CoMFA and CoMSIA results
(Zhu et al., 2005). We therefore expected that
these three QSAR models would provide new, useful
information for designing ICL inhibitors with simpler
structures (Reddy et al., 2012).
Experimental methods and materials
Dataset
In this study, a dataset of 49 2-benzanilide derivatives with
inhibitory activity toward the ICL enzyme was taken from the
work of Kozic et al. (2012). Because of the wide range of
biological activity and the structural variation among the
compounds, the series was considered suitable for
QSAR analysis. Biological activity was reported as the
minimum inhibitory concentration (MIC). The MIC values were
converted to pMIC, the negative logarithm of the MIC
expressed in mol/L, and subsequently used as the dependent
variable for the 3D-QSAR study (Table 1), thus correlating the
data linearly to the free energy change (Murugesan et al.,
2009).
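As an illustration of the conversion, the following minimal sketch assumes MIC values given in µmol/L, as listed in Table 1; the function names are hypothetical. The pMIC values listed in Table 1 appear to equal 6 + log10(MIC in µmol/L) (for example, 125 µmol/L gives 8.097) rather than the conventional -log10(MIC in mol/L), so both quantities are computed for comparison.

```python
import math


def pmic_conventional(mic_umol_per_l):
    """Conventional pMIC: negative log10 of the MIC expressed in mol/L."""
    return -math.log10(mic_umol_per_l * 1e-6)


def pmic_as_tabulated(mic_umol_per_l):
    """Quantity that reproduces the Table 1 column.

    This is an observation about the tabulated numbers, not a formula
    stated by the authors.
    """
    return 6.0 + math.log10(mic_umol_per_l)


for mic in (2, 4, 125, 250):
    print(mic, round(pmic_conventional(mic), 3), round(pmic_as_tabulated(mic), 3))
```

Under either convention the two columns are linearly related, so correlation statistics are unaffected, but the direction of "higher activity" is reversed between them.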
Selection of training and test set
The total set of 49 inhibitors was divided into a training set
(39 compounds) for generating the 3D-QSAR models and a test
set (10 compounds) for validating the quality of the models.
The test set molecules were selected so that they
span a range of activity similar to that of the training set
and are therefore representative of it.
This was achieved by arbitrarily setting aside 10
compounds with regularly distributed biological
data as the test set.
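One common way to obtain a test set with regularly distributed activities is to rank the compounds by activity and pick evenly spaced entries. The sketch below illustrates that idea only; it is not the authors' procedure (which is described as arbitrary), and the function name and the synthetic activities are hypothetical.

```python
def split_by_activity(activities, n_test=10):
    """Split compound ids into (train_ids, test_ids).

    activities: dict mapping compound id -> activity value (e.g. pMIC).
    Test compounds are taken at evenly spaced ranks so that they span
    the same activity range as the training set.
    """
    ranked = sorted(activities, key=activities.get)
    step = len(ranked) / n_test
    # Take the compound in the middle of each of n_test equal rank slices.
    test_ids = [ranked[int(i * step + step / 2)] for i in range(n_test)]
    train_ids = [cid for cid in ranked if cid not in test_ids]
    return train_ids, test_ids


# Synthetic example with 49 compounds (values are placeholders, not Table 1 data).
demo = {cid: 6.0 + 0.05 * cid for cid in range(1, 50)}
train, test = split_by_activity(demo, n_test=10)
print(len(train), len(test))
```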
Computational details
The CoMFA and CoMSIA studies were performed using
the SYBYL X 1.2 software from Tripos Inc., St. Louis, MO,
USA. Structures of all the compounds were built using the
Sketch option. Partial atomic charges were assigned to each
atom by the Gasteiger–Hückel method, and the structures were
then energy minimized under the Tripos force field by the
Powell method for 100 iterations with a gradient convergence
criterion of 0.01 kcal/(mol Å) (Zambre et al.,
2009).
Alignment of dataset
The first step in building a 3D-QSAR model from a set of
ligands is the alignment of the molecules, including position,
rotation, and conformation. Here the alignment was
performed using two different methods: (a) rigid alignment
using the Distil function and (b) pharmacophoric alignment
using DISCOtech. The common substructure for the rigid
alignment can be seen in Fig. 1. Compound 23 (the most
active) was used as the template for alignment of the whole
dataset in the rigid alignment (Fig. 2a) (Murugesan et al.,
2009). In the pharmacophoric alignment, the first step in
building a pharmacophore model from the set of ligands was
likewise the alignment of the molecules (Liu et al., 2010). A
pharmacophore contains essential features such as hydrogen
bond donors, hydrogen bond acceptors, hydrophobic sites, and
aromatic ring sites (Fig. 2b). DISCOtech uses clique
detection methods to generate multiple pharmacophore
hypotheses that can be compared and refined. Conformers
were generated using Confort™ and stochastic methods, and
the models were ranked using a scoring function. The pharmacophore
model scoring function is based on the number of features,
the number of molecules that fit the model, and the
inter-feature distances. Out of the models developed, the highest
Table 1 Chemical structures and activity of 2-benzanilide derivatives used in 3D-QSAR study
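[Structure scheme: common benzanilide scaffold bearing substituents R and R1. Compounds 1-18: 2-OCH3 with amide (C=O) linkage; compounds 19-36: 2-OCH3 with thioamide (C=S) linkage; compounds 37-49: 2-OH with amide (C=O) linkage]
Compound number R R1 MIC (µmol L-1) pMIC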
1 4-Cl 3-NO2 125 8.097
2 4-Cl 4-NO2 250 8.398
3 4-Cl 3-Cl 125 8.097
4 4-Cl 4-Cl 125 8.097
5a 4-Cl 3,4-diCl 125 8.097
6 4-Cl 3-Br 250 8.398
7 4-Cl 4-Br 62.5 7.795
8a 4-Cl 3-CF3 250 8.398
9 4-Cl 4-CF3 125 8.097
10 5-Cl 3-NO2 125 8.097
11 5-Cl 4-NO2 125 8.097
12a 5-Cl 3-Cl 250 8.398
13 5-Cl 4-Cl 62.5 7.795
14 5-Cl 3,4-diCl 125 8.097
15 5-Cl 3-Br 125 8.097
16 5-Cl 4-Br 16 7.204
17 5-Cl 3-CF3 4 6.602
18a 5-Cl 4-CF3 250 8.398
19 4-Cl 3-NO2 8 6.903
20 4-Cl 4-NO2 8 6.903
21 4-Cl 3-Cl 4 6.602
22 4-Cl 4-Cl 4 6.602
23 4-Cl 3,4-diCl 2 6.301
24a 4-Cl 3-Br 32 7.505
25 4-Cl 4-Br 4 6.602
26 4-Cl 3-CF3 4 6.602
27 4-Cl 4-CF3 8 6.903
28a 5-Cl 3-NO2 250 8.398
29 5-Cl 4-NO2 32 7.505
30a 5-Cl 3-Cl 16 7.204
31 5-Cl 4-Cl 16 7.204
32 5-Cl 3,4-diCl 8 6.903
33 5-Cl 3-Br 32 7.505
34a 5-Cl 4-Br 16 7.204
35 5-Cl 3-CF3 16 7.204
36 5-Cl 4-CF3 16 7.204
37a 4-Cl 3-NO2 8 6.903
38 4-Cl 3-CF3 4 6.602
39 4-Cl 4-CF3 4 6.602
40 4-Cl 3-NO2 4 6.602
41 4-Cl 3-CF3 4 6.602
scored model was considered the best, and its features were
considered essential for ICL inhibition. The model was
validated using the ROC curve method. The ROC curve is a
plot of sensitivity versus 1 − specificity, and the area
under the ROC curve (AUC) is an important measure of
the performance of the test.
$$\mathrm{AUC} = \sum_{x=2}^{n} se(x)\,\bigl[(1 - sp)(x) - (1 - sp)(x - 1)\bigr]$$
where se(x) is the percentage of true positives relative to the
total positives at rank position x and (1 − sp)(x) is the
percentage of false positives relative to the total negatives at
rank position x (Fawcett, 2006). The ROC curve analysis was
performed using SPSS 15 statistical software. Here, a
decoy set of 147 compounds was combined with the 49 active
compounds of the dataset, and 20 conformers of each compound
were generated using a genetic algorithm-based
global optimizer to search for the low-energy conformations of
the molecules.
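The AUC summation defined above can be sketched in a few lines. This is an illustrative implementation, not the SPSS procedure used in the study; it assumes the compounds have already been ranked from best to worst score, and the function name and the toy ranking are hypothetical.

```python
def roc_auc(ranked_labels):
    """Area under the ROC curve from a ranked list of labels.

    ranked_labels: sequence of booleans ordered from best to worst score;
    True marks an active compound, False a decoy.
    Implements AUC = sum_x se(x) * [(1 - sp)(x) - (1 - sp)(x - 1)].
    """
    n_pos = sum(1 for label in ranked_labels if label)
    n_neg = len(ranked_labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one active and one decoy")
    true_pos = false_pos = 0
    prev_fpr = 0.0
    auc = 0.0
    for is_active in ranked_labels:
        if is_active:
            true_pos += 1
        else:
            false_pos += 1
        sensitivity = true_pos / n_pos   # se(x)
        fpr = false_pos / n_neg          # (1 - sp)(x)
        auc += sensitivity * (fpr - prev_fpr)
        prev_fpr = fpr
    return auc


# Toy ranking: active, decoy, active, decoy.
print(roc_auc([True, False, True, False]))
```

Starting the loop at the first rank with a false-positive rate of zero gives the same result as starting the sum at x = 2, because the first term always vanishes.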
CoMFA analysis
CoMFA steric and electrostatic fields were calculated
using the QSAR module of SYBYL X 1.2. CoMFA uses the
Lennard-Jones potential and the Coulomb potential to calculate
steric and electrostatic fields, respectively (Caballero
et al., 2010). Both fields were
calculated for each molecule using an sp3-hybridized carbon
probe atom with a van der Waals radius of 1.52 Å and a charge of
+1.0. The energy cut-off values for both steric and
electrostatic fields were set to 30.00 kcal/mol with a
distance-dependent dielectric constant. Column filtering was
set to 3.0 kcal/mol to reduce noise and improve efficiency
(Phuong et al., 2004).
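To make the field calculation concrete, the sketch below evaluates a Lennard-Jones (6-12) steric term and a Coulomb electrostatic term for a probe at a single grid point, then truncates both at the 30 kcal/mol cut-off. It is a simplified illustration rather than the SYBYL implementation: the uniform well depth, the 1/r form of the distance-dependent dielectric, and all names are assumptions.

```python
import math

COULOMB_K = 332.06   # kcal*Angstrom/(mol*e^2), standard conversion constant
CUTOFF = 30.0        # kcal/mol, cut-off quoted in the text


def probe_energies(grid_point, atoms, probe_radius=1.52, probe_charge=1.0,
                   well_depth=0.1):
    """Return (steric, electrostatic) probe energies in kcal/mol.

    atoms: iterable of (x, y, z, vdw_radius, partial_charge).
    well_depth is a hypothetical uniform Lennard-Jones epsilon.
    The dielectric is assumed to be eps = r, giving a 1/r^2 Coulomb term.
    """
    steric = 0.0
    electrostatic = 0.0
    for x, y, z, radius, charge in atoms:
        r = max(math.dist(grid_point, (x, y, z)), 1e-6)  # guard against r = 0
        ratio6 = ((radius + probe_radius) / r) ** 6
        steric += well_depth * (ratio6 * ratio6 - 2.0 * ratio6)
        electrostatic += COULOMB_K * probe_charge * charge / (r * r)
    # Truncate at the energy cut-off, as done for CoMFA fields.
    steric = min(steric, CUTOFF)
    electrostatic = max(-CUTOFF, min(electrostatic, CUTOFF))
    return steric, electrostatic


atom = [(0.0, 0.0, 0.0, 1.7, -0.5)]          # hypothetical single atom
print(probe_energies((10.0, 0.0, 0.0), atom))  # far from the atom
print(probe_energies((0.5, 0.0, 0.0), atom))   # inside the atom: values are capped
```

Repeating this evaluation over every lattice point and every aligned molecule yields the descriptor columns that are later passed to PLS.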
CoMSIA analysis
CoMSIA similarity index descriptors were calculated using
the same lattice box as that used in the CoMFA calculations,
with a grid spacing of 2 Å and a C +1 probe atom
Table 1 continued
Compound number R R1 MIC (µmol L-1) pMIC
42 4-Cl 4-CF3 4 6.602
43a 5-Cl 3-NO2 8 6.903
44 5-Cl 4-NO2 4 6.602
45 5-Cl 3-Cl 4 6.602
46 5-Cl 4-Cl 4 6.602
47 5-Cl 3,4-diCl 4 6.602
48 5-Cl 4-Br 8 6.903
49 5-Cl 4-CF3 2 6.301
a Test set compounds
Fig. 1 Common substructure of
whole dataset
Fig. 2 a Rigid alignment of
training set of molecules using
Distil. b Pharmacophoric
alignment of training set of
molecules using DISCOtech
with a radius of 1.0 Å. Because a Gaussian function is used
for the distance dependence, the similarity indices can be
calculated at all grid points, both inside and outside the
molecular surface (Jewell
et al., 2001). In general, the CoMSIA contours are more
compact and centered on the ligand atoms, while
CoMFA contours are scattered at the edge of the ligand
surface. Therefore, CoMSIA results are helpful in
designing new ligands, while CoMFA results are useful
in exploring the complementary features between a ligand
and its receptor active site (McGovern et al., 2010;
Perez-Villanueva et al., 2011).
Hologram QSAR (HQSAR)
HQSAR is a technique based on the concept of using
molecular substructures expressed in a binary pattern
(molecular hologram) as descriptors in QSAR models. The
premise of HQSAR is that the two-dimensional (2D) fin-
gerprint encodes the structure of a molecule, which is the
key determinant of all molecular properties (Kulkarni
et al., 2008). 2D chemical database storage and searching
technologies rely on linear notations that define chemical
structures [Wiswesser line-formula notation (WLN); sim-
plified molecular input line entry system (SMILES); SLN–
SYBYL line notation]. The process involves generation of
fragments that are hashed into array bins in the range of 1
to L (length) wherein the array is called molecular holo-
gram and the bin occupancies are the descriptor variables
(Honorio et al., 2005). In this study, fingerprints were
generated for all substructures between four and seven
atoms in size for all molecules. The substructure finger-
prints were then hashed into hologram bins with lengths of
97, 151, 199, 257, 307, and 353. LOO cross-validation was
applied to determine the number of components that yields
a good predictive model. PLS then yields a mathematical
equation that relates the molecular hologram bin values to
the inhibitory activity of the compounds in the database.
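The hashing step can be illustrated with a minimal sketch. It assumes the fragments have already been enumerated and written as strings; CRC32 is used here only as a stand-in hash, since the text does not describe the hashing function used by SYBYL, and the fragment strings and function name are hypothetical.

```python
import zlib


def molecular_hologram(fragments, length=353):
    """Hash fragment strings into a fixed-length array of bin counts.

    fragments: iterable of fragment strings (e.g. SMILES-like substructures).
    length: hologram length; the study tested 97, 151, 199, 257, 307, and 353.
    Returns a list of bin occupancies that serve as descriptor variables.
    """
    bins = [0] * length
    for fragment in fragments:
        index = zlib.crc32(fragment.encode("utf-8")) % length
        bins[index] += 1
    return bins


# Hypothetical fragments of one molecule (placeholders, not SYBYL output).
hologram = molecular_hologram(["c1ccccc1", "C(=O)N", "OC", "Cl"], length=353)
print(sum(hologram), len(hologram))
```

Different fragments can collide in the same bin, which is one reason several hologram lengths are usually compared.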
PLS analysis and model validation
CoMFA and CoMSIA models were derived using PLS
regression as implemented in the SYBYL. Calculated
CoMFA and CoMSIA descriptors were used as indepen-
dent variables and pMIC values were used as dependent
variables in the PLS regression analysis. Leave-one-out
(LOO) and cross-validation were initially used to evaluate
the predictive capability of the models (q2 and r2cv,
respectively); a bootstrapping run of 10 cycles was then
performed (r2bs) to assess the statistical confidence of the
derived models (Vyas et al., 2013). The optimal number of
components was selected based on the smallest error of
prediction and the highest q2 and r2cv. The PLS analysis
was then repeated with no validation to generate CoMFA
and CoMSIA models. The non-cross-validated models
were assessed by the conventional correlation coefficient
(r2), standard error of estimate (SEE), and F values. The
predictive r2 (r2pred) was based only on the test set
molecules (10 compounds) and is defined as
r2pred = (SD − PRESS)/SD, where SD is
the sum of the squared deviations between the inhibitory
activity of molecules in the test set and the mean inhibitory
activity of the training set molecules, and PRESS is the sum
of the squared deviations between the predicted and actual
activity values for every molecule in the test set (Murumkar
et al., 2011).
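The definition of r2pred translates directly into code. The sketch below is illustrative; the function name and the numbers in the example are hypothetical and are not taken from the study.

```python
def r2_pred(y_test, y_pred, train_mean):
    """Predictive r2 for an external test set.

    SD    = sum of squared deviations of test activities from the
            mean activity of the training set.
    PRESS = sum of squared deviations between predicted and actual
            test activities.
    """
    sd = sum((actual - train_mean) ** 2 for actual in y_test)
    press = sum((actual - predicted) ** 2
                for actual, predicted in zip(y_test, y_pred))
    return (sd - press) / sd


# Toy example: two test compounds, training-set mean activity of 2.0.
print(r2_pred([1.0, 3.0], [1.5, 2.5], train_mean=2.0))
```

Because SD is referenced to the training-set mean, r2pred can become negative when the model predicts the test set worse than that mean would.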
Molecular docking study
Selected active molecules were docked using Surflex-Dock
module of SYBYL into the binding site of ICL enzyme.
The crystal structure of ligand bound to ICL (PDB: 1F8M,
resolution: 1.8 A) (Sharma et al., 2000) was used as the
reference for docking studies. Such structure-based studies
have been reported to be helpful in validating 3D-QSAR
results (Chen et al., 2010). We performed molecular
docking of all inhibitors (Holt et al., 2008). After docking,
the binding poses of each ligand were analyzed for
interactions with the active site residues (Aparoy et al.,
2011).
Fig. 3 Receiver operating characteristic (ROC) curve of the phar-
macophore model
Results and discussion
Alignment of database
In the rigid alignment, the most active compound, 23
(MIC = 2 µM), was used as the template for alignment of the
whole dataset (Table 1) on the common core substructure
(Fig. 1). The rigid alignment can be seen in Fig. 2a. In the
pharmacophoric alignment, the DISCOtech module was
used (Fig. 2b), and all the
compounds were aligned on two donor sites, two acceptor
sites, one hydrophobic site, and one aromatic site. The
model was validated by the ROC curve
method. As shown in Fig. 3, the AUC of the ROC curve of the
model was 0.78, which indicates that the model is
sufficiently reliable for the alignment.
Result of CoMFA, CoMSIA, and HQSAR analysis
of rigid alignment dataset (align1)
The statistical parameters of the standard CoMFA models
constructed with steric and electrostatic fields are given in
Table 2. The q2, r2cv, r2pred, r2ncv, F, and SEE values were
computed as defined in SYBYL. The PLS analysis showed
a q2 value of 0.730 and an r2cv value of 0.715. The
non-cross-validated PLS analysis resulted in a conventional r2 of
0.944, F = 89.26, and a standard error of estimation (SEE)
of 0.174. The steric field contributes 0.575 and the
electrostatic field 0.425. The high bootstrapped r2 (0.962) and low
standard deviation (0.01) suggest a high degree of
confidence in the analysis. CoMSIA offers hydrophobic,
hydrogen bond donor (HBD), and hydrogen bond acceptor (HBA)
field information in addition to the steric and electrostatic
fields; these three additional fields were combined with the
steric and electrostatic fields to search for the best CoMSIA
models. A statistically significant CoMSIA model was obtained
using the combination of steric, electrostatic, hydrophobic,
HBD, and HBA fields (q2 = 0.539, r2cv = 0.627, r2 = 0.911, F = 33,
SEE = 0.230). The corresponding field contributions are
0.103, 0.191, 0.284, 0.244, and 0.177, respectively. The CoMSIA
results are also summarized in Table 2. In
the HQSAR calculation, the lowest standard error occurred at a
cross-validated q2 of 0.694, for a hologram length of 353. The
PLS analysis gave a conventional r2 of 0.856 and a
standard error of 0.3 for all of the studied
compounds.
Pharmacophoric aligned dataset (align2)
For CoMFA, the PLS analysis showed a q2 value of 0.167 and
r2cv = 0.287, which indicate low statistical
significance. The non-cross-validated PLS analysis resulted in
a conventional r2 of 0.958, F = 101, and a standard error
of estimation (SEE) of 0.153. The CoMSIA model was obtained
using the combination of steric, electrostatic, hydrophobic,
HBD, and HBA fields (q2 = 0.409, r2 = 0.882,
Table 2 Statistical comparison of align1 and align2 results
Statistical parameters Align1 Align2
CoMFA CoMSIA HQSAR CoMFA CoMSIA HQSAR
q2 0.730 0.539 0.694 0.167 0.409 0.693
r2ncv 0.944 0.911 0.856 0.958 0.882 0.861
r2cv 0.715 0.627 – 0.287 0.355 –
r2bs 0.962 0.974 – 0.983 0.776 –
N 6 9 5 7 2 5
F 89.26 33.0 – 101 62 –
SEE 0.174 0.230 0.300 0.153 0.328 0.400
r2pred 0.735 0.613 0.897 0.641 0.681 0.615
Probability of r2ncv 0.00 0.00 – 0.00 0.00 –
Field contribution
Steric 0.575 0.103 – 0.085 0.414 –
Electrostatic 0.425 0.191 – 0.227 0.586 –
Hydrophobic 0.284 0.168 –
H-bond donor 0.244 0.238 –
H-bond acceptor 0.177 0.282 –
N is the optimal number of PLS components, q2 is the leave-one-out (LOO) cross-validation coefficient, r2ncv is the non-cross-validated
coefficient, r2pred is the predictive correlation coefficient, SEE is the standard error of estimation, F is the F-test value, r2cv is the
cross-validation coefficient
F = 62, SEE = 0.328). The results of the 3D-QSAR analysis using
the pharmacophoric alignment and of the HQSAR calculation are
shown in Table 2. In the HQSAR calculation, the
lowest standard error occurred at a cross-validated q2 of
0.694 with five optimal components, for a hologram length of 353.
The PLS analysis gave a conventional r2 of 0.861 and a
standard error of 0.400 for all of the studied
compounds.
The statistical results of Align2 showed low values of q2
and r2cv compared with Align1. Because of its greater statistical
significance, further analysis was carried out using the Align1
model. Optimization of the CoMSIA study was performed
using steric, electrostatic, hydrophobic, hydrogen bond
donor, and hydrogen bond acceptor fields. 3D-QSAR
models were generated using these fields in different
combinations, and the results are summarized in
Table 3. The CoMSIA models showed high correlation and
good predictive properties. The hydrophobic field was a
common factor in most of the models, indicating the
importance of lipophilicity for the present series of molecules.
We found that the electrostatic, hydrophobic, and HBD
CoMSIA descriptors played a significant
role in the prediction of biological activity; a satisfactory
q2 of 0.688 was obtained with this model. The
predictive abilities of the 3D-QSAR models were further
validated using a test set of 10 compounds not included in the
model generation. The predictive r2 (r2pred) values of the
CoMFA and CoMSIA models are 0.725 and 0.631,
respectively, for Align1 (Table 3). Plots of experimental
versus predicted pMIC for the training and test sets using CoMFA,
CoMSIA, and HQSAR are depicted in Fig. 4 (Table 4). The
residual activity differences are shown as histograms in
Fig. 5. The histograms of residual activity suggest the
absence of any outlier compound in the training set, that is,
any compound whose residual activity is above one.
CoMFA contour maps
The steric contour map for the CoMFA model with the
most active compound 23 (MIC = 2 µM) is shown in
Fig. 6a. In this figure, the green contours represent
regions of high steric bulk tolerance, while the yellow contours
represent regions of low steric bulk tolerance. A large
green contour near the C-3 position of the terminal
phenyl ring indicates that substitution with bulkier groups
at this position would favor
the activity. This can be seen in the case of the most active
compound 23 (MIC = 2 µM), whose –F group is oriented
in this region. It can also be seen by comparing the
structure and activity of 38 (MIC = 4 µM) and 37
(MIC = 8 µM); 38 contains a –CF3 group at the C-3 position,
which is more appropriate for the sterically favored green
region than the –NO2 group at the C-3 position
of the terminal phenyl ring in compound 37. Compounds 26,
32, 40, 41, and 43 showed good activity owing to a bulky group
at C-3. A second small green contour and a sterically
unfavorable yellow contour were observed away from the
molecular area, so steric bulk has no significance
in this area. A second large green contour observed
in the vicinity of the first phenyl ring indicates that
steric bulk there contributes to the activity of the compounds. All the
Table 3 Optimization of CoMSIA analysis for align1
Features q2 r2ncv N F SEE S E H D A
SEHDA 0.539 0.911 9 33 0.230 0.103 0.191 0.284 0.244 0.177
SHDA 0.555 0.928 9 41 0.208 0.130 – 0.366 0.318 0.186
EHDA 0.513 0.911 9 32 0.230 – 0.212 0.316 0.270 0.201
SEHA 0.537 0.912 9 33 0.228 0.128 0.276 0.344 – 0.252
SEDA 0.494 0.884 9 24 0.263 0.156 0.262 – 0.362 0.220
SHE 0.631 0.914 9 34 0.227 0.226 0.341 0.433 – –
SED 0.641 0.880 9 23 0.268 0.193 0.355 – 0.452 –
SEA 0.450 0.870 9 21 0.278 0.204 0.418 – – 0.378
SHA 0.478 0.922 9 38 0.216 0.176 – 0.469 – 0.355
SDA 0.545 0.892 9 26 0.254 0.240 – – 0.472 0.289
EHD 0.688 0.920 9 37 0.218 – 0.254 0.419 0.328 –
EHA 0.524 0.912 9 33 0.228 – 0.307 0.394 – 0.299
EDA 0.497 0.881 9 23 0.266 – 0.313 – 0.412 0.275
ADH 0.571 0.917 9 35 0.222 – – 0.332 0.414 0.254
Bold values provide the combination of descriptors which is found to be most statistically significant among all CoMSIA combinations
N is the optimal number of PLS components; q2 is the leave-one-out (LOO) cross-validation coefficient; r2ncv is the non-cross-validated
coefficient; SEE is the standard error of estimation; F is Fisher's F value; S is steric; E is electrostatic; H is hydrophobic; D is
H-bond donor; A is H-bond acceptor
compounds of this series share the same scaffold in their
structures.
The electrostatic contour map for the CoMFA model is
shown in Fig. 6b. In this figure, the blue contours represent
regions where electropositive groups favor activity, while the
red contours represent regions where electronegative groups
favor activity. A large blue contour covers the thioamide linkage
between the phenyl rings. The hydrogen atom of the –NH group of
the thioamide or amide linkage in all the molecules
fits this blue contour well, because
this electropositive hydrogen atom fills
the contour. However, a small red contour overlaps
this region, which makes it complex
to interpret. A second large blue contour observed near
C-2 of the first phenyl ring suggests that electropositive
substituents in this region would increase the activity.
Fig. 4 a Plot of experimental and predicted activity using CoMFA
model. b Plot of experimental and predicted activity using CoMSIA
model. c Plot of experimental and predicted activity using HQSAR model
Table 4 Actual and predicted activity of training and test set com-
pounds used in 3D-QSAR (align1)
Compound pMIC Predicted activity
CoMFA CoMSIA HQSAR
1 8.097 8.099 7.997 8.099
2 8.397 8.351 8.415 8.266
3 8.097 7.834 7.803 7.949
4 8.097 8.056 7.945 7.948
5 8.097 7.44 7.461 7.954
6 8.398 8.32 8.179 8.169
7 7.795 7.897 7.905 7.888
8 8.397 7.13 7.183 7.537
9 8.097 8.067 8.096 7.97
10 8.097 8.018 8.036 7.983
11 8.097 8.051 8.124 8.15
12 8.397 7.947 7.945 7.833
13 7.795 7.775 7.724 7.832
14 8.097 8.147 8.059 7.838
15 8.097 7.98 8.012 8.053
16 7.204 7.665 7.661 7.772
17 6.602 6.747 6.972 7.421
18 8.398 7.505 8.08 7.854
19 6.903 7.068 7.035 6.952
20 6.903 6.741 6.785 7.138
21 6.602 6.466 6.807 6.804
22 6.602 6.775 6.639 6.819
23 6.301 6.518 6.629 6.808
24 7.505 6.554 6.858 7.021
25 6.602 6.516 6.535 6.76
26 6.602 6.935 6.691 6.446
27 6.903 6.823 6.753 6.841
28 8.398 6.818 7.496 7.164
29 7.505 7.368 7.594 7.35
30 7.204 6.751 7.286 7.016
31 7.204 7.024 7.193 7.032
32 6.903 6.593 7.02 7.021
33 7.505 7.687 7.659 7.234
34 7.204 6.943 7.178 6.973
35 7.204 7.223 6.698 6.659
36 7.204 7.153 7.175 7.054
37 6.903 6.608 5.758 6.638
38 6.602 6.725 6.821 6.621
39 6.602 6.694 6.515 6.619
40 6.602 6.653 6.634 6.626
41 6.602 6.45 6.494 6.56
42 6.602 6.708 6.672 6.642
43 6.903 6.61 5.758 6.638
44 6.602 6.503 6.629 6.805
45 6.602 6.485 6.596 6.488
46 6.602 6.716 6.209 6.486
CoMSIA contour maps
The electrostatic contour map of the CoMSIA analysis can
be seen in Fig. 7a. The CoMSIA electrostatic contours were
placed similarly to the CoMFA electrostatic contours.
However, the presence of a red contour near the methoxy
group indicates that an electronegative group in this
region would increase the inhibitory activity. The presence
of the electronegative oxygen of the hydroxyl group in compounds
37–49 (MIC ≤ 8 µM) accounts for their good activity. The
hydrophobic contour map of the CoMSIA analysis, based on
the atomic hydrophobicity distribution, displays the
hydrophobic interactions more clearly (Fig. 7b). The
yellow contour around the –F group of the terminal phenyl
ring indicates that the addition of hydrophobic substituents
at this position may increase the activity. Compound 23
(MIC = 2 µM) has better activity owing to the presence of the
yellow contour at the –F-substituted phenyl ring: the –F
group is much more lipophilic than hydrogen, so
incorporating a –F group makes a molecule more
lipophilic. Compound 10 (MIC = 125 µM) and compound
28 have a –NO2 group at this position, which is
less hydrophobic than alkyl
groups such as –Pr and –Bu, and are thus comparatively less
active. The hydrophobically favored regions around the
terminal phenyl ring are similar to the sterically favored regions.
A small yellow contour near C-4 of the terminal
phenyl ring indicates that a hydrophobic substituent
there is favorable for activity. The unfavorable gray
region is observed away from the molecular area.
Molecular docking study
Docking studies revealed that all inhibitors dock into
the allosteric site of the ICL crystal structure (PDB ID:
1F8M) in a similar orientation. The crystal structure
contains pyruvic acid as the bound
ligand. The inhibitor conformations obtained from docking
and pyruvic acid show a low root mean square deviation
(RMSD < 1.5 Å), indicating a highly conserved binding
mode and the reliability of Surflex-Dock for docking
studies. The inhibitor molecules bind to an allosteric site
adjacent to the Mg2+–pyruvic acid-binding region and
interact with the surrounding protein residues through hydrogen
bonding and hydrophobic interactions. The structural
superposition of six molecules (pyruvic acid, 17,
23, 38, 42, and 49) in their docked conformations is shown
in Fig. 8. Pyruvic acid is a co-crystallized allosteric
inhibitor in the 1F8M crystal structure. A highly conserved
Mg2+ ion is observed in other ICL crystal structures (PDB
IDs: 1F8I and 1F61). Figure 9 shows the docking of the
most active molecule, 23 (Table 1), and its superimposition
with pyruvic acid. Molecule 23 also forms several
interactions with TRP93, LEU348, THR347, and LYS190.
The docking studies demonstrated that the QSAR contour
maps corroborate the docking interactions of molecule 23
with the protein allosteric site. The allosteric site has one
hydrophobic groove, which contains ILE36, THR347,
LEU348, and SER317. The R substituent of the
compounds lies within this region: the chlorine atom of the
first benzene ring occupies this pocket. One small
electronegative pocket can be seen near ASP108. The
Table 4 continued
Compound pMIC Predicted activity
CoMFA CoMSIA HQSAR
47 6.602 6.407 6.412 6.493
48 6.903 6.83 6.894 6.427
49 6.301 6.466 6.517 6.508
Fig. 5 a Histogram of CoMFA residual values for the training set.
b Histogram of CoMSIA residual values for the training set.
c Histogram of HQSAR residual values for the training set
carbonyl group of the amide linkage fills this space. One
hydrogen bond was found between TRP93 and compound
23 at a distance of 2.20 Å. The –OCH3 group of compound
23 stabilizes the SER91, LEU90, and THR93 pocket.
Conclusions
Isocitrate lyase is a promising target for the
treatment of M. tuberculosis infection. 3D-QSAR techniques
Fig. 6 a The CoMFA steric contour map represented by green (favored) and yellow (disfavored) polyhedra. b The CoMFA electrostatic contour
map represented by blue (favored) and red (disfavored) polyhedra. Compound 23 is shown inside the field
Fig. 7 a The CoMSIA electrostatic contour map represented by blue
(favored) and red (disfavored) polyhedra. b The CoMSIA hydropho-
bic contour map represented by green (favored) and gray (disfavored)
polyhedra. c The CoMSIA HBD contour map represented by cyan
(favored) and violet (disfavored) polyhedra. Compound 23 is shown
inside the field (Color figure online)
(CoMFA and CoMSIA) and HQSAR were applied for the
first time to a series of 2-benzanilide derivatives as ICL
inhibitors. All the models (CoMFA, CoMSIA, and
HQSAR) were found to be satisfactory according to their
statistical parameters. Contour map analysis of CoMFA
and CoMSIA will assist in the prediction of ICL inhibitory
activity with appropriate accuracy, and the contour maps were
further correlated with the receptor binding site. The combined
3D-QSAR and molecular docking analyses corroborate each
other, and these results
will help to better interpret the structure–activity
relationship of these ICL inhibitors and provide valuable insights
into rational drug design. The present QSAR approach,
along with the docking studies, provides useful information for
designing novel derivatives with higher selectivity and
efficacy.
Acknowledgments The authors would like to thank Nirma University,
Ahmedabad, India, for providing the computing facilities used to
complete this work.
Conflict of interest The authors declare that they have no conflict
of interest.
References
Aparoy P, Suresh GK, Reddy K, Reddanna P (2011) CoMFA and
CoMSIA Studies on 5-hydroxyindole-3-carboxylate derivatives
as 5-lipoxygenase inhibitors: generation of homology model and
docking studies. Bioorg Med Chem Lett 21:456–462
Caballero J (2010) 3D-QSAR (CoMFA and CoMSIA) and pharma-
cophore (GALAHAD) studies on the differential inhibition of
aldose reductase by flavonoid compounds. J Mol Graph Model
29:363–371
Caballero J, Fernandez M, Coll D (2010) Quantitative structure–
activity relationship of organosulphur compounds as soybean
15-lipoxygenase inhibitors using CoMFA and CoMSIA. Chem
Biol Drug Des 76:511–517
Chen Y, Li Z, Chen HF (2010) Computational study of CCR5
antagonist with support vector machines and three dimensional
quantitative structure activity relationship methods. Chem Biol
Drug Des 75:295–309
Cramer RD, Patterson DE, Bunce JD (1988) Comparative molecular
field analysis (CoMFA). J Am Chem Soc 110:5959–5967
Ernesto J, Elías M, McKinney JD (2006) M. tuberculosis isocitrate
lyases 1 and 2 are jointly required for in vivo growth and
virulence. Nat Med 11:638–644
Fawcett T (2006) An introduction to ROC analysis. Pattern Recognit
Lett 27:861–874
Ginsburg AS, Grosset JH, Bishai WR (2003) Fluoroquinolones,
tuberculosis, and resistance. Lancet Infect Dis 3:432–442
Hautzel R, Anke H, Sheldrick WS (1990) Mycenon, a new metabolite
from a Mycena species TA 87202 (basidiomycetes) as an
inhibitor of isocitrate lyase. J Antibiot 43:1240–1244
Holt PA, Chaires JB, Trent JO (2008) Molecular docking of
intercalators and groove-binders to nucleic acids using autodock
and surflex. J Chem Inf Model 48:1602–1615
Honorio KM, Garratt RC, Andricopulo AD (2005) Hologram
quantitative structure-activity relationships for a series of
farnesoid X receptor activators. Bioorg Med Chem Lett
15:3119–3125
Jewell NE, Turner DB, Willett P, Sexton GJ (2001) Automatic
generation of alignments for 3D QSAR analyses. J Mol Graph
Model 20:111–121
Ko YH, McFadden BA (1990) Alkylation of isocitrate lyase from
Escherichia coli by 3-bromopyruvate. Arch Biochem Biophys
278:373–380
Ko YH, Vanni P, McFadden BA (1989) The interaction of
3-phosphoglycerate and other substrate analogs with the gly-
oxylate- and succinate-binding sites of isocitrate lyase. Arch
Biochem Biophys 274:155–160
Kozic J, Novotna E, Volkova M, Stolarikova J, Trejtnar F, Vinsova J
(2012) Synthesis and in vitro antimycobacterial activity of
2-methoxybenzanilides and their thioxo analogues. Eur J Med
Chem 56:387–395
Kulkarni SS, Patel MR, Talele TT (2008) CoMFA and HQSAR
Studies on 6,7-dimethoxy-4-pyrrolidylquinazoline derivatives as
phosphodiesterase10A inhibitors. Bioorg Med Chem 16:3675–
3686
Liu G, Ju X, Cheng J, Liu Z (2010) 3D-QSAR studies of insecticidal
anthranilic diamides as ryanodine receptor activators using
CoMFA, CoMSIA and DISCOtech. Chemosphere 78:300–306
Fig. 8 Superposition of five active molecules in their docked
conformation in allosteric binding site of ICL. Yellow dotted line
indicates hydrogen bond interactions (Color figure online)
Fig. 9 3D view of the docked conformation of molecule 23 (orange)
superimposed on co-crystallized pyruvic acid (red) in the allosteric binding
site of ICL. The hydrogen bonds are shown as yellow broken lines, and the
conserved Mg2+ ion is represented as a violet ball (Color figure online)
Manabe YC, Bishai WR (2000) Latent Mycobacterium tuberculosis-
persistence, patience, and winning by waiting. Nat Med
6:1327–1329
McFadden BA, Purohit S (1977) Itaconate, an isocitrate lyase-
directed inhibitor in Pseudomonas indigofera. J Bacteriol
131:136–144
McGovern DL, Mosier PD, Roth BL, Westkaemper RB (2010)
CoMFA analyses of C-2 position salvinorin A analogs at the
kappa-opioid receptor provides insights into epimer selectivity.
J Mol Graph Model 28:612–625
Murugesan V, Prabhakar YS, Katti SB (2009) CoMFA and CoMSIA
studies on thiazolidin-4-one as anti-HIV1 agents. J Mol Graph
Model 27:735–743
Murumkar PR, Le L, Truong TN, Yadav MR (2011) Determination of
structural requirements of influenza neuraminidase type A
inhibitors and binding interaction analysis with the active site
of A/H1N1 by 3D-QSAR CoMFA and CoMSIA modeling. Med
Chem Commun 2:710–719
Nikalie AG, Mudassar P (2011) Multidrug-resistant Mycobacterium
tuberculosis: a brief review. Asian J Biol Sci 4:101–115
Oh KB, Jeon HB, Han YR, Lee YJ, Park J, Lee SH, Yang D, Kwon
M, Shin J, Lee HS (2010) Bromophenols as Candida albicans
isocitrate lyase inhibitors. Bioorg Med Chem Lett 20:6644–6648
Perez-Villanueva J, Medina-Franco JL, Caulfield TR, Hernandez-Campos A,
Hernandez-Luis F, Yepez-Mulia L (2011) Comparative
molecular field analysis (CoMFA) and comparative molecular
similarity indices analysis (CoMSIA) of some benzimidazole
derivatives with trichomonicidal activity. Eur J Med Chem
46:3499–3508
Phuong T, Minh TK, Van NT, Phuong HT (2004) Synthesis and
antifungal activities of phenylenedithioureas. Bioorg Med Chem
Lett 14:653–656
Reddy BM, Tanneeru K, Meetei PA, Guruprasad L (2012) 3D-QSAR and
molecular docking studies on substituted isothiazole analogs as
inhibitors against MEK-1 kinase. Chem Biol Drug Des 79:84–91
Russell DG, Barry CE, Flynn JL (2010) Tuberculosis: what we don’t
know can, and does, hurt us. Science 328:852–856
Russell DG (2007) Who puts the tubercle in tuberculosis? Nat Rev
Microbiol 5:39–47
Sharma V, Sharma S, Hoener zu Bentrup K, McKinney JD, Russell
DG, Jacobs WR, Sacchettini JC (2000) Structure of isocitrate
lyase, a persistence factor of Mycobacterium tuberculosis. Nat Struct
Biol 7:663–668
Shin DS, Kim S, Yang HC, Oh KB (2005) Cloning and expression of
isocitrate lyase, a key enzyme of the glyoxylate cycle, of
Candida albicans for development of antifungal drugs. J Micro-
biol Biotechnol 15:652–655
Vyas VK, Bhatt HG, Patel PK, Jalu J, Chintha C, Gupta N, Ghate M
(2013) CoMFA and CoMSIA studies on C-aryl glucoside
SGLT2 inhibitors as potential anti-diabetic agents. SAR QSAR
Environ Res http://dx.doi.org/10.1080/1062936X.2012.751553
Zambre VP, Murumkar PR, Giridhar R, Yadav MR (2009) Structural
investigations of acridine derivatives by CoMFA and CoMSIA
reveal novel insight into their structures toward DNA G-quad-
ruplex mediated telomerase inhibition and offer a highly
predictive 3D-model for substituted acridines. J Chem Inf Model
49:1298–1311
Zeng H, Zhang H (2010) Combined 3D-QSAR modeling and
molecular docking study on 1,4-dihydroindeno[1,2-c]pyrazoles
as VEGFR-2 kinase inhibitors. J Mol Graph Model 29:54–71
Zhu W, Chen G, Hu L, Luo X, Gui C, Luo C, Puah CM, Chen K,
Jiang H (2005) QSAR Analyses on ginkgolides and their
analogues using CoMFA, CoMSIA, and HQSAR. Bioorg Med
Chem 13:313–322