Prediction of Drug Like Properties

7/30/2019 Prediction of Drug Like Properties

Gisbert Schneider.

Models are to be used, not believed. (H. Theil)

The Drug-Likeness Concept

Historically, computer-aided molecular design (CAMD) has focused on lead identification and lead

optimization, and many innovative strategies have been developed that assist in improving the

binding affinities of drug candidates to specific receptors. One such method, QSAR, has been

discussed in the previous Chapter. In this Chapter, we will discuss the emerging concept of drug-

likeness, as well as the computational modeling of a set of physicochemical and biological

properties that play an important role in the transformation of a clinical lead to a marketed drug.

Although high potency is an important factor in pharmacological design, one must also recognize

the huge gulf between a tightly bound inhibitor and a bioavailable drug.1 Far too often, promising

candidates are abandoned during clinical trials (or worse, withdrawn after market launch in the

medico-economic phase) for a variety of reasons, including low bioavailability, high toxicity, poor

pharmacokinetics, or drug-drug interactions. In addition, the advent of parallel synthesis methods

and high throughput screening has placed increasing stress on the technology that has traditionally

been used to assess potential drug candidates in non-clinical development. Due to the limited time

and resources available to conduct formal in vivo studies, typically only tens of candidates will be

screened. Thus, prioritization by computational means prior to experiment is important in order to

ensure that valuable resources are apportioned to the most promising candidates.

Drug molecules generally act on specific targets at the cellular level, and exert therapeutic action

upon binding to receptors that subsequently modify the cellular machinery. Before a drug molecule

exerts its pharmaceutical (pharmacodynamic) effect on the body via interaction with its target, it

must travel through the body to reach the site of drug action. Pharmacokinetics is the study of

the journey of the drug from its point of entry to the site of action. Broadly speaking, this process

can be defined by the following phases: absorption, distribution, metabolism, and excretion

(ADME). The first hurdle for an orally administered drug is adequate absorption across the gut wall

into the blood circulatory system. Upon absorption, it will be transported to the liver, where it is

liable to modification by a panel of hepatic microsomal enzymes; some molecules may be

metabolized and some may be excreted via the bile. If a drug molecule survives this first-pass

metabolism, it will enter arterial circulation and is subsequently distributed to the body, including

the target tissue. Once the drug has triggered the desirable therapeutic response, it should be

steadily eliminated from the body; otherwise bioaccumulation may become a concern. In addition, a

drug must not cause any serious toxic side effects, including, but not limited to interference with the

actions of any other drugs the patient may be taking. Such interference is normally caused by

enzyme induction, a process in which one drug stimulates an enzyme, thereby causing a change in

the metabolism of a second drug.

It is not surprising then, that even though the chemical structures of drugs can differ greatly (in

accordance with the requirement of complementary interactions to diverse target receptors),

successful drugs on the market today do share certain similarities in their physicochemical

properties. Primarily, such characteristics determine the pharmacokinetics of the drug, where

favorable ADME (absorption, distribution, metabolism, and excretion) properties are required. 2,3

Perhaps the most well-known study in this area is the work of Lipinski and coworkers at Pfizer,

who performed a statistical analysis of 2,200 drugs from the World Drug Index (WDI). 4 They

established a set of heuristics that appears to be generally valid for the majority of the drugs

considered in the study, normally referred to as the "Pfizer rule", or the "rule of five", which states

that the absorption or permeation of a drug (that is not a substrate for a biological transporter) is

likely to be impaired when:

logP > 5

Molecular weight > 500

Number of hydrogen-bond donor groups > 5

Number of hydrogen-bond acceptor groups > 10
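
The four criteria above translate directly into a filter. The following is a minimal sketch, assuming the descriptor values (logP, molecular weight, donor and acceptor counts) have already been computed by some other tool; the `Descriptors` class, the function name, and the two example molecules are hypothetical and serve only as an illustration. How many violations to tolerate before rejecting a molecule is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties of one molecule (values supplied by the caller)."""
    logp: float
    mol_weight: float
    h_donors: int
    h_acceptors: int


def rule_of_five_violations(d: Descriptors) -> list[str]:
    """Return the names of the rule-of-five criteria that the molecule exceeds."""
    checks = {
        "logP > 5": d.logp > 5,
        "molecular weight > 500": d.mol_weight > 500,
        "H-bond donors > 5": d.h_donors > 5,
        "H-bond acceptors > 10": d.h_acceptors > 10,
    }
    return [name for name, violated in checks.items() if violated]


# Hypothetical descriptor values, for illustration only.
small = Descriptors(logp=2.1, mol_weight=310.4, h_donors=2, h_acceptors=5)
large = Descriptors(logp=6.3, mol_weight=712.9, h_donors=4, h_acceptors=12)
print(rule_of_five_violations(small))  # no criterion exceeded
print(rule_of_five_violations(large))  # three criteria exceeded
```

Returning the list of violated criteria, rather than a single yes/no flag, keeps the reason for a rejection visible.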

The beauty of this rule lies in its simplicity. Because all parameters can be easily computed, the

Pfizer rule (or its variants) has become the most widely applied filter in virtual library design today.

However, it should be stressed that compliance to the rule does not necessarily make a molecule

drug-like. In fact, the Pfizer rule by itself appears to be a rather ineffective discriminator between

drugs and non-drugs. Frimurer et al showed that using the above criteria, only 66% of the

compounds in the MDL Drug Data Report (MDDR) database, which contains compounds with

demonstrated biological activities, were classified as drug-like; whereas 75% of the supposedly

nondrug-like compounds from the Available Chemicals Directory (ACD) were in fact regarded as

drug-like.5 In other words, if the primary objective is to isolate drugs from nondrugs in the broadest

sense, the Pfizer rule fares no better than making close to random assignments. Obviously, a more

complex set of logical rules is required to recognize molecules with drug-like properties.

Independently, two research groups investigated the use of artificial neural networks to develop

virtual screening tools that can distinguish between drug-like and nondrug-like molecules. The

results of their work were published in two back-to-back articles in the Journal of Medicinal

Chemistry in 1998.6,7 The first paper was a contribution from Ajay, Walters, and Murcko at Vertex

Pharmaceuticals.6 They selected a set of approximately 5000 compounds from the Comprehensive

Medicinal Chemistry (CMC) database serving as a surrogate for drug-like molecules. They also

chose a similar number of drug-size compounds from the ACD to represent molecules that were

nondrug-like. Seven simple 1D descriptors were generated to encode each molecule, including

molecular weight, number of hydrogen-bond donors, number of hydrogen-bond acceptors, number

of rotatable bonds, the kappa shape index ²κα (the degree of branching of a molecule), aromatic density, and logP. To

augment these 1D features, Ajay and coworkers also considered a second set of 2D descriptors.

They were the 166 binary ISIS keys, which contained information on the presence or absence of

certain substructural features in a given molecule. A Bayesian neural network (BNN) was trained on

a subset of 7,000 compounds, which comprised approximately equal numbers of

compounds from the CMC and ACD sets. The trained neural network was then applied to the

remaining CMC and ACD compounds that were outside the training set. As an external validation,

they also tested their network on a large collection of compounds from the MDDR database, which

was assumed to contain mostly drug-like candidates. The accuracy of classification for the test

predictions using different combinations of 1D and 2D descriptor sets is summarized in Table 1.

Neural network models using seven 1D descriptors alone classified about 83% of the CMC

compounds as drugs, and about 73% of the ACD set as nondrugs. The majority (about 65%) of the

MDDR compounds were predicted to be drug-like, which was in accordance with general

expectation. When 2D ISIS descriptors were utilized the classification accuracy for the ACD (82%)

and the MDDR (83%) compounds improved significantly, though this was at the expense of inferior

prediction for the CMC set (78%). The combined use of 1D and 2D descriptors yielded the best

prediction overall. The classification accuracy of both CMC and ACD approached 90% and, in

addition, about 78% of the MDDR compounds were classified as drug-like. Furthermore, the Vertex

team was able to extract the most informative descriptors and suggested that all seven 1D and only

71 out of 166 ISIS descriptors provided relevant information to the neural network. It was

demonstrated that the prediction accuracy of a neural network using this reduced set of 78

descriptors was essentially identical to the full model. Finally, to demonstrate the utility of this

drug-likeness filter, the researchers conducted a series of simulated library design experiments and

concluded that their system could dramatically increase the probability of picking drug-like

molecules from a large pool of mostly nondrug-like entities.

Table 1 Average drug-likeness prediction performance of a Bayesian neural network with five hidden nodes on 10 independent test sets6

Descriptors         CMC / %   ACD / %   MDDR drug-like / %
7 1D                81-84     71-75     61-68
166 ISIS            77-79     81-83     83-84
7 1D + 166 ISIS     89-91     88-89     77-79
7 1D + 71 ISIS      88-90     87-88     77-80

Another neural network-based drug-likeness scoring scheme was reported by Sadowski and

Kubinyi from BASF.7 They selected 5,000 compounds each from the World Drug Index (WDI) and

ACD, to serve as their databases of drug-like and nondrug-like compounds. The choice of

molecular descriptors in their application was based on Ghose and Crippen atom-types,8 which have

been successfully used in the prediction of other physicochemical properties such as logP. In this

study, each molecule was represented by the actual count for each of the 120 atom-types found. The

full set of descriptors was pruned to a smaller subset of 92 that were populated in at least 20 training

molecules, a procedure designed to safeguard against the neural network learning single

peculiarities. Their neural network, a 92-5-1 feed-forward model, classified 77% of the WDI and

83% of the ACD compounds correctly. Application of the neural network to the complete WDI and

ACD databases (containing > 200,000 compounds) yielded similar classification accuracy. It was

noteworthy that, in spite of this apparently good predictivity, Sadowski and Kubinyi did not

advocate the use of such a scoring scheme to evaluate single compounds because they believed that

there was still considerable risk of misclassifying molecules on an individual basis. Instead, they

believed that it would be more appropriate to apply this as a filter to weed out designs with very low

predicted scores.
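
The descriptor pruning step described above (keeping only atom types that are populated in at least 20 training molecules) can be sketched as follows. This is an illustrative reading of that step, not the authors' code: the function name, the list-of-rows data layout, and the toy data are assumptions.

```python
def prune_sparse_descriptors(counts, min_molecules=20):
    """Keep descriptor columns that are non-zero in at least `min_molecules` rows.

    `counts` is a list of rows, one per training molecule; each row holds the
    atom-type counts for that molecule. Returns the indices of the kept columns.
    """
    if not counts:
        return []
    n_columns = len(counts[0])
    kept = []
    for col in range(n_columns):
        populated = sum(1 for row in counts if row[col] > 0)
        if populated >= min_molecules:
            kept.append(col)
    return kept


# Toy data: 4 molecules x 3 atom types, with the threshold lowered to 2
# so that the effect is visible on such a small set.
toy = [
    [3, 0, 1],
    [1, 0, 0],
    [0, 2, 0],
    [2, 0, 0],
]
print(prune_sparse_descriptors(toy, min_molecules=2))  # only column 0 survives
```

Columns that occur in only a handful of molecules are dropped because a network could otherwise use them to memorize individual compounds.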

Recently, Frimurer and coworkers from Novo Nordisk extended these earlier works by attempting

to create a drug-likeness classifier that uses a neural network trained with a larger set of data. 5

Again, MDDR and the ACD were used as the sources of drug-like and nondrug-like entities. The

MDDR compounds were partitioned into two sets. The first set represents 4,500 compounds that

have progressed to at least Phase I of clinical trials (i.e., they should be somewhat drug-like), and

the second was a larger collection of 68,500 molecules that have the status label of Biological

Testing (i.e., lead-like). To decrease the redundancy of the data sets, a diversity filter was applied

to the data set so that any MDDR compounds that had a Tanimoto coefficient (based on ISIS

fingerprints) of greater than 0.85 amongst themselves were removed. This procedure discarded

about 100 compounds from the drug-like MDDR set and 8,500 compounds from the lead-like set.

To reinforce the nondrug-like nature of the ACD set, any compounds that are similar to the 4,400

MDDR drug-like set (greater than a Tanimoto cutoff of 0.85) were eliminated, leaving 90,000 ACD

compounds for data analysis. After removing the redundant entries, the 4,400 MDDR drug-like set

was partitioned into 3,000 training compounds and 1,400 test compounds, and the ACD compounds

into a 60,000 member training and a 30,000 member test set. The 60,000 lead-like MDDR

compounds were not utilized in any way during model construction, but were used only as external

validation data. Each compound was represented by three molecular descriptors (number of atoms,

number of heavy atoms, and total charge) in addition to 77 CONCORD atom-type descriptors encoding

the frequency of occurrence (normalized by the entire data set) of particular atom types.

Empirically, it was concluded that the optimal neural network configuration contained 200 hidden

nodes, based on the quality of test set predictions. This neural network gave a training and test

Matthews correlation coefficient of 0.65 and 0.63, respectively (Eq. 2.18).9 Using a threshold value

of 0.5 as the criterion to distinguish drug from nondrug, the neural network was able to correctly classify 98% of

the ACD compounds but only 63% of the MDDR drug-like set. By lowering the prediction

threshold, an increasing number of MDDR drug-like compounds would be correctly identified, at

the expense of more false positives for the ACD set. They claimed that a threshold value of 0.15

(anything above that was classified as a drug) was an optimal cutoff, providing the best

discrimination between the two data sets. With this threshold value, 88% of both the MDDR drug-like

set and the ACD set were correctly classified. In addition, 75% of the MDDR lead-like

molecules were also predicted as drug-like. The decrease in percentage from the drug-like to the

lead-like set was not unexpected given that there may still be some intrinsic differences between

the two classes of compounds. Finally, Frimurer and coworkers probed for the most informative

descriptors that allowed for discrimination between drugs and nondrugs. They set each of the 80

descriptors systematically to a constant value (zero was used in this case) and monitored the

variation in the training error of each sub-system, arguing that the removal of an important

descriptor from the input would lead to a substantial increase in training error. Fifteen key

descriptors were identified by this method; they were aromatic X-H, non-aromatic X-H, C-H, C=O,

sp2 conjugated C, =N-, non-aromatic N, N=C, non-aromatic O, sp2 O, sp2 P, F, Cl, number of atoms,

and total charge. The performance of their neural network was commendable, even with this vastly

reduced set of descriptors. Using a prediction threshold of 0.15, 82% of the MDDR and 87% of the

ACD compounds were correctly classified.
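
The redundancy filter mentioned earlier in this paragraph (discarding compounds whose Tanimoto coefficient to another compound exceeds 0.85) can be illustrated with a small sketch. The source does not say in which order compounds were compared, so the greedy first-come-first-kept scheme below is an assumption, and the fingerprints are toy sets of bit positions rather than real ISIS keys.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def remove_redundant(fingerprints, cutoff=0.85):
    """Greedy filter: keep a compound only if its similarity to every
    previously kept compound does not exceed `cutoff`. Returns kept indices."""
    kept = []
    for i, fp in enumerate(fingerprints):
        if all(tanimoto(fp, fingerprints[j]) <= cutoff for j in kept):
            kept.append(i)
    return kept


# Toy fingerprints (sets of 'on' bits), not real ISIS keys.
fps = [
    frozenset(range(0, 20)),   # compound 0
    frozenset(range(0, 19)),   # compound 1: shares 19 of 20 bits with compound 0
    frozenset(range(10, 30)),  # compound 2: shares 10 of 30 bits with compound 0
]
print(remove_redundant(fps))  # compound 1 is dropped as a near-duplicate
```

A greedy pass depends on the input order; a clustering-based selection would be an alternative if order independence matters.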

An interesting aspect of the drug-likeness scoring function that was briefly discussed in the

Frimurer publication concerns the setting of the threshold value. For example, if the purpose of the

scoring function is to limit the number of false-positive predictions, then a higher cutoff value

should be used for the threshold. Table 2 gives the percentages of ACD and MDDR compounds that

are correctly classified using different cutoff values.

Table 2 Percentage of ACD and MDDR compounds that are correctly predicted with their corresponding threshold values in the drug-likeness classifier of Frimurer et al5

Cutoff   % ACD correctly predicted   % MDDR correctly predicted
0.05     72                          95
0.15     88                          88
0.35     95                          74
0.50     98                          63

As reported, a model with a higher cutoff value contributes fewer false positives (i.e., nondrugs that

were predicted as drug-like), although this comes at the expense of worse MDDR classification. It is

important to keep in mind that the cutoff value should be set depending on whether false positives

or false negatives are more harmful for the intended application. 10 In a typical virtual screening

application, we usually like to first identify and then remove molecules that are predicted to be

nondrug-like from a large compound library. Let us assume x is the percentage of the compounds

that are actually drugs in the library, and that pD is the probability that a drug is correctly identified

as drug-like, and pN is the probability that a nondrug is correctly identified as nondrug-like. To

gauge the performance of the drug-likeness scoring function one would compute what percentage of

the compounds that were flagged as drug-like were actually drugs. This quantity, denoted

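henceforth as drug fraction, is given by Equation 1:

\[ \text{drug fraction} = \frac{x \, p_D}{x \, p_D + (100 - x)(1 - p_N)} \tag{1} \]

where pD and pN are expressed as fractions between 0 and 1.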

If we assume that for a given threshold value, pD and pN take the values of %MDDR and %ACD

that are correctly classified, we can plot how drug fraction varies with x, the percentage of drugs in

the complete library. Figure 1 shows the hypothetical curves for each threshold value listed in Table

2. In all cases the drug scoring function gives substantial enrichment of drugs after the initial

filtering. This is particularly true in situations where the fraction of actual drug molecules in the

library is very small, which is probably close to reality. Based on statistics

reported by Frimurer et al, the reduction of false positives is, in fact, the key to this kind of virtual

screening application, and therefore a high threshold value should be set. Thus, although on a

percentage basis a 0.15 threshold seems the most discriminating (88% of both ACD and MDDR),

the premise under which virtual screening is applied calls for more rigorous removal of false

positives, even at the expense of a loss of true positives.
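
The argument for a high threshold can be checked numerically. The sketch below takes the drug fraction to be the number of true drugs flagged as drug-like divided by all compounds flagged as drug-like, using x, pD, and pN as defined in the text; the function name and the choice of x = 1% are assumptions made for illustration, while the percentages are those of Table 2.

```python
def drug_fraction(x_percent: float, p_d: float, p_n: float) -> float:
    """Fraction of compounds flagged as drug-like that are actual drugs.

    x_percent: percentage of true drugs in the library (0-100)
    p_d: probability that a drug is classified as drug-like (0-1)
    p_n: probability that a nondrug is classified as nondrug-like (0-1)
    """
    true_pos = x_percent * p_d
    false_pos = (100.0 - x_percent) * (1.0 - p_n)
    if true_pos + false_pos == 0:
        raise ValueError("no compounds are flagged as drug-like")
    return true_pos / (true_pos + false_pos)


# (cutoff, % ACD correctly predicted, % MDDR correctly predicted) from Table 2
table2 = [(0.05, 72, 95), (0.15, 88, 88), (0.35, 95, 74), (0.50, 98, 63)]

x = 1.0  # assumed: 1% of the library are true drugs
for cutoff, acd, mddr in table2:
    frac = drug_fraction(x, p_d=mddr / 100, p_n=acd / 100)
    print(f"cutoff {cutoff:.2f}: drug fraction = {100 * frac:.1f}%")
```

With only 1% true drugs in the library, the drug fraction of the filtered set rises from roughly 3% at a cutoff of 0.05 to roughly 24% at a cutoff of 0.50, which shows that the false-positive rate dominates the outcome when drugs are rare.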

Figure 1

Graphs showing how drug fraction varies with the percentage of drugs in the library (see text).

Finding a generally applicable scoring function to predict the drug-likeness of a molecule will

remain one of the most sought-after goals for pharmaceutical researchers in the coming years. The

tools that exist today can discriminate between molecules that come from presumably drug-like

(e.g., MDDR, CMC, WDI) or nondrug-like (e.g., ACD) databases. In our opinion, the majority of

the MDDR and CMC compounds should be regarded as lead-like and not, strictly speaking, drug-

like. Ideally, the drug-like training set should contain only drugs that have passed all safety hurdles.

We also believe that the nondrug set should consist of molecules that have close resemblance to

marketed drugs (i.e., at least somewhat lead-like) but were abandoned during pre-clinical or clinical

development. We anticipate that the analysis will benefit from the more rigorous definition of drug

and nondrug because the intrinsic difference between them (presumably owing to their pharmacokinetic or

toxicological characteristics) will be amplified.

In a recent review article, Walters, Ajay, and Murcko wrote:11

[What] we may witness in coming years might be attempts to predict the various properties that

contribute to a drug's success, rather than the more complex problem of drug-likeness itself. These

might include oral absorption, blood-brain barrier penetration, toxicity, metabolism, aqueous

solubility, logP, pKa, half-life, and plasma protein binding. Some of these properties are themselves

rather complex and are likely to be extremely difficult to model, but in our view it should be

possible for the majority of properties to be predicted with better-than-random accuracy.

This divide-and-conquer approach to drug-likeness scoring also brings better interpretability to the

result. The potential liability of a drug candidate becomes more transparent, and an appropriate

remedy can be sought out accordingly. In the following sections of this Chapter we will discuss the

role played by adaptive modeling and artificial intelligence methods in the prediction of individual

properties that contribute to the overall drug-likeness of a molecule.

Physicochemical Properties

An implicit statement of the Pfizer rule is that a drug must have a balanced hydrophilic-lipophilic

character. Two physicochemical parameters have the most profound influence on the drug-like

properties of a molecule: (i) aqueous solubility, which is critical to drug delivery; and (ii)

hydrophobicity, which plays a key role in drug absorption, transport, and distribution.

Aqueous Solubility

A rapidly advancing area of modern pharmaceutical research is the prediction of the aqueous

solubility of complex drug-sized compounds from their molecular structures. The ability to design

novel entities with sufficient aqueous solubility can bring many benefits to both pre-clinical

research and clinical development. For example, accurate activity measurements can be obtained

only if the substance is sufficiently soluble (above the detection limits of the assay). Otherwise, a

potentially good SAR can be obscured by apparent poor activity due to insufficient solubility rather

than inadequate potency. Finding a ligand with adequate solubility is also a key factor that

determines the success of macromolecular structure determination. In X-ray crystallography, the

formation of crystals appears to be very sensitive to the solubility of ligands. Most biostructural

NMR experiments require ligands dissolved at a relatively high concentration in a buffer. At a more

downstream level in drug development, the solubility of a drug candidate has perhaps the most

profound effect on absorption. Although pro-drug strategies or special methods in pharmaceutical

formulation can help to increase oral absorption, the solubility largely dictates the route of drug

administration and, quite often, the fate of the drug candidates.

The aqueous solubility of a substance is often expressed as log units of molar solubility (mol/L), or

logS. It is suggested that solubility is determined by three major thermodynamic components that

describe the solubilization process.4 The first is the crystal packing energy, which measures the

strength of a solid lattice. The second is the cavitation energy, which accounts for the loss of

hydrogen bonds between the structured water upon the formation of a cavity to host the solute. The

third is the solvation energy, which gauges the interaction energy between the solute and the water

molecules. To account for these effects, a number of experimental and theoretical descriptors have

been introduced to solubility models in the past years. Some of them include melting points,12-14

cohesive interaction indices,15 solvatochromic parameters,16,17 shape, electronic and topological

descriptors,18-23 and mobile order parameters.24 Most of this work has been summarized in an

excellent review by Lipinski et al,4 and will not be discussed here. In this Section, we will focus on

some of the most recent developments involving the use of neural networks to correlate a set of

physicochemical or topological descriptors with experimental solubility.

The earliest neural network-based solubility model in the literature was reported by Bodor and

coworkers.18 Fifty-six molecular descriptors, which mostly accounted for geometric (e.g., surface,

volume, and ovality), electronic (e.g., dipole moment, partial charges on various atom-types) and

structural (e.g., alkenes, aliphatic amines, number of N-H bonds) properties, were generated from

the AMPAC optimized structures of 311 compounds. Empirically, Bodor et al determined that 17

out of the 56 descriptors seemed most relevant for solubility and the resulting 17-18-1 neural

network yielded a standard deviation of error of 0.23, which was superior to the corresponding

regression model (0.30), based on identical descriptors. In spite of such success, we think that there

are two major deficiencies in this neural network model. First, the use of 18 nodes in the hidden

layer may be excessive for this application, given that there are only 300 training examples.

Second, some of the 17 input descriptors are, in our opinion, redundant. For example, the inclusion

of functional transforms of a descriptor (e.g., QN^2 and QN^4 are functions of QN) might be

unnecessary because a neural network should be able to handle such mapping implicitly. To

overcome such limitations, PCA and smaller networks could be applied.
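
The concern about network size can be made concrete by counting adjustable parameters. The sketch below assumes a fully connected feed-forward network with 17 inputs, 18 hidden nodes, and 1 output, and one bias term per hidden and output node; whether the original model used biases is not stated in the text, so the exact count is an assumption.

```python
def count_weights(layers):
    """Number of adjustable parameters (weights plus one bias per node) in a
    fully connected feed-forward network with the given layer sizes."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(layers, layers[1:]))


n_params = count_weights([17, 18, 1])  # 17 inputs, 18 hidden nodes, 1 output
n_compounds = 311                       # compounds in the data set
print(n_params)                         # adjustable parameters
print(n_compounds / n_params)           # training examples per parameter
```

Under these assumptions the network has more adjustable parameters than there are compounds, which is consistent with the remark that 18 hidden nodes may be excessive for about 300 training examples.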

The research group of Jurs at Pennsylvania State University has investigated many QSPR/QSAR

models for a wide range of physical or biological properties based on molecular structures.21-23,25-31

Recently, they published two solubility studies using their in-house ADAPT (Automated Data

Analysis and Pattern Recognition Toolkit) routine and neural network modeling. 22,23 Briefly, each

molecule was entered into the ADAPT system as a sketch and the three-dimensional structure was

optimized using MOPAC with the PM3 Hamiltonian. In addition to topological indices, many

geometric and electronic descriptors, including solvent-accessible surface area and volume,

moments of inertia, shadow area projections, gravitational indices, and charged partial surface area

(CPSA) were computed. To reduce the descriptor set they applied a genetic algorithm and simulated

annealing techniques to select a subset of descriptors that yielded optimal predictivity of a

'validation' set (here, a small set of test molecules that was typically 10% of the training set). In the

first study,22 application of the ADAPT procedure to 123 organic compounds led to the selection of

nine descriptors for solubility correlation. The rms errors of the regression and the 9-3-1 neural

network models were 0.277 and 0.217 log units, respectively. In the next study,23 the same

methodology was applied to a much larger data set containing 332 compounds, whose solubility

spanned a range of over 14 log units. The best model reported in this study was a 9-6-1 neural

network yielding an rms error of 0.39 log units for the training compounds. It is noteworthy that

there was no correspondence between any of the current descriptors and the set that was selected by

their previous model. A possible explanation is that the ADAPT descriptors may be highly

intercorrelated and therefore the majority of the descriptors are interchangeable in the model with no

apparent loss in predictivity.

Perhaps the most comprehensive neural network studies of solubility were performed by

Huuskonen et al.32-34 In their first study,32 system-specific ANN models were developed to predict

solubility for three different drug classes, which comprised 28 steroids, 31 barbituric acid

derivatives, and 24 heterocyclic reverse transcriptase (RT) inhibitors. The experimental logS of these

compounds ranged from -5 to -2. For each class of compounds, the initial list of descriptors

contained 30 molecular connectivity indices, shape indices, and E-state indices. Five

representative subgroups of descriptors were established, based on the clustering of their pairwise

Pearson correlation coefficients. A set of five parameters was then selected, one from each

subgroup, as inputs to a 5-3-1 ANN for correlation analysis. Several five-descriptor combinations

were tried, and those that gave the best fit of training data were further investigated. To minimize

overtraining, an early stopping strategy was applied, so that the training of the neural network

stopped when the leave-one-out cross-validation statistics began to deteriorate. The final models

yielded q2 values of 0.80, 0.86, and 0.72 for the steroids, barbiturates, and RT inhibitors classes,

respectively. Overall, the standard error of predictions was approximately 0.3 to 0.4 log units. Since

each ANN was optimized with respect to a specific compound class, it was not surprising that

application of solubility models derived from a particular class to other classes of compounds yields

unsatisfactory results. It was more surprising, however, that the effort to unravel a universal

solubility model applicable to all three classes of compounds also proved unsuccessful (Note: They

could, in theory, obtain reasonable predictivity for the combined set if an indicator variable was

introduced to specify each compound class. However, this would obviously defeat the purpose of a

generally applicable model). One possible explanation is that the combined data set (83 compounds

in total) contained compounds segregated in distinct chemical spaces and it would be difficult to

find a set of common descriptors that could accurately account for the behavior of each group of

compounds.
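The early stopping rule mentioned above can be sketched as follows. This is only an illustration under our own assumptions, not the authors' implementation: the function name `early_stopping_epoch`, the `patience` parameter, and the error values are hypothetical, and the list of per-epoch errors stands in for the leave-one-out cross-validation statistic that was monitored during training.

```python
def early_stopping_epoch(val_errors, patience=1):
    """Return the index of the epoch whose weights should be kept.

    val_errors: validation (e.g., cross-validation) error after each epoch.
    patience:   number of consecutive non-improving epochs tolerated
                before training is halted (hypothetical parameter).
    """
    best_epoch, best_error, waited = 0, float("inf"), 0
    for epoch, error in enumerate(val_errors):
        if error < best_error:
            # Validation error still improving: remember this epoch.
            best_epoch, best_error, waited = epoch, error, 0
        else:
            # Validation error deteriorated: stop once patience is used up.
            waited += 1
            if waited >= patience:
                break
    return best_epoch


# Hypothetical validation errors for five training epochs.
errors = [0.90, 0.70, 0.60, 0.65, 0.50]
print(early_stopping_epoch(errors, patience=1))  # stops at the first deterioration
print(early_stopping_epoch(errors, patience=2))  # tolerates one bad epoch
```

A stricter rule (small `patience`) guards against overtraining but may stop before a later improvement; the appropriate setting depends on how noisy the validation statistic is.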

In their next study,33 Huuskonen et al. collated experimental solubilities of 211 drugs and related

analogs from the literature. This set of compounds spanned approximately six log units (from -5.6 to

0.6), which was almost twice the range of their previous study. Thirty-one molecular descriptors,

which included 21 E-state indices, 7 molecular connectivity indices, number of hydrogen donors,

number of hydrogen acceptors, and an aromaticity indicator, were used initially in model building.

The final number of descriptors was later pruned to a subset of 23 by probing the contribution of

each individual parameter. The final ANN model had a 23-5-1 configuration, and yielded r2 = 0.90,

and s = 0.46 for the 160-member training set, and r2 = 0.86, s = 0.53 for the remaining 51 test

compounds. Besides these descriptive statistical parameters, the authors also published the

individual predictions for these compounds. Because all 24 RT inhibitors from the previous study

were part of the data set, this allowed us to investigate the relative merit of a system-specific

solubility model versus a generally applicable model. Of the 24 compounds, 20 were selected for

the training set, and 4 were used as test compounds. Figure 2(a) shows the predicted versus

observed aqueous solubilities for the RT inhibitors in their previous system-specific model, and

Figure 2(b) is the corresponding plot for the predicted solubilities from the general purpose model.

It is clear that, although the predictions of most RT inhibitors were within the correct solubility

range (logS -2 to -5), a comparison of individual predictions for this class of compounds reveals

very weak correlation (r2 = 0.16; s = 0.73). This result contrasted sharply with the very good

predictivity (r2 = 0.73; s = 0.41) when the RT inhibitors had been considered on their own.32 This

supports the notion that a system-specific solubility predictor is more accurate than a general one,

though the former obviously has only limited scope. Thus, one must choose an appropriate

prediction tool depending on the nature of the intended application. For instance, if the emphasis is

on a single class of compounds we should consider the construction of a specialist model (provided

there are sufficient experimental data for the series) or recalibrate the general model by seeding its

training set with compounds of interest.
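The two statistics used in this comparison can be computed with a few lines of code. The sketch below is ours and rests on two assumptions: r2 is taken as the squared Pearson correlation between observed and predicted values, and s as the root-mean-square prediction error (the exact definition of s used in the original studies is not restated here). The data are invented solely to show that predictions can fall within the correct solubility range and still correlate poorly with experiment.

```python
import math


def pearson_r2(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    s_op = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    return s_op * s_op / (s_oo * s_pp)


def rms_error(observed, predicted):
    """Root-mean-square error (our assumed definition of s)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Invented logS values, all between -2 and -5.
observed = [-2.0, -3.0, -4.0, -5.0]
shifted = [-2.5, -3.5, -4.5, -5.5]      # constant offset, correct ranking
scrambled = [-4.0, -2.5, -4.5, -3.0]    # plausible range, wrong ranking

print(pearson_r2(observed, shifted), rms_error(observed, shifted))
print(pearson_r2(observed, scrambled), rms_error(observed, scrambled))
```

The second case mirrors the situation described for the RT inhibitors: a model may place a compound class in the right solubility window without ranking its members correctly.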

Figure 2

Predicted versus experimental solubility for the 24 RT inhibitors using (a) a system-specific

solubility model and (b) a general model.


In his most recent study,34 Huuskonen attempted to improve the accuracy and generality of his

aqueous solubility model by considering a large, diverse collection of approximately 1,300 compounds. The

logS values for these compounds ranged from -11.6 to +1.6, which essentially covers the range of

solubilities that can be reliably measured. The full data set was partitioned into a randomly chosen

training set of 884 compounds and a test set of 413. Starting from 55 molecular connectivity,

structural, and E-state descriptors, he applied an MLR stepwise backward elimination strategy to

reduce the set to 30 descriptors. For the training data, this equation yielded r2 = 0.89, s = 0.67, and

r2cv = 0.88, scv = 0.71. The statistical parameters for the 413-compound test set were essentially identical to those

of leave-one-out cross-validation, thereby indicating the generally robust nature of this model. He

applied ANN modeling to the same set of parameters in order to determine whether the prediction

could be further improved via nonlinear dependencies. Using a 30-12-1 ANN, he obtained r2 = 0.94,

s = 0.47 for the training set, and r2 = 0.92, s = 0.60 for the test set, which were both

significantly better than the MLR model. The general applicability of the MLR and ANN models

was further verified by application to a set of 21 compounds suggested by Yalkowsky,35 which has

since become a benchmark for novel methods. The r2 and s values for the MLR model are 0.83 and

0.88, and for the ANN, 0.91 and 0.63, in good agreement with their respective cross-validated and

external test statistics. Both results were, however, significantly better than those derived from their

previous model constructed using 160 training compounds (r2 = 0.68, s = 1.25). This indicated that a

large and structurally diverse set of compounds was required to train a model capable of giving

reasonable solubility predictions for structures of pharmaceutical and environmental interest,

such as the set of compounds under consideration in this study.

logP

The n-octanol/water partition coefficient of a chemical is the ratio of its concentration in n-octanol

to that in aqueous medium at equilibrium. The logarithm of this coefficient, logP, is perhaps the

best-known descriptor in classical QSAR studies. The reason for the usefulness of this property is

related to its correlation with the hydrophobicity of organic substances, which plays a key role in

the modulation of many key ADME processes. Specifically, drug-membrane interactions, drug

transport, biotransformation, distribution, accumulation, protein and receptor binding are all related

to drug hydrophobicity.36 The significance of logP is also captured by the Rule-of-5,4 which states

that a molecule will likely be poorly absorbed if its logP value exceeds five. Other researchers also

established links between logP and blood-brain barrier (BBB) penetration, a critical component in

the realization of activity on the central nervous system (CNS).10,37-42 For CNS-active compounds,

usually a logP around 4-5 is required.
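The definition of logP given above translates directly into code. The following is a minimal sketch: the function names and the concentration values are our own, and the check covers only the logP criterion of the Rule-of-5 as stated in the text, not the other criteria of that rule.

```python
import math


def log_p(conc_octanol, conc_water):
    """logP: base-10 logarithm of the n-octanol/water concentration ratio.

    Both concentrations must be in the same units and measured at equilibrium.
    """
    return math.log10(conc_octanol / conc_water)


def exceeds_logp_limit(logp, limit=5.0):
    """True if logP is above the Rule-of-5 threshold quoted in the text."""
    return logp > limit


# Hypothetical equilibrium concentrations (arbitrary but identical units).
value = log_p(conc_octanol=100.0, conc_water=1.0)
print(value, exceeds_logp_limit(value))
```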

One of the earliest attempts to derive logP values from computational means was the f-constant

method proposed by Rekker.43 Later, Leo and Hansch made a significant advance to this fragment-

based approach that ultimately led to the successful development of the widely popular ClogP

program.44 In summary, they assumed an additive nature of hydrophobicity values from different

molecular fragments, whose parameter values were calibrated by statistical analysis of a large

experimental database. To estimate the logP value of a novel molecule, the chemical structure is

first decomposed into smaller fragments that can be recognized by the program. The logP value of

the molecule is simply the incremental sum of parameter values from the composite fragments, and

in some cases, additional correction factors. The main advantage of a fragment-based method is that

it tends to be very accurate. However, this approach suffers from two major problems. The first is

that the molecular decomposition process is often very tricky. The second, and the more serious,

concerns missing parameter values when a given structure cannot be decomposed into structures for

which fragment values are available. Thus, it has become more fashionable to treat the molecule in its

entirety, and to correlate its logP value with descriptors that are easy to calculate. Most published

reports follow this scheme and are based on the use of MLR or ANN on some combination of

electronic and steric properties. For example, molecular descriptors such as atomic charges,

hydrogen bond effects, molecular volumes or surface areas have been considered in this role.
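The additive fragment scheme, and its failure mode when a fragment is missing from the parameter table, can be illustrated with a short sketch. The fragment names and numerical contributions below are invented for illustration only; they are not the Rekker or ClogP parameters, and the decomposition of a structure into fragments is assumed to have been done already.

```python
# Made-up fragment contributions, for illustration only.
HYPOTHETICAL_FRAGMENT_VALUES = {
    "CH3": 0.9,
    "CH2": 0.5,
    "OH": -1.1,
    "C6H5": 1.9,
}


def additive_logp(fragments, correction=0.0, table=HYPOTHETICAL_FRAGMENT_VALUES):
    """Estimate logP as the sum of fragment values plus a correction term.

    Raises KeyError if any fragment has no tabulated value, which is the
    'missing parameter' problem described in the text.
    """
    missing = [f for f in fragments if f not in table]
    if missing:
        raise KeyError(f"no fragment value available for: {missing}")
    return sum(table[f] for f in fragments) + correction


# A molecule pre-decomposed into known fragments.
print(additive_logp(["CH3", "CH2", "OH"]))

# A molecule containing a fragment absent from the table cannot be scored.
try:
    additive_logp(["CH3", "PO3H2"])
except KeyError as err:
    print("cannot estimate logP:", err)
```

Whole-molecule descriptor models avoid the lookup failure, at the cost of the direct chemical interpretability that a fragment table provides.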

Schaper and Samitier proposed a logP method based on an ANN to determine the lipophilicity of

unionized organic substances by recognition of structural features.45 Molecules were encoded by a

connection table, where indicator variables were used to denote the presence or absence of specific

atoms or bonds in different molecular positions. Eight different atom types (C, N, O, S, F, Cl, Br,

and I) and four different bond types (single, double, triple, and resonant) were represented in their

implementation. For compounds with up to 10 non-hydrogen atoms, a full description of the

molecule required 260 variables (10 × 8 indicator variables for atoms and 45 × 4 for bonds). After

preliminary analysis of their data set, which was comprised of 268 training and 50 test compounds,

147 non-zero descriptors were retained. They experimented with three different hidden layer

configurations (2, 3, and 4 hidden nodes) and suggested that an ANN with three hidden layer

neurons was the optimal choice based on the prediction accuracy of the test set. The 147-3-1 NN

yielded a Pearson correlation coefficient (rtrn) of 0.98 and a standard deviation (strn) of 0.25 between

observed and calculated logP values for the training compounds. It is interesting to note that,

despite the use of a large number of adjustable parameters (448), this particular NN showed little

evidence of overfitting: the test set correlation coefficient (rtst) is 0.88 and standard deviation (stst) =

0.66. The authors suggested that with a decrease in the rho (ρ) ratio (either an increase in data

objects or a reduction in non-critical indicator variables), the predictivity of this type of NN system

would further increase. The major shortcoming of this approach is that the molecular representation

is based on connection matrix/indicator variables. Their study was limited to compounds containing

no more than 10 non-hydrogen atoms. With a connection matrix, the number of input descriptors to

the ANN increases quadratically with the maximum number of allowed atoms (N_MaxAtom) in the data

set. For example, using the current scheme of 8 atom-types and 4 bond-types, the total number of

descriptors is calculated by:

N_descriptors = 8 × N_MaxAtom + 4 × N_MaxAtom × (N_MaxAtom - 1) / 2

If we were to apply this method to drug-size molecules, which contain an average of 20-25

non-hydrogen atoms, then the ANN would need to deal with approximately 1,000 indicator variables.

Introduction of new atom-types, such as phosphorus, would add further complexity to the molecular

description. This begs the question: are all these descriptors necessary to produce a sound logP

model? The answer is, most probably, no. We speculate that because molecular connectivity

descriptors have no physical meaning, a large number of them is required to depict or correlate

physicochemical properties. If physically meaningful descriptors are used, then one may obtain a

more direct relationship from fewer predictors.
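The counting argument used in this discussion (10 × 8 atom indicators plus 45 × 4 bond indicators for ten atoms) is easy to verify numerically. The sketch below uses our own function names. The parameter count assumes a fully connected feed-forward network with one bias per hidden and output node, an assumption that reproduces the 448 adjustable parameters quoted for the 147-3-1 network.

```python
def n_connection_descriptors(n_max_atoms, n_atom_types=8, n_bond_types=4):
    """Number of indicator variables in a connection-table encoding."""
    atom_indicators = n_atom_types * n_max_atoms
    atom_pairs = n_max_atoms * (n_max_atoms - 1) // 2   # possible bonds
    return atom_indicators + n_bond_types * atom_pairs


def n_ann_parameters(layer_sizes):
    """Weights plus biases of a fully connected feed-forward network."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


print(n_connection_descriptors(10))    # 260
print(n_connection_descriptors(20))    # 920
print(n_connection_descriptors(25))    # 1400
print(n_ann_parameters([147, 3, 1]))   # 448
```

The values for 20 and 25 atoms bracket the figure of roughly 1,000 indicator variables for drug-size molecules and make the quadratic growth with molecular size explicit.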

In a recent study, Breindl, Beck, and Clark applied semi-empirical methods to obtain a small set of

quantum chemical descriptors to correlate the logP values for 105 organic molecules.46 They used

the CONCORD program to convert 2D connectivity into standard 3D structures,47 whose

geometries were further refined by energy minimization using SYBYL.48 The structures were then

optimized using VAMP,49 a semi-empirical program. The input descriptors, which included both

electrostatic and shape properties of the molecules, were derived from AM1 and PM3 calculations.

Using MLR analysis, they derived a 10-term equation that gave an rtrn value of 0.94 and an rcv of

0.87. The choice of descriptors for this MLR model was further analyzed using an ANN. With a 12-4-1

back-propagation network, they improved the fitting of the training set to rtrn = 0.96 and rcv = 0.93

and, furthermore, the neural network also seemed to perform consistently well on 18 test set

molecules. Finally, this approach was validated with a larger data set of 1085 compounds, for which

980 molecules were used as the training set and 105 were held back for testing. The best

performance was obtained with a 16-25-1 network, which yielded an rtrn of 0.97 for training and an rcv

of 0.93 with the AM1 parameters, and a slightly worse (rtrn = 0.94 and rcv = 0.91, strn = 0.45) result

for the PM3 set. Again, the validity of the neural network result was confirmed by accurate test set

predictions, which yielded impressive statistical parameters of rtst = 0.95 and stst = 0.53 for the AM1

result and, again, slightly worse values for the PM3 set (rtst = 0.91; stst = 0.67). The deficiency of the

PM3 set was further analyzed, and it was concluded that there was a systematic problem with the

estimation of logP values for those compounds with large alkyl chains. They reasoned that the large

error was due to the uncertainty of appropriate conformations from gas phase geometries under their

setup. By systematically varying the values of one input descriptor while keeping others fixed, they

concluded that the logP values were predominately influenced by three descriptors, namely

polarizability, balance u, and charge OSUM. Furthermore, a direct linear dependence between logP

and polarizability was observed. On the other hand, the effects for the balance parameter and

OSUM were shown to be highly non-linear with respect to logP. Overall, it seems that reliable logP

models can be sought using a few quantum chemical parameters, although the time-consuming

nature of the calculation makes it less attractive for analysis of large virtual libraries.

To address some of the limitations of the older QSPR approaches, Huuskonen and coworkers

proposed the use of atom-type electrotropological state (E-state) indices for logP correlation.36 The

E-state indices were first introduced by Kier and Hall,50,51 and have been validated in many QSAR

and QSPR applications. They capture both the electronic and topological characteristics

surrounding an atomic center as well as its neighboring environment. In the implementation of E-

state descriptors by Huuskonen et al, several new atom-types corresponding to contributions from

amino, hydroxyl, and carbonyl groups in different bonding environments were introduced. This

level of detail seems particularly relevant for the purpose of hydrophobicity modeling. For instance,

it is known that an aromatic amino group is generally less basic than its aliphatic counterpart, which

makes the former less likely to ionize and presumably more hydrophobic. The use of the extended

parameter set was justified by a significant improvement in cross-validated statistics of the 1,754-compound

training set. An MLR model using 34 basic E-state descriptors yielded a q2 value of 0.81 and an

RMScv of 0.64; whereas with 41 extended parameters, the corresponding values were 0.85 and 0.55.

Huuskonen et al also applied an ANN to be able to model higher-order nonlinearity between the

input descriptors and logP. The final model, which had a 39-5-1 architecture, gave a q2 value of

0.90 and an RMScv of 0.46 for leave-one-out cross-validation. Further validation on three independent

test sets yielded a similar RMS error (0.41), thereby confirming the consistency of the predictions.

The logP predictions of this new method were compared to those derived from commercial

programs. It was found that this method was as reliable as, or better than, the established methods for

even the most complex structures.

In our opinion, the approach of Huuskonen and coworkers represents a method of choice for fast

logP estimation, particularly for applications where both speed and accuracy are critical. Because

the algorithm does not depend on the identification of suitable basis fragments, the method is

generally applicable. Unlike methods that utilize quantum chemical descriptors, the calculation is

genuinely high-throughput because E-state indices can be computed directly from SMILES line

notation without costly structure optimization. Furthermore, this hydrophobicity model, which was

developed using 40 descriptors, can account for most, if not all, molecules of pharmaceutical

interest. In contrast, a connectivity table representation may require on the order of thousands of

input values, which also increases the risk of chance correlation. The major limitation of the

Huuskonen hydrophobicity method is the difficulty of chemical interpretation. This is in part due to

the topological nature of the molecular description and in part the use of nonlinear neural networks

for property correlation. Particularly, it is hard to isolate the individual contributions of the

constituent functional groups to the overall hydrophobicity; or conversely, to design modifications

that will lead to a desirable property profile (i.e., the inverse QSPR problem). Another important

issue that has not been addressed concerns the treatment of ionizable compounds, which may adopt

distinct protonation states under different solvent environments (e.g., water and 1-octanol).

Currently, this phenomenon is either ignored or assumed to be handled implicitly. Together with the

inverse QSPR problem, the correct handling of such molecules will be the major question that needs

to be answered by the next-generation logP prediction systems.

Bioavailability

Bioavailability is the percentage of a drug dose which proceeds, in an unaltered form, from the site

of administration to the central circulation. By definition, a drug that is administered intravenously

has 100% bioavailability. By comparing systemic drug levels achieved after intravenous injection

with other drug delivery routes, an absolute bioavailability can be measured. Since for several

reasons, oral administration is the preferred route for drug delivery, a major challenge for

biopharmaceutical research is to achieve high oral bioavailability.
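The comparison of systemic drug levels can be made concrete with a small calculation. The sketch below assumes the common pharmacokinetic convention, which is not spelled out in the text, that systemic exposure is measured as the area under the plasma concentration-time curve (AUC) and that absolute bioavailability is the dose-normalized ratio of oral to intravenous AUC. All names and numbers are hypothetical.

```python
def auc_trapezoid(times, concentrations):
    """Area under a concentration-time curve by the linear trapezoidal rule."""
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        area += dt * (concentrations[i] + concentrations[i - 1]) / 2.0
    return area


def absolute_bioavailability(auc_oral, dose_oral, auc_iv, dose_iv):
    """Percent bioavailability from dose-normalized AUC values."""
    return 100.0 * (auc_oral / dose_oral) / (auc_iv / dose_iv)


# Hypothetical sampling times (h) and plasma concentrations (mg/L).
times = [0.0, 1.0, 2.0, 4.0]
conc_iv = [10.0, 6.0, 4.0, 0.0]
conc_oral = [0.0, 4.0, 3.0, 1.0]

auc_iv = auc_trapezoid(times, conc_iv)
auc_oral = auc_trapezoid(times, conc_oral)
print(auc_iv, auc_oral)
print(absolute_bioavailability(auc_oral, 10.0, auc_iv, 10.0))
```

By construction, substituting the intravenous curve for the oral one returns 100%, in line with the definition given above.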

Several factors contribute to reduction of oral bioavailability. First, drug molecules may bind to

other substances present in the gastrointestinal tract, such as food constituents. The extent of

reduction may vary significantly with an individual diet. Second, the drug may be poorly absorbed

due to unfavorable physicochemical properties, such as those outlined in the Pfizer rule. Third, the

drug may be metabolized as it passes through the gut wall, or, more commonly, by the liver during

first-pass metabolism. Due to the complexity of the different processes affecting oral

bioavailability, as well as the scarcity of data, the development of a generally applicable

quantitative structure-bioavailability relationship (QSBR) has proven to be a formidable task. The

most extensive QSBR study to date was reported by Yoshida and Topliss,52 who correlated the oral

bioavailability of 232 structurally diverse drugs with their physicochemical and structural attributes.

Specifically, they introduced a new parameter ΔlogD, which is the difference between the

logarithm of the distribution coefficient of the neutral form at pH = 6.5 (intestine) versus pH = 7.4

(blood) for an ionizable species. The purpose of this descriptor was to account for the apparent

higher bioavailability observed for many acidic compounds. They also included 15 descriptors to

encode the structural motifs with well-known metabolic transformations and therefore elucidated

the reduction of bioavailability due to the first-pass effect. Using these descriptors and a method

termed ORMUCS (ordered multicategorical classification method using the simplex technique),

they achieved an overall classification rate of 71% (97% within one class) when the compounds

were separated into four classes according to bioavailability. Furthermore, 60% (95% within one

class) of the 40 independent test compounds were also correctly classified using this linear QSAR

equation. The result of this study indicates that it might be feasible to obtain reasonable estimates of

oral bioavailability from molecular structures when physically and biologically meaningful

descriptors are employed. In the following section, we will give a brief review of how neural

network methods have been applied to the modeling of absorption and metabolism processes.
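For an ordered classification such as the four bioavailability classes above, the two quoted figures (exact classification rate and rate "within one class") can be computed as in the sketch below. The class labels are assumed to be coded as consecutive integers, and the example data are invented.

```python
def classification_rates(true_classes, predicted_classes):
    """Return (exact rate, within-one-class rate) for ordered integer classes."""
    n = len(true_classes)
    pairs = list(zip(true_classes, predicted_classes))
    exact = sum(1 for t, p in pairs if t == p)
    within_one = sum(1 for t, p in pairs if abs(t - p) <= 1)
    return exact / n, within_one / n


# Hypothetical bioavailability classes, coded 1 (lowest) to 4 (highest).
true_classes = [1, 2, 3, 4, 1]
predicted = [1, 3, 3, 2, 1]
print(classification_rates(true_classes, predicted))
```

The within-one-class rate only makes sense because the classes are ordered; for unordered categories a neighboring class has no meaning.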

Human Intestinal Absorption

The major hurdle in the development of a robust absorption model, and of other models, is very

often the lack of reliable experimental data. Experimental percent human intestinal absorption

(%HIA) data have generally large variability and are usually skewed to either very low or very high

values, with only few compounds in the intermediate range. Jurs and coworkers collated a data set

of 86 compounds with measured %HIA from the literature.31 The data were divided into three groups:

a training set of 67 compounds, a validation set of 9 compounds, and an external prediction set of

10 compounds. Using their in-house ADAPT program, 162 real-value descriptors were generated

that encoded the topological, electronic and geometric characteristics for every structure. In

addition, 566 binary descriptors were added to the set to indicate the presence of certain

substructural fragments. Two approaches were applied to prune this initial set of 728 descriptors to

a smaller pool of 127. First, descriptors that had variance less than a user-defined minimum

threshold were removed to limit the extent of single-example peculiarities in the data set. Second, a

correlation analysis was performed to discard potentially redundant descriptors. Application of a

GA-NN type hybrid system to this data set yielded a six-descriptor QSAR model. The mean

absolute error was 6.7 %HIA units for the training set, 15.4 %HIA units for the validation set, and

11 %HIA units for the external prediction set. The six descriptors that were selected by the GA

could elucidate the mechanism of intestinal absorption via passive transport, which is controlled by

diffusion through lipid and aqueous media. Three descriptors are related to hydrogen bonding

capability, which reflects the lipophilic and lipophobic characteristics of the molecule. The fourth

descriptor is the number of single bonds, which can be regarded as a measure of structural

flexibility. The other two descriptors represent geometric properties providing information about the

molecular size. This set of descriptors, in our opinion, shares a certain similarity to the ones that

define the Pfizer rule. However, it is fair to point out that the great popularity of the Pfizer rule amongst

medicinal chemists is, in the words of Lipinski et al.,4 "because the calculated parameters are very

readily visualized structurally and are presented in a pattern recognition format." On the contrary,

the use of more complex 3D descriptors and neural network modeling may enhance prediction

accuracy, although it is probably at the expense of a diminished practical acceptance.

Overall, the result of this initial attempt to predict absorption models is encouraging and more work

in this area is assured. Because in vivo data are generally more variable and expensive, there will be

strong emphasis on correlating oral absorption and in vitro permeability obtained from model

systems such as Caco-2 or immobilized artificial membranes. In addition, future absorption models

may have a molecular recognition component, which will handle compounds that are substrates for

biological transporters.
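The two pruning steps described for the %HIA descriptor pool, removal of low-variance descriptors followed by removal of redundant (highly correlated) ones, can be sketched as follows. This is not the ADAPT procedure itself: the thresholds, the greedy keep-the-first strategy, and the descriptor names are our own assumptions.

```python
import math


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def pearson_r(a, b):
    """Pearson correlation; both inputs must have non-zero variance."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                     sum((y - mean_b) ** 2 for y in b))
    return cov / norm


def prune_descriptors(columns, min_variance=1e-3, max_abs_r=0.9):
    """Drop near-constant descriptors, then descriptors that correlate
    strongly with one already kept. `columns` maps name -> list of values.
    min_variance must be positive so constant columns never reach pearson_r.
    """
    kept = []
    for name, values in columns.items():
        if variance(values) < min_variance:
            continue                      # step 1: too little variance
        if any(abs(pearson_r(values, columns[k])) >= max_abs_r for k in kept):
            continue                      # step 2: redundant with a kept one
        kept.append(name)
    return kept


# Hypothetical descriptor table for six compounds.
table = {
    "n_hbond_donors": [0, 1, 2, 3, 1, 2],
    "is_organic": [1, 1, 1, 1, 1, 1],         # constant, removed in step 1
    "donors_times_two": [0, 2, 4, 6, 2, 4],   # redundant, removed in step 2
    "n_single_bonds": [5, 3, 6, 2, 7, 4],
}
print(prune_descriptors(table))
```

Which member of a correlated pair survives depends on the column order here; a production workflow would need a more principled choice.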

Drug Metabolism

Drug metabolism refers to the enzymatic biotransformations to which drug molecules are subject in

the body. This is an important defensive mechanism of our bodies against potential toxins, which

are generally lipophilic and are converted to more soluble derivatives that can be excreted more

readily. Most drug metabolism processes occur in the liver, where degradation of drugs is catalyzed

by a class of enzymes called hepatic microsomal enzymes. This constitutes the first-pass effect,

which can limit a drug's systemic oral bioavailability. In the past, relatively few researchers paid

special attention to drug clearance until a lead molecule had advanced nearly to the stage of clinical

candidate selection. More recently this attitude has changed as the requirement for pharmacokinetic

data for the purposes of correct dose calibration has been recognized. Thus, there is considerable

interest in the development of in vitro or in vivo physiological models to predict hepatic metabolic

clearance during the lead optimization stage.

Lavé and coworkers at Roche made an attempt to correlate human pharmacokinetic data with in

vitro and in vivo metabolic data.53 They collated experimental data for 22 literature and in-house

compounds that were structurally diverse. The in vitro metabolic data were derived from the

metabolic stability of the substances in hepatocytes isolated from rats, dogs, and humans, and the in

vivo pharmacokinetic data were measured after intravenous administration for the same species. All

in vitro data, as well as the in vivo data for rats and dogs, were used in combination to predict the

human in vivo data. Their statistical analysis included multiple linear regression (MLR), principal

component regression (PCR), partial least squares (PLS) regression, and artificial neural networks

(ANN). The results of their study are summarized in Table 3. The major conclusion from this study

is that the strongest predictors of human in vivo data were human and rat hepatocyte data; the in

vivo clearance data from either rats or dogs did not significantly contribute to any statistical model.

One possible explanation is that the results from in vivo experiments are generally more variable

and are therefore noisier when used as predictors. It is also clear that all statistical

methods (MLR, PCR, PLS and ANN) appeared to work satisfactorily for this data set; in fact, from

a statistical view point the results are practically identical. It is interesting to note that the non-linear

mapping capability of a neural network was not required in this case, probably because of the

already strong linear correlation between the human in vivo data and the human hepatocyte data

(r = 0.88) and the rat hepatocyte data (r = 0.81). Overall, despite the limitation of modest data set

size, the results of this study provide further support for early in vitro screening of drug candidates

because satisfactory human pharmacokinetic data can be predicted through mathematical modeling

of these less expensive parameters. It is also fair to point out that the accuracy of their model does

come at a price: one must first synthesize a compound and determine the appropriate

biological parameters before a prediction can be made.

Table 3 Accuracy of the statistical models for human in vivo clearance prediction53

                      Descriptors used (a)        Statistical parameters
Statistical model     r_h  d_h  h_h  r_a  d_a     No. of terms (b)   r2     q2

MLR                                               5                  0.84   0.74
MLR                                               2                  0.84   0.79
PCR                                               2                  0.85   0.79
PLS                                               2                  0.86   0.77
PLS                                               1                  0.83   0.79
PLS                                               1                  0.83   0.79
NN_linear                                         5                  0.86   0.79
NN_sigmoidal                                      3                  0.88   0.77
NN_sigmoidal                                      2                  0.88   0.77

(a) r_h = rat hepatocyte; d_h = dog hepatocyte; h_h = human hepatocyte; r_a = rat animal data;

d_a = dog animal data

(b) MLR = number of descriptors; PCR = number of principal components; PLS = number of

components; ANN = number of descriptors


To overcome this problem, some researchers prefer to focus on theoretical descriptors that can be

computed from molecular structure. Recently, Quiñones et al. tried to correlate drug half-life values

based on physicochemical or topological descriptors, which were derived from a series of 30

structurally diverse antihistamines.54 These descriptors were used as input values to an ANN and

were trained against the experimental half-life of a drug, which is the time it takes for one-half of a

standard dose to be eliminated from the body. Initially, they tried to formulate a model that made

use of seven physicochemical descriptors: logP, pKa, molecular weight, molar refractivity, molar

volume, parachor, and polarizability. However, it did not lead to a statistically significant model.

They then investigated the possibility of using the CODES descriptors, which capture the atomic

character as well as its neighboring chemical environment for each individual atom. In their study,

they picked four CODES descriptors that corresponded to a common chemical substructure present

in all 30 antihistamines. Two neural network configurations, one with five hidden nodes and

another with six, were tested. The results from cross-validated predictions of their model were very

encouraging, and they were mostly consistent with the range of experimental half-life values (Fig.

3). A test set of five other antihistamines was used to evaluate the two ANN models. Again, there

was good agreement between the experimental and calculated half-life values, indicating the

general robustness of their models, at least within the domain of biogenic amines.
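The half-life that serves as the target property in this study has a simple quantitative meaning if elimination is assumed to follow first-order kinetics. That assumption is ours and is not stated in the text; under it, the rate constant is ln 2 divided by the half-life. The function names and the example half-life below are hypothetical.

```python
import math


def elimination_rate_constant(t_half):
    """First-order elimination rate constant k = ln(2) / t_half."""
    return math.log(2.0) / t_half


def fraction_remaining(t, t_half):
    """Fraction of the dose still in the body at time t (same units as t_half)."""
    return math.exp(-elimination_rate_constant(t_half) * t)


# Hypothetical drug with a 4 h half-life.
for hours in (0.0, 4.0, 8.0, 12.0):
    print(hours, round(fraction_remaining(hours, t_half=4.0), 3))
```

After one half-life half of the dose remains, after two a quarter, and so on, which is why errors in a predicted half-life translate directly into errors in dosing intervals.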

Figure 3

Calculated half-life values from a neural network versus experimental values. The cross-validated

predictions for the 30 training compounds are shown as open circles; the predictions for the 5 test

set compounds are shown in filled squares. The experimental values for some compounds were

reported as a range and are plotted accordingly. The diagonal line represents a perfect correlation

between experimental and calculated half-life values.

Both approaches described above have their strengths and limitations. From a virtual

screening perspective, the approach of Quiñones et al. is more attractive since their model does not

rely on any experimental parameters. However, the association between the four CODES

descriptors and metabolism is unclear and the current model is relevant only to a specific class of compounds that share a common substructure. In this regard, it is our opinion that the set of

structural descriptors used by Yoshida and Topliss52 are particularly informative because they

represent some well-characterized metabolic liabilities. Nevertheless, drug metabolism is

immensely complex and biotransformations are catalyzed by many enzymes, some of which may

still be unknown to us. As different enzymes have different substrate specificity according to their

structural requirement, it will be a challenging task to formulate a simple theoretical model that is

generally applicable to many diverse chemical classes. On the other hand, the method proposed by Lavé is likely to be more general because their approach relies on experimental parameters and is

thus less dependent on the metabolic pathways involved.53 It is conceivable that we will see a

process that is somewhat of a hybrid of the two in the future. A panel of lead compounds with

similar structures could be synthesized and tested in vitro, which is generally less expensive and

time-consuming than in vivo animal testing. These in vitro data will serve as calibration data for

correlation with a set of relevant theoretical descriptors that are directly obtained from molecular

structures. Because of the strong relationship between in vitro and in vivo data, the

resulting QSAR model could be used to predict human pharmacokinetic clearance for

compounds that are within the scope of the original lead class. Further research to establish the

relationship between in vitro assay from tissue cultures of major metabolic sites (e.g., liver, kidneys,

lungs, gastrointestinal tract) and in vivo data appears to be justified.

CNS Activity

The nervous system of higher organisms is divided into a central system (CNS) that comprises the

brain and spinal cord, and a peripheral system (PNS) that embodies the remaining nervous tissues in

the body. The CNS coordinates the activities of our bodily functions, including the collection of

information from the environment by means of receptors, the integration of such signals, the storage

of information in terms of memory, and the generation of adaptive patterns of behavior. Many

factors, such as infectious diseases, hormonal disorders and other neurological degenerative

disorders, can disrupt the balance of this extremely complex system, leading to the manifestation of

CNS-related diseases. These include depression, anxiety, sleep disorder, eating disorders,

meningitis, Alzheimer's and Parkinson's diseases. The prevalence of such diseases in the modern


world is reflected in part by the continuous growth of the market for CNS drugs, which is now the

third highest selling therapeutic category behind cardiovascular and metabolic products, and is

predicted to reach over $60 billion worldwide by 2002.55 These drugs, which have the brain as the

site of action, must cross the barrier between brain capillaries and brain tissue (the blood-brain

barrier, or BBB). This barrier helps to protect the brain from sudden chemical changes and allows only a tiny fraction of a dose of most drugs to penetrate to cerebrospinal fluid and enter the brain.

Knowledge of the extent of drug penetration through the BBB is of significant importance in drug

discovery, not only for new CNS drugs, but also for other peripherally acting drugs whose exposure

to the brain should be limited in order to minimize the potential risk of CNS-related side-effects. It

is believed that certain physicochemical characteristics are common to molecules

that are capable of BBB penetration, whose extent is often quantified by logBB, the logarithm of the

ratio of steady-state concentration of drug in brain to that in blood. Some of these attributes include size, lipophilicity, hydrogen-bonding propensity, charge, and conformation. It was about 20 years

ago that Levin reported a study describing a strong relationship between rat brain penetration and

molecular weight for drugs that have MW less than 400.37 In a later study, Young et al observed that

the logBB could be related to the difference between the experimental logP(octanol/water) and

logP(cyclohexane/water) values for a set of histamine H2 antagonists.56 This provided a rationale to improve

blood-brain penetration of new designs by reduction of the overall hydrogen bonding propensity.

The earliest correlative logBB study that involved theoretical descriptors was that of Kansy and van

de Waterbeemd, who reported a two-descriptor MLR model using polar surface area (PSA) and

molecular volume from a small set of compounds.38 Although their model seemed to work well for

the 20 compounds within the training set, it was evident that predictions for other compounds were

rather unreliable, presumably due to erroneous extrapolation.57 To overcome this problem, Abraham

and coworkers examined a larger data set of 65 compounds and formulated QSAR models based on

excess molar refraction, molecular volume, polarizability, and hydrogen-bonding parameters, as

well as the experimental logP value.39 Later, Lombardo et al performed semi-empirical calculations

on a subset of 57 compounds selected from the Abraham training set, and derived a free energy

solvation parameter that correlated well with the logBB values.58 Norinder and coworkers

developed PLS models of logBB using a set of MolSurf parameters, which provide information on

physicochemical properties including lipophilicity, polarity, polarizability, and hydrogen bonding.40

More recently, Luco applied the PLS method using topological and constitutive (e.g., element

counts, sum of nitrogen atoms, and indicator variables of individual atoms or molecular fragments)

descriptors to correlate logBB.59 In the past two years, several research groups revisited the use of

PSA and logP in attempts to create models that are easy to interpret and also generally applicable.

These include Clark's MLR models,42 which are two-descriptor models based on PSA and logP


values computed using different methods; the Österberg PLS model that considered logP and a

simple count of hydrogen bond donor and acceptor atoms;60 and the Feher MLR model,10 which

utilized logP, polar surface area, and the number of solvent accessible hydrogen bond acceptors.

Most recently, Keserü and Molnár reported a significant correlation between logBB and solvation

free energy derived from generalized Born/surface area (GB/SA) continuum calculations. This established an efficient means to predict CNS penetration in terms of thermodynamic properties,

whose utility had been limited previously due to high computational cost.61 The statistical

parameters reported by the various studies discussed above are shown in Table 4. The following

general comments can be made on these studies:

Table 4. Summary of representative linear logBB models that have appeared in the literature

LogBB model     Ref.   N    r2     s      RMSE   Model: descriptors (a)
Kansy           38     20   0.70   0.45   -      MLR: PSA, Mol_vol
Abraham I       39     57   0.91   0.20   -      MLR: R2, π2H, Σα2H, Σβ2H, Vx
Abraham II      39     49   0.90   0.20   -      MLR: logPoct, Σα2H, Σβ2H
Lombardo        58     55   0.67   0.41   -      LR: ΔGw0
Norinder I      40     28   0.86   0.31   -      PLS: MolSurf parameters
Norinder II     40     56   0.78   0.31   -      PLS: MolSurf parameters
Luco            59     58   0.85   0.32   -      PLS: topological, constitutional
Kelder          62     45   0.84   -      -      LR: dPSA
Clark I         42     55   0.79   0.35   -      MLR: PSA, ClogP
Clark II        42     55   0.77   0.37   -      MLR: PSA, MlogP
Österberg I     60     69   0.76   -      0.38   PLS: #HBAo, #HBAn, #HBD, logP
Österberg II    60     45   0.72   -      0.49   PLS: #HBAo, #HBAn, #HBD, logP
Feher           10     61   0.73   -      0.42   MLR: n(acc,solv), logP, Apol
Keserü          61     55   0.72   0.37   -      LR: ΔGsolv

(a) Molecular descriptors: polar surface area (PSA, Apol), dynamic polar surface area (dPSA), excess molar refraction (R2), dipolarity/polarisability (π2H), hydrogen-bond donor acidity (Σα2H), hydrogen-bond acceptor basicity (Σβ2H), characteristic volume of McGowan (Vx), experimental logP (logPoct), free energy of solvation in water (ΔGw0, ΔGsolv), calculated logP (ClogP, MlogP, logP), no. of hydrogen-bond-accepting oxygen and nitrogen atoms (#HBAo, #HBAn), no. of hydrogen bond donors (#HBD), and no. of hydrogen bond acceptors in aqueous medium (n(acc,solv)).


1. most models were developed from an analysis of a core set of 50 structures introduced by

Young et al and Abraham et al;56,39


2. the various linear models (either MLR or PLS) report r2 values in the range of 0.7 to 0.9, and

standard errors of 0.3 to 0.4 log units. The accuracy of the models is acceptable given that

most data sets have logBB values that span over 3 log units;

3. the descriptors used can be categorized into the following classes: hydrophilic (PSA and its

variants, hydrogen bond propensities), hydrophobic (either calculated or measured logP values), solvation free energy (which arguably characterizes both the hydrophilic and

hydrophobic properties of the molecule), or topological indices (which encode, perhaps

indirectly, the above physicochemical properties);

4. with the exception of the study of Keserü and Molnár, few models have been validated

extensively on a sufficiently large test set, probably due to scarcity of reliable data.
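The r2 and standard error (s) figures quoted for the two-descriptor MLR models can be made concrete with a small sketch. It fits logBB against PSA and logP by ordinary least squares using only the Python standard library. The data points are invented for illustration and are not taken from any of the cited studies, and the function names are hypothetical.

```python
def solve(a, b):
    """Solve a small linear system a.x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_mlr(rows, y):
    """Ordinary least squares via the normal equations; rows hold descriptor values."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(X[0])
    xtx = [[sum(xi[j] * xi[k] for xi in X) for k in range(p)] for j in range(p)]
    xty = [sum(xi[j] * yi for xi, yi in zip(X, y)) for j in range(p)]
    return solve(xtx, xty)

def fit_stats(rows, y, coef):
    """Return (r2, s): squared correlation of the fit and its standard error."""
    pred = [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in rows]
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    dof = len(y) - len(coef)  # N minus the number of fitted parameters
    return 1.0 - ss_res / ss_tot, (ss_res / dof) ** 0.5

# Made-up (PSA, logP) pairs and logBB values, for illustration only.
descriptors = [(20.0, 3.0), (45.0, 2.5), (60.0, 1.0),
               (90.0, 0.5), (110.0, 2.0), (35.0, 4.0)]
logbb = [0.6, 0.1, -0.4, -1.0, -1.1, 0.5]
coef = fit_mlr(descriptors, logbb)
r2, s = fit_stats(descriptors, logbb, coef)
print([round(c, 4) for c in coef], round(r2, 2), round(s, 2))
```

Both statistics describe the fit to the training compounds only. With training sets of a few dozen structures, as in most of the studies listed, they say little about how the model behaves on new chemical classes, which is the concern raised in the fourth comment.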

The results from the linear models indicate that it is feasible to estimate candidate blood-brain penetration using computed physicochemical parameters from the molecular structure of a drug.

The major drawback of the above models is that they were developed using limited data and

therefore their general applicability may be questionable. A logical next step in model development

was to increase the diversity of the training set, with the advantage that a larger data set

could also safeguard to some degree against model overfitting. This was the strategy followed by

Ajay and coworkers at Vertex,41 who developed a Bayesian neural network (BNN) to predict drug

BBB penetration using the knowledge acquired from a large (65,000) number of supposedly CNS-

active and -inactive molecules. To construct this data set, they selected compounds from the CMC

and the MDDR databases, based on therapeutic indication. In their initial classification, compounds

that were within the following activity classes were defined as CNS active: anxiolytic,

antipsychotic, neuronal injury inhibitor, neuroleptic, neurotropic, antidepressant, non-opioid

analgesic, anticonvulsant, antimigraine, cerebral antiischemic, opioid analgesic, antiparkinsonian,

sedative, hypnotic, central stimulant, antagonist to narcotics, centrally acting agent, nootropic agent,

neurologic agent and epileptic. Other compounds that did not fall into the above categories were

considered to be CNS inactive, an assumption that was later shown to be invalid. Based on this

classification scheme, there were over 15,000 CNS active molecules and over 50,000 inactive ones.

To minimize the risk of chance correlation, they elected to start with only a few molecular

descriptors. The seven one-dimensional descriptors adopted in their earlier drug-likeness prediction

system6 were also used in this work. These were molecular weight (MW), number of hydrogen

bond donors (Don), number of hydrogen bond acceptors (Acc), number of rotatable bonds (Rot), 2κα

(which indicates the degree of branching of a molecule), aromatic density (AR), and MlogP. The

authors believed that this set of descriptors was related to the physical attributes that correlate with

BBB penetration, thereby allowing the neural network to discriminate between CNS active and


inactive compounds. Using a BNN with just the seven physicochemical descriptors, they achieved a

prediction accuracy of 75% on active compounds and 65% on inactive ones. Further, they analyzed

the false-positive entities among the supposedly inactive CMC compounds and discovered that a

significant portion of the false positives actually had no information in the activity class (i.e., their

inactivity labeling might be somewhat dubious). Interestingly, for the remaining false positives, the Vertex team discovered that most belonged to the following

categories: tranquilizer, antivertigo, anorexic, narcotic antagonist, serotonin antagonist, anti-anxiety,

sleep enhancer, sigma opioid antagonist, antiemetic, antinauseant, antispasmodic, and

anticholinergic. Thus, it is evident that there were significant omissions of therapeutic indication in

the initial CNS activity definition; and furthermore, their BNN made sound generalizations that led

to the correct identification of other known CNS agents. Additional validation of their method on a

database of 275 compounds revealed that prediction accuracies of 93% and 72% were achieved for the CNS active and inactive compounds, respectively. The BNN method also ranked the relative

importance of the seven descriptors in their CNS model, namely:

Acc > AR ≈ Don ≈ 2κα > MW ≈ MlogP > Rot

Ajay and coworkers concluded that CNS activity was negatively correlated with MW, 2κα, Rot, and

Acc, and positively correlated with AR, Don, and MlogP, a result that was consistent with known

attributes of CNS drugs. They found that the addition of 166 2D ISIS keys to the seven 1D

descriptors yielded significant improvement, which confirmed their earlier drug-likeness prediction

result.6 Using the combined 1D and 2D descriptors, the BNN yielded a prediction accuracy of 81%

on the active compounds and 78% on the inactive.
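Reporting accuracy separately for actives and inactives matters because the two classes are unbalanced: with over 15,000 actives and over 50,000 inactives, a model that called every compound inactive would still be right for roughly 77% of the set. The sketch below shows how the two per-class figures, and the list of false positives that the authors went on to inspect, could be computed. The function names, labels, and predictions are hypothetical and are not taken from the study.

```python
def per_class_accuracy(labels, predictions):
    """Return (accuracy on actives, accuracy on inactives).

    labels and predictions are sequences of booleans; True means CNS active.
    """
    active_hits = sum(1 for l, p in zip(labels, predictions) if l and p)
    inactive_hits = sum(1 for l, p in zip(labels, predictions) if not l and not p)
    n_active = sum(1 for l in labels if l)
    n_inactive = len(labels) - n_active
    return active_hits / n_active, inactive_hits / n_inactive

def false_positives(names, labels, predictions):
    """Compounds labeled inactive but predicted active (candidates for relabeling)."""
    return [n for n, l, p in zip(names, labels, predictions) if p and not l]

# Hypothetical toy data: four labeled actives followed by four labeled inactives.
names = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"]
labels = [True, True, True, True, False, False, False, False]
predictions = [True, True, True, False, False, False, True, True]

print(per_class_accuracy(labels, predictions))      # (0.75, 0.5)
print(false_positives(names, labels, predictions))  # ['c7', 'c8']
```

In the study described above, inspecting the false positives is what exposed the gaps in the original activity-class definition, so a list of this kind is worth keeping alongside the summary percentages.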

The utility of this BNN as a filter to design a virtual library against CNS targets was subsequently

demonstrated. As for any filter designed to handle large compound collections, the principal

consideration was the throughput of the calculation. With their in-house implementation, they

achieved a throughput of almost 1 million compounds on a single processor (195 MHz R10000) per

day. The CNS activity filter was tested on a large virtual library, consisting of about 1 million

molecules constructed with 100 drug-like scaffolds63 combined with the 300 most common side chains.

Two types of filters were applied to prune this library. The first was substructure-based, to exclude

compounds containing reactive functional groups; the second was property-based, to discard

molecules with undesirable physicochemical properties, including high MW, high MlogP, and in

the example case, low predicted CNS activity. From the remaining compounds, they identified

several classes of molecules that have favorable BBB penetration properties and are also

particularly amenable to combinatorial chemical library synthesis. As a result, such libraries are

considered as privileged compound classes to address CNS targets.
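The two-stage pruning described above can be sketched as a simple filter cascade. The source does not give the cutoff values or the reactive-group definitions, so the thresholds, the `Compound` record, and the precomputed `reactive` flag and `cns_score` below are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Compound:
    name: str
    mw: float         # molecular weight
    mlogp: float      # calculated logP
    cns_score: float  # predicted probability of CNS activity (hypothetical)
    reactive: bool    # True if a reactive functional group was flagged

# Hypothetical cutoffs; the source does not state the values that were used.
MAX_MW = 500.0
MAX_MLOGP = 5.0
MIN_CNS_SCORE = 0.5

def passes_substructure_filter(c):
    """Stage 1: exclude compounds containing reactive functional groups."""
    return not c.reactive

def passes_property_filter(c):
    """Stage 2: discard high MW, high MlogP, or low predicted CNS activity."""
    return (c.mw <= MAX_MW and c.mlogp <= MAX_MLOGP
            and c.cns_score >= MIN_CNS_SCORE)

def prune(library):
    """Apply the substructure filter first, then the property filter."""
    stage1 = [c for c in library if passes_substructure_filter(c)]
    return [c for c in stage1 if passes_property_filter(c)]

# Hypothetical four-compound library.
library = [
    Compound("a", 300.0, 2.0, 0.8, False),  # passes both stages
    Compound("b", 300.0, 2.0, 0.8, True),   # reactive group
    Compound("c", 650.0, 2.0, 0.8, False),  # too heavy
    Compound("d", 300.0, 2.0, 0.2, False),  # low predicted CNS activity
]
print([c.name for c in prune(library)])  # ['a']
```

At the reported throughput of almost 1 million compounds per day on one processor, the scoring step has a budget of roughly 86 ms per compound, which is why inexpensive descriptors suit this kind of filter.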


Toxicity

No substance is free of possible harmful effects. Of the tens of thousands of current commercial

chemical products, only perhaps hundreds have been extensively characterized and evaluated for

their safety and potential toxicity.64,65 There is strong evidence implicating pesticides and industrial

byproducts in links to numerous health problems, including birth defects, cancer, digestive

disorders, mutagenicity, tumorig
