
Drug Like Properties



    Recognizing molecules with drug-like properties

    W Patrick Walters, Ajay and Mark A Murcko

    A variety of successful approaches to the problem of recognizing drug-like molecules have been employed. These range from simple counting schemes such as the Lipinski rule of five, to the analysis of the multidimensional chemistry space occupied by drugs, to neural network learning systems. With this variety of tools, it now appears possible to design libraries that are enriched in compounds which have desirable or drug-like properties. Verifying the robustness of these methods, and extending them, will form the basis of research in this field during the next few years.

    Address
    Vertex Pharmaceuticals, 130 Waverly Street, Cambridge, MA 02139, USA

    Current Opinion in Chemical Biology 1999, 3:384-387
    http://biomednet.com/elecref/1367593100300384
    © Elsevier Science Ltd ISSN 1367-5931

    Abbreviations
    ACD   Available Chemical Directory
    CMC   Comprehensive Medicinal Chemistry
    MDDR  MACCS-II Drug Report
    WDI   World Drug Index

    Introduction
    With the advent of high-throughput chemistry and enzymology, some researchers in the early 1990s took the position that simply throwing more compounds at a drug discovery problem would increase the odds of success. Drug companies now routinely assay several hundred thousand compounds against each new drug target, and the size of the typical screening library is soon expected to approach a million compounds. Likewise, the number of compounds that can be synthesized in one year by a dedicated combinatorial chemist can now routinely be in the range of 10,000-100,000 or more [1-3].

    Anecdotal evidence from a variety of research labs suggests, however, that raw speed and sheer numbers are not sufficient to crack the problem of drug discovery. The utility of first-generation combinatorial libraries has generally been considered to be quite low because these libraries tend to be populated with large, lipophilic, highly flexible molecules (MA Gallop, The Second Lake Tahoe Symposium on Molecular Diversity, Tahoe City, CA, January 1998; CB Cooper, National Managed Health Care Conference, Boston, MA, May 1997). Support for this thesis comes from Lipinski et al. [4], who analyzed the compounds synthesized at Pfizer between 1984 and 1994 and showed that the number of compounds with a relative molecular weight greater than 500 doubled over the 10 year period. We should also remember that the number of high-quality lead molecules to be derived from high-throughput screening (HTS) is typically quite low, perhaps in the range of 1 per 100,000 compounds screened for easier targets such as enzymes, and much worse for harder targets such as protein-protein interactions [5].

    As a consequence, many researchers have begun to pay closer attention to the nature of the compounds synthesized and screened. This process is sometimes referred to as recognizing drug-like molecules. In this brief review, we will point out some recent publications in this field, and suggest some future directions that this field may take.

    Simple counting methods to predict drug-likeness
    Many researchers over the years have attempted to show that drug-like molecules tend to have certain properties. For example, logP (where P is the partition coefficient), molecular weight, and the number of hydrogen bonding groups have been correlated with oral bioavailability [6,7]. In principle, then, one should be able to improve the odds of success very simply by biasing a combinatorial library towards compounds that have certain properties.

    Recently, researchers at Pfizer [4] have extended this idea with the establishment of the rule of five to provide a heuristic guide for determining if a compound will be orally bioavailable. The rules were derived from analysis of 2,245 compounds from the World Drug Index (WDI; Derwent Information, London, UK) which have a USAN (United States adopted name) or INN (international non-proprietary name) and an entry in the indications and usage field of the database. The assumption is that compounds meeting these criteria have entered human clinical trials, and therefore must possess many of the desirable characteristics of drugs. It was found that in a high percentage of compounds, the following rules were true: hydrogen bond donors ≤ 5; hydrogen bond acceptors ≤ 10; relative molecular weight ≤ 500; and logP ≤ 5. The majority of the violations came from antibiotics, antifungals, vitamins and cardiac glycosides.
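The four thresholds listed above can be checked mechanically once the descriptors are known. The following is a minimal sketch in Python, not the Pfizer implementation. It assumes that the donor and acceptor counts, relative molecular weight and logP have already been computed by some other tool. The names `Descriptors`, `rule_of_five_violations` and `passes_rule_of_five`, and the `max_violations` parameter, are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties of one molecule (supplied by the caller)."""
    h_bond_donors: int
    h_bond_acceptors: int
    mol_weight: float
    log_p: float


def rule_of_five_violations(d: Descriptors) -> List[str]:
    """Return the names of the rule-of-five criteria that the molecule breaks."""
    violations = []
    if d.h_bond_donors > 5:
        violations.append("h_bond_donors")
    if d.h_bond_acceptors > 10:
        violations.append("h_bond_acceptors")
    if d.mol_weight > 500:
        violations.append("mol_weight")
    if d.log_p > 5:
        violations.append("log_p")
    return violations


def passes_rule_of_five(d: Descriptors, max_violations: int = 0) -> bool:
    """Accept a molecule that breaks at most `max_violations` criteria.

    The default of 0 is an assumption made for this sketch; a library
    designer may prefer to tolerate one violation.
    """
    return len(rule_of_five_violations(d)) <= max_violations


# Illustrative, made-up descriptor values (not taken from any database).
small = Descriptors(h_bond_donors=2, h_bond_acceptors=4, mol_weight=320.4, log_p=2.1)
large = Descriptors(h_bond_donors=7, h_bond_acceptors=12, mol_weight=780.9, log_p=6.3)
print(passes_rule_of_five(small), rule_of_five_violations(large))
```

Returning the list of violated criteria, rather than a bare yes or no, makes it easy to see which property disqualifies a candidate and to relax the filter for classes known to be exceptions.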
    The authors suggest that these exception classes are orally bioavailable, despite their violations of the rule of five, due to the presence of functional groups that act as substrates for transporters.

    The application of simple counting schemes to combinatorial library design is obvious. For example, Fecik et al. [8] performed an analysis of a large number of combinatorial libraries in terms of the weight of the scaffold and average weight of substituents which are necessary to arrive at products with relative molecular weights of 500.

    Functional group filters
    A different approach is to identify functional groups that tend to be undesirable because of chemical reactivity, metabolic lability, and so forth. Rishton [9] discusses chemistry guidelines for the elimination of compounds such as alkylating or acylating agents, which tend to appear as false positives in biochemical screens. Specifically, a set of approximately 25 functional groups are described that are prone to solvolysis or hydrolysis or which tend to react with biological nucleophiles.

    Walters et al. [10] briefly described an approach (REOS [rapid elimination of swill]) to eliminate undesirable reagents and products from screening and combinatorial libraries. REOS is a hybrid method that combines some simple counting schemes similar to those in the rule of five with a set of functional group filters to remove reactive and otherwise undesirable moieties. The authors claim that for large (10^6-10^9) libraries, it is typically possible to remove ≥ 99.9% of the compounds at a rate of approximately 10^5 compounds per hour per processor.

    Prediction of oral bioavailability
    Oral bioavailability of a drug can be defined as the fraction of the oral dose that reaches systemic circulation. Reaching systemic circulation is influenced by both absorption and first-pass metabolism in the liver or gut wall. It is also possible for drugs to be highly bound to plasma proteins, thus resulting in low circulating levels. Lipophilicity and solubility are two important determinants of the extent and rate of absorption of molecules [11,12]. Lipophilicity influences both metabolic activity [13] and plasma protein binding [14]. Interestingly, the effects of lipophilicity on membrane penetration and on first-pass metabolism appear to act in opposite directions on oral bioavailability. It is important to note that correlation with lipophilicity does not imply predictivity.

    Regression-type models have been used in attempts to model or predict oral bioavailability and in vivo (in situ perfusion) and in vitro (Caco-2 cells) permeability. These approaches use either theoretically calculated or experimentally obtained descriptors relating to logP, pKa, electrostatic interactions, polar surface area, ΔlogP (i.e. the difference in the partition coefficient between a polar solvent such as diethyl ether and a nonpolar solvent such as isooctane), and so on. Recent methods introduced by Sugawara et al. [15] and Winiwarter et al. [16] provide excellent examples of the types of models that can be built. Other approaches along similar lines have also appeared [17,18]. A major unsolved problem with regression approaches is that it is not evident whether or not a prediction is applicable to a new series of compounds.

    An entirely different approach to bioavailability prediction has been taken by Amidon and co-workers [19]. This is a dynamic and phenomenological method in which time is accounted for explicitly in the mathematical formulation. The authors found that a seven-compartmental small intestine model worked well in characterizing the compounds they studied. Explicit knowledge of the effective permeability (a measure of in situ absorption) of the drug is required, however. This is not a high-throughput method.
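To make the idea of a time-explicit compartmental model concrete, the sketch below simulates a chain of seven intestinal compartments in Python. It is an illustration under stated assumptions, not the formulation published in [19]: transit between compartments and absorption from each compartment are both taken to be first-order, the rate constants `ka` (absorption) and `kt` (transit) are hypothetical inputs in units of 1/h, and the link between `ka` and the effective permeability is left out. Under those assumptions the fraction of dose absorbed has a closed form, 1 - (1 + ka/kt)^(-n), which the simulation can be compared against.

```python
def fraction_absorbed_closed_form(ka: float, kt: float, n: int = 7) -> float:
    """Fraction absorbed for a linear chain of n compartments.

    Each compartment passes on the fraction kt / (kt + ka) of what it
    receives, so (kt / (kt + ka)) ** n of the dose leaves unabsorbed.
    """
    return 1.0 - (1.0 + ka / kt) ** (-n)


def simulate_transit_chain(ka: float, kt: float, n: int = 7,
                           dt: float = 0.001, t_end: float = 50.0) -> float:
    """Explicit Euler integration of

        dY_i/dt = kt * Y_(i-1) - (kt + ka) * Y_i,   i = 1..n,

    with the whole dose (normalised to 1) starting in compartment 1.
    Returns the cumulative fraction absorbed at time t_end.
    """
    amounts = [0.0] * n
    amounts[0] = 1.0
    absorbed = 0.0
    for _ in range(int(t_end / dt)):
        inflow = 0.0          # nothing enters the first compartment
        updated = []
        for y in amounts:
            absorbed += ka * y * dt
            updated.append(y + (inflow - (kt + ka) * y) * dt)
            inflow = kt * y   # transit out of this compartment feeds the next
        amounts = updated
    return absorbed


# Hypothetical rate constants, chosen only to exercise the code.
ka, kt = 1.0, 2.0
print(simulate_transit_chain(ka, kt), fraction_absorbed_closed_form(ka, kt))
```

The simulation tracks how much drug sits in each compartment at every time step, which is what distinguishes this style of model from the static regression approaches. The price is that the rate constants must come from experiment.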

    Chemistry space methods
    Several research groups have attempted to define the chemistry space [20,21] that is occupied by drug-like molecules. The basic idea is that drugs will tend to possess distinct values for certain properties and, as a result, when analyzed in high-dimensional space, drugs will be shown to be distinct from nondrugs. A chemistry space is typically defined by calculating a number of descriptors for each molecule and using the descriptor values as coordinates in multidimensional space. As an example, let us assume that we have calculated molecular weight, logP and the number of hydrogen bond donors for a set of molecules. These three descriptor values can then be used to define a point in a three-dimensional space that represents each molecule. In practice, large numbers (20-100) of descriptors are calculated and statistical techniques such as principal components or factor analysis [22] are used to reduce the dimensionality of the descriptor space.

    Cummins et al. [23] compared five databases: Comprehensive Medicinal Chemistry (CMC; Molecular Design Ltd, San Leandro, CA), MACCS-II Drug Report (MDDR; Molecular Design Ltd), Available Chemical Directory (ACD; Molecular Design Ltd), SPECS/BioSPECS (Specs and BioSPECS, Rijswijk, The Netherlands), and their in-house Wellcome registry. They calculated 28 topological indices, as well as an estimate of the free energy of solvation, for 300,000 compounds. Factor analysis was used to reduce the descriptor space to four dimensions. The descriptor space was then partitioned and the occupancy of the resulting sub-hypercubes was examined. The percentages of the total volume occupied by the databases were 27% (CMC), 72% (Wellcome registry), 69% (MDDR), 46% (SPECS) and 72% (ACD). The authors also found a 92% overlap between CMC and ACD. Thus, although the method may be used to identify interesting regions of space, it may not by itself be an effective discriminator between drugs and nondrugs.

    Gillet et al. [24] used profiles of calculated properties (numbers of hydrogen bond donors and acceptors, molecular weight, rotatable bonds, aromatic rings, and a shape descriptor) to differentiate between a set of drugs represented by 14,861 compounds from the WDI and a set of nondrugs represented by 16,807 compounds from the SPRESI database (Daylight Chemical Information Systems, Mission Viejo, CA). A genetic algorithm was used to derive a set of optimal weights for the properties. The best weighting schemes were able to provide a five- to sixfold enhancement over random selection. The authors were also able to achieve similar results using property profiles to identify drugs belonging to a specific therapeutic class from a larger drug database.

    A Chiron group [25] established a chemistry space using logP, principal components analysis of 81 topological indices [26], chemical functionality descriptors derived from multidimensional scaling [27] of Tanimoto similarities [28] and atom layer tables [29]. Substituents were selected using D-optimal design [30]. A list of criteria used to eliminate unacceptable candidate amines was also included.
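The partition-and-count analysis described for Cummins et al. [23] can be sketched in a few lines of Python. This is an illustration only and makes several assumptions that the review does not specify: each molecule is already a point in a reduced descriptor space, every axis is cut into the same number of equal-width bins between fixed bounds, and overlap is measured as the share of one set's occupied cells that the other set also occupies. The function names and the toy coordinates are hypothetical.

```python
from typing import Iterable, Sequence, Set, Tuple

Cell = Tuple[int, ...]


def occupied_cells(points: Iterable[Sequence[float]],
                   lower: Sequence[float],
                   upper: Sequence[float],
                   bins_per_axis: int) -> Set[Cell]:
    """Map each point to the index of the sub-hypercube that contains it."""
    cells = set()
    for point in points:
        cell = []
        for x, lo, hi in zip(point, lower, upper):
            frac = (x - lo) / (hi - lo)
            # Clamp so that points on or beyond the bounds fall in an edge bin.
            idx = min(bins_per_axis - 1, max(0, int(frac * bins_per_axis)))
            cell.append(idx)
        cells.add(tuple(cell))
    return cells


def occupancy_fraction(cells: Set[Cell], n_dims: int, bins_per_axis: int) -> float:
    """Share of all sub-hypercubes that contain at least one molecule."""
    return len(cells) / bins_per_axis ** n_dims


def overlap_fraction(a: Set[Cell], b: Set[Cell]) -> float:
    """Share of the cells occupied by `a` that are also occupied by `b`."""
    return len(a & b) / len(a) if a else 0.0


# Toy two-dimensional example with made-up coordinates.
lower, upper, bins = (0.0, 0.0), (1.0, 1.0), 2
drugs = occupied_cells([(0.1, 0.1), (0.9, 0.9)], lower, upper, bins)
reagents = occupied_cells([(0.2, 0.2), (0.1, 0.9), (0.9, 0.1), (0.8, 0.8)],
                          lower, upper, bins)
print(occupancy_fraction(drugs, 2, bins), overlap_fraction(drugs, reagents))
```

The toy case shows why high overlap limits discrimination: when a reagent collection fills every cell that the drug set occupies, cell membership alone cannot separate the two.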

    Examination of building blocks in known drugs
    A very different approach is to analyze the building blocks commonly found in drugs to see whether nonrandom patterns can be unearthed. This work does not directly confront the problem of distinguishing drugs from nondrugs, but it helps to define what drugs are and thereby helps chemists to think about preferred moieties for library design.

    Bemis and Murcko [31] examined 5,120 compounds from the CMC database and found 1,179 frameworks, or scaffolds. This suggests that drugs are rather diverse. When considering just topology, however, only 32 frameworks described the shapes of half the drugs in the set. Even when atom types and hybridization are considered, 25% of all drugs are found to utilize only 42 frameworks. These surprising results suggest that a small number of common shape themes can be re-used in widely divergent drug design situations.

    Ghose et al. [32] characterized the CMC database based on computed physicochemical property profiles (logP, molar refractivity, molecular weight, and number of atoms). They established qualifying ranges, which cover more than 80% of the compounds. They also examined commonly occurring functional groups. Not surprisingly, benzene was the most common, with a frequency approximately equal to that of all aromatic heterocycles combined. Nonaromatic heterocycles were more common than aromatic by approximately twofold. Tertiary amines, alcohols and carboxamides were the most frequently occurring functional groups.

    Neural network methods
    Neural networks [33] have long been used in classification schemes, but less frequently in pharmaceutical applications; however, two papers appeared in 1998 that described the successful employment of different neural network approaches to distinguish drugs from nondrugs.

    Ajay et al. [34] used a Bayesian neural network. The network was trained using a random partition of 3,500 compounds each from the CMC and ACD databases. Two kinds of descriptors were used: a set of seven one-dimensional and 166 two-dimensional descriptors. The program was able to correctly classify 90% of the CMC compounds and misclassified only 10% of the ACD molecules. The generalizability of the method was demonstrated by the program's ability to correctly classify 80% of the compounds from the MDDR.

    Appearing back-to-back with Ajay et al. [34] was a contribution from Sadowski and Kubinyi [35]. Those researchers developed a feed-forward neural network method for discriminating drugs from nondrugs. They used 38,416 molecules from the WDI database as the drug set and 169,331 molecules from the ACD as the nondrug set. The program was able to correctly classify 83% of the ACD compounds and 77% of the WDI compounds.

    Conclusions and future directions
    As we have shown, a wide variety of methods have already been applied to the problem of identifying molecules with desirable or drug-like properties. These methods appear to be meeting with some success. A key issue is whether general (i.e. global) rules can be formulated, or whether rules will always need to be local and situation-specific. The publications by Ajay et al. [34] and Sadowski and Kubinyi [35] suggest that general rules with reasonable predictive power can be formulated.

    Another trend we may witness in coming years is attempts to predict the various properties that contribute to a drug's success, rather than the more complex problem of drug-likeness itself. These might include oral absorption, blood-brain barrier penetration, toxicity, metabolism, aqueous solubility, logP, pKa, half-life, and plasma protein binding. Some of these properties are themselves rather complex and are likely to be extremely difficult to model, but in our view it should be possible for the majority of properties to be predicted with better-than-random accuracy.

    Future work is likely to include additional approaches and more robust attempts at validation of these methods. Also, one hopes that the judicious use of these predictions may lead to increased efficiency in the selection of combinatorial and HTS libraries. We are probably still several years away from a definitive experiment proving this point, however. Further off, in all likelihood, will be the ability to predict downstream issues pertaining to formulation, manufacturing, shelf-life, chemical stability, and so forth. These too are critical for the success of a drug [36].

    References and recommended reading
    Papers of particular interest, published within the annual period of review, have been highlighted as:

    •  of special interest
    •• of outstanding interest

    1. Gordon EM: Libraries of non-polymeric organic molecules. Curr Opin Biotechnol 1995, 6:624-631.

    2. Dolle RE: Discovery of enzyme inhibitors through combinatorial chemistry. Mol Divers 1997, 2:223-226.

    3. Brown D: Future pathways for combinatorial chemistry. Mol Divers 1997, 2:217-222.

    4. Lipinski CA, Lombardo F, Dominy BW, Feeney PJ: Experimental and computational approaches to estimate solubility and permeability in drug discovery. Adv Drug Deliv Rev 1997, 23:3-25.

    5. Spencer RW: High-throughput screening of historic collections: observations on file size, biological targets, and file diversity. Biotechnol Bioeng 1996, 61:61-67.
    This work provides an analysis of more than 150 high-throughput screens that were carried out at Pfizer Central Research. The authors compared hit rates for enzyme, cytokine and receptor targets. They evaluated the impact of clustering and diversity analysis on a screen for substance P antagonists.


    6. Navia MA, Chaturvedi PR: Design principles for orally bioavailable drugs. Drug Discov Today 1996, 1:179-189.

    7. Chan OH, Stewart BH: Physicochemical and drug-delivery considerations for oral drug bioavailability. Drug Discov Today 1996, 1:461-473.

    8. Fecik RA, Frank KE, Gentry EJ, Menon SR, Mitscher LA, Telikepalli H: The search for orally active medications through combinatorial chemistry. Med Res Rev 1998, 18:149-185.

    9. Rishton GM: Reactive compounds and in vitro false positives in HTS. Drug Discov Today 1997, 2:382-385.

    10. Walters WP, Stahl MT, Murcko MA: Virtual screening - an overview. Drug Discov Today 1998, 3:160-178.

    11. Schanker LS: On the mechanism of absorption from the gastrointestinal tract. J Med Pharm Chem 1960, 2:343-346.

    12. Leahy DE, Lynch J, Taylor CID: Mechanisms of absorption of small molecules. Edited by Prescott LF, Nimmo WS. New York: John Wiley & Sons; 1989.

    13. Seydel JK, Schaper KJ: Quantitative Structure-Pharmacokinetic Relationships in Drug Design. Edited by Rowland M, Tucker G. New York: Pergamon Press; 1986.

    14. Sawada GA, Barshun CL, Lutzke BS, Houghton ME, Padbury GW, Ho NFH, Raub TJ: Increased lipophilicity and subsequent cell partitioning decrease passive transcellular diffusion of novel highly lipophilic antioxidants. J Pharm Exptl Ther 1999, 288:1317-1326.

    15. • Sugawara M, Takekuma Y, Yamada H, Kobayashi M, Iseki K, Miyazaki K: A general approach for the prediction of the intestinal absorption of drugs: regression analysis using the physicochemical properties and drug-membrane electrostatic interactions. J Pharm Sci 1998, 87:960-966.

    Experimentally determined log D values in octanol, diethyl ether, chloroform and isooctane were used in different combinations to model the rat jejunal permeability of 32 drugs. Reasonable models could be developed for anionic, cationic and nonionized compounds. Predictions for an external set of 10 compounds (including some zwitterionic compounds) were also reasonable.

    16. • Winiwarter S, Bonham NM, Ax F, Hallberg A, Lennernas H, Karlen A: Correlation of human jejunal permeability (in vivo) of drugs with experimentally and theoretically derived parameters. A multivariate data analysis approach. J Med Chem 1998, 41:4939-4949.
    In vivo human jejunal permeability of 22 structurally diverse compounds was correlated with experimentally determined log D (log P) values and calculated structural parameters. The best model used log D, number of hydrogen bond donors (HBD) and polar surface area (PSA); however, models using calculated log P, HBD, and PSA and just HBD and PSA were close to the best. Reasonable predictivity was seen on an external validation set of 24 compounds where data on oral bioavailability was available. It is important to note that some of the actively transported molecules were under-predicted by the models.

    17. Stenberg P, Luthman K, Artursson P: Prediction of membrane permeability to peptides from calculated dynamic molecular surface properties. Pharm Res 1999, 16:205-212.

    18. Wessel MD, Jurs PC, Tolan JW, Muskal SM: Prediction of human intestinal absorption of drug compounds from molecular structure. J Chem Inform Comp Sci 1998, 38:726-735.

    19. Yu LX, Lipka E, Crison JR, Amidon GL: Transport approaches to the biopharmaceutical design of oral drug delivery systems: prediction of intestinal absorption. Adv Drug Deliv Rev 1996, 19:359-376.

    20. Pearlman RS, Smith KM: Metric validation and the receptor-relevant subspace concept. J Chem Inform Comp Sci 1999, 39:28-35.

    21. Pearlman RS, Smith KM: Novel software tools for chemical diversity. Persp Drug Design Discov 1998, 9:339-353.

    22. Cooley W, Lohnes P: Multivariate Data Analysis. New York: Wiley; 1971.

    23. Cummins DJ, Andrews CW, Bentley JA, Cory M: Molecular diversity in chemical databases: comparison of medicinal chemistry knowledge bases and databases of commercially available compounds. J Chem Inform Comp Sci 1996, 36:750-763.

    24. • Gillet VJ, Willett P, Bradshaw J: Identification of biological activity profiles using substructural analysis and genetic algorithms. J Chem Inform Comp Sci 1998, 38:165-179.
    The authors used profiles of calculated properties (numbers of hydrogen bond donors and acceptors, molecular weight, rotatable bonds, aromatic rings, and a 2κα shape descriptor) to differentiate between a set of drugs represented by 14,861 compounds from the World Drug Index and a set of nondrugs represented by 16,807 compounds from the SPRESI database. A genetic algorithm was used to derive a set of optimal weights for the properties. The best weighting schemes were able to provide a five- to sixfold enhancement over random selection. The authors were also able to achieve similar results using property profiles to identify drugs belonging to a specific therapeutic class from a larger drug database.

    25. • Martin EJ, Critchlow RE: Beyond mere diversity: tailoring combinatorial libraries for drug discovery. J Comb Chem 1999, 1:32-45.
    The authors present an overview of methods used at Chiron for combinatorial library design and analysis. The paper focuses on a number of techniques used to ensure that the molecules produced are diverse and possess desirable properties.

    26. Kier LB, Hall LH: Molecular Connectivity in Structure-Activity Analysis. New York: Wiley; 1986.

    27. Torgerson WS: Multi-dimensional scaling. 1. Theory and methods. Psychometrika 1952, 17:401-419.

    28. Willett P, Barnard JM, Downs GM: Chemical similarity searching. J Chem Inform Comp Sci 1998, 38:983-996.

    29. Martin EJ, Blaney JM, Siani MA: Measuring diversity: experimental design of combinatorial libraries for drug discovery. J Med Chem 1995, 38:1431-1436.

    30. Miller A, Nguyen N-K: A Fedorov exchange algorithm for D-optimal design. Appl Stat 1994, 43:669-678.

    31. Bemis GW, Murcko MA: The properties of known drugs. 1. Molecular frameworks. J Med Chem 1996, 39:2887-2893.

    32. • Ghose AK, Viswanadhan VN, Wendoloski JJ: A knowledge-based approach in designing combinatorial or medicinal chemistry libraries for drug discovery. 1. A qualitative characterization of known drug databases. J Comb Chem 1999, 1:55-67.

    The authors characterized the CMC database based on computed physicochemical property profiles (log P, molar refractivity, molecular weight, and number of atoms). They established qualifying ranges, which cover more than 80% of the compounds. They also examined commonly occurring functional groups. They found that benzene was most common; its frequency was approximately equal to that of all aromatic heterocycles combined. Nonaromatic heterocycles were more common than aromatic (approximately twofold). Tertiary amines, alcohols and carboxamides were the most frequently occurring functional groups.

    33. Hertz J, Krogh A, Palmer RG: Introduction to the Theory of Neural Computation. Redwood City, CA: Addison Wesley; 1991.

    34. • Ajay, Walters WP, Murcko MA: Can we learn to distinguish between drug-like and nondrug-like molecules? J Med Chem 1998, 41:3314-3324.
    The authors used a Bayesian neural network to distinguish between drugs and nondrugs. The network was trained using a random partition of 3,500 compounds each from CMC and ACD, with a set of seven 1D and 166 2D descriptors. The program was able to correctly classify 90% of the CMC compounds, and misclassified only 10% of the ACD molecules. The generalizability of the method was demonstrated by the program's ability to correctly classify 80% of the compounds from the MDDR.

    35. • Sadowski J, Kubinyi H: A scoring scheme for discriminating between drugs and nondrugs. J Med Chem 1998, 41:3325-3329.
    The authors developed a neural network method for discriminating drugs and nondrugs; they used 38,416 molecules from the WDI as the drug set and 169,331 molecules from the ACD as the nondrug set. A set of atom types originally developed for log P prediction was used as descriptors. A feed-forward neural network was trained to classify the compounds. The program was able to correctly classify 83% of the ACD compounds and 77% of the WDI compounds.

    36. Streng WH: Physical chemical characterization of drug substances. Drug Discov Today 1997, 2:415-426.

