Date posted: 28-Jan-2015
Category: Health & Medicine
Uploaded by: sean-ekins
Sean Ekins, M.Sc, Ph.D., D.Sc.
Collaborations in Chemistry, Fuquay-Varina, NC.
Collaborative Drug Discovery, Burlingame, CA.
School of Pharmacy, Department of Pharmaceutical Sciences, University of Maryland.
215-687-1320
Computational Models for Predicting Human Toxicities
Outline
• Key enablers
• What has been modeled – a quick review
• What will be modeled
• Future
Why Use Computational Models For Toxicology?
Goal of a model – Alert you to potential toxicity, enable you to focus efforts on best molecules – reduce risk
Selection of model – trade off between interpretability, insights for modifying molecules, speed of calculation and coverage of chemistry space – applicability domain
Models can be built with proprietary, open and commercial tools
software (descriptors + algorithms) + data = model/s
Human operator decides whether a model is acceptable
Key enablers: Hardware is getting smaller
[Figure: computers over time. Timeline labels: 1930s, 1980s, 1990s. Size labels: room size, desktop size, laptop, netbook, phone, watch. Not to scale and not equivalent computing power; illustrates mobility.]
Key Enablers: More data available and open tools
What has been modeled
• Physicochemical properties: logP, logD, solubility, boiling point, melting point
• QSAR for various proteins, complex properties
• Homology models, docking
• Expert systems
• Hybrid methods – combine different approaches
• Mutagenicity (Ames, micronucleus, clastogenicity, DNA damage, developmental tox ...)
• Environmental tox – aquatic, dermatotoxicology
• Mixtures
Physicochemical properties
• Solubility data: 1000s of data points in the literature
• Solubility models have a median error of ~0.5 log units, equal to experimental error
• LogP: tens of 1000s of data points available
• Fragmental or whole-molecule predictors
• Not all logP predictors are equal. Median error is ~0.3 log units, equal to experimental error
• People now accept solubility and logP predictions as if they were real measurements
ACD predictions + EpiSuite predictions in www.chemspider.com
• Mobile molecular data sheet
• Links to melting point predictor from open notebook science
• Required curation of data
Simple Rules
• Rule of 5
• Lipinski, Lombardo, Dominy, Feeney Adv. Drug Deliv. Rev. 23: 3-25 (1997).
• AlogP98 vs PSA
• Egan, Merz, Baldwin, J. Med. Chem. 43: 3867-3877 (2000)
• Greater than ten rotatable bonds correlates with decreased rat oral bioavailability
• Veber, Johnson, Cheng, Smith, Ward, Kopple, J. Med. Chem. 45: 2615-2623 (2002)
• Compounds with ClogP < 3 and total polar surface area > 75 Å² have fewer animal toxicity findings
• Hughes, et al., Bioorg. Med. Chem. Lett. 18: 4872-4875 (2008)
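The simple rules on this slide can be applied as plain threshold checks once descriptors have been calculated. The following is a minimal sketch, assuming the descriptor values come from whatever tool is at hand; the Descriptors class, the function names and the example values are hypothetical and are not from the cited papers. The Rule of 5 thresholds used are the standard ones (MW > 500, logP > 5, more than 5 H-bond donors, more than 10 H-bond acceptors).

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties for one molecule (hypothetical container)."""
    mol_weight: float     # molecular weight, Da
    clogp: float          # calculated logP
    h_donors: int         # hydrogen bond donors
    h_acceptors: int      # hydrogen bond acceptors
    rotatable_bonds: int  # number of rotatable bonds
    tpsa: float           # total polar surface area, square angstroms


def lipinski_violations(d: Descriptors) -> int:
    """Count Rule of 5 violations."""
    return sum([
        d.mol_weight > 500,
        d.clogp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def simple_rule_flags(d: Descriptors) -> dict:
    """Apply the three rules listed on the slide to one molecule."""
    return {
        "lipinski_violations": lipinski_violations(d),
        # More than ten rotatable bonds: associated with lower rat oral bioavailability.
        "many_rotatable_bonds": d.rotatable_bonds > 10,
        # ClogP < 3 and TPSA > 75: region with fewer animal toxicity findings.
        "low_tox_property_space": d.clogp < 3 and d.tpsa > 75,
    }


# Illustrative values only, not a real compound.
example = Descriptors(mol_weight=350.0, clogp=2.1, h_donors=2,
                      h_acceptors=5, rotatable_bonds=4, tpsa=90.0)
print(simple_rule_flags(example))
```

These flags are alerts rather than predictions: a compound can pass all three and still be toxic.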
MetaPrint2D in Bioclipse: free metabolism site predictor
Uses fingerprint descriptors and a metabolite database to learn the frequencies of metabolites in various substructures
L. Carlsson, et al., BMC Bioinformatics 2010, 11:362
QSAR for Various Proteins
• Enzymes – predominantly Cytochrome P450s - for drug-drug interactions
• Transporters – predominantly P-gp but some others, e.g. OATP, BCRP
• Receptors – PXR, CAR, for hepatotoxicity
• Ion Channels – predominantly hERG for cardiotoxicity
• Issues – initially small training sets – public data is a fraction of what drug companies have
Pharmacophores
• Ideal when we have few molecules for training
• In silico database searching
• Accelrys Catalyst in Discovery Studio
• Geometric arrangement of functional groups necessary for a biological response
• Workflow: generate 3D conformations, align molecules, select features contributing to activity, regress hypothesis, evaluate with new molecules
• Excluded volumes relate to inactive molecules
• Proteins listed on the slide: CYP2B6, CYP2C9, CYP2D6, CYP3A4, CYP3A5, CYP3A7, hERG, P-gp, OATPs, OCT1, OCT2, BCRP, hOCTN2, ASBT, hPEPT1, hPEPT2, FXR, LXR, CAR, PXR etc.
hOCTN2 – Organic Cation Transporter Pharmacophore
• High-affinity cation/carnitine transporter expressed in kidney, skeletal muscle, heart, placenta and small intestine
• Inhibition correlates with muscle weakness (rhabdomyolysis)
• A common-features pharmacophore was developed with 7 inhibitors
• Searched a database of over 600 FDA-approved drugs and selected drugs for in vitro testing
• Of 33 tested drugs predicted to map to the pharmacophore, 27 inhibited hOCTN2 in vitro
• Compounds were more likely to cause rhabdomyolysis if the Cmax/Ki ratio was higher than 0.0025
Diao, Ekins, and Polli, Pharm Res, 26, 1890, (2009)
hOCTN2 quantitative pharmacophore and Bayesian model
• r = 0.89
• Compounds labeled in the figure: vinblastine, cetirizine, emetine
• Bayesian model, leaving 50% out 97 times: external ROC 0.90, internal ROC 0.79, concordance 73.4%, specificity 88.2%, sensitivity 64.2%
• Lab test set (N = 27): the Bayesian model has better correct predictions (> 80%) and lower false positives and negatives than the pharmacophore (> 70%)
• Predictions for the literature test set (N = 32) were not as good as for the in-house set; the mean maximum Tanimoto similarity was ~0.6
• PCA used to assess training and test set overlap
Diao et al., Mol Pharm, 7: 2120-2131, 2010
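The mean maximum Tanimoto similarity quoted for the literature test set measures how close each test compound is to its nearest training compound. Below is a minimal sketch of that calculation, assuming each fingerprint is represented as a set of "on" bit indices; the function names and the toy fingerprints are hypothetical and this is not the code used in the cited study.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient: shared bits divided by total distinct bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def max_similarity_to_training(query: set, training: list) -> float:
    """Similarity of a query compound to its nearest training compound."""
    return max(tanimoto(query, t) for t in training)


def mean_max_similarity(test: list, training: list) -> float:
    """Average nearest-neighbour similarity of a test set to a training set."""
    return sum(max_similarity_to_training(q, training) for q in test) / len(test)


# Toy fingerprints (sets of bit indices), for illustration only.
training_set = [{2, 3, 4}, {7, 8, 9}]
test_set = [{1, 2, 3}, {7, 8}]
print(mean_max_similarity(test_set, training_set))
```

A low mean value suggests the test compounds sit outside the model's applicability domain, which is one reading of why the literature set was predicted less well.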
hOCTN2 association with rhabdomyolysis
• Among the 21 drugs associated with rhabdomyolysis or carnitine deficiency, 14 (66.7%) had a Cmax/Ki ratio higher than 0.0025.
• Among 25 drugs that were not associated with rhabdomyolysis or carnitine deficiency, only 9 (36.0%) showed a Cmax/Ki ratio higher than 0.0025.
• Rhabdomyolysis or carnitine deficiency was associated with a Cmax/Ki value above 0.0025 (Pearson’s chi-square test p = 0.0382).
• Limitation of Cmax/Ki as a predictor for rhabdomyolysis: Cmax/Ki does not consider the effects of drug tissue distribution or plasma protein binding.
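The association reported here can be checked from the counts on the slide (14 of 21 versus 9 of 25). The sketch below is a hand-rolled Pearson chi-square test for a 2x2 table without continuity correction, using only the standard library; the function names are hypothetical, and the assumption that no continuity correction was applied is mine.

```python
import math


def cmax_ki_flag(cmax: float, ki: float, threshold: float = 0.0025) -> bool:
    """True if Cmax/Ki exceeds the threshold (Cmax and Ki in the same units)."""
    return cmax / ki > threshold


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (1 degree of freedom, no continuity correction).

    Table layout:   above threshold   at or below threshold
      associated          a                   b
      not associated      c                   d
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts from the slide: 14 of 21 associated drugs and 9 of 25 other drugs
# had Cmax/Ki above 0.0025.
chi2, p = chi_square_2x2(14, 21 - 14, 9, 25 - 9)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

Under these assumptions the result agrees with the reported p = 0.0382 to within rounding.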
Drug-induced liver injury (DILI)
• Drug metabolism in the liver can convert some drugs into highly reactive intermediates
• These in turn can adversely affect the structure and functions of the liver
• DILI is the number one reason drugs are not approved, and also the reason some were withdrawn from the market after approval
• Estimated global annual incidence rate of DILI is 13.9-24.0 per 100,000 inhabitants
• DILI accounts for an estimated 3-9% of all adverse drug reactions reported to health authorities
• Herbal components can cause DILI too
https://dilin.dcri.duke.edu/for-researchers/info/
Drug-Induced Liver Injury Models
• A first study built classification models on 74 compounds (linear discriminant analysis, artificial neural networks, and machine learning algorithms (OneR)). Internal cross-validation gave accuracy 84%, sensitivity 78% and specificity 90%. Testing on 6 and 13 compounds, respectively, gave > 80% accuracy (Cruz-Monteagudo et al., J Comput Chem 29: 533-549, 2008).
• A second study used binary QSAR (248 active and 283 inactive) with support vector machine models and external 5-fold cross-validation procedures, and reported 78% accuracy for a set of 18 compounds (Fourches et al., Chem Res Toxicol 23: 171-183, 2010).
• A third study created a knowledge base with structural alerts from 1266 chemicals. The alerts were used to predict results for 626 Pfizer compounds (sensitivity 46%, specificity 73% and concordance 56% for the latest version) (Greene et al., Chem Res Toxicol 23: 1215-1222, 2010).
DILI Model - Bayesian
• Laplacian-corrected Bayesian classifier models were generated using Discovery Studio (version 2.5.5; Accelrys)
• Training set = 295 compounds, test set = 237 compounds
• Uses two-dimensional descriptors to distinguish between compounds that are DILI-positive and those that are DILI-negative:
– ALogP
– ECFC_6 (extended connectivity fingerprints)
– Apol
– logD
– molecular weight
– number of aromatic rings
– number of hydrogen bond acceptors
– number of hydrogen bond donors
– number of rings
– number of rotatable bonds
– molecular polar surface area
– molecular surface area
– Wiener and Zagreb indices
Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
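To make the idea of a Laplacian-corrected Bayesian classifier concrete, here is a minimal sketch over binary features such as fingerprint bits. It follows the commonly published description of the method as I understand it: each feature gets a weight log((A_F + 1) / (N_F * P + 1)), where N_F is the number of training compounds containing the feature, A_F the number of those that are active, and P the overall active fraction. It is not the Discovery Studio implementation, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def train_laplacian_bayes(samples):
    """Learn per-feature weights from (feature_set, is_active) pairs."""
    n_total = len(samples)
    n_active = sum(1 for _, is_active in samples if is_active)
    base_rate = n_active / n_total  # P: overall fraction of actives

    seen = Counter()         # N_F: compounds containing each feature
    seen_active = Counter()  # A_F: active compounds containing each feature
    for features, is_active in samples:
        for feature in features:
            seen[feature] += 1
            if is_active:
                seen_active[feature] += 1

    # Laplacian correction pulls rarely seen features toward a weight of zero.
    return {
        feature: math.log((seen_active[feature] + 1) / (seen[feature] * base_rate + 1))
        for feature in seen
    }


def bayes_score(features, weights):
    """Sum the weights of the features present; unseen features add nothing."""
    return sum(weights.get(feature, 0.0) for feature in features)


# Toy training data with made-up feature labels; True means DILI-positive.
toy_training = [
    ({"f1", "f2"}, True),
    ({"f1"}, True),
    ({"f2", "f3"}, False),
    ({"f3"}, False),
]
weights = train_laplacian_bayes(toy_training)
print(bayes_score({"f1"}, weights), bayes_score({"f3"}, weights))
```

A positive score means the compound's features were enriched among the training actives. A score threshold for calling a compound DILI-positive would still need to be chosen on validation data.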
DILI Bayesian
• Figure panels: features in DILI-negative compounds; features in DILI-positive compounds
• Avoid: long aliphatic chains, phenols, ketones, diols, α-methyl styrene, conjugated structures, cyclohexenones, amides
Test set analysis
• Compounds of most interest: well-known hepatotoxic drugs (U.S. Food and Drug Administration Guidance for Industry “Drug-Induced Liver Injury: Premarketing Clinical Evaluation,” 2009), plus their less hepatotoxic comparators, if clinically available
Ekins, Williams and Xu, Drug Metab Dispos 38: 2302-2308, 2010
Predictions for newly approved EMEA compounds
• Fingolimod (Gilenya) for MS (EMEA and FDA)
• Paliperidone for schizophrenia
• Pirfenidone for idiopathic pulmonary fibrosis
• Roflumilast for pulmonary disease
Can we get DILI data for these?
Time-dependent inhibition (TDI) for P450 3A4
• Pfizer generated a large dataset (~2000 compounds) and went through sequential Bayesian model generation and testing cycles
• Test set 2: 20 active in 156 compounds; combined both model predictions
• The indazole ring, the pyrazole, and the methoxy-aminopyridine rings are important for TDI
• Approach decreased in vitro screening by 30%
• Helps identify reactive metabolite forming compounds
Zientek et al., Chem Res Toxicol 23: 664-676 (2010)
http://www.slideshare.net/ekinssean
Ekins S and Williams AJ, MedChemComm, 1: 325-330, 2010.
Analysis of malaria and TB datasets
Antimalarial Compound libraries and filter failures
Ekins and Williams Drug Disc Today 15; 812-815, 2010
[Figure: bar chart of % failure (axis 0-100) under Abbott alerts, Pfizer LINT alerts and GSK alerts for GSK (13,355), St Jude (1524), Novartis (5695), FDA drugs (1041) and antimalarial drugs (14).]
Filtering using SMARTS filters to remove thiol reactives, false positives etc. at University of New Mexico (http://pasilla.health.unm.edu/tomcat/biocomp/smartsfilter)
TB Compound libraries and filter failures
Filtering using SMARTS filters to remove thiol reactives, false positives etc. at University of New Mexico (http://pasilla.health.unm.edu/tomcat/biocomp/smartsfilter)
Ekins et al., Mol Biosyst, 6: 2316-2324, 2010
[Figure: bar chart of % failure (axis 0-100) under Abbott alerts, Pfizer LINT alerts and GSK alerts for TB Maddry (90), TB Ananthan (160), TB drugs (13), US antibiotics (163) and FDA drugs (1041).]
Correlations
• Correlation between the number of SMARTS filter failures and the number of Lipinski violations for different rule sets, using the FDA drug set from CDD (N = 2804)
• Suggests the number of Lipinski violations may also be an indicator of undesirable chemical features that result in reactivity
Ekins and Freundlich, Pharm Res, 28, 1859-1869, 2011.
Increasing Data & Model Access
Could all pharmas share their data as models with each other?
Ekins and Williams, Lab On A Chip, 10: 13-22, 2010.
The big idea
• Challenge: there is limited access to the ADME/Tox data and models needed for R&D
• How could a company share data but keep the structures proprietary?
• Sharing models means both parties use costly software
• What about open source tools? Pfizer had never considered this, so we proposed a study and Rishi Gupta generated the models
Pfizer Open models and descriptors
Gupta RR, et al., Drug Metab Dispos, 38: 2083-2090, 2010
• What can be developed with very large training and test sets?
• HLM training 50,000 testing 25,000 molecules
• training 194,000 and testing 39,000
• MDCK training 25,000 testing 25,000
• MDR training 25,000 testing 18,400
• Open molecular descriptors / models vs commercial descriptors
Examples – Metabolic Stability
Gupta RR, et al., Drug Metab Dispos, 38: 2083-2090, 2010
HLM models (columns in the order the two blocks appear on the slide)
                                     CDK and SMARTS Keys    MOE2D and SMARTS Keys
# Descriptors                        578                    818
# Training set compounds             193,650                193,930
Cross validation (compounds)         38,730                 38,786
Training R2                          0.79                   0.77
20% test set R2                      0.69                   0.69
Blind data set (2310 compounds) R2   0.53                   0.53
Blind data set RMSE                  0.367                  0.367
Continuous to categorical: κ         0.40                   0.42
Sensitivity                          0.16                   0.24
Specificity                          0.99                   0.987
PPV                                  0.80                   0.823
Time (sec/compound)                  0.252                  0.303
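For readers unfamiliar with the regression statistics quoted for the HLM models, the sketch below shows one common way to compute R2 and RMSE. The slide does not say which R2 definition was used; this assumes the coefficient of determination (1 - SS_res/SS_tot) rather than a squared Pearson correlation. The function names and the toy observed/predicted values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def holdout_size(n_compounds, fraction=0.2):
    """Number of compounds in a hold-out set of the given fraction."""
    return round(n_compounds * fraction)


# Toy values for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(r_squared(observed, predicted), rmse(observed, predicted))

# The compound counts reported next to the training set sizes appear to be
# 20% of each training set.
print(holdout_size(193650), holdout_size(193930))
```

The gap between the training R2 and the blind-set R2 on the slide is the usual sign that performance on new chemistry is lower than performance on compounds resembling the training set.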
PCA of training (red) and test (blue) compounds
Overlap in Chemistry space
Examples – P-gp
Gupta RR, et al., Drug Metab Dispos, 38: 2083-2090, 2010
• Open source descriptors CDK and C5.0 algorithm
• ~60,000 molecules with P-gp efflux data from Pfizer
• Training set: MDR < 2.5 (low risk) (N = 14,175), MDR > 2.5 (high risk) (N = 10,820)
• Test set: MDR < 2.5 (N = 10,441), MDR > 2.5 (N = 7972)
               CDK + fragment descriptors    MOE 2D + fragment descriptors
Kappa          0.65                          0.67
Sensitivity    0.86                          0.86
Specificity    0.78                          0.8
PPV            0.84                          0.84
Could facilitate model sharing?
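The categorical statistics for the P-gp models all derive from a confusion matrix. The sketch below rebuilds an approximate matrix for the CDK model from the reported test set sizes and the rounded sensitivity and specificity, then computes PPV and Cohen's kappa. It assumes the positive class is MDR < 2.5, which is my inference and is not stated on the slide; the function names are hypothetical, and because the inputs are rounded the outputs are only approximate.

```python
def confusion_from_rates(n_pos, n_neg, sensitivity, specificity):
    """Approximate (tp, fp, tn, fn) from class sizes and reported rates."""
    tp = round(n_pos * sensitivity)
    fn = n_pos - tp
    tn = round(n_neg * specificity)
    fp = n_neg - tn
    return tp, fp, tn, fn


def ppv(tp, fp, tn, fn):
    """Positive predictive value: fraction of predicted positives that are correct."""
    return tp / (tp + fp)


def cohen_kappa(tp, fp, tn, fn):
    """Agreement between predicted and observed classes, corrected for chance."""
    n = tp + fp + tn + fn
    observed_agreement = (tp + tn) / n
    chance_agreement = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    return (observed_agreement - chance_agreement) / (1 - chance_agreement)


# Test set sizes and CDK model rates from the slide.
# Assumption: MDR < 2.5 (N = 10,441) is treated as the positive class.
matrix = confusion_from_rates(10441, 7972, sensitivity=0.86, specificity=0.78)
print(f"PPV = {ppv(*matrix):.2f}, kappa = {cohen_kappa(*matrix):.2f}")
```

Under that assumption the reconstructed PPV and kappa land close to the reported 0.84 and 0.65, which suggests the four reported statistics are mutually consistent.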
Model coverage of chemistry space
Combining models may give greater coverage of ADME/Tox chemistry space and improve predictions?
[Figure: company models covering chemistry space. Labels: Merck KGaA, Lundbeck, Pfizer, Merck, GSK, Novartis, Lilly, BMS, Allergan, Bayer, AZ, Roche, BI.]
Next steps
• ADME/Tox data crosses diseases
• Potential to share models selectively with collaborators, e.g. academics, neglected disease researchers
• We used the proof of concept to submit an SBIR: “Biocomputation across distributed private datasets to enhance drug discovery”
• Develop a prototype for sharing models securely; collaborate to show how combining data for TB etc. could improve models
• Phase II: develop a commercial product that leverages CDD
• Engage Pistoia Alliance to expand the concept to many companies (in progress)
Future: What will be modeled
• Mitochondrial toxicity, hepatotoxicity
• More transporters (MATE, OATPs, BSEP ...) with bigger datasets, driven by academia
• Screening centers: more data, more models
• Understanding differences between ligands for nuclear receptors, e.g. CAR vs PXR
• Models will become replacements for data as datasets expand (e.g. like logP)
• Toxicity Models used for Green Chemistry
Chem Rev. 2010 Oct 13;110(10):5845-82
What You Might Not Know About Chemistry Databases On The Internet
• Data-sharing between open databases is cyclic
• This can proliferate errors in the “Linked Data”
Government Databases Should Come With a Health Warning
Openness Can Bring Serious Quality Issues
NPC Browser http://tripod.nih.gov/npc/
Database released and within days 100’s of errors found in structures
Williams and Ekins, DDT, 16: 747-750 (2011)
Science Translational Medicine 2011
Mobile Apps for Drug Discovery
• Make science more accessible => communication
• Mobile: take a phone into the field or lab and do science more readily than on a laptop
• GREEN: energy-efficient computing
• MolSync + DropBox + MMDS = share molecules as SDF files on the cloud = collaborate
Williams et al., DDT 16: 928-939, 2011
Acknowledgments
• University of Maryland
– Lei Diao
– James E. Polli
• Pfizer
– Rishi Gupta
– Eric Gifford
– Ted Liston
– Chris Waller
• Merck
– Jim Xu
• Antony J. Williams (RSC)
• Accelrys
• CDD
• Email: [email protected]
Slideshare: http://www.slideshare.net/ekinssean
Twitter: collabchem
Blog: http://www.collabchem.com/
Website: http://www.collaborations.com/CHEMISTRY.HTM