
Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

Date post: 23-Jan-2017
Upload: sean-ekins
Transcript
Page 1: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

Applying Cheminformatics and Bioinformatics Approaches to Neglected Tropical Disease Big Data

Sean Ekins1,2*ǂ, Jair Lage de Siqueira-Neto3ǂ, Laura-Isobel McCall3, Malabika Sarker4, Maneesh Yadav4, Elizabeth L. Ponder5, E. Adam Kallel1,$, Danielle Kellar6,§, Steven Chen7, Michelle Arkin7, Barry A. Bunin1, James H. McKerrow3 and Carolyn Talcott4.

1 Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA 94010, USA.
2 Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, NC 27526, USA.
3 Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA 92093, USA.
4 SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025, USA.
5 ChEM-H, Shriram Center, 443 Via Ortega, Room 279, MC 5082, Stanford, CA 94305-4125, USA.
6 Department of Pathology, University of California San Francisco, San Francisco, CA 94158, USA.
7 Small Molecule Discovery Center and Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, CA 94158, USA.
$ Retrophin Inc., 12255 El Camino Real, Suite 250, San Diego, CA 92130, USA.
§ Present address: Five Prime Therapeutics, San Francisco, CA, USA.
ǂ contributed equally

Page 2: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

Chagas Disease

• About 7 million to 8 million people are estimated to be infected worldwide.
• Vector-borne transmission occurs in the Americas.
• A triatomine bug carries the parasite Trypanosoma cruzi, which causes the disease.
• The disease is curable if treatment is initiated soon after infection.

Hotez et al., PLoS Negl Trop Dis. 2013 Oct 31;7(10):e2300

Page 3: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

Trypanosoma cruzi life cycle

Page 4: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

Unmet Need

Eradication rates of parasite | E1224 | Benznidazole | Placebo
At treatment completion | 79-91% | 91% | 26%
12 months after treatment | 8-31% | 81% | 8.5%

Page 5: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data
Page 6: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

Drugs in use and in development for Chagas Disease

No FDA approved drugs

posaconazole monotherapy Phase II concluded; missing
oxaborole (from Anacor) in pre-clinical

http://www.bvgh.org/Current-Programs/Neglected-Disease-Product-Pipelines/NTD-Pipelines.aspx

Page 7: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

T. cruzi high-content screening assay

[Figure: assay workflow. T. cruzi infect C2C12 cells (6-8 days) => T. cruzi (trypomastigote); plate containing compounds, T. cruzi and myocyte (3 days) => fixing & staining => reading]

Page 8: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

Image analysis for efficacy assessment

IMAGES

NUMBERS

Moon et al., PLoS One, 9(2), e87188, 2014

Page 9: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

Host Cell Detection & Segmentation

Parasite Detection

Moon et al., PLoS One, 9(2), e87188, 2014

Page 10: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

Screening Assay Validation

Moon et al., PLoS One, 9(2), e87188, 2014

Page 11: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

CDD & CDIPD & SRI Collaboration

• Develop a novel combined cheminformatics-systems biology approach to predict metabolic enzyme targets of HTS hits
• Curate T. cruzi metabolome
• Identify interesting targets
• Identify novel metabolic enzyme-compound hit pairs for T. cruzi
  - Analyze hits in CDD, e.g. Broad hits, literature etc.
  - Compare to known compounds with known targets, e.g. CYP51

What we actually did

• Developed machine learning models
• Identified compounds for in vitro testing
• Tested hits in vivo

Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878

Page 12: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

Curating T. cruzi metabolome

Pathway/Genome Database (biocyc.org)

TCruCyc created using complete genome sequence of Dm28c strain
Used PathoLogic workflow

• 11,349 distinct gene products
• 88 were enzymes, 16 transporters
• Inferred 1030 enzymatic reactions, 122 pathways
• 806 metabolic compounds – set filtered to 358 for use in similarity searching

Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878

Page 13: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

T. cruzi Machine Learning models

• Dataset from PubChem AID 2044 (Broad Institute data)
• Dose response data (1853 actives and 2203 inactives)
• Dose response and cytotoxicity (1698 actives and 2363 inactives)
• EC50 values less than 1 µM were selected as actives.
• For the dose response and cytotoxicity model, actives also required a greater than 10-fold difference between cytotoxicity and EC50.
• Models generated using: molecular function class fingerprints of maximum diameter 6 (FCFP_6), AlogP, molecular weight, number of rotatable bonds, number of rings, number of aromatic rings, number of hydrogen bond acceptors, number of hydrogen bond donors, and molecular fractional polar surface area.
• 5-fold cross validation or leave out 50% x 100 fold cross validation was used to calculate the ROC for the models generated.
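The activity labeling rules on this slide can be sketched in a few lines of Python. This is only an illustration of the two rules (a potency cutoff and a fold difference between cytotoxicity and EC50), not the authors' pipeline. The function name `label_compound`, its parameters and the example values are hypothetical, and the cutoff is left as a parameter in the same unit as the EC50 supplied.

```python
from typing import Optional


def label_compound(ec50: float,
                   cc50: Optional[float] = None,
                   ec50_cutoff: float = 1.0,
                   min_fold: float = 10.0) -> bool:
    """Return True if a compound would be labeled active.

    ec50, cc50 and ec50_cutoff must share the same concentration unit.
    If cc50 is None, only the potency rule is applied (dose response model).
    Otherwise the compound must also be more than `min_fold` times less
    cytotoxic than it is potent (dose response and cytotoxicity model).
    """
    potent = ec50 < ec50_cutoff
    if cc50 is None:
        return potent
    return potent and (cc50 / ec50) > min_fold


# Hypothetical example values, not taken from the dataset
print(label_compound(0.5))              # potency rule only
print(label_compound(0.5, cc50=2.0))    # potent, but only 4-fold selective
print(label_compound(0.5, cc50=20.0))   # potent and 40-fold selective
```

Treating the fold difference as CC50 divided by EC50 is an assumption; the slide only says "greater than 10 fold difference compared with EC50".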

Page 14: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

5 fold cross validation

Model | Best cutoff | Leave-one out ROC | 5-fold cross validation ROC | 5-fold cross validation sensitivity (%) | 5-fold cross validation specificity (%) | 5-fold cross validation concordance (%)
Dose response (1853 actives, 2203 inactives) | -0.676 | 0.81 | 0.78 | 77 | 89 | 84
Dose response and cytotoxicity (1698 actives, 2363 inactives) | -0.337 | 0.82 | 0.80 | 80 | 88 | 84

Dual event 50% x 100 fold cross validation

External ROC | Internal ROC | Concordance (%) | Specificity (%) | Sensitivity (%)
0.79 ± 0.01 | 0.80 ± 0.01 | 73.48 ± 1.05 | 79.08 ± 3.73 | 65.68 ± 3.89
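The statistics on this slide are related in a simple way, which the following sketch makes explicit. It assumes the usual definitions (sensitivity = fraction of actives predicted active, specificity = fraction of inactives predicted inactive, concordance = overall fraction correct, ROC = area under the receiver operating characteristic curve). The function names and the toy scores are hypothetical; the class sizes and rates in the first call are the dose response model values reported above.

```python
def concordance_from_rates(sensitivity, specificity, n_actives, n_inactives):
    """Overall fraction correct implied by sensitivity and specificity."""
    true_positives = sensitivity * n_actives
    true_negatives = specificity * n_inactives
    return (true_positives + true_negatives) / (n_actives + n_inactives)


def roc_auc(scores, labels):
    """ROC area as the probability that a random active outscores a random
    inactive (ties count as half). Quadratic in size, fine for a sketch."""
    actives = [s for s, y in zip(scores, labels) if y]
    inactives = [s for s, y in zip(scores, labels) if not y]
    wins = sum((a > i) + 0.5 * (a == i) for a in actives for i in inactives)
    return wins / (len(actives) * len(inactives))


# Dose response model: 77% sensitivity, 89% specificity,
# 1853 actives and 2203 inactives
c = concordance_from_rates(0.77, 0.89, 1853, 2203)
print(f"implied concordance: {100 * c:.1f}%")

# Toy scores (hypothetical) to show the ROC calculation
print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

For the dose response model the implied concordance agrees with the reported 84% to within rounding.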

Page 15: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

T. cruzi Dose Response Machine Learning model features

Good: Tertiary amines, piperidines and aromatic fragments with basic nitrogen
Bad: Cyclic hydrazines and electron poor chlorinated aromatics

Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878

Page 16: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

T. cruzi Dose Response and cytotoxicity Machine Learning model features

Good: Tertiary amines, piperidines and aromatic fragments with basic nitrogen
Bad: Cyclic hydrazines and electron poor chlorinated aromatics

Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878

Page 17: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

Bayesian Machine Learning Models

- Selleck Chemicals natural product library (139 molecules);
- GSK kinase library (367 molecules);
- Malaria box (400 molecules);
- Microsource Spectrum (2320 molecules);
- CDD FDA drugs (2690 molecules);
- Prestwick Chemical library (1280 molecules);
- Traditional Chinese Medicine components (373 molecules)

7569 molecules => 99 molecules

Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878
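As a quick arithmetic check, the library sizes listed on this slide do add up to the stated total, and the 99 selected molecules correspond to a little over 1% of what was scored. The snippet below is only that check; the dictionary keys are shorthand labels, not identifiers from any database.

```python
# Library sizes as listed on the slide
library_sizes = {
    "Selleck natural products": 139,
    "GSK kinase library": 367,
    "Malaria box": 400,
    "Microsource Spectrum": 2320,
    "CDD FDA drugs": 2690,
    "Prestwick Chemical library": 1280,
    "Traditional Chinese Medicine components": 373,
}

total = sum(library_sizes.values())
selected = 99  # molecules carried forward, per the slide

print(f"total molecules: {total}")
print(f"fraction selected: {100 * selected / total:.2f}%")
```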

Page 18: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

Primary Screening of 99 compounds

Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878

Page 19: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

In vitro and in vivo data for compounds selected

Synonyms | Infection Ratio | EC50 (µM) | EC90 (µM) | Hill slope | Cytotoxicity CC50 (µM) | Chagas mouse model (4 days treatment, luciferase): In vivo efficacy at 50 mg/kg bid (IP) (%)
(±)-Verapamil hydrochloride, 715730, SC-0011762 | 0.02, 0.02 | 0.0383 | 0.143 | 1.67 | >10.0 | 55.1
29781612, Pyronaridine | 0.00, 0.00 | 0.225 | 0.665 | 2.03 | 3.0 | 85.2
511176, Furazolidone | 0.00, 0.00 | 0.257 | 0.563 | 2.81 | >10.0 | 100.5
501337, SC-0011777, Tetrandrine | 0.00, 0.00 | 0.508 | 1.57 | 1.95 | 1.3 | 43.6
SC-0011754, Nitrofural | 0.01, 0.01 | 0.775 | 6.98 | 1.00 | >10.0 | 78.5*

* Used hydroxymethylnitrofurazone for in vivo study (nitrofural pro-drug)

[Figure: chemical structures of the five compounds]

Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878
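The EC50, EC90 and Hill slope columns are linked if all three come from the same fitted dose response curve. Under a standard Hill model the concentration giving a fraction F of the maximal effect is EC_F = EC50 x (F / (1 - F))^(1/h), so EC90 = EC50 x 9^(1/h). The sketch below applies that relation to the reported values. The slide does not state how EC90 was obtained, so the shared Hill fit is an assumption, and the function name `ec_at_fraction` is hypothetical.

```python
def ec_at_fraction(ec50, hill_slope, fraction=0.9):
    """Concentration giving `fraction` of maximal effect under a Hill model."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")
    return ec50 * (fraction / (1.0 - fraction)) ** (1.0 / hill_slope)


# (name, EC50 in µM, Hill slope, reported EC90 in µM) from the table above
reported = [
    ("Verapamil", 0.0383, 1.67, 0.143),
    ("Pyronaridine", 0.225, 2.03, 0.665),
    ("Furazolidone", 0.257, 2.81, 0.563),
    ("Tetrandrine", 0.508, 1.95, 1.57),
    ("Nitrofural", 0.775, 1.00, 6.98),
]

for name, ec50, slope, ec90 in reported:
    predicted = ec_at_fraction(ec50, slope)
    print(f"{name}: predicted EC90 {predicted:.3g} µM, reported {ec90} µM")
```

For all five compounds the predicted EC90 falls within about 1% of the reported value, which is consistent with the assumption but does not confirm it.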

Page 20: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

What do we know about the hits?

Verapamil: Broad EC50 < 0.1 µM; others have shown IC50 > 50 µM.
Pyronaridine: EC50 < 0.587 µM in Broad dose response data, but never tested in mouse.
Furazolidone (H. pylori treatment): only in the bigger Broad primary screen.
Tetrandrine: a P-gp inhibitor used in combination with chloroquine; classed as negative in the Broad primary screen.
Nitrofural (known active, Beveridge et al. 1980): not in training set or Broad dataset, predicted active by us, EC50 = 0.77 µM.

This study used a different T. cruzi strain (CA-I/72) from the Broad data (Tulahuen). The latter seems to bias hits towards CYP51 etc., which can account for differences in activity.

Page 21: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

In vivo efficacy of the 5 tested compounds

7,569 cpds => 99 cpds => 17 hits (5 in nM range)

[Figure: in vivo efficacy. Timeline in days (0 to 7) marking infection, treatment and reading; panels for Pyronaridine, Furazolidone, Verapamil, Nitrofural, Tetrandrine, Benznidazole and Vehicle]

Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878

Page 22: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

Sharing in vitro and in vivo data in CDD Vault

Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878

CDD and UCSD used Vault to securely share data

In vitro and in vivo data captured

Screening and dose response data

Page 23: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

Pyronaridine: New anti-Chagas and known anti-Malarial

EMA approved in combination with artesunate

IC50 of 2 nM against the growth of KT1 and KT3 P. falciparum

Known P-gp inhibitor

Active against the tick-transmitted parasites Babesia and Theileria

Page 24: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

Pyronaridine: target hunting for Chagas disease

Similarity search with pyronaridine in the literature dataset we curated on Chagas disease: GAPDH

Similarity search on ChEMBL using the MMDS: trypanothione disulfide reductase

Most similar metabolite (Tanimoto, MDL keys = 0.67): S-adenosyl 3-(methylthio)propylamine (polyamine biosynthesis)
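The Tanimoto value quoted above is a ratio of shared to total fingerprint bits. A minimal sketch of that calculation and of ranking a small library against a query follows. The bit sets here are hypothetical stand-ins: real MDL keys would have to be generated from structures with a cheminformatics toolkit, which this example deliberately does not attempt. The names `tanimoto` and `most_similar` are illustrative only.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as sets of 'on' bits."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints: define similarity as 0
    return len(a & b) / union


def most_similar(query_bits, library):
    """Return (name, similarity) of the library entry closest to the query."""
    return max(
        ((name, tanimoto(query_bits, bits)) for name, bits in library.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical fingerprints, for illustration only
query = {1, 2, 3, 4}
library = {
    "metabolite_a": {1, 2, 3},
    "metabolite_b": {1, 2, 3, 4, 9},
}

print(tanimoto({1, 2, 3, 4, 5, 6}, {3, 4, 5, 6, 7, 8}))
print(most_similar(query, library))
```

The same ranking step applies whether the library is a curated literature set, ChEMBL, or the filtered metabolite list; only the fingerprints change.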

Page 25: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

Conclusions

Bayesian models and training sets were provided as supplemental data

Managed to find an overlooked compound from Broad data

Future work:
Use models to score other libraries
Combinations of molecules
Longer term efficacy studies
Target identification
Test Pyronaridine vs other parasites, bacteria, viruses

Ekins et al., PLoS Negl Trop Dis. 2015 Jun 26;9(6):e0003878

Page 26: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

Drugs in use and in development for Chagas Disease

Page 27: Applying cheminformatics and bioinformatics approaches to neglected tropical disease big data

Acknowledgments

NIH NIAID grant R41-AI108003-01 “Identification and validation of targets of phenotypic high throughput screening”

Mike Pollastri
Ni Ai
Alex Clark
Dr. Martin John Rogers

