Introduction to ChEMBL - BioMedBridges

Post on 16-Oct-2021

Computational Aspects of HTS Planning and Analysis Course

Introduction to ChEMBL

Anne Hersey

ChEMBL Group

EMBL-EBI

Outline

• ChEMBL

  • Background

  • Content

  • Identifying compounds binding to a protein target

  • Assessing compounds

• Using other resources

  • Crystal structures

  • Druggability algorithms

2

Compound Selection

[Flowchart. Starting question: Target annotation possible? If not, use a diversity-based list. Activity-based route (ChEMBL): Known actives? Are the actives 'drug like'? Select compounds similar to known actives, giving a virtual screening file and activity-based hits. Structure-based route (PDBe, DrugEBIlity): Known structure? Is the binding site 'drug like'? Select compounds compatible with the binding site, giving a virtual screening file and structure-based hits.]

3

What is ChEMBL?

• Open access database for drug discovery

• Freely available (searchable and downloadable)

• Content:

  • Bioactivity data manually extracted from the primary medicinal chemistry literature, from journals such as J. Med. Chem.

  • Subset of data from PubChem

  • Deposited data, e.g. neglected disease screening, GSK kinase set

• Bioactivity data is associated with a biological target and a chemical structure

• Compounds are stored in a structure-searchable format

• Protein targets are linked to protein sequences in UniProt

• Updated regularly with new data

• Secure searching (https://www.ebi.ac.uk/chembldb)

4

Data Example

EP1 Antagonists for Inflammatory Pain, A. Hall et al., Bioorg. Med. Chem. Lett. 19 (2009) 2599–2603

5

View of data in ChEMBL

6

Compound, Target, Activity, Assay, Literature reference

Some Numbers (ChEMBL17)

7

Accessing ChEMBL Data

8

Drug Targets

9

Data for:

• ~260 drug targets

• ~6000 protein targets (single proteins, families and complexes)

Are there known Active Compounds for my Target?

10

From ChEMBL, identify compounds that bind to the target.

Select:

• Potent compounds

• Rule of 5 compliant (drug-like)

• Ligand efficient molecules
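The first two selection criteria above can be sketched in a few lines of Python. This is only an illustration: the `Compound` fields, the pKi cut-off of 7 and the compound values are assumptions made up for the example, not ChEMBL records or ChEMBL's own implementation. The Rule of 5 limits used are the usual ones (molecular weight ≤ 500, logP ≤ 5, at most 5 hydrogen-bond donors, at most 10 acceptors).

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    pki: float          # -log10(Ki in mol/L); higher means more potent
    mol_weight: float   # molecular weight in Da
    logp: float         # calculated logP
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def rule_of_five_violations(c: Compound) -> int:
    """Count how many of the four Rule of 5 limits the compound exceeds."""
    return sum([
        c.mol_weight > 500,
        c.logp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def select_compounds(compounds, min_pki=7.0, max_violations=0):
    """Keep potent, Rule of 5 compliant compounds, most potent first.

    min_pki and max_violations are assumed thresholds; adjust to taste.
    """
    kept = [c for c in compounds
            if c.pki >= min_pki and rule_of_five_violations(c) <= max_violations]
    return sorted(kept, key=lambda c: c.pki, reverse=True)


# Hypothetical values for illustration only, not real ChEMBL data.
candidates = [
    Compound("cpd_A", pki=8.2, mol_weight=350.0, logp=2.1, h_donors=2, h_acceptors=5),
    Compound("cpd_B", pki=9.0, mol_weight=620.0, logp=5.6, h_donors=3, h_acceptors=11),
    Compound("cpd_C", pki=6.1, mol_weight=410.0, logp=3.0, h_donors=1, h_acceptors=6),
    Compound("cpd_D", pki=7.5, mol_weight=300.0, logp=1.5, h_donors=1, h_acceptors=4),
]

selected = select_compounds(candidates)
print([c.name for c in selected])
```

Some practitioners tolerate one violation; passing `max_violations=1` gives that looser filter.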

Example (DPP4):

11

Linagliptin

Saxagliptin

Sitagliptin

Alogliptin

Compounds with DPP4 data in ChEMBL

Are they drug-like?

Ligand Efficiencies

12

LE = −RT ln(Ki) / Heavy_atoms (Hopkins AL et al., DDT 2004)

BEI = pKi × 1000 / MW (Abad-Zapatero C et al., DDT 2005)

SEI = pKi × 100 / PSA

LLE = pKi − ALogP (Leeson PD et al., NRDD 2007)

In ChEMBL, LE is calculated for: IC50, Ki, EC50, Kd, XC50, AC50, Potency

Select most ligand efficient compounds
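As a worked example, the four efficiency indices defined on this slide can be computed as follows. This is a minimal sketch: the function name, the temperature of 300 K, the gas constant in kcal/(mol·K) and the input values are assumptions for illustration, and ChEMBL's own calculation may differ in detail.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def ligand_efficiencies(pki, heavy_atoms, mol_weight, psa, alogp,
                        temperature=300.0):
    """Return LE, BEI, SEI and LLE for one compound.

    pki        : -log10(Ki in mol/L)
    heavy_atoms: number of non-hydrogen atoms
    mol_weight : molecular weight in Da
    psa        : polar surface area in square angstroms
    alogp      : calculated logP
    temperature: assumed temperature in kelvin
    """
    ki_molar = 10 ** (-pki)
    return {
        # LE = -RT ln(Ki) / heavy atoms, in kcal/mol per heavy atom
        "LE": -R_KCAL * temperature * math.log(ki_molar) / heavy_atoms,
        # BEI = pKi * 1000 / MW
        "BEI": pki * 1000 / mol_weight,
        # SEI = pKi * 100 / PSA
        "SEI": pki * 100 / psa,
        # LLE = pKi - ALogP
        "LLE": pki - alogp,
    }


# Hypothetical compound, not a real ChEMBL record.
indices = ligand_efficiencies(pki=9.0, heavy_atoms=30, mol_weight=450.0,
                              psa=90.0, alogp=2.5)
for name, value in indices.items():
    print(f"{name}: {value:.2f}")
```

At 300 K the LE expression reduces to roughly 1.37 × pKi / heavy atoms, which is a convenient mental check when ranking compounds.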

Other Information about compounds

13

Linagliptin bound to DPP4

Compound availability

Another example

14

Sci Transl Med 5, 206ra138 (2013)

Information on PERK Target

15

Searching by Compound

16

Extending dataset – Similar Targets

17

>sp|Q9NZJ5|E2AK3_HUMAN Eukaryotic translation initiation factor 2-alpha kinase 3 OS=Homo sapiens GN=EIF2AK3 PE=1 SV=3 MERAISPGLLVRALLLLLLLLGLAARTVAAGRARGLPAPTAEAAFGLGAAAAPTSATRVPAAGAVAAAEVTVEDAEALPAAAGEQEPRGPEPDDETELRPRGRSLVIISTLDGRIAALDPENHGKKQWDLDVGSGSLVSSSLSKPEVFGNKMIIPSLDGALFQWDQDRESMETVPFTVESLLESSYKFGDDVVLVGGKSLTTYGLSAYSGKVRYICSALGCRQWDSDEMEQEEDILLLQRTQKTVRAVGPRSGNEKWNFSVGHFELRYIPDMETRAGFIESTFKPNENTEESKIISDVEEQEAAIMDIVIKVSVADWKVMAFSKKGGHLEWEYQFCTPIASAWLLKDGKVIPISLFDDTSYTSNDDVLEDEEDIVEAARGATENSVYLGMYRGQLYLQSSVRISEKFPSSPKALESVTNENAIIPLPTIKWKPLIHSPSRTPVLVGSDEFDKCLSNDKFSHEEYSNGALSILQYPYDNGYYLPYYKRERNKRSTQITVRFLDNPHYNKNIRKKDPVLLLHWWKEIVATILFCIIATTFIVRRLFHPHPHRQRKESETQCQTENKYDSVSGEANDSSWNDIKNSGYISRYLTDFEPIQCLGRGGFGVVFEAKNKVDDCNYAIKRIRLPNRELAREKVMREVKALAKLEHPGIVRYFNAWLEAPPEKWQEKMDEIWLKDESTDWPLSSPSPMDAPSVKIRRMDPFATKEHIEIIAPSPQRSRSFSVGISCDQTSSSESQFSPLEFSGMDHEDISESVDAAYNLQDSCLTDCDVEDGTMDGNDEGHSFELCPSEASPYVRSRERTSSSIVFEDSGCDNASSKEEPKTNRLHIGNHCANKLTAFKPTSSKSSSEATLSISPPRPTTLSLDLTKNTTEKLQPSSPKVYLYIQMQLCRKENLKDWMNGRCTIEERERSVCLHIFLQIAEAVEFLHSKGLMHRDLKPSNIFFTMDDVVKVGDFGLVTAMDQDEEEQTVLTPMPAYARHTGQVGTKLYMSPEQIHGNSYSHKVDIFSLGLILFELLYPFSTQMERVRTLTDVRNLKFPPLFTQKYPCEYVMVQDMLSPSPMERPEAINIIENAVFEDLDFPGKTVLRQRSRSLSSSGTKHSRQSNNSHSPLPSN

PERK sequence from UniProt

BLAST search for similar sequences
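Before submitting a sequence to a BLAST search it is often handy to separate the FASTA header from the residues and pull out the accession. The sketch below parses the UniProt-style header shown on this slide; the function name and the returned dictionary layout are assumptions for illustration, and the sequence in the example is deliberately truncated to its first few residues.

```python
import re


def parse_uniprot_fasta(text):
    """Split a single UniProt-style FASTA record into header fields and sequence."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record: missing '>' header line")

    # Header layout: >db|ACCESSION|ENTRY_NAME description OS=... GN=... PE=... SV=...
    database, accession, rest = lines[0][1:].split("|", 2)
    entry_name, _, description = rest.partition(" ")

    # Two-letter KEY=value annotations at the end of the description.
    fields = dict(re.findall(r"\b([A-Z]{2})=(.+?)(?=\s[A-Z]{2}=|$)", description))
    protein_name = re.split(r"\s[A-Z]{2}=", description, maxsplit=1)[0]

    return {
        "database": database,
        "accession": accession,
        "entry_name": entry_name,
        "protein_name": protein_name,
        "fields": fields,
        "sequence": "".join(lines[1:]),
    }


# Header taken from the slide; sequence truncated for illustration.
EXAMPLE = """>sp|Q9NZJ5|E2AK3_HUMAN Eukaryotic translation initiation factor 2-alpha kinase 3 OS=Homo sapiens GN=EIF2AK3 PE=1 SV=3
MERAISPGLLVRA
"""

record = parse_uniprot_fasta(EXAMPLE)
print(record["accession"], record["fields"]["GN"], len(record["sequence"]))
```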

Compound Selection

[Flowchart. Starting question: Target annotation possible? If not, use a diversity-based list. Activity-based route (ChEMBL): Known actives? Are the actives 'drug like'? Select compounds similar to known actives, giving a virtual screening file and activity-based hits. Structure-based route (PDBe, druggability): Known structure? Is the binding site 'drug like'? Select compounds compatible with the binding site, giving a virtual screening file and structure-based hits.]

18

Structure Based Design

19

Is my Protein Druggable?

• Structure-based methods identify cavities in protein crystal structures and assess the properties of these cavities

• Rules for the properties that indicate a druggable cavity are learnt from analysis of co-crystal complexes with drug-like ligands

• Examples of Algorithms:

• PocketFinder – An, Totrov & Abagyan, 2005

• Druggability Indices – Hajduk, Huth & Fesik, 2005

• Rule based method - Perola, Herman & Weiss, 2012

20

DrugEBIlity

• https://www.ebi.ac.uk/chembl/drugebility/structure

• All potential pockets in crystal structures from PDB predicted using a pocket-finding algorithm (based on SurfNet, Laskowski 1995)

• Decision tree algorithm trained on known binding pockets for drug-like ligands (e.g., rule-of-five)

• Decision tree used to classify unknown pockets into druggable/undruggable

• Second ‘tractability’ algorithm also trained with more relaxed ligand criteria (e.g., Mwt < 800)
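The train-then-classify idea in the bullets above can be illustrated with a very small decision tree written from scratch. Everything specific here is an assumption made for the example: the two pocket descriptors (hydrophobic fraction and volume), the training values, the Gini impurity criterion and the depth limit. It is not the DrugEBIlity model or its feature set.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between neighbouring observed values
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Internal nodes are (feature, threshold, left, right); leaves are labels."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))


def classify(tree, row):
    """Walk from the root to a leaf and return the leaf label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


# Hypothetical training pockets: (hydrophobic fraction, volume in cubic angstroms).
pockets = [(0.70, 650), (0.65, 720), (0.60, 580),
           (0.30, 150), (0.20, 900), (0.66, 120)]
labels = ["druggable", "druggable", "druggable",
          "undruggable", "undruggable", "undruggable"]

tree = build_tree(pockets, labels)
print(classify(tree, (0.68, 600)))  # an unseen, hypothetical pocket
```

Training a second tree on pockets labelled with more relaxed ligand criteria would give the analogue of the 'tractability' classifier.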

21

Is the Binding Site Druggable?

22

Acknowledgements

• John Overington

• Anna Gaulton

• Mark Davies

• Patricia Bento

• Jon Chambers

• Francis Atkinson

• Louisa Bellis

• Yvonne Light

• George Papadatos

• Shaun McGlinchey

• Nathan Dedman

• Michal Nowotka

• Ruth Akhtar

• Kaz Ikeda