Docking tutorial
G. Marcou1, E. Kellenberger2
1Faculté de Chimie, UMR7140
2Faculté de Pharmacie, UMR 7200, Illkirch
The docking workflow
Ligand preparation
• standardization (aromatization, ionisation, tautomer)
• generation of a low energy conformer
Protein preparation
• receptor and binding site definition
• structure check
  - ionisation state (GLU, ASP, HIS, LYS, ARG)
  - tautomeric state (HIS)
  - position of the polar hydrogen atoms (SER, TYR, THR, LYS, ASN, GLN)
  - crystal water molecules
  - metal coordination type
  - addition of hydrogen atoms
Docking and scoring
• results are the structure file of the best ligand poses and the score of each pose
Understanding the docking paradigm
1. Re-docking
Exercise E1: re-docking
Docking of tacrine back into its co-crystal receptor
Investigated issues:
The quality of ligand and protein preparation impacts the docking outcome
- effect of the ligand ionisation
- effect of the water in the binding site
Docking requires expert intervention to predict unusual binding modes
[Figure: re-docking scheme: PDB complex → ligand + receptor → docking → predicted complex]
Understanding the docking paradigm
2. Cross-docking
Exercise E2: cross-docking
Docking of the tacrine-hupyridone inhibitor (A2E) and aricept (E20) into the binding site of tacrine (TAH)-bound acetylcholinesterase
Investigated issues:
Ligand and protein binding site flexibility
[Figure: cross-docking scheme: ligand and receptor taken from two different PDB complexes (#1 and #2) → docking → predicted complex]
Understanding the docking paradigm
3. Screening
Exercise E3: screening
Docking of the DUD dataset into the binding site of tacrine (TAH)-bound acetylcholinesterase, then ranking the compounds to discriminate true binders from decoys.
Investigated issues:
The limited accuracy of scoring functions
[Figure: example structures of actives and decoys, ranked by docking score]
cpds#   ΔGbind
1121    -44.51
222     -42.21
3563    -41.50
578     -40.31
639     -40.28
…
670     +22.54
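The ranking step amounts to sorting the compounds by predicted score. The following is a minimal sketch using only the six compounds listed above (the elided rows are not reconstructed). It assumes that a more negative ΔGbind means a better predicted binder, which is the usual convention; the variable names are hypothetical.

```python
# Compound number -> predicted ΔGbind, copied from the ranked list above.
# Assumption: lower (more negative) score = better predicted binder.
scores = {
    670: 22.54,
    578: -40.31,
    1121: -44.51,
    639: -40.28,
    222: -42.21,
    3563: -41.50,
}

# Sort by score, best (most negative) first.
ranked = sorted(scores.items(), key=lambda item: item[1])

for rank, (compound, dg_bind) in enumerate(ranked, start=1):
    print(f"{rank:>2}  cpd {compound:<5} {dg_bind:+.2f}")
```

Whether the top-ranked compounds are true binders or decoys is exactly what the screening exercise evaluates.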
LeadIT / FlexX Quickstart
Protein preparation
Molecules >> Prepare Receptor...
Select the protein PDB file and follow the instructions
Ligand preparation
Molecules >> Choose Library...
Load the MOL2 file
Do not tick the box "Protonate as in aqueous solution" (for the purposes of this exercise).
Docking
Docking >> Define FlexX Docking...
Course material
Input
pdb        pdb1acj.ent          PDB entry (1acj)
receptor   acj_WAT.mol2         prepared receptor (1acj)
           1eve_ali_WAT.mol2    prepared receptor (1eve)
ligand     TAH_1acj.mol2        neutral tacrine (1acj)
           TAH_1acj+.mol2       (+) charged tacrine (1acj)
           A2E_1zgc.mol2        tacrine-hupyridone inhibitor (1zgc)
           E20_1eve.mol2        aricept (1eve)
           DUD.mol2             D.U.D AchE dataset

Output, full projects
Flexx          mol2/sdf/csv/fxx result files
exercise E.1   1acj_TAHsite65_TAHredock
               1acj_TAHsite65_TAH+redock
               1acj_TAHsite65WAT_TAH+redock
exercise E.2   1acj_A2Esite65WAT-A2Ecrossdock
               1acj_E20site65WAT-E20crossdock
               1eve_E20site65WAT_E20redock
exercise E.3   1acj_A2Esite65WAT_DUDscreening
Exercise E.1: Re-docking tacrine (TAH) back into the acetylcholinesterase binding site
The tacrine / acetylcholinesterase binding mode is difficult to predict.
The PDB 1acj complex shows:
• pocket size >> ligand volume
• only one polar intermolecular interaction
• two key water molecules
• Load the tacrine / acetylcholinesterase 1acj PDB complex (input/pdb/pdb1acj.ent)
• Prepare the receptor and define a 6.5 Å site around tacrine
• Dock the neutral tacrine (TAH) / positively charged tacrine (TAH+)
• Include water in the receptor, dock TAH+
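The site definition step can be illustrated outside LeadIT. The following is a minimal sketch, assuming the receptor and ligand are available as plain coordinate lists and that a residue belongs to the site as soon as one of its atoms lies within the cutoff of any ligand atom. The function name `site_residues` and the toy coordinates are hypothetical, and the selection rule actually used by LeadIT may differ.

```python
import math

def site_residues(receptor_atoms, ligand_coords, cutoff=6.5):
    """Return the sorted labels of residues with at least one atom
    within `cutoff` angstroms of any ligand atom.

    receptor_atoms: list of (residue_label, (x, y, z)) tuples
    ligand_coords:  list of (x, y, z) tuples
    """
    selected = set()
    for residue, atom_xyz in receptor_atoms:
        if residue in selected:
            continue  # residue is already part of the site
        if any(math.dist(atom_xyz, lig_xyz) <= cutoff for lig_xyz in ligand_coords):
            selected.add(residue)
    return sorted(selected)

# Toy data: hypothetical coordinates, NOT taken from the 1acj structure.
receptor = [
    ("W279", (1.0, 0.0, 0.0)),
    ("W279", (9.0, 0.0, 0.0)),
    ("F330", (0.0, 6.0, 0.0)),
    ("X999", (20.0, 20.0, 20.0)),  # far from the ligand, excluded
]
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]

site = site_residues(receptor, ligand)
print(site)
```

Selecting whole residues rather than single atoms keeps side chains intact in the site, which matters when rotamers are later inspected.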
pdb_ligand_site       ligand   Docking accuracy for the docking ensemble (10 poses per ligand)
1acj_TAH_site65       TAH      Only wrong solutions: ligand upside down
1acj_TAH_site65       TAH+     Mixture of correct and wrong poses
1acj_TAH_site65_WAT   TAH+     Only correct poses
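Poses are judged correct or wrong by comparison with the X-ray ligand. One common way to quantify this is the RMSD between the docked pose and the crystal pose, computed without superposition because both sit in the same receptor frame. The following is a minimal sketch, assuming both coordinate lists use the same atom order. The 2.0 Å threshold is a widely used convention and an assumption here, not a value from this tutorial, and the function names and coordinates are hypothetical.

```python
import math

def pose_rmsd(pose, reference):
    """RMSD (angstroms) between two poses with matched atom order,
    without any superposition."""
    if not pose or len(pose) != len(reference):
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum(math.dist(p, r) ** 2 for p, r in zip(pose, reference))
    return math.sqrt(squared / len(pose))

def is_correct_pose(pose, reference, threshold=2.0):
    """Assumed convention: a pose is 'correct' if RMSD <= threshold."""
    return pose_rmsd(pose, reference) <= threshold

# Toy three-atom ligand (hypothetical coordinates).
reference = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0), (2.8, 0.0, 0.0)]
shifted = [(x, y + 0.5, z) for x, y, z in reference]  # small displacement
flipped = list(reversed(reference))                   # ligand upside down

print(round(pose_rmsd(shifted, reference), 3), is_correct_pose(shifted, reference))
print(round(pose_rmsd(flipped, reference), 3), is_correct_pose(flipped, reference))
```

The flipped pose occupies the same volume as the reference but is still classified as wrong, which mirrors the upside-down solutions seen for neutral tacrine. For symmetric ligands, a symmetry-aware RMSD would be needed to avoid overestimating the deviation.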
[Figure: the two tacrine input structures and their sources]
Input/ligand/TAH_1acj+.mol2: Harel et al. (1993) Proc Natl Acad Sci U S A.
Input/ligand/TAH_1acj.mol2: PDB ligand repository
Exercise E.2: Cross-docking A2E and E20 into TAH-bound acetylcholinesterase
Tacrine-hupyridone inhibitor (A2E)
- is a derivative of tacrine (TAH+)
- is more flexible than tacrine (TAH+)
The tacrine substructure of A2E is correctly placed in the protein pocket.
The docking of the A2E pyridone group is hindered by an unsuitable W279 rotamer.
[Figure: A2E and TAH+; legend: X-ray, re-docking]
The E20 inhibitor is not chemically similar to TAH / A2E.
The docking of E20 is prevented by an unsuitable F330 rotamer.
The E20 / acetylcholinesterase binding mode is difficult to predict, because:
• both ligand and binding site contain polar and charged groups
• BUT neither H-bonds nor ionic bonds are experimentally observed in the X-ray complex
Exercise E.3: Screening the DUD dataset,
using TAH-bound acetylcholinesterase
The DUD dataset (Huang, Shoichet and Irwin, 2006, DOI 10.1021/jm0608356)
• 107 true binders and 3892 decoys
• strong bias in the active set (towards E20 derivatives)
Don't start the calculation (it takes more than 5 hours)!
                                       Top 1%                             Top 20%
True positive (ACTIVE) rate, TPrate    …. / 107 =                         …. / 107 =
False positive (DECOYS) rate, FPrate   …. / 3892 =                        …. / 3892 =
Enrichment factor                      (TPnumber / 40) / (107 / 3999) =   (TPnumber / 800) / (107 / 3999) =
Enrichment factor from Huang et al.    1.9                                2.0
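The quantities in this table can be computed from a score-ranked list of active/decoy labels. The following is a minimal sketch, assuming the enrichment factor definition implied by the table: the fraction of actives in the selected top slice divided by the fraction of actives in the whole dataset. The function name `screening_metrics` and the toy ranking are hypothetical, and no result of the actual DUD screening is reproduced here.

```python
def screening_metrics(ranked_labels, top_fraction):
    """Compute TP rate, FP rate and enrichment factor for the top slice.

    ranked_labels: list of booleans ordered best score first (True = active)
    top_fraction:  fraction of the ranked list that is selected, e.g. 0.01
    """
    n_total = len(ranked_labels)
    n_actives = sum(ranked_labels)
    n_decoys = n_total - n_actives
    if n_actives == 0 or n_decoys == 0:
        raise ValueError("need at least one active and one decoy")

    n_selected = max(1, round(n_total * top_fraction))
    tp = sum(ranked_labels[:n_selected])  # actives found in the top slice
    fp = n_selected - tp                  # decoys found in the top slice

    return {
        "n_selected": n_selected,
        "tp_rate": tp / n_actives,
        "fp_rate": fp / n_decoys,
        "enrichment": (tp / n_selected) / (n_actives / n_total),
    }

# Toy ranking of 10 compounds with 2 actives (hypothetical data).
toy = [True, False, False, True, False, False, False, False, False, False]
result = screening_metrics(toy, top_fraction=0.20)
print(result)
```

With 3999 compounds, the same rounding gives slices of 40 and 800 compounds for the top 1% and top 20%, which matches the denominators in the table.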
Poor docking accuracy
• true binders are not correctly docked
Poor scoring accuracy in ranking compounds
• high scores of decoys due to irrelevant polar interactions
Impossible identification of the true actives?
• acetylcholinesterase is a "difficult" target for docking
• half of the active compounds are similar to E20, and cannot be accurately docked
• the decoys are challenging
Expert intervention slightly increases the screening performance.