Workshop on docking and virtual screening: use of computational tools in drug design
Introduction to computational drug design
Federico Gago, Department of Pharmacology
What is a drug?
The word ‘drug’ means different things to different people. In fact, a drug is any chemical that can change the way a living creature functions. Drugs can be simple chemicals or complicated ones. They may be gas, liquid or solid. Drugs can be inhaled, swallowed, absorbed through the body surface, or injected. But all drugs work in the same way. They get very close to natural chemicals in living cells and alter what they do.
Digitalis purpurea L.
“Distilling in a Medicinal Garden” (1512)
Cardiotonic heteroside (structure with attached sugars)
The first modern chemotherapeutic agent: arsphenamine (606 / Salvarsan) for the treatment of syphilis and trypanosomiasis.
Compound no. 606 synthesized for testing!
Sahachiro Hata and Paul Ehrlich
Drug discovery 100 years ago
P. Ehrlich (1909)
“The discoveries of those uncivilized peoples represented the sum of limitless testing of thousands of natural materials. By contrast with their selection of medicines by pure chance, we have to find first certain compounds, for example some arsenic derivatives, which show at least a low degree of therapeutic effect. Once this is done through more or less laborious tests, the purely empirical screening is replaced by preparing chemical variations, homologs and other derivatives whose efficacy has to be tested. But even at best chemical drugs are not magic bullets, and will not always hit only the center of the target, that is the disease-causing organisms. Moreover, nothing is as simple as to ascertain the lethal or the maximal well-tolerated dose, and the curative dose in a given animal species. In humans the determination of dosages is infinitely more difficult as one has to start with low doses and increase them gradually until they become therapeutically active. This is further aggravated by the occurrence of congenital or acquired idiosyncrasies from most medicines… and it cannot justly be demanded that a decision be made within a few months as to the merits or demerits of such new agents.”
Gerhard Domagk (1895-1964) used Prontosil rubrum to protect mice and rabbits against lethal doses of staphylococci and streptococci.
J. Drews: “Drug discovery: a historical perspective”, Science 287, 1960 (2000)
Selective Optimization of Side Activities of drug molecules
antibacterial
endothelin ETA receptor ligand
Selective Optimization of Side Activities of drug molecules
Activity at the M1 receptor
Drug Discov. Today 2006, 11(3-4):160-164
antidepressant
partial agonist
What is a better drug? How can it be recognized?
“A better drug is obviously not a new molecule which injected in mice produces a paper”
Dr. Paul Adriaan Jan Janssen (1926-2003), Rev. Méd. Brux. I, 643-645 (1980)
Presented at the XV International Congress on Therapeutics, Brussels (Belgium), 5-9 September, 1979
"When a medicinal chemist synthesizes a compound that does something extraordinary to a biological system, this compound enters an elite class of chemicals and becomes classified as a drug."
T. P. Kenakin, “Pharmacological Analysis of Drug-Receptor Interaction”, 1987
MODERN DRUG DISCOVERY: iterative process of ‘make’ and ‘test’
Striking convergence of:
• GENOMICS
• COMBINATORIAL CHEMISTRY
• HIGH-THROUGHPUT SCREENING
In common:
• Handling of thousands of objects
• Dependence on miniaturization and lab automation
• Creation of information and data tidal waves
• Need for effective data management and mining
From the OMICS to systems biology
New concepts: “chemical” genetics and proteomics
transcriptome
COMPONENTS OF A BENZODIAZEPINE CHEMICAL LIBRARY
(one of the most notable pharmacophores in medicinal and pharmaceutical chemistry)
(Reaction scheme not reproduced: building blocks bearing variable R groups and a substituent X are combined into the benzodiazepine scaffold.)
COMBINATORIAL ORGANIC CHEMISTRY

BROAD SCREENING
• Chemical library of immense size
• Structural diversity as large as possible
• No specific initial structural goal
• Assorted building blocks
• Undefined reaction order
• Flexible synthetic strategy
• Spacer not crucial
• The ligand may not be detachable
• Evolution by simple selection

CHEMICAL ANALOGS / OPTIMIZATION
• Chemical library of modest size
• Relatively narrow structural diversity
• Specific structural goal
• Specific retro-combinatorial building blocks
• Specific order of combination
• Well-defined synthetic strategy
• Spacer crucial
• The ligand should be releasable
• Cumulative evolution of the selection
Automated uHTS system at Bristol-Myers Squibb
(1) Compound store
(2) Hit-picking robot
(3) 3456 reagent dispensing robot
(4) Transport
(5) Incubators
(6) Piezo-electric distribution robot
(7) Topology compensating plate reader
(8) 1536 reagent dispensing robot
(9) Automated plate replicating system
(10) High-capacity stacking system
Can in silico methods help?
HIT AND LEAD GENERATION: BEYOND HTS
Nature Reviews Drug Discovery 2, 369-378 (2003)
Genomics-Driven Drug Discovery
• Gene sequence databases and protein sequence databases
• Genomic sequences: Initial Gene Index (IGI) and Initial Protein Index (IPI)
  · Definition of intron-exon boundaries
  · Single-nucleotide polymorphisms
  · Sequence motif searching
  · RNA splicing
• Therapeutic targets
• Validated therapeutic targets
• Structure determination
  · Homology modeling
  · Binding site analysis
  · Determination of structural motifs
• 3-D protein structures
• Ligand binding sites
• Structure-based ligand design
• New candidates
Bailey, D. et al. Nat. Biotech. 2001, 19, 207-209
Proteins fold in such a way that they create specific sites that are the right size, shape, and polarity for their ligands.
folding
Antibodies selectively bind to antigens
Molecular complementarity
(Figure labels: heavy chain, light chain, phosphorylcholine)
Immunoglobulin McPC603 Fab-phosphorylcholine complex (2mcp.pdb)
LIGAND binding is highly selective
(Figure labels: ligand, non-covalent bonds, binding site, macromolecule)
The binding site in proteins is mainly determined by the amino acid side chains.
The complex formed between the ligand and the binding site is stabilized by non-covalent interactions.
hydrogen bond
cAMP
Binding sites: shape complementarity
... and electrostatic complementarity
Thermodynamic cycle for binding (figure): R + L → RL in vacuo (ΔGRL) and in water (ΔGRLsolv); the two processes are connected by the solvation free energies of the receptor (ΔGRsolv), the ligand (ΔGLsolv) and the complex (ΔGRLsolv).
"DelPhi - A Macromolecular Electrostatics Modelling Package":
Kim A. Sharp, Anthony Nicholls & Barry Honig
Department of Biochemistry and Molecular Biophysics, Columbia University, New York
- Klapper, I.; Hagstrom, R.; Fine, R.; Sharp, K.; Honig, B. “Focusing of Electric Fields in the Active Site of Cu-Zn Superoxide Dismutase: Effects of Ionic Strength and Amino-acid Modification.” Proteins (1986) 1, 47-59.
- Gilson, M. K.; Sharp, K. A.; Honig, B. H. “Calculating the Electrostatic Potential of Molecules in Solution: Method and Error Assessment.” J. Comput. Chem. (1987) 9, 327-335.
- Gilson, M. K.; Honig, B. “Calculation of the Total Electrostatic Energy of a Macromolecular System: Solvation Energies, Binding Energies, and Conformational Analysis.” Proteins (1988) 4, 7-18.
- Sharp, K.; Honig, B. “Electrostatic Interactions in Macromolecules: Theory and Applications.” Ann. Rev. Biophys. Biophys. Chem. (1990) 19, 301-332.
- Nicholls, A.; Honig, B. “A Rapid Finite Difference Algorithm, Utilizing Successive Over-Relaxation to Solve the Poisson-Boltzmann Equation.” J. Comput. Chem. (1991) 12, 435-445.
The original reference to the use of the finite difference method for macromolecular electrostatics is: J. Warwicker and H. C. Watson, J. Mol. Biol. (1982) 157, 671.
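Poisson equation: $\nabla^2 \phi(\mathbf{r}) = -\dfrac{4\pi\rho(\mathbf{r})}{\varepsilon}$
Poisson-Boltzmann equation: $\nabla \cdot \left[ \varepsilon(\mathbf{r}) \nabla \phi(\mathbf{r}) \right] - k'(\mathbf{r})\,\phi(\mathbf{r}) = -4\pi\rho(\mathbf{r})$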
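As a rough illustration of the finite-difference idea behind the references above, the sketch below solves the Poisson equation in one dimension with successive over-relaxation (SOR). It is a toy under stated assumptions, not the DelPhi algorithm: the grid is 1D, the dielectric constant is uniform, the potential is fixed at zero on both boundaries, and the function name and parameter values are illustrative.

```python
import math


def solve_poisson_1d(rho, h, eps=1.0, omega=1.8, tol=1e-12, max_iter=100_000):
    """Solve d2(phi)/dx2 = -4*pi*rho/eps on a 1D grid with phi = 0 at both ends.

    rho   : list of charge densities at the grid points (Gaussian units)
    h     : grid spacing
    omega : over-relaxation factor (1 < omega < 2)
    """
    n = len(rho)
    phi = [0.0] * n
    for _ in range(max_iter):
        largest_change = 0.0
        for i in range(1, n - 1):
            # Value that satisfies the finite-difference equation at point i
            target = 0.5 * (phi[i - 1] + phi[i + 1]
                            + h * h * 4.0 * math.pi * rho[i] / eps)
            # Over-relax: move past the Gauss-Seidel value by a factor omega
            updated = phi[i] + omega * (target - phi[i])
            largest_change = max(largest_change, abs(updated - phi[i]))
            phi[i] = updated
        if largest_change < tol:
            break
    return phi


# Uniform charge density in a water-like medium (eps = 80), box length 1
n_points, spacing = 21, 0.05
potential = solve_poisson_1d([1.0] * n_points, spacing, eps=80.0)
```

For a uniform charge density the exact solution is a parabola, which the three-point finite-difference stencil reproduces without discretization error, so the result can be checked against the analytic value at the midpoint.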
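(Figure: electrostatic contributions to binding. Receptor and ligand, each with internal dielectric ε0 in a solvent of dielectric εs, pay desolvation penalties ΔΔGRsolv and ΔΔGLsolv and gain the receptor-ligand interaction energy ΔGRLint.)
(Figure panels: ligand in solution; protein binding pocket with water molecules; ligand buried in a partially desolvated binding pocket.)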
Conclusion: Any method that attempts to model ligand binding also has to consider the desolvation energy and the entropic contributions to the binding process.
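The thermodynamic cycle shown earlier can be written as a single relation. This is the standard cycle result; the label "bind" and the superscripts "vac" and "water" are notation introduced here to follow the figure, not symbols taken from the slides.

```latex
\Delta G_{\mathrm{bind}}^{\mathrm{water}}
  = \Delta G_{RL}^{\mathrm{vac}}
  + \Delta G_{\mathrm{solv}}^{RL}
  - \Delta G_{\mathrm{solv}}^{R}
  - \Delta G_{\mathrm{solv}}^{L}
```

Binding in water equals binding in vacuo corrected by the difference between solvating the complex and solvating the separated partners, which is where the desolvation penalty enters.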
ALGORITHMS FOR ENERGY MINIMIZATION AND MOLECULAR DYNAMICS
Input:
• Atomic positions (coordinate file)
• Covalent structure (topology file)
• Potential energy function (parameter file)
• Additional atoms (hydrogens; heterogroups; solvent; counterions)
• Special features (periodic boundary conditions; constant pressure; constant temperature)
Computed quantities:
• Total potential energy
• Forces on each atom
• Effective temperature
• Atomic velocities
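The chain positions → potential energy → forces → velocities can be sketched with a velocity Verlet integrator. This is a minimal toy, assuming a single particle on a one-dimensional harmonic "bond" with made-up force constant, mass and time step in arbitrary units; real molecular dynamics codes read these from the topology and parameter files listed above.

```python
def harmonic_energy_force(x, k=100.0, x0=1.0):
    """Potential energy and force for E = 0.5 * k * (x - x0)**2."""
    dx = x - x0
    return 0.5 * k * dx * dx, -k * dx


def velocity_verlet(x, v, mass=1.0, dt=0.001, steps=1000):
    """Propagate position and velocity; return them with the total energy."""
    potential, force = harmonic_energy_force(x)
    for _ in range(steps):
        v += 0.5 * dt * force / mass                   # half-step velocity update
        x += dt * v                                    # full-step position update
        potential, force = harmonic_energy_force(x)    # forces at new position
        v += 0.5 * dt * force / mass                   # second half-step
    kinetic = 0.5 * mass * v * v
    return x, v, potential + kinetic


# Start 0.1 length units away from equilibrium, at rest: total energy is 0.5
final_x, final_v, total_energy = velocity_verlet(x=1.1, v=0.0)
```

With a time step that is small compared with the oscillation period, the total energy stays close to its initial value, which is the usual sanity check for an integrator.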
EXPLICIT TREATMENT OF THE SOLVENT
• Periodic boundary conditions: solute in a periodic box
• Truncated octahedron
• Water “shells” or “caps”
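Under periodic boundary conditions, distances are usually measured to the nearest periodic image. The sketch below shows this minimum-image convention for one coordinate; it assumes a rectangular box (it does not cover the truncated octahedron), and the function name is illustrative.

```python
def minimum_image(delta, box_length):
    """Wrap a coordinate difference into [-box_length/2, +box_length/2]."""
    return delta - box_length * round(delta / box_length)


# Two atoms 9 units apart in a 10-unit box are only 1 unit apart
# through the periodic boundary.
wrapped = minimum_image(9.0, 10.0)
```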
Cambridge Structural Database
The Cambridge Crystallographic Data Centre (CCDC) builds, maintains and distributes the Cambridge Structural Database (CSD), a searchable database of organic and metallo-organic crystal structures.
The CCDC also produces and distributes software products which make use of the data contained in the CSD.
>272,000 organic and metallo-organic crystal structures analysed using X-ray or neutron diffraction techniques
ConQuest provides a full range of text/numeric database search options, in addition to more complex search functionality, including:
• Chemical substructure searching
• Geometrical searching
• Intermolecular non-bonded contact searching
Cambridge Crystallographic Data Centre: http://www.ccdc.cam.ac.uk/
CSD entry DOXSIS (nicotine monohydrogen iodide) in FDAT format:
#DOXSIS 33870428 16 9 0 0 0 4 4 28 0 0 30132200000010000000000086137050 99740 84360 90 90 90444000 0 0 0 0 0 0 0168 19P212121 440R=0.0180 211 0121 0112 0211 6101 6110 0011 0121 6110 6011 6101 0112 6C 68H 23I 140N 68
I1 11550 15650 4210 N1 39070 -53910 121530 C1 41910 -41220 119220
C2 39220 -33520 106130 C3 33370 -39370 94620 C4 30500 -52630 96620
C5 33450 -59370 110270 C6 42610 -19100 105490 C7 34670 -8430 103700
C8 39850 3750 96340 C9 49810 -1250 90740 N2 49050 -16310 91260
C10 58640 -23330 91710 H1 45630 -37430 127800 H2 32010 -34830 84940
H3 27050 -56910 88930 H4 31160 -69540 111430 H5 46660 -17710 114730
H6 29460 -11520 97920 H7 31120 -6480 114010 H8 36550 6990 88960
H9 41090 10990 105490 H10 51390 1190 80020 H11 54700 1150 97280
H12 45690 -18690 82150 H13 61800 -20270 102020 H14 57540 -33240 92420
H15 61740 -21060 83660
0 3 4 5 6 7 2 4 8 910 812 3 5 6 7 8 9 910101111121313131112

The same entry converted to PDB format:
HEADER CSD ENTRY DOXSIS
COMPND NICOTINE MONOHYDROGEN IODIDE
CRYST 13.705 9.974 8.436 90.00 90.00 90.00 P212121
ATOM 1 I1 NICO 1 1.583 1.561 0.355 1.00 0.00
ATOM 2 N1 NICO 1 5.355 -5.377 10.252 1.00 0.00
ATOM 3 C1 NICO 1 5.744 -4.111 10.057 1.00 0.00
ATOM 4 C2 NICO 1 5.375 -3.343 8.953 1.00 0.00
ATOM 5 C3 NICO 1 4.573 -3.927 7.982 1.00 0.00
ATOM 6 C4 NICO 1 4.180 -5.249 8.151 1.00 0.00
ATOM 7 C5 NICO 1 4.584 -5.922 9.302 1.00 0.00
ATOM 8 C6 NICO 1 5.840 -1.905 8.899 1.00 0.00
ATOM 9 C7 NICO 1 4.752 -0.841 8.748 1.00 0.00
ATOM 10 C8 NICO 1 5.461 0.374 8.127 1.00 0.00
ATOM 11 C9 NICO 1 6.826 -0.125 7.655 1.00 0.00
ATOM 12 N2 NICO 1 6.722 -1.627 7.699 1.00 0.00
....
CONECT 1 0
CONECT 2 3 7
CONECT 3 2 4 14
....
MASTER 0 0 0 0 0 0 0 0 28 0 28 0
END
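The FDAT coordinates and the PDB ATOM records are related by a simple scaling: the FDAT integers read as fractional coordinates multiplied by 10^5, and for this orthorhombic cell (all angles 90°) each Cartesian coordinate is the fraction times the corresponding cell length. The sketch below checks this against the first two atoms; the function name is illustrative, and the conversion as written would not hold for non-orthogonal cells.

```python
CELL = (13.705, 9.974, 8.436)  # a, b, c in angstroms, from the CRYST record


def fractional_to_cartesian(frac_times_1e5, cell=CELL):
    """Convert FDAT-style integer fractional coordinates to angstroms.

    Valid only for orthogonal cells (alpha = beta = gamma = 90 degrees).
    """
    return tuple(round(f * 1e-5 * length, 3)
                 for f, length in zip(frac_times_1e5, cell))


iodine = fractional_to_cartesian((11550, 15650, 4210))
nitrogen_1 = fractional_to_cartesian((39070, -53910, 121530))
```

The results agree with the ATOM 1 (I1) and ATOM 2 (N1) records of the PDB version of the entry.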
Cambridge Crystallographic Data Centre
nicotine
http://www.ccdc.cam.ac.uk/products/csd_system/webcsd/
Substructure and Similarity Searches
New browsing interface with integrated visualiser
The Importance of Chirality
The Importance of Stereochemistry
Stereoisomers and Biochemistry: different flavours
Stereoisomers and Pharmacology: drug effectiveness can be a function of the particular enantiomer that is present
Figure captions (structures not reproduced):
• antibiotic properties
• hormonal activity
• >100 times more effective as antiarrhythmic than the (–)-enantiomer
• used to treat tuberculosis; the (–)-enantiomer causes blindness
Increasing the Value of Crystallographic Databases
• Knowledge of intermolecular interactions
• Knowledge-based application programs
• Data mining tools for ligand-receptor complexes
http://www.3dchem.com/atoz.asp
Links: Molecules of the Month, A to Z Index of Structures, Top 50 Prescription Medicines, Gallery, Library of Inorganic Structures (over 1600 structures), Interactive 3D Periodic Table, and Search 3Dchem.com
http://www.chemspider.com/
ChemSpider is a free access service providing a structure centric community for chemists. Providing access to millions of chemical structures and integration to a multitude of other online services, ChemSpider is the richest single source of structure-based chemistry information.
http://pubchem.ncbi.nlm.nih.gov/
http://pubchem.ncbi.nlm.nih.gov/vw3d/vw3d.cgi?
Quantum Chemistry
Atomic orbitals can be combined to give molecular orbitals
Molecular orbitals of water
Oxygen orbitals
Hydrogen orbitals
Ab initio METHODS
* Hartree-Fock method
* Electron correlation methods
  - variational methods
      Configuration Interaction with double excitations (CID)
      Configuration Interaction with single and double excitations (CISD)
  - perturbation methods
      Møller and Plesset (MP2, MP3, MP4)
  - Quadratic Configuration Interaction method (QCISD)
  - density functional methods (DFT)
      BP86 - developed by Becke and Perdew in 1986
      BLYP - developed by Becke, Lee, Yang and Parr
      B3LYP - a modification of BLYP in which a 3-parameter functional developed by Axel Becke is used
General Atomic and Molecular Electronic Structure System (GAMESS): http://www.msg.ameslab.gov/GAMESS/GAMESS.html
Gaussian: http://www.gaussian.com/
Spartan: http://www.wavefun.com/
Some sample Gaussian z-matrices

Water (C2v), with variables:
o
h 1 l1
h 1 l1 2 a1

l1 0.96
a1 104.0

Water (C2v), with values:
o
h 1 0.96
h 1 0.96 2 104.0

Ethane (D3d):
c
c 1 l1
h 2 l2 1 a1
h 2 l2 1 a1 3 120.0
h 2 l2 1 a1 3 -120.0
h 1 l2 2 a1 3 180.0
h 1 l2 2 a1 6 120.0
h 1 l2 2 a1 6 -120.0

l1 1.54
l2 1.09
a1 110.0
SEMI-EMPIRICAL METHODS: levels of approximation
CNDO     Complete Neglect of Differential Overlap (developed by John Pople; assumes atomic orbitals to be spherical when evaluating the two-electron integrals)
INDO     Intermediate Neglect of Differential Overlap
NDDO     Neglect of Diatomic Differential Overlap
MINDO/3  Modified INDO (developed by Michael Dewar; uses a set of parameters to approximate the two-electron repulsion integrals)
ZINDO    Includes parameters for transition metals
MNDO     Modified Neglect of Diatomic Overlap (developed by Michael Dewar and Walter Thiel in 1977)
AM1      Austin Model 1 (developed by Michael Dewar and Andrew Holder in 1986)
PM3      Parametric Model 3 (developed by Jimmy Stewart in 1988)
Sample input for MOPAC: Water (C2v)

MOPAC input as a Z-matrix:
PM3 EF PRECISE
H2O (water)
MOPAC input as a Z-matrix
O
H 0.96000 1 1
H 0.96000 1 104.00000 1 1 2

MOPAC input in Cartesian coordinates:
AM1 EF PRECISE
H2O (water)
MOPAC input in Cartesian coordinates
O 0.0000 0 0.0000 0 0.0000 0
H 0.9600 1 0.0000 0 0.0000 0
H -0.2322 1 0.9315 1 0.0000 0
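The Z-matrix and Cartesian inputs for water describe the same geometry. A minimal sketch of the conversion for this three-atom case follows; it assumes the oxygen at the origin, the first hydrogen on the x axis and the molecule in the xy plane, and the function name is illustrative. A general converter would also need to handle dihedral angles.

```python
import math


def water_from_zmatrix(bond_length, angle_deg):
    """Cartesian coordinates (angstroms) of O, H, H from O-H length and H-O-H angle."""
    angle = math.radians(angle_deg)
    oxygen = (0.0, 0.0, 0.0)
    hydrogen_1 = (bond_length, 0.0, 0.0)
    # Second hydrogen: same bond length, rotated by the H-O-H angle in the xy plane
    hydrogen_2 = (bond_length * math.cos(angle),
                  bond_length * math.sin(angle),
                  0.0)
    return oxygen, hydrogen_1, hydrogen_2


oxygen, hydrogen_1, hydrogen_2 = water_from_zmatrix(0.96, 104.0)
```

Rounded to four decimals, the second hydrogen lands at the coordinates given in the Cartesian MOPAC input.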
Typical flow charts for an ab initio optimization and a corresponding semi-empirical calculation

Ab initio optimization:
1. Read input
2. Calculate geometry
3. Assign basis set
4. Calculate integrals
5. Initial guess (1st cycle)
6. Self-consistent field iterations
7. Calculate atomic forces
8. If not optimized: calculate new geometry and repeat (later cycles)
9. If optimized: population analysis

Semi-empirical calculation:
1. Read input
2. Calculate geometry
3. Assign basis set
4. Assign parameters
5. Self-consistent field iterations
6. Calculate atomic forces
7. If not optimized: calculate new geometry and repeat
8. If optimized: population analysis
MOLDEN: a pre- and post-processing program of molecular and electronic structure
http://www.cmbi.kun.nl/~schaft/molden/molden.html
• reads all the required information from the GAMESS / GAUSSIAN / MOPAC output files, and is also capable of importing lots of other formats (ChemX, PDB, etc.)
• displays Molecular Density and Molecular Orbitals
• supports contour plots, 3-d grid plots with hidden lines and a combination of both
• can write a variety of graphics instructions: postscript, XWindows, VRML, povray, OpenGL, Tektronix 4014, hpgl, hp2392 and Figure
• can animate reaction paths and molecular vibrations
• can calculate and display the true or Multipole Derived Electrostatic Potential; atomic charges can be fitted to the Electrostatic Potential calculated on a Connolly surface
• has a powerful Z-matrix editor which gives full control over the geometry and allows you to build molecules from scratch, including polypeptides
G. Schaftenaar and J.H. Noordik, "Molden: a pre- and post-processing program for molecular and electronic structures", J. Comput.-Aided Mol. Design, 14, 123-134 (2000)
SMILES: Simplified Molecular Input Line Entry Specification
Rules
1. Atoms are represented by atomic symbols: B, C, N, O, F, P, S, Cl, Br, and I.
2. Double bonds are `=', triple bonds are `#'.
3. Branching is indicated by parentheses.
4. Ring closures are indicated by pairs of matching digits.
Examples
SMILES      Name          Remark
C           methane       hydrogens fill normal valence
CCO         ethanol       a single bond is assumed to join adjacent atoms
CC(=O)O     acetic acid   parentheses are used to indicate branching
C1CCCCC1    cyclohexane   ring-closure bonds are represented by pairs of matching digits
http://www.daylight.com/dayhtml/doc/theory/theory.smiles.html
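The four rules are enough to turn the example strings into a list of atoms and bonds. The sketch below is a deliberately small parser covering only those rules (no aromatic lowercase atoms, charges, bracket atoms or stereochemistry); the function and variable names are illustrative and it is not a substitute for a full SMILES toolkit.

```python
ORGANIC_SUBSET = ("Cl", "Br", "B", "C", "N", "O", "F", "P", "S", "I")
BOND_ORDER = {"=": 2, "#": 3}


def parse_smiles(smiles):
    """Return (atoms, bonds); each bond is (atom_index, atom_index, order)."""
    atoms, bonds = [], []
    branch_stack = []    # atoms to return to when a branch closes
    open_rings = {}      # ring-closure digit -> atom index that opened it
    previous, order = None, 1
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch in BOND_ORDER:              # rule 2: explicit double/triple bond
            order = BOND_ORDER[ch]
            i += 1
        elif ch == "(":                   # rule 3: branch opens
            branch_stack.append(previous)
            i += 1
        elif ch == ")":                   # rule 3: branch closes
            previous = branch_stack.pop()
            i += 1
        elif ch.isdigit():                # rule 4: ring-closure digit
            if ch in open_rings:
                bonds.append((open_rings.pop(ch), previous, order))
            else:
                open_rings[ch] = previous
            order = 1
            i += 1
        else:                             # rule 1: atomic symbol
            symbol = next((s for s in ORGANIC_SUBSET
                           if smiles.startswith(s, i)), None)
            if symbol is None:
                raise ValueError(f"unsupported character {ch!r} at position {i}")
            atoms.append(symbol)
            index = len(atoms) - 1
            if previous is not None:
                bonds.append((previous, index, order))
            previous, order = index, 1
            i += len(symbol)
    return atoms, bonds


acid_atoms, acid_bonds = parse_smiles("CC(=O)O")
ring_atoms, ring_bonds = parse_smiles("C1CCCCC1")
```

Acetic acid yields four heavy atoms and three bonds (one of them double); cyclohexane yields six atoms and six bonds, the sixth coming from the matching ring-closure digits.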
CORINA: Automatic generation of three-dimensional atomic COoRdINAtes
http://www.molecular-networks.com/online_demos/corina_demo.html
http://chem.sis.nlm.nih.gov/chemidplus/
BABEL: A program designed to interconvert a number of file formats currently used in molecular modelling

Input type codes:
alc -- Alchemy file                    macmod -- Macromodel file
prep -- AMBER PREP file                micro -- Micro World file
bs -- Ball and Stick file              mm2in -- MM2 Input file
bgf -- MSI BGF file                    mm2out -- MM2 Output file
car -- Biosym .CAR file                mm3 -- MM3 file
boog -- Boogie file                    mmads -- MMADS file
caccrt -- Cacao Cartesian file         mdl -- MDL MOLfile file
cadpac -- Cambridge CADPAC file        molen -- MOLIN file
charmm -- CHARMm file                  mopcrt -- Mopac Cartesian file
c3d1 -- Chem3D Cartesian 1 file        mopint -- Mopac Internal file
c3d2 -- Chem3D Cartesian 2 file        mopout -- Mopac Output file
cssr -- CSD CSSR file                  pcmod -- PC Model file
fdat -- CSD FDAT file                  pdb -- PDB file
gstat -- CSD GSTAT file                psin -- PS-GVB Input file
dock -- Dock Database file             psout -- PS-GVB Output file
dpdb -- Dock PDB file                  msf -- Quanta MSF file
feat -- Feature file                   schakal -- Schakal file
fract -- Free Form Fractional file     shelx -- ShelX file
gamout -- GAMESS Output file           smiles -- SMILES file
gzmat -- Gaussian Z-Matrix file        spar -- Spartan file
gauout -- Gaussian 92 Output file      semi -- Spartan Semi-Empirical file
g94 -- Gaussian 94 Output file         spmm -- Spartan Mol. Mechanics file
gr96A -- GROMOS96 (A) file             mol -- Sybyl Mol file
gr96N -- GROMOS96 (nm) file            mol2 -- Sybyl Mol2 file
hin -- Hyperchem HIN file              wiz -- Conjure file
sdf -- MDL Isis SDF file               unixyz -- UniChem XYZ file
m3d -- M3D file                        xyz -- XYZ file
macmol -- Mac Molecule file            xed -- XED file

Output type codes:
diag -- DIAGNOTICS file                i -- IDATM file
t -- Alchemy file                      macmol -- Mac Molecule file
bs -- Ball and Stick file              k -- Macromodel file
bmin -- Batchmin Command file          micro -- Micro World file
caccrt -- Cacao Cartesian file         mi -- MM2 Input file
cacint -- Cacao Internal file          mo -- MM2 Ouput file
cache -- CAChe MolStruct file          mm3 -- MM3 file
c3d1 -- Chem3D Cartesian 1 file        mmads -- MMADS file
c3d2 -- Chem3D Cartesian 2 file        mdl -- MDL Molfile file
d -- ChemDraw Conn. Table file         ac -- Mopac Cartesian file
con -- Conjure file                    ai -- Mopac Internal file
contmp -- Conjure Template file        pc -- PC Model file
cssr -- CSD CSSR file                  p -- PDB file
feat -- Feature file                   report -- Report file
fhz -- Fenske-Hall ZMatrix file        spar -- Spartan file
gamin -- Gamess Input file             mol -- Sybyl Mol file
gcart -- Gaussian Cartesian file       mol2 -- Sybyl Mol2 file
g -- Gaussian Z-matrix file            maccs -- MDL Maccs file
gotmp -- Gaussian Z-matrix tmplt file  xed -- XED file
hin -- Hyperchem HIN file              unixyz -- UniChem XYZ file
icon -- Icon 8 file                    x -- XYZ file

ftp://ccl.osc.edu/pub/chemistry/software/UNIX/babel/

Open Babel is a community-driven scientific project including both cross-platform programs and a developer library designed to support molecular modeling, chemistry, and many related areas, including interconversion of file formats and data.
http://openbabel.sourceforge.net/wiki/Main_Page
http://ligand-expo.rcsb.org/
Ligand Explorer
http://relibase.ccdc.cam.ac.uk/
http://relibase.ebi.ac.uk/
http://relibase.rutgers.edu/
Binding Site Superposition
Analysis of 3D searches
Benzamidine-Carboxylate Interactions
Distance Distribution
Torsion Distribution
LIGPLOT: A program for automatically plotting protein-ligand interactions
PDB: 8gch
http://www.biochem.ucl.ac.uk/bsm/ligplot/ligplot.html
PoseView: http://poseview.zbh.uni-hamburg.de/poseview
Complex of Estrogen Receptor Alpha and Diethylstilbestrol
PDB code: 3ERD
Stierand K, Maass PC, Rarey M. Molecular complexes at a glance: automated generation of two-dimensional complex diagrams. Bioinformatics 22(14):1710-6 (2006)
Crystal Structure of Antagonizing Mutant 536S of the Estrogen Receptor Alpha Ligand Binding Domain Complexed to Raloxifene
PDB code: 2QXS
The Binding Database
http://www.bindingdb.org/bind/index.jsp
The BindingDB is a public, web-accessible database of measured binding affinities for biomolecules, genetically or chemically modified biomolecules, and synthetic compounds. The database currently contains data generated by isothermal titration calorimetry (ITC) and enzyme inhibition methods; other techniques will be included in the future.
BindingDB Affinity Statistics
Liu, T., Lin, Y., Wen, X., Jorissen, R.N. and Gilson, M.K. BindingDB: a web-accessible database of experimentally determined protein-ligand binding affinities. Nucleic Acids Research 35:D198-D201 (2007)
http://sw16.im.med.umich.edu/databases/pdbbind/index.jsp
http://pc1664.pharmazie.uni-marburg.de/affinity/
ChemBank is intended to guide chemists synthesizing novel compounds or libraries, to assist biologists searching for small molecules that perturb specific biological pathways, and to catalyze the process by which drug hunters discover new and effective medicines.
http://chembank.broad.harvard.edu/
Affinity vs. Specificity
Ligand + Receptor ⇌ Ligand-Receptor (forward rate constant kon, reverse rate constant koff)
$K_d = \dfrac{k_{off}}{k_{on}} = \dfrac{[\mathrm{Ligand}]\,[\mathrm{Receptor}]}{[\mathrm{Ligand\text{-}Receptor}]}$
$\Delta G = \Delta H - T\Delta S$
$\Delta G = 2.303\,RT \log K_d$

Binding Energy, ΔG (kcal/mol)    Binding Constant, ΔKd
0.5                              2x
1.0                              5x
1.5                              13x
2.0                              29x
2.5                              68x
3.0                              158x

Ajay, Murcko MA. Computational methods to predict binding free energy in ligand-receptor complexes. J. Med. Chem. 38, 4953-4967 (1995)
Narrow window of activity
Best ligand from virtual screening: Kd ∼ 50 nM
ΔΔG = 4.5 kcal/mol
Experimental detection limit: Kd ∼ 100 μM
Tirado-Rives J, Jorgensen WL. Contribution of conformer focusing to the uncertainty in predicting free energies for protein-ligand binding. J Med Chem. 49(20):5880-4 (2006)
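The energy-to-affinity table and the 4.5 kcal/mol window both follow from ΔG = 2.303 RT log Kd. The sketch below reproduces them; the temperature of 298.15 K is an assumption (the slides do not state it), chosen because it matches the tabulated fold changes, and the function names are illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fold_change_in_kd(ddg_kcal, temperature=298.15):
    """Fold change in Kd that corresponds to a free energy difference."""
    return math.exp(ddg_kcal / (R_KCAL * temperature))


def ddg_from_kd_ratio(kd_ratio, temperature=298.15):
    """Free energy difference (kcal/mol) that corresponds to a ratio of Kd values."""
    return R_KCAL * temperature * math.log(kd_ratio)


table = [round(fold_change_in_kd(x)) for x in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0)]
window = ddg_from_kd_ratio(100e-6 / 50e-9)  # detection limit vs. best ligand
```

The 2000-fold gap between a 100 μM detection limit and a 50 nM best ligand corresponds to about 4.5 kcal/mol, so scoring errors of a few kcal/mol span most of the usable range.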
Similar ligands decompose differently into enthalpic and entropic binding contributions.
Two closely related thrombin inhibitors
BINANA (BINding ANAlyzer): http://www.nbcr.net/binana/
- a python-implemented algorithm for analyzing ligand binding.
- it identifies key binding characteristics like hydrogen bonds, salt bridges, and pi interactions.
Input: receptor and ligand files in the PDBQT format (free PDB → PDBQT converter in AutoDockTools)
Output:
- close contacts
- electrostatic interactions
- hydrophobic contacts
- hydrogen bonds
- salt bridges
- π interactions
Molecular model building, geometry optimization, and energy calculations: Molecular Mechanics
Also a scoring function for docking
$E_{pot} = E_{bonded} + E_{non\text{-}bonded}$
$E_{bonded} = \sum_i E_{bond} + \sum_i E_{angle} + \sum_i E_{dihedral}$
$E_{non\text{-}bonded} = \sum_i E_{electrostatic} + \sum_i E_{van\ der\ Waals}$
TINKER's "Molecular Mechanics" logo. Illustration by Jay Nelson. Courtesy of Prof. Robert Paine, Chemistry Dept., Univ. of New Mexico.
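BONDING TERMS
$E_{bond} = \dfrac{1}{2} \sum_{bonds} k_b \,(b - b_0)^2$
$E_{angle} = \dfrac{1}{2} \sum_{angles} k_\theta \,(\theta - \theta_0)^2$
$E_{dihedral} = \dfrac{1}{2} \sum_{dihedrals} k_d \,[\,1 + \cos(\phi - \phi_0)\,]$
NON-BONDING TERMS
$E_{electrostatic} = \dfrac{1}{4\pi\varepsilon_0} \sum_{ij} \dfrac{q_i q_j}{\varepsilon\, r_{ij}}$
$E_{Lennard\text{-}Jones} = \sum_{ij} \left( \dfrac{A_{ij}}{r_{ij}^{12}} - \dfrac{B_{ij}}{r_{ij}^{6}} \right)$
Charges of the same sign (+ +) give a repulsive electrostatic term; charges of opposite sign (+ –) give an attractive one.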
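Each term of the force field can be written as a one-line function. The sketch below mirrors the bonding and non-bonding expressions term by term; the function names are illustrative, the inputs are single interactions rather than sums over a molecule, and the prefactor 332.06 kcal·Å/(mol·e²) is the usual conversion of 1/(4πε0) for charges in electron units and distances in Å, an assumption about units that the slides do not state.

```python
import math

COULOMB_KCAL = 332.06  # 1/(4*pi*eps0) in kcal*angstrom/(mol*e^2)


def bond_energy(b, b0, k_b):
    return 0.5 * k_b * (b - b0) ** 2


def angle_energy(theta, theta0, k_theta):
    return 0.5 * k_theta * (theta - theta0) ** 2


def dihedral_energy(phi, phi0, k_d):
    """Angles in radians."""
    return 0.5 * k_d * (1.0 + math.cos(phi - phi0))


def electrostatic_energy(q_i, q_j, r_ij, eps=1.0):
    return COULOMB_KCAL * q_i * q_j / (eps * r_ij)


def lennard_jones_energy(a_ij, b_ij, r_ij):
    return a_ij / r_ij ** 12 - b_ij / r_ij ** 6


# The Lennard-Jones minimum sits at r = (2A/B)^(1/6) with depth -B^2/(4A)
A, B = 1.0e6, 1.0e3   # hypothetical parameters
r_min = (2.0 * A / B) ** (1.0 / 6.0)
well_depth = lennard_jones_energy(A, B, r_min)
```

The total potential energy is then the sum of these functions over all bonds, angles, dihedrals and non-bonded atom pairs.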
LIGAND EFFICIENCY INDICES
Ligand efficiency (LE): $\Delta g = \dfrac{\Delta G}{N}$, with $\Delta G = RT \ln K_i$
N = number of non-hydrogen atoms
The units of LE (Δg) are kcal/mol per non-hydrogen atom.
For reference purposes it is useful to note that for Ki = 1 nM at 300 K, a compound has a binding energy of −12.4 kcal/mol.
Thus, a 1 nM compound consisting of 25 non-hydrogen atoms will have a ligand efficiency (LE) of ~ −0.5 kcal/mol per non-hydrogen atom.
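The reference numbers can be checked in a few lines. The sketch uses ΔG = RT ln Ki, the sign convention that reproduces the −12.4 kcal/mol quoted above for Ki = 1 nM at 300 K; the function names are illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(k_i_molar, temperature=300.0):
    """Binding free energy in kcal/mol from an inhibition constant in mol/L."""
    return R_KCAL * temperature * math.log(k_i_molar)


def ligand_efficiency(k_i_molar, n_heavy_atoms, temperature=300.0):
    """Binding free energy per non-hydrogen atom."""
    return binding_free_energy(k_i_molar, temperature) / n_heavy_atoms


delta_g = binding_free_energy(1e-9)   # 1 nM inhibitor
le = ligand_efficiency(1e-9, 25)      # 25 non-hydrogen atoms
```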
LIGAND EFFICIENCY INDICES
Plot of only the most potent inhibitors as a function of the number of heavy atoms (HA). The ‘maximal affinities’ as measured by pIC50 increase rapidly up to 20 heavy atoms, but plateau beyond 25.
LIGAND EFFICIENCY INDICES
Ligand efficiency as a function of heavy atoms for the Ki dataset is shown in the red circles.
MODIFIED LIGAND BINDING EFFICIENCY INDICES
Percentage efficiency index: $PEI = \dfrac{\%\ \text{inhibition at a given [compound]}}{\text{molecular weight}}$ (e.g. 1-30 μM)
Binding efficiency index: $BEI = \dfrac{pK_i,\ pK_d\ \text{or}\ pIC_{50}}{\text{molecular weight}}$
Surface-binding efficiency index: $SEI = \dfrac{pK_i,\ pK_d\ \text{or}\ pIC_{50}}{\text{polar surface area}}$

LEI reference values:
• Percentage inhibition of 50% at a given compound concentration = 0.5 on a 0−1 scale
• Compound (inhibitor) concentration: e.g. 10 μM
• Molecular weight = 0.333 kDa (mean value of marketed oral drugs)
• Ki, Kd or IC50 = 1.0 nM
• pKi = 9.0
• van der Waals PSA = 50 Å2 (normalized to 100 Å2)
PEI = 1.5; BEI = 27.0; SEI = 18.0
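The three indices and their reference values can be reproduced directly from the definitions. The sketch assumes the units given in the reference list (molecular weight in kDa, inhibition as a fraction between 0 and 1, polar surface area in Å² normalized to 100 Å²); the function names are illustrative.

```python
def percentage_efficiency_index(fraction_inhibition, mw_kda):
    """PEI: fractional inhibition (0-1) divided by molecular weight in kDa."""
    return fraction_inhibition / mw_kda


def binding_efficiency_index(p_affinity, mw_kda):
    """BEI: pKi, pKd or pIC50 divided by molecular weight in kDa."""
    return p_affinity / mw_kda


def surface_efficiency_index(p_affinity, psa_square_angstrom):
    """SEI: pKi, pKd or pIC50 divided by polar surface area in units of 100 A^2."""
    return p_affinity / (psa_square_angstrom / 100.0)


pei = percentage_efficiency_index(0.5, 0.333)
bei = binding_efficiency_index(9.0, 0.333)
sei = surface_efficiency_index(9.0, 50.0)
```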
THE LIGAND-RECEPTOR COMPLEX: THE “DOCKING” PROBLEM
3D STRUCTURES FOR THE LIGAND
• Experimental databases
• Systematic search
• Molecular mechanics
• CONCORD, WIZARD, CORINA...
3D STRUCTURES FOR THE RECEPTOR
• X-ray crystallography
• NMR spectroscopy
• Homology modeling
LIGAND-RECEPTOR DOCKING
HIV-1 Protease Enzyme-Inhibitor Complex (figure label: inhibitor)
Validated Targets + Combinatorial Chemistry Libraries = Ligand-Receptor Complexes (THE “DOCKING” PROBLEM)
What is Docking?
• “Best ways to put two molecules together.”
• Three steps:
  (1) Definition of the structure of the target molecule.
  (2) Location of the binding site.
  (3) Determination of the binding mode.

What is Docking?
• “Best ways to put two molecules together.”
  - Need to quantify or rank solutions
  - Scoring function or force field
• “Best ways to put two molecules together.”
  - (plural) Experimental structure may be amongst one of several predicted solutions
• “Best ways to put two molecules together.”
  - Need a search method
CASTp: A Server for Identification of Protein Pockets & Cavities
• Identifies all pockets and cavities.
• Measures the volume and area analytically.
• GPSSpyMOL: Global Protein Surface Survey Plugin for PyMOL
http://cast.engr.uic.edu/cast/
“GRID: A Computational Procedure for Determining Energetically Favorable Binding Sites on Biologically Important Macromolecules”
Peter Goodford, Oxford University
J. Med. Chem. 28, 849-857 (1985); ibid. 32, 1083-1094 (1989); 36, 140-147 (1993); 36, 148-156 (1993)
http://www.moldiscovery.com/
Aromatic carbon probe
Grid point value range: -5.45 to 5.0 kcal/mol
Contour level: -2.5 kcal/mol
Hydrophobic probe
Grid point value range: -2.86 to 0.0 kcal/mol
Contour level: -1.0 kcal/mol
Carbonyl oxygen probe
Grid point value range: -8.03 to 5.0 kcal/mol
Contour level: -5.0 kcal/mol
Hydroxyl oxygen probe
Grid point value range: -12.30 to 5.0 kcal/mol
Contour level: -7.0 kcal/mol
“THE DOCKING PROBLEM”
• SITE/LIGAND REPRESENTATION (treatment of H atoms?)
• JUXTAPOSITION OF THE LIGAND AND SITE FRAMES OF REFERENCE (docking engine)
• EVALUATION OF COMPLEMENTARITY (scoring functions)
AIM: To obtain the lowest free energy structure(s) for the receptor-ligand complex.
http://biophysics.cs.vt.edu/H++/
H++ is an automated system that computes pK values of ionizable groups in macromolecules and adds missing hydrogen atoms according to the specified pH of the environment. Given a (PDB) structure file on input, H++ outputs the completed structure in several common formats (PDB, PQR, AMBER inpcrd/prmtop) and provides a set of tools useful for analysis of electrostatic-related molecular properties.
http://pdb2pqr‐1.wustl.edu/pdb2pqr/
http://propka.ki.ku.dk/
Docking algorithms
• Require 3D atomic structure for protein, and 3D structure for compound (“ligand”)
• May require initial rough positioning for the ligand
• Will use an optimization method to try and find the best rotation and translation of the ligand in the protein, for optimal binding affinity
MOLECULAR DOCKING
SYSTEMATIC SEARCH (brute force algorithm): All binding orientations of all conformers of the ligand and the receptor (impractical for most situations).
AUTOMATED SEARCH:
• GEOMETRIC METHODS: Matching of ligand and receptor site descriptors (descriptors, grids, fragments...).
• FORCE FIELD METHODS: Minimizing the ligand-receptor interaction energy - Molecular dynamics and Monte Carlo simulations.
Scoring functions
• Force field-based: calculation of van der Waals and electrostatic interaction energies between the receptor and the ligand atoms
• Knowledge-based: statistical analysis of 3D complex structures to derive a sum of potentials of mean force between receptor and ligand atoms
• Empirical: the binding free energy is broken down into a number of different weighted contributions (supposed to be additive: number of hydrogen bonds, ionic interactions, apolar contacts, entropy penalties...)
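A toy example may help connect the search and scoring ideas. The sketch below performs a systematic (brute force) search over a 2D grid of positions for a single-atom "ligand", scored with a force field-like 12-6 term against two receptor atoms. Everything in it is hypothetical: the receptor coordinates, the optimal contact distance, the box size and the grid spacing are made up for illustration, and real docking programs also search orientations and conformations.

```python
import math

RECEPTOR_ATOMS = [(0.0, 0.0), (2.0, 0.0)]  # hypothetical 2D receptor atoms
R_OPT = 2.0                                 # hypothetical optimal contact distance


def score(x, y):
    """12-6 score: each receptor atom contributes -1 at distance R_OPT."""
    total = 0.0
    for atom_x, atom_y in RECEPTOR_ATOMS:
        r = math.hypot(x - atom_x, y - atom_y)
        if r < 1e-6:
            return float("inf")        # ligand sits on top of a receptor atom
        s6 = (R_OPT / r) ** 6
        total += s6 * s6 - 2.0 * s6
    return total


def systematic_search(half_width=5.0, spacing=0.25):
    """Evaluate the score at every grid point of the box and keep the best."""
    n_steps = int(round(2 * half_width / spacing))
    best = (float("inf"), None, None)
    for i in range(n_steps + 1):
        for j in range(n_steps + 1):
            x = -half_width + i * spacing
            y = -half_width + j * spacing
            value = score(x, y)
            if value < best[0]:
                best = (value, x, y)
    return best


best_score, best_x, best_y = systematic_search()
```

The best grid point lies midway between the two receptor atoms and about R_OPT away from each, close to the ideal score of −2. Even this tiny problem needs 1681 evaluations, which hints at why exhaustive search over orientations and conformers is impractical for real ligands.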
Examples of algorithms to dock a ligand into a receptor site
Rigid ligand:
• Fast shape matching (DOCK)
Flexible ligand:
• Shape matching (DOCK 4.0)
• Incremental construction (FlexX)
• Simulated annealing (AutoDock 2.4)
• Monte Carlo simulation (MCDOCK)
• Genetic algorithm (AutoDock 3.0, GOLD, GAMBLER)
Some popular docking programs
• DOCK
  - Developed in Tack Kuntz’s group at UCSF
  - Shape algorithm - http://www.cmpharm.ucsf.edu/kuntz/dock.html
  - Recent versions allow for ligand flexibility
• GOLD
  - Developed at Sheffield University, distributed by CCDC
  - Uses genetic algorithm
  - Flexible ligand - http://www.ccdc.cam.ac.uk/
• FLEXX
  - Flexible ligand - http://www.biosolveit.de/FlexX/
  - Binding mode prediction and virtual high-throughput screening (vHTS)
• FRED
  - By OpenEye Scientific - http://www.openeye.com
  - Rigid, but able to use multiple, well chosen conformers
  - Very fast
• AUTODOCK
  - Scripps Lab - http://www.scripps.edu/pub/olson-web/doc/autodock/
  - Uses Genetic Algorithm
• LIGANDFIT
  - Accelrys - http://www.accelrys.com/cerius2/c2ligandfit.html