
Post on 14-Dec-2015


Handling ligands with PRODRG

Division of Biological Chemistry and Drug Discovery

College of Life Sciences

Daan van Aalten

PRODRG - why?

• Early 1990s - no software to generate topologies for non-macromolecular entities

• Manual topology generation is time consuming and error prone (but instructive)

• Small molecule coordinate generators essentially only commercially available

PRODRG - why?

• For small molecules, we need to go from imagination/some chemical info to a correct topology and optimised coordinates in seconds

+ Topologies for SHELX, REFMAC5, CNS, O, TNT, …

PRODRG - why?

• Citrate (1AJ8): 1997; 1.9 Å
• NADP+ (1DDI): 1999; 2.5 Å
• Cyclohexylamine (1PPA): 1991; 2.0 Å
• Diphosphate (1N5L): 2002; 2.3 Å
• Ethylene glycol (1JKV): 2001; 1.4 Å
• Sulphate (1DW9): 1999; 1.7 Å

PRODRG History

• Version 1 (1995)
– Started as a DRuG PROgram in GROMOS87
– Takes PDB file and generates ‘MOLDES’ (SMILES-like 1D string) and MD topologies

• Version 2 (2004)
– Many additional input formats
– Many additional output formats, including topologies for crystallographic software

• Version 2.5 (2005)
– Internal all-atom representation

PRODRG History

• Details covered in two publications

• Webserver (~300 runs/day) with short FAQ

PRODRG Guts

• Essentially FORTRAN (30000 lines) with some supporting C (5000 lines)

• Compiles well on all major platforms

• Few dependencies (GROMACS for coordinate generation)

What is PRODRG?

• Generates information about small molecules

Input (PDB file, Molfile, Human) -> PRODRG -> Molecular description

Molecular description:
– Atomic coordinates
– Chemical types
– Connectivity
– Bond orders / aromaticity
– Hybridisation
– Formal charges
– Atomic charges
– Force field parameters
– Hydrogen atoms
– Free torsions
– Hydrogen bonding

Used for:
– Model building & refinement
– Molecular dynamics
– Docking & analysis
– DB lookups & property prediction
– Visualisation

How does PRODRG work?

• Fixed order of steps is bad

• Input analysis is rather rude:
– Deletes hydrogens
– Ignores bond order information

1. Analysis of input

2. Initial data gathering

3. Addition of hydrogens

4. Atom reordering

5. Topology generation

6. Formal and partial charges

7. Additional molecule data

8. Output

How does PRODRG work?

• Most steps use ‘chemical pattern matching’
• Example: hydrogen generation, where each matched pattern selects an action:
– Add 1+sp(x)-ncon(x) hydrogens
– Do nothing
– Add 1 hydrogen

• Currently all Hs generated by 17 ‘rules’
• Chemical knowledge in data, not code
– More flexible
– Potentially user-configurable
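The idea of keeping chemical knowledge in data rather than code can be sketched in a few lines of Python. This is only an illustration, not PRODRG's FORTRAN implementation: the `Atom` fields, the rule table and the three patterns are hypothetical, and only the three actions (add 1+sp(x)-ncon(x) hydrogens, do nothing, add 1 hydrogen) come from the slide. Here sp(x) is taken to be 1, 2 or 3 for sp, sp2 and sp3, and ncon(x) the number of connected non-hydrogen atoms; both readings are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Atom:
    element: str  # element symbol, e.g. "C"
    sp: int       # assumed encoding: 1, 2 or 3 for sp, sp2, sp3
    ncon: int     # assumed meaning: number of connected non-hydrogen atoms


# A rule is (name, pattern, action). The pattern decides whether the rule
# applies; the action returns how many hydrogens to add. First match wins.
Rule = Tuple[str, Callable[[Atom], bool], Callable[[Atom], int]]

# Hypothetical rule table: these are NOT PRODRG's 17 rules.
RULES: List[Rule] = [
    # "Do nothing"
    ("halogen", lambda a: a.element in {"F", "Cl", "Br", "I"}, lambda a: 0),
    # "Add 1+sp(x)-ncon(x) hydrogens" (clamped at zero in this sketch)
    ("carbon", lambda a: a.element == "C", lambda a: max(0, 1 + a.sp - a.ncon)),
    # "Add 1 hydrogen"
    ("terminal sp3 oxygen", lambda a: a.element == "O" and a.sp == 3 and a.ncon == 1,
     lambda a: 1),
]


def hydrogens_to_add(atom: Atom) -> int:
    """Return the hydrogen count chosen by the first matching rule."""
    for _name, matches, action in RULES:
        if matches(atom):
            return action(atom)
    return 0  # no rule matched: leave the atom unchanged


if __name__ == "__main__":
    print(hydrogens_to_add(Atom("C", 3, 1)))  # methyl-like carbon
    print(hydrogens_to_add(Atom("C", 2, 2)))  # aromatic CH-like carbon
```

Because the rules live in a list, adding or reordering a rule does not touch `hydrogens_to_add`, which is the flexibility the slide points to.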

Limitations

• Supported atom types limited
– C,H,N,O,P,S,F,Cl,Br,I only

• Other chemical limitations
– No more than 4 connections/atom
– Standard version limited to <=300 atoms

• Ignoring hydrogens and bond types may lead to unexpected results

• (Apolar hydrogens as second-class atoms)

• SMILES not yet implemented (but trivial)
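The limits above are easy to check before submitting a molecule. The following Python sketch is a hypothetical pre-check, not part of PRODRG: the function name, the input representation (a list of element symbols plus index pairs for bonds) and the messages are assumptions, and it assumes the 300-atom limit counts every atom passed in.

```python
from typing import List, Sequence, Tuple

# Limits as stated on the slide.
SUPPORTED_ELEMENTS = {"C", "H", "N", "O", "P", "S", "F", "Cl", "Br", "I"}
MAX_CONNECTIONS = 4
MAX_ATOMS = 300


def check_limits(elements: Sequence[str],
                 bonds: Sequence[Tuple[int, int]]) -> List[str]:
    """Return a list of human-readable problems (empty if none found)."""
    problems: List[str] = []

    if len(elements) > MAX_ATOMS:
        problems.append(f"too many atoms: {len(elements)} > {MAX_ATOMS}")

    for i, element in enumerate(elements):
        if element not in SUPPORTED_ELEMENTS:
            problems.append(f"atom {i}: unsupported element {element!r}")

    # Count connections per atom from the bond list.
    degree = [0] * len(elements)
    for a, b in bonds:
        degree[a] += 1
        degree[b] += 1
    for i, d in enumerate(degree):
        if d > MAX_CONNECTIONS:
            problems.append(f"atom {i}: {d} connections > {MAX_CONNECTIONS}")

    return problems


if __name__ == "__main__":
    # Sulphur hexafluoride-like input: one atom with six connections.
    sf6 = ["S"] + ["F"] * 6
    print(check_limits(sf6, [(0, i) for i in range(1, 7)]))
```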

Basic usage: web server

• Four easy steps:
1. Go to http://davapc1.bioch.dundee.ac.uk/programs/prodrg
2. Paste input
3. Edit settings (Chirality restraints? Reduced charges? Coordinates?)
4. Run it

Success!

PRODRG inputs

• PDB coordinates
• MDL molfile
• MOLDES (SMILES-like 1D string)
• JME editor (web server)
• “TEXT” input

Text drawings

• Atoms represented by their element symbols

• Connected by bonds
– Single: - or |
– Double: = or "
– Triple: #

• Change case of symbol to invert chirality

N   C-C
|   " "
C-C-C C-O
|   | |
C=O C=C
|
O

D-Tyr

N   C-C
|   " "
c-C-C C-O
|   | |
C=O C=C
|
O

L-Tyr
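To make the text-drawing conventions concrete, here is a minimal Python sketch that reads such a drawing into atoms and bonds. It is not PRODRG's parser: it assumes single-letter element symbols, bonds that occupy exactly one character cell between two atoms, and that a lower-case symbol marks an inverted centre. The function name and return types are hypothetical.

```python
from typing import Dict, List, Tuple

Pos = Tuple[int, int]  # (row, column) in the drawing

# Bond characters from the slide; '#' is accepted in either direction.
HORIZONTAL = {"-": 1, "=": 2, "#": 3}
VERTICAL = {"|": 1, '"': 2, "#": 3}


def parse_drawing(text: str) -> Tuple[Dict[Pos, Tuple[str, bool]],
                                      List[Tuple[Pos, Pos, int]]]:
    """Return (atoms, bonds) for a text drawing.

    atoms maps a grid position to (element, inverted), where inverted is
    True for a lower-case symbol. bonds holds (atom_a, atom_b, order).
    """
    rows = text.splitlines()
    width = max((len(r) for r in rows), default=0)
    grid = [r.ljust(width) for r in rows]  # pad so columns line up

    atoms: Dict[Pos, Tuple[str, bool]] = {}
    for r, line in enumerate(grid):
        for c, ch in enumerate(line):
            if ch.isalpha():
                atoms[(r, c)] = (ch.upper(), ch.islower())

    bonds: List[Tuple[Pos, Pos, int]] = []
    for r, line in enumerate(grid):
        for c, ch in enumerate(line):
            if ch in HORIZONTAL and (r, c - 1) in atoms and (r, c + 1) in atoms:
                bonds.append(((r, c - 1), (r, c + 1), HORIZONTAL[ch]))
            elif ch in VERTICAL and (r - 1, c) in atoms and (r + 1, c) in atoms:
                bonds.append(((r - 1, c), (r + 1, c), VERTICAL[ch]))
    return atoms, bonds


# The L-Tyr drawing from the slide (lower-case 'c' at the alpha carbon).
L_TYR = "\n".join([
    'N   C-C',
    '|   " "',
    'c-C-C C-O',
    '|   | |',
    'C=O C=C',
    '|',
    'O',
])

if __name__ == "__main__":
    atoms, bonds = parse_drawing(L_TYR)
    print(len(atoms), "atoms,", len(bonds), "bonds")
```

Working on a padded character grid keeps the bond lookup to a simple neighbour check, at the cost of not supporting two-letter symbols such as Cl or Br.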

PRODRG outputs

• PDB (generated/minimised) coordinates (with/without hydrogens, with proper atom names for protein/sugars/DNA), but the GIGO principle applies
• Quality control on input coordinates vs topology
• WHAT IF topology - accurate protein-ligand H-bonds
• CNS/REFMAC/TNT/SHELX topology (including PTM amino acid building blocks)
• GROMOS/GROMACS/OPLS topologies
• Consistent topology from crystal -> publication

Helping (or kicking) PRODRG

• Additional commands/hints in input file:
– PATCH (hybridisation)
– INSHYD and DELHYD
– PATCH (chirality)
– PATCH (torsions)
– CPNAME

Hybridisation hints

PATCH <atom> <number>

• Useful if PDB analysis did not quite work

• Lets you nudge PRODRG in the right direction:

    O
    "
C=C-C
|   |
C-C=N
"
O

Without a hint:

PRODRG> WARNING: multiplicity of generated molecule is not 1.
PRODRG> WARNING: bond type assignment failed at CAF .

With a hint:

PATCH NAG 21

Adding/removing hydrogens

INSHYD <atom>

DELHYD <atom>

• Allows the default protonation to be overridden

• Often not actually what you want

C-C=O
  |
  O

INSHYD OAD

PRODRG> Cannot assign type to atom ' OAD'.
ERRDRG> Error in GROMOS atom names/types.
PRODRG> Drug topology not made, sorry!

Alternative using a hybridisation hint:

PATCH OAD 3

Modifying chirality

PATCH <atom> -1

• Inverts stereocenter <atom>, useful for PDB input

PATCH <atom> <pattern>

• ‘Absolute’ chirality for certain classes of molecules

N   C-C
|   " "
C-C-C C-O
|   | |
C=O C=C
|
O
PATCH CA L

L-Tyr

N   C-C
|   " "
C-C-C C-O
|   | |
C=O C=C
|
O
PATCH CA D

D-Tyr

Adding dihedral restraints

PATCH <atom> ><pattern>

• After energy minimisation (EM), pyranose rings are often found in undesirable conformations

• PATCH statement introduces additional dihedral restraints to fix conformation

C-C-O-C-O
| |   |
O C-C-C
  | | |
  O O O
PATCH C1 ALPHA
PATCH C2 D
PATCH C3 L
PATCH C4 D
PATCH C5 D
PATCH C1 >4C1

α-D-Glucose

Building

• PRODRG can add molecular fragments to existing molecules:

BUILD <atom> <fragment>

L-Ala -> L-Phe:

BUILD CB PHI

L-Phe -> L-Tyr:

BUILD CZ OH

Building

• Allows quick alterations to existing molecules

• Preserves coordinates of root structure

• Fragment libraries contain text drawings - easy to define:

FRAG OH
X-O

FRAG PHI
X-C-C=C
  "   |
  C-C=C

FRAG ...
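A fragment library in this style is simple to read programmatically. The Python sketch below splits such text into a name-to-drawing mapping. The slides only show `FRAG <name>` followed by a text drawing, so the rest of the layout is an assumption (one fragment per `FRAG` line, blank lines ignored), and `parse_fragments` is a hypothetical helper rather than anything shipped with PRODRG.

```python
from typing import Dict, List, Optional


def parse_fragments(text: str) -> Dict[str, str]:
    """Map each fragment name to its text drawing (lines joined by newlines)."""
    fragments: Dict[str, str] = {}
    name: Optional[str] = None
    drawing: List[str] = []

    for line in text.splitlines():
        if line.startswith("FRAG "):
            # A new fragment starts: store the previous one, if any.
            if name is not None:
                fragments[name] = "\n".join(drawing)
            name = line.split()[1]
            drawing = []
        elif name is not None and line.strip():
            # Keep leading spaces: column positions matter in a drawing.
            drawing.append(line.rstrip())

    if name is not None:
        fragments[name] = "\n".join(drawing)
    return fragments


# The two fragments shown on the slide; X marks the attachment point.
LIBRARY = "\n".join([
    'FRAG OH',
    'X-O',
    '',
    'FRAG PHI',
    'X-C-C=C',
    '  "   |',
    '  C-C=C',
])

if __name__ == "__main__":
    for frag_name, frag_drawing in parse_fragments(LIBRARY).items():
        print(frag_name)
        print(frag_drawing)
```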

Building

• Can also be used to generate oligopeptides and oligosaccharides, using BUILD and

START <fragment>

START bdGLC
BUILD O4 adMAN1
BUILD O0F bdNAG1

PATCH C1 >4C1
PATCH C0B >4C1
PATCH C1B >4C1

β-D-Glc

β-D-NAG

α-D-Man

PRODRG IP issues

• Currently PRODRG freely accessible for academics through webserver and binaries
• Commercial licenses (~10) have provided useful income that contributes to (but does not cover) PRODRG development / maintenance
• Currently no PRODRG grant funding (previously WT senior fellowship)

Thoughts on the future:
• Make PRODRG as accessible as possible
• Release of source?
• Keen to incorporate/integrate with CCP4 but this will require some development

PRODRG - what next

• Make PRODRG as accessible as possible
• Release of source?
• Keen to incorporate/integrate with CCP4 but this will require some development
• Need to incorporate SMILES
• Make PDB input foolproof by quality control
• Move away from the united-atom-with-hydrogen-addition model
• Link up with GUI - not only drawing but also “building”
• Link up with coot (build-place-fit ligand at pointer)

Acknowledgements

• Alexander Schüttelkopf

• PRODRG users