Alessandro Pedretti

Date post: 09-Feb-2016
UNIVERSITÀ DEGLI STUDI DI MILANO, Facoltà di Scienze del Farmaco. Virtual screening and collaborative computing: a new frontier in drug discovery. Alessandro Pedretti. XI Congreso Venezolano de Química, Caracas, June 18, 2013.
Transcript
Page 1: Alessandro Pedretti

Alessandro Pedretti

Virtual screening and collaborative computing: a new frontier in drug discovery

UNIVERSITÀ DEGLI STUDI DI MILANO
Facoltà di Scienze del Farmaco

XI Congreso Venezolano de Química
Caracas, June 18, 2013

Page 2: Alessandro Pedretti

Overview

Collaborative computing applied in a computational chemistry laboratory.

WarpEngine paradigm to distribute the calculations in the local network.

Virtual screening setup to choose the best software and parameters.

Two WarpEngine applications to evaluate its performance.

Short WarpEngine practical session.

Page 3: Alessandro Pedretti

What is collaborative computing?

Main definition:

The term “collaborative computing” covers technologies and computing resources based on a network communication system that allows documents and projects to be shared between users.

All activities are carried out on a variety of devices such as desktops, laptops, tablets and smartphones.

In a computational chemistry laboratory:

The daily work of a computational chemist requires sharing not only information and data between users, but also hardware resources.

Page 4: Alessandro Pedretti

Typical scenario in a lab

Internet

Firewall

Servers

PCs

Network devices

Ethernet infrastructure, 100-1000 Mbit/s

Several PCs with heterogeneous hardware / OSs.

Very high computational power “fragmented” on the local network.

It is difficult to use all of this computational power to run a single complex calculation.

Page 5: Alessandro Pedretti

Main features

Parallel computing without the grid paradigm.

Client/server architecture with hot-plug capabilities.

Ability to run calculations with different pieces of software without changing the main code.

Expandable through scripting languages.

High-level database interface integrated in the main code, supporting the most common SQL database engines (Access, MySQL, SQLite, SQL Server, etc.).

Easy configuration through a graphic interface.

High performance and security.

Page 6: Alessandro Pedretti

What we need …

… to develop WarpEngine:

High-level database interface.

Fast customizable Web server.

Script engine.

Graphic environment.

Plug-in expandability

Scripting languages

Molecule editing

Surface mapping

File format conversion

Database engine

Graphic interface

Property calculation

MM / MD calculations

Trajectory analysis

Page 7: Alessandro Pedretti

Server scheme

UDP server

HTTP server

Client manager

Project manager

Job manager

VEGA ZZ core

Database engine

IP filter

PowerNet plug-in

Main program

To clients: TCP/IP, HTTP, broadcast

Optional encrypted tunnel provided by WarpGate

Page 8: Alessandro Pedretti

Client scheme

UDP client

HTTP client

Project manager

Multithreaded worker

VEGA ZZ core

PowerNet plug-in

Main program

To the server: TCP/IP, HTTP, broadcast
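The server and client schemes suggest a job manager that hands independent jobs to multithreaded workers. The following is only a minimal single-machine sketch of that idea, assuming each job is an independent calculation on one molecule. It is not WarpEngine code and does not use its API or network protocol: `run_job`, `run_project` and the dummy score are hypothetical names introduced for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(molecule_id):
    """Hypothetical placeholder for one single-threaded calculation.

    A real worker would launch an external program (for example a docking
    run) for this molecule; here a dummy score is returned instead.
    """
    return molecule_id, float(molecule_id) ** 0.5


def run_project(molecule_ids, n_workers=4):
    """Distribute independent jobs over a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() keeps the input order, so results can be paired with ids.
        return dict(pool.map(run_job, molecule_ids))


results = run_project(range(10))
print(len(results), "jobs completed")
```

Threads are adequate in this sketch because a worker that waits on an external single-threaded program spends its time outside the interpreter; a networked client/server system would replace the local pool with remote clients.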

Page 9: Alessandro Pedretti

Application fields

WarpEngine is easily expandable through scripting languages, so it can perform several types of calculation:

Semi-empirical calculations

Ab-initio calculations

Rescore of docking poses

Multiple molecular mechanics calculations

Virtual screening

Page 10: Alessandro Pedretti

Drug discovery and virtual screening

Today, virtual screening is a very common approach for identifying hit compounds from large libraries of molecules in the drug discovery process.

It can be classified as:

Structure-based
It involves molecular docking calculations between each molecule to be tested and the biological target (usually a protein). A scoring function is applied to evaluate the affinity. The 3D structure of the target must be known.

Ligand-based
The 3D structure of the biological target is unknown, and a set of geometric rules and/or physicochemical properties (a pharmacophore model) obtained from QSAR studies is used to screen the library.

Page 11: Alessandro Pedretti

Advantages and disadvantages of virtual screening

Advantages:

Fast (but it depends on the library size).

Possibility to optimize the in-house resources.

Cheap.

Disadvantages:

False positive rate.

Limited chemical space (ligand-based).

Inability to discriminate intrinsic activity (structure-based).

Need to confirm the results by experimental assays.

Workflow: Database → Virtual screening → Hit compounds

Page 12: Alessandro Pedretti

Choice of docking software for virtual screening

For test purposes, we chose three well-known, free docking programs:

AutoDock 4.2 http://autodock.scripps.edu

AutoDock Vina http://vina.scripps.edu

PLANTS http://www.tcd.uni-konstanz.de/research/plants.php

and the acetylcholinesterase (AChE) ligand database from the Directory of Useful Decoys (DUD, http://dud.docking.org), containing:

107 true active molecules

3892 true inactive molecules

All these ligands were docked into the AChE crystal structure downloaded from the PDB (1EVE) in order to evaluate the predictive power and the performance of each docking program.

Page 13: Alessandro Pedretti

Hit rate evaluation

The hit rate (HR) measures the probability of finding active ligands in a set of molecules. It is calculated by the following equation:

$$\mathrm{HR} = \frac{\text{Active molecules}}{\text{All molecules}} \cdot 100$$

Considering the whole dataset:

$$\mathrm{HR}_{\mathrm{Random}} = \frac{107}{3999} \cdot 100 = 2.68\,\%$$

The random hit rate is the probability of finding an active compound by random choice. In other words, every 100 ligands randomly selected from the data set contain, on average, 2.68 active compounds.
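The random hit rate can be checked with a few lines of Python. This is a minimal sketch using the 107 active and 3892 inactive molecules of the AChE data set; the function name `hit_rate` is an illustrative choice, not part of any tool mentioned in the slides.

```python
def hit_rate(active, total):
    """Percentage of active molecules in a set of `total` molecules."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * active / total


actives = 107
inactives = 3892
random_hr = hit_rate(actives, actives + inactives)
print(f"Random hit rate: {random_hr:.2f}%")  # prints "Random hit rate: 2.68%"
```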

Page 14: Alessandro Pedretti

Evaluation of virtual screening performances

The performance of each virtual screening program is evaluated by:

sorting the results by the docking score;

calculating the hit rate in a set of top ranked molecules (1%, 2% and 5% of the total data set);

calculating the enrichment factor:

$$\mathrm{EF}_{\mathrm{TopN\%}} = \frac{\mathrm{HR}_{\mathrm{TopN\%}}}{\mathrm{HR}_{\mathrm{Random}}}$$

Every virtual screening calculation must have at least EF > 1.0, and it should have EF > 2.0 to be considered sufficiently efficient. This means that the screening must perform at least 2-fold better than random selection.
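The three evaluation steps can be sketched in Python as follows. The sketch assumes the results are already sorted by docking score, best first, and are represented as a list of booleans (active or not). The ranking used here is illustrative only: the count of 5 actives among the 40 top-ranked molecules is an assumption, not a figure reported in the slides.

```python
def enrichment_factor(ranked_is_active, top_fraction):
    """EF = HR(top N%) / HR(random) for a list sorted best score first."""
    n_total = len(ranked_is_active)
    n_top = max(1, round(n_total * top_fraction))  # 1% of 3999 -> 40
    hr_top = 100.0 * sum(ranked_is_active[:n_top]) / n_top
    hr_random = 100.0 * sum(ranked_is_active) / n_total
    return hr_top / hr_random


# Assumed ranking: 5 actives among the 40 best-scored molecules, the
# remaining 102 actives placed further down the list of 3999 molecules.
ranking = [True] * 5 + [False] * 35 + [True] * 102 + [False] * (3999 - 142)
ef_1 = enrichment_factor(ranking, 0.01)
print(f"EF at 1%: {ef_1:.2f}")  # prints "EF at 1%: 4.67"
```

With this definition, a ranking no better than chance gives EF close to 1.0, which is why values above 1.0 are the minimum requirement.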

Page 15: Alessandro Pedretti

AutoDock and Vina results

two AutoDock runs were performed, one with screening parameters and one with full docking parameters;

one Vina calculation was run with exhaustiveness set to 7;

both programs use a similar scoring function based on the Amber force field.

Software   Exhaustiveness   Flexible chains   EF 1%   EF 2%   EF 5%   Single CPU time (hours)
AutoDock   Screening        No                 4.67    3.27    1.68                     44.96
AutoDock   Full docking     No                 7.47    4.20    3.55                   1344.00
Vina       7                No                 1.87    2.34    2.06                    342.00

Page 16: Alessandro Pedretti

PLANTS results

The PLANTS enrichment performance was evaluated by considering:

all three scoring functions (ChemPLP, PLP and PLP95);

two degrees of exhaustiveness (Speed1 and Speed2);

flexible side chains of amino acids (PLP and Speed2 only).

Score     Exhaustiveness   Flexible chains   EF 1%   EF 2%   EF 5%   Single CPU time (hours)
ChemPLP   Speed1           No                19.62   11.21    5.98                     97.64
ChemPLP   Speed2           No                18.69   10.74    5.23                     66.64
PLP       Speed1           No                19.62   10.28    5.23                     44.08
PLP       Speed2           No                19.62   10.28    5.23                     30.28
PLP       Speed2           Yes               20.56   10.28    5.05                    350.80
PLP95     Speed1           No                17.75   10.28    4.86                     37.04
PLP95     Speed2           No                16.82    9.81    4.48                     34.44
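One way to weigh enrichment against cost is to rank the PLANTS runs by enrichment factor per CPU hour. The sketch below uses the EF at 1% and the CPU times from the table; the criterion itself (EF divided by CPU hours) is a hypothetical selection rule chosen for illustration, not one stated in the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    score: str
    speed: str
    flexible: bool
    ef_1: float        # enrichment factor at 1%
    cpu_hours: float   # single CPU time in hours


runs = [
    Run("ChemPLP", "Speed1", False, 19.62, 97.64),
    Run("ChemPLP", "Speed2", False, 18.69, 66.64),
    Run("PLP", "Speed1", False, 19.62, 44.08),
    Run("PLP", "Speed2", False, 19.62, 30.28),
    Run("PLP", "Speed2", True, 20.56, 350.80),
    Run("PLP95", "Speed1", False, 17.75, 37.04),
    Run("PLP95", "Speed2", False, 16.82, 34.44),
]

# Assumed criterion: highest EF(1%) obtained per CPU hour.
best = max(runs, key=lambda r: r.ef_1 / r.cpu_hours)
print(best.score, best.speed, "flexible" if best.flexible else "rigid")
```

Under this rule the flexible side-chain run is penalized heavily: it has the highest EF at 1% but needs more than ten times the CPU time of the rigid PLP Speed2 run.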

Page 17: Alessandro Pedretti

Hardware for the test

1 PC configured as client and server:
Quad-core

9 PCs configured as clients:
1 six-core
7 quad-core
1 dual-core
1 single-core

37 cores
42 GB RAM
> 3 TB storage

Operating systems:
6 Windows 7 Pro x64
3 Windows 7 Pro
1 Windows XP Pro

Network connection:
Ethernet 100 Mbit/s

Page 18: Alessandro Pedretti

Software & data for the test

APBS – Adaptive Poisson-Boltzmann Solver
Calculation of solvation energy.

PLANTS – Protein-Ligand ANT System
Structure-based virtual screening.

Database of drugs in .mdb format
174,398 molecules, average MW 353.70.

Human M2 muscarinic receptor
PDB ID: 3UON.

Both programs are single-threaded.

Page 19: Alessandro Pedretti

Real case tests

APBS – Solvation energy calculation.
174,398 molecules, two APBS calculations for each molecule (reference and solvated state).

Time required by a single-thread calculation: 13 days 5 hours

Time required by WarpEngine: 8 hours 36 minutes

WarpEngine speed: 339.10 jobs/min.

PLANTS – Virtual screening.
174,398 molecules, M2 target, PLP, speed2.

Time required by a single-thread calculation: 36 days 22 hours

Time required by WarpEngine: 1 day 0 hours 1 minute

WarpEngine speed: 121.00 jobs/min.
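The slides report absolute times only. As a rough sketch, the speedup implied by those times can be derived as the ratio of single-thread time to WarpEngine time and compared with the 37 cores listed on the hardware slide. The speedup and efficiency figures below are computed here from the reported times and are not values stated in the presentation.

```python
from datetime import timedelta

CORES = 37  # total cores listed on the hardware slide


def speedup(serial, parallel):
    """Ratio of single-thread time to distributed time."""
    return serial / parallel  # timedelta / timedelta gives a float


apbs = speedup(timedelta(days=13, hours=5), timedelta(hours=8, minutes=36))
plants = speedup(timedelta(days=36, hours=22), timedelta(days=1, minutes=1))

for name, s in (("APBS", apbs), ("PLANTS", plants)):
    # Efficiency = speedup divided by the number of cores available.
    print(f"{name}: speedup {s:.1f}x, efficiency {100 * s / CORES:.1f}%")
```

Both ratios come out at about 36.9, close to the number of available cores.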

Page 20: Alessandro Pedretti

Test Drive

Page 21: Alessandro Pedretti

Graphic interface

Page 22: Alessandro Pedretti

Graphic interface

Page 23: Alessandro Pedretti

Conclusions

Collaborative computing not only helps users work together on the same project, but can also be extended efficiently to share computational resources that often remain unused.

WarpEngine can collect the unused computational power and channel it into large calculations, such as a virtual screening, without interfering with normal user activities.

The setup phase of a virtual screening plays a pivotal role in obtaining good performance in terms of both results and calculation speed.

Page 24: Alessandro Pedretti

Acknowledgements

www.vegazz.net

Giulio Vistoli

Matteo Lo Monte

Angelica Mazzolari

