GLOBAL PHARMA R&D INFORMATICS CONGRESS
Lisbon, 30.11.2017
Friedrich Rippmann, Computational Chemistry & Biology
Boosting discovery research productivity by generating novel digital components, and integrating them into coherent workflows
Integration is key
(Digital) components to boost productivity
1. Large virtual libraries: only the sky is the limit
2. Predictive models: predict everything
3. Promote self service: break down hurdles
4. Integrate components and make them easy to use
1. Large virtual libraries: only the sky is the limit
MASSIV – The creation of feasible novel chemical space
Type & number of molecules
• drugs: 10^3
• purchasable: 10^7
• feasible: 10^15
• theoretical: 10^60
Assuming 10^4 molecules are synthesized per year, it would take 100 billion years to get all feasible ones
We must find a smarter way to explore the chemical space
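The 100-billion-year figure follows directly from the two numbers on the slide. A minimal sketch of that arithmetic (the function name is illustrative, not from the talk):

```python
# Numbers taken from the slide: ~10^15 feasible molecules,
# ~10^4 molecules synthesized per year (the slide's own assumption).
FEASIBLE_MOLECULES = 10**15
SYNTHESIZED_PER_YEAR = 10**4


def years_to_exhaust(space_size: int, per_year: int) -> int:
    """Years needed to synthesize every molecule at a constant yearly rate."""
    return space_size // per_year


years = years_to_exhaust(FEASIBLE_MOLECULES, SYNTHESIZED_PER_YEAR)
print(f"{years:.0e} years")  # 10^11 years, i.e. 100 billion
```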
Virtual Compound spaces to feed Drug Discovery
Compound spaces by size: SMALL, BIG, MASSIV, Super MASSIV
Exhaustive enumeration: 1.6 x 10^11 molecules in GDB-17 (Reymond et al., J. Chem. Inf. Model. 2012)
Virtual compound spaces
Merck AcceSSible InVentory
BUILDING BLOCKS
CHEMICAL REACTIONS LOOK-UP
Tailored libraries
MASSIV space
…
10^20
in silico synthesis novel chemical matter
10^6
10^4 look-up space (10^5 per reference)
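The slide does not describe how the MASSIV space is built internally. As a rough illustration of why combining building blocks with encoded reactions grows so quickly, here is a toy sketch. All names, functional-group tags and inputs are made up for the example. It yields products lazily, since a space of this size could not be stored exhaustively.

```python
from itertools import product
from typing import Iterator, NamedTuple


class BuildingBlock(NamedTuple):
    name: str
    groups: frozenset  # functional groups present on the block


class Reaction(NamedTuple):
    name: str
    needs_a: str  # functional group required on the first reactant
    needs_b: str  # functional group required on the second reactant


def enumerate_products(blocks, reactions) -> Iterator[tuple]:
    """Lazily yield (reaction, reactant A, reactant B) for every compatible pair."""
    for rxn in reactions:
        side_a = [b for b in blocks if rxn.needs_a in b.groups]
        side_b = [b for b in blocks if rxn.needs_b in b.groups]
        for a, b in product(side_a, side_b):
            if a != b:  # a block does not react with itself in this toy model
                yield rxn.name, a.name, b.name


# Toy, hypothetical inputs (not taken from the talk).
blocks = [
    BuildingBlock("bb1", frozenset({"amine"})),
    BuildingBlock("bb2", frozenset({"acid"})),
    BuildingBlock("bb3", frozenset({"amine", "aryl_halide"})),
    BuildingBlock("bb4", frozenset({"boronic_acid"})),
]
reactions = [
    Reaction("amide_coupling", "amine", "acid"),
    Reaction("suzuki_coupling", "aryl_halide", "boronic_acid"),
]

virtual_products = list(enumerate_products(blocks, reactions))
print(len(virtual_products))
```

With real inventories the number of compatible pairs per reaction multiplies, which is why a look-up around a reference structure is more practical than full enumeration.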
MASSIV applied to drug target XYZ
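Figure: # of molecules per step (axis ticks 10^2, 10^3, 10^4)
• MASSIV (virtual space): 5.4 x 10^12 molecules; ~200,000 eMolecules building blocks (BB), 118 encoded reactions
• Labeled intermediate value: 1.5 x 10^5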
• MASSIV look-up
• cluster by reaction
• selection of results by reaction
• 3D overlay with reference structures
• MOCCA models (FUB) & expert selection
• synthesis & in vitro verification ongoing (23 molecules)
2. Predictive models: predict everything
Pop Quiz:
What do these techniques have in common?
a. Self-driving car
b. Siri
c. Google Translate
d. Face recognition
Feedback: They all make use of “Deep Learning” aka “Neural Networks”
Deep Learning: From Face Recognition to Drug Discovery
Hierarchical Feature Learning aka Deep Learning
• Input: images. Edge detector → facial features → complete face
• Input: structures and biological activity. Fingerprint → substructural element → complete molecule
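To make the layered idea concrete, here is a minimal sketch of a forward pass through a tiny feed-forward network that maps a binary fingerprint to an activity probability. The 4-bit fingerprint and all weights are hypothetical, untrained values chosen only for illustration. The talk does not specify the network architecture actually used.

```python
import math


def relu(x: float) -> float:
    return max(0.0, x)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W @ inputs + b), row by row."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical 4-bit structural fingerprint (1 = substructure bit is set).
fingerprint = [1, 0, 1, 1]

# Hypothetical weights; a real model would learn these from assay data.
w_hidden = [[0.5, -0.2, 0.1, 0.4], [-0.3, 0.8, 0.2, -0.1]]
b_hidden = [0.0, 0.1]
w_out = [[1.2, -0.7]]
b_out = [-0.5]

# Hidden units combine fingerprint bits into higher-level features;
# the output unit combines those features into one activity probability.
hidden = dense(fingerprint, w_hidden, b_hidden, relu)
activity = dense(hidden, w_out, b_out, sigmoid)[0]
print(f"predicted probability of activity: {activity:.3f}")
```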
Comprehensive Prediction of Kinase Selectivity
Achievements so far
• 277 novel kinase models generated
• Data basis: 4,800 compounds measured in 277 kinase assays
• high predictivity: 36
• good predictivity: 122
• reasonable predictivity: 200
Who contributed?
• Group of Prof. Hochreiter, Uni Linz (winner of the TOX21 Challenge)
Getting better drugs faster: Free Energy Perturbation @ Merck
Prerequisites
• Protein structure(s)
• Known binding mode
• Assay
Target validation
• Test on compounds with known activity
• Does FEP work for my target?
Production
• Weekly ranking of new ideas
• Prioritize compounds for synthesis
• Applied to 12 targets so far
21 inhibitors used for validation
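FEP yields binding free energies, which relate to binding constants through the standard thermodynamic relation ΔG = RT ln(Kd), with Kd taken relative to a 1 M standard state. The sketch below shows that conversion and a simple ranking of new ideas by predicted relative free energy. The compound names and ΔΔG values are hypothetical and are not results from the talk.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_from_dg(dg_kcal_per_mol: float, temperature_k: float = 298.15) -> float:
    """Dissociation constant in mol/L (1 M standard state) from binding free energy."""
    return math.exp(dg_kcal_per_mol / (R_KCAL * temperature_k))


def fold_change(ddg_kcal_per_mol: float, temperature_k: float = 298.15) -> float:
    """Kd(new) / Kd(reference) for a relative binding free energy ddG."""
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temperature_k))


# Hypothetical FEP output: ddG relative to a reference compound, in kcal/mol.
predicted_ddg = {"idea_A": -1.4, "idea_B": 0.6, "idea_C": -0.3}

# More negative ddG means tighter predicted binding, so sort ascending.
ranking = sorted(predicted_ddg, key=predicted_ddg.get)
for name in ranking:
    print(f"{name}: ddG = {predicted_ddg[name]:+.1f} kcal/mol, "
          f"Kd changes {fold_change(predicted_ddg[name]):.2f}-fold")
```

At room temperature roughly 1.36 kcal/mol corresponds to a tenfold change in Kd, which is why small errors in predicted free energy matter for ranking.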
3. Promote self service: break down hurdles
Merck Online Computational Chemistry Analyzer
Components for boosting productivity
From virtual libraries to prediction of binding constants to synthesis ordering
Figure: # of molecules per step (labeled values: 673, 449, 422, 70, 30, 6)
• descriptor calculation
• removal of bad functional groups
• manual inspection & selection
• MOCCA (Merck Online CompChem Analyzer)
• FEP calculations
• synthesis request via online tool
• MASSIV (Merck AcceSSible InVentory): 10^20
• Application of predictive models, based on Deep Learning and other Artificial Intelligence methods
• Accurate binding constant prediction
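The cascade on this slide can be thought of as a sequence of filters with a molecule count recorded after each one. A minimal sketch under stated assumptions: the record layout, the molecular-weight cutoff and the list of unwanted groups are all hypothetical, and the real steps (MOCCA models, FEP, manual inspection) are far richer than these predicates.

```python
from typing import Callable, Iterable

# Hypothetical molecule record: {"id": str, "mol_weight": float, "groups": list[str]}
Molecule = dict

BAD_GROUPS = {"acyl_halide", "peroxide"}  # illustrative only


def run_funnel(
    molecules: Iterable[Molecule],
    steps: list[tuple[str, Callable[[Molecule], bool]]],
):
    """Apply each (name, keep) filter in order and record how many molecules survive."""
    survivors = list(molecules)
    counts = [("input", len(survivors))]
    for name, keep in steps:
        survivors = [m for m in survivors if keep(m)]
        counts.append((name, len(survivors)))
    return survivors, counts


steps = [
    ("descriptor window", lambda m: m["mol_weight"] <= 500),
    ("bad functional groups", lambda m: not BAD_GROUPS & set(m["groups"])),
]

molecules = [
    {"id": "m1", "mol_weight": 320.4, "groups": ["amide"]},
    {"id": "m2", "mol_weight": 612.7, "groups": ["amide"]},
    {"id": "m3", "mol_weight": 280.3, "groups": ["peroxide"]},
    {"id": "m4", "mol_weight": 455.5, "groups": ["sulfonamide", "amine"]},
]

survivors, counts = run_funnel(molecules, steps)
for name, n in counts:
    print(f"{name}: {n} molecules")
```

Keeping the per-step counts makes it easy to see which stage removes the most candidates, which is the information the slide's bar chart conveys.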
GPU computing is essential
Low-cost GPUs deliver
10 x 8 = 80 GPUs
Safer molecules, faster
Deep Learning
Molecular Dynamics
Conclusions
1. Seamless integration from idea to synthesized compound to assay result is crucial for productivity
2. All models (e.g. Regression-, Random Forest-, Deep Learning-based) generated in ONE coherent framework
3. Meaningful application of Deep Learning needs deep know-how
4. Free Energy Perturbation for binding constant prediction works, when it works (needs good X-ray complex structures)
5. GPU computing is essential
4. Integrate components and make them easy to use
Integrated & collaborative drug design
• Integral prediction of all relevant endpoints
• Provide predictions in integrated design environment
• Leverage teamwork: make it easy for all disciplines to contribute their complementary expertise
Integrated collaborative Compound Design
X-ray
CompChem
Conclusions
• Seamless integration from idea to synthesized compound to assay result is crucial for productivity
• All models (e.g. Regression-, Random Forest-, Deep Learning-based) generated in ONE coherent framework
• Meaningful application of Deep Learning needs deep know-how
• Free Energy Perturbation for binding constant prediction works, when it works (needs good X-ray complex structures)
• GPU computing is essential
• All research data accessible to all scientists; all researchers, independent of discipline, on an equal footing
• External services easily accessible to all who need them (get rid of bureaucracy, but monitor what is done)
• Computational chemists are as responsible for ordering the synthesis of compounds as medicinal chemists are