Post on 28-Jan-2016

Leiden University. The university to discover.

Enhancing Search Space Diversity in Multi-Objective Evolutionary Drug Molecule Design using Niching

A. Aleman (1), A.P. IJzerman (2), E. van der Horst (2), M.T.M. Emmerich (1), T. Bäck (1,3), J.W. Kruisselbrink (1), A. Bender (2)

1. Leiden Institute of Advanced Computer Science (LIACS)
2. Leiden/Amsterdam Center for Drug Research (LACDR)
3. NuTech Solutions, Inc.

Scope: drug design and development

- Search for molecular structures with specific pharmacological or biological activity that influence the behavior of certain targeted cells
- Objectives: maximization of the potency of the drug (and minimization of side effects)
- Constraints: stability, synthesizability, drug-likeness, etc.
- A huge search space: 10^20 to 10^60 drug-like molecules
- Aim: provide the medicinal chemist with a set of molecular structures that can be promising candidates for further research

Molecule Evolution

Fragments extracted from drug databases

Initialize population P_0
While not terminate do
    Generate offspring O from P_t
    Evaluate O
    P_{t+1} = select from (P_t ∪ O)

- ‘Normal’ evolution cycle
- Graph-based mutation and recombination operators
- Deterministic elitist (μ+λ) parent selection (NSGA-II)
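The cycle on this slide can be sketched in Python as a generic elitist (μ+λ) loop. This is a minimal sketch under loud assumptions: individuals are plain integers instead of molecule graphs, `mutate` is a hypothetical ±1 step instead of a graph-based operator, and survivor selection is simple truncation on one score, where the slides use NSGA-II ranking on several objectives. All function and parameter names are illustrative.

```python
import random


def evolve(evaluate, mutate, init, mu, lam, generations, seed=0):
    """Elitist (mu+lambda) loop: survivors come from parents plus offspring."""
    rng = random.Random(seed)
    # Initialize population P_0 and score it once.
    scored = [(evaluate(x), x) for x in (init(rng) for _ in range(mu))]
    best_history = [max(score for score, _ in scored)]
    for _ in range(generations):
        # Generate offspring O from P_t and evaluate O.
        offspring = [mutate(rng.choice(scored)[1], rng) for _ in range(lam)]
        scored_offspring = [(evaluate(x), x) for x in offspring]
        # P_{t+1} = select from (P_t U O): keep the mu best individuals.
        # Truncation is an assumption standing in for NSGA-II ranking.
        pool = scored + scored_offspring
        pool.sort(key=lambda pair: pair[0], reverse=True)
        scored = pool[:mu]
        best_history.append(scored[0][0])
    return [x for _, x in scored], best_history


# Toy problem (hypothetical): maximize -(x - 7)^2 over the integers.
final_population, best_history = evolve(
    evaluate=lambda x: -(x - 7) ** 2,
    mutate=lambda x, rng: x + rng.choice([-1, 1]),
    init=lambda rng: rng.randint(-50, 50),
    mu=5,
    lam=15,
    generations=100,
)
```

Because parents compete with their offspring, the best score in `best_history` can never decrease from one generation to the next; that is the elitism the slide refers to.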

Fitness

Objectives:
- Activity predictors based on support vector machines:
    - f1: activity predictor based on ECFP6 fingerprints
    - f2: activity predictor based on AlogP2 Estate Counts
    - f3: activity predictor based on MDL

Constraints:
- A fuzzy constraint score based on Lipinski’s rule of five and bounds on the minimal energy conformation

Desirability indexes for modeling fuzzy constraints

- The degree of satisfaction can be measured on a scale between 0 and 1
- Constraints can be modeled in the form of desirability values
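One common way to turn a hard bound into a degree of satisfaction between 0 and 1 is a piecewise-linear desirability function. The sketch below is an assumption, not the scoring used in the slides: the cut-offs of 500 for molecular weight and 5 for logP come from Lipinski's rule of five, but the "fully unacceptable" bounds (600 and 7), the function names, and the geometric-mean aggregation are illustrative choices.

```python
from math import prod


def desirability_at_most(value, ok, bad):
    """Return 1.0 if value <= ok, 0.0 if value >= bad, linear in between."""
    if ok >= bad:
        raise ValueError("ok must be smaller than bad")
    if value <= ok:
        return 1.0
    if value >= bad:
        return 0.0
    return (bad - value) / (bad - ok)


def aggregate(desirabilities):
    """Geometric mean (assumed aggregation): one 0 makes the total 0."""
    return prod(desirabilities) ** (1.0 / len(desirabilities))


# Hypothetical molecule: molecular weight 550, logP 4.2.
d_weight = desirability_at_most(550.0, ok=500.0, bad=600.0)  # halfway: 0.5
d_logp = desirability_at_most(4.2, ok=5.0, bad=7.0)          # within bound: 1.0
constraint_score = aggregate([d_weight, d_logp])
```

A molecule that slightly exceeds a bound is penalized rather than rejected, which is what makes the constraint fuzzy.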

Diversity for Molecule Evolution

- A ‘normal’ search yields very similar molecular structures
- Aim for a set of diverse candidate structures because:
    - Vague objective functions may result in finding structures that fail in practice
    - The chemist desires a set of promising structures rather than only one single solution
- Explicit methods are required to enforce diversity in the search space, i.e. niching

All molecules are variations of the same theme!

Typical output of a ‘normal’ evolutionary search

Niching in Multi-Objective EA

- Explicitly aim for diversity in the decision space
- This is different from aiming for diversity in the objective space
- Points that lie far apart in the objective space do not necessarily also lie far apart in the decision space

Niching-based NSGA-II

A Niching-based NSGA-II algorithm as proposed by Shir et al.

Dynamic Niche Identification

[Figure: peak individuals and individuals that do not belong to a niche, shown for q = 3]

B.L. Miller, M.J. Shaw: Genetic algorithms with dynamic niche sharing for multimodal function optimization. Proceedings of the IEEE International Conference on Evolutionary Computation, May 1996, pp. 786-791.
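A greedy peak identification in the spirit of Miller and Shaw can be sketched as follows. This is a hedged sketch, not the authors' implementation: the function name, the single scalar `fitness` used for ordering, and the toy one-dimensional population are assumptions. Individuals are visited from best to worst; one becomes a peak if it lies farther than the niche radius from every existing peak, until q peaks are found. Every individual within the radius of its nearest peak joins that niche, and the rest do not belong to any niche.

```python
def identify_niches(population, fitness, distance, radius, q):
    """Return (peak indices, {peak: member indices}, indices outside all niches)."""
    order = sorted(range(len(population)), key=lambda i: fitness[i], reverse=True)
    peaks = []
    for i in order:
        if len(peaks) == q:
            break
        # A new peak must be farther than the radius from all current peaks.
        if all(distance(population[i], population[p]) > radius for p in peaks):
            peaks.append(i)
    niches = {p: [] for p in peaks}
    outside = []
    for i in range(len(population)):
        nearest = min(
            ((distance(population[i], population[p]), p) for p in peaks),
            default=None,
        )
        if nearest is not None and nearest[0] <= radius:
            niches[nearest[1]].append(i)
        else:
            outside.append(i)
    return peaks, niches, outside


# Toy example (hypothetical): points on a line, distance = absolute difference.
points = [0.0, 0.1, 5.0, 5.2, 10.0, 20.0]
scores = [9, 8, 7, 6, 5, 4]
peaks, niches, outside = identify_niches(
    points, scores, lambda x, y: abs(x - y), radius=1.0, q=3
)
```

With q = 3 the three best mutually distant points (indices 0, 2 and 4) become peaks, and the point at 20.0 ends up outside all niches.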

Similarity in Molecular Spaces

- Molecules are represented by bitstrings identifying certain structural properties

- A ‘1’ at position i denotes the presence of property i in the molecule, and ‘0’ at position i denotes the absence of property i

- How to define a similarity measure for the graph-like molecular structures?

- Idea: use molecular fingerprints

Distance based on fingerprints

- The distance between two molecules A and B can be based on the four terms:

- a: the number of properties only present in A

- b: the number of properties only present in B

- c: the number of properties present in both A and B

- d: the number of properties present in neither A nor B

- One possible distance measure can be created using the Jaccard coefficient (also known as the Tanimoto coefficient):

    d(A, B) = 1 - c / (a + b + c) = (a + b) / (a + b + c)

The Jaccard distance fulfills the triangle inequality, as opposed to, for example, the cosine distance!
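The four counts and the resulting Jaccard distance, 1 - c / (a + b + c), can be computed directly from two bitstring fingerprints. The sketch below assumes fingerprints are Python strings of '0' and '1' of equal length; the function names and the convention that two all-zero fingerprints have distance 0 are assumptions made for illustration.

```python
def fingerprint_counts(fp_a, fp_b):
    """Return (a, b, c, d): only in A, only in B, in both, in neither."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    a = b = c = d = 0
    for bit_a, bit_b in zip(fp_a, fp_b):
        if bit_a == "1" and bit_b == "1":
            c += 1
        elif bit_a == "1":
            a += 1
        elif bit_b == "1":
            b += 1
        else:
            d += 1
    return a, b, c, d


def jaccard_distance(fp_a, fp_b):
    """Jaccard distance (a + b) / (a + b + c); d does not enter the formula."""
    a, b, c, _ = fingerprint_counts(fp_a, fp_b)
    if a + b + c == 0:
        return 0.0  # assumed convention for two all-zero fingerprints
    return (a + b) / (a + b + c)


# Toy fingerprints (hypothetical): they share two properties and differ in two.
counts = fingerprint_counts("110100", "100110")
dist = jaccard_distance("110100", "100110")
```

Note that d, the number of properties absent from both molecules, plays no role, so padding both fingerprints with extra zero bits leaves the distance unchanged.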


Triangle inequality

Why do we want to have a dissimilarity (distance) measure that obeys the triangle inequality?

If we have very similar molecules, say molecule A is similar to B and molecule A is also similar to C, then we want to be able to say that B is similar to C.
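The reasoning step behind this is that the triangle inequality gives d(B, C) <= d(A, B) + d(A, C), so two molecules that are both close to A cannot be far from each other. The following sketch checks the inequality by brute force on all non-empty fingerprints over four bit positions, for the Jaccard distance and for a cosine distance defined as one minus the cosine similarity. Representing fingerprints as sets of on-bit positions, and all names used, are assumptions for illustration; an exhaustive check on a tiny space is an illustration, not a proof.

```python
from itertools import combinations, product
from math import sqrt


def jaccard_distance(x, y):
    """1 - |x & y| / |x | y| for non-empty sets of on-bit positions."""
    return 1.0 - len(x & y) / len(x | y)


def cosine_distance(x, y):
    """1 - cosine similarity of the corresponding binary vectors."""
    return 1.0 - len(x & y) / sqrt(len(x) * len(y))


def count_violations(distance, items, tol=1e-12):
    """Count triples (x, y, z) with d(x, z) > d(x, y) + d(y, z)."""
    return sum(
        1
        for x, y, z in product(items, repeat=3)
        if distance(x, z) > distance(x, y) + distance(y, z) + tol
    )


# All 15 non-empty fingerprints over 4 bit positions.
fingerprints = [
    frozenset(subset)
    for size in range(1, 5)
    for subset in combinations(range(4), size)
]
jaccard_violations = count_violations(jaccard_distance, fingerprints)
cosine_violations = count_violations(cosine_distance, fingerprints)
```

One violating triple for the cosine distance is {0}, {0, 1}, {1}: the outer pair has distance 1, while each of them has distance of about 0.29 to the middle fingerprint.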


Molecule Evolution with Niching

Experiments

Aim: compare the niching-based NSGA-II method with the normal NSGA-II method

Two test cases:
- Find ligands for the Neuropeptide Y2 receptor (NPY2)
- Find inhibitors for the Lipoxygenase (LOX)

Two objectives:
- Aggregated fitness score based on activity predictors
- Aggregated constraints score function

Experimental setup

- 5 runs for each method on each test case
- 1000 generations per run
- Normal NSGA-II:
    - 50 parents
    - 150 offspring
- Niching-based NSGA-II:
    - 10 niches
    - 5 parents per niche
    - 150 offspring
    - Niche radius set to 0.85 (empirically set)

Average Pareto Fronts

[Plots for NPY2 and LOX]

Average distance between the individuals in the final populations

[Plots for NPY2 and LOX]
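A diversity measure of this kind can be sketched as the mean Jaccard distance over all unordered pairs of individuals in a population. The slides do not state exactly how the average was computed, so the pairing scheme, the set-based fingerprint representation, and the function names below are assumptions.

```python
from itertools import combinations


def jaccard_distance(x, y):
    """Jaccard distance between two sets of on-bit positions."""
    union = x | y
    if not union:
        return 0.0  # assumed convention for two empty fingerprints
    return 1.0 - len(x & y) / len(union)


def mean_pairwise_distance(fingerprints):
    """Average distance over all unordered pairs; 0.0 for fewer than two items."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(x, y) for x, y in pairs) / len(pairs)


# Toy populations (hypothetical fingerprints).
similar = [{1, 2, 3}, {1, 2, 3}, {1, 2, 4}]
diverse = [{1, 2}, {3, 4}, {5, 6}]
similar_score = mean_pairwise_distance(similar)
diverse_score = mean_pairwise_distance(diverse)
```

A population of near-identical structures scores close to 0, while a population with no shared properties scores 1.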

Output sets of a NPY2 run without and with niching

Output sets of a LOX run without and with niching

Multi-dimensional Scaling Plots

[Two panels: no niching, niching]

The chemist’s view on the output

Regarding the niching:
- The molecules found with the niching method are clearly more diverse than the molecules found by the non-niching approach

In general:
- The molecules look reasonable overall, but:
    - Most molecules still possess unstable and/or toxic features that are not easy to synthesize in practice
    - Similar types of uncommon features seem to appear

Conclusions and Outlook

Conclusions:
- Applying niching using the Jaccard distance based on molecular fingerprints is a way to enhance search space diversity in molecule evolution
- It yields more diverse sets of molecules than a normal evolutionary algorithm for molecule evolution

Future research:
- Applying these methods to other (more sophisticated) models as well
- In vitro testing of selected molecules found using this method
- Incorporating more sophisticated measures for testing the synthesizability of candidate molecules

Thank you!

Alexander Aleman
Natural Computing Group
LIACS, Universiteit Leiden
e-mail: alexander.aleman@gmail.com