The Scaffold Tree

An Analysis Method for Chemical Structure Data Sets

Ansgar Schuffenhauer


Contributors

2 | The Scaffold Tree | Ansgar Schuffenhauer | Sheffield June 2007

MPI Dortmund
• Stefan Wetzel
• Marcus Koch
• Steffen Renner
• Herbert Waldmann

Novartis colleagues
• Peter Ertl
• Silvio Roggo
• Nathan Brown
• Paul Selzer
• Jeremy Jenkins
• Kamal Azzoui
• Jacques Hamon
• Edgar Jacoby

Chemical Classification: Why is it needed?

Industrial and publicly funded research uses increasingly large screening collections
• An increasingly large part of these compounds results from parallel synthesis

HTS of these collections gives increasingly large hit sets

It is important to identify hits belonging to a common chemical class
• They can be explored in a joint synthesis effort
• This effort can be guided by SAR derived from the screening data

Get the chemical classes right, then map the biological response onto the “chemical map”

Classification of Chemical Structures

Clustering

Classification derived from unsupervised machine-learning

Information from the complete data set is required for classification

No linear scaling with dataset size

No incremental updates possible

Rule-based

Explicitly formulated rules encode “expert knowledge”

Class assignment is derived for each structure independently

Scales linearly with number of molecules in dataset

Incremental updates possible

The Molecular Framework and its Generalizations

Bemis and Murcko J. Med. Chem. 1996, 39, 2887
Xu and Johnson J. Chem. Inf. Comput. Sci. 2002, 42, 912

[Figure: an original structure is reduced step by step. Pruning terminal sidechains gives the molecular framework; discarding atom and bond type gives the topological framework; discarding ring sizes and linkage lengths gives the reduced framework graph.]

Annotations in the figure:
• The abstracted frameworks are not well-defined chemical entities
• Addition of a cyclic sidechain prevents recognition of the common core

Are there Alternatives?

Retain the molecular framework as classification element
• Exocyclic and “exolinker” double bonds are part of the molecular framework

Instead of removing atom and bond type and ring size information, prune less important rings and sidechains piecemeal
• Do not disconnect the scaffold
• Use prioritization rules to decide which ring to remove first
• Small, generic set of rules, no “dictionary”

Schuffenhauer et al. J. Chem. Inf. Model. 2007, 47, 47

An introductory example: Baccatin III

[Figure: structure of Baccatin III and its molecular framework.]

Rule 3: Choose the parent scaffold with the smallest number of acyclic linker bonds

Scaffolds with the fewest acyclic linker atoms are likely to be more rigid
• Rigid scaffolds are more likely to present their sidechains in a conserved orientation

Acyclic linkers are strategic bonds
• Likely to be formed late in a parallel synthesis effort

[Figure: Flucloxacillin (5250-39-5) with rings labelled A, B, C, D and the candidate parent scaffolds.]

Rule 4: Keep bridged and spiro rings and unusually fused ring systems

Such ring patterns are unusual and unlikely to be formed unintentionally

Use the difference between the number of bonds that are members of more than one ring (nrrb) and the number of rings (nR) minus 1:
|∆| = |nrrb - (nR - 1)|
• In the most common, linear ring fusion pattern nrrb = nR - 1

Case 1: bridged rings

[Figure: Pentazocine (359-83-1) with rings labelled A, B, C. Candidate parent scaffolds: |2 - (2 - 1)| = 1 and |1 - (2 - 1)| = 0.]
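The |∆| criterion of Rule 4 is a one-line calculation once the two counts are known. The sketch below only does the arithmetic; the function name is hypothetical, and obtaining nrrb and nR from a structure (for example with a cheminformatics toolkit) is assumed to happen elsewhere.

```python
def ring_fusion_delta(nrrb: int, n_rings: int) -> int:
    """Return |nrrb - (nR - 1)| for a ring system.

    nrrb    -- number of bonds that are members of more than one ring
    n_rings -- number of rings (nR) in the ring system
    """
    if n_rings < 1:
        raise ValueError("a ring system needs at least one ring")
    if nrrb < 0:
        raise ValueError("nrrb cannot be negative")
    return abs(nrrb - (n_rings - 1))


# Counts taken from the examples in the talk (two-ring parent scaffolds).
print(ring_fusion_delta(2, 2))  # bridged pair: 1
print(ring_fusion_delta(1, 2))  # linearly fused pair: 0
print(ring_fusion_delta(0, 2))  # spiro pair: 1
```

A value of 0 corresponds to the common linear fusion pattern; any non-zero value flags a bridged, spiro or otherwise unusual ring system.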

Rule 4 continued

Case 2: non-linearly fused rings

[Figure: Sophocarpine (6483-15-4) with rings labelled A, B, C, D. Candidate parent scaffolds: |3 - (3 - 1)| = 1, |2 - (3 - 1)| = 0 and |2 - (3 - 1)| = 0.]

Case 3: spiro rings

[Figure: Rhynchophylline (76-66-4) with rings labelled A, B, C. Candidate parent scaffolds: |0 - (2 - 1)| = 1 and |1 - (2 - 1)| = 0.]

Rule 6: Remove rings of size 3, 5 and 6 first

[Figure: examples a) and b), including Epinastine (80012-43-7) with rings labelled A and B.]

The majority of commercially available building blocks contain rings of size 3, 5 or 6.

If rings of other sizes occur, they are likely to have been built intentionally to fulfill a dedicated purpose

Rule 8: Remove rings with the least number of hetero-atoms first

Hetero-atoms can make characteristic H-bonding interactions

Hetero-atoms bound with an exocyclic double bond

[Figure: example structures.]

Rule 9: If the number of hetero-atoms is equal, the priority of hetero-atoms is N > O > S

N-heterocycles play an important role in medicinal chemistry

N and O atoms are capable of forming H-bonds

Avoid mapping to benzene in cases where there are alternatives

[Figure: Ticlopidine (55142-85-3) with rings labelled A and B.]

Rule 10: Keep the larger ring with priority

Rule 11: Of mixed aromatic/non-aromatic ring systems, retain non-aromatic rings with priority

Rule 13: Use canonical SMILES as tie-breaking rule

Keep the scaffold whose canonical SMILES comes first in alphabetical sort order

The next pruning step will prune the ring that “won” the tie-break

[Figure: Ormeloxifene (31477-60-8) with rings labelled A, B, C, D. Canonical SMILES of the two candidate parent scaffolds:]

C2Oc1ccccc1CC2c3ccccc3

C3CC(c1ccccc1)c2ccccc2O3
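The prioritization rules act as a cascade: each rule narrows the set of candidate parent scaffolds, and the next rule is consulted only if a tie remains. The sketch below models just Rules 3, 6, 8 and 13 as they are worded on the slides. The `Candidate` record, its field names and the descriptor values are hypothetical and supplied by hand rather than computed from structures, and plain Python string comparison stands in for the alphabetical sort, which is an assumption about the ordering used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One possible ring removal and the parent scaffold it would produce."""
    parent_smiles: str        # canonical SMILES of the resulting parent scaffold
    linker_bonds: int         # acyclic linker bonds in the parent (Rule 3)
    removed_ring_size: int    # size of the ring being removed (Rule 6)
    removed_ring_hetero: int  # hetero-atoms in the ring being removed (Rule 8)


# Each rule maps a candidate to a score; lower scores are preferred.
RULES = [
    lambda c: c.linker_bonds,                                 # Rule 3
    lambda c: 0 if c.removed_ring_size in (3, 5, 6) else 1,   # Rule 6
    lambda c: c.removed_ring_hetero,                          # Rule 8
]


def choose_parent(candidates):
    """Apply the rules in order and return the single preferred candidate."""
    remaining = list(candidates)
    if not remaining:
        raise ValueError("no candidates given")
    for rule in RULES:
        best = min(rule(c) for c in remaining)
        remaining = [c for c in remaining if rule(c) == best]
        if len(remaining) == 1:
            return remaining[0]
    # Rule 13: tie-break on the canonical SMILES string.
    return min(remaining, key=lambda c: c.parent_smiles)


# The two parent scaffolds from the Ormeloxifene slide, with made-up
# descriptor values that tie on Rules 3, 6 and 8.
a = Candidate("C2Oc1ccccc1CC2c3ccccc3", 1, 6, 0)
b = Candidate("C3CC(c1ccccc1)c2ccccc2O3", 1, 6, 0)
print(choose_parent([b, a]).parent_smiles)  # C2Oc1ccccc1CC2c3ccccc3
```

Keeping the rules in a list makes the order explicit and lets further rules be slotted in without touching the selection loop.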

The introductory example revisited: Baccatin III

[Figure: Taxol, Baccatin III and the Baccatin III molecular framework, with the successive pruning steps labelled Rule 3, Rule 4, Rule 4 and Rule 6.]

Which rule is used how often? See poster T19

Classification of a public HTS data set: NCGC Pyruvate Kinase screen

HTS run at the NCGC, an NIH Roadmap screening institute
• Data can be downloaded from PubChem
• http://pubchem.ncbi.nlm.nih.gov/assay/assay.cgi?aid=361

602 active and ~50 000 inactive molecules

Scaffolds shown in tree
• Have at least 5% actives
• Represent at least 0.02% (10 compounds) of the whole data set
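The two display criteria amount to a simple filter over per-scaffold counts. The sketch below assumes the counts have already been aggregated into a dictionary; the function name, the data layout and the example numbers are hypothetical.

```python
def select_scaffolds(counts, min_active_fraction=0.05, min_compounds=10):
    """Return the scaffolds that pass both display criteria.

    counts maps a scaffold identifier to (n_actives, n_compounds).
    """
    selected = []
    for scaffold, (n_actives, n_compounds) in counts.items():
        if n_compounds < min_compounds:
            continue  # too few compounds to be shown
        if n_actives / n_compounds >= min_active_fraction:
            selected.append(scaffold)
    return selected


# Made-up counts for three scaffolds.
example = {
    "scaffold_1": (3, 40),  # 7.5% actives, 40 compounds
    "scaffold_2": (1, 40),  # 2.5% actives
    "scaffold_3": (2, 8),   # 25% actives, but only 8 compounds
}
print(select_scaffolds(example))  # ['scaffold_1']
```

The size threshold is expressed as an absolute compound count here, following the "10 compounds" figure on the slide, which avoids floating-point comparisons at the 0.02% boundary.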

Scaffold Tree Example for HTS results: PubChem Pyruvate Kinase Data Set

[Figure: scaffold tree of the data set. Color intensity by fraction of actives.]


[Figure: detail of the scaffold tree with the chemical structures of the scaffolds. Legend: actives.]

Scaffold Tree doesn’t fit on a screen or a poster?

Scaffold Tree extracts meaningful chemical series
• The tree allows visualization of the scaffold hierarchy
• If it would just fit onto the screen…

Interactive visualization tool needed:
• Manipulate resolution
• Filter scaffolds

See poster T30

Chemical Series and Biological Response

Enrichment of actives in part of the series
• Even in series with enriched activity, the actives are only enriched up to 20%
• This suggests that a scaffold’s potential for biological activity can only materialize with the appropriate side chains
• After all, this is the rationale behind combinatorial chemistry

Chemical Series and Biological Response

Chemical variation of library parallel to smooth change of biological response
• Essential binding features (“privileged substructure”) conserved in library variation
• Attractive entry point for chemical exploration
• Derivation of SAR possible

Chemical variation of library orthogonal to smooth change of biological response
• Chemical variation disrupts essential binding features
• SAR appears to be “flat” or active compounds appear to be singletons

[Figure: two panels of chemical space, each showing a library around a scaffold projected into chemical space against the desired biological activity, which ranges from inactive to highly active.]

Structural Classification Evaluated in Biology Space

Can we assess the relation between chemical class and biological activity in a more general manner?
• Can we compare clustering and rule-based classification?

Use in vitro data on a uniform assay panel to measure success

Two competing objectives
• As few partitions as possible
• Biological activity profile of compounds within partitions should be as similar as possible (low cluster spread SP)
• Evaluate by Pareto analysis

Schuffenhauer et al. J. Chem. Inf. Model. 2007, 47, 325

Pareto Analysis

A, B, C are all Pareto optimal solutions

B is superior to D

A-D are superior to E

One solution of the problem is superior to another solution only if it is superior in all objectives

[Plot: SP versus nPartitions, two panels. One panel shows the solutions A to E and the direction of optimization. The other shows partitions by structure between random partitioning (lower end benchmark) and profile based clustering (upper end benchmark).]
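The superiority criterion can be checked mechanically. The sketch below uses the definition as worded on the slide (superior only if better in all objectives) with both objectives, nPartitions and SP, to be minimized. The coordinates for A to E are invented so that they reproduce the relations stated on the slide; they are not read from the plot.

```python
def is_superior(a, b):
    """True if solution a is better than b in every objective (all minimized)."""
    return all(x < y for x, y in zip(a, b))


def pareto_optimal(solutions):
    """Return the names of solutions to which no other solution is superior."""
    optimal = []
    for name, objectives in solutions.items():
        beaten = any(
            is_superior(other, objectives)
            for other_name, other in solutions.items()
            if other_name != name
        )
        if not beaten:
            optimal.append(name)
    return sorted(optimal)


# Hypothetical (nPartitions, SP) pairs.
solutions = {
    "A": (2, 2.0),
    "B": (10, 1.5),
    "C": (100, 1.0),
    "D": (20, 1.8),
    "E": (150, 2.2),
}
print(pareto_optimal(solutions))  # ['A', 'B', 'C']
```

Many texts use a slightly weaker dominance test (no worse in all objectives and strictly better in at least one); swapping that into `is_superior` leaves the rest of the sketch unchanged.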

Spread of Partitions in Biological Profile Space

Distance of two compounds i, j in profile space:

$$d(i,j) = \sum_{a}^{\mathrm{assays}} \left[ pIC_{50}(i,a) - pIC_{50}(j,a) \right]^{2}$$

Average within partition k:

$$sp_k = \frac{2}{n_k (n_k - 1)} \sum_{i=1}^{i \le n_k} \; \sum_{j=1}^{j < i} d(i,j)$$

Average over all partitions weighted by partition size (to be minimized):

$$SP = \sum_{k=1}^{n_{\mathrm{cluster}}} \frac{n_k}{n_{\mathrm{total}}} \, sp_k$$
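A minimal sketch of the spread calculation, assuming that d(i, j) is the sum of squared pIC50 differences over the assays, that sp_k is the mean pairwise distance inside partition k, and that SP is the size-weighted mean of the sp_k values. Assigning a spread of 0 to a partition with a single compound is an assumption made here to keep the expression defined; the function names and the example profiles are hypothetical.

```python
from itertools import combinations


def profile_distance(p, q):
    """Sum of squared pIC50 differences over all assays."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def partition_spread(profiles):
    """Mean pairwise distance sp_k within one partition."""
    n = len(profiles)
    if n < 2:
        return 0.0  # assumption: a singleton partition has no spread
    total = sum(profile_distance(p, q) for p, q in combinations(profiles, 2))
    return 2.0 * total / (n * (n - 1))


def overall_spread(partitions):
    """Size-weighted average SP of the partition spreads."""
    n_total = sum(len(part) for part in partitions)
    if n_total == 0:
        raise ValueError("no compounds given")
    return sum(len(part) / n_total * partition_spread(part) for part in partitions)


# Three made-up compounds with pIC50 values in two assays, split into two partitions.
partitions = [
    [[5.0, 6.0], [5.0, 7.0]],  # sp = 1.0
    [[8.0, 6.0]],              # singleton, sp = 0.0
]
print(overall_spread(partitions))  # 2/3 * 1.0 + 1/3 * 0.0
```

Because every compound pair inside a partition is visited, the cost grows quadratically with partition size, which is unproblematic for a data set of about a thousand compounds.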

Pharmacology Safety Profile Data Set

Safety-Pharmacology Profile
• 1006 compounds
• IC50/EC50 values in 27 assays (mostly aminergic GPCRs and ion channels)
• No missing values
- Spread values well defined

Classification of Safety Data Set: Scaffold Tree, Pharmacology Profile

[Plot: SP (0.0 to 2.5) versus NPartitions (1 to 1000, logarithmic axis). Series: pIC50_Kmeans, pIC50_noise_Kmeans, Random, Scaffold Tree. The Scaffold Tree points are labelled 1 ring, 2 rings and 3 rings.]

Classification of Safety Data Set: Clustering with FCFP_4 Fingerprints

[Plot: SP (0.0 to 2.5) versus NPartitions (1 to 1000, logarithmic axis). Series: pIC50_Kmeans, pIC50_noise_Kmeans, Random, Scaffold Tree, FCFP_4_PPClust, FCFP_4_DivKM. The Scaffold Tree points are labelled 1 ring, 2 rings and 3 rings.]

Classification of Safety Data Set: FEPOPS Pharmacophore Descriptors

[Plot: SP (0.0 to 2.5) versus NPartitions (1 to 1000, logarithmic axis). Series: pIC50_Kmeans, pIC50_noise_Kmeans, Random, Scaffold Tree, FCFP_4_PPClust, FCFP_4_DivKM, FEPOPS_DivKM_maj.]

Jenkins et al. J. Med. Chem. 2004, 47, 6144

Classification of Safety Data Set: Simple 1D descriptors (MW, AlogP, PSA, ..)

[Plot: SP (0.0 to 2.5) versus NPartitions (1 to 1000, logarithmic axis). Series: pIC50_Kmeans, pIC50_noise_Kmeans, Random, Scaffold Tree, FCFP_4_PPClust, FCFP_4_DivKM, FEPOPS_DivKM_maj, simple_1D_DivKM.]

simple 1D: MW, ALogP, Num_RotatableBonds, PSA, Num_H_Acceptors, Num_H_Donors

Are the classifications overlapping? Adjusted Rand index matrix – Pharmacology Profile

Adjusted Rand index between classifications of the safety profile data set. The number of partitions is given in parentheses. Columns 1 to 18 follow the same order as the rows.

 #  Classification                   1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18
 1  pIC50_Kmeans (196)            0.98 0.84 0.03 0.00 0.02 0.02 0.02 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.00
 2  pIC50_noise_Kmeans (196)      0.84 0.77 0.03 0.00 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.00
 3  Murcko (SCINS, 167)           0.03 0.03 1.00 0.19 0.26 0.24 0.16 0.16 0.16 0.10 0.11 0.05 0.07 0.07 0.09 0.09 0.04 0.00
 4  Scaffold_Tree (level 2, 311)  0.00 0.00 0.19 1.00 0.33 0.34 0.15 0.14 0.13 0.10 0.14 0.05 0.06 0.06 0.08 0.06 0.03 0.00
 5  FCFP_4_PPClust (196)          0.02 0.01 0.26 0.33 0.83 0.76 0.31 0.30 0.25 0.21 0.36 0.09 0.12 0.13 0.15 0.13 0.06 0.00
 6  UNITY_FP_PPClust (196)        0.02 0.01 0.24 0.34 0.76 0.81 0.32 0.28 0.24 0.21 0.36 0.09 0.11 0.12 0.14 0.13 0.06 0.00
 7  Unity_DivKM (196)             0.02 0.01 0.16 0.15 0.31 0.32 1.00 0.34 0.26 0.22 0.16 0.09 0.11 0.10 0.10 0.14 0.06 0.00
 8  FCFP_4_DivKM (196)            0.01 0.01 0.16 0.14 0.30 0.28 0.34 0.68 0.26 0.18 0.14 0.09 0.10 0.10 0.09 0.14 0.05 0.00
 9  Similog_DivKM (196)           0.01 0.01 0.16 0.13 0.25 0.24 0.26 0.26 0.57 0.17 0.12 0.08 0.10 0.09 0.09 0.15 0.06 0.00
10  RDF_DivKM (196)               0.01 0.01 0.10 0.10 0.21 0.21 0.22 0.18 0.17 1.00 0.26 0.06 0.07 0.07 0.07 0.11 0.04 0.00
11  RDF_SOM (157)                 0.01 0.01 0.11 0.14 0.36 0.36 0.16 0.14 0.12 0.26 0.56 0.05 0.06 0.06 0.07 0.07 0.03 0.00
12  FEPOPS_DivKM_dist (198)       0.01 0.01 0.05 0.05 0.09 0.09 0.09 0.09 0.08 0.06 0.05 0.48 0.31 0.10 0.09 0.05 0.03 0.00
13  FEPOPS_DivKM_maj (196)        0.01 0.01 0.07 0.06 0.12 0.11 0.11 0.10 0.10 0.07 0.06 0.31 0.80 0.11 0.14 0.06 0.03 0.00
14  FEPOPS_PPClust_dist (189)     0.01 0.01 0.07 0.06 0.13 0.12 0.10 0.10 0.09 0.07 0.06 0.10 0.11 0.16 0.16 0.06 0.03 0.00
15  FEPOPS_PPClust_maj (176)      0.01 0.01 0.09 0.08 0.15 0.14 0.10 0.09 0.09 0.07 0.07 0.09 0.14 0.16 0.20 0.06 0.03 0.00
16  PhysChem_DivKM (196)          0.01 0.01 0.09 0.06 0.13 0.13 0.14 0.14 0.15 0.11 0.07 0.05 0.06 0.06 0.06 1.00 0.16 0.00
17  PhysChem_PCA (182)            0.01 0.01 0.04 0.03 0.06 0.06 0.06 0.05 0.06 0.04 0.03 0.03 0.03 0.03 0.03 0.16 1.00 0.00
18  Random (196)                  0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

For adjusted Rand index: Hubert and Arabie J. Classif. 1985, 2, 193
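For readers who want to reproduce such a comparison, the adjusted Rand index of two partitions of the same compounds can be computed from pair counts. The sketch below follows the standard pair-counting formula; the function name is hypothetical, and returning 1.0 for degenerate inputs where the denominator vanishes is a convention chosen here, not something stated in the talk.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index of two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # convention for fewer than two items

    # Pairs of items placed together in both, in A only, and in B only.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # convention: both labelings are trivially identical
    return (together_both - expected) / (maximum - expected)


# Same grouping under different label names.
print(adjusted_rand_index([0, 0, 1, 1], ["x", "x", "y", "y"]))  # 1.0
```

The index is 1 for identical partitions and close to 0 for unrelated ones, which matches the Random row and column of the matrix.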

Summary Pareto Analysis

No partitioning method is generally superior to all others: trade-off between local precision and scaffold hopping potential

The results of all methods are still closer to random than to the ideal biological clustering
• Not all biological activity clusters are covered by a single chemical class
- Especially not all inactives

[Figure: scale from local precision to scaffold hopping potential, with the methods placed in the order scaffold tree, 2D descriptors (FCFP_4), 3D pharmacophore descriptors (FEPOPS).]

Summary

With the Scaffold Tree we can detect chemically meaningful series
• Scaffold Tree allows us to detect series with enriched activity

However, chemical structure classes are not equivalent to biological activity classes
• Within a chemical series biological activity can vary
• A biological activity class can be spread over several structural classes
- This is especially true for the “inactive” class.
• This applies to a wide range of structural classifications

However, continuous changes in biological activity with chemical variation of a chemical series are actually desirable
• That biological activity varies smoothly in a chemical class indicates optimization potential
• Initial SAR derived from the series screening results may guide this optimization

