GraphSig : Mining Significant Substructures in Compound Libraries

Date post: 23-Feb-2016
Upload: fynn
Page 1: GraphSig : Mining Significant Substructures in Compound Libraries

GraphSig: Mining Significant Substructures in Compound Libraries

Page 2: GraphSig : Mining Significant Substructures in Compound Libraries

GraphSig

Input:
• Diverse background database
• Libraries of compounds with specific activity

Output:
• Prioritized list of significant substructures

Provides powerful insight into structure-activity relationships in the form of significant substructures.

Page 3: GraphSig : Mining Significant Substructures in Compound Libraries

Applications of GraphSig

• What makes compounds active against a target?
– Develop pharmacophore models based on results
• What substructures impart specific biological activity (e.g., BBB permeability)?
– Screen compounds for similar activity

Page 4: GraphSig : Mining Significant Substructures in Compound Libraries

Key Benefits

• Only automatic tool that identifies significant substructures
– No other tools can mine structure-activity relationships based on topology
– Unique use of a background statistical model derived from diverse compound libraries
• Scales to large databases
– Two orders of magnitude faster than alternatives
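The idea of scoring a substructure against a background model can be illustrated with a small sketch. It assumes, purely for illustration, that significance is measured as a binomial tail probability: if a substructure occurs in a fraction `p_background` of a diverse background library, how likely is it to appear in at least `k` of `n` active compounds by chance? The function name and all numbers are hypothetical; the slides do not specify the exact statistical model.

```python
from math import comb


def binomial_tail_p_value(n: int, k: int, p_background: float) -> float:
    """Probability of at least k hits in n independent draws,
    each hitting with probability p_background."""
    if not 0.0 <= p_background <= 1.0:
        raise ValueError("p_background must lie in [0, 1]")
    if not 0 <= k <= n:
        raise ValueError("k must lie in [0, n]")
    return sum(
        comb(n, i) * p_background**i * (1.0 - p_background) ** (n - i)
        for i in range(k, n + 1)
    )


# Hypothetical numbers: a substructure present in 0.1% of the background
# library shows up in 5 of 100 active compounds.
rare_but_enriched = binomial_tail_p_value(n=100, k=5, p_background=0.001)

# Hypothetical numbers: a substructure present in 60% of the background
# library shows up in 62 of 100 active compounds.
common_not_enriched = binomial_tail_p_value(n=100, k=62, p_background=0.6)

print(f"rare but enriched:    p = {rare_but_enriched:.3e}")
print(f"common, not enriched: p = {common_not_enriched:.3f}")
```

Under these assumed numbers, the rare substructure gets a far smaller p-value than the common one, even though it occurs in only 5% of the active compounds.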

Page 5: GraphSig : Mining Significant Substructures in Compound Libraries

Validation Studies

• Briem & Lessel dataset
– Find substructures specific to activity classes, along with their significance and support.
• hERG dataset
– Find substructures that are significant for toxicity.
• Permeability datasets (oral bioavailability, blood-brain barrier)
– Find substructures that are significant for permeability.

Page 6: GraphSig : Mining Significant Substructures in Compound Libraries

Briem and Lessel (BL) Dataset

• 957-compound subset of the MDL Drug Data Report (MDDR), classified by biological activity. The classes are summarized in the table below.

Compounds   Activity class
49          5HT3 Re-uptake Inhibitors
40          ACE Inhibitors
111         HMG-CoA Inhibitors
134         PAF Antagonists
49          TXA2 Antagonists
574         "Inactives"

Page 7: GraphSig : Mining Significant Substructures in Compound Libraries

Interesting Substructures from the BL Dataset

[Figure: seven substructure drawings, not reproduced in this transcript, with the following captions]
• Found in HMG-CoA inhibitors only
• Found in PAF Antagonists and 5HT3 inhibitors only
• Found in all ACE Inhibitors
• Found in PAF Antagonists and ACE inhibitors only
• Found in PAF and TXA Antagonists and 5HT3 inhibitors only
• Found in PAF Antagonists and TXA inhibitors only
• Found in PAF Antagonists only

Page 8: GraphSig : Mining Significant Substructures in Compound Libraries

hERG Dataset

• This dataset consists of compounds known to be active or inactive as hERG blockers. Applying GraphSig to it yielded some interesting substructures.

[Figure: substructure drawings, not reproduced in this transcript, in two groups]
• Significant substructures contained only in hERG blockers
• Significant substructures contained in compounds that are not hERG blockers

Page 9: GraphSig : Mining Significant Substructures in Compound Libraries

BBB Permeability

• BBBDataSet: 1593 compounds labeled as BBB permeable or not
• Significant substructures belong exclusively to either the permeable or the non-permeable set, not both. Hence, they are representative of each set.
• The frequency of occurrence of these substructures is very low (0.2-5%), so they would not be found on the basis of frequency alone.
• Classification accuracy of 82% with 5-fold cross-validation
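The 5-fold cross-validation protocol behind an accuracy figure like this can be sketched as follows. This is a minimal illustration, not the classifier used in the study: the toy `fit_exclusive_substructures` rule, the substructure labels "A", "B", "C", and the ten-compound dataset are all hypothetical, chosen only to echo the point that significant substructures occur in one class and not the other.

```python
import random
from typing import Callable, FrozenSet, List, Sequence

Sample = FrozenSet[str]
Predictor = Callable[[Sample], bool]


def k_fold_indices(n_items: int, k: int, seed: int = 0) -> List[List[int]]:
    """Shuffle indices 0..n_items-1 and deal them into k near-equal folds."""
    if not 2 <= k <= n_items:
        raise ValueError("k must satisfy 2 <= k <= n_items")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_accuracy(
    samples: Sequence[Sample],
    labels: Sequence[bool],
    fit: Callable[[List[Sample], List[bool]], Predictor],
    k: int = 5,
    seed: int = 0,
) -> float:
    """Mean held-out accuracy: train on k-1 folds, test on the remaining one."""
    accuracies = []
    for held_out in k_fold_indices(len(samples), k, seed):
        held_out_set = set(held_out)
        train = [i for i in range(len(samples)) if i not in held_out_set]
        predict = fit([samples[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(samples[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k


def fit_exclusive_substructures(
    train_samples: List[Sample], train_labels: List[bool]
) -> Predictor:
    """Toy rule (an assumption): predict 'permeable' when a compound contains
    a substructure seen only in permeable training compounds."""
    pairs = list(zip(train_samples, train_labels))
    in_permeable = set().union(*(s for s, y in pairs if y))
    in_other = set().union(*(s for s, y in pairs if not y))
    permeable_only = in_permeable - in_other
    return lambda sample: bool(sample & permeable_only)


# Hypothetical data: each compound is the set of substructures it contains.
compounds = [frozenset({"A", "C"})] * 5 + [frozenset({"B", "C"})] * 5
labels = [True] * 5 + [False] * 5

accuracy = cross_validated_accuracy(
    compounds, labels, fit_exclusive_substructures, k=5
)
print(f"mean held-out accuracy: {accuracy:.2f}")
```

Each compound is tested exactly once, by a model that never saw it during training, which is what makes the averaged accuracy an estimate of performance on unseen compounds.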

Page 10: GraphSig : Mining Significant Substructures in Compound Libraries

Connections to Other Acelot Tools

• Once a substructure is identified, use SimFinder for similarity searching.
• Perform 3-D alignment of compounds that contain the specific substructure and find well-aligned pharmacophoric features.
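As a generic illustration of similarity searching, the sketch below ranks library compounds by Tanimoto similarity to a query. It is not SimFinder's method or API, which the slides do not describe: the function names, the feature labels, and the tiny library are all hypothetical, and each compound is simply modeled as a set of features.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Tanimoto (Jaccard) similarity: shared features / all features."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty feature sets
    return len(a & b) / len(a | b)


def rank_by_similarity(
    query: FrozenSet[str], library: Dict[str, FrozenSet[str]]
) -> List[Tuple[str, float]]:
    """Return (compound name, similarity) pairs, most similar first."""
    scored = [(name, tanimoto(query, feats)) for name, feats in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical feature sets; a real search would use molecular fingerprints.
query = frozenset({"ring", "amine", "carbonyl"})
library = {
    "cmpd_1": frozenset({"ring", "amine", "carbonyl", "halogen"}),
    "cmpd_2": frozenset({"ring", "hydroxyl"}),
    "cmpd_3": frozenset({"sulfonyl"}),
}

for name, score in rank_by_similarity(query, library):
    print(f"{name}: {score:.2f}")
```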

Page 11: GraphSig : Mining Significant Substructures in Compound Libraries

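Technical Details: Mining

[Flow diagram, reconstructed from the slide labels]
• Graph DB → RWR on graphs → Feature vectors
• Feature vectors → Feature Vector Mining → Significant feature vectors
• Significant feature vectors → Sets of subgraphs
• Sets of subgraphs → Frequent subgraph mining (high frequency threshold) → Significant subgraphs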
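The "RWR on graphs" step refers to a random walk with restart. A minimal sketch of the general technique is shown below, computed by power iteration on a toy three-atom graph. The restart probability, the node names, and the way visit probabilities are summed per atom type into a feature vector are assumptions made for illustration; the slides do not give these details.

```python
from typing import Dict, List


def random_walk_with_restart(
    adjacency: Dict[str, List[str]],
    start: str,
    restart_prob: float = 0.25,
    tol: float = 1e-10,
    max_iter: int = 1000,
) -> Dict[str, float]:
    """Stationary visit probabilities of a walker that, at each step, jumps
    back to `start` with probability restart_prob and otherwise moves to a
    uniformly chosen neighbour."""
    prob = {node: 0.0 for node in adjacency}
    prob[start] = 1.0
    for _ in range(max_iter):
        nxt = {node: 0.0 for node in adjacency}
        nxt[start] += restart_prob  # restart mass (probabilities sum to 1)
        for node, p in prob.items():
            neighbours = adjacency[node]
            if not neighbours:
                # Isolated node: the walker can only return to the start.
                nxt[start] += (1.0 - restart_prob) * p
                continue
            share = (1.0 - restart_prob) * p / len(neighbours)
            for neighbour in neighbours:
                nxt[neighbour] += share
        change = sum(abs(nxt[node] - prob[node]) for node in adjacency)
        prob = nxt
        if change < tol:
            break
    return prob


# Toy molecule graph (hypothetical): a chain C1 - C2 - O3.
adjacency = {"C1": ["C2"], "C2": ["C1", "O3"], "O3": ["C2"]}
atom_type = {"C1": "C", "C2": "C", "O3": "O"}

visit = random_walk_with_restart(adjacency, start="C1")

# Assumed aggregation: sum visit probabilities per atom type to get a
# small feature vector describing the neighbourhood of the start node.
feature_vector: Dict[str, float] = {}
for node, p in visit.items():
    feature_vector[atom_type[node]] = feature_vector.get(atom_type[node], 0.0) + p

print(feature_vector)
```

Because of the restart, nodes close to the start node dominate the vector, so each vector describes a local neighbourhood rather than the whole graph.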

Page 12: GraphSig : Mining Significant Substructures in Compound Libraries

Technical Details: in silico Prediction


Page 13: GraphSig : Mining Significant Substructures in Compound Libraries

References

• Huahai He and Ambuj K. Singh. GraphRank: Statistical Modeling and Mining of Significant Subgraphs in the Feature Space. Proceedings of the 6th IEEE International Conference on Data Mining (ICDM), December 2006. doi:10.1109/ICDM.2006.79
• Sayan Ranu and Ambuj Singh. GraphSig: A Scalable Approach to Mining Significant Subgraphs in Large Graph Databases. 25th International Conference on Data Engineering (ICDE), 2009.
• Sayan Ranu and Ambuj Singh. Mining Statistically Significant Molecular Substructures for Efficient Molecular Classification. J. Chem. Inf. Model., 2009, 49(11), pp. 2537–2550. doi:10.1021/ci900035z
• Hans Briem and Uta F. Lessel. Perspect. Drug Discov. Des., 2000, 20, pp. 231-244.

