Szabolcs Csepregi*, Szilárd Dóránt, Nóra Máté, Miklós Vargyas, Péter Kovács, György Pirok, Ferenc Csizmadia
January, 2007
Structural Search Using ChemAxon Tools
Slide 2
Structural Search Using ChemAxon Tools — ICCI Pune India, January 2007 — JChem version 3.2
Structural Search Using ChemAxon Tools
Introduction
Search types in JChem
Interfaces
Search options and features
The Chemical Terms language (search result filtering)
Performance
Applications
Future plans
All examples were generated by Marvin
Slide 3
ChemAxon introduction
Company
• Founded in 1998, based in Budapest, Hungary, representation in the US, UK and Japan
• Wide cheminformatics expertise: >30 staff, including 9 PhD and 11 MSc
• Wide industry expertise: >180 corporate clients worldwide and >1000 academic users
Products
• Cheminformatics tools - structure drawing, visualization, search, transformation, library profiling and property prediction
• Enterprise chemistry database and cartridge technology
Technology
• Powerful/Flexible – Enterprise API toolkits
• Solutions – Desktop applications
• Java based + .NET – Platform independent + Web ready
Mantra
• Do what they want, respond quickly
Slide 4
ChemAxon Products Overview
Slide 5
Selected Application Areas
Global licenses
Custom development projects
Value added constructions
Websites/portal front and back end
Content / Educational
Slide 6
For academic teaching and research
More information at: http://www.chemaxon.hu/forum/ftopic193.html
• Unlimited* personal license for all products, support and upgrades
  (* JChem Base is limited to 3 searches/min)
• Open term license for teaching
• Repeating 2-year license term for research, provided ChemAxon is cited in publications
• License covers students of the department
• Unlimited number of applications / institution
Slide 7
Where is chemical searching useful?
Diversity of applications:
– Compound registration systems
– Electronic Laboratory Notebook (ELN) systems
– Pharmacophoric group (functional group) identification (JChem Screen, JKlustor)
– Rule-based fragmentation of libraries (JChem Fragmenter, RECAP)
– Virtual reaction processing (JChem Reactor)
– Standardization (canonicalization of structures, JChem Standardizer)
– Toxic fragment identification (superstructure search)
Slide 8
Search types in JChem
• Atom-by-atom search (ABAS), or structural search. Structural search types (the query and result depictions of the original table are not preserved):
  – Substructure
  – Superstructure
  – Exact
  – Perfect
• Similarity search:
  – Different descriptors
  – Different metrics
• MC(E)S – maximum common (edge) substructure
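As an illustration of one possible metric for similarity search, the sketch below computes the Tanimoto coefficient on fingerprints stored as Python integers. It is a generic example, not JChem code: the 8-bit fingerprints, the molecule names and the helper most_similar are hypothetical, and JChem's own descriptors and metrics may differ.

```python
def popcount(bits: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(bits).count("1")


def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto coefficient of two bit-set fingerprints stored as ints."""
    union = popcount(fp_a | fp_b)
    if union == 0:
        # Assumption: two empty fingerprints are treated as identical.
        return 1.0
    return popcount(fp_a & fp_b) / union


def most_similar(query_fp: int, library: dict, threshold: float) -> list:
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical 8-bit fingerprints, for illustration only.
library = {"mol_a": 0b11110000, "mol_b": 0b11000011, "mol_c": 0b00001111}
hits = most_similar(0b11110000, library, threshold=0.3)
```

Here mol_a is identical to the query, mol_b shares 2 of 6 set bits, and mol_c shares none, so only the first two pass the 0.3 threshold.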
Slide 9
Structural search interfaces
• JSP (Java Server Pages): web GUI for database
• Command line utility jcsearch: for files and DB
• Java and .NET API:
  – MolSearch class
    • isMatching(): only checks whether there is a match
    • findFirst(), findNext(), findAll(): enumerate all possible matchings
  – JChemSearch class: JChem Base
• Cartridge: access all functionality from Oracle SQL
• Chemical Terms
• Instant JChem
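The difference between only checking for a match and enumerating all matchings can be sketched with a toy atom-by-atom matcher. This is not the MolSearch API: the functions find_all and is_matching are hypothetical names that only loosely mirror findAll() and isMatching(), molecules are plain (atom symbols, bonds) tuples, and bond orders, hydrogens and query features are ignored.

```python
def find_all(query, target):
    """Yield every match as a tuple: position i holds the target atom index
    matched to query atom i. Bond orders are ignored in this sketch."""
    q_atoms, q_bonds = query
    t_atoms, t_bonds = target
    q_adj = {frozenset(bond) for bond in q_bonds}
    t_adj = {frozenset(bond) for bond in t_bonds}

    def extend(mapping):
        i = len(mapping)  # index of the next query atom to place
        if i == len(q_atoms):
            yield tuple(mapping)
            return
        for j, symbol in enumerate(t_atoms):
            if j in mapping or symbol != q_atoms[i]:
                continue
            # Each query bond to an already placed atom must exist in the target.
            if all(frozenset((j, mapping[k])) in t_adj
                   for k in range(i) if frozenset((i, k)) in q_adj):
                yield from extend(mapping + [j])

    yield from extend([])


def is_matching(query, target):
    """True if at least one match exists; stops at the first one found."""
    return next(find_all(query, target), None) is not None


# Molecules as (atom symbols, bonds between atom indices); hydrogens omitted.
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])
c_o = (["C", "O"], [(0, 1)])
c_c = (["C", "C"], [(0, 1)])
c_n = (["C", "N"], [(0, 1)])

all_cc_matches = list(find_all(c_c, ethanol))
```

The C-C query matches ethanol twice because the two carbon atoms can be mapped in either order, which is why enumerating matchings can return more hits than there are distinct substructures.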
Slide 10
Compatibility and integration
File formats: SMILES, MDL molfile (v2000 and v3000), MDL SDF, RXN, RDF, MRV
Integration: 100% Java, extensive API, JChem Cartridge for Oracle, .NET support via JNBridge
Database engines: Oracle, MySQL, MS SQL Server, PostgreSQL, MS Access, IBM DB2, etc.
Operating systems: Windows, Linux, Mac OS X, Solaris, etc.
Slide 11
Instant JChem
Desktop application for local and remote chemical database management, search and structure based prediction
• Simply connect to external databases and share your native database simultaneously
• Powerful search functionalities
• Scalable: explore hundreds of thousands of live structures
• Dynamically predict properties using Calculator Plugins
• Apply canonicalization rules for import and viewing
• Wide import / export options
• Merge data sets into a single set
• Very active development: what do you want to do?
Instant JChem: http://www.chemaxon.com/conf/Instant_JChem.ppt
Slide 12
JChem Cartridge for Oracle
• Access JChem functionalities from non-Java environments via SQL functions of Oracle
• All search features of JChem Base
• Complex chemical filters and property predictors using Chemical Terms expressions
• Standardization (structure canonicalization) during registration
• Structure format conversions
• 2D, 3D image generation
• Library enumeration using virtual reactions (Reactor)
JChem Cartridge http://www.chemaxon.com/JChem_Cartridge.ppt
Slide 13
JChem Cartridge for Oracle
Slide 14
Canonicalization with Standardizer
• Aromatize/dearomatize
• Add/remove explicit hydrogens
• Convert mesomers / tautomers / functional groups
• Remove solvents and counterions: by list, remove smallest fragment, or retain largest fragment
• Set/remove chiral flag, remove stereo features
• Ungroup S-groups
• Enumerate by stoichiometry values
• 2D, 3D coordinate generation (cleaning)
• Template based cleaning
Standardizer http://www.chemaxon.com/Standardizer.ppt
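The "retain largest fragment" idea can be sketched on a dot-separated SMILES string. This is a simplified illustration, not Standardizer itself: the function names are hypothetical, fragment size is approximated by a regex-based heavy-atom count, and real standardization works on parsed structures rather than on strings.

```python
import re

# Bracket atoms first, then two-letter organic-subset atoms, then one-letter
# atoms (upper case = aliphatic, lower case = aromatic).
# Implicit hydrogens are not counted; an explicit [H] atom would be.
ATOM_PATTERN = re.compile(r"\[[^\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops]")


def heavy_atom_count(fragment: str) -> int:
    """Approximate atom count of one SMILES fragment."""
    return len(ATOM_PATTERN.findall(fragment))


def keep_largest_fragment(smiles: str) -> str:
    """Return the fragment with the most atoms (the first one on ties)."""
    fragments = smiles.split(".")
    return max(fragments, key=heavy_atom_count)


# Sodium acetate: the acetate ion is kept, the sodium counterion is dropped.
parent = keep_largest_fragment("CC(=O)[O-].[Na+]")
```

Removing counterions "by list" would instead compare each fragment against a predefined set of known solvents and ions, which avoids discarding a fragment merely because it is small.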
Slide 15
Search options
Structural search options:
• Stereo on/off, absolute stereo (ignore chiral flag)
• Double bond stereo: no check / marked / all double bonds
• Chemical Terms filter expression
• Tautomer search
• Ignore charge / isotope / radical / valence / mixture brackets
• Exact charge / radical / isotope / query features / bond / stereo matching
• Vague bond matching modes: "or aromatic"; ignore bond types
• Timeout limit
• Order sensitive hits
• Pre-assignment of query and target atoms
• etc.
Slide 16
Query features 1. Atomic features
• Query atom types: any, hetero, list, not list
• Pseudo atoms, e.g. "Resin"
• Explicit lone pairs (these also match implied lone pairs)
• Charge, isotope, radical
• Link nodes (repeatable), e.g. (L1-2), (L1-5)
Slide 17
Query features 2. Query properties
• Query properties:
  Symbol        Description
  H<n>          Total hydrogen count
  a             Aromatic
  A             Aliphatic
  R<n>          Ring count in SSSR
  r<n>          Ring size in SSSR
  v<n>          Valence
  X<n>          Connectivity
  D<n>          Degree
  h<n>          Implicit H count
  rb<n>, rb*    Ring bond count (*: as drawn)
  s<n>, s*      Substitution count (*: as drawn)
  u             Unsaturated atom
Slide 18
Query features 3. Atomic SMARTS features
• SMARTS atoms
• Additional query properties:
  Symbol        Description
  & ; , !       Logical operators
  $(<smarts>)   Recursive SMARTS
  +0, -0        Zero charge
• Example: carbonyl C, but not amide (depiction not preserved)
Slide 19
Query features 4. Bond features & components
• Query bond types: any, single or double, single or aromatic, double or aromatic
• Bond topology: chain/ring
• SMARTS bonds:
  Symbol        Description
  - = #         Single, double, triple
  :             Aromatic
  & , ; !       Logical operators
  @             Ring bond
  / \ /? \?     Directional bond (cis/trans)
• Component level grouping:
  Symbol        Description
  (C.C)         Same component
  (C).(C)       Different component
  C.C           No component restrictions
Slide 20
Stereo searching 1. Double bonds
• Levels of check:
  – All
  – Only marked double bonds (MDL: stereo care flag)
  – None
• Meanings of the double bond depictions (depictions not preserved): cis; trans; cis or trans (unknown); not trans; not cis
Slide 21
Stereo searching 2. Tetrahedral chirality
• Stereo bond types: up, down, up or down
• Relative stereo configuration
• Chiral flag model
• Enhanced stereo representation: AND<n>, OR<n>, ABS groups
Slide 22
S-group integration (query & target)
Both sides are treated similarly by the search:
• Abbreviations (super-atom S-groups):
• Multiple groups:
Other S-groups supported: component, mixture and formulation brackets:
Slide 23
Reaction search
• Reactants, agents, products
• Transformation recognition (mapping)
• Stereospecific reactions (inversion, retention)
• Reactant grouping
• Reacting center
Slide 24
R-group search
• Scaffold, R-group definitions
• Monovalent, divalent R-groups
• R-logic:
  – Occurrence
  – If-then
  – Rest H
Slide 25
Hydrogens
• H representations:
  – Explicit
  – Implicit
  – Query H count: total (H<n>), implicit (h<n>)
• Example: table showing which H representations (explicit H, implicit H, query H count) of the query and the target are considered in ABAS (depictions not preserved)
Slide 26
Applications of Chemical Terms
Chemical Terms (CT) supplies:
• Virtual synthesis: reaction and synthesis rules
• Pharmacophore analysis: pharmacophore definitions
• Drug design: goal functions
• Structural search: advanced query expressions (e.g. in Instant JChem & Cartridge)
Slide 27
Chemical Terms
Elements of the language
• property calculations (partial charge distribution, pKa, logP, HB donors, acceptors, etc.)
• structure matching functions (describing functional groups, reaction sites, similarity, etc.)
• arithmetic and logic operators
Chemical Terms examples
searching:
  match("olefine.mol") && !match("c1ccncc1") && (atomCount(16) == 0) || (mass() < 300);
goal functions:
  inhibitor = inhibitor.mol;
  (similarity(inhibitor, pharmacophore_tanimoto) > 0.8) && (similarity(inhibitor, chemical_tanimoto) < 0.5);
filtering:
  (mass() <= 500) &&
  (logP() <= 5) &&
  (donorCount() <= 5) &&
  (acceptorCount() <= 10);
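To make the logic of the "filtering" expression on this slide concrete, here is a minimal Python rendering of the same four conditions. It is only a sketch: it does not evaluate Chemical Terms, the property values are hypothetical and assumed to be precomputed, and the dictionary keys simply reuse the function names from the expression.

```python
def passes_filter(props: dict) -> bool:
    """Python rendering of the 'filtering' expression: all four limits must hold."""
    return (
        props["mass"] <= 500
        and props["logP"] <= 5
        and props["donorCount"] <= 5
        and props["acceptorCount"] <= 10
    )


# Hypothetical, precomputed property values (nothing is calculated here).
candidates = {
    "cand_1": {"mass": 310.4, "logP": 2.1, "donorCount": 2, "acceptorCount": 5},
    "cand_2": {"mass": 612.7, "logP": 4.8, "donorCount": 3, "acceptorCount": 9},
    "cand_3": {"mass": 455.0, "logP": 5.6, "donorCount": 1, "acceptorCount": 4},
}

kept = [name for name, props in candidates.items() if passes_filter(props)]
```

In this sample cand_2 fails the mass limit and cand_3 fails the logP limit, so only cand_1 is kept.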
Slide 28
Chemical Terms
Some available functions
• Structural search (match, matchcount)
• Partial charge distribution
• pKa, log P, log D, major microspecies
• Polarizability
• Topological Polar Surface Area
• Number of rotatable bonds, rings, aromatic rings, etc.
• Number of HB donors/acceptors
• Exact mass
• Arithmetic and logic operators
• Extensible: your own Java plugins can be easily added
• Etc.
Slide 29
Fingerprint screening in the database
• JChem database searches use fingerprint technology for fast search results.
• Screening rapidly* filters out most non-hits: usually more than 99% of them are rejected.
• Supported fingerprint types:
  – Chemical hashed fingerprints
  – User-defined additional structural keys
* Average screening time in a 3-million cached table: ~0.5 s
Diagram: search query → fingerprint screening of the JChem table → structures are either screened out or need to be searched → atom-by-atom search → results (hits for the query)
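The screening step can be sketched as a bitwise test: a target can only contain the query as a substructure if every bit set in the query fingerprint is also set in the target fingerprint. The example below is a simplified assumption, not JChem's implementation: it uses a small, hypothetical list of structural keys instead of hashed fingerprints, and the feature sets are supplied by hand rather than derived from structures.

```python
# Hypothetical structural keys; bit i is set when feature i is present.
STRUCTURAL_KEYS = ["carbonyl", "amine", "aromatic ring", "halogen", "hydroxyl"]


def fingerprint(features) -> int:
    """Encode a collection of feature names as a bit set stored in an int."""
    bits = 0
    for feature in features:
        bits |= 1 << STRUCTURAL_KEYS.index(feature)
    return bits


def screen(query_fp: int, table: dict) -> list:
    """Keep targets whose fingerprint contains every bit of the query."""
    return [name for name, fp in table.items() if (fp & query_fp) == query_fp]


# Hand-assigned feature sets standing in for a structure table.
table = {
    "t1": fingerprint(["carbonyl", "amine", "aromatic ring"]),
    "t2": fingerprint(["aromatic ring", "halogen"]),
    "t3": fingerprint(["carbonyl", "amine"]),
    "t4": fingerprint(["hydroxyl"]),
}
query_fp = fingerprint(["carbonyl", "amine"])

# Only these candidates would go on to the slower atom-by-atom search.
candidates = screen(query_fp, table)
```

Screening never rejects a true hit, but a candidate that passes is not guaranteed to match, which is why the atom-by-atom search is still needed for the remaining structures.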
Slide 30
Performance
Searching 3 million SMILES structures (multiplied NCI 2000) in DB: JChem Base 3.2, Dual Xeon 3 GHz, 2 GB RAM; Oracle 9.2.0.7.0
Number of hits   Search time (s)
12               0.5
936              0.7
4608             1.4
65208            11.4
(One query structure per row; the query depictions are not preserved.)
Slide 31
Demo
Slide 32
Application: R-group decomposition
JChem is able to identify the ligands of a given scaffold at specified substitution positions:
Diagram: library + query (scaffold) → R-group decomposition → result
Slide 33
Further applications of structural search in JChem
• Transformations (Standardizer & Reactor), e.g.:
  – Enamine-amine tautomerism
  – Converting the covalent form of alcoholates to the ionic form
• Identification of pharmacophoric groups (Pmapper), e.g. nitro, amidine
• Identification of bond cleavage (Fragmenter), e.g. ether cut
Slide 34
Future plans
• Searching in Markush targets (R-groups, atom lists, link nodes, bond lists)
• New bracket types
• Further generic atom types (AH, QH, X, M, etc.)
Slide 35
Future plans – Combinatorial Markush database prototype
Stage I: "Combinatorial libraries"
Markush features:
• R-groups
  – Any nesting
  – Up to 2 connections
  – In ring or chain
• Atom lists
• Bond lists
• Link nodes
Functionality:
• Registration into database
• Search in Markush DB (w/o enumeration)
• Enumeration (full, selective or hit enumeration)
• Enhanced Markush sketching (MarvinSketch)
Very complex Markush libraries can be handled, even ones with more than 263 members
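Why searching without enumeration matters can be shown with simple counting: the number of members in a combinatorial library is the product of the number of alternatives at each variable position. The sketch below is an assumption-laden illustration, not the Markush prototype: R-group definitions are flat lists of hypothetical substituent labels, and nesting, atom lists, bond lists and link nodes are ignored.

```python
from itertools import islice, product
from math import prod


def library_size(r_groups: dict) -> int:
    """Number of library members, computed without enumerating them."""
    return prod(len(options) for options in r_groups.values())


def enumerate_members(r_groups: dict):
    """Lazily yield one {R-group: substituent} assignment per member."""
    names = list(r_groups)
    for choice in product(*(r_groups[name] for name in names)):
        yield dict(zip(names, choice))


# Hypothetical R-group definitions (labels only, no structures).
r_groups = {
    "R1": ["methyl", "ethyl", "phenyl"],
    "R2": ["chloro", "bromo"],
    "R3": ["hydroxy", "amino", "nitro", "cyano"],
}

size = library_size(r_groups)  # 3 * 2 * 4 members
first_two = list(islice(enumerate_members(r_groups), 2))  # selective enumeration

# 70 positions with 10 alternatives each: counting is instant, enumerating is not.
huge = library_size({f"R{i}": range(10) for i in range(70)})
```

Because the generator is lazy, taking only the first few members, or only the members that correspond to search hits, never requires building the full library in memory.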
Slide 36
Summary
• JChem suite: contains a broad range of chemical search facilities.
• Chemical Terms: allows easy and flexible data mining.
• Structural search is a useful tool for many applications.
Slide 37
References
• JChem Query Guide: http://www.jchem.com/doc/user/Query.html
• Chemical Terms reference: http://www.jchem.com/doc/user/EvaluatorLanguage.html
• JChem Base JSP demo page: http://www.jchem.com/examples/jsp1_x/index.jsp
• Instant JChem: http://www.chemaxon.com/product/ijc.html
• jcsearch command line tool: http://www.jchem.com/doc/user/Jcsearch.html
• API documentation: http://www.jchem.com/doc/api/index.html (chemaxon.sss.search.MolSearch, chemaxon.jchem.db.JChemSearch)
Slide 38
Máramaros köz 3/a, Budapest, 1037 Hungary
www.chemaxon.com
Thank you for your attention