5 years with ChemAxon
A Developing Partnership
Richard Bolton, GSK
Abstract
2 21 May 2014 ChemAxon UGM Budapest
• GlaxoSmithKline began using ChemAxon tools across its enterprise systems in late 2009. A selection of presentations from previous UGMs, which reflect the increasing usage and show the developing relationship between the two companies, will be highlighted and their key status updated. A new use for ChemAxon tools in Enterprise Search will be presented.
Agenda
1. Where did we start?
2. Two retrospectives illustrating changes in working model
   a. IJC
   b. Compound registration
3. Enterprise Search with added ChemAxon capabilities
4. What next?
5. Conclusions.
Where did we start?
• GSK began a major chemistry tools simplification effort in 2008.
– Chemistry Research IT simplification programme (CRISP).
– In 2013 this began again across Discovery as ‘IT Fitness’.
• The ChemAxon chemistry engine was chosen to replace the Daylight, Accord and MDL (RCG) backend systems, and IJC to replace ISIS.
• In 2009 GSK began to work with ChemAxon tools in the CRISP programme to move away from a complex collection of legacy tools; the move was planned to take approximately 24 months.
– ISIS to IJC.
– Web services that powered chemistry methods had ChemAxon methods added.
– Computational scientists began to use the ChemAxon toolset in place of the Daylight toolkit.
• Success led to a collaboration on Small Molecule Registration in March 2011.
Major themes of CRISP as presented at ChemAxon UGM.
• Migration of Oracle Cartridges and remediation of web services.
– SOA-at-GSK-Working-in-a-Mixed-Technology-Environment : Brett Hiemenz : San Diego 2009
– Implementation of ChemAxon in an SOA environment : Shane Weaver : Budapest 2010
• Structure Activity Relationship (SAR) tools replaced by Helium and JChem for Excel
– SAR analysis in Excel using Helium and JChem : Karen Worsfold : Budapest 2011
• ISIS to IJC
– Deploying Instant JChem on an Enterprise Scale : Brett Hiemenz : San Diego 2011
– Delivering Instant JChem to the Masses - A User Perspective : Stephen Swanson : Budapest 2012
– IJC in the wider enterprise – 18 months on : Karen Worsfold : Budapest 2013
• Simplification of small molecule registration
– A-novel-Approach-to-Pharmaceutical-Registration : Charlie Wilkins : San Diego 2011
– Going live with registration as a service : Richard Bolton/Akos Papp : Budapest 2012
– Registration as a Service: the full story after half a year : Rama Bhamidipati/Akos Papp : Boston 2012
SAR Tools Objectives
[Figure labels: Tibco Spotfire, Excel 2007, JChem for Excel]
Stephen Swanson Budapest UGM 2012
Retrospective 1: IJC evolution
• Initial replacement for ISISBase began in 2009. Rollout of version 5.6 in mid 2011.
– ChemAxon converted H-views to IJC projects like-for-like. Of approximately 400 H-views, about half were converted.
– Bespoke tools to create test/prod environments.
– Slow US performance necessitated Citrix.
• Cross-pharma prioritisation of requirements. Uplift to 5.7.2.1 in 2012.
• Evaluation of JChem web Browser in 2013
– GSK database architecture made performance slow.
– Joining across multiple data sources
• 2014: Uplift to version 6.2 using ChemAxon Services to replace the test/prod publication mechanism.
– Still using Citrix for US users (databases mostly in UK).
• 2015: Proposed uplift to Browser, resourced by ChemAxon Services.
Timeline for deployment of IJC
Karen Worsfold Budapest UGM 2013
[Timeline figure covering 2009, 2010, 2011 and 2012. Milestones shown; their positions on the time axis are not recoverable from the extracted text:]
• Contract signed
• Prototype ACD Finder demo’d to LUG
• ACD Finder & GSKChem released to LUG
• Decision to use Citrix for US & Asia
• GSKChem and ACD Finder removed from ISIS NET
• ACD Finder & GSKChem released to users
• Citrix available for US & Asia
• IJC 5.6.0 roll-out
• Script for adding KATE tables released to LUG
• Start of decommissioning of ISIS
• All programme Hviews removed from ISIS NET
• First programme Hview to pass UAT
• Web Admin page available
• IJC 5.7.2.1 upgrade
Retrospective 2: Small Molecule registration Evolution Phase 1
• CRISP targeted registration simplification by conversion to a ‘service’. GSK Legal raised concerns about hosting registration outside the firewall.
• March 2011: collaboration with ChemAxon to create a new Registration ‘Service’ inside the firewall, complying with GSK business rules and using standard formats.
• Delivered into production at GSK in March 2012 (6 million records migrated).
• Current ChemAxon registration is approximately 50% automated. In 2013 approximately 48k NCEs were registered, of which approximately 21k were achiral (no stereo).
Phase I - Laying the Groundwork
[Workflow diagram. Recoverable labels and annotations:]
• Actors: Contract Chemist, Chemist, Purchasing Chemist, Registrar
• Components: eLNB or WebReg, Business Rules, Staging, Registrar Tools, Registry, SD Files
• Example data: Purchased Compounds table with Compound Name CCE 001, CCE 002, CCE 003
• Chemist or contract coordinator enters compounds to be registered and is notified of undefined attributes; optionally, requests assistance from the registrar.
• Business Rules are defined by Standardizer and Structure Checker configurations.
• Compounds that are fully described with regard to stereochemistry and racemic content are automatically registered (Fully Specified compounds).
• Compounds with undefined attributes are sent to a staging area for registrar attention.
• Registrars modify compounds for consistency, confirm with the chemist, then register.
• Chemists purchase compound sets and receive data files with structure information.
• The purchasing chemist sends the SD File to Registrars and the compounds are registered as new entities.
Charlie Wilkins San Diego UGM Sept 2011
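The routing rule on this slide (fully specified compounds go straight to the registry, everything else goes to staging for a registrar) can be sketched in a few lines. This is only an illustration under stated assumptions: the `Submission` record, its two boolean fields and the `route` function are hypothetical names, and the real business rules are defined by Standardizer and Structure Checker configurations that are not shown in the slides.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    """Hypothetical record for a compound submitted for registration."""
    name: str
    stereo_defined: bool           # assumption: stereochemistry fully described (or achiral)
    racemic_content_defined: bool  # assumption: racemic content fully described


def route(sub: Submission) -> str:
    """Return 'registry' for fully specified compounds, otherwise 'staging'."""
    if sub.stereo_defined and sub.racemic_content_defined:
        return "registry"   # auto-registered, no registrar involved
    return "staging"        # held for registrar attention


# Illustrative submissions only; names and flags are made up.
submissions = [
    Submission("cmpd-A", True, True),
    Submission("cmpd-B", False, True),
    Submission("cmpd-C", True, False),
]
routed = {s.name: route(s) for s in submissions}
print(routed)
```

Keeping the decision in one small function mirrors the idea of the slide: the chemist can be told up front which attributes are undefined, and only the compounds that fail the check reach a registrar.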
Phase II – Optimization
[Workflow diagram. Recoverable labels and annotations; three elements are marked with an x in the diagram, and which ones is not recoverable from the extracted text:]
• Actors: Contract Chemist, Chemist, CCE Chemist, Registrar, Other?
• Components: eLNB or WebReg, Business Rules, Staging, Registrar Tools, Registry, SD Files
• Example data: Purchased Compounds table with Compound Name CCE 001, CCE 002, CCE 003
• Chemist or contract coordinator enters compounds to be registered and checks registerability before submitting; optionally, requests assistance from the registrar.
• Compounds that are fully described with regard to stereochemistry and racemic content are automatically registered (Fully Specified compounds).
• Compounds with undefined attributes are sent to the registrars.
• Registrars modify compounds for consistency, confirm with the chemist, then register.
• Compound Collection Enhancement purchases compounds and receives data files with structure information.
• Compound Collection Enhancement sends the SD File to the Registry as received and the compounds are registered as new entities.
Charlie Wilkins San Diego UGM Sept 2011
Small Molecule registration Evolution Phase 2
• RaaS2: collaboration with ChemAxon (mid 2013) to deliver feedback capability as part of the ‘product’. Target of 70% auto-registration, with 80% potential.
• For compounds that do not auto-register, the scientist will be able to edit and ‘fix’ them using the new compound checker tools.
• If a structure is too complex, it can be sent to a registrar to complete.
• Rollout is imminent.
Enterprise Search with Chemistry: Socrates Search
• Chemistry has been added to the GSK Enterprise Search engine using
ChemAxon technology
• Presented at BioIT world in 2013 (and won an innovation award)
– Not yet presented at a ChemAxon UGM
• GSK data is crawled and the ‘chemistry’ is extracted into an Oracle database with a ChemAxon-powered index
– Substructure search only
– Reaction search is possible but the business did not prioritize it
– Electronic Lab Notebook Reactions are included in the indexing.
– JChem libraries used for format conversions and image rendering
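As background to the substructure search mentioned above, one common way a chemistry index narrows down candidates is fingerprint screening: a structure can only contain the query if its fingerprint has every bit that the query fingerprint has. The sketch below illustrates that screening step alone. It is not a description of how the GSK index or the ChemAxon software is implemented; the feature keys, document ids and function names are all hypothetical, and a real system would follow the screen with an atom-by-atom match.

```python
# Toy feature keys (assumption); a real index derives hashed fingerprints
# from the structure itself rather than from hand-picked labels.
FEATURES = ["aromatic_ring", "carboxylic_acid", "ester", "amine", "halogen"]


def fingerprint(features):
    """Pack a list of feature names into an integer bitmask."""
    bits = 0
    for name in features:
        bits |= 1 << FEATURES.index(name)
    return bits


def screen(query_fp, index):
    """Keep ids whose fingerprint contains every bit set in the query."""
    return [doc_id for doc_id, fp in index.items() if (fp & query_fp) == query_fp]


# Hypothetical indexed structures, keyed by the document they were found in.
index = {
    "doc-1": fingerprint(["aromatic_ring", "carboxylic_acid", "ester"]),
    "doc-2": fingerprint(["aromatic_ring", "amine"]),
    "doc-3": fingerprint(["aromatic_ring", "carboxylic_acid"]),
}
query = fingerprint(["aromatic_ring", "carboxylic_acid"])
print(screen(query, index))  # candidates that still need exact verification
```

The screen can return false positives but never drops a true match, which is why it works as a cheap first pass before the expensive structure comparison.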
Insight Search Objectives
- Search: Enabling findable and interpretable data
1. Internal content: eLNB, CLS, Team Sites, Lotus Notes, IMMS, PIER
2. Chemistry entities by structure or name
3. Integration with external sources: Reaxys, PharmaPendium, NCBI, ClinicalTrials.gov
4. Exposing eLNB attachments
BioIT world 2013
Search Technology Overview
[Diagram labels: Socrates Search; Autonomy (Text Entities); ChemAxon (Chemical Structures); LeadMine (Text Analytics); HazELNut; Lab Notebook; Documentum; Lotus]
BioIT world 2013
Text Analytics
[Diagram: document annotations. Labels: Text, Chemical, Fingerprint, Structure, Tag, Gene, Disease, Substructure Search, Exact structure search, Embedded OLE objects (ChemDraw, ISIS Draw), Universally Unique Tag (AJSHBSYTW6HSG8KJG9KA)]
Example entity: Aspirin, with the representations shown:
• 2-acetoxybenzoic acid (IUPAC)
• O=C(Oc1ccccc1C(=O)O)C (SMILES)
• CCI133 (GSK ID)
• 2-acetyloxybenzoic acid
• acetylsalicylate
• acetylsalicylic acid
• O-acetylsalicylic acid
BioIT world 2013
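The Aspirin example suggests that every name, identifier or structure string for one compound resolves to a single universally unique tag, so a search on any synonym finds the same documents. A minimal sketch of that idea follows, assuming this reading of the slide is right. The tag value and the Aspirin synonyms are taken from the slide; the lookup tables, the function names and the three sample documents are hypothetical.

```python
from collections import defaultdict

TAG = "AJSHBSYTW6HSG8KJG9KA"  # tag shown on the slide

# Names are matched case-insensitively; SMILES and ids must match exactly,
# because changing the case of a SMILES string changes its meaning.
NAMES = {
    n.casefold(): TAG
    for n in [
        "Aspirin",
        "2-acetoxybenzoic acid",
        "2-acetyloxybenzoic acid",
        "acetylsalicylate",
        "acetylsalicylic acid",
        "O-acetylsalicylic acid",
    ]
}
EXACT = {"O=C(Oc1ccccc1C(=O)O)C": TAG, "CCI133": TAG}


def tag_for(term):
    """Resolve a term to its entity tag, or None if it is not recognised."""
    return EXACT.get(term) or NAMES.get(term.casefold())


def build_index(docs):
    """Map each entity tag to the set of documents that mention it."""
    index = defaultdict(set)
    for doc_id, terms in docs.items():
        for term in terms:
            tag = tag_for(term)
            if tag:
                index[tag].add(doc_id)
    return index


def search(term, index):
    """Find documents mentioning the same entity under any representation."""
    tag = tag_for(term)
    return sorted(index.get(tag, set())) if tag else []


# Hypothetical documents with the terms a crawler extracted from them.
docs = {"note-1": ["aspirin", "BRCA1"], "note-2": ["CCI133"], "note-3": ["ibuprofen"]}
index = build_index(docs)
print(search("acetylsalicylic acid", index))  # ['note-1', 'note-2']
```

Indexing by tag rather than by raw text is what lets a query in one representation retrieve documents that only ever used another.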
Search Components
[Diagram labels: Insight Search; Federated Search; Content Indexes: GSKSearch (Text), ChemAxon (Chemistry); Crawlers; Web Services; Unstructured Content: eLNB, Team Site, Pier, …; External Sources]
BioIT world 2013
Chemistry Crawling
• Chemistry representations
– Names: GSK Numbers, trade/common names, IUPAC names
– Structure/Reaction: SMILES, CDXML, Mol Files
– Other: Microsoft OLE controls (ChemDraw, ISIS), images
• Chemistry sources
– eLNB GSK/CRO, eE, Core Lab Systems LNB, Team Sites, Pier (Documentum), Lotus Notes
BioIT world 2013
What Next?
• Support for chemistry representation in Spotfire. ChemAxon Services is developing a Spotfire plugin using a GSK tool as the model. Precompetitive IP.
• Computational scientists are evaluating in-memory databases.
• Move to the Web: Plexus. Working with ChemAxon to ensure that our roadmap aligns with delivery plans.
• Partnership with Schrödinger, integrated with Plexus Library Design tools.
• Support services for ChemAxon tools. How can my Support services work with ChemAxon Support to maintain component currency without a project?
In Conclusion
• ChemAxon began as a key vendor for GSK in 2009.
– Deployment of IJC and key CRISP programme technology.
• Became a co-collaborator in 2011.
– Registration ‘as a service’.
• Service provider in 2014.
– Continued collaboration on the Registration roadmap.
– Assistance with Socrates Search.
– Services development (Spotfire, reaction databases, academic DPU proof-of-concept work).
– Services uplift of IJC.
• 5 years on, the JChem Platform is now the default ‘Chemistry engine’.
Thank you for your attention
Questions?