Zhenbin (Benjamin) Li and Edith Richter IT RD&M
Challenges and Approach in Migration of Chemistry Infrastructure for Systems and Applications
Historical/background Information about Chemistry Infrastructure Migration in BI
Z. Li and E. Richter, Boehringer Ingelheim, ChemAxon UGM, September 2014
2013 - 2015
System and Application Migration Projects
Migration of systems and applications to ChemAxon Technology
2011 - 2012
Chemistry Infrastructure Project
Experimented with and piloted ChemAxon in the BI environment
2008
MDL Alternatives Project
Evaluated ChemAxon, Accelrys Accord, Symyx MDL, CambridgeSoft
ERMS with ChemAxon JChem to fulfill a strategic user requirement (2010), presented at the 2011 UGM
ERMS with ChemAxon JChem cartridge implementation as a pilot (2012), presented at the 2012 UGM
• Reliability
• Extensibility
• Compatibility and integrability
• Enterprise architecture
• User friendliness
• Clear path for migration
• Consulting
• Support
• Expertise
• Agility of enhancement
• Customization
• New upgrade according to customers’ needs
• License model
• Negotiation power
• Short-term cost cut
• Long-term financial gain
• Company stability
• Size
• Culture/work ethics
• Familiarity with global pharmaceutical industry
Chemistry Infrastructure Selection Criteria
Roadmap of Chemistry Infrastructure Migration
Business intention
Market options
Preliminary evaluation
Business approval
In-depth evaluation
Pilot implementation
Negotiation
Financial commitment
Acquire licenses
Consulting
In-depth system analysis
Implementation planning
Data migration
System migration
Testing
Completion
Waterfall
● Clear license cutoff time
● Resources (including external consulting) are more focused on the migration projects
● Potentially short time to complete the whole migration
● Requires clear understanding of existing systems and interrelations
● Need of established expertise
● May have nonadjustable deadline
● Less tolerant to failure
● Resources may have to reduce other responsibilities
Evolutional
● Migration takes its own path
● Flexible resource and time planning
● Skills and knowledge accumulate along the way, so later projects benefit from the experience of earlier ones
● It takes a long time to complete the whole migration
● Requires a period in which licenses co-exist (perpetual license model)
● Structures may be interpreted differently by different technologies. An internal standardization process or common understanding may be needed
Migration Approach
System Interdependency Analysis
Migration Planning
[Gantt chart: migration schedule for System A through System H by quarter, from Q3 2012 through 2015]
Migration planning was based on the interdependency of systems and applications as well as available resources.
The pilot project provided a benchmark for estimating the time needed for each system. The estimate should cover the migration of data, application, interfaces, and logistics.
Each additional system migration served as a further benchmark for the remaining systems, so that the estimates could be refined and optimized.
Learning curve effects were considered.
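The planning logic described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dependency map is hypothetical (the slides do not list the real dependencies between systems), and the Wright learning curve with a 90% learning rate is a stand-in for whatever estimation method was actually used.

```python
import math
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependencies: each system maps to the systems that must be
# migrated before it. These are NOT the real BI dependencies.
DEPENDS_ON = {
    "System A": set(),
    "System B": {"System A"},
    "System C": {"System A"},
    "System D": {"System B", "System C"},
}


def migration_order(depends_on):
    """Return one migration order that respects every dependency."""
    return list(TopologicalSorter(depends_on).static_order())


def estimated_effort(pilot_weeks, position, learning_rate=0.9):
    """Estimate effort for the n-th migration (position starts at 1).

    Assumes a Wright learning curve: every doubling of the number of
    completed migrations multiplies the effort by `learning_rate`.
    """
    exponent = math.log(learning_rate, 2)
    return pilot_weeks * position ** exponent


if __name__ == "__main__":
    for n, system in enumerate(migration_order(DEPENDS_ON), start=1):
        print(f"{n}. {system}: about {estimated_effort(10, n):.1f} weeks")
```

In practice the pilot benchmark would replace the assumed 10 weeks, and the learning rate would be re-fitted after each completed migration.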
General Challenges and Approach
Challenges:
• Organizational change management
• IS global alignment
• System/service global harmonization
• Numerous new application releases or upgrades
• Stricter IS project management and QA processes
• Windows operating system upgrade
• Users’ reluctance to adopt new technology
• Low exposure to ChemAxon technology among IS colleagues
Approach: Created a Molecular Structure and Reaction Service, an advisory committee formed by chemistry informatics experts from different OPUs. It provides centralized and harmonized service, consulting, training, and tutoring, as well as system QA, for IS developers, system leaders, and end users.
Challenge: Non-stop service
Approach: Evolutional migration and globally coordinated projects.
Challenge: Packaging for user deployment (internal process)
Approach: Continual improvement of BI global IS service
Technical Challenges
Our experience: Support is great; the team is able to reproduce errors and sometimes includes the fix in the next version.
Pain point: Sometimes a new fix introduced other bugs.
Opportunity for ChemAxon: Better quality assurance and quality control mechanisms.
Our experience: Frequent version changes/updates.
Pain point: Tracking detailed information about the upgrades that are relevant to the BI environment. We had to create a database to track changes.
Opportunity for ChemAxon: Track changes in a database so that customers can compare versions or upgrades and understand the potential benefits or threats to their environment (change management).
Our experience: Great customer-centric service.
Pain point: Not sure about the roadmap and future direction.
Opportunity for ChemAxon: Form an advisory committee of delegates from industry, and communicate more frequently with key clients.
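A change-tracking database of the kind mentioned above could look roughly like the following sketch. Everything in it is an assumption for illustration: the table layout, the component names, and the three-part version numbers are hypothetical and do not reflect BI's actual database or ChemAxon's release numbering.

```python
import sqlite3


def open_tracker():
    """Create an in-memory change log (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE change_log (
               major INTEGER, minor INTEGER, patch INTEGER,
               component TEXT, kind TEXT, note TEXT)"""
    )
    return conn


def record(conn, version, component, kind, note):
    """Store one change; `version` is assumed to look like '1.2.0'."""
    major, minor, patch = (int(part) for part in version.split("."))
    conn.execute(
        "INSERT INTO change_log VALUES (?, ?, ?, ?, ?, ?)",
        (major, minor, patch, component, kind, note),
    )


def changes_between(conn, installed, candidate, components):
    """List changes after `installed`, up to and including `candidate`,
    restricted to the components the local environment actually uses."""
    low = tuple(int(part) for part in installed.split("."))
    high = tuple(int(part) for part in candidate.split("."))
    rows = conn.execute(
        "SELECT major, minor, patch, component, kind, note "
        "FROM change_log ORDER BY major, minor, patch"
    ).fetchall()
    return [r for r in rows if low < r[:3] <= high and r[3] in components]
```

Storing the version as three integer columns rather than one string keeps the ordering numeric, so that 1.10.0 sorts after 1.9.0.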
Technical Challenges (continued)
Our experience: Marvin technology can handle many chemistry representations (mol, SMARTS, etc.).
Pain point: Special features in MarvinSketch need to be coded in the application. If the application only uses the defaults, each user sees their own defaults, so settings differ between users.
Opportunity for ChemAxon: Implementation of application defaults for special features.
Our experience: Integration of Marvin technology with Java Swing is easy to do.
Pain point: Web applications hosted on Tomcat are difficult to integrate, especially without clear instructions about which JARs to include. Sometimes specific Tomcat configuration is needed to let certain files pass through. At one point Java 1.7 had an integration problem, so we had to go back to 1.6.
Opportunity for ChemAxon: Clearer instructions and a cookbook for web application integration.
Technical Challenges (continued)
Our experience: JChem Java APIs are very powerful and flexible.
Pain point: .NET APIs are less ideal (perhaps not as mature as the Java APIs) and time-consuming to work with. We cannot use the latest version, which may be due to some of our third-party components.
Opportunity for ChemAxon: Perhaps better documentation or training for the .NET APIs.
Our experience: Database cartridge setup is straightforward.
Pain point: When a database password changes (for compliance purposes) or a database reboots, we have to re-enable the cartridge. Sometimes the JChem server gets stuck and becomes unresponsive.
Opportunity for ChemAxon: A streamlined process to check and re-initiate all databases that use the ChemAxon cartridge.
Our experience: Domain indexing takes a bit long, but is reasonable. Structural search is quickly implemented.
Pain point: The ChemAxon cartridge is apparently stricter than the MDL cartridge for Markush and R-group structures, and the interpretation of ‘X’ differs. The solution is to capture errors in an error table during indexing and have the business correct them (those structures remain in the database but are not searchable).
Opportunity for ChemAxon: NA?
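The error-table workaround for structures that fail indexing follows a general pattern that can be sketched as below. This is not the cartridge's own mechanism: the validator is a deliberately simple stand-in that works on lists of atom labels, and the table names and allowed-label set are hypothetical.

```python
import sqlite3

# Stand-in rule set, NOT the cartridge's real logic: any atom label outside
# this hypothetical set (for example "X") is treated as an indexing error.
ALLOWED_LABELS = {"C", "H", "N", "O", "S", "F", "Cl", "Br", "I", "R1", "R2"}


def validate(atom_labels):
    """Raise ValueError if the structure uses an unsupported atom label."""
    bad = sorted({label for label in atom_labels if label not in ALLOWED_LABELS})
    if bad:
        raise ValueError("unsupported atom label(s): " + ", ".join(bad))


def index_structures(conn, structures):
    """Index structures that validate; log the rest for business review.

    Failed structures stay in the source data but never reach the search
    index, so they are stored but not searchable until corrected.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS search_index (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS index_errors (id INTEGER PRIMARY KEY, message TEXT)"
    )
    for struct_id, atom_labels in structures.items():
        try:
            validate(atom_labels)
        except ValueError as exc:
            conn.execute("INSERT INTO index_errors VALUES (?, ?)", (struct_id, str(exc)))
        else:
            conn.execute("INSERT INTO search_index VALUES (?)", (struct_id,))
    return conn
```

The business side can then work through `index_errors`, correct the flagged structures, and re-run indexing for those records.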
Structure Checking (Spell Checking): searches the chemical structure for issues and reports the problems found
Structure Checking & Fixing (Autocorrection): searches the chemical structure for issues and corrects them if possible
Standardizing (Translation): transforms the structures to customized, canonical, and consistent representations
Approach to Data Curation and Data Quality
Workflow of Data Curation and Data Quality
Provide Global Templates for Cartridge/Interface
Implement Customized Checkers with ChemAxon
Decide about Standardizing and Structure Checking
Review and Discuss Existing Rules
Templates for Databases
Packages
• Provide functions for single- and multistep checks and fixes
• Provide functions for automated checks
Database Tables
• Store the default checker and standardizing configuration
• Store customized result messages for checks
• Store dedicated structures for tests
Used Types of Checkers
Basic Checkers
• Such as Abbreviated Group, Pseudo Atom, etc.
Substructure Checkers
• Work with SMARTS-defined substructures
• Can be set individually
• Very flexible approach
Customized Checkers
• Provided as consultancy projects
• Good experience so far
• Easy to implement both for the DB Cartridge and for Marvin Tools
Conclusions
• This is a decision and investment for the next decades and requires commitment from business and IS.
• Evolutional approach can reduce the complexity of the migration.
• Multiple migration projects may be needed.
• System interdependency analysis is a key step.
• Globally aligned and harmonized service is a great platform.
• Generally the migration was smooth.
• Data curation effort should not be underestimated.
• Effective mechanisms for data curation require IS and business engagement.
• Strong support from users, especially from key stakeholders, is crucial [success = (process improvement) x (customer acceptance)].