
Post on 22-Dec-2015


Building Bioinformatics Tools for Research Scientists

Southern California Bioinformatics Summer Institute 2008

Andrew Clark

Mentor: Dr. Ping Du, Allergan, Inc.
August 21, 2008

Outline

• Introduction to Allergan
• Project objective
• Motivation: Why is this important?
• Software development process
• Results and conclusion
• Acknowledgements

Allergan background

• Brief history of Allergan
• Organization structure and operation
• Product areas
  • Eye, skin care
  • Botox

Project information

The Research Data Warehouse (RDW) is a software application designed to consolidate information from a variety of research databases into a single point of access.

[Diagram: The Research Environment at Allergan. Labels: R&D source databases (Toxicology, Chemistry, Biology, Pharmacokinetics); the RDW (web application); Allergan scientists; Research Dept.]
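The slides do not describe how the RDW consolidates its sources, so the following is only a minimal sketch of the "single point of access" idea, assuming each source database can return rows keyed by a shared compound identifier. The source names, field names, `compound_id` key, and `consolidate` function are all hypothetical and the values are placeholders.

```python
from collections import defaultdict

# Hypothetical rows from three source databases (placeholder values,
# not real Allergan data or schemas).
SOURCES = {
    "chemistry": [{"compound_id": "C-001", "smiles": "CN1C=NC2=C1C(=O)N(C)C(=O)N2C"}],
    "toxicology": [{"compound_id": "C-001", "assay_result": "placeholder"}],
    "pharmacokinetics": [{"compound_id": "C-002", "assay_result": "placeholder"}],
}


def consolidate(sources):
    """Merge rows from every source into one record per compound_id."""
    merged = defaultdict(dict)
    for source_name, rows in sources.items():
        for row in rows:
            record = merged[row["compound_id"]]
            for field, value in row.items():
                if field != "compound_id":
                    # Prefix each field with its source so that identically
                    # named fields from different databases do not collide.
                    record[f"{source_name}.{field}"] = value
    return dict(merged)


if __name__ == "__main__":
    for compound_id, record in consolidate(SOURCES).items():
        print(compound_id, record)
```

Prefixing fields with the source name is one simple way to keep provenance visible in a merged view; a real warehouse would more likely do this in the database layer.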

Project objective

Extend the functionality of the “RDW”:
• Build and integrate a Structure Data File (SDF) export function.

[Diagram: an RDW query result is exported to an SD file, which is then used for:]
• Structure-activity relationship analysis in SARVision.
• Data manipulation in Excel.
• Data sharing with external collaborators via email.
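The export step described above can be sketched roughly as follows. This is not the RDW implementation, which the slides do not show. It assumes each query-result row already carries a molfile connection table as text plus a dictionary of data fields; `record_to_sdf`, `export_sdf`, and the row keys `molblock` and `fields` are hypothetical names.

```python
def record_to_sdf(molblock, fields):
    """Serialize one compound: molfile text, data items, then the '$$$$' delimiter."""
    lines = [molblock.rstrip("\n")]
    for name, value in fields.items():
        lines.append(f"> <{name}>")  # data header line
        lines.append(str(value))     # data value line
        lines.append("")             # a blank line ends each data item
    lines.append("$$$$")             # record delimiter
    return "\n".join(lines) + "\n"


def export_sdf(rows):
    """Concatenate all query-result rows into a single SD file string."""
    return "".join(record_to_sdf(row["molblock"], row["fields"]) for row in rows)


if __name__ == "__main__":
    # A one-atom placeholder structure, just to exercise the writer.
    molblock = "\n".join([
        "methane",
        "  sketch",
        "",
        "  1  0  0  0  0  0  0  0  0  0999 V2000",
        "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
        "M  END",
    ])
    print(export_sdf([{"molblock": molblock, "fields": {"STRUCTURE_SMILES": "C"}}]))
```

Because the output is plain text with one delimiter per compound, the same file can be opened in an analysis tool or attached to an email without further conversion.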

Why is this important?

Structure Data Files are an industry standard for storing chemical compound data.

The SDF export feature will
• make research scientists’ work simpler.
• add a useful feature to the RDW.

Example SD file

3,7-Dihydro-1,3,7-trimethyl-1H-purine-2,6-dione
  Marvin  08110810182D

 14 15  0  0  0  0            999 V2000
   -1.6702   -2.1687    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   -1.6702   -0.8338    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   -1.4154   -0.0492    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.1853   -1.5013    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.1693   -0.6763    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.5982   -0.6763    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.8838   -1.9137    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.1693   -3.1513    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.4548   -1.9137    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.4548   -1.0888    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -3.1693    0.1488    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
. . .
  5 13  1  0  0  0  0
 13  7  1  0  0  0  0
  7 14  1  0  0  0  0
 14  9  1  0  0  0  0
  9 10  2  0  0  0  0
  5 10  1  0  0  0  0
. . .
M  END
> <STRUCTURE_SMILES>
CN1C=NC2=C1C(=O)N(C)C(=O)N2C

> <STRUCTURE_MolecularWeight>
353.3255

$$$$

Source: http://chem.sis.nlm.nih.gov/chemidplus/applet/marvin/examples/applets/example-view2.6.html
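To make the layout of the example concrete, here is a small, hedged sketch of reading such a file back: a title line, a connection table ending at `M  END`, `> <NAME>` data items each closed by a blank line, and `$$$$` between records. `parse_sdf` and `_parse_record` are illustrative helpers, not part of the RDW or of any chemistry toolkit, and they ignore the atom and bond blocks entirely.

```python
import re

# Matches a data header such as "> <STRUCTURE_SMILES>" and captures the field name.
DATA_HEADER = re.compile(r"^>.*?<([^>]+)>")


def _parse_record(lines):
    """Extract the title line and the data fields from one SD record."""
    record = {"title": lines[0].strip() if lines else "", "fields": {}}
    i = 0
    # Skip the connection table, which ends at the "M  END" line.
    while i < len(lines) and not lines[i].startswith("M  END"):
        i += 1
    i += 1
    while i < len(lines):
        match = DATA_HEADER.match(lines[i])
        i += 1
        if not match:
            continue
        value_lines = []
        # A data value runs until the next blank line.
        while i < len(lines) and lines[i].strip():
            value_lines.append(lines[i])
            i += 1
        record["fields"][match.group(1)] = "\n".join(value_lines)
    return record


def parse_sdf(text):
    """Split SD file text on '$$$$' lines and parse each record."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append(_parse_record(current))
            current = []
        else:
            current.append(line)
    return records
```

For real work a cheminformatics library would normally handle this, since it also validates the structure itself; the sketch only shows why the format is easy to exchange between tools.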

Planning ahead

1. Understand the research environment at Allergan.
2. Prepare the requirements specification.
3. Design the new feature.
4. Implement and test code, get feedback from managers and RDW users.

Software Development Process

• Learn the needs of the users
  • (July 8 – 11)
• Plan the software program
  • (July 14 – 18)
• Write the program code
  • (July 21 – August 8)
• Test and obtain user feedback
  • (August 11 – 19)

Results

Example RDW usage:
• Web interface
• Search form
• Report generation
• SDF Export feature
• Exported data

The RDW Homepage

Selecting a search template

The query form

Configuring experiment search details

Report generated from user search.

The SDF Export button

The SDF file download prompt

Exported SDF data viewed in analysis tool

Conclusion

Effective bioinformatics research relies on the creative development of software tools.

Conclusion

Computational resources can:
• Standardize access to data through software applications or web services.
• Perform high power, high precision numerical calculations.

Knowledge of software development and computational methods will help make a well-rounded bioinformatician.

Acknowledgements

Special thanks to
• The SoCalBSI faculty and staff
• My mentor at Allergan: Dr. Ping Du
• Other supportive individuals at Allergan: Dr. Robert Cain, Raju Krishnappa, Gaurang Patel
• All the SoCalBSI summer interns.