tranSMART’s Application to Clinical Biomarker Discovery in Sanofi
Sherry Cao Ph.D.
tranSMART Community Meeting
Nov. 6th, 2013
Outline
● Challenges in clinical biomarker discovery
● How Sanofi is meeting those challenges
● Role of tranSMART
● tranSMART in Sanofi
Clinical Biomarker Discovery Process
Data Capture
Discovery & Interpretation
Clinical Sample Validation
Molecular Information
• DNA
• RNA
• Protein
• Lipid
• Metabolites
Biomarkers
• Diagnostic
• Prognostic
• Efficacy
Signatures
• Molecular classifications
• Patient stratifications
Target ID/Credentialing
• Molecular targets
• Pathways
• Clinical phenotypes
Clinical Sample Procurement
Sample Sources
• In house
• Public
Type
• In silico
• Experimental
Clinical Information
• Patients
• Diseases
• Clinical Phenotypes
• Lab tests
• Pathology reports
• Drugs
Challenges for Clinical Biomarker Discovery
● High-throughput biological measurements generate an unprecedented amount of data for each biological sample
  ● Chip-based profiling technologies
  ● Exome, transcriptome & genomic sequencing technologies
● The complexity of disease biology requires large sample numbers to reach statistical significance
  ● GWAS studies for complex traits
  ● Molecular signature development for patient stratification
● Heterogeneous data types & data sources
  ● Research & clinical
  ● Structured & non-structured data
● Data curation is a critical & time-consuming process
● Complex analyses & visualizations are needed to transform data into knowledge
Data Management
Integration & Analysis
Interdisciplinary team for Clinical Biomarker Research
Clinical Informaticians
Research Informaticians
Clinical Statisticians
Clinicians
Research Scientists
CBR Team
Two Distinctive User Groups
| | Clinicians, Research Scientists | Informatics Scientists & Statisticians |
| Main Role | Hypothesis generation, mechanistic interpretation | Data analysis |
| Statistical Analysis Type | Single variable, correlative analysis | Multi-variable complex analysis |
| Statistical Tool Access | Very limited | SAS, JMP, R |
| User Interface | Drag & Drop GUI | API |
| Major Complaints | Data acquisition, data analysis turnaround time | Data acquisition, data curation & reformatting, not enough time to do real analysis |
Informatics Systems Mapped onto Research Flow
[Figure: Data Capture, Discovery & Interpretation and Clinical Sample Validation stages mapped to Platform Specific Systems and Data Management & Integration]
Role of tranSMART within Sanofi
● Translational data hub: a one-stop shop for all data related to a biomarker discovery project
  ● Clinical & research data
  ● Structured & non-structured data
  ● Fully curated data for integrated analysis & not fully curated data
● Deliver critically needed statistical/informatics analysis tools to clinicians & research scientists
  ● Univariate analysis
  ● Simple clustering analysis & heatmap generation
● Help informatics scientists generate custom analysis data sets based on distinctive cohort definitions
Data management & integration
Clinical Biomarker Discovery Use Case 1
● Business unit with an established & active biomarker discovery process
● Samples are routinely sent out for profiling on different platforms
● Data are generated routinely by both CROs & internal groups
  ● High-throughput profiling data
  ● Low-throughput imaging & assay data (IHC, ELISA, qPCR, etc.)
● Situation
  ● Biomarker team reps are overwhelmed by data management questions, with little time to do actual analysis
● Critical need
  ● How to organize data effectively?
  ● How to manage the low-throughput data systematically together with clinical & high-throughput data?
  ● How to search & find the relevant data quickly?
tranSMART in Sanofi – Data Management
Navigate within Programs > Studies > Assays, Analysis and File Folders (see next slide)
Search data using dictionaries
Create new Programs > Studies > Assays and Files Folders, and annotate (tag) them
Export files
Visualize gene expression analysis results
Global view of all the data available, from level 1 data (uncurated/raw files) to levels 3-4 data (analysis results, findings)
Run analysis on subject-level data (former Dataset Explorer)
Browse level 2 (processed) data – incl. clinical / preclinical / molecular data, etc.
Search subject-level data
Select data subsets (cohorts)
Run basic statistical and genomic analyses on those subsets (standard features from tranSMART v1.0)
Export out data subsets
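The subject-level workflow listed above (select a cohort, run a basic statistic, export the subset) can be illustrated outside tranSMART in a few lines of Python. This is only a sketch, not tranSMART's API: the subject records, the field names (`arm`, `age`, `marker`) and the helpers `select_cohort`, `summarize` and `export_csv` are all hypothetical, and the values are made up.

```python
import csv
import io
import statistics

# Hypothetical subject-level (level 2) records; values are invented for illustration.
SUBJECTS = [
    {"subject_id": "S1", "arm": "treated", "age": 54, "marker": 2.1},
    {"subject_id": "S2", "arm": "placebo", "age": 61, "marker": 1.4},
    {"subject_id": "S3", "arm": "treated", "age": 47, "marker": 2.9},
    {"subject_id": "S4", "arm": "placebo", "age": 58, "marker": 1.0},
]


def select_cohort(subjects, **criteria):
    """Return the subjects whose fields equal every given criterion."""
    return [s for s in subjects if all(s.get(k) == v for k, v in criteria.items())]


def summarize(cohort, field):
    """Basic summary statistics for one numeric field of a cohort."""
    values = [s[field] for s in cohort]
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


def export_csv(cohort):
    """Serialize a non-empty cohort to CSV text, header row first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(cohort[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(cohort)
    return buffer.getvalue()


treated = select_cohort(SUBJECTS, arm="treated")
treated_summary = summarize(treated, "marker")
treated_csv = export_csv(treated)
```

Plain CSV is used here because it can be read by SAS, JMP and R, the tools named for the informatics and statistics group.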
Data organization
● Data is organized in a hierarchical structure: Program > Study > Assay, Analysis, File Folder*
* A file folder can be created at any level: program, study, assay…
Each object (Program, Study, Assay, etc.) is tagged with metadata:
– Provides information on the object
– Enables queries using search
Predefined annotation templates
– Most fields use a controlled vocabulary (CV) with pick-list or autocomplete functionality. Examples of dictionaries used: MeSH, WHO-DD, some branches of the NextBio ontology.
– A Description field captures free text
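A tagged hierarchy of this kind can be modelled as a simple tree. The sketch below is an illustration only and says nothing about how tranSMART stores its data: the `Node` class, the `search` helper, the tag names and the example objects are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class Node:
    """One object in the hierarchy: a Program, Study, Assay, Analysis or File Folder."""
    kind: str
    name: str
    tags: Dict[str, str] = field(default_factory=dict)  # metadata from an annotation template
    description: str = ""                                # free-text field
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child object and return it so calls can be chained."""
        self.children.append(child)
        return child

    def walk(self) -> Iterator["Node"]:
        """Yield this node and every descendant, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


def search(root: Node, tag: str, value: str) -> List[Node]:
    """Find every object whose metadata tag equals the value (case-insensitive)."""
    return [n for n in root.walk() if n.tags.get(tag, "").lower() == value.lower()]


# Hypothetical content; names and tag values are invented.
program = Node("Program", "Example program", {"therapeutic_area": "Oncology"})
study = program.add(Node("Study", "Study A", {"disease": "Breast Neoplasms"}))
assay = study.add(Node("Assay", "Expression profiling", {"platform": "RNAseq"}))
study.add(Node("File Folder", "Raw files"))   # a folder attached at study level
assay.add(Node("File Folder", "QC reports"))  # and another at assay level

hits = search(program, "platform", "rnaseq")
```

Keeping tags as key-value pairs drawn from a controlled vocabulary is what makes an exact-match search like this one reliable; free text would go in `description` instead.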
Program Explorer
The Program Explorer box lets you navigate within Programs, Studies, Assays, Analyses or File Folders
Integrated search
Autocomplete feature for values in dictionaries
Dropdown with a list of dictionaries + free-text search
New search function at the top of the screen. Any data (levels 1-4) can be searched.
Browse view: The search returns Programs, Studies, Assays and/or Files that match your query
Analyze view: The system points you to level 2 data
Filter
A new Filter option can also be used for selections based on fields with a small set of possible values.
The search returns Programs, Studies, Assays and/or Files that match your query.
Search & filter in Analyze
Synchronized search & filter function in Analyze
Visualization of gene expression analysis
Creation of a template for loading and displaying gene expression analysis results.
File export – Shopping Cart function
New concept of Shopping Cart for exporting files.
Note: If users give positive feedback on the Shopping Cart concept, we may extend this feature to subject-level data in RC-2.
Clinical Biomarker Discovery Use Case 2
● Business unit with a focused biomarker discovery program
● Goal is to identify disease progression biomarkers better than the current clinical functional test
● Situation at hand
  ● Researchers don’t have any appropriate analytical tools for correlative analysis
  ● A variety of profiling experiments are being planned
    • RNAseq, Proteomics, RBM, miRNA, Metabolomics
  ● Patient data are collected at multiple time points
● Critical need
  ● How to integrate all the data?
  ● How to enable clinical researchers to analyze and visualize data?
  ● How to analyze time series data more effectively?
tranSMART in Sanofi – Data Integration
● Current state
  ● Within-study clinical & gene expression profiling data
[Figure: axes labeled Gene expression and End Point]
tranSMART in Sanofi – Data Integration
● In the pipeline
  ● Multi-modal profiling data support
● Data types to be addressed
  ● RNAseq
  ● miRNA profiling (qPCR + seq)
  ● Metabolomics
  ● Proteomics
  ● RBM
[Figure: axes labeled Gene expression and Protein Level]
tranSMART in Sanofi – Providing Analysis Tools to Research Scientists
General Summary Statistics on Patient Cohorts
Baseline marker gene expression is correlated with outcome at 52 weeks
Disease Signature Evaluation
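A correlative analysis such as the one between baseline marker expression and the 52-week outcome can be expressed as a Pearson correlation coefficient. The sketch below is a generic illustration, not the analysis behind the slide: the `pearson_r` helper is hypothetical and the numbers are synthetic, chosen only to show the calculation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Synthetic values, one pair per patient (not real study data).
baseline_expression = [1.0, 2.0, 3.0, 4.0, 5.0]
outcome_week52 = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(baseline_expression, outcome_week52)
```

A value of `r` near +1 or -1 indicates a strong linear relationship; it does not by itself establish that the marker predicts outcome in new patients.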
Clinical Biomarker Discovery Use Case 3
● Efficacy biomarker discovery for complex disease with 15,000 patients
● Situation at hand
  ● A number of profiling experiments are being planned
    • RNAseq, RBM, Metabolomics
  ● Patients often manifest other disease symptoms
● Critical issue
  ● How to load such a large dataset?
  ● How to analyze such large sample numbers with multiple high-dimensional data types?
  ● How to analyze comorbidities?
Conclusions
● tranSMART can provide critical solutions for clinical biomarker discovery needs
  ● Data management, integration & analysis
● tranSMART serves two distinct user groups: one through the user interface, the other through the API
● Different business units have different requirements for tranSMART
● Sanofi developed critical user interface and functionality improvements to meet Sanofi’s and general clinical biomarker discovery needs
Question
Functionality User Interface
Acknowledgement
● Genzyme
  ● Jike Cui, Adam Palermo, Rena Baek, Petra Olivova, Leslie Jost, Rob Pomponio, Allison McVie-Wylie, Steve Madden, Clarence Wang
● Diabetes
  ● Juergen Kammerer, Manfred Hendlich, Dan Crowther
● Oncology
  ● Mary Penniston, Jack Pollard
● Sanofi tranSMART development team
  ● Claire Virenque, Annick Peraux
  ● Angelo Decristofano, Lars Greiffenberg, Christophe Gibault, David Peyruc
Dream Analysis Process
Define question
Identify patient cohort
Obtain relevant profile & clinical data
Run analysis
Export & publish results
Satisfied
Format!