Date posted: 02-Jan-2016
SysMO-SEEK: Sharing Data and Models in Systems Biology
Katy Wolstencroft, Stuart Owen, Jacky Snoep
University of Manchester
SysMO-DB Project
A data access, model handling and data integration platform for Systems Biology:
To support and manage the diversity of data, models and experimental protocols from a consortium
Web based
Standards compliant
Pan-European collaboration
13 individual projects, >100 institutes
Different research outcomes
A cross-section of microorganisms, incl. bacteria, archaea and yeast
Record and describe the dynamic molecular processes occurring in microorganisms in a comprehensive way
Present these processes in the form of computerized mathematical models
Pool research capacities and know-how
Already running since April 2007
Runs for 3-5 years
This year, 2 new projects join and 6 leave
http://www.sysmo.net
Systems Biology of Microorganisms
Types of data
Multiple omics: genomics, transcriptomics, proteomics, metabolomics, fluxomics, reactomics
Images
Molecular biology
Reaction kinetics
Models: metabolic, gene network, kinetic
Relationships between data sets/experiments: procedures, experiments, data, results and models
Analysis of data
Challenges
Heterogeneous data and models
Distributed groups of researchers
Modellers and experimentalists have different skills, training, experience
Scientists want to remain in control
Scientists reluctant to share
Social and technical challenges
SysMO-DB Dev Team
Institutions:
University of Stellenbosch, South Africa
University of Manchester, UK
Heidelberg Institute for Theoretical Studies, Germany
Team members:
Jacky Snoep
Olga Krebs
Wolfgang Müller
Sergejs Aleksejevs
Carole Goble
Stuart Owen
Katy Wolstencroft
Finn Bacall
Franco du Preez
Social Challenge: Focus Group (SysMO PALs)
DB team, Focus Group, Projects
Show what is there
Suggest what is possible
Ask for requirements
Give requirements
Tell priorities
Rate outcomes
Suggest improvements
Double check
Transmit
Disseminate
Collect answers
Technical Challenge
Rapid and incremental development
Driven by the PALs
Just enough and just in time, not just in case
No reinvention
Sustainable and extensible
Migrate to standards
Fitting in with normal lab practices
Protocols for Models
Protocol title
Authors
Keywords
Description
Assumptions
Equations
Numerical methods/algorithms
Computational tools
Parameter estimation techniques
Limitations
References
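As a rough illustration, the protocol sections listed above could be captured as a structured record so that incomplete protocols are easy to spot. This is only a sketch: the class name `ModelProtocol`, the attribute names and the `missing_sections` helper are assumptions made for this example, not part of SysMO-SEEK.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class ModelProtocol:
    """Hypothetical record with one attribute per protocol section."""
    title: str
    authors: List[str]
    keywords: List[str] = field(default_factory=list)
    description: str = ""
    assumptions: str = ""
    equations: str = ""
    numerical_methods: str = ""
    computational_tools: str = ""
    parameter_estimation: str = ""
    limitations: str = ""
    references: List[str] = field(default_factory=list)

    def missing_sections(self) -> List[str]:
        """Return the names of sections that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Illustrative values only.
protocol = ModelProtocol(
    title="Glycolysis model",
    authors=["A. Modeller"],
    equations="dS/dt = -k*S",
)
print(protocol.missing_sections())
```

A check like this fits the "just enough" approach: a protocol can be stored with gaps, and the gaps are reported rather than rejected.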
What do we share
Methods + Data + Results + Models
All SysMO Assets
SOP
A Tree View of Assets
Investigation, Studies, Assay
Construction, Validation
SOP
ISA infrastructure provides a directory structure for experiments
http://isatab.sourceforge.net/
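The Investigation, Study, Assay hierarchy can be pictured as a small tree that flattens into directory-style paths. The sketch below is an assumption-laden illustration: the class names, the `paths` helper and all titles are made up for this example and do not reflect the ISA-Tab file format or the SEEK data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    title: str
    sops: List[str] = field(default_factory=list)  # SOPs attached to this assay


@dataclass
class Study:
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    studies: List[Study] = field(default_factory=list)

    def paths(self) -> List[str]:
        """Flatten the tree into one directory-style path per attached SOP."""
        result = []
        for study in self.studies:
            for assay in study.assays:
                for sop in assay.sops:
                    result.append("/".join([self.title, study.title, assay.title, sop]))
        return result


# Hypothetical example content.
inv = Investigation(
    "glycolysis",
    [
        Study("construction", [Assay("enzyme-activity", ["sop-1.txt"])]),
        Study("validation", [Assay("growth-curve", ["sop-2.txt", "sop-3.txt"])]),
    ],
)
for path in inv.paths():
    print(path)
```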
Incentives for sharing
Safe haven for data
Credit and attribution
Help with exporting to public repositories (e.g. one-click export to ArrayExpress, PRIDE etc.)
A repository for “supplementary materials” in publications
Linking publications and data
Access other resources through a SEEK gateway
Access Permissions
Just Enough Sharing
...we don’t talk about security
[Figure: Just Enough Sharing. Project data stores (COSMIC, SysMOLab, MOSES, Alfresco, wikis and other datastores); assets such as SOPs are shared by fetch on request or direct upload.]
How do we share
“Just Enough Results Model” (JERM)
What type of data is it: microarray, growth curve, enzyme activity…
What was measured: gene expression, OD, metabolite concentration…
What do the values in the datasets mean: units, time series, repeats…
Based on:
Minimum information models, e.g. MIAME, MIAPE, MIRIAM
Biological ontologies, e.g. Gene Ontology, MGED, SBO
BioPortal web service used in SysMO-SEEK for concept lookup and visualisation
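The three JERM questions (what type of data, what was measured, what the values mean) can be read as a minimal metadata check on each dataset. The sketch below assumes three hypothetical field names, `data_type`, `measured` and `units`; the real JERM templates may be organised differently.

```python
# Hypothetical field names; the real JERM templates may use different ones.
REQUIRED_FIELDS = ("data_type", "measured", "units")


def missing_jerm_fields(record):
    """Return the required fields that are absent or blank in a dataset description."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Illustrative dataset description with the units left blank.
growth_curve = {"data_type": "growth curve", "measured": "OD", "units": ""}
print(missing_jerm_fields(growth_curve))
```

Reporting the missing fields, instead of refusing the upload, matches the idea of recommending formats without enforcing them.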
How do we share
Share JERM templates developed by SysMO-DB, PALs and consortium
Spreadsheet templates
Database schemas
Encourage uptake throughout SysMO: transcriptomics, metabolomics, proteomics etc.
RightField: Annotation by Stealth
Identifying Biological Objects
What do you have in your data? Proteins/enzymes, genes/expression levels, metabolites
Where/how do these objects interact? Pathways, flux, experimental conditions
What models describe these interactions?
Possible when using common frameworks, naming schemes and controlled vocabularies
Following Standards
We recommend formats but we do not enforce them
Protocols and SOPs: Nature Protocols
Data: JERM models and community minimum information models
Models: SBML and related standards
Publications: PubMed and DOI
If you follow the prescribed formats, you get more out, but if you don’t, you can still participate
Lowering the adoption barrier
SEEK, the eLaboratory
A dynamic resource for analysis as well as browsing
Automatic comparison of data from inside files
Understanding where and how data and models are linked
Running simulations with new experimental data
Running analyses and workflows over the data and models
Workflows from myExperiment
Data preparation, annotation and analysis
Systems Biology workflow pack on myExperiment
Microarray analysis and text mining
Created by Afsaneh Maleki-Dizaji from SUMO, University of Sheffield
Based on previous work by Paul Fisher, University of Manchester
http://www.myexperiment.org/workflows/187
SEEK as a data analysis and meta analysis service
SBML model construction and population
Calibration workflow
Data requirements:
Parameterised SBML model
Experimental data: metabolite concentrations from key results database
Calibration by COPASI web service
Peter Li
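To make the idea of calibration concrete, here is a toy parameter fit. It does not use COPASI or SBML: as a stand-in it fits a one-parameter first-order decay model, S(t) = s0 * exp(-k * t), to metabolite concentrations by least squares on the logarithm of the data. The model, the function name and the data points are all assumptions for illustration.

```python
import math


def fit_first_order_decay(times, concentrations):
    """Estimate (s0, k) for S(t) = s0 * exp(-k * t) by least squares on log(S)."""
    logs = [math.log(c) for c in concentrations]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx                  # slope of log(S) against t equals -k
    intercept = mean_y - slope * mean_t  # intercept equals log(s0)
    return math.exp(intercept), -slope


# Synthetic, noise-free "measurements" generated with s0 = 2.0 and k = 0.5.
times = [0.0, 1.0, 2.0, 3.0]
data = [2.0 * math.exp(-0.5 * t) for t in times]
s0, k = fit_first_order_decay(times, data)
print(round(s0, 6), round(k, 6))
```

A real calibration workflow would estimate many parameters of a full kinetic model against measured data; the log-linear fit here only shows the shape of the task: model in, data in, fitted parameters out.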
Data analysis and meta analysis
SEEK Analysis Service with pre-cooked analysis tools.
Load model:
Load data:
GO
Why it works for us
A solution that fits in with current practices
Start simple, show benefits, add more
Engage with the people actually doing the work: PhD students, post-docs
Build to the PALs' requirements
Respect publication cycles
Respect cultural differences
Scientists stay in control
SysMO Methods Spreading
Virtual Liver: Mueller, via HITS
Lungsys
SBCancer
EraSysBio+
Eukaryotic organisms
Interactions between host and pathogen
Human disease
Multi-scale modelling
Acknowledgements
SysMO-DB Team
SysMO-PALs
myGrid, HITS and JWS Online
EMBL-EBI, MCISB
http://www.sysmo-db.org