Date posted: 12-Jul-2015
Category: Science
Uploaded by: ngehlenborg
Visualization Approaches for Biomedical Omics Data: Putting It All Together
Nils Gehlenborg
Harvard Medical School, Center for Biomedical Informatics
nils_gehlenborg
Why am I doing this?
[Figure: Human and Machine, with Data and the stages GENERATION, COMPUTATION, and INTERPRETATION]
“In every chain of reasoning, the evidence of the last conclusion can be no greater than that of the weakest link of the chain, whatever may be the strength of the rest.”
- Thomas Reid, Essays on the Intellectual Powers of Man (1786)
[Figure: the same diagram of Human, Machine, and Data (GENERATION, COMPUTATION, INTERPRETATION), extended with Hypotheses, Discoveries, Knowledge, and Cognition]
x       y1      y2
1.00    0.96    0.76
2.00    0.76   -0.14
3.00   -0.14   -0.91
4.00   -0.91   -0.84
5.00   -0.84    0.00
6.00    0.00    0.84
7.00    0.84    0.91
8.00    0.91    0.14
9.00    0.14   -0.76
10.00  -0.76   -0.96
11.00  -0.96   -0.28
[Figure: line chart of y1 and y2 from the table; x axis from 1 to 11, y axis from -1 to 1]
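The table and the chart carry the same numbers, but the relationship between the two columns is much easier to see in the chart. As a minimal sketch of that point, the Python below draws the slide's values on a coarse text grid and checks one pattern the picture suggests (y2 looks like y1 moved one step to the left). The function names and the five-level grid are assumptions made for this illustration; they are not part of the talk.

```python
# Rows of the slide's table: (x, y1, y2).
ROWS = [
    (1, 0.96, 0.76), (2, 0.76, -0.14), (3, -0.14, -0.91), (4, -0.91, -0.84),
    (5, -0.84, 0.00), (6, 0.00, 0.84), (7, 0.84, 0.91), (8, 0.91, 0.14),
    (9, 0.14, -0.76), (10, -0.76, -0.96), (11, -0.96, -0.28),
]


def text_chart(rows, levels=(1.0, 0.5, 0.0, -0.5, -1.0)):
    """Draw y1 ('o') and y2 ('x') on a coarse grid, one text line per y level.

    '*' marks a cell where both series fall on the same level.
    """
    def nearest(y):
        # Snap a value to the closest grid level.
        return min(levels, key=lambda level: abs(level - y))

    lines = []
    for level in levels:
        cells = []
        for _, y1, y2 in rows:
            on_y1 = nearest(y1) == level
            on_y2 = nearest(y2) == level
            cells.append("*" if on_y1 and on_y2 else "o" if on_y1 else "x" if on_y2 else ".")
        lines.append(f"{level:5.1f} | " + " ".join(cells))
    return "\n".join(lines)


def is_shifted_copy(rows):
    """True if y2 at each x equals y1 at the next x."""
    return all(rows[i][2] == rows[i + 1][1] for i in range(len(rows) - 1))


if __name__ == "__main__":
    print(text_chart(ROWS))
    print("y2 is y1 shifted by one step:", is_shifted_copy(ROWS))
```

Even this crude rendering makes the wave shape and the one-step offset visible at a glance, which is hard to read off the raw columns.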
What are the data?
Biomedical Omics Data
DNA Icon by Darrin Higgins, from The Noun Project
Genome: What is the DNA sequence?
Genome
HOW? sequencing of genomic DNA
WHAT? single nucleotide variants (SNVs), copy number variants, complex structural variants
http://www.broadinstitute.org/igv
http://circos.ca, Krzywinski et al. 2009, Genome Research
Clark et al. 2009, PLoS Genetics
[Figure: MizBee synteny browser screenshot; source genome: Human (chr1 to chr22, chrX, chrY); destination genome: Lizard; detail view of source chr3 against destination chr2, with blocks shown as match or inversion]
http://www.mizbee.org
http://genome.lbl.gov/vista
Genome: What is the DNA sequence?
Transcriptome: Which genes are active?
Transcriptome
HOW? sequencing of mRNA/microRNA molecules, microarray-based hybridization
WHAT? abundance of transcripts/genes/isoforms
http://research.fhcrc.org/mcintosh/en/tools.html
http://miso.readthedocs.org/en/latest/sashimi.html#visualizing-and-plotting-miso-output
Genome: What is the DNA sequence?
Transcriptome: Which genes are active?
Proteome: Which proteins are present?
Proteome
HOW? mass spectrometry of peptides, array-based techniques
WHAT? presence of peptides & proteins, abundance of peptides & proteins
Genome: What is the DNA sequence?
Transcriptome: Which genes are active?
Proteome: Which proteins are present?
Metabolome: Which metabolites can be identified?
Metabolome
HOW? mass spectrometry, NMR spectroscopy
WHAT? presence of metabolites, abundance of metabolites
Genome: What is the DNA sequence?
Transcriptome: Which genes are active?
Proteome: Which proteins are present?
Metabolome: Which metabolites can be identified?
Interactome: Which molecules are interacting?
Interactome
HOW? mass spectrometry, yeast-2-hybrid, text mining
WHAT? links between molecules
Genome: What is the DNA sequence?
Epigenome: How are DNA and associated proteins modified?
Transcriptome: Which genes are active?
Proteome: Which proteins are present?
Metabolome: Which metabolites can be identified?
Interactome: Which molecules are interacting?
Epigenome
HOW? ChIP-seq, ChIP-chip (histone modifications), bisulfite sequencing (DNA methylation)
WHAT? histone modifications along the genome, DNA methylation patterns along the genome
http://epigenomegateway.wustl.edu/browser/
http://compbio.med.harvard.edu/flychromatin/
Genome: What is the DNA sequence?
Epigenome: How are DNA and associated proteins modified?
Transcriptome: Which genes are active?
Proteome: Which proteins are present?
Metabolome: Which metabolites can be identified?
Nucleome: How is the DNA organized in space/time?
Interactome: Which molecules are interacting?
Nucleome
HOW? 3C/4C/5C chromosome conformation capture, Hi-C sequencing
WHAT? contact probabilities for different parts of the genome
Lieberman-Aiden et al., Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome, 2009
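Contact data of this kind are usually shown as a binned matrix heatmap. The sketch below illustrates how such a matrix might be assembled, assuming read pairs arrive as (pos_a, pos_b) positions on a single chromosome; that input format, the function names, and the toy values are all hypothetical. Dividing by the total number of contacts is only the simplest possible scaling, and real Hi-C pipelines apply bias corrections that are not attempted here.

```python
def contact_matrix(pairs, chrom_length, bin_size):
    """Count contacts between fixed-size bins of one chromosome.

    pairs: iterable of (pos_a, pos_b), 0-based positions below chrom_length
    (a hypothetical input format chosen for this sketch).
    """
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    counts = [[0] * n_bins for _ in range(n_bins)]
    for pos_a, pos_b in pairs:
        i, j = pos_a // bin_size, pos_b // bin_size
        counts[i][j] += 1
        if i != j:
            counts[j][i] += 1  # keep the matrix symmetric
    return counts


def to_frequencies(counts):
    """Scale counts so each entry is the fraction of all contacts it holds."""
    n = len(counts)
    # Each contact appears exactly once in the upper triangle (diagonal included).
    total = sum(counts[i][j] for i in range(n) for j in range(i, n))
    if total == 0:
        return [[0.0] * n for _ in range(n)]
    return [[c / total for c in row] for row in counts]


# Made-up read pairs on a 1000 bp toy chromosome, binned at 250 bp.
EXAMPLE_PAIRS = [(100, 250), (120, 900), (260, 300), (50, 80)]

if __name__ == "__main__":
    matrix = contact_matrix(EXAMPLE_PAIRS, chrom_length=1000, bin_size=250)
    for row in to_frequencies(matrix):
        print(" ".join(f"{value:.2f}" for value in row))
```

The resulting square matrix is what a heatmap view would color, with one cell per pair of genomic bins.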
Scaling
Dimensions
Genomic Entities
Samples
Timepoints
Data Types
?
TCGA: The Cancer Genome Atlas
mRNA expression
microRNA expression
DNA methylation
protein expression
copy number variants
mutation calls
clinical parameters
Stratome
Anthony92931 / Wikimedia Commons + Modification by Nils Gehlenborg
StratomeX
A Lex, M Streit, H-J Schulz, C Partl, D Schmalstieg, PJ Park, N Gehlenborg, “StratomeX: Visual Analysis of Large-Scale Heterogeneous Genomics Data for Cancer Subtype Characterization”, Computer Graphics Forum 31:1175-1184 (2012)
M Streit, A Lex, S Gratzl, C Partl, D Schmalstieg, H Pfister, PJ Park, N Gehlenborg, “Guided Visual Exploration of Genomic Stratifications in Cancer”, Nature Methods 11:884-885 (2014)
Is there a mutation that overlaps with this mRNA cluster?
Is there a CNV that affects survival?
Is there a pathway that is enriched in this cluster?
Is there a mutually exclusive mutation?
Guided Exploration
Query, Rank, Visualize
Stratifications, Clinical Params, Pathways
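A question such as "Is there a mutation that overlaps with this mRNA cluster?" can be read as a query followed by a ranking step. The sketch below scores candidate sample groups against a cluster with the Jaccard index and sorts them by that score. The choice of score is an assumption made for illustration and is not presented here as the method the tool implements; the gene labels and sample identifiers are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two sample sets: size of overlap over size of union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_by_overlap(cluster, candidates):
    """Rank candidate groups by their overlap with one cluster, best first.

    cluster: set of sample ids in the cluster being queried
    candidates: dict mapping a label to the set of sample ids in that group
    """
    scored = [(label, jaccard(cluster, samples)) for label, samples in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical data: one mRNA expression cluster and three mutation groups.
MRNA_CLUSTER = {"s1", "s2", "s3", "s4"}
MUTATED_SAMPLES = {
    "GENE_A": {"s1", "s2", "s3"},
    "GENE_B": {"s4", "s5", "s6", "s7"},
    "GENE_C": {"s8"},
}

if __name__ == "__main__":
    for label, score in rank_by_overlap(MRNA_CLUSTER, MUTATED_SAMPLES):
        print(f"{label}: {score:.2f}")
```

Only the top-ranked groups would then need to be visualized next to the cluster, which is the idea behind guiding the exploration rather than browsing every stratification by hand.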
[Figure: demand and products: users and visualization tools]
genome browser, network viewer, heatmap visualization
http://genome.ucsc.org
customized tools for very specific problems
Meyer et al., “MulteeSum: A Tool for Comparative Spatial and Temporal Gene Expression Data”, 2010
What am I doing?
Building infrastructure to build visualization tools.
Data Management
Visualization Components
Visualization Tools
Data Analytics
Visualization Libraries
[Figure: the five layers above (Data Management, Visualization Components, Visualization Tools, Data Analytics, Visualization Libraries), annotated with Abstraction and System/Platform]
Think about scale.
Think about systems.
Acknowledgements
Harvard Medical School: Psalm Haseley, Richard Park, Peter J Park
Broad Institute of MIT & Harvard: Michael S Noble, Douglas Voet, Lihua Zou, Spring Liu, Hailei Zhang, Sachet Shukla, Aaron McKenna, Andrew Cherniak, Pei Lin, Gad Getz
MD Anderson Cancer Center: Jianhua Zhang, Terrence Wu, Ian Watson, Steven Quayle, Lynda Chin
Graz University of Technology: Christian Partl, Dieter Schmalstieg
Johannes Kepler University Linz: Samuel Gratzl, Stefan Luger, Marc Streit
University of Rostock: Hans-Jörg Schulz
Harvard SEAS: Alexander Lex, Hanspeter Pfister
Harvard School of Public Health: Ilya Sytchev, Shannan Ho Sui, Winston Hide
Funding: NIH/NHGRI K99 HG007583