Analysis of Gene Expression at the Single-Cell Level Guo-Cheng Yuan Department of Biostatistics and Computational Biology Dana-Farber Cancer Institute Harvard School of Public Health Bioconductor, July 31st, 2014
Transcript
Slide 1
Analysis of Gene Expression at the Single-Cell Level Guo-Cheng
Yuan Department of Biostatistics and Computational Biology
Dana-Farber Cancer Institute Harvard School of Public Health
Bioconductor, July 31st, 2014
Slide 2
bioconductor
Slide 3
Methods to sequence the DNA and RNA of single cells are poised
to transform many areas of biology and medicine. --- Nature
Methods
Slide 4
Slide 5
Recent technical advances have enabled RNA sequencing (RNA-seq)
in single cells. Exploratory studies have already led to insights
into the dynamics of differentiation, cellular responses to
stimulation and the stochastic nature of transcription. We are
entering an era of single-cell transcriptomics that holds promise
to substantially impact biology and medicine. R. Sandberg, 2014.
Nature Methods
Slide 6
Slide 7
Figure labels: Cell-type A, Cell-type B, Cell-type C, Cell-type D, Cell-type E, Cell-type F; Cell Division
Slide 8
R. Sandberg, 2014. Nature Methods
Slide 9
Challenges in single-cell data analysis
Characterize and distinguish technical/biological variability.
Identify new and meaningful cell clusters.
Identify the lineage relationship between different cell clusters.
Characterize the dynamic process during cell-state transitions.
Elucidate the transition of regulatory networks.
Distinguish stochastic vs real variation.
Slide 10
Slide 11
Figure labels: CMP, GMP, MEP, CLP, MEP
Guoji Guo, Eugenio Marco
Slide 12
SPADE: a density-normalized, spanning-tree model
Qiu et al. 2011, Nat Biotech, p. 886
Steps: down-sample, clustering, spanning-tree visualization
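The steps named on the SPADE slide (down-sample, cluster, connect clusters with a spanning tree) can be illustrated with a toy sketch. This is not the SPADE implementation from Qiu et al.: the neighbour-count density estimate, the keep-probability rule, the use of k-means, and all parameter values (`radius`, `target`, `k`) are assumptions chosen for illustration, and the data are synthetic.

```python
import numpy as np

def pairwise_dist(A, B):
    """Euclidean distance between every row of A and every row of B."""
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)

def density_keep_mask(X, radius, target, rng):
    """Density-dependent down-sampling: keep a cell with probability
    min(1, target / local density), so dense regions are thinned most."""
    density = (pairwise_dist(X, X) < radius).sum(axis=1)  # counts the cell itself, so >= 1
    keep_prob = np.minimum(1.0, target / density)
    return rng.random(len(X)) < keep_prob

def kmeans(X, k, rng, n_iter=50):
    """Plain k-means (a stand-in for whatever clustering one prefers)."""
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = pairwise_dist(X, centers).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph; returns a list of edges."""
    dist = pairwise_dist(points, points)
    in_tree = {0}
    edges = []
    while len(in_tree) < len(points):
        candidates = ((i, j) for i in in_tree
                      for j in range(len(points)) if j not in in_tree)
        i, j = min(candidates, key=lambda e: dist[e])
        edges.append((i, j))
        in_tree.add(j)
    return edges

# Synthetic example: one abundant population and one rare population.
rng = np.random.default_rng(0)
abundant = rng.normal(0.0, 0.5, size=(500, 2))
rare = rng.normal(5.0, 0.5, size=(20, 2))
cells = np.vstack([abundant, rare])
is_rare = np.arange(len(cells)) >= 500

keep = density_keep_mask(cells, radius=0.5, target=5, rng=rng)
rare_fraction_before = is_rare.mean()
rare_fraction_after = is_rare[keep].mean()

centers, _ = kmeans(cells[keep], k=4, rng=rng)
tree_edges = minimum_spanning_tree(centers)
```

In this toy setting the down-sampling step raises the share of rare cells before clustering, which is the motivation for normalizing by density; the tree over cluster centers is what would then be drawn.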
Cancer Stem Cells
Each cancer contains a highly heterogeneous cell population.
Clonal evolution contributes to cancer heterogeneity.
Cancer cells are hierarchically organized and maintained by cancer stem cells.
How are the leukemia stem cells related to the normal blood cell lineage? How do they differ?
Slide 17
Single-cell analysis of mouse MLL-AF9 acute myeloid leukemia cells
Compilation of mouse cell surface antigens (Lai et al., 1998; eBioscience website)
Primer design for 300 multiplexed PCR (collaboration with Helen Skaletsky)
Microfluidic high-throughput real-time PCR (96.96 Array)
Guoji Guo, Assieh Saadatpour
Slide 18
t-SNE analysis identifies similarities between cell types
t-SNE is a nonlinear dimension reduction method that can identify patterns undetectable by PCA.
t-SNE minimizes the divergence between two distributions over pairs of points: one computed in the original space and one in the low-dimensional embedding.
Leukemia cells are more similar to GMPs than to HSCs.
Leukemia cells are highly heterogeneous.
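To make "divergence between distributions over pairs of points" concrete, the sketch below evaluates a t-SNE-style objective for two candidate embeddings. It is a simplification under stated assumptions: a single fixed Gaussian bandwidth `sigma` is used, whereas t-SNE tunes a bandwidth per point from a perplexity setting, and no optimization is performed. The data are synthetic.

```python
import numpy as np

def pairwise_sq_dists(X):
    diff = X[:, None, :] - X[None, :, :]
    return (diff ** 2).sum(axis=2)

def high_dim_affinities(X, sigma=1.0):
    """Gaussian pair probabilities P in the original space (fixed bandwidth, an assumption)."""
    W = np.exp(-pairwise_sq_dists(X) / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W / W.sum()

def low_dim_affinities(Y):
    """Student-t (one degree of freedom) pair probabilities Q in the embedding."""
    W = 1.0 / (1.0 + pairwise_sq_dists(Y))
    np.fill_diagonal(W, 0.0)
    return W / W.sum()

def kl_divergence(P, Q, eps=1e-12):
    """KL(P || Q) summed over pairs; this is the quantity the embedding tries to minimize."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / (Q[mask] + eps))))

# Synthetic data: two well-separated groups of 30 cells in 5 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 0.5, size=(60, 5))
X[30:, 0] += 6.0

Y_good = X[:, :2]                          # keeps the two groups apart
Y_shuffled = Y_good[rng.permutation(60)]   # same coordinates, wrong cell assignment

P = high_dim_affinities(X)
kl_good = kl_divergence(P, low_dim_affinities(Y_good))
kl_shuffled = kl_divergence(P, low_dim_affinities(Y_shuffled))
```

An embedding that keeps neighbouring cells together scores a lower divergence than one that scrambles them, which is the sense in which the objective rewards preserving local similarity.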
Slide 19
Mapping leukemia cells to the normal hematopoietic cell hierarchy
Use 33 common genes to map the cell hierarchy.
Mapping identifies two subtypes of leukemia cells.
These cells are similar but not identical to their corresponding normal lineages.
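One simple way to map query cells onto a reference hierarchy using a shared gene panel is to assign each cell to the reference cell type whose mean profile it correlates with best. The slides do not state which mapping method was used, so the correlation rule below is an assumption, and the expression values are synthetic; only the panel size (33 genes) and the cell-type names are taken from the slides.

```python
import numpy as np

def center_and_scale_rows(M):
    """Center each row and scale it to unit length, so dot products are Pearson correlations."""
    M = M - M.mean(axis=1, keepdims=True)
    return M / np.linalg.norm(M, axis=1, keepdims=True)

def map_to_reference(cells, centroids):
    """For each cell (row), return the index of the best-correlated reference centroid
    together with the full cell-by-centroid correlation matrix."""
    corr = center_and_scale_rows(cells) @ center_and_scale_rows(centroids).T
    return corr.argmax(axis=1), corr

# Synthetic example: 3 reference cell types measured on a 33-gene panel.
rng = np.random.default_rng(0)
n_genes = 33
reference_names = ["HSC", "GMP", "MEP"]   # names reused as labels; values are made up
centroids = rng.normal(size=(len(reference_names), n_genes))

# Query cells resemble one reference type but are not identical to it.
true_type = rng.integers(0, len(reference_names), size=40)
query = centroids[true_type] + rng.normal(scale=0.3, size=(40, n_genes))

assigned, corr = map_to_reference(query, centroids)
accuracy = float((assigned == true_type).mean())
```

The correlation matrix `corr` is also useful on its own: a cell whose best correlation is well below 1 is "similar but not identical" to its assigned lineage.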
Slide 20
Coexpression networks are different among subtypes
Panels: All Leukemia, Leukemia 1, Leukemia 2, GMP
Slide 21
Surani and Tischler, Nature 2012 Guo et al. Dev Cell 2010
Slide 22
Dynamic clustering
Time points: T = 1, T = 2, T = 3, T = 4
Maximizing the penalized log-likelihood.
Eugenio Marco, Bobby Karp, Lorenzo Trippa, Guoji Guo
Slide 23
Identifying bifurcation points and directions
More than 80% of the variance increase during bifurcation is attributed to a single (bifurcation) direction.
Figure labels: ICM, TE, EPI, PE
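A claim of the form "most of the variance increase lies along one direction" can be checked by comparing covariance matrices before and after the split. The sketch below is one possible way to do this, not necessarily the procedure used in the talk: it takes the leading eigenvector of the covariance change as the candidate bifurcation direction and reports the share of the total variance increase it carries. All data are synthetic.

```python
import numpy as np

def bifurcation_direction(before, after):
    """Leading eigenvector of (cov_after - cov_before) and the fraction of the
    total variance increase (trace of the difference) that it accounts for."""
    delta = np.cov(after, rowvar=False) - np.cov(before, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(delta)   # eigenvalues in ascending order
    direction = eigvecs[:, -1]
    share = eigvals[-1] / np.trace(delta)
    return direction, float(share)

# Synthetic example: 5 genes, 200 cells per stage.
rng = np.random.default_rng(0)
n_cells, n_genes = 200, 5
true_dir = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)

before = rng.normal(0.0, 0.5, size=(n_cells, n_genes))
branch = rng.choice([-2.0, 2.0], size=n_cells)   # each cell commits to one of two fates
after = rng.normal(0.0, 0.5, size=(n_cells, n_genes)) + branch[:, None] * true_dir

direction, share = bifurcation_direction(before, after)
alignment = abs(float(direction @ true_dir))     # sign of an eigenvector is arbitrary
```

Because sampling noise can make some eigenvalues of the difference slightly negative, `share` can come out marginally above 1 in small samples.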
Slide 24
Modeling dynamics by bifurcation analysis
Figure labels: U(x); panels I), II)
Slide 25
Modeling dynamics by bifurcation analysis
Figure labels: U(x); panels I), II)
Slide 26
Noise level has a large impact on lineage biases
Panels at three noise levels: 1, 0.5, 2
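The effect of noise on lineage bias can be illustrated with a one-dimensional stochastic simulation in a slightly tilted double-well potential. The slides do not give the form of U(x), so the potential `U(x) = x**4/4 - x**2/2 - tilt*x`, the `tilt` value, the time step, and the Euler-Maruyama scheme are all assumptions of this sketch; the three noise levels (0.5, 1, 2) follow the panel values on the slide.

```python
import numpy as np

def drift(x, tilt):
    """Negative gradient of the assumed potential U(x) = x**4/4 - x**2/2 - tilt*x."""
    return -x ** 3 + x + tilt

def fraction_in_right_well(sigma, tilt=0.2, n_cells=5000, n_steps=2000, dt=0.01, seed=0):
    """Euler-Maruyama simulation of dx = -U'(x) dt + sigma dW for many cells
    starting at x = 0; returns the fraction that end with x > 0."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_cells)
    for _ in range(n_steps):
        noise = rng.normal(size=n_cells)
        x += drift(x, tilt) * dt + sigma * np.sqrt(dt) * noise
    return float((x > 0).mean())

# Lineage bias (fraction of cells in the favoured well) at each noise level.
bias = {sigma: fraction_in_right_well(sigma) for sigma in (0.5, 1.0, 2.0)}
```

In this toy model the same small tilt produces a strong bias toward one fate at low noise and a much weaker bias at high noise, since larger fluctuations let cells cross the barrier between wells more easily.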
Slide 27
Lineage bias due to perturbation of TF activity
Predicted lineage bias due to a 2-fold decrease of TF level
Figure labels: Control, Perturbation, U(x)
Slide 28
Experimental validation using Nanog mutant
Figure labels: Nanog, PE, EPI
Slide 29
How do we infer dynamics without temporal information?
Slide 30
Characterization of early bipotential progeny of Lgr5+ intestinal stem cells
Tae-Hee Kim, Assieh Saadatpour
Crosnier 2006. Nature Review
Slide 31
Principal Curve Analysis Reconstructs Temporal Information
The t-SNE plot indicates two distinct clusters, linked by a small number of transitional cells.
Slide 32
Principal Curve Analysis Reconstructs Temporal Information
The t-SNE plot indicates two distinct clusters, linked by a small number of transitional cells.
Principal curve analysis captures the overall trend of the cell-state transition.
Slide 33
Inferred dynamic gene expression profile
Use the principal curve coordinate as a proxy for temporal evolution.
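The idea of using a curve coordinate as a stand-in for time can be sketched with a deliberately simplified substitute: projection onto the first principal component, which is the straight-line special case of a principal curve. A true principal curve would bend to follow the data; the linear version, the moving-average smoother, the `window` size, and the synthetic data are all assumptions made for brevity.

```python
import numpy as np

def pseudotime_first_pc(coords):
    """Project cells onto the first principal component and rescale to [0, 1].
    The direction (which end is 'early') is arbitrary and must be set from biology."""
    centered = coords - coords.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = centered @ vt[0]
    return (proj - proj.min()) / (proj.max() - proj.min())

def smoothed_profile(pseudotime, expression, window=15):
    """Order cells by pseudotime and apply a moving average to one gene's expression."""
    order = np.argsort(pseudotime)
    kernel = np.ones(window) / window
    pt_smooth = np.convolve(pseudotime[order], kernel, mode="valid")
    expr_smooth = np.convolve(expression[order], kernel, mode="valid")
    return pt_smooth, expr_smooth

# Synthetic example: cells spread along a gently curved path in a 2-D embedding.
rng = np.random.default_rng(0)
n_cells = 100
t_true = np.sort(rng.random(n_cells))                       # hidden "real" time
coords = np.column_stack([4.0 * t_true, t_true ** 2])
coords += rng.normal(scale=0.1, size=coords.shape)
gene = t_true + rng.normal(scale=0.1, size=n_cells)         # a gene that rises over time

pseudotime = pseudotime_first_pc(coords)
pt_smooth, gene_smooth = smoothed_profile(pseudotime, gene)
agreement = abs(float(np.corrcoef(pseudotime, t_true)[0, 1]))
```

Plotting `gene_smooth` against `pt_smooth` gives an inferred expression profile along the transition, with the caveat that the ordering is only as good as the assumption that cells lie along a single path.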
Slide 34
Conclusions
Single-cell genomics is a powerful technology for understanding cellular heterogeneity and hierarchy.
Single-cell gene expression data analysis presents many new methodological challenges.
It is a great time to develop algorithms and software for single-cell data analysis.
Slide 35
Acknowledgement
Eugenio Marco, Assieh Saadatpour, Bobby Karp, Lorenzo Trippa, Paul Robson, Stuart Orkin, Guoji Guo, Ramesh Shivdasani, Tae-Hee Kim
Funding from NIH, HSCI