Post on 13-Jan-2016
Analysis of ribo-seq data for prediction of translation
efficiency and protein quantity from transcriptomics data
Fedor Kolpakov
Biosoft.Ru, Ltd.
Institute of Systems Biology, Ltd.
Novosibirsk, Russia
Biosoft.Ru
July 6th - 11th 2013, St. Petersburg, RUSSIA
Genome-scale model for prediction of synthesis rates of mRNAs and proteins
Initial data: Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M. Global quantification of mammalian gene expression control. Nature, 2011, 473(7347):337-342.
- mouse fibroblasts, parallel metabolic pulse labelling
- simultaneously measured absolute mRNA and protein abundance and turnover for 5000+ genes
- first genome-scale quantitative model for prediction of synthesis rates of mRNAs and proteins
Bukharov Aleksandr, Kiselev Ilya
Experiment design
Schwanhäusser B., et al., 2011 - Fig. 6: Comparison of synthesis rates of mRNA and proteins assuming the measured levels reflect averages over one cell cycle or steady-state values. For the synthesis rates of mRNA (light gray), the deviation between the two approaches is small, because mRNA half-lives are mostly shorter than the cell cycle time. For protein synthesis (dark gray), the differences are substantial; they can differ by more than one order of magnitude.
P – protein, exp – population mean, ss – steady state
“They do not take into account that gene expression in mammalian cells is non-continuous. In addition, the non-uniform age distribution of cells in culture as described in 19, 23 is neglected, since this effect is expected to be small compared to the deviation obtained by neglecting the cell cycle.”
Schwanhäusser B., et al., 2011, supplementary materials
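The gap between the steady-state and the cell-cycle-aware estimates can be illustrated with a small sketch. It assumes simple first-order kinetics, dP/dt = k_sp*R - k_deg*P, and treats dilution by cell division as an extra loss term ln2/t_cc. This is a common simplification and not necessarily the exact procedure used in the paper; all numeric values and function names below are hypothetical.

```python
import math

def synthesis_rate_steady_state(protein, mrna, k_deg):
    """Translation rate constant from dP/dt = k_sp*R - k_deg*P = 0."""
    return k_deg * protein / mrna

def synthesis_rate_with_dilution(protein, mrna, k_deg, cell_cycle_h):
    """Same balance, with dilution by division added as a loss term ln2/t_cc."""
    k_dilution = math.log(2) / cell_cycle_h
    return (k_deg + k_dilution) * protein / mrna

# Hypothetical values, for illustration only
protein = 50_000.0    # protein molecules per cell
mrna = 20.0           # mRNA molecules per cell
half_life_h = 50.0    # protein half-life, hours
cell_cycle_h = 27.5   # cell cycle duration, hours (assumed)

k_deg = math.log(2) / half_life_h
k_ss = synthesis_rate_steady_state(protein, mrna, k_deg)
k_cc = synthesis_rate_with_dilution(protein, mrna, k_deg, cell_cycle_h)

# Under these assumptions the ratio is 1 + half_life / cell_cycle,
# so it grows with protein stability.
ratio = k_cc / k_ss
print(f"steady state: {k_ss:.2f}, with dilution: {k_cc:.2f}, ratio: {ratio:.2f}")
```

Under these assumptions the correction is negligible for short-lived molecules (most mRNAs) and large for stable proteins, which matches the trend described in the figure caption.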
Agent based model
each cell is an agent
4247 blocks for protein synthesis
Phase of cell cycle | Percent of cells in phase | Rate of transcription
G1 | 50.8% | Vsr
S | 22.9% | 0.7*Vsr
G2 | 13% | 2*Vsr
M | 3.3% | 0
G0 | 10% | Vsr
Tab. 1. Parameters of the cell cycle
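As a quick check of Tab. 1, the population-averaged transcription rate can be computed as a weighted mean of the per-phase rates. The sketch below only uses the numbers from the table; the dictionary layout and function name are illustrative choices, not part of the original model.

```python
# Phase -> (fraction of cells in phase, transcription rate relative to Vsr), from Tab. 1
CELL_CYCLE = {
    "G1": (0.508, 1.0),
    "S":  (0.229, 0.7),
    "G2": (0.130, 2.0),
    "M":  (0.033, 0.0),
    "G0": (0.100, 1.0),
}

def mean_rate_multiplier(table):
    """Weighted mean of the per-phase transcription rate multipliers."""
    total_fraction = sum(fraction for fraction, _ in table.values())
    weighted = sum(fraction * rate for fraction, rate in table.values())
    return weighted / total_fraction

total_fraction = sum(fraction for fraction, _ in CELL_CYCLE.values())
mean_multiplier = mean_rate_multiplier(CELL_CYCLE)
print(f"fractions sum to {total_fraction:.3f}, mean rate = {mean_multiplier:.4f} * Vsr")
```

With these parameters the phase fractions sum to 100% and the mean transcription rate stays close to Vsr, so the phase-dependent rates redistribute transcription over the cycle rather than change its average much.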
Numerical experiment.
The initial population is 200 cells, which divide over 108 hours. The average number of protein molecules per cell was calculated. The experiment was repeated for 4247 proteins.
The correlation between experiment and numerical modeling is R = 0.99.
Absolute values also agree: for 81.6% of proteins, the absolute values differ by less than 7%.
Main deviations from experimental values are observed for proteins with extremely low copy numbers, where experimental error can be significant.
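The two agreement measures quoted above (correlation, and the share of proteins within a relative tolerance) can be computed as in the following sketch. The data are made up for illustration, and the correlation is taken on raw values because the slides do not say whether a log scale was used.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def fraction_within(measured, simulated, rel_tol=0.07):
    """Share of items whose simulated value is within rel_tol of the measured one."""
    hits = sum(1 for m, s in zip(measured, simulated) if abs(s - m) <= rel_tol * m)
    return hits / len(measured)

# Hypothetical protein copy numbers (measured vs. simulated), illustration only
measured = [1200.0, 54000.0, 310.0, 98000.0, 15.0]
simulated = [1230.0, 52500.0, 300.0, 99500.0, 40.0]

r = pearson_r(measured, simulated)
share = fraction_within(measured, simulated)
print(f"R = {r:.3f}, within 7%: {share:.0%}")
```

In this toy data set the only protein outside the tolerance is the one with the lowest copy number, mirroring the observation about low-abundance proteins.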
Ingolia N. et al., 2011
- The rate of translation is remarkably consistent between different classes of messages (Figures 3D and 3E).
- The kinetics of elongation are independent of length and protein abundance and are the same in secreted proteins, whose translation occurs on the ER surface.
- Translation speed is also independent of codon usage, which is consistent with the absence of pauses at rare codons.
- Although this may be the case for specific examples, they find no evidence for a large effect on the overall rate of elongation.
An important practical implication for the universality of the average rate of elongation is that ribosome footprint density provides a reliable measure of protein synthesis independent of the particular gene being translated.
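If footprint density is taken as a measure of protein synthesis, a per-gene translation efficiency can be estimated as the ratio of footprint density to mRNA abundance. The sketch below assumes the widely used RPKM normalisation and that ratio definition; the function names and read counts are hypothetical and are not taken from the cited paper.

```python
def rpkm(read_count, length_nt, total_mapped_reads):
    """Reads per kilobase of sequence per million mapped reads."""
    return read_count * 1e9 / (length_nt * total_mapped_reads)

def translation_efficiency(ribo_rpkm, mrna_rpkm):
    """Footprint density divided by mRNA abundance; None if mRNA is undetected."""
    if mrna_rpkm <= 0:
        return None
    return ribo_rpkm / mrna_rpkm

# Hypothetical counts for one gene, illustration only
cds_length = 1000            # nucleotides
ribo_reads = 500             # ribosome footprints mapped to the CDS
rna_reads = 2000             # RNA-seq reads mapped to the same region
ribo_total = 10_000_000      # mapped reads in the ribo-seq library
rna_total = 10_000_000       # mapped reads in the RNA-seq library

ribo_density = rpkm(ribo_reads, cds_length, ribo_total)
mrna_density = rpkm(rna_reads, cds_length, rna_total)
te = translation_efficiency(ribo_density, mrna_density)
print(f"ribo RPKM = {ribo_density}, mRNA RPKM = {mrna_density}, TE = {te}")
```

Returning None for undetected mRNA avoids a division by zero and keeps such genes distinguishable from genes with a true efficiency of zero.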
Ingolia N. et al., 2011
“Our data are consistent with recent work that indirectly infers translation levels from absolute mRNA and protein abundance measurements (Schwanhausser et al., 2011).
Notably, they found that translation was the single largest contributor to protein abundance, highlighting the value of direct measurements of protein synthesis.”
Ingolia N. et al., 2011
R = 0.49
R = -0.17
R = -0.41
Schwanhausser et al., 2011
Ingolia N. et al., 2011
Current works
1. Database on ribo-seq data
2. Analyses of lncRNA
3. Models of biological pathways involved in translation regulation (for example, mTOR)
4. More predictors for translation efficiency
• protein binding sites
• miRNA binding sites
• …
Raw data collected from the literature and from the GEO, SRA and ENCODE databases were uniformly processed using a specially developed workflow (pipeline) for the BioUML platform:
- sequenced reads were aligned to the reference genome using Bowtie;
- peaks were identified using the MACS and SISSR algorithms;
- the obtained peaks were further refined;
- position weight matrices (PWMs) were constructed by different methods (ChIPMunk, our own methods);
- ROC curves were calculated to evaluate and compare the constructed PWMs;
- site models (PWMs + thresholds) were constructed for recognition of TF binding sites.
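The idea of a site model (PWM + threshold) can be sketched as follows. This is a plain frequency-based log-odds matrix with a pseudocount and a uniform background; it is not the ChIPMunk algorithm or the BioUML implementation, and the example sites, sequence and threshold are hypothetical.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds PWM from equally long aligned sites (one dict per position)."""
    pwm = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pwm

def score(pwm, window):
    """Sum of per-position log-odds scores for a window of PWM length."""
    return sum(column[base] for column, base in zip(pwm, window))

def find_sites(pwm, sequence, threshold):
    """Positions and scores of all windows scoring at or above the threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        s = score(pwm, sequence[start:start + width])
        if s >= threshold:
            hits.append((start, s))
    return hits

# Hypothetical aligned binding sites and test sequence, illustration only
aligned_sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
pwm = build_pwm(aligned_sites)
hits = find_sites(pwm, "AAATGACTCAGGG", threshold=5.0)
print(hits)
```

In practice the threshold would be chosen from the ROC analysis mentioned in the list, trading sensitivity against the false positive rate.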
TFClass database is used as a core for information about transcription factors, their classification and cross-linking with Ensembl.
The BioUML platform provides a web interface for access to the GTRD database: information search, browsing, different data views. The built-in genome browser provides powerful visualisation of ChIP-seq data.
GTRD - Gene Transcription Regulation Database
Prediction of gene expression level by ChIP-seq data
ChIP-seq peaks (MACS) for histones and transcription factor binding sites were extracted from the GTRD database for 2 cell lines: GM12878 and K562.
Machine learning - Random Forest algorithm.
R = 0.72 – 0.77
Ribo-seq experiments
Cell type | Treatment | Chemicals | Samples number | References
K562 (human) | - | Cycloheximide | 1 | Batut P, Dobin A, Plessy C, Carninci P, Gingeras TR «High-fidelity promoter profiling reveals widespread alternative promoter usage and transposon-driven developmental gene expression», Genome Res. 2013 Jan;23(1):169-80
HEK293 (human) | High, medium and low Mg buffer | Cycloheximide, Harringtonine | 3 | Ingolia NT, Brar GA, Rouskin S, McGeachy AM, Weissman JS «The ribosome profiling strategy for monitoring translation in vivo by deep sequencing of ribosome-protected mRNA fragments», Nat Protoc. 2012 Jul 26;7(8):1534-50
PC3 (human) | Rapamycin, PP242 | Cycloheximide | 12 | Hsieh AC et al. «The translational landscape of mTOR signalling steers cancer initiation and metastasis», Nature. 2012 Feb 22;485(7396):55-61
HEK293 (human) | Two ribosomal populations: free (cytosol) and ER-bound | Cycloheximide | 4 | Reid DW, Nicchitta CV. «Primary role for endoplasmic reticulum-bound ribosomes in cellular translation identified by ribosome profiling», J Biol Chem. 2012 Feb 17;287(8):5518-27
Olga Gluschenko, Ivan Yevshin
Cell type | Treatment | Chemicals | Samples number | References
Embryonic stem cells (mouse) | Control – no drug, experiment – cycloheximide, emetine | Cycloheximide, emetine | 18 | Ingolia NT, Lareau LF, Weissman JS. «Ribosome profiling of mouse embryonic stem cells reveals the complexity and dynamics of mammalian proteomes», Cell. 2011 Nov 11;147(4):789-802
HeLa (human) | Transfection miR-1 and miR-155 | Cycloheximide | 12 | Guo H, Ingolia NT, Weissman JS, Bartel DP. «Mammalian microRNAs predominantly act to decrease target mRNA levels», Nature. 2010 Aug 12;466(7308):835-40
Neutrophils (mouse) | Mir-223 knockout mouse, wild-type mouse | Cycloheximide | 4 | Guo H, Ingolia NT, Weissman JS, Bartel DP. «Mammalian microRNAs predominantly act to decrease target mRNA levels», Nature. 2010 Aug 12;466(7308):835-40
Embryonic stem cells (mouse) | siLuc or siLin28a | Cycloheximide | 18 | Cho J et al. «LIN28A is a suppressor of ER-associated translation in embryonic stem cells», Cell. 2012 Nov 9;151(4):765-77
Fibroblasts (mouse) | DMSO – control, treatment – Rapamycin | No information | 21 | GSE25626
Neurons (mouse) | PBS, NaCl | Cycloheximide | 4 | GSE40969
Cell type | Treatment | Chemicals | Samples number | References
Human fibroblasts infected by cytomegalovirus | - | Cycloheximide, Harringtonine | 16 | Stern-Ginossar N et al. «Decoding human cytomegalovirus», Science. 2012 Nov 23;338(6110):1088-93
HEK293 (human) | - | Lactimidomycin | 2 | SRA056494 (SRS351807, SRS351808)
Embryonic fibroblasts (mouse) | - | Cycloheximide, lactimidomycin | 2 | SRA056494 (SRS351809, SRS351810)
Workflow for ribo-seq data analyses
Model of mTOR pathway
based on the models of:
Dimelow R.J. and Wilkinson S.J. Control of translation initiation: a model-based analysis from limited experimental data. J. R. Soc. Interface (2009) 6, 51–61. doi:10.1098/rsif.2008.0221
Lequieu J, Chakrabarti A, Nayak S, Varner JD (2011) Computational Modeling and Analysis of Insulin Induced Eukaryotic Translation Initiation. PLoS Comput Biol 7(11): e1002263. doi:10.1371/journal.pcbi.1002263
Wanted experiment
For the same cell line and conditions:
- CAGE -> transcription start sites
- RNA-seq
  - polyA +/-; nucleus, cytoplasm, whole cell
- ribo-seq
  - harringtonine -> translation start site
  - cycloheximide -> translation efficiency
- protein MS
  - pulse labelled with heavy amino acids (SILAC, left) -> protein abundance and turnover.
Current works
1. Database on ribo-seq data
2. Analyses of lncRNA
3. Models of biological pathways involved in translation regulation (for example, mTOR)
4. More predictors for translation efficiency
• protein binding sites
• miRNA binding sites
• …
Acknowledgements
Ivan Yevshin
Olga Gluschenko
Eseniya Basmanova
Ruslan Sharipov