Date posted: 28-Dec-2015
TFInfer
A Tool for Probabilistic Inference of Transcription Factor Activities
H.M. Shahzad Asif
Institute of Adaptive and Neural Computation
School of Informatics
University of Edinburgh
Scope
Introduction
Software Features
Inputs and Outputs
Software Interfaces
Software Requirements and Availability
Acknowledgements
References
Introduction
TFInfer is a novel standalone software tool for inference of transcription factor activities (TFAs).
The following probabilistic state space model provides the basis:
“yn(t)” is the expression level of gene “n” at time instant “t” and is the only observed variable.
“Xnm” contains a binary value indicating whether there is a link between gene “n” and transcription factor “m”.
“bnm” encodes the regulatory strength between gene “n” and transcription factor “m”.
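The model equation itself did not survive in the text of the slides. A form consistent with the definitions above, and with the model of reference [1] as far as it can be recalled here, is sketched below. The gene-specific baseline \mu_n, the noise variance \sigma^2, the number of transcription factors M and the exact notation are assumptions rather than a transcription of the original slide; c_m(t) is the latent activity of transcription factor m.

```latex
y_n(t) = \sum_{m=1}^{M} X_{nm}\, b_{nm}\, c_m(t) + \mu_n + \epsilon_n(t),
\qquad \epsilon_n(t) \sim \mathcal{N}(0, \sigma^2)
```

Because X_{nm} is zero for most gene/TF pairs, only a few terms of the sum are active for any given gene.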
Identifiability
As regulatory strengths and TFAs appear in the likelihood through a product, there is a sign ambiguity (de-repressing looks the same as activating).
This ambiguity can be resolved by providing additional information, e.g.:
TF X is an activator for gene Y.
TF X is active/inactive in condition C.
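The sign ambiguity can be illustrated with a small sketch. It is not TFInfer code: the function names and all numbers are hypothetical, and it only shows that flipping the sign of both the regulatory strength and the TFA leaves their product unchanged, and how one piece of prior knowledge (here, "the TF is an activator for this gene") picks one of the two solutions.

```python
# Sign ambiguity: the likelihood only sees the product b * c(t).
# All values below are made up for illustration.

def predicted_expression(b, c):
    """Contribution of one TF to one gene over time: b * c(t)."""
    return [b * c_t for c_t in c]

def fix_sign(b, c, is_activator):
    """Flip both b and c if the sign of b contradicts the prior knowledge."""
    if (b > 0) != is_activator:
        return -b, [-c_t for c_t in c]
    return b, c

b = 0.8                       # hypothetical regulatory strength
c = [0.1, 0.5, 1.2, 0.7]      # hypothetical TFA profile over four time points

original = predicted_expression(b, c)
flipped = predicted_expression(-b, [-c_t for c_t in c])
print(original == flipped)    # both sign choices explain the data equally well

# Knowing that the TF activates this gene selects the solution with b > 0.
b_fixed, c_fixed = fix_sign(-b, [-c_t for c_t in c], is_activator=True)
```

The same idea applies to the second kind of information (a TF known to be active or inactive in a given condition), which constrains the sign of c instead of b.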
Introduction (cntd.)
The latent (hidden) variable cm(t) is used to estimate the activity of the m-th transcription factor at time instant “t”.
An efficient variational Bayesian EM algorithm is used to obtain the posteriors over model parameters.
The model exploits the natural sparsity of the regulatory network by using connectivity information.
Feasible for genome-wide applications.
The probabilistic approach makes it possible to associate confidence intervals with the results.
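To make the role of the connectivity information concrete, the following sketch generates synthetic expression data from a linear model of the kind described in these slides. It is a forward simulation only, not the variational inference used by TFInfer. The function name, the optional baseline `mu`, the Gaussian noise and all numbers are assumptions made for illustration.

```python
import random

def simulate_expression(X, b, c, mu=None, noise_sd=0.1, seed=0):
    """Draw y[n][t] = mu[n] + sum_m X[n][m] * b[n][m] * c[m][t] + noise."""
    rng = random.Random(seed)
    n_genes, n_tfs, n_times = len(X), len(c), len(c[0])
    mu = mu if mu is not None else [0.0] * n_genes
    y = []
    for n in range(n_genes):
        row = []
        for t in range(n_times):
            # Only TFs linked to gene n (X[n][m] == 1) contribute to the sum.
            signal = sum(b[n][m] * c[m][t] for m in range(n_tfs) if X[n][m])
            row.append(mu[n] + signal + rng.gauss(0.0, noise_sd))
        y.append(row)
    return y

# Hypothetical network: 3 genes, 2 TFs, 3 time points.
X = [[1, 0], [1, 1], [0, 1]]                 # binary connectivity
b = [[0.9, 0.0], [-0.5, 0.7], [0.0, 1.1]]    # regulatory strengths
c = [[0.0, 1.0, 2.0], [1.0, 1.0, 0.0]]       # TF activities over time

y = simulate_expression(X, b, c, noise_sd=0.0)
```

In a genome-wide setting most entries of X are zero, so each gene depends on only a handful of parameters.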
Software Features
Genome-wide inference.
Freeware.
Open-source.
Supported data types:
  Time-series data
  Time-independent data
  Replicates
Genome connectivity included for:
  Yeast
  E. coli
Software Features (cntd.)
Computationally efficient.
User friendly.
No programming expertise required.
Probabilistic modelling for TFAs.
Coded in C# using the dnAnalytics and ZedGraph libraries.
Usable under Linux/Mac via Mono.
User manual available.
Input and Output Files
Inputs
Standard format is CSV (comma-separated values).
Input files contain log-transformed gene expression data, with the first column for gene annotations and an optional header row.
Connectivity data is included with the software for yeast and E. coli.
For yeast, the connectivity file contains common names of genes.
For E. coli, the connectivity file contains b numbers.
Users can supply their own connectivity file.
Using the data selection interface, the required transcription factors can be selected.
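The expression file layout described above (first column for gene annotations, remaining columns for expression values, optional header row) can be read as in the sketch below. This is an illustration only: the function name, the column labels and the gene names in the sample are hypothetical, and the layout of the connectivity file is not covered because it is not described here.

```python
import csv
import io

def read_expression_csv(handle, has_header):
    """Return (header, gene_names, values) from a gene expression CSV."""
    reader = csv.reader(handle)
    header = next(reader) if has_header else None
    genes, values = [], []
    for row in reader:
        if not row:
            continue  # skip blank lines
        genes.append(row[0])                        # first column: gene annotation
        values.append([float(v) for v in row[1:]])  # remaining columns: log expression
    return header, genes, values

# Hypothetical file contents with a header row and two genes.
sample = io.StringIO(
    "gene,t0,t1,t2\n"
    "geneA,0.12,-0.40,1.05\n"
    "geneB,-0.33,0.08,0.51\n"
)
header, genes, values = read_expression_csv(sample, has_header=True)
```

The `has_header` flag mirrors the header-row option in the main interface: the caller has to state whether the first row holds labels or data.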
Input and Output Files (cntd.)
Output
TFAs are provided in two formats:
  Graphical representation (with error bars) for every transcription factor selected.
  A CSV file of TFAs.
Graphs can be saved in different formats.
A CSV file containing the TFAs can be exported.
As the model is probabilistic, all results have confidence intervals.
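As a rough illustration of how error bars follow from a probabilistic result, the sketch below turns a posterior mean and variance into an interval. It assumes a Gaussian posterior at each time point and a 95% interval (mean ± 1.96 standard deviations); the slides do not state which interval TFInfer plots, and the function name and numbers are hypothetical.

```python
import math

def credible_interval(mean, variance, z=1.96):
    """Interval mean +/- z * sd for a Gaussian with the given mean and variance."""
    sd = math.sqrt(variance)
    return mean - z * sd, mean + z * sd

# Hypothetical posterior means and variances of one TFA at three time points.
means = [0.2, 1.0, 0.6]
variances = [0.04, 0.25, 0.01]

error_bars = [credible_interval(m, v) for m, v in zip(means, variances)]
```

A wide interval at a time point indicates that the data constrain the activity only weakly there.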
Software Interface
Three main interfaces:
Data input and initial configuration: gene expression data; genome connectivity; time-series, time-independent, replicates.
Data selection: transcription factor selection.
Result: a graph for each transcription factor, and a CSV file containing the relative concentration of all transcription factors selected.
TFInfer Main Interface
Using this option, data file(s) containing gene expression data are supplied. For replicates, multiple files can be used. The maximum number of replicates is 5.
If the data file(s) contain a header row, this option must be selected before selecting the data file.
Specify whether the data is time-series or time-independent.
In case of replicates, this option must be selected. If selected, the number of replicates is shown on the right.
The connectivity file is supplied using this option. Two connectivity files are included: one for yeast and one for E. coli.
Reset the state of the software.
Load the data and connectivity files.
Start the process.
Stop the process.
When calculations are complete, results can be viewed using this button.
For every data file, TFInfer shows a summary of the data. For the connectivity file, this information is also shown, followed by a window containing a list of transcription factors.
TFInfer Data Selection Interface
The user can select any number of transcription factors here.
TFInfer Results Window
This option is for changing the sign of the signal.
This option is for saving the results in the form of a plot.
This option is for saving the results in a CSV file.
Software Requirements and Availability
Microsoft .NET Framework version 2 or Mono is required. A download link is available on the TFInfer page.
The software installer and other related material are available on the TFInfer home page: http://homepages.inf.ed.ac.uk/s0976841/TFInfer/
Acknowledgements
The software is based on the model proposed in the Bioinformatics paper [1].
Thanks to Dr Matthew Rolfe for providing connectivity information and for useful discussions.
Thanks to Dr Guido Sanguinetti for all the support.
Thanks to UoS for DoR Devolved funding.
References
[1] G. Sanguinetti, N. Lawrence, and M. Rattray. Probabilistic inference of transcription factor concentrations and gene-specific regulatory activities. Bioinformatics, 22(22):2775, 2006.
[2] C. Harbison, D. Gordon, T. Lee, N. Rinaldi, K. Macisaac, T. Danford, N. Hannett, J. Tagne, D. Reynolds, J. Yoo, et al. Transcriptional regulatory code of a eukaryotic genome. Nature, 431:99–104, 2004.
[3] T. I. Lee, N. J. Rinaldi, F. Robert, D. T. Odom, Z. Bar-Joseph, G. K. Gerber, N. M. Hannett, C. T. Harbison, C. M. Thompson, I. Simon, J. Zeitlinger, E. G. Jennings, H. L. Murray, D. B. Gordon, B. Ren, J. J. Wyrick, J.-B. Tagne, T. L. Volkert, E. Fraenkel, D. K. Gifford, and R. A. Young. Transcriptional Regulatory Networks in Saccharomyces cerevisiae. Science, 298(5594):799–804, 2002.
[4] P. T. Spellman, G. Sherlock, M. Q. Zhang, V. R. Iyer, K. Anders, M. B. Eisen, P. O. Brown, D. Botstein, and B. Futcher. Comprehensive Identification of Cell Cycle-regulated Genes of the Yeast Saccharomyces cerevisiae by Microarray Hybridization. Mol. Biol. Cell, 9(12):3273–3297, 1998.
[5] http://www.zedgraph.org/
[6] Matlab C Math library.
[7] http://www.ecocyc.com/
Contact
Shahzad Asif
[email protected]
Institute of Adaptive and Neural Computation
School of Informatics
University of Edinburgh