Outline
1. Goals
2. Structure
3. Services
4. Flagship Projects
BICF Goals
• To build a core facility open to the research community at UTSW with recognizable impact on the scope and quality of clinical and basic cancer research.
• Major funding sources: CPRIT Core Grant and the Cancer Center
BICF Structure
BICF Recruits
New Hires
• Brandi Cantarel, Ph.D., Computational Biologist III, started on 11/1/2015
• He Zhang, Ph.D., Computational Biologist III, started on 1/22/2016
• Xiu Luo, Ph.D., Data Scientist I, in process
Open Positions
• Computational Biologist – OPEN, high activity
• Computational Biologist – OPEN, high activity
• Data Scientist – OPEN SOON
• Scientific Programmer – OPEN SOON
BICF Services
1. Curated Databases
2. Software Pipelines
3. Educational Programs
4. Helpdesk Consults
5. Fellows Program
6. Program Development
7. Collaborations
8. Flagship Projects
Public / Curated Databases
• Research data are increasingly available in the public domain
– Gene Expression Omnibus (GEO)
– The Cancer Genome Atlas (TCGA)
– Cancer Cell Line Encyclopedia (CCLE)
– Literature…
• Using these data is challenging
– Scattered across different places and frequently updated
– Organized in different formats with different terminology
– Require specialized expertise to process
• BICF provides centralized infrastructure for public and curated databases, with intuitive tools to access these resources.
Next-generation Sequencing Analysis Pipelines
Next step: incorporate the pipeline into the BioHPC cloud system and set up educational programs for all researchers on campus.
Custom Software
• Probemapper – EntrezToProbe engine for handling mappings between probes and genes in microarray data.
• MBCB – R package that provides model-based background correction incorporating negative control beads in Illumina BeadArray data.
• Pipeclip – Pipeline for identifying cross-linking sites in PAR-CLIP, HITS-CLIP, and iCLIP data.
• SbacHTS – R package for detecting and correcting spatial background noise in RNAi screening experimental results.
• BAYSIC – Variant integration tool that uses Bayesian posterior probabilities to identify high-confidence variants from the predictions of a combination of variant calling tools.
• Term2Gene – Online tool for identifying lists of genes associated with specific diseases or biological pathways using PubMed's query term definitions.
• Lung Cancer Explorer – Online tool that lets you explore and analyze gene expression data from dozens of lung cancer datasets.
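To make the probe-to-gene mapping idea concrete, here is a minimal sketch of collapsing probe-level microarray values to gene-level values. It is not Probemapper's actual engine: the function name, the probe IDs, and the choice of averaging probes that share a gene are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def collapse_probes_to_genes(probe_values, probe_to_gene):
    """Average the values of all probes that map to the same gene.

    probe_values:  dict of probe ID -> expression value
    probe_to_gene: dict of probe ID -> gene symbol (hypothetical mapping)
    Probes without a mapping are skipped.
    """
    by_gene = defaultdict(list)
    for probe, value in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:
            by_gene[gene].append(value)
    return {gene: mean(values) for gene, values in by_gene.items()}


# Hypothetical probe IDs and values, for illustration only.
mapping = {"probe_A": "TP53", "probe_B": "TP53", "probe_C": "EGFR"}
values = {"probe_A": 2.0, "probe_B": 4.0, "probe_C": 7.5, "probe_D": 1.0}
gene_level = collapse_probes_to_genes(values, mapping)
# gene_level == {"TP53": 3.0, "EGFR": 7.5}; probe_D is unmapped and dropped
```

Averaging is only one possible summary; taking the maximum or the most variable probe per gene are common alternatives.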
BICF Educational Programs
• Monthly Lectures – part of BioHPC training.
• Nano Courses – 2-day courses with lectures and hands-on exercises.
Check the BioHPC Training Calendar.
BICF Upcoming Courses
• 03/23: Introduction to Statistical Testing: Going beyond the T-test to pick the right test for your data analysis.
• 04/27: Sequence Homology and Alignments: Understanding how BLAST/FASTA work in order to optimize your sequence similarity searches.
• 05/25: Next Generation Sequence Technologies: From Sanger to MinIONs, choosing the sequencing technology for your experiments.
• 06/22: Introduction to Sequence Variation: SNVs, INDELs and SVs, predicting human variation in healthy vs. disease populations.
• 07/27: Introduction to ClipSeq Analysis: ClipSeq as a method for building genome-wide protein-RNA interaction maps.
• 09/28: RNAseq Analysis: Gene expression profiling to determine molecular functional differences in cells and tissues.
• 11/16: Introduction to Microbiome ‘Omics Technologies: What is a microbiome and how can it be studied?
Organizer: Dr. Brandi Cantarel <[email protected]>
BICF Help Desk
• Open to the entire UTSW research community.
– Consultation on data access
– Consultation on routine data analysis
– Consultation on study design
– Consultation on grant applications
• Triage of proposals for collaboration.
• Triage of proposals for hourly consulting.
• Help desk hours:
– Daily 10:00am – 11:00am
– NB5.604
– [email protected]
BICF Hourly Consulting
• Help Desk consults needing more than brief interactions can be scheduled on a fee-for-service basis.
– More involved data access
– Performance of simple data analysis
– More involved standard support in study design
– More involved standard support with grant applications
• Review / triage toward the program development and collaboration levels
BICF Fellows Program
• 5 - 10 slots available
• Postdoctoral fellows and graduate students engaged in bioinformatics analysis move to BICF
– Technical supervision by trained staff
– Environment with quantitative thinking
– Peer-to-peer training among fellows
– Integration with the Department of Bioinformatics
• Scientific responsibility and financing remain with the fellow’s lab
• Use of CPRIT funds
– Bridge funds for unbillable FTEs of staff working with fellows
Program Development
• To provide bioinformatics expertise to newly forming project teams.
– Attendance at project meetings
– Dedicated contributions to study design
– Acquisition of preliminary data
– Participation in grant application
• Application through helpdesk consultation.
• Prioritization by Steering Committee.
BICF Collaborations
• Provide the cancer community with ‘bioinformatics personnel on demand’.
• Application through helpdesk consultation.
• Prioritization by Steering Committee.
– Staff stays located in BICF or moves temporarily to project lab.
– FTEs paid by project lab.
Selected publications
1. Cell. 2012 May 11;149(4):768-779
2. Cell. 2013 Sep 12;154(6):1269-84
3. Cell. 2013 Aug 29;154(5):1085-99
4. Nature. 2011 Dec 1;480(7375):113-7
5. Nature. 2012 Jan 26;481:511-515
6. Nature. 2014 Aug 17. doi: 10.1038/nature13671
7. Science. 2012 Nov;338(6109):956-959
8. Science. 2014 Sep 5;345(6201):1139-45
9. Lancet. 2012 Mar 3;379(9818):785-7
10. Nature Biotechnology. 2014 Dec;32(12):1213-22
11. Nature Biotechnology. 2015 Aug 10. [Epub ahead of print]
12. Science Signaling. 2013 Oct 15;6(297):ra90
Flagship Projects
• Clinical Sequencing Project (in collaboration with Dr. Jim Malter, Dr. Ward Wakeland)
• Lung Cancer Project (in collaboration with Dr. David Gerber and John Minna)
• Kidney Cancer Project (in collaboration with Dr. Jim Brugarolas)
• Clinical Database Development
Flagship Projects
• Clinical Sequencing Project (in collaboration with Dr. Jim Malter, Dr. Ward Wakeland)
– Set up the clinical test server with local backups
– Developed and tested the germline mutation calling pipeline
– Developed, and now testing, the somatic mutation calling pipeline
– Developed a database and web portal for storing the raw data and results
– Curating available databases/resources for clinically actionable mutations
Flagship Projects
• Lung Cancer Project (in collaboration with Dr. David Gerber and John Minna)
– Developed Lung Cancer Explorer for public data
– Team joined the IRB protocol for accessing UTSW lung cancer patient and tumor sample data
– Working with John Minna and Adi Gazdar to develop a lung cancer cell line database
Comprehensive database development: Lung Cancer Explorer
• Data curation and analysis
– Identification of appropriate datasets
– Data curation and quality assessment
– Data processing
– Statistical analysis
– Alignment and integration
• Analysis functions for database infrastructure
– Survival analysis
– Correlation analysis
– Group comparison
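As an illustration of the kind of survival analysis listed above, the following is a minimal sketch of the Kaplan-Meier estimator in plain Python. It is not the Lung Cancer Explorer's implementation; the function name and the toy follow-up data are assumptions for illustration.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time for each patient
    events: 1 if the event (e.g. death) was observed at that time,
            0 if the patient was censored
    Returns a list of (time, survival_probability) pairs, one per
    distinct time at which at least one event occurred.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set.
        n_at_risk -= leaving
    return curve


# Toy data (hypothetical): five patients, two of them censored.
km_curve = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])
# Survival steps down to about 0.8 at t=5, 0.6 at t=8 and 0.3 at t=12.
```

For real analyses an established package (for example the R `survival` package) would normally be preferred, since it also provides confidence intervals and group comparisons such as the log-rank test.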
Flagship Projects
• Kidney Cancer Project (in collaboration with Dr. Jim Brugarolas)
– Working with IR, Jim Brugarolas, Payal Kapur and others on clinical data curation
– Developing a prototype database and web portal
– Transferred, stored and analyzed a large amount of sequencing data
Pilot Kidney Cancer Explorer Data
Kidney Cancer Program    Total    RNA-Seq¹    Mutation²
Patients                 1402     95          101
Goals of the clinical database
• Secure Account System
• User-friendly Data Input and Search
• Track Account Login History
• Track Clinical Data Change History
• Collaborators Online Record Tool
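One way to picture the "track clinical data change history" goal is an append-only audit trail kept alongside each record. The sketch below is an assumption about how such tracking could look, not the BICF database design; the class name, field names, and user names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditedRecord:
    """A record whose every change is logged (hypothetical design)."""
    values: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def update(self, user, field_name, new_value):
        """Set a field and append who changed what, and when."""
        old_value = self.values.get(field_name)
        if old_value == new_value:
            return  # nothing changed, so nothing is logged
        self.values[field_name] = new_value
        self.history.append({
            "when": datetime.now(timezone.utc),
            "user": user,
            "field": field_name,
            "old": old_value,
            "new": new_value,
        })


# Hypothetical usage: two users edit the same field.
record = AuditedRecord()
record.update("alice", "stage", "II")
record.update("bob", "stage", "III")
record.update("bob", "stage", "III")  # no-op, not logged
# record.history now holds two entries; the second has old="II", new="III"
```

In a production system the history would live in a database table with restricted write access rather than in memory.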
Thank you