
BI 83201: The Literature of Computational Genomics Instructor: Prof. Jeffrey Chuang Instructor:...

Date post: 15-Jan-2016
BI 83201: The Literature of Computational Genomics
Instructor: Prof. Jeffrey Chuang
Meeting Time: Fridays 10:30-11:45, Higgins 465
Requirements: Read and discuss 1-2 papers per week. Grading will be based on participation. Attendance at all sessions is mandatory.
Course website: bioinformatics.bc.edu/chuanglab/courses.htm
All papers will be available online at least 1 week before discussion. Students will be assigned sections/figures, for which they will be expected to lead the discussion, including asking other students questions.
Office Hours by arrangement: Contact Jeff at [email protected], Phone: 2-0804, Higgins 444B (soon to be moving to Higgins 420).

Changing perspectives in yeast research nearly a decade after the genome sequence

Kara Dolinski and David Botstein

BI 83201 : Literature of Computational Genomics

January 27, 2006


I. Introduction

The yeast Saccharomyces cerevisiae was the first eukaryote to have its genome sequenced (1996).

The genome is about 12 million base pairs long (the human genome is about 3 billion), spread over 16 chromosomes.

It was chosen because it has an extensive history as a model organism, along with the worm C. elegans and the fly D. melanogaster.


Major Benefits of Sequencing the Yeast Genome

Ability to identify clones via sequencing, rather than genetic or physical mapping methods.

Creation of yeast strains, each with a deletion of one gene, for every gene in the genome.

Whole genome expression assays.

A "grand unification," showing that protein sequence similarity persists between yeast, mouse, human, fly, and worm, i.e. functional similarity often also means sequence similarity.


From the parts list to the system level: Goals of post-genome-sequence yeast research

Understand and annotate every functional feature in the genome.

Understand the interactions of every feature – “systems biology”


A central goal of yeast research remains the determination of the biological role of every sequence feature in the yeast genome. The most remarkable change has been the shift in perspective from focus on individual genes and functionalities to a more global view of how the cellular networks and systems interact and function together to produce the highly evolved organism we see today.


Genes and their biological roles

1995: The number of characterized genes was 1000-2000.

2006: 5773 genes in the genome. 4299 are characterized.

Annotation of individual functions remains challenging.
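
As a quick check on the 2006 figures quoted above, the arithmetic below works out how much of the gene list was still uncharacterized. The two counts come from the slide; everything else is plain arithmetic.

```python
# Gene counts quoted on the slide for 2006.
total_genes = 5773
characterized = 4299

uncharacterized = total_genes - characterized
fraction_characterized = characterized / total_genes

print(f"uncharacterized genes: {uncharacterized}")
print(f"fraction characterized: {fraction_characterized:.1%}")
```

So roughly a quarter of the annotated genes still lacked a characterized role a decade after the sequence was finished.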


Resource | Data type | Reference | URL
SGD | Several | Ball et al. 2001 | http://www.yeastgenome.org
CYGD/MIPS | Several | Guldener et al. 2005 | http://mips.gsf.de/genre/proj/yeast/
bioGRID | Genetic/physical interaction | Breitkreutz et al. 2003 | http://biodata.mshri.on.ca/yeast_grid/
BIND | Genetic/physical interaction | Bader et al. 2003 | http://www.blueprint.org/bind/bind.php
DIP | Physical interaction | Xenarios et al. 2002 | http://dip.doe-mbi.ucla.edu/dip/Main.cgi
MINT | Physical interaction | Zanzoni et al. 2002 | http://160.80.34.4/mint/
IntAct | Physical interaction | Hermjakob et al. 2004b | http://www.ebi.ac.uk/intact/index.html
Deletion Consortium | Phenotype analysis | Giaever et al. 2002; Winzeler et al. 1999 | http://www-sequence.stanford.edu/group/yeast_deletion_project/data_sets.html
GEO | Microarray | Edgar et al. 2002 | http://www.ncbi.nlm.nih.gov/geo/
ArrayExpress | Microarray | Brazma et al. 2003 | http://www.ebi.ac.uk/arrayexpress/
YMGV | Microarray | Marc et al. 2001 | http://www.transcriptome.ens.fr/ymgv/
SMD | Microarray | Gollub et al. 2003 | http://smd.stanford.edu/

List of the major sources of yeast functional genomics data; in addition to the main SGD site, yeast genome data are also distributed via SGD Lite (http://sgdlite.princeton.edu), a lightweight yeast genome database, which is built from GMOD components and can be downloaded and installed locally.


Gene expression technology and the emergence of system-level biology

Two major expression technologies were developed:

SAGE (Serial Analysis of Gene Expression)

mRNA Microarrays


SAGE

Serial Analysis of Gene Expression


Figure 1. Yeast genome microarray. The actual size of the microarray is 18 mm by 18 mm.

DeRisi et al., Science, 24 October 1997, Vol. 278, no. 5338, pp. 680-686

Example of an mRNA Expression Microarray


Defining functional or regulatory subsystems, or "modules"

Study all the genes that respond to certain stresses, e.g. temperature change, starvation, radiation.

Study genes that are active in “natural” behaviors: cell cycle, sporulation, pheromone response.

Identify genes that are often co-expressed and/or co-regulated, such as ribosomal genes.


(C) Seven members of a class of genes marked by early induction with a peak in mRNA levels at 18.5 hours. Each of these genes contains STRE motif repeats in its upstream promoter region.

Science, 24 October 1997, Vol. 278, no. 5338, pp. 680-686

Distinct temporal patterns of induction or repression help to group genes that share regulatory properties.


It is quite rare for genes to have unchanging expression levels across different experiments. For example, expression of the yeast actin gene (ACT1), which was traditionally used as a control in Northern blots to ensure that equivalent levels of RNA were loaded in each well, changes significantly in several diverse types of microarray experiments.

Expression Levels Are Highly Condition Dependent


Analysis and Display of Genome-scale Data

How can such a vast amount of expression data be analyzed, managed, and presented?

Clustering algorithms group genes with similar expression profiles over different experiments.
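
To make the clustering idea concrete, here is a minimal sketch of average-linkage agglomerative clustering that uses 1 minus the Pearson correlation as the distance between profiles. It is one common choice, not necessarily the exact procedure of any paper discussed here, and the gene names and expression values are made up for illustration.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def cluster(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[gene] for gene in profiles]

    def dist(c1, c2):
        # Average-linkage distance: mean of (1 - r) over all cross-cluster pairs.
        pairs = [(a, b) for a in c1 for b in c2]
        total = sum(1 - pearson(profiles[a], profiles[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return [sorted(c) for c in clusters]

# Hypothetical log-ratio profiles over five time points (made-up numbers).
profiles = {
    "geneA": [0.1, 0.9, 2.1, 3.0, 3.9],
    "geneB": [0.0, 1.1, 1.9, 3.2, 4.1],
    "geneC": [3.8, 3.1, 2.0, 1.0, 0.2],
    "geneD": [4.0, 2.9, 2.2, 0.9, 0.0],
}
result = cluster(profiles, 2)
print(result)
```

The two induced genes end up in one group and the two repressed genes in the other. A correlation-based distance groups genes by the shape of their response rather than its magnitude, which suits the goal of finding co-regulated genes.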




Eisen et al. (1998) PNAS 95:14863

Clustering of Gene Expression Profiles


Gene Ontology

A functional annotation system to allow one to search for biases in clusters of genes.

Broad terms are the parents to more specific terms.

Consistent annotation system across species.
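
The parent/child structure can be pictured as a directed graph in which a gene annotated to a specific term is implicitly annotated to every broader ancestor. The sketch below assumes a toy ontology; the term names are illustrative stand-ins, not actual Gene Ontology identifiers or relationships.

```python
# Toy ontology: each term maps to its direct parents.
# Term names are illustrative only, not real GO terms or GO structure.
parents = {
    "biological process": [],
    "metabolism": ["biological process"],
    "response to stress": ["biological process"],
    "amino acid biosynthesis": ["metabolism"],
    # A term may have more than one parent, so the structure is a graph, not a tree.
    "starvation-induced biosynthesis": ["amino acid biosynthesis", "response to stress"],
}

def ancestors(term, parents):
    """Return every broader term reachable by following parent links."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen

print(sorted(ancestors("starvation-induced biosynthesis", parents)))
```

Propagating annotations upward in this way is what lets a cluster be tested for bias at both broad and specific levels of description.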


A Clustered Group of Genes and Its Functional Annotation

The Gene Ontology allows one to assess the statistical significance of a cluster's bias toward particular functional categories.
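
One standard way to score such a bias is a hypergeometric test: given N genes in the genome, K of them annotated to a category, what is the chance that a cluster of n genes contains k or more annotated genes? The sketch below is a minimal version of that calculation. Only the genome size is taken from the slides; the other example numbers are hypothetical.

```python
from math import comb

def hypergeom_pvalue(N, K, n, k):
    """P(X >= k) when n genes are drawn from N, of which K carry the annotation."""
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)

# Hypothetical example: 5773 genes in the genome (figure from the slides),
# 120 annotated to some category, and a cluster of 25 genes containing 6 of them.
p = hypergeom_pvalue(5773, 120, 25, 6)
print(f"p-value: {p:.3g}")
```

When many categories are tested against the same cluster, the resulting p-values need a multiple-testing correction before they are interpreted.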


Insights into the global transcriptional network

Co-regulated genes should share a common transcription factor binding site.

Computational methods search for motifs shared among co-regulated genes (REDUCE, AlignACE, MODEM).
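
The underlying idea can be sketched with plain k-mer counting: look for short words present upstream of every gene in a co-regulated set and absent from a background set. This is a deliberately naive stand-in and is not how REDUCE, AlignACE, or MODEM work internally. The sequences are made up for illustration.

```python
from collections import Counter

def kmer_gene_counts(seqs, k):
    """For each k-mer, count how many sequences contain it at least once."""
    counts = Counter()
    for seq in seqs:
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts

# Hypothetical upstream regions (made-up sequences, not real promoters).
coregulated = ["ATTGACTCATCG", "GGTGACTCAAAT", "CCATGACTCATT"]
background = ["ATCGGCTAAGCT", "TTAGGCATCCGA", "GACCTTAGGATC", "CGATTACGGTAA"]

k = 7
fg = kmer_gene_counts(coregulated, k)
bg = kmer_gene_counts(background, k)

# Keep k-mers found in every co-regulated sequence and in no background sequence.
shared = [m for m, c in fg.items() if c == len(coregulated) and bg[m] == 0]
print(shared)
```

Real motif finders additionally allow degenerate positions, score enrichment statistically, and consider both strands.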


YCL030C HIS4

SCer GCAGTCGAACTGACTCTAATAGTGACTCCGGTAAATTAGTTAATTAATTGCTAAACCCATGCACAGTGACTCACGTTTTTTTATCAGTCATTCGA
SPar GCAGTCGAACTGACTCTAATAGTGACTCCGGTAAATTAGTTAATTAATTGCTAAACCCATGCACAGTGACTCATGTTTTTT-ATCAGTCATTCGA
SMik GCGGTCAAACTGACTCTAATAGTGACTCCGGTAAATTAGTTAATTAATTGCTAAACCCATGCACAGTGACTCATGCTTTCT-ATCAGTCATTCGA
SBay -TGAACGAACTGACTCTAATAGTGACTCTGGTAAATTAGTTAATTAATTTCTAAACCCATGCACAGTGACTCATGTTTTGTTATCAGTCATTCGT

SCer TATAGAAGGTAAGAAAAGGATATGACT----ATGAACAGTAGTATACTGTGTATATAATAGATATGGAACGTTATATTCACCTCCGATGTGTGTT
SPar TAGAGAAGGTAAGAAAAGGATATGACT----ATGAACAGTAATATACTATGTATATAATAGATAAGGAACGTTATATTCACCTTGGATGTGTGTT
SMik TACAGA-GGTAAGAAAAGCGAACTACT----AAGAACAGTGGTACATGGTGTATATAATAGATAAGGAACAT-GTATTCACTTTTAATGTGAGTT
SBay TAAAGA-AGAAAGAGAGGAAGATGACTCAAAATAAATACTAGTGTATTGTGTATATAACAGAGATGGAACACTGGATTC-CACCTAATGTGTGTT

SCer GTACATACATAAAAATATCATAGCACAACTGCGCTGTGTAA---TAGTAATACAATAGTTTACAAAATTTTTTTTCTGAATA---
SPar GTACATACATAAGAATATCATACTACAAGTGCGCTGTGTAA---TAGTAACATAATAGTTAACAA-----TTTTTTTGAATA---
SMik GTCTATA-AGAAGAATAGTATACCACAAGCGTGCTGTGTAACGATAATAATATAACAATTTACAAGATT-TTTTTTTGAATA---
SBay GTCCATACATAGAATTAGTATACCACAATTGCGCTGTGTAA---TAATAACATAATAGATTACAAAA---TTTTGGAAAAAAAAA

GCN4 BAS1 PHO2 RAP1 GCN4

TATA

Comparative Genomic Approaches to Finding Transcription Factor Binding Sites

Alignments of 4-13 yeast species are used to find unusually conserved motifs.
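
In a multiple alignment like the HIS4 upstream region above, columns that are identical in every species are conventionally marked with an asterisk, and long runs of asterisks are candidate binding sites. The sketch below computes that conservation line and extracts the long runs. The four short aligned rows are a made-up toy example, and the minimum run length is an arbitrary assumption.

```python
def conservation_line(rows):
    """Return a string with '*' where every aligned row has the same base."""
    assert len({len(r) for r in rows}) == 1, "rows must be aligned to equal length"
    return "".join(
        "*" if len(set(col)) == 1 and col[0] != "-" else " "
        for col in zip(*rows)
    )

def conserved_runs(marks, min_len):
    """Return (start, end) index pairs of '*' runs at least min_len long."""
    runs, start = [], None
    for i, ch in enumerate(marks + " "):  # trailing space closes a final run
        if ch == "*" and start is None:
            start = i
        elif ch != "*" and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs

# Toy alignment of four hypothetical species ('-' marks a gap).
rows = [
    "ACGTGACTCATT-A",
    "ACTTGACTCATTGA",
    "A-GTGACTCACTGA",
    "ACGTGACTCATTGC",
]
marks = conservation_line(rows)
runs = conserved_runs(marks, 5)
print(repr(marks))
print([rows[0][s:e] for s, e in runs])
```

Whether a run counts as "unusually" conserved depends on how diverged the species are, so real analyses compare against the background conservation rate rather than using a fixed length cutoff.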


DNA-binding proteins are crosslinked to DNA with formaldehyde in vivo.

Isolate the chromatin. Shear DNA along with bound proteins into small fragments.

Bind antibodies specific to the DNA-binding protein to isolate the complex by precipitation. Reverse the cross-linking to release the DNA and digest the proteins.

Use PCR to amplify specific DNA sequences to see if they were precipitated with the antibody.

Chromatin Immunoprecipitation to Determine Binding Sites


Integration of Data Sources

Harbison and colleagues (2004) used a combination of experimental (ChIP-chip), comparative genomics, and motif discovery methods to identify putative DNA binding sites for >200 transcription factors in yeast.

A Bayesian network takes as input different properties of sequence elements upstream of a gene and outputs the likelihood of that gene exhibiting a particular expression pattern.
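
As a rough illustration of the input/output relationship just described, the sketch below uses a naive Bayes classifier, which is the simplest possible Bayesian network and is swapped in here purely for brevity. It is not the model used in the work discussed. The only sequence property considered is motif presence, and the training examples are invented.

```python
def train_naive_bayes(examples, motifs):
    """Estimate class priors and per-motif presence probabilities."""
    counts = {True: 0, False: 0}
    present = {True: {m: 0 for m in motifs}, False: {m: 0 for m in motifs}}
    for feats, label in examples:
        counts[label] += 1
        for m in feats:
            present[label][m] += 1
    prior = {c: counts[c] / len(examples) for c in counts}
    # Laplace smoothing keeps unseen combinations from getting probability zero.
    cond = {c: {m: (present[c][m] + 1) / (counts[c] + 2) for m in motifs}
            for c in counts}
    return prior, cond

def posterior(feats, prior, cond):
    """P(gene shows the expression pattern | motifs found upstream)."""
    score = {}
    for c in prior:
        p = prior[c]
        for m, pm in cond[c].items():
            p *= pm if m in feats else (1 - pm)
        score[c] = p
    return score[True] / (score[True] + score[False])

# Invented training data: (motifs found upstream, gene shows the pattern?).
motifs = ["STRE", "RAP1"]
examples = [
    ({"STRE"}, True), ({"STRE"}, True), ({"STRE", "RAP1"}, True),
    ({"RAP1"}, False), (set(), False), ({"RAP1"}, False),
]
prior, cond = train_naive_bayes(examples, motifs)
p_stre = posterior({"STRE"}, prior, cond)
p_rap1 = posterior({"RAP1"}, prior, cond)
print(round(p_stre, 3), round(p_rap1, 3))
```

A full Bayesian network can also encode dependencies between sequence properties, such as motif position, orientation, and combinations of motifs, which naive Bayes treats as independent.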


Interaction Networks

Synthetic lethal interactions

Protein-DNA interactions

Protein-protein interactions


Synthetic Lethal Interactions

Genetic interaction network representing the synthetic lethal/sick interactions determined by SGA (synthetic genetic array) analysis. Genes are represented as nodes, and interactions are represented as edges that connect the nodes. The network covers up to 1000 genes and 4000 interactions.
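
The node-and-edge representation maps directly onto an adjacency structure. The sketch below builds an undirected network from a list of interacting pairs and ranks genes by their number of interaction partners. The gene names and pairs are placeholders, not data from the SGA screen.

```python
from collections import defaultdict

# Hypothetical synthetic lethal pairs (gene names are placeholders).
pairs = [
    ("geneA", "geneB"), ("geneA", "geneC"), ("geneA", "geneD"),
    ("geneB", "geneC"), ("geneE", "geneF"),
]

# Undirected network: each gene maps to the set of its interaction partners.
network = defaultdict(set)
for g1, g2 in pairs:
    network[g1].add(g2)
    network[g2].add(g1)

# Degree = number of partners; high-degree genes are the hubs of the network.
degree = {gene: len(partners) for gene, partners in network.items()}
hubs = sorted(degree, key=lambda g: (-degree[g], g))
print([(g, degree[g]) for g in hubs])
```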


Protein-DNA interactions

Transcription factor

Binding site


Motifs in the E. coli Transcriptional Regulatory Network

Nature Genetics 31, 64-68 (2002)
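
One of the network motifs described for transcriptional regulatory networks is the feed-forward loop, in which a regulator X controls a second regulator Y and both control a target Z. The sketch below counts such loops in a directed graph. The edge list is a toy example with placeholder names, not the E. coli network.

```python
def feed_forward_loops(edges):
    """Return sorted triples (x, y, z) with edges x->y, y->z and x->z."""
    targets = {}
    for src, dst in edges:
        targets.setdefault(src, set()).add(dst)
    loops = []
    for x, x_targets in targets.items():
        for y in x_targets:
            for z in targets.get(y, ()):
                # z must also be a direct target of x, and all three must differ.
                if z in x_targets and len({x, y, z}) == 3:
                    loops.append((x, y, z))
    return sorted(loops)

# Toy regulatory network: (regulator, target) pairs with placeholder names.
edges = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("X", "W"), ("W", "V")]
loops = feed_forward_loops(edges)
print(loops)
```

Calling a pattern a motif requires showing that it occurs more often than in randomized networks with the same degree distribution, which this sketch does not attempt.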


Protein-Protein interactions

Problem: Experiments are not robust

Verification by checking for co-expression of orthologs in other species.

Check for “joint” sequence conservation of orthologs.

Other data integration methods.


Outline of the comprehensive two-hybrid analysis. We cloned almost all yeast ORFs individually as a DNA-binding domain fusion (bait) in a MATa strain and as an activation domain fusion (prey) in a MATα strain, and subsequently divided them into pools, each containing 96 clones. These bait and prey clone pools were systematically mated with each other, and the diploid cells formed were selected for the simultaneous activation of three reporter genes (ADE2, HIS3, and URA3) followed by sequence tagging to obtain ISTs.

PNAS | April 10, 2001 | vol. 98 | no. 8 | 4569-4574

Protein-Protein Interactions


Conclusions and some thoughts about the Future

Most new understanding has come from comparative genomics.

Genome-scale data have provided new goals.

Other important areas – allelic effects, gene localization, metabolism dynamics, how selection operates on networks.

Philosophy – how should large-scale data be used to generate and test hypotheses?

