Summer Bioinformatics Workshop 2008
Comparative Genomics and Phylogenetics
Chi-Cheng Lin, Ph.D., Professor
Department of Computer Science
Winona State University – Rochester
Outline
• Comparative Genomics
• Phylogenetics
• Phylogenetic Tree
• Phylogenetics Applications
• Gene Tree vs. Species Tree
Comparative Genomics
• Analysis and comparison of genomes from different species
• Purposes
  – to gain a better understanding of how species have evolved
  – to determine the function of genes and non-coding regions of the genome
• The functions of human genes and other DNA regions often are revealed by studying their parallels in nonhumans.
  – Researchers have learned a great deal about the function of human genes by examining their counterparts in simpler model organisms such as the mouse.
Comparative Genomics
• Features looked at when comparing genomes:
  – sequence similarity
  – gene location
  – length and number of coding regions within genes
  – amount of non-coding DNA in each genome
  – highly conserved regions maintained in organisms
• Computer programs that can line up multiple genomes and look for regions of similarity among them are used.
• Many of these sequence-similarity tools, such as BLAST, are accessible to the public over the Internet.
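As a rough illustration of what "looking for regions of similarity" can mean, the sketch below finds short words (k-mers) that occur in both of two sequences and reports where they occur. This is only a toy version of the word-matching idea; it is not how BLAST or any genome aligner is actually implemented, and the function name `shared_kmers`, the word length `k=4`, and the example sequences are all made up for illustration.

```python
def shared_kmers(seq_a, seq_b, k=4):
    """Return {kmer: (positions in seq_a, positions in seq_b)} for every
    length-k word that occurs in both sequences."""
    def index(seq):
        # Map each k-mer to the list of start positions where it occurs.
        table = {}
        for i in range(len(seq) - k + 1):
            table.setdefault(seq[i:i + k], []).append(i)
        return table

    index_a, index_b = index(seq_a), index(seq_b)
    return {kmer: (index_a[kmer], index_b[kmer])
            for kmer in index_a if kmer in index_b}


# Hypothetical toy sequences, not real genomic data.
print(shared_kmers("ACGTACGA", "TTACGATT"))
# {'TACG': ([3], [1]), 'ACGA': ([4], [2])}
```

The two overlapping hits (TACG and ACGA) together mark the shared region TACGA; real tools extend and score such seed matches rather than stopping here.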
Of Mice and Men
• The full complement of human chromosomes can be cut into about 150 pieces, then reassembled into a reasonable approximation of the mouse genome.
• The colors of the mouse chromosomes and the numbers alongside indicate the human chromosomes containing homologous segments.
• This piecewise similarity between the mouse and human genomes means that insights into mouse genetics are likely to illuminate human genetics as well.
Source: http://www.ornl.gov/sci/techresources/Human_Genome/publicat/tko/06_img.html
Phylogenetics
• Phylogenetics
  – Study of evolutionary relationships (sequences / species)
  – Infer evolutionary relationship from shared features
• Phylogeny
  – Relationship between organisms with common ancestor
• Phylogenetic tree
  – Graph representing evolutionary history of sequences / species
Source of image: http://superfrenchie.com/Pics/Blog/culture/evolution.jpg
Phylogenetics
• Premise
  – Members sharing common evolutionary history (i.e., common ancestor) are more related to each other
  – Can infer evolutionary relationship from shared features
• Long history of phylogenetics
  – Historically: based on analysis of observable features (e.g., morphology, behavior, geographical distribution)
  – Now: mostly analysis of DNA / RNA / amino acid sequences
Phylogenetics
• Goals
  – Understand relationship of sequence to similar sequences
  – Construct phylogenetic tree representing evolutionary history
• Motivation / application
  – Identify closely related families
    • Use phylogenetic relationships to predict gene function
  – Follow changes in rapidly evolving species (e.g., viruses)
    • Analysis can reveal which genes are under selection
    • Provide epidemiology for tracking infections & vectors
• Relationship to multiple sequence alignment (MSA)
  – Alignment of sequences should take evolution into account
  – More precise phylogenetic relationships lead to improved MSA
  – CLUSTALW (http://www.ebi.ac.uk/clustalw/), a popular MSA program, can produce an alignment that is then used to build a phylogenetic tree.
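One common first step in going from an alignment to a tree is to turn the aligned sequences into pairwise distances. The sketch below computes a simple p-distance (the proportion of compared sites that differ), skipping any column where either sequence has a gap. It is a minimal illustration, not what CLUSTALW does internally; the function names, the gap handling, and the three toy sequences are assumptions made for this example.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences,
    ignoring columns where either sequence has a gap ('-')."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_matrix(alignment):
    """Return {(name1, name2): p-distance} for every pair in the alignment."""
    return {(m, n): p_distance(alignment[m], alignment[n])
            for m, n in combinations(alignment, 2)}


# Hypothetical aligned sequences, as an MSA program might output them.
alignment = {
    "seq1": "ACGT-ACGT",
    "seq2": "ACGTTACGA",
    "seq3": "ACCTTACGA",
}
for pair, dist in distance_matrix(alignment).items():
    print(pair, round(dist, 3))
```

Distance-based tree-building methods take a matrix like this as input; the quality of the resulting tree depends on the quality of the alignment behind it.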
Phylogenetic Tree Terminology
• Leaf / terminal node / taxon
  – Node with no children
  – Original sequence
• Join / internal node
  – Point of joining two leaves / clusters
  – Inferred common ancestor
• Branches
  – Represent change
  – Length represents evolutionary distance
• Cluster / clade
  – All sequences in subtree with common ancestor (treated as single node)
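The terms above map naturally onto a small tree data structure. The following is a minimal sketch, assuming a hypothetical `Node` class in which a leaf is a node with no children, an internal node holds its children, and each node stores the length of the branch leading to it. The branch lengths in the example are invented purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str = ""                # taxon name for leaves, empty for internal nodes
    branch_length: float = 0.0    # length of the branch leading to this node
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        # A leaf / terminal node has no children.
        return not self.children

    def clade(self) -> List[str]:
        # All leaf names in the subtree rooted at this node.
        if self.is_leaf():
            return [self.name]
        names = []
        for child in self.children:
            names.extend(child.clade())
        return names


# Hypothetical branch lengths; only the grouping matters here.
human = Node("Human", 1.0)
chimp = Node("Chimpanzee", 1.0)
human_chimp = Node("", 0.5, [human, chimp])   # internal node: inferred common ancestor
gorilla = Node("Gorilla", 1.5)
root = Node("", 0.0, [human_chimp, gorilla])

print(human_chimp.clade())  # ['Human', 'Chimpanzee']
print(root.clade())         # ['Human', 'Chimpanzee', 'Gorilla']
```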
Phylogenetic Tree Terminology
• Binary tree
  – Every node that splits has exactly two children
• Rooted tree
  – Contains a single ancestor of all nodes
  – Evolution proceeds from root to leaves of tree
• Unrooted tree
  – No single ancestor node
  – No direction of evolution
• Molecular clock assumption (rooted tree)
  – Mutations occur at a constant rate
  – Distance from root to leaf is the same for every leaf
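The molecular clock property in the last bullet can be checked directly on a rooted tree: sum the branch lengths along each root-to-leaf path and compare. The sketch below assumes a hypothetical tuple representation `(name, branch_length, children)` and made-up branch lengths; it only illustrates the definition and is not a statistical test of clock-like evolution.

```python
import math


def root_to_leaf_distances(node, distance_so_far=0.0):
    """Return {leaf name: total branch length from the root to that leaf}.
    A node is a tuple (name, branch_length, children)."""
    name, length, children = node
    total = distance_so_far + length
    if not children:
        return {name: total}
    distances = {}
    for child in children:
        distances.update(root_to_leaf_distances(child, total))
    return distances


def is_clocklike(root, tolerance=1e-9):
    """True if every leaf is the same distance from the root."""
    distances = list(root_to_leaf_distances(root).values())
    return all(math.isclose(d, distances[0], abs_tol=tolerance)
               for d in distances)


# Hypothetical trees: same shape, different Gorilla branch length.
clock_tree = ("", 0.0, [
    ("", 1.0, [("Human", 1.0, []), ("Chimpanzee", 1.0, [])]),
    ("Gorilla", 2.0, []),
])
non_clock_tree = ("", 0.0, [
    ("", 1.0, [("Human", 1.0, []), ("Chimpanzee", 1.0, [])]),
    ("Gorilla", 3.0, []),
])

print(is_clocklike(clock_tree))      # True
print(is_clocklike(non_clock_tree))  # False
```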
Rooted and Unrooted Trees
[Figure: a rooted tree and an unrooted tree, each relating Human, Chimpanzee, Gorilla, and Orangutan. The rooted tree marks the root and the direction of evolution.]
Source: http://gi.cebitec.uni-bielefeld.de/people/boecker/bilder/tree_of_life_new.gif
Applications – Mammal Systematics
Source: http://www.isem.univ-montp2.fr/PPP/PM/RES/Phylo/Mamm/PHYLMOL-Placentalia%7EEnglish.jpg
Application – Epidemiology (CSI!)
• Which patients are more likely to have been infected by the dentist?
Source: http://trc.ucdavis.edu/djbegun/Lect_12.1.html
Application – Modern Human Evolution
• Based on mtDNA genome
• Example
  – Global mtDNA diversity analysis (Ingman et al., 2000, Nature. Volume 408: 708-713)
  – Africans have twice as much diversity among them as do non-Africans, suggesting that Africans have a longer genetic history
  – More recent population expansion for non-Africans
  – Africans and non-Africans diverged recently
Out of Africa
Source of image: Ingman et al., 2000, Nature. Volume 408: 708-713
Gene Tree vs. Species Tree
• Gene typically diverges before speciation
• Phylogenetic tree based on divergence of one single homologous gene
  – Evolutionary history of gene
  – Gene tree rather than species tree
• More genes are needed to build species trees
Source of image: http://www.bioinf2.leeds.ac.uk/b/genomics.html