2/25/09 

CSCI1950-Z Computational Methods for Biology

Lecture 9

Ben Raphael
February 23, 2009

http://cs.brown.edu/courses/csci1950-z/

Outline

Searching Through Trees
1. Branch-swapping: NNI, SPR, TBR.
2. MCMC

Consensus Trees and Supertrees

Heuristic Search

1. Start with an arbitrary tree T.
2. Check the "neighbors" of T.
3. Move to a neighbor if it provides the best improvement in parsimony/likelihood score.

Caveat: The search can get stuck in a local optimum and never reach the global optimum.
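The three steps above can be sketched as a generic greedy search. This is only an illustration under stated assumptions: `hill_climb`, `toy_neighbors`, and `toy_score` are hypothetical names, lower scores are taken to be better (as with parsimony), and integers stand in for trees so the local-optimum caveat is easy to see.

```python
from typing import Callable, Iterable, TypeVar

S = TypeVar("S")


def hill_climb(start: S,
               neighbors: Callable[[S], Iterable[S]],
               score: Callable[[S], float]) -> S:
    """Greedy local search: repeatedly move to the best-scoring neighbor.

    Lower scores are better. Stops at a local optimum, which need not be
    the global optimum.
    """
    current = start
    current_score = score(current)
    while True:
        best = min(neighbors(current), key=score, default=None)
        if best is None or score(best) >= current_score:
            return current  # no neighbor improves the score
        current, current_score = best, score(best)


# Toy stand-in for tree space (an assumption): states are integers and the
# neighbors of x are x - 1 and x + 1.
def toy_neighbors(x: int):
    return [x - 1, x + 1]


def toy_score(x: int) -> float:
    # Two valleys: a local optimum at x = 2 and the global optimum at x = 10.
    return min((x - 2) ** 2 + 5, (x - 10) ** 2)


print(hill_climb(0, toy_neighbors, toy_score))  # ends in the local optimum
print(hill_climb(8, toy_neighbors, toy_score))  # ends in the global optimum
```

In a real tree search, `neighbors` would enumerate rearrangements of the current tree and `score` would be a parsimony or likelihood computation.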

Trees and Splits 

Given a set X, a split is a partition of X into two non-empty subsets A and B, written X = A | B.

For a phylogenetic tree T with leaves L, each edge e defines a split L_e = A | B, where A and B are the leaf sets of the two subtrees obtained by removing e.

[Figure: an edge e separating leaf sets A and B]

Computing the Splits Metric

A phylogenetic tree T defines a collection of splits Σ(T) = { L_e | e is an edge in T }.

Theorem: ρ(T1, T2) = |Σ(T1) \ Σ(T2)| + |Σ(T2) \ Σ(T1)|
                   = |Σ(T1)| + |Σ(T2)| - 2 |Σ(T1) ∩ Σ(T2)|

Proof: (whiteboard)

Notation: A \ B = {x : x ∈ A, x ∉ B}
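As a small illustration of the theorem, the sketch below collects Σ(T) from a tree and evaluates ρ both ways. The nested-tuple tree encoding and the function names `splits` and `splits_metric` are assumptions made for this example, not something from the lecture.

```python
def splits(tree):
    """Return the set of splits A | B induced by the edges of `tree`.

    `tree` is a nested tuple of leaf labels, for example
    (("a", "b"), ("c", ("d", "e"))). Each split is stored as a frozenset
    of its two sides, so A | B and B | A are the same object.
    """
    def leaves_below(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        return frozenset().union(*(leaves_below(child) for child in node))

    all_leaves = leaves_below(tree)
    found = set()

    def visit(node):
        clade = leaves_below(node)
        rest = all_leaves - clade
        if clade and rest:  # skip the root, whose "split" has an empty side
            found.add(frozenset([clade, rest]))
        if isinstance(node, tuple):
            for child in node:
                visit(child)

    visit(tree)
    return found


def splits_metric(t1, t2):
    """rho(T1, T2): number of splits found in exactly one of the two trees."""
    s1, s2 = splits(t1), splits(t2)
    return len(s1 - s2) + len(s2 - s1)


t1 = (("a", "b"), ("c", ("d", "e")))
t2 = (("a", "c"), ("b", ("d", "e")))
s1, s2 = splits(t1), splits(t2)
print(splits_metric(t1, t2))
print(len(s1) + len(s2) - 2 * len(s1 & s2))  # second form of the theorem
```

Both printed values agree, which is the identity stated in the theorem. The trivial splits (one leaf against the rest) are shared by any two trees on the same leaves, so they cancel out.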

Nearest Neighbor Interchange (NNI)

Rearrange the four subtrees defined by one internal edge.

Claim: The number of NNI neighbors of a binary tree with n leaves is 2(n-3).

Proof: (whiteboard)
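A minimal sketch of a single NNI, assuming the four subtrees around an internal edge are passed in explicitly (the function name `nni_around_edge` is hypothetical). An unrooted binary tree with n leaves has n-3 internal edges and each edge admits two alternative arrangements, which is where the count 2(n-3) comes from.

```python
def nni_around_edge(a, b, c, d):
    """Given an internal edge with subtrees a, b on one side and c, d on the
    other, i.e. the arrangement ((a, b), (c, d)), return the two
    arrangements reachable by one NNI across that edge."""
    return [((a, c), (b, d)), ((a, d), (b, c))]


for arrangement in nni_around_edge("A", "B", "C", "D"):
    print(arrangement)
```

The subtrees can be leaf labels or nested tuples; the function only changes how they are paired across the edge.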

Subtree Pruning and Regrafting (SPR)

1. Remove a branch.
2. Reconnect the incident vertex by subdividing a branch.

Claim: The number of SPR neighbors of a binary tree with n leaves is 2(n-3)(2n-7).

Proof: (whiteboard)
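To get a feel for how the two neighborhood sizes grow, the claimed formulas can be evaluated directly. The function names are hypothetical, and the formulas are assumed to apply to unrooted binary trees with n ≥ 4 leaves.

```python
def nni_neighborhood_size(n: int) -> int:
    """Number of NNI neighbors claimed for a binary tree with n leaves."""
    return 2 * (n - 3)


def spr_neighborhood_size(n: int) -> int:
    """Number of SPR neighbors claimed for a binary tree with n leaves."""
    return 2 * (n - 3) * (2 * n - 7)


for n in (5, 10, 20):
    print(n, nni_neighborhood_size(n), spr_neighborhood_size(n))
```

The NNI neighborhood grows linearly in n and the SPR neighborhood quadratically, so each SPR search step examines many more candidate trees.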

Tree Bisection and Reconnection (TBR)

1. Remove a branch.
2. Reconnect the subtrees by adding a new branch that subdivides branches in both.

Relationship between Operations

• Every NNI is an SPR and every SPR is a TBR.
• Every TBR is a single SPR or a composition of two SPRs.
• All three types of operations are invertible: if an operation α takes T to T', then its inverse α^-1 takes T' to T.

Theorem: For all T and T' in B(n), there is a sequence of NNI (or SPR or TBR) operations that transforms T into T'.

Heuristic Search

1. Start with an arbitrary tree T.
2. Check the "neighbors" of T.
3. Move to a neighbor if it provides the best improvement in parsimony/likelihood score.

PAUP* (a widely used phylogenetic package) includes the command:

hsearch nreps=num swap=type

where type is NNI, SPR, or TBR.

From Likelihood to Bayesian 

Given data X = (x_1, …, x_n), we found the tree T and branch lengths t* that maximized the likelihood Pr[X | T, t*].

What about other trees? 

Could we compute Pr[T, t* | X]? 

Back to Coin Flipping 

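Flip a coin with p = Pr[heads] unknown.

Earlier we computed the maximum likelihood estimate of p, using the likelihood L(p) = Pr[D | p].

Pr[p | D] = Pr[p, D] / Pr[D]
          = Pr[D | p] Pr[p] / Pr[D]

Here Pr[p] is the prior and Pr[p | D] is the posterior.

[Figure: plots for two data sets: 11 tosses with 5 heads, and 44 tosses with 20 heads]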
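The coin-flip posterior can be approximated numerically on a grid of p values. This is a sketch under an assumption the slide does not state: the prior Pr[p] is taken to be uniform on [0, 1]. The function names are hypothetical, and the two data sets are the ones from the figure.

```python
def posterior_on_grid(heads: int, tosses: int, grid_size: int = 2000):
    """Approximate Pr[p | D] on a grid of p values, assuming a uniform prior.

    Returns (grid, posterior), where the posterior weights sum to 1.
    """
    grid = [(i + 0.5) / grid_size for i in range(grid_size)]
    prior = [1.0 / grid_size] * grid_size  # Pr[p]: uniform (an assumption)
    # Pr[D | p] up to the binomial coefficient, which cancels on normalizing.
    likelihood = [p ** heads * (1 - p) ** (tosses - heads) for p in grid]
    joint = [lik * pr for lik, pr in zip(likelihood, prior)]  # Pr[D | p] Pr[p]
    evidence = sum(joint)                                     # Pr[D]
    return grid, [j / evidence for j in joint]


def mean_and_sd(grid, weights):
    mean = sum(p * w for p, w in zip(grid, weights))
    var = sum((p - mean) ** 2 * w for p, w in zip(grid, weights))
    return mean, var ** 0.5


for heads, tosses in [(5, 11), (20, 44)]:
    grid, post = posterior_on_grid(heads, tosses)
    mean, sd = mean_and_sd(grid, post)
    print(f"{tosses} tosses, {heads} heads: mean {mean:.3f}, sd {sd:.3f}")
```

Both data sets have nearly the same fraction of heads, but the posterior for 44 tosses is noticeably narrower. Here the denominator Pr[D] is a sum over a one-dimensional grid, which is easy to compute.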

Bayesian Methods

Bayes' theorem:

Pr[T, t* | X] = Pr[X, T, t*] / Pr[X]
              = Pr[X | T, t*] Pr[T, t*] / Pr[X]
              = Pr[X | T, t*] Pr[T, t*] / ( Σ_{T', t'} Pr[X | T', t'] Pr[T', t'] )

Here Pr[T, t*] is the prior and Pr[T, t* | X] is the posterior.

Problem: Cannot compute the denominator.

Solution: Use the power of Markov chains to draw ("sample") trees according to the distribution Pr[T, t* | X].

Markov Chain Monte Carlo 

To sample from a distribution π:

• Define a Markov chain with equilibrium distribution π.
• Simulate the chain through many transitions.
• After many transitions (e.g. ~10000), the chain will be at equilibrium π. ("Burn-in")
• Output every n-th state (n ~ 50).

Example: Jukes-Cantor model of DNA

[Figure: Markov chain on the four nucleotides A, C, G, T]

Equilibrium distribution: q_A = q_C = q_G = q_T = 1/4
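The recipe above can be tried on a Jukes-Cantor-style chain. This is an illustrative sketch with assumed values: the per-step change probability of 0.1, the seed, and the function names are not from the lecture; the burn-in of 10000 and thinning of 50 follow the numbers quoted on the slide.

```python
import random
from collections import Counter

NUCLEOTIDES = "ACGT"


def jukes_cantor_step(state: str, change_prob: float, rng: random.Random) -> str:
    """One transition: with probability `change_prob` switch to one of the
    three other nucleotides, chosen uniformly; otherwise stay put."""
    if rng.random() < change_prob:
        return rng.choice([x for x in NUCLEOTIDES if x != state])
    return state


def sample_chain(samples_kept: int, burn_in: int = 10_000, thin: int = 50,
                 change_prob: float = 0.1, seed: int = 0):
    rng = random.Random(seed)
    state = "A"
    for _ in range(burn_in):       # discard the burn-in transitions
        state = jukes_cantor_step(state, change_prob, rng)
    samples = []
    while len(samples) < samples_kept:
        for _ in range(thin):      # keep only every `thin`-th state
            state = jukes_cantor_step(state, change_prob, rng)
        samples.append(state)
    return samples


samples = sample_chain(4000)
counts = Counter(samples)
for x in NUCLEOTIDES:
    print(x, counts[x] / len(samples))
```

The printed frequencies should each be close to 1/4, the equilibrium distribution, even though the chain always starts at A.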

MCMC on Trees

[Figure: NNI neighborhood for trees with 5 leaves]

1. Define a Markov chain:
   • States are trees T.
   • Equilibrium distribution is the posterior Pr[T, t* | X].
2. Simulate the Markov chain for many steps (burn-in).
3. Output T from every n-th (e.g. n = 50) step.

For transitions, can use NNI, SPR, TBR, or other operations.

Can define* the transition probabilities of this Markov chain without computing Z = Σ_{T', t'} Pr[X | T', t'] Pr[T', t'] (Metropolis algorithm).

*"involves burning of incense, casting of chicken bones, use of magical incantations, and invoking the opinions of more prestigious colleagues." -- Felsenstein
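A minimal sketch of the Metropolis idea on a toy state space, to show why Z is never needed: only ratios of unnormalized weights enter the acceptance step. Everything here is an assumption for illustration: four states on a ring stand in for trees, `WEIGHTS` stands in for Pr[X | T, t] Pr[T, t], and the proposal is symmetric, as plain Metropolis requires.

```python
import random
from collections import Counter


def metropolis(weight, propose, start, steps, seed=0):
    """Metropolis sampler for a target proportional to `weight(state)`.

    `propose(state, rng)` must be symmetric: proposing y from x is as likely
    as proposing x from y. Only ratios weight(y) / weight(x) are used, so the
    normalizing constant Z is never computed.
    """
    rng = random.Random(seed)
    state = start
    visited = []
    for _ in range(steps):
        candidate = propose(state, rng)
        ratio = weight(candidate) / weight(state)
        if rng.random() < min(1.0, ratio):
            state = candidate      # accept the move
        visited.append(state)      # a rejected move repeats the current state
    return visited


# Toy target: four states arranged in a ring, with unnormalized weights.
WEIGHTS = [1.0, 2.0, 3.0, 4.0]


def ring_proposal(state, rng):
    # Step to the left or right neighbor on the ring with equal probability.
    return (state + rng.choice([-1, 1])) % len(WEIGHTS)


visited = metropolis(lambda s: WEIGHTS[s], ring_proposal, start=0, steps=200_000)
counts = Counter(visited)
total = sum(WEIGHTS)
for s, w in enumerate(WEIGHTS):
    print(s, round(counts[s] / len(visited), 3), "target", w / total)
```

For trees, the proposal would be a random NNI, SPR, or TBR move. Those proposals are not automatically symmetric, so a practical sampler may need the Metropolis-Hastings correction.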

How Many Times Did Wings Evolve? 

• Previous studies had shown loss of wings: winged → wingless transitions.

• Gain of wings (wingless → winged transition) appears to be much more complicated.

Phylogeny of Insects 

Build phylogeny of winged and wingless stick insects.

Used data from:
• 18S ribosomal DNA (~1,900 base pairs (bp))
• 28S rDNA (2,250 bp)
• Portion of histone 3 (H3, 372 bp)

Used multiple tree reconstruction techniques.

(Nature 2003)

Most Parsimonious Evolutionary Tree of Winged and Wingless Insects

• All most parsimonious reconstructions gave a wingless ancestor.
• All required multiple winged → wingless transitions.

Will Wingless Insects Fly Again?  

• All most parsimonious reconstructions required the re-invention of wings.

• It is likely that wing developmental pathways are conserved in wingless stick insects.

Next Questions

• How to combine/merge trees?
• How to determine "confidence" in a particular tree/branch?

Multiple Trees?

Consensus Trees 

Strict Consensus Tree 

Strict Consensus 

No non-trivial splits in common! The strict consensus tree is unresolved.
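The slides present the strict consensus as a figure. Under the usual definition, which is assumed here, the strict consensus tree keeps exactly the splits that occur in every input tree. The sketch below uses hypothetical helper names and two made-up trees on leaves a to e that share no non-trivial split.

```python
def split(side_a, side_b):
    """A split A | B, stored so that A | B equals B | A."""
    return frozenset([frozenset(side_a), frozenset(side_b)])


def strict_consensus_splits(split_sets):
    """Splits shared by every input tree (each tree given as a set of splits)."""
    return set.intersection(*map(set, split_sets))


# Non-trivial splits of two trees on leaves a..e (trivial splits omitted).
tree1 = {split("ab", "cde"), split("abc", "de")}
tree2 = {split("ac", "bde"), split("acd", "be")}

shared = strict_consensus_splits([tree1, tree2])
print(len(shared))  # number of shared non-trivial splits
```

With no shared non-trivial splits, only the trivial splits remain, so the consensus is a star tree with every leaf attached to a single internal vertex.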

Splits Equivalence Theorem 

A phylogenetic tree T defines a collection of splits Σ(T) = { L_e | e is an edge in T }.

Splits A1 | B1 and A2 | B2 are pairwise compatible if at least one of A1∩A2, A1∩B2, B1∩A2, and B1∩B2 is the empty set.

Splits Equivalence Theorem: Let Σ be a collection of splits. There is a phylogenetic tree T such that Σ(T) = Σ if and only if the splits in Σ are pairwise compatible.

The Pairwise Compatibility Theorem (for binary characters) follows from this theorem.
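The compatibility condition translates directly into a check on sets. In this sketch the representation (a split as a pair of frozensets) and the function names are assumptions chosen for illustration.

```python
from itertools import combinations


def make_split(side_a, side_b):
    return (frozenset(side_a), frozenset(side_b))


def compatible(split1, split2):
    """Splits A1 | B1 and A2 | B2 are compatible if at least one of the four
    intersections A1∩A2, A1∩B2, B1∩A2, B1∩B2 is empty."""
    (a1, b1), (a2, b2) = split1, split2
    return any(not (x & y) for x in (a1, b1) for y in (a2, b2))


def pairwise_compatible(splits):
    return all(compatible(s, t) for s, t in combinations(splits, 2))


ok = [make_split("ab", "cde"), make_split("abc", "de")]
bad = [make_split("ab", "cde"), make_split("ac", "bde")]
print(pairwise_compatible(ok))   # ab and de do not overlap
print(pairwise_compatible(bad))  # all four intersections are non-empty
```

By the theorem, a collection like `ok` can be realized by a single tree, while no tree can contain both splits in `bad`.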

Majority Consensus Tree 
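The slides show the majority consensus tree only as a figure. Under the usual definition, assumed here, it keeps the splits that occur in more than half of the input trees. The helper names and the three example trees below are made up for illustration.

```python
from collections import Counter


def split(side_a, side_b):
    """A split A | B, stored so that A | B equals B | A."""
    return frozenset([frozenset(side_a), frozenset(side_b)])


def majority_consensus_splits(split_sets):
    """Splits that occur in more than half of the input trees."""
    counts = Counter(s for tree in split_sets for s in set(tree))
    return {s for s, c in counts.items() if c > len(split_sets) / 2}


# Non-trivial splits of three trees on leaves a..e (trivial splits omitted).
trees = [
    {split("ab", "cde"), split("abc", "de")},
    {split("ab", "cde"), split("abd", "ce")},
    {split("ac", "bde"), split("abc", "de")},
]

for s in majority_consensus_splits(trees):
    print(sorted("".join(sorted(side)) for side in s))
```

Each of the two splits kept appears in two of the three trees. Any two splits that each occur in more than half of the trees must occur together in at least one tree, so the splits kept are pairwise compatible and define a tree.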