
The (Supertree) of Life: Procedures, Problems, and Prospects Presented by Usman Roshan.

Date post: 20-Dec-2015

Supertree Methods

• Input: Set of trees $\mathcal{T} = \{T_1, \ldots, T_k\}$

• Output: Tree $T$ leaf-labeled by $\bigcup_{i=1}^{k} L(T_i)$, where $L(T_i)$ is the set of leaves of $T_i$.

• Why supertree methods?

Motivation (1)

• Supertree methods are used as part of divide-and-conquer methods to solve NP-hard problems on large datasets

Motivation (2)

• Supertree methods are used when we have missing data

Types of supertree methods (1)

• Direct methods (e.g. strict consensus supertrees, MinCutSupertrees)

Types of supertree methods (2)

• Indirect methods (e.g. MRP, average consensus)

Types of supertree methods (3) (MRP)
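
The slide names MRP without showing the encoding, so the following is only a minimal sketch of the usual matrix-representation coding, assuming rooted source trees written as nested Python tuples. Each non-root cluster of each source tree becomes one binary character: "1" for taxa inside the cluster, "0" for other taxa of that tree, and "?" for taxa missing from that tree. The names `clusters` and `mrp_matrix` are hypothetical, and the parsimony search on the resulting matrix is not shown.

```python
def clusters(tree):
    """Return (leaf set, list of clusters) for a rooted tree given as nested tuples."""
    if not isinstance(tree, tuple):
        return frozenset([tree]), []          # a leaf has no internal clusters
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_clusters = clusters(child)
        leaves |= child_leaves
        found.extend(child_clusters)
    found.append(leaves)                      # cluster of this internal node
    return leaves, found


def mrp_matrix(trees):
    """Map each taxon to its character string, one character per non-root cluster."""
    all_taxa = sorted(set().union(*(clusters(t)[0] for t in trees)))
    matrix = {taxon: [] for taxon in all_taxa}
    for tree in trees:
        leaves, tree_clusters = clusters(tree)
        for cluster in tree_clusters:
            if cluster == leaves:             # root cluster carries no information
                continue
            for taxon in all_taxa:
                if taxon in cluster:
                    matrix[taxon].append("1")
                elif taxon in leaves:
                    matrix[taxon].append("0")
                else:
                    matrix[taxon].append("?")  # taxon absent from this source tree
    return {taxon: "".join(chars) for taxon, chars in matrix.items()}


source_trees = [(("A", "B"), "C"), (("B", "C"), "D")]
for taxon, row in mrp_matrix(source_trees).items():
    print(taxon, row)
```

Because every non-root cluster adds a column, a source tree with more internal nodes contributes more characters to the matrix.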

Definitions

• Contraction: $T_3 \ge T_2$ ($T_2$ is obtained from $T_3$ by contracting edges)

• Restriction: $T_3|_{\{A,B,D,E\}} = T_2$

• If $T_3|_{L(T_2)} \ge T_2$ then $T_3$ contains $T_2$
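
As an illustration of restriction and containment, here is a minimal sketch assuming rooted trees written as nested Python tuples (the slide's definitions are more general). The helpers `restrict`, `leaf_set`, `cluster_set`, and `contains` are hypothetical names. For rooted trees, a tree is equal to or a contraction of another tree on the same leaves exactly when its clusters are a subset of the other's, and the check below relies on that.

```python
def restrict(tree, keep):
    """Restrict a rooted tree to the leaves in `keep`, suppressing unary nodes."""
    if not isinstance(tree, tuple):
        return tree if tree in keep else None
    kids = [r for r in (restrict(child, keep) for child in tree) if r is not None]
    if not kids:
        return None                 # no kept leaf below this node
    if len(kids) == 1:
        return kids[0]              # node with a single child is suppressed
    return tuple(kids)


def leaf_set(tree):
    """Set of leaf labels of a rooted tree."""
    if not isinstance(tree, tuple):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def cluster_set(tree):
    """Leaf sets below every internal node, including the root."""
    if not isinstance(tree, tuple):
        return set()
    result = {leaf_set(tree)}
    for child in tree:
        result |= cluster_set(child)
    return result


def contains(big, small):
    """True if `big` restricted to L(small) equals `small` or contracts to it."""
    restricted = restrict(big, leaf_set(small))
    return restricted is not None and cluster_set(small) <= cluster_set(restricted)


T3 = ((("A", "B"), "C"), ("D", "E"))
T2 = (("A", "B"), ("D", "E"))
print(restrict(T3, {"A", "B", "D", "E"}))
print(contains(T3, T2))
```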

Optimization problems

• Subtree Compatibility: Given a set of trees $\mathcal{T} = \{T_1, \ldots, T_k\}$, does there exist a tree $T$ such that $T|_{L(t)} \ge t$ for all $t \in \mathcal{T}$ (we say $T$ contains $\mathcal{T}$)?

• NP-hard (Steel 1992)

• Special cases are poly-time (rooted trees, DCM)

• MRP: also NP-hard

Limitations of supertree methods

Three desirable properties:

• P1: Method can be applied to any unordered set of input trees

• P2: Renaming the species does not change the constructed supertree

• P3: If the input trees are compatible, then the output tree is one of the “parent trees”.

There is no supertree method that can satisfy P1-P3 when the input trees are unrooted; however, for rooted trees an extension of BUILD satisfies P1-P3.

Rooted subtrees (BUILD) (Aho et al. 1981)

• Input: Set of rooted trees $\mathcal{T}$

• Output: Tree $T$ that contains $\mathcal{T}$

BUILD (2) - Definitions

• Cluster: Set of taxa in a rooted subtree

• A different representation of rooted phylogenetic trees

• Let C(T) be the clusters of tree T. In this example C(T) = {{1,2}, {3,4}, {1,2,3,4},{1,2,3,4,5}}

• We write (IJ)K in T if I and J are in some cluster of T which doesn’t contain K; e.g. (12)3 and (34)5 are in T
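
The cluster and (IJ)K definitions can be checked on a small example. This is a minimal sketch assuming the slide's example tree is (((1,2),(3,4)),5) written as nested Python tuples, a shape chosen because it yields the C(T) listed above. The function names are hypothetical, and requiring K to be a leaf of T is an added assumption.

```python
def clusters(tree):
    """Set of clusters: the taxa below each internal node of a rooted tree."""
    found = set()

    def walk(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def has_triplet(tree, i, j, k):
    """True if (ij)k is in the tree: some cluster holds i and j but not k."""
    cs = clusters(tree)
    leaves = max(cs, key=len, default=frozenset())   # root cluster = all taxa
    if k not in leaves:                              # assumption: k must be in T
        return False
    return any(i in c and j in c and k not in c for c in cs)


T = (((1, 2), (3, 4)), 5)
print(sorted(sorted(c) for c in clusters(T)))
print(has_triplet(T, 1, 2, 3), has_triplet(T, 3, 4, 5), has_triplet(T, 1, 3, 2))
```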

BUILD (3) - Algorithm

1. Initialize C as the set of input taxa.

2. If |C| = 1 return C; else compute the graph $G = (V, E)$ with $V = \{\text{species in } C\}$ and $E = \{(i,j) : \exists k \in C,\ \exists T \in \mathcal{T},\ (ij)k \in T\}$.

3. Let C’ be the sets of taxa in the connected components of G. If |C’| = 1 then $\mathcal{T}$ is incompatible; else set C = C ∪ C’ and repeat step (2) on each new cluster in C’.
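
The three steps can be sketched in code as follows. This is an illustrative version rather than a reference implementation: it assumes rooted source trees written as nested Python tuples with string taxon names, extracts their (ij)k triplets, and recurses on the connected components of the graph restricted to the current taxon set. The names `triplets`, `build`, and `build_supertree` are hypothetical; `None` signals incompatible input.

```python
from itertools import combinations


def triplets(tree):
    """All (i, j, k) such that (ij)k is in the rooted tree."""
    found = []

    def walk(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.append(leaves)
        return leaves

    all_leaves = walk(tree)
    return {(i, j, k)
            for cluster in found
            for i, j in combinations(sorted(cluster), 2)
            for k in all_leaves - cluster}


def build(taxa, trips):
    """Return a tree on `taxa` containing every triplet, or None if incompatible."""
    taxa = sorted(taxa)
    if len(taxa) == 1:
        return taxa[0]
    taxon_set = set(taxa)
    # Keep only triplets whose three taxa all lie in the current set C.
    relevant = [t for t in trips if set(t) <= taxon_set]
    parent = {t: t for t in taxa}              # union-find over the vertices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j, _ in relevant:                   # edge (i, j) for each (ij)k
        parent[find(i)] = find(j)
    components = {}
    for t in taxa:
        components.setdefault(find(t), []).append(t)
    if len(components) == 1:                   # connected graph: incompatible
        return None
    subtrees = []
    for component in components.values():
        subtree = build(component, relevant)
        if subtree is None:
            return None
        subtrees.append(subtree)
    return tuple(subtrees)


def build_supertree(trees):
    """Collect taxa and triplets from the source trees, then run build."""
    trips = set().union(*(triplets(t) for t in trees))
    taxa = {x for trip in trips for x in trip}
    return build(taxa, trips)


print(build_supertree([(("A", "B"), "C"), (("B", "C"), "D")]))
print(build_supertree([(("A", "B"), "C"), (("A", "C"), "B")]))
```

In this sketch, taxa that appear in no triplet (for example, in a two-leaf source tree) are dropped, and components are ordered by their smallest taxon so the output is deterministic.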

BUILD (4) - Algorithm

BUILD (5) - Algorithm

BUILD (6) - Algorithm

BUILD (7) - Algorithm

Compatible source trees

• For compatible source trees, MRP or BUILD can be used; however, the strict consensus of MRP trees (or the strict consensus supertree) may not be compatible with the input.

• BUILD has been extended to output all parent trees; it has also been shown that the source trees have a unique parent tree iff BUILD constructs a binary tree.

Incompatible source trees (1)

For incompatible source trees two strategies:

• Resolve incompatibilities by using quartet methods or removing troublesome taxa.

• Use an appropriate algorithm such as MRP or MinCutSupertrees; the latter is an extension of BUILD so that it always outputs a tree.

Incompatible source trees (2)

Desirable property

• P1: If at least one tree contains (IJ)K and no source tree contains (IK)J or (JK)I, then the output tree must contain (IJ)K

No method can satisfy P1; however, the weaker condition "if all source trees contain (IJ)K then the output must contain (IJ)K" can be satisfied.

Supertree criticism

• Do not take biomolecular sequences into account

• Dataset non-independence

• MRP: Favors larger source trees because they contribute more characters; may also favor unbalanced source trees

• Direct methods: Cannot incorporate support values in the source trees (except for MinCutSupertrees), and cannot compute support values in the supertree (unlike MRP)

Applications of supertrees

• Systematics – MRP is the standard method used by biologists

• Evolutionary models

• Rates of cladogenesis

• Evolutionary patterns

• Biodiversity and conservation

Bright future for supertree construction

• Despite the increase in phylogenetic data, species are poorly characterized at the molecular level, giving rise to problems from taxon sampling (non-random sampling), long branch attraction, and missing data

• ML analysis: Genes evolve under different models

• Non-molecular data

