Date post: 02-Jan-2016

Constructing evolutionary trees from rooted triples

Bang Ye Wu

Dept. of Computer Science and Information Engineering

Shu-Te University


An evolutionary tree

A rooted tree.
Each leaf represents one species.
Internal nodes are unlabelled (inferred common ancestors).

[Figure: a rooted tree with leaves a, b, c, d, e, f]


A (rooted) triple (triplet)

An evolutionary tree of 3 species.
A constraint in an evolutionary tree construction problem.
(c(ab)): lca(b,c) = lca(c,a) → lca(a,b)
– lca: lowest common ancestor
– →: "is an ancestor of"
a and b should be closer to each other than a and c or b and c.

[Figure: the triple (c(ab)) drawn as a tree with leaves a, b, c]
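The constraint can be checked mechanically. Below is a minimal sketch, assuming a tree is stored as nested tuples with species names at the leaves and a triple (x(yz)) is stored as the tuple (x, y, z). Both encodings and all function names are illustrative choices, not taken from the slides.

```python
def leaf_paths(tree, path=()):
    """Map each leaf to the sequence of child indices leading to it from the root."""
    if isinstance(tree, str):
        return {tree: path}
    paths = {}
    for i, child in enumerate(tree):
        paths.update(leaf_paths(child, path + (i,)))
    return paths


def lca_depth(paths, u, v):
    """Depth of lca(u, v): the length of the common prefix of the two root paths."""
    depth = 0
    for a, b in zip(paths[u], paths[v]):
        if a != b:
            break
        depth += 1
    return depth


def satisfies(tree, triple):
    """True if lca(x, y) is a proper ancestor of lca(y, z) for the triple (x(yz))."""
    x, y, z = triple
    paths = leaf_paths(tree)
    return lca_depth(paths, y, z) > lca_depth(paths, x, y)


# The triple (c(ab)) from the slide, on the tree ((a,b),c).
print(satisfies((("a", "b"), "c"), ("c", "a", "b")))
```

Comparing depths is enough here: both lowest common ancestors lie on the root path of y, so the deeper one is a descendant of the other.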


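A tree compatible with triples

Given a set of triples, construct a tree satisfying all the triples.
Deciding whether such a tree exists, and constructing it if it does, is solvable in polynomial time. [Aho et al, 1981]

[Figure: three example triples on leaves a, b, c, d and a compatible tree with leaves a, d, b, c]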
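The polynomial-time procedure usually attributed to Aho et al. can be sketched as follows: connect y and z for every triple (x(yz)) inside the current species set, fail if the graph is connected, and otherwise recurse on each component. This is a minimal sketch under assumed encodings (triples as (x, y, z) tuples, trees as nested tuples); the function name `build` is hypothetical.

```python
def build(leaves, triples):
    """Return a nested-tuple tree satisfying every triple, or None if none exists."""
    leaves = set(leaves)
    if len(leaves) == 1:
        return next(iter(leaves))
    # Only triples whose three species all lie in the current set matter here.
    local = [t for t in triples if set(t) <= leaves]

    # Union-find: y and z of each local triple must end up in the same child.
    parent = {v: v for v in leaves}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for _, y, z in local:
        parent[find(y)] = find(z)

    groups = {}
    for v in leaves:
        groups.setdefault(find(v), []).append(v)
    if len(groups) == 1:
        return None  # the graph is connected, so the triples conflict

    children = []
    for group in groups.values():
        subtree = build(group, local)
        if subtree is None:
            return None
        children.append(subtree)
    return tuple(children)


print(build("abc", [("a", "b", "c"), ("b", "a", "c")]))  # two conflicting triples
```

The order of children in the returned tuple is not significant and may vary between runs.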


Incompatible (conflicting) triples

[Figure: two triples on leaves a, b, c]
Two conflicting triples

[Figure: three triples on leaves a, b, c, d]
Three conflicting triples (pairwise compatible)


Two optimization problems

The maximum consensus tree:
– The tree satisfying the maximum number of triples.
– NP-hard [Jansson, 2001][Wu, to appear]
– A new heuristic algorithm [this paper]

The maximum compatible set:
– The compatible species subset of maximum cardinality.
– NP-hard [this paper]


Previous heuristic: Best-One-Split-First (BOSF)

If a species x is split from a set V, all triples (x(v1v2)) with v1 and v2 among the remaining species of V will be satisfied.
Repeatedly split one species from the set, choosing the split species greedily.


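triples: (a(bc)), (c(ad)), (b(ad)), (c(bd))

c is split from {a,b,c,d}, leaving {a,b,d}: c is chosen, and the two triples (c(ad)) and (c(bd)) are satisfied.
b is split from {a,b,d}, leaving {a,d}.

[Figure: the successive splits and the resulting tree on leaves a, b, c, d]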
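The greedy splitting can be sketched in a few lines. This is an illustrative sketch rather than the original implementation: triples are assumed to be (x, y, z) tuples for (x(yz)), the result is a nested tuple, and ties are broken alphabetically, which the slides do not specify.

```python
def bosf(species, triples):
    """Best-One-Split-First: repeatedly split off the species with the largest gain."""
    remaining = sorted(species)
    if len(remaining) == 1:
        return remaining[0]

    def gain(x):
        # Triples (x(yz)) that become satisfied if x is split from the rest.
        rest = set(remaining) - {x}
        return sum(1 for t, y, z in triples if t == x and y in rest and z in rest)

    best = max(remaining, key=gain)  # first maximum wins, i.e. alphabetical ties
    rest = [v for v in remaining if v != best]
    return (best, bosf(rest, triples))


triples = [("a", "b", "c"), ("c", "a", "d"), ("b", "a", "d"), ("c", "b", "d")]
print(bosf("abcd", triples))  # ('c', ('b', ('a', 'd')))
```

On the example triples, c is split first and b second, matching the steps above.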


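Previous heuristic: Min-Cut-Split-First (MCSF)

Construct an auxiliary graph:
– Vertex: species
– Each edge is labeled by a set: for each triple (x(yz)), x is in the label set of edge (y,z).

triples: (a(bc)), (c(ad)), (b(ad)), (c(bd))

[Figure: the auxiliary graph on vertices a, b, c, d, with edge (b,c) labeled a, edge (a,d) labeled b,c, and edge (b,d) labeled c]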


– A bipartition corresponds to a split in the tree.
– The labels in the cut of the bipartition correspond to the triples conflicting with the split.
Repeatedly find the bipartition with minimum cut.

[Figure: the bipartition {a,d} | {b,c} of the auxiliary graph and the corresponding tree with leaves a, d, b, c. It is a min-cut; triple (c(bd)) is conflicting.]
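One split step can be illustrated by building the labeled graph and searching all bipartitions by brute force. This is only a sketch for a handful of species: a real implementation would use a proper min-cut algorithm, and the function names and the (x, y, z) triple encoding are assumptions.

```python
from itertools import combinations


def edge_labels(triples):
    """Auxiliary graph: edge {y, z} carries the label set {x : (x(yz)) is a triple}."""
    labels = {}
    for x, y, z in triples:
        labels.setdefault(frozenset((y, z)), set()).add(x)
    return labels


def cut_weight(labels, side):
    """Number of labels on edges with exactly one endpoint in `side`."""
    return sum(len(xs) for edge, xs in labels.items() if len(edge & side) == 1)


def min_cut_bipartition(species, triples):
    """Return (weight, side1, side2) of a minimum-weight bipartition."""
    species = sorted(species)
    # Only triples entirely inside the current species set can conflict.
    labels = edge_labels([t for t in triples if set(t) <= set(species)])
    first, others = species[0], species[1:]
    best = None
    for r in range(len(others)):  # the side holding `first` is never the whole set
        for extra in combinations(others, r):
            side = frozenset((first,) + extra)
            weight = cut_weight(labels, side)
            if best is None or weight < best[0]:
                best = (weight, side, frozenset(species) - side)
    return best


triples = [("a", "b", "c"), ("c", "a", "d"), ("b", "a", "d"), ("c", "b", "d")]
weight, left, right = min_cut_bipartition("abcd", triples)
print(weight, sorted(left), sorted(right))  # 1 ['a', 'd'] ['b', 'c']
```

The bipartition {c} | {a,b,d} also has weight 1 on this input; the sketch simply returns the first minimum it meets.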


Previous heuristic: Best-Pair-Merge-First (BPMF)

Instead of top-down splitting, BPMF uses a bottom-up merging strategy.
Starting from singleton sets, we repeatedly merge two sets at a time.
Scoring functions are used to evaluate which pair should be merged in each step.


triples: (a(bc)), (c(ad)), (b(ad)), (c(bd))

{a} {b} {c} {d}
{a,d} {b} {c}
{a,d} {b,c}
{a,d,b,c}

[Figure: the partial trees after each merge, ending with the tree on leaves a, d, b, c]
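The merging loop can be sketched as below. The slides do not say which scoring functions BPMF uses, so `score` here is a purely hypothetical stand-in (triples supported by a merge minus triples it violates), and the tie-breaking order is likewise an assumption.

```python
from itertools import combinations


def score(a, b, triples):
    """Hypothetical score: triples supported minus triples violated by merging a and b."""
    merged = a | b
    total = 0
    for x, y, z in triples:
        if ((y in a and z in b) or (y in b and z in a)) and x not in merged:
            total += 1  # y and z meet below x, so (x(yz)) is satisfied
        for u, w in ((y, z), (z, y)):
            if w not in merged and ((x in a and u in b) or (x in b and u in a)):
                total -= 1  # x meets u before u meets w, so (x(yz)) is violated
    return total


def bpmf(species, triples):
    """Bottom-up merging: repeatedly merge the best-scoring pair of sets."""
    items = [(s, frozenset([s])) for s in sorted(species)]  # (subtree, leaf set)
    while len(items) > 1:
        i, j = max(combinations(range(len(items)), 2),
                   key=lambda p: score(items[p[0]][1], items[p[1]][1], triples))
        (tree_i, set_i), (tree_j, set_j) = items[i], items[j]
        items = [item for k, item in enumerate(items) if k not in (i, j)]
        items.append(((tree_i, tree_j), set_i | set_j))
    return items[0][0]


triples = [("a", "b", "c"), ("c", "a", "d"), ("b", "a", "d"), ("c", "b", "d")]
print(bpmf("abcd", triples))  # (('a', 'd'), ('b', 'c'))
```

With this stand-in score the sketch happens to reproduce the merge order of the example, but a different scoring function or tie-break could merge in another order.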


An exact algorithm for MCTT

Dynamic programming:
F(V) = max{F(V1)+F(V2)+W(V1,V2)}, taken among all bipartitions (V1,V2) of V.
– F(V): # of satisfied triples over V.
– W(V1,V2): # of triples (x(v1v2)) for x not in V and v1, v2 in V1, V2 respectively.
Computed with cardinality from small to large.


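Values of F(V) by subset size n:

n=4: abcd 3
n=3: abc 1, abd 3, bcd 2
n=2: ab 0, ac 0, ad 2, bc 1, bd 1, cd 0
n=1: a 0, b 0, c 0, d 0

[Figure: the triples (a(bc)), (c(ad)), (b(ad)), (c(bd)) and the resulting tree with leaves a, d, b, c]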
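The recurrence F(V) = max{F(V1)+F(V2)+W(V1,V2)} can be evaluated bottom-up over all subsets. The sketch below assumes triples are (x, y, z) tuples for (x(yz)) and returns the whole table; the function name is hypothetical, and the running time is exponential in the number of species.

```python
from itertools import combinations


def exact_table(species, triples):
    """Fill F(V) for every non-empty subset V, from small cardinality to large."""
    species = sorted(species)
    F = {frozenset([s]): 0 for s in species}
    for size in range(2, len(species) + 1):
        for members in combinations(species, size):
            V = frozenset(members)
            first, others = members[0], members[1:]
            best = 0
            # Enumerate each bipartition once by fixing `first` in V1.
            for r in range(len(others)):
                for extra in combinations(others, r):
                    V1 = frozenset((first,) + extra)
                    V2 = V - V1
                    # W(V1, V2): triples (x(yz)) with x outside V and y, z split.
                    w = sum(1 for x, y, z in triples
                            if x not in V
                            and len({y, z} & V1) == 1 and len({y, z} & V2) == 1)
                    best = max(best, F[V1] + F[V2] + w)
            F[V] = best
    return F


triples = [("a", "b", "c"), ("c", "a", "d"), ("b", "a", "d"), ("c", "b", "d")]
F = exact_table("abcd", triples)
print(F[frozenset("abcd")], F[frozenset("abd")], F[frozenset("bcd")])  # 3 3 2
```

Recording the maximizing bipartition alongside each F(V) would be enough to recover the tree itself.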


Our new heuristic algorithm (DPWP)

Derived from the exact algorithm.
The number of subsets of each cardinality is limited by a parameter K.
When K = infinity, it is just the exact algorithm.
Time-quality trade-off.
The time complexity is O(n^2 K^2 (n^3 + K)).
– Sorry, there is a mistake in the paper.


The experiment results (time)

[Figure: running time in seconds (log scale, 1 to 10000) versus n = 12, 15, 18, 20, 24, 27, 30, for Exact, DPWP(300), DPWP(600), DPWP(900)]


Average ratio in the test

[Figure: average ratio (axis 0.8 to 1.2) versus n = 12, 15, 18, 20, for BPMF, BOSF, MCSF, DPWP(300), DPWP(600), DPWP(900)]


Worst ratio in the test

[Figure: worst ratio (axis 0.8 to 1.6) versus n = 12, 15, 18, 20, for BPMF, BOSF, MCSF, DPWP(300), DPWP(600), DPWP(900)]


Improvement: 100*(DPWP - BestofOther)/BestofOther

[Figure: improvement in % (axis 0 to 20) versus n = 18, 20, 24, 27, 30; series: Max, average]


The MCST problem

Given triples over species set S, find a subset U of S such that all given triples over U are compatible and |U| is maximum.
We show the problem is NP-hard.
– Transformed from the Feedback Vertex Set problem.
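For intuition, a tiny instance can be solved by trying subsets from large to small and testing each for compatibility. The sketch below is exponential, in line with the NP-hardness result, and is not the method of the paper; the compatibility test follows the component-splitting idea of Aho et al., and the names and the (x, y, z) triple encoding are assumptions.

```python
from itertools import combinations


def compatible(leaves, triples):
    """True if some tree on `leaves` satisfies every triple lying entirely inside it."""
    leaves = set(leaves)
    local = [t for t in triples if set(t) <= leaves]
    if not local:
        return True
    # Connected components of the graph with an edge (y, z) per triple (x(yz)).
    comp = {v: {v} for v in leaves}
    for _, y, z in local:
        if comp[y] is not comp[z]:
            merged = comp[y] | comp[z]
            for v in merged:
                comp[v] = merged
    groups = list({id(g): g for g in comp.values()}.values())
    if len(groups) == 1:
        return False  # connected graph: no top-level split is possible
    return all(compatible(g, local) for g in groups)


def max_compatible_subset(species, triples):
    """Brute force: the first compatible subset found at the largest size."""
    species = sorted(species)
    for size in range(len(species), 0, -1):
        for subset in combinations(species, size):
            if compatible(subset, triples):
                return set(subset)
    return set()


triples = [("a", "b", "c"), ("c", "a", "d"), ("b", "a", "d"), ("c", "b", "d")]
print(compatible("abcd", triples), len(max_compatible_subset("abcd", triples)))
```

On the running example the four triples are not compatible as a whole, so one species has to be dropped.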


The feedback vertex set problem

Feedback vertex set: a vertex subset containing at least one vertex of each cycle of the given directed graph.
– In other words, removing a feedback vertex set results in an acyclic digraph.
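The second formulation gives a direct test: delete the candidate vertices and check that the remaining digraph is acyclic. Here is a minimal sketch using a topological-sort style check, assuming the digraph is given as a dictionary from each vertex to its list of successors (an illustrative encoding).

```python
def is_feedback_vertex_set(graph, removed):
    """True if deleting `removed` from the digraph leaves no directed cycle."""
    removed = set(removed)
    nodes = (set(graph) | {v for vs in graph.values() for v in vs}) - removed
    succ = {u: [v for v in graph.get(u, ()) if v in nodes] for u in nodes}
    indegree = {u: 0 for u in nodes}
    for u in nodes:
        for v in succ[u]:
            indegree[v] += 1
    # Repeatedly peel off vertices with no incoming edge.
    stack = [u for u in nodes if indegree[u] == 0]
    peeled = 0
    while stack:
        u = stack.pop()
        peeled += 1
        for v in succ[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                stack.append(v)
    return peeled == len(nodes)  # every vertex peeled means no cycle remains


graph = {1: [2], 2: [3], 3: [1, 4], 4: []}  # one cycle: 1 -> 2 -> 3 -> 1
print(is_feedback_vertex_set(graph, {2}), is_feedback_vertex_set(graph, {4}))
```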


The reduction

[Figure: the reduction, with labels T1, T2, ..., Tp; r1, r2, ..., rp; V1, V2, V3, ..., Vp; x; x1, x2, ...]


Concluding remarks

What is the approximation ratio?
– The Best-One-Split-First algorithm is a 3-approximation algorithm.
– A larger K gives us a better solution, but we do not know the theoretical bound of the ratio.

