Fixed-Parameter Algorithms for CLOSEST STRING and Related Problems
Algorithmica (2003)
Jens Gramm, Rolf Niedermeier, Peter Rossmanith

Outline

Introduction
Preliminaries
Linear-time solution for constant d
Related problems
Linear-time solution for fixed k
Conclusion

Introduction: Problem Definition

Input: Strings s1, s2, …, sk over an alphabet Σ, each of length L, and a nonnegative integer d.

Question: Is there a string s of length L such that dH(s, si) ≤ d for all i = 1, …, k?

Here dH denotes the Hamming distance: dH(s1, s2) = |{i | s1[i] ≠ s2[i]}| for |s1| = |s2|.
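
The definition translates directly into a check for a given candidate string. The following is a minimal sketch; the function names `hamming` and `is_center` are illustrative and do not come from the paper.

```python
from typing import Sequence


def hamming(a: str, b: str) -> int:
    """Hamming distance dH(a, b): number of positions where a and b differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def is_center(s: str, strings: Sequence[str], d: int) -> bool:
    """True iff dH(s, si) <= d for every input string si."""
    return all(hamming(s, si) <= d for si in strings)
```

Verifying a candidate takes O(kL) time; the hard part of CLOSEST STRING is finding the candidate.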

NP-completeness

CLOSEST STRING is NP-complete.
d is usually small in biological applications.
Result of this paper: an O(kL + kd·d^d) algorithm.
Li et al. give a PTAS.

Extended problems

d-MISMATCH
DISTINGUISHING STRING SELECTION
DISTINGUISHING SUBSTRING SELECTION

Preliminaries

Given a set of strings S = {s1, …, sk}, each of length L:

s is an optimal center string iff there is no s’ such that max_{i=1,…,k} dH(s’, si) < max_{i=1,…,k} dH(s, si).

s is an optimal median string iff there is no s’ such that Σ_{i=1,…,k} dH(s’, si) < Σ_{i=1,…,k} dH(s, si).

Given a set of k strings of length L, think of the set as a k × L matrix:

s1  a b c d
s2  a a d b
s3  b c d a
s4  a c c c

An optimal median string: a c c a (take a most frequent symbol in each column; ties can be broken arbitrarily).
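
Unlike the center string, an optimal median string is easy to compute: pick a most frequent symbol in every column. A minimal sketch (function names are illustrative); because ties are broken arbitrarily, the result may differ from the string shown on the slide while having the same total distance.

```python
from collections import Counter


def median_string(strings):
    """Column-wise majority vote; any tie-break yields an optimal median."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*strings))


def total_distance(s, strings):
    """Sum of Hamming distances from s to all input strings."""
    return sum(sum(x != y for x, y in zip(s, si)) for si in strings)
```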

Main idea

Search!
Fixed-parameter tractability
Reduction to a problem kernel

LEMMA 1. Given a set of strings S = {s1, …, sk}, each of length L, and a permutation σ: {1, …, L} → {1, …, L}. Then s is an optimal center string for {s1, …, sk} iff σ(s) is an optimal center string for {σ(s1), σ(s2), …, σ(sk)}.

LEMMA 2. To compute an optimal center string, it is sufficient to solve a normalized and reordered instance. From this, the solution of the original instance can be derived in linear time.

Original instance:

s1  a b c d
s2  a a d b
s3  b c d a
s4  a c c c

Normalized (symbols renamed within each column):

s1  a b a a
s2  a c b b
s3  b a b c
s4  a a a d

Reordered (columns permuted):

s1  b a a a
s2  c a b b
s3  a b b c
s4  a a a d
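
The normalization step can be sketched as follows. This assumes the convention that, within each column, the most frequent symbol is renamed to a, the next most frequent to b, and so on, with ties broken by first occurrence; that tie-break is an assumption chosen because it reproduces the example matrices. The names `normalize` and `denormalize` are illustrative.

```python
from collections import Counter
from string import ascii_lowercase


def normalize(strings):
    """Rename symbols per column by decreasing frequency.

    Returns (normalized strings, per-column inverse maps).
    Assumes at most 26 distinct symbols per column.
    """
    new_cols, inverse_maps = [], []
    for col in zip(*strings):
        counts = Counter(col)
        # Index of the first occurrence of each symbol, used as tie-break.
        first_seen = {sym: idx for idx, sym in reversed(list(enumerate(col)))}
        ranking = sorted(counts, key=lambda sym: (-counts[sym], first_seen[sym]))
        forward = {sym: ascii_lowercase[r] for r, sym in enumerate(ranking)}
        new_cols.append([forward[c] for c in col])
        inverse_maps.append({new: old for old, new in forward.items()})
    return ["".join(row) for row in zip(*new_cols)], inverse_maps


def denormalize(solution, inverse_maps):
    """Map a solution of the normalized instance back, one lookup per column."""
    return "".join(inv.get(ch, ch) for ch, inv in zip(solution, inverse_maps))
```

Mapping a solution back costs one dictionary lookup per column, in line with the linear-time claim of the lemma.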

LEMMA 3. A CLOSEST STRING instance with arbitrary alphabet Σ, |Σ| > k, is isomorphic to a CLOSEST STRING instance with alphabet Σ’, |Σ’| = k.

Proof idea: by normalization.

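LEMMA 4. Given a CLOSEST STRING instance with strings s1, …, sk of length L and an integer d. If the resulting k × L matrix has more than kd dirty columns, then there is no string s with max_{i=1,…,k} dH(s, si) ≤ d.

A column is dirty iff it contains at least two different symbols from the alphabet Σ.

Proof idea: pigeonhole principle. In every dirty column, s differs from at least one si, so more than kd dirty columns force some si to differ from s in more than d positions.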
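
Lemma 4 yields a problem kernel: reject if there are too many dirty columns, otherwise keep only the dirty ones (in a clean column the solution simply takes the common symbol). A minimal sketch with illustrative names:

```python
def dirty_columns(strings):
    """Indices (0-based) of columns containing at least two different symbols."""
    return [p for p, col in enumerate(zip(*strings)) if len(set(col)) > 1]


def kernelize(strings, d):
    """Return (strings restricted to dirty columns, dirty indices), or None.

    None means Lemma 4 applies: more than k*d dirty columns, so no
    center string within distance d exists.
    """
    k = len(strings)
    dirty = dirty_columns(strings)
    if len(dirty) > k * d:
        return None
    return ["".join(s[p] for p in dirty) for s in strings], dirty
```

One pass over the k × L matrix suffices, so the reduction runs in O(kL) time and leaves strings of length at most kd.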

A Linear-Time solution for constant d

Bounded search tree algorithm

LEMMA 5. Given a set of strings S = {s1, …, sk} and a positive integer d. If there are i, j ∈ {1, …, k} with dH(si, sj) > 2d, then there is no string s with max_{i=1,…,k} dH(s, si) ≤ d.
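
Lemma 5 follows from the triangle inequality dH(si, sj) ≤ dH(si, s) + dH(s, sj) ≤ 2d and gives a cheap rejection test. A minimal sketch (the function name is illustrative); it checks all pairs, which costs O(k²L) and is meant only to illustrate the condition.

```python
from itertools import combinations


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def violates_lemma5(strings, d):
    """True iff some pair of input strings is more than 2d apart."""
    return any(hamming(a, b) > 2 * d for a, b in combinations(strings, 2))
```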

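Theorem 1. Given a set of strings S = {s1, …, sk} and an integer d, Algorithm D determines in O(kL + kd·d^d) time whether there is a string s with max_{i=1,…,k} dH(s, si) ≤ d.

By Lemma 4, the input instance is reduced to size O(kd) in O(kL) time.

The search tree has depth d and d+1 branches per node, hence O(d^d) nodes.

Steps D0 to D3 take O(kd) time per node, by building a table containing the distances of the candidate string to all other given strings.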
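
The listing of Algorithm D itself is not part of this transcript. The following is a minimal sketch of a bounded search tree consistent with Theorem 1, assuming the usual formulation: start from s1 as candidate, pick an input string that is still too far away, and branch over d+1 of the positions where the candidate differs from it. All names are hypothetical, and the sketch recomputes distances at each node instead of maintaining the distance table mentioned above.

```python
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def closest_string(strings, d):
    """Return a string within distance d of every input string, or None."""

    def search(candidate, budget):
        # budget = number of further position changes still allowed.
        if budget < 0:
            return None
        far = None
        for si in strings:
            dist = hamming(candidate, si)
            if dist > d + budget:
                return None          # cannot be repaired with the remaining budget
            if dist > d and far is None:
                far = si             # first input string that is still too far
        if far is None:
            return candidate         # candidate is a center string
        diff = [p for p in range(len(candidate)) if candidate[p] != far[p]]
        for p in diff[: d + 1]:      # d+1 subcases suffice
            moved = candidate[:p] + far[p] + candidate[p + 1:]
            found = search(moved, budget - 1)
            if found is not None:
                return found
        return None

    return search(strings[0], d)
```

The recursion depth is at most d and every node has at most d+1 children, which gives the d^d factor in the running time.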

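Correctness

We show only the correctness of the first branching step.

Assume s1 is not a solution, but there exists a center string s with max_{i=1,…,k} dH(s, si) ≤ d.

Let si be a string with dH(s1, si) > d, and let P ⊆ {p | s1[p] ≠ si[p]} with |P| = d+1.

P_{s1≠s=si} := {p | s1[p] ≠ s[p] = si[p]}. Goal: find a position p ∈ P that lies in this set.

With P_{s≠si} := {p | s[p] ≠ si[p]}, every p ∈ P lies in exactly one of P_{s1≠s=si} and P_{s≠si} (the two sets are disjoint), and |P_{s≠si}| ≤ d.

Since |P| = d+1, at least one p ∈ P lies in P_{s1≠s=si}. So d+1 subcases are sufficient.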

Related Problems: d-MISMATCH

si,p,L denotes the length-L substring of a given string si starting at position p.

Question: Is there a string s of length L and a position p with 1 ≤ p ≤ n−L+1 such that dH(s, si,p,L) ≤ d for all i = 1, …, k? (n is the length of the input strings.)

Stojanovic et al. give a linear-time algorithm for 1-MISMATCH.

Theorem 2. d-MISMATCH is solvable in O(kL + (n−L)·kd·d^d) time, which is O(n·k) for fixed d.

Naively: O(n·(kL + kd·d^d)).

Maintain a queue of dirty columns. Considering only the first L columns, we can build a FIFO queue in O(kL) time. Update at each position in O(k) time.
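
The queue maintenance can be sketched as a sliding window over the columns. This is a minimal illustration with hypothetical names; positions are 0-based here, whereas the slides count from 1. The tuple snapshot yielded per window is for readability only and costs more than O(k); an implementation aiming at the stated bound would hand the queue itself to the solver.

```python
from collections import deque


def dirty_windows(strings, L):
    """Yield (p, dirty column indices) for every window start p (0-based)."""
    n = len(strings[0])

    def is_dirty(j):
        # O(k): does column j contain at least two different symbols?
        return len({s[j] for s in strings}) > 1

    queue = deque(j for j in range(L) if is_dirty(j))   # O(kL) setup
    yield 0, tuple(queue)
    for p in range(1, n - L + 1):
        if queue and queue[0] == p - 1:   # column p-1 leaves the window
            queue.popleft()
        entering = p + L - 1              # column entering the window
        if is_dirty(entering):
            queue.append(entering)
        yield p, tuple(queue)
```

For each window with at most kd dirty columns, the CLOSEST STRING search is then run on those columns only.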

DSS problem: DISTINGUISHING STRING SELECTION

Given S = {s1, …, sk1} and S’ = {s’1, …, s’k2}, all strings of the same length L, and d1, d2 ≥ 0, is there a string s such that

max_{i=1,…,k1} dH(s, si) ≤ d1 and min_{j=1,…,k2} dH(s, s’j) ≥ L − d2?

LEMMA 6. Given two sets of strings S = {s1, …, sk1} and S’ = {s’1, …, s’k2} and positive integers d1, d2. If there are i ∈ {1, …, k1} and j ∈ {1, …, k2} with dH(si, s’j) < L − (d1 + d2), then there is no string s satisfying both max_{i=1,…,k1} dH(s, si) ≤ d1 and min_{j=1,…,k2} dH(s, s’j) ≥ L − d2.

Proof idea: dH(s, s’j) ≤ dH(s, si) + dH(si, s’j) < d1 + L − (d1 + d2) = L − d2.
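
The condition of Lemma 6 can be tested directly before any search is started. A minimal sketch; the parameter names `good` (the set S) and `bad` (the set S’) are illustrative.

```python
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def violates_lemma6(good, bad, d1, d2):
    """True iff some si in S and s'j in S' are closer than L - (d1 + d2)."""
    L = len(good[0])
    return any(hamming(si, sj) < L - (d1 + d2) for si in good for sj in bad)
```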

A Linear-Time Solution for Fixed k

Is CLOSEST STRING fixed-parameter tractable with respect to k?

Use integer linear programming (ILP).

Lenstra: an ILP with a fixed number of variables can be solved in linear time (exponential space).

CLOSEST STRING as an ILP

Column types for k strings. For k = 3: (a,a,a)^t, (a,a,b)^t, (a,b,a)^t, (b,a,a)^t, (a,b,c)^t.

Number of column types: B(k) ≤ k!

Variables x_{t,φ}, with t a column type and φ ∈ Σ: the number of columns of type t whose corresponding character in the desired solution string of CLOSEST STRING is set to φ.

B(k)·k variables are needed.

Minimize

max_{1≤i≤k} Σ_t Σ_{φ ∈ Σ \ {φ_{t,i}}} x_{t,φ}

where φ_{t,i} denotes the alphabet symbol at the i-th entry of column type t.
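
The column types for a given k can be enumerated as set partitions of the k rows, which is why their number is the Bell number B(k). A minimal sketch with hypothetical names: it generates restricted growth strings and relabels each by decreasing symbol frequency (ties by first occurrence, an assumption) so that the output for k = 3 matches the five types listed above. It assumes k ≤ 26.

```python
from collections import Counter
from string import ascii_lowercase


def restricted_growth(k):
    """Yield one tuple of block labels per set partition of k positions."""
    def rec(prefix, used):
        if len(prefix) == k:
            yield tuple(prefix)
            return
        for v in range(used + 1):    # reuse an old label or open one new label
            yield from rec(prefix + [v], max(used, v + 1))
    yield from rec([], 0)


def column_types(k):
    """Normalized column types for k strings, as strings over a, b, c, ..."""
    types = []
    for labels in restricted_growth(k):
        counts = Counter(labels)
        first = {v: labels.index(v) for v in counts}
        ranking = sorted(counts, key=lambda v: (-counts[v], first[v]))
        rename = {v: ascii_lowercase[r] for r, v in enumerate(ranking)}
        types.append("".join(rename[v] for v in labels))
    return types
```

With B(k) types and k possible symbols per type, the ILP has B(k)·k variables, a number that depends only on k.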

Conclusion

Fixed-parameter tractability of CLOSEST STRING with respect to d and with respect to k

Improves on previous work for d-MISMATCH

DSS

CLOSEST SUBSTRING?

