
An Optimal Algorithm for Online Square Detection

Date post: 07-Jan-2016
Description:
An Optimal Algorithm for Online Square Detection. Gen-Huey Chen, Jin-Ju Hong, Hsueh-I Lu, National Taiwan University. Outline: the definitions of the square detection problem and the online square detection problem; the techniques of the algorithm in [Cro86] for the square detection problem.
Transcript
Page 1: An Optimal Algorithm for Online Square Detection

CPM 2005

An Optimal Algorithm for Online Square Detection

Gen-Huey Chen, Jin-Ju Hong, Hsueh-I Lu

National Taiwan University

Page 2: An Optimal Algorithm for Online Square Detection

Outline

The definitions of the square detection problem and the online square detection problem
The techniques of the algorithm in [Cro86] for the square detection problem
Our algorithm for the online square detection problem
Conclusion

Page 3: An Optimal Algorithm for Online Square Detection

Square Detection Problem

Square: a nonempty string of the form XX.
E.g. "abcabc" is a square.
"abcabca" is not a square.

Input: a string S
Square detection problem: Is there a square in S?
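The definition can be checked directly. The following is a minimal brute-force sketch in Python; the function names `is_square` and `contains_square` are illustrative and do not come from the slides, and the running time is cubic in the worst case, so it only serves to make the definition concrete.

```python
def is_square(w: str) -> bool:
    """Return True if w is a nonempty string of the form XX."""
    half, odd = divmod(len(w), 2)
    return len(w) > 0 and odd == 0 and w[:half] == w[half:]


def contains_square(s: str) -> bool:
    """Return True if some substring of s is a square (brute force)."""
    n = len(s)
    return any(
        is_square(s[start:start + 2 * p])
        for p in range(1, n // 2 + 1)       # p: length of X
        for start in range(n - 2 * p + 1)   # 0-indexed start of XX
    )


print(is_square("abcabc"))         # the slide's example of a square
print(is_square("abcabca"))        # odd length, so not of the form XX
print(contains_square("abcabca"))  # not a square itself, but it contains one
```

Note the distinction the last two calls illustrate: "abcabca" is not itself a square, yet the answer to the square detection problem for it is yes.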

Page 4: An Optimal Algorithm for Online Square Detection

Online Square Detection Problem

Leung, Peng, and Ting in COCOON'04
Input: a string S
Let m be the unknown smallest integer s.t. S[1..m] contains a square.
Online square detection problem: Determine m as soon as S[m] is read.
An O(m log^2 m)-time algorithm [LPT04]
An O(m log β)-time algorithm in our paper
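To make the online requirement concrete, here is a naive Python sketch that reads S one character at a time and reports m the moment S[m] arrives. It is not the O(m log β) algorithm of the talk: the name `first_square_end` is hypothetical, and the inner loop simply tries every possible half-length, which is cubic in the worst case.

```python
from typing import Iterable, Optional


def first_square_end(stream: Iterable[str]) -> Optional[int]:
    """Return the smallest m (1-indexed) such that S[1..m] contains a square,
    decided as soon as S[m] has been read; None if no square is ever seen."""
    s = []  # characters read so far
    for i, ch in enumerate(stream, start=1):
        s.append(ch)
        # S[1..i-1] is square-free at this point, so any new square ends at S[i].
        for p in range(1, i // 2 + 1):  # p: half-length of the candidate square
            if s[i - 2 * p:i - p] == s[i - p:i]:
                return i
    return None


print(first_square_end("abcabca"))  # 6: the square "abcabc" is complete at S[6]
print(first_square_end("abc"))      # None
```

Because the function consumes any iterable of characters, it never looks ahead of the character it is currently processing, which is the property the online formulation asks for.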

Page 5: An Optimal Algorithm for Online Square Detection

Algorithm in [Cro86] for Square Detection Problem

for k = 1 to p   // p: # of blocks
{ if a square ends in B_k then return YES; }
return NO;

B_1 B_2 B_3 B_4 ... B_p

Page 6: An Optimal Algorithm for Online Square Detection

f-factorization

Let d_k denote the starting position of the k-th block B_k.
B_k is S[d_k] if S[d_k] does not occur before d_k, or the longest prefix of S[d_k..n] that occurs before d_k.

E.g.
position: 1 2 3 4 5 6 7 8 9 10 11 ...
S =       a a a b b a b a b a a  ...
Blocks shown: B_1 B_2 B_3 B_4 B_5 B_6
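The block definition can be turned into a short quadratic-time Python sketch. It assumes that "occurs before d_k" means an occurrence that starts before d_k and may overlap position d_k, which is the usual reading for this kind of factorization but is not spelled out on the slide; the function name `f_factorization` is illustrative.

```python
from typing import List, Tuple


def f_factorization(s: str) -> List[Tuple[int, str]]:
    """Return the blocks as (d_k, B_k) pairs with 1-indexed start positions.

    Assumption: an earlier occurrence only has to START before d_k,
    so it is allowed to overlap the block it is compared against."""
    blocks = []
    n = len(s)
    d = 0  # 0-indexed start of the next block
    while d < n:
        best = 0
        for start in range(d):  # every earlier starting position
            length = 0
            while d + length < n and s[start + length] == s[d + length]:
                length += 1
            best = max(best, length)
        length = max(best, 1)  # a character not seen before forms its own block
        blocks.append((d + 1, s[d:d + length]))
        d += length
    return blocks


for d_k, block in f_factorization("aaabbababaa"):
    print(d_k, block)
```

Under this reading, the first eleven characters of the slide's example split as a | aa | b | b | ab | aba | a. The exact block boundaries drawn on the slide are not recoverable from the transcript, so this is a plausible reconstruction rather than a confirmed one.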

Page 7: An Optimal Algorithm for Online Square Detection

f-factorization (cont.)

A square ending in B_k is centered either in B_{k-1} or in B_k.

... B_{k-1} B_k

Page 8: An Optimal Algorithm for Online Square Detection

Square Ending in the k-th Block

Case 1. The square is entirely in the k-th block.
Case 2. The square begins in the (k-1)-st block.
  Case 2.1. The square is centered in the (k-1)-st block.
  Case 2.2. The square is centered in the k-th block.
Case 3. The square begins before the (k-1)-st block and is centered in the (k-1)-st or k-th block.

Page 9: An Optimal Algorithm for Online Square Detection

Our Algorithm for Online Square Detection Problem

for i = 1 to n   // n = |S|
{ compute the f-factorization of S[1..i];
  if a square ends at S[i] then return i; }
return NO-SQUARE;

Page 10: An Optimal Algorithm for Online Square Detection

Square Ending at S[i] in B_k

Case 1. The square is entirely in the k-th block.
Case 2. The square begins in the (k-1)-st block.
  Case 2.1. The square is centered in the (k-1)-st block.
  Case 2.2. The square is centered in the k-th block.
Case 3. The square begins before the (k-1)-st block and is centered in the (k-1)-st or k-th block.

Page 11: An Optimal Algorithm for Online Square Detection

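L(i_1, i_2, i)-square: a square ending at position i that begins at a position j with i_1 ≤ j < i_2 and is centered at a position c with i_1 ≤ c < i_2.

R(i_1, i_2, i)-square: a square ending at position i that begins at a position j with i_1 ≤ j < i_2 and is centered at a position c with i_2 ≤ c < i.

[Figure: two copies of S with the positions i_1, i_2, i, the center c and the starting position j marked, one for each kind of square.]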

Page 12: An Optimal Algorithm for Online Square Detection

Square Ending at S[i] in B_k

Case 1. The square is entirely in the k-th block.
Case 2. The square begins in the (k-1)-st block.
  Case 2.1. The square is centered in the (k-1)-st block.
  Case 2.2. The square is centered in the k-th block.
Case 3. The square begins before the (k-1)-st block and is centered in the (k-1)-st or k-th block.

L(d_{k-1}, d_k, i)-square (Case 2.1)
R(d_{k-1}, d_k, i)-square (Case 2.2)
R(1, d_{k-1}, i)-square (Case 3)

[Figure: S with the positions d_{k-1}, d_k, i marked for the first two kinds of square, and the positions 1, d_{k-1}, i marked for the third.]

Page 13: An Optimal Algorithm for Online Square Detection

Our Algorithm for Online Square Detection Problem

for i = 1 to n   // n = |S|
{ compute the f-factorization of S[1..i];
  let S[i] belong to B_k;
  if an L(d_{k-1}, d_k, i)-square is detected then return i;
  if an R(d_{k-1}, d_k, i)-square is detected then return i;
  if an R(1, d_{k-1}, i)-square is detected then return i;
}
return NO-SQUARE;

amortized O(log β) time

Page 14: An Optimal Algorithm for Online Square Detection

Longest Common Extensions

For positions i_1 ≤ i_2 ≤ i_3 in S:
X_R(i_1, i_2, i_3): longest common right extension of positions i_1 and i_2 with boundary i_3
X_L(i_2, i_3, i_1): longest common left extension of positions i_2 and i_3 with boundary i_1

E.g.
position: 1 2 3 4 5 6 7 8 9 10
S =       a b a b b a b a b a

X_R(3, 8, 10) = 2
X_L(4, 9, 2) = 3
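The two extension functions can be written as direct character-by-character scans. This Python sketch uses 1-indexed positions as on the slide; the names `xr` and `xl` are illustrative, and the scans take time proportional to the answer rather than the constant time the talk later obtains through preprocessing.

```python
def xr(s: str, a: int, b: int, bound: int) -> int:
    """X_R(a, b, bound): length of the longest common right extension of
    positions a <= b, where the scan from b may not pass position bound.
    Positions are 1-indexed and bound <= len(s) is assumed."""
    n = 0
    while b + n <= bound and s[a + n - 1] == s[b + n - 1]:
        n += 1
    return n


def xl(s: str, a: int, b: int, bound: int) -> int:
    """X_L(a, b, bound): the mirror image, scanning leftwards from a <= b,
    where the scan from a may not drop below position bound >= 1."""
    n = 0
    while a - n >= bound and s[a - n - 1] == s[b - n - 1]:
        n += 1
    return n


S = "ababbababa"
print(xr(S, 3, 8, 10))  # 2, matching the slide's X_R(3, 8, 10)
print(xl(S, 4, 9, 2))   # 3, matching the slide's X_L(4, 9, 2)
```

In the second example the characters at positions 1 and 6 also match, so the boundary 2 is what caps the answer at 3.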

Page 15: An Optimal Algorithm for Online Square Detection

Head Extension Function: X_R(1, j, i)

If the string S is read character by character, then in the i-th iteration, for all j ≤ i, X_R(1, j, i) can be computed in O(1) time with O(i)-time preprocessing in total.

E.g.
position j:     1  2 3 4 5 6 7 8 9 10
S =             a  b a b b a b a b a
X_R(1, j, 10):  10 0 2 0 0 4 0 3 0 1

We call X_R(1, j, i) the head extension function.
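For a fixed i, the values X_R(1, j, i) are the longest common prefixes of S[1..i] with each of its suffixes, which the classical Z-algorithm computes offline in linear time. The Python sketch below uses that offline algorithm as a stand-in; it is not the online, character-by-character method the slide refers to, and the name `head_extension` is illustrative.

```python
from typing import List


def head_extension(s: str) -> List[int]:
    """Return [X_R(1, j, n) for j = 1..n] with n = len(s), computed with the
    classical (offline) Z-algorithm in O(n) total time."""
    n = len(s)
    z = [0] * n
    if n:
        z[0] = n
    left = right = 0  # s[left:right] is the rightmost known match with a prefix
    for j in range(1, n):
        if j < right:
            z[j] = min(right - j, z[j - left])  # reuse an earlier answer
        while j + z[j] < n and s[z[j]] == s[j + z[j]]:
            z[j] += 1  # extend by explicit comparison
        if j + z[j] > right:
            left, right = j, j + z[j]
    return z


print(head_extension("ababbababa"))  # the slide's row: 10 0 2 0 0 4 0 3 0 1
```

Each entry at 0-indexed position j - 1 of the result corresponds to X_R(1, j, 10) in the slide's example.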

Page 16: An Optimal Algorithm for Online Square Detection

L(i_1, i_2, i)-square

[Figure: the pattern Y Z Y Z on S, with the positions i_1, j, i_2, i marked in that order.]

Page 17: An Optimal Algorithm for Online Square Detection

L(i_1, i_2, i)-square

[ML84] S has an L(i_1, i_2, i)-square if and only if there is an index j with i_1 ≤ j < i_2 such that X_R(j, i_2, i) = |S[i_2..i]| and X_L(j-1, i_2-1, i_1) + X_R(j, i_2, i) ≥ |S[j..i_2-1]|.

If S[1..i-1] contains no square, the inequality holds with equality (=).

[Figure: the pattern Y Z Y Z on S, with the positions i_1, j, i_2, i marked in that order.]
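The [ML84] condition for an L(i_1, i_2, i)-square translates almost literally into code. The Python sketch below uses naive scans for X_R and X_L, assumes 1-indexed positions with i_1 < i_2 ≤ i, and is meant for the setting of the talk in which S[1..i-1] contains no square; the names `xr`, `xl` and `has_l_square` are illustrative.

```python
def xr(s: str, a: int, b: int, bound: int) -> int:
    """X_R(a, b, bound): common right extension of positions a <= b (1-indexed)."""
    n = 0
    while b + n <= bound and s[a + n - 1] == s[b + n - 1]:
        n += 1
    return n


def xl(s: str, a: int, b: int, bound: int) -> int:
    """X_L(a, b, bound): common left extension of positions a <= b (1-indexed)."""
    n = 0
    while a - n >= bound and s[a - n - 1] == s[b - n - 1]:
        n += 1
    return n


def has_l_square(s: str, i1: int, i2: int, i: int) -> bool:
    """Test the stated condition for every index j with i1 <= j < i2."""
    for j in range(i1, i2):
        right = xr(s, j, i2, i)
        reaches_i = right == i - i2 + 1                             # |S[i2..i]|
        covers_period = xl(s, j - 1, i2 - 1, i1) + right >= i2 - j  # |S[j..i2-1]|
        if reaches_i and covers_period:
            return True
    return False


S = "abacabac"  # S[1..7] is square-free; the square S[1..8] is centered at 4
print(has_l_square(S, 1, 6, 8))  # center 4 lies in [1, 6)
print(has_l_square(S, 1, 4, 8))  # center 4 does not lie in [1, 4)
```

In the first call the witness is j = 2, where the tested period is i_2 - j = 4.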

Page 18: An Optimal Algorithm for Online Square Detection

Detecting L(d_{k-1}, d_k, i)-squares

Let z(j) = |S[j..d_k-1]| - X_L(j-1, d_k-1, d_{k-1}) for all j in B_{k-1}.
In the i-th iteration: is there an index j in B_{k-1} s.t. X_R(j, d_k, i) = z(j)?

[Figure: the pattern Y Z Y Z on S with the positions d_{k-1}, j, d_k, i and the length z(j) marked; the second Z is labelled "= Z ?".]

Page 19: An Optimal Algorithm for Online Square Detection

In the d_k-th iteration (preprocessing)

Compute z(j) for all j in B_{k-1}.
Build the suffix tree of B_{k-1}$.
For all nodes u, compute min{ z(j) | j ↔ a leaf in u's subtree }.

O(|B_{k-1}| log β) time

[Figure: the pattern Y Z Y on S with the positions d_{k-1}, j, d_k, i and the length z(j) marked, next to a suffix-tree node u whose subtree leaves carry the values z(j).]

Page 20: An Optimal Algorithm for Online Square Detection

In the i-th iteration

If |S[d_k..i]| equals the value stored in u, then a square ends at position i.

[Figure: the pattern Y Z Y Z on S with the positions d_{k-1}, j, d_k, i and the length z(j) marked, next to the suffix-tree node u, its stored value z(j), and the string S[d_k..i].]
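The preprocessing and the per-iteration test can be imitated without a suffix tree. In the Python sketch below, a linear scan over B_{k-1} plays the role of locating the node u reached by S[d_k..i], so it shows the logic but not the O(log β) bound. The block boundaries are passed in as parameters `i1` (for d_{k-1}) and `i2` (for d_k), and all function names are hypothetical.

```python
from typing import Dict


def xl(s: str, a: int, b: int, bound: int) -> int:
    """X_L(a, b, bound): common left extension of positions a <= b (1-indexed)."""
    n = 0
    while a - n >= bound and s[a - n - 1] == s[b - n - 1]:
        n += 1
    return n


def preprocess(s: str, i1: int, i2: int) -> Dict[int, int]:
    """z(j) = |S[j..i2-1]| - X_L(j-1, i2-1, i1) for every j in [i1, i2)."""
    return {j: (i2 - j) - xl(s, j - 1, i2 - 1, i1) for j in range(i1, i2)}


def l_square_ends_at(s: str, i1: int, i2: int, i: int, z: Dict[int, int]) -> bool:
    """Stand-in for the suffix-tree lookup: take the minimum z(j) over all j
    whose suffix of the block S[i1..i2-1] starts with S[i2..i]."""
    block = s[i1 - 1:i2 - 1]
    w = s[i2 - 1:i]
    stored = min(
        (z[j] for j in range(i1, i2) if block.startswith(w, j - i1)),
        default=None,
    )
    return stored == len(w)


S = "xabcabc"
z = preprocess(S, 2, 6)
print(l_square_ends_at(S, 2, 6, 6, z))  # no square ends at position 6
print(l_square_ends_at(S, 2, 6, 7, z))  # the square S[2..7] ends at position 7
```

Storing the minimum is enough because, as long as no square has been reported earlier, every matching j has z(j) at least |S[d_k..i]|, so the minimum equals that length exactly when some j completes a square.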

Page 21: An Optimal Algorithm for Online Square Detection

R(i_1, i_2, i)-square

[Figure: the pattern Y Z Y Z on S, with the positions i_1, i_2, j, i marked in that order.]

Page 22: An Optimal Algorithm for Online Square Detection

R(i_1, i_2, i)-square

[ML84] S has an R(i_1, i_2, i)-square if and only if there is an index j with i_2 < j < i such that X_R(i_2, j+1, i) = |S[j+1..i]| and X_L(i_2-1, j, i_1) + X_R(i_2, j+1, i) ≥ |S[i_2..j]|.

If S[1..i-1] contains no square, the inequality holds with equality (=).

[Figure: the pattern Y Z Y Z on S, with the positions i_1, i_2, j, i marked in that order.]
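The R-square condition can be sketched the same way as its L-square counterpart. The Python code below compares positions i_2 and j+1 in both occurrences of X_R, since both extensions must use the same period |S[i_2..j]|. It uses naive scans, 1-indexed positions with i_1 < i_2 < i, and illustrative names `xr`, `xl` and `has_r_square`.

```python
def xr(s: str, a: int, b: int, bound: int) -> int:
    """X_R(a, b, bound): common right extension of positions a <= b (1-indexed)."""
    n = 0
    while b + n <= bound and s[a + n - 1] == s[b + n - 1]:
        n += 1
    return n


def xl(s: str, a: int, b: int, bound: int) -> int:
    """X_L(a, b, bound): common left extension of positions a <= b (1-indexed)."""
    n = 0
    while a - n >= bound and s[a - n - 1] == s[b - n - 1]:
        n += 1
    return n


def has_r_square(s: str, i1: int, i2: int, i: int) -> bool:
    """Test the stated condition for every index j with i2 < j < i."""
    for j in range(i2 + 1, i):
        right = xr(s, i2, j + 1, i)
        reaches_i = right == i - j                                  # |S[j+1..i]|
        covers_period = xl(s, i2 - 1, j, i1) + right >= j - i2 + 1  # |S[i2..j]|
        if reaches_i and covers_period:
            return True
    return False


S = "abacabac"  # the square S[1..8] begins at 1 and is centered at 4
print(has_r_square(S, 1, 3, 8))  # begins in [1, 3), centered at or after 3
print(has_r_square(S, 2, 3, 8))  # the square begins before i1 = 2
```

In the first call the witness is j = 6, where the tested period is j - i_2 + 1 = 4.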

Page 23: An Optimal Algorithm for Online Square Detection

Detecting R(d_{k-1}, d_k, i)-square

Let z(j) = |S[d_k..j]| - X_L(d_k-1, j, d_{k-1}) for all j in B_k.
Insert the position j into the set of j + z(j).
For all j in the set of i: X_R(d_k, j+1, i) = z(j)?

insert j: amortized O(log β) time

[Figure: the pattern Y Z Y Z on S with the positions d_{k-1}, d_k, j, i and the length z(j) marked; position j is inserted into the set of j + z(j), and the second Z is labelled "= Z ?".]
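The "set of j + z(j)" idea is a bucketing scheme: each position j predicts the only position at which it could complete a square, and is re-examined only when that position is read. The Python sketch below assumes naive scans for the extension functions, so it does not achieve the amortized O(log β) bound. Its parameters `i1` and `i2` stand for d_{k-1} and d_k, and the name `first_r_square_end` is hypothetical.

```python
from collections import defaultdict
from typing import Optional


def xr(s: str, a: int, b: int, bound: int) -> int:
    """X_R(a, b, bound): common right extension of positions a <= b (1-indexed)."""
    n = 0
    while b + n <= bound and s[a + n - 1] == s[b + n - 1]:
        n += 1
    return n


def xl(s: str, a: int, b: int, bound: int) -> int:
    """X_L(a, b, bound): common left extension of positions a <= b (1-indexed)."""
    n = 0
    while a - n >= bound and s[a - n - 1] == s[b - n - 1]:
        n += 1
    return n


def first_r_square_end(s: str, i1: int, i2: int) -> Optional[int]:
    """Scan positions after i2 and return the first i at which a bucketed
    candidate j satisfies X_R(i2, j+1, i) = z(j); None if there is none."""
    buckets = defaultdict(list)  # predicted end position -> candidate indices j
    for i in range(i2 + 1, len(s) + 1):
        # Candidates filed under i satisfy i - j == z(j) by construction.
        for j in buckets.pop(i, []):
            if xr(s, i2, j + 1, i) == i - j:
                return i
        # Treat position i as a new candidate j and file it under j + z(j).
        z = (i - i2 + 1) - xl(s, i2 - 1, i, i1)
        if z > 0:
            buckets[i + z].append(i)
    return None


print(first_r_square_end("abacabac", 1, 3))  # 8: candidate j = 6 was filed under 8
print(first_r_square_end("abacabad", 1, 3))  # None
```

The design point is that a candidate j is looked at exactly twice, once when it is inserted and once when its predicted end position arrives, instead of being rechecked in every iteration.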

Page 24: An Optimal Algorithm for Online Square Detection

Computing X_L(d_k-1, j, d_{k-1})

|S[g..d_k-1]| = min( |S[d_{k-1}..d_k-1]|, |S[d_k..j]| )

For all v with g ≤ v < d_k, X_L(v, d_k-1, g) can be computed in O(1) time using the technique of computing the head extension function.

[Figure: the pattern Y Z Y on S with the positions d_{k-1}, g, v, d_k, j, i marked.]

Page 25: An Optimal Algorithm for Online Square Detection

Computing X_L(d_k-1, j, d_{k-1}) (cont.)

Let F(j) denote the longest suffix of S[d_k..j] that is also a substring of S[g..d_k-1].

X_L(d_k-1, j, d_{k-1}) = |F(j)|                           if y = d_k-1
X_L(d_k-1, j, d_{k-1}) = min( |F(j)|, X_L(y, d_k-1, g) )  otherwise

[Figure: the pattern Y Z Y on S with the positions d_{k-1}, g, y, d_k, j, i and the substring F(j) marked.]

Page 26: An Optimal Algorithm for Online Square Detection

Time Complexity

for i = 1 to n   // n = |S|
{ compute the f-factorization of S[1..i];
  let S[i] belong to B_k;
  if an L(d_{k-1}, d_k, i)-square is detected then return i;
  if an R(d_{k-1}, d_k, i)-square is detected then return i;
  if an R(1, d_{k-1}, i)-square is detected then return i;
}
return NO-SQUARE;

amortized O(log β) time

Page 27: An Optimal Algorithm for Online Square Detection

Conclusion

Each of those O(log β) terms comes from the traversal in a suffix tree of a string with O(β) distinct characters.
Expected time: O(m)
Is it possible to reduce the running time to worst-case O(m) time for a general alphabet?

