Tuned Boyer Moore Algorithm Fast string searching , HUME A. and SUNDAY D.M., Software - Practice & Experience 21(11), 1991, pp. 1221-1248. Adviser: R. C. T. Lee Speaker: C. W. Cheng National Chi Nan University

Tuned Boyer Moore Algorithm

Fast string searching, HUME A. and SUNDAY D.M., Software - Practice & Experience 21(11), 1991, pp. 1221-1248.

Adviser: R. C. T. Lee

Speaker: C. W. Cheng

National Chi Nan University

Problem Definition

Input: a text string T with length n and a pattern string P with length m.

Output: all occurrences of P in T.
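
As a reference point for the problem statement above, a brute-force search can be sketched as follows. The function name `naive_occurrences` is hypothetical, and positions are reported 0-based, whereas the slides count from 1.

```python
def naive_occurrences(text: str, pattern: str) -> list[int]:
    """Return every 0-based start position where pattern occurs in text."""
    n, m = len(text), len(pattern)
    # Try every window of length m; overlapping occurrences are all reported.
    return [s for s in range(n - m + 1) if text[s:s + m] == pattern]


print(naive_occurrences("GCGAGCAGACGTGCGAGTACG", "AGCAGAC"))  # [3]
```

Any faster algorithm for this problem should return the same list of positions.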

Definition

• T_s : the character of T aligned with the first character of the pattern P.

• P_l : the first character of the pattern P, which is aligned with T_s.

• T_j : the character at the jth position of T.

• P_i : the character at the ith position of P.

• P_f : the last character of P.

• n : the length of T.

• m : the length of P.

Rule 2-2: 1-Suffix Rule (A Special Version of Rule 2)

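• Consider the 1-suffix x, the last character of the current window of T. We may apply Rule 2-2 now: slide P to the right until a character x of P is aligned with it.

[Figure: T and P, each with the character x marked.]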

Introduction

• A simplification of the Boyer-Moore algorithm.

• Uses only the bad-character shift.

• Easy to implement.

• Very fast in practice.

• Uses Rule 2-2: 1-Suffix Rule.

Tuned Boyer Moore Algorithm

• In this algorithm, we always focus on the last character of the current window of T and slide the pattern so that one of its characters matches that character.

Tuned Boyer Moore Algorithm Rule

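Since T_{s+m-1} ≠ P_f, we move the pattern P to the right so that P_i is aligned with T_{s+m-1}, where i is the largest position with i < m and P_i = T_{s+m-1}. The pattern thus shifts (m-i) positions to the right (m positions if no such i exists). We repeat this until T_{s+m-1} = P_f.

[Figure: T with positions s and s+m-1 marked and characters x, z, y; P with characters z, x, y and positions 1, i, f marked, shown at three successive shifts.]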
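
The shift rule on this slide can be sketched directly, without any precomputed table. This is only an illustration: the function name `shift_for_window` is hypothetical, and the code scans P from right to left for the largest position i < m whose character equals the last character of the window.

```python
def shift_for_window(pattern: str, window_last: str) -> int:
    """Shift given by the rule when the current window of T ends in window_last."""
    m = len(pattern)
    # Try the 1-based positions i = m-1, m-2, ..., 1 (the last position is excluded).
    for i in range(m - 1, 0, -1):
        if pattern[i - 1] == window_last:  # P_i in the slides' 1-based notation
            return m - i
    # The character does not occur in P_1 .. P_{m-1}: move past the whole window.
    return m


print(shift_for_window("AGCAGAC", "A"))  # 1
print(shift_for_window("AGCAGAC", "T"))  # 7
```

Scanning P for every window costs O(m) per shift, which is why the shifts are normally precomputed once per character.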

Tuned Boyer Moore Preprocessing Table

• In this algorithm, we construct a table as follows. Let x be a character of the alphabet. If x occurs in P_1 to P_{m-1}, we record the distance from its rightmost occurrence there to the last position of P, that is, m-i where i is the largest position with i < m and P_i = x. If x does not occur in P_1 to P_{m-1}, we record m.

Tuned Boyer Moore Preprocessing Table

• Example: P=AGCAGAC (m = 7)

m-i    6 5 4 3 2 1
P      A G C A G A C

        A  C  G  T
tbmBC   1  4  2  7
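
A minimal sketch of how this table could be built, assuming the alphabet is passed in explicitly. The function name `build_tbm_bc` and the `alphabet` parameter are illustrative choices, not names from the slides.

```python
def build_tbm_bc(pattern: str, alphabet: str) -> dict[str, int]:
    """Shift table: distance from the rightmost occurrence of each character
    in P_1 .. P_{m-1} to the last position of P, or m if it does not occur."""
    m = len(pattern)
    table = {c: m for c in alphabet}       # default: character absent
    for i, c in enumerate(pattern[:-1]):   # the last position is excluded
        table[c] = m - 1 - i               # later occurrences overwrite earlier ones
    return table


print(build_tbm_bc("AGCAGAC", "ACGT"))  # {'A': 1, 'C': 4, 'G': 2, 'T': 7}
```

Filling the defaults takes O(σ) steps and the scan over P takes O(m), which matches the preprocessing cost given later in the slides.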

Example

• Text string T=GCGAGCAGACGTGCGAGTACG

• Pattern string

P=AGCAGAC

G C G A G C A G A C G T G C G A G T A C G
A G C A G A C

        A  C  G  T
tbmBC   1  4  2  7

Example

• Text string T=GCGAGCAGACGTGCGAGTACG

• Pattern string

P=AGCAGAC

G C G A G C A G A C G T G C G A G T A C G
A G C A G A C

        A  C  G  T
tbmBC   1  4  2  7

tbmBC[A]=1, shift=1

Example

• Text string T=GCGAGCAGACGTGCGAGTACG

• Pattern string

P=AGCAGAC

G C G A G C A G A C G T G C G A G T A C G
  A G C A G A C

        A  C  G  T
tbmBC   1  4  2  7

Example

• Text string T=GCGAGCAGACGTGCGAGTACG

• Pattern string

P=AGCAGAC

G C G A G C A G A C G T G C G A G T A C G
  A G C A G A C

        A  C  G  T
tbmBC   1  4  2  7

tbmBC[G]=2, shift=2

Example

• Text string T=GCGAGCAGACGTGCGAGTACG

• Pattern string

P=AGCAGAC

G C G A G C A G A C G T G C G A G T A C G
      A G C A G A C

        A  C  G  T
tbmBC   1  4  2  7

Example

• Text string T=GCGAGCAGACGTGCGAGTACG

• Pattern string

P=AGCAGAC

G C G A G C A G A C G T G C G A G T A C G
      A G C A G A C

        A  C  G  T
tbmBC   1  4  2  7

match: the last character of the window equals P_f (C), so the whole window is compared with P

Example

• Text string T=GCGAGCAGACGTGCGAGTACG

• Pattern string

P=AGCAGAC

G C G A G C A G A C G T G C G A G T A C G
      A G C A G A C

        A  C  G  T
tbmBC   1  4  2  7

exact match

tbmBC[C]=4, shift=4

Example

• Text string T=GCGAGCAGACGTGCGAGTACG

• Pattern string

P=AGCAGAC

G C G A G C A G A C G T G C G A G T A C G
              A G C A G A C

        A  C  G  T
tbmBC   1  4  2  7

Example

• Text string T=GCGAGCAGACGTGCGAGTACG

• Pattern string

P=AGCAGAC

G C G A G C A G A C G T G C G A G T A C G
              A G C A G A C

        A  C  G  T
tbmBC   1  4  2  7

match: the last character of the window equals P_f (C), so the whole window is compared with P

Example

• Text string T=GCGAGCAGACGTGCGAGTACG

• Pattern string

P=AGCAGAC

G C G A G C A G A C G T G C G A G T A C G
              A G C A G A C

        A  C  G  T
tbmBC   1  4  2  7

mismatch

tbmBC[C]=4, shift=4

Example

• Text string T=GCGAGCAGACGTGCGAGTACG

• Pattern string

P=AGCAGAC

G C G A G C A G A C G T G C G A G T A C G
                      A G C A G A C

        A  C  G  T
tbmBC   1  4  2  7

Example

• Text string T=GCGAGCAGACGTGCGAGTACG

• Pattern string

P=AGCAGAC

G C G A G C A G A C G T G C G A G T A C G
                      A G C A G A C

        A  C  G  T
tbmBC   1  4  2  7

tbmBC[T]=7, shift=7

Example

• Text string T=GCGAGCAGACGTGCGAGTACG

• Pattern string

P=AGCAGAC

G C G A G C A G A C G T G C G A G T A C G
                                    A G C A G A C

        A  C  G  T
tbmBC   1  4  2  7

P now extends past the end of T, so the search ends.
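
The walkthrough above can be reproduced with a short sketch. It follows the steps shown on these slides (check the last character of the window, compare the whole window on a match, then shift by the table entry) and does not attempt any further tuning from the paper, which the slides do not show. The function name `tuned_bm_search` and its return values are assumptions made for illustration; positions are 0-based.

```python
def tuned_bm_search(text: str, pattern: str):
    """Return (match positions, list of shifts taken), following the slides."""
    m, n = len(pattern), len(text)
    if m == 0:
        raise ValueError("pattern must be non-empty")

    # Shift table over P_1 .. P_{m-1}; characters not in the table shift by m.
    table = {}
    for i, c in enumerate(pattern[:-1]):
        table[c] = m - 1 - i

    matches, shifts = [], []
    s = 0
    while s + m <= n:                      # the window must fit inside the text
        last = text[s + m - 1]             # last character of the current window
        if last == pattern[-1] and text[s:s + m] == pattern:
            matches.append(s)
        step = table.get(last, m)
        shifts.append(step)
        s += step
    return matches, shifts


matches, shifts = tuned_bm_search("GCGAGCAGACGTGCGAGTACG", "AGCAGAC")
print(matches)  # [3]
print(shifts)   # [1, 2, 4, 4, 7]
```

The shifts 1, 2, 4, 4, 7 are the ones taken on the slides, and the single occurrence starts at 0-based position 3, which is position 4 in the slides' 1-based numbering.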

Time complexity

• Preprocessing phase: O(m + σ) time and O(σ) space, where σ is the size of the alphabet.

• Searching phase: O(mn) time in the worst case.
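
To make the O(mn) bound concrete, the following sketch counts character comparisons for a right-to-left window check combined with the table shift. The function name `count_comparisons` and the two test inputs are illustrative assumptions, not taken from the slides.

```python
def count_comparisons(text: str, pattern: str) -> int:
    """Count character comparisons made while searching text for pattern."""
    m, n = len(pattern), len(text)
    table = {}
    for i, c in enumerate(pattern[:-1]):
        table[c] = m - 1 - i

    comparisons = 0
    s = 0
    while s + m <= n:
        j = m - 1
        while j >= 0:                      # compare the window right to left
            comparisons += 1
            if text[s + j] != pattern[j]:
                break
            j -= 1
        s += table.get(text[s + m - 1], m)
    return comparisons


n, m = 20, 5
print(count_comparisons("a" * n, "a" * m))  # 80, i.e. (n - m + 1) * m
print(count_comparisons("b" * n, "a" * m))  # 4, i.e. n / m
```

With T = a...a and P = a...a every window matches in full and the shift is only 1, giving (n-m+1)·m comparisons. When the last character of each window does not occur in P, a single comparison per window and a shift of m suffice.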

Thank you for listening.

