
String matching algorithms

Date post: 20-Jan-2017
Upload: mahdi-esmailoghli
String Matching Algorithms Mahdi Esmail oghli [email protected] Dr. Bagheri Summer 2015 Amirkabir University of Technology
Transcript
Page 1: String matching algorithms

String Matching Algorithms

Mahdi Esmail oghli

Dr. Bagheri

Summer 2015

Amirkabir University of Technology

Page 2: String matching algorithms

“Little String”: The Pattern

“Big String”: The Text

Where is it?

Page 3: String matching algorithms

For Example

Pattern: “CO”

Text: “COCACOLA”

Finding Pattern in Text

Position: 1 2 3 4 5 6 7 8

Output: 1 5 (1-based positions; equivalently, shifts 0 and 4)
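
As a quick check of the example above, here is a minimal Python sketch that lists every occurrence using the built-in `str.find`. The helper name `find_all` is an assumption made for illustration and is not part of the slides. Python indexes from 0, so the result is a list of shifts; adding 1 gives the 1-based positions.

```python
def find_all(text, pattern):
    """Return the 0-based start index of every occurrence of pattern in text."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        # Resume one character later so overlapping matches are also found.
        start = text.find(pattern, start + 1)
    return positions


shifts = find_all("COCACOLA", "CO")
print(shifts)                   # [0, 4]
print([s + 1 for s in shifts])  # [1, 5]
```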

Page 4: String matching algorithms

Applications

• Searching Systems

• Genetics (BLAST)

Page 5: String matching algorithms

String Matching Algorithms

NAIVE Shifting Algorithm

Rabin-Karp Algorithm

Finite Automaton String Matching

Knuth-Morris-Pratt Algorithm

Page 6: String matching algorithms

“Naive Shifting Algorithm”

Page 7: String matching algorithms

NAIVE Shifting Algorithm

NAIVE-String-Matcher(T, P)

n = T.length
m = P.length
for s = 0 to n-m
    if P[1..m] == T[s+1..s+m]
        print “Pattern occurs with shift” s
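
The pseudocode above translates almost line for line into Python. This is a minimal sketch: it assumes 0-based strings and returns the shifts in a list instead of printing them, and the function name is chosen here for illustration.

```python
def naive_string_matcher(text, pattern):
    """Try every shift s and compare the pattern against text[s:s+m]."""
    n, m = len(text), len(pattern)
    shifts = []
    for s in range(n - m + 1):          # s = 0 .. n-m
        if text[s:s + m] == pattern:    # P[1..m] == T[s+1..s+m]
            shifts.append(s)
    return shifts


print(naive_string_matcher("COCACOLA", "CO"))  # [0, 4]
```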

Page 8: String matching algorithms

Order of “NAIVE Shifting Algorithm”

O((n-m+1) * m)

It is not a very efficient matching algorithm: in the worst case, each of the n-m+1 shifts compares all m pattern characters
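
To make the bound concrete, the sketch below counts character comparisons on an input chosen to be a worst case: a text of all "a" and a pattern that mismatches only on its last character. The function name and the sample sizes are assumptions made for illustration.

```python
def count_comparisons(text, pattern):
    """Count character comparisons made by the naive matcher (stops at first mismatch)."""
    n, m = len(text), len(pattern)
    comparisons = 0
    for s in range(n - m + 1):
        for j in range(m):
            comparisons += 1
            if text[s + j] != pattern[j]:
                break
    return comparisons


n, m = 10, 3
worst = count_comparisons("a" * n, "a" * (m - 1) + "b")
print(worst, (n - m + 1) * m)  # 24 24
```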

Page 9: String matching algorithms

NAIVE Shifting Algorithm

Pattern “CO” in text “COCACOLA”

Yes (shift 0)   Yes (shift 4)

Page 10: String matching algorithms

“Text”“Text”

Page 11: String matching algorithms

Text: CWERUFIVNWQERUQONACRUIRIUQVWCNTQOMXRHQG

ERYCUGOFRUWFEQWUFYWOAMFIHAUFIHFGERGOUD

CJNWUVNWIVNCKMXOQIJXWOEUHFNWCVKAMSCWDIVN

WUEBVFNQIONVOWNVWIUBVEIWLNCMIQWC9URBGUR

IPMOQUNBFUYIWBEIUONFMPOQAFMQIONSDFCOYQWB

CO

Page 12: String matching algorithms

“The Rabin-Karp Algorithm”

Page 13: String matching algorithms

The Rabin-Karp Algorithm

Rabin-Karp-Matcher(T, P, d, q)

n = T.length
m = P.length
h = d^(m-1) mod q
p = 0
t_0 = 0
for i = 1 to m                      // Preprocessing
    p = (d*p + P[i]) mod q
    t_0 = (d*t_0 + T[i]) mod q
for s = 0 to n-m                    // Matching
    if p == t_s
        if P[1..m] == T[s+1..s+m]
            print “Pattern occurs with shift” s
    if s < n-m
        t_(s+1) = (d*(t_s - T[s+1]*h) + T[s+m+1]) mod q
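
A possible Python rendering of the matcher is sketched below. The defaults `d=256` and `q=101` are assumptions chosen for illustration, characters are mapped to numbers with `ord`, and shifts are returned in a list. Every hash hit is verified with a direct comparison, so spurious hits never reach the output.

```python
def rabin_karp_matcher(text, pattern, d=256, q=101):
    """Rabin-Karp with a rolling hash; d is the radix, q the modulus (both assumed)."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    h = pow(d, m - 1, q)                 # d^(m-1) mod q
    p = 0                                # hash of the pattern
    t = 0                                # hash of the current text window
    for i in range(m):                   # preprocessing
        p = (d * p + ord(pattern[i])) % q
        t = (d * t + ord(text[i])) % q
    shifts = []
    for s in range(n - m + 1):           # matching
        if p == t and text[s:s + m] == pattern:
            shifts.append(s)
        if s < n - m:
            # Drop text[s], append text[s+m]; Python's % keeps the result non-negative.
            t = (d * (t - ord(text[s]) * h) + ord(text[s + m])) % q
    return shifts


print(rabin_karp_matcher("COCACOLA", "CO"))  # [0, 4]
```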

Page 14: String matching algorithms

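Text: 2 3 5 9 0 2 3 1 4 1 5 2 6 7 3 9 9 2 1

Pattern: 3 1 4 1 5, and 31415 mod 13 = 7

Hash value mod 13 of each length-5 window of the text (shifts 0 to 14):

8 9 3 11 0 1 7 8 4 5 10 11 7 9 11

Two windows hash to 7: shift 6 (31415, a valid match) and shift 12 (67399, a spurious hit)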
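
The window values in this example can be reproduced with the rolling-hash update on its own. The sketch below assumes radix d = 10 and modulus q = 13, as in the example; the function name `window_hashes` is hypothetical.

```python
def window_hashes(digits, m, d=10, q=13):
    """Return the hash mod q of every length-m window, computed by rolling updates."""
    h = pow(d, m - 1, q)
    t = 0
    for i in range(m):
        t = (d * t + digits[i]) % q
    hashes = [t]
    for s in range(len(digits) - m):
        # Remove the leading digit, shift left by one place, add the new digit.
        t = (d * (t - digits[s] * h) + digits[s + m]) % q
        hashes.append(t)
    return hashes


text = [int(c) for c in "2359023141526739921"]
hashes = window_hashes(text, 5)
print(hashes)
# [8, 9, 3, 11, 0, 1, 7, 8, 4, 5, 10, 11, 7, 9, 11]
print([s for s, v in enumerate(hashes) if v == 31415 % 13])  # [6, 12]
```

Only shift 6 survives the character-by-character check; shift 12 is the spurious hit.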

Page 15: String matching algorithms

“String Matching With Finite Automata”

Page 16: String matching algorithms

Finite Automaton String Matching

Many string-matching algorithms build a finite automaton, because automata are efficient:

They examine each text character EXACTLY ONCE, taking constant time for each character

Page 17: String matching algorithms

Finite Automaton String Matching

O(n) matching time, after preprocessing the pattern to build the automaton

Page 18: String matching algorithms

Construct String-Matching Automaton

Pattern: ababaca

[Figure: state-transition diagram of the automaton, with states 0 to 7 and edges labeled a, b, c]

i      -  1  2  3  4  5  6  7  8  9  10  11

T[i]   -  a  b  a  b  a  b  a  c  a  b   a

State  0  1  2  3  4  5  4  5  6  7  2   3
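
One way to build such an automaton is directly from its definition: from state q on character a, move to the length of the longest prefix of the pattern that is a suffix of P[1..q] followed by a. The sketch below is a simple, unoptimized construction under that definition. The function name and the dictionary representation are assumptions made for illustration.

```python
def compute_transition_function(pattern, alphabet):
    """Map (state, char) -> next state for the string-matching automaton."""
    m = len(pattern)
    delta = {}
    for q in range(m + 1):
        for a in alphabet:
            candidate = pattern[:q] + a
            k = min(m, q + 1)
            # Shrink k until pattern[:k] is a suffix of what has been read.
            while k > 0 and not candidate.endswith(pattern[:k]):
                k -= 1
            delta[(q, a)] = k
    return delta


delta = compute_transition_function("ababaca", "abc")
state = 0
states = [state]
for ch in "abababacaba":
    state = delta[(state, ch)]
    states.append(state)
print(states)  # [0, 1, 2, 3, 4, 5, 4, 5, 6, 7, 2, 3]
```

The printed sequence is the State row of the table above; state 7 after the ninth character marks a full match.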

Page 19: String matching algorithms

Finite Automaton Matcher

Finite-Automaton-Matcher(T, δ, m)

n = T.length
q = 0
for i = 1 to n
    q = δ(q, T[i])
    if q == m
        print “Pattern occurs with shift” i-m
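
A minimal Python sketch of the matcher follows. It assumes the transition function is stored as a dictionary keyed by (state, character), and it treats any character that does not occur in the pattern as a transition to state 0. The helper `build_delta` is a hypothetical, unoptimized builder included only so the example runs on its own.

```python
def build_delta(pattern):
    """Hypothetical helper: transition table over the pattern's own characters."""
    m = len(pattern)
    delta = {}
    for q in range(m + 1):
        for a in set(pattern):
            k = min(m, q + 1)
            while k > 0 and not (pattern[:q] + a).endswith(pattern[:k]):
                k -= 1
            delta[(q, a)] = k
    return delta


def finite_automaton_matcher(text, delta, m):
    """Feed each text character through the automaton exactly once."""
    q = 0
    shifts = []
    for i, ch in enumerate(text, start=1):   # i = 1 .. n, as in the pseudocode
        q = delta.get((q, ch), 0)            # unknown characters reset to state 0
        if q == m:
            shifts.append(i - m)
    return shifts


pattern = "ababaca"
print(finite_automaton_matcher("abababacaba", build_delta(pattern), len(pattern)))  # [2]
```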

Page 20: String matching algorithms

“The Knuth-Morris-Pratt Algorithm”

(KMP Algorithm)

• Linear-Time String-Matching Algorithm

Page 21: String matching algorithms

KMP Algorithm

Two Stages:

• Prefix Function

• String Matching

Page 22: String matching algorithms

Compute-Prefix-Function

Compute-Prefix-Function(P)

m = P.length
let π[1..m] be a new array
π[1] = 0
k = 0
for q = 2 to m
    while k > 0 and P[k + 1] ≠ P[q]
        k = π[k]
    if P[k + 1] == P[q]
        k = k + 1
    π[q] = k
return π
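
In Python the same procedure can be sketched with 0-based indexing, so `pi[q]` here corresponds to π[q+1] in the pseudocode. The function name mirrors the pseudocode; the index shift is the only intended difference.

```python
def compute_prefix_function(pattern):
    """pi[q] = length of the longest proper prefix of pattern[:q+1] that is also its suffix."""
    m = len(pattern)
    pi = [0] * m
    k = 0                                   # length of the current matched prefix
    for q in range(1, m):
        while k > 0 and pattern[k] != pattern[q]:
            k = pi[k - 1]                   # fall back to the next shorter border
        if pattern[k] == pattern[q]:
            k += 1
        pi[q] = k
    return pi


print(compute_prefix_function("ababaca"))  # [0, 0, 1, 2, 3, 0, 1]
```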

Page 23: String matching algorithms

Compute-Prefix-Function

i 1 2 3 4 5 6 7

P[i] a b a b a c a

π[i] 0 0 1 2 3 0 1

Page 24: String matching algorithms

KMP Algorithm

KMP-Matcher(T, P)

n = T.length
m = P.length
π = Compute-Prefix-Function(P)
q = 0                                   // number of characters matched
for i = 1 to n                          // scan the text from left to right
    while q > 0 and P[q + 1] ≠ T[i]
        q = π[q]                        // next character does not match
    if P[q + 1] == T[i]
        q = q + 1                       // next character matches
    if q == m                           // is all of P matched?
        print “Pattern occurs with shift” i-m
        q = π[q]                        // look for the next match
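
The two stages can be put together in a short Python sketch. It assumes 0-based indexing, so the shift of a match ending at index i is i - m + 1, and it returns the shifts in a list. The prefix function is repeated so the example runs on its own.

```python
def compute_prefix_function(pattern):
    """0-based prefix function: pi[q] is the longest border length of pattern[:q+1]."""
    pi = [0] * len(pattern)
    k = 0
    for q in range(1, len(pattern)):
        while k > 0 and pattern[k] != pattern[q]:
            k = pi[k - 1]
        if pattern[k] == pattern[q]:
            k += 1
        pi[q] = k
    return pi


def kmp_matcher(text, pattern):
    """Scan the text once, never moving backwards in it."""
    n, m = len(text), len(pattern)
    if m == 0:
        return []
    pi = compute_prefix_function(pattern)
    q = 0                                   # number of characters matched
    shifts = []
    for i in range(n):
        while q > 0 and pattern[q] != text[i]:
            q = pi[q - 1]                   # next character does not match
        if pattern[q] == text[i]:
            q += 1                          # next character matches
        if q == m:                          # all of the pattern matched
            shifts.append(i - m + 1)
            q = pi[q - 1]                   # look for the next match
    return shifts


print(kmp_matcher("abababacaba", "ababaca"))  # [2]
print(kmp_matcher("COCACOLA", "CO"))          # [0, 4]
```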

Page 25: String matching algorithms

KMP Algorithm

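O(n) matching time, where n is the length of the text (computing the prefix function takes O(m), where m is the length of the pattern)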

Page 26: String matching algorithms

Thank You

