Post on 05-Jan-2016
LING 388: Language and Computers
Sandiway Fong
9/27
Lecture 10
Administrivia
• Reminder
  – Homework 4 due Wednesday
Today’s Topic
• Finite State Automata (FSA)
– equivalent to the regular expressions we’ve been studying
Regular Expressions: Example
... from lecture 8
• example (sheeptalk)
  – baa!
  – baaa!
  – baaaa!
  – …
• regular expression
  – baaa*!
  – baa+!
[Figure: FSA for sheeptalk with states s, w, x, y, z: s --b--> w --a--> x --a--> y --!--> z, plus an a-loop on y; > marks the start state s]
Regular Expressions: Example
• step-by-step
• regular expression
  – baaa*!
• start state: s
[Figure: start state s, marked with >]
Regular Expressions: Example
• step-by-step
• regular expression
  – baaa*!
  – b
  – from s, see ‘b’, move to w
[Figure: s --b--> w]
Regular Expressions: Example
• step-by-step
• regular expression
  – baaa*!
  – ba
  – from w, see an ‘a’, move to x
[Figure: s --b--> w --a--> x]
Regular Expressions: Example
• step-by-step
• regular expression
  – baaa*!
  – baa
  – from x, see an ‘a’, move to y
[Figure: s --b--> w --a--> x --a--> y]
Regular Expressions: Example
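• step-by-step
• regular expression
  – baaa*!
  – baaa*
    • baa
    • baaa
    • baaaa
    • baaaaa...
  – from y, see an ‘a’, move to ?
• but the machine must have a finite number of states!
[Figure: s --b--> w --a--> x --a--> y, followed by an unbounded chain of new states y --a--> y’ --a--> y” --a--> ...]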
Regular Expressions: Example
• step-by-step
• regular expression
  – baaa*!
  – baaa*
    • baa
    • baaa
    • baaaa
    • baaaaa...
  – from y, see an ‘a’, “loop”, i.e. return to state y
[Figure: s --b--> w --a--> x --a--> y, with an a-loop on y]
Regular Expressions: Example
• step-by-step
• regular expression
  – baaa*!
  – baaa*!
  – from y, see an ‘!’, move to final state z (indicated in red)
• Note: the machine cannot finish (i.e. reach the end of the input string) in states s, w, x or y
[Figure: s --b--> w --a--> x --a--> y --!--> z, with an a-loop on y; z is the final state]
Finite State Automata (FSA)
• construction
  – the step-by-step FSA construction method we just used works for any regular expression
• conclusion
  – anything we can encode with a regular expression, we can build an FSA for
  – an important step in showing that FSAs and REs are equivalent
Microsoft Word Wildcards
• basic wildcards
  – ? and *
    • ? any single character
      • e.g. p?t matches put, pit, pat, pet
    • * zero or more characters
[Figure: for ?, one arc from x to y for each character (a, b, c, d, e, ..., z, etc.); for *, a state y with one loop for each character (a, etc.)]
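As a hedged illustration, the two wildcards can be written in the one-predicate-per-state Prolog style. This is only a sketch: the predicate names q0 to q3 and star are hypothetical, and the input is assumed to be a list of single-character atoms covering the whole word.

```prolog
% p?t : 'p', then any single character, then 't'.
% State names q0..q3 are made up for this sketch.
q0([p|L]) :- q1(L).
q1([_|L]) :- q2(L).   % ? : the anonymous variable _ matches any one character
q2([t|L]) :- q3(L).
q3([]).               % final state: accept only when the input is used up

% * : zero or more characters, i.e. one state that loops on every character.
star([]).
star([_|L]) :- star(L).
```

With these clauses, ?- q0([p,u,t]). succeeds, while ?- q0([p,t]). fails because the ? position consumes the t and nothing is left for state q2. Using the anonymous variable stands in for drawing one arc per character.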
Microsoft Word Wildcards
• basic wildcards
  – @
    • one or more of the preceding character
    • e.g. a@
  – [ ]
    • range of characters
    • e.g. [aeiou]
[Figure: for a@, x --a--> y with an a-loop on y; for [aeiou], one arc from x to y for each of a, e, i, o, u]
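The two machines for a@ and [aeiou] can be sketched in Prolog as follows. The state names (ax, ay, vx, vy) and the helper vowel/1 are hypothetical, and each pattern is assumed to cover the whole input list.

```prolog
% a@ : one or more 'a' (hypothetical state names ax, ay).
ax([a|L]) :- ay(L).
ay([a|L]) :- ay(L).   % loop on ay for each further 'a'
ay([]).               % ay is the final state

% [aeiou] : exactly one character from the range (hypothetical names vx, vy).
vx([C|L]) :- vowel(C), vy(L).
vy([]).

% One fact per arc from vx to vy.
vowel(a). vowel(e). vowel(i). vowel(o). vowel(u).
```

For instance, ?- ax([a,a,a]). and ?- vx([o]). succeed, while ?- ax([]). and ?- vx([x]). fail.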
Microsoft Word Wildcards
• basic wildcards
  – < >
    • < beginning of a word
      • can think of there being a special symbol/invisible character marking the beginning of each word
    • > end of a word
      • suppose there is an invisible character marking the end of each word
[Figure: an arc labelled ‘<’ from x to y, with a loop labelled “see anything but ‘<’”; likewise an arc labelled ‘>’ from x to y, with a loop labelled “see anything but ‘>’”]
Microsoft Word Wildcards
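• basic wildcards
  – < >
    • > end of a word
  – Note
    • the see-anything-but loop is implicit
    • m>
    • “word that ends in m”
    • example:
      – mom is...
[Figure: the machine for >, an arc labelled ‘>’ from x to y with a loop labelled “see anything but ‘>’”; the machine for m>, an arc labelled m from x to y with a loop labelled “see anything but ‘m’”, then an arc labelled ‘>’ from y to z]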
Finite State Automata (FSA)
• more formally
  – (Q, s, f, Σ, δ)
    1. set of states (Q): {s, w, x, y, z} (5 states; must be a finite set)
    2. start state (s): s
    3. end state(s) (f): z
    4. alphabet (Σ): {a, b, !}
    5. transition function δ:
       signature: character × state → state
       • δ(b,s)=w
       • δ(a,w)=x
       • δ(a,x)=y
       • δ(a,y)=y
       • δ(!,y)=z
[Figure: s --b--> w --a--> x --a--> y --!--> z, with an a-loop on y; > marks the start state s]
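The formal definition can also be written down almost literally as Prolog facts, with a small generic driver that walks the transition function. This is a minimal sketch rather than the lecture's own encoding: the predicate names delta/3, start/1, final/1, accept/1 and run/2 are hypothetical.

```prolog
% delta(Character, State, NextState): the transition function from the definition.
delta(b,   s, w).
delta(a,   w, x).
delta(a,   x, y).
delta(a,   y, y).
delta('!', y, z).

start(s).   % start state
final(z).   % end state

% accept(+Input): Input drives the machine from the start state to an end state.
accept(Input) :- start(S), run(S, Input).

% run(+State, +Input): succeed if the rest of the input leads to an end state.
run(State, []) :- final(State).
run(State, [C|Rest]) :- delta(C, State, Next), run(Next, Rest).
```

For example, ?- accept([b,a,a,a,'!']). succeeds and ?- accept([b,a,'!']). fails, since state x has no transition on !. Keeping the arcs as data means a different machine only needs a different set of delta/3 facts.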
Finite State Automata (FSA)
• in Prolog
  – define one predicate for each state
    • taking one argument (the input list L)
    • consume input character (take the head of the list)
    • call next state with the tail of the list
  – rule
    • fsa(L) :- s(L).
      i.e. call start state s
Finite State Automata (FSA)
• state s: (start state)
  – s([b|L]) :- w(L).
    match input string beginning with b and call state w with remainder of input
• state w:
  – w([a|L]) :- x(L).
• state x:
  – x([a|L]) :- y(L).
• state y:
  – y([a|L]) :- y(L).
  – y([!|L]) :- z(L).
• state z: (end state)
  – z([]).
[Figure: s --b--> w --a--> x --a--> y --!--> z, with an a-loop on y; > marks the start state s]
Finite State Automata (FSA)
• query
  – ?- s([b,a,a,a,!]).
Database
s([b|L]) :- w(L).
w([a|L]) :- x(L).
x([a|L]) :- y(L).
y([a|L]) :- y(L).
y([!|L]) :- z(L).
z([]).
[Figure: the FSA annotated with the remaining input on arrival at each state: s [b,a,a,a,!], w [a,a,a,!], x [a,a,!], y [a,!] and then [!], z []]
Finite State Automata (FSA)
• In which state does the query
  – ?- s([b,a,b,a,!]).
  fail?
Database
s([b|L]) :- w(L).
w([a|L]) :- x(L).
x([a|L]) :- y(L).
y([a|L]) :- y(L).
y([!|L]) :- z(L).
z([]).
[Figure: the FSA annotated with the remaining input at successive states: [b,a,b,a,!], [a,b,a,!], [b,a,!]]
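One way to check an answer to this question is to have the machine report where it gets stuck. The sketch below is an assumption-laden variant of the database above, not the lecture's code: it stores the arcs as hypothetical delta/3 facts and adds hypothetical predicates trace_fsa/2 and walk/3.

```prolog
% delta(Character, State, NextState): same arcs as the database above.
delta(b,   s, w).
delta(a,   w, x).
delta(a,   x, y).
delta(a,   y, y).
delta('!', y, z).

final(z).

% trace_fsa(+Input, -Result): Result is accepted, or stuck(State, Remaining)
% naming the state the machine was in and the input it could not consume.
trace_fsa(Input, Result) :- walk(s, Input, Result).

walk(State, [], accepted) :- final(State), !.
walk(State, [C|Rest], Result) :-
    delta(C, State, Next), !,        % deterministic machine: commit to the arc
    walk(Next, Rest, Result).
walk(State, Remaining, stuck(State, Remaining)).
```

The query ?- trace_fsa([b,a,b,a,'!'], R). binds R to stuck(x, [b,a,'!']), which agrees with the three remaining-input lists shown for this query.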
FSA
• Finite State Automata (FSA) have a limited amount of expressive power
• Let’s look at a modification to FSA and its effect on its power
String Transitions
• so far...
  – all machines have had just a single character label on the arc
  – so if we allow strings to label arcs, do they endow the FSA with any more power?
• Answer: No
  – because we can always convert a machine with string-transitions into one without
[Figure: a single arc labelled abb, and the equivalent chain of three arcs labelled a, b, b]
Finite State Automata (FSA)
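• equivalent
[Figure: machine with a string transition, s --baa--> y --!--> z with an a-loop on y; machine with 5 states, s --b--> w --a--> x --a--> y --!--> z with an a-loop on y]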
Finite State Automata (FSA)
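• equivalent
Database (machine with 5 states)
s([b|L]) :- w(L).
w([a|L]) :- x(L).
x([a|L]) :- y(L).
y([a|L]) :- y(L).
y(['!'|L]) :- z(L).
z([]).
Database (machine with the string transition baa)
s([b,a,a|L]) :- y(L).
y([a|L]) :- y(L).
y(['!'|L]) :- z(L).
z([]).
[Figure: the 5-state machine s --b--> w --a--> x --a--> y --!--> z and the string-transition machine s --baa--> y --!--> z, each with an a-loop on y]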
Empty Transitions
• so far...
  – how about allowing the empty character?
    • i.e. go from x to y without seeing an input character
  – does this endow the FSA with any more power?
• Answer: No
  – because we can always convert a machine with empty transitions into one without
[Figure: x --ε--> y]
Empty Transitions
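• example
  – (ab)|b
[Figure: a machine for (ab)|b with arcs a, ε and b, and an equivalent machine without the ε-transition, with arcs a, b and b]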
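In the one-predicate-per-state Prolog encoding, an ε-transition is simply a clause that calls the next state without taking anything off the input list. The sketch below shows one possible machine for (ab)|b and an ε-free equivalent. The state numbering and the predicate names e1 to e3 and d1 to d3 are assumptions made for illustration, since the original diagram is not recoverable in detail.

```prolog
% With an ε-transition: 1 --a--> 2, 1 --ε--> 2, 2 --b--> 3 (final).
e1([a|L]) :- e2(L).
e1(L)     :- e2(L).   % ε-transition: move to state 2 without consuming input
e2([b|L]) :- e3(L).
e3([]).

% Without ε: the ε-arc is replaced by a direct b-arc from state 1 to state 3.
d1([a|L]) :- d2(L).
d1([b|L]) :- d3(L).
d2([b|L]) :- d3(L).
d3([]).
```

Both e1 and d1 accept exactly [a,b] and [b]. The replacement works because anything state 2 can do on its next character, state 1 can now do directly.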
Empty Transitions
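• example
  – (ab)|(empty string)
[Figure: a machine for (ab)|(empty string) with arcs a, b and ε, and an equivalent machine without the ε-transition, with arcs a and b; the legend marks final states]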
NDFSA
• Basic FSA
  – deterministic
    • it’s always clear which state we’re in, or
    • deterministic = no choice point
• NDFSA
  – ND = non-deterministic
    • i.e. we could be in more than one state
    • non-deterministic choice point
  – example:
    • initially, either in state 1 or 2
[Figure: a deterministic machine over states s, x, y with arcs labelled a and b; a non-deterministic machine over states 1, 2, 3 with arcs labelled a, ε and b]
NDFSA
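• more generally
  – non-determinism can be had not just with ε-transitions but with any symbol
    • example:
      – given a, we can proceed to either state 2 or 3
[Figure: states 1, 2, 3 with start state 1; two arcs labelled a leave state 1, one to state 2 and one to state 3; an arc labelled b leads into state 3]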
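Prolog's backtracking handles this kind of choice point directly: two clauses for the same state and the same symbol are tried in turn. The sketch below makes two assumptions that the surviving figure does not confirm: that the b-arc runs from state 2 to state 3, and that state 3 is the final state. The predicate names n1 to n3 are hypothetical.

```prolog
% Assumed arcs: 1 --a--> 2, 1 --a--> 3, 2 --b--> 3; state 3 assumed final.
n1([a|L]) :- n2(L).   % first choice for 'a'
n1([a|L]) :- n3(L).   % second choice for 'a', tried on backtracking
n2([b|L]) :- n3(L).
n3([]).
```

Under those assumptions, ?- n1([a]). succeeds only after the first choice (state 2) fails and Prolog backtracks into the second, while ?- n1([a,b]). succeeds through state 2.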
NDFSA
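• NDFSA
  – are they more powerful than FSA?
  – similar question asked earlier for ε-transitions
  – Answer: No
  – We can always convert an NDFSA into an FSA
• example
  – (set of states)
[Figure: the NDFSA over states 1, 2, 3 (arcs a, a, b) and the equivalent FSA over sets of states: {1} --a--> {2,3} --b--> {3}]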
NDFSA
• example
  – (set of states)
  – construct new machine with states = set of possible states of the old machine
• Essential trick:
  – i.e. simulate the old (non-deterministic) machine with the new machine
[Figure: step-by-step construction of the new machine: start with {1}; add {1} --a--> {2,3}; add {2,3} --b--> {3}]
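The set-of-states idea can be sketched in Prolog by tracking the list of states the old machine could be in, so that each step of the new machine is deterministic. This is an illustrative sketch, not the lecture's code. It assumes the b-arc runs from state 2 to state 3 and that state 3 is final, and it assumes a system such as SWI-Prolog where member/2, findall/3 and sort/2 are available. All predicate names (arc/3, start/1, final/1, step/3, accept/1, run/2) are hypothetical.

```prolog
% arc(From, Symbol, To): arcs of the non-deterministic machine (partly assumed).
arc(1, a, 2).
arc(1, a, 3).
arc(2, b, 3).   % assumption: the b-arc starts at state 2

start(1).
final(3).       % assumption: state 3 is the final state

% step(+States, +Symbol, -NextStates): one move of the new, deterministic
% machine, whose states are sorted lists of old states.
step(States, Sym, Next) :-
    findall(T, (member(S, States), arc(S, Sym, T)), Ts),
    sort(Ts, Next).               % sort/2 also removes duplicates

% accept(+Input): simulate the old machine by running the new one.
accept(Input) :- start(S0), run([S0], Input).

run(States, []) :- member(S, States), final(S), !.
run(States, [Sym|Rest]) :-
    step(States, Sym, Next),
    Next \== [],                  % an empty set means no old state could move
    run(Next, Rest).
```

Here ?- step([1], a, S). gives S = [2,3] and ?- step([2,3], b, S). gives S = [3], matching the arcs {1} --a--> {2,3} --b--> {3} of the constructed machine. This version computes the state sets on the fly rather than building the whole new machine in advance.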