Regular expressions

Courtesy: Costas/Ullman


RE’s: Introduction

• Regular expressions describe languages by an algebra.

• They describe exactly the regular languages.

• If E is a regular expression, then L(E) is the language it defines.

• We’ll describe RE’s and their languages recursively.


Operations on Languages

• RE’s use three operations: union, concatenation, and Kleene star.

• The union of languages is the usual set union, since languages are sets.

• Example: {01,111,10} ∪ {00, 01} = {01,111,10,00}.



Concatenation

• The concatenation of languages L and M is denoted LM.

• It contains every string wx such that w is in L and x is in M.

• Example: {01,111,10}{00, 01} = {0100, 0101, 11100, 11101, 1000, 1001}.
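For finite languages, the union and concatenation examples can be checked mechanically. The following is a minimal Python sketch; the function names `union` and `concat` are chosen here for illustration and do not come from the slides.

```python
def union(L, M):
    """Union of two languages given as finite sets of strings."""
    return set(L) | set(M)


def concat(L, M):
    """Concatenation LM: every string wx with w in L and x in M."""
    return {w + x for w in L for x in M}


L = {"01", "111", "10"}
M = {"00", "01"}

print(sorted(union(L, M)))   # the four strings of the union example
print(sorted(concat(L, M)))  # the six strings of the concatenation example
```

Because languages are sets, the string 01, which occurs in both L and M, appears only once in the union.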



Kleene Star

• If L is a language, then L*, the Kleene star or just “star,” is the set of strings formed by concatenating zero or more strings from L, in any order.

• L* = {ε} ∪ L ∪ LL ∪ LLL ∪ …

• Example: {0,10}* = {ε, 0, 10, 00, 010, 100, 1010,…}
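L* is infinite whenever L contains a nonempty string, so a program can only list a finite part of it. The sketch below enumerates the strings of L* up to a chosen length; the name `star_up_to` and the length bound are assumptions made for this illustration.

```python
def star_up_to(L, max_len):
    """All strings of L* whose length is at most max_len."""
    result = {""}        # zero strings from L gives the empty string ε
    frontier = {""}      # strings found in the previous round
    while frontier:
        # Append one more string from L to each newly found string.
        frontier = {
            w + x
            for w in frontier
            for x in L
            if len(w + x) <= max_len
        } - result
        result |= frontier
    return result


print(sorted(star_up_to({"0", "10"}, 4), key=lambda s: (len(s), s)))
```

With `max_len = 4` the output contains the strings listed in the example (ε, 0, 10, 00, 010, 100, 1010) but not 1 or 11, since every 1 must be followed by a 0.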



RE’s: Definition

• Basis 1: If a is any symbol, then a is a RE, and L(a) = {a}.
  – Note: {a} is the language containing one string, and that string is of length 1.

• Basis 2: ε is a RE, and L(ε) = {ε}.

• Basis 3: ∅ is a RE, and L(∅) = ∅.



RE’s: Definition – (2)

• Induction 1: If E1 and E2 are regular expressions, then E1+E2 is a regular expression, and L(E1+E2) = L(E1) ∪ L(E2).

• Induction 2: If E1 and E2 are regular expressions, then E1E2 is a regular expression, and L(E1E2) = L(E1)L(E2).

• Induction 3: If E is a RE, then E* is a RE, and L(E*) = (L(E))*.


Definition (continued)

For regular expressions r1 and r2:

L(r1 + r2) = L(r1) ∪ L(r2)

L(r1 · r2) = L(r1) L(r2)

L(r1*) = (L(r1))*

L((r1)) = L(r1)
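The recursive definition translates almost line by line into a recursive function. The sketch below is one possible encoding, not the slides' own: a regular expression is a nested tuple whose tags ("empty", "eps", "sym", "union", "concat", "star") are invented here, and the function returns only the strings of L(E) up to a given length, because L(E) may be infinite.

```python
def lang(e, n):
    """Strings of length <= n in L(e), for e in the tuple encoding below."""
    kind = e[0]
    if kind == "empty":                    # ∅, with L(∅) = ∅
        return set()
    if kind == "eps":                      # ε, with L(ε) = {ε}
        return {""}
    if kind == "sym":                      # a symbol a, with L(a) = {a}
        return {e[1]} if n >= 1 else set()
    if kind == "union":                    # L(E1+E2) = L(E1) ∪ L(E2)
        return lang(e[1], n) | lang(e[2], n)
    if kind == "concat":                   # L(E1E2) = L(E1)L(E2)
        left, right = lang(e[1], n), lang(e[2], n)
        return {w + x for w in left for x in right if len(w + x) <= n}
    if kind == "star":                     # L(E*) = (L(E))*
        base = lang(e[1], n)
        result, frontier = {""}, {""}
        while frontier:
            frontier = {
                w + x for w in frontier for x in base if len(w + x) <= n
            } - result
            result |= frontier
        return result
    raise ValueError(f"unknown expression kind: {kind!r}")


# (0 + 10)*, whose language is {0,10}*
E = ("star", ("union", ("sym", "0"), ("concat", ("sym", "1"), ("sym", "0"))))
print(sorted(lang(E, 3), key=lambda s: (len(s), s)))
```

Cutting off at length n is safe because every string of length at most n in a union, concatenation, or star is built from pieces that are themselves no longer than n.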


Precedence of Operators

• Parentheses may be used wherever needed to influence the grouping of operators.

• Order of precedence is * (highest), then concatenation, then + (lowest).
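The effect of these precedence rules can be observed with Python's `re` module, which writes union as `|` instead of `+` but orders star, concatenation, and union the same way. This is an illustration in a practical regex dialect, not the formal notation of the slides; the expression ab*+c is an example chosen here.

```python
import re
from itertools import product

implicit = re.compile(r"ab*|c")      # ab*+c written without parentheses
explicit = re.compile(r"(a(b*))|c")  # the grouping implied by the precedence rules


def all_strings(alphabet, max_len):
    """Every string over the alphabet with length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


# Both patterns accept exactly the same strings up to length 5.
same = all(
    bool(implicit.fullmatch(s)) == bool(explicit.fullmatch(s))
    for s in all_strings("abc", 5)
)
print(same)

# "abab" would match if the star applied to ab; "ac" would match if + bound
# more tightly than concatenation. Neither is accepted.
print(bool(implicit.fullmatch("abab")), bool(implicit.fullmatch("ac")))
```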



Example

Regular expression: (a + b) · a*

L((a + b) · a*) = L((a + b)) L(a*) = L(a + b) L(a*) = (L(a) ∪ L(b)) (L(a))*

= {a, b} {ε, a, aa, aaa, ...}

= {a, aa, aaa, ..., b, ba, baa, ...}


Example

Regular expression r = (a + b)* (a + bb)

L(r) = {a, bb, aa, abb, ba, bbb, ...}


Example

Regular expression r = (aa)* (bb)* b

L(r) = {a^(2n) b^(2m) b : n, m ≥ 0}


Example

Regular expression r = (0 + 1)* 00 (0 + 1)*

L(r) = { all strings containing substring 00 }


Example

Regular expression r = (1 + 01)* (0 + ε)

L(r) = { all strings without substring 00 }
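The two descriptions involving the substring 00 can be spot-checked by brute force. The sketch below translates the expressions into Python `re` syntax (`|` for +, and `(0|)` for 0 + ε) and compares them with a direct substring test on every binary string up to length 10; it is a sanity check on short strings, not a proof.

```python
import re
from itertools import product

contains_00 = re.compile(r"(0|1)*00(0|1)*")  # (0 + 1)* 00 (0 + 1)*
avoids_00 = re.compile(r"(1|01)*(0|)")       # (1 + 01)* (0 + ε)


def binary_strings(max_len):
    """Every string over {0, 1} with length 0..max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


checked = 0
for s in binary_strings(10):
    assert bool(contains_00.fullmatch(s)) == ("00" in s)
    assert bool(avoids_00.fullmatch(s)) == ("00" not in s)
    checked += 1
print("checked", checked, "strings")
```

`fullmatch` is used rather than `search` because a regular expression in the formal sense describes whole strings, not substrings.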


Equivalent Regular Expressions

Definition:

Regular expressions r1 and r2 are equivalent if L(r1) = L(r2).


Example

L = { all strings without substring 00 }

r1 = (1 + 01)* (0 + ε)

r2 = (1*011*)* (0 + ε) + 1* (0 + ε)

L(r1) = L(r2) = L, so r1 and r2 are equivalent regular expressions.
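Equivalence is a statement about two infinite sets, so testing cannot prove it, but a bounded search can look for a counterexample. The helper below, whose name `first_difference` is made up for this sketch, returns the shortest string on which two patterns disagree, or `None` if there is none up to the bound. The patterns are the example's r1 and r2 in Python `re` syntax.

```python
import re
from itertools import product


def first_difference(pattern1, pattern2, alphabet, max_len):
    """Shortest string accepted by exactly one pattern, or None if none is found."""
    p1, p2 = re.compile(pattern1), re.compile(pattern2)
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            s = "".join(letters)
            if bool(p1.fullmatch(s)) != bool(p2.fullmatch(s)):
                return s
    return None


r1 = r"(1|01)*(0|)"            # (1 + 01)* (0 + ε)
r2 = r"(1*011*)*(0|)|1*(0|)"   # (1*011*)* (0 + ε) + 1* (0 + ε)

print(first_difference(r1, r2, "01", 10))          # no disagreement up to length 10
print(first_difference(r1, r"(1|01)*", "01", 10))  # dropping (0 + ε) changes the language
```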


Regular Expressions and Regular Languages


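Theorem

Languages Generated by Regular Expressions = Regular Languages

Proof: we show both inclusions.

Languages Generated by Regular Expressions ⊆ Regular Languages (Part 1)

Languages Generated by Regular Expressions ⊇ Regular Languages (Part 2)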


Proof - Part 1

Languages Generated by Regular Expressions ⊆ Regular Languages

For any regular expression r, the language L(r) is regular.

Proof by induction on the size of r.


Induction Basis

Primitive Regular Expressions: ∅, ε, a

Corresponding NFAs:

L(M1) = ∅ = L(∅)

L(M2) = {ε} = L(ε)

L(M3) = {a} = L(a)

These are all regular languages.


Inductive Hypothesis

Suppose that for regular expressions r1 and r2, L(r1) and L(r2) are regular languages.


Inductive Step

We will prove that

L(r1 + r2)

L(r1 · r2)

L(r1*)

L((r1))

are regular languages.


By definition of regular expressions:

L(r1 + r2) = L(r1) ∪ L(r2)

L(r1 · r2) = L(r1) L(r2)

L(r1*) = (L(r1))*

L((r1)) = L(r1)


By inductive hypothesis we know: L(r1) and L(r2) are regular languages.

We also know: Regular languages are closed under:

Union: L(r1) ∪ L(r2)

Concatenation: L(r1) L(r2)

Star: (L(r1))*


Therefore:

L(r1 + r2) = L(r1) ∪ L(r2)

L(r1 · r2) = L(r1) L(r2)

L(r1*) = (L(r1))*

are regular languages.

L((r1)) = L(r1) is trivially a regular language (by induction hypothesis).

End of Proof-Part 1


Using the regular closure of operations, we can construct recursively the NFA M that accepts L(M) = L(r).

Example: r = r1 + r2

L(M1) = L(r1)

L(M2) = L(r2)

L(M) = L(r)
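One common way to carry out this recursive construction is a Thompson-style NFA with ε-moves, sketched below. It is assumed here for illustration, since the slides' own diagrams are not available in this text; the class and function names are invented. Each sub-automaton should be built fresh for each use, because the constructions share state objects rather than copying them.

```python
import itertools


class NFA:
    """NFA with ε-moves, one start state and one accept state."""
    _ids = itertools.count()  # gives every state a globally unique number

    def __init__(self):
        self.start = next(NFA._ids)
        self.accept = next(NFA._ids)
        self.edges = {}  # (state, symbol or None for ε) -> set of states

    def add(self, src, sym, dst):
        self.edges.setdefault((src, sym), set()).add(dst)


def _combine(*parts):
    """New NFA with fresh start/accept states containing all edges of the parts."""
    m = NFA()
    for p in parts:
        for key, dsts in p.edges.items():
            m.edges.setdefault(key, set()).update(dsts)
    return m


def empty():          # L = ∅: no path from start to accept
    return NFA()


def epsilon():        # L = {ε}
    m = NFA()
    m.add(m.start, None, m.accept)
    return m


def symbol(a):        # L = {a}
    m = NFA()
    m.add(m.start, a, m.accept)
    return m


def union(m1, m2):    # L(M) = L(M1) ∪ L(M2)
    m = _combine(m1, m2)
    for p in (m1, m2):
        m.add(m.start, None, p.start)
        m.add(p.accept, None, m.accept)
    return m


def concat(m1, m2):   # L(M) = L(M1) L(M2)
    m = _combine(m1, m2)
    m.add(m.start, None, m1.start)
    m.add(m1.accept, None, m2.start)
    m.add(m2.accept, None, m.accept)
    return m


def star(m1):         # L(M) = (L(M1))*
    m = _combine(m1)
    m.add(m.start, None, m1.start)
    m.add(m.start, None, m.accept)      # zero repetitions
    m.add(m1.accept, None, m1.start)    # repeat
    m.add(m1.accept, None, m.accept)
    return m


def eps_closure(m, states):
    """All states reachable from the given states using ε-moves only."""
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in m.edges.get((s, None), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def accepts(m, w):
    current = eps_closure(m, {m.start})
    for ch in w:
        moved = set()
        for s in current:
            moved |= m.edges.get((s, ch), set())
        current = eps_closure(m, moved)
    return m.accept in current


def any_bit():
    return union(symbol("0"), symbol("1"))


# NFA for (0 + 1)* 00 (0 + 1)*, built bottom-up from its subexpressions
M = concat(concat(star(any_bit()), concat(symbol("0"), symbol("0"))), star(any_bit()))
print(accepts(M, "1001"), accepts(M, "0101"))
```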


Proof - Part 2

Languages Generated by Regular Expressions ⊇ Regular Languages

For any regular language L there is a regular expression r with L(r) = L.

We will convert an NFA that accepts L to a regular expression.


Since L is regular, there is an NFA M that accepts it:

L(M) = L

Take it with a single accept state.


From M construct the equivalent Generalized Transition Graph, in which transition labels are regular expressions.

Example:

[Figure: an NFA M with transitions labeled a, "a,b", and c, and the corresponding generalized transition graph with labels a, a+b, and c.]


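Another Example:

[Figure: an NFA with states q0, q1, q2 whose transitions are labeled a, b, and "a,b", and the corresponding generalized transition graph in which the label "a,b" becomes a+b.]

Transition labels are regular expressions.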


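Reducing the states:

[Figure: state q1 is removed from the generalized transition graph with states q0, q1, q2. In the resulting two-state graph, q0 has a self-loop labeled bb*a, the transition from q0 to q2 is labeled bb*(a+b), and q2 has a self-loop labeled b.]

Transition labels are regular expressions.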


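Resulting Regular Expression:

[Figure: the two-state graph with a self-loop bb*a on q0, a transition from q0 to q2 labeled bb*(a+b), and a self-loop b on q2.]

r = (bb*a)* bb*(a+b) b*

L(r) = L(M) = L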


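In General

Removing a state:

[Figure, 2-neighbors: state q has a self-loop labeled e and two neighbors qi and qj, with transitions qi → q labeled a, q → qi labeled d, qj → q labeled c, and q → qj labeled b.]

After removing q:

qi → qi: ae*d
qi → qj: ae*b
qj → qi: ce*d
qj → qj: ce*b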


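[Figure, 3-neighbors: state q has a self-loop labeled e and three neighbors qi, qj, qk, with transitions qi → q labeled a, q → qi labeled d, qj → q labeled c, q → qj labeled b, qk → q labeled g, and q → qk labeled f.]

After removing q:

qi → qi: ae*d
qi → qj: ae*b
qi → qk: ae*f
qj → qi: ce*d
qj → qj: ce*b
qj → qk: ce*f
qk → qi: ge*d
qk → qj: ge*b
qk → qk: ge*f

This can be generalized to an arbitrary number of neighbors of q.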
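The state-removal step works the same way for any number of neighbors: every path that entered q by one label, looped on q, and left by another label is replaced by a single direct label. The sketch below assumes a simple dictionary representation of the generalized transition graph, mapping (source, target) pairs to regular expressions written as strings; the function names are hypothetical.

```python
def paren(r):
    """Parenthesize a label unless it is a single symbol."""
    return r if len(r) == 1 else "(" + r + ")"


def remove_state(edges, q):
    """Eliminate state q. edges maps (src, dst) -> regular expression string."""
    loop = edges.get((q, q))
    loop_part = paren(loop) + "*" if loop is not None else ""
    incoming = {s: r for (s, d), r in edges.items() if d == q and s != q}
    outgoing = {d: r for (s, d), r in edges.items() if s == q and d != q}

    # Keep every edge that does not touch q.
    result = {k: r for k, r in edges.items() if q not in k}

    # Replace each path src -> q -> dst by one edge labeled in · loop* · out.
    for s, r_in in incoming.items():
        for d, r_out in outgoing.items():
            bypass = paren(r_in) + loop_part + paren(r_out)
            if (s, d) in result:
                result[(s, d)] = result[(s, d)] + "+" + bypass
            else:
                result[(s, d)] = bypass
    return result


# The 2-neighbor case: labels a, b, c, d around q and a self-loop e on q.
gtg = {
    ("qi", "q"): "a", ("q", "qi"): "d",
    ("qj", "q"): "c", ("q", "qj"): "b",
    ("q", "q"): "e",
}
for edge, label in sorted(remove_state(gtg, "q").items()):
    print(edge, label)
```

Adding the two edges ("qk", "q"): "g" and ("q", "qk"): "f" to `gtg` gives the 3-neighbor case with its nine labels.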


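By repeating the process until two states are left, the resulting graph is:

[Figure: two states q0 (initial) and qf (final), with a self-loop r1 on q0, a transition q0 → qf labeled r2, a transition qf → q0 labeled r3, and a self-loop r4 on qf.]

The resulting regular expression:

r = r1* r2 (r4 + r3 r1* r2)*

L(r) = L(M) = L

End of Proof-Part 2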
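The two-state formula r = r1* r2 (r4 + r3 r1* r2)* can be sanity-checked on a small instance. In the sketch below the four labels are the single symbols a, b, c, d, a hypothetical choice that makes the two-state graph easy to simulate directly; the formula is written in Python `re` syntax with `|` in place of +.

```python
import re
from itertools import product


def two_state_regex(r1, r2, r3, r4):
    """Build r1* r2 (r4 + r3 r1* r2)* as a Python re pattern string."""
    return f"(?:{r1})*(?:{r2})(?:(?:{r4})|(?:{r3})(?:{r1})*(?:{r2}))*"


# Hypothetical two-state graph with single-symbol labels:
#   q0 --a--> q0,  q0 --b--> qf,  qf --c--> q0,  qf --d--> qf
step = {
    ("q0", "a"): "q0", ("q0", "b"): "qf",
    ("qf", "c"): "q0", ("qf", "d"): "qf",
}


def graph_accepts(w):
    """Follow the graph from q0; accept if the walk ends in qf."""
    state = "q0"
    for ch in w:
        state = step.get((state, ch))
        if state is None:
            return False
    return state == "qf"


pattern = re.compile(two_state_regex("a", "b", "c", "d"))

# Compare the formula with the graph on every string up to length 6.
for n in range(7):
    for letters in product("abcd", repeat=n):
        w = "".join(letters)
        assert bool(pattern.fullmatch(w)) == graph_accepts(w)
print("formula and graph agree on all strings up to length 6")
```

Reading the formula left to right: r1* loops on q0, r2 moves to qf, and each further round either loops on qf with r4 or returns to q0 with r3 and comes back with r1* r2.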


Standard Representations of Regular Languages

Regular Languages

DFAs

NFAs

Regular Expressions


When we say: We are given a Regular Language L

We mean: Language L is in a standard representation (DFA, NFA, or Regular Expression)

Date post:	11-Apr-2017
Upload:	shiraz316