Unit 9 - Final

Date post: 14-Apr-2018
Upload: ashok-sharma

  • 7/30/2019 Unit 9 - Final

    Fundamentals of Theory of Computer Science Unit 9

    Sikkim Manipal University Page No.: 169

    Unit 9 Regular Expressions and Regular Languages

    Structure

    9.1 Introduction

    Objectives

    9.2 Regular expressions

    9.3 Regular Expressions accepted by the Language

    9.4 Finite Automaton from Regular Grammar

    9.5 Regular Grammar from Finite Automata

    Self Assessment Questions

    9.6 Summary

    9.7 Terminal Questions

    9.8 Answers

    9.1 Introduction

    In this unit, you will learn about regular expressions and about finite automata, which act as devices for recognizing the languages that regular expressions denote. A regular expression is a notation for a set of strings over an alphabet, built from the symbols of the alphabet using the operations of union, concatenation and iteration; the same set of strings can also be generated by a regular grammar. Regular expressions obey a number of identities that resemble those of common arithmetic operations such as addition and multiplication, and these identities help to simplify regular expressions. The language denoted by a regular expression can be accepted by a deterministic as well as by a non-deterministic finite automaton.

    Objectives:

    After going through this unit, you will be able to

    explain the concept of regular expressions

    describe the language denoted by a regular expression

    construct a finite automaton from a regular grammar

    construct a regular grammar from a finite automaton

    9.2 Regular Expressions

    In computing, a regular expression represents a set of strings and is built from symbols arranged according to certain syntax rules. We can define a regular expression R1 using terminal symbols that are elements of an alphabet Σ. Some of the algebraic operations defined on regular expressions are:

    1. Union: The union of two regular expressions is also a regular expression. For example, if R1 and R2 are two regular expressions, then the union R1 + R2 is also a regular expression.

    2. Concatenation: The concatenation of two regular expressions is a regular expression. For example, if R1 and R2 are two regular expressions, then the concatenation R1R2 is also a regular expression.

    3. Iteration: The iteration of a regular expression is also a regular expression. For example, if R1 is a regular expression, then the iteration R1* is also a regular expression.

    4. Order of evaluation: A parenthesized regular expression is also a regular expression. For example, if R1 is a regular expression, then (R1), which fixes the order of evaluation, is also a regular expression.

    9.2.1 Definition:

    A regular expression is recursively defined as follows.

    1. ∅ is a regular expression denoting the empty language.

    2. ε is a regular expression denoting the language {ε}, which contains only the empty string.

    3. For each symbol a ∈ Σ, a is a regular expression denoting the language {a}.

    4. If R is a regular expression denoting the language LR and S is a regular expression denoting the language LS, then

    a. R + S is a regular expression corresponding to the language LR ∪ LS.

    b. RS is a regular expression corresponding to the language LR.LS.

    c. R* is a regular expression corresponding to the language LR*.

    5. The expressions obtained by applying any of the rules from 1 to 4 are regular expressions.

    Note: If parentheses are not present in a regular expression, then the precedence of the operators, from highest to lowest, is as follows: iteration, concatenation and union. First perform the iteration operations, then the concatenation operations and finally the union operations.

    Note: Any set that can be represented by a regular expression is known as a regular set. If the regular expression is R, then the regular set of R is L(R).

    9.2.2 Example:

    Let x, y ∈ Σ. Then

    x represents the set {x}

    x + y represents the set {x, y}

    xy represents the set {xy}

    x* represents the set {ε, x, xx, xxx, …}

    (x + y)* represents the set {x, y}*
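    The recursive definition in 9.2.1 can be mirrored directly in code. The following is a minimal sketch, not part of the original unit: the class names (Empty, Eps, Sym, Union, Concat, Star) and the function language are hypothetical, and the function lists only the strings of a regular set up to a chosen length, because a set such as L(x*) is infinite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Empty:          # rule 1: the empty language
    pass


@dataclass(frozen=True)
class Eps:            # rule 2: the language containing only the empty string
    pass


@dataclass(frozen=True)
class Sym:            # rule 3: a single symbol of the alphabet
    ch: str


@dataclass(frozen=True)
class Union:          # rule 4a: R + S
    left: object
    right: object


@dataclass(frozen=True)
class Concat:         # rule 4b: RS
    left: object
    right: object


@dataclass(frozen=True)
class Star:           # rule 4c: R*
    inner: object


def language(r, max_len):
    """Return every string of length <= max_len in the regular set L(r)."""
    if isinstance(r, Empty):
        return set()
    if isinstance(r, Eps):
        return {""}
    if isinstance(r, Sym):
        return {r.ch} if max_len >= 1 else set()
    if isinstance(r, Union):
        return language(r.left, max_len) | language(r.right, max_len)
    if isinstance(r, Concat):
        left = language(r.left, max_len)
        right = language(r.right, max_len)
        return {u + v for u in left for v in right if len(u + v) <= max_len}
    if isinstance(r, Star):
        base = language(r.inner, max_len) - {""}
        result = {""}
        frontier = {""}
        # Keep appending one more copy of a base string until nothing new fits.
        while frontier:
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= max_len} - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression: {r!r}")


x, y = Sym("x"), Sym("y")
print(sorted(language(Star(x), 3), key=len))
print(sorted(language(Star(Union(x, y)), 2)))
```

    Bounding the length is a design choice of this sketch only; it lets the infinite sets in Example 9.2.2 be inspected as finite prefixes.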

    9.3 Regular Expressions accepted by the Language

    9.3.1 Example:

    Some examples of regular expressions and the language corresponding to

    these regular expressions are given here.

    Regular expression    Meaning
    (a+b)*                Set of strings of a's and b's of any length, including the null (empty) string.
    (a+b)*abb             Set of strings of a's and b's ending with the string abb.
    ab(a+b)*              Set of strings of a's and b's starting with the string ab.
    (a+b)*aa(a+b)*        Set of strings of a's and b's having the substring aa.
    a*b*c*                Set of strings consisting of any number of a's (possibly none) followed by any number of b's (possibly none) followed by any number of c's (possibly none).
    a⁺b⁺c⁺                Set of strings consisting of at least one a followed by at least one b followed by at least one c.
    aa*bb*cc*             Set of strings consisting of at least one a followed by at least one b followed by at least one c.
    (a+b)*(a + bb)        Set of strings of a's and b's ending with either a or bb.
    (aa)*(bb)*b           Set of strings consisting of an even number of a's followed by an odd number of b's.
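    The meanings listed in the table can be spot-checked with Python's re module. This is only a sketch: the helper name matches is hypothetical, and it assumes the expression uses nothing but symbols, parentheses, * and the union sign +, which it rewrites to the | used by re. The symbol ε and superscript forms are not handled.

```python
import re


def matches(expr: str, s: str) -> bool:
    """Check whether s belongs to the language of a textbook-style expression."""
    # The textbook writes union as '+'; Python's re module writes it as '|'.
    pattern = expr.replace(" ", "").replace("+", "|")
    # fullmatch requires the whole string to match, not just a prefix.
    return re.fullmatch(pattern, s) is not None


print(matches("(a+b)*abb", "aababb"))     # ends with abb
print(matches("(a+b)*aa(a+b)*", "abab"))  # has no substring aa
print(matches("(aa)*(bb)*b", "aabbb"))    # two a's followed by three b's
```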

    9.3.2 Example

    Obtain a regular expression to accept a language consisting of strings of alternating a's and b's.

    Solution: Alternating a's and b's can be obtained by concatenating the string ab zero or more times, which is represented by the regular expression

    (ab)*

    and then adding an optional b at the front and an optional a at the end. Thus, the complete expression is given by

    (ε + b) (ab)* (ε + a)

    9.3.3 Note

    The expression can also be obtained as shown below.

    Alternating a's and b's can be generated in one of the following ways:

    i) (ab)*
    ii) b(ab)*
    iii) (ba)*
    iv) a(ba)*

    Therefore the expression to generate alternating a's and b's can be obtained by taking the union of these regular expressions:

    (ab)* + b(ab)* + (ba)* + a(ba)*
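    One way to gain confidence that the two expressions describe the same language is to compare each of them with the defining property on every short string. The sketch below is illustrative only: the names strings_up_to, alternates and agree are hypothetical, the patterns are Python re translations of the two expressions (b? stands for ε + b), and agreement up to length 8 is evidence rather than a proof.

```python
import re
from itertools import product


def strings_up_to(alphabet, max_len):
    """Yield every string over the alphabet with length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def alternates(s):
    """True when no two neighbouring symbols of s are equal."""
    return all(p != q for p, q in zip(s, s[1:]))


FIRST = r"b?(ab)*a?"                   # (ε + b)(ab)*(ε + a)
SECOND = r"(ab)*|b(ab)*|(ba)*|a(ba)*"  # union of the four cases


def agree(pattern, predicate, alphabet="ab", max_len=8):
    """True when the pattern and the predicate accept the same short strings."""
    return all((re.fullmatch(pattern, s) is not None) == predicate(s)
               for s in strings_up_to(alphabet, max_len))


print(agree(FIRST, alternates), agree(SECOND, alternates))
```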

    9.3.4 Example

    Obtain a regular expression to accept a language consisting of strings of 0's and 1's with at most one pair of consecutive 0's.

    Solution: A string with at most one pair of consecutive 0's

    o begins with any combination of 1's and 01's, represented by (1 + 01)*

    o continues with nothing, a single 0 or the one permitted pair 00, represented by (ε + 0 + 00)

    o ends with any combination of 1's and 10's, represented by (1 + 10)*.

    Therefore the complete regular expression for strings of 0's and 1's with at most one pair of consecutive 0's is given by

    (1 + 01)* (ε + 0 + 00) (1 + 10)*.
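    A candidate expression for this language can be tested against the verbal description by listing the short strings on which the two disagree. The sketch below is an assumption-laden illustration: the names at_most_one_pair, counterexamples, FULL and NARROW are hypothetical, and the patterns are Python re translations. FULL is (1 + 01)*(ε + 0 + 00)(1 + 10)*, while NARROW is the narrower candidate (1 + 01)*001*, which insists on exactly one pair followed only by 1's.

```python
import re
from itertools import product


def at_most_one_pair(s):
    """True when the substring 00 occurs at most once in s (overlaps counted)."""
    return sum(1 for i in range(len(s) - 1) if s[i:i + 2] == "00") <= 1


def counterexamples(pattern, max_len=10):
    """List the short binary strings on which pattern and description disagree."""
    bad = []
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if (re.fullmatch(pattern, s) is not None) != at_most_one_pair(s):
                bad.append(s)
    return bad


FULL = r"(1|01)*(0|00)?(1|10)*"
NARROW = r"(1|01)*001*"

print(counterexamples(FULL))        # no disagreement up to length 10
print(counterexamples(NARROW)[:5])  # strings the narrower candidate misses
```

    The narrower candidate fails on strings such as 1 and 0010, which contain at most one pair of consecutive 0's but are not of the form it requires.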

    9.3.5 Example

    Obtain a regular expression to accept a language of strings containing at least one a and at least one b, where Σ = {a, b, c}.

    Solution: Strings of a's, b's and c's can be generated using the regular expression

    (a + b + c)*.

    But the string should have at least one a and at least one b. There are two cases to be considered:

    The first a precedes the first b, which can be represented using

    c*a(a + c)*b

    The first b precedes the first a, which can be represented using

    c*b(b + c)*a

    The regular expression (a + b + c)* can be preceded by either of the regular expressions considered in the two cases just discussed.

    Therefore the final regular expression is

    c*a(a + c)*b(a + b + c)* + c*b(b + c)*a(a + b + c)*

    This expression can also be written as shown below:

    [c*a(a + c)*b + c*b(b + c)*a] (a + b + c)*
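    The case analysis can be checked mechanically: every string with at least one a and at least one b should fall into exactly one of the two cases, and no other string should fall into either. The following sketch uses hypothetical names (A_FIRST, B_FIRST, case_of, first_failure) and Python re translations of the two sub-expressions; it examines only strings up to a small length.

```python
import re
from itertools import product

A_FIRST = r"c*a(a|c)*b(a|b|c)*"   # first a precedes first b
B_FIRST = r"c*b(b|c)*a(a|b|c)*"   # first b precedes first a


def case_of(s):
    """Report which case of the solution the string s falls under, if any."""
    a_first = re.fullmatch(A_FIRST, s) is not None
    b_first = re.fullmatch(B_FIRST, s) is not None
    if a_first and b_first:
        return "both"
    if a_first:
        return "a before b"
    if b_first:
        return "b before a"
    return None


def first_failure(max_len=6):
    """Return a short string that breaks the case analysis, or None."""
    for n in range(max_len + 1):
        for letters in product("abc", repeat=n):
            s = "".join(letters)
            expected = "a" in s and "b" in s
            found = case_of(s)
            if found == "both" or (found is not None) != expected:
                return s
    return None


print(case_of("cab"), case_of("bca"), case_of("acc"))
print(first_failure())
```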

    9.3.6 Example

    Obtain a regular expression to accept a language consisting of strings of a's and b's of even length.

    Solution: Strings of a's and b's of even length can be obtained by combining the strings aa, ab, ba and bb.

    The language also contains the empty string, denoted by ε.

    Therefore the regular expression can be of the form

    (aa + ab + ba + bb)*

    The * closure includes the empty string.

    The language corresponding to the regular expression is denoted by

    L(R) = {(aa + ab + ba + bb)ⁿ | n ≥ 0}.

    9.3.7 Example

    Obtain a regular expression to accept a language consisting of strings of a's and b's of odd length.

    Solution: Strings of a's and b's of odd length can be obtained by combining the strings aa, ab, ba and bb and following them with either a or b.

    Therefore the regular expression can be of the form

    (aa + ab + ba + bb)* (a + b)

    Strings of a's and b's of odd length can also be obtained by combining the strings aa, ab, ba and bb and preceding them with either a or b.

    Therefore the regular expression can also be represented as

    (a + b) (aa + ab + ba + bb)*.

    Observation: Even though these two expressions look different, the language corresponding to the two expressions is the same.

    9.3.8 Example

    Obtain a regular expression R such that L(R) = {w | w ∈ {0, 1}* with at least three consecutive 0's}.

    Solution: An arbitrary string of 0's and 1's can be represented by the regular expression

    (0 + 1)*

    Such an arbitrary string can precede the three consecutive zeros and can also follow them.

    Therefore the regular expression can be written as

    (0 + 1)* 000 (0 + 1)*.

    The language corresponding to the regular expression can be written as

    L(R) = {(0 + 1)ᵐ 000 (0 + 1)ⁿ | m ≥ 0 and n ≥ 0}.

    9.4 Finite Automaton from Regular Grammar

    9.4.1 Definition

    A grammar G = (VN, VT, S, Φ), where Φ is the set of productions, is said to be a regular grammar if the grammar is right regular or left regular.

    A grammar G is said to be right regular if all the productions are of the form A → wB and/or A → w, where A, B ∈ VN and w ∈ VT*.

    A grammar G is said to be left regular if all the productions are of the form A → Bw and/or A → w, where A, B ∈ VN and w ∈ VT*.

    9.4.2 Example

    (i) The grammar with the set of productions

    S → aaB | bbA
    A → aA | b
    B → bB | a

    is a right linear grammar.

    (ii) The grammar with the set of productions

    S → Baa | Abb
    A → Aa | b
    B → Bb | a

    is a left linear grammar.

    9.4.3 Definition

    A grammar which has at most one nonterminal on the right side of any production, without restriction on the position of this nonterminal (observe that the nonterminal can be leftmost or rightmost), is called a linear grammar.
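    The three definitions above differ only in where the single nonterminal may appear on the right side of a production, which makes them easy to test in code. The sketch below is a hypothetical helper (the name classify and the dictionary layout are assumptions); it also assumes that uppercase letters are nonterminals and every other character is a terminal, as in Example 9.4.2.

```python
def classify(productions):
    """Classify a grammar given as {nonterminal: [right-hand sides]}."""
    right = left = True
    for bodies in productions.values():
        for body in bodies:
            # Positions of nonterminals (uppercase letters) in this right side.
            positions = [i for i, ch in enumerate(body) if ch.isupper()]
            if len(positions) > 1:
                return "not linear"
            if positions:
                if positions[0] != len(body) - 1:
                    right = False   # nonterminal is not the rightmost symbol
                if positions[0] != 0:
                    left = False    # nonterminal is not the leftmost symbol
    if right and left:
        return "right and left linear"
    if right:
        return "right linear"
    if left:
        return "left linear"
    return "linear"


g1 = {"S": ["aaB", "bbA"], "A": ["aA", "b"], "B": ["bB", "a"]}
g2 = {"S": ["Baa", "Abb"], "A": ["Aa", "b"], "B": ["Bb", "a"]}
print(classify(g1), "/", classify(g2))
```

    A grammar such as S → aSb | ab is reported as linear only, since its nonterminal sits in the middle of a right side.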

    9.4.4 Theorem

    Let G = (VN, VT, S, Φ) be a right linear grammar. Then the language L(G) is accepted by some finite automaton; that is, the language generated by a regular grammar is a regular language.

    Proof: Let VN = {q0, q1, …} be the variables and S = q0 be the start symbol.

    Let the productions in the grammar be

    q0 → x1 q1
    q1 → x2 q2
    q2 → x3 q3
    …
    qn → x(n+1)

    Assume that a string w in L(G) is generated using these productions. Corresponding to each production in the grammar we can have an equivalent transition in the FA to accept the string w. After reading the string w, the FA will be in the final state.

    The procedure to obtain the FA from these productions is given below.

    Step 1: The start symbol q0 of the grammar is the start state of the FA.

    Step 2: For each production of the form qi → w qj, the corresponding transition is of the form δ*(qi, w) = qj.

    Step 3: For each production of the form qi → w, the corresponding transition is of the form δ*(qi, w) = qf, where qf is the final state.

    Since every string w ∈ L(G) is accepted by the FA obtained by applying steps 1 through 3, the language is regular.

    9.4.5 Problem: Construct a DFA and the transition diagram to accept the language generated by the following grammar.

    S → 01A
    A → 10B
    B → 0A | 11

    Solution: Observe that for each production of the form A → wB, the corresponding transition is δ(A, w) = B. Also, for each production of the form A → w, we can introduce the transition δ(A, w) = qf, where qf is the final state.

    The transitions obtained from grammar G are shown in the table.

    The transition diagram is shown below.

    The DFA is M = (Q, Σ, δ, q0, F) where Q = {S, A, B, qf, q1, q2, q3}, Σ = {0, 1}, q0 = S (start state), F = {qf}, and δ is shown in the table. Here, the additional vertices (states) introduced are q1, q2, q3.
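    The construction used in this problem can be sketched in code. The function names grammar_to_fa and accepts are hypothetical, and two details are assumptions of the sketch rather than statements of the unit: a production whose terminal string w has several symbols is split into single-symbol transitions through fresh states named q1, q2, …, and the automaton is simulated as a set of current states so that missing transitions simply reject.

```python
def grammar_to_fa(productions, start, final="qf"):
    """Build transitions from right linear productions (head, w, tail_or_None)."""
    delta = {}            # (state, symbol) -> set of next states
    finals = {final}
    fresh = 0             # counter for the additional states q1, q2, ...
    for head, word, tail in productions:
        if word == "":
            if tail is not None:
                raise ValueError("unit productions are not handled in this sketch")
            finals.add(head)          # head -> ε makes head a final state
            continue
        state = head
        for i, symbol in enumerate(word):
            if i == len(word) - 1:
                nxt = tail if tail is not None else final
            else:
                fresh += 1
                nxt = f"q{fresh}"     # intermediate state inside the string w
            delta.setdefault((state, symbol), set()).add(nxt)
            state = nxt
    return delta, start, finals


def accepts(fa, s):
    """Simulate the automaton on s, tracking the set of reachable states."""
    delta, start, finals = fa
    current = {start}
    for symbol in s:
        current = {t for q in current for t in delta.get((q, symbol), ())}
    return bool(current & finals)


# Grammar of Problem 9.4.5: S -> 01A, A -> 10B, B -> 0A | 11
fa = grammar_to_fa([("S", "01", "A"), ("A", "10", "B"),
                    ("B", "0", "A"), ("B", "11", None)], start="S")
print(accepts(fa, "011011"), accepts(fa, "0110"))
```

    With this naming the sketch introduces exactly three additional states, so its state set has the same seven members as Q in the solution; which intermediate state receives which number is an artifact of the sketch.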

    9.4.6 Problem:

    Construct a DFA and the corresponding transition diagram to accept the language generated by the following grammar.

    S → aA
    A → aA | bB
    B → bB

    Solution: Observe that for each production of the form A → wB, the corresponding transition is δ(A, w) = B. Also, for each production of the form A → w, we can introduce the transition δ(A, w) = qf, where qf is the final state.

    The transitions obtained from grammar G are shown in the table.

    Observe that for each production of the form A → ε, we make A a final state.

    The transition diagram corresponding to this is shown below.

    9.5 Regular Grammar from Finite Automata

    9.5.1 Theorem:

    Let M = (Q, Σ, δ, q0, F) be a finite automaton. If L is the regular language accepted by M, then there exists a right linear grammar G = (VN, VT, S, Φ) such that L = L(G).

    Proof: Let M = (Q, Σ, δ, q0, F), where Q = {q0, q1, …, qn} and Σ = {a1, a2, …, am}.

    A regular grammar G = (VN, VT, S, Φ) can be constructed where

    VN = {q0, q1, …, qn}, VT = Σ, S = q0.

    The set of productions Φ can be obtained as shown below.

    Step 1: For each transition of the form δ(qi, a) = qj, the corresponding production is qi → a qj.

    Step 2: If q ∈ F, that is, q is a final state of the FA, then introduce the production q → ε.

    Since these productions are obtained from the transitions defined for the FA, the language accepted by the FA is also generated by the grammar.
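    The two steps of the proof translate almost line for line into code. The sketch below is illustrative: the names fa_to_grammar and generates are hypothetical, states are assumed to have single-character names so that a right side such as aB can be stored as a string, and the two-state DFA used as input (accepting strings over {a, b} that end in a) is an invented example, not one of the diagrams of this unit.

```python
def fa_to_grammar(delta, finals):
    """Turn DFA transitions {(state, symbol): state} into right linear productions."""
    productions = {}
    # Step 1: each transition delta(qi, a) = qj gives the production qi -> a qj.
    for (state, symbol), target in delta.items():
        productions.setdefault(state, []).append(symbol + target)
    # Step 2: each final state q gives the production q -> ε.
    for state in sorted(finals):
        productions.setdefault(state, []).append("ε")
    return productions


def generates(productions, head, s):
    """Check by direct derivation whether head derives the terminal string s."""
    for body in productions.get(head, []):
        if body == "ε":
            if s == "":
                return True
        elif s and s[0] == body[0] and generates(productions, body[1:], s[1:]):
            return True
    return False


# Hypothetical DFA: state B is reached exactly when the last symbol read is a.
delta = {("A", "a"): "B", ("A", "b"): "A", ("B", "a"): "B", ("B", "b"): "A"}
grammar = fa_to_grammar(delta, finals={"B"})
for head, bodies in grammar.items():
    print(head, "→", " | ".join(bodies))
print(generates(grammar, "A", "ba"), generates(grammar, "A", "ab"))
```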

    9.5.2 Example:

    Obtain a regular grammar from the following DFA given by the transition diagram.

    Solution: For each transition of the form δ(A, a) = B, introduce the production A → aB. If A ∈ F (that is, A is a final state), introduce the production A → ε.

    The productions obtained from the transitions defined for the DFA are shown below.

    From the diagram, it is clear that the state B is a final state.

    Therefore we introduce the production B → ε. The grammar G corresponding to the productions obtained is shown below.

    9.5.3 Example

    Construct a regular grammar for the following DFA given by the transition

    diagram.

    Solution: For each transition of the form δ(A, a) = B, introduce the production A → aB.

    If A ∈ F (that is, A is a final state), introduce the production A → ε. The productions obtained from the transitions defined for the DFA are shown below.

    Since the set of final states is {S, A, B}, we introduce the productions S → ε, A → ε and B → ε. Therefore the grammar G is:

    G = (VN, VT, S, Φ) where

    VN = {S, A, B, C}

    VT = {a, b}

    Observation: The finite automaton in this problem accepts strings of a's and b's except those containing the substring abb. Therefore from the grammar G we obtain a regular language which consists of strings of a's and b's without the substring abb.

    9.5.4 Example

    Obtain a right linear grammar for the regular expression ((aab)* ab)*, given by the transition diagram.

    The right linear grammar is given by

    G = (VN, VT, S, Φ) where

    VN = {S, A, B}

    VT = {a, b}

    9.5.5 Note

    The left linear grammar can be obtained from FA as follows.

    Step 1: Obtain the reverse of given DFA.

    Step 2: Obtain the right linear grammar from the reversed DFA.

    Step 3: Obtain the left linear grammar from right linear grammar.

    9.5.6 Example

    Obtain a left linear grammar for the DFA shown below.

    Step 1: Reverse the DFA. That is, make A the final state and C the start state, and reverse the direction of every arrow. The reversed DFA is shown below.

    Step 2: Obtain the right linear grammar for the above DFA. The corresponding productions are shown below.

    Step 3: Reverse the productions of the right linear grammar to get the left linear grammar.

    If A → abcdB is a production in the right linear grammar, then after reversing, the production will be of the form

    A → Bdcba.
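    Step 3 amounts to writing every right side backwards. A minimal sketch follows; the name to_left_linear is hypothetical, and the dictionary layout assumes that each grammar symbol is a single character.

```python
def to_left_linear(productions):
    """Reverse the right side of every production: A -> abcdB becomes A -> Bdcba."""
    return {head: [body[::-1] for body in bodies]
            for head, bodies in productions.items()}


print(to_left_linear({"A": ["abcdB"]}))

# The right linear grammar of Example 9.4.2 (i) reverses into the grammar of (ii).
right_linear = {"S": ["aaB", "bbA"], "A": ["aA", "b"], "B": ["bB", "a"]}
print(to_left_linear(right_linear))
```

    Reversing the productions also reverses every string the grammar generates, which is why Step 1 reverses the DFA first: the two reversals cancel, and the left linear grammar generates the language of the original automaton.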

    The conversion of right linear grammar to the left linear grammar is shown

    below.

    Therefore the final left linear grammar is

    G = (VN, VT, S, Φ) where

    VN = {C, A, B}

    VT = {0, 1}

    Now we show that the string 10101 is accepted by DFA.

    Hence the left linear grammar obtained is equivalent to the given FA.

    Self Assessment Questions

    1. The regular expression (11)* stands for _______

    2. The regular expression (01)* + 1 stands for _____

    3. The regular expression (0 + 10)*1* stands for ______

    4. Obtain a left linear grammar for the regular expression ((aab)* ab)*.

    9.6 Summary

    In this unit, a special type of grammar called a regular grammar was considered. Different forms of regular expressions and the languages they denote were given. We provided a method of obtaining a regular grammar from a finite automaton (and vice versa), illustrated with a number of examples.

    9.7 Terminal Questions

    1. Obtain a right linear grammar for the language L = {aⁿbᵐ | n ≥ 2, m ≥ 3}.

    2. Obtain the left linear grammar for the right linear grammar shown below.

    9.8 Answers

    Self Assessment Questions

    1. Set of strings consisting of an even number of 1's.

    2. The language consists of the string 1 and of the strings formed by repeating 01 zero or more times.

    3. Strings of 0's and 1's in which two consecutive 1's can occur only in a final run of 1's, that is, strings that do not contain the substring 110.

    4. G = (VN, VT, S, Φ) where VN = {A, B, S}, VT = {a, b}

