Date post: 23-Sep-2020

CS345H: Programming Languages

Lecture 4: Implementation of Lexical Analysis

Thomas Dillig

Thomas Dillig, CS345H: Programming Languages Lecture 4: Implementation of Lexical Analysis 1/33

Announcements

- WA1 and PA0 are due today
- WA2 and PA1 are out today :-)
- If you are not very, very busy right now, get started now

Outline

- Last time: Specifying lexical structure using regular expressions
- Today: How to recognize strings matching regular expressions using finite automata
- We will see deterministic finite automata (DFAs) and non-deterministic finite automata (NFAs)
- High-level story: RegEx -> NFA -> DFA -> Tables

Regular Expressions in Lexical Specifications

- Last lecture: How to specify the predicate s ∈ L(R)
- But a yes/no answer is not enough!
- We really want to partition the input into tokens
- We adapt regular expressions for this goal

Regular Expressions to Lexical Specifications (1)

- Step 1: Write a regular expression for the lexemes of each token
  - Integer constant: digit+
  - Identifier: letter (letter + digit)*
  - Lambda: 'lambda'
  - ...
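
As a rough illustration of Step 1, the token classes above could be written with Python's `re` module. The dictionary `TOKEN_PATTERNS`, its keys, and the helper `matches` are names assumed for this sketch, and `[0-9]` and `[A-Za-z]` stand in for the slide's digit and letter classes. Note that `re` writes union as `|`, whereas the slides write it as `+`.

```python
import re

# Hypothetical token patterns; digit = [0-9] and letter = [A-Za-z] are assumed.
TOKEN_PATTERNS = {
    "INTEGER": r"[0-9]+",                    # digit+ (one or more digits)
    "IDENTIFIER": r"[A-Za-z][A-Za-z0-9]*",   # letter (letter + digit)*
    "LAMBDA": r"lambda",                     # 'lambda'
}

def matches(token_name: str, s: str) -> bool:
    """Return True if the whole string s is a lexeme of the given token."""
    return re.fullmatch(TOKEN_PATTERNS[token_name], s) is not None
```

With these patterns the string `lambda` is a lexeme of both `LAMBDA` and `IDENTIFIER`, which is the kind of overlap the later ambiguity rules have to settle.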

Regular Expressions to Lexical Specifications (2)

- Step 2: Construct R, matching the lexemes of all tokens
  - R = Integer constant + Identifier + Lambda + ...
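
A minimal sketch of Step 2, assuming the three token patterns below (the concrete character classes and the names `R` and `in_language` are illustrative, not from the lecture). The union written `+` on the slide becomes `|` in Python's `re` syntax.

```python
import re

# Hypothetical per-token patterns (assumed for illustration).
INTEGER = r"[0-9]+"
IDENTIFIER = r"[A-Za-z][A-Za-z0-9]*"
LAMBDA = r"lambda"

# R = Integer constant + Identifier + Lambda, with `|` as union.
R = re.compile(f"(?:{INTEGER})|(?:{IDENTIFIER})|(?:{LAMBDA})")

def in_language(s: str) -> bool:
    """Decide the predicate s ∈ L(R) for the combined expression."""
    return R.fullmatch(s) is not None
```

For example, `in_language("42")` and `in_language("foo")` hold, while `in_language("4x")` does not, because no single token pattern covers the whole string.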

Regular Expressions to Lexical Specifications (3)

- Let the input be characters x1...xn
- Step 3: For 1 ≤ j ≤ n, check whether x1...xj ∈ L(R)
- If so, remove x1...xj from the input and repeat
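
The loop in Step 3 can be sketched as follows. This is a deliberately naive version: it takes the first (shortest) j whose prefix is in L(R), since the step as stated does not yet say which j to choose. The expression `R` and the function name `split_naive` are assumptions made for the example.

```python
import re

# Assumed combined expression R (integers and identifiers only, for brevity).
R = re.compile(r"[0-9]+|[A-Za-z][A-Za-z0-9]*")

def split_naive(text: str) -> list[str]:
    """Repeatedly remove some prefix x1...xj that is in L(R)."""
    pieces = []
    while text:
        for j in range(1, len(text) + 1):
            if R.fullmatch(text[:j]):
                break  # first j whose prefix is in L(R)
        else:
            raise ValueError(f"no prefix of {text!r} is in L(R)")
        pieces.append(text[:j])
        text = text[j:]
    return pieces
```

On the input `ab1` this returns `['a', 'b', '1']` rather than the single identifier `ab1`, which shows why the choice of j needs a rule.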

Ambiguities I

- There are ambiguities in this algorithm. Where?
- How much input is used? What if x1...xi ∈ L(R) and x1...xj ∈ L(R) for i ≠ j?
- Example: identifier = letter (letter + digit)*, if = 'i' 'f'
- Rule: Pick the longest possible string in L(R)
- This is known as "maximal munch"
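
A small sketch of maximal munch, assuming an `R` that contains only the keyword `if` and identifiers (both the pattern and the name `longest_prefix` are illustrative). It tries the longest candidate prefix first, which is simple but quadratic; a production lexer would more likely scan once with an automaton and remember the last accepting position.

```python
import re

# Assumed token rules: the keyword 'if' and identifiers.
R = re.compile(r"if|[A-Za-z][A-Za-z0-9]*")

def longest_prefix(text: str) -> str:
    """Return the longest prefix of text that is in L(R), or '' if none."""
    for j in range(len(text), 0, -1):  # try the longest candidate first
        if R.fullmatch(text[:j]):
            return text[:j]
    return ""
```

For the input `iffy = 1`, both `if` and `iffy` are prefixes in L(R); the function returns `iffy`.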

Ambiguities II

- What if two rules match the same number of characters?
  - x1...xi ∈ L(R1) and x1...xi ∈ L(R2)
- Example: "if"
- Rule: Use the rule listed first
- This is how "if" is matched as a keyword, not an identifier
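
The "rule listed first" tie-break can be sketched with an ordered list of rules. The list `RULES`, the token names, and `classify` are assumptions for this example; the only point is that list order encodes priority.

```python
import re
from typing import Optional

# Hypothetical rule list; order encodes priority (keyword before identifier).
RULES = [
    ("IF", re.compile(r"if")),
    ("IDENTIFIER", re.compile(r"[A-Za-z][A-Za-z0-9]*")),
]

def classify(lexeme: str) -> Optional[str]:
    """Return the name of the first listed rule that matches the whole lexeme."""
    for name, pattern in RULES:
        if pattern.fullmatch(lexeme):
            return name
    return None
```

Here `classify("if")` gives `IF` even though the identifier rule also matches, while `classify("iffy")` gives `IDENTIFIER`.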

Error Handling

- What if no rule matches a prefix of the input?
- Solution 1: Get stuck ⇒ Unacceptable
- Better solution: Write a rule matching all "bad" strings
- Question: What kind of rule, and where to place it?
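
One common answer to the question above (an assumption here, since this excerpt leaves it open) is a rule that matches any single character and is listed last, so that it wins only when no other rule matches. The sketch below combines it with longest match and first-listed tie-breaking; `RULES`, the token names, and `tokenize` are illustrative.

```python
import re

# Hypothetical rules in priority order. ERROR matches any single character
# and is listed last, so it is chosen only when nothing else matches.
RULES = [
    ("INTEGER", re.compile(r"[0-9]+")),
    ("IDENTIFIER", re.compile(r"[A-Za-z][A-Za-z0-9]*")),
    ("ERROR", re.compile(r".", re.DOTALL)),
]

def tokenize(text: str) -> list[tuple[str, str]]:
    """Split text into (token name, lexeme) pairs without ever getting stuck."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            # A strictly longer match wins; on a tie the earlier rule is kept.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len  # ERROR guarantees best_len >= 1, so the loop advances
    return tokens
```

On `x1#42` this produces an identifier, an `ERROR` token for `#`, and an integer, so the lexer can report the bad character and keep going.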

Where are we?

- We now know how to partition the input string into tokens, assuming we can decide whether a string is in the language described by a regular expression.
- Next: How to decide if s ∈ L(R)

Finite Automata

- Regular Expressions ⇔ Specification
- Finite Automata ⇔ Implementation
- A finite automaton formally consists of:
  - An input alphabet Σ
  - A set of states S
  - A start state n
  - A set of accepting states F ⊆ S
  - A set of transitions, each written state →input state (an arrow labeled with the input character)
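
The five components above map directly onto a small record type. The class name `FiniteAutomaton`, its field names, and the choice of string state names are assumptions for this sketch; transitions are stored as (state, input, next state) triples, and ε-moves are not modeled here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FiniteAutomaton:
    alphabet: frozenset      # input alphabet Σ
    states: frozenset        # set of states S
    start: str               # start state n
    accepting: frozenset     # accepting states F ⊆ S
    transitions: frozenset   # triples (state, input, next_state)

    def __post_init__(self):
        # Check the constraints stated in the definition.
        if self.start not in self.states:
            raise ValueError("start state must be in S")
        if not self.accepting <= self.states:
            raise ValueError("F must be a subset of S")
        for src, symbol, dst in self.transitions:
            if src not in self.states or dst not in self.states:
                raise ValueError("transition uses an unknown state")
            if symbol not in self.alphabet:
                raise ValueError("transition uses a symbol outside Σ")

# A two-state example (state names are arbitrary).
example = FiniteAutomaton(
    alphabet=frozenset("01"),
    states=frozenset({"A", "B"}),
    start="A",
    accepting=frozenset({"B"}),
    transitions=frozenset({("A", "1", "B")}),
)
```

Storing transitions as a set of triples, rather than a function, leaves room for several transitions on the same state and input.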

Finite Automata

- Transition S1 →α S2
  - This means: in state S1, on input character α, go to state S2
- If at the end of input and in an accepting state ⇒ accept
- Otherwise ⇒ reject
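
The accept/reject rule can be sketched as a short loop over the input. The function `accepts` and the dictionary encoding of transitions are assumptions, as is the example machine, a hypothetical two-state automaton that tracks whether it has seen an even number of 1s.

```python
def accepts(transitions, start, accepting, text):
    """Run a deterministic automaton; transitions maps (state, char) -> state."""
    state = start
    for ch in text:
        if (state, ch) not in transitions:
            return False  # no transition for this input: reject
        state = transitions[(state, ch)]
    return state in accepting  # end of input: accept iff in an accepting state

# Hypothetical example machine over {0, 1}: accepts an even number of 1s.
PARITY = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd", ("odd", "1"): "even",
}
```

For instance, `accepts(PARITY, "even", {"even"}, "1010")` holds, while the input `1` ends in the non-accepting state `odd` and is rejected.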

Finite Automata as State Graphs

- It is much easier to imagine finite automata visually:

[Figure: legend showing how a state, the start state, an accepting state, and a transition labeled a are drawn]

A Simple Example

- Here is an automaton that accepts only the string "1":

[Figure: state graph with a transition labeled 1]
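
Since the picture is not reproduced here, the automaton can also be written down as data and checked by brute force. The state names `start` and `done` are assumptions; the sketch enumerates every string over {0, 1} of length at most 3 and keeps the accepted ones.

```python
from itertools import product

# Assumed encoding: states "start" and "done"; the only transition reads '1'.
TRANSITIONS = {("start", "1"): "done"}
ACCEPTING = {"done"}

def accepts(text: str) -> bool:
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, ch))
        if state is None:  # missing transition: the machine rejects
            return False
    return state in ACCEPTING

# Enumerate every string over {0, 1} of length 0..3 and keep the accepted ones.
accepted = ["".join(p) for n in range(4) for p in product("01", repeat=n)
            if accepts("".join(p))]
```

The list `accepted` ends up containing only `"1"`, matching the claim on the slide for strings up to that length.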

Another Simple Example

- A finite automaton accepting any number of 1's followed by a single 0
- Alphabet: {0, 1}

[Figure: state graph with transitions labeled 0 and 1]
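
This automaton is small enough to show in table form, the last stage of the RegEx -> NFA -> DFA -> Tables story. The state numbering is an assumption, and the explicit dead state 2 is added for this sketch so that every table entry is defined; the slide's drawing may well omit it.

```python
# Assumed state numbering: 0 = start, 1 = accepting, 2 = dead (reject) state.
# TABLE[state][symbol] gives the next state for symbol 0 or 1.
TABLE = [
    [1, 0],  # state 0: on '0' go to the accepting state, on '1' stay
    [2, 2],  # state 1: any further input leads to the dead state
    [2, 2],  # state 2: stuck
]
ACCEPTING = {1}

def accepts(text: str) -> bool:
    state = 0
    for ch in text:
        state = TABLE[state][int(ch)]  # assumes text contains only '0' and '1'
    return state in ACCEPTING
```

With this table, `0` and `1110` are accepted, while the empty string, `1`, and `100` are rejected.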

And Another Example

- Alphabet: {0, 1}
- What language does this automaton recognize?

[Figure: state graph with three transitions labeled 0 and three labeled 1]

Epsilon Moves

I Another kind of transition: ε-moves

[Figure: state A with an ε-labeled edge to state B]

I The machine can move from state A to state B without reading any input
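Because an ε-move consumes no input, a machine that is in A may just as well be in B. A minimal sketch of that idea, computing every state reachable through ε-moves alone; the function name `epsilon_closure` and the dictionary encoding of the edges are assumptions made for illustration.

```python
def epsilon_closure(states, eps_edges):
    """Return every state reachable from `states` using only ε-moves."""
    closure = set(states)
    worklist = list(states)
    while worklist:
        state = worklist.pop()
        for target in eps_edges.get(state, ()):
            if target not in closure:
                closure.add(target)
                worklist.append(target)
    return closure


# The single ε-move A -> B described above.
EPS_EDGES = {"A": {"B"}}

if __name__ == "__main__":
    print(sorted(epsilon_closure({"A"}, EPS_EDGES)))
    print(sorted(epsilon_closure({"B"}, EPS_EDGES)))
```

Starting from A the closure contains both A and B; starting from B it contains only B, since the ε-move is one-directional.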


Deterministic and Nondeterministic Automata

I Deterministic Finite Automata (DFA)

    I At most one transition per input on any state

    I No ε-moves

I Nondeterministic Finite Automata (NFA)

    I Can have multiple transitions for one input in a given state

    I Can have ε-moves
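The two DFA conditions can be checked mechanically. The sketch below assumes an automaton is given as a collection of `(source, label, target)` triples, with `None` standing in for ε; both the encoding and the name `is_deterministic` are hypothetical.

```python
from collections import Counter

EPSILON = None  # hypothetical marker for an ε-move


def is_deterministic(transitions):
    """Check the two DFA conditions on (source, label, target) triples."""
    transitions = set(transitions)  # ignore repeated identical edges
    # Condition 1: no ε-moves.
    if any(label is EPSILON for _, label, _ in transitions):
        return False
    # Condition 2: at most one transition per (state, input) pair.
    counts = Counter((source, label) for source, label, _ in transitions)
    return all(n <= 1 for n in counts.values())


if __name__ == "__main__":
    print(is_deterministic([("s", "1", "s"), ("s", "0", "t")]))
    print(is_deterministic([("s", "0", "s"), ("s", "0", "t")]))
    print(is_deterministic([("A", EPSILON, "B")]))
```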


Execution of Finite Automata

I A DFA can take only one path through the state graph, and that path is completely determined by the input

I NFAs can choose:

    I Whether to make ε-moves

    I Which one of multiple transitions for a single input to take


Acceptance of NFAs

I This means: an NFA can be in multiple states at the same time

I Consider again the alphabet Σ = {0, 1} and the language of all strings ending in at least two 0s.

I Consider input 1 0 0

[Figure: NFA state diagram; edge labels 1, 0, 0, 0]

I Rule: an NFA accepts if some sequence of choices leads it to a final state once the whole input has been read
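One way to see the "multiple states at the same time" view is to track the set of states the NFA could be in after each symbol. The sketch below assumes a three-state NFA for strings ending in at least two 0s; the state names `q0`, `q1`, `q2` and the exact edges are a reconstruction, since the diagram did not survive extraction.

```python
# Assumed NFA: q0 loops on 0 and 1, and may guess that a 0 starts the
# final "00" by moving to q1; q1 reads the second 0 and reaches q2.
DELTA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "0"): {"q2"},
}
START, FINAL = "q0", {"q2"}


def trace(text):
    """Return the set of possible states before and after each symbol."""
    current = {START}
    history = [current]
    for symbol in text:
        current = set().union(*(DELTA.get((s, symbol), set()) for s in current))
        history.append(current)
    return history


def accepts(text):
    """Accept iff at least one possible state at the end is final."""
    return bool(trace(text)[-1] & FINAL)


if __name__ == "__main__":
    for step in trace("100"):
        print(sorted(step))
    print(accepts("100"))
```

On input 1 0 0 the possible-state sets are {q0}, {q0}, {q0, q1}, {q0, q1, q2}; the last one contains the final state, so the input is accepted.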


NFAs vs. DFAs

I Fundamental Result: NFAs and DFAs recognize the same set of languages (regular languages)

I DFAs are faster to execute, since there are no choices to consider

I But NFAs can be much simpler for the same language

I Result: a DFA can be exponentially larger than an NFA recognizing the same language
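The size gap can be observed on a standard textbook language that is not taken from these slides: "the k-th symbol from the end is 1". An NFA needs only k + 1 states, while tracking sets of NFA states reaches 2^k distinct sets. The sketch below counts them; `nfa_step` and `count_dfa_states` are hypothetical names.

```python
def nfa_step(states, symbol, k):
    """One step of the NFA with states 0..k for
    'the k-th symbol from the end is 1' (state k is final)."""
    nxt = set()
    for s in states:
        if s == 0:
            nxt.add(0)          # keep scanning the input
            if symbol == "1":
                nxt.add(1)      # guess: this 1 is k-th from the end
        elif s < k:
            nxt.add(s + 1)      # count the symbols after the guess
    return frozenset(nxt)


def count_dfa_states(k):
    """Count the sets of NFA states reachable from the start set."""
    start = frozenset({0})
    seen = {start}
    worklist = [start]
    while worklist:
        subset = worklist.pop()
        for symbol in "01":
            target = nfa_step(subset, symbol, k)
            if target not in seen:
                seen.add(target)
                worklist.append(target)
    return len(seen)


if __name__ == "__main__":
    for k in range(1, 6):
        print(k, "NFA states:", k + 1, "reachable sets:", count_dfa_states(k))
```

Each reachable set records which of the last k symbols were 1, which is why the count doubles with every increase of k.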


Regular Expressions to Finite Automata

I High-Level Sketch:

    I Lexical Specification

    I Regular Expressions

    I NFA

    I DFA

    I Implementation of DFA

⇒ Lexer


Regular Expressions to NFA (1)

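I For each kind of regular expression, define an NFA, then combine the pieces

I We will use the following notation for the NFA for regular expression M:

[Figure: schematic NFA labeled M]

I Base cases:

    I For ε:

    [Figure: NFA for ε]

    I For input a:

    [Figure: NFA for a; edge label a]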


Regular Expressions to NFA (2)

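I For AB:

[Figure: NFA for AB, built from the NFAs for A and B]

I For A + B:

[Figure: NFA for A + B, built from the NFAs for A and B]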


Regular Expressions to NFA (3)

I For A∗:

[Figure: NFA for A∗, built from the NFA for A]
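The case-by-case construction can be sketched as a recursive function over a small regular-expression tree. This is a Thompson-style construction and may wire its ε-moves differently from the figures in the slides; the tuple encoding (`"eps"`, `"sym"`, `"cat"`, `"alt"`, `"star"`), the class `NFA`, and its methods are all assumptions made for illustration.

```python
import itertools

EPS = None  # label used for ε-moves


class NFA:
    def __init__(self):
        self.edges = []                # (source, label, target) triples
        self._ids = itertools.count()  # supplies fresh state numbers

    def new_state(self):
        return next(self._ids)

    def build(self, regex):
        """Return (start, final) of a fragment recognizing `regex`."""
        kind = regex[0]
        if kind == "eps":              # base case: ε
            start, final = self.new_state(), self.new_state()
            self.edges.append((start, EPS, final))
        elif kind == "sym":            # base case: a single input a
            start, final = self.new_state(), self.new_state()
            self.edges.append((start, regex[1], final))
        elif kind == "cat":            # AB: final of A moves on ε to start of B
            start, a_final = self.build(regex[1])
            b_start, final = self.build(regex[2])
            self.edges.append((a_final, EPS, b_start))
        elif kind == "alt":            # A + B: branch on ε, then rejoin
            start, final = self.new_state(), self.new_state()
            for branch in regex[1:]:
                b_start, b_final = self.build(branch)
                self.edges.append((start, EPS, b_start))
                self.edges.append((b_final, EPS, final))
        elif kind == "star":           # A*: skip A, or run it and loop back
            start, final = self.new_state(), self.new_state()
            a_start, a_final = self.build(regex[1])
            self.edges += [(start, EPS, a_start), (start, EPS, final),
                           (a_final, EPS, a_start), (a_final, EPS, final)]
        else:
            raise ValueError(f"unknown regex node: {kind}")
        return start, final

    def closure(self, states):
        """All states reachable from `states` through ε-moves only."""
        seen, work = set(states), list(states)
        while work:
            s = work.pop()
            for src, label, dst in self.edges:
                if src == s and label is EPS and dst not in seen:
                    seen.add(dst)
                    work.append(dst)
        return seen

    def accepts(self, start, final, text):
        """Simulate the NFA by tracking the set of possible states."""
        current = self.closure({start})
        for symbol in text:
            moved = {dst for src, label, dst in self.edges
                     if src in current and label == symbol}
            current = self.closure(moved)
        return final in current


# (1 + 0)*1 written as a tree
REGEX = ("cat", ("star", ("alt", ("sym", "1"), ("sym", "0"))), ("sym", "1"))
nfa = NFA()
start, final = nfa.build(REGEX)

if __name__ == "__main__":
    for word in ["1", "01", "0011", "", "10"]:
        print(repr(word), nfa.accepts(start, final, word))
```

Every fragment has exactly one start and one final state, which is what lets the inductive cases glue fragments together with ε-moves alone.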


Example of Regular Expression to NFA conversion

I Consider the regular expression (1 + 0)∗1

[Figure: NFA for (1 + 0)∗1 with states A through J; edge labels 1, 0, 1]


NFA to DFA: The Trick

I Insight: Simulate the NFA

I At any given time, the NFA is in a set of states

I States in the DFA ⇒ all (reachable) subsets of states in the NFA

I Start State: the set of states reachable through ε-moves from the NFA start state

I Add transition A →a B to the DFA iff:

    I B is the set of states reachable from any state in A after seeing input a, considering ε-moves as well
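The same rules can be written compactly. The notation here is an assumption and does not come from the slides: s_N, δ_N, F_N are the start state, transition function, and final states of the NFA, and ε-closure(X) is the set of states reachable from X through ε-moves alone. The last line, which marks a DFA state as final when it contains some NFA final state, is the usual completion of the construction and is not stated on the slide. The block needs the amsmath package.

```latex
\begin{align*}
  s_D            &= \varepsilon\text{-closure}(\{s_N\}) \\
  \delta_D(S, a) &= \varepsilon\text{-closure}\Bigl(\,\bigcup_{s \in S} \delta_N(s, a)\Bigr) \\
  F_D            &= \{\, S \mid S \cap F_N \neq \emptyset \,\}
\end{align*}
```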


NFA to DFA: Example

Recall our friendly NFA for (1 + 0)∗1:

1

0

1A BC E

D F

G H I J

ABCDHI

ABCDFGHI0

11 ABCDEGHIJ

0

Thomas Dillig, CS345H: Programming Languages Lecture 4: Implementation of Lexical Analysis 29/33

Page 134: CS345H: Programming Languages Lecture 4: Implementation of ...tdillig/cs345H/lecture4.pdf · Outline I Last time: Specifying lexical structure usingregular expressions I Today: How

NFA to DFA: Example

Recall our friendly NFA for (1 + 0)∗1:

1

0

1A BC E

D F

G H I J

ABCDHI

ABCDFGHI0

11 ABCDEGHIJ

0

Thomas Dillig, CS345H: Programming Languages Lecture 4: Implementation of Lexical Analysis 29/33

Page 135: CS345H: Programming Languages Lecture 4: Implementation of ...tdillig/cs345H/lecture4.pdf · Outline I Last time: Specifying lexical structure usingregular expressions I Today: How

NFA to DFA: Example

Recall our friendly NFA for (1 + 0)∗1:

1

0

1A BC E

D F

G H I J

ABCDHI

ABCDFGHI0

11 ABCDEGHIJ

0

Thomas Dillig, CS345H: Programming Languages Lecture 4: Implementation of Lexical Analysis 29/33

Page 136: CS345H: Programming Languages Lecture 4: Implementation of ...tdillig/cs345H/lecture4.pdf · Outline I Last time: Specifying lexical structure usingregular expressions I Today: How

NFA to DFA: Example

Recall our friendly NFA for (1 + 0)∗1:

1

0

1A BC E

D F

G H I J

ABCDHI

ABCDFGHI0

11 ABCDEGHIJ

00

Thomas Dillig, CS345H: Programming Languages Lecture 4: Implementation of Lexical Analysis 29/33

Page 137: CS345H: Programming Languages Lecture 4: Implementation of ...tdillig/cs345H/lecture4.pdf · Outline I Last time: Specifying lexical structure usingregular expressions I Today: How

NFA to DFA: Example

Recall our friendly NFA for (1 + 0)∗1:

1

0

1A BC E

D F

G H I J

ABCDHI

ABCDFGHI0

11 ABCDEGHIJ

00

Thomas Dillig, CS345H: Programming Languages Lecture 4: Implementation of Lexical Analysis 29/33

Page 138: CS345H: Programming Languages Lecture 4: Implementation of ...tdillig/cs345H/lecture4.pdf · Outline I Last time: Specifying lexical structure usingregular expressions I Today: How

NFA to DFA: Example

Recall our friendly NFA for (1 + 0)∗1:

1

0

1A BC E

D F

G H I J

ABCDHI

ABCDFGHI0

11 ABCDEGHIJ

00

Thomas Dillig, CS345H: Programming Languages Lecture 4: Implementation of Lexical Analysis 29/33

Page 139: CS345H: Programming Languages Lecture 4: Implementation of ...tdillig/cs345H/lecture4.pdf · Outline I Last time: Specifying lexical structure usingregular expressions I Today: How

NFA to DFA: Example

Recall our friendly NFA for (1 + 0)∗1:

1

0

1A BC E

D F

G H I J

ABCDHI

ABCDFGHI0

11 ABCDEGHIJ

00

Thomas Dillig, CS345H: Programming Languages Lecture 4: Implementation of Lexical Analysis 29/33

Page 140: CS345H: Programming Languages Lecture 4: Implementation of ...tdillig/cs345H/lecture4.pdf · Outline I Last time: Specifying lexical structure usingregular expressions I Today: How

NFA to DFA: Example

Recall our friendly NFA for (1 + 0)∗1:

1

0

1A BC E

D F

G H I J

ABCDHI

ABCDFGHI0

11 ABCDEGHIJ

00

1

Thomas Dillig, CS345H: Programming Languages Lecture 4: Implementation of Lexical Analysis 29/33

Page 141: CS345H: Programming Languages Lecture 4: Implementation of ...tdillig/cs345H/lecture4.pdf · Outline I Last time: Specifying lexical structure usingregular expressions I Today: How

NFA to DFA: Example

Recall our friendly NFA for (1 + 0)∗1:

1

0

1A BC E

D F

G H I J

ABCDHI

ABCDFGHI0

11 ABCDEGHIJ

00

1

Thomas Dillig, CS345H: Programming Languages Lecture 4: Implementation of Lexical Analysis 29/33

Page 142: CS345H: Programming Languages Lecture 4: Implementation of ...tdillig/cs345H/lecture4.pdf · Outline I Last time: Specifying lexical structure usingregular expressions I Today: How

NFA to DFA: Example

Recall our friendly NFA for (1 + 0)∗1:

1

0

1A BC E

D F

G H I J

ABCDHI

ABCDFGHI0

11 ABCDEGHIJ

00

1

Thomas Dillig, CS345H: Programming Languages Lecture 4: Implementation of Lexical Analysis 29/33

NFA to DFA: Example

Recall our friendly NFA for (1 + 0)∗1:

[Figure: the NFA, with states A through J and input-labeled edges 1, 0, 1, drawn above the DFA constructed from it. The DFA has three states, ABCDHI, ABCDFGHI, and ABCDEGHIJ, connected by edges labeled 0 and 1.]
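
The construction in this example can be sketched in code. The slide names the NFA states and its three input-labeled edges but the ε-edges did not survive in the text, so the edge list below is an assumed reconstruction (a standard Thompson-style NFA for (1 + 0)∗1). The helper names `move`, `epsilon_closure`, and `subset_construction` are illustrative, not taken from the lecture.

```python
EPSILON = ""  # label used for epsilon-moves in this sketch

# Assumed reconstruction of the NFA for (1 + 0)*1: (source, label, target).
NFA_EDGES = [
    ("A", EPSILON, "B"), ("A", EPSILON, "H"),
    ("B", EPSILON, "C"), ("B", EPSILON, "D"),
    ("C", "1", "E"), ("D", "0", "F"),
    ("E", EPSILON, "G"), ("F", EPSILON, "G"),
    ("G", EPSILON, "A"), ("G", EPSILON, "H"),
    ("H", EPSILON, "I"), ("I", "1", "J"),
]
NFA_START, NFA_FINAL = "A", "J"


def move(states, label):
    """All NFA states reachable from `states` by one edge with this label."""
    return {dst for (src, lab, dst) in NFA_EDGES if src in states and lab == label}


def epsilon_closure(states):
    """`states` plus everything reachable through epsilon-moves alone."""
    closure = set(states)
    worklist = list(states)
    while worklist:
        state = worklist.pop()
        for nxt in move({state}, EPSILON):
            if nxt not in closure:
                closure.add(nxt)
                worklist.append(nxt)
    return frozenset(closure)


def subset_construction(alphabet="01"):
    """Return (start, transitions, accepting) for the equivalent DFA."""
    start = epsilon_closure({NFA_START})
    transitions = {}
    seen = {start}
    worklist = [start]
    while worklist:
        current = worklist.pop()
        for symbol in alphabet:
            target = epsilon_closure(move(current, symbol))
            transitions[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                worklist.append(target)
    # A DFA state accepts if it contains the NFA's final state.
    accepting = {subset for subset in seen if NFA_FINAL in subset}
    return start, transitions, accepting


def name(subset):
    """Render a set of NFA states the way the slide does, e.g. ABCDHI."""
    return "".join(sorted(subset))


start, transitions, accepting = subset_construction()
for (source, symbol), target in sorted(
    transitions.items(), key=lambda item: (name(item[0][0]), item[0][1])
):
    print(f"{name(source)} --{symbol}--> {name(target)}")
```

Under the assumed edges, the start state comes out as ABCDHI and the only other reachable DFA states are ABCDFGHI and ABCDEGHIJ, matching the state names on the slide.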

NFA to DFA: How many states?

- We need a state in the DFA for each set of states the NFA can be in

- How many different states?

- If there are N states, the NFA must be in some subset of those N states

- How many subsets of N states? 2^N
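
The 2^N count is an upper bound: a DFA only needs the subsets that are actually reachable. To see how close a real NFA can get to that bound, here is a small sketch using a hypothetical NFA family that is not from the lecture: binary strings whose k-th symbol from the end is 1. That NFA has N = k + 1 states and no ε-moves, and the function names are illustrative.

```python
def nfa_moves(state, symbol, k):
    """Transitions of the assumed NFA with states 0..k (state k is final).

    State 0 loops on every symbol and guesses, on a 1, that this is the
    k-th symbol from the end; states 1..k-1 just count the remaining symbols.
    """
    targets = set()
    if state == 0:
        targets.add(0)
        if symbol == "1":
            targets.add(1)
    elif state < k:
        targets.add(state + 1)
    return targets


def count_reachable_subsets(k):
    """Number of DFA states the subset construction actually creates."""
    start = frozenset({0})
    seen = {start}
    worklist = [start]
    while worklist:
        current = worklist.pop()
        for symbol in "01":
            target = frozenset(
                t for s in current for t in nfa_moves(s, symbol, k)
            )
            if target not in seen:
                seen.add(target)
                worklist.append(target)
    return len(seen)


for k in range(1, 6):
    n = k + 1
    print(f"N = {n}: 2^N = {2 ** n}, reachable = {count_reachable_subsets(k)}")
```

For this family the reachable count doubles each time k grows by one, so the exponential blow-up is real and not just a loose bound. The three-state DFA in the earlier example shows the other extreme, where far fewer than 2^N subsets are reachable.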

Implementation

- A DFA can be implemented by a 2D table T
    - One dimension is “states”
    - Other dimension is “input symbols”
    - For every transition A --c--> B, define T[A,c] = B

- DFA “execution”: if in state A on input c, read T[A,c] = B and skip to state B

- Very efficient: one table lookup per input character
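
As a minimal sketch of the idea above, the table T can be a dictionary keyed by (state, symbol) pairs. The example DFA (even number of 1s) and the names `build_table` and `run_dfa` are hypothetical, chosen only for illustration.

```python
def build_table(transitions):
    """Turn (A, c, B) triples into a table T with T[A, c] = B."""
    table = {}
    for source, symbol, target in transitions:
        table[source, symbol] = target
    return table


def run_dfa(table, start, accepting, text):
    """Execute the DFA: one table lookup per input character."""
    state = start
    for symbol in text:
        if (state, symbol) not in table:
            return False  # no transition defined for this input: reject
        state = table[state, symbol]
    return state in accepting


# Hypothetical DFA: accepts binary strings containing an even number of 1s.
T = build_table([
    ("even", "0", "even"), ("even", "1", "odd"),
    ("odd", "0", "odd"), ("odd", "1", "even"),
])

print(run_dfa(T, "even", {"even"}, "1001"))  # two 1s
print(run_dfa(T, "even", {"even"}, "1011"))  # three 1s
```

The running time is linear in the length of the input and does not depend on the number of states, which is why the table form is attractive for a lexer.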

Table Implementation of a DFA

[Figure: DFA with states S, T, and U and edges labeled 0 and 1.]

        0    1
   S    T    U
   T    T    U
   U    T    U
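
The table on this slide can also be stored as a dense 2D array indexed by small integers, which is closer to what a generated scanner would hold in memory. This is a sketch under two assumptions that the slide text does not state: S is the start state, and U is the only accepting state (which would make this the DFA for (1 + 0)∗1).

```python
# Row indices for the three states on the slide.
S, T, U = 0, 1, 2

# Column index for each input symbol.
SYMBOL_COLUMN = {"0": 0, "1": 1}

# TABLE[state][column] is the next state, copied from the slide's table.
TABLE = [
    [T, U],  # from S
    [T, U],  # from T
    [T, U],  # from U
]

START = S        # assumption: S is the start state
ACCEPTING = {U}  # assumption: U is the only accepting state


def accepts(text):
    """Run the DFA over a string of 0s and 1s using array lookups only."""
    state = START
    for ch in text:
        state = TABLE[state][SYMBOL_COLUMN[ch]]
    return state in ACCEPTING


for sample in ["1", "0101", "10", ""]:
    print(repr(sample), accepts(sample))
```

A character outside the alphabet raises `KeyError` here; a real scanner would map it to an error action instead.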

Implementation cont.

- Writing regular expressions as NFAs and converting them to DFAs is exactly what flex does

- In fact, if you open the auto-generated flex file lex.yy.c, you will see these tables emitted

- But these DFAs can be huge

- In practice, flex-like tools trade off speed for space in the choice of NFA and DFA representations
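
One simple way to trade a little speed for space is to store each distinct table row once and let states with identical rows share it. The sketch below illustrates that general idea only; it is not a description of the compression scheme flex actually emits, and `compress_rows` and `lookup` are hypothetical names.

```python
def compress_rows(table):
    """Store each distinct row once; map every state to its shared row."""
    unique_rows = []
    row_index = {}      # row contents -> position in unique_rows
    row_of_state = []   # state number -> position in unique_rows
    for row in table:
        key = tuple(row)
        if key not in row_index:
            row_index[key] = len(unique_rows)
            unique_rows.append(key)
        row_of_state.append(row_index[key])
    return unique_rows, row_of_state


def lookup(unique_rows, row_of_state, state, column):
    """Next-state lookup: one extra indirection compared to a dense table."""
    return unique_rows[row_of_state[state]][column]


# A dense table with three states and two symbols whose rows are all equal.
TABLE = [
    [1, 2],
    [1, 2],
    [1, 2],
]

unique_rows, row_of_state = compress_rows(TABLE)
print(len(TABLE), "rows stored densely,", len(unique_rows), "after sharing")
```

Each lookup now costs two array accesses instead of one, which is the speed given up in exchange for the smaller table.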
