Date post: | 20-Nov-2014 |
Category: | Education |
Upload: | arab-open-university-and-cairo-university |
WELCOME TO A JOURNEY TO
Compilers
CS419 Lecture 7
Parsing tokens using Context Free Grammars
Cairo University, FCI
Dr. Hussien Sharaf, Computer Science
[email protected]
PART ONE
Dr. Hussien M. Sharaf
PARSING
A parser gets a stream of tokens from the scanner and determines whether the syntax (structure) of the program is correct according to the (context-free) grammar of the source language.
Then it produces a data structure, called a parse tree or an abstract syntax tree, which describes the syntactic structure of the program.
[Figure: stream of tokens → parser → parse/syntax tree]
CFG
A context-free grammar is a notation for defining context-free languages.
It is more powerful than finite automata or REs, but still cannot define all possible languages.
Useful for nested structures, e.g., parentheses in programming languages.
The basic idea is to use “variables” to stand for sets of strings.
These variables are defined recursively, in terms of one another.
CFG FORMAL DEFINITION
C = (V, Σ, R, S)
V: a finite set of variables.
Σ: a finite set of symbols, called terminals, that form the alphabet of the language being defined.
S ∈ V: a special start symbol.
R: a finite set of production rules of the form A → w, where A ∈ V and w ∈ (V ∪ Σ)*.
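The four components of the definition can be written down directly as a data structure. The following is a minimal sketch, not part of the original slides: the class name `Grammar`, its field names, and the `is_well_formed` check are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    """A context-free grammar (V, Σ, R, S); all names here are illustrative."""
    variables: frozenset   # V
    terminals: frozenset   # Σ
    rules: tuple           # R: pairs (A, w), where w is a tuple of symbols
    start: str             # S

    def is_well_formed(self):
        # V and Σ must be disjoint, and S must be a variable.
        if self.variables & self.terminals or self.start not in self.variables:
            return False
        symbols = self.variables | self.terminals
        # Every rule A → w needs A ∈ V and every symbol of w in V ∪ Σ.
        return all(
            head in self.variables and all(sym in symbols for sym in body)
            for head, body in self.rules
        )


# The grammar S → ab | aSb written as a 4-tuple.
anbn = Grammar(
    variables=frozenset({"S"}),
    terminals=frozenset({"a", "b"}),
    rules=(("S", ("a", "b")), ("S", ("a", "S", "b"))),
    start="S",
)
print(anbn.is_well_formed())
```

An empty tuple as a rule body would stand for Λ in this encoding.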
CFG -1
Define the language { a^n b^n | n ≥ 1 }.
Terminals = {a, b}.
Variables = {S}.
Start symbol = S.
Productions:
S → ab
S → aSb
Summary: S → ab | aSb
DERIVATION
We derive strings in the language of a CFG by starting with the start symbol and repeatedly replacing some variable A by the right side of one of its productions.
Derivation example for “aabb”:
Using S → aSb generates an incomplete string that still has the non-terminal S.
Then using S → ab to replace the inner S generates “aabb”.
S ⇒ aSb ⇒ aabb ……[Successful derivation of aabb]
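The replacement step can be mimicked with plain string operations. This is a small sketch under the assumption that every variable is a single character and that each step rewrites the leftmost occurrence of the chosen variable; the function names `apply_leftmost` and `derive` are made up for illustration.

```python
def apply_leftmost(sentential, variable, body):
    """Replace the leftmost occurrence of `variable` in `sentential` with `body`."""
    index = sentential.find(variable)
    if index < 0:
        raise ValueError(f"{variable!r} does not occur in {sentential!r}")
    return sentential[:index] + body + sentential[index + len(variable):]


def derive(start, steps):
    """Return every sentential form produced by applying `steps` in order."""
    forms = [start]
    for variable, body in steps:
        forms.append(apply_leftmost(forms[-1], variable, body))
    return forms


# First S → aSb, then S → ab on the remaining S.
forms = derive("S", [("S", "aSb"), ("S", "ab")])
print(" => ".join(forms))  # S => aSb => aabb
```

A derivation is successful once the last form contains no variables, as is the case for `aabb` here.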
CFG -1 : BALANCED-PARENTHESES
Prod1: S → (S)
Prod2: S → ()
Derive the string ((())).
S → (S) …..[by prod1]
→ ((S)) …..[by prod1]
→ ((())) …..[by prod2]
CFG -2 : PALINDROME
Describe palindromes of a’s and b’s using a CFG.
1] S → aSa
2] S → bSb
3] S → Λ
Derive “baab” from the above grammar.
S → bSb [by 2]
→ baSab [by 1]
→ baab [by 3]
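CFG -3 : EVEN-PALINDROME
i.e. {Λ, aa, bb, abba, abbaabba, … }
S → aSa | bSb | Λ
Derive abaaba.
[Figure: parse tree for abaaba. The root S expands to a S a, the next S to b S b, the next S to a S a, and the innermost S to Λ.]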
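The three productions translate almost one-to-one into a recursive membership test. The sketch below is an illustration rather than something from the slides; the function name `is_even_palindrome` is an assumption.

```python
def is_even_palindrome(w):
    """Recognise the strings generated by S → aSa | bSb | Λ."""
    if w == "":
        return True  # S → Λ
    if len(w) >= 2 and w[0] == w[-1] and w[0] in "ab":
        # S → aSa or S → bSb: strip the matching ends and recurse on the middle.
        return is_even_palindrome(w[1:-1])
    return False


for word in ["abaaba", "baab", "ab", "aba"]:
    print(word, is_even_palindrome(word))
```

Each recursive call corresponds to one level of the parse tree, so the recursion depth for an accepted string is half its length plus one.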
CFG – 4
Describe any string in (a+b)* using a CFG.
1] S → Λ
2] S → Y
3] Y → aY
4] Y → bY
5] Y → a
6] Y → b
Derive “aab” from the above grammar.
S → Y [by 2]
→ aY [by 3]
→ aaY [by 3]
→ aab [by 6]
CFG – 5
1] S → Λ
2] S → aS
3] S → bS
Derive “aa” from the above grammar.
S → aS [by 2]
→ aaS [by 2]
→ aa [by 1]
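One way to check that CFG – 4 and CFG – 5 describe the same language is to enumerate the short strings each one generates. The following is a rough sketch with several assumptions: variables are single uppercase letters, the empty string stands for Λ, and the name `language_up_to` is made up. The pruning used here is enough for these two grammars but is not guaranteed to terminate for grammars that combine rules such as A → AA with A → Λ.

```python
from collections import deque


def language_up_to(rules, start, max_len):
    """Collect every terminal string of length <= max_len derivable from `start`.

    `rules` maps each variable to a list of right-hand sides ("" means Λ).
    """
    found, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        # Locate the leftmost variable, if any.
        index = next((i for i, ch in enumerate(form) if ch in rules), None)
        if index is None:
            found.add(form)  # no variables left: a string of the language
            continue
        for body in rules[form[index]]:
            new_form = form[:index] + body + form[index + 1:]
            # Prune forms whose terminals alone already exceed the bound.
            terminal_count = sum(1 for ch in new_form if ch not in rules)
            if terminal_count <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return sorted(found, key=lambda s: (len(s), s))


cfg4 = {"S": ["", "Y"], "Y": ["aY", "bY", "a", "b"]}
cfg5 = {"S": ["", "aS", "bS"]}
print(language_up_to(cfg5, "S", 2))
# ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
print(language_up_to(cfg4, "S", 2) == language_up_to(cfg5, "S", 2))  # True
```

Agreement up to a length bound is only evidence, not a proof, that the two grammars are equivalent.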
PART TWO
Parsing
A CFG is about categorizing the statements of a language.
Parsing using a CFG means assigning a given statement to the categories defined in the CFG.
Parsing can be expressed using a special type of graph called a tree, in which no cycles exist.
A parse tree is the graph representation of a derivation.
Programmatically, a parse tree can be represented as a dynamic data structure with a single root node.
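A dynamic structure with a single root node might look like the sketch below. The class name `Node` and the method name `frontier` are assumptions chosen for illustration; the example tree is the one for “aabb” under S → aSb | ab.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One vertex of a parse tree; leaves carry terminals (or "" for Λ)."""
    label: str
    children: list = field(default_factory=list)

    def frontier(self):
        """Concatenate the leaf labels from left to right."""
        if not self.children:
            return self.label
        return "".join(child.frontier() for child in self.children)


# S → aSb at the root, S → ab below it; `root` is the single entry point.
root = Node("S", [Node("a"), Node("S", [Node("a"), Node("b")]), Node("b")])
print(root.frontier())  # aabb
```

Reading the leaves from left to right recovers the derived string, which is why the tree and the derivation carry the same information.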
Parse tree
(1) A vertex with a label which is a non-terminal symbol is a parse tree.
(2) If A → y1 y2 … yn is a rule in R, then the tree with root A and children y1, y2, …, yn, in that order, is a parse tree.
Ambiguity
A grammar can generate the same string in different ways.
Ambiguity occurs when a string has two or more leftmost derivations for the same CFG.
There are ways to reduce ambiguity, such as rewriting the grammar in Chomsky Normal Form (CNF), which doesn’t use Λ.
Λ-productions can cause ambiguity.
Ex 1
Deduce a CFG for addition and parse the following expression: 2+3+5
1] S → S+S | N
2] N → 1|2|3|4|5|6|7|8|9|0|N1|N2|N3|N4|N5|N6|N7|N8|N9|N0
[Figure: parse tree for 2+3+5 with S → S+S at the root; the left S derives 2+3 and the right S derives N → 5.]
Can you make another parse tree?
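The question can also be answered by counting. The sketch below counts the parse trees of an expression under S → S+S | N, under the simplifying assumption that N is a single digit (the multi-digit rules N → N1 | N2 | … are left out); the function name `count_parse_trees` is made up for illustration.

```python
from functools import lru_cache


def count_parse_trees(expr):
    """Count parse trees of `expr` under S → S+S | N, with N a single digit."""

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways S can derive the substring expr[lo:hi].
        total = 0
        if hi - lo == 1 and expr[lo].isdigit():
            total += 1  # S → N → digit
        for mid in range(lo + 1, hi - 1):
            if expr[mid] == "+":
                # S → S + S, split at this '+'.
                total += count(lo, mid) * count(mid + 1, hi)
        return total

    return count(0, len(expr))


print(count_parse_trees("2+3+5"))  # 2
```

A count above one means the grammar is ambiguous for that string: here one tree groups 2+3 first and the other groups 3+5 first.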
Ex 2
Deduce a CFG for addition/multiplication and parse the following expression: 2+3*5
1] S → S+S | S*S | N
2] N → 1|2|3|4|5|6|7|8|9|0|NN
[Figure: parse tree for 2+3*5 with S → S*S at the root; the left S derives 2+3 and the right S derives N → 5.]
Can you make another parse tree?
Ex 3: CFG without ambiguity
Deduce a CFG for addition/multiplication and parse the following expression: 2*3+5
1] S → Term | Term + S
2] Term → N | N * Term
3] N → 1|2|3|4|5|6|7|8|9|0
[Figure: parse tree for 2*3+5 with + at the root; the left subtree derives 2*3 and the right subtree derives 5.]
Can you make another parse tree?
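Because each rule of this grammar can be chosen by looking at the next token, it maps naturally onto a recursive-descent parser with one function per variable. The following is a minimal sketch, assuming single-character tokens and no whitespace; the function name `parse` and the nested-tuple tree format are illustrative choices, not something given in the slides.

```python
def parse(expr):
    """Parse `expr` with S → Term | Term + S, Term → N | N * Term, N → digit.

    Returns a nested tuple mirroring the parse tree and raises ValueError
    on a syntax error.
    """
    pos = 0

    def peek():
        return expr[pos] if pos < len(expr) else None

    def number():
        nonlocal pos
        ch = peek()
        if ch is None or not ch.isdigit():
            raise ValueError(f"digit expected at position {pos}")
        pos += 1
        return ("N", ch)

    def term():
        nonlocal pos
        left = number()
        if peek() == "*":
            pos += 1
            return ("Term", left, "*", term())  # Term → N * Term
        return ("Term", left)                   # Term → N

    def s():
        nonlocal pos
        left = term()
        if peek() == "+":
            pos += 1
            return ("S", left, "+", s())        # S → Term + S
        return ("S", left)                      # S → Term

    tree = s()
    if pos != len(expr):
        raise ValueError(f"unexpected input at position {pos}")
    return tree


print(parse("2*3+5"))
```

The parser never has to choose between two trees for the same input, which reflects the fact that multiplication is forced below addition by the Term level of the grammar.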
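Example 4 : AABB
S → A | A B
A → Λ | a | A b | A A
B → b | b c | B c | b B
Sample derivations:
S ⇒ AB ⇒ AbB ⇒ Abb ⇒ AAbb ⇒ Aabb ⇒ aabb
S ⇒ AB ⇒ AAB ⇒ aAB ⇒ aaB ⇒ aabB ⇒ aabb
[Figure: two parse trees for aabb. In the first, S → A B with A → A A → a a and B → b B. In the second, S → A B with B → b and A → A b, where the inner A → A A → a a.]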
Ex 5
S → A | A B
A → Λ | a | A b | A A
B → b | b c | B c | b B
w = aabb
[Figure: three different parse trees for w = aabb under this grammar.]
REMOVING AMBIGUITY
Eliminate “useless” variables.
Eliminate Λ-productions: A → Λ.
Avoid left recursion by replacing it with right recursion.
But if a language is inherently ambiguous, the ambiguity can’t be totally removed. We just need the parsing to continue without entering an infinite loop.
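Replacing left recursion with right recursion can be sketched with the textbook rewrite for immediate left recursion: A → Aα | β becomes A → β | βA' and A' → α | αA', which avoids introducing a Λ-production. This is one standard transformation and not necessarily the one intended in the lecture. The function name, the fresh variable name A', and the left-recursive example grammar S → S + Term | Term are assumptions made for illustration, and rules of the form A → A are assumed not to occur.

```python
def remove_immediate_left_recursion(variable, bodies):
    """Rewrite A → Aα | β as A → β | βA', A' → α | αA'.

    `bodies` is a list of right-hand sides, each a tuple of symbols.
    """
    # α parts: what follows the leading A in each left-recursive rule.
    recursive = [body[1:] for body in bodies if body[:1] == (variable,)]
    # β parts: the rules that do not start with A.
    others = [body for body in bodies if body[:1] != (variable,)]
    if not recursive:
        return {variable: list(bodies)}
    fresh = variable + "'"
    return {
        variable: others + [body + (fresh,) for body in others],
        fresh: recursive + [alpha + (fresh,) for alpha in recursive],
    }


# Hypothetical left-recursive grammar: S → S + Term | Term.
print(remove_immediate_left_recursion("S", [("S", "+", "Term"), ("Term",)]))
```

A top-down parser that expands S → S + Term first would call itself forever without consuming input; after the rewrite every rule consumes at least one Term or one + before recursing.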
THANK YOU