
Compiler Design UNIT-2

Date post: 14-Apr-2017
Upload: ankur-srivastava
Page 1: Compiler Design UNIT-2

COMPILER DESIGN
• UNIT-2: BASIC PARSING TECHNIQUES
• Topics: Parsers, shift-reduce parsing, operator precedence parsing, top-down parsing, predictive parsers.
• Automatic construction of efficient parsers: LR parsers, the canonical collection of LR(0) items, constructing SLR parsing tables, constructing canonical LR parsing tables, constructing LALR parsing tables, using ambiguous grammars, an automatic parser generator, implementation of LR parsing tables.

31-01-2017 ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT (CD)

Page 2: Compiler Design UNIT-2

DEFINITION OF PARSING

A parser is a compiler or interpreter component that breaks data into smaller elements for easy translation into another language.

A parser takes input in the form of a sequence of tokens or program instructions and usually builds a data structure in the form of a parse tree or an abstract syntax tree.


Page 3: Compiler Design UNIT-2

Contd.

• The parser identifies the language constructs present in a given input program.
• If the parser determines the input to be valid, it outputs a representation of the input in the form of a parse tree.
• If the input is grammatically incorrect, the parser reports a syntax error in the input. In this case no parse tree can be produced.

Page 4: Compiler Design UNIT-2

ROLE OF PARSER


Page 5: Compiler Design UNIT-2

• In the compiler model, the parser obtains a string of tokens from the lexical analyzer and verifies that the string can be generated by the grammar for the source language.
• The parser reports any syntax errors in the source program.
• It collects a sufficient number of tokens and builds a parse tree.

Page 6: Compiler Design UNIT-2


Page 7: Compiler Design UNIT-2

• There are basically two types of parser:
• Top-down parser:
  • starts at the root of the derivation tree and fills in
  • picks a production and tries to match the input
  • may require backtracking
  • some grammars are backtrack-free (predictive)

Page 8: Compiler Design UNIT-2

Parser contd.
• Bottom-up parser:
  • starts at the leaves and fills in
  • starts in a state valid for legal first tokens
  • uses a stack to store both state and sentential forms

Page 9: Compiler Design UNIT-2

TOP DOWN PARSING
• A top-down parser starts with the root of the parse tree, labeled with the start or goal symbol of the grammar.
• To build a parse, it repeats the following steps until the fringe of the parse tree matches the input string:
  • STEP 1: At a node labeled A, select a production A → α and construct the appropriate child for each symbol of α.
  • STEP 2: When a terminal is added to the fringe that doesn't match the input string, backtrack.
  • STEP 3: Find the next node to be expanded.
• The key is selecting the right production in step 1.

Page 10: Compiler Design UNIT-2

EXAMPLE FOR TOP DOWN PARSING
• Suppose the given production rules are as follows:
• S -> aAd | aB
• A -> b | c
• B -> ccd

Page 11: Compiler Design UNIT-2

PROBLEMS WITH TOP DOWN PARSING

1) BACKTRACKING
Backtracking is a technique in which, to expand a non-terminal symbol, we choose one alternative and, if a mismatch occurs, try another alternative (if any). If a non-terminal has multiple production rules beginning with the same input symbol, then to get the correct derivation we may need to try all of these alternatives.

Page 12: Compiler Design UNIT-2

EXAMPLE OF BACKTRACKING

• Suppose the given production rules are as follows:
• S -> cAd
• A -> a | ab
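The backtracking behaviour on this grammar can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the names `GRAMMAR`, `derive` and `accepts` are hypothetical, and the sketch assumes the grammar has no left recursion (otherwise it would recurse forever).

```python
# Grammar from the slide: S -> cAd, A -> a | ab
GRAMMAR = {
    "S": [["c", "A", "d"]],
    "A": [["a"], ["a", "b"]],
}

def derive(symbols, text, pos):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Non-terminal: try each alternative in order. A failed alternative
        # simply yields nothing, which is the backtracking step.
        for alternative in GRAMMAR[head]:
            for mid in derive(alternative, text, pos):
                yield from derive(rest, text, mid)
    elif pos < len(text) and text[pos] == head:
        # Terminal that matches the next input character.
        yield from derive(rest, text, pos + 1)

def accepts(text):
    """True if the whole input can be derived from S."""
    return any(end == len(text) for end in derive(["S"], text, 0))
```

For the input `cabd`, the first alternative A -> a matches `a` but then `d` fails against `b`, so the parser backtracks and succeeds with A -> ab.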

Page 13: Compiler Design UNIT-2

More Example
• S → rXd | rZd
• X → oa | ea
• Z → ai

Page 14: Compiler Design UNIT-2

2) LEFT RECURSION
Left recursion occurs when the leftmost symbol on the right-hand side of a production for a non-terminal is that non-terminal itself (direct left recursion), or when the non-terminal rewrites to itself again through the productions of some other non-terminal (indirect left recursion). Consider these examples:

(1) A -> Aq (direct)
(2) A -> Bq
    B -> Ar (indirect)

Left recursion has to be removed if the parser performs top-down parsing.

Page 15: Compiler Design UNIT-2

Contd.
• A production is left-recursive if the leftmost symbol on the right side is the same as the nonterminal on the left side.
• For example, expr → expr + term.
• A grammar is left recursive if it has a nonterminal, say A, for which there is a derivation A ⇒+ Aα for some string α.
• Left recursion creates difficulties when designing top-down parsers, because the parser can keep expanding A without consuming any input.
• Left recursion is of two types:
  • Immediate left recursion
  • General left recursion

Page 16: Compiler Design UNIT-2

Contd.
• Immediate left recursion occurs when a nonterminal A has a production rule of the form A → Aα | β.
• Immediate left recursion can be eliminated by introducing a new nonterminal symbol, say A', and modifying the grammar.
• The grammar rule A → Aα | β is modified as:
  • A → βA'
  • A' → αA' | ε
• Thus the rule A → Aα1 | Aα2 | … | Aαm | β1 | β2 | … | βn can be modified as:

Page 17: Compiler Design UNIT-2

Contd.
• A → β1A' | β2A' | … | βnA'
• A' → α1A' | α2A' | … | αmA' | ε
• Example
• Consider the following left-recursive grammar for arithmetic expressions:
  E → E + T | T
  T → T * F | F
  F → (E) | id

Page 18: Compiler Design UNIT-2

Contd.
• Elimination of immediate left recursion from the rules modifies the grammar as:
• E → TE'
• E' → +TE' | ε
• T → FT'
• T' → *FT' | ε
• F → (E) | id
• However, even if there is no immediate left recursion, a number of production rules may act together to give general left recursion.
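The rewriting rule for immediate left recursion is mechanical, so it can be sketched as a small Python function. This is only an illustration: the function name `eliminate_immediate`, the list-of-symbols encoding and the `'` suffix for the new nonterminal are assumptions, and the sketch assumes the non-recursive alternatives β are not ε.

```python
EPSILON = "ε"

def eliminate_immediate(nonterminal, alternatives):
    """Rewrite A -> A a1 | ... | b1 | ... as A -> bi A', A' -> aj A' | ε.

    Each alternative is a list of grammar symbols.
    """
    # The α parts: alternatives that start with the nonterminal itself.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # The β parts: all other alternatives.
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: alternatives}  # nothing to do
    fresh = nonterminal + "'"
    return {
        nonterminal: [beta + [fresh] for beta in others],
        fresh: [alpha + [fresh] for alpha in recursive] + [[EPSILON]],
    }
```

Applied to E → E + T | T, this returns E → TE' and E' → +TE' | ε, matching the grammar above.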

Page 19: Compiler Design UNIT-2

Contd.

• S → Aa
• A → Sb | c
• Here, S is left recursive, because S ⇒ Aa ⇒ Sba.
• This form of general left recursion can be eliminated with the following algorithm.

Page 20: Compiler Design UNIT-2

Algorithm


Page 21: Compiler Design UNIT-2

Contd.
• For example, consider the grammar:
• S → Aa
• A → Sb | c
• Let the order of nonterminals be S, A.
• For i = 1, the rule S → Aa passes through unchanged, since there is no immediate left recursion.
• For i = 2, A → Sb | c is modified as A → Aab | c, which has immediate left recursion and hence is eliminated by modifying the rule as:
  A → cA'
  A' → abA' | ε
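The worked example follows the usual textbook procedure: process the nonterminals in a fixed order, substitute the already-processed nonterminals into leading positions, then remove immediate left recursion. A minimal Python sketch of that procedure is below; the function name and grammar encoding are assumptions for illustration, and the sketch assumes a grammar without cycles or ε-productions.

```python
def eliminate_left_recursion(grammar, order):
    """Remove general left recursion, processing nonterminals in `order`."""
    g = {a: [list(alt) for alt in alts] for a, alts in grammar.items()}
    for i, a_i in enumerate(order):
        # Substitute each earlier nonterminal A_j that appears in leading position.
        for a_j in order[:i]:
            expanded = []
            for alt in g[a_i]:
                if alt and alt[0] == a_j:
                    expanded += [delta + alt[1:] for delta in g[a_j]]
                else:
                    expanded.append(alt)
            g[a_i] = expanded
        # Remove immediate left recursion from the productions of A_i.
        recursive = [alt[1:] for alt in g[a_i] if alt and alt[0] == a_i]
        if recursive:
            fresh = a_i + "'"
            others = [alt for alt in g[a_i] if not alt or alt[0] != a_i]
            g[a_i] = [beta + [fresh] for beta in others]
            g[fresh] = [alpha + [fresh] for alpha in recursive] + [["ε"]]
    return g
```

Calling it with `{"S": [["A", "a"]], "A": [["S", "b"], ["c"]]}` and the order `["S", "A"]` reproduces the result of the example: S → Aa, A → cA', A' → abA' | ε.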

Page 22: Compiler Design UNIT-2

RECURSIVE DESCENT PARSING
• A recursive descent parser is a kind of top-down parser built from a set of mutually recursive procedures (or a non-recursive equivalent), where each such procedure usually implements one of the productions of the grammar.
• It is a common form of top-down parsing.
• It is called recursive because it uses recursive procedures to process the input.
• It constructs the parse tree from the top, and the input is read from left to right.
• It uses a procedure for every terminal and non-terminal entity.

Page 23: Compiler Design UNIT-2

Example
• Consider the grammar
  S → abA
  A → cd | c | ε
• For the input stream ab, the recursive descent parser starts by constructing a parse tree representing S → abA.
• Now construct the parse tree for the above grammar.
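A recursive descent parser for this grammar could look like the following Python sketch, with one method per nonterminal. The class and method names are hypothetical, the parse tree is represented as nested tuples, and the two alternatives A → cd | c are handled by checking for `d` after `c` has been matched (a hand-applied left factoring, which is an assumption of this sketch rather than something shown on the slide).

```python
class Parser:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def peek(self):
        # "$" stands for end of input.
        return self.text[self.pos] if self.pos < len(self.text) else "$"

    def match(self, ch):
        if self.peek() != ch:
            raise SyntaxError(f"expected {ch!r} at position {self.pos}")
        self.pos += 1

    def S(self):
        # S -> a b A
        self.match("a")
        self.match("b")
        return ("S", "a", "b", self.A())

    def A(self):
        # A -> c d | c | ε
        if self.peek() == "c":
            self.match("c")
            if self.peek() == "d":
                self.match("d")
                return ("A", "c", "d")
            return ("A", "c")
        return ("A", "ε")

def parse(text):
    parser = Parser(text)
    tree = parser.S()
    parser.match("$")  # the whole input must be consumed
    return tree
```

For the input `ab`, `parse` returns a tree in which A derives ε.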

Page 24: Compiler Design UNIT-2

PREDICTIVE LL(1) PARSING

• The first "L" in LL(1) refers to the fact that the input is processed from left to right.
• The second "L" refers to the fact that LL(1) parsing determines a leftmost derivation for the input string.
• The "1" in parentheses implies that LL(1) parsing uses only one symbol of input to predict the next grammar rule that should be used.
• The data structures used by LL(1) are:
  1. Input buffer
  2. Stack
  3. Parsing table

Page 25: Compiler Design UNIT-2

• The construction of a predictive LL(1) parser is based on two very important functions: FIRST and FOLLOW.
• To construct a predictive LL(1) parser, follow these steps:
  • STEP 1: Compute the FIRST and FOLLOW functions.
  • STEP 2: Construct the predictive parsing table using FIRST and FOLLOW.
  • STEP 3: Parse the input string with the help of the predictive parsing table.

Page 26: Compiler Design UNIT-2

FIRST
• If X is a terminal, then First(X) is just {X}.
• If there is a production X → ε, then add ε to First(X).
• If there is a production X → Y1Y2..Yk, then add First(Y1Y2..Yk) to First(X).
• First(Y1Y2..Yk) is either
  • First(Y1), if First(Y1) doesn't contain ε, or
  • everything in First(Y1) except for ε, as well as everything in First(Y2..Yk), if First(Y1) does contain ε.
• If First(Y1), First(Y2), .., First(Yk) all contain ε, then add ε to First(Y1Y2..Yk) as well.
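The rule for First(Y1Y2..Yk) can be written as a short Python function. This is a sketch under the assumption that the FIRST sets of the individual symbols are already known and passed in as a dictionary; the function name and the symbols used in the usage note are hypothetical.

```python
EPSILON = "ε"

def first_of_sequence(symbols, first):
    """FIRST of the string Y1 Y2 .. Yk, given FIRST of each single symbol."""
    result = set()
    for symbol in symbols:
        # Everything in FIRST(Yi) except ε belongs to the result.
        result |= first[symbol] - {EPSILON}
        if EPSILON not in first[symbol]:
            return result  # Yi cannot vanish, so later symbols are not visible
    # Every Yi can derive ε (or the sequence is empty), so the whole string can.
    result.add(EPSILON)
    return result
```

For example, with FIRST(Y1) = {b, ε} and FIRST(Y2) = {c}, the call `first_of_sequence(["Y1", "Y2"], first)` yields {b, c}.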

Page 27: Compiler Design UNIT-2

FOLLOW
• First put $ (the end of input marker) in FOLLOW(S), where S is the start symbol.
• If there is a production A → αBβ (where α and β can be whole strings), then everything in FIRST(β) except for ε is placed in FOLLOW(B).
• If there is a production A → αB, then everything in FOLLOW(A) is in FOLLOW(B).
• If there is a production A → αBβ, where FIRST(β) contains ε, then everything in FOLLOW(A) is in FOLLOW(B).

Page 28: Compiler Design UNIT-2

EXAMPLE OF FIRST AND FOLLOW
The Grammar
E → TE'
E' → +TE'
E' → ε
T → FT'
T' → *FT'
T' → ε
F → (E)
F → id
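For this grammar, FIRST and FOLLOW can be computed by repeating the rules until nothing changes. The Python sketch below is one way to do that; the names `GRAMMAR`, `first_of`, `compute_first` and `compute_follow` are assumptions for illustration, and the grammar is encoded as lists of symbols.

```python
EPSILON, END = "ε", "$"
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPSILON]],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPSILON]],
    "F": [["(", "E", ")"], ["id"]],
}

def first_of(symbols, first):
    """FIRST of a string of symbols, using the FIRST sets computed so far."""
    out = set()
    for s in symbols:
        s_first = first[s] if s in GRAMMAR else {s}  # a terminal is its own FIRST
        out |= s_first - {EPSILON}
        if EPSILON not in s_first:
            return out
    out.add(EPSILON)
    return out

def compute_first():
    first = {a: set() for a in GRAMMAR}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for a, alts in GRAMMAR.items():
            for alt in alts:
                new = first_of(alt, first)
                if not new <= first[a]:
                    first[a] |= new
                    changed = True
    return first

def compute_follow(first, start="E"):
    follow = {a: set() for a in GRAMMAR}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for a, alts in GRAMMAR.items():
            for alt in alts:
                for i, b in enumerate(alt):
                    if b not in GRAMMAR:
                        continue  # FOLLOW is only defined for nonterminals
                    tail = first_of(alt[i + 1:], first)
                    new = tail - {EPSILON}
                    if EPSILON in tail:  # the rest can vanish: inherit FOLLOW(A)
                        new |= follow[a]
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow
```

Running it gives FIRST(E) = FIRST(T) = FIRST(F) = {(, id}, FIRST(E') = {+, ε}, FIRST(T') = {*, ε}, and FOLLOW(E) = FOLLOW(E') = {), $}, FOLLOW(T) = FOLLOW(T') = {+, ), $}, FOLLOW(F) = {*, +, ), $}.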

Page 29: Compiler Design UNIT-2

PROPERTIES OF LL(1) GRAMMARS
1. No left-recursive grammar is LL(1).
2. No ambiguous grammar is LL(1).
3. Some languages have no LL(1) grammar.
4. An ε-free grammar where each alternative expansion for A begins with a distinct terminal is a simple LL(1) grammar.

Example:
S → aS | a
is not LL(1) because FIRST(aS) = FIRST(a) = { a }.
S → aS´
S´ → aS´ | ε
accepts the same language and is LL(1).

Page 30: Compiler Design UNIT-2

PREDICTIVE PARSING TABLE

Method:
1. ∀ production A → α:
   a) ∀ terminal a ∈ FIRST(α), add A → α to M[A,a]
   b) If ε ∈ FIRST(α):
      I. ∀ terminal b ∈ FOLLOW(A), add A → α to M[A,b]
      II. If $ ∈ FOLLOW(A), add A → α to M[A,$]
2. Set each undefined entry of M to error.

If any entry M[A,a] contains multiple productions, then G is not LL(1).
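The method can be sketched in Python for the expression grammar E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε, F → (E) | id. The FIRST and FOLLOW sets below are written out by hand as an assumption of the sketch, and the names `build_table` and `parse` are hypothetical. The second function is a simple table-driven driver that uses the stack and the table to accept or reject a token list.

```python
EPSILON, END = "ε", "$"
PRODUCTIONS = [
    ("E", ["T", "E'"]),
    ("E'", ["+", "T", "E'"]),
    ("E'", [EPSILON]),
    ("T", ["F", "T'"]),
    ("T'", ["*", "F", "T'"]),
    ("T'", [EPSILON]),
    ("F", ["(", "E", ")"]),
    ("F", ["id"]),
]
# Hand-computed sets for this grammar (assumed, not derived in this block).
FIRST = {"E": {"(", "id"}, "E'": {"+", EPSILON}, "T": {"(", "id"},
         "T'": {"*", EPSILON}, "F": {"(", "id"}}
FOLLOW = {"E": {")", END}, "E'": {")", END}, "T": {"+", ")", END},
          "T'": {"+", ")", END}, "F": {"*", "+", ")", END}}

def first_of(symbols):
    out = set()
    for s in symbols:
        s_first = FIRST.get(s, {s})  # terminals (and ε) stand for themselves
        out |= s_first - {EPSILON}
        if EPSILON not in s_first:
            return out
    return out | {EPSILON}

def build_table():
    table = {}
    for head, body in PRODUCTIONS:
        lookaheads = first_of(body)
        if EPSILON in lookaheads:
            # Step 1b: use FOLLOW(A), which already contains $ where applicable.
            lookaheads = (lookaheads - {EPSILON}) | FOLLOW[head]
        for terminal in lookaheads:
            if (head, terminal) in table:
                raise ValueError(f"not LL(1): conflict at M[{head}, {terminal}]")
            table[head, terminal] = body
    return table

def parse(tokens, table, start="E"):
    stack = [END, start]
    tokens = list(tokens) + [END]
    pos = 0
    while stack:
        top = stack.pop()
        if top == EPSILON:
            continue
        if top in FIRST:  # nonterminal: expand using the table
            body = table.get((top, tokens[pos]))
            if body is None:
                return False  # undefined entry means a syntax error
            stack.extend(reversed(body))
        elif top == tokens[pos]:
            pos += 1  # terminal matches the current token
        else:
            return False
    return pos == len(tokens)
```

Entries that are absent from the dictionary play the role of the error entries in step 2.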

Page 31: Compiler Design UNIT-2

EXAMPLE OF PREDICTIVE PARSING LL(1) TABLE

The given grammar is as follows:
S → E
E → TE´
E´ → +E | -E | ε
T → FT´
T´ → *T | /T | ε
F → num | id

Page 32: Compiler Design UNIT-2

BOTTOM UP PARSING
• Bottom-up parsing starts from the leaf nodes of a tree and works upward until it reaches the root node.
• We start from a sentence and apply production rules in reverse in order to reach the start symbol.
• The parser tries to identify the R.H.S. of a production rule and replace it by the corresponding L.H.S. This activity is known as reduction.
• Such a parser is also known as an LR parser, where L means tokens are read from left to right and R means that it constructs a rightmost derivation in reverse.
• Bottom-up parsing is the reverse process of top-down parsing.

Page 33: Compiler Design UNIT-2

Example
• Consider the grammar
  S → aABe
  A → Abc | b
  B → d
  and the sentence abbcde. Parsing by bottom-up methods gives the sequence of reductions
  abbcde ⇒ aAbcde ⇒ aAde ⇒ aABe ⇒ S

Reverse of this gives:

S => aABe => aAde => aAbcde => abbcde

which is clearly a series of rightmost derivation steps.

Page 34: Compiler Design UNIT-2

EXAMPLE OF BOTTOM-UP PARSER
E → T + E | T
T → int * T | int | (E)
Consider the string: int * int + int

Sentential form      Reduction applied
int * int + int      T → int
int * T + int        T → int * T
T + int              T → int
T + T                E → T
T + E                E → T + E
E

Page 35: Compiler Design UNIT-2

SHIFT REDUCE PARSING
• Bottom-up parsing uses two kinds of actions: 1. Shift 2. Reduce
• Shift: Move | one place to the right, which shifts a terminal onto the left string.
  ABC|xyz ⇒ ABCx|yz
• Reduce: Apply an inverse production at the right end of the left string.
  If A → xy is a production, then Cbxy|ijk ⇒ CbA|ijk

Page 36: Compiler Design UNIT-2

Example


Page 37: Compiler Design UNIT-2

EXAMPLE OF SHIFT REDUCE PARSING

Configuration          Action
| int * int + int      shift
int | * int + int      shift
int * | int + int      shift
int * int | + int      reduce T → int
int * T | + int        reduce T → int * T
T | + int              shift
T + | int              shift
T + int |              reduce T → int
T + T |                reduce E → T
T + E |                reduce E → T + E
E |
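The two actions can be made concrete by replaying this trace in Python. The sketch below does not decide which action to take (that is the job of the parsing table discussed later); it only applies a fixed script of shift and reduce actions and checks that each reduced handle really is on top of the stack. The names `ACTIONS` and `replay` are hypothetical.

```python
# The action sequence from the trace, for grammar
# E -> T + E | T and T -> int * T | int | (E).
ACTIONS = [
    ("shift",), ("shift",), ("shift",),
    ("reduce", "T", ["int"]),
    ("reduce", "T", ["int", "*", "T"]),
    ("shift",), ("shift",),
    ("reduce", "T", ["int"]),
    ("reduce", "E", ["T"]),
    ("reduce", "E", ["T", "+", "E"]),
]

def replay(tokens, actions):
    """Apply the actions and return the list of (stack, remaining input) pairs."""
    stack, rest = [], list(tokens)
    trace = [(" ".join(stack), " ".join(rest))]
    for action in actions:
        if action[0] == "shift":
            # Shift: move the next input token onto the stack.
            stack.append(rest.pop(0))
        else:
            # Reduce: replace the handle on top of the stack by the rule's head.
            _, head, body = action
            if stack[-len(body):] != body:
                raise ValueError(f"handle {body} is not on top of the stack")
            del stack[-len(body):]
            stack.append(head)
        trace.append((" ".join(stack), " ".join(rest)))
    return trace
```

Replaying the actions on `int * int + int` ends with the stack holding only E and the input empty.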

Page 38: Compiler Design UNIT-2

OPERATOR PRECEDENCE PARSING
Operator grammars have the property that no production right side is empty or has two adjacent nonterminals. This property enables the implementation of efficient operator-precedence parsers. These parsers rely on the following three precedence relations:

Relation   Meaning
a <· b     a yields precedence to b
a =· b     a has the same precedence as b
a ·> b     a takes precedence over b

Page 39: Compiler Design UNIT-2

• These operator precedence relations make it possible to delimit the handles in the right sentential forms: <· marks the left end, =· appears in the interior of the handle, and ·> marks the right end.
• Suppose that $ marks each end of the string. Then for all terminals b we can write: $ <· b and b ·> $
• If we remove all nonterminals and place the correct precedence relation (<·, =·, ·>) between the remaining terminals, what remains is a string that can be analyzed by an easily developed parser.

Page 40: Compiler Design UNIT-2

EXAMPLE OF OPERATOR PRECEDENCE PARSING

For example, the following operator precedence relations can be introduced for simple expressions:

        id    +     *     $
id            ·>    ·>    ·>
+       <·    ·>    <·    ·>
*       <·    ·>    ·>    ·>
$       <·    <·    <·

Example: The input string id1 + id2 * id3 after inserting precedence relations becomes
$ <· id1 ·> + <· id2 ·> * <· id3 ·> $
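Inserting the relations is a simple table lookup between neighbouring terminals. The following Python sketch encodes the table above as a dictionary and annotates a token list; the names `PRECEDENCE`, `SYMBOL` and `annotate` are hypothetical, and all identifiers are written as plain `id`.

```python
# (left terminal, right terminal) -> relation, copied from the table.
PRECEDENCE = {
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
}
SYMBOL = {"<": "<·", ">": "·>", "=": "=·"}

def annotate(tokens):
    """Return the token string with a precedence relation between neighbours."""
    padded = ["$"] + list(tokens) + ["$"]
    out = [padded[0]]
    for left, right in zip(padded, padded[1:]):
        relation = PRECEDENCE.get((left, right))
        if relation is None:
            # A blank table entry means the two terminals may not be adjacent.
            raise SyntaxError(f"no precedence relation between {left} and {right}")
        out += [SYMBOL[relation], right]
    return " ".join(out)
```

A blank entry in the table, such as id followed by id, is reported as a syntax error.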

Page 41: Compiler Design UNIT-2

Contd.

        id    +     *     $
id            ·>    ·>    ·>
+       <·    ·>    <·    ·>
*       <·    ·>    ·>    ·>
$       <·    <·    <·

They follow from the following facts:
• + has lower precedence than * (hence + <· * and * ·> +).
• Both + and * are left-associative (hence + ·> + and * ·> *).

Page 42: Compiler Design UNIT-2

Associativity
• If an operand has operators on both sides, the side on which the operator takes this operand is decided by the associativity of those operators.
• Example
• Operations such as addition, multiplication, subtraction, and division are left associative. If the expression contains:
• id op id op id, it will be evaluated as: (id op id) op id

Page 43: Compiler Design UNIT-2

Example
• (id + id) + id
• Operations like exponentiation are right associative, i.e., the order of evaluation in the same expression will be:
• id op (id op id)
• Another example: id ^ (id ^ id)
• If two different operators share a common operand, the precedence of operators decides which will take the operand. That is, 2+3*4 can have two different parse trees, one corresponding to (2+3)*4 and another corresponding to 2+(3*4).

Page 44: Compiler Design UNIT-2

Contd.
• By setting precedence among operators, this problem can be easily removed.
• As in the previous example, mathematically * (multiplication) has precedence over + (addition), so the expression 2+3*4 will always be interpreted as:
• 2 + (3 * 4)
• These methods decrease the chances of ambiguity in a language or its grammar.

Page 45: Compiler Design UNIT-2

Another Definition
• For operators, associativity determines which occurrence is applied first when the same operator appears several times in a row.
• In the following, let Q be the operator: a Q b Q c
• If Q is left associative, then it evaluates as (a Q b) Q c.
• If it is right associative, then it evaluates as a Q (b Q c).
• This is important, since it changes the meaning of an expression.

Page 46: Compiler Design UNIT-2

Contd.
• Consider the division operator with integer arithmetic, which is left associative: 4 / 2 / 3 <=> (4 / 2) / 3 <=> 2 / 3 = 0
• If it were right associative, it would evaluate to an undefined expression, since you would divide by zero: 4 / 2 / 3 <=> 4 / (2 / 3) <=> 4 / 0 = undefined.
• If you write 12 - 5 + 3, the possible evaluations include:
  • (12 - 5) + 3 = 10
  • 12 - (5 + 3) = 4
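The effect of associativity on a chain of one operator can be checked with a small Python sketch. The helper names `fold_left` and `fold_right` are hypothetical; `operator.floordiv` stands in for the integer division used in the example.

```python
from functools import reduce
import operator

def fold_left(op, operands):
    """Evaluate a op b op c ... as ((a op b) op c) ..."""
    return reduce(op, operands)

def fold_right(op, operands):
    """Evaluate a op b op c ... as a op (b op (c ...))."""
    *init, last = operands
    # Walk the operands from right to left, nesting each result on the right.
    return reduce(lambda acc, x: op(x, acc), reversed(init), last)
```

With integer division, `fold_left(operator.floordiv, [4, 2, 3])` gives 0, while `fold_right` on the same operands raises `ZeroDivisionError`, since it computes 4 // (2 // 3). For exponentiation, `fold_right(operator.pow, [2, 3, 2])` gives 512 and `fold_left` gives 64.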

Page 47: Compiler Design UNIT-2

Contd.
• Left associative means we evaluate the expression from the left-hand side to the right-hand side.
• Right associative means we evaluate the expression from the right-hand side to the left-hand side.
• We know that *, / and % have the same precedence, but the answer may change depending on associativity.
• For example, take the expression 4*8/2%5 with integer arithmetic:
• Left associative: ((4*8)/2)%5 ==> (32/2)%5 ==> 16%5 ==> 1
• Right associative: 4*(8/(2%5)) ==> 4*(8/2) ==> 4*4 ==> 16

Page 48: Compiler Design UNIT-2

LR PARSERS

• One of the best methods for syntactic recognition of programming languages is LR parsing.
• An LR parser uses the shift-reduce technique.
• The L stands for left-to-right scanning and the R for a rightmost derivation.
• LR(1) parsing is LR parsing with one symbol of lookahead.
• LR(k) parsing is LR parsing with k symbols of lookahead.
• LR parsers are a type of bottom-up parser that efficiently handles deterministic context-free languages.

Page 49: Compiler Design UNIT-2

LR Parsers
• A bottom-up parser follows a rightmost derivation from the bottom up.
• Such parsers typically use the LR algorithm and are called LR parsers.
  • L means process tokens from Left to right.
  • R means follow a Rightmost derivation.
• Furthermore, in LR parsing, the production is applied only after the pattern has been matched.
• In LL (predictive) parsing, the production was selected, and then the tokens were matched to it.

Page 50: Compiler Design UNIT-2

Rightmost Derivations
• Let the grammar be
  E → E + T | T
  T → T * F | F
  F → (E) | id | num

Page 51: Compiler Design UNIT-2

Rightmost Derivations
• A rightmost derivation of (id + num)*id is
  E ⇒ T ⇒ T*F ⇒ T*id ⇒ F*id ⇒ (E)*id ⇒ (E + T)*id ⇒ (E + F)*id ⇒ (E + num)*id ⇒ (T + num)*id ⇒ (F + num)*id ⇒ (id + num)*id.

Page 52: Compiler Design UNIT-2

LR Parsers
• An LR parser uses a parse table, an input buffer, and a stack of "states."
• It performs three operations:
  • Shift a token from the input buffer to the stack.
  • Reduce the content of the stack by applying a production.
  • Go to a new state.

Page 53: Compiler Design UNIT-2

ADVANTAGES

• The advantages of LR parsing are numerous:
1. An LR parser can recognize virtually all programming language constructs written with CFGs.
2. It is the most general nonbacktracking technique known.
3. It can be implemented in a very efficient manner.
4. The set of languages it can recognize is a proper superset of that for predictive parsers.
5. It can detect syntax errors quickly.

Page 54: Compiler Design UNIT-2

DISADVANTAGE

• The primary disadvantage of LR parsers is that it is far too much work to create LR parsing tables manually.
• However, tools exist to automatically generate an LR parser from a given grammar.
• These are called LR parser generators, such as YACC and BISON.
• These parser generators are useful not only in creating the parser, but also in finding errors in the grammar.

Page 55: Compiler Design UNIT-2

LR PARSING METHODS

• There are actually three different methods to perform LR parsing:
• SLR(1) – Simple LR Parser:
  • Works on the smallest class of grammars.
  • Few states, hence a very small table.
  • Simple and fast construction.
  • Easy to implement, but less powerful.

Page 56: Compiler Design UNIT-2

Contd…….

• Canonical LR or LR Parser:
  • It is the most general and powerful.
  • It is tedious and costly to implement.
  • Works on the complete set of LR(1) grammars.
  • Generates a large table and a large number of states.
  • Slow construction.

Page 57: Compiler Design UNIT-2

Contd……

• LALR(1) – Look-Ahead LR Parser:
  • Works on an intermediate size of grammar.
  • The number of states is the same as in SLR(1).
  • It is a mix of SLR and Canonical LR.
  • It can be implemented efficiently.
  • Most parser generators generate LALR parsers, since they are a trade-off between power and efficiency.

Page 58: Compiler Design UNIT-2

LL vs LR

LL | LR
Does a leftmost derivation. | Does a rightmost derivation in reverse.
Starts with the root nonterminal on the stack. | Ends with the root nonterminal on the stack.
Ends when the stack is empty. | Starts with an empty stack.
Uses the stack for designating what is still to be expected. | Uses the stack for designating what is already seen.
Builds the parse tree top-down. | Builds the parse tree bottom-up.
Continuously pops a nonterminal off the stack, and pushes the corresponding right hand side. | Tries to recognize a right hand side on the stack, pops it, and pushes the corresponding nonterminal.
Expands the non-terminals. | Reduces the non-terminals.
Reads the terminals when it pops one off the stack. | Reads the terminals while it pushes them on the stack.
Pre-order traversal of the parse tree. | Post-order traversal of the parse tree.

Page 59: Compiler Design UNIT-2

Constructing LR Parsing Tables
• The table is an essential part of LR parsing.
• But how does one go about making the tables?
• It is a daunting task if one does not use automated tools for the purpose.
• One important fact to keep in mind while constructing LR parsing tables is that the state on the top of the stack provides a wealth of information to the parser.
• An LR parser keeps track of viable prefixes for the handles.
• It uses an automaton to recognize these prefixes.
• The goto portion of the table simulates this automaton, so the parser does not need to scan the stack on every input symbol to figure out what state it is in.

Page 60: Compiler Design UNIT-2

Contd.
• Items
• The first concept in LR table construction is that of an item.
• An item is a production rule with a position indicator (dot) at some point on the RHS.
• If A → XYZ is a production, the possible items of this production are:
  A → .XYZ
  A → X.YZ
  A → XY.Z
  A → XYZ.
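Enumerating the items of a production just means placing the dot at every position of the right-hand side. A minimal Python sketch (the function name `lr0_items` is hypothetical):

```python
def lr0_items(head, body):
    """Return every item of the production head -> body as a string."""
    items = []
    for dot in range(len(body) + 1):
        # Insert the dot before position `dot` of the right-hand side.
        symbols = list(body[:dot]) + ["."] + list(body[dot:])
        items.append(f"{head} → {''.join(symbols)}")
    return items
```

A production with n symbols on its right-hand side therefore has n + 1 items.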

Page 61: Compiler Design UNIT-2

Contd.
• Items are also known as LR(0) items, since they assume no lookahead.
• An item denotes how much of a production we have seen so far during parsing.
• SLR Parsing Tables
• We augment the grammar and use two functions: closure and goto.
• Augmented grammar: An augmented grammar simply has a new "dummy" start symbol, whose only production derives the start symbol of the grammar in question.
• If G is our grammar with start symbol S and production set F, then the augmented grammar is G’ = (VT, VN ∪ {S’}, S’, F ∪ {S’ → S}).

Page 62: Compiler Design UNIT-2

Contd.
• Let us consider the following augmented grammar for constructing the SLR parsing table:
  S’ → S
  S → aABe
  A → Abc
  A → b
  B → d

Page 63: Compiler Design UNIT-2

Contd.
• The sets of items for the augmented grammar, computed using closure and goto, are:
• I0 = {[S’ → .S], [S → .aABe]}
• I1 = {[S’ → S.]}
• I2 = {[S → a.ABe], [A → .Abc], [A → .b]}
• I3 = {[S → aA.Be], [A → A.bc], [B → .d]}
• I4 = {[A → b.]}
• I5 = {[S → aAB.e]}
• I6 = {[A → Ab.c]}
• I7 = {[B → d.]}
• I8 = {[S → aABe.]}
• I9 = {[A → Abc.]}

Page 64: Compiler Design UNIT-2

Goto
• The goto function for this grammar is:
  goto(I0, S) = I1
  goto(I0, a) = I2
  goto(I2, A) = I3
  goto(I2, b) = I4
  goto(I3, B) = I5
  goto(I3, b) = I6
  goto(I3, d) = I7
  goto(I5, e) = I8
  goto(I6, c) = I9
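The item sets and goto transitions listed above can be reproduced with a short Python sketch of closure and goto over LR(0) items. An item is encoded as a tuple (head, body, dot position); the function names and the encoding are assumptions for illustration. The states are numbered in the order they are discovered, which for this grammar (with symbols tried in sorted order) happens to match the numbering I0 to I9 used here.

```python
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("a", "A", "B", "e")),
    ("A", ("A", "b", "c")),
    ("A", ("b",)),
    ("B", ("d",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}

def closure(items):
    """Add [B -> .γ] for every nonterminal B that appears right after a dot."""
    result = set(items)
    work = list(items)
    while work:
        _, body, dot = work.pop()
        if dot < len(body) and body[dot] in NONTERMINALS:
            for head, rhs in GRAMMAR:
                item = (head, rhs, 0)
                if head == body[dot] and item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved)

def canonical_collection():
    start = closure({("S'", ("S",), 0)})
    states, transitions = [start], {}
    for state in states:  # the list grows while we iterate over it
        symbols = {b[d] for _, b, d in state if d < len(b)}
        for symbol in sorted(symbols):
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
            transitions[states.index(state), symbol] = states.index(target)
    return states, transitions
```

`canonical_collection()` returns ten states and the nine transitions of the goto function.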

Page 65: Compiler Design UNIT-2

Graph

[Figure: goto graph of the item sets I0 to I9. Edges: I0 --S--> I1, I0 --a--> I2, I2 --A--> I3, I2 --b--> I4, I3 --B--> I5, I3 --b--> I6, I3 --d--> I7, I5 --e--> I8, I6 --c--> I9.]

Page 66: Compiler Design UNIT-2

Constructing LR(1) PARSERS
• LR(1) item = LR(0) item + lookahead
• Example:
  S → AA
  A → aA | b
• Closure rule: if [A → α.Bβ, a/b] is in an item set, then for every production B → γ the item [B → .γ, c/d] is added, where c/d are the terminals in FIRST(βa) and FIRST(βb), or a/b themselves if β is not there.
• Initial item set:
  I0 = {[S’ → .S, $], [S → .AA, $], [A → .aA, a/b], [A → .b, a/b]}

[Figure: LR(1) automaton for this grammar, built from I0 by goto transitions on S, A, a and b.]
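The LR(1) closure for the grammar S → AA, A → aA | b can be sketched in Python as follows. An item is a tuple (head, body, dot position, lookahead). The FIRST sets are written out by hand, and because this grammar has no ε-productions the sketch takes FIRST of a string to be FIRST of its first symbol; both are assumptions of the illustration, as are the function names.

```python
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("A", "A")),
    ("A", ("a", "A")),
    ("A", ("b",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}
FIRST = {"S": {"a", "b"}, "A": {"a", "b"}}  # hand-computed for this grammar

def first_of(symbols):
    # No nonterminal derives ε here, so only the first symbol matters.
    head = symbols[0]
    return FIRST.get(head, {head})

def closure(items):
    """For [A -> α.Bβ, a], add [B -> .γ, b] for each b in FIRST(βa)."""
    result = set(items)
    work = list(items)
    while work:
        _, body, dot, lookahead = work.pop()
        if dot < len(body) and body[dot] in NONTERMINALS:
            beta = body[dot + 1:]
            lookaheads = first_of(beta + (lookahead,))
            for head, rhs in GRAMMAR:
                if head != body[dot]:
                    continue
                for terminal in lookaheads:
                    item = (head, rhs, 0, terminal)
                    if item not in result:
                        result.add(item)
                        work.append(item)
    return result
```

Calling `closure({("S'", ("S",), 0, "$")})` produces the six items of I0: S' → .S with lookahead $, S → .AA with lookahead $, and A → .aA and A → .b each with lookaheads a and b.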

Page 67: Compiler Design UNIT-2

An Automatic Parser Generator
1. ACCENT:
• A Compiler for the Entire Class of Context-Free Languages
• Accent is a parser generator that can process all context-free grammars without any restriction.
• No knowledge of parsing technology is required; grammars can be used directly without adapting them to a particular parsing technique. There is no struggling against shift/reduce conflicts as known from Yacc, and no rewriting of left-recursive rules as required for LL(k) parsers.
• Accent is in use in industrial projects, especially where the languages being implemented are defined by complex standards documents and grammars that cannot be processed by traditional systems.

Page 68: Compiler Design UNIT-2

CONTD.
• Accent can be used in the style of Yacc, i.e. you provide a grammar and add semantic actions. However, Accent also supports the Extended Backus Naur Form, and there are no restrictions on where to place semantic actions. Like Yacc, Accent cooperates with Lex.
• Accent is Open Source Software. Commercial support is available from Metarga GmbH.

Page 69: Compiler Design UNIT-2

Contd……

2. AFLEX AND AYACC
• Aflex and Ayacc are similar to the Unix tools lex and yacc, but they are written in Ada and generate Ada output.
• They were developed by the Arcadia Project at the University of California, Irvine.
• Aflex is based on the tool 'flex' written by Vern Paxson. These tools are copyrighted, but are freely redistributable.

Page 70: Compiler Design UNIT-2

Contd.
3. ALE:
• Attribute-Logic Engine, Version 3.2, a freeware logic programming and grammar parsing and generation system.
• This includes information on obtaining the system, the user's guide, graphical interfaces, and grammars. This version includes:
• ALE is faster (again), both at compile-time and run-time.

Page 71: Compiler Design UNIT-2

Contd.
• A new parsing compilation algorithm (Empty-First-Daughter closure), which:
  • assumes that a grammar is "EFD-closed", meaning that the first daughters of all the grammatical rules in the grammar are non-empty;
  • corrects a long-standing problem in ALE with combining empty categories;
  • works around a problem with non-ISO-compatible Prologs, including SICStus Prolog.
• Shallow cuts (if-then-else predicates) have been added to the definite clause language.
• Faster extensionalisation code, particularly with grammars that have few or no extensional types.

Page 72: Compiler Design UNIT-2

Contd…..

• Faster subsumption checking code for chart edges.
• ALE Source-level Debugger 3.0, which has been integrated with the new SICStus 3.7 source-level debugger.
• More compile-time error and warning messages.
• Several bug corrections.
• An updated user's manual.
• An SWI Prolog port.

Page 73: Compiler Design UNIT-2

Contd.
4. The AnaGram Parser Generator:
• The parser is a C/C++ function that parses text according to the rules in our grammar and, as it matches rules, calls our code to deal with them.
• Using a grammar means we get faster development, easier modification and maintainability, and fewer bugs in our software.
• We use an easy-to-understand description of input instead of bug-prone, fragile, branching code.
• We make the rules, and AnaGram makes a parser for our input.
• AnaGram 2.01 runs on Win32 platforms.

Page 74: Compiler Design UNIT-2

Contd.
5. Bison, The YACC-compatible Parser Generator:
• Bison is a general-purpose parser generator that converts a grammar description for an LALR(1) context-free grammar into a C program to parse that grammar.
• Once we are proficient with Bison, we may use it to develop a wide range of language parsers, from those used in simple desk calculators to complex programming languages.

Page 75: Compiler Design UNIT-2

Contd……

6. BTYACC:
• BTYACC is a modified version of yacc that supports automatic backtracking and semantic disambiguation to parse ambiguous grammars, as well as syntactic sugar for inherited attributes.

7. BYACC:
• Berkeley Yacc is a public domain LALR(1) parser generator. It has been made as compatible as possible with AT&T Yacc.

Page 76: Compiler Design UNIT-2

Contd.
8. Coco/R:
• It is a compiler generator, which takes an attributed grammar of a source language and generates a scanner and a parser for this language.
• The scanner works as a deterministic finite automaton.
• The parser uses recursive descent.
• LL(1) conflicts can be resolved by a multi-symbol lookahead or by semantic checks.
• Coco/R is available for C#, Java, C++, F#, VB.Net, Delphi, Swift, Oberon, and other languages.

Page 77: Compiler Design UNIT-2

Contd.
• Some more examples:
• DEPOT4
• FLEX
• GOBO EIFFEL LEX & YACC
• HAPPY
• HOLUB
• LEX
• LLGEN
• MKS LEX & YACC
• PCYACC
• PRECC
• PROGRAMMAR
• QUEX
• RDP
• TP LEX AND YACC
• VISUALPARSE++

Page 78: Compiler Design UNIT-2

Using Ambiguous grammar
• The number of states in an LR(1) parsing table is much larger than in an SLR parsing table.
• LALR reduces the number of states in the LR(1) parsing table.
  • LALR (Lookahead LR) is less powerful than LR(1).
  • Merging states may introduce reduce-reduce conflicts, but not shift-reduce conflicts.
  • LALR has the same number of states as SLR, but is more powerful.
• Constructing the LALR parsing table:
  • Combine LR(1) item sets that have the same first parts (ignoring the lookahead).
  • Algorithms exist that skip constructing the LR(1) sets.

Page 79: Compiler Design UNIT-2

Contd….

• Using ambiguous grammars
  • Ambiguous grammars will result in conflicts.
  • Precedence and associativity can be used to resolve the conflicts.
  • This may result in a smaller parsing table in comparison to using unambiguous grammars.
• Example:
  E -> E+E
  E -> E*E
  E -> (E)
  E -> id
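For a shift/reduce conflict in such a grammar, Yacc-style generators compare the precedence of the rule that could be reduced with the precedence of the lookahead token, and fall back on associativity when the two are equal. A minimal Python sketch of that decision is below. The table `PRECEDENCE` and the function `resolve` are hypothetical names, and the right-associative `^` entry is added only for illustration; it is not part of the example grammar.

```python
# operator -> (precedence level, associativity); higher level binds tighter.
PRECEDENCE = {
    "+": (1, "left"),
    "*": (2, "left"),
    "^": (3, "right"),  # illustrative addition, not in the example grammar
}

def resolve(rule_operator, lookahead):
    """Choose between reducing by a rule E -> E op E and shifting `lookahead`."""
    rule_level, _ = PRECEDENCE[rule_operator]
    look_level, associativity = PRECEDENCE[lookahead]
    if rule_level > look_level:
        return "reduce"  # the operator on the stack binds tighter
    if rule_level < look_level:
        return "shift"   # the incoming operator binds tighter
    # Equal precedence: left associativity reduces, right associativity shifts.
    return "reduce" if associativity == "left" else "shift"
```

With E + E on the stack and * as lookahead the parser shifts, so id + id * id groups as id + (id * id). With E + E on the stack and + as lookahead it reduces, which makes + left associative.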

