Chapter 1
Language Processor
Introduction
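• Semantic gap: the gap between the application domain and the execution domain
• Solved by using a programming language (PL)
  – Design and coding
  – PL implementation steps
• Introducing a PL domain splits the semantic gap in two:
  – Specification gap: the semantic gap between two specifications of the same task.
  – Execution gap: the gap between the semantics of programs written in different programming languages.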
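[Figure: the semantic gap lies between the application domain and the execution domain. With a PL domain placed between them, the specification gap lies between the application domain and the PL domain, and the execution gap lies between the PL domain and the execution domain.]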
Language Processor
• Definition: A language processor (LP) is software which bridges a specification gap or an execution gap.
• Kinds of LP:
  – Language translator: bridges an execution gap, e.g. a compiler or an assembler
  – Detranslator
  – Preprocessor
  – Language migrator
• Interpreter: a language processor which bridges an execution gap without generating a machine language program.
• Problem oriented languages
  – Smaller specification gap, larger execution gap
• Procedure oriented languages
  – Larger specification gap, smaller execution gap
Language processing activities
• Program generation activity
• Program execution activity:
  – Translation and interpretation
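[Figure: program generation. The domains are the application domain, the program generator domain, the target PL domain and the execution domain; the specification gap lies between the application domain and the program generator domain.]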
• Program translation
  – Translates a program from the source language (SL) to machine language.
• Characteristics
  – A program must be translated before it can be executed.
  – A translated program may be saved in a file, and the saved program may be executed repeatedly.
  – A program must be retranslated following modifications.
• Program interpretation:
  – Reads the source program and stores it in memory.
  – Determines its meaning and performs the corresponding actions.
Program interpretation and Execution
• Program execution
  – Fetch the instruction
  – Decode the instruction to determine the operation
  – Execute the instruction
• Program interpretation
  – Fetch the statement
  – Analyze the statement to determine its meaning
  – Execute the statement
Comparison
• ?????
Fundamentals of language processing
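• LP = Analysis of SP + Synthesis of TP, where SP is the source program and TP is the target program.
• Analysis of SP
  – Lexical rules: govern the formation of valid lexical units
  – Syntax rules: govern the formation of valid statements
  – Semantic rules: associate meaning with valid statements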
Phases of LP
[Figure: Source program → Analysis phase → IR → Synthesis phase → Target program; both phases report errors.]
• Forward reference: A forward reference of a program entity is a reference to the entity which precedes its definition in the program.
  Ex. struct s { struct t *pt; };
      ...
      struct t { struct s *ps; };
• Issues concerning the memory requirements and organization of an LP.
Passes of LP
• Language processor pass: A language processor pass is the processing of every statement in a source program, or its equivalent representation, to perform a language processing function.
  – Pass I: performs analysis of the SP.
  – Pass II: performs synthesis of the TP.
Intermediate Representation of Programs
• Intermediate Representation: An Intermediate representation(IR) is a representation of a source program which reflects the effect of some, but not all, analysis and synthesis tasks performed during language processing.
[Figure: SP → Front end → Intermediate representation (IR) → Back end → TP]
IR and Semantic actions
• Properties of IR
  – Ease of use
  – Processing efficiency
  – Memory efficiency: compact
• Semantic actions:
  – All actions performed by the front end, except lexical and syntax analysis, are called semantic actions. They include:
    • Checking semantic validity
    • Determining the meaning
    • Constructing the IR
Toy Compiler
• gcc or cc compiler: C or C++
• Toy compiler: ???
  – Front end
    • Lexical analysis
    • Syntax analysis
    • Semantic analysis
  – Back end
    • Memory allocation
    • Code generation
Symbol table generation
Symbol  Type   Length  Address
a       int
b       float
temp    int
Front End
• Lexical analysis (scanning)
  – Ex. a := b + i ; is scanned into id#2 op#5 id#3 op#3 id#1 op#10
• Syntax analysis (parsing)
  a, b : real;
  a := b + i;
• Semantic analysis: an IC tree is generated
[Figure: tree for the declaration a, b : real]
Back End
• Memory allocation:
Symbol  Type   Length  Address
a       int    2       2000
b       float  4       2002
temp    int    2       2006
• Code generation: generating assembly language.
  – Issues:
    • Determine the places where intermediate results should be kept.
    • Determine which instructions should be used for type conversion.
    • Determine which addressing mode should be used for accessing variables.
Fundamentals of Language Specification
• Programming language grammars:
  – Terminal symbols
    • Lowercase letters, punctuation marks, null
    • Concatenation (.)
  – Nonterminal symbols: the name of a syntax category of the language
  – Productions: a production, also called a rewriting rule, is a rule of the grammar
    • NT = string of Ts and NTs
• Production form:
  <Article> = a / an / the
  <Noun> = boy / apple
  <Noun phrase> = <Article> <Noun>
Grammar
• Def: A grammar G of a language Lg is a quadruple (∑, SNT, S, P) where
  – ∑ is the set of terminals
  – SNT is the set of NTs
  – S is the distinguished symbol
  – P is the set of productions
• Ex: Derive the sentence “A boy ate an apple”
  – <Sentence> = <Noun phrase> <Verb phrase>
  – <Noun phrase> = <Article> <Noun>
  – <Verb phrase> = <Verb> <Noun phrase>
  – <Article> = a / an / the
  – <Noun> = boy / apple
  – <Verb> = ate
Grammar
• Derive a + b * c / 5 and construct the parse tree (top down).
  – <exp> = <exp> + <term> | <term>
  – <term> = <term> * <factor> | <factor>
  – <factor> = <factor> / <number>
  – <number> = 0 / 1 / 2 / 3 / … / 9
• Classification of grammars:
  – Type-0: phrase structure grammar
  – Type-1: context sensitive grammar
  – Type-2: context free grammar
  – Type-3: linear grammar or regular grammar
Binding
• Definition: A binding is the association of an attribute of a program entity with a value.
– Static binding: a binding performed before the execution of a program begins.
– Dynamic binding: a binding performed after the execution of a program has begun.
Chapter - 3
Scanning and Parsing
Unit-2
Role of lexical Analyzer
Scanning
• Definition: Scanning is the process of recognizing the lexical components in a source string.
• Type-3 grammar or regular grammar
• A regular grammar is used to identify identifiers
• A regular language is obtained from the operations or (alternation), concatenation and Kleene star (*)
• Ex. Write a regular expression which identifies strings that end with abb.
  – (a+b)*abb
• Ex. Write a regular expression which recognizes identifiers.
  – R.E. = (letter)(letter / digit)*
  – Digit = 0 / 1 / 2 / … / 9
  – Letter = a / b / c / … / z
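The two regular expressions above can be tried out with a regular-expression library. The following is a minimal sketch using Python's `re` module; the function names `is_identifier` and `ends_with_abb` are made up for illustration, and letters are limited to lowercase a to z as in the definition above.

```python
import re

# R.E. = letter (letter | digit)*, with letter = a..z and digit = 0..9
IDENTIFIER = re.compile(r"[a-z][a-z0-9]*")

# (a+b)*abb: strings over {a, b} that end with abb
ENDS_WITH_ABB = re.compile(r"[ab]*abb")


def is_identifier(text):
    """Return True when the whole string matches letter(letter|digit)*."""
    return IDENTIFIER.fullmatch(text) is not None


def ends_with_abb(text):
    """Return True when the whole string matches (a+b)*abb."""
    return ENDS_WITH_ABB.fullmatch(text) is not None
```

`fullmatch` is used so that the whole string has to match, which corresponds to recognizing a complete lexical unit rather than finding the pattern somewhere inside the string.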
Regular Expression and Meaning
Regular expression   Meaning
r                    String r
s                    String s
r.s or rs            Concatenation of r and s
(r)                  Same meaning as r
r/s or (r)/(s)       Alternation (r or s)
[r]                  An optional occurrence of r
(r)*                 0 or more occurrences of string r
(r)+                 1 or more occurrences of string r
Examples of regular expressions
• Integer: [+/-](d)+
• Real: [+/-](d)+.(d)+
• Real with optional fraction: [+/-](d)+.(d)*
• Identifier: l(l/d)*
Example of Regular expression
• Strings ending with 0: (0+1)*0
• Strings ending with 11: (0+1)*11
• Strings with an even number of 0s and an odd number of 1s: (0+1)*(01*01*)*11*0*
• The language of all strings containing exactly two 0s: 1*01*01*
• The language of all strings that do not end with 01: Λ + 1 + (0+1)*0 + (0+1)*11, where Λ is the empty string
Finite state automaton
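• FSA: a finite state automaton is a triple (S, ∑, T) where S is a finite set of states, ∑ is the alphabet of source symbols, and T is a finite set of state transitions.
• An FSA is either a deterministic finite automaton (DFA) or a nondeterministic finite automaton (NFA).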
DFA from Regular Expression
• (0+1)*0
• (0+1)*(1+00)(0+1)*
• (11+10)*
Transition table from DFA
State/input   0    1
q0            q1   q0
q1            q1   q0
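A transition table like the one above can drive a DFA directly. The sketch below is illustrative only: the start state q0 and the choice of q1 as the single final state are assumptions (they make the automaton accept (0+1)*0), and the function name `accepts` is made up.

```python
# Transition table from the text: (state, input symbol) -> next state.
TRANSITIONS = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q0",
}
START = "q0"          # assumption: q0 is the start state
ACCEPTING = {"q1"}    # assumption: q1 is the final state, giving (0+1)*0


def accepts(text):
    """Run the DFA over text and report whether it ends in a final state."""
    state = START
    for symbol in text:
        key = (state, symbol)
        if key not in TRANSITIONS:   # symbol outside the alphabet {0, 1}
            return False
        state = TRANSITIONS[key]
    return state in ACCEPTING
```

Because the automaton is deterministic, each input symbol causes exactly one table lookup, so a string of length n is checked in n steps.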
Transition Table
• (0+1)*0
• (0+1)*(1+00)(0+1)*
• (11+10)*
DFA and its transition Diagram
Check for the given string aabab
Types of Parser
• Top down parser
  – Backtracking
  – Predictive parser
• Bottom up parser
  – Shift reduce parser
  – LR parser
    • SLR
    • LR
    • LALR
Example
• Expression grammar (with precedence)
• Input string: x – 2 * y
#  Production rule
1  expr   → expr + term
2         | expr - term
3         | term
4  term   → term * factor
5         | term / factor
6         | factor
7  factor → number
8         | identifier
Example
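Rule  Sentential form   Input string
-     expr              x - 2 * y
1     expr + term       x - 2 * y
3     term + term       x - 2 * y
6     factor + term     x - 2 * y
8     <id> + term       x - 2 * y
-     <id,x> + term     x - 2 * y
[Figure: parse tree for expr + term, with expr expanded through term and fact to x; an arrow marks the current position in the input stream.]
• Problem:
  – Can't match the next terminal: after x is matched, the next input symbol is - but the sentential form expects +
  – We guessed wrong at step 2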
Backtracking
• Roll back the productions
• Choose a different production for expr
• Continue
Rule  Sentential form   Input string
-     expr              x - 2 * y
1     expr + term       x - 2 * y
3     term + term       x - 2 * y
6     factor + term     x - 2 * y
8     <id> + term       x - 2 * y
?     <id,x> + term     x - 2 * y
All the productions applied after the first row are undone.
Retrying
Rule  Sentential form    Input string
-     expr               x - 2 * y
2     expr - term        x - 2 * y
3     term - term        x - 2 * y
6     factor - term      x - 2 * y
8     <id> - term        x - 2 * y
-     <id,x> - term      x - 2 * y
6     <id,x> - factor    x - 2 * y
7     <id,x> - <num>     x - 2 * y
[Figure: parse tree for expr - term, with expr expanded to x and term expanded through fact to 2.]
• Problem:
  – More input to read: the sentential form is fully matched once <num> matches 2, but * y is still unread
  – Another cause of backtracking
Successful Parse
• All terminals match, so we are finished
Rule  Sentential form             Input string
-     expr                        x - 2 * y
2     expr - term                 x - 2 * y
3     term - term                 x - 2 * y
6     factor - term               x - 2 * y
8     <id> - term                 x - 2 * y
-     <id,x> - term               x - 2 * y
4     <id,x> - term * fact        x - 2 * y
6     <id,x> - fact * fact        x - 2 * y
7     <id,x> - <num> * fact       x - 2 * y
-     <id,x> - <num,2> * fact     x - 2 * y
8     <id,x> - <num,2> * <id>     x - 2 * y
[Figure: parse tree for expr - term, with expr expanded to x and term expanded to term * fact, giving 2 * y.]
Problems in Top down Parsing
• Backtracking (as we have seen)
• Left recursion
• Left factoring
Left Recursion
• Problem: termination
  – A wrong choice leads to infinite expansion (more importantly: without consuming any input!)
  – May not be as obvious as this
  – Our grammar is left recursive
Rule  Sentential form                     Input string
-     expr                                x - 2 * y
1     expr + term                         x - 2 * y
1     expr + term + term                  x - 2 * y
1     expr + term + term + term           x - 2 * y
1     expr + term + term + term + term    x - 2 * y
Rules for Left Recursion
• If A → Aa1 | Aa2 | … | Aan | b1 | b2 | … | bm
• then after removal of left recursion:
  A → b1A' | b2A' | … | bmA'
  A' → a1A' | a2A' | … | anA' | ε
• Ex. Apply the rule to:
  – A → Aa | Ab | c | d
  – A → Ac | Aad | bd | ε
Removing Left Recursion
• Two cases of left recursion:
#  Production rule
1  expr → expr + term
2       | expr - term
3       | term

#  Production rule
4  term → term * factor
5       | term / factor
6       | factor
• Transform as follows:
#  Production rule
1  expr  → term expr2
2  expr2 → + term expr2
3        | - term expr2
4        | ε

#  Production rule
5  term  → factor term2
6  term2 → * factor term2
7        | / factor term2
8        | ε
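The transformation above is mechanical, so it can be sketched in a few lines of code. This is an illustrative sketch, not a full algorithm: it handles immediate left recursion only, the function name `remove_left_recursion` is made up, and the new non-terminal is named by appending a prime (expr') where the text uses expr2. An empty list stands for the ε alternative.

```python
def remove_left_recursion(nonterminal, alternatives):
    """Remove immediate left recursion from one non-terminal.

    alternatives is a list of right-hand sides, each a list of symbols.
    A -> A a1 | ... | b1 | ...  becomes  A -> b1 A' | ...,  A' -> a1 A' | ... | epsilon
    """
    new_nt = nonterminal + "'"
    # Tails of the left-recursive alternatives (the a_i parts).
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # Alternatives that do not start with the non-terminal (the b_i parts).
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: alternatives}   # nothing to do
    return {
        nonterminal: [beta + [new_nt] for beta in others],
        new_nt: [alpha + [new_nt] for alpha in recursive] + [[]],  # [] is epsilon
    }


expr_rules = remove_left_recursion(
    "expr", [["expr", "+", "term"], ["expr", "-", "term"], ["term"]]
)
```

Here `expr_rules` holds the two rewritten rule sets for expr, matching the transformed grammar shown above apart from the name of the new non-terminal.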
Left Factoring
• When the choice between two productions is not clear, we may be able to rewrite the productions to defer the decision. This rewriting is called left factoring.
  Ex. stmt → if expr then stmt else stmt | if expr then stmt
  becomes
      stmt → if expr then stmt S'
      S' → else stmt | ε
• Rule: if A → ab1 | ab2 then
  A → aA'
  A' → b1 | b2
Some examples for Left factoring
• S → assig_stmt | call_stmt | other
  – assig_stmt → id = exp
  – call_stmt → id ( exp_list )
Recursive Descent Parsing
• Example
  Rule 1: S → a S b
  Rule 2: S → b S a
  Rule 3: S → B
  Rule 4: B → b B
  Rule 5: B → ε
  – Parse: a a b b b
    • Has to use R1: S ⇒ a S b
    • Again has to use R1: a S b ⇒ a a S b b
    • Now has to use Rule 2 or 3; follow the order (always R2 first):
    • a a S b b ⇒ a a b S a b b ⇒ a a b b S a a b b ⇒ a a b b b S a a a b b
  – Now cannot use Rule 2 any more: a a b b b S a a a b b ⇒ a a b b b B a a a b b, which cannot match the input, so backtrack
    • After some backtracking, finally tried
  – a S b ⇒ a a S b b ⇒ a a B b b ⇒ a a b B b b ⇒ a a b b b, which worked
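The trial-and-error search above can be sketched as a small backtracking recognizer. This is a sketch under assumptions: the names `GRAMMAR`, `derive` and `parses` are made up, the input is a plain string of the characters a and b, and an empty list stands for the ε production. Rules are tried in the order listed, and a Python generator supplies the backtracking.

```python
# S -> a S b | b S a | B,   B -> b B | epsilon
GRAMMAR = {
    "S": [["a", "S", "b"], ["b", "S", "a"], ["B"]],
    "B": [["b", "B"], []],
}


def derive(symbols, text, pos):
    """Yield every input position reachable after matching symbols from pos."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Try the rules in order; a failed choice simply yields nothing,
        # so the loop moves on to the next rule (backtracking).
        for alternative in GRAMMAR[head]:
            for middle in derive(alternative, text, pos):
                yield from derive(rest, text, middle)
    elif pos < len(text) and text[pos] == head:
        yield from derive(rest, text, pos + 1)   # terminal matched


def parses(text):
    """Return True when S derives exactly the given string."""
    return any(end == len(text) for end in derive(["S"], text, 0))
```

The recognizer terminates because this grammar has no left recursion: every recursive use of S is preceded by a terminal that consumes input.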
Predictive Parsing
• Need to immediately know which rule to apply when seeing the next input character
  – If for every non-terminal X
    • We know what would be the first terminal of each of X's productions
    • And the first terminal of each of X's productions is different
  – Then
    • When the current leftmost non-terminal is X
    • And we can look at the next input character
    • We know exactly which production should be used next to expand X
  – Example
    Rule 1: S → a S b    (first terminal is a)
    Rule 2: S → b S a    (first terminal is b)
    Rule 3: S → B
    Rule 4: B → b B
    Rule 5: B → ε
    • If the next input is a, use R1
    • If the next input is b, use R2
    • But R3's first terminal is also b, so this won't work!
  – What grammar does not satisfy the above?
    • If two productions of the same non-terminal have the same first symbol (N or T), you can see immediately that it won't work
      – S → b S a | b B
      – S → B a | B C
    • If the grammar is left recursive, then it won't work
      – S → S a | b B, B → b B | c
      – The left recursive rule of S can generate all terminals that the other productions of S can generate
        » S → b B can generate b, so S → S a can also generate b
Predictive Parsing
• Need to rewrite the grammar
  – Left recursion elimination
    • This is required even for the recursive descent parsing algorithm
  – Left factoring
    • Remove the leftmost common factors
First()
• First(α) = { t | α ⇒* tβ }
  – Consider all possible terminal strings derived from α
  – First(α) is the set of the first terminals of those strings
• For all terminals t ∈ T
  – First(t) = {t}
First()
• For all non-terminals X ∈ N
  – If X → ε, add ε to First(X)
  – If X → α1 α2 … αn
    • Each αi is either a terminal or a non-terminal (not a string as usual)
    • Add all terminals in First(α1) to First(X)
      – Exclude ε
    • If ε ∈ First(α1), …, ε ∈ First(αi-1), then add all terminals in First(αi) to First(X)
    • If ε ∈ First(α1), …, ε ∈ First(αn), then add ε to First(X)
• Apply the rules until nothing more can be added
• For adding t or ε: add only if it is not in the set yet
First()
• Grammar
  E  → T E'
  E' → + T E' | ε
  T  → F T'
  T' → * F T' | ε
  F  → ( E ) | id | num
• First
  First(*) = {*}, First(+) = {+}, …
  First(F) = {(, id, num}
  First(T') = {*, ε}
  First(T) = First(F) = {(, id, num}
  First(E') = {+, ε}
  First(E) = First(T) = {(, id, num}
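The First rules can be applied repeatedly until no set changes. The sketch below is one possible way to do that for the expression grammar above; the names `GRAMMAR`, `first_of_string` and `compute_first` are made up for illustration, and an empty list stands for the ε production.

```python
EPS = "ε"   # epsilon (the slides also write it as e)

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],        # [] is the epsilon production
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"], ["num"]],
}


def first_of_string(symbols, first):
    """First set of a string of symbols, given the sets computed so far."""
    result = set()
    for symbol in symbols:
        symbol_first = first.get(symbol, {symbol})   # First(t) = {t} for a terminal
        result |= symbol_first - {EPS}
        if EPS not in symbol_first:
            return result          # this symbol cannot vanish, so stop here
    result.add(EPS)                # every symbol can derive epsilon
    return result


def compute_first(grammar):
    """Apply the rules until nothing more can be added."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alternative in alternatives:
                new = first_of_string(alternative, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


FIRST = compute_first(GRAMMAR)
```

`FIRST` can then be compared with the sets listed above, for example `FIRST["T'"]` for First(T').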
First()
• Grammar
  S → A B
  A → a A | ε
  B → b B | ε
• First
  First(A) = {a, ε}
  First(B) = {b, ε}
  First(S) = First(A) = {a, ε}
Is this complete?
First()
• Grammar
  S → A B | B      (R1 | R2)
  A → a A | c      (R3 | R4)
  B → b B | d      (R5 | R6)
• First
  First(A) = {a, c}
  First(B) = {b, d}
  First(S) = First(A) ∪ First(B) = {a, b, c, d}
• Productions
  – First(R1) = {a, c}, First(R2) = {b, d}
  – First(R3) = {a}, First(R4) = {c}
  – First(R5) = {b}, First(R6) = {d}
                   If we see a  If we see b  If we see c  If we see d
When expanding S   Use R1       Use R2       Use R1       Use R2
When expanding A   Use R3       -            Use R4       -
When expanding B   -            Use R5       -            Use R6
Input: acbd
  Expand S, seeing a, use R1: S ⇒ AB
  Expand A, seeing a, use R3: AB ⇒ aAB
  Expand A, seeing c, use R4: aAB ⇒ acB
  Expand B, seeing b, use R5: acB ⇒ acbB
  Expand B, seeing d, use R6: acbB ⇒ acbd
First()
• Grammar
  S → A B          (R1)
  A → a A | ε      (R2 | R3)
  B → b B | ε      (R4 | R5)
• First
  First(A) = {a, ε}
  First(B) = {b, ε}
  First(S) = First(A) ∪ First(B) = {a, b, ε}
• Productions
  – First(R1) = {a, b, ε}
  – First(R2) = {a}, First(R3) = {ε}
  – First(R4) = {b}, First(R5) = {ε}
                   If we see a  If we see b  If we see ε
When expanding S   Use R1       Use R1       Use R1
When expanding A   Use R2       -            Use R3
When expanding B   -            Use R4       Use R5
Input: aabb
  Use R1: S ⇒ AB
  Expand A, seeing a, use R2: AB ⇒ aAB
  Expand A, seeing a, use R2: aAB ⇒ aaAB
  Expand A, seeing b: what to do? Not in the table!
Follow()
• Follow(α) = { t | S ⇒* γαtβ for some strings γ and β }
  – Consider all strings that may follow α
  – Follow(α) is the set of the first terminals of those strings
• Assumptions
  – There is a $ at the end of every input string
  – S is the starting symbol
• For all non-terminals only
  – Add $ into Follow(S)
  – If A → αBβ, add First(β) – {ε} into Follow(B)
  – If A → αB, or A → αBβ and ε ∈ First(β), add Follow(A) into Follow(B)
Follow()
• Grammar
  S → A B          (R1)
  A → a A | ε      (R2 | R3)
  B → b B | ε      (R4 | R5)
• First
  First(A) = {a, ε}
  First(B) = {b, ε}
  First(S) = First(A) ∪ First(B) = {a, b, ε}
• Productions
  – First(R1) = {a, b, ε}
  – First(R2) = {a}, First(R3) = {ε}
  – First(R4) = {b}, First(R5) = {ε}
• Table built from First alone:
                   If we see a  If we see b
When expanding S   Use R1       Use R1
When expanding A   Use R2       ?
When expanding B   -            Use R4
• Follow
  – Follow(S) = {$}
  – Follow(B) = Follow(S) = {$}
  – Follow(A) = (First(B) – {ε}) ∪ Follow(S) = {b, $}
    • Since ε ∈ First(B), Follow(S) should be in Follow(A)
• Table built from First and Follow:
                   If we see a  If we see b  If we see $
When expanding S   Use R1       Use R1       Use R1
When expanding A   Use R2       Use R3       Use R3
When expanding B   -            Use R4       Use R5
Construct a Parse Table
• Construct a parse table M[N, T ∪ {$}]
  – Non-terminals in the rows and terminals in the columns
• For each production A → α
  – For each terminal a ∈ First(α), add A → α to M[A, a]
    • Meaning: when at A and seeing input a, A → α should be used
  – If ε ∈ First(α), then for each terminal a ∈ Follow(A), add A → α to M[A, a]
    • Meaning: when at A and seeing input a, A → α should be used
      – In order to continue the expansion to ε
      – Ex. X → A C, A → B, B → b | ε, C → c c
  – If ε ∈ First(α) and $ ∈ Follow(A), add A → α to M[A, $]
    • Same as the above
First() and Follow() – another example
Grammar
  E  → T E'
  E' → + T E' | ε
  T  → F T'
  T' → * F T' | ε
  F  → ( E ) | id | num
– First(*) = {*}
– First(F) = {(, id, num}
– First(T') = {*, ε}
– First(T) = First(F) = {(, id, num}
– First(E') = {+, ε}
– First(E) = First(T) = {(, id, num}
– Follow(E) = {$, )}
– Follow(E') = Follow(E) = {$, )}
– Follow(T) = {$, ), +}
  • Since we have TE' in the first two rules and E' can be ε
  • Follow(T) = (First(E') – {ε}) ∪ Follow(E')
– Follow(T') = Follow(T) = {$, ), +}
– Follow(F) = {*, $, ), +}
  • Follow(F) = (First(T') – {ε}) ∪ Follow(T')
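The Follow rules can be iterated in the same way as the First rules. The sketch below assumes the First sets listed above are already known and copies them into a table; the names `GRAMMAR`, `FIRST`, `first_of_string` and `compute_follow` are made up for illustration, and an empty list stands for the ε production.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],        # [] is the epsilon production
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"], ["num"]],
}

# First sets of the non-terminals, copied from the text.
FIRST = {
    "E": {"(", "id", "num"}, "E'": {"+", EPS},
    "T": {"(", "id", "num"}, "T'": {"*", EPS},
    "F": {"(", "id", "num"},
}


def first_of_string(symbols):
    """First set of a string of grammar symbols."""
    result = set()
    for symbol in symbols:
        symbol_first = FIRST.get(symbol, {symbol})   # First(t) = {t} for a terminal
        result |= symbol_first - {EPS}
        if EPS not in symbol_first:
            return result
    result.add(EPS)
    return result


def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                       # add $ into Follow(S)
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for alternative in alternatives:
                for i, symbol in enumerate(alternative):
                    if symbol not in grammar:
                        continue                 # Follow is kept for non-terminals only
                    tail_first = first_of_string(alternative[i + 1:])
                    new = tail_first - {EPS}     # First(beta) - {epsilon}
                    if EPS in tail_first:        # beta is empty or can vanish
                        new |= follow[head]
                    if not new <= follow[symbol]:
                        follow[symbol] |= new
                        changed = True
    return follow


FOLLOW = compute_follow(GRAMMAR, "E")
```

`FOLLOW` can be checked against the sets worked out by hand above, for example `FOLLOW["F"]` for Follow(F).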
Construct a Parse Table
Grammar
  E  → T E'
  E' → + T E' | ε
  T  → F T'
  T' → * F T' | ε
  F  → ( E ) | id | num
First(*) = {*}
First(F) = {(, id, num}
First(T') = {*, ε}
First(T) = {(, id, num}
First(E') = {+, ε}
First(E) = {(, id, num}
Follow(E) = {$, )}
Follow(E') = {$, )}
Follow(T) = {$, ), +}
Follow(T') = {$, ), +}
Follow(F) = {*, $, ), +}
Sets used for each production:
  E → TE':    First(TE') = {(, id, num}
  E' → +TE':  First(+TE') = {+}
  E' → ε:     Follow(E') = {$, )}
  T → FT':    First(FT') = {(, id, num}
  T' → *FT':  First(*FT') = {*}
  T' → ε:     Follow(T') = {$, ), +}
Parse table:
     id          num         *            +            (           )          $
E    E → TE'     E → TE'                               E → TE'
E'                                        E' → +TE'                E' → ε     E' → ε
T    T → FT'     T → FT'                               T → FT'
T'                           T' → *FT'    T' → ε                   T' → ε     T' → ε
F    F → id      F → num                               F → (E)
Parsing id + num * id:
Stack         Input              Action
E $           id + num * id $    E → TE'
T E' $        id + num * id $    T → FT'
F T' E' $     id + num * id $    F → id
T' E' $       + num * id $       T' → ε
E' $          + num * id $       E' → +TE'
T E' $        num * id $         T → FT'
F T' E' $     num * id $         F → num
T' E' $       * id $             T' → *FT'
F T' E' $     id $               F → id
T' E' $       $                  T' → ε
E' $          $                  E' → ε
$             $                  Accept
Notes:
• F → id: pop F from the stack and remove id from the input
• E' → +TE': only TE' stays in the stack; + is removed from the input
• T' → ε: pop T' from the stack; the input is unchanged
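The stack trace above follows a fixed loop, which can be sketched as a table-driven parser. The sketch below hard-codes the parse table from the text; the names `TABLE`, `NONTERMINALS` and `parse` are made up, the input is assumed to be a list of token names, and an empty list stands for an ε production. Unlike the trace, the sketch pushes each terminal and pops it again when it matches the input.

```python
# Parse table M[non-terminal, lookahead] -> right-hand side ([] is epsilon).
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "num"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "num"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "*"): ["*", "F", "T'"], ("T'", "+"): [], ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "num"): ["num"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def parse(tokens):
    """Return the list of productions applied, or None on a syntax error."""
    tokens = list(tokens) + ["$"]
    stack = ["$", "E"]              # the top of the stack is the end of the list
    position = 0
    applied = []
    while stack:
        top = stack.pop()
        lookahead = tokens[position]
        if top in NONTERMINALS:
            body = TABLE.get((top, lookahead))
            if body is None:
                return None         # empty table entry: syntax error
            applied.append((top, body))
            stack.extend(reversed(body))   # leftmost symbol ends up on top
        elif top == lookahead:
            position += 1           # matched a terminal or the end marker
        else:
            return None
    return applied if position == len(tokens) else None


steps = parse(["id", "+", "num", "*", "id"])
```

For the input id + num * id, `steps` lists the same productions, in the same order, as the Action column of the trace above.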
More about LL Grammar
• What grammar is not LL(1)?
  S → A | B
  A → aaA | ε
  B → abB | b
• First(A) = {a, ε}, First(B) = {a, b}, First(S) = {a, b, ε}
• Follow(S) = {$}, Follow(A) = {$}, Follow(B) = {$}
LL(1) table (the entry for S on input a holds two productions, so the grammar is not LL(1)):
     a               b         $
S    S → A, S → B    S → B     S → A
A    A → aaA                   A → ε
B    B → abB         B → b
  – But this grammar is LL(2)
    • If we look ahead 2 input characters, predictive parsing is possible
    • First2(A) = {aa, ε}, First2(B) = {ab, b$}, First2(S) = {aa, ab, b$, ε}
LL(2) table:
     aa         ab         b$       $        ba, bb, a$
S    S → A      S → B      S → B    S → A
A    A → aaA                        A → ε
B               B → abB    B → b
A Shift-Reduce Parser
Grammar:
  E → E+T | T
  T → T*F | F
  F → (E) | id
Right-most derivation of id+id*id:
  E ⇒ E+T ⇒ E+T*F ⇒ E+T*id ⇒ E+F*id ⇒ E+id*id ⇒ T+id*id ⇒ F+id*id ⇒ id+id*id
Right-most sentential form   Reducing production
id+id*id                     F → id
F+id*id                      T → F
T+id*id                      E → T
E+id*id                      F → id
E+F*id                       T → F
E+T*id                       F → id
E+T*F                        T → T*F
E+T                          E → E+T
E
In each right-sentential form, the handle is the substring that the reducing production replaces.
A Stack Implementation of A Shift-Reduce Parser
• There are four possible actions of a shift-reduce parser:
  1. Shift: The next input symbol is shifted onto the top of the stack.
  2. Reduce: Replace the handle on the top of the stack by the non-terminal.
  3. Accept: Successful completion of parsing.
  4. Error: The parser discovers a syntax error and calls an error recovery routine.
• The initial stack contains only the end-marker $.
• The end of the input string is marked by the end-marker $.
Stack      Input        Action
$          id+id*id$    shift
$id        +id*id$      reduce by F → id
$F         +id*id$      reduce by T → F
$T         +id*id$      reduce by E → T
$E         +id*id$      shift
$E+        id*id$       shift
$E+id      *id$         reduce by F → id
$E+F       *id$         reduce by T → F
$E+T       *id$         shift
$E+T*      id$          shift
$E+T*id    $            reduce by F → id
$E+T*F     $            reduce by T → T*F
$E+T       $            reduce by E → E+T
$E         $            accept
Operator-Precedence Parser
• Operator grammar
  – A small but important class of grammars
  – We may have an efficient operator precedence parser (a shift-reduce parser) for an operator grammar.
• In an operator grammar, no production rule can have:
  – ε at the right side
  – two adjacent non-terminals at the right side.
• Ex:
  Not an operator grammar:   E → AB,  A → a,  B → b
  Operator grammar:          E → E+E | E*E | E/E | id
Precedence Relations
• In operator-precedence parsing, we define three disjoint precedence relations between certain pairs of terminals.
  a <. b   b has higher precedence than a
  a =· b   b has the same precedence as a
  a .> b   b has lower precedence than a
• The determination of the correct precedence relations between terminals is based on the traditional notions of associativity and precedence of operators. (Unary minus causes a problem.)
Using Operator-Precedence Relations
• The intention of the precedence relations is to find the handle of a right-sentential form, with <. marking the left end, =· appearing in the interior of the handle, and .> marking the right end.
• In our input string $a1a2...an$, we insert the precedence relation between each pair of adjacent terminals (the precedence relation that holds between the terminals in that pair).
Using Operator-Precedence Relations
E → E+E | E-E | E*E | E/E | E^E | (E) | -E | id
The partial operator-precedence table for this grammar:
      id   +    *    $
id         .>   .>   .>
+     <.   .>   <.   .>
*     <.   .>   .>   .>
$     <.   <.   <.
• Then the input string id+id*id with the precedence relations inserted will be:
$ <. id .> + <. id .> * <. id .> $
To Find The Handles
1. Scan the string from the left end until the first .> is encountered.
2. Then scan backwards (to the left) over any =· until a <. is encountered.
3. The handle contains everything to the left of the first .> and to the right of the <. that is encountered.
String with relations                  Reduction    Sentential form
$ <. id .> + <. id .> * <. id .> $     E → id       $ id + id * id $
$ <. + <. id .> * <. id .> $           E → id       $ E + id * id $
$ <. + <. * <. id .> $                 E → id       $ E + E * id $
$ <. + <. * .> $                       E → E*E      $ E + E * E $
$ <. + .> $                            E → E+E      $ E + E $
$ $                                                 $ E $
Operator-Precedence Parsing Algorithm -- Example
Stack   Input        Relation   Action
$       id+id*id$    $ <. id    shift
$id     +id*id$      id .> +    reduce E → id
$       +id*id$      $ <. +     shift
$+      id*id$       + <. id    shift
$+id    *id$         id .> *    reduce E → id
$+      *id$         + <. *     shift
$+*     id$          * <. id    shift
$+*id   $            id .> $    reduce E → id
$+*     $            * .> $     reduce E → E*E
$+      $            + .> $     reduce E → E+E
$       $                       accept
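The shift and reduce decisions in the trace above depend only on the relation between the terminal on top of the stack and the next input terminal. The sketch below illustrates that loop with the partial precedence table for id, +, * and $. The names `PRECEDENCE`, `PRODUCTION` and `parse` are made up, and, like the trace, the stack holds terminals only, so the sketch does not check where the operands sit.

```python
# Partial precedence table: (terminal on stack, next input) -> relation.
PRECEDENCE = {
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
}
# Production reduced when a handle built around the given terminal is popped.
PRODUCTION = {"id": "E -> id", "+": "E -> E+E", "*": "E -> E*E"}


def parse(tokens):
    """Return the reductions in the order performed, or None on error."""
    tokens = list(tokens) + ["$"]
    stack = ["$"]
    position = 0
    reductions = []
    while True:
        top, lookahead = stack[-1], tokens[position]
        if top == "$" and lookahead == "$":
            return reductions                   # accept
        relation = PRECEDENCE.get((top, lookahead))
        if relation == "<":
            stack.append(lookahead)             # shift
            position += 1
        elif relation == ">":
            # Pop the handle: stop once the new stack top is related
            # by <. to the terminal that was popped last.
            while True:
                popped = stack.pop()
                if PRECEDENCE.get((stack[-1], popped)) == "<":
                    break
            reductions.append(PRODUCTION[popped])
        else:
            return None                         # no relation: syntax error


steps = parse(["id", "+", "id", "*", "id"])
```

For id+id*id, `steps` contains the five reductions of the trace above in the same order.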
Chapter - 6
Introduction to Compiler
Unit-6
Aspects of Compilation
• A compiler bridges the semantic gap between a PL domain and an execution domain.
• Two aspects of compilation are:
  – Generate code to implement the meaning of a source program in the execution domain.
  – Provide diagnostics for violations of PL semantics in a source program.
• PL features:
  – Data types
  – Data structures
  – Scope rules
  – Control structures
Three address Code
• In three-address code, there is at most one operator on the right side of an instruction; that is, no built-up arithmetic expressions are permitted.
• Example: A source-language expression x+y*z might be translated into the sequence of three-address instructions below, where t1 and t2 are compiler-generated temporary names.
  t1 = y * z
  t2 = x + t1
• Generate code for x = a + b + c + d
• Generate the code for x = -a * b + -a * b
Quadruple
      OP       Arg1   Arg2   Result
(0)   uminus   a             t1
(1)   *        t1     b      t2
(2)   uminus   a             t3
(3)   *        t3     b      t4
(4)   +        t2     t4     t5
(5)   =        t5            x
Three-address code:
  t1 = uminus a
  t2 = t1 * b
  t3 = uminus a
  t4 = t3 * b
  t5 = t2 + t4
  x = t5
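Quadruples such as these can be produced by walking an expression tree bottom-up and allocating a new temporary for every operator. The sketch below is illustrative: the tuple-based tree format and the names `generate`, `assignment_quads` and `TREE` are assumptions, not part of the notes.

```python
def generate(tree, quads, counter):
    """Emit quadruples for an expression tree; return the name holding its value."""
    if isinstance(tree, str):            # a leaf is a variable name
        return tree
    op, *operands = tree
    args = [generate(operand, quads, counter) for operand in operands]
    counter[0] += 1
    result = "t%d" % counter[0]          # fresh compiler-generated temporary
    arg2 = args[1] if len(args) > 1 else None
    quads.append((op, args[0], arg2, result))
    return result


def assignment_quads(target, tree):
    """Quadruples for the assignment target = tree."""
    quads, counter = [], [0]
    value = generate(tree, quads, counter)
    quads.append(("=", value, None, target))
    return quads


# x = -a * b + -a * b
TREE = ("+", ("*", ("uminus", "a"), "b"), ("*", ("uminus", "a"), "b"))
QUADS = assignment_quads("x", TREE)
```

Each tuple in `QUADS` has the form (OP, Arg1, Arg2, Result), with `None` for an unused Arg2, so the list can be read against the quadruple table above row by row.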
Triple
      OP       Arg1   Arg2
(0)   uminus   a
(1)   *        (0)    b
(2)   uminus   a
(3)   *        (2)    b
(4)   +        (1)    (3)
(5)   =        x      (4)
Indirect Triples
Statement list (pointers to triples):
      Statement
(0)   (11)
(1)   (12)
(2)   (13)
(3)   (14)
(4)   (15)
(5)   (16)
Triples:
       OP       Arg1   Arg2
(11)   uminus   a
(12)   *        (11)   b
(13)   uminus   a
(14)   *        (13)   b
(15)   +        (12)   (14)
(16)   =        x      (15)
Example
• Construct the quadruple, triple and indirect triple representations of a = b * - c + b * - c
Aspects of compilation
• A compiler bridges the semantic gap between a PL domain and an execution domain.
• Generate code to implement the meaning of a source program in the execution domain.
• Provide diagnostics for violations of PL semantics in a source program
  – PL features are:
    • Data types: specification of legal values for variables of the type
    • Data structures:
    • Scope rules: accessibility of variables declared in different blocks of a program
    • Control structures:
Memory Allocation
• Memory binding: an association between the ‘memory address’ attribute of a data item and the address of a memory area.
  – Static memory allocation:
    • Memory is allocated before execution begins
  – Dynamic memory allocation:
    • Memory is allocated after execution begins
Static memory Allocation
• The program consists of three units A, B and C.
• Advantage??????
[Figure: static allocation. Memory holds Code(A), Data(A), Code(B), Data(B), Code(C), Data(C).]
Program structure:
  Procedure A
    Data(A)
    B()
  Procedure B
    Data(B)
    C()
  Procedure C
    Data(C)
Dynamic memory Allocation
• The program consists of three units A, B and C.
Program structure:
  Procedure A
    Data(A)
    B()
  Procedure B
    Data(B)
    C()
  Procedure C
    Data(C)
• Procedure A is active: Data(A) is allocated.
  [Figure: memory holds Code(A), Code(B), Code(C), Data(A).]
• Procedure A calls B, and Data(B) gets allocated.
  [Figure: memory holds Code(A), Code(B), Code(C), Data(B).]
• Procedure B calls C, and Data(C) gets allocated.
  [Figure: memory holds Code(A), Code(B), Code(C), Data(C).]
Dynamic memory Allocation
• Different scenario….
• Memory allocation in a block structured language (same as above)
• Advantage??????
Program structure:
  Procedure A
    Data(A)
    B()
    C()
  Procedure B
    Data(B)
  Procedure C
    Data(C)
[Figure: (a) memory holds Code(A), Code(B), Code(C), Data(A). (b) memory holds Code(A), Code(B), Code(C), Data(A), Data(B). (c) memory holds Code(A), Code(B), Code(C), Data(A), Data(C).]
Stack
• Last In, First Out (LIFO) data structure
  main () { a(0); }
  void a (int m) { b(1); }
  void b (int n) { c(2); }
  void c (int o) { d(3); }
  void d (int p) { }
[Figure: the stack grows down; the stack pointer moves down one frame each time main calls a, a calls b, b calls c and c calls d.]
• Activation records:
  – Also called frames
  – The information (memory) needed by a single execution of a procedure
• A general activation record:
Field                   Purpose
Return value            Stores the result of the function call
Actual parameters       Information about the actual parameters
Optional control link   Points to the calling procedure
Optional access link    Non-local data of other procedures
Machine status          Information about the program counter
Local variables         Stores local data
Temporaries             Stores temporary values
Activation Record for Factorial Program
int factorial(int n)
{
    if (n == 1)
        return 1;
    else
        return n * factorial(n - 1);
}

int main()
{
    int f;
    f = factorial(3);
    return 0;
}
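The way activation records pile up during factorial(3) can be imitated with an explicit stack. The sketch below is a simplified model, not what a compiler generates: each record is a dictionary with only a parameter and a return value, and the names `factorial`, `stack` and `trace` are chosen for illustration.

```python
def factorial(n, stack, trace):
    """Compute n! while keeping an explicit stack of activation records."""
    record = {"procedure": "factorial", "n": n, "return_value": None}
    stack.append(record)                      # push a record on call
    trace.append([r["n"] for r in stack])     # snapshot of the live records
    if n == 1:
        record["return_value"] = 1
    else:
        record["return_value"] = n * factorial(n - 1, stack, trace)
    stack.pop()                               # pop the record on return
    return record["return_value"]


stack, trace = [], []
result = factorial(3, stack, trace)
```

After the call, `trace` shows one record for n = 3, then two records (3 and 2), then three (3, 2 and 1), and `stack` is empty again because every record is popped when its call returns.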
Parameter passing
• The method used to associate actual parameters with formal parameters.
• The parameter passing method will affect the code generated.
• Call-by-value:
  – The actual parameters are evaluated and their r-values are passed to the called procedure.
  – Implementation:
    » A formal parameter is treated like a local name, so the storage for the formals is in the activation record of the called procedure.
    » The caller evaluates the actual parameters and places their r-values in the storage for the formals.
• Call-by-reference:
  – Also called call-by-address or call-by-location.
  – The caller passes to the called procedure a pointer to the storage address of each actual parameter.
  – The actual parameter must have an address. If it is an expression rather than a variable, the expression is evaluated into a temporary and the location of that temporary is passed.
• Copy-restore:
  – A hybrid between call-by-value and call-by-reference.
  – The actual parameters are evaluated and their r-values are passed to the called procedure, as in call-by-value.
  – When control returns, the r-values of the formal parameters are copied back into the l-values of the actuals.
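The three methods give different results when the same variable is passed for two formals. The sketch below simulates them over a dictionary that plays the role of memory. Everything in it is hypothetical: the callee `body` (x := x + 1; y := y + 1), the helper names and the `demo` driver are made up for illustration.

```python
def body(load, store):
    """Hypothetical callee with formals x and y: x := x + 1; y := y + 1."""
    store("x", load("x") + 1)
    store("y", load("y") + 1)


def call_by_value(memory, actuals):
    # Formals live in the callee's frame and start as copies of the r-values.
    frame = {formal: memory[actual] for formal, actual in actuals.items()}
    body(frame.__getitem__, frame.__setitem__)
    # Nothing is copied back to the caller.


def call_by_reference(memory, actuals):
    # Each formal is an alias for the storage of its actual parameter.
    def load(formal):
        return memory[actuals[formal]]

    def store(formal, value):
        memory[actuals[formal]] = value

    body(load, store)


def call_by_copy_restore(memory, actuals):
    frame = {formal: memory[actual] for formal, actual in actuals.items()}  # copy in
    body(frame.__getitem__, frame.__setitem__)
    for formal, actual in actuals.items():                                  # copy out
        memory[actual] = frame[formal]


def demo(call):
    """Pass the same variable a for both formals and return its final value."""
    memory = {"a": 0}
    call(memory, {"x": "a", "y": "a"})
    return memory["a"]
```

With a initially 0, `demo(call_by_value)` leaves a at 0, `demo(call_by_reference)` gives 2 because both formals alias a, and `demo(call_by_copy_restore)` gives 1 because each formal is incremented separately and then copied back.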