
7/28/2019 CS1352_NOV07


NOV/DEC-'07/CS1352-Answer Key

7. What are basic blocks and flow graphs?

A basic block is a sequence of consecutive statements in which flow of control enters at the beginning and leaves at the end, without halting or the possibility of branching except at the end.

A flow graph is a directed graph in which flow-of-control information is added to the basic blocks. The nodes of the flow graph are the basic blocks; the block whose leader is the first statement is called the initial block. There is a directed edge from block B1 to block B2 if B2 can immediately follow B1 in some execution sequence. We then say that B1 is a predecessor of B2 and B2 is a successor of B1.
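The definitions above can be sketched in code. The following is a minimal illustration, not part of the answer key: the instruction format (text, jump target, falls-through flag) and the function names are assumptions made for the example, and the leader rules used are the standard ones (first statement, target of a jump, statement following a jump).

```python
def basic_blocks(instrs):
    """Split instructions into basic blocks, returned as (start, end) index pairs.

    Each instruction is (text, jump_target, falls_through): jump_target is the
    index a jump may go to (or None); falls_through says whether control can
    continue to the next instruction. This format is an assumption.
    """
    leaders = {0}                          # rule 1: the first statement
    for i, (_, target, _) in enumerate(instrs):
        if target is not None:
            leaders.add(target)            # rule 2: the target of a jump
            if i + 1 < len(instrs):
                leaders.add(i + 1)         # rule 3: the statement after a jump
    starts = sorted(leaders)
    return list(zip(starts, starts[1:] + [len(instrs)]))


def flow_graph(instrs, blocks):
    """Return the set of edges (B1, B2) between block numbers."""
    block_of = {start: number for number, (start, _) in enumerate(blocks)}
    edges = set()
    for number, (_, end) in enumerate(blocks):
        _, target, falls_through = instrs[end - 1]   # last statement of the block
        if target is not None:
            edges.add((number, block_of[target]))
        if falls_through and end < len(instrs):
            edges.add((number, block_of[end]))
    return edges


# Hypothetical five-statement program with one loop.
PROGRAM = [
    ("i = 1", None, True),
    ("t = i * 2", None, True),
    ("i = i + 1", None, True),
    ("if i <= 10 goto 1", 1, True),
    ("x = t", None, True),
]
BLOCKS = basic_blocks(PROGRAM)
EDGES = flow_graph(PROGRAM, BLOCKS)
```

Here the loop body forms the middle block, which is both a predecessor and a successor of itself.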

8. What are the limitations of static allocation?

o The size of a data object and constraints on its position in memory must be known at compile time.
o Recursive procedures are restricted, because all activations of a procedure use the same bindings for local names.
o Data structures cannot be created dynamically, since there is no mechanism for storage allocation at run time.

9. Define activation tree.

Each execution of a procedure is referred to as an activation of the procedure. An activation tree depicts the way control enters and leaves activations. Each node represents an activation of a procedure, and the root represents the activation of the main program. The node for a is the parent of the node for b iff control flows from activation a to b, and the node for a is to the left of the node for b iff the lifetime of a occurs before the lifetime of b.

10. What is inline expansion?

The body of the procedure is substituted for the call in the caller, with the actual parameters literally substituted for the formals, i.e. the procedure is treated as if it were a macro.

PART B

11.a. i. Explain in detail about the role of lexical analyzer with the possible error recovery actions. (6)

Few errors are discernible at the lexical level alone, because a lexical analyzer has a very localized view of the source program. The simplest recovery strategy is panic mode recovery: delete successive characters from the remaining input until the lexical analyzer can find a well-formed token. Other possible error recovery actions are

o Deleting an extraneous character
o Inserting a missing character
o Replacing an incorrect character by a correct character
o Transposing two adjacent characters


http://engineerportal.blogspot.in/



ii. What is a compiler? Explain the various phases of compiler in detail, with a neat sketch. (10)

A compiler is a program that reads a program written in one language (the source language) and translates it into an equivalent program in another language (the target language). Compilation is a complex process, so it is customary, from both the logical and the implementation point of view, to partition it into several phases. A phase is a logically cohesive operation that takes as input one representation of the source program and produces as output another representation. (2)

The source program is a stream of characters, e.g. pos = init + rate * 60 (6)

Lexical analysis: groups characters into non-separable units, called tokens, and generates the token stream: id1 = id2 + id3 * const. The information about the identifiers must be stored somewhere (symbol table).

Syntax analysis: checks whether the token stream meets the grammatical specification of the language and generates the syntax tree.

Semantic analysis: checks whether the program has a meaning (e.g. if pos is a record and init and rate are integers, then the assignment does not make sense).

Tree after syntax analysis:

      :=
     /  \
  id1    +
        / \
     id2   *
          / \
       id3   60

Tree after semantic analysis:

      :=
     /  \
  id1    +
        / \
     id2   *
          / \
       id3   inttoreal
                |
                60

Intermediate code generation: intermediate code is something that is both close to the final machine code and easy to manipulate (for optimization). One example is three-address code:

dst = op1 op op2

The three-address code for the assignment statement:

temp1 = inttoreal(60)
temp2 = id3 * temp1
temp3 = id2 + temp2
id1 = temp3

Code optimization: produces better, semantically equivalent code.

temp1 = id3 * 60.0
id1 = id2 + temp1

Code generation: generates assembly.

MOVF id3, R2
MULF #60.0, R2
MOVF id2, R1
ADDF R2, R1
MOVF R1, id1

Symbol Table Creation / Maintenance

Contains information (storage, type, scope, arguments) on each meaningful token, typically identifiers. The data structure is created and initialized during lexical analysis, and it is utilized and updated during the later analysis and synthesis phases.
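As a rough illustration of lexical analysis feeding the symbol table, the sketch below turns the example statement into the token stream id1 = id2 + id3 * const. The token pattern, the function name tokenize and the choice to recover from a bad character by simply deleting it (a very small form of panic mode) are assumptions made for this example.

```python
import re

# Hypothetical token pattern: identifiers, numbers and a few operators.
TOKEN = re.compile(
    r"\s*(?:(?P<id>[A-Za-z_]\w*)|(?P<num>\d+(?:\.\d+)?)|(?P<op>[=+\-*/()]))")


def tokenize(source):
    """Return (tokens, symbol_table) for a one-line source string."""
    symtab = {}                 # identifier name -> index of its entry
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            pos += 1            # panic mode: delete one character and retry
            continue
        pos = match.end()
        if match.group("id"):
            # Enter the identifier on first sight, reuse its entry afterwards.
            index = symtab.setdefault(match.group("id"), len(symtab) + 1)
            tokens.append(f"id{index}")
        elif match.group("num"):
            tokens.append("const")
        else:
            tokens.append(match.group("op"))
    return tokens, symtab


TOKENS, SYMTAB = tokenize("pos = init + rate * 60")
```

A real symbol table entry would also carry the storage, type and scope information mentioned above; only the index is kept here.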



Error Handling

Detection of the different errors that correspond to all phases. Each phase should know how to deal with an error, so that compilation can proceed and further errors can be detected.

Phases of a compiler (sketch): (2)

Source Program
1. Lexical Analyzer
2. Syntax Analyzer
3. Semantic Analyzer
4. Intermediate Code Generator
5. Code Optimizer
6. Code Generator
Target Program

The Symbol-table Manager and the Error Handler interact with all six phases.

(OR)

b. i. Give the minimized DFA for the following expression (a|b)*abb. (10)

Syntax tree for (a|b)*abb#:



Calculation of firstpos, lastpos and nullable for nodes in syntax tree:

Calculation of followpos:

Node    followpos
1       {1, 2, 3}
2       {1, 2, 3}
3       {4}
4       {5}
5       {6}
6       -

The start state of the DFA is firstpos of the root, so A = {1, 2, 3}.

Consider the input symbol a:
Positions 1 and 3 are for a in A. So, let
B = followpos(1) U followpos(3) = {1, 2, 3} U {4} = {1, 2, 3, 4}
DTrans[A, a] = B

Consider the input symbol b:
Position 2 is for b in A. So,
followpos(2) = {1, 2, 3} = A
DTrans[A, b] = A



Now continue with B.

Consider the input symbol a:
Positions 1 and 3 are for a in B. So,
followpos(1) U followpos(3) = {1, 2, 3} U {4} = {1, 2, 3, 4} = B
DTrans[B, a] = B

Consider the input symbol b:
Positions 2 and 4 are for b in B. So,
followpos(2) U followpos(4) = {1, 2, 3} U {5} = {1, 2, 3, 5} = C
DTrans[B, b] = C

Now continue with C.

Consider the input symbol a:
Positions 1 and 3 are for a in C. So,
followpos(1) U followpos(3) = {1, 2, 3} U {4} = {1, 2, 3, 4} = B
DTrans[C, a] = B

Consider the input symbol b:
Positions 2 and 5 are for b in C. So,
followpos(2) U followpos(5) = {1, 2, 3} U {6} = {1, 2, 3, 6} = D
DTrans[C, b] = D

Now continue with D.

Consider the input symbol a:
Positions 1 and 3 are for a in D. So,
followpos(1) U followpos(3) = {1, 2, 3} U {4} = {1, 2, 3, 4} = B
DTrans[D, a] = B

Consider the input symbol b:
Position 2 is for b in D. So,
followpos(2) = {1, 2, 3} = A
DTrans[D, b] = A

The position associated with the end marker #, 6, is in D. So, D is the final state.

DFA: [transition diagram over the states A, B, C and D, with start state A and final state D]



Transition table:

         Input symbol
State    a    b
A        B    A
B        B    C
C        B    D
D        B    A
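The construction worked through by hand above can be checked mechanically. The sketch below assumes the followpos table and the symbol at each position are typed in as given in the answer; the function and variable names are illustrative only.

```python
# followpos and the symbol at each position, copied from the worked answer.
FOLLOWPOS = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {4}, 4: {5}, 5: {6}, 6: set()}
SYMBOL_AT = {1: "a", 2: "b", 3: "a", 4: "b", 5: "b", 6: "#"}


def build_dfa(start, alphabet="ab"):
    """Build DFA states (sets of positions) and transitions from followpos."""
    states = [frozenset(start)]
    dtrans = {}
    for state in states:                       # the list grows as states are found
        for symbol in alphabet:
            # Union of followpos(p) over the positions p in the state that
            # carry the input symbol.
            target = frozenset().union(
                *(FOLLOWPOS[p] for p in state if SYMBOL_AT[p] == symbol))
            if target not in states:
                states.append(target)
            dtrans[state, symbol] = target
    return states, dtrans


STATES, DTRANS = build_dfa({1, 2, 3})          # start state = firstpos(root)
NAME = {state: "ABCDEFGH"[i] for i, state in enumerate(STATES)}
TABLE = {(NAME[s], sym): NAME[t] for (s, sym), t in DTRANS.items()}
FINAL = [NAME[s] for s in STATES if 6 in s]    # states containing the # position
```

States are named A, B, C, ... in the order they are discovered, which matches the order used in the hand computation.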

ii. Draw the transition diagram for unsigned numbers. (6)

12.a. i. Explain the role of parser in detail. (4)

The parser obtains a string of tokens from the lexical analyzer and verifies that the string can be generated by the grammar for the source language. It can report any syntax error in an intelligible fashion.

Errors can be lexical, syntactic, semantic or logical. The error handler in a parser has simple-to-state goals:

o It should report the presence of errors clearly and accurately.
o It should recover from each error quickly enough to be able to detect subsequent errors.
o It should not significantly slow down the processing of correct programs.



ii. Construct predictive parsing table for the grammar
E->E+T | T, T->T*F | F, F->(E) | id (12)

Eliminating left recursion: (2)

E -> TE'
E' -> +TE' | ε
T -> FT'
T' -> *FT' | ε
F -> (E) | id

Calculation of First: (2)

First(E) = First(T) = First(F) = {(, id}
First(E') = {+, ε}
First(T') = {*, ε}

Calculation of Follow: (2)

Follow(E) = Follow(E') = {), $}
Follow(T) = Follow(T') = {+, ), $}
Follow(F) = {+, *, ), $}

Predictive parsing table: (6)

Non-         Input symbol
terminal     id        +           *           (         )        $
E            E->TE'                            E->TE'
E'                     E'->+TE'                          E'->ε    E'->ε
T            T->FT'                            T->FT'
T'                     T'->ε       T'->*FT'              T'->ε    T'->ε
F            F->id                             F->(E)
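The First and Follow sets and the table entries can be recomputed with a short fixed-point iteration. This is a sketch under stated assumptions: the grammar is written with E' and T' for the new nonterminals and ε for the empty string, and all function names are made up for the example.

```python
EPS = "ε"
GRAMMAR = {                       # grammar after eliminating left recursion
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}


def first_of(seq, first):
    """First set of a sequence of grammar symbols."""
    out = set()
    for sym in seq:
        f = first[sym] if sym in GRAMMAR else {sym}
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)                  # every symbol in seq can derive ε
    return out


def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:                # iterate until no set grows
        changed = False
        for nt, prods in GRAMMAR.items():
            for prod in prods:
                new = first_of(prod, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(first, start="E"):
    follow = {nt: set() for nt in GRAMMAR}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for nt, prods in GRAMMAR.items():
            for prod in prods:
                for i, sym in enumerate(prod):
                    if sym not in GRAMMAR:
                        continue
                    rest = first_of(prod[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:           # what follows nt also follows sym
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


def build_table(first, follow):
    table = {}
    for nt, prods in GRAMMAR.items():
        for prod in prods:
            f = first_of(prod, first)
            lookaheads = (f - {EPS}) | (follow[nt] if EPS in f else set())
            for terminal in lookaheads:
                if (nt, terminal) in table:
                    raise ValueError("grammar is not LL(1)")
                table[nt, terminal] = prod
    return table


FIRST = compute_first()
FOLLOW = compute_follow(FIRST)
TABLE = build_table(FIRST, FOLLOW)
```

Blank table cells correspond to (nonterminal, terminal) pairs that are absent from TABLE and denote syntax errors.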

(OR)

b. i. Give the LALR parsing table for the grammar (12)
S -> L=R | R
L -> *R | id
R -> L

Given grammar:
1. S -> L=R
2. S -> R
3. L -> *R
4. L -> id
5. R -> L

Augmented grammar:
S' -> S
S -> L=R
S -> R
L -> *R
L -> id
R -> L



Canonical collection of LR(1) items

I0:
S' -> .S, $
S -> .L=R, $
S -> .R, $
L -> .*R, =/$
L -> .id, =/$
R -> .L, $

I1: goto(I0, S)
S' -> S., $

I2: goto(I0, L)
S -> L.=R, $
R -> L., $

I3: goto(I0, R)
S -> R., $

I4: goto(I0, *)
L -> *.R, =/$
R -> .L, =/$
L -> .*R, =/$
L -> .id, =/$

I5: goto(I0, id)
L -> id., =/$

I6: goto(I2, =)
S -> L=.R, $
R -> .L, $
L -> .*R, $
L -> .id, $

I7: goto(I4, R)
L -> *R., =/$

I8: goto(I4, L)
R -> L., =/$

goto(I4, *) = I4
goto(I4, id) = I5

I9: goto(I6, R)
S -> L=R., $

I10: goto(I6, L)
R -> L., $

I11: goto(I6, *)
L -> *.R, $
R -> .L, $
L -> .*R, $
L -> .id, $

I12: goto(I6, id)
L -> id., $

I13: goto(I11, R)
L -> *R., $

goto(I11, L) = I10
goto(I11, *) = I11
goto(I11, id) = I12

LR(1) table construction:

                action                goto
State    =      *      id     $       S     L     R
0               s4     s5             1     2     3
1                             acc
2        s6                   r5
3                             r2
4               s4     s5                   8     7
5        r4                   r4
6               s11    s12                  10    9
7        r3                   r3
8        r5                   r5
9                             r1
10                            r5
11              s11    s12                  10    13
12                            r4
13                            r3

This grammar is LR(1), since its parsing table has no multiply defined entries.



LALR table construction:

I4 and I11 have the same core. Combine them as I411 (or I4):
L -> *.R, =/$
R -> .L, =/$
L -> .*R, =/$
L -> .id, =/$

I5 and I12 have the same core. Combine them as I512 (or I5):
L -> id., =/$

I7 and I13 have the same core. Combine them as I713 (or I7):
L -> *R., =/$

I8 and I10 have the same core. Combine them as I810 (or I8):
R -> L., =/$

                action                goto
State    =      *      id     $       S     L     R
0               s4     s5             1     2     3
1                             acc
2        s6                   r5
3                             r2
4               s4     s5                   8     7
5        r4                   r4
6               s4     s5                   8     9
7        r3                   r3
8        r5                   r5
9                             r1
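The merging step can be sketched as grouping LR(1) item sets by their core (the items with the lookaheads stripped) and taking the union of the lookaheads. The dictionary below is typed in by hand for the eight states that take part in merging; its representation (item string mapped to a string of lookahead characters) and the function name are assumptions made for this illustration.

```python
from collections import defaultdict

# LR(1) item sets that take part in merging, as {item: lookahead characters}.
LR1_STATES = {
    4:  {"L->*.R": "=$", "R->.L": "=$", "L->.*R": "=$", "L->.id": "=$"},
    5:  {"L->id.": "=$"},
    7:  {"L->*R.": "=$"},
    8:  {"R->L.": "=$"},
    10: {"R->L.": "$"},
    11: {"L->*.R": "$", "R->.L": "$", "L->.*R": "$", "L->.id": "$"},
    12: {"L->id.": "$"},
    13: {"L->*R.": "$"},
}


def merge_by_core(states):
    """Group states with identical cores and union their lookaheads."""
    groups = defaultdict(list)
    for number, items in sorted(states.items()):
        groups[frozenset(items)].append(number)    # core = items without lookaheads
    merged = {}
    for core, numbers in groups.items():
        merged[tuple(numbers)] = {
            item: set().union(*(states[n][item] for n in numbers))
            for item in core
        }
    return merged


MERGED = merge_by_core(LR1_STATES)
```

Each key of MERGED lists the LR(1) states that collapse into one LALR state, e.g. (4, 11) for I411.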

ii. What are the reasons for using LR parser technique? (4)

o LR parsers can be constructed to recognize virtually all programming language constructs for which CFGs can be written.
o The LR parsing method is the most general non-backtracking shift-reduce parsing method known, yet it can be implemented as efficiently as other shift-reduce methods.
o The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive parsers.
o An LR parser can detect a syntactic error as soon as it is possible to do so on a left-to-right scan of the input.



13.a. i. Explain about the different type of three address statements. (8)

Three-address code is one of the intermediate representations. It is a sequence of statements of the form x := y op z, where x, y and z are names, constants or compiler-generated temporaries and op is an operator, which can be an arithmetic or a logical operator. E.g. x+y*z is translated as t1 = y*z and t2 = x+t1.

The reason for the term three-address code is that each statement usually contains three addresses, two for the operands and one for the result. (2)

Common three address statements: (2)

o x := y op z (assignment statements)
o x := op y (assignment statements)
o x := y (copy statements)
o goto L (unconditional jump)
o Conditional jumps like if x relop y goto L
o param x, call p,n and return y for procedure calls
o Indexed assignments x := y[i] and x[i] := y
o Address and pointer assignments x := &y, x := *y and *x := y

Implementation: (4)

Quadruples
Record with four fields: op, arg1, arg2 and result.

Triples
Record with three fields: op, arg1 and arg2. To avoid entering temporary names into the symbol table, a temporary value is referred to by the position of the statement that computes it.

Indirect triples
List pointers to the triples rather than listing the triples themselves.

For a := b * -c + b * -c

Quadruples

       op       arg1   arg2   result
(0)    uminus   c             t1
(1)    *        b      t1     t2
(2)    uminus   c             t3
(3)    *        b      t3     t4
(4)    +        t2     t4     t5
(5)    :=       t5            a

Triples

       op       arg1   arg2
(0)    uminus   c
(1)    *        b      (0)
(2)    uminus   c
(3)    *        b      (2)
(4)    +        (1)    (3)
(5)    assign   a      (4)



Indirect Triples

       statement
(0)    (14)
(1)    (15)
(2)    (16)
(3)    (17)
(4)    (18)
(5)    (19)

        op       arg1   arg2
(14)    uminus   c
(15)    *        b      (14)
(16)    uminus   c
(17)    *        b      (16)
(18)    +        (15)   (17)
(19)    assign   a      (18)
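The quadruple and triple forms for a := b * -c + b * -c can be generated from an expression tree with a post-order walk. In this sketch the tuple-based tree encoding and the function names are assumptions; operands that are strings are names, and integer operands in a triple are positions of earlier triples.

```python
# Expression tree for b * -c + b * -c, as nested (op, operand, ...) tuples.
EXPR = ("+", ("*", "b", ("uminus", "c")), ("*", "b", ("uminus", "c")))


def to_quadruples(target, expr):
    """Return a list of (op, arg1, arg2, result) records."""
    quads = []

    def walk(node):
        if isinstance(node, str):
            return node                       # a name is its own place
        op, *args = node
        places = [walk(arg) for arg in args]
        temp = f"t{len(quads) + 1}"           # fresh compiler-generated temporary
        arg2 = places[1] if len(places) > 1 else None
        quads.append((op, places[0], arg2, temp))
        return temp

    quads.append((":=", walk(expr), None, target))
    return quads


def to_triples(target, expr):
    """Return a list of (op, arg1, arg2) records; ints refer to positions."""
    triples = []

    def walk(node):
        if isinstance(node, str):
            return node
        op, *args = node
        places = [walk(arg) for arg in args]
        arg2 = places[1] if len(places) > 1 else None
        triples.append((op, places[0], arg2))
        return len(triples) - 1               # refer to the value by position

    value = walk(expr)
    triples.append(("assign", target, value))
    return triples


QUADS = to_quadruples("a", EXPR)
TRIPLES = to_triples("a", EXPR)
```

Indirect triples would add a separate statement list holding the positions 0 to 5 (or 14 to 19 in the numbering used above), so that statements can be reordered without renumbering the triples.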

ii. What are the methods of translating Boolean expression? (8)

Boolean expressions are: (2)

o Used to compute logical values.
o Used as conditional expressions in statements that alter the flow of control.
o Built with the operators and, or and not.
o Composed of Boolean variables and relational expressions.

Methods of translating Boolean expressions: (2)

o Encode true and false numerically and evaluate a Boolean expression like an arithmetic expression.
o By flow of control, i.e. represent the value of a Boolean expression by a position reached in the program.

The semantics of the programming language determine whether all parts of a Boolean expression must be evaluated. If they need not be, the compiler can optimize the evaluation by computing only enough of the expression to determine its value.

Syntax directed definitions to produce 3AC for Booleans: (4)

E -> E1 or E2
{ E1.true = E.true; E1.false = newlabel(); E2.true = E.true; E2.false = E.false;
  E.code = E1.code || gen(E1.false ':') || E2.code }

E -> E1 and E2
{ E1.true = newlabel(); E1.false = E.false; E2.true = E.true; E2.false = E.false;
  E.code = E1.code || gen(E1.true ':') || E2.code }

E -> not E1
{ E1.true = E.false; E1.false = E.true; E.code = E1.code }

E -> (E1)
{ E1.true = E.true; E1.false = E.false; E.code = E1.code }

E -> id1 relop id2
{ E.code = gen('if' id1.place relop id2.place 'goto' E.true) || gen('goto' E.false) }

E -> true
{ E.code = gen('goto' E.true) }

E -> false
{ E.code = gen('goto' E.false) }
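The flow-of-control rules above can be read as a recursive function that receives the inherited true and false labels and returns the generated code. The following sketch assumes a tuple encoding of expressions and hypothetical exit labels Ltrue and Lfalse; it illustrates the scheme rather than giving a complete translator.

```python
import itertools


def make_newlabel():
    """Return a newlabel() function producing L1, L2, ..."""
    counter = itertools.count(1)
    return lambda: f"L{next(counter)}"


def translate(expr, true, false, newlabel):
    """Return three-address code that jumps to `true` or `false`."""
    kind = expr[0]
    if kind == "or":
        e1_false = newlabel()                 # E1.false = newlabel()
        return (translate(expr[1], true, e1_false, newlabel)
                + [f"{e1_false}:"]
                + translate(expr[2], true, false, newlabel))
    if kind == "and":
        e1_true = newlabel()                  # E1.true = newlabel()
        return (translate(expr[1], e1_true, false, newlabel)
                + [f"{e1_true}:"]
                + translate(expr[2], true, false, newlabel))
    if kind == "not":
        return translate(expr[1], false, true, newlabel)   # swap the exits
    if kind == "rel":
        _, left, op, right = expr
        return [f"if {left} {op} {right} goto {true}", f"goto {false}"]
    if kind == "true":
        return [f"goto {true}"]
    if kind == "false":
        return [f"goto {false}"]
    raise ValueError(f"unknown expression kind: {kind}")


# a < b or c < d and e < f
EXPR = ("or", ("rel", "a", "<", "b"),
        ("and", ("rel", "c", "<", "d"), ("rel", "e", "<", "f")))
CODE = translate(EXPR, "Ltrue", "Lfalse", make_newlabel())
```

Note that the second operand of or is reached only through label L1, i.e. only when a < b is false, which is the short-circuit behaviour described above.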

(OR)

b. i. Write short notes on back-patching. (8)

Back patching is the activity of filling in unspecified label information, using appropriate semantic actions, during the code generation process. (2)



In the semantic actions the functions used are: (2)

o mklist(i): creates a new list containing only i, an index into the array of quadruples.
o merge(p1, p2): merges the two lists pointed to by p1 and p2.
o backpatch(p, j): inserts the target label j into each of the statements on the list pointed to by p.

Example: (4)

Source:

if a or b then
    if c then
        x = y+1

Translation:

    if a goto L1
    if b goto L1
    goto L3
L1: if c goto L2
    goto L3
L2: x = y+1
L3:

After backpatching:

100: if a goto 103
101: if b goto 103
102: goto 106
103: if c goto 105
104: goto 106
105: x = y+1
106:
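A small sketch of the three functions at work on this example follows. The QuadArray class, the use of "_" as a placeholder for an unfilled jump target and the hand-written driver translate_example are assumptions made for the illustration; a real translator would call these functions from the semantic actions of the grammar.

```python
class QuadArray:
    """Array of quadruples, kept as text; "_" marks an unfilled jump target."""

    def __init__(self, first=100):
        self.first = first
        self.quads = []

    def nextquad(self):
        return self.first + len(self.quads)

    def emit(self, text):
        self.quads.append(text)
        return self.nextquad() - 1        # index of the quadruple just emitted

    def backpatch(self, plist, target):
        for index in plist:
            slot = index - self.first
            self.quads[slot] = self.quads[slot].replace("_", str(target))

    def listing(self):
        return [f"{self.first + i}: {q}" for i, q in enumerate(self.quads)]


def mklist(index):
    return [index]


def merge(p1, p2):
    return p1 + p2


def translate_example():
    """Translate: if a or b then if c then x = y+1."""
    q = QuadArray()
    a_true = mklist(q.emit("if a goto _"))
    b_true = mklist(q.emit("if b goto _"))
    or_false = mklist(q.emit("goto _"))
    # "a or b" is true: continue with the test of c.
    q.backpatch(merge(a_true, b_true), q.nextquad())
    c_true = mklist(q.emit("if c goto _"))
    c_false = mklist(q.emit("goto _"))
    q.backpatch(c_true, q.nextquad())     # c is true: run the assignment
    q.emit("x = y+1")
    # Both false exits leave the statement.
    q.backpatch(merge(or_false, c_false), q.nextquad())
    return q.listing()


LISTING = translate_example()
```

The jumps are emitted before their targets are known and are filled in only once nextquad reaches the target statement, which is the point of backpatching in a single pass.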

ii. Explain procedure calls with an example. (8)

The procedure is such an important and frequently used programming construct that it is imperative for a compiler to generate good code for procedure calls and returns. (2)

Consider the following grammar for a simple procedure call statement:

S -> call id (Elist)
Elist -> Elist, E
Elist -> E

Calling sequences: (2)

The translation for a call includes a calling sequence, a sequence of actions taken on entry to and exit from each procedure.

Example: (4)

Syntax directed translation:

S -> call id (Elist)
{ for each item p on queue do emit('param' p);
  emit('call' id.place) }

Elist -> Elist, E
{ append E.place to the end of the queue }

Elist -> E
{ initialize queue to contain only E.place }

E.g. for the call p1(a, b):

param a
param b
call p1

14.a. i. Construct the DAG for the following basic block: (6)

d := b*c
e := a+b
b := b*c
a := e-d



ii. Explain in detail about primary structure-preserving transformations on basic blocks. (10)

Structure preserving transformations:

They are implemented by constructing a dag for a basic block. A common subexpression can be detected by noticing, as a new node m is about to be added, whether there is an existing node n with the same children, in the same order, and with the same operator. If so, n computes the same value as m and may be used in its place.

E.g. the DAG for the basic block

d := b*c
e := a+b
b := b*c
a := e-d

has the leaves b0, c0 and a0, a node * (children b0, c0) labelled d, b, a node + (children a0, b0) labelled e, and a root - (children the + node and the * node) labelled a.

For dead-code elimination, delete from a dag any root (node with no ancestors) that has no live variables. Repeated application of this transformation will remove all nodes from the dag that correspond to dead code.
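The node-sharing test described above (same operator, same children, same order) can be sketched with a dictionary keyed on that triple. The statement format (dest, left, op, right) and the function name build_dag are assumptions for this illustration; only binary statements are handled.

```python
def build_dag(block):
    """Build a dag for a basic block of statements (dest, left, op, right).

    Returns (nodes, labels): nodes[i] is ("leaf", name) or (op, left_id,
    right_id); labels maps a node id to the variables attached to it.
    """
    nodes = []
    index = {}       # structural key -> node id, used to find an existing node
    current = {}     # variable name -> node id holding its current value
    labels = {}

    def node_for(name):
        if name not in current:               # first use: create a leaf
            key = ("leaf", name)
            index[key] = len(nodes)
            nodes.append(key)
            current[name] = index[key]
        return current[name]

    for dest, left, op, right in block:
        key = (op, node_for(left), node_for(right))
        if key not in index:                  # no node n with same op and children
            index[key] = len(nodes)
            nodes.append(key)
        node = index[key]
        for attached in labels.values():      # dest no longer names an older node
            if dest in attached:
                attached.remove(dest)
        labels.setdefault(node, []).append(dest)
        current[dest] = node
    return nodes, labels


BLOCK = [
    ("d", "b", "*", "c"),
    ("e", "a", "+", "b"),
    ("b", "b", "*", "c"),
    ("a", "e", "-", "d"),
]
NODES, LABELS = build_dag(BLOCK)
```

For this block the statement b := b*c creates no new node: it only attaches the label b to the node already computed for d.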

(OR)

b. i. Describe in detail about a simple code generator with the appropriate algorithm. (8)

It generates target code for a sequence of three address statements. (2)

Assumptions:

o For each operator in a three address statement, there is a corresponding target language operator.
o Computed results can be left in registers as long as possible.

E.g. a = b+c: (2)

o Add Rj, Ri where Ri has b and Rj has c, and the result is in Ri. Cost = 1
o Add c, Ri where Ri has b, and the result is in Ri. Cost = 2
o Mov c, Rj; Add Rj, Ri. Cost = 3



Register descriptor: keeps track of what is currently in each register.
Address descriptor: keeps track of the location where the current value of a name can be found at run time.

Code generation algorithm: For x = y op z (2)

o Invoke the function getreg to determine the location L where the result of y op z should be stored (register or memory location).
o Check the address descriptor for y to determine y', a current location of y. If the value of y is not already in L, generate the instruction MOV y', L.
o Generate the instruction op z', L where z' is a current location of z.
o If the current values of y and/or z have no next uses and are in registers, alter the register descriptor to indicate that those registers no longer contain y and/or z.

Getreg: (2)

o If y is in a register that holds the value of no other names and y is not live after the statement, return the register of y for L.
o Failing that, return an empty register, if there is one.
o Failing that, if x has a next use in the block, find an occupied register and empty it by storing its value in memory.
o If x is not used in the block, or no suitable register is found, select the memory location of x as L.

ii. Explain in detail about run-time storage management. (8)

Information needed during an execution of a procedure is kept in a block of storage called an activation record; storage for names local to the procedure also appears in the activation record. An activation record for a procedure has fields to hold parameters, results, machine status information, local data, temporaries and the like. Two standard storage-allocation strategies are static allocation and stack allocation.

Static allocation (4)

The position of an activation record in memory is fixed at compile time.

A call statement is implemented by a sequence of two target-machine instructions: a MOV instruction saves the return address, and a GOTO transfers control to the target code for the called procedure.

Stack allocation (4)

A new activation record is pushed onto the stack for each execution of a procedure, and the record is popped when the activation ends.

Static allocation becomes stack allocation by using relative addresses for storage in activation records. The position of the record for an activation of a procedure is not known until run time. In stack allocation, this position is usually stored in a register, so words in the record are accessed as offsets from the value in that register (indexed address mode).

Relative addresses in an activation record can be taken as offsets from any known position in the activation record.

15.a. i. Explain in detail about principal sources of optimization. (10)

Code optimization is needed to make the code run faster or take less space or both.

Function preserving transformations:

o Common sub expression elimination
o Copy propagation
o Dead-code elimination
o Constant folding



Common sub expression elimination: (2)

An expression E is called a common subexpression if E was previously computed and the values of the variables in E have not changed since the previous computation.

Copy propagation: (2)

Assignments of the form f := g are called copy statements, or copies for short. The idea is to use g for f wherever possible after the copy statement.

Dead code elimination: (2)

A variable is live at a point in the program if its value can be used subsequently; otherwise it is dead at that point. Statements that compute values that are never used are dead code and can be removed. Deducing at compile time that the value of an expression is a constant and using the constant instead is called constant folding.

Loop optimization: (4)

o Code motion: moves code outside a loop. It takes an expression that yields the same result independent of the number of times the loop is executed (a loop-invariant computation) and places the expression before the loop.
o Induction variable elimination
o Reduction in strength: replacing an expensive operation by a cheaper one.

ii. Describe in detail about optimization of basic blocks with example. (6)

Code improving transformations:

o Structure-preserving transformations
  o Common sub expression elimination
  o Dead-code elimination
o Algebraic transformations like reduction in strength.

Structure preserving transformations: (3)

They are implemented by constructing a dag for a basic block. A common subexpression can be detected by noticing, as a new node m is about to be added, whether there is an existing node n with the same children, in the same order, and with the same operator. If so, n computes the same value as m and may be used in its place.

E.g. the DAG for the basic block

d := b*c
e := a+b
b := b*c
a := e-d

has the leaves b0, c0 and a0, a node * (children b0, c0) labelled d, b, a node + (children a0, b0) labelled e, and a root - (children the + node and the * node) labelled a.



For dead-code elimination, delete from a dag any root (node with no ancestors) that has no live variables. Repeated application of this transformation will remove all nodes from the dag that correspond to dead code.

Use of algebraic identities: (3)

e.g.
x+0 = 0+x = x
x-0 = x
x*1 = 1*x = x
x/1 = x

Reduction in strength: replace an expensive operator by a cheaper one.

x ** 2 = x * x

Constant folding: evaluate constant expressions at compile time and replace them by their values.

The commutative and associative laws can also be used. E.g.

a = b+c
e = c+d+b

IC:
a = b+c
t = c+d
e = t+b

If t is not needed outside the block, change this to

a = b+c
e = a+d

using both the associativity and commutativity of +.
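The algebraic identities, the strength reduction and constant folding listed above can be sketched as a single rewriting function on one operation. The function name simplify and the representation (operands are either numbers or variable names given as strings) are assumptions; division is deliberately left out of folding to avoid dividing by zero at compile time.

```python
import operator

# Operators that are folded when both operands are constants.
FOLD = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def simplify(op, a, b):
    """Return a simpler operand or (op, a, b) triple for `a op b`."""
    number = (int, float)
    if op in FOLD and isinstance(a, number) and isinstance(b, number):
        return FOLD[op](a, b)                 # constant folding
    if op == "+" and b == 0:
        return a                              # x+0 = x
    if op == "+" and a == 0:
        return b                              # 0+x = x
    if op == "-" and b == 0:
        return a                              # x-0 = x
    if op == "*" and b == 1:
        return a                              # x*1 = x
    if op == "*" and a == 1:
        return b                              # 1*x = x
    if op == "/" and b == 1:
        return a                              # x/1 = x
    if op == "**" and b == 2:
        return ("*", a, a)                    # reduction in strength
    return (op, a, b)                         # nothing applies


EXAMPLES = [
    simplify("+", "x", 0),
    simplify("*", 1, "x"),
    simplify("**", "x", 2),
    simplify("+", 2, 3),
    simplify("-", "x", "y"),
]
```

Applied to each statement of a basic block, such a function would be run before the dag is built, so that simplified operations can still be shared.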

(OR)

b. i. Describe in detail about storage organization. (10)

Subdivision of run time memory: (4)

Run time storage: the block of memory obtained by the compiler from the OS to execute the compiled program. It is subdivided into

o Generated target code
o Data objects
o Stack to keep track of the activations
o Heap to store all other information

Typical layout of run-time memory:

Code
Static data
Stack
Heap

Activation record (frame): (4)

It is used to store the information required by a single procedure call. Its fields are:

Returned value
Actual parameters
Optional control link
Optional access link
Saved machine status
Local data
Temporaries



Temporaries are used to hold values that arise in the evaluation of expressions. Local data is the data that is local to an execution of the procedure. Saved machine status represents the state of the machine just before the procedure is called. The control link (dynamic link) points to the activation record of the calling procedure. The access link refers to non-local data held in other activation records. Actual parameters are the ones passed to the called procedure. The returned value field is used by the called procedure to return a value to the calling procedure.

Compile time layout of local data: (2)

The amount of storage needed for a name is determined by its type. The field for the local data is laid out as the declarations in a procedure are examined at compile time. The storage layout for data objects is strongly influenced by the addressing constraints of the target machine.

ii. Explain in detail various methods of passing parameters. (6)

Call by value

o A formal parameter is treated just like a local name. Its storage is in the activation record of the called procedure.
o The caller evaluates the actual parameters and places their r-values in the storage for the formals.

Call by reference

o If an actual parameter is a name or an expression having an l-value, then that l-value itself is passed.
o However, if it is an expression that has no l-value (e.g. a+b or 2), then the expression is evaluated in a new location and the address of that location is passed.

Copy-restore: hybrid between call-by-value and call-by-reference (copy in, copy out)

o The actual parameters are evaluated, their r-values are passed, and the l-values of the actuals are determined.
o When the called procedure is done, the r-values of the formals are copied back to the l-values of the actuals.

Call by name

o Inline expansion (the procedure is treated like a macro).
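The difference between the first three methods shows up when the called procedure also assigns to the variable that was passed. The sketch below simulates memory with a dictionary, so that a key plays the role of an l-value; the procedure proc, the variable names and the three call_* helpers are hypothetical and exist only for this illustration.

```python
def proc(memory, formal):
    """Callee: assigns to the global a, then increments its formal parameter."""
    memory["a"] = 10
    memory[formal] += 1


def call_by_value(memory, actual):
    memory["x"] = memory[actual]      # r-value copied into the callee's record
    proc(memory, "x")
    del memory["x"]                   # the callee's storage disappears on return


def call_by_reference(memory, actual):
    proc(memory, actual)              # the l-value (here: the key) is passed


def call_copy_restore(memory, actual):
    memory["x"] = memory[actual]      # copy in
    proc(memory, "x")
    memory[actual] = memory.pop("x")  # copy out


def run(call):
    memory = {"a": 5}
    call(memory, "a")                 # pass the global a as the actual parameter
    return memory["a"]


RESULTS = {call.__name__: run(call)
           for call in (call_by_value, call_by_reference, call_copy_restore)}
```

With call by value only the direct assignment to a survives; with call by reference the increment acts on a itself; with copy-restore the copied-back formal overwrites the direct assignment.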
