Date post: | 19-Dec-2015 |
THEORY OF COMPILATION
Lecture 09 – IR (Backpatching)
Eran Yahav
Reference: Dragon 6.2,6.3,6.4,6.6
www.cs.technion.ac.il/~yahave/tocs2011/compilers-lec09.pptx
2
Recap
Lexical analysis: regular expressions identify tokens (“words”)
Syntax analysis: context-free grammars identify the structure of the program (“sentences”)
Contextual (semantic) analysis: type checking, defined via typing judgments; can be encoded via attribute grammars
Syntax directed translation (SDT): attribute grammars
Intermediate representation: many possible IRs; generation of intermediate representation; 3AC
3
Journey inside a compiler
Lexical Analysis → Syntax Analysis → Sem. Analysis → Inter. Rep. → Code Gen.
Source:
float position;
float initial;
float rate;
position = initial + rate * 60
Token stream:
<float> <ID,position> <;> <float> <ID,initial> <;> <float> <ID,rate> <;>
<ID,1> <=> <ID,2> <+> <ID,3> <*> <60>
4
Journey inside a compiler
Lexical Analysis → Syntax Analysis → Sem. Analysis → Inter. Rep. → Code Gen.
Token stream: <ID,1> <=> <ID,2> <+> <ID,3> <*> <60>
Grammar:
S → ID = E
E → ID | E + E | E * E | NUM
AST: = ( <id,1>, + ( <id,2>, * ( <id,3>, 60 ) ) )
Symbol table:
id   symbol     type    data
1    position   float   …
2    initial    float   …
3    rate       float   …
5
Problem 3.8 from [Appel]
A simple left-recursive grammar: E → E + id,  E → id
A simple right-recursive grammar accepting the same language: E → id + E,  E → id
Which has better behavior for shift-reduce parsing?
6
Answer
The stack never has more than three items on it. In general, with LR parsing of left-recursive grammars, an input string of length O(n) requires only O(1) space on the stack.
Grammar (left recursive): E → E + id,  E → id
Input: id+id+id+id+id
Stack, step by step:
id
(reduce) E
E +
E + id
(reduce) E
E +
E + id
(reduce) E
E +
E + id
(reduce) E
E +
E + id
(reduce) E
7
Answer
The stack grows as large as the input string. In general, with LR parsing of right-recursive grammars, an input string of length O(n) requires O(n) space on the stack.
Grammar (right recursive): E → id + E,  E → id
Input: id+id+id+id+id
Stack, step by step:
id
id +
id + id
id + id +
id + id + id
id + id + id +
id + id + id + id
id + id + id + id +
id + id + id + id + id
(reduce) id + id + id + id + E
(reduce) id + id + id + E
(reduce) id + id + E
(reduce) id + E
(reduce) E
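The difference in stack growth can be checked with a small simulation. The following is a minimal sketch, not a general LR parser: the two functions hard-code the shift and reduce decisions for the two grammars above, and the function names are made up for illustration.

```python
def parse_left_recursive(tokens):
    """Shift-reduce parse for E -> E + id | id; returns (final stack, max depth)."""
    stack, deepest = [], 0
    for tok in tokens:
        stack.append(tok)                      # shift
        deepest = max(deepest, len(stack))
        if stack == ["id"]:                    # reduce E -> id
            stack = ["E"]
        elif stack[-3:] == ["E", "+", "id"]:   # reduce E -> E + id
            stack[-3:] = ["E"]
    return stack, deepest


def parse_right_recursive(tokens):
    """Shift-reduce parse for E -> id + E | id; returns (final stack, max depth)."""
    stack, deepest = [], 0
    for tok in tokens:                         # no reduction is possible before the end
        stack.append(tok)                      # shift
        deepest = max(deepest, len(stack))
    if stack and stack[-1] == "id":            # reduce E -> id
        stack[-1] = "E"
    while stack[-3:] == ["id", "+", "E"]:      # reduce E -> id + E
        stack[-3:] = ["E"]
    return stack, deepest


tokens = "id + id + id + id + id".split()
left_result = parse_left_recursive(tokens)
right_result = parse_right_recursive(tokens)
print(left_result)   # (['E'], 3)
print(right_result)  # (['E'], 9)
```

For five identifiers the left-recursive grammar never needs more than 3 stack entries, while the right-recursive one needs all 9 tokens on the stack before the first reduction.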
8
Journey inside a compiler
Lexical Analysis → Syntax Analysis → Sem. Analysis → Inter. Rep. → Code Gen.
AST before semantic analysis: = ( <id,1>, + ( <id,2>, * ( <id,3>, 60 ) ) )
AST after semantic analysis: = ( <id,1>, + ( <id,2>, * ( <id,3>, inttofloat(60) ) ) )
Coercion: automatic conversion from int to float, inserted by the compiler
Symbol table:
id   symbol     type
1    position   float
2    initial    float
3    rate       float
9
Journey inside a compiler
Lexical Analysis → Syntax Analysis → Sem. Analysis → Inter. Rep. → Code Gen.
AST: = ( <id,1>, + ( <id,2>, * ( <id,3>, inttofloat(60) ) ) )
3AC:
t1 = inttofloat(60)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3
production
    semantic rule
S → id = E
    S.code := E.code || gen(id.var ':=' E.var)
E → E1 op E2
    E.var := freshVar(); E.code = E1.code || E2.code || gen(E.var ':=' E1.var 'op' E2.var)
E → inttofloat(num)
    E.var := freshVar(); E.code = gen(E.var ':=' inttofloat(num))
E → id
    E.var := id.var; E.code = ''
Code generated at each node:
inttofloat node: t1 = inttofloat(60)
* node: t2 = id3 * t1
+ node: t3 = id2 + t2
= node: id1 = t3
(for brevity, bubbles show only code generated by the node and not all accumulated “code” attribute)
Note the structure: translate E1, translate E2, handle operator
10
Journey inside a compiler
Lexical Analysis → Syntax Analysis → Sem. Analysis → Inter. Rep. → Code Gen.
3AC:
t1 = inttofloat(60)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3
Optimized:
t1 = id3 * 60.0
id1 = id2 + t1
Value known at compile time: can generate code with converted value
Eliminated temporary t3
11
Journey inside a compiler
Lexical Analysis → Syntax Analysis → Sem. Analysis → Inter. Rep. → Code Gen.
Optimized:
t1 = id3 * 60.0
id1 = id2 + t1
Code Gen:
LDF R2, id3
MULF R2, R2, #60.0
LDF R1, id2
ADDF R1, R1, R2
STF id1, R1
12
You are here
Source text (txt) → Compiler → Executable code (exe)
Compiler: Lexical Analysis → Syntax Analysis (Parsing) → Semantic Analysis → Inter. Rep. (IR) → Code Gen.
13
IR So Far…
Many possible intermediate representations
3-address code (3AC)
    Every instruction operates on at most three addresses
    result = operand1 operator operand2
Gets us closer to code generation
Enables machine-independent optimizations
How do we generate 3AC?
14
Last Time: Creating 3AC
Creating 3AC via syntax directed translation
Attributes
    code – code generated for a nonterminal
    var – name of variable that stores result of nonterminal
freshVar() – helper function that returns the name of a fresh variable
15
Creating 3AC: expressions
production
    semantic rule
S → id := E
    S.code := E.code || gen(id.var ':=' E.var)
E → E1 + E2
    E.var := freshVar(); E.code = E1.code || E2.code || gen(E.var ':=' E1.var '+' E2.var)
E → E1 * E2
    E.var := freshVar(); E.code = E1.code || E2.code || gen(E.var ':=' E1.var '*' E2.var)
E → - E1
    E.var := freshVar(); E.code = E1.code || gen(E.var ':=' 'uminus' E1.var)
E → (E1)
    E.var := E1.var; E.code = '(' || E1.code || ')'
E → id
    E.var := id.var; E.code = ''
(we use || to denote concatenation of intermediate code fragments)
16
Example
AST: assign ( a, + ( * ( b, uminus ( c ) ), * ( b, uminus ( c ) ) ) )
Attributes, bottom-up:
left subtree
    c: E.var = c, E.code = ''
    b: E.var = b, E.code = ''
    uminus: E.var = t1, E.code = 't1 = -c'
    *: E.var = t2, E.code = 't1 = -c t2 = b*t1'
right subtree
    b: E.var = b, E.code = ''
    c: E.var = c, E.code = ''
    uminus: E.var = t3, E.code = 't3 = -c'
    *: E.var = t4, E.code = 't3 = -c t4 = b*t3'
+: E.var = t5, E.code = 't1 = -c t2 = b*t1 t3 = -c t4 = b*t3 t5 = t2+t4'
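The attribute computation in this example can be reproduced with a short recursive translator. This is a minimal sketch under stated assumptions: the tuple encoding of the AST and the names `fresh_var`, `translate` and `translate_assign` are illustrative and not part of the lecture; code fragments are lists of strings, so list concatenation plays the role of `||`.

```python
import itertools

_counter = itertools.count(1)


def fresh_var():
    """Return the name of a fresh temporary: t1, t2, ..."""
    return f"t{next(_counter)}"


def translate(node):
    """Return (var, code) for an expression given as nested tuples."""
    if isinstance(node, str):                 # E -> id
        return node, []
    if node[0] == "uminus":                   # E -> - E1
        v1, c1 = translate(node[1])
        var = fresh_var()
        return var, c1 + [f"{var} = -{v1}"]
    op, left, right = node                    # E -> E1 op E2
    v1, c1 = translate(left)
    v2, c2 = translate(right)
    var = fresh_var()
    return var, c1 + c2 + [f"{var} = {v1}{op}{v2}"]


def translate_assign(target, expr):           # S -> id := E
    var, code = translate(expr)
    return code + [f"{target} = {var}"]


term = ("*", "b", ("uminus", "c"))
code = translate_assign("a", ("+", term, term))
print("\n".join(code))
```

The printed instructions are `t1 = -c`, `t2 = b*t1`, `t3 = -c`, `t4 = b*t3`, `t5 = t2+t4`, `a = t5`, matching the code attribute at the `+` node followed by the final assignment.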
17
Creating 3AC: control statements
3AC only supports conditional/unconditional jumps
Add labels
Attributes
    begin – label marks beginning of code
    after – label marks end of code
Helper function freshLabel() allocates a new fresh label
18
Expressions and assignments
production
    semantic action
S → id := E
    { p := lookup(id.name); if p ≠ null then emit(p ':=' E.var) else error }
E → E1 op E2
    { E.var := freshVar(); emit(E.var ':=' E1.var op E2.var) }
E → - E1
    { E.var := freshVar(); emit(E.var ':=' 'uminus' E1.var) }
E → ( E1 )
    { E.var := E1.var }
E → id
    { p := lookup(id.name); if p ≠ null then E.var := p else error }
19
Boolean Expressions
production
    semantic action
E → E1 op E2
    { E.var := freshVar(); emit(E.var ':=' E1.var op E2.var) }
E → not E1
    { E.var := freshVar(); emit(E.var ':=' 'not' E1.var) }
E → ( E1 )
    { E.var := E1.var }
E → true
    { E.var := freshVar(); emit(E.var ':=' '1') }
E → false
    { E.var := freshVar(); emit(E.var ':=' '0') }
• Represent true as 1, false as 0
• Wasteful representation, creating variables for true/false
20
Boolean expressions via jumps
production
    semantic action
E → id1 relop id2
    { E.var := freshVar();
      emit('if' id1.var relop id2.var 'goto' nextStmt+2);
      emit(E.var ':=' '0');
      emit('goto' nextStmt+1);
      emit(E.var ':=' '1') }
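The four-instruction pattern can be sketched as follows. This is an illustration under assumptions, not the lecture's implementation: the sketch computes both jump targets from the index of the first emitted instruction (first index + 3 and first index + 4), which yields the same targets as the numbered example in these slides, and the starting index 100 is arbitrary. The names `relop`, `next_stmt` and `code` are hypothetical.

```python
code = {}          # instruction index -> instruction text
next_stmt = 100    # index of the next instruction to be emitted (assumed start)
temps = 0


def emit(text):
    """Store one instruction at the current index and advance."""
    global next_stmt
    code[next_stmt] = text
    next_stmt += 1


def fresh_var():
    global temps
    temps += 1
    return f"T{temps}"


def relop(a, op, b):
    """Materialize the value of 'a op b' as 0 or 1 in a fresh temporary."""
    var = fresh_var()
    start = next_stmt                            # index of the 'if' instruction
    emit(f"if {a} {op} {b} goto {start + 3}")    # jump to the ':= 1' instruction
    emit(f"{var} := 0")
    emit(f"goto {start + 4}")                    # skip over the ':= 1' instruction
    emit(f"{var} := 1")
    return var


t1 = relop("a", "<", "b")
for index, text in code.items():
    print(f"{index}: {text}")
```

Running it prints instructions 100 to 103 with `T1` set to 0 or 1 depending on the comparison.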
21
Example
Expression: a < b or c < d and e < f
Parse tree: E → E or E, where the right operand is E → E and E; leaves a < b, c < d, e < f
100: if a < b goto 103
101: T1 := 0
102: goto 104
103: T1 := 1
104: if c < d goto 107
105: T2 := 0
106: goto 108
107: T2 := 1
108: if e < f goto 111
109: T3 := 0
110: goto 112
111: T3 := 1
112: T4 := T2 and T3
113: T5 := T1 or T4
22
Short circuit evaluation
Second argument of a Boolean operator is only evaluated if the first argument does not already determine the outcome
(x and y) is equivalent to if x then y else false;
(x or y) is equivalent to if x then true else y
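Python's own `and` and `or` operators are short-circuiting, so the rule can be observed directly. In this small sketch the helper `check` is a made-up function that records which operands were actually evaluated.

```python
calls = []


def check(name, value):
    """Record that the operand was evaluated, then return its value."""
    calls.append(name)
    return value


result_and = check("x", False) and check("y", True)
calls_and = list(calls)      # only x was evaluated: x already decides 'and'
calls.clear()

result_or = check("x", True) or check("y", False)
calls_or = list(calls)       # only x was evaluated: x already decides 'or'

print(result_and, calls_and)  # False ['x']
print(result_or, calls_or)    # True ['x']
```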
23
Example: a < b or (c < d and e < f)
Naive evaluation:
100: if a < b goto 103
101: T1 := 0
102: goto 104
103: T1 := 1
104: if c < d goto 107
105: T2 := 0
106: goto 108
107: T2 := 1
108: if e < f goto 111
109: T3 := 0
110: goto 112
111: T3 := 1
112: T4 := T2 and T3
113: T5 := T1 or T4
Short circuit evaluation:
100: if a < b goto 105
101: if !(c < d) goto 103
102: if e < f goto 105
103: T := 0
104: goto 106
105: T := 1
106:
24
Control Structures
For every Boolean expression B, we attach two properties
    falseLabel – target label for a jump when condition B evaluates to false
    trueLabel – target label for a jump when condition B evaluates to true
For every statement S we attach a property
    next – the label of the next code to execute after S
Challenge: compute falseLabel and trueLabel during code generation
S → if B then S1 | if B then S1 else S2 | while B do S1
25
Control Structures: next
production
    semantic action
P → S
    S.next = freshLabel(); P.code = S.code || label(S.next)
S → S1 S2
    S1.next = freshLabel(); S2.next = S.next; S.code = S1.code || label(S1.next) || S2.code
The label S.next is symbolic; we will only determine its value after we finish deriving S
26
Control Structures: conditional
production
    semantic action
S → if B then S1
    B.trueLabel = freshLabel(); B.falseLabel = S.next; S1.next = S.next;
    S.code = B.code || gen(B.trueLabel ':') || S1.code
27
Control Structures: conditional
production
    semantic action
S → if B then S1 else S2
    B.trueLabel = freshLabel(); B.falseLabel = freshLabel(); S1.next = S.next; S2.next = S.next;
    S.code = B.code || gen(B.trueLabel ':') || S1.code || gen('goto' S.next) || gen(B.falseLabel ':') || S2.code
Code layout:
               B.code
B.trueLabel:   S1.code
               goto S.next
B.falseLabel:  S2.code
S.next:        …
28
Boolean expressions
production
    semantic action
B → B1 or B2
    B1.trueLabel = B.trueLabel; B1.falseLabel = freshLabel(); B2.trueLabel = B.trueLabel; B2.falseLabel = B.falseLabel;
    B.code = B1.code || gen(B1.falseLabel ':') || B2.code
B → B1 and B2
    B1.trueLabel = freshLabel(); B1.falseLabel = B.falseLabel; B2.trueLabel = B.trueLabel; B2.falseLabel = B.falseLabel;
    B.code = B1.code || gen(B1.trueLabel ':') || B2.code
B → not B1
    B1.trueLabel = B.falseLabel; B1.falseLabel = B.trueLabel; B.code = B1.code;
B → (B1)
    B1.trueLabel = B.trueLabel; B1.falseLabel = B.falseLabel; B.code = B1.code;
B → id1 relop id2
    B.code = gen('if' id1.var relop id2.var 'goto' B.trueLabel) || gen('goto' B.falseLabel);
B → true
    B.code = gen('goto' B.trueLabel);
B → false
    B.code = gen('goto' B.falseLabel);
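The rules above pass trueLabel and falseLabel down the tree as inherited attributes, which maps naturally onto function parameters. The following is a minimal sketch under assumptions: the tuple encoding of expressions, the label names `L1`, `L2`, ..., and the outer labels `Ltrue` and `Lfalse` are all illustrative.

```python
import itertools

_labels = itertools.count(1)


def fresh_label():
    """Return a fresh symbolic label: L1, L2, ..."""
    return f"L{next(_labels)}"


def gen_bool(b, true_label, false_label):
    """Return jumping code for b, given the labels to jump to on true/false."""
    kind = b[0]
    if kind == "or":                      # B -> B1 or B2
        b1_false = fresh_label()
        return (gen_bool(b[1], true_label, b1_false)
                + [f"{b1_false}:"]
                + gen_bool(b[2], true_label, false_label))
    if kind == "and":                     # B -> B1 and B2
        b1_true = fresh_label()
        return (gen_bool(b[1], b1_true, false_label)
                + [f"{b1_true}:"]
                + gen_bool(b[2], true_label, false_label))
    if kind == "not":                     # B -> not B1: swap the labels
        return gen_bool(b[1], false_label, true_label)
    if kind == "true":
        return [f"goto {true_label}"]
    if kind == "false":
        return [f"goto {false_label}"]
    _, a, op, c = b                       # B -> id1 relop id2
    return [f"if {a} {op} {c} goto {true_label}", f"goto {false_label}"]


expr = ("or", ("rel", "a", "<", "b"),
        ("and", ("rel", "c", "<", "d"), ("rel", "e", "<", "f")))
code = gen_bool(expr, "Ltrue", "Lfalse")
print("\n".join(code))
```

Note that the labels stay symbolic here; turning them into instruction addresses is the step that normally needs a second pass.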
29
Boolean expressions
How can we determine the address of B1.falseLabel?
Only possible after we know the code of B1 and all the code preceding B1
production
    semantic action
B → B1 or B2
    B1.trueLabel = B.trueLabel; B1.falseLabel = freshLabel(); B2.trueLabel = B.trueLabel; B2.falseLabel = B.falseLabel;
    B.code = B1.code || gen(B1.falseLabel ':') || B2.code
30
Example
Parse tree: S → if B then S1, with B → B1 and B2; the leaves under B1 and B2 are the constants false and true
S → if B then S1
    B.trueLabel = freshLabel(); B.falseLabel = S.next; S1.next = S.next;
    S.code = B.code || gen(B.trueLabel ':') || S1.code
B → B1 and B2
    B1.trueLabel = freshLabel(); B1.falseLabel = B.falseLabel; B2.trueLabel = B.trueLabel; B2.falseLabel = B.falseLabel;
    B.code = B1.code || gen(B1.trueLabel ':') || B2.code
B → true
    B.code = gen('goto' B.trueLabel)
B → false
    B.code = gen('goto' B.falseLabel)
31
Computing addresses for labels
We used symbolic labels
We need to compute their addresses
We can compute addresses for the labels, but it would require an additional pass on the AST
Can we do it in a single pass?
32
Backpatching
Goal: generate code in a single pass
Generate code as we did before, but manage labels differently
Keep labels symbolic until values are known, and then back-patch them
New synthesized attributes for B
    B.truelist – list of jump instructions that eventually get the label where B goes when B is true
    B.falselist – list of jump instructions that eventually get the label where B goes when B is false
33
Backpatching
Previous approach does not guarantee a single pass
The attribute grammar we had before is not S-attributed (e.g., next), and is not L-attributed.
For every label, maintain a list of instructions that jump to this label
When the address of the label is known, go over the list and update the address of the label
34
Backpatching
makelist(addr) – create a new list containing only addr, the address of an instruction
merge(p1,p2) – concatenate the lists pointed to by p1 and p2, returns a pointer to the new list
backpatch(p,addr) – inserts addr as the target label for each of the instructions in the list pointed to by p
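The three helpers are small enough to sketch directly. This version makes some assumptions that are not in the slides: instructions are stored as strings in a Python list indexed from 0, an unfilled jump ends in the placeholder `_`, and plain Python lists stand in for the "pointers to lists".

```python
code = [
    "if x < 150 goto _",   # instruction 0, target not yet known
    "goto _",              # instruction 1, target not yet known
    "if x > 200 goto _",   # instruction 2, target not yet known
]


def makelist(addr):
    """Create a new list containing only the instruction address addr."""
    return [addr]


def merge(p1, p2):
    """Return a new list holding the addresses of both lists."""
    return p1 + p2


def backpatch(p, addr):
    """Fill in addr as the jump target of every instruction listed in p."""
    for i in p:
        assert code[i].endswith("_"), "instruction was already patched"
        code[i] = code[i][:-1] + str(addr)


truelist = merge(makelist(0), makelist(2))
backpatch(truelist, 7)     # 7 is an arbitrary target chosen for the demo
print(code)
```

After the call, instructions 0 and 2 jump to 7 while instruction 1 still carries its placeholder.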
35
Backpatching Boolean expressions
production
    semantic action
B → B1 or M B2
    backpatch(B1.falseList, M.instr); B.trueList = merge(B1.trueList, B2.trueList); B.falseList = B2.falseList;
B → B1 and M B2
    backpatch(B1.trueList, M.instr); B.trueList = B2.trueList; B.falseList = merge(B1.falseList, B2.falseList);
B → not B1
    B.trueList = B1.falseList; B.falseList = B1.trueList;
B → (B1)
    B.trueList = B1.trueList; B.falseList = B1.falseList;
B → id1 relop id2
    B.trueList = makeList(nextInstr); B.falseList = makeList(nextInstr+1);
    emit('if' id1.var relop id2.var 'goto _') || emit('goto _');
B → true
    B.trueList = makeList(nextInstr); emit('goto _');
B → false
    B.falseList = makeList(nextInstr); emit('goto _');
M → ε
    M.instr = nextInstr;
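These actions can be exercised on the expression x < 150 or x > 200 and x != y. The following is a sketch rather than the lecture's implementation: expressions are nested tuples, the marker M is modelled by reading the next instruction address between the translation of B1 and B2, and numbering starts at an assumed base address of 100.

```python
BASE = 100          # address of the first instruction (assumed)
code = []           # code[i] holds the instruction at address BASE + i


def next_instr():
    return BASE + len(code)


def emit(text):
    code.append(text)


def backpatch(lst, addr):
    """Replace the trailing '_' placeholder of every listed jump by addr."""
    for i in lst:
        code[i - BASE] = code[i - BASE][:-1] + str(addr)


def translate(b):
    """Emit code for b in one pass; return (truelist, falselist)."""
    kind = b[0]
    if kind in ("or", "and"):
        t1, f1 = translate(b[1])
        m = next_instr()                  # marker M: where B2's code starts
        t2, f2 = translate(b[2])
        if kind == "or":
            backpatch(f1, m)              # B1 false: go on to evaluate B2
            return t1 + t2, f2
        backpatch(t1, m)                  # B1 true: go on to evaluate B2
        return t2, f1 + f2
    if kind == "not":
        t1, f1 = translate(b[1])
        return f1, t1
    _, a, op, c = b                       # B -> id1 relop id2
    start = next_instr()
    emit(f"if {a} {op} {c} goto _")
    emit("goto _")
    return [start], [start + 1]


expr = ("or", ("rel", "x", "<", "150"),
        ("and", ("rel", "x", ">", "200"), ("rel", "x", "!=", "y")))
truelist, falselist = translate(expr)
for offset, text in enumerate(code):
    print(f"{BASE + offset}: {text}")
print(truelist, falselist)                # [100, 104] [103, 105]
```

The jumps at 100, 103, 104 and 105 remain unfilled: their targets depend on the statement that contains the expression.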
36
Marker
M → ε { M.instr = nextInstr; }
Use M to obtain the address just before B2 code starts being generated
B → B1 or M B2
37
Example: x < 150 or x > 200 and x != y
Parse tree: B → B or M B, where the right operand is B → B and M B; leaves x < 150, x > 200, x != y
Production applied at leaf x < 150:
B → id1 relop id2
    B.trueList = makeList(nextInstr); B.falseList = makeList(nextInstr+1);
    emit('if' id1.var relop id2.var 'goto _') || emit('goto _');
Code so far:
100: if x < 150 goto _
101: goto _
Attributes:
x < 150: B.t = {100}, B.f = {101}
38
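Example: x < 150 or x > 200 and x != y
Parse tree: B → B or M B, where the right operand is B → B and M B; leaves x < 150, x > 200, x != y
Production applied at the marker after or:
M → ε
    M.instr = nextInstr;
Code so far:
100: if x < 150 goto _
101: goto _
Attributes:
x < 150: B.t = {100}, B.f = {101}
M (after or): M.i = 102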
39
Example: x < 150 or x > 200 and x != y
Parse tree: B → B or M B, where the right operand is B → B and M B; leaves x < 150, x > 200, x != y
Production applied at leaf x > 200:
B → id1 relop id2
    B.trueList = makeList(nextInstr); B.falseList = makeList(nextInstr+1);
    emit('if' id1.var relop id2.var 'goto _') || emit('goto _');
Code so far:
100: if x < 150 goto _
101: goto _
102: if x > 200 goto _
103: goto _
Attributes:
x < 150: B.t = {100}, B.f = {101}
M (after or): M.i = 102
x > 200: B.t = {102}, B.f = {103}
40
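Example: x < 150 or x > 200 and x != y
Parse tree: B → B or M B, where the right operand is B → B and M B; leaves x < 150, x > 200, x != y
Production applied at the marker after and:
M → ε
    M.instr = nextInstr;
Code so far:
100: if x < 150 goto _
101: goto _
102: if x > 200 goto _
103: goto _
Attributes:
x < 150: B.t = {100}, B.f = {101}
M (after or): M.i = 102
x > 200: B.t = {102}, B.f = {103}
M (after and): M.i = 104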
41
Example: x < 150 or x > 200 and x != y
Parse tree: B → B or M B, where the right operand is B → B and M B; leaves x < 150, x > 200, x != y
Production applied at leaf x != y:
B → id1 relop id2
    B.trueList = makeList(nextInstr); B.falseList = makeList(nextInstr+1);
    emit('if' id1.var relop id2.var 'goto _') || emit('goto _');
Code so far:
100: if x < 150 goto _
101: goto _
102: if x > 200 goto _
103: goto _
104: if x != y goto _
105: goto _
Attributes:
x < 150: B.t = {100}, B.f = {101}
M (after or): M.i = 102
x > 200: B.t = {102}, B.f = {103}
M (after and): M.i = 104
x != y: B.t = {104}, B.f = {105}
42
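Example: x < 150 or x > 200 and x != y
Parse tree: B → B or M B, where the right operand is B → B and M B; leaves x < 150, x > 200, x != y
Production applied at the and node:
B → B1 and M B2
    backpatch(B1.trueList, M.instr); B.trueList = B2.trueList; B.falseList = merge(B1.falseList, B2.falseList);
Code so far:
100: if x < 150 goto _
101: goto _
102: if x > 200 goto 104
103: goto _
104: if x != y goto _
105: goto _
Attributes:
x < 150: B.t = {100}, B.f = {101}
M (after or): M.i = 102
x > 200: B.t = {102}, B.f = {103}
M (after and): M.i = 104
x != y: B.t = {104}, B.f = {105}
and node: B.t = {104}, B.f = {103,105}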
43
Example: x < 150 or x > 200 and x != y
Parse tree: B → B or M B, where the right operand is B → B and M B; leaves x < 150, x > 200, x != y
Production applied at the or node:
B → B1 or M B2
    backpatch(B1.falseList, M.instr); B.trueList = merge(B1.trueList, B2.trueList); B.falseList = B2.falseList;
Code so far:
100: if x < 150 goto _
101: goto 102
102: if x > 200 goto 104
103: goto _
104: if x != y goto _
105: goto _
Attributes:
x < 150: B.t = {100}, B.f = {101}
M (after or): M.i = 102
x > 200: B.t = {102}, B.f = {103}
M (after and): M.i = 104
x != y: B.t = {104}, B.f = {105}
and node: B.t = {104}, B.f = {103,105}
or node: B.t = {100,104}, B.f = {103,105}
44
Example
Before backpatching:
100: if x<150 goto _
101: goto _
102: if x>200 goto _
103: goto _
104: if x!=y goto _
105: goto _
After backpatching by the production B → B1 and M B2:
100: if x<150 goto _
101: goto _
102: if x>200 goto 104
103: goto _
104: if x!=y goto _
105: goto _
After backpatching by the production B → B1 or M B2:
100: if x<150 goto _
101: goto 102
102: if x>200 goto 104
103: goto _
104: if x!=y goto _
105: goto _
45
Backpatching for statements
production
    semantic action
S → if (B) M S1
    backpatch(B.trueList, M.instr); S.nextList = merge(B.falseList, S1.nextList);
S → if (B) M1 S1 N else M2 S2
    backpatch(B.trueList, M1.instr); backpatch(B.falseList, M2.instr);
    temp = merge(S1.nextList, N.nextList); S.nextList = merge(temp, S2.nextList);
S → while M1 (B) M2 S1
    backpatch(S1.nextList, M1.instr); backpatch(B.trueList, M2.instr);
    S.nextList = B.falseList; emit('goto' M1.instr);
S → { L }
    S.nextList = L.nextList;
S → A
    S.nextList = null;
M → ε
    M.instr = nextInstr;
N → ε
    N.nextList = makeList(nextInstr); emit('goto _');
L → L1 M S
    backpatch(L1.nextList, M.instr); L.nextList = S.nextList;
L → S
    L.nextList = S.nextList
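The statement rules can be tried out the same way. The sketch below covers only assignment, if without else, and while; the class name `Gen`, the tuple encoding of statements, and the base address 100 are assumptions made for illustration, and Python lists again stand in for the jump lists.

```python
class Gen:
    """One-pass code generator with backpatching (illustrative sketch)."""

    def __init__(self, base=100):
        self.base, self.code = base, []

    def next_instr(self):
        return self.base + len(self.code)

    def emit(self, text):
        self.code.append(text)

    def backpatch(self, lst, addr):
        for i in lst:                     # fill the trailing '_' placeholder
            self.code[i - self.base] = self.code[i - self.base][:-1] + str(addr)

    def bool_expr(self, b):
        """Return (truelist, falselist)."""
        if b[0] in ("or", "and"):
            t1, f1 = self.bool_expr(b[1])
            m = self.next_instr()         # marker M
            t2, f2 = self.bool_expr(b[2])
            if b[0] == "or":
                self.backpatch(f1, m)
                return t1 + t2, f2
            self.backpatch(t1, m)
            return t2, f1 + f2
        _, a, op, c = b                   # B -> id1 relop id2
        start = self.next_instr()
        self.emit(f"if {a} {op} {c} goto _")
        self.emit("goto _")
        return [start], [start + 1]

    def stmt(self, s):
        """Return the nextlist of statement s."""
        if s[0] == "assign":              # S -> A
            self.emit(f"{s[1]} = {s[2]}")
            return []
        if s[0] == "if":                  # S -> if (B) M S1
            t, f = self.bool_expr(s[1])
            m = self.next_instr()
            s1_next = self.stmt(s[2])
            self.backpatch(t, m)
            return f + s1_next
        if s[0] == "while":               # S -> while M1 (B) M2 S1
            m1 = self.next_instr()
            t, f = self.bool_expr(s[1])
            m2 = self.next_instr()
            s1_next = self.stmt(s[2])
            self.backpatch(s1_next, m1)
            self.backpatch(t, m2)
            self.emit(f"goto {m1}")
            return f
        raise ValueError(f"unknown statement kind: {s[0]}")


cond = ("or", ("rel", "x", "<", "150"),
        ("and", ("rel", "x", ">", "200"), ("rel", "x", "!=", "y")))
g = Gen()
if_next = g.stmt(("if", cond, ("assign", "y", "200")))

w = Gen()
while_next = w.stmt(("while", ("rel", "i", "<", "n"), ("assign", "i", "i + 1")))
print(g.code, if_next)
print(w.code, while_next)
```

In both cases the returned nextlist holds the jumps that leave the statement; they stay unfilled until the enclosing production knows what follows.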
46
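Example: if (x < 150 or x > 200 and x != y) y=200;
Parse tree: S → if (B) M S1, with B → B or M B, where the right operand is B → B and M B; leaves x < 150, x > 200, x != y
Production applied at the if node:
S → if (B) M S1
    backpatch(B.trueList, M.instr); S.nextList = merge(B.falseList, S1.nextList);
Code for B:
100: if x < 150 goto _
101: goto 102
102: if x > 200 goto 104
103: goto _
104: if x != y goto _
105: goto _
Attributes:
x < 150: B.t = {100}, B.f = {101}
M (after or): M.i = 102
x > 200: B.t = {102}, B.f = {103}
M (after and): M.i = 104
x != y: B.t = {104}, B.f = {105}
and node: B.t = {104}, B.f = {103,105}
or node: B.t = {100,104}, B.f = {103,105}
M (before S1): M.i = 106
S.nextList = {103,105}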
47
Example
After backpatching by the production B → B1 or M B2:
100: if x<150 goto _
101: goto 102
102: if x>200 goto 104
103: goto _
104: if x!=y goto _
105: goto _
106: y = 200
After backpatching by the production S → if (B) M S1:
100: if x<150 goto 106
101: goto 102
102: if x>200 goto 104
103: goto _
104: if x!=y goto 106
105: goto _
106: y = 200
48
Procedures
we will see handling of procedure calls in much more detail later
n = f(a[i]);
t1 = i * 4
t2 = a[t1]    // could have expanded this as well
param t2
t3 = call f, 1
n = t3
49
Procedures
Type checking
    function type: return type, type of formal parameters
    within an expression, a function is treated like any other operator
Symbol table
    parameter names
D → define T id (F) { S }
F → ε | T id, F
S → return E; | …        (statements)
E → id (A) | …           (expressions)
A → ε | E, A
50
Summary
Pick an intermediate representation
Translate expressions
Use a symbol table to implement declarations
Generate jumping code for boolean expressions
    value of the expression is implicit in the control location
Backpatching
    a technique for generating code for boolean expressions and statements in one pass
    idea: maintain lists of incomplete jumps, where all jumps in a list have the same target. When the target becomes known, all instructions on its list are “filled in”.