
Languages and Compilers (SProg og Oversættere)

Date post: 05-Jan-2016
Page 1: Languages and Compilers (SProg og Oversættere)

1

Languages and Compilers(SProg og Oversættere)

Bent Thomsen

Department of Computer Science

Aalborg University

With acknowledgement to Norm Hutchinson whose slides this lecture is based on.


Where are we (going)?

Dependency diagram of a typical multi-pass compiler:

[Diagram: the Compiler Driver calls the Syntactic Analyzer, the Contextual Analyzer and the Code Generator.]

Phase                 Input           Output
Syntactic Analyzer    Source Text     AST
Contextual Analyzer   AST             Decorated AST
Code Generator        Decorated AST   Object Code

A multi-pass compiler makes several passes over the program. The output of each phase is stored in a data structure and used by subsequent phases.


Code Generation

A compiler translates a program from a high-level language into an equivalent program in a low-level language.

[Diagram: three compile-and-run pipelines]

Source program     Compiled to    Then
Triangle program   TAM program    run to produce the result
Java program       JVM program    run to produce the result
C program          x86 program    run to produce the result

We shall look at this in more detail over the next couple of lectures.


TAM machine architecture

• TAM is a stack machine
  – There are no data registers as in register machines.
  – Temporary data are stored on the stack.
• But there are special registers (Table C.1, page 407)
• TAM instruction set
  – Instruction format (Figure C.5, page 408)
    • op: opcode (4 bits)
    • r: special register number (4 bits)
    • n: size of the operand (8 bits)
    • d: displacement (16 bits)
  – Instruction set: Table C.2, page 409


TAM Registers


TAM Machine code

• Machine code consists of 32-bit instructions in the code store
  – op (4 bits): type of instruction
  – r (4 bits): register
  – n (8 bits): size
  – d (16 bits): displacement
• Example: LOAD (1) 3[LB]
  – op = 0 (0000)
  – r = 8 (1000)
  – n = 1 (00000001)
  – d = 3 (0000000000000011)
  – Encoded: 0000 1000 0000 0001 0000 0000 0000 0011
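The packing in the example can be checked with a few lines of Python. This is only a sketch: the function names `encode` and `decode` are invented for illustration, and the field order (op, r, n, d from most to least significant bit) is read off the example above.

```python
def encode(op: int, r: int, n: int, d: int) -> int:
    """Pack the four TAM instruction fields into one 32-bit word."""
    assert 0 <= op < 16 and 0 <= r < 16 and 0 <= n < 256
    assert -(1 << 15) <= d < (1 << 16)
    # d may be negative (e.g. -1[LB]); keep only its low 16 bits.
    return (op << 28) | (r << 24) | (n << 16) | (d & 0xFFFF)


def decode(word: int) -> tuple:
    """Split a 32-bit word into (op, r, n, d); d is returned unsigned."""
    return (word >> 28) & 0xF, (word >> 24) & 0xF, (word >> 16) & 0xFF, word & 0xFFFF


# LOAD (1) 3[LB]: op = 0, r = 8 (LB), n = 1, d = 3
word = encode(0, 8, 1, 3)
bits = format(word, "032b")
print(" ".join(bits[i:i + 4] for i in range(0, 32, 4)))
```

Printing the word in groups of four bits makes it easy to compare against the pattern on the slide.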


TAM Instruction set


TAM machine architecture

• Two storage areas
  – Code store (32-bit words)
    • Code segment: stores the code of the program to run; pointed to by CB and CT
    • Primitive segment: stores the code for primitive operations; pointed to by PB and PT
  – Data store (16-bit words)
    • Stack
      – Global segment at the base of the stack; pointed to by SB
      – Stack area for the frames of procedure and function calls; pointed to by LB and ST
    • Heap
      – Heap area for the dynamic allocation of variables; pointed to by HB and HT


TAM machine architecture


Global Variable and Assignment Command

• Triangle source code

    ! simple expression and assignment
    let
      var n: Integer
    in
      begin
        n := 5;
        n := n + 1
      end

• TAM assembler code

    0: PUSH 1
    1: LOADL 5
    2: STORE (1) 0[SB]
    3: LOAD (1) 0[SB]
    4: LOADL 1
    5: CALL add
    6: STORE (1) 0[SB]
    7: POP (0) 1
    8: HALT
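To see what these nine instructions do to the stack, here is a toy interpreter in Python. It is a sketch under simplifying assumptions, not the real TAM: only the instructions used above are supported, every operand is one word, all addresses are relative to SB, and the tuple encoding of instructions is invented for this example.

```python
def run(program):
    """Run a tiny subset of TAM; return the stack contents after each instruction."""
    stack, trace, pc = [], [], 0
    while True:
        op, *args = program[pc]
        pc += 1
        if op == "HALT":
            return trace
        if op == "PUSH":                        # PUSH d: reserve d words
            stack.extend([0] * args[0])
        elif op == "LOADL":                     # LOADL d: push the literal d
            stack.append(args[0])
        elif op == "LOAD":                      # LOAD (1) d[SB]: push the word at d
            stack.append(stack[args[1]])
        elif op == "STORE":                     # STORE (1) d[SB]: pop a word into d
            value = stack.pop()
            stack[args[1]] = value
        elif op == "CALL" and args == ["add"]:  # primitive add: pop two, push the sum
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "POP":                       # POP (n) d: keep n words, drop d below them
            n, d = args
            kept = stack[len(stack) - n:]
            del stack[len(stack) - n - d:]
            stack.extend(kept)
        else:
            raise ValueError(f"unsupported instruction: {op} {args}")
        trace.append(list(stack))


program = [
    ("PUSH", 1), ("LOADL", 5), ("STORE", 1, 0), ("LOAD", 1, 0), ("LOADL", 1),
    ("CALL", "add"), ("STORE", 1, 0), ("POP", 0, 1), ("HALT",),
]
for address, snapshot in enumerate(run(program)):
    print(address, snapshot)
```

The snapshot after instruction 6 shows the single global n holding 6; instruction 7 then releases the space for n before the program halts.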


The “Phases” of a Compiler

[Diagram: the phases of a compiler]

Source Program
  → Syntax Analysis (may produce error reports)
  → Abstract Syntax Tree
  → Contextual Analysis (may produce error reports)
  → Decorated Abstract Syntax Tree
  → Code Generation (next lecture)
  → Object Code


Storage Allocation

A compiler translates a program from a high-level language into an equivalent program in a low-level language.

The low-level program must be equivalent to the high-level program.
=> High-level concepts must be modeled in terms of the low-level machine.

This lecture is not about the code generation phase itself, but about the way we represent high-level structures in terms of a typical low-level machine’s memory architecture and machine instructions.

=> We need to know this before we can talk about code generation.


What This Lecture is About

How to model high-level computational structures and data structures in terms of low-level memory and machine instructions.

[Diagram: how to model one side in terms of the other?]

High-level program: procedures, expressions, variables, arrays, records, objects, methods
Low-level language processor: registers, machine instructions, bits and bytes, machine stack


Data Representation

• Data Representation: how to represent values of the source language on the target machine.

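[Diagram: high-level data structures (records, arrays, strings, Integer, Char) must be mapped onto a low-level memory model: a sequence of words at addresses 0, 1, 2, 3, ..., each holding a bit pattern such as 00..10 or 01..00.]

Note: the addressing scheme and the size of “memory units” may vary.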


Data Representation

Important properties of a representation scheme:
• non-confusion: different values of a given type should have different representations.
• uniqueness: each value should always have the same representation.

These properties are very desirable, but in practice they are not always satisfied. Examples:
• confusion: approximated floating-point numbers.
• non-uniqueness: one’s complement representation of integers has both +0 and -0.
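Both failures are easy to demonstrate. The following Python sketch assumes a 16-bit one’s complement word and uses Python’s built-in floats (IEEE double precision) as the approximated floating-point type; the helper name `ones_complement_decode` is invented for this example.

```python
def ones_complement_decode(pattern: int, bits: int = 16) -> int:
    """Interpret `pattern` as a `bits`-wide one's complement integer."""
    sign_bit = 1 << (bits - 1)
    mask = (1 << bits) - 1
    if pattern & sign_bit:
        # Negative numbers are the bitwise complement of their magnitude.
        return -((~pattern) & mask)
    return pattern


# Non-uniqueness: two different bit patterns both represent zero.
plus_zero, minus_zero = 0x0000, 0xFFFF
print(ones_complement_decode(plus_zero), ones_complement_decode(minus_zero))

# Confusion: two different integers get the same floating-point representation.
a, b = 2**53, 2**53 + 1
print(a != b, float(a) == float(b))
```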


Data Representation

Important issues in data representation:

• constant-size representation: The representation of all values of a given type should occupy the same amount of space.

• direct versus indirect representation

[Diagram: direct representation of a value x: x refers to the bit pattern itself. Indirect representation of a value x: x refers to a handle (a pointer) that points to the bit pattern.]


Indirect Representation

Q: What reasons could there be for choosing indirect representations?

A: To make the representation “constant size” even if the representation requires different amounts of memory for different values.

[Diagram: a small x and a big x, each a handle pointing to a bit pattern of a different size. Both are represented by pointers => same size.]


Indirect versus Direct

The choice between indirect and direct representation is a key decision for a language designer/implementer.

• Direct representations are often preferable for efficiency:
  • More efficient access (no need to follow pointers)
  • More efficient “storage class” (e.g. stack rather than heap allocation)
• For types with widely varying size of representation, it is almost a must to use indirect representation (see previous slide).

Languages like Pascal, C, C++ try to use direct representation wherever possible.

Languages like Scheme and ML use indirect representation almost everywhere (because of polymorphic higher-order functions).

Java: primitive types direct, “reference types” indirect, e.g. objects and arrays.


Data Representation

We now survey representation of the data types found in the Triangle programming language, assuming direct representations wherever possible.

We will discuss representation of values of:
• Primitive types
• Record types
• Static array types

We will use the following notations (if T is a type):
#[T]      the cardinality of the type (i.e. the number of possible values)
size[T]   the size of the representation (in number of bits/bytes)


Data Representation: Primitive Types

What is a primitive type?
The primitive types of a programming language are those types that cannot be decomposed into simpler types, for example integer, boolean, char, etc.

Type: boolean
Has two values, true and false
=> #[boolean] = 2
=> size[boolean] ≥ 1 bit

Note: In general, if #[T] = n then size[T] ≥ log2(n) bits.

Possible representations:

Value   1 bit   byte (option 1)   byte (option 2)
false   0       00000000          00000000
true    1       00000001          11111111
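The lower bound on size[T] can be computed directly. A small Python sketch (the function name `min_bits` is invented here) that rounds log2(n) up to a whole number of bits, using integer arithmetic only:

```python
def min_bits(cardinality: int) -> int:
    """Smallest whole number of bits that can distinguish `cardinality` values."""
    if cardinality < 1:
        raise ValueError("a type must have at least one value")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without float rounding.
    return (cardinality - 1).bit_length()


for name, n in [("boolean", 2), ("three-valued type", 3), ("byte", 256), ("16-bit integer", 65536)]:
    print(name, min_bits(n))
```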


Data Representation: Primitive Types

Type: integer
Fixed-size representation, usually chosen based on what is efficiently supported by the target machine. Typically uses one word (16, 32, or 64 bits) of storage.

size[integer] = word (= 16 bits)
=> #[integer] ≤ 2^16 = 65536

Modern processors use two’s complement representation of integers:

1 0 0 0 0 1 0 0 1 0 0 1 0 1 1 1

The leftmost bit is multiplied by -(2^15); every other bit is multiplied by 2^n, where n is the bit position counted from the right, starting at 0.

Value = -1·2^15 + 0·2^14 + … + 0·2^3 + 1·2^2 + 1·2^1 + 1·2^0
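The weighted sum can be evaluated mechanically. The Python sketch below (function name `twos_complement_value` is invented for this example) applies a negative weight to the leftmost bit and positive powers of two to all others, and evaluates the 16-bit pattern shown on the slide.

```python
def twos_complement_value(bits: str) -> int:
    """Value of a two's complement bit string, most significant bit first."""
    width = len(bits)
    value = 0
    for i, bit in enumerate(bits):
        n = width - 1 - i                      # bit position counted from the right
        weight = -(1 << n) if i == 0 else (1 << n)
        value += int(bit) * weight
    return value


pattern = "1000010010010111"                   # the pattern from the slide, spaces removed
print(twos_complement_value(pattern))          # -32768 + 1024 + 128 + 16 + 4 + 2 + 1
```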


Data Representation: Primitive Types

Example: Primitive types in TAM (Triangle Abstract Machine)

Type      Representation         Size
Boolean   00...00 and 00...01    1 word
Char      Unicode                1 word
Integer   Two’s complement       1 word

Example: A (possible) representation of primitive types on a Pentium

Type      Representation         Size
Boolean   00...00 and 11..11     1 byte
Char      ASCII                  1 byte
Integer   Two’s complement       1 word


Data Representation: Composite Types

Composite types are types which are not “atomic”, but which are constructed from more primitive types.

• Records (called structs in C): aggregates of several values of several different types
• Arrays: aggregates of several values of the same type
• (Variant records or disjoint unions)
• (Pointers or references)
• (Objects)
• (Functions)


Data Representation: Records

Example: Triangle Records

    type Date = record
                  y : Integer,
                  m : Integer,
                  d : Integer
                end;
    type Details = record
                     female : Boolean,
                     dob    : Date,
                     status : Char
                   end;
    var today: Date;
    var my: Details


Data Representation: Records

Example: Triangle Record Representation

Each entry below occupies 1 word:

today   today.y      2002
        today.m      2
        today.d      5

my      my.female    false
        my.dob.y     1970
        my.dob.m     5
        my.dob.d     17
        my.status    ‘u’


Data Representation: Records

Records occur in some form or other in most programming languages:
• Ada, Pascal, Triangle (here they are actually called records)
• C, C++ (here they are called structs)

The usual representation of a record type is just the concatenation of the individual representations of each of its component types.

r.I1   value of type T1
r.I2   value of type T2
...
r.In   value of type Tn


Data Representation: Records

Q: How much space does a record take? And how do we access record elements?

Example:
size[Date] = 3*size[integer] = 3 words
address[today.y] = address[today]+0
address[today.m] = address[today]+1
address[today.d] = address[today]+2

address[my.dob.m] = address[my.dob]+1 = address[my]+2

Note: these formulas assume that addresses are indexes of words (not bytes) in memory. With byte addressing and 16-bit words, multiply the offsets by 2.
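The size and offset rules can be written as two short recursive functions. This Python sketch assumes word addressing and 1-word primitives as in TAM; the tables `PRIMITIVE_SIZE` and `RECORDS` and the function names are invented for the example, using the Date and Details types from the earlier slide.

```python
PRIMITIVE_SIZE = {"Integer": 1, "Boolean": 1, "Char": 1}   # sizes in words

# A record type is an ordered list of (field name, field type) pairs.
RECORDS = {
    "Date": [("y", "Integer"), ("m", "Integer"), ("d", "Integer")],
    "Details": [("female", "Boolean"), ("dob", "Date"), ("status", "Char")],
}


def size(type_name: str) -> int:
    """Size of a type in words: the sum of the sizes of its fields."""
    if type_name in PRIMITIVE_SIZE:
        return PRIMITIVE_SIZE[type_name]
    return sum(size(field_type) for _, field_type in RECORDS[type_name])


def offset(type_name: str, path: list) -> int:
    """Offset in words of the field reached by following `path`, e.g. ["dob", "m"]."""
    total = 0
    for field in path:
        for name, field_type in RECORDS[type_name]:
            if name == field:
                type_name = field_type      # descend into the selected field
                break
            total += size(field_type)       # skip over the fields stored before it
        else:
            raise KeyError(field)
    return total


print(size("Date"), size("Details"))
print(offset("Date", ["m"]), offset("Details", ["dob", "m"]))
```

The last line reproduces the two offsets derived above: today.m is 1 word past today, and my.dob.m is 2 words past my.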


Arrays

An array is a composite data type: an array value consists of multiple values of the same type. Arrays are in some sense like records, except that their elements all have the same type.

The elements of arrays are typically indexed using an integer value (in some languages, such as Pascal, other “ordinal” types can also be used for indexing arrays).

Two kinds of arrays (with different run-time representation schemes):
• static arrays: their size (number of elements) is known at compile time.
• dynamic arrays: their size cannot be known at compile time because the number of elements may vary at run time.

Q: Which are the “cheapest” arrays? Why?


Static Arrays

Example:

    type Name = array 6 of Char;
    var me: Name;
    var names: array 2 of Name

Representation:

me:         ‘K’ ‘r’ ‘i’ ‘s’ ‘ ’ ‘ ’    (me[0] .. me[5])
names[0]:   ‘J’ ‘o’ ‘h’ ‘n’ ‘ ’ ‘ ’    (names[0][0] .. names[0][5], one Name)
names[1]:   ‘S’ ‘o’ ‘p’ ‘h’ ‘i’ ‘a’    (names[1][0] .. names[1][5], one Name)


Static Arrays

Example:

    type Coding = record
                    c : Char,
                    n : Integer
                  end

    var code: array 3 of Coding

Representation:

code[0] (one Coding):   code[0].c ‘K’   code[0].n 5
code[1] (one Coding):   code[1].c ‘i’   code[1].n 22
code[2] (one Coding):   code[2].c ‘d’   code[2].n 4


Static Arrays

    type T = array n of TE;
    var a : T;

The elements a[0], a[1], a[2], ..., a[n-1] are stored consecutively.

size[T] = n * size[TE]

address[a[0]] = address[a]
address[a[1]] = address[a] + size[TE]
address[a[2]] = address[a] + 2*size[TE]
…
address[a[i]] = address[a] + i*size[TE]
…
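The indexing formula composes naturally for arrays of arrays. A minimal Python sketch, assuming word addressing and 1-word Chars; the function name `element_address` and the base address 0 are assumptions made for the example.

```python
def element_address(base: int, index: int, element_size: int, length: int) -> int:
    """address[a[i]] = address[a] + i * size[TE], with a bounds check."""
    if not 0 <= index < length:
        raise IndexError(f"index {index} out of range for array of {length}")
    return base + index * element_size


# var names: array 2 of Name, where Name = array 6 of Char (1 word per Char)
NAMES_BASE = 0                                   # hypothetical address of names
row = element_address(NAMES_BASE, 1, 6, 2)       # address of names[1]
cell = element_address(row, 3, 1, 6)             # address of names[1][3]
print(row, cell)
```

Note that the bounds check is extra: the address formula itself does not need the array length, only the element size.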


Static Storage Allocation

Example: Global variables in Triangle

    let
      type Date = record y: Integer, m: Integer, d: Integer end;
      var a: array 3 of Integer;
      var b: Boolean;
      var c: Char;
      var t: Date;
    in ...

Global variables exist as long as the program is running.

The compiler can:
• compute exactly how much memory is needed for globals.
• allocate memory at a fixed position for each global variable.


Static Storage Allocation

Example: Global variables in Triangle

    let
      type Date = record y: Integer, m: Integer, d: Integer end;
      var a: array 3 of Integer;
      var b: Boolean;
      var c: Char;
      var t: Date;

address[a] = 0
address[b] = 3
address[c] = 4
address[t] = 5

Address   Contents
0         a[0]
1         a[1]
2         a[2]
3         b
4         c
5         t.y
6         t.m
7         t.d
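Static allocation amounts to a running sum of sizes. The Python sketch below assigns each global the next free word address; the function name `allocate_globals` is invented for this example, and the sizes are given in words as computed on the earlier slides.

```python
def allocate_globals(declarations):
    """Assign consecutive word addresses (relative to SB) to global variables.

    `declarations` is a list of (name, size in words) in declaration order.
    Returns the address map and the total number of words needed.
    """
    addresses = {}
    next_free = 0
    for name, size_in_words in declarations:
        addresses[name] = next_free
        next_free += size_in_words
    return addresses, next_free


# a: array 3 of Integer, b: Boolean, c: Char, t: Date (3 Integers)
addresses, total = allocate_globals([("a", 3), ("b", 1), ("c", 1), ("t", 3)])
print(addresses, total)
```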


Stack Storage Allocation

Example: When do the variables in this program “exist”?

    let
      var a: array 3 of Integer;
      var b: Boolean;
      var c: Char;
      proc Y() ~
        let
          var d: Integer;
          var e: ...
        in ... ;
      proc Z() ~
        let
          var f: Integer;
        in begin ...; Y(); ... end
    in begin ...; Y(); ...; Z(); end

• a, b, c: as long as the program is running
• d, e: when procedure Y is active
• f: when procedure Z is active

Now we will look at allocation of local variables


Stack Storage Allocation

A “picture” of our program running:

[Diagram: call depth plotted against time, from start of program to end of program. The global code runs at depth 0; it calls Y (depth 1), then Z (depth 1), which in turn calls Y (depth 2).]

1) Procedure activation behaves like a stack (LIFO).
2) The local variables “live” as long as the procedure they are declared in is active.
1+2 => Allocation of locals on the “call stack” is a good model.


Stack Storage Allocation: Accessing locals/globals

For now, we assume that in a procedure only the local variables declared in that procedure and the global variables are accessible. We will extend this later to include nested scopes and parameters.

A stack allocation model (under the above assumption):
• Globals are allocated at the base of the stack.
• The stack contains “frames”.
  • Each frame corresponds to a currently active procedure. (It is often called an “activation frame”.)
  • When a procedure is called (activated), a frame is pushed on the stack.
  • When a procedure returns, its frame is popped from the stack.


Stack Storage Allocation: Accessing locals/globals

SB = Stack base
LB = Locals base
ST = Stack top

[Diagram: the stack, from base to top]

SB →  globals
      call frame
LB →  call frame (its dynamic link points to the frame below it)
ST →


What’s in a Frame?

A frame contains:
• A dynamic link: to the next frame on the stack (the frame of the caller)
• A return address
• Local variables for the current activation

Frame layout:

LB →  dynamic link      } link data
      return address    }
      locals            } local data
ST →


What Happens when Procedure is called?

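SB = Stack base
LB = Locals base
ST = Stack top

When procedure f() is called:
• Push a new f() call frame on top of the stack.
• Make the dynamic link in the new frame point to the old LB.
• Update LB (it becomes the old ST).

[Diagram: the stack holds two call frames, with a new call frame for f() pushed on top; LB points to the base of the new frame and ST to its top.]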


What Happens when Procedure returns?

When procedure f() returns:
• Update LB (from the dynamic link).
• Update ST (to the old LB).

[Diagram: the current call frame for f() on top of the caller’s call frame, before and after the return.]

Note: updating ST implicitly “destroys” or “pops” the frame.
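The call and return steps can be sketched in Python. This is a simplified model rather than TAM itself: the class name `CallStack` and its methods are invented for illustration, and the link data is assumed to be just two words (dynamic link, then return address), with no arguments or results.

```python
class CallStack:
    """Toy data store: globals at the base, call frames on top."""

    def __init__(self, n_globals: int):
        self.store = [0] * n_globals   # SB is address 0
        self.lb = 0                    # locals base of the active frame

    @property
    def st(self) -> int:
        """Stack top: the first free address."""
        return len(self.store)

    def call(self, return_address: int, n_locals: int) -> None:
        old_lb = self.lb
        self.lb = self.st                         # LB becomes the old ST
        self.store += [old_lb, return_address]    # link data; dynamic link = old LB
        self.store += [0] * n_locals              # space for the locals

    def ret(self) -> int:
        dynamic_link = self.store[self.lb]
        return_address = self.store[self.lb + 1]
        del self.store[self.lb:]                  # ST becomes the old LB: frame popped
        self.lb = dynamic_link                    # LB restored from the dynamic link
        return return_address


stack = CallStack(n_globals=5)
stack.call(return_address=10, n_locals=2)         # e.g. the main program calls Z
stack.call(return_address=20, n_locals=1)         # Z calls Y
print(stack.lb, stack.st)
print(stack.ret(), stack.lb, stack.st)
print(stack.ret(), stack.lb, stack.st)
```

After both returns the store is back to the five globals, which is the LIFO behaviour the slides describe.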


Accessing global/local variables

Q: Is the stack frame for a procedure always on the same position on the stack?

A: No, look at picture of procedure activation below. Imagine what the stack looks like at each point in time.

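[Diagram: call depth against time (the global code calls Y, then Z, which calls Y), with the stack contents at two points: G, Y while Y is called from the global code, and G, Z, Y while Y is called from Z.]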


Accessing global/local variables

How do we access global and local variables on the stack?

The global frame is always at the same place in the stack.
=> Address global variables relative to SB.
A typical instruction to access a global variable: LOAD 4[SB]

Frames are not always at the same position in the stack; this depends on the number of frames already on the stack.
=> Address local variables relative to LB.
A typical instruction to access a local variable: LOAD 3[LB]

RECAP: We are still working under the assumption of a “flat” block structure.


Accessing global/local variables

Example: Compute the addresses of the variables in this program

    let
      var a: array 3 of Integer;
      var b: Boolean;
      var c: Char;
      proc Y() ~
        let
          var d: Integer;
          var e: ...
        in ... ;
      proc Z() ~
        let
          var f: Integer;
        in begin ...; Y(); ... end
    in begin ...; Y(); ...; Z(); end

Var   Size   Address
a     3      0[SB]
b     1      3[SB]
c     1      4[SB]
d     1      2[LB]
e     ?      3[LB]
f     1      2[LB]

Locals start at offset 2 from LB because the link data (dynamic link and return address) occupies the first two words of each frame.


Accessing non-local variables

RECAP: We have discussed
• stack allocation of locals in the call frames of procedures
• some other things stored in frames:
  • A dynamic link: points to the previous frame on the stack => corresponds to the “caller” of the current procedure.
  • A return address: points to the next instruction of the caller.
• addressing global variables relative to SB (stack base)
• addressing local variables relative to LB (locals base)

Now we will look at accessing non-local variables. In other words, we will look into the question:

How does lexical scoping work?


Accessing non-local variables: what is this about?

Example: How to access p1,p2 from within Q or S?

    let
      var g1: array 3 of Integer;
      var g2: Boolean;
      proc P() ~
        let
          var p1,p2
          proc Q() ~
            let var q: ...
            in ... ;
          proc S() ~
            let var s: Integer;
            in ...
        in ...
    in ...

Scope structure:

var g1, g2
proc P()
    var p1, p2
    proc Q()
        var q
    proc S()
        var s

Q: When inside Q, does the dynamic link always point to the frame of P?


Accessing non-local variables

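Q: When inside Q, does the dynamic link always point to a frame of P?
A: No! Consider the following scenarios:

[Two diagrams of call depth against time for the scope structure of P containing Q and S. In one scenario the call chain is G, P, S, Q: Q is called from S, so its dynamic link points to the frame of S. The other scenario shows deeper call chains through P, Q and S.]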


Accessing non-local variables

We cannot rely on the dynamic link to get to the lexically enclosing frame(s) of a procedure.

=> Another item is added to the link data: the static link.

The static link in a frame points to the frame of the lexically enclosing scope, which was pushed somewhere earlier on the stack.

Registers L1, L2, etc. are used to point to the lexically enclosing frames (L1 is the most local). Typical instructions for accessing a non-local variable look like:
LOAD 4[L1]
LOAD 3[L2]

These, together with LB and SB, are called the display registers.


What’s in a Frame (revised)?

A frame contains:
• A static link: to the frame of the lexically enclosing scope
• A dynamic link: to the next frame on the stack (the frame of the caller)
• A return address
• Local variables for the current activation

Frame layout:

LB →  static link       }
      dynamic link      } link data
      return address    }
      locals            } local data
ST →


Accessing non-local variables

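    proc P()
      proc Q() ...
      proc S() ...

[Diagram: call chain over time: G calls P, P calls S, S calls Q. The stack holds, from the base: globals (SB), P() frame, S() frame, Q() frame (LB at its base, ST at its top). The dynamic link of Q’s frame points to S’s frame; its static link points to P’s frame, which is the frame L1 refers to.]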


Accessing variables, addressing schemas overview

We now have a complete picture of the different kinds of addresses that are used for accessing variables stored on the stack.

Type of variable          Load instruction
Global                    LOAD offset[SB]
Local                     LOAD offset[LB]
Non-local, 1 level up     LOAD offset[L1]
Non-local, 2 levels up    LOAD offset[L2]
...


Routines

We call the assembly language equivalent of procedures “routines”.

In the preceding material we already learned some things about the implementation of routines in terms of the stack allocation model:

• Addressing locals and globals through LB, L1, L2, … and SB

• Link data: static link, dynamic link, return address.

We have yet to learn how the static link and the L1, L2, etc. registers are set up.

We have yet to learn how routines can receive arguments and return results from/to their caller.


Routines

What are routines? Unlike procedures/functions in higher-level languages, they are not directly supported by language constructs. Instead, they are modeled in terms of how to use the low-level machine to “emulate” procedures.

What behavior needs to be “emulated”?
• Calling a routine and returning to the caller after completion
• Passing arguments to a called routine
• Returning a result from a routine
• Local and non-local variables


Routines

• Transferring control to and from a routine:
  Most low-level processors have CALL and RETURN instructions for transferring control from caller to callee and back.

• Transmitting arguments and return values:
  Caller and callee must agree on a method to transfer arguments and return values.
  => This is called the “routine protocol”.

There are many possible ways to pass arguments and return values.
=> A routine protocol is like a “contract” between the caller and the callee.

Note: the routine protocol is often dictated by the operating system.


Routine Protocol Examples

The routine protocol depends on the machine architecture (e.g. stack machine versus register machine).

Example 1: A possible routine protocol for a register machine (RM)
- Passing of arguments: first argument in R1, second argument in R2, etc.
- Passing of return value: return the result (if any) in R0.

Note: this example is simplistic:
- What if there are more arguments than registers?
- What if the representation of an argument is larger than can be stored in a register?

For RMs, the protocol usually also specifies who (caller or callee) is responsible for saving the contents of registers.


Routine Protocol Examples

Example 2: A possible routine protocol for a stack machine

- Passing of arguments: pass arguments on the top of the stack.
- Passing of return value: leave the return value on the stack top, in place of the arguments.

Note: this protocol puts no bound on the number of arguments or on the size of the arguments.

Most microprocessors have registers as well as a stack. Such “mixed” machines also often use a protocol like this one.

The “Triangle Abstract Machine” also adopts this routine protocol. We now look at it in detail (in TAM).


TAM: Routine Protocol

Just before the call (stack from base to top):

SB →  globals
LB →  ...
      args
ST →

Just after the call:

SB →  globals
LB →  ...
      result
ST →

What happens in between?


TAM: Routine Protocol

(1) just before the call:

LB →  ...
      args
ST →

(2) just after entry:

      args
LB →  link data
ST →

Note: going from (1) to (2) in TAM is the execution of a single CALL instruction.


TAM: Routine Protocol

(2) just after entry:

      args
LB →  link data
ST →

(3.1) during execution of the routine:

      args
LB →  link data
      local data (shrinks and grows during execution)
ST →


TAM: Routine Protocol

(3.2) just before return:

      args
LB →  link data
      local data
      result
ST →

(4) just after return:

LB →  ...
      result
ST →

Note: going from (3.2) to (4) in TAM is the execution of a single RETURN instruction.


TAM: Routine Protocol, Example

Example Triangle program:

    let
      var g: Integer;
      func F(m: Integer, n: Integer) : Integer ~ m*n ;
      proc W(i: Integer) ~
        let const s ~ i*i
        in begin
          putint(F(i,s));
          putint(F(s,s))
        end
    in begin
      getint(var g);
      W(g+1)
    end


TAM: Routine Protocol, Example

    let var g: Integer; ...
    in begin getint(var g); W(g+1) end

TAM assembly code:

    PUSH 1          -- expand globals, make place for g
    LOADA 0[SB]     -- push address of g
    CALL getint     -- read integer into g
    LOAD 0[SB]      -- push value of g
    CALL succ       -- add 1
    CALL(SB) W      -- call W (using SB as static link)
    POP 1           -- contract globals
    HALT


TAM: Routine Protocol, Example

    func F(m: Integer, n: Integer) : Integer ~ m*n ;

    F: LOAD -2[LB]    -- push value of argument m
       LOAD -1[LB]    -- push value of argument n
       CALL mult      -- multiply m and n
       RETURN(1) 2    -- return, replacing the 2-word argument pair by the 1-word result

• Arguments are addressed relative to LB (negative offsets!).
• RETURN needs the size of the return value and of the argument space in order to update the stack on return from the call.


TAM: Routine Protocol, Example

    proc W(i: Integer) ~
      let const s ~ i*i
      in ... F(i,s) ...

    W: LOAD -1[LB]    -- push value of argument i
       LOAD -1[LB]    -- push value of argument i
       CALL mult      -- multiply: result is value of s
       LOAD -1[LB]    -- push value of argument i
       LOAD 3[LB]     -- push value of local var s
       CALL(SB) F     -- call F (use SB as static link)
       …
       RETURN(0) 1    -- return, replacing the 1-word argument by a 0-word result


TAM: Routine Protocol, Example

    let var g: Integer; ...
    in begin getint(var g); W(g+1) end

After reading g (the value read is 3):

SB →  g        3
ST →

Just before the call to W:

SB →  g        3
      arg #1   4
ST →


TAM: Routine Protocol, Example

    proc W(i: Integer) ~
      let const s ~ i*i
      in ... F(i,s) ...

Just after entering W:

SB →  g          3
      arg #1     4
LB →  link data  (static link, dynamic link, ...)
ST →

Just after computing s:

SB →  g          3
      arg i      4
LB →  link data
      s          16
ST →

Just before calling F:

SB →  g          3
      arg i      4
LB →  link data
      s          16
      arg #1     4
      arg #2     16
ST →


TAM: Routine Protocol, Example

    func F(m: Integer, n: Integer) : Integer ~ m*n ;

Just before calling F:

SB →  g          3
      arg i      4
LB →  link data
      s          16
      arg #1     4
      arg #2     16
ST →

Just after entering F:

SB →  g          3
      arg i      4
      link data
      s          16
      arg m      4
      arg n      16
LB →  link data
ST →

Just before return from F:

SB →  g          3
      arg i      4
      link data
      s          16
      arg m      4
      arg n      16
LB →  link data
      result     64
ST →


TAM: Routine Protocol, Example

    func F(m: Integer, n: Integer) : Integer ~ m*n ;

Just before return from F:

SB →  g          3
      arg i      4
      link data
      s          16
      arg m      4
      arg n      16
LB →  link data
      result     64
ST →

After return from F:

SB →  g          3
      arg i      4
LB →  link data
      s          16
      result     64
ST →
      …


TAM Routine Protocol: Frame Layout Summary

      arguments           Arguments for the current procedure; they were put here by the caller.
LB →  static link      }
      dynamic link     }  Link data
      return address   }
      local variables     Local data; grows and shrinks during execution.
      and intermediate
      results
ST →
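The whole protocol can be replayed in a small Python model. This is a sketch under stated assumptions, not the TAM interpreter: the class `Machine` and its method names are invented, the link data is assumed to be three words in the order static link, dynamic link, return address, the static link is simply the number 0 (standing for SB), and the return addresses 100 and 200 are arbitrary. It replays W(4) calling F(i, s) from the worked example.

```python
class Machine:
    def __init__(self):
        self.store = []        # data store; SB is address 0
        self.lb = 0            # locals base

    @property
    def st(self):
        return len(self.store) # stack top: first free address

    def push(self, *words):
        self.store.extend(words)

    def load(self, d):
        """LOAD d[LB]: push the word at offset d from LB (d < 0 reaches arguments)."""
        self.store.append(self.store[self.lb + d])

    def mult(self):
        """Primitive mult: replace the top two words by their product."""
        right, left = self.store.pop(), self.store.pop()
        self.store.append(left * right)

    def call(self, static_link, return_address):
        """CALL: push the link data and make LB point at the new frame."""
        frame_base = self.st
        self.push(static_link, self.lb, return_address)
        self.lb = frame_base

    def ret(self, n, d):
        """RETURN(n) d: replace the frame and d argument words by the n-word result."""
        result = self.store[self.st - n:]
        dynamic_link = self.store[self.lb + 1]
        return_address = self.store[self.lb + 2]
        del self.store[self.lb - d:]
        self.lb = dynamic_link
        self.push(*result)
        return return_address


m = Machine()
m.push(3)                                   # global g = 3
m.push(4)                                   # argument of W: g + 1
m.call(static_link=0, return_address=100)   # CALL(SB) W
m.load(-1); m.load(-1); m.mult()            # s = i * i, stored at 3[LB]
m.load(-1); m.load(3)                       # push arguments i and s
m.call(static_link=0, return_address=200)   # CALL(SB) F
m.load(-2); m.load(-1); m.mult()            # m * n
m.ret(1, 2)                                 # RETURN(1) 2
print(m.store, m.lb)
```

The final store holds g, the argument i, W's three link words, s and the result 64, with LB back at W's frame.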


Accessing variables, addressing schemas overview(Revised)

We now have a complete picture of the different kinds of addresses that are used for accessing variables and formal parameters stored on the stack.

Type of variable          Load instruction
Global                    LOAD +offset[SB]
Local                     LOAD +offset[LB]
Parameter                 LOAD -offset[LB]
Non-local, 1 level up     LOAD +offset[L1]
Non-local, 2 levels up    LOAD +offset[L2]
...


Static Links

We have already seen somewhat how the static link in the callframes is set in the previous: It is initialized from a parameter of the CALL instruction.

But… how can the compiler determine that parameter at compile time?

To understand this we need to know a few things about the Triangle programming language.

• Statically scoped, block-structured language.
• The static link points to the scope where the function was defined.
• If there are no higher-order funcs or procs, the compiler always knows which func/proc is being called. That func/proc must be defined somewhere within the active scope.
=> The static link can always be found in one of the display registers.

Page 71: Languages and Compilers (SProg og Oversættere)

Static Links

The static link can always be found in one of the display registers.

let proc P()
      let proc Q()
            let proc Q1 Q1body
                proc Q2 Q2body
            in Qbody
          proc S()
            let proc S1 S1body
                proc S2 S2body
            in Sbody
      in Pbody
in ...Main Program...

Call from    Call to    Static link?
Main         P          SB
P            Q          LB
P            S          LB
S            P          SB
S            S1         LB
S            S2         LB
S            S          L1
S            Q          L1
S2           S1         L1
S2           S          L2
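
For the nested procedures P, Q, S on this slide, the register that supplies the static link follows from the nesting alone. The Python sketch below is an illustration under assumptions: the `ENCLOSING` table, the function names, and the use of `None` for the main program are hypothetical, and the code assumes the call is already legal under the scope rules (it does not check visibility).

```python
# Which routine each routine is declared in; None means the global level.
ENCLOSING = {
    "P": None,
    "Q": "P", "Q1": "Q", "Q2": "Q",
    "S": "P", "S1": "S", "S2": "S",
}


def body_level(routine):
    """Nesting level of the code inside a routine's body; the main program is level 0."""
    level = 0
    while routine is not None:
        level += 1
        routine = ENCLOSING[routine]
    return level


def static_link_register(caller, callee):
    """Register holding the frame of the routine that encloses the callee's declaration."""
    caller_level = body_level(caller)
    decl_level = body_level(ENCLOSING[callee])
    if decl_level == 0:
        return "SB"                  # callee is declared at the global level
    diff = caller_level - decl_level
    if diff < 0:
        raise ValueError("callee is declared deeper than the caller can see")
    return "LB" if diff == 0 else f"L{diff}"


for caller, callee in [(None, "P"), ("P", "Q"), ("P", "S"), ("S", "P"), ("S", "S1"),
                       ("S", "S2"), ("S", "S"), ("S", "Q"), ("S2", "S1"), ("S2", "S")]:
    print(caller or "Main", "->", callee, ":", static_link_register(caller, callee))
```

A call from S to its sibling Q, or to S itself, uses L1 because both are declared in P, one level above the body of S.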

Page 72: Languages and Compilers (SProg og Oversættere)

Arguments: by value or by reference

Some programming languages allow two kinds of function/procedure parameters.

Example: in Triangle (similar in Pascal)

let proc S(var n: Integer, i: Integer) ~ n := n+i;
    var b: record y: Integer, m: Integer, d: Integer end
in begin
  b := {y ~ 2002, m ~ 2, d ~ 22};
  S(var b.m, 6)
end

In S, n is a var (reference) parameter and i is a constant (value) parameter.

Page 73: Languages and Compilers (SProg og Oversættere)

Arguments: by value or by reference

Value parameters:
At the call site the argument is an expression; evaluating that expression leaves a value on the stack. That value is passed to the procedure/function.
A typical instruction for putting a value parameter on the stack:
  LOADL 6

Var parameters:
Instead of passing a value on the stack, the address of a memory location is pushed. This implies a restriction that only “variable-like” things can be passed to a var parameter. In Triangle there is an explicit keyword var at the call site, to signal passing a var parameter. In Pascal and C++ the reference is created implicitly (but the same restrictions apply).
Typical instructions:
  LOADA 5[LB]
  LOADA 10[SB]
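
The difference between the two parameter kinds can be sketched with a toy memory. This Python fragment is an illustration only: the list `memory`, the address constant `ADDR_B_M`, the record layout, and the function `proc_S` are assumptions, and for brevity the callee pops its own arguments, which is a simplification of a real calling protocol.

```python
# Record b occupies three consecutive words: y, m, d (layout assumed).
memory = [2002, 2, 22]
ADDR_B_M = 1  # address of field b.m in this toy memory


def proc_S(stack):
    """Body of S(var n, i): n := n + i."""
    i = stack.pop()        # value parameter: the value itself was pushed
    addr_n = stack.pop()   # var parameter: an address was pushed
    memory[addr_n] = memory[addr_n] + i  # load and store indirectly through the address


stack = []
stack.append(ADDR_B_M)  # plays the role of a LOADA: push the address of b.m
stack.append(6)         # plays the role of LOADL 6: push the literal value
proc_S(stack)

print(memory)  # [2002, 8, 22]
```

Because S received the address of b.m, the update is visible to the caller; the literal 6 could not be passed for n, since it has no address.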

Page 74: Languages and Compilers (SProg og Oversættere)

Recursion

How are recursive functions and procedures supported on a low-level machine?
=> Surprise! The stack memory allocation model already works!

Example:

let func fac(n: Integer) : Integer ~
      if n = 1 then 1 else n*fac(n-1)
in begin
  putint(fac(6))
end

Why does it work? Because every activation of a function gets its own activation record on the stack, with its own parameters, locals, etc.
=> Procedures and functions are “reentrant”. Older languages (e.g. FORTRAN) that use static allocation for locals have problems with recursion.
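
The contrast between static and per-activation storage can be made concrete. The sketch below is a Python illustration, not a model of any particular FORTRAN implementation: `static_n`, `fac_static` and `fac_stack` are hypothetical names, and the single global variable stands in for a statically allocated parameter.

```python
static_n = None  # one fixed storage location shared by every activation


def fac_static(n):
    """fac with its parameter in static storage: the recursive call clobbers it."""
    global static_n
    static_n = n
    if static_n == 1:
        return 1
    sub_result = fac_static(static_n - 1)  # overwrites static_n on the way down
    return static_n * sub_result           # static_n is 1 by now, not the original n


def fac_stack(n):
    """fac with its parameter in the activation record: each call has its own n."""
    if n == 1:
        return 1
    return n * fac_stack(n - 1)


print(fac_stack(6))   # 720
print(fac_static(6))  # 1
```

With static allocation every level multiplies by the value left behind by the deepest call, so the result collapses to 1; with one record per activation each level still sees its own n.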

Page 75: Languages and Compilers (SProg og Oversættere)

Recursion: General Idea

Why the stack allocation model works for recursion:
Like other function/procedure calls, lifetimes of local variables and parameters for recursive calls behave like a stack.

[Figure: successive stack snapshots for a call of fac(4). The activations fac(4), fac(3), fac(2) and fac(1) are pushed one on top of the other as the recursion deepens, and popped again in reverse order as each call returns.]

Page 76: Languages and Compilers (SProg og Oversættere)

Recursion: In Detail

let func fac(n: Integer) : Integer ~
      if n = 1 then 1 else n*fac(n-1)
in begin
  putint(fac(6))
end

[Figure: three stack snapshots. "Before call to fac": arg 1 = 6 has been pushed on the stack. "Right after entering fac": arg n = 6 is followed by fac's link data, and LB points to the new frame. "Right before recursive call to fac": above the link data lie the value of n (6) and arg 1, the value of n-1 (5).]

Page 77: Languages and Compilers (SProg og Oversættere)

Recursion

[Figure: three stack snapshots of the deepening recursion. "Right before recursive call to fac": one frame for fac, holding the value of n (6) and the argument 5. "Right before next recursive call to fac": a second frame, holding the value of n (5) and the argument 4. "Right before next recursive call to fac": a third frame, holding the value of n (4) and the argument 3. Each frame has its own link data, and LB and ST move up with each call.]

Page 78: Languages and Compilers (SProg og Oversættere)

Recursion

Is the spaghetti of static and dynamic links getting confusing?

Let’s zoom in on just a single activation of the fac function. The pattern is always the same:

Just before the recursive call in fac:

      argument n
LB -> link data: to caller context (= previous LB), to lexical context (= SB)
      n, n-1: intermediate results in the computation of n*fac(n-1)
ST -> ?

Page 79: Languages and Compilers (SProg og Oversættere)

Recursion

Just before the return from the “deepest call” (n=1):

      argument n=1
LB -> link data
      result = 1
ST ->

After return from the deepest call:

      caller frame (what’s in here?)
      argument n=2
LB -> link data
      n=2
      result=1
ST -> ?

Next step: multiply

Page 80: Languages and Compilers (SProg og Oversættere)

Recursion

[Figure, first panel: just before the return from the “deepest call” (n=1).]

After return from the deepest call and multiply:

      caller frame (what’s in here?)
      argument n=2
LB -> link data: to caller context, to lexical context (= SB)
      result: 2*fac(1)=2
ST -> ?

Next step: return

From here on down the stack is shrinking, multiplying each time with a bigger n.

Page 81: Languages and Compilers (SProg og Oversættere)

Recursion

Just before the recursive call in fac:

      argument n
LB -> link data
      n
      recursive argument: n-1
ST ->

After completion of the recursive call:

      argument n
LB -> link data
      n
      fac(n-1)
ST ->

Calling a recursive function is just like calling any other function. After completion it just leaves its result on the top of the stack!
A recursive call can happen in the midst of expression evaluation.
Intermediate results, local variables, etc. simply remain on the stack, and computation proceeds when the recursive call is completed.
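
The growing and shrinking of the stack during fac can be traced with an explicit stack. This is a simplified Python sketch under assumptions: the function name `fac_explicit` is hypothetical, link data is omitted, and each activation is reduced to the single word holding its n.

```python
def fac_explicit(n):
    """Compute fac(n) for n >= 1 with an explicit stack; return (result, deepest stack size)."""
    stack = [n]  # the caller pushes the argument

    # Call phase: each activation pushes n-1 as the argument of the next call,
    # while its own n stays on the stack as an intermediate result.
    while stack[-1] != 1:
        stack.append(stack[-1] - 1)
    deepest = len(stack)

    # The deepest activation (n = 1) leaves its result, 1, in place of its argument.
    stack[-1] = 1

    # Return phase: each return leaves a result on top; the caller multiplies it
    # with its own pending n, and the stack shrinks by one word per return.
    while len(stack) > 1:
        result = stack.pop()
        stack[-1] = stack[-1] * result

    return stack[0], deepest


print(fac_explicit(6))  # (720, 6)
```

The pending values of n are exactly the intermediate results that remain on the stack while the recursive call runs.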

Page 82: Languages and Compilers (SProg og Oversættere)

Summary

• Data Representation: how to represent values of the source language on the target machine.

• Storage Allocation: How to organize storage for variables (considering different lifetimes of global, local and heap variables)

• Routines: How to implement procedures, functions (and how to pass their parameters and return values)

