Languages and Compilers (SProg og Oversættere)

Date post: 07-Jan-2016

Bent Thomsen

Department of Computer Science

Aalborg University

With acknowledgement to Elsa Gunter, on whose slides this lecture is based.

Type Checking

• When is op(arg1,…,argn) allowed?

• Type checking ensures that operations are applied to the right number of arguments of the right types
– Right type may mean same type as was specified, or may mean that there is a predefined implicit coercion that will be applied

• Used to resolve overloaded operations
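
A minimal sketch of the question "when is op(arg1,…,argn) allowed?", assuming a small signature table and one predefined coercion. The names `SIGNATURES`, `COERCIONS`, `accepts` and `check` are hypothetical and not taken from the lecture.

```python
# Hypothetical signature table: operator -> list of (parameter types, result type).
SIGNATURES = {
    "+": [(("int", "int"), "int"), (("real", "real"), "real")],
    "not": [(("bool",), "bool")],
}

# Assumed predefined implicit coercions, written as (from type, to type).
COERCIONS = {("int", "real")}


def accepts(actual, expected):
    """True if a value of type `actual` may be used where `expected` is required."""
    return actual == expected or (actual, expected) in COERCIONS


def check(op, arg_types):
    """Return the result type of op applied to arg_types, or raise TypeError."""
    candidates = SIGNATURES.get(op, [])
    # Resolve the overload: prefer an exact match, then one that needs coercions.
    for exact in (True, False):
        for params, result in candidates:
            if len(params) != len(arg_types):
                continue  # wrong number of arguments
            if exact and tuple(arg_types) == params:
                return result
            if not exact and all(accepts(a, p) for a, p in zip(arg_types, params)):
                return result
    raise TypeError(f"{op} is not defined for argument types {tuple(arg_types)}")
```

With these assumptions, `check("+", ["int", "real"])` selects the real version of `+` by coercing the first argument, while `check("not", ["int"])` is refused.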

Type Checking

• Type checking may be done statically at compile time or dynamically at run time

• Untyped languages (e.g. LISP, Prolog) do only dynamic type checking

• Typed languages can do most type checking statically

Dynamic Type Checking

• Performed at run-time before each operation is applied

• Types of variables and operations left unspecified until run-time
– Same variable may be used at different types

Static Type Checking

• Performed after parsing, before code generation

• Type of every variable and signature of every operator must be known at compile time

Static Type Checking

• Can eliminate need to store type information in data object if no dynamic type checking is needed

• Catches many programming errors at earliest point

Strongly Typed Language

• When no application of an operator to arguments can lead to a run-time type error, language is strongly typed

• Depends on definition of “type”

Strongly Typed Language

• C is “strongly typed” but type coercions may cause unexpected (undesirable) effects; no array bounds check (in fact, no runtime checks at all)

• SML “strongly typed” but still must do dynamic array bounds checks, arithmetic overflow checks

How to Handle Type Mismatches

• Type checking to refuse them

• Apply implicit function to change type of data
– Coerce int into real
– Coerce char into int

Conversion Between Types:

• Explicit: all conversions between different types must be specified

• Implicit: some conversions between different types implied by language definition
– Implicit conversions called coercions

Coercion Examples

Example in Pascal:

    var A: real;
        B: integer;

    A := B

– Implicit coercion: an automatic conversion from one type to another

Coercions Versus Conversions

• When A has type real and B has type int, many languages allow the coercion implicit in

    A := B

• In the other direction (A of type int, B of type real), often no coercion is allowed; must use explicit conversion:

– A := round(B); Go to integer nearest B

– A := trunc(B); Delete fractional part of B
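
The difference can be tried out in Python, offered here only as an illustration. Python has no typed variable declarations, so the implicit coercion shows up in mixed arithmetic rather than in assignment; the variable names are hypothetical.

```python
import math

a = 3      # int
b = 2.7    # float, playing the role of "real"

# Implicit coercion: the int operand is widened to float automatically.
total = a + b

# The other direction has to be requested explicitly.
nearest = round(b)           # go to the integer nearest b
truncated = math.trunc(b)    # delete the fractional part of b
```

For negative values the two conversions differ in direction: `round(-2.7)` gives -3 while `math.trunc(-2.7)` gives -2. Python's `round` resolves exact ties such as 2.5 to the even neighbour, which may differ from `round` in other languages.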

Type Equality (aka Type Compatibility)

• When are two types “the same”?

• Name equivalence: two types equal only if they have the same name
– Simple but restrictive
– Usually loosened to allow two types to be equal when one is defined with the name of the other (declaration equivalence)

Type Equality

• Structure equivalence: Two types are equivalent if the underlying data structures for each type are the same
– Problem: how far to go? Are two records with the same number of fields of the same type, but different labels, equivalent?
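
A small sketch of the two notions of equality, assuming record types are modelled as plain tuples of the form (name, fields). The model and the function names are hypothetical; the `compare_labels` switch corresponds to the "how far to go" question.

```python
# A record type is modelled as (name, ((label, field_type), ...)).
POINT = ("Point", (("x", "int"), ("y", "real")))
PAIR = ("Pair", (("first", "int"), ("second", "real")))


def name_equivalent(t1, t2):
    """Name equivalence: equal only if the type names are the same."""
    return t1[0] == t2[0]


def structure_equivalent(t1, t2, compare_labels=True):
    """Structure equivalence: compare the underlying field lists."""
    fields1, fields2 = t1[1], t2[1]
    if compare_labels:
        return fields1 == fields2
    # Ignore the labels and compare only the sequence of field types.
    return [ty for _, ty in fields1] == [ty for _, ty in fields2]
```

Under this model `POINT` and `PAIR` are never name equivalent, and they are structure equivalent only when labels are ignored.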

Elementary Data Types

• Data objects contain single data value with no components

• Standard elementary types include:

integers, reals, characters, booleans, enumerations, pointers (references in SML)

Specification of Elementary Data Types

• Basic attributes of type usually used by compiler and then discarded

• Some partial type information may occur in data object

• Values usually match with hardware types: 8 bits, 16 bits, 32 bits, 64 bits

• Operations: primitive operations with hardware support, and user-defined operations built from primitive ones

Integers – Specification

• Range of integers from some fixed minint to some fixed maxint, typically -2^31 through 2^31 - 1 or -2^30 through 2^30 - 1

• Standard collection of operators: +, -, *, /, mod, ~ (negation)

• Standard relational operations: =, <, >, <=, >=, =/=

Integers – Implementation

• Implementation:

– Binary representation in 2’s complement arithmetic

– Three different standard representations:

• First kind: no descriptor; one word holds the sign bit S (0 for +, 1 for -) followed by the binary integer data

• Second kind: a type descriptor T and an address in one word, with the address pointing to a separate word that holds the sign bit S and the data

• Third kind: the type descriptor T, the sign bit S and the data all in the same word

Integer Numeric Data

• Positive values

0 1 0 0 1 1 0 0 (the leftmost bit is the sign bit)

64 + 8 + 4 = 76
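
The 8-bit pattern for 76 can be reproduced with a short sketch, assuming a two's complement encoding in a fixed number of bits. The helper names are hypothetical.

```python
def to_twos_complement(value, bits=8):
    """Return the `bits`-wide two's complement bit string for `value`."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= value <= hi:
        raise OverflowError(f"{value} does not fit in {bits} bits")
    # Masking maps a negative value onto its two's complement bit pattern.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(bit_string):
    """Decode a two's complement bit string back to an int."""
    raw = int(bit_string, 2)
    if bit_string[0] == "1":  # sign bit set, so the value is negative
        raw -= 1 << len(bit_string)
    return raw
```

`to_twos_complement(76)` gives `"01001100"`, matching the positive value above, and `to_twos_complement(-76)` gives `"10110100"`.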

Subranges

• Example (Ada):

    A: integer range 10..20;

• Subtype of integers (implicit coercion into integer)

Subranges

• Data may require fewer bits than integer type

– Data in example above (range 10..20, 11 values) require only 4 bits when stored as an offset from the lower bound

• Range checking usually requires some run-time information and dynamic type checking

IEEE Floating Point Format

• IEEE standard 754 specifies both a 32- and 64-bit standard

• At least one supported by most hardware

• Numbers consist of three fields, stored in this order:
– S (sign), E (exponent), M (mantissa)

Floating Point Numbers: Theory

• Every non-zero number may be uniquely written as

$(-1)^S \times 2^e \times m$

where $1 \le m < 2$ and S is either 0 or 1

Floating Point Numbers: Theory

• Every non-zero number may be uniquely written as

$(-1)^S \times 2^{E - \mathit{bias}} \times (1 + M/2^N)$

where $0 \le M < 2^N$, so that $0 \le M/2^N < 1$

• N is number of bits for M (23 or 52)

• Bias is 127 for the 32-bit format

• Bias is 1023 for the 64-bit format

IEEE Floating Point Format (32 Bits)

• S: a one-bit sign field. 0 is positive.

• E: an exponent in excess-127 notation. Values (8 bits) range from 0 to 255, corresponding to exponents of 2 that range from -127 to 128. The extreme values E = 0 and E = 255 are reserved for special cases, so normalized numbers use exponents from -126 to 127.

IEEE Floating Point Format (32 Bits)

• M: a mantissa of 23 bits. Since the first bit of the mantissa in a normalized number is always 1, it can be omitted and inserted automatically by the hardware, yielding an extra 24th bit of precision.

Exponent Bias

• With 8 bits (256 values), 127 is added to the true exponent to get E

• If E = 127 then 127-127 = 0 is true exponent

• If E = 129 then 129-127 = 2 is true exponent

• If E = 120 then 120-127 = -7 is true exponent

Floating Point Number Range

• In 32-bit format, the exponent has 8 bits, giving a range from -127 to 128 for the exponent

• This gives a number range from roughly $10^{-38}$ to $10^{38}$

Floating Point Number Range

• In 64-bit format, the exponent is extended to 11 bits, giving a range from -1023 to +1024 for the exponent

• This gives a number range from roughly $10^{-308}$ to $10^{308}$

Decoding IEEE format

• Given E and M, the value of the representation is:

Parameters            Value
E = 255 and M ≠ 0     An invalid number
E = 255 and M = 0     ∞
0 < E < 255           $2^{E-127} (1 + M/2^{23})$
E = 0 and M ≠ 0       $2^{-126} (M/2^{23})$
E = 0 and M = 0       0
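
The decoding rules can be checked against real hardware floats with a short sketch that uses Python's standard `struct` module to get at the 32-bit pattern. The function names `fields` and `decode` are hypothetical, and the sign is applied explicitly.

```python
import struct


def fields(x):
    """Split x, stored as a 32-bit IEEE 754 float, into the fields (S, E, M)."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    return word >> 31, (word >> 23) & 0xFF, word & 0x7FFFFF


def decode(s, e, m):
    """Compute the represented value from the three fields."""
    sign = -1.0 if s else 1.0
    if e == 255:
        return float("nan") if m else sign * float("inf")
    if e == 0:
        # Denormalized number, or zero when M = 0.
        return sign * 2.0 ** -126 * (m / 2 ** 23)
    return sign * 2.0 ** (e - 127) * (1 + m / 2 ** 23)
```

For example, `fields(1.5)` yields S = 0, E = 127 and M = 2^22, and `decode(*fields(-5.0))` reconstructs -5.0.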

Example Floating Point Numbers

• $+1 = 2^0 \times 1 = 2^{127-127} \times (1 + .0)$

0 01111111 000000…

• $+1.5 = 2^0 \times 1.5 = 2^{127-127} \times (1 + 2^{22}/2^{23})$

0 01111111 100000…

• $-5 = -2^2 \times 1.25 = -2^{129-127} \times (1 + 2^{21}/2^{23})$

1 10000001 010000…

Other Numeric Data

• Short integers (C) - 16 bit, 8 bit

• Long integers (C) - 64 bit

• Boolean or logical - 1 bit with value true or false (often stored as bytes)

• Byte - 8 bits

Other Numeric Data

• Character - Single 8-bit byte - 256 characters

• ASCII is a 7 bit 128 character code

• Unicode is a 16-bit character code (Java)

• In C, a char variable is simply 8-bit integer numeric data

Enumerations

• Motivation: Type for case analysis over a small number of symbolic values

• Example (Ada):

    type DAYS is (Mon, Tues, Wed, Thu, Fri, Sat, Sun);

• Implementation: Mon → 0; … Sun → 6

• Treated as ordered type (Mon < Wed)

• In C, always implicitly coerced to integers

Pointers

• A pointer type is a type in which the range of values consists of memory addresses and a special value, nil (or null)

• Use of pointers to create arbitrary data structures

Pointer Data

• Each pointer can point to an object of another data structure
– Its l-value is its address; its r-value is the address of another object

• Accessing r-value of r-value of pointer called dereferencing

Pointer Aliasing

• A := B
– Numeric assignment: before, A holds 7.2 and B holds 0.4; afterwards A and B each hold their own copy of 0.4
– Pointer assignment: before, A points to 7.2 and B points to 0.4; afterwards A and B both point to the same 0.4 (they are aliases)
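
The two kinds of assignment can be imitated in Python, where a one-element list stands in for a heap cell and names hold references to it. This is an illustrative assumption, not the notation used on the slides.

```python
# Numeric assignment copies the value: a and b stay independent.
a, b = 7.2, 0.4
a = b
b = 1.5
# a still holds 0.4 here

# Pointer-style assignment: both names end up referring to one cell.
cell_a, cell_b = [7.2], [0.4]
cell_a = cell_b
cell_b[0] = 1.5   # an update through one alias ...
# ... is visible through the other: cell_a[0] is now 1.5
```

After the pointer-style assignment nothing refers to the cell holding 7.2 any more; Python's memory manager reclaims it, whereas in a language with manual deallocation it would become garbage.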

Problems with Pointers

• Dangling Pointer: A and B point to the same object; after Delete A, B still points to the deallocated storage

• Garbage (lost heap-dynamic variables): A points to 7.2 and B points to 0.4; after A := B nothing points to 7.2, so it can no longer be reached

Ways to Create Dangling Pointers

    int *A, *B;
    A = new int;
    *A = 5;
    B = A;
    delete A;
    /* B is still pointing to the address of the object, which has been returned to the heap */

Ways to Create Dangling Pointers

    int * A;

    int * sub () {
        int B;
        B = 5;
        return &B;
    }

    main () { A = sub(); . . . }

    /* A has been assigned the address of an object that is out of scope */

SML references

• An alternative to allowing pointers directly

• References in SML can be typed

• … but they introduce some abnormalities

SML imperative constructs

• SML reference cells
– Different types for location and contents

    x : int        non-assignable integer value
    y : int ref    location whose contents must be integer
    !y             the contents of location y
    ref x          expression creating new cell initialized to x

– SML assignment: operator := applied to memory cell and new contents

– Examples

    y := x+3       place value of x+3 in cell y; requires x:int
    y := !y + 3    add 3 to contents of y and store in location y

SML examples

• Create cell and change contents

    val x = ref "Bob";
    x := "Bill";

• Create cell and increment

    val y = ref 0;
    y := !y + 1;

• While loop

    val i = ref 0;
    while !i < 10 do i := !i + 1;
    !i;

Composite Data Types

• Composite data types are sets of data objects built from data objects of other types

• Elements called data structures

• Some created by users, e.g. an array of integers

• Some created internally by compiler, e.g. symbol table, or subroutine activation record

Specification of Structured Data Types

• Number of components
– Fixed or varying over life of data structure
  • Arrays and records have fixed number
  • Lists have variable number
– If variable number of components, is there a max number possible?

Specification of Structured Data Types

• Type of each component
– Homogeneous: all components have same type
  • Arrays
– Heterogeneous: components have varying types
  • Records (also lists in some languages, but not SML)

Specification of Structured Data Types

• Method of accessing components
– Array subscripting
– Record labels
– SML datatype pattern matching

Operations on Data Structures

• Creation and deletion of structures

• Whole-structure operations
– Assigning to variable
– Iterating a function over the structure
– Computing its length or size

Operations on Data Structures

• Component selection operations
– Direct access (aka random selection)
  • Takes constant time
– Sequential selection
  • Time usually proportional to some dimension of the structure (like the number of components)
– May allow component update, or may only allow access to value

Operations on Data Structures

• Component insertion and deletion
– Applies to structures with variable number of components
– Causes major effects on possible data layouts
  • Example seen in the layouts for strings

General Layout of Data Structures

• Descriptor
– Contains type information and other attributes of data structure
– May only exist in symbol table at compile time, or may be a direct part of data object, or split between the two
– Usually several words long

General Layout of Data Structures

• Layout of component data
– Sequential: arrays and records
  • Uses least storage for structure if number of components fixed
  • Least flexible for overall storage management

General Layout of Data Structures

• Layout of component data
– Linked: lists, trees
  • Uses more space per structure since each component must also have a pointer to it
  • Maximum flexibility for overall storage management: put pieces where they fit

Strings

• Character string is a data object composed of a sequence of characters

• Main kinds:
– Fixed declared length
– Variable length with declared maximum length
– Unbounded length

String operations

• String concatenation

• Length of string

• Substring selection by position

• Lexicographical ordering (based on underlying codes such as ASCII)

• Substring by pattern matching

String Interface

• Can be implemented as primitive type (as in SML or Java) or an array of characters (as in C and C++)

• If primitive, operations are built in

• If array of characters, string operations provided through a library

String Implementations

• Fixed declared length (aka static length)
– Packed array padded with blanks

[Figure: descriptor (String; Length = 12; pointer to data) and data "All aboardøø", where ø marks a padding blank]

String Implementations

• May need runtime descriptor for type, and for length if substring operations include runtime checks

• Update pads with blanks or truncates as necessary

String Implementations

• Variable length with declared maximum (aka limited dynamic length)
– Packed array with runtime descriptor

[Figure: descriptor (String; Max Length = 12; Cur Length = 10; pointer to data) and data "All aboard"]

String Implementations

• Descriptor may occur as initial block of data object for array

String Implementations

• Unbounded length (aka dynamic length)

– Two standard implementations

– First: Linked list

[Figure: descriptor (String; Curr Length = 10; pointer to data) pointing to a linked list of blocks of characters that together hold "All aboard"]

String Implementations

• Unbounded length
– Second implementation: null terminated contiguous array
– Must reallocate and copy when string grows

[Figure: descriptor (String; pointer to data) pointing to the contiguous array "All aboard"]

Arrays

• Ordered sequence of fixed number of objects all of the same type

• Indexed by integer, subrange, or enumeration type, called subscript

• Multidimensional arrays have one subscript per dimension

• L-value for array element given by accessing formula

Type Checking Arrays

• Basic type – array

• Number of dimensions

• Type of components

• Type of subscript

• Range of subscript (must be done at runtime, if at all)

Array Layout

• Assume one dimension

[Figure: descriptor holding: 1 dim array; Virtual Origin (VO); Lower Bound (LB); Upper Bound (UB); Comp type; Comp size (E). Data: A[LB], A[LB+1], …, A[UB]. VO is the address that A[0] would have.]

Array Component Access

• Component access through subscripting, both for lookup (r-value) and for update (l-value)

• Component access should take constant time (i.e. looking up the 5th element takes the same time as looking up the 100th element)

Array Access Function

• L-value of A[i] $= VO + (E \times i) = \alpha + (E \times (i - LB))$, where $\alpha$ is the address of the first component A[LB]

• $VO = \alpha - (E \times LB)$

• Computed at compile time

• More complicated for multiple dimensions
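
The access function can be sketched as follows, assuming the array data starts at a known address (called `alpha` here) and every component has the same size E. The function names and the sample numbers are hypothetical.

```python
def virtual_origin(alpha, lb, e):
    """Address that A[0] would have: the start address minus E * LB."""
    return alpha - e * lb


def l_value(vo, e, i, lb, ub):
    """Address of A[i], with an optional run-time subscript range check."""
    if not lb <= i <= ub:
        raise IndexError(f"subscript {i} outside {lb}..{ub}")
    return vo + e * i
```

For an array indexed 10..20 with 4-byte components stored from address 1000, the virtual origin is 960, and A[13] lives at 960 + 4 * 13 = 1012. Each access costs one multiplication and one addition, independent of i.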

Records

• Ordered sequence of fixed number of objects of differing types

• Indexed by fixed identifiers called labels or fields

• L-value for record element given by more complex accessing formula than for arrays

Typical Record Layout

[Figure: descriptor holding: Record type; Num. of components; Comp 1 label; Comp 1 type; Comp 1 location; …; Comp n label; Comp n type; Comp n location. Data: R.1, R.2, …, R.n.]

Type Checking Record

• Basic type – record

• Number, name (label) of components

• Possibly order of labels
– If order matters, labels must be unique
– If order doesn’t matter, layout must give a canonical ordering

• Type of components per label

Record Layout

• Most of descriptor exists only at compile time

• Access function:

• Comp i location given by

• L-value of R.i $= \alpha + \sum_{j=1}^{i-1} (\text{size of } R.j)$, where $\alpha$ is the address at which the record's data begins
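
A minimal sketch of this access function, assuming the component sizes are known at compile time and ignoring any alignment padding a real compiler might insert. The names `field_offsets` and `record_l_value` are hypothetical.

```python
from itertools import accumulate


def field_offsets(sizes):
    """Offset of each component: the total size of the components before it."""
    return [0] + list(accumulate(sizes))[:-1]


def record_l_value(alpha, sizes, i):
    """Address of R.i (components numbered from 1) for data starting at alpha."""
    return alpha + sum(sizes[: i - 1])
```

For component sizes 4, 8 and 1 the offsets are 0, 4 and 12, so with the data starting at address 1000 the third component is at 1012. Because the offsets depend only on the declared types, they can be computed once by the compiler.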

Lists

• Ordered collection of variable number of elements
– Many languages (LISP, Scheme, Prolog) allow heterogeneous lists
– SML has only homogeneous lists

Lists

• Layout: linked series of cells (called cons cells) with descriptor, data and pointers
– Data in first cell of list called head of list
– R-value of pointer in first cell called tail of list

Lists

• Sequential access of data by following pointers
– Access is linear in position in list
  • Takes twice as long to look up 10th element as to look up 5th element

Lists

• Adding a new element to list done only at head, called consing

• Creates new cell with element to be added and pointer to old list (i.e. creates new list)
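
Cons cells and consing can be sketched with a small immutable class. The names `Cons`, `cons` and `nth` are hypothetical, and `None` is assumed to mark the end of a list.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Cons:
    head: Any                # data stored in this cell
    tail: Optional["Cons"]   # pointer to the rest of the list; None ends it


def cons(element, lst):
    """Add an element at the head; the old list is shared, not copied."""
    return Cons(element, lst)


def nth(lst, n):
    """Sequential access: follow n tail pointers, then return the head."""
    for _ in range(n):
        lst = lst.tail
    return lst.head
```

Consing 1 onto the list 2, 3 allocates exactly one new cell whose tail is the old list, which stays valid and unchanged. Looking up position n follows n pointers, so access time grows linearly with the position.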

List Layout

• Example: [1,2.5,’a’]

[Figure: three linked cons cells, each with descriptor "list"; their data are int 1, real 2.5 and char ‘a’]

List Layout

• Example: [[1,2.5],[’a’]]

[Figure: an outer list of cons cells whose data are themselves lists; the first inner list holds int 1 and real 2.5, the second holds char ‘a’]

Union Types

• Set-wise the (discriminated) union of the component types

• Interchangeable with variant records as primitive type construct

• Elements chosen from one of component types

Union Types

• Problem: if int occurs as two different components of union type, can we tell which component an int is for?

Union Types

• Two kinds of union types:

– Free union: Ans: no

– Discriminated union: Ans: yes

• If each component carries a tag that separates occurrences of the same type, the union is discriminated; otherwise it is free
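
A discriminated union can be sketched by storing a tag next to the value. The class `Tagged`, the tags `celsius` and `metres`, and the function `describe` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Tagged:
    tag: str     # names the component of the union this value belongs to
    value: int


def describe(v):
    """Dispatch on the tag; a free union would not carry this information."""
    if v.tag == "celsius":
        return f"{v.value} degrees Celsius"
    if v.tag == "metres":
        return f"{v.value} metres"
    raise ValueError(f"unknown tag {v.tag!r}")
```

Both components hold an int, yet `Tagged("celsius", 5)` and `Tagged("metres", 5)` remain distinguishable. With a free union only the bare 5 would be stored and the question of which component it belongs to could not be answered.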

Union Layout

[Figure: descriptor holding: Union type; Component type; Component tag; Component location. Data: the actual data followed by unused space, with total length L.]

• No tag if free union

• L is fixed length of biggest component

Combining Data Structures

• Possible to have any of the above structures as components of others

• Since lists are of variable size, but arrays must store fixed size element, how to store lists in an array?

Combining Data Structures

• Answer: cons cells have uniform size, store just the leading cons cell

Example:

• Data in 4-element array of lists

[Figure: a 4-element array whose entries point to the leading cons cells of lists; the cells hold the values int 5, int 6, int 3, int 1, int 7 and int 2]

Type summary

• Static type checking takes place after syntax check and before code generation

• Some type checking can be necessary at run time

• Types vs. Syntax

• Simply typed values and composite values

• User defined types

• Equivalence on types

