
Type Systems

31 March 1999 CS655 - P.F. Reynolds

Haskell & ML: Interesting Features

• Type inferencing

• Freedom from side effects

• Pattern matching

• Polymorphism

• Support for higher order functions

• Lazy patterns / lazy evaluation

• Support for object-oriented programming


Type Inferencing

• Def: ability of the language to infer types without having the programmer provide type signatures.

– SML e.g.:

    fun min (a: real, b)
      = if a > b
        then b
        else a

– type of a has to be given, but then that’s sufficient to figure out
    • type of b
    • type of min

– What if type of a is not specified?
    - could be ints
    - could be bools...


Type Inferencing (cont)

• Haskell (as with ML) guarantees type safety

– Haskell example:

    eq a b = (a == b)

– a polymorphic function that has a return type of Bool; it assumes only that its two arguments are of the same type and can have the equality operator applied to them.

– ML has a similar assumption, for what it calls equality types.

• Overuse of type inferencing in both languages is discouraged
– declarations are a design aid
– declarations are a documentation aid
– declarations are a debugging aid
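As a rough illustration of the equality assumption above, the Haskell sketch below writes out the signature a compiler would infer for such a function. The name `eq` follows the slide; `intsEqual` and `stringsDiffer` are hypothetical names added only to show the same function used at two types.

```haskell
-- Both arguments share one type a, and a must support (==).
-- The signature is optional; it is what the compiler infers.
eq :: Eq a => a -> a -> Bool
eq a b = (a == b)

-- Hypothetical uses at two different argument types.
intsEqual :: Bool
intsEqual = eq (3 :: Int) 3

stringsDiffer :: Bool
stringsDiffer = not (eq "ml" "haskell")
```

Applying `eq` to functions is rejected at compile time, since function types have no `Eq` instance; this is the counterpart of ML's equality types.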


Polymorphism

• ML:

    fun factorial (0) = 1
      | factorial (n) = n * factorial (n - 1)

– ML infers factorial is an integer function: int -> int

• Haskell:

    factorial (0) = 1
    factorial (n) = n * factorial (n - 1)

– Haskell infers factorial is a (numerical) function: Num a => a -> a
  (current GHC reports (Eq a, Num a) => a -> a, because Eq is no longer a superclass of Num)


Polymorphism (cont)

• ML:

fun mymax(x,y) = if x > y then x else y

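– SML infers mymax is ambiguous (SML '97 instead defaults the overloaded > to int, giving int * int -> int)

fun mymax(x: real, y) = if x > y then x else y

– SML infers mymax is real: real * real -> real

• Haskell:

mymax(x,y) = if x > y then x else y

– Haskell infers mymax is an Ord function: Ord a => (a, a) -> a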
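The two Haskell definitions from these slides can be checked with a short sketch like the one below. The signatures are optional and are written out only to show what inference produces; the `Eq` constraint on `factorial` reflects current GHC behaviour and is an addition to what the slide states.

```haskell
-- The literal pattern 0 needs equality, the arithmetic needs Num.
factorial :: (Eq a, Num a) => a -> a
factorial 0 = 1
factorial n = n * factorial (n - 1)

-- Works at any ordered type; tupled argument as on the slide.
mymax :: Ord a => (a, a) -> a
mymax (x, y) = if x > y then x else y
```

The same `mymax` can be applied to a pair of `Double`s, a pair of `Int`s, or a pair of `String`s without any annotation at the call site.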


Polymorphism (Cardelli & Wegner)

• Universe, V, of all values

• A Type is a set of values selected from V (subset of V)

• Sometimes only way to enumerate is through constants and functions

• An Ideal is a type that satisfies certain "technical" properties

• (one would not identify a type containing integers and Int-> Int functions)

• All types found in programming languages are ideals

• (Value) Having a type::= membership in a set.

• Because ideals can overlap, a value can have many types

• A type system (in a language) is a collection of ideals of V

• Languages provide support for defining which types are mappable onto ideals


More Terms

• Monomorphic Type System: a value belongs to at most one type

• Polymorphic Type System: a value may belong to many types

• Mostly Monomorphic . . . Mostly Polymorphic
– One or the other characterizes individual languages

• Polymorphism, as it relates to:
– values and variables: may have more than one type
– functions: arguments can be of > one type
– types: operations are applicable to operands of more than one type

Polymorphism: A Taxonomy

Polymorphism

    Universal: infinite number of types with common structure

        Parametric: uniformity of type structure is achieved by type parameters

        Inclusion: object can belong to many different classes that need not be disjoint (subtypes & inheritance)

    Ad Hoc: finite set of potentially unrelated types.

        Overloading: same name used to denote different functions. Use determined from context

        Coercion: a semantic operation required to convert an argument to a type expected by a function.
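Three of the four categories can be sketched in Haskell, as below. All names (`twice`, `Describe`, `half`) are hypothetical and chosen for illustration only. Haskell has no subtyping, so inclusion polymorphism is not shown, and because Haskell never coerces implicitly, the coercion case is written as an explicit conversion.

```haskell
-- Parametric (universal): one definition, uniform over every type a.
twice :: (a -> a) -> a -> a
twice f = f . f

-- Overloading (ad hoc): the same name denotes a different function per type.
class Describe a where
  describe :: a -> String

instance Describe Bool where
  describe b = if b then "yes" else "no"

instance Describe Int where
  describe n = "int " ++ show n

-- Coercion (ad hoc): the Int argument is converted to the Double that (/) expects.
half :: Int -> Double
half n = fromIntegral n / 2
```

`twice` works for infinitely many types with one body, while `describe` works only for the finite set of types that have been given an instance.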


Exploring Terminology…

• Is inclusion polymorphism a kind of parametric polymorphism?
– Consider invocation of a method (behavior) in C++ (Smalltalk): selection is based (parametrically) on type…
– Why is inclusion polymorphism not a form of parametric polymorphism?

• Are generics (templates) a form of universal polymorphism?
– Cardelli & Wegner: no
– Day et al.: yes (parametric)

• Is there a difference between/among subtypes, subclasses and inheritance?
– Subtypes: derived type’s methods/data subsume parent type’s
– Subclasses: structuring
– Inheritance: subtypes + subclasses -> specialization


Cardelli on Type Systems

• Type system
– purpose is to prevent occurrence of execution errors during runtime

• Type Sound Language
– absence of execution errors holds for all program runs that can be described in a programming language

• Typechecker
– method for determining if type errors occur
– ambiguities in language specifications often lead to different type checker implementations, hampering language soundness.

• Type
– “Upper bound” (maximal set) on range of values a variable can take on

• Typed Language
– one in which variables can be given (nontrivial) types
  (How about “can assume”?)


More Cardelli on Types

• Explicit / implicit typing
– as names suggest…

• Trapped errors
– execution error when computation stops “immediately”

• Untrapped errors
– execution errors that go unnoticed and cause arbitrary behavior

• Safe program fragment
– one that does not (cannot?) cause untrapped errors to occur

• Safe language
– one in which all program fragments are safe


Safety and Typed Languages

• “Untyped languages may enforce safety by performing run time checks.”

• “Typed languages may also use a mixture of run time and static checks.”

-- Is an untyped language that enforces safety comprehensively at run time equivalent to a typed language that uses run time checks exclusively?


Off on Good Behavior

• Forbidden errors
– all untrapped errors plus some trapped errors
– (what trapped errors might be included?)

• Good Behavior (well behaved)
– no forbidden errors occur
– a well behaved program fragment is safe

• Strongly checked language
– One in which all (legal) program fragments have good behavior
    • no untrapped errors occur
    • none of the specified trapped errors occur
    • other trapped errors may occur - programmer must avoid them
– (notice avoidance of “strongly typed”)


Safety and Typed

            Typed    Untyped
Safe        ML       LISP
Unsafe      C        Assembler

-- Cardelli argues languages should be safe and typed (Should type system be implicit, or explicit, or both?)


ML Type Inferencing

• Key concepts:

– Type variables

– Substitution

– Unification

– Most general unifiers

– Inferencing

Hindley-Milner


Type Variables/Instances

• Type variables:

    tyvar ::= 'identifier

    e.g.: 'a 'b 'm

– provide for polymorphism

• Type instances:

int


Most General Unifiers

• Make no unnecessary assumptions:
– 'a list and 'b are unified by 'a --> int list, 'b --> int list list
– 'a list and 'b are unified by 'b --> 'a list

• Which unifier is more general? $\sigma_1$ is an instance of $\sigma_2$ iff there exists $\sigma$ such that $\sigma_1 = \sigma \circ \sigma_2$

• For example above: $\sigma$ = 'a --> int list

• The most general unifier of types t1 and t2 is a substitution $\sigma$ such that:
– t1 and t2 are unified by $\sigma$
– and there is no more general $\sigma'$ that also unifies t1 and t2
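A minimal sketch of first-order unification in Haskell may make the definition concrete. It is a textbook-style unifier with an occurs check, not necessarily the algorithm any particular ML compiler uses; the names `Type`, `Subst`, `apply`, `unify`, `list`, and `int` are all assumptions introduced for this example.

```haskell
import qualified Data.Map as Map

-- Types: variables such as 'a, and constructors such as int or list.
data Type = TVar String | TCon String [Type]
  deriving (Eq, Show)

type Subst = Map.Map String Type

-- Hypothetical helpers for writing example types.
list :: Type -> Type
list t = TCon "list" [t]

int :: Type
int = TCon "int" []

-- Apply a substitution, following chains of bindings.
apply :: Subst -> Type -> Type
apply s t@(TVar v)  = maybe t (apply s) (Map.lookup v s)
apply s (TCon c ts) = TCon c (map (apply s) ts)

-- Does variable v occur in t once s has been applied?
occurs :: Subst -> String -> Type -> Bool
occurs s v t = case apply s t of
  TVar w    -> v == w
  TCon _ ts -> any (occurs s v) ts

-- Extend s so both types become equal, binding only what is forced.
unify :: Subst -> Type -> Type -> Maybe Subst
unify s t1 t2 = go (apply s t1) (apply s t2)
  where
    go (TVar a) (TVar b) | a == b = Just s
    go (TVar a) t = bind a t
    go t (TVar a) = bind a t
    go (TCon c xs) (TCon d ys)
      | c == d && length xs == length ys = unifyAll s (zip xs ys)
      | otherwise = Nothing
    bind a t
      | occurs s a t = Nothing            -- e.g. 'a against 'a list
      | otherwise    = Just (Map.insert a t s)

-- Unify a list of pairs, threading the substitution through.
unifyAll :: Subst -> [(Type, Type)] -> Maybe Subst
unifyAll s []              = Just s
unifyAll s ((a, b) : rest) = unify s a b >>= \s' -> unifyAll s' rest
```

For the slide's example, `unify Map.empty (list (TVar "a")) (TVar "b")` binds only `'b` to `'a list`, which is the more general of the two unifiers listed; it never commits `'a` to `int list`.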


ML Type Inferencing Example

    fun find p [] = false
      | find p (x::S) = if p x then true
                        else find p S

• Initial type environment:
    false: bool
    true: bool
    if: bool * 'e * 'e -> 'e

• No assumptions about find:
    find: 'k

• lambda: new environment with fresh type variables for parameters:
    p: 'i
    (x::S): 'j


ML Type Inferencing Example (2)

    fun find p [] = false
      | find p (x::S) = if p x then true
                        else find p S

• Analysis:
    (x::S) implies 'j --> 'c list
    p x implies 'i --> 'c -> 'l
    if p x implies 'l --> bool
    if ... true else find ... implies 'k --> 'm -> 'n -> bool
    find p ... implies 'm --> 'c -> bool
    find ... S implies 'n --> 'c list

• Composing all substitutions yields:
    (x::S): 'c list
    p: 'c -> bool
    find: ('c -> bool) -> 'c list -> bool
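For comparison, here is the same function transcribed into Haskell as a sketch. The signature is optional; it is written out to show the curried type that inference arrives at for a definition whose parameters are given one after another, as they are in the ML source.

```haskell
-- Curried: takes the predicate, then the list.
find :: (c -> Bool) -> [c] -> Bool
find _ []      = False
find p (x : s) = if p x then True else find p s
```

Nothing in the body constrains the element type `c` beyond being acceptable to `p`, which is why the result stays polymorphic.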


Constraint-Based Type Inference and Parametric Polymorphism
Ole Agesen (Stanford) 1994

• Constraint-based analysis: technique for inferring implementation types

• Using flow analysis to build network of type variables connected by constraints

• Program can be viewed as collection of slots and expressions

    x <- 0        // slot declaration
    x := y + 1    // Expression (y+1) and slots (x,y)


Three Step Process: Steps 1 & 2

• Allocating "type variables" to every slot and expression
– Initially empty
– Process of type inference binds types to type variables
– On termination of inferencing, type variables hold "sound types"
– Inference exhibits monotonicity: type variables have types added, only.

• Seeding type variables
– Find obvious cases where type is known and assign to type variable

    e.g. x <- 0 // x’s type is the type of the literal 0.


Three Step Process: Step 3

• Establishing constraints and propagating (repeat until termination)
– Connect type variables into a network by adding directed edges
    • Nodes are type variables; edges are constraints
– Whenever constraint is added, object types are propagated
– One constraint generated for each data flow in the program
    • assignment generates data flow from expr to assigned variable
    • variable access generates data flow from variable to accessing expression
    • message send (or func call) generates flows from actuals to formals, and result generates flow back to message send (invocation point)
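The three steps can be sketched as a small fixed-point computation. This is a simplified illustration under stated assumptions, not Agesen's implementation: type variables are named by strings, object types are strings, all constraints are known up front, and the names `Node`, `Types`, `Edge`, `step`, and `propagate` are hypothetical.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

type Node  = String            -- a type variable for a slot or expression
type Types = Set.Set String    -- the object types it may hold
type Edge  = (Node, Node)      -- constraint: types flow from source to target

-- One pass: push each source's current types along its edge into the target.
step :: [Edge] -> Map.Map Node Types -> Map.Map Node Types
step edges env = foldl push env edges
  where
    push m (src, dst) =
      Map.insertWith Set.union dst (Map.findWithDefault Set.empty src m) m

-- Repeat until nothing changes. Sets only grow (monotonicity) and there
-- are finitely many nodes and types, so the loop terminates.
propagate :: [Edge] -> Map.Map Node Types -> Map.Map Node Types
propagate edges env
  | env' == env = env
  | otherwise   = propagate edges env'
  where
    env' = step edges env
```

Seeding corresponds to the initial map: for instance, a node for the literal `0` starts with `{"integer"}`, and an edge from that node to `x` models the slot declaration.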


IF Expression Data Flows

[Figure: data flows for (IF test-expr THEN-expr ELSE-expr)]

Type of whole expression is union of types of THEN-expr and ELSE-expr


Templates

max: a = ( self > a ifTrue: [self] False: [a] ).

[Figure: template for max:, with type-variable nodes self, a, self > a, true, false, and result]


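Template Examples

[Figure: two max: templates, each with type-variable nodes self, a, and result.
 Left: the sends 3 max: 4 and 2 max: 1 share one template; types are [integer].
 Right: the sends 3 max: 4 and 2.5 max: 1.3 share one template; types are [integer, float].]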


Inference Algorithms

• Basic - just saw it.
– works when all uses of a method are "similar"
– fails when two or more uses of method supply different types of arguments, whether or not uses are individually polymorphic
    • only one max template is created; may need more than one

• 1-level expansion: retype each method for each send invoking it.
– (would separate template shown on right on previous slide into two)
– Works when polymorphic call chain is only one level deep
– Fails when polymorphic call chain is > 1 level deep
    – usual case


1-Level Expansion Algorithm: Failure


P-Level Expansion Algorithm

• Generalization of 1-Level algorithm

– expand to depth of p, then apply basic algorithm

• Size of expanded program is exponential in p !!!

• For any p, it’s possible to find a code sequence that requires p+1 expansions.


Adaptive Inference Algorithms

• Precise inference algorithms must not mix types
– Types mix if two incompatible activation records are represented by the same template
– Create lots of templates so they represent fewer activation records

• Efficient type inference requires processing as few templates as possible
– Template creation carries a computational cost

• 1-level algorithm does poorly in both cases

• Desire algorithm that is precise and efficient

• Adaptive algorithms attempt to create templates only when needed
– Lump similar activation records in one template when possible

• Success is when algorithm operates at a cost proportional to the amount of polymorphism in the program.


Adaptive Inference Algorithms (2)

A critical condition in adaptive inference algorithms:

$$\text{share a template} \iff \mathrm{type}(\mathit{rcvrExp}_1, N_1) = \mathrm{type}(\mathit{rcvrExp}_2, N_2) \;\wedge\; \mathrm{type}(\mathit{argExp}_1, N_1) = \mathrm{type}(\mathit{argExp}_2, N_2)$$

for the sends (in contexts N1, N2):

    rcvrExp1 max: argExp1  &  rcvrExp2 max: argExp2

i.e. two uses of max: can share same template if they have same receiver and same argument types resp.

Problem is: we’re trying to infer their types!

Adaptive algorithms use partial type info (or anything else) at decision point

Hash Function Algorithm

Improves precision and efficiency significantly

Hash function is computed when use of a method is processed:

$$\mathrm{hash} : \mathit{Send} \times \mathit{Template} \to \mathit{HashValue}$$

maps a use (send in context of a template) to a hash value

template can be shared by two uses iff they have same hash value

Tradeoffs between precision and efficiency determined by hash function

Assume send S is being analyzed in context of a template N (for method containing S); analysis done for each possible receiver of the send:

$$\mathrm{hash\_family}(S, N) = \{(\rho_1, S), (\rho_2, S), \ldots, (\rho_k, S)\} \quad \text{where} \quad \mathrm{type}(S.\mathit{receiver}, N) = \{\rho_1, \rho_2, \ldots, \rho_k\}$$

Hash Function Algorithm (2)

For hash value $(\rho_i, S)$, the first value is a possible receiver

    two uses of method can share template only if they have same receiver object

Precision improved in two ways:

    Inherited methods are reanalyzed in context of every object that inherits them

    Sends to self (common) can be analyzed more precisely because a single target can be found

Second component of hash value is send itself

    => different sends connect to different templates

    essentially, an implementation of 1-level algorithm

Sends with no arguments hashed to same template

Note: receiver type may not be fully known when hash is done

    OK. Just go back and rehash when receiver type grows (monotonically)
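The hash family described above can be sketched in a few lines of Haskell. This is an illustration only: receiver object types are plain strings, a send is identified by a hypothetical integer `SendId`, and `hashFamily` and `canShare` are names invented for the example.

```haskell
import qualified Data.Set as Set

type ObjType = String   -- a receiver object type, e.g. "integer"
type SendId  = Int      -- hypothetical identifier for a send site

-- hash_family(S, N): one hash value per possible receiver type of send S.
hashFamily :: Set.Set ObjType -> SendId -> Set.Set (ObjType, SendId)
hashFamily receivers s = Set.map (\r -> (r, s)) receivers

-- Two uses may share a template iff their hash values coincide.
canShare :: (ObjType, SendId) -> (ObjType, SendId) -> Bool
canShare = (==)
```

Because the receiver set only grows, rehashing after growth yields a superset of the earlier hash family, so templates already created stay valid. Argument types do not appear in the hash value at all in this sketch, so two uses with the same receiver and send but different argument types would still share a template.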


Hash Function Algorithm (3)

• Algorithm controls polymorphism at receiver well

e.g.

    444 value: nil With: nil With: nil.
    3.5 value: 100 With: 100 With: 100

will map to receiver types of [integer] and [float] resp., i.e. the two sends do not interfere. Two distinct sets of templates will be created.

Hash algorithm clearly fails on polymorphic arguments since argument types are not considered

• One work-around is to always force templates to not be shared for methods such as ifTrue:False:
– Has obvious computational costs


Iterative Algorithm

• A significant improvement over hash function algorithm

• First iteration is to apply the basic algorithm
– creates one shared template for each method

• In subsequent iterations, less is shared

• Key idea: use type information on previous iteration to decide whether or not to share templates in current iteration.

$$\text{share a template} \iff \mathrm{type}_{i-1}(\mathit{rcvrExp}_1, N_1) = \mathrm{type}_{i-1}(\mathit{rcvrExp}_2, N_2) \;\wedge\; \mathrm{type}_{i-1}(\mathit{argExp}_1, N_1) = \mathrm{type}_{i-1}(\mathit{argExp}_2, N_2)$$

for:

    rcvrExp1 max: argExp1 (in context N1)
    rcvrExp2 max: argExp2 (in context N2)
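The sharing decision can be sketched as grouping uses by the receiver and argument types recorded in the previous iteration. This is an illustrative simplification, not the published algorithm: the record `Use` and the function `shareTemplates` are hypothetical, and each use carries its previous-iteration types directly.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

type Types = Set.Set String

-- A use of max: together with the types found in iteration i-1.
data Use = Use
  { useName :: String   -- label for the send, e.g. "3 max: 4"
  , rcvr    :: Types    -- previous-iteration receiver types
  , arg     :: Types    -- previous-iteration argument types
  }

-- Uses whose receiver and argument types both agree share one template.
shareTemplates :: [Use] -> Map.Map (Types, Types) [String]
shareTemplates uses =
  Map.fromListWith (flip (++))
    [ ((rcvr u, arg u), [useName u]) | u <- uses ]
```

With the sends from the earlier template examples, `3 max: 4` and `2 max: 1` fall into one group, while `2.5 max: 1.3` gets a template of its own.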


Iterative Algorithm (2)

• Advantage: complete type information is available from previous iteration

• Iterative algorithm has more information available when making critical decision

• Iterative algorithm uses types of both receiver and arguments (but from the previous iteration)

• Termination is a key consideration
– after fixed number of steps (e.g. 5 - 7)
– when a fix point is reached
    • may never be reached in face of recursion

• With this algorithm (using fix-point termination), analysis time is proportional to amount of polymorphism in the program.


Summary
