Post on 24-Aug-2020
Type Systems
31 March 1999 CS655 - P.F. Reynolds
Haskell & ML: Interesting Features
• Type inferencing
• Freedom from side effects
• Pattern matching
• Polymorphism
• Support for higher order functions
• Lazy patterns / lazy evaluation
• Support for object-oriented programming
Type Inferencing
• Def: ability of the language to infer types without having the programmer provide type signatures.
– SML e.g.:
     fun min (a: real, b)
       = if a > b
         then b
         else a
– type of a has to be given, but then that's sufficient to figure out
• type of b
• type of min
– What if type of a is not specified?
- could be ints
- could be reals...
Type Inferencing (cont)
• Haskell (as with ML) guarantees type safety
– Haskell example:
     eq a b = (a == b)
– a polymorphic function that has a return type of Bool,
• assumes only that its two arguments are of the same type and can have the equality operator applied to them.
– ML has a similar assumption, for what it calls equality types.
• Overuse of type inferencing in both languages is discouraged
– declarations are a design aid
– declarations are a documentation aid
– declarations are a debugging aid
Polymorphism
• ML:
     fun factorial (0) = 1
       | factorial (n) = n * factorial (n - 1)
– ML infers factorial is an integer function: int -> int
• Haskell:
     factorial (0) = 1
     factorial (n) = n * factorial (n - 1)
– Haskell infers factorial is a (numerical) function: Num a => a -> a
Polymorphism (cont)
• ML:
fun mymax(x,y) = if x > y then x else y
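– SML infers mymax is ambiguous: > is overloaded (SML '97 resolves this by defaulting to int)
fun mymax(x: real, y) = if x > y then x else y
– SML infers mymax is real: real * real -> real
• Haskell:
mymax(x,y) = if x > y then x else y
– Haskell infers mymax is an Ord function: Ord a => (a, a) -> a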
Polymorphism (Cardelli & Wegner)
• Universe, V, of all values
• A Type is a set of values selected from V (subset of V)
• Sometimes only way to enumerate is through constants and functions
• An Ideal is a type that satisfies certain "technical" properties
• (one would not identify as a type a set containing both integers and int -> int functions)
• All types found in programming languages are ideals
• (Value) Having a type ::= membership in a set.
• Because ideals can overlap, a value can have many types
• A type system (in a language) is a collection of ideals of V
• Languages provide support for defining which types are mappable onto ideals
More Terms
• Monomorphic Type System: a value belongs to at most one type
• Polymorphic Type System: a value may belong to many types
• Mostly Monomorphic . . . Mostly Polymorphic
– One or the other characterizes individual languages
• Polymorphism, as it relates to:
– values and variables: may have more than one type
– functions: arguments can be of more than one type
– types: operations are applicable to operands of more than one type
Polymorphism: A Taxonomy
Polymorphism
• Universal: infinite number of types with common structure
– Parametric: uniformity of type structure is achieved by type parameters
– Inclusion: object can belong to many different classes that need not be disjoint (subtypes & inheritance)
• Ad Hoc: finite set of potentially unrelated types
– Overloading: same name used to denote different functions. Use determined from context
– Coercion: a semantic operation required to convert an argument to a type expected by a function.
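As a rough illustration of the four kinds in the taxonomy, here is a minimal Python sketch. The names identity, Shape, Square, label and show are hypothetical and exist only for this example. Python's functools.singledispatch is used as a stand-in for overloading, since Python resolves it at run time rather than from static context.

```python
from functools import singledispatch
from typing import TypeVar

T = TypeVar("T")

# Parametric (universal): one definition, uniform over every type T.
def identity(x: T) -> T:
    return x

# Inclusion (universal): a Square is also a Shape, so it is accepted
# wherever a Shape is expected.
class Shape:
    def describe(self) -> str:
        return "shape"

class Square(Shape):
    def describe(self) -> str:
        return "square"

def label(s: Shape) -> str:
    return s.describe()

# Overloading (ad hoc): the same name denotes different functions,
# selected from the type of the argument.
@singledispatch
def show(x) -> str:
    return "other"

@show.register
def _(x: int) -> str:
    return "int " + str(x)

@show.register
def _(x: str) -> str:
    return "str " + x

# Coercion (ad hoc): the int operand is converted to float before the addition.
coerced = 1 + 2.5
```

The two universal forms work for an unbounded set of types, while show and the int-to-float conversion each cover a fixed, finite set, which is the distinction the taxonomy draws.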
Exploring Terminology…
• Is inclusion polymorphism a kind of parametric polymorphism?
– Consider invocation of a method (behavior) in C++ (Smalltalk): selection is based (parametrically) on type…
– Why is inclusion polymorphism not a form of parametric polymorphism?
• Are generics (templates) a form of universal polymorphism?
– Cardelli & Wegner: no
– Day et al.: yes (parametric)
• Is there a difference between/among subtypes, subclasses and inheritance?
– Subtypes: derived type's methods/data subsume parent type's
– Subclasses: structuring
– Inheritance: subtypes + subclasses -> specialization
Cardelli on Type Systems
• Type system
– purpose is to prevent occurrence of execution errors during runtime
• Type Sound Language
– absence of execution errors holds for all program runs that can be described in a programming language
• Typechecker
– method for determining if type errors occur
– ambiguities in language specifications often lead to different type checker implementations, hampering language soundness.
• Type
– “Upper bound” (maximal set) on range of values a variable can take on
• Typed Language
– one in which variables can be given (nontrivial) types
How about “can assume”?
More Cardelli on Types
• Explicit / implicit typing
– as names suggest…
• Trapped errors
– execution error when computation stops “immediately”
• Untrapped errors
– execution errors that go unnoticed and cause arbitrary behavior
• Safe program fragment
– one that does not (cannot?) cause untrapped errors to occur
• Safe language
– one in which all program fragments are safe
Safety and Typed Languages
• “Untyped languages may enforce safety by performing run time checks.”
• “Typed languages may also use a mixture of run time and static checks.”
-- Is an untyped language that enforces safety comprehensively at run time equivalent to a typed language that uses run time checks exclusively?
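To make "enforcing safety by run time checks" concrete, here is a small Python sketch. Python is used only as an example of a language without static type declarations, and the helper names read_slot and add_untyped are hypothetical. Both faulty operations are stopped at the point of the fault, so they are trapped errors in the sense above rather than untrapped ones.

```python
def read_slot(values, index):
    """Return values[index], reporting a trapped error instead of misbehaving."""
    try:
        return ("ok", values[index])
    except IndexError as err:
        # The run time bounds check stopped the computation at the faulty access.
        return ("trapped", type(err).__name__)

def add_untyped(a, b):
    """No declared types: the check on operand kinds happens at run time."""
    try:
        return ("ok", a + b)
    except TypeError as err:
        return ("trapped", type(err).__name__)

results = [
    read_slot([10, 20, 30], 1),   # in range
    read_slot([10, 20, 30], 7),   # out of range: trapped, no arbitrary memory is read
    add_untyped(1, 2),
    add_untyped(1, "x"),          # mismatched operands: trapped at the addition
]
```

In an unsafe language such as C, the out-of-range read could instead go unnoticed and cause arbitrary behavior, which is what makes it an untrapped error.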
Off on Good Behavior
• Forbidden errors
– all untrapped errors plus some trapped errors
– (what trapped errors might be included?)
• Good Behavior (well behaved)
– no forbidden errors occur
– a well behaved program fragment is safe
• Strongly checked language
– One in which all (legal) program fragments have good behavior
• no untrapped errors occur
• none of the specified trapped errors occur
• other trapped errors may occur - programmer must avoid them
– (notice avoidance of “strongly typed”)
Safety and Typed
            Typed     Untyped
Safe        ML        LISP
Unsafe      C         Assembler
-- Cardelli argues languages should be safe and typed (Should type system be implicit, or explicit, or both?)
ML Type Inferencing (Hindley-Milner)
• Key concepts:
– Type variables
– Substitution
– Unification
– Most general unifiers
– Inferencing
Type Variables/Instances
• Type variables:
     tyvar ::= 'identifier
     e.g.: 'a 'b 'm
– provide for polymorphism
• Type instances:
int
Most General Unifiers
• Make no unnecessary assumptions:
– 'a list and 'b are unified by 'a --> int list, 'b --> int list list
– 'a list and 'b are unified by 'b --> 'a list
• Which unifier is more general? σ1 is an instance of σ2 iff there exists τ such that σ1 = τ σ2
• For example above: τ = 'a --> int list
• The most general unifier of types t1 and t2 is a substitution σ such that:
– t1 and t2 are unified by σ
– and there is no more general σ' that also unifies t1 and t2
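The definition above can be sketched with a small unifier in Python. This is an illustrative sketch, not the algorithm of any particular ML compiler. The representation is an assumption made for the example: type variables are strings such as "'a", every other type is a tuple whose first element is the constructor name, and the helper names is_var, apply, occurs, unify and list_of are hypothetical.

```python
def is_var(t):
    """Type variables are strings such as "'a"; other types are tuples."""
    return isinstance(t, str)

def apply(subst, t):
    """Apply a substitution (dict from variable to type) to a type."""
    if is_var(t):
        return apply(subst, subst[t]) if t in subst else t
    return (t[0],) + tuple(apply(subst, arg) for arg in t[1:])

def occurs(var, t, subst):
    """True if var appears inside t, which would make the binding cyclic."""
    t = apply(subst, t)
    if is_var(t):
        return t == var
    return any(occurs(var, arg, subst) for arg in t[1:])

def unify(t1, t2, subst=None):
    """Return a most general unifier of t1 and t2, or None if none exists."""
    subst = dict(subst or {})
    t1, t2 = apply(subst, t1), apply(subst, t2)
    if t1 == t2:
        return subst
    if is_var(t1):
        if occurs(t1, t2, subst):
            return None
        subst[t1] = t2            # bind the variable and assume nothing more
        return subst
    if is_var(t2):
        return unify(t2, t1, subst)
    if t1[0] != t2[0] or len(t1) != len(t2):
        return None               # different constructors cannot be unified
    for a, b in zip(t1[1:], t2[1:]):
        subst = unify(a, b, subst)
        if subst is None:
            return None
    return subst

INT = ("int",)

def list_of(t):
    return ("list", t)

mgu = unify(list_of("'a"), "'b")     # the second unifier on the slide
tau = {"'a": list_of(INT)}           # extra assumption that yields the first one
```

Applying tau after mgu sends 'b to int list list, so the first unifier on the slide is an instance of the second, and the second is the more general one.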
ML Type Inferencing Example
     fun find p [] = false
       | find p (x::S) = if p x then true
                         else find p S
• Initial type environment:
     false: bool
     true: bool
     if: bool * 'e * 'e -> 'e
• No assumptions about find:
     find: 'k
• lambda: introduce fresh type variables for parameters:
     p: 'i
     (x::S): 'j
ML Type Inferencing Example (2)
     fun find p [] = false
       | find p (x::S) = if p x then true
                         else find p S
• Analysis:
     (x::S) implies 'j --> 'c list
     p x implies 'i --> 'c -> 'l
     if p x implies 'l --> bool
     if ... true else find ... implies 'k --> 'm -> bool
     find p ... implies 'm --> ('c -> bool) * 'n
     find ... S implies 'n --> 'c list
• Composing all substitutions yields:
     (x::S): 'c list
     p: 'c -> bool
     find: ('c -> bool) * 'c list -> bool
– For the curried definition as written, SML itself reports find: ('c -> bool) -> 'c list -> bool; the analysis here treats the two arguments as a pair.
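The composition step can be checked mechanically. The Python sketch below records the six substitutions from the analysis and resolves each type variable until no bound variable remains. The tuple representation of types and the helper names resolve and show are assumptions made for this example. Like the analysis, it treats the arguments of find as a pair.

```python
BOOL = ("bool",)

# One entry per line of the analysis: variable --> type.
constraints = {
    "'j": ("list", "'c"),                        # (x::S)
    "'i": ("->", "'c", "'l"),                    # p x
    "'l": BOOL,                                  # if p x
    "'k": ("->", "'m", BOOL),                    # if ... true else find ...
    "'m": ("*", ("->", "'c", BOOL), "'n"),       # find p ...
    "'n": ("list", "'c"),                        # find ... S
}

def resolve(t, subst):
    """Replace bound type variables in t until none remain."""
    if isinstance(t, str):                       # a type variable such as "'k"
        return resolve(subst[t], subst) if t in subst else t
    return (t[0],) + tuple(resolve(arg, subst) for arg in t[1:])

def show(t):
    """Render a type in ML notation."""
    if isinstance(t, str):
        return t
    if len(t) == 1:
        return t[0]                              # nullary constructor such as bool
    if t[0] == "list":
        return show(t[1]) + " list"
    op, left, right = t
    l, r = show(left), show(right)
    if isinstance(left, tuple) and left[0] == "->":
        l = "(" + l + ")"                        # a function type on the left needs parentheses
    if op == "*" and isinstance(right, tuple) and right[0] == "->":
        r = "(" + r + ")"
    return l + " " + op + " " + r

find_type = show(resolve("'k", constraints))
p_type = show(resolve("'i", constraints))
```

Resolving 'k reproduces the type given for find, and resolving 'i reproduces the type given for p.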
Constraint-Based Type Inference and Parametric Polymorphism
Ole Agesen (Stanford) 1994
• Constraint-based analysis: technique for inferring implementation types
• Using flow analysis to build network of type variables connected by constraints
• Program can be viewed as collection of slots and expressions
     x <- 0        // slot declaration
     x := y + 1    // Expression (y+1) and slots (x,y)
Three Step Process: Steps 1 & 2
• Allocating "type variables" to every slot and expression
– Initially empty
– Process of type inference binds types to type variables
– On termination of inferencing, type variables hold "sound types"
– Inference exhibits monotonicity: types are only ever added to type variables.
• Seeding type variables
– Find obvious cases where type is known and assign to type variable
     e.g. x <- 0   // x's type is the type of the literal 0.
Three Step Process: Step 3
• Establishing constraints and propagating (repeat until termination)
– Connect type variables into a network by adding directed edges
• Nodes are type variables; edges are constraints
– Whenever constraint is added, object types are propagated
– One constraint generated for each data flow in the program
• assignment generates data flow from expr to assigned variable
• variable access generates data flow from variable to accessing expression
• message send (or func call) generates flows from actuals to formals and result generates flow back to message send (invocation point)
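The three steps can be sketched as a small worklist propagation in Python. This is a simplified illustration, not Agesen's implementation. The class name Network, its methods and the node labels are hypothetical, and message sends are left out so that only seeding, assignment and variable access appear.

```python
from collections import defaultdict, deque

class Network:
    """Type variables (nodes) joined by directed constraints (edges)."""

    def __init__(self):
        self.types = defaultdict(set)   # step 1: every type variable starts empty
        self.edges = defaultdict(set)   # node -> nodes its types flow to

    def seed(self, node, type_name):
        """Step 2: record an obviously known type, e.g. that of a literal."""
        self._add(node, {type_name})

    def constrain(self, source, target):
        """Step 3: add a data flow edge and propagate along it at once."""
        self.edges[source].add(target)
        self._add(target, self.types[source])

    def _add(self, node, new_types):
        work = deque([(node, set(new_types))])
        while work:
            current, incoming = work.popleft()
            added = incoming - self.types[current]
            if not added:
                continue                       # nothing new, so propagation stops here
            self.types[current] |= added       # monotonic: types are only added
            for successor in self.edges[current]:
                work.append((successor, added))

net = Network()
net.seed("x", "integer")          # x <- 0
net.constrain("x", "expr:x")      # variable access: x flows to the expression reading it
net.constrain("expr:x", "z")      # assignment z := x
net.seed("y", "float")            # y <- 2.5 (a second slot, assumed for the example)
net.constrain("y", "z")           # assignment z := y (access node omitted for brevity)
net.seed("x", "string")           # a type discovered later still reaches z
```

Because types are only ever added and the set of types is finite, the propagation terminates. Here z ends up holding integer, float and string.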
IF Expression Data Flows
[Figure: constraint network with nodes IF-expr, THEN-expr and ELSE-expr feeding the node for (IF test-expr THEN-expr ELSE-expr)]
Type of whole expression is the union of the types of THEN-expr and ELSE-expr
Templates
     max: a = ( self > a ifTrue: [self] False: [a] ).
[Figure: template for max: with type-variable nodes self, a, self > a, true, false and result]
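Template Examples
[Figure, left: one max: template shared by the sends 3 max: 4 and 2 max: 1; self, a and result are [integer]]
[Figure, right: one max: template shared by the sends 3 max: 4 and 2.5 max: 1.3; self, a and result are [integer, float]]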
Inference Algorithms
• Basic - just saw it.
– works when all uses of a method are "similar"
– fails when two or more uses of method supply different types of arguments, whether or not uses are individually polymorphic
• only one max template is created; may need more than one
• 1-level expansion: retype each method for each send invoking it.
– (would separate template shown on right on previous slide into two)
– Works when polymorphic call chain is only one level deep
– Fails when polymorphic call chain is > 1 level deep
– usual case
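The difference between the basic algorithm and 1-level expansion can be sketched in Python using the two max: sends from the template example. The function analyze and its template_key parameter are hypothetical names. The model assumes only that max: returns either self or a, so its result type is the union of both.

```python
def analyze(sends, template_key):
    """Infer a result type for every send of max:.

    sends maps a send's label to (receiver types, argument types).
    template_key decides which template a send connects to; sends that
    share a key share one template, and their types are merged in it.
    """
    templates = {}
    for label, (receiver, argument) in sends.items():
        slot = templates.setdefault(template_key(label), {"self": set(), "a": set()})
        slot["self"] |= receiver
        slot["a"] |= argument
    # max: returns either self or a, so the result is the union of both.
    return {
        label: templates[template_key(label)]["self"] | templates[template_key(label)]["a"]
        for label in sends
    }

sends = {
    "3 max: 4": ({"integer"}, {"integer"}),
    "2.5 max: 1.3": ({"float"}, {"float"}),
}

basic = analyze(sends, template_key=lambda label: "max:")      # one shared template
one_level = analyze(sends, template_key=lambda label: label)   # one template per send
```

With the shared template both sends are inferred as [integer, float]. With one template per send the results stay [integer] and [float]. The sketch has no nested calls, so it does not show the case where 1-level expansion fails.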
1-Level Expansion Algorithm: Failure
P-Level Expansion Algorithm
• Generalization of 1-Level algorithm
– expand to depth of p, then apply basic algorithm
• Size of expanded program is exponential in p !!!
• For any p, it's possible to find a code sequence that requires p+1 expansions.
Adaptive Inference Algorithms
• Precise inference algorithms must not mix types
– Types mix if two incompatible activation records are represented by the same template
– Create lots of templates so they represent fewer activation records
• Efficient type inference requires processing as few templates as possible
– Template creation carries a computational cost
• 1-level algorithm does poorly in both cases
• Desire algorithm that is precise and efficient
• Adaptive algorithms attempt to create templates only when needed
– Lump similar activation records in one template when possible
• Success is when algorithm operates at a cost proportional to the amount of polymorphism in the program.
Adaptive Inference Algorithms (2)
A critical condition in adaptive inference algorithms:
for the sends:
     rcvrExp1 max: argExp1 (in context N1) & rcvrExp2 max: argExp2 (in context N2)
share a template iff
     type(rcvrExp1, N1) = type(rcvrExp2, N2) and
     type(argExp1, N1) = type(argExp2, N2)
i.e. two uses of max: can share same template if they have same receiver and same argument types resp.
Problem is: we're trying to infer their types!
Adaptive algorithms use partial type info (or anything else) at decision point
Hash Function Algorithm
• Improves precision and efficiency significantly
• Hash function is computed when use of a method is processed:
     hash: Send x Template -> HashValue
– maps a use (send in context of a template) to a hash value
– template can be shared by two uses iff they have same hash value
• Tradeoffs between precision and efficiency determined by hash function
• Assume send S is being analyzed in context of a template N (for method containing S); analysis done for each possible receiver of the send:
     hash_family(S,N) = { (ρ1,S), (ρ2,S), ..., (ρk,S) } where
     type(S.receiver, N) = { ρ1, ρ2, ..., ρk }
Hash Function Algorithm (2)
• For hash value (ρi,S): the first component is a possible receiver
– two uses of method can share template only if they have same receiver object
– Precision improved in two ways:
• Inherited methods are reanalyzed in context of every object that inherits them
• Sends to self (common) can be analyzed more precisely because a single target can be found
• Second component of hash value is send itself
– => different sends connect to different templates
– essentially, an implementation of 1-level algorithm
– Sends with no arguments hashed to same template
• Note: receiver type may not be fully known when hash is done
– OK. Just go back and rehash when receiver type grows (monotonically)
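The hash family and the rehash step can be sketched in Python. This is an illustration under simplifying assumptions, not the algorithm as implemented for Self. Sends are represented by hypothetical labels "S1" and "S2", templates are stored in a plain dictionary keyed by hash value, and the function names hash_family and connect are invented for the example.

```python
def hash_family(send, receiver_types):
    """Hash values for one send: one (receiver type, send) pair per possible receiver."""
    return {(receiver, send) for receiver in receiver_types}

def connect(templates, send, receiver_types):
    """Connect a send to one template per hash value, creating templates on demand.

    Returns the hash values whose templates were newly created by this call.
    """
    created = set()
    for value in hash_family(send, receiver_types):
        if value not in templates:
            templates[value] = {"uses": set()}
            created.add(value)
        templates[value]["uses"].add(send)
    return created

templates = {}
first = connect(templates, "S1", {"integer"})
grown = connect(templates, "S1", {"integer", "float"})   # receiver type grew: rehash
other = connect(templates, "S2", {"integer"})            # a different send, same receiver type
```

Rehashing after the receiver type grows only adds the template for the new receiver. The send "S2" gets its own template even though its receiver type matches "S1", because the send is the second component of the hash value.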
Hash Function Algorithm (3)
• Algorithm controls polymorphism at receiver well
e.g.
     444 value: nil With: nil With: nil.
     3.5 value: 100 With: 100 With: 100
will map to receiver types of [integer] and [float] resp., i.e. the two sends do not interfere. Two distinct sets of templates will be created.
Hash algorithm clearly fails on polymorphic arguments since argument types are not considered
• One work-around is to always force templates to not be shared for methods such as ifTrue:False:
– Has obvious computational costs
Iterative Algorithm
• A significant improvement over hash function algorithm
• First iteration is to apply the basic algorithm
– creates one shared template for each method
• In subsequent iterations, less is shared
• Key idea: use type information on previous iteration to decide whether or not to share templates in current iteration.
     Share a template iff
          type_{i-1}(rcvrExp1, N1) = type_{i-1}(rcvrExp2, N2) and
          type_{i-1}(argExp1, N1) = type_{i-1}(argExp2, N2)
     for:
          rcvrExp1 max: argExp1 (in context N1)
          rcvrExp2 max: argExp2 (in context N2)
Iterative Algorithm (2)
• Advantage: complete type information is available from previous iteration
• Iterative algorithm has more information available when making critical decision
• Iterative algorithm uses types of both receiver and arguments (but from the previous iteration)
• Termination is a key consideration
– after fixed number of steps (e.g. 5 - 7)
– when a fix point is reached
• may never be reached in face of recursion
• With this algorithm (using fix-point termination), analysis time is proportional to amount of polymorphism in the program.
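The sharing rule and the two termination criteria can be sketched in Python. This is a deliberately small model, not the algorithm as published. All sends are sends of max: at top level, so the contexts N1 and N2 are ignored, and max: is assumed to return the union of its receiver and argument types. The names expr_type, run_round and iterative are hypothetical. An operand is either a set of literal types or the label of another send whose result it uses.

```python
def expr_type(expr, results):
    """Type of an operand: a literal's own types, or the result of the send it names."""
    return results[expr] if isinstance(expr, str) else set(expr)

def run_round(sends, key_of):
    """One round of the basic algorithm with a fixed send-to-template mapping."""
    results = {label: set() for label in sends}
    changed = True
    while changed:
        changed = False
        merged = {}
        for label, (rcvr, arg) in sends.items():
            merged.setdefault(key_of[label], set()).update(
                expr_type(rcvr, results) | expr_type(arg, results))
        for label in sends:
            if merged[key_of[label]] != results[label]:
                results[label] = set(merged[key_of[label]])
                changed = True
    return results

def iterative(sends, max_rounds=5):
    """Repeat rounds, sharing templates according to the previous round's types."""
    key_of = {label: "shared" for label in sends}      # round 1: the basic algorithm
    history = []
    for _ in range(max_rounds):                        # fixed bound as a fallback
        results = run_round(sends, key_of)
        history.append(results)
        new_keys = {
            label: (frozenset(expr_type(rcvr, results)),
                    frozenset(expr_type(arg, results)))
            for label, (rcvr, arg) in sends.items()
        }
        if new_keys == key_of:
            break                                      # fix point reached
        key_of = new_keys
    return history

sends = {
    "A": ({"integer"}, {"integer"}),    # 3 max: 4
    "B": ({"float"}, {"float"}),        # 2.5 max: 1.3
    "C": ("A", {"integer"}),            # (3 max: 4) max: 5
}
history = iterative(sends)
```

In the first round everything shares one template, so C is inferred as [integer, float]. The second round separates the sends and C becomes [integer]. The third round lets A and C share again, changes nothing, and the fix point is detected.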
Summary