© Kenneth C. Louden, 2005
Chapter 9 – Abstract Data Types and Modules
Programming Languages:
Principles and Practice, 2nd Ed.
Kenneth C. Louden
Chapter 9 K. Louden, Programming Languages 2
Introduction
Built-in types have important properties that "abstract" away the implementation: use of int and its operations (+, *, etc.) normally does not require knowledge of bit patterns (2's complement? 4 bytes?).
User-defined types do not in general have this property: internal structure is visible to all code.
Code that uses the internal structure makes that structure difficult to change later.
Operations on the data (except the most basic) are not specified and are often hard to find.
Abstract Data Type
A mechanism of a programming language designed to imitate the abstract properties of a built-in type as much as possible.
Must include a specification of the operations that can be applied to the data.
Must hide the implementation details from client code.
These properties are sometimes called encapsulation & information hiding (with different emphases).
See p. 357 for a more detailed description of these two properties.
Algebraic Specification
Sometimes confused with ADT mechanisms.
Involves the specification of the properties of the operations in the form of mathematical equations.
Such "semantic" properties should be part of an ADT mechanism, but usually aren't.
Can be difficult:
– Do the equations "make sense"?
– Can all necessary properties be proven from them?
Algebraic Specification (cont.)
Consists of two parts:
– Syntactic specification of the operations (the signature of the type)
– Semantic specification (the axioms defining the semantics of the operations)
Signature usually written in mathematical form: f : A × B → C instead of (in C syntax) C f(A,B);
Axioms use some notion of value equality (more like Java .equals() than ==).
Stack Example (p. 363)
type stack(element) imports boolean
operations:
  createstk : stack
  push : stack × element → stack
  pop : stack → stack
  top : stack → element
  emptystk : stack → boolean
variables: s: stack; x: element
axioms:
  emptystk(createstk) = true
  emptystk(push(s,x)) = false
  top(createstk) = error
  top(push(s,x)) = x
  pop(createstk) = error
  pop(push(s,x)) = s
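The six stack axioms above can be checked mechanically against a candidate implementation. The following is a minimal sketch in C, not the text's code: it assumes int elements, represents createstk as a null pointer, and reports the two "error" axioms through a false return value. The names StackNode and stack_axioms_hold are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Immutable linked representation: NULL plays the role of createstk. */
typedef struct StackNode {
    int elem;
    const struct StackNode *below;
} StackNode;
typedef const StackNode *Stack;

Stack createstk(void) { return NULL; }

Stack push(Stack s, int x) {
    StackNode *node = malloc(sizeof *node);
    assert(node != NULL);   /* allocation failure is outside the specification */
    node->elem = x;
    node->below = s;
    return node;
}

bool emptystk(Stack s) { return s == NULL; }

/* The "error" results become a false return value; *out is left untouched. */
bool pop(Stack s, Stack *out) {
    if (s == NULL) return false;
    *out = s->below;
    return true;
}

bool top(Stack s, int *out) {
    if (s == NULL) return false;
    *out = s->elem;
    return true;
}

/* Checks all six axioms for one choice of s and x (the pushed node is not freed). */
bool stack_axioms_hold(Stack s, int x) {
    Stack popped;
    int t;
    Stack sx = push(s, x);
    return emptystk(createstk())
        && !emptystk(sx)
        && !top(createstk(), &t)
        && top(sx, &t) && t == x
        && !pop(createstk(), &popped)
        && pop(sx, &popped) && popped == s;
}
```

A check like this only tests the axioms for particular values of s and x; it does not prove them for all values.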
Stack Example (2)
Note use of "imports" clause to specify dependencies on other types.
Note parameterization of stack type by element. This is a "generic" stack specification (instead of, say, an integer stack).
"Operations" section is the signature.
"Axioms" section is the semantic specification, using names defined in the "variables" section.
Stack Example (3)
There are six axioms – are these enough?
Rule of thumb for axioms: classify operations into constructors and inspectors:
– Constructors create new values in the type. E.g., createstk and push are stack constructors.
– Inspectors pull out data, including already existing values of the type, as well as other data contained in a value of the type. E.g., pop, top, and emptystk are inspectors.
There must be one axiom for each inspector-constructor pair. E.g., 3*2 = 6 axioms for stack.
There are two "error" axioms. In practice, these will involve exceptions.
Inspectors can be further classified into predicates (boolean-valued inspectors) and selectors.
Queue Example (p. 361)
type queue(element) imports boolean
operations:
  createq: queue
  enqueue: queue × element → queue
  dequeue: queue → queue
  frontq: queue → element
  emptyq: queue → boolean
variables: q: queue; x: element
axioms:
  emptyq(createq) = true
  emptyq(enqueue(q,x)) = false
  frontq(createq) = error
  frontq(enqueue(q,x)) = if emptyq(q) then x else frontq(q)
  dequeue(createq) = error
  dequeue(enqueue(q,x)) = if emptyq(q) then q else enqueue(dequeue(q),x)
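The recursive queue axioms can be read almost directly as function definitions. The sketch below is an illustration under stated assumptions, not the text's code: elements are ints, a queue value is stored as the term that built it (NULL for createq, a node for enqueue(q,x)), and the "error" cases are modeled with assert. The type name QueueTerm is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* A queue value is a term: NULL stands for createq, a node for enqueue(q, x). */
typedef struct QueueTerm {
    const struct QueueTerm *q;
    int x;
} QueueTerm;
typedef const QueueTerm *Queue;

Queue createq(void) { return NULL; }

Queue enqueue(Queue q, int x) {
    QueueTerm *term = malloc(sizeof *term);
    assert(term != NULL);
    term->q = q;
    term->x = x;
    return term;
}

/* emptyq(createq) = true; emptyq(enqueue(q,x)) = false */
bool emptyq(Queue q) { return q == NULL; }

/* frontq(enqueue(q,x)) = if emptyq(q) then x else frontq(q) */
int frontq(Queue t) {
    assert(t != NULL);   /* frontq(createq) = error */
    return emptyq(t->q) ? t->x : frontq(t->q);
}

/* dequeue(enqueue(q,x)) = if emptyq(q) then q else enqueue(dequeue(q),x) */
Queue dequeue(Queue t) {
    assert(t != NULL);   /* dequeue(createq) = error */
    return emptyq(t->q) ? t->q : enqueue(dequeue(t->q), t->x);
}
```

For example, frontq applied to enqueue(enqueue(createq,1),2) recurses past the outer term, because its inner queue is not empty, and returns 1, the first element enqueued. This transcription favors closeness to the axioms over efficiency: each frontq and dequeue walks the whole term, and no memory is freed.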
Algebraic Specification Notes
Specifications are usually written in functional form with no side effects or assignment. So no "void" functions.
Specifications are often simplified to make axiom writing easier. E.g., in the stack example, pop does not return the top, only the (previously created) stack below the current top. We could have written pop as pop: stack → element × stack, but the axioms would be more complex (try it!).
Algebraic Specification Notes (cont.)
The classification of the operations into constructors and inspectors can be much more difficult than the stack example indicates.
Case in point: the queue example (p. 361). In the text, dequeue is said to be an inspector. In reality it is a constructor (called an inessential constructor, since it can be replaced by other constructors).
Inessential constructors are lumped with inspectors in the rule of thumb for the number of axioms.
ADT Language Mechanisms
Most languages do not have a specific ADT mechanism – instead they have a more general module mechanism (see later).
Common OO languages have a class mechanism, which has many of the properties needed by an ADT mechanism.
ML does have a specific ADT mechanism (now viewed as historical, since a newer module mechanism is more useful).
Queue ADT in ML
abstype 'element Queue = Q of 'element list
with
  val createq = Q [];
  fun enqueue(Q lis, elem) = Q(lis @ [elem]);
  fun dequeue(Q lis) = Q(tl lis);
  fun frontq(Q lis) = hd lis;
  fun emptyq(Q []) = true
    | emptyq(Q(h::t)) = false;
end;
ML interpreter responds:
type 'a Queue
val createq = - : 'a Queue
val enqueue = fn : 'a Queue * 'a -> 'a Queue
val dequeue = fn : 'a Queue -> 'a Queue
val frontq = fn : 'a Queue -> 'a
val emptyq = fn : 'a Queue -> bool
Queue ADT in ML (cont.)
Queue is parameterized by type 'element in ML, just like the algebraic specification (the queue example, p. 361). [ML reports this type as 'a in its response.]
Queue is implemented internally by a list, but this is not revealed by the interpreter (note hyphen instead of Q [] in createq response).
ML requires that the implementation be a new type, so the lists are embedded inside the (hidden) ML constructor Q (which could also be called Queue if we wished).
To see that this really meets the algebraic specification, we would still have to show the validity of the equations (exercise!).
Modules
An ADT mechanism is limited to the definition of a single type and operations on that type. This is inadequate for structuring large programs.
OO classes (Chapter 10) are more dynamic and versatile than ADT mechanisms.
Thus, languages usually offer a more general construct – the module – that is useful for structuring large programs.
Modules (2)
Definition: a module is a program unit with a public interface and a private implementation; all services that are available from a module are described in its public interface and are exported to other modules, and all services that are needed by a module must be imported from other modules.
Thus, a module offers general services, which may include types and operations on those types, but are not restricted to these.
Modules have nice properties:
– A module can be (re)used in any way that its public interface allows.
– A module implementation can change without affecting the behavior of other modules.
Modules (3)
Modules are the principal mechanism used to decompose large programs.
Example – a compiler (figure not reproduced).
Modules usually offer an additional benefit: names within one module do not clash with names in other modules.
Modules usually have a close relationship to separate compilation (though this is often hard to make precise in a specification).
Modules (4)
Languages that have comprehensive module mechanisms:
– Ada, where they are called packages (not to be confused with Java packages)
– ML, where they are called structures
Languages that have weak mechanisms with some module-like properties:
– C++, where they are called namespaces
– Java, where they are called packages
Languages with no module mechanism:
– C (but modules can be imitated using separate compilation)
– Pascal
Ada Package Example
Ada uses a package specification to define the public interface of a package:
generic
  type T is private;
package Queues is
  type Queue is private;
  function createq return Queue;
  function enqueue(q:Queue;elem:T) return Queue;
  function frontq(q:Queue) return T;
  function dequeue(q:Queue) return Queue;
  function emptyq(q:Queue) return Boolean;
private
  type Queuerep;
  type Queue is access Queuerep;
end Queues;
Ada Package Example (2)
Note use of generic keyword to specify a Queue as a parameterized type.
Note attempt to keep the details of the exact data structure used for Queue hidden by using the incompletely specified Queuerep type and making Queue a reference or pointer (an access type) to Queuerep (required, otherwise compiler would not know how much space to allocate for a Queue).
Use of private declarations restricts use of Queue to assignment and equality testing.
Package specification must be accompanied by a package body giving the implementation (either in same or separate file) – see next slide.
Ada Package Example (3)
Package body (see pp. 379-380):
package body Queues is
  type Queuerep is
    record
      data: T;
      next: Queue;
    end record;
  function createq return Queue is
  begin
    return null;
  end;
  ...
end Queues;
Use of this code:
with Queues; -- specifies dependency on Queues package
procedure Quser is -- note dot notation below
  package IntQueues is new Queues(Integer);
  iq: IntQueues.Queue := IntQueues.createq;
begin
  iq := IntQueues.enqueue(iq,3);
  iq := IntQueues.dequeue(iq);
end Quser;
ML Structure Example
ML uses a signature to specify the interface of a structure:
signature QUEUE =
sig
  type 'a Queue
  val createq: 'a Queue
  val enqueue: 'a Queue * 'a -> 'a Queue
  val frontq: 'a Queue -> 'a
  val dequeue: 'a Queue -> 'a Queue
  val emptyq: 'a Queue -> bool
end;
Note use of sig ... end to construct an actual value for the QUEUE signature.
Note lack of specification of the details of the Queue data type.
ML Structure Example (2)
ML structure definition uses the QUEUE signature to describe its public interface:
structure Queue1: QUEUE =
struct
  datatype 'a Queue = Q of 'a list
  val createq = Q [];
  fun enqueue(Q lis, elem) = Q (lis @ [elem]);
  fun frontq (Q lis) = hd lis;
  fun dequeue (Q lis) = Q (tl lis);
  fun emptyq (Q []) = true
    | emptyq (Q (h::t)) = false;
end;
Note use of struct ... end to construct an actual value for the Queue1 structure.
Since actual structure of the Queue data type is not part of the signature, it is not usable outside the structure code.
ML Structure Example (3)
Queue1 structure can now be used as follows:
(* Note dot notation below *)
- val q = Queue1.enqueue(Queue1.createq,3);
val q = Q [3] : int Queue1.Queue
- Queue1.frontq q;
val it = 3 : int
- val q1 = Queue1.dequeue q;
val q1 = Q [] : int Queue1.Queue
- Queue1.emptyq q1;
val it = true : bool
Notes:
– Dependence on previous code not made explicit; compiler must still somehow find the code.
– Even though details of implementation cannot be used in this code (e.g. user code such as val q = Q [1] is illegal), the use of a list is still visible.
– Code cannot just refer to signature QUEUE, it must specify an implementation.
C++ Namespaces & Java Packages
C++ and Java do not have modules in the sense of Ada and ML: classes are used instead.
C++ and Java do have mechanisms for controlling name clashes and organizing code into groups: namespaces in C++, packages in Java.
Clients must use qualified names, similar to the dot notation of Ada and ML, to reference names in namespaces/packages (C++ writes :: where the others write a dot).
Each of these languages has a mechanism for automatically dereferencing names:
– Ada: use
– ML: open
– C++: using [namespace]
– Java: import
Only Ada has explicit dependency syntax (keyword with).
Java class loader automatically searches for code.
C++ requires textual inclusions for declarations; the linker must search for code.
ML "compilation manager" does this too (not in ML specification).
Imitating Modules in C
Header file queue.h (declarations only):
struct Queuerep; /* incomplete type */
typedef struct Queuerep * Queue;
Queue createq(void);
/* void* imitates polymorphism */
Queue enqueue(Queue q, void* elem);
void* frontq(Queue q);
Queue dequeue(Queue q);
int emptyq(Queue q); /* no boolean type in C before C99 */
Both code file and client code textually import the header (#include "queue.h", see next slide).
The linker must be given the location of the code file. Correct use depends entirely on the skill of the programmer.
Imitating Modules in C (2)
Code file queue.c:
#include "queue.h"
struct Queuerep
{ void* data;
  Queue next;
};
Queue createq(void)
{ return 0;
}
/* etc. */
Client code:
#include "queue.h"
/* etc. */
int x[] = {2}; /* array gives indirection */
int y[] = {3};
Queue q = createq();
q = enqueue(q,x);
q = enqueue(q,y);
q = dequeue(q);
int* z = (int*) frontq(q);
/* etc. */
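The slide elides the remaining functions of queue.c. One possible way to fill them in is sketched below; this is an assumption for illustration, not the text's implementation. It repeats the queue.h declarations so that it compiles on its own, treats a Queue as a pointer to the front node of a singly linked list, and uses assert for the error cases.

```c
#include <assert.h>
#include <stdlib.h>

/* Declarations repeated from queue.h so the sketch compiles alone. */
struct Queuerep;
typedef struct Queuerep *Queue;

struct Queuerep {
    void *data;
    Queue next;
};

Queue createq(void) { return NULL; }

int emptyq(Queue q) { return q == NULL; }

/* Appends at the rear by walking the list; returns the (possibly new) front. */
Queue enqueue(Queue q, void *elem) {
    Queue node = malloc(sizeof *node);
    assert(node != NULL);
    node->data = elem;
    node->next = NULL;
    if (q == NULL) return node;
    Queue last = q;
    while (last->next != NULL) last = last->next;
    last->next = node;
    return q;
}

void *frontq(Queue q) {
    assert(q != NULL);   /* frontq of an empty queue is an error */
    return q->data;
}

/* Frees the front node and returns the rest of the queue. */
Queue dequeue(Queue q) {
    assert(q != NULL);   /* dequeue of an empty queue is an error */
    Queue rest = q->next;
    free(q);
    return rest;
}
```

Unlike the algebraic specification, this version updates the list in place, so a client must not keep using an old Queue value after passing it to enqueue or dequeue. Nothing in queue.h expresses that restriction.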
Problems with Modules
Modules are not types
– Modules sometimes used to imitate OO classes
– Module interface usually contains types, whose representations may be exposed
– Reference to types awkward: IntQueues.Queue (Ada); int Queue1.Queue (ML).
Modules are static
– Modules are primarily compile-time artifacts
– Use of a module to imitate a class (without exporting a type) results in only one available object (see pp. 391-392)
Modules do not control values of exported types
– Assignment can cause undesirable aliasing
– Equality tests may not be appropriate
– ML and Ada have some ability to control these (with effort)
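The "only one available object" point can be seen in a C file that keeps its data in file-scope static variables and exports only operations. The sketch below is hypothetical (the stk_ names and the fixed capacity are assumptions, not from the text): because no type is exported, a client cannot create a second stack.

```c
#include <assert.h>
#include <stdbool.h>

#define STK_CAPACITY 16

/* Hidden state: static at file scope gives internal linkage, so no other
   translation unit can name these variables. There is exactly one stack. */
static int stk_items[STK_CAPACITY];
static int stk_count = 0;

bool stk_empty(void) { return stk_count == 0; }

void stk_push(int x) {
    assert(stk_count < STK_CAPACITY);
    stk_items[stk_count++] = x;
}

/* Imperative style: removes and returns the top element. */
int stk_pop(void) {
    assert(stk_count > 0);
    return stk_items[--stk_count];
}
```

Getting a second stack would require either duplicating the file under different names or exporting a type, as queue.h does with Queue.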
Problems with Modules (2)
Objections of previous slide can be (mostly) overcome by using OO classes, with modules relegated to code organization status (C++, Java)
Significant problems still exist, even with OO (see next slide):
– Modules do not expose dependencies
– Modules do not express semantics
Problems with Modules (3)
Modules do not expose dependencies
– Only Ada documents compilation dependencies in code (keyword with).
– Hidden implementation dependencies can be worse: order relation is a common one
– C++ does a particularly good job of hiding these
– Ada uses constrained polymorphism (pp. 249-250, 395)
– Java uses interfaces such as Comparable, Comparator
– ML uses functor (example pp. 396-397)
Modules do not express semantics
– Universally ignored in today's languages
– Useful for proving code correctness
– Maybe some day...
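In C, the usual way to make a dependency on an order relation visible is to pass the comparison as an explicit function parameter, as the standard library's qsort does. The sketch below illustrates that idea under stated assumptions; max_element, Compare, and compare_ints are hypothetical names, not part of any library.

```c
#include <assert.h>
#include <stddef.h>

/* The order relation is an explicit parameter, in the style of qsort:
   negative, zero, or positive for a < b, a == b, a > b. */
typedef int (*Compare)(const void *a, const void *b);

/* Returns a pointer to the largest of n elements of the given size;
   base must point to at least one element. */
const void *max_element(const void *base, size_t n, size_t size, Compare cmp) {
    assert(n > 0);
    const char *best = base;
    for (size_t i = 1; i < n; i++) {
        const char *candidate = (const char *)base + i * size;
        if (cmp(candidate, best) > 0) best = candidate;
    }
    return best;
}

/* One possible order relation on int elements. */
int compare_ints(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}
```

The signature documents that an order is needed, but the void* parameters mean the compiler cannot check that the comparison matches the element type. Constrained polymorphism in Ada and functors in ML aim to provide that check.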
Mathematics of ADTs
Overview only – details beyond the scope of this course
Important requirements for axioms:
– Consistency: two different values are not identified
– Completeness: all required properties are expressed
– Independence: no axiom is provable from others (not as important as the other two)
Is there an actual type that implements the ADT?
Initial algebra: quotient algebra of the free algebra of terms by the "smallest" equivalence relation generated by the axioms
Initial algebra is the "largest" implementation: two elements are equal only if their equality is provable from the axioms
Mathematics of ADTs (2)
Principle of extensionality (from set theory): two data values are equal if all their contained elements are equal.
Initial algebra may not have this property
Final algebra: any two data values that cannot be distinguished by inspector operations are necessarily equal
Final algebra is the "smallest" implementation & always obeys the principle of extensionality
When final and initial algebra are the same, can say there is a "unique" implementation.
If not, which is "better?" (see text for example)
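The final-algebra notion of equality can be made concrete as a test that uses only inspector operations. The sketch below is an informal illustration, not a formal construction. It assumes an int stack stored as a linked list, models the error cases with assert, and has the hypothetical function indistinguishable compare two stacks using only emptystk, top, and pop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct StackNode {
    int elem;
    const struct StackNode *below;
} StackNode;
typedef const StackNode *Stack;

Stack createstk(void) { return NULL; }

Stack push(Stack s, int x) {
    StackNode *node = malloc(sizeof *node);
    assert(node != NULL);
    node->elem = x;
    node->below = s;
    return node;
}

bool emptystk(Stack s) { return s == NULL; }
int top(Stack s) { assert(s != NULL); return s->elem; }
Stack pop(Stack s) { assert(s != NULL); return s->below; }

/* Two stacks count as equal when no sequence of inspector calls
   (emptystk, top, pop) can tell them apart. */
bool indistinguishable(Stack a, Stack b) {
    while (!emptystk(a) && !emptystk(b)) {
        if (top(a) != top(b)) return false;
        a = pop(a);
        b = pop(b);
    }
    return emptystk(a) && emptystk(b);
}
```

Two stacks built by separate but identical sequences of push calls are different objects under pointer comparison, yet this test treats them as equal. This is the kind of value equality the axioms rely on.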