Vol. 7, No. 6, July-August 2008

Efficient Integrity Checking for Essential MOF + OCL in Software Repositories

Miguel Garcia, Institute for Software Systems, Hamburg University of Science and Technology (TUHH), Germany

The efficient detection of run-time violations of integrity constraints (or their avoidance in the first place) has not been satisfactorily addressed for the combination of object model and constraint definition language most widely accepted in industry, namely OMG’s Essential MOF and Object Constraint Language (OCL). We identify the dimensions relevant to this problem, and classify existing proposals by their position in the solution space. After this comparative survey, we propose a solution for the efficient integrity checking of invariants expressed in OCL over the Essential MOF data model, and describe the software architecture of its implementation using object-relational mapping technology.

1 INTRODUCTION

Model-Driven Software Engineering (MDSE) encompasses traditional areas of both Language Design and Software Engineering (language definition and tooling, manipulation of programs and models, refinement of specifications into lower-level abstractions) following a unified conceptual and technical framework (metamodeling and declarative model transformations). By expressing a language definition as a metamodel, the information about abstract syntax and static semantics (including typing rules) becomes machine-processable, enabling language-aware manipulation along a toolchain in a reusable, declarative manner. Metamodels are expressed in Essential MOF (EMOF) [1] (covering structural aspects), and are extended with constraints expressed in OCL [2], to be evaluated over finite populations of instances. An OCL class invariant is a Boolean function over a database snapshot.

As MDSE techniques are applied to development processes of ever increasing complexity, additional demands are placed on the infrastructure supporting those processes. Software repositories [3] play a pivotal role in the management of software artifacts conforming to an EMOF data model, checking the integrity constraints given as OCL invariants. The task of runtime integrity checking has proven non-scalable if performed without regard for optimization techniques, yet many EMOF software repositories in use today do not adequately address this concern. Solving this industrially relevant problem requires identifying a calculus expressive enough to handle OCL yet tractable enough that optimizations of collection operations are feasible. Moreover, an empirical evaluation of the proposed approach should validate the findings before real-world deployment.

Cite this document as follows: Miguel Garcia: Efficient Integrity Checking for Essential MOF + OCL in Software Repositories, in Journal of Object Technology, vol. 7, no. 6, July-August 2008, pages 101–119, http://www.jot.fm/issues/issues 2008 7/article3.pdf

Integrity checking is an instance of the model-checking problem, i.e. determining whether a concrete world satisfies predicates. In turn, query evaluation is an area thoroughly studied in the academic literature. We follow the engineering approach of coherently combining existing scientific knowledge to solve an industrial problem. Our work falls just short of building a concrete product based on the technology choices made (because that’s a task for industry). Rather, we disclose the detailed reasoning behind our approach (which industry refrains from doing).

The structure of this article is as follows. Sec. 2 provides context on the artifacts subject to integrity checking in MDSE repositories, followed by a review in Sec. 3 of the strategies for integrity checking available to repository designers. Sec. 4 covers the often overlooked interplay between expressiveness of the constraint language and runtime cost of integrity checking. Sec. 5 presents a technology choice that balances these conflicting requirements. A review of the difficulties associated with checking computationally complete OCL can be found in Sec. 6, followed by the translation rules into the chosen calculus (Sec. 7) and a sample of the optimization techniques thus enabled (Sec. 8). Related work (Sec. 9) includes pointers to the main-memory case and to recent progress on integrity checking in the SQL/relational setting. Sec. 10 concludes. Familiarity with metamodeling techniques and object-oriented databases is assumed. Knowledge about OCL is helpful but not required.

2 ROLE OF ESSENTIAL MOF AND OCL IN MODEL-DRIVEN SOFTWARE ENGINEERING

The MDSE approach of adopting and extending results from previously separate disciplines can be seen at work in the best practices for defining the syntax, static semantics, and behavior of domain-specific languages (DSLs). Following MDSE principles, the abstract syntax of a DSL is represented as an object-oriented model (expressed in EMOF), thus attaining a number of advantages compared to an EBNF approach. This object-oriented model additionally captures the static semantics of the DSL (e.g., declare-before-use) in the form of invariants expressed in the Object Constraint Language (OCL). As shown in [4], the type checking rules of a DSL are also amenable to an OCL formulation, an area previously treated separately in DSL design. Additional benefits naturally emerge once the language definition is available as a metamodel (and can thus be processed mechanically):

• Abstract Syntax Trees (ASTs) can be exchanged with ease in a toolchain (e.g., between a compiler front-end and a static analyzer), fostering interoperability.

• The declarativeness of the OCL formulation allows applying formal techniques to language processing, in particular Hoare-style program verification of model-transformation algorithms, so as to know at transformation design-time whether well-formed output will always be generated for well-formed input [5].

• Prototypes exist [6, 7, 8] where an AST definition is augmented with annotations to univocally determine a concrete syntax. From this augmented definition, a generator can derive: (a) grammars for different parser generators, making parsers interchangeable; (b) classes whose instances represent Concrete Syntax Tree (CST) nodes, thus allowing for OCL to be used to query and constrain a CST; (c) a visitor to transform a well-formed CST (as checked with OCL) into an AST; (d) an unparser from CST to textual notation (i.e., a pretty-printer); and (e) a text editor supporting usability features such as syntax-directed completion, markers for violations of well-formedness, use-defs navigation, folding, and structural views.

• Following a similar approach, a concrete visual syntax can be defined, allowing for the generation of a diagram editor for the DSL in question [9].

3 INTEGRITY CHECKING IN MDSE SOFTWARE REPOSITORIES

Given the ubiquity of EMOF and OCL in MDSE, it comes as no surprise that software repositories are required to manage artifacts conformant to EMOF + OCL metamodels [10]. Infrastructural functionality expected of such repositories includes scalability, concurrent access, integrity checking and enforcement, versioning [11], and view maintenance. These capabilities in turn are needed to support higher-level use cases such as: traceability between requirement specs and implementation artifacts, impact analysis, refactoring, and avoidance of architectural erosion [12].

The implementers of some EMOF + OCL software repositories in use today have not paid enough attention to the formal foundation of those languages, with the end result that it can no longer be determined whether some tool behaviors are correct or not. Analyses of ambiguities in past revisions of the MOF and OCL specifications can be found in [13] and [14]. A formalism that offers rigorous precision is a good start, yet Fegaras and Maier define in [15] additional criteria for a calculus to be suitable for a query language:

• Coverage: whether the calculus has enough expressive power to represent all concepts of the query language. In the case of OCL, these concepts include aggregation, duplicate values, sort orders, several collection types (sets, bags, ordered sets, lists), negation, and user-defined (potentially recursive) functions.

• Ease of manipulation: expressions in the calculus should lend themselves to uniform matching and rewriting, such as in type-checking or optimization.

• Evaluation fitness: whether all valid query plans can be derived from an expression in the calculus. A formalism that expresses queries at too low a level of abstraction acts as a barrier to effective evaluation.

By relying on a formal calculus that is suitable with respect to OCL, precise definitions for the problems of query optimization, integrity checking, and view maintenance become possible, and correctness of their solutions can be examined. Efficient implementations are the next step. Before discussing a calculus that fulfills the above criteria, we elaborate on the alternative approach of directly anchoring the semantics of EMOF + OCL in terms of the Relational Data Model, turning OCL into a surface syntax. This would acknowledge the fact that results from the object-oriented and deductive database communities have become mainstream in SQL3 and are thus likely to be efficiently supported by conformant DBMSs. We see however some disadvantages with this approach:

• Pre-SQL3 relational formalisms do not fulfill the coverage criteria as defined above. Queries involving aggregation or sort orders need to be formulated as a mixture of relational algebra interspersed with control structures. Only those fragments bracketed between control structures are amenable to optimization.

• Post-SQL3 extended-relational formalisms strongly resemble the calculus adopted in our approach. Algorithms for incremental view maintenance based on these formalisms can thus serve as a foundation for our solution architecture.

• It is more efficient to manipulate query plans at the highest level of abstraction possible. Once optimized, object-level queries can be cast in terms of relational algebra, thus opening the way for further potential optimizations.

• EMOF concepts cannot be mapped one-to-one to relational “counterparts”, thus making a direct relational anchoring non-trivial in itself. For example, a relational view may contain the primary keys of its base relations, while each object in an object view has a globally unique object-ID.

4 CONSTRAINT LANGUAGE EXPRESSIVENESS AND ITS IMPACT ON THE RUNTIME COST OF INTEGRITY CHECKING

There is a mutual dependency between the expressiveness of a constraint language and the computational complexity of evaluating integrity constraints upon updates to database state. Three categories can be distinguished:

1. Design-time avoidance of integrity violations: By carefully limiting the expressive power of the data model and constraint language, it is possible to determine at database schema design time whether some ordering of update transactions may violate the integrity constraints. After this proof has been carried out (e.g. based on algorithm model-checking as shown by Lamport in [16]) no run-time checks are needed. An example of this approach for a variant of the F-Logic language is presented in [17]. Actually there is still a run-time overhead in that each transaction is augmented with its generated weakest precondition. Those fragments of the precondition which cannot be proved to be implied by the database invariants have to be checked at runtime.

2. Run-time integrity checking with efficient evaluation: For more expressive constraint languages, not all integrity checks can be skipped at runtime. Nevertheless, the evaluation of those remaining checks can be made more efficient than that for arbitrary formulas in first-order predicate logic (PSPACE-complete in the worst case for finite object graphs [18]). We aim at identifying the subset of OCL whose expressive power fits in this category. An algorithm for incremental view maintenance [19] optimizes integrity checking, as discussed in Sec. 5.

3. Run-time integrity checking with best-effort evaluation: For some specific combinations of database schemas and full-OCL invariants, custom checks are derived whose efficiency is comparable to that of category 2 above, sometimes using heuristics. For the remaining cases, large data sets have to be scanned. This approach is followed in [20] and [21] where the non-declarative subset of OCL is also adopted (including control structures and negation).

The chosen complexity of integrity checking (second item above) does not preclude ad-hoc queries from using full-OCL (and thus requiring full scans of entity extents in some cases). It seems questionable, however, for the formulation of an integrity constraint to require computational completeness, as the constraint is rendered non-declarative. Those constraints, if really needed, are best enforced by the business logic that manipulates the software repository, e.g. following Design-By-Contract [22], as recommended by best practices evolved over the years for the architecture of multi-tier information systems.

5 INCREMENTAL INTEGRITY CHECKING FOR OCL

Integrity enforcement comprises two runtime phases: (a) violation detection and (b) consistency restoration. For each OCL invariant, a view is defined to hold the object-IDs of those instances not fulfilling it (a denial view). At transaction commit time, all such views should be empty, otherwise a consistency restoration policy is to be applied (rollback, compensating action, or postponing consistency restoration altogether). Policies for consistency restoration are outside the scope of this article. Given that most transactions leave the majority of invariants unaffected, full recomputation of views after each update is impractical. Instead, incremental maintenance is preferred, a process comprising design and runtime activities:

1. At database design time, each view definition is mechanically analyzed to determine which update operations (when performed on certain data elements) affect the result set.

2. For such events, actions are generated to react to them, taking as input the delta caused by the update and using it to bring the materialized view into an up-to-date state (a self-maintenance strategy as opposed to querying the base extents).

3. At runtime, the planned actions are executed upon being triggered by the updates being monitored, performing change propagation.
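The three activities above can be illustrated with a minimal Python sketch. It is not the algorithm adopted later in the article: the invariant (every employee is at least 18 years old), the class names, and the hand-written event handlers are assumptions made for illustration, standing in for what a design-time analysis would generate mechanically.

```python
from dataclasses import dataclass, field


@dataclass
class Employee:
    oid: int
    age: int


def violates(e: Employee) -> bool:
    # Hypothetical invariant: every employee is at least 18 years old.
    return e.age < 18


@dataclass
class Repository:
    employees: dict = field(default_factory=dict)  # base extent, keyed by object-ID
    denial_view: set = field(default_factory=set)  # object-IDs violating the invariant
    checks: int = 0                                # number of single-object checks performed

    def _refresh(self, e: Employee) -> None:
        # Action: use only the delta (one object) to update the materialized view.
        self.checks += 1
        if violates(e):
            self.denial_view.add(e.oid)
        else:
            self.denial_view.discard(e.oid)

    # The three updates below are the only events that can affect the view;
    # in a real system this list would come from analyzing the view definition.
    def insert(self, e: Employee) -> None:
        self.employees[e.oid] = e
        self._refresh(e)

    def set_age(self, oid: int, age: int) -> None:
        e = self.employees[oid]
        e.age = age
        self._refresh(e)

    def delete(self, oid: int) -> None:
        del self.employees[oid]
        self.denial_view.discard(oid)

    def commit_allowed(self) -> bool:
        # At commit time the denial view must be empty.
        return not self.denial_view
```

In this sketch each update triggers a check of exactly one object, so the cost of keeping the denial view current does not depend on the size of the extent.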

An efficient algorithm for incremental view maintenance in an EMOF + OCL context is not as concise as the above summary might suggest because:

• Update operations on an object model are richer than their relational counterpart, given the additional collection types available (lists, ordered sets, bags).

• Method overriding is an issue in that a subclass may redeclare a side-effect-free operation (an OCL defined one), with that operation being used in an invariant. Instances of the subtype should have the overriding definition evaluated in place of the overridden one.

• Updates may have side effects, which in turn may affect invariants. These side effects result from inverse relationships maintained automatically in EMOF between two entities (its closest counterpart in relational databases is referential integrity). For example, upon deleting an instance which is bidirectionally linked to another, this second instance will have its reference cleared.

A concrete realization of the above ideas, satisfying the complexity requirements introduced in Sec. 4, is provided by the MOVIE algorithm for incremental view maintenance [19], explained in detail by its author in his PhD thesis [23]. A thorough performance evaluation [24] confirms its practical usefulness. The MOVIE algorithm is based on the translation of queries into the monoid calculus and their subsequent optimization, as discussed in [25] and [26]. The monoid calculus embodies the relational calculus, and has proven versatile enough to support both traditional as well as innovative optimizations. The software architecture of the proposed solution comprises:

1. The design time mapping of a model expressed in EMOF into a relational database schema (performed by a ready made component [27]). Data manipulation occurs at runtime only as EMOF-level update operations that are intercepted and matched against event patterns derived by MOVIE from view definitions for invariants.

2. The design time translation of OCL invariants into monoid calculus expressions. The resulting event patterns (derived by MOVIE for runtime interception) correspond to EMOF-level update operations. The accompanying actions generated by MOVIE to effect view maintenance are also EMOF-level updates.

The data definition, manipulation, and query languages (DDL, DML, DQL) of our solution are: EMOF, EMOF-level update operations, and full-OCL. The (incrementally maintainable) constraint language is the subset of OCL translatable into monoid calculus, and moreover valid as input for MOVIE (as defined in Sec. 4.1.2.2 of [23]). Although full-OCL is our standard DQL, nothing prevents the user from expressing read-only queries directly in SQL or in the ORM-level query language, JPQL [28] (Java Persistence Query Language, sometimes referred to as EJB3QL). Writing these “pass-through queries” in SQL requires knowledge of the mapping decisions encapsulated in item 1 above.

The barriers to efficient evaluation introduced by full-OCL are covered next, followed by an in-depth discussion of the translation of OCL into monoid calculus as a prerequisite to applying the MOVIE algorithm.

6 COMPUTATIONALLY COMPLETE CONSTRAINT LANGUAGE

Proposals using full-OCL for integrity constraints [20, 21] involve re-evaluating candidate broken invariants on a set of instances collected at runtime. The applied strategy consists of minimizing the number of relevant instances, instead of avoiding re-computing subexpressions whose value has not changed (e.g., by caching their values). This is a major difference with incremental view maintenance. The essential aspects of the full-OCL approaches are illustrated with two examples, including the difficulties introduced by recursion. For a more detailed presentation see Sec. 4 in [21].

A core aspect of [20] and [21] is the observation that for each data element on which an OCL invariant depends, it is possible to derive a navigation-based query in the direction from the data element back to the instance where the invariant is evaluated. In the wake of an update on some data element, these navigation paths lead to a set of instances relevant for re-evaluating the invariant in question. For example, in a scenario where Departments may have good and bad Employees (Figure 1(a)), an integrity constraint may require the union of two sets (all bad employees and those good employees over forty) not to contain a hobbyist:

context Department
inv noHobbyst : badEmps->union(goodEmps->select(age > 40))
                ->select(hasHobby)->isEmpty()

Figure 1: The noHobbyst example. (a) Departments have good and bad employees. (b) Reachability for noHobbyst (a tree over badEmps, goodEmps, hasHobby, and age).

Given a Department d, adding or removing employees (good or bad), as well as changing their hobby status may affect the invariant noHobbyst when evaluated for d. However, for this particular invariant, age updates are relevant only for good employees. This intuition is reflected in the reachability path shown as a tree in Figure 1(b). Thanks to bidirectional associations, upon an update to a node in that tree, the fixed-length path to the root can be followed to find the Department instance (i.e., d) on which invariant noHobbyst should be re-evaluated at transaction-commit time.
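A minimal Python sketch of this back-navigation follows. It is not taken from [20] or [21]: the inverse references `bad_in` and `good_in`, the helper functions, and the string-based description of an update are assumptions introduced here to make the fixed-length path explicit.

```python
class Department:
    def __init__(self, name):
        self.name = name
        self.bad_emps = set()
        self.good_emps = set()

    def no_hobbyist(self):
        # Mirrors the OCL invariant noHobbyst.
        candidates = self.bad_emps | {e for e in self.good_emps if e.age > 40}
        return not any(e.has_hobby for e in candidates)


class Employee:
    def __init__(self, age, has_hobby):
        self.age = age
        self.has_hobby = has_hobby
        self.bad_in = None   # assumed inverse of badEmps
        self.good_in = None  # assumed inverse of goodEmps


def add_bad(d, e):
    d.bad_emps.add(e)
    e.bad_in = d


def add_good(d, e):
    d.good_emps.add(e)
    e.good_in = d


def departments_to_recheck(e, changed_attr):
    """Follow the fixed-length path from an updated employee back to the root."""
    affected = set()
    if changed_attr == "has_hobby":
        # hasHobby is read for bad employees and for good employees alike.
        affected |= {d for d in (e.bad_in, e.good_in) if d is not None}
    elif changed_attr == "age":
        # age is read only under goodEmps, so only that path is followed.
        if e.good_in is not None:
            affected.add(e.good_in)
    return affected
```

An age update on a bad employee therefore yields an empty set of departments, matching the observation that such updates cannot affect noHobbyst.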

Special care is required for recursive functions ranging over dynamic data structures, as illustrated by the forward-only list of Figure 2. In that example, the invariant lastWagonHasLightsOn is fulfilled for a Wagon w in a train as long as the last wagon has its lights on. In this case, a statically fixed back-navigation path will not achieve the desired result, as the required number of links to traverse changes at runtime. A conservative approach consists in re-evaluating recursive invariants for all instances of their contexts, thus achieving completeness at the expense of efficient evaluation. It is not clear from [20, 21] how recursion over dynamic data structures is dealt with.

context Wagon
inv lastWagonHasLightsOn : f()

context Wagon::f()
def : if next.oclIsUndefined()
      then hasLightsOn
      else next.f()
      endif

Figure 2: The lastWagonHasLightsOn example
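The difficulty can be made concrete with a short Python sketch. The class below mirrors the OCL definition of Figure 2; the helper `wagons_to_recheck` is an assumption introduced here, showing that with forward-only links the affected wagons can only be found by scanning the whole extent and following a path whose length is not known statically.

```python
class Wagon:
    def __init__(self, has_lights_on, next_wagon=None):
        self.has_lights_on = has_lights_on
        self.next = next_wagon

    def f(self):
        # Mirrors the recursive OCL operation f().
        return self.has_lights_on if self.next is None else self.next.f()


def wagons_to_recheck(all_wagons, updated):
    """Wagons from which `updated` is reachable via next (including itself)."""
    affected = []
    for w in all_wagons:
        cur = w
        while cur is not None:      # number of hops depends on runtime data
            if cur is updated:
                affected.append(w)
                break
            cur = cur.next
    return affected
```

For a train w1 → w2 → w3, an update to w3 affects the invariant of every wagon, while an update to w1 affects only w1.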

7 TRANSLATION OF OCL INTO MONOID CALCULUS

Queries translated into the monoid calculus refer to the same object-oriented schema as their OCL counterparts. No schema mapping is needed because most EMOF constructs have a direct counterpart in the monoid calculus, with the following exceptions: (a) EMOF-level ordered sets (no duplicates, insertion order preserved) are represented as monoid lists; (b) EMOF dictionaries (Maps in Java) are represented as sets of (key, value) pairs, where pairs are monoid lists. Under these conventions, for the purposes of side-effect-free queries, the result of evaluating the monoid translation agrees with its original OCL formulation. For update purposes instead, these conventions would not be consistent, as for example monoid lists do not capture the semantics of EMOF ordered sets (which require membership testing before insertion). We do not claim to optimize updates, whose semantics are enforced by the ORM engine. The fact that the same data schema is shared by both OCL and monoid expressions makes it possible to optimize the monoid formulation without the additional complication of data mapping. No schema changes are introduced during rewriting for optimization. Finally, the optimized version is semantics preserving with respect to its original formulation.

An internal node in the AST of an OCL class invariant stands for a function application, with each subnode providing actual arguments. Some OCL constructs (e.g. let v = ... in ...) add identifiers to the scope visible in subtrees. Such syntax can be removed by expanding definitions, thus achieving the shape of “function application only” mentioned before. This rewriting does not alter meaning as OCL has call-by-value evaluation semantics. Terminal nodes are not tagged with function applications but with any of: (a) a literal constant; (b) the predefined OCL variable self; (c) entity extents of the form ClassName.allInstances(). The variable self ranges over an entity extent, namely that for the class where the invariant was defined. Unlike UML, there are no class-scoped attributes or associations in EMOF. We assume furthermore that invocations of user-defined, non-recursive functions have been replaced with their definitions (this may involve substituting usages of formal arguments by their corresponding actual arguments). To account for late binding (choosing a function definition based on the actual type of a usage instead of its declared type) a potentially verbose case distinction is needed. This is no obstacle in principle under whole-model analysis: all possible actual types are known at translation time and the actual type of an object can be queried with oclIsTypeOf(). After this pre-processing step, each internal node stands for the invocation of either an OCL predefined function or a user-defined (directly or indirectly) recursive function.
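The case distinction used to account for late binding can be sketched in Python as follows. The classes, the operation isSmall(), and the helper names are hypothetical; the sketch only assumes, as the text does, that every concrete type of the model is known at translation time. The test `type(obj) is exact_type` plays the role of oclIsTypeOf().

```python
class Artifact:
    def __init__(self, size):
        self.size = size


class Document(Artifact):
    pass


class Report(Document):
    pass


# Hypothetical side-effect-free operation isSmall(): declared in Artifact,
# overridden in Document; Report inherits the overriding definition.
DEFINITIONS = {
    Artifact: lambda self: self.size < 100,
    Document: lambda self: self.size < 10,
}


def resolve(cls):
    """Pick the definition that late binding would select for exact type cls."""
    for ancestor in cls.__mro__:
        if ancestor in DEFINITIONS:
            return DEFINITIONS[ancestor]
    raise LookupError(cls.__name__)


def build_case_distinction(known_types):
    """Translation-time step: one branch per concrete type in the model."""
    branches = [(t, resolve(t)) for t in known_types]

    def inlined(obj):
        for exact_type, body in branches:
            if type(obj) is exact_type:   # analogous to oclIsTypeOf()
                return body(obj)
        raise TypeError(type(obj).__name__)

    return inlined


is_small = build_case_distinction([Artifact, Document, Report])
```

The number of branches grows with the number of concrete types, which is the verbosity the text refers to.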

The Monoid Calculus

The monoid calculus provides a uniform notation for collections such as lists, bags and sets, based on the observation that the operations of set and bag union and list concatenation are monoid operations (that is, they are associative and have an identity element). Monoids for collection types are known as collection monoids. Operations like conjunctions and disjunctions on booleans and integer addition are instead primitive monoids. Borrowing notation from [25], a monoid of type T is a pair (⊕, Z⊕) where ⊕ is an associative function of type T × T → T and Z⊕ is the left and right identity of ⊕. A monoid may be commutative (i.e., when ⊕ is commutative), idempotent (i.e., when ∀x : x ⊕ x = x), or both. For example, (+, 0) is a commutative and anti-idempotent monoid, while (∪, {}) is both commutative and idempotent.
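The definition can be tried out with a small Python sketch. The `Monoid` class and the property checkers are names introduced here for illustration; the checkers only test the laws on a finite list of samples, so they can refute a property but do not prove it.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Monoid:
    op: Callable[[Any, Any], Any]   # the associative function ⊕
    zero: Any                       # the identity Z⊕


SUM = Monoid(lambda a, b: a + b, 0)             # primitive monoid (+, 0)
AND = Monoid(lambda a, b: a and b, True)        # primitive monoid (∧, true)
SET = Monoid(lambda a, b: a | b, frozenset())   # collection monoid (∪, {})
LIST = Monoid(lambda a, b: a + b, ())           # collection monoid (++, [])


def is_associative(m, xs):
    return all(m.op(m.op(a, b), c) == m.op(a, m.op(b, c))
               for a in xs for b in xs for c in xs)


def has_identity(m, xs):
    return all(m.op(m.zero, a) == a == m.op(a, m.zero) for a in xs)


def is_commutative(m, xs):
    return all(m.op(a, b) == m.op(b, a) for a in xs for b in xs)


def is_idempotent(m, xs):
    return all(m.op(a, a) == a for a in xs)
```

On samples such as `[(), (1,), (2, 3)]`, list concatenation passes the associativity and identity checks but fails the commutativity and idempotence checks, while set union passes all four.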

An expression of the form ⊕⟦ e | e1, …, en ⟧ is a comprehension over monoid ⊕. Unlike the prominent role granted in functional programming languages to list comprehensions, the notation above uniformly captures collection operations, whose kind is revealed by the outermost braces ([] for lists, {{ }} for bags, {} for sets). Each ei is a qualifier, which can either be a generator of the form v ← E, where v is a variable and E is a collection-valued expression, or a filter p (a boolean valued predicate). Informally, each generator v ← E sequentially binds variable v to the elements of expression E’s value, making it visible in successive qualifiers. A filter evaluating to true results in successive qualifiers (if any) being evaluated under the current bindings, otherwise 'backtracking' takes place. The head expression e is evaluated for those bindings that satisfy all the filters, and taken together these values constitute the resulting collection. For example, the following SQL-like nested query:

select distinct e(x)
from ( select d(y) from E as y where q(y) ) as x
where p(x)

is translated as { e(x) | x ← {{ d(y) | y ← E , q(y) }} , p(x) }
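The informal evaluation rule for comprehensions (generators bind variables in turn, filters prune bindings, the head is merged into the result) can be written down as a small Python evaluator. The function names, the tuple encoding of qualifiers, and the concrete choices for E, d, q, p and e are assumptions made for this sketch.

```python
def comprehension(zero, merge, unit, head, qualifiers, env=None):
    """Evaluate a monoid comprehension given the monoid (merge, zero) and unit."""
    env = {} if env is None else env
    if not qualifiers:
        return unit(head(env))                 # no qualifiers left: emit the head
    first, rest = qualifiers[0], qualifiers[1:]
    if first[0] == "gen":                      # ("gen", variable, source)
        _, var, source = first
        result = zero
        for value in source(env):
            inner = comprehension(zero, merge, unit, head, rest,
                                  {**env, var: value})
            result = merge(result, inner)
        return result
    _, predicate = first                       # ("filter", predicate)
    if predicate(env):
        return comprehension(zero, merge, unit, head, rest, env)
    return zero                                # filter failed: backtrack


def bag(head, *qualifiers):
    # Bags are modelled as Python lists whose order is ignored by the caller.
    return comprehension([], lambda a, b: a + b, lambda v: [v], head, qualifiers)


def set_(head, *qualifiers):
    return comprehension(frozenset(), lambda a, b: a | b,
                         lambda v: frozenset([v]), head, qualifiers)


# The nested query with illustrative choices: d(y) = y mod 3, q(y) = y > 1,
# p(x) = x > 0, e(x) = 10 * x.
E = [1, 2, 3, 4, 5, 6]
inner = bag(lambda env: env["y"] % 3,
            ("gen", "y", lambda env: E),
            ("filter", lambda env: env["y"] > 1))
result = set_(lambda env: env["x"] * 10,
              ("gen", "x", lambda env: inner),
              ("filter", lambda env: env["x"] > 0))
```

The inner bag keeps duplicates, whereas the outer set comprehension plays the role of `select distinct`.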

Applying a function f to each element in a collection (map f xs in Haskell) is thus expressed as ⟦ f(x) | x ← xs ⟧, while filter p xs becomes ⟦ x | x ← xs, p(x) ⟧. Comprehensions in turn are syntactic sugar for monoid homomorphisms, which express structural recursion on the collection constructor (++ for lists, ∪ for sets, ⊎ for bags), as shown pictorially in Figure 3 [29]. For example, taking ⊗ to be max(x,y) = case x<y of true -> y | false -> x makes (|−∞; max|) C find the maximum of collection C.

1. Monad Comprehensions 5

1.2.1 Catamorphisms

To stress this idea of deriving a recursive computation from the recursivestructure of the input collection, let us undertake a generalisation step. Givena collection [α] (or {|α}|, {α}) and values z :: β, ⊗ :: α× β → β we define theoverloaded mix-fix operator(|)| as

(|z;⊗ )| :: β× (α× β → β) → [α] → β(|z;⊗ )| xs ≡ case xs of [] → z

| x ↑ xs ′ → x ⊗ ((|z;⊗ )| xs ′)

Pictorially,(|z;⊗ )| is the spine transformer

↑x0

��↑

??

x1��

↑xn

��[]??

−→(|z;⊗ )|

⊗x0

��⊗

??

x1��

⊗xn

��z

???

and we can immediately see that we could have defined maximum ≡(|-∞;max)|.When applied to lists, the operator (| )| is known as foldr or reduce, espe-cially in the functional programming community. In more general collectionprogramming settings,(| )| is also known as sri (structural recursion on insert)[1.2, 1.21].

We can give an algebraic account of the nature of(| )|. Observe that(|z;⊗ )|is a solution to the equations below which effectively say that the unknownh is a homomorphism from monoid ([], ↑) to monoid (z, ⊗):

h [] = z (1.1a)

h (x ↑ xs) = x ⊗ h xs (1.1b)

It can be shown, based on the fact that ([], ↑) is the term or initial algebra of lists built using these two constructors, that (|z; ⊗|) is the unique solution to these equations, completely determined by z and ⊗ [1.16]. Homomorphisms of initial algebras have been dubbed catamorphisms [1.17] and this is the terminology we will adopt.

Caveat: Equation (1.1b) suggests that operator ⊗ of the target algebra must not be completely arbitrary: ⊗ needs to have the same algebraic properties as ↑: associativity, left-commutativity (if ↑ :: α × {|α|} → {|α|} or ↑ :: α × {α} → {α}), or left-idempotence (if ↑ :: α × {α} → {α}).

Catamorphisms are a versatile tool. A number of useful collection processing functions turn out to be catamorphisms:

maximum ≡ (|−∞; max|)
minimum ≡ (|+∞; min|)
or      ≡ (|false; ∨|)
and     ≡ (|true; ∧|)

Figure 3: Graphical representation of the homomorphism from monoid (↑, []) to (z, ⊗) (the latter not necessarily a collection monoid)
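The catamorphism of Figure 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the name cata and the use of Python lists to stand for the list monoid are ours, not part of [29]. The function performs structural recursion on the list constructor, mapping [] to z and x ↑ xs′ to x ⊗ ((|z; ⊗|) xs′).

```python
import math
import operator
from typing import Callable, Sequence, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def cata(z: B, op: Callable[[A, B], B]) -> Callable[[Sequence[A]], B]:
    """Return the catamorphism (|z; op|) over lists (hypothetical helper)."""
    def h(xs: Sequence[A]) -> B:
        if not xs:                   # h [] = z                  (1.1a)
            return z
        return op(xs[0], h(xs[1:]))  # h (x ↑ xs) = x ⊗ h xs     (1.1b)
    return h


# The four catamorphisms listed in the figure.
maximum = cata(-math.inf, max)
minimum = cata(math.inf, min)
any_true = cata(False, operator.or_)   # or  ≡ (|false; ∨|)
all_true = cata(True, operator.and_)   # and ≡ (|true; ∧|)
```

For instance, maximum([3, 7, 2]) unfolds to max(3, max(7, max(2, −∞))), and all_true([]) returns the unit true of the target monoid.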

Translation Rules

Transformations for languages with a number of syntactic constructs (such as OCL) take the form of LHS → RHS pattern-based substitutions, where each OCL construct is matched by exactly one LHS. The transformation algorithm can be shown to preserve meaning if each rewrite rule is proved meaning-preserving; this follows case by case from the definitions (in the respective semantic domains of OCL and monoid calculus). The rewrite rules are terminating because each application decreases the number of occurrences of OCL constructs available for matching, and they are confluent given that the LHSs partition the set of shapes that OCL constructs may take (each OCL construct being matched by one rewrite rule). Translation operates bottom-up from the leaves of the AST. For each node all required information is available locally due to pre-processing: no lookup of the correct binding for an OCL variable is needed, as no such usages are left except for self.
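The shape of such a rule set can be sketched as follows. The node classes (Var, Call, Select, Exists) and the textual output format are hypothetical simplifications introduced only for this illustration; the actual implementation works on the OCL metamodel and produces ASTs, not strings. Each node type is matched by exactly one rule, and children are translated before their parent.

```python
from dataclasses import dataclass


@dataclass
class Var:            # a variable reference such as c, e or self
    name: str


@dataclass
class Call:           # an opaque expression applied to one argument, e.g. p(e)
    fn: str
    arg: object


@dataclass
class Select:         # source->select(it | body)
    source: object
    it: str
    body: object


@dataclass
class Exists:         # source->exists(it | body)
    source: object
    it: str
    body: object


def translate(node) -> str:
    """Bottom-up translation; one rewrite rule per node type."""
    if isinstance(node, Var):
        return node.name
    if isinstance(node, Call):
        return f"{node.fn}({translate(node.arg)})"
    if isinstance(node, Select):
        src, body = translate(node.source), translate(node.body)
        return f"[ {node.it} | {node.it} <- {src}, {body} ]"
    if isinstance(node, Exists):
        src, body = translate(node.source), translate(node.body)
        return f"or{{ {body} | {node.it} <- {src} }}"
    raise TypeError(f"no rewrite rule for {type(node).__name__}")


# c->select(e | p(e))->exists(x | q(x))
ast = Exists(Select(Var("c"), "e", Call("p", Var("e"))), "x", Call("q", Var("x")))
text = translate(ast)
```

Because every recursive call consumes one node, the recursion terminates, and because the isinstance tests are mutually exclusive, the result does not depend on the order in which the rules are listed.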

Regarding the possible OCL constructs, Figure 4 depicts the relevant fragment of the OCL metamodel [30], i.e. the classes whose instances are nodes in an AST. As part of preprocessing, some constructs have been desugared (LetExp, VariableExp), while others do not appear in invariants (MessageExp). Occurrences of UnspecifiedValueExp, InvalidLiteralExp, and NullLiteralExp stand for the result of applying a partial function outside its domain. StateExp and TypeExp are functions that access instance-level data: the current state (given an associated statechart) and the actual type (which remains constant throughout the lifetime of the instance, as EMOF lacks dynamic reclassification). Related to this, the boolean operation oclIsKindOf() reports whether a pair of types belongs to the transitive closure of the direct subtype relationship ≤1 of EMOF + OCL [4].

OCL                               Monoid calculus
c->select( e | boolExpr(e) )      ⟦ e | e ← c, boolExpr(e) ⟧
c->reject( e | boolExpr(e) )      ⟦ e | e ← c, boolExpr(e) = false ⟧
c->exists( e | boolExpr(e) )      ∨{ boolExpr(e) | e ← c }
c->forAll( e | boolExpr(e) )      ∧{ boolExpr(e) | e ← c }
c->collect( e | expr(e) )         ⟦ expr(e) | e ← c ⟧
c->one( e | boolExpr(e) )         1 = length( [ e | e ← c, boolExpr(e) ] )
                                  where length(x) ≡ +[ 1 | e ← x ]

Table 1: Non-recursive subcases of LoopExp
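To make the rows of Table 1 concrete, the sketch below restates each of them as a Python comprehension. It assumes that Python lists stand in for the generic collection brackets ⟦ ⟧ and that any() and all() play the role of the ∨ and ∧ monoids; the function names are ours.

```python
def select(c, pred):
    # ⟦ e | e ← c, pred(e) ⟧
    return [e for e in c if pred(e)]


def reject(c, pred):
    # ⟦ e | e ← c, pred(e) = false ⟧
    return [e for e in c if not pred(e)]


def exists(c, pred):
    # ∨{ pred(e) | e ← c }
    return any(pred(e) for e in c)


def for_all(c, pred):
    # ∧{ pred(e) | e ← c }
    return all(pred(e) for e in c)


def collect(c, expr):
    # ⟦ expr(e) | e ← c ⟧
    return [expr(e) for e in c]


def length(x):
    # +[ 1 | e ← x ]
    return sum(1 for _ in x)


def one(c, pred):
    # 1 = length( [ e | e ← c, pred(e) ] )
    return 1 == length([e for e in c if pred(e)])
```

None of these definitions threads an accumulator through the iteration, which is why they do not need the function composition monoid.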


Figure 4: Fragment of the OCL 2.0 metamodel (only inheritance relationships shown)

OCL constructs of the form LiteralExp are translated as follows: (a) a literal of a primitive type (integer, real, string, or boolean) has a corresponding monoid constant, and the same goes for literals of user-defined enumerations; (b) a collection literal of type ordered set or list is translated as a monoid list, while set and bag collections have direct counterparts; (c) a tuple literal is translated as a set of pairs (tag, value).

The iterator expressions (LoopExp) comprise non-recursive subcases (Table 1). The remaining subcases are first desugared to their iterate() form as defined in the OCL standard ([30], Sec. 11.9 and A.3.1.3). iterate() in turn can be expressed as a left-fold. To capture this primitive recursive function, the function composition monoid (◦, λx.x) is needed [25], where function composition ◦, defined as (f ◦ g) x = f(g(x)), is associative but neither commutative nor idempotent. Even though the type of this monoid, T◦(α) = α → α, is parametric, it is still a primitive monoid. For a list L = [a₁, a₂, …, aₙ], applying ◦[λx. f(x, a) | a ← L] to z expands to ((λx. f(x, a₁)) ◦ … ◦ (λx. f(x, aₙ)))(z), which computes the left-fold f(… f(f(z, aₙ), aₙ₋₁) …, a₁). The formulation of OCL’s c->iterate(a ; acc=init | expr(acc,a)) is thus the comprehension ◦[λacc.expr | a ← c](init). The expressive power of comprehensions involving ◦ lies in their ability to compose functions that propagate a state during list iteration. For example, the reverse of list L is ◦[λx.x++[a] | a ← L]([]). Actually, the OCL standard defines the semantics of all LoopExp in terms of iterate(), but as can be seen from Table 1 the additional expressive power is not necessary, and may complicate optimization by hiding properties that ⊗ may exhibit (commutativity, idempotence).
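A comprehension over the function composition monoid can be sketched as below. The helper names compose and comp_comprehension are assumptions made for the illustration: each element a of the collection contributes the function λx. step(x, a), and the contributions are merged with ◦, whose unit is the identity function.

```python
from functools import reduce


def compose(f, g):
    """(f ◦ g) x = f(g(x))"""
    return lambda x: f(g(x))


def identity(x):
    return x


def comp_comprehension(step, coll):
    """◦[ λx. step(x, a) | a ← coll ]: one function per element, merged with ◦."""
    # a=a freezes the current element inside each lambda.
    fs = [lambda x, a=a: step(x, a) for a in coll]
    return reduce(compose, fs, identity)


def reverse(lst):
    """◦[ λx. x ++ [a] | a ← lst ]([])"""
    return comp_comprehension(lambda x, a: x + [a], lst)([])


def iterate(coll, init, expr):
    """◦[ λacc. expr(acc, a) | a ← coll ](init)"""
    return comp_comprehension(expr, coll)(init)
```

Note that the function built from the last element is applied first, so the accumulator meets the elements in the order aₙ, …, a₁, exactly as in the expansion above; this is what makes reverse([1, 2, 3]) evaluate to [3, 2, 1].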


OCL                        Monoid calculus
c->count(m)                +[ 1 | e ← c, e = m ]
c->excludes(m)             ∧{ e ≠ m | e ← c }
c1->excludesAll(c2)        ∧{ ∧{ e ≠ m | e ← c1 } | m ← c2 }
c->includes(m)             ∨{ e = m | e ← c }
c1->includesAll(c2)        ∧{ ∨{ e = m | e ← c1 } | m ← c2 }
c->isEmpty()               c = ⟦ ⟧
c->sum()                   +[ e | e ← c ]
c->size()                  +[ 1 | e ← c ]
c1->product(c2)            { (x, y) | x ← c1, y ← c2 }

Table 2: Standard OCL operations on all collection types
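The rows of Table 2 translate almost literally into Python. The sketch below is an illustration only (function names are ours, and Python lists and sets stand in for the OCL collection types); it also makes visible that the rows with nested quantifiers inspect up to |c1| · |c2| pairs, whereas the others make a single pass.

```python
def count(c, m):
    return sum(1 for e in c if e == m)              # +[ 1 | e ← c, e = m ]


def excludes(c, m):
    return all(e != m for e in c)                   # ∧{ e ≠ m | e ← c }


def excludes_all(c1, c2):
    # ∧{ ∧{ e ≠ m | e ← c1 } | m ← c2 }
    return all(all(e != m for e in c1) for m in c2)


def includes(c, m):
    return any(e == m for e in c)                   # ∨{ e = m | e ← c }


def includes_all(c1, c2):
    # ∧{ ∨{ e = m | e ← c1 } | m ← c2 }
    return all(any(e == m for e in c1) for m in c2)


def is_empty(c):
    return len(c) == 0                              # c = ⟦ ⟧


def ocl_sum(c):
    return sum(e for e in c)                        # +[ e | e ← c ]


def size(c):
    return sum(1 for _ in c)                        # +[ 1 | e ← c ]


def product(c1, c2):
    return {(x, y) for x in c1 for y in c2}         # { (x, y) | x ← c1, y ← c2 }
```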

In EMOF terminology, a structural feature is either (a) an instance field or association end; or (b) a method. Accessing the value of (a) is represented with PropertyCallExp. Invoking an (OCL-defined, side-effect free) method is represented with OperationCallExp. Therefore, occurrences of these constructs are translated as function application in monoid expressions. The sibling of PropertyCallExp (AssociationClassCallExp) is not relevant for EMOF, as class-scoped structural features are not allowed. The pending cases of OperationCallExp not translated so far comprise: (a) operations on the primitive types boolean, integer, real, and string; and (b) collection operations (not to be confused with iterator operations). The first group can be translated as-is given that all storage engines implement them natively. From the point of view of optimization, they are handled as black boxes. Translation rules for collection operations appear in Tables 2 to 4, classified by computational complexity, which is not apparent from the uniform OCL syntax.

The implementation of OCL AST transformations is discussed in [31], including techniques such as the encapsulation of walker code, instantiation of type-parametric visitors with type substitutions, and tracking the input-output relationship between AST nodes along a chain of visitors.

8 OPTIMIZATIONS WITH MONOID CALCULUS

The invariant noHobbyst (Figure 1 in Sec. 6) is amenable to a basic optimization, pushing selections below joins: the predicate hasHobby = true is applied only after partial results have been built, whereas applying it earlier shrinks those partial results. The vast body of query optimization algorithms is not applicable to the surface syntax of OCL: the same concept can be expressed in so many different ways that ease of manipulation (Sec. 3) is impracticable.

We claim that query optimization is required for two purposes in an EMOF + OCL setting: (a) for ad-hoc queries, and (b) to optimize expressions obtained from OCL invariants before their maintenance plans are derived by MOVIE. The case for (a) should be evident. As for (b), the authors of [21] observe that invariant rewriting may disconcert users, who would be faced with integrity violation errors based on expressions they have


OCL                              Monoid calculus
c1->excluding(c2)                ⟦ e | e ← c1, ∧{ d ≠ e | d ← c2 } ⟧
c->append(m)                     c ++ [m]
c->asBag()                       {{ e | e ← c }}
c->asOrderedSet(),               [ e | e ← c ]
c->asSequence()
c->asSet()                       { e | e ← c }
c->flatten()                     ⊕⟦ e | s ← c, e ← s ⟧
c->including(m)                  c ⊕ ⟦ m ⟧
c1->intersection(c2)             ⊕⟦ e | e ← c1, ∨{ e = d | d ← c2 } ⟧
c->prepend(m)                    ⟦ m ⟧ ⊕ c
c1->union(c2)                    c1 ∪ c2
c1->symmetricDifference(c2)      same translation as for
                                 (c1->union(c2))->excluding((c1->intersection(c2)))

Table 3: Overloaded collection operators (⊕ stands for the merge operator of the resulting collection monoid)

c->at(i)                   (◦⟦ λ(x, k).(if k = i then a else x, k − 1) | a ← c ⟧
                            (NULL, length(c))).fst
c->first()                 ◦⟦ λx.a | a ← c ⟧
c->last()                  same translation as for c->at(c->size())
c->indexOf(m)              (◦⟦ λ(x, k).(if a = m then k else −1, k − 1) | a ← c ⟧
                            (−1, length(c))).fst
c->subOrderedSet(j,k),     (◦⟦ λ(x, i). if j ≤ i ≤ k then ([a]++x, i − 1)
c->subSequence(j,k)                     else (x, i − 1) | a ← c ⟧ ([], length(c))).fst
c->insertAt(k,m)           (◦⟦ λ(x, i). if i = k then ([m]++[a]++x, i − 1)
                                        else ([a]++x, i − 1) | a ← c ⟧ ([], length(c))).fst

Table 4: Collection operations involving comprehensions of function composition
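Two rows of Table 4 are sketched below to show how a state (partial result, position counter) is threaded through the composed functions. The sketch assumes that the initial counter is the size of c and that positions are 1-based as in OCL; the helper names are ours. Since the function for the last element runs first, the counter k equals the position of the element being visited.

```python
from functools import reduce


def compose(f, g):
    return lambda s: f(g(s))


def run(steps, init):
    """Apply ◦[ steps ] to the initial state."""
    return reduce(compose, steps, lambda s: s)(init)


def at(c, i):
    """c->at(i): state is (x, k); x becomes a when the counter k hits i."""
    steps = [lambda s, a=a: (a if s[1] == i else s[0], s[1] - 1) for a in c]
    return run(steps, (None, len(c)))[0]       # .fst


def sub_sequence(c, j, k):
    """c->subSequence(j,k): prepend a while the counter lies within [j, k]."""
    steps = [
        lambda s, a=a: ([a] + s[0] if j <= s[1] <= k else s[0], s[1] - 1)
        for a in c
    ]
    return run(steps, ([], len(c)))[0]         # .fst
```

Prepending while walking from the last element to the first is what keeps the extracted subsequence in its original order.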

never seen before. As a consequence, rewriting in general (and optimization in particular) is explicitly avoided. The usability concern in question can be addressed by producing error messages from the original OCL invariant, which is evaluated once it is known (by optimized evaluation) that it has been broken. Actually, re-evaluation is inherent to the approach in [21], thus incurring no additional overhead.

The primitive operations supported by storage managers or query engines correspond to query algebra operators (semi-joins, selection supported by indexes, etc.). The monoid calculus takes advantage of this fact by offering a uniform framework for query translation, rewriting for optimization, and execution plan generation: query optimization can be made aware of the physical schema (table partitioning applied as part of ORM), saving I/O costs. To illustrate this kind of optimization, we show an end-to-end example of translation, optimization, and plan generation aware of the physical schema, adapted to the EMOF + OCL setting from [26].


Consider a database of films and the actors appearing in them (recording the number of scenes in which each actor appears, scenes), together with the films’ directors, as shown in Figure 5.

Figure 5: EMOF-level logical schema for the films, actors, and directors database

Assuming that most queries access either actors or directors, it makes sense to vertically decompose the logical schema into four tables (see Figure 6). Clustering table columns that are frequently accessed together avoids unnecessary I/O, as their values are stored physically contiguously.

Tables: Films (PK #, title); Cast (PK #, actor, scenes, film); Directors (PK #, director, film); Person (PK #, name). Four foreign-key (FK) links connect the tables.

Figure 6: Physical schema for the films, actors, and directors database

The OCL query below (in terms of the logical schema in Figure 5) returns the titles of Hitchcock-style films, i.e. those whose director appears as an actor in exactly one scene.

Film.allInstances()->select( f |
    f.directors->exists( d |
        f.cast->exists( c | c.scenes = 1 and c.actor = d ) ) )
  ->collect( f | f.title )

Its translation as a monoid comprehension is as readable as the OCL version:

{ f.title | f ← film,
            some{ some{ d = c.actor | c ← f.cast, c.scenes = 1 } | d ← f.directors } }
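The comprehension can be run directly over in-memory objects. The sketch below is an illustration under assumptions: the classes Film and Role, their fields, and the sample data are made up to mirror the logical schema, with persons represented by plain strings. The nested some{ … } monoids become nested any() calls.

```python
from dataclasses import dataclass, field


@dataclass
class Role:                 # one element of f.cast (hypothetical class)
    actor: str
    scenes: int


@dataclass
class Film:
    title: str
    directors: list = field(default_factory=list)
    cast: list = field(default_factory=list)


def hitchcock_style(films):
    # { f.title | f ← films,
    #   some{ some{ d = c.actor | c ← f.cast, c.scenes = 1 } | d ← f.directors } }
    return {
        f.title
        for f in films
        if any(
            any(d == c.actor for c in f.cast if c.scenes == 1)
            for d in f.directors
        )
    }


# Made-up sample data: only F1 has a director acting in exactly one scene.
films = [
    Film("F1", ["Ann"], [Role("Ann", 1), Role("Bob", 5)]),
    Film("F2", ["Cy"], [Role("Cy", 3)]),
    Film("F3", ["Dee"], [Role("Eve", 1)]),
]
titles = hitchcock_style(films)
```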


Before optimization can start, the connection to the physical schema is established by replacing film by its definition in terms of the vertically decomposed tables, using the nest operator to reconstruct the owned collections of actors and directors for each film. With that, the monoid comprehension can be normalized: all generators, i.e. the variables ranging over collections (f, d, c), appear first, followed by a predicate in conjunctive normal form. This not-yet-optimized formulation has a direct counterpart in query algebra (Figure 7), a tree of cartesian products. In principle, relational optimizations could start from there, thus guaranteeing that monoid-based optimizations do not end up in execution plans worse than those of relational optimization (e.g., exchanging two generators results in join reordering).

Figure 7: Query algebra formulation, non-optimized stage
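The normalized, non-optimized stage can be mimicked as a selection over a cartesian product. The sketch below is only an approximation under assumptions: the tuple layouts for Films, Directors and Cast are ours, persons are represented by plain strings instead of foreign keys into Person, and the nest step is skipped by comparing film keys directly. It also shows that exchanging generators, the comprehension-level counterpart of join reordering, leaves the result unchanged.

```python
from itertools import product

# Made-up rows for three of the vertically decomposed tables.
films = [(1, "F1"), (2, "F2")]                          # (id, title)
directors = [("Ann", 1), ("Cy", 2)]                     # (director, film)
cast = [("Ann", 1, 1), ("Bob", 5, 1), ("Cy", 3, 2)]     # (actor, scenes, film)


def normalized(films, directors, cast):
    """All generators first, then one conjunctive predicate."""
    return {
        title
        for (fid, title), (director, d_film), (actor, scenes, c_film)
        in product(films, directors, cast)
        if d_film == fid and c_film == fid and scenes == 1 and director == actor
    }


def generators_exchanged(films, directors, cast):
    """Same comprehension with the generators listed in another order."""
    return {
        title
        for (actor, scenes, c_film), (director, d_film), (fid, title)
        in product(cast, directors, films)
        if d_film == fid and c_film == fid and scenes == 1 and director == actor
    }
```

Both functions enumerate every combination of rows, which is exactly the cost that the subsequent rewriting steps aim to avoid.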

The semijoin E1 ⋉p E2 is a join variant that delivers only those left operand objects having at least one join partner with respect to the join predicate p. Its implementation is efficient because as soon as a join partner is found for an E1 object, that object is known to belong to the result and no further E2 objects need be accessed. The monoid comprehension formulation allows detecting those access patterns that correspond to semijoins. In the example, after partial flattening of subqueries (not shown), the shaded subexpression in Figure 8 is one such case.

Figure 8: Semi-join access pattern
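The early-exit behaviour of the semijoin can be observed in a small sketch. The function semijoin, the row layout, and the probes list used for instrumentation are assumptions made for the illustration; Python's any() stops at the first join partner, which mirrors the property described above.

```python
def semijoin(left, right, p):
    """Keep each left object that has at least one join partner in right.

    any() returns as soon as one partner is found, so the remaining
    right-hand objects are not accessed for that left object.
    """
    return [x for x in left if any(p(x, y) for y in right)]


# Made-up rows keyed by film id.
directors = [{"director": "Ann", "film": 1}, {"director": "Cy", "film": 2}]
cast = [{"actor": "Ann", "film": 1}, {"actor": "Bob", "film": 1}]

probes = []          # records every right-hand object that gets inspected


def same_film(d, c):
    probes.append(c["actor"])
    return d["film"] == c["film"]


result = semijoin(directors, cast, same_film)
```

For the first director a partner is found on the first probe, so the second cast row is never touched; only the director without any partner forces a full scan of the right operand.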

The resulting optimized formulation appears in Figure 9.

Figure 9: Query algebra formulation, optimized stage

The resulting query plan expressed in relational algebra has a straightforward translation into JPQL [28], the ORM-level query language. Further potential relational optimizations may be performed by the RDBMS. In keeping with MDSE principles, this translation is not implemented as string manipulation but as an AST-to-AST transformation [32]. Well-formedness is thus ensured before delivering output for further processing.

Without the conceptual framework of the monoid calculus, applying similar rewritings directly on OCL ASTs would not have been feasible. This is further evidence for the claim that integrity checking for EMOF + OCL should follow the approach proposed here.


9 RELATED WORK

The influence of the Object Query Language (OQL), defined in the 1990s by the Object Data Management Group (ODMG), cannot be overstated, reaching to JPQL today. Trigoni [33] formalizes type inference for OQL queries. Additionally, algorithms are provided for applying two semantic optimization heuristics: constraint introduction and constraint elimination. These refined heuristics take into consideration association rules discovered with data mining, which are not as strong as integrity constraints (they may in fact have exceptions). Given that these “rules” statistically hold most of the time, it pays off to monitor their validity status at runtime. Unless they become invalid, they can be used during optimization to increase selectivity and to skip evaluations, thus improving performance. As with other heuristic techniques, safety measures are built in to prevent the cost of analysis from exceeding the optimization speed-up.

Ritter et al. [10] also aim at integrity checking by translating OCL into a view definition language, this time SQL’92. However, no systematic performance analysis is made. The Dresden OCL Toolkit [34] compiles full OCL into RDBMS stored procedures including control structures, thus compromising query optimization in the general case.

The optimization of object-based queries is not only relevant for the persistent case: naïve evaluation over instances in main memory also results in unacceptable performance. A succinct account of this problem and a heuristic solution for θ-joins in Java 5 appear in [35]. A recent book on the subject of database integrity is [36]. Most contributions focus on the relational case. The book [37] is devoted to view materialization.

10 CONCLUSIONS AND FURTHER WORK

We have addressed an industrially relevant problem by going back to first principles, leveraging research results from object databases to improve the efficiency of software repositories for EMOF + OCL. Our choice of integrity checking mechanism does not require the database to be in a consistent state before an update can take place, yet reporting of integrity violations is sound and complete (no false positives, no missed violations). This is deemed vital to account for the realities of collaborative design environments.

Incremental view maintenance adds a measure of reactivity to the monitoring of invariants. Unlike the more powerful Event Condition Action (ECA) rules of an active DBMS, view definitions based on OCL invariants cannot make statements about events external to the database state, nor range over several snapshots as in versioned data models [11] (values in pre- and poststates can only be referred to from OCL postconditions, not from class invariants). OCL-based views are however sufficient to support a variety of use cases in software repositories, such as monitoring the conformance of artifacts to coding and modeling conventions [12]. Moreover, not all views need be maintained incrementally (as required for integrity constraints): in some cases results are only periodically needed (e.g., after an integration build, or on a daily or weekly basis). Examples include: (a) detecting opportunities for applying refactorings; (b) checking mutual consistency between artifacts and documentation; and (c) deriving software metrics.
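The difference between incremental and batch evaluation of an invariant-based view can be sketched as follows. This is a simplified illustration, not the architecture described in this paper: it assumes an invariant that depends on a single object only, and the `ViolationView` class, the dictionary-based repository, and the naming invariant are hypothetical.

```python
from typing import Callable, Dict, Set

Obj = Dict[str, object]


class ViolationView:
    """Materialized set of ids of objects violating one object-local invariant."""

    def __init__(self, invariant: Callable[[Obj], bool]) -> None:
        self.invariant = invariant
        self.violators: Set[int] = set()
        self.checks = 0  # number of invariant evaluations performed

    def _check(self, oid: int, obj: Obj) -> None:
        self.checks += 1
        if self.invariant(obj):
            self.violators.discard(oid)
        else:
            self.violators.add(oid)

    def rebuild(self, db: Dict[int, Obj]) -> None:
        # Batch evaluation: re-check every object (e.g., after a nightly build).
        self.violators.clear()
        for oid, obj in db.items():
            self._check(oid, obj)

    def on_update(self, oid: int, obj: Obj) -> None:
        # Incremental maintenance: re-check only the updated object.
        self._check(oid, obj)

    def on_delete(self, oid: int) -> None:
        # A deleted object can no longer violate the invariant.
        self.violators.discard(oid)


# The repository starts in an inconsistent state: object 2 has no name.
db: Dict[int, Obj] = {
    1: {"name": "Order"},
    2: {"name": ""},
    3: {"name": "Item"},
}
view = ViolationView(lambda o: o["name"] != "")  # hypothetical naming invariant
view.rebuild(db)                # one full pass: three evaluations

db[2]["name"] = "Customer"      # repairs the existing violation
view.on_update(2, db[2])        # one evaluation

db[3]["name"] = ""              # introduces a new violation
view.on_update(3, db[3])        # one evaluation
```

Updates are accepted although the initial state already contains a violation, and each update costs one invariant evaluation instead of a full pass. Invariants that navigate associations would additionally require tracking which objects each evaluation depends on, which this sketch leaves out.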

As usual, irrespective of whether an OCL-based view is tagged for incremental or batch evaluation, it makes for concise composite queries. Materialized views naturally support OCL’s derive statement, which is used to specify values for attributes or association ends. Looking into the future, the proposed infrastructure can serve as a basis for supporting ECA and versioning functionality through extensions to the OCL language.

REFERENCES

[1] Object Management Group: Meta Object Facility (MOF) Core Specification, formal/06-01-01, http://www.omg.org/docs/formal/06-01-01.pdf (Jan 2006)

[2] Warmer, J., Kleppe, A.: The Object Constraint Language: Getting Your Models Ready for MDA. Addison-Wesley, Boston, MA, USA (2003) ISBN 0321179366.

[3] Dittrich, K.R., Tombros, D., Geppert, A.: Databases in Software Engineering: a Roadmap. In: ICSE - Future of SE Track. (2000) 293–302

[4] Garcia, M.: Rules for Type-checking of Parametric Polymorphism in EMF Generics. In Bleek, W.G., Schwentner, H., Züllighoven, H., eds.: Software Engineering 2007 – Beiträge zu den Workshops. Volume 106 of GI-Edition Lecture Notes in Informatics. (2007) 261–270 http://www.sts.tu-harburg.de/~mi.garcia/pubs/2007/mdsdHeute/garcia-emfgen-2.pdf.

[5] Garcia, M., Möller, R.: Certification of Transformations Algorithms in Model-Driven Software Development. In Bleek, W.G., Rasch, J., Züllighoven, H., eds.: Software Engineering 2007. Volume 105 of GI-Edition Lecture Notes in Informatics. (2007) 107–118 http://www.sts.tu-harburg.de/~mi.garcia/pubs/2007/se2007/GarciaMoeller.pdf.

[6] Muller, P.A., Fleurey, F., Fondement, F., Hassenforder, M., Schneckenburger, R., Gérard, S., Jézéquel, J.M.: Model-Driven Analysis and Synthesis of Concrete Syntax. In Nierstrasz, O., Whittle, J., Harel, D., Reggio, G., eds.: MoDELS. Volume 4199 of LNCS., Springer (2006) 98–110 http://lglpc35.epfl.ch/lgl/docs/papers/MDASOCS-5-pam.pdf.

[7] Jouault, F., Bézivin, J., Kurtev, I.: TCS: a DSL for the specification of textual concrete syntaxes in model engineering. In Jarzabek, S., Schmidt, D.C., Veldhuizen, T.L., eds.: GPCE, ACM (2006) 249–254

[8] Daly, C.J.: AST framework generation with Gymnast. In: Tech Exchange Panel: Language Toolkits, EclipseCON 2005 (2005)

[9] Ehrig, K., Ermel, C., Hänsgen, S., Taentzer, G.: Generation of Visual Editors as Eclipse Plug-ins. In: ASE ’05: Proceedings of the 20th IEEE/ACM Intnl Conf on Automated Software Engineering, New York, NY, USA, ACM Press (2005) 134–143

[10] Ritter, N., Steiert, H.P.: Enforcing Modeling Guidelines in an ORDBMS-based UML Repository. In: Intnl Resource Mgmt. Assoc. Conf. 2000 (Information Modeling Methods and Methodologies Track of IRMA 2000), Anchorage, Alaska (May 2000) 269–273

[11] Kovse, J.: Model-Driven Development of Versioning Systems. PhD thesis, TU Kaiserslautern, Germany (August 2005)

[12] Ruokonen, A., Hammouda, I., Mikkonen, T.: Enforcing Consistency of Model-Driven Architecture Using Meta-Designs. In: European Conf. on MDA: Workshop on Consistency in Model Driven Engineering (C@MoDE 2005). (Nov. 2005) 127–141 http://practise.cs.tut.fi/files/publications/EEWES/metadesign.pdf.

[13] Amelunxen, C., Schürr, A.: On OCL as part of the Metamodeling Framework MOFLON. In: 6th OCL Workshop at the UML/MoDELS Conference. (2006) http://st.inf.tu-dresden.de/OCLApps2006/topic/acceptedPapers/13 Amelunxen MOFLON.pdf.

[14] Brucker, A.D., Wolff, B.: The HOL-OCL Book. Technical Report 525, ETH Zurich (2006) http://www.brucker.ch/bibliography/abstract/brucker.ea-hol-ocl-book-2006.

[15] Fegaras, L., Maier, D.: Towards an Effective Calculus for Object Query Languages. In: SIGMOD ’95: Proceedings of the 1995 ACM SIGMOD Intl Conf. on Management of Data, New York, NY, USA, ACM Press (1995) 47–58 http://lambda.uta.edu/sigmod95.ps.gz.

[16] Lamport, L.: The +CAL Algorithm Language. In: NCA ’06: Proc. of the Fifth IEEE Intnl Symposium on Network Computing and Applications, Washington, DC, USA, IEEE Computer Society (2006) 5–10 See also http://research.microsoft.com/users/lamport/pubs/pluscal.pdf.

[17] Lawley, M.: Transaction Safety in Deductive Object-Oriented Databases. In Ling, T.W., Mendelzon, A., Vieille, L., eds.: DOOD. Volume 1013 of LNCS., Springer (1995) 395–410

[18] Stockmeyer, L.J.: The Complexity of Decision Problems in Automata Theory and Logic. Technical Report MAC TR-133, MIT, Cambridge MA, Project MAC (1974)

[19] Ali, M.A., Fernandes, A.A.A., Paton, N.W.: MOVIE: an incremental maintenance system for materialized object views. Data Knowl. Eng. 47(2) (2003) 131–166

[20] Cabot, J., Teniente, E.: Incremental Evaluation of OCL Constraints. In Dubois, E., Pohl, K., eds.: CAiSE. Volume 4001 of LNCS., Springer (2006) 81–95 Project homepage http://www.lsi.upc.edu/~jcabot/research/IncrementalOCL/index.html.

[21] Altenhofen, M., Hettel, T., Kusterer, S.: OCL Support in an Industrial Environment. In Demuth, B., Chiorean, D., Gogolla, M., Warmer, J., eds.: OCL for (Meta-)Models in Multiple Application Domains, Dresden, University Dresden (2006) 126–139 http://st.inf.tu-dresden.de/OCLApps2006/topic/acceptedPapers/03 Altenhofen OCLSupport.pdf.

[22] Meyer, B., Mingins, C., Schmidt, H.: Providing trusted components to the industry. Computer 31(5) (1998) 104–105

[23] Ali, M.A.: Incremental Maintenance of Materialized Views in Object-Oriented Databases. PhD thesis, University of Manchester, UK (September 2002) http://computing.unn.ac.uk/staff/CGMA2/projectlinks.html.

[24] Ali, M.A., Paton, N.W., Fernandes, A.A.A.: An Experimental Performance Evaluation of Incremental Materialized View Maintenance in Object Databases. In Kambayashi, Y., Winiwarter, W., Arikawa, M., eds.: DaWaK. Volume 2114 of LNCS., Springer (2001) 240–253

[25] Fegaras, L., Maier, D.: Optimizing Object Queries using an Effective Calculus. ACM Trans. Database Syst. 25(4) (2000) 457–516

[26] Grust, T., Scholl, M.H.: Translating OQL into Monoid Comprehensions—Stuck with Nested Loops? Technical Report 3a/1996, Dept. of Computer and Information Science, Database Research Group, U Konstanz (September 1996) http://www.inf.uni-konstanz.de/dbis/publications/download/GS:TR96a.ps.gz.

[27] Elver Project: Teneo EMF Persistency, http://www.eclipse.org/emft/projects/teneo/ (2007)

[28] EJB 3.0 Expert Group: JSR 220: Enterprise JavaBeans, Version 3.0: EJB 3.0 Simplified API. Available at http://java.sun.com/products/ejb/docs.html (2005)

[29] Grust, T.: Monad Comprehensions. A Versatile Representation for Queries. In: The Functional Approach to Data Management - Modeling, Analyzing and Integrating Heterogeneous Data. Springer Verlag (Sept 2003) 288–311 ISBN: 978-3-540-00375-5.

[30] Object Management Group: OMG OCL Specification v2.0, formal/2006-05-01 (May 2006) http://www.omg.org/technology/documents/formal/ocl.htm.

[31] Garcia, M.: How to process OCL Abstract Syntax Trees (2007) http://www.eclipse.org/articles/article.php?file=Article-HowToProcessOCLAbstractSyntaxTrees/index.html, Eclipse Technical Article.

[32] Garcia, M.: Formalizing the Well-formedness Rules of EJB3QL in UML + OCL. In Kühne, T., ed.: Reports and Revised Selected Papers, Workshops and Symposia at MoDELS 2006, Genoa, Italy. LNCS 4364, Springer-Verlag (2006) 66–75

[33] Trigoni, A.: Semantic Optimization of OQL Queries. PhD thesis, University of Cambridge, UK (October 2002) http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-547.html.

[34] Demuth, B., Hußmann, H., Loecher, S.: OCL as a Specification Language for Business Rules in Database Applications. In Gogolla, M., Kobryn, C., eds.: UML. Volume 2185 of LNCS., Springer (2001) 104–117

[35] Willis, D., Pearce, D.J., Noble, J.: Efficient Object Querying for Java. In Thomas, D., ed.: ECOOP. Volume 4067 of LNCS., Springer (2006) 28–49 http://www.mcs.vuw.ac.nz/~djp/files/WPN ECOOP06.ps.

[36] Doorn, J.H., Rivero, L.C., eds.: Database Integrity: Challenges and Solutions. Idea Group Publishing (2002)

[37] Gupta, A., Mumick, I.S., eds.: Materialized views: techniques, implementations, and applications. MIT Press, Cambridge, MA, USA (1999)

ABOUT THE AUTHORS

Miguel Garcia is a PhD candidate and research assistant at the Institute for Software Systems at the Hamburg University of Science and Technology (TUHH), Germany. He can be reached at [email protected]. See also http://www.sts.tu-harburg.de/~mi.garcia.
