A proposal for region-based memory management for deterministic Mercury programs

Quan Phan

Gerda Janssens

Report CW424, September 2005

Katholieke Universiteit Leuven, Department of Computer Science

Celestijnenlaan 200A – B-3001 Heverlee (Belgium)


Abstract

This paper presents an approach to region-based memory management for Mercury programs. First, a region analysis based on a points-to graph determines the different regions in the program. Second, the liveness of the regions is computed. Finally, a program transformation adds region annotations to the program for the region-based memory management. Some small Mercury programs are analysed manually and discussed to show the promising benefits of our approach.

While the approach is developed here mainly for deterministic Mercury programs, the paper also discusses possible extensions to support non-determinism, module-based region analysis, and, most notably, the combination with compile-time garbage collection in the context of Mercury programs.

Keywords: Program Analysis, Mercury, Region-based memory management.
CR Subject Classification: D.3.4, I.2.3

A proposal for region-based memory management for deterministic Mercury programs∗

Quan Phan and Gerda Janssens

Department of Computer Science, K.U.Leuven

Celestijnenlaan, 200A, B-3001 Heverlee, Belgium

{quan.phan,gerda.janssens}@cs.kuleuven.be

September 16, 2005


1 Introduction

This report describes an algorithm that takes a Mercury logic program and produces an output program with region-based memory management (RBMM). The input Mercury program is in normal form, has been optimized to have specialized forms of unification, and has its goals reordered so that input variables are ground before any operations. The algorithm as described here is for deterministic programs, but it is designed with support for non-deterministic programs, as well as combination with compile-time garbage collection (CTGC), in mind.

The algorithm is composed of three phases. The first phase is a goal-independent analysis of each procedure. This analysis detects the region structure of the memory used by a procedure and represents this information as a region points-to graph. The precision with which memory is split into different regions has a large impact on the quality of the whole algorithm. The second phase uses the region points-to graph of each procedure to detect the lifetime of regions. The lifetime information is composed of the set of live regions at each program point and the sets of regions that a procedure creates and removes. The third phase transforms the input program into a program with region support. Based on the information about the lifetime of regions, it inserts statements to create regions, remove regions, and rename regions. Because the transformation creates regions right before they need to be live and removes them right after they become dead, it can in theory significantly reduce the memory consumption of programs.

∗ This work is supported by the project GOA/2003/08 and by FWO Vlaanderen.

The structure of the report is as follows. Section 2 presents some basic notions of our approach. Section 3 introduces the concept of region points-to graph and the points-to analysis. Section 4 presents the live region analysis. Section 5 describes the transformation that adds annotations to programs. Section 6 shows the detailed analyses of some small Mercury programs. Finally, Section 7 presents possible extensions of the approach and discusses the benefits of combining region-based memory management and compile-time garbage collection.


2 The basic approach

In this section we give an overview of the working context, what we want to achieve, and the reasons behind it.

2.1 Term representation

One basic question in memory management is how terms are represented in memory. This depends on the specific implementation of a language. Therefore we introduce our view of term representation when the heap memory is organised in regions, in the context of the Melbourne Mercury Compiler. The way a term is stored in regions is controlled by two factors. The first is the type of the term, which defines its structure; the second is how the program is going to use it. We explain this with the following example. We define two types as follows:

    :- type list(T) ---> [] ; [T | list(T)].
    :- type ex ---> foo(int).

Consider a variable L of type list(ex). There are three structured components in the type list(ex): the list backbone, the functor foo/1, and the integer value inside foo. The term bound to L (like any other value of the type) can be stored finitely in at most three corresponding regions: one for the list backbone, one for the functor foo/1, and one for the integer value. Assuming L = [foo(1), foo(2)], the memory representation of L using three regions is shown in Figure 1. By allocating a term in several different regions we can remove a region as soon as the part of the term it holds dies. Clearly, the way we divide a term into different regions here is based on its structure.

Now assume that the program never accesses the integer inside foo/1. For example, it can be a program that only takes out the elements of L without caring about their content. In that case we may store the term in only two regions, one for the list backbone and the other for the foo/1 functor together with its argument. This is shown in Figure 2. This should not be confused with dereferencing. The idea here is different: we can and want to do this because the program does not access the argument of foo/1 and this representation is more economical. In such a case, even if the argument of foo/1 is not of primitive type, it is still stored in the same region as foo/1.
[Figure 1: Representation of term using the maximum number of regions.]

[Figure 2: Representation of term using fewer regions.]

To see why this is reasonable, imagine a procedure that receives an input variable of type list(ex) and returns an output variable of type ex. It never touches the integer in ex. For this procedure the input variable can be stored in only two regions. In the calling context where the input variable is constructed, there are two situations: the caller constructs the variable in either two or three regions (the situation where the caller puts it into just one region can never happen because of the way we do region points-to analysis). If it is in two regions then there is obviously no problem. When the input variable is in three regions, it is still safe for the called procedure to regard only two regions because it never accesses the third one. These two factors are captured by the region points-to graph, which guides how terms need to be stored. Generally terms are stored in as many regions as needed, but in some cases fewer regions can be used.

2.2 Merging regions

When analysing a program, we can initially assume that each variable appearing in the program is in a different region (to be precise, a variable is bound to a term that is stored in a region, but for short we just say that a variable is stored in a region). Then, based on their types and on the behaviour of the program, we establish the relations between their regions. One important question in RBMM is which variables must be in the same region.

We want to allocate terms into different regions so that we can remove them in a timely manner. But the way a language is implemented constrains this. In the commonly accepted way of implementing programming languages, when an uninstantiated variable gets bound to a value, new memory is allocated for the value if it does not yet exist in the heap; if the value is already there, the variable is made to point to the existing value.

In the context of RBMM, if the value exists then it must be in some existing region with some variable(s) pointing to it. If the implementation of a language makes the uninstantiated variable point to the existing value, it forces the variable to be in the same region as the existing value. According to the above assumption, the uninstantiated variable has been assigned to a region different from the region of the existing value, so the effect is as if we "merge" the two regions. That is the basic reason for merging the regions of two variables when we build the region points-to graph. In principle, whenever we want to (because we need a finite representation of recursive data structures) or have to (because of the accepted way programming languages are implemented) put two variables into the same region, we merge their regions into one.

There is a good reason to allocate a term into regions based on its structure: programs are likely to treat values of the same structure in a term in the same way. By putting values of the same structure into the same region, we can hopefully remove the region when a program no longer needs that part of the term while the other parts are still needed. We could think of a finer level of control, where each value is tracked in its own memory cells, but this approach soon becomes intractable. A region seems to be a reasonable level of abstraction.

3 Region points-to analysis

The goal of this analysis is to build the region points-to graph for each procedure. The concept of region points-to graph used here was introduced for Java in [1]. Its details have been modified for the context of Mercury, so we discuss it again here to make the paper self-contained and easier to read. A region points-to graph, $G = (N, E)$, consists of a set of nodes ($N$) representing regions and a set of directed edges ($E$) representing the references between the regions. A node is named by the variables that point to memory in the region corresponding to the node. From now on, to avoid lengthy expressions, we use the concepts of a node and of the region corresponding to a node interchangeably. An edge is labelled by a type selector [4], which represents the structural relation between variables in two regions. For example, if X is a variable of type foo(T), defined by

    :- type foo(T) ---> f(T).

Y is a variable of type T, and X = f(Y), then the points-to graph representing this is as in Figure 3. The points-to graph is the graphical representation of the splitting of the memory used by a procedure into regions.

[Figure 3: Points-to graph. An edge labelled (f/1, 1) leads from node X to node Y.]

The region points-to analysis is made up of two analyses. One is a flow-insensitive intraprocedural analysis that only deals with unifications, ignoring procedure calls; the other is an interprocedural analysis that integrates the points-to graphs of the called procedures (callees) with that of the calling procedure (caller). The interprocedural analysis requires a fixpoint computation to calculate points-to graphs for recursive procedures. The points-to analysis uses two operations, unify and edge, which are defined as follows. If an operation is applied to a starting graph $G = (N, E)$ and ends up at a new graph $G' = (N', E')$ then

• $unify(n, m)$: unify nodes $n$ and $m$ in the graph.

– $N' = (N \setminus \{n, m\}) \cup \{n \cup m\}$

– $E' = \{(n', sel, m') \mid \exists (n, sel, m) \in E : n \subseteq n' \wedge m \subseteq m'\}$

• $edge(n, sel, m)$: create an edge with label $sel$ from node $n$ to node $m$.

– $G' = (N, E \cup \{(n, sel, m)\})$.
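The two operations can be sketched in Python as follows. This is only an illustration under our own assumptions, not the implementation used in the Mercury compiler: the class name `PointsToGraph`, the helper `node_of`, and the choice to represent a node as a frozenset of variable names and a selector as a (functor, position) tuple are all hypothetical.

```python
class PointsToGraph:
    """Region points-to graph; each node is a frozenset of variable names."""

    def __init__(self, variables):
        self.nodes = {frozenset([v]) for v in variables}
        self.edges = set()  # triples (source node, selector, target node)

    def node_of(self, var):
        """Return the node whose name contains the variable var."""
        return next(n for n in self.nodes if var in n)

    def unify(self, n, m):
        """N' = (N - {n, m}) + {n + m}; edges are redirected to the merged node."""
        merged = n | m
        self.nodes = (self.nodes - {n, m}) | {merged}
        redirect = lambda x: merged if x in (n, m) else x
        self.edges = {(redirect(s), sel, redirect(t)) for (s, sel, t) in self.edges}
        return merged

    def edge(self, n, sel, m):
        """E' = E + {(n, sel, m)}."""
        self.edges.add((n, sel, m))


# X = f(Y), followed by an assignment that puts Y and Z in the same region.
g = PointsToGraph(["X", "Y", "Z"])
g.edge(g.node_of("X"), ("f/1", 1), g.node_of("Y"))
g.unify(g.node_of("Y"), g.node_of("Z"))
```

After the last call the edge from X carries the same selector but points to the merged node named by both Y and Z, which is what the definition of $E'$ requires.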

3.1 Intraprocedural analysis

To specify this analysis, assume that we are analysing a procedure $p$ with points-to graph $G = (N, E)$. The analysis works as follows.

1. At the beginning, each variable in $p$ is assigned to a separate node: $X \to n_X$, and $n_X$ becomes a node in $N$.

2. The unifications in the procedure are processed one by one as follows:

• X := Y: $unify(n_X, n_Y)$,

• X == Y: do nothing,

• X => f(X1, ..., Xn): create references from $n_X$ to each of $n_{X_1}, \ldots, n_{X_n}$ by drawing edges as follows

– edge with label $(f/n, 1)$ from $n_X$ to $n_{X_1}$

– ...

– edge with label $(f/n, n)$ from $n_X$ to $n_{X_n}$,

• X <= f(X1, ..., Xn): create references from $n_X$ to each of $n_{X_1}, \ldots, n_{X_n}$

– edge with label $(f/n, 1)$ from $n_X$ to $n_{X_1}$

– ...

– edge with label $(f/n, n)$ from $n_X$ to $n_{X_n}$.

3. The following rules are fired whenever applicable

• R1:
  after $unify(n, n')$
  if $m \neq m' \wedge (n, sel, m), (n', sel, m') \in E$
  then $unify(m, m')$

• R2:
  after $edge(n, sel, m)$
  if $m \neq m' \wedge (n, sel, m') \in E$
  then $unify(m, m')$

• R3: This rule deals with variables that have recursive types.
  after $edge(n_X, sel, n_Y)$
  if $(n_Z, n_Y) \in E^* \wedge n_Y \neq n_Z \wedge type(Y) = type(Z)$
  then $unify(n_Y, n_Z)$

  in which $E^*$ is the transitive closure of $E$ and $type(X)$ returns the type of variable $X$.
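The steps above can be sketched in Python as shown below. This is a simplified illustration, not the authors' implementation: the function name `intraprocedural` and the tuple encoding of unifications are hypothetical, and instead of firing R1, R2 and R3 exactly "after" the triggering operation, the sketch re-applies them globally until nothing changes, which is an assumption of ours that gives the same graph on the append example of Section 6.1.

```python
def intraprocedural(types, unifications):
    """types: variable -> type name; unifications: list of tuples (see usage)."""
    node = {v: frozenset([v]) for v in types}  # step 1: one node per variable
    edges = set()                              # triples (source, selector, target)

    def unify(a, b):
        nonlocal edges
        merged = a | b
        for v in merged:
            node[v] = merged
        fix = lambda x: merged if x in (a, b) else x
        edges = {(fix(s), sel, fix(t)) for (s, sel, t) in edges}

    def reachable_from(src):
        seen, todo = set(), [src]
        while todo:
            cur = todo.pop()
            for (s, _sel, t) in edges:
                if s == cur and t not in seen:
                    seen.add(t)
                    todo.append(t)
        return seen

    def node_type(n):
        return types[next(iter(n))]

    def find_merge():
        # R1/R2: two edges with the same source and selector must share a target.
        targets = {}
        for (s, sel, t) in edges:
            if targets.setdefault((s, sel), t) != t:
                return targets[(s, sel)], t
        # R3: a node that reaches a different node of the same type is merged with it.
        for z in set(node.values()):
            for y in reachable_from(z):
                if y != z and node_type(y) == node_type(z):
                    return z, y
        return None

    for u in unifications:                     # step 2
        if u[0] == "assign":                   # X := Y
            unify(node[u[1]], node[u[2]])
        elif u[0] in ("deconstruct", "construct"):  # X => f(...) or X <= f(...)
            _, x, functor, args = u
            for i, arg in enumerate(args, start=1):
                edges.add((node[x], (functor, i), node[arg]))
        # a test unification X == Y contributes nothing
        while (pair := find_merge()) is not None:   # step 3
            unify(*pair)
    return set(node.values()), edges


# The unifications of append from Section 6.1 (the recursive call is ignored here).
append_types = {"X": "list", "Y": "list", "Z": "list",
                "Xe": "elem", "Xs": "list", "Zs": "list"}
append_unifs = [
    ("deconstruct", "X", "[]", []),            # (1) X => []
    ("assign", "Z", "Y"),                      # (2) Z := Y
    ("deconstruct", "X", "[.]", ["Xe", "Xs"]), # (3) X => [Xe | Xs]
    ("construct", "Z", "[.]", ["Xe", "Zs"]),   # (5) Z <= [Xe | Zs]
]
append_nodes, append_edges = intraprocedural(append_types, append_unifs)
```

On this input the sketch ends with three nodes, named by {X, Xs}, {Y, Z, Zs} and {Xe}, which agrees with the regions used for append in the live region analysis of Section 6.1.2.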

3.2 Interprocedural analysis

When interprocedural analysis is executed for a procedure, the following assumptions are made:

• The points-to graphs of the callees are available, or are the current points-to graph in case the procedure is recursive.

• For a call $q(Y_1, \ldots, Y_n)$, the formal declaration of $q/n$ is $q(X_1, \ldots, X_n)$.

The interprocedural analysis is performed as follows.

1. Process each procedure call in the procedure. For a call $q(Y_1, \ldots, Y_n)$: integrate the graph of $q/n$, $G_e = (N_e, E_e)$, into the graph of the currently analysed procedure, $G_r = (N_r, E_r)$, by building the partial mapping $\alpha$ from $N_e$ to $N_r$ as follows.

• Initialise the mapping from the call:

– $\alpha(n_{X_1}) = n_{Y_1}$

– ...

– $\alpha(n_{X_n}) = n_{Y_n}$

• In the graph $G_e$, start from each $n_{X_i}$, follow each edge once and apply the following rules when appropriate

– R4:
  if $\alpha(n_e) = n_r \wedge (n_e, sel, m_e) \in E_e \wedge (n_r, sel, m_r') \in E_r \wedge \alpha(m_e) = m_r \neq m_r'$
  then $unify(m_r, m_r')$

– R5:
  if $\alpha(n_e) = n_r \wedge (n_e, sel, m_e) \in E_e \wedge (n_r, sel, m_r) \in E_r \wedge \alpha(m_e)$ undefined
  then $\alpha(m_e) = m_r$

– R6:
  if $\alpha(n_e) = n_r \wedge (n_e, sel, m_e) \in E_e \wedge \forall p : (n_r, sel, p) \notin E_r \wedge \alpha(m_e) = m_r$
  then $edge(n_r, sel, m_r)$

– R7:
  if $\alpha(n_e) = n_r \wedge (n_e, sel, m_e) \in E_e \wedge \forall p : (n_r, sel, p) \notin E_r \wedge \alpha(m_e)$ undefined, with $m_r = FreshNode(G_r)$
  then $\alpha(m_e) = m_r$, $edge(n_r, sel, m_r)$

• Record renaming of regions at this call site.

– R8:
  if $\alpha(n_{X_i}) = n_{Y_i} \wedge \alpha(n_{X_j}) = n_{Y_j} \wedge n_{X_i} = n_{X_j} \wedge X_i$ is an input variable of $q/n$ $\wedge$ $X_j$ is an output variable of $q/n$
  then $rename(n_{Y_i}, n_{Y_j}, G_r)$.

– $rename(p, q, G(N, E))$: a helper procedure.
  First: record the renaming of $p$ to $q$.
  Then: follow $G$ from $p$ and $q$, each time with the same selector, and do one of the following.

(a) if $(p, sel, m), (q, sel, n) \in E \wedge m \neq n$ then $rename(m, n, G)$.

(b) if $(p, sel, m) \in E \wedge p \neq m \wedge \nexists n : (q, sel, n) \in E$ then $edge(q, sel, m)$.

(c) if $(q, sel, m) \in E \wedge q \neq m \wedge \nexists n : (p, sel, n) \in E$ then $edge(p, sel, m)$.

(d) if $(p, sel, p) \in E \wedge (q, sel, q) \notin E$ then $edge(q, sel, q)$.

(e) if $(q, sel, q) \in E \wedge (p, sel, p) \notin E$ then $edge(p, sel, p)$.

The reason for R8 is as follows. Consider the case where the caller provides a callee with an input region and expects the result of the callee in a different region, but the callee itself puts its output into the input region. We have two options here. We can either unify the input region and the output region in the caller's points-to graph (because from the behaviour of the callee we know that they are actually the same), or keep the two regions separate in the caller's graph but let the callee inform the caller about that fact. The first option is safe but too conservative: it loses precision when, for example, the two regions are the same in only one branch of execution of the callee but not in the other branches. Rule R8 realizes the second option, where the callee renames the input region to the one expected by the caller. The effect of this rule can be seen in the analysis and transformation of the program Game of Life in Section 6.

2. The procedure is analysed iteratively until there is no change in its points-to graph.
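A single integration step with rules R4 to R7 can be sketched in Python as follows. The sketch rests on several assumptions of ours: nodes are plain strings, the formal parameters sit in distinct callee nodes so that $\alpha$ is a function, rule R8 and the outer fixpoint iteration are left out, and the names `integrate_call` and `fresh1`, `fresh2`, ... are hypothetical.

```python
from itertools import count


def integrate_call(callee_edges, caller_edges, formals, actuals):
    """Apply R4-R7 for one call; returns (alpha, updated caller edge set)."""
    edges = set(caller_edges)
    alpha = dict(zip(formals, actuals))          # initial mapping from the call
    fresh = (f"fresh{i}" for i in count(1))      # hypothetical FreshNode supply

    def target(n, sel):
        return next((t for (s, l, t) in edges if s == n and l == sel), None)

    def unify(keep, drop):
        nonlocal edges
        ren = lambda x: keep if x == drop else x
        edges = {(ren(s), l, ren(t)) for (s, l, t) in edges}
        for k in alpha:
            alpha[k] = ren(alpha[k])

    todo, visited = list(formals), set()
    while todo:
        ne = todo.pop(0)
        if ne in visited:
            continue
        visited.add(ne)
        for (s, sel, me) in sorted(callee_edges):
            if s != ne:
                continue
            nr = alpha[ne]
            existing = target(nr, sel)
            if existing is not None and me in alpha:
                if alpha[me] != existing:
                    unify(alpha[me], existing)   # R4
            elif existing is not None:
                alpha[me] = existing             # R5
            elif me in alpha:
                edges.add((nr, sel, alpha[me]))  # R6
            else:
                alpha[me] = next(fresh)          # R7
                edges.add((nr, sel, alpha[me]))
            todo.append(me)
    return alpha, edges


# The call split(Le, Ls, L1, L2) at program point (4) of qsort (Section 6.2).
# "L" stands for the node {L, Ls}, "A1" for {S2, A1}, and so on.
HEAD, TAIL = ("[.]", 1), ("[.]", 2)
split_edges = {("L", HEAD, "Le"), ("L", TAIL, "L"),
               ("L1", HEAD, "Le"), ("L1", TAIL, "L1"),
               ("L2", HEAD, "Le"), ("L2", TAIL, "L2")}
qsort_edges = {("L", HEAD, "Le"), ("L", TAIL, "L"),
               ("A1", HEAD, "Le"), ("A1", TAIL, "A1")}
alpha, merged_edges = integrate_call(
    split_edges, qsort_edges,
    formals=["X", "L", "L1", "L2"], actuals=["Le", "L", "L1", "L2"])
new_edges = merged_edges - qsort_edges
```

For this call the sketch maps the callee's element node to the caller's element node by R5 and adds four edges by R6, the same steps that are listed for program point (4) in the qsort analysis.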

4 Live region analysis

The goal of live region analysis is to detect the live regions at each program point and to compute which regions will be created and removed by each procedure. This information will be used to transform the original program into a program with region-based memory management. To define this analysis we assume the definitions of program point and execution path for Mercury as given in Nancy's thesis [4]. The set of live regions at a program point is computed via the set of live variables at that program point, so we first define the concept of live variables at a program point.

4.1 Live variables at a program point

A variable is live at a program point if:

• There exists an execution path containing the program point that instantiates the variable before the program point and uses it at or after the program point

• OR it is an output variable that is instantiated before the program point.

If we define $pre\_inst(pp, P)$ as the set of variables instantiated before the program point $pp$ in the execution path $P$, $post\_use(pp, P)$ as the set of variables used at or after $pp$ in the execution path $P$, and $out(p)$ as the set of output variables of a procedure $p$, then the set of live variables at a program point $i$ is:

$LV(i) = \{V \mid \exists P : V \in pre\_inst(i, P) \wedge (V \in out(p) \vee V \in post\_use(i, P))\}$

We define two exceptional cases of the above formula.

• If $i = first(P)$ then $LV(i)$ is the set of input variables of $p$. $first(P)$ returns the first program point in $P$.

• Each execution path of a procedure is extended with one more program point called "out". $LV(out)$ is the set of output variables of the procedure.
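The definition can be sketched in Python for a single execution path as follows. The function name `live_variables` and the encoding of a path as (label, used, instantiated) triples are assumptions made for this illustration; we also assume that input variables count as instantiated on entry. When a program point lies on several paths, the sets obtained per path would have to be united.

```python
def live_variables(path, inputs, outputs):
    """path: list of (label, used, instantiated) triples in execution order."""
    lv = {}
    instantiated = set(inputs)          # inputs are instantiated on entry
    for idx, (label, _used, inst) in enumerate(path):
        if idx == 0:
            lv[label] = set(inputs)     # exceptional case: first program point
        else:
            post_use = set().union(*(used for (_, used, _) in path[idx:]))
            lv[label] = {v for v in instantiated
                         if v in outputs or v in post_use}
        instantiated |= set(inst)       # now part of pre_inst for later points
    lv["out"] = set(outputs)            # exceptional case: the extra point "out"
    return lv


# Second execution path of append from Section 6.1.
append_rec = [
    ("(3)", {"X"}, {"Xe", "Xs"}),       # X => [Xe | Xs]
    ("(4)", {"Xs", "Y"}, {"Zs"}),       # append(Xs, Y, Zs)
    ("(5)", {"Xe", "Zs"}, {"Z"}),       # Z <= [Xe | Zs]
]
lv_append = live_variables(append_rec, inputs={"X", "Y"}, outputs={"Z"})

# Second execution path of nrev from Section 6.1.
nrev_rec = [
    ("(3)", {"L"}, {"H", "T"}),         # L => [H | T]
    ("(4)", {"T"}, {"V"}),              # nrev(T, V)
    ("(5)", {"H"}, {"L1"}),             # L1 <= [H]
    ("(6)", {"V", "L1"}, {"R"}),        # append(V, L1, R)
]
lv_nrev = live_variables(nrev_rec, inputs={"L"}, outputs={"R"})
```

The resulting sets coincide with the LV columns given for these two paths in Section 6.1.2.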

4.2 Live regions at a program point

A region is live at a program point if it is reachable from a live variable at the program point. The set of regions that are reachable from a variable is defined:

$Reach(X) = \{n_X\} \cup \{m \mid (n_X, m) \in E^*(X)\}$,

in which $E^*(X)$, the transitive closure of $E$ from $X$, is defined:

$E^*(X) = \{(n_X, n_i) \mid \exists (n_X, sel_0, n_1), \ldots, (n_{i-1}, sel_{i-1}, n_i) \in E \wedge sel_0 \in TT_{type(X)}, sel_1 \in TT_{type((X, sel_0))}, \ldots, sel_{i-1} \in TT_{type((X, sel_0 \bullet sel_1 \bullet \ldots \bullet sel_{i-2}))}\}$.

The set of live regions at a program point $i$ is defined:

$LR(i) = \bigcup_{X \in LV(i)} Reach(X)$.
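A small Python sketch of $Reach$ and $LR$ is given below. It deliberately simplifies the definition: the type-selector condition on the path (the $TT$ sets) is not checked, so every edge is followed. The function names and the string encoding of nodes are our own assumptions.

```python
def reach(var, node_of, edges):
    """Nodes reachable from the node of var, including that node itself."""
    start = node_of[var]
    seen, todo = {start}, [start]
    while todo:
        cur = todo.pop()
        for (s, _sel, t) in edges:
            if s == cur and t not in seen:
                seen.add(t)
                todo.append(t)
    return seen


def live_regions(live_vars, node_of, edges):
    """LR(i) as the union of Reach(X) over the live variables X."""
    return set().union(*(reach(v, node_of, edges) for v in live_vars))


# Points-to graph of append from Section 6.1, with nodes written as strings.
node_of = {"X": "n_X_Xs", "Xs": "n_X_Xs", "Xe": "n_Xe",
           "Y": "n_Y_Z_Zs", "Z": "n_Y_Z_Zs", "Zs": "n_Y_Z_Zs"}
edges = {("n_X_Xs", ("[.]", 1), "n_Xe"), ("n_X_Xs", ("[.]", 2), "n_X_Xs"),
         ("n_Y_Z_Zs", ("[.]", 1), "n_Xe"), ("n_Y_Z_Zs", ("[.]", 2), "n_Y_Z_Zs")}

lr_4 = live_regions({"Xe", "Xs", "Y"}, node_of, edges)  # LV at point (4)
lr_5 = live_regions({"Xe", "Zs"}, node_of, edges)       # LV at point (5)
```

With the live variables of append at program points (4) and (5), the sketch yields the same live regions as the tables in Section 6.1.2: the list backbone of X is live at (4) but no longer at (5).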

4.3 The analysis

This analysis computes the set of live regions (LR) at each program point and, for each procedure, the set of regions that the procedure may create, called bornR, and the set of regions that it may remove, called deadR. For each procedure $p$ with points-to graph $G = (N, E)$ we define several sets of regions that the live region analysis works with:

• $inputR(p)$ is the set of regions reachable from input variables.

• $outputR(p)$ is the set of regions reachable from output variables.

• $bornR(p) = outputR(p) \setminus inputR(p)$ is the set of output regions that the procedure or any of the procedures it calls may create.

• $deadR(p) = inputR(p) \setminus outputR(p)$ is the set of input regions that the procedure or any of the procedures it calls may remove.

• $localR(p) = N \setminus inputR(p) \setminus outputR(p)$.
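These initial sets follow directly from reachability in the points-to graph. The Python sketch below computes them for nrev, using the final graph of Section 6.1.1; the function names and the string names of the nodes are hypothetical, and as a simplification every edge is followed regardless of its selector.

```python
def reachable(variables, node_of, edges):
    """All nodes reachable from the nodes of the given variables."""
    seen = {node_of[v] for v in variables}
    todo = list(seen)
    while todo:
        cur = todo.pop()
        for (s, _sel, t) in edges:
            if s == cur and t not in seen:
                seen.add(t)
                todo.append(t)
    return seen


def region_sets(nodes, node_of, edges, inputs, outputs):
    input_r = reachable(inputs, node_of, edges)
    output_r = reachable(outputs, node_of, edges)
    return {
        "inputR": input_r,
        "outputR": output_r,
        "bornR": output_r - input_r,
        "deadR": input_r - output_r,
        "localR": set(nodes) - input_r - output_r,
    }


# Final points-to graph of nrev; "n_LT" stands for the node {L, T}.
HEAD, TAIL = ("[.]", 1), ("[.]", 2)
nrev_node_of = {"L": "n_LT", "T": "n_LT", "H": "n_H",
                "V": "n_V", "R": "n_R", "L1": "n_L1"}
nrev_nodes = set(nrev_node_of.values())
nrev_edges = {(n, HEAD, "n_H") for n in ("n_LT", "n_L1", "n_V", "n_R")} | \
             {(n, TAIL, n) for n in ("n_LT", "n_L1", "n_V", "n_R")}
nrev_sets = region_sets(nrev_nodes, nrev_node_of, nrev_edges,
                        inputs=["L"], outputs=["R"])
```

The element region is reachable from both the input and the output list, so it is neither born nor dead, which matches the sets listed for nrev in Section 6.1.2.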

The analysis can be performed in two passes. The first pass computes the live variables (LV) at each program point. The second pass computes the live regions (LR) at each program point. For a program point that is a call to procedure $q$ we assume that the called procedure, $q$, has been analysed, so the sets $deadR(q)$ and $bornR(q)$ have been computed. These sets of $q$ (callee) may be updated by the rules defined below when analysing $p$ (caller). While computing live regions the analysis applies the rules whenever applicable. To specify the rules we define some helper functions:

• $pp(l)$: returns the program point of the literal $l$,

• $next(pp(l))$: returns the program point following $l$ in an execution path. As each execution path is already extended with one more final point, if $l$ is the last literal then $next(pp(l)) = out$.

The rules and their rationale are as follows.

• L1: if a region that is supposed to be removed by the called procedure is still live after the call then it is excluded from the deadR set of the called procedure.

  if $l \equiv q(\ldots) \wedge r \in LR(pp(l)) \wedge r \in LR(next(pp(l))) \wedge r = \alpha(r') \wedge r' \in deadR(q)$
  then $deadR(q) = deadR(q) \setminus \{r'\}$

• L2: if a region that is supposed to be created by the called procedure has already been created by the calling one then it is excluded from the bornR set of the called procedure.

  if $l \equiv q(\ldots) \wedge r \in LR(pp(l)) \wedge r = \alpha(r') \wedge r' \in bornR(q)$
  then $bornR(q) = bornR(q) \setminus \{r'\}$
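Rules L1 and L2 amount to filtering the callee's two sets at a call site. A minimal Python sketch follows, assuming that $\alpha$ is given as a dictionary from callee regions to caller regions; the function name `apply_l1_l2` and the second, made-up call site are hypothetical and only serve to show a case where L1 fires.

```python
def apply_l1_l2(lr_before, lr_after, alpha, dead_q, born_q):
    """Return the callee's (deadR, bornR) after applying L1 and L2 at one call."""
    # L1: drop r' from deadR(q) if alpha(r') is live both before and after the call.
    dead = {r1 for r1 in dead_q
            if not (alpha.get(r1) in lr_before and alpha.get(r1) in lr_after)}
    # L2: drop r' from bornR(q) if alpha(r') is already live before the call.
    born = {r1 for r1 in born_q if alpha.get(r1) not in lr_before}
    return dead, born


# The call append(V, L1, R) at program point (6) of nrev (Section 6.1.2).
alpha_6 = {"n_X": "n_V", "n_Xe": "n_H"}
dead_6, born_6 = apply_l1_l2(
    lr_before={"n_V", "n_L1", "n_H"}, lr_after={"n_R", "n_H"},
    alpha=alpha_6, dead_q={"n_X"}, born_q=set())

# Hypothetical call site where the caller still needs the first list afterwards.
dead_h, born_h = apply_l1_l2(
    lr_before={"n_V", "n_L1", "n_H"}, lr_after={"n_V", "n_R", "n_H"},
    alpha=alpha_6, dead_q={"n_X"}, born_q=set())
```

At program point (6) of nrev neither rule applies, as stated in Section 6.1.2; in the hypothetical second call the backbone of the first argument stays live, so it is taken out of deadR(append).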

5 Program transformation

The goal of program transformation is to introduce "create" and "remove" statements based on region liveness information. The transformation is executed for each procedure. It follows each execution path and applies the following transformation rules when appropriate. In the specification of the rules below, assume we are analysing procedure $p$.

• T1: if an output region for a procedure call ($q$) is not live and it is not created by the called procedure ($q$), it is created before the call.

  if $l \equiv q(\ldots) \wedge R = apply\_renaming(LR(pp(l))) \wedge r \notin R \wedge r \in LR(next(pp(l))) \wedge r = \alpha(r') \wedge r' \notin bornR(q)$
  then add "create $r$" before $l$

  $apply\_renaming(LR)$: apply the renaming information, if available, at the call site to the set of regions $LR$ and return the renamed set of regions.

• T2: if a region is not live before a unification (construction) but it is live after and is a local region or a born region, the region is created before the unification.

  if $l \equiv X <= f(\ldots) \wedge r_S \notin LR(pp(l)) \wedge r_S \in LR(next(pp(l))) \wedge r_S \in localR(p) \cup bornR(p) \wedge X \in S$
  then add "create $r_S$" before $l$

• T3: if a region is live before a procedure call but is not live after the call, and the called procedure does not remove it, then the caller removes that region.

  if $l \equiv q(\ldots) \wedge R = apply\_renaming(LR(pp(l))) \wedge r \in R \wedge r \notin LR(next(pp(l))) \wedge r \in localR(p) \cup deadR(p) \wedge \nexists r' : (\alpha(r') = r \wedge r' \in deadR(q))$
  then add "remove $r$" before $next(pp(l))$

• T4: if a region is live before a unification but it is not live after and it is in the dead region set or is a local region, then it is removed after the unification.

  if $l \equiv unif \wedge r \in LR(pp(l)) \wedge r \notin LR(next(pp(l))) \wedge r \in localR(p) \cup deadR(p)$
  then add "remove $r$" before $next(pp(l))$
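The two rules for unifications, T2 and T4, can be sketched in Python as below. The sketch covers only these two rules (calls are passed through unchanged, so T1 and T3 and the renaming information are not modelled), and the function name, the path encoding and the region names such as `rL` are assumptions made for the illustration.

```python
def transform_unifications(path, lr, region_of, local_r, born_r, dead_r):
    """path: list of (label, kind, var); kind is 'construct', 'deconstruct',
    'assign', 'test' or 'call'. lr maps each label (and 'out') to its LR set."""
    out = []
    labels = [p[0] for p in path] + ["out"]
    for idx, (label, kind, var) in enumerate(path):
        before, after = lr[label], lr[labels[idx + 1]]
        if kind == "construct":                      # T2
            r = region_of[var]
            if r not in before and r in after and r in local_r | born_r:
                out.append(f"create {r}")
        out.append(label)
        if kind != "call":                           # T4
            for r in sorted(before - after):
                if r in local_r | dead_r:
                    out.append(f"remove {r}")
    return out


# First execution path of nrev from Section 6.1: (1) L => [], (2) R <= [].
nrev_base = [("(1)", "deconstruct", "L"), ("(2)", "construct", "R")]
nrev_lr = {"(1)": {"rL", "rH"}, "(2)": set(), "out": {"rH", "rR"}}
annotated = transform_unifications(
    nrev_base, nrev_lr, region_of={"L": "rL", "R": "rR"},
    local_r={"rV", "rL1"}, born_r={"rR"}, dead_r={"rL"})
```

For this path the sketch places the removal of the input list's region after (1) and the creation of the output region before (2), and it leaves the element region alone because that region is neither local nor in deadR(nrev).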

6 Analysis and transformation of some programs

All the programs consist of a single module and are written explicitly in normal form, with all unifications already specialized and goals reordered. The analysis performed here considers neither the combination with reuse (CTGC) nor the extension to deal with non-determinism.

6.1 Naive reverse

This program is the deterministic version of naive reverse (nrev), where the nrev predicate is declared with mode (in, out) and the append predicate with mode (in, in, out).

append(X, Y, Z) :-
    (
        (1) X => [],
        (2) Z := Y
    ;
        (3) X => [Xe | Xs],
        (4) append(Xs, Y, Zs),
        (5) Z <= [Xe | Zs]
    ).

nrev(L, R) :-
    (
        (1) L => [],
        (2) R <= []
    ;
        (3) L => [H | T],
        (4) nrev(T, V),
        (5) L1 <= [H],
        (6) append(V, L1, R)
    ).

[Figure 4: The points-to graph of append after the intraprocedural analysis.]

6.1.1 Region points-to analysis

The analysis for append:

At the beginning, each variable appearing in append is assigned to a different node. Therefore we have the following nodes: $n_X$, $n_Y$, $n_Z$, $n_{Xe}$, $n_{Xs}$, $n_{Zs}$.

1. Intraprocedural analysis:
(1): do nothing.
(2): $unify(n_Z, n_Y)$.
(3): $edge(n_X, ([.], 1), n_{Xe})$
     $edge(n_X, ([.], 2), n_{Xs})$: $type(X) = type(Xs)$ so $unify(n_X, n_{Xs})$ (R3).
(5): $edge(n_Z, ([.], 1), n_{Xe})$
     $edge(n_Z, ([.], 2), n_{Zs})$: $type(Z) = type(Zs)$ so $unify(n_Z, n_{Zs})$ (R3).
The points-to graph of append after the intraprocedural analysis is shown in Figure 4.

2. Interprocedural analysis:
(4): This is the recursive call append(Xs, Y, Zs), so we take the current points-to graph of append as the callee's graph and produce the mapping from regions of formal parameters to those of actual parameters.
$\alpha(n_X) = n_{Xs}$, $\alpha(n_Y) = n_Y$, $\alpha(n_Z) = n_{Zs}$.
From rule R5 we have $\alpha(n_{Xe}) = n_{Xe}$. The mapping and the resulting graph after this step are shown in Figure 5.
Fixpoint reached.

[Figure 5: The mapping from formal to actual regions. The dash-dot arrows are the mapping from the call; the dash arrow is due to rule R5. The caller's graph is above and the callee's graph is below.]

The analysis for nrev:

Nodes: $n_L$, $n_R$, $n_H$, $n_T$, $n_V$, $n_{L1}$.

1. Intraprocedural analysis:
(1): do nothing.
(2): do nothing.
(3): $edge(n_L, ([.], 1), n_H)$
     $edge(n_L, ([.], 2), n_T)$: $type(L) = type(T)$ so $unify(n_L, n_T)$ (R3).
(5): $edge(n_{L1}, ([.], 1), n_H)$
The points-to graph of nrev after the intraprocedural analysis is shown in Figure 6.

[Figure 6: The points-to graph of nrev after the intraprocedural analysis.]

2. Interprocedural analysis:
(6): append(V, L1, R)
$\alpha(n_X) = n_V$, $\alpha(n_Y) = n_{L1}$, $\alpha(n_Z) = n_R$.
By applying the rules we have:
$\alpha(n_{Xe}) = n_H$ (R5),
$edge(n_{L1}, ([.], 2), n_{L1})$ (R6),
$edge(n_V, ([.], 2), n_V)$ (R6),
$edge(n_R, ([.], 2), n_R)$ (R6),
$edge(n_V, ([.], 1), n_H)$ (R6),
$edge(n_R, ([.], 1), n_H)$ (R6),
renaming $n_{L1}$ to $n_R$ (R8; only the first action happens, as none of the cases in the second action is satisfied).
The resulting graph after this step is shown in Figure 7.

[Figure 7: The interprocedural analysis of nrev, at program point (6). The dot arrows are references due to rule R6.]

(4): nrev(T, V)
$\alpha(n_L) = n_T$, $\alpha(n_R) = n_V$.
$\alpha(n_H) = n_H$ (R5).
The resulting graph after this step is shown in Figure 8.
Fixpoint reached.

6.1.2 Live region analysis

Live region analysis for append:

[Figure 8: The interprocedural analysis of nrev, at program point (4).]

The deterministic version of append has two execution paths:

• (1) (2) out

• (3) (4) (5) out

At the beginning, the sets of regions of append are:

• $inputR(append) = \{n_{\{X,Xs\}}, n_{Xe}, n_{\{Y,Z,Zs\}}\}$

• $outputR(append) = \{n_{Xe}, n_{\{Y,Z,Zs\}}\}$

• $deadR(append) = \{n_{\{X,Xs\}}\}$

• $bornR(append) = \emptyset$

• $localR(append) = \emptyset$

The results of the live variable and live region analyses are summarized in the tables below, one for each execution path.

pp     LV              LR
(1)    {X, Y}          $\{n_{\{X,Xs\}}, n_{Xe}, n_{\{Y,Z,Zs\}}\}$
(2)    {Y}             $\{n_{Xe}, n_{\{Y,Z,Zs\}}\}$
out    {Z}             $\{n_{Xe}, n_{\{Y,Z,Zs\}}\}$

pp     LV              LR
(3)    {X, Y}          $\{n_{\{X,Xs\}}, n_{Xe}, n_{\{Y,Z,Zs\}}\}$
(4)    {Xe, Xs, Y}     $\{n_{\{X,Xs\}}, n_{Xe}, n_{\{Y,Z,Zs\}}\}$
(5)    {Xe, Zs}        $\{n_{Xe}, n_{\{Y,Z,Zs\}}\}$
out    {Z}             $\{n_{Xe}, n_{\{Y,Z,Zs\}}\}$

No rules are applicable at program point (4) (the only procedure call in append) in the live region analysis for append.
The sets of dead and born regions after this analysis are:
$deadR(append) = \{n_{\{X,Xs\}}\}$,
$bornR(append) = \emptyset$.

Live region analysis for nrev:

Execution paths:

• (1) (2) out

• (3) (4) (5) (6) out

Sets of regions:

• $inputR(nrev) = \{n_{\{L,T\}}, n_H\}$

• $outputR(nrev) = \{n_H, n_R\}$

• $deadR(nrev) = \{n_{\{L,T\}}\}$

• $bornR(nrev) = \{n_R\}$

• $localR(nrev) = \{n_V, n_{L1}\}$

The results of the live variable and live region analyses for each execution path:

pp     LV          LR
(1)    {L}         $\{n_{\{L,T\}}, n_H\}$
(2)    {}          {}
out    {R}         $\{n_H, n_R\}$

pp     LV          LR
(3)    {L}         $\{n_{\{L,T\}}, n_H\}$
(4)    {H, T}      $\{n_{\{L,T\}}, n_H\}$
(5)    {H, V}      $\{n_V, n_H\}$
(6)    {V, L1}     $\{n_V, n_{L1}, n_H\}$
out    {R}         $\{n_R, n_H\}$

No rules are applicable at program points (4) and (6).
The sets of dead and born regions are:
$deadR(append) = \{n_{\{X,Xs\}}\}$,
$bornR(append) = \emptyset$,
$deadR(nrev) = \{n_{\{L,T\}}\}$,
$bornR(nrev) = \{n_R\}$.

6.1.3 Transformation

For append:

(1): rule T4 applies, so add "remove $r_X$" before (2). Note that no "remove $r_X$" is added before (5) because $n_X$ belongs to deadR(append), i.e. the callee will remove it, not the caller.

For nrev:

(1): add "remove $r_L$" before (2) (T4). Note that $r_H$ is not removed even though it is not live at (2); this is because $n_H$ does not belong to deadR(nrev).
(2): add "create $r_R$" before (2) (T2).
(5): add "create $r_{L1}$" before (5) (T2).
At (6), because the renaming recorded at this call site during region points-to analysis renames $n_{L1}$ to $n_R$, rule T1 fails to apply, which prevents the transformation from creating region $r_R$.

The transformed program is as follows:

append(X, Y, Z) :-
    (
        (1) X => [],
        remove r_X,
        (2) Z := Y
    ;
        (3) X => [Xe | Xs],
        (4) append(Xs, Y, Zs),
        (5) Z <= [Xe | Zs]
    ).

nrev(L, R) :-
    (
        (1) L => [],
        remove r_L,
        create r_R,
        (2) R <= []
    ;
        (3) L => [H | T],
        (4) nrev(T, V),
        create r_L1,
        (5) L1 <= [H],
        (6) append(V, L1, R)
    ).

6.1.4 Comparison with the result of Henning and Kostis

The program nrev with region support created by their prototype in [3] is shown below.

append(X,Y,Z) c(X0, R10, R11) :-
    (
        X = [],
        Z = Y
    ;
        X = R10.[X0.Xe | R10.Xs],
        append(Xs,Y,Zs) c(X0, R10, R11),
        Z = R11.[Xe | Zs]
    ).

nrev(L,R) c(X0)i(R9)o(R0) :-
    (
        L = [],
        release R9,
        new R0,
        R = []
    ;
        L = R9.[X0.H | R9.T],
        nrev(T, V) c(X0)i(R9)o(R4),
        new R0,
        append(V, R0.[H], R) c(X0, R4, R0),
        release R4
    ).

In this work the release of the region of the list backbone of the first input list of append is performed in nrev, a caller of append. This means that the lifetime of that region is longer than in our result, where the removal happens inside append. We speculate that this is due to a limitation of their region inference. The impact of this imprecision on memory use is not significant here, but it is significant in the qsort example.

6.2 qsort

The analysis and transformation for qsort is done under the assumption that values of primitive types are also stored in the heap memory. The version of qsort here is written for type list(int). split is declared with mode (in, in, out, out) and qsort with mode (in, in, out).

split(X, L, L1, L2) :-
    (
        (1) L => [],
        (2) L1 <= [],
        (3) L2 <= []
    ;
        (4) L => [Le | Ls],
        (
            (5) X >= Le ->
            (6) split(X, Ls, L11, L2),
            (7) L1 <= [Le | L11]
        ;
            (8) split(X, Ls, L1, L21),
            (9) L2 <= [Le | L21]
        )
    ).

qsort(L, A, S) :-
    (
        (1) L => [],
        (2) S := A
    ;
        (3) L => [Le | Ls],
        (4) split(Le, Ls, L1, L2),
        (5) qsort(L2, A, S2),
        (6) A1 <= [Le | S2],
        (7) qsort(L1, A1, S)
    ).


The analysis for split:

Nodes: $n_X$, $n_L$, $n_{L1}$, $n_{L2}$, $n_{Le}$, $n_{Ls}$, $n_{L11}$, $n_{L21}$.

1. Intraprocedural analysis:
(1), (2), (3): do nothing.
(4): $edge(n_L, ([.], 1), n_{Le})$
     $edge(n_L, ([.], 2), n_{Ls})$: $type(L) = type(Ls)$ so $unify(n_L, n_{Ls})$ (R3).
(5): do nothing.
(7): $edge(n_{L1}, ([.], 1), n_{Le})$
     $edge(n_{L1}, ([.], 2), n_{L11})$: $type(L1) = type(L11)$ so $unify(n_{L1}, n_{L11})$ (R3).
(9): $edge(n_{L2}, ([.], 1), n_{Le})$
     $edge(n_{L2}, ([.], 2), n_{L21})$: $type(L2) = type(L21)$ so $unify(n_{L2}, n_{L21})$ (R3).

2. Interprocedural analysis:
The interprocedural analysis for split does not change the shape of the points-to graph produced by the intraprocedural analysis.
(6): split(X, Ls, L11, L2)
$\alpha(n_X) = n_X$, $\alpha(n_L) = n_{Ls}$, $\alpha(n_{L1}) = n_{L11}$, $\alpha(n_{L2}) = n_{L2}$.
By applying the rules we have:
$\alpha(n_{Le}) = n_{Le}$ (R5).
(8): split(X, Ls, L1, L21)
$\alpha(n_X) = n_X$, $\alpha(n_L) = n_{Ls}$, $\alpha(n_{L1}) = n_{L1}$, $\alpha(n_{L2}) = n_{L21}$.
By applying the rules we have:
$\alpha(n_{Le}) = n_{Le}$ (R5).
Fixpoint reached; the graph is shown in Figure 9 (this is also the graph after the intraprocedural analysis).

[Figure 9: The points-to graph of split.]

[Figure 10: The points-to graph of qsort after intraprocedural analysis.]

The analysis for qsort:

Nodes: nL, nA, nS, nLe, nLs, nL1, nL2, nS2, nA1.

1. Intraprocedural analysis:
   (1): do nothing.
   (2): unify(nS, nA)
   (3): edge(nL, ([.], 1), nLe)
        edge(nL, ([.], 2), nLs): type(L) = type(Ls), so unify(nL, nLs) (R3)
   (6): edge(nA1, ([.], 1), nLe)
        edge(nA1, ([.], 2), nS2): type(A1) = type(S2), so unify(nA1, nS2) (R3)
   The graph after this step is shown in Figure 10.

2. Interprocedural analysis:
   (4): split(Le, Ls, L1, L2)
        α(nX) = nLe, α(nL) = nLs, α(nL1) = nL1, α(nL2) = nL2.
        By applying the rules we have:
        α(nLe) = nLe (R5).
        edge(nL1, ([.], 1), nLe) (R6).
        edge(nL1, ([.], 2), nL1) (R6).


Figure 11: The interprocedural analysis of qsort, at program point (4). The dotted arrows are references due to rule R6.

        edge(nL2, ([.], 1), nLe) (R6).
        edge(nL2, ([.], 2), nL2) (R6).
        The graph after this step is shown in Figure 11.
   (5): qsort(L2, A, S2)
        α(nL) = nL2, α(nA) = nA, α(nS) = nS2.
        By applying the rules we have:
        α(nLe) = nLe (R5).
        renaming nA to nS2 (R8), causing edge(nA, ([.], 1), nLe) (R8, c) and edge(nA, ([.], 2), nA) (R8, e).
        The graph after this step is shown in Figure 12.
   (7): qsort(L1, A1, S)
        α(nL) = nL1, α(nA) = nA1, α(nS) = nS.
        By applying the rules we have:
        α(nLe) = nLe (R5).
        renaming nA1 to nS (R8; only the first action happens, since none of the cases in the second action is satisfied).
        This step does not change the shape of the points-to graph of qsort. Fixpoint reached.

Figure 12: The interprocedural analysis of qsort, at program point (5). The dotted arrows are references due to rule R8 (renaming).


Live region analysis for split:

Execution paths:

• (1) (2) (3) out

• (4) (5) (6) (7) out

• (4) (8) (9) out

At the beginning, the sets of regions of split are:

• inputR(split) = {nX, n{L,Ls}, nLe}

• outputR(split) = {nLe, n{L1,L11}, n{L2,L21}}

• deadR(split) = {nX, n{L,Ls}}

• bornR(split) = {n{L1,L11}, n{L2,L21}}

• localR(split) = φ
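The five sets listed above are consistent with plain set arithmetic over the regions reachable from the input and the output arguments. The sketch below assumes that reading (deadR as inputR minus outputR, bornR as outputR minus inputR, localR as everything else); the formal definitions are given earlier in the report, and the function name region_sets is hypothetical.

```python
def region_sets(all_regions, input_regions, output_regions):
    """Derive the initial region sets of a procedure from inputR and outputR."""
    input_r = set(input_regions)
    output_r = set(output_regions)
    return {
        "inputR": input_r,
        "outputR": output_r,
        "deadR": input_r - output_r,    # reachable from inputs only
        "bornR": output_r - input_r,    # reachable from outputs only
        "localR": set(all_regions) - input_r - output_r,
    }


split_sets = region_sets(
    all_regions={"nX", "n{L,Ls}", "nLe", "n{L1,L11}", "n{L2,L21}"},
    input_regions={"nX", "n{L,Ls}", "nLe"},
    output_regions={"nLe", "n{L1,L11}", "n{L2,L21}"},
)
```

For split this yields deadR = {nX, n{L,Ls}}, bornR = {n{L1,L11}, n{L2,L21}} and an empty localR, matching the list above.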

The results of live variable and live region analyses for each execution path.

pp     LV               LR
(1)    {L, X}           {nL, nLe, nX}
(2)    {}               {}
(3)    {L1}             {nL1, nLe}
out    {L1, L2}         {nL1, nLe, nL2}

pp     LV               LR
(4)    {L, X}           {n{L,Ls}, nLe, nX}
(5)    {X, Le, Ls}      {n{L,Ls}, nLe, nX}
(6)    {X, Le, Ls}      {n{L,Ls}, nLe, nX}
(7)    {L11, Le, L2}    {n{L1,L11}, nLe, n{L2,L21}}
out    {L1, L2}         {n{L1,L11}, nLe, n{L2,L21}}

pp     LV               LR
(4)    {L, X}           {n{L,Ls}, nLe, nX}
(8)    {X, Le, Ls}      {n{L,Ls}, nLe, nX}
(9)    {L1, Le, L21}    {n{L1,L11}, nLe, n{L2,L21}}
out    {L1, L2}         {n{L1,L11}, nLe, n{L2,L21}}

At program point (6): no rules applied.
At program point (8): no rules applied.
The sets of dead and born regions are:
deadR(split) = {nX, n{L,Ls}},
bornR(split) = {n{L1,L11}, n{L2,L21}}.

Live region analysis for qsort:

Execution paths:

• (1) (2) out

• (3) (4) (5) (6) (7) out

At the beginning, the sets of regions of qsort are:

• inputR(qsort) = {n{L,Ls}, nLe, n{S,A}}

• outputR(qsort) = {n{S,A}, nLe}

• deadR(qsort) = {n{L,Ls}}

• bornR(qsort) = φ

• localR(qsort) = {nL1, nL2, n{A1,S2}}

The results for live variable and live region analyses for each execution path.

pp     LV                  LR
(1)    {L, A}              {n{L,Ls}, nLe, n{A,S}}
(2)    {A}                 {n{A,S}, nLe}
out    {S}                 {n{A,S}, nLe}

pp     LV                  LR
(3)    {L, A}              {n{L,Ls}, nLe, n{A,S}}
(4)    {A, Le, Ls}         {n{A,S}, nLe, n{L,Ls}}
(5)    {A, Le, L1, L2}     {n{A,S}, nLe, nL1, nL2}
(6)    {Le, L1, S2}        {nLe, nL1, n{A1,S2}}
(7)    {L1, A1}            {nL1, nLe, n{A1,S2}}
out    {S}                 {n{A,S}, nLe}

At program point (4): rule L1 applied. α(nX) = nLe, and nLe is live at (4) and also at (5); therefore
deadR(split) = deadR(split) − {nX} = {nX, n{L,Ls}} − {nX} = {n{L,Ls}}.
The sets of dead and born regions after this analysis are:
deadR(split) = {n{L,Ls}},
bornR(split) = {n{L1,L11}, n{L2,L21}},
deadR(qsort) = {n{L,Ls}},
bornR(qsort) = φ.
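The adjustment of deadR(split) at program point (4) of qsort can be replayed as follows. This is a sketch of how rule L1 behaves in this one example, not a restatement of the rule, which is defined earlier in the report: a region is dropped from the callee's deadR when its image under α is live in the caller both at the call and at the following program point. The function name apply_rule_l1 and the string encoding of nodes are hypothetical.

```python
def apply_rule_l1(dead_callee, alpha, live_at_call, live_at_next):
    """Keep in deadR only the regions the callee may really remove.

    alpha maps callee nodes to caller nodes; the live sets are the
    caller's live regions at the call and at the next program point.
    """
    return {
        r for r in dead_callee
        if not (alpha[r] in live_at_call and alpha[r] in live_at_next)
    }


# Call split(Le, Ls, L1, L2) at program point (4) of qsort.
alpha = {"nX": "nLe", "n{L,Ls}": "n{L,Ls}"}
lr_4 = {"n{A,S}", "nLe", "n{L,Ls}"}
lr_5 = {"n{A,S}", "nLe", "nL1", "nL2"}
dead_split = apply_rule_l1({"nX", "n{L,Ls}"}, alpha, lr_4, lr_5)
```

Here nX is mapped to nLe, which stays live across the call, so it leaves deadR(split); the list backbone n{L,Ls} is not live at (5), so split may still remove it.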


For split:

(1): add remove rL after (1) (T4).
(2): add create rL1 before (2) (T2).
(3): add create rL2 before (3) (T2).

For qsort:

(1): add remove rL after (1) (T4).
At program point (5): rename nA to nS2, i.e. the call to qsort here puts S2 into the same region as A.
At program point (7): rename nA1 to nS, i.e. the call to qsort here puts S into the same region as A1.

The transformed program is as follows:

split(X, L, L1, L2) :-
    (
        (1) L => [],
        remove rL,
        create rL1,
        (2) L1 <= [],
        create rL2,
        (3) L2 <= []
    ;
        (4) L => [Le | Ls],
        (
            (5) X >= Le ->
            (6) split(X, Ls, L11, L2),
            (7) L1 <= [Le | L11]
        ;
            (8) split(X, Ls, L1, L21),
            (9) L2 <= [Le | L21]
        )
    ).

qsort(L, A, S) :-
    (
        (1) L => [],
        remove rL,
        (2) S := A
    ;
        (3) L => [Le | Ls],
        (4) split(Le, Ls, L1, L2),
        (5) qsort(L2, A, S2),
        (6) A1 <= [Le | S2],
        (7) qsort(L1, A1, S)
    ).

6.2.4 Comparison with the result of Henning and Kostis

The qsort program with region support generated by their prototype in [3] is shown below.

split(X, L, L01, L02) c(X0, R17)o(R0, R1) :-
    (
        L = [],
        L01 = [],
        L02 = [],
        new R0,
        new R1
    ;
        L = R17.[X0.Le | R17.Ls],
        (
            X >= Le ->
            split(X, Ls, L011, L02) c(X0, R17)o(R0, R1),
            L01 = R0.[Le | L011]
        ;
            split(X, Ls, L01, L021) c(X0, R17)o(R0, R1),
            L02 = R1.[Le | L021]
        )
    ).

qsort(L, A, S) c(X0, R15)i(R14) :-
    (
        L = [],
        release R14,
        S = A
    ;
        L = R14.[X0.Le | R14.Ls],
        split(Le, Ls, L01, L02) c(X0, R14)o(R4, R6),
        release R14,
        qsort(L02, A, S02) c(X0, R15)i(R6),
        qsort(L01, R15.[Le | S02], S) c(X0, R15)i(R4)
    ).

In this example, again, the region of the input list backbone of split is removed in the caller of split, although it is actually dead inside split. By removing it inside split, before creating the regions for the two sublists, our transformed program will use no more memory than what is needed to store the original list.

6.3 Game of Life

We assume a simplified, fake implementation of nextgen in the discussion below. The only essential point of nextgen here is that its behaviour causes output regions different from the input ones. So if our analysis is precise enough, the input regions of nextgen, as it is called in life, should be removed by nextgen itself. The built-in is, like any other built-in, is treated as if deadR(is) and bornR(is) are empty; namely, it will only read from or write to regions, and never create or remove a region. nextgen is declared with mode (in, out) and life with mode (in, in, out).

nextgen(G, G1) :-
    (1) G => gen(A),
    (2) A1 is A + 2,
    (3) G1 <= gen(A1).

life(N, G, H) :-
    (
        (1) N => 0 ->
        (2) H := G
    ;
        (3) N1 is N - 1,
        (4) nextgen(G, G1),
        (5) life(N1, G1, H)
    ).


Analysis for nextgen:

Nodes: nG, nG1, nA, nA1.

1. Intraprocedural analysis:
   (1): edge(nG, (gen/1, 1), nA).
   (3): edge(nG1, (gen/1, 1), nA1).
   No interprocedural analysis happens for nextgen.
   Fixpoint reached and the graph is shown in Figure 13.

Analysis for life:

Nodes: nN, nG, nH, nN1, nG1.


Figure 13: The points-to graph of nextgen.

Figure 14: The points-to graph of life after intraprocedural analysis.

1. Intraprocedural analysis:
   (1): do nothing.
   (2): unify(nH, nG).
   The graph after this step is shown in Figure 14.

2. Interprocedural analysis:
   (4): nextgen(G, G1)
        α(nG) = nG, α(nG1) = nG1.
        R7: new node m1, edge(nG, (gen/1, 1), m1), α(nA) = m1.
        R7: new node m2, edge(nG1, (gen/1, 1), m2), α(nA1) = m2.
        The graph after this step is shown in Figure 15.
   (5): life(N1, G1, H)
        α(nN) = nN1, α(nG) = nG1, α(nH) = nH.
        R8: renaming(nG1, nH, Glife), causing nG1 to be renamed to nH and m2 to m1.
        At this point, if we had chosen to unify nG1 and nH in the caller's graph instead of renaming, the temporary values in life would have been put into the same region as the output and could not have been removed. This is because nH has already been unified with nG, so after this merging H, G and G1 would be in the same region. This would prevent the call nextgen(G, G1) from removing the region of G.
   Fixpoint reached and the graph is shown in Figure 16.


Live region analysis for nextgen:

Execution paths: (1) (2) (3) out.
At the beginning, the sets of regions of nextgen are:

Figure 15: The points-to graph of life after interprocedural analysis at program point (4). The new nodes, edges, and α mappings are due to rule R7.

Figure 16: The points-to graph of life after interprocedural analysis at program point (5).


• inputR(nextgen) = {nG, nA}

• outputR(nextgen) = {nG1, nA1}

• deadR(nextgen) = {nG, nA}

• bornR(nextgen) = {nG1, nA1}

• localR(nextgen) = φ

The results for live variable and live region analyses:

pp     LV       LR
(1)    {G}      {nG, nA}
(2)    {A}      {nA}
(3)    {A1}     {nA1}
out    {G1}     {nG1, nA1}

No rules applied and the sets of dead and born regions are unchanged.

Live region analysis for life:

Execution paths:

• (1) (2) out

• (3) (4) (5) out.

At the beginning, the sets of regions of life are:

• inputR(life) = {nN, nG, m1}

• outputR(life) = {nG, m1}

• deadR(life) = {nN}

• bornR(life) = φ

• localR(life) = {nN1, nG1, m2}

The results for live variable and live region analyses for each execution path:

pp     LV           LR
(1)    {N, G}       {nN, nG, m1}
(2)    {G}          {nG, m1}
out    {H}          {nG, m1}

pp     LV           LR
(3)    {N, G}       {nN, nG, m1}
(4)    {G, N1}      {nG, m1, nN1}
(5)    {N1, G1}     {nN1, nG1, m2}
out    {H}          {nG, m1}

No rules applied.


For nextgen:

(1): add remove rG after (1) (T4).
(2): add create rA1 before (2) (T2), remove rA after (2) (T4).
(3): add create rG1 before (3) (T2).

For life:

(1): add remove rN after (1) (T4).
(3): add create rN1 before (3) (T2), remove rN after (3) (T4).
(5): rename nG1 to nH and m2 to m1, i.e. the call to life here puts H into the same regions as G1.

The transformed program is as follows:

nextgen(G, G1) :-
    (1) G => gen(A),
    remove rG,
    create rA1,
    (2) A1 is A + 2,
    remove rA,
    create rG1,
    (3) G1 <= gen(A1).

life(N, G, H) :-
    (
        (1) N => 0,
        remove rN
    ->
        (2) H := G
    ;
        create rN1,
        (3) N1 is N - 1,
        remove rN,
        (4) nextgen(G, G1),
        (5) life(N1, G1, H)
    ).
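The placement of the create and remove annotations in nextgen can be reproduced from its live region table. The sketch below is an interpretation inferred from the worked examples, not the definition of rules T2 and T4 (which appear earlier in the report): a region that may be created is created before the last program point at which it is not yet live, and a region that may be removed is removed after the last program point at which it is live. It only covers program points without procedure calls, and the function name annotate is hypothetical.

```python
def annotate(path, live, removable, creatable):
    """Return region annotations for one call-free execution path.

    path: program points in order, ending with "out".
    live[p]: regions live at program point p.
    """
    ops = []
    for here, nxt in zip(path, path[1:]):
        for r in sorted(creatable):
            if r not in live[here] and r in live[nxt]:
                ops.append(("create", r, "before", here))   # T2-like step
        for r in sorted(removable):
            if r in live[here] and r not in live[nxt]:
                ops.append(("remove", r, "after", here))    # T4-like step
    return ops


nextgen_live = {
    1: {"rG", "rA"},
    2: {"rA"},
    3: {"rA1"},
    "out": {"rG1", "rA1"},
}
nextgen_ops = annotate(
    [1, 2, 3, "out"],
    nextgen_live,
    removable={"rG", "rA"},     # deadR(nextgen)
    creatable={"rG1", "rA1"},   # bornR(nextgen)
)
```

On this input the sketch yields remove rG after (1), create rA1 before (2), remove rA after (2) and create rG1 before (3), which is where these statements appear in the transformed nextgen.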

7 Future extensions of the region analysis

The extensions of the above algorithm to support non-determinism, module-based analysis, and combination with compile-time garbage collection (CTGC) [4] can be developed tentatively as follows.

7.1 Non-determinism

To support non-determinism we will need to prevent the regions that will be used when backtracking from being removed. The sets of live regions at program points are derived from the sets of live variables. Therefore, if we can compute the set of variables that are live at a program point when the program backtracks to that program point, we will be able to compute the set of regions needed when backtracking. At first sight, that set of variables can be approximated by the backward use with 2nd instantiation as defined in [4], and the other parts of the algorithm can remain intact. A variable X is said to be in local backward use (lbu) w.r.t. a program point (i) within a procedure definition if that variable is instantiated at (i) and can be accessed by the literals of the procedure after backtracking has reentered the code prior to (i). The new set of live variables at a program point will be defined as follows.

LV2(i) = LV(i) ∪ lbu(i).

7.2 Module-based region analysis

Mercury programs are composed of several modules. We assume that there are no circular calls among the modules, so that when analysing a Mercury program each module can be analysed once. Hence, when we are analysing a module, all the procedures that are called from the module have been analysed already. If we allow a caller to have control over the region-related behaviour of a callee, then all the regions in the deadR set of the callee can always be removed, and all the regions in its bornR set can always be created, by the callee itself. We can change the analysis by dropping the rules L1 and L2 in live region analysis and enhancing the transformation to introduce "keep" and "use" statements with the following two additional rules.

T5: If a region r is supposed to be removed by the called procedure but is still live after the call, then the calling procedure will add "keep r" before the call. The effect of "keep r" is that it prevents the called procedure from really removing r.

T6: If a region r is supposed to be created by the called procedure but has already been created by the calling one, then the calling procedure will add "use r" before the call. The effect of "use r" is that the called procedure will use the region r instead of creating it.

This means that region removal and creation are now conditional. Having this conditional removal and creation makes the module-based region analysis rather simple. An analysed procedure will be identified by deadR, bornR, and the region points-to graph (nodes, edges, renaming, alpha mapping). In the region points-to analysis of a procedure (caller) that calls the analysed one (callee), the region points-to graph of the callee will be used to produce the caller's points-to graph. The live region analysis takes place just to calculate the live region information at each program point of the caller. The transformation will use the two additional rules to introduce "keep" and "use" statements to make the behaviour of the callee suit the requirements of the calling context. With this organisation we will have to generate only one optimized version of a procedure with region support, while any caller can control its real behaviour to meet its own needs. The drawback of this approach is that checking the conditions can be a burden at runtime.
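One way to picture the conditional removal and creation introduced by T5 and T6 is a runtime in which "keep r" and "use r" set flags that the callee's next remove or create consults. The following sketch is purely illustrative: the class RegionRuntime, its methods, and the choice to consume a flag on first use are assumptions of this example, not a design given in the report.

```python
class RegionRuntime:
    """Toy runtime with conditional region creation and removal."""

    def __init__(self):
        self.live = set()    # regions that currently exist
        self.kept = set()    # regions protected by a pending "keep r"
        self.used = set()    # regions announced by a pending "use r"
        self.created = 0     # number of regions really allocated

    def keep(self, r):
        self.kept.add(r)

    def use(self, r):
        self.used.add(r)

    def create(self, r):
        if r in self.used:
            self.used.discard(r)   # callee works in the caller's region
            return
        self.live.add(r)
        self.created += 1

    def remove(self, r):
        if r in self.kept:
            self.kept.discard(r)   # removal suppressed by the caller
            return
        self.live.discard(r)


rt = RegionRuntime()
rt.create("r")          # caller creates r
rt.keep("r")            # T5: r is still live after the call
rt.remove("r")          # the callee's "remove r" becomes a no-op
kept_alive = "r" in rt.live
rt.remove("r")          # the caller removes r itself later
rt.create("s")          # caller has already created s
rt.use("s")             # T6: callee should not create s again
rt.create("s")          # the callee's "create s" reuses the region
```

The flag checks in create and remove are the runtime condition checks that the text above identifies as the potential cost of this scheme.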

7.3 Combination with CTGC

7.3.1 The approach

CTGC tries to reuse dead memory cells, while RBMM tries to remove the regions that contain them. So if CTGC decides that the cells of X are reused to construct Y, then in the region points-to analysis we should merge the regions of X and Y, i.e. unify nX and nY. In the current definition of the region points-to analysis, unifying two nodes can cause other nodes to be merged, so we need to prevent those unwanted mergings of regions (which unnecessarily limit the chance of removal). More insight needs to be developed to make this complete, but it is likely that the combination can be supported by preventing any R rules from being triggered when "unifying two nodes because of reuse" in the region points-to analysis, together with the current definition of Reach(i) using valid type selectors in live region analysis.

7.3.2 A motivational example

A motivational example of combining RBMM and CTGC is shown below. The convert program is taken from Chapter 11 in [4] with some modifications just to make the explanation shorter. convert is declared with mode (in, out) and transforms a list of type f to a list of type g.

:- type f ---> f(int, int).
:- type g ---> g(int).

convert(L0, L) :-
    (
        (1) L0 => []
    ->
        (2) L <= []
    ;
        (3) L0 => [F | R0],
        (4) F => f(A, C),
        (5) G <= g(A),
        (6) convert(R0, R),
        (7) L <= [G | R]
    ).

After region points-to analysis, the graph of convert is shown in Figure 17.

Figure 17: The points-to graph of convert after region points-to analysis.

The example will show the extended analysis with two different reuse decisions:

• reuse the cells of L0 to construct L and the cells of F to construct G,

• only reuse the cells of L0 to construct L, either because the back end does not allow reusing cells with different types, or because we deliberately prohibit reuse with non-matching arity since, as reported in [4], the setting with reuse of matching arity gives better results than the other ones (assume the analysis prefers reusing L0 to construct L to reusing F).

1. Case 1: L0 for L and F for G:
We will need to unify nL0 and nL, and nF and nG; no rules will be applied after these actions. The points-to graph after this is shown in Figure 18.

Figure 18: The points-to graph of convert after unifying n{L0,R0} and n{L,R}, nF and nG.

pp     LV            LR
(1)    {L0}          {n{L0,R0,L,R}, n{F,G}, nA, nC}
(2)    {}            {}
out    {L}           {n{L0,R0,L,R}, n{F,G}, nA}

pp     LV            LR
(3)    {L0}          {n{L0,R0,L,R}, n{F,G}, nA, nC}
(4)    {F, R0}       {n{L0,R0,L,R}, n{F,G}, nA, nC}
(5)    {R0, A, C}    {n{L0,R0,L,R}, n{F,G}, nA, nC}
(6)    {R0, G}       {n{L0,R0,L,R}, n{F,G}, nA, nC}
(7)    {G, R}        {n{L0,R0,L,R}, n{F,G}, nA}
out    {L}           {n{L0,R0,L,R}, n{F,G}, nA}

Figure 19: Live region analysis results of convert in Case 1.

Live region analysis:

Execution paths: (1) (2) out, (3) (4) (5) (6) (7) out.
At the beginning, the sets of regions of convert are:

• inputR(convert) = {n{L0,R0,L,R}, n{F,G}, nA, nC}

• outputR(convert) = {n{L0,R0,L,R}, n{F,G}, nA}

• deadR(convert) = {nC}

• bornR(convert) = φ

• localR(convert) = φ

The results for live variable and live region analyses are shown in Figure 19. No rules are applied and the sets of dead and born regions are unchanged. Note that at "out" the live variable set contains L of type list(g(int)); therefore nC is not reachable from L, because ([.], 1) • (f/2, 2) is not a valid type selector of L. The same situation happens at (7), where the live variable set contains G, of which (f/2, 2) is not a valid type selector, and R, of which ([.], 1) • (f/2, 2) is not a valid type selector, so the node nC is not reachable.

Transformation:
At (1), rule T4 is applied and the transformed program is as follows.

convert(L0, L) :-
    (
        (1) L0 => [],
        remove rC
    ->
        (2) L <= []
    ;
        (3) L0 => [F | R0],
        (4) F => f(A, C),
        (5) G <= g(A),
        (6) convert(R0, R),
        (7) L <= [G | R]
    ).

We see that the transformed program can deallocate the memory for C, which would be a memory leak with CTGC alone. The memory cells of the list backbone and of part of the list's elements are reused by CTGC.
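The role of valid type selectors in Case 1 can be illustrated with a small reachability routine that follows an edge only when its label is a selector of the type through which the node is being viewed. This is a sketch under assumptions: the tables TYPES and EDGES are written by hand from the type declarations and the Case 1 graph, the node names are plain strings, and the function reach is a hypothetical stand-in for the Reach(i) computation defined earlier in the report.

```python
# Selectors allowed by each type, with the type of the selected component.
TYPES = {
    "list(f)": {("[.]", 1): "f", ("[.]", 2): "list(f)"},
    "list(g)": {("[.]", 1): "g", ("[.]", 2): "list(g)"},
    "f": {("f/2", 1): "int", ("f/2", 2): "int"},
    "g": {("g/1", 1): "int"},
    "int": {},
}

# Points-to graph of convert after unifying L0 with L and F with G (Case 1).
EDGES = {
    ("n{L0,R0,L,R}", ("[.]", 1)): "n{F,G}",
    ("n{L0,R0,L,R}", ("[.]", 2)): "n{L0,R0,L,R}",
    ("n{F,G}", ("f/2", 1)): "nA",
    ("n{F,G}", ("f/2", 2)): "nC",
    ("n{F,G}", ("g/1", 1)): "nA",
}


def reach(node, type_name):
    """Regions reachable from node when it is viewed with type type_name."""
    seen = set()
    todo = [(node, type_name)]
    while todo:
        n, t = todo.pop()
        if (n, t) in seen:
            continue
        seen.add((n, t))
        for sel, component_type in TYPES[t].items():
            target = EDGES.get((n, sel))
            if target is not None:
                todo.append((target, component_type))
    return {n for n, _ in seen}


from_l0 = reach("n{L0,R0,L,R}", "list(f)")   # L0 at program point (1)
from_l = reach("n{L0,R0,L,R}", "list(g)")    # L at "out"
```

Viewed as list(f), the merged backbone node reaches nC; viewed as list(g), it does not, which is why nC drops out of the live region set at "out" and can be removed.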

2. Case 2: Only reuse L0 for L:
We will unify only nL0 and nL; no rules will be applied after this action. The points-to graph after this is shown in Figure 20.

Live region analysis:

Execution paths: (1) (2) out, (3) (4) (5) (6) (7) out.
At the beginning, the sets of regions of convert are:

• inputR(convert) = {n{L0,R0,L,R}, nF, nA, nC}

• outputR(convert) = {n{L0,R0,L,R}, nG, nA}

• deadR(convert) = {nF, nC}

• bornR(convert) = {nG}

• localR(convert) = φ

Figure 20: The points-to graph of convert after unifying n{L0,R0} and n{L,R}.

pp     LV            LR
(1)    {L0}          {n{L0,R0,L,R}, nF, nA, nC}
(2)    {}            {}
out    {L}           {n{L0,R0,L,R}, nG, nA}

pp     LV            LR
(3)    {L0}          {n{L0,R0,L,R}, nF, nA, nC}
(4)    {F, R0}       {n{L0,R0,L,R}, nF, nA, nC}
(5)    {R0, A, C}    {n{L0,R0,L,R}, nF, nA, nC}
(6)    {R0, G}       {n{L0,R0,L,R}, nF, nA, nC, nG}
(7)    {G, R}        {n{L0,R0,L,R}, nG, nA}
out    {L}           {n{L0,R0,L,R}, nG, nA}

Figure 21: Live region analysis results of convert in Case 2.

The results for live variable and live region analyses are shown in Figure 21.
At (6): rule L2 applied. So bornR(convert) = φ, which means that the region nG needs to be created by the procedure that calls convert.

Looking in detail at program point "out", the live variable set contains L of type list(g(int)). nC is not reachable from L because ([.], 1) • (f/2, 2) is not a valid type selector of L. To make nF not reachable from L, one solution is to keep the two edges with the same label ([.], 1) separated when unifying n{L0,R0} and n{L,R}. That is, the points-to graph will contain two separate edges, both starting from n{L0,R0,L,R} but going to different nodes: (n{L0,R0}, ([.], 1), nF) and (n{L,R}, ([.], 1), nG). From L we should only follow the latter edge, not the former one (because L ∈ {L, R}). Therefore nG is reachable, but nF is not. The same situation happens at (7), where the live variable set contains G of type g(int) and R of type list(g(int)), and the nodes nC and nF are not reachable.

Transformation:
At (1), rule T4 is applied, so rF and rC are removed.
Note that at the program point "out" in the execution path (1), no "create rG" is added, even though according to the live region analysis it becomes live at this point. This is because at (2), L is constructed, not G. The live region analysis is not precise here, but the transformation still ensures the correct solution. It is probably possible to enhance the live region analysis if we are able to distinguish the type constructors of R used in each execution path. In the execution path (1), L is constructed by [], so nG and nA are not live at "out" in path (1).
The transformed program is as follows.

convert(L0, L) :-
    (
        (1) L0 => [],
        remove rF,
        remove rC
    ->
        (2) L <= []
    ;
        (3) L0 => [F | R0],
        (4) F => f(A, C),
        (5) G <= g(A),
        (6) convert(R0, R),
        (7) L <= [G | R]
    ).

Here the transformed program will reuse the memory of the input list backbone to create the backbone of the output list and be able to deallocate the memory cells used for F and C, which otherwise are memory leaks.

7.3.3 Potential benefits of combining RBMM and CTGC

Even without the combination, RBMM alone as described above can already be interesting for several programs, such as qsort, where RBMM helps the program run with no more memory than what is required to store the whole original list. With the transformation based on region liveness information described above, the lifetime of regions in the output program is shorter than in the results achieved by the analyses of Henning (in the context of functional programming, a subset of SML, and of logic programming, XSB Prolog) and Sigmund (in the context of OOP, Java), which should lead to lower memory consumption. It is worth pointing out that our algorithm combines the good parts of [2] and [1]. The good points of [2] are the ability to create and remove regions across procedure borders. The support of renaming regions helps increase the precision of region points-to analysis. The advantages of [1] are the power and yet simplicity of the analysis and transformation, which should lead to a simple correctness proof. This is not at all the case in [2] and [3], where not only is the correctness of the region type system hard to prove, but the inference algorithm and its soundness proof are also non-trivial.

The combination of RBMM and CTGC can be beneficial when procedures contain cells that die but cannot be reused locally by CTGC. This is the case when some cells die in a procedure and the procedure has no construction that is allowed to reuse them (because there is no construction to reuse them for (Case 1), or because of the reuse decision or back-end constraints (Case 2), ...). In [4] some of the cells that die unconditionally can be reused using a cell cache (when the size fits). For the Ray Tracer benchmark, the cell cache can increase the relative reduction from ∼25% to ∼50%, which means that, at least for this medium-sized program, a considerable number of cells are put into the cache and reused later on. Using a cell cache has its own cost, such as maintaining the cache and checking the cache before any allocation, and may also harm locality. With the region analysis above, the cells that die unconditionally will be put into separate regions that can be reclaimed; therefore we may gain over CTGC alone:

• The cost of maintaining and using the cache,

• The cells that are put into the cache but not reused. This information is not collected in [4] for the Ray Tracer program. But if this number is, on average, significant then it can be a definite advantage of RBMM.

• "Locality". This gain is not conclusive for RBMM. The experimental results from current RBMM systems do not always support good locality and faster code. This is reported as a more experimental than theoretical research problem [5].

References

[1] S. Cherem and R. Rugina. Region analysis and transformation for Java. In Proceedings of the 4th International Symposium on Memory Management, pages 85–96. ACM Press, October 2004.

[2] F. Henglein, H. Makholm, and H. Niss. A direct approach to control-flow sensitive region-based memory management. In Principles and Practice of Declarative Programming, pages 175–186. ACM Press, 2001.

[3] H. Makholm and K. Sagonas. On enabling the WAM with region support. In Proceedings of the 18th International Conference on Logic Programming. Springer Verlag, 2002.

[4] N. Mazur. Compile-time garbage collection for the declarative language Mercury. PhD thesis, Department of Computer Science, Katholieke Universiteit Leuven, May 2004.

[5] M. Tofte, L. Birkedal, M. Elsman, and N. Hallenberg. A Retrospective on Region-Based Memory Management. In Higher-Order and Symbolic Computation, 17, pages 245–265. Kluwer Academic, 2004.
