ATX & WING Workshops

Papers & Abstracts

IJCAR, Manchester

2012


ATX & WING Workshops, IJCAR, Manchester 2012

This document covers the submissions at two workshops that were satellite events at IJCAR 2012 in Manchester:

• ATX 2012: Workshop on Automated Theory eXploration

• WING 2012: 4th International Workshop on Invariant Generation

The ATX submissions were all peer reviewed by at least two reviewers; the WING papers were peer reviewed, and the short abstracts lightly reviewed by the organisers.

The organisers and programme committees were as follows.

ATX

Programme committee

Jacques Fleuriot (Edinburgh)
Timothy Griffin (Cambridge)
Peter Höfner (NICTA, Australia)
Joe Hurd (Galois, USA)
Temur Kutsia (RISC, Austria)
Roy McCasland (Edinburgh)
Annabelle McIver (Macquarie)
Stephan Merz (INRIA)
Petros Papapanagiotou (Edinburgh)
Alan Smaill (Edinburgh)
David Stanovský (Charles University)
Georg Struth (Sheffield)
Josef Urban (Radboud, Netherlands)

Organisers

Jacques Fleuriot
Peter Höfner
Annabelle McIver
Alan Smaill

WING

Programme committee

Clark Barrett (New York University)
Nikolaj Bjørner (Microsoft Research)
Gudmund Grov (Edinburgh)
Ashutosh Gupta (IST Austria)
Bart Jacobs (KUL, Belgium)
Moa Johansson (Chalmers, Sweden)
Laura Kovács (TU Vienna)
David Monniaux (VERIMAG, France)
Enric Rodríguez-Carbonell (UPC, Catalonia)
Helmut Veith (TU Vienna)
Thomas Wies (New York University)

Organisers

Gudmund Grov
Thomas Wies


Invited Talks

(ATX)

Robert L. Constable: Proof Assistants and the Dynamic Nature of Formal Theories

(WING)

Aditya Nori: Specification Inference and Invariant Generation: A Machine Learning Perspective

Antoine Miné: Invited Astrée tutorial


Proof Assistants and the Dynamic Nature of

Formal Theories

Robert L. Constable

Cornell University

Abstract

This lecture will consider lessons from a decade-long effort to explore and advance the logic of events using the Nuprl proof assistant operating in its logical programming environment. It will examine the impact of extensions to the underlying constructive type theory and the programming environment over this period, one of which led to the solution of a long-standing open problem in constructive logic. I will also illustrate methods of proof exchange between versions of this theory that are based on replaying results as the theory is extended. This method seems promising for proof exchange among proof assistants based on the LCF tactic mechanism as the main method for building proofs.

Both theory exploration and proof exchange illustrate the dynamic nature of formal theories created using modern proof assistants and dispel the false impression that formal theories are rigid and brittle objects that become less relevant over time in a fast-moving field like computer science.

The ideas I am discussing here are based on the work of the Cornell PRL research group, in particular our research on the logic of events by Mark Bickford, Richard Eaton, and Vincent Rahli, and on our collaboration with David Guaspari of ATC-NY corporation.


Specification Inference and Invariant Generation: A Machine Learning Perspective

Aditya Nori

Microsoft Research, India

Abstract. Computing good specifications and invariants is key to effective and efficient program verification. In this talk, I will describe our experiences in using machine learning techniques (Bayesian inference, SVMs) for computing specifications and invariants useful for program verification. The first project, Merlin, uses Bayesian inference in order to automatically infer security specifications of programs. A novel feature of Merlin is that it can infer specifications even when the code under analysis gives rise to conflicting constraints, a situation that typically occurs when there are bugs. We have used Merlin to infer security specifications of 10 large business-critical web applications. Furthermore, we show that these specifications can be used to detect new information flow security vulnerabilities in these applications. In the second project, Interpol, we show how interpolants can be viewed as classifiers in supervised machine learning. This view has several advantages. First, we are able to use off-the-shelf classification techniques, in particular support vector machines (SVMs), for interpolation. Second, we show that SVMs can find relevant predicates for a number of benchmarks. Since classification algorithms are predictive, the interpolants computed via classification are likely to be relevant predicates or invariants. Finally, the machine learning view also enables us to handle superficial non-linearities. Even if the underlying problem structure is linear, the symbolic constraints can give the impression that we are solving a non-linear problem. Since learning algorithms try to mine the underlying structure directly, we can discover the linear structure for such problems. We demonstrate the feasibility of Interpol via experiments over benchmarks from various papers on program verification.


Papers

(ATX)

René Neumann: A Framework for Verified Depth-First Algorithms

Jesse Alama: Tipi: A TPTP-based theory development environment emphasizing proof analysis

Zining Cao: Reducing Higher Order pi-Calculus to Spatial Logics

Koen Claessen, Moa Johansson, Dan Rosén and Nick Smallbone: HipSpec: Automating Inductive Proofs of Program Properties

Aleks Kissinger: Synthesising Graphical Theories

Mark Adams and David Aspinall: Recording and Refactoring HOL Light Tactic Proofs

Alan Smaill: Theory Exploration: a role for Model Theory?

(WING)

Antoine Miné: Abstract Domains for Bit-Level Machine Integer and Floating-point Operations


A Framework for Verified Depth-First Algorithms

René Neumann*

Technische Universität München, Garching, Germany

[email protected]

Abstract

We present a framework in Isabelle/HOL for formalizing variants of depth-first search. This framework makes it easy to prove non-trivial properties of these variants. Moreover, verified code in several programming languages, including Haskell, Scala and Standard ML, can be generated.

In this paper, we present an abstract formalization of depth-first search and demonstrate how it is refined to an efficiently executable version. Further, we use the emptiness problem of Büchi-automata, known from model-checking, as the motivation to present three Nested-DFS algorithms. They are formalized, verified and transformed into executable code using our framework.

1 Introduction

Model-checking is a widespread technology that is used in the analysis of a multitude of systems (e.g. software, hardware, or communication protocols) [16]. To prove certain correctness properties of a system, its state space is exhaustively analyzed. One of the approaches – proposed by Vardi and Wolper [22, 23] – reduces model-checking problems to operations on Büchi-automata [2]. But it requires a verified or at least trusted implementation of these operations to generate trustworthy results. Therefore the CAVA project¹, which we are part of, is working on creating verified code for these operations so that it is eventually possible to build a model-checker out of this verified code. The code can then be used as a reference implementation by more efficient implementations.

One algorithm that is fundamental in automata-theoretic model-checking (and graph theory as a whole) is depth-first search (DFS) [21]. This algorithm is especially interesting, as it is used in several different areas and with different goals (finding cycles, finding a specific node, generating the set of reachable nodes, . . . ) and might even be restricted by special constraints. An example of such a constraint is the need to generate the automaton on the fly, as the model-checking process yields huge automata for which it is in general not feasible to generate the automaton upfront and hold it in memory. All these points have to be considered to create a flexible formalization that fits all the use-cases. This flexibility and the different use-cases result in nontrivial proofs of many properties. Experience shows that these proofs on paper are not always free of mistakes, and even if they are, the implementations do not necessarily need to be (see for example the findings of Schimpf et al. [19]). On the other hand, some of these proofs are shared between several algorithms and re-doing them each time is a waste of time. In this paper, we therefore present a general verified formalization of DFS resulting in a framework that allows different variants of DFS to be formalized, lowering the barrier to using machine-checked proofs.

An important problem in the field of model-checking is checking the emptiness of Büchi-automata. Algorithms exist to solve this problem by means of DFS. Examples are Nested DFS [5] and algorithms to find strongly connected components (SCCs) of graphs [6, 7] – enhancements

* Supported by a DFG project.
¹ Computer Aided Verification of Automata: http://cava.in.tum.de


A Framework for Verified Depth-First Algorithms Neumann

of Tarjan’s original algorithm [21]. In this paper, we present our approach to embedding different variants of Nested DFS into our framework.

As mentioned earlier, CAVA strives to produce a verified and executable implementation. Hence, we not only formalize DFS and Nested DFS in the interactive theorem prover Isabelle/HOL [17], but also use the power of its code generator [9] to produce verified functional code².

Related Work Lammich and Tuerk [15] formalized Hopcroft’s algorithm to determinize NFAs as part of CAVA, with the same goal of providing proven base algorithms for model-checking. Coming from the same motivation, Chou and Peled [3] and Ray et al. [18] verified other algorithms used in model-checking, though without generating executable code. Furthermore, Schimpf et al. [19] formalized the translation of LTL to BA, and also generate code.

Outline We first present an abstract view on how to formalize general DFS and give examples of the properties we verified. Then we present a specific use-case of DFS (Nested DFS to find cycles in Büchi-automata) in Section 3, before showing how to bridge the gap to executable code in Section 4. The paper ends with an outlook on our future plans for this topic.

2 A general DFS framework

2.1 Algorithms

Algorithm 1 Simple DFS
1: visited ← ∅
2: procedure DFS(x)
3:    if x ∉ visited then
4:       visited ← visited ∪ {x}
5:       for all e ∈ succs x do
6:          DFS e
7: DFS start

Depth-first search is, in its most well-known formulation, a very simple algorithm (see Algorithm 1): we iterate through all nodes reachable from a given starting node³, choosing the following node nondeterministically from the set of successors. The call stack of the procedure also serves implicitly as the path from the starting node to the current node. In this form the algorithm can only be used to create the set of reachable nodes (= visited), which does not fulfill our requirements stated in Section 1, in particular the ability to support all use-cases of DFS.
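As a concrete illustration, the simple algorithm can be rendered directly in Python (a sketch of our own, not part of the formalization; the graph is given as a successor function):

```python
def dfs_reachable(succs, start):
    """Simple DFS as in Algorithm 1: compute the set of nodes
    reachable from `start`.  `succs` maps a node to an iterable
    of its successors."""
    visited = set()

    def dfs(x):
        if x not in visited:       # only visit fresh nodes
            visited.add(x)
            for e in succs(x):     # recurse into each successor
                dfs(e)

    dfs(start)
    return visited

# A small example graph; node 4 is not reachable from 1.
graph = {1: [2, 3], 2: [3], 3: [1], 4: [1]}
reachable = dfs_reachable(lambda n: graph.get(n, []), 1)
print(reachable)  # {1, 2, 3}
```

The call stack of `dfs` implicitly carries the path from the start node to the current node, exactly as described above.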

Therefore we enhance this simple algorithm to modify an opaque state σ (see Alg. 2) via three functions: action (called when visiting a node), remove (called when we omit a node because it has been visited already) and post (called when having visited all successors of the node and backtracking to its parent). Additionally, it is possible to abort the search with the

² The Isabelle theory files and the generated code can be downloaded from http://cava.in.tum.de/downloads. Please note that the formalizations presented in this paper are tailored towards easy representation and explanation and thus might differ from their counterparts in Isabelle.

³ We do not consider depth-first forests as is done in other places [12, 4], because their direct use complicates the proofs though the gains are negligible.


Algorithm 2 Simple DFS with state augmentation
1: visited ← ∅; σ ← σ₀
2: procedure DFS(x)
3:    if cond σ then
4:       if x ∉ visited then
5:          visited ← visited ∪ {x}; σ ← action σ x
6:          for all e ∈ succs x do
7:             DFS e
8:       else
9:          σ ← remove σ x
10:      σ ← post σ x
11: DFS start

help of a fourth function, cond (cf. line 3). With this augmentation, the search can be utilized for different use-cases by implementing these functions accordingly.

Example To search for a specific node, σ could be implemented as a boolean flag. This would be set to true in action if the node is encountered. And by letting cond be λσ. ¬σ, the loop would be aborted once the node has been found.
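To make the role of the four functions concrete, here is one possible Python rendering of the augmented search together with the boolean-flag instantiation from the example (a sketch of ours; function and parameter names are chosen to mirror the pseudocode, and post is applied after both branches, following the numbering in Algorithm 2):

```python
def dfs_param(succs, start, sigma0, action, remove, post, cond):
    """DFS over an opaque state `sigma`: `action` on first visit,
    `remove` on an already-visited node, `post` when backtracking,
    and `cond` aborts the whole search once it becomes false."""
    visited = set()
    sigma = sigma0

    def dfs(x):
        nonlocal sigma
        if cond(sigma):                    # abort check (line 3)
            if x not in visited:
                visited.add(x)
                sigma = action(sigma, x)   # first visit
                for e in succs(x):
                    dfs(e)
            else:
                sigma = remove(sigma, x)   # node already visited
            sigma = post(sigma, x)         # backtracking

    dfs(start)
    return sigma

# Instantiation from the example: search for a target node.  The
# state is a boolean flag; cond aborts the search once it is set.
def find_node(succs, start, target):
    return dfs_param(
        succs, start,
        sigma0=False,
        action=lambda s, x: s or x == target,
        remove=lambda s, x: s,
        post=lambda s, x: s,
        cond=lambda s: not s,
    )
```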

Algorithm 3 Functionalized DFS with state augmentation
1: function DFS-step(R, s)
2:    let x :: xs = stack s and w :: ws = wl s in
3:    if w = ∅ then  ▷ backtrack
4:       s⦇stack := xs, σ := post (σ s) s x, wl := ws⦈
5:    else
6:       choose e ∈ w and let w′ = w − {e} in
7:       if e ∈ visited s then  ▷ already visited
8:          s⦇σ := remove (σ s) s e, wl := w′ :: ws⦈
9:       else  ▷ visit new node
10:         s⦇stack := e :: x :: xs, wl := (succs e \ R) :: w′ :: ws,
11:           σ := action (σ s) s e, visited := visited s ∪ {e}⦈
12: function DFS-start(R, x)
13:    ⦇stack := [x], visited := {x}, wl := [succs x \ R], σ := σ₀ x⦈
14: function DFS(R, start)
15:    while (λs. stack s ≠ [] ∧ cond (σ s)) (DFS-step R) (DFS-start R start)

Though reasoning about imperative algorithms is possible in Isabelle/HOL [1], reasoning about functional programs is preferred. Hence, the imperative DFS algorithm is rewritten as functions (see Alg. 3), where we encapsulate each step of the search into an explicit DFS-state (s) that is modified inside a while-loop. Though the concept of while-loops is primarily known from the imperative world, they are used here for conceptual and technical reasons, in particular to uncouple the modification of states from the recursion. But they do not add expressiveness and can easily be replaced by direct recursive calls inside DFS-step.


Besides this change, we also make the stack and a list of working sets (wl) explicit by adding them as part of the DFS-state. The latter contains the set of those successors of a particular node that still need to be processed. This list of working sets became necessary when we dropped the forall-construct, which had been hiding this kind of book-keeping.

As can be seen in line 6, the nondeterministic choice is still preserved, and we will outline in Section 4 the way of implementing this construct. For this section, we consider DFS-step to return a set containing exactly the results generated for each possible e ∈ w.

Note that in the same step we introduce a set R that restricts the generation of successors insofar as nodes in R are excluded from visiting. For the rest of this paper we make R implicit though, to help readability.
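The step-based formulation can be mimicked in ordinary code. The following Python sketch (ours; the σ-hooks are elided to keep the focus on the state shape) resolves the nondeterministic choice by popping an arbitrary element of the working set, and keeps `stack`, `wl` and `visited` together in one explicit search state:

```python
def dfs_steps(succs, start, R=frozenset()):
    """Iterative DFS over an explicit search state, mirroring the
    functionalized formulation: a `stack` of nodes, a parallel list
    of working sets `wl` (successors still to process), and the
    `visited` set.  Nodes in `R` are excluded from visiting."""
    s = {"stack": [start], "visited": {start},
         "wl": [set(succs(start)) - R]}
    while s["stack"]:
        w = s["wl"][0]
        if not w:                          # working set empty: backtrack
            s["stack"].pop(0)
            s["wl"].pop(0)
        else:
            e = w.pop()                    # "choose e in w"
            if e not in s["visited"]:      # visit new node
                s["stack"].insert(0, e)
                s["wl"].insert(0, set(succs(e)) - R)
                s["visited"].add(e)
    return s
```

Running it to completion empties the stack and leaves `visited` equal to the set of nodes reachable without passing through `R`.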

Algorithm 3 does not allow us to prove important properties, especially those concerning the time at which nodes are discovered or backtracked from. Therefore, we further extend it by adding more fields to the DFS-state and manipulating them accordingly (code omitted for brevity):

• add a counter to represent time

• add a map φ : node ↦ time, that assigns a timestamp to each finished node

• add a map δ : node ↦ time, that assigns a timestamp to each discovered node (this replaces visited)

As a bit of notation, we write discovered s instead of dom(δ s) and equivalently finished s for dom(φ s); thus discovered resembles the former visited.

In the introduction in Section 1 we mentioned the necessary ability of generating the automaton on the fly. This is addressed by handling the function returning the successors (succs) as an argument to the DFS algorithm, similar to action, etc. This allows the user of this DFS library to provide any graph model that fits their needs, as long as they are able to provide the needed soundness proofs.

2.2 Properties

From the bare algorithms we move on to constructs that allow easy and elegant proofs. We start by lifting the condition of the while-loop and its body into an explicit predicate:

DFS-next s s′ :⟺ stack s ≠ [] ∧ cond (σ s) ∧ s′ ∈ DFS-step s

This can further be expanded to yield the set of all DFS-states that can be generated from the starting state:

DFS-constr-from x := DFS-next* (DFS-start x)

Thanks to this construction, we can use induction to prove predicates over all possible DFS-states. Examples of such predicates are (v →*\R w denotes reachability over V \ R):

∀x s v. s ∈ DFS-constr-from x ⟹ v ∈ finished s ⟹ x →*\R v

∀x s v w. s ∈ DFS-constr-from x ⟹ w ↛* v ⟹ v ↛* w ⟹
v ∈ finished s ⟹ w ∈ finished s ⟹ δ s w > φ s v ∨ φ s w < δ s v

The second lemma (together with the fact that ∀v. δ s v < φ s v holds) already is the proof of an important property: the intervals created by the discovery and the finishing time of two


Algorithm 4 Nested DFS by Courcoubetis et al. [5]
1: procedure Nested-DFS
2:    DFS-blue start
3: procedure DFS-blue(s)
4:    s.blue ← true
5:    for all t ∈ succs s do
6:       if ¬ t.blue then
7:          DFS-blue t
8:    if s ∈ A then  ▷ s is accepting
9:       DFS-red s s
10: procedure DFS-red(seed, s)
11:    s.red ← true
12:    for all t ∈ succs s do
13:       if ¬ t.red then
14:          DFS-red seed t
15:       else if t = seed then
16:          report cycle

nodes are disjoint if neither can reach the other in the graph. We moreover formalize and prove crucial theorems (Parenthesis Theorem, White-path Theorem) regarding depth-first search trees that Cormen et al. [4, pp. 606–608] formulate⁴.

We now define two more predicates reflecting the need to express properties about the state of a finished search:

DFS-finished x s :⟺ s ∈ DFS-constr-from x ∧ ¬∃s′. DFS-next s s′

DFS-completed x s :⟺ DFS-finished x s ∧ cond (σ s)

Intuitively, DFS-finished x s holds iff s is the final DFS-state starting from x. DFS-completed x s then adds the additional constraint that the search has exhaustively discovered all reachable states, i.e. it has not been aborted beforehand. Interesting properties that can now be proven are for example:

∀x s. DFS-completed x s ⟹ finished s = discovered s

∀x s. DFS-completed x s ⟹ finished s = {v | v = x ∨ x →⁺\R v}

∀x s. s ∈ DFS-constr-from x ⟹ stack s = [] ⟹ DFS-completed x s

3 Case Study: Nested DFS

Nested DFS is an approach for checking emptiness of Büchi-automata: the BA is empty if and only if there is no cycle (reachable from the starting node) that contains at least one accepting state (for the rest of the section we will use A to denote the set of accepting states). Hence the general procedure of Nested DFS is to use a first DFS run (the “blue” one) to find the accepting nodes and then run another DFS (the “red” one) from each of these nodes to find a path back to that node – resulting in a cycle.

The first Nested DFS proposal is due to Courcoubetis et al. [5] and is presented in Alg. 4. Here the red search tries to find a path directly back to the node it started from (seed in DFS-red).

Note that the knowledge of whether a node has been searched in a red DFS is shared globally by having the boolean red directly attached to the node (cf. line 11). This avoids searching subtrees that have been searched already in previous runs of the red DFS. Without this behavior

⁴ In the Isabelle theories, these proofs can be found in the TreeDFS theory.


a plain DFS with runtime linear in the number of edges is started for each accepting state. As the number of accepting states is also linear, the resulting runtime for the whole algorithm would be quadratic. But with the approach shown in Alg. 4 the result is an overall linear runtime for the whole search: we visit an edge two times in each colored DFS, once upon reaching the node and once upon backtracking.

Please also note that this omission of nodes visited in other red searches has the crucial prerequisite of triggering the red search while backtracking (see line 9). Running the red part directly in the discovery phase would result in an incomplete algorithm, as possible cycles might be missed. Proving this is non-trivial (on paper). But as the ability to restrict the search by a certain set is built into the general framework of DFS (the parameter R in the algorithms of Section 2.1), it does not pose a problem in our formalization.
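A direct imperative rendering of Algorithm 4 in Python (our sketch, with the colour flags kept in shared sets rather than attached to the nodes) shows both the global sharing of red and the triggering of the red search on backtracking:

```python
def nested_dfs(succs, start, accepting):
    """Courcoubetis-style Nested DFS: returns True iff a cycle
    through an accepting state is reachable from `start`.

    `blue`/`red` are the globally shared colour sets; the red search
    is started only while backtracking from an accepting node, which
    is what makes sharing `red` across runs sound."""
    blue, red = set(), set()

    def dfs_red(seed, s):
        red.add(s)
        for t in succs(s):
            if t not in red:
                if dfs_red(seed, t):
                    return True
            elif t == seed:
                return True          # found a path back to seed: cycle
        return False

    def dfs_blue(s):
        blue.add(s)
        for t in succs(s):
            if t not in blue:
                if dfs_blue(t):
                    return True
        if s in accepting:           # trigger red search on backtrack
            return dfs_red(s, s)
        return False

    return dfs_blue(start)
```

Returning `True` from the nested calls plays the role of "report cycle" and aborts the remaining search.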

Enhancements of the basic Nested DFS are for example given by Holzmann et al. [11], where the red search succeeds if a path to the stack of the blue one has been found, or by Schwoon and Esparza [20], where the blue search is also utilized for cycle detection. See the latter paper for some elaborations on different implementations of Nested DFS.

As it turns out, the differences between the algorithms are rather small. Hence, it is sensible to show general results over a parametrized core component first, and implement the differences thereafter. This also allows other implementations of Nested DFS to be added without having to start from scratch. By using the general DFS-body described in the previous section we get most of the important properties for free, most importantly statements of reachability.

The algorithm of the red phase can be abstracted to one that tries to find a nonempty path into a given set ss. We also allow passing a (possibly empty) set of nodes R that the search must not visit. Using the definitions of Section 2.1 we get the following implementation of a DFS, where σ is just a boolean value signalling whether a path into ss has been found yet:

SubDFS ss R = ⦇cond = λσ. ¬σ, σ₀ = λx. false, restrict = R,
  action = λσ s e. σ ∨ e ∈ ss, post = λσ s e. σ,
  remove = λσ s e. e = start s ∧ e ∈ ss⦈

The remove specification is required because the starting node of this search might be in the set ss, but it is not sufficient to check this at the start, as the path should be nonempty. This can only be ensured if there is another reachable node of which the starting node is a successor. And it is not covered by action, as this method is only called for non-visited nodes.

For this utilization of a DFS it can be shown that it fulfills its specification:

∀x s. DFS-finished x s ⟹ σ s ↔ (∃v. v ∈ ss ∧ x →⁺\R v)

Now, having the red DFS, we continue with the blue part, whose specification is: on backtracking from an accepting node, run the red DFS, while sparing all nodes already visited in a red DFS. Therefore, our opaque state σ now contains two parts: a boolean value signalling the discovery of a cycle (b) and a set of all nodes visited through red depth-first searches (F).

NestedDFS M = ⦇cond = λ(b, F). ¬b, σ₀ = λx. (false, ∅), restrict = ∅,
  action = λσ s e. σ, remove = λσ s e. σ,
  post = λ(b, F) s e. if b ∨ e ∉ A then (b, F) else
    let sub = SubDFS (M s) F in
    if σ sub then (true, F) else (false, F ∪ finished sub)⦈


Note that NestedDFS is parametrized over some M that returns the set into which SubDFS tries to find a path. Hence we get the implementation of Courcoubetis et al. [5] with NestedDFS (λs. {head (stack s)}) and the implementation of Holzmann et al. [11] with NestedDFS (λs. set (stack s)).

Since we can show the general correctness property

∀x s. (∀s. head (stack s) ∈ M s ∧ M s ⊆ set (stack s)) ⟹ DFS-finished x s ⟹
  fst (σ s) ↔ (∃v ∈ A. v →⁺ v ∧ x →* v)

we immediately get the correctness of these two implementations.

We can further enhance this to also give a formalization of the Nested DFS algorithm due to Schwoon and Esparza [20], as this is the algorithm of Holzmann et al. with an additional check in the blue phase: if we encounter a node that is already on the stack and either that node or our parent node (i.e. the head of the stack) is accepting, we have found a cycle. Thus we replace the noop-implementation of remove by the following one:

λ(b, F) s e. ((e ∈ A ∨ head (stack s) ∈ A) ∧ e ∈ set (stack s), F)

The results of the previous formalizations can also be used here – all that remains is to show that the new remove implementation is well-behaved.

In this section, we showed that the parametrization makes it possible to do step-wise formalization of DFS-utilizations. This allows us to focus on the immediate problems and properties, while being able to use theorems about the lower layers in a black-box-like style. It also allows different variants to be verified without needing to redo basic proofs. Though lines of code are not a very reliable measure, they give some hints about dimensions: specifying and proving properties of the parameterized blue and red DFS took about 700 lines (this does not include the “raw” DFS parts), while instantiating them for the two variants is around 30 lines each. The more complex formalization of the Schwoon-Esparza algorithm took another 250 lines.

The further refinement of these Nested DFS variants into executable code follows the same pattern as above: first refine a parametrized SubDFS, followed by a parametrized NestedDFS, then instantiate or augment. The relation of lines of code is also similar to above: 400 for the implementation of the red and blue phase, 50 each for instantiation and 100 for the additional remove-part.

4 Implementation

The mechanized proofs of DFS presented in the previous sections are not enough to create verified model-checkers: we also need to provide code that has the properties we proved. For tasks like this, Isabelle/HOL provides a code generator [9, 8] which allows functional definitions to be converted into code in a functional language (SML, OCaml, Haskell, Scala). But as the definitions might include constructs of higher-order logic like quantification or nondeterminism, they are not necessarily expressible in a (deterministic) program as-is. Thus it is not sufficient to solely pass them on to the code generator.

Moreover, abstract implementations of data-structures like sets, as they are provided by Isabelle, are easier to use in proofs, but tend to be inefficient for executable code. Therefore it is necessary to derive a definition that fulfills the same properties as the more abstract definition, but replaces the offending data-structures and concepts by better-fitted equivalents.

This is achieved using a refinement framework by Lammich [15], which allows data-refinement [10] on functions given in monadic form [24]. Here a refinement is a relation


on two nondeterministic programs S and S′, such that S ≤ S′ holds if and only if all possible results of S are also results of S′⁵. Data-refinement is the special case where the structure of the program is kept but data-structures are replaced – for example abstract sets by concrete lists. The refinement framework provides the ability to annotate abstraction relations R ⊆ C × A that relate concrete values c with abstract values a. This is then written S ≤ ⇓R S′, expressing that if some x is a result of S, there exists some x′ such that (x, x′) ∈ R and x′ is a result of S′.

Example It is possible to replace the specification x ∈ X by the head operation on a list xs where set xs = X. In terms of the refinement this is expressed by

RETURN (head xs) ≤ SPEC (λx. x ∈ X)

Lammich provides one more framework, named the Isabelle Collections Framework (ICF) [13, 14]. This framework allows the proofs to be created against just an interface of the operations (called locales in Isabelle), later “plugging in” different concrete implementations of abstract data-structures that satisfy this interface. Furthermore, the ICF already ships with different implementations, allowing the concrete data-structures to be changed without much hassle.

With these ingredients, we provide a modified version of DFS-step called DFS-step-Impl where operations on sets and maps are replaced by their counterparts from the ICF, and the operations on the opaque state σ are replaced by user-specified implementations of the four operations (action by action-Impl, cond by cond-Impl and so on). The same is done with succs, which is replaced by succs-Impl. Under the assumptions that the user has proven these implementations to be correct (where ασ is the abstraction relation on the opaque states and α the abstraction relation on the DFS-states):

RETURN (action-Impl (σ s) s x) ≤ ⇓ασ (RETURN (action (ασ (σ s)) (α s) x))

. . .

cond-Impl (σ s) ↔ cond (ασ (σ s))

set (succs-Impl x) = succs x

we show that

RETURN (DFS-step-Impl s) ≤ ⇓α (DFS-step (α s)).

This intuitively describes that the implementation of DFS-step returns a result whose abstraction is a possible result of DFS-step. Therefore, all properties on the abstract level carry over into the concrete world. Of course, this also holds for the final while-loop:

RETURN (DFS-Impl x) ≤ ⇓α (DFS x).

With the help of the refinement framework one can generate code such that

RETURN (dfs-code x) ≤ ⇓α (RETURN (DFS-Impl x))

so that by transitivity it holds that

RETURN (dfs-code x) ≤ ⇓α (DFS x).

In particular, it can be shown that α (dfs-code x) ∈ DFS-constr-from x. Therefore it is guaranteed that all the properties shown for elements of DFS-constr-from also hold on the result of the code equation, and – after it has been converted into executable code with the help of the Isabelle code generator – also on the generated code⁶.

⁵ We do not consider the special cases FAIL and SUCCEED here. Please refer to the work of Lammich [15].
⁶ We provide generated and tested code in OCaml, SML, and Haskell. Please refer to the README distributed in the archive for details.


5 Outlook

The current framework for depth-first search is a good starting point for other utilizations besides Nested DFS. Further utilizations we want to formalize are the different SCC-algorithms [6, 7].

But there are also some points that we would like to see included in the framework itself: the current usage of action, etc. does not allow nondeterminism in them. This is a drawback, as it reduces the possibilities of what can be accomplished in them: for instance, it is not possible to return a counter-example in addition to the simple “yes/no” answer, because the counter-example depends on the series of successor-choices made. Therefore, we want to embed them into the same nondeterministic monad the main DFS-loop is already in. We also expect certain proofs to become easier then.

Another addition is proofs about the complexity of the operations, for example showing that the implementations of Nested DFS presented in the previous section run in linear time. With this addition, we would make the set of formalized properties more complete.

References

[1] L. Bulwahn, A. Krauss, F. Haftmann, L. Erkök, and J. Matthews. Imperative functional program-ming with Isabelle/HOL. In O. A. Mohamed, C. Muñoz, and S. Tahar, editors, Proc. of the 21thInternational Conference on Theorem Proving in Higher Order Logics (TPHOL), volume 5170 ofLecture Notes in Computer Science, pages 352–367. Springer, 2008.

[2] J. R. Büchi. On a decision method in restricted second order arithmetic. In E. Nagel, P. Suppes,and A. Tarski, editors, Proc. of the International Congress Logic, Methodology and Philosophyof Science 1960, volume 44 of Studies in Logic and the Foundations of Mathematics, pages 1–11.Elsevier, 1966.

[3] C.-T. Chou and D. Peled. Formal verification of a partial-order reduction technique for modelchecking. Journal of Automated Reasoning, 23(3):265–298, 1999.

[4] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms, ThirdEdition. MIT Press, 2009.

[5] C. Courcoubetis, M. Vardi, P. Wolper, and M. Yannakakis. Memory-efficient algorithms for theverification of temporal properties. Formal Methods in System Design, 1(2/3):275–288, 1992.

[6] J.-M. Couvreur. On-the-fly verification of linear temporal logic. In J. Wing, J. Woodcock, and J. Davies, editors, Proc. Formal Methods, volume 1708 of Lecture Notes in Computer Science, pages 253–271. Springer, 1999.

[7] J. Geldenhuys and A. Valmari. Tarjan’s algorithm makes on-the-fly LTL verification more efficient. In K. Jensen and A. Podelski, editors, Tools and Algorithms for the Construction and Analysis of Systems, volume 2988 of Lecture Notes in Computer Science, pages 205–219. Springer, 2004.

[8] F. Haftmann. Code Generation from Specifications in Higher Order Logic. PhD thesis, Technische Universität München, 2009.

[9] F. Haftmann and T. Nipkow. Code generation via higher-order rewrite systems. In M. Blume, N. Kobayashi, and G. Vidal, editors, Functional and Logic Programming: 10th International Symposium: FLOPS 2010, volume 6009 of Lecture Notes in Computer Science. Springer, 2010.

[10] C. A. R. Hoare. Proof of correctness of data representations. Acta Informatica, 1:271–281, 1972.

[11] G. Holzmann, D. Peled, and M. Yannakakis. On nested depth first search. In J.-C. Grégoire, G. J. Holzmann, and D. A. Peled, editors, Proc. of the 2nd SPIN Workshop, volume 32 of Discrete Mathematics and Theoretical Computer Science, pages 23–32. American Mathematical Society, 1997.


[12] D. J. King and J. Launchbury. Structuring depth-first search algorithms in Haskell. In Proc. of the 22nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages (POPL), pages 344–354. ACM Press, 1995.

[13] P. Lammich. Collections Framework. In G. Klein, T. Nipkow, and L. Paulson, editors, Archive of Formal Proofs. http://afp.sourceforge.net/entries/Collections.shtml, 2009. Formal proof development.

[14] P. Lammich and A. Lochbihler. The Isabelle Collections Framework. In M. Kaufmann and L. Paulson, editors, Interactive Theorem Proving, volume 6172 of Lecture Notes in Computer Science, pages 339–354. Springer, 2010.

[15] P. Lammich and T. Tuerk. Applying data refinement for monadic programs to Hopcroft’s algorithm. Submitted for publication, 2012.

[16] S. Merz. Model checking: A tutorial overview. In F. Cassez, C. Jard, B. Rozoy, and M. Ryan, editors, Modeling and Verification of Parallel Processes, volume 2067 of Lecture Notes in Computer Science, pages 3–38. Springer, 2001.

[17] T. Nipkow, L. C. Paulson, and M. Wenzel. Isabelle/HOL — A Proof Assistant for Higher-Order Logic, volume 2283 of Lecture Notes in Computer Science. Springer, 2002.

[18] S. Ray, J. Matthews, and M. Tuttle. Certifying compositional model checking algorithms in ACL2. In W. A. Hunt, Jr., M. Kaufmann, and J. S. Moore, editors, 4th International Workshop on the ACL2 Theorem Prover and its Applications, 2003.

[19] A. Schimpf, S. Merz, and J.-G. Smaus. Construction of Büchi automata for LTL model checking verified in Isabelle/HOL. In S. Berghofer, T. Nipkow, C. Urban, and M. Wenzel, editors, Theorem Proving in Higher Order Logics, volume 5674 of Lecture Notes in Computer Science, pages 424–439. Springer, 2009.

[20] S. Schwoon and J. Esparza. A note on on-the-fly verification algorithms. In N. Halbwachs and L. Zuck, editors, Proc. of the 11th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS), volume 3440 of Lecture Notes in Computer Science, pages 174–190. Springer, 2005.

[21] R. Tarjan. Depth-first search and linear graph algorithms. SIAM Journal on Computing, 1(2):146–160, 1972.

[22] M. Y. Vardi. Verification of concurrent programs: the automata-theoretic framework. Annals of Pure and Applied Logic, 51(1-2):79–98, 1991.

[23] M. Y. Vardi and P. Wolper. Reasoning about infinite computations. Information and Computation, 115(1):1–37, 1994.

[24] P. Wadler. The essence of functional programming. In Proc. of the 19th ACM SIGPLAN-SIGACT symposium on Principles of programming languages (POPL), pages 1–14. ACM Press, 1992.


Tipi: A TPTP-based theory development environment emphasizing proof analysis

Jesse Alama*

Center for Artificial Intelligence, New University of Lisbon
[email protected], http://centria.di.fct.unl.pt/~alama/

Abstract

In some theory development tasks, a problem is satisfactorily solved once it is shown that a theorem (conjecture) is derivable from the background theory (premises). Depending on one’s motivations, the details of the derivation of the conjecture from the premises may or may not be important. In some contexts, though, one wants more from theory development than simply derivability of the target theorems from the background theory. One may want to know which premises of the background theory were used in the course of a proof output by an automated theorem prover (when a proof is available), whether they are all, in suitable senses, necessary (and why), whether alternative proofs can be found, and so forth. The problem, then, is to support proof analysis in theory development; the tool described in this paper, Tipi, aims to provide precisely that.

1 Introduction

A characteristic feature of theorem proving problems arising in theory development is that we often do not know which premises of our background theory are needed for a proof until we find one. If we are working in a stable background theory in which the axioms are fixed, we naturally include all premises of the background theory because it is a safe estimate (perhaps an overestimate) of what is needed to solve the problem. We may add lemmas on top of the background theory to help a theorem prover find a solution or to make our theory more comprehensible. Since computer-assisted theorem proving is beset on all sides by intractability, any path through a formal theory development task is constantly threatened by limitations both practical (time, memory, patience, willpower) and theoretical (undecidability of first-order validity). Finding even one solution (proof, model, etc.) is often no small feat, so declaring victory once the first solution is found is quite understandable and may be all that is wanted.

In some theory development tasks, though, we want to learn more about our problem beyond its solvability. This paper announces Tipi, a tool that helps us to go beyond mere solvability of a reasoning problem by providing support for answering such questions as:

• What premises of the problem were used in the solution?

• Do other automated reasoning systems derive the conclusion from the same premises?

• Are my premises consistent? Do they admit unintended models?

*Supported by the ESF research project Dialogical Foundations of Semantics within the ESF Eurocores program LogICCC (funded by the Portuguese Science Foundation, FCT LogICCC/0001/2007). Research for this paper was partially done while a visiting fellow at the Isaac Newton Institute for the Mathematical Sciences in the programme ‘Semantics & Syntax’. The author wishes to thank Ed Zalta and Paul Oppenheimer for stimulating the development of Tipi and for being guinea pigs for it.


• What premises are truly needed for the conclusion? Can we find multiple sets of such premises? Is there a “minimal” theory that derives the conclusion?

• Are my axioms independent of one another?

Let us loosely call the investigation of these and related questions proof analysis.

Tipi is useful for theory exploration both in the context of discovery and in the context of justification. In the context of discovery, one crafts lemmas, adds or deletes axioms, changes existing axioms, modifies the problem statement, etc., with the aim of eventually showing that the theory is adequate for one’s purposes (it derives a certain conjecture, is satisfiable, is unsatisfiable, etc.). In the context of discovery, the set of one’s axioms is in flux, and one needs tools to help ensure that the development is not veering too far off course into the unexpected: countersatisfiability, admitting “nonsense” models, being inconsistent, etc. In the context of justification, after the initial work is done and a solution is found, one wants to know more about the relationship between the theory and the conjecture than simply that the latter is derivable from the former. What is the proof like? Are there other proofs? Tipi is designed to facilitate answering questions such as these.

The theorem provers and model finders that make up Tipi include E, Vampire, Prover9, Mace4, and Paradox. The system is extensible; adding support for new automated reasoning systems is straightforward because we rely only on the SZS ontology to make judgments about theorem proving problems.

Tipi uses a variety of automated reasoning technology to carry out its analysis. It uses theorem provers and model finders and is based on the TPTP syntax for expressing reasoning problems [7] and the SZS problem status ontology [6]; it can thereby flexibly use a variety of automated reasoning tools that support this syntax.

Going beyond solvability and demanding more of our solutions is obviously not a new idea. Our interests complement those of Wos and collaborators, who are also often interested not simply in derivability, but in finding proofs that have certain valuable properties, such as being optimal in various senses; see [11, 10]. Tipi emphasizes proof analysis at the level of sets of premises, whereas one could be interested in more fine-grained information such as the number of symbols employed in a proof, whether short proofs are available, whether a theory is axiomatized by a single formula, etc. Such analysis tends to involve rather expert knowledge of particular problems and low-level tweaking of proof procedures. Tipi uses automated reasoning technology essentially always in “automatic mode”.

The philosophical background of Tipi is a classic problem in the philosophy of logic known as the proof identity problem:

When are two proofs the same?

Standard approaches to the proof identity problem work with natural deduction derivations or category theory. One well-known proposal is to identify “proof” with a natural deduction derivation, define a class of conversion operations on natural deduction derivations, and declare that two proofs are the same if one can be converted to the other. See [3] for a discussion of this approach to the proof identity problem. The inspiration for Tipi is to take on the proof identity problem with the assistance of modern automated reasoning tools. From this perspective, the TPTP library [7] can be seen as a useful resource on which to carry out experiments about “practical” proof identity. TPTP problems typically don’t contain proofs in the usual sense of the term, but they do contain hints of proofs in the sense that they specify axioms and perhaps some intermediate lemmas.


One does not need to share the philosophical background (or even care about it) to start using Tipi, which in any case was designed to facilitate TPTP-based theory development.

Terminology In the following we sometimes equivocate on the term theory, understanding it sometimes in its mathematical sense as a set of formulas closed under logical consequence (which is thus always infinite and has infinitely many axiomatizations), and sometimes in its practical sense, represented as a TPTP problem, which always has finitely many axioms. “TPTP theory” simply means an arbitrary (first-order) TPTP problem. Of course, from a TPTP theory T we can obtain a theory in the mathematical sense of the term by simply reading the formulas of T as logical formulas and closing T under logical consequence. From an arbitrary first-order theory in the mathematical sense of the term one obviously cannot extract a unique finite axiomatization and, worse, many theories of interest are not even finitely axiomatizable. Still, we may at times, for precision, need to understand “theory” in its mathematical sense, even though of course we shall always work with finite TPTP theories (problems).

Convention Some TPTP problems do not have a conjecture formula. Indeed, some TPTP problems are not theorem proving problems per se but are better understood as model finding problems (e.g., the intended SZS status is Satisfiable). For expository convenience we shall restrict ourselves to TPTP problems whose intended interpretation is that a set of premises entails a single conclusion.

The structure of this paper is as follows. Section 2 describes some simple tools provided by Tipi to facilitate theory development. Section 3 discusses the problem of determining which premises are needed. Section 4 discusses two algorithms, one syntactic and the other semantic, for determining needed premises. Section 5 concentrates on independent sets of axioms. Section 6 discusses some simple model analysis tools provided by Tipi. Section 7 gives a sense of the experience so far with using Tipi on real-world proof analysis tasks. Section 8 says where one can obtain Tipi and briefly discusses its implementation. Section 9 concludes and suggests further directions for proof analysis and dependencies.

2 Syntax analysis

When designing TPTP theories, one needs to be careful about the precise language (signature) that one employs. An all-too-familiar problem is typos: one intends to write connected_to but writes conected_to instead. One quick check that can help catch this kind of error is to look for unique occurrences of constants, functions, or predicates. A unique occurrence of a relation or function symbol is a sign (though by no means a necessary or sufficient condition) that the theory is likely to be trivially inadequate to the intended reasoning task because it will fail to be (un)satisfiable, or fail to derive the conjecture. Detecting such hapax legomena early in the theory development process can prevent wasted time “debugging” TPTP theories that appear to be adequate but which are actually flawed.
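As an illustration of this kind of check, the sketch below counts symbol occurrences across TPTP-style formula strings and reports the hapax legomena. It is purely illustrative: Tipi itself delegates symbol extraction to the GetSymbols tool, and the regular-expression tokenizer here is a simplifying assumption.

```python
import re
from collections import Counter

def hapax_legomena(formulas):
    """Return the lowercase (function/predicate) symbols that occur
    exactly once across a list of TPTP-style formula strings.
    Illustrative sketch only; not Tipi's actual machinery."""
    counts = Counter()
    for formula in formulas:
        # TPTP convention: variables are capitalized, so lowercase
        # identifiers are function and predicate symbols.
        counts.update(re.findall(r"\b[a-z]\w*\b", formula))
    return sorted(sym for sym, n in counts.items() if n == 1)

axioms = [
    "![X]: (edge(X) => connected_to(head(X), tail(X)))",
    "![X, Y]: (connected_to(X, Y) => connected_to(Y, X))",
    "![X]: conected_to(X, X)",   # typo: should be connected_to
]
print(hapax_legomena(axioms))
# prints ['conected_to', 'edge', 'head', 'tail']
```

Note that the typo surfaces alongside legitimately rare symbols, matching the caveat above that a unique occurrence is a sign of trouble, not a proof of it.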

3 Needed premises

Once it is known that a conjecture c is derivable from a background theory T, one may want to know about the class of proofs that witness this derivability relation. Depending on which automated theorem prover (ATP) is used, there may not even be a derivation of c from T, but only the judgment that c is a theorem of T. If one does have a derivation (e.g., a resolution proof) d, one can push the investigation further:

• Which premises of T occur in d?


• Are all the premises occurring in d needed?

Various notions of “needed” are available. For lack of space we cannot give a complete survey of this interesting concept; see [1] for a more thorough discussion of the notion of “dependency” in the context of interactive theorem provers. One can distinguish whether a formula is needed for a derivation, or for a conclusion. In a Hilbert-style calculus, the sequence d of formulas

⟨C, A, A → B, B⟩

is a derivation of B from the premises X = {C, A, A → B}. All axioms of X do appear in d, so it is reasonable to assert that all of X is needed for d. But are all premises needed for the conclusion B of d? The formula C is not used as the premise of any application of a rule of inference (here, modus ponens is the only rule of inference). Thus one can simply delete the first term of d and obtain a derivation d′ of B from X − {C}. In a plain resolution calculus, a derivation of the empty clause from a set C of clauses can have unused premises in the same sense as there can be unused premises of a Hilbert-style derivation. Still, there can be “irrelevant” literals in clauses of C whose deletion from C and from a refutation d of C yields a more focused proof.
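The deletion argument can be made concrete with a toy model of Hilbert-style derivations. The representation of steps as (formula, premise-step indices) pairs is an assumption made for illustration only, not Tipi's machinery.

```python
def used_premises(derivation, premises):
    """Premises actually used in a derivation: a step is used iff it
    is the final conclusion or feeds a later used step.  Each step is
    a pair (formula, indices of the earlier steps it is inferred from).
    Toy model for illustration."""
    used = {len(derivation) - 1}                 # the conclusion itself
    for i in reversed(range(len(derivation))):
        if i in used:
            used.update(derivation[i][1])        # mark its premises used
    return {derivation[i][0] for i in used} & set(premises)

# d = <C, A, A->B, B>, where B follows from A and A->B by modus ponens
d = [("C", []), ("A", []), ("A->B", []), ("B", [1, 2])]
print(sorted(used_premises(d, {"C", "A", "A->B"})))
# prints ['A', 'A->B']: C appears in d but is not needed for B
```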

Intuitively, any premise that is needed for a conclusion is also needed for any derivation of the conclusion (assuming sensible notions of soundness and completeness of the calculi in which derivations are carried out). However, a premise that is needed for a derivation need not be needed for its conclusion. Clearly, multiple proofs of a conclusion are often available, employing different sets of premises.

In the ATP context, we may even find that, if we keep trimming unused premises from a theory T that derives a conjecture c until no more trimming is possible (so that every premise is needed by the ATP to derive the conjecture), there may still be proper subsets of “minimal” premises that suffice. The examples below in Section 7 illustrate this.

4 Reproving and minimal subtheories

Given an ATP A, a background theory T, and a conjecture c, assume that T does derive c, as witnessed by an A-derivation d. Define

T₀ := {φ ∈ T : φ occurs in d}

as the set of premises of T occurring in d. Do we need all of T to derive c? If T₀ is a proper subset of T, then the answer is evidently “no”. One simple method to investigate the question of which premises of T are needed to derive c is to simply repeat the invocation of A using successively weaker subtheories of T. Given Tₖ and an A-derivation dₖ of c from Tₖ, define

Tₖ₊₁ := {φ ∈ Tₖ : φ occurs in dₖ}

We are then after the fixed point of the sequence

T₀ ⊇ T₁ ⊇ T₂ ⊇ …

We can view this discussion as the definition of a new proof procedure:

Definition 1 (Syntactic reproving). Given a background theory T, an ATP A, and a conjecture c, use A to derive c from T. If this succeeds, extract the premises of T that were used by A to derive c; call this set T′. If T′ = T, then stop and return the derivation. If T′ is a proper subset of T, then let T := T′ and repeat.
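A minimal sketch of this fixed-point iteration, assuming a hypothetical prove callback that stands in for an external ATP call (E, Vampire, ...) and reports the premises used; this is not Tipi's actual interface.

```python
def syntactic_reprove(premises, conjecture, prove):
    """Fixed-point iteration of syntactic reproving (sketch).

    `prove(theory, conjecture)` returns the subset of `theory` used
    in the derivation it found, or None if no derivation was found."""
    theory = frozenset(premises)
    while True:
        used = prove(theory, conjecture)
        if used is None:            # prover failed: keep current theory
            return theory
        used = frozenset(used)
        if used == theory:          # fixed point T' = T: stop
            return theory
        theory = used               # T := T' and repeat

# Toy prover: the derivation of c always uses exactly the premises
# a and b, provided both are present in the theory.
def toy_prove(theory, conjecture):
    needed = frozenset({"a", "b"})
    return needed if needed <= theory else None

print(sorted(syntactic_reprove({"a", "b", "lemma1", "lemma2"}, "c", toy_prove)))
# prints ['a', 'b']: the two unused lemmas are trimmed away
```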


We call this proof procedure “syntactic” simply because we view the task of a proof finder as a syntactic one. The name is not ideal because an ATP may use manifestly semantic methods in the course of deriving c from T. The definition of syntactic reproving requires of A only that we can compute, from a successful search for a derivation, which premises were used; we do not require a derivation from A, though in practice various ATPs do in fact emit derivations, and from these we simply extract the used premises.

If A is a complete ATP, then we can find a fixed point, provided we have unlimited resources; the existence of a fixed point follows from the assumption that T₀ ⊢ c and the fact that T₀ is finite.¹ Of course, we do place restrictions on our proof searches, so we often cannot determine that a proper subset of T suffices to derive c, even if there is such a subset.

It can happen that the syntactic reprove procedure applied with an ATP A, a theory T, and a conjecture c terminates with a subtheory T′ of T even though there is a proper subset T″ of T that also suffices and, further, A can verify that T″ suffices. The syntactic reprove procedure does not guarantee that the solution it finds is truly minimal. Some other proof procedure, then, is needed.

Definition 2 (Semantic reproving). Given a background theory T, ATPs A and B, and a conjecture c, use A to derive c from T. If this succeeds, extract the premises of T that were used by A to derive c; call this set T′. Define

T* := {φ ∈ T′ : T′ − {φ} ⊬ c}

Now use A to check whether T* ⊢ c. If this succeeds, return T*.

The semantic reprove procedure takes two ATPs A and B as parameters. A is used for checking derivability, whereas B is used to check underivability. This proof procedure is called “semantic” because the task of constructing T* is carried out in Tipi using a model finder (e.g., Paradox or Mace4), which solves the problem of showing that X ⊬ φ by producing a model of X ∪ {¬φ}. As with “syntactic” in “syntactic reprove”, the “semantic” in “semantic reprove” is not ideal because any ATP that can decide underivability judgments would work; whether B uses syntactic or semantic methods (or a combination thereof) to arrive at its solution is immaterial. Indeed, in principle, a theorem prover could be used for B. Even though Vampire and E are typically used to determine derivability, because of the properties of their search procedures they can also be used for determining underivability, though establishing underivability is not necessarily their strong suit, and often a model finder can give an answer more efficiently to the problem of whether X ⊢ φ.
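The procedure of Definition 2 can be sketched as follows; the prove and countersatisfiable callbacks are hypothetical stand-ins for ATP A and model finder B, not Tipi's actual interface.

```python
def semantic_reprove(premises, conjecture, prove, countersatisfiable):
    """Sketch of semantic reproving (hypothetical interface).

    `prove` plays the role of ATP A and returns the premises used in a
    successful derivation (or None); `countersatisfiable(theory, c)`
    plays the role of B: it asks a model finder whether theory + {~c}
    has a model, i.e. whether c is NOT derivable from theory."""
    used = prove(frozenset(premises), conjecture)
    if used is None:
        return None
    used = frozenset(used)
    # T* keeps exactly the premises whose deletion makes c underivable.
    t_star = frozenset(p for p in used
                       if countersatisfiable(used - {p}, conjecture))
    # Re-check that T* alone still derives c before returning it.
    return t_star if prove(t_star, conjecture) is not None else None

# Toy setting: c follows exactly when both a and b are present.
def toy_prove(theory, conjecture):
    return theory & {"a", "b"} if {"a", "b"} <= theory else None

def toy_countersat(theory, conjecture):
    return toy_prove(theory, conjecture) is None

print(sorted(semantic_reprove({"a", "b", "junk"}, "c",
                              toy_prove, toy_countersat)))
# prints ['a', 'b']: junk is discarded; a and b are each needed
```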

5 Independence

In proof analysis a natural question is whether, in a set X of axioms, there is an axiom φ that depends on the others in the sense that φ can be derived from X − {φ}.

Definition 3 (Independent set of formulas). A set X of formulas is independent if for every formula φ in X it is not the case that X − {φ} ⊢ φ.

Tipi provides a proof procedure for testing whether a set of formulas is independent. The algorithm for testing this is straightforward: given a finite set X = {φ₁, …, φₙ} of axioms whose independence we need to test, test successively

1. whether there is any φ = φₖ in X such that for some j ≠ k we have {φⱼ} ⊢ φ,

¹There may even be multiple A-minimal subtheories of T that derive c, but the proof procedure under discussion will find only one of them.


2. whether there is any φ = φₖ in X such that for some j₁, j₂ ≠ k we have {φⱼ₁, φⱼ₂} ⊢ φ,

3. etc., for increasingly large subsets (the upper bound on their size is of course n − 1).

On the assumption that most sets of axioms that arise in practice are not independent, Tipi employs a “fail fast” heuristic: if X is not independent, then we can likely find, for some axiom φ in X, a small subset X* that suffices to prove φ. Other algorithms for testing independence are conceivable. It could be that the naive algorithm that is immediately suggested by the definition (enumerate the axioms, checking for each one whether it is derivable from the others) may be the best approach. Experience shows this is indeed an efficient algorithm if one really does have an independent set (obviously the iterative “fail fast” algorithm sketched requires n(n − 1) = O(n²) calls to an ATP for a set of axioms of size n, whereas the obvious algorithm makes just n calls). A model finder can be used to facilitate this: if (X − {φ}) ∪ {¬φ} is satisfiable, then φ is independent of the other axioms of X. If one is dealing with large sets of axioms, testing independence becomes prohibitively expensive, so one could employ a randomized algorithm: randomly choose an axiom φ and a proper subset T′ of T that does not contain φ and test whether T′ proves φ. Tipi implements all these algorithms for checking independence.
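The naive algorithm, with the model-finder test standing in for the underivability checks, can be sketched as follows; all interfaces here are hypothetical, not Tipi's actual API.

```python
def independent(axioms, countersatisfiable):
    """Naive independence test of Definition 3 (illustrative sketch).

    `countersatisfiable(premises, phi)` stands in for a model-finder
    call (Paradox/Mace4): does premises + {~phi} have a model?  If so,
    phi does not follow from the remaining axioms."""
    axioms = list(axioms)
    for i, phi in enumerate(axioms):
        rest = axioms[:i] + axioms[i + 1:]
        if not countersatisfiable(rest, phi):   # rest derives phi
            return False
    return True

# Toy entailment: c follows from {a, b}; nothing else follows.
def toy_countersat(rest, phi):
    return not (phi == "c" and {"a", "b"} <= set(rest))

print(independent(["a", "b", "c"], toy_countersat))  # prints False
print(independent(["a", "b"], toy_countersat))       # prints True
```

As the text notes, this makes n model-finder calls for n axioms, versus O(n²) ATP calls for the iterative "fail fast" variant.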

A typical application of independence checking first invokes one of the minimization algorithms described in Section 4. If there is a proper subset T′ of T that suffices to derive c, then the independence of the full theory T is probably less interesting (and in any event requires more work to determine) than the independence of the sharper set T′.

Checking independence is related to the semantic reprove algorithm described in Section 4. If we are dealing with a theory that has a conjecture formula, then the two notions are not congruent, because the property of independence holds for a set of formulas without regard to whether they come from a theory that has a conjecture formula. If we are dealing with a theory T without a conjecture whose intended SZS status is Unsatisfiable, i.e., the theory should be shown to be unsatisfiable, then an axiom φ of T gets included in the set T* of Definition 2 (semantic reproving) if T − {φ} is satisfiable. T is “semantically minimal” when no proper subtheory of T is unsatisfiable, i.e., every proper subtheory of T is satisfiable. Independence and semantic minimality thus coincide in the setting of theories without a conjecture formula with intended SZS status Unsatisfiable.

6 Model analysis

When developing formal theories, one’s axioms, lemmas, and conjecture are typically in flux. One may request the assistance of an automated reasoning system to check simply whether one’s premises are consistent. One might go further and ask whether, if one is dealing with a TPTP theory that has a conjecture, the theory is satisfiable when the conjecture is taken as simply another axiom. An “acid test” for whether one is proceeding down the right path at all is whether one’s problem is countersatisfiable.

Tipi provides tools for facilitating this kind of analysis. A single command is available that can check, given a theory:

• whether the theory without the conjecture has a model

• whether the axioms of the theory together with the conjecture (if present) have a model

• whether the axioms of the theory together with the negation of the conjecture have a model.
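The three checks can be sketched as a single routine; has_model is a hypothetical stand-in for a model-finder invocation (Paradox/Mace4), and the literal-based toy model finder below is purely illustrative.

```python
def model_analysis(axioms, conjecture, has_model):
    """The three satisfiability checks behind the model-analysis
    command (illustrative sketch).  `has_model(formulas)` stands in
    for a model-finder call; formulas are plain strings here, with
    "~" marking negation."""
    axioms = list(axioms)
    return {
        "axioms_have_model": has_model(axioms),
        "axioms_plus_conjecture": has_model(axioms + [conjecture]),
        "countersatisfiable": has_model(axioms + ["~" + conjecture]),
    }

# Toy model finder: formulas are literals; a set of literals has a
# model iff no literal occurs both positively and negatively.
def toy_has_model(formulas):
    pos = {f for f in formulas if not f.startswith("~")}
    neg = {f[1:] for f in formulas if f.startswith("~")}
    return not (pos & neg)

report = model_analysis(["p", "q"], "p", toy_has_model)
print(report)
# "countersatisfiable" comes out False here: the conjecture p
# follows from the axioms, so axioms + {~p} have no model
```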


edge ends are vertices: the head and the tail of an edge are vertices

in path properties: if P is a path from vertices v1 and v2, and if v is in P, then (i) v is a vertex and (ii) there is an edge e of P such that v is either the head or the tail of e

on path properties: if P is a path from vertices v1 and v2, and if e is on P, then e is an edge and both the head and the tail of e are in P

Table 1: Two minimal subtheories of GRA008+1.

The second consistency check is useful to verify that one’s whole problem (axioms together with the conjecture, considered as just another axiom) is sensible. It can happen that the axioms of a problem have very simple models, but adding the conjecture makes the models somewhat more complicated. If a set of axioms has a finite model but we cannot determine reasonably quickly that the axioms together with the conjecture have a finite model, then we can take such results as a sign that the conjecture may not be derivable from the axioms. (Of course, it is possible that the set of axioms is finitely satisfiable but the set containing the axioms and the conjecture is finitely unsatisfiable. One can use tools such as Infinox [2] to complement Tipi in such scenarios.)

7 Experience

Tipi has so far been used successfully to analyze a variety of theories occurring in diverse TPTP theory development tasks. It has proved quite useful for theory development tasks in computational metaphysics [4], which was the initial impetus for Tipi.

To get a sense of how one can apply Tipi, we now consider several applications of Tipi to problems coming from the large TPTP library of automated reasoning problems. In these examples we use E as our theorem prover and Paradox as our model finder.

Example 1 (GRA008+1). A problem in a first-order theory² about graphs has 17 premises. Syntactic reprove brings this down to 12. Progressing further with semantic reproving, we find that 8 of the 12 are needed (in the sense that for each of them, their deletion, while keeping the others, leads to countersatisfiability). Moreover, none of the other 4 is individually needed (the conjecture is still derivable from the 4 theories one obtains by deleting the 4). It turns out that there are two minimal theories that suffice to derive the conjecture; see Table 1.

Example 2 (PUZ001+1). Pelletier’s Dreadbury Mansion puzzle [5] asks: “Who killed Aunt Agatha?”

Someone who lives in Dreadbury Mansion killed Aunt Agatha. Agatha, the butler, and Charles live in Dreadbury Mansion, and are the only people who live therein. A killer always hates his victim, and is never richer than his victim. Charles

²See GRA001+0.


composition identity: x ; 1 = x
converse cancellativity
converse idempotence
converse multiplicativity
maddux3 (a kind of de Morgan law)

Table 2: Minimal subtheories of REL002+1.

hates no one that Aunt Agatha hates. Agatha hates everyone except the butler. The butler hates everyone not richer than Aunt Agatha. The butler hates everyone Aunt Agatha hates. No one hates everyone. Agatha is not the butler.

Among the 13 first-order sentences in the formalization are three

lives(agatha), lives(butler), lives(charles)

that turn out to be deletable, which the reader may find amusing since we are dealing with a murder mystery. Each of the 10 other premises turns out to be needed (their deletion leads to countersatisfiability), so there is a unique minimal subtheory of the original theory that suffices to solve the mystery (which is that Agatha killed herself). Note that the premise that Agatha is not the butler (an inference that would perhaps be licensed on pragmatic grounds if it were missing from the text) is needed, which perhaps explains why the puzzle explicitly states it.

Example 3 (REL002+1). A problem about relation algebra is to show that ⊤ is a right unit for the join operation (+):

∀x (x + ⊤ = ⊤).

There are 13 premises³. Syntactic reprove with Vampire shows that 7 axioms can be cut, whereas syntactic reprove with E finds 6. The sets of syntactically minimal premises of E and Vampire are, interestingly, not comparable (neither is a subset of the other). Semantic reprove with the 10 distinct axioms used by either Vampire or E shows, surprisingly, that 2 are needed, whereas each of the other 8 is separately eliminable. Of the 256 combinations of these 8 premises, we find two minima; see Table 2.

With enough caution, Tipi can be used somewhat in the large-theory context, where there are “large” numbers of axioms (at least several dozen, sometimes many more). Although it is quite hopeless, in the large-theory context, to test all possible combinations of premises in the hope of discovering all minimal theories, one can sometimes use syntactic reproving to weed out large classes of subsets. With these filtered premises, semantic reproving can be used to find minima using a more tractable number of combinations of premises.

Example 4 (TOP024+1). Urban’s mapping [9] of the Mizar Mathematical Library, with its rich language for interactively developing mathematics, into pure first-order theorem proving problems is a rich vein of theorem proving problems. Many of them are quite challenging owing to the large number of axioms and the inherent difficulty of reasoning in advanced pure mathematics.

Here the problem is to prove that every maximal T0 subset of a topological space T is dense.

³See REL001+0.


dt k3 tex 4: maximal anti-discrete subsets of a topological space T are subsets of T
reflexivity r1 tarski: X ⊆ X
t3 subset: A ∈ 𝒫(X) iff A ⊆ X

Table 3: Two minimal subtheories of TOP024+1.

Of the 68 available premises, 9 are found through an initial syntactic reproving run using E and Vampire. Of these 9, 3 are (separately) not needed, whereas the other 6 are needed. Of the 8 combinations of these 3 premises, we find two minima; see Table 3.

8 Availability

Tipi is available at

https://github.com/jessealama/tipi

At present Tipi relies on the GetSymbols, TPTP2X, and TPTP4X tools, which are part of the TPTP World distribution [8]. These are used to parse TPTP theory files; a standalone parser for the TPTP language is planned, which would eliminate the dependency on these additional tools.

9 Conclusion and future work

At the moment Tipi supports a handful of theorem provers and model finders. Supporting further systems is desirable; any automated reasoning system that supports the SZS ontology could, in principle, be added.

Tipi supports, at the moment, only first-order logic, and so covers only a part of the space of all TPTP theories. There seems to be no inherent obstacle to extending Tipi to support higher-order theories as well.

More systematic investigation of alternative proofs of a theorem could be carried out using Prover9’s clause weight mechanism. This would provide an alternative to the simple approach Tipi takes to the problem of generating multiple alternative proofs.

When working with models of a theory under development that makes true some rather unusual or unexpected formulas, it can sometimes be difficult to pinpoint the difficulty with the theory that allows it to have such unusual models. One has to infer, by looking at the raw presentation of the model, what the strange properties are. We would like to implement a smarter, more interactive diagnosis of “broken” theories.

The problem of finding minimal subtheories sufficient to derive a conjecture, checking independence of sets of axioms, etc., clearly requires much more effort than simply deriving the conjecture. Tipi thus understandably can take a lot of time to answer some of the questions put to it. Some of this inefficiency seems unavoidable, but it is reasonable to expect that further experience with Tipi could lead to new insights into the problem of finding theory minima, determining independence, etc.

The proof procedures defined by Tipi naturally suggest extensions to the SZS ontology [6]. One can imagine SZS statuses such as

• IndependentAxioms: The set of axioms is independent.

• DependentAxioms: The set of axioms is dependent.

• MinimalPremises: No proper subset of the axioms suffices to derive the conjecture.

• NonMinimalPremises: A proper subset of the axioms suffices to derive the conjecture.

• UniqueMinimum: There is a unique subset S of the axioms such that S derives the conjecture and every proper subset of S fails to derive the conjecture.

• MultipleIncomparableMinima: There are at least two proper subsets S1 and S2 of the axioms that suffice to derive the conjecture, with neither S1 ⊆ S2 nor S2 ⊆ S1.

Tipi itself can already be seen as supporting these (currently unofficial) SZS statuses. One could even annotate the statistics for many problems in the TPTP library by listing the number of possible solutions (minimal subtheories of the original theory) they admit, or the number of premises that are actually needed.
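As a toy illustration of how such statuses could be computed, the following sketch (hypothetical helper names of our own; not part of Tipi or of the official SZS ontology) classifies an axiom set and the outcome of a minimal-premise search; `proves_from` is again a stand-in for an ATP call.

```python
def axiom_set_status(axioms, proves_from):
    """IndependentAxioms / DependentAxioms: the set is dependent iff
    some axiom already follows from the remaining ones."""
    for ax in axioms:
        rest = [b for b in axioms if b != ax]
        if proves_from(rest, ax):
            return "DependentAxioms"
    return "IndependentAxioms"

def minima_status(minima):
    """Map the result of a minimal-premise search onto the proposed
    status names; None if the conjecture is not derivable at all."""
    if len(minima) == 1:
        return "UniqueMinimum"
    if len(minima) > 1:
        return "MultipleIncomparableMinima"
    return None

# Toy entailment: "c" follows from "a" together with "b".
def toy_entails(premises, goal):
    return goal in premises or (goal == "c" and "a" in premises and "b" in premises)

print(axiom_set_status(["a", "b", "c"], toy_entails))  # DependentAxioms
print(minima_status([{"a3"}, {"a1", "a2"}]))           # MultipleIncomparableMinima
```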

References

[1] Jesse Alama, Lionel Mamane, and Josef Urban. Dependencies in formal mathematics. CoRR, abs/1109.3687, 2011.

[2] Koen Claessen and Ann Lillieström. Automated inference of finite unsatisfiability. In R. A. Schmidt, editor, CADE 2009, volume 5633 of Lecture Notes in Artificial Intelligence, pages 388–403, 2009.

[3] K. Došen. Identity of proofs based on normalization and generality. Bulletin of Symbolic Logic, pages 477–503, 2003.

[4] B. Fitelson and E. N. Zalta. Steps toward a computational metaphysics. Journal of Philosophical Logic, 36(2):227–247, 2007.

[5] F. J. Pelletier. Seventy-five problems for testing automatic theorem provers. Journal of Automated Reasoning, 2(2):191–216, 1986.

[6] G. Sutcliffe. The SZS ontologies for automated reasoning software. In Proceedings of the LPAR Workshops: Knowledge Exchange: Automated Provers and Proof Assistants, and The 7th International Workshop on the Implementation of Logics, volume 418, pages 38–49. CEUR Workshop Proceedings, 2008.

[7] G. Sutcliffe. The TPTP problem library and associated infrastructure. Journal of Automated Reasoning, 43(4):337–362, 2009.

[8] G. Sutcliffe. The TPTP World – infrastructure for automated reasoning. In Logic for Programming, Artificial Intelligence, and Reasoning, pages 1–12. Springer, 2010.

[9] J. Urban. MPTP 0.2: Design, implementation, and initial experiments. Journal of Automated Reasoning, 37(1):21–43, 2006.

[10] L. Wos. The flowering of automated reasoning. In D. Hutter and W. Stephan, editors, Mechanizing Mathematical Reasoning, volume 2605 of Lecture Notes in Computer Science, pages 204–227. Springer, 2005.

[11] L. Wos and G. W. Pieper. Automated Reasoning and the Discovery of Missing and Elegant Proofs. Rinton Press, 2003.


Reducing Higher Order π-Calculus to Spatial Logics

Zining Cao

Department of Computer Science and Technology
Nanjing University of Aero. & Astro.
Nanjing 210016, P. R. China
[email protected]

Abstract. In this paper, we show that the theory of processes can be reduced to the theory of spatial logic. Firstly, we propose a spatial logic SL for the higher order π-calculus, and give an inference system for SL. The soundness and incompleteness of SL are proved. Furthermore, we show that the structural congruence relation and the one-step transition relation can be described as logical relations between SL formulae. We also extend bisimulations for processes to bisimulations for SL formulae. Then we extend all definitions and results of SL to a weak semantics version of SL, called WL. At last, we add the μ-operator to SL. This new logic is named μSL. We show that WL is a sublogic of μSL and that the replication operator can be expressed in μSL.

1 Introduction

Higher order π-calculus was proposed and studied intensively in Sangiorgi's dissertation [29]. In the higher order π-calculus, processes and abstractions over processes of arbitrarily high order can be communicated. Some interesting equivalences for the higher order π-calculus, such as barbed equivalence, context bisimulation and normal bisimulation, were presented in [29]. Barbed equivalence can be regarded as a uniform definition of bisimulation for a variety of concurrent calculi. Context bisimulation is a very intuitive definition of bisimulation for the higher order π-calculus, but it is unwieldy to handle, due to the universal quantifications appearing in its definition. In the definition of normal bisimulation, all universal quantifications disappear, so normal bisimulation is a very economical characterization of bisimulation for the higher order π-calculus. The coincidence of the three weak equivalences was proven in [29, 28, 20]. Moreover, this result was generalized to the strong case [10].

Spatial logic was presented in [12]. Spatial logic extends classical logic with connectives to reason about the structure of processes. The additional connectives belong to two families. Intensional operators allow one to inspect the structure of the process. A formula A1|A2 is satisfied whenever we can split the process into two parts satisfying the corresponding subformulae Ai, i = 1, 2. In the presence of restriction in the underlying model, a process P satisfies formula n®A if we can write P as (νn)P′ with P′ satisfying A. Finally, formula 0 is only satisfied by the inaction process. The connectives | and ® come with adjunct operators, called guarantee (▷) and hiding (⊘) respectively, that allow one to extend the process being observed. In this sense, these can be called contextual operators. P satisfies A1 ▷ A2 whenever the spatial composition (using |) of P with any process satisfying A1 satisfies A2, and P satisfies A⊘n if (νn)P satisfies A. Some spatial logics have an operator for fresh name quantification [11].

There is a large body of work on spatial logics for the π-calculus and Mobile Ambients. In several papers, spatial logic has been studied in relation to structural congruence, bisimulation, model checking and type systems of process calculi [5, 6, 9, 16, 27].

The main idea of this paper is that the theory of processes can be reduced to the theory of spatial logic.

In this paper, we present a spatial logic for the higher order π-calculus, called SL, which comprises action temporal operators such as ⟨τ⟩ and ⟨a⟨A⟩⟩, spatial operators such as prefix and composition, adjuncts of the spatial operators such as ▷ and ⊘, and operators on the properties of free and bound names such as (™a) and (™). We give an inference system for SL, and prove its soundness. Furthermore, we show that there is no finite complete inference system for SL. Then we study the relation between processes and SL formulas. We show that an SL formula can be viewed as a specification of processes, and conversely, a process can be viewed as a special kind of SL formula. Therefore, SL is a generalization of processes, which extends processes with specification statements. We show that the structural congruence relation and the one-step transition relation can be described as logical relations between SL formulas. We also show that bisimulations for higher order processes can be characterized by a sublogic of SL.

Furthermore, we give a weak semantics version of SL, called WL, in which the internal action is unobservable. The results for SL are extended to WL: an inference system for WL, the soundness of this inference system, and the nonexistence of a finite complete inference system for WL.

Finally, we add the μ-operator to SL. The new logic is named μSL. We show that WL is a sublogic of μSL and that the replication operator can be expressed in μSL. Thus μSL is a powerful logic which can express both the strong and the weak semantics of higher order processes.

This paper is organized as follows. In Section 2, we briefly review the higher order π-calculus. In Section 3, we present the spatial logic SL, including its syntax, semantics and inference system. The soundness and incompleteness of the inference system of SL are proved. Furthermore, we discuss how SL can be regarded as a specification language for processes, and how processes can be regarded as a special kind of SL formulas. Bisimulation in the higher order π-calculus is described by a sublogic of SL. In Section 4, we give a weak semantics version of SL, called WL, and generalize the concepts and results of SL to WL. In Section 5, we add the μ-operator to SL; the new logic is named μSL, and we study the expressive power of this extension. The paper is concluded in Section 6.


2 Higher Order π-Calculus

2.1 Syntax and Labelled Transition System

In this section we briefly recall the syntax and labelled transition system of the higher order π-calculus. Similar to [28], we only focus on a second-order fragment of the higher order π-calculus, i.e., there is no abstraction in this fragment.

We assume a set N of names, ranged over by a, b, c, ..., and a set Var of process variables, ranged over by X, Y, Z, U, .... We use E, F, P, Q, ... to stand for processes. Pr denotes the set of all processes.

We first give the grammar for the higher order π-calculus processes as follows:

P ::= 0 | U | π.P | P1|P2 | (νa)P

π is called a prefix and can have one of the following forms:

π ::= a(U) | ā⟨P⟩,

where a(U) is a higher order input prefix and ā⟨P⟩ is a higher order output prefix.

In each process of the form (νa)P the occurrence of a is bound within the scope of P. An occurrence of a in a process is said to be free iff it does not lie within the scope of a bound occurrence of a. The set of names occurring free in P is denoted fn(P). An occurrence of a name in a process is said to be bound if it is not free; we write the set of bound names as bn(P). n(P) denotes the set of names of P, i.e., n(P) = fn(P) ∪ bn(P). The definition of substitution in process terms may involve renaming of bound names when necessary to avoid name capture.

The higher order input prefix a(U).P binds all free occurrences of U in P. The set of variables occurring free in P is denoted fv(P). We write the set of bound variables as bv(P). A process is closed if it has no free variables; it is open if it may have free variables. Pr^c is the set of all closed processes.
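For readers who want to experiment with these definitions, fn can be transcribed directly into code. The Python sketch below uses a tuple representation of terms of our own devising (not from the paper) over the second-order grammar just given:

```python
# Process terms as nested tuples:
#   ("nil",)           inaction 0
#   ("var", X)         process variable
#   ("in", a, U, P)    a(U).P   -- higher order input, binds the variable U
#   ("out", a, Q, P)   output prefix a<Q>.P
#   ("par", P, Q)      P | Q
#   ("nu", a, P)       (nu a)P  -- restriction, binds the name a

def fn(p):
    """Free names of a process, following the definition in the text."""
    tag = p[0]
    if tag in ("nil", "var"):
        return set()
    if tag == "in":                      # the subject a is free; U is a variable
        return {p[1]} | fn(p[3])
    if tag == "out":
        return {p[1]} | fn(p[2]) | fn(p[3])
    if tag == "par":
        return fn(p[1]) | fn(p[2])
    if tag == "nu":                      # restriction removes the bound name
        return fn(p[2]) - {p[1]}
    raise ValueError(p)

# (nu b)(a<b(U).U>.0): b is bound by the restriction, so fn = {a}.
example = ("nu", "b", ("out", "a", ("in", "b", "U", ("var", "U")), ("nil",)))
print(fn(example))  # {'a'}
```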

Processes P and Q are α-convertible, P ≡α Q, if Q can be obtained from P by a finite number of changes of bound names and variables. For example, (νb)(ā⟨b(U).U⟩.0) ≡α (νc)(ā⟨c(U).U⟩.0).

Structural congruence: P|Q ≡ Q|P; (P|Q)|R ≡ P|(Q|R); P|0 ≡ P; (νa)0 ≡ 0; (νm)(νn)P ≡ (νn)(νm)P; (νa)(P|Q) ≡ P|(νa)Q if a ∉ fn(P).

In [26], Parrow has shown that in the higher order π-calculus, replication can be defined by other operators such as higher order prefix, parallel composition and restriction. For example, !P can be simulated by R_P = (νa)(D | ā⟨P|D⟩.0), where D = a(X).(X | ā⟨X⟩.0).
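Parrow's simulation of replication can be constructed mechanically. The sketch below uses an illustrative tuple representation of terms (our own convention, not the paper's) to build R_P = (νa)(D | ā⟨P|D⟩.0) with D = a(X).(X | ā⟨X⟩.0), assuming the name a is fresh for P:

```python
def replication_encoding(p, a="a"):
    """Build Parrow's simulation of !P quoted in the text.
    Terms are tuples: ("nil",), ("var", X), ("in", a, U, P),
    ("out", a, Q, P), ("par", P, Q), ("nu", a, P)."""
    # D = a(X).(X | a<X>.0)
    D = ("in", a, "X", ("par", ("var", "X"),
                        ("out", a, ("var", "X"), ("nil",))))
    # R_P = (nu a)(D | a<P|D>.0)
    return ("nu", a, ("par", D, ("out", a, ("par", p, D), ("nil",))))

r = replication_encoding(("nil",))
print(r[0], r[1])  # the outermost constructor is the restriction (nu a)
```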

The operational semantics of higher order processes is given in Table 1. We have omitted the symmetric cases of the parallelism and communication rules.

ALP: if P —α→ P′, P ≡ Q and P′ ≡ Q′, then Q —α→ Q′

OUT: ā⟨E⟩.P —ā⟨E⟩→ P

IN: a(U).P —a⟨E⟩→ P{E/U}, where bn(E) = ∅

PAR: if P —α→ P′, then P|Q —α→ P′|Q, where bn(α) ∩ fn(Q) = ∅

COM: if P —(νb̃)ā⟨E⟩→ P′ and Q —a⟨E⟩→ Q′, then P|Q —τ→ (νb̃)(P′|Q′), where b̃ ∩ fn(Q) = ∅

RES: if P —α→ P′, then (νa)P —α→ (νa)P′, where a ∉ n(α)

OPEN: if P —(νc̃)ā⟨E⟩→ P′, then (νb)P —(νb,c̃)ā⟨E⟩→ P′, where a ≠ b and b ∈ fn(E) − c̃

Table 1. The operational semantics of the higher order π-calculus

2.2 Bisimulations in Higher Order π-Calculus

Context bisimulation and contextual barbed bisimulation were presented in [29, 28] to describe the behavioral equivalences for the higher order π-calculus. Let us review the definitions of these bisimulations. In the following, we abbreviate P{E/U} as P⟨E⟩.

Context bisimulation is an intuitive definition of bisimulation for the higher order π-calculus.

Definition 1 A symmetric relation R ⊆ Pr^c × Pr^c is a strong context bisimulation if P R Q implies:

(1) whenever P —τ→ P′, there exists Q′ such that Q —τ→ Q′ and P′ R Q′;

(2) whenever P —a⟨E⟩→ P′, there exists Q′ such that Q —a⟨E⟩→ Q′ and P′ R Q′;

(3) whenever P —(νb̃)ā⟨E⟩→ P′, there exist Q′, F, c̃ such that Q —(νc̃)ā⟨F⟩→ Q′ and, for all C(U) with fn(C(U)) ∩ {b̃, c̃} = ∅, (νb̃)(P′|C⟨E⟩) R (νc̃)(Q′|C⟨F⟩). Here C(U) represents a process containing a unique free variable U.

We write P ∼Ct Q if P and Q are strongly context bisimilar.

Contextual barbed equivalence can be regarded as a uniform definition of bisimulation for a variety of process calculi.

Definition 2 A symmetric relation R ⊆ Pr^c × Pr^c is a strong contextual barbed bisimulation if P R Q implies:

(1) P|C R Q|C for any C;

(2) whenever P —τ→ P′, there exists Q′ such that Q —τ→ Q′ and P′ R Q′;

(3) P ↓μ implies Q ↓μ, where P ↓a if ∃P′ with P —a⟨E⟩→ P′, and P ↓ā if ∃P′ with P —(νb̃)ā⟨E⟩→ P′.

We write P ∼Ba Q if P and Q are strongly contextual barbed bisimilar.

Intuitively, a tau action represents an internal action of a process. If we only consider external actions, then we should adopt weak bisimulations to characterize the equivalence of processes.

Definition 3 A symmetric relation R ⊆ Pr^c × Pr^c is a weak context bisimulation if P R Q implies:

(1) whenever P =ε⇒ P′, there exists Q′ such that Q =ε⇒ Q′ and P′ R Q′;

(2) whenever P =a⟨E⟩⇒ P′, there exists Q′ such that Q =a⟨E⟩⇒ Q′ and P′ R Q′;

(3) whenever P =(νb̃)ā⟨E⟩⇒ P′, there exist Q′, F, c̃ such that Q =(νc̃)ā⟨F⟩⇒ Q′ and, for all C(U) with fn(C(U)) ∩ {b̃, c̃} = ∅, (νb̃)(P′|C⟨E⟩) R (νc̃)(Q′|C⟨F⟩). Here C(U) represents a process containing a unique free variable U.

We write P ≈Ct Q if P and Q are weakly context bisimilar.

Definition 4 A symmetric relation R ⊆ Pr^c × Pr^c is a weak contextual barbed bisimulation if P R Q implies:

(1) P|C R Q|C for any C;

(2) whenever P =ε⇒ P′, there exists Q′ such that Q =ε⇒ Q′ and P′ R Q′;

(3) P ⇓μ implies Q ⇓μ, where P ⇓μ if ∃P′. P =ε⇒ P′ and P′ ↓μ.

We write P ≈Ba Q if P and Q are weakly contextual barbed bisimilar.

3 Logics for Strong Semantics

In this section, we present a logic to reason about the higher order π-calculus, called SL. This logic extends propositional logic with three kinds of connectives: action temporal operators, spatial operators, and operators about names and variables. We give the syntax and semantics of SL. The inference system of SL is also given. We prove the soundness and incompleteness of this inference system. As far as we know, this is the first result on the completeness problem for an inference system of spatial logic. Furthermore, we show that structural congruence, the one-step transition relation and bisimulation can all be characterized by this spatial logic. It is well known that structural congruence, the one-step transition relation and bisimulation are central concepts in the theory of processes, and almost all studies of process calculi concern these concepts. Therefore, our study gives an approach to reducing the theory of processes to the theory of spatial logic. Moreover, since processes can be regarded as a special kind of spatial logic formulas, spatial logic can be viewed as an extension of process calculus. Based on spatial logic, it is possible to propose a refinement calculus [23] of concurrent processes.

3.1 Syntax and Semantics of Logic SL

Now we introduce a logic called SL, which is a spatial logic for the higher order π-calculus.

Definition 5 Syntax of logic SL

A ::= ⊤ | ⊥ | ¬A | A1 ∧ A2 | ⟨τ⟩A | ⟨a⟨A1⟩⟩A2 | ⟨a[A1]⟩A2 | ⟨ā⟨A1⟩⟩A2 | 0 | X | aØX.A | A \ aØX | ā⟨A1⟩.A2 | A \ a | A1|A2 | A1 ▷ A2 | a®A | A⊘a | (Nx)A | (NX)A | (™a)A | (™)A | a ≠ b

In (Nx)A and (NX)A, the variables x and X are bound with scope the formula A. We assume defined on formulas the standard relation ≡α of α-conversion (safe renaming of bound variables), but we never implicitly take formulas "up to α-conversion": our manipulation of variables via α-conversion steps is always quite explicit. The set fn(A) of free names of A, and the set fpv(A) of free propositional variables of A, are defined in the usual way. A formula is closed if it has no free variables such as X; it is open if it may have free variables. SL^c is the set of all closed formulas. In the following, we use A{b/a} to denote the formula obtained by replacing all occurrences of a in A by b. Similarly, we use A{Y/X} to denote the formula obtained by replacing all occurrences of X in A by Y. It is easy to see that a process can also be regarded as a spatial formula. For example, the process ā⟨E⟩.P is also a spatial formula. In this paper, we say that such a formula is in the form of a process formula.

Definition 6 Semantics of logic SL

[[⊤]]Pr = Pr
[[⊥]]Pr = ∅
[[¬A]]Pr = Pr − [[A]]Pr
[[A1 ∧ A2]]Pr = [[A1]]Pr ∩ [[A2]]Pr
[[⟨τ⟩A]]Pr = {P | ∃Q. P —τ→ Q and Q ∈ [[A]]Pr}
[[⟨a⟨A1⟩⟩A2]]Pr = {P | ∃P1, P2. P —a⟨P1⟩→ P2, P1 ∈ [[A1]]Pr and P2 ∈ [[A2]]Pr}
[[⟨a[A1]⟩A2]]Pr = {P | ∀R ∈ [[A1]]Pr ∃Q. P —a⟨R⟩→ Q and Q ∈ [[A2]]Pr}
[[⟨ā⟨A1⟩⟩A2]]Pr = {P | ∃P1, P2. P —(νb̃)ā⟨P1⟩→ P2, (νb̃)P1 ∈ [[A1]]Pr and P2 ∈ [[A2]]Pr}
[[0]]Pr = {P | P ≡ 0}
[[X]]Pr = {P | P ≡ X}
[[aØX.A]]Pr = {P | ∃Q. P ≡ a(X).Q and Q ∈ [[A]]Pr}
[[A \ aØX]]Pr = {P | a(X).P ∈ [[A]]Pr}
[[ā⟨A1⟩.A2]]Pr = {P | ∃P1, P2. P ≡ ā⟨P1⟩.P2, P1 ∈ [[A1]]Pr and P2 ∈ [[A2]]Pr}
[[A \ a]]Pr = {P | ā⟨P⟩.0 ∈ [[A]]Pr}
[[A1|A2]]Pr = {P | ∃Q1, Q2. P ≡ Q1|Q2, Q1 ∈ [[A1]]Pr and Q2 ∈ [[A2]]Pr}
[[A1 ▷ A2]]Pr = {P | ∀Q. Q ∈ [[A1]]Pr implies P|Q ∈ [[A2]]Pr}
[[a®A]]Pr = {P | ∃Q. P ≡ (νa)Q and Q ∈ [[A]]Pr}
[[A⊘a]]Pr = {P | (νa)P ∈ [[A]]Pr}
[[(Nx)A]]Pr = ∪n∉fn((Nx)A) ([[A{n/x}]]Pr ∩ {P | n ∈ fn(P)})
[[(NX)A]]Pr = ∪V∉fpv((NX)A) ([[A{V/X}]]Pr ∩ {P | V ∈ fpv(P)})
[[(™a)A]]Pr = {P | a ∉ fn(P) and P ∈ [[A]]Pr}
[[(™)A]]Pr = {P | ∃Q. P ≡ Q, bn(Q) = ∅ and Q ∈ [[A]]Pr}
[[a ≠ b]]Pr = Pr if a ≠ b
[[a ≠ b]]Pr = ∅ if a = b

In SL, the formula ⟨a⟨A1⟩⟩A2 is satisfied by the processes that can receive a process satisfying A1 and then become a process satisfying A2. The formula ⟨a[A1]⟩A2 is satisfied by a process if, on receiving any process satisfying A1, it becomes a process satisfying A2. A \ aØX is an adjunct operator of aØX.A, and A \ a is an adjunct operator of ā⟨A⟩.0. (™a)A is satisfied by processes that satisfy A and do not have a as a free name. (™)A is satisfied by processes that satisfy A and have no bound names. The other operators of SL are well known in spatial logic or can be interpreted similarly to the operators above.

Definition 7 P ⊨SL A iff P ∈ [[A]]Pr.

Definition 8 For a set of formulas Γ and a formula A, we write Γ ⊨SL A if A is valid in all processes that satisfy all formulas of Γ.

Definition 9 If "A1, ..., An infer B" is an instance of an inference rule, and the formulas A1, ..., An have appeared earlier in the proof, then we say that B follows from an application of an inference rule. A proof is said to be from Γ to A if its premises are Γ and its last formula is A. We say that A is provable from Γ in an inference system AX, and write Γ ⊢AX A, if there is a proof from Γ to A in AX.

For example, the following sets can be defined by operators of SL:

{P | ∀P1. P1 ∈ [[A1]]Pr implies ā⟨P1⟩.P ∈ [[A2]]Pr} = [[(b(Y).ā⟨A1⟩.Y ▷ ⟨τ⟩A2) \ b]]Pr

{P | ∀P1. P1 ∈ [[A1]]Pr implies ā⟨P⟩.P1 ∈ [[A2]]Pr} = [[(b(Y).ā⟨Y⟩.A1 ▷ ⟨τ⟩A2) \ b]]Pr

{P | a ∈ fn(P) and P ∈ [[A]]Pr} = [[¬(™a)⊤ ∧ A]]Pr

{P | X ∈ fv(P) and P ∈ [[A]]Pr} = [[¬(™X)⊤ ∧ A]]Pr

(Hx)A = (Nx)x®A, which is related to name restriction in an appropriate way; namely, if a process P satisfies the formula A{n/x}, then (νn)P satisfies (Hx)A.

(aHX)A = (NX)aØX.A, which is related to process variable restriction in an appropriate way; namely, if a process P satisfies the formula A{U/X}, then a(U).P satisfies (aHX)A.

3.2 Inference System of SL

Now we list a number of valid properties of spatial logic. The combination of a complete inference system for first order logic with the following axioms and rules forms the inference system S of SL.

⟨α⟩⊥ → ⊥
aØX.⊥ → ⊥
ā⟨⊤⟩.⊥ → ⊥
ā⟨⊥⟩.⊤ → ⊥
⊥ \ aØX → ⊥
⊥ \ a → ⊥
A|⊥ → ⊥
A ▷ ⊥ → ¬A
⊥ ▷ A ↔ ⊤
a®⊥ → ⊥
⊥⊘a → ⊥
(™a)⊥ → ⊥
(Nx)⊥ → ⊥
(™)⊥ → ⊥
(NX)⊥ → ⊥
A|B ↔ B|A
(A|B)|C ↔ A|(B|C)
A|0 ↔ A
a®0 ↔ 0
a®b®A ↔ b®a®A
a®((™a)A|B) ↔ (™a)A|a®B
a®A → (Nb)b®A{b/a}
aØX.A → (NY)aØY.A{Y/X}
(™a)0 ↔ 0
(™a)X ↔ X
(™a)a(X).A ↔ ⊥
(™a)ā⟨B⟩.A ↔ ⊥
a ≠ b → ((™a)b(X).A ↔ b(X).(™a)A)
a ≠ b → ((™a)b̄⟨B⟩.A ↔ b̄⟨(™a)B⟩.(™a)A)
(™a)A|(™a)B ↔ (™a)(A|B)
a ≠ b → ((™a)(™b)A ↔ (™b)(™a)A)
(™a)a®A ↔ a®A
(™)0 ↔ 0
(™)X ↔ X
(™)aØX.A ↔ aØX.(™)A
(™)ā⟨B⟩.A ↔ ā⟨(™)B⟩.(™)A
(™)A|(™)B ↔ (™)(A|B)
(™)a®A → ⊥
(Nx)0 ↔ 0
(Nx)X ↔ X
(Nx)aØX.A ↔ aØX.(Nx)(x ≠ a ∧ A)
(Nx)ā⟨B⟩.A → ā⟨(Nx)(x ≠ a ∧ B)⟩.(Nx)(x ≠ a ∧ A)
(Nx)(A|B) → (Nx)A|(Nx)B
(Nx)(x ≠ a ∧ a®A) → a®(Nx)A
(NX)0 ↔ 0
(NX)X → Y
(NX)aØY.A ↔ aØY.(NX)A
(NX)ā⟨B⟩.A → ā⟨(NX)B⟩.(NX)A
(NX)(A|B) → (NX)A|(NX)B
(NX)a®A ↔ a®(NX)A
aØX.(A \ aØX) → A
A → (aØX.A) \ aØX
ā⟨A \ a⟩.0 → A
A → (ā⟨A⟩.0) \ a
A|(A ▷ B) → B
A → (B ▷ A|B)
a®(A⊘a) → A
A → (a®A)⊘a
⟨α⟩A, A → B ⊢ ⟨α⟩B
aØX.A, A → B ⊢ aØX.B
ā⟨C⟩.A, A → B ⊢ ā⟨C⟩.B
ā⟨B⟩.A, B → C ⊢ ā⟨C⟩.A
⟨a⟨B⟩⟩A, C → B ⊢ ⟨a⟨C⟩⟩A
⟨a[B]⟩A, C → B ⊢ ⟨a[C]⟩A
A \ aØX, A → B ⊢ B \ aØX
A \ a, A → B ⊢ B \ a
A → B ⊢ A|C → B|C
a®A, A → B ⊢ a®B
(™a)A, A → B ⊢ (™a)B
(™)A, A → B ⊢ (™)B
ā⟨B⟩.A → ⟨ā⟨B⟩⟩A
(⟨τ⟩A)|B → ⟨τ⟩(A|B)
(⟨ā⟨C⟩⟩A)|B → ⟨ā⟨C⟩⟩(A|B)
(aØU.A ∧ ((™)B ↔ B)) → ⟨a[B]⟩A{B/U}
(((™b1, ..., ™bn)B ↔ B) ∧ ((™)C ↔ C)) → ((⟨ā⟨b1®...bn®C⟩⟩A)|B → ⟨ā⟨b1®...bn®C⟩⟩(A|B))
(((™b1, ..., ™bn)B ↔ B) ∧ ((™)C ↔ C)) → ((⟨ā⟨b1®...bn®C⟩⟩A)|⟨a[C]⟩B → ⟨τ⟩b1®...bn®(A|B))
(a ≠ b ∧ ((™a)B ↔ B) ∧ ((™)B ↔ B)) → (a®⟨b̄⟨B⟩⟩A → ⟨b̄⟨B⟩⟩a®A)
(∧i=1..n a ≠ bi ∧ a ≠ c ∧ ((™a)B ↔ B) ∧ ((™)B ↔ B)) → (a®⟨c̄⟨b1®...bn®B⟩⟩A → ⟨c̄⟨b1®...bn®B⟩⟩a®A)
(a ≠ b ∧ ∧i=1..n b ≠ ci ∧ (B → ¬(™b)⊤) ∧ ((™)B ↔ B)) → (b®⟨ā⟨c1®...cn®B⟩⟩A → ⟨ā⟨b®c1®...cn®B⟩⟩A)
⟨a[B]⟩A → ⟨a⟨B⟩⟩A
⟨a⟨B⟩⟩A → ⟨a[B]⟩A, where B is syntactically a valid process of the higher order π-calculus.

Intuitively, the axiom a®A → (Nb)b®A{b/a} means that if a process P satisfies (νa)A and b is a fresh name, then P satisfies (νb)A{b/a}. The axiom ā⟨B⟩.A → ⟨ā⟨B⟩⟩A means that an output prefix process can perform an output action; it is a spatial logical version of rule OUT of the labelled transition system of the higher order π-calculus. The axiom (aØU.A ∧ ((™)B ↔ B)) → ⟨a[B]⟩A{B/U} means that an input prefix process can perform an input action; it is a spatial logical version of rule IN. The axiom (((™b1, ..., ™bn)B ↔ B) ∧ ((™)C ↔ C)) → ((⟨ā⟨b1®...bn®C⟩⟩A)|⟨a[C]⟩B → ⟨τ⟩b1®...bn®(A|B)) is a spatial logical version of rule COM. The other axioms and rules are, similarly, spatial logical versions of the structural congruence rules or of the labelled transition rules.

3.3 Soundness of SL

The inference system of SL is said to be sound with respect to processes if every formula provable in SL is valid with respect to processes.

Now we can prove the soundness of the inference system S of SL:

Proposition 1 Γ ⊢S A ⇒ Γ ⊨SL A

Proof. See Appendix A.

3.4 Incompleteness of SL

The system SL is complete with respect to processes if every formula valid with respect to processes is provable in SL. For a logic, completeness is an important property: soundness and completeness together provide a tight connection between the syntactic notion of provability and the semantic notion of validity. Unfortunately, by the compactness property [18], the inference system of SL is not complete.

The depth of higher order processes in Pr is defined as follows:

Definition 10 d(0) = 0; d(U) = 0; d(a(U).P) = 1 + d(P); d(ā⟨E⟩.P) = 1 + d(E) + d(P); d(P1|P2) = d(P1) + d(P2); d((νa)P) = d(P).

Lemma 1 For any P ∈ Pr, there exists n such that d(P) = n.

Proof. Induction on the structure of P.

Proposition 2 There is no finite sound inference system AX such that Γ ⊨SL A ⇒ Γ ⊢AX A.

Proof. See Appendix B.
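Definition 10 is directly executable. The following Python sketch (using an illustrative tuple representation of terms, a convention of our own) computes d(P) by exactly the recursion of the definition:

```python
def depth(p):
    """Depth of a higher order process per Definition 10.
    Terms: ("nil",), ("var", U), ("in", a, U, P), ("out", a, E, P),
    ("par", P1, P2), ("nu", a, P)."""
    tag = p[0]
    if tag in ("nil", "var"):        # d(0) = 0, d(U) = 0
        return 0
    if tag == "in":                  # d(a(U).P) = 1 + d(P)
        return 1 + depth(p[3])
    if tag == "out":                 # d(a<E>.P) = 1 + d(E) + d(P)
        return 1 + depth(p[2]) + depth(p[3])
    if tag == "par":                 # d(P1|P2) = d(P1) + d(P2)
        return depth(p[1]) + depth(p[2])
    if tag == "nu":                  # d((nu a)P) = d(P)
        return depth(p[2])
    raise ValueError(p)

# D = a(X).(X | a<X>.0) from Parrow's replication encoding: depth 2.
D = ("in", "a", "X", ("par", ("var", "X"),
                      ("out", "a", ("var", "X"), ("nil",))))
print(depth(D))  # 2
```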

3.5 Spatial Logic as a Specification of Processes

In the refinement calculus [23], imperative programming languages are extended by specification statements, which specify parts of a program "yet to be developed". The development of a program then begins with a specification statement, and ends with an executable program, obtained by refining the specification into a possible implementation. In this paper, we generalize this idea to the case of process calculi. Roughly speaking, we extend processes to spatial logic formulas, which are regarded as specification statements. Processes can be regarded as a special kind of spatial logic formula. One can view the intensional operators of spatial logic as "executable program statements", for example ā⟨P⟩.Q, P|Q, etc., and the extensional operators of spatial logic as "specification statements", for example A ▷ B, A \ b, etc. For example, (bØY.ā⟨Y⟩.A1 ▷ ⟨τ⟩A2) \ b | (dØY.c̄⟨B1⟩.Y ▷ ⟨τ⟩B2) \ d represents a specification statement which describes a process consisting of the parallel composition of two processes satisfying the statements (bØY.ā⟨Y⟩.A1 ▷ ⟨τ⟩A2) \ b and (dØY.c̄⟨B1⟩.Y ▷ ⟨τ⟩B2) \ d respectively. Furthermore, (bØY.ā⟨Y⟩.A1 ▷ ⟨τ⟩A2) \ b represents a specification which describes a process P such that ā⟨P⟩.Q satisfies A2 for any Q satisfying A1. Similarly, (dØY.c̄⟨B1⟩.Y ▷ ⟨τ⟩B2) \ d represents a specification statement which describes a process M such that c̄⟨N⟩.M satisfies B2 for any N satisfying B1. We can also define a refinement relation on spatial logic formulas. Intuitively, if ⊨SL A → B, then A refines B. For example, a®(aØX.d.X | ā⟨c.0⟩.e.0) refines a®(⟨a[c.0]⟩d.c.0 | ⟨ā⟨c.0⟩⟩e.0). Based on spatial logic, one may develop a theory of refinement for concurrent processes. This will be a future research direction for us.


3.6 Processes as Special Formulas of Spatial Logic

Any process can be regarded as a special formula of spatial logic. For example, (Na)a®(NX)(aØX.d.X | ā⟨c.0⟩.e.0) is a spatial logic formula which represents the processes that are structurally congruent to (νa)(a(X).d.X | ā⟨c.0⟩.e.0). Furthermore, in this section we show that structural congruence and the labelled transition relation can be reformulated as logical relations between spatial logical formulas.

Definition 11 The translating function TPS is defined inductively as follows:

TPS(P) ≝ P for any process P that contains no operator (νa)· or a(X).·;
TPS((νa)P) ≝ (Ha)TPS(P);
TPS(a(X).P) ≝ (aHX)TPS(P).

Proposition 3 For any P, Q ∈ Pr^c, P ≡ Q ⇔ P ⊨SL TPS(Q) and Q ⊨SL TPS(P) ⇔ TPS(P) ⊢SL TPS(Q) and TPS(Q) ⊢SL TPS(P).

Proof. See Appendix C.

Proposition 4 For any P, Q ∈ Pr^c, P —α→ Q ⇔ P ⊨SL ⟨α⟩TPS(Q) ⇔ TPS(P) ⊢SL ⟨α⟩TPS(Q).

Proof. See Appendix D.

Although Proposition 2 states that the inference system is not complete, Propositions 3 and 4 show that it is complete with respect to structural congruence and the labelled transition relation of processes.
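The translation TPS of Definition 11 can likewise be prototyped. In the sketch below (our own tuple encodings for terms and formulas; ("H", a, A) stands for (Ha)A and ("HV", a, X, A) for (aHX)A), the recursion is pushed through parallel composition and output prefixes, as the first clause of the definition implicitly requires:

```python
def tps(p):
    """Translate a process term into a spatial-logic formula following
    Definition 11.  Terms: ("nil",), ("var", X), ("in", a, X, P),
    ("out", a, Q, P), ("par", P, Q), ("nu", a, P)."""
    tag = p[0]
    if tag == "nu":                        # (nu a)P  ->  (Ha) TPS(P)
        return ("H", p[1], tps(p[2]))
    if tag == "in":                        # a(X).P   ->  (aHX) TPS(P)
        return ("HV", p[1], p[2], tps(p[3]))
    if tag == "par":
        return ("par", tps(p[1]), tps(p[2]))
    if tag == "out":
        return ("out", p[1], tps(p[2]), tps(p[3]))
    return p            # 0 and process variables translate to themselves

print(tps(("nu", "a", ("in", "a", "X", ("var", "X")))))
# → ('H', 'a', ('HV', 'a', 'X', ('var', 'X')))
```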

3.7 Behavioral Equivalence Relation of Spatial Logic

In [9], we introduced a spatial logic called L, and proved that L gives a characterization of context bisimulation.

Definition 12 [9] Syntax of logic L

A ::= ¬A | A1 ∧ A2 | ⟨a⟨⊤⟩⟩⊤ | ⟨ā⟨⊤⟩⟩⊤ | ⟨τ⟩A | A1 ▷ A2.

It is easy to see that L is a sublogic of SL. In [9], we proved the equivalence between ∼Ct and logical equivalence with respect to L.

Proposition 5 [9] For any P, Q ∈ Pr^c, P ∼Ct Q ⇔ for any formula A ∈ L, P ⊨L A iff Q ⊨L A.

Definition 13 A and B are behaviorally equivalent with respect to L, written A ∼L B, iff for any formula C ∈ L, ⊨SL A → C iff ⊨SL B → C.

By Proposition 5, it is easy to obtain the following corollary, which characterizes ∼Ct by an SL property.

Corollary 1 For any P, Q ∈ Pr^c, P ∼Ct Q ⇔ P ∼L Q.

The relation ∼L is a binary relation on spatial logical formulas. The above results show that ∼L gives a logical characterization of bisimulation when the formulas are in the form of processes. Moreover, the relation ∼L also makes it possible to generalize bisimulation on processes to bisimulation on spatial logical formulas. Since, as discussed above, spatial logical formulas can be regarded as specifications of processes, we may obtain a concept of bisimulation on specifications of processes based on ∼L.


4 Logics for Weak Semantics

In this section, we present a logic for weak semantics, named WL. Roughly speaking, in this logic the action temporal operators ⟨τ⟩, ⟨a⟨A⟩⟩, ⟨a[A]⟩ and ⟨ā⟨A⟩⟩ of SL are replaced by the weak semantics versions of these operators: ⟨⟨ε⟩⟩, ⟨⟨a⟨A⟩⟩⟩, ⟨⟨a[A]⟩⟩ and ⟨⟨ā⟨A⟩⟩⟩. Almost all definitions and results of SL can be generalized to WL.

4.1 Syntax and Semantics of Logic WL

Now we introduce a logic called WL, which is a weak semantics version of the spatial logic.

Definition 14 Syntax of logic WL

A ::= ⊤ | ⊥ | ¬A | A1 ∧ A2 | ⟨⟨ε⟩⟩A | ⟨⟨a⟨A1⟩⟩⟩A2 | ⟨⟨a[A1]⟩⟩A2 | ⟨⟨ā⟨A1⟩⟩⟩A2 | 0 | X | aØX.A | A \ aØX | ā⟨A1⟩.A2 | A \ a | A1|A2 | A1 ▷ A2 | a®A | A⊘a | (Nx)A | (NX)A | (™a)A | (™)A | a ≠ b

Definition 15 Semantics of logic WL

The semantics of WL formulas is the same as that of SL formulas, except that the semantics of the operators ⟨⟨ε⟩⟩, ⟨⟨a⟨A⟩⟩⟩, ⟨⟨a[A]⟩⟩ and ⟨⟨ā⟨A⟩⟩⟩ is defined as follows:

[[⟨⟨ε⟩⟩A]]Pr = {P | ∃Q. P =ε⇒ Q and Q ∈ [[A]]Pr}
[[⟨⟨a⟨A1⟩⟩⟩A2]]Pr = {P | ∃P1, P2. P =a⟨P1⟩⇒ P2, P1 ∈ [[A1]]Pr and P2 ∈ [[A2]]Pr}
[[⟨⟨a[A1]⟩⟩A2]]Pr = {P | ∀R ∈ [[A1]]Pr ∃Q. P =a⟨R⟩⇒ Q and Q ∈ [[A2]]Pr}
[[⟨⟨ā⟨A1⟩⟩⟩A2]]Pr = {P | ∃P1, P2. P =(νb̃)ā⟨P1⟩⇒ P2, (νb̃)P1 ∈ [[A1]]Pr and P2 ∈ [[A2]]Pr}

4.2 Inference System of WL

The inference system of WL is similar to the inference system of SL, except that every inference rule about the action temporal operators ⟨τ⟩, ⟨a⟨A⟩⟩, ⟨a[A]⟩ and ⟨ā⟨A⟩⟩ of SL is replaced by one of the following inference rules.

⟨⟨α⟩⟩⊥ → ⊥
⟨⟨α⟩⟩A, A → B ⊢ ⟨⟨α⟩⟩B
⟨⟨α⟩⟩A, A → ⟨⟨ε⟩⟩B ⊢ ⟨⟨α⟩⟩B
⟨⟨ε⟩⟩A, A → ⟨⟨α⟩⟩B ⊢ ⟨⟨α⟩⟩B
⟨⟨a⟨B⟩⟩⟩A, C → B ⊢ ⟨⟨a⟨C⟩⟩⟩A
⟨⟨a[B]⟩⟩A, C → B ⊢ ⟨⟨a[C]⟩⟩A
ā⟨B⟩.A → ⟨⟨ā⟨B⟩⟩⟩A
(aØU.A ∧ ((™)B ↔ B)) → ⟨⟨a[B]⟩⟩A{B/U}
(⟨⟨ε⟩⟩A)|B → ⟨⟨ε⟩⟩(A|B)
(⟨⟨ā⟨C⟩⟩⟩A)|B → ⟨⟨ā⟨C⟩⟩⟩(A|B)
(((™b1, ..., ™bn)B ↔ B) ∧ ((™)C ↔ C)) → ((⟨⟨ā⟨b1®...bn®C⟩⟩⟩A)|B → ⟨⟨ā⟨b1®...bn®C⟩⟩⟩(A|B))
(((™b1, ..., ™bn)B ↔ B) ∧ ((™)C ↔ C)) → ((⟨⟨ā⟨b1®...bn®C⟩⟩⟩A)|⟨⟨a[C]⟩⟩B → ⟨⟨ε⟩⟩b1®...bn®(A|B))
a®⟨⟨ε⟩⟩A → ⟨⟨ε⟩⟩a®A
(a ≠ b ∧ (((™a)B ∧ (™)B) ↔ B)) → (a®⟨⟨b̄⟨B⟩⟩⟩A → ⟨⟨b̄⟨B⟩⟩⟩a®A)
(∧i=1..n a ≠ bi ∧ a ≠ c ∧ ((™a)B ↔ B) ∧ ((™)B ↔ B)) → (a®⟨⟨c̄⟨b1®...bn®B⟩⟩⟩A → ⟨⟨c̄⟨b1®...bn®B⟩⟩⟩a®A)
(a ≠ b ∧ ∧i=1..n b ≠ ci ∧ (B → ¬(™b)⊤) ∧ ((™)B ↔ B)) → (b®⟨⟨ā⟨c1®...cn®B⟩⟩⟩A → ⟨⟨ā⟨b®c1®...cn®B⟩⟩⟩A)
⟨⟨a[B]⟩⟩A → ⟨⟨a⟨B⟩⟩⟩A
⟨⟨a⟨B⟩⟩⟩A → ⟨⟨a[B]⟩⟩A, where B is syntactically a valid process of the higher order π-calculus.

The above axioms and rules are weak semantics versions of the corresponding axioms and rules of SL. We name the above inference system of WL W.

The soundness and incompleteness of the inference system W of WL can be shown similarly to the case of SL:

Proposition 6 Γ ⊢W A ⇒ Γ ⊨WL A

Proposition 7 There is no finite sound inference system AX such that Γ ⊨WL A ⇒ Γ ⊢AX A.

Similarly to Proposition 4, we can show that the many-step transition relation is provable in WL.

Proposition 8 For any P, Q ∈ Pr^c, P =α⇒ Q ⇔ P ⊨WL ⟨⟨α⟩⟩TPS(Q) ⇔ TPS(P) ⊢WL ⟨⟨α⟩⟩TPS(Q).

Since structural congruence and the labelled transition relation are central concepts in the theory of processes, and they can be characterized in WL, the above propositions give a possible approach to reducing the theory of processes to the theory of spatial logic in the case of weak semantics.

5 Adding µ-Operator to SL

In this section, we add the μ-operator [3] to SL. We refer to this new logic as μSL. We will show that WL is a sublogic of μSL.

5.1 Syntax and Semantics of µSL

The formula of µSL is the same as the formula of SL except that the followingµ-calculus formula is added:

If A(X) 2 µSL, then µX.A(X) 2 µSL, here X occurs positively in A(X),i.e., all free occurrences of X fall under an even number of negations..

The model of µSL is the same as SL. We write such set of processes in whichA is true as [[A]]e

Pr

, where e: V ar ! 2Pr is an environment. We denote by e[X √W ] a new environment that is the same as e except that e[X √ W ](X) = W. Theset [[A]]e

S

is the set of processes that satisfy A. In the following, we abbreviateA(B) as A{B/X}, and abbreviate An+1(B) as A(An(B)) where A0(B) is B.

Semantics of µ-operator is given as following:[[µX.A(X)]]e

Pr

= \{W µ Pr | [[A(X)]]e[X√W ]Pr

µ W}.

Page 39: ATX & WING

In µ-calculus [3], it is well known that [[µX.A(X)]]ePr

= [[A1(?)]]ePr

[[[A2(?)]]ePr

[ ...
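To see this iterative approximation at work, consider the least fixed point μX.(A ∨ ⟨τ⟩X), the pattern used in Section 5.3 to translate the weak modality ⟨⟨ε⟩⟩A. This worked unfolding is an illustration, not taken from the original text; since no process satisfies ⟨τ⟩⊥, the approximants simplify as:

```latex
\begin{aligned}
A^{1}(\bot) &= A \lor \langle\tau\rangle\bot \;\equiv\; A,\\
A^{2}(\bot) &\equiv A \lor \langle\tau\rangle A,\\
A^{n+1}(\bot) &\equiv A \lor \langle\tau\rangle A \lor \dots \lor \langle\tau\rangle^{n} A,
\end{aligned}
```

so μX.(A ∨ ⟨τ⟩X) holds of exactly those processes that can reach a process satisfying A after finitely many τ-steps, which is the intended weak reading.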

5.2 Inference System of µSL

The inference system of μSL is the combination of the inference system of SL and the following two rules of the μ-calculus [3]:

A(μX.A(X)) → μX.A(X).

⊢ A(B) → B ⇒ ⊢ μX.A(X) → B.

We name this inference system of μSL M. The soundness and incompleteness of the inference system M of μSL can be given as in the case of SL.

Proposition 9. Γ ⊢_M A ⇒ Γ ⊨_μSL A.

Proposition 10. There is no finite sound inference system AX such that Γ ⊨_μSL A ⇒ Γ ⊢_AX A.

5.3 Expressivity of µSL

In this section, we discuss the expressive power of μSL. We prove that WL is a sublogic of μSL by giving a function that translates every WL formula into an equivalent μSL formula.

Definition 16. The translating function TWM is defined inductively as follows:

TWM(A) ≝ A for any proposition A of WL that is not of the form ⟨⟨ε⟩⟩A, ⟨⟨a⟨A_1⟩⟩⟩A_2 or ⟨⟨a[A_1]⟩⟩A_2.

TWM(⟨⟨ε⟩⟩A) ≝ μX.(TWM(A) ∨ ⟨τ⟩X)

TWM(⟨⟨a⟨A_1⟩⟩⟩A_2) ≝ μX.(⟨a⟨TWM(A_1)⟩⟩(μY.(TWM(A_2) ∨ ⟨τ⟩Y)) ∨ ⟨τ⟩X)

TWM(⟨⟨a[A_1]⟩⟩A_2) ≝ μX.(⟨a[TWM(A_1)]⟩(μY.(TWM(A_2) ∨ ⟨τ⟩Y)) ∨ ⟨τ⟩X)

The following proposition states the correctness of the translating function TWM.

Proposition 11. For any A ∈ WL, TWM(A) ∈ μSL; for any P ∈ Pr, P ⊨_μSL TWM(A) ⇔ P ⊨_WL A.

Proof: See Appendix E.

In μSL, we can also define the replication operator:

Definition 17. !A ≝ ¬μX.¬(A|¬X).

Proposition 12. ⊢_μSL A|!A ↔ !A.

Proof: See Appendix F.

The above results show that WL is a sublogic of μSL. Therefore μSL can be used as a uniform logical framework to study both the strong semantics and the weak semantics of higher order π-calculus.


6 Conclusions

Spatial logic was proposed to describe structural and behavioural properties of processes. There are many papers on spatial logic and process calculi. Spatial logic is related to several topics in process calculi, such as model checking, structural congruence, bisimulation and type systems. In [16], a spatial logic for the ambient calculus was studied and a model checking algorithm was proposed; some axioms of spatial logic were given, but the completeness of the logic was not studied. Most spatial logics for concurrency are intensional [27], in the sense that they induce an equivalence that coincides with structural congruence, which is much finer than bisimilarity. In [22], Hirschkoff studied an extensional spatial logic. This logic has only the spatial composition adjunct (▷), the revelation adjunct, a simple temporal modality, and an operator for fresh name quantification. For the π-calculus, this extensional spatial logic was proven to induce the same separative power as strong early bisimilarity. In [9], context bisimulation of higher order π-calculus was characterized by an extensional spatial logic. In [5], a type system for processes based on spatial logic was given, where types are interpreted as formulas of spatial logic.

In this paper, we aimed to show that the theory of processes can be reduced to the theory of spatial logics. We first defined a logic SL, which comprises temporal and spatial operators. We gave the inference system of SL and showed its soundness and incompleteness. Furthermore, we showed that structural congruence and the transition relation of higher order π-calculus can be reduced to the logical relation of SL formulas, and that bisimulations in higher order π-calculus can be characterized by a sublogic of SL. We then proposed a weak semantics version of SL, called WL. Finally, we added the μ-operator to SL, obtaining a new logic named μSL, and studied the expressive power of this extension. These results can be generalized to other process calculi. Since several important concepts of processes can be described in spatial logic, this paper may offer an approach for reducing the study of processes to the study of spatial logic. Future work for us is to develop a refinement calculus [23] for concurrent processes based on our spatial logic.

References

1. R. M. Amadio and M. Dam. Reasoning about Higher-order Processes. In TAPSOFT'95, LNCS 915, 202-216, 1995.

2. R. M. Amadio. On the Reduction of CHOCS-Bisimulation to π-calculus Bisimulation. In CONCUR'93, LNCS 715, 112-126, 1993.

3. A. Arnold and D. Niwinski. Rudiments of µ-calculus. Studies in Logic, Vol 146, North-Holland, 2001.

4. M. Baldamus and J. Dingel. Modal Characterization of Weak Bisimulation for Higher-order Processes. In TAPSOFT'97, LNCS 1214, 285-296, 1997.

5. L. Caires. Spatial-Behavioral Types for Concurrency and Resource Control in Distributed Systems. Theoretical Computer Science 402(2-3), 2008.

6. L. Caires. Logical Semantics of Types for Concurrency. In CALCO'07, LNCS, 2007.

7. L. Caires and H. T. Vieira. Extensionality of Spatial Observations in Distributed Systems. In EXPRESS'06, ENTCS, 2006.

8. L. Caires. Behavioral and spatial observations in a logic for the π-calculus. In FOSSACS'04, LNCS 2987, 72-87, 2004.

9. Z. Cao. A Spatial Logical Characterisation of Context Bisimulation. In Proceedings of ASIAN 2006, LNCS 4435, 232-240, 2006.

10. Z. Cao. More on bisimulations for higher-order π-calculus. In FOSSACS'06, LNCS 3921, 63-78, 2006.

11. L. Caires and L. Cardelli. A Spatial Logic for Concurrency (Part II). Theoretical Computer Science 322(3), 517-565, 2004.

12. L. Caires and L. Cardelli. A Spatial Logic for Concurrency (Part I). Information and Computation 186(2), 194-235, 2003.

13. W. Charatonik, S. Dal Zilio, A. D. Gordon, S. Mukhopadhyay, and J.-M. Talbot. The complexity of model checking mobile ambients. In FoSSaCS'01, LNCS 2030, 152-167, 2001.

14. W. Charatonik, S. Dal Zilio, A. D. Gordon, S. Mukhopadhyay, and J.-M. Talbot. Model Checking Mobile Ambients.

15. L. Cardelli and A. Gordon. Logical Properties of Name Restriction. In Proc. of TLCA'01, LNCS 2044, 2001.

16. L. Cardelli and A. Gordon. Anytime, Anywhere: Modal Logics for Mobile Ambients. In Proc. of POPL'00, 365-377, ACM Press, 2000.

17. G. Conforti and G. Ghelli. Decidability of Freshness, Undecidability of Revelation. In Proc. of FoSSaCS'04, LNCS 2987, 2004.

18. C. C. Chang. Model Theory. North-Holland, 1977.

19. L. Caires and E. Lozes. Elimination of Quantifiers and Undecidability in Spatial Logics for Concurrency. Theoretical Computer Science 358(2), 293-314, 2006.

20. A. Jeffrey and J. Rathke. Contextual equivalence for higher-order π-calculus revisited. In Proceedings of Mathematical Foundations of Programming Semantics, Elsevier, 2003.

21. D. Hirschkoff, E. Lozes, and D. Sangiorgi. Separability, Expressiveness and Decidability in the Ambient Logic. In Proc. of LICS'02, 423-432, IEEE Computer Society, 2002.

22. D. Hirschkoff. An Extensional Spatial Logic for Mobile Processes. In CONCUR'04, LNCS 3170, 325-339, Springer-Verlag, 2004.

23. C. Morgan, P. Gardiner, K. Robinson, and T. Vickers. On the Refinement Calculus. Springer-Verlag, 1994.

24. R. Milner, J. Parrow, and D. Walker. Modal logics for mobile processes. Theoretical Computer Science 114(1), 149-171, 1993.

25. L. G. Meredith and M. Radestock. Namespace Logic: A Logic for a Reflective Higher-Order Calculus. In TGC 2005, 353-369.

26. J. Parrow. An introduction to the π-calculus. In J. Bergstra, A. Ponse and S. Smolka, editors, Handbook of Process Algebra, North-Holland, Amsterdam, 2001.

27. D. Sangiorgi. Extensionality and Intensionality of the Ambient Logic. In Proc. of the 28th POPL, 4-17, ACM Press, 2001.

28. D. Sangiorgi. Bisimulation in higher-order calculi. Information and Computation 131(2), 1996.

29. D. Sangiorgi. Expressing mobility in process algebras: first-order and higher-order paradigms. PhD thesis, Department of Computer Science, University of Edinburgh, 1992.

30. C. Stirling. Modal Logics for Communicating Systems. Theoretical Computer Science 49, 311-347, 1987.

31. B. Thomsen. Plain CHOCS: A second generation calculus for higher order processes. Acta Informatica 30, 1-59, 1993.

Appendix A. Proof of Proposition 1

Proposition 1. Γ ⊢_SL A ⇒ Γ ⊨_SL A.

Proof. It suffices to prove that every axiom and every inference rule of the inference system is sound. We only discuss the following cases:

Case (1): Axiom a®((⊛a)A|B) ↔ (⊛a)A|a®B.
Suppose P ∈ [[a®((⊛a)A|B)]]. Then P ≡ (νa)(P_1|P_2), a ∉ fn(P_1), P_1 ∈ [[A]] and P_2 ∈ [[B]]. Therefore P ≡ (νa)(P_1|P_2) ≡ P_1|(νa)P_2, so P ∈ [[(⊛a)A|a®B]]. The converse direction is similar. Hence a®((⊛a)A|B) ↔ (⊛a)A|a®B.

Case (2): Axiom a ≠ b → ((⊛a)b⟨B⟩.A ↔ b⟨(⊛a)B⟩.(⊛a)A).
Suppose a ≠ b and P ∈ [[(⊛a)b⟨B⟩.A]]. Then P ≡ b⟨P_1⟩.P_2, a ∉ fn(P_1), a ∉ fn(P_2), P_1 ∈ [[B]] and P_2 ∈ [[A]]. Therefore P_1 ∈ [[(⊛a)B]] and P_2 ∈ [[(⊛a)A]], so P ∈ [[b⟨(⊛a)B⟩.(⊛a)A]]. The converse direction is similar. Hence the axiom holds.

Case (3): Axiom (A|A▷B) → B.
Suppose P ∈ [[A|A▷B]]. Then P ≡ P_1|P_2 with P_1 ∈ [[A]] and P_2 ∈ [[A▷B]]. By the definition of the adjunct ▷, P ≡ P_1|P_2 ∈ [[B]]. Hence (A|A▷B) → B.

Case (4): Axiom A → (B▷A|B).
Suppose P ∈ [[A]]. Then for any Q ∈ [[B]], P|Q ∈ [[A|B]], so P ∈ [[B▷A|B]]. Hence A → (B▷A|B).

Case (5): Axiom (((⊛b_1,…,⊛b_n)B ↔ B) ∧ ((⊛)C ↔ C)) → ((⟨a⟨b_1®…b_n®C⟩⟩A)|B → ⟨a⟨b_1®…b_n®C⟩⟩(A|B)).
Suppose P ∈ [[(⟨a⟨b_1®…b_n®C⟩⟩A)|B]]. Then P ≡ P_1|P_2, P_1 -(νb_1,…,b_n)a⟨Q⟩-> P_1′, P_1′ ∈ [[A]], P_2 ∈ [[B]] and Q ∈ [[C]]. Since (⊛b_1,…,b_n)B ↔ B, we have {b_1,…,b_n} ∩ fn(P_2) = ∅. Therefore P_1|P_2 -(νb_1,…,b_n)a⟨Q⟩-> P_1′|P_2. Hence the axiom holds.

Case (6): Axiom (((⊛b_1,…,⊛b_n)B ↔ B) ∧ ((⊛)C ↔ C)) → ((⟨a⟨b_1®…b_n®C⟩⟩A)|⟨a[C]⟩B → ⟨τ⟩b_1®…b_n®(A|B)).
Suppose P ∈ [[(⟨a⟨b_1®…b_n®C⟩⟩A)|⟨a[C]⟩B]]. Then P ≡ P_1|P_2, P_1 -(νb_1,…,b_n)a⟨Q⟩-> P_1′, P_2 -a⟨Q⟩-> P_2′, P_1′ ∈ [[A]], P_2′ ∈ [[B]] and Q ∈ [[C]]. Since (⊛b_1,…,b_n)B ↔ B, we have {b_1,…,b_n} ∩ fn(P_2′) = ∅. Therefore P_1|P_2 -τ-> (νb_1,…,b_n)(P_1′|P_2′). Hence the axiom holds.

Case (7): Axiom (∧_{i=1..n} a ≠ b_i ∧ a ≠ c ∧ ((⊛a)B ↔ B) ∧ ((⊛)B ↔ B)) → (a®⟨c⟨b_1®…b_n®B⟩⟩A → ⟨c⟨b_1®…b_n®B⟩⟩a®A).
Suppose P ∈ [[a®⟨c⟨b_1®…b_n®B⟩⟩A]]. Then P ≡ (νa)P_1, P_1 -(νb_1,…,b_n)c⟨Q⟩-> P_1′, Q ∈ [[B]] and P_1′ ∈ [[A]]. Since ∧_{i=1..n} a ≠ b_i ∧ a ≠ c ∧ ((⊛a)B ↔ B) ∧ ((⊛)B ↔ B), we have a ∉ n(Q). Therefore P ≡ (νa)P_1 -(νb_1,…,b_n)c⟨Q⟩-> (νa)P_1′. Hence the axiom holds.

Case (8): Axiom (a ≠ b ∧ ∧_{i=1..n} b ≠ c_i ∧ (B → ¬(⊛b)⊤) ∧ ((⊛)B ↔ B)) → (b®⟨a⟨c_1®…c_n®B⟩⟩A → ⟨a⟨b®c_1®…c_n®B⟩⟩A).
Suppose P ∈ [[b®⟨a⟨c_1®…c_n®B⟩⟩A]]. Then P ≡ (νb)P_1, P_1 -(νc_1,…,c_n)a⟨Q⟩-> P_1′, Q ∈ [[B]] and P_1′ ∈ [[A]]. Since a ≠ b ∧ ∧_{i=1..n} b ≠ c_i ∧ (B → ¬(⊛b)⊤) ∧ ((⊛)B ↔ B), we have b ∈ fn(Q). Therefore P ≡ (νb)P_1 -(νb)(νc_1,…,c_n)a⟨Q⟩-> P_1′. Hence the axiom holds.

Appendix B. Proof of Proposition 2

Proposition 2. There is no finite sound inference system AX such that Γ ⊨_SL A ⇒ Γ ⊢_AX A.

Proof. Let Φ = {a⟨0⟩.⊤, a⟨0⟩.a⟨b.0⟩.⊤, a⟨0⟩.a⟨b.0⟩.a⟨b.b.0⟩.⊤, a⟨0⟩.a⟨b.0⟩.a⟨b.b.0⟩.a⟨b.b.b.0⟩.⊤, …}. It is easy to see that any finite subset of Φ can be satisfied in Pr, but Φ itself cannot. Suppose otherwise, and let P satisfy Φ. By Lemma 1, there exists n such that d(P) = n. But for any n there exists a formula φ_n in Φ such that every P satisfying φ_n has d(P) > n. This contradicts the assumption, so Φ cannot be satisfied in Pr.

Suppose there were a finite inference system such that Γ ⊨_SL A ⇒ Γ ⊢_SL A. Since Φ cannot be satisfied in Pr, we have Φ ⊨_SL ⊥. By the assumption, Φ ⊢_SL ⊥, so there is a proof of ⊥ from Φ in SL. Since a proof is a finite formula sequence, only finitely many formulas φ_i of Φ occur in the proof. Therefore ∧Φ_i ⊢_SL ⊥, where Φ_i = {φ_i | φ_i occurs in the proof}. By the soundness of the inference system of SL, Φ_i is not satisfiable. But Φ_i is a finite subset of Φ, and every finite subset of Φ is satisfiable, a contradiction. Therefore SL has no finite complete inference system.

Appendix C. Proof of Proposition 3

Proposition 3. For any P, Q ∈ Prc, P ≡ Q ⇔ (P ⊨_SL TPS(Q) and Q ⊨_SL TPS(P)) ⇔ (TPS(P) ⊢_SL TPS(Q) and TPS(Q) ⊢_SL TPS(P)).

Proof. It is immediate from the definitions that P ≡ Q ⇔ P ⊨_SL TPS(Q) and Q ⊨_SL TPS(P). By soundness, TPS(P) ⊢_SL TPS(Q) ⇒ P ⊨_SL TPS(Q). So we only need to prove P ≡ Q ⇒ TPS(P) ⊢_SL TPS(Q) and TPS(Q) ⊢_SL TPS(P). We discuss only the following cases; the others are similar or trivial.

Case (1): (νm)(νn)P ≡ (νn)(νm)P: Since m®n®TPS(P) ↔ n®m®TPS(P), we have m®n®TPS(P) ⊢_SL n®m®TPS(P). The converse direction is similar.

Case (2): (νa)(P|Q) ≡ P|(νa)Q if a ∉ fn(P): Since a ∉ fn(P), we have (⊛a)TPS(P) ↔ TPS(P). Furthermore, since a®((⊛a)TPS(P)|TPS(Q)) ↔ (⊛a)TPS(P)|a®TPS(Q), we have a®(TPS(P)|TPS(Q)) ⊢_SL TPS(P)|a®TPS(Q). The converse direction is similar.

Appendix D. Proof of Proposition 4

Proposition 4. For any P, Q ∈ Prc, P -α-> Q ⇔ P ⊨_SL ⟨α⟩TPS(Q) ⇔ TPS(P) ⊢_SL ⟨α⟩TPS(Q).

Proof. It is immediate from the definitions that P -α-> Q ⇔ P ⊨_SL ⟨α⟩TPS(Q). By soundness, TPS(P) ⊢_SL ⟨α⟩TPS(Q) ⇒ P ⊨_SL ⟨α⟩TPS(Q). So we only need to prove P -α-> Q ⇒ TPS(P) ⊢_SL ⟨α⟩TPS(Q).

We apply induction on the length of the inference tree of P -α-> Q.

Case (1): if the length is 0, then P -α-> Q is of the form a⟨E⟩.K -a⟨E⟩-> K or a(U).K -a⟨E⟩-> K{E/U}.

Subcase (a): a⟨E⟩.K -a⟨E⟩-> K: Since a⟨E⟩.TPS(K) → ⟨a⟨E⟩⟩TPS(K), we have a⟨E⟩.TPS(K) ⊢_SL ⟨a⟨E⟩⟩TPS(K).

Subcase (b): a(U).K -a⟨E⟩-> K{E/U}: Since (a(U).TPS(K) ∧ ((⊛)TPS(E) ↔ TPS(E))) → ⟨a[TPS(E)]⟩TPS(K){TPS(E)/U}, we have a(U).TPS(K) ⊢_SL ⟨a[TPS(E)]⟩TPS(K){TPS(E)/U}.

Case (2): Assume the claim holds for inference trees of length n; we discuss the case of length n+1.

Subcase (a): from M -(νb_1,…,b_n)a⟨E⟩-> M′ and N -a⟨E⟩-> N′ infer M|N -τ-> (νb_1,…,b_n)(M′|N′), provided {b_1,…,b_n} ∩ fn(N) = ∅.
Since M -(νb_1,…,b_n)a⟨E⟩-> M′, N -a⟨E⟩-> N′, and {b_1,…,b_n} ∩ fn(N) = ∅, we have TPS(M) → ⟨a⟨b_1®…b_n®TPS(E)⟩⟩TPS(M′), TPS(N) → ⟨a[TPS(E)]⟩TPS(N′) and (⊛b_1,…,b_n)TPS(E) ↔ TPS(E). By the axiom (((⊛b_1,…,b_n)TPS(N) ↔ TPS(N)) ∧ (⊛)TPS(E)) → ((⟨a⟨b_1®…b_n®TPS(E)⟩⟩TPS(M))|⟨a[TPS(E)]⟩TPS(N) → ⟨τ⟩b_1®…b_n®(TPS(M)|TPS(N))), we have TPS(P) = TPS(M)|TPS(N) ⊢_SL ⟨τ⟩b_1®…b_n®(TPS(M′)|TPS(N′)).

Subcase (b): from M -b⟨E⟩-> M′ infer (νa)M -b⟨E⟩-> (νa)M′, provided a ∉ n(α).
Since M -b⟨E⟩-> M′ and a ∉ n(b⟨E⟩), we have TPS(M) → ⟨b⟨TPS(E)⟩⟩TPS(M′) and ((⊛a)TPS(E) ∧ (⊛)TPS(E)) ↔ TPS(E). By the axiom (a ≠ b ∧ (((⊛a)TPS(E) ∧ (⊛)TPS(E)) ↔ TPS(E))) → (a®⟨b⟨TPS(E)⟩⟩TPS(M′) → ⟨b⟨TPS(E)⟩⟩a®TPS(M′)), we have TPS(P) = a®TPS(M) ⊢_SL a®⟨b⟨TPS(E)⟩⟩TPS(M′) ⊢_SL ⟨b⟨TPS(E)⟩⟩a®TPS(M′).

Subcase (c): from M -(νc_1,…,c_n)a⟨E⟩-> M′ infer (νb)M -(νb,c_1,…,c_n)a⟨E⟩-> M′, provided a ≠ b and b ∈ fn(E) − {c_1,…,c_n}.
Since M -(νc_1,…,c_n)a⟨E⟩-> M′ and a ≠ b, b ∈ fn(E) − {c_1,…,c_n}, we have TPS(M) → ⟨a⟨c_1®…c_n®TPS(E)⟩⟩TPS(M′) and a ≠ b ∧ ∧_{i=1..n} b ≠ c_i ∧ (TPS(E) → ¬(⊛b)⊤). By the axiom (a ≠ b ∧ ∧_{i=1..n} b ≠ c_i ∧ (TPS(E) → ¬(⊛b)⊤) ∧ ((⊛)TPS(E) ↔ TPS(E))) → (b®⟨a⟨c_1®…c_n®TPS(E)⟩⟩TPS(M′) → ⟨a⟨b®c_1®…c_n®TPS(E)⟩⟩TPS(M′)), we have TPS(P) = b®TPS(M) ⊢_SL b®⟨a⟨c_1®…c_n®TPS(E)⟩⟩TPS(M′) ⊢_SL ⟨a⟨b®c_1®…c_n®TPS(E)⟩⟩TPS(M′).


Appendix E. Proof of Proposition 11

Proposition 11. For any A ∈ WL, TWM(A) ∈ μSL; for any P ∈ Pr, P ⊨_μSL TWM(A) ⇔ P ⊨_WL A.

Proof: We only discuss the case A = ⟨⟨a⟨A_1⟩⟩⟩A_2; the other cases are similar.

Suppose P ⊨_μSL TWM(A). Since [[μX.C(X)]]^e_Pr = ∪_i [[C^i(⊥)]]^e_Pr, if P ∈ [[μX.C(X)]]^e_Pr then P ∈ [[C^i(⊥)]]^e_Pr for some i. Let B = ⟨a⟨TWM(A_1)⟩⟩(μY.(TWM(A_2) ∨ ⟨τ⟩Y)). Then P ⊨_μSL B ∨ ⟨τ⟩B ∨ ⟨τ⟩⟨τ⟩B ∨ … ∨ ⟨τ⟩^i B, where ⟨τ⟩^{i+1}B denotes ⟨τ⟩(⟨τ⟩^i B) and ⟨τ⟩^0 B is B. Hence P =ε=> Q with Q ∈ [[⟨a⟨TWM(A_1)⟩⟩(μY.(TWM(A_2) ∨ ⟨τ⟩Y))]]^e_Pr. Hence Q -a⟨E⟩-> Q′ with E ∈ [[TWM(A_1)]]^e_Pr and Q′ ∈ [[μY.(TWM(A_2) ∨ ⟨τ⟩Y)]]^e_Pr. By a similar argument, Q′ =ε=> Q″ with Q″ ∈ [[TWM(A_2)]]^e_Pr. Hence P =a⟨E⟩=> Q″ with E ∈ [[TWM(A_1)]]^e_Pr and Q″ ∈ [[TWM(A_2)]]^e_Pr, so P ⊨_WL A. The converse direction is similar.

Appendix F. Proof of Proposition 12

Proposition 12. ⊢_μSL A|!A ↔ !A.

Proof: By the inference system, ⊢_μSL S(μX.S(X)) → μX.S(X), hence ¬μX.S(X) → ¬S(μX.S(X)). Let S(X) = ¬(A|¬X). Then ¬μX.S(X) = ¬μX.¬(A|¬X) = !A and ¬S(μX.S(X)) = A|¬μX.¬(A|¬X) = A|!A. Therefore we get ⊢_μSL !A → A|!A.

Since ⊢_μSL !A → A|!A, we have ⊢_μSL ¬(A|A|!A) → ¬(A|!A). Let T(X) = ¬(A|¬X). Then T(¬(A|!A)) = ¬(A|A|!A). Since ⊢_μSL T(¬(A|!A)) → ¬(A|!A), by the inference system we have ⊢_μSL μX.T(X) → ¬(A|!A). Furthermore, μX.T(X) = μX.¬(A|¬X) = ¬!A, hence ⊢_μSL ¬!A → ¬(A|!A), i.e. ⊢_μSL A|!A → !A.


HipSpec: Automating Inductive Proofs of Program Properties

Koen Claessen, Moa Johansson, Dan Rosén*, Nicholas Smallbone

Department of Computer Science and Engineering, Chalmers University of Technology
{koen,moa.johansson,nicsma}@chalmers.se, [email protected]

Abstract

We present ongoing work on HipSpec, a system for automatically deriving and proving properties about functional programs. HipSpec uses a combination of theory formation, counterexample testing and inductive theorem proving to automatically generate a set of equational theorems about the recursive functions in a program, which are later used as a background theory for proving stated properties about the program. Initial experiments are encouraging: our HipSpec prototype already compares favorably to other, similar systems.

1 Introduction

We are studying the problem of automatically proving algebraic properties of programs. Our aim is to build a tool that programmers can use while they are developing software and that supports them in the process. This paper describes our current progress in this direction. In particular, we have worked on the problem of automating induction.

The context we are working in is strongly typed purely functional programming, notably using the programming language Haskell. There are two key advantages to using Haskell in this context: (1) pure functional programming is semantically simpler, and thus easier to reason about, than programming languages with side effects; (2) many Haskell programmers already use QuickCheck [5], a tool for property-based random testing, which means that many Haskell programs already have formal properties stated about them (albeit unproved, but tested).

The main obstacle one encounters when doing automatic verification is deciding where and how induction needs to be applied.

Let us look at a simple example. Consider the following Haskell program implementing the list reverse function in two different ways. The first is called reverse:

reverse [] = []
reverse (x:xs) = reverse xs ++ [x]

The second is called qreverse, and uses an accumulating parameter which leads to a function with better complexity:

revacc [] acc = acc
revacc (x:xs) acc = revacc xs (x:acc)

qreverse xs = revacc xs []

A natural property one would like to state and verify of the above program is that reverse and qreverse implement the same function: ∀ xs . reverse xs = qreverse xs.
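Before attempting a proof, the property can at least be tested by evaluation. The following self-contained sketch transcribes the two implementations under the invented names rev and qrev (to avoid clashing with the Prelude's reverse) and checks the property on a few sample lists:

```haskell
-- Transcription of the paper's two reverse implementations,
-- renamed to avoid the Prelude.
rev :: [a] -> [a]
rev []     = []
rev (x:xs) = rev xs ++ [x]

revacc :: [a] -> [a] -> [a]
revacc []     acc = acc
revacc (x:xs) acc = revacc xs (x:acc)

qrev :: [a] -> [a]
qrev xs = revacc xs []

main :: IO ()
main = print (all (\xs -> rev xs == qrev xs) inputs)  -- prints True
  where inputs = [[], [1], [1,2,3], [5,4,3,2,1]] :: [[Int]]
```

In the real workflow this role is played by QuickCheck, which tests the property on random inputs rather than a fixed list.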

In previous work, we developed a tool called Hip, the Haskell Inductive Prover [15]. Hip reads in a Haskell program and a property, and tries to prove the property by various induction

*Corresponding Author


principles. For each induction principle, Hip generates a set of first-order proof obligations which are sent to an automatic first-order prover. Thus, Hip takes care of the inductive reasoning but delegates the pure first-order reasoning to an automatic prover. The version of Hip used in this paper only applies structural induction, so for the reverse property it will try induction on xs. This will fail: the induction hypothesis reverse as = qreverse as is too weak to prove reverse (a:as) = qreverse (a:as).

What is needed is one or more extra lemmas that make the proof go through, and can themselves be proven by induction. The challenge of automatic induction is to come up with such lemmas automatically. Most automated induction provers take a top-down approach: they start by trying to prove the inductive step, somehow "get stuck", and use the stuck state to come up with a more general lemma that might make the proof of the inductive step go through.

Our approach differs in that we work bottom-up. Our tool, called HipSpec, reads in a program and first, before even looking at the properties, makes a list of conjectures about the program. Each conjecture is then submitted to Hip to try to find an automatic induction proof. As soon as a conjecture is proved, it is given as a lemma to Hip, so that it may be used in proving further conjectures. When as many conjectures as possible have been proved, Hip tries to prove the actual properties stated by the programmer, using all the lemmas it has now proved.

How do we come up with the conjectures? We use another tool we developed in previous work, called QuickSpec [6]. QuickSpec generates conjectures in the form of algebraic equations about the functions in a given program. It uses random testing, so the equations are sometimes false. However, we have a kind of completeness: QuickSpec will discover all true equations up to a given term depth. This completeness gives a good chance of finding useful lemmas.

For our example, QuickSpec takes about 3 seconds to run. HipSpec feeds QuickSpec's equations into Hip, which proves them. Many of the equations are proved without induction, and thus cannot be needed as lemmas; the equations that needed induction are shown below:

No   Conjecture                                    Lemmas used
(1)  xs = xs++[]
(2)  xs++(ys++zs) = (xs++ys)++zs
(3)  revacc xs (ys++zs) = revacc xs ys++zs
(4)  revacc xs ys = reverse xs++ys                 (1), (3)
(5)  revacc ys xs++zs = revacc (revacc xs ys) zs   (3)

Now Hip attempts to prove the original property, assuming the lemmas above, and easily succeeds: the property follows directly from (4) and the definition of qreverse, and no induction is needed. Note that equation (5) was proved but was never needed for proving the original property. Proving unnecessary things is an obvious risk of our approach.

Contributions. We augment the automated induction landscape with a new method which uses a bottom-up theory exploration approach to find auxiliary lemmas. Our bottom-up approach combines our own earlier work on conjecture generation based on testing (QuickSpec) and induction principle enumeration (Hip). Our hypothesis is that algebraic equations constructed from terms up to a certain depth form a rich enough background theory for proving many algebraic properties about programs, and that a reasoning system for functional programs can be built on top of an automatic first-order theorem prover. The experimental results in this paper have so far confirmed this.


2 HipSpec

As we saw in the introduction, HipSpec is a combination of Hip and QuickSpec. To understand how to combine the tools, we first need to see how they work in more detail.

Hip is an automatic tool for proving user-stated equality properties about Haskell programs. This is done by compiling the program to first-order logic. Hip applies induction rules to the conjectures, yielding first-order proof obligations, which are proved using an automated first-order theorem prover. Thus, the first-order prover takes care of non-inductive reasoning, while Hip adds inductive reasoning at the meta-level.

The focus of our work is currently not on proving termination, so we restrict ourselves to well-founded definitions and, for now, put the responsibility for enforcing this policy on the end user. This has the upside that quantified variables range only over total values, so any property which is proved by induction can be safely added as an axiom to the first-order theory. Whenever we prove a lemma by induction we therefore add it to the first-order theory.

QuickSpec conjectures equations about a functional program by testing. The user of QuickSpec provides a list of functions and their types, a random test data generator for each of the types involved, a set of variables, and a depth limit. QuickSpec generates the set of all type-correct terms whose depth is less than the limit, built from the functions and variables given, which we call the universe. It then partitions the universe into equivalence classes by random testing: two terms will be in the same equivalence class unless a test case distinguishes them.
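The partitioning step can be sketched in a few lines: represent each term by its evaluation function, run every term on the same test inputs, and group terms whose result vectors coincide. The toy version below is an illustration only (the term universe, the fixed test cases standing in for QuickSpec's random data, and all names are invented here, and real QuickSpec works on syntactic terms of many types):

```haskell
import Data.List (groupBy, sortOn)
import Data.Function (on)

-- A toy universe of terms over one list variable xs, each paired
-- with its evaluation function.
terms :: [(String, [Int] -> [Int])]
terms =
  [ ("xs",                   id)
  , ("xs ++ []",             (++ []))
  , ("reverse xs",           reverse)
  , ("reverse (reverse xs)", reverse . reverse)
  ]

-- Fixed test cases standing in for random test data.
testCases :: [[Int]]
testCases = [[], [1], [2,1], [1,2,3]]

-- Two terms land in the same class iff no test case distinguishes them.
equivalenceClasses :: [[String]]
equivalenceClasses =
  map (map fst)
  . groupBy ((==) `on` snd)
  . sortOn snd
  $ [ (name, map f testCases) | (name, f) <- terms ]

main :: IO ()
main = mapM_ print equivalenceClasses
```

Here xs, xs ++ [] and reverse (reverse xs) end up in one class (no test distinguishes them), while reverse xs forms a class of its own; from such classes the candidate equations are read off.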

From this equivalence relation we can generate a vast set of equations about the tested program, which completely characterises the universe of terms. However, there will be very many equations, most of them redundant. Therefore, QuickSpec also includes a pruning phase that removes redundant equations, leaving behind a minimal core from which the rest of the equations follow; this minimal core is presented to the user.

In HipSpec, we exploit QuickSpec in several ways: first, we use the list of equations that QuickSpec produces. Second, we use the huge list of equations that exists before pruning. Finally, QuickSpec's pruning algorithm involves a lightweight, fast equational theorem prover, based on congruence closure, which we also use.

HipSpec. HipSpec's operation is illustrated in Figure 1.

[Figure 1: An overview of HipSpec. The Haskell source is fed both to QuickSpec, producing conjectures, and to Hip's translation, producing a first-order theory. Open conjectures go through induction (Hip) to a theorem prover; proved theorems extend the theory, while failed attempts time out and the conjecture stays open.]

We start by running QuickSpec on the source file, which generates a list of conjectures. Furthermore, we translate the source to a first-order theory using Hip.


HipSpec maintains two sets of equations: conjectures, which we suspect are true but have not proved, and lemmas, which we have proved. The first-order theory in Figure 1 consists of Hip's translation of our program plus the current set of lemmas. Initially, the conjectures consist of all equations that QuickSpec found, even those that would be removed by pruning. Our main loop is as follows:

1. Pick a conjecture from the conjecture set.

2. Ask Hip to prove the conjecture by induction, giving the lemmas as background theory.

3. If Hip succeeds within a timeout, and the proof used induction, move the conjecture to the lemma set.

The loop ends either when all conjectures have been proved, or when all conjectures have been tried and failed since the last time the lemma set changed.
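In outline, this loop can be sketched as follows. The names below are invented for illustration; tryProve stands in for the call to Hip plus the first-order prover, and this simplified version keeps every proved conjecture as a lemma, whereas the real tool only keeps those whose proofs used induction:

```haskell
-- Schematic sketch of HipSpec's main loop (all names invented here).
type Conjecture = String
type Lemma      = String

mainLoop :: ([Lemma] -> Conjecture -> Bool) -> [Conjecture] -> [Lemma]
mainLoop tryProve = go []
  where
    go lemmas conjectures =
      case break (tryProve lemmas) conjectures of
        (_, [])          -> lemmas                  -- no progress: stop
        (failed, c:rest) -> go (lemmas ++ [c]) (failed ++ rest)

-- Toy prover: "A" needs no lemmas, "B" needs "A", "C" needs "B".
toyProver :: [Lemma] -> Conjecture -> Bool
toyProver _      "A" = True
toyProver lemmas "B" = "A" `elem` lemmas
toyProver lemmas "C" = "B" `elem` lemmas
toyProver _      _   = False

main :: IO ()
main = print (mainLoop toyProver ["B", "A", "C"])  -- prints ["A","B","C"]
```

The example run shows the essential behaviour: "B" fails at first, "A" is proved, and the newly available lemma then lets both "B" and "C" go through on later passes.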

Picking a conjecture. The performance of HipSpec depends entirely on one heuristic: which conjecture to try to prove next. Our heuristics are quite preliminary. We define the desirability of an equation by the following total order:

• An equation that is in QuickSpec's final list of pruned equations is more desirable than an equation that is not.

• Otherwise, a simpler equation is more desirable than a complicated equation, using a simplicity metric that is built into QuickSpec.

We maintain a set of failed conjectures, ones that we have tried and failed to prove, and a set of active conjectures, that we will try to prove. Our basic heuristic is as follows:

• Always pick the most desirable conjecture in the active set.

• If the active set is empty, move all failed conjectures to the active set. (The loop above will terminate when this process is no longer productive.)

This definition roughly means that we first try to prove the set of equations that QuickSpec would normally present to the user, then all equations that QuickSpec discovers, including redundant ones, in order of simplicity.

Discarding subsumptions. It is quite expensive to send every conjecture to Hip to be proved, when we may have thousands of them. Luckily, as we saw earlier, QuickSpec has a lightweight "theorem prover" based on congruence closure. This prover can efficiently answer questions of the form "given these lemmas, can I prove this equation?", replying either "yes" or "don't know". Adding new lemmas is somewhat expensive, but we can do this incrementally as we prove them.

Whenever we pick a conjecture, we check whether this prover can prove it from the current lemmas. If so, we simply discard it. This filters out most conjectures that are provable without induction.

Candidates. The basic heuristic above has the problem that, if we fail to prove a conjecture, we never retry it until we have tried all the active conjectures. In the meantime, we may have proved a lemma that would allow us to prove the failed conjecture, and the failed conjecture may be useful as a lemma in its own right.

For example, when dealing with natural numbers, the law x*y = y*x is one of the first we will try to prove. We will probably fail, but this law is useful, and we want to prove it as soon as we have the necessary lemmas.

One sign that we may be able to prove x*y = y*x is if we start to prove consequences of that law, such as x*S y = y*S x. In this case we want to come back to x*y = y*x. Another consequence of x*y = y*x is x+(x*y) = x*S y, which is the very lemma we need to prove the original x*y = y*x!


We conjecture that proving a consequence of an equation F is a good reason for us to try to prove F. Accordingly, we define the following heuristic. If we have just proved an equation E in background theory Γ, and we have a failed equation F such that Γ, F ⊢ E, then we move F to the active set and try to prove it next. In the example above, F is x*y = y*x, and E is x*S y = y*S x. The heuristic is somewhat unreliable, but can be implemented efficiently using QuickSpec's built-in equational reasoning as mentioned above.

Discarding renamings. QuickSpec unfortunately produces many renamings of the same equation, e.g., both xs ++ [] = xs and ys ++ [] = ys. Normally, the pruning algorithm takes care of removing these duplicates, but we have to do this ourselves in HipSpec. Whenever we fail to prove a conjecture, we remove any conjecture that is simply a renaming of it.

3 Examples

Using Peano arithmetic, with the standard definitions of addition and multiplication by recursion on the first argument, we will try to get HipSpec to prove Nicomachus' Theorem. It states that the sum of the first n cubes is the nth triangle number squared:

    ∑_{k=1}^{n} k³ = (∑_{k=1}^{n} k)²

The tri function below calculates the triangle numbers, and cubes the sum of cubes:

tri Z = Z
tri (S n) = tri n + S n

cubes Z = Z
cubes (S n) = cubes n + (S n * S n * S n)

Using these definitions, Nicomachus' theorem is stated as follows:

prop_Nicomachus x = cubes x =:= tri x*tri x
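The statement is easy to check numerically before proving it. The sketch below fills in a Peano Nat type and conversions, which the paper does not show, and writes the infix + and * as plus and times; it is an assumed transcription, not the paper's code:

```haskell
-- Assumed Peano definitions; addition and multiplication recurse
-- on the first argument, as in the paper.
data Nat = Z | S Nat

plus :: Nat -> Nat -> Nat
plus Z     y = y
plus (S x) y = S (plus x y)

times :: Nat -> Nat -> Nat
times Z     _ = Z
times (S x) y = plus y (times x y)

tri, cubes :: Nat -> Nat
tri Z     = Z
tri (S n) = plus (tri n) (S n)
cubes Z     = Z
cubes (S n) = plus (cubes n) (times (S n) (times (S n) (S n)))

toInt :: Nat -> Int
toInt Z     = 0
toInt (S n) = 1 + toInt n

fromInt :: Int -> Nat
fromInt 0 = Z
fromInt n = S (fromInt (n - 1))

main :: IO ()
main = print (all check [0 .. 8])  -- prints True
  where
    check n = toInt (cubes (fromInt n))
           == toInt (times (tri (fromInt n)) (tri (fromInt n)))
```

This only tests finitely many instances, of course; establishing the universally quantified property is exactly the job HipSpec does by induction.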

When HipSpec is given plus, multiplication, tri and cubes, it generates and proves the properties listed in Figure 2 below. The properties are listed in the order they were proved.

No    Conjecture                  Lemmas used
(1)   x = x+Z
(2)   Z = x*Z
(3)   S (x+y) = x+S y
(4)   x+y = y+x                   (1), (3)
(5)   x+(y+z) = y+(x+z)           (4)
(6)   x*(y+z) = (x*z)+(x*y)       (4), (5)
(7)   x*(y+y) = (x+x)*y           (4), (5)
(8)   x = x*S Z
(9)   x+(x*y) = x*S y             (1), (3), (5)
(10)  x*y = y*x                   (2), (9)
(11)  x*(y*z) = z*(x*y)           (4), (6), (10)
(12)  x+(x*x) = tri x+tri x       (3), (4), (5), (8), (9)
(13)  cubes x = tri x*tri x       (12) and more

Figure 2: Properties proved about the theory with natural number addition, multiplication, triangle numbers (tri) and sum of cubes (cubes). Z represents zero and S successor.


HipSpec first proves that the naturals form a commutative semiring. Only induction on one variable was used in this test. This means that, for instance, to prove (4), commutativity of addition, lemmas (1) and (3) are required. Some superfluous lemmas are proved, for example (7), which is never used and is later superseded by the stronger (9). In (12), the well-known identity ∑_{k=1}^{n} k = n(n+1)/2 is proved. From this lemma HipSpec proves Nicomachus' Theorem.

3.1 Binary Arithmetic

Peano arithmetic is not very realistic. A better way to implement the naturals is as a list of bits:

data Bits = Zero | ZeroAnd Bits | OneAnd Bits

Here, the constructor ZeroAnd n represents the number 2n, while OneAnd n represents 2n+1. Thus 6 can be represented as ZeroAnd (OneAnd (OneAnd Zero)). Arithmetic operations can be defined in the usual messy way:

succ Zero        = OneAnd Zero
succ (ZeroAnd x) = OneAnd x
succ (OneAnd x)  = ZeroAnd (succ x)

plus Zero xs = xs
plus xs Zero = xs
plus (ZeroAnd xs) (ZeroAnd ys) = ZeroAnd (plus xs ys)
plus (ZeroAnd xs) (OneAnd ys)  = OneAnd (plus xs ys)
plus (OneAnd xs)  (ZeroAnd ys) = OneAnd (plus xs ys)
plus (OneAnd xs)  (OneAnd ys)  = ZeroAnd (succ (plus xs ys))

We can specify succ and plus by relating them to the corresponding functions on Peano numbers by means of a function toNat :: Bits -> Nat:

toNat Zero        = Z
toNat (ZeroAnd x) = toNat x + toNat x
toNat (OneAnd x)  = S (toNat x + toNat x)

prop_succ n   = toNat (succ n) =:= S (toNat n)
prop_plus m n = toNat (plus m n) =:= toNat m + toNat n
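These definitions run as ordinary Haskell. In the sketch below (ours, not part of the paper's sources) succ and plus are renamed succB and plusB to avoid clashing with the Prelude, and the specification is checked at concrete points:

```haskell
-- Executable sketch of the binary representation and its specification;
-- helper names (succB, plusB, fromIntB, natToInt) are our own.
data Nat = Z | S Nat deriving (Eq, Show)

natPlus :: Nat -> Nat -> Nat
natPlus Z     y = y
natPlus (S x) y = S (natPlus x y)

data Bits = Zero | ZeroAnd Bits | OneAnd Bits

succB :: Bits -> Bits
succB Zero        = OneAnd Zero
succB (ZeroAnd x) = OneAnd x
succB (OneAnd x)  = ZeroAnd (succB x)

plusB :: Bits -> Bits -> Bits
plusB Zero ys = ys
plusB xs Zero = xs
plusB (ZeroAnd xs) (ZeroAnd ys) = ZeroAnd (plusB xs ys)
plusB (ZeroAnd xs) (OneAnd ys)  = OneAnd (plusB xs ys)
plusB (OneAnd xs)  (ZeroAnd ys) = OneAnd (plusB xs ys)
plusB (OneAnd xs)  (OneAnd ys)  = ZeroAnd (succB (plusB xs ys))

toNat :: Bits -> Nat
toNat Zero        = Z
toNat (ZeroAnd x) = natPlus (toNat x) (toNat x)
toNat (OneAnd x)  = S (natPlus (toNat x) (toNat x))

fromIntB :: Int -> Bits
fromIntB 0 = Zero
fromIntB n = succB (fromIntB (n - 1))

natToInt :: Nat -> Int
natToInt Z     = 0
natToInt (S n) = 1 + natToInt n

-- prop_plus at a single test point
propPlusAt :: Int -> Int -> Bool
propPlusAt a b =
  natToInt (toNat (plusB (fromIntB a) (fromIntB b))) == a + b
```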

HipSpec is able to prove both prop_succ and prop_plus, along with the lemmas about Peano arithmetic from the last section, and these generated lemmas:

Conjecture

plus xs ys = plus ys xs

s (plus xs ys) = plus xs (s ys)

S (toNat xs) = toNat (s xs)

ZeroAnd (plus xs xs) = plus (ZeroAnd xs) (plus xs xs)

plus xs (plus ys zs) = plus zs (plus xs ys)

toNat (plus xs ys) = toNat ys+toNat xs

toNat (ZeroAnd xs) = toNat (plus xs xs)

3.2 Lists with Efficient Append

Another representation of lists is as a type with three constructors: the empty list, a singleton list and the concatenation of two lists, with the invariant that a list has no internal Nil nodes.

data List a = Nil | Unit a | Append (List a) (List a)

We can efficiently concatenate such lists as well as convert them to normal lists:


Nil +++ xs  = xs
xs  +++ Nil = xs
xs  +++ ys  = Append xs ys

toList Nil            = []
toList (Unit x)       = [x]
toList (Append xs ys) = toList xs ++ toList ys

Let us implement a function to reverse such lists, followed by some properties:

reverseL Nil            = Nil
reverseL (Unit x)       = Unit x
reverseL (Append xs ys) = Append (reverseL ys) (reverseL xs)

prop_append xs ys = toList (xs +++ ys) =:= toList xs ++ toList ys
prop_reverse xs   = toList (reverseL xs) =:= reverse (toList xs)

HipSpec proves both properties. The first requires the lemma xs ++ [] = xs, and the second requires reverse (xs ++ ys) = reverse ys ++ reverse xs, which are automatically generated and proved as before.
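The append-tree representation also runs as plain Haskell; the sketch below (our own test harness, not the paper's sources) checks both properties at concrete values:

```haskell
-- Executable sketch of the append-tree lists; propReverse is our own
-- point-wise check of prop_reverse.
data List a = Nil | Unit a | Append (List a) (List a)

(+++) :: List a -> List a -> List a
Nil +++ ys  = ys
xs  +++ Nil = xs
xs  +++ ys  = Append xs ys

toList :: List a -> [a]
toList Nil            = []
toList (Unit x)       = [x]
toList (Append xs ys) = toList xs ++ toList ys

reverseL :: List a -> List a
reverseL Nil            = Nil
reverseL (Unit x)       = Unit x
reverseL (Append xs ys) = Append (reverseL ys) (reverseL xs)

-- prop_reverse at a single test point
propReverse :: List Int -> Bool
propReverse xs = toList (reverseL xs) == reverse (toList xs)
```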

4 Evaluation

We have evaluated HipSpec on two test suites from the inductive theorem proving literature. The results of our evaluation are available online¹. For fairness, we did not allow Hip to use equations from the test suite as lemmas when proving other properties: we only use the lemmas that are automatically generated by HipSpec.

As the test suites contain many unrelated functions, we split them up by hand into independent parts. This was quite mechanical and we hope to automate it in the future.

The runtimes for the various parts range from under a minute to about half an hour.

Test Suite A This consists of 50 theorems about lists and natural numbers [8], and was used to evaluate techniques for discovering auxiliary lemmas and generalisations in the now-defunct CLAM prover. There are no function definitions provided with the test suite so we have implemented them in Haskell in a standard fashion. Of the 50 properties, 38 are equational and so within the scope of HipSpec. Of these, HipSpec proves 36. The two that we cannot prove are:

No Conjecture

T14 ordered (isort xs) = True

T50 count x (isort xs) = count x xs

Both require conditional lemmas, which we cannot yet discover: T14 requires ordered xs → ordered (insert x xs), and T50 needs x /= y → count x (insert y ys) = count x ys.

Test Suite B This consists of 85 theorems about lists, natural numbers and binary trees [9]. It originates from IsaPlanner and was later translated into Haskell² as the test suite for the Zeno prover [16]. 71 of the properties in the test suite are equational. Of these, HipSpec proves 67. However, only 15 require any lemmas at all: 52 are proved without theory exploration, and 10 are even proved without induction³. The four unproved properties are:

¹ http://web.student.chalmers.se/~danr/hipspec-evaluation/
² http://www.doc.ic.ac.uk/~ws506/tryzeno/comparison/
³ This is because these problems were originally designed only for evaluating rippling in IsaPlanner on problems involving if- and case-expressions, which are expressed as higher-order functions in IsaPlanner's logic.


No Conjecture

53 count n xs = count n (sort xs)

66 len (filter p xs) <= len xs = True

68 len (delete n xs) <= len xs = True

78 sorted (sort xs) = True

Because of the overlap about sorting in the two test suites, properties 53 and 78 above are the same as T50 and T14 respectively, from Test Suite A. Properties 66 and 68 in Test Suite B, unproved by HipSpec, need the conditional lemma expressing transitivity of <= to be proved.

Test Suite B has been evaluated on IsaPlanner⁴, Zeno, ACL2s [4] (by the authors of Zeno) and Dafny [12]. Dafny proves 45, IsaPlanner proves 47, ACL2s proves 74 and Zeno proves 82, failing on three properties. Two of these are equational, namely numbers 72 and 74:

No Conjecture

72 rev (drop i xs) = take (len xs - i) (rev xs)

74 rev (take i xs) = drop (len xs - i) (rev xs)

None of the other tools can prove the two properties above, but HipSpec can. Both properties have similar proofs; let us look at number 72. HipSpec discovers the five lemmas below:

No Conjecture

1 len (drop x xs) = len xs-x

2 len xs = len (rev xs)

3 xs = take x xs++drop x xs

4 rev (ys++xs) = rev xs++rev ys

5 xs = take (len xs) (xs++ys)

With these lemmas, the proof proceeds by equational reasoning as follows:

rev (drop x xs)
    = {5}  take (len (rev (drop x xs))) (rev (drop x xs) ++ rev (take x xs))
    = {2}  take (len (drop x xs)) (rev (drop x xs) ++ rev (take x xs))
    = {1}  take (len xs - x) (rev (drop x xs) ++ rev (take x xs))
    = {4}  take (len xs - x) (rev (take x xs ++ drop x xs))
    = {3}  take (len xs - x) (rev xs)
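Property 72 can be spot-checked directly on built-in Haskell lists, where len and rev correspond to length and reverse (prop72 is our own test harness, not part of HipSpec):

```haskell
-- Property 72 on ordinary Haskell lists.  Note that (length xs - i) may go
-- negative, where take simply returns [], matching reverse (drop i xs) = []
-- for i beyond the end of the list.
prop72 :: Int -> [Int] -> Bool
prop72 i xs = reverse (drop i xs) == take (length xs - i) (reverse xs)
```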

5 Related Work

Inductive theories are undecidable, and proofs are thus often performed interactively, for instance using proof assistants such as ACL2 [11] and Isabelle [14]. In particular, the discovery of auxiliary lemmas or generalisations has to be performed by a human user.

Proof-planning is an automated technique which exploits the fact that certain families of proofs share a similar structure [2]. For instance, all inductive proofs consist of one or more base cases followed by step cases. Rippling is a commonly used heuristic for guiding rewriting in the step case in proof-planning [3]. If some step in the proof-plan fails, proof-critics may be employed to analyse the failure and perhaps suggest a missing lemma or generalisation [8]. Proof critics for different kinds of failures of rippling were implemented in the now defunct CLAM proof-planner. Using these critics, CLAM succeeded in finding generalisations for proofs involving accumulator

⁴ http://dream.inf.ed.ac.uk/projects/lemmadiscovery/results/case-analysis-rippling.txt


variables, such as the example from §1: ∀ xs . reverse xs = qreverse xs. IsaPlanner is a current proof-planner built on top of the Isabelle proof assistant [7]. It supports rippling and critics for lemma discovery, but not for finding generalisations, and can thus not prove properties about functions defined using accumulator variables. The main difference between our work and the proof critics approach is that critics work top-down, trying to derive a missing lemma from a particular failed proof attempt. HipSpec, on the other hand, works bottom-up: first it tries theory exploration to construct lemmas about the available functions, then it tackles the user-defined properties of interest. While proof-critics are tied to react to a particular failure pattern of a particular proof-plan, theory exploration techniques can be used in a more flexible manner, in combination with various provers or proof techniques. However, theory exploration techniques may also come up with irrelevant lemmas, while the critics approach is more targeted towards a particular lemma suitable for the current proof attempt.

Proof assistants often have large libraries of previously proved lemmas for the user to build further upon. The theory exploration systems IsaCoSy and IsaScheme were developed for automating the creation of such lemma libraries for inductive theories in Isabelle [10, 13]. Both systems use IsaPlanner to prove conjectures which pass counter-example checking, but differ in the heuristics they use to generate conjectures. IsaCoSy and IsaScheme make one pass at generating a background theory; if some conjecture cannot be proved, it is left open. However, as IsaPlanner has a lemma calculation critic, simple lemmas can be derived on-the-fly. HipSpec, on the other hand, relies only on the lemmas it has previously discovered and proved. This allows failed proofs to be returned to later, after other theorems suitable as lemmas are proved.

Zeno is an automated theorem prover for a subset of Haskell [16]. It performs proofs by structural induction, rewriting and case analysis. Like IsaPlanner, it can discover simple lemmas by applying common subterm generalisation to subgoals to which no rewriting steps apply, followed by a new induction on the resulting conjecture.

Dafny is a program verifier with a simple automated induction tactic [12]. It uses heuristics for selecting properties which may require proof by induction. Like HipSpec, it then applies induction on the meta-level and passes the resulting proof obligations to the SMT solver Z3. Dafny does not support automated lemma discovery, so auxiliary lemmas must be supplied by the user.

6 Conclusion and Future Work

We have introduced HipSpec, a tool that uses a form of theory formation to build a background theory about a Haskell program, in order to automatically prove stated properties by induction. In the experimental evaluation, the tool compares favorably against existing similar systems.

The main difference with most existing methods is that we use a bottom-up approach when building the background theory. A possible advantage over top-down, syntactic methods is that our approach will not get stuck as easily. A possible drawback is that we might generate far too many irrelevant lemmas, and waste time proving them. Our initial experiments show that this is often not the case; it is relatively cheap to generate and prove the conjectures. Choosing appropriate algorithms and data structures to quickly discharge lemmas provable without induction plays a key role in this.

For future work, we will explore the design space of the order in which to try to prove conjectures. We should exploit syntactic information, such as the call graph. When proving a conjecture fails, we should postpone trying to prove similar conjectures. There is a world of heuristics here.

We plan to extend HipSpec to the full Haskell language, in particular by allowing non-terminating functions and exceptions. Both QuickSpec and Hip already deal with exceptions. Hip can also deal with non-terminating functions; what we have not completely worked out is


how to practically distinguish between lemmas that hold for all values of a datatype (including partial values), and lemmas that only hold for completely defined, total values.

The properties we deal with right now are universally quantified equalities between expressions. We would like to extend these to other kinds of properties, in particular conditional properties. The main difficulty is getting QuickSpec to come up with conditional equations automatically.

Hip supports several induction principles apart from structural induction, notably Plotkin's fixed point induction; we would like to support these in HipSpec.

Finally, for HipSpec to be usable by programmers, we need to give useful feedback when we cannot prove a property. This feedback should help the programmer to decide e.g. if a lemma should be added by hand or if QuickSpec should be given more function definitions. We plan to incorporate ideas from our earlier work on non-standard counter examples here [1].

References

[1] Jasmin Blanchette and Koen Claessen. Generating counterexamples for structural inductions by exploiting nonstandard models. In LPAR. Springer LNCS, 2010.

[2] Alan Bundy. The use of explicit plans to guide inductive proofs. In CADE, pages 111–120, 1988.

[3] Alan Bundy, David Basin, Dieter Hutter, and Andrew Ireland. Rippling: meta-level guidance for mathematical reasoning. Cambridge University Press, New York, NY, USA, 2005.

[4] Harsh Raju Chamarthi, Peter Dillinger, Panagiotis Manolios, and Daron Vroon. The ACL2 Sedan theorem proving system. In TACAS, pages 291–295, 2011.

[5] Koen Claessen and John Hughes. QuickCheck: a lightweight tool for random testing of Haskell programs. In ICFP, pages 268–279, 2000.

[6] Koen Claessen, Nicholas Smallbone, and John Hughes. QuickSpec: Guessing formal specifications using testing. In TAP, pages 6–21, 2010.

[7] L. Dixon and J. D. Fleuriot. Higher order rippling in IsaPlanner. In Theorem Proving in Higher Order Logics, volume 3223 of LNCS, pages 83–98, 2004.

[8] Andrew Ireland and Alan Bundy. Productive use of failure in inductive proof. Journal of Automated Reasoning, 16:16–1, 1995.

[9] Moa Johansson, Lucas Dixon, and Alan Bundy. Case-analysis for rippling and inductive proof. In ITP, pages 291–306, 2010.

[10] Moa Johansson, Lucas Dixon, and Alan Bundy. Conjecture synthesis for inductive theories. Journal of Automated Reasoning, 47(3):251–289, 2011.

[11] Matt Kaufmann, Panagiotis Manolios, and J Strother Moore. Computer-Aided Reasoning: An Approach. Kluwer Academic Publishers, 2000.

[12] K. Rustan Leino. Automating induction with an SMT solver. To appear at VMCAI 2012.

[13] Omar Montano-Rivas, Roy McCasland, Lucas Dixon, and Alan Bundy. Scheme-based theorem discovery and concept invention. Expert Systems with Applications, 39(2):1637–1646, February 2012.

[14] Tobias Nipkow, Lawrence C. Paulson, and Markus Wenzel. Isabelle/HOL — A Proof Assistant for Higher-Order Logic, volume 2283 of LNCS. Springer, 2002.

[15] Dan Rosen. Proving Equational Haskell Properties using Automated Theorem Provers. MSc. Thesis, University of Gothenburg, 2012.

[16] William Sonnex, Sophia Drossopoulou, and Susan Eisenbach. Zeno: A tool for the automatic verification of algebraic properties of functional programs. Technical report, Imperial College London, February 2011.


Synthesising Graphical Theories

Aleks Kissinger

Abstract

In recent years, diagrammatic languages have been shown to be a powerful and expressive tool for reasoning about physical, logical, and semantic processes represented as morphisms in a monoidal category. In particular, categorical quantum mechanics, or "Quantum Picturalism", aims to turn concrete features of quantum theory into abstract structural properties, expressed in the form of diagrammatic identities. One way we search for these properties is to start with a concrete model (e.g. a set of linear maps or finite relations) and start composing generators into diagrams and looking for graphical identities.

Naïvely, we could automate this procedure by enumerating all diagrams up to a given size and checking for equalities, but this is intractable in practice because it produces far too many equations. Luckily, many of these identities are not primitive, but rather derivable from simpler ones. In 2010, Johansson, Dixon, and Bundy developed a technique called conjecture synthesis for automatically generating conjectured term equations to feed into an inductive theorem prover. In this extended abstract, we adapt this technique to diagrammatic theories, expressed as graph rewrite systems, and demonstrate its application by synthesising a graphical theory for studying entangled quantum states.

1 Introduction

Monoidal categories can be thought of as theories of generalised (physical, logical, semantic, ...) processes. In particular, they provide an abstract setting for studying how processes (i.e. morphisms) behave when they are composed in time (via the categorical composition ∘) and in space (via the monoidal product ⊗). Categorical quantum mechanics, a program initiated by Abramsky and Coecke in 2004, takes this notion seriously by showing that many phenomena in quantum mechanics can be completely understood at the abstract level of monoidal categories carrying certain extra structure [1]. This notion has also had success in reasoning about stochastic networks [7], illustrative "toy theories" of physics [3], and even linguistics [6]. In most of these contexts, we make extensive use of spatial "swap" operations (symmetries) and temporal "feedback" operations (traces). Formally, this means we are working with a particular kind of monoidal category called a symmetric traced category.

Working with large compositions of morphisms in a symmetric traced category using traditional, term-based expressions quickly becomes unwieldy, as one attempts to describe an inherently two-dimensional structure using a (one-dimensional) term language. Diagrammatic languages thus provide a natural alternative.

String diagrams represent morphisms as "boxes" with various inputs and outputs, and compositions as "wires" connecting those boxes. They provide a way of visualising composition that is both highly intuitive and completely rigorous. They originated with Penrose in the 1970s as a way to describe contractions of (abstract) tensors [18]. These abstract tensor systems closely resembled traced monoidal categories, so perhaps unsurprisingly, Joyal and Street were able to use string diagrams in 1991 to construct free symmetric traced categories [13]. In the context of categorical quantum mechanics, complementary observables [2], quantum measurements [9], and entanglement [4] can be studied using string diagrams and diagrammatic identities.

Joyal and Street's formalisation of string diagrams uses topological graphs with some extra structure. While this method comes very close to the intuitive notion of a string diagram, it is not obvious how one could automate the creation and manipulation of these topological objects


in a computer program. For that reason, Dixon, Duncan, and Kissinger developed a discrete representation of string diagrams called string graphs. Unlike their continuous counterparts, it is straightforward to automate the manipulation of string graphs using graph rewrite systems. Using a graph rewrite system, we can draw conclusions about any concrete model of that system. Since the amount of data needed to represent a morphism concretely often grows exponentially in the number of inputs and outputs, we can gain potentially huge computational benefits by manipulating these morphisms using rewrite rules instead of concrete calculation.

We might also ask if this can be done in reverse. Consider a situation where we have a collection of generators as well as concrete realisations of those generators (e.g. as linear maps, finite relations, etc.) and we wish to discover the algebraic structure of these generators. One way to find new rules is to start enumerating all string graphs involving those generators, evaluating them concretely, and checking for identities. This process becomes intractable very quickly, as the number of distinct string graphs over a set of generators grows exponentially with size. However, in 2010, Johansson, Dixon, and Bundy provided a clever way of avoiding the enumeration of redundant rewrite rule candidates for a term rewrite theory, which they called conjecture synthesis [12]. This technique keeps a running collection of rewrite rules and only enumerates terms that are irreducible with respect to this collection of rules. This simple condition is surprisingly effective at curbing the exponential blow-up of the rewrite system synthesis procedure, and provides the basis for a piece of software developed by the authors called IsaCoSy [11] (for ISAbelle COnjecture SYnthesis).
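The redex-avoiding enumeration at the heart of conjecture synthesis can be sketched for a toy term language. The Haskell below is our own illustration, far simpler than IsaCoSy or QuantoCoSy: keep a set of rules, and enumerate only terms in which no rule's left-hand side matches any subterm.

```haskell
-- Toy conjecture-synthesis enumeration (our own sketch).  Patterns are
-- assumed linear, so matching ignores substitution consistency.
data Term = V Int | App String [Term] deriving (Eq, Show)

-- a rewrite rule: (left-hand side pattern, right-hand side)
type Rule = (Term, Term)

-- match a pattern against a term, returning the bindings on success
match :: Term -> Term -> Maybe [(Int, Term)]
match (V i) t = Just [(i, t)]
match (App f ps) (App g ts)
  | f == g && length ps == length ts =
      concat <$> sequence (zipWith match ps ts)
match _ _ = Nothing

subterms :: Term -> [Term]
subterms t@(V _)      = [t]
subterms t@(App _ ts) = t : concatMap subterms ts

-- a term is a redex if some rule's LHS matches some subterm
reducible :: [Rule] -> Term -> Bool
reducible rules t =
  or [ maybe False (const True) (match lhs s)
     | (lhs, _) <- rules, s <- subterms t ]

-- enumerate terms over a signature of (name, arity) pairs up to a depth,
-- skipping anything reducible w.r.t. the current rules
enumerate :: [(String, Int)] -> [Rule] -> Int -> [Term]
enumerate sig rules depth = filter (not . reducible rules) (go depth)
  where
    go 0 = [V 0]
    go d = V 0 : [ App f args
                 | (f, n) <- sig
                 , args <- sequence (replicate n (go (d - 1))) ]
```

With the rule plus(zero, x) → x in place, the enumeration never emits plus(zero, x) or any term containing it, while still producing terms such as plus(x, zero) as candidate conjectures.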

The main contribution of this paper is an algorithm that generates new graphical theories using an adapted conjecture synthesis procedure. This adapted procedure has been implemented in a tool called QuantoCoSy, built on the string graph rewriting platform Quantomatic [15]. We provide an example of applying QuantoCoSy to generate a graphical theory used in the study of quantum entanglement. We use as a basis the string graph (aka open-graph) formalism developed in [8].

After briefly reviewing string graphs and string graph rewriting in section 2, we show in section 3 how, given a valuation of the generators of a string graph, the graph itself can be evaluated as a linear map using tensor contraction. The graphical conjecture synthesis procedure is then described in section 4. In section 5, we look at a test case where our theory synthesis software, QuantoCoSy, is used to synthesise a simple graphical theory of entangled states.

2 Background: String Graphs

In this section, we summarise the construction of the category SGraph_T of string graphs parametrised by a monoidal signature. Details of how this can be done formally in the context of adhesive categories can be found in [8]. Our goal in defining string graphs is to represent diagrammatic theories using graph rewrite systems. To do this, we need to re-cast (topological) string diagrams as typed graphs. We do this by replacing wires with chains of special vertices called wire-vertices. We also introduce node-vertices, which represent "logical" nodes (aka boxes) in the diagram.

[Diagram (1): a string diagram with boxes f, g and h and wires typed A–F, together with its translation into a string graph in which each wire becomes a chain of wire-vertices.]


A (small, strict) monoidal signature T defines a set of generators that can be interpreted in a monoidal category. Let w(X) be the set of lists over a set X. T consists of a set M of morphisms, a set O of objects, and functions dom, cod : M → w(O) that assign input and output types for each morphism. Note in particular that T says nothing about composition.

Definition 2.1. The category SGraph_T has as objects string graphs with node-vertex and wire-vertex types given by T, and as arrows graph homomorphisms respecting these types.

Henceforth, we will refer to chains of wire-vertices simply as wires, when there is no ambiguity. There are a few things to note here. Firstly, wire-vertices carry the (object) type of a wire, so every connection between two node-vertices must contain at least one wire-vertex. We do not allow wires to split or merge, so wire-vertices must have at most one input and one output. When our generators are non-commutative, we distinguish inputs and outputs using edge types (in_1, in_2, out_1, out_2, ...).

This category can be defined as a full subcategory of the slice category Graph/G_T, where G_T is called the derived typegraph of a monoidal signature T. Furthermore, SGraph_T is a partial adhesive category, as defined in [8]. Such a category provides the mechanisms we need to perform graph matching and rewriting in the following sections. For details, see [8].

Definitions 2.2. For a string graph G, let ω(G) be the set of wire-vertices and η(G) the set of node-vertices. If a wire-vertex has no in-edges, it is called an input. We write the set of inputs of a string graph G as In(G). Similarly, a wire-vertex with no out-edges is called an output, and the set of outputs is written Out(G). The inputs and outputs define a string graph's boundary, written Bound(G). If a boundary vertex has no in-edges and no out-edges (it is both an input and an output), it is called an isolated wire-vertex.

2.1 Plugging and Rewriting for String Graphs

There are two important, related operations we wish to apply to string graphs. The first is plugging, where input and output wire-vertices are glued together, and the second is string graph rewriting, where parts of the string graph are cut out and replaced with other graphs.

Definition 2.3. For a string graph G and wire-vertices x ∈ In(G) and y ∈ Out(G), let G/(x,y) be a new string graph with the wire-vertices x and y identified. This is called the plugging of (x, y) in G, and the new vertex (x ∼ y) ∈ G/(x,y) is called the plugged wire-vertex.

We can compose string graphs by performing pluggings on the disjoint union: (G + H)/(x,y) for x ∈ Out(G), y ∈ In(H). We can plug many wires together by performing many pluggings one after another. Equivalently, we can perform a pushout along a discrete graph, respecting certain conditions. When there can be no confusion, we also call this a plugging.

Definition 2.4. A pushout of a span G ←p− K −q→ H is called a plugging if K is a disconnected graph of wire-vertices, p and q are mono, and for all k ∈ K, p(k) ∈ Bound(G), q(k) ∈ Bound(H) and p(k) ∈ In(G) ⇔ q(k) ∈ Out(H).

We perform rewrites on string graphs using the double pushout (DPO) graph rewriting technique. We first define a string graph rewrite rule as a pair of string graphs sharing a boundary. It will be convenient to formalise this as a particular kind of span.

Definition 2.5. A string graph rewrite rule L → R is a span of monomorphisms L ←b1− B −b2→ R such that b1(B) = Bound(L), b2(B) = Bound(R), and for all b ∈ B, b1(b) ∈ In(L) ⇔ b2(b) ∈ In(R).


To perform rewriting, we first introduce a notion of matchings. These are special monomorphisms that only connect to the rest of the graph via their boundary.

Definition 2.6. A matching of a string graph L on another string graph G is a monomorphism m : L → G such that whenever an edge not in the image of m is adjacent to m(v) for some vertex v in L, v must be in the boundary Bound(L) of L. When such a matching exists, we say L (or any rule with L as its LHS) matches G.
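The side-condition in Definition 2.6 is easy to check mechanically. The following Haskell sketch uses our own naive encoding (vertices as Ints, directed edges as pairs, a mono given by its image) and is only meant to illustrate the condition, not Quantomatic's implementation:

```haskell
-- Toy check of the matching side-condition: every edge of G outside the
-- image that touches an image vertex must touch it at a boundary vertex.
type Edge = (Int, Int)

satisfiesMatchingCondition
  :: [Edge]  -- edges of G
  -> [Edge]  -- m applied to the edges of L (the image edges)
  -> [Int]   -- m applied to the vertices of L (the image vertices)
  -> [Int]   -- m applied to Bound(L)
  -> Bool
satisfiesMatchingCondition gEdges imEdges imVerts imBound =
  and [ v `elem` imBound
      | e@(a, b) <- gEdges     -- every edge of G ...
      , e `notElem` imEdges    -- ... outside the image ...
      , v <- [a, b]            -- ... whose endpoint ...
      , v `elem` imVerts ]     -- ... lies in the image must be boundary
```

For instance, embedding an edge 1→2 into the path 1→2→3 is a matching only if vertex 2, where the extra edge attaches, is (the image of) a boundary vertex.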

Note that, for the class of rewrite rules considered (i.e. those of the form of Definition 2.5), this corresponds to the notion of matching defined by Ehrig et al [19]. Once a matching is found, the occurrence of L in G can be replaced with R. However, some care must be taken to ensure that R is reconnected to the remainder of G in the same location as L was previously. We begin by removing the interior of L in G, i.e. the part of L that is not in the boundary. We do this by finding a context graph G′ such that when L and G′ are plugged together along the boundary of L, the total graph G is obtained. This is known as computing the pushout complement of B −b1→ L −m→ G. Since G is the result of gluing L and G′ together along B, we can replace L with R by performing a second plugging, this time of R and G′ along B.

[Diagram (2): the double-pushout diagram, with the rule span L ← B → R along the top, the span G ← G′ → H along the bottom, and vertical morphisms m and m′.]

In other words, we express G as a plugging of two graphs L and G′, then compute H by plugging R into G′. This completes the rewrite of G into H. Note that even in a category with all pushouts, pushout complements need not exist or be unique. However, the following theorem, proved in [8], shows that DPO rewriting is always well-defined for string graph matchings and rewrite rules.

Theorem 2.7. Let L ←b1− B −b2→ R be a boundary span and m : L → G a matching. Then B −b1→ L −m→ G has a unique pushout complement G′, and both of the pushout squares in diagram (2) exist and are preserved by the embedding of SGraph_T into Graph/G_T.

We will also find the following lemma useful when combining matching and string graph enumeration. It shows that performing a plugging only creates new matchings local to the plugged vertex.

Lemma 2.8. Let x ∈ In(G), y ∈ Out(G), and let m : L → G/(x,y) be a matching such that the plugged vertex x ∼ y is not in the image of m. Then there exists a matching m′ : L → G.

Proof. Let q : G → G/(x,y) be the quotient map. Then, pulling back q over m is just the restriction of q to the image of m. Since the image of m does not contain the plugged vertex, q restricts to an isomorphism r. It is straightforward to show that m′ = r⁻¹ ∘ m is a matching.

Definition 2.9. A set of string graph rewrite rules is called a string graph rewrite system. For a string graph rewrite system S, we write G ⟶_S H if there exists a rule L → R ∈ S and a matching m : L → G such that G can be rewritten to H using the DPO diagram given by diagram (2). Let ⟶*_S be the reflexive, transitive closure of ⟶_S and ≈_S the associated equivalence relation.


The vigilant reader will notice that we allow any number of wire-vertices to occur within a wire (all with the same type). However, the number of wire-vertices makes no semantic difference in the interpretation of a string graph. Therefore, we will consider two string graphs to be semantically equivalent if they only differ in the length of their wires. This equivalence relation is called wire-homeomorphism. We can formalise wire-homeomorphism as a string graph rewrite system.

Definition 2.10. For a monoidal signature T = (O, M, dom, cod), the rewrite system H is defined as follows. For every X ∈ O, we define a loop contraction rule h^L_X and a wire contraction rule h^W_X. For every f ∈ M and 0 ≤ i < Length(dom(f)), 0 ≤ j < Length(cod(f)), we define an input contraction rule h^I_{f,i} and an output contraction rule h^O_{f,j}.

[Diagram: the four families of contraction rules h^L_X, h^W_X, h^I_{f,i} and h^O_{f,j}, each removing one redundant wire-vertex from a loop, a wire, the ith input wire of f, or the jth output wire of f, respectively.]

It was shown in [8] that this is a confluent, terminating rewrite system. Normal forms are called reduced string graphs, and contain only one wire-vertex on every wire.

3 Interpreting String Graphs as Tensor Contractions

Describing the contractions of many tensors was the main reason Penrose introduced string diagram notation, so perhaps unsurprisingly, there is a natural way to interpret a string graph as a contraction of tensors. We first give the monoidal signature T = (O, M, dom, cod) a valuation v : T → Vect_C. This is a monoidal signature homomorphism from T to Vect_C that assigns to each of the objects o ∈ O a vector space and to each of the morphisms f ∈ M a linear map v(f). We can then regard the linear map v(f) as a tensor with a lower index for every input to f in T and an upper index for every output.

A string graph can then be interpreted as a big tensor contraction by interpreting wire-vertices as identities (i.e. the Dirac delta tensor δ^j_i), node-vertices as linear maps v(f), for f the type of the vertex, and edges as sums over an index.

[Diagram: a string graph with node-vertices f and g and edges labelled k, l, m, n, o, p, q, r, which is interpreted as the tensor contraction]

\[
\underbrace{\delta^{k}_{i_1}\,\delta^{l}_{i_2}}_{\text{inputs}}\;
\underbrace{\delta^{j_1}_{m}\,\delta^{j_2}_{n}\,\delta^{j_3}_{o}}_{\text{outputs}}\;
\underbrace{[v(f)]^{p,o}_{k,l}\,[v(g)]^{m,n}_{q}}_{\text{nodes}}\;
\underbrace{\delta^{q}_{p}\,\delta^{r}_{r}}_{\text{wires}}
\]

We are using the Einstein summation convention (repeated indices are summed over), and we have labelled edges in the string graph with their corresponding indices. Like wire-vertices themselves, the δ maps are used mainly just for book-keeping. They maintain the correct order of inputs and outputs, define circles, and connect more interesting tensors together.
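As a tiny numerical illustration of the book-keeping role of the δ tensors (our own example, not from the paper): contracting δ^j_i with a vector along the shared index returns the vector unchanged, which is exactly why wire-vertices are semantically invisible.

```haskell
-- The 2-dimensional delta^j_i as an identity matrix.
delta :: [[Double]]
delta = [[1, 0], [0, 1]]

-- Einstein summation over one shared index: (m . v)_i = sum_j m^j_i v_j
contractWith :: [[Double]] -> [Double] -> [Double]
contractWith m v = [ sum (zipWith (*) row v) | row <- m ]
```

Contracting with delta is the identity, while contracting with a non-trivial tensor (e.g. a swap matrix) genuinely transforms the vector.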

This gives the evaluation of a string diagram G as a linear map JGK in Vect_C. This is a special case of a much more general construction. In [14], Kissinger showed that string graphs could be used to form the free symmetric traced category over a monoidal signature. As a consequence, for any symmetric traced category C, a monoidal signature homomorphism v : T → C lifts uniquely to an evaluation functor from the category of string graphs to C.

4 Conjecture Synthesis for String Graph Theories

We will now describe a synthesis procedure for graphical identities in the spirit of IsaCoSy [11]. A notable difference is that, rather than passing conjectured identities off to an inductive


theorem prover, we simply evaluate them using a given valuation of the generators of the theory.

The key to this procedure is that it maintains a set of reduction rules S throughout, and avoids enumerating string graphs that are reducible with respect to S. We refer to such graphs as redexes.

The theory synthesis procedure takes as input a string-graph signature T along with an (m,n)-tensor for every box in T with m inputs and n outputs. It also takes an initial set of rewrite rules S and a reduction ordering, given by a measure µ from string graphs to N such that G ≅ H implies µ(G) = µ(H), and G →_S H implies µ(G) > µ(H). We shall also choose µ to be non-increasing in the number of node-vertices and wire-vertices in a string graph. We will also maintain a set K (initially empty) of congruences, but these cannot be used for redex-elimination.

In the term case, a single round of synthesis is parametrised by two natural numbers: the maximum term size (or term depth) and the maximum number of free variables occurring in the term. For string graphs, we parametrise a run with four natural numbers: the number of inputs M, the number of outputs N, the maximum number of pluggings P, and the maximum number of node-vertices Q. We shall refer to string graphs with M inputs, N outputs, and up to P pluggings and Q node-vertices as string graphs of size (M,N,P,Q).

First, we define a string graph enumeration procedure that avoids reducible expressions with respect to our initial rewrite system S. Suppose we have a string graph signature T = (O, M, dom, cod); then we can define a string graph for every morphism in T, called its generator. For each f ∈ M, this is the smallest string graph containing a node-vertex of type f. For completeness, we also include the identity generator, which is just two connected wire-vertices. For an example of a set of generators, see (3) in the next section.

We enumerate string graphs by starting with disconnected string graphs, i.e. disjoint unions of generators, and performing pluggings to connect separate components together.

Definition 4.1. For a string graph G, let p(G) be the set In(G) × Out(G). A pair (x, y) ∈ p(G) defines a particular plugging G/(x,y). Two pluggings (x, y), (x′, y′) ∈ p(G) are called similar if there exists an isomorphism φ : G/(x,y) ≅ G/(x′,y′) that is the identity on node-vertices. Let p(x,y)(G) ⊆ p(G) be the set of all pairs (x′, y′) similar to (x, y).

Let D(M,N,Q) be the set of all disconnected string graphs with M inputs, N outputs and up to Q node-vertices. The procedure ENUM takes as input a graph, a set of (distinguishable) pluggings and the number of pluggings left to do. Note that each plugging decreases |In(G)| and |Out(G)| by 1. So, if we start with G ∈ D(M + p, N + p, Q) and perform p pluggings, we will get a string graph with M inputs and N outputs.

 1: procedure ENUM(Π, G, p)
 2:   if p = 0 then
 3:     save G
 4:     exit
 5:   else if Π = {} then
 6:     exit
 7:   end if
 8:   let (x, y) ∈ Π
 9:   if no rule in S matches G/(x,y), local to (x, y) then
10:     ENUM(Π − {(x, y)}, G/(x,y), p − 1)
11:   end if
12:   let Π′ = Π − p(x,y)(G)
13:   ENUM(Π′, G, p)
14: end procedure
15:
16: for all 0 ≤ p ≤ P, G ∈ D(M + p, N + p, Q) do
17:   ENUM(p(G), G, p)
18: end for

The procedure ENUM recurses down two branches: one where we decide to do a particular plugging (x, y) ∈ Π (line 10), and one where we decide not to (line 13). Line 12 prevents us from deciding not to do a certain plugging and then later deciding to do one that is similar, as this would enumerate redundant graphs.

The crucial redex-elimination step in the algorithm occurs at line 9. Recall that a rule L → R matches G/(x,y) local to (x, y) if the matching m : L → G/(x,y) contains the plugged vertex (x ∼ y) ∈ G/(x,y) in its image. Assuming each of our generators is irreducible, it suffices to consider only matches of rules local to the most recent plugging in order to eliminate all reducible expressions. By Lemma 2.8, if a rule has a matching on G/(x,y) that is not local to (x, y), it already has a matching on G. Therefore, the condition on line 9 guarantees that no redexes with respect to S will be enumerated.
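A minimal executable sketch of this enumeration (illustrative only: graphs are opaque values, pluggings are (input, output) pairs, and plug, is_redex_at, and similar are assumed stand-ins for the real string-graph operations):

```python
# Sketch of ENUM: the two recursive calls correspond to lines 10 and 13
# of the pseudocode, and is_redex_at plays the role of the line-9 check.
def enum(graph, pluggings, p, plug, is_redex_at, similar, results):
    if p == 0:
        results.append(graph)        # line 3: save the finished graph
        return
    if not pluggings:
        return                       # line 6: nothing left to plug
    (x, y) = next(iter(pluggings))   # line 8: pick a plugging
    plugged = plug(graph, x, y)
    if not is_redex_at(plugged, (x, y)):             # line 9: redex check
        enum(plugged, pluggings - {(x, y)}, p - 1,
             plug, is_redex_at, similar, results)    # line 10: do it
    rest = pluggings - similar(graph, (x, y))        # line 12: drop similar
    enum(graph, rest, p, plug, is_redex_at, similar, results)  # line 13
```

With graphs modelled as frozensets of chosen pluggings, a trivial redex check that never fires, and similarity as equality, a run with two available pluggings and p = 1 yields exactly the two singleton graphs.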

Once all of the irreducible graphs of size (M,N,P,Q) have been enumerated, we update the rewrite system as follows. First, we evaluate the string graphs as linear maps and organise them into equivalence classes, up to scalar factors and permutations of inputs and outputs. We then filter out any remaining isomorphic graphs in each equivalence class C and identify the set of minimal string graphs C′ ⊆ C with respect to µ. Choose a representative G₀ of C′. Finally, we add new reductions G → G₀ to S for all G ∈ C − C′, and add congruences G′ → G₀ and G₀ → G′ for all G′ ∈ C′ − {G₀} to K.

We postpone filtering out isomorphic string graphs until after enumeration because tensor contraction is fast, and two string graphs will not be isomorphic unless they are in the same equivalence class. In order to get a well-behaved rewrite system, we should choose µ so that there are very few congruences, if any. Obviously, if we choose µ to be strict (for all G ≇ H, µ(G) < µ(H) or µ(G) > µ(H)), there will be no congruences. While strict reduction orderings for graphs are much more difficult to compute than their term analogues, this may be tractable if we adapted our graph enumeration procedure to only produce canonical representatives of isomorphism classes of string graphs (cf. [10, 16]).
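One simple way to organise evaluated graphs into classes up to a scalar factor (an illustrative sketch, not the paper's implementation; permutation-invariance of inputs and outputs is omitted) is to key each tensor on a normalised, rounded copy of itself:

```python
import numpy as np

def scalar_class_key(tensor, decimals=8):
    # Tensors equal up to a nonzero scalar get the same key: divide by
    # the first nonzero entry, then round to tolerate float noise.
    t = np.asarray(tensor, dtype=complex)
    flat = t.ravel()
    nz = np.flatnonzero(np.round(flat, decimals) != 0)
    if nz.size == 0:
        return (t.shape, 'zero')
    normed = flat / flat[nz[0]]
    return (t.shape, tuple(np.round(normed, decimals)))
```

Grouping evaluated graphs in a dictionary keyed by scalar_class_key then yields the equivalence classes up to scalar, from which the minimal representatives can be selected.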

Once a single run of the synthesis procedure is complete, we can re-run the procedure using larger values of M, N, P, and Q as well as the updated rewrite system S. Using this growing collection of reductions can be very effective in stemming the exponential blow-up in both the number of string graphs that need to be enumerated and the number of rewrite rules found.

Theorem 4.2. Applying the synthesis procedure for a series of runs given by {(Mi, Ni, Pi, Qi)} yields a rewrite system S ∪ K that is complete for all graphs of size (Mi, Ni, Pi, Qi) for some i. Furthermore, if K = {} the rewrite system yields unique normal forms for graphs of the given sizes.

Proof. For completeness, suppose there are string graphs G and H of size (Mi, Ni, Pi, Qi) such that ⟦G⟧ = ⟦H⟧. Since S is a terminating rewrite system, we can normalise G and H with respect to S. Since µ is non-increasing on node-vertices and wire-vertices, the associated normal forms G′ and H′ are also of size (Mi, Ni, Pi, Qi), so both graphs will have been enumerated. Therefore, either G′ ≅ H′ or K contains a congruence between G′ and H′. For unique normal forms, note that when K is empty, only G′ ≅ H′ is possible.


Of course, we are interested in graphs of all sizes. However, completeness is not expected (and sometimes not desirable) in these cases, as the model or models used to synthesise the theory may be degenerate in some sense as examples of some algebraic structure. To ensure unique normal forms, we would need to ensure confluence of S. Unlike termination, this does not come for free, as there may exist critical pairs that are larger than any of the graphs synthesised. However, the synthesis procedure could be used in conjunction with a graphical variant of Knuth-Bendix completion to increase the chances of obtaining a confluent rewrite system for graphs of all sizes.

5 Application and Results

The procedure described in section 4 has been implemented in a tool called QuantoCosy. This sits on top of a general framework for string graph rewriting called Quantomatic [15]. We show in this section how QuantoCosy performs in synthesising an example theory, called the GHZ/W-calculus. For a detailed description of this theory and how it can be applied, see [5].

In the GHZ/W-calculus, we can express large, many-body entangled quantum states as compositions of simpler components. The key here is to use three-body states as algebraic "building blocks" for more complex states. A three-system, or tripartite entangled state, can be represented as a vector in H ⊗ H ⊗ H for some complex vector space H. For finite dimensions, we can equivalently express this vector as a linear map H ⊗ H → H. When H ≅ C², there are only two "interesting" tripartite entangled states, the GHZ state and the W state. Regarding these as linear maps from C² ⊗ C² ≅ C⁴ to C², they can be written as follows:

    GHZ := ( 1 0 0 0 )        W := ( 0 1 1 0 )
           ( 0 0 0 1 )             ( 0 0 0 1 )
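As a quick sanity check (illustrative, not from the paper), both matrices do define associative multiplications on C², which can be verified numerically:

```python
import numpy as np

GHZ = np.array([[1, 0, 0, 0],
                [0, 0, 0, 1]])   # m(e_i (x) e_j) = e_i if i = j, else 0
W   = np.array([[0, 1, 1, 0],
                [0, 0, 0, 1]])   # e_0 if exactly one of i,j is 1; e_1 if both

I2 = np.eye(2, dtype=int)

# Associativity of the induced multiplication m : C^2 (x) C^2 -> C^2:
# m . (m (x) id) = m . (id (x) m), both sides being maps C^8 -> C^2.
for m in (GHZ, W):
    assert np.array_equal(m @ np.kron(m, I2), m @ np.kron(I2, m))
```

The Frobenius algebra structure mentioned below adds units, co-units and co-multiplications on top of these multiplications.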

These maps exhibit particularly nice algebraic identities. They can both be extended to commutative Frobenius algebras, which are unital, associative algebras over a vector space (or an object in some monoidal category) that have a strong self-duality property. In particular, each can be naturally associated with a co-unital co-algebra that interacts well with the algebra. For pairs of Frobenius algebras, we can also always construct a special map called the dualiser. See [5] for details on how these things are defined algebraically.

The generators of the GHZ/W-calculus are: (i) the generators of both Frobenius algebras, (ii) the dualiser of the two algebras, and (iii) two zero vectors.

[diagrams of the generators: the Frobenius algebra generators for GHZ and W, the dualiser, and the two zero vectors]    (3)

As was the case for terms, filtering out redexes has a huge impact on the number of string graphs that need to be checked. In ten synthesis runs, we generated all of the GHZ/W rewrite rules with total node-vertices, pluggings, and inputs+outputs ≤ 3. Using a naïve synthesis, this yielded 13,302 rewrite rules. Using the procedure described in the previous section, it yielded 132 rules, including all of the rules used to define the GHZ/W-calculus in [5]. A plot of the number of rewrite rules generated using a naïve graph enumeration algorithm against the number generated using the redex-elimination procedure across all 10 runs is provided in Figure 1.

6 Conclusion and Outlook

In this paper, we showed how string diagrams could be discretised into string graphs, which can be manipulated automatically using DPO graph rewriting. We then demonstrated how


[Figure 1 here: log-scale plot (10² to 10⁴) of the total number of rules per run, for run sizes (0,1,3,3), (1,0,3,3), (0,2,3,3), (1,1,3,3), (2,0,3,3), (0,3,3,3), (1,2,3,3), (2,1,3,3), (3,0,3,3), comparing naïve enumeration against redex elimination.]

Figure 1: Number of rules synthesised across 10 runs of the synthesis procedure.

a technique developed for term theories called conjecture synthesis can be adapted to work for string graphs. Finally, we showed an example of the application of this technique to a real graphical theory, called the GHZ/W-calculus. The results there were promising, as we demonstrated an exponential drop-off in the number of extraneous rules generated when using the redex-elimination routine as compared to a naïve synthesis routine.

In the end, some of the 132 rules produced were seemingly arbitrary, whereas others reflect a real algebraic relationship between the GHZ-structure and the W-structure. There are various reasons, including simple aesthetics, why a human mathematician would take some of these rewrites to be valuable or interesting. QuantoCosy is completely ignorant of such considerations, as it only knows which rules to throw away, rather than which ones to highlight. However, this is already useful, as it provides a researcher with a pool of hundreds of rules to sort through, rather than tens of thousands.

There are various ways in which this can be improved further. One is to make the synthesis smarter by employing heuristics that search for particular classes of rewrites that are common to many algebraic structures. This has been done for terms using scheme-based conjecture synthesis [17], where the search procedure starts by attempting to prove familiar sets of identities (associativity and commutativity of a binary operation, for instance) before moving on to less familiar conjectures.

Another way to improve the quality and conciseness of the synthesis output would be to refine the synthesised rules using Knuth-Bendix completion. This could prove a powerful method for automatically producing new rewrite rules for pattern graphs [14]. Pattern graphs are used to describe infinite families of string graphs that have some repeated substructure. Starting with a rewrite system that contains some previously known pattern graph rewrite rules, and some concrete rules produced using conjecture synthesis, one can obtain new pattern graph rewrites by performing critical pair analysis and completion.

Finally, and most importantly, this procedure can be improved and its scope broadened by looking at different kinds of concrete models. As mentioned briefly in section 3, string graphs can be used to construct the free symmetric traced category over a signature. So, any interpretation of a signature in a concrete symmetric traced category can be lifted uniquely to an evaluation of string graphs. Examples of such categories where one might wish to form


models are: Mat(F) of matrices over a finite field, (FRel, ×) of sets and finite relations with the "wave" style tensor and trace, (FRel, +) of sets and finite relations with the "token" style tensor and trace, and products of concrete symmetric traced categories (i.e. giving valuations in many models simultaneously).

The support of a theory synthesis tool allows a mathematician to very quickly get a picture of a theory by specifying a few generators, and opens the way for experimentation and rapid development of a wide variety of new theories.

References

[1] S. Abramsky and B. Coecke. A categorical semantics of quantum protocols. In Proceedings from LiCS, arXiv:quant-ph/0402130v5, 2004.

[2] B. Coecke and R. Duncan. Interacting quantum observables. In Proceedings from ICALP, pages 298–310, 2008.

[3] B. Coecke and B. Edwards. Spekkens's toy theory as a category of processes. arXiv:1108.1978v1 [quant-ph], 2011.

[4] B. Coecke and A. Kissinger. The compositional structure of multipartite quantum entanglement. In Automata, Languages and Programming, volume 6199 of Lecture Notes in Computer Science, pages 297–308. Springer, 2010.

[5] B. Coecke and A. Kissinger. The compositional structure of multipartite quantum entanglement. arXiv:1002.2540v2 [quant-ph], 2010.

[6] B. Coecke, M. Sadrzadeh, and S. Clark. Mathematical foundations for a compositional distributional model of meaning. arXiv:1003.4394v1 [cs.CL], 2010.

[7] B. Coecke and R. W. Spekkens. Picturing classical and quantum Bayesian inference. arXiv:1102.2368v1 [quant-ph], 2011.

[8] L. Dixon and A. Kissinger. Open graphs and monoidal theories. arXiv:1007.3794v1 [cs.LO], 2010.

[9] R. Duncan and S. Perdrix. Rewriting measurement-based quantum computations with generalised flow. In Proceedings of the 37th International Colloquium on Automata, Languages and Programming: Part II, ICALP'10, pages 285–296, Berlin, Heidelberg, 2010. Springer-Verlag.

[10] L. Goldberg. Efficient algorithms for listing unlabeled graphs. Journal of Algorithms, 13(1):128–143, 1992.

[11] M. Johansson, L. Dixon, and A. Bundy. IsaCoSy. http://dream.inf.ed.ac.uk/projects/isaplanner/wiki/IsaCoSy.

[12] M. Johansson, L. Dixon, and A. Bundy. Conjecture synthesis for inductive theories. Journal of Automated Reasoning, 2010.

[13] A. Joyal and R. Street. The geometry of tensor calculus I. Advances in Mathematics, 88:55–113, 1991.

[14] A. Kissinger. Pictures of Processes: Automated Graph Rewriting for Monoidal Categories and Applications to Quantum Computing. PhD thesis, University of Oxford (awaiting review), 2012.

[15] A. Kissinger, A. Merry, L. Dixon, R. Duncan, M. Soloviev, and B. Frot. Quantomatic. https://sites.google.com/site/quantomatic/, 2011.

[16] B. D. McKay. Isomorph-free exhaustive generation. Journal of Algorithms, 26:306–324, 1998.

[17] O. Montano-Rivas, R. McCasland, L. Dixon, and A. Bundy. Scheme-based synthesis of inductive theories. Advances in Artificial Intelligence, pages 348–361, 2010.

[18] R. Penrose. Applications of negative dimensional tensors. In Combinatorial Mathematics and its Applications, pages 221–244. Academic Press, 1971.

[19] U. Prange, H. Ehrig, and L. Lambers. Construction and properties of adhesive and weak adhesive high-level replacement categories. Applied Categorical Structures, 16:365–388, 2008.


Recording and Refactoring HOL Light Tactic Proofs

Mark Adams, David Aspinall
School of Informatics, University of Edinburgh

June 10, 2012

Abstract. In this article we present a mechanism for recording HOL Light tactic proofs in a hierarchical tree structure, with information stored at the level of atoms in the user's ML proof script. This is written to support refactoring of tactic proofs, so that single-tactic packaged-up proofs can be flattened to a series of interactive tactic steps, and vice versa. It also provides a good basis for proof visualisation and for a proof querying capability. The techniques presented can be adapted straightforwardly to other systems.

1 Introduction

Although now 30 years old, Paulson's subgoal package [1] is still the predominant mode of interactive proof in various contemporary theorem provers, including HOL4, HOL Light and ProofPower. The Flyspeck Project [2], a massive international collaborative effort to formalise Hales's Kepler Conjecture proof, uses HOL Light's subgoal package throughout, for example.

The subgoal package is simple in concept and yet remarkably effective in practice. Users start with a single main goal, which gets broken down over a series of tactic steps into hopefully simpler-to-prove subgoals. The user focuses on each subgoal in turn, moving on to the next when the current one has been proved. The proof is complete when the last subgoal has been proved. Behind the scenes, the subgoal package keeps everything organised by maintaining a proof state that consists of a list of current proof goals and a justification function for constructing the formal proof of a goal from the formal proofs of its subgoals. Tactics are functions that take a goal and return a subgoal list plus a justification function. The subgoal package state is updated every time a tactic is applied, incorporating the tactic's resulting subgoals and justification function.

Despite its widespread use, the subgoal package's user facilities remain basic and lack useful extended features. One potentially useful facility is automated proof refactoring, which can help save time for both experts and novices. Others include proof graph visualisation, which can help the user understand a large proof, and proof querying, for answering basic questions about the nature of a proof, such as the tactics and theorems it uses.

In this article we first motivate the need for automated refactoring, in particular for packaging up and breaking down tactic proofs. We also provide some detail about a tactic recording mechanism we have implemented for the HOL


Light system, which acts as a solid basis for implementing tool support for proof refactoring, proof graph visualisation and proof querying.

2 Motivation for Tactic Proof Refactoring

We first briefly explain how tactic proofs are often done in practice. When first written, a proof in the subgoal package is usually a long series of tactic steps. This first version of a proof is usually not beautiful, but does the job of proving the theorem. What often happens next is that the proof is cleaned up, or refactored, to become more succinct. This refactoring will often involve packaging up the tactic steps into a single, compound tactic that proves the goal in one step. If done well, the resulting compound tactic is usually neater and more concise because it can factor out repeated use of tactics, for example when the same tactic is applied to each subgoal of a given goal. These packaged-up proofs feature heavily in the source code building up the standard theories of the HOL4 and HOL Light systems, and were a prerequisite for submitting work to the Flyspeck Project for a number of years.

In light of this, automatic refactoring is potentially useful for three reasons. The first is that the process of packaging up a non-trivial proof can be long and tedious, and for proofs that run into dozens of lines it can be easy to miss opportunities to make the proof more concise. Doing this automatically could both save effort and result in more concise proofs.

The second reason is that, given that most of the best examples of tactic proofs are packaged up, novice users have to laboriously unpick them if they want to step through these masterpieces and learn how the experts prove their theorems. Unpicking a packaged-up proof is even more tedious than packaging it up in the first place, because the user does not know which tactics apply to more than one subgoal and thus need to occur more than once in the unpackaged proof. Automation would improve access to the wealth of experience that is held in the source code building up systems' standard theories.

The third reason is that proofs need to be maintained over time, due to changes in the theory context in which the theorems are proved. If the proofs are packaged up, then they will need to be unpackaged, debugged and then repackaged, which again would be considerably easier with automated support.

3 Example Tactic Proof Flattening

In this section we show how automated flattening of a packaged-up tactic proof can reveal the structure of the proof and enable the proof steps to be replayed interactively.

The following is a proof taken from the implementation of HOL Light. It is not a particularly untypical example of such.

let REAL_LT_INV = prove
 (`!x. &0 < x ==> &0 < inv(x)`,
  GEN_TAC THEN
  REPEAT_TCL DISJ_CASES_THEN
    ASSUME_TAC (SPEC `inv(x)` REAL_LT_NEGTOTAL) THEN
  ASM_REWRITE_TAC[] THENL
   [RULE_ASSUM_TAC(REWRITE_RULE[REAL_INV_EQ_0]) THEN
    ASM_REWRITE_TAC[];
    DISCH_TAC THEN
    SUBGOAL_THEN `&0 < --(inv x) * x` MP_TAC THENL
     [MATCH_MP_TAC REAL_LT_MUL THEN ASM_REWRITE_TAC[];
      REWRITE_TAC[REAL_MUL_LNEG]]] THEN
  SUBGOAL_THEN `inv(x) * x = &1` SUBST1_TAC THENL
   [MATCH_MP_TAC REAL_MUL_LINV THEN
    UNDISCH_TAC `&0 < x` THEN REAL_ARITH_TAC;
    REWRITE_TAC
     [REAL_LT_RNEG; REAL_ADD_LID; REAL_OF_NUM_LT; ARITH]]);;

Unpicking this proof manually is an arduous task. Some tactics get applied to more than one subgoal, but it is not clear which. Furthermore, use of THENL reveals that the proof branches at various points, but it is far from clear which branches are finished within the right hand side of the THENLs and which branches carry on further into the proof script.

The proof actually flattens out into the following series of tactics:

e (GEN_TAC);;
e (REPEAT_TCL DISJ_CASES_THEN
     ASSUME_TAC (SPEC `inv x` REAL_LT_NEGTOTAL));;
(* *** Subgoal 1 *** *)
e (ASM_REWRITE_TAC []);;
e (RULE_ASSUM_TAC (REWRITE_RULE [REAL_INV_EQ_0]));;
e (ASM_REWRITE_TAC []);;
(* *** Subgoal 2 *** *)
e (ASM_REWRITE_TAC []);;
(* *** Subgoal 3 *** *)
e (ASM_REWRITE_TAC []);;
e (DISCH_TAC);;
e (SUBGOAL_THEN `&0 < --inv x * x` MP_TAC);;
(* *** Subgoal 3.1 *** *)
e (MATCH_MP_TAC REAL_LT_MUL);;
e (ASM_REWRITE_TAC []);;
(* *** Subgoal 3.2 *** *)
e (REWRITE_TAC [REAL_MUL_LNEG]);;
e (SUBGOAL_THEN `inv x * x = &1` SUBST1_TAC);;
(* *** Subgoal 3.2.1 *** *)
e (MATCH_MP_TAC REAL_MUL_LINV);;
e (UNDISCH_TAC `&0 < x`);;
e (CONV_TAC REAL_ARITH);;
(* *** Subgoal 3.2.2 *** *)
e (REWRITE_TAC
     [REAL_LT_RNEG; REAL_ADD_LID; REAL_OF_NUM_LT; ARITH]);;

The flattened proof shows that the application of REPEAT_TCL resulted in three subgoals, with the first subgoal finishing in the first branch of the first THENL, the second subgoal finished off by ASM_REWRITE_TAC prior to the first THENL, and the third subgoal continuing beyond the second branch of the first THENL to split and continue to the end of the packaged proof.


4 Requirements for Capturing Tactic Proofs

Underlying a tactic proof refactoring facility must be some system for capturing tactic proofs in a suitable form. This is also true of proof visualisation and proof querying tools. In this section we discuss the requirements for a mechanism that supports all three activities.

First note that it is not particularly important that the original proof can be recreated in all its detail, including those parts that are redundant and don't get executed. For our purposes, we are more interested in what does get executed: this helps proof refactoring, means that proof visualisation can show what has happened, and lets proof querying portray what parts of the proof have been used. Thus capturing the proof by a static syntactic transformation of the original proof script will not suit our purposes. Rather, the proof needs to somehow be dynamically recorded, as it is executed, to capture what is actually used. We call this tactic proof recording.

Also note that the subgoal package, of course, already dynamically captures tactic proofs, simply as a list of subgoals (or actually a stack of such lists, so that interactive steps can be undone if required). However, this form is not suitable for our purposes because it does not explicitly capture the structure of the proof tree, and neither does it carry the various crucial pieces of information, such as the tactics used in the proof, that we require.

To be most suitable for our purposes of proof refactoring, visualisation and querying, our tactic recording mechanism has seven main requirements:

• To fully capture all the information needed to recreate a proof;

• To capture the parts of the proof that actually get used;

• To capture the information in a form that suitably reflects the full structure of the original proof, including the structure of the goal tree and hierarchy corresponding to the explicit use of tacticals in the proof script;

• To capture information at a level that is meaningful to the user, i.e. with atoms corresponding to the ML binding names for the tactics and other objects mentioned in the proof script;

• To be capable of capturing both complete and incomplete proofs;

• To work both for interactive proofs, possibly spread over several ML commands and involving meta operations for undoing steps or switching between goals, and for non-interactive packaged-up proofs;

• To work for existing proofs, without requiring modification to the original proof script.

5 The Tactic Proof Recording Mechanism

Our recording mechanism is designed to meet the above requirements. It maintains a proof tree in program state, in parallel with the subgoal package's normal state. The proof tree has nodes corresponding to goals in the proof, and branches corresponding to goals' subgoals, reflecting the structure of the original proof. Each node carries information about its goal, including a statement of the goal,


a unique goal identity number and a description of the tactic that got applied to the goal. Active subgoals are labelled as such in place of a tactic description, thus enabling incomplete proofs to be represented. The tree gets added to as interactive tactic steps are executed, and deleted from if steps are undone.

The crucial means by which goals in the subgoal package state are linked to parts of the stored goal tree is based on the goal identity numbers. The datatype for goals is extended to carry such an identity number. These identity-carrying goals are called xgoals.

type goalid = int;;
type xgoal = goal * goalid;;

Tactics are adjusted, or promoted, to work with xgoals, so that they take as input an xgoal and return as part of their output a list of uniquely numbered xgoals. When a promoted tactic is applied to an xgoal, its result is incorporated into the proof tree by locating the tree node with the same identity as the tactic's xgoal input. For efficient node location, a separate lookup table is maintained in program state for returning a pointer to the proof tree node corresponding to a given goal identity. The datatype for tactics is a trivial variant of HOL Light's original, with xgoals instead of goals.

type xgoalstate = (term list * instantiation) * xgoal list * justification;;
type xtactic = xgoal -> xgoalstate;;

A typical theorem prover has over 100 commonly used tactics. Rather than laboriously implementing promoted forms for each of these, it is preferable to write a generic wrapper function for promoting a supplied tactic. Promoted tactic ML objects overwrite their unpromoted versions, to enable existing proof scripts to be replayed without adjustment.
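The wrapping idea is language-independent. As an illustrative sketch (in Python rather than ML, with a dictionary standing in for the proof-tree state and a toy tactic in place of real HOL tactics), promotion might look like:

```python
# Illustrative sketch (not HOL Light code): a "tactic" maps a goal to a
# list of subgoals; "promotion" wraps it to thread unique goal ids and
# record each step in a proof tree.
next_id = 0
proof_tree = {}  # goal id -> (tactic name, list of child goal ids)

def fresh_id():
    global next_id
    next_id += 1
    return next_id

def tactic_wrap(name, tac):
    def xtac(xgoal):
        goal, gid = xgoal
        subgoals = tac(goal)                    # run the underlying tactic
        xsubgoals = [(g, fresh_id()) for g in subgoals]
        proof_tree[gid] = (name, [i for _, i in xsubgoals])  # record step
        return xsubgoals
    return xtac

# A toy tactic that splits a conjunction goal ('and', A, B) into A and B:
conj_tac = tactic_wrap("CONJ_TAC", lambda g: [g[1], g[2]])
```

Applying conj_tac to a root xgoal records a node mapping the root's id to the name "CONJ_TAC" and the two freshly numbered subgoal ids, mirroring how the ML tactic_wrap below extends the goal tree.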

let tactic_wrap name (tac:tactic) : xtactic =
  fun (xg:xgoal) ->
    let (g,id) = dest_xgoal xg in
    let (meta,gs,just) = tac g in
    let obj = Mname name in
    let xgs = extend_gtree id (Gatom obj) gs in
    (meta,xgs,just);;

let REFL_TAC = tactic_wrap "REFL_TAC" REFL_TAC;;
let STRIP_TAC = tactic_wrap "STRIP_TAC" STRIP_TAC;;
let DISCH_TAC = tactic_wrap "DISCH_TAC" DISCH_TAC;;

The generic tactic_wrap function promotes tactics of ML datatype tactic. The name argument carries the ML binding name of the tactic that is being promoted. Local value obj is of ML datatype mlobject, for representing the ML expression syntax of the tactic as it occurs in the proof script (see below). In this case, the expression syntax is simply an ML binding name.

It is necessary to write wrapper functions for the datatype of each tactic and inference rule datatype that can occur in a proof script. As the datatypes become more complex, so do the implementations of their corresponding wrapper functions. Slightly more complex than tactic_wrap is term_tactic_wrap, a function for promoting tactics that take a term argument. Its implementation is


similar to tactic_wrap, except that here obj has the expression syntax of an ML name binding applied to a HOL term.

let term_tactic_wrap name (tac:term->tactic) (tm:term) : xtactic =
  fun (xg:xgoal) ->
    let (g,id) = dest_xgoal xg in
    let (meta,gs,just) = tac tm g in
    let obj = Mapp (Mname name, [Mterm tm]) in
    let xgs = extend_gtree id (Gatom obj) gs in
    (meta,xgs,just);;

let UNDISCH_TAC = term_tactic_wrap "UNDISCH_TAC" UNDISCH_TAC;;
let EXISTS_TAC = term_tactic_wrap "EXISTS_TAC" EXISTS_TAC;;

The recursive ML datatype mlobject is capable of representing all the ML expression syntax that commonly occurs in tactic proofs, including ML binding names, strings, lists, function applications, HOL terms, etc. A full explanation of this datatype and the implementation of the more difficult promotion functions is left for another paper.

type mlobject =
    Mname of string
  | Mstring of string
  | Mlist of mlobject list
  | Mapp of (mlobject * mlobject list)
  | Mterm of term
  | ... ;;

6 Conclusion

In this article we have argued the need for a tactic proof refactoring capability, and provided some insight into how tactic proofs can be recorded in HOL Light to support this. The techniques are equally applicable to subgoal packages implemented in other theorem provers. The same recording mechanism holds information in a form that is ideal for dumping a proof tree graph for proof visualisation, and for proof querying. We intend to implement these facilities as part of the Proof General system [3] for HOL Light.

7 References

[1] L. C. Paulson. Logic and Computation: Interactive Proof with Cambridge LCF. Cambridge University Press, 1987.
[2] T. C. Hales. Introduction to the Flyspeck project. In T. Coquand, H. Lombardi, and M.-F. Roy, editors, Mathematics, Algorithms, Proofs, number 05021 in Dagstuhl Seminar Proceedings, Dagstuhl, Germany, 2006. Internationales Begegnungs- und Forschungszentrum für Informatik (IBFI), Schloss Dagstuhl, Germany. http://drops.dagstuhl.de/opus/volltexte/2006/432.
[3] D. Aspinall. Proof General: a generic tool for proof development. In Tools and Algorithms for the Construction and Analysis of Systems, Proc. TACAS 2000, LNCS 1785.


Theory Exploration: a role for Model Theory?

Alan Smaill

Abstract

There is an empirical claim that, when exploring a mathematical theory, after a succession of key results have been obtained, a point of equilibrium is reached where any query of interest can be resolved by routine reasoning from the results already established. Here we suggest some ways of thinking about this situation in general. There are at least some situations where we can establish that all results (of a certain shape) will follow by routine reasoning from a small number of key properties. An example is described, and the significance for automated theory exploration discussed.

1 Introduction

In a series of papers, Bruno Buchberger has presented a view of algorithmically based theory exploration (see e.g. Buchberger (2004, 2006, 2008)). Important features are the distinction between “easy proving” and “hard proving”, and a “spiral” progression involving introduction of new concepts, with successive stages of working out the key consequences with “hard reasoning”, until reaching a point of saturation; at this point the “hard reasoning” stops providing any new insights, and “easy reasoning” is sufficient for subsequent use of the theory. This last possibility can give rise to streamlined procedures (e.g. normalisers or decision procedures) based on the theorems established with hard reasoning.

Examples in Buchberger (2006) include:

Arithmetic Given the recursion equations for plus, use “hard” means (induction) to prove appropriate arithmetic laws; given the right laws, induction becomes unnecessary; an efficient procedure for deciding equality of arithmetic expressions involving plus and 0 can then be validated.

Simple Set Operations Given notions of set membership and definitions of ∪, ∩, use FOL reasoning to establish identities involving ∪, ∩. Given such boolean algebra properties, simple rewriting is sufficient for their use, e.g. via a boolean algebra simplifier.

Real Closed Fields Again start with axioms and FOL reasoning. After hard work to establish the delineability theorem (Collins 1975), inference can subsequently proceed without reasoning involving quantifiers, and a decision procedure can be validated.

From these examples, we see that the notions of “hard” and “easy” reasoning are relative; that they do not correspond to algorithmic complexity (the real-closed field decision procedure embodies “easy” reasoning, but is computationally very expensive); and that the “easy” reasoning involves less search than “hard” reasoning.

Having established that a situation has been reached where “easy” reasoning is sufficient for a given purpose, there is still work to do if the aim is to give an algorithm


that exploits this sort of reasoning. Such algorithms, whose accuracy depends on the previous reasoning steps, should be verified.

Putting this together, we have some examples with a family resemblance. What sorts of tools and techniques can be brought to bear in order to facilitate, and if possible automate, aspects of this exploration process? This paper describes an approach where model theory can be used to identify situations where it is known that “easy” reasoning is sufficient.

The next sections describe: (2) a presentation by Henkin of such a model-theoretic argument; (3) a discussion of the approach via normal forms for the same problem; (4) the details of the argument from Henkin; (5) a plan for proving other results of this form; (6) its realisation for another example; and (7) concluding discussion.

2 Henkin’s presentation

We first look at the first of Buchberger’s examples above, the case of addition over the natural numbers with 0.

This particular example (actually the version without the identity 0) is discussed in Henkin (1977):

But now let us return to the equational theory of the number system N. The fact that the commutative and associative laws form a base can also be proved by means of normal forms for the terms of T = (T, ∘). However, the result about this base has a semantical significance which permits a completely different kind of proof. Namely, the indicated result is equivalent to the statement that every structure A = (A, ∘) which satisfies the commutative and associative laws — i.e., every commutative semi-group — will satisfy any equation which holds identically in N. And this can be proved by showing that each commutative semi-group A is a homomorphic image of a subdirect power of N.

(Henkin 1977, p 92)

This means that if we have an equation formed just using + and variables, and if it is universally true in N, the natural numbers, then we can prove it by equational reasoning, without induction, using associativity and commutativity as axioms. Taking equational reasoning as “easy” and reasoning with induction as “hard”, this means that for goals of the appropriate kind, we know that “easy” reasoning is complete, and that there is no need to use “hard” reasoning.

3 Normal forms in natural numbers

Before looking at the details of the model-theoretic argument, consider the approach via normal forms mentioned by Henkin.

Henkin mentions that the result for natural numbers can be got by defining a normal form for terms formed of variables and +. So, for example, we can rewrite by sorting the variables according to some order or other, and left-associating both terms; that this preserves the denotation of the terms follows exactly from commutativity and associativity. It is easy enough to see that this defines unique normal forms.

What is also true is that if the two normal forms are not identical, then the equation is not universally true, so that this form of reasoning is complete. If we compare the two terms, and cancel pairs of variables from the left and right terms that are identical, we must come to a situation with a variable that appears in one term but not the other, so we can devise a simple countermodel.
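The two steps (computing a normal form, and reading off a countermodel from a mismatch) can be made concrete in a small sketch. The following is our own illustration, not from the paper: terms over + are flattened to the sorted list of their variable occurrences, a step justified by associativity and commutativity, and two terms denote the same function over the naturals exactly when these lists coincide.

```python
# Illustrative sketch (ours, not the paper's): terms over variables and +
# are nested tuples such as ("+", "x", ("+", "y", "x")).

def variables(term):
    """Flatten a term into the sorted list of its variable occurrences."""
    if isinstance(term, str):
        return [term]
    _, left, right = term
    return sorted(variables(left) + variables(right))

def same_identity(s, t):
    """True iff the equation s = t is universally valid over (N, +)."""
    return variables(s) == variables(t)

# x + (y + x) = (x + x) + y holds; x + y = x + x fails, and cancelling the
# shared x on both sides leaves y on one side only, yielding a countermodel.
assert same_identity(("+", "x", ("+", "y", "x")), ("+", ("+", "x", "x"), "y"))
assert not same_identity(("+", "x", "y"), ("+", "x", "x"))
```

The sorted list plays the role of the unique normal form; comparing lists implements the cancellation argument of the text.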


Although the argument in this case is simple, it looks to be quite hard to generalise the second step in a way that might give automatic recognition in other similar cases. Presumably an inductive argument over the syntax of normal forms, using appropriate cancellation lemmas, is available.

The use of model-theoretic ideas in relation to exploiting common properties in automated reasoning is described for example in (Baader & Snyder 2001, Sec 5); as in Henkin’s exposition, this involves consideration of the models of the underlying theories, and indeed the notions of commutativity, associativity and identity. The main focus there is on unification in theories where equality is characterised by some combination of these properties. In contrast, the emphasis for us is to identify when a given theory, involving inductive definitions, is such that the identities of the theory are correctly axiomatised in this way.

An alternative, proof-theoretic, approach to show that “easy” reasoning suffices is to try to show that for every inductive proof there is a corresponding equational one from the lemmas – this looks difficult in general.

For the first stage, the area of devising normal forms from an equationally presented theory has had a fair bit of attention, for example via completion procedures such as Knuth-Bendix (see (Baader & Nipkow 1998, Ch 7)). Work on synthesis of decision procedures in terms of rewrite systems is also relevant (Janicic & Bundy 2007).

On the other hand, there is little work on automatically showing completeness of axiomatisations with respect to a larger theory (the inductively defined theory of addition, here).

4 The model theoretic argument

Henkin also alludes to a different way of showing that associativity and commutativity form a base. Here is a reconstruction of that argument.

4.1 The proof

We want to show that every equation of the above form valid in N follows from associativity and commutativity. By completeness of first order logic, it is enough to show that any such equation is true in all models of associativity and commutativity (call such a model an AC-structure; this is traditionally called a commutative semigroup). There are some operations for building new models from old that preserve validity of atomic formulae (equations, here). If we can show that every AC-structure can be got from N by some such operations, that will be enough.

Three such operations are:

1. Taking products. If, for each i ∈ I, A_i = (A_i, +_i) is a set with a binary operation, then the product ∏_{i∈I} A_i is given as the set of sequences { (x_i)_{i∈I} | x_i ∈ A_i }, with the binary operation defined pointwise, i.e.

       (x_i)_{i∈I} + (y_i)_{i∈I} =def (x_i +_i y_i)_{i∈I}.

   Any equation true in each of the A_i is going to be true in ∏_{i∈I} A_i.

2. Taking a substructure. If we take a subset of the domain of a structure closed under the operations of the structure, then that will also preserve universally valid equations, since they simply have to be true for fewer interpretations of the variables.

3. Taking a homomorphic image (equivalently, taking a quotient). If there is a homomorphism f : Y → Z (i.e. f(y1 + y2) = f(y1) + f(y2) for all y1, y2 ∈ Y) and f is surjective, then any equation valid in Y is also valid in Z.


Now the claim is that any AC-structure can be got from N via these operations. For an example, think of the real numbers modulo 1, i.e. the real interval [0, 1[ with “wrap-around” addition.

Suppose that X = (X, +) is an AC-structure.

First, take a copy of N for each element x ∈ X, to get ∏_{x∈X} N. This is also an AC-structure.

Now take the substructure of the product whose domain is the set

    { (n_x)_{x∈X} | n_x = 0 for all but finitely many x }.

This is closed under the + operation on the product, so is also an AC-structure; call it S. We can now find a homomorphism f : S → X as follows. Set

    f((n_x)_{x∈X}) =def ∑_{x∈X} n_x . x

where the sum is defined since only a finite number of terms are non-zero. Here n_x . x, for n_x a natural number and x ∈ X, means the n_x-fold addition of x to itself in X. Now it should be checked that f is a homomorphism.

Since each step preserves validity of equations, this shows that all equations valid in N are also valid in X.

It is worth noting that the proof, extended to allow an identity, also shows that these considerations apply to equations over the reals, rationals and complex numbers, using +, simply because these are commutative semi-groups.
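The construction can be checked concretely on a small case. The following sketch is our own illustration (the choice of X and the exhaustive check are ours, not the paper's): take X to be addition modulo 3, a commutative semigroup; elements of the substructure S are finitely supported families (n_x), added pointwise, and f((n_x)) = ∑ n_x . x is verified to be a homomorphism.

```python
# Our own small check of the construction (not from the paper): X = Z/3Z
# under addition. Elements of the substructure S are maps x -> n_x with
# finite support, added pointwise; f sends a family to the sum of the
# n_x-fold additions of each x, computed in X.
import itertools

X = [0, 1, 2]                          # Z/3Z under +: an AC-structure

def add_S(a, b):
    """Pointwise addition in the product, restricted to finite support."""
    return {x: a.get(x, 0) + b.get(x, 0) for x in X}

def f(a):
    """f((n_x)) = sum over x of the n_x-fold addition of x, in X."""
    return sum(a.get(x, 0) * x for x in X) % 3

# Exhaustive check of the homomorphism property f(a + b) = f(a) + f(b):
for a_vals in itertools.product(range(3), repeat=3):
    for b_vals in itertools.product(range(3), repeat=3):
        a, b = dict(zip(X, a_vals)), dict(zip(X, b_vals))
        assert f(add_S(a, b)) == (f(a) + f(b)) % 3
```

Surjectivity is immediate here, since the family mapping x to 1 and everything else to 0 is sent to x.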

4.2 Can we use this?

This looks like a harder proof, both to establish by hand and to hope to automate, and there is little work on automation of such proofs. However, there are quite a lot of such preservation theorems known (see Chang & Keisler (1973)) that relate syntactic categories of axiomatisations to preservation under operations on structures. Building up a theory of how to construct appropriate structures would be a real challenge.

It will appear from the example in this paper that when induction is used in the proofs (when fully spelt out, over the syntax of formulae), its use is straightforward. At this stage, this is anecdotal evidence in favour of this approach.

5 Other examples, and a proof plan

Here are some similar situations, where we can expect the same approach to work:

1. Equations over naturals, using + and 0 (as above, with identity property).
2. Equations over naturals, using × and 1.
3. Equations of lists, with terms formed from append and nil, using associativity of append and the identity property (on both sides of append).
4. Equations of sets, using union and intersection (from the boolean algebra characterisation, including distributivity).
5. ...

Let us look for an approach to the automation of proofs for these examples in the spirit of proof plans (Bundy 1991); the idea is that the search for a proof is guided by an expectation of the overall shape of the proof, and a planning-based approach to the assembly of tactics that build a final verification. The shape of the strategy will be given informally here.


5.1 The task

Let us use the second example above; in fact a slight variant will simplify the presentation here. Use N>0 for the structure with domain the positive natural numbers, and the usual multiplication and multiplicative unit; the set of positive natural numbers is written as N>0.

We will look at the claim applied to the positive natural numbers; the full claim as above follows easily, by noting separately that for such identities

    N>0 ⊨ t(x̄) = s(x̄)  iff  N ⊨ t(x̄) = s(x̄).

This is done by simply considering the cases where a variable is assigned the zero value when the variable appears on both sides of the identity, and only on one side.

The claim we want to verify is the following:

If t(x̄), s(x̄) are terms built from variables¹ and the constant 1 using ×, with the usual interpretation in the positive natural numbers N>0, and ass, com, id are the usual statements of associativity, commutativity and identity for this language, then

    N>0 ⊨ t(x̄) = s(x̄)  iff  ass, com, id ⊢ t(x̄) = s(x̄).

The proof idea is to take any model S = (S, ×_S, 1_S) of ass, com, id, and show that it can be regarded as related with N>0 by a series of intermediate structures, each of which preserves the truth of such identities; this shows that any true identity is a semantic consequence of ass, com, id, and we can appeal to completeness of equational reasoning.

5.2 The Strategy

The task involves supplying particular roles so that the parts of the proof fit together as follows. Suppose given a structure S with the appropriate properties:

1. Take a product of N>0, indexed by the elements of S. Operations are defined pointwise; this preserves truth of identities.

2. Take a substructure of the product, by picking appropriate sequences (n_s)_{s∈S} closed under multiplication. This will preserve all purely universally quantified statements.

3. Define a “transfer function” of N>0 to the algebra S, i.e. a function . : N × S → S. In the case of addition, this simply defined n . s as the n-fold sum of s with itself, giving the key property that (n + m) . s = (n . s) +_S (m . s). On the assumption that the properties of the algebra should be carried over in the same style, we want an operation . such that:

       (n × m) . s = (n . s) ×_S (m . s)
       1 . s = 1_S.

4. Finally, we get S as a homomorphic image of the substructure in part 2 by defining a homomorphism that uses the local operation defined in part 3.

This splits the problem up into selecting the right ingredients to fit this outline.

¹ Not necessarily the same set of variables.


6 From Plan to Proof

Now we look at how the planned shape of proof can be fleshed out into a proof. The proof planning approach suggests using middle-out reasoning here – the missing ingredients do not have to be provided immediately, but place-holders in the form of meta-variables are used to allow planning to continue; the meta-variables are then incrementally instantiated as verification conditions are satisfied. For the present example, this step is currently done on paper — see Kraan et al. (1996) and Johansson et al. (2010) for examples of middle-out reasoning in proof planning.

Forming the product structure – no choice here; this follows the first step from section 4.1.

Finding the substructure The earlier proof, replacing + with ×, suggests looking at the families which are “nearly the identity”, i.e. (n_s)_{s∈S} where n_s = 1 for all but finitely many values. The proof obligation here is to check that such sequences are closed under multiplication, and that the identity element (1)_{s∈S} is included. Let’s call this structure T.

The first proof obligation can be discharged by induction on the number of non-identity elements.

Defining an operation Apart from the unhelpful operation that maps everything to 1_S, it is not so easy to invent an operation here. Using primitive recursion directly gets nowhere; since it is the multiplicative structure that is our interest, we can take recursion via factorisation and build in the property we want. This form of definition is suggested by a form of recursion analysis, as in Kraan et al. (1996).

    1 . s = 1_S
    p . s = ? (where p is prime)
    (x × y) . s = (x . s) ×_S (y . s)

where we need to supply the base case values, and check that our function is well-defined, in that it does not depend on the factorisation chosen.

The solution falls out:

    1 . s = 1_S
    p . s = s (where p is prime)
    (x × y) . s = (x . s) ×_S (y . s)

so that n . s is s to the power m, where m is the number of occurrences of prime divisors of n, including repeated factors. The proof obligation here requires us to show, for example,

    x × y = x′ × y′ → (x . s) ×_S (y . s) = (x′ . s) ×_S (y′ . s),

and itself uses prime/composite induction.
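The resulting operation has the closed form n . s = s^Ω(n), where Ω(n) counts the prime divisors of n with multiplicity. The sketch below is our own illustration, not the paper's formalisation; for the sake of a finite test, the target algebra S is taken to be multiplication modulo 13, which satisfies ass, com, id.

```python
# Our illustration of the transfer operation: n . s = s^Ω(n), with Ω(n) the
# number of prime factors of n counted with multiplicity. The algebra S is
# multiplication modulo 13, a convenient finite model of ass, com, id.

def omega(n):
    """Ω(n): number of prime factors of n, counted with multiplicity."""
    count, p = 0, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    if n > 1:                  # remaining cofactor is prime
        count += 1
    return count

def transfer(n, s, modulus=13):
    """n . s: satisfies 1 . s = 1, p . s = s for prime p, and
    (x*y) . s = (x . s) * (y . s)."""
    return pow(s, omega(n), modulus)

# The well-definedness obligation from the text: the value cannot depend on
# the factorisation chosen, since (x*y) . s = (x . s) * (y . s) always holds.
for x in range(1, 25):
    for y in range(1, 25):
        for s in range(1, 13):
            assert transfer(x * y, s) == (transfer(x, s) * transfer(y, s)) % 13
```

The homomorphism property follows directly from additivity of Ω, i.e. Ω(xy) = Ω(x) + Ω(y).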

Defining the homomorphism Given the operation and the structure T, we want h : T → S such that

1. h((1)_{s∈S}) = 1_S,
2. h((n_s)_{s∈S} × (m_s)_{s∈S}) = h((n_s)_{s∈S}) ×_S h((m_s)_{s∈S}), and
3. h is surjective.


A uniform way to use an operation in this situation is to take

    h((n_s)_{s∈S}) = ∏_{s∈S} n_s . s.

We can then check the three conditions. Commutativity and associativity are needed to prove the second homomorphism property, via induction on the number of values in (n_s)_{s∈S} that are not 1.

Thus the result is established.

7 Discussion

This looks reasonably hopeful for automation; we have a proof outline which can be expressed in a higher-order language, verified in general, and used to guide the choice of components in a given application. The proof planning framework provides control for instantiation of the needed components in a middle-out way. More work is needed to see how generally applicable this is, and to what extent it can be automated.

These results are interesting for search, because they tell us that once we have established a small number of inductive theorems in the object theory, there is no need for more inductive proofs when considering goals of a certain shape.

In contrast to other approaches, no assumption about decidability is made here; the restriction to easy reasoning improves the situation for search, even if the problem remains undecidable. The assumption here is that hard reasoning subsumes easy reasoning, i.e. that easy reasoning steps are a subset of the steps permitted in hard reasoning: in this case, we are left with a smaller search space. However, it does not give us any guarantee of finding shorter proofs – it may well be that there are short hard proofs that require longer easy proofs.

On the other hand, this sort of result allows us to transfer decision procedures, if we have them (e.g. of provability from a few algebraic properties), to other domains. These are interesting properties for modularisation of search.

References

Baader, F. & Nipkow, T. (1998), Term Rewriting and All That, Cambridge University Press.

Baader, F. & Snyder, W. (2001), Unification theory, in J. A. Robinson & A. Voronkov, eds, ‘Handbook of Automated Reasoning’, Vol. I, Elsevier, chapter 8, pp. 447–553.

Buchberger, B. (2004), Algorithm-supported mathematical theory exploration: A personal view and strategy, in B. Buchberger & J. A. Campbell, eds, ‘Artificial Intelligence and Symbolic Computation, 7th International Conference, AISC 2004’, number 3249 in ‘Lecture Notes in Computer Science’, Springer, pp. 236–250. url: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.88.9625

Buchberger, B. (2006), ‘Mathematical theory exploration’. Invited talk at IJCAR-06. url: http://www.easychair.org/FLoC-06/buchberger_ijcar_floc06.pdf

Buchberger, B. (2008), ‘Theory exploration versus theorem proving: Why automated theorem proving has little impact on mathematics’. url: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.68.3031

Bundy, A. (1991), A science of reasoning, in J.-L. Lassez & G. Plotkin, eds, ‘Computational Logic: Essays in Honor of Alan Robinson’, MIT Press, pp. 178–198.


Chang, C. C. & Keisler, H. J. (1973), Model Theory, North-Holland, Amsterdam.

Collins, G. (1975), Quantifier elimination for real closed fields by cylindrical algebraic decomposition, in H. Brakhage, ed., ‘Automata Theory and Formal Languages: 2nd GI Conference Kaiserslautern, May 20–23, 1975’, Vol. 33 of Lecture Notes in Computer Science, Springer, Berlin, pp. 134–183. url: http://dx.doi.org/10.1007/3-540-07407-4_17

Henkin, L. (1977), Algebraic aspects of logic: past, present, future, in ‘Colloque International de Logique’, Éditions du Centre National de la Recherche Scientifique, Paris, pp. 89–106.

Janicic, P. & Bundy, A. (2007), Automatic synthesis of decision procedures, in M. Kauers, M. Kerber, R. Miner & W. Windsteiger, eds, ‘Towards Mechanized Mathematical Assistants: 14th Symposium, Calculemus 2007’, Vol. 4573 of Lecture Notes in Artificial Intelligence, Springer, pp. 80–93.

Johansson, M., Dixon, L. & Bundy, A. (2010), Dynamic rippling, middle-out reasoning, and lemma discovery, in ‘Walther Festschrift’, number 6463 in LNCS, Springer, pp. 102–116.

Kraan, I., Basin, D. & Bundy, A. (1996), ‘Middle-out reasoning for synthesis and induction’, Journal of Automated Reasoning 16(1–2), 113–145. Also available from Edinburgh as DAI Research Paper 729.


Abstract Domains for Bit-Level Machine Integer and Floating-point Operations

Antoine Miné

CNRS & École Normale Supérieure
45, rue d’Ulm
75005 Paris
[email protected]

Abstract

We present a few lightweight numeric abstract domains to analyze C programs that exploit the binary representation of numbers in computers, for instance to perform “compute-through-overflow” on machine integers, or to directly manipulate the exponent and mantissa of floating-point numbers. On integers, we propose an extension of intervals with a modular component, as well as a bitfield domain. On floating-point numbers, we propose a predicate domain to match, infer, and propagate selected expression patterns. These domains are simple, efficient, and extensible. We have included them into the Astrée and AstréeA static analyzers to supplement existing domains. Experimental results show that they can improve the analysis precision at a reasonable cost.

1 Introduction

Semantic-based static analysis is an invaluable tool to help ensure the correctness of programs, as it allows discovering program invariants at compile-time and fully automatically. Abstract interpretation [9] provides a systematic way to design static analyzers that are sound but approximate: they infer invariants which are not necessarily the tightest ones. A central concept is that of abstract domains, which consist of a set of program properties together with a computer representation and algorithms to compute sound approximations in the abstract of the effect of each language instruction. For instance, the interval domain [9] allows inferring variable bounds. Bound properties allow expressing the absence of many run-time errors (such as arithmetic and array overflows) but, due to approximations, the inferred bounds may not be sufficiently precise to imply the desired safety assertions (e.g., in the presence of loops). An effective static analyzer for run-time errors, such as Astrée [7], uses additional domains to infer local and loop invariants of a more complex form (e.g., octagons [20]) and derive tighter bounds.

Most numeric domains naturally abstract an ideal semantics based on perfect integers or rationals, while computers actually use binary numbers with a fixed number of digits. One solution is to adapt the domains to take into account hardware limitations: overflows are detected and treated as errors, while floating-point semantics is simulated by introducing rounding errors [17]. While this works well in many cases, it is not sufficient to analyze programs that perform overflows on purpose (expecting a wrap-around semantics) or that rely on the precise binary representation of numbers. The goal of this article is to propose a set of simple, lightweight numeric abstract domains that are aware of these aspects.

* This work is supported by the INRIA project “Abstraction” common to CNRS and ENS in France.


    char add1(char x, char y) {
      return (char)
        ((unsigned char)x +
         (unsigned char)y);
    }

    (a)

    char add2(char x, char y) {
      unsigned register r1,r2,r3;
      r1 = x; r2 = y;
      r3 = r1 + r2;
      return r3;
    }

    (b)

Figure 1: Integer “compute-through-overflow” examples. char is assumed to be signed.

    union u { int i[2]; double d; };

    double cast(int i) {
      union u x,y;
      x.i[0] = 0x43300000;
      y.i[0] = x.i[0];
      x.i[1] = 0x80000000;
      y.i[1] = i ^ x.i[1];
      return y.d - x.d;
    }

    (a)

    double sqrt(double d) {
      double r;
      unsigned* p = (unsigned*)&d;
      int e = (*p & 0x7fe00000) >> 20;
      *p = (*p & 0x801fffff) | 0x3fe00000;
      r = ((c1*d+c2)*d+c3)*d+c4;
      *p = (e/2 + 511) << 20;
      p[1] = 0;
      return d * r;
    }

    (b)

Figure 2: Floating-point computations exploiting the IEEE binary representation. On the right, c1 to c4 are unspecified constant coefficients of a polynomial approximation. A 32-bit big-endian processor is assumed (e.g., PowerPC).

1.1 Motivating Examples

Figure 1.a presents a small C function that adds two signed bytes (char) by casting them to unsigned bytes before the addition and casting the result back to signed bytes. The function systematically triggers an overflow on negative arguments, which is detected by an analyzer such as Astrée. Additionally, on widespread architectures, the return value equals x+y due to wrap-around. This programming pattern is used in popular industrial code generators such as TargetLink [10] and is known as “compute-through-overflow.” An analysis not aware of wrap-around will either report dead code (if overflows are assumed to be fatal) or return [−128, 127] (if overflows produce full-range results). Even a wrapping-aware interval analysis will return an imprecise interval for arguments crossing zero, e.g. [−1, 0], as the first cast maps {−1, 0} to {255, 0} and intervals cannot represent non-convex properties (see Sec. 2.6).
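The wrap-around behaviour behind Fig. 1.a can be simulated directly. The following sketch is our own illustration of the concrete semantics (the helper names are ours, assuming two's complement hardware); it checks that add1 agrees with mathematical addition whenever the true sum fits in a signed char, despite the intermediate overflow.

```python
# Our illustration (not the paper's formal semantics) of the casts in
# Fig. 1(a), assuming 8-bit two's complement chars.

def to_unsigned_char(x):
    """(unsigned char)x: wrap into [0, 255]."""
    return x % 256

def to_signed_char(x):
    """(char)x on two's complement hardware: wrap into [-128, 127]."""
    x %= 256
    return x - 256 if x >= 128 else x

def add1(x, y):
    """Fig. 1(a): cast to unsigned char, add, cast the sum back."""
    return to_signed_char(to_unsigned_char(x) + to_unsigned_char(y))

# Despite the intermediate overflow, add1 computes x + y whenever the true
# sum fits in a signed char:
for x in range(-128, 128):
    for y in range(-128, 128):
        if -128 <= x + y <= 127:
            assert add1(x, y) == x + y
```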

A variant is shown in Fig. 1.b, where the casts are implicit and caused by copies between variables of different types. This pattern is used to ensure that arithmetic computations are performed in CPU registers only, using a pool of register variables with irrelevant signedness (i.e., register allocation is explicit and not entrusted to the compiler).

Figure 2.a presents a C function exploiting the binary representation of floating-point numbers based on the IEEE standard [13]. It implements a conversion from 32-bit integers to 64-bit floats by first constructing the float representations for x.d = 2^52 + 2^31 and y.d = 2^52 + 2^31 + i using integer operations, and then computing y.d − x.d = i as a float subtraction. This code is similar to the assembly code generated by compilers when targeting CPUs missing the conversion instruction (such as PowerPC). Some code generators choose to provide their own C implementation instead of relying on the compiler (for instance, to improve the traceability of


the assembly code). Figure 2.b exploits the binary encoding of floats to implement a square root: the argument is split into an exponent e and a mantissa in [1, 4] (computed by masking the exponent in d); then the square root of the mantissa is evaluated through a polynomial, while the exponent is simply halved. In both examples, a sound analyzer not aware of the IEEE floating-point encoding will return the full float range.
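The conversion trick of Fig. 2.a can be replayed outside C. The sketch below is our own illustration, assuming IEEE-754 binary64 doubles; Python's struct module mimics the big-endian union access, and the hexadecimal constants are those of the figure.

```python
# Our re-enactment of Fig. 2(a), assuming IEEE-754 doubles: build the bit
# patterns of 2^52 + 2^31 and 2^52 + 2^31 + i, reinterpret as doubles,
# and subtract.
import struct

def cast(i):
    """Convert a 32-bit signed int to a double using bit operations and one
    float subtraction, mimicking the C code (big-endian word order)."""
    # High word 0x43300000 with low word 0x80000000 encodes 2^52 + 2^31.
    x = struct.unpack(">d", struct.pack(">II", 0x43300000, 0x80000000))[0]
    # XOR-ing the low word with i (two's complement) encodes 2^52 + 2^31 + i.
    y = struct.unpack(">d", struct.pack(">II", 0x43300000,
                                        0x80000000 ^ (i & 0xffffffff)))[0]
    return y - x

for i in (0, 1, -1, 2**31 - 1, -2**31):
    assert cast(i) == float(i)
```

The subtraction is exact because both operands are at most 2^53 in magnitude, so no rounding occurs.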

These examples may seem disputable, yet they are representative of actual industrial code (the examples have been modified for the sake of exposition). The underlying programming patterns are supported by many compilers and code generators. An industrial-strength static analyzer is expected to accept existing coding practices and handle them precisely.

1.2 Contribution

We introduce a refined concrete semantics taking bit manipulations into account, and present several abstractions to infer precise bounds for the codes in Figs. 1–2.

Section 2 focuses on integers with wrap-around: we propose an interval domain extended with a modular component and a bitfield domain abstracting each bit separately. Handling the union type and pointer cast from Fig. 2 requires a specific memory model, which is described in Sec. 3. Section 4 presents a bit-level float domain based on pattern matching enriched with predicate propagation. Section 5 presents experimental results using the Astrée and AstréeA static analyzers. Finally, Sec. 6 concludes.

The domains we present are very simple and lightweight; they have a limited expressiveness. They are intended to supplement, not replace, classic domains when analyzing programs featuring bit-level manipulations. Moreover, they are often slight variations on existing domains [9, 15, 23, 19, 20]. We stress the fact that these domains, and the change of concrete semantics they require, have been incorporated into existing industrial analyzers, to enrich the class of programs they can analyze precisely, at low cost and with no precision regression on previously analyzed codes.

1.3 Related Work

The documentation [2] for the PolySpace analyzer suggests removing, prior to an analysis, all computes-through-overflows, and provides a source filter based on regular expressions to do so. This solution is fragile and can miss casts (e.g., when they are not explicit, as in Fig. 1.b) or cause unsoundness (in case the pattern is too inclusive and transforms unrelated code parts), while the solution we propose is semantic-based.

Various domains supporting modular arithmetic have been proposed, such as simple [11] and affine congruences [12, 24]. Masdupuy introduced interval congruences [15] to analyze array indices; our modular intervals are a slightly simpler restriction and feature operators adapted to wrap-around. Simon and King propose a wrap-around operator for polyhedra [26]; in addition to being costly, it outputs convex polyhedra, while our examples require the inference of non-convex invariants locally. Abstracting each bit of an integer separately is a natural idea that has been used, for instance, by Monniaux [23] and Regehr et al. [25]. Brauer et al. [8] propose a bit-blasting technique to design precise transfer functions for small blocks of integer operations, which can bypass the need for more expressive (e.g., disjunctive) local invariants.
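The benefit of a modular component can be seen on the cast from Sec. 1.1. The following is our own much-simplified illustration, not the domain actually defined in Sec. 2: an element written [lo, hi] + 256·Z concretizes to a possibly non-convex set of unsigned char values, which a plain interval must over-approximate by its convex hull.

```python
# Our simplified illustration (not the paper's Sec. 2 domain): the modular
# element "[lo, hi] + 256*Z" describes a possibly non-convex set of unsigned
# char values, while a plain interval can only keep the convex hull.

def modular_concretization(lo, hi):
    """Values of [lo, hi] + 256*Z, wrapped into the unsigned char range."""
    return {v % 256 for v in range(lo, hi + 1)}

def interval_cast(lo, hi):
    """Best plain-interval abstraction of the same cast."""
    values = modular_concretization(lo, hi)
    return (min(values), max(values))

# Casting [-1, 0]: the modular element keeps exactly {255, 0}, while the
# plain interval degrades to the near-full range [0, 255].
assert modular_concretization(-1, 0) == {255, 0}
assert interval_cast(-1, 0) == (0, 255)
```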

We are not aware of any abstract domain able to handle bit-level operations on floating-point numbers. Unlike classic predicate abstraction [6], our floating-point predicate domain includes its own fast and ad-hoc (but limited) propagation algorithm instead of relying on an external generic tool.


    int-type ::= (signed | unsigned)? (char | short | int | long | long long) n?   (n ∈ N*)

    bitsize : int-type → N*
    signed : int-type → {true, false}

    range(t) =def  [0, 2^bitsize(t) − 1]                          if ¬signed(t)
                   [−2^(bitsize(t)−1), 2^(bitsize(t)−1) − 1]      if signed(t)

Figure 3: Integer C types and their characteristics.
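The range map of Fig. 3 is direct to transcribe. The sketch below is our own; the bit size is passed as a parameter, since the paper leaves the bitsize map implementation-specific.

```python
# Our transcription of range(t) from Fig. 3, with the implementation-specific
# bitsize map replaced by an explicit parameter.

def type_range(bitsize, signed):
    """(min, max) of an integer type of the given bit size and signedness."""
    if signed:
        return (-2 ** (bitsize - 1), 2 ** (bitsize - 1) - 1)
    return (0, 2 ** bitsize - 1)

assert type_range(8, True) == (-128, 127)
assert type_range(8, False) == (0, 255)
```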

2 Integer Abstract Domains

2.1 Concrete Integer Semantics

In this section, we focus on integer computations. Before designing a static analysis, we need to provide a precise, mathematical definition of the semantics of programs. We base our semantics on the C standard [5], extended with hypotheses on the representation of data-types necessary to analyze the programs in Fig. 1.

The C language mixes operators based on mathematical integers (addition, etc.) and operators based on the binary representation of numbers (bit-wise operators, shifts). At the hardware level, however, all integer computations are performed in registers of fixed bit-size. Thus, one way to define the semantics is to break it down at the bit level (i.e., “bit-blasting” [8]). We choose another route and express the semantics using classic mathematical integers in Z. Our semantics is higher-level than a bit-based one, which provides some advantages: on the concrete level, it makes the classic arithmetic C operations (+, -, *, /, %) straightforward to express; on the abstract level, it remains compatible with abstract domains expressed on perfect numbers (such as polyhedra). We show that this choice does not preclude the definition of bit-wise operators nor bit-aware domains.

2.2 Integer Types

Integer types in C come in different sizes and can be signed or unsigned. We present in Fig. 3 the type grammar for integers, int-type, including bitfields that can only appear in structures (when a bit size n is specified). The bit size and signedness of types are partly implementation-specific. We assume that they are specified by two maps: bitsize : int-type → N* and signed : int-type → {true, false}. Moreover, we assume that unsigned integers are represented using a pure binary representation: b_{n−1} ⋯ b_0 ∈ {0, 1}^n represents Σ_{i=0}^{n−1} 2^i b_i, and signed integers use two's complement representation: b_{n−1} ⋯ b_0 ∈ {0, 1}^n represents Σ_{i=0}^{n−2} 2^i b_i − 2^{n−1} b_{n−1}. Although this is not required by the C standard, it is the case for all the popular architectures.¹

The range (i.e., the set of acceptable values) of each type is derived as in Fig. 3.
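To make the two encodings concrete, here is a small Python sketch; the helper names `unsigned_value` and `signed_value` are ours, for illustration only, and are not part of the paper's formalism:

```python
def unsigned_value(bits):
    """Pure binary encoding: b_{n-1}...b_0 represents sum of 2^i * b_i."""
    return sum(2 ** i * b for i, b in enumerate(bits))  # bits[i] is b_i

def signed_value(bits):
    """Two's complement: sum_{i<n-1} 2^i * b_i  -  2^(n-1) * b_{n-1}."""
    n = len(bits)
    return sum(2 ** i * bits[i] for i in range(n - 1)) - 2 ** (n - 1) * bits[n - 1]

# The 8-bit pattern 1111_1111 is 255 as an unsigned char, -1 as a signed char.
all_ones = [1] * 8
```

With this, `unsigned_value(all_ones)` is 255 while `signed_value(all_ones)` is −1, which is exactly the pair of values contrasted in the promotion example of Sec. 2.3.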

2.3 Integer Expressions

We consider here only a pure, side-effect-free, integer fragment of C expressions, as depicted in Fig. 4. To stay concise, we include only arithmetic and bit-wise operators and casts. Moreover, statements are reduced to assignments and assertions (which are sufficient to model programs as control-flow graphs). We perform a static transformation that makes all wrap-around effects

¹ The C standard allows some features that we do not handle: padding bits, trap representations, one's complement representations or sign-magnitude representations of negative numbers, and negative zeros.


expr ::= n                  (constant n ∈ Z)
       | V                  (variable V ∈ V)
       | (int-type) expr    (cast)
       | ⋄ expr             (unary operation, ⋄ ∈ { -, ~ })
       | expr ⊙ expr        (binary operation, ⊙ ∈ { +, -, *, /, %, &, |, ^, >>, << })

stat ::= V = expr           (assignment)
       | assert(expr)       (assertion)

Figure 4: Fragment of integer C syntax.

τ : expr → int-type

τ(n) ∈ int-type (given)        τ(V) ∈ int-type (given)

τ(⋄ e) ≜ promote(τ(e))         τ((t) e) ≜ t

τ(e1 ⊙ e2) ≜  lub(promote(τ(e1)), promote(τ(e2)))   if ⊙ ∈ { +, -, *, /, %, &, |, ^ }
              promote(τ(e1))                         if ⊙ ∈ { <<, >> }

where:

promote(t) ≜  int           if rank(t) < rank(int) ∧ range(t) ⊆ range(int)
              unsigned      else if rank(t) < rank(int) ∧ range(t) ⊆ range(unsigned)
              promote(t')   if t is a bitfield type t' n, based on t'
              t             otherwise

lub(t, t') ≜
  when rank(t) ≥ rank(t'):
     t            if signed(t) = signed(t'), or ¬signed(t) ∧ signed(t'),
                  or signed(t) ∧ ¬signed(t') ∧ range(t') ⊆ range(t)
     unsigned t   if signed(t) ∧ ¬signed(t') ∧ range(t') ⊄ range(t)
  when rank(t) < rank(t'): lub(t', t)

rank(char) ≜ 1    rank(short) ≜ 2    rank(int) ≜ 3    rank(long) ≜ 4    rank(long long) ≜ 5
rank(signed t) ≜ rank(unsigned t) ≜ rank(t)

Figure 5: Typing of integer expressions.

explicit in expressions by first typing sub-expressions and then inserting casts. These steps are performed in a front-end and generally not discussed, but we present them to highlight a few subtle points.

Typing. The type τ(e) of an expression e is inferred as in Fig. 5 based on the given type of variables and constants. Firstly, a promotion rule (promote) states that values of type t smaller than int (where the notion of "smaller" is defined by the rank function) are promoted to int, if int can represent all the values in type t, and to unsigned otherwise. Values of bitfield type t n are promoted as their corresponding base type t. Secondly, for binary operators, the type of the result is inferred from that of both arguments (lub). Integer promotion causes values with the same binary representation but different types to behave differently. For instance,


⦅ V ⦆ ≜ V
⦅ n ⦆ ≜ (τ(n)) n
⦅ (t) e ⦆ ≜ (t) ⦅ e ⦆
⦅ ⋄ e ⦆ ≜ let t = τ(⋄ e) in (t) (⋄ (t) ⦅ e ⦆)
⦅ e1 ⊙ e2 ⦆ ≜ if ⊙ ∈ { +, -, *, /, %, &, |, ^ } then:
                 let t = τ(e1 ⊙ e2) in (t) ((t) ⦅ e1 ⦆ ⊙ (t) ⦅ e2 ⦆)
              if ⊙ ∈ { <<, >> } then:
                 let t = τ(e1) in (t) ((t) ⦅ e1 ⦆ ⊙ (unsigned 5) ⦅ e2 ⦆)
⦅ V = e ⦆ ≜ V = (τ(V)) ⦅ e ⦆
⦅ assert(e) ⦆ ≜ assert(⦅ e ⦆)

Figure 6: Insertion of implicit casts.

⟦ expr ⟧ : E → P(Z)

⟦ V ⟧ρ ≜ { ρ(V) }        ⟦ n ⟧ρ ≜ { n }
⟦ (t) e ⟧ρ ≜ { wrap(v, range(t)) | v ∈ ⟦ e ⟧ρ }
⟦ ⋄ e ⟧ρ ≜ { ⋄ v | v ∈ ⟦ e ⟧ρ }
⟦ e1 ⊙ e2 ⟧ρ ≜ { v1 ⊙ v2 | v1 ∈ ⟦ e1 ⟧ρ ∧ v2 ∈ ⟦ e2 ⟧ρ ∧ (v2 ≠ 0 ∨ ⊙ ∉ { /, % }) }

where wrap(v, [ℓ, h]) ≜ min { v' | v' ≥ ℓ ∧ ∃k ∈ Z : v = v' + k(h − ℓ + 1) }

⟦ stat ⟧ : E → P(E)

⟦ V = e ⟧ρ ≜ { ρ[V ↦ v] | v ∈ ⟦ e ⟧ρ }
⟦ assert(e) ⟧ρ ≜ { ρ | ∃v ∈ ⟦ e ⟧ρ : v ≠ 0 }

Figure 7: Concrete semantics.

unsigned char a = 255 and signed char b = -1 have the same representation, but a >> 1 = 127 while b >> 1 = -1. Integer promotion is said to be "value preserving", as opposed to "representation preserving" [4]. This rule comforts us in our decision to focus on the integer value of variables instead of their binary representation.
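The promotion example can be replayed in Python, whose unbounded integers with arithmetic shift behave like the promoted int computation; the decoding helper `char_value` is a hypothetical illustration, not the analyzer's code:

```python
def char_value(byte, signed):
    """Decode an 8-bit pattern as unsigned char, or as signed char (two's complement)."""
    return byte - 256 if signed and byte >= 128 else byte

a = char_value(0xFF, signed=False)   # 255
b = char_value(0xFF, signed=True)    # -1

# After integer promotion, >> operates on the promoted *value*; Python's
# arithmetic shift on unbounded ints models exactly this value-preserving rule.
results = (a >> 1, b >> 1)
```

Both variables start from the same bit pattern 0xFF, yet `results` is (127, −1), matching the value-preserving behaviour described above.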

Cast introduction. The translation of expressions, ⦅·⦆, is presented in Fig. 6. Firstly, before applying an operator, its arguments are converted to the type of the result. This can lead to wrap-around effects. For instance, in (int)-1 + (unsigned)1, the left argument is converted to unsigned, which gives 2^32 − 1 on a 32-bit architecture. The case of bit shift operators is special; as shifting by an amount exceeding the bit-size of the result is undefined in the C standard, we model instead the behavior of Intel 32-bit hardware: the right argument is masked to keep only the lower 5 bits (abusing bitfield types). Secondly, casts are introduced to ensure that the value of the result lies within the range of its inferred type. Finally, before storing a value into a variable, it is converted to the type of the variable.

2.4 Operator Semantics

After translation, the semantics of expressions can be defined in terms of integers, without any reference to C types. The arithmetic operators (+, -, *, /, %) have their classic meaning in Z.² To

² Note that / rounds towards zero and that a % b ≜ a − (a/b)*b.


define bit-wise operations (~, &, |, ^, <<, >>) on Z, we first associate an (infinite) bit pattern to each integer in Z. It is an element of the boolean algebra B = ({0, 1}^N, ¬, ∧, ∨) with pointwise negation ¬, logical and ∧, and logical or ∨ operators. The pattern p(x) ∈ B of an integer x ∈ Z is defined using an infinite two's complement representation:

p(x) ≜  (b_i)_{i∈N}  where b_i = ⌊x/2^i⌋ mod 2,           if x ≥ 0
        (¬b_i)_{i∈N} where (b_i)_{i∈N} = p(−x − 1),        if x < 0          (1)

The elements in B are reminiscent of 2-adic integers, but we restrict ourselves to those representing regular integers. The function p is injective, and we note p⁻¹ its inverse, which is only defined on sequences that are stable after a certain index (∃i : ∀j ≥ i : b_j = b_i). The bit-wise C operators are given a semantics in Z, based on their natural semantics in B, as follows:

~x ≜ p⁻¹(¬p(x)) = −x − 1        x & y ≜ p⁻¹(p(x) ∧ p(y))
x | y ≜ p⁻¹(p(x) ∨ p(y))        x ^ y ≜ p⁻¹(p(x) ⊕ p(y))
x << y ≜ ⌊x × 2^y⌋              x >> y ≜ ⌊x × 2^(−y)⌋                        (2)

where ⊕ is the exclusive or and ⌊·⌋ rounds towards −∞. This semantics is compatible with that of existing arbitrary precision integer libraries [1, 22].
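These definitions agree with the behaviour of arbitrary-precision integers in languages such as Python, whose ints act as the infinite two's complement patterns of (1); this gives a quick executable oracle for (2):

```python
# Python's unbounded ints already implement the Z-based bit-wise semantics:
# negative numbers behave as infinite two's complement bit patterns, and
# shifts are multiplications / floor-divisions by powers of two.
checks = [
    ~10 == -11,            # ~x = p^-1(not p(x)) = -x - 1
    (-2 & 0xFF) == 0xFE,   # ...11111110 masked down to 8 bits
    (-1 | 12345) == -1,    # the all-ones pattern absorbs under |
    (-1 ^ 5) == -6,        # xor against all ones is complement
    (-5 >> 1) == -3,       # floor(-5 / 2): rounds towards -infinity
    (3 << 4) == 48,        # 3 * 2^4
]
```

Every entry of `checks` is True, so the arbitrary-precision semantics can be used to test implementations of the operators in (2), as the GMP-style libraries cited above do.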

2.5 Expression Semantics

Given environments ρ ∈ E ≜ V → Z associating integer values to variables, we can define the semantics ⟦·⟧ of expressions and statements as shown in Fig. 7. The semantics of wrap-around is modeled by the wrap function. Our semantics is non-deterministic: expressions (resp. statements) return a (possibly empty) set of values (resp. environments). This is necessary to define the semantics of errors that halt the program (e.g., division by zero). Non-determinism is also useful to design analyses that must be sound with respect to several implementations at once. We could for instance relax our semantics and return a full range instead of the modular result in case of an overflow in signed arithmetic (as the result is undefined by the standard).
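The wrap function of Fig. 7 admits a one-line arithmetic reading; a minimal Python model (hypothetical helper, assuming finite bounds):

```python
def wrap(v, lo, hi):
    """Least v' >= lo with v = v' + k*(hi - lo + 1) for some integer k:
    reduce v into [lo, hi] modulo the size of the range."""
    m = hi - lo + 1
    return lo + (v - lo) % m

# Casting (int)-1 to unsigned on a 32-bit architecture:
u = wrap(-1, 0, 2 ** 32 - 1)      # 2^32 - 1
# Casting the int value 255 to signed char:
c = wrap(255, -128, 127)          # -1
```

This matches the cast-introduction example of Sec. 2.3: `u` is 2^32 − 1 and `c` is −1.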

All it takes to adapt legacy domains to our new semantics is an abstraction of the wrap operator. A straightforward but coarse one would state that wrap♯(v, [ℓ, h]) ≜ {v} if v ∈ [ℓ, h], and [ℓ, h] otherwise (see also [26] for a more precise abstraction on polyhedra). In the following, we will introduce abstract domains specifically adapted to the wrap-around semantics.

2.6 Integer Interval Domain D♯_i

We recall very briefly the classic interval abstract domain [9]. It maps each variable to an interval of integers:

D♯_i ≜ { [ℓ, h] | ℓ, h ∈ Z ∪ {±∞} }

As it is a non-relational domain, the abstract semantics of expressions, and so, of assignments, can be defined by structural induction, replacing each operator ⊙ on Z with an abstract version ⊙♯_i on intervals. Abstract assertions are slightly more complicated and require backward operators; we refer the reader to [18, § 2.4.4] for details. We recall [9] that optimal abstract operators can be systematically designed with the help of a Galois connection (α, γ):

[ℓ1, h1] ⊙♯_i [ℓ2, h2] ≜ α_i({ v1 ⊙ v2 | v1 ∈ γ_i([ℓ1, h1]), v2 ∈ γ_i([ℓ2, h2]) })
γ_i([ℓ, h]) ≜ { x ∈ Z | ℓ ≤ x ≤ h }
α_i(X) ≜ [min X, max X]


-♯_m ([ℓ, h] + kZ) ≜ [−h, −ℓ] + kZ
~♯_m ([ℓ, h] + kZ) ≜ [−h − 1, −ℓ − 1] + kZ

([ℓ1, h1] + k1 Z) ⊙♯_m ([ℓ2, h2] + k2 Z) ≜
   ([ℓ1, h1] ⊙♯_i [ℓ2, h2]) + gcd(k1, k2) Z   if ⊙ ∈ { +, -, *, ∪, ▽ }
   ([ℓ1, h1] ⊙♯_i [ℓ2, h2]) + 0Z              if k1 = k2 = 0 and ⊙ ∈ { /, %, &, |, ^, >>, << }
   [−∞, +∞] + 0Z                              otherwise

wrap♯_m ([ℓ, h] + kZ, [ℓ', h']) ≜
   let k' = gcd(k, h' − ℓ' + 1) in
   [wrap(ℓ, [ℓ', h']), wrap(h, [ℓ', h'])] + 0Z   if (ℓ' + k'Z) ∩ [ℓ + 1, h] = ∅
   [ℓ, h] + k'Z                                  otherwise

where gcd is the greatest common divisor, and gcd(0, x) = gcd(x, 0) = x.

Figure 8: Abstract operators in the modular interval domain D♯_m.

For instance, [ℓ1, h1] +♯_i [ℓ2, h2] ≜ [ℓ1 + ℓ2, h1 + h2]. Additionally, an abstraction ∪♯_i of the set union ∪, as well as a widening ▽♯_i [9] enforcing termination, are required. We note that the optimal abstraction of wrap is:

wrap♯_i ([ℓ, h], [ℓ', h']) =  [wrap(ℓ, [ℓ', h']), wrap(h, [ℓ', h'])]   if (ℓ' + (h' − ℓ' + 1)Z) ∩ [ℓ + 1, h] = ∅
                              [ℓ', h']                                 otherwise

which returns the full interval [ℓ', h'] when [ℓ, h] crosses a boundary in ℓ' + (h' − ℓ' + 1)Z. This is the case when { wrap(v, [ℓ', h']) | v ∈ [ℓ, h] } is not convex.

Example. Consider the function add1 in Fig. 1.a and assume that, in the abstract, the computed range for both arguments is [−1, 1]. Then (unsigned char)[−1, 1] is abstracted as [0, 255]. This is the best interval abstraction as, in the concrete, the value set is {0, 1, 255}. The following interval addition, which is computed in type int due to integer promotion, gives [0, 510]. Once cast back to signed char, the interval result is the full range [−128, 127]. This is much coarser than the concrete result, which is [−2, 2].
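The precision loss in this example can be replayed with a toy interval sketch; `add_i` and `wrap_i` are hypothetical helpers (finite bounds only), where `wrap_i` implements the optimal interval abstraction of wrap described above:

```python
def add_i(a, b):
    # Optimal interval addition: [l1,h1] + [l2,h2] = [l1+l2, h1+h2]
    return (a[0] + b[0], a[1] + b[1])

def wrap_i(a, lo, hi):
    """Optimal interval abstraction of wrap into [lo, hi] (wrap#_i)."""
    m = hi - lo + 1
    wl = lo + (a[0] - lo) % m
    wh = lo + (a[1] - lo) % m
    # Exact iff [a[0], a[1]] does not cross a boundary of lo + m*Z,
    # which (for width < m) is equivalent to the images staying ordered.
    if a[1] - a[0] < m and wl <= wh:
        return (wl, wh)
    return (lo, hi)

e = wrap_i((-1, 1), 0, 255)            # (0, 255): a boundary is crossed
s = wrap_i(add_i(e, e), -128, 127)     # (-128, 127): full signed char range
```

The final result `s` is the full range (−128, 127), much coarser than the concrete (−2, 2), reproducing the loss that motivates the modular domain of Sec. 2.7.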

2.7 Modular Interval Domain D♯_m

We now propose a slight variation on the interval domain that corrects the precision loss in wrap♯_i. It is defined as intervals with an extra modular component:

D♯_m ≜ { [ℓ, h] + kZ | ℓ, h ∈ Z ∪ {±∞}, k ∈ N }
γ_m([ℓ, h] + kZ) ≜ { x + ky | ℓ ≤ x ≤ h, y ∈ Z }

The domain was mentioned briefly in [14] but not defined formally. It is also similar to the interval congruences θ·[ℓ, u]⟨m⟩ by Masdupuy [15], but with θ set to 1, which results in simpler abstract operations (abstract values with θ ≠ 1 are useful when modeling array accesses, as in [15], but not when modeling wrap-around).

Abstract operators ⊙♯_m are defined in Fig. 8. There is no longer a best abstraction in general (e.g., { 0, 2, 4 } could be abstracted as either [0, 4] + 0Z or [0, 0] + 2Z, which are both minimal and yet incomparable), which makes the design of operators with a guaranteed precision difficult. We designed ~♯_m, the unary and binary -♯_m, +♯_m, and ∪♯_m based on optimal interval and simple congruence operators [11], using basic coset identities to infer modular information. The result may not be minimal (in the sense of minimizing γ_m(x♯ ⊙♯_m y♯)) but suffices in practice. For other operators, we simply revert to classic intervals, discarding any modulo information.

We now focus our attention on wrap♯_m([ℓ, h] + kZ, [ℓ', h']). Similarly to intervals, wrapping [ℓ, h] + kZ results in a plain interval if no [ℓ, h] + ky, y ∈ Z, crosses a boundary in ℓ' + (h' − ℓ' + 1)Z, in which case the operation is exact. Otherwise, it returns the interval argument [ℓ, h] modulo both h' − ℓ' + 1 and k. This forgets that the result is bounded by [ℓ', h'] but keeps important information: the values ℓ and h. In practice, we maintain the range [ℓ', h'] by performing a reduced product between D♯_m and plain intervals D♯_i, which ensures that each operator (except the widening) is at least as precise as in an interval analysis.

Example. Consider again the function add1 from Fig. 1.a, with abstract arguments [−1, 1] + 0Z. Then, e = (unsigned char)[-1,1] is abstracted as [−1, 1] + 256Z. Thus, e+e gives [−2, 2] + 256Z. Finally, (char)(e+e) gives back the expected interval: [−2, 2] + 0Z.
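The same computation can be replayed with a toy modular-interval sketch; the helpers are hypothetical, and the boundary test is done by naive enumeration, so it assumes small finite bounds:

```python
from math import gcd

# A modular interval is a triple (l, h, k) denoting [l, h] + k*Z.

def add_m(a, b):
    # Fig. 8 addition: add the intervals, gcd the moduli.
    return (a[0] + b[0], a[1] + b[1], gcd(a[2], b[2]))

def wrap_m(l, h, k, lo, hi):
    """wrap#_m per Fig. 8: exact if no point of lo + k'*Z lies in [l+1, h]."""
    k2 = gcd(k, hi - lo + 1)          # note gcd(0, x) = x
    boundary_hit = any((x - lo) % k2 == 0 for x in range(l + 1, h + 1))
    if not boundary_hit:
        m = hi - lo + 1
        return (lo + (l - lo) % m, lo + (h - lo) % m, 0)
    return (l, h, k2)

e = wrap_m(-1, 1, 0, 0, 255)          # (-1, 1, 256): boundary 0 lies in [0, 1]
s = wrap_m(*add_m(e, e), -128, 127)   # add gives (-2, 2, 256); wrap is exact
```

Here `s` is (−2, 2, 0), i.e. the expected [−2, 2] + 0Z, instead of the full signed char range produced by plain intervals.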

2.8 Bitfield Domain D♯_b

The interval domain D♯_i is not very precise on bit-level operators. On the example of Fig. 2.b, (range(int) & 0x801fffff) | 0x3fe00000 is abstracted as range(int), which does not capture the fact that some bits are fixed to 0 and others to 1. This issue can be solved by a simple domain that tracks the value of each bit independently. A similar domain was used by Monniaux when analyzing unsigned 32-bit integer computations in a device driver [23]. Our version, however, abstracts a Z-based concrete semantics, making it independent from bit-size and signedness. The domain associates to each variable two integers, z and o, that represent the bit masks for bits that can be set respectively to 0 and to 1:

D♯_b ≜ Z × Z
γ_b(z, o) ≜ { b | ∀i ≥ 0 : (¬p(b)_i ∧ p(z)_i) ∨ (p(b)_i ∧ p(o)_i) }
α_b(S) ≜ (∨ { ¬p(b) | b ∈ S }, ∨ { p(b) | b ∈ S })

where p is defined in (1). The optimal abstract operators can be derived through the Galois connection (α_b, γ_b) between P(Z) and Z × Z. We present the most interesting ones in Fig. 9. They use the bit-wise operators on Z defined in (2). For bit shifts, we only handle the case where the right argument represents a positive singleton (i.e., it has the form n♯_b for some constant n ≥ 0). Wrapping around an unsigned interval [0, 2^n − 1] is handled by masking high bits. Wrapping around a signed interval [−2^n, 2^n − 1] additionally performs a sign extension. Our domain has infinite increasing chains (e.g., X♯_n = (−1, 2^n − 1)), and so, requires a widening: ▽♯_b will set all the bits to 0 (resp. 1) if the mask for bits at 0 (resp. at 1) is not stable.

Efficiency. The three domains D♯_i, D♯_m, D♯_b are non-relational, and so, very efficient. Using functional maps, even joins and widenings can be implemented with a sub-linear cost in practice [7, §III.H.1]. Each abstract operation costs only a few integer operations. Moreover, the values encountered during an analysis generally fit in a machine word. Our analyzer uses an arbitrary precision library able to exploit this fact to improve the performance [22].


n♯_b ≜ (~n, n)   (n ∈ Z)

~♯_b (z, o) ≜ (o, z)
(z1, o1) &♯_b (z2, o2) ≜ (z1 | z2, o1 & o2)
(z1, o1) |♯_b (z2, o2) ≜ (z1 & z2, o1 | o2)
(z1, o1) ^♯_b (z2, o2) ≜ ((z1 & z2) | (o1 & o2), (z1 & o2) | (o1 & z2))
(z1, o1) <<♯_b (z2, o2) ≜ ((z1 << n) | ((1 << n) − 1), o1 << n),   when ∃n ≥ 0 : (z2, o2) = n♯_b
(z1, o1) >>♯_b (z2, o2) ≜ (z1 >> n, o1 >> n),                      when ∃n ≥ 0 : (z2, o2) = n♯_b
(z1, o1) ∪♯_b (z2, o2) ≜ (z1 | z2, o1 | o2)
(z1, o1) ▽♯_b (z2, o2) ≜ (z1 ▽ z2, o1 ▽ o2)   with x ▽ y ≜ if x = x | y then x else −1

wrap♯_b ((z, o), [0, 2^n − 1]) ≜ (z | (−2^n), o & (2^n − 1))
wrap♯_b ((z, o), [−2^n, 2^n − 1]) ≜ ((z & (2^n − 1)) | (−2^n z_n), (o & (2^n − 1)) | (−2^n o_n))

where z_n and o_n denote bit n of z and o, i.e., p(z)_n and p(o)_n.

Figure 9: Abstract operators in the bitfield domain D♯_b. See also (2).

synt : (L × P(L)) → expr

synt((V, o, t), C) ≜
   c                     if c = (V, o, t) ∈ C
   (t) c                 if c = (V, o, t') ∈ C ∧ t, t' ∈ int-type ∧ bitsize(t) = bitsize(t')
   hi-word-of-dbl(c)     if c = (V, o, t') ∈ C ∧ t ∈ int-type ∧ t' = double ∧ bitsize(t) = 32
   dbl-of-word(c1, c2)   if c1 = (V, o, t') ∈ C ∧ c2 = (V, o + 4, t') ∈ C ∧ t = double ∧
                            t' ∈ int-type ∧ bitsize(t') = 32
   range(t)              otherwise

Figure 10: Cell synthesis to handle physical casts. A big-endian architecture is assumed.

3 Memory Abstract Domain

A prerequisite to analyze the programs in Fig. 2 is to detect and handle physical casts, that is, the re-interpretation of a portion of memory as representing an object of a different type through the use of union types (Fig. 2.a) or pointer casts (Fig. 2.b).

In this section, we are no longer restricted to integers and consider arbitrary C expressions and types. Nevertheless, our first step is to reduce expressions to the following simplified grammar:

expr ::= n | &V | ⋄ expr | expr ⊙ expr | (scalar-type) expr | *_scalar-type expr
stat ::= (*_scalar-type expr) = expr | assert(expr)
scalar-type ::= int-type | float | double | char *

Memory accesses (including field and array accesses, variable reads and modifications) are through typed dereferences *_t e, and pointer arithmetic is reduced to arithmetic on byte-based memory addresses. The translation to such expressions can be performed statically by a front-end. For instance, in Fig. 2.a, the statement y.i[1] = i ^ x.i[1] is translated into *_int(&y + 4) = (*_int &i) ^ (*_int(&x + 4)). We do not detail this translation further.

A low-level semantics would model memory states as maps in { (V, i) ∈ V × N | i < bitsize(τ(V)) } → { 0, 1 } from bit positions to bit values. We consider instead a slightly higher-level concrete semantics: the memory is a collection of cells in L ≜ { (V, o, t) ∈


V × N × scalar-type | o + sizeof(t) ≤ sizeof(V) }, where V is the variable the cell lives in, o is its offset in bytes from the start of V, and t is the cell type (integer, float, or pointer). A concrete environment ρ is a partial function from cells in L to values in V, where V contains all the integer, float, and pointer values. Pointer values are represented as pairs composed of a variable name V ∈ V and a byte offset o ∈ N from the start of the variable, and written as &V + o. Cells are added on-demand when writing into memory: an assignment *_t e = e' creates a cell (V, o, t) for each pointer &V + o pointed to by e. Reading a cell (V, o, t) from a memory with cell set C ⊆ L returns ρ(V, o, t) if (V, o, t) ∈ C. If (V, o, t) ∉ C, we try to synthesize its value. An example synthesis function, synt, is presented in Fig. 10: it returns an expression over-approximating the value of a cell (V, o, t) using only cells in C. Firstly, if t is an integer type and there is an integer cell (V, o, t') ∈ C with the same size but t ≠ t', its value is converted to t; i.e., a physical cast *((int*)&V) is treated as a regular cast (int)V. Secondly, selected bit-level manipulations of floats are detected and translated into expressions using two new operators, hi-word-of-dbl and dbl-of-word, that denote, respectively, extracting the most significant 32-bit integer word of a 64-bit double and building a 64-bit double from two words (this semantics is formalized in Sec. 4.1). Thirdly, if no synthesis can occur, the whole range of t is returned.

This concrete semantics is then abstracted in a straightforward way: a specific memory domain maintains a set C ⊆ L of current cells and delegates the abstraction of P(C → V) to an underlying domain. Pointer values are abstracted by maintaining an explicit set of pointed-to variables C → P(V) and delegating the abstraction of offsets to a numeric domain. Given an expression and an abstract environment, the memory domain resolves pointer dereferences, synthesizes cell values, and dynamically translates expressions into simple numeric expressions, similar to Fig. 4 extended with the operators introduced during synthesis. Such expressions can then be handled directly by numeric domains, as shown in Sec. 4.2.

This memory domain was introduced in [19] and we will not discuss it further. Note however that the synthesis was limited to integer cells in [19], while we extend it here to express some bit-level float manipulations.³

Example. In Fig. 2.a, writing into x.i[0] and x.i[1] creates the cells c1 = (x, 0, int) and c2 = (x, 4, int). Reading back x.d amounts to evaluating the expression dbl-of-word(c1, c2). In Fig. 2.b, the expression (*p & 0x7fe00000) >> 20 is translated into (hi-word-of-dbl(c) & 0x7fe00000) >> 20, where the cell c = (d, 0, double) represents the variable d.

4 Floating-Point Abstract Domains

4.1 Concrete Bit-Level Floating-Point Semantics

We now consider the analysis of programs manipulating floating-point numbers. Due to the limited precision of computers, float arithmetic exhibits rounding errors. For many purposes, it is sufficient to model floats as reals, with rounding abstracted as a non-deterministic choice in an error interval. This permits the use of the classic abstract domains defined on rationals or reals (intervals, but also relational domains [17]) to model floating-point computations. However, this is not sufficient when the bit representation of numbers is exploited, as in Fig. 2.

We introduce a bit-level semantics based on the ubiquitous IEEE 754-1985 standard [13]. We focus here on double-precision numbers, which occupy 64 bits: a 1-bit sign s, an 11-bit

³ Our analyzer (Sec. 5) extends this further to synthesize and decompose integers, bitfields, and 32-bit floats (not presented here for the sake of conciseness).


dbl : {0, 1}^64 → V

dbl(s, e_10, …, e_0, m_0, …, m_51) ≜
   (−1)^s × (1 + Σ_{i=0}^{51} 2^(−i−1) m_i) × 2^((Σ_{i=0}^{10} 2^i e_i) − 1023)   if Σ_{i=0}^{10} 2^i e_i ∉ { 0, 2047 }
   (−1)^s × (Σ_{i=0}^{51} 2^(−i−1) m_i) × 2^(−1022)                               if ∀i : e_i = 0
   (−1)^s × ∞                                                                     if ∀i : e_i = 1 ∧ ∀j : m_j = 0
   NaN                                                                            if ∀i : e_i = 1 ∧ ∃j : m_j = 1

⟦ dbl-of-word(e1, e2) ⟧ρ ≜ { dbl(b¹_31, …, b¹_0, b²_31, …, b²_0) |
                             ∀j ∈ {1, 2} : Σ_{i=0}^{31} 2^i b^j_i ∈ ⟦ wrap(e_j, [0, 2^32 − 1]) ⟧ρ }
⟦ hi-word-of-dbl(e) ⟧ρ ≜ { Σ_{i=0}^{31} 2^i b_{i+32} | ∃b_0, …, b_31 : dbl(b_63, …, b_0) ∈ ⟦ e ⟧ρ }

Figure 11: Bit-level concrete semantics of floating-point numbers, extending Fig. 7.

exponent e_0 to e_10, and a 52-bit mantissa m_0 to m_51. The mapping from bit values to float values is described by the function dbl in Fig. 11. In addition to normalized numbers (of the form ±1.m_0 m_1 ⋯ × 2^x), the standard allows denormalized (i.e., small) numbers, signed infinities ±∞, and NaNs (Not a Number, standing for error codes). This representation gives a concrete semantics to the operators dbl-of-word and hi-word-of-dbl introduced by cell synthesis. As in the case of integers, legacy abstract domains can be adapted to our new semantics by defining an abstraction for our new operators. Straightforward ones would state that dbl-of-word♯(e1, e2) = range(double) and hi-word-of-dbl♯(e) = range(unsigned), but we propose more precise ones below.
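The two operators have a direct executable counterpart via byte-level reinterpretation; a Python sketch using the standard struct module on a big-endian view (the helper names are ours, mirroring the paper's operators):

```python
import struct

def hi_word_of_dbl(d):
    """Most significant 32-bit word of the IEEE-754 encoding of a double."""
    hi, lo = struct.unpack('>II', struct.pack('>d', d))
    return hi

def dbl_of_word(hi, lo):
    """Build a double from its high and low 32-bit words (big-endian view)."""
    return struct.unpack('>d', struct.pack('>II', hi, lo))[0]
```

For instance, `hi_word_of_dbl(1.0)` is 0x3FF00000 (sign 0, biased exponent 1023, top mantissa bits 0), and `dbl_of_word(0x40000000, 0)` rebuilds 2.0.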

4.2 Predicate Domain on Binary Floating-Point Numbers D♯_p

The programs in Fig. 2 are idiomatic. It is difficult to envision a general domain that can reason precisely about arbitrary binary floating-point manipulations. Instead, we propose a lightweight and extensible technique based on pattern matching of selected expression fragments. However, matching each expression independently is not sufficient: it provides only a local view that cannot model computations spread across several statements precisely enough. We need to infer and propagate semantic properties to gain in precision.

To analyze Fig. 2.a, we use a domain D♯_p of predicates of the form V = e, where V ∈ V and e is an expression chosen from a fixed list P with a parameter W ∈ V. At most one predicate is associated to each V, so an abstract element is a map from V to P ∪ {⊤}, where ⊤ denotes the absence of a predicate:

D♯_p ≜ V → (P ∪ {⊤})   where:
P ::= W ^ 0x80000000                 (W ∈ V)
    | dbl-of-word(0x43300000, W)     (W ∈ V)

γ_p(X♯_p) ≜ { ρ ∈ E | ∀V ∈ V : X♯_p(V) = ⊤ ∨ ρ(V) ∈ ⟦ X♯_p(V) ⟧ρ }        (3)

The concretization is the set of environments that satisfy all the predicates, and there is no Galois connection. Figure 12 presents a few example transfer functions. Assignments and tests operate on a pair of abstractions: a predicate X♯_p and an interval X♯_i. Sub-expressions from e are combined by combine with predicates from X♯_p to form idioms. The assignment then removes (in Y♯_p) the predicates about the modified variable (var(p) is the set of variables appearing in p) and tries to infer a new predicate. The matching algorithm is not purely syntactic: it uses semantic interval information from X♯_i (e.g., to evaluate sub-expressions). Dually, successfully


⟦ V = e ⟧♯_p (X♯_p, X♯_i) ≜
   let Y♯_i = ⟦ V = combine(e, X♯_p, X♯_i) ⟧♯_i X♯_i in
   let Y♯_p = λW : if W = V or V ∈ var(X♯_p(W)) then ⊤ else X♯_p(W) in
   if e = W ^ e1 ∧ ⟦ e1 ⟧♯_i X♯_i ∈ { [2^31, 2^31], [−2^31, −2^31] }
      then (Y♯_p[V ↦ W ^ 0x80000000], Y♯_i)
   else if e = dbl-of-word(e1, W) ∧ ⟦ e1 ⟧♯_i X♯_i = [1127219200, 1127219200]
      then (Y♯_p[V ↦ dbl-of-word(0x43300000, W)], Y♯_i)
   otherwise (Y♯_p, Y♯_i)

⟦ assert(e) ⟧♯_p (X♯_p, X♯_i) ≜ (X♯_p, ⟦ assert(combine(e, X♯_p, X♯_i)) ⟧♯_i X♯_i)

X♯_p ∪♯ Y♯_p ≜ λV : if X♯_p(V) = Y♯_p(V) then X♯_p(V) else ⊤

combine(e, X♯_p, X♯_i) replaces sub-expressions V1 - V2 in e with (double) I when:
   ∃V'1, V'2 : ∀j ∈ {1, 2} : X♯_p(V_j) = dbl-of-word(0x43300000, V'_j) ∧
   X♯_p(V'1) = I ^ 0x80000000 ∧ X♯_i(V'2) = [−2^31, −2^31]

Figure 12: Abstract operator examples in the predicate domain D♯_p (note that 1127219200 = 0x43300000).

double frexp(double d, int* e) {
    int x = 0;
    double r = 1, dd = fabs(d);
    if (dd >= 1) { while /* (A) */ (dd > r) { x++; /* (B) */ r *= 2; } }
    else         { while (dd < r) { x--; r /= 2; } }
    *e = x;
    return d/r;
}

Figure 13: Floating-point decomposition into mantissa and exponent. (A) and (B) mark the program points referred to in the text.

matched idioms refine the interval information. The join on D♯_p removes the predicates that are not identical. As D♯_p is flat, it has no need for a widening. Similarly to non-relational domains, the cost of each operation is sub-linear when implemented with functional maps [7, §III.H.1].

To stay concise, we only present the transfer functions sufficient to analyze Fig. 2.a. It is easy to enrich P with new predicates and extend the pattern matching and the propagation rules to accommodate other idioms. For instance, Fig. 2.b can be analyzed by adding the predicate V = hi-word-of-dbl(W) and using information gathered by the bitfield domain D♯_b during pattern matching.
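For concreteness, the idiom these predicates are designed to recognize (converting a 32-bit int to a double through its bit representation, as in Fig. 2.a) can be replayed in Python; the helper names are hypothetical, and struct stands in for the C union:

```python
import struct

# The "magic" double with high word 0x43300000 (= 1127219200) and low word
# 0x80000000 is exactly 2^52 + 2^31.
MAGIC = struct.unpack('>d', struct.pack('>II', 0x43300000, 0x80000000))[0]

def int_to_double_via_bits(i):
    """Idiomatic bit-level int-to-double conversion for i in [-2^31, 2^31)."""
    lo = (i ^ 0x80000000) & 0xFFFFFFFF          # wrap to [0, 2^32 - 1]
    d = struct.unpack('>d', struct.pack('>II', 0x43300000, lo))[0]
    # d = 2^52 + 2^31 + i exactly, so the subtraction recovers i without error
    return d - MAGIC
```

The predicates V = W ^ 0x80000000 and V = dbl-of-word(0x43300000, W), combined with the rule on V1 - V2, let the analyzer conclude that the result equals (double) i, which is what this sketch confirms on concrete values.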

4.3 Exponent Domain D♯_e

As a last example, we consider the program in Fig. 13 that decomposes a float into its exponent and mantissa. Although it is possible to extract exponents using bit manipulations, as in Fig. 2.b, this function uses another method, which illustrates the analysis of loops: when |d| ≥ 1, it computes r = 2^x iteratively, incrementing x until r ≥ |d|. This example is, as those in Figs. 1–2, inspired from actual industrial code and out of scope of existing abstract domains.

To provide precise bounds for the returned value, a first step is to bound r. We focus on


        size      with domains      w/o domains      pre-processed
        (KLoc)    time    alarms    time    alarms   time    alarms
          154     10h44       22    10h04       22   11h38       22
          186      7h44       10     7h22       10    7h16       10
          103      54mn        2     44mn      451    46mn        6
(a)       493      7h34        3    15h27    1,833    8h40      195
          661     14h46        2    16h23    3,419   13h32      253
          616     22h03        5    26h46    5,350   20h45      300
          859     65h08      110    41h03    5,968   59h55      316
        2,428     48h28        1    44h06    3,822   44h57      674
(b)       113      25mn       30     20mn       30    17mn       30
(c)        79      3h22        7     3h09        7    3h27        7
(d)       102      46mn       64     59mn       64    59mn       64
        1,727     30h18    2,133    26h15    2,388   28h46    2,190

Figure 14: Analysis performance on industrial benchmarks.

the case where |d| ≥ 1. To prove that r and d/r are bounded, it is necessary to infer at point (A) the loop invariant r ≤ 2d and use the loop exit condition d ≤ r. Several solutions exist to infer this invariant (such as using the polyhedra domain). We advocate the use of a variation D♯_e of the, more efficient, zone domain [16]. While the original domain infers invariants of the form V − W ∈ [ℓ, h], we infer invariants of the form V/W ∈ [ℓ, h]:

D♯_e ≜ (V × V) → { [ℓ, h] | ℓ, h ∈ R ∪ {±∞} }
γ_e(X♯_e) ≜ { ρ ∈ E | ∀V, W ∈ V : ρ(W) = 0 ∨ ρ(V)/ρ(W) ∈ X♯_e(V, W) }
α_e(X) ≜ λ(V, W) : [min { ρ(V)/ρ(W) | ρ ∈ X }, max { ρ(V)/ρ(W) | ρ ∈ X }]

The domain is constructed by applying a straightforward log transformation to [16]. A near-linear cost can be achieved by using packing techniques [7, §III.H.5].

Now that r is bounded, in order to prove that x is also bounded, it is sufficient to infer a relationship between x and r, i.e., r = 2^x at point (A), and r = 2^(x−1) at point (B). This is possible, for instance, by using the predicate domain D♯_p from Sec. 4.2 enriched with a new predicate parameterized by a variable W and an integer i:

P' ::= P | 2^(W+i)   (W ∈ V, i ∈ Z)

To stay concise, we do not describe the adapted transfer functions.
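As a sanity check, the |d| ≥ 1 branch of Fig. 13 can be transliterated with the inferred invariants turned into runtime assertions; `frexp_check` is a hypothetical helper that checks the invariants on concrete runs, not the static analysis itself:

```python
def frexp_check(d):
    """|d| >= 1 branch of Fig. 13, asserting r = 2^x and r <= 2*dd at the loop head."""
    x, r, dd = 0, 1.0, abs(d)
    assert dd >= 1
    while dd > r:                         # point (A)
        assert r == 2.0 ** x and r <= 2 * dd
        x += 1                            # point (B): here r = 2^(x-1)
        r *= 2.0
    assert dd <= r <= 2 * dd              # exit condition plus D#_e-style bound
    return d / r, x                       # mantissa in (0.5, 1], exponent x
```

For example, `frexp_check(1000.0)` returns (0.9765625, 10), and multiplying the mantissa back by 2^10 recovers 1000.0; the bound dd ≤ r ≤ 2·dd is exactly what makes the returned value d/r provably bounded.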

5 Experimental Results

All the domains we presented have been incorporated into the Astrée static analyzer for synchronous embedded C programs [3, 7] and its extension AstréeA for multi-threaded programs [21]. Both check for arithmetic and memory errors (integer and float overflows and invalid operations, array overflows, invalid pointer dereferences). Thanks to a modular design based on a partially reduced product of communicating domains, new domains can easily be added.

Figure 14 presents the running time and number of alarms when analyzing large (up to 2.5 MLoc) C applications from the aeronautic industry. Figure 14.a corresponds to a family


of control-command software; each one consists of a single, very large reactive loop generated from a graphical specification language (similar to Lustre) and features many floating-point computations (integrations, digital filters, etc.). Figure 14.b is a piece of software that performs hardware checks; it features many loops and is mainly integer-based. Figures 14.c–d are embedded communication software that manipulate strings and buffers; they feature many physical casts to pack, transmit, and unpack typed and untyped messages, as well as some amount of boolean and numeric control. Additionally, the applications in Fig. 14.d are multi-threaded. More details on the analyzed applications are available in [7, III.M] and [21].

In all the tests, the default domains are enabled (intervals, octagons, binary decision trees, etc.; we refer the reader to [7] for an exhaustive list). The first and second columns show, respectively, the result with and without our new domains. In many cases (shown in boldface), our domains strictly improve the precision. Moreover, the improved analysis is between twice as slow and twice as fast. This variation can be explained as follows: adding new domains incurs a cost per abstract operation, but improving the precision may decrease or increase the number of loop iterations needed to reach a fixpoint. In the third column, our new domains are disabled but the code is pre-processed with ad-hoc syntactic scripts to (partially) remove the need for them (e.g., replacing the cast function in Fig. 1.b with a C cast), which is possibly unsound and incomplete (hence the remaining alarms). Comparing the first and last columns shows that being sound does not cause a significant loss of efficiency.

6 Conclusion

We presented several abstract domains to analyze C code exploiting the binary representation of integers and floats. They are based on a concrete semantics that allows reasoning about these low-level implementation aspects in a sound way, while being compatible with legacy abstract domains. Each introduced domain focuses on a specific coding practice, so they are not general. However, they are simple, fast to design and implement, efficient, and easy to extend. They have been included in the industrial-strength analyzer Astrée [3] to supplement existing domains and enable the precise analysis of code exploiting the binary representation of numbers without resorting to unsound source pre-processing.

Future work includes generalizing our domains and developing additional domains specialized to code patterns we will encounter when analyzing new programs, with the hope of building a library of domains covering most analysis needs.

References

[1] GMP: The GNU multiple precision arithmetic library. http://gmplib.org/.

[2] How do I get rid of red OVFL outside PolySpace for Model Link TL? http://www.mathworks.fr/support/solutions/en/data/1-5D3ONT/.

[3] AbsInt, Angewandte Informatik. Astrée run-time error analyzer. http://www.absint.com/astree.

[4] ANSI Technical Committee and ISO/IEC JTC 1 Working Group. Rationale for international standard, Programming languages, C. Technical Report 897, rev. 2, ANSI, ISO/IEC, Oct. 1999.

[5] ANSI Technical Committee and ISO/IEC JTC 1 Working Group. C standard. Technical Report 9899:1999(E), ANSI, ISO & IEC, 2007.

[6] T. Ball, R. Majumdar, T. Millstein, and S. Rajamani. Automatic predicate abstraction of C programs. In Proc. of the ACM SIGPLAN Conf. on Prog. Lang. Design and Implementation (PLDI'01), pages 203–213. ACM, 2001.


[7] J. Bertrane, P. Cousot, R. Cousot, J. Feret, L. Mauborgne, A. Miné, and X. Rival. Static analysis and verification of aerospace software by abstract interpretation. In AIAA Infotech@Aerospace, number 2010-3385, pages 1–38. AIAA, Apr. 2010.

[8] J. Brauer and A. King. Transfer function synthesis without quantifier elimination. In Proc. of the 20th European Symp. on Prog. (ESOP'11), volume 6602 of LNCS, pages 97–115. Springer, Mar. 2011.

[9] P. Cousot and R. Cousot. Static determination of dynamic properties of programs. In ISOP'76, pages 106–130. Dunod, Paris, France, 1976.

[10] dSpace. TargetLink code generator. http://www.dspaceinc.com.

[11] P. Granger. Static analysis of arithmetic congruences. Int. Journal of Computer Mathematics, 30:165–199, 1989.

[12] P. Granger. Static analysis of linear congruence equalities among variables of a program. In Proc. of the Int. Joint Conf. on Theory and Practice of Soft. Development (TAPSOFT'91), volume 493 of LNCS, pages 169–192. Springer, 1991.

[13] IEEE Computer Society. Standard for binary floating-point arithmetic. Technical report, ANSI/IEEE Std. 754-1985, 1985.

[14] D. Kästner, S. Wilhelm, S. Nenova, P. Cousot, R. Cousot, J. Feret, L. Mauborgne, A. Miné, and X. Rival. Astrée: Proving the absence of runtime errors. In Embedded Real Time Soft. and Syst. (ERTS2 2010), pages 1–9, May 2010.

[15] F. Masdupuy. Semantic analysis of interval congruences. In Proc. of the Int. Conf. on Formal Methods in Prog. and Their Applications (FMPTA'93), volume 735 of LNCS, pages 142–155. Springer, 1993.

[16] A. Miné. A new numerical abstract domain based on difference-bound matrices. In Proc. of the 2nd Symp. on Programs as Data Objects (PADO II), volume 2053 of LNCS, pages 155–172. Springer, May 2001.

[17] A. Miné. Relational abstract domains for the detection of floating-point run-time errors. In Proc. of the European Symp. on Prog. (ESOP'04), volume 2986 of LNCS, pages 3–17. Springer, 2004.

[18] A. Miné. Weakly Relational Numerical Abstract Domains. PhD thesis, École Polytechnique, Palaiseau, France, Dec. 2004.

[19] A. Miné. Field-sensitive value analysis of embedded C programs with union types and pointer arithmetics. In Proc. of the ACM Conf. on Lang., Compilers, and Tools for Embedded Syst. (LCTES'06), pages 54–63. ACM, June 2006.

[20] A. Miné. The octagon abstract domain. Higher-Order and Symbolic Computation, 19(1):31–100, 2006.

[21] A. Miné. Static analysis of run-time errors in embedded critical parallel C programs. In Proc. of the 20th European Symp. on Prog. (ESOP'11), volume 6602 of LNCS, pages 398–418. Springer, Mar. 2011.

[22] A. Miné, X. Leroy, and P. Cuoq. ZArith: Arbitrary precision integers library for OCaml. http://forge.ocamlcore.org/projects/zarith.

[23] D. Monniaux. Verification of device drivers and intelligent controllers: a case study. In Proc. of the 7th ACM & IEEE Int. Conf. on Embedded Soft. (EMSOFT'07), pages 30–36. ACM, Sep. 2007.

[24] M. Müller-Olm and H. Seidl. Analysis of modular arithmetic. In Proc. of the 14th European Symp. on Prog. (ESOP'05), volume 3444 of LNCS, pages 46–60. Springer, Apr. 2005.

[25] J. Regehr and U. Duongsaa. Deriving abstract transfer functions for analyzing embedded software. In Proc. of the ACM Conf. on Lang., Compilers, and Tools for Embedded Syst. (LCTES'06), pages 34–43. ACM, June 2006.

[26] A. Simon and A. King. Taming the wrapping of integer arithmetic. In Proc. of the 14th Int. Symp. on Static Analysis (SAS'07), volume 4634 of LNCS, pages 121–136. Springer, Aug. 2007.


Abstracts

(ATX)

Yuhui Lin, Alan Bundy and Gudmund Grov
The Use of Rippling to Automate Event-B Invariant Preservation Proofs

Alex Merry and Matvey Soloviev
Rewriting Pattern Graphs

Roy McCasland
Automated Theorem Discovery: a Case Study

(WING)

Friedrich Gretz, Joost-Pieter Katoen, and Annabelle McIver
Prinsys - A Software Tool for the Synthesis of Probabilistic Invariants

Alexei Iliasov
Augmenting formal development with use case reasoning

Daniel Larraz, Enric Rodriguez-Carbonell, and Albert Rubio
SMT-Based Array Invariant Generation

Mengjun Li
Formal Characterization and Verification of Loop Invariant Based on Finite Difference

Lamia Labed Jilani, Wided Ghardallou, Ali Mili
Conclusive Proofs of While Loops Using Invariant Relations

Rajiv Murali and Andrew Ireland
E-SPARK: Automated Generation of Verifiable Code from Formally Verified Designs

Pierre-Loic Garoche, Temesghen Kahsai and Cesare Tinelli
Invariant Stream Generators using Automatic Abstract Transformers based on a Decidable Logic

Ott Tinn
Faster Automatic Test Case Generation


The Use of Rippling to Automate Event-B
Invariant Preservation Proofs

Yuhui Lin
University of Edinburgh, UK
[email protected]

Alan Bundy
University of Edinburgh, UK
[email protected]

Gudmund Grov
University of Edinburgh, UK
[email protected]

Abstract

Proof automation is a common bottleneck for industrial adoption of formal methods. In Event-B, a significant proportion of the proof obligations which require human interaction fall into a family called invariant preservation. In this paper we show that a rewriting technique, called rippling, can increase the automation of proofs in this family. We then extend this technique by combining two existing approaches. An earlier version of this paper has previously been published in [8].

1 Introduction

Event-B [2] is a "top-down" formal modelling method, which captures requirements in an abstract formal specification and then stepwise refines the specification into the final product. Proof obligations (POs) are raised to ensure the correctness of each development step. Most of these POs can be discharged automatically by the Rodin platform [3], yet 3% to 10% of them still require human interaction [6]. In an industrial-sized project, this can amount to thousands of interactive proofs, e.g. 43,610 POs with 3.3% interactive proof in the Roissy Airport Shuttle project [1]. One type of PO, called invariant preservation (INV) POs, can account for a significant proportion of the POs needing interaction. To illustrate, 188 out of 317 (59%) of the undischarged POs in the BepiColombo case study1 are INV POs. Moreover, specifications often change frequently, which requires users to reprove previously proven POs. Here we argue that part of the problem is a lack of meta-level reasoning, since the Rodin provers only work at the object-level logic. We propose to use a meta-level reasoning technique, called rippling [5], for INV POs. Our hypothesis is

By utilising rippling we can increase the automation of Event-B invariant preservation proof obligations and make proofs more robust to changes.

2 INV POs and Rippling

Machines are key components of Event-B specifications. A machine contains variables, invariants and events. Variables represent the states of the machine, and invariants describe

any
    s
where
    s ∈ Subs ∧ s ∉ dom call
then
    call := call ∪ {(s ↦ (seize ↦ ∅))}
end

Notation      Definition
x ↦ y         denotes the pair (x, y)
dom(r)        {x | ∃y. x ↦ y ∈ r}
r ▷ s         {x ↦ y | x ↦ y ∈ r ∧ y ∈ s}
r ; s         {x ↦ y | ∃z. x ↦ z ∈ r ∧ z ↦ y ∈ s}

Figure 1: An example of events & mathematical notations

1 The case study can be found at http://deploy-eprints.ecs.soton.ac.uk/136/


constraints on the states. INV POs are generated to guarantee that the invariants hold in all states: they hold initially, and each event preserves them. To illustrate, consider the invariant

Callers = dom((call ; st) ▷ Connected)    (1)

in which Callers, call, st and Connected are variables. Figure 1 defines an event in which s is an argument followed by a guarded action, describing how the state changes. An INV PO (2) is generated to ensure (1) holds under the changes made in the event:

s ∈ Subs, s ∉ dom call, Callers = dom((call ; st) ▷ Connected) ⊢
Callers = dom(((call ∪ {(s ↦ (seize ↦ ∅))}) ; st) ▷ Connected).

(2)
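To make the PO concrete, the relational operators of Figure 1 can be executed directly on small finite relations. The sketch below does this in Python; all element values ('alice', 'line1', etc.) are invented for illustration, and the single value 'line2' stands in for the nested pair (seize ↦ ∅) added by the event.

```python
# A concrete sanity check of PO (2): evaluate invariant (1) before and
# after the event on small example relations (all values hypothetical).

def dom(r):
    return {x for (x, _) in r}

def compose(r, s):                      # r ; s
    return {(x, y) for (x, z) in r for (z2, y) in s if z == z2}

def rres(r, s):                         # r ▷ s (range restriction)
    return {(x, y) for (x, y) in r if y in s}

Connected = {'connected'}
st = {('line1', 'connected')}
call = {('alice', 'line1')}
Callers = dom(rres(compose(call, st), Connected))   # invariant (1) holds

# Event: pick s ∈ Subs with s ∉ dom(call); 'line2' stands in for
# the pair (seize ↦ ∅) added by the event.
s = 'bob'
assert s not in dom(call)
call_after = call | {(s, 'line2')}

# The new pair maps outside dom(st), so the invariant is preserved.
assert dom(rres(compose(call_after, st), Connected)) == Callers
```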

We have observed that INV POs follow the pattern f(x) ⊢ f(g(x)), that is, the invariant f(x) is embedded in the INV PO f(g(x)). This feature makes rippling applicable. Rippling is a rewriting technique which is applicable in any scenario where one of the assumptions can be embedded in the goal; in the case of INV POs, the invariant is embedded in the goal. The pattern shown above is annotated by marking the differences: the embedding of the hypothesis in the goal is called the skeleton (in rippling notation, the non-boxed part), the boxed differences are the wave-fronts, and an underline marks parts of the skeleton that sit inside (but do not belong to) a wave-front. When a rewrite rule is applied, the skeleton of the goal must be preserved and a ripple measure must decrease, e.g. parts of the skeleton that are separated by a wave-front move closer together. To illustrate, applying the rewrite rule f(g(x)) = h(f(x)) turns the annotated goal into f(x) ⊢ h(f(x)), where the hypothesis f(x) is now a subterm of the goal.
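The two side conditions (skeleton preservation and measure decrease) can be illustrated on terms encoded as nested tuples. This is only a sketch of the idea, not the rippling implementation of [5]; the embedding check and the measure used here are deliberately simplistic.

```python
# Illustrative sketch of the two rippling side conditions on terms
# encoded as nested tuples, e.g. f(g(x)) as ('f', ('g', 'x')).

def embeds(skel, term):
    """skel is embedded in term: equal, argument-wise embedded under the
    same head, or embedded inside some subterm (a wave-front)."""
    if skel == term:
        return True
    if isinstance(term, tuple):
        if (isinstance(skel, tuple) and skel[0] == term[0]
                and len(skel) == len(term)
                and all(embeds(s, t) for s, t in zip(skel[1:], term[1:]))):
            return True
        return any(embeds(skel, t) for t in term[1:])
    return False

def diff_depth(skel, term, d=0):
    """Depth of the shallowest point where term departs from skel
    (the wave-front position); None if they are identical."""
    if skel == term:
        return None
    if (isinstance(skel, tuple) and isinstance(term, tuple)
            and skel[0] == term[0] and len(skel) == len(term)):
        ds = [diff_depth(s, t, d + 1) for s, t in zip(skel[1:], term[1:])]
        ds = [x for x in ds if x is not None]
        return min(ds) if ds else None
    return d

skel = ('f', 'x')                       # hypothesis f(x)
goal = ('f', ('g', 'x'))                # goal f(g(x))
new  = ('h', ('f', 'x'))                # after rewriting f(g(x)) = h(f(x))

# Valid ripple step: skeleton preserved, wave-front moved outwards.
assert embeds(skel, goal) and embeds(skel, new)
assert diff_depth(skel, new) < diff_depth(skel, goal)
```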

3 IsaScheme and lemma conjecture

The key advantage of rippling is that the strong expectation of how the proof should succeed can help us to build a proof-patching mechanism for when a proof is blocked, e.g. due to a missing lemma. It can contribute to proof automation and makes proofs more robust to changes. This mechanism is known as proof critics [4]. One useful critic is lemma speculation [4], which is applicable when proofs are blocked due to a missing lemma. Meta-level annotations are used to guide and construct the lemma being conjectured. Consider the following blocked rippling proof:

Callers = dom(((call ∪ {(s ↦ (seize ↦ ∅))}) ; st) ▷ Connected)

we construct the left-hand side of the missing lemma from one of our wave-fronts and parts of the skeleton, which is (call ∪ {(s ↦ (seize ↦ ∅))}) ; st. Because the skeleton needs to be preserved, we can construct the right-hand side of the lemma by introducing a meta-variable ?F1 to represent the unknown part. Then we have

(call ∪ {(s ↦ (seize ↦ ∅))}) ; st = ?F1 (call ; st) {(s ↦ (seize ↦ ∅))} st

In middle-out reasoning [7], this meta-variable is stepwise instantiated by unification during the proof. However, higher-order unification poses a challenge for this approach. We therefore propose a new approach using IsaScheme [9], a scheme-based approach to instantiating these meta-variables, to generate the missing lemmas. Given a scheme and candidate terms and operations, IsaScheme can instantiate the meta-variables and construct and prove lemmas. The scheme helps constrain the lemmas generated, and we can further filter out those that will not provide valid ripple steps. The following algorithm shows in more detail how to construct a scheme and obtain the potential lemmas for the example in this section (to make it more readable, we write {(s ↦ (seize ↦ ∅))} as X).

2

Page 99: ATX & WING

The Use of Rippling to Automate Event-B INV Proofs Yuhui Lin, Alan Bundy and Gudmund Grov

When no rewriting rules are applicable, the following process can be triggered.

1. Construct the lhs of the scheme from a wave-front and part of the skeleton, i.e. (call ∪ X) ; st.

2. Utilising the rippling properties, we can partially predict how the term evolves on the rhs, i.e. it evolves from (call ∪ X) ; st to call ; st ... We also need a term with meta-variables specifying how the constants and variables in the wave-fronts (i.e. X) and those next to the wave-fronts in the skeleton (i.e. st) may be recombined. In our example this term is (?F2 X st). We then introduce meta-variables to combine these terms into the rhs of the missing lemma, i.e. ... = ?F1 (call ; st) (?F2 X st).

3. We now have a scheme to instantiate, i.e.
   Myscheme ?F1 ?F2 ≡ ((call ∪ X) ; st = ?F1 (call ; st) (?F2 X st))
   in which Myscheme is the name of our scheme, and ?F1 and ?F2 are meta-variables to be instantiated from a set of given terms. We then try this scheme in IsaScheme with the relevant proof context, including assumptions.

4. IsaScheme returns potential lemmas which we can apply to unblock the current proof. In our example we get the lemma (call ∪ X) ; st = (call ; st) ∪ (X ; st), which allows the proof to proceed.
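The lemma returned in step 4 is an ordinary distributivity fact about relations, so it can be sanity-checked on small finite relations (the element values below are invented):

```python
# Sanity check of the speculated lemma (call ∪ X) ; st = (call ; st) ∪ (X ; st)
# on small finite relations; the element values are made up.

def compose(r, s):                      # r ; s
    return {(x, y) for (x, z) in r for (z2, y) in s if z == z2}

call = {(1, 'a'), (2, 'b')}
X = {(3, 'c')}
st = {('a', 'A'), ('b', 'B'), ('c', 'C')}

assert compose(call | X, st) == compose(call, st) | compose(X, st)
```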

4 Conclusion & Further Work

We have shown that rippling is applicable to proving INV POs. We have combined lemma speculation and scheme-based theory exploration for the discovery of missing lemmas when proofs are blocked. This has to be done manually in Rodin. Moreover, with meta-level reasoning and its patching mechanism, the robustness of proofs can be improved, as the proof strategy remains the same even if the POs have to be re-proven when specifications change. Currently these schemes are constructed manually, and next we plan to automate this process.

Acknowledgement: This work is supported by EPSRC grant EP/H024204/1 (AI4FM). Thanks to Omar Montano Rivas, Andrew Ireland, Moa Johansson and the AI4FM partners for useful discussions.

References

[1] J.R. Abrial. Formal methods in industry: achievements, problems, future. In Proceedings of the 28th International Conference on Software Engineering, pages 761–768. ACM, 2006.

[2] J.R. Abrial. Modeling in Event-B: System and Software Engineering. Cambridge University Press, 2010.

[3] J.R. Abrial, M. J. Butler, S. Hallerstede, T. S. Hoang, F. Mehta, and L. Voisin. Rodin: an open toolset for modelling and reasoning in Event-B. STTT, 12(6):447–466, 2010.

[4] A. Ireland. Productive use of failure in inductive proof. Journal of Automated Reasoning, 16(1–2):79–111, March 1996.

[5] A. Bundy. Rippling: Meta-level Guidance for Mathematical Reasoning, volume 56. Cambridge University Press, 2005.

[6] C. B. Jones, G. Grov, and A. Bundy. Ideas for a high-level proof strategy language. Technical Report CS-TR-1210, School of Computing Science, Newcastle University, 2010.

[7] I. Kraan, D. Basin, and A. Bundy. Middle-out reasoning for synthesis and induction. Journal of Automated Reasoning, 16(1):113–145, 1996.

[8] Y. Lin, A. Bundy, and G. Grov. The use of rippling to automate Event-B invariant preservation proofs. NASA Formal Methods, pages 231–236, 2012.

[9] O. Montano-Rivas, R. McCasland, L. Dixon, and A. Bundy. Scheme-based synthesis of inductive theories. Advances in Artificial Intelligence, pages 348–361, 2010.


Rewriting Pattern Graphs

Alex Merry
University of Oxford
[email protected]

Matvey Soloviev
University of Cambridge
[email protected]

April 4, 2012

1 String Graphs

String graphs were originally introduced as open graphs in [4]. They are intended to be graph equivalents of "string diagrams", the graphical structures that arise in monoidal theories (e.g. proof nets [6], Penrose diagrams [9] and quantum information processing languages [2]). This allows familiar automated graph rewriting techniques to be used, such as double-pushout rewriting [5].

In order to cope with peculiarities of these diagrams, such as "dangling edges" and loops containing no nodes, Dixon et al. use two sorts of vertices: inner vertices, which should be considered "normal" vertices and typically represent an operation of some sort in a computational setting, and edge-points, which act as anchors for the "wires" connecting the inner vertices and may carry type information for those wires. The result is a category (SGraph) of graphs with these two types of vertices, in which there are no edges between inner vertices, and edge-points have at most one in-edge and at most one out-edge.

Example 1.1. A diagrammatic presentation of a string graph, with inner vertices v1, v2 and edge-points p1, p2, p3, p4, p5 (diagram omitted).

We refer to an edge-point with no in-edge as an input, and one with no out-edge as an output.
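The SGraph conditions are straightforward to check mechanically. The sketch below assumes the wiring of Example 1.1 is p1, p2 → v1 → p3 → v2 → p4, p5; the encoding of edges as pairs is our own.

```python
from collections import Counter

def is_string_graph(edges, inner):
    """SGraph conditions: no edge between two inner vertices; every
    edge-point has at most one in-edge and at most one out-edge."""
    if any(u in inner and v in inner for (u, v) in edges):
        return False
    indeg = Counter(v for (_, v) in edges)
    outdeg = Counter(u for (u, _) in edges)
    points = {x for e in edges for x in e} - inner
    return all(indeg[p] <= 1 and outdeg[p] <= 1 for p in points)

# Assumed wiring for Example 1.1
edges = [('p1', 'v1'), ('p2', 'v1'), ('v1', 'p3'),
         ('p3', 'v2'), ('v2', 'p4'), ('v2', 'p5')]
inner = {'v1', 'v2'}
assert is_string_graph(edges, inner)
# An edge between two inner vertices violates the conditions:
assert not is_string_graph(edges + [('v1', 'v2')], inner)
```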

2 Pattern Graphs

Pattern graphs are a method of specifying infinite sets of graphs in a finite manner. The concept, and a lot of the terminology, is lifted from linear logic, where ! ("bang") allows resources to be duplicated or deleted. !-boxes were informally introduced in [3], and pattern graphs are described in detail in [8].

Quantomatic [1] makes use of pattern graphs to express rewrite rules, and it is in the context of this tool that a lot of this work has been done. We present a brief summary of pattern graphs here.

An open subgraph O of a string graph G is a full subgraph such that every vertex in G that is not in O but is connected to a vertex in O by an edge in G is an inner vertex.

!-boxes are named open subgraphs that may be copied or deleted ("killed"). They can also be dropped (we forget about the !-box) or merged (when they are disjoint, we combine two !-boxes into one). The naming allows more than one !-box to refer to the same subgraph. The fact that they are open ensures that edges do not bifurcate when we copy them, and that no inputs or outputs are created by killing them.
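Openness is likewise a simple boundary condition: every vertex just outside the subgraph that is adjacent to it must be inner. A sketch, again assuming the Example 1.1 wiring p1, p2 → v1 → p3 → v2 → p4, p5:

```python
# Openness check: O is open in G if every vertex outside O that is
# adjacent to O is an inner vertex. Edges assumed as in Example 1.1.

def is_open(edges, inner, O):
    boundary = {v if u in O else u
                for (u, v) in edges
                if (u in O) != (v in O)}
    return boundary <= inner

edges = [('p1', 'v1'), ('p2', 'v1'), ('v1', 'p3'),
         ('p3', 'v2'), ('v2', 'p4'), ('v2', 'p5')]
inner = {'v1', 'v2'}

assert is_open(edges, inner, {'p1', 'p2'})       # boundary {v1}: open
assert not is_open(edges, inner, {'v1', 'p3'})   # boundary contains p1, p2
```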

For example, the following graph is intended to encompass all graphs with one inner vertex and any number of inputs and outputs (diagram omitted).


!-boxes also have a partial order that respects the subgraph relation. If the subgraph denoted by one !-box is contained within another, the !-box can be marked as being nested within the larger !-box via this order.

Dropping a !-box or merging two !-boxes are both operations that act purely on the !-boxes themselves, without affecting the graph, and they work in the expected manner.

Killing a !-box is just normal graph subtraction, and the other !-boxes are restricted as necessary, unless they are nested within the killed !-box, in which case they are dropped.

When a !-box is copied, all the !-boxes that are nested within it are also copied. All other !-boxes that overlap with the copied !-box are extended to also include both the original intersection and the copy of it.

For example, in a pattern graph in which a vertex v lies in two overlapping !-boxes b1 and b2, copying b1 produces a copy v′ of v and a copy b1′ of b1, and b2 is extended to cover both v and v′ (diagrams omitted).

Definition 2.1. A pattern graph P is a tuple (G, B, β), where G is a string graph, B is a poset, and β is a map from B to open subgraphs of G that is monotonic with respect to the subgraph order.

Definition 2.2. A pattern graph morphism φ : P → Q between two pattern graphs is a string graph morphism φG : GP → GQ together with a poset morphism φB : BP → BQ such that for all b ∈ BP, the preimage of βQ(φB(b)) under φG is βP(b).

Pattern graphs and their morphisms form a category, SPatGraph. There is a forgetful functor U : SPatGraph → SGraph that simply discards all !-box information.

If P can be transformed into Q by a sequence of the !-box operations, we say that Q is derivable from P, and write P ⇝ Q. We also consider a string graph G derivable from P if P ⇝ (G, ∅, ∅).

3 Rewriting Pattern Graphs

Dixon et al. showed that string graphs inherit "enough adhesivity" from their parent category to allow for double-pushout rewriting. [8] demonstrates how pattern graphs can be used to form a rewrite pattern, which encompasses a family of string graph rewrite rules.

A string graph rewrite rule is a span of monos in SGraph in which the span induces bijections on the inputs and outputs of its codomains. A rewrite pattern is a similar construction in SPatGraph which also induces a bijection on !-boxes. The idea is that whatever !-box operations need to be applied to the LHS of the rule to match a graph are also applied to the RHS. The forgetful functor is then applied, and the resulting rewrite rule is used to rewrite the graph.


Our aim is to use rewrite patterns to directly rewrite pattern graphs. This has immediate applications in terms of applying Knuth-Bendix completion to a system of rewrite patterns, but also has a potential use in expressing higher-order rules of graphical systems, for example Kissinger's idea [7] of a form of "graph induction" for deriving rewrite patterns from a system of concrete rewrite rules (such as might be generated by an automated system like QuantoCoSy [7]).

Such a system should take a pattern rule rewriting an F–G pair of vertices to an F–H pair, and use it to rewrite a pattern graph whose K, F and G vertices lie under !-boxes b1 and b2 into the corresponding pattern with each G replaced by H (diagrams omitted).

The semantics here are that a rewrite pattern L → R should rewrite a pattern graph P to another pattern graph Q only when P and Q form a rewrite rule such that for every concrete derivation P′ → Q′ of P → Q, there is a concrete derivation L′ → R′ of L → R that rewrites P′ to Q′.

SPatGraph (or even the category of more general pattern graphs based on arbitrary graphs, rather than string graphs, where we drop the openness requirement) is not, however, adhesive. This is fairly easy to demonstrate, simply by considering the subcategory of pattern graphs based on the empty graph. This is isomorphic to Poset, which is not adhesive.

We believe, however, that this should be possible to overcome by placing suitable restrictions on pattern graph morphisms to prevent maps such as the one taking two nested !-boxes b1 ≤ b2 to three nested !-boxes b1 ≤ b3 ≤ b2, with b3 not in the image of the map. This is a natural restriction that corresponds to the intuition that rewrite rules operate on contiguous subregions of a graph's drawing.

References

[1] Quantomatic. http://sites.google.com/site/quantomatic/.

[2] B. Coecke and R. Duncan. Interacting quantum observables. In ICALP 2008. LNCS, 2008.

[3] L. Dixon and R. Duncan. Graphical reasoning in compact closed categories for quantum computation. Annals of Mathematics and Artificial Intelligence, 56:23–42, 2009. doi:10.1007/s10472-009-9141-x.

[4] L. Dixon, R. Duncan, and A. Kissinger. Open graphs and computational reasoning. In DCM, pages 169–180, 2010.

[5] H. Ehrig, M. Pfender, and H. Jürgen Schneider. Graph-grammars: An algebraic approach. In 14th Annual Symposium on Switching and Automata Theory, pages 167–180. IEEE, 1973.

[6] J.-Y. Girard. Proof-nets: The parallel syntax for proof-theory. In Logic and Algebra, pages 97–124. Marcel Dekker, 1996.

[7] A. Kissinger. Pictures of Processes: Automated Graph Rewriting for Monoidal Categories and Applications to Quantum Computing. PhD thesis, Department of Computer Science, University of Oxford, UK, 2012.

[8] A. Kissinger, A. Merry, and M. Soloviev. Pattern graphs. Submitted to DCM 2012.

[9] R. Penrose. Applications of negative dimensional tensors. In Combinatorial Mathematics and its Applications, pages 221–244. Academic Press, 1971.


Automated Theorem Discovery: a Case Study
(extended abstract)

Roy McCasland

There have recently been efforts made to formalise several well-known mathematical results – the Four-Colour Theorem, the classification of finite simple groups, and Kepler's Conjecture, amongst the more notable examples (Gonthier 2005, Gonthier et al. 2007, Hales 2008). In each of these cases, a great deal of effort has been, and continues to be, required on the part of the users, primarily involving the proving of lemmas which are necessary for the automated theorem-prover to be able to prove the desired result. In some cases this effort takes years to complete.

It would be highly advantageous if a significant portion of the 'development' work of these projects could be automated; i.e., if the lemmas could be automatically 'discovered'. By this, we mean that a system could firstly generate (at least many of) the lemmas that are necessary for the final proof to go through, secondly prove them, and then store them in the system's database for future use.

There are in fact now a number of systems which have recently demonstrated some success in automated theorem discovery, albeit in relatively low-level mathematics. In this talk we present an on-going case study involving one of these systems – MATHsAiD – a new version of the system described in McCasland & Bundy (2006).

MATHsAiD is currently being used to formalise Zariski Spaces. While Zariski Spaces are not nearly so well-known as the aforementioned examples, they nevertheless present a significant challenge to state-of-the-art theorem-provers, let alone any theorem-discovery systems, in part because they involve several theories, including rings, modules, semirings, semimodules and topologies. A traditional mathematical exposition of these ideas is given in McCasland et al. (1998).

The talk will explain the new architecture of MATHsAiD, and the techniques used to allow a high level of automation in building up the theory as a succession of developments at different levels, where it is important to be able to filter the proved results so as to retain just those which are mathematically interesting.

References

Gonthier, G. (2005), 'A computer-checked proof of the four colour theorem', http://research.microsoft.com/~gonthier/4colproof.pdf.

Gonthier, G., Mahboubi, A., Rideau, L., Tassi, E. & Théry, L. (2007), A Modular Formalisation of Finite Group Theory, Research Report RR-6156, INRIA.
url: http://hal.inria.fr/inria-00139131

Hales, T. (2008), ‘Formal proof’, Notices of the AMS 55(11), 1370–80.


McCasland, R. L. & Bundy, A. (2006), MATHsAiD: A mathematical theorem discovery tool, in 'Symbolic and Numeric Algorithms for Scientific Computing (SYNASC '06)', pp. 17–22.
url: doi.ieeecomputersociety.org/10.1109/SYNASC.2006.51

McCasland, R. L., Bundy, A. & Smith, P. F. (2006), 'Ascertaining mathematical theorems', Electronic Notes in Theoretical Computer Science 151, 21–38.

McCasland, R., Moore, M. & Smith, P. (1998), 'An introduction to Zariski Spaces over Zariski Topologies', Rocky Mountain J. Math 28, 1357–1369.


Inferring Loop Invariants Dynamically

Juan Pablo Galeotti and Andreas Zeller
{galeotti, zeller}@cs.uni-saarland.de
Saarland University – Computer Science, Saarbrücken, Germany

There is extensive literature on inferring loop invariants statically (i.e. without explicitly executing the program under analysis). We report on a new dynamic technique for inferring loop invariants based on the invariant detector Daikon [2]. Unlike InvGen [4], this new technique follows a counterexample-guided approach for refining candidate loop invariants. Let us consider the following annotated program for multiplying 16-bit integers (the candidate loop invariants inferred for it are listed after the code):

_(requires 0 <= x < 65535)
_(requires 0 <= y < 65535)
_(ensures \result == x * y)
{
  mult = i = 0;
  while (i < y) {
    mult += x; i++;
  }
  return mult;
}

// Candidate Loop Invariants
#1  x one of { 1, 1316 }
#2  y one of { 1, 131 }
#3  i >= 0
...
#9  i <= y
#10 i == (mult / x)
#11 mult == (x * i)

Our approach starts by finding new test cases using the search-based test suite generator EvoSuite [3]. Then, the dynamic invariant detector collects 11 different loop invariant candidates (excerpt shown above), which we feed to the static verifier VCC [1].

Since the conjunction of all candidates under-approximates the loop invariant, the static verifier fails. Then, EvoSuite guides the generation of new test inputs using the static verifier's error model. The invariant detector synthesizes new candidates (ruling some of them out), which are fed to VCC. This refinement continues until VCC successfully verifies the program (using only candidates #9 and #11).
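The dynamic filtering step can be sketched as follows: run the instrumented loop on the generated tests and keep only the candidates that hold in every observed loop-head state. This is an illustration only (it is neither Daikon nor EvoSuite); the candidate names mirror #3, #9 and #11 above, and the extra candidate and test inputs are invented.

```python
def loop_states(x, y):
    """Instrumented 16-bit multiply: yield every loop-head state."""
    mult = i = 0
    while i < y:
        yield dict(x=x, y=y, i=i, mult=mult)
        mult += x
        i += 1
    yield dict(x=x, y=y, i=i, mult=mult)

candidates = {
    '#3  i >= 0':        lambda s: s['i'] >= 0,
    '#9  i <= y':        lambda s: s['i'] <= s['y'],
    '#11 mult == x * i': lambda s: s['mult'] == s['x'] * s['i'],
    'x == 1316':         lambda s: s['x'] == 1316,   # refuted by new tests
}

for x, y in [(1316, 131), (1, 1), (7, 0)]:           # generated test inputs
    for state in loop_states(x, y):
        candidates = {n: p for n, p in candidates.items() if p(state)}

# Only the genuine invariants survive the dynamic filtering.
assert sorted(candidates) == ['#11 mult == x * i', '#3  i >= 0', '#9  i <= y']
```

The surviving candidates would then be handed to the static verifier for an inductiveness check, as described above.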

The combination of test case generation and Daikon opens the potential for inferring loop invariants even for nontrivial programs. Current challenges include the static verification itself, as well as refining the candidate loop invariants.

The main challenge, however, will be to find appropriate patterns for the most recurrent loop invariants: Daikon itself is limited to at most three related variables, and we will have to expand the search space considerably. Finally, we are also looking for benchmarks so that we can compare against other existing automatic loop invariant detectors, such as InvGen [4].

References

1. Cohen E., Dahlweid M., Hillebrand M., Leinenbach D., Moskal M., Santen T., Schulte W., and Tobies S., VCC: A Practical System for Verifying Concurrent C. TPHOLs, 2009.

2. Ernst M., Cockrell J., Griswold W., and Notkin D., Dynamically discovering likely program invariants to support program evolution. IEEE TSE, 27(2), 2002.

3. Fraser G., and Arcuri A., EvoSuite: Automatic Test Suite Generation for Object-Oriented Software. ESEC/FSE, 2011.

4. Gupta A., Majumdar R., and Rybalchenko A., From Tests To Proofs. TACAS, 2009.


Prinsys – a Software Tool for the Synthesis of Probabilistic Invariants*

Friedrich Gretz1,2, Joost-Pieter Katoen1, and Annabelle McIver2

1 RWTH Aachen University, Germany
2 Macquarie University, Sydney, Australia

Abstract

We are interested in aiding correctness proofs for probabilistic programs, i.e. while-programs enriched with a probabilistic choice operator "[p]" that executes the left alternative with probability p and the right alternative with probability 1 − p. There are tools for non-probabilistic programs that generate invariants for verification purposes [2, 1]. For probabilistic programs the existing tools rely on model checking and are limited to finite-state systems or do not allow parameters [6, 4, 3]. One tool that is based on abstract interpretation is mentioned in [7], but its merits cannot be assessed3. Our novel tool Prinsys4 is based on some of the ideas in [5] and aids the verification of probabilistic programs with loops and real-valued variables. We can derive quantitative properties of a given program by reasoning over program annotations in the style of Hoare logic. Our annotations are generalised to take account of the quantitative properties of probabilistic programs. Prinsys computes sound annotations for probabilistic while loops.

Our approach to generating invariants is constraint-based. Given a template, i.e. the shape of the desired invariant, and a loop, we generate constraints in terms of inequalities between piecewise functions. These constraints are then transformed into universally quantified first-order formulas and solved using off-the-shelf tools. The result is a description of all invariant instances of the template given by the user. Using this semi-automatic approach iteratively, a user can construct an invariant that helps them to prove a property of the program.
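As an illustration of the kind of quantitative annotation involved (this example is ours, not taken from Prinsys, which works symbolically rather than by sampling): for the loop "while c = 1 do (x := x + 1) [1/2] (c := 0)", the expectation x + c is invariant, since a single step from a state with c = 1 leaves its expected value unchanged. A Monte-Carlo sanity check:

```python
import random

def step(x, c):
    """One iteration of: while c = 1 do (x := x + 1) [1/2] (c := 0)."""
    return (x + 1, c) if random.random() < 0.5 else (x, 0)

def I(x, c):                 # candidate quantitative invariant
    return x + c

random.seed(0)
x0, c0 = 3, 1
n = 200_000
est = sum(I(*step(x0, c0)) for _ in range(n)) / n
assert abs(est - I(x0, c0)) < 0.05   # E[I after one step] ≈ I before
```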

In the workshop we would like to report on the latest ongoing work and show some use cases of Prinsys.

References

1. Colón, M., Sankaranarayanan, S., Sipma, H.: Linear Invariant Generation Using Non-linear Constraint Solving. In: CAV, pp. 420–432 (2003)

* This work is supported by the EU FP7 project CARP, the DFG research training group AlgoSyn, Australian Research Council grant DP1092464 and NWO visitor grant 040.11.302.

3 Absinthe appears to be unmaintained and no documentation or download could be found as of this writing.

4 Download the tool at: http://www-i2.informatik.rwth-aachen.de/prinsys/.


2. Gupta, A., Rybalchenko, A.: InvGen: An Efficient Invariant Generator. In: Bouaj-jani, A., Maler, O. (eds.) CAV. LNCS, vol. 5643, pp. 634–640. Springer (2009)

3. Hahn, E.M., Hermanns, H., Wachter, B., Zhang, L.: Pass: Abstraction refinement forinfinite probabilistic models. In: Esparza, J., Majumdar, R. (eds.) TACAS. LectureNotes in Computer Science, vol. 6015, pp. 353–357. Springer (2010)

4. Hahn, E.M., Hermanns, H., Zhang, L.: Probabilistic Reachability for Parametric Markov Models. STTT 13(1), 3–19 (2011)

5. Katoen, J.P., McIver, A., Meinicke, L., Morgan, C.: Linear-Invariant Generation for Probabilistic Programs. In: SAS. pp. 390–406 (2011)

6. Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: Verification of Probabilistic Real-time Systems. In: CAV. pp. 585–591 (2011)

7. Monniaux, D.: An Abstract Analysis of the Probabilistic Termination of Programs. In: Cousot, P. (ed.) SAS. LNCS, vol. 2126, pp. 111–126. Springer (2001)


Augmenting formal development with use case reasoning (abstract)

Alexei Iliasov

Newcastle University, UK

State-based methods for correct-by-construction software development rely on a combination of safety constraints and refinement obligations to demonstrate design correctness. One prominent challenge, especially in an industrial setting, is ensuring that a design is adequate: requirements-compliant and fit for purpose. The paper presents a technique for augmenting state-based, refinement-driven formal developments with reasoning about use case scenarios; in particular, it discusses a way to derive formal verification conditions from a high-level, diagrammatic language of use cases, and the methodological role of use cases in a formal modelling process.

The approach to use case reasoning is based on our previous work on a graphical notation for expressing event ordering constraints [2, 1]. The extension is realised as a plug-in to the Event-B modelling tool set, the Rodin Platform [3], and integrates smoothly into the Event-B modelling process. It provides a modelling environment for working with graph-like diagrams describing event ordering properties. In the simplest case, a node of such a graph is an event of the associated Event-B machine; an edge is a statement about the relative properties of the connected nodes/events. There are three main edge kinds, ena, dis and fis, defined as relations over Event-B events.

U   = {f ↦ g | ∅ ⊂ f ⊆ S × S ∧ ∅ ⊂ g ⊆ S × S}
ena = {f ↦ g | f ↦ g ∈ U ∧ ran(f) ⊆ dom(g)}
dis = {f ↦ g | f ↦ g ∈ U ∧ ran(f) ∩ dom(g) = ∅}
fis = {f ↦ g | f ↦ g ∈ U ∧ ran(f) ∩ dom(g) ≠ ∅}

where f ⊆ S × S is a relational model of an Event-B event (we treat an event as a next-state relation). These definitions are converted into consistency proof obligations. For instance, if a use case graph contains an ena edge connecting events b and h, one has to prove the following theorem (see [2] for a justification).

∀v, v′, p_b · I(v) ∧ G_b(p_b, v) ∧ R_b(p_b, v, v′) ⇒ ∃p_h · G_h(p_h, v′)   (1)
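The three edge kinds can be prototyped directly as operations on finite relations. The sketch below (Python, with events modelled as sets of state pairs, and the nonemptiness side conditions of U folded into each check) is an illustration of the definitions, not part of the plug-in:

```python
def dom(r):
    """Domain of a relation given as a set of (before, after) state pairs."""
    return {a for (a, b) in r}

def ran(r):
    """Range of a relation."""
    return {b for (a, b) in r}

def ena(f, g):
    """Every state reachable by f enables g:  ran(f) ⊆ dom(g)."""
    return bool(f) and bool(g) and ran(f) <= dom(g)

def dis(f, g):
    """No state reachable by f enables g:  ran(f) ∩ dom(g) = ∅."""
    return bool(f) and bool(g) and ran(f).isdisjoint(dom(g))

def fis(f, g):
    """Some state reachable by f enables g:  ran(f) ∩ dom(g) ≠ ∅."""
    return bool(f) and bool(g) and not ran(f).isdisjoint(dom(g))

# Two toy events over states {0, 1, 2}: b steps forward, h is enabled on 1 and 2.
b = {(0, 1), (1, 2)}
h = {(1, 1), (2, 0)}
assert ena(b, h) and fis(b, h) and not dis(b, h)
```

On symbolic Event-B machines the same conditions become the proof obligation (1) above rather than a finite set computation.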

A use case diagram is only defined in association with one Event-B model; it does not exist on its own. The use case plug-in automatically generates all the relevant proof obligations. A change in a diagram or its Event-B model leads to the re-computation of all affected proof obligations. These proof obligations are dealt with, like all other proof obligation types, by a combination of automated provers and interactive proof. As in the proofs of model consistency


[Diagrams omitted; legend:]

1) f ena g                        5) skip(C), C = {v | P(v)}
2) f dis g                        6) f(C), C = {v | D(v)}
3) f fis g                        7) f ena g ∨ f ena h
4) f ena q.g ∧ q.g ena r.h,       8) f ena g ∧ f ena h
   where q = {v | Q(v)}, r = {v | R(v)}

Fig. 1. A summary of the core use case notation and its interpretation.

and refinement, the feedback from an undischarged use case proof obligation may often be interpreted as a suggestion of a diagram change, such as an additional assumption or assertion: predicate annotations on graph edges that propagate properties along the graph structure. The example in the next section demonstrates how such annotations enable the proof of a non-trivial property.

The use case tool offers a rich visual notation. The basic element of a diagram is an event, visually depicted as a node (in Figure 1, f and g represent events). An event's definition (its parameters, guard and action) is imported from the associated Event-B model. One special kind of node is the skip event, denoted by a grey node colour (Figure 1, 5). The event relations ena, dis and fis are represented by edges connecting nodes (Figure 1, 1-3). Depending on how a diagram is drawn, edges are said to be in an and or an or relation (Figure 1, 7-8). New events are derived from model events by strengthening their guards (a case of symmetric assumption and assertion) (Figure 1, 6). Edges may be annotated with constraining predicates inducing assertion- and assumption-derived events (Figure 1, 4). Not shown in Figure 1 are nodes for the initialisation event start (circle), the implicit deadlock event stop (filled circle) and nodes for container elements such as loop (used in the coming example). To avoid visual clutter, the repeating parts of a diagram may be declared separately as diagram aspects [2].

References

1. Alexei Iliasov. Augmenting Event-B Specifications with Control Flow Information. In NODES 2010, May 2010.

2. Alexei Iliasov. Use case scenarios as verification conditions: Event-B/Flow approach. In Proceedings of the 3rd International Workshop on Software Engineering for Resilient Systems, September 2011.

3. The RODIN platform. Online at http://rodin-b-sharp.sourceforge.net/.


SMT-Based Array Invariant Generation*

Daniel Larraz, Enric Rodríguez-Carbonell, and Albert Rubio

Universitat Politècnica de Catalunya, Barcelona, Spain

Discovering loop invariants is an essential task for verifying the correctness of programs, or computer systems in general. In this talk we present a technique for generating universally quantified loop invariants over array variables.

Namely, programs are assumed to consist of unnested loops and to contain linear expressions in assignments, if and while conditions, as well as in array accesses. Now, let a = (A1, . . . , Am) be the array variables of a program. Given a positive integer k > 0, our method generates invariants of the form

∀α : 0 ≤ α ≤ C(v) − 1 :  Σ_{i=1}^{m} Σ_{j=1}^{k} a_ij · A_i[d_ij · α + E_ij(v)] + B(v) + b_α · α ≤ 0

where C, E_ij and B are linear polynomials with integer coefficients over the scalar variables of the program v = (v1, . . . , vn), and a_ij, d_ij, b_α ∈ Z, for all i ∈ {1, . . . , m} and j ∈ {1, . . . , k}. This family of properties is quite general and allows us to handle a wide variety of programs for which we can automatically generate non-trivial invariants.
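For concreteness, a checker that evaluates an invariant of this shape on one concrete state can be sketched as follows (Python; restricted to a single array, i.e. m = 1, with illustrative parameter names — the actual tool generates such invariants rather than merely checking them):

```python
def holds(A, terms, B, b_alpha, C):
    """Evaluate an invariant of the shape
         ∀α. 0 ≤ α ≤ C − 1 :  Σ_j a_j * A[d_j*α + e_j] + B + b_alpha*α ≤ 0
       on a concrete state: one array A, and terms = [(a_j, d_j, e_j)]."""
    for alpha in range(C):
        total = sum(a * A[d * alpha + e] for (a, d, e) in terms)
        if total + B + b_alpha * alpha > 0:
            return False
    return True

# Sortedness of A[0..n-1]:  ∀α. 0 ≤ α ≤ n − 2 :  A[α] − A[α+1] ≤ 0,
# encoded with the terms (a, d, e) = (1, 1, 0) and (−1, 1, 1).
A = [1, 3, 3, 7]
assert holds(A, [(1, 1, 0), (-1, 1, 1)], B=0, b_alpha=0, C=len(A) - 1)
assert not holds([2, 1], [(1, 1, 0), (-1, 1, 1)], B=0, b_alpha=0, C=1)
```

The sortedness encoding shows how a familiar array property instantiates the general template with small integer coefficients.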

Unlike previous approaches based on abstract interpretation or first-order theorem proving, our method builds upon the so-called constraint-based invariant generation approach. This method produces linear invariants, i.e., invariants expressed as linear inequalities over scalar variables, by transforming the problem of the existence of an inductive invariant for a loop into a satisfiability problem in propositional logic over non-linear arithmetic, thanks to Farkas' Lemma. Despite the potential of the method, its application has been limited so far due to the lack of good solvers for the obtained non-linear constraints.
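The flavour of constraint-based generation can be sketched by replacing Farkas' Lemma and the non-linear solver with brute-force enumeration over a bounded state space (a toy stand-in for the actual method, with a hypothetical example loop):

```python
from itertools import product

def inductive(c1, c2, d, states):
    """Check initiation and consecution of  c1*x + c2*y + d ≤ 0  for the loop
         x, y = 0, 0
         while x < 10: x += 1; y += 2
       over a bounded set of states (enumeration stands in for Farkas' Lemma
       plus a non-linear arithmetic solver)."""
    inv = lambda x, y: c1 * x + c2 * y + d <= 0
    if not inv(0, 0):                                    # initiation
        return False
    for (x, y) in states:                                # consecution
        if inv(x, y) and x < 10 and not inv(x + 1, y + 2):
            return False
    return True

states = list(product(range(-5, 15), range(-5, 30)))
found = [(c1, c2, d)
         for c1, c2, d in product(range(-2, 3), repeat=3)
         if (c1, c2) != (0, 0) and inductive(c1, c2, d, states)]
assert (2, -1, 0) in found    # 2x − y ≤ 0 is inductive
assert (-2, 1, 0) in found    # −2x + y ≤ 0 is inductive; together: y = 2x
```

The two inequalities found jointly express the exact loop invariant y = 2x; the real constraint-based method obtains the coefficients symbolically instead of by search.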

However, significant progress has recently been made in SMT modulo the theory of non-linear arithmetic. In particular, the Barcelogic SMT solver has shown to be very effective at finding solutions in the presence of non-linear integer arithmetic. It can also combine integers and reals, which is very useful when handling the constraints generated by the constraint-based invariant generation approach.

Our techniques have been successfully implemented in the CppInv tool. Using the Barcelogic SMT solver as a back-end, it automatically generates inductive loop invariants (both linear scalar invariants and array invariants) for programs written in a subset of the C++ language. We believe that the combination of our tool with a static analysis to infer the set of potentially interesting invariants for proving a given property would be very useful in the automation of the verification process.

* This work has been partially supported by the Spanish MEC/MICINN under grant TIN 2010-68093-C02-01.


Formal Characterization and Verification of Loop Invariant Based on Finite Difference

Mengjun Li

School of Computer Science, National University of Defense Technology, Changsha, China

Loop invariants play a major role in software verification. Dynamic approaches provide ways to discover likely invariants rapidly. Since a likely loop invariant may not be a real one, the validity of likely loop invariants needs to be verified. In this paper, we present a formal characterization and a verification approach for equality loop invariants based on finite difference.

The following theorem gives a formal characterization of loop invariants, where x = (x1, · · · , xn).

Theorem 1. E(x) = 0 is a loop invariant if and only if, for each transition τ_i (1 ≤ i ≤ m), Δ_τi E(x) is 0 or Δ_τi E(x) = 0 is also a loop invariant.

Definition 1. The finite difference tree (FDT) of a likely loop invariant E(x) = 0 with respect to transitions T = {τ1, · · · , τm} is defined as follows:
(1) The root is E(x) and the leaves are the value 0;
(2) If the tree contains a non-leaf node F(x), then F(x) has m child nodes Δ_τ1 F(x), · · · , Δ_τm F(x).

If the FDTs are infinite, Theorem 1 cannot be used to verify the validity of likely loop invariants. In the following, we present a practical verification approach for loop invariants.

Definition 2. Let T be a finite difference tree of a likely loop invariant E(x) = 0 with respect to transitions T = {τ1, · · · , τm}. A node F(x) in T is called a zero node if F(x) = Σ_{j=0}^{k} p_j(x) F_j(x), where each F_j(x) (j = 0, · · · , k) is an ancestor node of F(x) satisfying F_j(x0) = 0, where x0 denotes the initial value of x, and each p_j(x) (j = 0, · · · , k) is an arbitrary function over the variables x1, · · · , xn.

Definition 3. The decidable finite difference tree (DFDT) of a likely loop invariant E(x) = 0 with respect to transitions T = {τ1, · · · , τm} is the finite difference tree of E(x) = 0 with respect to T, with zero nodes as leaves.

Theorem 2. If there exists a finite DFDT T of E(x) = 0 with respect to transitions T = {τ1, · · · , τm} and E(x0) = 0, then E(x) = 0 is a loop invariant.

The effectiveness of our verification approach has been demonstrated on the examples occurring in Laura Kovács's Ph.D. thesis. We can even prove that f − n! = 0 is a loop invariant of the program computing the greatest factorial less than or equal to a given N. Note that f − n! = 0 is not a polynomial loop invariant; to the best of our knowledge, our work is the first on verifying the validity of non-polynomial loop invariants.


Conclusive Proofs of While Loops Using Invariant Relations

Lamia Labed Jilani (ISG, Tunisia), Wided Ghardallou (FST, Tunisia), Ali Mili (NJIT, USA)

Traditional methods of verifying iterative programs rely on invariant assertions to prove partial correctness and on variant functions to prove termination. As such, they suffer from some weaknesses, which we characterize briefly and broadly as follows:

• If we attempt to prove the partial correctness of a while loop with respect to a (pre/post) specification and the proof fails, we have no simple way to tell whether the proof failed because the loop is incorrect or because the invariant assertion is inadequate.

• Even assuming we have determined that the invariant assertion is inadequate, we have no simple way to determine whether we need to adjust it by strengthening it or by weakening it: granted, if the invariant assertion does not imply the post condition or is not implied by the precondition, then the remedy is obvious; but if the invariant assertion does not meet the inductive step, it is not clear how it must be adjusted.

• When one uses variant functions to prove loop termination, one equates the termination of the loop with the condition that the number of iterations is finite, but fails to capture the situation where individual iterations of the loop body fail to terminate normally because they attempt an illegal operation (array reference out of bounds, pointer reference to null, etc.).

In this paper, we present an alternative method to analyze loops, which appears to address most of the concerns raised above, and is based on the concept of invariant relations. Our approach can be characterized by the following premises.

• Any invariant relation can be converted into a necessary condition of termination; our definition of termination is comprehensive, in the sense that it encompasses the condition that the number of iterations is finite, as well as the condition that each individual iteration proceeds normally.

• Given an invariant relation, we can use it to generate a necessary condition of correctness of the loop with respect to a relational specification. This condition returns false whenever the invariant relation contradicts the candidate specification.

• Given an invariant relation, we can use it to generate a sufficient condition of correctness of the loop with respect to a relational specification. This condition returns true whenever the invariant relation subsumes the candidate specification.

• We have an automated tool that generates invariant relations from a static analysis of the source code of the loop, using a knowledge base that captures the necessary programming knowledge and domain knowledge that is required for an adequate analysis of the loop. Each invariant relation captures some functional property of the loop. If the tool runs out of invariant relations without capturing all the functional details of the loop, it gives an indication of how to complete the missing knowledge in the knowledge base.

Using the capabilities discussed above, we generate an algorithm for proving the correctness of a loop with respect to a relational specification that extracts invariant relations until it finds one that contradicts the candidate specification (if the necessary condition is false) or one that subsumes the candidate specification (if the sufficient condition is true) or until it runs out of invariant relations (in which case it indicates what is missing from the knowledge base).
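As a toy illustration of an invariant relation (chosen for this note, not produced by the authors' tool), consider a loop summing down from x: the quantity s + x(x + 1)/2 is preserved between any pair of states related by iterations of the body, and on exit it pins down the final sum.

```python
def step(state):
    """Loop body of:  while x > 0 do  s := s + x;  x := x − 1  od."""
    s, x = state
    return (s + x, x - 1)

def related(before, after):
    """A candidate invariant relation: s + x*(x+1)/2 is preserved by any
    number of iterations (stated over doubled integers to stay exact)."""
    s0, x0 = before
    s1, x1 = after
    return 2 * s1 + x1 * (x1 + 1) == 2 * s0 + x0 * (x0 + 1)

# The relation holds between the initial state and every later state ...
state = init = (0, 10)
while state[1] > 0:
    state = step(state)
    assert related(init, state)

# ... and on exit (x = 0) it forces the final sum  s = x0*(x0+1)/2.
assert state == (55, 0)
```

Composing such a relation with the exit condition is exactly how an invariant relation yields a correctness condition with respect to a relational specification.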


E-SPARK: Automated Generation of Verifiable Code from Formally Verified Designs

Rajiv Murali and Andrew Ireland

School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, EH14 4AS, UK. {rm339, A.Ireland}@hw.ac.uk

The safety-critical sectors are faced with conflicting demands: achieving high assurance as well as increasing the productivity of their development process. Auto-coders have been effective in many areas, but the need for high levels of assurance has prevented their use in safety-critical applications such as avionics. Standards for safety-critical systems require a constant degree of verification, and most commercially available auto-coders do not satisfy this requirement.

We propose an approach where the auto-coder takes a formal model and generates code along with annotations, i.e. information flow analysis and proof assertions. In this way, design invariants can be reused at the code level in order to support formal verification. Specifically, we have targeted Event-B and the SPARK Approach. At the design level, Event-B provides formal modelling supported by a strong and extensible toolset called Rodin. At the implementation level, the SPARK Approach includes a range of static analysis tools, from data flow analysis to formal verification. We have developed an Eclipse-based plug-in called E-SPARK for the Rodin platform that supports the automatic generation of provably correct code. At this stage, E-SPARK has been designed for the sequential subset of Event-B, and has been tested successfully on a range of arithmetic, searching and sorting examples of algorithmic design. Future development of E-SPARK could aim to target Event-B models of concurrent systems.


Invariant Stream Generators using Automatic Abstract Transformers based on a Decidable Logic*

Pierre-Loïc Garoche (1,2), Temesghen Kahsai (2) and Cesare Tinelli (2)

(1) Onera, the French Aerospace Lab, France
(2) The University of Iowa

The use of formal analysis tools on system models or code often requires the availability of auxiliary invariants about the studied system. Abstract interpretation is currently one of the best approaches to discover useful invariants, especially numerical ones. However, its application is limited by two orthogonal issues: (i) developing an abstract interpretation is often non-trivial; each transfer function of the system has to be represented at the abstract level, depending on the abstract domain used; (ii) with precise but costly abstract domains, the information computed by the abstract interpreter can be used only once a post fixpoint has been reached, something that may take a long time for very large system analyses or with delayed widening to improve precision.

In this work we try to address these issues by combining techniques from abstract interpretation and logic-based model checking. Specifically, we propose a general method for the automatic definition of abstract interpreters that compute numerical invariants of transition systems. We rely on the possibility of encoding the transition system in a decidable logic, such as those typically used by SMT-based model checkers, to compute transformers for an abstract interpreter completely automatically. Our method has the significant added benefit that the abstract interpreter can be instrumented to generate system invariants on the fly, during its iterative computation of a post fixpoint. A prototype implementation of the method provides initial evidence of the feasibility of our approach and the usefulness of its incremental invariant generation feature.
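The α ∘ post ∘ γ construction behind such automatically derived transformers can be sketched over a finite state space, with enumeration standing in for the logical (SMT) queries; the interval domain and the transition relation below are purely illustrative:

```python
def post(trans, states):
    """Concrete post-image of a set of states under a transition relation
    given as a Python predicate (standing in for the logical encoding)."""
    return {s2 for s1 in states for s2 in ALL_STATES if trans(s1, s2)}

def gamma(iv):
    """Concretisation of an interval (lo, hi)."""
    lo, hi = iv
    return {x for x in ALL_STATES if lo <= x <= hi}

def alpha(states):
    """Abstraction: the tightest interval enclosing a set of states."""
    return (min(states), max(states)) if states else None

def abstract_post(trans, iv):
    """Best abstract transformer  α ∘ post ∘ γ,  obtained from the transition
    relation alone — here by enumeration over a finite state space."""
    return alpha(post(trans, gamma(iv)))

ALL_STATES = range(0, 64)
# Transition: x' = x + 2 if x < 10, else x' = x  (a saturating counter).
trans = lambda x, x2: (x2 == x + 2) if x < 10 else (x2 == x)
assert abstract_post(trans, (0, 3)) == (2, 5)
assert abstract_post(trans, (8, 12)) == (10, 12)
```

In the actual method the min/max queries become optimisation-style queries to an SMT solver, so no per-domain transfer functions need to be written by hand.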

While motivated by practical issues (namely, the generation of auxiliary invariants for a k-induction model checker), the current work is more general and can be adapted to a wide variety of contexts. It only requires that the transition system semantics be expressible in a decidable logic with an efficient solver, such as SAT or SMT solvers, and that the elements of the chosen abstract domain be effectively representable in that logic. Such requirements are satisfied by a large number of abstract domains used in current practice. As a consequence, we believe that our approach could help considerably in expanding the reach of abstract interpretation techniques to a variety of target languages, as well as facilitate their integration with complementary techniques such as model checking.

* Work currently under submission at another conference.


Faster Automatic Test Case Generation

Ott Tinn
University of Edinburgh

This talk presents a new tool that extends an existing symbolic execution tool for C programs to partially support C++ and makes it faster at finding inputs that could crash the analysed programs. The speedup is achieved by adding a new transformation phase that aims to make bug finding faster, without optimizing any relevant bugs away, by trying to compute just the one bit per input that shows whether the program crashes on that input.

The transformation phase transforms the programs such that regular observable behaviour is removed when it is not needed (for example, it is not necessary to produce output if it is not evaluated). Then it adds validation calls to make sure the compiler cannot optimize away potentially dangerous operations such as memory accesses. After that, optimizing transforms such as the following are applied:

• Validation removal: if a validation call is always preceded by another one that would catch at least the same set of bugs, then it can be removed. The same can be done for calls that can never fail.

• Validation hoisting: move each validation call to the earliest point in the control flow graph (CFG) where it could be executed without changing the CFG or the set of inputs on which the program crashes.

• Loop removal: if a loop has no side effects and a known iteration count, then it can be removed by replacing the induction variable with a symbolic variable ranging over the same values. This makes the loop get executed with any of the values instead of all of them, but that is enough to find any potential crashes.
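The first of these transforms can be sketched on a toy straight-line IR (a hypothetical instruction format invented here, not the tool's actual representation):

```python
def remove_redundant_validations(prog):
    """Validation removal on a straight-line program: drop a check that is
    preceded, with no intervening write to its operand, by an identical
    check, since the earlier one already catches the same set of bugs.
    Instructions are ('check', var) or ('write', var)."""
    seen = set()   # operands whose check is still known to have fired
    out = []
    for ins in prog:
        op, var = ins
        if op == 'check':
            if var in seen:
                continue           # dominated by an identical earlier check
            seen.add(var)
        else:
            seen.discard(var)      # a write invalidates earlier checks of var
        out.append(ins)
    return out

prog = [('check', 'p'), ('check', 'p'), ('write', 'p'),
        ('check', 'p'), ('check', 'q')]
assert remove_redundant_validations(prog) == [
    ('check', 'p'), ('write', 'p'), ('check', 'p'), ('check', 'q')]
```

Validation hoisting is the dual pass: it moves each surviving check as early in the CFG as data dependences allow, which in turn exposes more removals.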

The sample was an extensive set of programs from a programming competition archive. The evaluation showed that, on that class of C++ programs, an input that triggers a crash can be found in less than five minutes for a significant number of cases, and that the new system takes about half as much time as the base system if the time limit is between a minute and an hour per program. A bug was found in at most one hour in roughly half the cases where a program was known to crash^1 on some input.

In general, similar bug finding systems should be usable for finding bugs in small programs with clearly defined input spaces and not too complicated logic. Thus these systems should be usable for finding bugs in students' assignments, other small programs, and parts of programs (e.g. a few classes).

The new transformation stage should also be applicable to other similar bug finding systems. Without the loop transformations it could even be used in systems without symbolic reasoning abilities, such as fuzzers.

1 Some of the known crashes were undetectable by the systems, so it is not clear how many detectable crashes were not found.

