Drawing Euler Diagrams with Circles: The Theory of Piercings

Gem Stapleton, Leishi Zhang, John Howse, and Peter Rodgers

Abstract: Euler diagrams are effective tools for visualizing set intersections. They have a large number of application areas ranging from statistical data analysis to software engineering. However, the automated generation of Euler diagrams has never been easy: given an abstract description of a required Euler diagram, it is computationally expensive to generate the diagram. Moreover, the generated diagrams represent sets by polygons, sometimes with quite irregular shapes that make the diagrams less comprehensible. In this paper, we address these two issues by developing the theory of piercings, where we define single piercing curves and double piercing curves. We prove that if a diagram can be built inductively by successively adding piercing curves under certain constraints, then it can be drawn with circles, which are more esthetically pleasing than arbitrary polygons. The theory of piercings is developed at the abstract level. In addition, we present a Java implementation that, given an inductively pierced abstract description, generates an Euler diagram consisting only of circles within polynomial time.

Index Terms—Automated diagram drawing, Euler diagrams, diagrammatic reasoning, information visualization.


1 INTRODUCTION

An Euler diagram is a collection of closed curves that partition the plane into connected subsets, called regions, each of which is enclosed by a set of curves. Typically, Euler diagrams are used to visualize set-theoretic relationships, where each curve in the Euler diagram represents a set and each region represents the intersection of a number of sets. The term “Euler diagram” is often confused with the term “Venn diagram”; in fact, the latter can be seen as a subclass of Euler diagrams. The reason for this is that whereas Venn diagrams have to represent all possible set intersections, Euler diagrams only need to represent a subset of the possible intersections. For example, the left-hand diagram in Fig. 1 shows an Euler diagram with three sets and five intersections (including that outside all of the curves) whereas the right-hand diagram in Fig. 1 is both an Euler diagram and a Venn diagram with three sets and includes all eight possible intersections.

Euler diagrams are attractive visualization tools because they are able to represent set intersection and enclosure in an easy-to-understand way. However, despite the benefits as a visualization method, the practical use of Euler diagrams has been held back by the difficulties in their automated generation. All generation approaches start with an abstract description of the diagram to be embedded. Typically, these descriptions state which curves are to be present and which set intersections must be represented. In order to transform abstract descriptions into diagrams effectively, various research efforts have been devoted to the automated generation of Euler diagrams [1], [2], [3], [4].

Some existing generation approaches, such as [5], [6], construct a so-called dual graph from the abstract description, which is embedded in the plane, and “wrap” closed curves around the dual graph, as illustrated in Fig. 2. Each node in the graph represents a required set intersection. For instance, PR represents the set $P \cap R \cap \overline{Q} \cap \overline{S}$ and the node with no label represents $\overline{P} \cap \overline{R} \cap \overline{Q} \cap \overline{S}$. For space reasons, we omit the details of these generation approaches.

As the number of required sets and intersections increases, the number of vertices and edges in the dual graph can increase dramatically, with the graph having at most $2^n$ vertices, each of which represents a set intersection. Two vertices are joined by an edge whenever the set intersections they represent differ by exactly one set (e.g., nodes for $A \cap B \cap C$ and $A \cap B \cap \overline{C}$ will be joined by an edge). Generating and manipulating such graphs can involve a huge amount of computation. Stages of the drawing process typically involve finding a large planar subgraph of the dual that has an embedding with certain properties. The subgraph and its embedding are chosen depending on the well-formedness conditions that the to-be-drawn diagram is required to possess. Moreover, the diagrams generated usually represent sets by polygons, sometimes with quite irregular shapes [7], [8], which make the diagrams less comprehensible and not necessarily appealing to users who are familiar with the idea of using circles to represent set-theoretic relationships.
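The adjacency rule for the dual graph can be illustrated with a short Python sketch. It is not taken from any of the cited implementations: the function name `dual_graph_edges` and the example zone set are hypothetical, and the sketch only builds the “differ by exactly one set” edges, not the planar subgraph or its embedding.

```python
from itertools import combinations

def dual_graph_edges(zones):
    """Return edges joining zones whose label sets differ by exactly one label."""
    zs = [frozenset(z) for z in zones]
    # The symmetric difference has size 1 exactly when one zone is the other
    # plus (or minus) a single label.
    return [(a, b) for a, b in combinations(zs, 2) if len(a ^ b) == 1]

# Hypothetical zone set over three sets P, Q, R (not the diagram of Fig. 2).
zones = [set(), {"P"}, {"P", "Q"}, {"P", "R"}, {"P", "Q", "R"}]
edges = dual_graph_edges(zones)
```

Comparing every pair of zones is quadratic in the number of zones, which is itself at most $2^n$; this is one way to see why the graph can become expensive to generate as $n$ grows.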

In this paper, we propose a method that is capable of generating a class of Euler diagrams using circles in polynomial time. In part, the polynomial-time algorithm exists because the number of set intersections to be represented is constrained to be at most $4 \times (n - 1)$, where $n$ is the number of sets. However, this constraint on the number of set intersections alone is not necessarily sufficient to ensure that the aforementioned algorithms run in polynomial time, as we will further discuss below. In order to define our class of Euler diagrams, we first identify two types of curves, which we have termed “single piercing curves” and “double piercing curves,” respectively, at the abstract description level. Second, we show that if a diagram can be drawn by successively adding these piercing curves, then it can be drawn with circles in an efficient manner. For simplicity, in the remaining part of this paper, we say a curve which is either a single piercing or a double piercing is a piercing of a diagram. Although we will define the terms “single and double piercing” later, for now, we refer the reader to Fig. 3, which illustrates a diagram that can be built inductively by adding piercing curves. Diagrams that can be generated inductively using piercing curves can be identified at the abstract level and it is at this level that we develop the theory of piercings.

1020 IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, VOL. 17, NO. 7, JULY 2011

G. Stapleton and J. Howse are with the Visual Modelling Group, University of Brighton, Lewes Road, Brighton BN2 4GJ, UK. E-mail: {g.e.stapleton, john.howse}@brighton.ac.uk.

L. Zhang and P. Rodgers are with the University of Kent, Kent, UK. E-mail: {p.j.rodgers, l.zhang}@kent.ac.uk.

Manuscript received 22 July 2009; revised 24 Nov. 2009; accepted 18 Jan. 2010; published online 8 Sept. 2010. Recommended for acceptance by W. Wang. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TVCG-2009-07-0154. Digital Object Identifier no. 10.1109/TVCG.2010.119.

1077-2626/11/$26.00 © 2011 IEEE Published by the IEEE Computer Society

“Pierced” Euler diagrams can be thought of as being sparse, and are typical of those seen to represent lots of subset and disjointness relationships between sets. This is indicative of the type of situations where Euler diagrams excel at representing information. Particular examples include their application as a basis for software modeling notations, such as class diagrams, state charts, and constraint diagrams [9]; see [10] for an example of a software model produced using constraint diagrams. Indeed, Euler diagrams are suitable for forming the basis of logics, which are capable of ontology specification [11], [12]; here, one may often want to specify that the classes (concepts) in the ontology are either disjoint or in a subset relationship. Euler diagrams have a wide range of other application areas such as statistical data analysis and logical reasoning [13], [14], [15], [16], [17], [18], [19].

We start, in Section 2, by providing some examples of Euler diagrams that have been automatically drawn using previous generation methods. This allows for comparison with those produced using the methods of this paper. Section 3 overviews the syntax of Euler diagrams and other necessary background material. Abstract descriptions of Euler diagrams are detailed in Section 4. Section 5 defines the notions of a single piercing and a double piercing. Inductively pierced descriptions are defined in Section 6. We prove that all inductively pierced descriptions can be embedded with circles in Section 7. Some limitations of the theory are discussed in Section 8. To demonstrate the utility of the theoretical results, Section 9 provides an implementation that embeds inductively pierced descriptions as diagrams drawn with circles; the software is freely available from www.eulerdiagrams.com/piercing.htm. Many of the diagrams in this paper were generated by the software, including those in Figs. 1 and 3. The complexity of our drawing method is identified in Section 10, where we also present some discussions around the complexity of other drawing algorithms. Finally, we conclude in Section 11 and discuss future directions for this research.

2 OTHER DRAWING METHODS

The first generation method, developed by Flower and Howse [5], provides an algorithm that is theoretically capable of drawing an Euler diagram given any abstract description, D, provided D has a so-called completely well-formed drawing. The associated software implementation can produce drawings for diagrams with at most four curves. An illustration of the output can be seen in Fig. 4. The abstract description for this diagram specifies the labels present, A, B, C, and D, and the regions (called zones) to be present (i.e., which set intersections are to be represented): $\emptyset$ (outside all of the curves), A (inside just A), B, AB (inside exactly A and B), AC, ABC, ACD, and ABCD. All of the diagrams in this section have this abstract description to allow for easy comparison (although some use lowercase curve labels).

The techniques of Flower and Howse [5] were extended to enhance the layout [20]. First, some modifications were made to the implementation of the generation method; in our running example, this gives the left-hand diagram in Fig. 5, although the labels are not shown, as opposed to the diagram in Fig. 4. Also, Flower et al. [20] used layout metrics and hill climbing algorithms to improve the diagrams’ esthetic qualities; the result of the layout improvements applied to the left-hand diagram in Fig. 5 can be seen on the right.

STAPLETON ET AL.: DRAWING EULER DIAGRAMS WITH CIRCLES: THE THEORY OF PIERCINGS 1021

Fig. 1. Euler diagrams.

Fig. 2. Generation using a dual graph.

Fig. 3. An inductively pierced diagram.

Fig. 4. Generation using the methods of [5].


Further extensions to the generation methods of Flower and Howse allow the drawing of abstract descriptions that need not have a completely well-formed embedding. This was done by Rodgers et al. [21], where techniques to allow the generation of any abstract description were developed; output from the software of Rodgers et al. can be seen in Fig. 6. All of the methods described so far use a dual-graph-based approach and are computationally complex, having an NP-complete step. This means that some diagrams take a significant time to draw.

Indeed, the dual graph method requires one to choose a dual graph from the infinitely many that are capable of generating the required Euler diagram. The chosen graph directly impacts the esthetic quality of the drawn diagrams and finding a suitable dual can be difficult. A substantial part of Rodgers et al. [21] focuses on the task of finding a dual that minimizes the number of times well-formedness conditions are broken and guarantees the absence of certain conditions (such as no nonsimple curve and no “disconnected” zones). An alternative method for choosing a dual graph is developed by Simonetto and Auber [22], which has been implemented [6]. Output from that implementation can be seen in Fig. 7, where the labels have been manually added postdrawing.¹

A different method was developed by Chow [24], which draws so-called monotone Euler diagrams. Among other restrictions, monotone diagrams must have the intersection between all of the curves in the to-be-generated Euler diagram present. Many “pierced” diagrams do not have this intersection present, so our method is complementary to that of Chow. We do not have access to Chow’s software implementation of his generation method, so we refer the reader to http://apollo.cs.uvic.ca/euler/DrawEuler/index.html for images of automatically drawn diagrams that can be compared, in terms of esthetics, with those in this paper.

Most recently, an inductive generation method has been developed [23], which draws Euler diagrams by adding one curve at a time; see Fig. 8 for an example of the software output. This method has the advantage that it can draw diagrams under arbitrary sets of the well-formedness conditions (where possible) but it also has an NP-complete step since it searches through graphs for cycles with certain properties. The layout metrics of [20] could be applied to the diagrams drawn using this method to improve their esthetic qualities.

Using the techniques we develop in this paper, the diagram in Fig. 9 can be generated. We identify a class of abstract descriptions that can be drawn quickly, in polynomial time, entirely with circles in a completely well-formed manner.

Fig. 5. Using the layout improvement methods of [20].

Fig. 6. Generation using the methods of [21].

Fig. 7. Generation using the methods of [6].

Fig. 8. Inductive generation using the methods of [23].

Fig. 9. Generation using the methods of this paper.

1. We thank Paolo Simonetto for supplying this image.

3 EULER DIAGRAMS

An Euler diagram is a collection of closed curves drawn in the plane. Each curve has a label, chosen from a set $\mathcal{L}$. The closed curves essentially provide a partition of the plane into minimal regions. A zone is a union of minimal regions determined by being inside certain curves and outside the other curves. To illustrate, Fig. 10 shows an Euler diagram with three curves A, B, and C. There are seven zones in this diagram. Note that zone c (the regions inside curve C but outside the other two curves) consists of two minimal regions, and therefore, is disconnected. Recall that a closed curve in the plane is a continuous function of the form $c : [a, b] \to \mathbb{R}^2$, where $c(a) = c(b)$. Given an arbitrary function, $f : A \to B$, we write $\mathrm{image}(f)$ to denote the set of elements in $B$ to which $f$ maps.

Definition 3.1. An Euler diagram is a pair, $d = (Curve, l)$, where

1. $Curve$ is a finite collection of closed curves each with codomain $\mathbb{R}^2$ and

2. $l : Curve \to \mathcal{L}$ is an injective function.

Definition 3.2. A minimal region of an Euler diagram $d = (Curve, l)$ is a connected component of

$$\mathbb{R}^2 - \bigcup_{c \in Curve} \mathrm{image}(c).$$

Definition 3.3. A zone in an Euler diagram $d = (Curve, l)$ is a nonempty set of minimal regions that can be described as being interior to certain curves (possibly no curves) and exterior to the remaining curves.

For the interpretability and classification of diagrams, a range of diagram properties have been defined, which are sometimes called well-formedness conditions. Throughout the paper, we assume the following set of well-formedness conditions:

1. All of the curves are simple.

2. No pair of curves runs concurrently.

3. There are no triple points of intersection between the curves.

4. Whenever two curves intersect, they cross.

5. Each zone is connected (i.e., consists of exactly one minimal region).

To illustrate, Fig. 11 shows some examples of nonwell-formed Euler diagrams. Formal definitions of the well-formedness conditions can be found in [25]. Any Euler diagram which satisfies all of the conditions is said to be completely well formed.

4 ABSTRACTION OF EULER DIAGRAMS

In order to generate an Euler diagram, we start with a description of that diagram. To illustrate, the diagram in Fig. 12 can be described as having four curves, A, B, C, and D. These curves divide the plane in such a manner that there are six zones present. For instance, there is one zone inside A only and another zone inside precisely A and B. Thus, each present zone can be described by the labels of the curves that the zone is inside.

Definition 4.1. An abstract description $D$ is a pair, $(L, Z)$, where $L$ is a subset of $\mathcal{L}$ (i.e., all of the labels in $D$ are chosen from the set $\mathcal{L}$) and $Z \subseteq \mathbb{P}L$, where $\mathbb{P}L$ denotes the power set of $L$, such that $\emptyset \in Z$. Elements of $Z$ are called abstract zones (or, simply, zones). Given $D = (L, Z)$, we define $L(D) = L$ and $Z(D) = Z$.

In Fig. 12, the diagram $d$ has abstract description $L = \{A, B, C, D\}$ and $Z = \{\emptyset, \{A\}, \{B\}, \{A, B\}, \{C\}, \{C, D\}\}$. Sometimes we will write the zones using lowercase letters and as strings, such as $ab$ instead of $\{A, B\}$; when writing zones as strings, using lowercase distinguishes the zone $a = \{A\}$ from the curve label $A$, for instance.
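One possible encoding of an abstract description, given here only as an illustrative Python sketch, represents each zone as a `frozenset` of labels. The helper names `zone` and `abstract_description` are our own assumptions and do not come from the implementation described in this paper.

```python
def zone(s):
    """Parse a lowercase zone string such as 'ab' into the label set {'A', 'B'}."""
    return frozenset(ch.upper() for ch in s)

def abstract_description(labels, zone_strings):
    """Build the pair (L, Z) and check the two conditions of Definition 4.1."""
    L = frozenset(labels)
    Z = frozenset(zone(s) for s in zone_strings)
    if frozenset() not in Z:
        raise ValueError("the empty zone must be in Z")
    if not all(z <= L for z in Z):
        raise ValueError("every zone must be a subset of L")
    return L, Z

# The abstract description of the diagram in Fig. 12, with zones written as strings.
L, Z = abstract_description("ABCD", ["", "a", "b", "ab", "c", "cd"])
```

Using immutable sets makes zones hashable, so a zone set can itself be stored as a set and compared for equality directly.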

Definition 4.2. Given an Euler diagram $d = (Curve, l)$, we map $d$ to the abstract description $\mathrm{abstract}(d) = (\mathrm{image}(l), Z)$, called the abstraction of $d$, where $Z$ contains exactly one abstract zone for each zone in $d$; in particular, given a zone $z$ in $d$, the set $Z$ contains the abstract zone

$$\mathrm{abstract}(z) = \{l(c) : c \in C(z)\},$$

where $C(z)$ is the set of curves in $d$ that contain $z$. Given an abstraction, $D = (L, Z)$, if $\mathrm{abstract}(d) = D$, then we say $d$ is an embedding of $D$.
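For diagrams drawn with circles, the abstraction map can be approximated numerically. The Python sketch below is an illustration only and is not the method used in this paper: it samples points on a grid and records which circles contain each point, so a zone smaller than the grid spacing could be missed. The names `zone_of_point` and `approximate_abstraction`, the `step` parameter, and the example circles are all hypothetical.

```python
def zone_of_point(circles, x, y):
    """Labels of the circles whose interior contains the point (x, y)."""
    return frozenset(
        label for label, (cx, cy, r) in circles.items()
        if (x - cx) ** 2 + (y - cy) ** 2 < r ** 2
    )

def approximate_abstraction(circles, step=0.05):
    """Approximate abstract(d) by sampling a grid over the padded bounding box."""
    xmin = min(cx - r for cx, cy, r in circles.values()) - step
    xmax = max(cx + r for cx, cy, r in circles.values()) + step
    ymin = min(cy - r for cx, cy, r in circles.values()) - step
    ymax = max(cy + r for cx, cy, r in circles.values()) + step
    nx = int((xmax - xmin) / step) + 1
    ny = int((ymax - ymin) / step) + 1
    Z = set()
    for i in range(nx + 1):
        for j in range(ny + 1):
            Z.add(zone_of_point(circles, xmin + i * step, ymin + j * step))
    return frozenset(circles), frozenset(Z)

# Hypothetical diagram: A and B overlap, and C lies inside their intersection.
circles = {"A": (0.0, 0.0, 1.0), "B": (1.0, 0.0, 1.0), "C": (0.5, 0.0, 0.2)}
L, Z = approximate_abstraction(circles)
```

The padding by one grid step ensures that at least one sample falls outside every circle, so the empty zone is always found.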

Note that $\emptyset$ represents the zone that is contained by no curves. It is called the outside zone and is present in every abstract description. The Euler diagram generation problem can be summarized as: given an abstract description, $D = (L, Z)$, find an Euler diagram $d$ such that $\mathrm{abstract}(d) = D$ and such that $d$ satisfies some chosen well-formedness conditions.

Fig. 10. An Euler diagram.

Fig. 11. Nonwell-formed diagrams.

Fig. 12. Abstractions.

5 PIERCING CURVES

A class of abstract descriptions that can be drawn with circles in an efficient manner can be built by successively adding piercing curves. We define two types of piercing curves (strictly, curve labels). If an abstract description can be built inductively, entirely from curves of these types, under certain constraints, then it can be drawn with circles in polynomial time.

To illustrate, in Fig. 3, the curve labeled D is what we define to be a single piercing of B. The curve D contains exactly two zones: that inside D and that inside both B and D. Double piercings pierce two other curves that themselves intersect. So, in Fig. 3, the curve C is a double piercing of A and B. We will see later that double piercings cannot always be drawn with circles, so we have to place restrictions on the manner in which we inductively construct our abstract descriptions.

To add a single piercing as a circle, it is necessary that the two zones through which the circle is to pass are topologically adjacent. Indeed, by the well-formedness conditions, we know that if these two zones are topologically adjacent, then they are “separated” by the curve through which the to-be-added single piercing is to pass. In the case of a double piercing, we need a point $p$, which is an intersection point of the two curves being pierced. In Fig. 3, for example, we can add a double piercing of A and B in $d_2$ by finding a disc around one of their (two) intersection points. Key to our construction is that, if we build a diagram $d$ by successively adding piercing curves, then any pair of zones in $d$ whose abstractions differ only by one label are topologically adjacent. Our results below hold since we are working in $\mathbb{R}^2$, and under the standard metric, the neighborhood of a point $p$ is a disc, so $p$ can easily be enclosed by a circle.

The zones that a piercing curve passes through are strongly related to each other and we call these zones a cluster. The following definition formalizes this concept at the abstract level:

Definition 5.1. Let $z$ be an abstract zone and let $L \subseteq \mathcal{L} - z$. The set $\{z \cup L' : L' \in \mathbb{P}L\}$ is an $L$-cluster for $z$, denoted by $C(z, L)$. Given an $L$-cluster, $C(z, L)$, and a label, $\lambda \in \mathcal{L} - (z \cup L)$, the $L$-cluster $C(z \cup \{\lambda\}, L)$ is called a $\lambda$-partner for $C(z, L)$.
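Definition 5.1 translates almost directly into code. The following Python sketch is illustrative only; the function names `powerset`, `cluster`, and `partner` are our own and are not taken from the implementation described in this paper. It computes the $\{C, D\}$-cluster for the zone $ab$ and its $E$-partner.

```python
from itertools import combinations

def powerset(labels):
    """All subsets of `labels`, returned as frozensets."""
    items = sorted(labels)
    return [frozenset(c) for k in range(len(items) + 1) for c in combinations(items, k)]

def cluster(z, L):
    """C(z, L) = { z ∪ L' : L' ⊆ L }; L must be disjoint from z."""
    z, L = frozenset(z), frozenset(L)
    if z & L:
        raise ValueError("L must not share labels with z")
    return frozenset(z | sub for sub in powerset(L))

def partner(z, L, lam):
    """The lam-partner C(z ∪ {lam}, L) of C(z, L); lam must lie outside z ∪ L."""
    if lam in frozenset(z) | frozenset(L):
        raise ValueError("lam must not be in z or L")
    return cluster(frozenset(z) | {lam}, L)

c = cluster("AB", "CD")        # the zones ab, abc, abd, abcd
p = partner("AB", "CD", "E")   # the zones abe, abce, abde, abcde
```

An $L$-cluster always has $2^{|L|}$ zones, which is why single piercings involve two zones and double piercings involve four.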

For example, given the zone $ab$ (formally $\{A, B\}$) and the label set $\{C, D\}$, the set

$$C(ab, \{C, D\}) = \{ab, abc, abd, abcd\}$$

is a $\{C, D\}$-cluster for $ab$. Since the label $E$ is not in $ab$ or in $\{C, D\}$, the cluster

$$C(abe, \{C, D\}) = \{abe, abce, abde, abcde\}$$

is an $E$-partner for $C(ab, \{C, D\})$.

Since piercing curves contain specific sets of zones, it is useful to define the set of zones contained by a curve label in an abstract description. We also observe that in a diagram, some curves properly contain other curves. This concept is also helpful at the abstract level.

Definition 5.2. Let $D = (L, Z)$ be an abstract description and let $\lambda_1$ and $\lambda_2$ be distinct curve labels in $L$. If $\lambda_1 \in z$ and $z \in Z$, then we say $\lambda_1$ contains $z$ in $D$, with the set of such zones denoted by $Z_c(\lambda_1)$. If $Z_c(\lambda_1) \subseteq Z_c(\lambda_2)$, then $\lambda_2$ contains $\lambda_1$ in $D$. The set of curves that contain $\lambda_1$ in $D$ is denoted by $L_c(\lambda_1)$.

To compute $L_c(\lambda_1)$, we can make use of the following result:

Lemma 5.1. Let $D = (L, Z)$ be an abstract description and let $\lambda$ be a curve label in $L$. Then

$$L_c(\lambda) = \bigcap_{z_i \in \{z \in Z : \lambda \in z\}} z_i - \{\lambda\}.$$
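Lemma 5.1 can be checked on small examples with a Python sketch such as the following. The function names `contained_zones` and `containing_labels` are hypothetical, and the zone set used is the one given for Fig. 12.

```python
def contained_zones(Z, lam):
    """Z_c(lam): the zones of Z that contain the label lam."""
    return frozenset(z for z in Z if lam in z)

def containing_labels(Z, lam):
    """L_c(lam) via Lemma 5.1: intersect the zones containing lam, then drop lam."""
    zs = contained_zones(Z, lam)
    if not zs:
        raise ValueError("lam does not occur in any zone")
    return frozenset.intersection(*zs) - {lam}

# Zone set of Fig. 12: the empty zone, a, b, ab, c, cd.
Z = frozenset(frozenset(s.upper()) for s in ["", "a", "b", "ab", "c", "cd"])
inside_of_D = containing_labels(Z, "D")   # D occurs only in the zone cd
```

Here the only zone containing D is $cd$, so the intersection is $\{C, D\}$ and removing D leaves $\{C\}$: the curve C contains the curve D.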

5.1 Single Piercings

As illustrated previously, single piercing curves are those curves that intersect with exactly one other curve.

Definition 5.3. Let $D = (L, Z)$ be an abstract description and let $\lambda_1, \lambda_2 \in L$ be distinct curve labels. Then $\lambda_1$ is a single piercing of $\lambda_2$ in $D$ if there exists a zone $z$ such that

1. $\lambda_1 \notin z$ and $\lambda_2 \notin z$,

2. $Z_c(\lambda_1) = C(z \cup \{\lambda_1\}, \{\lambda_2\})$, and

3. $C(z, \{\lambda_2\}) \subseteq Z$.

The zone $z$ is said to identify $\lambda_1$ as a piercing of $\lambda_2$.

To illustrate, the diagram $d_1$ in Fig. 13 has a single piercing curve labeled C. At the abstract level, $\mathrm{abstract}(d_1)$ has zone set $Z = \{\emptyset, a, b, ab, bc, abc\}$. The zone $b$ identifies C as a piercing of A, since

$$Z_c(C) = \{bc, abc\} = C(bc, \{A\})$$

and

$$C(b, \{A\}) = \{b, ab\} \subseteq Z.$$

Neither A nor B are single piercing curves in $d_1$ but they are both single piercing curves in $d_2$; removing a single piercing curve can create single piercing curves. The curve C can be thought of as “piercing” the single curve A, hence, the terminology “single piercing.”
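The three conditions of Definition 5.3 can be tested mechanically. The Python sketch below is only an illustration under our own naming (`identify_single_piercing` is hypothetical); it searches the zone set for a zone that identifies the piercing and is run on the zone set of $d_1$.

```python
def identify_single_piercing(Z, lam1, lam2):
    """Return a zone z identifying lam1 as a single piercing of lam2, or None."""
    Z = frozenset(Z)
    zc = frozenset(z for z in Z if lam1 in z)             # Z_c(lam1)
    for z in Z:                                           # condition 3 forces z to be in Z
        if lam1 in z or lam2 in z:                        # condition 1
            continue
        if zc != {z | {lam1}, z | {lam1, lam2}}:          # condition 2
            continue
        if {z, z | {lam2}} <= Z:                          # condition 3
            return z
    return None

# Zone set of d1 in Fig. 13: the empty zone, a, b, ab, bc, abc.
Z = frozenset(frozenset(s.upper()) for s in ["", "a", "b", "ab", "bc", "abc"])
z = identify_single_piercing(Z, "C", "A")   # the zone b
```

Because the pierced set has a single label, each cluster is written out explicitly as a two-element set rather than via a general power-set construction.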

5.2 Double Piercings

Double piercing curves are curves that pierce two other curves and split four zones. This is a clear generalization of a single piercing curve and one can proceed to define triple piercings and so forth. We further discuss triple piercings below, in the context of generation with circles. Our generation method works for abstract descriptions that are built using single and double piercings but not $n$-piercings, where $n \geq 3$.


Fig. 13. Identifying single piercings.


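Definition 5.4. Let $D = (L, Z)$ be an abstract description and let $\lambda_1, \lambda_2, \lambda_3 \in L$ be distinct curve labels. Then $\lambda_1$ is a double piercing of $\lambda_2$ and $\lambda_3$ in $D$ if there exists a zone $z$ such that

1. $\lambda_1 \notin z$, $\lambda_2 \notin z$, and $\lambda_3 \notin z$,

2. $Z_c(\lambda_1) = C(z \cup \{\lambda_1\}, \{\lambda_2, \lambda_3\})$, and

3. $C(z, \{\lambda_2, \lambda_3\}) \subseteq Z$.

The zone $z$ is said to identify $\lambda_1$ as a piercing of $\lambda_2$ and $\lambda_3$.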

To illustrate, the diagram in Fig. 14 has a double piercing curve labeled D and abstract zone set $\{\emptyset, a, ab, b, c, ac, bc, abc, cd, acd, bcd, abcd\}$. The abstract zone $c$ identifies D as a double piercing of A and B; we have

$$Z_c(D) = C(cd, \{A, B\})$$

and

$$C(c, \{A, B\}) = \{c, ac, bc, abc\} \subseteq Z.$$

Removing D from $d_1$ creates a new double piercing, C, of A and B; C was not a piercing in $d_1$. In $d_2$, all of the curves are double piercings and the removal of any one of them turns the remaining curves into single piercings.
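Definition 5.4 can be checked in the same style. In the Python sketch below, the names `cluster`, `identify_double_piercing`, and `remove_label` are our own, and reading “removing a curve” as deleting its label from every zone is an assumption made for illustration. The example uses the zone set given for Fig. 14.

```python
from itertools import combinations

def cluster(z, labels):
    """C(z, labels): z united with every subset of `labels`."""
    labels = sorted(labels)
    return frozenset(
        frozenset(z) | frozenset(sub)
        for k in range(len(labels) + 1)
        for sub in combinations(labels, k)
    )

def identify_double_piercing(Z, lam1, lam2, lam3):
    """Return a zone z identifying lam1 as a double piercing of lam2 and lam3, or None."""
    Z = frozenset(Z)
    pierced = {lam2, lam3}
    zc = frozenset(z for z in Z if lam1 in z)             # Z_c(lam1)
    for z in Z:
        if lam1 in z or z & pierced:                      # condition 1
            continue
        if zc == cluster(z | {lam1}, pierced) and cluster(z, pierced) <= Z:
            return z                                      # conditions 2 and 3
    return None

def remove_label(Z, lam):
    """Assumed reading of curve removal: delete lam from every zone."""
    return frozenset(z - {lam} for z in Z)

names = ["", "a", "ab", "b", "c", "ac", "bc", "abc", "cd", "acd", "bcd", "abcd"]
Z = frozenset(frozenset(s.upper()) for s in names)
z = identify_double_piercing(Z, "D", "A", "B")            # the zone c
```

Applying `remove_label(Z, "D")` and testing C against A and B reproduces the observation that removing D turns C into a double piercing.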

6 INDUCTIVELY PIERCED DESCRIPTIONS

There are some diagrams that can be produced by inductively adding piercings. We call them piercing diagrams and the abstract description of this type of diagram is called an inductively pierced description. Fig. 15 illustrates the process, starting with the “empty” diagram, and successively adding curves. Note that the first two diagrams do not contain single or double piercing curves. The curve B is a single piercing in $d_2$; we add the double piercing C to give $d_3$, then add D and E to give $d_4$ and $d_5$, respectively. There are different sequences of diagrams that result in $d_5$ by adding piercing curves.

All of the diagrams in Fig. 15 are connected since their curves form a connected component of the plane. We can also add curves that are not connected to any other curve, shown in Fig. 16, and then pierce these new curves. Thus, curves that do not intersect with any other curves act as a base case to which we can add piercings; we call these curves base piercings.

Definition 6.1. Let $D = (L, Z)$ be an abstract description and let $\lambda \in L$ be a curve label. Then $\lambda$ is a base piercing in $D$ if there exists a zone $z$ such that

1. $\lambda \notin z$,

2. $Z_c(\lambda) = C(z \cup \{\lambda\}, \emptyset)$, and

3. $C(z, \emptyset) \subseteq Z$.

The zone $z$ is said to identify $\lambda$ as a base piercing.

Note that in the above definition, $C(z \cup \{\lambda\}, \emptyset) = \{z \cup \{\lambda\}\}$ and $C(z, \emptyset) = \{z\}$. While this alternative presentation may make the concepts in the definition more immediately apparent, our chosen presentation of the definition readily matches the definitions of single piercings and double piercings.
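Because both clusters in Definition 6.1 are singletons, the check reduces to a few lines. The Python sketch below is illustrative only; `identify_base_piercing` is a hypothetical name and the zone set is the one given for Fig. 12.

```python
def identify_base_piercing(Z, lam):
    """Return the zone z identifying lam as a base piercing, or None."""
    Z = frozenset(Z)
    zc = frozenset(z for z in Z if lam in z)   # Z_c(lam)
    if len(zc) != 1:                           # C(z ∪ {lam}, ∅) has exactly one zone
        return None
    (inside,) = zc
    z = inside - {lam}                         # the only candidate satisfying condition 2
    return z if z in Z else None               # condition 3: C(z, ∅) = {z} ⊆ Z

# Zone set of Fig. 12: the empty zone, a, b, ab, c, cd.
Z = frozenset(frozenset(s.upper()) for s in ["", "a", "b", "ab", "c", "cd"])
z = identify_base_piercing(Z, "D")             # the zone c
```

In this zone set, D is a base piercing identified by the zone $c$, whereas C is not a base piercing because it contains two zones.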

We are aiming to identify a class of abstract descriptions that can be drawn with circles, since these are esthetically pleasing, in an efficient manner. If we can build an abstract description D starting from $(\emptyset, \{\emptyset\})$ by successively adding piercings, then it is not necessarily possible to draw D using only circles. Assume that we have an abstract description D, drawn in a completely well-formed manner by successively adding piercings. If we want to add a single piercing, $\lambda$, to obtain some particular abstract description, then it is reasonably obvious that we can add a circle labeled $\lambda$ to that drawing and obtain the required abstraction. We note, however, that a single piercing cannot necessarily be added as a circle to an arbitrary diagram, which will be demonstrated below.

The case of double piercings is more interesting. We observe that in any embedding drawn with circles, any pair of curves intersects exactly twice or not at all. If $\lambda_1$ is a double piercing of $\lambda_2$ and $\lambda_3$, then $\lambda_2$ and $\lambda_3$ intersect exactly twice, meaning that if $\lambda_1$ is to be embedded as a circle in a well-formed manner, then it has to contain exactly one of these intersection points. Therefore, there is a choice of at most two places where $\lambda_1$ can be embedded. In Fig. 15, there was only one choice for the location of D since it has to be enclosed by C. If we are to add more double piercings to $d_5$, then we must ensure that if they pierce A and B and are to be enclosed by C, then they must also be enclosed by D. Fig. 17 illustrates a scenario where we have two choices for how to add D to $d_8$.

Fig. 14. Identifying double piercings.

Fig. 15. Inductively adding piercings.

Fig. 16. Disconnected diagrams.

Further, consider the abstract description with labels $L = \{A, B, C, D, E\}$ and zones $Z = \{\emptyset, a, b, ab, c, ac, bc, abc, d, ad, abd, e, ae, abe\}$. This abstraction has three double piercings of A and B, namely C, D, and E. Fig. 18 shows how we could embed A and B, then add C and D as piercings. It is then obviously not possible to add E as a circle to $d_{14}$. Therefore, we must restrict the manner in which we add double piercings at the abstract level, for which we make use of the following definition:

Definition 6.2. Let $C_1 = C(z, \{\lambda_1, \lambda_2\})$ and $C_2 = C(z \cup \{\lambda_3\}, \{\lambda_1, \lambda_2\})$ be partner clusters given $\lambda_3$. Let $D = (L, Z)$ be an abstract description. If $C_1 \cup C_2 \subseteq Z$, then $\lambda_3$ is outside associated with $C_1$ in D and is inside associated with $C_2$ in D.

Given a double piercing, $\lambda_3$, of $\lambda_1$ and $\lambda_2$ identified by z, $\lambda_3$ is outside associated with $C(z, \{\lambda_1, \lambda_2\})$ and inside associated with $C(z \cup \{\lambda_3\}, \{\lambda_1, \lambda_2\})$. The zones in $C(z \cup \{\lambda_3\}, \{\lambda_1, \lambda_2\})$ are all inside $\lambda_3$, hence, the terminology “inside associated.”
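The outside-association test of Definition 6.2 can be sketched in Python as follows. This is an illustration under assumptions, not the authors' code: zones are `frozenset`s of lowercase label strings, the names `cluster` and `is_outside_associated` are hypothetical, and the sketch additionally requires that $\lambda_3$ is neither in z nor one of the two pierced labels.

```python
from itertools import combinations


def cluster(z, labels):
    """C(z, labels): all zones formed by adding a subset of `labels` to z."""
    z = frozenset(z)
    labels = list(labels)
    return {z | frozenset(extra)
            for r in range(len(labels) + 1)
            for extra in combinations(labels, r)}


def is_outside_associated(Z, lam3, z, pierced):
    """True if lam3 is outside associated with C(z, pierced) in Z.

    Both partner clusters, C(z, pierced) and C(z + lam3, pierced),
    must be present in the zone set.
    """
    z = frozenset(z)
    if lam3 in z or lam3 in pierced:      # assumption of this sketch
        return False
    c1 = cluster(z, pierced)
    c2 = cluster(z | {lam3}, pierced)
    return (c1 | c2) <= Z


# Zone set of the diagram in Fig. 14.
Z = {frozenset(w) for w in "a ab b c ac bc abc cd acd bcd abcd".split()}
Z.add(frozenset())                         # the zone outside all curves

print(is_outside_associated(Z, "d", frozenset("c"), {"a", "b"}))  # True
print(is_outside_associated(Z, "c", frozenset(), {"a", "b"}))     # True
print(is_outside_associated(Z, "d", frozenset(), {"a", "b"}))     # False
```

In this example D is outside associated with $C(c, \{A, B\})$ and C with $C(\emptyset, \{A, B\})$, so no cluster has two labels competing for the same pair of intersection points.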

Finally, to define an inductively pierced description, we need an operation to remove curve labels from abstractions. Removing curve labels involves two steps: we remove the curve label, $\lambda$, from the label set and then we update the zone list, making sure $\lambda$ is removed from each zone; we transform D into the abstract description that we define to be $D - \lambda$.

Definition 6.3. Given an abstract description, $D = (L, Z)$, and $\lambda \in L$, we define $D - \lambda$ to be $D - \lambda = (L - \{\lambda\}, \{z - \{\lambda\} : z \in Z\})$.
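Definition 6.3 translates almost directly into code. The sketch below assumes zones are `frozenset`s of lowercase label strings; the function name `remove_label` is hypothetical. Because Z is a set, zones that become equal after the removal merge automatically.

```python
def remove_label(L, Z, lam):
    """Return D - lam = (L - {lam}, {z - {lam} : z in Z})."""
    if lam not in L:
        raise ValueError(f"{lam!r} is not a label of the description")
    return set(L) - {lam}, {zone - {lam} for zone in Z}


# The description D5 as listed in Section 7; removing E should give D4.
L5 = {"a", "b", "c", "d", "e"}
Z5 = {frozenset(w) for w in
      "a b ab c ac bc abc cd acd bcd abcd be abe".split()}

L4, Z4 = remove_label(L5, Z5, "e")
print(sorted(L4))
print(sorted("".join(sorted(zone)) for zone in Z4))
```

Here the zones be and abe collapse onto the existing zones b and ab, so the zone set shrinks from 13 to 11 zones.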

Definition 6.4. Let $D = (L, Z)$ be an abstract description. Then D is an inductively pierced description if either

1. $D = (\emptyset, \{\emptyset\})$, or
2. D has a base piercing, $\lambda_1$, such that $D - \lambda_1$ is inductively pierced, or
3. D has a single piercing, $\lambda_1$, of $\lambda_2$ such that $D - \lambda_1$ is inductively pierced, or
4. D has a double piercing, $\lambda_1$, of $\lambda_2$ and $\lambda_3$ identified by z, and either
   a. no other curve label, $\lambda_4$, in D is outside associated with $C(z, \{\lambda_2, \lambda_3\})$ or
   b. exactly one other curve label, $\lambda_4$, in D is outside associated with $C(z, \{\lambda_2, \lambda_3\})$ and we have either
      i. $L_c(\lambda_1) = L_c(\lambda_4) = L_c(\lambda_2)$ or
      ii. $L_c(\lambda_1) = L_c(\lambda_4) = L_c(\lambda_3)$,
   and $D - \lambda_1$ is inductively pierced.

Note that given a piercing curve, $\lambda_1$, we have $L_c(\lambda_1) = z$, where z identifies $\lambda_1$. Thus, in point 4b, we seek to establish whether $L_c(\lambda_4) = z$ and whether $L_c(\lambda_2) = z$ or $L_c(\lambda_3) = z$. From Lemma 5.1, we can compute $L_c(\lambda_i)$ using

$L_c(\lambda_i) = \bigcap_{z_j \in \{z \in Z : \lambda_i \in z\}} z_j - \{\lambda_i\}.$
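The formula for $L_c(\lambda_i)$ can be evaluated as in the following Python sketch. It assumes zones are `frozenset`s of lowercase label strings; the function name `containing_labels` is hypothetical.

```python
def containing_labels(Z, lam):
    """L_c(lam): intersect every zone that contains lam, then drop lam."""
    zones = [zone for zone in Z if lam in zone]
    if not zones:
        raise ValueError(f"no zone contains {lam!r}")
    return frozenset.intersection(*zones) - {lam}


# Zone set of the diagram in Fig. 14.
Z = {frozenset(w) for w in "a ab b c ac bc abc cd acd bcd abcd".split()}
Z.add(frozenset())

print(containing_labels(Z, "d"))   # frozenset({'c'}): D lies inside C only
print(containing_labels(Z, "c"))   # frozenset(): no curve contains C
```

For the double piercing D of Fig. 14 the result is the zone c, the zone that identifies D, in line with the observation that $L_c(\lambda_1) = z$ for a piercing curve.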

The space of inductively pierced descriptions represents a relatively small, but not insignificant, fraction of all abstract descriptions. The number of abstract descriptions with label set L is $T = 2^{2^{|L|}} - 1$, since there are $2^{|L|}$ zones and any nonempty set of zones gives rise to an abstract description. In general, the number of inductively pierced descriptions with label set L is bounded above by the number of nonempty subsets of $\mathbb{P}L$ with cardinality at most $4 \times (|L| - 1)$, since there are at most $4 \times (|L| - 1)$ zones in such a description: observing that any inductively pierced description containing only one curve label has two zones, and that there is such a description containing exactly two curve labels with four zones, the maximum number of zones present when $|L| \geq 2$ is

$4 \times (|L| - 1),$

since adding a double piercing increases the number of zones by 4. For any $|L| \geq 2$, it is relatively easy to show that there is an inductively pierced description with label set L containing $4 \times (|L| - 1)$ zones. We can also provide a lower bound on the number of zones present, namely $|L| + 1$, since every time we add a piercing curve, the number of zones increases by at least 1. Again, it is easy to show that there is an inductively pierced description with label set L containing $|L| + 1$ zones. Thus, we can give an upper bound on the number of inductively pierced abstract descriptions containing at least two curve labels:

$UB = \sum_{i = |L| + 1}^{4 \times (|L| - 1)} \binom{2^{|L|}}{i}.$
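To get a feel for how the two counts compare, the following Python sketch evaluates T and UB for small label sets. The function names are hypothetical; the formulas are exactly those stated above, with `n` standing for $|L|$.

```python
from math import comb


def num_abstract_descriptions(n):
    """T = 2^(2^n) - 1: nonempty sets of zones over n labels."""
    return 2 ** (2 ** n) - 1


def ipd_upper_bound(n):
    """UB: zone sets with between n + 1 and 4 * (n - 1) zones (n >= 2)."""
    if n < 2:
        raise ValueError("the bound is stated for at least two curve labels")
    return sum(comb(2 ** n, i) for i in range(n + 1, 4 * (n - 1) + 1))


for n in (2, 3, 4):
    print(n, ipd_upper_bound(n), num_abstract_descriptions(n))
```

For two labels the bound is 5 out of 15 descriptions, and for three labels it is 163 out of 255, which illustrates why the raw bound says little before isomorphic descriptions are identified.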

However, the number of abstract descriptions and the upper bound placed on the number of inductively pierced descriptions are not hugely insightful. First, there are many abstract descriptions that are isomorphic to each other: $D_1 = (L_1, Z_1)$ is isomorphic to $D_2 = (L_2, Z_2)$ if there exists a bijection $\sigma : L_1 \to L_2$ that induces a bijection $\gamma : Z_1 \to Z_2$ such that for each $z_1 \in Z_1$, $\lambda \in z_1$ if and only if $\sigma(\lambda) \in \gamma(z_1)$. Table 1 summarizes the number of inductively pierced descriptions (IPDs) and the number of abstract descriptions (ADs) for fixed label set, containing up to four curve labels reduced up to isomorphism:

1026 IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, VOL. 17, NO. 7, JULY 2011

Fig. 17. Choices of embedding.

Fig. 18. Limitations.

7 EMBEDDING INDUCTIVELY PIERCED DESCRIPTIONS

The manner in which we embed inductively pierced descriptions reflects the inductive definition. Given an abstract description that we wish to embed, we start with the empty diagram and successively add curves in the appropriate manner until we obtain a diagram with the required abstraction. In order to specify how to add a curve label $\lambda$ to an abstract description $D - \lambda$ to give D, we need to describe $\lambda$'s effect on the zones.

The diagram in Fig. 12 can be obtained from that in Fig. 19 by adding a curve B. The abstraction of the diagram in Fig. 19 consists of $L = \{A, C, D\}$ and $Z = \{\emptyset, a, c, cd\}$. To describe how to add B, all we need to know is which zone identifies B as a piercing and which curves B pierces. In this example, B is identified by the zone $\emptyset$ and pierces A. The cluster $C(\emptyset \cup \{B\}, \{A\})$ allows us to create the abstraction of the diagram in Fig. 12 from the abstraction, D, of the diagram in Fig. 19: we take $L \cup \{B\}$ and $Z \cup C(\emptyset \cup \{B\}, \{A\})$. We can write this abstraction as $D + (B, C(\emptyset \cup \{B\}, \{A\}))$. For simplicity, we will write $D + B$ when the cluster is clear from the context or not relevant.
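The operation $D + (\lambda, C(z \cup \{\lambda\}, \Lambda))$ can be sketched in Python as below, reproducing the Fig. 19 to Fig. 12 example. The representation (zones as `frozenset`s of lowercase label strings) and the names `cluster` and `add_piercing` are assumptions made for illustration.

```python
from itertools import combinations


def cluster(z, labels):
    """C(z, labels): all zones formed by adding a subset of `labels` to z."""
    z = frozenset(z)
    labels = list(labels)
    return {z | frozenset(extra)
            for r in range(len(labels) + 1)
            for extra in combinations(labels, r)}


def add_piercing(L, Z, lam, z, pierced):
    """Return D + (lam, C(z + lam, pierced)) for D = (L, Z)."""
    if lam in L:
        raise ValueError(f"{lam!r} is already a label")
    z = frozenset(z)
    return set(L) | {lam}, set(Z) | cluster(z | {lam}, pierced)


# Abstraction of Fig. 19; B is identified by the empty zone and pierces A.
L = {"a", "c", "d"}
Z = {frozenset(), frozenset("a"), frozenset("c"), frozenset("cd")}

L12, Z12 = add_piercing(L, Z, "b", frozenset(), {"a"})
print(sorted("".join(sorted(zone)) for zone in Z12))
```

The two new zones are b and ab, and removing B again (deleting it from every zone) returns the original zone set, which is the sense in which adding and removing piercings are inverse steps.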

So, when we identify a curve as a piercing curve, we store $C(z, L)$ in order to know how to reconstruct D when starting with the empty diagram (which is an embedding of $(\emptyset, \{\emptyset\})$). Our embedding method adds curves successively to drawn Euler diagrams until we obtain an Euler diagram with the required abstraction. For instance, if we want to embed the abstract description $D_5 = (L, Z)$, where $L = \{A, B, C, D, E\}$ and $Z = \{a, b, ab, c, ac, bc, abc, cd, acd, bcd, abcd, be, abe\}$, then our first task is to identify whether $D_5$ has a piercing curve. Here, E is a single piercing of A, identified by $z = b$ with $C(z, \{A\}) = \{b, ab\}$; note that $D_5$ is the abstraction of $d_5$ in Fig. 15. Removing E from $D_5$ yields $D_4 = D_5 - E$ with $L(D_4) = \{A, B, C, D\}$ and $Z(D_4) = \{a, b, ab, c, ac, bc, abc, cd, acd, bcd, abcd\}$, the abstraction of $d_4$. Continuing in this manner, we identify D as a piercing of $D_4$, C as a piercing of $D_3 = D_4 - D$, and so forth, until we obtain $(\emptyset, \{\emptyset\})$. Successively removing piercings gives a sequence of abstract descriptions that mirrors our inductive process of adding circles at the drawn diagram level. In this case, that sequence is $\langle D_0, D_1, \ldots, D_5 \rangle$.

Definition 7.1. Given an abstract description, $D = (L, Z)$, a pierced decomposition of D is a sequence, $dec(D) = \langle D_0, D_1, \ldots, D_n \rangle$, where each $D_{i-1}$ ($0 < i \leq n$) is obtained from $D_i$ by the removal of some piercing, $\lambda_i$ (so, $D_{i-1} = D_i - \lambda_i$) and $D_n = D$. If $D_0$ contains no labels, then $dec(D)$ is a total pierced decomposition.

The notion of a decomposition is similar to an abstraction of Euler diagrams developed in [26]. Establishing whether an arbitrary abstraction, D, is an inductively pierced description produces, as a by-product, a total pierced decomposition. Trivially, we have the following lemma:

Lemma 7.1. Every inductively pierced description has a total pierced decomposition.

So, the first step in our embedding process is to create a total pierced decomposition. The following theorem allows us to identify whether an abstract description is inductively pierced in a relatively efficient manner. It establishes that the order in which we remove piercings is not important when determining whether a description is inductively pierced:

Theorem 7.1. Let D be an inductively pierced description with piercing $\lambda_1$. Then $D - \lambda_1$ is also inductively pierced.

To prove the above theorem, one uses the following lemma.

Lemma 7.2. Let D be an inductively pierced description with at least two distinct piercings, $\lambda_1$ and $\lambda_2$. Then $\lambda_2$ is a piercing of $D - \lambda_1$.
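Because the order of removal does not matter, a decomposition can be sought greedily. The Python sketch below is a simplified illustration, not the authors' implementation, and rests on several assumptions: zones are `frozenset`s of lowercase label strings; a label $\lambda$ is treated as a base, single, or double piercing of a label set $\Lambda$ with $|\Lambda| \leq 2$ when $Z_c(\lambda) = C(z \cup \{\lambda\}, \Lambda)$ and $C(z, \Lambda) \subseteq Z$ for $z = L_c(\lambda)$; the example zone set includes the empty zone; and the outside-association restriction of Definition 6.4, point 4, is deliberately not checked. All function names are hypothetical.

```python
from itertools import combinations


def cluster(z, labels):
    """C(z, labels): all zones formed by adding a subset of `labels` to z."""
    z = frozenset(z)
    labels = list(labels)
    return {z | frozenset(extra)
            for r in range(len(labels) + 1)
            for extra in combinations(labels, r)}


def find_piercing(L, Z):
    """Return (lam, z, pierced) for some piercing of (L, Z), else None."""
    for lam in sorted(L):
        inside = [zone for zone in Z if lam in zone]
        if not inside:
            continue
        z = frozenset.intersection(*inside) - {lam}         # L_c(lam)
        pierced = frozenset.union(*inside) - z - {lam}      # curves crossed
        if len(pierced) > 2:
            continue
        if set(inside) == cluster(z | {lam}, pierced) and cluster(z, pierced) <= Z:
            return lam, z, pierced
    return None


def total_pierced_decomposition(L, Z):
    """Greedily remove piercings; return them in drawing order, or None.

    Simplification: condition 4a/4b of Definition 6.4 is NOT checked.
    """
    L, Z = frozenset(L), set(Z)
    removed = []
    while L:
        found = find_piercing(L, Z)
        if found is None:
            return None
        removed.append(found)
        lam = found[0]
        L, Z = L - {lam}, {zone - {lam} for zone in Z}
    return removed[::-1] if Z == {frozenset()} else None


# The zones of D5 (Fig. 15, d5), with the empty zone added for this sketch.
Z5 = {frozenset(w) for w in "a b ab c ac bc abc cd acd bcd abcd be abe".split()}
Z5.add(frozenset())
steps = total_pierced_decomposition("abcde", Z5)
print([lam for lam, z, pierced in steps])
```

The returned list plays the role of the stored piercings and clusters: each entry records the label, the identifying zone, and the pierced labels, in the order in which circles would be added.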

We can easily identify some curve labels as not being piercings, using the following lemma. Moreover, we can also quickly identify some descriptions as not being inductively pierced:

Lemma 7.3. Let $D = (L, Z)$ be an abstract description such that $L \neq \emptyset$ with $\lambda \in L$.

1. If $|Z_c(\lambda)| \neq 1$ and $|Z_c(\lambda)| \neq 2$ and $|Z_c(\lambda)| \neq 4$, then $\lambda$ is not a piercing curve.
2. If $|L(D)| \geq 2$ and $|Z(D)| > 4 \times (|L(D)| - 1)$, then D is not inductively pierced.
3. If $|Z(D)| < |L(D)|$, then D is not inductively pierced.
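The cheap rejection tests of Lemma 7.3 might be coded as follows. This is a sketch with hypothetical function names, assuming zones are `frozenset`s of lowercase label strings; the upper bound on the zone count is applied only for at least two labels, the range for which Section 6 derives it.

```python
def cannot_be_piercing(Z, lam):
    """Point 1: a piercing curve contains exactly 1, 2, or 4 zones."""
    return sum(1 for zone in Z if lam in zone) not in (1, 2, 4)


def fails_zone_count_bounds(L, Z):
    """Points 2 and 3: too many or too few zones to be inductively pierced."""
    n, m = len(L), len(Z)
    too_many = n >= 2 and m > 4 * (n - 1)
    too_few = m < n
    return too_many or too_few


# Zone set of the diagram in Fig. 14.
Z = {frozenset(w) for w in "a ab b c ac bc abc cd acd bcd abcd".split()}
Z.add(frozenset())

print(cannot_be_piercing(Z, "d"))            # False: D contains 4 zones
print(cannot_be_piercing(Z, "a"))            # True: A contains 6 zones
print(fails_zone_count_bounds("abcd", Z))    # False: 12 zones is allowed
```

These tests only rule candidates out; a label that passes point 1 still has to be checked against the piercing definitions.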

We now present a key result of this paper: all inductively pierced descriptions can be embedded in a completely well-formed manner, where all curves are circles.

Theorem 7.2. Let D be an inductively pierced description. Then there is an embedding of D that possesses all of the well-formedness conditions and all of whose curves are circles.

TABLE 1. The Proportion of Inductively Pierced Abstract Descriptions

Fig. 19. Adding a curve.

Proof. The strategy to prove this theorem is to use an induction argument. Given a total pierced decomposition, $(D_0, \ldots, D_n)$, we assume that $D_i = (L_i, Z_i)$ is embedded as $d_i$ in the appropriate manner (i.e., well-formed and with circles) and such that for any pair of zones, $z_1$ and $z_2$, in $d_i$, if their abstract descriptions have a symmetric difference containing exactly one label, then they are topologically adjacent. In any well-formed diagram (in particular, $d_i$), we also have the property that any pair of zones that are topologically adjacent have exactly one label in the symmetric difference of their abstractions.

Consider, then, $D_{i+1}$, which is obtainable from $D_i$ by adding a piercing, $\lambda_i$. We know, therefore, that $D_{i+1} = D_i + (\lambda_i, C(z, L))$, for some $L \subseteq L_i$ and zone $z \in Z_i$. We can show that it is possible to add a circle to $d_i$ to give $d_{i+1} = d_i + \lambda_i$ in such a manner that

1. $\mathrm{abstract}(d_i + \lambda_i) = D_{i+1}$.
2. $d_i + \lambda_i$ is well formed.
3. Any pair of zones in $d_i + \lambda_i$ whose abstractions have a symmetric difference that contains exactly one label are topologically adjacent.

Clearly, if $\lambda_i$ is a base or single piercing, this is trivial given the topological adjacency property. Thus, we sketch the argument, where $\lambda_i$ is a double piercing.

Suppose that $\lambda_i$ pierces $\lambda_x$ and $\lambda_y$ identified by z. Then, in $d_i$, the curves labeled $\lambda_x$ and $\lambda_y$ are drawn with circles and overlap (like Venn-2, the Venn diagram with two curves); see the representative region in $d_i$ in Fig. 20, which includes $\lambda_x$ and $\lambda_y$ but may omit curves in $d_i$ that also pass through the illustrated region. Clearly, we can add a curve labeled $\lambda_i$ to $d_i$, as shown in the representative region of $d_i + \lambda_i$, by our assumption about topological adjacency. However, we need to ensure that the curve labeled $\lambda_i$ only intersects $\lambda_x$ and $\lambda_y$ and does not contain any other curves in $d_i + \lambda_i$ so that we get the correct abstraction (i.e., $\mathrm{abstract}(d_i + \lambda_i) = D_i + \lambda_i$); since $d_i$ is embedded using only circles, this is the only way the curve labeled $\lambda_i$ can be embedded in such a manner that we have the wrong abstraction.

We assume that $d_i + \lambda_i$ has the wrong abstraction. We further assume, without loss of generality, that the curve labeled $\lambda_i$ does not contain any curves that are base piercings or are single piercings of $\lambda_x$ or $\lambda_y$ (we can always route $\lambda_i$ to achieve this given topological adjacency and well-formedness). Similarly, we can always route $\lambda_i$ through $d_i$ in such a manner that it only intersects with $\lambda_x$ and $\lambda_y$. Hence, the curve labeled $\lambda_i$ contains another curve because we have the wrong abstraction. Therefore, it must contain at least one curve labeled, say, $\lambda_j$ that is outside associated with $C(z, \{\lambda_x, \lambda_y\})$ in $D_i$ since $d_i$ is a well-formed embedding of $D_i$.

Since $d_i + \lambda_i$ has the wrong abstraction, $\lambda_j$ is also outside associated with $C(z, \{\lambda_x, \lambda_y\})$ in $D_i + \lambda_i$. We know that $D_{i+1}$ is inductively pierced, so there is no other curve label in $D_{i+1}$ that is outside associated with $C(z, \{\lambda_x, \lambda_y\})$. Moreover, if an additional curve label $\lambda_k$ was outside associated with $C(z, \{\lambda_x, \lambda_y\})$ in $D_i$, then it too would be outside associated with $C(z, \{\lambda_x, \lambda_y\})$ in $D_i + \lambda_i$. This implies that in $D_i$, $\lambda_j$ is the only curve label that is outside associated with $C(z, \{\lambda_x, \lambda_y\})$.

We know, again without loss of generality, from the definition of an inductively pierced description that $L_c(\lambda_i) = L_c(\lambda_j) = L_c(\lambda_x)$ in $D_i + \lambda_i$. Therefore, in $D_i$, $L_c(\lambda_j) = L_c(\lambda_x)$. It then follows that the four zones around the point q are exactly those in $C(z, \{\lambda_x, \lambda_y\})$ and we can, therefore, draw a circle around q labeled $\lambda_i$ to give $d_i + \lambda_i$ with abstraction $D_i + \lambda_i$. Hence, $D_i + \lambda_i$ can be embedded using circles. Trivially, we can draw the circle labeled $\lambda_i$ sufficiently small to ensure well-formedness. Finally, it is also trivial that the constructed embedding ensures that zones whose abstractions have a one label symmetric difference are topologically adjacent, and the result then follows by induction. $\square$

We note that in the above proof, we argued that we could draw a circle sufficiently small in order to add it in the correct manner; this was to ensure that the circle c was enclosed by the correct set C of other circles. This need to draw curves with a small area inside them is not particular to using circles to represent the sets. When we want to draw one curve inside another, it is necessary that it has a smaller area inside it. This feature of Euler diagrams may make them difficult to read at a single scale.

In the definition of an inductively pierced description, the assertion in condition 4 that at most one other curve label $\lambda_4$ exists (satisfying the specified properties in 4(b)) is necessary for ensuring that we can draw an appropriate diagram with circles.

Theorem 7.3. Let D be an abstract description with double piercing, $\lambda_1$, of $\lambda_2$ and $\lambda_3$ given z. Suppose that $D - \lambda_1$ is inductively pierced and that there exist two distinct curve labels, $\lambda_4$ and $\lambda_5$, which are outside associated with $C(z, \{\lambda_2, \lambda_3\})$ in $D - \lambda_1$, and for each $i \in \{4, 5\}$, either

1. $L_c(\lambda_1) = L_c(\lambda_i) = L_c(\lambda_2)$ or
2. $L_c(\lambda_1) = L_c(\lambda_i) = L_c(\lambda_3)$.

Then D cannot be drawn with circles.

The strategy for the proof is to take any embedding of $D - \lambda_1$ drawn with circles, show that $\lambda_4$ encloses one point where $\lambda_2$ and $\lambda_3$ intersect and that $\lambda_5$ encloses the other point. It then follows that $\lambda_1$ cannot be drawn as a circle to give a diagram with abstraction D.


Fig. 20. Drawing with circles.


8 LIMITATIONS

We have already seen that we cannot necessarily draw double piercings as circles, unless we are considering an abstract description that is inductively pierced. It is natural to ask whether, if an abstract description has a single piercing, that piercing can be drawn as a circle. In general, the answer to this is no. Fig. 21 shows an example: if we want to add a single piercing, E, of C identified by zone ab, then we cannot do so using a circle. There are alternative (not well-formed) embeddings of $\mathrm{abstract}(d)$ that allow us to draw E as a circle. This example proves the following lemma:

Lemma 8.1. Let D be an abstract description with a single piercing curve $\lambda$. Then it is not necessarily the case that a curve labeled $\lambda$ can be added as a circle to an embedding of $D - \lambda$ to give an embedding of D.

Perhaps the most obvious question is whether this work extends to the case of triple piercings. A triple piercing, $\lambda_1$, of $\lambda_2$, $\lambda_3$, and $\lambda_4$ identified by z contains exactly the zones in $C(z \cup \{\lambda_1\}, \{\lambda_2, \lambda_3, \lambda_4\})$. To illustrate, D is a triple piercing in the abstract description with zone set $\{\emptyset, a, b, ab, c, ac, bc, abc, d, ad, bd, abd, cd, acd, bcd, abcd\}$. Visually, if we were to draw this abstract description using circles, we would need to add D to a diagram like $d_{12}$ in Fig. 18. Intuitively, D cannot be added as a circle and we cannot extend the results to allow triple piercings and still draw the diagrams with circles.

Finally, while we have identified a significant class of abstract descriptions that can be embedded using circles in an efficient manner, there are still many descriptions that can be drawn with circles but are not inductively pierced, such as that in Fig. 21. Future work will extend the techniques developed in this paper to identify a larger class of abstract descriptions that can be embedded with circles.

9 IMPLEMENTATION

We have implemented a tool that allows the automated embedding of inductively pierced abstract descriptions, including functionality to identify whether a description has this property. It is relatively straightforward to establish whether any given abstract description, $D = (L, Z)$, is inductively pierced. As stated before, if D is inductively pierced, then establishing this produces a total pierced decomposition, $dec(D) = (D_0, \ldots, D_n)$. When we generate $dec(D)$, we also create a list of piercings $l_p = (\lambda_0, \ldots, \lambda_{n-1})$, where $D_i + \lambda_i = D_{i+1}$, and a list of clusters, $l_c = (C(z_0, L_0), \ldots, C(z_{n-1}, L_{n-1}))$, where $\lambda_i$ pierces the curves in $L_i$ identified by $z_i$. The two lists $l_p$ and $l_c$ completely determine $dec(D)$ and we use these two lists to construct our embedding.

Once an abstract description, D, is confirmed to be an inductively pierced description, the list of piercings $l_p$ and list of clusters $l_c$ can be used to generate the layout. Since we are building up our diagram inductively, at each key stage in the embedding process, we need to add a circle $C_i$ that is to be labeled $\lambda_i$. The program will establish what type of piercing $\lambda_i$ is. Each $\lambda_i$ is either contained by other circles that have already been drawn or not contained by any other circles.

We first sketch the case of how to add $C_i$ when $\lambda_i$ is not contained by any other curves. If $\lambda_i$ is a base piercing, the program will calculate the occupied region in the current drawing (see the dotted rectangle area in Fig. 22) and place the curve $C_i$ outside that region. The size of the curve is determined by the number of curves with which $C_i$ is to intersect in an embedding of D; the more intersections, the bigger we make the curve, to ensure that the still-to-be-added curves are not too small in the final diagram.

Suppose now that $C_i$ is a single piercing of $C_p$ in the current diagram (i.e., the embedding of $D_i$). If $C_i$ is not fully contained by any other curves, the method will first find the occupied sectors of curve $C_p$ and generate a list of available sectors (see the gray areas in Fig. 23 as an example). The algorithm will then find the most suitable available sector in which to fit the curve. The most suitable sector is selected by measuring the “size” of each available sector (see the dotted line in Fig. 23 as an example) and finding one whose size is closest to twice the desired radius, $r_i$, of $C_i$. Once the most suitable sector is found, the algorithm will fit the new curve. The size and center of the new curve will be adjusted to make sure the resulting diagram has abstraction $D_i + \lambda_i$.

If $C_i$ is a double piercing curve, the algorithm will find the two curves that $C_i$ pierces, say, $C_{p1}$ and $C_{p2}$. It calculates the two intersection points of $C_{p1}$ and $C_{p2}$ and selects a valid one as the center of $C_i$. An invalid intersection point is detectable by checking whether that point has been used as the center of another double piercing, $C_x$, of $C_{p1}$ and $C_{p2}$, and if such a $C_x$ exists, whether $C_x$ is to contain $C_i$. Once a valid intersection point is found, it can be used as the center of the new curve $C_i$. The radius of the curve can be determined by the area which is already marked as available between $C_{p1}$ and $C_{p2}$ (see the gray area of Fig. 24 as an example). Again, the radius of $C_i$ will be adjusted to make sure that adding the new curve results in a diagram with abstraction $D_i + \lambda_i$.
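The geometric core of this step is finding the two points where the pierced circles cross. The Python sketch below shows one standard way to compute them; it is not taken from the authors' Java tool. Circles are given as a center `(x, y)` and a radius, and the names `circle_intersections` and `is_inside` are hypothetical.

```python
import math


def circle_intersections(c1, r1, c2, r2):
    """Return the two crossing points of two circles, or [] if they
    do not cross at exactly two points."""
    (x1, y1), (x2, y2) = c1, c2
    dx, dy = x2 - x1, y2 - y1
    d = math.hypot(dx, dy)
    if d == 0 or d >= r1 + r2 or d <= abs(r1 - r2):
        return []                       # concentric, disjoint, nested, or tangent
    a = (r1 * r1 - r2 * r2 + d * d) / (2 * d)   # distance from c1 to the chord
    h = math.sqrt(max(0.0, r1 * r1 - a * a))    # half the chord length
    mx, my = x1 + a * dx / d, y1 + a * dy / d   # midpoint of the chord
    return [(mx + h * dy / d, my - h * dx / d),
            (mx - h * dy / d, my + h * dx / d)]


def is_inside(point, center, radius):
    """True if `point` lies strictly inside the given circle."""
    return math.hypot(point[0] - center[0], point[1] - center[1]) < radius


candidates = circle_intersections((0, 0), 5, (6, 0), 5)
print(candidates)   # approximately [(3.0, -4.0), (3.0, 4.0)]
```

A candidate center could then be filtered with `is_inside` against an enclosing circle, or discarded if it already hosts another double piercing that is not meant to contain the new curve; the radius choice described in the text is not modeled here.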

Suppose now that $C_i$ is contained by some other curve in $D_i + \lambda_i$. If $C_i$ is fully contained by other curves, the method will find the innermost outside curve of $C_i$, say, $C_0$, and fit $C_i$ inside $C_0$. If $C_i$ is a base piercing, the program will calculate the unoccupied area (see the gray area in Fig. 25) in $C_0$ and then fit $C_i$ inside this area. If $C_i$ is a single piercing curve, the algorithm will calculate the sector in curve $C_p$, which is occupied by $C_0$ (see Fig. 26). A list of available subsectors will then be generated (see the gray sectors in Fig. 26). Similar to adding single piercing curves that are not contained by other curves, $C_i$ will be fitted to the most suitable sector and the radius will be adjusted to generate a valid layout. If $C_i$ is a double piercing curve, the method will find the two curves that $C_i$ pierces, say, $C_{p1}$ and $C_{p2}$. The two intersection points of the two curves will be calculated and one that is inside curve $C_0$ will be used as the center of $C_i$; see Fig. 27.

Fig. 21. Single piercings cannot always be drawn as circles.

Fig. 22. Adding a base piercing curve that is inside no other curves.

Fig. 23. Adding a single piercing curve that is inside no other curves.

To prototype the generation mechanism, we have implemented the method as a Java program, available from www.eulerdiagrams.com/piercing.htm. The program takes an abstract description of an Euler diagram as input and checks whether the abstract description is inductively pierced. Once the program confirms that an abstract description is inductively pierced, it will draw the diagram by inductively fitting circles to available spaces in the reverse order of the sequence of removal. Figs. 1, 3, 9, 12, 13, 14, 15, 16, 17, 18, and 19 were all drawn using our software. Fig. 28 shows some nice automatically generated layouts for Euler diagrams containing many curves. Fig. 29 shows a diagram containing 52 circles, which took 6.641 seconds to draw (Intel Pentium CPU with 2 GB RAM, under the Windows XP operating system, Java Version 1.6.0_03 from Sun Microsystems, Inc.). As one can see in this diagram, the circles are sometimes a little small.


Fig. 24. Adding a double piercing curve that is inside no other curves.

Fig. 25. Adding a base piercing curve inside other curves.

Fig. 26. Adding a single piercing curve inside other curves.

Fig. 27. Adding a double piercing curve inside other curves.

Fig. 28. Output from the software.

Fig. 29. A diagram containing 52 circles.


The method generates the basic drawing of a piercing diagram with circles. The layout may not be optimal but various methods can be applied to adjust the size of circles after the initial drawing. In future work, we plan to develop force directed methods that will move the circles’ center points and will alter the radii to improve the layout while maintaining the abstract description.

10 COMPLEXITY

The class of inductively pierced descriptions can be drawn efficiently, even under the requirement that the diagrams produced are completely well formed. A naive algorithm to identify whether an abstract description, D, is inductively pierced first checks the cardinality of the zone set: D is not inductively pierced if $|Z(D)| > 4 \times (|L(D)| - 1)$. If the zone set is small enough, then for each curve label $\lambda$, we determine whether it is a piercing curve. As soon as we identify a piercing curve, we check whether the conditions of Definition 6.4 are satisfied; in the worst case, this takes $|L(D)| \times |Z(D)|$, which is $O(|L(D)|^2)$, since $|Z(D)| \leq 4 \times (|L(D)| - 1)$. Again, in the worst case, we have to iterate through all the curve labels in $L(D)$ performing this process. Hence, the time complexity of this naive algorithm is $O(|L(D)|^3)$. The generation process is efficient because the program only needs to calculate the radius and center of each circle; the embedding stage (i.e., after we have produced a decomposition) of the implemented algorithm is of $O(|L(D)|^2)$. Therefore, the time complexity of the entire generation algorithm is that of the process of seeking a decomposition, $O(|L(D)|^3)$.

We note that there are trivial, highly efficient, methods of drawing any abstract description but they pay no regard to well-formedness; for example, draw one circle for each zone, join these circles at a single point (creating a “wedge of circles”), then traverse the circles to form each curve [27]. It is perhaps natural to ask whether other drawing methods that produce completely well-formed diagrams have similar efficient properties to our method given the restriction to the class of inductively pierced descriptions. Thus, we will consider in more detail the dual-graph-based method of Flower and Howse [5], since this method guarantees to draw a well-formed diagram whenever this is possible; this drawing process was illustrated in Fig. 2.

Given an abstract description, $D = (L, Z)$, the first stage of the Flower and Howse method constructs the so-called superdual: the superdual of D is a graph, G, with vertex set Z and with an edge, e, between two zones (i.e., vertices) whenever the symmetric difference of the zones contains exactly one curve label. The next stage is to find a subgraph of the superdual that is planar, well connected, and has an embedding which passes the face conditions; we do not have space to formally define these conditions. The superdual has a planar subgraph that possesses these two conditions if and only if the abstract description D has a completely well-formed embedding. Moreover, Flower and Howse use such a subgraph to produce a completely well-formed embedding. Checking whether an arbitrary abstract description has a superdual with such a subgraph is known to be NP-complete.
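The superdual construction described above can be sketched in a few lines of Python. The sketch assumes zones are `frozenset`s of lowercase label strings and uses a hypothetical function name, `superdual_edges`; it builds only the edge set and does not attempt the planarity, connectivity, or face-condition checks.

```python
from itertools import combinations


def superdual_edges(Z):
    """Edges of the superdual: pairs of zones whose symmetric
    difference contains exactly one curve label."""
    return {frozenset((z1, z2))
            for z1, z2 in combinations(Z, 2)
            if len(z1 ^ z2) == 1}


# Venn-2: the zones outside both curves, in A only, in B only, and in both.
Z = {frozenset(), frozenset("a"), frozenset("b"), frozenset("ab")}
edges = superdual_edges(Z)
print(len(edges))   # 4: the superdual of Venn-2 is a 4-cycle
```

Comparing every pair of zones takes time quadratic in $|Z|$, which for inductively pierced descriptions is quadratic in the number of labels.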

If we consider only inductively pierced descriptions, then it is relatively easy to establish that the superdual is planar, well connected, and that any embedding of it passes the face conditions. Hence, the Flower and Howse method when applied to inductively pierced descriptions does not have this NP-complete step. Therefore, the complexity of this method is dependent on 1) the algorithm used to find a plane embedding of the superdual and 2) the algorithm used to construct the curves of the diagram given the embedding of the superdual. There are known polynomial-time algorithms to construct plane embeddings of planar graphs, so point 1 is of polynomial complexity.

Regarding point 2, it is unknown whether the algorithm used to draw the curves as presented in [5] is of comparable complexity to our drawing method (i.e., at most $O(|L(D)|^2)$, giving an overall complexity of $O(|L(D)|^3)$) or has worse time complexity. The method presented in [3] for drawing the curves from a suitable subgraph of the superdual is polynomial. However, typically, the curves are not circles, since the layout of the superdual impacts the possible routings for the curves.

It may also be natural to ask, if we consider the larger class of abstract descriptions consisting of all those whose number of zones is polynomial in the number of labels, whether the drawing method of Flower and Howse is of polynomial-time complexity. We suspect that the answer to this may well be no: it is easy to show that there are abstract descriptions whose number of zones is linear (e.g., $4 \times (|L(D)| - 1)$) in the number of labels where the superdual is nonplanar, and therefore, a large planar subgraph must be found, which is well connected and has an embedding which passes the face conditions. Known algorithms to find well-connected planar subgraphs are exponential and the potential need to consider all different plane embeddings of each such subgraph cannot be overlooked.

11 CONCLUSION

In this paper, we have identified a class of abstract descriptions of Euler diagrams that can be drawn efficiently with circles. Using circles brings an esthetic quality to the automatically generated diagrams. The implemented software that we have developed, which is freely available from www.eulerdiagrams.com/piercing.htm, demonstrates the practical utility of the research. The many areas in which Euler diagrams can be, and are, used to visualize information serve to demonstrate the significance of the work.

Our results improve on previous contributions in a number of ways. Existing generation approaches tend to embed Euler diagrams using polygons, sometimes with quite irregular shapes. Moreover, generating an embedding of an Euler diagram from an arbitrary abstract description can be very computationally expensive: the complexity of the computation often grows exponentially as the number of curves in the diagram increases. We have identified that a class of inductively pierced descriptions can be drawn in a completely well-formed manner in polynomial time.

In the future, we plan to investigate an amalgamation of embedding methods. A goal is to be able to identify a “maximally pierced subdescription” of an abstract description. If we are able to do this, then we can embed that subdescription using the method presented in this paper and then use known techniques to add the remaining curves to the diagram [8], [23]. To illustrate, if we want to embed the abstraction D with zones $Z(D) = \{\emptyset, a, b, ab, c, ac, bc, abc, ad, acd, e, be, ce, bce, abe, abce\}$, we can remove E, giving $D - E$ that is inductively pierced, embed $D - E$, and then add E, as shown in Fig. 30.



ACKNOWLEDGMENTS

This work is supported by the UK EPSRC grants EP/E011160/1 and EP/E010393/1 for the Visualization with Euler Diagrams project. The authors thank the anonymous reviewers for their very helpful comments, which led to an improved version of this paper.

REFERENCES

[1] S. Chow and F. Ruskey, “Drawing Area-Proportional Venn and Euler Diagrams,” Proc. Conf. Graph Drawing, pp. 466-477, Sept. 2003.

[2] J. Flower, J. Howse, and J. Taylor, “Nesting in Euler Diagrams: Syntax, Semantics and Construction,” Software and Systems Modelling, vol. 3, pp. 55-67, Mar. 2004.

[3] P. Rodgers, L. Zhang, G. Stapleton, and A. Fish, “Embedding Wellformed Euler Diagrams,” Proc. 12th Int’l Conf. Information Visualization, pp. 585-593, 2008.

[4] A. Verroust and M.-L. Viaud, “Ensuring the Drawability of Euler Diagrams for up to Eight Sets,” Proc. Third Int’l Conf. Theory and Application of Diagrams, pp. 128-141, 2004.

[5] J. Flower and J. Howse, “Generating Euler Diagrams,” Proc. Second Int’l Conf. Theory and Application of Diagrams, pp. 61-75, Apr. 2002.

[6] P. Simonetto, D. Auber, and D. Archambault, “Fully Automatic Visualisation of Overlapping Sets,” Computer Graphics Forum, vol. 28, no. 3, pp. 967-974, 2009.

[7] J. Flower, A. Fish, and J. Howse, “Euler Diagram Generation,” J. Visual Languages and Computing, vol. 19, pp. 675-694, 2008.

[8] G. Stapleton, J. Howse, P. Rodgers, and L. Zhang, “Generating Euler Diagrams from Existing Layouts,” Proc. Workshop Layout of Software Eng. Diagrams, 2008.

[9] S. Kent, “Constraint Diagrams: Visualizing Invariants in Object Oriented Modelling,” Proc. Conf. Object-Oriented Programming, Systems, Languages and Applications (OOPSLA ’97), pp. 327-341, Oct. 1997.

[10] S.-K. Kim and D. Carrington, “Visualization of Formal Specifications,” Proc. Sixth Asia Pacific Software Eng. Conf., pp. 102-109, 1999.

[11] I. Oliver, J. Howse, G. Stapleton, E. Nuttila, and S. Torma, “Expressing Ontologies Using Diagrammatic Logics,” Proc. Int’l Semantic Web Conf. (Posters and Demos), http://kcap09.stanford.edu/share/posterDemos, 2010.

[12] Y. Zhao and J. Lovdahl, “A Reuse Based Method of Developing the Ontology for E-Procurement,” Proc. Nordic Conf. Web Services, pp. 101-112, 2003.

[13] P. Artes and B. Chauhan, “Longitudinal Changes in the Visual Field and Optic Disc in Glaucoma,” Progress in Retinal and Eye Research, vol. 24, no. 3, pp. 333-354, 2005.

[14] R. DeChiara, U. Erra, and V. Scarano, “VennFS: A Venn Diagram File Manager,” Proc. Seventh Int’l Conf. Information Visualization (IV ’03), pp. 120-126, 2003.

[15] E. Hammer, Logic and Visual Information. CSLI Publications, 1995.

[16] H. Kestler, A. Muller, J. Kraus, M. Buchholz, T. Gress, H. Liu, D. Kane, B. Zeeberg, and J. Weinstein, “Vennmaster: Area-Proportional Euler Diagrams for Functional GO Analysis of Microarrays,” BMC Bioinformatics, vol. 9, p. 67, 2008.

[17] T. Quick, C. Nehaniv, K. Dautenhahn, and G. Roberts, “Sensorimotor Information Flow in Genetic Regulatory Network Driven Control Systems,” Technical Report Research Note RN/05/29, Univ. College London.

[18] S.-J. Shin, The Logical Status of Diagrams. Cambridge Univ. Press, 1994.

[19] N. Swoboda and G. Allwein, “Using DAG Transformations to Verify Euler/Venn Homogeneous and Euler/Venn FOL Heterogeneous Rules of Inference,” J. Software and System Modeling, vol. 3, no. 2, pp. 136-149, 2004.

[20] J. Flower, P. Rodgers, and P. Mutton, “Layout Metrics for Euler Diagrams,” Proc. Seventh Int’l Conf. Information Visualisation, pp. 272-280, 2003.

[21] P. Rodgers, L. Zhang, and A. Fish, “General Euler Diagram Generation,” Proc. Int’l Conf. Theory and Application of Diagrams, Sept. 2008.

[22] P. Simonetto and D. Auber, “An Heuristic for the Construction of Intersection Graphs,” Proc. 13th Int’l Conf. Information Visualisation, 2009.

[23] G. Stapleton, P. Rodgers, J. Howse, and L. Zhang, “Inductively Generating Euler Diagrams,” to be published in IEEE Trans. Visualization and Computer Graphics, 2009.

[24] S. Chow, “Generating and Drawing Area-Proportional Euler and Venn Diagrams,” PhD dissertation, Univ. of Victoria, 2007.

[25] G. Stapleton, P. Rodgers, J. Howse, and J. Taylor, “Properties of Euler Diagrams,” Proc. Workshop Layout of Software Eng. Diagrams, pp. 2-16, 2007.

[26] A. Fish and J. Flower, “Abstractions of Euler Diagrams,” Proc. First Int’l Workshop Euler Diagrams, vol. 134, pp. 77-101, 2005.

[27] A. Fish and G. Stapleton, “Formal Issues in Languages Based on Closed Curves,” Proc. Distributed Multimedia Systems, Int’l Workshop Visual Languages and Computing, pp. 161-167, 2006.

Gem Stapleton is a senior research fellow, with interests including the theory of diagrammatic logics and developing automated diagram layout techniques. She received the Best Paper Award at Diagrams 2004, was runner-up for the British Computer Society Distinguished Dissertation Award 2005, and was the only UK finalist for the Cor Baayen Award 2006, presented by ERCIM to the most promising young researcher in computer science and applied mathematics. She was the general chair of Diagrams 2008.

Leishi Zhang received the MSc degree in computer science from the University of Dundee and the PhD degree in bioinformatics visualization from the University of Brunel (funded by the EPSRC). She is currently a research associate. Her main research interests include information visualization, graph theory, artificial intelligence, and data analysis. She has published her research in a number of international journals and conferences relating to the area of data analysis and visualization.

John Howse is a professor of mathematics and computation and is the leader of the Visual Modelling Research Group. His main research interests include diagrammatic reasoning and the development of visual modeling languages. He is on the program committee for several international conferences, was the general chair of Visual Languages and Human-Centric Computing 2006, is on the steering committee for the Diagrams conference series, and was the program chair for Diagrams 2008. He received the Best Paper Award at Diagrams 2002.

Peter Rodgers is a senior lecturer and his main research interests are in diagrammatic visualization, including graph and Euler diagram layout techniques. He has led several research projects supported by national and international funding bodies. He sits on the program committee of various international conferences.


1032 IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, VOL. 17, NO. 7, JULY 2011

Fig. 30. Embedding maximal subdiagrams.

