
Spiraling and Folding: The Word View

Marcus Schaefer · Eric Sedgwick · Daniel Štefankovič

Abstract We show that for every n there are two simple curves on the torus intersecting at least n times without the two curves folding or spiraling with respect to each other. On the other hand, two simple curves in a punctured plane that intersect at least n times (and do not create any empty bigons) must either form a spiral of depth d or a fold of width cn/(d + 1) − 1, where c only depends on the number of punctures in the plane. The construction of the two curves on the torus involves train tracks and word equations, and the verification that the two curves do not spiral leads us to an infinite binary word based on the golden ratio which does not contain any square word ww for which |w| is even.

Keywords Curves · Surfaces · Topology · Square-free words · Thue word · Spirals · Folds

1 Introduction

Maybe you have found yourself aimlessly doodling away on a piece of paper, producing psychedelic drawings like

[Figure: a doodle of intersecting curves]


As you are squeezing one more bend of the curve into the picture you might have wondered whether the doodle has any inherent structure. The picture above, for example, contains a fair amount of spiraling and folding. If your curves intersect often, do they always have to spiral or fold? And does this depend on the surface on which you are drawing?

Before we discuss these questions, we need to make our notions of folding and spiraling precise. We are interested in the behavior of two simple curves, that is, curves without self-intersections. We typically draw one of the curves as a straight line; it might be part of a triangulation of the surface. The second curve intersects the first curve a large number of times. Two curves are reduced (with respect to each other) if their drawing does not contain a lens: a lens (or empty bigon) is a disc-shaped region bounded by two arcs, one from each curve, that does not contain any other part of the curve in its interior. Requiring the curves to be reduced seems to seriously restrict doodling, but we can always eliminate a lens by punching a hole into the lens, puncturing the surface. The earlier picture contains four lenses which could be removed by adding four punctures to the surface.¹

An annulus is a disk with one puncture. An arc in an annulus A is a curve within A with endpoints on the boundary of A. If the endpoints are on both boundary components, we call the arc spanning, otherwise it is peripheral. Two spanning arcs in an annulus A are said to spiral if they do not form a lens within A and intersect at least three times. The number of intersections minus 2 is the depth of the spiral. Two curves α and β spiral if there is an annulus A and two subarcs α′ ⊆ α and β′ ⊆ β such that α′ and β′ spiral in A (note that α′ and β′ are spanning arcs in A).

We say that two curves α and β have a fold of width w if there is an annulus A that does not contain endpoints of either α or β and so that α intersects A in a peripheral arc α′, and the intersection of β and A contains at least w peripheral arcs each of which intersects α′ twice without forming a lens with α′. We require a fold to have width at least 1.

The following picture shows that our opening example contained a fold of width three and a spiral of depth one (the two annuli are crosshatched).

[Figure: the opening example with the two annuli crosshatched]

¹Topologically speaking: lenses are topologically “trivial”, since they can be removed by an isotopy of either curve. Of course any curve on the plane can be isotoped into a point, so if we take the more topological view (instead of speaking of reduced curves) we would need to anchor the four endpoints of the curves by placing them on the boundaries of four punctures in the plane.


The following example shows that in spite of an arbitrarily large number of intersections of α and β there need not be any spiraling and only very narrow folding in a drawing:

[Figure: two curves with many intersections, no spiraling, and only narrow folds]

However, for this to be possible with reduced curves, the plane would need to contain a large number of punctures, so let us assume that we are dealing with a fixed surface, that is, the number of punctures is fixed.

Does a large number of intersections between two reduced curves on a fixed surface force either a deep spiral or a wide fold?

We will show in Sect. 2 that in the punctured plane there is a constant c = O(p) depending only on the number p of punctures such that any reduced curves with n intersections contain either a spiral of depth $\sqrt{n}/c$ or a fold of width $\sqrt{n}/c$. In Sect. 3 we construct two reduced curves on the torus with arbitrarily many intersections which form neither spiral nor fold.

The question above is not an entirely idle problem inspired by doodling on scraps of paper; it is closely related to graph drawing problems, and the string graph problem in particular. It is easy to show, adapting a construction due to Kratochvíl and Matoušek [8] for string graphs, that in the presence of n punctures two curves can be forced to intersect on the order of $\Omega(2^n)$ times in the plane without there being a spiral. These curves do, however, contain a fold of width $\Omega(2^n)$. In the companion paper to this paper we construct two reduced curves in the (unpunctured) plane without any spirals and with arbitrarily many intersections [16]. By the result mentioned earlier, these two curves have a wide fold.

In the present paper we construct curves α and β on the torus that form no spiral and no fold. This closes a promising approach to recognizing string graphs on surfaces of higher genus. A string graph is the intersection graph of simple curves in the plane (or another surface). Recognizing whether a graph is a string graph is an old question [2, 6] that was settled only recently by proving an exponential upper bound on the number of intersections needed in a realization of a string graph in the plane [11, 12].

There are three known approaches to proving the decidability of string graphs in the plane: the topological proof by Pach and Tóth [11], our more combinatorial proof from [12] and an algebraic proof using trace monoids [14]. The last proof is the only one that currently works for arbitrary surfaces. Unfortunately, it only gives a double-exponential upper bound on the number of intersections in an optimal realization of a string graph on a surface. The proof from [12] makes essential use of properties of the plane, and does not seem to lift easily to other surfaces. This leaves the proof of Pach and Tóth, which proceeds by finding a deep spiral in the realization of a string graph which then allows the simplification of that realization. Our main result of this paper implies that this approach cannot be lifted in a straightforward manner even to the torus, since there are pairs of curves that do not form any spirals.

The companion paper [16] establishes Theorem 3.1 (that folds and spirals can be avoided on the torus) using entirely topological methods; our proof of this result in the present paper follows a different approach: we capture the behavior of curves in surfaces using words (over monoids) and word equations. To this end, we represent curves on the torus using Thurston's train tracks. Relevant properties of curves can then be described using word equations: we construct a family of curves using train tracks and show that they do not form a spiral by proving that the solution of a word equation associated with the train track does not contain any squares, that is, words of the form ww.²

The word view has an advantage over the topological approach in that it lends itself more naturally to automation: the solutions to the word equations we derive can easily be computed and verified, and basic properties of curves represented by train tracks can be decided efficiently, following ideas from [13].

Also, the word view leads us to new results about square-free words; during the proof we encounter an infinite binary word which does not contain any squares of the form ww where |w| is even. Interestingly, the word has both a natural definition (its nth digit is $\lfloor n\varphi \bmod 2 \rfloor$, where φ is the golden ratio) and a simple recursive construction which connects it to the word equations we use to encode our train tracks.

Remark 1.1 Given an infinite binary word a without squares of the form ww with |w| even, we can combine it with the infinite word b = 01010101… which does not contain any squares ww with |w| odd to build an infinite word on a 4-letter alphabet that does not contain any squares ww at all (one letter corresponding to each combination 00, 01, 10, and 11). Thue showed that square-free words exist over a ternary alphabet [9], but as far as we know all known constructions involve morphisms of words and are not direct (though there are some recursive constructions).

If we interleave the letters of a and b, we obtain an infinite binary word which does not contain any squares ww with |w| > 3. In particular, the interleaved word contains only a finite number of different squares; this is an old result, again typically proved through the use of morphisms. The best result known is that there are words that contain only three different squares, 00, 11, 0101 [5].
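As a hedged illustration of the first construction in Remark 1.1, the following sketch pairs a prefix of the golden-ratio word (nth digit $\lfloor n\varphi \rfloor \bmod 2$) with b = 0101… and searches the resulting word over four letters for squares. The helper names `golden_bit`, `has_square` and `product_word` are ours, not taken from the paper, and checking a finite prefix is only a sanity check, not a proof.

```python
from math import isqrt


def golden_bit(n: int) -> int:
    """floor(n * phi) mod 2, computed exactly via floor(n*phi) = (n + isqrt(5 n^2)) // 2."""
    return ((n + isqrt(5 * n * n)) // 2) % 2


def has_square(word) -> bool:
    """Return True if `word` contains a factor ww with w nonempty."""
    n = len(word)
    for half in range(1, n // 2 + 1):
        for start in range(n - 2 * half + 1):
            if word[start:start + half] == word[start + half:start + 2 * half]:
                return True
    return False


def product_word(length: int) -> list:
    """Letters (a_i, b_i) for i = 1..length, where b = 0101... (so b_1 = 0)."""
    return [(golden_bit(i), (i + 1) % 2) for i in range(1, length + 1)]


if __name__ == "__main__":
    print("square found in product word:", has_square(product_word(300)))
```

A square ww in the paired word would be a square in both components at once: an even |w| is excluded by a, an odd |w| by the alternating word b, which is why the alignment of the two words does not matter here.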

2 The Planar Case

Spirals and folds can be avoided entirely on the torus as we will see in the next section. In the plane with a fixed number of punctures, however, there always has to be a spiral or fold of size $\Omega(\ell^{1/2})$, where ℓ is the number of intersections between two reduced curves.

²Studying curves in surfaces and their properties using words and word equations is an approach we first used in our papers on string graphs [12, 14] and continued with papers on algorithms for curves represented by Thurston train tracks or through normal coordinates [13, 15].

Theorem 2.1 Two reduced curves intersecting ℓ times in a plane with p punctures form either a spiral of depth d or a fold of width $\ell/(c_p d) - 1$, where $c_p = 2(12p+13)^2$.

A similar result is implicit in the paper by Pach and Tóth [11] and some of our arguments resemble theirs. We obtain a slightly better constant $c_p$ by using a genus argument.
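To see how Theorem 2.1 leads to a bound of order $\sqrt{\ell}$, one can balance the two alternatives. The short computation below is our own illustration; it ignores the rounding of d to an integer and uses only the quantities from the statement of the theorem.

```latex
% Balancing spiral depth and fold width in Theorem 2.1 (rounding of d ignored).
\[
  d = \sqrt{\ell / c_p}
  \quad\Longrightarrow\quad
  \frac{\ell}{c_p\, d} - 1 = \sqrt{\ell / c_p} - 1,
  \qquad\text{where}\qquad
  \sqrt{\ell / c_p} = \frac{\sqrt{\ell}}{\sqrt{2}\,(12p + 13)} .
\]
```

With this choice, either the spiral depth or the fold width is at least $\sqrt{\ell/c_p} - 1$, and the denominator grows linearly in p.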

Consider two reduced curves α and β which intersect a finite number of times. Removing the two curves from the plane decomposes the plane into a number of regions we call cells. A segment of a curve is a connected component of α \ β or β \ α. We call a segment proper if it does not contain an endpoint of the curve; the two segments of a curve that are not proper are called its end segments. A k-cell is a cell whose boundary, after erasing the end segments of α and β, consists of k proper segments. A cell is good if it is an empty 4-cell, that is, it does not contain any punctures or endpoints, and it is bad otherwise.

Arbitrarily orient α in one direction. Call a segment of α good if it is proper and its right side borders a good cell, otherwise the segment is bad. Our first goal is to bound the number of bad segments.

Lemma 2.2 In a drawing of two reduced curves in a plane with p punctures a curve has at most 12p + 12 bad segments not counting the two end segments of the curve.

Proof Fix a drawing of two reduced curves α and β. Consider the dual multigraph G of the drawing: assign a vertex to each cell and connect two vertices by an edge if their cells share a proper segment. Note that G may contain multiple edges, but no loops.

Erase the four end segments from the drawing. Then the drawing has ℓ vertices, namely the ℓ intersection points of the curves, and 2(ℓ − 1) edges (the proper segments), and hence, by Euler's formula, there are 2 + 2(ℓ − 1) − ℓ = ℓ faces, implying that G has ℓ vertices. Moreover, the number of edges of G is 2(ℓ − 1), since an edge in G corresponds to a proper segment of the drawing. Vertices of degree k in G correspond to k-cells in the drawing. In particular, G contains at most p vertices of degree 2, since each 2-cell must contain a puncture (otherwise it would be a lens), and at most 4 vertices of degree 3, since a 3-cell must contain one of the four endpoints. Let $d_k$ be the number of k-cells and let $d_4^b$ and $d_4^g$ be the number of bad and good 4-cells, resp., so $d_4 = d_4^b + d_4^g$.

Assuming that ℓ ≥ 2, G contains no isolated vertices and we get

$2d_2 + 3d_3 + 4d_4 + 5(\ell - d_4 - d_3 - d_2) \le 2|E(G)| = 4(\ell - 1).$

Replacing $d_4$ by $d_4^b + d_4^g$ gives us $\ell - d_4^g \le 3d_2 + 2d_3 + d_4^b - 4$. Every 2-cell contains a puncture, every 3-cell an endpoint, and every bad 4-cell an endpoint or a puncture, so $3d_2 + 2d_3 + d_4^b = 2d_2 + d_3 + (d_2 + d_3 + d_4^b) \le 2p + 4 + (p + 4) = 3p + 8$. This implies that $\ell - d_4^g \le 3p + 4$.

Now the bad cells border at most $2|E(G)| - 4d_4^g$ segments, and

$2|E(G)| - 4d_4^g = 4(\ell - d_4^g) - 4 \le 12p + 12,$

so there are at most 12p + 12 bad segments not counting the end segments of the two curves. ∎
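The algebra behind the step from the degree count to the bound on $\ell - d_4^g$ is compressed in the proof. Spelled out (our own expansion, using only the quantities defined above), it reads:

```latex
\begin{align*}
2d_2 + 3d_3 + 4d_4 + 5(\ell - d_4 - d_3 - d_2) &\le 4(\ell - 1) \\
5\ell - 3d_2 - 2d_3 - d_4 &\le 4\ell - 4 \\
\ell &\le 3d_2 + 2d_3 + d_4 - 4 \\
\ell - d_4^g &\le 3d_2 + 2d_3 + d_4^b - 4 \;\le\; (3p + 8) - 4 \;=\; 3p + 4 .
\end{align*}
```

The last line uses $d_4 = d_4^b + d_4^g$ and the bound $3d_2 + 2d_3 + d_4^b \le 3p + 8$ from the proof.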

Proof of Theorem 2.1 Fix a drawing of reduced curves α and β and a direction of travel along α. Label the segments encountered $\alpha_0, \ldots, \alpha_\ell$. We write α[i:j] for the sequence $\alpha_i, \ldots, \alpha_j$. Each segment is adjacent to a cell on its left and its right (not all these cells are pairwise distinct, of course). Let us consider the ℓ − 1 cells encountered along the right side of the proper segments α[1:ℓ − 1], that is, excluding the two end segments. Since at most 12p + 12 of these segments are bad, there is a sequence of at least w = (ℓ − 1 − (12p + 12))/((12p + 12) + 1) ≥ ℓ/(12p + 13) − 1 consecutive segments, α[i:i + w − 1], such that the cells on the right of the segments are all empty 4-cells. This block of 4-cells attaches to α a second time, again to a block of segments, α[j:j + w − 1]; the rest of the argument depends on where and how it does this. Figure 1 illustrates one possible scenario, in which the cells reattach to the opposite side of α, overlapping the original block.
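The counting step at the start of this proof is a pigeonhole argument: b bad segments cut the remaining segments into at most b + 1 runs, so some run has at least the average length. The sketch below illustrates this on a made-up pattern of bad segments; the function names and the test pattern are our own assumptions.

```python
def longest_good_run(is_bad) -> int:
    """Length of the longest block of consecutive good (non-bad) segments."""
    best = run = 0
    for bad in is_bad:
        run = 0 if bad else run + 1
        best = max(best, run)
    return best


def guaranteed_run(num_segments: int, num_bad: int) -> float:
    """Pigeonhole lower bound: (segments - bad) spread over at most (bad + 1) runs."""
    return (num_segments - num_bad) / (num_bad + 1)


if __name__ == "__main__":
    p, ell = 1, 1000                      # hypothetical puncture count and intersection count
    pattern = [i % 40 == 0 for i in range(ell - 1)]  # made-up positions of bad segments
    print(longest_good_run(pattern), ">=", guaranteed_run(ell - 1, sum(pattern)))
    print("w from the proof:", guaranteed_run(ell - 1, 12 * p + 12))
```

With `num_segments = ℓ − 1` and `num_bad = 12p + 12`, `guaranteed_run` evaluates exactly the expression for w used in the proof.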

If the block of 4-cells attaches on the same side of α, we have found a fold of width w and we are done since $w \ge \ell/(12p + 13) - 1 \ge \ell/(c_p d) - 1$. Hence, we can assume that the block of 4-cells attaches to the left side of α.

If the two blocks α[i:i + w − 1] and α[j:j + w − 1] overlap in more than w(d − 1)/d segments, we have found a spiral of depth d: in case i < j start with $\alpha_j$ and follow the empty 4-cell on its right. It reattaches to α in $\alpha_{j+(j-i)}$; continuing this process, we obtain a sequence of empty 4-cells connecting $\alpha_{i+k(j-i)}$ to $\alpha_{i+(k+1)(j-i)}$ for 0 ≤ k ≤ d. This is possible since all those cells start within the range α[i:i + w − 1] as j − i < w − w(d − 1)/d = w/d and so i + d(j − i) ≤ i + w − 1. Pick one of the two subarcs of β bounding these 4-cells. It has d + 2 intersections with α, so we have found a spiral of depth d. The case j < i is symmetric.

We can therefore assume that the two blocks overlap in at most w(d − 1)/d segments; then there are at least w/d consecutive segments α[j′:j + w − 1], j ≤ j′, that are not involved in the overlap and that have good 4-cells on their left side. Let α′ be the subarc α[i + w:j + w − 1] (this includes α[j′:j + w − 1] since these segments are not part of the overlap). Note that α′ together with the segment of β connecting the beginning of α[i + w] to the end of α[j + w − 1] forms a closed curve C.

Temporarily replace α with α′ (without changing β). Now α′ has at most 12p + 12 bad segments (we can apply Lemma 2.2 again), so it contains a block α[k:k + w′ − 1] of good segments, where w′ = w/(d(12p + 13)) − 1. Since the cells on the right of these segments are all contained within the closed curve C, the block of cells has to reattach to α′ on the right-hand side, so in α′ we now have a wide fold. In general, this fold will not be a fold with respect to α, since α can cut through the cells of the fold. But if we look at the cells of the fold between α′ and β in the presence of α, we see that all the cells remain good, with one possible exception: the cell that contains the endpoint of α in the region bounded by C. Let us pick a block of cells on the right of the interval α[k:k + w′ − 1]. If all the cells are good, this block has to reattach on the right or the left side of α as a block. We can then continue on the other side of that block, as long as all those cells are good as well. If we keep encountering good cells, this process will continue until we reattach to α on the right side (we know we must, since we are following the fold with respect to α′). We saw that there is at most one bad cell we can encounter, so either $\alpha[k:k + \lfloor w'/2 \rfloor - 1]$ or $\alpha[k + \lfloor w'/2 \rfloor:k + w' - 1]$ will not run into that bad cell. Since the process starts on the right-hand side of α and ends on the right-hand side of α, there must be a block of cells that attaches to α on the right-hand side with both its α-sides (if not, then the first block reattaches to the left side and after that every block starts attached on the right and then reattaches to the left, but that is not possible, since the last block has to reattach to the right of α). This block constitutes a fold of width $\lfloor w'/2 \rfloor \ge w/(2d(12p + 13)^2) - 1$. ∎

Fig. 1 A block of 4-cells α[i:i + w − 1] reattaching to α on the opposite side in α[j:j + w − 1], overlapping the original block α[i:i + w − 1] in α[j:i + w − 1]

It is easy to construct two reduced curves in the plane that intersect ℓ times and whose spirals and folds have depth and width at most $O(\sqrt{\ell})$, so Theorem 2.1 is essentially tight.

3 The Torus Case

Let $f_n$ be the nth Fibonacci number, with $f_0 = 0$ and $f_1 = 1$. Figure 2 shows a train track describing an arrangement of curves on the torus (using the usual planar representation of the torus identifying opposite sides of the square).

Fig. 2 A weighted train track on the torus (black lines)


Without going into the formal details of weighted train tracks, let us describe the basic idea. Ignore, for the moment, the dotted line α; it is not part of the train track. A line of weight w in the diagram represents w parallel curves. For example, in the upper left part of Fig. 2 we see five groups of parallel curves merge together. The leftmost group represents $f_n - 1$ parallel curves, the third group $f_{n+1} - 1$ (as we find out by tracing it across the dashed line), and the fifth group $f_{n+3} - 1$ parallel curves. The second and fourth group represent the end of a single curve. These five groups merge, in a device called a switch, into a single group of $f_{n+4} - 1 = (f_n - 1) + 1 + (f_{n+1} - 1) + 1 + (f_{n+3} - 1)$ parallel curves, just to split up again into five groups of $f_{n+3} - 1$, 1, $f_{n+1} - 1$, 1 and $f_n - 1$ parallel curves. Note that all lines in the diagram have a weight, and that when multiple lines merge (or split) in a switch, the weights add up consistently. Hence, if we replace each line with the number of parallel curves corresponding to its weight, we obtain a system of curves in the torus realizing the train track. We do not a priori know how the eight ends of the curves pair up and whether any closed curves are present, and, indeed, this depends on n. Clearly, there have to be at least four curves. We show that if n ≡ 1 mod 3, then the train track represents a system of exactly four curves and there are no closed curve components. Moreover, we argue that α (as shown in the picture) does not form a spiral, fold or lens with any of those four curves. Since one of the curves has to cross α at least $(f_{n+1} - 1)/4$ times, this will complete the argument.

To prove these results about the train track in Fig. 2, we change our point of view and reexpress the train track, or, more precisely, the connectivity of the curves represented by the train track, as a word equation. A word equation is an equation containing variables and letters from some alphabet, Σ. We use uppercase letters for variables and lowercase letters for elements of Σ. For example,

aX = Xa

where |X| = n is a simple word equation with given lengths, that is, the length of the word that a variable represents is specified. The equation is interpreted over the free monoid generated by Σ, so equality must hold letter by letter. In this particular example, the solution is $X = a^n$ (for any alphabet Σ containing a). To take another example, consider

$aX_0X_1 = X_1aX_0$

with $|X_0| = 1$, $|X_1| = n$. Again, this equation is solvable, by $aX_0X_1 = a^{n+2}$; however, the uniqueness of the solution depends on the parity of n. For even n, there is more than one solution, while for odd n the solution is unique (assuming Σ contains at least two letters, including a). Finally, the equation

abX = Xab

is solvable if and only if |X| is even (in which case the solution is unique). We write X[i] for the ith letter in X (for a particular solution), and X[i:j] for the subword X[i]X[i + 1] ⋯ X[j] of X.
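Because all variable lengths are given, such equations can be solved mechanically: expand both sides into individual positions, identify the positions that face each other, and check that no two different letters are forced to coincide. The following is a minimal sketch of that idea using a union-find structure. The token convention (a key of `lengths` is a variable, `"?"` is an unnamed unit-length variable, anything else is a letter) and the function name are our own assumptions, not the paper's notation.

```python
def solve_word_equation(lhs, rhs, lengths):
    """Solve lhs = rhs over a free monoid when every variable length is known.

    Returns None if the equation is unsolvable, otherwise a dict mapping each
    variable to its list of letters. A position that no letter constrains is
    reported as None, so the solution is unique exactly when no None occurs.
    """
    parent = {}

    def find(cell):
        parent.setdefault(cell, cell)
        while parent[cell] != cell:
            parent[cell] = parent[parent[cell]]  # path halving
            cell = parent[cell]
        return cell

    def expand(side, tag):
        cells = []
        for pos, tok in enumerate(side):
            if tok in lengths:
                cells += [("var", tok, i) for i in range(lengths[tok])]
            elif tok == "?":
                cells.append(("unk", tag, pos))  # distinct unnamed variable
            else:
                cells.append(("lit", tok))
        return cells

    left, right = expand(lhs, "L"), expand(rhs, "R")
    if len(left) != len(right):
        return None
    for x, y in zip(left, right):  # equality must hold position by position
        parent[find(x)] = find(y)

    letter_of = {}
    for cell in list(parent):
        if cell[0] == "lit":
            root = find(cell)
            if letter_of.setdefault(root, cell[1]) != cell[1]:
                return None  # two different letters forced to be equal
    return {v: [letter_of.get(find(("var", v, i))) for i in range(n)]
            for v, n in lengths.items()}


if __name__ == "__main__":
    print(solve_word_equation(["a", "X"], ["X", "a"], {"X": 3}))
    print(solve_word_equation(["a", "X0", "X1"], ["X1", "a", "X0"], {"X0": 1, "X1": 2}))
    print(solve_word_equation(["a", "b", "X"], ["X", "a", "b"], {"X": 3}))
```

The three calls correspond to the three examples in the text: a unique solution, a solution with an unconstrained position (hence non-unique), and an unsolvable equation.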

Figure 3 shows the train track from Fig. 2 (no longer embedded) recast as a word equation. We have labeled four of the endpoints with $\{a, b, \bar{a}, \bar{b}\}$ and left the other four undetermined (the question marks are unnamed, but distinct, variables).


Fig. 3 Intersection of α with a system of curves containing no spirals

The word equation shown in Fig. 3 is

$X_0 a X_1 b X_2 X_3 \bar{a} X_4 \bar{b} X_5 = X_5 ? X_1 ? X_3 X_2 ? X_4 ? X_0$ (1)

with given lengths $|X_0| = |X_3| = f_n - 1$, $|X_1| = |X_4| = f_{n+1} - 1$, $|X_2| = |X_5| = f_{n+3} - 1$ (the question marks have unit length).

Since the word equation is directly modeled on the connectivity of the train track, it is clear that any solution to (1) corresponds to a set of simple curves realizing the train track in Fig. 3.³ This set includes four curves labeled $a, b, \bar{a}, \bar{b}$. Since there are only eight endpoints any other curve in this realization of the train track must be a closed curve. However, if there is a closed curve in the realization, we can relabel it in the solution to the word equation with an arbitrary letter. Hence, if we can show that the solution to (1) is unique, there cannot be any closed curves in the train track, and the realization consists of the four simple curves $a, b, \bar{a}, \bar{b}$. Indeed, in Lemmas 3.4 and 3.5 we show that (1) has a unique solution with the given length constraints as long as n ≡ 1 mod 3.

³The reverse need not be true: by labeling the four ends on top of Fig. 3 by different letters, we exclude the case that any curve has both its endpoints there, a case that might very well occur for certain choices of n.

How do spirals translate into the world of words? Let $w \in \Sigma^*$ be the word we read along the curve α if we replace each intersecting curve with its letter. If any of the four curves $a$, $b$, $\bar{a}$ or $\bar{b}$, call it β, formed a spiral with α, then there is an annulus A containing at least three intersections of α with β. Consider a curve γ in the train track which intersects α between two of these intersection points in A. The curve γ does not intersect β (the curves in the train track do not intersect) and it cannot double back, forming a bigon, because there are no punctures in A and we disallowed endpoints in an annulus witnessing a fold, and thus the bigon would be empty. Hence, γ has to run parallel to β, implying that w contains a word of the form βxβxβ (where x is the word corresponding to the group of curves wedged between β, and β is the letter representing the curve β) which contains the square βxβx. Hence, a spiral in the train track forces a square in the word along α. In Lemma 3.5 we show that the word $X_0aX_1bX_2$ in the unique solution to (1) with the given lengths is square-free. Since $X_0aX_1bX_2$ records the intersections of the curves along α, this implies that none of the curves forms a spiral with α.

We finally observe that the arrangement of curves does not contain any folding: if we orient the curves in the train track, we see that they keep intersecting α in the same direction, whereas a fold requires a curve to intersect α in both directions (remember that a fold is contained in an annulus). Pending the proofs of Lemmas 3.4 and 3.5, we have thus established the following result.

Theorem 3.1 For any n there are two simple curves α and β on the torus that intersect at least n times without forming either a lens, a fold or a spiral.

3.1 Uniqueness

Lemma 3.2 For n ≡ 1 (mod 3) a solution to the equation

$Y_0Y_1 = Y_1Y_0$ (2)

with length constraints $|Y_0| = f_{n+2}$ and $|Y_1| = f_{n+1} + f_{n+3}$ over an arbitrary alphabet is uniquely determined by knowing Y[i] for at least one odd and one even value of i.

Proof Let $\ell = |Y_0Y_1| = f_{n+1} + f_{n+2} + f_{n+3} = 2f_{n+3}$. Assume $Y = Y_0Y_1 = Y_1Y_0$ is a solution to (2). By the equation, $Y[i] = Y[(i + f_{n+2} - 1) \bmod \ell + 1]$.⁴ The function $f(x) = (x + f_{n+2} - 1) \bmod \ell + 1$ splits the positions of Y into $\gcd(f_{n+2}, \ell)$ orbits such that within each orbit the value of Y is constant. For n ≡ 1 (mod 3), $f_{n+2}$ is even, and, since $\gcd(f_{n+2}, f_{n+3}) = 1$, $\gcd(f_{n+2}, \ell) = 2$. Therefore, the function f has exactly two orbits in this case: the even and the odd positions. ∎
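The orbit structure used in this proof is easy to check by direct computation for small n. The sketch below enumerates the orbits of the position map from the proof on positions 1, …, ℓ; the function names `fib` and `orbits` are ours.

```python
def fib(n: int) -> int:
    """Fibonacci numbers with f_0 = 0 and f_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def orbits(n: int) -> list:
    """Orbits of x -> (x + f_{n+2} - 1) mod l + 1 on positions 1..l, where l = 2 f_{n+3}."""
    shift, length = fib(n + 2), 2 * fib(n + 3)
    seen, result = set(), []
    for start in range(1, length + 1):
        if start in seen:
            continue
        orbit, x = [], start
        while x not in seen:
            seen.add(x)
            orbit.append(x)
            x = (x + shift - 1) % length + 1
        result.append(sorted(orbit))
    return result


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, "orbits:", len(orbits(n)))
```

For n ≡ 1 (mod 3) the sketch finds the two orbits of odd and even positions; for other residues of n, $f_{n+2}$ is odd and coprime to $2f_{n+3}$, so there is a single orbit.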

Lemma 3.3 For n ≡ 1 (mod 3) the equation

$X_0 a X_1 b X_2 = X_2 ? X_1 ? X_0$ (3)

with length constraints $|X_0| = f_n - 1$, $|X_1| = f_{n+1} - 1$ and $|X_2| = f_{n+3} - 1$ has a unique solution, and that solution is $(ab)^{(f_{n+4}-1)/2}$.

Proof Consider the equation

$X = X_0 a X_1 a X_2 = X_2 ? X_1 ? X_0$ (4)

and let X (and thereby $X_0, X_1, X_2$) be a solution over the alphabet $\{a, \square\}$ minimizing the number of occurrences of the letter a. Then (4) has a unique solution if and only if X does not contain the letter $\square$. Counting occurrences of a on both sides of $X_0 a X_1 a X_2 = X_2 ? X_1 ? X_0$ it is clear that both ? must stand in for the letter a, so X is a solution to the equation $X = X_0 a X_1 a X_2 = X_2 a X_1 a X_0$. Adding $a X_1 a$ in front of both sides of this equation shows that $X_0, X_1, X_2$ fulfill

$a X_1 a X_0 a X_1 a X_2 = a X_1 a X_2 a X_1 a X_0.$ (5)

Now $Y_0 = a X_1 a X_0$ and $Y_1 = a X_1 a X_2$ form a solution to

$Y = Y_0Y_1 = Y_1Y_0$ (6)

with length constraints $|Y_0| = f_{n+2}$ and $|Y_1| = f_{n+1} + f_{n+3}$. Moreover, $Y[1] = a$ and $Y[f_{n+1} + 1] = a$, hence, noting that 1 is odd and $f_{n+1} + 1$ is even, we can apply Lemma 3.2 to show that (6) and, thereby, (4), has a unique solution. This implies that

$X = X_0 a X_1 b X_2 = X_2 ? X_1 ? X_0$ (7)

has a unique solution if it has any solution. However, it is easy to see that X made up of alternating a and b is a solution, hence that solution is unique. ∎

⁴The $f_{n+2}$-th position in Y after i (with wrap-around) is $(i + f_{n+2} - 1) \bmod \ell + 1$ since the first position in Y is 1.
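The existence half of this argument (that the alternating word solves the equation of Lemma 3.3) can be checked directly by cutting the word at the prescribed lengths. The sketch below does this; it says nothing about uniqueness, and the function names are our own.

```python
def fib(n: int) -> int:
    """Fibonacci numbers with f_0 = 0 and f_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def alternating_solves_eq3(n: int) -> bool:
    """Check whether abab...ab satisfies X0 a X1 b X2 = X2 ? X1 ? X0 with the given lengths."""
    l0, l1, l2 = fib(n) - 1, fib(n + 1) - 1, fib(n + 3) - 1
    total = l0 + l1 + l2 + 2          # equals f_{n+4} - 1
    if total % 2:
        return False                  # odd length: no word of the form (ab)^k
    z = "ab" * (total // 2)
    # Read off X0, X1, X2 from the left-hand side pattern X0 a X1 b X2.
    x0, x1, x2 = z[:l0], z[l0 + 1:l0 + 1 + l1], z[l0 + l1 + 2:]
    left_ok = z[l0] == "a" and z[l0 + 1 + l1] == "b"
    # Compare with the right-hand side pattern X2 ? X1 ? X0 (the ? are unconstrained).
    right_ok = (z[:l2] == x2
                and z[l2 + 1:l2 + 1 + l1] == x1
                and z[l2 + l1 + 2:] == x0)
    return left_ok and right_ok


if __name__ == "__main__":
    print({n: alternating_solves_eq3(n) for n in range(1, 11)})
```

The check relies on parities only: for n ≡ 1 (mod 3) the numbers $f_n$, $f_{n+1}$ and $f_{n+3}$ are all odd, so every block starts at a position of the right parity on both sides.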

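We think of the operator $\overline{\,\cdot\,}$ as an involution on Σ, that is, $\overline{\overline{x}} = x$ for every $x \in \Sigma$. For example, $\overline{\overline{a}} = a$. We also let $\overline{\square} = \square$ and extend $\overline{\,\cdot\,}$ to words in the natural way.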

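Lemma 3.4 Assume n ≡ 1 (mod 3). If the equation

$X = X_0 a X_1 b X_2 X_3 \bar{a} X_4 \bar{b} X_5 = X_5 ? X_1 ? X_3 X_2 ? X_4 ? X_0$ (8)

with given lengths $|X_0| = |X_3| = f_n - 1$, $|X_1| = |X_4| = f_{n+1} - 1$, $|X_2| = |X_5| = f_{n+3} - 1$ has a solution, then that solution is unique and fulfills

$X_0 = \overline{X_3}$, $X_1 = \overline{X_4}$, $X_2 = \overline{X_5}$

and

$X_0 a X_1 b X_2 = \overline{X_2} ? X_1 ? \overline{X_0}.$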

Proof Assume that X is a solution of (8) over the alphabet $\{a, b, \bar{a}, \bar{b}, \square\}$ minimizing the number of letters in $\{a, b, \bar{a}, \bar{b}\}$. The solution is unique if and only if X does not contain the character $\square$.

We claim that if $X = X_0 a X_1 b X_2 X_3 \bar{a} X_4 \bar{b} X_5$ is a solution to (8) then so, by the symmetry of the equation, is

$X' = \overline{X_3 \bar{a} X_4 \bar{b} X_5 X_0 a X_1 b X_2}$:

namely, if X fulfills (8) with the length constraints, then $X_0 a X_1 b X_2 = X_5 ? X_1 ? X_3$ and $X_3 \bar{a} X_4 \bar{b} X_5 = X_2 ? X_4 ? X_0$, so

$X_3 \bar{a} X_4 \bar{b} X_5 X_0 a X_1 b X_2 = X_2 ? X_4 ? X_0 X_5 ? X_1 ? X_3.$

Now applying $\overline{\,\cdot\,}$ to both sides establishes the claim.

Since X′ has the same number of $\square$ characters as X, it is also a minimal solution. Moreover, X′ and X agree at the four positions in which a letter is specified by the equation. Therefore, X and X′ must equal each other. In other words, we have $X_0 = \overline{X_3}$, $X_1 = \overline{X_4}$ and $X_2 = \overline{X_5}$. Then

$X_0 a X_1 b X_2 = X_5 ? X_1 ? X_3 = \overline{X_2} ? X_1 ? \overline{X_0}.$

Now any solution of $X_0 a X_1 b X_2 = \overline{X_2} ? X_1 ? \overline{X_0}$ can be turned into a solution of $X_0 a X_1 b X_2 = X_2 ? X_1 ? X_0$ by replacing $\bar{a}$ with a and $\bar{b}$ with b (leaving any $\square$ letters untouched). By Lemma 3.3 the solution to $X_0 a X_1 b X_2 = X_2 ? X_1 ? X_0$ is unique and does not involve any $\square$ character. Consequently, the solution to $X_0 a X_1 b X_2 = \overline{X_2} ? X_1 ? \overline{X_0}$ is unique and does not include a $\square$ character. But then the minimal solution X of (8) does not contain a $\square$ character and is therefore the unique solution to that equation. ∎

3.2 Existence and Squares

We have shown that a solution to (1), if it exists, is unique. In this section we show that (1) with the given lengths has a solution and that this solution is square-free.

Lemma 3.5 Let n ≡ 1 (mod 3). Equation (1) over $\{a, b, \bar{a}, \bar{b}\}$ with given lengths $|X_0| = |X_3| = f_n - 1$, $|X_1| = |X_4| = f_{n+1} - 1$, $|X_2| = |X_5| = f_{n+3} - 1$ has a solution $X = X_0 a X_1 b X_2 X_3 \bar{a} X_4 \bar{b} X_5$ and for that solution $X_0 a X_1 b X_2$ is square-free.

By Lemma 3.4 we know that the solution we construct will be the unique solution.

Proof of Lemma 3.5 Consider the equation

$Z = X_0 a X_1 b X_2 = \overline{X_2} ? X_1 ? \overline{X_0}$ (9)

with given lengths $|X_0| = f_n - 1$, $|X_1| = f_{n+1} - 1$, and $|X_2| = f_{n+3} - 1$. Defining $X_3 = \overline{X_0}$, $X_4 = \overline{X_1}$ and $X_5 = \overline{X_2}$ gives us $X_0 a X_1 b X_2 = \overline{X_2} ? X_1 ? \overline{X_0} = X_5 ? X_1 ? X_3$ and $X_3 \bar{a} X_4 \bar{b} X_5 = \overline{X_0 a X_1 b X_2} = \overline{\overline{X_2} ? X_1 ? \overline{X_0}} = X_2 ? \overline{X_1} ? X_0 = \overline{X_5} ? \overline{X_1} ? \overline{X_3} = X_2 ? X_4 ? X_0$, or, in other words, a solution to (1) with the given lengths. Hence to show that (1) has a solution, it is enough to show that (9) is solvable.

Let us write $Z^{u}_{v}$ for the result of replacing all occurrences of u in Z with v. We allow the specification of multiple replacements, for example, $Z^{\bar{a}\bar{b}}_{ab}$ is the result of replacing $\bar{a}$ with a and $\bar{b}$ with b.

We split (9) into two equations. It is obvious that Z is a solution to (9) if and only if $Z^{\bar{a}\bar{b}}_{ab}$ is a solution to

$Z' = X_0 a X_1 b X_2 = X_2 ? X_1 ? X_0$ (10)

over alphabet $\{a, b\}$ with given lengths $|X_0| = f_n - 1$, $|X_1| = f_{n+1} - 1$, $|X_2| = f_{n+3} - 1$, and $Z^{b\bar{b}}_{a\bar{a}}$ is a solution of

$Z'' = X_0 a X_1 a X_2 = \overline{X_2} ? X_1 ? \overline{X_0}$ (11)

over alphabet $\{a, \bar{a}\}$ with given lengths $|X_0| = f_n - 1$, $|X_1| = f_{n+1} - 1$, $|X_2| = f_{n+3} - 1$.

In Lemma 3.3 we saw that (10) has the unique solution $Z' = (ab)^{(f_{n+4}-1)/2}$. Note that Z′ does not contain any odd square, that is, a square of the form ww for which |w| is odd. Therefore, Z cannot contain any odd square. Hence, to conclude the argument, it is sufficient to show that (11) has a solution, and that this solution does not contain any even squares, that is, squares of the form ww for which |w| is even. This proof uses a different approach and we leave it to Sect. 4: Lemma 4.9 shows that there is a solution to (11), and Lemma 4.6 together with Corollary 4.8 shows that this solution does not contain an even square. ∎

4 The Golden Ratio and Square-free Binary Words

A square is a word of the form ww, where w is not the empty word; we call the square even or odd, depending on the parity of |w|. A word is square-free if it does not contain a square as a subword. Any binary word of length at least 4 contains a square, while there are infinite words over a ternary alphabet which are square-free (discovered by Axel Thue [9, 10]).

This seems to close the case of binary words, but there are variations to be considered. For example, there are infinite binary words containing at most 3 squares, 00, 11 and 0101 [4, 5, 7]. In this section we show that there are infinite binary words that do not contain any even squares.

Theorem 4.1 There is an infinite binary word that does not contain an even square.

Traditionally, square-free words are constructed by repeatedly applying square-free morphisms (morphisms that map square-free words to square-free words) to an initial square-free word. We proceed differently by showing that a particular sequence, namely the sequence $a = (a_n)_{n \in \mathbb{N}}$ defined by

$a_n = \lfloor n\varphi \bmod 2 \rfloor = \lfloor n\varphi \rfloor - 2\lfloor n\varphi/2 \rfloor$ (12)

does not contain an even square, where $\varphi = (\sqrt{5} + 1)/2$ is the golden ratio. The sequence a is listed in Sloane's Encyclopedia of Integer Sequences [3, Sequence A085002]; it encodes the bits of φ: $a_{2^\ell}$ is the ℓth binary bit of φ.
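The sequence in (12) is straightforward to compute, and its claimed properties can be spot-checked on a prefix. The sketch below evaluates $\lfloor n\varphi \rfloor$ exactly with integer square roots (an implementation choice of ours, to avoid floating-point rounding) and searches a prefix for even squares; the function names are hypothetical and a finite search is of course no substitute for the proof.

```python
from math import isqrt


def golden_bit(n: int) -> int:
    """a_n = floor(n * phi) mod 2, using floor(n*phi) = (n + isqrt(5 n^2)) // 2."""
    return ((n + isqrt(5 * n * n)) // 2) % 2


def golden_prefix(length: int) -> list:
    """The bits a_1, ..., a_length."""
    return [golden_bit(n) for n in range(1, length + 1)]


def find_even_square(word):
    """Return (start, |w|) for some factor ww with |w| even (0-based start), or None."""
    n = len(word)
    for half in range(2, n // 2 + 1, 2):
        for start in range(n - 2 * half + 1):
            if word[start:start + half] == word[start + half:start + 2 * half]:
                return start, half
    return None


if __name__ == "__main__":
    print("prefix:", "".join(map(str, golden_prefix(30))))
    print("even square in first 400 bits:", find_even_square(golden_prefix(400)))
    print("a at powers of two:", [golden_bit(2 ** k) for k in range(1, 9)])
```

The last line prints the digits of φ after the binary point, in line with the remark that $a_{2^\ell}$ is the ℓth binary bit of φ.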

4.1 The Bits of the Golden Ratio

Somewhat surprisingly, the bits of a can be generated by a simple recursive law which we exploit in the next section to show that a encodes a solution to (11). In this section we establish the necessary properties of a and moreover show that it does not contain any even squares.

We need two basic facts about the distribution of nφ mod 1 and Fibonacci numbers. Let $\lfloor x \rceil$ be the integer nearest to x, and $\|x\| := |x - \lfloor x \rceil|$ the distance of $x \in \mathbb{R}$ to its nearest integer. We will use the well-known identity $f_k = f_{k-1}\varphi + \hat{\varphi}^{k-1}$, where $\hat{\varphi} = (1 - \sqrt{5})/2$.


Lemma 4.2 Let k be an integer. If $0 < \ell < f_k$, then $\|f_k\varphi\| < \|\ell\varphi\|$.

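Proof Since $f_k/f_{k-1}$ is a continued fraction convergent of φ, standard approximation results about continued fractions imply that $|f_{k-1}\varphi - f_k| \le |\ell\varphi - \ell'|$ for $0 < \ell < f_k$ and any ℓ′; hence

$\min_{0 < \ell < f_k} \|\ell\varphi\| = \|f_{k-1}\varphi\|.$

However, $\|f_{k-1}\varphi\| = \|f_k - \hat{\varphi}^{k-1}\| = \|-\hat{\varphi}^{k-1}\| = |\hat{\varphi}^{k-1}| > |\hat{\varphi}^k| = \|f_{k+1} - \hat{\varphi}^k\| = \|f_k\varphi\|$, and the lemma follows. ∎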

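Lemma 4.3 Let $k \ge 2$ be an integer. Then

$$\lfloor f_k\varphi \rceil \bmod 2 = \begin{cases} 1 & \text{if } k \equiv 0,1 \pmod 3,\\ 0 & \text{if } k \equiv 2 \pmod 3. \end{cases}$$

Proof Since $|\hat{\varphi}^{k}| < 1/2$ for $k \ge 2$ (where $\hat{\varphi} = (1 - \sqrt{5})/2$), we can conclude that $\lfloor f_k\varphi \rceil = \lfloor f_{k+1} - \hat{\varphi}^{k} \rceil = f_{k+1}$; since $f_{k+1}$ is even if and only if $k + 1 \equiv 0 \pmod 3$, the lemma is proved. □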
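Lemma 4.3 is easy to confirm numerically. The sketch below assumes the indexing $f_1 = f_2 = 1$ (which matches the word lengths tabulated in Sect. 4.2) and computes the nearest integer to $n\varphi$ exactly as $\lfloor n\varphi + 1/2 \rfloor = \lfloor (n + 1 + \sqrt{5n^2})/2 \rfloor$; the helper names are hypothetical.

```python
from math import isqrt


def fib(k: int) -> int:
    """Fibonacci numbers with f_1 = f_2 = 1 (assumed indexing)."""
    a, b = 1, 1
    for _ in range(k - 1):
        a, b = b, a + b
    return a


def nearest_multiple(n: int) -> int:
    """Nearest integer to n * phi, computed exactly as floor(n * phi + 1/2)."""
    return (n + 1 + isqrt(5 * n * n)) // 2


def lemma_4_3_holds(k: int) -> bool:
    """Check that round(f_k * phi) = f_{k+1} and that its parity is as claimed."""
    rounded = nearest_multiple(fib(k))
    expected_parity = 0 if k % 3 == 2 else 1
    return rounded == fib(k + 1) and rounded % 2 == expected_parity


print(all(lemma_4_3_holds(k) for k in range(2, 60)))
```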

Lemma 4.4 Let $k, \ell \ge 1$ so that $k + \ell = f_n$ for some $n \ge 3$. Then

$$a_k = \begin{cases} a_\ell & \text{if } n \equiv 0,1 \pmod 3,\\ 1 - a_\ell & \text{if } n \equiv 2 \pmod 3. \end{cases} \tag{13}$$

Lemma 4.4 implies that the last $f_{n-2}$ bits of $a[1:f_{n-1} - 1]$ are determined by $a[1:f_{n-2}]$ in a very simple way that only depends on $n \bmod 3$. In the next section we investigate this recursive structure separately, and show that it satisfies (11).

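Proof of Lemma 4.4 For the purposes of this proof, let $a_m = \lfloor m\varphi \bmod 2 \rfloor$, where $m \in \mathbb{Z}$; note that this extension of $a$ is symmetric in the sense that it satisfies

$$a_m = 1 - a_{-m} \tag{14}$$

for all $m \ne 0$. Let $k$, $\ell$ and $n$ be as in the statement of the lemma. Define $\alpha = f_n\varphi - \lfloor f_n\varphi \rceil$ and $\beta = \ell\varphi - \lfloor \ell\varphi \rfloor$. By Lemma 4.2 we have $|\alpha| < |\ell\varphi - \lfloor \ell\varphi \rceil| \le \beta$. Now

$$\begin{aligned}
a_k &= \lfloor (f_n - \ell)\varphi \bmod 2 \rfloor \\
&= \lfloor (\lfloor f_n\varphi \rceil + \alpha - \lfloor \ell\varphi \rfloor - \beta) \bmod 2 \rfloor \\
&= \lfloor (\lfloor f_n\varphi \rceil - \lfloor \ell\varphi \rfloor - \beta) \bmod 2 \rfloor \\
&= (\lfloor f_n\varphi \rceil + \lfloor (-\lfloor \ell\varphi \rfloor - \beta) \bmod 2 \rfloor) \bmod 2 \\
&= (\lfloor f_n\varphi \rceil + 1 - a_\ell) \bmod 2.
\end{aligned}$$

In the third equality we used $|\alpha| < \beta$, in the fifth equality we used (14). The result now follows from Lemma 4.3. □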
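The reflection law (13) can be checked exhaustively for small $n$. The following sketch is only an illustration, under the assumption $f_1 = f_2 = 1$; `golden_bit`, `fib` and `lemma_4_4_holds` are names introduced here, not in the paper.

```python
from math import isqrt


def golden_bit(n: int) -> int:
    """a_n = floor(n * phi) mod 2 for n >= 1, using exact integer arithmetic."""
    return ((n + isqrt(5 * n * n)) // 2) % 2


def fib(k: int) -> int:
    """Fibonacci numbers with f_1 = f_2 = 1 (assumed indexing)."""
    a, b = 1, 1
    for _ in range(k - 1):
        a, b = b, a + b
    return a


def lemma_4_4_holds(n: int) -> bool:
    """Check a_k against a_l for every split k + l = f_n with k, l >= 1."""
    total = fib(n)
    flip = (n % 3 == 2)  # complement the bit exactly when n = 2 (mod 3)
    for k in range(1, total):
        l = total - k
        expected = 1 - golden_bit(l) if flip else golden_bit(l)
        if golden_bit(k) != expected:
            return False
    return True


print(all(lemma_4_4_holds(n) for n in range(3, 17)))
```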

We still need to show that the sequence a defined in (12) does not contain any even squares. For this we need the following results about multiples mod 1 of φ.


Lemma 4.5 Let $\ell$ be an integer. Let $m, m'$ be such that $f_m \le \ell < f_{m+1}$ and $f_{m'} < \ell \le f_{m'+1}$. Let

$$S = \{0, \varphi, 2\varphi, \ldots, (\ell - 1)\varphi\} \bmod 1$$

be a set of multiples of $\varphi$ modulo 1 (that is, on a circle of length 1).

(i) The longest segment non-intersecting $S$ has length $\varphi^{-(m-2)}$.
(ii) The closest pair of points in $S$ is at distance $\varphi^{-m'}$.

Proof Both claims follow from the proof of the three-distance theorem by Sós; we used the version presented in [1] to calculate the three distances for our sequence. Claim (ii) also follows directly from Lemma 4.2 (if $i\varphi \bmod 1$ and $j\varphi \bmod 1$ are a closest pair, then so are 0 and $(i - j)\varphi$). □
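Both claims of Lemma 4.5 can be illustrated numerically. The sketch below is a floating-point check with a tolerance, not a proof; it assumes $f_1 = f_2 = 1$ and resolves the ambiguity $f_1 = f_2$ by always taking $m, m' \ge 2$. All function names are ours.

```python
import math

PHI = (1 + math.sqrt(5)) / 2


def fib(k: int) -> int:
    """Fibonacci numbers with f_1 = f_2 = 1 (assumed indexing)."""
    a, b = 1, 1
    for _ in range(k - 1):
        a, b = b, a + b
    return a


def circular_gaps(count: int) -> list:
    """Gaps between consecutive points of {i * phi mod 1 : 0 <= i < count} on the circle."""
    points = sorted((i * PHI) % 1.0 for i in range(count))
    points.append(points[0] + 1.0)  # close the circle
    return [b - a for a, b in zip(points, points[1:])]


def fib_index_at_most(value: int) -> int:
    """Largest m >= 2 with f_m <= value (value >= 1)."""
    m = 2
    while fib(m + 1) <= value:
        m += 1
    return m


def lemma_4_5_holds(l: int, tol: float = 1e-9) -> bool:
    """Compare the longest and shortest gap with the powers of phi from the lemma (l >= 2)."""
    gaps = circular_gaps(l)
    m = fib_index_at_most(l)            # f_m <= l < f_{m+1}
    m_prime = fib_index_at_most(l - 1)  # f_{m'} < l <= f_{m'+1}
    longest_ok = abs(max(gaps) - PHI ** (-(m - 2))) < tol
    closest_ok = abs(min(gaps) - PHI ** (-m_prime)) < tol
    return longest_ok and closest_ok


print(all(lemma_4_5_holds(l) for l in range(2, 201)))
```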

The following lemma proves Theorem 4.1.

Lemma 4.6 There are no $n, k \in \mathbb{N}$ so that $a[n:n + 2k - 1] = a[n + 2k:n + 4k - 1]$. In other words, a does not contain any even squares.

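Proof Suppose that such $n$ and $k$ exist. Let $\delta = 2k\varphi \bmod 2$. Let $m'$ be such that $f_{m'} < k + 1 \le f_{m'+1}$ and let $m$ be such that $f_m \le 2k < f_{m+1}$. Note $m > m'$ and hence

$$2\varphi^{-m'} - \varphi^{-(m-2)} = \varphi^{-(m-2)}\bigl(2\varphi^{(m-2)-m'} - 1\bigr) > 0. \tag{15}$$

By Lemma 4.5 (with $\ell = k + 1$, using the lower bound on the distance of 0 and $\varphi k \bmod 1$) we have

$$\varphi k \bmod 1 \in [\varphi^{-m'}, 1 - \varphi^{-m'}]$$

and hence $\delta \in [2\varphi^{-m'}, 2 - 2\varphi^{-m'}]$. Let $x = n\varphi$ and let

$$S = \{x, x + \varphi, x + 2\varphi, \ldots, x + (2k - 1)\varphi\} \bmod 1.$$

By Lemma 4.5 the set $S$ contains a point $b_0 \in [0, \varphi^{-(m-2)}]$ and a point $b_1 \in [1 - \varphi^{-(m-2)}, 1]$. Let $n_0, n_1$ be such that $b_i = \varphi n_i \bmod 1$ (for $i = 0,1$) and let $b'_i = \varphi n_i \bmod 2$ (for $i = 0,1$).

If $\delta < 1$ then

$$2 > \delta + b_1 \ge 1 - \varphi^{-(m-2)} + 2\varphi^{-m'} > 1$$

and hence $a_{n_1} = \lfloor b'_1 \bmod 2 \rfloor \ne \lfloor (b'_1 + \delta) \bmod 2 \rfloor = a_{n_1 + 2k}$.

If $\delta > 1$ then

$$1 < \delta + b_0 \le 2 - 2\varphi^{-m'} + \varphi^{-(m-2)} < 2$$

and hence $a_{n_0} = \lfloor b'_0 \bmod 2 \rfloor \ne \lfloor (b'_0 + \delta) \bmod 2 \rfloor = a_{n_0 + 2k}$. □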

Remark 4.7 As one of the referees pointed out, it appears that the lengths of square words in a are all of the form $f_{3k+2}$ for $k \in \mathbb{N}$. We verified this conjecture empirically for $a[1:f_{20} - 1]$ (using the recursive construction of a presented in the next section).
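A brute-force search makes Lemma 4.6 and Remark 4.7 concrete on a short prefix. The sketch below collects every half-length $|w|$ for which some square $ww$ occurs in $a[1:400]$; the prefix length 400 and the function names are arbitrary choices made here, and we read the "length" in Remark 4.7 as $|w|$, which is our interpretation.

```python
from math import isqrt


def golden_prefix(length: int) -> str:
    """The word a[1:length], with a_n = floor(n * phi) mod 2 computed exactly."""
    return "".join(
        str(((n + isqrt(5 * n * n)) // 2) % 2) for n in range(1, length + 1)
    )


def square_half_lengths(word: str) -> set:
    """Set of all h such that word contains a square ww with |w| = h."""
    found = set()
    size = len(word)
    for h in range(1, size // 2 + 1):
        for start in range(size - 2 * h + 1):
            if word[start:start + h] == word[start + h:start + 2 * h]:
                found.add(h)
                break  # one witness per half-length is enough
    return found


halves = square_half_lengths(golden_prefix(400))
print(sorted(halves))
print(all(h % 2 == 1 for h in halves))  # no even squares, as Lemma 4.6 asserts
```

The quadratic search is adequate for prefixes of a few hundred symbols; checking $a[1:f_{20} - 1]$ as in the remark would call for a faster repetition-finding method.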


4.2 A Recursive Look at the Golden Ratio

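Let $w_3 = 1$ and

$$w_n = \begin{cases} w_{n-1}\,(w_{n-1}[1:f_{n-2}])^R & \text{if } n \equiv 0,1 \pmod 3,\\ w_{n-1}\,\bigl(\overline{w_{n-1}[1:f_{n-2}]}\bigr)^R & \text{if } n \equiv 2 \pmod 3, \end{cases}$$

where $w^R$ denotes the reverse of $w$ and $\overline{w}$ the bitwise complement of $w$. The first few words are displayed in the following table.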

n    w_n                                                       |w_n|
3    1                                                         1
4    11                                                        2
5    1100                                                      4
6    1100011                                                   7
7    110001100011                                              12
8    11000110001110011100                                      20
9    110001100011100111001110001100011                         33
10   110001100011100111001110001100011100111001110001100011    54

Note that $|w_n| = f_n - 1$. By construction $w_{n-1}$ is a prefix of $w_n$, so it is meaningful to define $w = \lim_{n \to \infty} w_n$. In Lemma 4.4 we showed that a satisfies the recursive construction law we used to define w, so the following result does not come as a surprise.

Corollary 4.8 w = a.

Proof We show that $w[1:f_n - 1] = a[1:f_n - 1]$ by induction on $n$. Equality holds in the base cases $n = 3, 4$. By Lemma 4.4, $(a[f_{n-1}:f_n - 1])^R = a[1:f_{n-2}]$ if $n \equiv 0,1 \pmod 3$, and $(a[f_{n-1}:f_n - 1])^R = \overline{a[1:f_{n-2}]}$ for $n \equiv 2 \pmod 3$. In either case, the last $f_{n-2}$ bits of $a[1:f_n - 1]$ are determined by the first $f_{n-2}$ bits of a in exactly the same way that the last $f_{n-2}$ bits of $w_n$ are determined by the first $f_{n-2}$ bits of $w_n$ (using the recursive definition of $w_n$). Moreover, by inductive assumption, the first $f_{n-2}$ bits of w and a agree, completing the proof. □
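The recursive definition of $w_n$ translates directly into code, which also lets one confirm Corollary 4.8 on a finite prefix. The sketch below assumes $f_1 = f_2 = 1$ and represents words as Python strings; `build_w`, `complement` and `golden_prefix` are illustrative names of our own.

```python
from math import isqrt


def fib(k: int) -> int:
    """Fibonacci numbers with f_1 = f_2 = 1 (assumed indexing)."""
    a, b = 1, 1
    for _ in range(k - 1):
        a, b = b, a + b
    return a


def complement(word: str) -> str:
    """Bitwise complement of a binary word."""
    return word.translate(str.maketrans("01", "10"))


def build_w(n: int) -> str:
    """Build w_n (n >= 3) from w_3 = 1 by the recursive rule of Sect. 4.2."""
    word = "1"
    for m in range(4, n + 1):
        tail = word[:fib(m - 2)][::-1]  # reverse of the first f_{m-2} symbols
        if m % 3 == 2:
            tail = complement(tail)     # complemented when m = 2 (mod 3)
        word += tail
    return word


def golden_prefix(length: int) -> str:
    """The word a[1:length], with a_n = floor(n * phi) mod 2 computed exactly."""
    return "".join(
        str(((n + isqrt(5 * n * n)) // 2) % 2) for n in range(1, length + 1)
    )


print(build_w(8))  # 11000110001110011100
print(all(build_w(n) == golden_prefix(fib(n) - 1) for n in range(3, 16)))
```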

In particular, by combining Corollary 4.8 with Lemma 4.4 we get

$$w_n = \begin{cases} w_n^R & \text{if } n \equiv 0,1 \pmod 3,\\ \overline{w_n^R} & \text{if } n \equiv 2 \pmod 3. \end{cases} \tag{16}$$

We still need to show that $w_n$ as defined above encodes a solution $Z''$ to (11)

$$Z'' = X_0\,a\,X_1\,a\,X_2 = \overline{X_2}\,?\,X_1\,?\,\overline{X_0}$$

with given lengths $|X_0| = f_n - 1$, $|X_1| = f_{n+1} - 1$, $|X_2| = f_{n+3} - 1$. This is accomplished by the following lemma.


Lemma 4.9 If $n \equiv 1 \pmod 3$ then $w_{n+4}$ fulfills

$$w_{n+4} = X_0\,S\,X_1\,S\,X_2 = \overline{X_2}\,?\,X_1\,?\,\overline{X_0}$$

over alphabet {0,1} for some X0,X1,X2 with given lengths |S| = 1, and |X0| =fn − 1, |X1| = fn+1 − 1, |X2| = fn+3 − 1. In particular, (11) has the solution Z′′ =(wn+4)

SSaa .

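Proof Choose $X_0, X_1, X_2, Y_0, Y_1, Y_2$ over alphabet $\{0,1\}$ so that

$$w_{n+4} = X_0\,S_0\,X_1\,S_1\,X_2 = \overline{Y_2}\,?\,Y_1\,?\,\overline{Y_0}$$

where $|S_0| = |S_1| = 1$, and $|X_0| = |Y_0| = f_n - 1$, $|X_1| = |Y_1| = f_{n+1} - 1$, $|X_2| = |Y_2| = f_{n+3} - 1$. To establish the lemma, it is enough to show that $S_0 = S_1$ and $X_i = Y_i$ for $0 \le i \le 2$.

Since $n \equiv 1 \pmod 3$, we have $n + 4 \equiv 2 \pmod 3$, so by (16) we know that $w_{n+4}$ fulfills $\overline{w_{n+4}^R} = w_{n+4}$. Hence $(X_0)^R = Y_0$, and, since $X_0 = w_{n+4}[1:f_n - 1] = w_n$, we have $Y_0 = (X_0)^R = (w_n)^R = w_n = X_0$.

Similarly, $(X_2)^R = Y_2$, and, since $\overline{Y_2} = w_{n+4}[1:f_{n+3} - 1] = w_{n+3}$, we have $X_2 = (Y_2)^R = (\overline{w_{n+3}})^R = \overline{w_{n+3}} = Y_2$.

Also, $(X_1)^R = \overline{Y_1}$ and $X_1 = w_{n+4}[f_n + 1:f_{n+2} - 1] = w_{n+2}[f_n + 1:f_{n+2} - 1] = (w_{n+2}[1:f_{n+1} - 1])^R = w_{n+1}^R$ (the second and the last equality hold because the $w_k$ are prefixes of each other and the third equality uses (16)); therefore $Y_1 = \overline{(X_1)^R} = \overline{((w_{n+1})^R)^R} = w_{n+1}^R = X_1$.

Finally, we have to show that the bits $S_0$ and $S_1$ in positions $f_n$ and $f_{n+2}$ of $w_{n+4}$ are the same:

$$\begin{aligned}
S_1 &= w_{n+4}[f_{n+2}] = w_{n+3}[f_{n+2}] = \bigl(w_{n+2}\,(w_{n+2}[1:f_{n+1}])^R\bigr)[f_{n+2}] \\
&= (w_{n+2}[1:f_{n+1}])^R[1] = w_{n+2}[f_{n+1}] = w_{n+2}[f_n] = w_{n+4}[f_n] = S_0,
\end{aligned}$$

where $w_{n+2}[f_{n+1}] = w_{n+2}[f_n]$ holds because $w_{n+2}$ is a palindrome of length $f_{n+2} - 1$ by (16). This completes the argument. □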
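The factorisation in Lemma 4.9 can be checked mechanically for small $n$. The sketch below rests on two assumptions of ours: that $f_1 = f_2 = 1$, and that the second factorisation carries a bitwise complement on its outer blocks, that is, $w_{n+4} = \overline{X_2}\,?\,X_1\,?\,\overline{X_0}$; the symbols at the two "?" positions are not constrained. For $n = 4$ this gives $X_0 = 11$, $S = 0$, $X_1 = 0011$ and $X_2 = 001110011100$.

```python
def fib(k: int) -> int:
    """Fibonacci numbers with f_1 = f_2 = 1 (assumed indexing)."""
    a, b = 1, 1
    for _ in range(k - 1):
        a, b = b, a + b
    return a


def complement(word: str) -> str:
    """Bitwise complement of a binary word."""
    return word.translate(str.maketrans("01", "10"))


def build_w(n: int) -> str:
    """Build w_n (n >= 3) from w_3 = 1 by the recursive rule of Sect. 4.2."""
    word = "1"
    for m in range(4, n + 1):
        tail = word[:fib(m - 2)][::-1]
        if m % 3 == 2:
            tail = complement(tail)
        word += tail
    return word


def lemma_4_9_holds(n: int) -> bool:
    """Check both factorisations of w_{n+4} for some n = 1 (mod 3), n >= 4."""
    w = build_w(n + 4)
    l0, l1, l2 = fib(n) - 1, fib(n + 1) - 1, fib(n + 3) - 1
    if len(w) != l0 + l1 + l2 + 2:
        return False
    # First factorisation: X0 S0 X1 S1 X2.
    x0, s0 = w[:l0], w[l0]
    x1, s1 = w[l0 + 1:l0 + 1 + l1], w[l0 + 1 + l1]
    x2 = w[l0 + l1 + 2:]
    # Second factorisation: blocks of lengths |X2|, 1, |X1|, 1, |X0|.
    left = w[:l2]
    middle = w[l2 + 1:l2 + 1 + l1]
    right = w[l2 + l1 + 2:]
    return (s0 == s1 and left == complement(x2)
            and middle == x1 and right == complement(x0))


print(all(lemma_4_9_holds(n) for n in (4, 7, 10, 13)))
```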

Acknowledgements We would like to thank both referees for helpful and detailed comments that improved the presentation in both parts of the paper. We would also like to thank Jarek Grytczuk for prompting us to explore the connection to the golden ratio sequence.

References

1. Alessandri, P., Berthé, V.: Three distance theorems and combinatorics on words. Enseign. Math. (2) 44(1–2), 103–132 (1998)

2. Benzer, S.: On the topology of the genetic fine structure. Proc. Natl. Acad. Sci. 45, 1607–1620 (1959)

3. Cloitre, B.: Sequence A085002. http://www.research.att.com/~njas/sequences/A085002 (2003)

4. Entringer, R.C., Jackson, D.E., Schatz, J.A.: On nonrepetitive sequences. J. Comb. Theory Ser. A 16, 159–164 (1974)

5. Fraenkel, A.S., Simpson, J.R.: How many squares must a binary sequence contain? Electron. J. Comb. 2 (1995)

6. Graham, R.L.: Problem 1. In: Open Problems at 5th Hungarian Colloquium on Combinatorics (1976)

7. Harju, T., Nowotka, D.: Binary words with few squares. Bull. Eur. Assoc. Theor. Comput. Sci. 89, 164 (2006)

8. Kratochvíl, J., Matoušek, J.: String graphs requiring exponential representations. J. Comb. Theory Ser. B 53, 1–4 (1991)

9. Lothaire, M.: Combinatorics on Words. Cambridge Mathematical Library. Cambridge University Press, Cambridge (1997)

10. Lothaire, M.: Algebraic combinatorics on words. In: Encyclopedia of Mathematics and Its Applications, vol. 90. Cambridge University Press, Cambridge (2002)

11. Pach, J., Tóth, G.: Recognizing string graphs is decidable. Discrete Comput. Geom. 28(4), 593–606 (2002)

12. Schaefer, M., Štefankovic, D.: Decidability of string graphs. J. Comput. Syst. Sci. 68(2), 319–334 (2004)

13. Schaefer, M., Sedgwick, E., Štefankovic, D.: Algorithms for normal curves and surfaces. In: Computing and Combinatorics. Lecture Notes in Computer Science, vol. 2387, pp. 370–380. Springer, Berlin (2002)

14. Schaefer, M., Sedgwick, E., Štefankovic, D.: Recognizing string graphs in NP. J. Comput. Syst. Sci. 67(2), 365–380 (2003) (Special issue on STOC2002, Montreal, QC)

15. Schaefer, M., Sedgwick, E., Štefankovic, D.: Computing Dehn twists and geometric intersection numbers in polynomial time. Technical report TR05-009, DePaul University (2005)

16. Schaefer, M., Sedgwick, E., Štefankovic, D.: Spiraling and folding: the topological view. In: Bose, P. (ed.) Proceedings of the 19th Annual Canadian Conference on Computational Geometry, CCCG 2007, 20–22 August 2007, Carleton University, Ottawa, Canada, pp. 73–76 (2007)
