The Internet Pendulum: On the Periodicity of Internet Topology Measurements

Mattia Iodice
Roma Tre University
Rome, Italy
[email protected]

Massimo Candela
RIPE NCC
Amsterdam, The Netherlands
[email protected]

Giuseppe Di Battista
Roma Tre University
Rome, Italy
[email protected]

ABSTRACT

Public databases of large-scale topology measures (e.g. RIPE Atlas) are very popular in both the research and the practitioner communities. They are used, at least, for understanding the state of the Internet in real time, for outage detection, and for getting a broad baseline view of the evolution of the Internet over time. However, despite the large number of investigations, the dynamic aspects of these measures are not fully understood. For example, time-series of such measures sometimes show patterns that repeat at regular intervals. More specifically, in a time-series of traceroutes between certain source-target pairs, the paths sometimes follow alternations that repeat several times. Do they have the features of periodicity? What are their main characteristics? In this paper we study the problem of detecting and characterizing periodicities in Internet topology measures. For this purpose we devise an algorithm based on autocorrelation and string matching. First, we validate the effectiveness of our algorithm in-vitro, on randomly generated measures containing artificial periodicities. Second, we use the algorithm to verify how frequently traceroute sequences extracted from popular databases of topology measures exhibit periodic behavior. We show that a surprisingly high percentage of measures present one or more periodicities. This happens both with traceroutes performed at different frequencies and with different types of traceroutes. Third, we apply our algorithm to databases of BGP updates, a context where periodicities are even more unexpected than in traceroutes. In this case too, our algorithm is able to spot periodicities. We argue that some of them are related to oscillations of the BGP control plane.

1. INTRODUCTION

The availability of public databases of large-scale Internet topology measures (e.g. the traceroutes performed by RIPE Atlas [6] or by CAIDA Ark [4]) has deeply changed our ability to understand the state of the Internet in real time, our outage detection systems, and our methods for getting a broad baseline view of the evolution of the Internet over time. However, even though several papers have been written in which such public databases play the leading role, the dynamics of these topology measures are not fully understood. This is important for many research topics and, because some changes in the measures have nothing to do with faults, it is especially crucial for outage detection systems.

When looking at time-series of traceroute paths (where a path is a sequence of IP addresses possibly containing asterisks) with one of the many popular visualization tools (see e.g. [9]), one sometimes observes paths that follow alternations that repeat several times. Do they have the features of a “periodicity”? What are their main characteristics? As an example, the diagram of Fig. 1 shows a periodic pattern. It represents a time-series of paths originated by a sequence of traceroutes performed between a specific probe-target pair. The presence of a periodicity, involving paths p0, p4, p7, and p11, is quite evident even if mixed with some “noise.”

In the networking field we can have different types of periodicity. For example, the data plane of a router may change periodically, because of a control plane misconfiguration or because of some traffic-engineering requirement. Observe that a data plane may have a periodic behavior that, because of inauspicious timing, is not revealed by topology measures. Conversely, topology measures may exhibit a periodicity that does not correspond to any periodicity of the data plane. As an example, consider a sequence of Paris traceroutes [8] against a hash-based load balancer.

arXiv:1709.05969v1 [cs.NI] 18 Sep 2017

Figure 1: A time-series of paths originated by a sequence of traceroutes. Each path is assigned a Path-id in [p0, p14]. The mapping between paths and IPv4 addresses is omitted for brevity. The paths were recorded by RIPE Atlas on May 1st 2017 from 00:00 to 04:00 am. The probe identifier is 10039 and the target identifier is nl-ams-as43996.

In this paper we study periodicities in Internet topology measures. We do not deal with the causes of such periodicities but we study their presence in public databases. First, we propose an algorithm, based on autocorrelation, string matching, and clustering, for detecting and characterizing periodicities. Second, we check the algorithm in-vitro, against a set of randomly generated time-series of traceroute paths. We show that the algorithm is very effective even in the presence of a certain amount of noise. Third, we use the algorithm to verify how frequently traceroute sequences extracted from popular databases of topology measures exhibit periodic behavior. We show that a surprisingly high percentage of measures present one or more periodicities. This happens both with Paris traceroute and with traditional traceroute. This also happens with traceroutes performed at different frequencies. Finally, we slightly modify our algorithm and apply it to sequences of Border Gateway Protocol (BGP) updates. We concentrate on the prefixes that are most active in terms of recorded BGP updates and observe several periodicities in data extracted from the RIS service of RIPE. This is even more surprising than the results obtained by examining traceroutes. In fact, before being recorded by RIS collector peers, the BGP updates traverse so many BGP routers, each with its own timing features, that the synchronization required to spot periodicities is unlikely. Among the periodicities we were able to distinguish some cases that are related to BGP control plane oscillations. As far as we know, this is the first time that this type of oscillation has been detected in the wild.

The paper is organized as follows. In Section 2 we introduce basic terminology and discuss the state of the art. In Section 3 we describe the data set of traceroutes used in the experiments. In Section 4 we present our methodology. In Section 5 we show the experiments conducted in-vitro, in Section 6 we show the experiments on traceroutes, and in Section 7 we show the experiments on BGP updates. Conclusions are in Section 8.

2. BASIC TERMINOLOGY AND RELATED WORK

A periodic function is a function that repeats its values at regular intervals. A function f shows a periodic behavior [23] with period P in an interval T if f(x + P) = f(x) for each x ∈ T. The values of f in T are the periodic pattern of the periodicity. If there exists a least positive constant P with this property, it is called the fundamental period. Notice that a periodic function with period P is also periodic with period kP, for any positive integer k. Within a given time interval, a function can be periodic in just one sub-interval or in several disjoint sub-intervals with different periods. A time interval in which the function is periodic is a periodic interval. As an example, the function of Fig. 1 is periodic in two periodic intervals, with two different periodic patterns.
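The definition above can be illustrated on a discrete sequence of path identifiers. The following is a minimal sketch, not part of the methodology of Section 4: the function names `has_period` and `fundamental_period` are hypothetical, and we assume that a period must repeat at least twice within the sequence to count.

```python
from typing import Hashable, Optional, Sequence


def has_period(seq: Sequence[Hashable], period: int) -> bool:
    """Return True if seq[i + period] == seq[i] wherever both indices exist."""
    if period <= 0:
        raise ValueError("period must be positive")
    return all(seq[i] == seq[i + period] for i in range(len(seq) - period))


def fundamental_period(seq: Sequence[Hashable]) -> Optional[int]:
    """Return the least period that fits at least twice in seq, or None."""
    for period in range(1, len(seq) // 2 + 1):
        if has_period(seq, period):
            return period
    return None


paths = ["p0", "p4", "p7", "p0", "p4", "p7", "p0", "p4", "p7"]
print(fundamental_period(paths))  # 3
print(has_period(paths, 6))       # True: 6 = 2 * 3 is also a period
```

Only equality between values is used, so the sketch does not require any metric or order on the paths.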

2.1 The Role of the Metric

Periodicity has been deeply studied in the case where the co-domain of f has a metric (e.g. it is a subset of the real or of the integer numbers). Examples of this case that are in some way related to our problem follow.

In [22] a hybrid technique is presented, based on the combined usage of the discrete Fourier transform and autocorrelation. The authors argue that autocorrelation techniques are useful for exploring large periods but do not allow one to determine the period itself, while periodograms allow the use of thresholds for noise filtering but are not accurate for short periods. Hence, they combine the two techniques.

A classification of dance music based on periodicity patterns is presented in [13]. Two methods are compared. The first performs onset detection and clustering of inter-onset intervals. The second uses autocorrelation. The autocorrelation-based approach gave better results.

Periodicity detection of local motion is studied in [21]. The authors propose an approach for local motion analysis via periodicity detection under complex conditions. The work is mostly motivated by the difficulties of analyzing local motion in the presence of clutter consisting of global and multi-object motions. The contribution lies in combining layered motion analysis with the autocorrelation of motion energy to estimate the basic motion periodicity. Interestingly, the field of interest requires dealing with a large amount of noise.

A classical method for investigating periodicities in disturbed series, with special reference to Wolfer’s sunspot numbers, is presented in [24]. Sunspot numbers are analogous to the data that would be given by observations of a disturbed periodic movement, such as that of a pendulum subject to successive small impulses. The method is effective especially for short periods and for short periodicities.

However, all the above techniques can only be used when the co-domain of f(x) has a metric. In the case of Internet topology measures, f is just a time-series of paths and there is no total order between them. Also, the domain is discrete time, since measures are usually performed at specific instants (e.g. each minute, each hour, etc.).

2.2 Periodicity in Internet

Some papers on periodicity issues have also been presented in the Internet research community. Here too, we discuss those that are somehow related to our work.

Periodicity classification of HTTP traffic, to detect HTTP botnets, is discussed in [15]. HTTP botnets periodically connect to particular Web pages or URLs to get commands and updates from a botmaster. This identifiable periodic connection pattern has been used in several studies as a feature to detect HTTP botnets. The authors propose three metrics for identifying the types of communication patterns according to their periodicity. Test results show that, in addition to accurately detecting HTTP botnet communication patterns, the proposed method is able to efficiently classify communication patterns into several periodicity categories.

Inferring the periodicity in large-scale Internet measurements is the purpose of [7]. The authors present two methods for assessing the periodicity of network events and inferring their periodic patterns. The first method uses power spectral density to infer a single dominant period that exists in a signal representing the sampling process. This method is highly robust to noise, but is most useful for single-period processes. Hence, they also present a method for detecting multiple periods of a single process, using iterative relaxation of the time-domain autocorrelation function. They evaluate these methods using extensive simulations, and show their applicability to real Internet measurements of end-host availability and IP address alternations.

Internet routing instability is studied in [17]. The paper examines the inter-domain routing information exchanged between backbone service providers at the major U.S. public Internet exchange points. The authors describe several unexpected trends in routing instability, and examine a number of anomalies and pathologies observed in the exchange of inter-domain routing information. They also show that instability exhibits strong temporal properties. They describe a strong correlation between the level of routing activity and network usage. The magnitude of routing information exhibits the same significant weekly, daily and holiday cycles as network usage and congestion. They essentially apply existing techniques to the time-series of the number of BGP updates.

Finally, ping round-trip-time periodicity is extensively studied in [12] (based on the power spectrum) and Internet traffic periodicity is studied in [20].

However, in all the above cases too, the adopted techniques can only be used when the co-domain of f(x) has a metric. To find contributions where the co-domain of f(x) does not have a metric we have to explore other research fields.

2.3 Without a Metric

DNA periodicity is studied in [19]. A Fourier transform of a sequence of bases along a given stretch of DNA is defined. The transform is invariant to the labelling of the bases and can therefore be used as a measure of periodicity for segments of DNA with differing base content. It can also be conveniently used to search for base periodicities within large DNA databases. Unfortunately, the function can have at most four values and this makes the technique unsuitable for our purposes.

A periodicity detection algorithm for databases of different types of data (e.g. automotive, video surveillance, and geography) is proposed in [18]. The algorithm combines information from the time-frequency domain and from the autocorrelation space to find meaningful periods. If the fundamental period of a signal is T, in some cases the algorithm outputs 2T. In such cases, the authors manually divide the detected period by 2 after a visual inspection of the signal. This manual inspection step makes the algorithm unlikely to scale and unsuitable for an on-line service.

Periodicity detection in time series databases is also studied in [14]. The authors define two types of periodicities. Whereas symbol periodicity addresses the periodicity of single symbols in the time series, segment periodicity addresses the periodicity of portions of the series. Their proposed algorithm for segment periodicity detection uses convolution to shift and compare the time series for all possible values of the period. However, the algorithm does not automatically characterize the periodicity in terms of the pattern involved and the periodic interval.

3. A DATA SET OF TRACEROUTES

In recent years, measuring network performance and availability has assumed a key operational role. This has drastically increased the number of available Internet measurement platforms. Some of the open ones have distinctive features of accuracy and coverage.

RIPE Atlas is a platform with more than 9,000 points of view consisting of small hardware devices (called probes) able to perform various types of measurement, including traceroutes, against other hardware devices (called anchors). These devices are distributed around the world, and probes are mostly at end-user connections [6].

In our experiments we consider the IPv4 traceroutes performed by 9,738 RIPE Atlas probes towards 258 anchors in the week from May 1st to May 7th 2017. The total number of probe-anchor pairs is 101,715 since not all the possible pairs are active. Traceroutes are performed every 15 minutes. The result of each traceroute is a path of IP addresses. In composing the paths we consider only the answer to the first of the three traceroute packets sent at each traceroute round. All these data are publicly available.

Observe that RIPE Atlas traceroutes are subject to IP address aliasing (see e.g. [16]), since no anti-aliasing preprocessing is performed by RIPE NCC before providing the traceroutes to the users of the service. Since our goal is to study the phenomenon on an as-is basis, we did not apply any anti-aliasing change to the data set.

The purpose of Fig. 2.a is to show how many distinct paths are observed by each probe-anchor pair. Observe that most of the pairs (roughly 85%) observe fewer than 20 paths. However, a few of them observe many more paths.

We may expect that, for a probe-anchor pair whose traceroutes are periodic, each path is observed roughly the same number of times. As an example, this may happen for simple periodicities where the periodic pattern does not contain repeated paths.

Hence, we computed the number of occurrences of each distinct observed path for each pair. Fig. 2.b shows the distribution of the number of probe-anchor pairs with respect to the standard deviation of the number of occurrences of each distinct path. The figure shows that real data are very complex and that it is hard to classify probe-anchor pairs according to the standard deviation of the number of occurrences of the observed paths.
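The per-pair statistic discussed above can be sketched as follows. This is only an illustration: the function name `occurrence_spread` is hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator was used.

```python
from collections import Counter
from statistics import pstdev
from typing import Hashable, Sequence, Tuple


def occurrence_spread(paths: Sequence[Hashable]) -> Tuple[int, float]:
    """Return (number of distinct paths, std. dev. of their occurrence counts).

    `paths` is the non-empty time-series observed by one probe-anchor pair.
    The population standard deviation is an assumption of this sketch.
    """
    counts = Counter(paths)
    return len(counts), pstdev(list(counts.values()))


# A simple periodicity without repeated paths: every path occurs equally often.
balanced = ["pA", "pB", "pC"] * 8
# A pair dominated by one path, with two rare deviations.
skewed = ["pA"] * 22 + ["pB", "pC"]

print(occurrence_spread(balanced))  # (3, 0.0)
print(occurrence_spread(skewed))
```

A spread close to zero is what the simple periodicities mentioned above would produce, although a low value alone does not imply a periodicity.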

4. A METHODOLOGY FOR DETECTING PERIODICITIES

In this section we describe an algorithm, composed of four steps, for detecting and characterizing periodicities. Its input is a time-series x(t) of paths resulting from a sequence of traceroutes between a specific source-destination pair, where x(t) is the path measured at time t. Its output is a (possibly empty) set of periodicities observed in x(t), each consisting of a period, a periodic pattern, and a periodic interval with its starting and ending times. The algorithm can be tuned using a tolerance parameter ti.

4.1 Autocorrelation

In the Autocorrelation step we compute a variation Rxx(l) of the well-known autocorrelation function of x(t) as follows.


Figure 2: Basic features of the data set. (a) Distribution of the number of distinct paths with respect to the probe-anchor pairs. (b) Distribution of the number of probe-anchor pairs with respect to the standard deviation of the number of occurrences per distinct path.

$$R_{xx}(l) = \sum_{n \in \mathbb{Z}} x(n) \cdot x(n + l)$$

The dot operator is a path matching operator that outputs one if x(n) = x(n + l) and zero otherwise. When comparing paths containing asterisks we make no assumption about the missing IP addresses and treat an asterisk like any other symbol in the path. Observe that an asterisk may not correspond to a routing change (for example, a router might reply with an ICMP packet only to every other packet with expired TTL).

The variable l is called the Lag. The result of the autocorrelation performed on the time-series of Fig. 1 is shown in Fig. 3. In the figure each unit of lag corresponds to a time unit of 15 minutes, the default timing in Atlas anchoring measurements.
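The Autocorrelation step can be sketched as below. This is a minimal illustration under stated assumptions: the function name `path_autocorrelation` and the `max_lag` parameter are hypothetical, paths are represented as tuples of hop strings, and on a finite series the sum is restricted to the indices where both x(n) and x(n + l) exist.

```python
from typing import Hashable, List, Optional, Sequence


def path_autocorrelation(series: Sequence[Hashable],
                         max_lag: Optional[int] = None) -> List[int]:
    """Return [R(0), R(1), ..., R(max_lag)] for a time-series of paths.

    R(l) counts the positions n where series[n] == series[n + l]; the
    path matching operator yields 1 on equality and 0 otherwise.
    """
    n = len(series)
    if max_lag is None:
        max_lag = n - 1
    max_lag = min(max_lag, n - 1)
    acf = []
    for lag in range(max_lag + 1):
        matches = sum(1 for i in range(n - lag) if series[i] == series[i + lag])
        acf.append(matches)
    return acf


# Paths are tuples of hops; an asterisk is compared like any other symbol.
a = ("10.0.0.1", "10.0.1.1", "10.0.9.9")
b = ("10.0.0.1", "*", "10.0.9.9")
c = ("10.0.0.1", "10.0.2.1", "10.0.9.9")
series = [a, b, c] * 4

print(path_autocorrelation(series, max_lag=6))  # [12, 0, 0, 9, 0, 0, 6]
```

Because the series is finite, the number of comparable pairs shrinks as the lag grows, which is why the values at lags 3 and 6 decrease even though the series is perfectly periodic.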

4.2 Peaks-Detection

In the Peaks-Detection step we look for “peaks” in the Rxx(l) function. Intuitively, peaks are clues of periodicity since they correspond to a large number of evenly spaced identical paths. To find them we simply look for local maxima. If no peak is found we conclude that x(t) has no periodicities. Otherwise, we determine a set of peaks, each identified by a value l of Lag.

Figure 3: Autocorrelation (ACF) performed on the time-series of Fig. 1.

If several peaks have been found it is possible that several periodicities are present. Also, different peaks may represent the same periodicity detected with overlapping periods. The arrows in Fig. 3 highlight twenty peaks.
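The search for local maxima described in this step can be sketched as follows. The function name `find_peaks` is hypothetical, and two choices are assumptions of this sketch rather than details given in the text: a peak must be strictly greater than both neighbours, and lag 0 and the last lag are never reported.

```python
from typing import List, Sequence


def find_peaks(acf: Sequence[int]) -> List[int]:
    """Return the lags that are strict local maxima of the autocorrelation.

    Lag 0 (always the global maximum) and the last lag (no right
    neighbour) are skipped; both choices are assumptions of this sketch.
    """
    return [lag for lag in range(1, len(acf) - 1)
            if acf[lag - 1] < acf[lag] > acf[lag + 1]]


# Autocorrelation of a series that repeats a 3-path pattern four times.
acf = [12, 0, 0, 9, 0, 0, 6, 0, 0, 3, 0, 0]
print(find_peaks(acf))  # [3, 6, 9]
```

The three reported lags are multiples of the same period, which is the situation of several peaks representing one periodicity that the next step has to resolve.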

4.3 Peak-Clustering

In the Peak-Clustering step, peaks that are geometrically near each other are grouped and the resulting clusters are analyzed. Intuitively, peaks with similar l and similar y (the height of the peak) are likely to correspond to the same periodicity. In Fig. 3 the peaks with a red arrow are grouped in the same cluster (in this case the time between two consecutive peaks is 240 minutes).

Peaks in the same cluster are then analyzed as follows. First, we order them according to their l. Second, we compute the l-distance (horizontal distance) between consecutive peaks. Third, we check whether such distances are roughly the same, or whether they can become the same by discarding one or more peaks (outliers). The peaks with a red arrow in Fig. 3 are roughly equispaced.

Clusters where distances are regular may correspond to periodicities, and their inter-peak distances may be their periods. Such clusters are the potential periodicities. At the end of this step Fig. 3 shows two clusters (red and green) that are both potential periodicities.
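The regularity check applied to one cluster can be sketched as below. The text does not specify how peaks are grouped nor how outliers are chosen, so everything here is an assumption: the cluster is taken as already formed, the first peak is assumed not to be an outlier, and the names `regular_subset`, `potential_period`, `slack` and `max_outliers` are hypothetical.

```python
from itertools import combinations
from typing import List, Optional, Sequence, Tuple


def regular_subset(lags: Sequence[int], period: int, slack: int = 1) -> List[int]:
    """Keep the peaks whose distance from the previously kept peak is about `period`."""
    ordered = sorted(lags)
    kept = [ordered[0]]
    for lag in ordered[1:]:
        if abs((lag - kept[-1]) - period) <= slack:
            kept.append(lag)
    return kept


def potential_period(lags: Sequence[int], slack: int = 1,
                     max_outliers: int = 1) -> Optional[Tuple[int, List[int]]]:
    """Return (period, kept peaks) if the cluster is roughly equispaced.

    Every pairwise distance is tried as a candidate period; the one that
    keeps the most peaks wins (ties go to the smaller period). The
    cluster is rejected if more than `max_outliers` peaks are discarded.
    """
    ordered = sorted(set(lags))
    if len(ordered) < 3:
        return None
    best = None
    for period in sorted({b - a for a, b in combinations(ordered, 2)}):
        kept = regular_subset(ordered, period, slack)
        if best is None or len(kept) > len(best[1]):
            best = (period, kept)
    period, kept = best
    if len(ordered) - len(kept) > max_outliers:
        return None
    return period, kept


# With a 15-minute lag unit, 16 units correspond to 240 minutes; 40 is an outlier.
print(potential_period([16, 32, 40, 48, 64]))  # (16, [16, 32, 48, 64])
print(potential_period([5, 11, 30, 31, 70]))   # None
```

The first cluster becomes a potential periodicity with period 16 after discarding one peak, while the second has no regular spacing and is rejected.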

4.4 Periodicity Characterization

In the Periodicity Characterization step, for each potential periodicity we first verify whether it can be considered a true periodicity. To do that, x(t) is split into subsequences with length corresponding to the potential period. Then consecutive subsequences are matched one against the other and are considered compatible with the periodicity if they have a Hamming distance less than or equal to the tolerance ti. If no compatible pairs of subsequences are found we conclude that it is not a true periodicity. Otherwise, we output a periodicity: the maximal sequences of consecutive compatible subsequences are glued into its periodic interval, its period is the length of the subsequence, and its pattern is the one with the highest number of repetitions among the glued subsequences.

In the example of Fig. 3 the cluster withgreen peaks does not pass this step and is hencediscarded. The cluster with red peaks passesthis step and we conclude that it correspondsto a true periodicity.
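The characterization of one potential periodicity can be sketched as follows. This is an illustration under assumptions that the text leaves open: subsequences are cut starting at index 0, a trailing incomplete subsequence is ignored, intervals are reported as half-open index ranges, and the names `hamming` and `characterize` are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def hamming(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Number of positions where two equal-length subsequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def characterize(series: Sequence[Hashable], period: int,
                 tolerance: int) -> List[Dict]:
    """Return the periodicities with the given period found in `series`.

    Consecutive chunks of length `period` are compatible when their
    Hamming distance is at most `tolerance`; maximal runs of compatible
    chunks are glued into one periodic interval [start, end).
    """
    chunks = [tuple(series[i:i + period])
              for i in range(0, len(series) - period + 1, period)]
    found = []
    run_start = None
    for k in range(len(chunks)):
        linked = (k + 1 < len(chunks)
                  and hamming(chunks[k], chunks[k + 1]) <= tolerance)
        if linked and run_start is None:
            run_start = k
        elif not linked and run_start is not None:
            run = chunks[run_start:k + 1]
            pattern = Counter(run).most_common(1)[0][0]
            found.append({"period": period,
                          "start": run_start * period,
                          "end": (k + 1) * period,
                          "pattern": pattern})
            run_start = None
    return found


series = ["a", "b", "c"] * 3 + ["a", "x", "c"] + ["q", "r", "s", "t", "u", "v"]
print(characterize(series, period=3, tolerance=1))
# [{'period': 3, 'start': 0, 'end': 12, 'pattern': ('a', 'b', 'c')}]
```

With a tolerance of 1 the noisy chunk ("a", "x", "c") is glued into the periodic interval, while the pattern remains the most repeated chunk; with a tolerance of 0 the interval would end at index 9.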


5. CHECKING THE METHODOLOGY IN-VITRO

In order to test the effectiveness of the algorithm of Section 4 we randomly generated 5,000 time-series of traceroutes and tested the algorithm against them.

Each of them consists of 10,000 paths, corresponding to roughly one week of traceroutes performed every 60 seconds.

5.1 Generating the Time-Series

Each time-series is generated as follows. We first generate a periodic interval starting at time t = 0. It consists of a random period in the range [2, 30] and a random periodic pattern obtained by randomly selecting, for each instant of the period, a specific path from a group of 30 pre-defined paths. These parameters were selected according to the analysis performed in Section 3. For example, the choice of 30 pre-defined paths follows from the analysis of the path distribution: in Section 3 we showed that most of the pairs (roughly 85%) observe fewer than 20 paths (see Fig. 2.a). As a consequence, using 30 distinct paths allows us to work with a representative set of elements. The periodic interval is then obtained by randomizing the number of times the period is repeated.

Suppose that the periodicity ends at time t1. We then generate a non-periodic random time interval from t1 to t2. The length of [t1, t2] is randomized in such a way that it has the same probability distribution as the length of the previous random periodic interval. The paths in the non-periodic interval are each drawn at random from the same set of paths used for the periodic patterns. Observe that in most of the time-series of the cited public databases the periodic intervals are quite often separated by intervals where very few paths (frequently just one) are found. Instead, we decided to fill non-periodic intervals with purely random paths in order to stress the algorithm.

We then randomize a new periodic interval starting from t2 and keep on executing the same procedure, alternating periodic and non-periodic intervals, until the end of the time-series is reached.

In order to have an even distribution of the length of the periodic patterns in the time-series, when the generation of the time-series is finished we perform a final randomization step, re-shuffling the position of periodic and non-periodic patterns. Fig. 4 shows one of the features of the randomized data set. Namely, it shows the number of time-series containing a given number of periodicities.
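The generation procedure can be sketched as follows. Several details are assumptions of this sketch: the range of the number of repetitions (`max_repeats`) is not given in the text, the final re-shuffling step is omitted, the last interval may be truncated at the requested length, and the names `generate_series` and `truth` are hypothetical.

```python
import random
from typing import List, Optional, Tuple


def generate_series(length: int, n_paths: int = 30, min_period: int = 2,
                    max_period: int = 30, max_repeats: int = 20,
                    seed: Optional[int] = None
                    ) -> Tuple[List[str], List[Tuple[int, int, int]]]:
    """Alternate periodic and purely random intervals up to `length` paths.

    Returns the series and, for each periodic interval, the tuple
    (start index, period, number of repetitions) used to build it.
    """
    rng = random.Random(seed)
    paths = [f"p{i}" for i in range(n_paths)]
    series: List[str] = []
    truth: List[Tuple[int, int, int]] = []
    while len(series) < length:
        # Periodic interval: a random pattern repeated a random number of times.
        period = rng.randint(min_period, max_period)
        pattern = [rng.choice(paths) for _ in range(period)]
        repeats = rng.randint(2, max_repeats)
        truth.append((len(series), period, repeats))
        series.extend(pattern * repeats)
        # Non-periodic interval: same length distribution, random paths.
        filler = rng.randint(min_period, max_period) * rng.randint(2, max_repeats)
        series.extend(rng.choice(paths) for _ in range(filler))
    return series[:length], truth


series, truth = generate_series(10_000, seed=1)
start, period, repeats = truth[0]
print(len(series), start, 2 <= period <= 30)  # 10000 0 True
```

Keeping the `truth` list makes it possible to count false negatives and false positives when a detection algorithm is run on the generated series.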

5.2 Periodicities in the Random Time-Series

We ran our experiments looking for periodicities in the generated time-series, which contain 19,678 periodicities. The algorithm found 85.11% of the periodicities, with 14.89% of false negatives. On the other hand, the algorithm detected 194 periodicities that were not in the data set, corresponding to 0.9% of false positives.

Also, among the periodicities that were correctly detected the algorithm was able to give a correct characterization for 99.2% of them. Intuitively, long periodic intervals and long periods are easier to detect than short ones. Fig. 5 (blue curve) shows how false negatives are distributed with respect to the length of the period, while Fig. 6 (blue curve) shows how false negatives are distributed with respect to the number of periods in the periodic interval. The figures show that the algorithm is quite conservative in looking for periodicities. Observe that the conservative approach is largely justified by the goals pursued in this analysis. Our goal is the discovery of a phenomenon, so its possible underestimation does not affect our conclusions.

5.3 Inserting Noise

After the activities described above, we repeated the experiments inserting increasing percentages of noise in the time-series.

We insert noise as follows. When an element of noise is inserted we select a random time tr in one of the periodic intervals of the series and choose, with equal probability, one action from the following set: either we insert a random path at tr, or we remove the path that is found at tr, or we substitute the path at tr with a new random path. The noise is inserted only in the periodic intervals since it would be pointless to change the random paths of the non-periodic intervals into other random paths.

In Fig. 7 we show how the algorithm presented in Section 4 behaves in the presence of noise. The percentage of noise is the number of inserted elements of noise with respect to the number of paths belonging to periodic sequences.
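The noise insertion described above can be sketched as follows. The function name `add_noise` and its parameters are hypothetical, periodic intervals are assumed to be given as half-open index ranges, and two simplifications are assumptions of this sketch: noise times are drawn without repetition, and they are processed from the last to the first so that insertions and removals do not shift the positions still to be processed.

```python
import random
from typing import List, Optional, Sequence, Tuple


def add_noise(series: Sequence[str],
              periodic_intervals: Sequence[Tuple[int, int]],
              percent: float,
              paths: Sequence[str],
              seed: Optional[int] = None) -> List[str]:
    """Return a copy of `series` with noise inserted in the periodic intervals.

    `percent` is the number of noise elements relative to the number of
    paths that belong to periodic intervals.
    """
    rng = random.Random(seed)
    noisy = list(series)
    positions = [i for start, end in periodic_intervals for i in range(start, end)]
    n_events = min(len(positions), round(len(positions) * percent / 100))
    for tr in sorted(rng.sample(positions, n_events), reverse=True):
        action = rng.choice(("insert", "remove", "substitute"))
        if action == "insert":
            noisy.insert(tr, rng.choice(paths))
        elif action == "remove":
            del noisy[tr]
        else:
            # The new path is random and may coincide with the old one.
            noisy[tr] = rng.choice(paths)
    return noisy


base = ["a", "b"] * 50
noisy = add_noise(base, [(0, 50)], percent=20, paths=["a", "b", "c"], seed=3)
print(noisy[-50:] == base[50:])  # True: the non-periodic tail is untouched
```

Insertions and removals change the length of the series, so a periodicity affected by them is shifted as well as corrupted, which is harder for a detector than a plain substitution.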

Figure 4: Relationship between time-series and periodicities.

Figure 5: Distribution of the false negatives with respect to the length of the pattern.

The experiments performed in-vitro allowed us to determine a suitable value for the tolerance parameter ti of the algorithm, which was selected in order to have a very small number of false positives. Namely, if the pattern contains fewer than 5 paths we set ti = 1, otherwise we set ti to 10% of the pattern length. All the experiments reported in this paper have been done with this setting.
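The tolerance rule can be written as a short function. The text does not say how 10% of the pattern length is rounded when it is not an integer; rounding up is an assumption of this sketch, and the function name `tolerance` is hypothetical.

```python
def tolerance(pattern_length: int) -> int:
    """Tolerance ti for a pattern of the given length.

    Patterns with fewer than 5 paths get ti = 1; longer patterns get 10%
    of their length, rounded up (the rounding is an assumption).
    """
    if pattern_length < 5:
        return 1
    return -(-pattern_length // 10)  # integer ceiling of pattern_length / 10


for length in (2, 5, 16, 30):
    print(length, tolerance(length))  # 2 1, 5 1, 16 2, 30 3
```

Integer arithmetic is used for the ceiling to avoid floating-point artifacts such as 0.1 * 30 evaluating to slightly more than 3.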

6. PERIODICITY OF TRACEROUTES OF PUBLIC DATA SOURCES

We applied the algorithm of Section 4 to the data set of traceroutes described in Section 3 with the purpose of checking whether periodic time-series of traceroutes exist in such data sources and, if so, of determining their frequency.

6.1 Periodicity in RIPE Atlas Traceroute Paths

We found that 36.02% of probe-anchor pairs have at least one periodicity in the time interval. Also, we found a total of 186,403 periodicities. Fig. 8 gives more details on the results of the experiment. Fig. 8.a shows that about 122,000 periodicities have a pattern that is composed of exactly two paths. Looking at such paths, about 60,000 of them show an alternation between a path where all IPv4 addresses are present and a path where one of such addresses is substituted by an asterisk. Also, 180 periodicities have a pattern of 16 paths. Fig. 8.b shows that about 148,000 periodicities have a number of periods repeated in the periodic interval that is less than or equal to 10. Fig. 8.c shows that most of the periodicities (about 165,000) have a periodic pattern with at most 10 paths. Fig. 8.d shows that the relationship between number of periodicities and duration is quite scattered. Most of them (about 131,000) last less than two hours.

Figure 6: Distribution of the false negatives with respect to the number of periods.

Figure 7: Performance of the algorithm in presence of noise. Percentage of false negatives, of false positives, and of correctly characterized periodicities. Recall that the total number of periodicities is 19,678.

6.2 Paris Traceroute and Plain Traceroute

Up to now we did not consider the causes of the periodicities, since they are out of the scope of this paper. However, it is important to observe that traceroutes can be performed with different methods. Namely, the probes may use either Paris traceroute or traditional traceroute [5]. In Paris traceroute, packets are forged using different Paris-ids, whose value rotates in a range from one to a maximum value that can be set by the user. The data set of Section 3 was computed using Paris traceroutes with a maximum paris-id of 16, which is the default for all RIPE Atlas anchoring measurements.

The impact of using Paris traceroute on periodicity is made clear by the example of Fig. 9. Observe that exactly the same periodic pattern is found independently of the frequency of traceroutes. Most probably, packets forged by probes traverse different paths according to a hash function applied to the packet header [5]. Also, since the periodic pattern of Fig. 9 contains 16 paths, we may suppose the presence of a load-balancing component with at least 16 forwarding options.
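The hypothesis above can be illustrated with a toy simulation. The sketch below assumes a hypothetical per-flow load balancer that picks one of 16 forwarding options by hashing a flow identifier derived from the Paris-id; the hash (SHA-256) and the function names are illustrative choices, not a description of any real router or of RIPE Atlas internals.

```python
import hashlib

def forwarding_option(paris_id: int, n_options: int = 16) -> int:
    """Hypothetical per-flow load balancer: hash the flow id, pick an option."""
    digest = hashlib.sha256(f"paris-id:{paris_id}".encode()).digest()
    return digest[0] % n_options

def observed_paths(n_traceroutes: int, max_paris_id: int = 16) -> list:
    """Forwarding option seen by consecutive traceroutes with a rotating Paris-id."""
    paths = []
    for k in range(n_traceroutes):
        paris_id = 1 + k % max_paris_id  # rotates over 1..max_paris_id
        paths.append(forwarding_option(paris_id))
    return paths

seq = observed_paths(48)
# The same 16-element pattern repeats, whatever the time between traceroutes.
print(seq[:16] == seq[16:32] == seq[32:48])  # True
```

Because the chosen option depends only on the Paris-id and not on when the packet is sent, the simulated pattern is identical at one traceroute per minute and at one every 15 minutes, which is consistent with what Fig. 9 reports.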

Once a periodicity is found, it is interesting to decide whether Paris traceroute is, at least partially, responsible for it. To do that we can check whether the periodic pattern contains a path that appears in the periodicity every time a specific Paris-id is used for the measure. In this case we argue that we are probably looking at a periodicity triggered by a load balancer operating per-flow on Paris traceroute probes. In our data set 18.4% of the periodicities have this feature. Also, 72.6% of the corresponding patterns have fewer than 6 paths.
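The check described above can be sketched as follows. The input format (a list of `(paris_id, path)` pairs observed inside one periodicity), the function name, and the `min_occurrences` threshold are assumptions introduced for illustration; the text only states the criterion, not how it was implemented.

```python
from collections import defaultdict

def paris_id_driven_paths(measures, min_occurrences=2):
    """Return {paris_id: path} for each Paris-id that always produced the same path.

    `measures` is an iterable of (paris_id, path) pairs taken from a single
    periodicity; each path must be hashable (e.g. a tuple of hop addresses).
    """
    paths_by_id = defaultdict(list)
    for paris_id, path in measures:
        paths_by_id[paris_id].append(path)
    return {
        pid: paths[0]
        for pid, paths in paths_by_id.items()
        # Require repeated use of the Paris-id (assumed threshold) and a
        # single distinct path across all of its uses.
        if len(paths) >= min_occurrences and len(set(paths)) == 1
    }

measures = [
    (1, ("10.0.0.1", "10.0.1.1")),
    (2, ("10.0.0.1", "10.0.2.1")),
    (1, ("10.0.0.1", "10.0.1.1")),
    (2, ("10.0.0.1", "10.0.3.1")),
]
print(paris_id_driven_paths(measures))
```

In this made-up example only Paris-id 1 always yields the same path, so a periodicity containing these measures would be counted among those possibly triggered by per-flow load balancing.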

The above discussion naturally raises the question of whether periodicities can be found even if Paris traceroute is not used. Since measures without Paris traceroute are not the default, we had to set up a special-purpose set of measures on the probe-anchor pairs of our data set. In doing so we faced two problems: (i) performing ad-hoc traceroutes for all the pairs would be infeasible and (ii) for the results to be comparable we had to perform the ad-hoc measures in the same week as the data set. We solved the problem as follows. We performed a preliminary experiment in the week from April 24th to April 30th. We then spotted the periodic pairs of that week and randomly selected 60 of them. Finally, in our reference week we performed the ad-hoc measures on those pairs, with the advantage of having two comparable data sets in the same week.

Figure 8: Periodicities found in the data set. (a) Distribution of the number of distinct paths contained in periodicities. (b) Distribution of the number of periods contained in the periodic intervals. (c) Distribution of the lengths of the patterns. (d) Distribution of the durations of the periodicities.

6.3 Changing the Frequency

Another important issue is whether periodicities can still be found when the frequency of traceroutes changes. Hence, for the 60 pairs described above we performed measures every 60 seconds, which is the maximum frequency that can be set on RIPE Atlas probes.

We found that 36 of the 60 probe-anchor pairs with traceroutes performed every 60 seconds have at least one periodicity in the time interval. Also, we have found a total of 1,194 periodicities. Fig. 10 illustrates the details of the experiment.

Fig. 10.a shows that 671 periodicities have a pattern that is composed of exactly two paths. Looking at such paths, 315 of them alternate between a path where all IP addresses are present and a path where one of those addresses is replaced by an asterisk. Also, 101 periodicities have a pattern of 5 paths. Fig. 10.b shows that 1,158 periodicities have a number of periods repeated in the periodic interval that is less than or equal to 10. Fig. 10.c shows that most of the periodicities (1,021) have a periodic pattern with at most 10 paths. Even in this case, Fig. 10.d shows that the relationship between number of periodicities and duration is quite scattered. Roughly half of them (about 534) last less than two hours.

7. PERIODICITY OF BGP UPDATES

We apply our algorithm to another important set of topology measures. Namely, we consider the BGP updates collected by the RIPE Routing Information Service (RIS) [3]. Since 2001, the RIPE RIS collects and stores BGP updates from several locations around the world. The routers where such updates are gathered are called collector peers. We consider 151 of those collector peers.

While traceroute measures are active measures, since they inject packets into the network waiting for a reply, the BGP updates gathered by collector peers are passive measures, since they just collect information from the network without performing any action.

7.1 State of the Internet

Given an IPv4 prefix π, we are interested in spotting periodicities related to π and visible in the entire Internet. To do that we define the state of the Internet with respect to π as follows. We subdivide a time interval of interest into atomic instants each lasting one second (the finest time granularity available for BGP updates). The state of a collector peer c at time t with respect to π is the AS-path used by c at time t for reaching π. The state of the Internet at time t with respect to π is the set of the states of all the collector peers. Roughly speaking, the state at time t is how the (known part of the) Internet reaches π at time t. In this case the value of function f at time t is the state of the Internet at time t.
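The definition above can be sketched in code by replaying the updates for one prefix and taking a snapshot every second. The input format, the use of `None` to mark a withdrawal, and the function name are assumptions made for this illustration; the representation of each state as a mapping from collector peer to AS-path is one possible reading of "the set of the states of all the collector peers".

```python
def internet_states(updates, start, end):
    """Return one state per second in [start, end) for a single prefix.

    `updates` is a list of (timestamp, collector_peer, as_path) tuples sorted
    by timestamp; as_path is a tuple of AS numbers, or None for a withdrawal
    (assumed encoding). Each state maps collector peer -> AS-path in use.
    """
    current = {}
    states = []
    i = 0
    for t in range(start, end):
        # Apply every update received up to and including second t.
        while i < len(updates) and updates[i][0] <= t:
            _, peer, as_path = updates[i]
            if as_path is None:
                current.pop(peer, None)
            else:
                current[peer] = as_path
            i += 1
        states.append(dict(current))  # snapshot, so later updates do not alter it
    return states

updates = [
    (0, "c1", (1, 2, 3)),
    (0, "c2", (4, 3)),
    (2, "c1", (1, 5, 3)),
    (3, "c2", None),
]
states = internet_states(updates, 0, 5)
print(states[2])
```

Storing a full snapshot per second is wasteful for long intervals, but it keeps the correspondence with the definition explicit: `states[t]` is the value of f at the t-th instant.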

We applied our algorithm to detect whether the evolution of the state of the Internet has some periodicity. However, before doing that we needed further evidence of the effectiveness of the algorithm. In fact, the experiments performed in Section 5 all check the validity of the algorithm against time-series of paths obtained from traceroutes, and hence it is not completely clear that those results also apply when BGP updates are considered.

Figure 9: (a) A periodicity found with one traceroute every 15 minutes. (b) A periodicity found with one traceroute every minute. The two measures are done in the same interval of time and on the same probe-anchor pair.

Prefix | Origin AS | Period (s) | Start of Observation | End of Observation
110.170.17.0/24 | 134438 | 480 | 2017-05-14 15:15:33 | 2017-05-15 01:15:33
93.181.192.0/19 | 13118 | 1800 | 2017-05-14 02:00:00 | 2017-05-14 12:00:00
193.0.132.0/22 | 3203 | 710 | 2017-05-16 11:00:00 | 2017-05-16 21:00:00
185.123.238.0/24 | 8296 | 1100 | 2017-05-15 04:00:00 | 2017-05-15 14:00:00
196.250.233.0/24 | 37662 | 1200 | 2017-05-13 15:00:00 | 2017-05-14 01:00:00
192.129.3.0/24 | 2614 | 450 | 2017-05-13 10:00:00 | 2017-05-13 20:00:00
13.15.32.0/20 | 22390 | 820 | 2017-05-13 00:00:00 | 2017-05-13 10:00:00
91.193.202.0/24 | 25211 | 420 | 2017-05-12 06:00:00 | 2017-05-12 16:00:00
133.69.128.0/19 | 2523 | 450 | 2017-05-17 02:00:00 | 2017-05-17 12:00:00
133.69.128.0/20 | 2523 | 350 | 2017-05-17 02:00:00 | 2017-05-17 12:00:00
154.66.175.0/24 | 25543 | 620 | 2017-05-16 14:15:00 | 2017-05-17 00:15:00

Table 1: Prefixes with a periodicity. The Period column shows the duration of the spotted period in seconds. The two rightmost columns specify the observed interval of time (roughly ten hours).

Figure 10: Periodicities found in the data set built not using Paris traceroute and with measures every 60 seconds. (a) Distribution of the number of distinct paths contained in periodicities. Observe that the logarithmic scale is not used. (b) Distribution of the number of periods contained in the periodic intervals. (c) Distribution of the lengths of the patterns. (d) Distribution of the durations of the periodicities.

7.2 Beacons

In the Internet there are prefixes, called beacons [2], that are announced and withdrawn on a regular basis.

They are used for networking experiments and the sequence of announcements and withdrawals involving them is as follows. First, an announcement is issued, then after two hours it is withdrawn, then after two hours it is announced again. The origin router keeps on doing this “forever” with an overall period of four hours.

Hence, we tested the algorithm against all IPv4 beacons (14) listed in [2] for 24 hours, and for each of them we were able to spot just one periodicity, whose period is four hours. For the case of BGP updates we slightly changed the algorithm. Namely, when the autocorrelation is performed the dot operator is redefined such that it outputs one if the states of the Internet at time t and at time t + l coincide for all but at most 5% of the collector peers. This is done to tolerate small temporal anomalies.
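The redefined dot operator can be sketched as a comparison between two states that tolerates a small share of disagreeing collector peers. The function name, the dictionary representation of a state (collector peer mapped to AS-path), and the treatment of a peer with no route as a state of its own are assumptions of this illustration.

```python
def tolerant_dot(state_a, state_b, peers, max_mismatch_percent=5):
    """Return 1 if the two states coincide but for at most 5% of the peers, else 0.

    `state_a` and `state_b` map collector peer -> AS-path; a peer missing
    from a state is treated as having no route (assumption).
    """
    mismatches = sum(1 for p in peers if state_a.get(p) != state_b.get(p))
    # Integer comparison, equivalent to mismatches / len(peers) <= 5%.
    return 1 if mismatches * 100 <= max_mismatch_percent * len(peers) else 0

peers = [f"c{i}" for i in range(20)]
base = {p: (64500, 64501) for p in peers}
one_off = dict(base, c0=(64500, 64502))
two_off = dict(one_off, c1=(64500, 64502))
print(tolerant_dot(base, one_off, peers), tolerant_dot(base, two_off, peers))  # 1 0
```

With the 151 collector peers considered in this section, this threshold would accept up to 7 disagreeing peers.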

7.3 Experiments with the Most Active Prefixes

At this point we performed our experiments focusing on the 50 most active prefixes of the Potaroo Web site [1] in the week from May 7th to May 14th 2017. Such prefixes are involved in that week in more than 400,000 updates, ranging from about 20,000 for the most active to about 4,000 for the quietest.

For each of the prefixes we extracted the corresponding updates from the RIPE RIS and considered only the ten hours with the highest number of updates. We obtained the results illustrated in Table 1. The table shows the 11 of the 50 prefixes where we have found a periodicity.
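One way to select "the ten hours with the highest number of updates" is a sliding window over the update timestamps, sketched below. It assumes that the ten hours form a single contiguous interval, as the intervals of Table 1 suggest, and that the window may start at any update timestamp; neither detail is stated in the text, and the function name is hypothetical.

```python
from bisect import bisect_left

def busiest_window(timestamps, window=10 * 3600):
    """Return (start, count) of the `window`-second interval with most updates.

    `timestamps` are update times in seconds. Candidate windows start at an
    update timestamp and cover [start, start + window).
    """
    ts = sorted(timestamps)
    best_start, best_count = None, 0
    for i, start in enumerate(ts):
        # Index of the first update that falls outside the window.
        j = bisect_left(ts, start + window, i)
        if j - i > best_count:
            best_start, best_count = start, j - i
    return best_start, best_count

print(busiest_window([0, 100, 50000, 50010, 50020, 90000]))  # (50000, 3)
```

Restricting candidate starts to update timestamps loses nothing: any window can be shifted forward to its first update without dropping an update from it.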

7.4 A Special Case

We applied our algorithm to analyze the evolution of the state of the Internet also before the week from May 7th to May 14th.

A very clear example of periodicity that we found is depicted in Fig. 13 using the popular BGPlay visualization system [11]. With respect to prefix 45.42.41.0/24, the state shows a period of 580 seconds. Observe that the frames on the left show almost exactly the same configuration.

Looking at the details of the detected periodicities we observed that, between 02:00 AM and 11:00 AM of Apr 18 2017, with respect to prefix 110.170.10.0/24, the collector peer identified with 01-195.66.226.20 shows a period of 450 seconds. During the periodicity it alternates between AS-paths 56730-51945-2914-1299-7029-6316 and 56730-51945-1299-2914-23352-6316. Observe that the pair of adjacent ASes 1299 and 2914 appears alternately in this order and in the opposite order. According to [10] this may correspond to the presence of a dispute reel in the control plane of the routers traversed by the updates. As far as we know this is the first time that this configuration has been observed to oscillate in the wild. Of course this may even correspond to a configuration that has been set on purpose. Fig. 11 and Fig. 12 show the sequence of AS paths and the output of the autocorrelation applied to the sequence, respectively.

Figure 11: The time series of the states of collector peer 01-195.66.226.20 with respect to prefix 66.19.194.0/24. Each value of the y-axis corresponds to a distinct AS-path.

Figure 12: Output of the autocorrelation applied to the states sequence of collector peer 01-195.66.226.20.
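To give an idea of how an autocorrelation can expose a period such as the 450 seconds reported for this collector peer, the sketch below applies a plain autocorrelation, with equality as the dot operator, to a synthetic sequence of states. It is a simplified stand-in and not the algorithm of Section 4, which also relies on string matching and on the tolerance parameter. The two AS-paths are those quoted in the text; the 200/250-second split within each period and the function names are made up for the example.

```python
def autocorrelation(states, lag):
    """Fraction of positions i for which states[i] == states[i + lag]."""
    pairs = len(states) - lag
    matches = sum(1 for a, b in zip(states, states[lag:]) if a == b)
    return matches / pairs

def candidate_period(states, max_lag):
    """Return (lag, score) for the smallest lag with the highest autocorrelation."""
    best_lag, best_score = None, -1.0
    for lag in range(1, max_lag + 1):
        score = autocorrelation(states, lag)
        if score > best_score:  # strict: ties keep the smaller lag
            best_lag, best_score = lag, score
    return best_lag, best_score

PATH_A = (56730, 51945, 2914, 1299, 7029, 6316)
PATH_B = (56730, 51945, 1299, 2914, 23352, 6316)
# One state per second; the split between the two paths is hypothetical.
one_period = [PATH_A] * 200 + [PATH_B] * 250
states = one_period * 4

print(candidate_period(states, max_lag=900))  # (450, 1.0)
```

On real data the match is rarely perfect, which is why a tolerant comparison and a validation step on the candidate period are needed in practice.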

8. CONCLUSIONS

We presented a methodology for inferring periodic behavior in Internet topology measures. We believe that this can be a useful piece of the puzzle of methods and tools that is needed for fully understanding the complex dynamics of such measures.

We have shown that Internet topology measures that exhibit a periodic behavior are frequent both for traceroutes and for BGP update collections.

The RIPE NCC is considering the possibility of offering all network operators a service, called Periodicity-as-a-Service, for detecting the periodicities of RIPE Atlas traceroute measurements. The service has been announced and discussed with operators at RIPE-74. A prototype implementation that exploits the algorithm presented in this paper is available at http://atlas.ripe.net/periodicity.


Figure 13: Evolution of the state of the Internet with a period of 580 seconds, shown with the BGPlay visualization system. The states to the left are identical.


9. REFERENCES

[1] Potaroo site. http://www.potaroo.net/. [Online; accessed 18-May-2017].

[2] RIPE beacons description. https://www.ripe.net/analyse/internet-measurements/routing-information-service-ris/current-ris-routing-beacons. [Online; accessed 17-May-2017].

[3] RIPE RIS. https://www.ripe.net/analyse/internet-measurements/routing-information-service-ris. [Online; accessed 18-May-2017].

[4] Caida Ark Description. http://www.caida.org/projects/ark/, 2017. [Online; accessed 17-May-2017].

[5] Paris Traceroute description. https://paris-traceroute.net/, 2017. [Online; accessed 17-May-2017].

[6] RIPE ATLAS description. https://atlas.ripe.net/, 2017. [Online; accessed 17-May-2017].

[7] Argon, O., Shavitt, Y., and Weinsberg, U. Inferring the periodicity in large-scale internet measurements. In INFOCOM, 2013 Proceedings IEEE (2013), IEEE, pp. 1672–1680.

[8] Augustin, B., Cuvellier, X., Orgogozo, B., Viger, F., Friedman, T., Latapy, M., Magnien, C., and Teixeira, R. Avoiding traceroute anomalies with paris traceroute. In Proceedings of the 6th ACM SIGCOMM conference on Internet measurement (2006), ACM, pp. 153–158.

[9] Candela, M., Di Bartolomeo, M., Di Battista, G., and Squarcella, C. Dynamic traceroute visualization at multiple abstraction levels. In Proc. 21st International Symposium on Graph Drawing (GD ’13) (2013), S. Wismath and A. Wolff, Eds., vol. 8242 of Lecture Notes in Computer Science, pp. 500–511.

[10] Cittadini, L., Di Battista, G., Rimondini, M., and Vissicchio, S. Wheel + ring = reel: the impact of route filtering on the stability of policy routing. IEEE/ACM Transactions on Networking 19, 4 (Aug 2011), 1085–1096.

[11] Colitti, L., Di Battista, G., Mariani, F., Patrignani, M., and Pizzonia, M. Visualizing interdomain routing with bgplay. J. Graph Algorithms Appl. 9, 1 (2005), 117–148.

[12] Csabai, I. 1/f noise in computer network traffic. Journal of Physics A: Mathematical and General 27, 12 (1994), L417.

[13] Dixon, S., Pampalk, E., and Widmer, G. Classification of dance music by periodicity patterns.

[14] Elfeky, M. G., Aref, W. G., and Elmagarmid, A. K. Periodicity detection in time series databases. IEEE Transactions on Knowledge and Data Engineering 17, 7 (2005), 875–887.

[15] Eslahi, M., Rohmad, M., Nilsaz, H., Naseri, M. V., Tahir, N., and Hashim, H. Periodicity classification of http traffic to detect http botnets. In Computer Applications & Industrial Electronics (ISCAIE), 2015 IEEE Symposium on (2015), IEEE, pp. 119–123.

[16] Gunes, M. H., and Sarac, K. Resolving ip aliases in building traceroute-based internet maps. IEEE/ACM Trans. Netw. 17, 6 (Dec. 2009), 1738–1751.

[17] Labovitz, C., Malan, G. R., and Jahanian, F. Internet routing instability, vol. 27. ACM, 1997.

[18] Parthasarathy, S., Mehta, S., and Srinivasan, S. Robust periodicity detection algorithms. In Proceedings of the 15th ACM international conference on Information and knowledge management (2006), ACM, pp. 874–875.

[19] Silverman, B., and Linsker, R. A measure of dna periodicity. Journal of theoretical biology 118, 3 (1986), 295–300.

[20] Squillante, M. S., Yao, D. D., and Zhang, L. Internet traffic: periodicity, tail behavior, and performance implications. CRC Press, Inc., 2000.

[21] Tong, X., Duan, L., Xu, C., Tian, Q., Lu, H., Wang, J., and Jin, J. S. Periodicity detection of local motion. In Multimedia and Expo, 2005. ICME 2005. IEEE International Conference on (2005), IEEE, pp. 650–653.

[22] Vlachos, M., Yu, P., and Castelli, V. On periodicity detection and structural periodic similarity. In Proceedings of the 2005 SIAM International Conference on Data Mining (2005), SIAM, pp. 449–460.

[23] Wikipedia. Periodic function — Wikipedia, the free encyclopedia. http://en.wikipedia.org/w/index.php?title=Periodic%20function&oldid=763638017, 2017. [Online; accessed 17-May-2017].

[24] Yule, G. U. On a method of investigating periodicities in disturbed series, with special reference to wolfer’s sunspot numbers. Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character 226 (1927), 267–298.
