Some Scaling Properties of Traffic
in Communication Networks
Paulo Gonçalves
DANTE - INRIA - ENS Lyon
Patrick Loiseau (PhD, 2006-2009), Shubhabrata Roy (PhD, 2010-2013)
M. Sokol (PhD, 2010- ), B. Girault (PhD, 2012-2015)
Seminars Complex Networks, LIP6, UPMC – March 7, 2013
P. Gonçalves (Inria) Scaling properties of traffic Complex Networks (Lip6) 1 / 25
Scaling Properties of Traffic
Historical perspective
[Timeline figure, with axis labels: Markov; Mandelbrot: LRD (fBm), heavy tails; 1968-69: exponential, Poisson, Markov]
1917 Erlang: circuit switching networks
1969 Kleinrock: packet switching networks
1988 Van Jacobson: AIMD - TCP
1992 Tim Berners Lee: Web
1993 Leland, Willinger, Taqqu: LRD LAN
1994 Paxson, Floyd: LRD WAN
1994 Norros: queues and LRD
1997 Taqqu: ON/OFF model
1997 Crovella: heavy tails
1997 Park, Crovella: QoS degradation
1998 Padhye: Markov - 1 TCP source
2004 Mandjes: QoS ON/OFF (theo.)
2004 Roberts: QoS insensitivity
Some open questions:
– Long Range Dependence / Heavy Tailed distributions: impact on QoS?
– Existing models (e.g. Padhye) only predict mean metrics (e.g. throughput): what about variability?
Our approach
To combine theoretical models with controlled experiments in realistic environments and real-world traffic traces
Simplified System
[Figure: access point (congestion) and core network (overdimensioned)]
– Congestion essentially arises at the access points → Simplified System: single bottleneck
– Users’ behavior: ON/OFF source model
– MetroFlux: a probe for traffic capture at packet level (O. Goga, ...)
Scaling Properties of Traffic Heavy tailed distributions and long range dependence
Long memory in aggregated traffic: the Taqqu model
Heavy-tailed distributed ON periods: heavy tail index $\alpha_{ON} > 1$
Theorem (Taqqu, Willinger, Sherman, 1997)
In the limit of a large number of sources $N_{src}$, if:
– flow throughput is constant,
– same throughput for all flows;
then the aggregated bandwidth $B^{(\Delta)}(t)$ is long range dependent, with parameter:
$H = \max\left(\frac{3 - \alpha_{ON}}{2}, \frac{1}{2}\right)$
Long memory: long range correlation ($H > 1/2$)
$\mathrm{Cov}_{B^{(\Delta)}}(\tau) = E\{B^{(\Delta)}(t)\, B^{(\Delta)}(t + \tau)\} \underset{\tau \to \infty}{\sim} \tau^{2H-2}$
Variance grows faster than $\Delta$: $\mathrm{Var}\{B^{(\Delta)}(t)\} \sim \Delta^{2H}$
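As a small worked illustration of the theorem's formula, the sketch below maps a heavy tail index to the predicted Hurst parameter and to the two scaling exponents quoted above. The function names are hypothetical and chosen only for this example.

```python
def taqqu_hurst(alpha_on: float) -> float:
    """Predicted Hurst parameter H = max((3 - alpha_ON) / 2, 1/2)."""
    if alpha_on <= 1:
        raise ValueError("the theorem assumes alpha_ON > 1")
    return max((3.0 - alpha_on) / 2.0, 0.5)


def covariance_exponent(h: float) -> float:
    """Exponent of tau in Cov(tau) ~ tau^(2H - 2)."""
    return 2.0 * h - 2.0


def variance_exponent(h: float) -> float:
    """Exponent of Delta in Var{B(Delta)} ~ Delta^(2H)."""
    return 2.0 * h


# alpha_ON = 1.5 gives long memory; alpha_ON = 2.5 falls back to H = 1/2.
h_heavy = taqqu_hurst(1.5)
h_light = taqqu_hurst(2.5)
```

With $\alpha_{ON} = 1.5$ the prediction is $H = 0.75$, so the covariance decays like $\tau^{-0.5}$ and the variance grows like $\Delta^{1.5}$.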
Theorem validation on a realistic environment
Controlled experiment: MetroFlux 1 Gbps, 100 sources, 8 hours traffic
UDP/TCP: throughput limited to 5 Mbps (no congestion)
[Figure, three panels. ON distribution (source): distribution of ON durations on log-log axes, with µON marked, αON = 1.5. Log-diagram (aggregated traffic): log Var{B^(∆)} versus scale ∆ from 0.1 ms to 100 s, with RTT and µON marked; + : TCP, o : UDP. Taqqu prediction: H (0.4 to 1) versus αON (1 to 4); + : TCP, o : UDP]
⇒ Protocol has no influence at large scales
⇒ Long memory shows up beyond scale ∆ = µON (mean flow duration)
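The log-diagram plots log Var{B^(∆)} against the scale ∆, and since the variance scales as ∆^(2H), half the slope estimates H. The following is a minimal sketch of that idea using the aggregated-variance method on block sums; it is an assumption made for illustration and not necessarily the estimator used to produce the slides. The function names and the white-noise test signal are hypothetical.

```python
import numpy as np


def log_diagram(x, scales):
    """Return log2(scale) and log2 of the variance of block sums at each scale."""
    x = np.asarray(x, dtype=float)
    log_scales, log_vars = [], []
    for m in scales:
        k = len(x) // m
        if k < 2:
            continue  # not enough blocks to estimate a variance
        sums = x[: k * m].reshape(k, m).sum(axis=1)
        log_scales.append(np.log2(m))
        log_vars.append(np.log2(sums.var()))
    return np.array(log_scales), np.array(log_vars)


def estimate_hurst(x, scales):
    """Fit log Var ~ 2H log(scale) and return H (half the fitted slope)."""
    log_scales, log_vars = log_diagram(x, scales)
    slope, _intercept = np.polyfit(log_scales, log_vars, 1)
    return slope / 2.0


# Sanity check on memoryless data: the slope should be close to 1 (H = 1/2).
rng = np.random.default_rng(0)
white = rng.normal(size=2**16)
h_white = estimate_hurst(white, [2**j for j in range(1, 9)])
```

On a long-memory trace the same fit, restricted to scales beyond µON, would be expected to return a value above 1/2.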
Influence of flow mean throughput / duration correlation
Web traffic acquired at in2p3 (Lyon) with MetroFlux 10 Gbps
[Figure, three panels. ON distribution: distribution of ON durations (0.01 ms to 1000 s), αON = 1.2. Size distribution: distribution of flow sizes, αSI = 0.85. Mean throughput: E{thr.|dur.} versus duration (0.1 s to 1000 s), slope β − 1 = 0.4]
Heavy-tailed ON periods, αON = 1.2
Heavy tailed flow sizes, αSI = 0.85
Flow throughput and duration are correlated:
$E\{\text{thr.}|\text{dur.}\} \propto (\text{dur.})^{\beta-1}$, $\beta = \alpha_{ON}/\alpha_{SI}$ ($\approx 1.4$)
⇒ Which heavy tail index controls LRD: $\alpha_{ON}$ or $\alpha_{SI}$?
Taqqu model extension
Planar Poisson process to describe arrival instant vs duration
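Proposition (LGVBP, 2009)
Model: $E\{\text{through.}|\text{dur.}\} = M \cdot (\text{dur.})^{\beta-1}$; $\mathrm{Var}\{\text{through.}|\text{dur.}\} = V$
$\mathrm{Cov}_{B^{(\Delta)}}(\tau) = C M^2 \tau^{-(\alpha_{ON} - 2(\beta-1)) + 1} + C' V \tau^{-\alpha_{ON}+1}$
[Figure: log-diagram for β > 1, log Var{B^(∆)} versus scale ∆, with the threshold τ* separating the regimes H = HTaqqu and H = HTaqqu + (β − 1)]
threshold $\tau^* = \left(\frac{C' V}{C M^2}\right)^{1/(2(\beta-1))}$
→ if $\Delta \gg \tau^*$: $H = H_{Taqqu} + (\beta - 1)$
→ if $\Delta \ll \tau^*$: $H = H_{Taqqu}$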
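The threshold is the lag at which the two covariance terms of the proposition are equal. The sketch below evaluates both terms and the crossover scale; the numerical values of C, M, C', V, αON and β are hypothetical and chosen only so that the result is easy to check by hand.

```python
def covariance_terms(tau, alpha_on, beta, c, m, c_prime, v):
    """Return the (correlated, uncorrelated) terms of the covariance at lag tau."""
    correlated = c * m**2 * tau ** (-(alpha_on - 2.0 * (beta - 1.0)) + 1.0)
    uncorrelated = c_prime * v * tau ** (-alpha_on + 1.0)
    return correlated, uncorrelated


def crossover_scale(c, m, c_prime, v, beta):
    """tau* = (C' V / (C M^2))^(1 / (2 (beta - 1))), defined for beta > 1."""
    if beta <= 1:
        raise ValueError("the crossover is defined for beta > 1")
    return (c_prime * v / (c * m**2)) ** (1.0 / (2.0 * (beta - 1.0)))


# Hypothetical constants: C' V / (C M^2) = 4 and beta = 1.5 give tau* = 4.
params = dict(c=1.0, m=2.0, c_prime=1.0, v=16.0, beta=1.5)
tau_star = crossover_scale(**params)
at_threshold = covariance_terms(tau_star, alpha_on=1.5, **params)
far_above = covariance_terms(100.0, alpha_on=1.5, **params)
far_below = covariance_terms(0.1, alpha_on=1.5, **params)
```

Above the threshold the correlated term dominates, which is the regime where the Hurst parameter is increased by β − 1; below it the uncorrelated term dominates and the Taqqu value applies.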
Correlations intensify LRD (β > 1)
Traffic evolution, future Internet: “flow-aware” control mechanisms, FTTH
LRD impact on QoS: a brief (experimental) outlook
The situation is complex...
Negative on finite queues with UDP flows [cf. Mandjes, 2004 (infinite queues)]
– LRD degrades QoS for large queue sizes (beyond some threshold)
– but the threshold depends on the considered QoS metric (loss rate vs mean load)
Questionable with TCP flows: [Park, 1997] against [Ben Fredj, 2001]
– LRD has contradictory effects on QoS metrics depending on:
                     with slow start    without slow start
  Delay                    ↘                   ↗
  Loss rate                ↘                   →
  Mean throughput          →                   ↗
– Heavy tailed distributions (i.e. LRD) can favour QoS for large flows
But in general, QoS is a complex function of multiple variables
Scaling Properties of Traffic TCP and large deviations principle
Second level of description : single TCP source traffic
[Figure: Nsrc ON/OFF sources (periods τON, τOFF) and their aggregate; detail of a single TCP source's traffic: congestion window Wi versus i (RTT)]
Long-lived flow → stationary regime
⇒ How to characterize the congestion window evolution?
Markov model
[Figure: congestion window Wi (packets) versus i (RTT), over n RTT]
long-lived flow, stationary regime: AIMD
model: $(W_i)_{i \ge 1}$ finite Markov chain (irreducible, aperiodic), transition matrix $Q$:
$Q_{w,\,\min(w+1,\,w_{max})} = 1 - p(w), \quad Q_{w,\,\max(\lfloor w/2 \rfloor,\,1)} = p(w).$
$p(\cdot)$: loss probability of at least one packet, only depends on the current congestion window (hyp.)
Example: [Padhye, 1998] Bernoulli loss: $p(w) = 1 - (1 - p_{pkt})^w$
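The transition matrix above can be written down directly. The sketch below builds Q for the Bernoulli loss example and approximates its invariant distribution by power iteration; the cap w_max = 100, the iteration count and the function names are assumptions made for this illustration only.

```python
import numpy as np


def bernoulli_loss(w, p_pkt):
    """Probability that at least one of w packets is lost: 1 - (1 - p_pkt)^w."""
    return 1.0 - (1.0 - p_pkt) ** w


def aimd_transition_matrix(w_max, p_pkt):
    """Row w-1 holds the transition probabilities out of window size w."""
    q = np.zeros((w_max, w_max))
    for w in range(1, w_max + 1):
        p = bernoulli_loss(w, p_pkt)
        up = min(w + 1, w_max)      # additive increase, capped at w_max
        down = max(w // 2, 1)       # multiplicative decrease, at least 1
        q[w - 1, up - 1] += 1.0 - p
        q[w - 1, down - 1] += p
    return q


def stationary_distribution(q, n_iter=20000):
    """Approximate the invariant distribution by repeated multiplication."""
    pi = np.full(q.shape[0], 1.0 / q.shape[0])
    for _ in range(n_iter):
        pi = pi @ q
    return pi / pi.sum()


W_MAX = 100  # hypothetical cap on the congestion window
q = aimd_transition_matrix(W_MAX, p_pkt=0.02)
pi = stationary_distribution(q)
mean_window = float(pi @ np.arange(1, W_MAX + 1))
```

The value `mean_window` is the expectation of the window under the invariant distribution, that is, the quantity the ergodic theorem identifies with the long-run mean throughput.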
Almost sure mean throughput
[Figure: congestion window Wi (packets) versus i (RTT), with its mean W^(n) over n RTT]
mean throughput at scale n (RTT): $W^{(n)} = \frac{1}{n} \sum_{i=1}^{n} W_i$
Ergodic Birkhoff theorem (1931): almost sure mean
For almost all realisations, the mean throughput at scale n converges towards a value corresponding to the expectation of the invariant distribution:
$W^{(n)} \xrightarrow[n \to \infty]{a.s.} W^{(\infty)} = E\{W_i\}$
Example: [Padhye, 1998], $W^{(\infty)} \underset{p_{pkt} \to 0}{\sim} \sqrt{\frac{3}{2 p_{pkt}}}$ (RTT=1, MSS=1)
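The convergence of the running mean can be observed on a simulated trajectory. The sketch below simulates the AIMD chain with Bernoulli losses and compares the running mean with the Padhye approximation; the seed, the trajectory length, the window cap and the function name are assumptions made for illustration.

```python
import math
import random


def simulate_aimd(n_rtt, p_pkt, w_max=1000, seed=0):
    """Simulate the window chain and return the running mean W^(n) for n = 1..n_rtt."""
    rng = random.Random(seed)
    w, total, means = 1, 0, []
    for i in range(1, n_rtt + 1):
        total += w
        means.append(total / i)
        # A loss of at least one packet halves the window, otherwise it grows by one.
        loss = rng.random() < 1.0 - (1.0 - p_pkt) ** w
        w = max(w // 2, 1) if loss else min(w + 1, w_max)
    return means


running_mean = simulate_aimd(200_000, p_pkt=0.02)
padhye = math.sqrt(3.0 / (2.0 * 0.02))  # square-root formula, about 8.66 packets
```

The running mean settles as n grows, while the square-root formula is only an asymptotic approximation for small loss probabilities, so the two values need not coincide exactly.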
Throughput variability: Large Deviations
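[Figure: two realisations of Wi versus i, each with its mean W^(n) over n RTT]
Rare events: $W^{(n)} \simeq \alpha \neq W^{(\infty)}$
Large Deviations theorem (Ellis, 84)
$P(W^{(n)} \simeq \alpha) \underset{n \to \infty}{\sim} \exp(n \cdot f(\alpha))$
$f(\alpha)$: Large Deviation spectrum
→ Scale invariant quantity
[Figure: f(α) versus α, with W^(∞) and 0 marked on the axes]
⇒ Does a similar theorem exist for a single realization?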
Large Deviation on almost all realizations
[Figure: a realisation Wi versus i, split into kn intervals of length n (interval 1 to interval kn), with means W_1^(n), ..., W_kn^(n)]
Large Deviation theorem on almost all realisations (Loiseau et al., 2010)
For a given $\alpha$, if $k_n \ge e^{nR(\alpha)}$, then a.s.
$\frac{\#\{j \in \{1, \dots, k_n\} : W_j^{(n)} \simeq \alpha\}}{k_n} \underset{n \to \infty}{\sim} \exp(n \cdot f(\alpha))$
“Price to pay”: exponential increase of the number of intervals
Finite realization (of size N): $n k_n = N$
⇒ $[\alpha_{min}(n), \alpha_{max}(n)]$ support of observable spectrum at scale n
Theory: $p(\cdot) \to Q \to f(\alpha), R(\alpha), \alpha_{min}, \alpha_{max}$
Practice: $(W_i)_{i \le N} \to$ observed distribution
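The "practice" route can be sketched as a histogram of block means: split the trace into k_n intervals of length n, count the fraction whose mean falls near α, and take (1/n) times the logarithm of that fraction. The binning rule, the function name and the coin-flip test trace below are assumptions made for illustration; this is a naive estimator, not the consistent estimator referred to later in the talk.

```python
import math
import random


def empirical_spectrum(series, n, bin_width):
    """Map each bin centre alpha to (1/n) * log(fraction of blocks with mean in the bin)."""
    k_n = len(series) // n
    counts = {}
    for j in range(k_n):
        block_mean = sum(series[j * n:(j + 1) * n]) / n
        b = math.floor(block_mean / bin_width)
        counts[b] = counts.get(b, 0) + 1
    return {
        (b + 0.5) * bin_width: math.log(c / k_n) / n
        for b, c in sorted(counts.items())
    }


# Hypothetical trace: fair coin flips, whose almost sure mean is 0.5.
rng = random.Random(1)
coin = [rng.randint(0, 1) for _ in range(100_000)]
spectrum = empirical_spectrum(coin, n=50, bin_width=0.1)
alpha_star = max(spectrum, key=spectrum.get)
```

Only bins that contain at least one block appear in the result, which is the finite-size effect behind the observable support [αmin(n), αmax(n)].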
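Results: example of Bernoulli losses ($p_{pkt} = 0.02$)
[Figure: f(α) versus α (packets), α from 4 to 16 and f(α) from −0.1 to 0; theoretical spectrum and empirical spectra at scales n = 100, 200, 500, 1000, each observed on its support [αmin(n), αmax(n)]; W^(∞) marked at the apex; the point α = 11, f(α) = −0.01 is marked]
Apex: almost sure mean: 8.6 packets (Padhye: $\sqrt{\frac{3}{2 p_{pkt}}} = 8.66$)
Superimposition at different scales → scale invariance
Beyond n = 100: variability
– n = 100, portion of intervals with mean ∼ 11: $e^{-100 \times 0.01} = 0.37$
– n = 200, portion of intervals with mean ∼ 11: $e^{-200 \times 0.01} = 0.14$
⇒ More accurate information than the almost sure mean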
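The two portions quoted above follow from exp(n · f(α)) with the spectrum value f(11) = −0.01 read off the slide. A minimal check, with a hypothetical helper name:

```python
import math


def interval_fraction(n, f_alpha):
    """Portion of length-n intervals whose mean is close to alpha: exp(n * f(alpha))."""
    return math.exp(n * f_alpha)


F_AT_11 = -0.01  # spectrum value at alpha = 11 packets, as quoted on the slide
fractions = {n: interval_fraction(n, F_AT_11) for n in (100, 200)}
```

Doubling the scale squares the portion, which is why the spectrum, and not only the almost sure mean, is needed to describe variability at a given time scale.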
Results II: case of a long-lived flow
losses: not Bernoulli
[Figure, three panels. Empirical losses: p(w) versus w (0 to 100), Bernoulli (ppkt = 0.007) against empirical. Spectrum: f(α) versus α (packets), theory with empirical losses, theory with Bernoulli losses, and empirical spectra at n = 100, 200, 500, 1000. Support: αmin, αmax versus n (RTT), theory with empirical losses, theory with Bernoulli losses, and empirical]
Scaling Properties of Traffic Large Deviations applied to dynamic resource management
Two important assets for Large Deviations Utility
General result (“Large deviations for the local fluctuations of random walks”, J. Barral, P. Loiseau, Stochastic Processes and their Applications, 2011)
A wide class of processes (stationary & mixing) verifies an empirical large deviation principle. In particular, this result holds true for any time series that can reliably be modelled by an irreducible, aperiodic Markov process.
Theorem (“On the estimation of the Large Deviations spectrum”, J. Barral, P. G., J. Stat. Phys., 2011)
We derived a consistent estimator of the large deviation spectrum from a finite size time series (observation samples). We proved convergence on mathematical objects with scale invariance properties (multifractal measures and processes).
An epidemic based model for volatile workload
Goal – Dynamic resource allocation yielding a good compromise between capex and opex costs
Approach – Combine the three ingredients:
– A sensible (epidemic) model to catch the burstiness and the dynamics of the workload
– A (Markov) model that verifies a large deviation principle
– A probabilistic management policy based on the large deviation characterisation
[Figure: number of current VoD users (0 to 140) versus time (0 to 200)]
A hidden state Markov process with memory effect
[Diagram: states (i, r), (i+1, r), (i−1, r+1), (i, r−1); rate labels β(i+r)+l, γi, µr; hidden state β = β1 / β = β2 with switching rates a1, a2]
i : current # of viewers / r : current # of infected
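A continuous-time chain of this kind can be simulated with a standard Gillespie loop. The sketch below is one possible reading of the diagram, and the pairing of rates with transitions is an assumption: β(i+r)+l for (i, r) → (i+1, r), γi for (i, r) → (i−1, r+1), µr for (i, r) → (i, r−1), and a1, a2 for the switches between β1 and β2. The parameter values are hypothetical.

```python
import random


def simulate_workload(t_end, beta1, beta2, gamma, mu, l, a1, a2, seed=0):
    """Gillespie simulation; returns a list of (time, i, r) after each event."""
    rng = random.Random(seed)
    t, i, r, hidden = 0.0, 0, 0, 0  # hidden = 0 uses beta1, hidden = 1 uses beta2
    path = [(t, i, r)]
    while True:
        beta = beta1 if hidden == 0 else beta2
        rates = [
            beta * (i + r) + l,          # assumed: (i, r) -> (i+1, r)
            gamma * i,                   # assumed: (i, r) -> (i-1, r+1)
            mu * r,                      # assumed: (i, r) -> (i, r-1)
            a1 if hidden == 0 else a2,   # assumed: switch of the hidden state
        ]
        total = sum(rates)
        t += rng.expovariate(total)      # exponential waiting time to the next event
        if t >= t_end:
            return path
        u = rng.random() * total         # pick an event proportionally to its rate
        if u < rates[0]:
            i += 1
        elif u < rates[0] + rates[1]:
            i, r = i - 1, r + 1
        elif u < rates[0] + rates[1] + rates[2]:
            r -= 1
        else:
            hidden = 1 - hidden
        path.append((t, i, r))


# Hypothetical parameters, chosen so that the population stays bounded on average.
path = simulate_workload(200.0, beta1=0.01, beta2=0.05, gamma=0.2, mu=0.1,
                         l=1.0, a1=0.01, a2=0.1)
```

The larger value β2 plays the role of a "buzz" regime in this reading, producing the bursts that a plain birth-death model would miss.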
An epidemic based model for volatile workload
Calibration and evaluation
[Figure, top row: VoD workload trace; Memory Markov model; Modul. Markov Poisson (number of users versus time, 0 to 200)]
[Figure, bottom row: Steady state distribution and Autocorrelation function (legend: Modelled trace I, Real trace I, ctmc-modelled Trace I, MMPP-modelled trace I); Param. estimation precision for β1, β2, γ, µ, l, a1, a2]
Markov processes
Under mild conditions, a Markov process $I_t$ verifies a large deviation principle:
$P\{\langle I_t \rangle_\tau \approx \alpha\} \sim \exp(\tau \cdot f(\alpha)), \quad \tau \to \infty$
$f(\alpha)$: identifiable theoretically (from the transition matrix) or empirically (from a finite trace)
"Dynamic" implies a time scale: a notion that is explicit in the large deviation principle
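The statement that f(α) can be identified from the transition matrix can be illustrated with the textbook construction for a finite, discrete-time Markov chain: take the log spectral radius Λ(θ) of the matrix tilted by exp(θ · value), then f(α) = min over θ of (Λ(θ) − θα). This is an assumption about how the theoretical route may be carried out, not a description of the procedure used in the talk; the θ grid and the two-state test chain are hypothetical.

```python
import numpy as np


def ld_spectrum(q, values, alphas, thetas):
    """f(alpha) = min_theta (Lambda(theta) - theta * alpha), Lambda = log spectral radius."""
    values = np.asarray(values, dtype=float)
    thetas = np.asarray(thetas, dtype=float)
    log_rho = np.array([
        # Tilted matrix: column j of q is multiplied by exp(theta * values[j]).
        np.log(np.max(np.abs(np.linalg.eigvals(q * np.exp(t * values)[None, :]))))
        for t in thetas
    ])
    return np.array([np.min(log_rho - thetas * a) for a in alphas])


# Hypothetical test chain: independent fair coin flips with values 0 and 1.
q_coin = np.array([[0.5, 0.5], [0.5, 0.5]])
f_coin = ld_spectrum(q_coin, values=[0.0, 1.0], alphas=[0.5, 0.9],
                     thetas=np.linspace(-10.0, 10.0, 2001))
```

The spectrum vanishes at the almost sure mean (α = 0.5 here) and is negative elsewhere, matching the shape of the f(α) curves shown for the TCP example.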
Semi-supervised machine learning
Parametric generalisation of semi-supervised learning
Standard classification
– Training set: $(X^{(t)}, Y^{(t)}) \longrightarrow$ classifier
– Validation set: $X^{(v)} \xrightarrow{\text{classifier}} Y^{(v')}$ : $|Y^{(v)} - Y^{(v')}| \simeq 0$
– Real data: $X \xrightarrow{\text{classifier}}$ Answer
Semi-supervised classification
– Validation set: $(X^{(v)}, Y^{(L)}) \xrightarrow{\text{classifier}} Y^{(v')}$ such that $|Y^{(v)} - Y^{(v')}| \simeq 0$
– Real data: $(X, Y^{(L)}) \xrightarrow{\text{classifier}}$ Answer
– Allows the classifier to be constantly updated to match data evolution
Dataset $X = X_1, X_2, \dots, X_p$ (labeled points), $X_{p+1}, \dots, X_N$
Similarity matrix $W$ and $D$ (resp. $D^*$) the row-sum (resp. column-sum) of $W$
Label matrix $Y = \{Y_{i,k} \in (0, 1)$ for $i = 1, \dots, N$ and $k = 1, \dots, K\}$
Objective (classification) matrix $F_{N \times K}$: element $i$ belongs to class $k^* = \arg\max_k F_{i,k}$
Parametric generalisation of semi-supervised learning
Standard Laplacian solution
$\arg\min_F \left\{ \sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij} \| F_{i.} - F_{j.} \|^2 + \mu \sum_{i=1}^{N} d_i \| F_{i.} - Y_{i.} \|^2 \right\}$
Generalised semi-supervised classification [M. Sokol, 2012]
$\arg\min_F \left\{ \sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij} \| d_i^{\sigma-1} F_{i.} - d_j^{\sigma-1} F_{j.} \|^2 + \mu \sum_{i=1}^{N} d_i^{2\sigma-1} \| F_{i.} - Y_{i.} \|^2 \right\}$
– σ = 1: Standard Laplacian (Random walk from unlabelled to labelled points)
– σ = 1/2: Normalised Laplacian
– σ = 0: PageRank method (Random walk from labelled to unlabelled points)
$F_{.k} = \frac{\mu}{2 + \mu} \left( I - \frac{2}{2 + \mu} D^{-\sigma} W D^{\sigma-1} \right)^{-1} Y_{.k}$, for $k = 1, \dots, K$
Tune the value of parameter σ to match the dataset
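The closed-form expression for F can be evaluated directly with a linear solve. The sketch below does so for a small hypothetical graph with two tightly connected pairs of nodes and one labelled node per class; the function name, the weights and the value µ = 1 are assumptions made for illustration.

```python
import numpy as np


def generalised_ssl(w, y, sigma, mu):
    """F = mu/(2+mu) * (I - 2/(2+mu) * D^-sigma W D^(sigma-1))^-1 Y."""
    w = np.asarray(w, dtype=float)
    y = np.asarray(y, dtype=float)
    d = w.sum(axis=1)  # node degrees (row sums of W)
    a = (d ** -sigma)[:, None] * w * (d ** (sigma - 1.0))[None, :]
    n = w.shape[0]
    return (mu / (2.0 + mu)) * np.linalg.solve(np.eye(n) - (2.0 / (2.0 + mu)) * a, y)


# Hypothetical graph: nodes {0, 1} and {2, 3} are strongly linked, with weak cross links.
w_demo = np.array([
    [0.0, 1.0, 0.1, 0.0],
    [1.0, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 1.0],
    [0.0, 0.1, 1.0, 0.0],
])
# Node 0 is labelled class 0, node 2 is labelled class 1; nodes 1 and 3 are unlabelled.
y_demo = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
classes = {
    sigma: generalised_ssl(w_demo, y_demo, sigma, mu=1.0).argmax(axis=1).tolist()
    for sigma in (0.0, 0.5, 1.0)
}
```

On this symmetric toy graph all three values of σ give the same classification; the choice of σ matters when degrees are heterogeneous, which is the situation the tuning step is meant to address.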
Follow-up Dynamic graphs analysis
Duality and semi-supervised learning
graph (similarity) $\xleftrightarrow{\text{ordination}}$ process (metric)
formulation (multidimensional scaling): $\arg\min_F \left\{ \sum_{i=1}^{N} \sum_{j=1}^{N} \left( \| F_{i.} - F_{j.} \| - w_{ij} \right)^2 \right\}$
bridge ordination (MDS) and generalised semi-supervised learning
▷ leverage σ flexibility to vary duality principle
data adaptivity of semi-supervised learning
▷ use to update dynamic graph ↔ non-stationary time series
Graph diffusion
Epidemic diffusion (MOSAR): Apply standard tools...
▷ Relationship between virus spreading and graph structure: Can diffusion wavelets help?
[Figure: diffusion wavelet scaling functions on the graph, from coarse to fine scale (colour scale 0 to 0.05)]
▷ How to take into account / reflect dynamicity of graphs
Context and collaborations
Dante (B. Girault, E. Fleury,. . . )
Institut des Systèmes Complexes
Sisyphe (ENS Lyon, P. Borgnat)
Other teams (e.g. Geodyn, Maestro. . . )
International cooperations (e.g. EPFL)
. . .