www.intel.com/research
Load Sharing in Networking Systems
Lukas Kencl, Intel Research Cambridge; MFF UK Praha, November 2005
(With lots of help from Weiguang Shi, University of Alberta, and citations from D. Thaler, C. Ravishankar and K. Ross)
Outline
Intel Research Cambridge
Load sharing in networking systems
  Problem statement
  Motivation
  Imbalance due to traffic properties
…………………… Break ……………………
  Inspiration: distributed Web caching
  Solutions and properties:
    Adaptive HRW hashing
    Adaptive Burst Shifting
…………………… Break ……………………
Seminar: Q&A, Exercises and Homework, IR Movies!!!
Intel Research Cambridge
Intel R&D Commitment
[World map of Intel R&D sites: Washington, Oregon, California, Arizona, New Mexico, Illinois, Massachusetts, Pennsylvania; Cambridge, Swindon, Glasgow, Shannon, Barcelona, Nice, Copenhagen, Stockholm, Braunschweig, Ulm, Gdansk; Haifa; Nizhny Novgorod, Sarov; Delhi, Mumbai, Bangalore (India); Beijing, Shanghai (China); Tsukuba, Tokyo]
Over 75 labs & 7000 R&D professionals worldwide
Intel Research: 4 Lablets, target 80 people worldwide
Intel Research Mission
"Build the technical leadership, knowledge assets and systems perspective to make Intel a preeminent driver of disruptive information technologies."
David Tennenhouse
Vice President, Corporate Technology Group
Director of Research, Intel Corporation
What is unique about Intel Research?
Research is largely exploratory
  Off roadmap, 7-15 years out on timeline
Innovative collaborative model engaging with key Universities
  "Lablets" (Berkeley, Seattle, Pittsburgh, Cambridge) - Intel employee facilities co-located on University campus to facilitate collaboration
  Open Collaborative Agreements signed with Universities
Strategic Research Projects (SRP) inside Intel
  Technology transfer mechanism from open collaborative research
  Proprietary exploratory research outside the scope of any business unit
Alignment of University engagements
  Research Council, Academic Relations, Labs, Visiting Faculty, Internships!!!
Intel Research Cambridge (IRC)
Established in March 2003
Currently 12 full-time researchers
Visiting faculty, student interns
Make sure the smartest work with us: cooperation with Cambridge University, as well as others throughout the UK, Europe and elsewhere
IRC Key Research Areas
Optical Packet Switching
Virtualisation - Xen
  Para-virtualisation - run a modified or new OS on a specialised guest architecture
Wired Networking
  CoMo - Continuous Monitoring
  Adaptive methods
Anticipate the wireless
  WIP: from internet to Intairnet - wireless access network using multidirectional antennas
  HAGGLE: mobile ad-hoc networks
Ubiquitous computing
Load Sharing in Networking Systems
Problem Statement
Map packets to processors
Assumptions:
• Data arrives in packets.
• Any processor can process any packet.
• Heterogeneous processor capacity μj.
[Diagram: multiple (N) inputs feeding multiple (M) processors]
Task:
• Map packets to processors so that the load stays within some measure of balance.
Packet flows: avoid remapping or out-of-sequence delivery!!!
Assumptions:
• Data arrives in packetized flows.
• Any processor can process any packet.
• Heterogeneous processor capacity μj.
[Diagram: multiple (N) inputs feeding multiple (M) processors]
Task:
• Keep the load on processors within some measure of balance.
• Map the same flow to the same processor (reordering, context).
Advantage: system optimization. Drawback: complexity, overhead.
Reduce state maintenance of Flow-to-Processor Mapping
[Diagram: multiple (N) inputs, multiple (M) processors; the incoming packet carries a flow identifier vector v]
Upon packet arrival, a decision is made where to process the packet, based on the flow identifier. A flow-to-processor mapping f is thus established.
f(v) = 3
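A stateless mapping computes f(v) from the flow identifier on every packet, so no per-flow table is needed; a minimal sketch in Python (the 5-tuple field values and the processor count M are illustrative):

```python
import hashlib

M = 4  # number of processors (illustrative)

def f(v):
    """Stateless flow-to-processor mapping: hash the flow identifier
    vector v (e.g. the 5-tuple) and reduce modulo M. The same flow
    always yields the same processor, with no per-flow state kept."""
    digest = hashlib.blake2b(repr(v).encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % M

v = ("10.0.0.1", "10.0.0.2", 6, 1234, 80)  # src, dst, proto, sport, dport
assert f(v) == f(v)  # deterministic: every packet of the flow maps alike
```

Because the mapping is a pure function of v, consistency across packets of a flow comes for free; the price, as the later slides show, is that a static hash cannot react to load imbalance.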
Acceptable load sharing as a measure of balance
Processing load on processor j: λj(t)
Capacity of processor j: μj
Workload intensity on processor j: ρj(t) = λj(t) / μj
Total system workload intensity: ρ(t) = Σj λj(t) / Σj μj
"No single processor is overutilized if the system in total is not overutilized, and vice versa."
Acceptable load sharing:
if ρ(t) ≤ 1 then ∀j, ρj(t) ≤ 1,
if ρ(t) > 1 then ∀j, ρj(t) > 1.
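The condition can be checked directly; a small sketch, with the λ and μ values purely as examples:

```python
def acceptable(lam, mu):
    """Check the acceptable-load-sharing condition: if the system in
    total is not overutilized, no single processor is; and vice versa."""
    rho = [l / m for l, m in zip(lam, mu)]   # per-processor rho_j(t)
    rho_total = sum(lam) / sum(mu)           # system rho(t)
    if rho_total <= 1.0:
        return all(r <= 1.0 for r in rho)
    return all(r > 1.0 for r in rho)

print(acceptable(lam=[4, 5], mu=[10, 10]))   # balanced system: True
print(acceptable(lam=[12, 2], mu=[10, 10]))  # one processor overutilized: False
```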
Coefficient of Variation (CV) as a measure of balance
CV = standard deviation / mean (of the per-processor loads)
Useful, as it takes the scale of measurements out of consideration.
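A sketch of the CV computation; the assertions illustrate the scale-invariance the slide mentions:

```python
from statistics import pstdev, mean

def cv(loads):
    """Coefficient of variation: standard deviation over mean.
    Scale-free, so loads in packets/s or bits/s compare directly."""
    return pstdev(loads) / mean(loads)

assert cv([5, 5, 5, 5]) == 0.0                 # perfect balance
assert cv([50, 50, 50, 50]) == cv([5, 5, 5, 5])  # scale does not matter
print(round(cv([8, 2, 2]), 3))                 # imbalanced: nonzero CV
```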
Minimizing Disruption
Possible Goal: Acceptable load sharing without maintaining flow state information, and yet minimizing the probability of mapping disruption (flow remapping or reordering).

Special 0-1 Integer Programming Problem:
max Σv Δv(t) · Σj ( 1{f(t-Δt)(v) = j} · 1{f(t)(v) = j} ),
while Σv av(t) · 1{f(t)(v) = j} = λj(t) ≤ μj, ∀j.

v - flow identifier vector in the packet header,
f(t)(v) - function mapping flows to processors, changing over time,
Δv(t) ∈ {0,1} - indicator whether v has appeared in the intervals (t-2Δt, t-Δt) and (t-Δt, t),
av(t) - how many times v appeared in the interval (t-Δt, t),
Δt - iteration interval.

We suspect an NP-complete problem - use heuristics.
Summary
Balance load
Avoid remapping - or packets out-of-sequence
Minimize overhead
Load Sharing in Networking Systems
Motivation
Practical Examples
Distributed Router, or Parallel Forwarding Engine
Server Farm Load Balancer
Network Processor

Server Farm Load Balancer
[Diagram: incoming packet -> packet-to-server mapping computation -> packet forwarding -> multiple (M) servers S1, S2, ..., SM]
Network Processor (NP)
[Diagram: packets -> pool of Microengines (ME); GPP, SRAM, DRAM]
Network processor:
  Pool of multi-threaded forwarding engines
  Integrated or attached GPP
  Memories of varying bandwidth and capacity
Large parallel programming problem
  Concurrency
  Packet order
  Resource allocation
  Load balancing
At 10 Gbps, packets arrive every 36 ns!!!
Load Sharing in Networking Systems
Traffic properties make balancing difficult
Traffic Properties
Address distribution - 32-bit address space very unevenly assigned and populated
Burstiness
Power law
TCP Traffic is Bursty
[Figure: TCP behaviour, ideal and typical]
Packets arrive in periodic bursts!
Networking Traffic Exhibits Power-Law (Zipf-law) Properties
[Figure: flow popularities in traffic traces]
P(R) ~ 1/R^a
The frequency of an event as a function of its rank obeys a power law. The popularity of network flows follows this with a close to 1 (from above).
Power-law leads to imbalance under a static mapping (Shi et al, 2004)
m - number of processors,
K - number of distinct object addresses or flow IDs,
pi (0<i<K) - popularity of object i,
qj (0<j<m) - number of distinct addresses or flow IDs mapped to processor j
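The imbalance is easy to reproduce with a small experiment (the values of K and m and the choice of hash are illustrative): draw Zipf popularities pi ∝ 1/i and map the flow IDs with a static hash.

```python
import hashlib

K, m = 1000, 16                               # flows and processors (illustrative)
weights = [1.0 / i for i in range(1, K + 1)]  # Zipf with exponent a = 1
total = sum(weights)
p = [w / total for w in weights]              # popularities p_i

def static_map(i):
    """Static hash of the flow ID, as a plain load balancer would use."""
    h = hashlib.blake2b(str(i).encode(), digest_size=8).digest()
    return int.from_bytes(h, "big") % m

load = [0.0] * m
for i, pi in enumerate(p):
    load[static_map(i)] += pi

# Under a uniform workload each processor would carry 1/m ~ 6.25% of the
# traffic; with Zipf popularities the processor holding the top flow
# carries at least p_1 ~ 13%, so a static hash cannot balance the load.
print(max(load), 1 / m)
```

The hash spreads the *flow IDs* evenly, but not the *packets*: the head of the Zipf distribution dominates whichever processor it lands on.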
Conclusion Part I
Complex problem
Traffic properties work against static solutions
Break - 5 minutes
Load Sharing in Networking Systems
Inspiration: Distributed Web Caching and Web Server Load Balancing
What is a Distributed Web Cache
[Diagram: Intranet and Internet; a proxy cache stores Web objects locally; a collection of distributed Web caches]
HRW Mapping
Def.: Object-to-Processor mapping function f, f(v): V -> M:
f(v) = j ⇔ xj · g(v, j) = maxk xk · g(v, k),
where v is the object identifier vector, x = (x1, ..., xm) is a weights' vector and g(v, j) ∈ (0,1) is a pseudorandom function of uniform distribution.

Highest Random Weight (HRW) Mapping; Thaler, Ravishankar, 1997; Ross, 1998; CARP Protocol, Windows NT LB

Minimal disruption of mapping in case of processor addition or removal/failure.
Load balancing over heterogeneous processors: the weights' vector x is in a 1-to-1 correspondence to p = (p1, ..., pm), the vector of object fractions received at each processor.
The pseudorandom function g(v, j) ∈ (0,1) can be implemented as a fast-computable hash function.

[Diagram, example with 3 processors: map v to the processor maximizing xj · g(v, j)]
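The definition translates directly into code; a sketch in Python, with a keyed hash standing in for the pseudorandom function g (the digest size and flow identifiers are illustrative):

```python
import hashlib

def g(v, j):
    """Pseudorandom weight in (0,1), uniform per (v, j) pair."""
    h = hashlib.blake2b(f"{v}|{j}".encode(), digest_size=8).digest()
    return (int.from_bytes(h, "big") + 1) / (2**64 + 1)

def hrw_map(v, x):
    """Map object identifier v to the processor j maximizing x[j] * g(v, j)."""
    return max(range(len(x)), key=lambda j: x[j] * g(v, j))

# Equal weights: each of 3 processors should receive roughly a third.
x = [1.0, 1.0, 1.0]
counts = [0, 0, 0]
for i in range(3000):
    counts[hrw_map(f"flow-{i}", x)] += 1
print(counts)  # roughly [1000, 1000, 1000]
```

Skewing the weights' vector x shifts the fractions p accordingly, which is what the adaptive scheme later in the talk exploits.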
HRW Minimal Disruption
Minimal disruption of mapping in case of processor addition:
Example: add processor No. 4; vectors are mapped either (i) as before the addition, or (ii) to the newly added processor - a minimal number of vectors change mapping.
[Diagram: with 4 processors, the maximum of g(v, j) either stays at the old winner or moves to the new processor 4; compare with partitioning into contiguous sets]
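The property is easy to observe empirically; a self-contained sketch with equal weights, reusing a hash-based g (all names illustrative): when a fourth processor is added, every flow either keeps its processor or moves to the new one, and about 1/m of the flows move.

```python
import hashlib

def g(v, j):
    """Pseudorandom weight in (0,1) for the (object, processor) pair."""
    h = hashlib.blake2b(f"{v}|{j}".encode(), digest_size=8).digest()
    return (int.from_bytes(h, "big") + 1) / (2**64 + 1)

def hrw_map(v, m):
    """Equal-weight HRW: processor with the highest random weight wins."""
    return max(range(m), key=lambda j: g(v, j))

flows = [f"flow-{i}" for i in range(10000)]
before = {v: hrw_map(v, 3) for v in flows}   # 3 processors
after = {v: hrw_map(v, 4) for v in flows}    # processor No. 4 added

moved = [v for v in flows if before[v] != after[v]]
# Every remapped flow moved to the new processor; none moved between
# the old ones, because adding processor 4 cannot change which of the
# old weights g(v, 0..2) was largest.
assert all(after[v] == 3 for v in moved)
print(len(moved) / len(flows))  # close to 1/4 of the flows move
```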
HRW Load Balancing
m - number of servers, S1, ..., Sm
K - number of distinct object addresses or flow IDs, from the set of objects O
pi (0<i<K) - popularity of object i,
qj (0<j<m) - number of distinct addresses or flow IDs mapped to processor j, e.g. the sum of the popularities of objects mapped to Sj
Load Sharing in Networking Systems
Adaptive Methods
- Adaptive HRW Hashing
- Adaptive Burst Shifting
Adaptive HRW Hashing (Kencl, Le Boudec 2002)
Adaptive HRW mapping
Adaptation through Feedback
[Diagram: multiple (N) input cards, multiple (M) processors and a Control Point (CP)]
1. Filter workload intensity ρj(t) at each processor j.
2. Evaluate ρ(t) = (ρ1(t), ρ2(t), ..., ρm(t)) (compare against the threshold).
3. Compute new x(t) = (x1(t), ..., xm(t)).
4. Download new x := x(t).

The trigger definition targets preventing overload of any single processor if the system in total is not overloaded, and vice versa. A threshold θ triggers adaptation when close to the load sharing bounds.
The flow-to-processor mapping f becomes a function of time, f(t)(v). Adaptation may cause flow remapping! How to minimize the amount remapped?
Problem: incoming requests are packets, not flows! Packets are not evenly distributed over flows -> not evenly distributed over the request object space -> HRW mapping alone is not sufficient for acceptable load sharing bounds -> need to adapt!
Adaptation Algorithm
[Flowchart:]
Start -> Compute filtered processor workload intensity ρ(t) -> Trigger adaptation? (Triggering Policy)
  No -> wait time Δt, then recompute.
  Yes -> adapt weights' vector x and upload (Adaptation Policy), then wait time Δt.
Triggering Policy
Dynamic workload intensity threshold:
θ'(t) = 1/2 (1 + ρ(t))

Triggering policy:
(i) if ρ(t) ≤ 1 and max ρj(t) > θ(t), then adapt;
(ii) if ρ(t) > 1 and min ρj(t) < θ(t), then adapt.

Example: ρ(t) = (0.8, 0.2, 0.2), ρ(t) = 0.4 => θ(t) = 0.7; ρ1(t) > θ(t) => adapt.

Triggering threshold:
θ(t) = max(θ'(t), upper), or vice versa for the lower bound.

Hysteresis bound:
upper: (1 + H(t)) · ρ(t)
lower: (1 - H(t)) · ρ(t)

[Figures: ρ(t) of the system in total over time, with the workload intensity threshold, the triggering threshold, and the max/min hysteresis bounds]
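The trigger check can be sketched as below. Equal processor capacities are assumed so that ρ(t) is the mean of the ρj, and the hysteresis value is illustrative:

```python
def threshold(rho_total, hysteresis=0.1):
    """Dynamic threshold theta'(t) = (1 + rho(t)) / 2, combined with the
    upper hysteresis bound as on the slide (underloaded case)."""
    theta = (1.0 + rho_total) / 2.0
    upper = (1.0 + hysteresis) * rho_total
    return max(theta, upper)

def should_adapt(rho):
    """rho: per-processor workload intensities rho_j(t)."""
    rho_total = sum(rho) / len(rho)  # equal capacities assumed
    th = threshold(rho_total)
    if rho_total <= 1.0:
        return max(rho) > th   # some processor too far above the mean
    return min(rho) < th       # overloaded case, symmetric check

print(should_adapt([0.8, 0.2, 0.2]))  # slide example: True (0.8 > 0.7)
print(should_adapt([0.4, 0.4, 0.4]))  # balanced: False
```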
Adaptation Policy: Minimal Disruption
• A, B - mutually exclusive subsets of M = {1,...,m}, M = A ∪ B.
• α ∈ (0, 1).
• f, f' - two HRW mappings with the weights' vectors x, x':
  x'j = α · xj, ∀j ∈ A,
  x'j = xj, ∀j ∈ B.
• pj, p'j - fraction of objects mapped to node j using f, f'.

Then:
1) p'j ≤ pj, ∀j ∈ A,
   p'j ≥ pj, ∀j ∈ B.
2) The fraction of objects mapped to a different node by each mapping is MINIMAL, that is, equal to |p'j - pj| · |V| at every node j.
Adaptation Policy: Minimal Disruption Example (3 proc.)
• The reduced node receives less, the unaltered nodes receive more, if the reduction is by a single, invariable multiplier.
• Minimal disruption of the mapping.
[Diagram: weights (x1, x2, x3) versus (2/3·x1, 2/3·x2, x3); in each case, map v to the processor maximizing xj · g(v, j)]
Adaptation Policy
Let ρ(t) ≤ 1. Then:
xj(t) := c(t) · xj(t-Δt), if ρj(t) > θ(t) (ρj exceeds threshold θ(t)),
xj(t) := xj(t-Δt), if ρj(t) ≤ θ(t) (ρj does not exceed threshold θ(t)).
If ρ(t) > 1, the adaptation is carried out in a symmetrical manner.

The weights' multiplier coefficient c(t):
c(t) = ( θ(t) / min{ ρj(t) | ρj(t) > θ(t) } )^(1/m)

Factor c(t) is proportional to the minimal error and to the number of nodes.
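One step of this policy for the underloaded case can be sketched as follows (the function name is illustrative; the symmetric overloaded case is omitted):

```python
def adapt_weights(x, rho, theta, rho_total):
    """One adaptation step, underloaded case rho_total <= 1: multiply the
    weight of every processor whose intensity exceeds the threshold by
    c(t) < 1, and leave the other weights unchanged."""
    m = len(x)
    over = [r for r in rho if r > theta]
    if not over or rho_total > 1.0:
        return list(x)  # nothing to adapt in this sketch
    c = (theta / min(over)) ** (1.0 / m)  # c < 1, since min(over) > theta
    return [xj * c if rj > theta else xj for xj, rj in zip(x, rho)]

x = [1.0, 1.0, 1.0]
new_x = adapt_weights(x, rho=[0.8, 0.2, 0.2], theta=0.7, rho_total=0.4)
# Only the overloaded processor's weight shrinks; by the minimal
# disruption property, the remapped flows move only off that processor.
print(new_x)
```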
Adaptive Burst Shifting (Shi, MacGregor, Gburzynski 2005)
Adaptive Burst Shifting
Insight: the periodicity of bursts may allow shifting flows in-between bursts, without affecting order within flows.
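The slides name a Burst Distributor Algorithm but give no pseudocode; the sketch below only illustrates the stated insight: stick with the flow's current processor inside a burst, and consider moving the flow to the least-loaded processor once an inter-burst gap is seen. The names and the GAP threshold are hypothetical, not taken from the paper.

```python
GAP = 5            # hypothetical inter-burst gap threshold (time units)
loads = [0, 0, 0]  # outstanding packets per processor
table = {}         # flow id -> (processor, timestamp of last packet)

def dispatch(flow, t):
    """Send packets of an ongoing burst to the flow's current processor;
    between bursts (gap > GAP) the flow may shift to the least-loaded one,
    without reordering packets inside a burst."""
    proc, last = table.get(flow, (None, None))
    if proc is None or t - last > GAP:
        proc = min(range(len(loads)), key=loads.__getitem__)
    table[flow] = (proc, t)
    loads[proc] += 1
    return proc

# A burst of flow "a" sticks to one processor; only after a gap may it shift.
burst1 = {dispatch("a", t) for t in (0, 1, 2)}
assert len(burst1) == 1  # in-burst packets stay on a single processor
```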
Adaptive Burst Shifting
[Figure: Burst Distributor Algorithm]
Performance evaluation
Adaptive HRW on generated traffic
Expectations
• Workload intensity on individual processors close to that of the system in total (acceptable load sharing);
• Packet loss probability lowered (acceptable load sharing);
• Persistent flows (appearing in two consecutive iterations) seldom remapped (minimize disruption).
AHH Keeps Per-processor Workload Intensity Close to Ideal
[Figures: workload intensity over time - ρ(t) of the system in total versus the max and min ρj(t) - for (i) naive, no load sharing, (ii) static HRW, and (iii) adaptive HRW]
Packet Loss Significantly Reduced with the Adaptive Control Loop
[Figures: packets dropped over time for naive, static, adaptive and ideal; packet loss in excess of ideal for static versus adaptive]
Adaptive HRW saves on average 60% of the packets dropped by static load sharing.
Minimal Disruption Property Ensures Few Flow Remappings
[Figure: flows per iteration - appearing, persistent and remapped]
The adaptive control loop leads on average to:
• less than 0.05% of the appearing flows remapped per iteration;
• less than 0.2% of the persistent flows remapped per iteration.
AHH and ABS on generated traffic
Adaptive HRW and Adaptive Burst Shifting
[Figures: packet drop rates, packet reordering, packet remapping]
Conclusion
• ABS better in preventing packet drop
• AHH better in preventing remapping and reordering, although a larger table would improve ABS
• ABS works on a much smaller time-scale - can address packet bursts
• AHH converges to the optimal allocation, but its timescale is too long for bursts
• Best solution probably a hybrid - under investigation - must be careful to avoid conflicting feedback-control mechanisms
The End
Thank you!
Seminar @ 10:40
Seminar
Seminar agenda
Q & A
Exercises 1, 2, 3
Internships @ IRC
Homework
Q & A II.
Cinema?
Exercise 1: Mapping Disruption
Let there be a system with M servers of the same capacity, and let an (M+1)th server be added to the system. Compute the amount of mapping disruption with the:
a) Contiguous mapping
b) Mod M mapping
Exercise 2: Prove the x2p relationship of HRW mapping
Homework:
1. Prove the minimal-disruption property of Adaptive HRW hashing
2. Try to design your own adaptive load balancing mapping that a) reduces packet reordering within flows, or b) reduces flow remapping.
Q & A
Internships? Films?
Thank you!
[email protected]
http://www.intel-research.net/cambridge/