CS 4700 / CS 5700: Network Fundamentals
Lecture 17: Network Modeling (Not Everyone has a Datacenter)
Wide-Area Network Research
Most research is now focused on large-scale systems
Challenge: testing and evaluation
How to perform wide-area tests in a repeatable, reliable manner
ModelNet, Emulab
Challenge: understanding/capturing Internet topologies
Graph characterization: dK-series
Outline
ModelNet
dK
A Case for Network Emulation
Need a way to test large-scale Internet services
Peer-to-peer, overlay networks, novel protocols
Testing in the real world (PlanetLab…)
Results not reproducible or predictable
Difficult to deploy and administer research software
Simulation tools
Allow control over the test environment
May miss important system interactions
Emulation
Emulators subject application traffic to the end-to-end bandwidth constraints, latency, and loss rate of a user-specified topology
Previous implementations not scalable
ModelNet
A scalable, cluster-based, comprehensive network emulation environment
Design
Users run a configurable number of application instances on edge nodes within the cluster
Each instance is a Virtual Edge Node (VN)
Each VN has a unique IP address
Edge nodes route traffic through a cluster of core routers
Equipped with large memories and modified FreeBSD kernels
Core routers route traffic through emulated links, or "pipes"
Each pipe has its own packet queue and queuing discipline
ModelNet Phases
Create
Generates a network topology as a graph
From Internet traces, BGP dumps, synthetic topology generators, etc.
Annotate the graph with loss rates, failure distributions…
Distillation
Transforms the GML graph into a pipe topology
Assignment
Maps the pipe topology onto core nodes, distributing emulation load across them
Finding the ideal mapping is NP-complete
ModelNet uses greedy k-clusters assignment: for k core nodes, randomly select k nodes in the distilled topology, then greedily select links from each connected component in round robin
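The greedy k-clusters assignment can be sketched as follows. This is a minimal sketch, assuming the distilled topology arrives as a list of undirected links; the tie-breaking order and handling of exhausted clusters are assumptions, not ModelNet's actual implementation.

```python
import random

def k_clusters_assignment(links, k, seed=0):
    """Greedy k-clusters: assign each link (pipe) of the distilled
    topology to one of k core nodes.  Sketch of the heuristic only."""
    rng = random.Random(seed)
    # Build adjacency: node -> incident links.
    adj = {}
    for link in links:
        u, v = link
        adj.setdefault(u, []).append(link)
        adj.setdefault(v, []).append(link)
    seeds = rng.sample(list(adj), k)   # k random starting nodes
    members = [{s} for s in seeds]     # nodes reached by each cluster
    assignment = {}                    # link -> core node index
    progress = True
    while progress:
        progress = False
        for core in range(k):          # round robin over cores
            # Greedily take one unassigned link touching this cluster.
            for node in list(members[core]):
                cand = [l for l in adj[node] if l not in assignment]
                if cand:
                    link = cand[0]
                    assignment[link] = core
                    members[core].update(link)  # absorb both endpoints
                    progress = True
                    break
    return assignment
```

For a connected topology every link ends up assigned; links in components untouched by any seed would remain unassigned in this sketch.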
ModelNet Phases
Binding
Multiplex multiple VNs onto each physical edge node
Bind each physical edge node to a core router
Generate shortest-path routes between all VNs and install them in the core routing tables
Run
Executes target application code on the edge nodes
Inside the Core
Route traffic through emulated "pipes"
Each route is an ordered list of pipes
Packets move through pipes by reference
Routing table requires O(n^2) space
Packet scheduling
When a packet arrives, put it at the tail of the first pipe in its route
Scheduler stores a heap of pipes sorted by earliest deadline, i.e. the exit time of the first packet in each pipe's queue
Once every clock tick:
Traverse the pipes in the heap for packets that are ready to exit
Move packets to the tail of the next pipe, or schedule them for delivery
Calculate new deadlines
Multi-core configuration
The next pipe in a route may be on a different machine
If so, the core node tunnels the packet descriptor to the next node
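The scheduling loop above can be sketched as follows. This is a simplified single-machine model (fixed per-pipe latency, no bandwidth or loss emulation, integer clock ticks), not ModelNet's in-kernel implementation.

```python
import heapq

class Pipe:
    """One emulated link: a fixed latency plus a FIFO packet queue."""
    def __init__(self, latency):
        self.latency = latency
        self.queue = []          # FIFO of (deadline, packet)

def run_route(route, packets, start=0):
    """Push packets through an ordered list of pipes by reference,
    driven by an earliest-deadline heap of pipes."""
    delivered = []               # (delivery_time, packet)
    heap = []                    # (deadline, pipe index in route)
    first = route[0]
    for pkt in packets:
        first.queue.append((start + first.latency, pkt))
    heapq.heappush(heap, (start + first.latency, 0))
    clock = start
    while heap:
        clock += 1               # one clock tick
        # Visit every pipe whose head packet is ready to exit.
        while heap and heap[0][0] <= clock:
            _, i = heapq.heappop(heap)
            pipe = route[i]
            while pipe.queue and pipe.queue[0][0] <= clock:
                deadline, pkt = pipe.queue.pop(0)
                if i + 1 < len(route):
                    # Move to the tail of the next pipe; new deadline.
                    nxt = route[i + 1]
                    nxt.queue.append((deadline + nxt.latency, pkt))
                    heapq.heappush(heap, (deadline + nxt.latency, i + 1))
                else:
                    delivered.append((deadline, pkt))  # end of route
            if pipe.queue:       # re-arm pipe with its new head deadline
                heapq.heappush(heap, (pipe.queue[0][0], i))
    return delivered
```

With a two-pipe route of latencies 2 and 3, a packet exits at time 5, i.e. end-to-end delay is the sum of the pipe latencies.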
Scalability Issues
Traffic traversing the core is limited by the cluster's physical internal bandwidth
ModelNet must buffer up to the full bandwidth-delay product of the target network
E.g., 250 MB of packet buffer space to carry flows at an aggregate bandwidth of 10 Gb/s with 200 ms round-trip latency
Assumes a perfect routing protocol
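The 250 MB figure is just the bandwidth-delay product; a quick sanity check:

```python
def buffer_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: bytes that can be in flight, and hence
    the packet buffering the emulator may need."""
    return bandwidth_bps * rtt_s / 8  # bits -> bytes

# 10 Gb/s aggregate bandwidth, 200 ms round-trip latency
buffer_mb = buffer_bytes(10e9, 0.2) / 1e6  # -> 250.0 MB
```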
Baseline Accuracy
Want to ensure that, under load, packets are subject to the correct end-to-end delays
Used kernel logging to track ModelNet performance and accuracy
Results show that by running the ModelNet scheduler at the highest kernel priority:
Packets are delivered within 1 ms of the target end-to-end value
Accuracy is maintained up to 100% CPU usage
Scalability
Additional cores
Adding core routers allows ModelNet to deliver higher throughput
Communication between core routers introduces overhead: more cross-core communication means less throughput benefit
VN multiplexing
Higher degrees of multiplexing enable larger network emulations
Inaccuracies introduced by context switching, scheduling, resource contention, etc.
Accuracy vs. Scalability
Reduce overhead by deviating from target network requirements
Changes should minimally impact application behavior
Ideally, system reports degree and nature of emulation inaccuracy
Scalability via Distillation
Pure hop-by-hop emulation
Distilled topology is isomorphic to the target network
High per-packet overhead
End-to-end distillation
Remove all interior network nodes; collapse each path into a single pipe
Latency = sum of the latencies along the path
Reliability = product of the link reliabilities along the path
Low per-packet overhead
Does not emulate link contention along the path
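The collapse rule (latencies add, reliabilities multiply) can be written directly; representing each link as a (latency, reliability) tuple is an assumption for illustration:

```python
def distill_path(links):
    """Collapse a hop-by-hop path into one end-to-end pipe.
    Each link is (latency_ms, reliability); latency adds up and
    reliability multiplies.  Link contention is not captured."""
    latency = sum(lat for lat, _ in links)
    reliability = 1.0
    for _, rel in links:
        reliability *= rel
    return latency, reliability
```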
Time Dilation on ModelNet
The challenge
Need to emulate networks with more resources, e.g. fast CPUs (20 GHz), high-bandwidth (TB/s) networks
But only commodity machines are available
Solution: ModelNet + time dilation via virtual machines
Run each application instance inside a VM
Slow down time inside the VM
Result: everything looks faster/bigger/fatter
More CPU cycles, packets, and disk I/O per unit of time
How It’s Done
Must isolate the VM from outside measures of time
Time is based on a shared data structure provided by the VMM
Scale the data structure by a Time Dilation Factor (TDF)
Also scale the hardware timer by the TDF
How do we scale only some resources? Slow the others back down!
Example: speed up the network by TDF = 10
Bandwidth increases by 10x, but delay decreases by 10x, so increase the emulated delay by 10x to compensate
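The scaling rules can be sketched as below. The concrete numbers are illustrative assumptions, not values from the paper: a 2 GHz machine dilated by TDF = 10 appears as a 20 GHz machine, and the emulated delay is stretched by 10x so the perceived delay matches the target.

```python
def perceived(cpu_ghz, bw_gbps, delay_ms, tdf):
    """What an application inside a VM dilated by TDF observes:
    a virtual second lasts TDF real seconds, so per-second rates
    (CPU, bandwidth) grow by TDF while real delays shrink by TDF."""
    return {"cpu_ghz": cpu_ghz * tdf,
            "bw_gbps": bw_gbps * tdf,
            "delay_ms": delay_ms / tdf}

# 2 GHz CPU, 100 Gb/s emulated links, target one-way delay 200 ms:
# configure 10x the delay so the VM perceives the intended 200 ms.
target = perceived(2.0, 100.0, 200.0 * 10, tdf=10)
```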
[Figure: VM1, VM2, …, VMn alongside NodeMgr and LocalAdmin, all running on a Virtual Machine Monitor (VMM)]
ModelNet Summary
ModelNet: the antithesis of PlanetLab
Testing of unmodified applications
Reproducible results
Experimentation using a broad range of network topologies and characteristics
Large-scale experiments (thousands of nodes and gigabits of cross traffic)
Can scale to emulate non-existent resource levels
But what if you want real deployment, on demand? Emulab / NetBed
Emulab / NetBed
A shared, configure-on-demand testbed
What if you don't have your own cluster?
What if you need to test specific environments/hardware?
What if you need this in 5 minutes?
Emulab / NetBed
Hardware: 328 PCs, high-speed Gb Cisco switches
Software: OS loader and manager via a web interface
Wipes all disks, loads OS images, configures routers in <2 minutes
Reboots and gives ssh access
Emulab Web Interface
Outline
ModelNet
dK
Importance of Network Topology
Access to real-world network topologies is vital for research
New routing and other protocol design, development, testing, etc.
Analysis: the performance of a routing algorithm strongly depends on the topology
Generation: empirical estimation of scalability
Network robustness, resilience under attack, worm spreading, etc.
Network Topology Research
[Figure: network topology research spans static topologies and dynamic topologies]
Trade Secrets
Unfortunately, large-scale network topologies are often proprietary
Think about BGP: ISPs want to hide their internal topology
Real datasets are rare
Small scale, out of date, and static (i.e. not dynamic)
Towards Synthetic Topologies
Question: can we use graph models to capture real network topologies?
Fit a model to a real topology
Use a generator to produce synthetic topologies that are similar, but not identical, to the real topology
Benefits
Privacy: synthetic graphs are not proprietary
Randomization: produce an infinite number of stochastic snapshots
Scalability: the generator can produce similar topologies of any size
Important Topology Metrics
Degree distribution
Clustering
Assortativity
Distance distribution
Betweenness distribution
Problems
No way to reproduce most of the important metrics
No guarantee that some other/new metric won't later be found important
The Approach
Look at the inter-dependencies among topology characteristics
See if, by reproducing the most basic, simple characteristics, we can also reproduce all other characteristics, including the practically important ones
Try to find the characteristic(s) that define all others
Key observation: graphs are structures of connections between nodes
Definition of dK-distributions
dK-distributions are degree correlations within simple connected graphs of size d
For example:
1K distribution: the node degree distribution
2K distribution: the joint node degree distribution
3K distribution: adds the clustering coefficient
An Example of dK
xK is the distribution of subgraphs of size x with particular node degrees
dK-1 describes the node degree distribution
dK-2 describes the joint node degree distribution
dK-3 captures the clustering coefficient
dK-0: average degree = 2
dK-1: P(1)=1, P(2)=2, P(3)=1
dK-2: P(1,3)=1, P(2,2)=1, P(2,3)=2
dK-3: P(1,3,2)=2, P(2,2,3)=1
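The lower-order distributions in this example can be checked directly. The node labels below are an assumption; only the degrees (one degree-1 node, two degree-2 nodes, one degree-3 node) come from the slide:

```python
from collections import Counter

# 4-node example graph: node d has degree 3, b and c degree 2, a degree 1.
edges = [("a", "d"), ("b", "d"), ("c", "d"), ("b", "c")]

deg = Counter()
for u, v in edges:
    deg[u] += 1
    deg[v] += 1

# dK-0: average degree
avg_degree = sum(deg.values()) / len(deg)        # 2.0

# dK-1: how many nodes of each degree
dk1 = Counter(deg.values())                      # {1: 1, 2: 2, 3: 1}

# dK-2: how many edges join nodes of degrees (k1, k2)
dk2 = Counter(tuple(sorted((deg[u], deg[v]))) for u, v in edges)
# {(1, 3): 1, (2, 2): 1, (2, 3): 2}
```

These counts match the dK-0 through dK-2 values on the slide.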
Nice properties of dK-series
Constructability: we can construct graphs having property Pd (dK-graphs)
Inclusion: if a graph has property Pd, then it also has all properties Pi with i < d (dK-graphs are also iK-graphs)
Convergence: the set of graphs having property Pn consists of only one element, G itself (dK-graphs converge to G)
This guarantees that all (even not-yet-defined!) graph metrics can be captured by a sufficiently high d
Inclusion and dK-randomness
2K
0K
0K-random
1K
Given G
1K-randomnK
2K-random
How Do We Generate Graphs?
A number of different approaches: stochastic, pseudograph, matching, rewiring
Some are extensible to d = 3, others are not
New research has proposed d = 2.5 to make generation tractable
Stochastic approach
Classical (Erdos-Renyi) random graphs are the 0K-random graphs of the stochastic approach
Easily generalizable for any d: reproduce the expected values of the dK-distributions by connecting random d-plets of nodes with (conditional) probabilities extracted from G
Best for theory, worst in practice
Pseudograph approach
Reproduces dK-distributions exactly
Constructs not-necessarily-connected pseudographs
Extended for d = 2
Fails to generalize for d > 2: d-sized subgraphs start to overlap over edges at d = 3
Pseudograph details
1K:
1. Dissolve the graph into a random soup of nodes (each node keeps its degree-many stubs, or "k-ends")
2. Crystallize it back by pairing stubs
2K:
1. Dissolve the graph into a random soup of edges (each edge keeps its degree pair (k1, k2))
2. Crystallize it back
[Figure: nodes of degrees k1…k4 dissolved into k1-ends for 1K, and into degree-labeled edges for 2K, then recombined]
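The 1K dissolve/crystallize step can be sketched as a stub-pairing (configuration-model-style) generator; the exact pairing procedure is an assumption, not the paper's code:

```python
import random

def pseudograph_1k(degrees, seed=0):
    """1K pseudograph construction: dissolve nodes into a soup of
    degree-many stubs ('k-ends'), then crystallize by pairing stubs
    at random.  Reproduces the degree sequence exactly, but the
    result may contain self-loops and multi-edges and may be
    disconnected -- hence 'pseudograph'."""
    rng = random.Random(seed)
    stubs = [node for node, k in degrees.items() for _ in range(k)]
    rng.shuffle(stubs)
    # Pair consecutive stubs into edges (total degree must be even).
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]
```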
dK-Randomizing Rewiring
Can generate random graphs from the original
Generalizes to any d
But cannot generate the desired graph from the dK-distributions alone
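For d = 1, dK-randomizing rewiring is the familiar degree-preserving double edge swap; the rejection rules below (no self-loops, no multi-edges) are common conventions assumed for the sketch:

```python
import random

def rewire_1k(edges, swaps=100, seed=0):
    """dK-randomizing rewiring for d = 1: repeatedly pick two edges
    (a,b) and (c,d) and swap them to (a,d) and (c,b).  Every node
    keeps its degree, so the 1K distribution is preserved while
    higher-order structure is randomized."""
    rng = random.Random(seed)
    edges = list(edges)
    present = set(map(frozenset, edges))
    for _ in range(swaps):
        i = rng.randrange(len(edges))
        j = rng.randrange(len(edges))
        (a, b), (c, d) = edges[i], edges[j]
        if len({a, b, c, d}) < 4:
            continue                       # would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue                       # would create a multi-edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
    return edges
```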
Algorithms
All algorithms deliver consistent results for d = 0
All algorithms, except the stochastic one(!), deliver consistent results for d = 1 and d = 2
Both rewiring algorithms deliver consistent results for d = 3
Eventual choice:
Use pseudograph to construct 1K graphs
Use targeted rewiring to build higher-d graphs
Skitter Scalar Metrics
Metric  0K     1K     2K     3K     skitter
<k>     6.31   6.34   6.29   6.29   6.29
r       0      -0.24  -0.24  -0.24  -0.24
<C>     0.001  0.25   0.29   0.46   0.46
d       5.17   3.11   3.08   3.09   3.12
sd      0.27   0.4    0.35   0.35   0.37
l1      0.2    0.03   0.15   0.1    0.1
ln-1    1.8    1.97   1.85   1.9    1.9
HOT Scalar Metrics
Metric  0K     1K     2K     3K     HOT
<k>     2.47   2.59   2.18   2.10   2.10
r       -0.05  -0.14  -0.23  -0.22  -0.22
<C>     0.002  0.009  0.001  0      0
d       8.48   4.41   6.32   6.55   6.81
sd      1.23   0.72   0.71   0.84   0.57
l1      0.01   0.034  0.005  0.004  0.004
ln-1    1.989  1.967  1.996  1.997  1.997
HOT 0K
[Figure: true HOT graph vs. its 0K reconstruction]
HOT 1K
[Figure: true HOT graph vs. its 1K reconstruction]
HOT 2K
[Figure: true HOT graph vs. its 2K reconstruction]
HOT 3K
[Figure: true HOT graph vs. its 3K reconstruction]