Date posted: 08-May-2015
Category: Technology
Uploaded by: hajime-tazaki
Direct Code Execution: Revisiting Library OS
Architecture for Reproducible Network Experiments
Hajime Tazaki (University of Tokyo, Japan), Frederic Urbani (INRIA, France), Emilio Mancini (INRIA, France),
Mathieu Lacage (Alcmeon, France), Daniel Camara (INRIA, France), Thierry Turletti (INRIA, France), Walid Dabbous (INRIA, France)
ACM CoNEXT 2013
Our target: experimentation reproducibility
Ideally one should be able to easily
  Verify published results (same scenario)
  Test and debug with other scenarios
This requires functional/timing realism and debuggability
Try to replicate
Extend with an idea
Prove the idea is good under the same conditions
Related work: real time emulation
Container-based emulation provides lightweight virtualization
Mininet-HiFi, proposed in CoNEXT’12, ensures fidelity of experiments, but:
  Timing realism is still limited by hardware resources
  No debugging support
Related work: virtual time
Time Dilation [NSDI’06]
  Clock adjustment between different systems
  Constant time dilation factor
Slice Time [NSDI’12]
  Uses a synchronizer to adjust speeds between VMs and the underlying emulated network
TTVM [ATC’05]
  Supports debugging with backward/forward navigation
Related work: network simulators
Pros:
  More debuggability
  No real-time constraint
Cons:
  Lack of functional realism
Motivation
Improve the functional realism of simulators
while keeping timing realism and debuggability

                    Simulators  Emulators  Ours
Functional realism  --          ++         +
Timing realism      ++          -/+        ++
Debuggability       +           -          +
Our approach: Direct Code Execution
Functional realism
  Run real code: POSIX apps, kernel network stacks
Timing realism
  ns-3 integration (virtual clock)
Debuggability
  All in userspace
  Single-process virtualization
[Figure: DCE overview. On the hardware and host operating system, one process holds the simulation core and nodes #1 to #N, each with its own network stack and applications.]
DCE architecture
[Figure: DCE layer stack. Application (ip, iptables, quagga) and ns-3 application; POSIX layer; kernel layer (ARP, Qdisc, TCP, UDP, DCCP, SCTP, ICMP, IPv4, IPv6, Netlink, bridging, Netfilter, IPsec, tunneling; bottom halves/RCU/timer/interrupt; struct net_device) and ns-3 TCP/IP stack; virtualization core layer (heap and stack memory); ns-3 (network simulation core).]
1) Virtualization core layer
Run multiple nodes in a single (host) process
  dlmopen(3) etc.
  Simulated process: isolation of global symbols
  Management of stacks/heaps of simulated processes
Keep ns-3 features
  Timing realism
  Debuggability
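The isolation of global symbols can be illustrated by analogy. The sketch below is not DCE code and does not call dlmopen(3); it only mimics the idea in Python by loading the same source into one private namespace per simulated node, so each node gets its own copy of every global. APP_SOURCE, load_private_copy and packets_sent are hypothetical names introduced for this illustration.

```python
import types

# Hypothetical "application" with a global variable, standing in for a
# binary that would be loaded once per simulated process.
APP_SOURCE = """
packets_sent = 0   # global symbol used by the application

def send():
    global packets_sent
    packets_sent += 1
    return packets_sent
"""

def load_private_copy(node_id):
    """Load APP_SOURCE into a fresh namespace, one per simulated node."""
    module = types.ModuleType(f"app_node{node_id}")
    exec(APP_SOURCE, module.__dict__)
    return module

# Two simulated nodes inside one host (Python) process.
node0 = load_private_copy(0)
node1 = load_private_copy(1)

node0.send()
node0.send()
node1.send()

# Each node sees only its own copy of the global.
print(node0.packets_sent, node1.packets_sent)  # 2 1
```

Without the per-node namespace, both nodes would share one packets_sent counter, which is the kind of interference that single-process virtualization has to prevent.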
2) Kernel layer (library operating system)
Functional realism
Similar to a library OS
  Shared library (e.g., liblinux.so)
  Replaceable (e.g., libfreebsd.so)
Mapping via glue code
  struct net_device <=> ns3::NetDevice
  Synchronize jiffies with the simulated clock
Architecture-independent code
  Minimize modifications to the original code
[Figure: glue code. jiffies/gettimeofday() synchronized with the simulated clock; struct net_device mapped to ns3::NetDevice.]
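Synchronizing jiffies with the simulated clock means that kernel time is derived from virtual time rather than from the host. A minimal sketch follows, assuming a tick rate of HZ = 100 and a nanosecond-resolution virtual clock; SimulatedClock and the two helper functions are hypothetical stand-ins, not the DCE glue code.

```python
HZ = 100                    # assumed kernel tick rate (ticks per second)
NS_PER_SEC = 1_000_000_000

class SimulatedClock:
    """Hypothetical stand-in for the simulator's virtual clock."""
    def __init__(self):
        self.now_ns = 0

    def advance(self, delta_ns):
        self.now_ns += delta_ns

def jiffies(clock):
    """Derive the tick counter from virtual time, never from wall-clock."""
    return clock.now_ns * HZ // NS_PER_SEC

def gettimeofday(clock):
    """Return (seconds, microseconds) of virtual time."""
    sec, rem_ns = divmod(clock.now_ns, NS_PER_SEC)
    return sec, rem_ns // 1000

clock = SimulatedClock()
clock.advance(1_234_567_890)   # simulator advances 1.23456789 s of virtual time
print(jiffies(clock))          # 123
print(gettimeofday(clock))     # (1, 234567)
```

Because both values depend only on the virtual clock, a slow or heavily loaded host does not change what the kernel code observes.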
3) POSIX layer
Functional realism
POSIX reimplementation
  1. Pass-through to the host library call
     e.g., strcpy(3) => (reuse)
  2. Reimplementation, if a function call involves kernel resources (i.e., system calls)
     Redirect to our kernel module
     e.g., socket(2) => dce_socket()
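The two cases above can be sketched as a dispatch table. This is an illustration only: SimKernel, make_posix_table and the table layout are hypothetical, and the only name taken from the slides is dce_socket. Python's built-in len stands in for a pure library function that can be reused from the host.

```python
class SimKernel:
    """Hypothetical simulated kernel that owns per-node socket state."""
    def __init__(self):
        self.sockets = []

    def dce_socket(self, family, sock_type):
        # Allocate the descriptor inside the simulation, not on the host.
        self.sockets.append((family, sock_type))
        return len(self.sockets) - 1

def make_posix_table(kernel):
    """Map each API name to a host function or a reimplementation."""
    return {
        "strlen": len,                # case 1: pass-through, no kernel resource
        "socket": kernel.dce_socket,  # case 2: redirected to the simulated kernel
    }

kernel = SimKernel()
posix = make_posix_table(kernel)

print(posix["strlen"]("iperf"))   # 5
fd = posix["socket"](2, 1)        # arguments are illustrative constants
print(fd, kernel.sockets)         # 0 [(2, 1)]
```

The split keeps the reimplemented surface small: only calls that touch kernel state need a simulated counterpart.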
Use cases
Reproducibility of an experiment (functional realism)
How easy is it to debug a distributed protocol? (debuggability)
Reproducibility
Replicating the MPTCP NSDI’12 experiment from the literature
Goodput measurement of TCP (3G), TCP (Wi-Fi), MPTCP (both)
with
  DCE + ns-3 (LTE/Wi-Fi)
  Linux MPTCP (same s/w)
  iperf
[Figure: topology with iperf (client), iperf (server), Tx, Rx, LTE eNodeB, LTE PGW and Wi-Fi AP.]
[Figure: average goodput (Mbps) vs. receive/send buffer size (Mbytes; 0.05, 0.1, 0.2, 0.5) for MPTCP, TCP over Wi-Fi and TCP over 3G. Caption: MPTCP used over real 3G and WiFi.]
Reproducibility (cont'd)
[Figure: average goodput (Mbps) vs. receive/send buffer size (Mbytes; 0.05, 0.1, 0.2, 0.5) in two panels. Original (NSDI’12): MPTCP, TCP over Wi-Fi, TCP over 3G. Replicate (w/ DCE): MPTCP, TCP over Wi-Fi, TCP over LTE.]
Differences
  1) No significant goodput improvement with buffer size for single-path TCP under DCE
  2) Max goodput range: 2.2 - 2.9 Mbps (DCE), 2.0 - 3.2 Mbps (NSDI)
Functional realism
Fully reproducible
Debuggability
Memory error detection
  among distributed nodes
  in a single process
  using Valgrind
==5864== Memcheck, a memory error detector
==5864== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==5864== Using Valgrind-3.6.0.SVN and LibVEX; rerun with -h for copyright info
==5864== Command: ../build/bin/ns3test-dce-vdl --verbose
==5864==
==5864== Conditional jump or move depends on uninitialised value(s)
==5864==    at 0x7D5AE32: tcp_parse_options (tcp_input.c:3782)
==5864==    by 0x7D65DCB: tcp_check_req (tcp_minisocks.c:532)
==5864==    by 0x7D63B09: tcp_v4_hnd_req (tcp_ipv4.c:1496)
==5864==    by 0x7D63CB4: tcp_v4_do_rcv (tcp_ipv4.c:1576)
==5864==    by 0x7D6439C: tcp_v4_rcv (tcp_ipv4.c:1696)
==5864==    by 0x7D447CC: ip_local_deliver_finish (ip_input.c:226)
==5864==    by 0x7D442E4: ip_rcv_finish (dst.h:318)
==5864==    by 0x7D2313F: process_backlog (dev.c:3368)
==5864==    by 0x7D23455: net_rx_action (dev.c:3526)
==5864==    by 0x7CF2477: do_softirq (softirq.c:65)
==5864==    by 0x7CF2544: softirq_task_function (softirq.c:21)
==5864==    by 0x4FA2BE1: ns3::TaskManager::Trampoline(void*) (task-manager.cc:261)
==5864==  Uninitialised value was created by a stack allocation
==5864==    at 0x7D65B30: tcp_check_req (tcp_minisocks.c:522)
==5864==
http://valgrind.org/
Debuggability
Inspect code during experiments
  among distributed nodes
  in a single process
  using gdb
Conditional breakpoint with node id (in a simulated network)
Fully reproducible (to easily catch a bug)
(gdb) b mip6_mh_filter if dce_debug_nodeid()==0
Breakpoint 1 at 0x7ffff287c569: file net/ipv6/mip6.c, line 88.
<continue>
(gdb) bt 4
#0  mip6_mh_filter (sk=0x7ffff7f69e10, skb=0x7ffff7cde8b0) at net/ipv6/mip6.c:109
#1  0x00007ffff2831418 in ipv6_raw_deliver (skb=0x7ffff7cde8b0, nexthdr=135) at net/ipv6/raw.c:199
#2  0x00007ffff2831697 in raw6_local_deliver (skb=0x7ffff7cde8b0, nexthdr=135) at net/ipv6/raw.c:232
#3  0x00007ffff27e6068 in ip6_input_finish (skb=0x7ffff7cde8b0) at net/ipv6/ip6_input.c:197
[Figure: topology with a home agent, Wi-Fi access points AP1 and AP2, a mobile node performing a handoff, and a correspondent node; ping6 traffic.]
Continuous Integration (CI)
Automated testing
  among multiple nodes
  code coverage
  regression tests
  w/ deterministic clock
Jenkins CI
  Linux kernel testing
  Userspace applications
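Regression tests benefit from a deterministic clock because two runs of the same scenario produce the same event trace. The sketch below is a toy discrete-event loop, not ns-3 or DCE code, and run_scenario is a hypothetical name. It assumes that all timing comes from a virtual clock and all randomness from a seeded generator.

```python
import heapq
import random

def run_scenario(seed, n_packets=5):
    """Toy discrete-event run driven only by virtual time and a seeded RNG."""
    rng = random.Random(seed)
    events = []  # heap of (virtual_time_ns, sequence_number)
    for seq in range(n_packets):
        heapq.heappush(events, (rng.randrange(1_000_000), seq))

    trace = []
    while events:
        now_ns, seq = heapq.heappop(events)
        trace.append((now_ns, seq))   # no wall-clock reads anywhere
    return trace

first = run_scenario(seed=42)
second = run_scenario(seed=42)
print("identical traces:", first == second)  # identical traces: True
```

A regression test can then compare the trace of a new build against a stored one, and any difference points to a behavioral change rather than to host timing noise.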
Conclusions
DCE allows
  Increased realism (functional/timing)
  Full reproducibility (through determinism)
  Debuggability of protocol implementations
Enables reproducible network experiments

                    Simulators  Emulators  DCE
Functional realism  --          ++         +
Timing realism      ++          -/+        ++
Debuggability       +           -          +
Thank you
http://bit.ly/ns-3-dce
https://github.com/direct-code-execution
Backup Slides
Direct Code Execution
Insert real network code in simulators
  Easy replication
  Functional realism (real code)
  Timing realism (time dilation)
  Reproducibility (full control)
  Scalability (slower execution, accurate results)
  Debuggability (single process)
How to use DCE?
Prepare binaries
  liblinux.so (from linux tree + patch)
  iperf (built as a PIE binary)
Write a simulation script

#!/usr/bin/python
from ns.dce import *
from ns.core import *

# create 100 simulated nodes
nodes = NodeContainer()
nodes.Create(100)

# run the Linux network stack (liblinux.so) on every node
dce = DceManagerHelper()
dce.SetNetworkStack("liblinux.so")
dce.Install(nodes)

# run the iperf binary on every node
app = DceApplicationHelper()
app.SetBinary("iperf")
app.Install(nodes)

# simulate 1000 seconds of virtual time
Simulator.Stop(Seconds(1000.0))
Simulator.Run()
Limitations of DCE
Virtual clock vs. real world
  Cannot interact with the real world
  Can use wall-clock time, but loses reproducibility
Low code generality
  Requires API-specific glue code (POSIX/kernel)
Micro-benchmarks
DCE vs Mininet-HiFi
Settings
  Xeon 2.8 GHz / 8 GB RAM
  UDP socket program
  Linear topology
1) Speed of packet processing
2) Scalability needed to ensure realistic results
[Figure: linear topology of nodes 0, 1, ..., n-1, n with udp-perf (client) and udp-perf (server); 1470 bytes/100Mbps.]
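As a rough guide to the 1470 bytes/100Mbps setting, the offered packet rate can be estimated as below. This is a back-of-the-envelope sketch that assumes 100 Mbps is the application payload rate and 1470 bytes is the UDP payload size, ignoring header overhead. The slides do not state either assumption.

```python
PAYLOAD_BYTES = 1470          # payload size per packet (from the slide)
RATE_BPS = 100_000_000        # offered rate, assumed to count payload bits only

packets_per_second = RATE_BPS / (PAYLOAD_BYTES * 8)
print(round(packets_per_second))  # 8503
```

Under these assumptions the sender offers about 8,500 packets per second of simulated or emulated time, which gives a scale for reading the packet-rate plots.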
Micro-benchmarks
[Figure: received packets per wall-clock second (pps) vs. number of hops, for Mininet-HiFi and DCE.]
[Figure: number of sent/received packets vs. number of hops: sent, Mininet received, DCE received; annotated "Packet loss".]
DCE achieves timing realism
Flexibility
Code coverage as a metric of flexibility
Settings
  mptcp_v0.86
  DCE-ed test programs (<1LoC)
Configuration of test programs
  Simple 2 paths (ipv4 iperf)
  Dual-stack 2 paths (v6only, v4/v6)
  10 different packet loss rates

File               Lines   Funcs    Branches
mptcp_ctrl.c       76.3%   86.7%    59.9%
mptcp_input.c      66.9%   85.0%    57.9%
mptcp_ipv4.c       68.0%   93.3%    43.8%
mptcp_ipv6.c       57.4%   85.0%    45.2%
mptcp_ofo_queue.c  91.2%   100.0%   89.2%
mptcp_output.c     71.2%   91.9%    58.6%
mptcp_pm.c         54.2%   71.4%    40.5%
Total              68.0%   85.9%    54.8%
POSIX API Coverage
[Figure: POSIX API coverage over time; y-axis from 0 to 500, x-axis dates 2009-09-04, 2010-03-10, 2011-05-20, 2012-01-05, 2013-04-09.]