Network Emulation using tc
Jeromy Fu
Agenda
• Why emulation
• What to be emulated
• How TC works
• Emulation Howto
• Compared with Nistnet/WANem
• Other references
Mathematical model
• Mathematical model analysis can provide important insight into the behavior of a system
• But it is sometimes difficult because too many factors are combined
Network simulator
• Network simulator is a software program that imitates the working of a computer network
• Fast and inexpensive
• Controlled and reproducible environment
Network emulator
• A network emulator emulates the network which connects end-systems, not the end-systems themselves
• Transmits actual network traffic
• Can use real code
Real world test
• Some experiments are impractical
• PlanetLab helps
• Not reproducible
Why emulator
• Complementary
[Figure: chart with axes "Likelihood of Risk Occurrence" and "Consequence of Risk Occurrence"; annotations "More reality" and "More uncontrolled"]
Trade off
What to be emulated
• Bandwidth
• More specifically, shaped (policed) bandwidth
• Capacity can’t be emulated
What to be emulated
• RTT
• Jitter
• Queuing delay
What to be emulated
• Duplication / reordering / corruption
• Loss rate
• Loss burstiness: a burst is the longest sequence beginning and ending with a loss, within which the number of consecutive received packets is less than some value Gmin
What is TC
• TC is short for Traffic Control
- Rate control
- Bandwidth management
- Active Queue Management (AQM)
- Network emulation: packet loss, reordering, duplication, delay
- QoS (DiffServ + RSVP)
- Many more …
How TC works
TC basic concepts
• Classification(Filter)
- Used to distinguish among different classes of packets and process each class in a specific way.
• Qdisc(Queue discipline)
- Decide which ones to send first, which ones to delay, and which ones to drop
- Classless/classful qdisc: qdisc without/with configurable internal subdivision
TC basic concepts
• Class
- Classes either contain other classes or have a qdisc attached
- Qdiscs and classes are intimately tied together
• Action
- Actions are attached to classifiers and invoked after a successful classification. Commonly used actions include instantly dropping, modifying, or redirecting packets.
- Works on ingress only.
TC basic concepts
TC Commands
• OPTIONS: options that apply to all subcommands
• OBJECTS: the object the tc command operates on
• COMMAND: the subcommand for each object
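As a rough illustration of how the three parts fit together, the general form is `tc [ OPTIONS ] OBJECT { COMMAND | help }`. The sketch below only reads state; the device name eth0 is a placeholder, not something taken from the slides.

```shell
# General form: tc [ OPTIONS ] OBJECT { COMMAND | help }

# OPTIONS = -s (statistics), OBJECT = qdisc, COMMAND = show
tc -s qdisc show dev eth0

# OBJECT = class, COMMAND = show
tc class show dev eth0

# OBJECT = filter, COMMAND = show
tc filter show dev eth0
```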
TC Qdisc
• Operations on qdisc: add | del | replace | change | show
• Handle: the qdisc handle, used to identify the qdisc
• root|ingress|parent CLASSID(handle): specifies the parent node
qdisc handle
• The qdisc handle is used to identify a qdisc
- {none|major[:]}
- none: generated automatically by the kernel
- major is a 16-bit hex number (without the ‘0x’ prefix)
- the trailing ‘:’ is optional
• Internally, qdisc_handle = major<<16
TC class
• A class’s parent can be a class or a qdisc; the classid must have the same major as its parent
• classid: {[major]:minor}; major and minor are both 16-bit hex numbers (without the ‘0x’ prefix), and major is optional
• Internally, classid = (major<<16)|minor
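To make the internal encoding concrete, here is a small sketch that packs a `major:minor` string into the 32-bit value described above. The helper name `tc_id_to_u32` is hypothetical, and the sketch assumes the argument always contains a colon; an empty major or minor is treated as 0.

```shell
# Hypothetical helper: pack "major:minor" (hex, no 0x prefix) into a 32-bit id.
tc_id_to_u32() {
    local major=${1%%:*}   # text before the first colon
    local minor=${1#*:}    # text after the first colon
    # (major << 16) | minor, with empty parts defaulting to 0
    printf '0x%08x\n' $(( (0x${major:-0} << 16) | 0x${minor:-0} ))
}

tc_id_to_u32 1:      # qdisc handle 1:   -> 0x00010000
tc_id_to_u32 1:10    # class id 1:10     -> 0x00010010
```

Note that `1:10` means minor 0x10 (decimal 16), because both parts are read as hex.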
TC filter
• pref (prio): priority of matching.
• protocol: the protocol on which the filter must operate, e.g. ip.
• root | classid CLASSID | handle FILTERID: specifies the qdisc or class the filter is attached to.
Classful qdisc example
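The original example figure is not recoverable from the text, so the following is only a minimal sketch of a classful setup that ties qdisc, class, and filter together. The choice of htb, the device eth0, the rates, and the port-80 match are all assumptions made for illustration.

```shell
# Root classful qdisc with handle 1:; unclassified traffic goes to class 1:20
tc qdisc add dev eth0 root handle 1: htb default 20

# Two classes under qdisc 1: (same major as the parent)
tc class add dev eth0 parent 1: classid 1:10 htb rate 1mbit
tc class add dev eth0 parent 1: classid 1:20 htb rate 5mbit

# Filter attached to qdisc 1:, sending traffic with destination port 80 to class 1:10
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
    match ip dport 80 0xffff flowid 1:10
```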
topology
• Client, Emulator and Server are in the same subnet. Add route.
topology
• Client, Emulator in one subnet, server in another subnet. Use NAT.
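The two topologies might be set up roughly as follows. This is a sketch under stated assumptions: all addresses are documentation placeholders (192.0.2.0/24 and 198.51.100.0/24), and the interface names eth0/eth1 are hypothetical.

```shell
# --- Same subnet: force traffic through the emulator with host routes ---
# Assumed addresses: client 192.0.2.10, emulator 192.0.2.1, server 192.0.2.20

# On the client: reach the server via the emulator
ip route add 192.0.2.20/32 via 192.0.2.1
# On the server: reach the client via the emulator
ip route add 192.0.2.10/32 via 192.0.2.1
# On the emulator: forward packets and do not tell hosts to bypass it
sysctl -w net.ipv4.ip_forward=1
sysctl -w net.ipv4.conf.all.send_redirects=0
sysctl -w net.ipv4.conf.eth0.send_redirects=0

# --- Two subnets: emulator does NAT toward the server side ---
# Assumed: eth0 faces the client, eth1 faces the server subnet 198.51.100.0/24

# On the client: route the server subnet via the emulator
ip route add 198.51.100.0/24 via 192.0.2.1
# On the emulator: forward and masquerade traffic leaving eth1
sysctl -w net.ipv4.ip_forward=1
iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE
```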
tc-tbf
• Tokens are added at a fixed rate
• Before a packet is sent, check if the bucket contains sufficient tokens
Bernoulli loss model
• Models uncorrelated loss events with a single “loss probability” p.
• Two states, one independent parameter.
Simple Gilbert model
• A system with “consecutive loss events”, which can be characterized by a “loss probability” (p) and a “burst duration” (1-r).
• Two states, two independent parameters.
• 1-r = p -> Bernoulli
Gilbert model
• Within the Bad state there is a probability h that a packet is transmitted.
• “Loss probability” (p), “burst duration” (1-r), and “loss density” (1-h).
• Two states, three independent parameters.
• h=0 -> Simple Gilbert
Gilbert-Elliott model
• k is the probability that a packet is transmitted while the system is in the Good state.
• In the Good state, loss events appear as “isolated” and independent of each other.
• Two states, four independent parameters.
• k=1 -> Gilbert
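The four models above differ only in which parameters are fixed, so one formula covers them all. The sketch below computes the long-run loss rate, assuming p is the Good-to-Bad transition probability and r the Bad-to-Good transition probability (the meaning these parameters have in netem's `loss gemodel`). The function name `ge_loss_rate` is hypothetical.

```shell
# Hypothetical helper: long-run loss rate of a Gilbert-Elliott chain.
# args: p  r  (1-h)  (1-k), all as probabilities in [0,1]
ge_loss_rate() {
    awk -v p="$1" -v r="$2" -v lh="$3" -v lk="$4" 'BEGIN {
        pi_bad = p / (p + r)              # stationary probability of the Bad state
        # loss = P(Good) * loss-in-Good + P(Bad) * loss-in-Bad
        printf "%.4f\n", (1 - pi_bad) * lk + pi_bad * lh
    }'
}

# Simple Gilbert (h=0, k=1): p=1%, r=25%
ge_loss_rate 0.01 0.25 1 0     # 0.0385
# Bernoulli special case (1-r = p): p=10%, r=90%
ge_loss_rate 0.1 0.9 1 0       # 0.1000

# The corresponding netem setting for the first case would look like:
#   tc qdisc add dev eth0 root netem loss gemodel 1% 25% 100% 0%
```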
4-state Markov chain
difference
tc-netem
• loss random (independent loss probability; correlation can be added) | loss state | loss gemodel | ecn
tc-netem
• crand(n) = corr*crand(n-1) + (1-corr)*rand()
• delay(n) = delay + distri(jitter, crand(n))
• duplicate, corrupt, loss, and reorder also use crand.
• A delay must be specified if reordering is needed (packets must be queued first).
• If gap is not specified, gap = 1 is used.
Distribution table
Why not loss correlation
• Correlation changes the distribution of the random values, so the resulting loss rate no longer matches the configured one
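A quick simulation illustrates the effect. The sketch below follows the crand formula from the tc-netem slide, not the kernel code itself, and treats a packet as lost when crand(n) falls below the configured loss probability; that drop rule, the seed, and the function name `netem_loss_rate` are assumptions.

```shell
# Hypothetical helper: measured loss rate under the slide's crand formula.
# args: configured loss probability, correlation, number of packets
netem_loss_rate() {
    awk -v loss="$1" -v corr="$2" -v n="$3" 'BEGIN {
        srand(1)
        last = rand()
        for (i = 0; i < n; i++) {
            # crand(n) = corr*crand(n-1) + (1-corr)*rand()
            last = corr * last + (1 - corr) * rand()
            if (last < loss) dropped++
        }
        printf "%.4f\n", dropped / n
    }'
}

netem_loss_rate 0.10 0   100000   # uncorrelated: close to the configured 0.10
netem_loss_rate 0.10 0.9 100000   # strongly correlated: far below 0.10
```

With a high correlation, the weighted average pulls crand toward 0.5 and shrinks its spread, so values below a small loss threshold become rare.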
Netem example
• tc qdisc add dev eth0 root netem delay 100ms 20ms 25% distribution normal
• tc qdisc add dev eth0 root netem loss 0.3% 25%
• tc qdisc add dev eth0 root netem duplicate 1% corrupt 0.1%
• tc qdisc add dev eth0 root netem delay 10ms reorder 25% 50% gap 5
Bandwidth emulation - tbf
• tc-tbf
• bfifo is the default child qdisc of tbf; it can be replaced by other qdiscs such as pfifo.
Bandwidth emulation - tbf
• limit - the size (in bytes) of the bfifo, the queue that stores the packets.
• rate - the bandwidth cap to enforce.
• burst/buffer/maxburst - the bucket size of the first token bucket. It should be larger than rate/HZ to achieve the specified throughput; a larger value means a larger burst when traffic starts (tokens accumulate in the large bucket).
Bandwidth emulation - tbf
• peakrate - with only one bucket, the burst rate can exceed the configured rate, so peakrate is needed to limit the burst. peakrate should be no less than rate.
• mtu/minburst - most of the time, set this to the MTU of the interface; larger values mean larger bursts.
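Putting the parameters together, a tbf invocation might look like the sketch below. The device name and all values are illustrative assumptions rather than settings from the slides.

```shell
# rate      : long-term bandwidth cap
# burst     : size of the first token bucket
# latency   : alternative to limit; bounds the time a packet may wait in the queue
# peakrate  : cap on the burst rate (second bucket), no less than rate
# minburst  : size of the peakrate bucket, usually the interface MTU
tc qdisc add dev eth0 root tbf rate 1mbit burst 10kb latency 50ms \
    peakrate 2mbit minburst 1540
```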
Policing and shaping
• Policer: rate limiting without buffering, typically set at ingress; non-conforming packets are dropped directly.
• Shaper: rate limiting with buffering, typically set at egress; packets are buffered and dropped only when the buffer is full, which adds extra queuing delay.
Policing and shaping
Shape emulation
• No delay
Bandwidth cap 1mbit/s, no burst traffic allowed: burst = max(MTU, rate/8/HZ) = max(3000, 1000000/8/100) = max(3000, 1250) = 3000.
For a queuing delay of 100ms, set latency 100ms, or set limit = qdelay*rate/8/1000 + burst = 100*1000000/8/1000 + 3000 = 15500.
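The calculation can be scripted. The sketch below evaluates burst = max(MTU, rate/8/HZ) and limit = qdelay*rate/8/1000 + burst, then prints (without running) the matching tbf command. The function name `tbf_shape_cmd` and the device eth0 are hypothetical; with the slide's numbers (3000 bytes for the MTU term, HZ = 100) the formula gives a limit of 15500 bytes.

```shell
# Hypothetical helper: print a tbf command for shaping without extra delay.
# args: dev  rate_bps  mtu_bytes  hz  qdelay_ms
tbf_shape_cmd() {
    local dev=$1 rate=$2 mtu=$3 hz=$4 qdelay=$5
    local burst=$(( rate / 8 / hz ))              # bytes sent per timer tick
    if [ "$burst" -lt "$mtu" ]; then burst=$mtu; fi
    local limit=$(( qdelay * rate / 8 / 1000 + burst ))
    echo "tc qdisc add dev $dev root tbf rate ${rate}bit burst $burst limit $limit"
}

tbf_shape_cmd eth0 1000000 3000 100 100
# prints: tc qdisc add dev eth0 root tbf rate 1000000bit burst 3000 limit 15500
```

Printing the command instead of executing it keeps the sketch usable without root; the output can be reviewed and then run by hand.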
Shape emulation
• With delay
Attach netem to the egress first, then add tbf as the child qdisc of netem.
Use the limit parameter for tbf here; with latency, tbf will not include the extra buffer needed for netem.
limit = tbf_burst + netem_qsize + tbf_qsize = max(rate/8/HZ, MTU) + delay*rate/8000 + qdelay*rate/8000.
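As a sketch of the netem-plus-tbf arrangement, the function below applies the limit formula and prints the two commands (netem at the root, tbf as its child). The function name `shape_with_delay_cmds`, the device name, and the handles 1: and 10: are assumptions chosen for illustration.

```shell
# Hypothetical helper: print netem + child tbf commands with a combined limit.
# args: dev  rate_bps  mtu_bytes  hz  delay_ms  qdelay_ms
shape_with_delay_cmds() {
    local dev=$1 rate=$2 mtu=$3 hz=$4 delay=$5 qdelay=$6
    local burst=$(( rate / 8 / hz ))
    if [ "$burst" -lt "$mtu" ]; then burst=$mtu; fi
    # limit = tbf_burst + netem_qsize + tbf_qsize
    local limit=$(( burst + delay * rate / 8000 + qdelay * rate / 8000 ))
    echo "tc qdisc add dev $dev root handle 1: netem delay ${delay}ms"
    echo "tc qdisc add dev $dev parent 1:1 handle 10: tbf rate ${rate}bit burst $burst limit $limit"
}

# 1 mbit/s, 100ms emulated delay, 100ms queuing delay:
# limit = 3000 + 12500 + 12500 = 28000 bytes
shape_with_delay_cmds eth0 1000000 3000 100 100 100
```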
Police emulation
• A policer drops non-conforming packets directly and has no buffer.
• It can be emulated by tc-tbf with a very small buffer.
• tc-tbf uses bfifo as the default child qdisc. Its queue length (in bytes) is set automatically from 'limit' or 'latency', which ensures the queue length is no less than the token bucket depth (and so introduces queuing delay).
Police emulation
• A workaround is to replace the bfifo with pfifo.
• You can also use police on ingress. tc-police also uses a token bucket to cap bandwidth, but it has no queue of its own, so no queuing delay is introduced.
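Both options might be configured roughly as follows. The device name, rates, and sizes are illustrative assumptions; in particular, addressing the tbf child as parent 1:1 and the one-packet pfifo limit are choices made for this sketch, not settings from the slides.

```shell
# Option 1: tbf with its default bfifo replaced by a very short pfifo
tc qdisc add dev eth0 root handle 1: tbf rate 1mbit burst 3000 limit 3000
tc qdisc add dev eth0 parent 1:1 handle 10: pfifo limit 1

# Option 2: police on ingress (token bucket, no queue, excess is dropped)
tc qdisc add dev eth0 handle ffff: ingress
tc filter add dev eth0 parent ffff: protocol ip prio 1 u32 \
    match u32 0 0 police rate 1mbit burst 3000 drop flowid :1
```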
Burst emulation
• Most ADSL links allow some burst traffic. This kind of burst is caused by a large token bucket, which has accumulated many tokens by the time transmission starts.
• To emulate the burst, we only need to tune the 'burst' parameter.
Burst emulation
• For example, to allow 2mbit/s during the first second on a 1mbit/s link: 1m*t + burst = 2m*t => burst = 1m*t; with t = 1s, burst = 1mbit = 125 kbytes
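The same arithmetic as a small sketch: the bucket must hold the difference between the burst rate and the sustained rate for the burst duration, converted to bytes. The function name `burst_bytes` and the tbf line in the comment (device, latency) are assumptions.

```shell
# Hypothetical helper: bucket size in bytes for a given initial burst.
# args: rate_bps  burst_rate_bps  burst_seconds
burst_bytes() {
    echo $(( ($2 - $1) * $3 / 8 ))
}

burst_bytes 1000000 2000000 1    # 125000

# A matching tbf setting might then be:
#   tc qdisc add dev eth0 root tbf rate 1mbit burst 125000 latency 100ms
```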
Burst emulation
• What if delay is added? Any problem?
• The extra burst will use the netem buffer and cause extra queuing delay.
• Separate the buffers using ifb (Intermediate Functional Block device).
• If traffic is redirected to an ifb device, it is returned to the original point when dequeued from ifb.
• A qdisc can be added to the ifb device.
Burst emulation
• Using ifb
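The original ifb configuration is not recoverable from the text, so the following is only one possible layout, assuming a forwarding emulator with a single interface eth0: incoming traffic is redirected to ifb0, where tbf applies the rate cap and burst, and netem adds delay on egress, so the two qdiscs keep separate buffers. Device names and values are placeholders.

```shell
# Create and bring up one ifb device
modprobe ifb numifbs=1
ip link set dev ifb0 up

# Redirect everything arriving on eth0 to ifb0
tc qdisc add dev eth0 handle ffff: ingress
tc filter add dev eth0 parent ffff: protocol ip u32 match u32 0 0 \
    action mirred egress redirect dev ifb0

# Rate cap with burst on ifb0 (its own queue)
tc qdisc add dev ifb0 root tbf rate 1mbit burst 125kb latency 100ms

# Delay on the egress side of eth0 (separate netem queue)
tc qdisc add dev eth0 root netem delay 100ms
```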
Compared with Nistnet/WANem
How Nistnet works
• Bandwidth limitation is implemented by adding delay, as if the packet went through a bottleneck link.
• Determine the amount of time to delay a packet. This is the maximum of two quantities:
1. Probabilistic packet delay time
2. Bandwidth-limitation delay time
How Nistnet works
probdelay = correlatedtabledist(&tableme->ltEntry.lteIDelay);
if (hitme->hitreq.bandwidth) {
    fixed_gettimeofday(&our_time);
    // last queue delay
    bandwidthdelay = timeval_diff(&hitme->next_packet, &our_time);
    if (bandwidthdelay < 0) {
        bandwidthdelay = 0;
        hitme->next_packet = our_time;
    }
    // add transmission delay
    packettime = (long)skb->len*(MILLION/hitme->hitreq.bandwidth)
        + ((long)skb->len*(MILLION%hitme->hitreq.bandwidth)
           + hitme->hitreq.bandwidth/2)/hitme->hitreq.bandwidth;
    timeval_add(&hitme->next_packet, packettime);
    bandwidthdelay += packettime;
}
delay = probdelay > bandwidthdelay ? probdelay : bandwidthdelay;
Nistnet drawbacks
• The bandwidth model does not emulate a real link.
• Queuing delay and one-way delay are combined.
• Buffer size can only be tuned through DRD.
• Only ip:port filters are supported.
• The 4-state loss burst model is not supported.
• Only the DRD (Derivative Random Drop) AQM is supported.
WANem
• WANem is just a web UI that uses tc underneath.
WANem
• Web UI, easy to use
• Adds connection disconnect
• Queue size needs patches to work
• No burst settings
• No settings for the GE or 4-state loss model
• Queuing delay can be controlled directly
Reference
• NEWT (Network Emulator for Windows Toolkit) in vs2010
• Introducing True Network Emulation in Visual Studio 2010
• Network Emulator Toolkit
• dummynet
• Nistnet FAQ