Post on 04-Jun-2018
transcript
8/13/2019 TCP and congestion control
Introduction to TCP
Communication abstraction: Reliable
Ordered
Point-to-point
Byte-stream
Full duplex
Flow and congestion controlled
Protocol implemented entirely at the ends
Fate sharing
Sliding window with cumulative acks
Ack field contains last in-order packet received
Duplicate acks sent when out-of-order packet received
Evolution of TCP
[Timeline, 1975 to 1990]
1974: TCP described by Vint Cerf and Bob Kahn in IEEE Trans Comm
1975: Three-way handshake, Raymond Tomlinson, in SIGCOMM 75
1982: TCP & IP, RFC 793 & 791
1983: BSD Unix 4.2 supports TCP/IP
1984: Nagle's algorithm to reduce overhead of small packets; predicts congestion collapse
1986: Congestion collapse observed
1987: Karn's algorithm to better estimate round-trip time
1988: Van Jacobson's algorithms: congestion avoidance and congestion control (most implemented in 4.3BSD Tahoe)
1990: 4.3BSD Reno: fast retransmit, delayed ACKs
TCP Through the 1990s
[Timeline, 1993 to 1996]
1993: TCP Vegas (Brakmo et al): real congestion avoidance
1994: ECN (Floyd): Explicit Congestion Notification
1994: T/TCP (Braden): Transaction TCP
1996: SACK TCP (Floyd et al): Selective Acknowledgement
1996: Hoe: Improving TCP startup
1996: FACK TCP (Mathis et al): extension to SACK
Timeout-based Recovery
Wait at least one RTT before retransmitting
Importance of accurate RTT estimators:
Low RTT estimate: unneeded retransmissions
High RTT estimate: poor throughput
RTT estimator must adapt to change in RTT
But not too fast, or too slow!
Spurious timeouts violate the conservation-of-packets principle: more than a window's worth of packets in flight
Initial Round-trip Estimator
Round trip times exponentially averaged:
New RTT = a * (old RTT) + (1 - a) * (new sample)
Recommended value for a: 0.8 - 0.9
0.875 for most TCPs
Retransmit timer set to b * RTT, where b = 2
Every time timer expires, RTO exponentially backed-off
Like Ethernet
Not good at preventing spurious timeouts
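The estimator on this slide can be sketched in a few lines of Python. This is a minimal illustration, not any particular stack's code: the class and method names are hypothetical, and resetting the backoff when a fresh sample arrives is an assumption the slide does not state.

```python
class SimpleRttEstimator:
    """Exponentially averaged RTT with a fixed-multiplier retransmit timer."""

    def __init__(self, first_sample, a=0.875, b=2.0):
        self.a = a              # weight on the old estimate (0.8 - 0.9 recommended)
        self.b = b              # RTO multiplier, b = 2 on the slide
        self.rtt = first_sample
        self.backoff = 1        # doubles each time the timer expires

    def on_sample(self, sample):
        # New RTT = a * (old RTT) + (1 - a) * (new sample)
        self.rtt = self.a * self.rtt + (1 - self.a) * sample
        self.backoff = 1        # assumption: a fresh sample resets the backoff
        return self.rtt

    def on_timeout(self):
        # Exponential backoff of the RTO, like Ethernet
        self.backoff *= 2

    def rto(self):
        return self.b * self.rtt * self.backoff
```

With a first sample of 0.1 s and a second sample of 0.2 s, the estimate moves only to 0.1125 s, which shows how slowly a = 0.875 reacts to a sudden change.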
Jacobson's Retransmission Timeout
Key observation:
At high loads round trip variance is high
Solution: Base RTO on RTT and standard deviation of RTT
rttvar = c * dev + (1 - c) * rttvar
dev = linear deviation
Inappropriately named: actually smoothed linear deviation
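A hedged sketch of this estimator follows. The slide gives only the rttvar update; the gain c = 0.25, the combination RTO = srtt + 4 * rttvar, and the initial rttvar of half the first sample are assumptions borrowed from RFC 6298, and the class name is hypothetical.

```python
class JacobsonRttEstimator:
    """Smoothed RTT plus smoothed linear deviation."""

    def __init__(self, first_sample, a=0.875, c=0.25, k=4.0):
        self.a = a                        # weight on the old smoothed RTT
        self.c = c                        # gain for the deviation (assumed 1/4)
        self.k = k                        # assumed multiplier: RTO = srtt + k * rttvar
        self.srtt = first_sample
        self.rttvar = first_sample / 2    # assumed initial value (RFC 6298)

    def on_sample(self, sample):
        dev = abs(sample - self.srtt)     # linear deviation of this sample
        # rttvar = c * dev + (1 - c) * rttvar
        self.rttvar = self.c * dev + (1 - self.c) * self.rttvar
        self.srtt = self.a * self.srtt + (1 - self.a) * sample
        return self.rto()

    def rto(self):
        return self.srtt + self.k * self.rttvar
```

Because the deviation term reacts to a single outlier, the timeout grows quickly when RTT variance rises, which is the behavior the fixed b * RTT timer lacks.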
TCP Flavors
Tahoe, Reno, Vegas differ in data-driven reliability
TCP Tahoe (distributed with 4.3BSD Unix)
Original implementation of Van Jacobson's mechanisms (VJ paper)
Includes:
Slow start
Congestion avoidance
Fast retransmit
Fast Retransmit
What are duplicate acks (dupacks)?
Repeated acks for the same sequence
When can duplicate acks occur?
Loss
Packet re-ordering
Window update: advertisement of new flow control window
Assume re-ordering is infrequent and not of large magnitude
Use receipt of 3 or more duplicate acks as indication of loss
Don't wait for timeout to retransmit packet
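The dupack rule can be illustrated with a small counter. This sketch assumes packet-numbered sequence space and the convention from the introduction slide that the ack field carries the last in-order packet received, so the missing packet is ack + 1. The class name is hypothetical.

```python
DUPACK_THRESHOLD = 3  # "3 or more duplicate acks" from the slide


class DupAckDetector:
    """Counts repeated acks and signals one fast retransmit per loss."""

    def __init__(self):
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack_no):
        """Return the packet number to retransmit, or None."""
        if ack_no == self.last_ack:
            self.dup_count += 1
            # Fire exactly once, when the third duplicate arrives.
            if self.dup_count == DUPACK_THRESHOLD:
                return ack_no + 1   # assumed: ack = last in-order packet received
        else:
            self.last_ack = ack_no
            self.dup_count = 0
        return None
```

For the ack stream 1, 2, 2, 2, 2, 2 the detector stays silent until the third duplicate of ack 2 and then asks for packet 3 once.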
Fast Retransmit
[Figure: sequence number vs. time; a loss (X) is followed by duplicate acks and then the retransmission]
Multiple Losses
[Figure: sequence number vs. time; several losses (X) in one window, duplicate acks, and a single retransmission]
Now what?
Tahoe
[Figure: sequence number vs. time for Tahoe with several losses (X) in one window]
TCP Reno (1990)
All mechanisms in Tahoe
Addition of fast-recovery
Opening up congestion window after fast retransmit
Delayed acks
Header prediction
Implementation designed to improve performance
Has common-case code inlined
With multiple losses, Reno typically times out because it does not receive enough duplicate acknowledgements
Reno
[Figure: sequence number vs. time for Reno with several losses (X) in one window]
Now what? Timeout
NewReno
The ack that arrives after retransmission
(partial ack) should indicate that a second
loss occurred
When does NewReno time out?
When there are fewer than three dupacks for
first loss
When partial ack is lost
How fast does it recover losses?
One per RTT
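The "one per RTT" recovery rate can be seen in a toy simulation. This is only a sketch under simplifying assumptions: packets are numbered, the ack carries the last in-order packet received, retransmissions are never lost, and the function name is hypothetical.

```python
def newreno_recover(lost, highest_sent):
    """Simulate NewReno recovery; returns one (retransmitted, ack, partial) tuple per RTT."""
    missing = sorted(lost)
    rounds = []
    while missing:
        pkt = missing.pop(0)    # retransmit the first hole
        # The cumulative ack advances to just before the next hole,
        # or to the end of the window if no holes remain.
        ack = missing[0] - 1 if missing else highest_sent
        partial = ack < highest_sent    # partial ack: another loss is outstanding
        rounds.append((pkt, ack, partial))
    return rounds
```

With packets 3, 5 and 8 lost out of 10, recovery takes three rounds: two partial acks (4 and 7) each reveal the next hole, and the final ack of 10 ends recovery.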
NewReno
[Figure: sequence number vs. time for NewReno with several losses (X) in one window]
Now what? Partial ack recovery
SACK
Basic problem is that cumulative acks
provide little information
Ack for just the packet received
What if acks are lost? Carry cumulative ack also
Not used
Bitmask of packets received
Selective acknowledgement (SACK)
How to deal with reordering
Congestion Collapse
Definition: Increase in network load results in decrease of useful work done
Many possible causes
Spurious retransmissions of packets still in flight
Classical congestion collapse
How can this happen with packet conservation
Solution: better timers and TCP congestion control
Undelivered packets
Packets consume resources and are dropped elsewhere in network
Solution: congestion control for ALL traffic
Other Congestion Collapse Causes
Fragments
Mismatch of transmission and retransmission units
Solutions
Make network drop all fragments of a packet (early packet discard in ATM)
Do path MTU discovery
Control traffic
Large percentage of traffic is for control
Headers, routing messages, DNS, etc.
Stale or unwanted packets
Packets that are delayed on long queues
Push data that is never used
Where to Prevent Collapse?
Can end hosts prevent problem?
Yes, but must trust end hosts to do right thing
E.g., sending host must adjust amount of data it puts in the network based on detected congestion
Can routers prevent collapse?
No, not all forms of collapse
Doesn't mean they can't help
Sending accurate congestion signals
Isolating well-behaved from ill-behaved
sources
Congestion Control and Avoidance
A mechanism which:
Uses network resources efficiently
Preserves fair network resource allocation
Prevents or avoids collapse
Congestion collapse is not just a theory
Has been frequently observed in many
networks
TCP Congestion Control
Motivated by ARPANET congestion collapse
Underlying design principle: packet conservation
At equilibrium, inject packet into network only when one is removed
Basis for stability of physical systems
Why was this not working?
Connection doesn't reach equilibrium
Spurious retransmissions
Resource limitations prevent equilibrium
TCP Congestion Control - Solutions
Reaching equilibrium
Slow start
Eliminates spurious retransmissions
Accurate RTO estimation
Fast retransmit
Adapting to resource availability
Congestion avoidance
TCP Congestion Control
Changes to TCP motivated by
ARPANET congestion collapse
Basic principles
AIMD
Packet conservation
Reaching steady state quickly
ACK clocking
AIMD
Distributed, fair and efficient
Packet loss is seen as sign of congestion and results in a multiplicative rate decrease
Factor of 2
TCP periodically probes for available bandwidth by increasing its rate
[Figure: rate vs. time]
Implementation Issue
Operating system timers are very coarse: how to pace packets out smoothly?
Implemented using a congestion window that limits how
much data can be in the network.
TCP also keeps track of how much data is in transit
Data can only be sent when the amount of outstanding
data is less than the congestion window.
The amount of outstanding data is increased on a send and decreased on ack
(last sent - last acked) < congestion window
Window limited by both congestion and buffering
Sender's maximum window = Min (advertised window, cwnd)
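The send test on this slide reduces to one comparison. A minimal sketch, with hypothetical function and parameter names and all quantities in the same unit (packets or bytes):

```python
def can_send(last_sent, last_acked, cwnd, advertised_window):
    """True if one more unit of data may be put into the network."""
    # Sender's maximum window = min(advertised window, cwnd)
    window = min(advertised_window, cwnd)
    # Outstanding data grows on a send and shrinks on an ack.
    outstanding = last_sent - last_acked
    return outstanding < window
```

With 4 units outstanding and cwnd = 8, the sender may transmit when the receiver advertises 16, but is blocked when the receiver advertises only 4, even though the congestion window would allow more.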
Congestion Avoidance
If loss occurs when cwnd = W
Network can handle 0.5W ~ W segments
Set cwnd to 0.5W (multiplicative decrease)
Upon receiving ACK
Increase cwnd by (1 packet)/cwnd
What is 1 packet? 1 MSS worth of bytes
After cwnd packets have passed by: approximately increase of 1 MSS
Implements AIMD
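The two rules can be written directly, here with cwnd measured in packets rather than bytes to keep the arithmetic visible. The function names are hypothetical, and `one_rtt` assumes one ack per packet with no delayed acks.

```python
def on_ack(cwnd):
    """Additive increase: each ack grows cwnd by (1 packet) / cwnd."""
    return cwnd + 1.0 / cwnd


def on_loss(cwnd):
    """Multiplicative decrease: set cwnd to 0.5 W."""
    return cwnd / 2


def one_rtt(cwnd):
    """Apply one window's worth of acks (assumes one ack per packet)."""
    for _ in range(int(cwnd)):
        cwnd = on_ack(cwnd)
    return cwnd
```

Starting from cwnd = 10, one round trip of acks raises the window to just under 11, which is the "approximately 1 MSS per RTT" growth noted above.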
Congestion Avoidance Sequence Plot
[Figure: sequence number vs. time, showing packets and acks]
Congestion Avoidance Behavior
[Figure: congestion window vs. time; packet loss plus timeout cuts the congestion window and rate, followed by grabbing back bandwidth]
Packet Conservation
At equilibrium, inject packet into network only
when one is removed
Sliding window and not rate controlled
But still need to avoid sending burst of packets: would overflow links
Need to carefully pace out packets
Helps provide stability
Need to eliminate spurious retransmissions
Accurate RTO estimation
Better loss recovery techniques (e.g. fast retransmit)
TCP Packet Pacing
Congestion window helps to pace the
transmission of data packets
In steady state, a packet is sent when an ack is received
Data transmission remains smooth, once it is smooth
Self-clocking behavior
[Figure: self-clocking between sender and receiver; labels Pb, Pr, Ar, Ab, As]
Reaching Steady State
Doing AIMD is fine in steady state but
slow
How does TCP know what is a good initial
rate to start with?
Should work both for a CDPD (10s of Kbps or less) and for supercomputer links (10 Gbps and growing)
Quick initial phase to help get up to speed (slow start)
Slow Start Packet Pacing
How do we get this clocking behavior to start?
Initialize cwnd = 1
Upon receipt of every ack, cwnd = cwnd + 1
Implications
Window actually increases to W in RTT * log2(W)
Can overshoot window and cause packet loss
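A short simulation shows both implications. It is a sketch that assumes one ack per packet, no loss, and a hypothetical function name; cwnd is in packets.

```python
def slow_start_rtts(target_window):
    """Return (round trips taken, final cwnd) to reach target_window from cwnd = 1."""
    cwnd = 1
    rtts = 0
    while cwnd < target_window:
        acks = cwnd             # one ack comes back per packet sent this round
        for _ in range(acks):
            cwnd += 1           # cwnd = cwnd + 1 on every ack
        rtts += 1
    return rtts, cwnd
```

Reaching W = 64 takes 6 round trips, matching log2(64). A target of 100 takes 7 round trips and ends at cwnd = 128, which illustrates how the doubling can overshoot the window the path can actually hold.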
TCP Saw Tooth
[Figure: congestion window vs. time; initial slow start, then fast retransmit and recovery; slow start to pace packets; timeouts may still occur]
TCP Modeling
Given the congestion behavior of TCP can we
predict what type of performance we should get?
What are the important factors?
Loss rate
Affects how often window is reduced
RTT
Affects increase rate and relates BW to window
RTO
Affects performance during loss recovery
MSS
Affects increase rate
Overall TCP Behavior
[Figure: window vs. time]
Let's concentrate on steady-state behavior with no timeouts and perfect loss recovery
Simple TCP Model
Some additional assumptions
Fixed RTT
No delayed ACKs
In steady state, TCP loses a packet each time window reaches W packets
Window drops to W/2 packets
Each RTT window increases by 1 packet: W/2 * RTT before next loss
BW = MSS * avg window / RTT = MSS * (W + W/2) / (2 * RTT) = .75 * MSS * W / RTT
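The bandwidth expression can be checked numerically. A minimal sketch with a hypothetical function name; the result is in bytes per second when MSS is in bytes and RTT in seconds.

```python
def sawtooth_bandwidth(w_packets, mss_bytes, rtt_s):
    """Average throughput of the W/2 -> W sawtooth, in bytes per second."""
    # The window ramps linearly from W/2 to W, so its average is (W + W/2) / 2.
    avg_window = (w_packets + w_packets / 2) / 2
    return mss_bytes * avg_window / rtt_s
```

For W = 20 packets, MSS = 1000 bytes and RTT = 0.1 s, the average window is 15 packets and the model gives about 150,000 bytes per second, the same as .75 * MSS * W / RTT.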
Simple Loss Model
What was the loss rate?
Packets transferred = (.75 W/RTT) * (W/2 * RTT) = 3W^2/8
1 packet lost: loss rate = p = 8/(3W^2)
W = sqrt(8 / (3 * p))
BW = .75 * MSS * W / RTT
BW = MSS / (RTT * sqrt(2p/3))
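The two forms of the bandwidth formula should agree once W is expressed through the loss rate p. The sketch below checks that; the function names and the sample values are illustrative assumptions.

```python
import math


def window_from_loss(p):
    """Peak window W (packets) for loss rate p, from p = 8 / (3 W^2)."""
    return math.sqrt(8 / (3 * p))


def bandwidth_from_loss(mss_bytes, rtt_s, p):
    """BW = MSS / (RTT * sqrt(2p/3)), in bytes per second."""
    return mss_bytes / (rtt_s * math.sqrt(2 * p / 3))


# Illustrative values only: 1460-byte MSS, 100 ms RTT, 1% loss.
w = window_from_loss(0.01)
bw_direct = 0.75 * 1460 * w / 0.1
bw_closed = bandwidth_from_loss(1460, 0.1, 0.01)
```

Substituting W = sqrt(8/(3p)) into .75 * MSS * W / RTT gives MSS * sqrt(3/(2p)) / RTT, which is the closed form, so `bw_direct` and `bw_closed` match up to rounding.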
TCP Vegas Slow Start
ssthresh estimation via packet pair
Only increase every other RTT
Tests new window size before increasing
Packet Pair
What would happen if a source transmitted
a pair of packets back-to-back?
Spacing of these packets would be
determined by bottleneck link
Basis for ack clocking in TCP
What type of bottleneck router behavior would affect this spacing?
Queuing and scheduling
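If the bottleneck sets the spacing, the spacing in turn gives a bandwidth estimate. The sketch below assumes a FIFO bottleneck, no cross traffic between the two packets, and equal-sized packets; the function name is hypothetical.

```python
def bottleneck_bandwidth(packet_size_bytes, arrival_gap_s):
    """Estimate bottleneck bandwidth in bits per second from packet-pair spacing."""
    # The second packet leaves the bottleneck one transmission time after
    # the first, so gap = packet size / bottleneck rate.
    return packet_size_bytes * 8 / arrival_gap_s
```

Two 1500-byte packets arriving 1.2 ms apart suggest a bottleneck of roughly 10 Mbps. Queuing or scheduling that inserts other traffic between the pair widens the gap and lowers the estimate.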
Packet Pair in Practice
Most Internet routers are FIFO/Drop-Tail
Easy to measure link bandwidths
Bprobe, pathchar, pchar, nettimer, etc.
How can this be used?
NewReno and Vegas use it to initialize ssthresh
Prevents large overshoot of available bandwidth
Want a high estimate: otherwise will take a long time in linear growth to reach desired bandwidth
TCP Vegas Congestion Avoidance
Only reduce cwnd if packet sent after last such action
Reaction per congestion episode not per loss
Congestion avoidance vs. control
Use change in observed end-to-end delay to
detect onset of congestion
Compare expected to actual throughput
Expected = window size / round trip time
Actual = acks / round trip time
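The comparison can be sketched as below. The slide does not say which round trip time each rate uses; this example assumes the expected rate uses the smallest RTT seen so far (an uncongested baseline) and the actual rate uses the currently measured RTT. The function and parameter names are hypothetical.

```python
def vegas_rates(window_packets, base_rtt_s, acked_packets, measured_rtt_s):
    """Return (expected, actual, expected - actual) in packets per second."""
    expected = window_packets / base_rtt_s      # assumed: baseline (minimum) RTT
    actual = acked_packets / measured_rtt_s     # acks seen over the measured RTT
    return expected, actual, expected - actual
```

With a 20-packet window, a 100 ms baseline and a 125 ms measured RTT, the expected rate is 200 packets/s, the actual rate is 160 packets/s, and the gap of 40 packets/s indicates queue build-up before any loss has occurred.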
TCP Vegas
Fine grain timers
Check RTO every time a dupack is received or for partial ack
If RTO expired, then re-xmit packet
Standard Reno only checks at 500ms
Allows packets to be retransmitted earlier
Not the real source of performance gain
Allows retransmission of packet that would have timed out
Small windows/loss of most of window
Real source of performance gain
Shouldn't comparison be against NewReno/SACK?
TCP Vegas
Flaws
Sensitivity to delay variation
Paper did not do great job of explaining where
performance gains came from
Some ideas have been incorporated into
more recent implementations
Overall
Some very intriguing ideas
Controversies killed it