Slide Set 13: TCP
In this set....
• TCP Connection Termination
• TCP State Transition Diagram
• Flow Control
• How does TCP control its sliding window ?
Connection Termination
• Note that after the server receives the FIN (the client is then in FIN_WAIT_1), the server may still have messages to send -- thus, the connection is not yet closed.
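[Figure: connection termination timeline. After the DATA/ACK exchange, the client does an Active Close and sends FIN, Seq Num = M (client in FIN_WAIT1). The server replies ACK = M+1 (server in CLOSE_WAIT, client in FIN_WAIT2). The server later sends FIN, Seq Num = N (server in LAST_ACK). The client replies ACK = N+1 and the server goes to CLOSED.]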
TCP State Transitions
[Figure: TCP state transition diagram. States: CLOSED, LISTEN, SYN_RCVD, SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT, CLOSE_WAIT, LAST_ACK. Arrows are labelled event/action: Passive open, Active open/SYN, Send/SYN, SYN/SYN + ACK, SYN + ACK/ACK, ACK, Close, Close/FIN, FIN/ACK, ACK + FIN/ACK, and "Timeout after two segment lifetimes" (TIME_WAIT to CLOSED).]
• Note: Retransmissions and data packet/ACK exchanges are not represented in the state transition diagram.
• The final wait time (TIME_WAIT) is needed in case the last ACK is lost: the side that sent it must still be around to ACK a retransmitted FIN.
• Simultaneous connection inceptions/terminations are possible !
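The state diagram can be read as a lookup table from (state, event) to (action, next state). The sketch below is a minimal illustration, not a full TCP implementation: it covers only a subset of the arrows, the event strings are made-up names, and the pairing of labels to arrows is assumed to follow the standard TCP state diagram.

```python
# (state, event) -> (segment sent in response, next state).
# Subset of the diagram; event names are illustrative assumptions.
TRANSITIONS = {
    ("CLOSED", "passive open"): (None, "LISTEN"),
    ("CLOSED", "active open"): ("SYN", "SYN_SENT"),
    ("LISTEN", "rcv SYN"): ("SYN+ACK", "SYN_RCVD"),
    ("SYN_SENT", "rcv SYN+ACK"): ("ACK", "ESTABLISHED"),
    ("SYN_RCVD", "rcv ACK"): (None, "ESTABLISHED"),
    ("ESTABLISHED", "close"): ("FIN", "FIN_WAIT_1"),
    ("ESTABLISHED", "rcv FIN"): ("ACK", "CLOSE_WAIT"),
    ("FIN_WAIT_1", "rcv ACK"): (None, "FIN_WAIT_2"),
    ("FIN_WAIT_1", "rcv FIN"): ("ACK", "CLOSING"),  # simultaneous close
    ("FIN_WAIT_2", "rcv FIN"): ("ACK", "TIME_WAIT"),
    ("CLOSING", "rcv ACK"): (None, "TIME_WAIT"),
    ("TIME_WAIT", "timeout 2*MSL"): (None, "CLOSED"),
    ("CLOSE_WAIT", "close"): ("FIN", "LAST_ACK"),
    ("LAST_ACK", "rcv ACK"): (None, "CLOSED"),
}


def run(events, state="CLOSED"):
    """Apply events in order and return the list of states visited."""
    path = [state]
    for event in events:
        _action, state = TRANSITIONS[(state, event)]
        path.append(state)
    return path


# Client: active open, then active close.
client_path = run(["active open", "rcv SYN+ACK", "close",
                   "rcv ACK", "rcv FIN", "timeout 2*MSL"])
# Server: passive open, then passive close.
server_path = run(["passive open", "rcv SYN", "rcv ACK",
                   "rcv FIN", "close", "rcv ACK"])
print(" -> ".join(client_path))
print(" -> ".join(server_path))
```

An event that is not valid in the current state raises a KeyError here; a real stack would instead reset the connection or ignore the segment.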
A Simpler View of the Client Side
[Figure: client-side state sequence. CLOSED --send SYN--> SYN_SENT --rcv. SYN+ACK, send ACK--> ESTABLISHED --send FIN--> FIN_WAIT1 --rcv. ACK, send nothing--> FIN_WAIT2 --rcv. FIN, send ACK--> TIME_WAIT --120 secs--> CLOSED.]
Simpler Server Model
[Figure: server-side state sequence. CLOSED --passive OPEN, create listen socket--> LISTEN --rcv. SYN, send SYN+ACK--> SYN_RCVD --rcv. ACK--> ESTABLISHED --rcv. FIN, send ACK--> CLOSE_WAIT --send FIN--> LAST_ACK --rcv. ACK, send nothing--> CLOSED.]
More about Termination
• Applications on both sides have to “independently” close their half of the connection.
• If one side closes, this means that this side has no more data to send, but it is still willing to receive.
• In the TIME_WAIT state, a client waits for 2 X MSL (typically). During this time the socket cannot be reused.
– If the ACK is lost, a new FIN may be forthcoming, and this second FIN may be delayed.
– Thus, if a new connection reuses the same addresses and port numbers, this delayed FIN would initiate termination of the later connection !
Sequence Numbers and ACKs
• How does one set sequence numbers ?
– Implicitly, every byte in the stream is numbered.
– If we have 500000 bytes and each segment = MSS = 1000 bytes, SN of 1st segment = 0, SN of 2nd segment = 1000, and so on.
• Note that the ACK number is the number that the receiving host puts in -- it indicates the “next” byte that it is expecting.
• ACKs are cumulative -- an ACK covers all the bytes received in order up to that point.
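The numbering rule and the cumulative ACK can be sketched as below. This is a simplified illustration: it assumes the first byte is numbered 0, that every segment carries exactly MSS bytes, and the function names are made up for the example.

```python
def segment_sequence_numbers(total_bytes, mss):
    """Sequence number of each segment = number of its first byte."""
    return list(range(0, total_bytes, mss))


def cumulative_ack(received_sns, mss):
    """Return the ACK number: the next in-order byte the receiver expects.

    received_sns is the set of segment sequence numbers that have arrived,
    possibly out of order.
    """
    next_expected = 0
    while next_expected in received_sns:
        next_expected += mss
    return next_expected


sns = segment_sequence_numbers(500000, 1000)
print(sns[:3])                                 # first three sequence numbers
# Segments 0 and 1000 arrived, 2000 is missing, 3000 arrived out of order.
print(cumulative_ack({0, 1000, 3000}, 1000))   # ACK stays at the gap
```

In the second call the ACK number stays at 2000 even though segment 3000 has arrived, because a cumulative ACK cannot skip over the missing bytes.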
Flow Control
• Flow control ensures that the sender does not send at a rate that causes the receiver buffer to overflow.
• Note that flow control is “end-to-end”.
Buffers at End Hosts
• Sending buffer
– Maintains data sent but not ACKed.
– Data written by application but not sent.
• Receive buffer
– Data that arrives out of order.
– Data that is in correct order but not yet read by application.
Sender Side View
• For now, let us forget SN wrap around.
• Three pointers are maintained: LastByteAcked, LastByteSent, LastByteWritten.
• LastByteAcked ≤ LastByteSent
• LastByteSent ≤ LastByteWritten
[Figure (a): send buffer between the sending application and TCP, with pointers LastByteAcked, LastByteSent, LastByteWritten.]
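Receiver Side View
• Three pointers maintained again: LastByteRead, NextByteExpected, LastByteRcvd.
• LastByteRead < NextByteExpected
• NextByteExpected ≤ LastByteRcvd + 1
[Figure: receive buffer between TCP and the receiving application, with pointers LastByteRead, NextByteExpected, LastByteRcvd.]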
How is Flow Control done?
• Receiver “advertises” a window size to the sender based on the buffer size allocated for the connection.
– Remember the “Advertised Window” field in the TCP header ?
• Sender cannot have more than “Advertised Window” bytes of unacknowledged data.
• Remember -- buffers are of finite size, i.e., there is a MaxRcvBuffer and a MaxSendBuffer.
Setting the Advertised Window
• On the TCP receive side, clearly, LastByteRcvd - LastByteRead ≤ MaxRcvBuffer
• Thus, it advertises the space left in the buffer, i.e.,
Advertised Window = MaxRcvBuffer - (LastByteRcvd - LastByteRead)
• As more data arrives, i.e., bytes are received faster than the application reads them, LastByteRcvd increases and hence Advertised Window reduces.
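A small numeric sketch of the receiver-side rule follows. The buffer size and byte positions are made-up values, and the function name is only for illustration.

```python
def advertised_window(max_rcv_buffer, last_byte_rcvd, last_byte_read):
    """Free space left in the receive buffer, in bytes."""
    buffered = last_byte_rcvd - last_byte_read
    if buffered > max_rcv_buffer:
        raise ValueError("receive buffer overflow")
    return max_rcv_buffer - buffered


MAX_RCV_BUFFER = 4096  # hypothetical buffer size

# 3000 bytes received so far, application has read the first 1000.
before_read = advertised_window(MAX_RCV_BUFFER, 3000, 1000)
# The application catches up and reads everything received.
after_read = advertised_window(MAX_RCV_BUFFER, 3000, 3000)
print(before_read, after_read)
```

The window shrinks as unread data piles up and reopens to the full buffer size once the application has read everything.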
Sender Side Response
• At the sender side, the TCP sender should ensure that: LastByteSent - LastByteAcked ≤ Advertised Window.
• Thus, we define what is called the “Effective Window”, which limits the amount of data that TCP can send:
Effective Window = Advertised Window - (LastByteSent - LastByteAcked)
• Note here that ACKing does not imply that the process has read the data!
• In order to prevent the overflow of the send-side buffer: LastByteWritten - LastByteAcked ≤ MaxSendBuffer
– If application tries to write more, TCP blocks.
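The two sender-side checks can be sketched as follows. All numbers are made up; clamping the effective window at zero is an assumption added here to cover the case where the receiver shrinks its window below what is already in flight.

```python
def effective_window(advertised, last_byte_sent, last_byte_acked):
    """Bytes TCP may still send without exceeding the advertised window."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, advertised - in_flight)  # clamp is an assumption


def write_fits(nbytes, last_byte_written, last_byte_acked, max_send_buffer):
    """True if the application can write nbytes without blocking."""
    return (last_byte_written + nbytes) - last_byte_acked <= max_send_buffer


# Receiver advertised 2096 bytes; 1500 bytes are sent but not yet ACKed.
print(effective_window(2096, last_byte_sent=5500, last_byte_acked=4000))

# Send buffer of 8192 bytes with 7000 bytes already held (written, not ACKed).
print(write_fits(1000, last_byte_written=11000, last_byte_acked=4000,
                 max_send_buffer=8192))
print(write_fits(2000, last_byte_written=11000, last_byte_acked=4000,
                 max_send_buffer=8192))
```

In the last call the write does not fit, which corresponds to the case where TCP blocks the application.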
Persistency
• What does one do when Advertised Window = 0 ?
• The sender will persist by sending 1 segment.
• Note that this segment may not be accepted by the receiver initially.
• But at some point, it would trigger a response that may contain a new Advertised Window.
Sequence Number Wraparound
• TCP Sequence Number is 32 bits long.
• Advertised Window is 16 bits. Since 2^32 >> 2 X 2^16, it is almost impossible for the same sequence number to appear twice within one window -- wrap around within a window is unlikely.
• In addition, MSL = 120 seconds is assumed, so that an old segment has expired before its sequence number comes around again.
• Time-stamps may also be used.
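A rough calculation shows when the MSL argument holds. The sketch assumes the sender transmits continuously at the full link rate, so the 2^32-byte sequence space wraps after 2^32 X 8 / bandwidth seconds; the link rates are illustrative choices, not values from the slides.

```python
SEQ_SPACE_BYTES = 2 ** 32
MSL_SECONDS = 120


def wrap_time_seconds(bandwidth_bps):
    """Time to send 2^32 bytes at the given rate (bits per second)."""
    return SEQ_SPACE_BYTES * 8 / bandwidth_bps


# Hypothetical link rates in bits per second.
for name, bps in [("1.5 Mbps", 1.5e6), ("10 Mbps", 10e6), ("1 Gbps", 1e9)]:
    t = wrap_time_seconds(bps)
    verdict = "safe" if t > MSL_SECONDS else "can wrap within one MSL"
    print(f"{name}: wraps after {t:.0f} s ({verdict})")
```

On slow links the wrap time is far longer than the MSL; at gigabit rates it falls below 120 seconds, which is the situation where time-stamps help.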
How long should the time-out be ?
• Remember, TCP has to ensure reliability.
• So bytes need to be resent if there is no “timely” acknowledgement.
• How long should the sender wait ?
• It should be adaptive -- the load on the network fluctuates.
– If too short, false time-outs.
– If too long, then poor rate of sending.
• Depends on round trip time estimation.
RTT Estimation
• Simple mechanism could be:
– Send packet, record time T1.
– When ACK is returned, record time T2.
– T2 - T1 = SampleRTT.
• To avoid fluctuations, Estimated RTT is a weighted average of the previous estimate and the current sample:
Estimated RTT = (1 - α) X Estimated RTT + α X SampleRTT
• In the original specification α = 0.125.
• The time-out is set to 2 X Estimated RTT.
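The weighted average can be tried out on a few samples. This sketch assumes the weight 0.125 applies to the new sample (and 0.875 to the previous estimate); the RTT values in milliseconds are made up.

```python
ALPHA = 0.125  # weight given to the newest sample


def update_estimated_rtt(estimated_rtt, sample_rtt, alpha=ALPHA):
    """Exponentially weighted moving average of RTT samples."""
    return (1 - alpha) * estimated_rtt + alpha * sample_rtt


def timeout(estimated_rtt):
    """Time-out rule from the slide: twice the estimate."""
    return 2 * estimated_rtt


estimate = 100.0  # ms, hypothetical starting estimate
for sample in [200.0, 200.0, 200.0]:
    estimate = update_estimated_rtt(estimate, sample)
    print(f"sample={sample} estimate={estimate:.2f} timeout={timeout(estimate):.2f}")
```

The estimate moves only one eighth of the way toward each new sample, so a single outlier has limited effect, while a sustained change is tracked over several samples.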
A problem
[Figure: two sender/receiver timelines, (a) and (b), each showing an original transmission, a retransmission, and a single ACK.]
• When there are retransmissions, it is unclear if the ACK is for the original transmission or for a retransmission.
• How do we overcome this ?
The Karn/Partridge Algorithm
• Take SampleRTT measurements only for segments that have been sent once !
• This eliminates the possibility that wrong RTT estimates are factored into the estimation.
• Another change -- each time TCP retransmits, it sets the next timeout to 2 X last timeout --> this is called exponential back-off (primarily for avoiding congestion).
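Both rules can be combined in a small timer object. This is a sketch under assumptions: the class and method names are made up, the estimate uses the weighted average with weight 0.125 on the new sample, and the timeout is reset to 2 X Estimated RTT after a clean (non-retransmitted) sample.

```python
class KarnPartridgeTimer:
    """Illustrative retransmission timer; not a real TCP implementation."""

    def __init__(self, initial_rtt, alpha=0.125):
        self.alpha = alpha
        self.estimated_rtt = initial_rtt
        self.timeout = 2 * initial_rtt

    def on_ack(self, sample_rtt, was_retransmitted):
        """Use the sample only if the segment was sent exactly once."""
        if was_retransmitted:
            return  # ambiguous sample: ignore it
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        self.timeout = 2 * self.estimated_rtt  # assumed reset rule

    def on_timeout(self):
        """Exponential back-off: double the timeout on each retransmission."""
        self.timeout *= 2


timer = KarnPartridgeTimer(initial_rtt=1.0)
timer.on_timeout()                               # first retransmission
timer.on_timeout()                               # second retransmission
print(timer.timeout)                             # backed off twice
timer.on_ack(5.0, was_retransmitted=True)        # ambiguous, ignored
print(timer.estimated_rtt)                       # unchanged
timer.on_ack(1.0, was_retransmitted=False)       # clean sample
print(timer.timeout)                             # back to 2 X estimate
```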
Jacobson Karels Algorithm
• The main problem with the Karn/Partridge scheme is that it does not take into account the variation between RTT samples.
• New method proposed -- the Jacobson/Karels Algorithm.
• Estimated RTT = Estimated RTT + δ X Difference
– Difference = Sample RTT - Estimated RTT
• Deviation = Deviation + δ X (|Difference| - Deviation)
• Timeout = μ X Estimated RTT + φ X Deviation.
• δ is a fraction between 0 and 1. The values of μ and φ are chosen based on experience -- typically μ = 1 and φ = 4.
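The update rules can be followed step by step in code. In this sketch the parameter names delta, mu, and phi stand for the gain used in both averages and the two multipliers in the timeout; delta = 0.125 and an initial deviation of 0 are assumptions chosen for the example, and the RTT values in milliseconds are made up.

```python
class JacobsonKarels:
    """Illustrative RTT estimator that also tracks the deviation."""

    def __init__(self, initial_rtt, delta=0.125, mu=1.0, phi=4.0):
        self.delta = delta
        self.mu = mu
        self.phi = phi
        self.estimated_rtt = initial_rtt
        self.deviation = 0.0  # assumed starting value

    def update(self, sample_rtt):
        """Fold in one sample and return the new timeout."""
        difference = sample_rtt - self.estimated_rtt
        self.estimated_rtt += self.delta * difference
        self.deviation += self.delta * (abs(difference) - self.deviation)
        return self.mu * self.estimated_rtt + self.phi * self.deviation


estimator = JacobsonKarels(initial_rtt=100.0)
for sample in [180.0, 100.0, 100.0]:
    new_timeout = estimator.update(sample)
    print(f"sample={sample} est={estimator.estimated_rtt:.2f} "
          f"dev={estimator.deviation:.2f} timeout={new_timeout:.2f}")
```

When samples vary a lot, the deviation term dominates and the timeout moves well above the estimate; when samples are steady, the deviation decays and the timeout stays close to the estimated RTT.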
Silly Window Syndrome
• Suppose an MSS worth of data is collected and the advertised window is MSS/2.
• What should the sender do ? -- transmit half-full segments or wait to send a full MSS when the window opens ?
• Early implementations were aggressive -- transmit MSS/2.
• Aggressively doing this would consistently result in small segment sizes -- called the Silly Window Syndrome.
Issues ..
• We cannot eliminate the possibility of small segments being sent.
• However, we can introduce methods to coalesce small chunks.
– Delaying ACKs -- receiver does not send ACKs as soon as it receives segments.
• How long to delay ? Not very clear.
– Ultimate solution falls to the sender -- when should I transmit ?
Nagle’s Algorithm
• If sender waits too long --> bad for interactive connections.
• If it does not wait long enough -- silly window syndrome.
• How to solve ?
• Timer -- a clock-based timer is one option; Nagle’s rule instead uses the returning ACK as the trigger:
– If both available data and Window ≥ MSS, send full segment.
– Else, if there is unACKed data in flight, buffer new data until ACK returns.
– Else, send new data now.
• Note -- Socket interface allows some applications to turn off Nagle’s algorithm by setting the TCP_NODELAY option.
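The three-way decision can be written as a single function. This is a sketch of the rule as listed on the slide, with made-up function names and return strings; the second helper shows how an application would set the TCP_NODELAY option through Python's standard socket module.

```python
import socket


def nagle_decision(available_bytes, window_bytes, mss, unacked_in_flight):
    """Decide what the sender does when the application hands it data."""
    if available_bytes >= mss and window_bytes >= mss:
        return "send full segment"
    if unacked_in_flight:
        return "buffer until ACK returns"
    return "send now"


def disable_nagle(sock):
    """Turn off Nagle's algorithm on a TCP socket."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)


MSS = 1000  # hypothetical segment size in bytes
print(nagle_decision(1500, 4000, MSS, unacked_in_flight=True))
print(nagle_decision(10, 4000, MSS, unacked_in_flight=True))
print(nagle_decision(10, 4000, MSS, unacked_in_flight=False))
```

An interactive application that sends single keystrokes would hit the second case often, which is why such applications sometimes disable the algorithm.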
TCP Throughput
• If a connection sends W segments of MSS size (in bytes) in RTT seconds, then the throughput is defined as: W X MSS / RTT bytes/second.
• If there is a link of capacity R and there are K connections, what we want is for each TCP connection to have a throughput = R/K.
Throughput (cont)
• If a TCP session goes through n links, and if link j has a rate Rj and is shared by Kj connections, ideally the throughput on link j = Rj/Kj.
• Thus, a connection’s end-to-end rate is r = min (R1/K1, R2/K2, ..., Rj/Kj, ..., Rn/Kn).
• In reality it is not so simple: some connections may be unable to use their share -- so the share of the others may be higher.
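Both throughput formulas are easy to evaluate numerically. The window size, segment size, RTT, and link parameters below are made-up values chosen only to illustrate the arithmetic.

```python
def throughput(w_segments, mss_bytes, rtt_seconds):
    """W X MSS / RTT, in bytes per second."""
    return w_segments * mss_bytes / rtt_seconds


def end_to_end_rate(links):
    """Ideal fair-share rate: the minimum of Rj/Kj over all links.

    links is a list of (rate, number_of_connections) pairs.
    """
    return min(rate / k for rate, k in links)


# 10 segments of 1000 bytes every 0.5 seconds.
print(throughput(10, 1000, 0.5))

# Three hypothetical links: (rate in bits/s, connections sharing it).
path = [(10_000_000, 2), (100_000_000, 50), (1_000_000_000, 100)]
print(end_to_end_rate(path))
```

In this example the middle link is the bottleneck: it has a high raw rate, but the largest number of connections per unit of capacity.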
Where are we ?
• We have covered Chapter 5 -- Sections 5.1 and 5.2.
• Whatever I left out from Section 5.2 is for self-study.
Where are we headed ?
• We will look at Congestion Control with TCP next time.
– Chapter 6 -- Sections 6.3 and 6.4.