Layer 4: Transport Layer
Transport Layer: Motivation
Recall that the network layer (NL) is responsible for forwarding a packet from one host to another host
But it is the applications that communicate! How do you make applications on different hosts communicate?
Need a new layer, called the “Transport Layer”: responsible for providing communication between applications running on different hosts
Example: a Web browser talking to a Web server
[Figure: hosts A, B, and C connected through routers R1 and R2. Stacks with a transport layer: Transport, Network, Link, Physical. Stacks without one: Network, Link, Physical. Applications shown: FTP server, Web server, Web browser, FTP client.]
Transport services and protocols provide logical communication
between app processes running on different hosts
transport protocols run in end systems
send side: breaks app messages into segments, passes to network layer
rcv side: reassembles segments into messages, passes to app layer
more than one transport protocol available to apps
Internet: UDP, TCP, SCTP
[Figure: two end systems run all five layers (application, transport, network, data link, physical); the nodes between them run only network, data link, and physical.]
Transport vs. network layer
network layer: logical communication between hosts
transport layer: logical communication between processes; relies on, and enhances, network-layer services
Household analogy:
12 kids in one house sending letters to 12 kids in another house
processes = kids
app messages = letters in envelopes
hosts = houses
transport protocol = Ann and Bill (the kid in each house who collects and hands out the mail)
network-layer protocol = postal service
Transport Layer Functions
Demultiplexing to upper layer -- Obligatory
  Deliver an incoming packet to the correct application
Define delivery semantics and implement them
  Reliable vs. unreliable
  Unicast vs. multicast
  Ordered vs. unordered
Flow control -- Optional
  Do not allow the sender to overrun the receiver’s buffer resources
Congestion control -- Optional
  Do not allow the sender to overrun the network capacity
Internet transport-layer protocols
User Datagram Protocol (UDP)
  unreliable (“best-effort”), unordered
  unicast or multicast delivery
  no flow control, no congestion control
Transmission Control Protocol (TCP)
  reliable, in-order
  unicast
  flow & congestion control
Stream Control Transmission Protocol (SCTP), RFC 2960 (will not cover in class)
  reliable, optional ordering
  unicast
  flow & congestion control
Multiplexing
Multiplexing/demultiplexing – Why?
Why do we need demultiplexing?
Assume you are running 3 network applications, all using the same transport protocol, e.g., TCP
• FTP Server, Telnet Client, Web Browser
When a packet arrives at a host, it moves up the protocol stack until it reaches the transport layer, e.g., TCP
Now the transport layer needs a way to determine which application the packet should be delivered to. This is the demultiplexing problem.
Recall that all protocol layers perform multiplexing/demultiplexing:
• e.g., IP needs to determine which transport protocol a given packet should be delivered to: UDP or TCP?
Demultiplexing
[Figure: two hosts, each with a transport, network, link, and physical stack; processes P1, P2, and P3 sit above the transport layer.]
Demultiplexing: How?
host receives IP datagrams
  each datagram has a source IP address and a destination IP address
  each datagram carries 1 transport-layer segment
  each segment has a source and a destination port number (well-known port numbers for specific applications)
Port #s: 16 bits
• 65536 possible values (0-65535)
• 0-1023 are called well-known and are reserved
  – HTTP uses port 80
  – Telnet uses port 23
  – RFC 1700 lists the reserved ports
TCP/UDP segment format (each row is 32 bits wide):
  source port # | dest port #
  other header fields
  application data (message)
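The port fields can be read straight off the first four bytes of a segment. The following is a minimal sketch of that step, not part of the slides: parse_ports and is_well_known are hypothetical helper names, and the only facts assumed are the 16-bit port fields shown above and network (big-endian) byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read the source and destination ports from the
 * first four bytes of a TCP or UDP segment. Ports travel in network byte
 * order (most significant byte first).
 * Returns 0 on success, -1 if the buffer is too short. */
int parse_ports(const uint8_t *seg, size_t len, uint16_t *src, uint16_t *dst)
{
    if (len < 4)
        return -1;
    *src = (uint16_t)((seg[0] << 8) | seg[1]);
    *dst = (uint16_t)((seg[2] << 8) | seg[3]);
    return 0;
}

/* Ports 0-1023 form the well-known (reserved) range. */
int is_well_known(uint16_t port)
{
    return port <= 1023;
}
```

For example, a segment starting with the bytes 0x19 0x1C 0x00 0x50 has source port 6428 and destination port 80 (HTTP).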
UDP: User Datagram Protocol [RFC 768]
“bare bones”, “best effort” transport protocol
connectionless:
  no handshaking between UDP sender and receiver before packets start being exchanged
  each UDP segment handled independently of others
Just provides multiplexing/demultiplexing (plus a checksum for error detection)
Pros:
  No connection establishment: no delay before sending/receiving packets
  Simple: no connection state at sender or receiver
  Small segment header: just 8 bytes
Cons:
  “best effort” transport service means UDP segments may be lost or delivered out of order to the app
  no congestion control: UDP can blast away as fast as desired
UDP: more
often used for streaming multimedia apps
  loss tolerant
  rate sensitive
other UDP uses
  DNS
  SNMP
reliable transfer over UDP: add reliability at the application layer
  application-specific error recovery!
UDP segment format (each row is 32 bits wide):
  source port # | dest port #    (used for mux/demux)
  length | checksum              (length, in bytes, of the UDP segment, including header)
  application data (message)
UDP Demultiplexing
Create sockets with port numbers:
  DatagramSocket mySocket1 = new DatagramSocket(6428);
  DatagramSocket mySocket2 = new DatagramSocket(4567);
UDP socket identified by two-tuple:
  (dest IP address, dest port number)
When a host receives a UDP segment:
  checks the destination port number in the segment
  directs the UDP segment to the socket with that port number
IP datagrams with different source IP addresses and/or source port numbers are directed to the same socket
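The lookup described above can be sketched as a table search keyed on the destination port alone. This is an illustrative model only, not how any particular kernel implements it: struct udp_binding, udp_demux, and the integer socket_id are hypothetical names, and the host is assumed to have a single local IP address so the destination IP is left out of the key.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory model of a host's bound UDP sockets. */
struct udp_binding {
    uint16_t port;      /* local (destination) port the socket is bound to */
    int      socket_id; /* stand-in for the real socket */
};

/* Return the socket bound to dst_port, or -1 if there is none.
 * The source address and source port are accepted but deliberately
 * ignored: UDP demultiplexing does not look at them. */
int udp_demux(const struct udp_binding *table, size_t n,
              uint32_t src_ip, uint16_t src_port, uint16_t dst_port)
{
    (void)src_ip;
    (void)src_port;
    for (size_t i = 0; i < n; i++)
        if (table[i].port == dst_port)
            return table[i].socket_id;
    return -1;
}
```

With sockets bound to ports 6428 and 4567, segments from two different senders addressed to port 6428 both resolve to the same socket, which is why the application must use the source port as a return address.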
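UDP Demultiplexing Example
DatagramSocket serverSocket = new DatagramSocket(6428);
[Figure: clients at IP A and IP B each exchange segments with a single process on the server at IP C, which is bound to port 6428.]
  client using port 9157 to server: SP: 9157, DP: 6428; reply: SP: 6428, DP: 9157
  client using port 5775 to server: SP: 5775, DP: 6428; reply: SP: 6428, DP: 5775
Source Port (SP) provides the “return address”: it identifies the process at the other end of the line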
UDP checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted segment
Sender:
  treat segment contents as a sequence of 16-bit integers
  checksum: 1’s complement of the 1’s complement sum of segment contents
  sender puts checksum value into UDP checksum field
Receiver:
  compute checksum of received segment
  check if computed checksum equals checksum field value:
    NO - error detected
    YES - no error detected. But maybe errors nonetheless? (e.g., reordered 16-bit words)
Why checksum at UDP if the link layer (LL) provides error checking?
  IP is supposed to run over ANY LL, so UDP does its own error checking
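The arithmetic can be sketched in a few lines of C. This is a simplified illustration of the Internet checksum over a byte buffer: internet_checksum is a hypothetical name, the buffer is assumed to be at most 64 KB so the 32-bit accumulator cannot overflow, and the pseudo-header that real UDP also covers is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum over a byte buffer: add the data as big-endian 16-bit
 * words using 1's complement addition, then invert the result. */
uint16_t internet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (i < len)                      /* odd length: pad last byte with zero */
        sum += (uint32_t)data[i] << 8;

    while (sum >> 16)                 /* fold carries back into the low 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A receiver can run the same function over the data with the checksum appended: a result of 0 means no error was detected. Because addition is commutative, swapping two 16-bit words leaves the checksum unchanged, which is one way an error can slip through.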
How to program using UDP?
[Figure: in each host, the Socket Layer sits on top of TCP and UDP, which run over IP, the link layer (LL), and the physical layer (PL).]
Socket Layer: programmer’s API to the protocol stack
Typical network app has two pieces: client and server
Server: passive entity. Provides service to clients
  e.g., Web server responds with the requested Web page
Client: initiates contact with server (“speaks first”)
  typically requests service from server, e.g., Web browser
Socket Creation
mySock = socket(family, type, protocol);
UDP/TCP/IP-specific sockets:
         Family     Type           Protocol
  TCP    PF_INET    SOCK_STREAM    IPPROTO_TCP
  UDP    PF_INET    SOCK_DGRAM     IPPROTO_UDP
Socket reference:
  File (socket) descriptor in UNIX
  Socket handle in WinSock
UDP Client/Server Interaction
Client
1. Create a UDP socket
2. Communicate (send/receive messages)
3. When done, close the socket
Server
1. Create a UDP socket
2. Assign a port to socket
3. Communicate (receive/send messages)
4. When done, close the socket
Server starts by getting ready to receive client messages…
/* Create socket for incoming messages */
if ((servSock = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP)) < 0)
    Error("socket() failed");
/* Assign a port to the socket */
ServAddr.sin_family = AF_INET;                /* Internet address family */
ServAddr.sin_addr.s_addr = htonl(INADDR_ANY); /* Any incoming interface */
ServAddr.sin_port = htons(20000);             /* Local port 20000 */
if (bind(servSock, (struct sockaddr *) &ServAddr, sizeof(ServAddr)) < 0)
    Error("bind() failed");
Specifying Addresses
Generic:
struct sockaddr
{
    unsigned short sa_family;  /* Address family (e.g., PF_INET) */
    char sa_data[14];          /* Protocol-specific address information */
};
IP specific:
struct sockaddr_in
{
    unsigned short sin_family; /* Internet protocol (PF_INET) */
    unsigned short sin_port;   /* Port (16-bits) */
    struct in_addr sin_addr;   /* Internet address (32-bits) */
    char sin_zero[8];          /* Not used */
};
struct in_addr
{
    unsigned long s_addr;      /* Internet address (32-bits) */
};
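Socket creation, the address structure, and bind() fit together roughly as follows. This is a sketch under stated assumptions rather than the course's reference code: open_udp_server and local_port are hypothetical helper names, errors are reported by returning -1 instead of calling the slides' Error() routine, and AF_INET is used where the slides write PF_INET.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a UDP socket bound to the given local port
 * on all interfaces. Passing port 0 lets the OS pick a free port.
 * Returns the socket descriptor, or -1 on failure. */
int open_udp_server(uint16_t port)
{
    struct sockaddr_in addr;
    int sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (sock < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));            /* also clears sin_zero */
    addr.sin_family = AF_INET;                 /* Internet address family */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* any incoming interface */
    addr.sin_port = htons(port);               /* port in network byte order */

    if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}

/* Hypothetical helper: report the local port of a bound socket,
 * or -1 on failure. */
int local_port(int sock)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    if (getsockname(sock, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```

Calling open_udp_server(20000) mirrors the server setup on the slide; the cast to struct sockaddr * is what lets the generic bind() accept the IP-specific structure.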
/* Wait for a message; peer is filled in with the sender's address */
struct sockaddr_in peer;
socklen_t peerSize = sizeof(peer);
char buffer[65536];
recvfrom(servSock, buffer, sizeof(buffer), 0, (struct sockaddr *)&peer, &peerSize);
Server is now blocked waiting for a message from a client
Later, a client decides to talk to the server…
/* Create socket for outgoing messages */
if ((clientSock = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP)) < 0)
    Error("socket() failed");
// Initialize server's address and port
struct sockaddr_in server;
server.sin_family = AF_INET;
server.sin_addr.s_addr = inet_addr("10.10.100.37");
server.sin_port = htons(20000);
// Send it to the server
sendto(clientSock, buffer, msgSize, 0, (struct sockaddr *)&server, sizeof(server));
/* When done, each side closes its own socket */
close(clientSock);   /* client */
close(servSock);     /* server */
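The whole interaction can be exercised inside one process over the loopback interface. The sketch below is an assumption-laden illustration, not the slides' program: udp_echo_once is a hypothetical name, the server echoes the message back (the slides do not say what the server replies), the server binds to an OS-chosen port on 127.0.0.1 instead of port 20000, and a 2-second receive timeout is added so a lost datagram cannot block forever.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Send msg from a client socket to a server socket on 127.0.0.1, have the
 * server echo it to the sender's address, and store the echo in reply.
 * Returns the number of bytes echoed, or -1 on failure. */
int udp_echo_once(const char *msg, char *reply, size_t reply_cap)
{
    int result = -1;
    char buffer[1024];
    struct sockaddr_in server, peer;
    socklen_t serverLen = sizeof(server), peerLen = sizeof(peer);
    struct timeval tmo = {2, 0};           /* 2 s receive timeout */
    ssize_t got, back;

    /* Both sides: create a UDP socket */
    int servSock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    int clientSock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (servSock < 0 || clientSock < 0)
        goto done;
    setsockopt(servSock, SOL_SOCKET, SO_RCVTIMEO, &tmo, sizeof(tmo));
    setsockopt(clientSock, SOL_SOCKET, SO_RCVTIMEO, &tmo, sizeof(tmo));

    /* Server: assign a port (0 = let the OS choose), then look it up */
    memset(&server, 0, sizeof(server));
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    server.sin_port = htons(0);
    if (bind(servSock, (struct sockaddr *)&server, sizeof(server)) < 0 ||
        getsockname(servSock, (struct sockaddr *)&server, &serverLen) < 0)
        goto done;

    /* Client speaks first */
    if (sendto(clientSock, msg, strlen(msg), 0,
               (struct sockaddr *)&server, sizeof(server)) < 0)
        goto done;

    /* Server: receive, learn the client's address, and echo back to it */
    got = recvfrom(servSock, buffer, sizeof(buffer), 0,
                   (struct sockaddr *)&peer, &peerLen);
    if (got < 0)
        goto done;
    if (sendto(servSock, buffer, (size_t)got, 0,
               (struct sockaddr *)&peer, peerLen) < 0)
        goto done;

    /* Client: receive the reply */
    back = recvfrom(clientSock, reply, reply_cap, 0, NULL, NULL);
    if (back >= 0)
        result = (int)back;

done:   /* When done, close the sockets */
    if (servSock >= 0)
        close(servSock);
    if (clientSock >= 0)
        close(clientSock);
    return result;
}
```

Note that the client never calls bind(): the OS assigns it a port on the first sendto(), and the server learns that port from the peer address filled in by recvfrom().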
Transmission Control Protocol (TCP)
TCP: Overview
RFCs: 793, 1122, 1323, 2018, 2581
reliable, in-order byte stream: no “message boundaries”
send & receive buffers: buffer incoming & outgoing data
flow controlled: sender will not overwhelm receiver
congestion controlled: sender will not overwhelm network
point-to-point (unicast): one sender, one receiver
connection-oriented: handshaking (exchange of control msgs) initializes sender and receiver state before data exchange
  State resides only at the end systems: not a virtual circuit!
full duplex data: bi-directional data flow in the same connection (A->B & B->A)
MSS: maximum segment size
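[Figure: the sending application writes data through its socket door into the TCP send buffer; segments carry the data to the TCP receive buffer, from which the receiving application reads data through its socket door.]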
TCP segment structure (each row is 32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | receive window
  checksum | urgent data pointer
  options (variable length)
  application data (variable length)
Field notes:
  sequence and acknowledgement numbers: counting by bytes of data (not segments!)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection establishment (setup, teardown commands)
  receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
TCP Connection-oriented demux
TCP socket identified by 4-tuple:
  source IP address
  source port number
  dest IP address
  dest port number
receiving host uses all four values to direct the segment to the appropriate socket
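The four-value lookup can be modeled as an exact match on all fields. As an illustration only: struct tcp_conn, tcp_demux, and the integer socket_id are hypothetical, addresses are plain 32-bit integers, and listening sockets (which real TCP matches when no established connection fits) are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one established TCP connection on the receiver. */
struct tcp_conn {
    uint32_t src_ip;
    uint16_t src_port;
    uint32_t dst_ip;
    uint16_t dst_port;
    int      socket_id; /* stand-in for the real socket */
};

/* Return the socket whose 4-tuple matches the segment exactly,
 * or -1 if no connection matches. */
int tcp_demux(const struct tcp_conn *table, size_t n,
              uint32_t src_ip, uint16_t src_port,
              uint32_t dst_ip, uint16_t dst_port)
{
    for (size_t i = 0; i < n; i++) {
        const struct tcp_conn *c = &table[i];
        if (c->src_ip == src_ip && c->src_port == src_port &&
            c->dst_ip == dst_ip && c->dst_port == dst_port)
            return c->socket_id;
    }
    return -1;
}
```

Two clients that both connect to port 80 on the same server therefore reach different sockets, because their source IP address or source port differs. Under UDP the same two senders would share one socket.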
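TCP Demultiplexing Example
[Figure: clients at IP A and IP B connect to a Web server at IP C on port 80; the server has a separate process/socket for each connection.]
  client using port 9157 to server: SP: 9157, DP: 80; reply: SP: 80, DP: 9157
  client using port 5775 to server: SP: 5775, DP: 80; reply: SP: 80, DP: 5775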
Typical TCP Transaction
[Figure: client and server timelines showing Connection Establishment, then Reliable, In-Order Data Exchange, then Connection Termination.]
A TCP transaction consists of 3 phases:
1. Connection Establishment: handshaking between client and server
2. Reliable, In-Order Data Exchange: recover any lost data through retransmissions and ACKs
3. Connection Termination: closing the connection
TCP Connection Establishment
TCP sender and receiver establish a “connection” before exchanging data segments
initialize TCP variables:
  seq. #s
  buffers, flow control info (e.g., RcvWindow)
client: connection initiator
  Socket clientSocket = new Socket("hostname", port);
server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();
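In the C sockets API used for the UDP examples, the Java calls above correspond to connect() on the client and listen() plus accept() on the server. The sketch below runs both ends in one process over loopback; tcp_hello_once is a hypothetical name, the port is chosen by the OS, and the two-byte message "hi" is an arbitrary choice.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a TCP connection over 127.0.0.1, send "hi" from the client side,
 * and read it on the server side into out.
 * Returns the number of bytes read, or -1 on failure. */
int tcp_hello_once(char *out, size_t cap)
{
    int result = -1, connSock = -1;
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    const char msg[] = "hi";
    size_t total = 0;
    ssize_t got = 0;

    int welcomeSock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    int clientSock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (welcomeSock < 0 || clientSock < 0)
        goto done;

    /* Server: bind to an OS-chosen port and start listening */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    if (bind(welcomeSock, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(welcomeSock, 1) < 0 ||
        getsockname(welcomeSock, (struct sockaddr *)&addr, &len) < 0)
        goto done;

    /* Client: connect() starts the handshake. The OS completes it on
     * behalf of the listening socket, so this returns before accept(). */
    if (connect(clientSock, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto done;

    /* Server: accept() returns a new socket for this one connection */
    connSock = accept(welcomeSock, NULL, NULL);
    if (connSock < 0)
        goto done;

    if (send(clientSock, msg, sizeof(msg) - 1, 0) != (ssize_t)(sizeof(msg) - 1))
        goto done;
    shutdown(clientSock, SHUT_WR);   /* no more data from the client */

    /* TCP is a byte stream: keep reading until end of stream or out is full */
    while (total < cap &&
           (got = recv(connSock, out + total, cap - total, 0)) > 0)
        total += (size_t)got;
    if (got >= 0)
        result = (int)total;

done:
    if (connSock >= 0)
        close(connSock);
    if (clientSock >= 0)
        close(clientSock);
    if (welcomeSock >= 0)
        close(welcomeSock);
    return result;
}
```

The read loop reflects the lack of message boundaries: a single recv() is not guaranteed to return everything one send() wrote, so the receiver reads until the stream ends.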
Connection Establishment (cont)
Three-way handshake:
Step 1: client host sends TCP SYN segment to server
  specifies a random initial seq #
  no data
Step 2: server host receives SYN, replies with SYNACK segment
  server allocates buffers
  specifies server initial seq #
Step 3: client receives SYNACK, replies with ACK segment, which may contain data
[Figure: timelines for Host A and Host B. A sends the connection request; B ACKs and selects its own initial seq #; A ACKs.]
Seq. #’s: byte stream “number” of the first byte in the segment’s data
ACKs: seq # of the next byte expected from the other side; cumulative ACK
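The numbers exchanged during the handshake follow directly from these two rules. A small sketch, with the caveat that struct handshake, three_way, and ack_for are hypothetical names, and that it relies on one fact not stated on the slides: a SYN occupies one sequence number, so it is acknowledged with the initial seq # plus one.

```c
#include <stdint.h>

/* Sequence and acknowledgement numbers carried by the three handshake
 * segments (hypothetical structure for illustration). */
struct handshake {
    uint32_t syn_seq;      /* Step 1: client's SYN */
    uint32_t synack_seq;   /* Step 2: server's SYNACK */
    uint32_t synack_ack;
    uint32_t ack_seq;      /* Step 3: client's ACK */
    uint32_t ack_ack;
};

struct handshake three_way(uint32_t client_isn, uint32_t server_isn)
{
    struct handshake h;
    h.syn_seq    = client_isn;
    h.synack_seq = server_isn;
    h.synack_ack = client_isn + 1;  /* next byte expected from the client */
    h.ack_seq    = client_isn + 1;
    h.ack_ack    = server_isn + 1;  /* next byte expected from the server */
    return h;
}

/* Cumulative ACK for a data segment that arrived in order: the seq # of
 * the next byte expected. Arithmetic wraps modulo 2^32, as seq #s do. */
uint32_t ack_for(uint32_t seg_seq, uint32_t data_len)
{
    return seg_seq + data_len;
}
```

For instance, with a client initial seq # of 1000 and a server initial seq # of 5000, the SYNACK carries ACK 1001 and the final ACK carries ACK 5001; a later 100-byte segment starting at seq # 1001 is acknowledged with 1101.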
TCP Starting Sequence Number Selection
Why a random starting sequence #? Why not simply choose 0?
To protect against two incarnations of the same connection reusing the same sequence numbers too soon
That is, while there is still a chance that a segment from an earlier incarnation of a connection will interfere with a later incarnation of the connection
How could that happen?
  Suppose the client machine always starts at seq # 0 and initiates a connection to the server with seq # 0.
  The client sends one byte and then the client machine crashes.
  The client reboots and initiates the connection again, once more with seq # 0.
  The server thinks the new incarnation is the same as the old connection.
TCP Connection Termination
Closing a connection:
client closes socket: clientSocket.close();
Step 1: client end system sends TCP FIN control segment to server
Step 2: server receives FIN, replies with ACK. Server might send some buffered but not yet sent data before closing the connection. Server then sends FIN and moves to the Closing state.
[Figure: client and server timelines. The client closes first; the server may still write data before it closes; the client passes through timed wait before reaching closed.]
TCP Connection Termination
Step 3: client receives FIN, replies with ACK.
  Enters “timed wait”: will respond with ACK to received FINs
Step 4: server receives ACK. Connection closed.
Why wait before closing the connection?
  If the connection were allowed to move straight to the CLOSED state, another pair of application processes might come along and open the same connection (use the same port #s), and a delayed FIN from the earlier incarnation would terminate the new connection.
[Figure: client and server timelines. Client: closing, timed wait, closed. Server: closing, closed.]