Transport Layer: UDP
COMS W6998
Spring 2010
Erich Nahum
Outline
UDP Layer Architecture
Receive Path
Send Path
Recall what UDP Does
RFC 768
IP Proto 17
Connectionless
Unreliable
Datagram
Supports multicast
Optional checksum
Nice and simple. Yet still 2187 lines of code!

UDP packet format (bit offsets 0 to 31)
Source Port (16) | Destination Port (16)
Length (16) | Checksum (16)
Data
UDP Header
The udp header: include/linux/udp.h
struct udphdr {
__be16 source;
__be16 dest;
__be16 len;
__sum16 check;
};
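The four header fields and the RFC 768 checksum can be sketched in user space as follows. This is only an illustration: `struct udp_header`, `udp_make_header()`, `csum_add()` and `udp_checksum()` are hypothetical names, not kernel functions, and the kernel itself uses optimized checksum helpers. Addresses are assumed to be IPv4 and passed in host byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>   /* htons(), ntohs() */

/* Same layout as struct udphdr; all fields are kept in network byte order. */
struct udp_header {
    uint16_t source;
    uint16_t dest;
    uint16_t len;
    uint16_t check;
};

/* Fill in ports and length; the checksum is left at 0 ("not computed"). */
static struct udp_header udp_make_header(uint16_t sport, uint16_t dport,
                                         size_t data_len)
{
    struct udp_header uh;

    uh.source = htons(sport);
    uh.dest = htons(dport);
    uh.len = htons((uint16_t)(sizeof(uh) + data_len));
    uh.check = 0;
    return uh;
}

/* Add len bytes to a running sum, read as big-endian 16-bit words. */
static uint32_t csum_add(uint32_t sum, const uint8_t *p, size_t len)
{
    while (len > 1) {
        sum += (uint32_t)((p[0] << 8) | p[1]);
        p += 2;
        len -= 2;
    }
    if (len)                          /* odd trailing byte is zero-padded */
        sum += (uint32_t)(p[0] << 8);
    return sum;
}

/* Checksum over pseudo-header, UDP header and data; host byte order result. */
static uint16_t udp_checksum(uint32_t saddr, uint32_t daddr,
                             const struct udp_header *uh,
                             const uint8_t *data, size_t data_len)
{
    struct udp_header tmp = *uh;
    uint16_t udp_len = ntohs(uh->len);
    uint8_t pseudo[12] = {
        (uint8_t)(saddr >> 24), (uint8_t)(saddr >> 16),
        (uint8_t)(saddr >> 8), (uint8_t)saddr,
        (uint8_t)(daddr >> 24), (uint8_t)(daddr >> 16),
        (uint8_t)(daddr >> 8), (uint8_t)daddr,
        0, 17,                        /* zero byte, IP protocol 17 */
        (uint8_t)(udp_len >> 8), (uint8_t)udp_len
    };
    uint32_t sum = 0;

    tmp.check = 0;                    /* checksum field counts as zero */
    sum = csum_add(sum, pseudo, sizeof(pseudo));
    sum = csum_add(sum, (const uint8_t *)&tmp, sizeof(tmp));
    sum = csum_add(sum, data, data_len);
    while (sum >> 16)                 /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    sum = ~sum & 0xffff;
    return sum ? (uint16_t)sum : 0xffff;  /* 0 is reserved for "no checksum" */
}
```

A caller would build the header with `udp_make_header()`, compute the checksum, and store `htons()` of the result in `check` before sending.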
Sidebar: UDP-Lite
RFC 3828
Very similar to UDP
Difference is that the checksum covers part of the packet rather than all of it
Checksum coverage says how many bytes (starting from the header) are covered by the checksum
Idea is that certain apps would rather have a damaged packet than none
Examples are audio and video codecs
IP Protocol 136
Linux UDP-Lite implementation shares most code with UDP

UDP-Lite packet format (bit offsets 0 to 31)
Source Port (16) | Destination Port (16)
Checksum Coverage (16) | Checksum (16)
Data
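How the Checksum Coverage field is interpreted can be sketched as a small helper. The rules follow RFC 3828 (0 means the whole packet, values 1 to 7 are illegal because the 8-byte header must always be covered, and coverage may not exceed the packet length). The function name `udplite_covered_bytes()` is hypothetical and is not the kernel's code.

```c
#include <stddef.h>
#include <stdint.h>

#define UDP_HEADER_LEN 8

/*
 * Return how many bytes of a UDP-Lite packet the checksum covers,
 * or -1 if the packet must be discarded.
 * pkt_len is the UDP-Lite length (header + data) derived from the IP layer.
 */
static long udplite_covered_bytes(uint16_t coverage, size_t pkt_len)
{
    if (pkt_len < UDP_HEADER_LEN)
        return -1;                    /* too short to hold a header */
    if (coverage == 0)
        return (long)pkt_len;         /* 0 means "cover the entire packet" */
    if (coverage < UDP_HEADER_LEN)
        return -1;                    /* 1..7 is illegal */
    if (coverage > pkt_len)
        return -1;                    /* cannot cover more than was received */
    return (long)coverage;
}
```

A codec-style application would typically ask for coverage of the header plus a few bytes of its own framing, so that bit errors in the remaining payload do not cause the packet to be dropped.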
Sources of UDP Packets
1. Packets arrive on an interface and are passed to the udp_rcv() function.
2. UDP packets are packed into an IP packet and passed down to IP via ip_append_data() and ip_push_pending_frames().
UDP Implementation Design
[Figure: call graph of the UDP layer (udp.c), between IP and the higher layers.
Receive path: ip_local_deliver_finish (ip_input.c) -> udp_rcv -> __udp4_lib_rcv -> __udp4_lib_lookup_skb, with __udp4_lib_mcast_deliver for multicast and icmp_send for ICMP -> __udp_queue_rcv_skb -> sock_queue_rcv_skb (sock.c) -> higher layers.
Send path: higher layers -> sock_sendmsg (socket.c) -> udp_sendmsg -> ip_route_output_flow (routing) and ip_append_data (ip_output.c) -> udp_push_pending_frames -> ip_push_pending_frames (ip_output.c).]
UDP Proto
struct proto udp_prot = {
    .name             = "UDP",
    .owner            = THIS_MODULE,
    .close            = udp_lib_close,
    .connect          = ip4_datagram_connect,
    .disconnect       = udp_disconnect,
    .ioctl            = udp_ioctl,
    .destroy          = udp_destroy_sock,
    .setsockopt       = udp_setsockopt,
    .getsockopt       = udp_getsockopt,
    .sendmsg          = udp_sendmsg,
    .recvmsg          = udp_recvmsg,
    .sendpage         = udp_sendpage,
    .backlog_rcv      = __udp_queue_rcv_skb,
    .hash             = udp_lib_hash,
    .unhash           = udp_lib_unhash,
    .get_port         = udp_v4_get_port,
    .memory_allocated = &udp_memory_allocated,
    .sysctl_mem       = sysctl_udp_mem,
    .sysctl_wmem      = &sysctl_udp_wmem_min,
    .sysctl_rmem      = &sysctl_udp_rmem_min,
    .obj_size         = sizeof(struct udp_sock),
    .slab_flags       = SLAB_DESTROY_BY_RCU,
    .h.udp_table      = &udp_table,
};
udp_table
/**
 * struct udp_table - UDP table
 *
 * @hash: hash table, sockets are hashed on (local port)
 * @hash2: hash table, sockets are hashed on (local port, local address)
 * @mask: number of slots in hash tables, minus 1
 * @log: log2(number of slots in hash table)
 */
struct udp_table {
    struct udp_hslot *hash;
    struct udp_hslot *hash2;
    unsigned int mask;
    unsigned int log;
};
udp_table_init() allocates the hash tables and initializes them:
for (i = 0; i <= table->mask; i++) {
    INIT_HLIST_NULLS_HEAD(&table->hash[i].head, i);
    table->hash[i].count = 0;
    spin_lock_init(&table->hash[i].lock);
}
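The role of mask and log can be illustrated with a user-space mock. Everything prefixed `mock_` is hypothetical. The sketch is deliberately simplified: the slot is just the port ANDed with the mask, whereas the kernel also mixes in a per-namespace value and protects each slot with a spinlock and an hlist_nulls list.

```c
#include <stdint.h>
#include <stdlib.h>

/* One hash slot; the real udp_hslot also holds a list head and a lock. */
struct mock_hslot {
    unsigned int count;
};

struct mock_udp_table {
    struct mock_hslot *hash;
    unsigned int mask;   /* number of slots minus 1 */
    unsigned int log;    /* log2(number of slots) */
};

/* Allocate 2^log zeroed slots. Returns 0 on success, -1 on failure. */
static int mock_udp_table_init(struct mock_udp_table *t, unsigned int log)
{
    unsigned int slots = 1u << log;

    t->hash = calloc(slots, sizeof(*t->hash));
    if (!t->hash)
        return -1;
    t->mask = slots - 1;
    t->log = log;
    return 0;
}

/* Simplified slot selection: because slots is a power of two,
 * "& mask" is the same as "% slots". */
static unsigned int mock_udp_hashfn(const struct mock_udp_table *t,
                                    uint16_t port)
{
    return port & t->mask;
}

/* Record one socket bound to the given local port. */
static void mock_udp_hash_port(struct mock_udp_table *t, uint16_t port)
{
    t->hash[mock_udp_hashfn(t, port)].count++;
}
```

With 128 slots, ports 53 and 181 land in the same slot, which is why each slot has to hold a chain of sockets rather than a single one.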
Outline
UDP Layer Architecture
Receive Path
Send Path
Receiving packets in UDP
From user space, you can receive UDP traffic with three system calls:
recv() (when the socket is connected)
recvfrom()
recvmsg()
All three are handled by udp_recvmsg() in the kernel (the .recvmsg entry of udp_prot). udp_rcv() is the handler that IP calls for arriving packets.
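From the application's point of view, the receive side looks like the following loopback sketch. The helper name `udp_loopback_echo()` is hypothetical. The receiver binds to port 0 so the kernel picks a free port, and a two-second receive timeout is set so the call cannot block forever if the datagram is lost.

```c
#include <string.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Send msg to a UDP socket on the loopback interface and read it back
 * with recvfrom(). Returns the number of bytes received, or -1 on error.
 */
static ssize_t udp_loopback_echo(const char *msg, char *out, size_t out_len)
{
    struct sockaddr_in addr, from;
    socklen_t alen = sizeof(addr), flen = sizeof(from);
    struct timeval tv = { 2, 0 };
    ssize_t n = -1;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);

    if (rx < 0 || tx < 0)
        goto out;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                /* let the kernel choose a port */
    if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(rx, (struct sockaddr *)&addr, &alen) < 0 ||
        setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
        goto out;

    if (sendto(tx, msg, strlen(msg), 0,
               (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;

    /* Blocks until a datagram is queued on rx (or the timeout expires);
     * "from" receives the sender's address. */
    n = recvfrom(rx, out, out_len, 0, (struct sockaddr *)&from, &flen);
out:
    if (rx >= 0)
        close(rx);
    if (tx >= 0)
        close(tx);
    return n;
}
```

Each successful recvfrom() returns exactly one datagram, so a 5-byte send is read back as 5 bytes regardless of the buffer size.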
Recall IP’s inet_protos
[Figure: inet_protos[MAX_INET_PROTOS] is an array, indexed from 0, of pointers to net_protocol structures. Each net_protocol has the fields handler, err_handler, gso_send_check, gso_segment, gro_receive and gro_complete. The figure shows one entry with handler udp_rcv() and err_handler udp_err(), and another with handler igmp_rcv() and a NULL err_handler.]
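The idea of dispatching on the IP protocol number through a table of handler pointers can be sketched in user space. All `mock_` and `MOCK_` names are hypothetical, the structure keeps only two of the fields, and the handlers simply return their protocol number so the dispatch is observable.

```c
#include <stddef.h>

#define MOCK_MAX_PROTOS   256
#define MOCK_IPPROTO_IGMP 2
#define MOCK_IPPROTO_UDP  17

/* Cut-down stand-in for struct net_protocol. */
struct mock_net_protocol {
    int (*handler)(const void *pkt);
    void (*err_handler)(const void *pkt, unsigned int info);
};

static int mock_udp_rcv(const void *pkt)
{
    (void)pkt;
    return MOCK_IPPROTO_UDP;
}

static void mock_udp_err(const void *pkt, unsigned int info)
{
    (void)pkt;
    (void)info;
}

static int mock_igmp_rcv(const void *pkt)
{
    (void)pkt;
    return MOCK_IPPROTO_IGMP;
}

static const struct mock_net_protocol mock_udp_protocol = {
    .handler = mock_udp_rcv,
    .err_handler = mock_udp_err,
};

static const struct mock_net_protocol mock_igmp_protocol = {
    .handler = mock_igmp_rcv,
    .err_handler = NULL,              /* no error handler, as in the figure */
};

/* Table indexed by IP protocol number; unset entries are NULL. */
static const struct mock_net_protocol *mock_protos[MOCK_MAX_PROTOS] = {
    [MOCK_IPPROTO_IGMP] = &mock_igmp_protocol,
    [MOCK_IPPROTO_UDP]  = &mock_udp_protocol,
};

/* Hand a packet to the registered handler; -1 if none is registered. */
static int mock_deliver(unsigned char protocol, const void *pkt)
{
    const struct mock_net_protocol *p = mock_protos[protocol];

    if (!p || !p->handler)
        return -1;
    return p->handler(pkt);
}
```

Registering a new transport protocol then amounts to filling one more slot in the table, which is how UDP hooks udp_rcv() into IP's local delivery.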
Receive Path: udp_rcv
Calls __udp4_lib_rcv(skb, &udp_table, IPPROTO_UDP);
__udp4_lib_rcv() is used by both UDP and UDP-Lite
Receive: __udp4_lib_rcv
Gets the routing table entry from the skb
Checks that skb has a header
Checks that length is good
Calculates the checksum
Pulls out saddr, daddr
Checks if address is multicast
  If so, calls __udp4_lib_mcast_deliver()
Receive: __udp4_lib_rcv (cont)
Looks up the socket in the udptable
  Via __udp4_lib_lookup_skb()
  Increases refcount on the sk (socket)
If socket is found
  Calls __udp_queue_rcv_skb()
  Decrements refcount with sock_put(sk)
If not,
  Sends ICMP_DEST_UNREACH (port unreachable) via icmp_send()
  Drops the packet.
Recv: __udp_queue_rcv_skb
Calls sock_queue_rcv_skb
Increments some statistics
Outline
UDP Layer Architecture
Receive Path
Send Path
Sending packets in UDP
From user space, you can send UDP traffic with three system calls:
send() (when the socket is connected)
sendto()
sendmsg()
All three are handled by udp_sendmsg() in the kernel.
udp_sendmsg() is much simpler than its TCP counterpart, tcp_sendmsg().
udp_sendpage() is called when user space calls sendfile() (to copy a file into a UDP socket).
sendfile() can also be used to copy data between one file descriptor and another.
udp_sendpage() invokes udp_sendmsg().
UDP Socket Options
For IPPROTO_UDP/SOL_UDP level, there exists a socket option UDP_CORK
Added in Linux kernel 2.5.44.
int state = 1;
setsockopt(s, IPPROTO_UDP, UDP_CORK, &state, sizeof(state));
for (j = 0; j < 1000; j++)
    sendto(s, buf1, ...);
state = 0;
setsockopt(s, IPPROTO_UDP, UDP_CORK, &state, sizeof(state));
UDP_CORK (cont)
The above code fragment calls udp_sendmsg() 1000 times without actually sending anything on the wire (in the usual case, without the UDP_CORK setsockopt(), 1000 packets would be sent).
Only after the second setsockopt() is called, with UDP_CORK and state=0, is one packet sent on the wire.
Kernel implementation: when using UDP_CORK, udp_sendmsg() passes MSG_MORE to ip_append_data().
UDP_CORK may be missing from older glibc headers; if so, add it to your program:
#define UDP_CORK 1
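The corking behaviour can be checked with a small loopback experiment on Linux. This is a sketch under assumptions: `udp_cork_demo()` is a hypothetical helper, the chunk size is capped at 64 bytes to keep the combined datagram small, and the fallback `#define` is only used when the system headers do not already provide UDP_CORK.

```c
#include <string.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/udp.h>
#include <arpa/inet.h>

#ifndef UDP_CORK
#define UDP_CORK 1                    /* fallback for older headers */
#endif

/*
 * Cork a UDP socket, send `chunks` pieces of `chunk_len` bytes each,
 * uncork, and return the size of the first datagram that arrives
 * at the receiver (or -1 on error).
 */
static ssize_t udp_cork_demo(int chunks, size_t chunk_len)
{
    char chunk[64], buf[2048];
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    struct timeval tv = { 2, 0 };
    ssize_t n = -1;
    int state, i;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);

    if (rx < 0 || tx < 0 || chunk_len > sizeof(chunk))
        goto out;
    memset(chunk, 'x', sizeof(chunk));

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(rx, (struct sockaddr *)&addr, &alen) < 0 ||
        setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
        goto out;

    state = 1;                        /* cork: queue data, send nothing yet */
    if (setsockopt(tx, IPPROTO_UDP, UDP_CORK, &state, sizeof(state)) < 0)
        goto out;
    for (i = 0; i < chunks; i++)
        if (sendto(tx, chunk, chunk_len, 0,
                   (struct sockaddr *)&addr, sizeof(addr)) < 0)
            goto out;
    state = 0;                        /* uncork: pending data goes out */
    if (setsockopt(tx, IPPROTO_UDP, UDP_CORK, &state, sizeof(state)) < 0)
        goto out;

    n = recv(rx, buf, sizeof(buf), 0);
out:
    if (rx >= 0)
        close(rx);
    if (tx >= 0)
        close(tx);
    return n;
}
```

Three corked sends of 10 bytes should arrive as one 30-byte datagram. Because all corked data ends up in a single datagram, the total is bounded by the maximum UDP datagram size, so very long corked sequences with large buffers will eventually fail.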
Send Path: udp_sendmsg()
Checks length, MSG_OOB
Checks if there are frames pending
  If so, jump to do_append_data
Gets the address
Checks if socket is connected
  If so, pull routing info out of sk
  Otherwise, look up via ip_route_output_flow()
Calls ip_append_data()
  Handles fragmentation
Calls udp_push_pending_frames()
udp_push_pending_frames()
Grabs the first pending skb on the socket's write queue via skb_peek() (the one with room for the UDP header)
  If there is none, goto out and bail
Creates UDP header
Checksums if necessary (or partially for UDP-Lite)
Calls ip_push_pending_frames()
  Combines all pending IP fragments on the socket as one IP datagram and sends it out
UDP Backup
Recall the sk_buff structure
linux-2.6.31/include/linux/skbuff.h
[Figure: an sk_buff linked through next/prev into an sk_buff_head list. Fields shown include sk (pointing to struct sock), tstamp, dev (pointing to net_device), transport_header, network_header, mac_header, head, data, tail, end, truesize and users. head, data, tail and end point into the packet data buffer, which is laid out as headroom, MAC-Header, IP-Header, UDP-Header, UDP-Data, tailroom. The buffer is followed by skb_shared_info (dataref: 1, nr_frags, ..., destructor_arg).]
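How head, data, tail and end carve the buffer into headroom, packet and tailroom can be sketched with a user-space mock. The `mock_skb_*` functions are hypothetical; they imitate only the pointer arithmetic of the kernel's skb_reserve(), skb_put() and skb_push(), and return NULL on overrun where the kernel would panic.

```c
#include <stddef.h>
#include <stdlib.h>

struct mock_skb {
    unsigned char *head;   /* start of the allocated buffer */
    unsigned char *data;   /* start of the packet data */
    unsigned char *tail;   /* end of the packet data */
    unsigned char *end;    /* end of the allocated buffer */
};

/* Allocate an empty buffer: no headroom, no data yet. */
static int mock_skb_alloc(struct mock_skb *skb, size_t size)
{
    skb->head = malloc(size);
    if (!skb->head)
        return -1;
    skb->data = skb->tail = skb->head;
    skb->end = skb->head + size;
    return 0;
}

/* Leave len bytes of headroom in an empty buffer (like skb_reserve()). */
static void mock_skb_reserve(struct mock_skb *skb, size_t len)
{
    skb->data += len;
    skb->tail += len;
}

/* Append len bytes at the tail (like skb_put()). */
static unsigned char *mock_skb_put(struct mock_skb *skb, size_t len)
{
    unsigned char *p = skb->tail;

    if ((size_t)(skb->end - skb->tail) < len)
        return NULL;
    skb->tail += len;
    return p;
}

/* Prepend len bytes into the headroom (like skb_push()). */
static unsigned char *mock_skb_push(struct mock_skb *skb, size_t len)
{
    if ((size_t)(skb->data - skb->head) < len)
        return NULL;
    skb->data -= len;
    return skb->data;
}

static size_t mock_skb_headroom(const struct mock_skb *skb)
{
    return (size_t)(skb->data - skb->head);
}

static size_t mock_skb_tailroom(const struct mock_skb *skb)
{
    return (size_t)(skb->end - skb->tail);
}

static size_t mock_skb_len(const struct mock_skb *skb)
{
    return (size_t)(skb->tail - skb->data);
}
```

On the send path the payload is put first and each layer then pushes its header in front of it, which is why the headroom is reserved up front: 14 + 20 + 8 bytes leaves space for the MAC, IP and UDP headers.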
Recall pkt_type in sk_buff
pkt_type: specifies the type of a packet
PACKET_HOST: a packet sent to the local host
PACKET_BROADCAST: a broadcast packet
PACKET_MULTICAST: a multicast packet
PACKET_OTHERHOST: a packet not destined for the local host, but received in promiscuous mode
PACKET_OUTGOING: a packet leaving the host
PACKET_LOOPBACK: a packet sent by the local host to itself
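The same constants are visible from user space on Linux through <linux/if_packet.h>, where they appear in the sll_pkttype field of struct sockaddr_ll on AF_PACKET sockets. A minimal sketch that maps a value to its name follows; `pkt_type_name()` is a hypothetical helper, not a kernel or libc function.

```c
#include <string.h>
#include <linux/if_packet.h>   /* PACKET_HOST ... PACKET_LOOPBACK (Linux only) */

/* Return the symbolic name for a pkt_type value, or "unknown". */
static const char *pkt_type_name(unsigned char pkt_type)
{
    switch (pkt_type) {
    case PACKET_HOST:
        return "PACKET_HOST";
    case PACKET_BROADCAST:
        return "PACKET_BROADCAST";
    case PACKET_MULTICAST:
        return "PACKET_MULTICAST";
    case PACKET_OTHERHOST:
        return "PACKET_OTHERHOST";
    case PACKET_OUTGOING:
        return "PACKET_OUTGOING";
    case PACKET_LOOPBACK:
        return "PACKET_LOOPBACK";
    default:
        return "unknown";
    }
}
```

A packet sniffer could use such a helper to label each captured frame, for example to separate frames addressed to the host from those seen only because the interface is in promiscuous mode.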