Post on 06-Feb-2016
Linux TCP/IP Stack
1: Physical Layer
2: Data Link
3: Network
4: Transport
5: Session
6: Presentation
7: Application
Interface Layer (Ethernet, etc.)
Protocol Layer (TCP / IP)
Socket layer
Process
TCP / IP vs. OSI model
TCP/IP Stack Overview
Process
1: sosend (……………... )
Socket Layer
2: tcp_output ( ……. )
Protocol Layer (TCP Layer)
3: ip_output ( ……. )
Interface Layer (Ethernet Device Driver)
Output Queue
5: recvfrom(……….)
Input Queue
3: ip_input ( ……... )
4: tcp_input ( ……... )
Protocol Layer (IP Layer)
4: ethernet_output ( ……. )
2: ethernet_input ( …….. )
Physical Media
Process Layer to TCP Layer
Process
send (int socket, const char *buf, int length, int flags)
Kernel
sendto (int socket, const char *data_buffer, int length, int flags, struct sockaddr *destination, int destination_length)
sendit (struct proc *p, int socket, struct msghdr *mp, int flags, int *return_size)
sosend (struct socket *s, struct mbuf *addr, struct uio *uio, struct mbuf *top, struct mbuf *control, int flags )
uipc_syscalls.c
uipc_socket.c
tcp_usrreq (struct socket *s, int request, struct mbuf *m, struct mbuf *nam, struct mbuf *control)
tcp_usrreq.c
tcp_output (struct tcpcb *tp)
tcp_output.c
TCP Layer
Socket Layer
sendto (int socket, const char *data_buffer, int length, int flags, struct sockaddr *destination, int destination_length)
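[Figure: a 150-byte data_buffer stored in a chain of two 128-byte mbufs]
First mbuf: 28-byte header, 100 bytes of data
m_next -> second mbuf
m_nextpkt = NULL
m_len = 100
m_data -> start of the data in this mbuf
m_type = MT_DATA
m_flags = M_PKTHDR
m_pkthdr.len = 150
m_pkthdr.rcvif = NULL
Second mbuf: 20-byte header, 50 bytes of data, 58 bytes of unused space
m_next = NULL
m_nextpkt = NULL
m_len = 50
m_data -> start of the data in this mbuf
m_type = MT_DATA
m_flags = 0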
MBUF Chain
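The chain in the figure can be sketched in a few lines of C. This is only an illustration under stated assumptions: `struct sketch_mbuf`, `chain_length` and `figure_chain_is_consistent` are hypothetical names, the structure keeps just the fields shown on the slide, and the `M_PKTHDR` value is assumed for the sketch rather than taken from a kernel header.

```c
#include <stddef.h>

#define M_PKTHDR 0x0002   /* flag value assumed for this sketch */

/* Simplified stand-in for an mbuf: only the fields named in the figure. */
struct sketch_mbuf {
    struct sketch_mbuf *m_next;     /* next mbuf in this chain */
    struct sketch_mbuf *m_nextpkt;  /* next packet (chain) in a queue */
    int m_len;                      /* bytes of data held in this mbuf */
    int m_flags;                    /* M_PKTHDR on the first mbuf of a packet */
    int pkthdr_len;                 /* stands in for m_pkthdr.len */
};

/* Walk the chain through m_next and add up the per-mbuf data lengths. */
static int chain_length(const struct sketch_mbuf *m)
{
    int total = 0;
    for (; m != NULL; m = m->m_next)
        total += m->m_len;
    return total;
}

/* Build the two-mbuf chain from the figure (100 bytes + 50 bytes) and
 * check that the packet header length matches the sum of the pieces. */
static int figure_chain_is_consistent(void)
{
    struct sketch_mbuf second = { NULL, NULL, 50, 0, 0 };
    struct sketch_mbuf first  = { &second, NULL, 100, M_PKTHDR, 150 };
    return chain_length(&first) == first.pkthdr_len;
}
```

The point of the sketch is that the total packet length lives only in the first mbuf, while each mbuf records its own share in m_len.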
Socket Layer - sosend passes data and control information to the protocol layer
sosend(struct socket *s, struct mbuf *addr, struct uio *uio, struct mbuf *data_buffer, struct mbuf *control, int flags)
Initialize a new memory buffer and variables to hold flags
yes
Is there enough space in the buffer
sbspace(s->sb_snd)
no
Copy data_buffer mbuf
Free the memory buffers received
More buffers to send? (1 / 0)
yes
no
int error = tcp_usrreq(s, flags, mbuf, addr, control)
error
Return value of error to sendto ( )
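The sosend flow above (check the space in the send buffer, copy a buffer's worth of data, repeat while more remains, then report the error) might look roughly like the following. This is a minimal sketch, not the kernel routine: `sketch_sockbuf`, `sketch_sbspace`, `sketch_sosend` and the `SKETCH_MLEN` chunk size are hypothetical, only byte counts are tracked, and the sketch returns an error instead of sleeping when the buffer is full.

```c
#include <errno.h>

#define SKETCH_MLEN 108   /* assumed data bytes per mbuf in this sketch */

/* Hypothetical, simplified send buffer: only byte counts are tracked. */
struct sketch_sockbuf {
    long sb_cc;     /* bytes currently queued */
    long sb_hiwat;  /* upper limit on queued bytes */
};

/* Free space left in the send buffer (never negative). */
static long sketch_sbspace(const struct sketch_sockbuf *sb)
{
    long space = sb->sb_hiwat - sb->sb_cc;
    return space > 0 ? space : 0;
}

/* Move resid bytes into the send buffer one mbuf-sized chunk at a time.
 * Returns 0 on success, or EWOULDBLOCK if the buffer fills up first. */
static int sketch_sosend(struct sketch_sockbuf *sb, long resid)
{
    while (resid > 0) {                 /* more buffers to send? */
        long space = sketch_sbspace(sb);
        if (space == 0)
            return EWOULDBLOCK;         /* not enough space in the buffer */
        long chunk = resid;
        if (chunk > space)
            chunk = space;
        if (chunk > SKETCH_MLEN)
            chunk = SKETCH_MLEN;
        sb->sb_cc += chunk;             /* stands in for passing the mbuf down */
        resid -= chunk;
    }
    return 0;
}
```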
TCP Layer - tcp_usrreq(struct socket *s, int request, struct mbuf *data_buffer, struct mbuf *nam, struct mbuf *control)
Initialize the Internet protocol control block inp and the TCP control block tp to store information useful for TCP
Convert the socket to an Internet protocol control block: inp = sotoinpcb(so)
Convert the Internet protocol control block to a TCP control block: tp = intotcpcb(inp)
request
PRU_SEND
int error = tcp_output(tp)
return error to tcp_usrreq( )
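The pointer hops from socket to Internet PCB to TCP control block, followed by the dispatch on the request, can be sketched as below. All `sketch_` names are hypothetical stand-ins, the structures keep only the link fields, and the request codes are placeholders rather than real PRU_ values.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, stripped-down control blocks: only the links are kept. */
struct sketch_tcpcb  { int segments_sent; };
struct sketch_inpcb  { struct sketch_tcpcb *inp_ppcb; };
struct sketch_socket { struct sketch_inpcb *so_pcb; };

/* Same idea as sotoinpcb() and intotcpcb(): follow one pointer each. */
#define sketch_sotoinpcb(so)   ((so)->so_pcb)
#define sketch_intotcpcb(inp)  ((inp)->inp_ppcb)

/* Placeholder request codes for this sketch only. */
enum sketch_request { SKETCH_PRU_SEND, SKETCH_PRU_OTHER };

/* Stand-in for tcp_output(): just counts how often it was called. */
static int sketch_tcp_output(struct sketch_tcpcb *tp)
{
    tp->segments_sent++;
    return 0;
}

static int sketch_tcp_usrreq(struct sketch_socket *so, enum sketch_request req)
{
    struct sketch_inpcb *inp = sketch_sotoinpcb(so);
    if (inp == NULL)
        return EINVAL;                    /* no protocol state attached */
    struct sketch_tcpcb *tp = sketch_intotcpcb(inp);

    switch (req) {
    case SKETCH_PRU_SEND:
        return sketch_tcp_output(tp);     /* error goes back to the caller */
    default:
        return EOPNOTSUPP;
    }
}
```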
Called by tcp_usrreq for one of the following reasons:
• To send the initial SYN
• To send a finished_sending message
• To send data
• To send a window update after data has been received
tcp_output ( ) functionality:
1. Determine whether TCP can send a segment, depending on:
• Flags in the data sent by the socket layer (to send an ACK, etc.)
• Size of the window advertised by the receiver’s end
• Amount of data ready to send
• Whether unacknowledged data already exists for the connection
2. Calculate the amount of data to be sent, depending on:
• Size of the receiver’s window
• Number of bytes in the send buffer
3. Check for window shrink
4. Send a segment:
• Allocate a buffer for the TCP and IP header from the header template
• Copy the TCP and IP header template into the buffer to be sent
• Fill in the fields of the TCP header
• Decrement the number of buffers to be sent, so that the end can be checked
• Set the sequence number and acknowledgement fields
• Set three fields in the IP header: IP length, TTL and TOS
• Pass the datagram to IP
TCP Layer (tcp_output.c) - tcp_output(struct tcpcb *tp)
struct socket *so = tp -> t_inpcb -> inp_socket
Initialize a tcp header tcp_header
idle
idle is true if the maximum sequence number sent equals the oldest unacknowledged sequence number, i.e. if an ACK is not expected from the other end.
int idle = (tp -> snd_max == tp -> snd_una)
Check ACK flag
Acknowledgement is not expected: set the congestion window to one segment
tp -> snd_cwnd = tp -> t_maxseg;
true
false
TCP Layer - tcp_output(struct tcpcb *tp)
Determine the length of data that should be transmitted and the flags to be used.
len is the smaller of the number of bytes in the send buffer and win (the minimum of the receiver’s window and the congestion window), less off.
len = min(so -> so_snd.sb_cc, win) - off
Acknowledgement is not expected: set the congestion window to one segment
tp -> snd_cwnd = tp -> t_maxseg;
off is the offset in bytes from the beginning of the send buffer of the first data byte to send. off bytes have already been sent and acknowledgement of those is awaited.
int off = tp -> snd_nxt - tp -> snd_una
Determine the flags, such as TH_ACK, TH_FIN, TH_RST, TH_SYN
flags = tcp_outflags [ tp -> t_state ]
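The idle test and the off / win / len arithmetic can be tried out with a small sketch. The field names follow the slide, but `struct sketch_tcpcb` and the helper functions are hypothetical simplifications, and the send buffer is reduced to a byte count `sb_cc` passed as a parameter.

```c
#include <stdint.h>

/* Hypothetical subset of the TCP control block used on this slide. */
struct sketch_tcpcb {
    uint32_t snd_una;   /* oldest unacknowledged sequence number */
    uint32_t snd_nxt;   /* next sequence number to send */
    uint32_t snd_max;   /* highest sequence number sent */
    uint32_t snd_wnd;   /* window advertised by the receiver */
    uint32_t snd_cwnd;  /* congestion window */
};

static long sketch_min(long a, long b)
{
    return a < b ? a : b;
}

/* Nothing outstanding: everything sent so far has been acknowledged. */
static int sketch_idle(const struct sketch_tcpcb *tp)
{
    return tp->snd_max == tp->snd_una;
}

/* len = min(bytes in send buffer, win) - off, where win is the smaller
 * of the receiver's window and the congestion window. A negative result
 * means the window has shrunk below what is already in flight. */
static long sketch_send_len(const struct sketch_tcpcb *tp, long sb_cc)
{
    long off = (long)(tp->snd_nxt - tp->snd_una);
    long win = sketch_min((long)tp->snd_wnd, (long)tp->snd_cwnd);
    return sketch_min(sb_cc, win) - off;
}
```

For instance, with 100 bytes in flight, a 4096-byte receiver window, a 512-byte congestion window and 2000 bytes buffered, the sketch allows 412 more bytes.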
TCP Layer - tcp_output(struct tcpcb *tp)
tp -> t_flags & TF_ACKNOW
Send acknowledgement
Determine the flags, such as TH_ACK, TH_FIN, TH_RST, TH_SYN
flags = tcp_outflags [ tp -> t_state ]
flags & (TH_SYN | TH_RST)
flags & TH_FIN
true
false
false
true: Send sequence number or reset
Finished sending
true
false
Check flags to determine the type of message:
• window probe
• retransmission
• normal data transmission
Length of data < 44 bytes (100 - 40 - 16)
yes
Create a new mbuf chain, copy the surplus data and point it to the first mbuf chain.
Allocate an mbuf for the TCP & IP header and data if possible.
MGETHDR ( m, M_DONTWAIT, MT_HEADER)
M_DONTWAIT indicates that if memory is not available for the mbuf, the routine returns immediately with an error.
no
Copy the data from the socket send buffer into the new packet header mbuf
ip_output(m, tp -> t_inpcb -> inp_options, &tp -> t_inpcb -> inp_route, so -> so_options & SO_DONTROUTE, 0)
Packets damaged?
ip_output.c
ip_output(struct mbuf *m, struct mbuf *opt, struct route *ro, int flags, struct ip_moptions *imo)
1. Header initialization
2. Route Selection
3. Source address selection and Fragmentation
1. Header initialization
The value of “flags” decides what is to be done with the data:
• IP_FORWARDING : Forward packet
• IP_ROUTETOIF : Route directly to interface
• IP_ALLOWBROADCAST : Allow broadcasting of packet
• IP_RAWOUTPUT : Packet contains pre-constructed header
yes ERROR
if ((flags == IP_FORWARDING ) || (flags == IP_RAWOUTPUT ))
no
Save header length in hlen for fragmentation algorithm
Construct and initialize the IP header: set ip_v = 4, clear ip_off, assign a unique identifier to ip_id.
Length, offset, TTL, protocol, TOS etc. are set by higher layers.
yes
no
If the packet has to be forwarded to another host, i.e. if the machine is acting as a router, then the IP header of the forwarded packet should not be modified by ip_output.
Check if there were any errors while adding headers in higher layers. Most of the fields of the IP header are predefined by higher layer protocols.
If the packet is not being forwarded and has to be sent to another host then initialize the IP header.
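The header-initialization decision can be sketched as follows. This is an illustration only: `struct sketch_ip` keeps just the fields mentioned above, the flag values and the identifier counter are placeholders, and byte-order conversion is left out.

```c
#include <stdint.h>

/* Placeholder flag values for this sketch. */
#define SKETCH_IP_FORWARDING 0x1
#define SKETCH_IP_RAWOUTPUT  0x2

/* Hypothetical subset of the IP header fields mentioned on the slide. */
struct sketch_ip {
    uint8_t  ip_v;    /* version */
    uint8_t  ip_hl;   /* header length in 32-bit words */
    uint16_t ip_off;  /* fragment offset and flags */
    uint16_t ip_id;   /* datagram identifier */
};

static uint16_t sketch_next_ip_id;   /* stand-in for a global id counter */

/* Initialize the header unless the packet is forwarded or carries a
 * pre-constructed header. Returns hlen, the header length in bytes,
 * which the fragmentation step needs later. */
static int sketch_ip_init(struct sketch_ip *ip, int flags)
{
    if ((flags & (SKETCH_IP_FORWARDING | SKETCH_IP_RAWOUTPUT)) == 0) {
        ip->ip_v   = 4;
        ip->ip_off = 0;
        ip->ip_id  = sketch_next_ip_id++;
        ip->ip_hl  = 5;               /* 20-byte header, no options */
    }
    return ip->ip_hl << 2;
}
```

Testing the flags as a bit mask, rather than comparing for equality, keeps the check correct when several flags are set at once.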
2. Route Selection
Verify Cached Route for destination address
Find the interface on which the packet has to be placed. ifp points to the interface’s ifnet structure.
If (cached_route == destination)
Locate route: Call rtalloc(dst_ip) to locate a route to the destination. Find the interface on which the packet has to be placed. ifp points to the interface’s ifnet structure. If rtalloc(dst_ip) fails to find a route, return a host unreachable error.
yes
no
A cached route may be provided to ip_output as an argument. UDP and TCP maintain a route cache associated with each socket.
Check if the cached route is for the correct destination. If a route has not been provided, ip_output sets a temporary route structure called iproute.
If the cached route is provided, find the interface on which the frame has to be sent.
If the packet is being routed, rtalloc locates a route to the address specified by dst. If rtalloc fails, an EHOSTUNREACH error is generated. If ip_forward called ip_output, the error is converted to an ICMP error. If the address is found, then ifp is made to point to the ifnet structure for the interface. If the next hop is not the packet’s final destination, then dst is changed to point to the next hop router.
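A rough sketch of the cached-route check follows, assuming a toy two-entry routing table in place of the real lookup. `sketch_route`, `sketch_rtalloc`, `sketch_select_route` and the table contents are all hypothetical; an interface index stands in for the ifp pointer.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical route cache entry, as kept per socket. */
struct sketch_route {
    int      ro_valid;    /* nonzero once a route has been cached */
    uint32_t ro_dst;      /* destination the cached route is for */
    int      ro_ifindex;  /* stands in for the ifp pointer */
};

/* Toy routing table used only by this sketch. */
static const struct { uint32_t dst; int ifindex; } sketch_table[] = {
    { 0x0A000001u, 1 },   /* 10.0.0.1    via interface 1 */
    { 0xC0A80001u, 2 },   /* 192.168.0.1 via interface 2 */
};

/* Stand-in for the route lookup: returns 1 and fills *ifindex on a hit. */
static int sketch_rtalloc(uint32_t dst, int *ifindex)
{
    for (size_t i = 0; i < sizeof sketch_table / sizeof sketch_table[0]; i++) {
        if (sketch_table[i].dst == dst) {
            *ifindex = sketch_table[i].ifindex;
            return 1;
        }
    }
    return 0;
}

/* Use the cached route if it matches dst, otherwise look one up. */
static int sketch_select_route(struct sketch_route *ro, uint32_t dst, int *ifindex)
{
    if (ro->ro_valid && ro->ro_dst != dst)
        ro->ro_valid = 0;             /* cache is for another destination */
    if (!ro->ro_valid) {
        int found;
        if (!sketch_rtalloc(dst, &found))
            return EHOSTUNREACH;
        ro->ro_valid = 1;
        ro->ro_dst = dst;
        ro->ro_ifindex = found;
    }
    *ifindex = ro->ro_ifindex;
    return 0;
}
```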
3. Source address selection and Fragmentation
Check if valid source address is specified.
Select the IP address of the outgoing interface as the source address.
Does the packet have to be fragmented ?
Fragment the packet if its size is greater than the MTU.
If there are no checksum errors, send the data to the if_output function of the selected interface.
no
yes
yes
no
The final section of ip_output ensures that the IP header has a valid source IP address. This couldn’t have been done earlier because the route hadn’t been selected yet. If there is no source IP, then the IP address of the outgoing interface is used as the source IP.
Larger packets (packets that exceed the MTU) must be fragmented before they can be sent.
In either case (fragmented or not) the checksum is computed (in_cksum). If no errors are found, the data is sent to the if_output function of the output interface.
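Both steps lend themselves to a short sketch: the standard one's-complement Internet checksum over the header bytes, and a count of how many fragments a datagram needs for a given MTU. The function names are hypothetical, and the code works on a flat byte array rather than an mbuf chain, so it is not how in_cksum itself is written.

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement Internet checksum over a byte buffer, with the
 * checksum field assumed to be zero while the sum is computed. */
static uint16_t sketch_in_cksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {                       /* add 16-bit big-endian words */
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1)                           /* odd trailing byte, zero padded */
        sum += (uint32_t)data[0] << 8;
    while (sum >> 16)                       /* fold the carries back in */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Number of fragments needed for a datagram of ip_len bytes (header
 * included). Each fragment's payload must be a multiple of 8 bytes.
 * Returns -1 if the MTU leaves no room for any payload. */
static int sketch_fragment_count(int ip_len, int hlen, int mtu)
{
    if (ip_len <= mtu)
        return 1;                           /* fits, no fragmentation */
    int per_fragment = (mtu - hlen) & ~7;
    if (per_fragment < 8)
        return -1;
    int payload = ip_len - hlen;
    return (payload + per_fragment - 1) / per_fragment;
}
```

With a 20-byte header and a 1500-byte MTU, each fragment carries at most 1480 payload bytes, so a 4000-byte datagram is split into three fragments.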
Interface Layer (if_ethersubr.c)
ether_output(struct ifnet *ifp, struct mbuf *mbuf, struct sockaddr *destination, struct rtentry *routing_entry)
1. Verification
2. Protocol-Specific Processing
3. Frame Construction
4. Interface Queuing
Ethernet port up and running?
ifp -> if_flags & (IFF_UP | IFF_RUNNING)
senderr (ENETDOWN)
no
yes
1. Verification
Interface Layer (if_ethersubr.c) - ether_output(struct ifnet *ifp, struct mbuf *mbuf, struct sockaddr *destination, struct rtentry *rt_entry)
Function: Takes the data portion of an Ethernet frame, encapsulates it with a 14-byte header and places it on the interface send queue.
Phases: Verification, Protocol-Specific Processing, Frame Construction, Interface Queuing.
Arguments:
• ifp points to the outgoing interface’s ifnet structure
• mbuf is the data to be sent
• destination is the destination address
• rt_entry points to the routing entry
Initialize - Ethernet header - struct eth_header *eh
Ethernet port up and running?
ifp -> if_flags & (IFF_UP | IFF_RUNNING)
senderr (ENETDOWN)
no
yes
Verification
Route valid?
senderr (EHOSTUNREACH)
rt_entry = rtalloc1 (destination, 1)
0
1
Next hop a gateway?
rt = rt -> rt_gwroute
0
1
Destination responding to ARP requests?
If not, then do not send more packets, to avoid flooding.
rt -> rt_flags & RTF_REJECT
no
Verification
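The first verification step depends on both flags being set at once, which is easy to get wrong with a plain bit test. A minimal sketch, with `sketch_verify_interface` as a hypothetical helper and the flag values assumed for illustration:

```c
#include <errno.h>

/* Flag values assumed for this sketch. */
#define SKETCH_IFF_UP      0x01
#define SKETCH_IFF_RUNNING 0x40

/* The interface is usable only if it is both up and running.
 * A nonzero result of the AND alone would also accept "up but not
 * running", so the masked value is compared against the full mask. */
static int sketch_verify_interface(int if_flags)
{
    const int need = SKETCH_IFF_UP | SKETCH_IFF_RUNNING;
    return ((if_flags & need) == need) ? 0 : ENETDOWN;
}
```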
Protocol Specific Processing
Functionality: Finds the Ethernet address corresponding to the IP address of the destination.
Use m_copy( ) to keep the packet until an ACK is received.
destination -> sa_family
AF_INET
Send ARP broadcast to find the Ethernet address corresponding to the destination IP address
Frame Preparation
Make sure there is room for the 14-byte Ethernet header
M_PREPEND ( m, sizeof(ethernet_header), M_DONTWAIT)
Frame Preparation
Protocol Specific Processing
Form the Ethernet header from the Ethernet frame type, the destination Ethernet MAC address (e.g. that of the default gateway for a host) and the unicast Ethernet address associated with the output interface.
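The 14-byte header layout (6-byte destination, 6-byte source, 2-byte type) can be sketched by writing into a plain byte buffer. `sketch_ether_header` is a hypothetical helper; the kernel code fills a header structure in front of the mbuf data instead.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_ETHER_ADDR_LEN 6
#define SKETCH_ETHER_HDR_LEN  14   /* 6 + 6 + 2 bytes */

/* Fill the first 14 bytes of frame with destination address, source
 * address and frame type. The type goes out most significant byte first. */
static void sketch_ether_header(uint8_t *frame,
                                const uint8_t dst[SKETCH_ETHER_ADDR_LEN],
                                const uint8_t src[SKETCH_ETHER_ADDR_LEN],
                                uint16_t type)
{
    memcpy(frame, dst, SKETCH_ETHER_ADDR_LEN);
    memcpy(frame + SKETCH_ETHER_ADDR_LEN, src, SKETCH_ETHER_ADDR_LEN);
    frame[12] = (uint8_t)(type >> 8);
    frame[13] = (uint8_t)(type & 0xff);
}
```

Passing 0x0800 as the type marks the payload as an IPv4 datagram.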
Interface Queuing
Frame Preparation
Is the output queue full
no
yes
Discard the frame
Free the memory buffer
senderr ( ENOBUFS )
Place the frame on the interface’s send queue
lestart ( ifp )
if_snd
lestart ( ifp )
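The queuing step can be sketched with counters in place of real frames. Everything here is hypothetical: `sketch_ifnet` folds the queue fields into the interface structure, `sketch_start` stands in for the driver start routine, and the `SKETCH_IFF_OACTIVE` value is assumed.

```c
#include <errno.h>

#define SKETCH_IFF_OACTIVE 0x400   /* value assumed for this sketch */

/* Hypothetical interface with its send queue reduced to counters. */
struct sketch_ifnet {
    int if_flags;
    int ifq_len;      /* frames currently queued */
    int ifq_maxlen;   /* queue capacity */
    int ifq_drops;    /* frames discarded because the queue was full */
    int starts;       /* times the start routine was invoked */
};

/* Stand-in for the driver start routine: marks the device busy. */
static void sketch_start(struct sketch_ifnet *ifp)
{
    ifp->starts++;
    ifp->if_flags |= SKETCH_IFF_OACTIVE;
}

/* Queue one frame, or drop it with ENOBUFS if the queue is full.
 * The device is started only if it is not already transmitting. */
static int sketch_enqueue(struct sketch_ifnet *ifp)
{
    if (ifp->ifq_len >= ifp->ifq_maxlen) {
        ifp->ifq_drops++;
        return ENOBUFS;
    }
    ifp->ifq_len++;
    if ((ifp->if_flags & SKETCH_IFF_OACTIVE) == 0)
        sketch_start(ifp);
    return 0;
}
```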
Interface Layer (if_le.c) - lestart(struct ifnet *ifp)
Function: Dequeues frames from the interface output queue and arranges for them to be transmitted by the Ethernet Card.
le -> sc_if.if_flags & IFF_RUNNING
struct le_softc *le = &le_softc [ ifp -> if_unit ]
return error
1
0
Copy the frame in the mbuf to the hardware buffer
Set the IFF_OACTIVE flag to indicate that the device is busy transmitting.