Network Layer 4-1
Chapter 2: Delivering the data
Adapted from slides provided for:
Computer Networking: A Top Down Approach, 5th edition. Jim Kurose, Keith Ross, Addison-Wesley
All material copyright 1996-2010 J.F. Kurose and K.W. Ross, All Rights Reserved
Chapter goals: understand how data moves between layers and systems on the network
  IP address, subnet
  Routing
  Routing table
  Address resolution
  Protocols, ports and sockets
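Interplay between routing, forwarding
[Figure: the routing algorithm fills in the local forwarding table; the value in an arriving packet’s header (0111 in the figure) selects one of the router's output links 1, 2, 3]
local forwarding table:
  header value   output link
  0100           3
  0101           2
  0111           2
  1001           1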
IP datagram format (each row of the header is 32 bits wide)
  ver: IP protocol version number
  head. len: header length (the field counts 32-bit words; 5 words = 20 bytes)
  type of service: “type” of data
  length: total datagram length (bytes)
  16-bit identifier, flgs, fragment offset: for fragmentation/reassembly
  time to live: max number remaining hops (decremented at each router)
  upper layer: upper layer protocol to deliver payload to
  header checksum
  32 bit source IP address
  32 bit destination IP address
  Options (if any): e.g. timestamp, record route taken, specify list of routers to visit
  data (variable length, typically a TCP or UDP segment)
IP Addressing: introduction
IP address: 32-bit identifier for a network interface
interface: connection between host/router and physical link
  routers typically have multiple interfaces
  a host typically has one interface
  • a host with multiple interfaces can act as a router
Command: ifconfig
[Figure: one router joining three subnets: hosts 223.1.1.1, 223.1.1.2, 223.1.1.3 and router interface 223.1.1.4; hosts 223.1.2.1, 223.1.2.2 and router interface 223.1.2.9; hosts 223.1.3.1, 223.1.3.2 and router interface 223.1.3.27]
223.1.1.1 = 11011111 00000001 00000001 00000001
            223      1        1        1
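The dotted-decimal/binary correspondence above can be checked mechanically. The following is a minimal sketch, assuming plain IPv4 text input; the class name DottedQuad and its methods are hypothetical helpers, not something the slides define.

```java
// Sketch only: DottedQuad is a hypothetical helper, not part of the slides.
final class DottedQuad {
    /** Packs "a.b.c.d" into a 32-bit int, first octet in the top byte. */
    static int toInt(String dotted) {
        String[] parts = dotted.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected four octets: " + dotted);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + part);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    /** Renders the 32 bits as four space-separated groups of eight. */
    static String toBinary(int address) {
        StringBuilder out = new StringBuilder();
        for (int shift = 24; shift >= 0; shift -= 8) {
            int octet = (address >>> shift) & 0xFF;
            // Left-pad each octet to eight binary digits.
            String bits = String.format("%8s", Integer.toBinaryString(octet)).replace(' ', '0');
            out.append(bits);
            if (shift > 0) {
                out.append(' ');
            }
        }
        return out.toString();
    }
}
```

Calling DottedQuad.toBinary(DottedQuad.toInt("223.1.1.1")) reproduces the bit pattern shown on the slide.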
IP Address vs MAC Address
MAC address:
  Globally unique
  Statically configured by manufacturer
  Flat
IP address:
  Not necessarily unique
  Dynamically assigned
  Hierarchical: made up of network part and host part, corresponding to hierarchy in the Internet
Discussion: difference between IP address vs domain name?
[zhang@storm ~]$ ifconfig
em1 Link encap:Ethernet HWaddr B4:99:BA:01:3B:F6
inet addr:150.108.68.26 Bcast:150.108.68.255 Mask:255.255.255.0
inet6 addr: fe80::b699:baff:fe01:3bf6/64 Scope:Link
….
em2 Link encap:Ethernet HWaddr B4:99:BA:01:3B:F8
UP BROADCAST MULTICAST MTU:1500 Metric:1
…
em3 Link encap:Ethernet HWaddr B4:99:BA:01:3B:FA
UP BROADCAST MULTICAST MTU:1500 Metric:1
….
em4 Link encap:Ethernet HWaddr B4:99:BA:01:3B:FC
UP BROADCAST MULTICAST MTU:1500 Metric:1
….
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
…
virbr0 Link encap:Ethernet HWaddr 52:54:00:F2:86:A6
inet addr:192.168.122.1 Bcast:192.168.122.255 Mask:255.255.255.0
…
Private IP address used in CIS dept. network
IP address within Fordham network
IP Addresses
“class-full” addressing (32 bits):
  class   leading bits   remaining bits      address range
  A       0              network, host       1.0.0.0 to 127.255.255.255
  B       10             network, host       128.0.0.0 to 191.255.255.255
  C       110            network, host       192.0.0.0 to 223.255.255.255
  D       1110           multicast address   224.0.0.0 to 239.255.255.255
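Because the class is fixed by the leading bits of the first octet, it can be read off with a few masks. A minimal sketch follows; the class name ClassfulAddress is hypothetical, and class E (leading bits 1111) is a standard addition that the table above does not list.

```java
// Sketch only: ClassfulAddress is a hypothetical helper, not part of the slides.
final class ClassfulAddress {
    /** Returns 'A'..'D' per the table above, or 'E' for the reserved 1111 prefix. */
    static char classOf(String dotted) {
        int first = Integer.parseInt(dotted.split("\\.")[0]);
        if (first < 0 || first > 255) {
            throw new IllegalArgumentException("first octet out of range: " + first);
        }
        if ((first & 0x80) == 0x00) return 'A'; // 0xxxxxxx
        if ((first & 0xC0) == 0x80) return 'B'; // 10xxxxxx
        if ((first & 0xE0) == 0xC0) return 'C'; // 110xxxxx
        if ((first & 0xF0) == 0xE0) return 'D'; // 1110xxxx (multicast)
        return 'E';                             // 1111xxxx (assumption: not on the slide)
    }
}
```

For example, the ifconfig address 150.108.68.26 falls in class B, while 223.1.1.1 is class C.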
Getting a datagram from source to dest.
IP datagram: | misc fields | source IP addr | dest IP addr | data |
  datagram remains unchanged, as it travels source to destination
  addr fields of interest here
[Figure: the topology from the IP Addressing slide (subnets 223.1.1, 223.1.2, 223.1.3 joined by one router), with hosts A (223.1.1.1), B (223.1.1.3) and E (223.1.2.2)]
forwarding table in A:
  Dest. Net.   next router   Nhops
  223.1.1                    1
  223.1.2      223.1.1.4     2
  223.1.3      223.1.1.4     2
Getting a datagram from source to dest.
Starting at A, send IP datagram addressed to B:
  look up net. address of B in forwarding table
  find B is on same net. as A
  link layer will send datagram directly to B inside link-layer frame
    B and A are directly connected
IP datagram: | misc fields | 223.1.1.1 | 223.1.1.3 | data |
forwarding table in A:
  Dest. Net.   next router   Nhops
  223.1.1                    1
  223.1.2      223.1.1.4     2
  223.1.3      223.1.1.4     2
Getting a datagram from source to dest.
Starting at A, dest. E:
  look up network address of E in forwarding table
  E on different network
    A, E not directly attached
  routing table: next hop router to E is 223.1.1.4
  link layer sends datagram to router 223.1.1.4 inside link-layer frame
  datagram arrives at 223.1.1.4
  continued…..
IP datagram: | misc fields | 223.1.1.1 | 223.1.2.2 | data |
forwarding table in A:
  Dest. Net.   next router   Nhops
  223.1.1                    1
  223.1.2      223.1.1.4     2
  223.1.3      223.1.1.4     2
Getting a datagram from source to dest.
Arriving at 223.1.1.4, destined for 223.1.2.2
  look up network address of E in router’s forwarding table
  E on same network as router’s interface 223.1.2.9
    router, E directly attached
  link layer sends datagram to 223.1.2.2 inside link-layer frame via interface 223.1.2.9
  datagram arrives at 223.1.2.2!!! (hooray!)
IP datagram: | misc fields | 223.1.1.1 | 223.1.2.2 | data |
forwarding table in router:
  Dest. Net   router   Nhops   interface
  223.1.1     -        1       223.1.1.4
  223.1.2     -        1       223.1.2.9
  223.1.3     -        1       223.1.3.27
Subnetting
Problem 1: Any network with more than 254 hosts (the most a class C network can hold) needed a class B address, or many class C addresses
Problem 2: Each new network implies an additional entry in the forwarding table, which leads to large tables
Solution: Share one network number between several networks.
…Subnetting
Made most sense for large corporations or campuses
Corporation networks share 1 network number
Number of other networks within the corporation, using subnet masks
  E.g. a class B address is shared among 8 networks, by using a 19-bit “subnet mask” (255.255.224.0 = 11111111 11111111 11100000 00000000)
  I.e. subnet addresses are defined by 1st 19 bits of the IP address. Host part now has a “subnet” part in it.
Class B network address continues to be advertised to the rest of the Internet, subnetting only used “within campus”
Subnet mask
Introduce another level of hierarchy into IP address
8 bits are borrowed from the host address field to create the subnet address field.
Subnet mask: 255.255.255.0, i.e., all 1’s in upper 24 bits and 0’s in lower 8 bits
  * 24 bits are network number
  * 8 bits are host number
Forwarding Ex. with Subnet Masks
• Routing Table:
  SubnetNumber   SubnetMask      NextHop
  128.96.170.0   255.255.254.0   Intface 0
  128.96.168.0   255.255.254.0   Intface 1
  128.96.166.0   255.255.254.0   R2
  128.96.164.0   255.255.252.0   R3
  Default                        R4
Forwarding pseudocode:
  D = Dest IP Address
  For each table entry (SubnetNumber, SubnetMask, NextHop)
      If (D & SubnetMask == SubnetNumber)
          if NextHop is an interface
              forward datagram to the interface
          else
              deliver datagram to NextHop (a router)
  If no entry matches, deliver datagram to the Default router (R4)
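The forwarding pseudocode can be turned into a small runnable sketch. This is one possible rendering, assuming entries are tried in table order and the first match wins; the class SubnetForwarder and its method names are hypothetical, while the table contents are the ones from the slide.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: SubnetForwarder is a hypothetical name; entries are tried in table order.
final class SubnetForwarder {
    private static final class Entry {
        final int subnetNumber;
        final int subnetMask;
        final String nextHop;

        Entry(int subnetNumber, int subnetMask, String nextHop) {
            this.subnetNumber = subnetNumber;
            this.subnetMask = subnetMask;
            this.nextHop = nextHop;
        }
    }

    private final List<Entry> entries = new ArrayList<>();
    private final String defaultHop;

    SubnetForwarder(String defaultHop) {
        this.defaultHop = defaultHop;
    }

    void add(String subnetNumber, String subnetMask, String nextHop) {
        entries.add(new Entry(toInt(subnetNumber), toInt(subnetMask), nextHop));
    }

    /** D & SubnetMask == SubnetNumber, first matching entry wins, else the default. */
    String lookup(String destination) {
        int d = toInt(destination);
        for (Entry e : entries) {
            if ((d & e.subnetMask) == e.subnetNumber) {
                return e.nextHop;
            }
        }
        return defaultHop;
    }

    /** Builds the routing table shown on the slide. */
    static SubnetForwarder slideTable() {
        SubnetForwarder table = new SubnetForwarder("R4");
        table.add("128.96.170.0", "255.255.254.0", "Intface 0");
        table.add("128.96.168.0", "255.255.254.0", "Intface 1");
        table.add("128.96.166.0", "255.255.254.0", "R2");
        table.add("128.96.164.0", "255.255.252.0", "R3");
        return table;
    }

    static int toInt(String dotted) {
        int value = 0;
        for (String part : dotted.split("\\.")) {
            value = (value << 8) | Integer.parseInt(part);
        }
        return value;
    }
}
```

With this table, a destination such as 128.96.167.151 masks to 128.96.166.0 and is handed to R2, while an address outside every subnet (for instance 192.4.153.90, chosen here only for illustration) falls through to R4.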
IP addressing: CIDR
Classful addressing:
  inefficient use of address space, address space exhaustion
  e.g., class B net allocated enough addresses for 65K hosts, even if only 2K hosts in that network
CIDR: Classless InterDomain Routing
  network portion of address of arbitrary length
  address format: a.b.c.d/x, where x is # bits in network portion of address
Example: 200.23.16.0/23
  11001000 00010111 0001000 | 0 00000000
  network part (first 23 bits) | host part (last 9 bits)
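The a.b.c.d/x notation maps directly onto a 32-bit mask with x leading ones. The sketch below shows one way to derive the mask and test membership; CidrBlock and its methods are hypothetical names introduced for illustration.

```java
// Sketch only: CidrBlock is a hypothetical helper, not part of the slides.
final class CidrBlock {
    final int network;
    final int prefixLength;

    private CidrBlock(int network, int prefixLength) {
        this.network = network;
        this.prefixLength = prefixLength;
    }

    /** Parses "a.b.c.d/x"; host bits in the address part are cleared. */
    static CidrBlock parse(String text) {
        String[] halves = text.split("/");
        if (halves.length != 2) {
            throw new IllegalArgumentException("expected a.b.c.d/x: " + text);
        }
        int x = Integer.parseInt(halves[1]);
        if (x < 0 || x > 32) {
            throw new IllegalArgumentException("prefix length out of range: " + x);
        }
        return new CidrBlock(toInt(halves[0]) & maskFor(x), x);
    }

    /** x leading one-bits; x == 0 is special-cased because Java shifts are taken mod 32. */
    static int maskFor(int x) {
        return x == 0 ? 0 : -1 << (32 - x);
    }

    boolean contains(String dotted) {
        return (toInt(dotted) & maskFor(prefixLength)) == network;
    }

    /** Number of addresses in the block, including network and broadcast addresses. */
    long size() {
        return 1L << (32 - prefixLength);
    }

    static String toDotted(int a) {
        return ((a >>> 24) & 0xFF) + "." + ((a >>> 16) & 0xFF) + "."
                + ((a >>> 8) & 0xFF) + "." + (a & 0xFF);
    }

    static int toInt(String dotted) {
        int value = 0;
        for (String part : dotted.split("\\.")) {
            value = (value << 8) | Integer.parseInt(part);
        }
        return value;
    }
}
```

For 200.23.16.0/23 the mask comes out as 255.255.254.0 and the block holds 512 addresses, so 200.23.17.255 is inside it and 200.23.18.0 is not.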
Special IP address within subnet
NETWORK ADDRESS
  A network address is an address where all host bits in the IP address are set to zero (0).
  first and lowest numbered address
BROADCAST ADDRESS
  all host bits in the IP address are set to one (1).
  last address in the range of addresses
  All hosts are to accept and respond to the broadcast address. This makes special services possible.
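Both special addresses follow from the subnet mask with one bitwise operation each. A minimal sketch, assuming the mask is given in dotted-decimal form as in the ifconfig output; SubnetEdges is a hypothetical helper name.

```java
// Sketch only: SubnetEdges is a hypothetical helper, not part of the slides.
final class SubnetEdges {
    /** All host bits cleared: address AND mask. */
    static String networkAddress(String address, String mask) {
        return toDotted(toInt(address) & toInt(mask));
    }

    /** All host bits set: address OR (NOT mask). */
    static String broadcastAddress(String address, String mask) {
        return toDotted(toInt(address) | ~toInt(mask));
    }

    static int toInt(String dotted) {
        int value = 0;
        for (String part : dotted.split("\\.")) {
            value = (value << 8) | Integer.parseInt(part);
        }
        return value;
    }

    static String toDotted(int a) {
        return ((a >>> 24) & 0xFF) + "." + ((a >>> 16) & 0xFF) + "."
                + ((a >>> 8) & 0xFF) + "." + (a & 0xFF);
    }
}
```

Applied to the em1 interface from the ifconfig listing (150.108.68.26 with mask 255.255.255.0), the broadcast address comes out as 150.108.68.255, matching the Bcast field shown there.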
Hosts
LOOPBACK ADDRESS
  The 127.0.0.0 class 'A' network is reserved for loopback; in practice a single address is used, the loopback address 127.0.0.1.
  used to test the host's own networking: packets sent to it are looped back inside the host and are not sent onto the physical link
  Every host should respond to this address.
  ping 127.0.0.1 to check that the local TCP/IP software is working
Special Use IP addresses
PRIVATE IP ADDRESSES
  RFC 1918 defines a number of IP blocks set aside by the Internet Assigned Numbers Authority (IANA) for use as private addresses on private networks that are not directly connected to the Internet.
  Class   Start         End
  A       10.0.0.0      10.255.255.255
  B       172.16.0.0    172.31.255.255
  C       192.168.0.0   192.168.255.255
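The three private ranges are the prefixes 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16, so a membership test needs only three masked comparisons. The sketch below assumes dotted-decimal input; PrivateRanges is a hypothetical helper name.

```java
// Sketch only: PrivateRanges is a hypothetical helper, not part of the slides.
final class PrivateRanges {
    /** True when the address lies in one of the three RFC 1918 blocks in the table above. */
    static boolean isPrivate(String dotted) {
        int a = toInt(dotted);
        return inBlock(a, "10.0.0.0", 8)       // 10.0.0.0 to 10.255.255.255
            || inBlock(a, "172.16.0.0", 12)    // 172.16.0.0 to 172.31.255.255
            || inBlock(a, "192.168.0.0", 16);  // 192.168.0.0 to 192.168.255.255
    }

    private static boolean inBlock(int address, String base, int prefixLength) {
        int mask = -1 << (32 - prefixLength); // prefixLength is never 0 here
        return (address & mask) == toInt(base);
    }

    static int toInt(String dotted) {
        int value = 0;
        for (String part : dotted.split("\\.")) {
            value = (value << 8) | Integer.parseInt(part);
        }
        return value;
    }
}
```

From the ifconfig listing, 192.168.122.1 tests as private and 150.108.68.26 does not.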
Special Use IP addresses
Multicast IP Addresses
  set aside for special purposes, such as the addresses used by OSPF, multicast, and experimental purposes; they cannot be used as ordinary host addresses on the Internet.
  Class   Start       End
  D       224.0.0.0   239.255.255.255
IP addresses: how to get one?
Q: How does a network get the network part of IP addr?
A: gets allocated portion of its provider ISP’s address space
  ISP's block      11001000 00010111 00010000 00000000   200.23.16.0/20
  Organization 0   11001000 00010111 00010000 00000000   200.23.16.0/23
  Organization 1   11001000 00010111 00010010 00000000   200.23.18.0/23
  Organization 2   11001000 00010111 00010100 00000000   200.23.20.0/23
  ...              …..                                   ….
  Organization 7   11001000 00010111 00011110 00000000   200.23.30.0/23
Hierarchical addressing: route aggregation
Hierarchical addressing allows efficient advertisement of routing information:
[Figure: Organization 0 (200.23.16.0/23), Organization 1 (200.23.18.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) connect to Fly-By-Night-ISP; Fly-By-Night-ISP and ISPs-R-Us both connect to the Internet]
  Fly-By-Night-ISP advertises: “Send me anything with addresses beginning 200.23.16.0/20”
  ISPs-R-Us advertises: “Send me anything with addresses beginning 199.31.0.0/16”
Hierarchical addressing: more specific routes
ISPs-R-Us has a more specific route to Organization 1
[Figure: as on the route aggregation slide, except that Organization 1 (200.23.18.0/23) is now attached to ISPs-R-Us instead of Fly-By-Night-ISP]
  Fly-By-Night-ISP advertises: “Send me anything with addresses beginning 200.23.16.0/20”
  ISPs-R-Us advertises: “Send me anything with addresses beginning 199.31.0.0/16 or 200.23.18.0/23”
Longest match
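Longest match means that when several advertised prefixes cover a destination, the one with the most network bits wins. Here is a minimal linear-scan sketch of that rule (real routers use faster structures); LongestMatchTable is a hypothetical name, and the routes in the usage note are the advertisements from this slide.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: LongestMatchTable is a hypothetical helper doing a simple linear scan.
final class LongestMatchTable {
    private static final class Route {
        final int network;
        final int prefixLength;
        final String nextHop;

        Route(int network, int prefixLength, String nextHop) {
            this.network = network;
            this.prefixLength = prefixLength;
            this.nextHop = nextHop;
        }
    }

    private final List<Route> routes = new ArrayList<>();

    /** Adds a route given as "a.b.c.d/x". */
    void add(String cidr, String nextHop) {
        String[] halves = cidr.split("/");
        int length = Integer.parseInt(halves[1]);
        routes.add(new Route(toInt(halves[0]) & maskFor(length), length, nextHop));
    }

    /** Returns the next hop of the matching route with the longest prefix, or null. */
    String lookup(String destination) {
        int d = toInt(destination);
        Route best = null;
        for (Route r : routes) {
            boolean matches = (d & maskFor(r.prefixLength)) == r.network;
            if (matches && (best == null || r.prefixLength > best.prefixLength)) {
                best = r;
            }
        }
        return best == null ? null : best.nextHop;
    }

    private static int maskFor(int x) {
        return x == 0 ? 0 : -1 << (32 - x);
    }

    private static int toInt(String dotted) {
        int value = 0;
        for (String part : dotted.split("\\.")) {
            value = (value << 8) | Integer.parseInt(part);
        }
        return value;
    }
}
```

After adding 200.23.16.0/20 for Fly-By-Night-ISP and both 199.31.0.0/16 and 200.23.18.0/23 for ISPs-R-Us, a lookup of an address inside Organization 1's block (say 200.23.18.5) selects ISPs-R-Us, because the /23 is longer than the /20 that also matches.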
IP addressing: the last word...
Q: How does an ISP get block of addresses?
A: ICANN: Internet Corporation for Assigned Names and Numbers
  allocates addresses
  manages DNS
  assigns domain names, resolves disputes
IP addresses: how to get one?
Q: How does a host get IP address?
hard-coded by system admin in a file
  Windows: control-panel->network->configuration->tcp/ip->properties
  UNIX: /etc/rc.config
DHCP: Dynamic Host Configuration Protocol: dynamically get address from a server
  “plug-and-play”
DHCP: Dynamic Host Configuration Protocol
Goal: allow host to dynamically obtain its IP address from network server when it joins network
  Can renew its lease on address in use
  Allows reuse of addresses (only hold address while connected and “on”)
  Support for mobile users who want to join network (more shortly)
DHCP overview:
  host broadcasts “DHCP discover” msg [optional]
  DHCP server responds with “DHCP offer” msg [optional]
  host requests IP address: “DHCP request” msg
  DHCP server sends address: “DHCP ack” msg
DHCP client-server scenario
[Figure: the topology from the IP Addressing slide (subnets 223.1.1, 223.1.2, 223.1.3 joined by one router, hosts A, B, E), with a DHCP server attached; an arriving DHCP client needs an address in this network]
DHCP client-server scenario
DHCP server: 223.1.2.5; arriving client (messages listed in time order)
DHCP discover
  src: 0.0.0.0, 68
  dest: 255.255.255.255, 67
  yiaddr: 0.0.0.0
  transaction ID: 654
DHCP offer
  src: 223.1.2.5, 67
  dest: 255.255.255.255, 68
  yiaddr: 223.1.2.4
  transaction ID: 654
  Lifetime: 3600 secs
DHCP request
  src: 0.0.0.0, 68
  dest: 255.255.255.255, 67
  yiaddr: 223.1.2.4
  transaction ID: 655
  Lifetime: 3600 secs
DHCP ACK
  src: 223.1.2.5, 67
  dest: 255.255.255.255, 68
  yiaddr: 223.1.2.4
  transaction ID: 655
  Lifetime: 3600 secs
DHCP: more than IP address
DHCP can return more than just allocated IP address on subnet:
  address of first-hop router for client
  name and IP address of DNS server
  network mask (indicating network versus host portion of address)
NAT: Network Address Translation
[Figure: local network (e.g., home network) 10.0.0/24 with addresses 10.0.0.1, 10.0.0.2, 10.0.0.3, 10.0.0.4; the NAT router connects it to the rest of the Internet through address 138.76.29.7]
Datagrams with source or destination in this network have 10.0.0/24 address for source, destination (as usual)
All datagrams leaving local network have same single source NAT IP address: 138.76.29.7, different source port numbers
NAT: Network Address Translation
Motivation: local network uses just one IP address as far as outside world is concerned:
  range of addresses not needed from ISP: just one IP address for all devices
  can change addresses of devices in local network without notifying outside world
  can change ISP without changing addresses of devices in local network
  devices inside local net not explicitly addressable, visible by outside world (a security plus).
NAT: Network Address Translation
Implementation: NAT router must:
  outgoing datagrams: replace (source IP address, port #) of every outgoing datagram to (NAT IP address, new port #)
    . . . remote clients/servers will respond using (NAT IP address, new port #) as destination addr.
  remember (in NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pair
  incoming datagrams: replace (NAT IP address, new port #) in dest fields of every incoming datagram with corresponding (source IP address, port #) stored in NAT table
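The three duties of a NAT router listed above amount to maintaining a two-way translation table. The following sketch models only that table, not packet rewriting; NatTable, its method names, and the starting port passed to the constructor are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: NatTable is a hypothetical model of the translation table, not a real NAT.
final class NatTable {
    private final String natIp;
    private int nextPort; // next unused NAT-side port (starting value is an assumption)
    private final Map<String, Integer> outbound = new HashMap<>(); // "ip:port" -> new port
    private final Map<Integer, String> inbound = new HashMap<>();  // new port -> "ip:port"

    NatTable(String natIp, int firstPort) {
        this.natIp = natIp;
        this.nextPort = firstPort;
    }

    /** Outgoing: map (source IP, port) to (NAT IP, new port), remembering the pair. */
    String translateOutgoing(String sourceIp, int sourcePort) {
        String inside = sourceIp + ":" + sourcePort;
        Integer mapped = outbound.get(inside);
        if (mapped == null) {
            mapped = nextPort++;
            outbound.put(inside, mapped);
            inbound.put(mapped, inside);
        }
        return natIp + ":" + mapped;
    }

    /** Incoming: map the NAT-side destination port back to (source IP, port), or null. */
    String translateIncoming(int destPort) {
        return inbound.get(destPort);
    }
}
```

With a table created as new NatTable("138.76.29.7", 5001), the first outgoing datagram from 10.0.0.1 port 3345 is rewritten to 138.76.29.7:5001, and a reply addressed to port 5001 is mapped back to 10.0.0.1:3345. A datagram arriving for a port with no table entry has nowhere to go, which is the traversal problem discussed on the following slides.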
NAT: Network Address Translation
16-bit port-number field:
  60,000 simultaneous connections with a single LAN-side address!
NAT is controversial:
  routers should only process up to layer 3
  violates end-to-end argument
    • NAT possibility must be taken into account by app designers, e.g., P2P applications
  address shortage should instead be solved by IPv6
NAT traversal problem
client wants to connect to server with address 10.0.0.1
  server address 10.0.0.1 local to LAN (client can’t use it as destination addr)
  only one externally visible NATed address: 138.76.29.7
solution 1: statically configure NAT to forward incoming connection requests at given port to server
  e.g., (138.76.29.7, port 2500) always forwarded to 10.0.0.1 port 25000
[Figure: an outside Client (marked “?”) reaches the NAT router at 138.76.29.7; behind the router are 10.0.0.4 and the server 10.0.0.1]
NAT traversal problem
solution 2: Universal Plug and Play (UPnP) Internet Gateway Device (IGD) Protocol. Allows NATed host to:
  learn public IP address (138.76.29.7)
  add/remove port mappings (with lease times)
i.e., automate static NAT port map configuration
[Figure: NATed host 10.0.0.1 uses IGD to talk to the NAT router (10.0.0.4 inside, 138.76.29.7 outside)]
NAT traversal problem
solution 3: relaying (used in Skype)
  NATed client establishes connection to relay
  External client connects to relay
  relay bridges packets between two connections
[Figure: NATed host 10.0.0.1 behind NAT router 138.76.29.7, an external Client, and a relay]
  1. connection to relay initiated by NATed host
  2. connection to relay initiated by client
  3. relaying established
ICMP: Internet Control Message Protocol
used by hosts & routers to communicate network-level information
  error reporting: unreachable host, network, port, protocol
  echo request/reply (used by ping)
network-layer “above” IP:
  ICMP msgs carried in IP datagrams
ICMP message: type, code plus first 8 bytes of IP datagram causing error
  Type   Code   description
  0      0      echo reply (ping)
  3      0      dest. network unreachable
  3      1      dest host unreachable
  3      2      dest protocol unreachable
  3      3      dest port unreachable
  3      6      dest network unknown
  3      7      dest host unknown
  4      0      source quench (congestion control - not used)
  8      0      echo request (ping)
  9      0      route advertisement
  10     0      router discovery
  11     0      TTL expired
  12     0      bad IP header
Traceroute and ICMP
Source sends series of UDP segments to dest
  first has TTL = 1
  second has TTL = 2, etc.
  unlikely port number
When nth datagram arrives to nth router:
  router discards datagram
  and sends to source an ICMP message (type 11, code 0)
  ICMP message includes name of router & IP address
when ICMP message arrives, source calculates RTT
traceroute does this 3 times
Stopping criterion
  UDP segment eventually arrives at destination host
  destination returns ICMP “port unreachable” packet (type 3, code 3)
  when source gets this ICMP, stops.
Hierarchical Routing
Our routing study thus far - idealization
  all routers identical
  network “flat”
  … not true in practice
scale: with 200 million destinations:
  can’t store all dest’s in routing tables!
  routing table exchange would swamp links!
administrative autonomy
  internet = network of networks
  each network admin may want to control routing in its own network
Hierarchical Routing
aggregate routers into regions, “autonomous systems” (AS)
routers in same AS run same routing protocol
  “intra-AS” routing protocol
  routers in different AS can run different intra-AS routing protocol
gateway router
  at “edge” of its own AS
  has link to router in another AS
Interconnected ASes
[Figure: three autonomous systems, AS1 (routers 1a, 1b, 1c, 1d), AS2 (routers 2a, 2b, 2c) and AS3 (routers 3a, 3b, 3c); inside a router, the intra-AS routing algorithm and the inter-AS routing algorithm both feed the forwarding table]
forwarding table configured by both intra- and inter-AS routing algorithm
  intra-AS sets entries for internal dests
  inter-AS & intra-AS sets entries for external dests
Intra-AS Routing
also known as Interior Gateway Protocols (IGP)
most common Intra-AS routing protocols:
  RIP: Routing Information Protocol
  OSPF: Open Shortest Path First
  IGRP: Interior Gateway Routing Protocol (Cisco proprietary)
RIP (Routing Information Protocol)
included in BSD-UNIX distribution in 1982
distance vector algorithm
  distance metric: # hops (max = 15 hops), each link has cost 1
  DVs exchanged with neighbors every 30 sec in response message (aka advertisement)
  each advertisement: list of up to 25 destination subnets (in IP addressing sense)
[Figure: routers A, B, C, D interconnecting subnets u, v, w, x, y, z]
from router A to destination subnets:
  subnet   hops
  u        1
  v        2
  w        2
  x        3
  y        3
  z        2
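A router's reaction to one incoming advertisement can be sketched in a few lines. This is a simplified model under stated assumptions, not the full protocol: RipTable and its methods are hypothetical names, a directly attached subnet counts as 1 hop (as in the table above), 16 stands for "unreachable" (one more than the 15-hop maximum), and timers are left out.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: RipTable is a hypothetical, timer-free model of a distance-vector table.
final class RipTable {
    static final int INFINITY = 16; // one more than the 15-hop maximum
    private final Map<String, Integer> hops = new HashMap<>();
    private final Map<String, String> via = new HashMap<>();

    /** A directly attached subnet is 1 hop away. */
    void addDirect(String subnet) {
        hops.put(subnet, 1);
        via.put(subnet, "direct");
    }

    /**
     * Merges one neighbor's advertisement (subnet -> hops as seen by the neighbor).
     * A route is replaced when the neighbor offers fewer hops, or when the
     * neighbor is already the next hop and its distance has changed.
     */
    boolean merge(String neighbor, Map<String, Integer> advertised) {
        boolean changed = false;
        for (Map.Entry<String, Integer> e : advertised.entrySet()) {
            String subnet = e.getKey();
            int candidate = Math.min(e.getValue() + 1, INFINITY);
            int current = hops.getOrDefault(subnet, INFINITY);
            boolean fromCurrentNextHop = neighbor.equals(via.get(subnet));
            if (candidate < current || (fromCurrentNextHop && candidate != current)) {
                hops.put(subnet, candidate);
                via.put(subnet, neighbor);
                changed = true;
            }
        }
        return changed;
    }

    int hopsTo(String subnet) {
        return hops.getOrDefault(subnet, INFINITY);
    }

    String nextHopTo(String subnet) {
        return via.get(subnet);
    }
}
```

As a made-up illustration (the advertisement contents are not taken from the figure): if A is attached to u and neighbor B advertises v and w at 1 hop each, A records both at 2 hops via B, consistent with the hop counts in the table.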
OSPF (Open Shortest Path First)
“open”: publicly available
uses Link State algorithm
  LS packet dissemination
  topology map at each node
  route computation using Dijkstra’s algorithm
OSPF advertisement carries one entry per neighbor router
advertisements disseminated to entire AS (via flooding)
  carried in OSPF messages directly over IP (rather than TCP or UDP)
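Once every node holds the full topology map, route computation is a shortest-path search from that node. Below is a compact sketch of Dijkstra's algorithm over a map of link costs; LinkStateSketch, addLink and shortestPaths are hypothetical names, and the graph in the usage note is invented for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Sketch only: LinkStateSketch is a hypothetical helper; link costs must be non-negative.
final class LinkStateSketch {
    private static final class Candidate {
        final String node;
        final int cost;

        Candidate(String node, int cost) {
            this.node = node;
            this.cost = cost;
        }
    }

    /** Records an undirected link between a and b in the topology map. */
    static void addLink(Map<String, Map<String, Integer>> links, String a, String b, int cost) {
        links.computeIfAbsent(a, k -> new HashMap<>()).put(b, cost);
        links.computeIfAbsent(b, k -> new HashMap<>()).put(a, cost);
    }

    /** Dijkstra's algorithm: least total cost from source to every reachable node. */
    static Map<String, Integer> shortestPaths(Map<String, Map<String, Integer>> links, String source) {
        Map<String, Integer> best = new HashMap<>();
        PriorityQueue<Candidate> frontier =
                new PriorityQueue<>((x, y) -> Integer.compare(x.cost, y.cost));
        frontier.add(new Candidate(source, 0));
        while (!frontier.isEmpty()) {
            Candidate c = frontier.poll();
            if (best.containsKey(c.node)) {
                continue; // already finalized with a cost that is no larger
            }
            best.put(c.node, c.cost);
            Map<String, Integer> neighbors =
                    links.getOrDefault(c.node, Collections.<String, Integer>emptyMap());
            for (Map.Entry<String, Integer> link : neighbors.entrySet()) {
                if (!best.containsKey(link.getKey())) {
                    frontier.add(new Candidate(link.getKey(), c.cost + link.getValue()));
                }
            }
        }
        return best;
    }
}
```

For a made-up topology with links A-B (1), B-C (2), A-C (5) and C-D (1), the computed costs from A are B = 1, C = 3 (via B rather than the direct link) and D = 4.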
UNIX routing Principle
  1. Search for a matching host address
  2. Search for a matching network address
  3. Search for a default entry (specified as a network entry, with network ID of 0)
netstat
Display routing table
[zhang@storm ~]$ netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         150.108.68.1    0.0.0.0         UG        0 0          0 em1
150.108.68.0    0.0.0.0         255.255.255.0   U         0 0          0 em1
192.168.122.0   0.0.0.0         255.255.255.0   U         0 0          0 virbr0
Transport Layer 3-48
Multiplexing/demultiplexing
Demultiplexing at rcv host: delivering received segments to correct socket
Multiplexing at send host: gathering data from multiple sockets, enveloping data with header (later used for demultiplexing)
[Figure: host 1, host 2 and host 3, each with application, transport, network, link and physical layers; processes P1 to P4 attach to the transport layer through sockets (legend: process, socket)]
How demultiplexing works
host receives IP datagrams
  each datagram has source IP address, destination IP address
  each datagram carries 1 transport-layer segment
  each segment has source, destination port number
host uses IP addresses & port numbers to direct segment to appropriate socket
TCP/UDP segment format (32 bits wide):
  source port # | dest port #
  other header fields
  application data (message)
Connectionless demultiplexing
recall: create sockets with host-local port numbers:
  DatagramSocket mySocket1 = new DatagramSocket(12534);
  DatagramSocket mySocket2 = new DatagramSocket(12535);
recall: when creating datagram to send into UDP socket, must specify (dest IP address, dest port number)
when host receives UDP segment:
  checks destination port number in segment
  directs UDP segment to socket with that port number
IP datagrams with different source IP addresses and/or source port numbers directed to same socket
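The rule that only the destination port selects the UDP socket can be modeled without touching the network. The sketch below is an in-memory stand-in for the host's demultiplexer; UdpDemuxSketch and its methods are hypothetical names, and a "socket" here is just a list of received messages.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: an in-memory model of UDP demultiplexing; no real sockets are opened.
final class UdpDemuxSketch {
    private final Map<Integer, List<String>> socketsByPort = new HashMap<>();

    /** Models new DatagramSocket(port): one receive queue per local port. */
    void bind(int port) {
        socketsByPort.putIfAbsent(port, new ArrayList<String>());
    }

    /** Delivers a segment using the destination port only; false if no socket is bound. */
    boolean deliver(String sourceIp, int sourcePort, int destPort, String payload) {
        List<String> socket = socketsByPort.get(destPort);
        if (socket == null) {
            return false;
        }
        // The source address and port are kept only as the "return address".
        socket.add(sourceIp + ":" + sourcePort + " " + payload);
        return true;
    }

    List<String> received(int port) {
        return socketsByPort.getOrDefault(port, new ArrayList<String>());
    }
}
```

Two segments from different clients (different source IP and source port) sent to the same destination port end up in the same queue, which is the behavior the last line of the slide describes.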
Connectionless demux (cont)
DatagramSocket serverSocket = new DatagramSocket(6428);
[Figure: client IP: A and client IP: B both exchange segments with server IP: C, whose socket is bound to port 6428]
  toward the server: SP: 9157, DP: 6428 and SP: 5775, DP: 6428
  replies from the server: SP: 6428, DP: 9157 and SP: 6428, DP: 5775
SP provides “return address”
Connection-oriented demux
TCP socket identified by 4-tuple:
  source IP address
  source port number
  dest IP address
  dest port number
recv host uses all four values to direct segment to appropriate socket
server host may support many simultaneous TCP sockets:
  each socket identified by its own 4-tuple
web servers have different sockets for each connecting client
  non-persistent HTTP will have different socket for each request
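In contrast to UDP, a TCP segment is matched on all four values. The following in-memory sketch keys each connection socket by its 4-tuple; TcpDemuxSketch, its methods and the socket labels used in the usage note are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: an in-memory model of TCP demultiplexing keyed by the 4-tuple.
final class TcpDemuxSketch {
    private final Map<String, String> socketByTuple = new HashMap<>();

    private static String key(String srcIp, int srcPort, String dstIp, int dstPort) {
        return srcIp + ":" + srcPort + "->" + dstIp + ":" + dstPort;
    }

    /** Registers the connection socket created when a client connection is accepted. */
    void accept(String srcIp, int srcPort, String dstIp, int dstPort, String socketName) {
        socketByTuple.put(key(srcIp, srcPort, dstIp, dstPort), socketName);
    }

    /** Returns the socket for an arriving segment, or null if no connection matches. */
    String socketFor(String srcIp, int srcPort, String dstIp, int dstPort) {
        return socketByTuple.get(key(srcIp, srcPort, dstIp, dstPort));
    }
}
```

Two clients A and B that both happen to use source port 9157 toward server C port 80 still reach different sockets, because their source IP addresses differ.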
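Connection-oriented demux (cont)
[Figure: client IP: A and client IP: B open connections to a web server at IP: C, port 80; the server handles each connection with its own process and socket (processes P1 to P6 are drawn across the three hosts)]
  segment from A: S-IP: A, D-IP: C, SP: 9157, DP: 80
  segment from B: S-IP: B, D-IP: C, SP: 9157, DP: 80
  segment from B: S-IP: B, D-IP: C, SP: 5775, DP: 80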
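Connection-oriented demux: Threaded Web Server
[Figure: the same three connections from client IP: A and client IP: B to server IP: C, port 80, now handled by threads of a single server process, each with its own socket (processes P1 to P4 are drawn across the three hosts)]
  segment from A: S-IP: A, D-IP: C, SP: 9157, DP: 80
  segment from B: S-IP: B, D-IP: C, SP: 9157, DP: 80
  segment from B: S-IP: B, D-IP: C, SP: 5775, DP: 80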
TCP Connection Management
Recall: TCP sender, receiver establish “connection” before exchanging data segments
initialize TCP variables:
  seq. #s
  buffers, flow control info (e.g. RcvWindow)
client: connection initiator
  Socket clientSocket = new Socket("hostname", portNumber);
server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();
Three way handshake:
Step 1: client host sends TCP SYN segment to server
  specifies initial seq #
  no data
Step 2: server host receives SYN, replies with SYNACK segment
  server allocates buffers
  specifies server initial seq. #
Step 3: client receives SYNACK, replies with ACK segment, which may contain data
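The three steps can be walked through with a toy state machine. This sketch models only the SYN, SYNACK, ACK exchange; HandshakeSketch and its nested types are hypothetical, the initial sequence numbers are arbitrary, and the rule that each side acknowledges the peer's initial sequence number plus one is standard TCP behavior rather than something stated on the slide.

```java
// Sketch only: a toy model of the three-way handshake; no real TCP is involved.
final class HandshakeSketch {
    enum State { CLOSED, LISTEN, SYN_SENT, SYN_RCVD, ESTABLISHED }

    static final class Segment {
        final boolean syn;
        final boolean ack;
        final long seq;
        final long ackNum;

        Segment(boolean syn, boolean ack, long seq, long ackNum) {
            this.syn = syn;
            this.ack = ack;
            this.seq = seq;
            this.ackNum = ackNum;
        }
    }

    static final class Endpoint {
        State state;
        private final long initialSeq;

        Endpoint(State initial, long initialSeq) {
            this.state = initial;
            this.initialSeq = initialSeq;
        }

        /** Step 1 (client): send SYN carrying the initial sequence number, no data. */
        Segment connect() {
            state = State.SYN_SENT;
            return new Segment(true, false, initialSeq, 0);
        }

        /** Handles an arriving segment; returns the reply, or null if none is sent. */
        Segment receive(Segment in) {
            if (state == State.LISTEN && in.syn && !in.ack) {
                state = State.SYN_RCVD; // Step 2 (server): reply with SYNACK
                return new Segment(true, true, initialSeq, in.seq + 1);
            }
            if (state == State.SYN_SENT && in.syn && in.ack && in.ackNum == initialSeq + 1) {
                state = State.ESTABLISHED; // Step 3 (client): reply with ACK
                return new Segment(false, true, initialSeq + 1, in.seq + 1);
            }
            if (state == State.SYN_RCVD && in.ack && !in.syn && in.ackNum == initialSeq + 1) {
                state = State.ESTABLISHED; // server side completes on the final ACK
            }
            return null;
        }
    }
}
```

Feeding the client's SYN to a listening server, the server's SYNACK back to the client, and the client's ACK to the server leaves both endpoints in ESTABLISHED.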
TCP Connection Management (cont.)
Closing a connection:
client closes socket: clientSocket.close();
Step 1: client end system sends TCP FIN control segment to server
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.
[Figure: timeline between client and server: client close sends FIN; server replies ACK, then its own close sends FIN; client replies ACK and stays in timed wait until closed]
TCP Connection Management (cont.)
Step 3: client receives FIN, replies with ACK.
  Enters “timed wait” - will respond with ACK to received FINs
Step 4: server receives ACK. Connection closed.
Note: with small modification, can handle simultaneous FINs.
[Figure: timeline between client and server: client (closing) sends FIN; server replies ACK, then (closing) sends FIN; client replies ACK and stays in timed wait until closed; server closed]
TCP Connection Management (cont)
[Figures: TCP client lifecycle; TCP server lifecycle]
TCP segment structure (32 bits wide)
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | Receive window
  checksum | Urg data pnter
  Options (variable length)
  application data (variable length)
Field notes:
  sequence number, acknowledgement number: counting by bytes of data (not segments!)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  Receive window: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)
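The fixed part of the header can be read straight out of a byte array. The sketch below assumes the standard 20-byte layout (ports in bytes 0 to 3, header length in the top four bits of byte 12, flags in byte 13, receive window in bytes 14 and 15); TcpHeaderPeek and its methods are hypothetical names.

```java
// Sketch only: TcpHeaderPeek is a hypothetical reader for the fixed TCP header layout.
final class TcpHeaderPeek {
    static int sourcePort(byte[] h) {
        return ((h[0] & 0xFF) << 8) | (h[1] & 0xFF);
    }

    static int destPort(byte[] h) {
        return ((h[2] & 0xFF) << 8) | (h[3] & 0xFF);
    }

    /** The head len field counts 32-bit words, so multiply by 4 to get bytes. */
    static int headerLengthBytes(byte[] h) {
        return ((h[12] & 0xF0) >> 4) * 4;
    }

    static int receiveWindow(byte[] h) {
        return ((h[14] & 0xFF) << 8) | (h[15] & 0xFF);
    }

    /** Names of the set flags, in header order U A P R S F, separated by spaces. */
    static String flags(byte[] h) {
        String[] names = {"URG", "ACK", "PSH", "RST", "SYN", "FIN"};
        int bits = h[13] & 0x3F;
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < names.length; i++) {
            if ((bits & (0x20 >> i)) != 0) {
                if (out.length() > 0) {
                    out.append(' ');
                }
                out.append(names[i]);
            }
        }
        return out.toString();
    }
}
```

For a hand-built SYNACK header (flags byte 0x12, head len 5) the reader reports a 20-byte header with the flags ACK and SYN set.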