[Figure: protocol stack: application, transport, network, link, physical]
Chapter 4: Network Layer
Kurose & Ross 6th Edition: Chapter 4: Network Layer
Kurose & Ross 7th Edition: Chapter 4: Network Data Plane; Chapter 5: Network Control Plane
This Course: Hybrid - combine in one chapter, but follow the 7th Ed. (mostly!)
Chapter 4: Network Layer
our goals:
• understand principles behind network layer services:
  • network layer service models
  • forwarding versus routing
  • how a router works
  • routing (path selection)
  • broadcast, multicast
  • dealing with scale
  • advanced topics: IPv6, mobility, SDN controllers
• instantiation, implementation in the Internet
• network management
Chapter 4: Outline
☐ Overview of Network Layer
☐ What's Inside a Router?
☐ IP: Internet Protocol
☐ Routing Protocols
☐ Intra-AS Routing in the Internet: OSPF
☐ Routing among ISPs: BGP
☐ SDN: Software Defined Networks
☐ ICMP: The Internet Control Message Protocol
☐ Network Management and SNMP
Grouping labels on the slide: Data Plane, Control Plane
Network layer
• transport segment from sending to receiving host
• on sending side, encapsulates segments into datagrams
• on receiving side, delivers segments to transport layer
• network layer protocols in every host, router
• router examines header fields in all IP datagrams passing through it
[Figure: protocol stacks along the path. The two end hosts run application, transport, network, data link, and physical layers; each router runs only network, data link, and physical.]
Two key network-layer functions
network-layer functions:
• forwarding: move packets from router's input to appropriate router output
• routing: determine route taken by packets from source to destination
  • routing algorithms
analogy: taking a trip
• forwarding: process of getting through single interchange
• routing: process of planning trip from source to destination
Network layer: data plane, control plane
Data plane
• local, per-router function
• determines how datagram arriving on router input port is forwarded to router output port
• forwarding function
[Figure: a router forwarding a packet to one of output links 1, 2, 3 based on values in the arriving packet header (0111)]
Control plane
• network-wide logic
• determines how datagram is routed among routers along end-end path from source host to destination host
• two control-plane approaches:
  • traditional routing algorithms: implemented in routers
  • software-defined networking (SDN): implemented in (remote) servers
Per-router control plane
Individual routing algorithm components in each and every router interact in the control plane.
[Figure: a routing algorithm component in each router's control plane, sitting above the data plane]
tables. In this example, a routing algorithm runs in each and every router and both forwarding and routing functions are contained within a router. As we'll see in Sections 5.3 and 5.4, the routing algorithm function in one router communicates with the routing algorithm function in other routers to compute the values for its forwarding table. How is this communication performed? By exchanging routing messages containing routing information according to a routing protocol! We'll cover routing algorithms and protocols in Sections 5.2 through 5.4.
The distinct and different purposes of the forwarding and routing functions can be further illustrated by considering the hypothetical (and unrealistic, but technically feasible) case of a network in which all forwarding tables are configured directly by human network operators physically present at the routers. In this case, no routing protocols would be required! Of course, the human operators would need to interact with each other to ensure that the forwarding tables were configured in such a way that packets reached their intended destinations. It's also likely that human configuration would be more error-prone and much slower to respond to changes in the network topology than a routing protocol. We're thus fortunate that all networks have both a forwarding and a routing function!
Figure 4.2 ♦ Routing algorithms determine values in forward tables
[Figure: the routing algorithm (control plane) fills in the local forwarding table (data plane), which maps values in the arriving packet's header to output links 1, 2, 3]
Local forwarding table:
header   output
0100     3
0110     2
0111     2
1001     1
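Logically centralized control plane
A distinct (typically remote) controller interacts with local control agents (CAs).
[Figure: a remote controller in the control plane communicates with a control agent (CA) in each router; in the data plane, a router forwards to output links 1, 2, 3 based on values in the arriving packet header (0111)]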
Network service model
Q: What service model for "channel" transporting datagrams from sender to receiver?
example services for individual datagrams:
• guaranteed delivery
• guaranteed delivery with less than 40 msec delay
example services for a flow of datagrams:
• in-order datagram delivery
• guaranteed minimum bandwidth to flow
• restrictions on changes in inter-packet spacing
Network layer service models:
Network Architecture   Service Model   Guarantees?
                                       Bandwidth   Loss   Order   Timing   Congestion feedback
Internet               best effort     none        no     no      no       no (inferred via loss)
Chapter 4: Outline
✓ Overview of Network Layer
☐ What's Inside a Router?
☐ IP: Internet Protocol
☐ Routing Protocols
☐ Intra-AS Routing in the Internet: OSPF
☐ Routing among ISPs: BGP
☐ SDN: Software Defined Networks
☐ ICMP: The Internet Control Message Protocol
☐ Network Management and SNMP
Router architecture overview
• high-level view of generic router architecture:
  • router input ports, high-speed switching fabric, router output ports
  • routing processor
• forwarding data plane (hardware) operates in nanosecond time frame
• routing, management control plane (software) operates in millisecond time frame
Input port functions
[Figure: input port pipeline: line termination → link layer protocol (receive) → lookup, forwarding, queueing → switch fabric]
• physical layer: bit-level reception
• data link layer: e.g., Ethernet (see chapter 5)
decentralized switching:
• using header field values, look up output port using forwarding table in input port memory ("match plus action")
• goal: complete input port processing at 'line speed'
• queuing: if datagrams arrive faster than forwarding rate into switch fabric
Input port functions (continued)
decentralized switching:
• using header field values, look up output port using forwarding table in input port memory ("match plus action")
• destination-based forwarding: forward based only on destination IP address (traditional)
• generalized forwarding: forward based on any set of header field values
Destination-based forwarding
forwarding table:
Destination Address Range                                                         Link Interface
11001000 00010111 00010000 00000000 through 11001000 00010111 00010111 11111111   0
11001000 00010111 00011000 00000000 through 11001000 00010111 00011000 11111111   1
11001000 00010111 00011001 00000000 through 11001000 00010111 00011111 11111111   2
otherwise                                                                         3
Q: but what happens if ranges don't divide up so nicely?
Longest prefix matching
longest prefix matching: when looking for forwarding table entry for given destination address, use longest address prefix that matches destination address.
Destination Address Range              Link interface
11001000 00010111 00010*** ********    0
11001000 00010111 00011000 ********    1
11001000 00010111 00011*** ********    2
otherwise                              3
examples:
DA: 11001000 00010111 00010110 10100001   which interface?
DA: 11001000 00010111 00011000 10101010   which interface?
Longest prefix matching
• we'll see why longest prefix matching is used shortly, when we study addressing
• longest prefix matching: often performed using ternary content addressable memories (TCAMs)
  • content addressable: present address to TCAM: retrieve address in one clock cycle, regardless of table size
  • Cisco Catalyst: can hold up to ~1M routing table entries in TCAM
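The lookup rule can be sketched in a few lines of Python. This is a software illustration only, assuming prefixes are stored as bit strings and scanned linearly; a real router would use a TCAM or a trie, and the names `FORWARDING_TABLE` and `lookup` are made up for this sketch.

```python
# Forwarding table from the slide: (prefix bits, link interface).
FORWARDING_TABLE = [
    ("110010000001011100010", 0),     # 11001000 00010111 00010***
    ("110010000001011100011000", 1),  # 11001000 00010111 00011000
    ("110010000001011100011", 2),     # 11001000 00010111 00011***
]
DEFAULT_INTERFACE = 3  # the "otherwise" row


def lookup(dest_bits: str) -> int:
    """Return the interface of the longest prefix that matches dest_bits."""
    addr = dest_bits.replace(" ", "")
    best_len, best_iface = -1, DEFAULT_INTERFACE
    for prefix, iface in FORWARDING_TABLE:
        if addr.startswith(prefix) and len(prefix) > best_len:
            best_len, best_iface = len(prefix), iface
    return best_iface


# The two example destination addresses from the slide.
print(lookup("11001000 00010111 00010110 10100001"))  # interface 0
print(lookup("11001000 00010111 00011000 10101010"))  # interface 1
```

The second address matches both the 21-bit and the 24-bit prefix; the longer one wins, so it goes to interface 1 rather than 2.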
Switching fabrics
• transfer packet from input buffer to appropriate output buffer
• switching rate: rate at which packets can be transferred from inputs to outputs
  • often measured as multiple of input/output line rate
  • N inputs: switching rate N times line rate desirable
• three types of switching fabrics: memory, bus, crossbar
Switching via memory
first generation routers:
• traditional computers with switching under direct control of CPU
• packet copied to system's memory
• speed limited by memory bandwidth (2 bus crossings per datagram)
[Figure: input port (e.g., Ethernet) → memory → output port (e.g., Ethernet), connected by the system bus]
Switching via a bus
• datagram from input port memory to output port memory via a shared bus
• bus contention: switching speed limited by bus bandwidth
• 32 Gbps bus, Cisco 5600: sufficient speed for access and enterprise routers
Switching via interconnection network
• overcome bus bandwidth limitations
• banyan networks, crossbar, other interconnection nets initially developed to connect processors in multiprocessor
• advanced design: fragmenting datagram into fixed length cells, switch cells through the fabric
• Cisco 12000: switches 60 Gbps through the interconnection network
Input port queuing
• Answer 2: output port contention
• Head-of-the-Line (HOL) blocking: queued datagram at front of queue prevents others in queue from moving forward
[Figure, left: output port contention: only one red datagram can be transferred; lower red packet is blocked]
[Figure, right: one packet time later: green packet experiences HOL blocking]
Reducing input queueing
• Why?
  • reduce HOL blocking
  • avoid packet drops at input queues
  • save on queue memory
• How?
  • increase switch fabric speed
  • increase inbound capacity of output ports
Output ports
[Figure: output port pipeline: switch fabric → datagram buffer, queueing → link layer protocol (send) → line termination]
• buffering required when datagrams arrive from fabric faster than the transmission rate
  • datagrams (packets) can be lost due to congestion, lack of buffers
• scheduling discipline chooses among queued datagrams for transmission
  • priority scheduling: who gets best performance, network neutrality
Output port queueing
• buffering when arrival rate via switch exceeds output line speed
• queueing (delay) and loss due to output port buffer overflow!
[Figure: at time t, packets move from input to output through the switch fabric; one packet time later, packets are queued at the output port]
How much buffering?
• RFC 3439 rule of thumb: average buffering equal to "typical" RTT (say 250 msec) times link capacity C
  • e.g., C = 10 Gbps link: 2.5 Gbit buffer
• recent recommendation [Appenzeller '04]: with N flows, buffering equal to:
  $\frac{RTT \times C}{\sqrt{N}}$
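A quick numeric check of both buffer-sizing rules, as a hedged sketch: the function names are invented here, and the flow count of 10,000 is an illustrative assumption, not a figure from the slides.

```python
import math


def buffer_rule_of_thumb(rtt_s: float, capacity_bps: float) -> float:
    """RFC 3439 rule of thumb: buffer (bits) = RTT * C."""
    return rtt_s * capacity_bps


def buffer_with_flows(rtt_s: float, capacity_bps: float, n_flows: int) -> float:
    """Rule with N flows: buffer (bits) = RTT * C / sqrt(N)."""
    return rtt_s * capacity_bps / math.sqrt(n_flows)


rtt = 0.250        # 250 msec, as on the slide
capacity = 10e9    # 10 Gbps link, as on the slide

print(buffer_rule_of_thumb(rtt, capacity))       # 2.5e9 bits = 2.5 Gbit
print(buffer_with_flows(rtt, capacity, 10_000))  # assumed N = 10,000 flows
```

With the assumed 10,000 flows the second rule asks for 1/100 of the rule-of-thumb buffer, which is the point of the recommendation: many multiplexed flows need far less buffering.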
Scheduling mechanisms
• scheduling: choose next packet to send on link
• FIFO (first in first out) scheduling: send in order of arrival to queue
• discard policy: if packet arrives to full queue: who to discard?
  • tail drop: drop arriving packet
  • priority: drop/remove on priority basis
  • random: drop/remove randomly
[Figure: packet arrivals → queue (waiting area) → link (server) → packet departures]
Scheduling policies: priority
priority scheduling: send highest priority queued packet
• multiple classes, with different priorities
• class may depend on marking or other header info, e.g. IP source/dest, port numbers, etc.
[Figure: arrivals are classified into a high priority queue and a low priority queue (waiting areas) that feed the link (server); a timeline shows arrivals, packet in service, and departures for packets 1 to 5]
Scheduling policies: still more
Round Robin (RR) scheduling:
• multiple classes
• cyclically scan class queues, sending one complete packet from each class (if available)
• real world example?
[Figure: timeline of arrivals, packet in service, and departures for packets 1 to 5 under round robin]
Scheduling policies: still more
Weighted Fair Queuing (WFQ):
• generalized Round Robin
• each class gets weighted amount of service in each cycle
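The round robin idea, and its weighted generalization, can be sketched as follows. This is a simplification under stated assumptions: all packets are already queued, packets are equal-sized, and the weighted variant is plain weighted round robin (an approximation of WFQ, not true WFQ, which works on finish times). Function names are hypothetical.

```python
from collections import deque


def round_robin(class_queues):
    """Cyclically scan class queues, sending one packet per non-empty class."""
    queues = [deque(q) for q in class_queues]
    departures = []
    while any(queues):            # an empty deque is falsy
        for q in queues:
            if q:
                departures.append(q.popleft())
    return departures


def weighted_round_robin(class_queues, weights):
    """Each class may send up to `weight` packets per cycle."""
    queues = [deque(q) for q in class_queues]
    departures = []
    while any(queues):
        for q, w in zip(queues, weights):
            for _ in range(w):
                if not q:
                    break
                departures.append(q.popleft())
    return departures


print(round_robin([["a1", "a2", "a3"], ["b1"], ["c1", "c2"]]))
print(weighted_round_robin([["a1", "a2", "a3", "a4"], ["b1", "b2"]], [2, 1]))
```

In the second call class `a` has twice the weight of class `b`, so it gets two transmissions for each one that `b` gets.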
Chapter 4: Outline
✓ Overview of Network Layer
✓ What's Inside a Router?
☐ IP: Internet Protocol
  ☐ Data Format and Fragmentation
  ☐ IPv4 Addressing
  ☐ Network Address Translation
  ☐ IPv6
☐ Routing Protocols
☐ Intra-AS Routing in the Internet: OSPF
☐ Routing among ISPs: BGP
☐ SDN: Software Defined Networks
☐ ICMP: The Internet Control Message Protocol
☐ Network Management and SNMP
The Internet network layer
host, router network layer functions:
transport layer: TCP, UDP
network layer:
• routing protocols: path selection; RIP, OSPF, BGP
• IP protocol: addressing conventions; datagram format; packet handling conventions
• ICMP protocol: error reporting; router "signaling"
• forwarding table
link layer
physical layer
IP datagram format
header layout (each row is 32 bits wide):
ver | head. len | type of service | length
16-bit identifier | flgs | fragment offset
time to live | upper layer | header checksum
32 bit source IP address
32 bit destination IP address
options (if any)
data (variable length, typically a TCP or UDP segment)
field notes:
• ver: IP protocol version number
• head. len: header length (bytes)
• type of service: "type" of data
• length: total datagram length (bytes)
• 16-bit identifier, flgs, fragment offset: for fragmentation/reassembly
• time to live: max number remaining hops (decremented at each router)
• upper layer: upper layer protocol to deliver payload to
• options: e.g. timestamp, record route taken, specify list of routers to visit
how much overhead?
• 20 bytes of IP
• 20 bytes of TCP
• = 40 bytes + app layer overhead
IP fragmentation, reassembly
• network links have MTU (max. transfer size): largest possible link-level frame
  • different link types, different MTUs
• large IP datagram divided ("fragmented") within net
  • one datagram becomes several datagrams
  • "reassembled" only at final destination
  • IP header bits used to identify, order related fragments
[Figure: fragmentation: in: one large datagram, out: 3 smaller datagrams; reassembly at the destination]
IP fragmentation, reassembly
example:
• 4000 byte datagram
• MTU = 1500 bytes
one large datagram becomes several smaller datagrams:
             length   ID   fragflag   offset
original     4000     x    0          0
fragment 1   1500     x    1          0
fragment 2   1500     x    1          185
fragment 3   1040     x    0          370
• 1480 bytes in data field
• offset = 1480/8
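The arithmetic behind the example can be reproduced with a short sketch. It assumes a fixed 20-byte header with no options and models only the length, offset, and flag fields; `fragment` is a name made up for illustration.

```python
def fragment(total_len: int, mtu: int, header_len: int = 20):
    """Split a datagram of total_len bytes (header included) for a link MTU.

    Returns a list of (length, offset, fragflag) tuples, where offset is in
    8-byte units and fragflag is 1 when more fragments follow.
    """
    payload = total_len - header_len
    # Data carried per fragment must be a multiple of 8 bytes.
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    sent = 0
    while sent < payload:
        data = min(max_data, payload - sent)
        more = 1 if sent + data < payload else 0
        fragments.append((data + header_len, sent // 8, more))
        sent += data
    return fragments


for length, offset, flag in fragment(4000, 1500):
    print(f"length={length} offset={offset} fragflag={flag}")
```

For the 4000-byte datagram this yields the three fragments in the table: 3980 payload bytes split as 1480 + 1480 + 1020, each carried behind its own 20-byte header.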
IP fragmentation, reassembly
Path MTU discovery
• send large packet with Don't Fragment (DF) flag set
• if it arrives at a router with smaller MTU, packet is dropped
• ICMP "packet too big" sent back, with MTU
• fails if ICMP packets are blocked
MSS clamping
• router adds/alters TCP maximum segment size (MSS) option to all flows
• breaks layering guarantees
Chapter 4: Outline
✓ Overview of Network Layer
✓ What's Inside a Router?
☐ IP: Internet Protocol
  ✓ Data Format and Fragmentation
  ☐ IPv4 Addressing
  ☐ Network Address Translation
  ☐ IPv6
☐ Routing Protocols
☐ Intra-AS Routing in the Internet: OSPF
☐ Routing among ISPs: BGP
☐ SDN: Software Defined Networks
☐ ICMP: The Internet Control Message Protocol
☐ Network Management and SNMP
IP addressing: introduction
• IP address: 32-bit identifier for host, router interface
• interface: connection between host/router and physical link
  • routers typically have multiple interfaces
  • host typically has one or two interfaces (e.g., wired Ethernet, wireless 802.11)
• IP addresses associated with each interface
[Figure: a router and hosts with interface addresses 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.9, 223.1.3.1, 223.1.3.2, 223.1.3.27]
223.1.1.1 = 11011111 00000001 00000001 00000001
               223        1        1        1
IP addressing: introduction
Q: how are interfaces actually connected?
A: we'll learn about that in next chapters.
A: wired Ethernet interfaces connected by Ethernet switches
A: wireless WiFi interfaces connected by WiFi base station
For now: don't need to worry about how one interface is connected to another (with no intervening router)
Subnets
• IP address:
  • subnet part: high order bits
  • host part: low order bits
• What's a subnet?
  • device interfaces with same subnet part of IP address
  • can physically reach each other without intervening router
[Figure: network consisting of 3 subnets, with interface addresses in 223.1.1.x, 223.1.2.x, and 223.1.3.x]
Subnets
recipe
• to determine the subnets, detach each interface from its host or router, creating islands of isolated networks
• each isolated network is called a subnet
[Figure: the same network split into subnets 223.1.1.0/24, 223.1.2.0/24, 223.1.3.0/24]
subnet mask: /24
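Subnets
how many?
[Figure: three interconnected routers with attached hosts; interface addresses shown: 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.6, 223.1.3.1, 223.1.3.2, 223.1.3.27, 223.1.7.0, 223.1.7.1, 223.1.8.0, 223.1.8.1, 223.1.9.1, 223.1.9.2]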
IP addressing: CIDR
CIDR: Classless InterDomain Routing
• subnet portion of address of arbitrary length
• address format: a.b.c.d/x, where x is # bits in subnet portion of address
11001000 00010111 0001000 | 0 00000000
      subnet part (23 bits) | host part (9 bits)
200.23.16.0/23 covers 200.23.16.0 – 200.23.17.255
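The a.b.c.d/x notation can be explored with Python's standard `ipaddress` module. A small sketch; the helper name `describe` is hypothetical.

```python
import ipaddress


def describe(cidr: str) -> dict:
    """Summarize a CIDR block: prefix length, host bits, and address range."""
    net = ipaddress.ip_network(cidr)
    return {
        "subnet_bits": net.prefixlen,
        "host_bits": net.max_prefixlen - net.prefixlen,
        "first": str(net.network_address),
        "last": str(net.broadcast_address),
        "netmask": str(net.netmask),
    }


info = describe("200.23.16.0/23")
print(info)

# Membership test: is an address inside the block?
print(ipaddress.ip_address("200.23.17.9") in ipaddress.ip_network("200.23.16.0/23"))
```

For the slide's block this reports 23 subnet bits, 9 host bits, and the range 200.23.16.0 to 200.23.17.255.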
IP addresses: how to get one?
Q: How does a host get IP address?
• hard-coded by system admin in a file
  • Windows: control-panel->network->configuration->tcp/ip->properties
  • UNIX: /etc/…
• DHCP: Dynamic Host Configuration Protocol: dynamically get address from a server
  • "plug-and-play"
DHCP: Dynamic Host Configuration Protocol
goal: allow host to dynamically obtain its IP address from network server when it joins network
• can renew its lease on address in use
• allows reuse of addresses (only hold address while connected/"on")
• support for mobile users who want to join network (more shortly)
DHCP overview:
• host broadcasts "DHCP discover" msg [optional]
• DHCP server responds with "DHCP offer" msg [optional]
• host requests IP address: "DHCP request" msg
• DHCP server sends address: "DHCP ack" msg
DHCP client-server scenario
[Figure: the three-subnet network (223.1.1.0/24, 223.1.2.0/24, 223.1.3.0/24) with a DHCP server attached; an arriving DHCP client needs an address in this network]
DHCP client-server scenario
DHCP server: 223.1.2.5; arriving client
1. DHCP discover (Broadcast: is there a DHCP server out there?)
   src: 0.0.0.0, 68
   dest: 255.255.255.255, 67
   yiaddr: 0.0.0.0
   transaction ID: 654
2. DHCP offer (Broadcast: I'm a DHCP server! Here's an IP address you can use)
   src: 223.1.2.5, 67
   dest: 255.255.255.255, 68
   yiaddr: 223.1.2.4
   transaction ID: 654
   lifetime: 3600 secs
3. DHCP request (Broadcast: OK. I'll take that IP address!)
   src: 0.0.0.0, 68
   dest: 255.255.255.255, 67
   yiaddr: 223.1.2.4
   transaction ID: 655
   lifetime: 3600 secs
4. DHCP ACK (Broadcast: OK. You've got that IP address!)
   src: 223.1.2.5, 67
   dest: 255.255.255.255, 68
   yiaddr: 223.1.2.4
   transaction ID: 655
   lifetime: 3600 secs
DHCP: more than IP addresses
DHCP can return more than just allocated IP address on subnet:
• address of first-hop router for client
• name and IP address of DNS server
• network mask (indicating network versus host portion of address)
DHCP: example
• connecting laptop needs its IP address, addr of first-hop router, addr of DNS server: use DHCP
• DHCP request encapsulated in UDP, encapsulated in IP, encapsulated in 802.3 Ethernet
• Ethernet frame broadcast (dest: FFFFFFFFFFFF) on LAN, received at router running DHCP server
• Ethernet frame demuxed to IP, IP demuxed to UDP, UDP demuxed to DHCP
[Figure: laptop and router (168.1.1.1, with DHCP server built into router), each with protocol stack DHCP / UDP / IP / Eth / Phy; the DHCP message travels down the laptop's stack and up the router's stack]
DHCP: example
• DHCP server formulates DHCP ACK containing client's IP address, IP address of first-hop router for client, name & IP address of DNS server
• encapsulation at DHCP server, frame forwarded to client, demuxing up to DHCP at client
• client now knows its IP address, name and IP address of DNS server, IP address of its first-hop router
[Figure: router with DHCP server built into router and client laptop, each with protocol stack DHCP / UDP / IP / Eth / Phy; the DHCP ACK travels down the router's stack and up the client's stack]
IP addresses: how to get one?
Q: how does network get subnet part of IP addr?
A: gets allocated portion of its provider ISP's address space
ISP's block      11001000 00010111 00010000 00000000   200.23.16.0/20
Organization 0   11001000 00010111 00010000 00000000   200.23.16.0/23
Organization 1   11001000 00010111 00010010 00000000   200.23.18.0/23
Organization 2   11001000 00010111 00010100 00000000   200.23.20.0/24
Organization 3   11001000 00010111 00010101 00000000   200.23.21.0/24
...              ...                                   ...
Organization n   11001000 00010111 00011110 00000000   200.23.30.0/23
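How an ISP block divides into customer blocks can be checked with the standard `ipaddress` module. This sketch assumes an even split into /23 blocks for illustration; the table above also shows that an ISP may hand out blocks of other sizes, such as /24.

```python
import ipaddress

isp_block = ipaddress.ip_network("200.23.16.0/20")

# Split the /20 into equal /23 blocks (8 of them, 512 addresses each).
org_blocks = list(isp_block.subnets(new_prefix=23))
for i, block in enumerate(org_blocks):
    print(f"block {i}: {block}")

# Any allocated block, whatever its size, must fall inside the ISP's block.
small_block = ipaddress.ip_network("200.23.21.0/24")
print(small_block.subnet_of(isp_block))
```

The first, second, and last /23 blocks come out as 200.23.16.0/23, 200.23.18.0/23, and 200.23.30.0/23, matching the rows in the table.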
Hierarchical addressing: route aggregation
hierarchical addressing allows efficient advertisement of routing information:
[Figure: Organization 0 (200.23.16.0/23), Organization 1 (200.23.18.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) connect to Fly-By-Night-ISP, which tells the Internet "Send me anything with addresses beginning 200.23.16.0/20". ISPs-R-Us tells the Internet "Send me anything with addresses beginning 199.31.0.0/16".]
IP addressing: the last word...
Q: how does an ISP get block of addresses?
A: ICANN: Internet Corporation for Assigned Names and Numbers http://www.icann.org/
• allocates addresses
• manages DNS
• assigns domain names, resolves disputes
Chapter 4: Outline
✓ Overview of Network Layer
✓ What's Inside a Router?
☐ IP: Internet Protocol
  ✓ Data Format and Fragmentation
  ✓ IPv4 Addressing
  ☐ Network Address Translation
  ☐ IPv6
☐ Routing Protocols
☐ Intra-AS Routing in the Internet: OSPF
☐ Routing among ISPs: BGP
☐ SDN: Software Defined Networks
☐ ICMP: The Internet Control Message Protocol
☐ Network Management and SNMP
NAT: network address translation
[Figure: local network (e.g., home network) 10.0.0/24 with interfaces 10.0.0.1, 10.0.0.2, 10.0.0.3, 10.0.0.4; the NAT router's address 138.76.29.7 faces the rest of the Internet]
• datagrams with source or destination in this network have 10.0.0/24 address for source, destination (as usual)
• all datagrams leaving local network have same single source NAT IP address: 138.76.29.7, different source port numbers
NAT: network address translation
motivation: local network uses just one IP address as far as outside world is concerned:
• range of addresses not needed from ISP: just one IP address for all devices
• can change addresses of devices in local network without notifying outside world
• can change ISP without changing addresses of devices in local network
• devices inside local net not explicitly addressable, visible by outside world (a security plus)
NAT: network address translation
implementation: NAT router must:
• outgoing datagrams: replace (source IP address, port #) of every outgoing datagram to (NAT IP address, new port #)
  ... remote clients/servers will respond using (NAT IP address, new port #) as destination addr
• remember (in NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pair
• incoming datagrams: replace (NAT IP address, new port #) in dest fields of every incoming datagram with corresponding (source IP address, port #) stored in NAT table
NAT: network address translation
NAT translation table:
WAN side addr        LAN side addr
138.76.29.7, 5001    10.0.0.1, 3345
……                   ……
1: host 10.0.0.1 sends datagram to 128.119.40.186, 80 (S: 10.0.0.1, 3345; D: 128.119.40.186, 80)
2: NAT router changes datagram source addr from 10.0.0.1, 3345 to 138.76.29.7, 5001, updates table (S: 138.76.29.7, 5001; D: 128.119.40.186, 80)
3: reply arrives, dest. address: 138.76.29.7, 5001 (S: 128.119.40.186, 80; D: 138.76.29.7, 5001)
4: NAT router changes datagram dest addr from 138.76.29.7, 5001 to 10.0.0.1, 3345 (S: 128.119.40.186, 80; D: 10.0.0.1, 3345)
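The four steps can be mimicked with a toy translation table. This is a minimal sketch, not how any particular NAT is built: the class name `NatRouter` is invented, ports are handed out sequentially starting at 5001 only to match the slide's numbers, and entries never time out.

```python
class NatRouter:
    """Toy NAT: rewrites (IP, port) pairs using a translation table."""

    def __init__(self, wan_ip: str, first_port: int = 5001):
        self.wan_ip = wan_ip
        self.next_port = first_port   # assumed sequential port allocation
        self.out_map = {}             # (lan_ip, lan_port) -> wan_port
        self.in_map = {}              # wan_port -> (lan_ip, lan_port)

    def outgoing(self, src, dst):
        """Rewrite the source of a datagram leaving the local network."""
        if src not in self.out_map:
            self.out_map[src] = self.next_port
            self.in_map[self.next_port] = src
            self.next_port += 1
        return (self.wan_ip, self.out_map[src]), dst

    def incoming(self, src, dst):
        """Rewrite the destination of a reply; None if no table entry."""
        wan_ip, wan_port = dst
        if wan_ip != self.wan_ip or wan_port not in self.in_map:
            return None
        return src, self.in_map[wan_port]


nat = NatRouter("138.76.29.7")
server = ("128.119.40.186", 80)
out = nat.outgoing(("10.0.0.1", 3345), server)   # steps 1 and 2
back = nat.incoming(server, out[0])              # steps 3 and 4
print(out)
print(back)
```

An incoming datagram with no matching table entry returns `None` here, which is the NAT traversal problem in miniature: an outside client cannot start a connection to a host behind the NAT.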
NAT: network address translation
• 16-bit port-number field:
  • 60,000 simultaneous connections with a single LAN-side address!
• NAT challenges:
  • violates end-to-end argument
  • NAT possibility must be taken into account by app designers, e.g., P2P applications
• NAT traversal: what if client wants to connect to server behind NAT?
Chapter 4: Outline
✓ Overview of Network Layer
✓ What's Inside a Router?
☐ IP: Internet Protocol
  ✓ Data Format and Fragmentation
  ✓ IPv4 Addressing
  ✓ Network Address Translation
  ☐ IPv6
☐ Routing Protocols
☐ Intra-AS Routing in the Internet: OSPF
☐ Routing among ISPs: BGP
☐ SDN: Software Defined Networks
☐ ICMP: The Internet Control Message Protocol
☐ Network Management and SNMP
IPv6: motivation
• initial motivation: 32-bit address space soon to be completely allocated.
• additional motivation:
  • header format helps speed processing/forwarding
  • header changes to facilitate QoS
IPv6 datagram format:
• fixed-length 40 byte header
• no fragmentation allowed
IPv6 addresses
• 128-bit addresses
  • 340 billion billion billion billion
  • not quite enough for one IP for each atom in Earth, but within ~10 orders of magnitude
• How do you write them?
  • 2600:1008:b057:7412:506d:1fda:1852:6a68
• More compact
  • 2600:805::79 = 2600:0805:0000:0000:0000:0000:0000:0079
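The compact and full notations can be converted with the standard `ipaddress` module; a short sketch using the address from the slide:

```python
import ipaddress

addr = ipaddress.IPv6Address("2600:805::79")

# Full form: every group written out with leading zeros.
print(addr.exploded)     # 2600:0805:0000:0000:0000:0000:0000:0079

# Compact form: leading zeros dropped, longest run of zero groups as "::".
print(addr.compressed)   # 2600:805::79

# Size of the address space: 2**128, roughly 3.4e38.
print(2 ** 128)
```

Both spellings parse to the same 128-bit value, so comparing `IPv6Address` objects is safer than comparing strings.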
IPv6 datagram format
• priority: identify priority among datagrams in flow
• flow label: identify datagrams in same "flow" (concept of "flow" not well defined)
• next header: identify upper layer protocol for data
header layout (each row is 32 bits wide):
ver | pri | flow label
payload len | next hdr | hop limit
source address (128 bits)
destination address (128 bits)
data
Other changes from IPv4
• checksum: removed entirely to reduce processing time at each hop
• options: allowed, but outside of header, indicated by "Next Header" field
• ICMPv6: new version of ICMP
  • additional message types, e.g. "Packet Too Big"
  • multicast group management functions
Transition from IPv4 to IPv6
• not all routers can be upgraded simultaneously
  • no "flag days"
  • how will network operate with mixed IPv4 and IPv6 routers?
• tunneling: IPv6 datagram carried as payload in IPv4 datagram among IPv4 routers
[Figure: an IPv4 datagram consists of IPv4 header fields (including IPv4 source, dest addr) and an IPv4 payload; the payload is a whole IPv6 datagram, with its own IPv6 header fields (IPv6 source, dest addr) and UDP/TCP payload]
Tunneling
physical view: A (IPv6) - B (IPv6) - C (IPv4) - D (IPv4) - E (IPv6) - F (IPv6)
logical view: A (IPv6) - B (IPv6) - [IPv4 tunnel connecting IPv6 routers] - E (IPv6) - F (IPv6)
Tunneling
physical view: A (IPv6) - B (IPv6) - C (IPv4) - D (IPv4) - E (IPv6) - F (IPv6)
logical view: A (IPv6) - B (IPv6) - [IPv4 tunnel connecting IPv6 routers] - E (IPv6) - F (IPv6)
datagrams along the path:
• A-to-B: IPv6 (flow: X, src: A, dest: F, data)
• B-to-C, and onward through the IPv4 tunnel: IPv6 inside IPv4 (outer IPv4 header src: B, dest: E; inner IPv6 datagram flow: X, src: A, dest: F, data)
• E-to-F: IPv6 (flow: X, src: A, dest: F, data)
IPv6: adoption
• Google: 8% of clients access services via IPv6
• NIST: 1/3 of all US government domains are IPv6 capable
• Long (long!) time for deployment, use
  • 20 years and counting!
  • think of application-level changes in last 20 years: WWW, Facebook, streaming media, Skype, …
  • Why?
Chapter 4: Outline
✓ Overview of Network Layer
✓ What's Inside a Router?
✓ IP: Internet Protocol
☐ Routing Protocols
☐ Intra-AS Routing in the Internet: OSPF
☐ Routing among ISPs: BGP
☐ SDN: Software Defined Networks
☐ ICMP: The Internet Control Message Protocol
☐ Network Management and SNMP
Routing protocols
Routing protocol goal: determine "good" paths (equivalently, routes) from sending hosts to receiving hosts, through network of routers
• path: sequence of routers packets will traverse in going from given initial source host to given final destination host
• "good": least "cost", "fastest", "least congested"
• routing: a "top-10" networking challenge!
Graph abstraction of the network
[Figure: graph on nodes u, v, w, x, y, z with link costs]
graph: G = (N, E)
N = set of routers = { u, v, w, x, y, z }
E = set of links = { (u,v), (u,x), (u,w), (v,x), (v,w), (x,w), (x,y), (w,y), (w,z), (y,z) }
aside: graph abstraction is useful in other network contexts, e.g., P2P, where N is set of peers and E is set of TCP connections
Graph abstraction: costs
[Figure: graph on nodes u, v, w, x, y, z with link costs]
c(x,x') = cost of link (x,x'), e.g., c(w,z) = 5
cost could always be 1, or inversely related to bandwidth, or related to congestion
cost of path $(x_1, x_2, x_3, \ldots, x_p) = c(x_1,x_2) + c(x_2,x_3) + \cdots + c(x_{p-1},x_p)$
key question: what is the least-cost path between u and z?
routing algorithm: algorithm that finds that least cost path
Routing algorithm classification
Q: global or decentralized information?
global:
• all routers have complete topology, link cost info
• "link state" algorithms
decentralized:
• router knows physically-connected neighbors, link costs to neighbors
• iterative process of computation, exchange of info with neighbors
• "distance vector" algorithms
Q: static or dynamic?
static:
• routes change slowly over time
dynamic:
• routes change more quickly
  • periodic update
  • in response to link cost changes
A link-state routing algorithm
Dijkstra's algorithm
• net topology, link costs known to all nodes
  • accomplished via "link state broadcast"
  • all nodes have same info
• computes least cost paths from one node ("source") to all other nodes
  • gives forwarding table for that node
• iterative: after k iterations, know least cost path to k dest.'s
Dijkstra's algorithm
notation:
• c(x,y): link cost from node x to y; = ∞ if not direct neighbors
• D(v): current value of cost of path from source to dest. v
• p(v): predecessor node along path from source to v
• N': set of nodes whose least cost path definitively known

Initialization:
  N' = {u}
  for all nodes v
    if v adjacent to u
      then D(v) = c(u,v)
    else D(v) = ∞

Loop
  find w not in N' such that D(w) is a minimum
  add w to N'
  update D(v) for all v adjacent to w and not in N':
    D(v) = min( D(v), D(w) + c(w,v) )
  /* new cost to v is either old cost to v or known
     shortest path cost to w plus cost from w to v */
until all nodes in N'
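The pseudocode can be turned into runnable Python. This sketch swaps the linear "find w with minimum D(w)" scan for a binary heap (`heapq`), which is a common implementation choice rather than what the slide literally shows. The names `dijkstra`, `first_hop`, and `GRAPH` are made up here; the link costs are those of the six-node graph used in "example 2" of these slides.

```python
import heapq


def dijkstra(graph, source):
    """Return (D, p): least costs and predecessors from source."""
    dist = {node: float("inf") for node in graph}   # D(v)
    pred = {node: None for node in graph}           # p(v)
    dist[source] = 0
    done = set()                                    # N'
    heap = [(0, source)]
    while heap:
        d, w = heapq.heappop(heap)
        if w in done:
            continue            # stale heap entry
        done.add(w)             # add w to N'
        for v, cost in graph[w].items():
            if v not in done and d + cost < dist[v]:
                dist[v] = d + cost
                pred[v] = w
                heapq.heappush(heap, (dist[v], v))
    return dist, pred


def first_hop(pred, source, dest):
    """Trace predecessors back from dest (dest != source) to get the next hop."""
    node = dest
    while pred[node] != source:
        node = pred[node]
    return node


GRAPH = {
    "u": {"v": 2, "x": 1, "w": 5},
    "v": {"u": 2, "x": 2, "w": 3},
    "w": {"u": 5, "v": 3, "x": 3, "y": 1, "z": 5},
    "x": {"u": 1, "v": 2, "w": 3, "y": 1},
    "y": {"x": 1, "w": 1, "z": 2},
    "z": {"w": 5, "y": 2},
}

dist, pred = dijkstra(GRAPH, "u")
print(dist)
print({d: ("u", first_hop(pred, "u", d)) for d in "vwxyz"})
```

Tracing predecessors gives the forwarding table at u: only destination v uses link (u,v); every other destination leaves on link (u,x).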
Dijkstra's algorithm: example 1
[Figure: graph on nodes u, v, w, x, y, z with link costs]
Step   N'       D(v),p(v)   D(w),p(w)   D(x),p(x)   D(y),p(y)   D(z),p(z)
0      u        7,u         3,u         5,u         ∞           ∞
1      uw       6,w                     5,u         11,w        ∞
2      uwx      6,w                                 11,w        14,x
3      uwxv                                         10,v        14,x
4      uwxvy                                                    12,y
5      uwxvyz
notes:
• construct shortest path tree by tracing predecessor nodes
• ties can exist (can be broken arbitrarily)
Dijkstra's algorithm: example 2
[Figure: graph on nodes u, v, w, x, y, z with link costs]
Step   N'       D(v),p(v)   D(w),p(w)   D(x),p(x)   D(y),p(y)   D(z),p(z)
0      u        2,u         5,u         1,u         ∞           ∞
1      ux       2,u         4,x                     2,x         ∞
2      uxy      2,u         3,y                                 4,y
3      uxyv                 3,y                                 4,y
4      uxyvw                                                    4,y
5      uxyvwz
* Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/
Dijkstra's algorithm: example 2
[Figure: resulting shortest-path tree from u]
resulting forwarding table in u:
destination   link
v             (u,v)
x             (u,x)
y             (u,x)
w             (u,x)
z             (u,x)
Dijkstra's algorithm, discussion
algorithm complexity: n nodes
• each iteration: need to check all nodes, w, not in N'
• n(n+1)/2 comparisons: $O(n^2)$
• more efficient implementations possible: $O(n \log n)$
oscillations possible:
• e.g., suppose link cost equals amount of carried traffic:
[Figure: four nodes A, B, C, D in a ring, shown in four stages: initially, then three rounds of "given these costs, find new routing… resulting in new costs". Link costs such as 2+e, 1+e, 1, and 0 swap between the two directions around the ring from one round to the next.]
Distance vector algorithm
Bellman-Ford equation (dynamic programming)
let $d_x(y)$ := cost of least-cost path from x to y
then
$d_x(y) = \min_v \{\, c(x,v) + d_v(y) \,\}$
where:
• $c(x,v)$: cost to neighbor v
• $d_v(y)$: cost from neighbor v to destination y
• min taken over all neighbors v of x
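Bellman-Ford example
[Figure: graph on nodes u, v, w, x, y, z with link costs]
clearly, $d_v(z) = 5$, $d_x(z) = 3$, $d_w(z) = 3$
B-F equation says:
$d_u(z) = \min \{\, c(u,v) + d_v(z),\; c(u,x) + d_x(z),\; c(u,w) + d_w(z) \,\}$
$\phantom{d_u(z)} = \min \{\, 2 + 5,\; 1 + 3,\; 5 + 3 \,\} = 4$
node achieving minimum is next hop in shortest path, used in forwarding table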
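The same calculation in code, as a minimal sketch; `bellman_ford_step` is a hypothetical helper that evaluates the B-F equation once for a single destination.

```python
def bellman_ford_step(link_costs, neighbor_dists):
    """Evaluate d_x(y) = min over neighbors v of c(x,v) + d_v(y).

    link_costs: c(x,v) for each neighbor v of x.
    neighbor_dists: d_v(y) for each neighbor v.
    Returns (least cost, neighbor achieving it), i.e. cost and next hop.
    """
    return min((link_costs[v] + neighbor_dists[v], v) for v in link_costs)


# Values from the example: node u, destination z.
c_u = {"v": 2, "x": 1, "w": 5}        # c(u,v), c(u,x), c(u,w)
d_to_z = {"v": 5, "x": 3, "w": 3}     # d_v(z), d_x(z), d_w(z)

cost, next_hop = bellman_ford_step(c_u, d_to_z)
print(cost, next_hop)   # 4 x
```

The neighbor that achieves the minimum, x, is the entry u installs in its forwarding table for destination z.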
Distance vector algorithm
• $D_x(y)$ = estimate of least cost from x to y
  • x maintains distance vector $D_x = [D_x(y) : y \in N]$
• node x:
  • knows cost to each neighbor v: c(x,v)
  • maintains its neighbors' distance vectors. For each neighbor v, x maintains $D_v = [D_v(y) : y \in N]$
Distance vector algorithm
key idea:
• from time-to-time, each node sends its own distance vector estimate to neighbors
• when x receives new DV estimate from neighbor, it updates its own DV using B-F equation:
  $D_x(y) \leftarrow \min_v \{\, c(x,v) + D_v(y) \,\}$ for each node $y \in N$
• under minor, natural conditions, the estimate $D_x(y)$ converges to the actual least cost $d_x(y)$
Distance vector algorithm
iterative, asynchronous: each local iteration caused by:
• local link cost change
• DV update message from neighbor
distributed:
• each node notifies neighbors only when its DV changes
  • neighbors then notify their neighbors if necessary
each node:
  wait for (change in local link cost or msg from neighbor)
  recompute estimates
  if DV to any dest has changed, notify neighbors
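The per-node loop can be simulated in a few lines. This is a simplification under stated assumptions: nodes exchange vectors in synchronous rounds instead of asynchronously, and every node's vector is visible each round. The function name `distance_vector` is invented, and the three-node topology (x-y cost 2, y-z cost 1, x-z cost 7) is the one used in the worked table for this algorithm.

```python
INF = float("inf")


def distance_vector(costs):
    """Run synchronous DV rounds until no estimate changes."""
    nodes = sorted(costs)
    # Initially each node knows only the cost to itself and direct neighbors.
    dv = {x: {y: 0 if x == y else costs[x].get(y, INF) for y in nodes}
          for x in nodes}
    changed = True
    while changed:
        changed = False
        # Vectors as "sent" at the start of this round.
        snapshot = {x: dict(dv[x]) for x in nodes}
        for x in nodes:
            for y in nodes:
                if x == y:
                    continue
                # B-F equation: D_x(y) = min_v { c(x,v) + D_v(y) }
                best = min(costs[x][v] + snapshot[v][y] for v in costs[x])
                if best != dv[x][y]:
                    dv[x][y] = best
                    changed = True
    return dv


COSTS = {
    "x": {"y": 2, "z": 7},
    "y": {"x": 2, "z": 1},
    "z": {"x": 7, "y": 1},
}
for node, vector in distance_vector(COSTS).items():
    print(node, vector)
```

After one round x learns that z is reachable at cost 3 via y (2 + 1) rather than 7 directly, and the second round changes nothing, so the loop stops.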
[Figure: distance tables at nodes x, y, z over time for the three-node network with c(x,y) = 2, c(y,z) = 1, c(x,z) = 7]
initial tables (rows: from x, y, z; columns: cost to x, y, z):
node x table:   x: 0 2 7    y: ∞ ∞ ∞    z: ∞ ∞ ∞
node y table:   x: ∞ ∞ ∞    y: 2 0 1    z: ∞ ∞ ∞
node z table:   x: ∞ ∞ ∞    y: ∞ ∞ ∞    z: 7 1 0
at node x, after receiving the vectors of y and z:
$D_x(y) = \min\{ c(x,y) + D_y(y),\; c(x,z) + D_z(y) \} = \min\{ 2+0,\; 7+1 \} = 2$
$D_x(z) = \min\{ c(x,y) + D_y(z),\; c(x,z) + D_z(z) \} = \min\{ 2+1,\; 7+0 \} = 3$
final table at every node:
x: 0 2 3
y: 2 0 1
z: 3 1 0
Distance vector: link cost changes
link cost changes:
• node detects local link cost change
• updates routing info, recalculates distance vector
• if DV changes, notify neighbors
[Figure: nodes x, y, z with c(y,z) = 1 and c(x,z) = 50; c(x,y) changes from 4 to 1]
"good news travels fast"
t0: y detects link-cost change, updates its DV, informs its neighbors.
t1: z receives update from y, updates its table, computes new least cost to x, sends its neighbors its DV.
t2: y receives z's update, updates its distance table. y's least costs do not change, so y does not send a message to z.
* Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/
Distance vector: link cost changes
link cost changes:
• node detects local link cost change
• bad news travels slow: "count to infinity" problem!
• 44 iterations before algorithm stabilizes: see text
[Figure: nodes x, y, z with c(y,z) = 1 and c(x,z) = 50; c(x,y) changes from 4 to 60]
poisoned reverse:
• if Z routes through Y to get to X:
  • Z tells Y its (Z's) distance to X is infinite (so Y won't route to X via Z)
• will this completely solve count to infinity problem?
Comparison of LS and DV algorithms
message complexity
• LS: with n nodes, E links, O(nE) msgs sent
• DV: exchange between neighbors only
  • convergence time varies
speed of convergence
• LS: $O(n^2)$ algorithm requires O(nE) msgs
  • may have oscillations
• DV: convergence time varies
  • may be routing loops
  • count-to-infinity problem
robustness: what happens if router malfunctions?
LS:
• node can advertise incorrect link cost
• each node computes only its own table
DV:
• DV node can advertise incorrect path cost
• each node's table used by others
  • errors propagate through network
Chapter 4: Outline
✓ Overview of Network Layer
✓ What's Inside a Router?
✓ IP: Internet Protocol
✓ Routing Protocols
☐ Intra-AS Routing in the Internet: OSPF
☐ Routing among ISPs: BGP
☐ SDN: Software Defined Networks
☐ ICMP: The Internet Control Message Protocol
☐ Network Management and SNMP
Making routing scalable
our routing study thus far: idealized
• all routers identical
• network "flat"
… not true in practice
scale: with billions of destinations:
• can't store all destinations in routing tables!
• routing table exchange would swamp links!
administrative autonomy
• internet = network of networks
• each network admin may want to control routing in its own network
Internet approach to scalable routing
aggregate routers into regions known as "autonomous systems" (AS) (a.k.a. "domains")
intra-AS routing
• routing among hosts, routers in same AS ("network")
• all routers in AS must run same intra-domain protocol
• routers in different AS can run different intra-domain routing protocol
• gateway router: at "edge" of its own AS, has link(s) to router(s) in other AS'es
inter-AS routing
• routing among AS'es
• gateways perform inter-domain routing (as well as intra-domain routing)
Interconnected ASes
[Figure: three interconnected ASes: AS1 (routers 1a-1d), AS2 (2a-2c), AS3 (3a-3c). Inside a router, the intra-AS routing algorithm and the inter-AS routing algorithm both feed the forwarding table.]
• forwarding table configured by both intra- and inter-AS routing algorithms
  • intra-AS routing determines entries for destinations within AS
  • inter-AS & intra-AS determine entries for external destinations
Inter-AS tasks
• suppose router in AS1 receives datagram destined outside of AS1:
  • router should forward packet to gateway router, but which one?
AS1 must:
1. learn which dests are reachable through AS2, which through AS3
2. propagate this reachability info to all routers in AS1
job of inter-AS routing!
[Figure: AS1 (1a-1d) connected to AS2 (2a-2c) and AS3 (3a-3c); AS2 and AS3 each connect onward to other networks]
Intra-AS Routing
• also known as interior gateway protocols (IGP)
• most common intra-AS routing protocols:
  • RIP: Routing Information Protocol
  • OSPF: Open Shortest Path First
  • EIGRP: Enhanced Interior Gateway Routing Protocol, the successor to IGRP (Cisco proprietary for decades, until 2016)
OSPF (Open Shortest Path First)
• "open": publicly available
• uses link-state algorithm
  • link state packet dissemination
  • topology map at each node
  • route computation using Dijkstra's algorithm
• router floods OSPF link-state advertisements to all other routers in entire AS
  • carried in OSPF messages directly over IP (rather than TCP or UDP)
  • link state: for each attached link
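Once flooding has given every router the same topology map, each router runs Dijkstra's algorithm locally. The sketch below shows that computation; the six-node topology and its link costs are assumptions chosen for illustration, and the helper names `build_graph` and `dijkstra` are hypothetical.

```python
import heapq

def build_graph(links):
    """Turn (a, b, cost) triples into an undirected adjacency map."""
    graph = {}
    for a, b, cost in links:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph

def dijkstra(graph, source):
    """Return (least cost to each node, predecessor of each node)."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue               # stale heap entry
        done.add(u)
        for v, cost in graph[u].items():
            candidate = d + cost
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                prev[v] = u
                heapq.heappush(heap, (candidate, v))
    return dist, prev

# Assumed link-state database: every router holds the same list.
LINKS = [
    ("u", "v", 2), ("u", "x", 1), ("u", "w", 5), ("v", "x", 2), ("v", "w", 3),
    ("x", "w", 3), ("x", "y", 1), ("w", "y", 1), ("w", "z", 5), ("y", "z", 2),
]
dist, prev = dijkstra(build_graph(LINKS), "u")
print(dist)
```

Following `prev` back from a destination yields the path, and the first hop on that path is what goes into the forwarding table.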
OSPF "advanced" features
• security: all OSPF messages authenticated (to prevent malicious intrusion)
• multiple same-cost paths allowed (only one path in RIP)
• for each link, multiple cost metrics for different ToS (e.g., satellite link cost set low for best effort ToS; high for real-time ToS)
• integrated uni- and multi-cast support:
  • Multicast OSPF (MOSPF) uses same topology database as OSPF
• hierarchical OSPF in large domains.
Hierarchical OSPF
[Figure: a backbone area connecting area 1, area 2, and area 3; labeled router roles: boundary router, backbone routers, area border routers, internal routers]
• two-level hierarchy: local area, backbone.
  • link-state advertisements only in area
  • each node has detailed area topology; only knows direction (shortest path) to nets in other areas.
• area border routers: "summarize" distances to nets in own area, advertise to other area border routers.
• backbone routers: run OSPF routing limited to backbone.
• boundary routers: connect to other AS'es.
Chapter 4: Outline
✓ Overview of Network Layer
✓ What's Inside a Router?
✓ IP: Internet Protocol
✓ Routing Protocols
✓ Intra-AS Routing in the Internet: OSPF
□ Routing among ISPs: BGP
□ SDN: Software Defined Networks
□ ICMP: The Internet Control Message Protocol
□ Network Management and SNMP
Internet inter-AS routing: BGP
• BGP (Border Gateway Protocol): the de facto inter-domain routing protocol
  • "glue that holds the Internet together"
• BGP provides each AS a means to:
  • eBGP: obtain subnet reachability information from neighboring ASes
  • iBGP: propagate reachability information to all AS-internal routers.
  • determine "good" routes to other networks based on reachability information and policy
• allows subnet to advertise its existence to rest of Internet: "I am here"
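eBGP, iBGP connections
[Figure: AS1 (routers 1a-1d), AS2 (2a-2d), AS3 (3a-3d); eBGP connectivity runs between gateway routers of neighboring ASes, iBGP connectivity runs among routers inside each AS]
• gateway routers run both eBGP and iBGP protocols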
BGP basics
• BGP session: two BGP routers ("peers") exchange BGP messages over semi-permanent TCP connection:
  • advertising paths to different destination network prefixes (BGP is a "path vector" protocol)
• when AS3 gateway router 3a advertises path AS3,X to AS2 gateway router 2c:
  • AS3 promises to AS2 it will forward datagrams towards X
[Figure: AS1, AS2, AS3 with prefix X attached to AS3; BGP advertisement "AS3, X" sent from 3a to 2c]
Path attributes and BGP routes
• advertised prefix includes BGP attributes
  • prefix + attributes = "route"
• two important attributes:
  • AS-PATH: list of ASes through which prefix advertisement has passed
  • NEXT-HOP: indicates specific internal-AS router to next-hop AS
• Policy-based routing:
  • gateway receiving route advertisement uses import policy to accept/decline path (e.g., never route through AS Y).
  • AS policy also determines whether to advertise path to other neighboring ASes
BGP path advertisement
[Figure: AS1, AS2, AS3 with prefix X attached to AS3; advertisement "AS3,X" from 3a to 2c, advertisement "AS2,AS3,X" from 2a to 1c]
• AS2 router 2c receives path advertisement AS3,X (via eBGP) from AS3 router 3a
• Based on AS2 policy, AS2 router 2c accepts path AS3,X, propagates (via iBGP) to all AS2 routers
• Based on AS2 policy, AS2 router 2a advertises (via eBGP) path AS2,AS3,X to AS1 router 1c
BGP path advertisement
[Figure: AS1, AS2, AS3 with prefix X attached to AS3; 1c receives "AS2,AS3,X" from 2a and "AS3,X" from 3a]
gateway router may learn about multiple paths to destination:
• AS1 gateway router 1c learns path AS2,AS3,X from 2a
• AS1 gateway router 1c learns path AS3,X from 3a
• Based on policy, AS1 gateway router 1c chooses path AS3,X, and advertises path within AS1 via iBGP
BGP messages
• BGP messages exchanged between peers over TCP connection
• BGP messages:
  • OPEN: opens TCP connection to remote BGP peer and authenticates sending BGP peer
  • UPDATE: advertises new path (or withdraws old)
  • KEEPALIVE: keeps connection alive in absence of UPDATEs; also ACKs OPEN request
  • NOTIFICATION: reports errors in previous msg; also used to close connection
BGP, OSPF, forwarding table entries
Q: how does router set forwarding table entry to distant prefix?
• recall: 1a, 1b, 1d learn about dest X via iBGP from 1c: "path to X goes through 1c"
• 1d: OSPF intra-domain routing: to get to 1c, forward over outgoing local interface 1
[Figure: AS1, AS2, AS3 with advertisements "AS3,X" and "AS2,AS3,X"; physical links inside AS1, with local link interfaces at 1a and 1d numbered 1 and 2]
Forwarding table at 1d:
dest | interface
…    | …
X    | 1
…    | …
BGP, OSPF, forwarding table entries
Q: how does router set forwarding table entry to distant prefix?
• recall: 1a, 1b, 1d learn about dest X via iBGP from 1c: "path to X goes through 1c"
• 1d: OSPF intra-domain routing: to get to 1c, forward over outgoing local interface 1
• 1a: OSPF intra-domain routing: to get to 1c, forward over outgoing local interface 2
Forwarding table at 1a:
dest | interface
…    | …
X    | 2
…    | …
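The two-step lookup above (iBGP names the gateway, OSPF names the interface toward that gateway) can be sketched in a few lines. The dictionary names and the function `forwarding_entry` are illustrative assumptions; the values come from the 1a and 1d examples.

```python
def forwarding_entry(prefix, bgp_gateway, ospf_interface):
    """Resolve a distant prefix to a local outgoing interface."""
    gateway = bgp_gateway[prefix]      # learned via iBGP: "path to X goes through 1c"
    return ospf_interface[gateway]     # learned via OSPF: interface toward that gateway

bgp_gateway = {"X": "1c"}              # the same iBGP information at 1a and 1d
ospf_interface_at_1d = {"1c": 1}
ospf_interface_at_1a = {"1c": 2}

table_1d = {p: forwarding_entry(p, bgp_gateway, ospf_interface_at_1d) for p in bgp_gateway}
table_1a = {p: forwarding_entry(p, bgp_gateway, ospf_interface_at_1a) for p in bgp_gateway}
print(table_1d, table_1a)
```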
BGP route selection
• router may learn about more than one route to destination AS, selects route based on:
  1. local preference value attribute: policy decision
  2. shortest AS-PATH
  3. closest NEXT-HOP router: hot potato routing
  4. additional criteria
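The first three criteria can be read as a lexicographic comparison. The sketch below assumes a simplified `Route` record with hypothetical field names and made-up candidate routes; real BGP implementations apply further tie-breakers (the "additional criteria" above) that are not modeled here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple
    next_hop: str
    local_pref: int = 100   # higher is preferred (policy decision)
    igp_cost: int = 0       # intra-domain cost to reach next_hop

def select_route(candidates):
    """Prefer highest local preference, then shortest AS-PATH, then closest NEXT-HOP."""
    return min(candidates, key=lambda r: (-r.local_pref, len(r.as_path), r.igp_cost))

r1 = Route("X", ("AS2", "AS3"), next_hop="2a", igp_cost=5)
r2 = Route("X", ("AS3",), next_hop="3a", igp_cost=9)
r3 = Route("X", ("AS4", "AS5", "AS3"), next_hop="4a", local_pref=200, igp_cost=20)
r4 = Route("X", ("AS6",), next_hop="6a", igp_cost=3)

by_path_length = select_route([r1, r2])
by_local_pref = select_route([r1, r2, r3])
by_hot_potato = select_route([r2, r4])
print(by_path_length.next_hop, by_local_pref.next_hop, by_hot_potato.next_hop)
```

Because local preference is compared first, policy can override both path length and intra-domain cost.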
Hot Potato Routing
[Figure: AS1, AS2, AS3 with prefix X attached to AS3; advertisements "AS3,X" and "AS1,AS3,X" arrive at AS2; OSPF link weights inside AS2 are 201, 152, 112, 263]
• 2d learns (via iBGP) it can route to X via 2a or 2c
• hot potato routing: choose local gateway that has least intra-domain cost (e.g., 2d chooses 2a, even though more AS hops to X): don't worry about inter-domain cost!
BGP: achieving policy via advertisements
Suppose an ISP only wants to route traffic to/from its customer networks (does not want to carry transit traffic between other ISPs)
[Figure: provider networks A, B, C and customer networks w, x, y]
• A advertises path A-w to B and to C
• B chooses not to advertise B-A-w to C:
  • B gets no "revenue" for routing C-B-A-w, since none of C, A, w are B's customers
  • C does not learn about C-B-A-w path
• C will route C-A-w (not using B) to get to w
BGP: achieving policy via advertisements
Suppose an ISP only wants to route traffic to/from its customer networks (does not want to carry transit traffic between other ISPs)
[Figure: provider networks A, B, C and customer networks w, x, y]
• A, B, C are provider networks
• x, w, y are customers (of provider networks)
• x is dual-homed: attached to two networks
• policy to enforce: x does not want to route from B to C via x
  • so x will not advertise to B a route to C
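The advertisement decisions in both policy examples follow one rule of thumb: pass a route along only when doing so serves one of your own customers. The sketch below encodes that rule; the function name and the customer sets are assumptions (in particular, that x is B's only customer is read off the figure, not stated in the text).

```python
def should_advertise(customers, learned_from, send_to):
    """Advertise a route only if it was learned from a customer or is sent to a customer."""
    return learned_from in customers or send_to in customers

B_CUSTOMERS = {"x"}    # assumption taken from the figure
X_CUSTOMERS = set()    # x is a customer network with no customers of its own

# B learned path A-w from A. Should B tell C about B-A-w?
b_tells_c = should_advertise(B_CUSTOMERS, learned_from="A", send_to="C")
# Should B tell its customer x about B-A-w?
b_tells_x = should_advertise(B_CUSTOMERS, learned_from="A", send_to="x")
# x learned a route to C from C. Should x tell B about it?
x_tells_b = should_advertise(X_CUSTOMERS, learned_from="C", send_to="B")
print(b_tells_c, b_tells_x, x_tells_b)
```

Withholding an advertisement is the whole mechanism: a neighbor cannot route through a path it never hears about.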
Why different Intra-, Inter-AS routing?
policy:
• inter-AS: admin wants control over how its traffic is routed, who routes through its net.
• intra-AS: single admin, so no policy decisions needed
scale:
• hierarchical routing saves table size, reduces update traffic
performance:
• intra-AS: can focus on performance
• inter-AS: policy may dominate over performance
Chapter 4: Outline
✓ Overview of Network Layer
✓ What's Inside a Router?
✓ IP: Internet Protocol
✓ Routing Protocols
✓ Intra-AS Routing in the Internet: OSPF
✓ Routing among ISPs: BGP
□ SDN: Software Defined Networks
□ ICMP: The Internet Control Message Protocol
□ Network Management and SNMP
Software defined networking (SDN)
• Internet network layer: historically has been implemented via distributed, per-router approach
  • monolithic router contains switching hardware, runs proprietary implementation of Internet standard protocols (IP, RIP, IS-IS, OSPF, BGP) in proprietary router OS (e.g., Cisco IOS)
  • different "middleboxes" for different network layer functions: firewalls, load balancers, NAT boxes, ...
• ~2005: renewed interest in rethinking network control plane
Recall: per-router control plane
Individual routing algorithm components in each and every router interact with each other in control plane to compute forwarding tables
[Figure: a routing algorithm in the control plane of every router, above the data plane]
Textbook excerpt (Section 4.1, Overview of Network Layer):
… tables. In this example, a routing algorithm runs in each and every router and both forwarding and routing functions are contained within a router. As we'll see in Sections 5.3 and 5.4, the routing algorithm function in one router communicates with the routing algorithm function in other routers to compute the values for its forwarding table. How is this communication performed? By exchanging routing messages containing routing information according to a routing protocol! We'll cover routing algorithms and protocols in Sections 5.2 through 5.4.
The distinct and different purposes of the forwarding and routing functions can be further illustrated by considering the hypothetical (and unrealistic, but technically feasible) case of a network in which all forwarding tables are configured directly by human network operators physically present at the routers. In this case, no routing protocols would be required! Of course, the human operators would need to interact with each other to ensure that the forwarding tables were configured in such a way that packets reached their intended destinations. It's also likely that human configuration would be more error-prone and much slower to respond to changes in the network topology than a routing protocol. We're thus fortunate that all networks have both a forwarding and a routing function!
[Figure 4.2: Routing algorithms determine values in forward tables. The routing algorithm in the control plane fills the local forwarding table in the data plane; values in the arriving packet's header are looked up in that table (header values 0100, 0110, 0111, 1001 map to outputs 3, 2, 2, 1).]
Recall: logically centralized control plane
A distinct (typically remote) controller interacts with local control agents (CAs) in routers to compute forwarding tables
[Figure: a remote controller in the control plane communicating with a control agent (CA) in each router of the data plane]
Software defined networking (SDN)
Why a logically centralized control plane?
• easier network management: avoid router misconfigurations, greater flexibility of traffic flows
• table-based forwarding allows "programming" routers
  • centralized "programming" easier: compute tables centrally and distribute
  • distributed "programming" more difficult: compute tables as result of distributed algorithm (protocol) implemented in each and every router
• open (non-proprietary) implementation of control plane
Analogy: mainframe to PC evolution*
• Mainframe: specialized applications on a specialized operating system on specialized hardware. Vertically integrated; closed, proprietary; slow innovation; small industry.
• PC: many apps over an open interface to an OS (Windows, Linux, or Mac OS), over an open interface to a microprocessor. Horizontal; open interfaces; rapid innovation; huge industry.
* Slide courtesy: N. McKeown
Traffic engineering: difficult with traditional routing
[Figure: six-node network u, v, w, x, y, z with link costs]
Q: what if network operator wants u-to-z traffic to flow along uvwz, x-to-z traffic to flow along xwyz?
A: need to define link weights so traffic routing algorithm computes routes accordingly (or need a new routing algorithm)!
Traffic engineering: difficult
[Figure: six-node network u, v, w, x, y, z with link costs]
Q: what if network operator wants to split u-to-z traffic along uvwz and uxyz (load balancing)?
A: can't do it (or need a new routing algorithm)
Traffic engineering: difficult
[Figure: six-node network u, v, w, x, y, z with link costs; blue and red traffic flows both pass through w]
Q: what if w wants to route blue and red traffic differently?
A: can't do it (with destination-based forwarding, and LS, DV routing)
Software defined networking (SDN)
1. generalized "flow-based" forwarding (e.g., OpenFlow)
2. control, data plane separation
3. control plane functions external to data-plane switches
4. programmable control applications (routing, access control, load balance, …)
[Figure: control applications on top of a remote controller in the control plane; the controller talks to a control agent (CA) in each data-plane switch]
Generalized Forwarding and SDN
Each router contains a flow table that is computed and distributed by a logically centralized routing controller
[Figure: logically-centralized routing controller in the control plane; in the data plane, values in the arriving packet's header are matched against the local flow table, whose columns are headers, counters, actions]
OpenFlow data plane abstraction
• flow: defined by header fields
• generalized forwarding: simple packet-handling rules
  • Pattern: match values in packet header fields
  • Actions: for matched packet: drop, forward, modify matched packet, or send matched packet to controller
  • Priority: disambiguate overlapping patterns
  • Counters: #bytes and #packets
Flow table in a router (computed and distributed by controller) defines router's match+action rules
OpenFlow data plane abstraction: example rules (* : wildcard)
1. src=1.2.*.*, dest=3.4.5.* → drop
2. src=*.*.*.*, dest=3.4.*.* → forward(2)
3. src=10.1.2.3, dest=*.*.*.* → send to controller
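The pattern, action, priority, and counter fields can be tied together in a toy flow table. This is a sketch, not the OpenFlow wire format: the class names are hypothetical, priorities are assumed to follow the order in which the three rules are listed, and sending unmatched packets to the controller is an assumed table-miss behavior.

```python
from dataclasses import dataclass

def field_matches(pattern, address):
    """Compare dotted-quad strings octet by octet; '*' matches any octet."""
    return all(p == "*" or p == a for p, a in zip(pattern.split("."), address.split(".")))

@dataclass
class FlowEntry:
    priority: int
    src: str
    dst: str
    action: str
    packet_count: int = 0
    byte_count: int = 0

class FlowTable:
    def __init__(self, entries):
        # Highest priority first, so overlapping patterns are disambiguated.
        self.entries = sorted(entries, key=lambda e: -e.priority)

    def handle(self, src, dst, size):
        for entry in self.entries:
            if field_matches(entry.src, src) and field_matches(entry.dst, dst):
                entry.packet_count += 1
                entry.byte_count += size
                return entry.action
        return "send to controller"   # assumed table-miss behavior

rule1 = FlowEntry(3, "1.2.*.*", "3.4.5.*", "drop")
rule2 = FlowEntry(2, "*.*.*.*", "3.4.*.*", "forward(2)")
rule3 = FlowEntry(1, "10.1.2.3", "*.*.*.*", "send to controller")
table = FlowTable([rule1, rule2, rule3])

a1 = table.handle("1.2.9.9", "3.4.5.6", 100)    # matches rules 1 and 2; rule 1 wins
a2 = table.handle("7.7.7.7", "3.4.9.9", 200)    # matches rule 2 only
a3 = table.handle("10.1.2.3", "3.4.1.1", 300)   # matches rules 2 and 3; rule 2 wins
a4 = table.handle("10.1.2.3", "8.8.8.8", 400)   # matches rule 3 only
print(a1, a2, a3, a4)
```

The third packet shows why priority matters: it matches two patterns, and which action applies depends entirely on the assumed ordering.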
OpenFlow: Flow Table Entries
Each entry has three parts: Rule | Action | Stats
Rule (match fields):
• Switch Port
• Link layer: MAC src, MAC dst, Eth type, VLAN ID
• Network layer: IP Src, IP Dst, IP Prot
• Transport layer: TCP sport, TCP dport
Action:
1. Forward packet to port(s)
2. Encapsulate and forward to controller
3. Drop packet
4. Send to normal processing pipeline
5. Modify fields
Stats: packet + byte counters
Examples

Destination-based forwarding:
Switch Port | MAC src | MAC dst | Eth type | VLAN ID | IP Src | IP Dst | IP Prot | TCP sport | TCP dport | Action
* | * | * | * | * | * | 51.6.0.8 | * | * | * | port6
IP datagrams destined to IP address 51.6.0.8 should be forwarded to router output port 6

Firewall:
Switch Port | MAC src | MAC dst | Eth type | VLAN ID | IP Src | IP Dst | IP Prot | TCP sport | TCP dport | Action
* | * | * | * | * | * | * | * | * | 22 | drop
do not forward (block) all datagrams destined to TCP port 22

Switch Port | MAC src | MAC dst | Eth type | VLAN ID | IP Src | IP Dst | IP Prot | TCP sport | TCP dport | Action
* | * | * | * | * | 128.119.1.1 | * | * | * | * | drop
do not forward (block) all datagrams sent by host 128.119.1.1
Examples

Destination-based layer 2 (switch) forwarding:
Switch Port | MAC src | MAC dst | Eth type | VLAN ID | IP Src | IP Dst | IP Prot | TCP sport | TCP dport | Action
* | * | 22:A7:23:11:E1:02 | * | * | * | * | * | * | * | port3
layer 2 frames destined to MAC address 22:A7:23:11:E1:02 should be forwarded to output port 3
OpenFlow abstraction
match+action: unifies different kinds of devices
• Router
  • match: longest destination IP prefix
  • action: forward out a link
• Switch
  • match: destination MAC address
  • action: forward or flood
• Firewall
  • match: IP addresses and TCP/UDP port numbers
  • action: permit or deny
• NAT
  • match: IP address and port
  • action: rewrite address and port
OpenFlow example
Example: datagrams from hosts h5 and h6 should be sent to h3 or h4, via s1 and from there to s2
[Figure: a controller above switches s1, s2, s3, each with ports numbered 1-4. Hosts: h1 10.1.0.1, h2 10.1.0.2, h3 10.2.0.3, h4 10.2.0.4, h5 10.3.0.5, h6 10.3.0.6]
Flow table at s3 (the switch serving h5 and h6):
match | action
IP Src = 10.3.*.*, IP Dst = 10.2.*.* | forward(3)
Flow table at s1:
match | action
ingress port = 1, IP Src = 10.3.*.*, IP Dst = 10.2.*.* | forward(4)
Flow table at s2 (the switch serving h3 and h4):
match | action
ingress port = 2, IP Dst = 10.2.0.3 | forward(3)
ingress port = 2, IP Dst = 10.2.0.4 | forward(4)
SDN perspective: data plane switches
Data plane switches
• fast, simple, commodity switches implementing generalized data-plane forwarding in hardware
• switch flow table computed, installed by controller
• API for table-based switch control (e.g., OpenFlow)
  • defines what is controllable and what is not
• protocol for communicating with controller (e.g., OpenFlow)
[Figure: three layers. Control plane: network-control applications (routing, access control, load balance, …) over the northbound API, then the SDN controller (network operating system). Data plane: SDN-controlled switches, reached over the southbound API.]
SDN perspective: SDN controller
SDN controller (network OS):
• maintains network state information
• interacts with network control applications "above" via northbound API
• interacts with network switches "below" via southbound API
• implemented as distributed system for performance, scalability, fault-tolerance, robustness
SDN perspective: control applications
network-control apps:
• "brains" of control: implement control functions using lower-level services, API provided by SDN controller
• unbundled: can be provided by 3rd party, distinct from routing vendor or SDN controller
Components of SDN controller
• Interface layer to network control apps: abstractions API (network graph, RESTful API, intent, …) used by apps such as routing, access control, load balance
• Network-wide state management layer: state of network's links, switches, services: a distributed database (link-state info, host info, switch info, statistics, flow tables, …)
• Communication layer: communicate between SDN controller and controlled switches (OpenFlow, SNMP, …)
OpenFlow protocol
• operates between controller, switch
• TCP used to exchange messages
  • optional encryption
• three classes of OpenFlow messages:
  • controller-to-switch
  • asynchronous (switch to controller)
  • symmetric (misc)
[Figure: OpenFlow controller connected to switches]
OpenFlow: controller-to-switch messages
Key controller-to-switch messages
• features: controller queries switch features, switch replies
• configure: controller queries/sets switch configuration parameters
• modify-state: add, delete, modify flow entries in the OpenFlow tables
• packet-out: controller can send this packet out of specific switch port
OpenFlow: switch-to-controller messages
Key switch-to-controller messages
• packet-in: transfer packet (and its control) to controller. See packet-out message from controller
• flow-removed: flow table entry deleted at switch
• port status: inform controller of a change on a port.
Fortunately, network operators don't "program" switches by creating/sending OpenFlow messages directly. Instead they use a higher-level abstraction at the controller
SDN: control/data plane interaction example
[Figure: Dijkstra's link-state routing app on top of the SDN controller (network graph, RESTful API, intent; link-state info, host info, switch info, statistics, flow tables; OpenFlow, SNMP), which controls switches s1, s2, s3, s4]
1. S1, experiencing link failure, uses OpenFlow port status message to notify controller
2. SDN controller receives OpenFlow message, updates link status info
3. Dijkstra's routing algorithm application has previously registered to be called whenever link status changes. It is called.
4. Dijkstra's routing algorithm accesses network graph info, link state info in controller, computes new routes
SDN: control/data plane interaction example
5. link state routing app interacts with flow-table-computation component in SDN controller, which computes new flow tables needed
6. Controller uses OpenFlow to install new tables in switches that need updating
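The six steps can be sketched as a callback-driven controller. Everything here is an illustrative assumption: the class and function names, the unit link costs, and the four-switch ring topology (s1-s2, s1-s3, s2-s4, s3-s4). No real OpenFlow library is used; "installing" a table just stores a dictionary.

```python
import heapq

def next_hops(links, source):
    """Dijkstra from source; returns {destination: first hop on a least-cost path}."""
    adj = {}
    for (a, b), cost in links.items():
        adj.setdefault(a, {})[b] = cost
        adj.setdefault(b, {})[a] = cost
    first = {}
    done = set()
    heap = [(0, source, None)]
    while heap:
        d, node, hop = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if hop is not None:
            first[node] = hop
        for nbr, cost in adj.get(node, {}).items():
            if nbr not in done:
                heapq.heappush(heap, (d + cost, nbr, nbr if hop is None else hop))
    return first

class Controller:
    def __init__(self, links):
        self.links = dict(links)   # link-state info: {(a, b): cost}
        self.apps = []             # callbacks registered for link changes
        self.flow_tables = {}      # what has been installed in each switch

    def register(self, app):
        self.apps.append(app)

    def on_port_status(self, a, b, up, cost=1):
        # Steps 1-2: a switch reports a port change; update link-state info.
        key = (a, b) if (a, b) in self.links or (b, a) not in self.links else (b, a)
        if up:
            self.links[key] = cost
        else:
            self.links.pop(key, None)
        for app in self.apps:      # Step 3: call the registered applications.
            app(self)

    def install(self, tables):
        self.flow_tables = tables  # Step 6: push new tables to the switches.

def routing_app(controller):
    # Steps 4-5: read the network graph, compute routes and new tables.
    switches = {s for link in controller.links for s in link}
    controller.install({s: next_hops(controller.links, s) for s in switches})

links = {("s1", "s2"): 1, ("s1", "s3"): 1, ("s2", "s4"): 1, ("s3", "s4"): 1}
controller = Controller(links)
controller.register(routing_app)
routing_app(controller)                              # initial tables
before = controller.flow_tables["s1"]["s2"]
controller.on_port_status("s1", "s2", up=False)      # the s1-s2 link fails
after = controller.flow_tables["s1"]["s2"]
print(before, after)
```

The routing logic never touches a switch directly; it only reads controller state and hands back tables, which is the separation the example is meant to show.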
OpenDaylight (ODL) controller
• ODL Lithium controller
• network apps may be contained within, or be external to, SDN controller
• Service Abstraction Layer: interconnects internal, external applications and services
[Figure: layers from top to bottom: network service apps (Access Control, Traffic Engineering, …); REST API; Basic Network Service Functions (topology manager, switch manager, stats manager, forwarding manager, host manager); Service Abstraction Layer (SAL); southbound protocols (OpenFlow 1.0, SNMP, OVSDB, …)]
ONOS controller
• control apps separate from controller
• intent framework: high-level specification of service: what rather than how
• considerable emphasis on distributed core: service reliability, replication, performance scaling
[Figure: layers from top to bottom: network control apps; REST API and Intent; northbound abstractions, protocols (hosts, paths, flow rules, topology, devices, links, statistics); ONOS distributed core; southbound abstractions, protocols (device, link, host, flow, packet; OpenFlow, Netconf, OVSDB)]
SDN: selected challenges
• hardening the control plane: dependable, reliable, performance-scalable, secure distributed system
  • robustness to failures: leverage strong theory of reliable distributed system for control plane
  • dependability, security: "baked in" from day one?
• networks, protocols meeting mission-specific requirements
  • e.g., real-time, ultra-reliable, ultra-secure
• Internet-scaling
Chapter 4: Outline
✓ Overview of Network Layer
✓ What's Inside a Router?
✓ IP: Internet Protocol
✓ Routing Protocols
✓ Intra-AS Routing in the Internet: OSPF
✓ Routing among ISPs: BGP
✓ SDN: Software Defined Networks
□ ICMP: The Internet Control Message Protocol
□ Network Management and SNMP
ICMP: internet control message protocol
• used by hosts & routers to communicate network-level information
  • error reporting: unreachable host, network, port, protocol
  • echo request/reply (used by ping)
• network-layer "above" IP:
  • ICMP msgs carried in IP datagrams
• ICMP message: type, code plus first 8 bytes of IP datagram causing error

Type  Code  Description
0     0     echo reply (ping)
3     0     dest. network unreachable
3     1     dest host unreachable
3     2     dest protocol unreachable
3     3     dest port unreachable
3     6     dest network unknown
3     7     dest host unknown
4     0     source quench (congestion control - not used)
8     0     echo request (ping)
9     0     route advertisement
10    0     router discovery
11    0     TTL expired
12    0     bad IP header
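To show how the type and code fields sit in an actual message, here is a sketch that builds the bytes of an echo request (type 8, code 0). It only constructs the message and does not send it. The layout used (type, code, checksum, identifier, sequence number, then payload) and the 16-bit ones'-complement checksum are the standard ICMP echo format; the function names are illustrative.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement sum of the data, complemented."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Return an ICMP echo request with a valid checksum."""
    icmp_type, code = 8, 0
    unchecked = struct.pack("!BBHHH", icmp_type, code, 0, identifier, sequence)
    checksum = internet_checksum(unchecked + payload)
    return struct.pack("!BBHHH", icmp_type, code, checksum, identifier, sequence) + payload

message = build_echo_request(identifier=0x1234, sequence=1, payload=b"ping")
print(message.hex())
```

A receiver verifies the message by running the same checksum over all of its bytes; a result of zero means no error was detected.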
Traceroute and ICMP
• source sends series of UDP segments to destination
  • first set has TTL = 1
  • second set has TTL = 2, etc.
  • unlikely port number
• when datagram in nth set arrives at nth router:
  • router discards datagram and sends source ICMP message (type 11, code 0)
  • ICMP message includes name of router & IP address
• when ICMP message arrives, source records RTTs
stopping criteria:
• UDP segment eventually arrives at destination host
• destination returns ICMP "port unreachable" message (type 3, code 3)
• source stops
[Figure: source sends 3 probes for each TTL value]
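The probing loop and its stopping rule can be simulated without touching the network. In this sketch the path is a plain list of hypothetical node names ending in the destination, one probe is sent per TTL instead of three, and no RTTs are measured.

```python
def traceroute(path, max_ttl=30):
    """Simulate TTL-limited probes along path (routers first, destination last)."""
    replies = []
    for ttl in range(1, max_ttl + 1):
        remaining = ttl
        for hop, node in enumerate(path, start=1):
            if hop == len(path):
                # The probe reached the destination host, where no application
                # listens on the unlikely UDP port.
                replies.append((ttl, node, "port unreachable (type 3, code 3)"))
                return replies               # stopping criterion
            remaining -= 1                   # each router decrements the TTL
            if remaining == 0:
                replies.append((ttl, node, "TTL expired (type 11, code 0)"))
                break                        # router discards the datagram
    return replies

replies = traceroute(["r1", "r2", "dest"])
for reply in replies:
    print(reply)
```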
Chapter 4: Outline
✓ Overview of Network Layer
✓ What's Inside a Router?
✓ IP: Internet Protocol
✓ Routing Protocols
✓ Intra-AS Routing in the Internet: OSPF
✓ Routing among ISPs: BGP
✓ SDN: Software Defined Networks
✓ ICMP: The Internet Control Message Protocol
□ Network Management and SNMP
What is network management?
• autonomous systems (aka "network"): 1000s of interacting hardware/software components
• other complex systems requiring monitoring, control:
  • jet airplane
  • nuclear power plant
  • others?
"Network management includes the deployment, integration and coordination of the hardware, software, and human elements to monitor, test, poll, configure, analyze, evaluate, and control the network and element resources to meet the real-time, operational performance, and Quality of Service requirements at a reasonable cost."
Infrastructure for network management
definitions:
• managed devices contain managed objects whose data is gathered into a Management Information Base (MIB)
[Figure: a managing entity (with its data) communicates over the network management protocol with several managed devices, each running an agent that holds data]
SNMP protocol
Two ways to convey MIB info, commands:
• request/response mode: managing entity sends request to agent in managed device; agent returns response
• trap mode: agent in managed device sends trap msg to managing entity
SNMP protocol: message types

Message type | Function
GetRequest, GetNextRequest, GetBulkRequest | manager-to-agent: "get me data" (data instance, next data in list, block of data)
InformRequest | manager-to-manager: here's MIB value
SetRequest | manager-to-agent: set MIB value
Response | agent-to-manager: value, response to Request
Trap | agent-to-manager: inform manager of exceptional event
SNMP protocol: message formats
SNMP PDU, get/set (PDU types 0-3):
• Get/set header: PDU type (0-3) | Request ID | Error Status (0-5) | Error Index
• Variables to get/set: Name | Value | Name | Value | …
SNMP PDU, trap (PDU type 4):
• Trap header: PDU type 4 | Enterprise | Agent Addr | Trap Type (0-7) | Specific code | Timestamp
• Trap info: Name | Value | …
More on network management: see earlier editions of text!
Chapter 4: Outline
✓ Overview of Network Layer
✓ What's Inside a Router?
✓ IP: Internet Protocol
✓ Routing Protocols
✓ Intra-AS Routing in the Internet: OSPF
✓ Routing among ISPs: BGP
✓ SDN: Software Defined Networks
✓ ICMP: The Internet Control Message Protocol
✓ Network Management and SNMP
Chapter 4: Summary
we've learned a lot!
• Data plane and control plane
• Routers
• IP: Internet Protocol
  • datagram format
  • fragmentation
  • IPv4 addressing
  • NAT
  • IPv6
• Routing Protocols:
  • Link State
  • Distance Vector
• traditional routing algorithms
  • implementation in Internet: OSPF, BGP
• SDN: Generalized forwarding & controllers
  • implementation in practice: ODL, ONOS
• Internet Control Message Protocol
• network management
next stop: link layer!