NIRA: A NEW INTER-DOMAIN ROUTING ARCHITECTURE
Xiaowei Yang, David Clark and Arthur Berger
Presented by Sachin Kadloor
Tuesday, September 28, 2010
BGP ROUTING
• Link weights represent costs
• Default routes picked by BGP
• A→B, A→B→C, A→B→C→D
• Need to tune attributes in order to use the A-C and B-D links
•Manual tuning can lead to instabilities
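A minimal sketch of why the default routes above come out as they do, if the lowest-cost path is always preferred. The assignment of weights to links (A-B 10, B-C 20, C-D 20, A-C 50, B-D 50) is an assumed reading of the figure, and the helper names are hypothetical.

```python
# ASSUMPTION: this weight-to-link assignment is one reading of the slide's figure.
WEIGHTS = {
    ("A", "B"): 10,
    ("B", "C"): 20,
    ("C", "D"): 20,
    ("A", "C"): 50,
    ("B", "D"): 50,
}


def link_cost(u, v):
    """Cost of the undirected link between u and v."""
    return WEIGHTS[(u, v)] if (u, v) in WEIGHTS else WEIGHTS[(v, u)]


def path_cost(path):
    """Sum of link weights along a path written as a string such as 'ABC'."""
    return sum(link_cost(u, v) for u, v in zip(path, path[1:]))


def simple_paths(src, dst, visited=()):
    """Enumerate every loop-free path from src to dst."""
    visited = visited + (src,)
    if src == dst:
        yield "".join(visited)
        return
    for a, b in WEIGHTS:
        for u, v in ((a, b), (b, a)):
            if u == src and v not in visited:
                yield from simple_paths(v, dst, visited)


def cheapest(src, dst):
    """Lowest-cost loop-free path from src to dst."""
    return min(simple_paths(src, dst), key=path_cost)


for dst in "BCD":
    best = cheapest("A", dst)
    print(dst, best, path_cost(best))
```

Under these assumed weights the direct A-C and B-D links are never the cheapest choice, which is why they stay idle unless attributes are tuned.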
[Figure: four-node topology with nodes A, B, C, D and link weights 10, 20, 20, 50, 50]
OUTLINE OF THE TALK
• Routing
• challenges and existing solutions (OSPF-TE, TeXCP)
•NIRA: New Internet Routing Architecture
• design philosophy
• a new addressing scheme
• route discovery
• forwarding
TRAFFIC ENGINEERING (TE)
•Mathematical framework
• Given: directed graph G = (V, E)
• vector C = [C_e], e in E, of edge capacities
• demand matrix D = [D(s,t)], for s, t in V
• design L = [l_e], the load along each edge e, to satisfy the demand
• Optimization: minimize the maximum utilization, max over e of l_e / C_e
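A small sketch of the objective in this framework: given capacities C_e, demands D(s,t) and a candidate routing, compute the loads l_e and the maximum utilization. The three-link topology, the demand value and the function name are hypothetical, chosen only to show that splitting a demand over two paths lowers the maximum.

```python
def max_utilization(capacity, demand, routing):
    """Return (max over e of l_e / C_e, per-edge loads l_e).

    capacity: {edge: C_e}
    demand:   {(s, t): D(s, t)}
    routing:  {(s, t): [(fraction, [edge, ...]), ...]}, fractions summing to 1
    """
    load = {edge: 0.0 for edge in capacity}
    for pair, splits in routing.items():
        for fraction, path in splits:
            for edge in path:
                load[edge] += fraction * demand[pair]
    worst = max(load[edge] / capacity[edge] for edge in capacity)
    return worst, load


# Hypothetical example: one demand, a direct link and a two-hop detour.
capacity = {("s", "t"): 10.0, ("s", "m"): 10.0, ("m", "t"): 10.0}
demand = {("s", "t"): 8.0}

single_path = {("s", "t"): [(1.0, [("s", "t")])]}
two_paths = {("s", "t"): [(0.5, [("s", "t")]),
                          (0.5, [("s", "m"), ("m", "t")])]}

print(max_utilization(capacity, demand, single_path)[0])
print(max_utilization(capacity, demand, two_paths)[0])
```

A TE scheme searches over routings such as these for the one with the smallest maximum utilization; the sketch only evaluates candidates and does not perform that search.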
OSPF WEIGHT OPTIMIZER (OSPF-TE)
• Offline (static) TE
• Uses long term average data to tune weights
• Avoids the risk of instability caused by real-time fluctuations
• Pre-computes alternative routings for a limited set of failures
• Must pick a routing that works reasonably well under many potential failures, so it is unlikely to be optimal for any particular one
• Can lead to over-provisioning of the network
TRAFFIC ENGINEERING WITH XCP (TEXCP)
•Online (dynamic) TE
• Recall: XCP gives explicit feedback
• Efficiency and fairness controller
• TeXCP: routers give explicit feedback on utilization
• load balancer
• per-path XCP controller
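A deliberately simplified sketch of the load-balancing idea: an ingress router shifts its traffic split away from paths whose utilization is above the traffic-weighted mean. This is not TeXCP's actual control law; the update rule, step size, capacities and function names are all assumptions made for illustration.

```python
def utilizations(split, demand, capacity):
    """Utilization of each path when `demand` is divided according to `split`."""
    return [x * demand / c for x, c in zip(split, capacity)]


def rebalance(split, util, step=0.1):
    """Move traffic share away from paths above the traffic-weighted mean utilization."""
    mean = sum(x * u for x, u in zip(split, util))
    moved = [max(0.0, x - step * x * (u - mean)) for x, u in zip(split, util)]
    total = sum(moved)
    return [x / total for x in moved]  # renormalize so the shares sum to 1


def balance(demand, capacity, rounds=300):
    """Start from an even split and rebalance for a fixed number of rounds."""
    split = [1.0 / len(capacity)] * len(capacity)
    for _ in range(rounds):
        split = rebalance(split, utilizations(split, demand, capacity))
    return split


# Hypothetical ingress-egress pair with two paths of capacity 10 and 30.
split = balance(demand=20.0, capacity=[10.0, 30.0])
print(split, utilizations(split, 20.0, [10.0, 30.0]))
```

With a small step the split moves gradually toward the point where both paths are equally utilized, which hints at the trade-off between responsiveness and stability that an online scheme has to manage.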
[Figure: four-node topology with nodes A, B, C, D and link weights 10, 20, 20, 30, 30]
NIRA
Let users decide how they want to route
NIRA: DESIGN PHILOSOPHY
• Users pick ISPs, but cannot control their packets’ routes
• currently, each domain makes a local decision on what the next hop will be (according to BGP)
• Alternative: users choose a path with better QoS
• Stimulate competition among providers
•Different applications have different needs
• e.g. gaming, peer-to-peer downloads; could also support multi-path routing
CHALLENGES
• How does a user discover routes?
• How do you represent routes?
• How should the providers be compensated?
NIRA: OVERVIEW
•Organize the routers into hierarchy
• Use hierarchical addressing
• Nodes in tier-1 get allocated a chunk of address space
• They allocate a fraction of it to their children
• Children get an address from each parent
• B: b; C: c; A: b.1 and c.1; D: b.2 and c.2
• A source and destination address pair uniquely specifies the route
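A minimal sketch of the addressing idea, assuming B and C are the tier-1 nodes (prefixes b and c), that A and D are each a customer of both, and that B and C are directly connected. The function names and the dotted address format are illustrative, not NIRA's actual encoding.

```python
# ASSUMED topology: tier-2 nodes A and D are each customers of tier-1 nodes B and C.
PROVIDERS = {"A": ["B", "C"], "D": ["B", "C"]}
ROOT_PREFIX = {"B": "b", "C": "c"}


def allocate(providers, root_prefix):
    """Give each child one address per provider: '<provider prefix>.<index>'."""
    addresses = {node: [prefix] for node, prefix in root_prefix.items()}
    next_index = {node: 1 for node in root_prefix}
    for child, parents in providers.items():
        addresses[child] = []
        for parent in parents:
            addresses[child].append(f"{root_prefix[parent]}.{next_index[parent]}")
            next_index[parent] += 1
    return addresses


def route(src_addr, dst_addr, owner):
    """Expand a (source, destination) address pair into a domain-level route.

    Assumes both endpoints are tier-2 nodes and the tier-1 nodes are adjacent.
    """
    src_root = src_addr.split(".")[0]
    dst_root = dst_addr.split(".")[0]
    uphill = [owner[src_addr], owner[src_root]]      # source up to its provider
    downhill = [owner[dst_root], owner[dst_addr]]    # provider down to destination
    if uphill[-1] == downhill[0]:
        return uphill + downhill[1:]
    return uphill + downhill


addresses = allocate(PROVIDERS, ROOT_PREFIX)
owner = {addr: node for node, addrs in addresses.items() for addr in addrs}
print(addresses)
print(route("b.1", "c.2", owner))
print(route("b.1", "b.2", owner))
```

Switching the source address from b.1 to c.1, or the destination from c.2 to b.2, selects a different provider on that side, which is how a user changes routes without any per-packet path list.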
[Figure: topology with nodes A, B, C, D, link weights 10, 20, 20, 50, 50, and tier labels Tier 1 and Tier 2]
NETWORK MODEL AND ADDRESSING
[Figure, from the NIRA paper (Proceedings of the ACM SIGCOMM 2003 Workshops, August 2003): example network of domains AS 10 to AS 600 around a core, connected by provider-customer and peer links. Domains are labelled with an allocated prefix (AllocPf, e.g. ae80::/16, ae80:1::/32, ae80:1:1::/48) and an internal address block (InterAddr, e.g. ae80:1:1::/96); the end hosts Alice and Bob carry addresses such as ae80:1:1::ec and ae80:2:2:2::6c1a.]
• with hierarchical addressing, an address represents a route to the core
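A short sketch of how one address encodes a route to the core, using prefixes that appear in the figure. Which domain owns which prefix is not recoverable here, so the list below is only an assumed set of allocations; the function name is hypothetical.

```python
import ipaddress

# Prefixes taken from the figure; their owners are not modelled (ASSUMPTION).
ALLOCATED = [
    ipaddress.ip_network(p)
    for p in ("ae80::/16", "ae80:1::/32", "ae80:1:1::/48",
              "ae80:2::/32", "ae80:2:2::/48", "ae80:2:2:2::/64")
]


def covering_prefixes(address, prefixes):
    """Prefixes containing `address`, ordered from most to least specific."""
    ip = ipaddress.ip_address(address)
    matches = [p for p in prefixes if ip in p]
    return sorted(matches, key=lambda p: p.prefixlen, reverse=True)


for prefix in covering_prefixes("ae80:1:1::ec", ALLOCATED):
    print(prefix)
```

Reading the result from most specific to least specific walks the chain of allocations upward, that is, from the host's own domain through its providers toward the core.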
ROUTE DISCOVERY AND REPRESENTATION
• TIPP (Topology Information Propagation Protocol) used to discover routes
• information propagated only along hierarchy (not globally)
• With the new addressing scheme, a source address and a destination address together represent a route efficiently
• switch routes by switching address
•Note: route specified by source and destination address
NAME-TO-ROUTE MAPPING
• Bootstrapping a communication is based on a directory lookup
• Each user registers his name, addresses and preferences with an NRLS (Name-to-Route Lookup Service)
• Needs to be updated when the domain-level topology changes
PACKET FORWARDING
• Each router maintains 3 or 4 forwarding tables
• uphill table: addresses the domain received from its providers
• downhill table: the domain's own addresses and its customers' addresses
• bridge table: neighbors with which the domain has a peering relationship
• BGP table: if the router also participates in BGP routing
FORWARDING ALGORITHM
• A router uses longest prefix match to look up the destination address in its downhill table
• if match is found, destination address used to route the packet
• If no match is found, use the uphill table, and the source address to forward the packet towards the core
• if already in the core, or if a peering link supports forwarding, use bridge table to forward the packet
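A sketch of the lookup order described above: downhill table on the destination, then uphill table on the source, then bridge table on the destination. The table contents, port names and the simple fall-through to the bridge table are assumptions; the condition under which a peering link supports forwarding is not modelled.

```python
import ipaddress


def longest_prefix_match(table, address):
    """Return the next hop of the most specific prefix containing address, or None."""
    ip = ipaddress.ip_address(address)
    best = None
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if ip in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, next_hop)
    return None if best is None else best[1]


def forward(tables, src, dst):
    """Pick (table used, next hop) for one packet at one router."""
    hop = longest_prefix_match(tables["downhill"], dst)
    if hop is not None:
        return "downhill", hop          # destination is below this router
    hop = longest_prefix_match(tables["uphill"], src)
    if hop is not None:
        return "uphill", hop            # climb toward the core using the source address
    hop = longest_prefix_match(tables["bridge"], dst)
    if hop is not None:
        return "bridge", hop            # no provider above: cross to a neighbor
    return "drop", None


# Hypothetical routers: one in a customer domain, one in the core.
EDGE = {"downhill": {"ae80:1:1::/48": "customer-link"},
        "uphill": {"ae80:1::/32": "provider-link"},
        "bridge": {}}
CORE = {"downhill": {"ae80:1::/32": "customer-link"},
        "uphill": {},
        "bridge": {"ae80:2::/32": "core-link"}}

print(forward(EDGE, "ae80:1:1::ec", "ae80:2:2:2::6c1a"))
print(forward(CORE, "ae80:1:1::ec", "ae80:2:2:2::6c1a"))
```

A core router has no providers, so its uphill table is empty here and the packet falls through to the bridge table.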
CONCLUSION
•NIRA allows users to pick routes
• uphill routing based on source address, and downhill routing based on destination address
•Other benefits of NIRA
• scalable: TIPP information propagates only along hierarchy
• by contrast, the memory requirement of BGP scales linearly with the number of IP prefixes advertised
• security: limits source spoofing
THANK YOU.
BACK-UP SLIDES
ROUTING
• Design goals:
• optimality
• low overhead
• robustness
• fast convergence
• flexibility
• NIRA: route selection by user
ROUTE DISCOVERY
• TIPP (Topology Information Propagation Protocol)
• a path-vector component to inform a user of the domain-level routes
• a policy based link state component to inform a user of the dynamic network conditions
ROUTING
[Figure: hierarchy of core networks, ISPs, and dial-up access]
• Inter-domain routing: BGP, NIRA
• Intra-domain routing (IGP): RIP, OSPF, TeXCP
TIPP
• Link-state messages are propagated only down the hierarchy
• ensures scalability
• Topology updated based on message heard from neighbors
• Inconsistencies resolved by believing the neighbor that is on the shortest failure-free path to the link that triggered update
DISTANCE-VECTOR ROUTING
•Distance Vector
• each node knows the distance (cost) from itself to its neighbors
• each node advertises a vector containing distances from itself to all other nodes in the network (initialize unknown distance with infinity)
• Uses the distributed Bellman-Ford algorithm to find shortest paths
• Routing Information Protocol (RIP)
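A minimal sketch of the distance-vector exchange as synchronous rounds of the Bellman-Ford update: each node improves its vector using the vectors its neighbors advertised in the previous round. The three-node topology and the function name are hypothetical, and real protocols such as RIP add timers, split horizon and a finite "infinity" that are not shown.

```python
INF = float("inf")


def distance_vector(nodes, links):
    """Run synchronous distance-vector rounds until no vector changes."""
    neighbors = {n: {} for n in nodes}
    for (u, v), cost in links.items():
        neighbors[u][v] = cost
        neighbors[v][u] = cost

    # Each node starts knowing only itself; unknown distances are infinity.
    dist = {n: {m: (0 if m == n else INF) for m in nodes} for n in nodes}

    changed = True
    while changed:
        changed = False
        advertised = {n: dict(vec) for n, vec in dist.items()}  # vectors sent this round
        for n in nodes:
            for nbr, cost in neighbors[n].items():
                for dest in nodes:
                    candidate = cost + advertised[nbr][dest]
                    if candidate < dist[n][dest]:
                        dist[n][dest] = candidate
                        changed = True
    return dist


# Hypothetical topology: the two-hop path X-Y-Z (cost 3) beats the direct link (cost 5).
table = distance_vector(["X", "Y", "Z"],
                        {("X", "Y"): 1, ("Y", "Z"): 2, ("X", "Z"): 5})
print(table["X"])
```

No node ever sees the whole topology; each only combines its link costs with its neighbors' advertised distances.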
LINK-STATE ROUTING
• Each node learns the complete topology of the network
• Creates shortest paths to all destinations with itself as a root node
• Requires more overhead than distance-vector routing, but is more robust and scalable
•Open Shortest Path First (OSPF)
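A minimal sketch of the link-state computation: once a node holds the complete topology, it runs a shortest-path-first computation (Dijkstra's algorithm) rooted at itself. The four-router topology and its costs are hypothetical.

```python
import heapq


def shortest_path_tree(graph, root):
    """Dijkstra's algorithm: return (distance, parent) maps for the tree rooted at `root`."""
    dist = {root: 0}
    parent = {root: None}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry; a shorter path to u was already found
        for v, weight in graph[u].items():
            candidate = d + weight
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                parent[v] = u
                heapq.heappush(heap, (candidate, v))
    return dist, parent


# Hypothetical topology known to every router after link-state flooding.
GRAPH = {
    "r1": {"r2": 1, "r3": 4},
    "r2": {"r1": 1, "r3": 2, "r4": 7},
    "r3": {"r1": 4, "r2": 2, "r4": 3},
    "r4": {"r2": 7, "r3": 3},
}

dist, parent = shortest_path_tree(GRAPH, "r1")
print(dist)
print(parent)
```

Following the parent pointers from any destination back to the root gives the path, and the first hop on that path is what goes into the forwarding table.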
Walking the Tightrope: Responsive Yet Stable Traffic Engineering
Srikanth Kandula (MIT CSAIL), Dina Katabi (MIT CSAIL), Bruce Davie (Cisco Systems), Anna Charny (Cisco Systems)
ABSTRACT
Current intra-domain Traffic Engineering (TE) relies on offline methods, which use long term average traffic demands. It cannot react to realtime traffic changes caused by BGP reroutes, diurnal traffic variations, attacks, or flash crowds. Further, current TE deals with network failures by pre-computing alternative routings for a limited set of failures. It may fail to prevent congestion when unanticipated or combination failures occur, even though the network has enough capacity to handle the failure.
This paper presents TeXCP, an online distributed TE protocol that balances load in realtime, responding to actual traffic demands and failures. TeXCP uses multiple paths to deliver demands from an ingress to an egress router, adaptively moving traffic from over-utilized to under-utilized paths. These adaptations are carefully designed such that, though done independently by each edge router based on local information, they balance load in the whole network without oscillations. We model TeXCP, prove the stability of the model, and show that it is easy to implement. Our extensive simulations show that, for the same traffic demands, a network using TeXCP supports the same utilization and failure resilience as a network that uses traditional offline TE, but with half or third the capacity.
Categories and Subject Descriptors
C.2.2 [Computer Communication Networks]: Network Protocols; C.2.3 [Computer Communication Networks]: Network Operations - Network Management
General Terms
Algorithms, Design, Management, Reliability, Performance.
Keywords
TeXCP, Traffic Engineering, Responsive, Online, Distributed, Stable.
1. INTRODUCTION
Intra-domain Traffic Engineering (TE) is an essential part of modern ISP operations. The TE problem is typically formalized as minimizing the maximum utilization in the network [5, 6, 15, 26].
SIGCOMM’05, August 21–26, 2005, Philadelphia, Pennsylvania, USA.
[Figure: map with ingress A and egress B and the cities Boston, New York, Seattle, San Diego]
Figure 1: For each Ingress-Egress (IE) pair, there is a TeXCP agent at the ingress router, which balances the IE traffic across available paths in an online, distributed fashion.
This allows the ISP to balance the load and avoid hot spots and failures, which increases reliability and improves performance. Furthermore, ISPs upgrade their infrastructure when the maximum link utilization exceeds a particular threshold (about 40% utilization [20]). By maintaining lower network utilization for the same traffic demands, traffic engineering allows the ISP to make do with existing infrastructure for a longer time, which reduces cost.
Recent years have witnessed significant advancements in traffic engineering methods, from both the research and operational communities [6, 12, 15, 40]. TE methods like the OSPF weight optimizer (OSPF-TE) [15, 16] and the MPLS multi-commodity flow optimizer [26] have shown significant reduction in maximum utilization over pure shortest path routing. Nonetheless, because of its offline nature, current TE has the following intrinsic limitations:
• It might create a suboptimal or even inadequate load distribution for the realtime traffic. This is because offline TE attempts to balance load given the long term traffic demands averaged over multiple days (potentially months). But the actual traffic may differ from the long term demands due to BGP re-routes, external or internal failures, diurnal variations, flash crowds, or attacks.
• Its reaction to failures is suboptimal. Offline TE deals with network failures by pre-computing alternative routings for a limited set of failures [16]. Since the operator cannot predict which failure will occur, offline TE must find a routing that works reasonably well under a large number of potential failures. Such a routing is unlikely to be optimal for any particular failure. As a result, current TE may fail to prevent congestion when unanticipated or combination failures occur, even though the network may have enough capacity to handle the failure.
The natural next step is to use online traffic engineering, which reacts to realtime traffic demands and failures. Currently, online TE research is still in its infancy. Indeed it is challenging to build a distributed scheme that responds quickly to changes in traffic, yet does not lead to oscillations, as demonstrated by the instability of the early ARPAnet routing [23]. Prior online TE methods are either centralized [9, 10] or assume an oracle that provides global knowledge of the network [12], and most lack a stability analysis [34, 39]. There is a need for an online TE protocol that combines practical
TRAFFIC ENGINEERING WITH XCP (TEXCP)
• Problem: routing within domain
• load balancing? reaction to failures?
• Solution: make routing adaptive
• Like XCP: Introduce load balancer and a feedback controller
• Formulate the routing problem as a min-max utilization optimization problem