The Stable Paths Problem As A Model Of BGP Routing
NJIT April 24, 2002
Timothy G. Griffin AT&T Research
http://www.research.att.com/~griffin
Outline
Part I: The glue that holds the Internet together : interdomain routing with The Border Gateway Protocol (BGP)
Part II: A formal model of BGP routing policies
Joint work with Bruce Shepherd and Gordon Wilfong (Bell Labs)
Architecture of Dynamic Routing
AS 1
AS 2
BGP
EGP = Exterior Gateway Protocol
IGP = Interior Gateway Protocol
Metric based: OSPF, IS-IS, RIP, EIGRP (cisco)
Policy based: BGP
The Routing Domain of BGP is the entire Internet
OSPF
EIGRP
• Topology information is flooded within the routing domain
• Best end-to-end paths are computed locally at each router.
• Best end-to-end paths determine next-hops.
• Based on minimizing some notion of distance
• Works only if policy is shared and uniform
• Examples: OSPF, IS-IS
• Each router knows little about network topology
• Only best next-hops are chosen by each router for each destination network.
• Best end-to-end paths result from composition of all next-hop choices
• Does not require any notion of distance
• Does not require uniform policies at all routers
• Examples: RIP, BGP
Link State Vectoring
Technology of Distributed Routing
The Gang of Four
Link State Vectoring
EGP
IGP
BGP
RIP
IS-IS
OSPF
Many Routing Processes Can Run on a Single Router
Forwarding Table
OSPF Domain
RIP Domain
BGP
OS kernel
OSPF Process
OSPF Routing tables
RIP Process
RIP Routing tables
BGP Process
BGP Routing tables
Forwarding Table Manager
AS Numbers (ASNs)
ASNs are 16 bit values.
64512 through 65535 are “private”
• Yale: 29
• MIT: 3
• Harvard: 11
• Genuity: 1
• AT&T: 7018, 6341, 5074, …
• UUNET: 701, 702, 284, 12199, …
• Sprint: 1239, 1240, 6211, 6242, …
• …
ASNs represent units of routing policy
Currently over 12,000 in use.
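A minimal sketch of the numbering rules on this slide, assuming the 16-bit ASN space and the private range 64512 through 65535 stated above. The function names are illustrative only.

```python
PRIVATE_ASN_FIRST = 64512  # first "private" ASN, per the slide
PRIVATE_ASN_LAST = 65535   # last "private" ASN, also the largest 16-bit value


def is_valid_asn(asn: int) -> bool:
    """True if asn fits in the 16-bit ASN space described on the slide."""
    return 0 <= asn <= 0xFFFF


def is_private_asn(asn: int) -> bool:
    """True if asn falls in the private range 64512..65535."""
    return PRIVATE_ASN_FIRST <= asn <= PRIVATE_ASN_LAST


print(is_private_asn(7018))   # AT&T's ASN is public
print(is_private_asn(64512))  # first private ASN
```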
Autonomous Routing Domains Don’t Always Need BGP or an ASN
Qwest
Yale University
Nail up default routes 0.0.0.0/0 pointing to Qwest
Nail up routes 130.132.0.0/16 pointing to Yale
130.132.0.0/16
Static routing is the most common way of connecting an autonomous routing domain to the Internet. This helps explain why BGP is a mystery to many …
ASNs Can Be “Shared” (RFC 2270)
AS 701 UUNet
ASN 7046 is assigned to UUNet. It is used by customers single homed to UUNet, but needing BGP for some reason (load balancing, etc.) [RFC 2270]
AS 7046 Crestar Bank
AS 7046 NJIT
AS 7046 Hood College
128.235.0.0/16
How Many ASNs are there?
Thanks to Geoff Huston. http://www.telstra.net/ops on June 23, 2001
AS Graphs Can Be Fun
The subgraph showing all ASes that have more than 100 neighbors in the full graph of 11,158 nodes. July 6, 2001. Point of view: AT&T route-server
BGP Table Growth
Thanks to Geoff Huston. http://www.telstra.net/ops/bgptable.html on August 8, 2001
Nontransit vs. Transit ASes
ISP 1
ISP 2
Nontransit AS might be a corporate or campus network. Could be a “content provider”
NET A
Traffic NEVER flows from ISP 1 through NET A to ISP 2 (At least not intentionally!)
IP traffic
Internet Service providers (often) have transit networks
Selective Transit
NET B
NET C
NET A provides transit between NET B and NET C and between NET D and NET C
NET A
NET D
NET A DOES NOT provide transit between NET D and NET B
Most transit networks transit in a selective manner…
IP traffic
Customers and Providers
Customer pays provider for access to the Internet
provider
customer
IP traffic
provider customer
The Peering Relationship
peer peer
customer provider
Peers provide transit between their respective customers
Peers do not provide transit between peers
Peers (often) do not exchange $$$
traffic allowed
traffic NOT allowed
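The transit rules on the last two slides can be sketched as a single export predicate. This is an illustrative reading, not a vendor policy: the slides state the customer and peer cases, and the sketch assumes (as in the common customer/provider, peer/peer model) that provider-learned routes are likewise passed only to customers. All names are hypothetical.

```python
CUSTOMER, PEER, PROVIDER = "customer", "peer", "provider"


def may_export(learned_from: str, send_to: str) -> bool:
    """True if a route learned from one kind of neighbor may be announced to another.

    Customer routes go to everyone (that is what the customer pays for).
    Routes from peers or providers go only to customers, so an AS never
    provides transit between two peers, or between a peer and a provider.
    """
    return learned_from == CUSTOMER or send_to == CUSTOMER


for src in (CUSTOMER, PEER, PROVIDER):
    for dst in (CUSTOMER, PEER, PROVIDER):
        print(f"{src:>8} -> {dst:<8} {'export' if may_export(src, dst) else 'filter'}")
```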
Peering Provides Shortcuts
Peering also allows connectivity between the customers of “Tier 1” providers.
peer peer
customer provider
BGP-4
• BGP = Border Gateway Protocol
• Is a Policy-Based routing protocol
• Is the de facto EGP of today’s global Internet
• Relatively simple protocol, but configuration is complex and the entire world can see, and be impacted by, your mistakes.
• 1989 : BGP-1 [RFC 1105]
– Replacement for EGP (1984, RFC 904)
• 1990 : BGP-2 [RFC 1163]
• 1991 : BGP-3 [RFC 1267]
• 1995 : BGP-4 [RFC 1771]
– Support for Classless Interdomain Routing (CIDR)
BGP Operations (Simplified)
Establish session on TCP port 179
Exchange all active routes
Exchange incremental updates
AS1
AS2
While connection is ALIVE exchange route UPDATE messages
BGP session
Four Types of BGP Messages
• Open : Establish a peering session.
• Keep Alive : Handshake at regular intervals.
• Notification : Shuts down a peering session.
• Update : Announcing new routes or withdrawing previously announced routes.
announcement = prefix + attributes values
BGP Attributes
Value  Code                     Reference
-----  -----------------------  ---------
1      ORIGIN                   [RFC1771]
2      AS_PATH                  [RFC1771]
3      NEXT_HOP                 [RFC1771]
4      MULTI_EXIT_DISC          [RFC1771]
5      LOCAL_PREF               [RFC1771]
6      ATOMIC_AGGREGATE         [RFC1771]
7      AGGREGATOR               [RFC1771]
8      COMMUNITY                [RFC1997]
9      ORIGINATOR_ID            [RFC2796]
10     CLUSTER_LIST             [RFC2796]
11     DPA                      [Chen]
12     ADVERTISER               [RFC1863]
13     RCID_PATH / CLUSTER_ID   [RFC1863]
14     MP_REACH_NLRI            [RFC2283]
15     MP_UNREACH_NLRI          [RFC2283]
16     EXTENDED COMMUNITIES     [Rosen]
...
255    reserved for development
From IANA: http://www.iana.org/assignments/bgp-parameters
Most important attributes
Not all attributes need to be present in every announcement
Attributes are Used to Select Best Routes
192.0.2.0/24 pick me!
192.0.2.0/24 pick me!
192.0.2.0/24 pick me!
192.0.2.0/24 pick me!
Given multiple routes to the same prefix, a BGP speaker must pick at most one best route
(Note: it could reject them all!)
BGP Route Processing
Best Route Selection
Apply Import Policies
Best Route Table
Apply Export Policies
Install forwarding entries for best routes.
Receive BGP Updates
Best Routes
Transmit BGP Updates
Apply Policy = filter routes & tweak attributes
Based on Attribute Values
IP Forwarding Table
Apply Policy = filter routes & tweak attributes
Open ended programming. Constrained only by vendor configuration language
Route Selection Summary
Highest Local Preference
Shortest ASPATH
Lowest MED
i-BGP < e-BGP
Lowest IGP cost to BGP egress
Lowest router ID
traffic engineering
Enforce relationships
Throw up hands and break ties
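The selection order on this slide can be sketched as a lexicographic sort key. This is a simplified illustration under stated assumptions: the `Route` fields are hypothetical, "i-BGP < e-BGP" is read as "prefer routes learned over e-BGP", and MED is compared across all routes, whereas real implementations normally compare MED only between routes from the same neighboring AS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    local_pref: int         # higher is better
    as_path: tuple          # shorter is better
    med: int                # lower is better
    learned_via_ebgp: bool  # e-BGP preferred over i-BGP
    igp_cost: int           # lower IGP cost to the BGP egress is better
    router_id: int          # final tie-break, lower wins


def preference_key(r: Route):
    """Smaller key means more preferred, following the slide top to bottom."""
    return (
        -r.local_pref,
        len(r.as_path),
        r.med,
        0 if r.learned_via_ebgp else 1,
        r.igp_cost,
        r.router_id,
    )


def best_route(routes):
    """Return the most preferred route, or None if every route was rejected."""
    return min(routes, key=preference_key) if routes else None


long_but_preferred = Route(100, (1239, 7018, 6341), 0, True, 10, 1)
short_but_disliked = Route(90, (6341,), 0, True, 5, 2)
print(best_route([long_but_preferred, short_but_disliked]))
```

Local preference is checked first, so the longer AS path wins here.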
Tweak Tweak Tweak
• For inbound traffic
– Filter outbound routes
– Tweak attributes on outbound routes in the hope of influencing your neighbor’s best route selection
• For outbound traffic
– Filter inbound routes
– Tweak attributes on inbound routes to influence best route selection
outbound routes
inbound routes
inbound traffic
outbound traffic
In general, an AS has more control over outbound traffic
ASPATH Attribute
AS 7018
135.207.0.0/16 AS Path = 6341
AS 1239 Sprint
AS 1755 Ebone
AT&T
AS 3549 Global Crossing
135.207.0.0/16 AS Path = 7018 6341
135.207.0.0/16 AS Path = 3549 7018 6341
AS 6341
135.207.0.0/16
AT&T Research
Prefix Originated
AS 12654 RIPE NCC RIS project
AS 1129 Global Access
135.207.0.0/16 AS Path = 7018 6341
135.207.0.0/16 AS Path = 1239 7018 6341
135.207.0.0/16 AS Path = 1755 1239 7018 6341
135.207.0.0/16 AS Path = 1129 1755 1239 7018 6341
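The way the AS path grows along the chain in this figure can be sketched as follows. The function names are illustrative. The loop check is standard BGP behavior rather than something shown on the slide.

```python
def announce(my_asn, as_path):
    """AS path attached when my_asn passes a route on to a neighboring AS."""
    return (my_asn,) + tuple(as_path)


def accepts(my_asn, as_path):
    """Loop check: ignore any route whose AS path already contains our own ASN."""
    return my_asn not in as_path


# 135.207.0.0/16 originates at AS 6341 and is passed along
# 7018 -> 1239 -> 1755 -> 1129, as in the figure.
path = ()
for asn in (6341, 7018, 1239, 1755, 1129):
    path = announce(asn, path)
    print(asn, "announces AS Path =", " ".join(map(str, path)))

print(accepts(12654, path))  # AS 12654 is not on the path
print(accepts(7018, path))   # AS 7018 would see itself and drop the route
```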
AS Graphs Do Not Show Topology!
The AS graph may look like this. Reality may be closer to this…
BGP was designed to throw away information!
AS Graphs Depend on Point of View
This explains why there is no UUNET (701) Sprint (1239) link on previous slide!
peer peer
customer provider
[Figure: an AS graph on nodes 1 through 6, drawn from several points of view]
In fairness: could you do this “right” and still scale?
Exporting internal state would dramatically increase global instability and amount of routing state
Shorter Doesn’t Always Mean Shorter
AS 4
AS 3
AS 2
AS 1
Mr. BGP says that path 4 1 is better than path 3 2 1
Duh!
Shedding Inbound Traffic with ASPATH Padding Hack
Padding will (usually) force inbound traffic from AS 1 to take primary link
AS 1
192.0.2.0/24 ASPATH = 2 2 2
customer AS 2
provider
192.0.2.0/24
backup primary
192.0.2.0/24 ASPATH = 2
Padding May Not Shut Off All Traffic
AS 1
192.0.2.0/24 ASPATH = 2 2 2 2 2 2 2 2 2 2 2 2 2 2
customer AS 2
provider
192.0.2.0/24
192.0.2.0/24 ASPATH = 2
AS 3 provider
AS 3 will send traffic on “backup” link because it prefers customer routes and local preference is considered before ASPATH length!
Padding in this way is often used as a form of load balancing
backup primary
COMMUNITY Attribute to the Rescue!
AS 1
customer AS 2
provider
192.0.2.0/24
192.0.2.0/24 ASPATH = 2
AS 3 provider
backup primary
192.0.2.0/24 ASPATH = 2 COMMUNITY = 3:70
Customer import policy at AS 3:
If 3:90 in COMMUNITY then set local preference to 90
If 3:80 in COMMUNITY then set local preference to 80
If 3:70 in COMMUNITY then set local preference to 70
AS 3: normal customer local pref is 100, peer local pref is 90
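A minimal sketch of the customer import policy at AS 3, using the community values and local preferences from this slide. The function name and the rule that the first matching community wins are assumptions made for the illustration.

```python
# Community tag -> local preference, as listed on the slide.
COMMUNITY_TO_LOCAL_PREF = {"3:90": 90, "3:80": 80, "3:70": 70}
CUSTOMER_LOCAL_PREF = 100  # normal customer local pref at AS 3
PEER_LOCAL_PREF = 90       # peer local pref at AS 3


def import_customer_route(communities):
    """Local preference AS 3 assigns to a route received from a customer."""
    for tag, pref in COMMUNITY_TO_LOCAL_PREF.items():
        if tag in communities:
            return pref  # assumption: first matching tag wins
    return CUSTOMER_LOCAL_PREF


backup_pref = import_customer_route({"3:70"})
print(backup_pref)                    # local pref on the tagged backup route
print(backup_pref < PEER_LOCAL_PREF)  # the backup now loses to the peer route
```

Because the tagged backup route ends up below the peer local preference, AS 3 stops preferring the backup link, which padding alone could not achieve.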
Hot Potato Routing: Go for the Closest Egress Point
192.44.78.0/24
15 56 IGP distances
egress 1 egress 2
This Router has two BGP routes to 192.44.78.0/24.
Hot potato: get traffic off of your network as soon as possible. Go for egress 1!
Getting Burned by the Hot Potato
15 56
172865
High bandwidth provider backbone
Low bandwidth customer backbone
Heavy Content Web Farm
Many customers want their provider to carry the bits!
tiny http request
huge http reply
SFF NYC
San Diego
Cold Potato Routing with MEDs (Multi-Exit Discriminator Attribute)
15 56
172865
Heavy Content Web Farm
192.44.78.0/24
192.44.78.0/24 MED = 15
192.44.78.0/24 MED = 56
This means that MEDs must be considered BEFORE IGP distance!
Prefer lower MED values
Note1 : some providers will not listen to MEDs
Note2 : MEDs need not be tied to IGP distance
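The difference between hot potato and cold potato egress selection can be sketched as two small functions. The MED values 15 and 56 come from the slide. The provider's own IGP distances (900 and 10) and the egress names are hypothetical, chosen only so that the two strategies disagree.

```python
def hot_potato(igp_distance):
    """Pick the egress with the lowest IGP distance inside our own network."""
    return min(igp_distance, key=igp_distance.get)


def cold_potato(med, igp_distance):
    """Pick the egress with the lowest MED; IGP distance only breaks ties."""
    return min(med, key=lambda egress: (med[egress], igp_distance[egress]))


# Earlier slide: one router, two egresses at IGP distance 15 and 56.
print(hot_potato({"egress 1": 15, "egress 2": 56}))

# Cold potato: the customer announces MED = 15 and MED = 56.
med = {"egress 1": 15, "egress 2": 56}
provider_igp = {"egress 1": 900, "egress 2": 10}  # hypothetical distances
print(cold_potato(med, provider_igp))  # MED decides
print(hot_potato(provider_igp))        # IGP distance alone would decide otherwise
```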
Policies Can Interact Strangely (“Route Pinning” Example)
backup
Disaster strikes primary link and the backup takes over
Primary link is restored but some traffic remains pinned to backup
1 2
3 4
Install backup link using community
customer
News at 11:00h
• BGP is not guaranteed to converge on a stable routing. Policy interactions could lead to “livelock” protocol oscillations. See “Persistent Route Oscillations in Inter-domain Routing” by K. Varadhan, R. Govindan, and D. Estrin. ISI report, 1996
• Corollary: BGP is not guaranteed to recover from network failures.
PART II
What Problem is BGP solving?
Underlying problem
Shortest Paths
Distributed means of computing a solution.
X?
RIP, OSPF, IS-IS
BGP
X could
• aid in the design of policy analysis algorithms and heuristics
• aid in the analysis and design of BGP and extensions
• help explain some BGP routing anomalies
• provide a fun way of thinking about the protocol
Can we model BGP?
Separate dynamic and static semantics
SPVP = Simple Path Vector Protocol = a distributed algorithm for solving SPP
BGP
SPVP
Booo Hooo, Many, many complications...
BGP Policies
Stable Paths Problem (SPP)
static semantics
dynamic semantics
1
An instance of the Stable Paths Problem (SPP)
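[Figure: graph with origin 0 and nodes 1 through 5; permitted paths at each node]
node 1: (1 3 0), (1 0)
node 2: (2 1 0), (2 0)
node 3: (3 0)
node 4: (4 2 0), (4 3 0)
node 5: (5 2 1 0)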
• A graph of nodes and edges,
• Node 0, called the origin,
• For each non-zero node, a set of permitted paths to the origin. This set always contains the “null path”.
• A ranking of permitted paths at each node. Null path is always least preferred. (Not shown in diagram)
When modeling BGP : nodes represent BGP speaking routers, and 0 represents a node originating some address block
most preferred…least preferred
Yes, the translation gets messy!
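A Solution to a Stable Paths Problem
[Figure: the same instance, nodes 1 through 5 with origin 0; permitted paths, most preferred first]
node 1: (1 3 0), (1 0)
node 2: (2 1 0), (2 0)
node 3: (3 0)
node 4: (4 2 0), (4 3 0)
node 5: (5 2 1 0)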
A solution is an assignment of permitted paths to each node such that
• node u’s assigned path is either the null path or is a path uwP, where wP is assigned to node w and {u,w} is an edge in the graph,
• each node is assigned the highest ranked path among those consistent with the paths assigned to its neighbors.
A Solution need not represent a shortest path tree, or a spanning tree.
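The two conditions above can be checked mechanically. The sketch below is one possible encoding, not the authors' code: paths are tuples ending in 0, the null path is the empty tuple, each node's permitted paths are listed most preferred first, and the edge set is assumed to be implied by the permitted paths. The instance is the example from the earlier slide restricted to nodes 1 through 4.

```python
def is_solution(permitted, assignment):
    """True if every node holds its best permitted path consistent with its neighbors."""
    full = {0: (0,), **assignment}  # the origin always "has" the trivial path (0)
    for u, ranked in permitted.items():
        # A path u w P is consistent when w P is exactly the path assigned to w.
        consistent = [p for p in ranked if full.get(p[1]) == p[1:]]
        best = consistent[0] if consistent else ()  # () is the null path
        if assignment[u] != best:
            return False
    return True


PERMITTED = {
    1: [(1, 3, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
    3: [(3, 0)],
    4: [(4, 2, 0), (4, 3, 0)],
}

stable = {1: (1, 3, 0), 2: (2, 0), 3: (3, 0), 4: (4, 2, 0)}
unstable = {1: (1, 0), 2: (2, 1, 0), 3: (3, 0), 4: (4, 3, 0)}
print(is_solution(PERMITTED, stable))
print(is_solution(PERMITTED, unstable))  # node 1 would rather switch to (1 3 0)
```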
An SPP may have multiple solutions
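DISAGREE
[Figure: nodes 1 and 2 attached to origin 0, shown three times: the instance, the first solution, and the second solution]
Permitted paths, most preferred first:
node 1: (1 2 0), (1 0)
node 2: (2 1 0), (2 0)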
Multiple solutions can result in “Route Triggering”
[Figure: nodes 1, 2, and 3 with origin 0, shown three times]
Permitted paths, most preferred first:
node 1: (1 0), (1 2 3 0)
node 2: (2 3 0), (2 1 0)
node 3: (3 2 1 0), (3 0)
primary link
backup link
Remove primary link
Restore primary link
BAD GADGET : No Solution
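[Figure: nodes 1, 2, and 3 around origin 0]
Permitted paths, most preferred first:
node 1: (1 3 0), (1 0)
node 2: (2 1 0), (2 0)
node 3: (3 2 0), (3 0)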
Persistent Route Oscillations in Inter-Domain Routing. Kannan Varadhan, Ramesh Govindan, and Deborah Estrin. Computer Networks, Jan. 2000
SURPRISE : Beware of Backup Policies
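[Figure: nodes 1, 2, 3, and 4 with origin 0]
Permitted paths, most preferred first:
node 1: (1 3 0), (1 0)
node 2: (2 1 0), (2 0)
node 3: (3 4 2 0), (3 0)
node 4: (4 0), (4 2 0), (4 3 0)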
Becomes a BAD GADGET if link (4, 0) goes down.
BGP is not robust : it is not guaranteed to recover from network failures.
PRECARIOUS
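[Figure: nodes 1 through 6 with origin 0]
Permitted paths, most preferred first:
node 1: (1 2 0), (1 0)
node 2: (2 1 0), (2 0)
node 3: (3 1 0), (3 1 2 0)
node 4: (4 3 1 0), (4 5 3 1 2 0), (4 3 1 2 0)
node 5: (5 3 1 0), (5 6 3 1 2 0), (5 3 1 2 0)
node 6: (6 3 1 0), (6 4 3 1 2 0), (6 3 1 2 0)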
As with DISAGREE, this part has two distinct solutions
This part has a solution only when node 1 is assigned the direct path (1 0).
Has a solution, but can get “trapped”
Solving an SPP
Just enumerate all path assignments and check stability of each…
Exponential complexity
But, in worst case you (probably) can’t do any better…
Use 3-SAT…
Variables V = {X1, X2, …, Xn}
Clauses C1 = X17 or ~X23 or ~X3, C2 = ~X2 or X3 or ~X12, …, Cm = X6 or ~X7 or X18
Question: Is there a variable assignment A : V → {true, false} such that each clause C1, …, Cm is true?
3-SAT is NP-complete
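The brute-force approach named on this slide (enumerate every path assignment, check each for stability) can be sketched directly. The encoding is an assumption made for the illustration: paths are tuples ending in 0, the null path is the empty tuple, and permitted paths are listed most preferred first. The two instances are DISAGREE and BAD GADGET from the earlier slides.

```python
from itertools import product


def is_solution(permitted, assignment):
    """True if each node holds its best permitted path consistent with its neighbors."""
    full = {0: (0,), **assignment}
    for u, ranked in permitted.items():
        consistent = [p for p in ranked if full.get(p[1]) == p[1:]]
        best = consistent[0] if consistent else ()
        if assignment[u] != best:
            return False
    return True


def solve_spp(permitted):
    """Enumerate all assignments (exponential in the number of nodes)."""
    nodes = sorted(permitted)
    options = [permitted[u] + [()] for u in nodes]  # every node may also take the null path
    solutions = []
    for choice in product(*options):
        assignment = dict(zip(nodes, choice))
        if is_solution(permitted, assignment):
            solutions.append(assignment)
    return solutions


DISAGREE = {1: [(1, 2, 0), (1, 0)], 2: [(2, 1, 0), (2, 0)]}
BAD_GADGET = {
    1: [(1, 3, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
    3: [(3, 2, 0), (3, 0)],
}

print(solve_spp(DISAGREE))    # two stable assignments
print(solve_spp(BAD_GADGET))  # none
```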
Modeling assignment to variable X
[Figure: a DISAGREE-style gadget on a pair of nodes for variable X, attached to origin 0; its two solutions are labeled X = true and X = false]
SPP Solvability is NP-complete
BAD GADGET
0
X5 X5 X7 X7 X3 X3
C X7 or X5 or X3
C X7 0
C X5 0
C X3 0
SPVP protocol
process spvp[u] {
  receive P from w {
    rib-in(u ⇐ w) := u P
    if rib(u) != best(u) {
      rib(u) := best(u)
      foreach v in peers(u) {
        send rib(u) to v
      }
    }
  }
}
Pick the best path available at any given time…
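A small simulation of the SPVP pseudocode, offered as a sketch under several assumptions that the slide does not specify: messages are delivered from one global FIFO queue, the origin starts by sending its trivial path to each neighbor, peers are derived from the permitted paths, and the run is abandoned after `max_messages` deliveries. Paths are tuples ending in 0 and the null path is the empty tuple.

```python
from collections import deque


def spvp(permitted, max_messages=1000):
    """Run SPVP; return the final rib, or None if it has not settled in time."""
    # Assumption: u and v are peers if they are adjacent in some permitted path.
    peers = {}
    for paths in permitted.values():
        for p in paths:
            for a, b in zip(p, p[1:]):
                peers.setdefault(a, set()).add(b)
                peers.setdefault(b, set()).add(a)

    rib = {u: () for u in permitted}      # path currently chosen by u
    rib_in = {u: {} for u in permitted}   # rib_in[u][w]: latest path learned from w
    queue = deque((0, v, (0,)) for v in sorted(peers.get(0, ())))
    delivered = 0
    while queue:
        if delivered == max_messages:
            return None
        w, u, path = queue.popleft()      # u receives P from w
        delivered += 1
        rib_in[u][w] = (u,) + path if path else ()
        available = set(rib_in[u].values())
        # best(u): highest ranked permitted path currently on offer, else null
        best = next((p for p in permitted[u] if p in available), ())
        if rib[u] != best:
            rib[u] = best
            for v in sorted(peers[u]):
                if v != 0:                # the origin never changes its path
                    queue.append((u, v, best))
    return rib


GOOD = {
    1: [(1, 3, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
    3: [(3, 0)],
    4: [(4, 2, 0), (4, 3, 0)],
}
BAD_GADGET = {
    1: [(1, 3, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
    3: [(3, 2, 0), (3, 0)],
}
print(spvp(GOOD))
print(spvp(BAD_GADGET, max_messages=500))  # None: the run is cut off
```

If the queue ever drains, every node holds its best path given its neighbors' current choices, so the result is a solution of the SPP. BAD GADGET has no solution, so the simulation can only stop by hitting the message limit.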
SPVP wanders around assignment space
= assignment = solution
Distributed algorithms to solve SPP?
• OSPF-like:
– Distribute topology, path ranks
– Solve SPP locally
– Exponential worst case
– How can loops be avoided when multiple solutions exist?
• RIP-like:
– Pick the best path from the set of your neighbor’s paths, tell your neighbors when you change your mind
– Can diverge
– Not guaranteed to find a solution, even when one exists
– Even when it converges, no bound on convergence time
This is BGP…
A sufficient condition for sanity
Static (SPP)
solvable
Dynamic (SPVP)
unique solution
safe (can’t diverge)
predictable restoration
If an instance of SPP has an acyclic dispute digraph, then
all sub-problems uniquely solvable
robust with respect to link/node failures
Dispute Digraph
[Figure: nodes u, v, and 0, with paths P and Q from v to 0]
Ranking at u: …(u v)P…(u v)Q...
Ranking at v: …Q…P...
Gives the dispute arc Q → (u v)P
Dispute Digraph (cont.)
[Figure: nodes u, v, and 0, with path P from v to 0]
Ranking at u: …(u,v)P...
Ranking at v: …P...
Gives the transmission arc P → (u,v)P
Dispute Digraph Example
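[Figure: SPP instance with nodes 1 through 4 and origin 0]
Permitted paths, most preferred first:
node 1: (1 3 0), (1 0)
node 2: (2 1 0), (2 0)
node 3: (3 4 2 0), (3 0)
node 4: (4 2 0), (4 3 0)
[Figure: dispute digraph over the paths (3 4 2 0), (2 1 0), (2 0), (1 0), (3 0), (4 3 0), (1 3 0), (4 2 0)]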
BAD GADGET II
CYCLE
What is to be done?
Static Approach
Inter-AS coordination
Automated Analysis of Routing Policies
Dynamic Approach
Extend BGP with a dynamic means of detecting and suppressing policy-based oscillations?
These approaches are complementary
Some Applications of SPP Theory
• A Safe Path Vector Protocol. Timothy G. Griffin, Gordon Wilfong. INFOCOM 2001
– Dynamic solution for SPVP based on histories (dynamically constructed dispute cycles).
• Inherently safe backup routing with BGP. Lixin Gao, Timothy G. Griffin, Jennifer Rexford. INFOCOM 2001
– Show that if customer/provider peer/peer model is followed, then all is well,
– Show that this can be extended with complex backup policies and remain safe.
• Analysis of “cold potato” routing problems (MED oscillation). Griffin and Wilfong. Work in progress
– MED requires a modification to SPP model
• Analysis of Internal BGP (IBGP) configuration. Griffin and Wilfong. Work in progress.
A Few Research Topics
• Dynamic Behavior of BGP
– Convergence time, message overhead
• BGP Security
– S-BGP defined, but not deployed. Is it a good solution?
– Need an “interdomain trust model”
• Beyond BGP?
– When will it break? What will replace it?
Selected Papers on BGP Sanity
• Persistent Route Oscillations in Inter-Domain Routing. Kannan Varadhan, Ramesh Govindan, and Deborah Estrin. Computer Networks, Jan. 2000. (Also USC Tech Report, Feb. 1996)
– Shows that BGP is not guaranteed to converge
• An Architecture for Stable, Analyzable Internet Routing. Ramesh Govindan, Cengiz Alaettinoglu, George Eddy, David Kessens, Satish Kumar, and WeeSan Lee. IEEE Network Magazine, Jan-Feb 1999.
– Use RPSL to specify policies. Store them in registries. Use registry for configuration generation and analysis.
• An Analysis of BGP Convergence Properties. Timothy G. Griffin, Gordon Wilfong. SIGCOMM 1999
– Model BGP, shows static analysis of divergence in policies is NP complete
• Policy Disputes in Path Vector Protocols. Timothy G. Griffin, F. Bruce Shepherd, Gordon Wilfong. ICNP 1999
– Define Stable Paths Problem and develop sufficient condition for “sanity”
• A Safe Path Vector Protocol. Timothy G. Griffin, Gordon Wilfong. INFOCOM 2001
– Dynamic solution for SPVP based on histories
• Stable Internet Routing without Global Coordination. Lixin Gao, Jennifer Rexford. SIGMETRICS 2000
– Show that if certain guidelines are followed, then all is well.
• Inherently safe backup routing with BGP. Lixin Gao, Timothy G. Griffin, Jennifer Rexford. INFOCOM 2001
– Use SPP to study complex backup policies
Pointers
• Links on Interdomain routing and BGP:
• http://www.research.att.com/~griffin/interdomain.html
• SIGCOMM 2001 Tutorial on BGP:
• http://www.research.att.com/~griffin/sigcomm2001_bgp_tutorial/abstract.html
• Papers on BGP theory:
• http://www.research.att.com/~griffin/bgpresearch.html
Thank You!