Shivkumar KalyanaramanRensselaer Polytechnic Institute 1
BANANAS: An Evolutionary Framework for Explicit and Multipath Routing in the Internet
Hema T. Kaur, S. Kalyanaraman, A. Weiss, S. Kanwar, A. Gandhi
Rensselaer Polytechnic Institute
[email protected], [email protected]
http://www.ecse.rpi.edu/Homepages/shivkuma
[Figure: example network topology with nodes A, B, C, D, E, F and weighted links]
Sponsors: DARPA-NMS, NSF, Intel, AT&T
In case you were wondering…
BANANAS is not an acronym
Just something to remember this by…
Outline
Motivation: Best-effort path multiplicity
BANANAS: Data-plane
BANANAS: Control-plane
BANANAS: Mapping to BGP
Performance Results
Motivation: Best-Effort Path Multiplicity
[Figure: a multi-homed host with several interfaces (Ethernet, WiFi 802.11b, 802.11a, USB/802.11a/b, Firewire/802.11a/b, phone modem) reaching the Internet through AS1 and ISP-1 … ISP-n]
Multiplicity: Flows, Paths, Exits/Interfaces
Multi-paths within a routing domain
Multi-AS-paths across domains (w/o specifying intra-domain paths)
Multiple exits from an AS (w/o specifying intra-domain paths)
Multiple E2E Flows
Multi-homed interfaces & ISP/AS peers
BANANAS: A Single Abstract Framework To Exploit These Forms of Multiplicity In the Internet and Future Networks
Isn’t multi-path routing an old subject?
Lots of old work:
  Multi-path algorithms/protocols [5, 6, 7, 8]
  Internet signaling architectures [9, 10, 11, 12, 13]
  Novel overlay routing methods [14, 15]
  Transport-level approaches for multi-homed hosts [16, 17]
But newer goals:
  Traffic engineering (reliability, availability, survivability, re-route):
    Separation of traffic trunking from route selection
    Packet level or aggregate (micro-flow or macro-flow level)
  Security
  Best-effort e2e service composition…
Why hasn’t it happened after 2 decades of work?
Missing Architectural Concepts
An evolutionary partial deployment strategy:
  Allows partial deployment, incremental upgrades
  Incentives: increasing value with increasing deployment
  Fits the connectionless nature of dominant routing protocols
  Must not require signaling (unlike ATM, MPLS etc.)
Abstract enough to be applicable to other scenarios:
  Overlay networks, last-mile multi-hop (mesh or community) wireless networks, ad-hoc/sensor networks…
Flexible enough to have other realizations/semantics:
  Different placement of functions (edge vs core)
  Tunneling/Label Stacking
  Geographic/Trajectory routing
Limits of Connectionless Traffic Engg (OSPF/BGP)
[Figure: three views of a five-node topology (nodes A, B, C, D, E, weighted links). Captions: “Cannot do this with SP routing!”; “Links AB and BD are overloaded”; “Links AC and CD are overloaded”]
State-of-the-art: parameter tweaking
  OSPF, IS-IS: link weight tweaking
  BGP-4: parameter (LOCAL_PREF, MED) tweaking
Performance ultimately limited by the single path
The Questions
Can we do multi-path & explicit routing?
  without signaling (i.e. in a connectionless context)
  without variable (and large) per-packet overhead
  being backward compatible with OSPF & BGP
  allowing incremental network upgrades
Non-Goals: Monitoring, Traffic trunking/mapping
[Figure: Traffic Engineering (TE) Spectrum, with shortest-path routing at one end, signaled TE (MPLS…) at the other, and BANANAS-TE in between]
Outline
Motivation: Best-effort path multiplicity
BANANAS: Data-plane
BANANAS: Control-plane
BANANAS: Mapping to BGP
Performance Results
Big Picture: How does it fit?
Multi-paths within a routing domain
Detour: What can we learn from ATM and MPLS ?
MPLS label = path identifier at each hop
A label is a local identifier…
Signaling maps global identifiers (addresses, path specifications) to local identifiers
[Figure: MPLS label swapping on a four-city topology (San Francisco ingress, New York egress, Seattle, Miami); packets with an [IP | Label] header are shown carrying labels 1321, 120 and 0]
Global Path Identifiers?
Instead of using local path identifiers (labels in MPLS), consider the use of “global” path identifiers
  Constructed out of global variables a node already knows!
  E.g.: link/router IP addrs, link weights, ASNs, Area IDs, GPS location
Avoid the need for signaling to establish a mapping!
[Figure: the four-city topology (San Francisco ingress, New York egress, Seattle, Miami) with link weights; packets with an [IP | PathId] header are shown carrying PathId values 36, 27 and 0]
Global Path Identifier: Key Ideas
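[Figure: path from node i through nodes 1, 2, …, k, …, m-1 to node j over links with weights w1, w2, …, wm; the packet header is [IP | PathId(i,j)] at node i and [IP | PathId(1,j)] at node 1]
Key ideas (take-home!):
1. Global pathids (computed from global variables) instead of local labels!
2. Avoid inefficient path encoding (IP) AND
3. Avoid signaling (MPLS)
4. Incrementally deployable: w/ control-plane modifications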
Global Path Identifier (continued)
Path = {i, w1, 1, w2, 2, …, wk, k, wk+1, … , wm, j}
  Sequence of globally known node IDs & link weights
Global PathID: hash of this sequence: computable w/o signaling!
  Canonical method: MD5 hashing of the subsequence of nodeIDs followed by a CRC-32 to get a 32-bit hash value (MD5+CRC)
  Low collision (i.e. non-uniqueness) probability
Note: Different PathID encodings have different architectural implications
[Figure: path i, 1, 2, …, k, …, m-1, j with link weights w1, w2, …, wm; the portion from k to j is marked “Path suffix”]
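The MD5+CRC idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide does not specify how node IDs are serialized before hashing, so the comma-joined UTF-8 encoding and the function name `path_id` are hypothetical choices, not the BANANAS wire format.

```python
import hashlib
import zlib

def path_id(node_ids):
    """Return a 32-bit PathID for a sequence of node IDs.

    Sketch of the MD5+CRC idea: MD5 over the node-ID sequence,
    then CRC-32 over the MD5 digest to fold it down to 32 bits.
    """
    # Assumption: node IDs are serialized as a comma-joined UTF-8 string.
    encoded = ",".join(str(n) for n in node_ids).encode("utf-8")
    digest = hashlib.md5(encoded).digest()
    return zlib.crc32(digest) & 0xFFFFFFFF

# Any node that knows the same node-ID sequence computes the same PathID,
# so no signaling is needed to agree on it.
full = path_id(["i", "1", "2", "j"])
suffix = path_id(["1", "2", "j"])   # PathID of the path suffix seen at node 1
```

Because the inputs are globally known values, the ingress and every upgraded node on the path can derive `full` and `suffix` independently.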
Abstract Forwarding Paradigm
Forwarding table (e.g. at node k):
  [Destination Prefix, PathID] -> [Next-Hop, SuffixPathID]
  [j, H{k, k+1, … , m-1}] -> [k+1, H{k+1, … , m-1}]
[Figure: path i, 1, 2, …, k, …, m-1, j with link weights w1, w2, …, wm; the portion from k to j is marked “Path suffix”]
Packet header:
  Arriving at node k: [j, H{k, k+1, … , m-1}]
  Leaving node k: [j, H{k+1, … , m-1}]
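The lookup-and-rewrite step at node k can be sketched as follows. The names `Header`, `Entry` and `forward` are hypothetical, the PathID values are placeholder integers rather than real hashes, and raising `KeyError` on a miss is an assumption (the slide does not say how a miss is handled).

```python
from typing import Dict, NamedTuple, Tuple

class Header(NamedTuple):
    dest: str       # destination (prefix) j
    path_id: int    # PathID carried in the packet

class Entry(NamedTuple):
    next_hop: str
    suffix_path_id: int

Table = Dict[Tuple[str, int], Entry]

def forward(table: Table, header: Header) -> Tuple[str, Header]:
    """Look up (dest, PathID); return the next hop and the rewritten header."""
    entry = table[(header.dest, header.path_id)]  # KeyError on a miss (assumed)
    # The PathID is replaced by the PathID of the path suffix.
    return entry.next_hop, Header(header.dest, entry.suffix_path_id)

# Table at node k with placeholder values standing in for
# H{k, ..., m-1} = 111 and H{k+1, ..., m-1} = 222.
table_k: Table = {("j", 111): Entry(next_hop="k+1", suffix_path_id=222)}
next_hop, out_header = forward(table_k, Header("j", 111))
```

The per-packet work is one exact-match lookup and one field rewrite, which is what keeps the header fixed-size.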
BANANAS TE: Partial Deployment
Only red nodes are upgraded
  “Virtual hop” between upgraded nodes
  Black nodes compute single-shortest-path
[Figure: the four-city topology (San Francisco ingress, New York egress, Seattle, Miami) with link weights and only some nodes upgraded; packets are shown carrying PathId values 27 and 0, and one path is marked X]
Outline
Motivation: Best-effort path multiplicity
BANANAS: Data-plane
BANANAS: Control-plane
BANANAS: Mapping to BGP
Performance Results
Baseline: Route Computation Strategy
1-bit in LSA: node is “multi-path capable” (MPC)
Two-phase algorithm (m upgraded nodes):
1. (N-m) Dijkstra’s for non-upgraded nodes
2. DFS to discover valid paths to destinations
Computes all valid paths under partial-upgrade (PU) constraints
Problem: inflexible and complex!
Route Computation: Flexibility
E.g.: k shortest-paths instead of DFS (lower complexity)
Issue: Forwarding for k-shortest paths may not exist
  Need to validate the forwarding availability for paths!
Idea: A path is valid only if its path suffixes are valid.
  2-phase validation algorithm provided in BANANAS
[Figure: example network topology with nodes A, B, C, D, E, F and weighted links]
Implementation: OSPF LSA Extensions
Architectural Flexibility: Placement of Functions
Architecture = placement of functions
BANANAS functions:
  Data-plane = hash processing
  Control-plane = route computation
Goal: Move functions from the core to the edges
Recall: Different PathID encodings have different architectural implications
Link indices:
  PathID = concatenation of link indexes
  PathID processing: bit shifting!
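The link-index encoding can be illustrated with a small sketch. The field width (4 bits per hop), the 32-bit PathID size, the first-hop-in-the-low-bits layout and the use of index 0 as "no more hops" are all assumptions made for this example; the slide only says that the PathID is a concatenation of link indexes processed by bit shifting.

```python
BITS_PER_HOP = 4                      # assumed field width per link index
MASK = (1 << BITS_PER_HOP) - 1
MAX_HOPS = 32 // BITS_PER_HOP         # assumed 32-bit PathID

def encode(link_indices):
    """Concatenate per-hop outgoing link indexes into one integer PathID."""
    if len(link_indices) > MAX_HOPS:
        raise ValueError("path too long for a 32-bit PathID")
    path_id = 0
    for hop, index in enumerate(link_indices):
        if not 1 <= index <= MASK:    # 0 is reserved to mean "no more hops"
            raise ValueError("link index out of range")
        path_id |= index << (hop * BITS_PER_HOP)
    return path_id

def process(path_id):
    """At a node: read the outgoing link index, then shift it off."""
    return path_id & MASK, path_id >> BITS_PER_HOP

pid = encode([3, 1, 2])               # edge node builds the PathID
hops = []
while pid:
    link, pid = process(pid)          # each interior node does only this
    hops.append(link)
```

With this encoding an interior node needs no hash table and no route computation, only a mask and a shift, which is the sense in which functions move from the core to the edges.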
Outline
Motivation: Best-effort path multiplicity
BANANAS: Data-plane
BANANAS: Control-plane
BANANAS: Mapping to BGP
Performance Results
Big Picture: How does it Fit
Multi-AS-paths across domains (w/o specifying intra-domain paths)
Multiple exits from an AS (w/o specifying intra-domain paths)
Explicit-Exit Routing: Concept
[Figure: AS1, AS2, AS3 and AS4 (containing destination d), with border routers ABR1, ABR2, ASBR1, ASBR2, ASBR3 and ASBR4]
Upgraded IBGP & EBGP nodes synchronize on a set of exits for prefixes
IBGP locally installs explicit exit(s) for chosen prefix
Packet tunneled to explicitly chosen exit (like MPLS stacking)
BGP Explicit-Exit Routing: Details
IBGP table:
  [Dest-Prefix, Exit-ASBR, Next-Hop]
  [Dest-Prefix, Default-Next-Hop]
When a packet matches the explicit route (policy definable):
Push destination address
Replace with Exit-ASBR address.
Exit-ASBR pops destination address
1-level label-stacking (a.k.a connectionless tunneling)
Note: address stacking/tunneling is a different realization of the BANANAS hashing concept
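The push/replace/pop sequence above can be sketched as follows. This is a toy model, not a BGP implementation: the `Packet` class, the function names and all addresses (taken from documentation ranges) are hypothetical, and the explicit-route policy is reduced to a plain prefix match.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

@dataclass
class Packet:
    dst: str
    stack: List[str] = field(default_factory=list)  # 1-level address stack

# Explicit-exit entries: Dest-Prefix -> (Exit-ASBR, Next-Hop). Values are made up.
ExplicitTable = Dict[str, Tuple[str, str]]

def ingress(pkt: Packet, table: ExplicitTable) -> Optional[str]:
    """On a match: push the destination, replace it with the Exit-ASBR address."""
    for prefix, (exit_asbr, next_hop) in table.items():
        if ipaddress.ip_address(pkt.dst) in ipaddress.ip_network(prefix):
            pkt.stack.append(pkt.dst)
            pkt.dst = exit_asbr
            return next_hop
    return None  # no explicit route: fall back to the default next hop

def exit_asbr_receive(pkt: Packet, my_addr: str) -> None:
    """At the Exit-ASBR: pop the original destination back into the header."""
    if pkt.dst == my_addr and pkt.stack:
        pkt.dst = pkt.stack.pop()

table: ExplicitTable = {"198.51.100.0/24": ("192.0.2.2", "192.0.2.254")}
pkt = Packet(dst="198.51.100.7")
hop = ingress(pkt, table)
tunneled_dst = pkt.dst                 # routers inside the AS see only this
exit_asbr_receive(pkt, "192.0.2.2")
```

Between the ingress and the exit, non-upgraded routers forward on the Exit-ASBR address with ordinary destination-based lookup, so only the two ends of the tunnel need to be upgraded.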
Inter-AS Explicit AS-Path Choice
[Figure: AS0, AS1, AS2, AS3 and AS4 (containing destination d), with border routers ASBR1, ASBR2 and ASBR3]
Caveat: this requires more coordination across ISPs and control traffic (control-plane penalty)!
Outline
Motivation: Best-effort path multiplicity
BANANAS: Data-plane
BANANAS: Control-plane
BANANAS: Mapping to BGP
Performance Results
Simulation/Implementation/Testing Platforms
Utah’s Emulab Testbed: experiments with Linux/Zebra/Click implementation
MIT’s Click Modular Router on Linux: forwarding plane
SSFnet simulation for OSPF/BGP dynamics
Putting It Together: Integrated OSPF/BGP Simulation
Blow-up of AS2’s Internal Topology
E-PathID Processing
FORWARDING Table in AS2 (node#5)
Corresponding Changes in Packet Headers
Summary
Goals:
  Best-effort path multiplicity: MPLS-like features in OSPF, IS-IS and BGP
  Overlay routing (PlanetLab deployment)
Non-Goals: Performance monitoring, traffic trunking & mapping to paths
BANANAS Framework:
  Data-plane: Hash = Global PathID => NO SIGNALING
  Control-plane: route computation algos (partial upgrade constraints)
  Architectural flexibility, incrementally deployable
[Figure: Traffic Engineering spectrum, with shortest-path routing at one end, signaled TE (MPLS…) at the other, and BANANAS-TE in between]
Multiplicity: Take-Home Message…
Multi-paths within a routing domain
Multi-AS-paths across domains (w/o specifying intra-domain paths)
Multiple exits from an AS (w/o specifying intra-domain paths)
Multiple E2E Flows
Multi-homed interfaces & ISP/AS peers
EXTRA SLIDES
Acknowledgements
Biplab Sikdar (faculty colleague)
Mehul Doshi (MS)
Niharika Mateti (MS)
Also thanks to:
  Satish Raghunath (PhD)
  Jayasri Akella (PhD)
  Hemang Nagar (MS)
Work funded in part by DARPA-ITO, NMS Program. Contract number: F30602-00-2-0537, Intel, AT&T
Multiple Areas
PathID re-initialized after crossing area boundaries
Source-routing notion similar to, but weaker than, PNNI
Red nodes: upgraded
Green nodes: regular
[Figure: multi-area OSPF topology spanning Area 1, Area 0 and Area 2, with routers A, B, C, D, G, H, I, J, area border routers ABR1 to ABR5, and weighted links]
Why is the Index-based Encoding Interesting?
Ans: Architectural flexibility
Core (interior) nodes:
  Forwarding function simplified
  Minimal state (only the index table)
  No control-plane computation complexity at interior nodes
Edge nodes:
  Path validation simplified
  Edge nodes can store an arbitrary subset of validated paths
  Heterogeneous route computation algorithms can be used
Path Multiplicity
Internet routing protocols were designed for “best-effort” reachability, which has implicitly meant a “single end-to-end path”
Why can’t the concept of “best-effort” allow path multiplicity?
Internet topology (at the level of hosts, routers, ASes) is not a tree: multi-homing and multi-path availability
Path multiplicity is available in several contexts:
  layer 3 (e.g. OSPF, BGP, ad-hoc network routing), or
  layer 4-7 (e.g. overlay networks, peer-to-peer networks)
Path multiplicity offers the potential for spatio-temporal statistical multiplexing gains
  Packet switching offered temporal stat-muxing gains over circuit switching
  Gains may vary in different contexts (e.g. ad-hoc networks where capacity is shared network wide)
Zebra/Click Implementation on Linux (Tested on Utah Emulab)
Part of table at node1: (PathID= Link Weights, for simplicity)
[Figure: test topology with numbered nodes and weighted links]
Destination   PathID   NextHop   SuffixPathID
4             260      2         177 (= 260 - 83)
4             98       3         0 (= 98 - 98)
4             51       4         0 (= 51 - 51)
4             160      5         0 (= 160 - 160)
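The SuffixPathID column above is plain subtraction, which a short sketch can reproduce. The interpretation is an assumption: PathID is taken to be the sum of link weights along the path, and the amount subtracted at node 1 for each next hop (83, 98, 51, 160) is read off the parenthetical arithmetic in the table rather than from the topology figure.

```python
# Weight consumed by the (possibly virtual) hop taken at node 1, per next hop.
# Values are read from the table's own arithmetic, e.g. "260 - 83".
HOP_WEIGHT_AT_NODE1 = {2: 83, 3: 98, 4: 51, 5: 160}

def suffix_path_id(path_id: int, next_hop: int) -> int:
    """With PathID = sum of link weights, the suffix is what remains after this hop."""
    return path_id - HOP_WEIGHT_AT_NODE1[next_hop]

# (destination, PathID, next hop) rows from the table at node 1.
rows = [(4, 260, 2), (4, 98, 3), (4, 51, 4), (4, 160, 5)]
suffixes = [suffix_path_id(pid, nh) for _dest, pid, nh in rows]
```

A suffix of 0 means no explicit path remains to be followed after the hop, which matches the three rows whose PathID equals the subtracted weight.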
Linux/Zebra/Emulab Results
[Figure: test topology with active nodes B, C and D marked]

Active node   Avg. # of paths to each dest   Avg. # of paths / k * 100
B (k=3)       2.94                           98%
D (k=3)       2.94                           98%
C (k=3)       2.79                           93%

Active node   Avg. # of paths to each dest   Avg. # of paths / k * 100
B (k=7)       6.5                            93%
D (k=5)       4.78                           96%
C (k=5)       4.44                           89%

Active node   Avg. # of paths to each dest   Avg. # of paths / k * 100
B (k=5)       4.83                           97%
D (k=5)       4.78                           96%
C (k=5)       4.44                           89%

Flat OSPF area, 3 active MPC nodes; up to k shortest, validated paths
Path Validation Algorithm
Concept: A path is VALID only if its path suffixes are valid.
Phase 1 (cont’d):
  Compute {k-shortest} paths for all other upgraded nodes, and 1-shortest paths for non-upgraded nodes.
  Sort computed paths by hopcount
Phase 2: Validate paths starting from hopcount = 1.
  All 1-hop paths valid.
  p-hop paths valid if the (p-1)-hop path suffix is valid
  Throw out invalid paths as they are found
Polynomial complexity to discover valid paths
  Proof by mathematical induction
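Phase 2 can be sketched as below, assuming Phase 1 has already produced candidate paths per node. The data layout (a dict from node to a list of node tuples), the function name and the four-node example are hypothetical; the rule itself (1-hop paths are valid, a p-hop path is valid only if its (p-1)-hop suffix is valid) is the one stated above.

```python
def validate_paths(candidates):
    """Keep only paths whose suffix is itself a valid path of the next node.

    candidates: dict mapping node -> list of paths, each path a tuple of
    nodes that starts at that node and ends at the destination.
    """
    # Sort all computed paths by hop count so suffixes are checked first.
    all_paths = sorted({p for paths in candidates.values() for p in paths}, key=len)
    valid = set()
    for path in all_paths:
        hops = len(path) - 1
        # 1-hop paths are valid; longer ones need a valid (p-1)-hop suffix.
        if hops == 1 or path[1:] in valid:
            valid.add(path)
        # Otherwise the path is thrown out (never added to `valid`).
    return {node: [p for p in paths if p in valid]
            for node, paths in candidates.items()}

# Hypothetical example: A is upgraded (k = 2); B and C are not,
# so each of them holds only its single shortest path to D.
candidates = {
    "A": [("A", "B", "D"), ("A", "C", "B", "D")],
    "B": [("B", "D")],
    "C": [("C", "D")],
}
valid = validate_paths(candidates)
```

Here A's second candidate is rejected because C would forward along (C, D), not (C, B, D), so the forwarding state for that suffix does not exist. Each path is examined once with a constant-time set lookup, consistent with the polynomial-complexity claim.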
BANANAS TE: Explicit, Multi-Path Forwarding
Explicit source-directed routing:
  Not limited by the shortest path nature of IGP
  Different PathIds => different next-hops (multi-paths)
  No signaling required to set up the paths
Traffic mapping is decoupled from route discovery
[Figure: the four-city topology (San Francisco ingress, New York egress, Seattle, Miami) with link weights; packets with an [IP | PathId] header are shown carrying PathId values 36, 27, 5 and 0 along different paths]