Date post: 27-Mar-2015

Path Splicing

Nick Feamster, Georgia Tech

Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore

Internet Availability

Stanford University Clean-Slate Design for the Internet:

“It is not difficult to create a list of desired characteristics for a new Internet. Deciding how to design and deploy a network that achieves these goals is much harder. … It should be:

1. Robust and available. The network should be as robust, fault-tolerant and available as the wire-line telephone network is today.

2. …”

OK for email and the Web, but what about:

• E911 service
• Air traffic control
• …

Work to do…

• Various studies (Paxson, Andersen, etc.) show the Internet is at about 2.5 “nines”
• More “critical” (or at least availability-centric) applications on the Internet
• At the same time, the Internet is getting more difficult to debug
– Scale, complexity, disconnection, etc.
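
To make the 2.5 “nines” figure concrete, here is a small arithmetic sketch. It assumes the common reading in which n nines means an availability of 1 − 10^(−n); the slide does not say which convention the cited studies use, and the function names are illustrative only.

```python
def availability_from_nines(nines: float) -> float:
    """Availability as a fraction, assuming n nines means 1 - 10**(-n)."""
    return 1.0 - 10.0 ** (-nines)


def downtime_hours_per_year(nines: float, hours_per_year: float = 365 * 24) -> float:
    """Expected unavailable hours in a (non-leap) year at the given number of nines."""
    return (1.0 - availability_from_nines(nines)) * hours_per_year


# Under the assumed convention, 2.5 nines is roughly 99.68% availability,
# i.e. a bit over a day of downtime per year.
print(f"{availability_from_nines(2.5):.2%}")
print(f"{downtime_hours_per_year(2.5):.1f} hours/year")
```

For comparison, the same functions give about 0.9 hours per year at four nines, which is closer to what availability-centric applications tend to expect.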

Threats to Availability

• Natural disasters
• Physical failures (node, link)
• Router software bugs
• Misconfiguration
• Mis-coordination
• Denial-of-service (DoS) attacks
• Changes in traffic patterns (e.g., flash crowd)
• …

Idea: Backup/Multipath

• For intradomain routing
– IP and MPLS fast re-route
– Packet deflections [Yang 2006]
– ECMP, NotVia, Loop-Free Alternates [Cisco]

• For interdomain routing
– MIRO [Rexford 2006]

• Problem
– Scale: Protecting against arbitrary failures requires storing lots of state, exchanging lots of messages
– Control: End systems can’t signal when they think a path has “failed”

Backup Paths: Promise and Problems

• Bad: If a link fails on each of the two paths, s is disconnected from t

• Want: End systems remain connected unless the failed links form a cut in the underlying graph

[Figure: example graph with source s and destination t]

Path Splicing: Main Idea

• Step 1 (Generate slices): Run multiple instances of the routing protocol, each with slightly perturbed versions of the configuration

• Step 2 (Splice end-to-end paths): Allow traffic to switch between instances at any node in the protocol

[Figure: example graph with source s and destination t]

Compute multiple forwarding trees per destination. Allow packets to switch slices midstream.

Outline

• Path Splicing for Intradomain Routing
– Generating slices
– Constructing paths
– Forwarding
– Recovery

• Evaluation
– Reliability and recovery
– Stretch
– Effects on traffic

• Path Splicing for Interdomain Routing
• Ongoing: Prototype and Deployment Paths

Generating Slices

• Goal: Each instance provides different paths
• Mechanism: Each edge is given a weight that is a slightly perturbed version of the original weight
– Two schemes: Uniform and degree-based

[Figure: “Base” graph between s and t with link weights of 3, next to a perturbed graph with link weights such as 3.5, 4, 5, 1.5, 1.5, and 1.25]

How to Perturb the Link Weights?

• Uniform: Perturbation is a function of the initial weight of the link

• Degree-based: Perturbation is a linear function of the degrees of the incident nodes
– Intuition: Deflect traffic away from nodes where traffic might tend to pass through by default
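
The two perturbation schemes can be sketched as follows. This is a minimal illustration, not the authors' exact formula: the slide only says that the perturbation depends on the initial weight (uniform) or is a linear function of the endpoint degrees (degree-based). The choice of adding `Random(0, scale * w)`, the parameters `scale`, `a`, and `b`, and the example topology are all assumptions.

```python
import random
from typing import Dict, Tuple

Edge = Tuple[str, str]


def node_degrees(weights: Dict[Edge, float]) -> Dict[str, int]:
    """Count how many links touch each node."""
    deg: Dict[str, int] = {}
    for u, v in weights:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def perturb_uniform(weights: Dict[Edge, float], scale: float,
                    rng: random.Random) -> Dict[Edge, float]:
    """Assumed uniform scheme: add Random(0, scale * w) to every link weight w."""
    return {e: w + rng.uniform(0.0, scale * w) for e, w in weights.items()}


def perturb_degree_based(weights: Dict[Edge, float], a: float, b: float,
                         rng: random.Random) -> Dict[Edge, float]:
    """Assumed degree-based scheme: the scale grows linearly from a to b
    with the summed degree of the link's two endpoints."""
    deg = node_degrees(weights)
    sums = {e: deg[e[0]] + deg[e[1]] for e in weights}
    lo, hi = min(sums.values()), max(sums.values())
    out: Dict[Edge, float] = {}
    for e, w in weights.items():
        frac = (sums[e] - lo) / (hi - lo) if hi > lo else 0.0
        out[e] = w + rng.uniform(0.0, (a + (b - a) * frac) * w)
    return out


# Hypothetical base graph with every link weight set to 3.
base = {("s", "a"): 3.0, ("a", "t"): 3.0, ("s", "b"): 3.0,
        ("b", "t"): 3.0, ("a", "b"): 3.0}
# One perturbed copy of the weights per slice; each slice would then run
# its own shortest-path computation.
slices = [perturb_degree_based(base, 0.5, 3.0, random.Random(seed))
          for seed in range(3)]
```

Links between high-degree nodes receive the largest possible perturbation here, which is one way to realize the stated intuition of steering some slices away from nodes that attract traffic by default.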

Constructing Paths

• Goal: Allow multiple instances to co-exist
• Mechanism: Virtual forwarding tables

[Figure: example topology with nodes s, a, b, c, and t, and one virtual forwarding table per slice]

Slice      dst   next-hop
Slice 1    t     a
Slice 2    t     c
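
As a rough sketch of how per-slice virtual forwarding tables could be built, the code below runs Dijkstra once per destination on each slice's weighted graph and stores one next-hop table per slice at every node. The topology, the weights, and every function name are hypothetical; they are chosen so that node b reaches t via a in the first slice and via c in the second, as in the table above.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, Dict[str, float]]  # node -> {neighbor: link weight}


def make_graph(edges: List[Tuple[str, str, float]]) -> Graph:
    """Build an undirected weighted graph from (u, v, weight) triples."""
    g: Graph = {}
    for u, v, w in edges:
        g.setdefault(u, {})[v] = w
        g.setdefault(v, {})[u] = w
    return g


def next_hops_toward(graph: Graph, dst: str) -> Dict[str, str]:
    """Dijkstra rooted at dst; returns each node's next hop toward dst."""
    dist = {dst: 0.0}
    nxt: Dict[str, str] = {}
    heap = [(0.0, dst)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                nxt[v] = u  # v reaches dst through u
                heapq.heappush(heap, (d + w, v))
    return nxt


def build_tables(slices: List[Graph]) -> Dict[str, List[Dict[str, str]]]:
    """tables[node][slice_index][dst] = next hop in that slice."""
    nodes = list(slices[0])
    tables = {n: [{} for _ in slices] for n in nodes}
    for i, g in enumerate(slices):
        for dst in nodes:
            for node, hop in next_hops_toward(g, dst).items():
                tables[node][i][dst] = hop
    return tables


# Two hypothetical slices of the same topology with different perturbed weights.
slice1 = make_graph([("s", "b", 1), ("b", "a", 1), ("a", "t", 1), ("b", "c", 2), ("c", "t", 2)])
slice2 = make_graph([("s", "b", 1), ("b", "a", 2), ("a", "t", 2), ("b", "c", 1), ("c", "t", 1)])
tables = build_tables([slice1, slice2])
print(tables["b"][0]["t"], tables["b"][1]["t"])
```

Storage grows by one table per slice at each node, which is the linear state cost the talk refers to later.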

Forwarding Traffic

• Packet has shim header with forwarding bits

• Routers use lg(k) bits to index forwarding tables
– Shift bits after inspection

• To access different (or multiple) paths, end systems simply change the forwarding bits
– Incremental deployment is trivial
– Persistent loops cannot occur

• Various optimizations are possible

Forwarding: Putting It Together

• End system sets forwarding bits in packet header
– Forwarding bits specify slice to be used at any hop

• Router examines/shifts bits, and forwards

[Figure: example graph with source s and destination t]
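
The examine-and-shift step can be sketched as below. The bit layout is an assumption (first hop in the lowest-order bits, slice 0 used once the bits run out), as are the function names and the small example network; the slides only state that routers use lg(k) bits to index the tables and shift them after inspection.

```python
from typing import Dict, List, Tuple


def bits_per_hop(k: int) -> int:
    """Header bits needed to name one of k slices: lg(k), rounded up."""
    return max(1, (k - 1).bit_length())


def encode(slice_ids: List[int], k: int) -> int:
    """Pack per-hop slice choices into an integer, first hop in the lowest bits."""
    b = bits_per_hop(k)
    header = 0
    for i, s in enumerate(slice_ids):
        header |= s << (i * b)
    return header


def forward_one_hop(header: int, k: int, tables: List[Dict[str, str]],
                    dst: str) -> Tuple[str, int]:
    """Read the low bits to pick a slice, shift them out, return (next hop, new header)."""
    b = bits_per_hop(k)
    # The modulo keeps the index valid when k is not a power of two.
    slice_id = (header & ((1 << b) - 1)) % k
    return tables[slice_id][dst], header >> b


def deliver(src: str, dst: str, header: int, k: int,
            tables: Dict[str, List[Dict[str, str]]], max_hops: int = 16) -> List[str]:
    """Follow the forwarding bits hop by hop and return the nodes visited."""
    path, node = [src], src
    while node != dst and len(path) <= max_hops:
        node, header = forward_one_hop(header, k, tables[node], dst)
        path.append(node)
    return path


# Hypothetical network with k = 2 slices; tables[node][slice][dst] = next hop.
tables = {
    "s": [{"t": "a"}, {"t": "c"}],
    "a": [{"t": "t"}, {"t": "c"}],
    "c": [{"t": "t"}, {"t": "t"}],
}
print(deliver("s", "t", encode([0, 0], 2), 2, tables))
print(deliver("s", "t", encode([0, 1], 2), 2, tables))
```

Because each hop consumes bits and an exhausted header falls back to a single slice in this sketch, a packet cannot keep switching slices forever, which matches the slide's point about persistent loops.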

Recovery Mechanisms

• End-system recovery
– Switch slices at every hop with probability 0.5

• Network-based recovery
– Router switches to a random slice if next hop is unreachable
– Continue for a fixed number of hops or until the destination is reached
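
Both recovery mechanisms are simple enough to sketch. The code below is an illustration under assumptions: the per-hop slice choices are held in a list, "switching" means picking a different slice uniformly at random, and the router-side helper only considers slices whose next hop is currently reachable. None of these details are specified on the slide, and all names are hypothetical.

```python
import random
from typing import Dict, List, Optional, Set


def reroll_slices(slice_ids: List[int], k: int, rng: random.Random,
                  p_switch: float = 0.5) -> List[int]:
    """End-system recovery: at each hop position, switch to a different
    slice with probability p_switch."""
    new_ids = []
    for s in slice_ids:
        if k > 1 and rng.random() < p_switch:
            s = rng.choice([x for x in range(k) if x != s])
        new_ids.append(s)
    return new_ids


def pick_next_hop(tables: List[Dict[str, str]], dst: str, slice_id: int,
                  reachable: Set[str], rng: random.Random) -> Optional[str]:
    """Network-based recovery at one router: keep the requested slice if its
    next hop is reachable, otherwise fall back to a random usable slice."""
    hop = tables[slice_id][dst]
    if hop in reachable:
        return hop
    candidates = [t[dst] for t in tables if t[dst] in reachable]
    return rng.choice(candidates) if candidates else None


rng = random.Random(7)
# The end system retries with a freshly randomized header after a failure.
print(reroll_slices([0, 0, 0, 0], k=2, rng=rng))
# The router's slice-0 next hop "a" is down, so it deflects to another slice.
print(pick_next_hop([{"t": "a"}, {"t": "c"}], "t", 0, {"c"}, rng))
```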

Availability Evaluation: Two Aspects

• Reliability: Connectivity in the routing tables should approach that of the underlying graph
– If two nodes s and t remain connected in the underlying graph, there is some sequence of hops in the routing tables that will result in traffic from s reaching t

• Recovery: In case of failure (i.e., link or node removal), nodes should quickly be able to discover a new path

Availability Evaluation

• A definition for reliability

• Does path splicing improve reliability?
– How close can splicing get to the best possible reliability (i.e., that of the underlying graph)?

• Can path splicing enable fast recovery?
– Can end systems (or intermediate nodes) find alternate paths fast enough?

Reliability Definition

• Reliability: the probability that, upon failing each edge with probability p, the graph remains connected

• Reliability curve: the fraction of source-destination pairs that remain connected for various link failure probabilities p

• The underlying graph has an underlying reliability (and reliability curve)
– Goal: Reliability of routing system should approach that of the underlying graph.
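
One point on a reliability curve can be estimated by simulation, as in this minimal sketch. It assumes independent link failures and an undirected graph, and it reports the fraction of pairs that stay connected (the plotted fraction disconnected is one minus this value). The function names and the triangle example are illustrative, not taken from the talk.

```python
import random
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def connected_pair_fraction(nodes: List[str], edges: List[Edge]) -> float:
    """Fraction of unordered node pairs joined by some path (union-find)."""
    parent = {n: n for n in nodes}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    sizes: Dict[str, int] = {}
    for n in nodes:
        root = find(n)
        sizes[root] = sizes.get(root, 0) + 1
    total = len(nodes) * (len(nodes) - 1) // 2
    return sum(s * (s - 1) // 2 for s in sizes.values()) / total


def reliability_point(nodes: List[str], edges: List[Edge], p: float,
                      trials: int, rng: random.Random) -> float:
    """Average connected-pair fraction when each edge fails independently
    with probability p."""
    acc = 0.0
    for _ in range(trials):
        surviving = [e for e in edges if rng.random() >= p]
        acc += connected_pair_fraction(nodes, surviving)
    return acc / trials


nodes = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c"), ("a", "c")]
rng = random.Random(0)
curve = [(p, reliability_point(nodes, edges, p, 1000, rng)) for p in (0.0, 0.1, 0.3, 0.5)]
```

Running the same estimator on the underlying graph and on the edges reachable through the spliced forwarding tables would give the two curves the goal compares.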

Reliability Curve: Illustration

[Figure: reliability curve. x-axis: probability of link failure (p); y-axis: fraction of source-dest pairs disconnected. Lower curves indicate better reliability.]

More edges available to end systems -> Better reliability

Experimental Setup

• Evaluation on two topologies– GEANT (Real) and Sprint (Rocketfuel)

• Compute base graph by taking the union of k perturbed graphs

• Remove an edge from the base graph with probability p

• Compute number of pairs that could reach one another (average over 1,000 trials)

Reliability Approaches Optimal

• Sprint (Rocketfuel) topology
• 1,000 trials
• p indicates probability edge was removed from base graph

[Figure: reliability curves for the Sprint topology with degree-based perturbations. Annotations: reliability approaches optimal; average stretch is only 1.3.]

Simple Recovery Strategies Work Well

• Which paths can be recovered within 5 trials?
– Sequential trials: 5 round-trip times
– …but trials could also be made in parallel

Recovery approaches maximum possible

Adding a few more slices improves recovery beyond best possible reliability with fewer slices.

Significant Novelty for Modest Stretch

• Novelty: difference in nodes in a perturbed shortest path from the original shortest path

Example

[Figure: a short path and a long path between s and d]

Novelty: 1 – (1/3) = 2/3, where 1/3 is the fraction of edges on the short path shared with the long path
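
The worked example can be reproduced with a few lines of code. This sketch follows the edge-based calculation from the example (one minus the fraction of the short path's edges that also appear on the long path); the intermediate node names are hypothetical, since the figure's topology is not recoverable from the transcript.

```python
from fractions import Fraction
from typing import FrozenSet, List, Set


def path_edges(path: List[str]) -> Set[FrozenSet[str]]:
    """Undirected edges along a path given as a list of nodes."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def novelty(original: List[str], perturbed: List[str]) -> Fraction:
    """1 minus the fraction of the original path's edges shared with the perturbed path."""
    orig = path_edges(original)
    shared = orig & path_edges(perturbed)
    return 1 - Fraction(len(shared), len(orig))


short_path = ["s", "a", "b", "d"]      # 3 edges
long_path = ["s", "a", "c", "e", "d"]  # shares only the edge s-a
print(novelty(short_path, long_path))
```

With one of the three short-path edges shared, the result is 2/3, matching the example.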

Summary: Splicing Can Improve Availability

• Reliability: Connectivity in the routing tables should approach that of the underlying graph
– Approach: Overlay trees generated using random link-weight perturbations. Allow traffic to switch between them
– Result: Splicing ~ 10 trees achieves near-optimal reliability

• Recovery: In case of failure, nodes should quickly be able to discover a new path
– Approach: End nodes randomly select new bits
– Result: Recovery within 5 trials approaches best possible.

Does Splicing Create Loops?

• Persistent loops are avoidable
– In the simple scheme, path bits are exhausted from the header
– Never switching back to the same

• Transient loops can still be a problem because they increase end-to-end delay (“stretch”)
– Longer end-to-end paths
– Wasted capacity
– Two-hop loops do occur (around 1 in 100 trials for k=2, more for higher values of k), but can be avoided with the mechanisms above

Interactions with Traffic

Maximum utilization unaffected

Path Splicing for Interdomain Routing

• Observation: Many routers already learn multiple alternate routes to each destination.
• Idea: Use the bits to index into these alternate routes at an AS’s ingress and egress routers.

Required new functionality

• Storing multiple entries per prefix
• Indexing into them based on packet headers
• Selecting the “best” k routes for each destination

[Figure: default and alternate routes to destination d. Splice paths at ingress and egress routers.]
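
The three pieces of required functionality can be illustrated with a small sketch. Everything here is an assumption made for illustration: routes are ranked by AS-path length as a stand-in for the real route-selection process, prefixes are matched exactly rather than by longest-prefix match, and the `Route` type, function names, prefix, and AS labels are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Route(NamedTuple):
    next_hop: str
    as_path: List[str]


def select_k_routes(candidates: List[Route], k: int) -> List[Route]:
    """Keep k routes per prefix, ranked here by AS-path length (assumed criterion)."""
    return sorted(candidates, key=lambda r: len(r.as_path))[:k]


def lookup(rib: Dict[str, List[Route]], prefix: str, bits: int) -> Route:
    """Index into the stored routes using the packet's forwarding bits.
    Bits of 0 select the default (top-ranked) route."""
    routes = rib[prefix]
    return routes[bits % len(routes)]


learned = [
    Route("peer2", ["AS2", "AS5", "AS9"]),
    Route("peer1", ["AS1", "AS9"]),
    Route("peer3", ["AS3", "AS4", "AS6", "AS9"]),
]
rib = {"192.0.2.0/24": select_k_routes(learned, k=2)}
print(lookup(rib, "192.0.2.0/24", 0).next_hop)  # default route
print(lookup(rib, "192.0.2.0/24", 1).next_hop)  # alternate route
```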

Experimental Setup

• 2,500-node policy-annotated AS graph
• Use C-BGP to compute routes on base graph
• Remove each inter-AS edge with probability p
• Test connectivity between a random subset of AS pairs
• Compute base reliability without policy restrictions

Interdomain Splicing: Reliability

2-slice deployment approaches best possible

Incremental Deployment

Partial deployment provides some gains

Ongoing Work

• Software implementation
– Click Element
– PlanetLab/VINI deployment

• Extension to Cisco Multi-Topology Routing
– IETF draft in-progress

Open Questions and Ongoing Work

• How does splicing interact with traffic engineering? Sources controlling traffic?

• What are the best mechanisms for generating slices and recovering paths?

• Can splicing eliminate dynamic routing?

Conclusion

• Simple: Forwarding bits provide access to different paths through the network

• Scalable: Exponential increase in available paths, linear increase in state

• Stable: Fast recovery does not require fast routing protocols
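
The scalability claim can be made concrete with a short back-of-the-envelope calculation. The symbols are assumptions introduced here: k slices, a path of h hops, and a set D of destinations. The count k^h is an upper bound on the number of distinct paths, since different slices may share the same next hop at some routers.

```latex
\[
\underbrace{k \cdot |D|}_{\text{forwarding entries per router}}
\qquad \text{versus} \qquad
\underbrace{k^{h}}_{\text{slice sequences over } h \text{ hops}},
\qquad
\text{header bits} = h \,\lceil \lg k \rceil .
\]
```

For example, with k = 5 and h = 4, each router stores 5 tables, the header carries 4 × 3 = 12 forwarding bits, and an end system can choose among up to 5^4 = 625 slice sequences.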

http://www.cc.gatech.edu/~feamster/papers/splicing-hotnets.pdf

