Date post: 20-Dec-2015
Resilient Overlay Networks. David Andersen, Hari Balakrishnan, Frank Kaashoek and Robert Morris. MIT Laboratory for Computer Science, http://nms.lcs.mit.edu/ron/. Presented by Rohit Kulkarni, University of Southern California, CSCI 558L: Fall 2004
Page 1: Resilient Overlay Networks

David Andersen, Hari Balakrishnan, Frank Kaashoek and Robert Morris

MIT Laboratory for Computer Science

http://nms.lcs.mit.edu/ron/

Rohit Kulkarni

University of Southern California

CSCI 558L : Fall 2004

Page 2: Outline
  • Introduction
  • What is RON?
  • Design Goals
  • RON Design
  • Evaluation
  • Related Work
  • Future Work
  • Conclusion

Page 3: Introduction
Current organization of the Internet
  • Independently operating ASes peer together
  • Detailed routing information is kept only within an AS
  • Shared routing information is filtered using BGP
  • BGP provides policy enforcement and scalability
Problems with this organization
  • Reduced fault tolerance of end-to-end (e2e) communication
  • Fault recovery takes many minutes
  • Vulnerable to router and link faults and to configuration errors

Page 4: Introduction [2]: Other Studies
Studies highlighting problems
  • “Delayed routing convergence”: inter-domain routers take tens of minutes to reach a consistent view after a fault
  • “e2e routing behavior”: routing faults prevent Internet hosts from communicating up to 3.3% of the time, averaged over a long period
  • “e2e WAN availability”: 5% of all detected failures last more than 10,000 secs (about 2 hrs 47 mins)
Studies trying to solve problems
  • “Multi-homing”: addressing issues with active connections; still no quick fault recovery
  • “Detour”: path selection in the Internet is sub-optimal w.r.t. latency, loss rate and throughput; showed the benefits of indirect routing
No wide-area system provides quick failure recovery

Page 5: What is RON?
  • Resilient: “to recover readily from adversity”
  • Overlay network: “an isolated virtual network deployed over an existing physical network”

Figure taken from the X-Bone project [2]

Page 6: What is RON? [2]
  • “RON is an architecture that allows distributed Internet applications to detect and recover from path outages and periods of degraded performance within several seconds”
  • An application-layer overlay on top of the existing Internet
  • RON nodes monitor the Internet paths among themselves
      • Functioning
      • Quality
  • Packets are routed directly over the Internet or through other RON nodes
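The choice between the direct Internet path and a path through another RON node can be sketched as follows. This is only an illustration: the latency table, the node names and the helpers `path_cost` and `best_path` are hypothetical, and a real RON node would fill the table from its own probes.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical measurements in milliseconds: latency[(a, b)] is the cost of the
# direct Internet path from a to b. None marks a path with a detected outage.
Latency = Dict[Tuple[str, str], Optional[float]]


def path_cost(latency: Latency, hops: List[str]) -> Optional[float]:
    """Sum the cost of every link on the path; None if any link is unusable."""
    total = 0.0
    for a, b in zip(hops, hops[1:]):
        cost = latency.get((a, b))
        if cost is None:
            return None
        total += cost
    return total


def best_path(latency: Latency, nodes: List[str], src: str, dst: str) -> Optional[List[str]]:
    """Compare the direct path with every path through one intermediate node."""
    candidates = [[src, dst]]
    candidates += [[src, mid, dst] for mid in nodes if mid not in (src, dst)]
    best: Optional[List[str]] = None
    best_cost: Optional[float] = None
    for hops in candidates:
        cost = path_cost(latency, hops)
        if cost is None:
            continue
        # Strict comparison keeps the direct path on ties (it is listed first).
        if best_cost is None or cost < best_cost:
            best, best_cost = hops, cost
    return best


nodes = ["A", "B", "C"]
outage = {("A", "B"): None, ("A", "C"): 20.0, ("C", "B"): 30.0}
print(best_path(outage, nodes, "A", "B"))
```

With the direct A to B path marked as down, the sketch falls back to the path through C; when the direct path works and is cheaper, it is kept.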

Page 7: Design Goals
Goal 1: Fast failure detection and recovery
  • Detection by aggressive probing
  • Recovery by forwarding through intermediate RON nodes
Goal 2: Integrate routing and path selection with applications
  • Application-specific notions of failure
  • Application-specific metrics in path selection
Goal 3: Framework for policy routing
  • Fine-grained policies aimed at users or hosts
  • E.g. e2e per-user rate control, forwarding rate controls based on packet classification
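Detection by probing (Goal 1) can be illustrated with a small counter over probe results. This is a sketch under stated assumptions: the class `OutageDetector` and its threshold of 4 consecutive lost probes are hypothetical choices for illustration, not values taken from the slides.

```python
class OutageDetector:
    """Declares a path down after a run of unanswered probes.

    The default threshold is an assumed value chosen for this example.
    """

    def __init__(self, loss_threshold: int = 4) -> None:
        self.loss_threshold = loss_threshold
        self.consecutive_losses = 0

    def record_probe(self, answered: bool) -> bool:
        """Record one probe result and return True if the path now counts as down."""
        if answered:
            # Any answered probe ends the current run of losses.
            self.consecutive_losses = 0
        else:
            self.consecutive_losses += 1
        return self.is_down()

    def is_down(self) -> bool:
        return self.consecutive_losses >= self.loss_threshold


detector = OutageDetector()
for answered in [True, False, False, False, False]:
    down = detector.record_probe(answered)
print(down)
```

Probing more often shortens the time needed to collect the run of losses, which is the trade-off behind "aggressive" probing: faster detection at the price of more probe traffic.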

Page 8: RON Design: Software Architecture
  • RON client
  • Conduit
  • RON nodes: entry node, exit node
  • Forwarder
  • Router
  • Membership manager
  • Performance database

Page 9: Design [2]: Routing and Path Selection
Link-state propagation
  • Link-state routing protocol
  • The routing protocol is itself a RON client, with its own packet type
Path evaluation and selection
  • Outage detection using active probing
  • Routing metrics
      • Latency-minimizer: uses an EWMA of round-trip latency samples with parameter 0.9
      • Loss-minimizer: uses the average of the last k (100) probe samples
      • TCP throughput-optimizer: combines the latency and loss-rate metrics using a simplified TCP throughput equation
Performance database
  • Detailed performance information
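The three routing metrics can be sketched in a few lines. The assumptions are labeled here and in the comments: the parameter 0.9 is taken to weight the previous estimate, the throughput score uses the form sqrt(1.5) / (rtt * sqrt(p)) as one common simplification of the TCP throughput model, and the class names and the `min_loss` floor are hypothetical.

```python
import math
from collections import deque
from typing import Deque, Optional


class LatencyEstimator:
    """EWMA of round-trip latency samples.

    Assumption: alpha (0.9) is the weight given to the previous estimate.
    """

    def __init__(self, alpha: float = 0.9) -> None:
        self.alpha = alpha
        self.value: Optional[float] = None

    def update(self, sample: float) -> float:
        if self.value is None:
            self.value = sample  # the first sample seeds the average
        else:
            self.value = self.alpha * self.value + (1 - self.alpha) * sample
        return self.value


class LossEstimator:
    """Loss rate as the plain average over the last k probe results."""

    def __init__(self, k: int = 100) -> None:
        self.window: Deque[int] = deque(maxlen=k)  # old results drop out automatically

    def update(self, lost: bool) -> float:
        self.window.append(1 if lost else 0)
        return self.rate()

    def rate(self) -> float:
        return sum(self.window) / len(self.window) if self.window else 0.0


def throughput_score(rtt_s: float, loss_rate: float, min_loss: float = 1e-4) -> float:
    """Assumed simplified TCP model: score = sqrt(1.5) / (rtt * sqrt(p)).

    min_loss is a hypothetical floor that avoids dividing by zero on loss-free paths.
    """
    p = max(loss_rate, min_loss)
    return math.sqrt(1.5) / (rtt_s * math.sqrt(p))


latency = LatencyEstimator()
latency.update(100.0)
print(latency.update(200.0))  # one slow sample moves the estimate only slightly
```

The score is only useful for ranking paths against each other: a path with a higher score is expected to give better TCP throughput, so a low-latency path with heavy loss can rank below a slower but cleaner one.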

Page 10: Design [3]: Policy Routing
Allows users to define the types of traffic allowed on network links
Classification
  • Data classifier module
  • Incoming packets (via the conduit) get a policy tag
  • The tag identifies the set of routing tables to use
Routing table formation
  • The policy identifies which virtual links to use
  • Separate set of routing tables for each policy
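A minimal sketch of the two steps above, classification and per-policy routing tables, is given below. Everything named here is hypothetical: the traffic classes, the link costs, the `POLICY_LINKS` sets and the helper functions are invented for illustration and only consider direct and one-hop paths.

```python
from typing import Dict, List, NamedTuple, Optional, Set, Tuple

Link = Tuple[str, str]


class Packet(NamedTuple):
    dst: str
    traffic_class: str  # hypothetical label set by the application


# Hypothetical virtual-link costs and, per policy tag, the links it may use.
LINK_COST: Dict[Link, float] = {("A", "B"): 50.0, ("A", "C"): 10.0, ("C", "B"): 10.0}
POLICY_LINKS: Dict[str, Set[Link]] = {
    "commercial": {("A", "B")},
    "research": {("A", "B"), ("A", "C"), ("C", "B")},
}


def build_table(src: str, nodes: List[str], allowed: Set[Link]) -> Dict[str, str]:
    """Map each destination to its next hop, using only the allowed links."""
    table: Dict[str, str] = {}
    for dst in nodes:
        if dst == src:
            continue
        best: Optional[Tuple[float, str]] = None  # (cost, next hop)
        for first in nodes:
            if first == src:
                continue
            links = [(src, dst)] if first == dst else [(src, first), (first, dst)]
            if not all(l in allowed and l in LINK_COST for l in links):
                continue
            cost = sum(LINK_COST[l] for l in links)
            if best is None or cost < best[0]:
                best = (cost, first)
        if best is not None:
            table[dst] = best[1]
    return table


NODES = ["A", "B", "C"]
# One routing table per policy tag, built at node A.
TABLES = {tag: build_table("A", NODES, links) for tag, links in POLICY_LINKS.items()}


def classify(packet: Packet) -> str:
    """Data classifier: attach a policy tag to an incoming packet."""
    return "research" if packet.traffic_class == "research" else "commercial"


def next_hop(packet: Packet) -> Optional[str]:
    """The tag selects the routing table; the table gives the next hop."""
    return TABLES[classify(packet)].get(packet.dst)


print(next_hop(Packet("B", "research")), next_hop(Packet("B", "web")))
```

In this toy setup research traffic to B is sent through C, while other traffic must use the more expensive direct link because its policy does not allow the links through C.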

Page 11: Design [4]: Data Forwarding

The forwarding control and data paths

The RON packet header

Page 12: Evaluation: Methodology
  • Wide-area RON deployed at several Internet sites
      • ISPs, US universities (I-2), European universities, broadband home users, .coms
      • Internet-2 (I-2) policy for more Internet-like measurements
  • Measurements using probe packets and throughput samples
  • 2 datasets
      • RON1: 12 nodes, 132 paths; 2.6M samples; 64-hour trace in March 2001
      • RON2: 16 nodes, 240 paths; 3.5M samples; 85-hour trace in May 2001
  • Time-averaged samples are averaged over 30-minute intervals
  • Most RON hosts were Celeron/733, 256 MB RAM, 9 GB HDD, FreeBSD. No host was a processing bottleneck
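The path counts for the two datasets are consistent with counting each direction between a pair of nodes as its own path, i.e. n(n - 1) paths for n nodes. The helper name `directed_paths` is hypothetical; the check itself is plain arithmetic.

```python
def directed_paths(n: int) -> int:
    """Number of ordered (source, destination) pairs among n nodes."""
    return n * (n - 1)


for name, n in [("RON1", 12), ("RON2", 16)]:
    print(name, n, "nodes ->", directed_paths(n), "paths")
```

The count grows quadratically with the number of nodes, so the number of paths to probe rises quickly as a RON gets larger.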

Page 13: Evaluation [2]: Overcoming Path Outages

Outage data for RON1

Outage data for RON2

Packet loss rate averaged over 30-minute intervals for direct Internet paths vs. RON paths, for RON1

Page 14: Evaluation [3]: Improving Loss Rates

CDF of improvement in loss-rate achieved by RON1. Samples are averaged over 30 mins

Page 15: Evaluation [4]: Improving Latency

5-minute average latencies over the direct Internet path and over RON, shown as a CDF

Page 16: Evaluation [5]: Improving Throughput

CDF of the ratio of throughput achieved via RON to that achieved directly via the Internet

Page 17: Evaluation [6]: Route Stability
  • Link-state routing table snapshots were taken every 14 seconds, 5616 snapshots in total
  • RON’s path selection algorithms were run on the link-state trace
  • This shows that hysteresis is needed for route stability

Number of path changes and run lengths of routing persistence for different hysteresis values
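Why hysteresis helps can be shown with a small replay over cost snapshots. This is a sketch, not the evaluated algorithm: the rule "switch only if the best path beats the current one by a given fraction", the function names and the sample costs are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def select_with_hysteresis(current: Optional[str], costs: Dict[str, float],
                           hysteresis: float) -> str:
    """Keep the current path unless another one is better by the given fraction."""
    best = min(costs, key=costs.get)
    if current is None or current not in costs:
        return best
    if costs[best] < costs[current] * (1 - hysteresis):
        return best
    return current


def count_changes(snapshots: List[Dict[str, float]], hysteresis: float) -> int:
    """Replay the snapshots and count how often the selected path changes."""
    current: Optional[str] = None
    changes = 0
    for costs in snapshots:
        chosen = select_with_hysteresis(current, costs, hysteresis)
        if current is not None and chosen != current:
            changes += 1
        current = chosen
    return changes


# Hypothetical trace: two paths whose measured costs keep trading places by 2%.
trace = [{"direct": 100.0, "via_C": 98.0}, {"direct": 98.0, "via_C": 100.0}] * 3
print(count_changes(trace, 0.0), count_changes(trace, 0.05))
```

Without hysteresis the selection flips on every snapshot of this trace; with a 5% margin it stays on the first choice, while a large improvement on another path would still cause a switch.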

Page 18: Evaluation [7]: Major Results
  • RON was able to successfully detect and recover from 100% (in RON1) and 60% (in RON2) of all complete outages and all periods of sustained high loss rates of 30% or more
  • RON takes 18 seconds, on average, to route around a failure and can do so in the face of a flooding attack
  • RON successfully routed around bad throughput failures, doubling TCP throughput in 5% of all samples
  • In 5% of the samples, RON reduced the loss probability by 0.05 or more
  • Single-hop route indirection captured the majority of benefits in our RON deployment, for both outage recovery and latency optimization

Page 19: Related Work
X-Bone
  • Generic framework for speedy deployment of IP-based overlay networks
  • Management functions and mechanisms to insert packets into the overlay
  • No fault-tolerant operation: no outage detection
  • No application-controlled path selection
Detour
  • Showed the benefits of indirect routing
  • Kernel-level system, not closely tied to the application
  • Focus is not on quick failure recovery to prevent disruptions
  • No analysis of experimental results from a real-world deployment

Page 20: Future Work
Preventing misuse of an established RON
  • Cryptographic authentication and access control
  • Mechanisms to detect misbehaving RON peers; handling this only at the administrative level is not enough
Scalability and widespread deployment
  • Keep the size of a RON within limits (50-100 nodes)
  • Allow many RONs to co-exist
      • Their interactions
      • Routing stability

Page 21: Conclusion
  • RON can greatly improve the reliability of Internet packet delivery by detecting and recovering from outages and path failures more quickly (18 secs) than BGP-4 (several mins)
  • It can overcome performance failures, improving loss rates, latency and throughput
  • Forwarding packets via at most one intermediate RON node is sufficient
  • The claims of RON are confirmed by experiments
  • Good platform for developing resilient distributed Internet applications

Page 22: References
  • “Resilient Overlay Networks”, D. Andersen, H. Balakrishnan, M. Kaashoek, R. Morris, Proc. 18th ACM SOSP, Banff, Canada, October 2001.
  • “Detour: a Case for Informed Internet Routing and Transport”, S. Savage, T. Anderson, A. Aggarwal, D. Becker, N. Cardwell, A. Collins, E. Hoffman, J. Snell, A. Vahdat, G. Voelker, J. Zahorjan, IEEE Micro, 19(1):50-59, January 1999.
  • “Dynamic Internet Overlay Deployment and Management Using the X-Bone”, J. Touch, Computer Networks, July 2001, pp. 117-135.
  • “The Case for Resilient Overlay Networks”, D. Andersen, H. Balakrishnan, M. Kaashoek, R. Morris, Proc. HotOS VIII, Schloss Elmau, Germany, May 2001.
  • “End-to-End Routing Behavior in the Internet”, V. Paxson, Proc. ACM SIGCOMM, Stanford, CA, August 1996.
  • “Delayed Internet Routing Convergence”, C. Labovitz, A. Ahuja, A. Bose, F. Jahanian, Proc. ACM SIGCOMM, Stockholm, Sweden, September 2000, pp. 175-187.
  • “Modeling TCP Throughput: A simple model and its empirical validation”, J. Padhye, V. Firoiu, D. Towsley, J. Kurose, Proc. ACM SIGCOMM, Vancouver, Canada, September 1998, pp. 303-323.
  • “End-to-End WAN service availability”, B. Chandra, M. Dahlin, L. Gao, A. Nayate, Proc. 3rd USITS, San Francisco, CA, 2001, pp. 97-108.

