Route Control Platform: Making the Network Act Like One Big Router

Jennifer Rexford, Princeton University
http://www.cs.princeton.edu/~jrex
http://www.cs.princeton.edu/~jrex/papers/rcp.pdf
Outline

• Internet architecture
  – Complexity of network management
• Moving control from routers to servers
  – Reducing complexity and increasing flexibility
• Traffic engineering example
  – Today's approach vs. the RCP
• Making the RCP real
  – Deployability, scalability, and reliability
• Example applications
  – Security, maintenance, and customer control
Internet Architecture

• The Internet is
  – Decentralized: loose confederation of peers
  – Self-configuring: no global registry of topology
  – Stateless: limited information in the routers
  – Connectionless: no fixed connection between hosts
• These attributes contribute
  – To the success of the Internet
  – To the rapid growth of the Internet
  – … and to the difficulty of controlling the Internet!
A Well-Studied Architecture Question

• Smart hosts, dumb network
• Network moves IP packets between hosts
• Services implemented on hosts
• Keep state at the edges

[Figure: IP packets flowing edge-to-edge across the network]

How to partition functionality vertically?
Inside a Single Network

• Data plane
  – Distributed routers
  – Forwarding, filtering, queueing, based on the FIB
• Control plane
  – Multiple routing processes (e.g., OSPF, BGP) on each router
  – Each router with a different configuration program
  – Huge number of control knobs: link metrics, ACLs, routing policies, packet filters
• Management plane
  – Figure out what is happening in the network
  – Decide how to change it
  – Shell scripts, traffic-engineering tools, planning tools, databases
  – Inputs from SNMP, NetFlow, and configs
Inside a Single Network: State Everywhere!

• Dynamic state in the forwarding tables
• Configured state in settings, policies, and packet filters
• Programmed state in magic constants and timers
• Many dependencies between bits of state
• State updated in an uncoordinated, decentralized way!
How Did We Get in This Mess?

• Initial IP architecture
  – Bundled packet handling and control logic
  – Distributed the functions across the routers
  – Didn't anticipate the need for management
• Rapid growth in features
  – Sudden popularity and growth of the Internet
  – Increasing demands for new functionality
  – Incremental extensions to protocols and router software
• Challenges of distributed algorithms
  – Some functions are hard to do in a distributed fashion
What Does the Operator Want?

• Network-wide views
  – Network topology
  – Mapping to lower-level equipment
  – Traffic matrix
• Network-level objectives
  – Load balancing
  – Survivability
  – Reachability
  – Security
• Direct control
  – Explicit configuration of data-plane mechanisms
What Architecture Would Achieve This?

• Management plane → Decision plane
  – Responsible for all decision logic and state
  – Operates on network-wide views and objectives
  – Directly controls the behavior of the data plane
• Control plane → Discovery plane
  – Responsible for providing the network-wide view
  – Topology discovery, traffic measurement, etc.
• Data plane
  – Queues, filters, and forwards data packets
  – Accepts direct instruction from the decision plane
Example Application: Traffic Engineering

• Problem: adapt the routing to the traffic demands
  – Inputs: network topology and traffic matrix
  – Outputs: a routing of the traffic that balances load
• Three ways to solve the problem
  – Extend the control plane to adapt to load
  – Management plane, working with today's control plane
  – Decision plane
Interior Gateway Protocol (OSPF/IS-IS)

• Routers flood information to learn the topology
  – Determine the "next hop" to reach the other routers
  – Compute shortest paths based on the link weights
• Link weights configured by the network operator

[Figure: example topology with per-link weights; each router computes shortest paths from the weights]
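The shortest-path computation each router runs over the configured link weights can be sketched with Dijkstra's algorithm (the topology and weights below are illustrative, not the ones from the figure):

```python
import heapq

def shortest_paths(graph, source):
    """Dijkstra's algorithm: distance from source to every router,
    using the operator-configured link weights as edge costs."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

# Small example topology: router -> [(neighbor, link weight)]
graph = {
    "A": [("B", 2), ("C", 5)],
    "B": [("A", 2), ("C", 1)],
    "C": [("A", 5), ("B", 1)],
}
distances = shortest_paths(graph, "A")  # A reaches C via B at cost 3
```

Note that the operator never sets paths directly here; the only knob is the per-link weight, which is exactly what makes traffic engineering with today's control plane indirect.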
Control Plane: Let the Routers Adapt

• Strawman alternative: load-sensitive routing
  – Link metrics based on traffic load
  – Flood dynamic metrics as they change
  – Adapt automatically to changes in offered load
• Reasons why this is typically not done
  – Delay-based routing was unsuccessful in the early days
  – Oscillation as routers adapt to out-of-date information
  – Most Internet transfers are very short-lived
• Research and standards work continues…
  – … but operators have to do what they can today
Management Plane: Measure, Model, Control

[Figure: the control loop: measure the topology/configuration and the offered traffic from the operational network, feed a network-wide "what-if" model, optimize, and control by changing the link weights]
Management Plane Approach

• Topology
  – Connectivity and capacity of routers and links
• Traffic matrix
  – Offered load between points in the network
• Link weights
  – Configurable parameters for the routing protocol
• Performance objective
  – Balanced load, low latency, service agreements, …
• Question: given the topology and traffic matrix, which link weights should be used?
Management Plane Solution

• Measure
  – Topology: monitoring of the routing protocols
  – Traffic matrix: widely deployed traffic measurement
• Model
  – Representations of topology and traffic
  – "What-if" models of the routing protocol
• Optimize
  – Efficient local-search algorithms to find good settings
  – Operational experience to identify key constraints

http://www.cs.princeton.edu/~jrex/papers/ieeecomm02.pdf
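A minimal sketch of this measure-model-optimize loop, assuming a toy three-router topology with a single traffic demand; the greedy search over single-weight changes mimics, in miniature, the local-search algorithms mentioned above (the single-path routing model and all names are simplifications):

```python
import heapq
from itertools import product

def route(weights, src, dst):
    """One shortest path from src to dst under the given link weights.
    weights: dict mapping directed link (u, v) -> metric."""
    dist, prev = {src: 0}, {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for (a, b), w in weights.items():
            if a == u and d + w < dist.get(b, float("inf")):
                dist[b], prev[b] = d + w, u
                heapq.heappush(heap, (d + w, b))
    path, node = [], dst
    while node != src:  # walk predecessors back to the source
        path.append((prev[node], node))
        node = prev[node]
    return path

def max_utilization(weights, demands, capacity):
    """Worst link utilization when each demand follows one shortest path."""
    load = dict.fromkeys(weights, 0.0)
    for (s, t), volume in demands.items():
        for link in route(weights, s, t):
            load[link] += volume
    return max(load[l] / capacity[l] for l in load)

def local_search(weights, demands, capacity, choices=(1, 2, 3, 4, 5)):
    """Greedily try single-weight changes, keeping any change that
    lowers the worst link utilization."""
    best = max_utilization(weights, demands, capacity)
    improved = True
    while improved:
        improved = False
        for link, w in product(list(weights), choices):
            trial = dict(weights)
            trial[link] = w
            u = max_utilization(trial, demands, capacity)
            if u < best - 1e-9:
                weights, best, improved = trial, u, True
    return weights, best

# Toy instance: the direct A->C link is too small for the 10 units of
# A->C demand, so the search should shift the traffic through B.
weights = {("A", "B"): 1, ("B", "C"): 1, ("A", "C"): 1}
capacity = {("A", "B"): 10.0, ("B", "C"): 10.0, ("A", "C"): 5.0}
demands = {("A", "C"): 10.0}
new_weights, util = local_search(weights, demands, capacity)
```

Even this toy version shows the structural problem the slide raises: the search can only nudge link weights and re-run the "what-if" model, rather than choosing the desired paths directly.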
This Works, But Has Some Limitations

• "What-if" model
  – Repeats the logic implemented in the control plane
  – Duplication of functionality, and of debugging
• Optimization techniques
  – Local search, because the problem is intractable
  – Too much computation to explore all possibilities
• Network effects
  – Link-weight changes are disruptive
  – Routers must converge after each change
  – Leads to transient packet loss and delay
Decision Plane Solution

• Measure
  – Topology: monitoring of the routing protocols
  – Traffic matrix: widely deployed traffic measurement
• Optimize the routing
  – Compute the desired forwarding paths directly
  – Simpler than optimizing the link weights
• Instruct the routers
  – Could change one router at a time to gradually switch to the new routes
  – Avoids transient packet loss and delays
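Computing the forwarding paths directly might look like this sketch: one backward shortest-path run per destination yields every router's next hop, and the decision plane can then install those entries one router at a time instead of tuning weights and waiting for OSPF to reconverge (topology and names are illustrative):

```python
import heapq

def next_hops(graph, dest):
    """For every router, the next hop toward dest, computed centrally.
    graph: router -> [(neighbor, weight)]; links assumed symmetric,
    so a backward Dijkstra from dest gives each router's distance."""
    dist = {dest: 0}
    heap = [(0, dest)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    # Each router forwards to the neighbor minimizing weight + remaining distance.
    return {u: min(graph[u], key=lambda vw: vw[1] + dist[vw[0]])[0]
            for u in graph if u != dest}

graph = {
    "A": [("B", 2), ("C", 5)],
    "B": [("A", 2), ("C", 1)],
    "C": [("A", 5), ("B", 1)],
}
hops = next_hops(graph, "C")  # per-router entries the decision plane installs
```

Because the answer is a plain per-router table rather than a protocol parameter, the decision plane can push it out router by router in whatever order avoids transient loops.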
More Network-Level Objectives

• Survivability
  – Routing that can tolerate any single equipment failure
  – Incorporate knowledge of shared-risk groups
• Reachability policies
  – Control which pairs of hosts can communicate
  – Install packet filters and forwarding-table entries
• Security
  – Install "blackhole" routes that drop attack traffic
  – Keep routing tables within router storage limits
• Etc.
Is the Decision Plane Feasible?

• Deployability: any path from here to there?
  – Must be compatible with today's routers
  – Must provide incentives for deployment
• Speed: can it run fast enough?
  – Must respond quickly to network events
  – Needs to be as fast as a router
• Reliability: a single point of failure?
  – Must be replicated to tolerate failures
  – Replicas must behave consistently
Deployability

• Take a lesson from Ethernet
  – Change anything but the message format
• Border Gateway Protocol (BGP)
  – Interdomain routing protocol for the Internet
    • Widely implemented on existing routers
    • Widely used, especially in backbone networks
  – Three main aspects of BGP
    • Protocol: standard messages sent between routers
    • Vendors: path-selection logic on individual routers
    • Operators: configuration of policies for path selection
  – The logic and policies are complex, but the messages are simple
Deployment in a Single Network

• Before: conventional use of BGP in a backbone network
  – Border routers learn routes over eBGP and exchange them over iBGP
• After: the RCP learns the external routes and sends answers to the routers
  – The RCP speaks ordinary iBGP, so the routers themselves are unmodified
• Only one AS has to change its architecture!
Longer Term, Wide-Spread Deployment

• Represents an AS as a single logical entity
  – Complete view of the AS's routes
  – Computes routes for all routers inside the AS
• Exchanges routing information with other ASes
  – Using BGP, or a new inter-AS protocol
  – While still using BGP to talk to the routers

[Figure: an RCP in each of AS 1, AS 2, and AS 3 speaking an inter-AS protocol with its peers, while each RCP uses iBGP to reach its own routers over the physical peering links]
RCP Architecture

[Figure: each RCP replica pairs a Route Control Server (the "brain") with a BGP Engine and an OSPF Viewer (the "brawn"); two such replicas connect to the routers in the network]

Scalability through decomposition; reliability through replication.
Scalability: Three-Part RCP Architecture

• OSPF Viewer
  – Continuous view of the network topology
  – Passive monitoring of link-state advertisements
• BGP Engine
  – Collects BGP updates from the border routers
  – Sends the chosen routes to the routers
  – Lots of TCP connections, like a Web server
• Route Control Server
  – Logic for computing the answers for the routers
  – Configuration for controlling the logic
  – Operates on real-time feeds from the monitors
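The Route Control Server's per-prefix logic can be sketched as a ranking over candidate routes, in the spirit of the BGP decision process (the route fields, AS numbers, and the cut-down three-step ranking are illustrative, not the full decision process):

```python
def best_route(candidates):
    """Pick one route per prefix: prefer the highest local preference,
    then the shortest AS path, then the egress that is closest in IGP
    distance. A real implementation applies the full BGP decision
    process; this is a three-step stand-in."""
    return min(candidates,
               key=lambda r: (-r["local_pref"], len(r["as_path"]), r["igp_dist"]))

# Hypothetical candidate routes for one prefix, one per egress point.
routes = [
    {"egress": "nyc", "local_pref": 100, "as_path": [7018, 701], "igp_dist": 10},
    {"egress": "sfo", "local_pref": 100, "as_path": [7018],      "igp_dist": 40},
    {"egress": "chi", "local_pref": 90,  "as_path": [7018],      "igp_dist": 5},
]
chosen = best_route(routes)  # "sfo": shortest AS path among the local-pref ties
```

Because the server sees every candidate route at once, it can also rank with a router-specific `igp_dist`, giving each router the answer it would have computed itself, or a customized one.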
Scalability: Initial Prototype

• Implementation platform
  – 3.2 GHz Pentium 4, 8 GB memory, Linux 2.6.5 kernel
• Workload
  – Routing/topology changes in AT&T's network
• RCP performance
  – Memory usage: less than 2 GB
  – Speed, BGP changes: less than 40 msec
  – Speed, topology changes: 0.1-0.8 seconds
• The system is able to keep up…
Reliability

• Replication: avoid a single point of failure
  – Multiple RCPs per network
  – Connected at different places
• Consistency: the replicas act as one
  – Replicas performing the same algorithm on the same input get the same answer (eventually)
  – Each replica has a complete view of each network partition it sees

[Figure: under a network partition, one replica sees partition A, another sees partition B, and a replica connected to both sees A and B]
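The consistency argument can be illustrated in miniature: if each replica's route computation is a deterministic function of its network view, replicas that see the same view agree without any coordination protocol (the decision rule and the view contents here are placeholders):

```python
def decide(view):
    """A replica's route computation as a pure, deterministic function
    of its network view (prefix -> candidate egress routers). Any
    deterministic rule works; alphabetical choice is just a stand-in."""
    return {prefix: min(egresses) for prefix, egresses in view.items()}

# Two replicas that observe the same partition compute the same routes.
view = {"10.0.0.0/8": ["nyc", "sfo"], "172.16.0.0/12": ["chi"]}
replica1, replica2 = decide(view), decide(dict(view))
```

This is why the RCP needs no replica-coordination machinery: agreement follows from determinism plus each replica having a complete view of the partition it serves.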
Application: DDoS Blackholing

• Blackholing of denial-of-service attacks
  – Preconfigure a "null" route on each router
  – Identify the address of the attack victim (from a DoS-detection system)
  – RCP assigns the destination address to the null route

[Figure: traffic analysis detects an attack on victim 1.2.3.4; the RCP tells the routers over iBGP to "use the null route for 1.2.3.4/32"]
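The blackholing step might be sketched as follows: the RCP maps the victim's /32 to the next-hop address that each router has preconfigured to discard traffic (the message format, router names, and the null next-hop address are illustrative stand-ins, not the real iBGP encoding):

```python
# Address each router preconfigures to a discard interface (illustrative;
# 192.0.2.0/24 is the TEST-NET range reserved for documentation).
NULL_NEXT_HOP = "192.0.2.1"

def blackhole_updates(routers, victim_prefix):
    """Build the iBGP-style updates the RCP would send so that every
    router drops traffic destined to the victim."""
    return [{"router": r, "prefix": victim_prefix, "next_hop": NULL_NEXT_HOP}
            for r in routers]

updates = blackhole_updates(["r1", "r2", "r3"], "1.2.3.4/32")
```

The maintenance dry-out and customized egress selection applications below follow the same pattern: the RCP changes only the route it announces to each router, and the routers forward accordingly.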
Application: Maintenance Dry-Out

• Dry-out of traffic before maintenance
  – Plan to take a router out of service
  – RCP assigns routes via new egress points in advance

[Figure: router r is about to undergo maintenance; traffic to destination d currently exits via r, and the RCP tells the routers over iBGP to "use the route via s for d" before the maintenance begins]
Application: Customized Egress Selection

• Customer-controlled selection of egress points
  – A customer with two data centers and many sites
  – The customer wants to control the load balance
  – RCP customization (not simply the closest egress)

[Figure: for destination d reachable via egresses r and s, the RCP tells site #1's router to "use the route via r for d" and site #2's router to "use the route via s for d"]
Conclusion

• Managing IP networks is too hard
  – The IP architecture was not designed for management
  – Complex, distributed operation of the routers
• Reducing complexity is the key
  – Network-wide views and objectives, and direct control
  – Removing control logic and state from the routers
• The new architecture is feasible
  – The RCP is deployable, scalable, and reliable
  – The RCP solves important operations problems