Route Control Platform: Making the Network Act Like One Big Router

Jennifer Rexford, Princeton University
http://www.cs.princeton.edu/~jrex
http://www.cs.princeton.edu/~jrex/papers/rcp.pdf
Outline

• Internet architecture
  – Complexity of network management
• Moving control from routers to servers
  – Reducing complexity and increasing flexibility
• Traffic engineering example
  – Today's approach vs. the RCP
• Making the RCP real
  – Deployability, scalability, and reliability
• Example applications
  – Security, maintenance, and customer control
Internet Architecture

• The Internet is
  – Decentralized: loose confederation of peers
  – Self-configuring: no global registry of topology
  – Stateless: limited information in the routers
  – Connectionless: no fixed connection between hosts
• These attributes contribute
  – To the success of the Internet
  – To the rapid growth of the Internet
  – … and to the difficulty of controlling the Internet!
A Well-Studied Architecture Question

• Smart hosts, dumb network
• Network moves IP packets between hosts
• Services implemented on hosts
• Keep state at the edges

[Figure: IP packets flowing edge-to-edge across the network]

How to partition functionality vertically?
Inside a Single Network

• Data plane
  – Distributed routers
  – Forwarding, filtering, queueing, based on the FIB
• Control plane
  – Multiple routing processes (e.g., OSPF, BGP) on each router
  – Each router with a different configuration program
  – Huge number of control knobs: link metrics, ACLs, routing policies, packet filters
• Management plane
  – Figure out what is happening in the network
  – Decide how to change it
  – Shell scripts, traffic-engineering tools, planning tools, databases
  – Inputs from SNMP, NetFlow, and configs
Inside a Single Network: State Everywhere!

• Dynamic state in the forwarding tables
• Configured state in settings, policies, and packet filters
• Programmed state in magic constants and timers
• Many dependencies between bits of state
• State updated in an uncoordinated, decentralized way!
How Did We Get in This Mess?

• Initial IP architecture
  – Bundled packet handling and control logic
  – Distributed the functions across the routers
  – Didn't anticipate the need for management
• Rapid growth in features
  – Sudden popularity and growth of the Internet
  – Increasing demands for new functionality
  – Incremental extensions to protocols and router software
• Challenges of distributed algorithms
  – Some functions are hard to do in a distributed fashion
What Does the Operator Want?

• Network-wide views
  – Network topology
  – Mapping to lower-level equipment
  – Traffic matrix
• Network-level objectives
  – Load balancing
  – Survivability
  – Reachability
  – Security
• Direct control
  – Explicit configuration of data-plane mechanisms
What Architecture Would Achieve This?

• Management plane → Decision plane
  – Responsible for all decision logic and state
  – Operates on network-wide views and objectives
  – Directly controls the behavior of the data plane
• Control plane → Discovery plane
  – Responsible for providing the network-wide view
  – Topology discovery, traffic measurement, etc.
• Data plane
  – Queues, filters, and forwards data packets
  – Accepts direct instruction from the decision plane
Example Application: Traffic Engineering

• Problem: adapt the routing to the traffic demands
  – Inputs: network topology and traffic matrix
  – Outputs: a routing of the traffic that balances load
• Three ways to solve the problem
  – Extend the control plane to adapt to load
  – Management plane, working with today's control plane
  – Decision plane
Interior Gateway Protocol (OSPF/IS-IS)

• Routers flood information to learn the topology
  – Determine the "next hop" to reach the other routers
  – Compute shortest paths based on the link weights
• Link weights configured by the network operator

[Figure: example topology with per-link weights; each router computes shortest paths from the weights]
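The shortest-path computation each router runs over the configured link weights can be sketched with Dijkstra's algorithm (the topology and weights below are illustrative, not the ones from the figure):

```python
import heapq

def shortest_paths(graph, source):
    """Dijkstra's algorithm: distance from source to every router,
    using the operator-configured link weights as edge costs."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

# Small example topology: router -> [(neighbor, link weight)]
graph = {
    "A": [("B", 2), ("C", 5)],
    "B": [("A", 2), ("C", 1)],
    "C": [("A", 5), ("B", 1)],
}
distances = shortest_paths(graph, "A")  # A reaches C via B at cost 3
```

Note that the operator never sets paths directly here; the only knob is the per-link weight, which is exactly what makes traffic engineering with today's control plane indirect.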
Control Plane: Let the Routers Adapt

• Strawman alternative: load-sensitive routing
  – Link metrics based on traffic load
  – Flood dynamic metrics as they change
  – Adapt automatically to changes in offered load
• Reasons why this is typically not done
  – Delay-based routing was unsuccessful in the early days
  – Oscillation as routers adapt to out-of-date information
  – Most Internet transfers are very short-lived
• Research and standards work continues…
  – … but operators have to do what they can today
Management Plane: Measure, Model, Control

[Figure: the control loop: measure the topology/configuration and the offered traffic from the operational network, feed a network-wide "what-if" model, optimize, and control by changing the link weights]
Management Plane Approach

• Topology
  – Connectivity and capacity of routers and links
• Traffic matrix
  – Offered load between points in the network
• Link weights
  – Configurable parameters for the routing protocol
• Performance objective
  – Balanced load, low latency, service agreements, …
• Question: given the topology and traffic matrix, which link weights should be used?
Management Plane Solution

• Measure
  – Topology: monitoring of the routing protocols
  – Traffic matrix: widely deployed traffic measurement
• Model
  – Representations of topology and traffic
  – "What-if" models of the routing protocol
• Optimize
  – Efficient local-search algorithms to find good settings
  – Operational experience to identify key constraints

http://www.cs.princeton.edu/~jrex/papers/ieeecomm02.pdf
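A minimal sketch of this measure-model-optimize loop, assuming a toy three-router topology with a single traffic demand; the greedy search over single-weight changes mimics, in miniature, the local-search algorithms mentioned above (the single-path routing model and all names are simplifications):

```python
import heapq
from itertools import product

def route(weights, src, dst):
    """One shortest path from src to dst under the given link weights.
    weights: dict mapping directed link (u, v) -> metric."""
    dist, prev = {src: 0}, {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for (a, b), w in weights.items():
            if a == u and d + w < dist.get(b, float("inf")):
                dist[b], prev[b] = d + w, u
                heapq.heappush(heap, (d + w, b))
    path, node = [], dst
    while node != src:  # walk predecessors back to the source
        path.append((prev[node], node))
        node = prev[node]
    return path

def max_utilization(weights, demands, capacity):
    """Worst link utilization when each demand follows one shortest path."""
    load = dict.fromkeys(weights, 0.0)
    for (s, t), volume in demands.items():
        for link in route(weights, s, t):
            load[link] += volume
    return max(load[l] / capacity[l] for l in load)

def local_search(weights, demands, capacity, choices=(1, 2, 3, 4, 5)):
    """Greedily try single-weight changes, keeping any change that
    lowers the worst link utilization."""
    best = max_utilization(weights, demands, capacity)
    improved = True
    while improved:
        improved = False
        for link, w in product(list(weights), choices):
            trial = dict(weights)
            trial[link] = w
            u = max_utilization(trial, demands, capacity)
            if u < best - 1e-9:
                weights, best, improved = trial, u, True
    return weights, best

# Toy instance: the direct A->C link is too small for the 10 units of
# A->C demand, so the search should shift the traffic through B.
weights = {("A", "B"): 1, ("B", "C"): 1, ("A", "C"): 1}
capacity = {("A", "B"): 10.0, ("B", "C"): 10.0, ("A", "C"): 5.0}
demands = {("A", "C"): 10.0}
new_weights, util = local_search(weights, demands, capacity)
```

Even this toy version shows the structural problem the slide raises: the search can only nudge link weights and re-run the "what-if" model, rather than choosing the desired paths directly.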
This Works, But Has Some Limitations

• "What-if" model
  – Repeats the logic implemented in the control plane
  – Duplication of functionality, and of debugging
• Optimization techniques
  – Local search, because the problem is intractable
  – Too much computation to explore all possibilities
• Network effects
  – Link-weight changes are disruptive
  – Routers must converge after each change
  – Leads to transient packet loss and delay
Decision Plane Solution

• Measure
  – Topology: monitoring of the routing protocols
  – Traffic matrix: widely deployed traffic measurement
• Optimize the routing
  – Compute the desired forwarding paths directly
  – Simpler than optimizing the link weights
• Instruct the routers
  – Could change one router at a time to gradually switch to the new routes
  – Avoids transient packet loss and delays
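Computing the forwarding paths directly might look like this sketch: one backward shortest-path run per destination yields every router's next hop, and the decision plane can then install those entries one router at a time instead of tuning weights and waiting for OSPF to reconverge (topology and names are illustrative):

```python
import heapq

def next_hops(graph, dest):
    """For every router, the next hop toward dest, computed centrally.
    graph: router -> [(neighbor, weight)]; links assumed symmetric,
    so a backward Dijkstra from dest gives each router's distance."""
    dist = {dest: 0}
    heap = [(0, dest)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    # Each router forwards to the neighbor minimizing weight + remaining distance.
    return {u: min(graph[u], key=lambda vw: vw[1] + dist[vw[0]])[0]
            for u in graph if u != dest}

graph = {
    "A": [("B", 2), ("C", 5)],
    "B": [("A", 2), ("C", 1)],
    "C": [("A", 5), ("B", 1)],
}
hops = next_hops(graph, "C")  # per-router entries the decision plane installs
```

Because the answer is a plain per-router table rather than a protocol parameter, the decision plane can push it out router by router in whatever order avoids transient loops.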
More Network-Level Objectives

• Survivability
  – Routing that can tolerate any single equipment failure
  – Incorporate knowledge of shared-risk groups
• Reachability policies
  – Control which pairs of hosts can communicate
  – Install packet filters and forwarding-table entries
• Security
  – Install "blackhole" routes that drop attack traffic
  – Keep routing tables within router storage limits
• Etc.
Is the Decision Plane Feasible?

• Deployability: any path from here to there?
  – Must be compatible with today's routers
  – Must provide incentives for deployment
• Speed: can it run fast enough?
  – Must respond quickly to network events
  – Needs to be as fast as a router
• Reliability: a single point of failure?
  – Must be replicated to tolerate failures
  – Replicas must behave consistently
Deployability

• Take a lesson from Ethernet
  – Change anything but the message format
• Border Gateway Protocol (BGP)
  – Interdomain routing protocol for the Internet
    • Widely implemented on existing routers
    • Widely used, especially in backbone networks
  – Three main aspects of BGP
    • Protocol: standard messages sent between routers
    • Vendors: path-selection logic on individual routers
    • Operators: configuration of policies for path selection
  – The logic and policies are complex, but the messages are simple
Deployment in a Single Network

• Before: conventional use of BGP in a backbone network
  – Border routers learn routes over eBGP and exchange them over iBGP
• After: the RCP learns the external routes and sends answers to the routers
  – The RCP speaks ordinary iBGP, so the routers themselves are unmodified
• Only one AS has to change its architecture!
Longer Term, Wide-Spread Deployment

• Represents an AS as a single logical entity
  – Complete view of the AS's routes
  – Computes routes for all routers inside the AS
• Exchanges routing information with other ASes
  – Using BGP, or a new inter-AS protocol
  – While still using BGP to talk to the routers

[Figure: an RCP in each of AS 1, AS 2, and AS 3 speaking an inter-AS protocol with its peers, while each RCP uses iBGP to reach its own routers over the physical peering links]
RCP Architecture

[Figure: each RCP replica pairs a Route Control Server (the "brain") with a BGP Engine and an OSPF Viewer (the "brawn"); two such replicas connect to the routers in the network]

Scalability through decomposition; reliability through replication.
Scalability: Three-Part RCP Architecture

• OSPF Viewer
  – Continuous view of the network topology
  – Passive monitoring of link-state advertisements
• BGP Engine
  – Collects BGP updates from the border routers
  – Sends the chosen routes to the routers
  – Lots of TCP connections, like a Web server
• Route Control Server
  – Logic for computing the answers for the routers
  – Configuration for controlling the logic
  – Operates on real-time feeds from the monitors
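The Route Control Server's per-prefix logic can be sketched as a ranking over candidate routes, in the spirit of the BGP decision process (the route fields, AS numbers, and the cut-down three-step ranking are illustrative, not the full decision process):

```python
def best_route(candidates):
    """Pick one route per prefix: prefer the highest local preference,
    then the shortest AS path, then the egress that is closest in IGP
    distance. A real implementation applies the full BGP decision
    process; this is a three-step stand-in."""
    return min(candidates,
               key=lambda r: (-r["local_pref"], len(r["as_path"]), r["igp_dist"]))

# Hypothetical candidate routes for one prefix, one per egress point.
routes = [
    {"egress": "nyc", "local_pref": 100, "as_path": [7018, 701], "igp_dist": 10},
    {"egress": "sfo", "local_pref": 100, "as_path": [7018],      "igp_dist": 40},
    {"egress": "chi", "local_pref": 90,  "as_path": [7018],      "igp_dist": 5},
]
chosen = best_route(routes)  # "sfo": shortest AS path among the local-pref ties
```

Because the server sees every candidate route at once, it can also rank with a router-specific `igp_dist`, giving each router the answer it would have computed itself, or a customized one.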
Scalability: Initial Prototype

• Implementation platform
  – 3.2 GHz Pentium 4, 8 GB memory, Linux 2.6.5 kernel
• Workload
  – Routing/topology changes in AT&T's network
• RCP performance
  – Memory usage: less than 2 GB
  – Speed, BGP changes: less than 40 msec
  – Speed, topology changes: 0.1-0.8 seconds
• The system is able to keep up…
Reliability

• Replication: avoid a single point of failure
  – Multiple RCPs per network
  – Connected at different places
• Consistency: the replicas act as one
  – Replicas performing the same algorithm on the same input get the same answer (eventually)
  – Each replica has a complete view of each network partition it sees

[Figure: under a network partition, one replica sees partition A, another sees partition B, and a replica connected to both sees A and B]
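The consistency argument can be illustrated in miniature: if each replica's route computation is a deterministic function of its network view, replicas that see the same view agree without any coordination protocol (the decision rule and the view contents here are placeholders):

```python
def decide(view):
    """A replica's route computation as a pure, deterministic function
    of its network view (prefix -> candidate egress routers). Any
    deterministic rule works; alphabetical choice is just a stand-in."""
    return {prefix: min(egresses) for prefix, egresses in view.items()}

# Two replicas that observe the same partition compute the same routes.
view = {"10.0.0.0/8": ["nyc", "sfo"], "172.16.0.0/12": ["chi"]}
replica1, replica2 = decide(view), decide(dict(view))
```

This is why the RCP needs no replica-coordination machinery: agreement follows from determinism plus each replica having a complete view of the partition it serves.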
Application: DDoS Blackholing

• Blackholing of denial-of-service attacks
  – Preconfigure a "null" route on each router
  – Identify the address of the attack victim (from a DoS-detection system)
  – RCP assigns the destination address to the null route

[Figure: traffic analysis detects an attack on victim 1.2.3.4; the RCP tells the routers over iBGP to "use the null route for 1.2.3.4/32"]
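The blackholing step might be sketched as follows: the RCP maps the victim's /32 to the next-hop address that each router has preconfigured to discard traffic (the message format, router names, and the null next-hop address are illustrative stand-ins, not the real iBGP encoding):

```python
# Address each router preconfigures to a discard interface (illustrative;
# 192.0.2.0/24 is the TEST-NET range reserved for documentation).
NULL_NEXT_HOP = "192.0.2.1"

def blackhole_updates(routers, victim_prefix):
    """Build the iBGP-style updates the RCP would send so that every
    router drops traffic destined to the victim."""
    return [{"router": r, "prefix": victim_prefix, "next_hop": NULL_NEXT_HOP}
            for r in routers]

updates = blackhole_updates(["r1", "r2", "r3"], "1.2.3.4/32")
```

The maintenance dry-out and customized egress selection applications below follow the same pattern: the RCP changes only the route it announces to each router, and the routers forward accordingly.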
Application: Maintenance Dry-Out

• Dry-out of traffic before maintenance
  – Plan to take a router out of service
  – RCP assigns routes via new egress points in advance

[Figure: router r is about to undergo maintenance; traffic to destination d currently exits via r, and the RCP tells the routers over iBGP to "use the route via s for d" before the maintenance begins]
Application: Customized Egress Selection

• Customer-controlled selection of egress points
  – A customer with two data centers and many sites
  – The customer wants to control the load balance
  – RCP customization (not simply the closest egress)

[Figure: for destination d reachable via egresses r and s, the RCP tells site #1's router to "use the route via r for d" and site #2's router to "use the route via s for d"]
Conclusion

• Managing IP networks is too hard
  – The IP architecture was not designed for management
  – Complex, distributed operation of the routers
• Reducing complexity is the key
  – Network-wide views and objectives, and direct control
  – Removing control logic and state from the routers
• The new architecture is feasible
  – The RCP is deployable, scalable, and reliable
  – The RCP solves important operations problems