
SDN: Google's B4 and Traffic Engineering · 2016-04-04

Date post: 25-Jul-2020

SDN: Google's B4 and Traffic Engineering


Outline

1 B4: Experience with a Globally-Deployed Software Defined WAN

2 Achieving high utilization with software-driven WAN


Introduction

Modern WANs are critical to performance, reliability

Typically provisioned to 30-40% average utilization (at the cost of 2-3x bandwidth over-provisioning).

Overheads + high bandwidth requirement.


Introduction

Google’s WAN, one of the largest in the Internet.

Delivers a range of services such as search, video, cloud computing, etc.

Architecturally two distinct WANs

1 User-facing network: peers with other Internet domains and carries user traffic.

2 B4

◮ Connectivity between data centers.

◮ 90% of internal traffic runs on this network.

e.g. asynchronous data copies, end-user data replication, etc.

Why two different WANs? Different requirements (e.g. priority, latency, etc.)

Internet traffic continues to grow rapidly, but Google’s WAN traffic grows even faster.


Introduction

SDN approach for DC WAN interconnect.

Motivation:

◮ Deploy routing and TE protocols customized to Google’s unique requirements.

Design goals:

◮ Treat failures as common events.

◮ Switches provide a programmatic interface under central control.


Introduction

Why SDN based solution?

Limitations with traditional WAN architectures.

Elastic bandwidth demands: the majority of traffic is tolerant to transient failures

Moderate number of sites: a few dozen data centers

End application control: control the network at every level with more flexibility, thus reducing over-provisioning of resources

Cost sensitivity: nearly impossible to match the growing demand with traditional approaches

Others include the success of SDN and OpenFlow (OF), rapid iteration on novel protocols, improved capacity planning, scalability, flexibility, etc.


Introduction

Manage switches using SDN principles

SDN application: supports standard routing protocols + a centralized TE service

◮ Edge servers make decisions based on resource availability.

◮ Use multipath forwarding based on application priority.

◮ Dynamically reallocate bandwidth on link/switch failures.

This makes it possible to achieve:

◮ near 100% link utilization on many B4 links

◮ 70% average utilization across all links

(i.e. 2-3x efficiency improvement vs. standard practice)


Design - Overview

Logically, a three-layered architecture.

Switch hardware layer - the B4 WAN consists of multiple sites; within each site, the switch hardware layer forwards traffic.

Site controller layer - consists of Network Control Servers (NCS) hosting both OpenFlow Controllers (OFC) and Network Control Applications (NCAs).

- OFC maintains network state based on NCA directives

- Paxos for fault tolerance of individual servers

Global layer - logically centralized applications such as the SDN Gateway and the central TE server.

- enables central control of the entire network

- SDN Gateway provides abstractions to the TE server


Design - Overview

Options for integrating existing routing protocols with centralized traffic engineering:

Approach 1: Build one integrated, centralized service combining both routing and TE

Approach 2: Build routing and centralized TE as separate independent services

Which one would you prefer?


Design - Overview

Approach 2: Build routing and centralized TE as separate independent services.

Why?

Focus on SDN infrastructure development.

Debug SDN architecture before adding new features.

TE layer sits on top of routing protocols

BIG RED BUTTON to disable TE (back to shortest path forwarding)


Design - Switch Design

Conventional design needs deep buffers, large forwarding tables, and hardware support for high availability (HA).

For B4, Google avoids these requirements by:

◮ adjusting transmission rates through careful endpoint management

◮ having a modest number of DCs + abstraction = smaller forwarding tables

◮ moving software functionality from switches to upper layers

Need for custom switches

Switches that could export low-level control over switch forwarding behavior


Design - Switch Design

High-radix switch - deploying fewer, larger switches ⇒ easier management and better software scalability

B4 switches - use multiple merchant silicon switch chips in a two-stage Clos topology

Figure: High-radix switch


Design - Network Control Functionality

The majority of control functionality runs on NCS

Paxos handles leader election for all control functionality

◮ Failure detection

◮ New leader election

Modified Onix for OFC

◮ OFC keeps network state in the Network Information Base (NIB)

e.g. topology info, trunk configs, link status, etc.


Design - Routing

How to integrate OpenFlow-based switches with existing routing protocols?

Google chose the Quagga stack for BGP/ISIS on NCS.

Developed an SDN application called "Routing Application Proxy (RAP)".

RAP provides connectivity between Quagga and OF switches for:

◮ BGP/ISIS route updates

◮ routing-protocol packets flowing between switches and Quagga

◮ interface updates from the switches to Quagga


Traffic Engineering

Goal: share bandwidth among competing applications/flow-groups

Objective function: max-min fair allocation


Traffic Engineering

Notions

Network Topology: a graph that represents sites as vertices and site-to-site connectivity as edges.

Flow Group (FG): applications are aggregated into flow groups, each defined by a {source site, dest site, QoS} tuple.

Tunnel (T): a site-level path in the network, i.e. a sequence of sites (A ⇒ B ⇒ C)

Tunnel Group (TG): maps an FG to a set of tunnels (T) and corresponding weights.
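The notions above can be sketched as plain data types. This is only an illustrative model: the class names, field names, and the `is_consistent` check are assumptions made for the example, not B4's actual data model.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class FlowGroup:
    """Traffic aggregate identified by {source site, dest site, QoS}."""
    source_site: str
    dest_site: str
    qos: str


@dataclass(frozen=True)
class Tunnel:
    """Site-level path, e.g. ("A", "B", "C")."""
    sites: Tuple[str, ...]


@dataclass
class TunnelGroup:
    """Maps one FG to a set of tunnels and their split weights."""
    flow_group: FlowGroup
    weights: Dict[Tunnel, float]

    def is_consistent(self) -> bool:
        # Hypothetical sanity check: every tunnel must start and end at
        # the FG's sites, and the split weights must sum to 1.
        endpoints_ok = all(
            t.sites[0] == self.flow_group.source_site
            and t.sites[-1] == self.flow_group.dest_site
            for t in self.weights
        )
        return endpoints_ok and abs(sum(self.weights.values()) - 1.0) < 1e-9


fg = FlowGroup("A", "C", "high")
tg = TunnelGroup(fg, {Tunnel(("A", "B", "C")): 0.75, Tunnel(("A", "C")): 0.25})
print(tg.is_consistent())
```

Here 75% of the FG's traffic would follow A ⇒ B ⇒ C and 25% the direct A ⇒ C tunnel.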


Traffic Engineering

Figure: Overview of Traffic Engineering


TE - Bandwidth Functions

Associate a bandwidth function with every application

Admin-specified static weights (the slopes of the functions)

Allocate bandwidth based on each flow’s relative priority (fair share)
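A minimal sketch of one way to read a bandwidth function: bandwidth grows linearly with fair share at a slope equal to the application's weight and flattens once the application's demand is met. The shape, the `bandwidth` helper, and the two applications with their weights and demands are all assumptions for illustration.

```python
def bandwidth(fair_share: float, weight: float, demand: float) -> float:
    """Bandwidth granted at a given fair share.

    Rises with slope `weight` and stays flat once `demand` is reached.
    """
    return min(weight * fair_share, demand)


# Hypothetical applications: name -> (weight, demand in Gbps).
APPS = {"app1": (10.0, 15.0), "app2": (1.0, 5.0)}


def total_at(fair_share: float) -> float:
    """Total bandwidth consumed by all applications at one fair share."""
    return sum(bandwidth(fair_share, w, d) for w, d in APPS.values())


print(total_at(1.0))  # app1 gets 10 Gbps, app2 gets 1 Gbps
```

At the same fair share the higher-weight application receives proportionally more bandwidth until its demand is satisfied.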


TE - Max-Min Fair Allocation

Formal definition:

Resources are allocated to sources in order of increasing demand

No source gets a resource share larger than its demand

Sources with unsatisfied demand get an equal share of the resource

S. Keshav (1997). An Engineering Approach to Computer Networking, pp. 215-217. Addison-Wesley, Reading, MA.


TE - Max-Min Fair Allocation

Figure: Example of Max-Min Fair Allocation

1 Assign $\frac{10\ \text{Mbps}}{4\ \text{flows}} = 2.5$ Mbps per flow

2 Sum the over-assigned amount (residual): flow 1 is over-assigned by 0.5 Mbps

3 Assign $\frac{\text{Residual}}{\text{No. of under-assigned flows}}$ to each under-assigned flow: $0.5/3 \approx 0.1667$ Mbps

4 Repeat steps 2 and 3 with the new residual until no residual is left or no demand remains unsatisfied

Final assignment: Flow 1 = 2 Mbps, Flow 2 = 2.6 Mbps, Flow 3 = 2.7 Mbps, Flow 4 = 2.7 Mbps
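The steps above can be sketched as a short loop. The per-flow demands appear only in the figure, so the demands used here (2, 2.6, 4, and 5 Mbps) are an assumption chosen to be consistent with the 10 Mbps capacity and the final assignment; the function name is likewise illustrative.

```python
from typing import List

EPS = 1e-9


def max_min_fair(capacity: float, demands: List[float]) -> List[float]:
    """Max-min fair allocation by repeated redistribution of the residual."""
    alloc = [0.0] * len(demands)
    active = list(range(len(demands)))  # flows still below their demand
    residual = capacity
    while residual > EPS and active:
        share = residual / len(active)  # equal split of what is left
        residual = 0.0
        for i in active:
            alloc[i] += share
            if alloc[i] > demands[i]:
                # Collect the over-assigned amount and cap the flow.
                residual += alloc[i] - demands[i]
                alloc[i] = demands[i]
        active = [i for i in active if alloc[i] < demands[i] - EPS]
    return alloc


# Assumed demands in Mbps (not stated in the text).
result = max_min_fair(10.0, [2.0, 2.6, 4.0, 5.0])
print([round(a, 3) for a in result])
```

With these assumed demands the loop runs three rounds: the first caps flow 1, the second caps flow 2, and the third splits the last residual between flows 3 and 4.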


TE - Weighted Max-Min Fair Allocation

Figure: Example of Weighted Max-Min Fair Allocation

1 Normalize weights (so that the smallest weight is 1): W = [5, 8, 1, 2]

2 Unit share $= \frac{\text{Total resource}}{\text{Sum of normalized weights}} = \frac{16}{16} = 1$

3 Assign every flow (unit share × normalized weight of the flow) units of resource

4 Calculate the over-assigned resources and repeat steps 1, 2, 3, and 4 with this residual

Final assignment: Flow 1 = 4 Mbps, Flow 2 = 2 Mbps, Flow 3 = 4 Mbps, Flow 4 = 6 Mbps
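A sketch of the weighted procedure, under stated assumptions: the per-flow demands appear only in the figure, so the demands below (4, 2, 10, and 6 Mbps) are hypothetical values picked to be consistent with W = [5, 8, 1, 2], the total of 16, and the final assignment listed above. The function name is also illustrative.

```python
from typing import List

EPS = 1e-9


def weighted_max_min_fair(
    capacity: float, demands: List[float], weights: List[float]
) -> List[float]:
    """Weighted max-min fair allocation by redistributing the residual."""
    smallest = min(weights)
    norm = [w / smallest for w in weights]  # step 1: smallest weight becomes 1
    alloc = [0.0] * len(demands)
    active = list(range(len(demands)))      # flows still below their demand
    residual = capacity
    while residual > EPS and active:
        unit = residual / sum(norm[i] for i in active)  # step 2: unit share
        residual = 0.0
        for i in active:
            alloc[i] += unit * norm[i]                  # step 3
            if alloc[i] > demands[i]:                   # step 4: collect excess
                residual += alloc[i] - demands[i]
                alloc[i] = demands[i]
        active = [i for i in active if alloc[i] < demands[i] - EPS]
    return alloc


# Assumed demands in Mbps (not stated in the text).
result = weighted_max_min_fair(16.0, [4.0, 2.0, 10.0, 6.0], [5.0, 8.0, 1.0, 2.0])
print([round(a, 3) for a in result])
```

In the first round flows 1 and 2 receive 5 and 8 units against demands of 4 and 2, so 7 units return to the pool and are shared between flows 3 and 4 in proportion to their weights.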


TE - Optimization

An LP-based optimal fair-share allocation for FGs is expensive and does not scale.

The B4 team designed their own algorithm, which achieves at least 99% of the optimal bandwidth utilization and runs 25 times faster than LP.

Two main components:

1 Tunnel Group Generation: allocates bandwidth to FGs, using bandwidth functions to prioritize at bottleneck edges.

2 Tunnel Group Quantization: changes split ratios in each TG to match the granularity supported by switch hardware tables.
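To make the quantization step concrete, here is a minimal sketch that rounds split ratios to multiples of a quantum. The text does not spell out how B4 performs this step, so the sketch substitutes plain largest-remainder rounding as a stand-in; the function name and the example ratios are assumptions, and the input ratios are assumed to sum to 1.

```python
import math
from typing import List


def quantize_splits(ratios: List[float], quantum: float) -> List[float]:
    """Round split ratios to multiples of `quantum`, keeping their sum at 1."""
    total_units = round(1 / quantum)           # e.g. 4 units for quantum 1/4
    scaled = [r * total_units for r in ratios]
    units = [math.floor(s) for s in scaled]    # round every split down first
    leftover = total_units - sum(units)        # units still to hand out
    # Give the leftover units to the splits that lost the most by rounding down.
    by_remainder = sorted(
        range(len(ratios)), key=lambda i: scaled[i] - units[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        units[i] += 1
    return [u / total_units for u in units]


print(quantize_splits([0.7, 0.2, 0.1], 0.25))
```

A coarser quantum uses fewer hardware table entries but moves the installed ratios further from the ideal ones.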


TE Protocol & OF - TE State and OpenFlow

Three modes of B4 switch:

1 Encapsulating switch

2 Transit switch

3 Decapsulating switch


Source switch maps packets to an FG using <dest ip> and forwards them to the corresponding TG.

TG hashes packets to a T in the desired ratio.

Each site in the path maintains per-tunnel forwarding rules.

Source site encapsulates the packet with an outer header (i.e. tunnel ID).

Transit switch uses the tunnel ID to match rules and forwards the packet.

Decapsulating switch terminates the flow based on the tunnel ID.
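The encapsulating-switch behavior described above can be sketched in software. Everything concrete here is an assumption for illustration: the table contents, the tunnel names, the use of SHA-256 over a source/destination pair as the hash, and the dictionary standing in for the outer header. Real switches do this in hardware tables.

```python
import hashlib
import ipaddress
from typing import List, Optional, Tuple

# Hypothetical tables: destination prefix -> FG, and FG -> weighted tunnels.
FG_BY_PREFIX = {ipaddress.ip_network("10.2.0.0/16"): "FG_A_to_C"}
TG_BY_FG = {"FG_A_to_C": [("T1", 3), ("T2", 1)]}  # (tunnel ID, weight)


def pick_tunnel(flow_key: bytes, tunnels: List[Tuple[str, int]]) -> str:
    """Hash a flow onto a tunnel so that tunnels are hit in ratio of weight."""
    total = sum(weight for _, weight in tunnels)
    bucket = int.from_bytes(hashlib.sha256(flow_key).digest()[:8], "big") % total
    for tunnel_id, weight in tunnels:
        if bucket < weight:
            return tunnel_id
        bucket -= weight
    raise AssertionError("unreachable: bucket is always below the total weight")


def encapsulate(src_ip: str, dst_ip: str, payload: bytes) -> Optional[dict]:
    """Map a packet to its FG by destination IP and wrap it with a tunnel ID."""
    dst = ipaddress.ip_address(dst_ip)
    for prefix, fg in FG_BY_PREFIX.items():
        if dst in prefix:
            key = f"{src_ip}->{dst_ip}".encode()
            tunnel_id = pick_tunnel(key, TG_BY_FG[fg])
            return {"tunnel_id": tunnel_id, "inner": (src_ip, dst_ip, payload)}
    return None  # no FG matches: not TE traffic in this sketch


print(encapsulate("10.1.0.1", "10.2.0.9", b"data"))
```

Hashing on the flow key keeps all packets of one flow on the same tunnel, while many flows together approach the 3:1 split.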


TE Protocol & OF - Composing Routing and TE

B4 supports two routing services.

1 Shortest path routing (uses the Longest Prefix Match - LPM table)

2 TE (uses the Access Control List - ACL table)

Map different flows and groups to appropriate tables.

ACL takes strict precedence over LPM entries.
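A small sketch of the lookup order, assuming both tables are keyed on destination prefix only (a simplification; the table contents and action names are hypothetical). It also shows why emptying the ACL table returns traffic to shortest-path forwarding.

```python
import ipaddress
from typing import List, Optional, Tuple

Table = List[Tuple[ipaddress.IPv4Network, str]]

# Hypothetical table contents.
LPM_TABLE: Table = [
    (ipaddress.ip_network("10.0.0.0/8"), "port1"),
    (ipaddress.ip_network("10.2.0.0/16"), "port2"),
]
ACL_TABLE: Table = [(ipaddress.ip_network("10.2.0.0/16"), "tunnel_group_7")]


def lookup(dst_ip: str, acl: Table, lpm: Table) -> Optional[Tuple[str, str]]:
    """Return (table, action); any ACL hit wins over any LPM hit."""
    dst = ipaddress.ip_address(dst_ip)
    for prefix, action in acl:
        if dst in prefix:
            return ("ACL", action)
    matches = [(prefix, action) for prefix, action in lpm if dst in prefix]
    if matches:
        # Among LPM entries, the most specific prefix wins.
        _, action = max(matches, key=lambda m: m[0].prefixlen)
        return ("LPM", action)
    return None


print(lookup("10.2.3.4", ACL_TABLE, LPM_TABLE))  # TE entry takes precedence
print(lookup("10.2.3.4", [], LPM_TABLE))         # TE disabled: shortest path
```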


TE Protocol & OF - Coordinating TE State Across Sites

Figure: Overview of Traffic Engineering


TE server coordinates T/TG/FG rule installations across multiple OFCs.

TED - the Traffic Engineering Database captures the state needed to forward packets along multiple paths.

TED - a <key,value> data store.

The TE server computes the per-site TED and generates TE Ops for the OFCs.

TE Ops add, modify, or delete TED entries at the OFCs.

Each OFC converts TE Ops to flow-programming instructions and sends them to all devices in its site.

Finally, the OFC responds to the original TE Op.
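As a rough sketch of the idea, a per-site TED can be modeled as a key-value store that accepts add, modify, and delete Ops and reports success or failure. The class names, the key format, and the rejection rules are assumptions for illustration, not the actual TE Op protocol.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class TEOp:
    action: str          # "add", "modify", or "delete"
    key: str             # e.g. "tunnel:T1" (hypothetical key format)
    value: Any = None


class SiteTED:
    """Per-site TED held by an OFC, modeled as a <key, value> store."""

    def __init__(self) -> None:
        self.entries: Dict[str, Any] = {}

    def apply(self, op: TEOp) -> bool:
        """Apply one TE Op; the return value stands in for the OFC response."""
        exists = op.key in self.entries
        if op.action == "add" and not exists:
            self.entries[op.key] = op.value
        elif op.action == "modify" and exists:
            self.entries[op.key] = op.value
        elif op.action == "delete" and exists:
            del self.entries[op.key]
        else:
            return False  # rejected: unknown action or inconsistent with state
        return True


ted = SiteTED()
print(ted.apply(TEOp("add", "tunnel:T1", ("A", "B", "C"))))
print(ted.apply(TEOp("modify", "tunnel:T9", ("A", "C"))))  # no such entry
```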


TE Protocol & OF - Dependencies and Failures

Dependencies among Ops:

◮ to avoid packet drops, not all Ops can run simultaneously, e.g. configure a T at all sites before configuring the TG/FG

Synchronizing TED between TE and OFC:

◮ requires a common TED view

◮ the TE session supports this synchronization

◮ TE synchronizes TED with persistent storage to handle simultaneous failures

Ordering issues:

◮ site-specific sequence IDs assigned to TE Ops

◮ enables ordering among operations

TE Op failures:

◮ due to RPC failure, OFC rejection, etc.

◮ dirty/clean bit for each TED entry

◮ enables resuming TE Ops from the point of failure
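The ordering and failure-handling ideas above can be sketched together. This is an illustrative model only: the class names, the rule that an OFC rejects any sequence ID not higher than the last one it accepted, and the use of `ConnectionError` to stand in for an RPC failure are all assumptions.

```python
from typing import Callable, Dict, List


class SiteOFC:
    """Accepts TE Ops only in increasing sequence-ID order."""

    def __init__(self) -> None:
        self.highest_seq = 0
        self.applied: List[str] = []

    def handle(self, seq: int, key: str) -> bool:
        if seq <= self.highest_seq:
            return False  # stale or reordered Op is rejected
        self.highest_seq = seq
        self.applied.append(key)
        return True


class TEServer:
    """Tracks a dirty/clean bit per TED entry for one site."""

    def __init__(self) -> None:
        self.next_seq = 1
        self.dirty: Dict[str, bool] = {}

    def send_op(self, key: str, rpc: Callable[[int, str], bool]) -> bool:
        seq = self.next_seq
        self.next_seq += 1
        self.dirty[key] = True       # mark dirty before the Op is sent
        try:
            ok = rpc(seq, key)
        except ConnectionError:      # stands in for an RPC failure
            ok = False
        if ok:
            self.dirty[key] = False  # clean only after the OFC acknowledges
        return ok

    def dirty_keys(self) -> List[str]:
        """Entries whose Ops must be retried."""
        return [key for key, is_dirty in self.dirty.items() if is_dirty]


ofc, te = SiteOFC(), TEServer()
te.send_op("tunnel:T1", ofc.handle)
print(te.dirty_keys())
```

After a failure, retrying only the dirty entries lets the TE server resume from where it stopped instead of replaying every Op.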


Evaluation - Deployment and Evolution

Network traffic doubled in the year 2012


Observations:

1 Topology aggregation significantly reduces path churn and system load.

2 Edge removals happen multiple times a day.

3 WAN links are susceptible to frequent port flaps and benefit from dynamic centralized management.


Evaluation - TE Ops Performance

100x reduction in no. of TE Ops by caching recently used tunnels.

Reduction in failed Ops

Reduced latency


Notes:

TG Ops run for every topology change or change in demand

Growth in no. of TG Ops due to addition of network sites

Reduction in failure of TG Ops due to optimizations


Evaluation - Impact of Failures

Figure: Impact of failure between two sites

Failure of a transit router requires a longer convergence time (≈ 3.3 sec)

◮ update multi-path table entries for potentially several tunnels

◮ each update Op is slow


Evaluation - TE Algorithm Evaluation

Throughput improves as the number of paths increases

Adding more paths and using finer-granularity traffic splitting gives more flexibility to TE, but consumes more hardware table resources

B4's deployment uses TE with quantum 1/4 and 4 paths


Evaluation - Link Utilization

Utilization close to 100%

Ability to mix priority classes across all edges

Use separate edges for different classes


Figure: Per-link utilization in a trunk, demonstrating the effectiveness of hashing

For at least 75% of site-to-site edges, the max:min ratio of link utilization is:

◮ 1.05 without failures (i.e. 5% from optimal)

◮ 2.0 with failures


Conclusion

B4 now serves more traffic than Google’s public-facing WAN, with a higher growth rate.

SDN delivered cost-effective WAN bandwidth, running many links at near 100% utilization.

A hybrid approach is an effective way to introduce SDN into existing deployments.

Leveraging control at the edge increases WAN utilization and improves fault tolerance.
