© Armir Bujari [email protected] Universita Degli Studi di Padova Data Center Networking 1
Transcript
Page 1: Data Center Networking - MathUniPDabujari/fis1819/lecSlides/dcn.pdfData Center Networking 1 2 What are Data Centers? •Large facilities with 10s of thousands of networked servers

© Armir Bujari – [email protected]

Universita Degli Studi di Padova

Data Center Networking

Page 2

What are Data Centers?

•Large facilities with 10s of thousands of networked servers

• Compute, storage, and networking working in concert

• “Warehouse-Scale Computers”

Page 3

Types of Data Centers

•Specialized data centers built for one big app

• Social networking: Facebook

• Web Search: Google, Bing

•“Cloud” data centers

• Amazon EC2, Windows Azure

• Google App Engine

Page 4

Cloud Computing

•On-demand

• Use resources when you need them; pay-as-you-go

•Elastic

• Scale up & down based on demand

•Multi-tenancy

• Multiple independent users share infrastructure

• Security and resource isolation

• SLAs on performance & reliability (sometimes)

•Dynamic Management

• Resiliency: isolate failure of servers and storage

• Workload movement: move work to other locations

Page 5

Data Centers with 100,000+ Servers

[Figure: data centers of Microsoft, Google, and Facebook]

Page 6

Page 7

These things are really big

100 billion searches per month

120+ million users

1.15 billion users

10-100K servers

100s of Petabytes of storage

100s of Terabits/s of bandwidth (more than the core of the Internet)

10-100MW of power (1-2% of global energy consumption)

100s of millions of dollars

Page 8

Google Datacenter Infrastructure

Page 9

Datacenter Traffic Growth

DCN bandwidth growth demanded much more

Source: “Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network”, SIGCOMM 2015.

Page 10

What’s Different about DCNs?

[Figure: Servers attached to the data center Fabric, which connects to the Internet]

Page 11

What’s Different about DCNs?

•Single administrative domain

•No need to be compatible with outside world

•Tiny round trip times (microseconds)

•Latency/tail latency critical

•Massive multipath topologies

•Shallow buffers

•Backplane for large-scale parallel computation

Page 12

Example: Web Search

Partition/Aggregate App Structure

[Figure: a query for “Picasso” fans out from the top-level aggregator (TLA) to mid-level aggregators (MLAs) and on to the worker nodes. Each worker returns matching quotes (e.g. “Art is a lie…”, “The chief…”), and the aggregators merge them into ranked lists on the way back up.]

• Strict deadlines

• Tail Latency Matters

Deadline = 250ms (TLA)

Deadline = 50ms (MLA)

Deadline = 10ms (worker nodes)
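A rough sketch of why tail latency matters in a partition/aggregate tree. It assumes, purely for illustration, that workers are independent and each answers within its deadline with the same probability; neither the independence assumption nor the 0.99 figure comes from the slides. The aggregate answer is complete only if every worker is on time.

```python
def prob_all_on_time(p_worker: float, fanout: int) -> float:
    """Probability that all `fanout` workers meet the deadline.

    Assumes independent workers that each meet the deadline with
    probability p_worker (an illustrative model, not measured data).
    """
    if not 0.0 <= p_worker <= 1.0:
        raise ValueError("p_worker must be in [0, 1]")
    if fanout < 0:
        raise ValueError("fanout must be non-negative")
    return p_worker ** fanout


if __name__ == "__main__":
    # Even a 1% per-worker miss rate hurts badly at high fan-out.
    for fanout in (1, 10, 100):
        print(fanout, round(prob_all_on_time(0.99, fanout), 3))
```

Under these assumptions, a worker that is late 1% of the time leaves roughly a one-in-three chance that a 100-way fan-out completes fully on time.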

Page 13

Data Center Challenges

• Massive bisection bandwidth

• Topologies

• Load balancing

• Optics

• Ultra-Low latency (<10 microseconds)

• Scheduling

• Centralized or distributed control?

• Managing resources across network & servers

• Multi-tenant performance isolation

• App-aware network scheduling (e.g. for big data)

• Next-generation hardware

• RDMA, Rack-Scale Computing

Page 14

Data Center Costs

Amortized Cost*   Component              Sub-Components
~45%              Servers                CPU, memory, disk
~25%              Power infrastructure   UPS, cooling, power distribution
~15%              Power draw             Electrical utility costs
~15%              Network                Switches, links, transit

*3 yr amortization for servers, 15 yr for infrastructure; 5% cost of money

Source: “The Cost of a Cloud: Research Problems in Data Center Networks”, Greenberg, Hamilton, Maltz, Patel. SIGCOMM CCR 2009.

Page 15

Server Costs

Ugly secret: 30% utilization considered “good” in data centers

• Uneven application fit

• Each server has CPU, memory, disk: most applications exhaust one resource, stranding the others

• Long provisioning timescales

• New servers purchased quarterly at best

• Uncertainty in demand

• Demand for a new service can spike quickly

• Risk management

• Not having spare servers to meet demand brings failure just when success is at hand

• Session state and storage constraints

• If the world were stateless servers, life would be good

Page 16

Goal: Agility – Any service, Any Server

•Turn the servers into a single large fungible pool

• Dynamically expand and contract service footprint as needed

•Benefits

• Increase service developer productivity

• Lower cost

• Achieve high performance and reliability

The 3 motivators of most infrastructure projects

Page 17

Achieving Agility

•Workload management

• Means for rapidly installing a service’s code on a server

• Virtual machines, disk images, containers

•Storage Management

• Means for a server to access persistent data

• Distributed filesystems (e.g., HDFS, blob stores)

•Network

• Means for communicating with other servers, regardless of where they are in the data center

Page 18

More Software…

[Figure: a stack of Specialized Features, a Specialized Control Plane, and Specialized Hardware, with these annotations:]

• Tens of millions of lines of code. Closed, proprietary, outdated.

• Hundreds of protocols; 6,500 RFCs.

• Billions of gates. Power hungry and bloated.

Page 19

Software Defined Networking

[Figure: packet-forwarding elements, each paired with its own control logic, shown alongside a control plane that holds a global network map and runs control programs]

Page 20

Conventional DC Network

[Figure: the Internet connects to core routers (CR), which feed access routers (AR) in the DC layer 3 domain; below the ARs, Ethernet switches (S) form the DC layer 2 domain down to the racks of servers (A)]

Key:

• CR = Core Router (L3)

• AR = Access Router (L3)

• S = Ethernet Switch (L2)

• A = Rack of app. servers

~1,000 servers/pod == IP subnet

Reference: “Data Center: Load balancing Data Center Services”, Cisco, 2004

Page 21

Layer 2 vs. Layer 3

•Ethernet switching (layer 2)

✓Fixed addresses and auto-configuration (plug & play)

✓Seamless mobility, migration, and failover

x Broadcast limits scale (ARP)

x Spanning Tree Protocol

•IP routing (layer 3)

✓Scalability through hierarchical addressing

✓Multipath routing through equal-cost multipath

x More complex configuration

x Can’t migrate w/o changing IP address
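Equal-cost multipath, listed above as a layer 3 advantage, is commonly realized by hashing a flow's 5-tuple and using the result to choose among the equal-cost next hops. The sketch below illustrates that idea only: the function name, the field order, and the use of SHA-256 are assumptions made for readability (hardware switches typically use much cheaper hash functions).

```python
import hashlib


def ecmp_next_hop(flow, next_hops):
    """Choose a next hop for a flow by hashing its 5-tuple.

    `flow` is (src_ip, dst_ip, protocol, src_port, dst_port).
    Every packet of the same flow hashes to the same next hop,
    which keeps a flow on one path and avoids reordering.
    """
    if not next_hops:
        raise ValueError("need at least one next hop")
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]


if __name__ == "__main__":
    paths = ["spine-1", "spine-2", "spine-3", "spine-4"]  # hypothetical names
    flow = ("10.0.0.1", "10.0.1.7", 6, 40000, 80)         # hypothetical flow
    print(ecmp_next_hop(flow, paths))
```

Because the choice is per flow rather than per packet, two large flows can still hash onto the same path, which is one reason load balancing appears among the data center challenges.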

Page 22

Conventional DC Network Problems

[Figure: the CR / AR / S hierarchy annotated with oversubscription ratios of ~5:1, ~40:1, and ~200:1 at successive levels]

•Dependence on high-cost proprietary routers

•Extremely limited server-to-server capacity
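A small sketch of what oversubscription ratios such as these imply for server-to-server capacity. The 1 Gbps NIC speed and the rack layout in the example are hypothetical; only the ratios come from the slide.

```python
def oversubscription(downlink_gbps_total: float, uplink_gbps_total: float) -> float:
    """Ratio of capacity facing the servers to capacity facing the core."""
    if uplink_gbps_total <= 0:
        raise ValueError("uplink capacity must be positive")
    return downlink_gbps_total / uplink_gbps_total


def worst_case_per_server_mbps(nic_mbps: float, ratio: float) -> float:
    """Bandwidth each server gets if all servers send across the
    oversubscribed layer at the same time."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return nic_mbps / ratio


if __name__ == "__main__":
    # Hypothetical rack: 40 servers at 1 Gbps sharing 8 Gbps of uplink.
    print(oversubscription(40, 8))
    # Hypothetical 1 Gbps NICs under the ratios shown on the slide.
    for ratio in (5, 40, 200):
        print(ratio, worst_case_per_server_mbps(1000, ratio))
```

With these assumed NICs, traffic that crosses the most oversubscribed layer is limited to about 5 Mbps per server when everyone sends at once.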

Page 23

Some DC network designs…

Fat-tree [SIGCOMM’08]

Jellyfish (random) [NSDI’12]

BCube [SIGCOMM’09]

Page 24

Transport inside the DC

• Internet: 100Kbps–100Mbps links, ~100ms latency

• Data center fabric (between servers): 10–40Gbps links, ~10–100μs latency

Page 25

Transport inside the DC

Interconnect for distributed compute workloads

[Figure: servers on the fabric running web, app, database, map-reduce, HPC, monitoring, and cache workloads]

Page 26

What’s Different About DC Transport?

•Network characteristics

• Very high link speeds (Gb/s); very low latency (microseconds)

•Application characteristics

• Large-scale distributed computation

•Challenging traffic patterns

• Diverse mix of mice & elephants

• Incast

•Cheap switches

• Single-chip shared-memory devices; shallow buffers

Page 27

DC Traffic Characteristics

• Instrumented a large cluster used for data mining and identified distinctive traffic patterns

• Traffic patterns are highly volatile

• A large number of distinctive patterns even in a day

• Traffic patterns are unpredictable

• Correlation between patterns very weak

Page 28

Data Center Workloads

Mice & Elephants

•Short messages (e.g., query, coordination) → Low Latency

•Large flows (e.g., data update, backup) → High Throughput

Page 29

Comments: 1500 node cluster in a data center that supports data mining on petabytes of data. The servers are distributed roughly evenly across 75 ToR switches, which are connected hierarchically in a C-DC

Figure: Mice are numerous; 99% of flows are smaller than 1 MB. However, more than 90% of bytes are in flows between 100 MB and 1 GB

Page 30

Incast

• Synchronized fan-in congestion

[Figure: Workers 1–4 respond to the Aggregator at the same time; a lost response is recovered only after a TCP timeout, with RTOmin = 300 ms]

Vasudevan et al. (SIGCOMM’09)
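Some back-of-the-envelope arithmetic shows why a single timeout is so costly at data center speeds. The 1 MB response and 1 Gbps link below are hypothetical values chosen for illustration; only RTOmin = 300 ms comes from the slide, and the model ignores everything except transfer time and idle timeout time.

```python
def goodput_fraction(bytes_sent: int, link_bps: float, timeouts: int,
                     rto_min_s: float = 0.300) -> float:
    """Fraction of the link rate achieved when `timeouts` retransmission
    timeouts of rto_min_s seconds stall an otherwise ideal transfer."""
    if link_bps <= 0 or bytes_sent <= 0:
        raise ValueError("bytes_sent and link_bps must be positive")
    ideal_s = bytes_sent * 8 / link_bps          # time with no stalls
    return ideal_s / (ideal_s + timeouts * rto_min_s)


if __name__ == "__main__":
    # 1 MB over 1 Gbps takes about 8 ms; one 300 ms timeout dwarfs that.
    print(goodput_fraction(1_000_000, 1e9, timeouts=0))
    print(goodput_fraction(1_000_000, 1e9, timeouts=1))
```

In this simplified model one timeout cuts effective goodput to under 3% of the link rate, since the link sits idle for far longer than the transfer itself takes.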

Page 31

DC Transport Requirements

1. Low Latency

– Short messages, queries

2. High Throughput

– Continuous data updates, backups

3. High Burst Tolerance

– Incast

The challenge is to achieve these together

Page 32

High Throughput Low Latency

Baseline fabric latency (propagation + switching): 10 microseconds

Page 33

Data Center TCP

Page 34

TCP in the Data Center

•TCP [Jacobson et al.’88] is widely used in the data center

• More than 99% of the traffic

•Operators work around TCP problems

‒ Ad-hoc, inefficient, often expensive solutions

‒ TCP is deeply ingrained in applications

Practical deployment is hard → keep it simple!

Page 35

Review: The TCP Algorithm

[Figure: Sender 1 and Sender 2 send to one Receiver; plot of window size (rate) vs. time]

Additive Increase: W → W+1 per round-trip time

Multiplicative Decrease: W → W/2 per drop or ECN mark

ECN = Explicit Congestion Notification; an ECN mark is 1 bit
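The additive-increase / multiplicative-decrease rule can be traced with a few lines of code. This is a minimal sketch of the rule as stated on the slide, not a TCP implementation: slow start, timeouts, and per-ACK behavior are left out, and the floor of one segment is an assumption.

```python
def aimd_trace(initial_window: float, congestion_rtts: set, rtts: int) -> list:
    """Window size at the start of each RTT.

    The window grows by 1 per RTT and is halved in any RTT listed in
    `congestion_rtts` (a drop or ECN mark was seen in that RTT).
    """
    window = float(initial_window)
    trace = []
    for rtt in range(rtts):
        trace.append(window)
        if rtt in congestion_rtts:
            window = max(1.0, window / 2)   # multiplicative decrease
        else:
            window += 1.0                   # additive increase
    return trace


if __name__ == "__main__":
    # Congestion is signaled during RTT 3.
    print(aimd_trace(10, {3}, 6))
```

Repeating the congestion signal periodically produces the familiar sawtooth in window size.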

Page 36

TCP Buffer Requirement

•Bandwidth-delay product rule of thumb:

• A single flow needs C×RTT buffers for 100% throughput.

[Figure: two panels plotting buffer occupancy and throughput over time. With B ≥ C×RTT, throughput stays at 100%; with B < C×RTT, throughput falls below 100%.]
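The C×RTT rule of thumb is easy to evaluate numerically. The sketch below plugs in link speeds and latencies of the kind quoted earlier for the Internet and for the data center fabric; treating those latency figures as round-trip times is an assumption made for illustration.

```python
def bdp_bytes(link_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product C x RTT, in bytes."""
    if link_bps < 0 or rtt_s < 0:
        raise ValueError("link rate and RTT must be non-negative")
    return link_bps * rtt_s / 8


if __name__ == "__main__":
    # Data center: 10 Gbps link, 100 microsecond RTT -> about 125 KB.
    print(bdp_bytes(10e9, 100e-6))
    # Wide area: 100 Mbps link, 100 ms RTT -> about 1.25 MB.
    print(bdp_bytes(100e6, 100e-3))
```

Although data center links are far faster, the microsecond-scale RTT keeps the product small, which is part of why shallow-buffered switches are plausible at all.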

Page 37

Reducing Buffer Requirements

•Appenzeller et al. (SIGCOMM ‘04):

• Large # of flows: C×RTT/√N is enough.

[Figure: window size (rate), buffer size, and throughput (100%) plotted over time]

Page 38

Reducing Buffer Requirements

•Appenzeller et al. (SIGCOMM ‘04):

• Large # of flows: C×RTT/√N is enough

•Can’t rely on stat-mux benefit in the DC.

• Measurements show typically only 1-2 large flows at each server

Key Observation: Low variance in sending rate → Small buffers suffice
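The Appenzeller et al. result sizes the buffer as C×RTT/√N for N long-lived flows. A short sketch makes the point about statistical multiplexing concrete; the link speed, RTT, and flow counts are hypothetical values for illustration.

```python
import math


def buffer_for_n_flows(link_bps: float, rtt_s: float, n_flows: int) -> float:
    """Buffer in bytes under the C x RTT / sqrt(N) rule."""
    if n_flows < 1:
        raise ValueError("need at least one flow")
    return link_bps * rtt_s / 8 / math.sqrt(n_flows)


if __name__ == "__main__":
    link_bps, rtt_s = 10e9, 100e-6   # hypothetical 10 Gbps link, 100 us RTT
    full = buffer_for_n_flows(link_bps, rtt_s, 1)
    for n in (1, 2, 10_000):
        share = buffer_for_n_flows(link_bps, rtt_s, n) / full
        print(n, f"{share:.1%} of C x RTT")
```

With 10,000 flows the rule asks for 1% of C×RTT, but with only 2 large flows it still asks for about 71%, so the saving largely disappears in the data center setting described above.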

Page 39

DCTCP: Main Idea

➢Extract multi-bit feedback from single-bit stream of ECN marks

• Reduce window size based on fraction of marked packets.

ECN Marks             TCP                 DCTCP
1 0 1 1 1 1 0 1 1 1   Cut window by 50%   Cut window by 40%
0 0 0 0 0 0 0 0 0 1   Cut window by 50%   Cut window by 5%

[Figure: window size (bytes) vs. time (sec), one panel for TCP and one for DCTCP]
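The DCTCP column in the ECN-mark examples follows from cutting the window by half the fraction of marked packets. The sketch below checks that arithmetic; using the raw fraction from a single window in place of the running average is a simplification made here.

```python
def dctcp_cut(marks: list) -> float:
    """Fractional window reduction for one window of ECN marks
    (1 = marked, 0 = unmarked): half the marked fraction."""
    if not marks:
        raise ValueError("need at least one packet")
    return sum(marks) / len(marks) / 2


def tcp_cut(marks: list) -> float:
    """Standard TCP halves the window if any packet is marked."""
    return 0.5 if any(marks) else 0.0


if __name__ == "__main__":
    for marks in ([1, 0, 1, 1, 1, 1, 0, 1, 1, 1], [0] * 9 + [1]):
        print(marks, f"TCP {tcp_cut(marks):.0%}", f"DCTCP {dctcp_cut(marks):.0%}")
```

Eight marks out of ten give a 40% cut and one mark out of ten gives a 5% cut, while TCP cuts by 50% in both cases.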

Page 40

DCTCP: Algorithm

Switch side:

• Mark packets when Queue Length > K.

[Figure: buffer of size B with marking threshold K; mark above K, don’t mark below]

Sender side:

– Maintain running average of fraction of packets marked (α).

each RTT: F = (# of marked ACKs) / (Total # of ACKs),  α ← (1 − g)α + gF

➢ Adaptive window decreases: W ← (1 − α/2)W

– Note: decrease factor between 1 and 2.
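The switch and sender rules can be sketched together as follows. This is an illustrative model under stated assumptions, not a transport implementation: the class and function names are invented here, the default g = 1/16 is an assumed parameter choice, and the +1 growth in unmarked RTTs borrows the additive increase from the TCP review rather than from this slide.

```python
def should_mark(queue_length: float, k: float) -> bool:
    """Switch side: mark a packet when the queue length exceeds K."""
    return queue_length > k


class DctcpSender:
    """Sender side: track alpha and cut the window in proportion to it."""

    def __init__(self, window: float, g: float = 1 / 16):
        self.window = float(window)
        self.alpha = 0.0      # running average of the marked fraction
        self.g = g            # averaging weight (assumed default)

    def on_rtt_end(self, marked_acks: int, total_acks: int) -> float:
        if total_acks <= 0:
            raise ValueError("total_acks must be positive")
        f = marked_acks / total_acks
        self.alpha = (1 - self.g) * self.alpha + self.g * f
        if marked_acks > 0:
            # Adaptive decrease: W <- (1 - alpha/2) * W
            self.window = max(1.0, self.window * (1 - self.alpha / 2))
        else:
            self.window += 1.0   # assumed TCP-style additive increase
        return self.window


if __name__ == "__main__":
    sender = DctcpSender(window=100)
    for marked in (0, 2, 10, 10, 0):
        print(marked, round(sender.on_rtt_end(marked, 10), 2), round(sender.alpha, 3))
```

Since α lies between 0 and 1, the window is divided by a factor between 1 and 2, matching the note on the slide: light marking yields a gentle cut and persistent marking approaches TCP's halving.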

Page 41

DCTCP vs TCP

Experiment: 2 flows (Win 7 stack), Broadcom 1Gbps Switch

ECN Marking Thresh = 30KB

[Figure: queue length (KBytes, axis 0–700) vs. time (seconds) for TCP and DCTCP, 2 flows each]

Buffer is mostly empty

DCTCP mitigates Incast by creating a large buffer headroom

Page 42

Why it Works

1. Low Latency

✓ Small buffer occupancies → low queuing delay

2. High Throughput

✓ ECN averaging → smooth rate adjustments, low variance

3. High Burst Tolerance

✓ Large buffer headroom → bursts fit

✓ Aggressive marking → sources react before packets are dropped

