Net-Centric 2017
Data-center network (DCN) architectures with Reduced Power Consumption
“Flow/Application triggered SDN controlled electrical/optical hybrid switching data-center network: HOLST”
Satoru Okamoto, Keio University
Co-Authors and Acknowledgement
• Co-Authors
  – Yukihiro Imakiire, Masayuki Hirono, and Naoaki Yamanaka (Keio University)
• Acknowledgement
  – This work is partly supported by the “HOLST (High-speed Optical Layer 1 Switch system for Time slot switching based optical data center networks) Project” funded by the New Energy and Industrial Technology Development Organization (NEDO) of Japan.
Outline
• Data-center Electricity Consumption
• Data-center network architecture
  – Leaf-Spine Electrical Switching
  – Optical data-center network
    • Optical Circuit Switching
    • Optical Slot Switching
• HOLST data-center network
• Summary
Data-Center (DC) Electricity Consumption
http://www.datacenterknowledge.com/archives/2016/06/27/heres-how-much-energy-all-us-data-centers-consume
70 billion kWh in 2014 (2 % of US consumption)
4 % increase per 5 years
Breakdown of the Power Consumption in DC
[Figure labels: Cooling, Converter Loss (DC/DC), Equipment]
• Server + Storage 33 %
• Network 17 %
Basic Data-Center Network (DCN) Architecture
• Leaf-Spine Architecture (Layer 2 or Layer 3)
  – DCN capacity can be adjusted by changing the number of Spine Switches.
[Figure labels: Spine, Leaf, Servers, Storage, Internet]
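The effect of the number of Spine Switches on DCN capacity can be illustrated with a small calculation. This is a minimal sketch, assuming every Leaf has exactly one uplink of equal rate to each Spine; the function name and the example numbers (40 Gbps uplinks, 30 servers per Leaf at 10 Gbps) are illustrative assumptions, not figures from the talk.

```python
def leaf_spine_capacity(num_leaves, num_spines, uplink_gbps,
                        servers_per_leaf, server_gbps):
    """Return (total Leaf-to-Spine capacity in Gbps, oversubscription ratio).

    Assumption: each Leaf has one uplink of `uplink_gbps` to every Spine.
    """
    uplink_per_leaf = num_spines * uplink_gbps          # capacity towards Spines
    downlink_per_leaf = servers_per_leaf * server_gbps  # capacity towards servers
    total_capacity = num_leaves * uplink_per_leaf
    oversubscription = downlink_per_leaf / uplink_per_leaf
    return total_capacity, oversubscription


# Doubling the number of Spines doubles capacity and halves oversubscription.
for spines in (2, 4):
    print(spines, leaf_spine_capacity(4, spines, 40, 30, 10))
```

Under these assumptions, adding Spines scales the Leaf-to-Spine capacity linearly without touching the Leaf layer.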
Power consumption: Optical vs. Electrical
Source: S. Aleksic, IEEE/OSA Journal of Optical Communications and Networking, Vol. 1, No. 3, pp. 245-258, 2009.
1/500 (1000 kW → 2 kW)
MEMS-based Optical Circuit Switching!!
1st Generation: Helios (2010 UC San Diego)
• MEMS-based Optical Circuit Switching (OCS) is introduced to the Leaf-Spine architecture
Source: N. Farrington, et al., “Helios: a hybrid electrical/optical switch architecture for modular data centers,” in Proc. ACM SIGCOMM 2010.
[Figure labels: MEMS, Leaf, Spine]
• # of ports ~ 300
• 100 ms switching speed
How to accommodate “big flows” into Optical Circuit Switching Network
• First, all flows are accommodated in the Electrical Switching Network.
• If an “Elephant Flow” is observed, the flow is moved to the Optical Circuit Switching Network.
  – On-line Flow Classification “ElephantTrap”
  – Observation-based flow assignment
    • Maximum weight matching problem
Yi Lu, et al, “ElephantTrap: A low cost device for identifying large flows,” 15th IEEE Symposium on High-Performance Interconnects, 2007.
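The maximum weight matching step in observation-based flow assignment can be sketched as follows. This is a minimal illustration, not the Helios implementation: it assumes each switch gets exactly one outgoing and one incoming optical circuit, takes a hypothetical demand matrix `demand[src][dst]` of observed elephant traffic, and searches all assignments by brute force, which is only practical for a handful of switches. A real controller would use a polynomial-time matching algorithm.

```python
from itertools import permutations


def max_weight_circuits(demand):
    """Pick one circuit per source that maximizes the total carried demand.

    demand[src][dst] is the observed traffic from src to dst (hypothetical units).
    Returns (assignment, weight), where assignment[src] is the chosen dst.
    Returns (None, -1) if no valid assignment exists (for example, one switch).
    """
    n = len(demand)
    best_assignment, best_weight = None, -1
    for perm in permutations(range(n)):
        # A switch does not need an optical circuit to itself.
        if any(src == dst for src, dst in enumerate(perm)):
            continue
        weight = sum(demand[src][dst] for src, dst in enumerate(perm))
        if weight > best_weight:
            best_assignment, best_weight = perm, weight
    return best_assignment, best_weight


demand = [
    [0, 9, 1],
    [2, 0, 8],
    [7, 3, 0],
]
print(max_weight_circuits(demand))
```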
2nd Generation: Optical Slot Switching (OSS)
• Fixed-length µs-order Slot Switching + SDN control
  – ICTON 2017 Mo.B3.4 “NEPHELE” (National Technical Univ. of Athens)
    • High-speed (10 ns) 2x2 Optical Switch
      – All Optical, Ring Topology
  – ECOC 2017 We.2.A.3 “Cloud BOSS” (Nokia Bell Labs)
    • High-speed (100 ns) tunable Tx for making a slot
      – All Optical, Ring Topology
  – ECOC 2017 We.2.A.4 “COSIGN” (Univ. of Bristol)
    • High-speed (25 ns) 4x4 Optical Switch
      – OCS (MEMS) + OSS
HOLST
• Slot Switching-based DCN development project
  – Keio University, OA Laboratory, and Epi Photonics
  – Electrical and Optical (Circuit and Slot) hybrid switching network
  – High-speed (10 ns) 8x8 and 16x16 Optical Switches are under development
  – Application-triggered SDN-based DCN control
    • ECOC 2017 We.2.A.2 “Hadoop-based Application Triggered Automatic Flow Switching in Electrical/Optical Hybrid Data-Center Network” (Keio Univ.)
HOLST: High-speed optical layer 1 switch system for time slot switching based optical data center networks
HOLST System Architecture
[Figure: HOLST system architecture. Labels: Servers, ToR Switch, Spine Switch, Ultra High-Speed Optical L1 Switch, MEMS Switch, PLZT Switch, SDN Controller, Mice Flow, OSS network, OCS network]
Power Reduction by OSS + OCS
• # of ToRs = 256
  – 30 servers/rack, NIC 10 GE, mixed traffic (Web search and Data mining)
    • Mice: < 1 Gbps, Doggy: 1 – 6 Gbps, Elephant: 6 – 10 Gbps
    • Detailed simulation parameters are given in M. Hirono, et al., “HOLST: Architecture design of energy-efficient datacenter network based on ultra high-speed optical switch,” IEEE LANMAN 2017, June 2017.
[Figure: Power consumption of SPINE part (kW) for HOLST: 256 ToR, Helios: 256 ToR, Electrical: 256 ToR, and Electrical: 128 ToR; annotation “45 % reduction”]
How to accommodate “Elephant and Doggy flows” into OCS and OSS Network in HOLST
• First, all flows are accommodated in the Electrical Switching Network.
• If an “Elephant Flow” is observed, the flow is moved to the OCS Network.
• If a “Doggy Flow” is observed, the flow is moved to the OSS Network.
  – Observation-based flow assignment
  – On-line Flow Classification
  – Application (Hadoop) triggered flow assignment
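The mapping from flow class to network can be sketched in a few lines. This is an illustration only: it reuses the rate ranges from the power-reduction simulation (Mice below 1 Gbps, Doggy 1 to 6 Gbps, Elephant 6 to 10 Gbps), and the handling of the exact boundary values, as well as the function and table names, are assumptions.

```python
# Assumed boundaries in Gbps; which class owns exactly 1 or 6 Gbps is a guess.
DOGGY_MIN_GBPS = 1.0
ELEPHANT_MIN_GBPS = 6.0

NETWORK_FOR_CLASS = {
    "mice": "electrical",  # stays on the Electrical Switching Network
    "doggy": "OSS",        # moved to the Optical Slot Switching network
    "elephant": "OCS",     # moved to the Optical Circuit Switching network
}


def classify_flow(rate_gbps):
    """Classify a flow by its observed rate."""
    if rate_gbps < DOGGY_MIN_GBPS:
        return "mice"
    if rate_gbps < ELEPHANT_MIN_GBPS:
        return "doggy"
    return "elephant"


def select_network(rate_gbps):
    """Return the network that should carry a flow with the given rate."""
    return NETWORK_FOR_CLASS[classify_flow(rate_gbps)]


for rate in (0.2, 3.0, 8.0):
    print(rate, classify_flow(rate), select_network(rate))
```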
Observation-based Doggy Flow assignment
• An 8x8 Optical Switch is assumed
  – 1 ToR can connect to 7 other ToRs
• ToR groups must be found in the traffic matrix of the 256 ToRs
• The optimum grouping problem is NP-hard.
  – A heuristic grouping algorithm has been developed.
4 OSS planes
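One way to picture a grouping heuristic is the greedy sketch below. It is not the algorithm developed in the HOLST project, whose details are not given here; it is a generic greedy stand-in that seeds each group with the busiest remaining ToR and then adds the ToR exchanging the most traffic with the current members, until the group reaches the switch size (8 for an 8x8 switch). The function name and the tie-breaking rule (lowest ToR index wins) are assumptions.

```python
def greedy_tor_groups(traffic, group_size=8):
    """Partition ToRs into groups of at most `group_size` by a greedy rule.

    traffic[a][b] is the observed traffic from ToR a to ToR b.
    """
    def weight(a, b):
        # Traffic exchanged between a and b in both directions.
        return traffic[a][b] + traffic[b][a]

    ungrouped = set(range(len(traffic)))
    groups = []
    while ungrouped:
        # Seed: the remaining ToR with the most traffic to other remaining ToRs.
        seed = max(sorted(ungrouped),
                   key=lambda t: sum(weight(t, o) for o in ungrouped if o != t))
        group = [seed]
        ungrouped.remove(seed)
        while len(group) < group_size and ungrouped:
            # Add the ToR with the most traffic to the current group members.
            nxt = max(sorted(ungrouped),
                      key=lambda t: sum(weight(t, m) for m in group))
            group.append(nxt)
            ungrouped.remove(nxt)
        groups.append(group)
    return groups


traffic = [
    [0, 1, 0, 10],
    [1, 0, 5, 0],
    [0, 5, 0, 1],
    [10, 0, 1, 0],
]
print(greedy_tor_groups(traffic, group_size=2))
```

A greedy pass like this gives no optimality guarantee, which is consistent with the grouping problem being NP-hard.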
On-line Flow Classification
• A Flow-ID management queue is set in each ToR
  – Hierarchical Least Recently Used (LRU) queue
• The Flow-ID and its reference count (counter) are stored.
  – If the counter exceeds the threshold, the Flow-ID is moved to the higher queue
• Thresholds and queue sizes are changed adaptively.
MF: Mice Flow, DF: Doggy Flow, EF: Elephant Flow
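The hierarchical LRU queue can be sketched as three LRU queues, one per class. This is a minimal sketch under assumptions: the class name, the default capacities and thresholds, and the eviction policy (the least recently used entry is dropped when a queue overflows) are all hypothetical, and the adaptive adjustment of thresholds and queue sizes is left out.

```python
from collections import OrderedDict


class HierarchicalLRU:
    """Three LRU queues (MF, DF, EF); flows are promoted as their counter grows."""

    LEVELS = ("MF", "DF", "EF")

    def __init__(self, capacities=(1024, 128, 16), thresholds=(100, 1000)):
        self.queues = [OrderedDict() for _ in self.LEVELS]  # flow_id -> counter
        self.capacities = capacities
        self.thresholds = thresholds  # thresholds[i]: counter needed to leave level i

    def _insert(self, level, flow_id, count):
        queue = self.queues[level]
        queue[flow_id] = count
        queue.move_to_end(flow_id)      # mark as most recently used
        if len(queue) > self.capacities[level]:
            queue.popitem(last=False)   # evict the least recently used flow

    def observe(self, flow_id):
        """Record one reference to flow_id and return its current class."""
        for level, queue in enumerate(self.queues):
            if flow_id in queue:
                count = queue.pop(flow_id) + 1
                if level < len(self.thresholds) and count > self.thresholds[level]:
                    level += 1          # promote to the higher queue
                self._insert(level, flow_id, count)
                return self.LEVELS[level]
        self._insert(0, flow_id, 1)     # new flows start in the mice queue
        return self.LEVELS[0]


classifier = HierarchicalLRU(capacities=(4, 2, 1), thresholds=(2, 4))
print([classifier.observe("flow-a") for _ in range(5)])
```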
Hadoop Triggered Flow Assignment
• The “Hadoop Cluster” is monitored.
  – A newly defined “Shuffle Ratio” is used for classification.
Cluster Manager detects job start → Instruct flow monitoring to Traffic Monitor
Calculate “Shuffle-Ratio” from traffic monitor and job information
Shuffle-Ratio is large → Optical
Shuffle-Ratio is small → Electrical
Set circuit through the SDN Controller
[Figure labels: Hadoop Cluster, Server, ToR Switch, Optical Switch, Electrical Switch, Traffic Monitor, SDN Controller, Cluster Manager]
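The decision step can be sketched as below. The definition of the Shuffle Ratio is not given in these slides, so the sketch assumes a stand-in definition (shuffle bytes divided by job input bytes) and a hypothetical threshold of 0.5; both are placeholders for illustration, not the project's actual metric or setting.

```python
def shuffle_ratio(shuffle_bytes, input_bytes):
    """Assumed stand-in definition: shuffle traffic relative to job input size."""
    if input_bytes <= 0:
        raise ValueError("input_bytes must be positive")
    return shuffle_bytes / input_bytes


def choose_path(shuffle_bytes, input_bytes, threshold=0.5):
    """Return "optical" for a large Shuffle Ratio, otherwise "electrical".

    The threshold value is hypothetical.
    """
    ratio = shuffle_ratio(shuffle_bytes, input_bytes)
    return "optical" if ratio >= threshold else "electrical"


# Values as they might come from the Traffic Monitor and job information.
print(choose_path(shuffle_bytes=80, input_bytes=100))
print(choose_path(shuffle_bytes=5, input_bytes=100))
```

In the flow shown above, the "optical" result would be followed by a circuit setup request to the SDN Controller.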
HOLST PoC experiment
• A small HOLST PoC has been constructed.
  – 10 GE L2/L3 Switches, 16x16 MEMS Switch, 4x4 PLZT Switch
  – Software-based OSS adapter, Software-based On-line Flow Classifier
• Throughput is limited because these components are software-based
[Figure: PoC setup. Labels: OSS (x4), Flow classification (x4), 4x4 PLZT, 16x16 MEMS]
Optical Slot Switching in HOLST PoC
• Slot size: 200 ms (software) → µs order (FPGA under development)
[Figure labels: Sender #1, Receiver #2, 3, 4]
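Fixed-length slot switching from one sender to several receivers can be pictured as a timetable. The sketch below assumes a simple round-robin order over the receivers, which is an illustrative choice and not necessarily the schedule used in the PoC; the function name is hypothetical and the 200 ms slot size is the software value quoted above.

```python
def slot_schedule(receivers, slot_ms, num_slots):
    """Return (start_time_ms, receiver) pairs for fixed-length slots.

    Assumption: the sender cycles through the receivers in round-robin order.
    """
    return [(i * slot_ms, receivers[i % len(receivers)])
            for i in range(num_slots)]


# Sender #1 switching among receivers #2, #3, #4 with 200 ms slots.
for start, receiver in slot_schedule([2, 3, 4], slot_ms=200, num_slots=4):
    print(f"t={start} ms: sender 1 -> receiver {receiver}")
```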
On-line Flow Classification in HOLST PoC
• About 100 Mbps throughput is achieved with software emulation.
Hadoop triggered Flow Assignment in HOLST PoC
• In the shuffle phase, the flow is moved to the OCS network.
Summary
• “Optical Slot Switching” is becoming a hot-topic technology in optical data-center networks.
• In a hybrid DCN, flow classification is required to use the optical network efficiently.
• In the HOLST project, three flow classification methods are under development.