Windows Server 2012 Hyper-V Networking Evolved
Didier Van Hoye
Technical Architect – FGIA
Microsoft MVP & MEET Member
http://workinghardinit.wordpress.com | @workinghardinit
What We’ll Discuss
Windows Server 2012 Networking
Changed & Improved features
New features
Relationship to Hyper-V
Why We’ll Discuss This
We face many network challenges
Keep systems & services running
High to continuous availability
High reliability & reducing complexity
Security, multitenancy, extensibility
Cannot keep throwing money at it (CAPEX)
Network virtualization, QoS, bandwidth management in-box
Performance (latency, throughput, scalability)
Leverage existing hardware
Control operational cost (OPEX)
Reduce complexity
Eternal Challenge = Balanced Design
[Figure: balancing CPU, memory, storage & network against availability, capacity, cost & performance]
Network Bottlenecks
In the host networking stack
In the NICs
In the switches
[Figure: Dell PowerEdge M1000e blade chassis moving streams of network traffic]
Socket, NUMA, Core, K-Group
Processor: One physical processor, which can consist of one or more NUMA nodes. Today a physical processor ≈ a socket with multiple cores.
Non-uniform memory architecture (NUMA) node: A set of logical processors and cache that are close to one another.
Core: One processing unit, which can consist of one or more logical processors.
Logical processor (LP): One logical computing engine from the perspective of the operating system, application or driver. In effect, a logical processor is a thread (think hyper-threading).
Kernel Group (K-Group): A set of up to 64 logical processors.
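To see how sockets, cores and logical processors line up on a given host, a quick WMI query from PowerShell is enough (a minimal sketch; nothing here is specific to this deck):

# Sockets, cores and logical processors as the OS sees them
Get-WmiObject Win32_Processor |
    Select-Object Name, NumberOfCores, NumberOfLogicalProcessors

# Total logical processor count across all K-Groups
(Get-WmiObject Win32_ComputerSystem).NumberOfLogicalProcessors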
Advanced Network Features (1)
Receive Side Scaling (RSS)
Receive Segment Coalescing (RSC)
Dynamic Virtual Machine Queuing (DVMQ)
Single Root I/O Virtualization (SR-IOV)
NIC TEAMING
RDMA/Multichannel support for virtual machines on SMB 3.0
Receive Side Scaling (RSS)
Windows Server 2012 scales RSS to the next generation of servers & workloads
Spreads interrupts across all available CPUs
Even for those very large scale hosts
RSS now works across K-Groups
RSS is even NUMA-aware to optimize performance
Now load balances UDP traffic across CPUs
40% to 100% more throughput (backups, file copies, web)
(a tuning sketch follows the figure below)
[Figure: an RSS-capable NIC with 8 queues spreading incoming packets across NUMA nodes 0-3. RSS improves scalability on multiple processors / NUMA nodes by distributing TCP/UDP receive traffic across the cores in different nodes / K-Groups]
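RSS is typically on by default, but the queue-to-processor mapping can be tuned per adapter. A minimal sketch using the NetAdapter module (the adapter name "10GbE-1" and the processor numbers are illustrative):

# Inspect current RSS settings, including NUMA node and processor group placement
Get-NetAdapterRss -Name "10GbE-1"

# Pin RSS for this NIC to processors 2 and up, at most 8, with static NUMA placement
Set-NetAdapterRss -Name "10GbE-1" -BaseProcessorNumber 2 -MaxProcessors 8 -Profile NUMAStatic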
Receive Segment Coalescing (RSC)
Coalesces packets in the NIC so the stack processes fewer headers
Multiple packets belonging to a connection are coalesced by the NIC into a larger packet (max of 64 KB) and processed within a single interrupt
10 - 20% improvement in throughput & CPU workload; the work is offloaded to the NIC
Enabled by default on all 10 Gbps NICs
(a configuration sketch follows the figure below)
[Figure: incoming packets coalesced by an RSC-capable NIC into a larger buffer. RSC helps by coalescing multiple inbound packets into a larger buffer or "packet", which reduces per-packet CPU cost as fewer headers need to be processed]
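RSC can be checked and toggled per adapter. A small sketch (the adapter name is illustrative):

# Show which adapters support and have RSC enabled for IPv4/IPv6
Get-NetAdapterRsc

# Enable RSC on a specific 10 GbE adapter
Enable-NetAdapterRsc -Name "10GbE-1"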
Dynamic Virtual Machine Queue (DVMQ)
VMQ is to virtualization what RSS is to native workloads.
It makes sure that routing, filtering etc. are done by the NIC in queues, and that the interrupts for those queues don't all land on one processor (CPU 0).
Most inbox 10 Gbps Ethernet adapters support this.
Enabled by default.
(a tuning sketch follows the figures below)
[Figure: network I/O path without VMQ vs. with VMQ; without VMQ the root partition processes all traffic on CPU 0, with VMQ the physical NIC filters and routes traffic into queues whose interrupts are spread across CPUs 0-3]
Dynamic VMQ adapts for optimal performance across changing workloads.
[Figure: No VMQ vs. Static VMQ vs. Dynamic VMQ; with Dynamic VMQ the queue-to-CPU assignment adapts as the workload changes]
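Like RSS, VMQ placement can be inspected and tuned per adapter (a sketch; the adapter name and processor numbers are illustrative):

# Check VMQ capability and state on the host's NICs
Get-NetAdapterVmq

# Restrict this adapter's VMQ interrupts to processors 2 and up, at most 4
Set-NetAdapterVmq -Name "10GbE-1" -BaseProcessorNumber 2 -MaxProcessors 4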
Single Root I/O Virtualization (SR-IOV)
Reduces CPU utilization for processing network traffic
Shortens the latency path
Increases throughput
Requires:
Chipset: interrupt & DMA remapping
BIOS support
CPU: hardware virtualization, EPT or NPT
(an enabling sketch follows the figure below)
[Figure: network I/O path without SR-IOV (traffic passes through the Hyper-V switch in the root partition for routing, VLAN, filtering and data copy, then over the VMBus to the VM's virtual NIC) vs. with SR-IOV (a virtual function of the SR-IOV physical NIC is mapped straight into the virtual machine)]
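Enabling SR-IOV is a two-step affair: the virtual switch must be created in IOV mode, and the VM's NIC must be given an IOV weight. A sketch (switch, adapter and VM names are illustrative):

# Create the virtual switch in IOV mode; IOV cannot be enabled on an existing switch
New-VMSwitch -Name "IovSwitch" -NetAdapterName "10GbE-1" -EnableIov $true

# Request a virtual function for the VM's network adapter (0 disables, 1-100 enables)
Set-VMNetworkAdapter -VMName "VM01" -IovWeight 100

# Verify the adapter's settings took
Get-VMNetworkAdapter -VMName "VM01" | Format-List VMName, IovWeight, Status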
SR-IOV Enabling & Live Migration
Turn on IOV (VM NIC property): a Virtual Function is "assigned", a "NIC" is automatically created, and traffic flows through the VF.
Live Migration: switch back to the software path, migrate as normal, then reassign a Virtual Function on the target, assuming resources are available.
Post migration: remove the VF from the VM. The VM has connectivity even if the switch is not in IOV mode, the IOV physical NIC is not present, or the NIC vendor or firmware differs.
[Figure: the VM's network stack flips between the Virtual Function and a software NIC behind the software switch (IOV mode); while the VF is assigned, the software path is not used]
NIC TEAMING
Customers are dealing with way too many issues.
NIC vendors would like to get rid of supporting this.
Microsoft needs this to be competitive & complete the solution stack + reduce support issues.
Teaming modes: switch dependent, switch independent
Load balancing: address hash, Hyper-V Port
Hashing modes: 4-tuple, 2-tuple, MAC address
Active/Active & Active/Standby
Vendor agnostic
(a configuration sketch follows the architecture figure below)
[Figure: NIC Teaming (LBFO) architecture; the user-mode LBFO admin GUI and configuration DLL talk via WMI and IOCTL to the kernel-mode IM MUX protocol edge, which exposes a virtual miniport to the Hyper-V Extensible Switch and binds ports to NICs 1-3. The LBFO provider handles frame distribution/aggregation, failure detection and control protocol implementation]
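Creating a team is a single cmdlet in Windows Server 2012 (team and member names are illustrative):

# Switch-independent team with Hyper-V Port load balancing, suitable under a virtual switch
New-NetLbfoTeam -Name "Team1" -TeamMembers "NIC1","NIC2" `
    -TeamingMode SwitchIndependent -LoadBalancingAlgorithm HyperVPort

# Review the team and its members
Get-NetLbfoTeam
Get-NetLbfoTeamMember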
NIC TEAMING (LBFO)
Parent NIC teaming: the LBFO team of SR-IOV NICs sits under a single Hyper-V virtual switch; the VM can run any OS, but SR-IOV is not exposed to it.
Guest NIC teaming: each SR-IOV NIC backs its own Hyper-V virtual switch and the LBFO team is built inside the VM itself (guest running Windows Server 2012), so SR-IOV stays usable.
[Figure: parent NIC teaming vs. guest NIC teaming with two SR-IOV NICs]
SMB Direct (SMB over RDMA)
[Figure: SMB client and SMB server exchanging data directly between R-NICs over a network with RDMA support, bypassing most of the user- and kernel-mode stack down to NTFS/SCSI]
What
Addresses congestion in the network stack by offloading the stack to the network adapter
Advantages
Scalable, fast and efficient storage access
High throughput, low latency & minimal CPU utilization
Load balancing, automatic failover & bandwidth aggregation via SMB Multichannel
Scenarios
High performance remote file access for application servers like Hyper-V, SQL Server, IIS and HPC
Used by File Server and Cluster Shared Volumes (CSV) for storage communications within a cluster
Required hardware
RDMA-capable network interface (R-NIC)
Three types: iWARP, RoCE & InfiniBand
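SMB Direct kicks in automatically on RDMA-capable NICs; the main admin task is verifying it. A sketch (the adapter name is illustrative):

# Which NICs are RDMA-capable and enabled?
Get-NetAdapterRdma

# Re-enable RDMA on an adapter if it was turned off
Enable-NetAdapterRdma -Name "R-NIC1"

# On the SMB client, confirm interfaces are seen as RDMA-capable
Get-SmbClientNetworkInterface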
SMB Multichannel
Multiple connections per SMB session
Full Throughput
Bandwidth aggregation with multiple NICs
Multiple CPU cores engaged when using Receive Side Scaling (RSS)
Automatic Failover
SMB Multichannel implements end-to-end failure detection
Leverages NIC teaming if present, but does not require it
Automatic Configuration
SMB detects and uses multiple network paths
SMB Multichannel: Single NIC Port
1 session, without Multichannel: no failover; only one TCP/IP connection; only one CPU core engaged; can't use the full 10 Gbps.
1 session, with Multichannel: no failover (still a single NIC); multiple TCP/IP connections; RSS helps distribute load across CPU cores; the full 10 Gbps is available.
[Figure: SMB client and server with one RSS-capable 10GbE NIC each; per-core CPU utilization shows one busy core without Multichannel and several engaged cores with it]
SMB Multichannel: Multiple NIC Ports
1 session, without Multichannel: no automatic failover; only one NIC engaged; only one CPU core engaged; can't use the full combined bandwidth.
1 session, with Multichannel: automatic NIC failover; multiple NICs engaged; multiple CPU cores engaged; the combined NIC bandwidth is available.
[Figure: SMB client/server pairs with dual RSS-capable 10GbE NICs and dual 1GbE NICs across redundant switches]
SMB Multichannel & NIC Teaming
1 session, NIC Teaming without Multichannel: automatic NIC failover, but only one NIC engaged; only one CPU core engaged; can't use the full bandwidth.
1 session, NIC Teaming with Multichannel: automatic NIC failover (faster with NIC Teaming); multiple NICs engaged; multiple CPU cores engaged; the combined NIC bandwidth is available.
[Figure: the same dual-NIC client/server pairs with the NICs placed in an LBFO team on both ends]
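Multichannel needs no configuration, but it is easy to observe from PowerShell. A sketch:

# List the TCP/RDMA connections SMB opened per session, with the NICs involved
Get-SmbMultichannelConnection

# Multichannel is on by default; it can be disabled client-side for troubleshooting
Set-SmbClientConfiguration -EnableMultiChannel $false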
SMB Direct & Multichannel
1 session, without Multichannel: no automatic failover; only one NIC engaged; can't use the full bandwidth; the RDMA capability is not used.
1 session, with Multichannel: automatic NIC failover; multiple NICs engaged; multiple RDMA connections; the combined NIC bandwidth is available.
[Figure: SMB client/server pairs with dual 10GbE R-NICs and dual 54Gb InfiniBand R-NICs across redundant switches]
SMB Multichannel Auto Configuration
Auto configuration looks at NIC type/speed => the same NICs are used for RDMA/Multichannel (it doesn't mix 10 Gbps/1 Gbps, or RDMA/non-RDMA)
Let the algorithms work before you decide to intervene
Choose adapters wisely for their function
[Figure: auto configuration examples mixing 1GbE, wireless, 10GbE, RDMA and 32Gb InfiniBand interfaces; SMB picks the fastest, most capable matching pair]
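If you do need to intervene, you can pin SMB traffic to specific interfaces per server instead of fighting the defaults (a sketch; the server and interface names are illustrative):

# Only use these two interfaces for SMB traffic to server FS01
New-SmbMultichannelConstraint -ServerName "FS01" -InterfaceAlias "10GbE-1","10GbE-2"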
Networking Features Cheat Sheet
[Table: Large Send Offload (LSO), Receive Segment Coalescing (RSC), Receive Side Scaling (RSS), Virtual Machine Queues (VMQ), Remote DMA (RDMA) and Single Root I/O Virtualization (SR-IOV) mapped against the metrics lower latency, higher scalability, higher throughput and lower path length]
Advanced Network Features (2)
Consistent Device Naming
DCTCP/DCB/QoS
DHCP Guard/Router Guard/Port Mirroring
Port ACLs
IPsec Task Offload for Virtual Machines (IPsecTOv2)
Network virtualization & Extensible Switch
Consistent Device Naming
Datacenter TCP (DCTCP)
W2K12 deals with network congestion by reacting to the degree, and not merely the presence, of congestion.
DCTCP aims to achieve low latency, high burst tolerance and high throughput with small-buffer switches.
Requires Explicit Congestion Notification (ECN, RFC 3168) capable switches.
The algorithm is enabled when it makes sense (low round trip times, i.e. in the data center).
DCTCP requires less buffer memory: a 1 Gbps flow controlled by TCP needs 400 to 600 KB of switch buffer and shows the TCP sawtooth, while the same flow controlled by DCTCP requires about 30 KB and stays smooth.
(a configuration sketch follows below)
Running out of buffer in a switch gets you into stop/go hell: a boatload of green, orange & red lights along your way. Big buffers mitigate this, but they are very expensive. You want to be in a green wave, and Windows Server 2012 & ECN provide that kind of network traffic control by default.
(Image credits: http://www.flickr.com/photos/srgblog/414839326, http://www.flickr.com/photos/bexross/2636921208/, http://www.flickr.com/photos/mwichary/3321222807/, http://www.flickr.com/photos/highwaysagency/6281302040/, http://www.telegraph.co.uk/motoring/news/5149151/Motorists-to-be-given-green-traffic-lights-if-they-stick-to-speed-limit.html)
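DCTCP ships as a TCP congestion provider in Windows Server 2012. A sketch of checking and applying it (the DatacenterCustom template and the destination prefix are illustrative choices):

# See which congestion provider each TCP setting template uses
Get-NetTCPSetting | Select-Object SettingName, CongestionProvider

# Use DCTCP for traffic matched to the DatacenterCustom template...
Set-NetTCPSetting -SettingName DatacenterCustom -CongestionProvider DCTCP

# ...and steer east-west data center traffic into that template
New-NetTransportFilter -SettingName DatacenterCustom -DestinationPrefix 10.0.0.0/8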
Data Center Bridging (DCB)
Prevents congestion in NIC & network by reserving bandwidth for particular traffic types
Windows Server 2012 provides support & control for DCB and tags packets by traffic type
Provides lossless transport for mission critical workloads
DCB is like a car pool lane …
(Image credit: http://www.flickr.com/photos/philopp/7332438786/)
DCB Requirements
1. Enhanced Transmission Selection (IEEE 802.1Qaz)
2. Priority Flow Control (IEEE 802.1Qbb)
3. (Optional) Data Center Bridging Exchange protocol
4. (Not required) Congestion Notification (IEEE 802.1Qau)
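Getting DCB working end to end means installing the feature, classifying traffic, and reserving bandwidth for it; the switches must be configured to match. A sketch for SMB Direct traffic (priority 3 and the 40% reservation are illustrative choices):

# Install DCB support
Install-WindowsFeature Data-Center-Bridging

# Tag SMB Direct traffic (port 445) with 802.1p priority 3
New-NetQosPolicy "SMB" -NetDirectPortMatchCondition 445 -PriorityValue8021Action 3

# Make priority 3 lossless and reserve 40% of bandwidth for it via ETS
Enable-NetQosFlowControl -Priority 3
New-NetQosTrafficClass "SMB" -Priority 3 -BandwidthPercentage 40 -Algorithm ETS

# Apply DCB on the converged NIC
Enable-NetAdapterQos -Name "10GbE-1"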
Hyper-V QoS beyond the VM
Manage the network bandwidth with a Maximum (value) and/or a Minimum (value or weight).
[Figure: two 10 GbE physical NICs in an LBFO team under one Hyper-V virtual switch, carrying VM traffic alongside the management OS's Live Migration, Storage and Management flows]
Hyper-V QoS beyond the VM: Default Flow per Virtual Switch
Customers may group a number of VMs that each don't have a minimum bandwidth assigned. They will be bucketed into a default flow, which has a minimum weight allocation. This is to prevent starvation.
[Figure: a Hyper-V Extensible Switch sharing a 1 Gbps uplink between a Gold tenant VM and VMs in the default flow]
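Relative weights are configured on the virtual switch and its default flow. A sketch (the switch name, team name and weight are illustrative):

# Create a switch that does minimum-bandwidth management by relative weight
New-VMSwitch -Name "TenantSwitch" -NetAdapterName "Team1" -MinimumBandwidthMode Weight

# Give the default flow (VMs with no explicit minimum) a weight so they can't be starved
Set-VMSwitch -Name "TenantSwitch" -DefaultFlowMinimumBandwidthWeight 10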
Maximum Bandwidth for Tenants
One common customer pain point is that WAN links are expensive.
Cap VM throughput to the Internet to avoid bill shock.
[Figure: a Hyper-V Extensible Switch capping tenant traffic toward the Unified Remote Access Gateway and the Internet at <100 Mb while intranet traffic stays uncapped]
Bandwidth Network Management
Manage the network bandwidth with a Maximum and a Minimum value
SLAs for hosted Virtual Machines
Control per VM and not per host
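Per-VM caps and reservations both hang off the VM's network adapter. A sketch (VM names and values are illustrative; MaximumBandwidth takes bits per second):

# Reserve a relative share of the uplink for a gold-tier VM
Set-VMNetworkAdapter -VMName "GoldVM" -MinimumBandwidthWeight 50

# Cap a tenant VM's throughput (100MB = 104,857,600 bps ≈ 105 Mbps)
Set-VMNetworkAdapter -VMName "TenantVM" -MaximumBandwidth 100MB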
DHCP & Router Guard, Port Mirroring
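DHCP Guard drops DHCP server offers from VMs that shouldn't be handing out addresses, Router Guard drops router advertisements from VMs posing as routers, and Port Mirroring copies a VM's traffic to another port on the same virtual switch. All three are per-VM-NIC settings; a sketch (VM names are illustrative):

# Protect the network from a rogue DHCP server or router inside a tenant VM
Set-VMNetworkAdapter -VMName "TenantVM" -DhcpGuard On -RouterGuard On

# Mirror the tenant VM's traffic to a monitoring VM
Set-VMNetworkAdapter -VMName "TenantVM" -PortMirroring Source
Set-VMNetworkAdapter -VMName "MonitorVM" -PortMirroring Destination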
IPsec Task Offload
IPsec is CPU intensive => offload it to the NIC
In demand due to compliance (SOX, HIPAA, etc.)
IPsec is required & needed for secured operations
Only available to host/parent workloads in W2K8R2
Now extended to virtual machines; managed by the Hyper-V switch
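The per-VM knob is the number of security associations the NIC may offload for that VM (a sketch; the VM name and SA count are illustrative, and 0 disables the feature):

# Allow up to 512 offloaded security associations for this VM's NIC
Set-VMNetworkAdapter -VMName "VM01" -IPsecOffloadMaximumSecurityAssociation 512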
Port ACL
Allow/Deny/Counter per rule
MAC, IPv4 or IPv6 addresses
Wildcards allowed in IP addresses
Note: counters are implemented as ACLs
Count packets to an address/range
Read via WMI/PowerShell
Counters are tied into the resource metering you can do for charge/show back, planning etc.
ACLs are the basic building blocks of virtual switch security functions
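Port ACLs are managed per VM network adapter; note the counter action is called Meter in the cmdlet. A sketch (the VM name and address ranges are illustrative):

# Block a tenant VM from reaching the management subnet
Add-VMNetworkAdapterAcl -VMName "TenantVM" -RemoteIPAddress 10.10.0.0/16 -Direction Both -Action Deny

# Count the VM's outbound IPv4 traffic for show back
Add-VMNetworkAdapterAcl -VMName "TenantVM" -RemoteIPAddress 0.0.0.0/0 -Direction Outbound -Action Meter

# Read the ACLs (and metered counters) back
Get-VMNetworkAdapterAcl -VMName "TenantVM"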
http://workinghardinit.wordpress.com | @workinghardinit
Questions & Answers