Post on 23-Feb-2016
Enterprise-Grade Scheduling
Gary Kotton - VMware
Gilad Zlotkin - Radware
Enterprise-Grade OpenStack from a Scheduler Perspective
Enterprise Ready OpenStack
Migrating existing mission-critical and performance-critical enterprise applications requires:
→ High service levels
  ● Availability
  ● Performance
  ● Security
→ Compliance with existing architectures
  ● Multi-tier
  ● Fault tolerance models
Service Level for Applications
• Availability
  o Fault Tolerance (FT)
  o High Availability (HA)
  o Disaster Recovery (DR)
• Performance
  o Transaction Latency (Sec)
  o Transaction Load/Bandwidth (TPS)
• Security
  o Data Privacy
  o Data Integrity
  o Denial of Service
What does all this have to do with Nova and other schedulers?
Placement Strategies
• Availability - anti-affinity
  o Application VMs should be placed in different 'failure domains' (e.g., on different hosts) to ensure application fault tolerance
• Performance
  o Network proximity: application VMs should be placed as closely as possible to one another on the network (same 'connectivity domain') to ensure low latency and high performance
  o Host capability: IO-intensive, network-intensive, CPU-intensive, ...
  o Storage proximity
• Security - resource isolation/exclusivity
  o Host, network, ...
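The three placement strategies can be read as host filters plus a preference order. The following is a minimal sketch under that reading, not Nova's implementation: the `Host` class, its fields, and `pick_host` are hypothetical names introduced for illustration, and a rack stands in for a 'connectivity domain'.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Host:
    name: str
    rack: str                                  # stand-in for a 'connectivity domain'
    capabilities: set = field(default_factory=set)
    dedicated_to: Optional[str] = None         # tenant with exclusive use, if any


def pick_host(hosts, used_hosts, tenant, needs=frozenset(), near_rack=None):
    """Return the name of a host that satisfies all policies, or None."""
    candidates = [
        h for h in hosts
        if h.name not in used_hosts              # availability: anti-affinity
        and needs <= h.capabilities              # performance: host capability
        and h.dedicated_to in (None, tenant)     # security: resource exclusivity
    ]
    if not candidates:
        return None
    # Performance: prefer a host in the requested rack (network proximity),
    # then break ties by name so the result is deterministic.
    candidates.sort(key=lambda h: (h.rack != near_rack, h.name))
    return candidates[0].name
```

Filtering first and ranking second keeps the hard constraints (anti-affinity, exclusivity) separate from the soft preference (proximity), so a nearby host is never chosen at the cost of fault tolerance.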
Group Scheduling
• Group together VMs to provide a certain service
• Enables scheduling policies per group/sub-group
• Provides a multi-VM application designed for fault tolerance and high performance
Example
Bad placement: if a host goes down, the entire service is down!
Placement strategy - anti-affinity: achieving fault tolerance
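The difference between the two placements can be checked mechanically. This is an illustrative sketch only: the function name and the VM and host names are made up, and a service is assumed to be "up" as long as at least one of its VMs is still running.

```python
def hosts_that_take_down_service(placement):
    """Return the hosts whose failure leaves no VM of the service running.

    placement maps VM name -> host name for a single service.
    """
    fatal = []
    for host in sorted(set(placement.values())):
        survivors = [vm for vm, h in placement.items() if h != host]
        if not survivors:
            fatal.append(host)
    return fatal


# Bad placement: all VMs share one host.
bad = {"web-1": "host-a", "web-2": "host-a", "web-3": "host-a"}
# Anti-affinity: each VM is in a different failure domain.
good = {"web-1": "host-a", "web-2": "host-b", "web-3": "host-c"}

print(hosts_that_take_down_service(bad))
print(hosts_that_take_down_service(good))
```

With the bad placement the single shared host is reported as fatal; with the anti-affinity placement no single host failure stops the service.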
Group Scheduling in Nova
• Expand on “Server Group” support
• Topology of resources and relationships between them
  o Debi Dutta and Yathi Udupi (Cisco)
  o Mike Spreitzer (IBM)
  o Gary Kotton (VMware)
Anti-Affinity in Icehouse
• Server groups
  o Create a server group with a policy
  o Scheduler hint will ensure that the policy is enforced
  o Affinity and anti-affinity filters are now default filters
  o Backward compatible with Havana support
  o nova boot --hint group=name|uuid --image ws.img --flavor 2 --num 3 Wsi
  o Completed with the help of: Russell Bryant (Red Hat), Xinyuan Huang (Cisco)
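The two steps above (create a group with a policy, then boot with a scheduler hint) correspond to two request bodies. The sketch below only builds those bodies as Python dicts. The JSON shapes reflect my understanding of the Compute v2 API and should be checked against the API reference; the helper names, `GROUP-UUID`, and `IMAGE-UUID` are placeholders, not values from the talk.

```python
import json


def server_group_request(name, policy):
    """Body for creating a server group with a single policy."""
    if policy not in ("affinity", "anti-affinity"):
        raise ValueError("unsupported policy: %s" % policy)
    return {"server_group": {"name": name, "policies": [policy]}}


def boot_request(name, image_ref, flavor_ref, group_id, count=1):
    """Body for booting `count` servers into an existing server group."""
    return {
        "server": {
            "name": name,
            "imageRef": image_ref,
            "flavorRef": flavor_ref,
            "min_count": count,
            "max_count": count,
        },
        # The scheduler hint is what ties the new servers to the group.
        "os:scheduler_hints": {"group": group_id},
    }


print(json.dumps(server_group_request("ws-group", "anti-affinity"), indent=2))
print(json.dumps(boot_request("Wsi", "IMAGE-UUID", "2", "GROUP-UUID", count=3), indent=2))
```

The policy lives on the group rather than on each boot request, so every server booted with the same hint is subject to the same rule.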
Beyond Nova Scheduler
Providing enterprise-grade services:
  o Availability
  o Performance
  o Security
Requires:
→ Hierarchical Scheduling
→ Cross Scheduling
→ Rescheduling
VMware Storage Policy Based Management (SPBM)
→ Datastore selection based on flavor metadata
→ Compute scheduling on hosts that are connected to the selected datastore
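The two arrows above describe a two-step decision: storage first, compute second. A minimal sketch of that ordering follows. It does not use any VMware or Nova API; the `storage_policy` metadata key, the function name, and the dictionary layouts are assumptions made for illustration.

```python
def cross_schedule(flavor_extra_specs, datastores, host_datastores):
    """Pick a datastore from flavor metadata, then the hosts connected to it.

    datastores:      datastore name -> set of policies it satisfies
    host_datastores: host name -> set of datastores the host is connected to
    Returns (datastore, hosts), or (None, []) if no datastore matches.
    """
    wanted = flavor_extra_specs.get("storage_policy")  # hypothetical key
    matching = [ds for ds, policies in sorted(datastores.items())
                if wanted in policies]
    if not matching:
        return None, []
    datastore = matching[0]
    # Compute scheduling is restricted to hosts that can reach the datastore.
    hosts = sorted(h for h, connected in host_datastores.items()
                   if datastore in connected)
    return datastore, hosts
```

The storage choice narrows the compute candidates, which is why a scheduler that looks only at compute hosts cannot make this decision alone.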
Storage/Compute Cross Scheduling
VMware Virtual SAN SPBM
[Figure labels: Storage Policy Wizard, SPBM, object manager, object, virtual disk, Datastore Profile]
VSAN objects may be (1) mirrored across hosts and (2) striped across disks/hosts to meet VM storage profile policies
Storage/Compute Affinity: Cinder volume will be on the same host as the VM (to optimize performance)
Rescheduling
VMware vCenter HA (High Availability)
• Automatically orchestrate rescheduling of VMs from a failed host onto surviving hosts
DRS (Distributed Resource Scheduling)
• Evacuate (vMotion) VMs from an overloaded host
DPM (Distributed Power Management)
• Evacuate (vMotion) VMs from a host that is to be shut down
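All three cases share one step: take the VMs off one host and place them on the remaining hosts. The sketch below illustrates only that step for the failed-host case. It is not how vCenter works internally; the function name, the slot-based capacity model, and the least-loaded choice are assumptions, and every host in `placement` is assumed to appear in `capacity`.

```python
def reschedule_failed_host(placement, capacity, failed_host):
    """Move VMs from failed_host onto surviving hosts that have free slots.

    placement: VM name -> host name
    capacity:  host name -> maximum number of VMs
    Returns (new_placement, unplaced_vms).
    """
    new_placement = {vm: h for vm, h in placement.items() if h != failed_host}
    load = {h: 0 for h in capacity if h != failed_host}
    for h in new_placement.values():
        load[h] += 1

    unplaced = []
    for vm in sorted(vm for vm, h in placement.items() if h == failed_host):
        free = [h for h in sorted(load) if load[h] < capacity[h]]
        if not free:
            unplaced.append(vm)                    # no surviving host has room
            continue
        target = min(free, key=lambda h: load[h])  # least-loaded survivor
        new_placement[vm] = target
        load[target] += 1
    return new_placement, unplaced
```

Returning the VMs that could not be placed makes the capacity shortfall explicit instead of silently overcommitting a surviving host.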
Rescheduling
Radware’s Neutron LBaaS rescheduling:
• Fault recovery: rescheduling of a failed LB instance
• Scaling: rolling scale-up (or down) by controlled fail-over
Hierarchical Scheduling
Host exclusivity for secure tenant isolation
→ Select hosts that were allocated to the workload of that specific tenant
→ Compute scheduling on that selected host group
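The two levels can be sketched as two stages in one function: first narrow to the tenant's host group, then run an ordinary compute choice inside it. This is an illustration under assumptions, not an existing scheduler API; the function name, the `host_groups` mapping, and the "most free slots" rule are hypothetical.

```python
def hierarchical_schedule(tenant, host_groups, free_slots):
    """Return a host for the tenant's next VM, or None if none is available.

    host_groups: tenant -> list of hosts allocated exclusively to that tenant
    free_slots:  host name -> number of VMs the host can still accept
    """
    # Stage 1: select the host group allocated to this tenant.
    group = host_groups.get(tenant, [])
    # Stage 2: compute scheduling restricted to that group.
    candidates = [h for h in group if free_slots.get(h, 0) > 0]
    if not candidates:
        return None
    return max(sorted(candidates), key=lambda h: free_slots[h])
```

A tenant with no allocated group gets no host at all, so a VM can never land on hardware reserved for someone else.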
Supported by Icehouse
Server-Groups:
→ Anti-affinity
→ Affinity
→ Backward compatible with Havana
Future Roadmap
Server-Groups:
• New filters
  o Network proximity
  o Rack affinity/anti-affinity
  o Host capabilities
• Simultaneous scheduling (Mike Spreitzer - IBM)
• Host exclusivity (Phil Day – HP)
Future Roadmap Cont.
Scheduler-as-a-service project
• Gantt (https://wiki.openstack.org/wiki/Gantt)
• Initial steps (Sylvain Bauza - Red Hat):
  o Forklift the Nova scheduler
  o Discussions of APIs etc.
No DB scheduler (Boris Pavlovic – Mirantis)
• Ideas on improving scheduler performance
Summary
High service levels → Scheduling Policies
● Availability
  → Anti-Affinity
  → Rescheduling
● Performance
  → Proximity
  → Host Capability
  → Cross Scheduling & Rescheduling
● Security
  → Resource Exclusivity
  → Hierarchical Scheduling
Q&A
Thank You
Gary Kotton: gkotton@vmware.com
Gilad Zlotkin: giladz@radware.com