Date post: | 03-Nov-2014 |
Category: | Technology |
Upload: | pradeep-padala |
XRM: An Event-based Resource Management Framework for XCP
Pradeep Padala
in collaboration with Ken Igarashi, Akshay I. Mehta, and Ulas C. Kozat
Typical scenario in shared infrastructures
A data center with shared infrastructure (cloud) hosts two applications: web search and data analytics
Xen Summit AMD 2010
Application requirements
• Web search: fast searches, low response time
• Data analytics: analyze large data, high throughput
• QoS differentiation: 3:1
How to host these applications?
• Physical partitioning: app1 web, app1 db, app2, and app3 each on its own node (Node I to Node IV)
– × Wasteful
– × Difficult to manage
• Virtualized data center: the same applications run on a virtualization layer across Node I and Node II
– Improved utilization
– Reduced costs
– High flexibility (elastic!)
Virtualized shared data center = a new paradigm!
Challenge
How to allocate resources to meet goals?
ProvisionVMs()
RunApplications()
While (true) {
    MonitorApplications()
    If (AppPerformance != GOAL) {
        FindReason()
        If (ScaleUp) {
            FindAvailableResources()
            MigrateVM()
        }
        If (ScaleOut) {
            ProvisionVMs()
            RunApplication()
        }
    }
    If (Consolidation == True) {
        FindSuitableVMs()
        Consolidate()
    }
}
Challenge #1: Developers don’t want to manage resources
How to determine what to do? Scale up? Scale out? Migrate? Clone?
Where to provision VMs?
How to consolidate VMs?
Cloud Providers Want to Consolidate Multiple Services too!
Holy Grail: DeployService(); AutoScale();
Challenge #2: Resource Management Spans Multiple Layers
Layers, top to bottom: Services, PaaS, IaaS, Hardware
Resource management spans all of these layers
How to pass information between the layers so that they don’t make conflicting decisions?
Challenge #3: Complexity of Scaling Primitives
• Slicing
– Little overhead
– Efficient
– × Limited to a single machine
• Live Migration
– Handles overload
– Small downtime
– × Overhead
• Cloning
– Stateful clone
– × Overhead
– × Side effects
• Live Replication
– Maintains connections
– × Overhead
How to combine primitives to achieve goals?
What is a perfect Resource Manager?
• Automation
• Resource allocation
• High utilization
• High application performance
We are building the (ultimate) RM system
XRM = first incarnation on XCP!
An RM that automatically rearranges resources among multiple applications/VMs across multiple physical machines and provides optimal resource utilization and application performance
Outline
• Motivation
• Challenges in RM
• XRM Feedback Control based Design
• XRM Implementation and Preliminary Results
• Summary and Feedback
How to achieve the automation?
“Almost any system that is considered automatic has some element of feedback control” (Hellerstein et al.)
XRM = A Feedback Control System
RM in multiple layers
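XRM = IaaS RM
Layers, top to bottom: Services, PaaS RM, IaaS RM, Hardware
• PaaS RM: does app modeling and may request changes
• IaaS RM: knows only about VMs and hardware resources
• Services to PaaS RM: high-level service request
• PaaS RM to IaaS RM: slice request
• IaaS RM to Hardware: slice changes, driven by an automated control loop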
XRM’s feedback control loop
Loop stages: Monitor, Control, Action, running against XCP
• Monitor: XCP and network stats
• Control: performance goals and control parameters
• Action: change resource shares, migrate, power off machines
• Model: can model applications, VMs, and underlying resources
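The monitor, control, action cycle can be illustrated with a minimal sketch. This is not XRM's actual controller: the `VmState` type, the `control_step` function, the proportional gain, and the share limits are all assumptions chosen for illustration, and a simple proportional rule stands in for whatever model the real system uses.

```python
from dataclasses import dataclass


@dataclass
class VmState:
    """Hypothetical per-VM state used only in this sketch."""
    name: str
    cpu_share: float          # fraction of one host CPU, 0.0 to 1.0
    response_time_ms: float   # latest value reported by the monitor step


def control_step(vm, goal_ms, gain=0.5, min_share=0.05, max_share=1.0):
    """Compute the next CPU share for one VM (the control step).

    The share is scaled by the relative error between the measured
    response time and the goal, then clamped to [min_share, max_share].
    Applying the returned share would be the action step.
    """
    error = (vm.response_time_ms - goal_ms) / goal_ms
    new_share = vm.cpu_share * (1.0 + gain * error)
    return max(min_share, min(max_share, new_share))


if __name__ == "__main__":
    vm = VmState(name="web", cpu_share=0.4, response_time_ms=200.0)
    # The VM is twice as slow as its 100 ms goal, so its share is raised.
    vm.cpu_share = control_step(vm, goal_ms=100.0)
    print(vm)
```

A VM that already meets its goal keeps its share, and a VM that is faster than its goal gives some share back, which is what lets a loop of this kind free resources for consolidation.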
Current incarnation
• XCP monitoring module: receives out-of-band stat updates from XCP nodes; backed by an RRD database
• Stats analysis module: analyzes stats using (1) thresholds and (2) rules
• Core algorithm module: receives filtered stats and stats analysis data; uses an algorithm bank
• Wrapper: takes action on the XCP master node through low-level commands/XAPI commands and OpenFlow
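A threshold check of the kind the stats analysis module applies could look like the sketch below. The `ThresholdRule` class, its windowed-mean rule, and the event name are hypothetical; the slides only say that stats are filtered with thresholds and rules.

```python
from collections import deque


class ThresholdRule:
    """Fire an event when the mean of the last `window` samples exceeds a limit."""

    def __init__(self, event_name, limit, window=3):
        self.event_name = event_name
        self.limit = limit
        self.samples = deque(maxlen=window)

    def observe(self, value):
        """Record one sample; return the event name if the rule fires, else None."""
        self.samples.append(value)
        window_full = len(self.samples) == self.samples.maxlen
        mean = sum(self.samples) / len(self.samples)
        if window_full and mean > self.limit:
            return self.event_name
        return None


if __name__ == "__main__":
    rule = ThresholdRule("HighCPU", limit=80.0, window=3)
    for cpu_percent in (90.0, 95.0, 85.0, 10.0):
        print(cpu_percent, rule.observe(cpu_percent))
```

Averaging over a window rather than reacting to a single sample is one way to avoid triggering an algorithm on a momentary spike.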
XRM is an event-based framework
• Many algorithms can be developed and plugged in
• The algorithms register for specific events
– High CPU utilization
– Packet drops
– PowerOff
– PowerOn
– …
• Different algorithms may take different actions
A Common Abstraction for ALL Algorithms
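The register-then-dispatch pattern described above might be sketched as follows. The `EventBus` class, the handler functions, and the event-name strings are illustrative assumptions and do not reflect XRM's real interface.

```python
from collections import defaultdict


class EventBus:
    """Maps event names to the algorithm callbacks registered for them."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def register(self, event_name, handler):
        """Subscribe an algorithm callback to one event type."""
        self._handlers[event_name].append(handler)

    def dispatch(self, event_name, payload):
        """Call every handler registered for the event; collect their actions."""
        return [handler(payload) for handler in self._handlers.get(event_name, [])]


def migrate_on_high_cpu(payload):
    """Hypothetical algorithm: respond to high CPU by proposing a migration."""
    return ("migrate", payload["vm"])


def log_packet_drops(payload):
    """Hypothetical algorithm: respond to packet drops by logging the host."""
    return ("log", payload["host"])


if __name__ == "__main__":
    bus = EventBus()
    bus.register("HighCPU", migrate_on_high_cpu)
    bus.register("PacketDrops", log_packet_drops)
    print(bus.dispatch("HighCPU", {"vm": "vm1"}))
    print(bus.dispatch("PowerOff", {}))
```

Because every algorithm sees only an event name and a payload, new algorithms can be added without changing the monitoring side.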
What algorithms can you implement?
• AutoControl – automated control of multiple virtualized resources [PadalaEurosys09]
• Models application and sets VM shares based on application goals
[Figure: three App Controllers and two Node Controllers; labels: Goals, Resource Shares]
[PadalaEurosys09] Pradeep Padala, Xiaoyun Zhu, Mustafa Uysal et al. Automated Control of Multiple Virtualized Resources. In Proceedings of EuroSys 2009.
XRM features
• Interface to upper layers
• Auto-* features
• External control
• Pluggable algorithms
• Extensibility
XRM Implementation
• Implemented on XCP 0.1.1
• Written in Python
• Pluggable algorithms have to be written in Python
• Currently implements four algorithms
– Bin packing
– Bin packing + Live migration
– Random host
– Round-robin
• We have also implemented a simulator (runs 1 million VMs on 100,000 nodes!)
– Can capture data during a “real” run
– Can run multiple algorithms on the exact same trace
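Three of the placement strategies named above can be sketched in a few lines each. These are textbook versions, not XRM's code: first-fit is used as one common bin-packing heuristic, and the function names, the per-VM core demands, and the single-number host capacity are assumptions made for illustration.

```python
import random


def first_fit(demands, num_hosts, capacity):
    """Bin packing (first-fit): put each VM on the first host with room."""
    load = [0] * num_hosts
    placement = []
    for demand in demands:
        for host in range(num_hosts):
            if load[host] + demand <= capacity:
                load[host] += demand
                placement.append(host)
                break
        else:
            raise ValueError("no host has enough free capacity")
    return placement


def round_robin(demands, num_hosts, capacity):
    """Round-robin: cycle through hosts, skipping any that are full."""
    load = [0] * num_hosts
    placement = []
    next_host = 0
    for demand in demands:
        for offset in range(num_hosts):
            host = (next_host + offset) % num_hosts
            if load[host] + demand <= capacity:
                load[host] += demand
                placement.append(host)
                next_host = (host + 1) % num_hosts
                break
        else:
            raise ValueError("no host has enough free capacity")
    return placement


def random_host(demands, num_hosts, capacity, rng=None):
    """Random host: pick uniformly among the hosts that still have room."""
    rng = rng or random.Random()
    load = [0] * num_hosts
    placement = []
    for demand in demands:
        candidates = [h for h in range(num_hosts) if load[h] + demand <= capacity]
        if not candidates:
            raise ValueError("no host has enough free capacity")
        host = rng.choice(candidates)
        load[host] += demand
        placement.append(host)
    return placement


def hosts_used(placement):
    """Number of distinct hosts that received at least one VM."""
    return len(set(placement))


if __name__ == "__main__":
    demands = [1, 1, 1, 1, 1]  # cores requested by each VM (made-up values)
    for algorithm in (first_fit, round_robin, random_host):
        placement = algorithm(demands, num_hosts=5, capacity=4)
        print(algorithm.__name__, placement, hosts_used(placement))
```

Feeding the same list of demands to each function mirrors the simulator's idea of replaying one trace through several algorithms, so differences in the number of hosts used come from the algorithm alone.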
XRM Evaluation
• 5 hosts, 4 cores
• Random utilizations
• Random slice requests
• Three algorithms
– Bin-packing
– Round-robin
– Random-host
• Slicing algorithms evaluated in previous work: AutoControl [PadalaEurosys09]
Comparing three algorithms
[Figure: host utilization (y-axis) over time intervals 1 to 9 (x-axis), one panel per algorithm]
• Round-Robin: uses all five hosts, wasting energy
• Random Host: uses <= five hosts, wasting energy
• Bin Packing: uses <= three hosts!
AutoControl experiments
• Experiments on Emulab
• 20 server nodes – 80 VMs
• 20 client nodes
• Mix of applications
• Load increased on ½ of the VMs chosen randomly
[Figure: application performance over time under default Xen and under AutoControl for VM1 to VM4 on underloaded and overloaded nodes, shown as good or bad relative to the target. Annotations: no control needed; AutoControl can readjust; SLO (performance goal) violations]
Summary
• Resource management in cloud infrastructures is complex
– Multiple layers of RM
– Complex primitives
– Complex decisions
• We are developing an RM based on feedback control theory
• XRM is event-based, pluggable, and extensible
• Complex algorithms like AutoControl can be developed
• Research in advanced algorithms is in progress
Summary of our experiences with XCP 0.1.1
• We are trying to build a research cloud based on XCP
• Besides XRM, we are adding fault tolerance and a web-based GUI to XCP
• Having to install a special distribution is difficult
– Why not have XCP as a set of packages in RHEL or other distributions?
– You are breaking toolstacks developed at various companies
• XCP docs are the same as the Citrix XenServer docs
– Some of the features don’t work or are not supported
– Better documentation of the API is needed
• XCP GUI needs to improve
– Bugs in OpenXenCenter
[Reference] Overview of provided features
We want feedback from the Xen community
• Comments on XRM architecture
• Should we incorporate XRM into XCP?
– OCaml
• Are you interested in open source XRM?
– Does the community want to be involved?
• Questions?