
Robert Deaver, Sept. 24, 2009: Energy Management for Servers and Clusters


Transcript
  • Slide 1
  • Robert Deaver, Sept. 24, 2009: Energy Management for Servers and Clusters
  • Slide 2
  • Why Manage Energy: Reduce cost. Rack energy usage could account for 23%-50% of colocation revenue [Elnozahy], and rate tariffs or up-front deposits may be required by utility companies [J. Mitchell-Jackson]. Reduce heat, which also reduces cost, allows higher server density, and reduces failures.
  • Slide 3
  • Two Approaches: (1) Single servers: "Energy Conservation Policies for Web Servers," Elnozahy, Kistler, and Rajamony, Proceedings of the 4th USENIX Symposium on Internet Technologies and Systems, March 2003. (2) Server clusters: "Multi-mode Energy Management for Multi-tier Server Clusters," Horvath and Skadron, PACT 2008.
  • Slide 4
  • Energy Conservation Policies for Web Servers: Three policies for energy reduction: Dynamic Voltage Scaling (DVS), Request Batching, and DVS + Request Batching. All three policies trade system responsiveness to conserve energy. Results are evaluated using a simulator and a hardware testbed.
  • Slide 5
  • The Policies: Focus on reducing CPU energy; the CPU is the dominant consumer [Bohrer] and exhibits the most variation in energy consumption. Feedback-driven control framework: the administrator specifies a percentile-based response-time goal; most experiments use a 50 ms 90th-percentile response-time goal.
  • Slide 6
  • The Policies: DVS: Varies CPU frequency and voltage to conserve energy while meeting response-time requirements. Most beneficial for moderate workloads. Not task-based! A task-based approach works well in a desktop environment but not in a server environment. Ad-hoc controller: if the response-time goal is being met, decrease the CPU frequency; otherwise, increase it.
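The ad-hoc controller above reduces to a one-step feedback rule. A minimal sketch, assuming a fixed frequency table and a sliding window of recent response times (the window, step table, and names are illustrative, not from the paper):

```python
import numpy as np

# Illustrative DVS table: 300-600 MHz in roughly 33 MHz steps.
FREQ_STEPS_MHZ = [int(f) for f in np.linspace(300, 600, 10)]
RESPONSE_GOAL_MS = 50.0  # 90th-percentile goal used in most experiments

def dvs_control_step(recent_response_times_ms, current_step):
    """One iteration of the ad-hoc DVS feedback loop."""
    p90 = np.percentile(recent_response_times_ms, 90)
    if p90 <= RESPONSE_GOAL_MS:
        return max(current_step - 1, 0)                    # goal met: decrease CPU frequency
    return min(current_step + 1, len(FREQ_STEPS_MHZ) - 1)  # goal missed: increase CPU frequency
```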
  • Slide 7
  • The Policies: Request Batching: (1) Delay servicing of incoming requests; (2) keep the CPU in a low-power state; (3) packets accumulate in a buffer; (4) when a packet has been kept pending longer than the specified batching timeout, wake up and process the accumulated requests. If the CPU's low-power state saves 2.5 W and server utilization is 25%, it is possible to save 162 kJ/day. Most beneficial for very light workloads. Ad-hoc controller: if the response-time goal is being met, increase the batching timeout; otherwise, decrease it.
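The 162 kJ/day figure follows directly from the numbers on the slide: the 2.5 W saving applies during the roughly 75% of the day the 25%-utilized server is idle:

```latex
2.5\,\mathrm{W} \times 0.75 \times 86{,}400\,\mathrm{s/day} = 162{,}000\,\mathrm{J/day} = 162\,\mathrm{kJ/day}
```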
  • Slide 8
  • The Policies: Combined: Uses Request Batching when the workload is very light and DVS when the workload is moderate.
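A minimal sketch of this mode switch; the utilization threshold and names are illustrative assumptions, since the slide does not give the exact switching criterion:

```python
# Illustrative threshold below which the workload counts as "very light".
LIGHT_LOAD_UTILIZATION = 0.10

def choose_policy(cpu_utilization):
    """Select Request Batching under very light load, DVS otherwise."""
    if cpu_utilization < LIGHT_LOAD_UTILIZATION:
        return "request_batching"
    return "dvs"
```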
  • Slide 9
  • Workloads: Constructed from web server logs and extended by modifying the inter-arrival time of connections by a scale factor. (Columns: Olympics98 / Finance / Disk Intense.)
    Avg requests (peak requests) per second: 97 (171) / 16 (46) / 15 (30)
    Avg requests per connection: 12 / 8.5 / 31
    Unique files (total file size): 61,807 (795 MB) / 16,872 (171 MB) / 698,232 (6,205 MB)
    Distinct HTTP requests: 8,370,093 / 1,360,886 / 1,290,196
    Total response size (excl. HTTP headers): 49,871 MB / 2,811 MB / 10,172 MB
    97% / 98% / 99% (MB): 24.8 / 50.9 / 141; 3.74 / 6.46 / 13.9; 2,498 / 2,860 / 3,382
  • Slide 10
  • Salsa, a simulator: Estimates the energy consumption and response time of a web server. Based on a queuing model built using the CSIM execution engine; models process scheduling and file cache hits and misses. Validated against real hardware. Hardware model: CPU frequency 600 MHz; P_max 27.2 W; P_idle 4.97 W; P_DeepSleep 2.47 W; DVS range 300 MHz to 600 MHz in 33 MHz steps.
  • Slide 11
  • Prototype: Used to validate Salsa. Specs: 600 MHz CPU, Linux 2.4.3 kernel, Apache web server. The prototype does not place the CPU into a low-power state and does not use response-time feedback control, so Salsa is run in open-loop mode for validation.
  • Slide 12
  • Validation: Energy: The server batched requests for 11,953 s; Salsa predicted 12,373 s (3.5% error).
  • Slide 13
  • Validation: Response Time: 4.7% error.
  • Slide 14
  • Evaluation: DVS (Response Time) [figure; annotation: heavier workload].
  • Slide 15
  • Evaluation: DVS (Workload) [figure].
  • Slide 16
  • Evaluation: Request Batching (Response Time) [figure; annotation: heavier workload].
  • Slide 17
  • Evaluation: Request Batching (Workload) [figure].
  • Slide 18
  • Evaluation: DVS vs. Request Batching: Energy savings depend on the workload; both policies are effective for energy conservation. (Columns: Olympics98-4x / Finance-12x / Disk-Intense-2x.)
    Base energy (J): 1,254,672 / 739,212 / 663,648
    Base 90th-percentile response time (ms): 12.3 / 6.4 / 3.0
    DVS energy in J (% savings): 915,204 (27%) / 518,844 (30%) / 494,982 (25%)
    Request Batching energy in J (% savings): 1,166,128 (7.0%) / 606,468 (18.0%) / 525,836 (20.8%)
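The percentage columns are simply ratios to the base energy; for Olympics98-4x:

```latex
1 - \frac{915{,}204}{1{,}254{,}672} \approx 27\% \ \text{(DVS)}, \qquad
1 - \frac{1{,}166{,}128}{1{,}254{,}672} \approx 7\% \ \text{(Request Batching)}
```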
  • Slide 19
  • Evaluation: Combined Policy [figure]: Finance-12x, 50 ms 90th-percentile response-time goal.
  • Slide 20
  • Evaluation: Combined Policy vs. DVS vs. Request Batching [figure].
  • Slide 21
  • Evaluation: Combined Policy [figure].
  • Slide 22
  • Evaluation: Combined Policy [figure].
  • Slide 23
  • Faster Processors: Current CPU clock rates are well above 600 MHz. DVS savings (as a percentage of energy consumed) remain the same, while Request Batching savings increase. These results have not been validated against real hardware. Hardware model: CPU frequency 3.0 GHz; P_max 60 W; P_idle 10 W; P_DeepSleep 5 W; DVS range 1.5 GHz to 3.0 GHz in 150 MHz steps.
  • Slide 24
  • Faster Processors: Simulation Results [figure].
  • Slide 25
  • Related Work: DVS: CPU utilization over past intervals is used to predict future utilization [Govil][Weiser]; CPU frequency/voltage is set on a per-task basis [Flautner]; these perform well for desktop systems but not in a server environment. Simulation: Wattch, a microprocessor power analysis tool [Brooks et al.]; PowerScope, a tool for profiling application energy use [Flinn et al.]. Salsa is substantially faster because it is targeted at web workloads.
  • Slide 26
  • Conclusions: DVS varies CPU frequency and voltage to save energy, with the most savings on medium workloads. Request Batching groups requests and processes them in batches when the server is under-utilized, keeping the CPU in sleep mode as much as possible, with the most savings on light workloads. DVS + Request Batching gives the best of both policies, saving 17%-42% of CPU energy across a broad range of workloads.
  • Slide 27
  • Critique: Request Batching is never compared to a policy that uses deep sleep but does not batch requests. The DVS and Request Batching controllers are ad-hoc solutions with no control-theoretic analysis. The policies are only tested on static content.
  • Slide 28
  • Two Approaches: (1) Single servers: "Energy Conservation Policies for Web Servers," Elnozahy, Kistler, and Rajamony, Proceedings of the 4th USENIX Symposium on Internet Technologies and Systems, March 2003. (2) Server clusters: "Multi-mode Energy Management for Multi-tier Server Clusters," Horvath and Skadron, PACT 2008.
  • Slide 29
  • Multi-mode Energy Management for Multi-tier Server Clusters: Uses DVS and multiple sleep states to manage energy consumption for a server cluster. Provides a theoretical analysis of power optimization and validates the policies on a multi-tier server cluster. Cluster-wide energy savings of up to 25% with no performance degradation!
  • Slide 30
  • Current Solutions: Focus on the active portion of the cluster. Dynamic Voltage Scaling (DVS), used on a per-server basis, increases power efficiency by slowing down the CPU. Dynamic cluster reconfiguration consolidates load on a subset of servers; servers left unused after consolidation are shut down.
  • Slide 31
  • Related work: Distributing demand to a cluster subset [Pinheiro et al.]: a PID controller compensates for transient demand variations; 45% energy savings; the static web workload was interface-bound with peak CPU utilization of 25%; machines are assumed to have lower than actual capacity to compensate for wakeup latency. Cluster reconfiguration combined with DVS [Elnozahy et al.]: assumes a cubic relation between CPU frequency and power; very different results due to the different power model.
  • Slide 32
  • Outline: Models; Energy Management and Optimization; Policies; Experiments and Analysis.
  • Slide 33
  • System Model: Multi-tier server cluster; all machines in one tier run the same application; requests go through all tiers; end-to-end performance is subject to a Service Level Agreement (SLA). Assumptions: all machines in a single tier have identical power and performance characteristics, and load balancing within a tier is perfect (required for analytical tractability; observations show moderate imbalances are insignificant).
  • Slide 34
  • Power Model: Obtained through power measurements from a large pool of characterization experiments varying U_i and f_i. Power usage is approximately linear.
  • Slide 35
  • Power Model: P_i: power; U_i: utilization; f_i: frequency. The parameters a_ij are found through curve fitting; the test system had an average error of 1%.
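The fitted model itself does not survive in this transcript; a minimal curve-fitting sketch, assuming a form linear in utilization and frequency (the paper's actual regressors and a_ij terms may differ):

```python
import numpy as np

def fit_power_model(utilization, freq, power_w):
    """Least-squares fit of P = a0 + a1*U + a2*f from characterization runs."""
    X = np.column_stack([np.ones_like(utilization), utilization, freq])
    coeffs, *_ = np.linalg.lstsq(X, power_w, rcond=None)
    return coeffs  # [a0, a1, a2]

def predict_power(coeffs, utilization, freq):
    a0, a1, a2 = coeffs
    return a0 + a1 * utilization + a2 * freq
```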
  • Slide 36
  • Service Latency Model (SLM): The service latency of short requests is mostly a function of CPU utilization. Prediction: (1) estimate the current offered load λ_i from measurements; (2) predict U_i based on λ_i. The SLM is obtained via regression analysis using a heuristically chosen functional form.
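A sketch of the two-step prediction, assuming a simple polynomial regression for U = g(λ) (the paper's heuristically chosen form may differ):

```python
import numpy as np

def estimate_offered_load(requests_in_period, period_s):
    """Step 1: estimate the current offered load (requests/s) from measurements."""
    return requests_in_period / period_s

def fit_utilization_model(load_samples, utilization_samples, degree=2):
    """Fit the regression U = g(load) from measured (load, utilization) pairs."""
    return np.polyfit(load_samples, utilization_samples, degree)

def predict_utilization(model, offered_load):
    """Step 2: predict utilization for the estimated load."""
    return float(np.polyval(model, offered_load))
```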
  • Slide 37
  • Outline: Models; Energy Management and Optimization; Policies; Experiments and Analysis.
  • Slide 38
  • Multi-mode Energy Management: Must consider both active and idle (sleeping) nodes. Minimizing E_transition is less important because Internet server workloads fluctuate on a larger time scale.
  • Slide 39
  • Active Energy Optimization: Assigns machines to tiers and determines their operating frequencies. An energy management strategy is optimal iff total power consumption is minimal and the SLA is met.
  • Slide 40
  • Sleep Energy Optimization: Servers may support up to n sleep states (S-states). Assumptions: workload spikes are unpredictable, and arbitrarily large spikes are not supported; a Maximum Accommodated Load Increase Rate (MALIR) is defined to ensure the system can meet its target SLA. E_sleep is minimized by placing each unallocated server in the deepest possible sleep state subject to the MALIR constraint. [Diagram: states S_0 through S_n with power levels p_i and wake-up latencies w_i.]
  • Slide 41
  • Feasible Wakeup Schedule: What is the minimum number of servers for each sleep state? If the load increases at the MALIR rate, the cluster must wake up machines in time to respond! A feasible wakeup schedule exists iff cluster capacity keeps up with demand (c: cluster capacity; d: demand; c(t_0) and d(t_0) are assumed known).
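The condition itself is an image on the slide and is lost here. One consistent reading, writing ρ for the MALIR and counting machines already scheduled to wake up as part of future capacity, is that a feasible wakeup schedule exists iff capacity never falls behind the worst-case demand growth:

```latex
c(t) \;\ge\; d(t_0) + \rho\,(t - t_0) \qquad \text{for all } t \ge t_0
```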
  • Slide 42
  • Spare Servers: The optimal number of spare servers for each sleep state, and its discretized form, are given on the slide. Note: the derivation is included in the paper.
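The expressions are images on the slide and are not reproduced in this transcript. As a rough sketch of the intuition only (not the paper's exact result): with MALIR ρ, per-server capacity κ, and wake-up latency w_i for state S_i, each state needs roughly enough spare servers to absorb the load that can arrive during its additional wake-up latency, and the discretized form rounds up:

```latex
n_i \;\approx\; \frac{\rho\,(w_{i+1} - w_i)}{\kappa},
\qquad
n_i^{*} \;=\; \left\lceil \frac{\rho\,(w_{i+1} - w_i)}{\kappa} \right\rceil
```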
  • Slide 43
  • Outline: Models; Energy Management and Optimization; Policies; Experiments and Analysis.
  • Slide 44
  • Active Capacity Policy: Brute force: exhaustive search of all possible cluster configurations; does not scale to large clusters! Heuristic approach: assumes that powering on an additional machine and lowering the cluster's CPU frequency can never save power [Pinheiro et al.]; takes two rounds of calculations; similar to the queuing-theory-based approach by Chen et al.
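A minimal sketch of the brute-force baseline (the heuristic is more involved); the configuration space and the predicted_power and meets_sla callables are illustrative assumptions:

```python
from itertools import product

def brute_force_config(per_tier_machine_counts, freq_levels, predicted_power, meets_sla):
    """Enumerate every (machine count, frequency) choice per tier and keep the
    cheapest configuration that still meets the SLA. Does not scale to large clusters."""
    per_tier_options = [
        [(n, f) for n in counts for f in freq_levels]
        for counts in per_tier_machine_counts
    ]
    best, best_power = None, float("inf")
    for config in product(*per_tier_options):
        if not meets_sla(config):
            continue
        power = predicted_power(config)
        if power < best_power:
            best, best_power = config, power
    return best
```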
  • Slide 45
  • Spare Server Policy - Optimal: [Flowchart] While the number of idle nodes in S0 exceeds the target S0*, place idle nodes into one of the sleep states S1, S2, ..., Sn-1, Sn; otherwise, done.
  • Slide 46
  • Spare Server Policy - Demotion: Maintain a list (idle_since) that tracks the count of idle machines and the time each smaller count was first seen. During each control period the list is used to determine the optimal number of machines for each sleep state; working from shallower states to deeper ones, nodes are demoted to states that have a deficit of machines, starting with the deepest state. (A sketch of the bookkeeping follows.)
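A simplified sketch of the idle_since bookkeeping used in the walkthrough on the next slides; the list layout and the eligibility test are paraphrased assumptions, not the paper's exact pseudocode:

```python
import time

# idle_since[k] holds the time since which at least k+1 machines have been idle.
idle_since = []

def update_idle_since(num_idle, now=None):
    """Grow or shrink the list so its length tracks the current idle count."""
    now = time.time() if now is None else now
    if num_idle > len(idle_since):
        idle_since.extend([now] * (num_idle - len(idle_since)))  # new slots start idling now
    else:
        del idle_since[num_idle:]  # drop the most recently idled slots

def eligible_for_state(wakeup_latency_s, margin_s=0.0, now=None):
    """Count idle slots whose idle time exceeds a state's wake-up latency
    plus a margin (the slide's t > t* + w_i test)."""
    now = time.time() if now is None else now
    return sum(1 for t in idle_since if now - t > margin_s + wakeup_latency_s)
```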
  • Slide 47
  • Spare Server Policy - Demotion (walkthrough): The idle_since list of timestamps is initialized empty at t_0.
  • Slide 48
  • Spare Server Policy - Demotion (walkthrough): At t_10, six machines are idling, so idle_since grows until sizeof(idle_since) equals the number of idling machines.
  • Slide 49
  • Spare Server Policy - Demotion (walkthrough): Later only two machines are idling, so idle_since shrinks until sizeof(idle_since) again equals the number of idling machines.
  • Slide 50
  • Spare Server Policy - Demotion (walkthrough): At t_50, idle_since holds the timestamps t_10, t_20, t_30, and t_50. S-state occupancy: S0 has 6 running and 2 idle nodes; S3 has 2 nodes; S4 has 2 nodes.
  • Slide 51
  • Spare Server Policy - Demotion (walkthrough): In the same state at t_50, each idle_since entry is checked against the condition t > t* + w_i for the candidate sleep states (here i = 3 and i = 4); in this example 4 entries qualify for S3 and 2 qualify for S4.
  • Slide 52
  • Load Estimation: Performance monitors detect when response times exceed D and detect errors from overload (timeouts). Once either monitor triggers a fault, feedback control is used to drive performance back within the SLA spec.
  • Slide 53
  • Outline: Models; Energy Management and Optimization; Policies; Experiments and Analysis.
  • Slide 54
  • Experiment: A 12-node, 4-tier web server cluster: front-end load balancer, web (HTTP) servers, application servers, and database servers. Baseline: the cluster is statically provisioned for peak load. The test load is a 3-tier implementation of the TPC-W benchmark.
  • Slide 55
  • Performance and Energy Efficiency [figure].
  • Slide 56
  • Total Energy Savings [figure].
  • Slide 57
  • Key Observations: Energy savings of 6-14% from exploiting multiple sleep states. The average gain for Demotion is 10%, for Optimal 7%. Optimal's workload sensitivity (7-9%) is smaller than Demotion's (6-14%). Optimal is the overall winner, with 7% savings over Demotion.
  • Slide 58
  • Conclusions: Energy can be saved in server clusters in both the active and spare capacities. Active capacity optimization can be achieved through DVS; spare capacity optimization can be achieved by using multiple sleep states. Multiple sleep states save up to 50% more energy than an off-only solution. The Optimal policy is superior to the Demotion policy.
  • Slide 59
  • Critique: Notation is not always clearly defined; the algorithm explanations are hard to follow; the newest server trace was 10 years old when the paper was published!
  • Slide 60
  • Request Batching vs. Cluster Reconfiguration: Request Batching focuses on single web servers with light load, relies primarily on simulation plus a hardware testbed, and uses ad-hoc controllers. Cluster Reconfiguration focuses on tiered server clusters with varying load, has a mathematical foundation, and uses control-theory-based controllers.
