© 2010 IBM Corporation
What computer architects need to know about memory throttling
WEED 2010, June 20, 2010
IBM Research – Austin
Heather Hanson
Karthick Rajamani
Outline
Memory throttling overview
Experimental platform
– System configuration
– Memory throttling implementation
Memory throttling characterization
– Bandwidth
– Power
– Performance
Summary
Memory throttling in a nutshell
Memory throttling is a power-performance knob that:
– impacts memory reference rates of both instruction and data streams
– controls power
– can be used for safety or optimization
  • regulate DIMM temperatures
  • enforce memory power budgets
Memory throttling restricts read & write traffic
– directly controls memory power
– indirectly affects processors and other components
Several implementation styles in commercial systems
– insert periodic idle cycles
– allow an arbitrary number of transactions up to an estimated power threshold
– run + hold windows
– enforce read & write quotas [this paper]
  • first N transactions proceed in each time window
  • any further requests wait until the next time period
Comparison to clock throttling
Run-hold clock throttling: regular frequency during run portion; clock halted during hold portion
Quota-style memory throttling: reads & writes proceed as requested, up to N requests per period
Example: N = 6. Up to 6 transactions serviced per period, regardless of request timing
[Figure annotation: Nth request in each period; additional requests would be queued for later service]
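The quota style compared above can be sketched in a few lines of Python. This is an illustrative model only, not the POWER6 hardware logic: the function name quota_throttle, the cycle-granular timestamps, and the simplification that a granted request is serviced instantly are all assumptions made for the example.

```python
def quota_throttle(arrivals, n, period):
    """Return the service time of each request under an N-per-period quota.

    Hypothetical model (not the actual memory controller logic):
      arrivals -- request arrival times in memory cycles, sorted ascending
      n        -- transactions allowed per period (n >= 1)
      period   -- length of one quota window in cycles
    Requests are served in FIFO order and service latency is ignored.
    """
    service_times = []
    window = 0  # index of the window currently being filled
    used = 0    # transactions already granted in that window
    for t in arrivals:
        arrival_window = t // period
        if arrival_window > window:
            # The request arrives in a later window, so the quota is fresh.
            window, used = arrival_window, 0
        if used == n:
            # Quota exhausted: the request is held until the next window.
            window, used = window + 1, 0
        used += 1
        service_times.append(max(t, window * period))
    return service_times


# N = 6 per 10-cycle period: the 7th and 8th requests wait until cycle 10.
print(quota_throttle([0, 1, 2, 3, 4, 5, 6, 7, 50], n=6, period=10))
# -> [0, 1, 2, 3, 4, 5, 10, 10, 50]
```

Unlike run-hold clock throttling, nothing is delayed here until the quota is actually used up, so a workload that issues fewer than N requests per period never sees the throttle.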
POWER6 Memory Throttling
IBM JS12 blade system
– Processor
  • POWER6
  • 1 socket x 2 cores per processor socket
  • 3.8 GHz frequency (fixed in these experiments)
  • SLES10 Linux
– Memory
  • 16 GB capacity
  • 8 DIMMs x 2 GB each
  • DDR2
  • 667 MHz bus
Quota-style memory throttling
– N transactions per M memory cycles
– 100% throttle level == unthrottled
– Time period is shorter than thermal and power supply timescales
Memory throttle characterization methodology
1. Sweep throttle settings
  • Set throttle
  • Run steady-behavior benchmark
    – DAXPY (double A * X plus Y)
    – FPMAC (floating-point multiply accumulate)
    – RandomMemory (generate random addresses)
    – SPECPower_ssj2008 calibration phase (peak throughput for warehouse transactions)
  • Record sensor data, 256 ms per sample
    – Memory power
    – Memory reads & writes
    – Instruction throughput
    – And other sensors not shown here
  • Decrement throttle
  • Repeat for full range of throttle settings
2. Repeat throttle sweep for multiple benchmarks and memory footprints
  – Microbenchmarks: L1 cache contained and main memory footprints
  – SPECPower_ssj2008: behaves as nearly contained in on-chip caches
3. Calculate median sensor data for each permutation {benchmark, footprint, throttle}
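The three steps above amount to a nested sweep followed by a per-setting median. The Python sketch below shows that structure only. The sensor function read_memory_power is a hypothetical stand-in that returns synthetic numbers (it is not measured data and not the EnergyScale interface), and on real hardware the inner loop would set the throttle, run the benchmark, and read one sensor sample every 256 ms.

```python
import random
from statistics import median


def read_memory_power(throttle_pct, rng):
    """Hypothetical stand-in for one 256 ms memory power sample (watts).

    The formula and noise are arbitrary placeholders, not measurements.
    """
    return 10.0 + 0.2 * throttle_pct + rng.gauss(0.0, 0.5)


def sweep(benchmarks, footprints, throttle_levels, samples_per_setting=20, seed=0):
    """Return {(benchmark, footprint, throttle): median memory power}."""
    rng = random.Random(seed)
    results = {}
    for bench in benchmarks:
        for footprint in footprints:
            for level in throttle_levels:  # e.g. 100, 90, ..., 10
                # Assumed placeholder for: set throttle, run benchmark, sample.
                samples = [read_memory_power(level, rng)
                           for _ in range(samples_per_setting)]
                # Step 3: one median per {benchmark, footprint, throttle}.
                results[(bench, footprint, level)] = median(samples)
    return results


table = sweep(["DAXPY", "FPMAC"], ["L1", "DIMM"], range(100, 0, -10))
print(len(table))  # 2 benchmarks x 2 footprints x 10 levels = 40 entries
```

The median is less sensitive than the mean to an occasional outlier sample, which suits a steady-behavior benchmark with a few noisy readings.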
Memory throttle effect on bandwidth
[Chart annotations: linear region; transition between linear & saturated regions; saturated region]
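A first-order way to picture the linear and saturated regions is to treat delivered bandwidth as the smaller of what the workload asks for and what the throttle allows. The Python sketch below is an idealized model under stated assumptions: the throttle cap scales linearly with the throttle percentage, and the names delivered_bandwidth, demand_gbs, and peak_gbs are hypothetical. The sharp knee it produces stands in for the gradual transition region that the measurements show.

```python
def delivered_bandwidth(throttle_pct, demand_gbs, peak_gbs):
    """Idealized delivered bandwidth (GB/s) at a given throttle level.

    Assumptions (not from the measurements):
      - the throttle cap is peak_gbs * throttle_pct / 100
      - the workload would use demand_gbs if it were never throttled
    """
    cap = peak_gbs * throttle_pct / 100.0
    # Linear region: the cap is the limit. Saturated region: demand is.
    return min(demand_gbs, cap)


# A workload that wants 4 GB/s on a memory system that peaks at 10 GB/s.
for level in (20, 40, 60, 80, 100):
    print(level, delivered_bandwidth(level, demand_gbs=4.0, peak_gbs=10.0))
```

In this model the workload is throttle-limited below the 40% setting and saturated above it, so raising the throttle level further buys no extra bandwidth.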
A closer look at RandomMemory-DIMM
• uses less bandwidth than other benchmarks at the same throttle levels
• also uses less bandwidth than its own saturation level
Simply measuring bandwidth at a single (current) throttle level is not enough to identify the region of operation: bandwidth below the maximum could indicate either the saturated region or the transition region
….a controller will not be able to accurately predict the effect on bandwidth of a throttle level change
…or predict the effect on power or performance
Subtle but very important point about transition region
Actual bandwidth < max bandwidth: bandwidth restrictions -> pipeline starvation -> reduced request rate
Memory Power
Memory power is basically linear with bandwidth, so this chart looks familiar…
Throttling effects relative to each benchmark
[Chart panels: power; performance]
L1-contained DAXPY: throttling has no effect
DIMM-sized DAXPY: drastic effect
Generally more performance reduction than power reduction (in %)
– Throttling alone doesn’t affect the static portion of memory power
  • Leveraging idle low-power modes of memory can improve the power-performance curve for memory request rate throttling
– Possible to waste energy from longer execution time
Larger bandwidth demand means a larger effect from throttling
– Conversely, power is reduced only when performance is impacted
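A small worked example shows why the percentage performance loss can exceed the percentage power saving, and how energy can go up. The Python sketch below assumes a fully bandwidth-bound workload whose run time scales inversely with delivered bandwidth, and memory power split into a static part plus a dynamic part proportional to bandwidth. The function name and the wattage values are hypothetical, chosen only for illustration.

```python
def throttling_tradeoff(static_w, dynamic_w, bw_fraction):
    """Compare a throttled run against the unthrottled run.

    Assumptions (illustrative, not measured):
      static_w    -- memory power that throttling cannot remove (watts)
      dynamic_w   -- bandwidth-proportional memory power when unthrottled
      bw_fraction -- delivered bandwidth relative to unthrottled, 0 < f <= 1
    The workload is fully bandwidth-bound, so run time grows as 1 / f.
    """
    base_power = static_w + dynamic_w
    power = static_w + dynamic_w * bw_fraction
    runtime = 1.0 / bw_fraction  # relative to an unthrottled run time of 1.0
    return {
        "power_reduction_pct": 100.0 * (1.0 - power / base_power),
        "perf_reduction_pct": 100.0 * (1.0 - bw_fraction),
        "energy_ratio": power * runtime / base_power,
    }


# 10 W static + 30 W dynamic, throttled to half the unthrottled bandwidth.
print(throttling_tradeoff(static_w=10.0, dynamic_w=30.0, bw_fraction=0.5))
# -> {'power_reduction_pct': 37.5, 'perf_reduction_pct': 50.0, 'energy_ratio': 1.25}
```

With these numbers, halving bandwidth saves 37.5% of memory power but costs 50% of throughput, and the memory uses 25% more energy to finish the same work because the static power is paid for twice as long.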
Summary
Memory throttling is a power-performance knob available in commercial systems
Memory controller restricts read & write bandwidth
– caps memory power
– controls DIMM temperature
Mileage may vary
– power and performance management depend on bandwidth demand
  • throttling a low-bandwidth workload doesn’t reduce much power
– potential to use more energy due to increased execution time
  • use highly throttled settings with caution
Effective tool for power capping
– power constrained configurations
– thermal safety
– power shifting
Acknowledgements
IBM Research – Austin
IBM Systems & Technology Group
– Memory characterization: Joab Henderson, Kenneth Wright
– EnergyScale firmware: Guillermo Silva, Andrew Geissler