Date post: | 31-Dec-2015 |
System-level, Unified In-band and Out-of-band Dynamic Thermal Control
Dong Li Virginia Tech
Rong Ge Marquette University
Kirk Cameron Virginia Tech
Motivation
Hot spots or elevated temperatures in areas of the data center are quite common
Out-of-band techniques (e.g. CPU cooling fans) are less studied
In-band and out-of-band techniques operate independently without cooperating with each other
Challenge 1: enforcing the same user control policy across diverse physical mechanisms
Challenge 2: the need for a tunable controller
Temperature Characteristics of Parallel Applications
Three temperature characteristics
Sudden change and gradual change lead to actual temperature increase or decrease
Jitter lacks sustained increase or decrease following a spike
Design a controller to recognize these types and respond accordingly
History-based Context-aware Temperature Control (Basic idea)
Periodically profile temperature and use the historical information to predict future CPU temperature
Based on the prediction, identify the appropriate technique to perform thermal control and balance power and performance for the next interval
History-based Context-aware Temperature Control (Temperature Profiling and Prediction)
Use a two-level window to track the changes in temperature over both long and short time periods
[Figure: temperature samples enter the level-one temperature window, which reacts to “sudden” changes; an average value is taken to reduce jitter; the level-two temperature window (a FIFO with a front and a rear) reacts to “gradual” changes]
History-based Context-aware Temperature Control (Temperature Profiling and Prediction)
We assume that temperature will change at the same rate for the next round of sampling
The temperature difference (tL1/L2) is then used to determine the appropriate temperature regulator response
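The two-level window and the same-rate prediction can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name, the window sizes, the choice to feed level-one averages into the level-two FIFO, and the way each difference is computed (newest minus oldest entry) are all hypothetical.

```python
from collections import deque
from statistics import mean


class TwoLevelWindow:
    """Hypothetical two-level temperature window (sizes are assumptions)."""

    def __init__(self, l1_size=4, l2_size=5):
        self.l1_size = l1_size
        self.l1 = []                      # raw samples of the current round
        self.l2 = deque(maxlen=l2_size)   # FIFO of level-one averages

    def add_sample(self, temp_c):
        """Add one sample; return (dt_l1, dt_l2) when a round completes, else None."""
        self.l1.append(temp_c)
        if len(self.l1) < self.l1_size:
            return None
        # Assumption: short-term ("sudden") change is newest minus oldest raw sample.
        dt_l1 = self.l1[-1] - self.l1[0]
        # Averaging the round damps jitter before it reaches level two.
        self.l2.append(mean(self.l1))
        self.l1 = []
        # Assumption: long-term ("gradual") change is FIFO rear minus FIFO front.
        dt_l2 = self.l2[-1] - self.l2[0]
        return dt_l1, dt_l2


def predict_next(current_temp_c, dt):
    """Assume the temperature keeps changing at the same rate next round."""
    return current_temp_c + dt
```

Feeding the samples 40, 41, 42, 43 and then 44, 45, 46, 47 yields one result per completed round; the second round reports a level-one change of 3 and a level-two change of 4.0.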
History-based Context-aware Temperature Control (Target Mode Identification)
Inputs:
Predicted temperature behavior based on the temperature profile (tL1/L2)
A parameter (Pp) specified by the user that indicates the aggressiveness of the temperature controller
Outputs:
Fan speed
Frequency setting
The controls follow the thermal control policy (Pp)
History-based Context-aware Temperature Control (Target Mode Identification)
We maintain a “thermal control array” for each available thermal control technique on the system
{g1, g2, g3, …, gnp, …, gN}
Each entry represents a mode that controls temperature to a certain degree
[Figure: array entries ordered from weak to strong effectiveness of controlling temperature]
History-based Context-aware Temperature Control (Target Mode Identification)
To coordinate multiple thermal management techniques, we fill out the arrays in a unified way
{g1, g2, g3, …, gnp, …, gN}
[Figure: array entries ordered from weak to strong effectiveness of controlling temperature]
np is determined by Pp
Entries g1 to gnp are filled with a subset of physically available modes evenly extracted from the full set
Entries gnp to gN are filled with the most effective mode gN
History-based Context-aware Temperature Control (Target Mode Identification)
np is determined by Pp
[Figure: mapping of Pp in the range PMIN to PMAX onto an array index np in the range 1 to N]
History-based Context-aware Temperature Control (Target Mode Identification)
[Figure: mapping of Pp in the range PMIN to PMAX onto the index np of entry gnp in the array {g1, …, gnp, …, gN}, with indices from 1 to N]
A smaller Pp leads to more aggressive thermal control
More array items store the most effective temperature control mode
A small increment in array index can lead to a large increment in cooling effect
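The mapping from Pp to np and the unified array fill can be sketched as below. This is only an illustration: the slides do not give the exact mapping, so a linear mapping is assumed, and the function names, the Pp range of 0 to 100, the array length N = 10, and the floor-based even extraction are all hypothetical choices.

```python
def map_pp_to_np(pp, p_min=0, p_max=100, n=10):
    """Assumed linear mapping of Pp in [p_min, p_max] onto an index np in [1, n]."""
    pp = min(max(pp, p_min), p_max)
    return 1 + int((pp - p_min) * (n - 1) // (p_max - p_min))


def build_control_array(modes, pp, n=10, p_min=0, p_max=100):
    """Fill a thermal control array of length n.

    modes: physically available modes, ordered from weakest to strongest.
    Slots 1..np hold modes evenly extracted from the full set;
    the remaining slots hold the strongest mode.
    """
    np_index = map_pp_to_np(pp, p_min, p_max, n)
    strongest = modes[-1]
    if np_index == 1:
        return [strongest] * n
    last = len(modes) - 1
    # Evenly spaced picks; the final pick is always the strongest mode.
    head = [modes[k * last // (np_index - 1)] for k in range(np_index)]
    return head + [strongest] * (n - np_index)
```

With ten hypothetical fan modes given as PWM duty cycles 10, 20, …, 100, Pp=25 gives np=3 and an array that reaches the strongest mode by the third entry, while Pp=75 gives np=7 and a more gradual ramp. This matches the statement that a smaller Pp stores the most effective mode in more array items.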
History-based Context-aware Temperature Control (Target Mode Identification)
We use the predicted temperature difference (tL1/L2) from the two-level window to identify an index in the thermal control array
{g1, …, gi, …, gi+c*t, …, gN}
gi is the current mode and gi+c*t is the next mode
[Figure: mapping of the temperature range TMIN to TMAX onto the array index range 1 to N]
c = (N-1)/(TMAX – TMIN)
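The index update from the current mode to the next mode can be sketched as follows. The scaling factor c = (N-1)/(TMAX – TMIN) is taken from the slide; rounding the step to an integer and clamping the result to the range 1 to N are assumptions added here so that the result is always a valid array index, and the function name is hypothetical.

```python
def next_mode_index(current_index, delta_t, n, t_min, t_max):
    """Return the next 1-based index in the thermal control array.

    delta_t: predicted temperature difference from the two-level window.
    """
    c = (n - 1) / (t_max - t_min)          # index steps per degree
    target = current_index + round(c * delta_t)
    return min(max(target, 1), n)          # assumption: clamp to a valid index
```

For example, with N = 10 and an assumed temperature range of 38 to 56 °C, c is 0.5, so a predicted rise of 4 °C moves the controller two entries toward the strong end of the array, and a predicted drop of 4 °C moves it two entries back.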
Performance Evaluation (Platform)
ADT7467 development board
Cooling fan on top of the processor
We implement a fan driver that dynamically sets the fan speed according to processor temperature
Collect temperature samples from digital thermal sensors embedded in the processor
The processor can be scaled among 5 frequencies
Performance Evaluation (Dynamic Fan Control)
[Figure: two panels with curves labeled Pp=75, Pp=50 and Pp=25 and numbered markers 1 and 2; axis ticks span 38 to 58 and 10 to 100]
Our dynamic fan control responds to temperature changes under different control policies (Pp=25 (aggressive), Pp=50 (moderate), and Pp=75 (weak))
Performance Evaluation (Dynamic Fan Control)
[Figure: temperature (°C) versus sample points under traditional fan control, under our fan control, and under max fan speed]
[Figure: PWM duty cycle versus sample points for traditional fan control, our fan control, and max fan speed]
Pp = 50; benchmark: bt.B.4
We compare our dynamic fan control method with the traditional static method and constant fan speed control
Performance Evaluation (Dynamic Fan Control)
In general, a larger maximum PWM duty cycle leads to lower temperature
With our dynamic control, a less powerful fan is able to deliver cooling effects similar to those of a more powerful fan
[Figure: temperature (°C) versus sample points for maximum PWM duty cycles of 25%, 50%, 75% and 100%]
[Figure: PWM duty cycle versus sample points for maximum PWM duty cycles of 100%, 75%, 50% and 25%]
Performance Evaluation (Temperature Aware DVFS Control)
Benchmark: LU.B.4; coupled with traditional static fan control; Pp=50
Our DVFS control scales down frequency only when the average temperature has stabilized
Our DVFS control scales frequency back up to its original value once the temperature is consistently below the threshold, so as to avoid performance loss
[Figure: temperature (°C) trace with a trigger temperature of 51 °C; annotated frequency changes: 2.4GHz --> 2.2GHz, then 2.2GHz --> 2.4GHz]
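The scale-down and scale-up rules of the temperature-aware DVFS control can be sketched as below. This is a rough illustration rather than the authors' controller: the slides do not define "stabilized" or "consistently", so both are assumed here to mean that the last few averaged temperatures all sit on the same side of the trigger temperature. The function name and the `stable_rounds` parameter are hypothetical; the 51 °C trigger and the 2.4 GHz and 2.2 GHz settings come from the figure annotations.

```python
def tdvfs_step(avg_temps, freq_ghz, freqs, trigger_c=51.0,
               original_ghz=2.4, stable_rounds=3):
    """Return the frequency to use for the next interval.

    avg_temps: history of averaged temperatures, newest last.
    freqs: available frequency settings in GHz.
    """
    recent = avg_temps[-stable_rounds:]
    if len(recent) < stable_rounds:
        return freq_ghz                      # not enough history yet
    if all(t >= trigger_c for t in recent):
        # Assumption: "stabilized" hot means every recent average is at or above the trigger.
        lower = [f for f in freqs if f < freq_ghz]
        return max(lower) if lower else freq_ghz
    if all(t < trigger_c for t in recent):
        # Consistently below the threshold: restore the original frequency.
        return original_ghz
    return freq_ghz                          # mixed readings: hold the setting
```

Requiring several consecutive averages before acting is one way to avoid reacting to jitter, which would be consistent with the small number of frequency changes reported for tDVFS.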
Performance Evaluation (Temperature Aware DVFS Control)
Our DVFS control (tDVFS) performs better than CPUSPEED in terms of power saving and performance
[Figure: temperature (°C) traces for CPUspeed and tDVFS; annotated frequency changes: 2.4GHz --> 2.2GHz and 2.2GHz --> 2.0GHz]
                               CPUSPEED                      tDVFS
Max allowed PWM duty cycle     75%       50%       25%       75%       50%       25%
# freq changes                 101       122       139       2         2         3
Execution Time (s)             219       222       223       219       233       234
Ave Power (Watt)               99.78     99.30     100.80    97.93     94.19     92.78
Power-Delay Product (Watt*s)   21852.78  22044.21  22478.64  21447.27  21946.03  21710.32
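The power-delay product in the table is the average power multiplied by the execution time. The short check below recomputes it from the tabulated values; the dictionary layout and function names are illustrative only. The recomputed products differ from the reported ones by less than 1 Watt*s, presumably because the displayed averages are rounded (an assumption).

```python
# (execution time in s, average power in W, reported power-delay product in W*s)
TABLE = {
    ("CPUSPEED", 75): (219, 99.78, 21852.78),
    ("CPUSPEED", 50): (222, 99.30, 22044.21),
    ("CPUSPEED", 25): (223, 100.80, 22478.64),
    ("tDVFS", 75): (219, 97.93, 21447.27),
    ("tDVFS", 50): (233, 94.19, 21946.03),
    ("tDVFS", 25): (234, 92.78, 21710.32),
}


def power_delay_product(exec_time_s, avg_power_w):
    """Power-delay product in Watt*s."""
    return exec_time_s * avg_power_w


def max_deviation(table):
    """Largest gap between recomputed and reported products."""
    return max(abs(power_delay_product(t, p) - reported)
               for t, p, reported in table.values())
```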
Performance Evaluation (Dynamic Hybrid Fan and DVFS Control)
[Figure: temperature (°C) versus sample points for Pp = 75, Pp = 50 and Pp = 25, with numbered markers 1 to 4]
Our method effectively unifies different thermal control techniques and reacts to different user control policies with minimum performance impact
Conclusion
We classify thermal characteristics of parallel applications and use a two-level temperature window to make our controller more effective
We introduce a simple parameter (Pp) that allows the user to specify the aggressiveness of in-band and out-of-band techniques for thermal reduction
We integrate an out-of-band method (fan control) and an in-band method (DVFS)
We explore an efficient fan control method
Thank You