Model predictive control and self- learning of thermal models for multi-core platforms* Luca Benini [email protected] *Work supported by Intel Labs Braunschweig
Page 1: Model predictive control and self- learning of thermal ...

Model predictive control and self-learning of thermal models for multi-core platforms*

Luca Benini [email protected]

*Work supported by Intel Labs Braunschweig

Page 2: Model predictive control and self- learning of thermal ...

The Power Crisis

M. Pedram [CADS10]

[Figure: CO2 emissions (%, 0 to 100) split into transport, production, and use for Macbook, Macbook Air, iPhone-3G, and MacPro-Desktop; device classes: mobile clients vs. data center, HPC, server]

Run-time power: ~50% of ICT environmental impact and cost!

Page 3: Model predictive control and self- learning of thermal ...

The Thermal Crisis

• Never-ending shrinking: smaller, faster… Free lunch?! Not really…
• Multi-Processor SoC possible: "cool" chips, "hot" applications
• Thermal issues: hot-spots, thermal gradients…
• Consequences: system wear-out and lifetime reliability degradation!!

[Figure: chip examples across CMOS 65nm, 45nm, and 32nm: Sun 1.8 GHz Sparc v9 microprocessor, Sun Niagara Broadband Processor, Cell Multi-Processor. Source: Coskun et al. '07, UCSD]

Page 4: Model predictive control and self- learning of thermal ...

3D-SoCs are even worse

Page 5: Model predictive control and self- learning of thermal ...

A System-level View

• Heat density trend 2005-2010 (systems)

[Uptime Institute]

Cooling and hot spot avoidance is an open issue!

Page 6: Model predictive control and self- learning of thermal ...

Multi-scale Problem

• Increasing power density
• Thermal issues at multiple levels
– Chip / component level
– Server / board level
– Rack level
– Room level

Today's focus: Chip level

Page 7: Model predictive control and self- learning of thermal ...

Thermal Management

• Drivers: technology scaling, high performance requirements, system integration, costs
• Limited dissipation capabilities
• High power densities
• Spatial and temporal workload variation (software)
• Power, temperature, and performance are NON-UNIFORM: hot spots, thermal gradients and cycles
• Effects: leakage current, loss of reliability, aging

Dynamic approach: on-line tuning of system performance and temperature through closed-loop control

Page 8: Model predictive control and self- learning of thermal ...

Management Loop: Holistic view

• HW introspection: monitors (power, temperature, reliability alarms)
• HW self-calibration: knobs (Vdd, Vb, f, on/off)
• SW: system information from the OS (workload, CPU utilization, queue status)
• Control algorithm: task scheduling and migration policy

[Diagram: multicore platform with Proc. 1, Proc. 2, … Proc. N (Core 1 … Core N), each with a private memory, connected through an INTERCONNECT]

Page 9: Model predictive control and self- learning of thermal ...

Outline

• Introduction

• Energy Controller

• Thermal Controller architecture

• Learning (self-calibration)

• Scalability

• Simulation Infrastructure

• Results

• Conclusion

Page 10: Model predictive control and self- learning of thermal ...

DRM - General Architecture

• System (chip scale)
• Sensors
– Performance counters (PMU)
– Core temperature
• Actuators (knobs)
– ACPI states
– P-state: DVFS
– C-state: power gating (PGATING)
– Task allocation
• Controller
– Reactive: threshold/heuristic, control theory
– Proactive: predictors

[Diagram, simulation snapshot: the SW layer (O.S., App.1 … App.N, each with Thread 1 … Thread N) runs on the HW layer (CPU1 … CPUN with private L1, L2, network, DRAM). The CONTROLLER reads TCPU, #L1MISS, #BUSACCESS, CYCLEACTIVE, … and sets f, v and PGATING.]

• Controller goals
– Minimize energy
– Bound the CPU temperature
• Our approach
– Stack of controllers: energy controller, thermal controller

[Diagram: CONTROLLER stack with Energy and Thermal blocks; signal labels: Ti, fECi, WLei, fTCi]

Page 11: Model predictive control and self- learning of thermal ...

Energy Controller

CPU BOUND TASK
[Figure: cpu and cache activity (0 to 1) at high frequency vs. low frequency]
• Performance loss
• Power reduction
• Energy efficiency loss!

MEMORY BOUND TASK
[Figure: cpu, dram, and cache activity (0 to 1) at high frequency vs. low frequency]
• Same performance
• Power reduction
• Energy efficiency gain!

Page 12: Model predictive control and self- learning of thermal ...

Energy Controller

Lowering the frequency of a CPU BOUND TASK:
• Performance loss
• Power reduction
• Energy efficiency loss!

Lowering the frequency of a MEMORY BOUND TASK:
• Same performance
• Power reduction
• Energy efficiency gain!

[Figure: cpu, dram, and cache activity (0 to 1); the CPU BOUND TASK runs at high frequency and the MEMORY BOUND TASK at low frequency]

OUR SOLUTION
• Power saving
• No performance loss
• Higher energy efficiency

Page 13: Model predictive control and self- learning of thermal ...

Outline

• Introduction

• Energy Controller

• Thermal Controller architecture

• Learning (self-calibration)

• Scalability

• Simulation Infrastructure

• Results

• Conclusion

Page 14: Model predictive control and self- learning of thermal ...

Thermal Controller

[Intel®, ISSCC 2007]

Threshold-based controller
• T > Tmax: low frequency
• T < Tmin: high frequency
• Cannot prevent overshoot
• Thermal cycles

Classical feedback controller
• PID controllers
• Better than the threshold-based approach
• Cannot prevent overshoot

Model Predictive Controller
• Internal prediction: avoids overshoot
• Optimization: maximizes performance
• Centralized
– Aware of neighbor cores' thermal influence
– All at once: MIMO controller
– Complexity!!!

[Diagram, MPC loop: thermal model (past input & output in, future output out), comparison with the target frequency (future error), optimizer with cost function and constraints (future input out)]
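The contrast between reacting to a threshold and predicting ahead can be sketched on a toy single-core example. The sketch assumes a hypothetical first-order model T(k+1) = a·T(k) + b·P(k) + (1−a)·Tenv; the model, its coefficients, the horizon, and the function names are illustrative assumptions, not the controller presented in the talk.

```python
def threshold_step(T, f, f_low, f_high, T_min, T_max):
    """Threshold rule: reacts only after a limit has been crossed."""
    if T > T_max:
        return f_low
    if T < T_min:
        return f_high
    return f  # inside the band: keep the current frequency


def predictive_power(T, P_target, T_max, T_env, a, b, horizon=10, P_min=0.0):
    """Largest constant power <= P_target whose predicted temperature
    stays <= T_max over the horizon, for the assumed scalar model
    T(k+1) = a*T(k) + b*P(k) + (1-a)*T_env with 0 < a < 1 and b > 0."""
    P_allowed = P_target
    for j in range(1, horizon + 1):
        aj = a ** j
        gain = b * (1.0 - aj) / (1.0 - a)    # effect of a constant P after j steps
        free = aj * T + (1.0 - aj) * T_env   # predicted temperature with P = 0
        P_allowed = min(P_allowed, (T_max - free) / gain)
    return max(P_min, P_allowed)
```

With a = 0.9, b = 0.5, Tenv = 30 and Tmax = 50, a core already at 49 °C that requests 10 units of power is capped before the limit is reached, whereas the threshold rule would only act once T exceeds Tmax.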

Page 15: Model predictive control and self- learning of thermal ...

MPC Robustness

[Diagram: MPC loop (thermal model, optimizer, cost function, constraints, target frequency) driving the multicore (Core1 … Corei … CoreN)]

MPC needs a thermal model
• Accurate, with low complexity
• Must be known a priori
• Depends on user configuration
• Changes with system ageing

"In field" self-calibration
• Force test workloads
• Measure core temperatures
• System identification

[Diagram: training tasks are executed as workload traces on Core1 … Corei … CoreN; workload, power, and temperature feed the system identification block, which returns the identified state-space thermal model]

Page 16: Model predictive control and self- learning of thermal ...

Outline

• Introduction

• Energy Controller

• Thermal Controller architecture

• Learning (self-calibration)

• Scalability

• Simulation Infrastructure

• Results

• Conclusion

Page 17: Model predictive control and self- learning of thermal ...

Thermal Model & Power Model

[Diagram: tasks (Taskj) feed the power model P = g(task, f), which outputs the power Pj (Pn,j); the thermal model takes Pn,j and outputs the temperature Tj (Tn,j)]

Page 18: Model predictive control and self- learning of thermal ...

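Model Structure

[Diagram: thermal network with silicon (Si) cells 0 and 1 and copper (Cu) cells 2 and 3; powers P0 and P1 are injected in the Si cells. Labels on Matrix A and Matrix B: self terms Si 00, Si 11, Cu 22, Cu 33; coupling terms 01, 10, 02, 20, 13, 31, 23, 32. Inputs: Pn,j and Tenv; output: Tn,j]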

Page 19: Model predictive control and self- learning of thermal ...

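LS System Identification

[Diagram: a test pattern is applied to the system and the system response is recorded as data; the model output is compared with the data through the error function e = Tmodel − Treal; a least-squares parametric optimization of the cost function returns the optimal parameters A, B]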
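The least-squares step can be sketched as follows, assuming a discrete-time model T(k+1) = A·T(k) + B·u(k) in which every state is a measured temperature. The function name and the data layout are illustrative assumptions; the flow on the slides may differ (for instance, it may include unmeasured states).

```python
import numpy as np


def identify_state_space(T, U):
    """Ordinary least-squares fit of T[k+1] = A @ T[k] + B @ U[k].

    T: (K, n) array of measured temperatures, one row per sample.
    U: (K, m) array of inputs (e.g. per-core power and ambient temperature).
    Returns A with shape (n, n) and B with shape (n, m).
    """
    T = np.asarray(T, dtype=float)
    U = np.asarray(U, dtype=float)
    X = np.hstack([T[:-1], U[:-1]])   # regressors [T(k), u(k)]
    Y = T[1:]                         # targets T(k+1)
    Theta, *_ = np.linalg.lstsq(X, Y, rcond=None)   # X @ Theta ~= Y
    n = T.shape[1]
    return Theta[:n].T, Theta[n:].T
```

The residual minimized here is the one-step prediction error, which corresponds to e = Tmodel − Treal evaluated one sample ahead.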

Page 20: Model predictive control and self- learning of thermal ...

Experimental setup

Tool flow
• Pattern generator: PRBS.csv (..0,0,1,1,1,…)
• Workloader on core0, core1, … coreN (..idle,idle,run,run,run,…)
• XTS
• DATA.csv: temperature, frequency, workload
• PREPROCESSING
• LS MODEL
– C/FORTRAN (SLICOT, MINPACK)
– Matlab System Identification: N4SID, PEM, LS (Levenberg-Marquardt)

HW (Ts = 1/10 ms): SUN FIRE X4270
• Intel Nehalem
• 8 cores / 16 threads
• 2.9 GHz
• 95 W TDP
• IPMI

[Diagram: board layout with fan board, CPU1, CPU2, RAM, chipset, storage drives, and air flow direction]

Page 21: Model predictive control and self- learning of thermal ...

Workload & Temperature

[Figure, two panels over time (0 to 35 seconds): temperature trace Tcore2 (32 to 44 °C) and pseudorandom workload pattern Pcore2 (0 to 1)]
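A pseudorandom binary sequence like the one driving the workload can be generated with a linear-feedback shift register. The sketch below uses the common PRBS-7 polynomial x^7 + x^6 + 1; the choice of polynomial, the hold time, and the idle/run mapping are assumptions for illustration, since the slides do not state how PRBS.csv was produced.

```python
def prbs7(n_samples, seed=0x02):
    """PRBS-7 bits from a 7-bit LFSR with polynomial x^7 + x^6 + 1 (period 127)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    bits = []
    for _ in range(n_samples):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        bits.append(new_bit)
    return bits


def workload_pattern(bits, hold=1):
    """Map each bit to 'run' or 'idle', holding it for `hold` samples."""
    return [("run" if b else "idle") for b in bits for _ in range(hold)]
```

Holding each bit for several samples shifts the excitation toward lower frequencies, which matters when the thermal time constants are much longer than the sampling period.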

Page 22: Model predictive control and self- learning of thermal ...

Black-box Identification

Identification based on pure LS fitting

[Figure: MEASURED vs. SIMULATED TEMPERATURE, Tcore2 (34 to 45 °C) over 25 to 33 seconds; traces: measured, thermal model]

Page 23: Model predictive control and self- learning of thermal ...

Partially unobservable model

[Figure, three panels over 13.6 to 14.1 seconds (1.36 to 1.41 ×10^4 ms): Tcore2 in °C (measured vs. thermal model), FREQUENCY f [Hz] (1 to 4 ×10^9), and WORKLOAD W]

PROBLEM 1: the "WORKLOAD" signal does not include the power variation due to frequency changes.

SOLUTION: learn the power model too!

Page 24: Model predictive control and self- learning of thermal ...

Multi-step Identification

Power model P = g(w, f) initially unknown

[Diagram: f and w enter the power model (LUT), which outputs P; P enters the thermal model (A, B), which outputs T]

1st STEP: set f = const and apply w as a [0|1]^N sequence, so that P is a [P1|P0]^N sequence with P1, P0 pre-measured in steady state; we measure T and obtain A0 by LS.

2nd STEP: A is known; we set f and w, measure T, invert A, and obtain P.

3rd STEP: P is known; we now generate a richer sequence of w and f and re-calibrate A by LS.

Iterate until convergence.
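The second step (recovering P from measured T once the thermal model is known) can be illustrated on a scalar model. The sketch assumes T(k+1) = a·T(k) + b·P(k) + (1−a)·Tenv and a power look-up table keyed by (workload, frequency); both are simplifying assumptions, and the names are hypothetical.

```python
from collections import defaultdict


def recover_power(T, T_env, a, b):
    """Invert the assumed scalar thermal model to estimate P[k] from T[k], T[k+1]."""
    return [(T[k + 1] - a * T[k] - (1.0 - a) * T_env) / b
            for k in range(len(T) - 1)]


def build_power_lut(P, w, f):
    """Average the recovered power for each (workload, frequency) pair."""
    sums = defaultdict(lambda: [0.0, 0])
    for p, wk, fk in zip(P, w, f):
        entry = sums[(wk, fk)]
        entry[0] += p
        entry[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}
```

The resulting table plays the role of g(w, f); in the third step it would supply P for a richer test sequence used to refit the thermal model.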

Page 25: Model predictive control and self- learning of thermal ...

Validation

[Figure: Tcore2 in °C (measured vs. thermal model) with the f [Hz] and W traces over 13.6 to 14.1 seconds, plus a detail of Tcore2 (40.5 to 42.5 °C) over 20.5 to 23 seconds]

Problem 2: unstable model

Problem 3: model is not physical
• Tcore differs from Tamb with P = 0

[Figure: Tcore and Tamb over time t]

The identification algorithm must be aware of physical properties to avoid over-fitting

Page 26: Model predictive control and self- learning of thermal ...

Constrained Identification

CONSTRAINED LEAST SQUARES
• Constraint on initial condition
• Linear constraint

[Figure: Tcore2 [°C] (37 to 44) over 26 to 27.2 seconds; traces: measured, thermal model]
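One way to make a fitted model physical is to impose a linear constraint during the fit. The sketch below assumes a scalar model T(k+1) = a·T(k) + b·P(k) + c·Tenv and enforces a + c = 1 by substitution, so that T = Tenv is the equilibrium when P = 0. This particular constraint is an illustrative assumption; the slides do not spell out which constraints were used.

```python
import numpy as np


def identify_constrained(T, P, T_env):
    """Least-squares fit of T[k+1] = a*T[k] + b*P[k] + c*T_env with a + c = 1.

    Substituting c = 1 - a gives (T[k+1] - T_env) = a*(T[k] - T_env) + b*P[k],
    which is an unconstrained problem in (a, b).
    P[k] is the power applied between samples k and k+1.
    """
    T = np.asarray(T, dtype=float)
    P = np.asarray(P, dtype=float)[: len(T) - 1]
    X = np.column_stack([T[:-1] - T_env, P])
    y = T[1:] - T_env
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    a, b = float(theta[0]), float(theta[1])
    return a, b, 1.0 - a
```

A further check that 0 < a < 1 would address stability; it is left out here to keep the sketch short.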

Page 27: Model predictive control and self- learning of thermal ...

Quasi-steady-state accuracy

[Figure, four panels over 100 to 600 seconds: Tcore0, Tcore1, Tcore2, Tcore3 in °C; traces: measured, thermal model]

Large time constant. Possible causes:
• Package thermal inertia?
• Environment inertia (air)?
• PLEAK temperature dependency?

Identification with pseudorandom trace:
• Too many samples, huge LS computation

Page 28: Model predictive control and self- learning of thermal ...

Addressing model stiffness

• Modelling the third time constant as heat sink temperature variation
• One-pole model identification

[Diagram: environment thermal model and CPU thermal model; signal labels: Tenv, Theatsink, P, T]

[Figure, four panels over 100 to 600 seconds: Tcore in °C (30 to 40); traces: measured, thermal model]

Page 29: Model predictive control and self- learning of thermal ...

Outline

• Introduction

• Energy Controller

• Thermal Controller architecture

• Learning (self-calibration)

• Scalability

• Simulation Infrastructure

• Results

• Conclusion

Page 30: Model predictive control and self- learning of thermal ...

MPC Scalability

[Intel, ISSCC 2007]

MPC complexity
• Implicit (a.k.a. on-line): computational burden
• Explicit (a.k.a. off-line): high memory occupation

[Figure: # explicit regions vs. # cores: 4 cores: 36; 8 cores: 1296; 16 cores: out of memory]

Complexity grows superlinearly with the number of cores!!

Page 31: Model predictive control and self- learning of thermal ...

Addressing Scalability

On a short time window, power has a local thermal effect!

[Figure: # explicit regions vs. # cores. Centralized: 4 cores: 36; 8 cores: 1296; 16 cores: out of memory. Second series (values as shown): 8, 16, 32]

One controller for each core. The controller uses:
• local power & thermal model
• neighbors' temperatures

Fully distributed: complexity scales linearly with #cores

Page 32: Model predictive control and self- learning of thermal ...

Distributed Control

Distributed and hierarchical controllers:
– Energy Controller (EC)
• Output: frequency fEC
• Minimize power, CPI based
• Performance degradation < 5%
– Temperature Controller (TC)
• Distributed MPC
• Inputs: fEC, TCORE, TNEIGHBOURS
• Output: core frequency (fTC)

[Diagram: each core of the multicore (Core1 … Corei … CoreN) has a controller node. ECi receives CPI i,k+1 and outputs fEC i,k+1; TCi receives fEC i,k+1, T i,k and neighbor temperatures (Ti−1,k, Ti+1,k, …) and outputs fTC i,k+1. The same holds for ECN and TCN.]

Page 33: Model predictive control and self- learning of thermal ...

High Level Architecture

[Diagram: the Energy Controller receives the CPI of each core and passes an EC frequency (labelled f1,EC in the figure) to each of four Thermal Controllers (Core 1 to Core 4). Each Thermal Controller also receives its core temperature plus the neighbor temperatures (Ti + Tneigh) from the PLANT and outputs fi,TC (f1,TC … f4,TC) to the PLANT.]

Page 34: Model predictive control and self- learning of thermal ...

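Distributed Thermal Controller

MPC Controller, Core 1 (implicit formulation)

[Diagram: f1,EC and CPI pass through g(·) to give P1,EC; the MPC controller (linear model, QP optimizer, subject to constraints) uses the state x1, TENV and P1,EC to compute P1,TC; g−1(·) with CPI converts P1,TC into f1,TC; an observer fed with T1 provides x1]

• Nonlinear part: frequency to power
• Linear part: power to temperature
• Observer: classic Luenberger state observer, 2 states per core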
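The observer block reconstructs the two states of a core from the single measured temperature. A minimal discrete-time sketch of a Luenberger observer is shown below; the matrices A, B, C and the gain L are placeholders that would come from the identified model and from a pole-placement design, neither of which is given in the slides.

```python
import numpy as np


def observer_step(x_hat, u, y, A, B, C, L):
    """One Luenberger update: x_hat(k+1) = A x_hat(k) + B u(k) + L (y(k) - C x_hat(k)).

    x_hat: current state estimate, shape (n,)
    u:     applied input (e.g. power), shape (m,)
    y:     measured output (e.g. core temperature), shape (p,)
    L:     observer gain, shape (n, p), chosen so that A - L C is stable
    """
    return A @ x_hat + B @ u + L @ (y - C @ x_hat)
```

The estimation error evolves as e(k+1) = (A − L·C)·e(k), so it decays whenever all eigenvalues of A − L·C lie inside the unit circle.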

Page 35: Model predictive control and self- learning of thermal ...

Explicit Distributed Controller

Our aim is to minimize the difference between the input P1,TC (also called the manipulated variable, MV) and the reference P1,EC. Our controller can only take into account a constant reference. To overcome this limitation we reformulate the tracking problem as a regulation problem that consists in taking ∆P1 (the new MV) to 0. The regulated power P1,TC is:

P1,TC = P1,EC + ∆P1, with u(k) = ∆P1

At each time instant the system belongs to a region according to its current state. In each region the explicit controller executes a linear control law with the gain matrices of that region:

REGION NUMBER: GAIN MATRICES
1: F1, G1
2: F2, G2
…: …
nr: Fnr, Gnr

The prediction evaluated by our explicit controller cannot take into account the measured disturbances (uMD = [Tenv, P1, Tneigh]). Thus we exploit the superposition principle of linear systems: to remap the effect of these elements we use the model to modify the state (x(k) → xSHIFTED(k)), projecting the effects of the MDs one step forward.

[Diagram: x1, TENV and P1,EC are combined (block labels: x1 + A1−1·B1 [x1 ; TENV ; P1,EC]) into x1SHIFTED, which is used to find the region number]
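The run-time part of an explicit controller reduces to a region search followed by one affine law. The sketch below assumes each region is a polyhedron H·x ≤ k and that the law has the usual explicit-MPC form ∆P = F·x + G; the region description and the data layout are assumptions, since the slides only list the gain matrices F and G per region.

```python
import numpy as np


def explicit_control(x, regions, tol=1e-9):
    """Return F_i @ x + G_i for the first region i with H_i @ x <= k_i."""
    for region in regions:
        if np.all(region["H"] @ x <= region["k"] + tol):
            return region["F"] @ x + region["G"]
    raise ValueError("state lies outside every stored region")


def regulated_power(x_shifted, P_ec, regions):
    """P_TC = P_EC + dP, where dP comes from the explicit law on the shifted state."""
    return P_ec + explicit_control(x_shifted, regions)
```

With few regions per core, the search is a handful of comparisons and one small matrix-vector product, which is what keeps the per-tick cost low.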

Page 36: Model predictive control and self- learning of thermal ...

O.S. Implementation – Linux SMP

• Controller routines
– Scheduler routine extension
• Is distributed
• Executes on the core it relies on
– Timing: scheduler tick (1-10 ms)
• CPI estimation
– Performance counters:
• Clocks expired
• Instructions retired
• Energy Controller
– Look-up table:
• fEC = LuT[CPI]
• Thermal Controller
– Core temperature sensors
– Matrix multiplication & look-up table:
• fTC = LuT[M*[TCORE, TNEIGHBOURS]]

[Diagram: per-core controller nodes on the multicore (Core1 … Corei … CoreN): ECi maps CPI i,k+1 to fEC i,k+1; TCi takes fEC i,k+1, T i,k and neighbor temperatures and outputs fTC i,k+1]
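The CPI estimation and the energy-controller look-up can be sketched in a few lines. The table below (CPI breakpoints and frequencies) is entirely hypothetical and only illustrates the mechanism fEC = LuT[CPI]; the real table would be calibrated to keep the performance loss within the stated bound.

```python
import bisect

# Hypothetical table: a higher CPI (more memory-bound) maps to a lower frequency.
CPI_BREAKS = [1.0, 2.0, 4.0]          # upper CPI bound of each of the first three bins
FREQS_GHZ = [2.9, 2.4, 2.0, 1.6]      # one frequency per bin


def estimate_cpi(clocks_expired, instructions_retired):
    """CPI over the last scheduler tick, from two performance counters."""
    if instructions_retired <= 0:
        return float("inf")           # nothing retired: treat as fully stalled
    return clocks_expired / instructions_retired


def energy_controller(cpi):
    """fEC = LuT[CPI]: pick the frequency of the bin containing the CPI."""
    return FREQS_GHZ[bisect.bisect_left(CPI_BREAKS, cpi)]
```

For example, 3,000,000 clocks and 2,000,000 retired instructions give CPI = 1.5, which falls in the second bin of this hypothetical table.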

Page 37: Model predictive control and self- learning of thermal ...

Model Learning Scalability

MPC Weaknesses – 2nd: Internal Thermal Model
• Accurate, with low complexity
• Must be known a priori
• Depends on user configuration
• Changes with system ageing

"In field" self-calibration
• Force test workloads
• Measure core temperatures
• System identification

[Diagram: MPC loop (thermal model, optimizer, cost function, constraints, target frequency); training tasks executed as workload traces on Core1 … Corei … CoreN of the multicore; temperature and power feed the system identification block, which returns the identified state-space thermal model]

Complexity issue
• State-of-the-art is centralized
• Least-squares based: relies on matrix inversion (cubic with #cores)

[Figure: identification time in seconds vs. # cores. Centralized: 4 cores: 105; 8 cores: 1230; 16 cores: >1 h. Second series (values as shown): 1, 2, 4]

Distributed approach: each core identifies its local thermal model. Complexity scales linearly with #cores

Page 38: Model predictive control and self- learning of thermal ...

Outline

• Introduction

• Energy Controller

• Thermal Controller architecture

• Learning (self-calibration)

• Scalability

• Simulation Infrastructure

• Results

• Conclusion

Page 39: Model predictive control and self- learning of thermal ...

Simulation Strategy

Trace-driven simulator [1]:
• Not suitable for full-system simulation (how to simulate the O.S.?)
• Loses information on cross-dependencies, resulting in degraded simulation accuracy

[Diagram, trace-driven flow: workload set, multicore simulator, execution trace database, power model, temperature model, control strategy]

Closed-loop simulator:
• Cycle-accurate simulators [2]:
– High modeling accuracy
– Support well-established power and temperature co-simulation based on analytical models and system micro-architectural knowledge
– Low simulation speed
– Not suitable for full-system simulation
• Functional and instruction-set simulators:
– Allow full-system simulation
– Less internal precision
– Less detailed data
– Introduce the challenge of having accurate power and temperature physical models with no micro-architectural model

[Diagram, closed-loop flow: workload, multicore simulator, power model, temperature model, control strategy]

[1] P. Chaparro et al. Understanding the thermal implications of multi-core architectures. 2007
[2] L. Benini et al. MPARM: Exploring the multi-processor SoC design space with SystemC. 2005

Page 40: Model predictive control and self- learning of thermal ...

Virtual Platform

[Diagram: virtual platform built on the SIMICS simulator with RUBY (core stall on memory access); HW: CPU1, CPU2, … CPUN with L1, L2, network, DRAM; SW: O.S. and App.1 … App.N, each with threads T1 … TN]

Simics by Virtutech:
• Full-system functional simulator
• Models the entire system: peripherals, BIOS, network interfaces, cores, memories
• Allows booting a full OS, such as Linux SMP
• Supports different target CPUs (arm, sparc, x86)
• x86 model:
– in-order
– all instructions are retired in 1 cycle
– does not account for memory latency

Memory timing model:
• RUBY – GEMS (University of Wisconsin) [1]
– Public cycle-accurate memory timing model
– Different target memory architectures
– Fully integrated with Virtutech Simics
– Written in C++
– We use it as a skeleton to apply our add-ons (as C++ objects)

[1] Milo M. K. Martin et al. Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset. 2005

Page 41: Model predictive control and self- learning of thermal ...

Virtual Platform

[Diagram: virtual platform (SIMICS + RUBY) with two added modules: performance counters (PC: #hlt, stall, active, cycles) and DVFS (fi, VDD). The counters #L1MISS, #BUSACCESS, CYCLEACTIVE, … are exported and f, v are set for each of CPU1, CPU2, … CPUN.]

Performance knobs (DVFS) module:
• Virtutech Simics supports frequency changes at run-time
• RUBY does not support them:
– it has no internal knowledge of frequency
• We add a new DVFS module to support them:
– it ensures that the L2 cache and DRAM have a constant clock frequency
– L1 latency scales with the Simics processor clock frequency

Performance counters module:
• Needed by the performance control policy
• We add a new Performance Counter module to support it
• It exports different quantities to the O.S. and applications:
– the number of instructions retired, clock cycles and stall cycles expired, halt instructions, …

Page 42: Model predictive control and self- learning of thermal ...

Virtual Platform

[Diagram: virtual platform (SIMICS + RUBY, performance counters, DVFS) with an added POWER MODEL block that outputs PCORE, PL1, PL2]

Power model module:
• At run-time, estimates the power consumption of the target architecture
• Core model: PT = [PD(f, CPI) + PS(T, VDD)] * (1 − idleness) + idleness * PIDLE
• PD: experimentally calibrated analytical power model
• Cache and memory power: access cost estimated with CACTI [1]

[1] Shyamkumar Thoziyoor et al. A comprehensive memory modeling tool and its application to the design and analysis of future memory hierarchies. 2008
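The core power expression can be written directly as a function. In the sketch below the dynamic and static terms are passed in as callables because their exact form depends on calibration; the function and parameter names are illustrative assumptions.

```python
def core_power(f, cpi, T, vdd, idleness, p_dynamic, p_static, p_idle):
    """PT = [PD(f, CPI) + PS(T, VDD)] * (1 - idleness) + idleness * PIDLE.

    idleness:  fraction of the interval spent idle, between 0 and 1
    p_dynamic: callable PD(f, cpi)
    p_static:  callable PS(T, vdd)
    p_idle:    power drawn while idle
    """
    active = p_dynamic(f, cpi) + p_static(T, vdd)
    return active * (1.0 - idleness) + idleness * p_idle
```

With idleness = 0 the expression reduces to PD + PS, and with idleness = 1 it reduces to PIDLE, which gives two quick sanity checks for any calibrated PD and PS.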

Page 43: Model predictive control and self- learning of thermal ...

Power Model

Power model interface

[Diagram, simulation snapshot (Ruby & Simics): CPU1, CPU2, … CPU N with L1, L2, network, DRAM; per-core frequencies f1 = k1 * fnom, f2 = k2 * fnom, … fN = kN * fnom]

Inputs to the POWER MODEL (together with f, Vdd, T):
• i-th CPU: # instructions retired, # stall cycles, # HLT instructions
• i-th L1: # line & WD reads, # line & WD writes
• i-th L2: # line reads, # line writes
• DRAM: # burst reads, # burst writes

Output: POWER & ENERGY

Page 44: Model predictive control and self- learning of thermal ...

Modeling Real Platform – Power

$P_D = k_A \cdot V_{DD}^2 \cdot f_{CK} + k_B + (k_C + k_D \cdot f_{CK}) \cdot CPI^{k_E}$

Real power measurement
• Intel server system S7000FC4UR
• 16 cores: 4 quad-core Intel® Xeon® X7350, 2.93 GHz
• 16 GB FBDIMMs
• Intel® Core™ 2 Duo architecture
• At-the-wall power consumption
• Test:
– set of synthetic benchmarks with different memory access patterns
– forcing all the cores to run at different performance levels
– for each benchmark we extract the clocks-per-instruction (CPI) metric and correlate it with the power consumption
• We relate the static power to the operating point by using an analytical model

High accuracy at high and low CPI

Page 45: Model predictive control and self- learning of thermal ...

Virtual Platform

[Diagram: virtual platform (SIMICS + RUBY, performance counters, DVFS, POWER MODEL with PCORE, PL1, PL2) with an added TEMPERATURE MODEL block that outputs T (TCPU)]

Temperature model module:
• We integrate our virtual platform with a thermal simulator [1]
• Input: power dissipated by the main functional units composing the target platform
• Output: the temperature distribution along the simulated multicore die area

[1] G. Paci et al. Exploring "temperature-aware" design in low-power MPSoCs

Page 46: Model predictive control and self- learning of thermal ...

Thermal Model

Methods to solve temperature

[Diagram: the POWER MODEL (power & energy) supplies PCPU1, PCPU2, … PCPUn, PL1, L2, and network power to the thermal model, which returns Ti. The floorplan (CPU1, CPU2, … CPUn, L1, L2) is discretized into silicon (si) cells and copper (cu) cells. Package stack: IC die, heat spreader, IC package, package pin, PCB.]

Page 47: Model predictive control and self- learning of thermal ...

Modeling Real Platform – Thermal

• Thermal model calibration:
– Derived from the Intel® Core™ 2 Duo layout
– We calibrate the model parameters to simulate the real HW transient
– High accuracy (error < 1%) and same transient behavior

Page 48: Model predictive control and self- learning of thermal ...

Virtual Platform Performance

• Host:
– Intel® Core™ 2 Duo
– 2.4 GHz
– 2 GB RAM
• Target:
– 4-core Pentium® 4
– 2 GB RAM
– 32 KB private L1 cache
– 4 MB shared L2 cache
– Linux OS

Simulation time for 1 billion instructions:
• Simics + Ruby: Tsim = 1040 s
• Simics + Ruby + DVFS: Tsim = 1045 s
• Simics + Ruby + DVFS + Power: Tsim = 1110 s (+7%)
• Simics + Ruby + DVFS + Power + Thermal interface: Tsim = 1160 s
• Simics + Ruby + DVFS + Power + Thermal Model: Tsim = 1240 s (+19.2%)

Thermal model: 68 cells, T = 100 ns, computed every 13 us

Page 49: Model predictive control and self- learning of thermal ...

Mathworks Matlab/Simulink

• Numerical computing environment developed to design, implement and test numerical algorithms
• Mathworks Simulink, for simulation of dynamic systems: simplifies and speeds up the development cycle of control systems
• Can be called as a computational engine by writing C and Fortran programs that use Mathworks Matlab's engine library
• Controller design in two steps:
– developing the control algorithm that optimizes the system performance
– implementing it in the system

We allow a Mathworks Matlab/Simulink description of the controller to directly drive the performance knobs of the emulated system at run-time

Page 50: Model predictive control and self- learning of thermal ...

Simulator

[Diagram: virtual platform (Virtutech Simics + RUBY with performance counters, DVFS, POWER MODEL, and TEMPERATURE MODEL) connected through the MATLAB interface to the Mathworks Matlab controller. Signals at the interface: CPI, P, T (TCPU) toward the controller; f, v and PGATING toward the platform.]

Mathworks Matlab interface:
• New module named Controller in RUBY
• Initialization: starts the Mathworks Matlab engine as a concurrent process
• Every N cycles, on wake-up:
– sends the current performance monitor output to the Mathworks Simulink model
– executes one step of the controller Mathworks Simulink model
– propagates the Mathworks Simulink controller decision to the DVFS module

CONTROL-STRATEGIES DEVELOPMENT CYCLE
1. Controller design in the Mathworks Matlab/Simulink framework
• system represented by a simplified model
• obtained by physical considerations and identification techniques
2. Set of simulation tests and design adjustments done in Simulink
3. Tuned controller evaluation with an accurate model of the plant, done in the virtual platform
4. Performance analysis, by simulating the overall system

Page 51: Model predictive control and self- learning of thermal ...

Outline

• Introduction

• Energy Controller

• Thermal Controller architecture

• Learning (self-calibration)

• Scalability

• Simulation Infrastructure

• Results

• Conclusion

Page 52: Model predictive control and self- learning of thermal ...

Results

Energy Controller (EC)
– Performance loss < 5%
– Energy minimization

Temperature Controller (TC)
– Complexity reduction
• 2 explicit regions per controller
– Performs as the centralized one
• Thermal capping

[Figure annotations: <0.3°, <3%, <3%]

Page 53: Model predictive control and self- learning of thermal ...

Next Steps

• Now working on the embedded implementation

• Server multicore platform and Intel ® SCC

• Explore thermal aware scheduler solution

• co-operate with presented solution

• Develop distributed+multi-scale solution for data-centers

Page 54: Model predictive control and self- learning of thermal ...

Thermal-aware task scheduling

Schematic View of Thermal Management

[Diagram, numbered steps 1 to 6. Components: datacenter, sensor data, database (history sensor data, current sensor data), CFD simulation software, abstract heat model, policy controller, scheduler, other impact factors. Labels: collecting environmental data and load information from sensors; on-site survey; correlation of load & power; map load to power consumption; cost analysis; scheduling policy; control policy; incoming task]
