Post on 13-Aug-2020
transcript
Jan M. Rabaey, Scientific Co-Director BWRC, Director GSRC, EECS Dept., Univ. of California, Berkeley
Power Management in Wireless SOCs
With contributions of M. Sheets and H. Qin
The Leakage Challenge (1)
[Figure: technology node (nm, 0 to 120) and voltage (V, 0 to 1.2; curves VDD and VTH) versus year, 2002 to 2016]
[Figure: power per gate (µW/gate) versus year, 2002 to 2016; curves PDYNAMIC and PLEAK (subthreshold leak, i.e. active leakage)]
T. Sakurai, ISSCC 03
The Leakage Challenge (2)
0.18 micron, ~1000 samples
[Figure: normalized frequency (0.9 to 1.4) versus normalized leakage Isb (0 to 20); annotations: 20X and 30%]
Source: S. Borkar, Intel
The Other Side of the Story: Leakage is good for you!
[Figure: EOp / nominal EOp,ref (0 to 1) versus ELeakage/ESwitching (10^-2 to 10^1) for nominal, parallel, and pipeline designs; annotated points: Vthref-180mV at 0.81 Vddmax, Vthref-95mV at 0.57 Vddmax, Vthref-140mV at 0.52 Vddmax]
Optimal designs have high leakage (ELk/ESw ≈ 0.5)
Must adapt to process variations and activity
Source: P. Gelsinger (DAC04)
What to do about memory? The data retention voltage (DRV)
[Figure: 4KB SRAM leakage current (µA, 0 to 60) versus supply voltage (V, 0 to 1), with the measured DRV range marked]
[Figure: histogram over DRV (mV, 0 to 500); vertical axis 0 to 7000]
Data obtained from a 4 Kbyte SRAM test chip, implemented in 130 nm CMOS
Calibrating for Process Variations
Most variations are systematic, and can be adjusted for at start-up time using one-time calibration!
• Relevant parameters: Tclock, Vdd, Vth
• Can be easily extended to include leakage reduction and power-down in standby
[Figure: a module and a test module, with test inputs and responses; control knobs Tclock, Vdd, and Vbb]
• Achieves the maximum power saving under technology limit
• Inherently improves the robustness of design timing
• Minimum design overhead required over traditional design methodology
Adaptive Body Biasing
Source: P. Gelsinger (DAC04)
Introducing “Power Domains (PDs)”
Similar in concept to “clock domains”, but extended to include power-down (really!) and local supply and threshold voltage management.
[Figure: a power source feeding an Active Power Network that supplies three loads]
• Dynamic voltages for variable workload
• Power gating or shut-off for leakage control
• Lifetime extension exploiting battery attributes
• Noise management
Introducing “Power Domains (PDs)”
[Figure: Domain 1, Domain 2, and Domain 3 under a Power Scheduler/Chip Supervisor]
Who is in charge?
Chip Supervisor (or Chip O/S)
• Maintains global state and perspective
• Maintains system timers
• Alerts blocks of important events
[Figure: chip supervisor block diagram. System supervisor: µ-coded state machine and event decoder. Timer subsystem: alarm table w/ IDs, alarm manager, system time-wheel, and a comparator (=) against the next alarm. A power control bus carries power control msgs and system status msgs to and from the functional units.]
A Case Study: Protocol Processor for Wireless Sensor Networks
[Figure: sensor node block diagram. Labels: energy train, App/UI, Transport, Network, DLL (MAC), Baseband, RF (TX/RX), antenna, reactive radio, chip supervisor, sensor/actuator interface, sensor/actuators, user interface, locationing, aggregation/forwarding]
Target: < 50 µW average
“Charm” Processor
[Figure: Charm block diagram. Labels: DW8051 with 256 DATA, 4kB XDATA, and 16kB CODE memories; interconnect network; MAC; PHY; LocalHW; ADC; chip supervisor; SIF; SIF ADC; Serial; GPIO; Flash IF]
Charm Architecture
• Reactive inter- and intra-chip signaling
• Aggressive use of power domains
• Chip supervisor manages activity
• 1 V operational supply voltage
• 16 MHz clock frequency
• Simple processor aided with dedicated accelerators
Call a Plumber… This Thing Leaks!
Est. leakage @ 1 V (µW)
Block          Area (µm²)   Logic    Memory
Locationing    337990       39.9
DW8051         63235        8.2      2880.0
Interface      6098         0.8
Neighborlist   21282        2.5      13.5
Serial         2554         0.4
NetQ           6296         0.7      108.0
DLL            126846       17.4     13.5
Supervisor     51094        6.4
Total                       76.3     3015.0
64KB SRAM for SW code and data
30X the target power… just in leakage!!
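The totals in the table can be re-derived from its rows. The sketch below is only a check on the transcription: the dictionary name and the choice to treat blank memory cells as 0.0 are assumptions, not part of the slide.

```python
# Estimated leakage at 1 V in µW, transcribed from the table on this slide.
# Each entry is (logic, memory). ASSUMPTION: a blank memory cell means 0.0.
LEAKAGE_UW = {
    "Locationing": (39.9, 0.0),
    "DW8051": (8.2, 2880.0),
    "Interface": (0.8, 0.0),
    "Neighborlist": (2.5, 13.5),
    "Serial": (0.4, 0.0),
    "NetQ": (0.7, 108.0),
    "DLL": (17.4, 13.5),
    "Supervisor": (6.4, 0.0),
}


def leakage_totals(table):
    """Return (logic_total, memory_total) in µW."""
    logic = sum(logic_uw for logic_uw, _ in table.values())
    memory = sum(memory_uw for _, memory_uw in table.values())
    return logic, memory


logic_total, memory_total = leakage_totals(LEAKAGE_UW)
memory_share = memory_total / (logic_total + memory_total)
print(f"logic {logic_total:.1f} uW, memory {memory_total:.1f} uW, "
      f"memory share {memory_share:.1%}")
```

The sums reproduce the Total row, and nearly all of the estimated leakage sits in memory rather than logic.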
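Leakage vs. Supply Voltage
Hey buddy, turn down the voltage!
[Figure: leakage versus supply voltage, annotated “~15X reduction” and “Data retention voltage”]
1/15 of the leakage current × 0.3 of the supply voltage (0.3 V instead of 1 V) = 98% less leakage power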
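The 98% figure follows from P = I × V: the leakage current drops by about 15X and the supply drops from 1 V to 0.3 V. A minimal check of that arithmetic, with a hypothetical function name and parameter names:

```python
def standby_leakage_fraction(current_reduction, v_standby, v_active):
    """Fraction of active-mode leakage power that remains in standby (P = I * V)."""
    return (1.0 / current_reduction) * (v_standby / v_active)


# Values from the slide: ~15X less current, 0.3 V standby versus 1 V active.
fraction = standby_leakage_fraction(current_reduction=15.0, v_standby=0.3, v_active=1.0)
print(f"remaining: {fraction:.1%}, saved: {1 - fraction:.1%}")
```

About 2% of the leakage power remains, which is the 98% saving quoted on the slide.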
Gated Power Architecture
• Vddhi – active mode voltage (nominal)
• Vddlo – standby mode voltage, allows retention of state
[Figure: standard-cell rows between vdd and gnd rails, and gated rows between vvdd and gnd rails fed from vddhi and vddlo; signal labels: VVDD, GND, VDD (1 V), 300 mV, STBY]
Power Switch Tile
• Tile is easily incorporated into standard design flow
– Cell has same pitch as std. cell library components
– Switch tiles placed prior to other standard cells
– One additional power strap added to power routing step
• Switch design can be independent of block size
– Built-in buffer distributes driver circuitry
– Enables creation of a buffer tree during STBY signal routing
[Figure: switch tile layout at standard cell height, with STBY_buf]
[Figure: Delay / Leakage Tradeoff. Delay overhead (1 to 3.5) and leakage (0 to 1) versus power switch width (0 to 50 µm)]
Power Switch Sizing
• Switch sizing enables trade-off between delay overhead and leakage
– Delay scale normalized to un-gated design
– Leakage scale normalized to case when switch size is 50 µm
• Timing slack determines delay requirement
– Control domains (DLL, processor) – tolerant of delay overhead
– Datapath domains (locationing) – longer critical paths, less tolerant of delay overhead
H. Qin
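The sizing rule above can be sketched as a selection over a characterization table: among the switch widths whose delay overhead fits a domain's timing slack, take the one with the lowest leakage. The function name and the sample points are hypothetical. The numbers were not read off the plot and only follow its normalization (delay relative to the un-gated design, leakage relative to a 50 µm switch).

```python
def pick_switch_width(characterization, max_delay_overhead):
    """Return the lowest-leakage point whose delay overhead fits the budget.

    characterization: list of (width_um, delay_overhead, normalized_leakage).
    Returns None when no width meets the budget.
    """
    feasible = [row for row in characterization if row[1] <= max_delay_overhead]
    if not feasible:
        return None
    # Lowest leakage wins; ties go to the narrower switch.
    return min(feasible, key=lambda row: (row[2], row[0]))


# HYPOTHETICAL characterization points for illustration only.
SAMPLE = [
    (5.0, 3.0, 0.10),
    (10.0, 2.0, 0.20),
    (20.0, 1.5, 0.40),
    (50.0, 1.2, 1.00),
]

control = pick_switch_width(SAMPLE, max_delay_overhead=2.0)   # delay-tolerant domain
datapath = pick_switch_width(SAMPLE, max_delay_overhead=1.3)  # tight critical paths
print(control, datapath)
```

With these made-up points the delay-tolerant control domain gets a narrow, low-leakage switch, while the datapath domain is forced to the widest one.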
System Supervisor
• How to control block activation/deactivation?
• System supervisor centralizes power control
– Power subsystem – gates block power rails
– Clock subsystem – gates block clocks
– Timer subsystem – system time-wheel and wake-up timers
[Figure: system supervisor with time, clock, and power subsystems behind a command/event dispatcher; a power network interface connects over the power network to power domains A, B, and C]
Power Subsystem
• Session controller – opens/closes sessions
• Connection table – holds connectivity masks and performs port address translation
• Session table – keeps track of open sessions
[Figure: src decoder and dest decoder feeding the connection table and session table; the connection mask goes to the session controller, which talks to/from the dispatcher; clocked by SYSCLK]
Charm's Sub-blocks and Connectivity
[Figure: block connectivity diagram with port labels A to E. Blocks: DLL, controller (DW8051), sensor interface, serial, neighbor list, locationing, network queues, baseband, RF front-end; external interfaces: I2C, SPI, RS-232]
Block     Port A    Port B    Port C    Port D    Port E
BB        DLL
DLL       NETQ      BB        LOC       NL        DW8051
DW8051    NETQ      NL        DLL       SERIAL
LOC       DLL       NL        DW8051
NETQ      DW8051    DLL
NL        DW8051    DLL       LOC
SERIAL    DW8051
Connectivity grid
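One way to picture what the connection table does with this grid is a lookup from (block, port) to the neighbor it reaches, plus a per-block bit mask of permitted peers. This is a sketch under assumptions: the ports are taken to be filled left to right starting at Port A, and the mask encoding and the function names are hypothetical.

```python
# Connectivity grid transcribed from the slide: block -> neighbor on each port.
# ASSUMPTION: ports are filled left to right starting at Port A.
CONNECTIONS = {
    "BB": ["DLL"],
    "DLL": ["NETQ", "BB", "LOC", "NL", "DW8051"],
    "DW8051": ["NETQ", "NL", "DLL", "SERIAL"],
    "LOC": ["DLL", "NL", "DW8051"],
    "NETQ": ["DW8051", "DLL"],
    "NL": ["DW8051", "DLL", "LOC"],
    "SERIAL": ["DW8051"],
}
BLOCKS = sorted(CONNECTIONS)  # fixed bit order for the masks (hypothetical encoding)


def translate(block, port):
    """Map a local port letter ('A'..'E') to the neighbor block it reaches."""
    index = ord(port) - ord("A")
    neighbors = CONNECTIONS[block]
    if not 0 <= index < len(neighbors):
        raise ValueError(f"{block} has no port {port}")
    return neighbors[index]


def connectivity_mask(block):
    """Bit k is set when `block` may open a session with BLOCKS[k]."""
    mask = 0
    for neighbor in CONNECTIONS[block]:
        mask |= 1 << BLOCKS.index(neighbor)
    return mask


def may_connect(src, dst):
    """True when the grid lists dst on one of src's ports."""
    return bool(connectivity_mask(src) & (1 << BLOCKS.index(dst)))


print(translate("DLL", "B"), may_connect("BB", "SERIAL"))
```

A session controller could consult `may_connect` before opening a session, so a block can only reach the neighbors the grid lists for it.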
Power Session Table
Before a power domain can communicate with a neighbor, it must first open a session
Power policy: A power domain can sleep if…
1) It has closed all its sessions
2) No other domain has a session open with it
3) It wants to go to sleep
A ‘1’ in row i means that power domain i has opened a session with another domain
A ‘1’ in column k means that another domain opened a session with domain k
A ‘1’ in entry (i, i) is domain i's self-sleep bit
Session Table
can_sleep(i) = reduction_nor(row i) and reduction_nor(col i)
[Figure: session table bit matrix, rows indexed by source domain and columns by destination domain, with example 0/1 entries]
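The sleep policy and the can_sleep formula can be modeled as a small bit matrix. In this sketch the class and method names are hypothetical. The reading of the diagonal bit is also an assumption: since the formula NORs the whole row, a set entry (i, i) keeps domain i awake, so the bit is cleared when the domain wants to sleep.

```python
class SessionTable:
    """N x N bit matrix: entry (i, k) is 1 while domain i has a session open with k."""

    def __init__(self, n_domains):
        self.n = n_domains
        # ASSUMPTION: every domain starts awake, so its diagonal bit is set.
        self.bits = [[1 if i == k else 0 for k in range(n_domains)]
                     for i in range(n_domains)]

    def open_session(self, src, dst):
        if src == dst:
            raise ValueError("the diagonal is reserved for the self-sleep bit")
        self.bits[src][dst] = 1

    def close_session(self, src, dst):
        if src == dst:
            raise ValueError("the diagonal is reserved for the self-sleep bit")
        self.bits[src][dst] = 0

    def request_sleep(self, domain):
        # Clearing the diagonal bit signals "I want to go to sleep".
        self.bits[domain][domain] = 0

    def can_sleep(self, i):
        row_clear = not any(self.bits[i])                   # reduction_nor(row i)
        col_clear = not any(row[i] for row in self.bits)    # reduction_nor(col i)
        return row_clear and col_clear


table = SessionTable(3)
table.open_session(0, 1)           # domain 0 talks to domain 1
table.request_sleep(1)
blocked = table.can_sleep(1)       # domain 0 still holds a session with 1
table.close_session(0, 1)
released = table.can_sleep(1)      # nothing open in row 1 or column 1
print(blocked, released)
```

Domain 1 cannot sleep while domain 0 holds a session with it, and it can once that session is closed.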
Clocking Subsystem
• Low frequency external clock (32 kHz)
• Generated, switchable, higher frequency clock (16 MHz)
• Two clocks are made phase-synchronous using DLL
• Control signals are generated by system supervisor
[Figure: clock generator. A 16 MHz ring oscillator (N-stage chain, N even) enabled by RINGOSC_EN produces SYS_CLK_ASYNC; REF_CLK_PIN enters through IBUF (pad) to REF_CLK_ROOT; a variable delay line, an M-stage delay line with a parallel phase detector (DETECT), and a priority encoder + digital controller make SYS_CLK_ROOT phase-synchronous with REF_CLK_CLOCKMAN; clock trees drive TIMERCLK and SYSCLK]
Timer Subsystem
• Centralized system time-wheel
– Blocks schedule wake-up alarms
– Eliminates other large counters so blocks can sleep
– Allows power domains to sleep
• Very low switching activity factor
– SYSCLK is disabled during deep sleep
– Serial (ripple) comparison starting with MSB
[Figure: alarm manager (alarm entries #0 to #N-1, alarm scheduler, new_alarm, to/from dispatcher, clocked by SYSCLK) and system time-wheel (free-running counter clocked by TIMERCLK, compared (=) with alarm_time to raise beep_beep)]
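A behavioral sketch of the time-wheel idea: blocks register wake-up alarms, and on every timer tick the free-running counter is compared with the next alarm bit by bit from the MSB, stopping at the first mismatch. The class and function names, the 32-bit width, and the dictionary-based alarm table are assumptions, and counter wraparound is not modeled.

```python
def msb_first_equal(a, b, width=32):
    """Compare two counter values bit by bit from the MSB, stopping at the first mismatch.

    Returns (equal, bits_examined).
    """
    for examined, bit in enumerate(range(width - 1, -1, -1), start=1):
        if (a >> bit) & 1 != (b >> bit) & 1:
            return False, examined
    return True, width


class AlarmManager:
    """Holds pending alarms and reports which blocks to wake on each tick."""

    def __init__(self):
        self.alarms = {}  # alarm time -> list of block IDs to wake

    def schedule(self, time, block_id):
        self.alarms.setdefault(time, []).append(block_id)

    def next_alarm(self):
        return min(self.alarms) if self.alarms else None

    def tick(self, now, width=32):
        """Called once per timer tick; returns the block IDs whose alarm fired."""
        target = self.next_alarm()
        if target is None:
            return []
        equal, _ = msb_first_equal(now, target, width)
        return self.alarms.pop(target) if equal else []


manager = AlarmManager()
manager.schedule(3, "DLL")
early = manager.tick(2)   # counter has not reached the alarm yet
fired = manager.tick(3)   # counter equals the alarm time
print(early, fired)
```

One plausible reading of the MSB-first choice is that a mismatch in the high-order bits ends the comparison before the frequently toggling low-order bits are examined, which keeps switching activity low.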
Wireless Sensor Network Protocol Processor
In fab
Technology: 0.13 µm CMOS
Transistor count: 3.2M
Gate count: 62.5K gates
Clock freqs: 16 MHz (main), 1 MHz (BB)
On-chip memory: 68 Kbytes
Core supply voltages: 1 V (high), 0.3 V (low)
Chip size: 3 mm x 2.75 mm = 8.2 mm²
On power: < 1 mW
Standby power: µWs
[Figure: die plot. Labels: 64K memory, DW8051 µc, baseband, serial interface, GPIO interface, locationing engine, neighbor list, system supervisor, DLL, network queues, voltage conv]
A Longer Term Perspective: On-chip power generation and conversion networks
[Figure: energy generation and conversion network. Energy source 1 and energy source 2 feed conversion network 1 and conversion network 2, which charge reservoir 1 (capacitor) and reservoir 2 (microbattery)]
[Figure: micro-battery and electrostatic MEMS vibration converters, with anchor, spring flexure, and comb fingers labeled]
Summary and Perspectives
• Active and static power management is leading to a fundamental change in the concept of power distribution on a chip
• Power domains locally manage and trade-off performance, leakage and process variance
• System supervisors giving new meaning to the term OS
• Towards “PGE on a chip”