Power Aware Wireless Microsensor Systems
Anantha Chandrakasan, Rex Min, Manish Bhardwaj, Seong-Hwan Cho, Alice Wang
Massachusetts Institute of Technology
Emerging Microsensor Applications
Industrial Plants and Power Line Monitoring (courtesy of ABB)
Operating Room of the Future (courtesy of John Guttag)
NASA/JPL sensorwebs
Target Tracking & Detection (courtesy of ARL)
Location Awareness: Websign (courtesy of Mark Smith, HP)
Sensor System Requirements
Application Characteristics    Typical Values
Transmission Distance          10 to 100 m
Small Size                     1 “AA” battery
Extended Lifetime              5 years
Spatial Density                0.1 to 10 nodes/m²
Data Rate                      bps to kbps
Predictable Constraints, Unpredictable Diversity
Network roles: relay, sensor, aggregator
Environment: event and signal statistics
User/Application: required latency, quality
Application-specific designs provide energy-efficient point solutions
Power-aware designs adapt energy consumption to operating conditions
Power Aware Microsensor Considerations
Energy Harvesting
API and Control
RF Innovations
Low-Rate Digital Computation
Energy-Scalable Algorithms
MAC and Protocols
Power Aware Microsensor Networks
$I_{leakage} \propto 10^{-V_T/S}$
API
HW
OS & MIDDLEWARE
SW
First Generation Wireless Microsensor
Battery
Mic.
Amp
Low-Pass Filter
ADC
Threshold Detector
DC/DC Converter
Processor FIFO
FIFO
Static RAM Flash ROM
Implemented on an FPGA
Antenna
Clock Recovery
Shifter
Control
Radio IC
Power Amp.
Sensor: 4-channel acoustic
Processor: 206 MHz StrongARM
Radio: 2.4 GHz ISM band
OS-Controlled Power Down Modes
Data collection: 1024 samples at 1kSPS
(Processor alternates between idle/active)
LOB Calculation
(Processor active full-time)
Data transmission
(Radio transmitter active)
Sleep
(All systems power down)
[Figure: Power (mW, 0 to 500) vs. Time (s, 0 to 0.12), with two processor status traces]
Processor Idle: low = idle, high = active
Processor Sleep: low = sleep, high = active or idle
Dynamic Voltage Scaling
Digitally adjustable DC-DC converter powers SA-1110 core
µOS selects appropriate clock frequency based on workload and latency constraints
[Block diagram labels: SA-1110, µOS, Control (5), Controller, 3.6V, Vout]
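The frequency selection described above can be sketched as follows. This is a minimal illustration, not the µOS implementation: the operating-point table, the function names and the E ~ V^2 energy model are all assumptions made for the example, and the voltage/frequency pairs are not SA-1110 datasheet values.

```python
# Hypothetical (frequency in MHz, core voltage in V) operating points, sorted
# from slowest to fastest. Illustrative values only, not SA-1110 data.
OPERATING_POINTS = [(59.0, 0.9), (103.2, 1.1), (147.5, 1.3), (206.4, 1.5)]


def select_operating_point(cycles, deadline_s, points=OPERATING_POINTS):
    """Return the slowest (f_MHz, V) pair that finishes `cycles` by the deadline."""
    for f_mhz, vdd in points:
        if cycles / (f_mhz * 1e6) <= deadline_s:
            return f_mhz, vdd
    # No point meets the deadline: fall back to the fastest one.
    return points[-1]


def relative_energy(vdd, vdd_max=1.5):
    """Switching energy per cycle relative to full voltage (assumes E ~ V^2)."""
    return (vdd / vdd_max) ** 2


if __name__ == "__main__":
    f_mhz, vdd = select_operating_point(cycles=5_000_000, deadline_s=0.05)
    print(f"run at {f_mhz} MHz, {vdd} V, "
          f"{relative_energy(vdd):.0%} of full-voltage switching energy")
```

Picking the slowest point that still meets the latency constraint is what lets the supply voltage drop; a real controller would also account for the transition time of the DC-DC converter.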
Distributed Processing Exploiting DVS
Approach 1: All latency-critical computation at aggregating node
[Diagram: Sensors 1 to 7 perform A/D only; the cluster head performs all 7 FFTs, then BF & LOB]
Ecomp(Vdd = 1.5 V) = 7 Efft + Ebf + ELOB = 27.27 mJ on SA-1100
Approach 2: Parallelism ameliorates latency constraint
[Diagram: Sensors 1 to 7 each perform A/D and FFT; the cluster head performs BF & LOB]
FFT: Vdd = 0.9 V; BF & LOB: Vdd = 1.3 V
Ecomp(variable Vdd) = 15.16 mJ
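As a quick check of the two totals quoted above, the sketch below computes the fractional saving and the ideal quadratic scaling factors for the two reduced supply voltages. The function names are illustrative, and the V^2 factor is a first-order assumption; the slide's energies are the quoted totals, not values derived from this model.

```python
E_CENTRALIZED_MJ = 27.27  # Approach 1: all computation at the cluster head, Vdd = 1.5 V
E_DISTRIBUTED_MJ = 15.16  # Approach 2: FFTs at the sensors, variable Vdd


def energy_savings(baseline_mj, improved_mj):
    """Fraction of the baseline energy that the improved scheme avoids."""
    return 1.0 - improved_mj / baseline_mj


def voltage_scaling_factor(vdd, vdd_ref=1.5):
    """Ideal switching-energy ratio when Vdd drops from vdd_ref (assumes E ~ V^2)."""
    return (vdd / vdd_ref) ** 2


if __name__ == "__main__":
    print(f"savings: {energy_savings(E_CENTRALIZED_MJ, E_DISTRIBUTED_MJ):.1%}")
    print(f"FFT at 0.9 V:      {voltage_scaling_factor(0.9):.2f}x energy per operation")
    print(f"BF & LOB at 1.3 V: {voltage_scaling_factor(1.3):.2f}x energy per operation")
```

Distributing the FFTs relaxes the latency constraint at each node, which is what makes the lower supply voltages usable in the first place.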
Energy Scalable Algorithms
Original filter: $y[n] = \sum_{k=0}^{N-1} h[k]\,x[n-k]$
[Diagram, Original: tapped delay line x[n], x[n-1], x[n-2], ..., x[n-N+1] multiplied by h[0], h[1], h[2], ..., h[N-1] and summed to give y[n]]
[Diagram, Transformed: sorted coefficients with a re-order index selecting the samples, so the taps are applied as h[p], h[q], h[r], ..., h[s] to give y[n]]
[Plot: Accuracy (%) vs. Energy/sample (µJ) for the Original and Transformed filters]
[Sinha, ISLPED ’00]
Maximize quality for a given energy availability
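A minimal sketch of the transformed filter idea: process the taps in sorted order and stop when the energy budget (expressed here as a number of multiply-accumulates) runs out. Sorting by decreasing coefficient magnitude and the `max_macs` parameter are assumptions made for this example; the cited work should be consulted for the actual ordering and stopping rule.

```python
def fir_output(h, x_window, max_macs=None):
    """Compute y[n] = sum_k h[k] * x[n-k], where x_window[k] holds x[n-k].

    Taps are visited from largest to smallest |h[k]|. If max_macs is given,
    only that many multiply-accumulates are performed, trading accuracy
    for energy.
    """
    if len(h) != len(x_window):
        raise ValueError("h and x_window must have the same length")
    # Re-order index: tap positions sorted by decreasing coefficient magnitude.
    order = sorted(range(len(h)), key=lambda k: abs(h[k]), reverse=True)
    if max_macs is not None:
        order = order[:max_macs]
    return sum(h[k] * x_window[k] for k in order)


if __name__ == "__main__":
    h = [0.5, 0.1, -0.3, 0.05]
    x = [1.0, 1.0, 1.0, 1.0]
    for budget in (1, 2, 3, 4):
        print(budget, "MACs ->", fir_output(h, x, max_macs=budget))
```

Because the largest contributions are accumulated first, truncating the sum early degrades the output gradually rather than abruptly.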
Leakage: Low Duty Cycle Concern
$I_{leakage} \propto 10^{-V_T/S}$
[Plot: Total Charge flow vs. FFT Execution Time]
[Plot: Total Energy / Switching Energy vs. Duty Cycle (%)]
Leakage Dominates Switching Energy for Low Duty Cycles: “Off” State-centric Optimization
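The duty-cycle effect can be illustrated with a two-term model: switching energy is spent only while the circuit is active, while leakage is paid for the whole period unless the circuit is power-gated. The model and the power values below are assumptions chosen for illustration, not measurements from the slides.

```python
def total_to_switching_ratio(duty_cycle, p_switching_w, p_leakage_w):
    """Ratio of total energy to switching energy over one period.

    duty_cycle    : fraction of time the circuit is active (0 < d <= 1)
    p_switching_w : switching power while active (W)
    p_leakage_w   : leakage power, drawn for the entire period (W)
    """
    if not 0.0 < duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in (0, 1]")
    e_switching = p_switching_w * duty_cycle  # energy per unit time, active part only
    e_leakage = p_leakage_w                   # leakage flows the whole time
    return (e_switching + e_leakage) / e_switching


if __name__ == "__main__":
    for d in (1.0, 0.1, 0.01, 0.001):
        ratio = total_to_switching_ratio(d, p_switching_w=1e-3, p_leakage_w=1e-5)
        print(f"duty cycle {d:>6.1%}: total/switching = {ratio:.2f}")
```

With these illustrative numbers, leakage is a 1% overhead at full duty cycle but equals the switching energy at a 1% duty cycle.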
Power Aware Radio
[Block diagram labels: SA1110, Xilinx, PLL, Demod & slice, Data, Rx Power Control, Tx Power Control (0-20 dBm), Vreg (power down control)]
Fine-grain shutdown through regulators and bias control
Variable 6-level PA allows efficient transmission for 10 m to 100 m
RF Start-up Energy Overhead
[Plot: Energy per bit (nJ, log scale, 10 to 10000) vs. Packet size (bits, log scale, 10 to 100000)]
$\text{Energy} = P_{tx\_electronics}\,(T_{transmit} + T_{start}) + P_{out}\,T_{transmit}$
Significant loss in energy efficiency for small packet sizes
Startup Costs are Fundamental: Innovative Circuits and Protocols Required
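The energy expression above can be evaluated per bit to see why short packets are inefficient. The default parameter values (electronics power, output power, start-up time, bit rate) are hypothetical placeholders chosen for the example, not figures from the slides.

```python
def energy_per_bit(packet_bits, p_elec_w=0.1, p_out_w=0.001,
                   t_start_s=450e-6, bit_rate_bps=1e6):
    """Transmit energy per bit (J) for one packet, including start-up.

    Energy = P_tx_electronics * (T_transmit + T_start) + P_out * T_transmit
    All default values are illustrative assumptions.
    """
    t_transmit = packet_bits / bit_rate_bps
    energy = p_elec_w * (t_transmit + t_start_s) + p_out_w * t_transmit
    return energy / packet_bits


if __name__ == "__main__":
    for bits in (10, 100, 1_000, 10_000, 100_000):
        print(f"{bits:>7} bits: {energy_per_bit(bits) * 1e9:8.1f} nJ/bit")
```

For large packets the per-bit cost approaches (P_tx_electronics + P_out) / bit rate; for small packets the fixed start-up term dominates.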
Integrated System-on-a-Chip Vision
[Diagram blocks: Power Conversion, Energy Scavenger, A/D, Memory, RF, DSP, Sensor]
Compact Form Factor (mm³ to cm³)
Requires interconnection of diverse process technologies
Low computational requirement, but requires flexibility to adapt to time-varying scenarios
Cost, size and energy are the key design constraints
What is the best computation/communication fabric?
Why Not Software?
[Bar chart: Power (%), 0 to 45, by processor block: Cache; Control; GCLK; EBOX; I/O, PLL] [Montanaro, JSSC ’96]
Software Energy Dissipation is Dominated by Overhead and NOT by Useful Work
What about FPGAs?
[Figure: Cypress FPGA architecture: clusters of logic blocks (LB 0 to LB 7) with Cluster RAM and a cluster PIM, Channel RAM / Channel Memory Blocks (CMB), V-to-H and H-to-V PIMs, I/O Banks 0 to 7, PLL & Clock MUX with GCLK[3:0] and GCTL[3:0]]
Power breakdown, Cypress: Channels 68%, Clock/Control network 17%, Clusters 13%, IO 1%, DC 1%
Power breakdown, Xilinx (courtesy of J. Rabaey): Interconnect 65%, Clock 21%, I/O 9%, CLB 5%
Interconnect-Centric Architectures (Flexibility with Power Efficiency)
[Figure: array of processing elements (PE), each paired with a local memory (MEM)] [Simon, JSSC ’00]
Massively parallel, “slow” switching processors
Exploits locality of reference, low interconnect costs
Image Sensor Application: Wavelet based compression (3 Million Transistors, 0.6µm CMOS, 500µW)
New Energy Metrics in DSM Interconnect
[Bus model: adjacent lines driven from VDD, each with capacitance CL to ground and inter-line capacitance CI; $\lambda = C_I / C_L = 3$]
[Histograms: # of transitions of cost E vs. Normalized Energy, E, for the Standard model and the Sub-micron model]
[Coding scheme: Input Data (n bits) -> Encoder -> Extended Bus (n+a lines) -> Decoder -> Recovered Data]
Minimizing Transition Activity is not the Right Approach to Minimize Power
Optimal VDD and VT Control
$f_{CLK} = \dfrac{\beta\,(V_{DD} - V_{TH})^{\alpha}}{C_L \cdot V_{DD}}$
$P_{Switching} = a \cdot f_{CLK} \cdot C_L \cdot V_{DD}^{2}$
$P_{Leakage} = I_0 \cdot 10^{-V_{TH}/S} \cdot V_{DD}$
[Plot: Threshold voltage VTH (V), -0.1 to 0.7, vs. Supply voltage VDD (V), 0.0 to 1.2, with constant-frequency curves for 50 MHz, 100 MHz and 200 MHz]
[Plot: Power vs. Supply voltage VDD (V) at constant fCLK, showing P_Switching, P_Leakage and P_TOTAL]
VDD and VTH can be varied to keep a fixed performance
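The trade-off can be explored numerically: for a fixed clock frequency, each supply voltage implies a threshold voltage through the delay relation, and the total of switching and leakage power then has a minimum. The sketch below assumes the standard alpha-power delay model and uses made-up constants (`k_delay` stands for beta / C_L; `alpha`, `activity`, `c_load`, `i0` and `s` are likewise illustrative), so only the shape of the result is meaningful.

```python
def threshold_for_frequency(vdd, f_clk, k_delay=2e8, alpha=1.5):
    """Solve f_clk = k_delay * (vdd - vth)**alpha / vdd for vth (k_delay = beta / C_L)."""
    return vdd - (f_clk * vdd / k_delay) ** (1.0 / alpha)


def total_power(vdd, f_clk, activity=0.1, c_load=1e-9, i0=0.01, s=0.1):
    """Return (P_switching + P_leakage, vth) at the vth that sustains f_clk."""
    vth = threshold_for_frequency(vdd, f_clk)
    p_switching = activity * f_clk * c_load * vdd ** 2
    p_leakage = i0 * 10 ** (-vth / s) * vdd
    return p_switching + p_leakage, vth


def optimal_supply(f_clk, vdd_grid):
    """Grid search for the supply voltage that minimises total power."""
    return min(vdd_grid, key=lambda v: total_power(v, f_clk)[0])


if __name__ == "__main__":
    grid = [0.2 + 0.01 * i for i in range(101)]  # 0.2 V to 1.2 V
    for f in (50e6, 100e6, 200e6):
        vdd = optimal_supply(f, grid)
        p, vth = total_power(vdd, f)
        print(f"{f / 1e6:5.0f} MHz: VDD = {vdd:.2f} V, VTH = {vth:.3f} V, P = {p * 1e3:.2f} mW")
```

Lowering VDD cuts switching power quadratically but forces a lower VTH to hold the frequency, which raises leakage exponentially; the optimum sits where the two effects balance.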
Energy Efficiency of Digital Computation
Single butterfly architecture (4 multipliers, 6 adders)
[Block diagram: Data Memory and Twiddle ROM feed the butterfly datapath (inputs A, B, W; outputs X, Y); Control Logic supplies Data Address, Twiddle Address and R/W]
Butterfly structure: X = A + BW, Y = A - BW
[Plot: FFT Computation, Supply Voltage (VDD) vs. Threshold Voltage (Vth), with the optimal (Vdd, Vth) point marked]
Exploit Sub-threshold Operation for Sensor Circuits
Adaptive VDD/VT Architecture
[Block diagram labels: Circuit to be biased to optimum VDD/VT point, Matched Delay Line, Phase Detector, N/P Body Bias Generator (N, P), Lookup Table, Power Converter, VDD, Clock, Workload, Temp]
[Waveform labels: MAC at 0.175 V, 166 kHz clock, data]
[Miyazaki, ISSCC ’02]
Leakage Mitigation Using MTCMOS
[Circuit: low-VT logic (multiplier with inputs X3 to X0, Y3 to Y0 and outputs P0 to P7) between VDD and a virtual ground, with a high-VT sleep device below the virtual ground]
[Plot: Delay (ns), 8 to 16, vs. Sleep Transistor Width, W/L, 0 to 250, for two input vectors]
Vector A: X = 00000000 -> 11111111, Y = 00000000 -> 10000001
Vector B: X = 01111111 -> 11111111, Y = 10000001 -> 10000001
Device Sizing is a Major Concern in Multiple Threshold CMOS
“Leakage Feedback” Flip-Flop
[Circuit: flip-flop with D, Q, CLK and SLEEP signals, shown in the idle mode with inputs drifting]
Use leakage to hold state in the flip-flop: very low leakage in sleep mode, with high performance in active mode
[Kao, ESSCIRC ’01]
Computation vs. Communication
[Plot: Energy (J), log scale from 1E-11 to 1E-03, vs. Distance (m), log scale from 1 to 10000. Curves: Energy for Electronics + Transmit; R² Propagation Loss Limit (no electronics), assuming 10 pJ/bit/m²]
Computation: 1 nJ/op (µ-Processor); Communication (@ 10 m): 150 nJ/bit
@ 10 m: ~150 instructions/transmitted bit on a low-power processor
@ 10 m: > 1 million instructions/transmitted bit using dedicated hardware
Compute, Don’t Communicate
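The instruction counts quoted above follow from dividing the communication energy per bit by the computation energy per operation. A small sketch of that arithmetic (the function names are illustrative; the 0.15 pJ/op figure is simply what the "1 million instructions" claim implies, not a number stated on the slide):

```python
COMM_ENERGY_PER_BIT_J = 150e-9  # 150 nJ/bit at 10 m (from the slide)
CPU_ENERGY_PER_OP_J = 1e-9      # 1 nJ/op on a low-power processor (from the slide)


def operations_per_bit(comm_energy_per_bit_j, energy_per_op_j):
    """How many operations cost the same energy as transmitting one bit."""
    return comm_energy_per_bit_j / energy_per_op_j


def implied_energy_per_op(comm_energy_per_bit_j, ops_per_bit):
    """Energy per operation implied by a given operations-per-bit figure."""
    return comm_energy_per_bit_j / ops_per_bit


if __name__ == "__main__":
    print(operations_per_bit(COMM_ENERGY_PER_BIT_J, CPU_ENERGY_PER_OP_J))
    # More than 1 million ops/bit on dedicated hardware implies less than:
    print(implied_energy_per_op(COMM_ENERGY_PER_BIT_J, 1e6), "J/op")
```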
Fast Startup Transmitter
[Block diagram labels: PFD (fref), variable loop filter (LPF), /N, /N+1 divider, Σ-Δ modulator (channel, data)]
E/bit = 10 nJ/bit
[Plot: variable loop bandwidth vs. fixed loop bandwidth]
New Opportunities: “Digital” UWB Radio
[Block diagram labels: Data In, Code Generator, Pulse Generator, CLK Generator, LNA, T/H, Buffer, four A/Ds with registers, Coarse Acquisition & Fine Tracking, Data Out]
Minimal Front-end components: leverage low-power digital circuits
3-4 bits A/D sufficient (Newaskar, Blazquez, Chandrakasan, SIPS ‘02)
Multihop and the Characteristic Distance
Direct Transmission over total distance D: $E = \alpha_1 + \alpha_2 D^2$
  $\alpha_1$: Tx & Rx radio electronics; $\alpha_2$: attenuation, power amp; exponent 2: path loss exponent
Multihop Transmission with h hops (per-hop distance D/h): $E = h\left(\alpha_1 + \alpha_2 \left(\frac{D}{h}\right)^2\right)$
Minimizing over the number of hops: $\min_h E = 2\alpha_1 \frac{D}{d_{char}}$, where $d_{char} = \sqrt{\alpha_1/\alpha_2}$
[Plot: E, Energy (mJ), 0.05 to 0.25, vs. D, Total Distance (meters), 0 to 200, for 1 hop, 2 hops, 3 hops and 4 hops, with d_char marked; α1 = 30 nJ/bit, α2 = 10^-11 J/bit/m², 1000 bits]
Characteristic Distance for Multihop Transmission
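A short sketch of the multihop energy model with the slide's parameters (α1 = 30 nJ/bit, α2 = 10^-11 per bit, taken here as J/bit/m², 1000 bits). The function names and the brute-force search over hop counts are choices made for this example.

```python
import math

ALPHA1 = 30e-9  # J/bit, Tx and Rx radio electronics
ALPHA2 = 1e-11  # J/bit/m^2, attenuation and power amp (unit assumed)


def characteristic_distance(a1=ALPHA1, a2=ALPHA2):
    """d_char = sqrt(alpha1 / alpha2), in metres."""
    return math.sqrt(a1 / a2)


def multihop_energy(distance_m, hops, bits=1000, a1=ALPHA1, a2=ALPHA2):
    """Energy (J) to move `bits` over distance_m using equal-length hops."""
    return bits * hops * (a1 + a2 * (distance_m / hops) ** 2)


def best_hop_count(distance_m, max_hops=20):
    """Integer hop count that minimises the multihop energy."""
    return min(range(1, max_hops + 1),
               key=lambda h: multihop_energy(distance_m, h))


if __name__ == "__main__":
    print(f"d_char = {characteristic_distance():.1f} m")
    for d in (50, 100, 150, 200):
        h = best_hop_count(d)
        print(f"D = {d:3d} m: best h = {h}, E = {multihop_energy(d, h) * 1e3:.3f} mJ")
```

With these values d_char comes out near 55 m, and the best hop count tracks D / d_char rounded to a neighbouring integer.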
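Analogy to Buffered Interconnect
Multihop radio link of length L:
  Direct: $E = \alpha_1 + \alpha_2 L^2$
  h hops of length L/h: $E = h\left(\alpha_1 + \alpha_2 \left(\frac{L}{h}\right)^2\right)$
  $d_{char} = \sqrt{\alpha_1/\alpha_2}$
Buffered wire of length L (distributed RΔL, CΔL segments; buffer delay α):
  Unbuffered: $T = \alpha + RC\,L^2$
  k segments of length L/k: $T = k\left(\alpha + RC \left(\frac{L}{k}\right)^2\right)$
  $d_{char} = \sqrt{\alpha/(RC)}$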
Dithering of Characteristic Distance
[Diagram distances: 0.5 dchar, 0.5 dchar, dchar]
Fixed Multihop Routes: Lifetime = 47 days, 77 days or 77 days, depending on the route
Rotated Routes: combining the three routes with weights 5/11, 3/11 and 3/11 = 116 days of lifetime
Even spatial distribution of energy burden
Signal Processing in the Network
[Diagram: K senders (KN total samples) -> aggregator (N total samples) -> relay(s) -> receiver, with hops of length dchar]
[Plot: Energy Savings with Aggregation vs. Distance in dchar-length hops (dchar = 55 m), for K = 2, 3, 4, 5, 7 and 10; α1 = 30 nJ/bit, α2 = 10^-11 J/bit/m²]
Clustering Protocols
Clustering
  Localized control (through the “cluster head”)
  Local data aggregation
Randomized Rotation of cluster-heads
  Clusters formed during set-up
  Data transfers during steady-state
TDMA in steady state
  Node i transmits once per frame
  No collisions, low energy
  Maximum sleep time
[Timeline: each round (Round n, Round n+1, ...) starts with a set-up phase followed by a steady-state phase divided into frames; each frame contains a slot for node i]
[Heinzelman02]
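The round structure above can be sketched in a few lines. This is a deliberately simplified illustration, not the protocol from [Heinzelman02]: cluster heads are drawn uniformly at random from nodes that have not yet served in the current cycle, and every member gets one TDMA slot per frame. All names are hypothetical.

```python
import random


def pick_cluster_heads(nodes, num_heads, served, rng):
    """Set-up phase: choose heads at random among nodes that have not served yet."""
    eligible = [n for n in nodes if n not in served]
    if len(eligible) < num_heads:  # everyone has had a turn: start a new cycle
        served.clear()
        eligible = list(nodes)
    heads = rng.sample(eligible, num_heads)
    served.update(heads)
    return heads


def tdma_schedule(members):
    """Steady state: one slot per member per frame (slot index -> node)."""
    return {slot: node for slot, node in enumerate(members)}


if __name__ == "__main__":
    rng = random.Random(0)
    nodes = list(range(6))
    served = set()
    for round_no in range(3):
        heads = pick_cluster_heads(nodes, 2, served, rng)
        members = [n for n in nodes if n not in heads]
        print(f"round {round_no}: heads {heads}, schedule {tdma_schedule(members)}")
```

Rotating the cluster-head role spreads the cost of aggregation and long-range transmission across nodes, and the fixed slot assignment lets every member sleep outside its own slot.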
Opportunity: Reactive Radios [Rabaey, ISSCC02]
API and Middleware Layer
[Diagram: the Application/Protocol passes Energy, Reliability, Latency and Range requirements to the Power-Awareness Manager, which controls the Radio (Tx Power, Code Selection) and the Processor (Voltage/Frequency)]
set_max_energy(Energy energy)
set_max_latency(Time latency)
set_min_reliability(Prob probReception)
set_range(int nearestNodes, Node[] who, float meters)
Power Aware API: performance of communication defined and exposed as a basis for trade-offs
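One way to picture the API is as a small manager object that records the four requirements and maps them to radio settings. The sketch below mirrors the four call names listed above but is otherwise hypothetical: the class layout, the units, and especially the `PA_REACH_M` table (reach per power level) are assumptions for illustration. Only the six power levels and the 10 m to 100 m span come from the slides.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical reach (m) of each power-amplifier setting (dBm). The per-level
# distances are made up for this example.
PA_REACH_M = {0: 10.0, 3: 20.0, 5: 30.0, 10: 50.0, 15: 75.0, 20: 100.0}


@dataclass
class Requirements:
    max_energy_uj: Optional[float] = None
    max_latency_ms: Optional[float] = None
    min_reliability: Optional[float] = None
    nearest_nodes: Optional[int] = None
    who: List[str] = field(default_factory=list)
    range_m: Optional[float] = None


class PowerAwarenessManager:
    """Collects application requirements and derives a radio setting."""

    def __init__(self):
        self.req = Requirements()

    def set_max_energy(self, energy_uj):
        self.req.max_energy_uj = energy_uj

    def set_max_latency(self, latency_ms):
        self.req.max_latency_ms = latency_ms

    def set_min_reliability(self, prob_reception):
        if not 0.0 <= prob_reception <= 1.0:
            raise ValueError("probability must be between 0 and 1")
        self.req.min_reliability = prob_reception

    def set_range(self, nearest_nodes, who, meters):
        self.req.nearest_nodes = nearest_nodes
        self.req.who = list(who)
        self.req.range_m = meters

    def select_tx_power_dbm(self):
        """Lowest PA setting whose assumed reach covers the requested range."""
        target = self.req.range_m or 0.0
        for dbm in sorted(PA_REACH_M):
            if PA_REACH_M[dbm] >= target:
                return dbm
        raise ValueError("requested range exceeds the strongest PA setting")


if __name__ == "__main__":
    manager = PowerAwarenessManager()
    manager.set_min_reliability(0.99)
    manager.set_range(3, ["n1", "n2", "n3"], 40.0)
    print("Tx power:", manager.select_tx_power_dbm(), "dBm")
```

A fuller manager would also choose the convolutional code and the processor voltage/frequency from the reliability, latency and energy limits; this sketch covers only the range-to-power mapping.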
API and Middleware Layer
Quality of communication defined along four axes:
Concern               Metric
“How reliably?”       Reliability (BER)
“How much energy?”    Energy (µJ)
“How soon?”           Latency (ms)
“To whom?”            Range (m)
API-Controlled Operational Policy
[Plots: Reliability (log BER) vs. Range (m) for each Radiated Power setting (+0, +3, +5, +10, +15, +20 dBm) and each Convolutional Code (Uncoded; R=2/3, K=3; R=1/2, K=3; R=1/2, K=5; R=1/2, K=7)]
[Plots: Operational Policies, showing the Radiated Power and Convolutional Code selected as a function of Range (m) and Reliability (log BER)]
[Plot: Total Communication Energy (J) vs. Range (m), µAMPS-1 node, 1000 bits; energy rises toward higher quality]
Energy scales gracefully with communication quality
Energy Scavenging
[Diagram: Generator -> Regulator (Electronics) -> VDD -> Load]
Self-powered operation is a real option if the power dissipation can be scaled to 10’s to 100’s of µW
Mechanical vibration (e.g., machine-mounted sensors)
Electromagnetic fields (RF)
A major opportunity exists in developing energy scavengers (generator and associated electronics) for extracting useful energy from ambient sources
Vibration-to-Electric Energy
Hardwired Fabrics enable No Power Signal Processing (10 µW from generator)
[Figure labels: MEMS Generator, Controller]
Proximity Sensor: Wireless Power Supply
Simulation of rotating field over one period at 120kHz (in middle of 2D arrangement and supplied plane xy)
Courtesy of Snorre Kjesbu, ABB
Heel-Strike Energy Harvesting
Joe Paradiso (MIT Media Lab)
Hagood, Spearing, Schmidt (MIT)
[Diagram labels: Piezo, Valve (two), Valve Controller, High Pressure, Low Pressure, Electrical Power; Valve Operating Frequency ~ 30 kHz]
After 3-6 steps, it provides 3 mA for 0.5 sec (~10 mW)
3D Integration
3-D Standard Cell Placement and Routing
3-D Layout Editor
3-D Integration
Compact Interconnection of Heterogeneous Technologies
Conclusions
Exciting new applications enabled by a network of low-power wireless sensing devices
Power Aware Design Methodology supersedes Energy Efficient Design
Slower is Better: exploit sub-threshold operation, as fastest switching speed is not needed
Communication-centric design
  Energy per operation (mW/MIPS) will scale with technology
  Communication costs (nJ/bit) will not scale at the same rate
Low Energy Sensor Design Requires a System-level Approach: Tight Coupling Between Fabrics, Algorithms and Protocols