Sp09 CMPEN 411 L14 S.1 CMPEN 411 VLSI Digital Circuits Spring 2009 Lecture 14: Designing for Low...

Post on 21-Jan-2016


Sp09 CMPEN 411 L14 S.1

CMPEN 411 VLSI Digital Circuits

Spring 2009

Lecture 14: Designing for Low Power

[Adapted from Rabaey’s Digital Integrated Circuits, Second Edition, ©2003 J. Rabaey, A. Chandrakasan, B. Nikolic]

Sp09 CMPEN 411 L14 S.2

Reminders

Next lecture: Dynamic logic
- Reading assignment: Rabaey et al., 6.3

Sp09 CMPEN 411 L14 S.3

Review: CMOS Power Equations

$$P = C_L V_{DD}^2 f + t_{sc} V_{DD} I_{peak} f + V_{DD} I_{leak}$$

Dynamic power: $C_L V_{DD}^2 f$

Short-circuit power: $t_{sc} V_{DD} I_{peak} f$

Leakage power: $V_{DD} I_{leak}$
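As a rough illustration of how the three terms compare, the power equation can be evaluated directly. This is a minimal sketch; the function name and every numeric value below are hypothetical and chosen only to exercise the formula.

```python
def cmos_power(c_load, vdd, freq, t_sc, i_peak, i_leak):
    """Return (dynamic, short_circuit, leakage, total) power in watts."""
    dynamic = c_load * vdd ** 2 * freq          # C_L * VDD^2 * f
    short_circuit = t_sc * vdd * i_peak * freq  # t_sc * VDD * I_peak * f
    leakage = vdd * i_leak                      # VDD * I_leak
    return dynamic, short_circuit, leakage, dynamic + short_circuit + leakage


# Hypothetical operating point (not from the lecture):
# 1 pF load, 2.5 V supply, 100 MHz, 50 ps short-circuit time,
# 100 uA peak short-circuit current, 1 nA leakage current.
dyn, sc, leak, total = cmos_power(
    c_load=1e-12, vdd=2.5, freq=100e6, t_sc=50e-12, i_peak=1e-4, i_leak=1e-9
)
print(f"dynamic={dyn:.3e} W, short-circuit={sc:.3e} W, leakage={leak:.3e} W")
```

With these assumed numbers the dynamic term dominates, which is why the quadratic dependence on VDD matters most.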

Sp09 CMPEN 411 L14 S.4

Power and Energy Design Space

Energy | Design Time (constant throughput/latency) | Non-active Modules (constant throughput/latency) | Run Time (variable throughput/latency)
Active (Dynamic) | Logic design, Reduced Vdd, Transistor sizing, Multi-Vdd | Clock Gating | DFS, DVS (Dynamic Freq, Voltage Scaling)
Leakage (Standby) | Multi-VT, Stack effect, Pin ordering | Sleep Transistors, Multi-Vdd, Variable VT, Input control | Variable VT

Sp09 CMPEN 411 L14 S.5

Transistor Sizing for Minimum Energy

Device sizing COMBINED with supply voltage reduction is a very effective way to reduce the energy consumption of a logic network.

Device sizing affects dynamic energy consumption; the gain is largest for networks with large overall effective fan-outs ($F = C_L/C_{g,1}$).

Sp09 CMPEN 411 L14 S.7

Dynamic Power Consumption is Data Dependent

2-input NOR gate truth table:

A  B  Out
0  0  1
0  1  0
1  0  0
1  1  0

With input signal probabilities $P_{A=1} = 1/2$ and $P_{B=1} = 1/2$:

Static transition probability: $P_{0 \to 1} = P_{out=0} \times P_{out=1} = P_0 \times (1 - P_0)$

NOR static transition probability $= 3/4 \times 1/4 = 3/16$

Switching activity, $P_{0 \to 1}$, has two components:
- A static component: a function of the logic topology
- A dynamic component: a function of the timing behavior (glitching)
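The 3/16 figure can be checked by enumerating the truth table. This is a minimal sketch, assuming independent inputs that are each 1 with probability 1/2; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product


def nor(a, b):
    return int(not (a or b))


def static_transition_probability(gate, n_inputs=2):
    """P(0->1) = P(out=0) * P(out=1) for independent, equiprobable inputs."""
    rows = list(product((0, 1), repeat=n_inputs))
    ones = sum(gate(*row) for row in rows)  # rows of the truth table where out = 1
    p1 = Fraction(ones, len(rows))
    return (1 - p1) * p1


print(static_transition_probability(nor))  # 3/16
```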

Sp09 CMPEN 411 L14 S.8

NOR Gate Transition Probabilities

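[Figure: 2-input NOR gate with inputs A and B driving load $C_L$, and a plot of $P_{0 \to 1}$ versus $P_A$ and $P_B$, each ranging from 0 to 1]

$P_{0 \to 1} = P_0 \times P_1 = (1 - (1 - P_A)(1 - P_B)) \times (1 - P_A)(1 - P_B)$

Switching activity is a strong function of the input signal statistics. $P_A$ and $P_B$ are the probabilities that inputs A and B are one.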

Sp09 CMPEN 411 L14 S.10

Transition Probabilities for Some Basic Gates

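$P_{0 \to 1} = P_{out=0} \times P_{out=1}$

NOR: $(1 - (1 - P_A)(1 - P_B)) \times (1 - P_A)(1 - P_B)$
OR: $(1 - P_A)(1 - P_B) \times (1 - (1 - P_A)(1 - P_B))$
NAND: $P_A P_B \times (1 - P_A P_B)$
AND: $(1 - P_A P_B) \times P_A P_B$
XOR: $(1 - (P_A + P_B - 2 P_A P_B)) \times (P_A + P_B - 2 P_A P_B)$

[Figure: input A (signal probability 0.5) drives an inverter with output X; X and input B (signal probability 0.5) drive an AND gate with output Z]

For X: $P_{0 \to 1} = P_0 \times P_1 = (1 - P_A) P_A = 0.5 \times 0.5 = 0.25$

For Z: $P_{0 \to 1} = P_0 \times P_1 = (1 - P_X P_B) P_X P_B = (1 - (0.5 \times 0.5)) \times (0.5 \times 0.5) = 3/16$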
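The gate formulas above can be collected into a small lookup and applied to the X and Z example. This is a sketch that assumes independent gate inputs and takes X to be the inverse of A; the names `P_ONE` and `p01` are illustrative.

```python
def p01(p1):
    """Static 0->1 transition probability, given the probability that the output is 1."""
    return (1 - p1) * p1


# Probability that each gate's output is 1, assuming independent inputs.
P_ONE = {
    "NOT": lambda pa: 1 - pa,
    "AND": lambda pa, pb: pa * pb,
    "NAND": lambda pa, pb: 1 - pa * pb,
    "OR": lambda pa, pb: 1 - (1 - pa) * (1 - pb),
    "NOR": lambda pa, pb: (1 - pa) * (1 - pb),
    "XOR": lambda pa, pb: pa + pb - 2 * pa * pb,
}

pa = pb = 0.5
px = P_ONE["NOT"](pa)      # X is assumed to be the inverse of A
pz = P_ONE["AND"](px, pb)  # Z = X AND B
print(p01(px), p01(pz))    # 0.25 0.1875
```

Propagating the signal probability (not the transition probability) from gate to gate is what makes the chained calculation valid.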

Sp09 CMPEN 411 L14 S.11

Another Example

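[Figure: inputs A and B (each with signal probability 0.5) drive an OR gate with output X; X and B drive an AND gate with output Z]

For X (OR): $P_{0 \to 1} = (1 - 0.5)(1 - 0.5) \times (1 - (1 - 0.5)(1 - 0.5)) = 3/16$, with signal probability $P_X = 1 - (1 - 0.5)(1 - 0.5) = 3/4$

For Z (AND), treating X and B as independent: $P_{0 \to 1} = (1 - P_X P_B) \times P_X P_B = (1 - 3/4 \times 0.5) \times (3/4 \times 0.5) = 15/64 \approx 0.234$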

Sp09 CMPEN 411 L14 S.12

Inter-signal Correlations

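Determining switching activity is complicated by the fact that signals exhibit correlation in space and time, for example through reconvergent fan-out.

[Figure: inputs A and B (each with signal probability 0.5) drive an OR gate with output X; B also drives the AND gate that combines X and B into Z, so the two paths from B reconverge at Z]

Assuming independent gate inputs:

For X: $P_{0 \to 1} = (1 - 0.5)(1 - 0.5) \times (1 - (1 - 0.5)(1 - 0.5)) = 3/16$, with $P_X = 3/4$

For Z: $P_{0 \to 1} = (1 - 3/4 \times 0.5) \times (3/4 \times 0.5) = 15/64 \approx 0.234$

Have to use conditional probabilities:

$P(Z=1) = P(B=1) \times P(X=1 \mid B=1)$

Notice that Z = (A or B) and B = AB or B = B, so $P(X=1 \mid B=1) = 1$ and $P(Z=1) = P(B=1) = 1/2$. The 0 -> 1 transition probability should therefore be (and is) $1/2 \times 1/2 = 1/4$.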
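The effect of the reconvergent fan-out can be checked by enumerating all input combinations of Z = (A or B) and B, and comparing against an estimate that treats X and B as independent. This is a minimal sketch; the independence-based estimate uses the signal probability P(X=1) = 3/4 as the AND-gate input.

```python
from fractions import Fraction
from itertools import product

half = Fraction(1, 2)


def p01(p1):
    return (1 - p1) * p1


# Exact: enumerate the four equiprobable (A, B) pairs and evaluate Z = (A or B) and B.
p_z_exact = sum(
    Fraction(1, 4) for a, b in product((0, 1), repeat=2) if (a or b) and b
)

# Independence assumption: treat X = A or B and B as unrelated inputs of the AND gate.
p_x = 1 - (1 - half) * (1 - half)
p_z_naive = p_x * half

print(p01(p_z_exact), p01(p_z_naive))  # 1/4 15/64
```

The two results differ because X is always 1 whenever B is 1, which the independence assumption ignores.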

Sp09 CMPEN 411 L14 S.13

Logic Restructuring

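Logic restructuring: changing the topology of a logic network to reduce transitions

AND: $P_{0 \to 1} = P_0 \times P_1 = (1 - P_A P_B) \times P_A P_B$

[Figure: two implementations of F = ABCD, with every input signal probability equal to 0.5]

Chain implementation: W = AB, X = WC, F = XD
- W: $(1 - 0.25) \times 0.25 = 3/16$
- X: $7/64$
- F: $15/256$

Tree implementation: Y = AB, Z = CD, F = YZ
- Y: $3/16$
- Z: $3/16$
- F: $15/256$

Chain implementation has a lower overall switching activity than the tree implementation for random inputs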
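The chain and tree numbers can be reproduced by propagating signal probabilities through each AND gate and summing the per-node transition probabilities. This is a sketch assuming independent inputs with probability 1/2; the node names W, X, Y, Z follow the figure labels as best they can be read.

```python
from fractions import Fraction


def and_p1(pa, pb):
    """Probability that an AND output is 1, for independent inputs."""
    return pa * pb


def p01(p1):
    return (1 - p1) * p1


p = Fraction(1, 2)

# Chain: W = AB, X = WC, F = XD
w = and_p1(p, p)
x = and_p1(w, p)
f_chain = and_p1(x, p)
chain = [p01(w), p01(x), p01(f_chain)]

# Tree: Y = AB, Z = CD, F = YZ
y = and_p1(p, p)
z = and_p1(p, p)
f_tree = and_p1(y, z)
tree = [p01(y), p01(z), p01(f_tree)]

print(chain, sum(chain))  # total 91/256
print(tree, sum(tree))    # total 111/256
```

Note that this counts only the static component; the chain has unbalanced path delays, so glitching can change the comparison.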

Sp09 CMPEN 411 L14 S.14

Input Ordering

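[Figure: two orderings of the inputs to a chain of two AND gates, with signal probabilities $P_A = 0.5$, $P_B = 0.2$, $P_C = 0.1$]

Ordering 1: X = AB, F = XC

Ordering 2: X = BC, F = XA

Which is better with respect to transition probabilities?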

Sp09 CMPEN 411 L14 S.15

Input Ordering

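[Figure: two orderings of the inputs to a chain of two AND gates, with signal probabilities $P_A = 0.5$, $P_B = 0.2$, $P_C = 0.1$]

Ordering 1 (X = AB, F = XC): $P_{0 \to 1}$ at X $= (1 - 0.5 \times 0.2) \times (0.5 \times 0.2) = 0.09$

Ordering 2 (X = BC, F = XA): $P_{0 \to 1}$ at X $= (1 - 0.2 \times 0.1) \times (0.2 \times 0.1) = 0.0196$

Ordering 2 is better with respect to transition probabilities: the output F = ABC is the same in both orderings, so only the activity at X differs.

Beneficial to postpone the introduction of signals with a high transition rate (signals with signal probability close to 0.5)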
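The comparison of the two orderings can be sketched numerically, assuming independent inputs and two-input AND gates as in the calculation above; the variable names are illustrative.

```python
def p01(p1):
    return (1 - p1) * p1


pa, pb, pc = 0.5, 0.2, 0.1

x_ab = p01(pa * pb)     # ordering 1: X = AB
x_bc = p01(pb * pc)     # ordering 2: X = BC
f = p01(pa * pb * pc)   # F = ABC in both orderings

print(round(x_ab, 4), round(x_bc, 4), round(f, 4))  # 0.09 0.0196 0.0099
```

Only the internal node X differs between the orderings, so introducing the high-activity input A last lowers the total switching activity.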

Sp09 CMPEN 411 L14 S.16

Glitching in Static CMOS Networks

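[Figure: two-gate network in which inputs A and B drive the first gate (output X), and X and C drive the second gate (output Z); waveforms show A, B, C, X, and Z as ABC switches from 101 to 000 under a unit-delay model]

Gates have a nonzero propagation delay, resulting in spurious transitions or glitches (dynamic hazards).

Glitch: a node exhibits multiple transitions in a single cycle before settling to the correct logic value.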
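A unit-delay simulation makes the glitch visible. The sketch below ASSUMES both gates are 2-input NOR gates (the slide text does not name the gate type) and applies the 101 to 000 input change at time 0.

```python
def nor(a, b):
    return int(not (a or b))


def simulate(steps=4):
    """Unit-delay model: each gate output at time t+1 depends on its inputs at time t."""
    a, b, c = 0, 0, 0  # new input values, applied at t = 0
    x = nor(1, 0)      # steady-state X for the old inputs A=1, B=0
    z = nor(x, 1)      # steady-state Z for the old inputs (C=1)
    trace = [(x, z)]
    for _ in range(steps):
        # Both gates evaluate simultaneously on the previous time step's values.
        x, z = nor(a, b), nor(x, c)
        trace.append((x, z))
    return trace


print(simulate())  # [(0, 0), (1, 1), (1, 0), (1, 0), (1, 0)]
```

Z starts and ends at 0 but pulses to 1 for one time unit, because C falls one gate delay before X rises.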

Sp09 CMPEN 411 L14 S.18

Glitching in an RCA

[Figure: ripple-carry adder with carry input Cin and sum outputs S0 through S15; plot of sum output voltage (V) versus time (ps) with curves for Cin, S0, S1, S2, S3, S4, S5, S10, and S15]

Sp09 CMPEN 411 L14 S.19

Balanced Delay Paths to Reduce Glitching

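Glitching is due to a mismatch in the path lengths in the logic network; if all input signals of a gate change simultaneously, no glitching occurs.

So equalize the lengths of timing paths through logic.

[Figure: two networks of gates F1, F2, F3 annotated with signal arrival times. In the chain, the primary inputs arrive at time 0 and the intermediate outputs at times 1 and 2, so F2 and F3 see mismatched input arrival times. In the balanced tree, the primary inputs arrive at time 0 and both inputs of F3 arrive at time 1.]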
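The path-length mismatch can be quantified as the spread of input arrival times at each gate. This is a minimal sketch under a unit gate delay; the netlist format and the primary input names a, b, c, d are hypothetical.

```python
def arrival_skew(netlist, primary_inputs, gate_delay=1):
    """Return {gate: (output_arrival_time, input_skew)}.

    `netlist` is a list of (gate_name, [input_names]) in topological order.
    A nonzero input_skew means the gate's inputs change at different times,
    which is the condition under which glitches can occur.
    """
    arrival = {name: 0 for name in primary_inputs}
    report = {}
    for gate, inputs in netlist:
        times = [arrival[i] for i in inputs]
        arrival[gate] = max(times) + gate_delay
        report[gate] = (arrival[gate], max(times) - min(times))
    return report


chain = [("F1", ["a", "b"]), ("F2", ["F1", "c"]), ("F3", ["F2", "d"])]
tree = [("F1", ["a", "b"]), ("F2", ["c", "d"]), ("F3", ["F1", "F2"])]

print(arrival_skew(chain, "abcd"))  # F2 and F3 have skew 1 and 2
print(arrival_skew(tree, "abcd"))   # every gate has skew 0
```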

Sp09 CMPEN 411 L14 S.21

Dynamic Power as a Function of VDD

Decreasing the VDD
- decreases dynamic energy consumption (quadratically)
- but increases gate delay (decreases performance)

[Figure: normalized propagation delay $t_p$ (vertical axis, 1 to 5.5) versus $V_{DD}$ (horizontal axis, 0.8 V to 2.4 V)]

Determine the critical path(s) at design time and use high VDD for the transistors on those paths for speed. Use a lower VDD on the other gates, especially those that drive large capacitances (as this yields the largest energy benefits).
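The trade-off can be sketched numerically. The energy ratio follows directly from the quadratic dependence on VDD; the delay ratio below ASSUMES an alpha-power-law delay model with illustrative values of VT and alpha, neither of which comes from the lecture.

```python
def dynamic_energy_ratio(vdd_new, vdd_ref):
    """Energy per transition scales as C_L * VDD^2, so the ratio is (vdd_new / vdd_ref)^2."""
    return (vdd_new / vdd_ref) ** 2


def alpha_power_delay_ratio(vdd_new, vdd_ref, vt=0.4, alpha=1.5):
    """ASSUMPTION: alpha-power-law model, t_p proportional to VDD / (VDD - VT)^alpha."""
    def delay(v):
        return v / (v - vt) ** alpha
    return delay(vdd_new) / delay(vdd_ref)


print(round(dynamic_energy_ratio(1.8, 2.5), 3))  # 0.518
print(alpha_power_delay_ratio(1.8, 2.5) > 1.0)   # True: the gate gets slower
```

Under these assumptions, dropping the supply from 2.5 V to 1.8 V roughly halves the dynamic energy while lengthening the gate delay, which is why the lower supply is reserved for non-critical gates.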

Sp09 CMPEN 411 L14 S.22

Multiple VDD Considerations

How many VDD? Two is becoming common
- Many chips already have two supplies (one for core and one for I/O)

When combining multiple supplies, level converters are required whenever a module at the lower supply drives a gate at the higher supply (step-up)
- If a gate supplied with VDDL drives a gate at VDDH, the PMOS never turns off
- The cross-coupled PMOS transistors do the level conversion
- The NMOS transistors operate on a reduced supply

Level converters are not needed for a step-down change in voltage

Overhead of level converters can be mitigated by doing conversions at register boundaries and embedding the level conversion inside the flipflop (see Figure 11.47)

[Figure: level converter circuit with supplies VDDH and VDDL, input Vin, and output Vout]
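The reason the PMOS never turns off can be sketched with a first-order threshold check: with its source at VDDH and its gate driven only as high as VDDL, the source-gate voltage stays at VDDH - VDDL. The function name and the numeric values are hypothetical, and subthreshold conduction is ignored.

```python
def pmos_conducts_when_input_high(vddh, vddl, vtp_abs):
    """First-order check for a PMOS with source at VDDH and gate driven to VDDL.

    The device conducts if V_SG = VDDH - VDDL exceeds |V_Tp|.
    """
    return (vddh - vddl) > vtp_abs


# Illustrative values only.
print(pmos_conducts_when_input_high(vddh=2.5, vddl=1.8, vtp_abs=0.4))  # True
print(pmos_conducts_when_input_high(vddh=2.5, vddl=2.5, vtp_abs=0.4))  # False
```

When the check returns True, the driven gate draws static current for a logic-high input, which is the problem the level converter removes.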

Sp09 CMPEN 411 L14 S.23

Dual-Supply Inside a Logic Block

Minimum energy consumption is achieved if all logic paths are critical (have the same delay)

Clustered voltage-scaling
- Each path starts with VDDH and switches to VDDL (gray logic gates) when delay slack is available
- Level conversion is done in the flipflops at the end of the paths
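The idea of switching to VDDL where slack allows can be sketched for a single path. This is a simplified greedy illustration, not the published clustered voltage-scaling algorithm; it ASSUMES a fixed slowdown factor for gates at VDDL, and the function and parameter names are hypothetical.

```python
def cluster_voltage_assignment(gate_delays, clock_period, slowdown=1.5):
    """Assign "H" (VDDH) or "L" (VDDL) to each gate along one path.

    Gates are moved to VDDL from the output end backward while the path still
    meets the clock period, so the VDDL gates form one cluster at the end of
    the path and level conversion is only needed in the capturing flipflop.
    ASSUMPTION: a gate at VDDL is `slowdown` times slower than at VDDH.
    """
    supply = ["H"] * len(gate_delays)
    total = sum(gate_delays)
    for i in reversed(range(len(gate_delays))):
        extra = gate_delays[i] * (slowdown - 1)
        if total + extra > clock_period:
            break  # stop here so the VDDL cluster stays contiguous
        supply[i] = "L"
        total += extra
    return supply


print(cluster_voltage_assignment([1.0, 1.0, 1.0, 1.0], clock_period=5.0))
# ['H', 'H', 'L', 'L']
```

A path with no slack keeps every gate at VDDH, while a path with ample slack can run entirely at VDDL.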

Sp09 CMPEN 411 L14 S.26

Stack Effect

Subthreshold leakage is a function of the circuit topology and the value of the inputs

$V_T = V_{T0} + \gamma \left( \sqrt{|-2\phi_F + V_{SB}|} - \sqrt{|-2\phi_F|} \right)$

where $V_{T0}$ is the threshold voltage at $V_{SB} = 0$; $V_{SB}$ is the source-bulk (substrate) voltage; $\gamma$ is the body-effect coefficient

[Figure: 2-input NAND gate with inputs A and B, output Out, and internal node $V_X$ between the stacked NMOS transistors]

Leakage is least when A = B = 0

Leakage reduction due to stacked transistors is called the stack effect
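The body-effect equation can be evaluated to see how a raised source voltage increases the threshold. When both stacked transistors are off, the internal node VX sits above ground, so the upper transistor has VSB greater than 0. This is a minimal sketch; the values of gamma, phi_F, VT0, and VSB are illustrative and not taken from the lecture.

```python
import math


def threshold_voltage(vt0, vsb, gamma=0.4, phi_f=-0.3):
    """V_T = V_T0 + gamma * (sqrt(|-2*phi_F + V_SB|) - sqrt(|-2*phi_F|)).

    gamma (in V^0.5) and phi_f (in V) are illustrative values only.
    """
    return vt0 + gamma * (
        math.sqrt(abs(-2 * phi_f + vsb)) - math.sqrt(abs(-2 * phi_f))
    )


print(round(threshold_voltage(vt0=0.4, vsb=0.0), 3))  # 0.4
print(threshold_voltage(vt0=0.4, vsb=0.1) > 0.4)      # True
```

A higher threshold on the upper transistor, together with its negative gate-source voltage, is what suppresses the subthreshold current through the stack.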

Sp09 CMPEN 411 L14 S.28

Leakage as a Function of Design Time VT

Reducing the VT increases the sub-threshold leakage current (exponentially)
- 90 mV reduction in VT increases leakage by an order of magnitude

But, reducing VT decreases gate delay (increases performance)

[Figure: drain current $I_D$ (A) versus $V_{GS}$ (0 to 1 V) for $V_T = 0.4$ V and $V_T = 0.1$ V]

Determine the critical path(s) at design time and use low VT devices on the transistors on those paths for speed. Use a high VT on the other logic for leakage control.

A careful assignment of VT's can reduce the leakage by as much as 80%
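The rule of thumb of one decade of leakage per 90 mV of threshold reduction can be turned into a small calculator. This is a sketch that assumes the 90 mV-per-decade slope holds over the whole range; the function name is illustrative.

```python
def leakage_ratio(delta_vt_mv, mv_per_decade=90.0):
    """Factor by which subthreshold leakage grows when V_T is lowered by delta_vt_mv."""
    return 10 ** (delta_vt_mv / mv_per_decade)


print(leakage_ratio(90))   # 10.0
print(leakage_ratio(300))  # lowering V_T from 0.4 V to 0.1 V: over three decades
```

Under this assumption, a 90 mV gap between two threshold levels corresponds to about a 10x difference in leakage.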

Sp09 CMPEN 411 L14 S.29

Dual-Thresholds Inside a Logic Block

Minimum energy consumption is achieved if all logic paths are critical (have the same delay)

Use lower threshold on timing-critical paths
- Assignment can be done on a per gate or transistor basis; no clustering of the logic is needed
- No level converters are needed

Sp09 CMPEN 411 L14 S.30

IBM Cu11/Cu08 Blue Logic Library

ASIC Cu11 (130 nm) library: dual-Vt library
- 2690 total cells in standard cell library
- Nominal Vt level (~300 mV)
- Low Vt level (~210 mV)
- Low-Vt version has same physical footprint, ~15% improvement in gate delay, ~10x increase in leakage power

ASIC Cu08 (90 nm) library: multi-Vt library
- 2118 total cells in standard cell library
- Intermediate-Vt (AVT) and Low-Vt (LVT) version of each cell
- Two more Vt levels being planned (very low Vt and high Vt)

Sp09 CMPEN 411 L14 S.31

An example to summarize all design-time techniques

Critical path

Sp09 CMPEN 411 L14 S.32

Design Time Low Power Techniques

[Figure legend: Lower Vdd, Higher Vdd, Level Converter]

Sp09 CMPEN 411 L14 S.33

Design Time Low Power Techniques

[Figure legend: Higher Vth, Lower Vth]

Sp09 CMPEN 411 L14 S.34

Design Time Low Power Techniques

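Stack Forcing

[Figure: inverter (In, Out) with transistors of width W, and its stack-forced version in which each transistor is replaced by two series transistors of width 1/2 W]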

Sp09 CMPEN 411 L14 S.35

Low Power Techniques – Interaction w/ each other

[Figure legend: Higher Vth, Lower Vth]

Apply high Vth and size up to recover speed

Sp09 CMPEN 411 L14 S.36

Next Lecture and Reminders

Next lecture: Dynamic logic
- Reading assignment: Rabaey et al., 6.3