CMOS Inverter - จุฬาลงกรณ์มหาวิทยาลัย · 2009-11-12 ·...

B.Supmonchai July 5th, 2004

2102-545 Digital ICs 1

Chapter 5

CMOS Inverter

Boonchuay SupmonchaiIntegrated Design Application Research (IDAR) Laboratory

July 5, 2004; Revised - June 25, 2005


Goals of This Chapter

• Quantification of the design metrics of an inverter
  - Static (steady-state) behavior
  - Dynamic (transient-response) behavior
  - Energy efficiency
• Optimization of an inverter design
• Technology scaling and its impact on the inverter metrics


Digital Gate Design Metrics: Recap

• Cost
  - Complexity and area
• Reliability and robustness => static behavior
  - Noise margin, regenerative property
• Performance => dynamic behavior
  - Speed (delay)
• Energy efficiency
  - Energy and power consumption, energy-delay product


Why CMOS Inverter?

• CMOS, because it is the dominant technology of the era.
  - High packing density
  - Relatively easy process
• Inverter, because it is the nucleus of all digital designs.
  - The behavior of more intricate structures (logic gates, adders, etc.) can be almost completely derived by extrapolating the results obtained for the inverter.




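CMOS Inverter: A First Glance

[Figure: CMOS inverter schematic. A PMOS (tied to VDD) and an NMOS (tied to ground) share the input Vin and the output Vout, which drives the load CL.]

• Vin is driven by the output of another gate => fan-in
• CL is the collective capacitance of wires and gates => fan-out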


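CMOS Inverter: Physical View Recap

[Figure: inverter layout next to its schematic. Layout labels: polysilicon input (In), Metal 1 output (Out), contacts, N well, VDD and GND rails, PMOS and NMOS devices, with a 2λ dimension marked at the PMOS.]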


Two CMOS Inverters: Physical View

[Figure: layout of two abutted inverter cells. The cells share the power (VDD) and ground rails and are connected in metal.]


CMOS Inverter Static Behavior

[Figure: switch models. With Vin = VDD, Vout is connected to ground through Rn. With Vin = 0, Vout is connected to VDD through Rp.]

State of transistors:
  ON:  |VGS| > |VT|, Ron finite
  OFF: |VGS| < |VT|, Roff → ∞




CMOS Inverter Dynamic Behavior

• Gate response time is determined by the time needed to charge CL through Rp (or to discharge CL through Rn).

[Figure: switch models. Low to high (Vin = 0): CL charges from VDD through Rp. High to low (Vin = VDD): CL discharges to ground through Rn.]


CMOS Properties

• Full rail-to-rail swing
  - High noise margins
  - Logic levels do not depend on the relative device sizes => ratioless
    - Transistors can be minimum size
    - Regenerative property
• Low output impedance
  - Large fan-out (albeit with degraded performance)
  - Typical output resistance in the kΩ range


CMOS Properties (2)

• Extremely high input resistance (the MOS transistor gate is a nearly perfect insulator)
  - Nearly zero steady-state input current
• No direct path between power and ground in steady state (but there always exists a path with finite resistance between the output and either VDD or GND)
  - No static power dissipation
• Propagation delay is a function of the load capacitance and the resistance of the transistors


NMOS Short Channel I-V Plot Recap

NMOS transistor, 0.25 µm, Ld = 0.25 µm, W/L = 1.5, VDD = 2.5 V, VT = 0.4 V

[Figure: ID (A, × 10⁻⁴, 0 to 2.5) versus VDS (0 to 2.5 V) for VGS = 1.0, 1.5, 2.0, and 2.5 V. The saturation current shows a linear dependence on VGS.]



PMOS Short Channel I-V Plot Recap

• All polarities of all voltages and currents are reversed.

PMOS transistor, 0.25 µm, Ld = 0.25 µm, W/L = 1.5, VDD = 2.5 V, VT = -0.4 V

[Figure: ID (A, × 10⁻⁴, 0 to -1) versus VDS (0 to -2 V) for VGS = -1.0, -1.5, -2.0, and -2.5 V.]


Transforming PMOS I-V Plot

• The NMOS and PMOS I-V curves must be put into a common coordinate set of Vin, Vout, and IDn:

  IDSp = -IDSn
  VGSn = Vin ;  VGSp = Vin - VDD
  VDSn = Vout ; VDSp = Vout - VDD

[Figure: three panels showing the transformation of the PMOS curves (VGSp = -1 and VGSp = -2.5, relabeled Vin = 1.5 and Vin = 0).
  1. Relabel the curves using Vin = VDD + VGSp.
  2. Mirror around the x-axis: IDn = -IDp.
  3. Shift horizontally over VDD: Vout = VDD + VDSp.]


CMOS Inverter Load-Line Plot

CMOS 0.25 µm, W/Ln = 1.5, W/Lp = 4.5, VDD = 2.5 V, VTn = 0.4 V, VTp = -0.4 V

[Figure: IDn (A, × 10⁻⁴, 0 to 2.5) versus Vout (0 to 2.5 V). NMOS curves and transformed PMOS curves are drawn for Vin = 0, 0.5, 1.0, 1.5, 2.0, and 2.5 V. The DC operating points are the intersections of the NMOS and PMOS curves that belong to the same Vin.]


CMOS Inverter VTC

[Figure: Vout versus Vin (both 0 to 2.5 V), with the line Vin = Vout. Operating regions along the curve, from Vin = 0 to Vin = VDD:
  1. NMOS off, PMOS res
  2. NMOS sat, PMOS res
  3. NMOS sat, PMOS sat
  4. NMOS res, PMOS sat
  5. NMOS res, PMOS off]

* VTC = Voltage-Transfer Characteristic




Robustness of CMOS Inverter

• Precise value of the switching threshold, VM
  - VM is defined as the point where Vin = Vout
• Noise margins
  - Piecewise-linear approximation
  - Maximization
• Process variations
  - Device variations
  - Technology scaling


Switching Threshold

• At VM, where Vin = Vout, both the PMOS and NMOS transistors are in saturation (since VDS = VGS).

  VM ≈ r VDD / (1 + r), where r = kp VDSATp / (kn VDSATn)

• The switching threshold is set by the ratio r, which compares the relative driving strengths of the PMOS and NMOS transistors.
• Goal: set VM = VDD/2 (to maximize noise margins), so r ≈ 1.

$$ \frac{(W/L)_p}{(W/L)_n} = \frac{k'_n V_{DSAT,n} \left( V_M - V_{T,n} - V_{DSAT,n}/2 \right)}{k'_p V_{DSAT,p} \left( V_{DD} - V_M + V_{T,p} + V_{DSAT,p}/2 \right)} $$

  (The PMOS quantities k'p, VDSAT,p, and VT,p carry their negative signs.)


Switching Threshold Example

• In our generic 0.25 µm CMOS process, using the process parameters from Table 3.2, at VDD = 2.5 V, with a minimum-size NMOS device ((W/L)n = 1.5):

          VT0 (V)   γ (V^0.5)   VDSAT (V)   k' (A/V²)       λ (1/V)
  NMOS     0.43      0.4         0.63        115 × 10⁻⁶      0.06
  PMOS    -0.4      -0.4        -1           -30 × 10⁻⁶     -0.1

  (W/L)p / (W/L)n = [115 × 10⁻⁶ × 0.63 × (1.25 - 0.43 - 0.63/2)] / [(-30 × 10⁻⁶) × (-1.0) × (1.25 - 0.4 - 1.0/2)] = 3.5

  (W/L)p = 3.5 × 1.5 = 5.25 for a VM of 1.25 V
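The arithmetic of this example can be checked with a short sketch. The function name and argument order are illustrative only; the numbers are the Table 3.2 values quoted on the slide, with the PMOS parameters passed with their negative signs.

```python
def pmos_to_nmos_ratio(vdd, vm, kn, vdsat_n, vt_n, kp, vdsat_p, vt_p):
    """Return (W/L)p / (W/L)n that places the switching threshold at vm.

    PMOS parameters (kp, vdsat_p, vt_p) are signed, i.e. negative.
    """
    n_term = kn * vdsat_n * (vm - vt_n - vdsat_n / 2)
    p_term = kp * vdsat_p * (vdd - vm + vt_p + vdsat_p / 2)
    return n_term / p_term


ratio = pmos_to_nmos_ratio(
    vdd=2.5, vm=1.25,
    kn=115e-6, vdsat_n=0.63, vt_n=0.43,
    kp=-30e-6, vdsat_p=-1.0, vt_p=-0.4,
)
wl_p = ratio * 1.5  # minimum-size NMOS has (W/L)n = 1.5
print(f"ratio = {ratio:.2f}, (W/L)p = {wl_p:.2f}")
```

The computed ratio is slightly below the rounded 3.5 used on the slide, so the resulting (W/L)p is also slightly below 5.25.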


Example: Simulated Results

[Figure: simulated VM (V, 0.8 to 1.8) versus Wp/Wn (log scale, 1 to 10), for a minimum width-to-length ratio of 1.5. VM = 1.25 V is reached at a ratio of 3.4.]




Observations I

• VM is relatively insensitive to variations in the device ratio.
  - Small variations of the ratio do not significantly disturb the VTC.
  - It is common industry practice to set Wp smaller than the requirement.
• Increasing the width of the PMOS moves VM toward VDD.
• Increasing the width of the NMOS moves VM toward GND.


Noise Margins: Determining VIH and VIL

[Figure: piecewise-linear approximation of the VTC, with VOH = VDD, VOL = GND, a transition region of slope g through VM, and the break points VIL and VIH marked on the Vin axis.]

• By definition, VIH and VIL are the points where the gain dVout/dVin = -1.
• NMH = VDD - VIH
• NML = VIL - GND
• High gain in the transition region is therefore very desirable.
• Approximating the transition region by a line of slope g (the gain at VM, a negative number):
    VIH = VM - VM/g
    VIL = VM + (VDD - VM)/g
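As a minimal sketch of the piecewise-linear approximation above, the following evaluates VIL, VIH and the noise margins. The inputs VDD = 2.5 V, VM = 1.25 V and g = -27.5 are taken from the worked example later in these slides; the function name is hypothetical.

```python
def noise_margins(vdd, vm, gain):
    """Piecewise-linear noise margins; gain is the (negative) VTC slope at VM."""
    vih = vm - vm / gain
    vil = vm + (vdd - vm) / gain
    return {"VIL": vil, "VIH": vih, "NML": vil, "NMH": vdd - vih}


nm = noise_margins(vdd=2.5, vm=1.25, gain=-27.5)
for name, value in nm.items():
    print(f"{name} = {value:.2f} V")
```

Because VM sits at VDD/2 here, the two noise margins come out equal.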


CMOS Voltage Gain

[Figure: gain (0 to -18) versus Vin (0 to 2 V).]

• Gain is a strong function of the slopes of the currents in the saturation region, for Vin = VM.
• It is determined only by technology parameters, especially channel-length modulation (λ). The designer can influence it only through the supply voltage and VM (transistor sizing).


Example: VTC and Noise Margin

• For a 0.25 µm process with (W/L)p/(W/L)n = 3.4, (W/L)n = 1.5 (minimum size), and VDD = 2.5 V:
  - First-order analysis: VM ≈ 1.25 V, g = -27.5, VIL = 1.2 V, VIH = 1.3 V, NML = NMH = 1.2 V
  - Real values: VIL = 1.03 V, VIH = 1.45 V, NML = 1.03 V, NMH = 1.05 V
• Output resistance => sensitivity of the gate output with respect to noise
  - Low-output resistance = 2.4 kΩ
  - High-output resistance = 3.3 kΩ
  - Preferably as low as possible




Observations II

• First-order analysis overestimates the gain.
  - The maximum gain is only 17 at VM → VIL = 1.17 V, VIH = 1.33 V
• The piecewise-linear approximation is overly optimistic.
  - It is the major contributor to the deviation from the true noise margins.
• The CMOS inverter is a poor analog amplifier!
  - One of the major differences between analog and digital designs is that digital circuits operate in the regions of extreme nonlinearity.
    - Well-defined and well-separated high and low signals


Impact of Process Variation on VTC

[Figure: Vout versus Vin (both 0 to 2.5 V) for three cases: nominal, good PMOS with bad NMOS, and bad PMOS with good NMOS.]

• Process variations (mostly) cause a shift in the switching threshold.


Scaling the Supply Voltage

[Figure, left: VTCs (Vout versus Vin, 0 to 2.5 V) for decreasing supply voltages. Reducing VDD improves the gain...]

[Figure, right: VTCs for very low supply voltages (0 to 0.2 V), with the gain = -1 points marked. ...but the gain deteriorates for very low VDD.]

• Practical lower bound: VDDmin > 2 to 4 kT/q


Observations III

• Reducing the supply voltage has a positive impact on the energy dissipation...
  - But it is also detrimental to the delay of the gate.
• The DC characteristic becomes increasingly sensitive to device variations once the supply and intrinsic voltages become comparable.
• Scaling the supply voltage = reducing the swing
  - Reduces internal noise (e.g., crosstalk)
  - Makes the gate more susceptible to external noise sources that do not scale




CMOS Inverter Dynamic Behavior

• The transient behavior of the gate is determined by the time it takes to charge and discharge the load capacitance, CL, through the on-transistors.
  - Delay is a function of load capacitances and transistor on-resistances.
• Getting CL as small as possible is crucial to the realization of high-performance CMOS circuits.
  - Transistor capacitances
  - Wire capacitances
  - Fanout
• Wire resistances also become more important.

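Computing the Capacitances

[Figure: inverter M1/M2 (input Vin, output Vout) driving a second inverter M3/M4 (output Vout2) through an interconnect.
  Intrinsic capacitances: Cgd12, Cdb1, Cdb2.
  Extrinsic capacitances: interconnect Cw and fanout gate capacitances Cg3, Cg4.
  Simplified model: all components are lumped into a single CL at Vout.]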


Finding Cgd: The Miller Effect

• M1 and M2 are either in cut-off or in saturation.
• The floating gate-drain capacitor is replaced by a capacitance to ground (gate-bulk capacitor).

"A capacitor experiencing identical but opposite voltage swings at both its terminals can be replaced by a capacitor to ground whose value is two times the original value."

[Figure: M1 with Cgd1 connected between Vin and Vout, each node swinging by ΔV in opposite directions, and the equivalent circuit with 2Cgd1 from Vout to ground.]


Diffusion Capacitances: Cdb1 and Cdb2

• We can simplify the diffusion capacitance calculations by using a factor Keq that linearizes the nonlinear capacitor relative to its junction capacitance under zero bias:

  Ceq = Keq Cj0

  0.25 µm process    high-to-low          low-to-high
                     Keqbp     Keqsw      Keqbp     Keqsw
  NMOS               0.57      0.61       0.79      0.81
  PMOS               0.79      0.86       0.59      0.7




Extrinsic Capacitances: Cg3 and Cg4

• The extrinsic, or fan-out, capacitance is the total gate capacitance of the loading gates M3 and M4.

  Cfan-out = Cgate(NMOS) + Cgate(PMOS)
           = (CGSOn + CGDOn + Wn Ln Cox) + (CGSOp + CGDOp + Wp Lp Cox)

• This is a simplification of the actual situation:
  - It assumes all the components of Cgate are between Vout and GND (or VDD).
  - It assumes the channel capacitances of the loading gates are constant.


Example: Layout of Two Inverters

[Figure: layout of two inverters with In and Out terminals, polysilicon gates, Metal 1, VDD and GND rails; PMOS 1.125/0.25, NMOS 0.375/0.25, minimum drawn length marked, λ = 0.125 µm.]

  0.25 µm   W/L          AD (µm²)   PD (µm)   AS (µm²)   PS (µm)
  NMOS      0.375/0.25   0.3        1.875     0.3        1.875
  PMOS      1.125/0.25   0.7        2.375     0.7        2.375

  AD = drain area, PD = drain perimeter, AS = source area, PS = source perimeter


Example: Components of CL (0.25 µm)

  C term   Expression                                Value (fF) H→L   Value (fF) L→H
  Cgd1     2 Cgd0n Wn                                0.23             0.23
  Cgd2     2 Cgd0p Wp                                0.61             0.61
  Cdb1     Keqbpn ADn Cj + Keqswn PDn Cjsw           0.66             0.90
  Cdb2     Keqbpp ADp Cj + Keqswp PDp Cjsw           1.5              1.15
  Cg3      (2 Cgd0n) Wn + Cox Wn Ln                  0.76             0.76
  Cg4      (2 Cgd0p) Wp + Cox Wp Lp                  2.28             2.28
  Cw       from extraction                           0.12             0.12
  CL       Σ                                         6.1              6.0
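The totals in the last row are simply the column sums. A small sketch that re-adds the component values listed in the table (the dictionary layout is illustrative only):

```python
# Component values in fF as (high-to-low, low-to-high), copied from the table.
components_fF = {
    "Cgd1": (0.23, 0.23),
    "Cgd2": (0.61, 0.61),
    "Cdb1": (0.66, 0.90),
    "Cdb2": (1.5, 1.15),
    "Cg3": (0.76, 0.76),
    "Cg4": (2.28, 2.28),
    "Cw": (0.12, 0.12),
}

cl_hl = sum(hl for hl, _ in components_fF.values())
cl_lh = sum(lh for _, lh in components_fF.values())
print(f"CL(H->L) = {cl_hl:.2f} fF, CL(L->H) = {cl_lh:.2f} fF")
```

The sums agree with the tabulated 6.1 fF and 6.0 fF once rounded to one decimal place.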


Wiring Capacitance

• The wiring capacitance depends on the length and width of the connecting wires. It is a function of the distance of the fan-out gates from the driving gate and of the number of fan-out gates.
• Wiring capacitance is growing in importance with the scaling of technology.




Inverter Propagation Delay

[Figure: switch models. Low to high (Vin = 0): CL charges from VDD through Rp. High to low (Vin = VDD): CL discharges to ground through Rn.]

"Propagation delay is proportional to the time constant of the network formed by the on-resistance and the load capacitance."

  tp = f(Ron, CL)
  tpHL = 0.69 Reqn CL
  tpLH = 0.69 Reqp CL

$$ t_p = \frac{t_{pHL} + t_{pLH}}{2} = 0.69\, C_L \left( \frac{R_{eqn} + R_{eqp}}{2} \right) $$

• To equalize rise and fall times, make the on-resistances of the NMOS and PMOS approximately equal.


Inverter Transient Response (0.25 µm)

[Figure: simulated Vin and Vout (V) versus t (0 to 250 ps), with tpHL and tpLH marked.]

• VDD = 2.5 V, W/Ln = 1.5, W/Lp = 4.5
• Reqn = 13 kΩ / 1.5, Reqp = 31 kΩ / 4.5
• Analysis: tpHL = 36 ps, tpLH = 29 ps, so tp = (36 + 29)/2 = 32.5 ps
• Simulation: tpHL = 39.9 ps and tpLH = 31.7 ps
• The analytical result is too optimistic (about 10% better).
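The analytical delays can be reproduced from tp = 0.69 Req CL. This sketch assumes the load values CL = 6.1 fF (high-to-low) and 6.0 fF (low-to-high) from the earlier components-of-CL example; the helper name is hypothetical.

```python
def propagation_delay(req_ohm, cl_farad):
    """First-order RC delay estimate: tp = 0.69 * Req * CL."""
    return 0.69 * req_ohm * cl_farad


req_n = 13e3 / 1.5  # NMOS equivalent resistance for W/L = 1.5
req_p = 31e3 / 4.5  # PMOS equivalent resistance for W/L = 4.5

tphl_ps = propagation_delay(req_n, 6.1e-15) * 1e12
tplh_ps = propagation_delay(req_p, 6.0e-15) * 1e12
tp_ps = (tphl_ps + tplh_ps) / 2
print(f"tpHL = {tphl_ps:.1f} ps, tpLH = {tplh_ps:.1f} ps, tp = {tp_ps:.1f} ps")
```

The results round to the 36 ps and 29 ps quoted for the analysis.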


Inverter Propagation Delay, Revisited

• To see how a designer can optimize the delay of a gate, we have to expand Req in the delay equation.

  tpHL = 0.69 Reqn CL = 0.69 (3 CL VDD) / (4 IDSATn)

$$ t_{pHL} \approx 0.52\, \frac{C_L}{(W/L)_n\, k'_n\, V_{DSATn}} $$

[Figure: normalized tp (1 to 5.5) versus VDD (0.8 V to 2.4 V).]


Minimizing Propagation Delay

• Reduce CL
  - Keep the drain diffusion as small as possible
• Increase the W/L ratio of the transistor
  - Most powerful and effective way
  - Watch out for self-loading!
    - It occurs when the intrinsic capacitance dominates
• Increase VDD
  - Trades off energy efficiency for performance
  - Very minimal improvement above a certain level
  - Reliability concerns enforce a firm upper bound on VDD




PMOS-to-NMOS Ratio

• So far the PMOS and NMOS have been sized so that their Req values match (a ratio of 3 to 3.5):
  - symmetrical VTC
  - equal high-to-low and low-to-high propagation delays
• If speed is the only concern, reduce the width of the PMOS device!
  - Widening the PMOS degrades tpHL because of the larger parasitic capacitance.

  β = (W/L)p/(W/L)n
  r = Reqp/Reqn (resistance ratio of identically sized PMOS and NMOS)

$$ \beta_{opt} = \sqrt{ r \left( 1 + \frac{C_W}{C_{dn1} + C_{gn2}} \right) } $$


PMOS-to-NMOS Ratio Effects

[Figure: tpLH, tpHL, and tp (ps, 30 to 50) versus β (1 to 5), analytic and simulated; the tpLH and tpHL curves cross at β = 2.4.]

• β of 2.4 (= 31 kΩ / 13 kΩ) gives a symmetrical response
• βopt ≈ 1.6 to 1.9
• When the wire capacitance is negligible (Cdn1 + Cgn2 >> CW), βopt = √r
• If the wire capacitance dominates, a larger value of β must be used
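A minimal sketch of the optimum-ratio expression βopt = sqrt(r (1 + CW/(Cdn1 + Cgn2))), using r = 31 kΩ / 13 kΩ from the slide. The capacitance arguments are hypothetical placeholders in arbitrary but consistent units.

```python
import math


def beta_opt(r, c_w=0.0, c_dn1=1.0, c_gn2=1.0):
    """Delay-optimal PMOS/NMOS ratio; capacitances in any consistent unit."""
    return math.sqrt(r * (1 + c_w / (c_dn1 + c_gn2)))


r = 31e3 / 13e3
no_wire = beta_opt(r)               # negligible wire capacitance: sqrt(r)
heavy_wire = beta_opt(r, c_w=4.0)   # assumed wire load twice Cdn1 + Cgn2
print(f"beta_opt without wire: {no_wire:.2f}, with heavy wire: {heavy_wire:.2f}")
```

The second call illustrates the last bullet: a dominant wire capacitance pushes the optimum toward a larger β.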


Device Sizing for Performance

• Divide the capacitive load, CL, into
  - Cint: intrinsic (diffusion and Miller effect)
  - Cext: extrinsic (wiring and fanout)

  tp = 0.69 Req Cint (1 + Cext/Cint) = tp0 (1 + Cext/Cint)

  where tp0 = 0.69 Req Cint is the intrinsic (unloaded) delay of the gate

• Widening both PMOS and NMOS by a factor S reduces Req by an identical factor (Req = Rref/S), but raises the intrinsic capacitance by the same factor (Cint = S Ciref):

  tp = 0.69 Rref Ciref (1 + Cext/(S Ciref)) = tp0 (1 + Cext/(S Ciref))
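To make the diminishing return of upsizing concrete, this sketch evaluates tp/tp0 = 1 + Cext/(S Ciref) for a few sizing factors. The load ratio Cext/Ciref = 1.0 is an assumed illustrative value, not one given in the slides.

```python
def normalized_delay(s, cext_over_ciref):
    """tp / tp0 for an inverter widened by the sizing factor s."""
    return 1 + cext_over_ciref / s


for s in (1, 2, 5, 10):
    print(f"S = {s:2d}: tp/tp0 = {normalized_delay(s, 1.0):.2f}")
```

Going from S = 1 to S = 5 removes most of the extrinsic term, while going from 5 to 10 changes the delay only marginally.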


Observation IV

• The intrinsic delay of the inverter, tp0, is independent of the sizing of the gate.
  - tp0 is determined purely by technology and inverter layout.
  - With no load, the increased drive strength of the gate is totally offset by the increased capacitance.
• Any S sufficiently larger than Cext/Cint captures most of the achievable performance gain; increasing S further yields little extra speed at a substantial area cost.




Sizing Impacts on Delay

[Figure: tp (ps, 20 to 38) versus sizing factor S (1 to 15) for a fixed load; for large S the curve flattens because of the self-loading effect (the intrinsic capacitance dominates).]

• The majority of the improvement is already obtained for S = 5.
• Sizing factors larger than 10 barely yield any extra gain (and cost significantly more area).


Impact of Fanout on Delay

• The extrinsic capacitance, Cext, is a function of the fanout of the gate.
  - The larger the fanout, the larger the external load.
• First determine the input loading effect of the inverter. Both Cg and Cint are proportional to the gate sizing, so Cint = γ Cg, where γ is independent of gate sizing, and

  tp = tp0 (1 + Cext/(γ Cg)) = tp0 (1 + f/γ)

• The delay of an inverter is a function of the ratio between its external load capacitance and its input gate capacitance, the effective fan-out f:

  f = Cext/Cg


Inverter Chain

• Goal: minimize the delay through an inverter chain

[Figure: chain of N inverters (1, 2, ..., N) from In to Out; the first inverter has input capacitance Cg,1 and the last one drives CL.]

• The delay of the j-th inverter stage is

  tp,j = tp0 (1 + Cg,j+1/(γ Cg,j)) = tp0 (1 + fj/γ)

• Overall delay: tp = Σ tp,j = tp0 Σ (1 + Cg,j+1/(γ Cg,j))
• If CL is given:
  - How should the inverters be sized?
  - How many stages are needed to minimize the delay?


Sizing the Inverters in the Chain

• The optimum size of each inverter is the geometric mean of its neighbors. This means that if each inverter is sized up by the same factor f with respect to the preceding gate, it will have the same effective fan-out and the same delay.

$$ f = \sqrt[N]{C_L / C_{g,1}} = \sqrt[N]{F} $$

  where F represents the overall effective fan-out of the circuit (F = CL/Cg,1)

• The minimum delay through the inverter chain is

$$ t_p = N\, t_{p0} \left( 1 + \sqrt[N]{F} / \gamma \right) $$

  - The relationship between tp and F is linear for one inverter, square root for two, etc.




Example: Inverter Chain Sizing

• CL/Cg,1 has to be evenly distributed over N = 3 inverters:

  CL/Cg,1 = 8/1
  f = ∛8 = 2

[Figure: three-inverter chain from In to Out with sizes 1, f = 2, and f² = 4; input capacitance Cg,1 and load CL = 8 Cg,1.]
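The sizing rule of this example (each stage larger than the previous one by f, the N-th root of F) can be sketched as follows; the function name is illustrative.

```python
def chain_sizes(overall_fanout, n_stages):
    """Return the per-stage fan-out f and the relative size of each inverter."""
    f = overall_fanout ** (1.0 / n_stages)
    sizes = [f ** j for j in range(n_stages)]
    return f, sizes


f, sizes = chain_sizes(overall_fanout=8, n_stages=3)
print(f"f = {f:.2f}, sizes = {[round(s, 2) for s in sizes]}")
```

For F = 8 and N = 3 this gives the sizes 1, 2 and 4 shown in the example.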


Determining N: Optimal Number of Inverters

• What is the optimal value for N given F (= f^N)?
  - If the number of stages is too large, the intrinsic delay of the stages becomes dominant.
  - If the number of stages is too small, the effective fan-out of each stage becomes dominant.
• The optimum N is found by differentiating the minimum delay expression with respect to the number of stages and setting the result to 0, giving

$$ \gamma + \sqrt[N]{F} - \frac{\sqrt[N]{F}\, \ln F}{N} = 0 $$

  - For γ = 0 (ignoring self-loading), N = ln(F) and the effective fan-out becomes f = e = 2.71828.
  - For γ = 1 (the typical case), the optimum effective fan-out (tapering factor) turns out to be close to 3.6.
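Substituting f = F^(1/N) (so that ln F / N = ln f) turns the optimality condition into f = exp(1 + γ/f), which has no closed form for γ > 0. The sketch below solves it by bisection; the bracket [1.5, 10] is an assumption that holds for γ between 0 and about 3.

```python
import math


def optimum_fanout(gamma, lo=1.5, hi=10.0, tol=1e-9):
    """Solve f = exp(1 + gamma / f) for the optimum effective fan-out."""
    def residual(f):
        return f - math.exp(1 + gamma / f)

    # residual() is negative at lo and positive at hi for 0 <= gamma <= 3
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if residual(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(f"gamma = 0: f_opt = {optimum_fanout(0.0):.3f}")
print(f"gamma = 1: f_opt = {optimum_fanout(1.0):.3f}")
```

The two cases reproduce the values quoted above: e for γ = 0 and roughly 3.6 for γ = 1.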


Optimum Effective Fan-Out

• Choosing f larger than the optimum has little effect on delay and reduces the number of stages (and area).
  - It is common practice to use f = 4 (for γ = 1).
  - Using too many stages (f too small), however, has a substantial negative impact on delay.

[Figure, left: optimum effective fan-out (2.5 to 5) versus γ (0 to 3).]

[Figure, right: normalized delay (0 to 7) versus f (1 to 5).]


Example: Inverter (Buffer) Staging

CL = 64 Cg,1, with Cg,1 = 1

  N   f     Stage sizes        tp
  1   64    1                  65
  2   8     1, 8               18
  3   4     1, 4, 16           15
  4   2.8   1, 2.8, 8, 22.6    15.3
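The tp column follows from the minimum-delay expression tp = N tp0 (1 + F^(1/N)/γ). The sketch below assumes γ = 1 and expresses tp in units of tp0, which reproduces the tabulated numbers.

```python
def chain_delay(overall_fanout, n_stages, gamma=1.0):
    """Minimum chain delay in units of tp0 for equally tapered stages."""
    f = overall_fanout ** (1.0 / n_stages)
    return n_stages * (1 + f / gamma)


delays = {n: chain_delay(64, n) for n in (1, 2, 3, 4)}
for n, d in delays.items():
    print(f"N = {n}: tp = {d:.1f} tp0")
```

Three stages give the lowest delay for F = 64, with four stages only marginally worse.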




Impact of Buffer Staging for Large CL

• Impressive speed-ups are obtained with an optimized cascaded inverter chain for very large capacitive loads.

  F (γ = 1)   Unbuffered   Two-stage chain   Opt. inverter chain
  10          11           8.3               8.3
  100         101          22                16.5
  1,000       1,001        65                24.8
  10,000      10,001       202               33.1


Input Signal Rise/Fall Time

• In reality, the input signal changes gradually (and both PMOS and NMOS conduct for a brief time). This affects the current available for charging/discharging CL and impacts the propagation delay.

[Figure: tp (ps, 36 to 54) versus ts (ps, 0 to 80) for a minimum-size inverter with a fan-out of a single gate; ts = input signal slope.]

• tp increases linearly with increasing input slope, ts, once ts > tp.
• ts is due to the limited driving capability of the preceding gate.


Design Challenge

• A gate is never designed in isolation: its performance is affected by both the fan-out and the driving strength of the gate(s) feeding its inputs. (Revised tp expression)

$$ t_p^{i} = t_{step}^{i} + \eta\, t_{step}^{i-1}, \qquad \eta \approx 0.25 $$

• Keep signal rise times smaller than or equal to the gate propagation delays.
  - Good for performance
  - Good for power consumption
• Keeping the rise and fall times of the signals small and of approximately equal values is one of the major challenges in high-performance designs (slope engineering).


Delay with Long Interconnect

• When gates are farther apart, wire capacitance and resistance can no longer be ignored.

[Figure: driver with intrinsic capacitance cint at Vin, a wire with parameters (rw, cw, L), and a fan-out load cfan at Vout.]

  tp = 0.69 Rdr Cint + (0.69 Rdr + 0.38 Rw) Cw + 0.69 (Rdr + Rw) Cfan

  where Rdr = (Reqn + Reqp)/2

  With Rw = rw L and Cw = cw L, this becomes

  tp = 0.69 Rdr (Cint + Cfan) + 0.69 (Rdr cw + rw Cfan) L + 0.38 rw cw L²

• Wire delay rapidly becomes the dominant factor (due to the quadratic term) in the delay budget for longer wires.




Where Does Power Go?

• Static power consumption
  - Ideally zero for static CMOS, but in the real world...
  - Leakage current loss
    - Diodes and transistors constantly lose charge
• Dynamic power consumption
  - Charging/discharging capacitances
    - Major source of power dissipation in CMOS circuits
  - Direct-path current loss
    - Short circuit between the power rails during switching


Dynamic Power Consumption

[Figure: inverter driving CL, with supply current iVDD(t) drawn from VDD.]

$$ E_{supplied/cycle} = \int_0^{\infty} i_{VDD}(t)\, V_{DD}\, dt = C_L V_{DD}^2 $$

$$ E_{stored/cycle} = \int_0^{\infty} i_{VDD}(t)\, v_{out}(t)\, dt = C_L V_{DD}^2 / 2 $$

  Pdyn = Energy/cycle × fclk = CL VDD² fclk


Switching Activity

• Power dissipation does not depend on the size of the devices but depends on how often the circuit is switched.
  - Switching activity ≡ frequency of energy-consuming transitions = f0→1

  Pdyn = CL VDD² f0→1
       = CL VDD² P0→1 fclk
       = Ceff VDD² fclk

  Effective capacitance Ceff = average capacitance switched per clock cycle

[Figure: clock and gate-output waveforms for which P0→1 = 0.25 and f0→1 = fclk/4.]
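A minimal numeric sketch of Pdyn = CL VDD² P0→1 fclk. The activity P0→1 = 0.25 is the value from the waveform example; CL = 6 fF, VDD = 2.5 V and the 100 MHz clock are assumed illustrative inputs.

```python
def dynamic_power(c_load, vdd, p_0_to_1, f_clk):
    """Average dynamic power in watts: CL * VDD^2 * P(0->1) * fclk."""
    return c_load * vdd ** 2 * p_0_to_1 * f_clk


p_dyn = dynamic_power(c_load=6e-15, vdd=2.5, p_0_to_1=0.25, f_clk=100e6)
c_eff = 6e-15 * 0.25  # effective capacitance switched per clock cycle
print(f"Pdyn = {p_dyn * 1e6:.3f} uW, Ceff = {c_eff * 1e15:.2f} fF")
```

Halving the switching activity halves the power, exactly as halving the physical capacitance would.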


Lowering Dynamic Power

  Pdyn = CL VDD² P0→1 f

• Capacitance (CL): a function of fan-out, wire length, and transistor sizes. Lower the physical capacitance.
• Supply voltage (VDD): has been dropping with successive generations. Quadratic effect.
• Activity factor (P0→1): how often, on average, do gates switch? A reduction can be obtained only at the logic and architectural abstraction levels.
• Clock frequency (f): increasing...




Short Circuit Power Consumption

• The finite slope of the input signal causes a direct current path between VDD and GND for a short period of time during switching, when both the NMOS and PMOS transistors are conducting (active).

[Figure: inverter driving CL, with the short-circuit current Isc flowing from VDD to ground during the interval tsc.]


Short Circuit Current Determinants

  Esc = tsc VDD Ipeak P0→1
  Psc = tsc VDD Ipeak f0→1

• tsc = duration of the slope of the input signal
• Ipeak is determined by
  - the saturation currents of the PMOS and NMOS transistors, which depend on their sizes, the process technology, temperature, etc.
  - the ratio between the input and output slopes (a strong dependence)
    - and therefore by CL


Impact of CL on Psc

[Figure: two inverters, one with a large and one with a small capacitive load.]

• Large capacitive load: Isc ≈ 0. The output fall time is significantly larger than the input rise time.
• Small capacitive load: Isc ≈ Imax. The output fall time is substantially smaller than the input rise time.


Ipeak as a Function of CL

[Figure: short-circuit current (A, × 10⁻⁴, -0.5 to 2.5) versus time (0 to 6 × 10⁻¹⁰ s) for CL = 20 fF, 100 fF, and 500 fF, with a 500 ps input slope.]

• When the load capacitance is small, Ipeak is large.
• Short-circuit dissipation is minimized by matching the rise/fall times of the input and output signals (slope engineering).




Psc as a Function of Rise/Fall Times

[Figure: normalized power (0 to 8, normalized with respect to the zero-input-rise-time dissipation) versus tsin/tsout (0 to 5) for VDD = 3.3 V, 2.5 V, and 1.5 V. W/Lp = 1.125 µm/0.25 µm, W/Ln = 0.375 µm/0.25 µm, CL = 30 fF.]

• When the load capacitance is small (tsin/tsout > 2 for VDD > 2 V), the power is dominated by Psc.
• If VDD < VTn + |VTp|, then Psc is eliminated, since both devices are never on at the same time.


Static (Leakage) Power Consumption

  Pstat = VDD Istat

[Figure: inverter with Vout = VDD, showing drain junction leakage, sub-threshold current, and gate leakage; one of the components is annotated as the dominant factor.]

• All leakages increase exponentially with temperature.
  - Junction leakage doubles every 9 °C.
• Sub-threshold current becomes a growing concern in very deep submicron (VDSM) technologies.
  - The closer the threshold voltage is to zero, the larger the leakage current at VGS = 0 V (when the NMOS is off).


Leakage as a Function of VT

[Figure: ID (A, log scale, 10⁻¹² to 10⁻²) versus VGS (0 to 1 V) for VT = 0.4 V and VT = 0.1 V.]

• Continued scaling of the supply voltage and the subsequent scaling of the threshold voltage will make sub-threshold conduction a dominant component of power dissipation.
• With a sub-threshold slope of about 90 mV/decade, each increase of roughly 270 mV in VT gives 3 orders of magnitude reduction in leakage (but adversely affects performance).
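The relation between a threshold shift and the off-current follows directly from the sub-threshold slope: the leakage changes by one decade for every slope-worth of VT. A small sketch, assuming the 90 mV/decade figure quoted above (the function name is hypothetical):

```python
def leakage_reduction(delta_vt, slope_v_per_decade=0.09):
    """Factor by which off-current drops when VT is raised by delta_vt volts."""
    return 10 ** (delta_vt / slope_v_per_decade)


# Three decades correspond to 3 x 90 mV = 270 mV of threshold increase.
print(f"{leakage_reduction(0.27):.0f}x for +270 mV")
# The two curves in the plot (VT = 0.1 V and 0.4 V) are 300 mV apart.
print(f"{leakage_reduction(0.30):.0f}x for +300 mV")
```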


TSMC Processes Leakage and VT

                          CL018G    CL018LP   CL018ULP  CL018HS   CL015HS   CL013HS
  Vdd                     1.8 V     1.8 V     1.8 V     2 V       1.5 V     1.2 V
  Tox (effective)         42 Å      42 Å      42 Å      42 Å      29 Å      24 Å
  Lgate                   0.16 µm   0.16 µm   0.18 µm   0.13 µm   0.11 µm   0.08 µm
  IDSat (n/p) (µA/µm)     600/260   500/180   320/130   780/360   860/370   920/400
  Ioff (leakage) (pA/µm)  20        1.60      0.15      300       1,800     13,000
  VTn                     0.42 V    0.63 V    0.73 V    0.40 V    0.29 V    0.25 V
  FET Perf. (GHz)         30        22        14        43        52        80

From MPR, June 2000, pp. 19: performance of various TSMC processes (G generic, LP low power, ULP ultra low power, HS high speed)




Exponential Increase in Leakages

[Figure: Ileakage (nA/µm, log scale, 1 to 10,000) versus temperature (30 to 110 °C) for the 0.10 µm, 0.13 µm, 0.18 µm, and 0.25 µm technologies.]

• Leakage currents double with every 10 °C increase in temperature.
• The leakage power is six orders of magnitude smaller than the dynamic power (at room temperature).


Energy and Power Equations

  E = CL VDD² P0→1 + tsc VDD Ipeak P0→1 + VDD Ileakage Tclock

  P = CL VDD² f0→1 + tsc VDD Ipeak f0→1 + VDD Ileakage

  where f0→1 = P0→1 × fclock

• First term: dynamic power (~90% today and decreasing relatively)
• Second term: short-circuit power (~8% today and decreasing absolutely)
• Third term: leakage power (~2% today and increasing)


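Sizing for Minimum Energy

• Goal: minimize the energy of the whole circuit
  - Design parameters: f and VDD
  - Constraint: tp ≤ tpref, the delay of the circuit with f = 1 and VDD = Vref

[Figure: two-inverter chain from In to Out. The first inverter (size 1, input capacitance Cg1) drives a second inverter of size f, which drives Cext.]

$$ t_p = t_{p0} \left[ \left( 1 + \frac{f}{\gamma} \right) + \left( 1 + \frac{F}{f \gamma} \right) \right] $$

• Overall effective fan-out: F = Cext/Cg1
• Intrinsic delay of the inverter: tp0 ∝ VDD/(VDD - VTE)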


Sizing for Minimum Energy II

• Performance constraint (γ = 1):

$$ \frac{t_p}{t_{pref}} = \frac{t_{p0}}{t_{p0ref}} \cdot \frac{2 + f + F/f}{3 + F} = \frac{V_{DD}}{V_{ref}} \cdot \frac{V_{ref} - V_{TE}}{V_{DD} - V_{TE}} \cdot \frac{2 + f + F/f}{3 + F} = 1 $$

• Energy for a single transition:

$$ E = V_{DD}^2\, C_{g1} \left[ (1 + \gamma)(1 + f) + F \right] $$

$$ \frac{E}{E_{ref}} = \left( \frac{V_{DD}}{V_{ref}} \right)^2 \frac{2 + 2f + F}{4 + F} $$
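The two relations above can be combined numerically: for a given f and F, solve the delay constraint for VDD, then evaluate E/Eref. This is only a sketch for γ = 1; the reference supply Vref = 2.5 V and VTE = 0.5 V are assumed values that do not appear on these slides, and the function names are hypothetical.

```python
def supply_for_equal_delay(f, F, vref=2.5, vte=0.5):
    """VDD that keeps tp equal to tpref (gamma = 1) for sizing factor f."""
    # Constraint rearranged to VDD / (VDD - VTE) = k
    k = (vref / (vref - vte)) * (3 + F) / (2 + f + F / f)
    if k <= 1:
        raise ValueError("delay constraint cannot be met for this f")
    return k * vte / (k - 1)


def energy_ratio(f, F, vref=2.5, vte=0.5):
    """E / Eref at the supply voltage that just meets the delay constraint."""
    vdd = supply_for_equal_delay(f, F, vref, vte)
    return (vdd / vref) ** 2 * (2 + 2 * f + F) / (4 + F)


for f in (1.0, 2.0, 3.5, 6.0):
    print(f"f = {f}: VDD = {supply_for_equal_delay(f, 20):.2f} V, "
          f"E/Eref = {energy_ratio(f, 20):.3f}")
```

With these assumed values and F = 20, the energy ratio falls steeply from f = 1, bottoms out for f between 3 and 4, and rises again for larger f.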




Sizing for Minimum Energy III

[Figure, left: VDD (V, 0 to 4) versus f (1 to 7) for F = 1, 2, 5, 10, and 20.]

[Figure, right: E/Eref (0 to 1.5) versus f (1 to 7) for the same values of F.]

• Optimum sizing occurs at fopt = √F.
• Increasing the device sizes beyond fopt increases the self-loading factor.
  - This deteriorates performance and requires an increase in supply voltage.


Observation V

• Device sizing, combined with supply voltage reduction, is very effective in reducing the energy consumption.
  - For F = 1, the minimum-size device is the most effective.
  - For a network with a large effective fan-out (F >> 1), a large reduction factor of almost 10 can be obtained.
• Oversizing transistors beyond the optimal value results in a hefty increase of energy.
  - Unfortunately, this is a common approach in many of today's designs.
• The optimal sizing factor for energy is smaller than the one for performance (delay), especially for large F.
  - For a fan-out of 20, fopt(energy) = 3.53 and fopt(delay) = 4.47.


Power-Delay and Energy-Delay Product

• Power-delay product (PDP) = Pav × tp = (CL VDD²)/2
  - PDP is the average energy consumed per switching event (Watts × sec = Joule).
  - A lower-power design could simply be a slower design.
• Energy-delay product (EDP) = PDP × tp = Pav × tp²
  - EDP is the average energy consumed multiplied by the computation time required.
  - It takes into account that one can trade increased delay for lower energy per operation (e.g., via supply voltage scaling that increases delay but decreases energy consumption).


Energy-Delay Plot

$$ EDP = \frac{\alpha\, C_L^2\, V_{DD}^3}{2 \left( V_{DD} - V_{TE} \right)}, \qquad V_{TE} = V_T + V_{DSAT}/2 $$

$$ V_{DDopt} = \frac{3}{2} V_{TE} $$

• VTn = 0.43 V, VDSATn = 0.63 V, so VTEn = 0.74 V
• VTp = -0.4 V, VDSATp = -1 V, so VTEp = -0.9 V
• VTE ≈ (VTEn + |VTEp|)/2 ≈ 0.8 V
• VDDopt = (3/2) × 0.8 V = 1.2 V

[Figure: normalized energy, delay, and energy-delay product (0 to 15) versus Vdd (0.5 V to 2.5 V) for the 0.25 µm process; the plot marks 1.1 V on the energy-delay curve.]
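The optimum supply follows from setting the derivative of VDD³/(VDD - VTE) to zero, which gives 3(VDD - VTE) = VDD, i.e. VDD = 1.5 VTE. The sketch below checks this numerically with a simple grid search, using VTE = 0.8 V from the slide; the prefactor α CL²/2 is dropped because it does not move the minimum.

```python
def edp_shape(vdd, vte):
    """Voltage-dependent part of the energy-delay product."""
    return vdd ** 3 / (vdd - vte)


vte = 0.8
# Candidate supplies from just above VTE up to 2.5 V, in 1 mV steps
candidates = [vte + 0.001 * i for i in range(1, 1701)]
v_opt = min(candidates, key=lambda v: edp_shape(v, vte))
print(f"numerical optimum: {v_opt:.3f} V, analytical 1.5 * VTE: {1.5 * vte:.3f} V")
```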




Observation VI

• Voltage dependence of the EDP
  - Higher supply voltages reduce delay, but harm the energy.
  - Vice versa for low voltages.
• VDDopt simultaneously optimizes performance (delay) and energy.
  - For submicron technologies with VT in the range of 0.5 V, VDDopt ≈ 1 V.
• VDDopt does not necessarily represent the optimum voltage for a given design problem.
  - The goal of the design (speed or power) determines the supply voltage.


Goals of Technology Scaling

• Make things cheaper:
  - Want to sell more functions (transistors) per chip for the same money
  - Build the same products cheaper, sell the same part for less money
  - The price per transistor has to be reduced
• But also want to be faster, smaller, and lower power


Technology Scaling

• Goals of scaling the dimensions by 30%:
  - Reduce gate delay by 30% (increase operating frequency by 43%)
  - Double transistor density
  - Reduce energy per transition by 65% (50% power savings at a 43% increase in frequency)
• Die size used to increase by 14% per generation
• A technology generation spans 2-3 years
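The percentages in these goals follow from simple scaling arithmetic. The sketch below assumes ideal full (constant-field) scaling, in which dimensions, voltages and capacitances all shrink by the same factor of 0.7, so that delay scales with 0.7, area with 0.7², and energy (C V²) with 0.7³.

```python
def full_scaling(shrink=0.7):
    """Relative metrics after one generation of ideal constant-field scaling."""
    frequency = 1 / shrink            # gate delay scales with the dimensions
    density = 1 / shrink ** 2         # area per transistor scales quadratically
    energy = shrink ** 3              # C * V^2, with C and V both scaling
    power = energy * frequency        # energy per transition times frequency
    return {"delay": shrink, "frequency": frequency, "density": density,
            "energy": energy, "power": power}


result = full_scaling()
for name, value in result.items():
    print(f"{name}: x{value:.2f}")
```

A frequency factor of about 1.43, a density factor of about 2, an energy factor of about 0.34 and a power factor of about 0.49 correspond to the 43%, 2x, 65% and 50% figures listed above.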


Technology Evolution (ITRS2000)

International Technology Roadmap for Semiconductors (ITRS) (http://public.itrs.net)

  Year of introduction     1999      2000      2001      2004      2008      2011      2014
  Technology node [nm]     180                 130       90        60        40        30
  Supply [V]               1.5-1.8   1.5-1.8   1.2-1.5   0.9-1.2   0.6-0.9   0.5-0.6   0.3-0.6
  Wiring levels            6-7       6-7       7         8         9         9-10      10
  Max frequency [GHz],
    local-global           1.2       1.6-1.4   2.1-1.6   3.5-2     7.1-2.5   11-3      14.9-3.6
  Max µP power [W]         90        106       130       160       171       177       186
  Bat. power [W]           1.4       1.7       2.0       2.4       2.1       2.3       2.5

Node years: 2007/65 nm, 2010/45 nm, 2013/33 nm, 2016/23 nm




B.Supmonchai

Technology Evolution (1999)


Technology Scaling Models

• Full scaling (constant electrical field)
  - Ideal model: dimensions and voltage scale together by the same factor S
• Fixed-voltage scaling
  - Most common until recently
  - Only dimensions scale; voltages remain constant
• General scaling
  - Most realistic for today's situation
  - Voltages and dimensions scale with different factors


B.Supmonchai

Scaling Long Channel Devices


B.Supmonchai

Scaling Short Channel Devices




Scaling Wire Capacitances

  Parameter                         Relation    General scaling
  Wire capacitance                  WL/t        εc/SL
  Wire delay                        Ron Cint    εc/SL
  Wire energy                       Cint V²     εc/(SL U²)
  Wire delay / intrinsic delay                  εc S/SL
  Wire energy / intrinsic energy                εc S/SL

S = technology scaling, U = voltage scaling, SL = wire-length scaling, εc = impact of fringing and interwire capacitance


Power Density vs. Scaling Factor

[Figure: power density (mW/mm², log scale, 1 to 1000) versus scaling factor κ (1 to 10, normalized by the 4 µm design rule), with trend lines ∝ κ³ and ∝ κ^0.7.]

• Power density increased approximately with S²
  - In correspondence with fixed-voltage scaling
• The recent trend is more in line with full scaling
  - Constant power density
  - Accelerated VDD scaling and more attention to power-reducing design techniques


B.Supmonchai

Evolution of Wire Delay and Gate Delay

How the ratio of wire over intrinsic contributions willactually evolve is debatable


Looking into the Future... (Year 2010)

• Performance: 2X every 16 months
  - 1 TIP (tera instructions/s)
  - 30 GHz clock
• Size
  - Number of transistors: 2 billion
  - Die: 40 × 40 mm
• Power
  - 10 kW!!
  - Leakage: 1/3 of active power




Some Interesting Questions

• What will cause this model to break?
• When will it break?
• Will the model gradually slow down?
  - Power and power density
  - Leakage
  - Process variation
