
EE141


EE141 EECS141 1 Lecture #13

EE141 EECS141 2 Lecture #13

No more re-grading of Midterm after today
Hw 5 due on Friday. Hw 6 posted early next week. You get TWO weeks for this one.
Project phase 1 introduced today
• Actual launch is on Friday
Out of town next week
• Wednesday lecture given by Stanley
• Friday lecture cancelled; make-up on Tuesday, March 16 at 3:30pm


EE141 EECS141 3 Lecture #13

Last lecture
• Inverter delay

Today's lecture
• Inverter energy
• Project launch
• Optimizing complex CMOS

Reading (Ch 5, 6)

EE141 EECS141 4 Lecture #13


EE141 EECS141 5 Lecture #13

$t_{pHL} = 0.69\, R_{eqn} C_L$
$t_{pLH} = 0.69\, R_{eqp} C_L$

[Figure: tpLH and tpHL]
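A minimal numeric sketch of the two delay expressions above. The component values are hypothetical, chosen only to illustrate the arithmetic, and the average `t_p = (t_pHL + t_pLH) / 2` is the usual definition of propagation delay rather than something stated on this slide.

```python
def propagation_delays(c_load, r_eqn, r_eqp):
    """First-order RC inverter delays in seconds: (t_pHL, t_pLH, t_p)."""
    t_phl = 0.69 * r_eqn * c_load   # output falls through the NMOS
    t_plh = 0.69 * r_eqp * c_load   # output rises through the PMOS
    return t_phl, t_plh, (t_phl + t_plh) / 2


# Hypothetical values: 10 fF load, 10 kOhm NMOS and 20 kOhm PMOS equivalent resistance
t_phl, t_plh, t_p = propagation_delays(c_load=10e-15, r_eqn=10e3, r_eqp=20e3)
print(f"tpHL = {t_phl * 1e12:.1f} ps, tpLH = {t_plh * 1e12:.1f} ps, tp = {t_p * 1e12:.1f} ps")
```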

EE141 EECS141 6 Lecture #13

Derived RC model assuming input was a step
But input is not a step
• Transistor turns on gradually

Let's look at gate switching more carefully
• Use our models to understand the effect of input slope


EE141 EECS141 7 Lecture #13

One way to analyze slope effect
• Plug non-linear IV into diff. equation and solve…

Simpler, approximate solution:
• Use VT* model

EE141 EECS141 8 Lecture #13

For falling edge at output:
• For reasonable inputs, can ignore IPMOS
• Either Vds is very small, or Vgs is very small

So, output current ramp starts when Vin = VT*
• Could evaluate the integral
• Learn more by using an intuitive, graphical approach


EE141 EECS141 9 Lecture #13

For reasonable input slopes:

EE141 EECS141 10 Lecture #13

For reasonable input slope
• Model matches Spice very well

Model breaks with very large tr
• Input looks "DC": traces out VTC
• Have other problems here anyways: short-circuit current


EE141 EECS141 11 Lecture #13

EE141 EECS141 12 Lecture #13

Switching power
• Charging/discharging capacitors

Leakage power
• Transistors are imperfect switches

Short-circuit power
• Both pull-up and pull-down on during transition

Static currents
• Biasing currents, in e.g. analog, memory


EE141 EECS141 13 Lecture #13

One half of the energy from the supply is consumed in the pull-up network, one half is stored on CL

Energy from CL is dumped during the 1→0 transition

[Figure: inverter with input Vin, output Vout, load capacitance CL and supply VDD]
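The half-and-half energy split stated above follows from integrating over a full 0→1 output transition. A short derivation, assuming the output swings from 0 to $V_{DD}$ and all supply current flows into $C_L$:

$$
\begin{aligned}
E_{VDD} &= \int_0^\infty i_{VDD}(t)\, V_{DD}\, dt = V_{DD} \int_0^{V_{DD}} C_L\, dv_{out} = C_L V_{DD}^2 \\
E_{C}   &= \int_0^\infty i_{VDD}(t)\, v_{out}\, dt = C_L \int_0^{V_{DD}} v_{out}\, dv_{out} = \tfrac{1}{2} C_L V_{DD}^2
\end{aligned}
$$

The difference, $E_{VDD} - E_C = \tfrac{1}{2} C_L V_{DD}^2$, is dissipated in the pull-up network, independent of its resistance.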

EE141 EECS141 14 Lecture #13

Power = Energy/transition × Transition rate

$P = C_L V_{DD}^2 \cdot f_{0\to1} = C_L V_{DD}^2 \cdot f \cdot \alpha_{0\to1} = C_{switched} V_{DD}^2 \cdot f$

Power dissipation is data dependent: it depends on the switching probability

Switched capacitance: $C_{switched} = C_L \cdot \alpha_{0\to1}$
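As an illustration of the switching power expression above, the sketch below evaluates it for one node. All numbers (load, supply, clock frequency, activity factor) are hypothetical.

```python
def switching_power(c_load, v_dd, f_clk, alpha):
    """Dynamic power in watts: C_L * VDD^2 * f * alpha_0->1."""
    c_switched = c_load * alpha          # switched capacitance
    return c_switched * v_dd ** 2 * f_clk


# Hypothetical node: 10 fF load, 2.5 V supply, 100 MHz clock, 25% switching activity
p = switching_power(c_load=10e-15, v_dd=2.5, f_clk=100e6, alpha=0.25)
print(f"P = {p * 1e6:.4f} uW")
```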


EE141 EECS141 15 Lecture #13

Energy consumed in N cycles, $E_N$:

$E_N = C_L \cdot V_{DD}^2 \cdot n_{0\to1}$

$n_{0\to1}$: number of 0→1 transitions in N cycles
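The data dependence can be made concrete by counting 0→1 transitions in a sampled output sequence. This is a sketch with a hypothetical bit sequence and hypothetical load and supply values.

```python
def count_rising(bits):
    """Number of 0->1 transitions in a sequence of output values."""
    return sum(1 for prev, cur in zip(bits, bits[1:]) if prev == 0 and cur == 1)


def energy_n_cycles(bits, c_load, v_dd):
    """E_N = C_L * VDD^2 * n_0->1 for the given output sequence."""
    return c_load * v_dd ** 2 * count_rising(bits)


# Hypothetical output values sampled once per cycle
outputs = [0, 1, 1, 0, 1, 0, 0, 1]
print(count_rising(outputs), energy_n_cycles(outputs, c_load=10e-15, v_dd=2.5))
```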

EE141 EECS141 16 Lecture #13

Short circuit current usually well controlled

[Figure: short-circuit current with a large load and with a small load]


EE141 EECS141 17 Lecture #13

$I_{DL} = J_S \times A$

JS = 10-100 pA/µm² at 25 deg C for 0.25 µm CMOS
JS doubles for every 9 deg C!
Much smaller than transistor leakage in deep submicron
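A small sketch of the temperature dependence described above, assuming the doubling rule applies as a pure exponential around 25 deg C. The default `js_25c` is the lower end of the quoted range, and the junction area is hypothetical.

```python
def junction_leakage(area_um2, temp_c, js_25c=10e-12, doubling_c=9.0):
    """Diode leakage I_DL = J_S * A in amperes, with J_S doubling every `doubling_c` deg C."""
    js = js_25c * 2 ** ((temp_c - 25.0) / doubling_c)   # A per um^2 at temp_c
    return js * area_um2


# Hypothetical 1 um^2 junction at room temperature and at 85 deg C
for t in (25.0, 85.0):
    print(f"{t:5.1f} C: {junction_leakage(1.0, t) * 1e12:.1f} pA")
```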

EE141 EECS141 18 Lecture #13

Transistors that are supposed to be off - leak

[Figure: inverter with input at VDD and with input at 0; in each case ILeak flows through the transistor that is supposed to be off]


EE141 EECS141 19 Lecture #13

Drain leakage current is exponential with VGS-VT

[Figure: transistor cross-section with terminals G, S, D and Sub and capacitances Ci and Cd; VDS = 1.2V]

EE141 EECS141 20 Lecture #13

[Figure: ID (A), log scale from 10^-12 to 10^-2, versus VGS (V) from 0 to 2.5; VT marked; regions labeled Exponential, Quadratic and Linear]

Subthreshold slope: S is ΔVGS for ID2/ID1 = 10

Typical values for S: 60 .. 100 mV/decade
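The definition of S translates directly into a current ratio: a change of ΔVGS scales the subthreshold current by 10^(ΔVGS/S). A minimal sketch with hypothetical voltage steps:

```python
def subthreshold_ratio(delta_vgs, s_mv_per_decade):
    """Ratio ID2/ID1 for a gate-voltage change delta_vgs (volts), given S in mV/decade."""
    return 10 ** (delta_vgs * 1000.0 / s_mv_per_decade)


# Hypothetical example: a 100 mV change at the two ends of the typical range of S
for s in (60.0, 100.0):
    print(f"S = {s:.0f} mV/decade: current changes by {subthreshold_ratio(0.1, s):.1f}x")
```

The same relation is why a lower threshold voltage raises off-state leakage exponentially.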


EE141 EECS141 21 Lecture #13

Two effects:
• diffusion current (like a bipolar transistor)
• exponential increase with VDS (η: DIBL)

3-10x in current technologies

[Figure: IDS [nA] versus VDS [V], from 0 to 1.4 V]

EE141 EECS141 22 Lecture #13

Threshold as a function of the length (for low VDS)

Drain-induced barrier lowering (DIBL) (for short L)


EE141 EECS141 23 Lecture #13

EE141 EECS141 24 Lecture #13


EE141 EECS141 25 Lecture #13

Data transmissions, computations or storage may often fail
• Because of yield issues
• Noise
• Interference

Redundancy can help to eliminate the impact of these errors
• Triple-modular redundancy
• Coding: add extra bits so that errors can be detected and corrected

Often used in wireless transmissions, hard disks, memory
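Triple-modular redundancy can be sketched as a bitwise majority vote over three copies of the same word. The function name and the example word are illustrative only.

```python
def tmr_vote(copy_a, copy_b, copy_c):
    """Bitwise majority of three copies: each output bit is 1 if at least two copies agree on 1."""
    return (copy_a & copy_b) | (copy_b & copy_c) | (copy_a & copy_c)


# Hypothetical 4-bit word with one copy corrupted in a single bit
word = 0b1011
corrupted = word ^ 0b0100
print(bin(tmr_vote(word, corrupted, word)))
```

The vote masks any error confined to one copy, at the cost of tripling the hardware or storage.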

EE141 EECS141 26 Lecture #13

In telecommunication, a Hamming code is a linear error-correcting code named after its inventor, Richard Hamming. Hamming codes can detect up to two simultaneous bit errors or correct single-bit errors (but not both at once without an extra parity bit); thus, reliable communication is possible when the Hamming distance between the transmitted and received bit patterns is less than or equal to one. By contrast, the simple parity code cannot correct errors, and can only detect an odd number of errors.

Because of the simplicity of Hamming codes, they are widely used in computer memory (RAM), in particular in a single-error-correcting and double-error-detecting variant commonly referred to as SECDED.

R. Hamming, 1915-1998


EE141 EECS141 27 Lecture #13

Hamming Codes

[Figure: parity check equations; e.g. B3 wrong: check results 1, 1, 0 = 3]

To correct 1 error in 4 bit word: 3 parity bits

In general: $2^k \ge k + n + 1$ (n: original bits; k: parity bits)
In this example: $8 \ge 3 + 4 + 1$

WE CAN DO BETTER
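The parity-bit bound and single-error correction for the Hamming code above can be sketched as follows. This uses the textbook (7,4) layout with parity bits at positions 1, 2 and 4, which is an assumption; the slide's own bit naming (B1, B2, ...) may map differently.

```python
def min_parity_bits(n):
    """Smallest k with 2**k >= n + k + 1 (n data bits, k parity bits)."""
    k = 1
    while 2 ** k < n + k + 1:
        k += 1
    return k


def hamming74_encode(data):
    """Encode 4 data bits into 7 bits; parity at positions 1, 2, 4, data at 3, 5, 6, 7."""
    code = [0] * 8  # index 0 unused so list indices match bit positions
    code[3], code[5], code[6], code[7] = data
    for p in (1, 2, 4):
        for pos in range(1, 8):
            if pos != p and pos & p:   # parity p covers positions with bit p set
                code[p] ^= code[pos]
    return code[1:]


def hamming74_correct(word):
    """Return (corrected word, syndrome); the syndrome is the error position, or 0."""
    syndrome = 0
    for pos, bit in enumerate(word, start=1):
        if bit:
            syndrome ^= pos
    fixed = list(word)
    if syndrome:
        fixed[syndrome - 1] ^= 1
    return fixed, syndrome


codeword = hamming74_encode([1, 0, 1, 1])
received = list(codeword)
received[2] ^= 1  # flip the bit at position 3
fixed, syndrome = hamming74_correct(received)
print(min_parity_bits(4), syndrome, fixed == codeword)
```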

EE141 EECS141 28 Lecture #13

Pulkit Grover, Stanley Chen, Nam-Seog Kim, and Prof. Jan Rabaey

Spring, 2010


EE141 EECS141 29 Lecture #13

Communication channel -- bit flipping error
Introduce redundancy at the transmitter -- error correction coding
Correct errors using redundancy at the decoder -- decoding
A class of immense recent (theoretical and practical) interest: LDPC Codes

EE141 EECS141 30 Lecture #13

A low-density parity-check (LDPC) code is a linear error correcting code, a method of transmitting a message over a noisy transmission channel. LDPC codes are capacity-approaching codes, which means that practical constructions exist that allow the noise threshold to be set very close (or even arbitrarily close on the BEC) to the theoretical maximum (the Shannon limit) for a symmetric memoryless channel. The noise threshold defines an upper bound for the channel noise up to which the probability of lost information can be made as small as desired. Using iterative belief propagation techniques, LDPC codes can be decoded in time linear in their block length.

LDPC codes are finding increasing use in applications where reliable and highly efficient information transfer over bandwidth or return channel constrained links in the presence of data-corrupting noise is desired (10 Gb Ethernet, DVB-S2, etc.)

LDPC codes are also known as Gallager codes, in honor of Robert G. Gallager, who developed the LDPC concept in his doctoral dissertation at MIT in 1960.

R. Gallager, 1931-


EE141 EECS141 31 Lecture #13

Check Node: Receive 5 bits from 5 Bit Nodes and send back parity check results
• Major Operation: XOR, simple control logic

Bit Node: Send to 3 Check Nodes and receive the results from those Check Nodes
• Major Operation: Majority voting, update value, simple control logic

Wires (Wires, Drivers, Repeaters)

EE141 EECS141 32 Lecture #13

[Figure: check nodes C1 to C4 connected to the bit nodes (information bits and parity bits)]


EE141 EECS141 33 Lecture #13

[Figure: check nodes C1 to C4 and the bit nodes; one bit marked X]

EE141 EECS141 34 Lecture #13

First iteration:
Step 1: bit nodes send the channel outputs to the connected check nodes

[Figure: check nodes C1 to C4 and the bit nodes with their channel outputs (0/1); one bit marked X]


EE141 EECS141 35 Lecture #13

First iteration:
Step 1: bit nodes send the channel outputs to check nodes they are connected to
Step 2: check nodes XOR the values they received from the other bit nodes and send it to the respective bit node

[Figure: check nodes C1 to C4 returning XOR results (0/1) to the bit nodes; one bit marked X]

EE141 EECS141 36 Lecture #13

Second iteration:
Bit nodes send the majority of the messages and their own channel output back to the check nodes
majority of (0,1,1) = 1 (bit decoded correctly)

[Figure: check nodes C1 to C4 and the bit nodes after majority voting; the bit previously marked X is corrected]
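The iteration steps above can be sketched as a small hard-decision decoder. The graph in `CHECKS` is a hypothetical toy code (4 check nodes, 6 bit nodes, each bit in exactly 2 checks), not the code used in the lecture or project. The sketch is also simplified: each bit node sends the same updated value to all of its check nodes.

```python
# Hypothetical toy graph: each tuple lists the bit nodes attached to one check node
CHECKS = [(0, 1, 2), (0, 3, 4), (1, 3, 5), (2, 4, 5)]


def parity_ok(bits, checks=CHECKS):
    """True if every check node sees even parity."""
    return all(sum(bits[i] for i in check) % 2 == 0 for check in checks)


def decode(received, checks=CHECKS, max_iters=5):
    """Iterate: checks send XOR of the other bits; bits take a majority vote."""
    values = list(received)
    for _ in range(max_iters):
        if parity_ok(values, checks):
            break
        inbox = [[] for _ in values]
        for check in checks:
            total = 0
            for i in check:
                total ^= values[i]
            for i in check:
                inbox[i].append(total ^ values[i])  # XOR of the *other* bits in this check
        # Majority of the check messages and the bit's own channel output
        values = [
            1 if 2 * (sum(msgs) + ch) > len(msgs) + 1 else 0
            for msgs, ch in zip(inbox, received)
        ]
    return values


codeword = [1, 1, 0, 1, 0, 0]       # satisfies all four checks
received = list(codeword)
received[4] ^= 1                    # channel flips one bit
print(decode(received) == codeword)
```

With two check messages plus the channel output, each bit node votes over three values, matching the majority of (0,1,1) example above.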


EE141 EECS141 37 Lecture #13

Check Node

Check Node: Receive 4 (M) bits from Bit Nodes and send back parity check results
Major Operation: XOR, simple control logic

[Figure: check node block diagram with inputs From B1, From B2, From B3, From B4, XOR gates, wire drivers and control logic]

Send to B1: XOR (B2,B3,B4), parity check result for B1
Send to B2: XOR (B1,B3,B4), parity check result for B2
Send to B3: XOR (B1,B2,B4), parity check result for B3
Send to B4: XOR (B1,B2,B3), parity check result for B4
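A behavioral sketch of the check node described above, for illustration only. It uses the identity that the XOR of the other three inputs equals the XOR of all four inputs XORed with the bit itself. A gate-level design may well compute each output separately instead.

```python
def check_node(b1, b2, b3, b4):
    """Return the parity check results sent back to B1..B4."""
    total = b1 ^ b2 ^ b3 ^ b4
    # total ^ bi cancels bi, leaving the XOR of the other three inputs
    return (total ^ b1, total ^ b2, total ^ b3, total ^ b4)


print(check_node(1, 0, 1, 1))
```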

EE141 EECS141 38 Lecture #13

Bit Node

Bit Node: Receive from 2 (N) Check Nodes and channel and send the result back to those Check Nodes
Major Operation: Majority voting, update value, simple control logic

[Figure: bit node block diagram with inputs From C1, From C2 and From Channel; Majority Voting, Register, Wire Driver and Control Logic blocks; outputs Send to C1 and Send to C2]


EE141 EECS141 39 Lecture #13

[Figure: decoder block diagram with Bit Node (Majority & FSM*, input from channel) and Check Node (Parity Check); blocks assigned to Phase I, Phase II and Phase III]

* Finite State Machine

EE141 EECS141 40 Lecture #13

3-Phases
• Phase 1 (warm-up)
  • Introduction to LDPC decoding
  • Design a check node
• Phase 2 (deeper design experience)
  • Design of bit node
  • Complete design of a decoder (with help)
• Phase 3 (design optimization)
  • Optimize your design for power-performance tradeoffs
  • Go for the gold!


EE141 EECS141 41 Lecture #13

[Figure: project and homework timeline]

Prj-ph1: 03/05~03/15, 11 days (launched Friday Mar. 5, due Monday Mar. 15)
Prj-ph2: 03/15~04/09, 25 days (launched Monday Mar. 15, due Friday Apr. 9)
Prj-ph3: 04/09~05/05, 26 days (launched Friday Apr. 9, Poster Session Wednesday May 5)

Homework over the same period: Hw 5, Hw 6, Hw 7, Hw 8, Hw 9, Hw10 (optional)

EE141 EECS141 42 Lecture #13

Do an exercise on LDPC decoding
• Given an input vector + decoding procedures, derive the final results step by step

Derive the schematic for Check Node
Design and size logic gates (w/ preset wire loading) for delay
Simulate the design in Cadence SPECTRE


EE141 EECS141 43 Lecture #13

EE141 EECS141 44 Lecture #13

Techniques very similar to the inverter case

Logical Effort technique as the means for gate sizing and topology optimization

However … some other things to be aware of


EE141 EECS141 45 Lecture #13

[Figure: 4-input gate with inputs A, B, C, D (A = B = C = 1, D: 0→1), internal node capacitances C1, C2, C3 and load CL; transistor sizes 2 and 4]

RC model:

EE141 EECS141 46 Lecture #13

[Figure: 4-input gate with inputs A, B, C, D, internal node capacitances C1, C2, C3 and load CL; transistor sizes 2 and 4]

Distributed RC model (Elmore delay)

$t_{pHL} = 0.69\, R_{eqn} (C_1 + 2C_2 + 3C_3 + 4C_L)$

Propagation delay deteriorates rapidly as a function of fan-in: quadratically in the worst case.
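The Elmore expression above generalizes to a series stack of any depth: the capacitance at node i discharges through i transistors. The sketch below assumes every transistor in the stack has the same equivalent resistance `r_eqn`, and uses normalized hypothetical values to show the quadratic growth with fan-in.

```python
def elmore_tphl(r_eqn, internal_caps, c_load):
    """t_pHL = 0.69 * R_eqn * (C1 + 2*C2 + ... + N*CL) for an N-transistor series stack."""
    caps = list(internal_caps) + [c_load]   # node 1 is nearest ground, the last is the output
    return 0.69 * r_eqn * sum((i + 1) * c for i, c in enumerate(caps))


# Hypothetical normalized values: R = 1, every node capacitance = 1
for fan_in in range(1, 7):
    t = elmore_tphl(1.0, [1.0] * (fan_in - 1), 1.0)
    print(fan_in, round(t, 2))
```

With equal capacitances the sum is N(N+1)/2, which is the quadratic worst case mentioned above.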


EE141 EECS141 47 Lecture #13

[Figure: tp (psec) versus fan-in; tpHL grows quadratically, tpLH grows linearly]

Gates with a fan-in greater than 4 should be avoided.

EE141 EECS141 48 Lecture #13

[Figure: tp (psec) versus eff. fan-out = CL/Cin for tpNOR2, tpNAND2 and tpINV]

All gates have the same drive current.

Slope is a function of "driving strength"


EE141 EECS141 49 Lecture #13

Fan-in: quadratic due to increasing resistance and capacitance

Fan-out: each additional fan-out gate adds two gate capacitances to CL

$t_p = a_1 FI + a_2 FI^2 + a_3 FO$

EE141 EECS141 50 Lecture #13

Transistor sizing: as long as fan-out capacitance dominates

Progressive sizing

[Figure: series stack M1, M2, M3, …, MN with inputs In1, In2, In3, …, InN, internal capacitances C1, C2, C3 and load CL, modeled as a distributed RC line]

M1 > M2 > M3 > … > MN (the FET closest to the output is the smallest)

Can reduce delay by more than 20%. Be careful: input loading, junction caps, decreasing gains as technology shrinks


EE141 EECS141 51 Lecture #13

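Transistor ordering

[Figure: two series stacks M1, M2, M3 with internal capacitances C1, C2 and load CL; the critical path is the input In1 switching 0→1 while the other inputs are at 1]

Critical input far from the output (CL, C1 and C2 charged): delay determined by time to discharge CL, C1 and C2

Critical input next to the output (CL charged, C1 and C2 discharged): delay determined by time to discharge CL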

EE141 EECS141 52 Lecture #13

Alternate logic structures F = ABCDEFGH


EE141 EECS141 53 Lecture #13

Isolating fan-in from fan-out using buffer insertion

[Figure: gate driving CL directly and through an inserted buffer]

EE141 EECS141 54 Lecture #13

Reducing the voltage swing

Full swing: $t_{pHL} = 0.5\, (C_L V_{DD}) / I_{DSATn}$
Reduced swing: $t_{pHL} = 0.5\, (C_L V_{swing}) / I_{DSATn}$

• linear reduction in delay
• also reduces power consumption

But the following gate is slower!
Or requires use of "sense amplifiers" on the receiving end to restore the signal level (memory design)
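A quick numeric check of the linear dependence on swing, using hypothetical values for the load, saturation current and voltages:

```python
def tphl_swing(c_load, v_swing, i_dsat_n):
    """t_pHL = 0.5 * C_L * V_swing / I_DSATn (use V_swing = VDD for full swing)."""
    return 0.5 * c_load * v_swing / i_dsat_n


# Hypothetical: 100 fF load, 0.5 mA saturation current, 2.5 V supply, swing halved to 1.25 V
full = tphl_swing(100e-15, 2.5, 0.5e-3)
reduced = tphl_swing(100e-15, 1.25, 0.5e-3)
print(f"full swing: {full * 1e12:.1f} ps, reduced swing: {reduced * 1e12:.1f} ps")
```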
