
Feb. 17, 2011

Page 1: Feb. 17, 2011

Feb. 17, 2011

• Midterm overview
• Real life examples of built chips
  – Clock Skew
• Arithmetic
• Data Centers
• Power reduction techniques
  – Dynamic Voltage / Frequency Scaling
  – Clock Throttling
  – Power Gating
  – Others?
• Project – 4b adder with Razor recovery

Page 2: Feb. 17, 2011
Page 3: Feb. 17, 2011

Go Over Problems

• 1c
• 2a; 2b
• 3c

Page 4: Feb. 17, 2011

Crossbar Design

Page 5: Feb. 17, 2011
Page 6: Feb. 17, 2011

Mirror Adder: Stick Diagram

[Figure: stick diagram of the mirror adder cell between the VDD and GND rails; inputs A, B, Ci; outputs Co and S]

Page 7: Feb. 17, 2011

The Mirror Adder

• The NMOS and PMOS chains are completely symmetrical. A maximum of two series transistors can be observed in the carry-generation circuitry.
• When laying out the cell, the most critical issue is the minimization of the capacitance at node Co. The reduction of the diffusion capacitances is particularly important.
• The capacitance at node Co is composed of four diffusion capacitances, two internal gate capacitances, and six gate capacitances in the connecting adder cell.
• The transistors connected to Ci are placed closest to the output.
• Only the transistors in the carry stage have to be optimized for speed. All transistors in the sum stage can be minimal size.

Page 8: Feb. 17, 2011

Transmission Gate Full Adder

[Figure: transmission-gate full adder schematic with three stages labeled Setup, Sum Generation, and Carry Generation; signals A, B, P, Ci, Co, S, and VDD]

Page 9: Feb. 17, 2011

Manchester Carry Chain

[Figure: two single-bit Manchester carry-chain cells, each between Ci and Co; labeled signals Pi, Gi, Di, and VDD]

Page 10: Feb. 17, 2011

Manchester Carry Chain

[Figure: four-bit Manchester carry chain; carry input Ci,0, propagate signals P0 to P3, generate signals G0 to G3, carry nodes C0 to C3, and VDD]

Page 11: Feb. 17, 2011

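Carry-Bypass Adder

[Figure: two versions of a four-bit adder built from four FA cells with per-bit propagate and generate inputs. Top: plain ripple chain from Ci,0 through Co,0, Co,1, Co,2 to Co,3. Bottom: the same chain followed by a multiplexer that selects between Co,2-driven ripple output and Ci,0 to produce Co,3, controlled by BP = P0 P1 P2 P3]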

Idea: if P0 and P1 and P2 and P3 are all 1, then Co,3 = Ci,0; otherwise the carry is either "killed" or "generated" inside the block.

Also called Carry-Skip
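
The bypass idea above can be sketched behaviorally in Python. This is only an illustrative model, not the circuit from the slides: the function name `carry_bypass_block` is hypothetical, and the definitions G = A AND B and P = A XOR B are assumptions, since the slides do not spell them out.

```python
def carry_bypass_block(a_bits, b_bits, c_in):
    """Behavioral model of one 4-bit carry-bypass (carry-skip) block.

    a_bits, b_bits: lists of 0/1 values, least significant bit first.
    Returns (sum_bits, c_out, bypassed).
    Assumes G = A AND B and P = A XOR B (not defined on the slides).
    """
    p = [a ^ b for a, b in zip(a_bits, b_bits)]  # propagate signals
    g = [a & b for a, b in zip(a_bits, b_bits)]  # generate signals

    # Ripple path: compute every carry bit by bit.
    carries = [c_in]
    for pk, gk in zip(p, g):
        carries.append(gk | (pk & carries[-1]))
    sum_bits = [pk ^ ck for pk, ck in zip(p, carries)]

    # Bypass path: BP = P0 P1 P2 P3 lets the multiplexer pass Ci,0 directly.
    bypassed = all(p)
    c_out = c_in if bypassed else carries[-1]
    return sum_bits, c_out, bypassed


# 0b1010 + 0b0101 with carry-in 1: every propagate bit is 1, so the block is bypassed.
print(carry_bypass_block([0, 1, 0, 1], [1, 0, 1, 0], 1))
```

In this model the bypass changes only when the carry-out becomes available in hardware, not its value; the selected output always equals the rippled one.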

Page 12: Feb. 17, 2011

Carry-Bypass Adder (cont.)

Page 13: Feb. 17, 2011

Carry Ripple versus Carry Bypass

[Figure: propagation delay tp versus number of bits N for the ripple adder and the bypass adder, with N = 4..8 marked on the N axis]

Page 14: Feb. 17, 2011

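Carry-Select Adder

[Figure: one carry-select block. A Setup stage produces P,G; a "0" Carry Propagation path and a "1" Carry Propagation path run in parallel; a Multiplexer driven by Co,k-1 picks the carry vector and outputs Co,k+3; Sum Generation follows]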
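
The carry-select structure can be sketched functionally in Python. This is a minimal behavioral model under assumptions: the helper names (`ripple`, `carry_select_block`, `carry_select_add`) and the 16-bit width with 4-bit blocks are illustrative choices, not taken from the slides.

```python
def ripple(a_bits, b_bits, c_in):
    """Ripple-carry add of two little-endian bit lists; returns (sum_bits, c_out)."""
    out, c = [], c_in
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ c)
        c = (a & b) | (c & (a ^ b))
    return out, c


def carry_select_block(a_bits, b_bits, c_in):
    """One carry-select block: precompute both outcomes, then select."""
    sum0, cout0 = ripple(a_bits, b_bits, 0)  # "0" carry propagation
    sum1, cout1 = ripple(a_bits, b_bits, 1)  # "1" carry propagation
    # Multiplexer: the actual incoming carry picks one precomputed result.
    return (sum1, cout1) if c_in else (sum0, cout0)


def carry_select_add(a, b, width=16, block=4):
    """Add two unsigned integers with chained carry-select blocks."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    carry, result = 0, 0
    for lo in range(0, width, block):
        s, carry = carry_select_block(a_bits[lo:lo + block],
                                      b_bits[lo:lo + block], carry)
        for i, bit in enumerate(s):
            result |= bit << (lo + i)
    return result, carry


print(carry_select_add(1234, 4321))
```

The model computes both candidate results for every block regardless of the incoming carry, which mirrors the hardware trade: duplicated carry logic in exchange for a short multiplexer-only path between blocks.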

Page 15: Feb. 17, 2011

Carry Select Adder: Critical Path

Page 16: Feb. 17, 2011

Linear Carry Select

[Figure: 16-bit linear carry-select adder built from four equal blocks (Bit 0-3, Bit 4-7, Bit 8-11, Bit 12-15) producing S0-3, S4-7, S8-11, S12-15 from Ci,0. Each block has Setup, "0" Carry, "1" Carry, Multiplexer, and Sum Generation stages. Annotated arrival times: setup (1); block carries (5) in every block; multiplexer outputs (6), (7), (8), (9); final sum (10)]

Page 17: Feb. 17, 2011

Square Root Carry Select

[Figure: 20-bit square-root carry-select adder built from blocks of increasing size (Bit 0-1, Bit 2-4, Bit 5-8, Bit 9-13, Bit 14-19) producing S0-1, S2-4, S5-8, S9-13, S14-19 from Ci,0. Each block has Setup, "0" Carry, "1" Carry, Multiplexer, and Sum Generation stages. Annotated arrival times: setup (1); block carries (3), (4), (5), (6), (7); multiplexer outputs (4), (5), (6), (7), (8); final sum (9)]
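
The arrival times annotated on the linear and square-root carry-select figures can be reproduced with a small unit-delay model. This is a sketch under assumptions: the function name `carry_select_delay` is hypothetical, and setup, each carry stage, each multiplexer, and the sum stage are each assumed to cost one unit of delay.

```python
def carry_select_delay(block_sizes, t_setup=1, t_carry=1, t_mux=1, t_sum=1):
    """Critical-path delay of a carry-select adder in unit delays.

    block_sizes: number of bits per block, least significant block first.
    Both precomputed carry chains of an m-bit block are ready at
    t_setup + m * t_carry; each multiplexer then waits for the later of
    its own carries and the previous block's multiplexer output.
    """
    mux_out = 0  # Ci,0 is available at time 0
    for m in block_sizes:
        carries_ready = t_setup + m * t_carry
        mux_out = max(carries_ready, mux_out) + t_mux
    return mux_out + t_sum


linear_16 = carry_select_delay([4, 4, 4, 4])        # 16-bit, equal blocks
sqrt_20 = carry_select_delay([2, 3, 4, 5, 6])       # 20-bit, growing blocks
print(linear_16, sqrt_20)
```

Under these assumptions the growing block sizes let each block's carries arrive just as the previous multiplexer output does, so the 20-bit square-root arrangement finishes no later than the 16-bit linear one.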

Page 18: Feb. 17, 2011

Adder Delays - Comparison

Page 19: Feb. 17, 2011

LookAhead - Basic Idea

$C_{o,k} = f(A_k, B_k, C_{o,k-1}) = G_k + P_k C_{o,k-1}$

Page 20: Feb. 17, 2011

Look-Ahead: Topology

Expanding Lookahead equations:

$C_{o,k} = G_k + P_k (G_{k-1} + P_{k-1} C_{o,k-2})$

All the way:

$C_{o,k} = G_k + P_k (G_{k-1} + P_{k-1} ( \cdots + P_1 (G_0 + P_0 C_{i,0})))$
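
The equivalence between the one-step recurrence $C_{o,k} = G_k + P_k C_{o,k-1}$ and its fully expanded form can be checked exhaustively for a small width. The sketch below is illustrative only; the function names are hypothetical, and G and P are treated as free 0/1 inputs so no particular definition of them is assumed.

```python
from itertools import product


def carries_recurrence(g, p, c_in):
    """Carry out of every bit using C_{o,k} = G_k + P_k C_{o,k-1}."""
    carries, c = [], c_in
    for gk, pk in zip(g, p):
        c = gk | (pk & c)
        carries.append(c)
    return carries


def carry_expanded(g, p, c_in, k):
    """C_{o,k} from the flattened sum-of-products lookahead expression."""
    # Term P_k ... P_0 Ci,0: the input carry propagates through every bit.
    result = c_in
    for j in range(k + 1):
        result &= p[j]
    # Terms P_k ... P_{j+1} G_j: a carry generated at bit j propagates upward.
    for j in range(k + 1):
        term = g[j]
        for i in range(j + 1, k + 1):
            term &= p[i]
        result |= term
    return result


def forms_agree(n_bits=4):
    """Compare both forms for every combination of G, P, and Ci,0."""
    for bits in product((0, 1), repeat=2 * n_bits + 1):
        g, p, c_in = list(bits[:n_bits]), list(bits[n_bits:2 * n_bits]), bits[-1]
        rec = carries_recurrence(g, p, c_in)
        if any(rec[k] != carry_expanded(g, p, c_in, k) for k in range(n_bits)):
            return False
    return True


print(forms_agree(4))
```

The nested loops in `carry_expanded` also hint at the cost of full expansion: the number of product terms and their fan-in both grow with k.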

Page 21: Feb. 17, 2011

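Carry Lookahead Trees

$C_{o,0} = G_0 + P_0 C_{i,0}$

$C_{o,1} = G_1 + P_1 G_0 + P_1 P_0 C_{i,0}$

$C_{o,2} = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_{i,0}$

$\phantom{C_{o,2}} = (G_2 + P_2 G_1) + (P_2 P_1)(G_0 + P_0 C_{i,0}) = G_{2:1} + P_{2:1} C_{o,0}$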

Can continue building the tree hierarchically.
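
The hierarchical step can be sketched as a single combining operation on (G, P) pairs, here in Python. The names `combine`, `carry_out_2_tree`, and `carry_out_2_flat` are hypothetical; the logic follows the group definitions $G_{2:1} = G_2 + P_2 G_1$ and $P_{2:1} = P_2 P_1$.

```python
from itertools import product


def combine(hi, lo):
    """Merge the (G, P) pair of a more significant group with a less significant one."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)


def carry_out_2_tree(g, p, c_in):
    """C_{o,2} computed hierarchically as G_{2:1} + P_{2:1} C_{o,0}."""
    c_o0 = g[0] | (p[0] & c_in)
    g21, p21 = combine((g[2], p[2]), (g[1], p[1]))
    return g21 | (p21 & c_o0)


def carry_out_2_flat(g, p, c_in):
    """C_{o,2} from the flat sum-of-products expression."""
    return (g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
            | (p[2] & p[1] & p[0] & c_in))


# Exhaustive comparison over all G, P, and carry-in values.
agree = all(
    carry_out_2_tree(bits[:3], bits[3:6], bits[6])
    == carry_out_2_flat(bits[:3], bits[3:6], bits[6])
    for bits in product((0, 1), repeat=7)
)
print(agree)
```

Because `combine` returns another (G, P) pair, the same operation can be applied again to merge larger groups, which is what building the tree hierarchically amounts to.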

Page 22: Feb. 17, 2011
Page 23: Feb. 17, 2011

Power Reduction Techniques

• Stop the clock
  – Dynamic power reduction
• Power gating
  – Reduce the leakage
• How fast can you turn something on/off?
  – Nothing to do: sleep
• How can you save power while in operation?
  – Near-threshold design

Page 24: Feb. 17, 2011

Power Gating

Page 25: Feb. 17, 2011
Page 26: Feb. 17, 2011
Page 27: Feb. 17, 2011

Kevin Nowka, IBM

Page 28: Feb. 17, 2011
Page 29: Feb. 17, 2011
Page 30: Feb. 17, 2011
Page 31: Feb. 17, 2011
Page 32: Feb. 17, 2011
Page 33: Feb. 17, 2011

Gate Leakage

Page 34: Feb. 17, 2011
Page 35: Feb. 17, 2011

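Digital Parallelization

Y[n] = X[n] + X[n-1]

[Figure: an analog signal is converted to a digital input (5 bits @ 5 GS/s, or 8 bits @ 100 MHz); registers clocked at Clk = 5 GHz hold X[n] and X[n-1], which are added to produce Y[n]. The figure is divided into ANALOG and DIGITAL regions]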

Page 36: Feb. 17, 2011

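DSP Parallelization

Y[n] = X[n] + X[n-1]

Y[n-1] = X[n-1] + X[n-2]

[Figure: two-way parallel version of the datapath. The input (5 bits @ 5 GS/s) is captured with clk and clkb at CLK = 5 GHz into X[n], X[n-1], and X[n-2]; two adders clocked at CLK = 2.5 GHz produce Y[n] and Y[n-1]]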
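
The two-way parallel form can be checked against the serial one with a short Python sketch. The function names are hypothetical, and the model assumes X[-1] = 0 and an even number of samples; it captures only the arithmetic, not the clocking circuitry.

```python
def serial_filter(x):
    """Y[n] = X[n] + X[n-1], one output per fast clock cycle (X[-1] taken as 0)."""
    return [x[n] + (x[n - 1] if n > 0 else 0) for n in range(len(x))]


def parallel_filter(x):
    """Two adders, each running at half the clock rate.

    Every slow cycle consumes two samples and produces Y[n-1] and Y[n].
    Assumes an even number of samples and X[-1] = 0.
    """
    y = []
    prev = 0  # X[n-2], held over from the previous slow cycle
    for n in range(1, len(x), 2):
        y_prev = x[n - 1] + prev   # Y[n-1] = X[n-1] + X[n-2]
        y_curr = x[n] + x[n - 1]   # Y[n]   = X[n]   + X[n-1]
        y.extend([y_prev, y_curr])
        prev = x[n]
    return y


samples = [3, 1, 4, 1, 5, 9, 2, 6]
print(serial_filter(samples) == parallel_filter(samples))
```

The loop in `parallel_filter` runs half as many iterations as there are samples, which is the software analogue of halving the clock while doubling the adders.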

Page 37: Feb. 17, 2011

DSP Parallelization

• Clock speed reduced by ½
  – Can parallelize further
  – Increase number of MACs (multiply/accumulates) by 2
• Intuition?
  – Area goes up by 2
  – Power decreases (clock rate down by 2, computations up by 2, but easier timing constraints)
  – What about clock power?
• Save a little power, but double the area?
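
One way to put numbers on this intuition is the usual switching-power estimate P = a C VDD² f. The Python sketch below is a rough model under loudly stated assumptions: the function names are hypothetical, switched capacitance is assumed to scale with the number of parallel ways, clock-tree and multiplexing overhead are ignored, and the 0.8 V supply is an arbitrary illustration of spending the relaxed timing on a lower voltage.

```python
def dynamic_power(c_eff, vdd, f_clk, activity=1.0):
    """Switching power estimate P = activity * C * VDD^2 * f."""
    return activity * c_eff * vdd ** 2 * f_clk


def parallel_power_ratio(vdd_serial, vdd_parallel, ways=2):
    """Power of a `ways`-parallel datapath relative to the serial one.

    Assumes capacitance scales by `ways` and the clock by 1/ways, and
    ignores clock distribution and multiplexing overhead (a simplification).
    """
    serial = dynamic_power(1.0, vdd_serial, 1.0)
    parallel = dynamic_power(float(ways), vdd_parallel, 1.0 / ways)
    return parallel / serial


same_supply = parallel_power_ratio(1.0, 1.0)     # no voltage change
lower_supply = parallel_power_ratio(1.0, 0.8)    # hypothetical reduced VDD
print(same_supply, lower_supply)
```

In this simplified model, halving the clock and doubling the hardware cancel exactly at a fixed supply, so any saving has to come from using the easier timing constraints to lower VDD, and the clock-power question on the slide is left out entirely.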

Page 38: Feb. 17, 2011

Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation

• http://www.eecs.umich.edu/~taustin/papers/MICRO36-Razor.pdf

Page 39: Feb. 17, 2011
Page 40: Feb. 17, 2011
Page 41: Feb. 17, 2011
Page 42: Feb. 17, 2011
Page 43: Feb. 17, 2011
Page 44: Feb. 17, 2011
Page 45: Feb. 17, 2011

Project Description

• Minimal: 4b Adder, Implemented with Razor
  – Simulations into near-threshold domain
• Grad. Student: requires more advanced design
  – Analog: Opamps built using inverters
  – Digital: Adiabatic Near-Threshold
  – Power Gating: add power gating to your design
• Undergrad: extra credit if you do any of the above

Page 46: Feb. 17, 2011

Problem 1: On-Chip Wires Consume Energy

• On-chip wire power does not scale
  – Dominated by interconnect capacitance ($C V_{DD}^2$)

ON-CHIP (Status Quo): 100 - 300 fJ/bit/mm

NOTE: Sub/Near-Threshold doesn’t help this problem!

OUR GOAL: < 5fJ/bit/mm

[DOE, Exascale Workshop]
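
The scale of these figures can be sanity-checked with the C·VDD² relation from the slide. The Python sketch below is illustrative only: the function names are hypothetical, and the 0.2 pF/mm wire capacitance and 1.0 V swing are assumed values chosen for the example, not numbers from the slides.

```python
def wire_energy_fj_per_bit_mm(c_pf_per_mm, v_swing):
    """Energy to switch one millimetre of wire, E = C * V^2, in femtojoules.

    1 pF * 1 V^2 = 1 pJ = 1000 fJ.
    """
    return c_pf_per_mm * v_swing ** 2 * 1000.0


def required_reduction(status_quo_fj, goal_fj=5.0):
    """Factor by which energy per bit per mm must drop to meet the goal."""
    return status_quo_fj / goal_fj


# Assumed example values: 0.2 pF/mm and a full 1.0 V swing.
example = wire_energy_fj_per_bit_mm(0.2, 1.0)
print(example, required_reduction(100.0), required_reduction(300.0))
```

With these assumed values the estimate falls inside the quoted 100 - 300 fJ/bit/mm range, and reaching the stated goal of under 5 fJ/bit/mm corresponds to a reduction of between 20x and 60x from the status quo.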

Page 47: Feb. 17, 2011

Data Center Design

• http://www.spectrum.ieee.org/feb09/7327

Page 48: Feb. 17, 2011
Page 49: Feb. 17, 2011
Page 50: Feb. 17, 2011
