
Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability Yan...

Date post: 20-Dec-2015
Page 1: Power Modeling and Architecture Evaluation for FPGA with Novel Circuits for Vdd Programmability

Yan Lin, Fei Li and Lei He
EE Department, UCLA
LHE@ee.ucla.edu

Partially supported by NSF.

Page 2: Overview

FPGA architecture evaluation
  Area and delay [Rose et al, JSSC’90]
  Power [Poon et al, FPLA’02][Li et al, FPGA’03]

Vdd programmability for power reduction
  Concept in [FPGA’03]
  Application to logic [FPGA’04][DAC’04]
  Application to interconnects [ICCAD’04][Anderson et al, ICCAD’04]

Novel circuits and architecture evaluation for FPGAs with Vdd programmability
  Reduce power by 50% with 17% area and 3% delay increase

Page 3: Outline

Power modeling and architecture evaluation methodology

FPGA circuits for Vdd programmability

Architecture evaluation with Vdd programmability

Conclusions and ongoing work

Page 4: Framework fpgaEva-LP

[Flow diagram. Inputs: benchmark circuits, architecture specification. Stages: Logic Optimization (SIS), Tech-Mapping (RASP), Timing-Driven Packing (T-VPack), Placement & Routing (VPR), Parasitic Extraction, Cycle-accurate Power Simulator. Outputs: area, delay, power.]

Page 5: FPGA Structure and Models

Cluster-based island style FPGA structure
  100% buffered interconnects, subset switch block
  Input fc = 50%, output fc = 25%

Area and delay models similar to [Betz-Rose-Marquardt]
  But based on layout and SPICE for 100nm and below

Mixed-level power model from [FPGA’03]
  Dynamic power
    Capacitive power: functional switch, glitch
    Short-circuit power (transition time)
  Static power
    Sub-threshold leakage
    Reverse biased leakage
    Gate leakage

Page 6: New Power Model in fpgaEva-LP2

Short-circuit power
  switching time * switching power

fpgaEva-LP used the average signal transition time

fpgaEva-LP2 calculates the transition time for each buffer as $t_r = \alpha \cdot t_{buffer}$, where $t_{buffer}$ is the buffer delay
  $\alpha$ is NOT a constant 2 as in the literature; because it depends on input slew, it is pre-characterized by SPICE

  buffer delay   < 0.012 ns   < 0.03 ns   > 0.03 ns
  α              2            4.4         7
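The per-buffer transition-time rule on this slide can be sketched in Python as follows. This is a minimal illustration rather than fpgaEva-LP2 code: the function names are hypothetical, and treating a buffer delay of exactly 0.03 ns as belonging to the last bin is an assumption, since the slide only lists `< 0.03 ns` and `> 0.03 ns`.

```python
def alpha_for_buffer_delay(t_buffer_ns: float) -> float:
    """Return the pre-characterized alpha for a given buffer delay in ns."""
    if t_buffer_ns < 0.012:
        return 2.0
    if t_buffer_ns < 0.03:
        return 4.4
    # Assumption: a delay of exactly 0.03 ns is placed in the last bin.
    return 7.0


def transition_time_ns(t_buffer_ns: float) -> float:
    """Signal transition time t_r = alpha * t_buffer, in ns."""
    return alpha_for_buffer_delay(t_buffer_ns) * t_buffer_ns
```

A simulator could call `transition_time_ns` once per buffer instead of using a single average transition time for the whole design.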

Page 7: Validation Using SPICE

Validate by comparison for each power component
High fidelity with average absolute error of 8%

[Figure: FPGA power (watt) for benchmark circuits b1, parity, cm138a, z4ml and decode; series: SPICE simulation, fpgaEva-LP, fpgaEva-LP2]
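The slide does not spell out how the average absolute error is computed. One plausible reading, shown here as an assumption, is the mean of the per-circuit relative errors against the SPICE reference. The function name is hypothetical.

```python
def average_absolute_error(reference, estimate):
    """Mean of |estimate - reference| / reference over paired samples.

    Assumption: 'reference' holds the SPICE power values and 'estimate'
    holds the model's power values for the same circuits, in the same order.
    """
    if len(reference) != len(estimate) or not reference:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(e - r) / r for r, e in zip(reference, estimate))
    return total / len(reference)
```

Applied to the per-circuit power values behind the chart, a result of 0.08 would correspond to the 8% figure quoted on the slide.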

Page 8: Impact of Random Seeds in VPR

[Figure: FPGA energy (nJ/cycle) versus critical path delay (ns) for circuit s38584 over 10 VPR runs with different random seeds; annotated spreads of +12% in delay and +5% in energy]

12% delay variation and 5% energy variation
Min-delay solution among 10 runs is used
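Selecting the min-delay solution among several seeded runs can be sketched as below. The `RunResult` record and both function names are hypothetical, and the spread is computed as (max - min) / min, which is an assumption about how the slide's variation percentages are defined.

```python
from typing import NamedTuple, Sequence


class RunResult(NamedTuple):
    seed: int
    delay_ns: float
    energy_nj: float


def pick_min_delay(runs: Sequence[RunResult]) -> RunResult:
    """Return the run with the smallest critical path delay."""
    return min(runs, key=lambda r: r.delay_ns)


def relative_spread(values: Sequence[float]) -> float:
    """Spread of a metric across runs, as (max - min) / min."""
    lo, hi = min(values), max(values)
    return (hi - lo) / lo
```

With ten `RunResult` entries for one circuit, `pick_min_delay` gives the solution that is carried forward, and `relative_spread` over the delays or energies gives the seed-to-seed variation.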

Page 9: Evaluation of Single-Vdd FPGAs

Architectures explored
  Cluster size N = {6, 8, 10, 12}
  LUT size k = {3, 4, 5, 6, 7}

Energy-delay (ED) dominant architectures
  Architecture with smaller delay or less energy (compared to any other architecture)
  Relaxed ED dominant set may also be valuable

[Figure: total FPGA energy (nJ/cycle) versus critical path delay (ns); one point per architecture (N, k) for N in {6, 8, 10, 12} and k in {3, 4, 5, 6, 7}]
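The ED-dominant set described on this slide is the set of architectures that no other architecture beats in both delay and energy. A minimal sketch of that filter follows; the function name and the input layout (a dict from (N, k) to a (delay, energy) pair) are assumptions made for illustration.

```python
def ed_dominant(points):
    """Return the architectures not dominated in both delay and energy.

    points maps an architecture (N, k) to a (delay_ns, energy_nj) tuple.
    """
    def dominates(a, b):
        # a dominates b if it is no worse in both metrics and not identical.
        return a[0] <= b[0] and a[1] <= b[1] and a != b

    return [
        arch
        for arch, p in points.items()
        if not any(dominates(q, p) for q in points.values())
    ]
```

A relaxed ED-dominant set could be obtained by tolerating a small margin in `dominates`, though the slide does not say how the relaxation is defined.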

Page 10: Energy versus Delay

Current commercial architecture
For 100nm ITRS technology
  Min-energy arch (N,k) = (10,4) or (8,4)
  Min-delay arch (N,k) = (8,7)
    0.8x delay but 1.7x power

[Figure: total FPGA energy (nJ/cycle) versus critical path delay (ns); one point per architecture (N, k) for N in {6, 8, 10, 12} and k in {3, 4, 5, 6, 7}]

Page 11: Outline

Power modeling and evaluation methodology

FPGA circuits for Vdd programmability

Architecture evaluation with Vdd programmability

Conclusions and ongoing work

Page 12: Vdd-programmable FPGA [DAC’04][ICCAD’04]

Vdd-programmable logic block
  Vdd selection
  Power-gating unused blocks

Page 13: Vdd-programmable FPGA [FPGA’04][ICCAD’04]

Vdd-programmable logic block
  Vdd selection
  Power-gating unused blocks

Vdd-programmable switch

Vdd-level conversion is needed when VddL drives VddH
  To avoid excessive leakage

Page 14: Vdd-programmable Routing Switch

Conventional routing switch

Vdd-programmable routing switch
  Brute-force design [ICCAD’04]
    Two extra SRAM cells for each routing switch
  New design
    One extra SRAM cell
    NAND2 gate: minimum size and high-Vt transistors

Page 15: Vdd-Programmable Interconnect Connection Block

New design
  Only TWO extra SRAM cells for n connection switches
  Control logic includes 2n NAND2 gates and a decoder

Brute-force design [ICCAD’04]
  2n extra SRAM cells for n connection switches
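The SRAM and control-logic counts quoted for the connection block can be compared with a small helper. This is only a bookkeeping sketch: the function names and the `design` strings are hypothetical, and the counts are taken directly from the slide.

```python
def extra_sram_cells(n_switches: int, design: str) -> int:
    """Extra SRAM cells needed for a connection block with n switches."""
    if n_switches < 0:
        raise ValueError("n_switches must be non-negative")
    if design == "brute_force":
        return 2 * n_switches  # 2n extra cells
    if design == "new":
        return 2  # two extra cells regardless of n
    raise ValueError(f"unknown design: {design}")


def control_nand2_gates(n_switches: int) -> int:
    """NAND2 gates in the new design's control logic (a decoder is also needed)."""
    return 2 * n_switches
```

For an illustrative block with 32 connection switches, the brute-force design would need 64 extra SRAM cells, while the new design needs 2 extra cells plus 64 NAND2 gates and a decoder.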

Page 16: Power and Delay

Vdd-programmable switch uses
  4X PMOS power transistor for 7X routing switch
  1X PMOS power transistor for 4X connection switch

Compared to conventional switch
  1000X less leakage power
  Connection box is 28% faster and has 18% less dynamic power
    By moving the mux off the critical path of the connection box

(Vdd = 1.3 V)
Type         Switch delay (s)                               Energy per switch (Joule)
             w/o power transistor   w/ power transistor     w/o power transistor   w/ power transistor
Routing      5.9E-11                6.5E-11 (+11%)          3.3E-14                3.2E-14 (-2%)
Connection   2.9E-10                2.1E-10 (-28%)          3.8E-14                3.1E-14 (-18%)

Page 17: Vdd-gateable Routing Switch

Vdd-gateable switch has two states
  Normal Vdd or power-gating

Enables power-gating capability w/o extra SRAM cells

Can be replaced by tri-state buffer

[Figure: conventional routing switch and Vdd-gateable routing switch with power transistor]

Page 18: Vdd-gateable Connection Block

Enables power-gating capability w/ only one extra SRAM cell and a low-leakage decoder

[Figure: conventional and Vdd-gateable connection blocks]

Page 19: Outline

Power modeling and evaluation methodology

FPGA circuits for Vdd programmability

Architecture evaluation with Vdd programmability

Conclusions and ongoing work

Page 20: FPGA Architecture Classes

Architecture Class   Logic Block             Interconnect
Class0 (baseline)    single-Vdd              single-Vdd
Class1               programmable dual-Vdd   programmable dual-Vdd, level converters in routing
Class2               programmable dual-Vdd   VddH and Vdd-gateable
Class3               programmable dual-Vdd   Class1, but no level converters in routing

High-Vt is applied to configuration SRAM cells for all the classes

Page 21: Vdd-level Converters

Class3 removes Vdd-level converters from interconnects in Class1
  With the constraint that no VddL drives VddH

We developed a routing scheme in which each routing tree has a single Vdd level
  But trees with different Vdd levels can share the same wire track

Alternative approaches:
  Combined Vdd-level converter and buffer [Anderson et al, ICCAD’04]
  Our new work [DAC’05] allows dual Vdd in a tree with chip-level time slack budgeting for extra power reduction
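The two Class3 routing constraints on this slide (no VddL element drives a VddH element, and each routing tree uses a single Vdd level) can be expressed as simple checks. The data layout below (directed driver-to-sink edges and a dict mapping each node to "H" or "L") and the function names are assumptions for illustration, not the authors' router.

```python
def low_drives_high(edges, vdd):
    """Return the (driver, sink) edges where a VddL node drives a VddH node.

    edges: iterable of (driver, sink) pairs.
    vdd:   dict mapping each node to "H" (VddH) or "L" (VddL).
    """
    return [(u, v) for u, v in edges if vdd[u] == "L" and vdd[v] == "H"]


def tree_has_single_vdd(nodes, vdd):
    """True if every node of one routing tree is assigned the same Vdd level."""
    return len({vdd[n] for n in nodes}) <= 1
```

Any edge returned by `low_drives_high` would require a level converter, so a Class3 routing is expected to return an empty list for every tree.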

Page 22: Energy versus Delay

ED-product reduction
  20% by Class1 (Vdd-programmable interconnects w/ level converters)
  45% by Class2 (Vdd-gateable interconnects)
  50% by Class3 (Class1 minus level converters)

Performance degrades 3% due to Vdd programmability

[Figure: total FPGA energy per cycle (nJ) versus critical path delay (ns); ED-dominant architectures (N, k) plotted for Class 0, Class 1, Class 2 and Class 3; annotations: LUT 4 low energy, LUT 7 high performance]
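An energy-delay (ED) product reduction relative to a baseline can be computed as shown below. The function names are hypothetical, and the slide does not state which architectures are paired when the per-class reductions are derived, so this only illustrates the arithmetic.

```python
def ed_product(energy_nj: float, delay_ns: float) -> float:
    """Energy-delay product of one design point."""
    return energy_nj * delay_ns


def ed_reduction(base_energy_nj, base_delay_ns, energy_nj, delay_ns):
    """Fractional ED-product reduction of a design relative to a baseline."""
    base = ed_product(base_energy_nj, base_delay_ns)
    return 1.0 - ed_product(energy_nj, delay_ns) / base
```

A return value of 0.45 would correspond to the 45% reduction quoted for Class2; a negative value would mean the ED product got worse.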

Page 23: Energy versus Area

[Figure: total FPGA energy per cycle (nJ) versus total FPGA device area; architectures (N, k) plotted for Class0, Class1, Class2 and Class3; annotations: min-area, min-energy]

Average area overhead
  118% for Class1 (Vdd-programmable interconnects w/ level converters)
  17% for Class2 (Vdd-gateable interconnects)
  52% for Class3 (Vdd-programmable interconnects w/o level converters)

Class2 is the best considering both energy and area
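One way to see why Class2 comes out ahead is to combine the ED-product reductions from Page 22 with the average area overheads on this slide into a single relative energy-delay-area figure. The metric itself, (1 - ED reduction) * (1 + area overhead), is an assumption chosen for illustration; the slides do not say how energy and area were weighed. The names below are hypothetical.

```python
# Values taken from the slides: ED-product reduction and average area overhead.
CLASSES = {
    "Class1": {"ed_reduction": 0.20, "area_overhead": 1.18},
    "Class2": {"ed_reduction": 0.45, "area_overhead": 0.17},
    "Class3": {"ed_reduction": 0.50, "area_overhead": 0.52},
}


def relative_eda(ed_reduction: float, area_overhead: float) -> float:
    """Energy-delay-area product relative to the Class0 baseline (1.0)."""
    return (1.0 - ed_reduction) * (1.0 + area_overhead)


def best_class(classes) -> str:
    """Name of the class with the lowest relative energy-delay-area product."""
    return min(classes, key=lambda name: relative_eda(**classes[name]))
```

Under this assumed metric, Class1 ends up above the baseline because its 118% area overhead outweighs its 20% ED reduction, and Class2 has the lowest value of the three.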

Page 24: Energy Breakdown

Class2 and Class3 dramatically reduce global interconnect leakage

But Class1 fails due to leakage in Vdd-level converters

[Figure: stacked bars of total FPGA energy (nJ/cycle) for Class0, Class1, Class2 and Class3; FPGA architecture (N,k) = (12,4)]

Share of total energy          Class0    Class1    Class2    Class3
Logic leakage                  2.94%     2.70%     4.07%     4.40%
Logic dynamic                  3.71%     3.04%     3.92%     4.32%
Local interconnect leakage     16.03%    26.22%    39.69%    42.93%
Local interconnect dynamic     8.09%     7.43%     9.81%     10.81%
Global interconnect leakage    49.89%    42.84%    4.88%     5.85%
Global interconnect dynamic    19.33%    17.77%    37.62%    31.70%

Page 25: Area Overhead

Class2: Vdd-gateable interconnects + Vdd-programmable CLBs, architecture (12, 4)

[Figure: FPGA area overhead (axis 0% to 20%) as a stacked bar; components: Power Transistors & SRAMs (CLBs), Vdd-level Converters (CLBs), Control (Connection Blocks), Power Transistors (Connection Blocks), SRAMs (Connection Blocks), Power Transistors (Routing Switches); segment values 3.87%, 0.60%, 4.96%, 4.82%, 1.80%, 1.39%]

  Routing Switches     3.87%
  Connection Blocks    10.38%
  Logic Blocks         3.19%

17% = 9% for power transistors + 5% for control + 2% for SRAM
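The 17% total can be cross-checked against the three per-block overheads shown on this slide. The dictionary and function names below are hypothetical; the percentages are the ones printed on the chart.

```python
# Per-block area overheads for Class2, architecture (12, 4), in percent.
GROUP_OVERHEAD_PERCENT = {
    "Routing Switches": 3.87,
    "Connection Blocks": 10.38,
    "Logic Blocks": 3.19,
}


def total_overhead_percent(groups) -> float:
    """Sum the per-block area overheads, in percent."""
    return sum(groups.values())
```

The three entries add up to 17.44%, which rounds to the 17% quoted in the summary line.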

Page 26: Conclusions and New Results

Field programmability is needed for fine-grained dual-Vdd and Vdd-gating in FPGA
  Vdd-gating offers a better area-power tradeoff than Vdd-selection
  45% energy-delay product reduction with 17% area overhead

Architecture with Vdd-programmability
  LUT size 4: low energy and area
  LUT size 7: best performance

New results [DAC’05]
  Time slack allocation for Vdd-programmable interconnects
  Device and architecture co-optimization for 77% energy-delay reduction

Page 27: References and Download

All references and tools at http://eda.ee.ucla.edu

Results in the slides have been updated compared to the paper in ISFPGA’05

