Date post: 19-Dec-2015

Routing Track Duplication with Fine-Grained Power-Gating for FPGA Interconnect Power Reduction

Yan Lin, Fei Li and Lei He
EE Department, UCLA

Partially supported by NSF grant CCR-0306682. Address comments to [email protected].

Outline

Review and Motivation

Interconnect Leakage Power Reduction using Power-gating

Interconnect Dynamic Power Reduction using Dual-Vdd

Conclusions and Ongoing Work

Power Limitation of FPGAs

Existing FPGAs are HIGHLY power inefficient (> 100X more than ASIC)
  E.g. [Kusse, ISLPED’98]
Power is likely the largest limitation for FPGAs

Design Example       Vdd     Energy
Xilinx XC4003A       5v      4.2mW/MHz
Static CMOS ASIC     3.3v    5.5uW/MHz
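The ">100X" claim can be checked against the two table entries. The snippet below is only a sanity-check sketch; the variable names are illustrative, and the two designs run at different supply voltages, so the ratio is a rough comparison rather than a like-for-like measurement.

```python
# Energy per MHz from the table above, converted to watts per MHz.
fpga_w_per_mhz = 4.2e-3  # Xilinx XC4003A at 5 V: 4.2 mW/MHz
asic_w_per_mhz = 5.5e-6  # static CMOS ASIC at 3.3 V: 5.5 uW/MHz

# Ratio of FPGA energy to ASIC energy for the same clock rate.
ratio = fpga_w_per_mhz / asic_w_per_mhz
print(f"FPGA/ASIC energy ratio: about {ratio:.0f}x")
```

The ratio comes out in the high hundreds, consistent with the "more than 100X" statement.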

FPGA Power Reduction

Power aware FPGA CAD algorithms for existing FPGA architectures
  CAD algorithms to minimize power-delay product [Lamoureux et al, ICCAD’03]
  Configuration inversion for leakage reduction [Anderson et al, FPGA’04]
Power efficient FPGA circuits and architectures
  Dual-Vdd and Vdd-programmable FPGA logic blocks [Li et al, FPGA’04] [Li et al, DAC’04]
  Vdd-programmable FPGA interconnects [Li et al, ICCAD’04] [Anderson et al, ICCAD’04]

Overall FPGA Structure

Cluster-based Island Style FPGA Structure
  Logic blocks are embedded into routing resources
  Wire segment connectivity is programmable

FPGA Routing Structure

Subset programmable switch block
  An incoming track can be connected to outgoing tracks on the other sides of the switch block, but only to those with the same track number
Programmable connection block
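The subset switch-block rule (an incoming track reaches only outgoing tracks with the same track number) can be written as a small connectivity function. This is a minimal sketch: the four-sided block model, the side names, and the function name `subset_switch_targets` are assumptions made for illustration, not taken from the slides.

```python
from typing import List, Tuple

# Assumed model: a switch block with four sides, each carrying W tracks.
SIDES = ("left", "top", "right", "bottom")

def subset_switch_targets(side: str, track: int) -> List[Tuple[str, int]]:
    """Return the (side, track) pairs reachable from an incoming track.

    In a subset switch block the track number is preserved, so the only
    freedom is which of the other three sides the signal leaves on.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if track < 0:
        raise ValueError("track number must be non-negative")
    return [(other, track) for other in SIDES if other != side]

print(subset_switch_targets("left", 3))
```

Because the track number never changes along a route, a net that starts on track 3 stays on track 3 through every switch block, which is the property the dual-Vdd architecture later relies on.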

Vdd-programmable Interconnects [Li et al, ICCAD’04]

Vdd-programmable switch
  Vdd selection for used switch
  Power-gating unused switch
Configurable Vdd-level conversion
  Avoid excessive leakage when low Vdd switch drives high Vdd switches

Figure labels: conventional routing switch, Vdd-programmable switch, power transistor.

Limitation of Vdd-programmable Interconnects [Li et al, ICCAD’04]

Fine-grained Vdd-level converter insertion
  Area overhead
    54% area overhead for circuit s38584
  Leakage overhead
    36% leakage overhead for circuit s38584
SRAM cell overhead
  300% SRAM cell overhead for each switch

Area/SRAM efficient low-power interconnects are needed

Outline

Review and Motivation

Interconnect Leakage Power Reduction using Power-gating

Interconnect Dynamic Power Reduction using Dual-Vdd

Conclusions and Ongoing Work

Low Utilization Rate of Interconnects

Circuit     # of total interconnect switches   # of unused interconnect switches   Utilization rate (%)
alu4        36478                              31224                               14.40%
apex4       43741                              37703                               13.80%
bigkey      63259                              54017                               9.87%
clma        653181                             593343                              9.16%
des         87877                              79932                               9.04%
diffeq      42746                              36974                               13.50%
dsip        75547                              70138                               7.16%
elliptic    140296                             125800                              10.33%
ex5p        45404                              39288                               13.47%
frisc       238852                             216993                              9.15%
Average                                                                            11.90%

78.15% of total power is consumed by global interconnect power [Li et al, DAC’04]
47% of global interconnect power is leakage
Why?
  Extremely low utilization rate (~12% w/ minimum array)
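The utilization rate in the table is the share of interconnect switches that are actually used. A minimal sketch of that arithmetic, using the alu4 row as input (the function name is illustrative):

```python
def utilization_rate(total_switches: int, unused_switches: int) -> float:
    """Percentage of interconnect switches that are used by the design."""
    if total_switches <= 0:
        raise ValueError("total_switches must be positive")
    if not 0 <= unused_switches <= total_switches:
        raise ValueError("unused_switches must lie between 0 and total_switches")
    used = total_switches - unused_switches
    return 100.0 * used / total_switches

# alu4 row from the table: 36478 total switches, 31224 unused.
print(f"alu4 utilization: {utilization_rate(36478, 31224):.2f}%")
```

Every switch counted as unused still leaks in a conventional architecture, which is why a utilization near 12% translates into a large leakage share.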

Interconnect Utilization Rate is Intrinsically Low

Programmable switch block
  no more than 25%
Programmable connection block
  Only one is used (for 64 tracks)

Power-gating unused interconnects is necessary

Vdd-gateable Routing Switch

Vdd-gateable routing switch
  Only two states for a routing switch
    High Vdd
    Power-gating
  Enable power-gating capability w/o extra SRAM cells

Figure labels: conventional routing switch, power transistor.

Vdd-Gateable Connection Block

Enable power-gating capability w/ only one extra SRAM cell for a connection block
  Only n+1 SRAM cells for 2^n connection switches
  A low leakage decoder is needed

Figure labels: conventional connection block, Vdd-gateable connection block.
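One way to picture the n+1 SRAM-cell scheme is as a decoder with an enable input, reading the slide's switch count as 2^n. The sketch below is a behavioral model only, under the assumption that n SRAM cells select one switch and the extra cell gates the whole block; the function name and bit ordering (most significant bit first) are illustrative, and the slides do not describe the decoder circuit itself.

```python
from typing import List

def connection_block_power_enables(select_bits: List[int], block_enable: int) -> List[bool]:
    """Behavioral model of a Vdd-gateable connection block.

    select_bits  : n SRAM cells (MSB first) choosing one of 2**n switches
    block_enable : the one extra SRAM cell; 0 power-gates every switch
    Returns one flag per switch: True means the switch is powered.
    """
    index = 0
    for bit in select_bits:
        if bit not in (0, 1):
            raise ValueError("SRAM bits must be 0 or 1")
        index = (index << 1) | bit
    n_switches = 2 ** len(select_bits)
    # At most one switch is powered; all others are power-gated.
    return [bool(block_enable) and i == index for i in range(n_switches)]

enables = connection_block_power_enables([1, 0, 1], block_enable=1)
print(enables.index(True), sum(enables))
```

With n = 6 this model covers the 64-track case mentioned earlier using 7 SRAM cells, and clearing the enable cell gates an entirely unused connection block.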

Power and Delay of Vdd-gateable Switch

Vdd-gateable switch compared to conventional switch
  Dynamic power is almost the same
  >300X leakage power reduction
  ~6% delay increase

Vdd     Routing switch delay (s)                 Energy per switch (Joule)
        w/o power-gating    w/ power-gating      w/o power-gating    w/ power-gating
1.3v    5.90E-11            6.26E-11 (6%)        3.3E-14             3.25E-14
1.0v    6.99E-11            7.42E-11 (6.1%)      1.63E-14            1.65E-14

Power Reduction by Power-gating Unused Interconnects

Circuit    Single-Vdd (baseline)                        Total power saving
           Interconnect power (W)    Total power (W)    Vdd-programmable interconnects [Li et al, ICCAD’04]    Vdd-gateable interconnects
alu4       0.0657                    0.0769             25.13%                                                 29.09%
apex4      0.0437                    0.0500             21.83%                                                 30.70%
bigkey     0.1044                    0.1375             33.38%                                                 24.89%
clma       0.4918                    0.5450             23.42%                                                 45.69%
des        0.1688                    0.2136             36.71%                                                 31.79%
diffeq     0.0292                    0.0360             17.50%                                                 45.20%
dsip       0.1003                    0.1280             34.34%                                                 43.66%
Avg.       --                        --                 25.19%                                                 38.18%

Outline

Review and motivation

Interconnect Leakage Power Reduction using Power-gating

Interconnect Dynamic Power Reduction using Dual-Vdd
  FPGA fabrics and algorithms
  Design flow and quantitative evaluation

Conclusions and Ongoing Work

Pre-Defined Dual-Vdd Routing Architecture

Partition routing channel into VddH and VddL regions
  Vdd-gateable interconnect switch is used
  Ratio of VddH/VddL track is an architectural parameter

Reduce dynamic power with dual-Vdd by making use of timing slack
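The channel partition can be sketched as a function from track number to supply level. This is an illustration under stated assumptions: the slides say only that the channel is split into VddH and VddL regions with a configurable ratio, so the contiguous layout (low track numbers at VddH), the default fraction of 0.5, and the name `assign_track_vdd` are all hypothetical.

```python
from typing import List

def assign_track_vdd(channel_width: int, vddh_fraction: float = 0.5) -> List[str]:
    """Return the pre-defined supply level of each track in a routing channel.

    vddh_fraction is the architectural parameter: the share of tracks
    wired to VddH. The remaining tracks are wired to VddL.
    """
    if channel_width <= 0:
        raise ValueError("channel_width must be positive")
    if not 0.0 <= vddh_fraction <= 1.0:
        raise ValueError("vddh_fraction must lie in [0, 1]")
    n_high = round(channel_width * vddh_fraction)
    # Assumed layout: one contiguous VddH region followed by one VddL region.
    return ["VddH"] * n_high + ["VddL"] * (channel_width - n_high)

print(assign_track_vdd(8))
```

A router can then look up the supply level of any wire segment from its track number alone.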

Ratio of VddH to VddL Track

Determine ratio using dual-Vdd assignment profile without considering layout constraint
Sensitivity-based dual-Vdd assignment
  Assignment unit: a routing tree
  Power sensitivity: ΔP/ΔVdd
    Power difference for a routing tree between VddH and VddL
  Greedy algorithm, sensitivity based
    Initial: uniform VddH assignment
    Procedure: assign VddL to routing tree with largest power sensitivity (but without increasing critical delay)
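The greedy procedure above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the timing check is delegated to a caller-supplied `critical_delay` function, the `RoutingTree` fields and the toy timing model are hypothetical, and trees are ranked by the power difference ΔP alone, which gives the same order as ΔP/ΔVdd when every tree sees the same VddH and VddL.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class RoutingTree:
    name: str
    power_high: float  # power when the tree runs at VddH
    power_low: float   # power when the tree runs at VddL

    @property
    def sensitivity(self) -> float:
        # Power saved by moving this tree from VddH to VddL.
        return self.power_high - self.power_low

def greedy_dual_vdd(trees: List[RoutingTree],
                    critical_delay: Callable[[Dict[str, str]], float]) -> Dict[str, str]:
    """Start with every tree at VddH, then try VddL in order of decreasing sensitivity."""
    assignment = {t.name: "VddH" for t in trees}
    baseline = critical_delay(assignment)
    for tree in sorted(trees, key=lambda t: t.sensitivity, reverse=True):
        assignment[tree.name] = "VddL"
        if critical_delay(assignment) > baseline:
            assignment[tree.name] = "VddH"  # revert: critical delay would increase
    return assignment

# Toy timing model (hypothetical): a path's delay is the sum of its trees' delays.
DELAY = {"a": {"VddH": 1.0, "VddL": 1.5},
         "b": {"VddH": 2.0, "VddL": 3.0},
         "c": {"VddH": 0.5, "VddL": 0.8}}
PATHS = [["a", "b"], ["c"]]

def toy_critical_delay(assignment: Dict[str, str]) -> float:
    return max(sum(DELAY[t][assignment[t]] for t in path) for path in PATHS)

trees = [RoutingTree("a", 3.0, 1.5), RoutingTree("b", 5.0, 2.0), RoutingTree("c", 1.0, 0.6)]
result = greedy_dual_vdd(trees, toy_critical_delay)
print(result)
```

In the toy example, trees a and b sit on the critical path and stay at VddH, while c has enough slack to move to VddL.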

Profile of Dual-Vdd Assignment

Assignment with no critical path delay increase (VddH:VddL = 1.5v:1.0v)

Circuits   # of routing trees   # of logic blocks   # of I/O blocks   VddL routing trees (%)   VddL logic blocks (%)
alu4       782                  162                 22                49.74                    82.10
apex4      849                  134                 28                35.45                    78.36
bigkey     1542                 294                 426               67.77                    85.03
clma       7995                 1358                144               69.74                    89.84
s38417     5426                 982                 135               64.17                    80.05
seq        1138                 274                 76                20.74                    61.62
spla       2091                 461                 122               54.52                    88.47
Avg.                                                                  54.54                    80.28

Set the ratio of VddH/VddL track to 1:1

Level Converter is NOT Needed

Wire segment can only be connected to another wire segment with the same track number via a subset switch block

Figure: wire segments A and B.


No level converter is needed in switch block

Layout Constraint Due to Dual-Vdd

Dual-Vdd introduces performance degradation due to layout constraint
  Insufficient routing resources for Vdd-matched routing trees
  May introduce detours
Solutions
  Vdd-programmable interconnects [Li et al, ICCAD’04]
  Provide sufficient routing tracks for Vdd-matched routing trees
    Control leakage by power-gating unused interconnects

Design Flow for Dual-Vdd Interconnects

Figure: design flow diagram. Blocks in the diagram:
  Delay/Power Model (dual-Vdd)
  Arch Spec
  Tech Mapped Netlist (Single-Vdd)
  Timing Driven Layout (Single-Vdd)
  Delay/Power Estimation (outputs: Delay, Power)
  Dual-Vdd Assignment for Routing Trees
  Double Channel Width
  Timing Driven Layout (Dual-Vdd)
  Power-gating Unused Switches

Dual-Vdd Routing Algorithm

Based on the maze routing algorithm in VPR
Modify the cost function

$$\mathrm{TotalCost}(n) = \mathrm{PathCostDv}(n) + \mathrm{ExpectedCostDv}(n, j) + \mathrm{Matched}(T, n)$$

TotalCost(n): the cost of routing tree T through wire segment n to the target sink j

PathCostDv(n): the cost of the path from the current partial routing tree to wire segment n

ExpectedCostDv(n,j): the estimated cost from wire segment n to the target sink j

Matched(T,n): boolean function describing Vdd-matching status
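A router could evaluate a cost of this shape as sketched below. The slide names the terms but the transcript does not preserve their weights or how the Matched term enters, so everything beyond the three term names is an assumption: the weight `alpha`, the constant `mismatch_penalty`, the choice to penalize a mismatch rather than reward a match, and the helper `matched` are hypothetical.

```python
def matched(tree_vdd: str, segment_vdd: str) -> bool:
    """Vdd-matching status: True when the tree's assigned Vdd equals the segment's Vdd."""
    return tree_vdd == segment_vdd

def total_cost(path_cost_dv: float,
               expected_cost_dv: float,
               is_matched: bool,
               alpha: float = 1.0,             # assumed weight on the expected-cost term
               mismatch_penalty: float = 1.0   # assumed cost added for a Vdd mismatch
               ) -> float:
    """Cost of extending routing tree T through wire segment n toward sink j."""
    penalty = 0.0 if is_matched else mismatch_penalty
    return path_cost_dv + alpha * expected_cost_dv + penalty

# A VddL tree considering a VddL segment versus a VddH segment with equal path costs.
cost_same = total_cost(2.0, 3.0, matched("VddL", "VddL"))
cost_diff = total_cost(2.0, 3.0, matched("VddL", "VddH"))
print(cost_same, cost_diff)
```

Under these assumptions the maze expansion prefers Vdd-matched segments whenever path and expected costs are otherwise equal.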

Outline

Review and motivation

Interconnect Leakage Power Reduction using Power-gating

Interconnect Dynamic Power Reduction using Dual-Vdd
  FPGA fabrics and algorithms
  Quantitative evaluation

Conclusions and Ongoing Work

Comparison of Low Power Architectures

Figure: power (watt, 0.07 to 0.27) vs. clock frequency (MHz, 60 to 130) for circuit s38584. Curves and labeled Vdd points:
  arch-SV: 1.5v, 1.3v, 1.0v, 0.9v
  arch-PV: 1.5v/0.8v, 1.3v/1.0v, 1.0v/0.8v, 0.9v/0.8v
  arch-PV+PG: 1.5v/0.8v, 1.3v/1.0v, 1.0v/0.8v, 0.9v/0.8v
  arch-DV+PG (1.5W): 1.5v/0.8v, 1.3v/0.9v, 1.0v/0.8v, 0.9v/0.8v

Dual-Vdd interconnects with fine-grained power gating
  May have performance degradation due to layout constraint
  Can reduce more power than purely power-gating unused switches
  Achieve 9.78% interconnect dynamic power reduction, 38.68% total power saving with 1.5W channel width
    W is the nominal routing channel width in single-Vdd FPGA

Impact of Routing Channel Width

Figure: power saving (left axis, 30% to 50%) and normalized clock frequency (right axis, 0.5 to 1) vs. channel width (1.0 to 2.0). Labeled points: power saving 34.86%, 38.68%, 45.00%; normalized clock frequency 0.743, 0.838, 0.955.

We get the power reduction percentage at the maximum clock frequency achieved by dual-Vdd interconnects

Channel width increases from 1.0W to 2.0W
  Power saving increases from 34.86% to 45%
  Normalized clock frequency increases from 0.743 to 0.955

Area Overhead of Vdd-gateable Interconnects

Device area is dominant

Architecture                        Total FPGA area   Area overhead (%)
Single-Vdd (baseline)               7077044           -
Dual-Vdd w/ Power-gating (1.0W)     11092744          57%
Dual-Vdd w/ Power-gating (1.5W)     15420197          118%
Dual-Vdd w/ Power-gating (2.0W)     20249865          186%
[Li et al, ICCAD’04]                22678225          220%

Area overhead is mainly due to power transistors for power-gating capability

Track duplication with power-gating vs Vdd-programmable interconnects [Li et al, ICCAD’04]
  More power reduction (45% vs 25%) & less area overhead
    Mainly due to Vdd-level converter removal

High Vdd interconnects with power gating is BEST considering area
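The overhead percentages follow directly from the total-area figures on this slide. A short sketch of that arithmetic (the dictionary keys and function name are illustrative labels):

```python
BASELINE_AREA = 7077044  # single-Vdd baseline, total FPGA area from the slide

AREAS = {
    "Dual-Vdd w/ power-gating (1.0W)": 11092744,
    "Dual-Vdd w/ power-gating (1.5W)": 15420197,
    "Dual-Vdd w/ power-gating (2.0W)": 20249865,
    "Vdd-programmable [Li et al, ICCAD'04]": 22678225,
}

def area_overhead_percent(area: int, baseline: int = BASELINE_AREA) -> float:
    """Extra area relative to the baseline, as a percentage."""
    return 100.0 * (area - baseline) / baseline

overheads = {name: round(area_overhead_percent(a)) for name, a in AREAS.items()}
for name, pct in overheads.items():
    print(f"{name}: {pct}%")
```

The rounded results reproduce the overhead figures quoted on the slide.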

Outline

Review and motivation

Interconnect Leakage Power Reduction using Power-gating

Interconnect Dynamic Power Reduction using Dual-Vdd

Conclusions and Ongoing Work

Conclusions and Ongoing Work

Conclusions
  Developed power-gateable interconnects w/ virtually no extra SRAM cell
  Achieved 38.18% total power reduction using Vdd-gateable interconnects
  Achieved 24.78% interconnect dynamic power reduction, 45.00% total power reduction with duplicated (2W) channel width
Ongoing work
  Power-ground design to support dual-Vdd
  Optimal mix of Vdd-programmable and Vdd-gateable interconnects
  Architecture evaluation considering Vdd programmability [Lin et al, to appear in FPGA’05]

