
3D IC Design Tools and Applications to Microarchitecture

Date post: 04-Feb-2022
Page 1: 3D IC Design Tools and Applications to Microarchitecture

3D IC Design Tools and Applications to Microarchitecture Exploration

Jason Cong
UCLA Computer Science Department

[email protected]
http://cadlab.cs.ucla.edu/~cong

Page 2: 3D IC Design Tools and Applications to Microarchitecture

Outline

Thermal-Aware 3D IC Physical Design Flow
  Thermal Models and Assumptions
  3D Routing with Thermal Via Planning
  3D Placement
  3D Floorplanning
3D Architecture Exploration
3D Component Modeling and Testing
Concluding Remarks and Future Work

Page 3: 3D IC Design Tools and Applications to Microarchitecture

Thermal Challenges in 3-D ICs

Key challenges of 3-D IC design:
  Higher power density
  Inter-layer dielectric layers
High-temperature effects:
  Longer interconnect delays
  Functional failure
Temperature increases dramatically along the z direction

[Figure: temperature distribution along the z direction across device layers Si 1 to Si 4; the temperature axis is marked at 30°C, 100°C, 135°C and 150°C]

Page 4: 3D IC Design Tools and Applications to Microarchitecture

3-D IC Cooling Schemes

Heat Sink Optimization
  Air cooling fans
  Heat radiating fins
  Thermal grease, AC, etc.
Chip-Level Temperature Optimization
  Microchannel cooling
  Floorplanning
  Routing
  Thermal via insertion

Page 5: 3D IC Design Tools and Applications to Microarchitecture

Thermal-Aware 3D Physical Design Flow at UCLA (2002-2005)

[Flow diagram; boxes recovered from the slide:]
  Netlist (LEF/DEF), Design constraints, Technology
  Thermal-Driven 3D Floorplanner
  Thermal-Driven 3D Placement
  Thermal-Aware 3D Router w/ Thermal Via Planning
  OpenAccess
  Compact Thermal Model
  Parasitic Extraction
  Thermal Simulation
  Timing Analysis
  Layout Verification
  CIF/GDSII

Page 6: 3D IC Design Tools and Applications to Microarchitecture

10/8/2007 UCLA VLSICAD LAB 6

3D Physical Design Flow (IBM, UCLA, and PSU) (2006-present)

[Flow diagram; boxes recovered from the slide:]
  Libraries: Tech. Lib, Ref. Lib, Design
  Inputs: Layer & Design Rules (LEF); Cell & Via* definitions (LEF); Netlist (HDL or DEF)
  3D OA
  Thermal-Driven 3D Floorplanner
  Thermal-Driven 3D Placer
  3D Global Router
  Thermal-Via Planner
  Tier Export / Tier Import
  2D OA: Detailed Routing by Cadence Router
  3D RC extraction
  Timing Interface, EinsTimer
  3D DRC & 3D LVS
  Layout (GDSII)
  Contributor labels: PSU, UCLA

Page 7: 3D IC Design Tools and Applications to Microarchitecture

Thermal Resistive Network [Wilkerson04]

Circuit stack partitioned into tiles
Tiles connected through thermal resistances
  Lateral resistances: fixed
  Vertical resistances ∝ 1/#via
Heat sources modeled as current sources
  Current value = power
Heat sinks modeled as ground nodes
Accurate and slow

[Figure: (a) tiles stack array; (b) single tile stack with nodes 1 to 5, power sources P1 to P5, vertical resistances R1 to R5 and lateral resistances Rlateral]

Page 8: 3D IC Design Tools and Applications to Microarchitecture

Thermal Resistive Chain Model

One-dimensional heat flow analysis
Elmore delay-like formula [Chiang01]

$T_4 = \sum_{i=1}^{4} \Big( R_i \sum_{j=i}^{4} P_j \Big)$

$T_4 = \sum_{i=1}^{4} \Big( P_i \sum_{j=1}^{i} R_j \Big)$

Fast and rough
Reduce R: thermal via insertion (routing)
Permute P: floorplanning

[Figure: single tile stack as a resistive chain with nodes 1 to 4, power sources P1 to P4 and resistances R1 to R4]
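The Elmore-like chain formula can be illustrated with a minimal Python sketch. It assumes that R[0] is the resistance next to the heat sink, that P[i] is the power injected at node i+1, and that temperatures are measured relative to the heat sink. The function names and the numeric values are illustrative only and do not come from the slides.

```python
def chain_top_temperature(R, P):
    """Temperature rise of the top node of a 1-D thermal chain.

    R[i] is the resistance just below node i+1 (R[0] touches the heat sink)
    and P[i] is the power injected at node i+1 (assumed indexing).
    """
    # The heat flowing through R[i] is all the power injected at or above it.
    return sum(R[i] * sum(P[i:]) for i in range(len(R)))


def chain_top_temperature_alt(R, P):
    """Same quantity: each source times its total resistance to the sink."""
    return sum(P[i] * sum(R[: i + 1]) for i in range(len(P)))


# Illustrative values (hypothetical units: K/W and W).
R = [1.0, 2.0, 2.0, 2.0]
P = [4.0, 3.0, 2.0, 1.0]

t4 = chain_top_temperature(R, P)
t4_alt = chain_top_temperature_alt(R, P)

# "Permute P": moving the hot blocks away from the sink raises the top temperature.
t4_hot_on_top = chain_top_temperature(R, list(reversed(P)))

# "Reduce R": lowering one vertical resistance lowers the top temperature.
t4_more_vias = chain_top_temperature([1.0, 1.0, 2.0, 2.0], P)
```

Both functions evaluate the same double sum in a different order, which is why the two forms of the formula agree.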

Page 9: 3D IC Design Tools and Applications to Microarchitecture

Through-the-Silicon Vias (TS-Vias) in 3D ICs

Effective in dissipating heat
  Regular wires have almost no effect (size/direction)
Two types of TS-vias
  Signal TS-vias, part of the netlist
  Thermal TS-vias, with no connections, introduced to reduce temperature

[Figure: cross-section of a 3D IC showing the pad, dielectric layers, metal routing layers, silicon (device layers) with Blocks 1 to 5, a thermal TS via and a signal TS via]

Page 10: 3D IC Design Tools and Applications to Microarchitecture

Thermal-Aware 3D Routing Problem

Input
  3-D floorplanning (placement) result
  Technology
  Netlist
  Required temperature, such as 80°C
Output
  Routed nets
  Thermal TS-via number and locations
Objectives
  Minimum wirelength
  Minimum TS-via number

Page 11: 3D IC Design Tools and Applications to Microarchitecture

Multilevel TS-Via Planning and 3D Routing (TMARS)

[Figure: multilevel scheme over routing graphs G0, ..., Gi, ..., Gk (level 0 to level k), with a downward pass and an upward pass, using the thermal resistive network model]

Step groups shown in the figure:
  (1) Power Density Calculation; (2) Heat Flow Estimation; (3) Routing Resource Estimation
  (1) Power Density Coarsening; (2) Heat Flow Estimation; (3) Routing Resource Coarsening
  (1) Initial Routing Tree Generation; (2) TTS Via Planning; (3) TTS Via Number Adjustment
  (1) Routing Refinement; (2) TTS Via Planning; (3) TTS Via Number Adjustment

Page 12: 3D IC Design Tools and Applications to Microarchitecture

Thermal TS-Via Planning Problem

Determine the thermal TS-via density for all tiles
Minimize the total number of thermal TS-vias
Meet the capacity and temperature constraints
Solved through
  Via planning proportional to Δt (VPPT)
    • Δt: vertical temperature difference
  Alternating direction via planning (ADVP)

[Figure: tile example with Δt_a = t_a − t_b for vertically adjacent tiles a and b]

Page 13: 3D IC Design Tools and Applications to Microarchitecture

Thermal TS Via Planning [Cong & Zhang, ICCAD’05]

Non-Linear Programming Formulation
Variable definition, for tile L_{i,j,k}
  a_{i,j,k}: TS-via number
  R_{i,j,k}: vertical thermal resistance
  P_{i,j,k}: current source
  γ: constant
  R_{i,j,k} = γ / a_{i,j,k}
  t_{i,j,k}: temperature
  I_{i,j,k}: heat flow
Objective
Constraints
  Capacity constraint
  Temperature constraint
  Kirchhoff's current law
Constrained NLP
  Can be solved by a general NLP solver
  But very time consuming

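[Figure: tile stack with via-dependent vertical resistance R_{i,j,k} = γ / a_{i,j,k}, a fixed resistance R, heat flow I_{i,j,k} and temperature t_{i,j,k}]

$\#\text{total\_via} = \sum_{k=2}^{N} \sum_{i,j} a_{i,j,k}, \qquad a_{i,j,k} \ge \frac{\gamma \, I_{i,j,k}}{t_{i,j,k} - t_{i,j,k-1}}$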

Page 14: 3D IC Design Tools and Applications to Microarchitecture

Alternating Direction TS-Via Planning (ADVP)

Decompose the NLP into simplified sub-problems
  Optimize the via distribution in one direction at a time
  Alternate between vertical via planning and horizontal via planning at each level
  Update the heat flow after every step

Page 15: 3D IC Design Tools and Applications to Microarchitecture

Vertical TS-Via Planning

Resistive network → resistive chain
NLP → convex programming
  Solvable by any convex programming tool
Theorem: with no capacity constraint, the TS-via number is proportional to the square root of Δt

$a_4 : a_3 : a_2 = \sqrt{\Delta t_4} : \sqrt{\Delta t_3} : \sqrt{\Delta t_2}$

VPPT

$a_4 : a_3 : a_2 = \Delta t_4 : \Delta t_3 : \Delta t_2$

[Figure: resistive chain with nodes 1 to 4, heat flows I1 to I4, fixed resistance R1 and via-dependent resistances R2 = γ/a2, R3 = γ/a3, R4 = γ/a4]
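The square-root rule can be checked numerically on a small chain. The sketch below is a simplified stand-in for the slide's convex program, under stated assumptions: the heat flows I_k are treated as fixed, via counts are real-valued, there is no capacity constraint, and the only requirement is that the total drop across the via-dependent resistances, sum(γ·I_k / a_k), meets a budget. Under those assumptions a Lagrange-multiplier argument gives a_k proportional to sqrt(I_k); with a uniform initial via distribution Δt_k is proportional to I_k, so this matches proportionality to the square root of the initial Δt. All names and numbers are hypothetical.

```python
import math


def sqrt_allocation(heat_flows, gamma, drop_budget):
    """Via counts minimizing sum(a) subject to sum(gamma * I / a) == drop_budget."""
    s = sum(math.sqrt(i) for i in heat_flows)
    return [gamma * math.sqrt(i) * s / drop_budget for i in heat_flows]


def proportional_allocation(heat_flows, gamma, drop_budget):
    """Via counts directly proportional to heat flow, scaled to the same budget."""
    c = len(heat_flows) * gamma / drop_budget
    return [c * i for i in heat_flows]


def total_drop(heat_flows, vias, gamma):
    """Temperature drop across the via-dependent resistances R = gamma / a."""
    return sum(gamma * i / a for i, a in zip(heat_flows, vias))


# Illustrative inputs (not taken from the slides).
heat_flows = [9.0, 4.0, 1.0]
gamma = 100.0
drop_budget = 50.0

a_sqrt = sqrt_allocation(heat_flows, gamma, drop_budget)
a_prop = proportional_allocation(heat_flows, gamma, drop_budget)
```

Both allocations meet the same temperature budget in this toy setting, but the square-root allocation needs fewer vias in total.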

Page 16: 3D IC Design Tools and Applications to Microarchitecture

Horizontal TS-Via Planning

Still an NLP
Further simplification
  TTS via number given
  Even out Δt in one layer
  TS-via number proportional to the vertical heat flow I_{i,j,k}
Fast heat flow estimation
  Through path counting
  Error can be corrected by the accurate model

[Figure: tiles 1 to 5 in layer k, labeled with P_{i,j,k}, I_{i,j,k} and I_{i,j,k+1}]

Page 17: 3D IC Design Tools and Applications to Microarchitecture

Experiment Setup

Four-layer 3D floorplanning results from 3DFP [ICCAD04]
  MCNC and GSRC floorplanning benchmarks
  Power density: random value (10^5 ~ 10^7 W/m^2)
Required temperature: 77°C

Benchmark characteristics

circuit   block #   net #   Init Temp (°C)
ami33     33        123     298.8
ami49     49        408     210.7
n100      100       885     275.3
n200      200       1585    311.2
n300      300       1893    290.2

Page 18: 3D IC Design Tools and Applications to Microarchitecture

Experimental Results: Temperature Reduction

With thermal via insertion, the temperature can be reduced to the required temperature (77°C)
Thermal via insertion can reduce the maximum on-chip temperature by over 40%

[Chart: T (°C), on a 0 to 350 scale, for ami33, ami49, n100, n200 and n300; series: input, after routing, with thermal via insertion]

Page 19: 3D IC Design Tools and Applications to Microarchitecture

Temperature Maps of ami33 Top Layer

[Figure: two temperature maps. Before thermal via insertion: legend bins from 152-153 to 157-158. After thermal via insertion: legend bins from 63-64 to 76-77]

Page 20: 3D IC Design Tools and Applications to Microarchitecture

Experimental Results: Different TS-Via Planners

All can reach the required temperature
m-ADVP
  11% reduction over flat ADVP
  68% reduction over TS-via insertion by temperature (m-VPPT)
  3.5x reduction over even TS-via distribution

[Chart: normalized TS-via number (0 to 8) for ami33, ami49, n100, n200 and n300; series: m-ADVP, f-ADVP, m-VPPT, even]

Page 21: 3D IC Design Tools and Applications to Microarchitecture

Experimental Results: Final Routing Results

Completion rates: m-ADVP: 96.9%, m-VPPT: 93.7%, even: 73.44%
Normalized runtime: m-ADVP: 1.0, m-VPPT: 1.49, even: 3.8

[Charts: completion rates (0.5 to 1) and runtime in seconds (0 to 10) for ami33, n100 and n300; series: m-ADVP, m-VPPT, even]

Page 22: 3D IC Design Tools and Applications to Microarchitecture

Outline

Thermal-Aware 3D IC Physical Design Flow
  Thermal Models and Assumptions
  3D Routing with Thermal Via Planning
  3D Placement
  3D Floorplanning
3D Architecture Exploration
3D Component Modeling and Testing
Concluding Remarks and Future Work

Page 23: 3D IC Design Tools and Applications to Microarchitecture

2D to 3D Transformation by Local Stacking

1. 2D placement on area K*A
   For a 3D chip with K device layers, each with area A
2. Shrink:

   $(x_i, y_i) \rightarrow (x_i / \sqrt{K},\; y_i / \sqrt{K})$

3. Tetris-style 3D legalization
   Cost R = αd + βv + γt
   Minimize displacement, #via and thermal cost
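The shrink step and the legalization cost can be sketched in a few lines of Python. This is only an illustration under assumptions: cell locations are stored as a dictionary of (x, y) points, coordinates are scaled by 1/sqrt(K) so that a K*A footprint maps onto area A, and the weights α, β, γ are free parameters. The Tetris-style legalization itself is not shown, and every name below is hypothetical.

```python
import math


def shrink_placement(cells, num_layers):
    """Scale 2D cell locations so a footprint of area K*A fits into area A."""
    scale = 1.0 / math.sqrt(num_layers)
    return {name: (x * scale, y * scale) for name, (x, y) in cells.items()}


def legalization_cost(displacement, via_count, thermal_cost,
                      alpha=1.0, beta=1.0, gamma=1.0):
    """Weighted cost R = alpha*d + beta*v + gamma*t for one candidate position."""
    return alpha * displacement + beta * via_count + gamma * thermal_cost


# Illustrative 2D placement for a chip with K = 4 device layers.
cells = {"c1": (0.0, 0.0), "c2": (8.0, 4.0), "c3": (4.0, 8.0)}
shrunk = shrink_placement(cells, num_layers=4)

# Cost of one hypothetical legalization candidate.
cost = legalization_cost(displacement=2.0, via_count=3, thermal_cost=0.5,
                         alpha=1.0, beta=2.0, gamma=4.0)
```

After shrinking, several cells overlap in the plane; the legalization step would then assign each cell a layer and a legal position by comparing such costs.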

Page 24: 3D IC Design Tools and Applications to Microarchitecture

2D to 3D Transformation by Folding

Layer assignment and location mapping according to the folded order

[Figure: Folding-2 and Folding-4 examples]

Page 25: 3D IC Design Tools and Applications to Microarchitecture

Window-based Stacking / Folding

1. Divide the 2D placement into NxN windows
2. Apply stacking or folding within each window

The effect of stacking or folding is spread out, and trade-offs are achieved by varying N

Page 26: 3D IC Design Tools and Applications to Microarchitecture

3D Placement via Transformation

Features
  Existing well-performing 2D placers can be reused
  Simple but effective transformation heuristics
  Trade-off between wirelength and #via to adapt to different manufacturing abilities
  Refinement through RCN graph

[Flow diagram: 2D wirelength- and/or thermal-driven placement → 2D to 3D transformation → layer reassignment through RCN graph → 2D detailed placement for each layer; a fast thermal model and an accurate thermal model are shown alongside the flow]

Page 27: 3D IC Design Tools and Applications to Microarchitecture

3D Placement Results (1/2)

Wirelength (stacking) compared to 2D mPL5

circuit   2D mPL5    T3Place
ibm01     5.19E+06   2.51E+06
ibm02     1.44E+07   6.95E+06
ibm03     1.37E+07   6.67E+06
ibm04     1.67E+07   8.21E+06
ibm05     4.23E+07   1.94E+07
ibm06     2.20E+07   1.09E+07
ibm07     3.73E+07   1.90E+07
ibm08     3.94E+07   1.98E+07
ibm09     3.46E+07   1.78E+07
ibm10     6.82E+07   3.61E+07
ibm11     5.02E+07   2.51E+07
ibm12     7.58E+07   3.78E+07
ibm13     6.58E+07   3.30E+07
ibm14     1.42E+08   7.40E+07
ibm15     1.65E+08   8.42E+07
ibm16     2.04E+08   1.06E+08
ibm17     3.05E+08   1.60E+08
ibm18     2.43E+08   1.28E+08
avg.      1          0.5

Wirelength vs. # TS via trade-offs

[Chart: number of TS vias (0 to 8.00E+04) versus wirelength (2.00E+07 to 4.50E+07); series: folding + 7(a), stacking + 7(a), folding + 7(b), stacking + 7(b), folding + sequential, stacking + sequential, folding + symmetric, stacking + symmetric]

Page 28: 3D IC Design Tools and Applications to Microarchitecture

3D Placement Results (2/2)

Effect of temperature optimization

          LST, r = 10%   LST, r = 10%, w/ temp optimization
circuit   Temp. (°C)     WL          via #     Temp. (°C)
ibm01     276.5          2.81E+06    19020     159.8
ibm03     196.7          7.13E+06    31780     121.6
ibm04     159.6          9.11E+06    40219     96.0
ibm06     160.4          1.23E+07    50576     103.5
ibm07     107.5          2.01E+07    69111     66.4
ibm08     97.7           2.05E+07    75397     63.2
ibm09     96.1           1.94E+07    78102     60.6
ibm13     249.3          3.47E+07    127520    156.2
ibm15     136.5          8.58E+07    260681    90.1
ibm18     89.4           1.31E+08    332012    58.7
Avg.      1.0            1.08        1.06      0.63

Page 29: 3D IC Design Tools and Applications to Microarchitecture

Analytical Engine for 3D Placement

Discrete tier assignment
  Variables (x_i, y_i, z_i), i = 1, 2, …, n
  Cell i is placed at (x_i, y_i) on tier z_i
Relaxed tier assignment

[Figure: discrete (legalized solution) versus relaxed (intermediate solution) tier assignment]

Page 30: 3D IC Design Tools and Applications to Microarchitecture

Analytical Engine

Discrete tier assignment
Relaxed tier assignment
  Formulate the 3D placement problem as a continuous optimization

$\text{minimize} \; \sum_{e} WL_e(x, y, z) \quad \text{subject to (no overlap between cells)}$

[Figure: discrete (legalized solution) versus relaxed (intermediate solution) tier assignment]

Page 31: 3D IC Design Tools and Applications to Microarchitecture

Non-overlap Constraints

Relaxed by area density constraints
  Divide the placement region into bins
  Measure the overflow of bin area to capture cell overlaps
    • Cell overlaps in overflow bins violate density constraints
    • Cell overlaps not in overflow bins do not violate density constraints

Page 32: 3D IC Design Tools and Applications to Microarchitecture

Non-overlap Constraint

Replaced by area density constraint
  Divide the placement region into bins
  Measure the overflow of bin area to capture cell overlaps

$\text{minimize} \; \sum_{e} WL_e(x, y, z) \quad \text{subject to (no overlap between cells)}$

becomes

$\text{minimize} \; \sum_{e} WL_e(x, y, z) \quad \text{subject to} \; A_{i,j,k}(x, y, z) \le C_{i,j,k} \; \text{for all } i, j, k$

Page 33: 3D IC Design Tools and Applications to Microarchitecture

Non-overlap Constraint

Replaced by area density constraint
  Divide the placement region into bins
  Measure the overflow of bin area to capture cell overlaps

$\text{minimize} \; \sum_{e} WL_e(x, y, z) \quad \text{subject to (no overlap between cells)}$

becomes, after adding filler cells [Chan et al., ISPD’06],

$\text{minimize} \; \sum_{e} WL_e(x, y, z) \quad \text{subject to} \; A_{i,j,k}(x, y, z) = C_{i,j,k} \; \text{for all } i, j, k$

Page 34: 3D IC Design Tools and Applications to Microarchitecture

Non-overlap Constraint

Replaced by area density constraint
  Divide the placement region into bins
  Measure the overflow of bin area to capture cell overlaps

$\text{minimize} \; \sum_{e} WL_e(x, y, z) \quad \text{subject to} \; A_{i,j,k}(x, y, z) = C_{i,j,k} \; \text{for all } i, j, k$

is solved as the penalized problem

$\text{minimize} \; \sum_{e} WL_e(x, y, z) + \frac{\mu}{2} \sum_{k} \sum_{i,j} \big( A_{i,j,k}(x, y, z) - C_{i,j,k} \big)^2$

Increase μ until overlaps are removed
[Nam & Cong, Springer’07] [Cong & Luo, ISPD’08]
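The effect of increasing μ can be seen by evaluating the penalized objective for two candidate placements. This is a minimal sketch: wirelength and bin areas are given as plain numbers rather than computed from a placement, and the values are made up for illustration.

```python
def penalized_objective(wirelength, bin_areas, bin_capacities, mu):
    """Wirelength plus (mu / 2) * sum over bins of (A - C)^2."""
    penalty = sum((a - c) ** 2 for a, c in zip(bin_areas, bin_capacities))
    return wirelength + 0.5 * mu * penalty


capacities = [2.0, 2.0]

# Candidate A: shorter wirelength, but one bin is over capacity (hypothetical numbers).
cost_a_small_mu = penalized_objective(100.0, [3.0, 1.0], capacities, mu=1.0)
cost_a_large_mu = penalized_objective(100.0, [3.0, 1.0], capacities, mu=16.0)

# Candidate B: longer wirelength, every bin exactly at capacity.
cost_b = penalized_objective(110.0, [2.0, 2.0], capacities, mu=16.0)
```

With a small μ the overlapping candidate A is cheaper, and with a large μ the evenly spread candidate B wins, which is the mechanism behind "increase μ until overlaps are removed".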

Page 35: 3D IC Design Tools and Applications to Microarchitecture

Non-overlap Constraint

Replaced by area density constraint
  Divide the placement region into bins
  Measure the overflow of bin area to capture cell overlaps
Area projection to obtain bin densities from the intermediate solution

Page 37: 3D IC Design Tools and Applications to Microarchitecture

Area Projection

Bell-shaped function to project area

$\eta(k, z) = \begin{cases} 1 - 2(z - k)^2 & |z - k| \le 1/2 \\ 2(|z - k| - 1)^2 & 1/2 < |z - k| \le 1 \\ 0 & \text{otherwise} \end{cases}$

η(k, z): the projection ratio from “tier z” to tier k

Page 38: 3D IC Design Tools and Applications to Microarchitecture

Area Projection

Bell-shaped function to project area

$\eta(k, z) = \begin{cases} 1 - 2(z - k)^2 & |z - k| \le 1/2 \\ 2(|z - k| - 1)^2 & 1/2 < |z - k| \le 1 \\ 0 & \text{otherwise} \end{cases}$

An example: intermediate placement of a cell at “tier 2.316”
  Projects 0% area to tier 1, η(1, z)
  Projects 80% area to tier 2, η(2, z)
  Projects 20% area to tier 3, η(3, z)
  Projects 0% area to tier 4, η(4, z)
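The slide's example can be reproduced with a short Python sketch of the bell-shaped projection ratio, written here as 1 − 2d² for d ≤ 1/2 and 2(d − 1)² for 1/2 < d ≤ 1, with d = |z − k|. The function name is illustrative; the piecewise form is the one that yields the 80% / 20% split quoted for tier 2.316.

```python
def projection_ratio(k, z):
    """Fraction of a cell's area projected onto tier k when it sits at relaxed tier z."""
    d = abs(z - k)
    if d <= 0.5:
        return 1.0 - 2.0 * d * d
    if d <= 1.0:
        return 2.0 * (d - 1.0) ** 2
    return 0.0


# Cell at the relaxed position "tier 2.316" on a four-tier stack.
ratios = [projection_ratio(k, 2.316) for k in (1, 2, 3, 4)]
```

For a cell lying between two adjacent tiers the two nonzero ratios add up to one, so the projection conserves the cell's total area.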

Page 40: 3D IC Design Tools and Applications to Microarchitecture

Equivalence to Non-overlap Constraint

Area projection to tiers is not enough
  Counterexample: projected area fails to capture illegality
Solution: area projection on pseudo-tiers

[Figure: counterexample, with the overflow marked]

Page 41: 3D IC Design Tools and Applications to Microarchitecture

Equivalence to Non-overlap Constraint

Theorem: (x, y, z) satisfy the constraints

$\begin{cases} A_{i,j,k}(x, y, z) = C_{i,j,k} \\ A'_{i,j,k}(x, y, z) = C'_{i,j,k} \end{cases} \quad \text{for all } i, j, k$

iff (x, y, z) is a legal placement (no overlaps)**

** after adding

Page 42: 3D IC Design Tools and Applications to Microarchitecture

Multilevel Framework

[Figure: multilevel cycle; legend: level at which the analytical engine is applied, C = coarsening, I = interpolation]

Page 43: 3D IC Design Tools and Applications to Microarchitecture

Experimental Results (1/2)

Comparison of trade-off curves (ibm13)
  19% shorter WL and 9% fewer TSV than [comparison label not recoverable]
  15% shorter WL and 43% fewer TSV than [comparison label not recoverable]
  (consistent behavior on other circuits)

[Chart: trade-off curves; one series is labeled "Trans."]

Page 44: 3D IC Design Tools and Applications to Microarchitecture

Outline

Thermal-Aware 3D IC Physical Design Flow
  Thermal Models and Assumptions
  3D Routing with Thermal Via Planning
  3D Placement
  3D Floorplanning
3D Architecture Exploration
3D Component Modeling and Testing
Concluding Remarks and Future Work

Page 45: 3D IC Design Tools and Applications to Microarchitecture

Thermal-Aware 3D Floorplanning [ICCAD04]

First work in this field
Simulated Annealing (SA) engine
  New local z-neighbor operations
  Cost function

  $cost = \alpha \cdot nwl + \beta \cdot narea + \gamma \cdot nvc + \eta \cdot c_T$

    • nwl: normalized wirelength
    • narea: normalized chip area
    • nvc: normalized interlayer via number
    • c_T: temperature cost
Hybrid thermal evaluation
  At each move: uses the simplified chain model
  At each SA temperature drop: uses the resistive network model

[Figure: example 3D floorplan with blocks a to k on layers L1 to L3]

Page 46: 3D IC Design Tools and Applications to Microarchitecture

Temperature/Runtime Tradeoff

3DFP-T can reduce the temperature by 56% with 9.7x runtime
3DFP-T-Fast can reduce the temperature by 40% with 1.8x runtime
3DFP-T-Hybrid can reduce the temperature by 50% with 3.2x runtime
Wirelength increase is less than 6%

[Chart: normalized temperature (0 to 1.2) versus normalized runtime (0 to 15) for 3DFP, 3DFP-T, 3DFP-T-Fast and 3DFP-T-Hybrid]

Page 47: 3D IC Design Tools and Applications to Microarchitecture

Detailed Simulation Result

[Figure: two temperature maps, without thermal optimization and with thermal optimization]
  ami33 benchmark with 33 blocks and 4 layers
  Generated by an FEM-based thermal simulation tool (CFD-ACE+)

Page 48: 3D IC Design Tools and Applications to Microarchitecture

3D Floorplanning with Folded Blocks

Exploring vertical integration in microprocessor design requires considering both physical design and architecture.

True 3D packing
Architectural alternative selection
  • The number of layers in folded blocks
  • The partitioning method: block folding or port partitioning

Page 49: 3D IC Design Tools and Applications to Microarchitecture

3D Architectural Blocks: Issue Queue

Block folding
  Fold the entries and place them on different layers
  Effectively shortens the tag lines
Port partitioning
  Place tag lines and ports on multiple layers, thus reducing both the height and width of the ISQ
  The reduction in tag and matchline wires can help reduce both power and delay
Benefits from block folding
  Maximum delay reduction of 50%, maximum area reduction of 90% and maximum power reduction of 40%

[Figure: (a) 2D issue queue with 4 tag lines; (b) block folding; (c) port partitioning]

Page 50: 3D IC Design Tools and Applications to Microarchitecture

3D Architectural Blocks: Caches

[Figure: single-layer design, wordline folding and port partitioning]

3D-CACTI: a tool to model 3D caches for area, delay and power
  We add the port partitioning method
  The area impact of vias is included
Improvements
  Port folding performs better than wordline folding for area (72% vs 51%)
  Wordline folding is more effective in reducing the block delay (13% vs 5%)
  Port folding also performs better in reducing power (13% vs 5%)

Page 51: 3D IC Design Tools and Applications to Microarchitecture

Corner Block List (CBL) Representation for 3D Floorplan (ICCD’07)

A 3D CBL is a 3-tuple (S, L, T)
  S: a list of block names
  L: corner cubic block orientations (X-, Y- or Z-oriented)
  T: the sequence {Tn, Tn-1, …, T2} recording the number of blocks (represented by that many 1's followed by a 0) covered by the corner cubic block in the uncovered block list

Example
  S = {1 2 3 4 5}
  L = (Y, Z, Y, X)
  T = (10, 110, 10, 1110)

[Figure: 3D packing of blocks 1 to 5]

Page 52: 3D IC Design Tools and Applications to Microarchitecture

Outline

Thermal-Aware 3D IC Physical Design Flow
  Thermal Models and Assumptions
  3D Routing with Thermal Via Planning
  3D Placement
  3D Floorplanning
3D Architecture Exploration
3D Component Modeling and Testing
Concluding Remarks and Future Work

Page 53: 3D IC Design Tools and Applications to Microarchitecture

3D Architecture Evaluation with Physical Planning: MEVA-3D [DAC’03 & ASPDAC’06]

Optimize BIPS (not IPC or Freq)
  • Consider interconnect pipelining based on early floorplanning for critical paths
  • Use IPC sensitivity model [Jagannathan05]
Area/wirelength
Temperature

[Flow diagram with ESTIMATION and VALIDATION phases; labels recovered from the slide:]
  microarchitecture configuration
  target frequency
  critical architectural paths and sensitivity
  power density estimates
  2D/3D floorplanning for performance and thermal with interconnect pipelining
  estimated performance, temperature, and interconnect data
  performance simulation with interconnect latencies
  power density with interconnect consideration
  2D/3D thermal simulation
  performance, power and temperature

Page 54: 3D IC Design Tools and Applications to Microarchitecture

54

IPC Sensitivity Models

Study sensitivity by varying the latency of a path P with all other parameters fixed

Build mathematical models [linear, piecewise linear, etc., or table lookup]
• $P_{BL}$: minimum latency along P (only from blocks)
• $P_{PL}$: post-layout latency along P (blocks + wires)
• Delta latency $\delta = P_{PL} - P_{BL}$
• $f(P, \delta)$: relative degraded IPC with $\delta$ extra cycles of latency on P

$f(P, \delta) = (1 - x)^{\delta}$, where x is the per-cycle IPC degradation for P
e.g., with x = 0.024 and 2 extra cycles, relative IPC = (1 - 0.024) * (1 - 0.024)

• $\mathrm{IPC}_{PL} = \mathrm{IPC}_{BL} \times f(P, \delta)$

We ignore path interactions and use a simple additive model to combine multiple paths

$\mathrm{IPC}_{PL}(P_1, P_2, \ldots, P_N, \delta_1, \delta_2, \ldots, \delta_N) = \mathrm{IPC}_{BL}(P_1, P_2, \ldots, P_N, 0, 0, \ldots, 0) \cdot f(P_1, \delta_1) \cdot f(P_2, \delta_2) \cdots f(P_N, \delta_N)$
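The sensitivity model above can be sketched in a few lines. The sketch follows the formula as written, which multiplies the per-path factors (the slide describes the combination as additive). The function names, the baseline IPC, and the second path's parameters are assumptions made for illustration; only x = 0.024 with 2 extra cycles comes from the slide.

```python
from math import prod


def ipc_factor(x, delta):
    """f(P, delta) = (1 - x) ** delta for one path P.

    x is the per-cycle IPC degradation of the path, delta the number of
    extra cycles added by post-layout wire latency.
    """
    if delta < 0:
        raise ValueError("delta must be non-negative")
    return (1.0 - x) ** delta


def post_layout_ipc(ipc_bl, paths):
    """Scale the baseline IPC by the factor of every (x, delta) path.

    Path interactions are ignored, as on the slide.
    """
    return ipc_bl * prod(ipc_factor(x, delta) for x, delta in paths)


# The slide's example: x = 0.024 and 2 extra cycles on one path.
single = ipc_factor(0.024, 2)  # (1 - 0.024) * (1 - 0.024), about 0.9526

# Hypothetical two-path case: baseline IPC of 1.0 and a made-up second path.
combined = post_layout_ipc(1.0, [(0.024, 2), (0.010, 1)])
```

With no extra cycles on any path, `post_layout_ipc` returns the baseline IPC unchanged, matching the zero-delta arguments of the baseline term in the formula.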


Design Example

An out-of-order superscalar processor microarchitecture with 4 banks of L2 cache in 70 nm technology

Critical paths


Baseline Processor Parameters


Wirelength Improvement from 3D Layout

[Figure: bar chart of wirelength (axis range 0 to 120000) for 2D and 3D layouts at 3G, 4G, 5G, and 6G.]

Assume two device layers


Performance Improvement of 3D Layout

Assume two device layers


2D vs 3D Layout

2D EV6-like core: BIPS = 2.75
3D EV6-like core (2 layers): BIPS = 2.94

Wakeup loop: the extra cycle is eliminated.

Branch misprediction resolution loop and the L2 cache access latency: some of the extra cycles are eliminated.

Assume two device layers


Maximum On-Chip Temperatures

[Figure: maximum on-chip temperature versus frequency.]

HS denotes a heat sink. 3D integration allows thermal vias to be inserted to reduce the temperature.

Assume two device layers


Thermal Profiles for 2D Chip (4 GHz)

Temperature distribution in 2D integration.


Thermal Profiles for 3D Chip (4 GHz)

Temperature distribution in 3D integration with one heat sink.

Temperature distribution in 3D integration with two heat sinks and flipped upper layer.


Limitation of Component Stacking Alone

Extra latency seen by some critical loops:

Stacking can only attack wire latency between blocks

Further benefit can only come from attacking block latency

Component Folding


Solution: 3D Design w/ Component Folding and Stacking

Explore 3D design of architectural structures that are
• Timing/throughput critical
• Expensive in terms of power consumption and/or thermal output

Possible candidates for 3D component folding
Instruction Scheduling Window
• Issue queue can be partitioned into multiple levels via matchlines or taglines.
On-Chip Caches
• Regular structure lends itself to a wide range of partitionings
Register File
• Thermally critical resource that also has a regular structure


Results from 3D Folding and Stacking

[Figure: bar chart (axis range 0 to 4) comparing 1-layer, 2-layer, 3-layer, and 4-layer designs at 3G, 4G, 5G, and 6G.]

Over 35% performance improvement


5 GHz 3 Device Layer Layout


Exploration of 3D MultiCore Systems -- MC-Sim

[Figure: MC-Sim block diagram. Labels: three SESC instances (each with MINT and cores C C C…), cache controller, three L2 banks, central page handler, functional network switch, and SystemC NoC model, with "messages" and "message latencies" exchanged between the switch and the NoC model.]


MC-Sim Components

A number of SESC instances
• Each instance is a number of cores cooperating on a single (potentially multithreaded) application

A number of cache banks
• Shared cache state that can be accessed by any SESC instance

A central page handler
• To dole out physical pages to SESC instances
• Allows support for multitasking

A functional network switch
• To functionally route messages between components

A SystemC NoC model
• To accurately model latency and power
• Entries in the functional switch wait for an amount of time specified by the NoC
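The interaction between the functional switch and the NoC model described above can be sketched as a queue whose entries are released only after a model-supplied delay. This is a toy illustration, not MC-Sim code: the class, its methods, the component names, and the latency callback (standing in for the SystemC NoC model) are all hypothetical.

```python
import heapq
from itertools import count


class FunctionalSwitch:
    """Holds each message until the latency reported by a NoC model has elapsed."""

    def __init__(self, noc_latency):
        # noc_latency(src, dst) -> cycles; a stand-in for the timing model.
        self.noc_latency = noc_latency
        self._pending = []    # min-heap ordered by release cycle
        self._seq = count()   # tie-breaker keeps FIFO order for equal cycles

    def send(self, now, src, dst, payload):
        """Queue a message; it becomes deliverable after the modeled latency."""
        release = now + self.noc_latency(src, dst)
        heapq.heappush(self._pending, (release, next(self._seq), dst, payload))

    def deliver(self, now):
        """Return (dst, payload) for every message whose wait has expired."""
        ready = []
        while self._pending and self._pending[0][0] <= now:
            _, _, dst, payload = heapq.heappop(self._pending)
            ready.append((dst, payload))
        return ready


# Hypothetical latencies: 3 cycles to an L2 bank, 1 cycle to anything else.
switch = FunctionalSwitch(lambda src, dst: 3 if dst.startswith("L2") else 1)
switch.send(0, "sesc0", "L2Bank0", "read A")
switch.send(0, "sesc0", "pager", "alloc page")
```

Separating functional routing from timing in this way means the latency model can be swapped without touching how messages are routed.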


Summary

Very little 3D CAD support from major EDA vendors

A complete set of thermal-aware 3D IC physical design tools is available from the UCLA/PennState/IBM collaboration
• 3D thermal modeling
• 3D routing with thermal via planning
• 3D placement
• 3D floorplanning

3D physical design tools provide the capability for early physical prototyping for microarchitecture exploration
• Coupled with 3D physical planning
• Consider both 3D component stacking and folding
• Over 35% performance improvement


Further Reading

Y. Liu, Y. Ma, E. Kursun, J. Cong, and G. Reinman, “Fine Grain 3D Integration for Microarchitecture Design Through Cube Packing Exploration,” Proceedings of 25th IEEE International Conference on Computer Design, Lake Tahoe, CA, pp. 259-266, October 2007.

J. Cong, Y. Ma, Y. Liu, E. Kursun, and G. Reinman, “3D Architecture Modeling and Exploration,” Proceedings of 24th International VLSI/ULSI Multilevel Interconnection Conference (VMIC), Fremont, CA, pp. 231-238, September 2007.

G. Loh, Y. Xie, and B. Black, “3D processor Design,” IEEE Micro, 2007.

J. Cong, G. Luo, J. Wei, and Y. Zhang, “Thermal-Aware 3D IC Placement via Transformation,” Proceedings of the 12th Asian and South Pacific Design Automation Conference (ASP-DAC 2007), Yokohama, Japan, pp. 780-785, January 2007.

Y. Xie, G. Loh, B. Black, and K. Bernstein, “Design Space Exploration for 3D Architecture,” ACM Journal of Emerging Technologies for Computer Systems 2(2):65-103.

J. Cong and Y. Zhang, “Thermal Via Planning for 3-D ICs,” Proceedings of the 2005 IEEE/ACM Int’l Conference on Computer Aided Design, pp. 745-752, November 2005.

Y.-F. Tsai, Y. Xie, N. Vijaykrishnan, and M. J. Irwin, “Three-Dimensional Cache Design Exploration Using 3DCacti,” Proceedings of the IEEE International Conference on Computer Design (ICCD 2005), pp. 519-524.

http://cadlab.cs.ucla.edu/~cong


Acknowledgements

We would like to thank DARPA for its support

Support from the primary contractors -- Collaboration with CFDRC and IBM and

Publications are available from http://cadlab.cs.ucla.edu/~cong

