
Section IV: Timing Closure Techniques

Date post: 07-Jul-2020
Page 1: Section IV: Timing Closure TechniquesTiming Closure Many aspects of a design contribute to performance, power, and density Architecture / Logic Implementation PD Design Style (Flat,

ASPDAC03 – Physical Chip Implementation – Section IV 1

Jan. 2003 ASPDAC03 – Physical Chip Implementation 1

Section IV: Timing Closure Techniques

Jan. 2003 ASPDAC03 - Physical Chip Implementation 2

IBM Contributions to this presentation include:

- T.J. Watson Research Center
- Austin Research Lab
- ASIC Design Centers
- EDA Organization

* For more detailed information see the references at the end of this presentation, which include a wide variety of IBM and external publications covering these areas.


Jan. 2003 ASPDAC03 - Physical Chip Implementation 3

Overview
- Introduction
- Review material (timing and synthesis)
- Introduction to placement
- Placement algorithms (skip)
- Paradigms for placement-synthesis integration
- Placement aware synthesis techniques (skip)
- Congestion avoidance / mitigation techniques
- Routing optimization

Jan. 2003 ASPDAC03 - Physical Chip Implementation 4

Timing Closure

Many aspects of a design contribute to performance, power, and density
- Architecture / Logic Implementation
- PD Design Style (Flat, Hierarchical, etc)
- Clocking Paradigm / Test / Circuit Family
- Floor Plan / Synthesis / Placement / Routing

Design Automation for timing closure is more significant than ever before
- Designs are larger
- Wires are longer, invalidating statistical synthesis models, and requiring lots of buffers
- Cycle times are more aggressive


Jan. 2003 ASPDAC03 - Physical Chip Implementation 5

Design Automation Tools are Individually Mature
- Timing analysis
- Synthesis / Technology mapping
- Placement / Routing
- Floor Planning
- Extraction / Analysis

Jan. 2003 ASPDAC03 - Physical Chip Implementation 6

Physical Synthesis

[Figure: netlist in -> place & route, synthesis, and timing working together -> completed design]

Challenge is to integrate them into one cooperative application


Jan. 2003 ASPDAC03 - Physical Chip Implementation 7

Design Flow Evolution:

Basic flow: Design Entry -> Synthesis w/ Timing -> Place -> Route -> Timing

Synthesis w/ Timing:
1. Tech independent optimization
2. Tech mapping
3. Timing correction

Evolution of the flow:
- Timing driven placement
- Timing driven placement plus automatic post placement tuning
- Integrated placement and synthesis
- Integrated placement, synthesis & routing
  1. Physically aware optimizations
  2. Physically aware timing correction
  3. Timing / Noise aware routing

Jan. 2003 ASPDAC03 - Physical Chip Implementation 8

Purpose of this Section:
- Provide users with an intuitive feel of the inner workings of the major timing closure tools
- Demonstrate the advancements in timing closure tools technology via example designs
- Explore a variety of significant design choices


Jan. 2003 ASPDAC03 - Physical Chip Implementation 9

What you should expect:
- High level concepts presented are generally applicable across a wide range of tools / methodologies (i.e. not IBM specific)
- Specific tool internals used in this tutorial are taken from IBM tools. They should provide a reasonable "feel" as to how things are done in the industry.

Jan. 2003 ASPDAC03 - Physical Chip Implementation 10

Worldwide ASIC/PLD Sales: Top 5 Suppliers for 2001

Supplier   Revenue ($M)   Growth
IBM        2758             1.2%
Agere      1310           -43.5%
LSI        1243           -38.2%
NEC        1243           -35.2%
Xilinx     1149           -26.3%

Revenue: Millions of U.S. Dollars
Source: Gartner Dataquest (March 2002)


Jan. 2003 ASPDAC03 - Physical Chip Implementation 11

IBM ASIC Supplier #1 since 1999

[Figure: ASIC/PLD supplier rankings (positions 1 to 10) for each year from 1996 to 2001. Suppliers shown include IBM, NEC, Lucent, LSI Logic, VLSI, Xilinx, TI, Fujitsu, Toshiba, Hitachi, Altera, STM, Agilent, and Mitsubishi. Source: Dataquest 96-02]

Jan. 2003 ASPDAC03 - Physical Chip Implementation 12

Section Outline
- Introduction
- Review material (timing and synthesis)
- Introduction to placement
- Placement algorithms
- Paradigms for placement-synthesis integration
- Placement aware synthesis techniques
- Congestion avoidance / mitigation techniques
- Routing optimization


Jan. 2003 ASPDAC03 - Physical Chip Implementation 13

Static Timing Analysis

Jan. 2003 ASPDAC03 - Physical Chip Implementation 14

Timing Analysis Basics:

Why static timing since simulation is more accurate?

Simulation has a number of key drawbacks
- requires input state vectors
- long runtimes

A simple example:

[Figure: logic with inputs a, b, c and output z]

How would one calculate the worst case rising delay from a to z?

        c=0           c=1
b=0     a-z delay1    a-z delay2
b=1     a-z delay3    a-z delay4

Exponential explosion as possible design input states grow!


Jan. 2003 ASPDAC03 - Physical Chip Implementation 15

Timing Analysis Basics:

Definition of basic terms
- Arrival time (AT) -- the time at which a pin switches state
- Required arrival time (RAT) -- the time by which a signal must arrive in order to avoid a chip fail
- Slack = Required arrival time - Arrival time
  – Positive slack good, negative slack bad
- Slew -- the rate at which a signal switches
  – usually the difference between the 10% and 90% points on the voltage curve

[Figure: voltage vs. time waveform rising to vdd with the 10%, 50%, and 90% points marked; slew = time90 - time10, AT = time50]

Jan. 2003 ASPDAC03 - Physical Chip Implementation 16

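Timing Analysis Basics:

Block based timing:
- Worst value only stored at merge points
- Each segment is processed just once

Example Problem: What is slack at PO?

[Figure: timing graph with primary inputs at at=0, segment delays d = 2, 1, 5, 3, 2, 1, 3, 3, 1, and arrival times propagated forward (temporary values at=3 and at=7 appear at merge points before the worst value is kept). At the primary output at=11 and rat=10, so Slack = -1]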
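The block-based scheme can be sketched in a few lines of Python. This is a minimal illustration, not the IBM tool's implementation: the graph below is hypothetical (the slide's figure cannot be recovered from the transcript), and the names `late_mode_sta`, `edges`, `input_at`, and `output_rat` are assumptions made for the example. It is built so that the primary output ends up with at=11 against rat=10, as in the slide.

```python
from collections import defaultdict, deque

def late_mode_sta(edges, input_at, output_rat):
    """Block-based late-mode timing on a DAG.

    edges: list of (src, dst, delay); input_at: {primary input: arrival time};
    output_rat: {primary output: required arrival time}.
    """
    fanout, fanin = defaultdict(list), defaultdict(list)
    indegree = defaultdict(int)
    nodes = set(input_at) | set(output_rat)
    for src, dst, delay in edges:
        fanout[src].append((dst, delay))
        fanin[dst].append((src, delay))
        indegree[dst] += 1
        nodes.update((src, dst))

    # Topological order (Kahn's algorithm): each segment is processed just once.
    order, ready = [], deque(sorted(n for n in nodes if indegree[n] == 0))
    while ready:
        node = ready.popleft()
        order.append(node)
        for dst, _ in fanout[node]:
            indegree[dst] -= 1
            if indegree[dst] == 0:
                ready.append(dst)

    # Forward pass: only the worst (latest) arrival is stored at a merge point.
    at = {}
    for node in order:
        if fanin[node]:
            at[node] = max(at[src] + d for src, d in fanin[node])
        else:
            at[node] = input_at.get(node, 0)

    # Backward pass: only the worst (earliest) required time is stored.
    rat = {}
    for node in reversed(order):
        if fanout[node]:
            rat[node] = min(rat[dst] - d for dst, d in fanout[node])
        else:
            rat[node] = output_rat[node]

    slack = {n: rat[n] - at[n] for n in order}
    return at, rat, slack

# Hypothetical graph, not the one drawn on the slide.
edges = [("i1", "g1", 2), ("i2", "g1", 1), ("i2", "g2", 5),
         ("g1", "g3", 3), ("g2", "g3", 1), ("g3", "po", 5)]
at, rat, slack = late_mode_sta(edges, {"i1": 0, "i2": 0}, {"po": 10})
print(at["po"], slack["po"])
```

Because slack is stored on every node, the negative-slack nodes (here i2, g2, g3, po) trace out the critical path directly.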


Jan. 2003 ASPDAC03 - Physical Chip Implementation 17

Timing Analysis Basics:

What is Incremental Timing?
- Enabling small incremental changes without full retiming
- Only the direct fanin/fanout cone is processed

[Figure: the block based timing example graph after a local change. Arrival times outside the affected cone are left blank (not recomputed); the output now has at=10 against rat=10, so slack=0: it passed!]

Jan. 2003 ASPDAC03 - Physical Chip Implementation 18

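Early Mode Analysis

Definitions change as follows
– longest becomes shortest
– slack = arrival - required

[Figure: small circuit with inputs a, b, c, signals y and x, and gate delays of 1]

$AT_a = 0$, $AT_b = 1$, $AT_c = 0$
$AT_y = 1$, $AT_x = 1$, $RAT_x = 2$

$SL_x = 1 - 2 = -1$
$SL_y = 1 - 1 = 0$
$SL_a = 0 - 0 = 0$
$SL_b = 1 - 0 = 1$
$SL_c = 0 - 1 = -1$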
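The early-mode numbers can be checked with a short Python sketch. The topology used here is an assumption inferred from the values on the slide, since the figure itself did not survive extraction: a and b drive a gate with output y, y and c drive a gate with output x, and each gate has delay 1.

```python
GATE_DELAY = 1

# Assumed topology (a hypothetical reading of the figure): output -> its inputs.
fanin = {"y": ["a", "b"], "x": ["y", "c"]}
at = {"a": 0, "b": 1, "c": 0}      # arrival times at the primary inputs
rat = {"x": 2}                     # required time at the output

# Early mode: the arrival time is set by the shortest (earliest) path.
for out in ("y", "x"):             # topological order
    at[out] = min(at[i] for i in fanin[out]) + GATE_DELAY

# Required times propagate backward; with several fanouts the latest one binds.
for out in ("x", "y"):             # reverse topological order
    for i in fanin[out]:
        rat[i] = max(rat.get(i, float("-inf")), rat[out] - GATE_DELAY)

# Early-mode slack is arrival minus required.
slack = {n: at[n] - rat[n] for n in at}
print(slack)
```

Under this assumed topology the computed slacks are 0 for a, 1 for b, -1 for c, 0 for y, and -1 for x.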


Jan. 2003 ASPDAC03 - Physical Chip Implementation 19

Timing Correction

Fix electrical violations
- Resize cells
- Buffer nets
- Copy (clone) cells

Fix timing problems
- Local transforms (bag of tricks)
- Path-based transforms

Jan. 2003 ASPDAC03 - Physical Chip Implementation 20

Local Synthesis Transforms
- Resize cells
- Buffer or clone to reduce load on critical nets
- Decompose large cells
- Swap connections on commutative pins or among equivalent nets
- Move critical signals forward
- Pad early paths
- Area recovery


Jan. 2003 ASPDAC03 - Physical Chip Implementation 21

Transform Example

[Figure: Double Inverter Removal. The path delay is 4 before the transform and 2 after it]

Jan. 2003 ASPDAC03 - Physical Chip Implementation 22

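Resizing

[Figure: delay (d, 0 to 0.05) vs. load (0 to 1) curves for cell sizes A, B, and C. A gate with inputs a and b drives sinks d, e, f with loads 0.2, 0.2, and 0.3. With size A the delay is 0.035; with size C it is 0.026]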


Jan. 2003 ASPDAC03 - Physical Chip Implementation 23

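Cloning

[Figure: delay (d, 0 to 0.05) vs. load (0 to 1) curves for cell sizes A, B, and C. A gate with inputs a and b drives five sinks d, e, f, g, h with a load of 0.2 each. After cloning, two copies of the gate (labeled A and B) split the sinks between them]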

Jan. 2003 ASPDAC03 - Physical Chip Implementation 24

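Buffering

[Figure: delay (d, 0 to 0.05) vs. load (0 to 1) curves for cell sizes A, B, and C. A gate with inputs a and b drives five sinks d, e, f, g, h with a load of 0.2 each. After buffering, part of the load is moved behind an inserted buffer (labeled B), which presents a load of 0.1 to the driving gate]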


Jan. 2003 ASPDAC03 - Physical Chip Implementation 25

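Redesign Fan-in Tree

[Figure: fan-in tree of gates with delay 1 and input arrival times Arr(a)=4, Arr(b)=3, Arr(c)=1, Arr(d)=0. The original tree gives Arr(e)=6; the redesigned tree, which lets the late signals pass through fewer gates, gives Arr(e)=5]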
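One way to arrive at the redesigned tree is a greedy, Huffman-style construction: repeatedly combine the two earliest-arriving signals. The sketch below assumes 2-input gates with a fixed delay and an associative, commutative function (for example a wide AND), so that any tree shape is logically valid. The name `fanin_tree_arrival` is made up for this example, and the slide does not say which algorithm the tools actually use.

```python
import heapq

def fanin_tree_arrival(arrivals, gate_delay=1):
    """Output arrival time of a greedily built tree of 2-input gates."""
    heap = list(arrivals)
    heapq.heapify(heap)
    while len(heap) > 1:
        first = heapq.heappop(heap)    # earliest signal
        second = heapq.heappop(heap)   # next earliest signal
        # The gate output switches one gate delay after its later input.
        heapq.heappush(heap, max(first, second) + gate_delay)
    return heap[0]

# Arrival times of a, b, c, d from the slide.
best = fanin_tree_arrival([4, 3, 1, 0])
print(best)  # 5
```

With the slide's arrival times this pairs c with d first, then adds b, then a, which reproduces Arr(e)=5.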

Jan. 2003 ASPDAC03 - Physical Chip Implementation 26

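Redesign Fan-out Tree

[Figure: fan-out tree with per-stage delays. Before the redesign the longest path = 5. After the redesign the longest path = 4, even though one buffer slows down from 1 to 2 due to its added load]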


Jan. 2003 ASPDAC03 - Physical Chip Implementation 27

Decomposition

Jan. 2003 ASPDAC03 - Physical Chip Implementation 28

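Swap Commutative Pins

[Figure: a gate with inputs a, b, c shown with their arrival times and per-pin delays, before and after the connections on its commutative pins are swapped]

Simple sorting on arrival times and delay works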
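The sorting idea can be sketched as follows: connect the latest-arriving signal to the fastest pin, the next latest to the next fastest, and so on. The arrival times and pin delays below are hypothetical (the figure's values are not recoverable from the transcript), and `assign_pins` is a name invented for this example.

```python
def assign_pins(arrivals, pin_delays):
    """Pair signals with commutative pins to minimize the output arrival time.

    arrivals: {signal: arrival time}; pin_delays: {pin: pin-to-output delay}.
    """
    late_first = sorted(arrivals, key=arrivals.get, reverse=True)
    fast_first = sorted(pin_delays, key=pin_delays.get)
    assignment = dict(zip(late_first, fast_first))
    output_at = max(arrivals[s] + pin_delays[p] for s, p in assignment.items())
    return assignment, output_at

# Hypothetical values for illustration only.
arrivals = {"a": 0, "b": 1, "c": 2}
pin_delays = {"p1": 1, "p2": 2, "p3": 3}
assignment, output_at = assign_pins(arrivals, pin_delays)
print(assignment, output_at)
```

In this example the naive connection (a to p1, b to p2, c to p3) gives an output arrival of 5, while the sorted assignment gives 3.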


Jan. 2003 ASPDAC03 - Physical Chip Implementation 29

Move Critical Signals Forward

Based on ATPG
– linear in circuit size
– detects redundancies efficiently

Efficiently finds wires to be added and removed.
– Based on mandatory assignments.

[Figure: circuit on signals a, b, c, d, e before and after rewiring]

Jan. 2003 ASPDAC03 - Physical Chip Implementation 30

Section Outline
- Introduction
- Review material (timing and synthesis)
- Introduction to placement
- Placement algorithms
- Paradigms for placement-synthesis integration
- Placement aware synthesis techniques
- Congestion avoidance / mitigation techniques
- Routing optimization


Jan. 2003 ASPDAC03 - Physical Chip Implementation 31

Placement Objective:

Find optimal relative ordering of cells
- minimize wire length and congestion
- maximize timing slack

Find optimal spacing of cells
- eliminate wiring congestion problems
- provide space for post placement synthesis
  - clock trees
  - buffer insertion
  - timing correction

Find optimal Global Position

Jan. 2003 ASPDAC03 - Physical Chip Implementation 32

A B C

Optimal Relative Order:


Jan. 2003 ASPDAC03 - Physical Chip Implementation 33

A B C

To spread ...

Jan. 2003 ASPDAC03 - Physical Chip Implementation 34

A B C

.. or not to spread


Jan. 2003 ASPDAC03 - Physical Chip Implementation 35

A B C

Place to the left

Jan. 2003 ASPDAC03 - Physical Chip Implementation 36

A B C

… or to the right


Jan. 2003 ASPDAC03 - Physical Chip Implementation 37

A B C

Optimal Relative Order:

Without “free” space the problem is dominated by order

Jan. 2003 ASPDAC03 - Physical Chip Implementation 38

Placement Footprints:

Standard Cell:

Data Path:

IP - Floorplanning


Jan. 2003 ASPDAC03 - Physical Chip Implementation 39

Placement Footprints:

Mixed Data Path & sea of gates:

[Figure: footprint showing Core, Control, IO, and Reserved areas]

Jan. 2003 ASPDAC03 - Physical Chip Implementation 40

Perimeter IO

Area IO

Placement Footprints:


Jan. 2003 ASPDAC03 - Physical Chip Implementation 41

Placement objectives are subject to User Constraints / Design Style:

Hierarchical Design Constraints
- pin location
- power rail
- reserved layers

Flat Design w/ Floor Plan constraints
- Fixed circuits
- IO connections

Jan. 2003 ASPDAC03 - Physical Chip Implementation 42

Unconstrained Placement


Jan. 2003 ASPDAC03 - Physical Chip Implementation 43

Floor planned Placement

Jan. 2003 ASPDAC03 - Physical Chip Implementation 44

Congestion Map


Jan. 2003 ASPDAC03 - Physical Chip Implementation 45

Advantages of Hierarchy
- Design is carved into smaller pieces that can be worked on in parallel (improved throughput)
- A known floor plan provides the logic design team with a large degree of placement control.
- A known floor plan provides early knowledge of long wires
- Timing closure problems can be addressed by tools, logic design, and hierarchy manipulation
- Late design changes can be done with minimal turmoil to the entire design

Jan. 2003 ASPDAC03 - Physical Chip Implementation 46

Disadvantages of Hierarchy
- Results depend on the quality of the hierarchy. The logic hierarchy must be designed with PD taken into account.
- Additional methodology requirements must be met to enable hierarchy. Ex. pin assignment, macro abstract management, area budgeting, floor planning, timing budgets, etc.
- Late design changes may affect multiple components.
- Hierarchy allows divergent methodologies
- Hierarchy hinders DA algorithms. They can no longer perform global optimizations.


Jan. 2003 ASPDAC03 - Physical Chip Implementation 47

Physical Synthesis Flow

1. Synthesized netlist (wire-load models): unplaced, physically "unaware" timing
2. Cleanup: remove buffers, nominal power levels on gates
3. Initial "basic" placement: for minimal wire-length, min-cut, Steiner tree estimates, physically aware timing
4. Logical + placement optimizations
5. Timing-driven placement w/ resynthesis: for minimal netweights, based on the timing of the net
6. Physically aware logic optimizations
7. Timing improvement? Yes: keep optimizing. No more: placed netlist

Jan. 2003 ASPDAC03 - Physical Chip Implementation 48

Example of Logical + Placement Optimizations

[Figure: partitioned placement area with a Cut and a Bin marked]

- Start with a placed or unplaced netlist
- Do recursive partitioning
- During and following each partition action, apply logic optimizations such as
  - timing corrections
  - rebuffering
  - repowering
  - cloning
  - pin swapping
  - move boxes
  - … etc


Jan. 2003 ASPDAC03 - Physical Chip Implementation 49

Summary of Placement Methods

Simulated annealing
(+) High-quality, arbitrary objectives and constraints, parallelizable, easy to implement
(-) Doesn't scale

Quadratic (or, "analytic")
(+) Mathematically clean, fast (ConjGrad) solvers
(-) Solving "the wrong problem", highly illegal solutions must be legalized, fixed "anchors" needed
Example: Alpert, Nam, Villarubia QUAD+ACG placer (ICCAD-02)

Partitioning-based
(+) Fastest, scales well if multilevel used, good quality
(-) Must be heavily tuned (hMetis, MLPart), difficult to constrain, unstable results (same quality but different structure) (?)
Example: Capo (http://gigascale.org/bookshelf/)

Jan. 2003 ASPDAC03 - Physical Chip Implementation 50

Section Outline
- Introduction
- Review material (timing and synthesis)
- Introduction to placement
- Placement algorithms
- Paradigms for placement-synthesis integration
- Placement aware synthesis techniques
- Congestion avoidance / mitigation techniques
- Routing optimization


Jan. 2003 ASPDAC03 - Physical Chip Implementation 51

Overview of Common Placement Algorithms:
- Simulated Annealing
- Quadratic Placement
- Partitioning

Jan. 2003 ASPDAC03 - Physical Chip Implementation 52

Simulated Annealing:

for (temp = high; temp > absolute_zero; temp -= increment)
{
    make a random move
    score the move
    use temp dependent probability to decide to accept or reject
}

Note: Clustering can be used to improve performance
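The loop above can be made concrete with a small, runnable Python sketch. Everything specific here is an assumption chosen for illustration: cells are placed in a single row of slots, the score is total wire length over 2-pin nets, a move swaps two cells, worse moves are accepted with probability exp(-delta/temp), and the temperature is cooled geometrically rather than by subtracting an increment.

```python
import math
import random

def wirelength(order, nets):
    """Total distance, in slots, spanned by the 2-pin nets."""
    slot = {cell: i for i, cell in enumerate(order)}
    return sum(abs(slot[a] - slot[b]) for a, b in nets)

def anneal(cells, nets, temp=10.0, cooling=0.95, min_temp=0.01,
           moves_per_temp=50, seed=0):
    rng = random.Random(seed)
    order = list(cells)
    cost = wirelength(order, nets)
    best_order, best_cost = list(order), cost
    while temp > min_temp:
        for _ in range(moves_per_temp):
            i, j = rng.sample(range(len(order)), 2)     # make a random move
            order[i], order[j] = order[j], order[i]
            new_cost = wirelength(order, nets)          # score the move
            delta = new_cost - cost
            # Improvements are always accepted; worse moves are accepted
            # with a probability that shrinks as the temperature drops.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                cost = new_cost
                if cost < best_cost:
                    best_order, best_cost = list(order), cost
            else:
                order[i], order[j] = order[j], order[i]  # reject: undo the swap
        temp *= cooling
    return best_order, best_cost

# Hypothetical netlist: a chain a-b-c-...-h, starting from a scrambled row.
nets = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"),
        ("e", "f"), ("f", "g"), ("g", "h")]
start = ["a", "e", "c", "g", "b", "f", "d", "h"]
best_order, best_cost = anneal(start, nets)
print(wirelength(start, nets), best_cost)
```

New constraints are added by changing only `wirelength` (the scoring function), which is the "dumb moves / smart scoring" property that makes annealing easy to extend.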


Jan. 2003 ASPDAC03 - Physical Chip Implementation 53

Annealing:

Pros:
- ease of implementation, dumb moves / smart scoring
- can easily accommodate new constraints - just add them to the scoring function
- great quality
- can be made to run on parallel processors

Cons:
- very long run time

Jan. 2003 ASPDAC03 - Physical Chip Implementation 54

Quadratic Placement


Jan. 2003 ASPDAC03 - Physical Chip Implementation 55

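Review:

[Figure: movable objects x1 and x2 on a line between fixed points at x=100 and x=200]

$$\mathrm{Cost} = (x_1 - 100)^2 + (x_1 - x_2)^2 + (x_2 - 200)^2$$

$$\frac{\partial\,\mathrm{Cost}}{\partial x_1} = 2(x_1 - 100) + 2(x_1 - x_2)$$

$$\frac{\partial\,\mathrm{Cost}}{\partial x_2} = -2(x_1 - x_2) + 2(x_2 - 200)$$

Setting the partial derivatives = 0, we solve for the minimum Cost:

$$Ax + B = 0$$

$$\begin{bmatrix} 4 & -2 \\ -2 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} -200 \\ -400 \end{bmatrix} = 0$$

$$\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} -100 \\ -200 \end{bmatrix} = 0$$

$$x_1 = 400/3, \quad x_2 = 500/3$$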

Jan. 2003 ASPDAC03 - Physical Chip Implementation 56

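Review:

[Figure: movable objects x1 and x2 on a line between fixed points at x=100 and x=200]

Setting the partial derivatives = 0, we solve for the minimum Cost:

$$Ax + B = 0$$

$$\begin{bmatrix} 4 & -2 \\ -2 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} -200 \\ -400 \end{bmatrix} = 0$$

$$\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} -100 \\ -200 \end{bmatrix} = 0$$

$$x_1 = 400/3, \quad x_2 = 500/3$$

Interpretation of matrices A and B (after dividing through by 2):
- The diagonal values A[i,i] correspond to the number of connections to x_i
- The off-diagonal values A[i,j] are -1 if object i is connected to object j, 0 otherwise
- The values B[i] correspond to the negated sum of the locations of fixed objects connected to object i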
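The interpretation of A and B suggests that both can be assembled directly from the netlist. The sketch below assumes 2-pin connections and the sign convention of Ax + B = 0 used in the example, under which connected movable pairs contribute -1 off the diagonal and fixed neighbors contribute their negated location to B. The function name `build_system` and the object names are invented for illustration.

```python
def build_system(movable, fixed, nets):
    """Assemble A and B for A x + B = 0 from 2-pin connections.

    movable: list of movable object names; fixed: {name: location};
    nets: list of (object, object) connections.
    """
    index = {name: i for i, name in enumerate(movable)}
    n = len(movable)
    A = [[0.0] * n for _ in range(n)]
    B = [0.0] * n
    for u, v in nets:
        for p, q in ((u, v), (v, u)):
            if p in index:
                A[index[p]][index[p]] += 1.0        # one more connection on p
                if q in index:
                    A[index[p]][index[q]] -= 1.0    # movable neighbor
                else:
                    B[index[p]] -= fixed[q]         # fixed neighbor location
    return A, B

# The two-object example from the slide: 100 -- x1 -- x2 -- 200.
A, B = build_system(["x1", "x2"],
                    {"left": 100.0, "right": 200.0},
                    [("left", "x1"), ("x1", "x2"), ("x2", "right")])
print(A, B)
```

For the slide's example this reproduces the reduced system, with A = [[2, -1], [-1, 2]] and B = [-100, -200].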


Jan. 2003 ASPDAC03 - Physical Chip Implementation 57

Why formulate the problem this way?

- Because we can
- Because it is trivial to solve
- Because there is only one solution
- Because the solution is a global optimum
- Because the solution conveys "relative order" information
- Because the solution conveys "global position" information

Jan. 2003 ASPDAC03 - Physical Chip Implementation 58

However:
- Solution is not legal
- Solution depends on fixed anchor points
- Solution does not minimize linear wire length, congestion, or timing
- Solution is generally highly overlapping w/ high density (i.e. needs to be spread out)


Jan. 2003 ASPDAC03 - Physical Chip Implementation 59

What does the solution look like?

- To get an intuitive feel for the solution, examine the relaxation method for solving Ax + B = 0
- Actual program implementation may use other solution methods (that are generally less intuitive).
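One common relaxation scheme (Gauss-Seidel style) gives the intuition directly: sweep over the movable objects and move each one to the average location of the objects it connects to, until nothing moves. The sketch below assumes 2-pin connections with equal weights; `relax` and the object names are hypothetical, and the slides do not confirm that this is the exact variant they illustrate.

```python
def relax(neighbors, fixed, sweeps=100):
    """Iteratively solve A x + B = 0 by local averaging.

    neighbors: {movable object: [connected objects]};
    fixed: {fixed object: location}.
    """
    pos = dict(fixed)
    pos.update({name: 0.0 for name in neighbors})   # arbitrary starting point
    for _ in range(sweeps):
        for name, nbrs in neighbors.items():
            # Zero net force on an object means it sits at the average
            # location of its neighbors.
            pos[name] = sum(pos[n] for n in nbrs) / len(nbrs)
    return {name: pos[name] for name in neighbors}

# The two-object example: 100 -- x1 -- x2 -- 200.
solution = relax({"x1": ["left", "x2"], "x2": ["x1", "right"]},
                 {"left": 100.0, "right": 200.0})
print(solution)
```

The result converges to x1 = 400/3 and x2 = 500/3, matching the direct solution, and each object visibly settles between its neighbors, which is why the fixed anchor points determine the whole placement.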

Jan. 2003 ASPDAC03 - Physical Chip Implementation 60

Solution of Quadratic using Relaxation:


Jan. 2003 ASPDAC03 - Physical Chip Implementation 61

Jan. 2003 ASPDAC03 - Physical Chip Implementation 62

Constrained Solutions:

- Sometimes we want to solve for the minimum wirelength subject to a constraint
- Example: Using quadratic for partitioning, we may want the quadratic placement to be "centered"


Jan. 2003 ASPDAC03 - Physical Chip Implementation 63

Jan. 2003 ASPDAC03 - Physical Chip Implementation 64


Jan. 2003 ASPDAC03 - Physical Chip Implementation 65

Constrained Solutions

To minimize Cost = f(x) subject to a constraint g(x) = 0, we can use Lagrangian multipliers to modify the Cost function as follows:

$$\mathrm{Cost} = f(x) + \lambda\, g(x)$$

$$\nabla_x \mathrm{Cost} = \nabla_x f(x) + \lambda\, \nabla_x g(x)$$

Using CG as a constraint:

$$CG = \frac{\sum_{i=1}^{n} s_i x_i}{\sum_{i=1}^{n} s_i}$$

where $s_i$ is the size of object i and n is the number of objects.

$$g(x) = \left( \frac{\sum_{i=1}^{n} s_i x_i}{\sum_{i=1}^{n} s_i} \right) - CG$$

$$\nabla_x g(x) = \frac{s_i}{N}$$

where we use N to represent the constant $\sum_{i=1}^{n} s_i$.

We have already shown that $\nabla_x f(x) = 0$ leads to the system of equations $Ax + B = 0$.

Therefore solving the constrained problem $\nabla_x \mathrm{Cost} = \nabla_x f(x) + \lambda\, \nabla_x g(x) = 0$ leads to:

$$Ax + B + \lambda \frac{s_i}{N} = 0$$

Jan. 2003 ASPDAC03 - Physical Chip Implementation 66

Constrained Solutions (cont):

To solve $Ax + B + \lambda \frac{s_i}{N} = 0$ we could use a packaged solver, add the additional unknown $\lambda$ and the equation $CG = \frac{\sum_{i=1}^{n} s_i x_i}{N}$ to our matrices, and solve.

Here is an alternative way to solve the system:

By substitution we let $x = x_u + \lambda x_l$, where $x_u$ is the unconstrained solution (i.e. the solution to $Ax + B = 0$).

Assuming we can solve the unconstrained problem, $x_u$ is known.

By substitution we get:

$$A(x_u + \lambda x_l) + B + \lambda \frac{s_i}{N} = 0$$

which becomes:

$$A \lambda x_l + \lambda \frac{s_i}{N} = 0 \quad \text{or} \quad A x_l + \frac{s_i}{N} = 0$$


Jan. 2003 ASPDAC03 - Physical Chip Implementation 67

Constrained Solutions (cont):

We need to solve: $A x_l + \frac{s_i}{N} = 0$

Note: The A matrix is the same as the A matrix for the unconstrained solution. Since the A matrix is the netlist connectivity specification, we have A.

The B matrix here is $\frac{s_i}{N}$ instead of the sum of fixed location connects.

Interpretation:

The solution to $A x_l + \frac{s_i}{N} = 0$ can be obtained by modifying the original netlist and placement such that:

1.) All fixed objects are moved to x = 0
2.) A constant force vector is applied to each object. The constant force vector for the i'th object has magnitude $\frac{s_i}{N}$

Then use the same solver as was used to solve $Ax + B = 0$.

Jan. 2003 ASPDAC03 - Physical Chip Implementation 68

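Constrained Solutions (cont):

We also need to solve for $\lambda$.

From the CG relationship and $x = x_u + \lambda x_l$ we get:

$$CG = \frac{\sum_{i=1}^{n} s_i (x_{u,i} + \lambda x_{l,i})}{N}$$

where $N = \sum_{i=1}^{n} s_i$ (i.e. total size).

Since we have solved for $x_u$ and $x_l$, the only unknown is $\lambda$, and we get:

$$\lambda = \frac{N \cdot CG - \sum_{i=1}^{n} s_i x_{u,i}}{\sum_{i=1}^{n} s_i x_{l,i}}$$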


Jan. 2003 ASPDAC03 - Physical Chip Implementation 69

Constrained Solutions (summary):

To minimize f(x) (the wire length squared cost function) subject to a CG constraint we do the following:

1.) Solve for $x_u$ by solving $A x_u + B = 0$ using relaxation or some other method

2.) Solve for $x_l$ in $A x_l + \frac{s_i}{N} = 0$ as follows:
    - Move all fixed objects to location = 0
    - Add a constant force vector to each object. The constant force vector for the i'th object has magnitude $\frac{s_i}{N}$
    - Using relaxation or some other method, solve for $x_l$

3.) Solve for $\lambda$ using $\lambda = \frac{N \cdot CG - \sum_{i=1}^{n} s_i x_{u,i}}{\sum_{i=1}^{n} s_i x_{l,i}}$

4.) Compute the final placement using $x = x_u + \lambda x_l$

Jan. 2003 ASPDAC03 - Physical Chip Implementation 70

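Review:

[Figure: movable objects x1 (s=10) and x2 (s=100) between fixed points at x=100 and x=200]

Force CG to 150

From the previous example we know that the solution to $A x_u + B = 0$ is:

$$x_u = \begin{bmatrix} 133.33 \\ 166.67 \end{bmatrix}$$

With this solution the CG is at $\frac{(10 \cdot 133.33) + (100 \cdot 166.67)}{110} = 163.64$ (i.e. not 150).

Now we need to solve:

$$A x_l + \frac{s_i}{N} = 0$$

which is the same as solving:

$$A x_l + \begin{bmatrix} 0 \\ 0 \end{bmatrix} + \frac{s_i}{N} = 0$$

Recall that the B matrix represents the position of fixed objects. So, this equation represents the solution to:

[Figure: the same netlist with both fixed objects moved to x = 0 and constant forces of 10/110 on x1 and 100/110 on x2]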
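The constrained procedure can be traced numerically on this example. The sketch below is illustrative only: it assumes the reduced system A = [[2, -1], [-1, 2]], B = [-100, -200], sizes of 10 and 100, and a target CG of 150, and it uses a small Gauss-Seidel relaxation as the solver. The names `gauss_seidel` and `constrained_placement` are made up for this example, and the symbol written as "2" in the extracted formulas is taken to be the Lagrange multiplier, called `lam` here.

```python
def gauss_seidel(A, b, sweeps=200):
    """Solve A x = b by relaxation (A is assumed diagonally dominant)."""
    n = len(b)
    x = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            off = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - off) / A[i][i]
    return x

def constrained_placement(A, B, sizes, cg_target):
    n_total = sum(sizes)                                  # N = total size
    xu = gauss_seidel(A, [-v for v in B])                 # A xu + B = 0
    xl = gauss_seidel(A, [-s / n_total for s in sizes])   # A xl + s/N = 0
    # Multiplier that pulls the center of gravity onto the target.
    lam = ((n_total * cg_target - sum(s * u for s, u in zip(sizes, xu)))
           / sum(s * l for s, l in zip(sizes, xl)))
    x = [u + lam * l for u, l in zip(xu, xl)]             # x = xu + lam * xl
    return x, xu, xl, lam

A = [[2.0, -1.0], [-1.0, 2.0]]
B = [-100.0, -200.0]
sizes = [10.0, 100.0]
x, xu, xl, lam = constrained_placement(A, B, sizes, cg_target=150.0)
cg = sum(s * xi for s, xi in zip(sizes, x)) / sum(sizes)
print(xu, xl, lam, x, cg)
```

Because `xu` and `xl` do not depend on the target, trying a different CG only repeats the two cheap final lines of `constrained_placement`, not the relaxation.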


Jan. 2003 ASPDAC03 - Physical Chip Implementation 71

Constrained Solutions (summary):

Advantages of this approach:

1.) The solver data structure is the netlist only, i.e. no additional memory requirements

2.) Sometimes the unconstrained solution is by itself sufficient, therefore we can avoid the additional overhead of producing the constrained solution

3.) The numerical iterations in this method are NOT dependent on the CG. We can solve for xu and xl, then try many different CG points at very low cost.

Jan. 2003 ASPDAC03 - Physical Chip Implementation 72

Quadratic Techniques:

Pros:
- mathematically well behaved
- efficient solution techniques find global optimum
- great quality

Cons:
- solution of Ax + B = 0 is not a legal placement, so generally some additional partitioning techniques are required.
- solution of Ax + B = 0 is that of the "mapped" problem, i.e. nets are represented as cliques, and the solution minimizes wire length squared, not linear wire length unless additional methods are deployed
- fixed IOs are required for these techniques to work well


Jan. 2003 ASPDAC03 - Physical Chip Implementation 73

Partitioning

Jan. 2003 ASPDAC03 - Physical Chip Implementation 74

Partitioning:

Objective:

Given a set of interconnected blocks, produce two sets that are of equal size, and such that the number of nets connecting the two sets is minimized.


Jan. 2003 ASPDAC03 - Physical Chip Implementation 75

FM Partitioning:

[Figure: Initial Random Placement, After Cut 1, After Cut 2]

list_of_sets = entire_chip;
while (any_set_has_2_or_more_objects(list_of_sets))
{
    for_each_set_in(list_of_sets)
    {
        partition_it();
    }
    /* each time through this loop the number of */
    /* sets in the list doubles. */
}
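The recursive driver loop can be sketched in Python as follows. The partitioner itself is passed in as a function; the `halve` function used here is only a stand-in that splits a list down the middle, where a real flow would call a min-cut partitioner such as an FM pass. Both names are hypothetical.

```python
def recursive_bisection(objects, bisect):
    """Repeatedly split every set with 2 or more objects into two sets."""
    sets = [list(objects)]
    levels = 0
    while any(len(s) >= 2 for s in sets):
        next_sets = []
        for s in sets:
            if len(s) >= 2:
                left, right = bisect(s)      # plays the role of partition_it()
                next_sets.extend([left, right])
            else:
                next_sets.append(s)          # single objects are left alone
        sets = next_sets
        levels += 1                          # the number of sets roughly doubles
    return sets, levels

def halve(s):
    """Placeholder partitioner: split down the middle, ignoring connectivity."""
    mid = len(s) // 2
    return s[:mid], s[mid:]

sets, levels = recursive_bisection(range(8), halve)
print(levels, sets)
```

With 8 objects the loop runs 3 times, ending with 8 single-object sets.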

FM Partitioning:

Moves are made based on object gain.

Object Gain: The amount of change in cut crossings that will occur if an object is moved from its current partition into the other partition.

- each object is assigned a gain
- objects are put into a sorted gain list
- the object with the highest gain from the smaller of the two sides is selected and moved
- the moved object is "locked"
- gains of "touched" objects are recomputed
- gain lists are resorted

[Figure: example netlist with a gain value annotated on each object]
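A simplified sketch of one gain-driven pass is shown below. It is not the presentation's implementation: gains are recomputed by brute force instead of being kept in sorted gain lists, balance is enforced with two assumed slack parameters (`move_slack`, `final_slack`) rather than by the side-selection rule on the slide, and the six-cell netlist is invented for illustration.

```python
def cut_size(nets, side):
    """Number of nets with objects on both sides of the cut."""
    return sum(1 for net in nets if len({side[c] for c in net}) == 2)

def gain(cell, nets, side):
    """Reduction in cut crossings if `cell` moves to the other side."""
    flipped = dict(side)
    flipped[cell] = 1 - side[cell]
    return cut_size(nets, side) - cut_size(nets, flipped)

def imbalance(side):
    """Absolute difference between the two side sizes (unit-size cells)."""
    ones = sum(side.values())
    return abs(len(side) - 2 * ones)

def fm_pass(nets, side, move_slack=2, final_slack=1):
    """Move the best unlocked cell, lock it, repeat; keep the best
    sufficiently balanced partition seen along the way."""
    side = dict(side)
    locked = set()
    best_side, best_cut = dict(side), cut_size(nets, side)
    while True:
        candidates = []
        for c in side:
            if c in locked:
                continue
            trial = dict(side)
            trial[c] = 1 - side[c]
            if imbalance(trial) <= move_slack:
                candidates.append(c)
        if not candidates:
            break
        cell = max(candidates, key=lambda c: gain(c, nets, side))
        side[cell] = 1 - side[cell]
        locked.add(cell)  # a moved object is locked for this pass
        cut = cut_size(nets, side)
        if cut < best_cut and imbalance(side) <= final_slack:
            best_side, best_cut = dict(side), cut
    return best_side, best_cut

# Hypothetical netlist: two triangles (a,b,c) and (d,e,f) joined by net c-d.
nets = [("a", "b"), ("b", "c"), ("a", "c"),
        ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
start = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1}
best_side, best_cut = fm_pass(nets, start)
print(cut_size(nets, start), "->", best_cut, best_side)
```

Note that the pass keeps making moves even when every remaining gain is negative and only afterwards rolls back to the best partition seen; this is what lets the method climb out of local minima.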

FM Partitioning:

[Figure sequence: step-by-step FM example on the same netlist; the annotated object gains change from slide to slide as objects are moved and locked]

Partitioning:

Pros:
- very fast
- great quality
- scales nearly linearly with problem size

Cons:
- non-trivial to implement
- very directed algorithm, but this limits the ability to deal with miscellaneous constraints

FM Partitioning

- For large designs min-cut (FM) produces poor results

To Compensate, there are two widely used enhancements:

1.) Quadratic seeding

2.) Multi-Level partitioning

Partitioning:

[Figure: objects moved across the cut line, labeled move1, move2, move3, move4]

Global Placement - Multi-Level Partitioning:

[Figure: clustered netlist with per-object gain values and moves labeled move1, move2, move3, move4]

generate clusters;
while (there are clusters) {
    partition_it;
    remove 1 cluster layer;
}
partition_it;

[Figure sequence: multi-level partitioning example continued over successive slides, with moves move1 to move4 and per-object gain values]

MLP/FM Partitioning Cons:

- Does not know how to handle “free” space
- Results tend to be erratic, i.e. results from run to run have significant variation

MLP/FM Partitioning Pros:

- Handles designs that have no fixed connection points
- Very fast - can handle large designs

Hybrid Techniques

- Use both MLP and Quadratic techniques
- Results are more predictable due to quadratic cost function
- Partitioning is used for overlap removal
- Quadratic is used for “free” space handling and some relative order indications

Quadratic Partitioning

Analytical Constraint Generation

- Combine Quadratic techniques with MLP
- Use Quadratic solution to determine global position (i.e. balance)
- Use MLP to determine relative ordering of cells

Analytical Constraint Generation

[Figure: two bins with Capacity = 2 each and cells of Area = 1; labels: Quadratic solution, Analytical constraint, ACG solution, Poor Solution]

[Figure: MLP w/ ACG]

Global Route Results:

[Figure: MLP w/o ACG]

Side by Side Comparison:

[Figure panels: Original; ACG]

Observations on Quadratic Placement

- placements are predictable and repeatable
- timing is inherently better
- wire length is not the best, but good
- run time: slower than MLP by 4x
- run time: faster than annealing by 4x
- excellent “free space” handling
- placements “feel” similar to those produced by annealing

Repeatability Example:

- One circuit
- Minimum linear length occurs for all solutions where y=50, 0 < x < 100
- Minimum quadratic length occurs for y=50, x=50
- Quadratic solution IS both minimum linear and minimum quadratic length

[Figure: pin coordinates labeled (0,50) and (0,100)]
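The repeatability argument can be checked numerically with a small sketch. It assumes one movable circuit connected to two fixed pins at (0,50) and (100,50), which is what the stated minima imply (the extracted figure labels read (0,50) and (0,100)); the function names and the sampling grid are illustrative.

```python
PINS = [(0, 50), (100, 50)]  # assumed fixed pin locations

def linear_length(x, y):
    """Manhattan wire length from (x, y) to both pins."""
    return sum(abs(x - px) + abs(y - py) for px, py in PINS)

def quadratic_length(x, y):
    """Sum of squared Euclidean distances from (x, y) to both pins."""
    return sum((x - px) ** 2 + (y - py) ** 2 for px, py in PINS)

candidates = [(x, 50) for x in range(0, 101, 10)]
for x, y in candidates:
    print(x, y, linear_length(x, y), quadratic_length(x, y))

best_quadratic = min(candidates, key=lambda p: quadratic_length(*p))
print("quadratic optimum:", best_quadratic)
```

Every candidate on the line y=50 has the same linear length, so a linear objective cannot pick one of them repeatably, while the quadratic objective has a single minimum that is also one of the linear minima.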

Section Outline

- Introduction
- Review material (timing and synthesis)
- Introduction to placement
- Placement algorithms
- Paradigms for placement-synthesis integration
- Placement aware synthesis techniques
- Congestion avoidance / mitigation techniques
- Routing optimization

Synthesis - Placement Interface

Partitioning Algorithm: Partition & Reflow

[Flowchart: Netlist -> Read Data -> Preprocessing -> Global Placement loop, While (any partition has > 2 cells): Divide each Partition, Reflow across partitions -> Detailed Placement -> Done, with Synthesis boxes attached to the flow]

What Synthesis Can do when Invoked:

- add boxes
- delete boxes
- add nets
- delete nets
- reconnect nets
- change box sizes
- query placement locations of boxes
- query "bin" statistics
- remove a box from a bin
- add a box to a bin

Placement and Synthesis Integration

Loosely coupled: (methodology coupling)
- do some synthesis, then write out data
- do some placement, then write out data
- repeat

Interleaved: (placement & synthesis in same process)
- do pre-pd synthesis
- for each placement step, redo synthesis

Tightly coupled: (simultaneous P&S aware transforms)

Loosely Coupled Placement & Synthesis:

Characteristics:
- Placement is treated as a black box
- Multiple placement runs are made

[Flowchart: Do Placement -> Analyze -> Meet Objectives? Yes: Done w/placement. No: re-synthesize, Generate Constraints, then Do Placement again]

Interleaved Placement & Synthesis:

Characteristics:
- the placement flow is the same as in a placement only methodology
- in between each step of the placement progression, synthesis is invoked

[Flowchart: placement flow with Synthesis boxes between the steps]

Tightly Coupled

- Placement and synthesis algorithms become co-dependent
- Placement algorithms have awareness of synthesis activity
- Synthesis algorithms have awareness of placement activity


Summary of Techniques

Placement-Driven X (PDX)
- Cloning, Spreading, Sizing, Fanout Reclustering, … (Constant-Delay Methodology)

Buffer Insertion
- Key problem: Max RAT at Source Interconnect Tree synthesis
- Heuristic search over topologies + van Ginneken dynamic programming (with practical limits on polarity, buffer location, buffer library, etc. richness of formulation)
- C-Tree (IBM), recent Q-Tree (UCSD), P/S/U-Tree (UIC), etc.
- Early timing analysis (slew rate, cap load control): UCSD+IBM

Buffer Block Planning + Buffered Global Routing
- DAC-2001 Best Paper from IBM (buffer bays)
- ASPDAC-2002 Best Paper from UCSD (delay-bounded floorplan evaluation with given buffer plan)
- Primal-Dual Multi-Commodity Flow approximation

Placement Driven Cloning

[Figure: a gate driving one critical and one non-critical path]

Cloning to off-load non-critical path from critical path

Placement Driven Expansion

[Figure: Expansion Transformation, an AO gate placed among Logic blocks before and after expansion]

Expansion allows primitives to be placed in a more timing friendly way

Example: Tightly Coupled Placement Driven Expansion

Tightly Coupled Synthesis & Placement:

[Figure: Transform; inputs labeled a, b, c, d, e, f, g before, and grouped as ad, bef, cg after]

Tightly Coupled Synthesis & Placement Example:

Suppose the primary IO constraints look like this:

[Figure: primary IOs a, b, c, d, e, f, g positioned around the block]

The placement of the synthesized netlist would look something like this:

[Figure: placement of the synthesized netlist with IOs a to g]

If we could re-synthesize the netlist, we could get something that looks like this.

[Figure: placement of the re-synthesized netlist with IOs a to g]

Tightly Coupled Synthesis & Placement:

[Figure: PD-MAP applied to inputs abc and defg; net weights labeled weight = 1/10 and weight = 1]

Tightly Coupled Synthesis & Placement example:

Map_TREE
For each cut
    partition_it
    For each partition
        If (partition number > M) {
            if (related_node_count < N)
                merge_nodes
            if (related_node_count == 1)
                merge_node into neighbor
            partition
        }
    end
end

[Figure sequence: Tightly Coupled Synthesis & Placement Example, shown step by step over the IOs a to g and ending with a panel labeled Result]

Placement Driven Timing Correction

Redesign Fan-in Tree

[Figure: fan-in tree with inputs a, b, c, d and output e, with delay 1 on each gate. Labels: Arr(a)=4, Arr(b)=3, Arr(c)=1, Arr(d)=0; Arr(e)=6 for the original tree and Arr(e)=5 for the redesigned tree; a further label reads Arr(e)=0]
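The arrival times in the fan-in tree figure can be reproduced with a short sketch. It assumes two-input gates with a delay of 1 each and uses a Huffman-style greedy rule (always combine the two earliest signals) as one possible way to redesign the tree; the presentation does not state which algorithm it uses, and the function names are illustrative.

```python
import heapq

GATE_DELAY = 1  # assumed delay of every two-input gate

def balanced_tree_arrival(arrivals):
    """Pair inputs in the given order, level by level."""
    level = list(arrivals)
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            nxt.append(max(level[i], level[i + 1]) + GATE_DELAY)
        if len(level) % 2:
            nxt.append(level[-1])  # odd signal passes to the next level
        level = nxt
    return level[0]

def greedy_tree_arrival(arrivals):
    """Always combine the two earliest-arriving signals first."""
    heap = list(arrivals)
    heapq.heapify(heap)
    while len(heap) > 1:
        first = heapq.heappop(heap)
        second = heapq.heappop(heap)
        heapq.heappush(heap, max(first, second) + GATE_DELAY)
    return heap[0]

arr = {"a": 4, "b": 3, "c": 1, "d": 0}
original = balanced_tree_arrival([arr[k] for k in "abcd"])
redesigned = greedy_tree_arrival(arr.values())
print(original, redesigned)
```

Under these assumptions the balanced tree (a,b),(c,d) gives Arr(e)=6, while merging the early inputs c and d first, then b, then the late input a gives Arr(e)=5, matching the two values in the figure.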

Placement Driven Repowering

- Repowering is traditionally done using load based cell characterization
- Placement changes continuously during partitioning
- Need high efficiency algorithms to do repowering in this environment
- Solution: Use Gain Based Formulation

Delay Models

Load based formulation:

$d = k_1 \cdot C_{out} + p$

Gain based formulation:

$d = l \cdot g + p$, where $g = \frac{C_{out}}{C_{in}}$

- d: delay
- l: logical effort
- g: gain
- p: intrinsic delay

[Figure: gate with input capacitance $C_{in}$ driving load $C_{out}$; the slide also shows an inverter delay expression $d_{inv}$ in terms of $c_1$, $\beta$, $E$, $C_{out}/C_{in}$ and $p$, and relates $k_1$ to $l$]
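A minimal sketch of the two delay models follows. The numeric values are hypothetical, and the relation `k1 = l / c_in` is an assumption obtained by equating the two formulas for a cell of fixed size; it is not stated explicitly in the extracted slide text.

```python
def load_based_delay(k1, c_out, p):
    """Load based formulation: d = k1 * Cout + p."""
    return k1 * c_out + p

def gain_based_delay(l, g, p):
    """Gain based formulation: d = l * g + p, with g = Cout / Cin."""
    return l * g + p

# Hypothetical cell: logical effort l, intrinsic delay p.
l, p = 0.009, 0.0295
c_in, c_out = 0.2, 0.7

g = c_out / c_in
k1 = l / c_in  # assumption: both models describe the same sized cell

d_load = load_based_delay(k1, c_out, p)
d_gain = gain_based_delay(l, g, p)
print(d_load, d_gain)

# For a fixed gain, the gain based delay does not depend on the cell size.
for size in (0.1, 0.2, 0.4):
    print(size, gain_based_delay(l, (g * size) / size, p))
```

The second loop illustrates why the gain based view is "sizeless": scaling Cin and Cout together leaves g, and therefore d, unchanged.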

Area vs Delay Centric

Load Based Paradigm
- (load-based delay eq.)
- sized

Know:
- Size of each cell
- Total area -> area centric

Don't know:
- Wire loads
- Delay of each cell
- Delay of a path

Estimation error is in the delay: local 'path based' property.

Gain Based Paradigm
- (gain based delay eq.)
- sizeless

Know:
- The delay of each cell
- The delay of a path -> delay centric

Don't know:
- Wire loads
- The area of each cell
- The total area

Estimation error is in the area: global property.

Design Flow

[Flowchart: High Level Synthesis -> Restructuring -> Tech Mapping -> Late Timing Corr; further boxes: Library Analysis, Gain Based Opt, Discretization; delay models: Gain Based Delay, Load Based Delay (DCL)]

Power Levels (Gate Sizes)

[Plot: delay d (0 to 0.05) versus $C_{out}$ (0 to 1), curves A, B, C]

Library (Gain) Analysis

[Plot: delay d (0 to 0.05) versus gain g (0.5 to 10.5), curves A, B, C, with $g = \frac{C_{out}}{C_{in}}$ and $d = l \cdot g + p$]

Area and Load Calculation

- Start at primary outputs / register inputs.
- Much like static timing analysis.
- Incremental.

[Figure: three-stage chain with gains g1, g2, g3 driving $C_{out} = 0.71$.]

$C_3 = C_{out} / g_3$

$C_2 = C_3 / g_2$

$C_1 = C_2 / g_1$
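The backward load calculation can be sketched in a few lines of Python. This is only an illustration: the function name `propagate_loads` and the gain values are assumptions, not taken from the slides; only $C_{out} = 0.71$ comes from the figure.

```python
def propagate_loads(c_out, gains):
    """Walk a gate chain from its output back to its input.

    gains is [g1, ..., gN] from the first stage to the last.
    Returns [C1, ..., CN], the input capacitance of each stage,
    using C_i = C_(i+1) / g_i with C_(N+1) = c_out.
    """
    caps = []
    load = c_out
    for g in reversed(gains):
        load = load / g          # input cap of this stage
        caps.append(load)
    caps.reverse()
    return caps


# Hypothetical gains for a three-stage chain driving Cout = 0.71.
caps = propagate_loads(0.71, [2.0, 1.0, 2.0])
print(caps)
```

Because each stage depends only on the capacitance it drives, a change at one node only requires recomputing the stages upstream of it, which is what makes the calculation incremental.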

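Gain Calculation

[Figure: four-stage path with stage delays d1 to d4, input capacitance Cin, and load Cout.]

$D = \sum_{i=1}^{N} d_i = d_1 + d_2 + d_3 + d_4$

$G = \prod_{i=1}^{N} g_i = g_1 g_2 g_3 g_4 = \frac{C_2}{C_{in}} \cdot \frac{C_3}{C_2} \cdot \frac{C_4}{C_3} \cdot \frac{C_{out}}{C_4} = \frac{C_{out}}{C_{in}}$

Minimize $D$ such that $G = C_{out} / C_{in}$.

With $d = g \cdot l + p = f + p$:

$D = \sum_{i=1}^{N} f_i + \sum_{i=1}^{N} p_i$

$F = \prod_{i=1}^{N} f_i = \prod_{i=1}^{N} l_i \cdot \prod_{i=1}^{N} g_i = L \cdot \frac{C_{out}}{C_{in}}$

Minimize $\sum f_i$ such that $\prod f_i = F$.

Solution: $f_i = f$ (geometric mean), that is, $f = F^{1/N}$.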

Example I

[Figure: NAND2 -> NOR2 -> INV chain with $C_{in} = 0.19$, internal node capacitances $C_2$ and $C_3$, and $C_{out} = 0.71$; stage delays d1, d2, d3.]

NAND2: $d_1 = 0.0308 + 0.011 \times C_2 / C_{in}$
NOR2: $d_2 = 0.0496 + 0.021 \times C_3 / C_2$
INV: $d_3 = 0.0295 + 0.009 \times C_{out} / C_3$

$F = f_1 f_2 f_3 = f^3 = L \cdot C_{out} / C_{in} = 0.000008$

$f = 0.0203$

        Nand2    Nor2     Inv      Path
p       0.0308   0.0496   0.0295   0.1099
f       0.0203   0.0203   0.0203   0.0609
d       0.0511   0.0699   0.0498   0.1708
Cin     0.19     0.3364   0.3283   -
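The equal-effort solution of Example I can be recomputed with a short Python sketch. The function name `equal_effort_sizing` and the tuple layout are illustrative assumptions; the coefficients are the rounded values printed on the slide. Recomputing from those rounded coefficients gives $f \approx 0.0198$, close to but not identical to the slide's 0.0203, presumably because the displayed coefficients are rounded.

```python
import math


def equal_effort_sizing(c_in, c_out, stages):
    """Size a chain so that every stage has the same effort f = g * l.

    stages is a list of (name, p, l) from input to output.
    Returns (f, rows, final_load) where rows holds
    (name, input_cap, gain, delay) per stage.
    """
    big_l = math.prod(l for _, _, l in stages)
    big_f = big_l * c_out / c_in            # F = L * Cout / Cin
    f = big_f ** (1.0 / len(stages))        # geometric mean effort
    rows = []
    cap = c_in
    for name, p, l in stages:
        g = f / l                           # gain that yields effort f
        rows.append((name, cap, g, f + p))  # d = f + p
        cap = cap * g                       # load seen by this stage
    return f, rows, cap


stages = [("NAND2", 0.0308, 0.011),
          ("NOR2", 0.0496, 0.021),
          ("INV", 0.0295, 0.009)]
f, rows, final_load = equal_effort_sizing(0.19, 0.71, stages)
path_delay = sum(d for _, _, _, d in rows)
print(f, path_delay, final_load)
```

The final load returned by the forward walk equals $C_{out}$, which confirms that the chosen gains satisfy the constraint $G = C_{out} / C_{in}$.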

Constant Delay Calculation

Inverter ($d = g \cdot l + p$):
- Set: $g = 3.6$
- Calculate: $C_{out} = g \cdot C_{in}$
- Measure: $d_c$
- Set: $C_{out} = 0$
- Measure: $p = d_o$
- Calculate: $g \cdot l = d_c - p = f_c$

Other gates: $g \cdot l = f_c$, for example $d_c = g_{nand} \cdot l_{nand} + p_{nand}$, with $g_{nand} = 2.5$ and $g_{nor} = 1.8$.

[Figures: delay versus $C_{out}$ with the constant delay $d_c$ marked.]

Discretization

- From the gain-based model back to appropriate power levels.
- There is an error in timing/load when 'ideal' power levels are not available.
- Goal: minimize this error.
- Can be tuned to delay error or capacitance error.

[Figure: three-stage chain with gains g1, g2, g3 driving $C_{out} = 0.71$.]

$C_3 = C_{out} / g_3$, $C_2 = C_3 / g_2$, $C_1 = C_2 / g_1$

$d = g \cdot l + p$

[Kudva98][Beeftink98]
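One simple way to picture the discretization step is to snap each ideal input capacitance to the nearest available power level and then measure the resulting delay error. The sketch below assumes this nearest-capacitance rule, and the library values are hypothetical; the cited papers may use different error metrics.

```python
def discretize(ideal_cin, available_cins):
    """Pick the library power level whose input cap is closest to the ideal."""
    return min(available_cins, key=lambda c: abs(c - ideal_cin))


def stage_delay(p, l, c_in, c_load):
    """Gain-based delay d = g * l + p with g = c_load / c_in."""
    return (c_load / c_in) * l + p


# Hypothetical library: three power levels of a NOR2 (input capacitances).
library = [0.2, 0.3, 0.4]
ideal = 0.3364                      # ideal input cap from the sizeless model
chosen = discretize(ideal, library)

# Delay error caused by using the discrete size instead of the ideal one.
d_ideal = stage_delay(0.0496, 0.021, ideal, 0.3283)
d_real = stage_delay(0.0496, 0.021, chosen, 0.3283)
delay_error = d_real - d_ideal
print(chosen, delay_error)
```

Snapping to a smaller cell raises this stage's delay but lowers the load on the previous stage, which is why the step can be tuned toward either delay error or capacitance error.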

Gain Based: Observations

- Gain based algorithms: a major improvement.
  - More homogeneous (global) algorithms and designs.
  - Can be better targeted for area and/or delay.
- Reveal inherent cell characteristics to optimization tools, leading to improved QOR.
- Good library design is required to facilitate the discretization step.
- Ideally suited for operation within Physical Synthesis.


Placement Driven Buffering

Rip Out all Buffers

Insert Buffers based on placement info

What to do About Long Wires?

- Add buffers
- Tune wire sizes
- Modify the placement to reduce them

Placement Driven vs Logic Driven Buffer Insertion

- Logic driven buffer insertion focuses on logic topology and buffer sizing while assuming a statistical wire load model.
- Placement driven buffering uses an existing placement as the fundamental constraint.

Placement Driven Buffer Insertion: Buffopt (IBM)

- Multiple buffer types
- Inverters
- Capacitance, slew and noise constraints
- Wire sizing
- Simultaneous driver sizing
- High order interconnect delay and Ceffective
- Blockage handling

How Do Buffers Help?

- Reduce delay
  - Wire delay is quadratic in length
  - Buffers make delay essentially linear
  - Delay becomes gate dominated, not wire dominated
- Fix other problems
  - Bad slews at sinks
  - Capacitance range violations
  - Noise induced by capacitance coupling
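The quadratic-versus-linear claim can be checked numerically with the Elmore wire model and a linear gate model. The sketch below assumes uniform per-unit wire resistance and capacitance and identical buffers at a fixed spacing; all parameter values and function names are illustrative.

```python
def wire_delay(r, c, length, c_load):
    """Elmore delay of a uniform wire: Rw * (Cw / 2 + Cload)."""
    rw, cw = r * length, c * length
    return rw * (cw / 2 + c_load)


def buffered_delay(r, c, length, seg_len, rd, kd, c_buf):
    """Delay of a line cut into equal segments, each driven by one buffer.

    Every stage is a buffer (Rd * Cdown + Kd) plus its wire segment,
    and every segment is assumed to end in a buffer-sized load c_buf.
    """
    n = max(1, round(length / seg_len))
    seg = length / n
    stage = rd * (c * seg + c_buf) + kd + wire_delay(r, c, seg, c_buf)
    return n * stage


# Doubling the length of an unbuffered wire quadruples its delay...
ratio_unbuffered = wire_delay(1.0, 1.0, 8.0, 0.0) / wire_delay(1.0, 1.0, 4.0, 0.0)
# ...while a line buffered at a fixed spacing only doubles.
ratio_buffered = (buffered_delay(1.0, 1.0, 8.0, 1.0, 1.0, 1.0, 0.5)
                  / buffered_delay(1.0, 1.0, 4.0, 1.0, 1.0, 1.0, 0.5))
print(ratio_unbuffered, ratio_buffered)
```

With a fixed segment length the per-stage delay is constant, so total delay grows with the number of stages, that is, linearly in length.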

How Does Wire Sizing Help?

- Highly resistive lines increase delay.
- Wider wires or thick metal layers reduce resistance, but can increase capacitance.
- For long interconnect, the resistance reduction outweighs the capacitance increase.

Simple Buffer Insertion Problem

Given: source and sink locations, sink capacitances and RATs (required arrival times), a buffer type, source delay rules, unit wire resistance and capacitance.

[Figure: source s0 driving four sinks with required arrival times RAT1 to RAT4, and a buffer.]

Find: buffer locations and a routing tree such that slack at the source is maximized.

$q(s_0) = \min_{1 \le i \le 4} \{ RAT(s_i) - delay(s_0, s_i) \}$
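The objective $q(s_0)$ is just the worst sink slack seen from the source. A minimal Python sketch, with made-up required arrival times and delays for the four sinks:

```python
def source_slack(rats, delays):
    """q(s0) = min over sinks of RAT(s_i) - delay(s0, s_i)."""
    return min(rat - d for rat, d in zip(rats, delays))


# Hypothetical values for sinks s1..s4 (same time unit for both lists).
rats = [10.0, 12.0, 9.0, 11.0]
delays = [4.0, 5.0, 6.0, 3.0]
q = source_slack(rats, delays)
print(q)
```

A buffering solution is better when it raises this minimum, so the sink with the smallest RAT minus delay is the one that determines the quality of the whole net.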

Fundamental Buffer Insertion

- Van Ginneken's dynamic programming algorithm
- Building block: candidate (Cap, slack)
  - Candidates for each node stored as a list
  - Each sink has one candidate
  - Propagate candidates up the tree
- Guarantees optimal solution
- Quadratic complexity

Assumptions for the Basic Van Ginneken algorithm:

- Given a routing tree
- Given a set of potential insertion points
- Single buffer size
- No sink or driver sizing
- Linear gate delay model: $R_d C_{down} + K_d$
- Elmore wire delay model: $R_w (C_w / 2 + C_{down})$

Van Ginneken Extensions

- Multiple buffer types
- Inverters
- Capacitance, slew and noise constraints
- Wire sizing
- Simultaneous driver sizing
- High order interconnect delay and Ceffective
- Blockage recognition

Example

- Connect the end points of the net using a Steiner route
- Add candidate nodes
- Final buffer solution is optimal for this route, and this set of candidate nodes.
- Other routes may produce better final solutions.
- Net routing topology is an input to Van Ginneken's algorithm

Example

How Many Candidates?

- Number of candidates seems to double with each additional node.
- Prune a candidate when another candidate has slack at least as good and capacitance no larger (that is, drop the one with worse slack whose capacitance is greater or equal).
- This leaves a linear number of candidates.
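The pruning rule can be written as a single pass over candidates sorted by capacitance. The sketch below represents each candidate as a `(cap, slack)` tuple; the representation and the sample values are assumptions made for illustration.

```python
def prune(candidates):
    """Keep only non-dominated (cap, slack) candidates.

    A candidate is dominated when another one has capacitance that is
    no larger and slack that is at least as good.
    """
    kept = []
    # Sort by increasing cap; for equal cap, best slack first.
    for cap, slack in sorted(candidates, key=lambda cs: (cs[0], -cs[1])):
        # Everything already kept has cap <= this cap, so this candidate
        # survives only if it strictly improves on the best slack so far.
        if not kept or slack > kept[-1][1]:
            kept.append((cap, slack))
    return kept


sample = [(1.0, 5.0), (2.0, 4.0), (2.0, 7.0), (3.0, 7.0), (0.5, 1.0)]
survivors = prune(sample)
print(survivors)
```

After pruning, both capacitance and slack increase strictly along the list, which is why the list stays short and can be merged or extended efficiently.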

Pseudo Code:

List = NULL;
For each node (bottom up traversal of graph)
{
    augment each item in list with wire segment up to node
    duplicate the list
    for each element of the duplicate list
        add a buffer at node
    analyze each element in list
    analyze each element in buffered (duplicate) list
    pick best element of buffered list and delete the rest
    new list is union of list and "best" element of buffered list
}
Pick best solution;

[Figure: net with numbered candidate nodes.]
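The pseudo code can be turned into a small runnable Python sketch for the simplest case: a two-pin net with one candidate node per wire segment and a single buffer type. The data layout (dicts with `rd`, `kd`, `cin`), the function names, and the numbers are illustrative assumptions; the delay models are the linear gate model and Elmore wire model listed in the assumptions.

```python
def prune(cands):
    """Drop candidates dominated in (cap, slack); items are (cap, q, bufs)."""
    kept = []
    for cand in sorted(cands, key=lambda c: (c[0], -c[1])):
        if not kept or cand[1] > kept[-1][1]:
            kept.append(cand)
    return kept


def van_ginneken_path(sink_cap, sink_rat, segments, buf, driver):
    """segments: list of (Rw, Cw) ordered from the sink toward the source.

    A buffer may be placed at the upstream end of each segment.
    Returns (best slack at the source, tuple of buffered node indices).
    """
    cands = [(sink_cap, sink_rat, ())]          # the sink's single candidate
    for node, (rw, cw) in enumerate(segments, start=1):
        # Augment every candidate with the wire segment up to this node.
        wired = [(cap + cw, q - rw * (cw / 2 + cap), bufs)
                 for cap, q, bufs in cands]
        # Duplicate the list and add a buffer at this node.
        buffered = [(buf["cin"], q - (buf["rd"] * cap + buf["kd"]),
                     bufs + (node,))
                    for cap, q, bufs in wired]
        # All buffered candidates present the same cap, so keep the best one.
        best_buffered = max(buffered, key=lambda c: c[1])
        cands = prune(wired + [best_buffered])
    # Account for the driver, then pick the best solution.
    finals = [(q - (driver["rd"] * cap + driver["kd"]), bufs)
              for cap, q, bufs in cands]
    return max(finals, key=lambda x: x[0])


segments = [(2.0, 1.0)] * 6
buf = {"rd": 1.0, "kd": 1.0, "cin": 0.5}
driver = {"rd": 1.0, "kd": 0.0}
best_q, best_bufs = van_ginneken_path(1.0, 100.0, segments, buf, driver)
print(best_q, best_bufs)
```

Keeping only the best buffered candidate per node is what holds the list to at most one new entry per node, in line with the candidate counts on the following slide.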

Example

[Figure: net with numbered candidate nodes.]

- Node 1 processing: 2 evaluations, at most 2 candidates kept
- Node 2 processing: 4 evaluations, at most 3 candidates kept
- Node 3 processing: 6 evaluations, at most 4 candidates kept
- Node 4 processing: 8 evaluations, at most 5 candidates kept
- Node 5 processing: 10 evaluations, at most 6 candidates kept
- Node 6 processing: 12 evaluations, at most 7 candidates kept

Now pick the best one: optimal solution.

$Num\_evaluations = \sum_{i=1}^{N} 2i = N(N+1)$

$Num\_candidates \le N + 1$


Merging Branches

Critical

Merge is additive

Van Ginneken Algorithm Summary

Good
- Clever pruning controls # of candidates
- Finds an optimal solution in quadratic time
- Easily extended to cover a variety of important considerations (multiple buffer types, wire sizing, polarity, slew and capacitance constraints, etc.)

Bad
- Results depend on quality of route provided

Example Route:

[Figure: the critical sink cannot be offloaded due to the route; a different route leads to a better solution.]

Physical Synthesis Flow

- Synthesized netlist: wire-load models, unplaced, physically "unaware" timing
- Cleanup: remove buffers, nominal power levels on gates
- Initial "basic" placement: for minimal wire-length, min-cut; Steiner tree estimates, physically aware timing
- Logical + placement optimizations
  - Timing-driven placement w/ resynthesis: for minimal net weights, based on the timing of the net
  - Physically aware logic optimizations
- Timing improvement? Yes -> repeat the optimizations; No more -> placed netlist


Example Route:

If still critical, add net weight


Example Route:

Multiple Buffer Types

- Instead of one buffer type, can choose from m power levels
- Generate m candidates instead of one
- Still optimal
- Complexity increase is quadratic in m

Inverters

- Store candidates in "+" and "-" lists
  - "+" implies polarity preserved
  - "-" implies polarity reversed
- Adding an inverter
  - Switches a candidate in the + list to the - list
  - Switches a candidate in the - list to the + list
- Final result only chosen from the + list

Capacitance Constraints

- Each gate g can drive at most C(g) capacitance
- When inserting buffer g, check downstream capacitance. If it is bigger than C(g), throw out the candidate
- Increases efficiency

Slew Constraints

- Similar to capacitance constraints
- When inserting a buffer, compute slews to gates driven by the buffer
- If any slew exceeds its target, throw out the candidate
- Potential difficulty: computing slew accurately in bottom-up fashion

Noise Constraints

- Each gate has an acceptable noise threshold
- Compute cumulative noise for each wire via the Devgan noise metric
- Throw out candidates that violate noise

Can avoid noise while optimizing timing!

Wire Sizing:

For each node (bottom up traversal of graph)
{
    for each Wire Size
    {
        augment each item in list with Sized wire segment
        duplicate the list
        for each element of the duplicate list
            add a buffer at node
        analyze each element in list
        analyze each element in buffered (duplicate) list
        pick best element of buffered list and delete the rest
        new list is union of list and "best" element of buffered list
    }
}
Do Final pruning & Pick best solution;

[Figure: net with numbered candidate nodes.]

Blockage Recognition

Delete insertion points that run over blockages

Route Around Blockage

Buffer Bays

Routing Into Buffer Bays

"Buffer Site"

- Similar to buffer bays, only exact buffer locations are pre-specified, not just areas
- Useful as a mechanism for IP blocks and microprocessor design
- Dummy cell that holds a buffer
  - Not connected to any net
  - Becomes a buffer when assigned to a net
  - Extra sites -> decoupling caps
- Sprinkle sites throughout design
- Allocate percentage within macros

Routing Into Buffer Sites

Generate Steiner Tree

Reduce Congestion and Coupling

Assign Buffers

Comments about Buffering and Wire Sizing:

- Extremely critical: one of the highest leverage timing closure items
- There are extended provably correct algorithms for dealing with the problem.
- Steiner route and blockage avoidance are mostly heuristic: hot research area!

Section Outline

- Introduction
- Review material (timing and synthesis)
- Introduction to placement
- Placement algorithms
- Paradigms for placement-synthesis integration
- Placement aware synthesis techniques
- Congestion avoidance / mitigation techniques
- Routing optimization

Congestion Mitigation

Sources of Congestion

- Placement quality: do we have a good relative ordering of cells?
- Placement density: do we have appropriate cell spreading?
- Preplacement of large cells: is there a better location for these cells?
- Floorplan quality: is this a good floorplan / hierarchy?
- Netlist complexity: are some logic groupings inherently difficult to route?
- Library characteristics: do some cells block too much metal internally?

Congestion Mitigation

- Constructive avoidance
  - Control global placement pin density: fewer pins per unit area means fewer wires per unit area
  - Monitor congestion during placement and perform dynamic spreading
- Post placement fix up
  - Remove problems from an already placed netlist

Constructive Avoidance:

Characteristics:
- As placement is formed, take action to avoid problems
- Between each step of the placement progression there is the potential to evaluate congestion and take action

[Figure: placement progression with a "Groute / Spread / Redo" step between successive placement stages.]

Constructive Avoidance Deficiencies:

- Depends on early estimates of congestion that may not be accurate enough to avoid all problems
- Post placement actions such as clock tree insertion, repowering, buffering, etc. may add congestion to the design
- Guard banding with conservative "constructive avoidance" causes loss of performance and density

Post Placement Congestion Mitigation

- Use the production global router, not the internal placement based global router
- Translate congestion values into density targets for placement regions
- Perform flow based circuit spreading
- Preserve relative logic ordering of cells

Network Flow Based Spreading

[Figure: flow network with source s, supply nodes i with b(i) > 0, demand nodes j with b(j) < 0, and sink t.]

Min-cost max-flow formulation, similar to any "fix-up spreader": "thermal" placement, Bonn's top-down placer (Vygen), etc.

- For all i with b(i) > 0: Cap(e_si) = b(i), Cost(e_si) = 0
- For all i ≠ s, j ≠ t: Cap(e_ij) = Infinity (large int), Cost(e_ij) = K
- For all j with b(j) < 0: Cap(e_jt) = -b(j), Cost(e_jt) = 0
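The edge rules above can be sketched as a small network builder in Python. This only constructs the edge list; any min-cost flow solver could then be run on it. The bin names, the adjacency list, and the interpretation of b(i) as a bin's surplus (positive) or free capacity (negative) are assumptions made for illustration.

```python
def build_spreading_network(b, adjacency, k=1, inf=10**9):
    """Build (u, v, capacity, cost) edges for the spreading formulation.

    b maps each bin to its supply (> 0) or demand (< 0).
    adjacency lists pairs of neighbouring bins that cells may move between.
    """
    edges = []
    for node, value in b.items():
        if value > 0:
            edges.append(("s", node, value, 0))     # source -> supply bin
        elif value < 0:
            edges.append((node, "t", -value, 0))    # demand bin -> sink
    for i, j in adjacency:
        edges.append((i, j, inf, k))                # bin-to-bin moves cost K
        edges.append((j, i, inf, k))
    return edges


# Hypothetical row of three bins: A is over-full, B and C have room.
b = {"A": 3, "B": -2, "C": -1}
edges = build_spreading_network(b, [("A", "B"), ("B", "C")])
for edge in edges:
    print(edge)
```

Because only bin-to-bin edges carry a cost, a minimum-cost flow moves surplus to the nearest bins with free space, which tends to preserve the relative ordering of cells.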

Congestion Driven Circuit Spreading

- Initial placement
- Calculate bin level congestion
- Is congestion below threshold?
  - Yes -> final placement
  - No -> translate bin score to bin target density, then perform network flow based circuit spreading and recalculate congestion


We've Talked About

- Placement algorithms
- Placement / synthesis interaction
- Placement aware synthesis techniques
- The constant delay paradigm
- Physical buffer insertion / wire sizing
- Congestion mitigation

Let's Look at some Examples:

Pure MLP Quadratic

Optimization Results

[Figures: optimization results; legend: shatter, clone, fanin, buffer.]

Section outline

- Introduction
- Review material (timing and synthesis)
- Introduction to placement
- Placement algorithms
- Paradigms for placement-synthesis integration
- Placement aware synthesis techniques
- Congestion avoidance / mitigation techniques
- Routing optimization

Routing Based Optimization: RBO (IBM)

Routing based Timing Closure Issues

- Post routing timing problems can be significant
  - affect design schedule
  - may be too numerous to fix manually
- Increasing design density can reduce cost, but it also increases wiring congestion
  - timing and signal integrity become more significant
  - available resource for manual fixup is limited
  - without automation may not be doable
- Rerouting with constraints may resolve some of the problems, but this process is slow

Solution:

- Integrate global routing, detailed routing and timing correction
- Global routing is efficient enough to be run in an iterative timing closure loop
- Timing critical nets avoid scenic routes
- Non-critical nets that go scenic can be repowered and buffered prior to detailed routing

Example Problem:

[Figure: critical and non-critical paths under three wiring views.]

- Ideal "Steiner" routes: PDS timing uses Steiner wires (fast)
- Timing deficient wiring solution: post PD timing catches this problem (slow!)
- Timing driven wiring solution: RBO timing driven routing sees this during the global route stage (fast!) and forces optimal use of wiring resource (e.g. critical paths get a direct route)

Global Routing

- Divides the entire chip into localized rectangular regions called tiles.
- Compresses the several pin locations in each tile to a single pin location.
- All the shapes, wires and opens are represented in terms of global track capacity and usage.

Global Routing

- Two step approach
  - Create the initial Steiner routes
  - Compute the edge congestion on the grid
  - Perform rip-up and reroute using a shortest path algorithm to reduce the overall congestion of the design
- Advantages
  - Can communicate with the detail router
  - Good correlation with the final detail routing solution

Current Methodology

- Physical Synthesis -> Global Routing -> Detailed Routing -> Timing Analysis
- No timing criticality for the global router
- Costly manual timing correction

RBO Methodology

- Physical Synthesis -> RBO / Physical Synthesis -> Detailed Routing -> Analysis
- Routing Based Optimization components: XrGlobal, Extractor, Optimizer, Einstimer

RBO Extraction Process

- Very fast
- Excellent correlation with final 3D extraction
- Uses global routes for extraction
- Neighbor information probabilistically determined based on the global routing congestion information
- Based on extraction tables

[Figure: global routing grid with edge usage values 1, 3, 3, 2.]

Capacity of All Edges = 5

Probability of having a neighbor = (#OccupiedTracks)/(#Capacity) = 2/4 = 0.5

RBO results on memcntl

- Design: Example 1
- Nets: ~1.6M
- Size: 23193 x 23193
- Congestion: [Figure: display of global congestion.]

Timing Critical Nets: Without RBO

Nets Routed with RBO flow

RBO Results - Example 1

Columns: Worst Slack, #Slack Violations, #Cap Violations, #Slew Violations, #Opens, #Loops (some cells are blank in the source; values are listed in their original order).

- Steiner Estimates: -0.47, 17, 1, 18
- XrLocal without RBO: -1.57, 4687, 14, 128, 50, 87020
- RBO Timing Closure (Global Routes): -0.48, 209
- Detailed routing with RBO: -0.43, 14, 18, 1, 54

Example 2: Critical Net Routed Without RBO

Example 2: Critical Net Routed With RBO

Example 2 Results Summary

                            Worst Slack  #Slack Violations  #Cap Violations  #Slew Violations  #Opens  #Loops
Final routing without RBO   -0.54        1224               33               270               0       1152
Using RBO                   -0.29        909                32               274               0       1070

PDS - RBO Integration

[Figure: three flow diagrams with numbered steps.]

Current Flow (not noise aware)
- Steiner Routes Timing Closure: PDS-Einstimer
- Timing SignOff: ChipEdit-Einstimer
- Detail Routing: Xrouter
- No Noise SignOff

Projected Flow (noise aware)
- Probabilistic Detection of Noise Problems: PDS-Einstimer-RBO
- Noise Avoidance: PDS-Einstimer-RBO
- SignOff Noise Detection: ETCoupling-3DNoise
- Noise Correction: Manual Correction
- Timing Closure: PDS-Einstimer-RBO, Steiner-Global

Proposed Flow - Existing Tools
- Steiner Routes Timing Closure: PDS-Einstimer
- Global Routes Timing Closure: RBO-Einstimer
- SignOff Noise Detection: ETCoupling-3DNoise
- Noise Correction: Manual Correction

Noise Detection and Avoidance:

RBO (Detection)
- Length based
  - Initial selection includes length and slack threshold
  - Further pruning based on worst case Miller timing
- Switching window based refinement
  - Pattern generation based on switching window overlaps

RBO (Avoidance)
- Long net spreading
- Track reordering
- Incremental placement changes
- Layer assignment
- Wire width selection

Physical Synthesis Integration
- Fix cap and slew violations with global routes
- Interface to RBO
- Noise alleviation resizing
- Noise aware buffering

Long Net Spreader

Wrap Up

- Timing closure today is highly dependent on integrated tools.
- Tightly integrated placement, timing and synthesis tools are available today from multiple vendors.
- Placement techniques are dominated by quadratic techniques and partitioning.
- Next on the list for integration are routing and signal integrity tools (happening now).
- These tools have a high degree of complexity. It takes large, well funded DA organizations to compete in this space.

Placement References

- C. J. Alpert, T. Chan, D. J.-H. Huang, I. Markov, and K. Yan, "Quadratic Placement Revisited", Proc. 34th IEEE/ACM Design Automation Conference, 1997, pp. 752-757.
- C. J. Alpert, J.-H. Huang, and A. B. Kahng, "Multilevel Circuit Partitioning", Proc. 34th IEEE/ACM Design Automation Conference, 1997, pp. 530-533.
- U. Brenner and A. Rohe, "An Effective Congestion Driven Placement Framework", International Symposium on Physical Design, 2002, pp. 6-11.
- A. E. Caldwell, A. B. Kahng, and I. L. Markov, "Can Recursive Bisection Alone Produce Routable Placements", Proc. 37th IEEE/ACM Design Automation Conference, 2000, pp. 477-482.
- M. A. Breuer, "Min-Cut Placement", J. Design Automation and Fault Tolerant Computing, I(4), 1977, pp. 343-362.
- J. Vygen, "Algorithms for Large-Scale Flat Placement", Proc. 34th IEEE/ACM Design Automation Conference, 1997, pp. 746-751.
- H. Eisenmann and F. M. Johannes, "Generic Global Placement and Floorplanning", Proc. 35th IEEE/ACM Design Automation Conference, 1998, pp. 269-274.
- S.-L. Ou and M. Pedram, "Timing Driven Placement Based on Partitioning with Dynamic Cut-Net Control", Proc. 37th IEEE/ACM Design Automation Conference, 2000, pp. 472-476.
- C. M. Fiduccia and R. M. Mattheyses, "A linear time heuristic for improving network partitions", Proc. ACM/IEEE Design Automation Conference, 1982, pp. 175-181.


Synthesis References

C. L. Berman, J. L. Carter, and K. F. Day. The Fanout Problem: From Theory to Practice. In Advanced Research in VLSI: Proceedings of the 1989 Decennial Caltech Conference, pages 69-99, 1989
C. L. Berman, D. J. Hathaway, A. S. LaPaugh, and L. H. Trevillyan. Efficient Techniques for Timing Corrections. In International Symposium on Circuits and Systems, pages 415-419, 1990
F. Beeftink, P. N. Kudva, D. S. Kung, R. Puri, and L. Stok. Combinatorial Cell Design for CMOS Libraries. INTEGRATION, the VLSI Journal, 29:67-93, 2000
W. Donath, P. Kudva, L. Stok, P. Villarrubia, L. Reddy, and A. Sullivan. Transformational placement and synthesis. In DATE, pages 194-201, 2000
D. J. Hathaway, R. P. Abato, A. D. Drumm, and L. P. P. P. van Ginneken. Incremental timing analysis. Technical report, IBM Corp., 1996. U.S. patent 5,508,937
D. Kung, P. Kudva, and A. Sullivan. A Gate Sizing Algorithm using Geometric Programming. In Proc. of the International Workshop on Logic Synthesis, 1997
T. Kutzschebauch and L. Stok. Regularity driven logic synthesis. In Proc. of the Int. Conf. on Computer Aided Design, Nov. 2000
P. Rezvani, A. H. Ajami, M. Pedram, and H. Savoj. LEOPARD: A Logical Effort based fanout Optimizer for Area and Delay. In IEEE/ACM International Conference on CAD, pages 516-519, 1999
L. Stok, M. Iyer, and A. Sullivan. Wavefront technology mapping. In DATE, pages 531-536, 1999
D. S. Kung. A Fast Fanout Optimization Algorithm for Near-Continuous Buffer Libraries. In IEEE/ACM Design Automation Conference, pages 352-355, 1998


DP Buffer Insertion References

Buffer placement in distributed RC-tree networks for minimal Elmore delay. van Ginneken, L.P.P.P. IEEE International Symposium on Circuits and Systems, 1990, vol. 2, pp. 865-868
Optimal wire sizing and buffer insertion for low power and a generalized delay model. Lillis, J.; Chung-Kuan Cheng; Lin, T.-T.Y. IEEE Journal of Solid-State Circuits, vol. 31, no. 3, March 1996, pp. 437-447
Buffer insertion for noise and delay optimization. Alpert, C.J.; Devgan, A.; Quay, S.T. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 18, no. 11, Nov. 1999, pp. 1633-1645
Buffer insertion with accurate gate and interconnect delay computation. Alpert, C.J.; Devgan, A.; Quay, S.T. Proceedings of the 36th Design Automation Conference, 1999, pp. 479-484
Wire Segmenting For Improved Buffer Insertion. Alpert, C.; Devgan, A. Proceedings of the 34th Design Automation Conference, 1997, pp. 588-593
Simultaneous routing and buffer insertion for high performance interconnect. Lillis, J.; Chung-Kuan Cheng; Ting-Ting Y. Lin. Proceedings of the Sixth Great Lakes Symposium on VLSI, 1996, pp. 148-153


Blockage Avoidance References

Steiner tree optimization for buffers, blockages, and bays. Alpert, C.J.; Gandham, G.; Jiang Hu; Neves, J.I.; Quay, S.T.; Sapatnekar, S.S. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 20, no. 4, April 2001, pp. 556-562
A fast algorithm for context-aware buffer insertion. Jagannathan, A.; Sung-Woo Hur; Lillis, J. Proceedings of the Design Automation Conference, 2000, pp. 368-373
Simultaneous routing and buffer insertion with restrictions on buffer locations. Hai Zhou; Wong, D.F.; I-Min Liu; Aziz, A. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 19, no. 7, July 2000, pp. 819-824
Maze routing with buffer insertion and wire sizing. Minghorng Lai; Wong, D.F. Proceedings of the Design Automation Conference, 2000, pp. 374-378
Routing tree construction under fixed buffer locations. Cong, J.; Xin Yuan. Proceedings of the Design Automation Conference, 2000, pp. 379-384


Interconnect Planning References

A practical methodology for early buffer and wire resource allocation. Alpert, C.J.; Jiang Hu; Sapatnekar, S.S.; Villarrubia, P.G. Proceedings of the Design Automation Conference, 2001, pp. 189-194
An interconnect-centric design flow for nanometer technologies. Cong, J. Proceedings of the IEEE, vol. 89, no. 4, April 2001, pp. 505-528
Buffer block planning for interconnect-driven floorplanning. Cong, J.; Tianming Kong; Pan, D.Z. IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, 1999, pp. 358-363
Provably good global buffering using an available buffer block plan. Dragan, F.F.; Kahng, A.B.; Mandoiu, I.; Muddu, S.; Zelikovsky, A. IEEE/ACM International Conference on Computer Aided Design (ICCAD-2000), 2000, pp. 104-109
Provably good global buffering by multiterminal multicommodity flow approximation. Dragan, F.F.; Kahng, A.B.; Mandoiu, I.; Muddu, S.; Zelikovsky, A. Proceedings of the Asia and South Pacific Design Automation Conference (ASP-DAC 2001), 2001, pp. 120-125
Planning buffer locations by network flows. Tang, X.; Wong, D.F. International Symposium on Physical Design, April 2001, pp. 180-185
Routability-Driven Repeater Block Planning for Interconnect-Centric Floorplanning. Sarkar, P.; Sundararaman, V.; Koh, C.-K. International Symposium on Physical Design, April 2001, pp. 186-191

