Methodology from Chaos in IC Implementation

Kwangok Jeong* and Andrew B. Kahng*,**

* ECE Dept., UC San Diego
** CSE Dept., UC San Diego

UCSD VLSI CAD Laboratory - ISQED 2010, March 24, 2010

Outline

• Motivation

• Assessment of “Chaos”

• Exploitation of “Chaotic” behavior

• Conclusion


Motivation

• Chip implementation flow is a “Chaos Machine” (Ward Vercruysse, Sun Microsystems, ISPD97 talk)
  • Hard to predict behavior of back-end implementation

• “Inherent noise” (Kahng/Mantik, ISQED-2001)
  • Equivalent inputs to tools result in different outputs

• Algorithms and EDA tools are not deterministic or predictable
  • Most design optimization problems are NP-hard → Heuristic-based approaches
  • Physical phenomena are too complex → Simplified models

→ How to exploit “chaotic behavior”?


Scope of This Work

• We assess “chaotic” behavior in the design process
  • When it occurs in the design process
    • Post-synthesis vs. post-routing
    • Place-and-route tools’ view vs. signoff tools’ view
  • What user inputs affect it most
    • Sensitivity of synthesis tools to input parameters
    • Sensitivity of place-and-route tools to input parameters

• We propose a practical method to exploit “chaos” in EDA tools, based on empirical analyses
  • Sensitivity of outcomes to input parameters → Find safe/easy knobs that don’t change netlists/libraries
  • Best-of-k: multi-start, multi-run methodologies


Analysis 1: Synthesis vs. Place-and-Route

• How strongly correlated are post-synthesis netlist quality and post-routing design quality?
  • Timing quality of synthesized netlists vs. timing quality after placement and routing, and signoff

Clock used @     WNS (ns) with 2ns       Clock used @   WNS (ns) seen   WNS (ns)
synthesis (ns)   clock after synthesis   P&R (ns)       @ P&R           @ signoff
1.60              0.400                  2.0            0.171           -0.249
1.80              0.200                  2.0            0.088           -0.196
1.90              0.101                  2.0            0.112           -0.195
1.95              0.051                  2.0            0.074           -0.449
2.00              0.001                  2.0            0.088           -0.252
2.10             -0.097                  2.0            0.088           -0.214
2.20             -0.196                  2.0            0.120           -0.281
2.40             -0.395                  2.0            0.162           -0.081

→ Worst quality netlist can result in best quality!

*AES design
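
One way to put a number on the question above is a rank correlation between the post-synthesis and signoff slack columns. The sketch below is not from the talk: it hand-computes Spearman's rank correlation (using the tie-free formula) on the eight AES rows of the table, and the function and variable names are illustrative.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation, 1 - 6*sum(d^2) / (n*(n^2 - 1)); valid without ties."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Columns copied from the AES table: WNS after synthesis and WNS at signoff (ns).
wns_synthesis = [0.400, 0.200, 0.101, 0.051, 0.001, -0.097, -0.196, -0.395]
wns_signoff = [-0.249, -0.196, -0.195, -0.449, -0.252, -0.214, -0.281, -0.081]

rho = spearman(wns_synthesis, wns_signoff)
print(f"Spearman rank correlation: {rho:.3f}")
```

On these eight rows the coefficient comes out close to zero (about -0.05), which fits the slide's remark that the netlist ranked worst after synthesis can end up best at signoff.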


Analysis 2: Implementation vs. Signoff

• How strongly correlated are P&R and signoff?

• Timing miscorrelation
  • Delay calculation
  • RC parasitic

[Figure: worst negative slack comparison from 29 testcases; x-axis: Implementation (SOCE, Astro), y-axis: Signoff (PrimeTime), both in ns; series: Astro vs. PrimeTime, SOCE vs. PrimeTime; annotation: ~200ps underestimation]

WNS (ns)   Imp.     Signoff
AES         0.144    0.146
JPEG        0.129    0.095
LSU        -0.002   -0.005
EXU        -0.171   -0.183


Beyond Miscorrelation Issues

• Kahng and Mantik 2001 – “Noise”
  • Equivalent inputs result in different outputs
    • Changing seeds of random number generators
    • Changing cell/net ordering
    • Renaming cell instances
    • Perturbing design hierarchy
  • Injecting “noise” is practically difficult

• Our focus – “Chaos”
  • Negligible change of inputs → Large change in outputs
  • E.g., 0.1ps changes affect design quality significantly

Clock cycle time (ns)   Worst negative slack (WNS) (ns)
1.9998                  -0.011
1.9999                  -0.068
2.0000                  -0.093
2.0001                  -0.010
2.0002                  -0.004

*JPEG design; 89ps difference between the best and worst WNS
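
As a quick check of the numbers in this table, the following sketch (not part of the talk; variable names are illustrative) computes the WNS spread across the five clock settings and the largest WNS swing between two settings that differ by only 0.1ps.

```python
# (clock cycle time in ns, WNS in ns), copied from the JPEG table above.
runs = [(1.9998, -0.011), (1.9999, -0.068), (2.0000, -0.093),
        (2.0001, -0.010), (2.0002, -0.004)]

wns = [w for _, w in runs]

# Spread between the best and worst outcome, converted from ns to ps.
spread_ps = (max(wns) - min(wns)) * 1000

# Largest change in WNS between neighbouring clock settings (0.1ps apart).
max_step_ps = max(abs(b - a) for a, b in zip(wns, wns[1:])) * 1000

print(f"WNS spread: {spread_ps:.0f} ps")
print(f"Largest swing for a 0.1ps clock change: {max_step_ps:.0f} ps")
```

The spread reproduces the 89ps figure quoted on the slide, and the single step from 2.0000ns to 2.0001ns accounts for 83ps of it.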


What Inputs Can Be Perturbed?

• Tool-specific options: command options to turn on/off
  • Not our concern, since these are tool-dependent

• Design-specific constraints:
  • These knobs do not change design signatures → easy and safe knobs to perturb

Timing-Related Constraints
  Clock cycle time (T)     -3 / -2 / -1 / 0 / 1 / 2 / 3 ps
  Clock uncertainty (S)    -3 / -2 / -1 / 0 / 1 / 2 / 3 ps
  Input/output delay (B)   -3 / -2 / -1 / 0 / 1 / 2 / 3 ps

Floorplan-Related Constraints
  Utilization (U)          -3 / -2 / -1 / 0 / 1 / 2 / 3 %
  Aspect ratio (A)         -0.03 / -0.02 / -0.01 / 0 / 0.01 / 0.02 / 0.03
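
The perturbation grid above is small enough to enumerate directly. The following sketch assumes, as the later best-of-k slides suggest but this slide does not state outright, that each run perturbs exactly one knob by one of its seven offsets. The dictionary layout and function name are illustrative.

```python
# Offsets for each knob as listed on the slide: (name, unit, offsets).
KNOBS = {
    "T": ("clock cycle time", "ps", [-3, -2, -1, 0, 1, 2, 3]),
    "S": ("clock uncertainty", "ps", [-3, -2, -1, 0, 1, 2, 3]),
    "B": ("input/output delay", "ps", [-3, -2, -1, 0, 1, 2, 3]),
    "U": ("utilization", "%", [-3, -2, -1, 0, 1, 2, 3]),
    "A": ("aspect ratio", "", [-0.03, -0.02, -0.01, 0, 0.01, 0.02, 0.03]),
}


def one_knob_runs(knobs):
    """List (knob, offset) pairs, one per run (assumption: one knob changed per run)."""
    return [(key, offset)
            for key, (_name, _unit, offsets) in knobs.items()
            for offset in offsets]


runs = one_knob_runs(KNOBS)
print(len(runs))  # 5 knobs x 7 offsets = 35 runs
for key, offset in runs[:3]:
    name, unit, _ = KNOBS[key]
    print(f"{name}: {offset:+}{unit}")
```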


Testbed

• Designs
  • Implemented with TSMC 65nm GPLUS library

Block   Source        Clock (ns)   #Cells   Area (um2)
AES     Opencores     1.7          22438    48957
JPEG    Opencores     2.2          69845    178696
LSU     OpenSparcT1   1.2          24945    113479
EXU     OpenSparcT1   1.2          20382    69780

• Tools

Tool              Vendor     Purpose
Design Compiler   Synopsys   Logic synthesis
RTL Compiler      Cadence    Logic synthesis
SOC Encounter     Cadence    Place-and-route
Astro             Synopsys   Place-and-route
STAR-RCXT         Synopsys   Signoff RC extraction
PrimeTime-SI      Synopsys   Signoff STA


Analysis 3: Noise in Synthesis – Timing

• What chaotic behavior is associated with input parameters of vendor synthesis tools?

• Ideally, results should not vary significantly → However, worst negative slack can change by up to 52ps

[Figure (Design Compiler): WNS (ns), axis from -0.3 to 0, for AES, JPEG, LSU, and EXU under -3ps to 3ps perturbations of clock, clock uncertainty, and IO delay]


Analysis 3: Noise in Synthesis – Area

• What chaotic behavior is associated with input parameters of vendor synthesis tools?

• Synthesized area can change by up to 6%

[Figure (Design Compiler): normalized area (%), axis from -3 to 7, for AES, JPEG, LSU, and EXU under -3ps to 3ps perturbations of clock, clock uncertainty, and IO delay]


Analysis 4: Noise in P&R Tools

• What chaotic behavior is associated with input parameters of vendor place-and-route tools?

• Noise at place-and-route stage is even worse!
  • WNS and TNS can change by up to 165ps and 46ns

[Figure (Astro): WNS (ns), axis from -0.35 to 0, for AES, JPEG, LSU, and EXU under perturbations of clock cycle, clock uncertainty, and IO delay (-3ps to 3ps), aspect ratio (-0.03 to 0.03), and placement utilization (-3% to 3%)]


Analysis 4: Noise in P&R Tools

• What chaotic behavior is associated with input parameters of vendor place-and-route tools?

• Noise at place-and-route stage is even worse!
  • WNS and TNS can change by up to 190ps and 69ns
  • Area can change by up to 16.4%

[Figure (SOC Encounter): WNS (ns), axis from -0.35 to 0, for AES, JPEG, LSU, and EXU under perturbations of clock cycle, clock uncertainty, and IO delay (-3ps to 3ps), aspect ratio (-0.03 to 0.03), and placement utilization (-3% to 3%)]


Exploiting Noise in Design Flow

• Multi-start and multi-run
  • When there are idle machines in the compute farm
    → Multi-start: after running on k distinct machines with negligible perturbations of inputs, choose the best of the k different solutions
  • When there is time-to-market remaining
    → Multi-run: after running k sequential jobs with negligible perturbations of inputs, choose the best of the k different solutions

• Best-of-k method
  • Find the best solution from many trials
  • Larger k → better best solution
  • How to determine k that produces predictably good results?
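
A minimal sketch of the two modes follows. It is not the authors' scripts: `run_flow` is a hypothetical stand-in for launching one complete implementation run and reading back its WNS, and here it simply replays three of the JPEG results quoted earlier in the talk. Threads stand in for separate machines in the multi-start case.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in data: clock offset (in units of 0.1ps around 2.0ns)
# mapped to the WNS in ns reported in the JPEG example.
RECORDED_WNS = {-1: -0.068, 0: -0.093, 1: -0.010}


def run_flow(offset):
    """Pretend to run the flow with a perturbed clock; return (WNS, offset)."""
    return RECORDED_WNS[offset], offset


def multi_start(offsets):
    """Run all k perturbed jobs concurrently, then keep the best (largest) WNS."""
    with ThreadPoolExecutor(max_workers=len(offsets)) as pool:
        return max(pool.map(run_flow, offsets))


def multi_run(offsets):
    """Run the k perturbed jobs one after another, then keep the best WNS."""
    return max(run_flow(offset) for offset in offsets)


best_wns, best_offset = multi_start([-1, 0, 1])
print(f"best of 3: WNS = {best_wns} ns at offset {best_offset}")
```

Both functions return the same winner for the same set of perturbations. The only difference is whether the k jobs consume machines (multi-start) or wall-clock time (multi-run).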


Best-of-k Using Sampling

• Which k results in a consistent, reasonably good solution?

• To obtain statistics: a “set of k trials” should be performed a large number (N) of times, for each value of k

Naive procedure:
  for each k
    Perform k trials, N times
    s_k ← average of best solutions

• For large N, s_k is the expected (average) best solution when we perform k trials

• Example: k = {1, 2, 3, 4, 5, 10}, N = 100 → 2,500 separate runs → Many runs are required!

Best-of-k sampling procedure:
  // find “virtual” solution space
  Perform N’ trials (N’ < N)
  Record solutions → set of solutions S
  // best-of-k sampling
  for each k
    sample k solutions out of S, N different times
    s_k ← average of best solutions

• Example: N’ = 50 (Sampling from S does not add cost)
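
The sampling procedure above can be sketched in a few lines of Python. This is an illustrative reading of the pseudocode rather than the authors' code: it assumes quality is WNS (larger is better), that each sample of k solutions is drawn from S without replacement, and it uses the five JPEG WNS values quoted earlier in the talk as a tiny stand-in for S. Function and parameter names are made up for the example.

```python
import random
from statistics import mean


def best_of_k_sampling(solutions, ks, n_samples=100, seed=0):
    """Estimate s_k, the expected best quality over k trials, by resampling
    from an already-recorded solution set S (no additional tool runs)."""
    rng = random.Random(seed)
    s = {}
    for k in ks:
        # Draw k solutions from S without replacement, N different times,
        # keeping the best (largest, i.e. least negative) WNS of each draw.
        bests = [max(rng.sample(solutions, k)) for _ in range(n_samples)]
        s[k] = mean(bests)
    return s


# Stand-in for S: WNS values (ns) from the JPEG clock-perturbation example.
S = [-0.011, -0.068, -0.093, -0.010, -0.004]

s_k = best_of_k_sampling(S, ks=range(1, 6))
for k, value in s_k.items():
    print(f"k={k}: expected best WNS = {value:.3f} ns")
```

With k equal to the size of S every draw contains the overall best solution, so s_k reaches the best recorded WNS. For smaller k the estimate depends on the random draws, which is why the seed is fixed here.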


Application of Best-of-k Sampling (1)

• Find best input parameters to perturb using best-of-k sampling
  • k = 1, 2, 3, …, 10, and N = 100 → 5,500 experiments in the naive procedure
  • S = 7 solutions from each of T, S, B, A, U perturbations

• Quality rank of input parameters in P&R
  • E.g., AES: clock cycle (T) or input/output delay (B) perturbations result in best solution quality

[Figures (AES design, EXU design): quality rank order (axis from 0 to 6) versus k = 1 to 10, for clock cycle, clock uncertainty, input/output delay, aspect ratio, and utilization perturbations]
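
The run counts quoted for the naive procedure follow from simple arithmetic: each value of k costs k tool runs, repeated N times. The sketch below (function name illustrative) reproduces both figures from the slides and compares them with the number of recorded solutions that the sampling approach reuses.

```python
def naive_run_count(ks, n_repeats):
    """Tool runs needed when every set of k trials is actually executed N times."""
    return n_repeats * sum(ks)


print(naive_run_count([1, 2, 3, 4, 5, 10], 100))  # 2,500 runs (earlier example)
print(naive_run_count(range(1, 11), 100))         # 5,500 runs (this slide)

# Sampling approach: 7 recorded solutions for each of the 5 knobs (T, S, B, A, U).
recorded_runs = 7 * 5
print(recorded_runs)  # 35 tool runs; resampling from them costs no further runs
```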


Application of Best-of-k Sampling (2)

• Solution quality versus number of trials “k” (with N = 100)
  • Average solution quality approaches the best solution as “k” increases
  • Average solution quality is significantly better than worst possible solution quality → best-of-k can avoid bad luck
  • Best-of-3 shows reasonably good solutions

[Figures (AES, JPEG, LSU, EXU): WNS (ns) versus k (# of different runs), k = 1 to 10, each with three series: worst of 100 trials, average of 100 trials, best of 100 trials]


Conclusion and Ongoing Work

• Experimental assessment of “chaotic” behavior in commercial EDA tools
  • Miscorrelation issues between design stages are well-known
  • Exploiting chaos: intentional negligible input perturbations can significantly change outputs

• Proposed a methodology to exploit the chaotic tool behavior
  • “best-of-k”: multi-start / multi-run methodology
  • Efficient sampling method to determine the best number of trials
  • We also find best input parameters to perturb using best-of-k sampling

• Ongoing work
  • Analysis of potential advantages of “chaos” in advanced physical synthesis tools to reduce miscorrelation-related issues
  • Evaluation of the benefits of chaos in more advanced signoff methodologies (signal integrity-enabled, path-based STA)

Thank You!

Backup


Potential Cause 1

• Miscorrelation between synthesis and place-and-route
  • Rank correlation of timing critical paths between synthesis and placement: 0.421

[Figure (AES design): annotation marks paths that are not critical at synthesis but critical at placement]


Potential Cause 2: Parasitic Miscorrelation

• Miscorrelation in delay calculation
  • With same RC parasitic file (.spef)
  • May not be a major problem: a few tens of picoseconds difference

• Miscorrelation in RC extraction
  • Implementation tool can underestimate capacitance by 18.6%

WNS (ns)   Imp.     Signoff
AES         0.144    0.146
JPEG        0.129    0.095
LSU        -0.002   -0.005
EXU        -0.171   -0.183


Inherent Noise: Detailed Results

• Noise is really random! → Difficult to predict
• Red text marks the best in each group

Setup timing, with original clock. All WNS and TNS values are in ns; PT = PrimeTime, BF = BlastFusion.

            SOCE                             Astro                            BlastFusion
Clock (ns)  WNS(SOCE)    WNS(PT)    TNS(PT) WNS(Astro)    WNS(PT)    TNS(PT)    WNS(BF)    WNS(PT)    TNS(PT)

AES, tight clock (original 2.2ns)
2.1998         -0.407     -0.430    -81.124     -0.241     -0.487    -94.822     -0.077     -0.391    -60.156
2.1999         -0.392     -0.420    -73.533     -0.218     -0.512    -89.316     -0.067     -0.397    -58.728
2.2000         -0.399     -0.457    -85.641     -0.255     -0.569   -100.956     -0.081     -0.331    -59.985
2.2001         -0.436     -0.439    -82.053     -0.280     -0.535   -110.341     -0.074     -0.442    -61.048
2.2002         -0.406     -0.441    -82.576     -0.246     -0.490    -92.196     -0.067     -0.384    -51.980

AES, loose clock (original 3.0ns)
2.9998         -0.026     -0.119     -1.965      0.040     -0.280    -35.482      0.000     -0.342    -44.778
2.9999         -0.091     -0.095     -2.137      0.064     -0.325    -34.699      0.001     -0.469    -46.154
3.0000         -0.046     -0.096     -3.499      0.049     -0.346    -36.565     -0.001     -0.448    -48.369
3.0001         -0.049     -0.112     -1.972      0.083     -0.239    -23.040     -0.008     -0.373    -44.683
3.0002         -0.061     -0.078     -1.718      0.057     -0.287    -31.985      0.000     -0.421    -48.042

JPEG, tight clock (original 1.3ns)
1.2998         -0.294     -0.315   -625.434     -0.265     -0.352   -744.637     -0.228     -0.324   -501.295
1.2999         -0.263     -0.281   -566.317     -0.240     -0.418   -701.361     -0.166     -0.266   -410.594
1.3000         -0.257     -0.258   -537.580     -0.256     -0.395   -733.841     -0.244     -0.338   -567.228
1.3001         -0.249     -0.303   -561.013     -0.239     -0.321   -719.196     -0.202     -0.304   -475.253
1.3002         -0.298     -0.514   -757.272     -0.229     -0.346   -731.566     -0.197     -0.277   -471.392

JPEG, loose clock (original 2.0ns)
1.9998         -0.005     -0.011     -0.011      0.101     -0.140     -0.520      0.000     -0.216    -11.407
1.9999          0.008     -0.068     -0.068      0.101     -0.140     -0.520      0.000     -0.167    -12.021
2.0000         -0.007     -0.093     -0.137      0.101     -0.131     -1.240     -0.002     -0.196    -15.189
2.0001         -0.001     -0.010     -0.010      0.096     -0.098     -0.449      0.001     -0.181    -16.782
2.0002          0.008     -0.004     -0.006      0.099     -0.066     -0.279     -0.006     -0.178    -12.220

[Figures: WNS (ns) versus k (# of different runs), k = 1 to 10, one plot per design and place-and-route tool, each with three series: worst of 100 trials, average of 100 trials, best of 100 trials]

• AES ASTRO (WNS axis from -0.108 to -0.096)
• AES SOCE (WNS axis from -0.090 to -0.020)
• JPEG ASTRO (WNS axis from -0.270 to -0.150)
• JPEG SOCE (WNS axis from -0.210 to -0.175)
• LSU ASTRO (WNS axis from -0.150 to -0.070)
• LSU SOCE (WNS axis from -0.190 to -0.120)
• EXU ASTRO (WNS axis from -0.330 to -0.150)
• EXU SOCE (WNS axis from -0.330 to -0.080)

