Date post: | 29-Dec-2015 |
Category: |
Documents |
Upload: | abraham-george |
View: | 219 times |
Download: | 3 times |
Are Floorplan Representations Important in Digital Design?
H. H. Chan, S. N. Adya, I. L. Markov
The University of Michigan
Motivation
• Many FP representations have been proposed since sequence-pair [Murata et.al. ‘96]
• Emphasize better area optimization results, with appealing properties based on math. results
• Interconnect is often more important in physical design, but rarely discussed in the literature
Does the choice of FP reps matter in interconnect-driven floorplanning?
Outline of the Talk
• Common Practices in FP Research
• Background on Representations
• Evaluation Framework
• Results and Analysis
• Conclusions
Current Status Quo
• Prevailing algorithm: simulated annealing– Can optimize power, performance, etc
• Most work focuses on FP representations • Many FP reps exist, compared in terms of
– Size of solution space– Area optimality (capture area-optimal floorplan?)– Asymptotic complexity in “realizing” a floorplan– Complexity of incremental changes in annealing
Are these properties relevant in interconnect-driven floorplanning?
No Room For Improvement Left?
• MCNC benchmark suite is almost always used
• The suite has only 5 benchmarks, with <50 blks – Area-optimal results on apte, xerox and hp– Extremely close results on ami33 and ami49
• Area optimization results are emphasized– Interconnect optimization is often ignored
• Temperature schedules are rarely reported– Hard to reproduce results
Contexts for Floorplanning
• Outline-free floorplanning – Minimize a combination of area and HPWL– Many floorplanners are evaluated in this framework
• Fixed-outline floorplanning– Minimize HPWL subject to a bounding box– More relevant to modern designs
• Large scale floorplanning– Excellent area-packing results for >500 blks
e.g., 10 – 20 copies of ami49– Hardly interesting w/o interconnect optimization
Outline of the Talk
• Common Practices in FP Research
• Background on Representations
• Evaluation Framework
• Results and Analysis
• Conclusions
Families of FP Representations (1)
• Different FP reps can share the same solution space (“equivalent”)– Sequence pair, TCG, TCG-S– O-tree, B*-tree– Mosaic reps: CBL, Q-sequence, TBS etc
• Equivalent FP reps produce similar solutions, only differ in runtime
• We seek to compare solution spacesrather than specific representations
Families of FP Representations (2)
We study sequence pair and B*-tree, since• They are best studied in the literature
– Broad extensions for various constraintse.g. pre-placed blks, rectilinear blks, soft blks
– Hierarchical extensions for over 10K blocks(Parquet-in-Capo, MB*-tree)
• Two “extremes” of representations– Sequence pair: largest solution space– B*-tree: smallest solution space
Sequence Pair (SP)
• Proposed by Murata et al in 1996• Encodes a floorplan using two permutations
• O(n2) time to realize a floorplan– O(n lg n), O(n lg lg n) algos exist– O(n2) time algo is easy to
implement and fast in practice
• Size of solution space: n!2
• Some area-optimal solutions• Equivalent to TCG, TCG-S
B*-tree
• Proposed by Chang et al in 2000• Encodes a floorplan by a binary tree and a
permutation
• O(n)-time to realize a floorplan• Size of solution space:
O(n!22n-2 / n1.5)• Some area-optimal floorplans• All packings are compacted to
the bottom• Equivalent to O-tree
Sequence Pair vs. B*-tree
SP captures low interconnect floorplans, that are not captured by B*-tree
• SP captures more floorplans than B*-tree– This packing is encoded
by SP <a b c> <c a b>– Not captured by B*-tree
Outline of the Talk
• Common Practices in FP Research
• Background on Representations
• Evaluation Framework
• Results and Analysis
• Conclusions
Evaluation Framework (1)
• Floorplanner Parquet [Adya et.al ICCD 2001]
– Simulated annealer based on sequence pair– Competitive results in
• Outline-free floorplanning• Fixed-outline floorplanning• Handling both hard and soft blocks
– Embedded in placer Capo 9.0 for mixed-sized placement (standard cells + large macros) [Adya et.al ICCAD 2004]
Evaluation Framework (2)
• Replace the sequence pair annealer by B*-tree, with minimal changes– Identical temperature schedule– Probabilities of applying moves are identical,
except for those specific to B*-trees– Both versions are open source
http://vlsicad.eecs.umich.edu/BK/parquet/
• Report averages over 50 independent startsrather than best results
• Runtime breakdown: Block-packing vs WL eval.
Evaluation Framework (3)
Min-cut floorplacement with Capo• Use min-cut partitioning whenever possible• Otherwise, resort to annealing-based packing
– Cluster standard cells into soft blocks– Fixed-outline floorplanning on clustered
instances– Min-cut resumes on standard cells
• Outperforms competitive annealers on large FP instances (> 100 blocks)– Clustered FP instances have up to 300 blks, often <20
• Generates thousands of very different FP benchmarks
What to Expect?
• Wirelength evaluation is very CPU-intesive
• Count # floating-point ops in floorplan and wirelength evaluation per move (for SP)– Instrument the code by adding counters
• # ops in HPWL evaluation– Only count arithmetic ops,
not assignments
• # ops in FP evaluation– Worst-case complexity O(n2)
is too pessimistic – Runtime of SP eval
fits to cn1.3
t = cn1.3
Outline of the Talk
• Common Practices in FP Research
• Background on Representations
• Evaluation Framework
• Results and Analysis
• Conclusions
Area Optimization (average performance reported, not best)
• B*-tree packs better than SP• Similar for fixed-outline FP, B*-tree: higher success rate• SP beats B*-tree in evaluation time up to 200 blks,
despite their asymptotic complexities --- O(n2) vs. O(n)
ami33 ami49 n100 n200 n300
SP 9.5% 4.8ms
9.1% 7.2ms
9.2% 17.6ms
10.2% 37.9ms
10.7% 64.3ms
B*-tree 5.6% 7.6ms
5.3% 10.7ms
5.5% 20.3ms
5.9% 39.4ms
6.3% 57.1ms
Outline-free FP (deadspace % / time-per-move)
Area + HPWL Optimization
• Similar performance for SP and B*-tree in area, HPWL, and evaluation time
• Runtime dominated by HPWL evaluation • Same trends for fixed-outline FP and soft blocks
ami33 ami49 n100 n200 n300
SP 14.3% 76e3 29.7ms
14.1% 823e3 48.5ms
11.6% 322e3 93.9ms
13.4% 589e3 219ms
14.1% 707e3 305ms
B*-tree 15.8% 75e3 31.5ms
16.2% 847e3 50.6ms
11.3% 320e3 99.5ms
12.0% 580e3 219ms
11.9% 700e3 313ms
Outline-free FP (deadspace % / HPWL / time-per-move)
Min-Cut Floorplacement
• A wide range of FP benchmarks– Instances with both hard and soft blocks – Facilitate a rigorous empirical comparison
• Capo on the IBM-MSwPins benchmarks – Similar performance of SP and B*-tree
(<1% diff in wirelength, <2.5% in runtime)
• Stand-alone Parquet on the generated instances– 2000+ fixed-outline instances with both hard and soft blocks– Block counts range from 1 to ~300, often <20 – Similar performance of SP and B*-tree
Wirelength Evaluation Runtime (1)
• In both outline-free and fixed-outline FP, HPWL evaluation dominates runtime (~80%)
Outline-free (TPM area-only / area+wire)ami33 ami49 n100 n200 n300
SP 4.8ms 29.7ms
7.2ms 48.5ms
17.6ms 93.9ms
37.9ms 219ms
64.3ms 305ms
B*-tree 7.6ms 31.5ms
10.7ms 50.6ms
20.3ms 99.5ms
39.4ms 219ms
57.1ms 313ms
Consistent with our analysis that WL evaluation should dominate runtime.
Wirelength Evaluation Runtime (2)
• Estimates less accurate in larger benchmarks– Larger netlists may not fit into processor cache
FP evaluation is never a runtime bottleneck!
ami33 ami49 n100 n200 n300
estimated 89% 88% 85% 80% 76%
actual 86% 87% 84% 85% 82%
% time-per-move spent on WL evaluation
(analytical estimates versus actual measurements)
Must Improve WL EvaluationRather than FP Representations!
• Ideas for speed-ups– Special-case the evaluation of 2-pin nets:
no loops, 2 floating-point comparisons only– Remove inessential nets, whose bounding box contains the outline– Conglomerate 2-pin nets between the same blocks
into heavy-weight nets
• These techniques lead to 10% speed-upwithout loss of quality in Parquet 3
• Parquet 4: another 10% speed-up (2x speed-up per move) with improved wirelength (longer temp. schedule)– Better use of cache (float versus double)– Simultaneous computation of min and max
(25% less comparisons)
Conclusions
• We compared the performance ofSP and B*-tree in wirelength-driven floorplanning – Surprisingly similar performance!
• The main bottleneck is interconnect evaluation:
>75% runtime in realistic FP instances– Parquet-4: Improved WL eval. tangible speed-up
• Asymptotic complexity of FP eval. has little relevance – Realistic FP instances are small (<200 blks)
– Worst-case analysis is too pessimistic
• New FP reps seem irrelevant unlessthey support incremental wirelength evaluation
• The status quo in FP literature needs to be changed
t = cn1.3