are floorplan representations important in digital design?
DESCRIPTION
Are Floorplan Representations Important in Digital Design?. H. H. Chan, S. N. Adya, I. L. Markov The University of Michigan. Motivation. Many FP representations have been proposed since sequence-pair [Murata et.al. ‘96] - PowerPoint PPT PresentationTRANSCRIPT
Are Floorplan Representations Important in Digital Design?
H. H. Chan, S. N. Adya, I. L. Markov
The University of Michigan
Motivation
• Many FP representations have been proposed since sequence-pair [Murata et.al. ‘96]
• Emphasize better area optimization results, with appealing properties based on math. results
• Interconnect is often more important in physical design, but rarely discussed in the literature
Does the choice of FP reps matter in interconnect-driven floorplanning?
Outline of the Talk
• Common Practices in FP Research
• Background on Representations
• Evaluation Framework
• Results and Analysis
• Conclusions
Current Status Quo
• Prevailing algorithm: simulated annealing– Can optimize power, performance, etc
• Most work focuses on FP representations • Many FP reps exist, compared in terms of
– Size of solution space– Area optimality (capture area-optimal floorplan?)– Asymptotic complexity in “realizing” a floorplan– Complexity of incremental changes in annealing
Are these properties relevant in interconnect-driven floorplanning?
No Room For Improvement Left?
• MCNC benchmark suite is almost always used
• The suite has only 5 benchmarks, with <50 blks – Area-optimal results on apte, xerox and hp– Extremely close results on ami33 and ami49
• Area optimization results are emphasized– Interconnect optimization is often ignored
• Temperature schedules are rarely reported– Hard to reproduce results
Contexts for Floorplanning
• Outline-free floorplanning – Minimize a combination of area and HPWL– Many floorplanners are evaluated in this framework
• Fixed-outline floorplanning– Minimize HPWL subject to a bounding box– More relevant to modern designs
• Large scale floorplanning– Excellent area-packing results for >500 blks
e.g., 10 – 20 copies of ami49– Hardly interesting w/o interconnect optimization
Outline of the Talk
• Common Practices in FP Research
• Background on Representations
• Evaluation Framework
• Results and Analysis
• Conclusions
Families of FP Representations (1)
• Different FP reps can share the same solution space (“equivalent”)– Sequence pair, TCG, TCG-S– O-tree, B*-tree– Mosaic reps: CBL, Q-sequence, TBS etc
• Equivalent FP reps produce similar solutions, only differ in runtime
• We seek to compare solution spacesrather than specific representations
Families of FP Representations (2)
We study sequence pair and B*-tree, since• They are best studied in the literature
– Broad extensions for various constraintse.g. pre-placed blks, rectilinear blks, soft blks
– Hierarchical extensions for over 10K blocks(Parquet-in-Capo, MB*-tree)
• Two “extremes” of representations– Sequence pair: largest solution space– B*-tree: smallest solution space
Sequence Pair (SP)
• Proposed by Murata et al in 1996• Encodes a floorplan using two permutations
• O(n2) time to realize a floorplan– O(n lg n), O(n lg lg n) algos exist– O(n2) time algo is easy to
implement and fast in practice
• Size of solution space: n!2
• Some area-optimal solutions• Equivalent to TCG, TCG-S
B*-tree
• Proposed by Chang et al in 2000• Encodes a floorplan by a binary tree and a
permutation
• O(n)-time to realize a floorplan• Size of solution space:
O(n!22n-2 / n1.5)• Some area-optimal floorplans• All packings are compacted to
the bottom• Equivalent to O-tree
Sequence Pair vs. B*-tree
SP captures low interconnect floorplans, that are not captured by B*-tree
• SP captures more floorplans than B*-tree– This packing is encoded
by SP <a b c> <c a b>– Not captured by B*-tree
Outline of the Talk
• Common Practices in FP Research
• Background on Representations
• Evaluation Framework
• Results and Analysis
• Conclusions
Evaluation Framework (1)
• Floorplanner Parquet [Adya et.al ICCD 2001]
– Simulated annealer based on sequence pair– Competitive results in
• Outline-free floorplanning• Fixed-outline floorplanning• Handling both hard and soft blocks
– Embedded in placer Capo 9.0 for mixed-sized placement (standard cells + large macros) [Adya et.al ICCAD 2004]
Evaluation Framework (2)
• Replace the sequence pair annealer by B*-tree, with minimal changes– Identical temperature schedule– Probabilities of applying moves are identical,
except for those specific to B*-trees– Both versions are open source
http://vlsicad.eecs.umich.edu/BK/parquet/
• Report averages over 50 independent startsrather than best results
• Runtime breakdown: Block-packing vs WL eval.
Evaluation Framework (3)
Min-cut floorplacement with Capo• Use min-cut partitioning whenever possible• Otherwise, resort to annealing-based packing
– Cluster standard cells into soft blocks– Fixed-outline floorplanning on clustered
instances– Min-cut resumes on standard cells
• Outperforms competitive annealers on large FP instances (> 100 blocks)– Clustered FP instances have up to 300 blks, often <20
• Generates thousands of very different FP benchmarks
What to Expect?
• Wirelength evaluation is very CPU-intesive
• Count # floating-point ops in floorplan and wirelength evaluation per move (for SP)– Instrument the code by adding counters
• # ops in HPWL evaluation– Only count arithmetic ops,
not assignments
• # ops in FP evaluation– Worst-case complexity O(n2)
is too pessimistic – Runtime of SP eval
fits to cn1.3
t = cn1.3
Outline of the Talk
• Common Practices in FP Research
• Background on Representations
• Evaluation Framework
• Results and Analysis
• Conclusions
Area Optimization (average performance reported, not best)
• B*-tree packs better than SP• Similar for fixed-outline FP, B*-tree: higher success rate• SP beats B*-tree in evaluation time up to 200 blks,
despite their asymptotic complexities --- O(n2) vs. O(n)
ami33 ami49 n100 n200 n300
SP 9.5% 4.8ms
9.1% 7.2ms
9.2% 17.6ms
10.2% 37.9ms
10.7% 64.3ms
B*-tree 5.6% 7.6ms
5.3% 10.7ms
5.5% 20.3ms
5.9% 39.4ms
6.3% 57.1ms
Outline-free FP (deadspace % / time-per-move)
Area + HPWL Optimization
• Similar performance for SP and B*-tree in area, HPWL, and evaluation time
• Runtime dominated by HPWL evaluation • Same trends for fixed-outline FP and soft blocks
ami33 ami49 n100 n200 n300
SP 14.3% 76e3 29.7ms
14.1% 823e3 48.5ms
11.6% 322e3 93.9ms
13.4% 589e3 219ms
14.1% 707e3 305ms
B*-tree 15.8% 75e3 31.5ms
16.2% 847e3 50.6ms
11.3% 320e3 99.5ms
12.0% 580e3 219ms
11.9% 700e3 313ms
Outline-free FP (deadspace % / HPWL / time-per-move)
Min-Cut Floorplacement
• A wide range of FP benchmarks– Instances with both hard and soft blocks – Facilitate a rigorous empirical comparison
• Capo on the IBM-MSwPins benchmarks – Similar performance of SP and B*-tree
(<1% diff in wirelength, <2.5% in runtime)
• Stand-alone Parquet on the generated instances– 2000+ fixed-outline instances with both hard and soft blocks– Block counts range from 1 to ~300, often <20 – Similar performance of SP and B*-tree
Wirelength Evaluation Runtime (1)
• In both outline-free and fixed-outline FP, HPWL evaluation dominates runtime (~80%)
Outline-free (TPM area-only / area+wire)ami33 ami49 n100 n200 n300
SP 4.8ms 29.7ms
7.2ms 48.5ms
17.6ms 93.9ms
37.9ms 219ms
64.3ms 305ms
B*-tree 7.6ms 31.5ms
10.7ms 50.6ms
20.3ms 99.5ms
39.4ms 219ms
57.1ms 313ms
Consistent with our analysis that WL evaluation should dominate runtime.
Wirelength Evaluation Runtime (2)
• Estimates less accurate in larger benchmarks– Larger netlists may not fit into processor cache
FP evaluation is never a runtime bottleneck!
ami33 ami49 n100 n200 n300
estimated 89% 88% 85% 80% 76%
actual 86% 87% 84% 85% 82%
% time-per-move spent on WL evaluation
(analytical estimates versus actual measurements)
Must Improve WL EvaluationRather than FP Representations!
• Ideas for speed-ups– Special-case the evaluation of 2-pin nets:
no loops, 2 floating-point comparisons only– Remove inessential nets, whose bounding box contains the outline– Conglomerate 2-pin nets between the same blocks
into heavy-weight nets
• These techniques lead to 10% speed-upwithout loss of quality in Parquet 3
• Parquet 4: another 10% speed-up (2x speed-up per move) with improved wirelength (longer temp. schedule)– Better use of cache (float versus double)– Simultaneous computation of min and max
(25% less comparisons)
Conclusions
• We compared the performance ofSP and B*-tree in wirelength-driven floorplanning – Surprisingly similar performance!
• The main bottleneck is interconnect evaluation:
>75% runtime in realistic FP instances– Parquet-4: Improved WL eval. tangible speed-up
• Asymptotic complexity of FP eval. has little relevance – Realistic FP instances are small (<200 blks)
– Worst-case analysis is too pessimistic
• New FP reps seem irrelevant unlessthey support incremental wirelength evaluation
• The status quo in FP literature needs to be changed
t = cn1.3