FastPlace: Efficient Analytical Placement using Cell Shifting, Iterative Local Refinement and a Hybrid Net Model

Natarajan Viswanathan, Chris Chong-Nuen Chu
Iowa State University
International Symposium on Physical Design, April 19, 2004


Page 1: Natarajan Viswanathan Chris Chong-Nuen Chu Iowa State University

FastPlace: Efficient Analytical Placement using Cell Shifting, Iterative Local Refinement and a Hybrid Net Model

Natarajan Viswanathan
Chris Chong-Nuen Chu
Iowa State University

International Symposium on Physical Design
April 19, 2004

Page 2: FastPlace – Key Features

Efficient analytical placement using:

1. Cell Shifting
2. Iterative Local Refinement
3. Hybrid Net Model

- Standard cell placement
- Wirelength minimization
- Flat placement

Page 3: Are Existing Algorithms Adequate?

Solution quality

- There may be significant room for improvement
    - For existing wirelength-driven placement algorithms: Cong et al. [ASPDAC 03] [ISPD 03]
    - For existing timing-driven placement algorithms: Cong et al. [ICCAD 03]

Efficiency

- It is important to have fast placement algorithms
    - Circuit sizes are huge in modern designs
    - Placement must be run in early design stages

Page 4: Why Analytical?

- Inherently minimizes the wirelength
- Intrinsically efficient
    - Elegant convex quadratic programming formulation
    - Very efficient techniques exist to solve convex QPs
- Typically employs a flat placement methodology
    - All cells are placed simultaneously
- Maintains the relative positions of cells throughout the placement process

Page 5: Analytical Placement Formulation

Let
    $(x_i, y_i)$ = coordinates of the center of cell $i$
    $w_{ij}$ = weight of the net between cell $i$ and cell $j$
    $\mathbf{x}, \mathbf{y}$ = solution vectors

Cost of the net between cell $i$ and cell $j$:

$$\frac{1}{2} w_{ij} \left[ (x_i - x_j)^2 + (y_i - y_j)^2 \right]$$

Total cost:

$$\frac{1}{2} \mathbf{x}^T Q \mathbf{x} + d_x^T \mathbf{x} + \frac{1}{2} \mathbf{y}^T Q \mathbf{y} + d_y^T \mathbf{y} + \text{const}$$

Analytical Placement Framework:

repeat
    Solve the convex quadratic program
    Spread the cells
until the cells are evenly distributed
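The formulation above can be illustrated for a single axis. The following is a minimal sketch, not the FastPlace implementation: the function names `build_system` and `conjugate_gradient`, the dense list-of-lists matrix, and the toy netlist are all assumptions made for illustration. It assembles Q and d from 2-pin nets and minimizes (1/2) x^T Q x + d^T x by solving Q x = -d with plain (unpreconditioned) conjugate gradient.

```python
def build_system(num_cells, nets, fixed):
    """Build Q and d for cost = 1/2 x^T Q x + d^T x + const (one axis).

    nets:  (i, j, w) two-pin nets between movable cells i and j.
    fixed: (i, p, w) two-pin nets between movable cell i and a fixed
           pin at coordinate p.
    All names here are hypothetical, chosen for this sketch only.
    """
    Q = [[0.0] * num_cells for _ in range(num_cells)]
    d = [0.0] * num_cells
    for i, j, w in nets:
        # 1/2 w (x_i - x_j)^2 adds w to both diagonals and -w off-diagonal
        Q[i][i] += w
        Q[j][j] += w
        Q[i][j] -= w
        Q[j][i] -= w
    for i, p, w in fixed:
        # 1/2 w (x_i - p)^2 = 1/2 w x_i^2 - w p x_i + const
        Q[i][i] += w
        d[i] -= w * p
    return Q, d


def conjugate_gradient(Q, d, tol=1e-10, max_iter=1000):
    """Solve Q x = -d, the stationarity condition of the quadratic cost."""
    n = len(d)
    x = [0.0] * n
    r = [-di for di in d]  # residual (-d) - Q x, with x = 0
    p = r[:]
    rs = sum(v * v for v in r)
    for _ in range(max_iter):
        if rs < tol:
            break
        Qp = [sum(Q[i][k] * p[k] for k in range(n)) for i in range(n)]
        alpha = rs / sum(p[i] * Qp[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Qp[i] for i in range(n)]
        rs_new = sum(v * v for v in r)
        p = [r[i] + (rs_new / rs) * p[i] for i in range(n)]
        rs = rs_new
    return x


# Toy chain: pin at 0 -- cell 0 -- cell 1 -- pin at 3, all weights 1.
Q, d = build_system(2, nets=[(0, 1, 1.0)], fixed=[(0, 0.0, 1.0), (1, 3.0, 1.0)])
x = conjugate_gradient(Q, d)
```

With unit weights the two cells settle at evenly spaced positions between the fixed pins. The y problem has the same Q and only a different linear term, so the same routine would be called a second time with d_y.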

Page 6: FastPlace Approach

Framework:

repeat
    Solve the convex quadratic program
    Reduce wirelength by iterative heuristic
    Spread the cells
until the cells are evenly distributed

Special features of FastPlace:

- Cell Shifting
    - Easy-to-compute technique
    - Enables fast convergence
- Hybrid Net Model
    - Speeds up solving of the convex QP
- Iterative Local Refinement
    - Minimizes wirelength based on a linear objective

Page 7: Outline

FastPlace: Efficient Analytical Placement using

1. Cell Shifting
2. Iterative Local Refinement
3. Hybrid Net Model

Page 8: Spreading by Cell Shifting

- Quadratic placement should produce good relative positions of cells
- Simple shifting of cells should be able to produce a good placement
- Major difficulties:
    1. How to shift cells in a 2-D region?
    2. How to make sure wirelength will still be good?
- Our approach:
    1. Perform 1-D shifting in the x and y directions independently
    2. Interleave a small amount of shifting with quadratic placement

Page 9: Cell Shifting

[Figure: Uniform Bin Structure and Non-uniform Bin Structure]

1. Shifting of bin boundaries
2. Shifting of cells linearly within each bin

Apply to all rows and all columns independently
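The two steps can be sketched for one row of bins. The slides do not give the boundary formula, so the utilization-weighted average used below is an assumption, as are the names `shift_boundaries` and `shift_cell` and the parameter `delta`. The idea it illustrates is that a boundary moves toward the emptier neighbouring bin, and that cells keep their relative order because each bin is remapped linearly.

```python
import bisect


def shift_boundaries(old_bounds, util, delta=1.0):
    """Step 1: move each interior bin boundary (assumed formula).

    old_bounds[i] and old_bounds[i + 1] delimit bin i; util[i] says how
    full bin i is. delta > 0 keeps the denominator non-zero.
    The outermost boundaries stay fixed.
    """
    new_bounds = list(old_bounds)
    for b in range(1, len(old_bounds) - 1):
        u_left, u_right = util[b - 1], util[b]
        # A fuller left bin pulls the boundary toward the right neighbour.
        new_bounds[b] = (
            old_bounds[b - 1] * (u_right + delta)
            + old_bounds[b + 1] * (u_left + delta)
        ) / (u_left + u_right + 2 * delta)
    return new_bounds


def shift_cell(x, old_bounds, new_bounds):
    """Step 2: map x linearly from its old bin onto the resized bin."""
    k = bisect.bisect_right(old_bounds, x) - 1
    k = min(max(k, 0), len(old_bounds) - 2)  # clamp to a valid bin index
    lo, hi = old_bounds[k], old_bounds[k + 1]
    t = (x - lo) / (hi - lo)
    return new_bounds[k] + t * (new_bounds[k + 1] - new_bounds[k])


# Two bins of width 10; the left bin is three times as full as the right.
old = [0.0, 10.0, 20.0]
new = shift_boundaries(old, util=[3.0, 1.0])
cells = [shift_cell(x, old, new) for x in (2.0, 5.0, 15.0)]
```

In this toy case the shared boundary moves right, so the crowded left bin widens and its cells spread out, while the cell in the right bin is pushed further right. Running the same routine per row for x and per column for y matches the "independently" note on the slide.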

Page 10: Cell Shifting – Animation

[Figure: animation of 1-D cell shifting. Labels: Bin i, Bin i+1; boundaries OB_{i-1}, OB_i, OB_{i+1} and NB_i; U_i, U_{i+1}; cells j, k, l]

Page 11: Pseudo Pin and Pseudo Net

[Figure labels: Original Position, Target Position, Additional Force, Pseudo pin, Pseudo net]

- Forces need to be added to prevent cells from collapsing back
- Done by adding pseudo pins and pseudo nets
- Only the diagonal and linear terms of the quadratic system need to be updated
- Takes a single pass of O(n) time to regenerate matrix Q (which is common to both the x and y problems)
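Why only the diagonal and linear terms change can be seen from a small sketch. It assumes the cost form (1/2) x^T Q x + d^T x, stores only the diagonal of Q, and places each pseudo pin directly at a chosen coordinate with a uniform weight; the function name `add_pseudo_nets`, the pin positions and the weight are illustrative assumptions, not the choices made in FastPlace.

```python
def add_pseudo_nets(q_diag, d, pin_positions, weight):
    """Connect cell i to a fixed pseudo pin at pin_positions[i].

    The net cost 1/2 * weight * (x_i - p)^2 expands to
    1/2 * weight * x_i^2 - weight * p * x_i + const, so only the
    diagonal entry Q[i][i] and the linear term d[i] change.
    One pass over the cells, hence O(n).
    """
    for i, p in enumerate(pin_positions):
        q_diag[i] += weight
        d[i] -= weight * p


# Toy system with no cell-to-cell nets, so Q is diagonal and each cell
# solves independently as x_i = -d[i] / Q[i][i].
q_diag = [1.0, 1.0]  # each cell tied to a fixed pin at 0 with weight 1
d = [0.0, 0.0]
before = [-di / qi for qi, di in zip(q_diag, d)]

add_pseudo_nets(q_diag, d, pin_positions=[10.0, 4.0], weight=1.0)
after = [-di / qi for qi, di in zip(q_diag, d)]
```

Before the pseudo nets are added, both cells collapse onto the pin at 0; afterwards each cell sits halfway between that pin and its pseudo pin. The off-diagonal entries of Q are never touched, which is what keeps the update cheap.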

Page 12: Outline

FastPlace: Efficient Analytical Placement using

1. Cell Shifting
2. Iterative Local Refinement
3. Hybrid Net Model

Page 13: Iterative Local Refinement

- Iteratively go through all the cells one by one
- For each cell, consider moving it in four directions by a certain distance
- Compute a score for each direction based on:
    - Half-perimeter wirelength (HPWL) reduction
    - Cell density at the source and destination regions
- Move in the direction with the highest positive score (do not move if no score is positive)
- The distance moved (H or V) decreases over iterations
- Detailed placement is handled by the same heuristic

[Figure: a cell with candidate moves of distance H horizontally and V vertically]
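A single refinement step for one cell might look like the sketch below. It is a simplification under stated assumptions: the score is the HPWL reduction only (the density term from the slide is omitted), HPWL is recomputed over all nets rather than incrementally, and the names `hpwl` and `best_move` and the toy netlist are hypothetical.

```python
def hpwl(nets, pos):
    """Half-perimeter wirelength: bounding-box width + height per net."""
    total = 0.0
    for net in nets:
        xs = [pos[c][0] for c in net]
        ys = [pos[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def best_move(cell, nets, pos, h, v):
    """Try moving `cell` by h (left/right) or v (up/down).

    Returns the candidate position with the largest positive HPWL
    reduction, or the current position if no move helps.
    Density is ignored here; that is an assumption of this sketch.
    """
    base = hpwl(nets, pos)
    x, y = pos[cell]
    best_gain, best_xy = 0.0, (x, y)
    for cand in ((x + h, y), (x - h, y), (x, y + v), (x, y - v)):
        trial = dict(pos)
        trial[cell] = cand
        gain = base - hpwl(nets, trial)
        if gain > best_gain:
            best_gain, best_xy = gain, cand
    return best_xy


# Cell "a" is connected to "b" and "c", which both lie to its right.
pos = {"a": (0, 0), "b": (4, 0), "c": (4, 3)}
nets = [("a", "b"), ("a", "c")]
new_xy = best_move("a", nets, pos, h=2, v=1)
```

Here moving "a" to the right shortens both nets, so that direction wins. A full pass would apply `best_move` to every cell and shrink `h` and `v` between passes, as the slide describes.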

Page 14: Outline

FastPlace: Efficient Analytical Placement using

1. Cell Shifting
2. Iterative Local Refinement
3. Hybrid Net Model

Page 15: Effect of Net Model on Runtime

- Each multi-pin net needs to be replaced by 2-pin nets
- Then the placement problem (even with pseudo nets) can be formulated as a convex QP:

$$\text{Total cost} = \frac{1}{2} \mathbf{x}^T Q \mathbf{x} + d_x^T \mathbf{x} + \frac{1}{2} \mathbf{y}^T Q \mathbf{y} + d_y^T \mathbf{y} + \text{const}$$

- Solved by any convex QP algorithm
    - Use Incomplete Cholesky Conjugate Gradient (ICCG)
    - Runtime is proportional to the # of non-zero entries in Q
    - Each non-zero entry in Q corresponds to one 2-pin net
- Traditionally, placers model each multi-pin net by a clique
    - High-degree nets generate a lot of 2-pin nets
    - This slows down convex QP algorithms significantly

Page 16: Clique, Star and Hybrid Net Models

[Figure: Clique Model, Star Model (with star node) and Hybrid Model]

# pins    Net model
2         Clique
3         Clique
4         Star
5         Star
6         Star
...       ...

- The star model was introduced by Mo et al. [ICCAD-00] for macro placement
    - Introduces a star node even for 2-pin nets
    - Not clear how the placement result will be affected
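Reading the pin-count table above as the hybrid rule (clique for 2 and 3 pins, star from 4 pins upward), the decomposition into 2-pin nets can be sketched as follows. The function names and the tuple used to represent the star node are assumptions for illustration.

```python
from itertools import combinations


def clique_two_pin_nets(pins):
    """Clique model: one 2-pin net per pair, k(k-1)/2 in total."""
    return list(combinations(pins, 2))


def hybrid_two_pin_nets(pins):
    """Hybrid model: clique for 2- and 3-pin nets, star otherwise.

    The star node is represented by a hypothetical ("star", pins)
    tuple; in a placer it would be one extra movable variable.
    """
    if len(pins) <= 3:
        return clique_two_pin_nets(pins)
    star = ("star", tuple(pins))
    return [(p, star) for p in pins]


# Number of 2-pin nets per model for nets of 2 to 10 pins.
counts = {
    k: (len(clique_two_pin_nets(range(k))), len(hybrid_two_pin_nets(range(k))))
    for k in range(2, 11)
}
```

The clique count grows quadratically with the number of pins while the star count grows linearly, at the price of one extra variable per net with four or more pins. For 2- and 3-pin nets the clique needs no more 2-pin nets than a star would, so no star node is introduced there.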

Page 17: Equivalence of Clique and Star Models

Lemma: By setting the net weights appropriately, the clique and star net models are equivalent.

Proof: When the star node is at its equilibrium position, the total forces on each cell are the same for the clique and star net models.

[Figure: Clique Model with weight = γW; Star Model (with star node) with weight = γkW for a k-pin net]
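The lemma can be checked numerically along one axis. This sketch assumes the usual spring interpretation, where a 2-pin net of weight w pulls cell i toward cell j with force w (x_j - x_i), and uses the weights from the figure: γW per clique edge and γkW per star edge. The function names and the sample coordinates are illustrative.

```python
def clique_forces(xs, gamma, W):
    """Force on each cell when every pair is joined with weight gamma * W."""
    return [sum(gamma * W * (xj - xi) for xj in xs) for xi in xs]


def star_forces(xs, gamma, W):
    """Force on each cell when joined to a star node with weight gamma * k * W.

    The star node sits at its equilibrium position, which for equal
    edge weights is the mean of the cell coordinates.
    """
    k = len(xs)
    star = sum(xs) / k
    return [gamma * k * W * (star - xi) for xi in xs]


xs = [0.0, 1.0, 5.0, 10.0]
fc = clique_forces(xs, gamma=0.5, W=2.0)
fs = star_forces(xs, gamma=0.5, W=2.0)
```

Both models give γW (Σ_j x_j - k x_i) as the force on cell i, which is why the two lists agree. The same argument applies to the y coordinates.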

Page 18: Experimental Setup

- ISPD-02 mixed-mode benchmark suite by IBM
    - Macro blocks replaced by standard cells with width set to 4 x average cell width
    - 10% whitespace
- FastPlace implemented in C
- Compared with:
    - MetaPl-Capo 8.8 in default mode
    - Dragon 2.2.3 in fixed die mode
- All placers run on a 750MHz Sun Sparc-2 machine

Page 19: Placement Benchmark Statistics

Circuit   #Nodes  #Terminals    #Nets    #Pins  #Rows
ibm01      12506         246    14111    50566     96
ibm02      19342         259    19584    81199    109
ibm03      22853         283    27401    93573    121
ibm04      27220         287    31970   105859    136
ibm05      28146        1201    28446   126308    139
ibm06      32332         166    34826   128182    126
ibm07      45639         287    48117   175639    166
ibm08      51023         286    50513   204890    170
ibm09      53110         285    60902   222088    183
ibm10      68685         744    75196   297567    234
ibm11      70152         406    81454   280786    208
ibm12      70439         637    77240   317760    242
ibm13      83709         490    99666   357075    224
ibm14     147088         517   152772   546816    305
ibm15     161187         383   186608   715823    303
ibm16     182980         504   190048   778823    347
ibm17     184752         743   189581   860036    379
ibm18     210341         272   201920   819697    361

Page 20: Clique Net Model vs Hybrid Net Model

          # Non-zero Entries                               Speed-Up
Circuit   Clique Model  Hybrid Model  Clique / Hybrid      (Hybrid / Clique)
ibm01           109183         41164             2.65      1.5
ibm02           343409         70014             4.90      2.4
ibm03           206069         74680             2.76      1.4
ibm04           220423         84556             2.61      1.2
ibm05           349676        108282             3.23      1.3
ibm06           321308        106835             3.01      1.6
ibm07           373328        147009             2.54      1.3
ibm08           732550        173541             4.22      2.0
ibm09           478777        185102             2.59      1.4
ibm10           707969        251101             2.82      1.6
ibm11           508442        230865             2.20      1.2
ibm12           748371        270849             2.76      1.6
ibm13           744500        295048             2.52      1.5
ibm14          1125147        456474             2.46      1.3
ibm15          1751474        607289             2.88      1.4
ibm16          1923995        668491             2.88      1.3
ibm17          2235716        753507             2.97      1.4
ibm18          2221860        711702             3.12      1.4
Average                                          2.95      1.5

Page 21: Half Perimeter Wirelength

Average wirelength ratio:

- FastPlace / Capo: 1.010
- FastPlace / Dragon: 1.016

[Figure: bar chart of wirelength (x 10e6), y-axis from 0 to 80, for circuits ibm01 to ibm18; series: Capo 8.8, Dragon 2.2.3, FastPlace]

Page 22: Runtime Comparison

          Runtime                                   Speed-Up
Circuit   Capo 8.8    Dragon 2.2.3   FastPlace     (Capo / FP)   (Dragon / FP)
ibm01     3 m 59 s    29 m 06 s      13 s          x 18.4        x 134.3
ibm02     7 m 15 s    31 m 13 s      33 s          x 13.2        x 56.8
ibm03     8 m 23 s    31 m 49 s      33 s          x 15.2        x 57.8
ibm04     10 m 46 s   1 h 5 m        39 s          x 16.6        x 100.0
ibm05     10 m 44 s   1 h 48 m       51 s          x 12.6        x 127.1
ibm06     12 m 08 s   1 h 21 m       45 s          x 16.2        x 108.0
ibm07     18 m 32 s   1 h 47 m       1 m 19 s      x 14.1        x 81.3
ibm08     19 m 53 s   4 h 30 m       1 m 33 s      x 12.8        x 174.2
ibm09     22 m 50 s   3 h 43 m       1 m 42 s      x 13.4        x 131.2
ibm10     29 m 04 s   3 h 19 m       2 m 25 s      x 12.0        x 82.3
ibm11     31 m 11 s   2 h 22 m       2 m 13 s      x 14.1        x 64.1
ibm12     30 m 41 s   3 h 48 m       2 m 23 s      x 12.9        x 95.7
ibm13     39 m 27 s   3 h 04 m       2 m 54 s      x 13.6        x 63.4
ibm14     1 h 12 m    7 h 37 m       5 m 34 s      x 12.9        x 82.1
ibm15     1 h 30 m    10 h 34 m      8 m 45 s      x 10.3        x 72.4
ibm16     1 h 31 m    12 h 06 m      10 m 52 s     x 8.4         x 66.8
ibm17     1 h 43 m    26 h 54 m      11 m 30 s     x 9.0         x 140.3
ibm18     1 h 44 m    23 h 39 m      12 m 21 s     x 8.4         x 114.9
Average                                            x 13.0        x 97.4
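The speed-up columns are plain runtime ratios, which can be reproduced from the ibm01 row. The helper `to_seconds` is a hypothetical name introduced here; it only parses the "h / m / s" notation used in the table.

```python
import re


def to_seconds(text):
    """Convert strings such as '3 m 59 s' or '1 h 5 m' to seconds."""
    units = {"h": 3600, "m": 60, "s": 1}
    return sum(int(n) * units[u] for n, u in re.findall(r"(\d+)\s*([hms])", text))


# ibm01 row of the runtime table
capo = to_seconds("3 m 59 s")
dragon = to_seconds("29 m 06 s")
fastplace = to_seconds("13 s")

speedup_vs_capo = capo / fastplace
speedup_vs_dragon = dragon / fastplace
```

Rounded to one decimal place, the two ratios agree with the x 18.4 and x 134.3 entries listed for ibm01. Rows reported only to the minute (for example "1 h 5 m") cannot be reproduced to the same precision.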

Page 23: FastPlace - Breakdown of Runtime

All runtimes in seconds

          Global Optimization   Cell Shifting     Iterative Local Refinement   Detail Placement   Total
Circuit   Runtime      %        Runtime     %     Runtime      %               Runtime     %      Runtime
ibm01        3.75   28.6           1.44  11.0        6.37   48.6                  1.55  11.8        13.11
ibm06       13.27   29.2           3.91   8.6       25.04   55.1                  3.21   7.1        45.43
ibm11       56.80   42.9          14.67  11.1       49.20   37.1                 11.86   8.9       132.53
ibm16      257.41   39.4          53.93   8.3      292.74   44.9                 47.99   7.4       652.07
ibm18      285.57   38.5          57.09   7.7      345.28   46.6                 52.98   7.2       740.92

Page 24: Complexity Analysis

Runtime ≈ O(n^1.412), where n = # of pins

Runtime ≈ O(n^1.37), where n = # of pins
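An empirical exponent of this kind is typically obtained from a least-squares line through (log n, log runtime). The sketch below assumes that method, which the slides do not state, and uses only the five circuits whose total runtimes in seconds appear in the runtime breakdown, paired with their pin counts from the benchmark statistics. The function name `fit_exponent` is hypothetical.

```python
import math

# (# pins, total FastPlace runtime in seconds) for ibm01, ibm06, ibm11,
# ibm16 and ibm18, taken from the benchmark and breakdown tables.
data = [
    (50566, 13.11),
    (128182, 45.43),
    (280786, 132.53),
    (778823, 652.07),
    (819697, 740.92),
]


def fit_exponent(points):
    """Slope of the least-squares line through (log n, log t)."""
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(t) for _, t in points]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


exponent = fit_exponent(data)
```

The fitted value depends on which circuits are included, so a five-point fit need not match the exponents quoted on the slide; it does land in the same superlinear but clearly sub-quadratic range.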

Page 25: Summary

- FastPlace: efficient flat placement algorithm
    - 13.0x faster than Capo
    - 97.4x faster than Dragon
    - Comparable WL to Capo and Dragon
- Based on three techniques:
    1. Cell Shifting
        - Fast convergence
        - Simple computation
    2. Iterative Local Refinement
        - Reduces wirelength based on the HPWL measure
    3. Hybrid Net Model
        - 1.5x speedup compared to Clique
        - Applicable to any analytical placement tool

Page 26: Thank You!!