New Ways of Generating Large Realistic Benchmarks for Testing Synthesis Tools
Petr Fišer, Jan Schmidt
Faculty of Information Technology, Czech Technical University in Prague
IWSBP 2010, Freiberg
Outline
Motivation
New benchmark generation methods
Experimental results
Conclusions
Motivation
… why another artificial benchmark generator?
To test logic synthesis tools
  Capabilities of synthesis processes
  Immunity to “bad” structures
  Ability to discover “good” structures
  Iterative power
  Scalability
  …
Motivation
J. Cong, K. Minkovich: Optimality study of logic synthesis for LUT-based FPGAs, IEEE Trans. on CAD, vol. 26, 2007
They created artificially large circuits, functionally equivalent to their small origins (70 LUTs)
Synthesis produced 10k – 30k LUTs
Motivation
P. Fišer, J. Schmidt: Small but Nasty Logic Synthesis Examples, IWSBP'08
XOR tree is appended to the circuit outputs and the circuit is collapsed
Synthesis produced >400 LUTs instead of 11
Motivation
Will my synthesis tool produce the same result for different descriptions (versions) of one particular circuit? (a.k.a. iterative power)
Most probably not! (if things go bad)
What went wrong?
What descriptions are bad for me?
What structures caused my failure?
What should I do to perform better?
Proposed Benchmarks
Starting with seed circuit (could be small)
Functionally equivalent “big” circuit is created
The size of the benchmark circuit is adjustable
Ideal case:
[Diagram: the seed circuit goes through Transformations 1, 2 and 3, giving Bench circuits 1, 2 and 3; synthesis of each bench circuit yields the same Result.]
Proposed Benchmarks
Real case:
[Diagram: the seed circuit goes through Transformations 1, 2 and 3, giving Bench circuits 1, 2 and 3; synthesis of each bench circuit yields a different result (Result 1, Result 2, Result 3).]
Cong’s LEKU Benchmarks
J. Cong, K. Minkovich: Optimality study of logic synthesis for LUT-based FPGAs, IEEE Trans. on CAD, vol. 26, 2007
LEKU = Logic Examples with Known Upper Bound
Based on elimination of the original circuit structure
… and bad decomposition
[Diagram: G5 (7 LUTs) → Replicate → G25 (70 LUTs) → Collapse → SOP (19K terms)
 SOP → ABC balance → LEKU-CB (814 gates)
 SOP → SIS tech_decomp → LEKU-CD (>1M gates)]
1. Realistic LEKU Benchmarks
Any circuit may be used as a seed (instead of g25)
Possible chance of success
Global BDDs may be used instead of collapsing
Upper bound = size of the original circuit
[Diagram: Original circuit → Collapse → possibly large SOP → SIS tech_decomp → possibly large circuit
 Original circuit → Global BDD → possibly large SOP → SIS tech_decomp → possibly large circuit]
1. Realistic LEKU Benchmarks
Size increase by collapsing
[Plot: size increase factor (0x to 7x) versus gates (0 to 12,000)]
250 ISCAS and IWLS benchmarks
Size increase in 61% of circuits
1. Realistic LEKU Benchmarks
Experimental results
Benchmark circuit                          Synthesis (# of 4-LUTs)
Bench   Inp.  Out.  Process         Gates     ABC       #1      #2
c432    36    7     original          145      84       77     118
c432    36    7     global BDD      2,017   1,031    1,023   1,333
c432    36    7     ABC collapse    2,658   1,246    1,548   1,648
c432    36    7     SIS collapse    7,075   3,361    3,872   4,738
c880    60    26    original          208     113      110     122
c880    60    26    global BDD    407,098  93,190  174,983     N/A
c880    60    26    ABC collapse   13,727   7,437    8,109   9,460
c880    60    26    SIS collapse   30,015  19,787   20,487  28,017
2. Parity Benchmark Circuits
XOR tree is appended to the circuit outputs, then the structure is destroyed (collapsing, BDD)
No guarantee of circuit size increase
Upper bound = size of the core circuit + XOR tree
[Diagram: inputs x1 to xn feed the core circuit; its outputs y1 to ym feed an XOR tree.
 Collapse → possibly large SOP → SIS tech_decomp → possibly large circuit
 Global BDD → possibly large MUX tree → SIS tech_decomp → possibly large circuit]
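The parity construction can be illustrated on a toy example. The sketch below is only an illustration under stated assumptions: the netlist format (a dictionary from signal name to operator and inputs), the helper names, and the tiny core circuit are all hypothetical, the XOR "tree" is built as a simple chain, and exhaustive enumeration into minterms stands in for the SIS/ABC collapse used in the experiments.

```python
from itertools import product

# Hypothetical netlist format: signal name -> (operator, list of input signals).
# Signals that never appear as keys are primary inputs.
OPS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}

def append_xor_tree(gates, outputs):
    """Return (new_gates, parity_output): a chain of 2-input XORs over outputs."""
    new_gates = dict(gates)
    acc = outputs[0]
    for i, out in enumerate(outputs[1:]):
        name = f"par{i}"
        new_gates[name] = ("XOR", [acc, out])
        acc = name
    return new_gates, acc

def evaluate(gates, signal, values):
    """Recursively evaluate one signal for a given primary-input assignment."""
    if signal not in gates:
        return values[signal]
    op, ins = gates[signal]
    a, b = (evaluate(gates, s, values) for s in ins)
    return OPS[op](a, b)

def collapse(gates, output, inputs):
    """Flatten to a list of minterms (a canonical SOP) by exhaustive enumeration."""
    return [bits for bits in product((0, 1), repeat=len(inputs))
            if evaluate(gates, output, dict(zip(inputs, bits)))]

# Toy core circuit with two outputs, y1 and y2 (an assumption for illustration).
core = {"y1": ("AND", ["x1", "x2"]), "y2": ("OR", ["x2", "x3"])}
parity_gates, parity_out = append_xor_tree(core, ["y1", "y2"])
minterms = collapse(parity_gates, parity_out, ["x1", "x2", "x3"])
```

After collapsing, only the flat list of minterms remains; the core circuit and the XOR gates are no longer visible, which is the structure destruction the benchmark relies on.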
2. Parity Benchmark Circuits
Size increase by appending parity & collapsing
[Plot: size increase (0x to 30x) versus gates (0 to 5,000)]
100 ISCAS and IWLS benchmarks
Size increase in 25% of circuits
2. Parity Benchmark Circuits
Experimental results
Benchmark circuit                          Synthesis (# of 4-LUTs)
Bench   Inp.  Out.  Process         Gates     ABC       #1      #2
s1238   32    1     original          493     229      241     263
s1238   32    1     global BDD      6,282   3,849    4,055   3,839
s1238   32    1     ABC collapse   31,839  19,741   21,875  25,793
s1238   32    1     SIS collapse   39,636  26,313   28,254     N/A
b4      33    1     original          267     110      108     116
b4      33    1     BDD            16,963   6,347    6,099   4,285
b4      33    1     ABC collapse    1,405     730      841     884
b4      33    1     SIS collapse    4,087   2,036    2,422   1,627
3. Tautology Benchmarks
Large random SOP is generated
When the number of terms exceeds some threshold, the SOP is a tautology
Then, the big SOP is mapped into 2-input gates (SIS tech_decomp), giving a big network
Upper bound = 0
The benchmark size may be adjusted by
1. Number of input variables
2. Dimension of SOP terms
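A minimal sketch of the generation step, assuming a hypothetical representation in which each SOP term is a dictionary from variable index to required value. The function names and the brute-force tautology check are illustrative assumptions (the check is exponential in the number of variables, so it only suits small examples), and the mapping into 2-input gates is not shown.

```python
import random
from itertools import product

def is_tautology(sop, num_vars):
    """Brute-force check: every input assignment must satisfy at least one term."""
    for bits in product((0, 1), repeat=num_vars):
        if not any(all(bits[v] == val for v, val in term.items()) for term in sop):
            return False
    return True

def grow_until_tautology(num_vars, literals_per_term, seed=0):
    """Append random terms until the SOP evaluates to 1 for all inputs."""
    rng = random.Random(seed)
    sop = []
    while not is_tautology(sop, num_vars):
        chosen = rng.sample(range(num_vars), literals_per_term)
        # Each chosen variable appears either complemented (0) or direct (1).
        sop.append({v: rng.randint(0, 1) for v in chosen})
    return sop

sop = grow_until_tautology(num_vars=6, literals_per_term=3)
print(len(sop), "terms were needed to reach a tautology")
```

The two size parameters from the slide correspond to `num_vars` and `literals_per_term`. The result is the constant 1, which needs no logic at all, hence the upper bound of 0.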
4. Partial Collapsing
Only parts of the network are collapsed
1. Choose one pivot gate
2. Extract its transitive fan-in and fan-out to a given radius
3. Collapse the extracted network part
4. Decompose into 2-input gates
5. Put it back
6. Iterate several times
Upper bound = size of the original circuit
The benchmark size may be adjusted by
1. Size of the extracted circuit
2. Number of iterations
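Steps 1 and 2 above can be sketched as a bounded breadth-first search. This is an illustration only: the netlist format (signal name mapped to operator and input list, with primary inputs absent from the dictionary) and the function name `extract_part` are assumptions, and the collapse, decomposition and reinsertion steps are not shown.

```python
from collections import deque

def extract_part(gates, pivot, radius):
    """Gates within `radius` levels of the pivot, on its fan-in and fan-out side."""
    # Build the reverse map: signal -> gates that read it.
    fanout = {}
    for name, (_, ins) in gates.items():
        for s in ins:
            fanout.setdefault(s, []).append(name)

    def reach(neighbors):
        # Breadth-first search from the pivot, stopping at the given radius.
        seen = {pivot: 0}
        queue = deque([pivot])
        while queue:
            node = queue.popleft()
            if seen[node] == radius:
                continue
            for nxt in neighbors(node):
                # Primary inputs are not gates, so they are never extracted.
                if nxt in gates and nxt not in seen:
                    seen[nxt] = seen[node] + 1
                    queue.append(nxt)
        return set(seen)

    fanin_side = reach(lambda n: gates[n][1])
    fanout_side = reach(lambda n: fanout.get(n, []))
    return fanin_side | fanout_side

# Toy chain of gates (hypothetical example circuit).
chain = {
    "g1": ("AND", ["a", "b"]),
    "g2": ("OR", ["g1", "c"]),
    "g3": ("AND", ["g2", "d"]),
    "g4": ("OR", ["g3", "a"]),
    "g5": ("AND", ["g4", "b"]),
}
part = extract_part(chain, "g3", 1)
```

A larger radius extracts a larger part, which is the first of the two size parameters listed above.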
4. Partial Collapsing
Example – c432
[Plot: gates (0 to 12,000) versus part size (0 to 140)]
4. Partial Collapsing
Example – big tautology
[Plot: gates (0 to 20,000) versus part size (0 to 10,000)]
4. Partial Collapsing
Example – big tautology
[Plot: gates (0 to 20,000) versus part size (0 to 10,000)]
[Plot, detail: gates (8,000 to 11,000) versus part size (0 to 8,000)]
4. Partial Collapsing
Experimental results
Benchmark circuit                                   Synthesis (4-LUTs)
Bench   Inp.  Out.  Process                  Gates     ABC       #1      #2
c432    36    7     original                   145      84       77     118
c432    36    7     Part. coll., size 98     1,247     626      782     916
c432    36    7     Part. coll., size 109    3,077   1,445    1,699   2,422
c432    36    7     Part. coll., size 138    5,026   2,598    2,761   3,727
c432    36    7     Part. coll., size 140   11,531   6,647    6,844   9,255
c880    60    26    original                   208     113      110     122
c880    60    26    Part. coll., size 129    1,008     485      601     597
c880    60    26    Part. coll., size 171    5,034   2,950    2,394   3,769
c880    60    26    Part. coll., size 201   10,423   6,224    5,010   7,887
5. Replicating Shared Logic
Duplicate a part of the logic that is shared
1. Find a branching signal
2. Duplicate its transitive fan-in, to a given depth
Upper bound = size of the original circuit
The benchmark size may be adjusted by
1. Number of duplicated branches
2. Depth of duplication
[Diagram: example network with signals G1, G2, G3, G6, G7, G10, G11, G16, G19, G22 and G23, shown before and after duplication; the duplicated gates appear as G11’ and G16’.]
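The duplication step can be sketched as follows. This is only an illustration: the netlist format and the function name `duplicate_fanin` are hypothetical, and the example connectivity is an assumption (it follows the ISCAS c17 netlist, whose signal names match the diagram). A gate reached along several paths is copied once, with the depth budget of its first visit.

```python
def duplicate_fanin(gates, branch, consumer, depth):
    """Give `consumer` a private copy of `branch`'s fan-in cone, `depth` levels deep."""
    new_gates = dict(gates)
    copies = {}

    def copy(signal, level):
        # Primary inputs and gates beyond the depth limit stay shared.
        if signal not in gates or level == 0:
            return signal
        if signal not in copies:
            op, ins = gates[signal]
            name = signal + "'"
            while name in new_gates:
                name += "'"
            copies[signal] = name
            new_gates[name] = None  # reserve the name before recursing
            new_gates[name] = (op, [copy(s, level - 1) for s in ins])
        return copies[signal]

    replacement = copy(branch, depth)
    op, ins = new_gates[consumer]
    # Only this consumer is rewired; other readers keep the original signal.
    new_gates[consumer] = (op, [replacement if s == branch else s for s in ins])
    return new_gates

# Assumed connectivity (ISCAS c17); G16 is a branching signal read by G22 and G23.
c17 = {
    "G10": ("NAND", ["G1", "G3"]),
    "G11": ("NAND", ["G3", "G6"]),
    "G16": ("NAND", ["G2", "G11"]),
    "G19": ("NAND", ["G11", "G7"]),
    "G22": ("NAND", ["G10", "G16"]),
    "G23": ("NAND", ["G16", "G19"]),
}
replicated = duplicate_fanin(c17, "G16", "G23", depth=2)
```

With depth 2, G23 reads a private G16', which in turn reads a private G11'; the function of every output is unchanged while the gate count grows from 6 to 8.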
5. Replicating Shared Logic
Experimental results
Benchmark circuit                               Synthesis (4-LUTs)
Bench   Inp.  Out.  Process              Gates     ABC       #1      #2
c432    36    7     original               145      84       77     118
c432    36    7     10k dup., depth 1    1,428      84      244     333
c432    36    7     10k dup., depth 2    4,905      84      447     586
c432    36    7     10k dup., depth 3    8,389      84      396     637
c432    36    7     10k dup., depth 4   11,349      84      452     739
c432    36    7     10k dup., depth 5   16,040      84      472     771
6. Adding Inverters
(special bonus – not included in the proceedings)
Add pairs of inverters to random locations
The network size may be arbitrarily expanded
And all the synthesis tools…
Are completely immune to this!
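For completeness, a minimal sketch of the inverter-pair insertion, under the same hypothetical netlist format as assumed elsewhere (signal name mapped to operator and input list). The function names and the `inv<i>a` / `inv<i>b` naming are illustrative assumptions; the exhaustive equivalence check is only practical for tiny circuits.

```python
import random
from itertools import product

def add_inverter_pairs(gates, count, seed=0):
    """Insert `count` double inverters on randomly chosen gate inputs."""
    rng = random.Random(seed)
    new_gates = {name: (op, list(ins)) for name, (op, ins) in gates.items()}
    for i in range(count):
        name = rng.choice(sorted(new_gates))
        _, ins = new_gates[name]
        pin = rng.randrange(len(ins))
        first, second = f"inv{i}a", f"inv{i}b"
        new_gates[first] = ("NOT", [ins[pin]])
        new_gates[second] = ("NOT", [first])
        ins[pin] = second  # the gate now reads the doubly inverted signal
    return new_gates

def evaluate(gates, signal, values):
    """Recursively evaluate one signal for a given primary-input assignment."""
    if signal not in gates:
        return values[signal]
    op, ins = gates[signal]
    args = [evaluate(gates, s, values) for s in ins]
    if op == "NOT":
        return 1 - args[0]
    if op == "AND":
        return args[0] & args[1]
    if op == "OR":
        return args[0] | args[1]
    raise ValueError(f"unknown operator {op}")

def equivalent(a, b, output, inputs):
    """Compare two netlists on every input assignment."""
    return all(
        evaluate(a, output, dict(zip(inputs, bits)))
        == evaluate(b, output, dict(zip(inputs, bits)))
        for bits in product((0, 1), repeat=len(inputs))
    )

base = {"g1": ("AND", ["a", "b"]), "g2": ("OR", ["g1", "c"])}
padded = add_inverter_pairs(base, count=5)
```

Each inserted pair adds two gates without changing the function, so the network grows by `2 * count` gates.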
Summary Experiments
[Plot, c432: LUTs (0 to 10,000) versus source circuit gates (0 to 18k); series #1, #2, ABC]
Summary Experiments
[Plot, s1238_p: LUTs (0 to 60k) versus source circuit gates (0 to 140k); series #1, #2, ABC]
Conclusions
Several new benchmark generation methods proposed
Artificially “big” circuits are generated from seed circuits
Benchmarks are functionally equivalent to the seed circuits, so the complexity upper bound is known
Tested on ABC and 2 commercial tools
Unfortunate result: the bigger the circuit going to synthesis, the bigger the result