Scalable Don’t-Care-Based Logic Optimization and Resynthesis
Alan Mishchenko, University of California, Berkeley
Robert Brayton, University of California, Berkeley
Jie-Hong Roland Jiang, National Taiwan University
Stephen Jang, Xilinx, Inc.
Outline
• Motivation
• Brief history of don’t-cares
• Algorithm overview
• Algorithm components
• Experimental results
• Conclusion
Motivation
[Figure: resynthesis transforms the network after mapping into an optimized network]
Applications
• tech-independent synthesis
• post-mapping delay/area optimization
• placement-aware resynthesis
Requirements
• substantial logic restructuring
• flexibility to solve many optimization tasks
• reasonable runtime for large designs
Our solution
• SAT-based resynthesis
• with don’t-cares
• using resubstitution
Brief History of Don’t-Cares
• Previous-century work (1960-2000)
  – Permissible functions (Saburo Muroga, 1989)
  – Compatible observability don’t-cares (Hamid Savoj, 1992)
• Complete rather than compatible don’t-cares (2002)
• SAT-based don’t-care computation (2005)
• Interpolation-based optimization with don’t-cares, without explicitly computing don’t-cares (this talk)
Background Summary
• Assuming familiarity with
  – Networks and nodes
  – Cuts and cones
  – Don’t-cares and resubstitution
  – SAT-based interpolation
Resubstitution with Don’t-Cares
Consider all or some nodes in a Boolean network.
For each node
• Create a window
• Select possible fanin nodes (divisors)
• For each candidate subset of divisors
  – Rule out some subsets using simulation
  – Check resubstitution feasibility using SAT
  – Compute the resubstitution function using interpolation (a low-cost by-product of completed SAT proofs)
• Update the network if there is an improvement
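The simulation filter in this flow can be sketched as follows. This is a minimal Python sketch, not ABC's implementation: the name `ruled_out_by_simulation` and the modelling of nodes as Python callables over 0/1 inputs are assumptions made for illustration. A candidate is discarded as soon as two simulated care patterns give the node different values while every divisor gives them the same value. Candidates that survive still have to be checked by SAT.

```python
import random

def ruled_out_by_simulation(f, divisors, patterns, care=lambda *x: 1):
    """Return True if the patterns prove the candidate divisor set infeasible.

    f, care and each divisor are callables over 0/1 inputs (an assumption
    of this sketch). A False result is inconclusive, not a proof.
    """
    seen = {}  # divisor signature -> value of f first observed with it
    for x in patterns:
        if not care(*x):
            continue  # don't-care patterns impose no constraint
        signature = tuple(g(*x) for g in divisors)
        value = f(*x)
        if seen.setdefault(signature, value) != value:
            return True  # same divisor values, different node values
    return False

# Functions from the resubstitution example later in this deck.
f = lambda a, b, c: (a ^ b) & (b | c)
y1 = lambda a, b, c: (1 - a) & b
y2 = lambda a, b, c: a & (1 - b) & c
y3 = lambda a, b, c: a | b
y4 = lambda a, b, c: b & c

rng = random.Random(0)
patterns = [tuple(rng.randint(0, 1) for _ in range(3)) for _ in range(6)]
print(ruled_out_by_simulation(f, [y1, y2], patterns))  # feasible set: never ruled out
print(ruled_out_by_simulation(f, [y3, y4], patterns))  # True only if a conflict is sampled
```

With only a few random patterns the filter is cheap but incomplete, which is why the flow follows it with a SAT check.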
Resubstitution
Resubstitution considers a node in a Boolean network and expresses it using a different set of fanin nodes.
[Figure: node X before and after resubstitution]
The computation can be enhanced by don’t-cares.
Windowing a Node in the Network for Don’t-Care Computation
• Definition
  – A window for a node in the network is the context in which the don’t-cares are computed
• A window includes
  – n levels of the TFI
  – m levels of the TFO
  – all re-convergent paths captured in this scope
• A window with its PIs and POs can be considered as a separate network
[Figure: Boolean network (k-LUT mapped circuit) with a window of n = 3 TFI levels and m = 3 TFO levels, showing the window PIs and window POs]
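The two level-bounded cones that make up a window can be sketched as below. This is a simplified Python illustration under stated assumptions: the network is a hypothetical dictionary mapping each node to its fanins, a "level" is counted as one edge, and the function names are invented here. The sketch collects only the n-level TFI and the m-level TFO; it does not add the re-convergent paths mentioned above.

```python
from collections import deque

def bounded_cone(start, neighbors, depth):
    """Collect nodes reachable from start within `depth` edges."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == depth:
            continue  # do not expand past the level limit
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return seen

def node_window(node, fanins, n, m):
    """Return (tfi, tfo): n levels of fanin and m levels of fanout of node."""
    fanouts = {}
    for v, ins in fanins.items():
        for u in ins:
            fanouts.setdefault(u, []).append(v)
    return bounded_cone(node, fanins, n), bounded_cone(node, fanouts, m)

# Hypothetical network: a, b -> c -> d -> e -> f
fanins = {"a": [], "b": [], "c": ["a", "b"], "d": ["c"], "e": ["d"], "f": ["e"]}
tfi, tfo = node_window("d", fanins, n=1, m=1)
print(sorted(tfi), sorted(tfo))
```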
Don’t-Care Representation
Miter for don’t-care computation
[Figure: miter comparing the window with a copy of the same window in which the output of node f is inverted]
If the output is 1, the input is a care
If the output is 0, the input is a don’t-care
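For a small window the miter can be evaluated by enumeration, which makes the care/don’t-care rule concrete. The sketch below is an illustration only: `window(x, invert)` is a hypothetical callable returning the window PO values, and exhaustive enumeration stands in for the SAT-based computation used in practice.

```python
from itertools import product

def care_minterms(window, num_pis):
    """Enumerate window-PI assignments for which the node's value is observable.

    `window(x, invert)` is a hypothetical callable returning the tuple of
    window PO values, with the node output inverted when `invert` is True.
    """
    care = set()
    for x in product((0, 1), repeat=num_pis):
        # The miter output is 1 when any window PO differs between the two copies.
        if window(x, False) != window(x, True):
            care.add(x)
    return care

def example_window(x, invert):
    a, b, c = x
    node = a & b            # node under optimization
    if invert:
        node = 1 - node     # second copy of the window uses the inverted node
    return (node | c,)      # single window PO

print(sorted(care_minterms(example_window, 3)))
```

In this example the node feeds an OR gate with c, so every input with c = 1 is a don’t-care: the window PO is 1 regardless of the node's value.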
Resubstitution with Don’t-Cares
• Given:
  – node function F(x) to be replaced
  – care set C(x) for the node
  – candidate set of divisors {gi(x)} for expressing F(x)
• Find:
  – A resubstitution function h(y) such that F(x) = h(g(x)) on the care set
• SPFD Theorem:
  – Function h exists if and only if each pair of care minterms, x1 and x2, distinguished by F(x), is also distinguished by gi(x) for some i
[Figure: node F(x) with care set C(x) and divisors g1, g2, g3, and the node re-expressed as F’(x) = h(g)]
Checking Resubstitution using SAT
Miter for resubstitution check (the SPFD theorem in practice)
[Figure: miter built from two copies of the circuit, parts A and B, over inputs x1 and x2, each with outputs f, g1, g2, g3, and C; h(g) is derived from this miter]
Comments:
• Note the use of the care set, C.
• The resubstitution function exists if and only if the SAT problem is unsatisfiable.
• The function h(g) is computed using interpolation.
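Read together with the SPFD theorem, the check can be written as a single formula over two copies of the window inputs, $x_1$ and $x_2$. The following is a sketch of that formulation, assuming the usual two-copy miter structure; the clause-level encoding used in ABC may differ.

```latex
\[
  C(x_1) \wedge C(x_2) \wedge f(x_1) \wedge \neg f(x_2)
  \wedge \bigwedge_{i} \bigl( g_i(x_1) \equiv g_i(x_2) \bigr)
\]
```

A satisfying assignment is a pair of care minterms that f distinguishes but no divisor does, so resubstitution is feasible exactly when the formula is unsatisfiable.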
Experimental Setup
• Implemented in ABC (command “mfs”)
• The SAT solver is a modified version of MiniSat-1.14C by Niklas Eén and Niklas Sörensson
• The algorithm was applied to a mapped network and attempted resubstitution for each LUT to reduce (a) area and (b) the number of fanins
• Experiments targeted networks after FPGA mapping into 6-LUTs and ran on a 2-CPU Intel Xeon machine with 8 GB of RAM
• The resulting networks were verified using the equivalence checker in ABC (command “cec”)
• Optimization scripts used
  – Baseline: result of (dc2 -l; dc2 -l; if -C 12)^1
  – Choices: best result of (st; dch; if -C 12)^4
  – Mfs: best result of (st; dch; if -C 12; mfs -W 4)^4
Results for Academic Benchmarks
                                      Baseline              Choices                  Mfs
Example     PI   PO    FF    LUT Level    Time    LUT Level    Time    LUT Level    Time
alu4        14    8     0    845     5    0.46    786     5    2.23    499     5   15.53
apex2       39    3     0    987     6    0.53    922     6    5.80    674     6   33.71
apex4        9   19     0    821     5    0.41    798     5    2.10    786     5   16.41
bigkey     263  197   224    567     3    0.60    567     3    0.86    455     3    1.68
clma       383   82    33   3309    10    1.80   2910     9   16.23    701     7  122.24
des        256  245     0    880     5    0.62    872     4    2.90    638     4    7.88
diffeq      64   39   377    712     7    0.37    690     7    0.80    645     7    2.77
dsip       229  197   224    682     3    0.50    681     3    0.58    677     2    1.65
elliptic   131  114  1122   1877    10    0.85   1914    10    2.20   1813    10    4.80
ex1010      10   10     0   2934     6    1.48   2712     6   17.14   1342     6  101.13
ex5p         8   63     0    593     5    0.37    521     5    1.58    119     3    4.57
frisc       20  116   886   1777    12    1.06   1749    12    7.43   1757    11   16.64
i10        257  224     0    595     9    0.39    554     9    1.37    545     8    9.35
misex3      14   14     0    772     5    0.43    701     5    2.19    368     5   12.11
pdc         16   40     0   2113     7    1.35   1959     6   15.36    128     5   25.91
s38417      28  106  1636   2257     6    1.33   2271     6    7.09   2206     6   26.11
s38584      12  278  1452   2319     7    1.47   2373     7    8.41   2250     6   14.01
seq         41   35     0    872     5    0.50    834     5    4.64    684     5   17.73
spla        16   46     0   1622     6    1.08   1417     6   11.58    161     4   19.12
tseng       52  122   385    717     7    0.30    690     7    0.63    639     7    2.35
Ratio                      1.000 1.000   1.000  0.952 0.976   4.831  0.550 0.878  17.101
Ratio                                           1.000 1.000   1.000  0.578 0.900   3.540
Results for Industrial Benchmarks
                                       Baseline              Choices                  Mfs
Example     PI    PO    FF    LUT  Lev     Time    LUT  Lev     Time    LUT  Lev     Time
Design01  1332  5064  5625  15453    8    10.08  14830    8    62.17  13793    7   104.91
Design02  1559  5701 10373  28091   10    21.50  26972    9   134.89  24997    9   312.14
Design03   993  5533  6430  15033   10     7.43  14428   10    40.69  14010   10   118.00
Design04   974  1301   940   2841   31     2.09   2723   30     7.82   2697   30   121.33
Design05   101   198  1177   2649    6     1.60   2554    5    10.86   2222    5    20.80
Design06    68    85  1355   3624   19     2.53   3385   16    27.58   3192   15   102.77
Design07  6598 11151 22382  71637   17    61.73  69747   15   475.84  63116   13  1154.14
Design08  2126  6451  7075  20504   15    12.27  19860   14    70.61  18943   12   150.09
Design09  2450  4798  3725   9951    4     3.13   9718    4     9.50   9374    3    21.67
Design10  1032  1767  1124   4447   10     2.24   4299   10    15.13   4105    9    44.32
Design11  4040  9406 35654  83113   16    71.99  81601   16   472.68  73478   14  1748.12
Design12   115   264  2293   5413    7     3.53   5209    6    24.07   4576    6    49.35
Design13    56    87   465   1756   12     1.19   1311    8     8.19   1162    8    27.44
Design14    14    60   426   1448    9     0.91   1455    8     8.79   1382    7    34.77
Ratio                       1.000 1.00    1.000  0.949 0.90    6.310  0.882 0.83   18.801
Ratio                                            1.000 1.00    1.000  0.930 0.92    2.979
Conclusion
• Introduced a new SAT-based logic optimization engine
  – uses a rugged windowing scheme without previous limitations
  – uses a SAT solver for all aspects of functional manipulation
  – designed for scalability and applicable to large industrial circuits
• Showed promising experimental results
  – academic benchmarks (10-40% in area, 10% in delay)
  – industrial benchmarks (7% in area, 8% in delay)
  – improvements can be made even on top of strong synthesis
• Future work
  – improving runtime by fine-tuning simulation and SAT
  – experimenting with timing-driven and power-aware resynthesis
  – extending don’t-care computation to work with white-boxes
  – global circuit restructuring using interpolation
The End
Algorithm Overview
nodeSatBasedResynthesis( node, parameters )
{
    window   = nodeWindow( node, parameters );
    divisors = nodeDivisors( node, window, parameters );
    cands    = nodeResubCandsFilter( node, window, parameters );
    best_cand = NULL;
    for each candidate set c in cands {
        // skip candidates that cost more than the best one found so far
        if ( best_cand != NULL && resubCost(best_cand) < resubCost(c) )
            continue;
        // skip candidates for which resubstitution is infeasible (SAT check)
        if ( !resubFeasible( node, window, c ) )
            continue;
        best_cand = c;
    }
    if ( best_cand != NULL ) {
        best_func = nodeInterpolate( sat_solver, node );
        nodeUpdate( node, best_cand, best_func );
    }
}
Divisor Selection
• A divisor is a candidate fanin of the pivot node after resubstitution
• Divisor computation:
  – Partition window PIs into (a) those in the TFI of the pivot node and (b) the remaining window PIs
  – Add nodes between the pivot and window PIs of type (a), excluding the node and the node’s MFFC
  – Add nodes in the window if their structural support has no window PIs of type (b)
  – Do not collect divisors whose level exceeds a limit
  – Do not collect more than a given number of divisors
[Figure: window around the pivot node (k = 3, m = 3) with window POs and window PIs of type (a) and type (b)]
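The selection rules above can be sketched on a small hypothetical network. This is an illustrative Python sketch, not the ABC code, and it makes several assumptions that the slide does not state: window PIs of type (a) themselves count as divisors, nodes in the pivot's TFO are excluded (they depend on the pivot), `order` lists the window nodes topologically, and reference counters `refs` are supplied by the caller. Both "add" rules are merged into one test, since a node between the pivot and type (a) PIs also has no type (b) PI in its support.

```python
def support(node, fanins, pis, memo):
    """Set of window PIs in the structural support of node."""
    if node not in memo:
        if node in pis:
            memo[node] = frozenset([node])
        else:
            memo[node] = frozenset().union(
                *(support(u, fanins, pis, memo) for u in fanins[node]))
    return memo[node]

def mffc(pivot, fanins, refs, pis):
    """Pivot plus the internal nodes that are referenced only through it."""
    refs = dict(refs)  # work on a copy of the reference counters
    cone, stack = {pivot}, [pivot]
    while stack:
        for u in fanins[stack.pop()]:
            if u in pis:
                continue
            refs[u] -= 1
            if refs[u] == 0:
                cone.add(u)
                stack.append(u)
    return cone

def select_divisors(pivot, order, pis, fanins, refs, level, max_level, max_count):
    memo = {}
    type_a = support(pivot, fanins, pis, memo)   # window PIs in the pivot's TFI
    type_b = set(pis) - type_a                   # remaining window PIs
    excluded = mffc(pivot, fanins, refs, pis)
    downstream = {pivot}                         # pivot and its TFO (assumed excluded)
    divisors = []
    for v in order:                              # `order` must be topological
        if any(u in downstream for u in fanins[v]):
            downstream.add(v)
            continue
        if v in excluded or support(v, fanins, pis, memo) & type_b:
            continue
        if level[v] > max_level:
            continue
        divisors.append(v)
        if len(divisors) == max_count:
            break
    return divisors

# Hypothetical window: pivot n2 = f(n1, b), with n1 used only by n2.
pis = {"a", "b", "c"}
fanins = {"a": [], "b": [], "c": [], "n1": ["a", "b"], "n3": ["a", "b"],
          "n4": ["b", "c"], "n2": ["n1", "b"], "n5": ["n2", "n4"]}
refs = {"n1": 1, "n2": 1, "n3": 1, "n4": 1, "n5": 1}
level = {"a": 0, "b": 0, "c": 0, "n1": 1, "n3": 1, "n4": 1, "n2": 2, "n5": 3}
order = ["a", "b", "c", "n1", "n3", "n4", "n2", "n5"]
print(select_divisors("n2", order, pis, fanins, refs, level, max_level=2, max_count=10))
```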
Resubstitution
Example: Given $F = (a \oplus b)(b \vee c)$, $C = 1$
Two candidate sets: $\{y_1 = a'b,\ y_2 = ab'c\}$ and $\{y_3 = a \vee b,\ y_4 = bc\}$
Set {y1, y2} is feasible
Set {y3, y4} is infeasible
Counter-example: x1 = 100, x2 = 101
abc  F  y1  y2  y3  y4
000  0   0   0   0   0
001  0   0   0   0   0
010  1   1   0   1   0
011  1   1   0   1   1
100  0   0   0   1   0
101  1   0   1   1   0
110  0   0   0   1   0
111  0   0   0   1   1
• Resubstitution of F(x) with care set C(x) and candidate functions {gi(x)} exists iff every pair of care minterms, x1 and x2, distinguished by F(x), is also distinguished by gi(x) for some i
  – That is, if the information of F(x) does not exceed that of {gi(x)}
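The condition can be checked exhaustively on this example. The Python sketch below enumerates all pairs of care minterms instead of calling a SAT solver, so it only illustrates the criterion; the function name `find_counterexample` and the encoding of the functions as lambdas (read off the truth table above) are assumptions of the sketch.

```python
from itertools import combinations, product

def find_counterexample(f, divisors, care, num_vars):
    """Return a pair of care minterms violating the condition, or None."""
    minterms = [x for x in product((0, 1), repeat=num_vars) if care(*x)]
    for x1, x2 in combinations(minterms, 2):
        distinguished_by_f = f(*x1) != f(*x2)
        distinguished_by_g = any(g(*x1) != g(*x2) for g in divisors)
        if distinguished_by_f and not distinguished_by_g:
            return x1, x2
    return None

# Functions matching the truth table of the example.
f = lambda a, b, c: (a ^ b) & (b | c)
y1 = lambda a, b, c: (1 - a) & b
y2 = lambda a, b, c: a & (1 - b) & c
y3 = lambda a, b, c: a | b
y4 = lambda a, b, c: b & c
care = lambda a, b, c: 1  # C = 1: every minterm is a care minterm

print(find_counterexample(f, [y1, y2], care, 3))  # None: the set is feasible
print(find_counterexample(f, [y3, y4], care, 3))  # a violating pair of minterms
```

The pair reported for {y3, y4} depends on enumeration order and need not be the one shown on the slide; any such pair proves infeasibility.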
Computing Dependency Function
• Definition of the interpolant:
  – Consider A(x, y) and B(y, z) such that $A(x, y) \wedge B(y, z) = 0$, where x and z appear only in the clauses of A and of B, respectively, and y are the variables common to A and B.
  – An interpolant of function A(x, y) w.r.t. function B(y, z) is a Boolean function, I(y), depending only on the common variables y, such that $A(x, y) \Rightarrow I(y)$ and $I(y) \Rightarrow \overline{B(y, z)}$.
• Problem:
  – Find function h(g) such that h(g(x)) can replace f(x) on the care set C(x), that is, $C(x) \Rightarrow [h(g(x)) \equiv f(x)]$. The dependency function h(g) expresses the node, f(x), in terms of {gi}.
• Solution:
  – Prove the corresponding SAT problem “unsatisfiable”
  – Derive the unsatisfiability proof [Goldberg/Novikov, DATE’03]
  – Derive the interpolant from the unsatisfiability proof using McMillan’s procedure [CAV’03] (assume A and B as shown on the previous slide)
  – Use the interpolant as the dependency function, h(g)
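For small functions a valid dependency function can be obtained by enumeration, which shows what the interpolant has to satisfy. The sketch below is a brute-force stand-in and not McMillan's proof-based procedure: it builds h as the set of divisor-value vectors reached by care minterms where f = 1. The name `dependency_function` and the lambda encodings are assumptions of the sketch.

```python
from itertools import product

def dependency_function(f, divisors, care, num_vars):
    """Return h with h(g(x)) == f(x) on the care set, or None if none exists."""
    onset, offset = set(), set()
    for x in product((0, 1), repeat=num_vars):
        if not care(*x):
            continue  # don't-care minterms impose no constraint on h
        y = tuple(g(*x) for g in divisors)
        (onset if f(*x) else offset).add(y)
    if onset & offset:
        return None  # some divisor vector would need both values
    # Unreached divisor vectors are don't-cares of h; this sketch maps them to 0.
    return lambda *y: int(tuple(y) in onset)

f = lambda a, b, c: (a ^ b) & (b | c)
y1 = lambda a, b, c: (1 - a) & b
y2 = lambda a, b, c: a & (1 - b) & c
care = lambda a, b, c: 1

h = dependency_function(f, [y1, y2], care, 3)
for x in product((0, 1), repeat=3):
    assert h(y1(*x), y2(*x)) == f(*x)  # h reproduces f through the divisors
```

The proof-based interpolant may differ from this h on divisor vectors that no care minterm produces, since those values are unconstrained.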
Resynthesis Heuristics
• Resynthesis is attempted for each node
• Window, divisors, and resubstitution candidates are computed
• Heuristics for different minimization criteria:
  – Area
    • Try replacing each fanin whose reference counter is 1
  – Fanin count
    • Try replacing each fanin
  – Delay
    • Try replacing each fanin that is on the critical path
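The three heuristics differ only in which fanins of a node are targeted for replacement. A minimal Python sketch of that choice follows; the function name, the criterion strings, and the `ref_count` and `critical` inputs are hypothetical and chosen for illustration.

```python
def fanins_to_replace(fanins, criterion, ref_count=None, critical=None):
    """Pick the fanins of a node that resynthesis should try to replace."""
    if criterion == "area":
        # A fanin referenced only by this node becomes unused once it is replaced.
        return [u for u in fanins if ref_count[u] == 1]
    if criterion == "fanin":
        return list(fanins)
    if criterion == "delay":
        return [u for u in fanins if u in critical]
    raise ValueError(f"unknown criterion: {criterion}")

fanins = ["p", "q", "r"]
ref_count = {"p": 1, "q": 3, "r": 1}
critical = {"q"}
print(fanins_to_replace(fanins, "area", ref_count=ref_count))
print(fanins_to_replace(fanins, "fanin"))
print(fanins_to_replace(fanins, "delay", critical=critical))
```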
Previous Work
• Optimization and mapping with internal flexibilities
  – S. Muroga, Y. Kambayashi, H. C. Lai, and J. N. Culliney, “The transduction method-design of logic networks based on permissible functions”, IEEE Trans. Comp., Vol. 38(10), pp. 1404-1424, Oct. 1989.
  – H. Savoj, “Don’t cares in multi-level network optimization”, Ph.D. Dissertation, UC Berkeley, May 1992.
  – V. N. Kravets and P. Kudva, “Implicit enumeration of structural changes in circuit optimization”, Proc. DAC ’04, pp. 438-441.
  – A. Mishchenko and R. Brayton, “SAT-based complete don’t-care computation for network optimization”, Proc. DATE ’05, pp. 418-423.
  – K. McMillan, “Don’t-care computation using k-clause approximation”, Proc. IWLS ’05, pp. 153-160.
• Equivalence under don’t-cares
  – Q. Zhu, N. Kitchen, A. Kuehlmann, and A. L. Sangiovanni-Vincentelli, “SAT sweeping with local observability don’t-cares”, Proc. DAC ’06, pp. 229-234.
  – S. Plaza, K.-H. Chang, I. L. Markov, and V. Bertacco, “Node mergers in the presence of don’t cares”, Proc. ASP-DAC ’07, pp. 414-419.
• Maximal reduction resynthesis without don’t-cares
  – K.-C. Chen and J. Cong, “Maximal reduction of lookup-table-based FPGAs”, Proc. DATE ’92, pp. 224-229.
• Computing dependency functions using interpolation
  – C.-C. Lee, J.-H. R. Jiang, C.-Y. Huang, and A. Mishchenko, “Scalable exploration of functional dependency by interpolation and incremental SAT solving”, Proc. IWLS ’07.
Experimental Results
• Implementation of SAT-based resynthesis
  – ABC: Logic synthesis and verification system developed at UC Berkeley
  – SAT solver used is MiniSat-C_v1.14.1 by Niklas Eén and Niklas Sörensson
• Outline of experiments
  – Perform technology-independent synthesis: resyn; if
  – Perform high-quality FPGA mapping: if
  – Perform resynthesis
    • without choices: imfs -W 66; imfs -a -W 66; imfs -W 66
    • with choices (script is more complicated)
  – Measure gain in area, delay, net count
• Commands used in the scripts
  – if is a new efficient FPGA mapper based on priority cuts
  – imfs is the new logic optimization and resynthesis engine described in the present paper
  – resyn is a fast logic synthesis script that performs 5 iterations of AIG rewriting
  – choice is a logic synthesis script that performs 15 passes of AIG rewriting and collects three snapshots of the current network: the original, the final, and an intermediate AIG saved after the first 5 rewriting passes
• Computer used
  – ?
  – Runtime is several minutes for the largest designs in the tables
Academic Benchmarks
                              Baseline      Choices         Imfs
Designs     PI   PO   Reg    LUT Level    LUT Level    LUT Level
alu4        14    8     0    821     6    785     5    558     5
apex2       39    3     0    992     6    866     6    806     6
apex4        9   19     0    838     5    853     5    800     5
bigkey     263  197   224    575     3    575     3    575     3
des        256  245     0    794     5    512     5    483     4
diffeq      64   39   377    659     7    632     7    636     7
dsip       229  197   224    687     3    685     2    685     2
elliptic   131  114  1122   1773    10   1824     9   1820     9
frisc       20  116   886   1748    13   1671    12   1692    12
i10        257  224     0    589     9    560     8    548     7
misex3      14   14     0    785     5    664     5    517     5
s38417      28  106  1636   2684     6   2674     6   2621     6
s38584      12  278  1452   2697     7   2647     6   2620     6
seq         41   35     0    931     5    756     5    682     5
tseng       52  122   385    647     7    649     6    645     6
Ratio1                     1.000 1.000  0.929 0.923  0.873 0.901
Ratio2                                  1.000 1.000  0.940 0.977
Academic Benchmarks (PLAs)
                              Baseline      Choices         Imfs
Designs     PI   PO   Reg    LUT Level    LUT Level    LUT Level
clma       383   82    33   3323    10   2715     9   1277     8
ex1010      10   10     0   2847     6   2967     6   1282     5
ex5p         8   63     0    599     5    669     4    118     3
pdc         16   40     0   2327     7   2500     6    194     5
spla        16   46     0   1913     6   1828     6    289     4
Ratio1                     1.000 1.000  0.995 0.908  0.212 0.718
Ratio2                                  1.000 1.000  0.213 0.790