Constraint Programming for Compiler Optimization
March 2006
2
Acknowledgements
Joint work with:
Alexander Golynski
Alejandro López-Ortiz
Abid Malik
Jim McInnes
Claude-Guy Quimper
John Tromp
Kent Wilken
Funding:
NSERC
IBM Canada
3
Optimization problems in compilers
Instruction selection
Instruction scheduling
· basic-block instruction scheduling
· super-block scheduling
· software pipelining & loop unrolling
Register allocation
Memory hierarchy optimizations
4
Basic-block instruction scheduling
Schedule basic-block
· straight-line sequence of code with single entry, single exit
Multiple-issue pipelined processors
· multiple instructions can begin execution each clock cycle
· delay or latency before results are available
Find minimum length schedule
Classic problem
· lots of attention in literature
5
Example: (a + b) + c
instructions
A: r1 ← a
B: r2 ← b
C: r3 ← c
D: r1 ← r1 + r2
E: r1 ← r1 + r3
dependency DAG (edge: latency)
A → D: 3
B → D: 3
C → E: 3
D → E: 1
6
Single-issue pipelined processor
non-optimal schedule
A: r1 ← a
B: r2 ← b
nop
nop
D: r1 ← r1 + r2
C: r3 ← c
nop
nop
E: r1 ← r1 + r3
dependency DAG: as on slide 5
7
Single-issue pipelined processor
optimal schedule
A: r1 ← a
B: r2 ← b
C: r3 ← c
nop
D: r1 ← r1 + r2
E: r1 ← r1 + r3
dependency DAG: as on slide 5
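The two single-issue schedules above can be checked mechanically. The following is a minimal sketch, not part of the original slides: the names `EDGES`, `is_valid`, and `length` are illustrative, and issue cycles are assumed to be numbered from 1 with each nop occupying one cycle.

```python
# Dependency DAG from the example: (producer, consumer, latency in cycles).
EDGES = [("A", "D", 3), ("B", "D", 3), ("C", "E", 3), ("D", "E", 1)]

def is_valid(schedule, edges=EDGES, width=1):
    """Check latency constraints and the per-cycle issue limit.

    `schedule` maps each instruction name to its 1-based issue cycle.
    """
    # A consumer may issue no earlier than `latency` cycles after its producer.
    for producer, consumer, latency in edges:
        if schedule[consumer] < schedule[producer] + latency:
            return False
    # At most `width` instructions may issue in any one cycle.
    cycles = list(schedule.values())
    return all(cycles.count(c) <= width for c in set(cycles))

def length(schedule):
    """Schedule length: the cycle in which the last instruction issues."""
    return max(schedule.values())

# Issue cycles read off the two slides (nops fill the unlisted cycles).
non_optimal = {"A": 1, "B": 2, "D": 5, "C": 6, "E": 9}
optimal = {"A": 1, "B": 2, "C": 3, "D": 5, "E": 6}

print(is_valid(non_optimal), length(non_optimal))
print(is_valid(optimal), length(optimal))
```

Both schedules satisfy the latency constraints; moving C ahead of the first stall is what shortens the schedule from 9 cycles to 6.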
8
Multiple-issue pipelined processor
issue width is 2
[Figure: dependency DAG as on slide 5; schedule of A-E over cycles 1-5]
9
Multiple-issue pipelined processor
issue width is 1+1
[Figure: dependency DAG as on slide 5; schedule of A-E over cycles 1-6]
10
Production compilers
“At the outset, note that basic-block scheduling is an NP-hard problem, even with a very simple formulation of the problem, so we must seek an effective heuristic, rather than exact, approach.”
Steven Muchnick, Advanced Compiler Design & Implementation, 1997
11
Optimal approaches: state-of-the-art
Single-issue
Previous
· 10-40 instructions: ILP (Arya, 1985), CP (Ertl & Krall, 1991)
· up to 1000 instructions: ILP (Wilken et al., 2000)
Our work
· up to 2600 instructions
· 20 × faster
Multiple-issue
Previous
· 10-40 instructions: ILP (Chang et al., 1997), DP (Kessler, 1998)
· up to 1000 instructions: B&B (Heffernan et al., 2005)
Our work
· up to 2600 instructions
· 50-fold improvement
12
Constraint programming methodology
Model problem
· specify in terms of constraints on acceptable solutions
· define/choose constraint model: variables, domains, constraints
Solve model
· define/choose search algorithm
· define/choose heuristics
14
Minimal constraint model
variables: A, B, C, D, E
domains: {1, …, m}
constraints:
D ≥ A + 3
D ≥ B + 3
E ≥ C + 3
E ≥ D + 1
gcc(A, B, C, D, E, width)
dependency DAG: as on slide 5
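To make the minimal model concrete, here is a small brute-force sketch rather than a constraint solver: it enumerates all assignments from {1, …, m} and keeps the first one that satisfies the latency constraints and the gcc (at most `width` instructions per cycle). The function names and the `max_m` cut-off are assumptions made for illustration.

```python
from itertools import product

NAMES = ["A", "B", "C", "D", "E"]
# Latency constraints v >= u + lat, written as (u, v, lat).
EDGES = [("A", "D", 3), ("B", "D", 3), ("C", "E", 3), ("D", "E", 1)]

def satisfies(assign, width):
    """True if `assign` meets every latency constraint and the gcc."""
    latency_ok = all(assign[v] >= assign[u] + lat for u, v, lat in EDGES)
    cycles = list(assign.values())
    # gcc: each cycle value may be used by at most `width` variables.
    gcc_ok = all(cycles.count(c) <= width for c in set(cycles))
    return latency_ok and gcc_ok

def min_schedule(width, max_m=12):
    """Smallest m (and one schedule) for which the model has a solution."""
    for m in range(1, max_m + 1):
        for values in product(range(1, m + 1), repeat=len(NAMES)):
            assign = dict(zip(NAMES, values))
            if satisfies(assign, width):
                return m, assign
    return None

print(min_schedule(width=1))
print(min_schedule(width=2))
```

Enumeration is exponential in the number of instructions, which is why the approach in the talk relies on constraint propagation and search instead.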
15
Bounds consistency constraint propagation
variable   domain (successive narrowings)
A          [1, 6] → [1, 3] → [1, 2]
B          [1, 6] → [1, 3] → [1, 2]
C          [1, 6] → [1, 3] → [3, 3]
D          [1, 6] → [4, 6] → [4, 5]
E          [1, 6] → [4, 6] → [5, 6] → [6, 6]
constraints
D ≥ A + 3
D ≥ B + 3
E ≥ C + 3
E ≥ D + 1
gcc(A, B, C, D, E, 1)
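The propagation on this slide can be reproduced with a short sketch. It is not the propagator from the talk: the latency rule is the standard bounds update, and the gcc is handled by a naive Hall-interval check (quadratic in the number of distinct bounds) that is assumed here purely for illustration. The name `propagate` is hypothetical.

```python
EDGES = [("A", "D", 3), ("B", "D", 3), ("C", "E", 3), ("D", "E", 1)]

def propagate(domains, edges=EDGES, width=1):
    """Narrow [lo, hi] bounds to a fixpoint; return None if inconsistent."""
    dom = {v: list(b) for v, b in domains.items()}
    changed = True
    while changed:
        changed = False
        # Latency constraint v >= u + lat tightens lo(v) and hi(u).
        for u, v, lat in edges:
            if dom[v][0] < dom[u][0] + lat:
                dom[v][0] = dom[u][0] + lat
                changed = True
            if dom[u][1] > dom[v][1] - lat:
                dom[u][1] = dom[v][1] - lat
                changed = True
        if any(lo > hi for lo, hi in dom.values()):
            return None
        # Naive Hall intervals for the gcc: `width` slots per cycle.
        los = sorted({lo for lo, _ in dom.values()})
        his = sorted({hi for _, hi in dom.values()})
        for lo in los:
            for hi in his:
                if hi < lo:
                    continue
                inside = [x for x, (a, b) in dom.items() if lo <= a and b <= hi]
                capacity = width * (hi - lo + 1)
                if len(inside) > capacity:
                    return None          # more variables than slots
                if len(inside) == capacity:
                    # The interval is full: push other bounds out of it.
                    for x in dom:
                        if x in inside:
                            continue
                        if lo <= dom[x][0] <= hi:
                            dom[x][0] = hi + 1
                            changed = True
                        if lo <= dom[x][1] <= hi:
                            dom[x][1] = lo - 1
                            changed = True
        if any(lo > hi for lo, hi in dom.values()):
            return None
    return {v: tuple(b) for v, b in dom.items()}

print(propagate({v: (1, 6) for v in "ABCDE"}))
print(propagate({v: (1, 5) for v in "ABCDE"}))
```

With domains [1, 6] the sketch ends at A and B in [1, 2], C = 3, D in [4, 5], and E = 6; with domains [1, 5] it reports inconsistency because A, B, and C would all have to fit in cycles 1 and 2.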
16
Improvements to constraint model
1. Distance constraints
• constraints over nodes which define regions
2. Predecessor and successor constraints
• constraints over nodes with multiple predecessors or multiple successors
3. Safe pruning constraint
• global constraint
4. Dominance constraints
• constraints based on graph isomorphism
18
Distance constraints: Regions
A pair of nodes i, j defines a region in a DAG G if:
(i) there is more than one path from i to j, and
(ii) no node k, distinct from i and j, lies on every path from i to j.
[Figure: region between nodes i and j]
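The region definition can be tested directly on a small graph. The sketch below uses a hypothetical DAG (two diamonds joined at node d), not the one from the slides, and simple recursive path counting that is only suitable for small examples; `count_paths` and `defines_region` are illustrative names.

```python
def count_paths(succ, src, dst, banned=None):
    """Number of directed paths from src to dst in a DAG, avoiding `banned`."""
    if src == dst:
        return 1
    return sum(count_paths(succ, nxt, dst, banned)
               for nxt in succ.get(src, ()) if nxt != banned)

def defines_region(succ, i, j):
    """True if the pair (i, j) satisfies conditions (i) and (ii)."""
    if count_paths(succ, i, j) <= 1:
        return False                      # condition (i) fails
    nodes = set(succ) | {n for outs in succ.values() for n in outs}
    for k in nodes - {i, j}:
        # If removing k leaves no path, every path passes through k.
        if count_paths(succ, i, j, banned=k) == 0:
            return False                  # condition (ii) fails
    return True

# Hypothetical DAG: a -> {b, c} -> d -> {e, f} -> g
succ = {"a": ["b", "c"], "b": ["d"], "c": ["d"],
        "d": ["e", "f"], "e": ["g"], "f": ["g"]}

print(defines_region(succ, "a", "d"))
print(defines_region(succ, "a", "g"))
```

Here (a, d) and (d, g) each define a region, while (a, g) does not, because every path from a to g passes through d.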
19
Distance constraints: Estimate
[Figure: DAG over nodes A-H with edge latencies 1 and 3]
Distance constraints: Estimate
[Figure: DAG over nodes A-H with edge latencies 1 and 3; time line j, j+1, …, j+5; distance 5 marked between A and F]
21
Distance constraints: Estimate
[Figure: DAG over nodes A-H with edge latencies 1 and 3; time line j, j+1, …, j+5; distance 5 marked between E and H]
22
Distance constraints: Estimate
[Figure: DAG over nodes A-H with edge latencies 1 and 3; time line j, j+1, …, j+9; distance 9 marked between A and H]
23
Distance constraints: Optimal
[Figure: DAG over nodes A-H with edge latencies 1 and 3; node domains shown: [1,1], [10,10], [2,3], [2,3], [5,6], [5,6], [6,7], [6,7]]
• Estimate: H ≥ A + 9
• Not optimal: fix A = 1 and H = 10, then
· propagate latency
· propagate all-diff
24
Distance constraints: Optimal
[Figure: DAG over nodes A-H with edge latencies 1 and 3; node domains shown: [1,1], [10,10], [2,3], [2,3], [5,6], [5,6], [6,7], [6,7]]
• Estimate: H ≥ A + 9
• Not optimal: fixing A = 1 and H = 10, then propagating latency and all-diff, is inconsistent
• Optimal: H ≥ A + 10
25
Improvements to constraint model
1. Distance constraints
• constraints over nodes which define regions
2. Predecessor and successor constraints
• constraints over nodes with multiple predecessors or multiple successors
3. Safe pruning constraint
• global constraint
4. Dominance constraints
• constraints based on graph isomorphism
26
Predecessor constraints
[Figure: DAG over nodes A-H with edge latencies 1, 2, and 3; node domains shown: [4, ], [5,9], [5,9], [6,9], [5,8], [8,12], [9,12], [ ,14]]
27
Predecessor constraints
[Figure: same DAG and domains; time line 5-9; additional domain [9,12]]
28
Predecessor constraints
[Figure: same DAG and domains; time line 9-12; additional domains [9,12] and [12,14]]
29
Successor constraints
[Figure: same DAG and domains; time line 6-9; additional domains [9,12], [12,14], and [4,6]]
30
Constraint programming methodology
Model problem
· specify in terms of constraints on acceptable solutions
· define/choose constraint model: variables, domains, constraints
Solve model
· define/choose search algorithm
· define/choose heuristics
31
Solving instances of the model
Use constraints to establish:
· lower bound on length m of optimal schedule
· min and max of domains of variables
Backtracking search
· branches on min(x), min(x)+1, …
· interleave with bounds consistency constraint propagation
· fallback: singleton consistency on bounds
If no solution found, increment m and repeat search
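The loop described above can be sketched on the running five-instruction example. This is a simplified illustration, not the solver from the talk: propagation covers only the latency constraints, the issue-width limit is checked when a value is chosen, and the singleton-consistency fallback is omitted. The names `latency_bounds`, `search`, and `solve` are hypothetical.

```python
EDGES = [("A", "D", 3), ("B", "D", 3), ("C", "E", 3), ("D", "E", 1)]
NODES = ["A", "B", "C", "D", "E"]        # listed in topological order

def latency_bounds(m, fixed):
    """Bounds propagation over latency constraints; None if inconsistent."""
    lo = {v: fixed.get(v, 1) for v in NODES}
    hi = {v: fixed.get(v, m) for v in NODES}
    changed = True
    while changed:
        changed = False
        for u, v, lat in EDGES:
            if lo[v] < lo[u] + lat:
                lo[v], changed = lo[u] + lat, True
            if hi[u] > hi[v] - lat:
                hi[u], changed = hi[v] - lat, True
        if any(lo[v] > hi[v] for v in NODES):
            return None
    return lo, hi

def search(m, width, fixed=None):
    """Backtracking search interleaved with propagation."""
    fixed = fixed or {}
    bounds = latency_bounds(m, fixed)
    if bounds is None:
        return None
    if len(fixed) == len(NODES):
        return fixed
    lo, hi = bounds
    x = next(v for v in NODES if v not in fixed)
    for cycle in range(lo[x], hi[x] + 1):        # min(x), min(x)+1, ...
        # Issue-width check: at most `width` instructions per cycle.
        if sum(1 for c in fixed.values() if c == cycle) < width:
            result = search(m, width, {**fixed, x: cycle})
            if result is not None:
                return result
    return None

def solve(width):
    """Start from a lower bound on m and increment until a schedule exists."""
    horizon = len(NODES) + sum(lat for _, _, lat in EDGES)
    lo, _ = latency_bounds(horizon, {})
    lower = max(max(lo.values()), -(-len(NODES) // width))
    for m in range(lower, horizon + 1):
        schedule = search(m, width)
        if schedule is not None:
            return m, schedule
    return None

print(solve(width=1))
print(solve(width=2))
```

For issue width 1 the search fails at m = 5 and succeeds at m = 6, mirroring the sequence on the following slides.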
32
Solving instances of the model
[Figure: schedule grid for A-E over cycles 1-5 and dependency DAG as on slide 5; domains A [1,5], B [1,5], C [1,5], D [1,5], E [1,5]]
33
Solving instances of the model
[Figure: schedule grid for A-E over cycles 1-5 and dependency DAG as on slide 5; all domains empty]
34
Solving instances of the model
[Figure: schedule grid for A-E over cycles 1-6 and dependency DAG as on slide 5; domains A [1,6], B [1,6], C [1,6], D [1,6], E [1,6]]
35
Solving instances of the model
[Figure: schedule grid for A-E over cycles 1-6 and dependency DAG as on slide 5; domains A [1,2], B [1,2], C [3,3], D [5,5], E [6,6]]
36
Improvements to constraint solver
Design special purpose constraint propagators
· commonly occurring constraints
· significantly improve efficiency
Improved algorithms for bounds consistency
· all-diff constraint
· gcc constraint
37
Comparing all-diff propagators (prototype)
n       DC        MT        BC
69      0.0       0.0       0.0
70      0.1       0.0       0.0
111     0.1       0.1       0.1
211     >600.0    >600.0    >600.0
214     1.2       0.9       0.5
216     1.1       0.7       0.4
220     0.9       0.7       0.3
690     26.9      4.0       1.6
856     17.1      10.9      5.5
1006    87.2      15.9      6.0
Time (sec.) to solve instruction scheduling problems; model includes latency, distance, and all-diff constraints.
DC: Régin, 1994; MT: Mehlhorn & Thiel, 2000; BC: IJCAI-2003
38
Comparing gcc propagators (prototype)
n       DC        vH        BC
69      0.0       0.0       0.0
70      0.1       0.0       0.0
111     0.8       0.0       0.0
211     9.2       0.5       0.1
214     9.3       0.6       0.1
216     124.1     2.7       0.3
220     285.9     5.1       0.5
690     493.2     1.3       1.7
856     471.2     >600.0    3.8
1006    >600.0    >600.0    8.7
Time (sec.) to solve instruction scheduling problems; model includes latency and gcc constraints; width is 2.
DC: Régin, 1996; vH: van Hentenryck et al., 1992; BC: CP-2003
39
Putting it all together: Experimental results
Architectures   1-issue   2-issue   4-issue   6-issue
Improved        4383      5498      6031      2759
Timed out       2         25        18        14
SPEC 2000 & MediaBench Benchmarks
Total of 352,111 basic blocks of size 3 or greater
Improved = improved schedule over heuristic scheduler
Timed out = not solved within 10 minutes
40
Putting it all together: Experimental results
Percentage improvement by architecture
                1-issue   2-issue   4-issue   6-issue
Average         5.3       5.5       5.9       5.0
Maximum         17.6      25.0      28.0      25.0
SPEC 2000 & MediaBench Benchmarks
For basic blocks with improved schedules
41
Conclusions
CP approach to instruction scheduling
· Single-issue processors: 20 times faster than previous best optimal approach
· Multiple-issue processors: larger and more difficult problems; 50-fold reduction in number of problems that cannot be solved
Constraint propagators
· faster all-diff and gcc constraint propagators
· useful in many problems
42
Current and future work: Expand scope of problem
Instruction selection
Instruction scheduling
· basic-block instruction scheduling
· super-block scheduling
· software pipelining & loop unrolling
Register allocation
Memory hierarchy optimizations