Progressive Register Allocation for Irregular Architectures
2005 International Symposium on Code Generation and Optimization
Progressive Register Allocation for Irregular Architectures
David Koes
Seth Copen Goldstein
March 23, 2005
Irregular Architectures
• Few registers
• Register usage restrictions
  – address registers, hardwired registers...
• Memory operands
• Examples:
  – x86, 68k, ColdFire, ARM Thumb, MIPS16, V800, various DSPs...
[Figure: the eight x86 registers: eax, ebx, ecx, edx, esi, edi, ebp, esp]
Fewer Registers → More Spills
• Used gcc to compile >10,000 functions from Mediabench, Spec95, Spec2000, and micro-benchmarks
• Recorded which functions spilled
[Bar chart: Percent of functions that spill (y-axis: Percent, 0 to 50) for PPC (32), 68k (16), and x86 (8)]
Register Usage Restrictions
• Instructions may prefer or require a specific subset of registers
  – x86 multiply instruction
      imul %edx,%eax // 2 byte instruction
      imul %edx,%ecx // 3 byte instruction
  – x86 divide instruction
      idivl %ecx // eax = edx:eax/ecx
Memory Operands
• Load/store not always needed to access variables allocated to memory
  – depends upon instruction
  – still less efficient than register access
      addl 8(%ebp), %eax
  vs
      movl 8(%ebp), %edx
      addl %edx, %eax
Register Allocation Challenges
• Optimize spill code
  – with few registers, spilling unavoidable
• Model register usage restrictions
• Exploit memory operands
  – affects spilling decisions
Previous Work
[Table: each method is rated on three criteria: Models Irregular Features, Fast, Optimal]
• Graph Coloring
• Integer Programming [Goodwin and Wilken 96], [Kong and Wilken 98], [Fu and Wilken 2002]
• Separated IP [Appel and George 01]
• PBQP [Scholz and Eckstein 02]
Our Goals
• Expressive
  – Explicitly represent architectural irregularities and costs
• Proper model
  – An optimum solution results in optimal register allocation
• Progressive solution algorithm
  – more computation → better solution
  – decent feasible solution obtained quickly
  – competitive with current allocators
Multicommodity Network Flow (MCNF)
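[Figure: example network for two commodities, a and b, with parts labeled source, instruction, crossbar, and sink; numeric labels 2 and 4 appear in the diagram]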
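For reference, a minimum-cost multicommodity network flow problem is commonly written in the textbook form below. The notation is an assumption made for illustration and does not come from the slides: $x^{k}_{ij}$ is the flow of commodity $k$ on edge $(i,j)$, $c^{k}_{ij}$ is its cost, $u_{ij}$ is the capacity shared by all commodities on that edge, and $b^{k}_{i}$ is the supply of commodity $k$ at node $i$ (positive at its source, negative at its sink, zero elsewhere). Flows are taken to be binary here on the assumption that each commodity is a single indivisible unit.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{k} \sum_{(i,j)} c^{k}_{ij}\, x^{k}_{ij} \\
\text{s.t.}\quad
  & \sum_{j} x^{k}_{ij} - \sum_{j} x^{k}_{ji} = b^{k}_{i}
      && \text{for every node } i \text{ and commodity } k \\
  & \sum_{k} x^{k}_{ij} \le u_{ij}
      && \text{for every edge } (i,j) \\
  & x^{k}_{ij} \in \{0, 1\}
      && \text{for every edge } (i,j) \text{ and commodity } k
\end{aligned}
```

A capacity on a node, rather than on an edge, fits the same form if the node is split in two and the halves are joined by a single edge carrying that capacity.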
Modeling Usage Constraints
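int foo(int a, int b, int c) {
  a = a*b;
  return a/c;
}
[Figure: network for foo. Variables a and b enter an imul node group (eax, edx, ecx, mem) and variables a and c enter an idiv node group (eax, edx, ecx, mem); edges carry the labels 1, -1, and 1]
not quite right…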
Modeling Spills and Moves
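int foo(int a, int b, int c) {
  a = a*b;
  return a/c;
}
[Figure: the network for foo extended with additional node layers (eax, edx, ecx, mem) around the imul and idiv node groups (eax, edx, ecx, mem); variables a, b, and c flow through the layers, and edges carry the labels 1, -1, 1, and 3 3 3]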
Modeling Stores
• Simple approach flawed
  – doesn’t model memory persistency
• Solution: antivariables
  – flow only through memory
  – eviction cost = store cost
  – evict only once
Register Allocation as MCNF
• Variables → Commodities
• Variable Usage → Network Design
• Nodes → Allocation Classes (Reg/Mem)
• Register Limits → Node Capacities
• Spill Costs → Edge Costs
• Variable Definition → Source
• Variable Last Use → Sink
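The mapping above can be illustrated with a deliberately tiny brute-force sketch. It is not the authors' model: the two-register machine, the cost constants, and the function names are all hypothetical, and every variable is assumed to be live at every program point. Each variable (commodity) follows a path that picks one allocation class (node) per program point, each register holds at most one variable per point (node capacity), and memory uses and moves are charged as edge costs.

```python
from itertools import product

REGS = ["r0", "r1"]        # hypothetical two-register machine
CLASSES = REGS + ["mem"]   # allocation classes = nodes at each program point
MEM_USE_COST = 3           # assumed cost of using a value that sits in memory
MOVE_COST = 1              # assumed cost of changing class between points

def path_cost(path, uses):
    """Cost of one variable's path: one allocation class per program point."""
    cost = sum(MEM_USE_COST for t, c in enumerate(path) if c == "mem" and t in uses)
    cost += sum(MOVE_COST for a, b in zip(path, path[1:]) if a != b)
    return cost

def feasible(paths):
    """Node capacities: a register holds at most one variable at each point."""
    for layer in zip(*paths):
        regs = [c for c in layer if c != "mem"]
        if len(regs) != len(set(regs)):
            return False
    return True

def best_allocation(uses_by_var, n_points):
    """Enumerate every combination of paths and keep the cheapest feasible one."""
    names = list(uses_by_var)
    all_paths = list(product(CLASSES, repeat=n_points))
    best = None
    for paths in product(all_paths, repeat=len(names)):
        if not feasible(paths):
            continue
        total = sum(path_cost(p, uses_by_var[v]) for p, v in zip(paths, names))
        if best is None or total < best[0]:
            best = (total, dict(zip(names, paths)))
    return best

two_vars = best_allocation({"a": {0, 1}, "b": {0, 1}}, 2)
three_vars = best_allocation({"a": {0, 1}, "b": {0, 1}, "c": {0, 1}}, 2)
```

With two variables and two registers nothing has to live in memory; with three variables one of them must, which is the spill. Exhaustive enumeration is only workable at this toy size.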
Solving an MCNF
• Integer solution NP-complete
• Use standard IP solvers
  – commercial solvers (CPLEX) are impressive
• Exploit structure of problem
  – variety of MCNF-specific solvers
• empirically faster than IP solvers
• Lagrangian Relaxation technique
Lagrangian Relaxation: Intuition
• Relaxes the hard constraints
  – only have to solve single commodity flow
• Combines easy subproblems using a Lagrangian multiplier
  – an additional price on each edge
Example: edges have unit capacity
[Figure: commodities a and b each choose between two edges with costs 0 and 1; once a price is added, the edge costs read 0+1 and 1]
With price, solution to single commodity flow can be solution to multicommodity flow
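The example can be checked with a few lines of Python. This is a sketch of one reading of the figure, not code from the talk: two parallel unit-capacity edges with costs 0 and 1, two commodities that each need one edge, and a price of 1 placed on the cheaper edge. The function and variable names are hypothetical.

```python
def cheapest_edges(costs, prices):
    """Return the set of edges with minimum cost + price for one commodity."""
    priced = [c + p for c, p in zip(costs, prices)]
    low = min(priced)
    return {e for e, v in enumerate(priced) if v == low}

costs = [0, 1]                 # two parallel unit-capacity edges
assignment = {"a": 0, "b": 1}  # a capacity-respecting assignment (up to swapping)

# Without prices both commodities want edge 0, so the feasible assignment
# is not made of individually cheapest choices.
no_price = cheapest_edges(costs, [0, 0])

# With a price of 1 on edge 0 both edges cost 1, so every edge used by the
# feasible assignment is a cheapest choice for its commodity.
with_price = cheapest_edges(costs, [1, 0])

def consistent(choice):
    """True if every edge in the feasible assignment is a cheapest edge."""
    return all(e in choice for e in assignment.values())
```

Here `consistent(no_price)` is false and `consistent(with_price)` is true: the price does not force the single-commodity solver to pick the right edges, but it makes the capacity-respecting assignment one of its optimal answers.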
Solution Procedure
• Compute prices using iterative subgradient optimization
  – converge to optimal prices
• At each iteration, greedily construct a feasible solution using current prices
  – allocate most expensive vars first
  – can always find an allocation
Solution Procedure
• Advantages
  + have feasible solution at each step
  + iterative nature → progressive
  + Lagrangian relaxation theory provides means for computing a lower bound
  + Can compute optimality bound
• Disadvantages
  – No guarantee of optimality of solution
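The procedure can be sketched on a toy problem in which every commodity must take exactly one of several parallel edges. This is an illustration under stated assumptions, not the authors' solver: the function name `solve`, the diminishing step size `1 / (t + 1)`, and the requirement that total capacity covers all commodities are choices made here. Because the toy commodities are identical, the "most expensive first" ordering from the slide has no effect and is omitted.

```python
def solve(costs, caps, n_commodities, iters=10):
    """Toy progressive solver: every commodity must pick exactly one edge."""
    assert sum(caps) >= n_commodities, "toy assumes enough total capacity"
    edges = range(len(costs))
    prices = [0.0] * len(costs)
    best_cost, best_bound = float("inf"), float("-inf")
    for t in range(iters):
        priced = [c + p for c, p in zip(costs, prices)]
        # Relaxed problem: each commodity independently takes the cheapest priced edge.
        pick = min(edges, key=lambda e: priced[e])
        flow = [n_commodities if e == pick else 0 for e in edges]
        # Lagrangian lower bound on the cost of any capacity-respecting solution.
        bound = n_commodities * priced[pick] - sum(p * u for p, u in zip(prices, caps))
        best_bound = max(best_bound, bound)
        # Greedy feasible solution guided by the current prices.
        left, cost = list(caps), 0
        for _ in range(n_commodities):
            e = min((e for e in edges if left[e] > 0), key=lambda e: priced[e])
            left[e] -= 1
            cost += costs[e]
        best_cost = min(best_cost, cost)
        # Subgradient step: raise prices on overused edges, lower them on underused ones.
        step = 1.0 / (t + 1)
        prices = [max(0.0, p + step * (f - u)) for p, f, u in zip(prices, flow, caps)]
    return best_cost, best_bound

best_cost, best_bound = solve([0, 1], [1, 1], 2)
```

On the two-edge example, `solve([0, 1], [1, 1], 2)` ends with a feasible cost of 1 and a lower bound of 1.0. When the two values meet, the feasible solution is proven optimal; when they do not, the difference bounds how far from optimal it can be.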
Evaluation
• Replace gcc’s local allocator
• Optimize for code size
  – easy to statically evaluate
• Evaluate on MediaBench, MiBench, SpecInt95, SpecInt2000
  – consider only blocks where local allocation is interesting (enough variables to spill)
Behavior of Solver
Proven Optimality
[Stacked bar chart: percent of blocks (0% to 100%) in each category (Optimal, Within 5%, Within 10%, Within 15%, Within 20%, >25%) after 1, 10, 100, and 1000 iterations, grouped by 5-10 conflicts (355 blocks), 10-15 conflicts (23 blocks), 15-20 conflicts (7 blocks), and >= 20 conflicts (5 blocks)]
Comprehensive Results
[Bar chart: Improvement over gcc (y-axis from -15.00% to 20.00%) after 1, 10, 100, and 1000 iterations, grouped by 5-10 conflicts (355 blocks), 10-15 conflicts (23 blocks), 15-20 conflicts (7 blocks), and >= 20 conflicts (5 blocks). Annotation: artifact of interaction with gcc]
Progressive Nature
:-(
Contributions
• New MCNF model for register allocation
  + expressive, can model irregular architectures
  + can be solved using conventional ILP solvers
• Progressive solution procedure
  + decent initial solution
  + maintains feasible solution
  + improves solution over time
  – no optimality guarantees